CA2538981A1 - Method and device for processing audiovisual data using speech recognition - Google Patents
- Publication number
- CA2538981A1
- Authority
- CA
- Canada
- Prior art keywords
- basic units
- audio signal
- time codes
- providing
- recognized speech
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
- 238000000034 method Methods 0.000 title claims abstract 39
- 238000004519 manufacturing process Methods 0.000 claims abstract 10
- 230000005236 sound signal Effects 0.000 claims 22
- 230000000007 visual effect Effects 0.000 claims 6
- 230000002123 temporal effect Effects 0.000 claims 4
- 230000006978 adaptation Effects 0.000 claims 3
- 230000001360 synchronised effect Effects 0.000 claims 2
- 239000002131 composite material Substances 0.000 claims 1
- 238000001514 detection method Methods 0.000 claims 1
- 238000000605 extraction Methods 0.000 abstract 1
Classifications
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/10—Indexing; Addressing; Timing or synchronising; Measuring tape travel
- G11B27/34—Indicating arrangements
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/02—Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers
- G11B27/031—Electronic editing of digitised analogue information signals, e.g. audio or video signals
- G11B27/034—Electronic editing of digitised analogue information signals, e.g. audio or video signals on discs
Abstract
A method and apparatus is disclosed for producing an audiovisual work. The method and apparatus is based on speech recognition. Extraction of basic units of speech with related time code is performed. The invention may be advantageously used for performing post-production synchronization of a video source, dubbing assisting, closed-captioning assisting and animation generation assisting.
Claims (33)
1. A method for producing an audiovisual work, the method comprising the steps of:
providing an audio signal to a speech recognition module;
performing a speech recognition of said audio signal, the speech recognition comprising an extracting of a series of basic units of recognized speech and related time codes;
receiving the basic units of recognized speech and the related time codes from the speech recognition module;
processing the received basic units to provide synchronization information corresponding to the basic units of recognized speech for a production of said audiovisual work; and displaying on a user interface said synchronization information providing timing information for the basic units of recognized speech in said series.
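The patent itself specifies no implementation for the processing step of claim 1. As a minimal sketch (not part of the patent; the `BasicUnit` type and the dictionary layout are assumptions), the recognized basic units and their time codes could be turned into per-unit timing information for display on a user interface like so:

```python
from dataclasses import dataclass

@dataclass
class BasicUnit:
    """One recognized basic unit of speech (e.g. a phoneme) with its time code."""
    label: str    # phoneme symbol, e.g. "HH"
    start: float  # start time code, in seconds
    end: float    # end time code, in seconds

def synchronization_info(units):
    """Derive timing information for each basic unit, for display on a UI."""
    return [
        {"label": u.label, "start": u.start, "duration": round(u.end - u.start, 3)}
        for u in units
    ]

# Example output of a speech recognition module for the word "hell":
units = [BasicUnit("HH", 0.00, 0.08), BasicUnit("EH", 0.08, 0.21), BasicUnit("L", 0.21, 0.30)]
info = synchronization_info(units)
```

Each entry in `info` carries the label and timing a UI would need to render the graphic representations described in the dependent claims.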
2. The method as claimed in claim 1, wherein the production comprises post-production audio synchronization, said synchronization information comprises a graphic representation of a sound to be performed at each point in time over a span of time during said audiovisual work, and said interface controls said graphic representation over said span while facilitating synchronized recording of said sound in order to perform post-production.
3. The method as claimed in claim 2, wherein the basic units of recognized speech are phonemes.
4. The method as claimed in claim 2, further comprising the step of converting the basic units of recognized speech received with the time codes into words and words related time codes.
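The patent does not say how phoneme-level time codes become word-level time codes (claim 4). A plausible sketch, assuming word boundaries come from a lexicon lookup the claims leave unspecified:

```python
def words_with_time_codes(units, word_boundaries):
    """Group per-phoneme time codes into per-word time codes.

    `units` is a list of (phoneme, start, end) tuples from the speech
    recognition module; `word_boundaries` lists (word, phoneme_count)
    pairs, assumed supplied by a lexicon (not specified by the patent).
    """
    words, i = [], 0
    for word, n in word_boundaries:
        chunk = units[i:i + n]
        # A word's time code spans its first phoneme's start to its last phoneme's end.
        words.append((word, chunk[0][1], chunk[-1][2]))
        i += n
    return words

units = [("HH", 0.0, 0.08), ("AY", 0.08, 0.2),
         ("DH", 0.25, 0.3), ("EH", 0.3, 0.4), ("R", 0.4, 0.5)]
result = words_with_time_codes(units, [("hi", 2), ("there", 3)])
```

Here `result` pairs each word with the start of its first phoneme and the end of its last, which is the word-related time code the claim calls for.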
5. The method as claimed in claim 2, further comprising the step of converting the basic units of recognized speech received with the time codes into graphemes and graphemes related time codes, the graphemes being processed to provide synchronization information.
6. The method as claimed in claim 5, further comprising the step of providing a conformed text source, further wherein the synchronization information provided to the user comprises an indication of a temporal location with respect to the audio signal.
7. The method as claimed in claim 5, further comprising the step of providing a script of at least one part of the audio signal, further wherein the synchronization information provided to the user comprises an indication of a temporal location with respect to the script provided.
8. The method as claimed in claim 5, wherein the displaying on a user interface of said synchronization information, comprises the displaying of the graphemes using a horizontally sizeable font.
9. The method as claimed in claim 5, further comprising the step of detecting a Foley in the audio signal using a Foley detection unit, the detecting comprising the providing of an indication of the Foley and a related Foley time code.
10. The method as claimed in claim 5, further comprising the step of amending at least one part of the audio signal and audio signal related time codes using at least the graphemes and the synchronization information.
11. The method as claimed in claim 4, further comprising the providing of a plurality of words in accordance with the provided audio signal, the providing being performed by an operator.
12. The method as claimed in claim 11, further comprising the step of amending a recognized word in accordance with the plurality of words provided by the operator.
13. The method as claimed in claim 12, further comprising the step of creating a composite signal comprising at least the amended word, a video signal related to the audio source and the audio source.
14. The method as claimed in claim 1, wherein the displaying on a user interface of said synchronization information providing timing information for the basic units of recognized speech in said series is used to produce animation.
15. The method as claimed in claim 14, wherein for blocks of continuous spoken word, said synchronization information provides essential visem information for each sequential frame to be drawn by an animator.
16. The method as claimed in claim 15, further comprising the step of providing a storyboard database, further comprising the step of converting the basic units of recognized speech received with the time codes into words and words related time codes, the processing of the plurality of words and the words related time codes providing an indication of a current temporal location of the audio signal with respect to the storyboard.
17. The method as claimed in claim 16, wherein the basic units of recognized speech are phonemes, further comprising the step of providing a plurality of visems for each of the plurality of words, using a visem database and using the phonemes.
18. The method as claimed in claim 17, further comprising the step of outputting an adjusted voice track comprising the audio signal, at least one part of the storyboard and the plurality of visems.
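Claims 15 to 18 describe supplying a visem (mouth shape) for each animation frame from the recognized phonemes and their time codes. A minimal sketch of such a mapping; the lookup table and the "rest" shape are assumptions, since the patent leaves the visem database unspecified:

```python
# Hypothetical phoneme-to-visem lookup; a real database would group the
# full phoneme inventory onto a small set of mouth shapes.
PHONEME_TO_VISEM = {
    "P": "closed", "B": "closed", "M": "closed",
    "AA": "open", "AE": "open",
    "F": "lip-teeth", "V": "lip-teeth",
}

def visems_per_frame(phonemes, fps=24):
    """For each sequential frame, pick the visem of the phoneme active then.

    `phonemes` is a list of (symbol, start_sec, end_sec) tuples from the
    speech recognition module; frames with no active phoneme get "rest".
    """
    end = max(e for _, _, e in phonemes)
    frames = []
    for i in range(int(end * fps) + 1):
        t = i / fps
        visem = "rest"
        for sym, s, e in phonemes:
            if s <= t < e:
                visem = PHONEME_TO_VISEM.get(sym, "open")
                break
        frames.append(visem)
    return frames

frames = visems_per_frame([("M", 0.0, 0.1), ("AA", 0.1, 0.25)])
```

`frames` then gives an animator one mouth shape per frame, aligned with the voice track's time codes.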
19. The method as claimed in claim 1, wherein the production comprises adaptation assisting, the adaptation assisting comprises a graphic representation of the basic units of recognized speech, the related time codes and a plurality of adapted basic units provided by a user, and said interface providing a visual indication of a matching of the plurality of adapted basic units with the basic speech units, the matching enabling synchronized adaptation of said audio signal.
20. The method as claimed in claim 19, wherein the plurality of adapted basic units is provided by performing a speech recognition of an adapted voice source.
21. The method as claimed in claim 20, wherein the speech recognition of the adapted voice source further provides related adapted time codes, further wherein the step of adapting the audio signal using said synchronization information and the plurality of adapted basic units is performed by attempting to match at least one of the series of basic units with at least one of the plurality of adapted basic units using the related time codes and the related adapted time codes.
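Claim 21's matching of original and adapted basic units via their respective time codes could be sketched as below; the label-equality criterion and the `tolerance` window are assumptions, as the patent only says the match is "attempted" using the two sets of time codes:

```python
def match_adapted_units(original, adapted, tolerance=0.15):
    """Attempt to match original basic units with adapted (e.g. dubbed) ones.

    Both inputs are lists of (label, start, end) tuples with time codes
    from speech recognition. A pair matches when labels agree and start
    times differ by at most `tolerance` seconds (an assumed threshold).
    Returns (label, original_start, adapted_start) triples.
    """
    pairs, used = [], set()
    for o_label, o_start, _ in original:
        for j, (a_label, a_start, _) in enumerate(adapted):
            if j in used:
                continue
            if a_label == o_label and abs(a_start - o_start) <= tolerance:
                pairs.append((o_label, o_start, a_start))
                used.add(j)  # each adapted unit is consumed at most once
                break
    return pairs

orig = [("AH", 0.0, 0.1), ("B", 0.1, 0.2)]
adap = [("AH", 0.05, 0.15), ("B", 0.22, 0.3)]
pairs = match_adapted_units(orig, adap)
```

The ratio of matched pairs to original units would give the "amount of successful replacement" that claims 23 and 24 test against a minimum threshold.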
22. A method for performing closed-captioning of an audio source, the method comprising the steps of:
providing an audio signal of an audio/video signal to a speech recognition module;
performing a speech recognition of said audio/video signal, and incorporating text of said recognized speech of the audio signal as closed-captioning into a visual or non-visual portion of the audio/video signal in synchronization.
23. The method as claimed in claim 21 further comprising the step of providing an indication of an amount of successful replacement of the plurality of basic units of recognized speech of the audio signal by the plurality of basic units of recognized speech of the adapted audio signal.
24. The method as claimed in claim 23, further comprising the step of providing a minimum amount required of successful replacement of the plurality of basic units of recognized speech of the audio signal by the plurality of basic units of recognized speech of the adapted audio signal, the method further comprising the step of canceling the providing of the at least one replaced plurality of basic units with related replaced time codes if the at least one replaced plurality of basic units is lower than the minimum amount required of successful replacement.
25. The method as claimed in claim 1, wherein the audio signal comprises a plurality of voices originating from a plurality of actors, further comprising the step of assigning each of the series of basic units and the related time codes to a related actor of the plurality of actors.
26. The method as claimed in claim 1, wherein the production comprises closed-captioning production of the audio source, said closed-captioning comprises a graphic representation of the recognized series of basic units, the method further comprising the incorporating of at least one of the series of basic units as closed-captioning in a visual or non-visual portion of the audio/video portion of the audio/video signal in synchronization.
27. The method as claimed in claim 26, further comprising the step of amending at least one part of the plurality of basic units.
28. The method as claimed in claim 1, further comprising the step of converting the basic units of recognized speech received with the time codes into words and words related time codes, further comprising the step of creating a database comprising a word and related basic units.
29. The method as claimed in claim 28, further comprising the step of amending a word of said database, wherein phonemes of the word and the amended word are substantially the same.
30. The method as claimed in claim 1, further comprising the step of converting the basic units of recognized speech received with the time codes into words and words related time codes, further comprising the step of amending at least one word.
31. The method as claimed in claim 30, further comprises the step of providing a visual indication of a word to amend.
32. The method as claimed in claim 1, wherein the audio signal comprises lyrics that are sung, further wherein the production of said audiovisual work comprises a karaoke generation using said audio signal, said karaoke generation comprises a graphic representation of lyrics to be sung at each point in time over a span of time during said audiovisual work using the series of basic units of recognized speech provided and related time codes, together with an index representation of a current temporal position with respect to the graphic representation of the lyrics to be sung.
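Claim 32's index representation of the current temporal position, relative to the lyrics' time codes, amounts to a lookup of which word is being sung now. A minimal sketch, not taken from the patent; the sorted-start-times input is an assumption:

```python
import bisect

def karaoke_index(word_starts, current_time):
    """Index of the lyric word being sung at `current_time`.

    `word_starts` is the sorted list of start time codes (seconds) for the
    recognized lyric words; the returned index tells the display which
    word to highlight as the current temporal position.
    """
    i = bisect.bisect_right(word_starts, current_time) - 1
    return max(i, 0)  # before the first word, highlight the first word

starts = [0.0, 0.6, 1.4, 2.1]   # time codes for four recognized lyric words
current = karaoke_index(starts, 1.5)  # 2 -> the third word is highlighted
```

`bisect_right` keeps the lookup O(log n), which matters when the index must be refreshed on every display frame.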
33. The method as claimed in claim 2, further comprising the step of detecting at least one note encoded in the audio signal according to an encoding scheme, further comprising the providing of the detected at least one note on said graphic representation.
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/067,131 US7343082B2 (en) | 2001-09-12 | 2001-09-12 | Universal guide track |
US10/067,131 | 2001-09-12 | ||
PCT/CA2002/001386 WO2003023765A1 (en) | 2001-09-12 | 2002-09-12 | Method and device for processing audiovisual data using speech recognition |
Publications (2)
Publication Number | Publication Date |
---|---|
CA2538981A1 true CA2538981A1 (en) | 2003-03-20 |
CA2538981C CA2538981C (en) | 2011-07-26 |
Family
ID=22073905
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CA2538981A Expired - Fee Related CA2538981C (en) | 2001-09-12 | 2002-09-12 | Method and device for processing audiovisual data using speech recognition |
Country Status (6)
Country | Link |
---|---|
US (2) | US7343082B2 (en) |
EP (1) | EP1425736B1 (en) |
AT (1) | ATE368277T1 (en) |
CA (1) | CA2538981C (en) |
DE (1) | DE60221408D1 (en) |
WO (1) | WO2003023765A1 (en) |
Families Citing this family (49)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9286941B2 (en) | 2001-05-04 | 2016-03-15 | Legend3D, Inc. | Image sequence enhancement and motion picture project management system |
US8897596B1 (en) | 2001-05-04 | 2014-11-25 | Legend3D, Inc. | System and method for rapid image sequence depth enhancement with translucent elements |
US8401336B2 (en) | 2001-05-04 | 2013-03-19 | Legend3D, Inc. | System and method for rapid image sequence depth enhancement with augmented computer-generated elements |
US7343082B2 (en) * | 2001-09-12 | 2008-03-11 | Ryshco Media Inc. | Universal guide track |
US7587318B2 (en) * | 2002-09-12 | 2009-09-08 | Broadcom Corporation | Correlating video images of lip movements with audio signals to improve speech recognition |
US8009966B2 (en) | 2002-11-01 | 2011-08-30 | Synchro Arts Limited | Methods and apparatus for use in sound replacement with automatic synchronization to images |
KR20050085344A (en) * | 2002-12-04 | 2005-08-29 | 코닌클리즈케 필립스 일렉트로닉스 엔.브이. | Synchronization of signals |
US7142250B1 (en) * | 2003-04-05 | 2006-11-28 | Apple Computer, Inc. | Method and apparatus for synchronizing audio and video streams |
WO2004093059A1 (en) * | 2003-04-18 | 2004-10-28 | Unisay Sdn. Bhd. | Phoneme extraction system |
WO2004100128A1 (en) * | 2003-04-18 | 2004-11-18 | Unisay Sdn. Bhd. | System for generating a timed phoneme and visem list |
JP3945778B2 (en) * | 2004-03-12 | 2007-07-18 | インターナショナル・ビジネス・マシーンズ・コーポレーション | Setting device, program, recording medium, and setting method |
GB2424534B (en) * | 2005-03-24 | 2007-09-05 | Zootech Ltd | Authoring audiovisual content |
US20070011012A1 (en) * | 2005-07-11 | 2007-01-11 | Steve Yurick | Method, system, and apparatus for facilitating captioning of multi-media content |
US8060591B1 (en) | 2005-09-01 | 2011-11-15 | Sprint Spectrum L.P. | Automatic delivery of alerts including static and dynamic portions |
US7653418B1 (en) * | 2005-09-28 | 2010-01-26 | Sprint Spectrum L.P. | Automatic rotation through play out of audio-clips in response to detected alert events |
ATE440334T1 (en) * | 2006-02-10 | 2009-09-15 | Harman Becker Automotive Sys | SYSTEM FOR VOICE-CONTROLLED SELECTION OF AN AUDIO FILE AND METHOD THEREOF |
US8713191B1 (en) | 2006-11-20 | 2014-04-29 | Sprint Spectrum L.P. | Method and apparatus for establishing a media clip |
US7747290B1 (en) | 2007-01-22 | 2010-06-29 | Sprint Spectrum L.P. | Method and system for demarcating a portion of a media file as a ringtone |
US8179475B2 (en) * | 2007-03-09 | 2012-05-15 | Legend3D, Inc. | Apparatus and method for synchronizing a secondary audio track to the audio track of a video source |
US20080256136A1 (en) * | 2007-04-14 | 2008-10-16 | Jerremy Holland | Techniques and tools for managing attributes of media content |
US20080263433A1 (en) * | 2007-04-14 | 2008-10-23 | Aaron Eppolito | Multiple version merge for media production |
US8751022B2 (en) * | 2007-04-14 | 2014-06-10 | Apple Inc. | Multi-take compositing of digital media assets |
US20080295040A1 (en) * | 2007-05-24 | 2008-11-27 | Microsoft Corporation | Closed captions for real time communication |
TWI341956B (en) * | 2007-05-30 | 2011-05-11 | Delta Electronics Inc | Projection apparatus with function of speech indication and control method thereof for use in the apparatus |
US9390169B2 (en) * | 2008-06-28 | 2016-07-12 | Apple Inc. | Annotation of movies |
US8265450B2 (en) * | 2009-01-16 | 2012-09-11 | Apple Inc. | Capturing and inserting closed captioning data in digital video |
FR2955183B3 (en) * | 2010-01-11 | 2012-01-13 | Didier Calle | METHOD FOR AUTOMATICALLY PROCESSING DIGITAL DATA FOR DOUBLING OR POST SYNCHRONIZATION OF VIDEOS |
US8572488B2 (en) * | 2010-03-29 | 2013-10-29 | Avid Technology, Inc. | Spot dialog editor |
US8744239B2 (en) | 2010-08-06 | 2014-06-03 | Apple Inc. | Teleprompter tool for voice-over tool |
US8730232B2 (en) | 2011-02-01 | 2014-05-20 | Legend3D, Inc. | Director-style based 2D to 3D movie conversion system and method |
US8621355B2 (en) | 2011-02-02 | 2013-12-31 | Apple Inc. | Automatic synchronization of media clips |
US9241147B2 (en) | 2013-05-01 | 2016-01-19 | Legend3D, Inc. | External depth map transformation method for conversion of two-dimensional images to stereoscopic images |
US9407904B2 (en) | 2013-05-01 | 2016-08-02 | Legend3D, Inc. | Method for creating 3D virtual reality from 2D images |
US9288476B2 (en) | 2011-02-17 | 2016-03-15 | Legend3D, Inc. | System and method for real-time depth modification of stereo images of a virtual reality environment |
US9282321B2 (en) | 2011-02-17 | 2016-03-08 | Legend3D, Inc. | 3D model multi-reviewer system |
US9280905B2 (en) * | 2011-12-12 | 2016-03-08 | Inkling Systems, Inc. | Media outline |
WO2014018652A2 (en) | 2012-07-24 | 2014-01-30 | Adam Polak | Media synchronization |
US9007365B2 (en) | 2012-11-27 | 2015-04-14 | Legend3D, Inc. | Line depth augmentation system and method for conversion of 2D images to 3D images |
US9547937B2 (en) | 2012-11-30 | 2017-01-17 | Legend3D, Inc. | Three-dimensional annotation system and method |
US9007404B2 (en) | 2013-03-15 | 2015-04-14 | Legend3D, Inc. | Tilt-based look around effect image enhancement method |
US9438878B2 (en) | 2013-05-01 | 2016-09-06 | Legend3D, Inc. | Method of converting 2D video to 3D video using 3D object models |
US8719032B1 (en) | 2013-12-11 | 2014-05-06 | Jefferson Audio Video Systems, Inc. | Methods for presenting speech blocks from a plurality of audio input data streams to a user in an interface |
US20160042766A1 (en) * | 2014-08-06 | 2016-02-11 | Echostar Technologies L.L.C. | Custom video content |
GB2553960A (en) | 2015-03-13 | 2018-03-21 | Trint Ltd | Media generating and editing system |
US9609307B1 (en) | 2015-09-17 | 2017-03-28 | Legend3D, Inc. | Method of converting 2D video to 3D video using machine learning |
US10387543B2 (en) * | 2015-10-15 | 2019-08-20 | Vkidz, Inc. | Phoneme-to-grapheme mapping systems and methods |
GB201715753D0 (en) * | 2017-09-28 | 2017-11-15 | Royal Nat Theatre | Caption delivery system |
CN112653916B (en) * | 2019-10-10 | 2023-08-29 | 腾讯科技(深圳)有限公司 | Method and equipment for synchronously optimizing audio and video |
US11545134B1 (en) * | 2019-12-10 | 2023-01-03 | Amazon Technologies, Inc. | Multilingual speech translation with adaptive speech synthesis and adaptive physiognomy |
Family Cites Families (33)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE3170907D1 (en) | 1981-01-19 | 1985-07-18 | Richard Welcher Bloomstein | Apparatus and method for creating visual images of lip movements |
GB2101795B (en) | 1981-07-07 | 1985-09-25 | Cross John Lyndon | Dubbing translations of sound tracks on films |
CA1270063A (en) | 1985-05-14 | 1990-06-05 | Kouji Miyao | Translating apparatus |
US5155805A (en) | 1989-05-08 | 1992-10-13 | Apple Computer, Inc. | Method and apparatus for moving control points in displaying digital typeface on raster output devices |
US5159668A (en) | 1989-05-08 | 1992-10-27 | Apple Computer, Inc. | Method and apparatus for manipulating outlines in improving digital typeface on raster output devices |
EP0526064B1 (en) | 1991-08-02 | 1997-09-10 | The Grass Valley Group, Inc. | Video editing system operator interface for visualization and interactive control of video material |
US5434678A (en) | 1993-01-11 | 1995-07-18 | Abecassis; Max | Seamless transmission of non-sequential video segments |
US5481296A (en) | 1993-08-06 | 1996-01-02 | International Business Machines Corporation | Apparatus and method for selectively viewing video information |
JP3356536B2 (en) | 1994-04-13 | 2002-12-16 | 松下電器産業株式会社 | Machine translation equipment |
US5717468A (en) | 1994-12-02 | 1998-02-10 | International Business Machines Corporation | System and method for dynamically recording and displaying comments for a video movie |
JP4078677B2 (en) | 1995-10-08 | 2008-04-23 | イーサム リサーチ デヴェロップメント カンパニー オブ ザ ヘブライ ユニヴァーシティ オブ エルサレム | Method for computerized automatic audiovisual dubbing of movies |
JP3454396B2 (en) | 1995-10-11 | 2003-10-06 | 株式会社日立製作所 | Video change point detection control method, playback stop control method based thereon, and video editing system using them |
US5732184A (en) | 1995-10-20 | 1998-03-24 | Digital Processing Systems, Inc. | Video and audio cursor video editing system |
US5880788A (en) | 1996-03-25 | 1999-03-09 | Interval Research Corporation | Automated synchronization of video image sequences to new soundtracks |
US6154601A (en) | 1996-04-12 | 2000-11-28 | Hitachi Denshi Kabushiki Kaisha | Method for editing image information with aid of computer and editing system |
US5832171A (en) | 1996-06-05 | 1998-11-03 | Juritech, Inc. | System for creating video of an event with a synchronized transcript |
JPH1074204A (en) | 1996-06-28 | 1998-03-17 | Toshiba Corp | Machine translation method and text/translation display method |
EP0848850A1 (en) | 1996-07-08 | 1998-06-24 | Régis Dubos | Audio-visual method and devices for dubbing films |
US5969716A (en) | 1996-08-06 | 1999-10-19 | Interval Research Corporation | Time-based media processing system |
AU6313498A (en) | 1997-02-26 | 1998-09-18 | Tall Poppy Records Limited | Sound synchronizing |
US6134378A (en) | 1997-04-06 | 2000-10-17 | Sony Corporation | Video signal processing device that facilitates editing by producing control information from detected video signal information |
FR2765354B1 (en) | 1997-06-25 | 1999-07-30 | Gregoire Parcollet | FILM DUBBING SYNCHRONIZATION SYSTEM |
EP0899737A3 (en) | 1997-08-18 | 1999-08-25 | Tektronix, Inc. | Script recognition using speech recognition |
DE19740119A1 (en) * | 1997-09-12 | 1999-03-18 | Philips Patentverwaltung | System for cutting digital video and audio information |
US6174170B1 (en) * | 1997-10-21 | 2001-01-16 | Sony Corporation | Display of text symbols associated with audio data reproducible from a recording disc |
JPH11162152A (en) | 1997-11-26 | 1999-06-18 | Victor Co Of Japan Ltd | Lyric display control information editing device |
JPH11289512A (en) * | 1998-04-03 | 1999-10-19 | Sony Corp | Editing list preparing device |
US6490563B2 (en) * | 1998-08-17 | 2002-12-03 | Microsoft Corporation | Proofreading with text to speech feedback |
IT1314671B1 (en) * | 1998-10-07 | 2002-12-31 | Cselt Centro Studi Lab Telecom | PROCEDURE AND EQUIPMENT FOR THE ANIMATION OF A SYNTHESIZED HUMAN FACE MODEL DRIVEN BY AN AUDIO SIGNAL. |
US20010044719A1 (en) * | 1999-07-02 | 2001-11-22 | Mitsubishi Electric Research Laboratories, Inc. | Method and system for recognizing, indexing, and searching acoustic signals |
US7047191B2 (en) * | 2000-03-06 | 2006-05-16 | Rochester Institute Of Technology | Method and system for providing automated captioning for AV signals |
US7085842B2 (en) * | 2001-02-12 | 2006-08-01 | Open Text Corporation | Line navigation conferencing system |
US7343082B2 (en) * | 2001-09-12 | 2008-03-11 | Ryshco Media Inc. | Universal guide track |
-
2001
- 2001-09-12 US US10/067,131 patent/US7343082B2/en not_active Expired - Fee Related
-
2002
- 2002-09-12 DE DE60221408T patent/DE60221408D1/en not_active Expired - Lifetime
- 2002-09-12 EP EP02759989A patent/EP1425736B1/en not_active Expired - Lifetime
- 2002-09-12 CA CA2538981A patent/CA2538981C/en not_active Expired - Fee Related
- 2002-09-12 AT AT02759989T patent/ATE368277T1/en not_active IP Right Cessation
- 2002-09-12 WO PCT/CA2002/001386 patent/WO2003023765A1/en active IP Right Grant
-
2004
- 2004-03-11 US US10/797,576 patent/US20040234250A1/en not_active Abandoned
Also Published As
Publication number | Publication date |
---|---|
ATE368277T1 (en) | 2007-08-15 |
EP1425736B1 (en) | 2007-07-25 |
EP1425736A1 (en) | 2004-06-09 |
US7343082B2 (en) | 2008-03-11 |
WO2003023765A1 (en) | 2003-03-20 |
US20030049015A1 (en) | 2003-03-13 |
CA2538981C (en) | 2011-07-26 |
US20040234250A1 (en) | 2004-11-25 |
DE60221408D1 (en) | 2007-09-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CA2538981A1 (en) | Method and device for processing audiovisual data using speech recognition | |
US11190855B2 (en) | Automatic generation of descriptive video service tracks | |
US20060136226A1 (en) | System and method for creating artificial TV news programs | |
US11942093B2 (en) | System and method for simultaneous multilingual dubbing of video-audio programs | |
JP2004361965A (en) | Text-to-speech conversion system for interlocking with multimedia and method for structuring input data of the same | |
CN110740275B (en) | Nonlinear editing system | |
US11908449B2 (en) | Audio and video translator | |
CN110781649A (en) | Subtitle editing method and device, computer storage medium and electronic equipment | |
JP2020012855A (en) | Device and method for generating synchronization information for text display | |
CN117596433B (en) | International Chinese teaching audiovisual courseware editing system based on time axis fine adjustment | |
Lu et al. | Visualtts: Tts with accurate lip-speech synchronization for automatic voice over | |
Jankowska et al. | Reading rate in filmic audio description | |
Di Gangi et al. | Automatic video dubbing at AppTek | |
JP2002344805A (en) | Method for controlling subtitles display for open caption | |
CN115633136A (en) | Full-automatic music video generation method | |
JP4595098B2 (en) | Subtitle transmission timing detection device | |
CN117240983B (en) | Method and device for automatically generating sound drama | |
CN101266790A (en) | Device and method for automatic time marking of text file | |
JP2003244539A (en) | Consecutive automatic caption processing system | |
JPS6315294A (en) | Voice analysis system | |
US11947924B2 (en) | Providing translated subtitle for video content | |
Bazaz et al. | Automated Dubbing and Facial Synchronization using Deep Learning | |
Pamisetty et al. | Subtitle Synthesis using Inter and Intra utterance Prosodic Alignment for Automatic Dubbing | |
JP2004336606A (en) | Caption production system | |
Weiss | A Framework for Data-driven Video-realistic Audio-visual Speech-synthesis. |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
EEER | Examination request | ||
MKLA | Lapsed |
Effective date: 20200914 |