WO2008113064A1 - Methods and systems for converting video content and information to a sequenced media delivery format - Google Patents


Info

Publication number
WO2008113064A1
WO2008113064A1 PCT/US2008/057182
Authority
WO
WIPO (PCT)
Prior art keywords
text
video
information
presentation
cadenced
Application number
PCT/US2008/057182
Other languages
French (fr)
Inventor
John F. Ellingson
Original Assignee
Vubotics, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Vubotics, Inc. filed Critical Vubotics, Inc.
Publication of WO2008113064A1 publication Critical patent/WO2008113064A1/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85 Assembly of content; Generation of multimedia applications
    • H04N21/854 Content authoring
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/587 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal sub-sampling or interpolation, e.g. decimation or subsequent interpolation of pictures in a video sequence
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/85 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
    • H04N19/87 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving scene cut or scene change detection in combination with video compression
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N21/2343 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/234336 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by media transcoding, e.g. video is transformed into a slideshow of still pictures or audio is converted into text

Definitions

  • the inventions relate to communications, and in particular, relate to adding information to a communication. Even more particularly, the inventions relate to using a communication such as a video presentation to create a viewable presentation that may include information in addition to that found in the video presentation.
  • video presentation is used generally and herein to refer to any technology that allows for the presentation of scenes in motion.
  • video presentation is used synonymously generally and herein with the following terms unless otherwise noted: video clips, video stream, digital video, streaming media, film, and the like.
  • video presentation may not include any "video” per se, but may be a text/graphic document.
  • a video presentation may have a video portion and an audio portion.
  • video portion refers to that part of the video presentation that may be viewed or seen.
  • video may be used herein synonymously with the term “video portion” or “visual portion”.
  • audio portion is understood generally and is used herein as including any sound that is part of the video presentation such as speech (also referred to as dialogue), sub-vocalization, noise, music, sound effects, etc. whether audible to the human ear or not.
  • the term “audio” may be used herein synonymously with the term “audio portion”.
  • closed captioning may include any text and/or graphics that may constitute a transcription of the audio portion of the video presentation whether the closed captioning is typed, scripted, or otherwise. The transcription may not always be verbatim of what is said. In addition, the closed captioning may include information about who is speaking and may indicate relevant sounds.
  • closed in closed captioning generally means that not all viewers see the captions - only those who decode or activate the captions. Nevertheless, the term “closed captioning” is defined herein to include other types of “captioning” such as “open captions” where the captions are visible to all viewers, and to "burned-in captions" which are permanently visible in a video.
  • closed captioning is defined differently from the term “subtitles” in the United States and Canada, but the term “closed captioning” is defined herein to also include “subtitles”.
  • subtitles applies typically to the translation of the dialogue and perhaps some onscreen text, whereas captioning aims to describe all significant audio content of a video as well as some "non-speech information" such as the identity of speakers and their manner of speaking, music, and sound effects.
  • the term “closed captioning” shall also include other transcription or narrative of the spoken word such as teleprompter data and the like.
  • a “still” may be a static photograph (or the like in similarity or by analogy) of the visual or video presentation.
  • a frame may be selected to be a still.
  • each frame representing a change in the background or visual presentation may be selected to be a "still”.
  • the inventions relate to communications such as video and other presentations. More particularly, the inventions relate to systems and methods whereby a video or other presentation may be used as the source or basis for a communication that provides portions of the video, provides the audio portion in text form, and adds information. Even more particularly, the inventions relate to systems and methods that allow a user to view the substance of a video presentation on a handheld or other device where the presented text corresponds to the audio of the video presentation and where additional information may be presented to the user.
  • a first embodiment may provide for the creation of a speed-adjustable view-only presentation from a source having video and audio where the created presentation includes information not necessarily included in the source.
  • the embodiment provides text corresponding to the audio of the source.
  • the text may be interactively processed and cadenced. Information is added to the text as background or otherwise.
  • a set of still frames may be created from the video of the source by creating a still frame respectively at intervals or for each change in the video.
  • the set of still frames may be combined or overlaid with the cadenced processed text including the added information.
  • the combination may be presentable as a slide show or otherwise at selectable variable speeds on a hand-held or other device.
  • the slide show may include a set of stills where a still is created from the video portion at respective predetermined intervals or for each change in the video portion.
  • the audio is converted into text (by closed captioning or otherwise), which may be cadenced and/or processed. Information may be added to the text as background or otherwise.
  • the set of stills is overlaid with the text to create the slide show.
  • the slide show may be presented at one or more of several selectable speeds.
  • a further embodiment may process a video presentation by making a still frame from the video portion each time a predetermined event happens in the video portion.
  • a predetermined event may be a change in camera angle, a change in motion of an actor, a change in volume of the audio portion, or an elapse of a predetermined interval.
  • a slideshow of still frames is made by combining the still frames.
  • Text is provided for the audio portion, and information is added to the text as background or otherwise.
  • the text may be processed such as by eliminating a gap in the text of a predetermined size, eliminating a special character, eliminating a marker, eliminating a delimiter, or correcting language.
  • the text may be cadenced.
  • the processing and cadencing of the text may be interactive processes.
  • the text with added information may be combined with the slideshow of still frames for presentation to a viewer or user who may adjust the speed of the presentation as desired.
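The event-driven still creation described in the embodiments above can be sketched in a few lines. This is an illustrative approximation only, not the patented method: frames are modeled as flat lists of grayscale pixel values, audio volume as one number per frame, and the function names, event tests, and thresholds are all hypothetical.

```python
def mean_abs_diff(a, b):
    """Average absolute pixel difference between two equal-size frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def select_stills(frames, volumes, pixel_threshold=30,
                  volume_threshold=10, interval_frames=4):
    """Return the indices of frames chosen as stills.

    A still is taken for the first frame, for any large visual change
    (standing in for a camera-angle or actor-motion change), for any
    large change in audio volume, and whenever interval_frames frames
    have elapsed since the last still (the "predetermined interval").
    """
    stills = [0]
    for i in range(1, len(frames)):
        visual_change = mean_abs_diff(frames[i], frames[i - 1]) > pixel_threshold
        volume_change = abs(volumes[i] - volumes[i - 1]) > volume_threshold
        interval_elapsed = (i - stills[-1]) >= interval_frames
        if visual_change or volume_change or interval_elapsed:
            stills.append(i)
    return stills

# A scene cut at frame 3 and a volume jump at frame 5 each trigger a still.
frames = [[0] * 4] * 3 + [[100] * 4] * 5
volumes = [5, 5, 5, 5, 5, 30, 30, 30]
print(select_stills(frames, volumes))  # [0, 3, 5]
```

A production implementation would of course decode real video and audio streams; the point here is only the shape of the decision: several independent "predetermined events," any one of which yields a still.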
  • Figure 1 is a flow diagram illustrating an exemplary embodiment of the inventions.
  • Figures 2A - 2F are exemplary screen shots illustrating use of an exemplary embodiment of the inventions.
  • the inventions relate to methods and systems that may provide advantages such as reduction in the size of or with respect to a source presentation for ease of use, storage, transfer, etc.
  • a set of stills from the source presentation may be created and assembled. Each still in the set may represent a background, scene or other change in the video portion of the source presentation.
  • the audio portion of the source presentation may be processed into text and cadenced for serial presentation.
  • the assembled set of stills and the cadenced text may be combined into a slide show with cadenced text. Effectively, the result may be a reduced version in size of or with respect to the original source presentation, but with the additional advantage of a cadenced readable text that corresponds to the audio portion.
  • Such a reduction in size of or with respect to the source presentation may be advantageous in many situations.
  • Consider a potential viewer of a video presentation. The viewer may choose to view the slide show with cadenced text prior to or instead of reviewing the video presentation.
  • the viewer may make the choice because the slide show with cadenced text is smaller in size (and all the advantages that a smaller size provides), and because the slide show with cadenced text substantially provides the gist of the video presentation without its accompanying overhead.
  • FIG. 1 is a flow diagram illustrating operation of an exemplary embodiment of the inventions.
  • the source 10 is a video presentation (also referred to as "video source", "source”).
  • the video presentation generally includes a visual portion and an audio portion (and may, but does not have to, include closed captioning of the audio portion).
  • the exemplary embodiment may operate separately, but concurrently, on the respective portions (in whole or in part) of the video presentation, which may result in a reduced version of or with respect to the video presentation.
  • Alternative embodiments of the inventions may operate in other ways.
  • an exemplary embodiment may combine some or all of the actions separately described above (as well as others or not) in a control module.
  • the order of the actions as presented herein may be different in other embodiments.
  • the operation of the exemplary embodiment on the visual portion of the video presentation may be summarized by reference to the actions 12, and 14 on the left side of the flow diagram of Figure 1.
  • the operation on the audio portion of the video presentation may be summarized by reference to the actions 18, 20, 22, 23 on the right side of the flow diagram.
  • the results may be mixed in action 24 and may result in a slide show with cadenced text in action 26.
  • a “still” is a static photograph (or the like in similarity or by analogy) of the visual presentation.
  • each frame representing a change in the background or visual presentation may be selected to be a "still".
  • a change in the visual portion of the video presentation may be defined in a variety of ways, and such definition may vary from embodiment to embodiment and/or from use to use of an embodiment of the inventions.
  • a change in the visual portion of the video presentation may be a change in camera angle, and/or it may be the motion of an actor in the presentation such as when an actor may stand up or sit down.
  • a change that may result in a still is a change in the volume of the audio portion of the video presentation. If an actor raises his or her voice, a still may be created. If the actor lowers his or her voice, a still may be created.
  • the passage of a predetermined length of time in the playing of the video presentation may result in the creation of a still. For example, every two (2) seconds of a video presentation a still may be created of what appears in the presentation at that moment. Other events and/or characteristics may be chosen as one or more of the bases to create a still from the video portion of the presentation.
  • the creation of the stills in action 12 may occur in any manner.
  • the stills may be created manually by an operator who observes the video presentation.
  • the stills may be created automatically based on a set of criteria such as the previously referenced elapse of a certain amount of time and/or the change in volume of the audio portion of the video presentation.
  • the stills are assembled in a set of stills (also referred to as a set of scenes or set of still frames).
  • the stills are assembled in an order to correspond to the visual portion of the video presentation, but they do not necessarily have to be.
  • the order of the text presented may be altered from that in the original audio portion to correspond to the order of the stills.
  • action 14 may not be necessary in some embodiments because the stills may not need to be assembled. They may be created without such need. Further, if there is only one still, then it may not need to be assembled. As yet another alternative, in some cases, the stills may be identical and in that case, all of them or only less than all of them may be assembled (or not).
Operation on Audio Portion
  • the same source that provides the visual portion of the video presentation for the operation in actions 12 and 14 may provide the audio portion for the operation in actions 18, 20, 22, 23. Generally, the source is the same for both sides of the flow of actions, but this does not necessarily have to be the case.
  • the operation on the audio portion preferably results in a presentation of the audio portion in cadenced text/graphic form in an easy-to-read manner as an overlay and/or otherwise in association with the set of stills.
  • the speed of the presentation in cadenced text/graphic form may be controlled in some embodiments by the user, and may be varied to speed up, slow down, stop, or start according to the desires of the user.
  • the processing of the audio portion to obtain the cadenced words (and/or graphics) that are displayed in association with the set of stills may result in a sequential delivery and display of the audio content.
  • Further information about the sequential delivery and display of the audio content also may be found in the published patent application entitled Method for Controlling the Rate of Automated Flow and Navigation through Information Presented on a Digitally Controlled Electronic Device, United States Patent Application Publication No. 20070073917, Serial No. 515,950, published on March 29, 2007, which is incorporated herein by reference.
  • Another source for information on delivery and display of audio content may be found in United States Patent No. 6,056,551, Methods and Apparatus for Computer Aided Reading Training, issued May 2, 2000, which is incorporated herein by reference.
  • the operation on the audio portion of the video presentation may begin with caption file action 18.
  • Such action may be unnecessary, but it may provide for the "closed captioning" (or other transcription) of the audio portion, if such transcription has not already been done.
  • the video presentation may be passed through a NORPAK card to result in closed captioning of at least the audio portion of the video presentation.
  • One example of a NORPAK card is the TTX890 Multi-format SD/HD-SDI VANC De-embedder/Decoder Card from Norpak Corporation, Kanata, Ontario, Canada.
  • the NORPAK card may include encoding/insertion receiving, monitoring, and content control functions. Additional information about the NORPAK card may be obtained from www.norpak.ca.
  • Alternative transcription of the audio into text/graphics may be carried out. For example, such transcription may be done manually by a person listening to the audio portion and creating a text file (including graphics, if need be) corresponding to the audio.
  • processing action 20 may include a filter function and may provide for removing gaps or otherwise.
  • the processing action 20 may include removing special characters, markers, and delimiters.
  • the processing action 20 may include language correction functions to generally clean-up or correct the language corresponding to the audio portion of the presentation.
  • a processing action such as action 20 may be unnecessary as the text/graphics corresponding to the audio portion may be ready for further actions without such processing. For example, the text/graphics may be ready for cadencing in action 22 without processing action 20.
  • the result of the processing action 20 may be that of a text file or a stream of ASCII (American Standard Code for Information Interchange) text, but the result may vary depending upon the needs of the relevant system and methods.
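Processing action 20 might be approximated by a small filter of the following kind. The specific markers removed here (the ">>" speaker marker and bracketed sound cues common in caption files) and the function name are assumptions made for illustration; the patent does not prescribe them.

```python
import re

def clean_caption_text(raw):
    """Reduce a raw caption stream to a plain stream of ASCII text.

    Steps mirror the filter function of processing action 20: drop
    non-ASCII special characters, strip caption markers and delimiters,
    then collapse whitespace gaps of any size into single spaces.
    """
    text = raw.encode("ascii", "ignore").decode("ascii")   # special characters
    text = re.sub(r">>|\[[^\]]*\]", " ", text)             # markers/delimiters
    text = re.sub(r"\s+", " ", text)                       # gaps in the text
    return text.strip()

print(clean_caption_text(">> Hello   JOHN [music]  reporting\n"))
# Hello JOHN reporting
```

Language correction (spelling, casing, punctuation repair) could be layered on afterward; it is omitted here to keep the sketch minimal.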
  • the stream of ASCII text resulting from the processing action 20 is the input to the action of cadence video of words (and/or graphics) 22.
  • the input to the cadencing action may be other than a stream of ASCII text.
  • the elements that may be part of the cadencing action 22 are more fully described in the above referenced United States Patent Application Publication No. 20070073917, which is incorporated herein by reference.
  • Among the aspects of the cadencing action 22 that may be set are the speed of the presentation of the cadenced words (the overall speed and/or the respective speeds of individual words, phrases, sentences, etc.) and the display time of the presentation of the cadenced words (the overall presentation time and/or the respective presentation times of the individual words, phrases, sentences, etc.).
  • a user may be able to vary the speed, the display time, and/or other features.
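One simple way to realize a cadencing step of this kind is to assign each word a display time and scale the whole schedule by a user-selected speed factor. The timing constants below are illustrative assumptions only, not values taken from this disclosure or from the referenced Publication No. 20070073917.

```python
def cadence(words, base_ms=250, per_char_ms=30, speed=1.0):
    """Pair each word with a display time in milliseconds.

    Longer words are displayed longer, and speed is the user-adjustable
    factor (2.0 presents the words twice as fast; 0.5 half as fast).
    """
    return [(w, round((base_ms + per_char_ms * len(w)) / speed)) for w in words]

print(cadence(["Hello", "JOHN", "reporting"]))
print(cadence(["Hello", "JOHN", "reporting"], speed=2.0))  # twice as fast
```

Per-word overrides (e.g., a longer dwell on an emphasized name such as "JOHN" in Figure 2) would be a natural extension of the same schedule.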
  • an action 23 of insertion of additional information may take place.
  • the addition of information may take place in a variety of ways.
  • the additional information may be inserted in the gaps between words as the cadenced words are presented with the slide show of the set of stills created in actions 12 and 14.
  • the length of a gap between words in the stream of ASCII text may be determinable. If the gap is sufficient in length, one or more pieces of additional information may be inserted into that gap.
  • When the cadenced words are displayed with the slide show, the one or more pieces of additional information appear in the gap.
  • the information added may be the same each time, or it may vary. Whether the added information is the same each time may depend on the size of the available gaps, or not.
  • the additional information may or may not be related to the content of the stream of ASCII text.
  • the additional information may take the form of text and/or graphics.
  • the additional information may constitute information that a viewer may immediately recognize as associated with a particular source or origin.
  • additional information may need to appear only for the length of time that allows a user to recognize the additional information and/or the source associated with the additional information.
  • the additional information does not necessarily appear for the same length of time as the text/graphics to which it has been added.
  • multiple examples of added information may appear for respective varying lengths of time, whether the added information is the same in each case or not, and/or whether it is the same size or not, and/or whether it is added to respective gaps of the same size or not.
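The gap-insertion behavior just described might be sketched as follows, with words carrying start/end timestamps and each piece of additional information carrying its own display length. The tuple layout, the minimum-gap threshold, and the cycling through a pool of items are all assumptions for illustration.

```python
from itertools import cycle

def insert_into_gaps(timed_words, extras, min_gap=1.0):
    """Insert additional information into sufficiently long gaps.

    timed_words: list of (word, start_sec, end_sec) tuples.
    extras: list of (text, display_sec) items, cycled through in order.
    An extra is inserted only when the gap before the next word is at
    least min_gap seconds, and it is shown only for its own display
    time, which need not fill the whole gap.
    """
    pool = cycle(extras)
    out = []
    for i, (word, start, end) in enumerate(timed_words):
        out.append((word, start, end))
        nxt = timed_words[i + 1][1] if i + 1 < len(timed_words) else None
        if nxt is not None and nxt - end >= min_gap:
            text, dur = next(pool)
            out.append((text, end, min(end + dur, nxt)))
    return out

words = [("Hello", 0.0, 0.4), ("JOHN", 0.6, 1.0), ("reporting", 3.0, 3.5)]
print(insert_into_gaps(words, [("ADDED INFO", 1.5)]))
```

Here the 0.2-second gap after "Hello" is too short, while the 2-second gap after "JOHN" receives the inserted item for its 1.5-second display time.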
  • Another way additional information may be made part of the processed information is by adding it so as to appear as background to the text of the cadenced words.
  • the subject matter of the cadenced words may be a travelogue description of the Mt. Rushmore National Monument.
  • a graphic image of Mt. Rushmore may appear as background to the cadenced words.
  • the overlay of cadenced words may include its own background (separate from the visual portion of the video presentation).
  • the background to the overlay of cadenced words may complement, contrast with, and/or have no relation to the visual portion of the video presentation.
  • the added information used as background may appear as such for all of the text/graphic presentation, or only part(s) of it.
  • the information added as background may be the same (in whole or in part) as information added in gaps in the text/graphics. Information added as background may continue to appear as background when information added in gaps appears in the presentation (or not, or sometimes).
  • Display control may include anything related to the display of the stream of ASCII text or other text/graphics.
  • the display control may include configurations of the stream of ASCII text or other text/graphics with respect to color, position, size, opacity, etc.
  • the relevant stream of ASCII text or other text/graphics may be changed, added to, subtracted from, etc. by the display control.
  • the output of the display control may result in cadenced words (with or without additional information and with or without display control elements).
  • the term "sequenced words" is not to be limited to just "words", but may include other text, graphics, etc. relating to the audio portion of the video presentation.
  • the left-hand and right-hand actions come together in the mix as video action 24.
  • In action 24, the cadenced words are added to (and/or mixed with, and/or otherwise combined with) the assembled set of stills made from the visual portion of the video presentation.
  • the result is effectively a slide show with cadenced words 26.
  • the cadenced words from the audio portion of the video presentation are made into a video that is laid over or on top of the slide show (or set of stills) (figuratively or literally) from the visual portion of the video presentation.
  • a viewer may view the assembled set of stills and may be able to concurrently view the cadenced words (with or without additional information).
  • the cadenced words appear smoothly to the viewer in an easy-to-read manner.
  • an embodiment of the inventions may make use of a CHYRON character generation card, such as the Digital pcCODI and Analog pcCODI PCI board sets from Chyron Corporation, Melville, New York. For more information see www.chyron.com.
  • the mixing action 24 coordinates the set of stills with the processed cadenced (or other) text/graphics taken from the audio portion of the video presentation.
  • a source is a music video.
  • the person viewing the music video processed in accordance with an exemplary embodiment of the inventions might view the words sung as text/graphics in correspondence with the stills presenting the musicians singing those words.
  • the text/graphics may match the set of stills.
  • the text/graphics need not always correlate to the set of stills as the audio portion correlates to the video portion of the source presentation.
  • the text/graphics may be presented in an order that does not correspond to the order of the audio portion, and/or the video portion. This latter example may be an unusual case, but may have its applications in creation of art or new works of authorship or otherwise.
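The coordination performed by mixing action 24 can be sketched as a time alignment: each cadenced word is paired with whichever still is on screen at that word's moment. The data layout below is a hypothetical simplification, and the function name merely echoes the action's label in Figure 1.

```python
import bisect

def mix_as_video(still_times, cadenced_words):
    """Pair each cadenced word with the still on screen at its time.

    still_times: sorted start times (seconds) of each still in the set.
    cadenced_words: list of (word, time_sec) pairs.
    Returns the slide show as (still_index, word) pairs in display order.
    """
    show = []
    for word, t in cadenced_words:
        idx = bisect.bisect_right(still_times, t) - 1
        show.append((max(idx, 0), word))
    return show

stills = [0.0, 2.0, 5.0]  # three stills begin at these times
words = [("Hello", 0.1), ("JOHN", 0.9), ("reporting", 2.5)]
print(mix_as_video(stills, words))  # [(0, 'Hello'), (0, 'JOHN'), (1, 'reporting')]
```

Presenting the text in an order other than that of the audio portion, as contemplated above, would amount to reordering cadenced_words before this alignment step.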
  • the inventions may include advantageous methods and systems that may reduce the size of a source presentation for ease of use, storage, transfer, etc.
  • the result, a synchronized text format combined with a slide-show-based video, may offer an advancement in the field of compression.
  • a viewer of the slide show with cadenced words may not need the audio portion of the video presentation in audio form because the audio portion may be presented to the viewer in a cadenced text format that may be synchronized to the assembled set of scenes that make up the slide show.
  • a viewer may speed his or her review and comprehension of a video presentation by viewing the corresponding slide show with cadenced words resulting from operation of an exemplary embodiment of the present inventions on the video presentation.
  • Figures 2A - 2F illustrate respective screen shots as they may appear on a viewer's personal digital assistant (PDA) or other device when the viewer makes use of an exemplary embodiment of the inventions in watching the beginning of an evening newscast.
  • the video source presentation includes a video portion and an audio portion.
  • a set of stills 30a-f may be created from the video source pursuant to any of the processes or systems described above (or alternatives thereto). In this case, each still 30a-f is the same, but it does not always have to be the case. These could be copies of a single still, or they could be six identical stills, or a mix of identical stills and new stills.
  • the text/graphic overlay 32a-f includes the text of the audio portion of the video source plus added information.
  • the text graphic overlay 32a-c includes the text spoken by the announcer in the video portion.
  • the text/graphic overlay 32a-c appears as processed cadenced text.
  • the first screen shot 30a includes only the text "Hello" as part of the graphic overlay 32a.
  • the next screen shot 30b includes the text "JOHN" as part of the graphic overlay 32b.
  • the text of "JOHN" has been processed in this example to emphasize the announcer's name, but that does not have to be the case in all embodiments.
  • the cadencing of the text is evident from the fact that the text of "JOHN" does not immediately follow the text "Hello" on the same screen shot, but is spaced a bit apart in time for ease of the viewer's reading comprehension.
  • the third screen shot 30c includes the text "reporting" as part of the graphic overlay 32c.
  • the three terms are separately cadenced and overlaid onto separate screen shots for ease of comprehensibility.
  • Other embodiments may be able to cadence more than one word per screen shot.
  • a user may be able to vary some parts of the exemplary embodiment's presentation features such as speed of display of text, duration of display of text, etc.
  • the text/graphic overlay 32d-f continues, but instead of text from the audio portion of the source presentation, the text/graphic overlay 32d-f shows other features.
  • the text/graphic overlay 32d in screen shot 30d is blank or at least does not appear to contain text or graphics. This may indicate a pause in the announcer's speech (or not).
  • Screen shots 30e-f include text/graphic overlay 32e-f that illustrate the addition of information. The information may have been added because of a gap in the announcer's speech or otherwise.
  • the trademark Coca-Cola appears as the text/graphic overlay 32e.
  • the trademarks including an illustration of a Coke bottle with the words "welcome to the Coke side of life" appear as the text/graphic overlay 32f.
  • the trademarks were added to the text/graphic overlay 32e-f; the trademarks are not part of the audio portion of the source presentation.
  • the trademarks appearing as text graphic overlay 32e-f may appear for a brief amount of time so that they disappear once an average viewer would be thought to have recognized them and before becoming distracted from the topic of the source presentation.
  • added information, such as the trademarks appearing as text/graphic overlays 32e-f, may be made to appear for as long or as short a time as decided by the person implementing the exemplary embodiment or otherwise.
  • the trademarks used as text/graphic overlays 32e-f are inserted into gaps in the text that correspond to the audio portion of the source presentation.
  • An alternative would be to use the same trademarks or other information as background in the text/graphic overlays 32a-f.
  • Another alternative would be to use the same trademarks and/or other information as added information in the text/graphic overlays 32a-f and as background in those overlays 32a-f.
  • Other alternatives are possible.
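The six screen shots of Figures 2A-2F can be reproduced end to end as a short sketch: one still repeated, one cadenced word per screen, a blank overlay for the pause, and added information on the final screens. The helper and the placeholder strings below are hypothetical; the placeholders stand in for the trademarks described in the walkthrough.

```python
def build_screens(still, spoken, extras, total=6):
    """Assemble a Figure 2A-2F style sequence of (still, overlay) pairs.

    The spoken words occupy the first screens, one word per screen; a
    blank overlay marks the pause in the announcer's speech; the
    remaining screens carry the added information.
    """
    overlays = list(spoken) + [""]               # 2A-2C words, 2D pause
    overlays += extras[: total - len(overlays)]  # 2E-2F added information
    return [(still, o) for o in overlays[:total]]

screens = build_screens("newsdesk still",
                        ["Hello", "JOHN", "reporting"],
                        ["ADDED TEXT", "ADDED LOGO"])
for still, overlay in screens:
    print(still, "|", overlay)
```

The same sequence could instead carry the added information as a persistent background on every screen, or both in gaps and as background, matching the alternatives described above.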

Abstract

Methods and systems for creating a presentation from a source and adding information as part of the created presentation. An embodiment may create a speed-adjustable view-only presentation from a source including video and audio portions where the created presentation includes information not necessarily included in the source. The embodiment provides text corresponding to the audio portion of the source. The text may be interactively processed and cadenced. Information may be added as background or otherwise to the text. A set of still frames may be created from the video portion of the source by creating a still frame respectively at intervals or for each change in the video portion. The set of still frames may be combined or overlaid with the cadenced processed text including the added information. The combination may be presentable as a slide show or otherwise at selectable variable speeds on a hand-held or other device.

Description

METHODS AND SYSTEMS FOR CONVERTING VIDEO CONTENT AND INFORMATION TO A SEQUENCED MEDIA DELIVERY FORMAT
CROSS-REFERENCE TO RELATED APPLICATIONS
The present application claims priority to and benefit of the prior filed co-pending and commonly owned provisional application, filed in the United States Patent and Trademark Office on March 15, 2007, assigned Serial No. 60/895,121, entitled Methods and Systems for Converting Video with Captioning to Still Frames and Cadenced Text for Reduced File Size and User Controlled Speed of Information, and incorporated herein by reference. The present application also claims priority to and benefit of the prior filed co-pending and commonly owned provisional application, filed in the United States Patent and Trademark Office on March 15, 2007, assigned Serial No. 60/895,108, entitled Methods and Systems for Presenting Information Based on Captioning or Dialogue Associated with a Video Information, and incorporated herein by reference.
FIELD OF THE INVENTIONS
The inventions relate to communications, and in particular, relate to adding information to a communication. Even more particularly, the inventions relate to using a communication such as a video presentation to create a viewable presentation that may include information in addition to that found in the video presentation.
BACKGROUND
The term "video presentation" is used generally and herein to refer to any technology that allows for the presentation of scenes in motion. The term "video presentation" is used synonymously generally and herein with the following terms unless otherwise noted: video clips, video stream, digital video, streaming media, film, and the like. In some cases, the "video presentation" may not include any "video" per se, but may be a text/graphic document.
A video presentation may have a video portion and an audio portion. The term "video portion" as used generally and herein refers to that part of the video presentation that may be viewed or seen. The term "video" may be used herein synonymously with the term "video portion" or "visual portion". The term "audio portion" is understood generally and is used herein as including any sound that is part of the video presentation such as speech (also referred to as dialogue), sub-vocalization, noise, music, sound effects, etc. whether audible to the human ear or not. The term "audio" may be used herein synonymously with the term "audio portion".
The term "closed captioning" may include any text and/or graphics that may constitute a transcription of the audio portion of the video presentation whether the closed captioning is typed, scripted, or otherwise. The transcription may not always be verbatim of what is said. In addition, the closed captioning may include information about who is speaking and may indicate relevant sounds. The term "closed" in closed captioning generally means that not all viewers see the captions - only those who decode or activate the captions. Nevertheless, the term "closed captioning" is defined herein to include other types of "captioning" such as "open captions" where the captions are visible to all viewers, and "burned-in captions" which are permanently visible in a video. Further, the term "closed captioning" is defined differently from the term "subtitles" in the United States and Canada, but the term "closed captioning" is defined herein to also include "subtitles". In the United States and Canada, the term "subtitles" applies typically to the translation of the dialogue and perhaps some onscreen text, whereas captioning aims to describe all significant audio content of a video as well as some "non-speech information" such as the identity of speakers and their manner of speaking, music, and sound effects. Further, the term "closed captioning" shall also include other transcription or narrative of the spoken word such as teleprompter data and the like.
A "still" may be a static photograph (or the like in similarity or by analogy) of the visual or video presentation. In a video presentation including frames, a frame may be selected to be a still. For example, each frame representing a change in the background or visual presentation may be selected to be a "still".
SUMMARY
Stated generally, the inventions relate to communications such as video and other presentations. More particularly, the inventions relate to systems and methods whereby a video or other presentation may be used as the source or basis for a communication that provides portions of the video, provides the audio portion in text form, and adds information. Even more particularly, the inventions relate to systems and methods that allow a user to view the substance of a video presentation on a handheld or other device where the presented text corresponds to the audio of the video presentation and where additional information may be presented to the user.
The inventions may be implemented in various embodiments. A first embodiment may provide for the creation of a speed-adjustable view-only presentation from a source having video and audio where the created presentation includes information not necessarily included in the source. The embodiment provides text corresponding to the audio of the source. The text may be interactively processed and cadenced. Information is added to the text as background or otherwise. A set of still frames may be created from the video of the source by creating a still frame respectively at intervals or for each change in the video. The set of still frames may be combined or overlaid with the cadenced processed text including the added information. The combination may be presentable as a slide show or otherwise at selectable variable speeds on a hand-held or other device.
Another embodiment of the inventions may use a video presentation to create a slide show with an overlay of text. The slide show may include a set of stills where a still is created from the video portion at respective predetermined intervals or for each change in the video portion. The audio is converted into text (by closed captioning or otherwise), which may be cadenced and/or processed. Information may be added to the text as background or otherwise. The set of stills is overlaid with the text to create the slide show. The slide show may be presented at one or more of several selectable speeds.
A further embodiment may process a video presentation by making a still frame from the video portion each time a predetermined event happens in the video portion. A predetermined event may be a change in camera angle, a change in motion of an actor, a change in volume of the audio portion, or an elapse of a predetermined interval. A slideshow of still frames is made by combining the still frames. Text is provided for the audio portion, and information is added to the text as background or otherwise. The text may be processed such as by eliminating a gap in the text of a predetermined size, eliminating a special character, eliminating a marker, eliminating a delimiter, or correcting language. The text may be cadenced. The processing and cadencing of the text may be interactive processes. The text with added information (whether processed and/or cadenced) may be combined with the slideshow of still frames for presentation to a viewer or user who may adjust the speed of the presentation as desired.
A few exemplary embodiments according to the inventions have been summarized above. Many more are possible; the inventions are not to be limited to these examples. Other features and advantages of the inventions may be more clearly understood and appreciated from a review of the following detailed description and by reference to the appended drawings and claims.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 is a flow diagram illustrating an exemplary embodiment of the inventions.
Figures 2A - 2F are exemplary screen shots illustrating use of an exemplary embodiment of the inventions.
DETAILED DESCRIPTION
The inventions are described herein with reference to exemplary embodiments, alternative embodiments, and also with reference to the attached drawings. The inventions, however, can be embodied in many different forms and carried out in a variety of ways, and should not be construed as limited to the embodiments set forth in this description and/or the drawings. The exemplary embodiments that are described and shown herein are only some of the ways to implement the inventions. Elements and/or actions of the inventions may be assembled, connected, configured, and/or taken in an order different in whole or in part from the descriptions herein.
Stated generally, the inventions relate to methods and systems that may provide advantages such as reduction in the size of or with respect to a source presentation for ease of use, storage, transfer, etc. Particularly, a set of stills from the source presentation may be created and assembled. Each still in the set may represent a background, scene or other change in the video portion of the source presentation. In addition, the audio portion of the source presentation may be processed into text and cadenced for serial presentation. The assembled set of stills and the cadenced text may be combined into a slide show with cadenced text. Effectively, the result may be a reduced version in size of or with respect to the original source presentation, but with the additional advantage of a cadenced readable text that corresponds to the audio portion.
Such a reduction in size of or with respect to the source presentation may be advantageous in many situations. As an example, consider a potential viewer of a video presentation. The viewer may choose to view the slide show with cadenced text prior to or instead of review of the video presentation. The viewer may make the choice because the slide show with cadenced text is smaller in size (and all the advantages that a smaller size provides), and because the slide show with cadenced text substantially provides the gist of the video presentation without its accompanying overhead.
Figure 1 is a flow diagram illustrating operation of an exemplary embodiment of the inventions. In this example, the source 10 is a video presentation (also referred to as "video source", "source"). The video presentation generally includes a visual portion and an audio portion (and may, but does not have to, include closed captioning of the audio portion). The exemplary embodiment may operate separately, but concurrently, on the respective portions (in whole or part) of the video presentation, which may result in a reduced version of or with respect to the video presentation. Alternative embodiments of the inventions may operate in other ways. Further, an exemplary embodiment may combine some or all of the actions separately described above (as well as other actions) in a control module. Moreover, the order of the actions as presented herein may be different in other embodiments.
Operation on Visual Portion
The operation of the exemplary embodiment on the visual portion of the video presentation may be summarized by reference to the actions 12 and 14 on the left side of the flow diagram of Figure 1. The operation on the audio portion of the video presentation may be summarized by reference to the actions 18, 20, 22, 23 on the right side of the flow diagram. After operations on the respective portions of the video presentation, the results may be mixed in action 24 and may result in a slide show with cadenced text in action 26.
For the operation on the video portion of the presentation, in action 12 stills are created of the changes in the visual portion of the video presentation. A "still" is a static photograph (or the like in similarity or by analogy) of the visual presentation. In a video presentation including frames, each frame representing a change in the background or visual presentation may be selected to be a "still".
A change in the visual portion of the video presentation may be defined in a variety of ways, and such definition may vary from embodiment to embodiment and/or from use to use of an embodiment of the inventions. For example, a change in the visual portion of the video presentation may be a change in camera angle, and/or it may be the motion of an actor in the presentation such as when an actor may stand up or sit down. Another example of a change that may result in a still is a change in the volume of the audio portion of the video presentation. If an actor raises his or her voice, a still may be created. If the actor lowers his or her voice, a still may be created.
Alternatively, the passage of a predetermined length of time in the playing of the video presentation may result in the creation of a still. For example, every two (2) seconds of a video presentation a still may be created of what appears in the presentation at that moment. Other events and/or characteristics may be chosen as one or more of the bases to create a still from the video portion of the presentation.
The creation of the stills in action 12 may occur in any manner. For example, the stills may be created manually by an operator who observes the video presentation. Alternatively (or in addition), the stills may be created automatically based on a set of criteria such as the previously referenced elapse of a certain amount of time and/or the change in volume of the audio portion of the video presentation.
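The automated still-creation criteria described above (the elapse of a predetermined interval, or a detected change in the visual portion) might be sketched as follows. The frame representation, the threshold value, and the function name are illustrative assumptions for exposition, not part of the disclosed systems:

```python
def select_stills(frames, interval=None, change_threshold=None):
    """Select still frames from a video sequence.

    frames: list of (timestamp_seconds, pixels) pairs, where pixels is a
    flat list of grayscale values -- an illustrative representation.
    interval: if set, emit a still every `interval` seconds (elapsed-time
    criterion).
    change_threshold: if set, emit a still whenever the mean absolute
    pixel difference from the last still exceeds the threshold
    (scene-change criterion).
    """
    stills = []
    last_time = None
    last_pixels = None
    for t, pixels in frames:
        take = last_pixels is None  # always keep the first frame
        if interval is not None and last_time is not None and t - last_time >= interval:
            take = True  # predetermined interval has elapsed
        if change_threshold is not None and last_pixels is not None:
            diff = sum(abs(a - b) for a, b in zip(pixels, last_pixels)) / len(pixels)
            if diff > change_threshold:
                take = True  # change in the visual portion detected
        if take:
            stills.append((t, pixels))
            last_time, last_pixels = t, pixels
    return stills
```

With `interval=2`, a frame sequence sampled once per second yields stills at roughly two-second spacing, matching the "every two (2) seconds" example above; with `change_threshold` set, identical consecutive frames produce no new stills.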
In action 14, the stills are assembled in a set of stills (also referred to as a set of scenes or set of still frames). Preferably, the stills are assembled in an order to correspond to the visual portion of the video presentation, but they do not necessarily have to be. Of course, if the stills are not assembled in the order of the visual portion of the video presentation, then the order of the text presented may be altered from that in the original audio portion to correspond to the order of the stills.
Alternatively, action 14 may not be necessary in some embodiments because the stills may not need to be assembled. They may be created without such need. Further, if there is only one still, then it may not need to be assembled. As yet another alternative, in some cases, the stills may be identical and in that case, all of them or only less than all of them may be assembled (or not).
Operation on Audio Portion
As noted above, the operation of the exemplary embodiment on the audio portion of the video presentation may be summarized by reference to the actions 18, 20, 22, 23 on the right side of the flow diagram of Figure 1.
The same source that provides the visual portion of the video presentation for the operation in actions 12 and 14 may provide the audio portion for the operation in actions 18, 20, 22, 23. Generally, the source is the same for both sides of the flow of actions, but this does not necessarily have to be the case. The operation on the audio portion preferably results in a presentation of the audio portion in cadenced text/graphic form in an easy-to-read manner as an overlay and/or otherwise in association with the set of stills. The speed of the presentation in cadenced text/graphic form may be controlled in some embodiments by the user, and may be varied to speed up, slow down, stop, or start according to the desires of the user.
As is presented in greater detail below, the processing of the audio portion to obtain the cadenced words (and/or graphics) that are displayed in association with the set of stills may result in a sequential delivery and display of the audio content. Further information about the sequential delivery and display of the audio content also may be found in the published patent application entitled Method for Controlling the Rate of Automated Flow and Navigation through Information Presented on a Digitally Controlled Electronic Device, United States Patent Application Publication No. 20070073917, Serial No. 515,950, published on March 29, 2007, which is incorporated herein by reference. Another source for information on delivery and display of audio content may be found in United States Patent No. 6,056,551, Methods and Apparatus for Computer Aided Reading Training, issued May 2, 2000, which is incorporated herein by reference.
Referring to Figure 1, the operation on the audio portion of the video presentation may begin with caption file action 18. Such action may be unnecessary, but it may provide for the "closed captioning" (or other transcription) of the audio portion, if such transcription has not already been done. For example, the video presentation may be passed through a NORPAK card to result in closed captioning of at least the audio portion of the video presentation. A suitable NORPAK card is the TTX890 Multi-format SD/HD-SDI VANC De-embedder/Decoder Card from Norpak Corporation, Kanata, Ontario, Canada. The NORPAK card may include encoding/insertion, receiving, monitoring, and content control functions. Additional information about the NORPAK card may be obtained from www.norpak.ca. Alternative transcription of the audio into text/graphics may be carried out. For example, such transcription may be done manually by a person listening to the audio portion and creating a text file (including graphics, if need be) corresponding to the audio.
If there is closed captioning as part of the video presentation or as the result of action 18, then that closed captioning may be provided as the input to processing action 20. If there is no closed captioning, then the audio portion of the video presentation (translated into text/graphics or otherwise) may be provided as input to the processing action 20. This processing action 20 may include a filter function and may provide for removing gaps or otherwise. In addition, the processing action 20 may include removing special characters, markers, and delimiters. Further, the processing action 20 may include language correction functions to generally clean up or correct the language corresponding to the audio portion of the presentation. In some embodiments, a processing action such as action 20 may be unnecessary as the text/graphics corresponding to the audio portion may be ready for further actions without such processing. For example, the text/graphics may be ready for cadencing in action 22 without processing action 20.
The result of the processing action 20 may be that of a text file or a stream of ASCII (American Standard Code for Information Interchange) text, but the result may vary depending upon the needs of the relevant system and methods.
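The filtering described for processing action 20 might be sketched as follows. The speaker-change marker (">>"), the bracketed sound-cue convention, and the corrections dictionary are illustrative assumptions about the caption format, not requirements of the embodiments:

```python
import re

def clean_captions(raw, corrections=None):
    """Filter raw caption text into a plain ASCII text stream.

    Removes caption markers, bracketed sound cues, and stray special
    characters, collapses gaps of whitespace, and applies simple
    word-level language corrections.
    """
    text = raw.replace(">>", " ")               # strip speaker-change markers
    text = re.sub(r"\[[^\]]*\]", " ", text)     # drop bracketed sound cues
    text = re.sub(r"[{}|\\^~]", " ", text)      # remove stray special characters
    text = re.sub(r"\s+", " ", text).strip()    # collapse gaps in the text
    words = text.split(" ")
    if corrections:                             # simple language correction
        words = [corrections.get(w, w) for w in words]
    return " ".join(words)
```

For example, a raw caption line such as `">> Hello   JOHN [music] reporting"` would reduce to the plain stream `"Hello JOHN reporting"`.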
The stream of ASCII text resulting from the processing action 20 is the input to the action of cadence video of words (and/or graphics) 22. As noted, the input to the cadencing action may be other than a stream of ASCII text.
The elements that may be part of the cadencing action 22 are more fully described in the above referenced United States Patent Application Publication No. 20070073917, which is incorporated herein by reference. Among the features of the cadencing action 22 that may be included in the exemplary embodiment of the inventions claimed here are the speed of the presentation of the cadenced words (the overall speed and/or the respective speed of individual words, phrases, sentences, etc.), and the display time of the presentation of the cadenced words (the overall presentation time and/or the respective presentation times of the individual words, phrases, sentences, etc.). Preferably, a user may be able to vary the speed, the display time, and/or other features. Although not specifically shown by arrows in Figure 1, there may be a two-way (or more) exchange between the actions of processing 20 and of cadencing 22. For example, the processing action 20 may provide text to the cadencing action 22, and/or the cadencing action 22 may send control signals to the processing action 20. Similarly, there may be some exchanges between and among the other actions in Figure 1.
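The timing aspect of cadencing action 22 might be sketched as follows, with each word assigned a display time and a user-controlled speed multiplier scaling the whole schedule. The constants and the length-based hold rule are illustrative assumptions; the referenced Publication No. 20070073917 describes the cadencing elements in full:

```python
def cadence(words, base_ms=250, per_char_ms=30, speed=1.0):
    """Assign a start time and display time (hold) to each word.

    Longer words are held on screen longer; `speed` is the
    user-selectable multiplier (2.0 = twice as fast).
    Returns a list of (start_ms, word, hold_ms) records.
    """
    schedule = []
    t = 0.0
    for word in words:
        hold = (base_ms + per_char_ms * len(word)) / speed
        schedule.append((t, word, hold))
        t += hold
    return schedule
```

A viewer raising `speed` shortens every hold proportionally, which corresponds to the selectable variable speeds described for the presentation.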
After the cadencing action 22, an action 23 of insertion of additional information may take place. The addition of information may take place in a variety of ways. The additional information may be inserted in the gaps between words as the cadenced words are presented with the slide show of the set of stills created in actions 12 and 14. In particular, as a result of the cadencing action 22, the length of a gap between words in the stream of ASCII text may be determinable. If the gap is sufficient in length, one or more pieces of additional information may be inserted into that gap. When the cadenced words are displayed with the slide show, the one or more pieces of additional information appear in the gap. The information added may be the same each time, or it may vary. Whether the added information is the same each time may depend on the size of the available gaps, or not.
Still referring to the additional information that may be inserted into a gap(s) in the stream of ASCII text after it has been cadence-processed, the additional information may or may not be related to the content of the stream of ASCII text. In addition, the additional information may take the form of text and/or graphics. Further, the additional information may constitute information that a viewer may immediately recognize as associated with a particular source or origin. Thus, additional information may need to appear only for the length of time that allows a user to recognize the additional information and/or the source associated with the additional information. In other words, the additional information does not necessarily appear for the same length of time as the text/graphics to which it has been added. Also, multiple examples of added information may appear for respective varying lengths of time whether the added information is the same in each case or not, and/or whether it is the same size or not, and/or whether it is added to respective gaps of the same size or not.
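The gap insertion of action 23 might be sketched as follows, assuming the cadenced text carries start and end times per word. The minimum-gap threshold and the record format are illustrative assumptions:

```python
def insert_into_gaps(timed_words, extras, min_gap=1.5):
    """Insert additional items into sufficiently long gaps.

    timed_words: list of (start, end, text) records in seconds, in order.
    extras: additional items (e.g. sponsor text/graphics), consumed one
    per qualifying gap.
    A gap qualifies when the silence between consecutive words is at
    least `min_gap` seconds.
    """
    result = []
    pending = list(extras)
    for i, (start, end, text) in enumerate(timed_words):
        result.append((start, end, text))
        if i + 1 < len(timed_words) and pending:
            gap = timed_words[i + 1][0] - end
            if gap >= min_gap:
                # fill the gap with the next piece of added information
                result.append((end, timed_words[i + 1][0], pending.pop(0)))
    return result
```

Because the inserted record spans only the gap itself, the added information appears just long enough to be recognized before the next cadenced word, as described above.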
Another way additional information may be made part of the processed information is by adding it so as to appear as background to the text of the cadenced words. For example, the subject matter of the cadenced words may be a travelogue description of the Mt. Rushmore National Monument. A graphic image of Mt. Rushmore may appear as background to the cadenced words. In other words, the overlay of cadenced words may include its own background (separate from the visual portion of the video presentation). The background to the overlay of cadenced words may complement, contrast, and/or have no relation to the visual portion of the video presentation. The added information used as background may appear as such for all of the text/graphic presentation, or only part(s) of it. The information added as background may be the same (in whole or in part) as information added in gaps in the text/graphics. Information added as background may continue to appear as background when information added in gaps appears in the presentation (or not, or sometimes).
An optional action that may take place with respect to the text/graphic portion of the presentation is that it may undergo optional display control at some point before, between, during, and/or after the actions 18, 20, 22, 23. Display control may include anything related to the display of the stream of ASCII text or other text/graphics. In particular, the display control may include configurations of the stream of ASCII text or other text/graphics with respect to color, position, size, opacity, etc. Thus, the relevant stream of ASCII text or other text/graphics (plus the additional information if it is included) may be changed, added to, subtracted from, etc. by the display control. The output of the display control may result in cadenced words (with or without additional information and with or without display control elements). The term "cadenced words" is not to be limited to just "words", but may include other text, graphics, etc. relating to the audio portion of the video presentation.
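The display-control configurations mentioned above (color, position, size, opacity) might be represented as a simple settings record attached to each cadenced word. The attribute names and defaults are illustrative assumptions:

```python
from dataclasses import dataclass

@dataclass
class DisplayStyle:
    """Illustrative display-control settings for the cadenced text overlay."""
    color: str = "white"
    position: tuple = ("center", "bottom")
    size_pt: int = 24
    opacity: float = 1.0  # 1.0 = fully opaque

def apply_style(word, style):
    """Attach display-control attributes to a cadenced word record."""
    return {"text": word, "color": style.color, "position": style.position,
            "size_pt": style.size_pt, "opacity": style.opacity}
```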
Mixing
Referring again to Figure 1, the left-hand and right-hand actions come together in the mix as video action 24. In action 24, the cadenced words are added to (and/or mixed with) the assembled set of stills made from the visual portion of the video presentation. The result is effectively a slide show with cadenced words 26. Stated differently, the cadenced words from the audio portion of the video presentation are made into a video that is laid over or on top of the slide show (or set of stills) (figuratively or literally) from the visual portion of the video presentation. Thus, a viewer may view the assembled set of stills and may be able to concurrently view the cadenced words (with or without additional information). Advantageously, the cadenced words appear smoothly to the viewer in an easy-to-read manner.
For the mixing, an embodiment of the inventions may make use of a CHYRON character generation card, such as the Digital pcCODI and Analog pcCODI PCI board sets from Chyron Corporation, Melville, New York. For more information see www.chyron.com.
Preferably, the mixing action 24 coordinates the set of stills with the processed cadenced (or other) text/graphics taken from the audio portion of the video presentation. For example, assume a source is a music video. The person viewing the music video processed in accordance with an exemplary embodiment of the inventions might view the words sung as text/graphics in correspondence with the stills presenting the musicians singing those words. In sum, the text/graphics may match the set of stills.
But the text/graphics need not always correlate to the set of stills as the audio portion correlates to the video portion of the source presentation. In some embodiments, the text/graphics may be presented in an order that does not correspond to the order of the audio portion, and/or the video portion. This latter example may be an unusual case, but may have its applications in creation of art or new works of authorship or otherwise.
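The coordination performed in mixing action 24 might be sketched as selecting, for each cadenced word, the still whose timestamp most recently precedes it. The data shapes and function name are illustrative assumptions:

```python
def mix(stills, timed_words):
    """Overlay cadenced words on the corresponding stills (action 24).

    stills: list of (timestamp, image) pairs sorted by time.
    timed_words: list of (start, end, text) records sorted by start.
    Returns a list of (image, text, start, end) display records -- in
    effect, the slide show with cadenced words.
    """
    show = []
    j = 0
    for start, end, text in timed_words:
        # advance to the latest still at or before this word's start
        while j + 1 < len(stills) and stills[j + 1][0] <= start:
            j += 1
        show.append((stills[j][1], text, start, end))
    return show
```

In the music-video example above, each sung word would be paired with the still current at the moment the word is sung; presenting the text out of order, as the later alternative contemplates, would simply mean supplying a reordered `timed_words` list.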
Exemplary Advantages
As noted at the outset, the inventions may include advantageous methods and systems that may reduce the size of a source presentation for ease of use, storage, transfer, etc. The result of a synchronized text format to slide show based video may offer an advancement in the field of compression. A viewer of the slide show with cadenced words may not need the audio portion of the video presentation in audio form because the audio portion may be presented to the viewer in a cadenced text format that may be synchronized to the assembled set of scenes that make up the slide show. A viewer may speed his or her review and comprehension of a video presentation by viewing the corresponding slide show with cadenced words resulting from operation of an exemplary embodiment of the present inventions on the video presentation.
Alternative Embodiments - Figures 2A - 2F
Figures 2A - 2F illustrate respective screen shots as they may appear on a viewer's personal digital assistant (PDA) or other device when the viewer makes use of an exemplary embodiment of the inventions in watching the beginning of an evening newscast.
In this example, the video source presentation includes a video portion and an audio portion. A set of stills 30a-f may be created from the video source pursuant to any of the processes or systems described above (or alternatives thereto). In this case, each still 30a-f is the same, but that does not always have to be the case. These could be copies of a single still, or they could be six identical stills, or a mix of identical stills and new stills.
Also in this example, the text/graphic overlay 32a-f includes the text of the audio portion of the video source plus added information. In screen shots 30a-c, the text/graphic overlay 32a-c includes the text spoken by the announcer in the video portion. The text/graphic overlay 32a-c appears as processed cadenced text. Thus, the first screen shot 30a includes only the text "Hello" as part of the graphic overlay 32a. The next screen shot 30b includes the text "JOHN" as part of the graphic overlay 32b. The text of "JOHN" has been processed in this example to emphasize the announcer's name, but that does not have to be the case in all embodiments. The cadencing of the text is evident from the fact that the text of "JOHN" does not immediately follow the text "Hello" on the same screen shot, but is spaced a bit apart in time for ease of the viewer's reading comprehension. The third screen shot 30c includes the text "reporting" as part of the graphic overlay 32c. The three terms are separately cadenced and overlaid onto separate screen shots for ease of comprehensibility. Other embodiments may be able to cadence more than one word per screen shot. A user may be able to vary some parts of the exemplary embodiment's presentation features such as speed of display of text, duration of display of text, etc.
In screen shots 30d-f, the text/graphic overlay 32d-f continues, but instead of text from the audio portion of the source presentation, the text/graphic overlay 32d-f shows other features. For example, the text/graphic overlay 32d in screen shot 30d is blank or at least does not appear to contain text or graphics. This may indicate a pause in the announcer's speech (or not). Screen shots 30e-f include text/graphic overlay 32e-f that illustrate the addition of information. The information may have been added because of a gap in the announcer's speech or otherwise. In screen shot 30e, the trademark Coca-Cola appears as the text/graphic overlay 32e. In screen shot 30f, the trademarks including an illustration of a Coke bottle with the words "welcome to the Coke side of life" appear as the text/graphic overlay 32f. The trademarks were added to the text/graphic overlay 32e-f; the trademarks are not part of the audio portion of the source presentation. The trademarks appearing as text/graphic overlay 32e-f may appear for a brief amount of time so that they disappear once an average viewer would be thought to have recognized them and before becoming distracted from the topic of the source presentation. Alternatively, added information such as the trademarks appearing as text/graphic overlay 32e-f may be made to appear for as long or as short as decided by the person implementing the exemplary embodiment or otherwise.
The trademarks used as text/graphic overlays 32e-f are inserted into gaps in the text that correspond to the audio portion of the source presentation. An alternative would be to use the same trademarks or other information as background in the text/graphic overlays 32a-f. Another alternative would be to use the same trademarks and/or other information both as added information in the text/graphic overlays 32a-f and as background in those overlays 32a-f. Other alternatives are possible.
Conclusion
The above description is not intended and should not be construed to be limited to the examples given. Although the description provides illustrative embodiments, the invention is not limited thereto and may include modification and permutations thereof.
The exemplary embodiments of the present inventions were chosen and described above in order to explain the principles of the invention and their practical applications so as to enable others skilled in the art to utilize the inventions including various embodiments and various modifications as are suited to the particular uses contemplated. The examples provided herein are not intended as limitations of the present invention. Also, the advantages of the inventions discussed herein are not to be considered limiting in any way. Some embodiments may not include one or more of the discussed advantages. Also, the embodiments may have one or more advantages not discussed herein. Other embodiments will suggest themselves to those skilled in the art. Therefore, the scope of the present inventions is to be limited only by the claims below.

Claims

1. A method of creating a speed-adjustable view-only presentation with added information from a source having video and audio, comprising: providing text corresponding to the audio; interactively processing and cadencing the text into cadenced processed text; adding information to the cadenced processed text; creating a set of still frames of the video by creating respectively a still frame of the video at intervals during the video or for each change in the video; combining the set of still frames and the cadenced processed text including the added information; and causing the combination to be presentable at selectable variable speeds, whereby the combination of the set of the still frames and the cadenced processed text including the added information creates the speed-adjustable view-only presentation with the added information.
2. The method of Claim 1, wherein adding the information to the cadenced processed text comprises: selecting a gap in the cadenced processed text; and adding the information in the selected gap in the cadenced processed text.
3. The method of Claim 1, wherein adding the information to the cadenced processed text comprises adding the information as background to the cadenced processed text.
4. A method for using a video presentation to create a slide show, the video presentation including a video portion and an audio portion, comprising:
creating stills from the video portion of the video presentation;
assembling the stills into a set of stills;
creating text from the audio portion of the video presentation; and
overlaying the set of stills with the text to create the slide show.
5. The method of Claim 4, further comprising: cadencing the text.
6. The method of Claim 4, further comprising: processing the text.
7. The method of Claim 4, further comprising: adding information to the text prior to overlaying the set of stills with the text.
8. The method of Claim 4, further comprising: causing the slide show to be presentable at one or more of several speeds.
9. The method of Claim 4, wherein creating the stills from the video portion of the video presentation comprises creating a still of the video presentation at respective predetermined intervals of the video presentation.
10. The method of Claim 4, wherein creating the stills from the video portion of the video presentation comprises creating a still respectively for each change in the video portion of the video presentation.
11. The method of Claim 4, wherein creating the stills from the video portion of the video presentation comprises creating a still after the elapse of a predetermined time from the beginning of the video portion, and creating respectively a still after each elapse of the predetermined time thereafter.
12. The method of Claim 4, wherein creating the text from the audio portion of the video presentation comprises closed captioning of the audio portion.
13. A method of processing a source presentation, the source presentation including at least a video part and an audio part, comprising:
making a still frame from the video part for each time a predetermined event happens in the video part;
making a slideshow of still frames by combining still frames;
providing text corresponding to the audio part;
adding information to the text; and
combining the slideshow and the text including the added information.
14. The method of Claim 13, wherein the predetermined event comprises a change in camera angle, motion of an actor, volume of the audio part, or elapse of a predetermined interval.
15. The method of Claim 13, further comprising: processing the text.
16. The method of Claim 15, wherein processing the text comprises eliminating a gap in the text of a predetermined size, eliminating a special character, eliminating a marker, eliminating a delimiter, or correcting language.
17. The method of Claim 15, wherein adding the information to the text comprises adding the information to the processed text.
18. The method of Claim 13, further comprising: cadencing the text.
19. The method of Claim 18, wherein cadencing the text comprises interactively cadencing and processing the text.
20. The method of Claim 19, wherein adding the information to the text comprises adding the information in a gap in the cadenced text.
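For readers outside patent practice, the interplay of the claimed steps (still-frame extraction at intervals or on change, processing and cadencing the text, filling a gap with added information, and speed-selectable playback) can be sketched as a short illustrative program. Nothing below appears in the patent itself: every name (`Frame`, `cadence`, `build_slide_show`, and so on) is a hypothetical stand-in, and frame content is modeled as a plain string rather than pixel data.

```python
from dataclasses import dataclass

@dataclass
class Frame:
    t: float      # seconds from the start of the video part
    image: str    # stand-in for pixel data

def stills_on_change(frames):
    """One still per change in the video part (cf. Claim 10):
    keep a frame only when its content differs from the previous frame."""
    stills, last = [], None
    for f in frames:
        if f.image != last:
            stills.append(f)
            last = f.image
    return stills

def stills_at_interval(frames, interval):
    """One still per elapsed predetermined interval (cf. Claims 9 and 11)."""
    stills, next_t = [], 0.0
    for f in frames:
        if f.t >= next_t:
            stills.append(f)
            next_t += interval
    return stills

def cadence(text, group_size=3):
    """Process and cadence the text (cf. Claims 15-18): strip special
    characters and collapse gaps, then emit fixed-size word groups."""
    words = [w.strip("*#|") for w in text.split()]
    words = [w for w in words if w]
    return [" ".join(words[i:i + group_size])
            for i in range(0, len(words), group_size)]

def build_slide_show(stills, chunks, added_info=None):
    """Overlay the set of stills with the cadenced text (cf. Claim 4);
    added information, if any, fills the trailing gap (cf. Claims 2, 20)."""
    slides = list(zip(stills, chunks))
    if added_info is not None:
        slides.append((stills[-1] if stills else None, added_info))
    return slides

def play(slides, base_seconds=2.0, speed=1.0):
    """Present at a selectable speed (cf. Claim 8): each slide's display
    duration is a base duration divided by the chosen speed factor."""
    return [(still, text, base_seconds / speed) for still, text in slides]
```

In a real system the `Frame` stream would come from a video decoder and the transcript from closed captioning (cf. Claim 12); the sketch deliberately keeps both abstract so that the claimed sequencing logic, rather than any particular codec or captioning service, is what the code exercises.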
PCT/US2008/057182 2007-03-15 2008-03-17 Methods and systems for converting video content and information to a sequenced media delivery format WO2008113064A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US89510807P 2007-03-15 2007-03-15
US89512107P 2007-03-15 2007-03-15
US60/895,108 2007-03-15
US60/895,121 2007-03-15

Publications (1)

Publication Number Publication Date
WO2008113064A1 (en) 2008-09-18

Family

ID=39760120

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2008/057182 WO2008113064A1 (en) 2007-03-15 2008-03-17 Methods and systems for converting video content and information to a sequenced media delivery format

Country Status (1)

Country Link
WO (1) WO2008113064A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6505153B1 (en) * 2000-05-22 2003-01-07 Compaq Information Technologies Group, L.P. Efficient method for producing off-line closed captions
US20040125877A1 (en) * 2000-07-17 2004-07-01 Shin-Fu Chang Method and system for indexing and content-based adaptive streaming of digital video content
US6804295B1 (en) * 2000-01-07 2004-10-12 International Business Machines Corporation Conversion of video and audio to a streaming slide show
US6820055B2 (en) * 2001-04-26 2004-11-16 Speche Communications Systems and methods for automated audio transcription, translation, and transfer with text display software for manipulating the text
US7131059B2 (en) * 2002-12-31 2006-10-31 Hewlett-Packard Development Company, L.P. Scalably presenting a collection of media objects

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3143764A4 (en) * 2014-10-16 2017-12-27 Samsung Electronics Co., Ltd. Video processing apparatus and method
US10014029B2 (en) 2014-10-16 2018-07-03 Samsung Electronics Co., Ltd. Video processing apparatus and method

Similar Documents

Publication Publication Date Title
AU2011200857B2 (en) Method and system for adding translation in a videoconference
CA2374491C (en) Methods and apparatus for the provision of user selected advanced closed captions
EP2356654B1 (en) Method and process for text-based assistive program descriptions for television
KR20130029055A (en) System for translating spoken language into sign language for the deaf
CN102111601B (en) Content-based adaptive multimedia processing system and method
US9749504B2 (en) Optimizing timed text generation for live closed captions and subtitles
US20120033133A1 (en) Closed captioning language translation
US20060272000A1 (en) Apparatus and method for providing additional information using extension subtitles file
Pedersen Audiovisual translation–in general and in Scandinavia
CN102055941A (en) Video player and video playing method
KR101899588B1 (en) System for automatically generating a sign language animation data, broadcasting system using the same and broadcasting method
KR20040039432A (en) Multi-lingual transcription system
KR20050118733A (en) System and method for performing automatic dubbing on an audio-visual stream
USRE40688E1 (en) System for producing personalized video recordings
Gorman et al. Adaptive Subtitles: Preferences and Trade-Offs in Real-Time Media Adaption
US20020055088A1 (en) Toggle-tongue language education method and apparatus
Neves A world of change in a changing world
TW201102836A (en) Content adaptive multimedia processing system and method for the same
JP2008306691A (en) Bilingual double caption
WO2008113064A1 (en) Methods and systems for converting video content and information to a sequenced media delivery format
Ohene-Djan et al. Emotional Subtitles: A System and Potential Applications for Deaf and Hearing Impaired People.
JPH1141538A (en) Voice recognition character display device
Ohene-Djan et al. E-Subtitles: Emotional Subtitles as a Technology to Assist the Deaf and Hearing-Impaired when Learning from Television and Film.
Araújo Subtitling for the deaf and hard-of-hearing in Brazil
JP4500957B2 (en) Subtitle production system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 08732325

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112 (1) EPC, EPO FORM 1205A DATED 21-01-2010

122 Ep: pct application non-entry in european phase

Ref document number: 08732325

Country of ref document: EP

Kind code of ref document: A1