WO1998047084A1 - A method and system for object-based video description and linking - Google Patents
- Publication number
- WO1998047084A1 (PCT/JP1998/001736)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- video
- image
- links
- description
- stream
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/83—Generation or processing of protective or descriptive data associated with content; Content structuring
- H04N21/84—Generation or processing of descriptive data, e.g. content descriptors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/74—Browsing; Visualisation therefor
- G06F16/748—Hypervideo
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
- H04N21/2343—Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
- H04N21/234318—Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by decomposing into objects, e.g. MPEG-4 objects
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/235—Processing of additional data, e.g. scrambling of additional data or processing content descriptors
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/435—Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
- H04N21/472—End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
- H04N21/47205—End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for manipulating displayed content, e.g. interacting with MPEG-4 objects, editing locally
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
- H04N21/472—End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
- H04N21/4722—End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for requesting additional data associated with the content
- H04N21/4725—End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for requesting additional data associated with the content using interactive regions of the image, e.g. hot spots
The descriptive stream may be obtained through a variety of mechanisms. It may be constructed manually using an interactive method: an operator may explicitly choose to index certain objects in the video and record some corresponding further information. The descriptive stream may also be constructed automatically using any video analysis tools, especially those to be developed for the Moving Picture Experts Group (MPEG) standards.

Camcorders, VCRs and DVD recorders may be developed to allow the construction and storage of descriptive streams while recording and editing. Those devices may provide user interface programs to allow a user to manually locate certain objects in their video, index them, and annotate them. The user may then choose to enter some text into the textual description field, record some voice annotation, and so on. The user may choose to allow the programming of the device to propagate those descriptions to the surrounding frames. This may be done by tracking the objects in the nearby frames.

The method and system of the invention may also be used in digital libraries. The method may be applied to video sequences or images originally stored in any common format, including RGB, D1, MPEG, MPEG-2, MPEG-4, etc. If a video sequence is stored in MPEG-4, the location information of the objects in the video may be extracted automatically. This eases the burden of manually locating them. Further information may then be added to each extracted object within a frame and propagated into other sequential or non-sequential frames, if so selected.

The mechanism described herein may be used to construct descriptive streams regardless of the underlying storage format. This enables a video sequence or image stored in one format to be viewed and manipulated in a different format and to have the description and linking features of the invention applied.

The descriptive streams facilitate content-based video/image indexing and retrieval. A search engine may find relevant video contents at the object level by matching relevant keywords against the textual description fields. The search engine may also choose to analyze the voice annotations, match the image features, and/or follow the object links. The embedded Java applets may implement more sophisticated similarity measures to further enhance content-based video/image indexing and retrieval.
Abstract
A method for object-based video description and linking is disclosed. The method constructs a companion stream for a video sequence which may be in any common format. In the companion stream, textual descriptions, voice annotation, image features, URL links, and Java applets may be recorded for certain objects in the video within each frame. The system includes a capture mechanism for generating an image, such as a video camera or computer. An encoder embeds a descriptive stream with the video and audio signals, which combined signal is transmitted by a transmitter. A receiver receives and displays the video image and the audio. The user is allowed to select whether or not the embedded descriptive stream is displayed or otherwise used.
Description
DESCRIPTION
A METHOD AND SYSTEM FOR OBJECT-BASED VIDEO DESCRIPTION AND LINKING

Field of the Invention
This invention relates to an object-based description and linking method and system for use in describing the contents of a video and linking such video contents with other multimedia contents.
Background of the Invention

In this information age, we daily deal with vast amounts of video information when watching TV, making home video, and browsing the World Wide Web. The video which we receive or make is mostly in an "as is" state, i.e., there is no further information available about the content of the video, and the content is not linked to other related resources. Because of this, we view video in a passive manner. It is difficult for us to interact with the video contents and utilize them efficiently. From time to time, we see someone or something in the video about which we would like to find more information. Usually, we do not know where to find such information, and do not begin or continue our quest. It is also difficult for us to search for video clips which may contain certain content related to our interests.
Existing multimedia description and networking methods and languages comprise the known art. Examples of such methods include the descriptive techniques used in connection with digital libraries and computer languages, such as HTML and Java. The existing methods used in digital libraries suffer from shortcomings in that they are not necessarily object-based, e.g., methods that use color histograms describe only the global color content of a picture and not the individual objects within it; linking and networking capability is not inherent in the systems; and the video sources must be of a specific type in order to be compatible with the primary language. Languages such as HTML and Java are difficult to use for describing and linking video contents in a video sequence, especially when it is desired to treat the video sequence at the object level.
If a video sequence were to be accompanied by a stream of descriptions and links that provided additional information about the video, and which were embedded in the video signal, we could find further information about certain objects in the video by looking up their descriptions, or visiting their related Web sites or files, by following the embedded links. Such descriptions and links may also provide useful information for content-based searching in digital libraries.
Summary of the Invention

A new method and system for object-based video description and linking is disclosed. The method constructs a companion stream for a video sequence which may be in any common format. In the companion stream, textual descriptions, voice annotation, image features, object links, URL links, and Java applets may be recorded for certain objects in the video within each frame. The method may be utilized in many applications as described below.
The system of the invention includes a mechanism for generating an encoded image. An encoder embeds a companion descriptive stream with a video signal. A video display displays the video image. The user is allowed to select whether or not the embedded descriptive stream is displayed or otherwise used.
It is an object of the invention to develop a method and system for describing and linking video contents in any format at the video object level. It is a further object of the invention to allow a video object to be linked to other video/audio contents, such as a Web site, a computer file, or other video objects.
These and other objects and advantages of the invention will become more fully apparent as the description which follows is read in connection with the drawings.
Brief Description of the Drawings

Fig. 1 is a block diagram of the method of the invention. Fig. 2 is an illustration of the various types of links that may be incorporated into the invention of Fig. 1. Fig. 3 is a block diagram of the system of the invention as used within a television broadcast scheme.
Detailed Description of the Preferred Embodiment

A new method for describing and linking objects in an image or video sequence is described. The method is intended for use with a video system having a certain digital component, such as a television or a computer. It should be appreciated that the method of the invention is able to provide additional description and links to any format of image or video. While the method and system of the invention is generally intended for use with a video sequence, such as in a television broadcast, video tape or video disc, or a series of video frames viewed on a computer, the method and system are also applicable to single images, such as might be found in an image database, and which are encoded in well-known formats, such as JPEG, MPEG, binary, etc., or any other format. As used herein, "video" includes the concept of a single "image."
Referring now to Fig. 1, the method, depicted generally at 10, builds a description stream 12 as a companion for a video sequence 14 having a plurality of frames 16 therein. In each selected frame, there may be one or more objects of interest, such as object 16a and object 16b. It will be appreciated by those of skill in the art that not all of the frames in video sequence 14 must be selected for having a companion descriptive stream linked therewith.

The descriptive stream records further information about certain objects appearing in the video. The stream consists of continuous blocks 18 of data, where each block corresponds to a frame 16 in the video sequence, and a frame index 20 is recorded at the beginning of the block. The "object of interest" may comprise the entire video frame. Additionally, a descriptive stream may be linked to a number of frames, which frames may be sequential or non-sequential. In the case where a descriptive stream is linked with a sequential number of frames, the descriptive stream may be thought of as having a "lifespan," i.e., if the user does not take some action to reveal the descriptive stream when a linked frame is displayed, the descriptive stream "dies," and may not, in the case of a television broadcast, be revived. Of course, if the descriptive stream is part of a video tape, video disc, or computer file, the user can always return to the location of the descriptive stream and display the information. Some form of visible or audible indicia may be displayed to indicate that a descriptive stream is linked with a sequence of video frames. Descriptive stream 12 may also be linked to a single image.
The frame indexes are used to synchronize the descriptive streams with the video sequences. The block may be further divided into a number of sub-blocks 22, 24, containing what are referred to herein as descriptor/links, where each sub-block corresponds to a certain individual object of interest appearing in the frame, i.e., sub-block 22 corresponds to one object 16a in the frame and sub-block 24 corresponds to another object 16b in the same frame. There may be other objects in the image that are not defined as objects of interest and which, therefore, do not have a descriptive stream and sub-block associated therewith. A sub-block includes a number of data fields, including but not limited to object index, textual description, voice annotation, image features, object links, URL links, and Java applets. Additional information may include notices regarding copyright and other intellectual property rights. Some notices may be encoded and rendered invisible to standard display equipment.
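The block/sub-block layout described above can be sketched as a simple data model. This is a minimal illustration only, not the patent's encoding: the class and field names, the bounding-box representation of the geometrical definition, and the example values (including the URL) are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class SubBlock:
    """One descriptor/link for an individual object of interest.
    Field names are hypothetical; the patent lists the fields but
    does not prescribe a concrete encoding."""
    object_index: tuple                    # geometrical definition, here a box (x, y, w, h)
    text: Optional[str] = None             # textual description field
    voice: Optional[bytes] = None          # voice annotation field
    image_features: Optional[dict] = None  # texture, shape, dominant color, motion model
    object_links: list = field(default_factory=list)  # links to other video objects
    url_links: list = field(default_factory=list)     # links to related Web pages
    applet: Optional[bytes] = None         # Java applet code

@dataclass
class Block:
    """One block of the descriptive stream, tied to a single frame."""
    frame_index: int                       # recorded at the beginning of the block
    sub_blocks: list = field(default_factory=list)

# A descriptive stream is then an ordered sequence of blocks, one per
# selected frame; frames without objects of interest simply have no block.
stream = [
    Block(frame_index=120, sub_blocks=[
        SubBlock(object_index=(40, 30, 100, 180),
                 text="person 26",
                 url_links=["http://example.com/homepage"]),
    ]),
]
```

Fields left unset for a given object (no applet, no voice annotation) simply stay empty, mirroring the "including but not limited to" phrasing above.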
The object index field is used to index an individual object within the frame. It contains the geometrical definition of the object. When a user pauses, or captures, the video at some frame, the system processes all the object index fields within that frame, locates the corresponding objects, and marks them in some manner, such as by highlighting them. The highlighted objects are those that have further information recorded. If a user "clicks" on a highlighted object, the system locates the corresponding sub-block and pops up a menu containing the available information items.
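The pause-and-click behavior can be sketched as a hit test over the object index fields of the paused frame. This sketch assumes rectangular object regions, although the patent allows any geometrical definition; the dict keys and sample objects are illustrative only.

```python
# Each sub-block is modeled as a dict; "object_index" holds a bounding
# box (x, y, w, h) standing in for the geometrical definition.
def hit_test(sub_blocks, click_x, click_y):
    """Return the first sub-block whose region contains the click, else None."""
    for sb in sub_blocks:
        x, y, w, h = sb["object_index"]
        if x <= click_x < x + w and y <= click_y < y + h:
            return sb
    return None

# Sub-blocks recorded for the paused frame; these are the objects the
# system would highlight, since they have further information recorded.
frame_sub_blocks = [
    {"object_index": (40, 30, 100, 180), "text": "person of interest"},
    {"object_index": (300, 200, 80, 60), "text": "product"},
]

# A click at (60, 50) falls inside the first object's region; a click
# on unmarked background matches nothing.
assert hit_test(frame_sub_blocks, 60, 50)["text"] == "person of interest"
assert hit_test(frame_sub_blocks, 10, 10) is None
```

A real system would then populate the pop-up menu from whichever fields of the returned sub-block are non-empty.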
A textual description field is used to store further information about the object in plain text. This field is similar to the traditional closed caption, and its contents may be any information related to the object. The textual description can help keyword-based search for relevant video contents. A content-based video search engine may look up the textual descriptions of video sequences, trying to match certain keywords. Because the textual description fields are related to individual objects, they enable truly object-based search for video contents.
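The object-level keyword search described above can be sketched as follows. The stream layout (dicts keyed by `frame_index` and `sub_blocks`) and the sample library contents are assumptions for illustration.

```python
def search_streams(streams, keyword):
    """Return (stream_id, frame_index) pairs whose textual description
    fields contain the keyword -- an object-level keyword search."""
    hits = []
    kw = keyword.lower()
    for stream_id, blocks in streams.items():
        for block in blocks:
            for sb in block["sub_blocks"]:
                if kw in sb.get("text", "").lower():
                    hits.append((stream_id, block["frame_index"]))
    return hits

# A toy digital library: one descriptive stream with two annotated frames.
library = {
    "news_clip": [
        {"frame_index": 12, "sub_blocks": [{"text": "mayor giving speech"}]},
        {"frame_index": 90, "sub_blocks": [{"text": "city hall exterior"}]},
    ],
}
# The match is resolved to a specific frame, not just a whole video.
assert search_streams(library, "mayor") == [("news_clip", 12)]
```

Because each hit identifies both the stream and the frame, the search engine can jump directly to the frame where the matching object appears.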
A voice annotation field is used to store further information about the object using natural speech. Again, its contents may be any information related to the object.
An image features field is used to store further information about the object in terms of texture, shape, dominant color, a motion model describing motion with respect to a certain reference frame, etc. Image features may be particularly useful for content-based video/image indexing and retrieval in digital libraries.

An object links field is used to store links to other video objects in the same or other video sequences or images. Object links may be useful for video summarization and object/event tracking.
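Retrieval against the image features field above can be sketched as a nearest-neighbor ranking over stored feature vectors. The choice of dominant color as the feature and Euclidean distance as the similarity measure are assumptions; the patent leaves both open (and Java applets may supply more sophisticated measures).

```python
import math

def feature_distance(f1, f2):
    """Euclidean distance between two feature vectors, e.g. dominant
    colors stored as (r, g, b) tuples in the image features field."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f1, f2)))

def rank_by_similarity(query, candidates):
    """Order candidate objects by closeness of their stored features."""
    return sorted(candidates,
                  key=lambda sb: feature_distance(query, sb["dominant_color"]))

# Two hypothetical indexed objects with dominant-color features.
objs = [
    {"name": "red car", "dominant_color": (200, 30, 30)},
    {"name": "blue sign", "dominant_color": (20, 40, 210)},
]
# A reddish query color ranks the red car first.
nearest = rank_by_similarity((190, 40, 35), objs)[0]
assert nearest["name"] == "red car"
```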
The URL links field, which is illustrated in Fig. 2, is used to store links to Web pages and/or other objects which are related to the object. For a person in the scene, such as person 26, i.e., the object of interest, the link in the sub-block 28 may point to a URL 30 for the person's personal homepage 32. A symbol or icon in the scene may be linked to a Web site which contains the related background information. Companies may also want to link products 34 shown in the video, through a sub-block 36 to a URL 38 to their Web site 40, so that potential customers may learn more about their products.

A Java applet field is used to store Java code to perform more advanced functions related to the object. For example, a Java applet may be embedded to enable online ordering for a product shown in the video. Java code may also be written to implement some sophisticated similarity measures to empower advanced content-based video search in digital libraries.
In the case of digital video, the cassettes used for recording in such systems may have a solid-state memory embedded therein which serves as an additional storage location for information. The memory is referred to as memory-in-cassette (MIC). Where the video sequence is stored on a digital video cassette, the descriptive stream may be stored in the MIC, or on the video tape. In general, the descriptive stream may be stored along with the video or image contents on the same media, e.g., a DVD disc or tape.
Figure 3 depicts the system of the invention, generally at 50, as it is used in a television broadcast scheme. System 50 includes a capture mechanism 52, which may be a video camera, a computer capable of generating a video signal, or any other mechanism that is able to generate a video signal. A video signal is passed to an encoder 54, which also receives appropriate companion signals from the various types of links which will form the descriptive stream, and which encoder generates a combined video/descriptive stream signal 58. Signal 58 is transmitted by transmitter 60, which may be a broadcast transmitter, a hard-wire system, or a combination thereof. The combined signal is received by receiver 62, which decodes the signal and generates an image for display on video display 64.
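The frame-index synchronization the encoder and receiver rely on can be sketched as a simple pairing of frames with their descriptive blocks. This models only the alignment step, not the signal-level embedding; the dict layout and sample data are assumptions.

```python
def synchronize(frames, blocks):
    """Pair each video frame with its descriptive block (if any) by
    frame index, as encoder and decoder must agree to do when the
    descriptive stream is multiplexed with the video signal."""
    by_index = {b["frame_index"]: b for b in blocks}
    return [(i, frame, by_index.get(i)) for i, frame in enumerate(frames)]

# Three frames of video; only frame 1 has a descriptive block, since
# not every frame need carry descriptor/links.
frames = ["f0", "f1", "f2"]
blocks = [{"frame_index": 1, "sub_blocks": [{"text": "object of interest"}]}]

combined = synchronize(frames, blocks)
assert combined[1][2]["frame_index"] == 1   # frame 1 carries its block
assert combined[0][2] is None               # frames 0 and 2 carry none
```

On the receiving side, the same index lookup tells the decoder which sub-blocks to offer the viewer when a given frame is paused.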
A trigger mechanism 66 is provided to cause receiver 62 to decode and display the descriptive stream. A decoder in this embodiment is located in receiver 62 for decoding the embedded descriptive stream. The descriptive stream may be displayed in a picture-in-picture (PIP) format on video display 64, or may be displayed on a descriptive stream display 68, which may be co-located with the trigger mechanism, which may take the form of a remote control mechanism for the receiver. Some form of indicia may be provided, either as a visible display on video display 64, or as an audible tone, to indicate that a descriptive stream is present in the video sequence.
Activating trigger mechanism 66 when a descriptive stream is present will likely result in those objects which have descriptive streams associated therewith being highlighted, or otherwise marked, to tell the user that additional information about the video object is present. The data block information is displayed in the descriptive stream display, and the device is manipulated to allow the user to select and activate the display of additional information. The information may be displayed immediately, or may be stored for future reference. It is of key importance to allow the video display to continue uninterrupted, so that others watching the display will not be compelled to remove the remote control from the possession of the user who is seeking additional information.
In the event that the system of the invention is used with a digital library, on a computer system for instance, capture mechanism 52, transmitter 60 and receiver 62 may not be required, as the video or image will have already been captured and stored in a library, which library likely resides on magnetic or optical media which is hard-wired to the video or image display. In this embodiment a decoder to decode the descriptive stream may be located in the computer or in the display. The trigger mechanism may be combined with a mouse or other pointing device, or may be incorporated into a keyboard, either with dedicated keys, or by the assignment of a key sequence. The descriptive stream display will likely take the form of a window on the video display or monitor.
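As recited in the claims, each data block in the descriptive stream begins with a frame index that the decoder uses to synchronize the description/links with the corresponding image. A minimal sketch of such a block is shown below; the 4-byte field widths, the JSON payload, and the link field names are assumptions made for illustration, not part of the disclosure:

```python
import json
import struct

def pack_data_block(frame_index, links):
    """Serialize one hypothetical descriptive-stream data block.

    links: a list of description/link entries for the objects of
    interest in the frame, e.g. {"object_index": 1, "text": "...",
    "url": "http://..."}.
    """
    payload = json.dumps(links).encode("utf-8")
    # 4-byte frame index (for synchronization), 4-byte payload
    # length, then the description/link payload itself.
    return struct.pack(">II", frame_index, len(payload)) + payload

def unpack_data_block(block):
    """Recover the frame index and the description/link entries."""
    frame_index, length = struct.unpack(">II", block[:8])
    links = json.loads(block[8:8 + length].decode("utf-8"))
    return frame_index, links

block = pack_data_block(120, [{"object_index": 1,
                               "text": "news anchor",
                               "url": "http://example.com/anchor"}])
index, links = unpack_data_block(block)
# index is 120, so the decoder can align these links with frame 120.
```

The leading frame index is what lets a receiver skip blocks until it reaches the one matching the frame currently being displayed.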
Applications
Broadcasting TV Programs
TV stations may utilize the method and system of the invention to add more
functionality to their broadcasting programs. They may choose to send out descriptive streams
along with their regular TV signals so that viewers may receive the programs and utilize the
advanced functions described herein. The scenario for a broadcast TV station is similar to that of
sending out closed caption text along with regular TV signals. Broadcasters have the flexibility of
choosing to send or not to send the descriptive streams for their programs at will. If a receiving
TV set has the capability of decoding the descriptive streams, the viewer may choose to use or not use the advanced functions, just as the viewer may choose to view or not to view closed
caption text. If the user chooses to use the functions, the user may read extra text about someone
or something in the programs, hear extra voice annotations, or go directly to the related Web site(s), if the TV set is Web enabled, or perform some tasks, such as online ordering, by running
the embedded Java applets.
For a video sequence, the descriptive stream may be obtained through a variety of mechanisms. It may be constructed manually using an interactive method. An operator may explicitly choose to index certain objects in the video and record some corresponding further information. The descriptive stream may also be constructed automatically using any video analysis tools, especially those to be developed for the Moving Pictures Experts Group Standard
No. 7 (MPEG-7).
Consumer Home Video
The method and system of the invention may be utilized in making consumer video. Camcorders, VCRs and DVD recorders may be developed to allow the construction and storage of descriptive streams while recording and editing. Those devices may provide user interface programs to allow a user to manually locate certain objects in their video, index them,
and record any corresponding information into the descriptive streams. For example, a user
may locate an object within a frame by specifying a rectangular region which contains the object.
The user may then choose to enter some text into the textual description field, record some
speech into the voice annotation field, and key in some Web page address into the URL links
field. The user may choose to allow the programming of the device to propagate those descriptions to the surrounding frames. This may be done by tracking the objects in the nearby
frames. The recorded descriptions for certain objects may also be used as their visual tags.
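The propagation step described above can be sketched as a simple template-matching tracker: the rectangular region the user marked in one frame is searched for in a nearby frame, and the attached description follows the best-matching location. This is only an illustration of the idea; the sum-of-squared-differences score and the search radius are assumptions, not the patent's method:

```python
import numpy as np

def propagate_region(prev_frame, next_frame, box, search=8):
    """Track a user-marked rectangle (x, y, w, h) from prev_frame
    into next_frame by minimizing sum-of-squared-differences over a
    small search window, so the description stays on the object."""
    x, y, w, h = box
    template = prev_frame[y:y + h, x:x + w].astype(float)
    best, best_xy = None, (x, y)
    H, W = next_frame.shape
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            nx, ny = x + dx, y + dy
            if nx < 0 or ny < 0 or nx + w > W or ny + h > H:
                continue  # candidate box falls outside the frame
            patch = next_frame[ny:ny + h, nx:nx + w].astype(float)
            score = ((patch - template) ** 2).sum()
            if best is None or score < best:
                best, best_xy = score, (nx, ny)
    return (*best_xy, w, h)

# Toy example: a bright square moves 3 pixels right between frames.
f0 = np.zeros((32, 32)); f0[10:14, 10:14] = 255
f1 = np.zeros((32, 32)); f1[10:14, 13:17] = 255
print(propagate_region(f0, f1, (10, 10, 4, 4)))  # → (13, 10, 4, 4)
```

In a device, the same routine would be run frame to frame so that a single user annotation propagates through the surrounding frames, as described above.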
If a descriptive stream is recorded along with a video sequence as described above,
that video can then be viewed later and support all the functions as described above.
Digital Video/Image Databases
As previously noted, the method and system of the invention may also be used in digital libraries. The method may be applied to video sequences or images originally stored in any common format including RGB, D1, MPEG, MPEG-2, MPEG-4, etc. If a video sequence is stored in MPEG-4, the location information of the objects in the video may be extracted automatically. This eases the burden of manually locating them. Further information may then be added to each extracted object within a frame and propagated into other sequential or non-sequential frames, if so selected. When a sequence or image is stored in a non-object-based format, the mechanism described herein may be used to construct descriptive streams. This enables a video sequence or image stored in one format to be viewed and manipulated in a different format and to have the description and linking features of the invention applied
thereto.
The descriptive streams facilitate content-based video/image indexing and retrieval.
A search engine may find relevant video contents at the object level, by matching relevant
keywords against the text stored in the textual description fields in the descriptive streams. The
search engine may also choose to analyze the voice annotations, match the image features, and/or
look up the linked Web pages for additional information. The embedded Java applets may
implement more sophisticated similarity measures to further enhance content-based video/image indexing and retrieval.
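The object-level retrieval described above can be sketched as a keyword match over the textual description fields of stored descriptive streams. The in-memory layout here, a list of (frame index, links) entries per video, is an assumption made for illustration:

```python
def search_streams(streams, query):
    """Return (video_id, frame_index, object_index) hits whose
    textual description field contains every query keyword,
    case-insensitively.  `streams` maps a video id to its
    descriptive stream, held as (frame_index, links) entries."""
    keywords = [k.lower() for k in query.split()]
    hits = []
    for video_id, stream in streams.items():
        for frame_index, links in stream:
            for link in links:
                text = link.get("text", "").lower()
                if all(k in text for k in keywords):
                    hits.append((video_id, frame_index,
                                 link.get("object_index")))
    return hits

streams = {
    "news.mpg": [
        (120, [{"object_index": 1, "text": "news anchor at desk"}]),
        (300, [{"object_index": 2, "text": "weather map"}]),
    ],
}
print(search_streams(streams, "news anchor"))
# → [('news.mpg', 120, 1)]
```

A fuller engine would, as the text notes, also score voice annotations, image features, and linked Web pages, but the object-level granularity comes directly from indexing the descriptive stream rather than whole videos.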
Thus, a method and system for object-based video description and linking has been disclosed. It will be appreciated that variations and modifications thereof may be made within the scope of the invention as defined in the appended claims.
Claims
1. A method of object-based description and linking of objects within an image,
comprising: generating a descriptive stream, including a data block, for the image;
identifying at least one object of interest in the image; inserting description links into the data block for an object of interest; and recording a frame index at the beginning of each data block for synchronizing the description/links with the image.
2. The method of claim 1 wherein said inserting of description/links includes inserting description/links taken from the group of description/links consisting of object indexes, textual descriptions, voice annotation, image features, object links, URL links and Java applets.
3. The method of claim 1 wherein said identifying at least one object of interest includes identifying the entire image as an object of interest.
4. The method of claim 1 wherein the image is a portion of a sequence of images
comprising a video sequence of video frames, and wherein said generating a descriptive stream
includes generating a descriptive stream for plural video frames in said video sequence.
5. The method of claim 4 wherein the video frames are in sequential order in said
video sequence.
6. The method of claim 4 wherein the video frames are in non-sequential order in said
video sequence.
7. A method of object-based description and linking of objects within a video sequence, wherein the video sequence includes plural video frames, comprising: generating a descriptive stream, including a data block corresponding to a select video frame in the video sequence; identifying at least one object of interest in a video frame; inserting description/links into the data block for an object of interest; and recording a frame index at the beginning of each data block for synchronizing the description/links with the video sequence.
8. The method of claim 7 wherein said inserting of description/links includes inserting description/links taken from the group of description/links consisting of object indexes, textual
descriptions, voice annotation, image features, object links, URL links and Java applets.
9. The method of claim 7 wherein said identifying at least one object of interest
includes identifying the entire video frame as an object of interest.
10. The method of claim 7 wherein said generating a descriptive stream includes
generating a descriptive stream for plural video frames in a video sequence.
11. The method of claim 10 wherein the video frames are in sequential order in a video
sequence.
12. The method of claim 10 wherein the video frames are in non-sequential order in a video sequence.
13. A system for object-based video description and linking of objects to an image, wherein the image is represented by an electrical signal, comprising: an encoder for embedding a descriptive stream with the electrical signal; a display mechanism for displaying the image; a decoder for decoding the embedded descriptive stream; and a trigger mechanism for instructing said decoder to decode and display said descriptive stream in a descriptive stream display at the request of a user, and for selecting, at the request of a user, a particular portion of the descriptive stream with which to work.
14. The system of claim 13 which further includes a capture mechanism for generating
the image as a sequence of video frames, and for converting said image into a video signal.
15. The system of claim 14 which further includes a transmitter for transmitting said
video signal and said embedded descriptive stream; and a receiver constructed and arranged for
receiving said video signal and said embedded descriptive stream and for displaying a video
image.
16. The system of claim 14 wherein said capture mechanism is taken from the group consisting of video cameras and computers.
17. The system of claim 13 wherein said trigger mechanism is located in a remote-
control device.
18. The system of claim 13 wherein said descriptive stream display is located in a
remote-control device.
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US4327397P | 1997-04-17 | 1997-04-17 | |
US60/043,273 | 1997-04-17 | ||
US90021497A | 1997-07-24 | 1997-07-24 | |
US08/900,214 | 1997-07-24 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO1998047084A1 true WO1998047084A1 (en) | 1998-10-22 |
Family
ID=26720219
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP1998/001736 WO1998047084A1 (en) | 1997-04-17 | 1998-04-16 | A method and system for object-based video description and linking |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO1998047084A1 (en) |
Cited By (49)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2000077678A1 (en) * | 1999-06-14 | 2000-12-21 | Brad Barrett | Method and system for an advanced television system allowing objects within an encoded video session to be interactively selected and processed |
WO2000077790A2 (en) * | 1999-06-15 | 2000-12-21 | Digital Electronic Cinema, Inc. | Systems and methods for facilitating the recomposition of data blocks |
WO2001015454A1 (en) * | 1999-08-26 | 2001-03-01 | Spotware Technologies, Inc. | Method and apparatus for providing supplemental information regarding objects in a video stream |
WO2001026377A1 (en) * | 1999-10-04 | 2001-04-12 | Obvious Technology, Inc. | Network distribution and management of interactive video and multi-media containers |
WO2001065856A1 (en) * | 2000-02-29 | 2001-09-07 | Watchpoint Media, Inc. | Methods and apparatus for hyperlinking in a television broadcast |
WO2001069438A2 (en) * | 2000-03-14 | 2001-09-20 | Starlab Nv/Sa | Methods and apparatus for encoding multimedia annotations using time-synchronized description streams |
WO2002058399A1 (en) * | 2001-01-22 | 2002-07-25 | Thomson Licensing S.A. | Method for choosing a reference information item in a television signal |
WO2002071021A1 (en) * | 2001-03-02 | 2002-09-12 | First International Digital, Inc. | Method and system for encoding and decoding synchronized data within a media sequence |
WO2001044978A3 (en) * | 1999-12-15 | 2003-01-09 | Tangis Corp | Storing and recalling information to augment human memories |
WO2003030126A2 (en) * | 2001-10-01 | 2003-04-10 | Telecom Italia S.P.A. | System and method for transmitting multimedia information streams, for instance for remote teaching |
EP1337091A2 (en) * | 2002-02-19 | 2003-08-20 | Michel Francis Monduc | Method for transmission of audio or video messages over the Internet |
GB2388739A (en) * | 2001-11-03 | 2003-11-19 | Dremedia Ltd | Time-ordered indexing of an information stream |
US6801891B2 (en) | 2000-11-20 | 2004-10-05 | Canon Kabushiki Kaisha | Speech processing system |
US6873993B2 (en) | 2000-06-21 | 2005-03-29 | Canon Kabushiki Kaisha | Indexing method and apparatus |
US6882970B1 (en) | 1999-10-28 | 2005-04-19 | Canon Kabushiki Kaisha | Language recognition using sequence frequency |
WO2005062307A1 (en) * | 2003-12-02 | 2005-07-07 | Eastman Kodak Company | Modifying a portion of an image frame |
EP1578121A2 (en) * | 2004-03-16 | 2005-09-21 | Sony Corporation | Image data storing method and image processing apparatus |
US6990448B2 (en) | 1999-03-05 | 2006-01-24 | Canon Kabushiki Kaisha | Database annotation and retrieval including phoneme data |
US7046263B1 (en) | 1998-12-18 | 2006-05-16 | Tangis Corporation | Requesting computer user's context data |
US7055101B2 (en) | 1998-12-18 | 2006-05-30 | Tangis Corporation | Thematic response to a computer user's context, such as by a wearable personal computer |
US7054812B2 (en) | 2000-05-16 | 2006-05-30 | Canon Kabushiki Kaisha | Database annotation and retrieval |
US7058893B2 (en) | 1998-12-18 | 2006-06-06 | Tangis Corporation | Managing interactions between computer users' context models |
US7062715B2 (en) | 1998-12-18 | 2006-06-13 | Tangis Corporation | Supplying notifications related to supply and consumption of user context data |
US7073129B1 (en) | 1998-12-18 | 2006-07-04 | Tangis Corporation | Automated selection of appropriate information based on a computer user's context |
US7076737B2 (en) | 1998-12-18 | 2006-07-11 | Tangis Corporation | Thematic response to a computer user's context, such as by a wearable personal computer |
US7080322B2 (en) | 1998-12-18 | 2006-07-18 | Tangis Corporation | Thematic response to a computer user's context, such as by a wearable personal computer |
US7107539B2 (en) | 1998-12-18 | 2006-09-12 | Tangis Corporation | Thematic response to a computer user's context, such as by a wearable personal computer |
US7113918B1 (en) * | 1999-08-01 | 2006-09-26 | Electric Planet, Inc. | Method for video enabled electronic commerce |
US7149755B2 (en) * | 2002-07-29 | 2006-12-12 | Hewlett-Packard Development Company, Lp. | Presenting a collection of media objects |
US7212968B1 (en) | 1999-10-28 | 2007-05-01 | Canon Kabushiki Kaisha | Pattern matching method and apparatus |
US7225229B1 (en) | 1998-12-18 | 2007-05-29 | Tangis Corporation | Automated pushing of computer user's context data to clients |
US7240003B2 (en) | 2000-09-29 | 2007-07-03 | Canon Kabushiki Kaisha | Database annotation and retrieval |
US7292979B2 (en) | 2001-11-03 | 2007-11-06 | Autonomy Systems, Limited | Time ordered indexing of audio data |
US7310600B1 (en) | 1999-10-28 | 2007-12-18 | Canon Kabushiki Kaisha | Language recognition using a similarity measure |
US7337116B2 (en) | 2000-11-07 | 2008-02-26 | Canon Kabushiki Kaisha | Speech processing system |
WO2008121758A1 (en) * | 2007-03-30 | 2008-10-09 | Rite-Solutions, Inc. | Methods and apparatus for the creation and editing of media intended for the enhancement of existing media |
WO2009005415A1 (en) * | 2007-07-03 | 2009-01-08 | Teleca Sweden Ab | Method for displaying content on a multimedia player and a multimedia player |
US7478331B2 (en) | 1998-12-18 | 2009-01-13 | Microsoft Corporation | Interface for exchanging context data |
EP2264619A3 (en) * | 1998-11-30 | 2011-03-02 | YUEN, Henry C. | Search engine for video and graphics |
DE10033134B4 (en) * | 1999-10-21 | 2011-05-12 | Frank Knischewski | Method and device for displaying information on selected picture elements of pictures of a video sequence |
US7987175B2 (en) | 1998-11-30 | 2011-07-26 | Gemstar Development Corporation | Search engine for video and graphics |
USRE42728E1 (en) | 1997-07-03 | 2011-09-20 | Sony Corporation | Network distribution and management of interactive video and multi-media containers |
EP2816564A1 (en) * | 2013-06-21 | 2014-12-24 | Nokia Corporation | Method and apparatus for smart video rendering |
US9125169B2 (en) | 2011-12-23 | 2015-09-01 | Rovi Guides, Inc. | Methods and systems for performing actions based on location-based rules |
US9183306B2 (en) | 1998-12-18 | 2015-11-10 | Microsoft Technology Licensing, Llc | Automated selection of appropriate information based on a computer user's context |
US9294799B2 (en) | 2000-10-11 | 2016-03-22 | Rovi Guides, Inc. | Systems and methods for providing storage of data on servers in an on-demand media delivery system |
US10555051B2 (en) | 2016-07-21 | 2020-02-04 | At&T Mobility Ii Llc | Internet enabled video media content stream |
US10638194B2 (en) | 2014-05-06 | 2020-04-28 | At&T Intellectual Property I, L.P. | Embedding interactive objects into a video session |
US10657380B2 (en) | 2017-12-01 | 2020-05-19 | At&T Mobility Ii Llc | Addressable image object |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0618526A2 (en) * | 1993-03-31 | 1994-10-05 | Us West Advanced Technologies, Inc. | Method and apparatus for multi-level navigable video environment |
WO1997012342A1 (en) * | 1995-09-29 | 1997-04-03 | Wistendahl Douglass A | System for using media content in interactive digital media program |
1998-04-16: WO PCT/JP1998/001736 patent/WO1998047084A1/en — active, Application Filing
Non-Patent Citations (2)
Title |
---|
"MULTIMEDIA HYPERVIDEO LINKS FOR FULL MOTION VIDEOS", IBM TECHNICAL DISCLOSURE BULLETIN, vol. 37, no. 4A, 1 April 1994 (1994-04-01), pages 95, XP000446196 * |
BURRILL V ET AL: "TIME-VARYING SENSITIVE REGIONS IN DYNAMIC MULTIMEDIA OBJECTS: A PRAGMATIC APPROACH TO CONTENT BASED RETRIEVAL FROM VIDEO", INFORMATION AND SOFTWARE TECHNOLOGY, vol. 36, no. 4, 1 January 1994 (1994-01-01), pages 213 - 223, XP000572844 * |
Cited By (84)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
USRE42728E1 (en) | 1997-07-03 | 2011-09-20 | Sony Corporation | Network distribution and management of interactive video and multi-media containers |
US6573907B1 (en) | 1997-07-03 | 2003-06-03 | Obvious Technology | Network distribution and management of interactive video and multi-media containers |
USRE45594E1 (en) | 1997-07-03 | 2015-06-30 | Sony Corporation | Network distribution and management of interactive video and multi-media containers |
US9311405B2 (en) | 1998-11-30 | 2016-04-12 | Rovi Guides, Inc. | Search engine for video and graphics |
EP2264619A3 (en) * | 1998-11-30 | 2011-03-02 | YUEN, Henry C. | Search engine for video and graphics |
US7987175B2 (en) | 1998-11-30 | 2011-07-26 | Gemstar Development Corporation | Search engine for video and graphics |
US8341137B2 (en) | 1998-11-30 | 2012-12-25 | Gemstar Development Corporation | Search engine for video and graphics |
US8341136B2 (en) | 1998-11-30 | 2012-12-25 | Gemstar Development Corporation | Search engine for video and graphics |
US7346663B2 (en) | 1998-12-18 | 2008-03-18 | Microsoft Corporation | Automated response to computer user's context |
US7080322B2 (en) | 1998-12-18 | 2006-07-18 | Tangis Corporation | Thematic response to a computer user's context, such as by a wearable personal computer |
US9372555B2 (en) | 1998-12-18 | 2016-06-21 | Microsoft Technology Licensing, Llc | Managing interactions between computer users' context models |
US7203906B2 (en) | 1998-12-18 | 2007-04-10 | Tangis Corporation | Supplying notifications related to supply and consumption of user context data |
US9559917B2 (en) | 1998-12-18 | 2017-01-31 | Microsoft Technology Licensing, Llc | Supplying notifications related to supply and consumption of user context data |
US9906474B2 (en) | 1998-12-18 | 2018-02-27 | Microsoft Technology Licensing, Llc | Automated selection of appropriate information based on a computer user's context |
US7137069B2 (en) | 1998-12-18 | 2006-11-14 | Tangis Corporation | Thematic response to a computer user's context, such as by a wearable personal computer |
US7107539B2 (en) | 1998-12-18 | 2006-09-12 | Tangis Corporation | Thematic response to a computer user's context, such as by a wearable personal computer |
US7089497B2 (en) | 1998-12-18 | 2006-08-08 | Tangis Corporation | Managing interactions between computer users' context models |
US7055101B2 (en) | 1998-12-18 | 2006-05-30 | Tangis Corporation | Thematic response to a computer user's context, such as by a wearable personal computer |
US7076737B2 (en) | 1998-12-18 | 2006-07-11 | Tangis Corporation | Thematic response to a computer user's context, such as by a wearable personal computer |
US7478331B2 (en) | 1998-12-18 | 2009-01-13 | Microsoft Corporation | Interface for exchanging context data |
US7073129B1 (en) | 1998-12-18 | 2006-07-04 | Tangis Corporation | Automated selection of appropriate information based on a computer user's context |
US7062715B2 (en) | 1998-12-18 | 2006-06-13 | Tangis Corporation | Supplying notifications related to supply and consumption of user context data |
US7058894B2 (en) | 1998-12-18 | 2006-06-06 | Tangis Corporation | Managing interactions between computer users' context models |
US7058893B2 (en) | 1998-12-18 | 2006-06-06 | Tangis Corporation | Managing interactions between computer users' context models |
US7225229B1 (en) | 1998-12-18 | 2007-05-29 | Tangis Corporation | Automated pushing of computer user's context data to clients |
US9183306B2 (en) | 1998-12-18 | 2015-11-10 | Microsoft Technology Licensing, Llc | Automated selection of appropriate information based on a computer user's context |
US7046263B1 (en) | 1998-12-18 | 2006-05-16 | Tangis Corporation | Requesting computer user's context data |
US6990448B2 (en) | 1999-03-05 | 2006-01-24 | Canon Kabushiki Kaisha | Database annotation and retrieval including phoneme data |
US7257533B2 (en) | 1999-03-05 | 2007-08-14 | Canon Kabushiki Kaisha | Database searching and retrieval using phoneme and word lattice |
WO2000077678A1 (en) * | 1999-06-14 | 2000-12-21 | Brad Barrett | Method and system for an advanced television system allowing objects within an encoded video session to be interactively selected and processed |
WO2000077790A3 (en) * | 1999-06-15 | 2001-04-05 | Digital Electronic Cinema Inc | Systems and methods for facilitating the recomposition of data blocks |
WO2000077790A2 (en) * | 1999-06-15 | 2000-12-21 | Digital Electronic Cinema, Inc. | Systems and methods for facilitating the recomposition of data blocks |
US7113918B1 (en) * | 1999-08-01 | 2006-09-26 | Electric Planet, Inc. | Method for video enabled electronic commerce |
WO2001015454A1 (en) * | 1999-08-26 | 2001-03-01 | Spotware Technologies, Inc. | Method and apparatus for providing supplemental information regarding objects in a video stream |
WO2001026377A1 (en) * | 1999-10-04 | 2001-04-12 | Obvious Technology, Inc. | Network distribution and management of interactive video and multi-media containers |
DE10033134B4 (en) * | 1999-10-21 | 2011-05-12 | Frank Knischewski | Method and device for displaying information on selected picture elements of pictures of a video sequence |
US8863199B1 (en) | 1999-10-21 | 2014-10-14 | Frank Knischewski | Method and device for displaying information with respect to selected image elements of images of a video sequence |
US6882970B1 (en) | 1999-10-28 | 2005-04-19 | Canon Kabushiki Kaisha | Language recognition using sequence frequency |
US7310600B1 (en) | 1999-10-28 | 2007-12-18 | Canon Kabushiki Kaisha | Language recognition using a similarity measure |
US7295980B2 (en) | 1999-10-28 | 2007-11-13 | Canon Kabushiki Kaisha | Pattern matching method and apparatus |
US7212968B1 (en) | 1999-10-28 | 2007-05-01 | Canon Kabushiki Kaisha | Pattern matching method and apparatus |
US7155456B2 (en) | 1999-12-15 | 2006-12-26 | Tangis Corporation | Storing and recalling information to augment human memories |
US9443037B2 (en) | 1999-12-15 | 2016-09-13 | Microsoft Technology Licensing, Llc | Storing and recalling information to augment human memories |
WO2001044978A3 (en) * | 1999-12-15 | 2003-01-09 | Tangis Corp | Storing and recalling information to augment human memories |
US6549915B2 (en) | 1999-12-15 | 2003-04-15 | Tangis Corporation | Storing and recalling information to augment human memories |
WO2001065856A1 (en) * | 2000-02-29 | 2001-09-07 | Watchpoint Media, Inc. | Methods and apparatus for hyperlinking in a television broadcast |
WO2001069438A3 (en) * | 2000-03-14 | 2003-12-31 | Starlab Nv Sa | Methods and apparatus for encoding multimedia annotations using time-synchronized description streams |
WO2001069438A2 (en) * | 2000-03-14 | 2001-09-20 | Starlab Nv/Sa | Methods and apparatus for encoding multimedia annotations using time-synchronized description streams |
US7054812B2 (en) | 2000-05-16 | 2006-05-30 | Canon Kabushiki Kaisha | Database annotation and retrieval |
US6873993B2 (en) | 2000-06-21 | 2005-03-29 | Canon Kabushiki Kaisha | Indexing method and apparatus |
US7240003B2 (en) | 2000-09-29 | 2007-07-03 | Canon Kabushiki Kaisha | Database annotation and retrieval |
US9462317B2 (en) | 2000-10-11 | 2016-10-04 | Rovi Guides, Inc. | Systems and methods for providing storage of data on servers in an on-demand media delivery system |
US9294799B2 (en) | 2000-10-11 | 2016-03-22 | Rovi Guides, Inc. | Systems and methods for providing storage of data on servers in an on-demand media delivery system |
US7337116B2 (en) | 2000-11-07 | 2008-02-26 | Canon Kabushiki Kaisha | Speech processing system |
US6801891B2 (en) | 2000-11-20 | 2004-10-05 | Canon Kabushiki Kaisha | Speech processing system |
WO2002058399A1 (en) * | 2001-01-22 | 2002-07-25 | Thomson Licensing S.A. | Method for choosing a reference information item in a television signal |
KR100895922B1 (en) * | 2001-01-22 | 2009-05-07 | 톰슨 라이센싱 | Method and apparatus for transmission or recording of and reproduction of a video signal with embedded hyperlinks, and information carrier |
WO2002071021A1 (en) * | 2001-03-02 | 2002-09-12 | First International Digital, Inc. | Method and system for encoding and decoding synchronized data within a media sequence |
WO2003030126A2 (en) * | 2001-10-01 | 2003-04-10 | Telecom Italia S.P.A. | System and method for transmitting multimedia information streams, for instance for remote teaching |
WO2003030126A3 (en) * | 2001-10-01 | 2003-10-02 | Telecom Italia Spa | System and method for transmitting multimedia information streams, for instance for remote teaching |
GB2388739A (en) * | 2001-11-03 | 2003-11-19 | Dremedia Ltd | Time-ordered indexing of an information stream |
US7292979B2 (en) | 2001-11-03 | 2007-11-06 | Autonomy Systems, Limited | Time ordered indexing of audio data |
US8972840B2 (en) | 2001-11-03 | 2015-03-03 | Longsand Limited | Time ordered indexing of an information stream |
GB2388739B (en) * | 2001-11-03 | 2004-06-02 | Dremedia Ltd | Time ordered indexing of an information stream |
US7206303B2 (en) | 2001-11-03 | 2007-04-17 | Autonomy Systems Limited | Time ordered indexing of an information stream |
FR2836317A1 (en) * | 2002-02-19 | 2003-08-22 | Michel Francis Monduc | METHOD FOR TRANSMITTING AUDIO OR VIDEO MESSAGES OVER THE INTERNET NETWORK |
EP1337091A2 (en) * | 2002-02-19 | 2003-08-20 | Michel Francis Monduc | Method for transmission of audio or video messages over the Internet |
EP1337091A3 (en) * | 2002-02-19 | 2003-09-10 | Michel Francis Monduc | Method for transmission of audio or video messages over the Internet |
US7149755B2 (en) * | 2002-07-29 | 2006-12-12 | Hewlett-Packard Development Company, Lp. | Presenting a collection of media objects |
WO2005062307A1 (en) * | 2003-12-02 | 2005-07-07 | Eastman Kodak Company | Modifying a portion of an image frame |
EP1578121A2 (en) * | 2004-03-16 | 2005-09-21 | Sony Corporation | Image data storing method and image processing apparatus |
EP1578121A3 (en) * | 2004-03-16 | 2006-05-31 | Sony Corporation | Image data storing method and image processing apparatus |
WO2008121758A1 (en) * | 2007-03-30 | 2008-10-09 | Rite-Solutions, Inc. | Methods and apparatus for the creation and editing of media intended for the enhancement of existing media |
WO2009005415A1 (en) * | 2007-07-03 | 2009-01-08 | Teleca Sweden Ab | Method for displaying content on a multimedia player and a multimedia player |
US9125169B2 (en) | 2011-12-23 | 2015-09-01 | Rovi Guides, Inc. | Methods and systems for performing actions based on location-based rules |
US10347298B2 (en) | 2013-06-21 | 2019-07-09 | Nokia Technologies Oy | Method and apparatus for smart video rendering |
EP2816564A1 (en) * | 2013-06-21 | 2014-12-24 | Nokia Corporation | Method and apparatus for smart video rendering |
US10638194B2 (en) | 2014-05-06 | 2020-04-28 | At&T Intellectual Property I, L.P. | Embedding interactive objects into a video session |
US10555051B2 (en) | 2016-07-21 | 2020-02-04 | At&T Mobility Ii Llc | Internet enabled video media content stream |
US10979779B2 (en) | 2016-07-21 | 2021-04-13 | At&T Mobility Ii Llc | Internet enabled video media content stream |
US11564016B2 (en) | 2016-07-21 | 2023-01-24 | At&T Mobility Ii Llc | Internet enabled video media content stream |
US10657380B2 (en) | 2017-12-01 | 2020-05-19 | At&T Mobility Ii Llc | Addressable image object |
US11216668B2 (en) | 2017-12-01 | 2022-01-04 | At&T Mobility Ii Llc | Addressable image object |
US11663825B2 (en) | 2017-12-01 | 2023-05-30 | At&T Mobility Ii Llc | Addressable image object |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO1998047084A1 (en) | A method and system for object-based video description and linking | |
US7536706B1 (en) | Information enhanced audio video encoding system | |
EP0982947A2 (en) | Audio video encoding system with enhanced functionality | |
US6499057B1 (en) | System and method for activating uniform network resource locators displayed in a media broadcast | |
US20200065322A1 (en) | Multimedia content tags | |
US6868415B2 (en) | Information linking method, information viewer, information register, and information search equipment | |
Nack et al. | Everything you wanted to know about MPEG-7. 1 | |
Tseng et al. | Using MPEG-7 and MPEG-21 for personalizing video | |
Bolle et al. | Video query: Research directions | |
KR100915847B1 (en) | Streaming video bookmarks | |
US7647555B1 (en) | System and method for video access from notes or summaries | |
JP4408768B2 (en) | Description data generation device, audio visual device using description data | |
KR100512138B1 (en) | Video Browsing System With Synthetic Key Frame | |
US7181757B1 (en) | Video summary description scheme and method and system of video summary description data generation for efficient overview and browsing | |
Elmagarmid et al. | Video Database Systems: Issues, Products and Applications | |
US20070124796A1 (en) | Appliance and method for client-sided requesting and receiving of information | |
US20030074671A1 (en) | Method for information retrieval based on network | |
US20050144305A1 (en) | Systems and methods for identifying, segmenting, collecting, annotating, and publishing multimedia materials | |
KR20040101235A (en) | Method and system for retrieving information about television programs | |
CN102483742A (en) | System and method for managing internet media content | |
WO2001027876A1 (en) | Video summary description scheme and method and system of video summary description data generation for efficient overview and browsing | |
Wactlar et al. | Digital video archives: Managing through metadata | |
EP1684517A2 (en) | Information presenting system | |
Cho et al. | News video retrieval using automatic indexing of korean closed-caption | |
Tanaka | Research on Fusion of the Web and TV Broadcasting |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A1 Designated state(s): JP KR |
|
AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE |
|
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | ||
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
NENP | Non-entry into the national phase |
Ref country code: JP Ref document number: 1998543744 Format of ref document f/p: F |
|
122 | Ep: pct application non-entry in european phase |