US20140019479A1 - Technique for processing data in a network - Google Patents

Technique for processing data in a network Download PDF

Info

Publication number
US20140019479A1
Authority
US
United States
Prior art keywords
annotation
database
video data
video
annotations
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US13/929,678
Inventor
Arjen P. deVries
Michael Sokolov
David E. Kovalcin
Brian Eberman
Leonidas Kontothanassis
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Eureka Database Solutions LLC
AltaVista Co
Altaba Inc
Original Assignee
Yahoo! Inc. (until 2017)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Family has litigation
First worldwide family litigation filed: https://patents.darts-ip.com/?family=33032482&patent=US20140019479(A1). "Global patent litigation dataset" by Darts-ip is licensed under a Creative Commons Attribution 4.0 International License.
Priority claimed from US09/037,957 (now US6173287B1)
Application filed by Yahoo! Inc. (until 2017)
Priority to US13/929,678
Publication of US20140019479A1
Assigned to OVERTURE SERVICES, INC. reassignment OVERTURE SERVICES, INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: ALTA VISTA COMPANY
Assigned to DIGITAL EQUIPMENT CORPORATION reassignment DIGITAL EQUIPMENT CORPORATION ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: DE VRIES, ARJEN P., KOVALCIN, DAVID E., SOKOLOV, MICHAEL, EBERMAN, BRIAN S, DUFAUX, FREDERIC, KONTOTHANASSIS, LEONIDAS
Assigned to ALTAVISTA COMPANY reassignment ALTAVISTA COMPANY MERGER AND CHANGE OF NAME (SEE DOCUMENT FOR DETAILS). Assignors: ALTAVISTA COMPANY, ZOOM NEWCO INC.
Assigned to ZOOM NEWCO INC. reassignment ZOOM NEWCO INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: COMPAQ COMPUTER CORPORATION, DIGITAL EQUIPMENT CORPORATION
Assigned to ALTAVISTA COMPANY reassignment ALTAVISTA COMPANY ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: DIGITAL EQUIPMENT CORPORATION
Assigned to YAHOO! INC. reassignment YAHOO! INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: OVERTURE SERVICES, INC.
Assigned to EXCALIBUR IP, LLC reassignment EXCALIBUR IP, LLC ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: YAHOO! INC.
Assigned to YAHOO! INC. reassignment YAHOO! INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: EXCALIBUR IP, LLC
Assigned to EXCALIBUR IP, LLC reassignment EXCALIBUR IP, LLC ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: YAHOO! INC.
Assigned to EUREKA DATABASE SOLUTIONS, LLC reassignment EUREKA DATABASE SOLUTIONS, LLC ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: EXCALIBUR IP, LLC

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F17/30023
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/48Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2455Query execution
    • G06F16/24568Data stream processing; Continuous queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/248Presentation of query results
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/43Querying
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/48Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/489Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using time information
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/951Indexing; Web crawling techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9538Presentation of query results
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60Network streaming of media packets
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10STECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00Data processing: database and file management or data structures
    • Y10S707/99931Database or file accessing
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10STECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00Data processing: database and file management or data structures
    • Y10S707/99931Database or file accessing
    • Y10S707/99932Access augmentation or optimizing
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10STECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00Data processing: database and file management or data structures
    • Y10S707/99941Database schema or data structure
    • Y10S707/99948Application of database or data structure, e.g. distributed, multimedia, or image

Definitions

  • the present invention relates generally to the field of multimedia and, more particularly, to a technique for processing data in a network.
  • the migration of multimedia content from analog form to digital form also provides an organization with the ability to store, search, browse, and retrieve digitized multimedia content from distributed sites. That is, an organization having a number of distributed offices can store, search, browse, and retrieve digitized multimedia content from a centralized storage facility over a proprietary intranet computer network such as, for example, a local area network (LAN), or a public internet computer network such as, for example, the world wide web.
  • the multimedia content itself may be distributed. That is, an organization that is global in nature may have a number of distributed permanent archival storage locations where digitized multimedia content is permanently stored, or a number of distributed temporary storage locations where digitized multimedia content that is associated with work in progress is temporarily stored. Similar to above, such an organization could also store, search, browse, and retrieve digitized multimedia content from the distributed storage locations over a proprietary intranet computer network or a public internet computer network.
  • an organization may want other entities located outside of the organization to be able to search, browse, and retrieve digitized multimedia content stored and maintained within the organization. For example, an organization may want to sell multimedia content to an outside entity, which may then use the purchased multimedia content for some purpose such as, for example, a news broadcast. Similar to above, the outside entity could search, browse, and retrieve digitized multimedia content from a storage facility within the organization over a proprietary intranet computer network or a public internet computer network.
  • the primary object of the present invention is to provide a technique for processing data in a network.
  • a technique for processing data in a network is disclosed.
  • the technique may be realized as a method for processing data in a network having a plurality of network stations.
  • the method comprises receiving a first representation of data at a first of the plurality of network stations, processing the first representation so as to generate a second representation of the data, and transmitting the second representation from the first network station to a second of the plurality of network stations for storage therein, wherein the second representation is stored at an address within the second network station.
  • the method also comprises receiving the address at the first network station, and transmitting the address from the first network station to a third of the plurality of network stations for storage therein.
  • the first, the second, and the third network stations may beneficially be different network stations.
  • processing the first representation may beneficially include at least encoding the first representation or transcoding the first representation.
  • processing the first representation may beneficially include processing the first representation at the first network station.
  • receiving the address may beneficially include receiving the address from the second network station.
  • the address may beneficially have an extended URL format.
  • the method may further beneficially comprise transmitting a request for an identifier of the data from the first network station, and receiving the data identifier at the first network station. If such is the case, transmitting a request for an identifier of the data may beneficially include transmitting a request for an identifier of the data to the third network station. Also, if such is the case, the data identifier may beneficially be associated with an object in a database.
  • the address may beneficially be a first address of a plurality of addresses stored at the third network station, and the method may further beneficially comprise transmitting a request for at least one of the plurality of addresses from the first network station, and receiving a second address at the first network station. If such is the case, transmitting a request for at least one of the plurality of addresses may beneficially include transmitting a request for at least one of the plurality of addresses to the third network station.
  • Each of the plurality of addresses may then beneficially identify a location of a stored representation of data.
  • the second address may beneficially identify a location of the first representation of data.
  • the method may further beneficially comprise transmitting a request for the first representation of data at the second address from the first network station. Then, transmitting a request for the first representation of data at the second address may beneficially include transmitting a request for the first representation of data at the second address to the second network station.
  • the technique may be realized as at least one signal embodied in at least one carrier wave for transmitting a computer program of instructions configured to be readable by at least one processor for instructing the at least one processor to execute a computer process for performing the above-described method.
  • the technique may be realized as at least one processor readable carrier for storing a computer program of instructions configured to be readable by at least one processor for instructing the at least one processor to execute a computer process for performing the above-described method.
  • the technique may be realized as an apparatus for processing data in a network having a plurality of network stations.
  • the apparatus comprises a first receiver for receiving a first representation of data at a first of the plurality of network stations, a processing device for processing the first representation so as to generate a second representation of the data, and a first transmitter for transmitting the second representation from the first network station to a second of the plurality of network stations for storage therein, wherein the second representation is stored at an address within the second network station.
  • the apparatus also comprises a second receiver for receiving the address at the first network station, and a second transmitter for transmitting the address from the first network station to a third of the plurality of network stations for storage therein.
  • the first, the second, and the third network stations may beneficially be different network stations.
  • the processing device may beneficially include at least an encoder for encoding the first representation or a transcoder for transcoding the first representation.
  • the apparatus may further beneficially comprise a third transmitter for transmitting a request for an identifier of the data from the first network station, and a third receiver for receiving the data identifier at the first network station. If such is the case, the data identifier may beneficially be associated with an object in a database.
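  • By way of illustration only, the claimed method can be sketched in a few lines of Python (a minimal sketch; the station hostnames, endpoint paths, and reply formats below are assumptions, since the claims specify only the flow of representations and addresses among the three stations):

      import urllib.request

      MEDIA_STORE = "http://station2.example.com/store"      # second network station (hypothetical)
      LIBRARIAN = "http://station3.example.com/addresses"    # third network station (hypothetical)

      def post(url, body):
          request = urllib.request.Request(url, data=body, method="POST")
          with urllib.request.urlopen(request) as response:
              return response.read().decode()

      def handle(first_representation):
          # Process the first representation so as to generate a second
          # representation (a stand-in; the first station would encode or
          # transcode here).
          second_representation = first_representation
          # Transmit the second representation to the second station for storage;
          # the reply is assumed to carry the address at which it was stored.
          address = post(MEDIA_STORE, second_representation)
          # Transmit the received address to the third station for storage.
          post(LIBRARIAN, address.encode())
          return address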
  • FIG. 1A is a schematic diagram of a first embodiment of a system for organizing distributed multimedia content and for searching, browsing, and retrieving such organized distributed multimedia content in accordance with the present invention.
  • FIG. 1B is a schematic diagram of a second embodiment of a system for organizing distributed multimedia content and for searching, browsing, and retrieving such organized distributed multimedia content in accordance with the present invention.
  • FIG. 2 is a flowchart diagram detailing the processing steps of an encoder client in accordance with the present invention.
  • FIG. 3 is a flowchart diagram detailing the processing steps of a transcoder client in accordance with the present invention.
  • FIG. 4 is a flowchart diagram of an encoding process for use in an encoder and transcoder in accordance with the present invention.
  • FIG. 5 shows the file structure for a file that is stored in a media database containing a digital representation of audio/video data in accordance with the present invention.
  • FIG. 6 shows an annotation structure for an object in accordance with the present invention.
  • FIG. 7 shows the structure of an object database of a meta database in accordance with the present invention.
  • FIG. 8 shows an object table of a meta database in accordance with the present invention.
  • FIG. 9 shows a representation table of a meta database in accordance with the present invention.
  • FIG. 10 shows an annotation table of a meta database in accordance with the present invention.
  • FIG. 11 shows an exemplary HTML query page in accordance with the present invention.
  • FIG. 12 shows an exemplary HTML results page in accordance with the present invention.
  • FIG. 13 shows an exemplary HTML matches page in accordance with the present invention.
  • FIG. 14 shows an exemplary HTML more context page in accordance with the present invention.
  • FIG. 15 is a schematic diagram of a processing device for facilitating the implementation of input data processing and output data generation in the components of the present invention.
  • Referring to FIG. 1A, there is shown a schematic diagram of a first embodiment of a system 10A for organizing distributed multimedia content and for searching, browsing, and retrieving such organized distributed multimedia content in accordance with the present invention.
  • the system 10 A comprises a user 11 , raw audio/video data 12 , at least one encoder client 14 , at least one transcoder client 16 , at least one annotation client 18 , at least one browser client 20 , a media database 22 , a media database server 24 , a meta database 26 , a meta database server (librarian) 28 , an index database 30 , an index database server 32 , and a communication network 34 for allowing communication between all of the above-identified components which are connected thereto.
  • the communication network 34 as described herein is an internet protocol (IP) network using hypertext transfer protocol (HTTP) messaging so as to exploit the distributed nature of the world wide web (WWW).
  • the system 10 A may be implemented using other types of network protocols, and many of the above-identified components may be grouped together in a single processing device so as to altogether eliminate the need for inter- or intra-network communications between these grouped components.
  • the system 10 A operates such that the raw audio/video data 12 is provided to the encoder client 14 for processing by the encoder client 14 .
  • the encoder client 14 sends a message over the communication network 34 to the librarian 28 requesting the creation of an object in the meta database 26 corresponding to the raw audio/video data 12 .
  • the librarian 28 processes the message from the encoder client 14 by creating an object in the meta database 26 corresponding to the raw audio/video data 12 and assigns the object an object identification number as described in more detail below.
  • the librarian 28 then sends a message, including the object identification number associated with the raw audio/video data 12 , over the communication network 34 to the encoder client 14 notifying the encoder client 14 of the creation of the object in the meta database 26 corresponding to the raw audio/video data 12 .
  • Upon receipt of the notification from the librarian 28, the encoder client 14 digitally encodes the raw audio/video data 12 so as to generate a first digital representation of the raw audio/video data 12, as described in more detail below. The encoder client 14 then sends a message, including the first digital representation of the raw audio/video data 12, over the communication network 34 to the media database server 24 requesting that the media database server 24 store the first digital representation of the raw audio/video data 12 in the media database 22. The media database server 24 processes the message from the encoder client 14 by first checking to see if space is available in the media database 22 to store the first digital representation of the raw audio/video data 12 in the media database 22.
  • If space is not available in the media database 22, the media database server 24 denies the request to store the first digital representation of the raw audio/video data 12 in the media database 22. However, if space is available in the media database 22, the media database server 24 stores the first digital representation of the raw audio/video data 12 at a location in the media database 22 and assigns the location a first universal resource locator (URL). The media database server 24 then sends a message, including the first URL, over the communication network 34 to the encoder client 14 notifying the encoder client 14 of the storage of the first digital representation of the raw audio/video data 12 in the media database 22.
  • Upon receipt of the notification from the media database server 24, the encoder client 14 sends a message, including the object identification number associated with the raw audio/video data 12 and the first URL, over the communication network 34 to the librarian 28 notifying the librarian 28 of the digital encoding of the raw audio/video data 12 into the first digital representation of the raw audio/video data 12, and the storing of the first digital representation of the raw audio/video data 12 in the media database 22 at the location identified by the first URL.
  • the librarian 28 processes the message from the encoder client 14 by storing the first URL in the meta database 26 along with the object identification number associated with the raw audio/video data 12 , as described in more detail below.
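  • In outline, the ingest exchange just described might be sketched as follows (a hedged sketch: the patent specifies HTTP messaging, but the endpoint paths, reply formats, and the encode() stub below are invented for illustration):

      import urllib.parse
      import urllib.request

      LIBRARIAN = "http://librarian.example.com"   # meta database server 28 (hypothetical host)
      MEDIA_DB = "http://mediadb.example.com"      # media database server 24 (hypothetical host)

      def post(url, body):
          request = urllib.request.Request(url, data=body, method="POST")
          with urllib.request.urlopen(request) as response:
              return response.read().decode()

      def encode(raw_av_data):
          return raw_av_data   # stand-in for the digital encoding described below

      def ingest(raw_av_data):
          # 1. Request creation of an object; the reply carries the object ID.
          object_id = post(LIBRARIAN + "/create_object", b"type=video")
          # 2. Digitally encode the raw data into a first digital representation.
          first_representation = encode(raw_av_data)
          # 3. Store it at the media database server; the reply carries the first URL.
          first_url = post(MEDIA_DB + "/store", first_representation)
          # 4. Notify the librarian of the first URL under the same object ID.
          body = urllib.parse.urlencode({"object": object_id, "url": first_url})
          post(LIBRARIAN + "/add_representation", body.encode())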
  • the transcoder client 16 periodically sends messages to the librarian 28 requesting work from the librarian 28 .
  • the librarian 28 processes such a message from the transcoder client 16 by first checking to see if there are any objects in the meta database 26 that have corresponding digital representations which have not been processed by the transcoder client 16 . If there are no objects in the meta database 26 that have corresponding digital representations which have not been processed by the transcoder client 16 , then the librarian 28 denies the work request.
  • Otherwise, the librarian 28 sends a message, including the object identification number associated with the raw audio/video data 12 and the first URL, over the communication network 34 to the transcoder client 16, thereby notifying the transcoder client 16 that the first digital representation of the raw audio/video data 12 has not been processed by the transcoder client 16.
  • Upon receipt of the notification from the librarian 28, the transcoder client 16 sends a message, including the first URL, over the communication network 34 to the media database server 24 requesting that the media database server 24 send a copy of the first digital representation of the raw audio/video data 12 to the transcoder client 16 for processing by the transcoder client 16.
  • the media database server 24 processes the message from the transcoder client 16 by sending a message, including a copy of the first digital representation of the raw audio/video data 12 , over the communication network 34 to the transcoder client 16 for processing by the transcoder client 16 .
  • the transcoder client 16 processes the copy of the first digital representation of the raw audio/video data 12 such that a second digital representation of the raw audio/video data 12 is generated, as described in more detail below.
  • After the transcoder client 16 has processed the copy of the first digital representation of the raw audio/video data 12, and generated the second digital representation of the raw audio/video data 12, the transcoder client 16 sends a message, including the second digital representation of the raw audio/video data 12, over the communication network 34 to the media database server 24 requesting that the media database server 24 store the second digital representation of the raw audio/video data 12 in the media database 22.
  • the media database server 24 processes the message from the transcoder client 16 by first checking to see if space is available in the media database 22 to store the second digital representation of the raw audio/video data 12 in the media database 22 .
  • If space is not available in the media database 22, the media database server 24 denies the request to store the second digital representation of the raw audio/video data 12 in the media database 22. However, if space is available in the media database 22, the media database server 24 stores the second digital representation of the raw audio/video data 12 at a location in the media database 22 and assigns the location a second URL. The media database server 24 then sends a message, including the second URL, over the communication network 34 to the transcoder client 16 notifying the transcoder client 16 of the storing of the second digital representation of the raw audio/video data 12 in the media database 22 at the location identified by the second URL.
  • Upon receipt of the notification from the media database server 24, the transcoder client 16 sends a message, including the object identification number associated with the raw audio/video data 12 and the second URL, over the communication network 34 to the librarian 28 notifying the librarian 28 of the transcoding of the first digital representation of the raw audio/video data 12 into the second digital representation of the raw audio/video data 12, and the storing of the second digital representation of the raw audio/video data 12 in the media database 22 at the location identified by the second URL.
  • the librarian 28 processes the message from the transcoder client 16 by storing the second URL in the meta database 26 along with the object identification number associated with the raw audio/video data 12 , as described in more detail below.
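  • The transcoder's side of this exchange amounts to a polling work loop. A sketch, reusing the post() helper and hypothetical hosts from the ingest sketch above (the work-request path, the 204 no-work convention, and the reply format are likewise assumptions):

      import time
      import urllib.request

      def transcode(first_representation):
          return first_representation   # stand-in for decoding and re-encoding

      def work_loop(librarian, media_db):
          while True:
              # Periodically request work from the librarian.
              with urllib.request.urlopen(librarian + "/request_work?client=transcoder") as response:
                  if response.status == 204:   # work request denied: nothing unprocessed
                      time.sleep(60)
                      continue
                  object_id, first_url = response.read().decode().split()
              # Fetch a copy of the first digital representation.
              with urllib.request.urlopen(first_url) as response:
                  first_representation = response.read()
              # Generate and store the second digital representation.
              second_url = post(media_db + "/store", transcode(first_representation))
              # Register the second URL with the librarian under the same object ID.
              post(librarian + "/add_representation",
                   ("object=%s&url=%s" % (object_id, second_url)).encode())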
  • the annotation client 18 periodically sends messages to the librarian 28 requesting work from the librarian 28 .
  • the librarian 28 processes such a message from the annotation client 18 by first checking to see if there are any objects in the meta database 26 that have corresponding digital representations which have not been processed by the annotation client 18. If there are no objects in the meta database 26 that have corresponding digital representations which have not been processed by the annotation client 18, then the librarian 28 denies the work request.
  • Otherwise, the librarian 28 sends a message, including the object identification number associated with the raw audio/video data 12 and the first URL, over the communication network 34 to the annotation client 18, thereby notifying the annotation client 18 that the first digital representation of the raw audio/video data 12 has not been processed by the annotation client 18.
  • Upon receipt of the notification from the librarian 28, the annotation client 18 sends a message, including the first URL, over the communication network 34 to the media database server 24 requesting that the media database server 24 send a copy of the first digital representation of the raw audio/video data 12 to the annotation client 18 for processing by the annotation client 18.
  • the media database server 24 processes the message from the annotation client 18 by sending a message, including a copy of the first digital representation of the raw audio/video data 12, over the communication network 34 to the annotation client 18 for processing by the annotation client 18.
  • the annotation client 18 processes the copy of the first digital representation of the raw audio/video data 12 so as to generate annotations for the object in the meta database 26 corresponding to the raw audio/video data 12 , as described in more detail below.
  • After the annotation client 18 has processed the copy of the first digital representation of the raw audio/video data 12, and generated the annotations for the object in the meta database 26 corresponding to the raw audio/video data 12, the annotation client 18 sends a message, including the object identification number associated with the raw audio/video data 12 and the generated annotations, over the communication network 34 to the librarian 28 notifying the librarian 28 of the generation of the annotations for the object.
  • the librarian 28 processes the message from the annotation client 18 by storing the annotations that were generated for the object in the meta database corresponding to the raw audio/video data 12 in the meta database 26 along with the object identification number associated with the raw audio/video data 12 , as described in more detail below.
  • the index database server 32 periodically sends messages to the librarian 28 requesting a list of object identification numbers from the librarian 28 which correspond to objects that have been created in the meta database 26 .
  • the librarian 28 processes such a message from the index database server 32 by sending a message, including a list of object identification numbers corresponding to objects that have been created in the meta database 26, over the communication network 34 to the index database server 32 for processing by the index database server 32.
  • the index database server 32 processes the message from the librarian 28 by sending a message, including, for example, the object identification number associated with the raw audio/video data 12, over the communication network 34 to the librarian 28 requesting that the librarian 28 send a copy of the annotations that were generated for the object in the meta database corresponding to the raw audio/video data 12.
  • the librarian 28 processes the message from the index database server 32 by sending a message, including the annotations that were generated for the object in the meta database corresponding to the raw audio/video data 12 , over the communication network 34 to the index database server 32 for processing by the index database server 32 .
  • the index database server 32 processes the message from the librarian 28 by storing the annotations that were generated for the object in the meta database corresponding to the raw audio/video data 12 in the index database 30 along with, or with reference to, the object identification number associated with the raw audio/video data 12 , as described in more detail below.
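  • In effect, the index database server 32 maintains an inverted index from annotation text to object identification numbers. A minimal in-memory stand-in (the real index database 30 would of course be persistent):

      from collections import defaultdict

      # word -> object identification numbers (in-memory stand-in for index database 30)
      inverted_index = defaultdict(set)

      def index_annotations(object_id, annotations):
          for annotation in annotations:
              for word in annotation.lower().split():
                  inverted_index[word].add(object_id)

      # Example: indexing the annotations fetched from the librarian for one object.
      index_annotations(42, ["press conference", "unidentified speaker"])
      assert 42 in inverted_index["conference"]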
  • the browser client 20 allows the user 11 to interface with the index database server 32 such that the user 11 is allowed to search, browse, and retrieve all or a portion of a digital representation such as, for example, the first digital representation of the raw audio/video data 12 .
  • the browser client 20 sends a message, initiated by the user 11 , over the communication network 34 to the index database server 32 requesting a search of the index database 30 .
  • the index database server 32 processes the message from the browser client 20 by sending a message, including a hypertext markup language (HTML) query page, to the browser client 20 for presentation to the user 11 .
  • the browser client 20 then presents the HTML query page to the user 11 .
  • the HTML query page is such that it allows the user 11 to enter textual and Boolean queries.
  • the user 11 enters a query through the HTML query page and the browser client 20 sends a message, including the query, over the communication network 34 to the index database server 32 for processing by the index database server 32 .
  • the index database server 32 processes the message from the browser client 20 by searching the index database 30 for annotations which match the query, and obtaining the object identification number associated with each matching annotation, as described in more detail below.
  • the index database server 32 then sends a message, including each matching annotation and the object identification number associated with each matching annotation, over the communication network 34 to the librarian 28 requesting that the librarian 28 provide the URL of the digital representation from which each matching annotation was generated such as, for example, the first URL.
  • the librarian 28 processes the message from the index database server 32 by searching the meta database 26 for the URL of the digital representation from which each matching annotation was generated, as described in more detail below.
  • the librarian 28 then sends a message, including each matching annotation, the URL of the digital representation from which each matching annotation was generated, and the object identification number associated with each matching annotation, over the communication network 34 to the index database server 32 for processing by the index database server 32 .
  • the index database server 32 processes the message from the librarian 28 by building an HTML results page for presentation to the user 11 .
  • the index database server 32 builds the HTML results page by creating an image or an icon corresponding to the URL of the digital representation from which each matching annotation was generated. That is, each image or icon is hyperlinked to a function or script which allows the user 11 to browse and/or retrieve all or a portion of a corresponding digital representation such as, for example, the first digital representation of the raw audio/video data 12.
  • the index database server 32 sends a message, including the HTML results page, to the browser client 20 for presentation to the user 11 .
  • the browser client 20 then presents the HTML results page to the user 11 so that the user 11 can select one of the images or icons so as to browse and/or retrieve all or a portion of a corresponding digital representation such as, for example, the first digital representation of the raw audio/video data 12.
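  • Putting the query flow together, the search-and-resolve step can be sketched as follows, building on the in-memory index above (the AND semantics for multi-word queries and the results-page markup are illustrative assumptions, not taken from the patent):

      def search(index, query_words):
          # Return object IDs whose annotations match every query word.
          match_sets = [index[word.lower()] for word in query_words]
          return set.intersection(*match_sets) if match_sets else set()

      def build_results_page(matches):
          # `matches` maps object IDs to the representation URLs resolved by the
          # librarian; each entry becomes a hyperlinked item on the results page.
          items = "".join('<li><a href="%s">object %s</a></li>' % (url, oid)
                          for oid, url in matches.items())
          return "<html><body><ul>" + items + "</ul></body></html>"

      hits = search(inverted_index, ["press", "conference"])
      page = build_results_page({oid: "http://mediadb.example.com/clip.mpg" for oid in hits})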
  • a method for efficiently delivering slices of media from large media streams is required.
  • the media database server 24 has a server extension for performing these fetch and stream operations.
  • the generalization of the above-described technique is to provide a well known method for selecting a portion of a digital representation using specified file parameters.
  • the URL can be of the form: http://server/file_name?file_parameter
  • Such a generalization allows the file_parameter field to specify a format in which a digital representation will be supplied.
  • the transcoding of a digital representation into another format can be requested of the media database server 24 by so indicating in the file_parameter field.
  • the media database server 24 will receive a URL in the above-described form from a requesting entity.
  • the media database server 24 determines the appropriate server extension based upon what is indicated in the file_parameter field.
  • the media database server 24 passes the file_name and the file_parameter to the appropriate server extension.
  • the server extension then generates a multipurpose internet mail extensions (MIME) header which is sent to the requesting entity through the media database server 24 .
  • the server extension then opens the file indicated in the file_name field and strips off any header information that may be contained at the beginning of the file.
  • the file_parameter identifies the portion of the file that was requested by the requesting entity, and optionally drives transcoding or sub-stream extraction.
  • the server extension then generates a new header and provides the requested file portion to the media database server 24 , which then sends the requested file portion to the requesting entity.
  • the efficiency of the approach depends upon the implementation of the server extension for each type of representation.
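  • A server extension for the extended URL form above might behave as sketched below (under assumptions: the file_parameter is taken to be a simple byte-range query, and the MIME type is hard-coded; as the preceding paragraph notes, a real extension would be specific to each representation type):

      from urllib.parse import parse_qs, urlparse

      def serve_slice(url, read_file):
          # Handle http://server/file_name?file_parameter by returning a MIME
          # header followed by the requested portion of the stored file.
          # `read_file` is assumed to map a file name to its stored bytes.
          parsed = urlparse(url)
          params = parse_qs(parsed.query)
          data = read_file(parsed.path.lstrip("/"))
          start = int(params.get("start", ["0"])[0])
          length = int(params.get("length", [str(len(data))])[0])
          body = data[start:start + length]
          header = ("Content-Type: video/mpeg\r\n"
                    "Content-Length: %d\r\n\r\n" % len(body)).encode()
          return header + body

      # Usage: serve_slice("http://server/clip.mpg?start=1000&length=500", reader)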
  • For video sequence representation types such as MPEG and/or H.263, the present invention allows for the storing of extra information alongside a primary video stream. This makes it possible to return a portion of the primary video stream to a requesting entity from almost any location within the primary video stream without increasing the network bit rate requirements, as described below.
  • Efficient image sequence encoding for video sequences exploits the redundancy that occurs in a sequence of frames. In a video sequence for a single scene, only a few objects will move from one frame to the next. This means that by applying motion compensation it is possible to predict a current image in the video sequence from a previous image. Furthermore, this implies that the current image can be reconstructed from a previously transmitted image if all that is sent to a requesting entity are motion vectors and a difference between a predicted image and an actual image. This technique is well known and is termed predictive encoding.
  • the predictive encoding technique can be extended to make predictions about a current image based upon any prior image and any future image.
  • an image frame which has been encoded independently of any other frame is defined as an intra or I-frame
  • an image frame which has been encoded based upon a previous frame is defined as a predicted or P-frame.
  • frames are generally encoded by breaking them into fixed-size blocks. Each block can then be separately encoded, producing an I-block, or encoded using previous blocks, producing a P-block. Transmitted frames can then consist of a mixture of I-blocks and P-blocks. Additional encoding efficiency is generally gained through this technique.
  • predictive encoding, however, makes it impossible to begin decoding at an arbitrary location within a video sequence, because a P-frame cannot be decoded until the frames it depends upon have been decoded. The simplest way to correct this problem is to force the encoder to place I-frames at periodic locations within a primary video sequence.
  • the primary video sequence can then be decoded from any location where an I-frame has been placed.
  • this decreases the encoding efficiency.
  • the present invention solves this problem by maintaining a secondary bit stream of I-frames which can be used to jump into the primary bit stream from any location where an I-frame has been stored.
  • This secondary bit stream of I-frames can be generated by a secondary encoder, which can be included in both the encoder client 14 and the transcoder client 16.
  • This secondary bit stream is combined with the primary bit stream to produce the first digital representation of the raw audio/video data 12 and the second digital representation of the raw audio/video data 12 , as described above.
  • the encoder client 14 processes the raw audio/video data 12 , which is typically in analog form, by digitizing the raw audio/video data 12 with a digitizer 40 .
  • the digitized audio/video data is then encoded by a primary encoder 42 , which generates a primary bit stream 44 for the first digital representation of the raw audio/video data 12 and a prediction of the primary bit stream for the first digital representation of the raw audio/video data 12 .
  • the prediction of the primary bit stream for the first digital representation of the raw audio/video data 12 is separately encoded by a secondary encoder 45 to generate a secondary bit stream 46 for the first digital representation of the raw audio/video data 12 .
  • the primary bit stream 44 for the first digital representation of the raw audio/video data 12 and the secondary bit stream 46 for the first digital representation of the raw audio/video data 12 are then combined to form the first digital representation 48 of the raw audio/video data 12 , which is stored in the media database 22 at the location identified by the first URL, as described above.
  • the primary bit stream 44 for the first digital representation of the raw audio/video data 12 is typically in the form of an I-frame followed by a plurality of P-frames, whereas the secondary bit stream 46 for the first digital representation of the raw audio/video data 12 is in the form of all I-frames.
  • the first digital representation 48 of the raw audio/video data 12 is typically stored in a file in the media database 22 .
  • the file typically has a header which has pointers to the beginnings of the primary bit stream 44 and the secondary bit stream 46 within the file. It should be noted that the primary bit stream 44 for the first digital representation of the raw audio/video data 12 and the secondary bit stream 46 for the first digital representation of the raw audio/video data 12 must be in the same format such as, for example, JPEG, MPEG or H.263.
  • the transcoder client 16 processes the first digital representation 48 of the raw audio/video data 12 with a decoder 50 .
  • the decoded audio/video is then encoded by a primary encoder 52 , which generates a primary bit stream 54 for the second digital representation of the raw audio/video data 12 and a prediction of the primary bit stream for the second digital representation of the raw audio/video data 12 .
  • the prediction of the primary bit stream for the second digital representation of the raw audio/video data 12 is separately encoded by a secondary encoder 55 to generate a secondary bit stream 56 for the second digital representation of the raw audio/video data 12 .
  • the primary bit stream 54 for the second digital representation of the raw audio/video data 12 and the secondary bit stream 56 for the second digital representation of the raw audio/video data 12 are then combined to form the second digital representation 58 of the raw audio/video data 12 , which is stored in the media database 22 at the location identified by the second URL, as described above.
  • the primary bit stream 54 for the second digital representation of the raw audio/video data 12 is typically in the form of an I-frame followed by a plurality of P-frames, whereas the secondary bit stream 56 for the second digital representation of the raw audio/video data 12 is in the form of all I-frames.
  • the second digital representation 58 of the raw audio/video data 12 is typically stored in a file in the media database 22 .
  • the file typically has a header which has pointers to the beginnings of the primary bit stream 54 and the secondary bit stream 56 within the file. It should be noted that the primary bit stream 54 for the second digital representation of the raw audio/video data 12 and the secondary bit stream 56 for the second digital representation of the raw audio/video data 12 must be in the same format such as, for example, JPEG, MPEG or H.263.
  • the primary encoder 42 in the encoder client 14 and the primary encoder 52 in the transcoder client 16 can both operate according to an encoding process 60 such as shown in FIG. 4 .
  • This encoding process 60 comprises digitized audio/visual data 62 , a differencing function 64 , a discrete cosine transform (DCT) function 66 , a quantization (Q) function 68 , an inverse quantization (invQ) function 70 , an inverse discrete cosine transform function (IDCT) 72 , an adding function 74 , a motion estimation function 76 , a motion compensation function 78 , and a delay function 80 .
  • a current frame of the digitized audio/visual data 62 is processed according to the encoding process 60 by differencing the current frame of the digitized audio/visual data 62 with a prediction of the current frame at the differencing function 64 .
  • the difference between the current frame of the digitized audio/visual data 62 and the prediction of the current frame is encoded by the discrete cosine transform (DCT) function 66 and the quantization (Q) function 68 to produce an encoded P-frame for a digital representation of the digitized audio/visual data 62 .
  • This encoded P-frame is decoded by the inverse quantization (invQ) function 70 and the inverse discrete cosine transform function (IDCT) 72 , and then added to a delayed prediction of the current frame by the adding function 74 .
  • the prediction of the current frame is determined by subjecting the output of the adding function 74 to the motion estimation function 76 and the motion compensation function 78 . It is this prediction of the current frame that is encoded by the secondary encoder 45 in the encoder client 14 and the secondary encoder 55 in the transcoder client 16 , as described above.
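  • The loop of FIG. 4 can be condensed into a few lines. The sketch below follows the differencing, DCT, and quantization path and the decoder mirror that produces the prediction, but omits motion estimation and compensation (the prediction here is simply the previous reconstructed frame), so it illustrates the structure of the encoding process 60 rather than a usable codec:

      import numpy as np
      from scipy.fft import dctn, idctn

      def encode_sequence(frames, step=16.0):
          # Toy whole-frame predictive encoder: returns the quantized residual
          # coefficients (the P-frame payload) and the per-frame predictions,
          # which FIG. 4 routes to the secondary encoder.
          frames = np.asarray(frames, dtype=float)
          prediction = np.zeros_like(frames[0])
          coefficients, predictions = [], []
          for frame in frames:
              residual = frame - prediction                            # differencing function 64
              coeffs = np.round(dctn(residual, norm="ortho") / step)   # DCT 66 and quantization 68
              coefficients.append(coeffs)
              # Decoder mirror: inverse quantization 70 and IDCT 72, added back at 74.
              reconstructed = idctn(coeffs * step, norm="ortho") + prediction
              predictions.append(reconstructed)
              prediction = reconstructed                               # delay function 80
          return coefficients, predictions

      frames = np.random.randint(0, 255, size=(3, 8, 8))
      coefficients, predictions = encode_sequence(frames)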
  • both the secondary bit stream 46 for the first digital representation of the raw audio/video data 12 and the secondary bit stream 56 for the second digital representation of the raw audio/video data 12 could alternatively be generated at an encoder associated with the media database server 24 .
  • Referring to FIG. 1B, there is shown a schematic diagram of a second embodiment of a system 10B for organizing distributed multimedia content and for searching, browsing, and retrieving such organized distributed multimedia content in accordance with the present invention.
  • the system 10B is identical to the system 10A except for the addition of an encoder 36, and except that the encoder client 14 and the transcoder client 16 would no longer require the secondary encoder 45 and the secondary encoder 55, respectively, as described above.
  • the encoder 36 would generate both the secondary bit stream 46 for the first digital representation of the raw audio/video data 12 and the secondary bit stream 56 for the second digital representation of the raw audio/video data 12 . That is, the encoder client 14 would generate the primary bit stream 44 as described above, and then transmit the primary bit stream 44 to the media database server 24 . The media database server 24 would then provide the primary bit stream 44 to the encoder 36 , which would then generate the secondary bit stream 46 . The encoder 36 would then provide the secondary bit stream 46 to the media database server 24 .
  • the media database server 24 would then combine the primary bit stream 44 for the first digital representation of the raw audio/video data 12 and the secondary bit stream 46 for the first digital representation of the raw audio/video data 12 to form the first digital representation 48 of the raw audio/video data 12 , which is then stored in the media database 22 at the location identified by the first URL, as described above.
  • Similarly, the transcoder client 16 would generate the primary bit stream 54 as described above, and then transmit the primary bit stream 54 to the media database server 24.
  • the media database server 24 would then provide the primary bit stream 54 to the encoder 36, which would then generate the secondary bit stream 56.
  • the encoder 36 would then provide the secondary bit stream 56 to the media database server 24 .
  • the media database server 24 would then combine the primary bit stream 54 for the second digital representation of the raw audio/video data 12 and the secondary bit stream 56 for the second digital representation of the raw audio/video data 12 to form the second digital representation 58 of the raw audio/video data 12 , which is then stored in the media database 22 at the location identified by the second URL, as described above.
  • the foregoing is beneficial in that only the primary bit stream 44 and the primary bit stream 54 are transmitted from the encoder client 14 and the transcoder client 16, respectively, to the media database server 24, which increases transmission efficiency.
  • a digital representation of an audio/video bit stream consists of three components: an audio layer, a video layer, and a system layer.
  • the system layer tells a decoder how audio and video are interleaved in the audio/video bit stream. The decoder uses this information to split the audio/video bit stream into components and send each component to its appropriate decoder.
  • a video encoder takes a non-encoded video stream and provides an encoded video stream which is then combined with an encoded audio stream to create the three component audio/video stream.
  • the primary bit streams 44 and 54 and the secondary bit streams 46 and 56 as described above represent video streams which will be combined with audio streams to create three component audio/video streams.
  • the media database server 24 stores the first digital representation 48 of the raw audio/video data 12 in the media database 22 such that each P-frame in the primary bit stream 44 for the first digital representation of the raw audio/video data 12 references a corresponding I-frame in the secondary bit stream 46 for the first digital representation of the raw audio/video data 12 , and vice versa.
  • the user 11 can browse and/or retrieve a desired portion of the first digital representation 48 starting at any arbitrary location within the first digital representation 48 by first obtaining an I-frame from the secondary bit stream 46 for the first digital representation of the raw audio/video data 12 which corresponds to the arbitrary starting location of the desired portion, and then obtaining P-frames from the primary bit stream 44 for the first digital representation of the raw audio/video data 12 for all subsequent locations of the desired portion.
  • the media database server 24 will only have to send a message containing a single I-frame in order for the user 11 to browse and/or retrieve a desired portion of the first digital representation 48 , thereby obtaining maximum network transmission efficiency while maintaining the encoding advantages of only a single I-frame in the primary bit stream 44 for the first digital representation of the raw audio/video data 12 .
  • the media database server 24 stores the second digital representation 58 of the raw audio/video data 12 in the media database 22 such that each P-frame in the primary bit stream 54 for the second digital representation of the raw audio/video data 12 references a corresponding I-frame in the secondary bit stream 56 for the second digital representation of the raw audio/video data 12 , and vice versa.
  • the user 11 can browse and/or retrieve a desired portion of the second digital representation 58 starting at any arbitrary location within the second digital representation 58 by first obtaining an I-frame from the secondary bit stream 56 for the second digital representation of the raw audio/video data 12 which corresponds to the arbitrary starting location of the desired portion, and then obtaining P-frames from the primary bit stream 54 for the second digital representation of the raw audio/video data 12 for all subsequent locations of the desired portion.
  • the media database server 24 will only have to send a message containing a single I-frame in order for the user 11 to browse and/or retrieve a desired portion of the second digital representation 58 , thereby obtaining maximum network transmission efficiency while maintaining the encoding advantages of only a single I-frame in the primary bit stream 54 for the second digital representation of the raw audio/video data 12 .
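  • Slice assembly is then a matter of pairing one secondary-stream I-frame with the primary-stream P-frames that follow it, as in this sketch (which assumes the two bit streams are aligned frame-for-frame, as the cross-references described above imply):

      def assemble_slice(primary_frames, secondary_i_frames, start):
          # Build a decodable slice beginning at an arbitrary frame index: the
          # corresponding I-frame from the secondary bit stream, followed only
          # by P-frames from the primary bit stream. Only the single leading
          # I-frame crosses the network, per the transmission advantage above.
          return [secondary_i_frames[start]] + list(primary_frames[start + 1:])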
  • the file 90 comprises a header portion 92 , a primary bit stream portion 94 , and a secondary bit stream portion 96 .
  • the header portion 92 comprises a file identifier 98 for either the first digital representation 48 of the raw audio/video data 12 or the second digital representation 58 of the raw audio/video data 12 , a pointer 100 to the beginning of the primary bit stream portion 94 , and a pointer 102 to the beginning of the secondary bit stream portion 96 .
  • the primary bit stream portion 94 comprises an I-frame 104 and a plurality of P-frames 106 .
  • the secondary bit stream portion 96 comprises a plurality of I-frames 108 .
  • the references between the P-frames 106 in the primary bit stream portion 94 and the I-frames 108 in the secondary bit stream portion 96 , and vice versa, can be included in the P-frames 106 in the primary bit stream portion 94 and the I-frames 108 in the secondary bit stream portion 96 .
  • the header portion 92 can include additional pointers to corresponding P-frames 106 in the primary bit stream portion 94 and I-frames 108 in the secondary bit stream portion 96 .
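  • A concrete byte layout for the file 90 might look like the following (a sketch: the patent names the header fields but fixes neither their sizes nor the byte order, so the 16-byte identifier and 32-bit offsets are assumptions):

      import struct

      HEADER_FMT = "!16sII"   # file identifier 98, then the two pointers 100 and 102

      def pack_file(file_id, primary, secondary):
          header_size = struct.calcsize(HEADER_FMT)
          header = struct.pack(HEADER_FMT, file_id,
                               header_size,                 # pointer 100: primary bit stream portion 94
                               header_size + len(primary))  # pointer 102: secondary bit stream portion 96
          return header + primary + secondary

      def read_header(blob):
          file_id, primary_off, secondary_off = struct.unpack_from(HEADER_FMT, blob)
          return file_id.rstrip(b"\0"), primary_off, secondary_off

      blob = pack_file(b"clip-0001", b"<primary frames>", b"<secondary I-frames>")
      assert read_header(blob)[1] == struct.calcsize(HEADER_FMT)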
  • the annotation client 18 processes the copy of the first digital representation of the raw audio/video data 12 such that annotations are generated for the object in the meta database 26 corresponding to the raw audio/video data 12 .
  • the librarian 28 then stores these annotations in the meta database 26 along with the object identification number associated with the raw audio/video data 12 .
  • the implementation of these steps in accordance with the present invention is directly related to annotation processes and the structure of the meta database 26 .
  • Annotations are generated for an object so as to provide information about the whole object or a part of the object.
  • Annotations may be generated for an object by trusted automatic processes called annotation daemons, such as the annotation client 18 , or by trusted human annotators.
  • Annotations which have previously been generated for an object, including annotations produced by annotation daemons and annotations produced by human annotators, may be reviewed and updated.
  • Annotations in accordance with the present invention are a typed, probabilistic, stratified collection of values.
  • Referring to FIG. 6, there is shown an annotation structure 110 for an object in accordance with the present invention.
  • the annotation structure 110 comprises a first annotation sequence 114 and a second annotation sequence 116 .
  • the first annotation sequence 114 and the second annotation sequence 116 relate to a media stream 112 , which can be either an audio or a video stream.
  • Each annotation sequence represents a different type of annotation such as, for example, words that occur in the media stream 112 or speakers that are recognized in the media stream 112 .
  • Each annotation sequence contains a plurality of time marks 117 and a plurality of arcs 118 .
  • Each time mark 117 represents an instant in time.
  • Each arc 118 connects a pair of time marks 117, and each arc 118 also has an associated value and probability.
  • the probability is a measure of confidence in the accuracy of the annotation.
  • the use of a probability allows probabilistic-based retrieval to be supported.
  • the use of a probability also allows the quality (e.g., higher or lower quality) of a replacement annotation to be determined.
  • Each annotation sequence can be applied to the entire media stream 112 or to a part thereof.
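  • The stratified annotation structure lends itself to a simple data model, sketched below (expressing time marks in seconds is an assumption; the patent leaves the time unit unspecified):

      from dataclasses import dataclass
      from typing import List

      @dataclass
      class Arc:
          start: float         # time mark 117 at which the arc begins
          end: float           # time mark 117 at which the arc ends
          value: str           # e.g. a word occurring in, or a speaker recognized in, the stream
          probability: float   # measure of confidence in the annotation's accuracy

      @dataclass
      class AnnotationSequence:
          kind: str            # annotation type, e.g. "word" or "speaker"
          arcs: List[Arc]

      words = AnnotationSequence("word", [
          Arc(0.0, 0.4, "good", 0.92),
          Arc(0.4, 1.1, "evening", 0.87),
      ])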
  • the annotation structure 110 as described above differs from many video annotation systems that work on shot lists.
  • a video is first broken down into thematic chunks called shots that are then grouped into scenes.
  • shots are then taken as a basic atomic unit for annotation. That is, each shot is annotated, and searching will only retrieve particular shots.
  • the difficulty with this prior approach is that performing the above-described segmentation and grouping automatically can be very difficult.
  • the present invention avoids this difficulty by allowing the presence of people and things to be marked within a scene.
  • the structure of the meta database 26 is such that it is an object database built on top of standard relational databases.
  • Each object in the object database of the meta database 26 represents some form of audio/video data such as, for example, the raw audio/video data 12 , as described above.
  • a representation of an object in the object database of the meta database 26 can be a representation of the audio/video data that is represented by the object in the object database of the meta database 26 such as, for example, the first digital representation of the raw audio/video data 12 , as described above.
  • An annotation of an object in the object database of the meta database 26 can be an annotation that is generated by processing one or more representations of the audio/video data that is represented by the object in the object database of the meta database 26 such as, for example, an annotation that was generated by processing the copy of the first digital representation of the raw audio/video data 12 , as described above.
  • the structure of an object database 120 of the meta database 26 in accordance with the present invention is shown in FIG. 7 .
  • the object database 120 comprises an object 122 , a plurality of representations 124 of the object 122 , and a plurality of annotations 126 of the object 122 .
  • each of the plurality of representations 124 of the object 122 references the object 122 .
  • each of the plurality of annotations 126 of the object 122 references the object 122 .
  • an annotation 126 may reference more than one object 122 , indicating that the annotation 126 is shared by the more than one object 122 .
  • All of the objects in the object database of the meta database 26 are listed in an object table 130 of the meta database 26 , as shown in FIG. 8 .
  • Each of the objects in the object database of the meta database 26 is assigned an object identification number 132 , as previously described.
  • Each object identification number 132 is unique and is typically in numeric or alphanumeric form, although other forms are also permitted.
  • Each of the objects in the object database of the meta database 26 is typically listed in the object table 130 according to the value of its object identification number 132 , as shown.
  • Each of the objects in the object database of the meta database 26 is also assigned an object type 134 .
  • the object type 134 can be, for example, video or audio, corresponding to the type of data that is represented by the object in the object database of the meta database 26 . Accordingly, each of the objects in the object database of the meta database 26 is listed in the object table 130 with a corresponding object type 134 .
  • each of the representations in the object database of the meta database 26 is listed in a representation table 140 of the meta database 26 , as shown in FIG. 9 .
  • Each of the representations in the object database of the meta database 26 is assigned a representation identification number 142 . Similar to the object identification numbers 132 , each representation identification number 142 is unique and is typically in numeric or alphanumeric form, although other forms are also permitted.
  • Each of the representations in the object database of the meta database 26 is typically listed in the representation table 140 according to the value of its representation identification number 142 , as shown.
  • each of the representations in the object database of the meta database 26 is associated with an object in the object database of the meta database 26 . Accordingly, each of the representations in the object database of the meta database 26 is listed in the representation table 140 with an associated object identification number 132 .
  • Each of the representations in the object database of the meta database 26 is also assigned a representation type 144 .
  • the representation type 144 can be, for example, video/mpeg, video/x-realvideo, audio/mpeg, or audio/x-realaudio, corresponding to the format type of the representation in the object database of the meta database 26 . Accordingly, each of the representations in the object database of the meta database 26 is listed in the representation table 140 with a corresponding representation type 144 .
  • each of the representations in the object database of the meta database 26 has an associated URL which identifies the location in the media database 22 where the representation can be found. Accordingly, each of the representations in the object database of the meta database 26 is listed in the representation table 140 with an associated URL 146 .
  • annotations in the object database of the meta database 26 are listed in an annotation table 150 of the meta database 26 , as shown in FIG. 10 .
  • Each of the annotations in the object database of the meta database 26 is assigned an annotation identification number 152 . Similar to the object identification numbers 132 and the representation identification numbers 142 , each annotation identification number 152 is unique and is typically in numeric or alphanumeric form, although other forms are also permitted.
  • Each of the annotations in the object database of the meta database 26 is typically listed in the annotation table 150 according to the value of its annotation identification number 152 , as shown.
  • each of the annotations in the object database of the meta database 26 is associated with an object in the object database of the meta database 26 . Accordingly, each of the annotations in the object database of the meta database 26 is listed in the annotation table 150 with an associated object identification number 132 .
  • Each of the annotations in the object database of the meta database 26 is also assigned an annotation type 154 .
  • the annotation type 154 can be, for example, transcript, speaker, or keyframe.
  • Each annotation type 154 corresponds to the type of annotation that has been generated for a corresponding object in the object database of the meta database 26 . Accordingly, each of the annotations in the object database of the meta database 26 is listed in the annotation table 150 with a corresponding annotation type 154 .
  • Each of the annotations in the object database of the meta database 26 has a corresponding annotation value 156 .
  • the annotation value 156 can be, for example, a word, the name of a speaker, or a URL which references an image in the media database 22 .
  • Each annotation value 156 corresponds to the actual annotated element of the object in the object database of the meta database 26 . Accordingly, each of the annotations in the object database of the meta database 26 is listed in the annotation table 150 with a corresponding annotation value 156 .
  • Annotations which have been generated for an object that represents an audio/video stream have a corresponding annotation start time 158 and a corresponding annotation end time 160 .
  • the annotation start time 158 corresponds to the location in the audio/video stream where an annotation actually begins.
  • the annotation end time 160 corresponds to the location in the audio/video stream where an annotation actually ends. Accordingly, each of the annotations in the object database of the meta database 26 which has been generated for an object that represents an audio/video stream is listed in the annotation table 150 with a corresponding annotation start time 158 and a corresponding annotation end time 160 .
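  • The three tables of FIGS. 8-10 can be summarized by the following SQL sketch, run here through Python's sqlite3 module for self-containment; the column names follow the reference numerals above, while the SQL types are assumptions made only for illustration.

      import sqlite3

      conn = sqlite3.connect(":memory:")
      conn.executescript("""
      CREATE TABLE object_table (                 -- object table 130
          object_id   INTEGER PRIMARY KEY,        -- object identification number 132
          object_type TEXT                        -- object type 134, e.g. 'video' or 'audio'
      );
      CREATE TABLE representation_table (         -- representation table 140
          representation_id INTEGER PRIMARY KEY,  -- representation identification number 142
          object_id   INTEGER REFERENCES object_table(object_id),
          representation_type TEXT,               -- representation type 144, e.g. 'video/mpeg'
          url         TEXT                        -- URL 146 into the media database 22
      );
      CREATE TABLE annotation_table (             -- annotation table 150
          annotation_id INTEGER PRIMARY KEY,      -- annotation identification number 152
          object_id   INTEGER REFERENCES object_table(object_id),
          annotation_type  TEXT,                  -- annotation type 154
          annotation_value TEXT,                  -- annotation value 156
          start_time  REAL,                       -- annotation start time 158 (seconds)
          end_time    REAL                        -- annotation end time 160 (seconds)
      );
      """)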
  • the index database server 32 stores the annotations that were generated for the object in the meta database 26 corresponding to the raw audio/video data 12 in the index database 30 along with the object identification number associated with the raw audio/video data 12 .
  • the index database server 32 searches the index database 30 for annotations which match a query initiated by the user 11 , and then obtains the object identification number associated with each matching annotation.
  • the implementation of these steps in accordance with the present invention is directly related to the indexing process and the structure of the index database 30 .
  • the index database server 32 stores the annotations in the index database 30 such that an entry is created in the index database 30 for each annotation value. Following each annotation value entry in the index database 30 is a list of start times for each occurrence of the annotation value within an associated object. The start times can be listed according to actual time of occurrence in the associated object or in delta value form. Following the list of start times for each occurrence of the annotation value within the associated object is the object identification number corresponding to the associated object, or a reference to such object identification number. Thus, each of these annotation value entries in the index database 30 is linked in some manner to the start times for each occurrence of the annotation value within an associated object and the object identification number corresponding to the associated object. Therefore, whenever the index database server 32 searches the index database 30 for annotation values which match a query, the start times for each occurrence of a matching annotation value within an associated object and the object identification number corresponding to the associated object can be easily obtained.
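  • A minimal in-memory sketch of this entry layout follows (illustrative only; the real index database 30 is a persistent store). It shows both listing options mentioned above: actual times of occurrence and delta value form.

      from collections import defaultdict

      index = defaultdict(list)   # annotation value -> [(start_times, object_id)]

      def add_entry(value, start_times, object_id, use_deltas=False):
          if use_deltas:  # delta value form: first time, then gaps between occurrences
              start_times = [start_times[0]] + [
                  b - a for a, b in zip(start_times, start_times[1:])]
          index[value].append((start_times, object_id))

      def lookup(value):
          # returns the start times and object identification number for
          # every object associated with a matching annotation value
          return index.get(value, [])

      add_entry("commission", [12.0, 97.5, 310.2], object_id=42)
      print(lookup("commission"))   # [([12.0, 97.5, 310.2], 42)]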
  • the index database server 32 can send a message, including the matching annotation value, the start times for each occurrence of the matching annotation value within an associated object, and the object identification number corresponding to the associated object, over the communication network 34 to the librarian 28 requesting that the librarian 28 provide further information relating to the matching annotation value and the associated object identification number.
  • Such information can include the annotation type, the annotation start time, the annotation end time, the representation type, the URL, and the object type associated with the matching annotation value and the associated object identification number, all of which have been described above.
  • the librarian 28 provides everything that the index database server 32 requires to build an HTML results page for presentation to the user 11 .
  • the start times for each occurrence of a matching annotation value within an associated object are included in the message from the index database server 32 to the librarian 28 so as to make searching the meta database more efficient. That is, searching the meta database 26 for numerical values typically requires less processing than searching the meta database 26 for textual values. Also, a matching annotation value and the start times for each occurrence of a matching annotation value within an associated object are directly related. However, a matching annotation value is typically a textual value, whereas the start times for each occurrence of a matching annotation value within an associated object are numerical values. Thus, using the start times for each occurrence of a matching annotation value within an associated object to search the meta database 26 for information is more efficient than using a matching annotation value.
  • the index database server 32 inherently knows that it must look to the librarian 28 to provide further information relating to the matching annotation value and the associated object identification number. That is, it is inherent to the index database server 32 that a request for further information relating to the matching annotation value and the associated object identification number must be sent to the librarian 28 .
  • system 10A and system 10B both operate such that, subsequent to a request from the encoder client 14 , the librarian 28 creates an object in the meta database 26 , and stores information in the meta database 26 along with the object.
  • This information includes the URL of a digital representation of media data, the form of the digital representation of the media data, the type (e.g., audio, video, etc.) of the form of the digital representation of the media data, the format in which the digital representation of the media data is stored at the URL, the URL and types of any ancillary files associated with the media data such as a transcript or closed-caption file, and any associated high-level meta data such as the title of the media data and/or its author.
  • the annotation client 18 can request work from the librarian 28 and process digital representations which the librarian 28 has indicated have not already been processed by the annotation client 18 , as previously described.
  • the annotation client 18 employs an automatic process, called a daemon process, to perform the annotation function.
  • Automatic daemon processes are preferred over human annotation processes, which can be very laborious.
  • automatic daemon processes which produce high quality results, appropriately termed trusted daemon processes, are sometimes hard to come by given the current state of technology.
  • the present invention supports distributed annotation processing by allowing each annotation client 18 to communicate with the librarian 28 and the media database server 24 over the communication network 34 using a standard messaging protocol (e.g., HTTP messaging).
  • the annotation client 18 requests work from the librarian 28 by providing two boolean conditions, an identifier of the annotation client 18 , a version number of the annotation client 18 , and an estimate of how long the annotation client 18 will take to complete the work (i.e., the annotation process).
  • the first boolean condition is used to test for the existence of an object which satisfies the input requirements of the daemon process. That is, if an object satisfies the condition, then the inputs necessary for the daemon process to run exist and are referenced in the meta database 26 .
  • the second boolean condition tests for the non-existence of the output produced by the daemon process. If both boolean conditions are satisfied, then the daemon process should be run on the object.
  • the librarian 28 provides work to the annotation client 18 by first creating a list containing all objects which satisfy both boolean conditions. The librarian 28 then filters the list by eliminating objects which are presently being processed, or locked, by another annotation client 18 having the same identifier and version number. The librarian 28 then creates a key for each object remaining on the list which identifies the annotation client 18 and includes an estimate of how long the annotation client 18 will take to complete the work. This key is used to lock out other annotation clients 18 as described above. The librarian 28 then provides the URL of each digital representation remaining on the list to the annotation client 18 for processing, as previously described.
  • the annotation client 18 uses the returned work information to perform its operations. That is, the annotation client 18 uses the URL of each digital representation to request each digital representation from the media database server 24 , as previously described. The annotation client 18 then performs its work.
  • Upon completion of its work, the annotation client 18 checks its work into the librarian 28 for storage in the meta database 26 . The annotation client 18 accomplishes this task by returning the object identification number associated with the object, the newly generated annotation data, and the key to the librarian 28 . The librarian 28 checks the key to make sure that it matches the key in a space reserved for the completed operation. If the annotation client 18 returns the correct key, and the estimated work completion time has not expired, the key will match and the librarian 28 will accept the completed result. However, if the estimated work completion time has expired, the key may also have expired if another annotation client 18 , having the same identifier and version number, requested work after the estimated work completion time had expired. If this is the case, the work will have been given to the new requesting annotation client 18 , and a new key will have been generated. Therefore, the first requesting annotation client 18 will not be able to check in its work.
  • the aforementioned protocol permits completely distributed processing of information with very low communications overhead. Also, the use of URLs makes it possible for the processing to occur anywhere on the network, although only privileged addresses (i.e., those belonging to trusted annotation clients 18 ) may install results in the librarian 28 . Furthermore, the simple time stamp protocol makes the system tolerant to processing failures.
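  • The following Python sketch summarizes the work-request and check-in protocol just described; it is a simplified stand-in for the actual HTTP messaging, and the message fields and helper names are assumptions made only for illustration.

      import time, uuid

      locks = {}   # (object_id, client_id, version) -> (key, expiry)

      def request_work(objects, inputs_exist, output_missing,
                       client_id, version, estimated_seconds):
          granted, now = [], time.time()
          for obj in objects:
              # both boolean conditions must hold for the daemon process to run
              if not (inputs_exist(obj) and output_missing(obj)):
                  continue
              lock = locks.get((obj["id"], client_id, version))
              if lock and lock[1] > now:   # still locked by an equivalent client
                  continue
              key = uuid.uuid4().hex       # new key locks out other clients
              locks[(obj["id"], client_id, version)] = (key, now + estimated_seconds)
              granted.append({"object_id": obj["id"], "url": obj["url"], "key": key})
          return granted

      def check_in(object_id, client_id, version, key, annotations):
          lock = locks.get((object_id, client_id, version))
          if lock is None or lock[0] != key:
              return False   # key was regenerated after expiry: work is rejected
          # ...here the librarian 28 would store the annotations in the meta database 26...
          del locks[(object_id, client_id, version)]
          return True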
  • the index database server 32 indexes the meta database 26 by periodically requesting from the librarian 28 a list of object identification numbers which correspond to objects that have been created in the meta database 26 .
  • the librarian 28 provides a list of object identification numbers which correspond to objects that have been created in the meta database 26 to the index database server 32 .
  • the index database server 32 requests from the librarian 28 , for each object identification number, a copy of all of the annotations that were generated for each object in the meta database 26 .
  • the librarian 28 provides, for each object identification number, a copy of all of the annotations that were generated for each object in the meta database 26 to the index database server 32 .
  • the index database server 32 then stores the annotations that were generated for each object in the meta database 26 in the index database 30 along with, or with reference to, each associated object identification number.
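  • A compact sketch of this periodic indexing loop is given below; the librarian-facing calls are hypothetical stand-ins for the HTTP requests described above.

      import time

      def index_meta_database(librarian, index_db, poll_interval=3600):
          while True:
              # poll the librarian 28 for all object identification numbers,
              # then fetch and store each object's annotations in the index
              for object_id in librarian.list_object_ids():
                  for ann in librarian.get_annotations(object_id):
                      index_db.store(ann["value"], ann["start_time"], object_id)
              time.sleep(poll_interval)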
  • the browser client 20 sends a message, initiated by the user 11 , to the index database server 32 requesting a search of the index database 30 .
  • the index database server 32 provides an HTML query page to the browser client 20 for presentation to the user 11 .
  • the browser client 20 then presents the HTML query page to the user 11 .
  • Referring to FIG. 11, there is shown an exemplary HTML query page 170 including a search field 172 , a user-selectable search command 174 , a user-selectable “help” option 176 , and a user-selectable “advanced search” option 178 .
  • the user 11 enters a query through the HTML query page and the browser client 20 sends a message, including the query, to the index database server 32 for processing by the index database server 32 .
  • the index database server 32 searches the index database 30 for annotation values which match the query. Once the index database server 32 has found matching annotation values, the index database server 32 ranks the matching annotation values according to relevance, and obtains the object identification number associated with each matching annotation value.
  • the index database server 32 requests the librarian 28 to provide further information relating to each matching annotation value by referencing each associated object identification number. As previously described, such information can include the annotation type, the annotation start time, the annotation end time, the representation type, the URL, and the object type associated with each matching annotation value and the associated object identification number.
  • the librarian 28 then sends the requested information to the index database server 32 .
  • the index database server 32 ranks the matching annotation values using a modified document retrieval technique.
  • the unmodified document retrieval technique uses a document as a basic unit, and determines the importance of a document based upon a query. That is, the importance of a document is based on the number of occurrences of each query word within the document, with each query word being weighted by the rarity of the query word in a document database. Thus, rarer words are given higher weights than common words, and documents with more query words receive higher total weights than documents with fewer query words.
  • a typical equation for computing the score of a document is

          Score(d) = Σ_q sum_{q} · w[q]    (1)

  • where d is a document, q is a query word, sum_{q} is the number of times that the query word q appears in the document d, and w[q] is the weight of the query word q.
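  • In Python, equation (1) amounts to the following (an illustration with made-up weights; in practice the word weights would come from rarity statistics over the document database):

      def document_score(document_words, query_weights):
          # document_words: the words of document d; query_weights: {q: w[q]}
          return sum(document_words.count(q) * w for q, w in query_weights.items())

      # rarer words carry higher weights than common words
      print(document_score(["history", "of", "the", "commission"],
                           {"commission": 3.2, "history": 2.1}))   # 5.3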
  • In audio/video retrieval, it is a requirement that users be able to start an audio/video stream from the most relevant position within the audio/video stream. Thus, an indexing system must not only determine that an audio/video stream is relevant, but must also determine all relevant locations within the audio/video stream, and preferably rank the relevance of those locations.
  • the present invention modifies the above-described technique by letting h[i] be a valid starting location within an audio/video stream, and letting L[q,j] be the jth location of the query word q in the audio/video stream. Then the score at valid starting location h[i] can be given by

          Score(h[i]) = Σ_q Σ_{j: L[q,j] >= h[i]} w[q] · exp(-(L[q,j] - h[i]) / DELTA)    (2)

  • where DELTA is a settable distance weight equal to 10-30 seconds.
  • That is, the score at a valid starting location is a weighted sum over all the locations at which a query word appears after the valid starting location, where the weight of each appearance of a query word is the product of the query word weight and a negative exponential weight on the distance in time between the valid starting location and the occurrence of the query word.
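  • A sketch of equation (2) in the same vein (illustrative values; DELTA defaults here to 20 seconds, inside the stated 10-30 second range):

      import math

      def location_score(h_i, occurrences, query_weights, delta=20.0):
          # occurrences: {q: [L[q,0], L[q,1], ...]}, times in seconds
          score = 0.0
          for q, w in query_weights.items():
              for t in occurrences.get(q, []):
                  if t >= h_i:   # only occurrences after the starting location count
                      score += w * math.exp(-(t - h_i) / delta)
          return score

      print(location_score(10.0, {"commission": [12.0, 97.5]}, {"commission": 3.2}))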
  • This modified ranking technique provides a unique advantage to the index database server 32 of the present invention.
  • the index database server 32 uses the information provided by the librarian 28 to build an HTML results page for presentation to the user 11 .
  • the index database server 32 builds the HTML results page by creating an image or an icon for each matching annotation value. Each image or icon is hyperlinked to a function or script which allows the user 11 to browse and/or retrieve all or a portion of a corresponding digital representation.
  • the index database server 32 sends the HTML results page to the browser client 20 for presentation to the user 11 .
  • the browser client 20 then presents the HTML results page to the user 11 so that the user 11 can select one of the images or icons so as to browse and/or retrieve all or a portion of a corresponding digital representation.
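  • The page-building step can be sketched as follows; the link target and its query parameters are hypothetical, standing in for the function or script on the index database server 32 to which each icon is hyperlinked.

      from html import escape

      def build_results_page(matches):
          # matches: dicts with 'title', 'object_id', 'start_time', and 'icon' keys
          rows = []
          for m in matches:
              # hypothetical playback link carrying the object id and start time
              href = f"/play?object={m['object_id']}&start={m['start_time']}"
              rows.append(f'<p><a href="{href}"><img src="{m["icon"]}" alt="play"></a> '
                          f'{escape(m["title"])}</p>')
          return "<html><body>" + "\n".join(rows) + "</body></html>"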
  • Referring to FIG. 12, there is shown an exemplary HTML results page 190 for a query which included the terms “commission” and “history.”
  • the HTML results page 190 includes an almost exact copy of the HTML query page 192 containing a statement as to the number of matches that were found for the query, which in this case is five.
  • the HTML results page 190 also includes either a video icon 194 or an audio icon 196 depending upon the type of object that is associated with each matching annotation value. Both the video icon 194 and the audio icon 196 are provided along with some detail about each associated object.
  • the title of the corresponding video stream, a frame of the corresponding video stream, a textual excerpt from the corresponding video stream, the length of the corresponding video stream, the language that is spoken in the corresponding video stream, and the number of matches that occur within the corresponding video stream are shown or listed along with the video icon 194 .
  • the title of the corresponding audio stream, a textual excerpt from the corresponding audio stream, the length of the corresponding audio stream, the language that is spoken in the corresponding audio stream, and the number of matches that occur within the corresponding audio stream are listed along with the audio icon 196 .
  • the video or audio stream will play from the location of the first match within the corresponding video or audio stream. This is possible because both the video icon 194 and the audio icon 196 are hyperlinked back to a function or script in the index database server 32 , whereby the index database server 32 uses the information provided by the librarian 28 to access a corresponding digital representation in the media database 22 using the extended URL format described above. If more than one match occurs within either a video or an audio stream, then a user-selectable “matches” option 198 is provided to allow the user 11 to browse each location within the video or audio stream where a match has occurred, as described in more detail below.
  • a user-selectable “more context” option 200 is provided to allow the user 11 to browse locations surrounding the location of the first match within the corresponding video or audio stream, as described in more detail below.
  • the user 11 has selected the “matches” option 198 associated with the third match presented in the HTML results page 190 (i.e., the video entitled “1998 State of the Union Address”).
  • Referring to FIG. 13, there is shown an exemplary HTML matches page 210 for allowing the user 11 to browse each location within the video stream associated with the third match presented in the HTML results page 190 where a match has occurred.
  • the HTML matches page 210 includes an almost exact copy of the HTML query page 212 , which contains an additional user-selectable “search this result” option 214 for allowing the user 11 to refine the results of a previous query.
  • the HTML matches page 210 also includes a matches header 216 containing the title of the corresponding video stream, the length of the corresponding video stream, the language that is spoken in the corresponding video stream, and the number of matches that occur within the corresponding video stream, which in this case is four.
  • the HTML matches page 210 further includes a frame 218 which corresponds to each match that occurs within the corresponding video stream.
  • Each frame 218 includes a video icon 220 , which functions in a manner similar to the previously-described video icon 194 .
  • Each frame 218 and corresponding video icon 220 are provided along with some detail about each associated match that occurs within the corresponding video stream.
  • the HTML matches page 210 includes a user-selectable “more context” option 222 for each match to allow the user 11 to browse locations surrounding the location of each associated match within the corresponding video stream.
  • Referring to FIG. 14, there is shown an exemplary HTML more context page 230 for allowing the user 11 to browse locations surrounding the location of the first match presented in the HTML matches page 210 within the corresponding video stream.
  • the HTML more context page 230 includes an almost exact copy of the HTML query page 232 , which contains an additional user-selectable “search this result” option 234 for allowing the user 11 to refine the results of a previous query.
  • the HTML more context page 230 also includes a more context header 236 containing the title of the corresponding video stream, and the language that is spoken in the corresponding video stream.
  • the HTML more context page 230 further includes a frame 238 which corresponds to an actual frame within the corresponding video stream.
  • Each frame 238 includes a video icon 240 , which functions in a manner similar to the previously-described video icons 194 and 220 .
  • Each frame 238 and corresponding video icon 240 are provided along with some detail about each associated frame 238 within the corresponding video stream. For example, the exact time location of the frame 238 within the corresponding video stream and a textual excerpt from the corresponding video stream are listed along with each frame 238 and corresponding video icon 240 .
  • the HTML more context page 230 still further includes a user-selectable “backward” option 242 and a user-selectable “forward” option 244 for allowing the user 11 to browse further locations surrounding the location of the first match presented in the HTML matches page 210 within the corresponding video stream.
  • the encoder client 14 , the transcoder client 16 , the annotation client 18 , the browser client 20 , the media database server 24 , the librarian 28 , the index database server 32 , and the encoder 36 all involve the processing of input data and the generation of output data to some extent.
  • the processing of the input data and the generation of the output data are preferably implemented by software programs.
  • each of the above-described system components preferably comprises a processing device 250 including at least one processor (P) 252 , memory (M) 254 , and input/output (I/O) interface 256 , connected to each other by a bus 258 , for facilitating the implementation of input data processing and output data generation in each of the above-described system components.

Abstract

A technique for processing data in a network is disclosed. In one particular exemplary embodiment, the technique may be realized as a method for processing data in a network having a plurality of network stations. The method comprises receiving a first representation of data at a first of the plurality of network stations, processing the first representation so as to generate a second representation of the data, and transmitting the second representation from the first network station to a second of the plurality of network stations for storage therein, wherein the second representation is stored at an address within the second network station. The method also comprises receiving the address at the first network station, and transmitting the address from the first network station to a third of the plurality of network stations for storage therein.

Description

    CLAIM OF PRIORITY
  • This application is a continuation of and claims priority to U.S. Ser. No. 13/089,872, filed Apr. 19, 2011, now allowed, which is a continuation of U.S. Ser. No. 10/935,120, filed on Sep. 8, 2004, now U.S. Pat. No. 8,060,509, which is a continuation of U.S. Ser. No. 09/814,213, filed on Mar. 22, 2001, now U.S. Pat. No. 6,799,298, which is a continuation of U.S. Ser. No. 09/204,286, filed on Dec. 3, 1998, now U.S. Pat. No. 6,311,189, which is a continuation of U.S. Ser. No. 09/037,957, filed on Mar. 11, 1998, now U.S. Pat. No. 6,173,287, all of which are hereby incorporated by reference in their entirety.
  • The present invention relates generally to the field of multimedia and, more particularly, to a technique for processing data in a network.
  • BACKGROUND OF THE INVENTION
  • There are a large number of organizations that presently have substantial amounts of audio, video, and image content in analog form. Many of the organizations are currently moving toward putting such multimedia content into digital form in order to save costs in the areas of data storage and retrieval. That is, similar to other types of data, multimedia content can be easily stored on and retrieved from relatively inexpensive digital storage devices.
  • The migration of multimedia content from analog form to digital form also provides an organization with the ability to store, search, browse, and retrieve digitized multimedia content from distributed sites. That is, an organization having a number of distributed offices can store, search, browse, and retrieve digitized multimedia content from a centralized storage facility over a proprietary intranet computer network such as, for example, a local area network (LAN), or a public internet computer network such as, for example, the world wide web.
  • Furthermore, the multimedia content itself may be distributed. That is, an organization that is global in nature may have a number of distributed permanent archival storage locations where digitized multimedia content is permanently stored, or a number of distributed temporary storage locations where digitized multimedia content that is associated with work in progress is temporarily stored. Similar to above, such an organization could also store, search, browse, and retrieve digitized multimedia content from the distributed storage locations over a proprietary intranet computer network or a public internet computer network.
  • Additionally, an organization may want other entities located outside of the organization to be able to search, browse, and retrieve digitized multimedia content stored and maintained within the organization. For example, an organization may want to sell multimedia content to an outside entity, which may then use the purchased multimedia content for some purpose such as, for example, a news broadcast. Similar to above, the outside entity could search, browse, and retrieve digitized multimedia content from a storage facility within the organization over a proprietary intranet computer network or a public internet computer network.
  • However, despite the above-described benefits associated with digitized multimedia content, organizations presently have little or no means of searching within multimedia content, organizing information about multimedia content, and delivering multimedia content in a ubiquitous manner. That is, there are presently little or no means for searching inside streams of multimedia content (e.g., audio/visual streams), adding meta-information to multimedia content (i.e., annotating multimedia content) for purposes of indexing within multimedia content, and providing universal access to indexed multimedia content over a variety of connection speeds and on a variety of client platforms. Accordingly, it would be desirable to provide a technique for organizing distributed multimedia content and for searching, browsing, and retrieving such organized distributed multimedia content in an efficient and cost-effective manner so as to overcome the above-described shortcomings of the prior art.
  • OBJECTS OF THE INVENTION
  • The primary object of the present invention is to provide a technique for processing data in a network.
  • The above-stated primary object, as well as other objects, features, and advantages, of the present invention will become readily apparent from the following detailed description which is to be read in conjunction with the appended drawings.
  • SUMMARY OF THE INVENTION
  • A technique for processing data in a network is disclosed. In one particular exemplary embodiment, the technique may be realized as a method for processing data in a network having a plurality of network stations. The method comprises receiving a first representation of data at a first of the plurality of network stations, processing the first representation so as to generate a second representation of the data, and transmitting the second representation from the first network station to a second of the plurality of network stations for storage therein, wherein the second representation is stored at an address within the second network station. The method also comprises receiving the address at the first network station, and transmitting the address from the first network station to a third of the plurality of network stations for storage therein.
  • In accordance with other aspects of this particular exemplary embodiment, the first, the second, and the third network stations may beneficially be different network stations.
  • In accordance with further aspects of this particular exemplary embodiment, processing the first representation may beneficially include at least encoding the first representation or transcoding the first representation.
  • In accordance with additional aspects of this particular exemplary embodiment, processing the first representation may beneficially include processing the first representation at the first network station.
  • In accordance with still other aspects of this particular exemplary embodiment, receiving the address may beneficially include receiving the address from the second network station.
  • In accordance with still further aspects of this particular exemplary embodiment, the address may beneficially have an extended URL format.
  • In accordance with still additional aspects of this particular exemplary embodiment, the method may further beneficially comprise transmitting a request for an identifier of the data from the first network station, and receiving the data identifier at the first network station. If such is the case, transmitting a request for an identifier of the data may beneficially include transmitting a request for an identifier of the data to the third network station. Also, if such is the case, the data identifier may beneficially be associated with an object in a database.
  • In accordance with yet further aspects of this particular exemplary embodiment, the address may beneficially be a first address of a plurality of addresses stored at the third network station, and the method may further beneficially comprise transmitting a request for at least one of the plurality of addresses from the first network station, and receiving a second address at the first network station. If such is the case, transmitting a request for at least one of the plurality of addresses may beneficially include transmitting a request for at least one of the plurality of addresses to the third network station. Each of the plurality of addresses may then beneficially identify a location of a stored representation of data. The second address may beneficially identify a location of the first representation of data. If such is the case, the method may further beneficially comprise transmitting a request for the first representation of data at the second address from the first network station. Then, transmitting a request for the first representation of data at the second address may beneficially include transmitting a request for the first representation of data at the second address to the second network station.
  • In another particular exemplary embodiment, the technique may be realized as at least one signal embodied in at least one carrier wave for transmitting a computer program of instructions configured to be readable by at least one processor for instructing the at least one processor to execute a computer process for performing the above-described method.
  • In still another particular exemplary embodiment, the technique may be realized as at least one processor readable carrier for storing a computer program of instructions configured to be readable by at least one processor for instructing the at least one processor to execute a computer process for performing the above-described method.
  • In yet another particular exemplary embodiment, the technique may be realized as an apparatus for processing data in a network having a plurality of network stations. The apparatus comprises a first receiver for receiving a first representation of data at a first of the plurality of network stations, a processing device for processing the first representation so as to generate a second representation of the data, and a first transmitter for transmitting the second representation from the first network station to a second of the plurality of network stations for storage therein, wherein the second representation is stored at an address within the second network station. The apparatus also comprises a second receiver for receiving the address at the first network station, and a second transmitter for transmitting the address from the first network station to a third of the plurality of network stations for storage therein.
  • In accordance with other aspects of this particular exemplary embodiment, the first, the second, and the third network stations may beneficially be different network stations.
  • In accordance with further aspects of this particular exemplary embodiment, the processing device may beneficially include at least an encoder for encoding the first representation or a transcoder for transcoding the first representation.
  • In accordance with additional aspects of this particular exemplary embodiment, the apparatus may further beneficially comprise a third transmitter for transmitting a request for an identifier of the data from the first network station, and a third receiver for receiving the data identifier at the first network station. If such is the case, the data identifier may beneficially be associated with an object in a database.
  • The present disclosure will now be described in more detail with reference to exemplary embodiments thereof as shown in the accompanying drawings. While the present disclosure is described below with reference to exemplary embodiments, it should be understood that the present disclosure is not limited thereto. Those of ordinary skill in the art having access to the teachings herein will recognize additional implementations, modifications, and embodiments, as well as other fields of use, which are within the scope of the present disclosure as described herein, and with respect to which the present disclosure may be of significant utility.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • In order to facilitate a fuller understanding of the present invention, reference is now made to the appended drawings. These drawings should not be construed as limiting the present invention, but are intended to be exemplary only.
  • FIG. 1A is a schematic diagram of a first embodiment of a system for organizing distributed multimedia content and for searching, browsing, and retrieving such organized distributed multimedia content in accordance with the present invention.
  • FIG. 1B is a schematic diagram of a second embodiment of a system for organizing distributed multimedia content and for searching, browsing, and retrieving such organized distributed multimedia content in accordance with the present invention.
  • FIG. 2 is a flowchart diagram detailing the processing steps of an encoder client in accordance with the present invention.
  • FIG. 3 is a flowchart diagram detailing the processing steps of a transcoder client in accordance with the present invention.
  • FIG. 4 is a flowchart diagram of an encoding process for use in an encoder and transcoder in accordance with the present invention.
  • FIG. 5 shows the file structure for a file that is stored in a media database containing a digital representation of audio/video data in accordance with the present invention.
  • FIG. 6 shows an annotation structure for an object in accordance with the present invention.
  • FIG. 7 shows the structure of an object database of a meta database in accordance with the present invention.
  • FIG. 8 shows an object table of a meta database in accordance with the present invention.
  • FIG. 9 shows a representation table of a meta database in accordance with the present invention.
  • FIG. 10 shows an annotation table of a meta database in accordance with the present invention.
  • FIG. 11 shows an exemplary HTML query page in accordance with the present invention.
  • FIG. 12 shows an exemplary HTML results page in accordance with the present invention.
  • FIG. 13 shows an exemplary HTML matches page in accordance with the present invention.
  • FIG. 14 shows an exemplary HTML more context page in accordance with the present invention.
  • FIG. 15 is a schematic diagram of a processing device for facilitating the implementation of input data processing and output data generation in the components of the present invention.
  • DETAILED DESCRIPTION OF A PREFERRED EMBODIMENT
  • Referring to FIG. 1A, there is shown a schematic diagram of a first embodiment of a system 10A for organizing distributed multimedia content and for searching, browsing, and retrieving such organized distributed multimedia content in accordance with the present invention. The system 10A comprises a user 11, raw audio/video data 12, at least one encoder client 14, at least one transcoder client 16, at least one annotation client 18, at least one browser client 20, a media database 22, a media database server 24, a meta database 26, a meta database server (librarian) 28, an index database 30, an index database server 32, and a communication network 34 for allowing communication between all of the above-identified components which are connected thereto. The communication network 34 as described herein is an internet protocol (IP) network using hypertext transfer protocol (HTTP) messaging so as to exploit the distributed nature of the world wide web (WWW). However, the system 10A may be implemented using other types of network protocols, and many of the above-identified components may be grouped together in a single processing device so as to altogether eliminate the need for inter- or intra-network communications between these grouped components.
  • In a brief overview, the system 10A operates such that the raw audio/video data 12 is provided to the encoder client 14 for processing by the encoder client 14. Before processing the raw audio/video data 12, the encoder client 14 sends a message over the communication network 34 to the librarian 28 requesting the creation of an object in the meta database 26 corresponding to the raw audio/video data 12. The librarian 28 processes the message from the encoder client 14 by creating an object in the meta database 26 corresponding to the raw audio/video data 12 and assigns the object an object identification number as described in more detail below. The librarian 28 then sends a message, including the object identification number associated with the raw audio/video data 12, over the communication network 34 to the encoder client 14 notifying the encoder client 14 of the creation of the object in the meta database 26 corresponding to the raw audio/video data 12.
  • Upon receipt of the notification from the librarian 28, the encoder client 14 digitally encodes the raw audio/video data 12 so as to generate a first digital representation of the raw audio/video data 12, as described in more detail below. The encoder client 14 then sends a message, including the first digital representation of the raw audio/video data 12, over the communication network 34 to the media database server 24 requesting that the media database server 24 store the first digital representation of the raw audio/video data 12 in the media database 22. The media database server 24 processes the message from the encoder client 14 by first checking to see if space is available in the media database 22 to store the first digital representation of the raw audio/video data 12 in the media database 22. If space is not available in the media database 22, the media database server 24 denies the request to store the first digital representation of the raw audio/video data 12 in the media database 22. However, if space is available in the media database 22, the media database server 24 stores the first digital representation of the raw audio/video data 12 at a location in the media database 22 and assigns the location a first universal resource locator (URL). The media database server 24 then sends a message, including the first URL, over the communication network 34 to the encoder client 14 notifying the encoder client 14 of the storage of the first digital representation of the raw audio/video data 12 in the media database 22.
  • Upon receipt of the notification from the media database server 24, the encoder client 14 sends a message, including the object identification number associated with the raw audio/video data 12 and the first URL, over the communication network 34 to the librarian 28 notifying the librarian 28 of the digital encoding of the raw audio/video data 12 into the first digital representation of the raw audio/video data 12, and the storing of the first digital representation of the raw audio/video data 12 in the media database 22 at the location identified by the first URL. The librarian 28 processes the message from the encoder client 14 by storing the first URL in the meta database 26 along with the object identification number associated with the raw audio/video data 12, as described in more detail below.
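  • As a rough sketch of the encoder client flow just described (the client objects and method names here are hypothetical stand-ins for the HTTP messages exchanged over the communication network 34):

      def encode_and_register(raw_data, encoder, librarian, media_server):
          object_id = librarian.create_object(media_type="video")  # object in the meta database 26
          first_representation = encoder.encode(raw_data)          # first digital representation
          url = media_server.store(first_representation)           # first URL, or None if no space
          if url is None:
              raise RuntimeError("no space available in the media database 22")
          librarian.register_representation(object_id, url)        # first URL stored with the object
          return object_id, url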
  • The transcoder client 16 periodically sends messages to the librarian 28 requesting work from the librarian 28. The librarian 28 processes such a message from the transcoder client 16 by first checking to see if there are any objects in the meta database 26 that have corresponding digital representations which have not been processed by the transcoder client 16. If there are no objects in the meta database 26 that have corresponding digital representations which have not been processed by the transcoder client 16, then the librarian 28 denies the work request. However, if there are objects in the meta database 26 that have corresponding digital representations which have not been processed by the transcoder client 16, such as, for example, the first digital representation of the raw audio/video data 12, then the librarian 28 sends a message, including the object identification number associated with the raw audio/video data 12 and the first URL, over the communication network 34 to the transcoder client 16, thereby notifying the transcoder client 16 that the first digital representation of the raw audio/video data 12 has not been processed by the transcoder client 16.
  • Upon receipt of the notification from the librarian 28, the transcoder client 16 sends a message, including the first URL, over the communication network 34 to the media database server 24 requesting that the media database server 24 send a copy of the first digital representation of the raw audio/video data 12 to the transcoder client 16 for processing by the transcoder client 16. The media database server 24 processes the message from the transcoder client 16 by sending a message, including a copy of the first digital representation of the raw audio/video data 12, over the communication network 34 to the transcoder client 16 for processing by the transcoder client 16. The transcoder client 16 processes the copy of the first digital representation of the raw audio/video data 12 such that a second digital representation of the raw audio/video data 12 is generated, as described in more detail below.
  • After the transcoder client 16 has processed the copy of the first digital representation of the raw audio/video data 12, and generated the second digital representation of the raw audio/video data 12, the transcoder client 16 sends a message, including the second digital representation of the raw audio/video data 12, over the communication network 34 to the media database server 24 requesting that the media database server 24 store the second digital representation of the raw audio/video data 12 in the media database 22. The media database server 24 processes the message from the transcoder client 16 by first checking to see if space is available in the media database 22 to store the second digital representation of the raw audio/video data 12 in the media database 22. If space is not available in the media database 22, the media database server 24 denies the request to store the second digital representation of the raw audio/video data 12 in the media database 22. However, if the space is available in the media database 22, the media database server 24 stores the second digital representation of the raw audio/video data 12 at a location in the media database 22 and assigns the location a second URL. The media database server 24 then sends a message, including the second URL, over the communication network 34 to the transcoder client 16 notifying the transcoder client 16 of the storing of the second digital representation of the raw audio/video data 12 in the media database 22 at the location identified by the second URL.
  • Upon receipt of the notification from the media database server 24, the transcoder client 16 sends a message, including the object identification number associated with the raw audio/video data 12 and the second URL, over the communication network 34 to the librarian 28 notifying the librarian 28 of the transcoding of the first digital representation of the raw audio/video data 12 into the second digital representation of the raw audio/video data 12, and the storing of the second digital representation of the raw audio/video data 12 in the media database 22 at the location identified by the second URL. The librarian 28 processes the message from the transcoder client 16 by storing the second URL in the meta database 26 along with the object identification number associated with the raw audio/video data 12, as described in more detail below.
  • The annotation client 18 periodically sends messages to the librarian 28 requesting work from the librarian 28. The librarian 28 processes such a message from the annotation client 18 by first checking to see if there are any objects in the meta database 26 that have corresponding digital representations which have not been processed by the annotation client 18. If there are no objects in the meta database 26 that have corresponding digital representations which have not been processed by the annotation client 18, then the librarian 28 denies the work request. However, if there are objects in the meta database 26 that have corresponding digital representations which have not been processed by the annotation client 18, such as, for example, the first digital representation of the raw audio/video data 12, then the librarian 28 sends a message, including the object identification number associated with the raw audio/video data 12 and the first URL, over the communication network 34 to the annotation client 18, thereby notifying the annotation client 18 that the first digital representation of the raw audio/video data 12 has not been processed by the annotation client 18.
  • Upon receipt of the notification from the librarian 28, the annotation client 18 sends a message, including the first URL, over the communication network 34 to the media database server 24 requesting that the media database server 24 send a copy of the first digital representation of the raw audio/video data 12 to the annotation client 18 for processing by the annotation client 18. The media database server 24 processes the message from the annotation client 18 by sending a message, including a copy of the first digital representation of the raw audio/video data 12, over the communication network 34 to the annotation client 18 for processing by the annotation client 18. The annotation client 18 processes the copy of the first digital representation of the raw audio/video data 12 so as to generate annotations for the object in the meta database 26 corresponding to the raw audio/video data 12, as described in more detail below.
  • After the annotation client 18 has processed the copy of the first digital representation of the raw audio/video data 12, and generated the annotations for the object in the meta database 26 corresponding to the raw audio/video data 12, the annotation client 18 sends a message, including the object identification number associated with the raw audio/video data 12 and the annotations that were generated for the object in the meta database 26 corresponding to the raw audio/video data 12, over the communication network 34 to the librarian 28 notifying the librarian 28 of the generating of the annotations for the object in the meta database 26 corresponding to the raw audio/video data 12. The librarian 28 processes the message from the annotation client 18 by storing the annotations that were generated for the object in the meta database 26 corresponding to the raw audio/video data 12 in the meta database 26 along with the object identification number associated with the raw audio/video data 12, as described in more detail below.
  • The index database server 32 periodically sends messages to the librarian 28 requesting a list of object identification numbers from the librarian 28 which correspond to objects that have been created in the meta database 26. The librarian 28 processes such a message from the index database server 32 by sending a message, including a list of object identification numbers corresponding to objects that have been created in the meta database 26, over the communication network 34 to the index database server 32 for processing by the index database server 32. The index database server 32 processes the message from the librarian 28 by sending a message, including, for example, the object identification number associated with the raw audio/video data 12, over the communication network 34 to the librarian 28 requesting that the librarian 28 send a copy of the annotations that were generated for the object in the meta database corresponding to the raw audio/video data 12. The librarian 28 processes the message from the index database server 32 by sending a message, including the annotations that were generated for the object in the meta database corresponding to the raw audio/video data 12, over the communication network 34 to the index database server 32 for processing by the index database server 32. The index database server 32 processes the message from the librarian 28 by storing the annotations that were generated for the object in the meta database corresponding to the raw audio/video data 12 in the index database 30 along with, or with reference to, the object identification number associated with the raw audio/video data 12, as described in more detail below.
  • The browser client 20 allows the user 11 to interface with the index database server 32 such that the user 11 is allowed to search, browse, and retrieve all or a portion of a digital representation such as, for example, the first digital representation of the raw audio/video data 12. The browser client 20 sends a message, initiated by the user 11, over the communication network 34 to the index database server 32 requesting a search of the index database 30. The index database server 32 processes the message from the browser client 20 by sending a message, including a hypertext markup language (HTML) query page, to the browser client 20 for presentation to the user 11. The browser client 20 then presents the HTML query page to the user 11. The HTML query page is such that it allows the user 11 to enter textual and Boolean queries.
  • The user 11 enters a query through the HTML query page and the browser client 20 sends a message, including the query, over the communication network 34 to the index database server 32 for processing by the index database server 32. The index database server 32 processes the message from the browser client 20 by searching the index database 30 for annotations which match the query, and obtaining the object identification number associated with each matching annotation, as described in more detail below. The index database server 32 then sends a message, including each matching annotation and the object identification number associated with each matching annotation, over the communication network 34 to the librarian 28 requesting that the librarian 28 provide the URL of the digital representation from which each matching annotation was generated such as, for example, the first URL. The librarian 28 processes the message from the index database server 32 by searching the meta database 26 for the URL of the digital representation from which each matching annotation was generated, as described in more detail below. The librarian 28 then sends a message, including each matching annotation, the URL of the digital representation from which each matching annotation was generated, and the object identification number associated with each matching annotation, over the communication network 34 to the index database server 32 for processing by the index database server 32.
  • The index database server 32 processes the message from the librarian 28 by building an HTML results page for presentation to the user 11. The index database server 32 builds the HTML results page by creating an image or an icon corresponding to the URL of the digital representation from which each matching annotation was generated. That is, each image or icon is hyperlinked to a function or script which allows the user 11 to browse and/or retrieve all or a portion of a corresponding digital representation such as, for example, the first digital representation of the raw audio/video data 12. Once the HTML results page has been built, the index database server 32 sends a message, including the HTML results page, to the browser client 20 for presentation to the user 11. The browser client 20 then presents the HTML results page to the user 11 so that the user 11 can select one of the images or icons so as to browse and/or retrieve all or a portion of a corresponding digital representation such as, for example, the first digital representation of the raw audio/video data 12.
  • In order to browse and/or retrieve all or a portion of a digital representation such as, for example, the first digital representation of the raw audio/video data 12, a method for efficiently delivering slices of media from large media streams is required. For real-time media streams such as video or audio tracks, URLs must be extended to specify not only a desired file but also the starting and ending times of the portion that is to be returned to a requesting entity. This can be done by attaching one or more server extensions to a standard HTTP server such that an URL of the form: http://www.digital.com/movie.mpg?st=1:00:00.00?et=1:00:05.00
  • will cause a server extension attached to the standard HTTP server, in this case named “www.digital.com”, to fetch and stream the moving pictures expert group (MPEG) stream for “movie” starting at the time code “1:00:00.00” and ending at time code “1:00:05.00.” In the system 10A shown in FIG. 1A, the media database server 24 has a server extension for performing these fetch and stream operations.
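  • By way of illustration only, the following Python sketch shows how such a server extension might parse the extended URL form above into a file name and a pair of time codes. The helper names are assumptions made for the sketch and are not part of the described system.

```python
def parse_timecode(tc: str) -> float:
    """Convert a time code of the form H:MM:SS.FF into seconds."""
    hours, minutes, seconds = tc.split(":", 2)
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)

def parse_media_request(url: str):
    """Split an extended URL of the form file?st=<start>?et=<end> into
    (file name, start in seconds, end in seconds)."""
    file_name, *params = url.split("?")
    times = dict(p.split("=", 1) for p in params)
    return file_name, parse_timecode(times["st"]), parse_timecode(times["et"])

name, st, et = parse_media_request(
    "http://www.digital.com/movie.mpg?st=1:00:00.00?et=1:00:05.00")
# st == 3600.0 and et == 3605.0; the extension would then fetch and stream
# only the MPEG data lying between these two time codes.
```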
  • The generalization of the above-described technique is to provide a well known method for selecting a portion of a digital representation using specified file parameters. The URL can be of the form: http://server/file_name?file_parameter
  • Such a generalization allows the file_parameter field to specify a format in which a digital representation will be supplied. Thus, the transcoding of a digital representation into another format can be requested of the media database server 24 by so indicating in the file_parameter field. For example, to extract MPEG audio from an MPEG system stream, the media database server 24 will receive an URL in the above-described form from a requesting entity. The media database server 24 determines the appropriate server extension based upon what is indicated in the file_parameter field. The media database server 24 then passes the file_name and the file_parameter to the appropriate server extension. The server extension then generates a multipurpose internet mail extensions (MIME) header which is sent to the requesting entity through the media database server 24. The server extension then opens the file indicated in the file_name field and strips off any header information that may be contained at the beginning of the file. The file_parameter identifies the portion of the file that was requested by the requesting entity, and optionally drives transcoding or sub-stream extraction. The server extension then generates a new header and provides the requested file portion to the media database server 24, which then sends the requested file portion to the requesting entity.
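  • The dispatch described above might be sketched as follows; the registry, the “audio” extension keyword, and the MIME header shown are hypothetical placeholders rather than details taken from the system itself, and the actual header stripping and sub-stream extraction are elided.

```python
# Hypothetical registry: a file_parameter keyword selects a server extension.
EXTENSIONS = {}

def extension(keyword):
    """Register a server extension under a file_parameter keyword."""
    def register(fn):
        EXTENSIONS[keyword] = fn
        return fn
    return register

@extension("audio")
def extract_audio(file_name, file_parameter):
    """Illustrative extension: emit a fresh MIME header followed by the
    requested portion of the file."""
    with open(file_name, "rb") as stream:
        body = stream.read()
    return b"Content-Type: audio/mpeg\r\n\r\n" + body

def serve(file_name, file_parameter):
    """Pass the file_name and file_parameter to the server extension
    indicated by the file_parameter field."""
    keyword = file_parameter.split("=", 1)[0]
    return EXTENSIONS[keyword](file_name, file_parameter)
```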
  • Although this generalized technique is feasible, the efficiency of the approach depends upon the implementation of the server extension for each type of representation. For video sequence representation types such as MPEG and/or H.263, the present invention allows for the storing of extra information alongside a primary video stream. This makes it possible to return a portion of the primary video stream to a requesting entity from almost any location within the primary video stream without increasing the network bit rate requirements, as described below.
  • Efficient image sequence encoding for video sequences exploits the redundancy that occurs in a sequence of frames. In a video sequence for a single scene, only a few objects will move from one frame to the next. This means that by applying motion compensation it is possible to predict a current image in the video sequence from a previous image. Furthermore, this implies that the current image can be reconstructed from a previously transmitted image if all that is sent to a requesting entity are motion vectors and a difference between a predicted image and an actual image. This technique is well known and is termed predictive encoding.
  • The predictive encoding technique can be extended to make predictions about a current image based upon any prior image and any future image. However, the details of such an extension are not necessary to understanding the methodology of the present invention. What is necessary to understanding the methodology of the present invention, is that an image frame which has been encoded independently of any other frame is defined as an intra or I-frame, and an image frame which has been encoded based upon a previous frame is defined as a predicted or P-frame.
  • An important extension of the above discussion is that frames are generally encoded by breaking them into fixed-size blocks. Each block can then be encoded separately, producing an I-block, or encoded using previous blocks, producing a P-block. Transmitted frames can then consist of a mixture of I-blocks and P-blocks. Additional encoding efficiency is generally gained through this technique.
  • For network transmissions, the critical thing is to minimize bandwidth while maintaining accuracy in a reconstructed image. These two issues are balanced by sending as many P-frames or P-blocks as possible, and sending only an occasional I-frame or I-block when it is necessary to correct errors. This is because I-frames and I-blocks are substantially larger than P-frames and P-blocks. Therefore, a typical encoder will generate an encoded file that consists mostly of P-frames and P-blocks with the occasional I-frame and I-block. Maximum efficiency is gained by providing only a single I-frame at the head of a file, and then providing a mixture of I-blocks and P-blocks for the rest of the file.
  • However, it should be apparent from this discussion that the above-described approach is incompatible with being able to transmit a valid image sequence file from any location within a primary video stream. This is because an image sequence decoder can only start decoding from a complete I-frame. If there is only one I-frame in a file, and it is located at the head of the file, then that is the only place in the file from which the image sequence decoder can start decoding the file. The file must therefore be transmitted from its beginning, which typically results in decreased transmission efficiency.
  • The simplest way to correct this problem is to force the encoder to place I-frames at periodic locations within a primary video sequence. The primary video sequence can then be decoded from any location where an I-frame has been placed. However, this decreases the encoding efficiency.
  • The present invention solves this problem by maintaining a secondary bit stream of I-frames which can be used to jump into the primary bit stream from any location where an I-frame has been stored. This secondary bit stream of I-frames can be generated by a secondary encoder, which can be included in both the encode client 14 and the transcoder client 16. This secondary bit stream is combined with the primary bit stream to produce the first digital representation of the raw audio/video data 12 and the second digital representation of the raw audio/video data 12, as described above.
  • Referring to FIG. 2, there is shown a flowchart diagram detailing the processing steps of the encoder client 14. The encoder client 14 processes the raw audio/video data 12, which is typically in analog form, by digitizing the raw audio/video data 12 with a digitizer 40. The digitized audio/video data is then encoded by a primary encoder 42, which generates a primary bit stream 44 for the first digital representation of the raw audio/video data 12 and a prediction of the primary bit stream for the first digital representation of the raw audio/video data 12. The prediction of the primary bit stream for the first digital representation of the raw audio/video data 12 is separately encoded by a secondary encoder 45 to generate a secondary bit stream 46 for the first digital representation of the raw audio/video data 12. The primary bit stream 44 for the first digital representation of the raw audio/video data 12 and the secondary bit stream 46 for the first digital representation of the raw audio/video data 12 are then combined to form the first digital representation 48 of the raw audio/video data 12, which is stored in the media database 22 at the location identified by the first URL, as described above. The primary bit stream 44 for the first digital representation of the raw audio/video data 12 is typically in the form of an I-frame and a plurality of P-frames, whereas the secondary bit stream 46 for the first digital representation of the raw audio/video data 12 is in the form of all I-frames. The first digital representation 48 of the raw audio/video data 12 is typically stored in a file in the media database 22. The file typically has a header which has pointers to the beginnings of the primary bit stream 44 and the secondary bit stream 46 within the file. It should be noted that the primary bit stream 44 for the first digital representation of the raw audio/video data 12 and the secondary bit stream 46 for the first digital representation of the raw audio/video data 12 must be in the same format such as, for example, JPEG, MPEG or H.263.
  • Referring to FIG. 3, there is shown a flowchart diagram detailing the processing steps of the transcoder client 16. The transcoder client 16 processes the first digital representation 48 of the raw audio/video data 12 with a decoder 50. The decoded audio/video is then encoded by a primary encoder 52, which generates a primary bit stream 54 for the second digital representation of the raw audio/video data 12 and a prediction of the primary bit stream for the second digital representation of the raw audio/video data 12. The prediction of the primary bit stream for the second digital representation of the raw audio/video data 12 is separately encoded by a secondary encoder 55 to generate a secondary bit stream 56 for the second digital representation of the raw audio/video data 12. The primary bit stream 54 for the second digital representation of the raw audio/video data 12 and the secondary bit stream 56 for the second digital representation of the raw audio/video data 12 are then combined to form the second digital representation 58 of the raw audio/video data 12, which is stored in the media database 22 at the location identified by the second URL, as described above. The primary bit stream 54 for the second digital representation of the raw audio/video data 12 is typically in the form of an I-frame and a plurality of P-frames, whereas the secondary bit stream 56 for the second digital representation of the raw audio/video data 12 is in the form of all I-frames. The second digital representation 58 of the raw audio/video data 12 is typically stored in a file in the media database 22. The file typically has a header which has pointers to the beginnings of the primary bit stream 54 and the secondary bit stream 56 within the file. It should be noted that the primary bit stream 54 for the second digital representation of the raw audio/video data 12 and the secondary bit stream 56 for the second digital representation of the raw audio/video data 12 must be in the same format such as, for example, JPEG, MPEG or H.263.
  • The primary encoder 42 in the encoder client 14 and the primary encoder 52 in the transcoder client 16 can both operate according to an encoding process 60 such as shown in FIG. 4. This encoding process 60 comprises digitized audio/visual data 62, a differencing function 64, a discrete cosine transform (DCT) function 66, a quantization (Q) function 68, an inverse quantization (invQ) function 70, an inverse discrete cosine transform function (IDCT) 72, an adding function 74, a motion estimation function 76, a motion compensation function 78, and a delay function 80. A current frame of the digitized audio/visual data 62 is processed according to the encoding process 60 by differencing the current frame of the digitized audio/visual data 62 with a prediction of the current frame at the differencing function 64. The difference between the current frame of the digitized audio/visual data 62 and the prediction of the current frame is encoded by the discrete cosine transform (DCT) function 66 and the quantization (Q) function 68 to produce an encoded P-frame for a digital representation of the digitized audio/visual data 62. This encoded P-frame is decoded by the inverse quantization (invQ) function 70 and the inverse discrete cosine transform function (IDCT) 72, and then added to a delayed prediction of the current frame by the adding function 74. The prediction of the current frame is determined by subjecting the output of the adding function 74 to the motion estimation function 76 and the motion compensation function 78. It is this prediction of the current frame that is encoded by the secondary encoder 45 in the encoder client 14 and the secondary encoder 55 in the transcoder client 16, as described above.
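  • A minimal Python rendering of this loop is sketched below. The DCT/quantization pair is collapsed into coarse rounding, and motion estimation/compensation is reduced to a zero-motion prediction from the previous reconstruction, so this illustrates only the data flow of FIG. 4 and is not a working codec. Note how the prediction of each current frame, which the loop computes in any case, is handed to the secondary encoder as an I-frame.

```python
import numpy as np

def quantize(block, step=16):
    """Stand-in for the DCT (66) and quantization (68) stages."""
    return np.round(block / step).astype(np.int16)

def dequantize(coeffs, step=16):
    """Stand-in for inverse quantization (70) and inverse DCT (72)."""
    return coeffs.astype(np.float64) * step

def encode_sequence(frames):
    """Return (primary, secondary): primary holds the encoded residuals
    (P-frames); secondary holds each frame's prediction, independently
    encoded as an I-frame."""
    primary, secondary = [], []
    prediction = np.zeros_like(frames[0], dtype=np.float64)
    for frame in frames:
        secondary.append(quantize(prediction))      # secondary encoder input
        residual = frame - prediction               # differencing function 64
        p_frame = quantize(residual)
        primary.append(p_frame)
        # Local decode: invQ + IDCT, then add the prediction back (74).
        reconstruction = prediction + dequantize(p_frame)
        # Motion estimation (76) / compensation (78), reduced to zero motion:
        prediction = reconstruction
    return primary, secondary

frames = [np.full((4, 4), v, dtype=np.float64) for v in (10.0, 12.0, 15.0)]
primary, secondary = encode_sequence(frames)
```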
  • At this point it should be noted that similar results can be obtained by encoding each frame of the digitized audio/visual data 62 so as to produce the secondary bit stream 46 for the first digital representation of the raw audio/video data 12 and the secondary bit stream 56 for the second digital representation of the raw audio/video data 12, as described above.
  • It should also be noted that both the secondary bit stream 46 for the first digital representation of the raw audio/video data 12 and the secondary bit stream 56 for the second digital representation of the raw audio/video data 12 could alternatively be generated at an encoder associated with the media database server 24. For example, referring to FIG. 1B, there is shown a schematic diagram of a second embodiment of a system 10B for organizing distributed multimedia content and for searching, browsing, and retrieving such organized distributed multimedia content in accordance with the present invention. The system 10B is identical to the system 10A except for the addition of an encoder 36, and that the encoder client 14 and the transcoder client 16 would no longer require the secondary encoder 45 and the secondary encoder 55, respectively, as described above. The encoder 36 would generate both the secondary bit stream 46 for the first digital representation of the raw audio/video data 12 and the secondary bit stream 56 for the second digital representation of the raw audio/video data 12. That is, the encoder client 14 would generate the primary bit stream 44 as described above, and then transmit the primary bit stream 44 to the media database server 24. The media database server 24 would then provide the primary bit stream 44 to the encoder 36, which would then generate the secondary bit stream 46. The encoder 36 would then provide the secondary bit stream 46 to the media database server 24. The media database server 24 would then combine the primary bit stream 44 for the first digital representation of the raw audio/video data 12 and the secondary bit stream 46 for the first digital representation of the raw audio/video data 12 to form the first digital representation 48 of the raw audio/video data 12, which is then stored in the media database 22 at the location identified by the first URL, as described above. Similarly, the transcoder client 16 would generate the primary bit stream 54 and then transmit the primary bit stream 54 to the media database server 24. The media database server 24 would then provide the primary bit stream 54 to the encoder 36, which would then generate the secondary bit stream 56. The encoder 36 would then provide the secondary bit stream 56 to the media database server 24. The media database server 24 would then combine the primary bit stream 54 for the second digital representation of the raw audio/video data 12 and the secondary bit stream 56 for the second digital representation of the raw audio/video data 12 to form the second digital representation 58 of the raw audio/video data 12, which is then stored in the media database 22 at the location identified by the second URL, as described above. The foregoing is beneficial in that only the primary bit stream 44 and the primary bit stream 54 are transmitted from the encoder client 14 and the transcoder client 16, respectively, to the media database server 24, which increases transmission efficiency.
  • It should further be noted that the primary bit streams 44 and 54 and the secondary bit streams 46 and 56 as described above only represent the video portion of the first digital representation 48 of the raw audio/video data 12 and the second digital representation 58 of the raw audio/video data 12, respectively. That is, a digital representation of an audio/video bit stream consists of three components: an audio layer, a video layer, and a system layer. The system layer tells a decoder how audio and video are interleaved in the audio/video bit stream. The decoder uses this information to split the audio/video bit stream into components and send each component to its appropriate decoder. On the other end, a video encoder takes a non-encoded video stream and provides an encoded video stream which is then combined with an encoded audio stream to create the three-component audio/video stream. Thus, the primary bit streams 44 and 54 and the secondary bit streams 46 and 56 as described above represent video streams which will be combined with audio streams to create three-component audio/video streams.
  • In view of the above, it is now appropriate to indicate that the media database server 24 stores the first digital representation 48 of the raw audio/video data 12 in the media database 22 such that each P-frame in the primary bit stream 44 for the first digital representation of the raw audio/video data 12 references a corresponding I-frame in the secondary bit stream 46 for the first digital representation of the raw audio/video data 12, and vice versa. Thus, the user 11 can browse and/or retrieve a desired portion of the first digital representation 48 starting at any arbitrary location within the first digital representation 48 by first obtaining an I-frame from the secondary bit stream 46 for the first digital representation of the raw audio/video data 12 which corresponds to the arbitrary starting location of the desired portion, and then obtaining P-frames from the primary bit stream 44 for the first digital representation of the raw audio/video data 12 for all subsequent locations of the desired portion. This is beneficial in that the media database server 24 will only have to send a message containing a single I-frame in order for the user 11 to browse and/or retrieve a desired portion of the first digital representation 48, thereby obtaining maximum network transmission efficiency while maintaining the encoding advantages of only a single I-frame in the primary bit stream 44 for the first digital representation of the raw audio/video data 12.
  • Similarly, the media database server 24 stores the second digital representation 58 of the raw audio/video data 12 in the media database 22 such that each P-frame in the primary bit stream 54 for the second digital representation of the raw audio/video data 12 references a corresponding I-frame in the secondary bit stream 56 for the second digital representation of the raw audio/video data 12, and vice versa. Thus, the user 11 can browse and/or retrieve a desired portion of the second digital representation 58 starting at any arbitrary location within the second digital representation 58 by first obtaining an I-frame from the secondary bit stream 56 for the second digital representation of the raw audio/video data 12 which corresponds to the arbitrary starting location of the desired portion, and then obtaining P-frames from the primary bit stream 54 for the second digital representation of the raw audio/video data 12 for all subsequent locations of the desired portion. This is beneficial in that the media database server 24 will only have to send a message containing a single I-frame in order for the user 11 to browse and/or retrieve a desired portion of the second digital representation 58, thereby obtaining maximum network transmission efficiency while maintaining the encoding advantages of only a single I-frame in the primary bit stream 54 for the second digital representation of the raw audio/video data 12.
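  • Continuing the illustrative sketch above (and reusing its quantization stand-ins), the retrieval path for either digital representation can be expressed as follows; plain list indexing stands in for the P-frame/I-frame cross-references that the media database server 24 maintains.

```python
import numpy as np

def dequantize(coeffs, step=16):
    # The same stand-in for invQ + IDCT as in the encoder sketch above.
    return coeffs.astype(np.float64) * step

def slice_stream(primary, secondary, start, end):
    """Frames needed to browse [start, end): a single I-frame from the
    secondary bit stream at the arbitrary entry point, followed only by
    the small P-frames of the primary bit stream."""
    return [secondary[start]] + primary[start:end]

def decode_slice(frames):
    """The leading I-frame seeds the decoder's prediction; each following
    P-frame contributes a decoded residual."""
    prediction = dequantize(frames[0])
    reconstructed = []
    for p_frame in frames[1:]:
        prediction = prediction + dequantize(p_frame)
        reconstructed.append(prediction)
    return reconstructed
```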
  • Referring to FIG. 5, there is shown a file structure for a file 90 that is stored in the media database 22 containing either the first digital representation 48 of the raw audio/video data 12 or the second digital representation 58 of the raw audio/video data 12. The file 90 comprises a header portion 92, a primary bit stream portion 94, and a secondary bit stream portion 96. The header portion 92 comprises a file identifier 98 for either the first digital representation 48 of the raw audio/video data 12 or the second digital representation 58 of the raw audio/video data 12, a pointer 100 to the beginning of the primary bit stream portion 94, and a pointer 102 to the beginning of the secondary bit stream portion 96. The primary bit stream portion 94 comprises an I-frame 104 and a plurality of P-frames 106. The secondary bit stream portion 96 comprises a plurality of I-frames 108. The references between the P-frames 106 in the primary bit stream portion 94 and the I-frames 108 in the secondary bit stream portion 96, and vice versa, can be included in the P-frames 106 in the primary bit stream portion 94 and the I-frames 108 in the secondary bit stream portion 96. Alternatively, the header portion 92 can include additional pointers to corresponding P-frames 106 in the primary bit stream portion 94 and I-frames 108 in the secondary bit stream portion 96.
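  • Packed into bytes, the file 90 of FIG. 5 might be laid out as follows; the 16-byte identifier width and the 4-byte pointers are assumptions made for the sketch.

```python
import struct

# Header portion 92: file identifier 98, pointer 100 to the primary bit
# stream portion 94, pointer 102 to the secondary bit stream portion 96.
HEADER = struct.Struct("<16sII")

def pack_file(file_id: bytes, primary: bytes, secondary: bytes) -> bytes:
    """Lay the file out as header, primary portion, secondary portion."""
    primary_off = HEADER.size
    secondary_off = primary_off + len(primary)
    header = HEADER.pack(file_id.ljust(16, b"\0"), primary_off, secondary_off)
    return header + primary + secondary

def read_header(data: bytes):
    """Recover the file identifier and the two bit stream pointers."""
    file_id, primary_off, secondary_off = HEADER.unpack_from(data, 0)
    return file_id.rstrip(b"\0"), primary_off, secondary_off
```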
  • As previously described, the annotation client 18 processes the copy of the first digital representation of the raw audio/video data 12 such that annotations are generated for the object in the meta database 26 corresponding to the raw audio/video data 12. The librarian 28 then stores these annotations in the meta database 26 along with the object identification number associated with the raw audio/video data 12. The implementation of these steps in accordance with the present invention is directly related to annotation processes and the structure of the meta database 26.
  • Annotations are generated for an object so as to provide information about the whole object or a part of the object. Annotations may be generated for an object by trusted automatic processes called annotation daemons, such as the annotation client 18, or by trusted human annotators. Annotations which have previously been generated for an object, including both annotations produced by annotation daemons or by human annotators, may be reviewed and updated.
  • Annotations in accordance with the present invention are a typed, probabilistic, stratified collection of values. Referring to FIG. 6, there is shown an annotation structure 110 for an object in accordance with the present invention. The annotation structure 110 comprises a first annotation sequence 114 and a second annotation sequence 116. The first annotation sequence 114 and the second annotation sequence 116 relate to a media stream 112, which can be either an audio or a video stream. Each annotation sequence represents a different type of annotation such as, for example, words that occur in the media stream 112 or speakers that are recognized in the media stream 112.
  • Each annotation sequence contains a plurality of time marks 117 and a plurality of arcs 118. Each time mark 117 represents an instant in time. Each arc 118 also has an associated value and probability. The probability is a measure of confidence in the accuracy of the annotation. The use of a probability allows probabilistic-based retrieval to be supported. The use of a probability also allows the quality (e.g., higher or lower quality) of a replacement annotation to be determined. Each annotation sequence can be applied to the entire media stream 112 or to a part thereof.
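  • Expressed as data types, one plausible reading of the annotation structure 110 is sketched below; the field names and the sample values are illustrative only.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Arc:
    """An arc 118 between two time marks 117, carrying a typed value and
    a probability measuring confidence in the annotation's accuracy."""
    start: float        # time mark opening the arc, in seconds
    end: float          # time mark closing the arc
    value: str          # e.g., a word occurring in the media stream 112
    probability: float

@dataclass
class AnnotationSequence:
    """One annotation sequence (114 or 116): all arcs of a single type."""
    annotation_type: str   # e.g., "transcript" or "speaker"
    arcs: List[Arc]

transcript = AnnotationSequence("transcript", [
    Arc(0.0, 0.4, "commission", 0.93),
    Arc(0.4, 1.1, "history", 0.88),
])
speakers = AnnotationSequence("speaker", [Arc(0.0, 12.5, "speaker-1", 0.75)])
```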
  • The annotation structure 110 as described above differs from many video annotation systems that work on shot lists. In this prior art approach, a video is first broken down into thematic chunks called shots that are then grouped into scenes. Each shot is then taken as a basic atomic unit for annotation. That is, each shot is annotated, and searching will only retrieve particular shots. The difficulty of this prior approach is that performing the above-processing automatically can be very difficult. The present invention avoids this difficulty by allowing the presence of people and things to be marked within a scene.
  • The structure of the meta database 26 is such that it is an object database built on top of standard relational databases. Each object in the object database of the meta database 26 represents some form of audio/video data such as, for example, the raw audio/video data 12, as described above. For every object in the object database of the meta database 26 there can be one or more representations and/or annotations. A representation of an object in the object database of the meta database 26 can be a representation of the audio/video data that is represented by the object in the object database of the meta database 26 such as, for example, the first digital representation of the raw audio/video data 12, as described above. An annotation of an object in the object database of the meta database 26 can be an annotation that is generated by processing one or more representations of the audio/video data that is represented by the object in the object database of the meta database 26 such as, for example, an annotation that was generated by processing the copy of the first digital representation of the raw audio/video data 12, as described above.
  • The structure of an object database 120 of the meta database 26 in accordance with the present invention is shown in FIG. 7. The object database 120 comprises an object 122, a plurality of representations 124 of the object 122, and a plurality of annotations 126 of the object 122. As indicated by the direction of the arrows, each of the plurality of representations 124 of the object 122 reference the object 122, and each of the plurality of annotations 126 of the object 122 reference the object 122. It should be noted that an annotation 126 may reference more than one object 122, indicating that the annotation 126 is shared by the more than one object 122.
  • All of the objects in the object database of the meta database 26 are listed in an object table 130 of the meta database 26, as shown in FIG. 8. Each of the objects in the object database of the meta database 26 are assigned an object identification number 132, as previously described. Each object identification number 132 is unique and is typically in numeric or alphanumeric form, although other forms are also permitted. Each of the objects in the object database of the meta database 26 are typically listed in the object table 130 according to the value of their object identification number 132, as shown.
  • Each of the objects in the object database of the meta database 26 are also assigned an object type 134. The object type 134 can be, for example, video or audio, corresponding to the type of data that is represented by the object in the object database of the meta database 26. Accordingly, each of the objects in the object database of the meta database 26 are listed in the object table 130 with a corresponding object type 134.
  • All of the representations in the object database of the meta database are listed in a representation table 140 of the meta database 26, as shown in FIG. 9. Each of the representations in the object database of the meta database 26 are assigned a representation identification number 142. Similar to the object identification numbers 132, each representation identification number 142 is unique and is typically in numeric or alphanumeric form, although other forms are also permitted. Each of the representations in the object database of the meta database 26 are typically listed in the representation table 140 according to the value of their representation identification number 142, as shown.
  • As previously discussed, each of the representations in the object database of the meta database 26 is associated with an object in the object database of the meta database 26. Accordingly, each of the representations in the object database of the meta database 26 are listed in the representation table 140 with an associated object identification number 132.
  • Each of the representations in the object database of the meta database 26 are also assigned a representation type 144. The representation type 144 can be, for example, video/mpeg, video/x-realvideo, audio/mpeg, or audio/x-realaudio, corresponding to the format type of the representation in the object database of the meta database 26. Accordingly, each of the representations in the object database of the meta database 26 are listed in the representation table 140 with a corresponding representation type 144.
  • As previously discussed, each of the representations in the object database of the meta database 26 have an associated URL which identifies the location in the media database 22 where the representation can be found. Accordingly, each of the representations in the object database of the meta database 26 are listed in the representations table 140 with an associated URL 146.
  • All of the annotations in the object database of the meta database 26 are listed in an annotation table 150 of the meta database 26, as shown in FIG. 10. Each of the annotations in the object database of the meta database 26 are assigned an annotation identification number 152. Similar to the object identification numbers 132 and the representation identification numbers 142, each annotation identification number 152 is unique and is typically in numeric or alphanumeric form, although other forms are also permitted. Each of the annotations in the object database of the meta database 26 are typically listed in the annotation table 150 according to the value of their annotation identification number 152, as shown.
  • As previously discussed, each of the annotations in the object database of the meta database 26 are associated with an object in the object database of the meta database 26. Accordingly, each of the annotations in the object database of the meta database 26 are listed in the annotation table 150 with an associated object identification number 132.
  • Each of the annotations in the object database of the meta database 26 are also assigned an annotation type 154. The annotation type 154 can be, for example, transcript, speaker or keyframe. Each annotation type 154 corresponds to the type of annotation that has been generated for a corresponding object in the object database of the meta database 26. Accordingly, each of the annotations in the object database of the meta database 26 are listed in the annotation table 150 with a corresponding annotation type 154.
  • Each of the annotations in the object database of the meta database 26 have a corresponding annotation value 156. The annotation value 156 can be, for example, a word, the name of a speaker, or an URL which references an image in the media database 22. Each annotation value 156 corresponds to the actual annotated element of the object in the object database of the meta database 26. Accordingly, each of the annotations in the object database of the meta database 26 are listed in the annotation table 150 with a corresponding annotation value 156.
  • Annotations which have been generated for an object that represents an audio/video stream have a corresponding annotation start time 158 and a corresponding annotation end time 160. The annotation start time 158 corresponds to the location in the audio/video stream where an annotation actually begins. Conversely, the annotation end time 160 corresponds to the location in the audio/video stream where an annotation actually ends. Accordingly, each of the annotations in the object database of the meta database 26 which have been generated for an object that represents an audio/video stream are listed in the annotation table 150 with a corresponding annotation start time 158 and a corresponding annotation end time 160.
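  • Since the meta database 26 is described as an object database built on top of standard relational databases, the three tables of FIGS. 8-10 can be sketched directly as SQL. The column meanings follow the text; the column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE object (
    object_id   INTEGER PRIMARY KEY,  -- object identification number 132
    object_type TEXT                  -- object type 134: 'video' or 'audio'
);
CREATE TABLE representation (
    representation_id   INTEGER PRIMARY KEY,            -- 142
    object_id           INTEGER REFERENCES object(object_id),
    representation_type TEXT,         -- 144: e.g. 'video/mpeg'
    url                 TEXT          -- 146: location in the media database 22
);
CREATE TABLE annotation (
    annotation_id   INTEGER PRIMARY KEY,                -- 152
    object_id       INTEGER REFERENCES object(object_id),
    annotation_type TEXT,             -- 154: 'transcript', 'speaker', 'keyframe'
    value           TEXT,             -- 156: word, speaker name, or image URL
    start_time      REAL,             -- 158: where the annotation begins
    end_time        REAL              -- 160: where the annotation ends
);
""")
```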
  • As previously described, the index database server 32 stores the annotations that were generated for the object in the meta database 26 corresponding to the raw audio/video data 12 in the index database 30 along with the object identification number associated with the raw audio/video data 12. The index database server 32 then searches the index database 30 for annotations which match a query initiated by the user 11, and then obtains the object identification number associated with each matching annotation. The implementation of these steps in accordance with the present invention is directly related to the indexing process and the structure of the index database 30.
  • The index database server 32 stores the annotations in the index database 30 such that an entry is created in the index database 30 for each annotation value. Following each annotation value entry in the index database 30 is a list of start times for each occurrence of the annotation value within an associated object. The start times can be listed according to actual time of occurrence in the associated object or in delta value form. Following the list of start times for each occurrence of the annotation value within the associated object is the object identification number corresponding to the associated object, or a reference to such object identification number. Thus, each of these annotation value entries in the index database 30 is linked in some manner to the start times for each occurrence of the annotation value within an associated object and the object identification number corresponding to the associated object. Therefore, whenever the index database server 32 searches the index database 30 for annotation values which match a query, the start times for each occurrence of a matching annotation value within an associated object and the object identification number corresponding to the associated object can be easily obtained.
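  • A toy version of such an index in Python: each annotation value maps to the start times of its occurrences, grouped by the associated object identification number. The dictionary representation is an assumption; the text specifies only the logical linkage.

```python
from collections import defaultdict

# annotation value -> object identification number -> start times
index = defaultdict(lambda: defaultdict(list))

def add_entry(value: str, object_id: int, start_time: float):
    """Record one occurrence of an annotation value within an object."""
    index[value][object_id].append(start_time)

def lookup(value: str) -> dict:
    """Matching a query term immediately yields the start times of each
    occurrence together with the associated object identification numbers."""
    return {oid: sorted(times) for oid, times in index[value].items()}
```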
  • Once the index database server 32 has a matching annotation value, the start times for each occurrence of the matching annotation value within an associated object, and the object identification number corresponding to the associated object, the index database server 32 can send a message, including the matching annotation value, the start times for each occurrence of the matching annotation value within an associated object, and the object identification number corresponding to the associated object, over the communication network 34 to the librarian 28 requesting that the librarian 28 provide further information relating to the matching annotation value and the associated object identification number. Such information can include the annotation type, the annotation start time, the annotation end time, the representation type, the URL, and the object type associated with the matching annotation value and the associated object identification number, all of which have been described above. In short, the librarian 28 provides everything that the index database server 32 requires to build an HTML results page for presentation to the user 11.
  • At this point it should be noted that the start times for each occurrence of a matching annotation value within an associated object are included in the message from the index database server 32 to the librarian 28 so as to make searching the meta database more efficient. That is, searching the meta database 26 for numerical values typically requires less processing than searching the meta database 26 for textual values. Also, a matching annotation value and the start times for each occurrence of a matching annotation value within an associated object are directly related. However, a matching annotation value is typically a textual value, whereas the start times for each occurrence of a matching annotation value within an associated object are numerical values. Thus, using the start times for each occurrence of a matching annotation value within an associated object to search the meta database 26 for information is more efficient than using a matching annotation value.
  • At this point it should be noted that the index database server 32 inherently knows that it must look to the librarian 28 to provide further information relating to the matching annotation value and the associated object identification number. That is, it is inherent to the index database server 32 that a request for further information relating to the matching annotation value and the associated object identification number must be sent to the librarian 28.
  • In view of the above, the operation of both the system 10A and the system 10B can now be described in more detail. That is, system 10A and system 10B both operate such that subsequent to a request from the encoder client 14, the librarian 28 creates an object in the meta database 26, and stores information in the meta database 26 along with the object. This information includes the URL of a digital representation of media data, the form of the digital representation of the media data, the type (e.g., audio, video, etc.) of the form of the digital representation of the media data, the format in which the digital representation of the media data is stored at the URL, the URL and types of any ancillary files associated with the media data such as a transcript or closed-caption file, and any associated high-level meta data such as the title of the media data and/or its author.
  • After the object has been created, the annotation client 18 can request work from the librarian 28 and process digital representations which the librarian 28 has indicated have not already been processed by the annotation client 18, as previously described. The annotation client 18 employs an automatic process, called a daemon process, to perform the annotation function. Automatic daemon processes are preferred over human annotation processes, which can be very laborious. However, automatic daemon processes which produce high quality results, appropriately termed trusted daemon processes, are sometimes hard to come by given the current state of technology. Thus, it is important to provide a flexible, distributed, open architecture which can be used to incorporate new approaches to automatic annotation. The present invention achieves this by allowing each annotation client 18 to communicate with the librarian 28 and the media database server 24 over the communication network 34 using a standard messaging protocol (e.g., HTTP messaging).
  • The annotation client 18 requests work from the librarian 28 by providing two boolean conditions, an identifier of the annotation client 18, a version number of the annotation client 18, and an estimate of how long the annotation client 18 will take to complete the work (i.e., the annotation process). The first boolean condition is used to test for the existence of an object which satisfies the input requirements of the daemon process. That is, if an object satisfies the condition, then the inputs necessary for the daemon process to run exist and are referenced in the meta database 26. The second boolean condition tests for the non-existence of the output produced by the daemon process. If these boolean conditions are satisfied, then the daemon process should be run on the object.
  • The librarian 28 provides work to the annotation client 18 by first creating a list containing all objects which satisfy both boolean conditions. The librarian 28 then filters the list by eliminating objects which are presently being processed, or locked, by another annotation client 18 having the same identifier and version number. The librarian 28 then creates a key for each object remaining on the list which identifies the annotation client 18 and includes an estimate of how long the annotation client 18 will take to complete the work. This key is used to lock out other annotation clients 18 as described above. The librarian 28 then provides the URL of each digital representation remaining on the list to the annotation client 18 for processing, as previously described.
  • The annotation client 18 uses the returned work information to perform its operations. That is, the annotation client 18 uses the URL of each digital representation to request each digital representation from the media database server 24, as previously described. The annotation client 18 then performs its work.
  • Upon completion of its work, the annotation client 18 checks its work into the librarian 28 for storage in the meta database 26. The annotation client 18 accomplishes this task by returning the object identification number associated with the object, the newly generated annotation data, and the key to the librarian 28. The librarian 28 checks the key to make sure that it matches the key in a space reserved for the completed operation. If the annotation client 18 returns the correct key, and the estimated work completion time has not expired, the key will match and the librarian 28 will accept the complete result. However, if the estimated work completion time has expired, the key may also have expired if another annotation client 18, having the same identifier and version number, requested work after the estimated work completion time had expired. If this is the case, the work will have been given to the new requesting annotation client 18, and a new key will have been generated. Therefore, the first requesting annotation client will not be able to check in its work.
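  • The lock-and-key bookkeeping just described might be sketched as follows. Representing the two boolean conditions as Python predicates over an object's metadata, and the key as a random token, are assumptions, and the identifier/version bookkeeping is omitted for brevity.

```python
import time
import uuid

class Librarian:
    """Minimal sketch of the work-request and check-in protocol."""

    def __init__(self, objects):
        self.objects = objects  # object id -> metadata dict (with a 'url')
        self.locks = {}         # object id -> (key, expiry time)
        self.results = {}       # object id -> checked-in annotation data

    def request_work(self, has_inputs, lacks_output, eta_seconds):
        """Grant every unlocked object satisfying both boolean conditions,
        locking each with a fresh key that expires after the requesting
        client's own completion estimate."""
        now, granted = time.time(), []
        for oid, meta in self.objects.items():
            lock = self.locks.get(oid)
            if lock and lock[1] > now:
                continue                    # still locked by another client
            if has_inputs(meta) and lacks_output(meta):
                key = uuid.uuid4().hex      # locks out other clients
                self.locks[oid] = (key, now + eta_seconds)
                granted.append((oid, meta["url"], key))
        return granted

    def check_in(self, object_id, annotations, key):
        """Accept the work only if the returned key still matches; if the
        estimate expired and the work was re-granted, the keys differ."""
        lock = self.locks.get(object_id)
        if lock is None or lock[0] != key:
            raise PermissionError("stale key: work was granted elsewhere")
        self.results[object_id] = annotations
        del self.locks[object_id]
```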
  • The aforementioned protocol permits completely distributed processing of information with very low communications overhead. Also, the use of URLs makes it possible for the processing to occur anywhere on the network, although only privileged addresses (i.e., those belonging to trusted annotation clients 18) may install results in the librarian 28. Furthermore, the simple time stamp protocol makes the system tolerant to processing failures.
  • It is also possible to directly select an object to be worked on. This allows a human to force an order of work. This is useful for human review of annotations produced by automatic daemon processes. From the point of view of the librarian 28, a human sitting at an annotation station is just another requesting annotation client 18. However, the human will want to request work that has already been completed by an automatic daemon process by specifically searching for items and then locking those items with a key. When a human reviews the work, the probabilities of the annotation can be updated to nearly 1 because the annotations were reviewed via a manual process. When the work is checked in, the librarian 28 will check that the new annotations are of higher quality than the old annotations by looking at the probabilities associated with each annotation.
  • The index database server 32 indexes the meta database 26 by periodically requesting from the librarian 28 a list of object identification numbers which correspond to objects that have been created in the meta database 26. In response, the librarian 28 provides a list of object identification numbers which correspond to objects that have been created in the meta database 26 to the index database server 32. The index database server 32 then requests from the librarian 28, for each object identification number, a copy of all of the annotations that were generated for each object in the meta database 26. In response, the librarian 28 provides, for each object identification number, a copy of all of the annotations that were generated for each object in the meta database 26 to the index database server 32. The index database server 32 then stores the annotations that were generated for each object in the meta database 26 in the index database 30 along with, or with reference to, each associated object identification number.
  • As previously described, the browser client 20 sends a message, initiated by the user 11, to the index database server 32 requesting a search of the index database 30. In response, the index database server 32 provides an HTML query page to the browser client 20 for presentation to the user 11. The browser client 20 then presents the HTML query page to the user 11. Referring to FIG. 11, there is shown an exemplary HTML query page 170 including a search field 172, a user-selectable search command 174, a user-selectable “help” option 176, and a user-selectable “advanced search” option 178.
  • The user 11 enters a query through the HTML query page and the browser client 20 sends a message, including the query, to the index database server 32 for processing by the index database server 32. In response, the index database server 32 searches the index database 30 for annotation values which match the query. Once the index database server 32 has found matching annotation values, the index database server 32 ranks the matching annotation values according to relevance, and obtains the object identification number associated with each matching annotation value. The index database server 32 then requests the librarian 28 to provide further information relating to each matching annotation value by referencing each associated object identification number. As previously described, such information can include the annotation type, the annotation start time, the annotation end time, the representation type, the URL, and the object type associated with each matching annotation value and the associated object identification number. The librarian 28 then sends the requested information to the index database server 32.
  • At this point it should be noted that the index database server 32 ranks the matching annotation values using a modified document retrieval technique. The unmodified document retrieval technique uses a document as a basic unit, and determines the importance of a document based upon a query. That is, the importance of a document is based on the number of occurrences of each query word within the document, with each query word being weighted by the rarity of the query word in a document database. Thus, more rare words are given higher weights than common words, and documents with more query words receive higher total weights than documents with fewer query words. A typical equation for computing the score of a document is

  • Score(d)=sum_{q} w[q]  (1)
  • wherein d is a document, q is a query word, the sum runs over each occurrence of a query word q in the document d, and w[q] is the weight of the query word q. It should be clear that the above-described technique requires using all of the words in a document for determining the weight of the document.
  • In audio/video retrieval, it is a requirement that users be able to start an audio/video stream from the most relevant position within the audio/video stream. Thus, an indexing system must not only determine that an audio/video stream is relevant, but also identify all relevant locations within the audio/video stream, and preferably rank the relevance of those locations.
  • The present invention modifies the above-described technique by letting h[i] be a valid starting location within an audio/video stream, and letting L[q,j] be the jth location of the query word q in the audio/video stream. Then the score at a valid starting location h[i] can be given by

  • score(h[i])=sum_{L[q,j]>=h[i]} w[q] exp(−(L[q,j]−h[i])/DELTA)
  • wherein DELTA is a settable distance weight, typically set to between 10 and 30 seconds. Thus, the score at a valid starting location is a weighted sum over all of the locations at which query words appear after the valid starting location, where the weight of each appearance of a query word is the product of the query word weight and a negative exponential weight on the time distance between the occurrence of the query word and the valid starting location. This modified ranking technique provides a unique advantage to the index database server 32 of the present invention.
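  • In code, the reconstructed scoring rule reads as follows; the dictionary inputs are assumptions about how the occurrence lists and term weights would be held in memory.

```python
from math import exp

def score(h_i, locations, weights, delta=20.0):
    """Score a valid starting location h_i: for every occurrence L[q, j] of
    a query word q at or after h_i, add w[q] damped by a negative
    exponential in the time distance from h_i.

    locations: query word q -> list of occurrence times L[q, j], in seconds
    weights:   query word q -> w[q]
    delta:     the settable distance weight DELTA, e.g. 10-30 seconds
    """
    return sum(
        weights[q] * exp(-(l - h_i) / delta)
        for q, occurrences in locations.items()
        for l in occurrences
        if l >= h_i
    )

# Starting nearer the matches scores higher:
locs = {"commission": [35.0, 90.0], "history": [40.0]}
w = {"commission": 2.0, "history": 3.5}
assert score(30.0, locs, w) > score(0.0, locs, w)
```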
  • The index database server 32 uses the information provided by the librarian 28 to build an HTML results page for presentation to the user 11. The index database server 32 builds the HTML results page by creating an image or an icon for each matching annotation value. Each image or icon is hyperlinked to a function or script which allows the user 11 to browse and/or retrieve all or a portion of a corresponding digital representation. Once the HTML results page has been built, the index database server 32 sends the HTML results page to the browser client 20 for presentation to the user 11. The browser client 20 then presents the HTML results page to the user 11 so that the user 11 can select one of the images or icons so as to browse and/or retrieve all or a portion of a corresponding digital representation.
  • Referring to FIG. 12, there is shown an exemplary HTML results page 190 for a query which included the terms “commission” and “history.” The HTML results page 190 includes an almost exact copy of the HTML query page 192 containing a statement as to the number of matches that were found for the query, which in this case is five. The HTML results page 190 also includes either a video icon 194 or an audio icon 196 depending upon the type of object that is associated with each matching annotation value. Both the video icon 194 and the audio icon 196 are provided along with some detail about each associated object. For example, in the case of a video icon 194, the title of the corresponding video stream, a frame of the corresponding video stream, a textual excerpt from the corresponding video stream, the length of the corresponding video stream, the language that is spoken in the corresponding video stream, and the number of matches that occur within the corresponding video stream are shown or listed along with the video icon 194. In the case of an audio icon 196, the title of the corresponding audio stream, a textual excerpt from the corresponding audio stream, the length of the corresponding audio stream, the language that is spoken in the corresponding audio stream, and the number of matches that occur within the corresponding audio stream are listed along with the audio icon 196.
  • If the user 11 selects either a video icon 194 or an audio icon 196, then the video or audio stream will play from the location of the first match within the corresponding video or audio stream. This is possible because both the video icon 194 and the audio icon 196 are hyperlinked back to a function or script in the index database server 32, whereby the index database server 32 uses the information provided by the librarian 28 to access a corresponding digital representation in the media database 22 using the extended URL format described above. If more than one match occurs within either a video or an audio stream, then a user-selectable “matches” option 198 is provided to allow the user 11 to browse each location within the video or audio stream where a match has occurred, as described in more detail below. If the user 11 desires to browse locations surrounding the location of the first match within the corresponding video or audio stream, then a user-selectable “more context” option 200 is provided to allow the user 11 to browse locations surrounding the location of the first match within the corresponding video or audio stream, as described in more detail below.
  • To illustrate the above-described “matches” option 198, it is assumed that the user 11 has selected the “matches” option 198 associated with the third match presented in the HTML results page 190 (i.e., the video entitled “1998 State of the Union Address”). Referring to FIG. 13, there is shown an exemplary HTML matches page 210 for allowing the user 11 to browse each location within the video stream associated with the third match presented in the HTML results page 190 where a match has occurred. The HTML matches page 210 includes an almost exact copy of the HTML query page 212, which contains an additional user-selectable “search this result” option 214 for allowing the user 11 to refine the results of a previous query. The HTML matches page 210 also includes a matches header 216 containing the title of the corresponding video stream, the length of the corresponding video stream, the language that is spoken in the corresponding video stream, and the number of matches that occur within the corresponding video stream, which in this case is four. The HTML matches page 210 further includes a frame 218 corresponding to each match that occurs within the corresponding video stream. Each frame 218 includes a video icon 220, which functions in a manner similar to the previously-described video icon 194. Each frame 218 and corresponding video icon 220 are provided along with some detail about the associated match, for example, the exact time location of the match within the corresponding video stream and a textual excerpt from the corresponding video stream. Similar to the HTML results page 190, the HTML matches page 210 includes a user-selectable “more context” option 222 for each match to allow the user 11 to browse locations surrounding the location of that match within the corresponding video stream.
  • To illustrate the above-described “more context” options 200 and 222, it is assumed that the user 11 has selected the “more context” option 222 associated with the first match presented in the HTML matches page 210. Referring to FIG. 14, there is shown an exemplary HTML more context page 230 for allowing the user 11 to browse locations within the corresponding video stream surrounding the location of that first match. The HTML more context page 230 includes an almost exact copy of the HTML query page 232, which contains an additional user-selectable “search this result” option 234 for allowing the user 11 to refine the results of a previous query. The HTML more context page 230 also includes a more context header 236 containing the title of the corresponding video stream and the language that is spoken in the corresponding video stream. The HTML more context page 230 further includes frames 238, each of which corresponds to an actual frame within the corresponding video stream. Each frame 238 includes a video icon 240, which functions in a manner similar to the previously-described video icons 194 and 220. Each frame 238 and corresponding video icon 240 are provided along with some detail about that frame, for example, the exact time location of the frame 238 within the corresponding video stream and a textual excerpt from the corresponding video stream. The HTML more context page 230 still further includes a user-selectable “backward” option 242 and a user-selectable “forward” option 244 for allowing the user 11 to browse further locations surrounding the location of the first match presented in the HTML matches page 210.
  • Lastly, it should be noted that the encoder client 14, the transcoder client 16, the annotation client 18, the browser client 20, the media database server 24, the librarian 28, the index database server 32, and the encoder 36 all involve the processing of input data and the generation of output data to some extent. The processing of the input data and the generation of the output data are preferably implemented by software programs. Thus, referring to FIG. 15, each of the above-described system components preferably comprises a processing device 250 including at least one processor (P) 252, memory (M) 254, and an input/output (I/O) interface 256, all connected to one another by a bus 258, for facilitating the implementation of input data processing and output data generation in each of the above-described system components.
  • The present invention is not to be limited in scope by the specific embodiments described herein. Indeed, various modifications of the present invention, in addition to those described herein, will be apparent to those of skill in the art from the foregoing description and accompanying drawings. Thus, such modifications are intended to fall within the scope of the appended claims.

Claims (22)

1. A method comprising:
identifying one or more annotations corresponding to at least one location within a media stream wherein each of the one or more annotations is associated with a measure of confidence in the accuracy of the annotation;
transmitting to a client device an icon associated with at least one of the identified annotations,
retrieving data corresponding to at least one of the identified annotations; and
transmitting said data to the client device.
2. The method of claim 1, wherein the media stream comprises a plurality of scenes.
3. The method of claim 2, wherein each annotation corresponds to at least one of the plurality of scenes and at least one annotation comprises information about the at least one of the plurality of scenes.
4. The method of claim 3, wherein the information comprises words used in at least one of the plurality of scenes.
5. The method of claim 3, wherein the information comprises an identification of a person in at least one of the plurality of scenes.
6. The method of claim 1 wherein the measure of confidence is a probability.
7. The method of claim 1 wherein the data corresponding to at least one of the identified annotations comprises at least one of video, audio, and text.
8. The method of claim 2, wherein:
one or more of the identified annotations comprises information about one or more of the plurality of scenes, and
the data corresponding to the one or more identified annotations comprises one or more of the plurality of scenes.
9. The method of claim 1, wherein:
said data comprises at least one of an image and a video, and
at least one of the one or more annotations includes a URL that links to the at least one of an image and a video.
10. The method of claim 1, wherein one or more annotations includes at least one of a start time corresponding to a location in the media stream where the annotation begins and an end time corresponding to a location in the media stream where the annotation ends.
11. The method of claim 1, wherein the data corresponding to at least one of the identified annotations is presented to the client device in response to a query.
12. A system comprising at least one processing device having software associated therewith that, when executed, causes the at least one processing device to perform a method comprising:
identifying one or more annotations corresponding to at least one location within a media stream wherein each of the one or more annotations is associated with a measure of confidence in the accuracy of the annotation;
transmitting to a client device an icon associated with at least one of the identified annotations,
retrieving data corresponding to at least one of the identified annotations; and
transmitting said data to the client device.
13. The system of claim 12, wherein the media stream comprises a plurality of scenes.
14. The system of claim 13, wherein each annotation corresponds to at least one of the plurality of scenes and at least one annotation comprises information about the at least one of the plurality of scenes.
15. The system of claim 14, wherein the information comprises words used in at least one of the plurality of scenes.
16. The system of claim 14, wherein the information comprises an identification of a person in at least one of the plurality of scenes.
17. The system of claim 12 wherein the measure of confidence is a probability.
18. The system of claim 12 wherein the data corresponding to at least one of the identified annotations comprises at least one of video, audio, and text.
19. The system of claim 13, wherein:
one or more of the identified annotations comprises information about one or more of the plurality of scenes, and
the data corresponding to the one or more identified annotations comprises one or more of the plurality of scenes.
20. The system of claim 12, wherein:
said data comprises at least one of an image and a video, and
at least one of the one or more annotations includes a URL that links to the at least one of an image and a video.
21. The system of claim 12, wherein one or more annotations includes at least one of a start time corresponding to a location in the media stream where the annotation begins and an end time corresponding to a location in the media stream where the annotation ends.
22. The system of claim 12, wherein the data corresponding to at least one of the identified annotations is presented to the client device in response to a query.
US13/929,678 1998-03-11 2013-06-27 Technique for processing data in a network Abandoned US20140019479A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US13/929,678 US20140019479A1 (en) 1998-03-11 2013-06-27 Technique for processing data in a network

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
US09/037,957 US6173287B1 (en) 1998-03-11 1998-03-11 Technique for ranking multimedia annotations of interest
US09/204,286 US6311189B1 (en) 1998-03-11 1998-12-03 Technique for matching a query to a portion of media
US09/814,213 US6799298B2 (en) 1998-03-11 2001-03-22 Technique for locating an item of interest within a stored representation of data
US10/935,120 US8060509B2 (en) 1998-03-11 2004-09-08 Technique for processing data in a network
US13/089,872 US8504576B2 (en) 1998-03-11 2011-04-19 Technique for processing data in a network
US13/929,678 US20140019479A1 (en) 1998-03-11 2013-06-27 Technique for processing data in a network

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US13/089,872 Continuation US8504576B2 (en) 1998-03-11 2011-04-19 Technique for processing data in a network

Publications (1)

Publication Number Publication Date
US20140019479A1 true US20140019479A1 (en) 2014-01-16

Family

ID=33032482

Family Applications (6)

Application Number Title Priority Date Filing Date
US09/814,213 Expired - Lifetime US6799298B2 (en) 1998-03-11 2001-03-22 Technique for locating an item of interest within a stored representation of data
US10/935,120 Expired - Fee Related US8060509B2 (en) 1998-03-11 2004-09-08 Technique for processing data in a network
US13/089,872 Expired - Fee Related US8504576B2 (en) 1998-03-11 2011-04-19 Technique for processing data in a network
US13/089,854 Expired - Fee Related US9122682B2 (en) 1998-03-11 2011-04-19 Technique for processing data in a network
US13/929,678 Abandoned US20140019479A1 (en) 1998-03-11 2013-06-27 Technique for processing data in a network
US14/798,510 Abandoned US20150317308A1 (en) 1998-03-11 2015-07-14 Technique for processing data in a network

Family Applications Before (4)

Application Number Title Priority Date Filing Date
US09/814,213 Expired - Lifetime US6799298B2 (en) 1998-03-11 2001-03-22 Technique for locating an item of interest within a stored representation of data
US10/935,120 Expired - Fee Related US8060509B2 (en) 1998-03-11 2004-09-08 Technique for processing data in a network
US13/089,872 Expired - Fee Related US8504576B2 (en) 1998-03-11 2011-04-19 Technique for processing data in a network
US13/089,854 Expired - Fee Related US9122682B2 (en) 1998-03-11 2011-04-19 Technique for processing data in a network

Family Applications After (1)

Application Number Title Priority Date Filing Date
US14/798,510 Abandoned US20150317308A1 (en) 1998-03-11 2015-07-14 Technique for processing data in a network

Country Status (1)

Country Link
US (6) US6799298B2 (en)

Families Citing this family (118)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6859799B1 (en) * 1998-11-30 2005-02-22 Gemstar Development Corporation Search engine for video and graphics
US9424240B2 (en) * 1999-12-07 2016-08-23 Microsoft Technology Licensing, Llc Annotations for electronic content
US7873649B2 (en) * 2000-09-07 2011-01-18 Oracle International Corporation Method and mechanism for identifying transaction on a row of data
KR101399240B1 (en) 2000-10-11 2014-06-02 유나이티드 비디오 프로퍼티즈, 인크. Systems and methods for delivering media content
US6785676B2 (en) * 2001-02-07 2004-08-31 International Business Machines Corporation Customer self service subsystem for response set ordering and annotation
US6778193B2 (en) 2001-02-07 2004-08-17 International Business Machines Corporation Customer self service iconic interface for portal entry and search specification
US6853998B2 (en) 2001-02-07 2005-02-08 International Business Machines Corporation Customer self service subsystem for classifying user contexts
US6873990B2 (en) 2001-02-07 2005-03-29 International Business Machines Corporation Customer self service subsystem for context cluster discovery and validation
US7774408B2 (en) * 2001-04-23 2010-08-10 Foundationip, Llc Methods, systems, and emails to link emails to matters and organizations
US7653631B1 (en) * 2001-05-10 2010-01-26 Foundationip, Llc Method for synchronizing information in multiple case management systems
US20040054968A1 (en) * 2001-07-03 2004-03-18 Daniel Savage Web page with system for displaying miniature visual representations of search engine results
US7146409B1 (en) 2001-07-24 2006-12-05 Brightplanet Corporation System and method for efficient control and capture of dynamic database content
JP2003122999A (en) * 2001-10-11 2003-04-25 Honda Motor Co Ltd System, program, and method providing measure for trouble
US7260439B2 (en) * 2001-11-01 2007-08-21 Fuji Xerox Co., Ltd. Systems and methods for the automatic extraction of audio excerpts
US8590013B2 (en) 2002-02-25 2013-11-19 C. S. Lee Crawford Method of managing and communicating data pertaining to software applications for processor-based devices comprising wireless communication circuitry
US20030167181A1 (en) * 2002-03-01 2003-09-04 Schwegman, Lundberg, Woessner & Kluth, P.A. Systems and methods for managing information disclosure statement (IDS) references
US20040199400A1 (en) * 2002-12-17 2004-10-07 Lundberg Steven W. Internet-based patent and trademark application management system
KR100781507B1 (en) * 2003-06-07 2007-12-03 삼성전자주식회사 Apparatus and method for displaying multimedia data, and recording medium having the method recorded thereon
US8321470B2 (en) * 2003-06-20 2012-11-27 International Business Machines Corporation Heterogeneous multi-level extendable indexing for general purpose annotation systems
US9026901B2 (en) * 2003-06-20 2015-05-05 International Business Machines Corporation Viewing annotations across multiple applications
US7315857B2 (en) * 2004-05-13 2008-01-01 International Business Machines Corporation Method and system for propagating annotations using pattern matching
US7900133B2 (en) 2003-12-09 2011-03-01 International Business Machines Corporation Annotation structure type determination
US7254593B2 (en) * 2004-01-16 2007-08-07 International Business Machines Corporation System and method for tracking annotations of data sources
JP4046086B2 (en) * 2004-01-21 2008-02-13 トヨタ自動車株式会社 Variable compression ratio internal combustion engine
EP2662784A1 (en) * 2004-03-15 2013-11-13 Yahoo! Inc. Search systems and methods with integration of user annotations
US7669117B2 (en) * 2004-03-18 2010-02-23 International Business Machines Corporation Method and system for creation and retrieval of global annotations
JP4709213B2 (en) * 2004-06-23 2011-06-22 オラクル・インターナショナル・コーポレイション Efficient evaluation of queries using transformations
US7516121B2 (en) * 2004-06-23 2009-04-07 Oracle International Corporation Efficient evaluation of queries using translation
US7668806B2 (en) * 2004-08-05 2010-02-23 Oracle International Corporation Processing queries against one or more markup language sources
US7548918B2 (en) * 2004-12-16 2009-06-16 Oracle International Corporation Techniques for maintaining consistency for different requestors of files in a database management system
US7627574B2 (en) * 2004-12-16 2009-12-01 Oracle International Corporation Infrastructure for performing file operations by a database server
US20060136508A1 (en) * 2004-12-16 2006-06-22 Sam Idicula Techniques for providing locks for file operations in a database management system
US7716260B2 (en) * 2004-12-16 2010-05-11 Oracle International Corporation Techniques for transaction semantics for a database server performing file operations
US7783708B2 (en) * 2005-01-27 2010-08-24 Microsoft Corporation Attachment browser
CA2635499A1 (en) 2005-02-12 2006-08-24 Teresis Media Management, Inc. Methods and apparatuses for assisting the production of media works and the like
US8577683B2 (en) 2008-08-15 2013-11-05 Thomas Majchrowski & Associates, Inc. Multipurpose media players
US20060212509A1 (en) * 2005-03-21 2006-09-21 International Business Machines Corporation Profile driven method for enabling annotation of World Wide Web resources
US7809675B2 (en) * 2005-06-29 2010-10-05 Oracle International Corporation Sharing state information among a plurality of file operation servers
US8224837B2 (en) * 2005-06-29 2012-07-17 Oracle International Corporation Method and mechanism for supporting virtual content in performing file operations at a RDBMS
US7660581B2 (en) 2005-09-14 2010-02-09 Jumptap, Inc. Managing sponsored content based on usage history
US9703892B2 (en) 2005-09-14 2017-07-11 Millennial Media Llc Predictive text completion for a mobile communication facility
US7548915B2 (en) 2005-09-14 2009-06-16 Jorey Ramer Contextual mobile content placement on a mobile communication facility
US8311888B2 (en) 2005-09-14 2012-11-13 Jumptap, Inc. Revenue models associated with syndication of a behavioral profile using a monetization platform
US8290810B2 (en) 2005-09-14 2012-10-16 Jumptap, Inc. Realtime surveying within mobile sponsored content
US10592930B2 (en) 2005-09-14 2020-03-17 Millenial Media, LLC Syndication of a behavioral profile using a monetization platform
US8156128B2 (en) 2005-09-14 2012-04-10 Jumptap, Inc. Contextual mobile content placement on a mobile communication facility
US7860871B2 (en) 2005-09-14 2010-12-28 Jumptap, Inc. User history influenced search results
US8666376B2 (en) 2005-09-14 2014-03-04 Millennial Media Location based mobile shopping affinity program
US8503995B2 (en) 2005-09-14 2013-08-06 Jumptap, Inc. Mobile dynamic advertisement creation and placement
US8660891B2 (en) 2005-11-01 2014-02-25 Millennial Media Interactive mobile advertisement banners
US7769764B2 (en) 2005-09-14 2010-08-03 Jumptap, Inc. Mobile advertisement syndication
US7752209B2 (en) 2005-09-14 2010-07-06 Jumptap, Inc. Presenting sponsored content on a mobile communication facility
US9058406B2 (en) 2005-09-14 2015-06-16 Millennial Media, Inc. Management of multiple advertising inventories using a monetization platform
US8131271B2 (en) 2005-11-05 2012-03-06 Jumptap, Inc. Categorization of a mobile user profile based on browse behavior
US7676394B2 (en) 2005-09-14 2010-03-09 Jumptap, Inc. Dynamic bidding and expected value
US9471925B2 (en) 2005-09-14 2016-10-18 Millennial Media Llc Increasing mobile interactivity
US10911894B2 (en) 2005-09-14 2021-02-02 Verizon Media Inc. Use of dynamic content generation parameters based on previous performance of those parameters
US8805339B2 (en) 2005-09-14 2014-08-12 Millennial Media, Inc. Categorization of a mobile user profile based on browse and viewing behavior
US7702318B2 (en) 2005-09-14 2010-04-20 Jumptap, Inc. Presentation of sponsored content based on mobile transaction event
US8364521B2 (en) 2005-09-14 2013-01-29 Jumptap, Inc. Rendering targeted advertisement on mobile communication facilities
US9201979B2 (en) 2005-09-14 2015-12-01 Millennial Media, Inc. Syndication of a behavioral profile associated with an availability condition using a monetization platform
US7912458B2 (en) 2005-09-14 2011-03-22 Jumptap, Inc. Interaction analysis and prioritization of mobile content
US20110313853A1 (en) 2005-09-14 2011-12-22 Jorey Ramer System for targeting advertising content to a plurality of mobile communication facilities
US7603360B2 (en) 2005-09-14 2009-10-13 Jumptap, Inc. Location influenced search results
US8832100B2 (en) 2005-09-14 2014-09-09 Millennial Media, Inc. User transaction history influenced search results
US9076175B2 (en) 2005-09-14 2015-07-07 Millennial Media, Inc. Mobile comparison shopping
US8027879B2 (en) 2005-11-05 2011-09-27 Jumptap, Inc. Exclusivity bidding for mobile sponsored content
US8209344B2 (en) 2005-09-14 2012-06-26 Jumptap, Inc. Embedding sponsored content in mobile applications
US8615719B2 (en) 2005-09-14 2013-12-24 Jumptap, Inc. Managing sponsored content for delivery to mobile communication facilities
US8364540B2 (en) 2005-09-14 2013-01-29 Jumptap, Inc. Contextual targeting of content using a monetization platform
US8238888B2 (en) 2006-09-13 2012-08-07 Jumptap, Inc. Methods and systems for mobile coupon placement
US8819659B2 (en) 2005-09-14 2014-08-26 Millennial Media, Inc. Mobile search service instant activation
US8302030B2 (en) 2005-09-14 2012-10-30 Jumptap, Inc. Management of multiple advertising inventories using a monetization platform
US10038756B2 (en) 2005-09-14 2018-07-31 Millenial Media LLC Managing sponsored content based on device characteristics
US8103545B2 (en) 2005-09-14 2012-01-24 Jumptap, Inc. Managing payment for sponsored content presented to mobile communication facilities
US7577665B2 (en) 2005-09-14 2009-08-18 Jumptap, Inc. User characteristic influenced search results
US8195133B2 (en) 2005-09-14 2012-06-05 Jumptap, Inc. Mobile dynamic advertisement creation and placement
US8433297B2 (en) 2005-11-05 2013-04-30 Jumptag, Inc. System for targeting advertising content to a plurality of mobile communication facilities
US8812526B2 (en) 2005-09-14 2014-08-19 Millennial Media, Inc. Mobile content cross-inventory yield optimization
US8229914B2 (en) 2005-09-14 2012-07-24 Jumptap, Inc. Mobile content spidering and compatibility determination
US8989718B2 (en) 2005-09-14 2015-03-24 Millennial Media, Inc. Idle screen advertising
US8688671B2 (en) 2005-09-14 2014-04-01 Millennial Media Managing sponsored content based on geographic region
US8175585B2 (en) 2005-11-05 2012-05-08 Jumptap, Inc. System for targeting advertising content to a plurality of mobile communication facilities
US8571999B2 (en) 2005-11-14 2013-10-29 C. S. Lee Crawford Method of conducting operations for a social network application including activity list generation
EP1813311A1 (en) * 2005-11-25 2007-08-01 Cognis IP Management GmbH Oil-in-water emulsions based on special emulsifiers
US20070143255A1 (en) * 2005-11-28 2007-06-21 Webaroo, Inc. Method and system for delivering internet content to mobile devices
US7610304B2 (en) * 2005-12-05 2009-10-27 Oracle International Corporation Techniques for performing file operations involving a link at a database management system
JP2007207328A (en) * 2006-01-31 2007-08-16 Toshiba Corp Information storage medium, program, information reproducing method, information reproducing device, data transfer method, and data processing method
US20070240060A1 (en) * 2006-02-08 2007-10-11 Siemens Corporate Research, Inc. System and method for video capture and annotation
US7529794B2 (en) * 2006-02-21 2009-05-05 International Business Machines Corporation Method and system for mediating published message streams for selective distribution
JP5649303B2 (en) * 2006-03-30 2015-01-07 エスアールアイ インターナショナルSRI International Method and apparatus for annotating media streams
US8095394B2 (en) * 2006-05-18 2012-01-10 Progressive Casualty Insurance Company Rich claim reporting system
US8856267B2 (en) * 2006-11-16 2014-10-07 Rangecast Technologies, Llc Network audio directory server and method
US7908260B1 (en) 2006-12-29 2011-03-15 BrightPlanet Corporation II, Inc. Source editing, internationalization, advanced configuration wizard, and summary page selection for information automation systems
US8316309B2 (en) * 2007-05-31 2012-11-20 International Business Machines Corporation User-created metadata for managing interface resources on a user interface
US9542394B2 (en) * 2007-06-14 2017-01-10 Excalibur Ip, Llc Method and system for media-based event generation
US8793256B2 (en) 2008-03-26 2014-07-29 Tout Industries, Inc. Method and apparatus for selecting related content for display in conjunction with a media
US20100091835A1 (en) * 2008-10-14 2010-04-15 Morris Robert P Method And System For Processing A Media Stream
US8429287B2 (en) * 2009-04-29 2013-04-23 Rangecast Technologies, Llc Network audio distribution system and method
KR20110047768A (en) 2009-10-30 2011-05-09 삼성전자주식회사 Apparatus and method for displaying multimedia contents
CN101853308A (en) * 2010-06-11 2010-10-06 中兴通讯股份有限公司 Method and application terminal for personalized meta-search
US9317861B2 (en) * 2011-03-30 2016-04-19 Information Resources, Inc. View-independent annotation of commercial data
JP5882683B2 (en) * 2011-11-02 2016-03-09 キヤノン株式会社 Information processing apparatus and method
US9396277B2 (en) * 2011-12-09 2016-07-19 Microsoft Technology Licensing, Llc Access to supplemental data based on identifier derived from corresponding primary application data
US8805418B2 (en) 2011-12-23 2014-08-12 United Video Properties, Inc. Methods and systems for performing actions based on location-based rules
LT2797921T (en) 2011-12-31 2017-11-27 Beigene, Ltd. Fused tetra or penta-cyclic dihydrodiazepinocarbazolones as parp inhibitors
BR112014016163A8 (en) 2011-12-31 2017-07-04 Beigene Ltd fused tetra or penta-cyclic pyridophthalazinones as parp inhibitors
US9608983B2 (en) * 2013-04-30 2017-03-28 Sensormatic Electronics, LLC Authentication system and method for embedded applets
US9817911B2 (en) 2013-05-10 2017-11-14 Excalibur Ip, Llc Method and system for displaying content relating to a subject matter of a displayed media program
US9020469B2 (en) 2013-06-04 2015-04-28 Rangecast Technologies, Llc Network audio distribution system and method
US9268861B2 (en) 2013-08-19 2016-02-23 Yahoo! Inc. Method and system for recommending relevant web content to second screen application users
CN105893387B (en) * 2015-01-04 2021-03-23 伊姆西Ip控股有限责任公司 Intelligent multimedia processing method and system
US10909131B1 (en) * 2017-04-28 2021-02-02 EMC IP Holding Company LLC Method and system for indexing and searching data sub-streams
US11259075B2 (en) * 2017-12-22 2022-02-22 Hillel Felman Systems and methods for annotating video media with shared, time-synchronized, personal comments
US11792485B2 (en) * 2017-12-22 2023-10-17 Hillel Felman Systems and methods for annotating video media with shared, time-synchronized, personal reactions
CN109241301A (en) * 2018-08-31 2019-01-18 北京优酷科技有限公司 Resource recommendation method and device
CN110602555B (en) * 2019-07-30 2021-01-01 华为技术有限公司 Video transcoding method and device
CN111966428B (en) * 2020-08-21 2022-07-15 支付宝(杭州)信息技术有限公司 Page processing method and device and page backtracking method and device

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5600775A (en) * 1994-08-26 1997-02-04 Emotion, Inc. Method and apparatus for annotating full motion video and other indexed data structures
US5675710A (en) * 1995-06-07 1997-10-07 Lucent Technologies, Inc. Method and apparatus for training a text classifier
US5703655A (en) * 1995-03-24 1997-12-30 U S West Technologies, Inc. Video programming retrieval using extracted closed caption data which has been partitioned and stored to facilitate a search and retrieval process
US5920317A (en) * 1996-06-11 1999-07-06 Vmi Technologies Incorporated System and method for storing and displaying ultrasound images
US6006241A (en) * 1997-03-14 1999-12-21 Microsoft Corporation Production of a video stream with synchronized annotations over a computer network
US6085185A (en) * 1996-07-05 2000-07-04 Hitachi, Ltd. Retrieval method and system of multimedia database
US20020099552A1 (en) * 2001-01-25 2002-07-25 Darryl Rubin Annotating electronic information with audio clips

Family Cites Families (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4616336A (en) 1983-05-11 1986-10-07 International Business Machines Corp. Independent image and annotation overlay with highlighting of overlay conflicts
US4864501A (en) 1987-10-07 1989-09-05 Houghton Mifflin Company Word annotation system
WO1989011693A1 (en) 1988-05-27 1989-11-30 Wang Laboratories, Inc. Document annotation and manipulation in a data processing system
US5146552A (en) 1990-02-28 1992-09-08 International Business Machines Corporation Method for associating annotation with electronically published material
CA2045907C (en) 1991-06-28 1998-12-15 Gerald B. Anderson A method for storing and retrieving annotations and redactions in final form documents
US5524193A (en) 1991-10-15 1996-06-04 And Communications Interactive multimedia annotation method and apparatus
EP0622930A3 (en) * 1993-03-19 1996-06-05 At & T Global Inf Solution Application sharing for computer collaboration system.
US5920694A (en) 1993-03-19 1999-07-06 Ncr Corporation Annotation of computer video displays
US5502727A (en) 1993-04-20 1996-03-26 At&T Corp. Image and audio communication system having graphical annotation capability
US5826025A (en) * 1995-09-08 1998-10-20 Sun Microsystems, Inc. System for annotation overlay proxy configured to retrieve associated overlays associated with a document request from annotation directory created from list of overlay groups
US5822539A (en) 1995-12-08 1998-10-13 Sun Microsystems, Inc. System for adding requested document cross references to a document by annotation proxy configured to merge and a directory generator and annotation server
KR19990072122A (en) * 1995-12-12 1999-09-27 바자니 크레이그 에스 Method and apparatus for real-time image transmission
US5832474A (en) * 1996-02-26 1998-11-03 Matsushita Electric Industrial Co., Ltd. Document search and retrieval system with partial match searching of user-drawn annotations
US5870754A (en) * 1996-04-25 1999-02-09 Philips Electronics North America Corporation Video retrieval of MPEG compressed sequences using DC and motion signatures
US5721827A (en) 1996-10-02 1998-02-24 James Logan System for electrically distributing personalized information
US6041335A (en) * 1997-02-10 2000-03-21 Merritt; Charles R. Method of annotating a primary image with an image and for transmitting the annotated primary image
US5930777A (en) * 1997-04-15 1999-07-27 Barber; Timothy P. Method of charging for pay-per-access information over a network
US5983218A (en) * 1997-06-30 1999-11-09 Xerox Corporation Multimedia database for use over networks
US6360234B2 (en) * 1997-08-14 2002-03-19 Virage, Inc. Video cataloger system with synchronized encoders
US6237011B1 (en) * 1997-10-08 2001-05-22 Caere Corporation Computer-based document management system
JP4183311B2 (en) * 1997-12-22 2008-11-19 株式会社リコー Document annotation method, annotation device, and recording medium
US6055538A (en) * 1997-12-22 2000-04-25 Hewlett Packard Company Methods and system for using web browser to search large collections of documents
US6173287B1 (en) * 1998-03-11 2001-01-09 Digital Equipment Corporation Technique for ranking multimedia annotations of interest
US6229524B1 (en) * 1998-07-17 2001-05-08 International Business Machines Corporation User interface for interaction with video
WO2002037763A1 (en) * 2000-11-06 2002-05-10 Matsushita Electric Industrial Co., Ltd. Transmitter, receiver, and broadcast data distribution method

Also Published As

Publication number Publication date
US20050044078A1 (en) 2005-02-24
US20010051958A1 (en) 2001-12-13
US8504576B2 (en) 2013-08-06
US8060509B2 (en) 2011-11-15
US20120209843A1 (en) 2012-08-16
US9122682B2 (en) 2015-09-01
US20150317308A1 (en) 2015-11-05
US6799298B2 (en) 2004-09-28
US20120207447A1 (en) 2012-08-16

Similar Documents

Publication Publication Date Title
US9122682B2 (en) Technique for processing data in a network
US6275827B1 (en) Technique for processing data
US7487072B2 (en) Method and system for querying multimedia data where adjusting the conversion of the current portion of the multimedia data signal based on the comparing at least one set of confidence values to the threshold
KR100798570B1 (en) System and method for indexing, searching, identifying, and editing portions of electronic multimedia files
US7149750B2 (en) Method, system and program product for extracting essence from a multimedia file received in a first format, creating a metadata file in a second file format and using a unique identifier assigned to the essence to access the essence and metadata file
CN1317661C (en) System and method for facilitating internet search by providing web document layout image
US7747629B2 (en) System and method for positional representation of content for efficient indexing, search, retrieval, and compression
US7617195B2 (en) Optimizing the performance of duplicate identification by content
US20080256050A1 (en) System and method for modeling user selection feedback in a search result page
US20020198962A1 (en) Method, system, and computer program product for distributing a stored URL and web document set
CN1809827A (en) System and process for network site fragmented search
US20140195523A1 (en) Method and system for indexing information and providing results for a search including objects having predetermined attributes
Bell et al. The MG retrieval system: compressing for space and speed
KR20000072232A (en) system of distribution for digital contents using internet
KR20050006565A (en) System And Method For Managing And Editing Multimedia Data
US20020059239A1 (en) Data managing method, data managing system, data managing apparatus, data handling apparatus, computer program, and recording medium
CA2339217A1 (en) Information access
JP2006185059A (en) Contents management apparatus
SAKAI et al. Searching multimedia information in distributed environment
Ioannou et al. Effective access to large audiovisual assets based on user preferences
Burnik et al. Content and presentation adaptation in hypermedia systems
Avrithis et al. Intelligent Semantic Access to Audiovisual Content
Wechsler et al. A new ranking principle for multimedia information retrieval
JP2002091977A (en) Computer and information system
Adistambha et al. Query streaming for multimedia query by content from mobile devices

Legal Events

Date Code Title Description
AS Assignment

Owner name: DIGITAL EQUIPMENT CORPORATION, TEXAS

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:DE VRIES, ARJEN P.;KONTOTHANASSIS, LEONIDAS;DUFAUX, FREDERIC;AND OTHERS;SIGNING DATES FROM 19990621 TO 19991102;REEL/FRAME:035592/0128

Owner name: ALTAVISTA COMPANY, CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:DIGITAL EQUIPMENT CORPORATION;REEL/FRAME:035592/0345

Effective date: 20000717

Owner name: ZOOM NEWCO INC., MASSACHUSETTS

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:COMPAQ COMPUTER CORPORATION;DIGITAL EQUIPMENT CORPORATION;REEL/FRAME:035592/0309

Effective date: 19990818

Owner name: OVERTURE SERVICES, INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:ALTA VISTA COMPANY;REEL/FRAME:035592/0375

Effective date: 20060612

Owner name: ALTAVISTA COMPANY, CALIFORNIA

Free format text: MERGER AND CHANGE OF NAME;ASSIGNORS:ZOOM NEWCO INC.;ALTAVISTA COMPANY;REEL/FRAME:035592/0280

Effective date: 19990818

Owner name: YAHOO! INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:OVERTURE SERVICES, INC.;REEL/FRAME:035592/0406

Effective date: 20091012

AS Assignment

Owner name: EXCALIBUR IP, LLC, CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:YAHOO! INC.;REEL/FRAME:038383/0466

Effective date: 20160418

AS Assignment

Owner name: YAHOO! INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:EXCALIBUR IP, LLC;REEL/FRAME:038951/0295

Effective date: 20160531

AS Assignment

Owner name: EXCALIBUR IP, LLC, CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:YAHOO! INC.;REEL/FRAME:038950/0592

Effective date: 20160531

AS Assignment

Owner name: EUREKA DATABASE SOLUTIONS, LLC, TEXAS

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:EXCALIBUR IP, LLC;REEL/FRAME:043587/0408

Effective date: 20170913

STCB Information on status: application discontinuation

Free format text: EXPRESSLY ABANDONED -- DURING EXAMINATION