WO2003056832A1 - Method, apparatus, and program for evolving algorithms for detecting - Google Patents


Info

Publication number
WO2003056832A1
Authority
WO
WIPO (PCT)
Prior art keywords
algorithm
parameters
information stream
predetermined content
media information
Application number
PCT/IB2002/005713
Other languages
French (fr)
Inventor
Lalitha Agnihotri
James D. Schaffer
Nevenka Dimitrova
Thomas F. M. Mcgee
Original Assignee
Koninklijke Philips Electronics N.V.
Application filed by Koninklijke Philips Electronics N.V. filed Critical Koninklijke Philips Electronics N.V.
Priority to AU2002367237A (AU2002367237A1)
Priority to EP02790652A (EP1464178B1)
Priority to KR10-2004-7010323A (KR20040070290A)
Priority to DE60219523T (DE60219523D1)
Priority to JP2003557215A (JP4347056B2)
Publication of WO2003056832A1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/432Content retrieval operation from a local storage medium, e.g. hard-disk
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7834Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using audio features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7847Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content
    • G06F16/785Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content using colour or luminescence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/41Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11BINFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/10Indexing; Addressing; Timing or synchronising; Measuring tape travel
    • G11B27/19Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier
    • G11B27/28Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41Structure of client; Structure of client peripherals
    • H04N21/414Specialised client platforms, e.g. receiver in car or embedded in a mobile appliance
    • H04N21/4147PVR [Personal Video Recorder]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/44008Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81Monomedia components thereof
    • H04N21/812Monomedia components thereof involving advertisement data
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/76Television signal recording
    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11BINFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B2220/00Record carriers by type
    • G11B2220/20Disc-shaped record carriers

Definitions

  • This invention relates generally to the detection of commercials or other predetermined content in video information streams, using a search algorithm, and in particular to a method, apparatus, and program for evolving algorithm parameters to accurately detect transitions from one type of content to another type of content, using a search algorithm, such as a genetic algorithm.
  • Personal video receivers/recorders and devices that modify and/or record the content of broadcast video are becoming increasingly popular.
  • An example of such a device is a personal video recorder that automatically records programs on a hard disk based on preferences of a user.
  • One of the features under investigation for such systems is content detection.
  • a system that can detect commercials may allow substitute advertisements to be inserted in a video stream ("commercial swapping") or the temporary halting of the video at the end of a commercial to prevent a user, who was distracted during a commercial, from missing any of the main program content.
  • Content detection also may enable users who are not interested in the content of commercials or promotions interposed within a recorded television program, to skip through those commercials either manually or by using a device designed to perform skipping automatically (see, e.g., U.S. Pat. No. 5,151,788).
  • One method is the detection of a high cut rate or sudden change in a scene with no fade or movement transition between temporally adjacent frames. Cuts can include fades so the cuts do not have to be hard cuts. A more robust criterion may be high transition rates.
  • Another indicator is the presence of a black frame (or unicolor/monochrome frame) coupled with silence, which may indicate the beginning of a commercial break. One or more black frames are usually found immediately before and after an individual commercial segment.
  • Another known indicator of commercials is high "activity", which is the rate of change in the luminance level between two different sets of frames. In commercials, objects and scenes generally move faster and change more frequently than during non-commercial video segments, and thus commercials typically are filled with "activity". When a low amount of activity is detected, the commercial is deemed to have ended, and a resumption in recording may follow.
  • Another known technique is to measure the temporal distance between black frame sequences to determine the presence of a commercial, based on whether the measured temporal distance exceeds or is less than a predetermined threshold. Still another technique identifies commercials based on matching images, wherein differences in the quality of image content are used as an indicator.
  • Another technique for identifying a black frame, such as that disclosed in U.S. Pat. No. 4,314,285 by Bonner et al., senses a drop in the voltage level of the input signal below a threshold.
  • Yet another technique, such as that disclosed in U.S. Pat. No. 5,333,091 by Iggulden et al., is to record an entire program including any commercials. A notation is made whenever a black frame is broadcast. After recordation, a processor determines whether the time period in between black frames was a commercial or a program. This is accomplished by a simple formula: if the time period is less than a threshold of five minutes, the period is deemed to be a commercial. During playback, the device fast-forwards the tape past the areas determined to be commercials. Unfortunately, the presence of two black frames within five minutes of each other is not necessarily representative of a commercial, as this could occur during a dimly lit or dark scene.
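  • As an illustration of the interval heuristic just described (a sketch only, not code from the cited patent), classifying the span between consecutive black frames might look like this:

```python
# Illustration of the five-minute black-frame interval heuristic described above.
# black_frame_times is assumed to be a sorted list of timestamps (in seconds) at
# which black frames were detected in the recording.
COMMERCIAL_MAX_GAP = 5 * 60  # the five-minute threshold from the description above

def classify_intervals(black_frame_times):
    """Label each interval between consecutive black frames as commercial or program."""
    labels = []
    for start, end in zip(black_frame_times, black_frame_times[1:]):
        kind = "commercial" if (end - start) < COMMERCIAL_MAX_GAP else "program"
        labels.append((start, end, kind))
    return labels

# A dimly lit scene with two black frames 90 s apart would be mislabelled here,
# which is exactly the weakness noted above.
print(classify_intervals([0.0, 90.0, 600.0]))
```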
  • the foregoing objects of the invention are realized by a method for optimizing the performance of an algorithm for detecting predetermined content, such as one or more commercials, in a media information stream (e.g., a video information divided into a plurality of frames, and/or audio information, etc.), and a program and apparatus that operate in accordance with the method.
  • the algorithm is a function of a set of parameters, wherein each set is also referred to herein as a chromosome, and each parameter preferably is a threshold value representing a condition that, depending on whether or not satisfied, tends to indicate whether or not predetermined subject matter (e.g., commercial or other subject matter) is present in the media information stream.
  • the method comprises the steps of performing the algorithm at least once to detect the predetermined content in the media information stream, while employing a respective set of parameters in the algorithm for each performance thereof, and automatically evolving at least one respective set of parameters employed in the algorithm to maximize the degree of accuracy at which the algorithm detects the predetermined content in the media information stream.
  • the algorithm detects the predetermined content, which may be desired or undesired content, based on a detection of at least one of a black frame/unicolor frame among the plurality of frames, an average cut frame distance, an average cut frame distance trend, a brand name, a cut and black frame, an average intensity color histogram, audio silence, a change in volume, frame similarity, character detection, a static image, or any other types of features of interest that may indicate the presence of predetermined content in a video or other type of media information stream.
  • the step of automatically evolving includes performing a search algorithm, preferably a genetic algorithm, to evolve the at least one respective set of parameters.
  • the evolving includes the steps of determining the accuracy at which the algorithm detects the predetermined content in the media information stream for each performance of the algorithm, selecting at least one of the respective sets of parameters, based on a result of the step of determining the accuracy, and producing at least one offspring set of parameters, based on the at least one set of parameters selected in the selecting step.
  • the offspring set(s) of parameters and/or original set(s) of parameters determined to yield the most accurate results are then employed in further, respective performances of the algorithm, and one or more further offspring sets of parameters are produced again, if needed, until a set of parameters which substantially maximizes the accuracy of the algorithm's performance is determined.
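  • As a minimal, self-contained sketch of the evolve-evaluate-select-reproduce loop summarized above, the following Python illustration uses stand-in random_chromosome, accuracy, and crossover helpers; these stand-ins are assumptions for illustration only, not the patent's detection algorithm or fitness measure:

```python
import random

BITS = 9  # each threshold is encoded as, e.g., a 9-bit value, as described below

def random_chromosome(n_thresholds=6):
    return [random.getrandbits(BITS) for _ in range(n_thresholds)]

def accuracy(chromosome, target):
    # Stand-in fitness: closeness of each threshold to a hidden "ideal" value.
    # In the patent's setting this would be the detection accuracy on labelled clips.
    return -sum(abs(a - b) for a, b in zip(chromosome, target))

def crossover(a, b):
    point = random.randrange(1, len(a))  # single random cross-over point
    return a[:point] + b[point:]

def evolve(pop_size=20, generations=100):
    target = random_chromosome()         # plays the role of the labelled sample clip
    population = [random_chromosome() for _ in range(pop_size)]
    for _ in range(generations):
        # Evaluate every parameter set, keep the better-performing half, and refill
        # the population with offspring of randomly paired survivors.
        population.sort(key=lambda c: accuracy(c, target), reverse=True)
        survivors = population[: pop_size // 2]
        offspring = [crossover(*random.sample(survivors, 2))
                     for _ in range(pop_size - len(survivors))]
        population = survivors + offspring
    return population[0]

print(evolve())
```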
  • once a high performance (also referred to as "optimum") set of parameters is determined, the algorithm can reside on a server and/or in local information appliances, and the set of parameters and/or algorithm itself can be downloaded from the server to the local information appliances or vice versa.
  • Fig. 1 is a block diagram of a hardware system 1 that is suitable for practicing this invention, wherein the system 1 comprises a server 2 and at least one user information appliance 4 that are constructed and operated in accordance with this invention, and which are bidirectionally coupled together through an interface 6.
  • Fig. 2 is an example of a plurality of chromosomes Cr1-Crn that may be stored in a memory 15 of the server 2 and/or a memory 18 of the user information appliance 4 of Fig. 1, wherein the chromosomes Cr1-Crn each include parameter or threshold values that are suitable for use in an algorithm for detecting predetermined content, such as commercials, in a video or other media information stream.
  • Fig. 3 is a logical flow diagram of a method in accordance with this invention for evaluating a video information stream for the presence of predetermined content, and for automatically varying parameters of the algorithm to enable the algorithm to detect the predetermined content with a maximum degree of accuracy.
  • Figs. 4a and 4b are a logical flow diagram showing in detail sub-steps performed during step 112 of Fig. 3.
  • Figs. 5a-5c show examples of chromosomes that may be employed in the method of Fig. 3, wherein Fig. 5a shows a representation of a cross-over point, Fig. 5b shows an example of the resulting offspring chromosomes, and Fig. 5c represents an example of a gene mutation of a chromosome.
  • Fig. 1 is a block diagram of a hardware system 1 that is suitable for practicing this invention.
  • the system 1 comprises a server 2 and at least one user information appliance 4.
  • the server 2 and information appliance 4 are bidirectionally coupled to one another through an interface 6.
  • the interface 6 may include various types of interconnecting equipment and interfaces for coupling the server 2 to the information appliance 4, such as, for example, one or more wires, cables, switches, routers, optical fibers, a wireless interface, and/or one or more networks (e.g., the Internet and/or other, proprietary network(s)), modems, and/or other suitable types of communication equipment/interfaces, depending on applicable system design and operating criteria, although, for convenience, no such equipment is shown in Fig. 1.
  • the individual information appliance 4 may include, for example, a PC, a personal video recorder (PVR), a video cassette recorder (VCR), a digital video recorder (DVR), a personal television receiver (PTR), a DVD player, and the like, although other suitable types of user information appliances also may be employed. Although only a single server 2 and a single user information appliance 4 are shown in Fig. 1, the number and variety of user information appliances that may be in communication with the server 2 can vary widely, as can the number of servers 2 that are in communication with individual user information appliances, depending upon, for example, user needs and geographic location(s), applicable system design and operating criteria, etc.
  • The teaching of this invention is not to be construed as being limited for use with any particular type of server computer or information appliance.
  • The teaching of this invention may be employed in conjunction with any suitable type of devices that are capable of processing media information, such as video information, audio information, and/or combination video/audio information, etc.
  • the server 2 is a computer or farm of computers that facilitate the transmission, storage, and reception of information between different points.
  • the server 2 preferably comprises a controller (such as one or more microprocessors and/or logic arrays) (CPU) 10 for performing arithmetic and/or logical operations required for program execution.
  • the controller 10 executes computer readable code, i.e., stored applications, such as those described below.
  • the server 2 also comprises at least one communication interface 8 for bidirectionally coupling the controller 10 to external interfaces, such as the interface 6 and any other interfaces (not shown) to which the server 2 may be coupled, for enabling the server 2 to transceive information with external source and destination devices (e.g., information appliance 4) coupled to those interfaces, although for convenience, only the interface 6 and appliance 4 are shown.
  • That information may include signaling information in accordance with the applicable external interface standard employed, video, audio, and other data.
  • the server 2 preferably also comprises one or more input user-interfaces 11 that are each coupled to the controller 10, and at least one output user-interface 13 that also is coupled to the controller 10.
  • the input user-interface 11 may include, for example, a keyboard, a mouse, a trackball, touch screen, and/or any other suitable type of user-operable input device(s)
  • the output user-interface 13 may include, for example, a video display, a liquid crystal or other flat panel display, a speaker, a printer, and/or any other suitable type of output device(s) for enabling a user to perceive outputted information.
  • the server 2 preferably also comprises one or more associated memories (e.g., disk drives, CD-ROM drives, read-only memories, and/or random access memories) 15 that are bidirectionally coupled to the controller 10.
  • the memory 15 stores temporary data and instructions, and also stores various application programs, routines and operating programs that are used by the controller 10 for controlling the overall operation of the server 2.
  • an operating system 17 such as UNIX or Windows NT, preferably is stored in the memory 15, and a number of applications such as, for example, a video encoder 19, a video decoder 21, a frame grabber 23, and a cut detector 24 also may be stored in the memory 15, although other types of operating systems and application software may be employed as well, and/or one or more of the applications, such as applications 19, 21, and 23, may be embodied as separate hardware components within the server 2, rather than as application software.
  • the video encoder 19 is employed by the controller 10 to encode video information in a conventional manner, when deemed necessary by the controller 10, and the video decoder 21 is employed by the controller 10 to decode compressed video data, when deemed necessary by the controller 10, in a conventional manner.
  • the frame grabber 23 includes software that is employed by the controller 10 to capture single frames from a video information stream for enabling the captured frames to be subsequently processed.
  • the cut detector 24 detects, for example, whether a change in scene has occurred, in a known manner.
  • the memory 15 also stores various counters and variables, such as, for example, an Actual#CommFrames variable, a CommercialProbability variable, a TotalIdentified counter, a LastUniColor variable, and a #CorrIdentified counter, which are employed in a manner as will be described below.
  • the memory 15 also stores routines for implementing a method in accordance with this invention for detecting predetermined content, such as, for example, commercials or other content, in video information streams using a predetermined content detection algorithm that is a function of a number of parameters (e.g., threshold values), and for automatically learning values of those parameters which are optimized to enable the content detection algorithm to detect the predetermined content with at least a predetermined level of accuracy. That method will be described below in relation to Figs. 3, 4a, and 4b.
  • the memory 15 preferably also stores a plurality of chromosomes that are employed in the routines for implementing the method of the invention.
  • Each chromosome preferably includes one or more of the parameters, which are of a type suitable for use in the predetermined content detection algorithm referred to above.
  • At least part of the content detection algorithm performs a technique, such as, for example, an average cut frame distance detection technique, an average cut frame distance trend detection technique, a brand name detection technique, a black frame detection technique, a cut and black frame detection technique, a frame similarity detection technique, a character detection technique, an average intensity color histogram technique, a static image detection technique, or any other existing or later developed techniques which can be used for detecting predetermined content in media information streams.
  • each chromosome Cr1-Crn includes a plurality of parameters, namely threshold values, of which the algorithm is a function.
  • the thresholds represent conditions that, depending on whether or not satisfied, indicate the presence or absence of commercial content.
  • the thresholds are represented by a bit string having a predetermined bit length (e.g., 9 bits), although in other embodiments, other suitable types of values may be employed instead of bit strings, depending on applicable operating criteria.
  • the following is an example of thresholds used in an algorithm for detecting commercial content in a video information stream, although it should be noted that other thresholds besides those defined below may be employed instead and/or in addition to those thresholds, depending on, for example, the type of content detection algorithm being employed.
  • SeparationThreshold (shown as "SeparationThld" in Fig. 2) - this threshold represents a predetermined minimum temporal distance that can be expected to exist between two commercials, and enables a detection to be made of a black/unicolor frame which precedes a potential beginning frame of a commercial segment (i.e., a frame including commercial content) in a video information stream.
  • a black/unicolor frame can be preliminarily identified as such a preceding frame of a commercial segment if the temporal distance between that frame and a black/unicolor frame which was previously detected as immediately following a last detected commercial segment is greater than the value of SeparationThreshold (as used herein, the term "commercial segment" means a collection of successive frames which include commercial content, and which form a single commercial).
  • UnicolorInSuccThreshold (shown as "UnicolorInSuccThld" in Fig. 2) - This threshold represents the minimum number of black/unicolor frames expected to separate commercial segments, or commercial and non-commercial program segments. The occurrence of a potential commercial ending can be recognized if the number of black/unicolor frames detected in succession (after a last frame of a commercial segment) is greater than UnicolorInSuccThreshold, as will be described below.
  • MinCommercialThreshold (shown as "MinCommercialThld" in Fig. 2) - This threshold represents a predetermined, expected minimum duration of an individual commercial segment, and is used in determining whether a potential commercial start and a potential commercial ending should be confirmed. Potential commercial starts and endings are not confirmed if it is determined that the temporal distance separating them is less than the value of MinCommercialThreshold.
  • MaxCommercialThreshold (shown as "MaxCommercialThld" in Fig. 2) - This threshold represents a predetermined, expected maximum duration of an individual commercial segment, and also is used in determining whether a potential commercial start and a potential commercial ending should be confirmed. Potential commercial starts and endings are not confirmed if it is determined that the temporal distance separating them is greater than the value of MaxCommercialThreshold.
  • RestartThreshold (shown as "RestartThld" in Fig. 2) - This threshold represents a predetermined number of frames which are reasonably expected to be included in a commercial segment (once a commercial segment has started). If the temporal distance between a current frame and a last detected commercial exceeds that threshold during the performance of the method described below, then no confirmation is made that the potential commercial segment is indeed a commercial.
  • DistForSuccThreshold (shown as "DistForSuccThld” in Fig. 2) - This threshold corresponds to a maximum, expected commercial duration, and represents a predetermined, maximum expected temporal distance between two consecutive black/unicolor frames separating consecutively detected commercials.
  • this threshold is employed to (1) determine whether or not a detected black/unicolor frame immediately succeeds a previously detected one (a detected black/unicolor frame is considered to be in succession with the previously detected one of those frames if the frames are temporally separated by less than the value of DistForSuccThreshold), (2) detect a first black/unicolor frame immediately following a commercial ending frame (this occurs if the temporal distance between two successively detected black frames is greater than 2*DistForSuccThreshold), and (3) determine whether or not a potential commercial start or ending determination should be confirmed.
  • The manner in which the various chromosome threshold values described above are employed in an exemplary embodiment of this invention will be described in detail below.
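  • As a concrete illustration of how the six thresholds described above might be carried together as a single chromosome, the sketch below decodes a concatenated bit string into named fields. The 9-bit field width follows the example given earlier; the decoding scheme and field order are assumptions:

```python
from dataclasses import dataclass

BITS_PER_THRESHOLD = 9
FIELDS = ["SeparationThreshold", "UnicolorInSuccThreshold", "MinCommercialThreshold",
          "MaxCommercialThreshold", "RestartThreshold", "DistForSuccThreshold"]

@dataclass
class Chromosome:
    SeparationThreshold: int
    UnicolorInSuccThreshold: int
    MinCommercialThreshold: int
    MaxCommercialThreshold: int
    RestartThreshold: int
    DistForSuccThreshold: int

    @classmethod
    def from_bitstring(cls, bits: str) -> "Chromosome":
        # bits is a concatenation of six 9-bit fields, most significant bit first.
        assert len(bits) == BITS_PER_THRESHOLD * len(FIELDS)
        values = [int(bits[i:i + BITS_PER_THRESHOLD], 2)
                  for i in range(0, len(bits), BITS_PER_THRESHOLD)]
        return cls(**dict(zip(FIELDS, values)))

# 54-bit example chromosome: all thresholds zero except the last one.
print(Chromosome.from_bitstring("0" * 53 + "1"))
```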
  • Video information originated from a source device such as the information appliance 4 or some other device (e.g., a video camera, etc.) (not shown) may be provided to the server 2 and inputted therein through the at least one communication interface 8.
  • the inputted video information may be digital or analog information, and may be in compressed or uncompressed form, depending on, for example, the type of source device and associated external interface employed.
  • An A/D converter 9a and a D/A converter 9b preferably also are included in the server 2, either as part of the controller 10 or as separate components.
  • the A/D converter 9a can be programmed by the controller 10 for enabling analog information received from an external interface, such as interface 6, to be converted into digital form.
  • the D/A converter 9b can be used by the controller 10 to convert digital information into corresponding analog information, before the information is outputted to the external interface 6, although, depending on the type of interface employed, that information need not be so converted before being forwarded to the interface 6.
  • the controller 10 also may employ the video decoder 21 to decode compressed video information inputted thereto, depending on applicable performance criteria, and may employ the video encoder 19 to encode video information before it is transmitted through the communication interface 8, depending on applicable performance criteria.
  • the user information appliance 4 preferably comprises at least one communication interface 14 and a controller 16 (CPU) bidirectionally coupled thereto.
  • the interface 14 bidirectionally couples the appliance 4 to one or more external communication interfaces, such as the interface 6 and any other external interfaces (not shown) to which the information appliance 4 may be coupled.
  • the interface 14 enables the appliance 4 to transceive information with external source and destination devices (e.g., server 2) that may be coupled thereto, although for convenience, only the server 2 and one external interface 6 are shown. That information may include signaling information in accordance with the applicable external interface standard employed, video, audio, and other data.
  • a user interface of the user information appliance 4 includes an output user interface, such as a display 36, and an input user device, typically a key matrix 20, all of which are coupled to the controller 16, although in other embodiments, other suitable types of output and input user interfaces also may be employed.
  • the key matrix 20 includes various user-interface keys that are used for initiating some operation of the user information appliance 4, such as, for example, PLAY, FAST FORWARD, STOP, REWIND, and PAUSE keys, various menu scrolling keys, etc.
  • a MARK key for marking commercial content also may be included in the key matrix 20.
  • the user information appliance 4 also includes various memories, such as a RAM and a ROM, shown collectively as the memory 18.
  • the memory 18 may store temporary data and instructions, various counters and other variables, and preferably also stores various applications, routines, and operating programs 27.
  • the memory 18 may store a video encoder 33, a video decoder 35, a cut detector 29, and a frame grabber 31, although other types of operating systems and application software may be employed instead and/or one or more of the applications, such as applications 33, 35, and 31, may be embodied as separate hardware components within the appliance 4, rather than as application software.
  • the video encoder 33 stored in information appliance 4 may be employed by the controller 16 to encode video information
  • the video decoder 35 may be employed by the controller 16 to decode compressed video data, in a conventional manner.
  • the frame grabber 31 includes software that is employed by the controller 16 to capture single frames from a video signal stream, for enabling the captured frames to be subsequently processed.
  • the cut detector 29 detects, for example, whether a change in scene has occurred, in a known manner.
  • at least some of the routines stored in the memory 18 implement a method in accordance with this invention, to be described below in relation to Figs. 3, 4a, and 4b.
  • the memory 18 also stores at least some of the various counters, variables, and/or chromosomes described above in relation to the server 2, although for convenience, they will not now be further described.
  • Input video information originated from a source device such as the server 2 or some other source device (e.g., a video camera, etc.) (not shown), may be received within the appliance 4 through the at least one communication interface 14.
  • the video information inputted into the information appliance 4 may be in digital or analog form, compressed or uncompressed, depending on, for example, the type of source device and associated external interface employed.
  • an A/D converter 11a and a D/A converter 11b also may be included in the information appliance 4, either as part of the controller 16 or as separate components.
  • the A/D converter 11a may be programmed by the controller 16 for enabling analog information received by the appliance 4 from an external interface, such as interface 6, to be converted into digital form, before being provided to the controller 16.
  • the D/A converter 11b may be employed to convert digital information into corresponding analog information, before the information is outputted to the external interface 6, although, depending on the type of interface 6 employed, that information need not be so converted before being forwarded to the interface 6.
  • a user can identify selected individual frames or other segments of a sample video clip as either including predetermined content, such as commercial subject matter, or as not including such predetermined content. Thereafter, the sample video clip is automatically evaluated for the presence of such predetermined content using a predetermined content detection algorithm that is a function of a number of parameters, such as the chromosome threshold values described above.
  • threshold values are then evolved, if needed, in successive iterations of the algorithm (the evolution occurs through use of a super-algorithm, such as a genetic algorithm), to increase the accuracy of the detections, until the threshold values are considered to be optimized for enabling the algorithm to detect the predetermined content with maximum accuracy amongst all employed values.
  • the method of the invention may be employed for use in detecting other types of information content of interest, such as, for example, explicit, violent, or other content types, depending on the application of interest.
  • In step 100, the method is started, and it is assumed that the server 2 is provided with at least one sample video clip that is stored in the memory 15, and that the sample video clip includes at least one commercial segment (as pointed out above, as used herein, the term "commercial segment" means one or more successive video frames having commercial content, and forming a single commercial) and boundaries of the commercial segments (customarily, a generous sample of video clips having a variety of commercials would be employed to ensure that robust algorithm chromosomes are determined).
  • those boundaries may include black/unicolor frames, as described herein, wherein one or more of those frames appear (temporally) immediately before and others immediately after each commercial segment, although other suitable types of boundaries may be employed instead, depending on the type of content detection algorithm employed.
  • the sample video clip is stored in the memory 15 in association with content identifier information specifying (1) which particular frames of the clip include and/or do not include commercial content, (2) frame numbers identifying those frames, and (3) a variable Actual#CommFrames representing the total number of frames including commercial content.
  • the sample video clip and content identifier information may be downloaded from any external source through the interface 8, in which case the video clip and content identifier information are forwarded to the controller 10 and then stored by the controller 10 in the memory 15.
  • the video clip may be, for example, a portion of a television signal or internet file broadcast downloaded from the interface 6, a video clip uploaded from the user information appliance 4, a video clip downloaded from a particular web site, or a video signal originated from any other source (not shown) that may be coupled to the server 2.
  • the content identifier information may be stored in the memory 15 after the sample video clip already is stored in that memory. For example, while viewing individual frames of the sample video clip on the display 13, the user may enter content identifier information specifying whether or not each individual frame includes commercial content, into the server memory 15 through the input user interface 11, and then that information is stored in association with the frame information.
  • Also in step 100, the thresholds of the individual chromosomes Cr1-Crn are initialized to some predetermined values (e.g., represented by a bit string), either as specified by the user or by the routine stored in memory 15, and thus an initial population P(t) of the chromosomes is provided, where, for the purposes of this description, "t" is a variable representing the population level.
  • In step 110, it is assumed that, for example, the user operates the input user interface 11 to enter command information into the controller 10 specifying that the sample video clip be examined for the presence of predetermined content, namely, in this example, commercial subject matter.
  • the controller 10 performs a predetermined content detection algorithm that is identified as step 112 in Fig. 3.
  • The predetermined content detection algorithm is shown in further detail by the method steps of Figs. 4a and 4b, and is performed to evaluate the sample video clip for the presence of commercial content based on the threshold values within each chromosome of the population P(t).
  • the algorithm is performed separately for each chromosome of the population P(t), so that multiple performances of the algorithm occur, either in parallel or in series with one another, and so that there is at least one performance of the algorithm for each chromosome.
  • the following description will be made in the context of the performance of the content detection algorithm for only a single one of the chromosomes, although it should be understood that the algorithm is performed for each chromosome separately.
  • In step 200 of Fig. 4a, the content detection algorithm is entered, and it is assumed that the frame grabber 23 detects a first video frame in the sample video clip, using a known frame detection technique ("Yes" in step 200). Thereafter, control passes to step 202 where the cut detector 24 determines whether or not a cut (i.e., a change in a scene or a change in content) has occurred based on the content of the detected frame relative to that of an immediately-preceding detected frame, if any, using a known cut detection technique. If no cut is detected in step 202 ("No" at step 202), then control passes to step 206 where the method continues in a manner described below.
  • If a cut is detected in step 202 ("Yes" at step 202), control passes to step 204 where the controller 10 sets the CommercialProbability variable equal to '1'. Control then passes to step 206.
  • The frame detection step 200 and cut detection step 202 may be performed using any suitable, known frame detection and cut detection techniques, respectively, such as, for example, those described in U.S. Patent 6,100,941, which, as pointed out above, is incorporated by reference herein.
  • In step 206, the controller 10 determines whether or not the frame detected in step 200 is a black or unicolor frame, using a known black frame/unicolor frame detection technique (such as, for example, a black frame/unicolor frame technique described in U.S. Patent No. 6,100,941 or U.S. Patent Application No. 09/417,288, or any other suitable black frame/unicolor frame detection technique). If the performance of step 206 results in a determination that the frame detected in step 200 is not a black or unicolor frame ("No" in step 206), control passes through connector A to step 220 of Fig. 4b, where the method then continues in a manner as will be described below.
  • If the performance of step 206 results in a determination that the frame detected in step 200 is a black or unicolor frame ("Yes" in step 206), then control passes to step 208 where the controller 10 increments the value of the counter UniColorInSuccession (originally initialized to '0' when the algorithm in step 112 is initially entered) stored in memory 15 by '1', and also updates the value of the variable LastUniColor so that it represents the number of the current frame. Thereafter, a number of steps are performed to determine whether or not a potential start or ending of a commercial exists near the detected black/unicolor frame in the video clip.
  • In step 210, the controller 10 determines whether or not the temporal distance between the newly detected black or unicolor frame and a last detected black/unicolor frame (if any) exceeds the value of DistForSuccThreshold.
  • the temporal distance may be calculated by first subtracting the number of the newly-detected black/unicolor frame (identified by, e.g., LastUniColor) from that of the last detected black/unicolor frame (if any), and then multiplying the subtraction result by the inverse of an applicable, predetermined frame rate (e.g., the inverse of either 25 frames/second, 30 frames/second, or 24 frames/second) to convert the subtraction result into units of time.
  • The product of that multiplication represents the temporal distance in question, and is compared to the value of DistForSuccThreshold to determine whether or not the temporal distance exceeds that threshold value. If the performance of step 210 results in a determination of "No", which indicates that the current frame probably is not located immediately prior to a frame that includes commercial content, then control passes to step 214, which is performed in a manner as will be described below.
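  • The frame-number-to-time conversion used in step 210 (and in the similar comparisons below) amounts to dividing the frame-number difference by the frame rate; a small sketch follows, where the 30 frames/second default and the example threshold value are assumptions:

```python
def temporal_distance(frame_a: int, frame_b: int, frame_rate: float = 30.0) -> float:
    # Frame-number difference multiplied by the inverse of the frame rate
    # (e.g., 1/25, 1/30, or 1/24 seconds per frame) gives a distance in seconds.
    return abs(frame_a - frame_b) / frame_rate

dist_for_succ_threshold = 1.0  # seconds (illustrative value only)
print(temporal_distance(905, 900) < dist_for_succ_threshold)  # True: ~0.17 s at 30 fps
```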
  • In step 212, the controller 10 determines whether or not the temporal distance between a last detected commercial (if any) and the newly-detected black/unicolor frame exceeds the value of the SeparationThreshold, or if no commercial segment was previously detected in the sample video clip (e.g., a first commercial segment may be present). For example, that temporal distance may be calculated by first subtracting the number of the newly-detected black/unicolor frame from the number of the last frame of a last detected commercial segment (if any was determined based on a previous performance of step 228 of Fig. 4b), and then converting the subtraction result into units of time in the manner described above.
  • If the performance of step 212 results in a determination of "Yes", then control passes to step 216 where the controller 10 recognizes that the frame detected in step 200 is potentially adjacent to a next, beginning frame of a commercial segment. If the performance of step 212 results in a determination of "No", which indicates that the current frame likely is not located adjacent to a next, beginning frame of a commercial segment, then control passes to step 214, which will now be described.
  • In step 214, the controller 10 determines whether or not the number of black/unicolor frames (e.g., UniColorInSuccession) that have been detected since the algorithm was initiated in step 112 (Fig. 3) exceeds UnicolorInSuccThreshold. If the performance of step 214 results in a determination of "Yes", then control passes to step 218 where the controller 10 recognizes that the frame detected in step 200 is potentially one of a series of black/unicolor frames following the ending of a commercial segment (i.e., the potential end of the commercial segment is recognized). If the performance of step 214 results in a determination of "No", then control passes through connector A to step 220 of Fig. 4b.
  • In step 220, a decision is made as to whether or not the potential presence of a commercial segment was previously determined to exist, and control then passes to either step 232 (to be described below) or step 222, based on the result of that determination. For example, if step 220 was entered directly after a determination of "No" in step 206, then the performance of step 220 results in control being passed to step 232. If step 220 was entered directly after a potential beginning of a commercial segment was recognized in step 216 or after a potential ending of a commercial segment was recognized in step 218, then the performance of step 220 results in control being passed to step 222, which will now be described.
  • In step 222, the controller 10 determines whether the approximate duration of the potential commercial segment is within a predetermined time period. For example, step 222 may be performed by the controller 10 determining whether the temporal distance separating a frame last determined to be a black/unicolor frame potentially following the end of a commercial segment (in a last performance of step 218) and an earlier frame determined to be potentially located adjacent to a next, beginning frame of a commercial segment (in an earlier performance of step 216), is greater than the value of the MinCommercialThreshold and less than the value of the MaxCommercialThreshold. If the performance of step 222 results in a determination of "No", then control passes to step 232.
  • In step 224, the controller 10 examines the value of the CommercialProbability variable to determine whether or not it is equal to '1'. If it is determined in step 224 that the value of the CommercialProbability variable is equal to '1', then control passes to step 230 where the controller 10 stores in the memory 15 a record confirming that the frame detected in step 200 is a black or unicolor frame that is located temporally adjacent to a next, beginning frame of a commercial segment.
  • In step 226, the controller 10 determines whether or not the temporal distance between the current black or unicolor frame (detected in step 206) and a last detected black/unicolor frame (if any) exceeds 2*DistForSuccThreshold. If the performance of step 226 results in a determination of "Yes", which confirms that the current black/unicolor frame is a first black/unicolor frame appearing immediately after a last frame (that includes commercial content) of a commercial segment, then control passes to step 228 where the controller 10 stores in the memory 15 a record indicating such and confirming that the presence of a commercial segment ending has been detected. If the performance of step 226 results in a determination of "No", then control passes to step 232, which will now be described.
  • In step 232, the controller 10 determines whether or not (1) the temporal distance between the frame (if any) confirmed in step 228 or 230 and an earlier frame (if any) last confirmed in an earlier performance of step 228 or 230 exceeds the value of RestartThreshold, or (2) the temporal distance between the current black/unicolor frame and a last-detected black/unicolor frame (if any) exceeds the value of DistForSuccThreshold, wherein the temporal distances may be determined and compared to the corresponding threshold values in the manner described above. If the performance of step 232 results in a determination of "Yes", then control passes to step 234 where the controller 10 sets the value of the CommercialProbability variable to '0' to indicate that no commercial segment was detected in the previously-described method steps. Thereafter, control passes through connector B back to step 200 of Fig. 4a.
  • If, on the other hand, the performance of step 232 results in a determination of "No", then control passes to step 233 where, if the last confirm step performed was step 228 ("Yes" in step 233), control is passed to step 233' where the controller 10 stores information in the memory 15 specifying that the frames (identified by corresponding frame numbers) appearing temporally between the black/unicolor frame confirmed in step 228 (as being located immediately after a last frame of a commercial segment) and a black/unicolor frame which was last confirmed in earlier step 230 (as being located immediately prior to a first frame of the commercial segment), include commercial content and collectively represent a commercial segment.
  • In step 236, the controller 10 increases the value of the TotalIdentified counter (originally initialized at '0' when step 112 was entered) stored in memory 15 by the number of frames identified in step 233' as including commercial content, and control then passes through connector B back to step 200 of Fig. 4a. If, on the other hand, step 230 was the last confirm step performed ("No" in step 233), control passes directly through connector B back to step 200 of Fig. 4a.
  • Then, step 200 is performed again in the above-described manner. If the performance of that step results in there being a detection of a next video frame in the sample video clip ("Yes" in step 200), then control passes to step 202 where the method then continues in the above-described manner. If no other frames are detected ("No" in step 200), then control passes to step 114 of Fig. 3, where the method then continues in the following manner. Step 114 of Fig. 3 is entered into after the algorithm of step 112 is performed for each chromosome of the set of chromosomes of population P(t) stored in the memory 15.
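  • For orientation, the per-frame logic of steps 200-236 can be condensed into the much-simplified sketch below. It is not a faithful reproduction of every branch of Figs. 4a and 4b, and the input format (per-frame cut and black/unicolor flags) and the units of the thresholds are assumptions:

```python
def detect_commercials(frames, thr, frame_rate=30.0):
    """frames: iterable of (frame_no, is_cut, is_black); thr: dict of thresholds,
    temporal ones in seconds. Returns confirmed (start_frame, end_frame) segments."""
    def seconds(n_frames):
        return n_frames / frame_rate

    segments = []                 # confirmed commercial segments
    commercial_probability = 0    # set to 1 whenever a cut is detected (cf. step 204)
    last_black = None             # frame number of the last black/unicolor frame
    black_in_succession = 0       # cf. the UniColorInSuccession counter
    potential_start = None        # black frame preceding a suspected commercial
    last_commercial_end = None    # last frame of the last confirmed commercial

    for frame_no, is_cut, is_black in frames:
        if is_cut:
            commercial_probability = 1
        if not is_black:
            continue
        black_in_succession += 1
        gap = seconds(frame_no - last_black) if last_black is not None else None

        # Potential start: the black frame is far from both the previous black frame
        # and the last confirmed commercial (cf. steps 210, 212, 216).
        if (gap is None or gap > thr["DistForSuccThreshold"]) and (
                last_commercial_end is None
                or seconds(frame_no - last_commercial_end) > thr["SeparationThreshold"]):
            potential_start = frame_no

        # Potential end: enough black frames have been seen (cf. steps 214, 218) and the
        # candidate segment passes the duration and cut checks (cf. steps 222-228).
        elif (black_in_succession > thr["UnicolorInSuccThreshold"]
              and potential_start is not None
              and commercial_probability == 1
              and thr["MinCommercialThreshold"]
                  < seconds(frame_no - potential_start)
                  < thr["MaxCommercialThreshold"]):
            segments.append((potential_start, frame_no))
            last_commercial_end = frame_no
            potential_start = None
            commercial_probability = 0

        last_black = frame_no
    return segments
```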
  • The performance of step 112 for each of the initial chromosomes Cr1-Crn of population P(t) results in there being stored in the memory 15, for each chromosome, a respective TotalIdentified counter value representing the total number of frames (if any) in the sample video clip that were identified as including commercial content by the commercial detection algorithm employing that chromosome in step 112 (see, e.g., step 236 of Fig. 4b), and information specifying the frame numbers of the set(s) of those frames (see, e.g., step 233').
  • In step 114, the controller 10 determines, for each individual chromosome of the population P(t), whether or not the frames (if any) identified in step 233' during the performance of step 112 for that chromosome were correctly identified as including commercial content, by correlating the identified frames to the corresponding content identifier information (specifying whether or not the frames include commercial content) originally stored in memory 15 in step 100 of Fig. 3. For example, assuming that a particular frame was identified as including commercial content during the earlier performance of the algorithm of step 112 for a particular chromosome, and assuming that the content identifier information stored in memory 15 specifies that the same frame does indeed include commercial content, then that frame is determined in step 114 as having been correctly identified as including commercial content.
  • A similar determination is made in step 114 for each frame identified in earlier step 233' for each chromosome. Then control passes to step 115 where the controller 10 updates the value of #CorrIdentified associated with each initial chromosome Cr1-Crn of the population P(t) stored in memory 15 so that, for each chromosome, the updated value specifies the number of frames which were determined in step 114 as having been correctly identified (as including commercial content during the performance of step 112) for that chromosome. As a result, a separate value of #CorrIdentified is provided for each chromosome (to indicate the number of frames correctly identified during the algorithm performed using that chromosome).
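  • The bookkeeping of steps 114 and 115 amounts to intersecting each chromosome's set of identified frames with the ground-truth commercial frames recorded in the content identifier information; a minimal sketch (the set-based data structures are assumptions):

```python
def count_correctly_identified(identified_frames, ground_truth_commercial_frames):
    """Both arguments are collections of frame numbers; returns #CorrIdentified."""
    return len(set(identified_frames) & set(ground_truth_commercial_frames))

print(count_correctly_identified({10, 11, 12, 50}, {10, 11, 12, 13}))  # 3
```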
  • After step 115, control passes to step 116 where the values of the counters #CorrIdentified (updated in step 115) and TotalIdentified (updated in step 112) associated with each chromosome are employed by the controller 10 to determine a Recall and a Precision for that chromosome, using the following formulas F1 and F2, respectively:
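  • The formulas themselves are not reproduced in this extract; from the definitions of the counters above (Actual#CommFrames, TotalIdentified, #CorrIdentified), F1 and F2 are presumably the standard recall and precision ratios:

```latex
\text{Recall} = \frac{\#\text{CorrIdentified}}{\text{Actual\#CommFrames}} \qquad (\text{F1})
\qquad\qquad
\text{Precision} = \frac{\#\text{CorrIdentified}}{\text{TotalIdentified}} \qquad (\text{F2})
```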
  • Control then passes to step 117, where certain of the chromosomes are selected in accordance with a predetermined selection strategy that is based on the Recall and Precision values determined in step 116.
  • Any suitable type of selection strategy may be employed in step 117, such as, for example, a stochastic selection process, a random process with a probability of selection that is proportional to fitness, a strategy which selects the chromosomes yielding the highest 50% of all of the Precision and Recall values determined in step 116, a strategy which selects chromosomes yielding Recall and Precision values equaling or exceeding a predetermined value, or another suitable fitness selection strategy, etc., depending on predetermined operating criteria.
  • In one embodiment, step 117 is performed by determining the fitness (F) of each chromosome, using the following formula (F3), and by then selecting the chromosomes yielding the highest 50% of all of the calculated fitness values (F):
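  • Formula F3 likewise does not appear in this extract. Purely as an illustration (an assumption, not necessarily the patent's F3), a single fitness value that rewards both measures could be taken as the harmonic mean of Recall and Precision:

```latex
F = \frac{2 \cdot \text{Recall} \cdot \text{Precision}}{\text{Recall} + \text{Precision}}
```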
  • After step 117, control passes to step 118 where, according to one embodiment of the invention, each individual chromosome selected in step 117 is randomly paired with another one of those selected chromosomes, and then mated with that other selected chromosome, if the paired chromosomes are determined to be non-incestuous. For example, in one embodiment, after the chromosomes are paired together in step 118 (Fig. 3), a determination is made as to whether the corresponding bits of each pair of chromosomes differ from one another by more than an incest threshold value, such as a predetermined portion of the bit string length or some other suitable value.
  • For example, six of the corresponding bits of the chromosome pair in Fig. 5a differ from one another, and thus, in a case where the incest threshold value is 1/4 of the bit string length, the performance of that portion of step 118 results in a determination that those chromosomes are not incestuous.
• The chromosomes determined to be non-incestuous are then mated by randomly choosing a cross-over point 300, and then swapping the bits of the pair appearing after the cross-over point so that offspring chromosomes are generated (or this may be accomplished using HUX; see the Eshelman publication).
  • Fig. 5b shows an example of such offspring chromosomes Crkl and Crk2 generated by the parent chromosomes of Fig. 5a (step 118).
  • the crossover operation may be performed in any suitable manner known in the art, such as that described in relevant portions of the Eshelman publication referred to above.
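As a minimal sketch of the pairing, incest test, and cross-over of step 118: the function names below are assumptions for illustration, the incest test is expressed as a Hamming-distance comparison (consistent with the bit-difference example of Fig. 5a), and single-point cross-over is shown rather than the HUX alternative mentioned above.

```python
import random

def hamming_distance(a: str, b: str) -> int:
    """Number of corresponding bit positions at which two bit strings differ."""
    return sum(x != y for x, y in zip(a, b))

def is_incestuous(a: str, b: str, incest_threshold: int) -> bool:
    """A pair is treated as incestuous if it differs in too few bit positions."""
    return hamming_distance(a, b) <= incest_threshold

def single_point_crossover(a: str, b: str) -> tuple[str, str]:
    """Swap the bits appearing after a randomly chosen cross-over point."""
    point = random.randint(1, len(a) - 1)
    return a[:point] + b[point:], b[:point] + a[point:]

# Example: mate one randomly paired couple only if it passes the incest test,
# with the incest threshold set to 1/4 of the bit string length.
parent1, parent2 = "101101011001011011", "001100111101010010"
incest_threshold = len(parent1) // 4
if not is_incestuous(parent1, parent2, incest_threshold):
    child1, child2 = single_point_crossover(parent1, parent2)
```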
• In another embodiment, the production of offspring in step 118 may be performed by, for example, randomly mutating the value of each chromosome by flipping a predetermined portion (e.g., 35%) of the bits of that chromosome, at random (with independent probability), in a manner known in the art.
  • Fig. 5c shows an example of one of the parent chromosomes Crl of Fig. 5a and an offspring chromosome Crkl resulting from the mutation of that parent chromosome.
• In yet another embodiment, the operations of step 118 may be performed by randomly choosing a cross-over point and swapping bits in the above-described manner, and then randomly mutating the resultant bit strings (individual bits), or vice versa, in the manner described above.
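For the mutation variant of step 118, a minimal sketch of flipping each bit independently with a fixed probability (the 35% figure is the example value given above; the function name is an assumption):

```python
import random

def mutate_independent(bits: str, p_flip: float = 0.35) -> str:
    """Flip each bit independently with probability p_flip (e.g., 35%)."""
    return ''.join(('1' if b == '0' else '0') if random.random() < p_flip else b
                   for b in bits)

offspring = mutate_independent("101101011001011011")  # one mutated copy of a parent
```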
• The performance of step 118 results in a plurality of offspring chromosomes Crkl-Crki being provided (which hereinafter also are referred to collectively as offspring population K(t)) (assuming, in the case of sexual reproduction, that at least one of the parent chromosome pairs was determined to be non-incestuous in that step, wherein for that embodiment each pair of offspring chromosomes was generated from a corresponding pair of parent chromosomes).
• In step 120, each of the chromosomes Crkl-Crki is employed, in lieu of the parent chromosomes Crl-Crn of initial population P(t), in the content detection algorithm described above in relation to step 112, and then, after that algorithm is performed for each chromosome Crkl-Crki, steps that are the same as steps 113-116 are performed for each of those chromosomes Crkl-Crki. That is, step 120 is performed in the same manner as steps 112-116 described above, except that the offspring chromosomes Crkl-Crki are employed in those steps in place of the parent chromosomes Crl-Crn of initial population P(t).
• The performance of step 120 thus results in a determination of the fitness value (F) yielded by each offspring chromosome Crkl-Crki (as in step 116), in the same manner as described above.
• Thereafter, control passes to step 122 where, in accordance with one embodiment of the invention, another selection of chromosomes is made, but this time the selection is made from amongst all chromosomes of the previous chromosome population P(t) (e.g., Crl-Crn) and all chromosomes of offspring population K(t) (e.g., Crkl-Crki), to generate a new population P(t+1), by employing the same chromosome fitness selection strategy as that described above in relation to step 117, or any other suitable existing or later developed selection strategy.
• In step 124, a convergence determination is made by determining whether (a) the value of the incest threshold is equal to '0' and (b) the fitness (F) of each chromosome selected in step 122 is the same. If either (a) or (b) is not true, then a determination is made as to whether there were no chromosomes selected from population K(t) (i.e., none survived) in step 122. If none were selected in that step, then the value of the incest threshold is decreased by '1' ("N" in step 124), and control then passes back to step 118, where the method proceeds therefrom in the above-described manner, but to mate the chromosomes of the newly generated population. If, on the other hand, both (a) and (b) are determined to be true in step 124 ("Y" in step 124), then control passes to step 126.
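The step 124 decision logic described above can be sketched as follows; the function and variable names are assumptions for illustration only.

```python
def converged(incest_threshold: int, selected_fitnesses: list[float]) -> bool:
    """Step 124 test: the run has converged when the incest threshold has reached
    zero and every chromosome selected in step 122 yielded the same fitness."""
    return incest_threshold == 0 and len(set(selected_fitnesses)) == 1

def update_incest_threshold(incest_threshold: int, n_offspring_selected: int) -> int:
    """If no offspring of K(t) survived the step 122 selection, decrement the
    incest threshold by one before control returns to step 118."""
    return incest_threshold - 1 if n_offspring_selected == 0 else incest_threshold
```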
• In step 126, a determination is made as to whether or not the method should be terminated.
• In one embodiment, that step is performed by determining whether either (i) a predetermined number of chromosomes of offspring population K(t) have been evaluated in step 120 since the method first began in step 100, or (ii) a restart step 130 (described below) has been performed a predetermined number of times since the method began in step 100.
• In other embodiments, step 126 may be performed to determine whether both of the conditions (i) and (ii) have been satisfied, or the determination may be made as to only one of those conditions, although it should be noted that other suitable types of decisions besides those described herein may be employed instead, depending on applicable operating criteria.
• If the performance of step 126 results in a determination of "Yes" ("Y" in step 126), control passes to step 128, which will be described below. Otherwise, if step 126 results in a determination of "No" ("N" in step 126), control passes to step 130, where a soft restart procedure is performed.
• In the soft restart procedure of step 130, the chromosome yielding the highest fitness (F) is copied, and each of the resulting copies, except one, is then mutated by flipping a predetermined proportion (e.g., 35%) of the bits of the copy, at random without replacement.
• Those chromosomes (including the non-mutated and mutated copies) collectively form a new chromosome population P(t).
• Thereafter, control passes from step 130 back to step 112, where the method continues in the above-described manner, but with the new chromosome population P(t) being employed to supply the various thresholds of the algorithm.
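A minimal sketch of the soft restart of step 130, assuming (as the surrounding text suggests) that the copies are made from the highest-fitness chromosome and that all but one copy are mutated by flipping a fixed proportion of bit positions chosen without replacement; the function names are assumptions.

```python
import random

def soft_restart(best: str, population_size: int, proportion: float = 0.35) -> list[str]:
    """Rebuild P(t) from the best chromosome: keep one unmutated copy and fill the
    rest of the population with copies mutated by flipping ~35% of their bits."""
    def mutate_without_replacement(bits: str) -> str:
        positions = random.sample(range(len(bits)), round(len(bits) * proportion))
        out = list(bits)
        for p in positions:
            out[p] = '1' if out[p] == '0' else '0'
        return ''.join(out)

    return [best] + [mutate_without_replacement(best) for _ in range(population_size - 1)]
```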
• In step 128, optimized chromosome threshold values, which in this example are ST2, UIST2, MinCT2, MaxCT2, RT2, and DFST2, are identified.
• Thereafter, a user of the server 2 operates the input user interface 11 to enter information into the server 2 specifying that a selected video stream, such as the video clip originally provided in the memory 15 in step 100 or another video information signal provided to the server 2 (e.g., a downloaded or uploaded video clip, a received broadcast video information stream, or one or more otherwise provided video clips or other video segments, etc.), be evaluated for the presence of, for example, commercial subject matter.
• The controller 10 responds by retrieving the optimized chromosome threshold values ST2, UIST2, MinCT2, MaxCT2, RT2, and DFST2 identified in step 128 and then performing the content detection algorithm shown in Figs. 4a and 4b, using the retrieved values for the corresponding thresholds SeparationThreshold, UnicolorlnSuccThreshold, MinCommercialThreshold, MaxCommercialThreshold, RestartThreshold, and DistForSuccThreshold, respectively.
  • the video information is evaluated for the presence of commercial content, based on those thresholds.
  • the use of those optimized threshold values enables the content detection algorithm to detect commercial content in the video information with a maximum degree of accuracy. Thereafter, the results of evaluation of the video information may then be employed as desired (e.g., to delete or replace the commercial content from the signal, etc.).
  • the optimized threshold values identified in step 128 also may be provided to other devices, such as the user information appliance 4, for enabling those threshold values to be employed in the content detection algorithm in those devices.
  • those values may be downloaded or otherwise provided to the user information appliance 4 for storage in the memory 18 of that appliance 4. Thereafter, those values may be retrieved by the controller 16 for use in performing the content detection algorithm (of Figs. 4a and 4b), to evaluate a selected video stream provided in the appliance 4 for the presence of commercial content, in a similar manner as described above in connection with the server 2.
• Software representing a content detection algorithm, such as, for example, the commercial detection algorithm shown in Figs. 4a and 4b, can be downloaded or be otherwise provided from the server 2 to user information appliances 4, in association with, or separately from, the optimized chromosome threshold values, and those values can then be employed in the algorithm in the information appliances to detect predetermined content in an information stream.
  • Software representing the overall method of Fig. 3 also may be downloaded or be otherwise provided from server 2 to information appliances 4, or be pre-stored in those appliances 4, for enabling that method to be performed in those devices for determining chromosome threshold values, which can then be uploaded or be otherwise provided back to the server 2, if desired, or employed in a suitable content detection algorithm in the appliances 4.
• Also, although the invention is described in the context of step 117 and part of step 120 being performed to select chromosomes based on their fitnesses yielded as a function of Recall and Precision values, those selections may instead be made based on an evaluation of only Recall values or only Precision values yielded by chromosomes, or based on any other suitable measure of accuracy, and the measures may be of a scalar or vector type.
• Although the invention is described in the context of the content detection algorithm being performed to identify the presence of commercials based on a black frame or unicolor frame detection technique, the invention is not limited for use with only that technique or for detecting only commercial content. It also is within the scope of this invention to employ any other suitable types of now existing or later developed techniques which can be used for detecting any type of predetermined content in analog or digital video signals or any other types of media information (e.g., audio) besides/in addition to video information (examples of at least some other techniques involving video information were discussed above), and any type of low-level, mid-level, or high-level (e.g., the presence of multiple black frames in succession) features that can be extracted, either in the compressed or uncompressed domain, may be employed in those techniques.
• In such cases, each technique would employ appropriate types of chromosomes that are suitable for use as parameters in the respective techniques. It should therefore be appreciated that the method of the present invention may be employed to optimize the detection of any type of desired or undesired content, included in any type of media information, and is not limited for use only in conjunction with detecting commercial content in video information. Moreover, as used herein, the phrase "information stream" is not intended to limit the invention to on-line applications.
• Rather, an information stream may include one or more types of such information, depending on the application of interest and predetermined operating criteria.
  • each chromosome may include multiple sets of parameters, wherein each set can be used in a corresponding technique.
  • each chromosome Crl-Crn (and offspring chromosome) may also include appropriate parameter values for use in other types of techniques, such as an average cut frame distance detection technique, an average cut frame distance trend detection technique, a brand name detection technique, another type of black frame detection technique, etc., and each technique may be run separately for each chromosome, using the appropriate parameter values for that technique.
  • a user may select (in initial step 100) which technique is desired to be employed, and then, as a result, all individual chromosome parameter values besides those which are suitable for use in the selected technique (e.g., all values besides those shown in Fig. 2, in a case where an algorithm such as that shown in Figs. 4a and 4b is selected) are initialized to '0', so that no results are obtained from the non-selected techniques.
• In other embodiments, those parameter values need not be set to '0', and each technique may be performed as a separate content detection algorithm for yielding separate results. Chromosomes that include genes specifying other types of information besides thresholds also may be employed in accordance with this invention.
  • a gene may specify that a color histogram should be computed by algorithm A, B, C, or D (not shown).
  • Other genes may specify alternate ways to combine selected features into a final decision about content classification.
  • the chromosome values need not be represented in bit string form, and may instead be represented in any other suitable form. It should therefore be clear that the present invention is not limited to being used only in conjunction with chromosomes that include threshold values represented by bit strings, as described herein.
• Also, although the invention is described in the context of the high performance chromosome threshold values being determined by the server 2, broadly construed, the invention is not so limited.
• For example, the method depicted in Figs. 3, 4a, and 4b may be performed within other suitable devices, such as the user information appliance 4.
  • the method may be performed by evaluating a sample video clip within the devices (e.g., appliance 4), in the above-described manner, and the sample video clip may be provided in the devices from any source, such as the server 2.
  • the threshold values employed in the algorithm within server 2 may be provided to the server 2 from an external source, such as information appliances 4.
  • step 118 is performed to mate chromosomes from the present population being evaluated.
• In view of the foregoing, it can be appreciated that the present invention provides a novel method for automatically evolving parameter values until those values substantially maximize the accuracy of a media content detection algorithm that is a function of those parameters.
• Once determined, the evolved thresholds enable the algorithm to detect predetermined content in a media information stream with a maximum degree of accuracy.
  • This method is advantageous in that it improves the accuracy of such content detection algorithms automatically, and therefore relieves users of the burden of having to manually select appropriate threshold values.
• Moreover, the method of the invention can circumvent attempts made by commercial producers to prevent the successful detection of their commercials by modifying the broadcast commercials.

Abstract

A method for optimizing the performance of an algorithm for detecting predetermined content in a media information stream, and a program and apparatus that operate in accordance with the method. The algorithm is a function of a set of parameters. The method comprises the steps of performing the algorithm at least once to detect the predetermined content in the media information stream, while employing a respective set of parameters in the algorithm for each performance thereof, and automatically evolving at least one respective set of parameters employed in the algorithm to maximize the degree of accuracy at which the algorithm detects the predetermined content in the media information stream.

Description

Method, apparatus, and program for evolving algorithms for detecting
This invention relates generally to the detection of commercials or other predetermined content in video information streams, using a search algorithm, and in particular to a method, apparatus, and program for evolving algorithm parameters to accurately detect transitions from one type of content to another type of content, using a search algorithm, such as a genetic algorithm.
Personal video receivers/recorders and devices that modify and/or record the content of broadcast video are becoming increasingly popular. One example is a personal video recorder that automatically records programs on a hard disk based on preferences of a user. One of the features under investigation for such systems is content detection. For example, a system that can detect commercials may allow substitute advertisements to be inserted in a video stream ("commercial swapping") or the temporary halting of the video at the end of a commercial to prevent a user, who was distracted during a commercial, from missing any of the main program content. Content detection also may enable users who are not interested in the content of commercials or promotions interposed within a recorded television program, to skip through those commercials either manually or by using a device designed to perform skipping autonomically (see, e.g., U.S. Pat. No. 5,151,788).
There are many known methods for detecting commercials. One method is the detection of a high cut rate or sudden change in a scene with no fade or movement transition between temporally adjacent frames. Cuts can include fades so the cuts do not have to be hard cuts. A more robust criterion may be high transition rates. Another indicator is the presence of a black frame (or unicolor/monochrome frame) coupled with silence, which may indicate the beginning of a commercial break. One or more black frames are usually found immediately before and after an individual commercial segment. Another known indicator of commercials is high "activity", which is the rate of change in the luminance level between two different sets of frames. In commercials, objects and scenes generally move faster and change more frequently than during non-commercial video segments, and thus commercials typically are filled with "activity". When a low amount of activity is detected, the commercial is deemed to have ended, and a resumption in recording may follow.
Another known technique is to measure the temporal distance between black frame sequences to determine the presence of a commercial, based on whether the measured temporal distance exceeds or is less than a predetermined threshold. Still another technique identifies commercials based on matching images, wherein differences in the quality of image content is used as an indicator.
Another technique for identifying a black frame, such as that disclosed in U.S. Pat. No. 4,314,285 by Bonner et al., senses a drop in the voltage level of the input signal below a threshold. Yet another technique, such as that disclosed in U.S. Pat. No. 5,333,091 by Iggulden et al., is to record an entire program including any commercials. A notation is made whenever a black frame is broadcast. After recordation, a processor determines whether the time period in between black frames was a commercial or a program. This is accomplished by a simple formula: if the time period is less than a threshold of five minutes, the interval is deemed to be a commercial. During playback, the device fast-forwards the tape past the areas determined to be commercials. Unfortunately, the presence of two black frames within five minutes of each other is not necessarily representative of a commercial, as this could occur during a dimly lit or dark scene.
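As a sketch of the simple formula just described (the function name is illustrative; the five-minute threshold is the one cited for the Iggulden et al. technique):

```python
def classify_gap(gap_seconds: float, threshold_seconds: float = 5 * 60) -> str:
    """An interval between black frames shorter than the threshold is deemed a
    commercial; otherwise it is treated as program material."""
    return "commercial" if gap_seconds < threshold_seconds else "program"

classify_gap(42.0)  # -> "commercial" (also a possible false positive for a dark scene)
```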
While all of the techniques referred to above show promising results and may be well-suited for their intended purposes, their reliability and accuracy in detecting content such as commercials can be wanting in at least some cases. One important factor that can contribute to this problem is the use of inappropriate or non-optimum algorithm parameters, such as threshold values, in those techniques. Since it can be difficult to pre-select the most appropriate parameter values for use in a content detection technique, especially when the parameter selection is performed manually, it cannot be assured that the selected values will be best suited for enabling the technique to yield highly accurate results. Also, even in cases where optimum parameter values are employed, commercial producers can change various commercial features to render those values obsolete, and thereby prevent the commercial detection algorithms from successfully detecting commercial content. There is a need, therefore, to provide a technique which overcomes these problems by automatically learning optimum parameter values to be used in a video content detection algorithm, based on an application of that algorithm to a selected video stream, and which thereby enables predetermined content included in the video stream to be detected accurately and reliably. It is an object of this invention to provide a method, apparatus, and program for automatically learning parameter values which are optimized to enable a media content detection algorithm that is a function of those parameters to detect predetermined content in a media information stream.
It is another object of this invention to provide a method, apparatus, and program which perform an algorithm for evaluating video or other media information streams for the presence of predetermined content, such as commercials, and which automatically vary parameters of the algorithm until values are determined which enable the algorithm to detect the predetermined content with a maximum degree of accuracy.
Further objects and advantages of this invention will become apparent from a consideration of the drawings and ensuing description.
The foregoing objects of the invention are realized by a method for optimizing the performance of an algorithm for detecting predetermined content, such as one or more commercials, in a media information stream (e.g., a video information divided into a plurality of frames, and/or audio information, etc.), and a program and apparatus that operate in accordance with the method. The algorithm is a function of a set of parameters, wherein each set is also referred to herein as a chromosome, and each parameter preferably is a threshold value representing a condition that, depending on whether or not satisfied, tends to indicate whether or not predetermined subject matter (e.g., commercial or other subject matter) is present in the media information stream.
In accordance with an embodiment of this invention, the method comprises the steps of performing the algorithm at least once to detect the predetermined content in the media information stream, while employing a respective set of parameters in the algorithm for each performance thereof, and automatically evolving at least one respective set of parameters employed in the algorithm to maximize the degree of accuracy at which the algorithm detects the predetermined content in the media information stream.
In accordance with an aspect of this invention, the algorithm detects the predetermined content, which may be desired or undesired content, based on a detection of at least one of a black frame/unicolor frame among the plurality of frames, an average cut frame distance, an average cut frame distance trend, a brand name, a cut and black frame, an average intensity color histogram, audio silence, a change in volume, frame similarity, character detection, a static image, or any other types of features of interest that may indicate the presence of predetermined content in a video or other type of media information stream. In accordance with another aspect of this invention, the step of automatically evolving includes performing a search algorithm, preferably a genetic algorithm, to evolve the at least one respective set of parameters.
In one embodiment of the invention, the evolving includes the steps of determining the accuracy at which the algorithm detects the predetermined content in the media information stream for each performance of the algorithm, selecting at least one of the respective sets of parameters, based on a result of the step of determining the accuracy, and producing at least one offspring set of parameters, based on the at least one set of parameters selected in the selecting step. The offspring set(s) of parameters and/or original set(s) of parameters determined to yield the most accurate results are then employed in further, respective performances of the algorithm, and one or more further offspring sets of parameters are produced again, if needed, until a set of parameters which substantially maximizes the accuracy of the algorithm's performance is determined.
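A high-level sketch of the evaluate-select-reproduce loop just described; every name below is a placeholder assumed for illustration, with evaluate() standing for one performance of the detection algorithm using a given parameter set and returning its measured accuracy (fitness).

```python
def evolve(initial_population, evaluate, select, produce_offspring, done):
    """Repeatedly evaluate each parameter set (chromosome), select the most accurate
    sets, and breed offspring sets until a stopping criterion is satisfied."""
    population = list(initial_population)
    scores = {chrom: evaluate(chrom) for chrom in population}
    while not done(population, scores):
        parents = select(population, scores)             # keep the most accurate sets
        offspring = produce_offspring(parents)           # cross-over and/or mutation
        for child in offspring:
            scores[child] = evaluate(child)              # run the detector for each child
        population = select(population + offspring, scores)
    # Return the parameter set that maximized detection accuracy.
    return max(population, key=lambda chrom: scores[chrom])
```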
After such a high performance (also referred to as "optimum") set of parameters has been identified, that set can be used in the corresponding algorithm in any device, for enabling predetermined content in a media information stream to be successfully and accurately detected. The algorithm can reside on a server and/or in local information appliances, and the set of parameters and/or algorithm itself can be downloaded from the server to the local information appliances or vice versa.
The present invention will be more readily understood from a detailed description of the preferred embodiments taken in conjunction with the following figures:
Fig. 1 is a block diagram of a hardware system 1 that is suitable for practicing this invention, wherein the system 1 comprises a server 2 and at least one user information appliance 4 that are constructed and operated in accordance with this invention, and which are bidirectionally coupled together through an interface 6.
Fig. 2 is an example of a plurality of chromosomes Crl-Crn that may be stored in a memory 15 of the server 2 and/or a memory 18 of the user information appliance 4 of Fig. 1, wherein the chromosomes Crl-Crn each include parameter or threshold values that are suitable for use in an algorithm for detecting predetermined content, such as commercials, in a video or other media information stream.
Fig. 3 is a logical flow diagram of a method in accordance with this invention for evaluating a video information stream for the presence of predetermined content, and for automatically varying parameters of the algorithm to enable the algorithm to detect the predetermined content with a maximum degree of accuracy.
Figs. 4a and 4b are a logical flow diagram showing in detail sub-steps performed during step 112 of Fig. 3.
Figs. 5a and 5b show an example of chromosomes that may be employed in the method of Fig. 3, wherein Fig. 5a shows a representation of a cross-over point, and Fig. 5b represents an example of a gene mutation of a chromosome.
Identically labeled elements appearing in different ones of the figures refer to the same elements but may not be referenced in the description for all figures.
Fig. 1 is a block diagram of a hardware system 1 that is suitable for practicing this invention. In the illustrated embodiment, the system 1 comprises a server 2 and at least one user information appliance 4. The server 2 and information appliance 4 are bidirectionally coupled to one another through an interface 6. The interface 6 may include various types of interconnecting equipment and interfaces for coupling the server 2 to the information appliance 4, such as, for example, one or more wires, cables, switches, routers, optical fibers, a wireless interface, and/or one or more networks (e.g., the Internet and/or other, proprietary network(s)), modems, and/or other suitable types of communication equipment/interfaces, depending on applicable system design and operating criteria, although, for convenience, no such equipment is shown in Fig. 1.
The individual information appliance 4 may include, for example, a PC, a personal video recorder (PVR), a video cassette recorder (VCR), a digital video recorder (DVR), a personal television receiver (PTR), a DVD player, and the like, although other suitable types of user information appliances also may be employed. Although only a single server 2 and a single user information appliance 4 are shown in Fig. 1, the number and variety of user information appliances that may be in communication with the server 2 can vary widely, as can the number of servers 2 that are in communication with individual user information appliances, depending upon, for example, user needs and geographic location(s), applicable system design and operating criteria, etc. It should be noted that the teaching of this invention is not to be construed as being limited for use with any particular type of server computer or information appliance. In general, the teaching of this invention may be employed in conjunction with any suitable type of devices that are capable of processing media information, such as video information, audio information, and/or combination video/audio information, etc.
The server 2 is a computer or farm of computers that facilitate the transmission, storage, and reception of information between different points. The server 2 preferably comprises a controller (such as one or more microprocessors and/or logic arrays) (CPU) 10 for performing arithmetic and/or logical operations required for program execution. The controller 10 executes computer readable code, i.e., stored applications, such as those described below. The server 2 also comprises at least one communication interface 8 for bidirectionally coupling the controller 10 to external interfaces, such as the interface 6 and any other interfaces (not shown) to which the server 2 may be coupled, for enabling the server 2 to transceive information with external source and destination devices (e.g., information appliance 4) coupled to those interfaces, although for convenience, only the interface 6 and appliance 4 are shown. That information may include signaling information in accordance with the applicable external interface standard employed, video, audio, and other data.
The server 2 preferably also comprises one or more input user-interfaces 11 that are each coupled to the controller 10, and at least one output user-interface 13 that also is coupled to the controller 10. The input user-interface 11 may include, for example, a keyboard, a mouse, a trackball, touch screen, and/or any other suitable type of user-operable input device(s), and the output user-interface 13 may include, for example, a video display, a liquid crystal or other flat panel display, a speaker, a printer, and/or any other suitable type of output device(s) for enabling a user to perceive outputted information.
The server 2 preferably also comprises one or more associated memories (e.g., disk drives, CD-ROM drives, read-only memories, and/or random access memories) 15 that are bidirectionally coupled to the controller 10. The memory 15 stores temporary data and instructions, and also stores various application programs, routines and operating programs that are used by the controller 10 for controlling the overall operation of the server 2. For example, an operating system 17 such as UNIX or Windows NT, preferably is stored in the memory 15, and a number of applications such as, for example, a video encoder 19, a video decoder 21, a frame grabber 23, and a cut detector 24 also may be stored in the memory 15, although other types of operating systems and application software may be employed as well, and/or one or more of the applications, such as applications 19, 21, and 23, may be embodied as separate hardware components within the server 2, rather than as application software. The video encoder 19 is employed by the controller 10 to encode video information in a conventional manner, when deemed necessary by the controller 10, and the video decoder 21 is employed by the controller 10 to decode compressed video data, when deemed necessary by the controller 10, in a conventional manner. The frame grabber 23 includes software that is employed by the controller 10 to capture single frames from a video information stream for enabling the captured frames to be subsequently processed. The cut detector 24 detects, for example, whether a change in scene has occurred, in a known manner.
In accordance with one embodiment of the invention, the memory 15 also stores various counters and variables, such as, for example, an Actual#CommFrames variable, a CommercialProbability variable, a Totalldentified counter, a LastUniColor variable, and a #CorrIdentified counter, which are employed in a manner as will be described below. Preferably, the memory 15 also stores routines for implementing a method in accordance with this invention for detecting predetermined content, such as, for example, commercials or other content, in video information streams using a predetermined content detection algorithm that is a function of a number of parameters (e.g., threshold values), and for automatically learning values of those parameters which are optimized to enable the content detection algorithm to detect the predetermined content with at least a predetermined level of accuracy. That method will be described below in relation to Figs. 3, 4a, and 4b.
In accordance with an aspect of this invention, the memory 15 preferably also stores a plurality of chromosomes that are employed in the routines for implementing the method of the invention. Each chromosome preferably includes one or more of the parameters, which are of a type suitable for use in the predetermined content detection algorithm referred to above. In accordance with an aspect of this invention, at least part of the content detection algorithm performs a technique, such as, for example, an average cut frame distance detection technique, an average cut frame distance trend detection technique, a brand name detection technique, a black frame detection technique, a cut and black frame detection technique, a frame similarity detection technique, a character detection technique, an average intensity color histogram technique, a static image detection technique, or any other existing or later developed techniques which can be used for detecting predetermined content in media information streams. For a description of at least some of those techniques, reference may be had to (1) U.S. Patent No. 6,100,941, issued on August 8, 2000, entitled "Apparatus and Method for Locating a Commercial Disposed Within a Video Data Stream," by Nevenka Dimitrova, Thomas McGee, Herman Elenbaas, Eugene Leyvi, Carolyn Ramsey, and David Berkowitz (hereinafter "U.S. Patent 6,100,941"), (2) U.S. Patent Application No. 09/417,288, filed 10/13/99, entitled "Automatic Signature-Base Spotting, Learning and Extracting of Commercials and Other Video Content," by Nevenka Dimitrova, Thomas McGee, and Lalitha Agnihotri (hereinafter "U.S. Patent Application No. 09/417,288"), and (3) U.S. Patent Application No. 09/854,511, filed 5/14/2001, entitled "Video Content Detection Method And System Leveraging Data-Compression Constructs", each of which is incorporated by reference herein in its entirety, as if fully set forth herein.
Referring to Fig. 2, an example is shown of the plurality of chromosomes stored in the memory 15, wherein, in this example, the chromosomes are identified as Crl- Crn and are suitable for use in an algorithm for detecting commercial content in a video information stream, based on a black frame/unicolor frame detection technique. Each chromosome Crl-Crn includes a plurality of parameters, namely threshold values, of which the algorithm is a function. The thresholds represent conditions that, depending on whether or not satisfied, indicate the presence or absence of commercial content. Preferably, the thresholds are represented by a bit string having a predetermined bit length (e.g., 9 bits), although in other embodiments, other suitable types of values may be employed instead of bit strings, depending on applicable operating criteria. The following is an example of thresholds used in an algorithm for detecting commercial content in a video information stream, although it should be noted that other thresholds besides those defined below may be employed instead and/or in addition to those thresholds, depending on, for example, the type of content detection algorithm being employed. SeparationThreshold (shown as "SeparationThld" in Fig.2) - this threshold represents a predetermined minimum temporal distance that can be expected to exist between two commercials, and enables a detection to be made of a black/unicolor frame which precedes a potential beginning frame of a commercial segment (i.e., a frame including commercial content) in a video information stream. As will be described below, a black/unicolor frame can be preliminarily identified as such a preceding frame of a commercial segment if the temporal distance between that frame and a black/unicolor frame which was previously detected as immediately following a last detected commercial segment is greater than the value of SeparationThreshold (as used herein, the term "commercial segment" means a collection of successive frames which include commercial content, and which form a single commercial).
UnicolorlnSuccThreshold (shown as "UnicolorlnSuccThld" in Fig. 2) - This threshold represents the minimum number of black/unicolor frames expected to be separating commercial segments or commercial and non-commercial program segments. The occurrence of a potential commercial ending can be recognized if the number of black unicolor frames detected in succession (after a last frame of commercial segment) is greater than UnicolorlnSuccThreshold, as will be described below.
MinCommercialThreshold (shown as "MinCommercialThld" in Fig. 2) - This threshold represents a predetermined, expected minimum amount of time of an individual commercial segment, and is used in determining whether a potential commercial start and a potential commercial ending determination should not be confirmed. Potential commercial starts and endings are not confirmed if it is determined that the temporal distance separating them is less than the value of MinCommercialThreshold.
MaxCommercialThreshold (shown as "MaxCommercialThld" in Fig. 2) - This threshold represents a predetermined, expected maximum amount of time of an individual commercial segment, and also is used in determining whether a potential commercial start and a potential commercial ending determination should not be confirmed. Potential commercial starts and endings are not confirmed if it is determined that the temporal distance separating them is greater than the value of MaxCommercialThreshold. RestartThreshold (shown as "RestartThld" in Fig. 2) - This threshold represents a predetermined number of frames which are reasonably expected to be included in a commercial segment (once a commercial segment has started). If the temporal distance between a current frame and a last detected commercial exceeds that threshold during the performance of the method described below, then no confirmation is made that the potential commercial segment is indeed a commercial.
DistForSuccThreshold (shown as "DistForSuccThld" in Fig. 2) - This threshold corresponds to a maximum, expected commercial duration, and represents a predetermined, maximum expected temporal distance between two consecutive black/unicolor frames separating consecutively detected commercials. As will be described in further detail below, this threshold is employed to (1) determine whether or not a detected black/unicolor frame immediately succeeds a previously detected one (a detected black/unicolor frame is considered to be in succession with the previously detected one of those frames if the frame are temporally separated by less than the value of DistForSuccThreshold), (2) detect a first black/unicolor frame immediately following a commercial ending frame (this occurs if the temporal distance between two successively detected black frames is greater than the product of 2*DistForSuccThreshold), and (3) determine whether or not a potential commercial start or ending determination should be confirmed. The manner in which the various chromosome threshold values described above are employed in an exemplary embodiment of this invention will be described in detail below.
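To make the chromosome layout concrete, the sketch below packs the six thresholds just described into fixed-width bit fields of a single bit string. The 9-bit field width is the example length given earlier; the field order, the integer interpretation of each field, and the helper names are assumptions for illustration (the patent leaves the exact encoding open).

```python
import random

THRESHOLD_NAMES = [
    "SeparationThreshold", "UnicolorInSuccThreshold", "MinCommercialThreshold",
    "MaxCommercialThreshold", "RestartThreshold", "DistForSuccThreshold",
]
BITS_PER_THRESHOLD = 9  # example bit-string length per threshold

def decode(chromosome: str) -> dict[str, int]:
    """Split a chromosome bit string into one integer value per threshold."""
    return {name: int(chromosome[i * BITS_PER_THRESHOLD:(i + 1) * BITS_PER_THRESHOLD], 2)
            for i, name in enumerate(THRESHOLD_NAMES)}

def encode(values: dict[str, int]) -> str:
    """Inverse of decode(): pack the six threshold values back into one bit string."""
    return ''.join(format(values[name], f'0{BITS_PER_THRESHOLD}b')
                   for name in THRESHOLD_NAMES)

# Example: a randomly initialized 54-bit chromosome (6 thresholds x 9 bits each).
chromosome = ''.join(random.choice('01') for _ in range(6 * BITS_PER_THRESHOLD))
thresholds = decode(chromosome)
```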
Referring again to Fig. 1, other components of the server 2 will now be described. Video information originated from a source device, such as the information appliance 4 or some other device (e.g., a video camera, etc.) (not shown) may be provided to the server 2 and inputted therein through the at least one communication interface 8. The inputted video information may be digital or analog information, and may be in compressed or uncompressed form, depending on, for example, the type of source device and associated external interface employed. An A/D converter 9a and a D/A converter 9b preferably also are included in the server 2, either as part of the controller 10 or as separate components. The A/D converter 9a can be programmed by the controller 10 for enabling analog information received from an external interface, such as interface 6, to be converted into digital form. The D/A converter 9b can be used by the controller 10 to convert digital information into corresponding analog information, before the information is outputted to the external interface 6, although, depending on the type of interface employed, that information need not be so converted before being forwarded to the interface 6.
As pointed out above, the controller 10 also may employ the video decoder 21 to decode compressed video information inputted thereto, depending on applicable performance criteria, and may employ the video encoder 19 to encode video information before it is transmitted through the communication interface 8, depending on applicable performance criteria.
Having described the server 2, the user information appliance 4 will now be described. The user information appliance 4 preferably comprises at least one communication interface 14 and a controller 16 (CPU) bidirectionally coupled thereto. The interface 14 bidirectionally couples the appliance 4 to one or more external communication interfaces, such as the interface 6 and any other external interfaces (not shown) to which the information appliance 4 may be coupled. The interface 14 enables the appliance 4 to transceive information with external source and destination devices (e.g., server 2) that may be coupled thereto, although for convenience, only the server 2 and one external interface 6 are shown. That information may include signaling information in accordance with the applicable external interface standard employed, video, audio, and other data.
A user interface of the user information appliance 4 includes an output user interface, such as a display 36, and an input user device, typically a key matrix 20, all of which are coupled to the controller 16, although in other embodiments, other suitable types of output and input user interfaces also may be employed. The key matrix 20 includes various user-interface keys that are used for initiating some operation of the user information appliance 4, such as, for example, PLAY, FAST FORWARD, STOP, REWIND, and PAUSE keys, various menu scrolling keys, etc. A MARK key for marking commercial content also may be included in the key matrix 20.
The user information appliance 4 also includes various memories, such as a RAM and a ROM, shown collectively as the memory 18. The memory 18 may store temporary data and instructions, various counters and other variables, and preferably also stores various applications, routines, and operating programs 27. For example, in accordance with one embodiment of the information appliance 4, the memory 18 may store a video encoder 33, a video decoder 35, a cut detector 29, and a frame grabber 31, although other types of operating systems and application software may be employed instead and/or one or more of the applications, such as applications 33, 35, and 31, may be embodied as separate hardware components within the appliance 4, rather than as application software. As for the video encoder 19 and decoder 21 stored in the server 2, the video encoder 33 stored in information appliance 4 may be employed by the controller 16 to encode video information, and the video decoder 35 may be employed by the controller 16 to decode compressed video data, in a conventional manner. The frame grabber 31 includes software that is employed by the controller 16 to capture single frames from a video signal stream, for enabling the captured frames to be subsequently processed. The cut detector 29 detects, for example, whether a change in scene has occurred, in a known manner. In accordance with one embodiment of the invention, at least some of the routines stored in the memory 18 implement a method in accordance with this invention, to be described below in relation to Figs. 3, 4a, and 4b. Moreover, in one embodiment of the invention, the memory 18 also stores at least some of the various counters, variables, and/or chromosomes described above in relation to the server 2, although for convenience, they will not now be further described. Input video information originated from a source device, such as the server 2 or some other source device (e.g., a video camera, etc.) (not shown), may be received within the appliance 4 through the at least one communication interface 14. Like the information inputted into the server 2, the video information inputted into the information appliance 4 may be in digital or analog form, compressed or uncompressed, depending on, for example, the type of source device and associated external interface employed. Also like the server 2, an A/D converter 11a and a D/A converter 1 lb also may be included in the information appliance 4, either as part of the controller 16 or as separate components. The A/D converter 1 la may be programmed by the controller 16 for enabling analog information received by the appliance 4 from an external interface, such as interface 6, to be converted into digital form, before being provided to the controller 16. The D/A converter 1 lb may be employed to convert digital information into corresponding analog information, before the information is outputted to the external interface 6, although, depending on the type of interface 6 employed, that information need not be so converted before being forwarded to the interface 6.
Having described the various components of the system 1, an aspect of this invention will now be described, with reference to the flow diagram of Fig. 3. In accordance with this aspect of the invention, a user can identify selected individual frames or other segments of a sample video clip as either including predetermined content, such as commercial subject matter, or as not including such predetermined content. Thereafter, the sample video clip is automatically evaluated for the presence of such predetermined content using a predetermined content detection algorithm that is a function of a number of parameters, such as the chromosome threshold values described above. Selected ones of those threshold values are then evolved, if needed, in successive iterations of the algorithm (the evolution occurs through use of a super-algorithm, such as a genetic algorithm), to increase the accuracy of the detections, until the threshold values are considered to be optimized for enabling the algorithm to detect the predetermined content with maximum accuracy amongst all employed values. It should be noted that, although the invention is described below in the context of an example in which a video clip sample is evaluated for the presence of commercial content, the invention is not intended for use only in applications for detecting commercial content. For example, in other embodiments, the method of the invention may be employed for use in detecting other types of information content of interest, such as, for example, explicit, violent, or other content types, depending on the application of interest.
Referring now to Fig. 3, in step 100 the method is started, and it is assumed that the server 2 is provided with at least one sample video clip that is stored in the memory 15, and that the sample video clip includes at least one commercial segment (as pointed out above, as used herein, the term "commercial segment" means one or more successive video frames having commercial content, and forming a single commercial) and boundaries of the commercial segments (customarily, a generous sample of video clips having a variety of commercials would be employed to ensure that robust algorithm chromosomes are determined). By example only, those boundaries may include black/unicolor frames, as described herein, wherein one or more of those frames appear (temporally) immediately before and others immediately after each commercial segment, although other suitable types of boundaries may be employed instead, depending on the type of content detection algorithm employed. It also is assumed that the sample video clip is stored in the memory 15 in association with content identifier information specifying (1) which particular frames of the clip include and/or do not include commercial content, (2) frame numbers identifying those frames, and (3) a variable Actual#CommFrames representing the total number of frames including commercial content. For example, the sample video clip and content identifier information may be downloaded from any external source through the interface 8, in which case the video clip and content identifier information are forwarded to the controller 10 and then stored by the controller 10 in the memory 15. The video clip may be, for example, a portion of a television signal or internet file broadcast downloaded from the interface 6, a video clip uploaded from the user information appliance 4, a video clip downloaded from a particular web site, or a video signal originated from any other source (not shown) that may be coupled to the server 2. As another example, the content identifier information may be stored in the memory 15 after the sample video clip already is stored in that memory. For example, while viewing individual frames of the sample video clip on the display 13, the user may enter content identifier information specifying whether or not each individual frame includes commercial content, into the server memory 15 through the input user interface 11, and then that information is stored in association with the frame information. It further is assumed in step 100 that the thresholds of the individual chromosomes Crl-Crn are initialized to some predetermined values (e.g., represented by a bit string), either as specified by the user or by the routine stored in memory 15, and thus an initial population P(t) of the chromosomes is provided, where, for the purposes of this description, "t" is a variable representing the population level.
Thereafter, in step 110 it is assumed that, for example, the user operates the input user interface 11 to enter command information into the controller 10 specifying that the sample video clip be examined for the presence of predetermined content, namely, in this example, commercial subject matter. In response to the command information being inputted into the controller 10 in step 110, the controller 10 performs a predetermined content detection algorithm that is identified as step 112 in Fig. 3. For this exemplary embodiment, that algorithm is shown in further detail by the method steps shown in Figs. 4a and 4b, and is performed to evaluate the sample video clip for the presence of commercial content based on the threshold values within each chromosome of the population P(t). In a preferred embodiment of this invention, the algorithm is performed separately for each chromosome of the population P(t), so that multiple performances of the algorithm occur, either in parallel or in series with one another, and so that there is at least one performance of the algorithm for each chromosome. For convenience, the following description will be made in the context of the performance of the content detection algorithm for only a single one of the chromosomes, although it should be understood that the algorithm is performed for each chromosome separately.
In step 200 of Fig. 4a, the content detection algorithm is entered, and it is assumed that the frame grabber 23 detects a first video frame in the sample video clip, using a known frame detection technique ("Yes" in step 200). Thereafter, control passes to step 202 where the cut detector 24 determines whether or not a cut (i.e., a change in a scene or a change in content) has occurred based on the content of the detected frame relative to that of an immediately-preceding detected frame, if any, using a known cut detection technique. If no cut is detected in step 202 ("No" at step 202), then control passes to step 206 where the method continues in a manner described below. Otherwise, if a cut is detected in step 202, control passes to step 204 where the controller 10 sets the CommercialProbability variable equal to the value '1'. Control then passes to step 206. It should be noted that the frame detection step 200 and cut detection step 202 may be performed using any suitable, known frame detection and cut detection techniques, respectively, such as, for example, those described in U.S. Patent 6,100,941, which, as pointed out above, is incorporated by reference herein.
In step 206 the controller 10 determines whether or not the frame detected in step 200 is a black or unicolor frame, using a known black frame/unicolor frame detection technique (such as, for example, a black frame/unicolor frame technique described in U.S. Patent No. 6,100,941 or U.S. Patent Application No. 09/417,288, or any other suitable black frame/unicolor frame detection technique). If the performance of step 206 results in a determination that the frame detected in step 200 is not a black or unicolor frame ("No" in step 206), control passes through connector A to step 220 of Fig. 4b, where the method then continues in a manner as will be described below. If the performance of step 206 results in a determination that the frame detected in step 200 is a black or unicolor frame ("Yes" in step 206), then control passes to step 208 where the controller 10 increments the value of counter UniColorlnSuccession (originally initialized to '0' when the algorithm in step 112 is initially entered) stored in memory 15 by ' 1 ', and also updates the value of variable LastUniColor so that it represents the number of the current frame. Thereafter, a number of steps are performed to determine whether or not a potential start or ending of a commercial exists near the detected black/unicolor frame in the video clip.
In step 210 the controller 10 determines whether or not the temporal distance between the newly detected black or unicolor frame and a last detected black/unicolor frame (if any) exceeds the value of DistForSuccThreshold. For example, the temporal distance may be calculated by first subtracting the number of the newly-detected black/unicolor frame (identified by, e.g., LastUniColor) from that of the last detected black/unicolor frame (if any), and then multiplying the subtraction result by the inverse of an applicable, predetermined frame rate (e.g., the inverse of either 25 frames/second, 30 frames/second, or 24 frames/second) to convert the subtraction result into units of time. The product of that multiplication represents the temporal distance in question, and is compared to the value of DistForSuccThreshold to determine whether or not the temporal distance exceeds that threshold value. If the performance of step 210 results in a determination of "No", which indicates that the current frame probably is not located immediately prior to a frame that includes commercial content, then control passes to step 214 which is performed in a manner as will be described below. If the performance of step 210 results in a determination of "Yes", then control passes to step 212 where the controller 10 determines whether or not the temporal distance between a last detected commercial (if any) and the newly-detected black/unicolor frame exceeds the value of the SeparationThreshold, or if no commercial segment was previously detected in the sample video clip (e.g., a first commercial segment may be present). For example, that temporal distance may be calculated by first subtracting the number of the newly-detected black/unicolor frame from the number of the last frame of a last detected commercial segment (if any was determined based on a previous performance of step 228 of Fig. 4b) (a last, black frame of a previous commercial), and then multiplying the subtraction result by the inverse of the applicable, predetermined frame rate. The product of that multiplication represents the temporal distance between the newly-detected black/unicolor frame and an ending frame of a last detected commercial, and is compared to the value of SeparationThreshold to determine whether or not the temporal distance exceeds that threshold value.
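The frame-number-to-time conversion used in steps 210 and 212 amounts to multiplying a frame-number difference by the inverse of the frame rate; a minimal sketch (the helper name is an assumption, and the frame rates are the example values given above):

```python
def temporal_distance_seconds(frame_a: int, frame_b: int, frame_rate: float = 30.0) -> float:
    """Convert a difference in frame numbers into seconds by multiplying the
    difference by the inverse of the frame rate (e.g., 1/25, 1/30, or 1/24 s per frame)."""
    return abs(frame_a - frame_b) * (1.0 / frame_rate)

# Example: frames 900 and 1050 at 30 frames/second are 5.0 seconds apart; that value
# would then be compared against DistForSuccThreshold or SeparationThreshold.
gap = temporal_distance_seconds(900, 1050)
```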
If the performance of step 212 results in a determination of "Yes", then control passes to step 216 where the controller 10 recognizes that the frame detected in step 200 is potentially adjacent to a next, beginning frame of a commercial segment. If the performance of step 212 results in a determination of "No", which indicates that the current frame likely is not located adjacent to a next, beginning frame of a commercial segment, then control passes to step 214, which will now be described.
In step 214 the controller 10 determines whether or not the number of black/unicolor frames (e.g., UniColorInSuccession) that have been detected since the algorithm was initiated in step 112 (Fig. 3) exceeds UnicolorInSuccThreshold. If the performance of step 214 results in a determination of "Yes", then control passes to step 218 where the controller 10 recognizes that the frame detected in step 200 is potentially one of a series of black/unicolor frames following the ending of a commercial segment (i.e., the potential end of the commercial segment is recognized). If the performance of step 214 results in a determination of "No", then control passes through connector A to step 220 of Fig. 4b.
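The bookkeeping performed in steps 208 through 218 for each detected black or unicolor frame can be summarized, purely as a simplified sketch and not as the patented flow chart itself, in the following Python class; the variable names follow the text, while the class structure and the handling of the last detected commercial are assumptions made for illustration.

```python
# Simplified, illustrative sketch of the per-black-frame bookkeeping of
# steps 208-218. It is not a verbatim rendering of Figs. 4a and 4b.

class BlackFrameState:
    def __init__(self, dist_for_succ, separation, unicolor_in_succ, frame_rate=30.0):
        self.dist_for_succ = dist_for_succ        # DistForSuccThreshold, in seconds
        self.separation = separation              # SeparationThreshold, in seconds
        self.unicolor_in_succ = unicolor_in_succ  # UnicolorInSuccThreshold, a count
        self.frame_rate = frame_rate
        self.unicolor_count = 0                   # UniColorInSuccession
        self.last_unicolor = None                 # LastUniColor
        self.last_commercial_end = None           # set when a commercial is confirmed (step 228)

    def on_black_frame(self, frame_no):
        """Return 'potential_start', 'potential_end', or None for one black/unicolor frame."""
        previous = self.last_unicolor
        self.unicolor_count += 1                  # step 208
        self.last_unicolor = frame_no
        gap = (float("inf") if previous is None
               else (frame_no - previous) / self.frame_rate)
        if gap > self.dist_for_succ:              # step 210, "Yes" branch
            since_commercial = (float("inf") if self.last_commercial_end is None
                                else (frame_no - self.last_commercial_end) / self.frame_rate)
            if since_commercial > self.separation:
                return "potential_start"          # step 216
        if self.unicolor_count > self.unicolor_in_succ:
            return "potential_end"                # step 218
        return None
```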
Referring to Fig. 4b, in step 220 a decision is made as to whether or not the potential presence of a commercial segment was previously determined to exist, and control then passes to either step 232 (to be described below) or step 222, based on the result of that determination. For example, if step 220 was entered into directly after a determination of "No" in step 206, then the performance of step 220 results in control being passed to step 232. If step 220 was entered directly after a potential beginning of a commercial segment was recognized in step 216 or after a potential ending of a commercial segment was recognized in step 218, then the performance of step 220 results in control being passed to step 222, which will now be described.
In step 222 the controller 10 determines whether the approximate duration of the potential commercial segment is within a predetermined time period. For example, step 222 may be performed by the controller 10 determining whether the temporal distance separating a frame last determined to be a black/unicolor frame potentially following the end of a commercial segment (a last performance of step 218) and an earlier frame determined to be potentially located adjacent to a next, beginning frame of a commercial segment (in an earlier performance of step 216) is greater than the value of the MinCommercialThreshold and less than the value of the MaxCommercialThreshold. If the performance of step 222 results in a determination of "No", then control passes to step 232. Otherwise, if the performance of step 222 results in a determination of "Yes", then control passes to step 224 where the controller 10 examines the value of the CommercialProbability variable to determine whether or not it is equal to '1'. If it is determined in step 224 that the value of the CommercialProbability variable is equal to '1', then control passes to step 230 where the controller 10 stores in the memory 15 a record confirming that the frame detected in step 200 is a black or unicolor frame that is located temporally adjacent to a next, beginning frame of a commercial segment. If it is determined in step 224 that the value of the CommercialProbability variable is not equal to '1', then control passes to step 226 where the controller 10 determines whether or not the temporal distance between the current black or unicolor frame (detected in step 206) and a last detected black/unicolor frame (if any) exceeds the value of 2*DistForSuccThreshold. If the performance of step 226 results in a determination of "Yes", which confirms that the current black/unicolor frame is a first black/unicolor frame appearing immediately after a last frame (that includes commercial content) of a commercial segment, then control passes to step 228 where the controller 10 stores in the memory 15 a record indicating such and confirming that the presence of a commercial segment ending has been detected. If the performance of step 226 results in a determination of "No", then control passes to step 232, which will now be described.
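A minimal sketch of the duration test of step 222, under the same illustrative assumptions as above, might look like this; the parameter names mirror MinCommercialThreshold and MaxCommercialThreshold.

```python
# Illustrative only: a candidate segment counts as a plausible commercial
# only if its duration falls strictly inside the configured window (step 222).
def plausible_commercial(start_frame: int, end_frame: int,
                         min_seconds: float, max_seconds: float,
                         frame_rate: float = 30.0) -> bool:
    duration = (end_frame - start_frame) / frame_rate
    return min_seconds < duration < max_seconds
```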
In step 232 the controller 10 determines whether or not (1) the temporal distance between the frame (if any) confirmed in step 228 or 230 and an earlier frame (if any) last confirmed in an earlier performance of step 228 or 230 exceeds the value of RestartThreshold, or (2) the temporal distance between the current black/unicolor frame and a last-detected black/unicolor frame (if any) exceeds the value of DistForSuccThreshold, wherein the temporal distances may be determined and compared to the corresponding threshold values in the manner described above. If the performance of step 232 results in a determination of "Yes", then control passes to step 234 where the controller 10 sets the value of the CommercialProbability variable to '0' to indicate that no commercial segment was detected in the previously-described method steps. Thereafter, control passes through connector B back to step 200 of Fig. 4a.
If, on the other hand, the performance of step 232 results in a determination of "No", then control passes to step 233 where, if the confirmation step last performed was step 228 ("Yes" in step 233), control is passed to step 233' where the controller 10 stores information in the memory 15 specifying that the frames (identified by corresponding frame numbers) appearing temporally between the black/unicolor frame confirmed in step 228 (as being located immediately after a last frame of a commercial segment) and a black/unicolor frame which was last confirmed in earlier step 230 (as being located immediately prior to a first frame of the commercial segment), include commercial content and collectively represent a commercial segment. Thereafter, in step 236 the controller 10 increases the value of the TotalIdentified counter (originally initialized at '0' when step 112 was entered) stored in memory 15 by the number of frames identified in step 233' as including commercial content, and control then passes through connector B back to step 200 of Fig. 4a. If, on the other hand, step 230 was the confirmation step last performed ("No" in step 233), control passes directly through connector B back to step 200 of Fig. 4a.
After control passes back to step 200 through connector B from either step 233, 234, or 236, step 200 is performed in the above-described manner. If the performance of that step results in there being a detection of a next video frame in the sample video clip ("Yes" in step 200), then control passes to step 202 where the method then continues in the above-described manner. If no other frames are detected ("No" in step 200), then control passes to step 114 of Fig. 3, where the method then continues in the following manner.

Step 114 of Fig. 3 is entered into after the algorithm of step 112 is performed for each chromosome of the set of chromosomes of population P(t), stored in the memory 15. For example, the performance of step 112 for each of the initial chromosomes Cr1-Crn of population P(t) results in there being stored in the memory 15, for each chromosome, a respective TotalIdentified counter value representing the total number of frames (if any) in the sample video clip that were identified as including commercial content by the commercial detection algorithm employing that chromosome (see, e.g., step 236 of Fig. 4b) in step 112, and information specifying the frame numbers of the set(s) of those frames (see, e.g., step 233'). Now, in step 114 the controller 10 determines, for each individual chromosome of the population P(t), whether or not the frames (if any) identified in step 233' during the performance of step 112 for that chromosome were correctly identified as including commercial content, by correlating the identified frames to the corresponding content identifier information (specifying whether or not the frames include commercial content) originally stored in memory 15 in step 100 of Fig. 3. For example, assuming that a particular frame was identified as including commercial content during the earlier performance of the algorithm of step 112 for a particular chromosome, and assuming that the content identifier information stored in memory 15 specifies that the same frame does indeed include commercial content, then that frame is determined in step 114 as having been correctly identified as including commercial content. A similar determination is made in step 114 for each frame identified in earlier step 233' for each chromosome.

Then control passes to step 115 where the controller 10 updates the value of #CorrIdentified associated with each initial chromosome Cr1-Crn of the population P(t) stored in memory 15 so that, for each chromosome, the updated value specifies the number of frames which were determined in step 114 as having been correctly identified (as including commercial content during the performance of step 112) for that chromosome. As a result, a separate value of #CorrIdentified is provided for each chromosome (to indicate the number of frames correctly identified during the algorithm performed using that chromosome). After step 115 is performed, control passes to step 116 where values for the counters #CorrIdentified (updated in step 115) and TotalIdentified (updated in step 112) associated with each chromosome are employed by the controller 10 to determine a Recall and a Precision for that chromosome, using the following formulas F1 and F2, respectively:
Recall = #CorrIdentified / Actual#CommFrames (F1)
Precision = #CorrIdentified / TotalIdentified (F2).
For example, assuming that the earlier performance of steps 112 and 115 for each chromosome Cr1-Crn of initial population P(t) results in a determination of the counter values #CorrIdentified and TotalIdentified shown in Table I below for those chromosomes, and that the value of Actual#CommFrames is 90 for each chromosome, then the performance of the formulas F1 and F2 in step 116 results in the Recall and Precision values shown in Table I being calculated for the corresponding chromosomes Cr1-Crn. Those values are stored in the memory 15 by the controller 10.
TABLE I

Cr#    #CorrIdentified    Actual#CommFrames    TotalIdentified    Recall    Precision
Cr1    80                 90                   100                0.888     0.8
Cr2    90                 90                   90                 1.0       1.0
Crn    75                 90                   95                 0.833     0.789
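The Recall and Precision computations of formulas F1 and F2 are straightforward; the short sketch below, which is illustrative rather than the claimed implementation, reproduces the Cr1 row of Table I.

```python
# Recall (F1) and Precision (F2) for one chromosome's detection results.
def recall(corr_identified: int, actual_comm_frames: int) -> float:
    return corr_identified / actual_comm_frames         # formula F1

def precision(corr_identified: int, total_identified: int) -> float:
    return corr_identified / total_identified            # formula F2

# Cr1 in Table I: 80 correctly identified, 90 actual commercial frames,
# 100 frames identified in total.
# recall(80, 90)      # ~0.888
# precision(80, 100)  # 0.8
```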
After step 116 is performed, control passes to step 117 where, in accordance with one embodiment of the invention, the controller 10 selects certain ones of the chromosomes by employing a predetermined selection strategy that is based on the Recall and Precision values determined in step 116. Any suitable type of selection strategy may be employed in step 117, such as, for example, a stochastic selection process, a random process with a probability of selection that is proportional to fitness, a strategy which selects chromosomes yielding the highest 50% of all of the Precision and Recall values determined in step 116, a strategy which selects chromosomes yielding Recall and Precision values equaling or exceeding a predetermined value, or another suitable fitness selection strategy, depending on predetermined operating criteria. In this regard, reference may be had to the publications entitled "Genetic Algorithms And Evolutionary Programming", Artificial Intelligence: A Modern Approach, 1995, Chapter 20.8, pages 619-621, by Stuart Russell et al. (hereinafter "the Genetic Algorithms publication"), and "The CHC Adaptive Search Algorithm: How To Have Safe Search When Engaging In Nontraditional Genetic Recombination", Foundations Of Genetic Algorithms, 1991, pages 265-283, by Larry Eshelman (hereinafter "the Eshelman publication"), for a description of examples of fitness selection strategies that may be employed in step 117, although other suitable strategies may be employed instead, and the manner in which the controller 10 would be programmed to perform such strategies would be readily appreciated by one skilled in the relevant art in view of this description. Reference also may be had to U.S. Patent 5,390,283, "Method for Optimizing the Configuration of a Pick and Place Machine", by Larry Eshelman and James D. Schaffer, issued on February 14, 1995, for a description of the use of a CHC algorithm for determining a near-optimal allocation of components in a "pick and place" machine. That U.S. patent is incorporated by reference herein in its entirety.
In the present exemplary embodiment, and for the purposes of this description, step 117 is performed by determining the fitness (F) of each chromosome, using the following formula (F3), and by then selecting the chromosomes yielding the highest 50% of all of the calculated fitness values (F):
(F) = (2 * Precision * Recall) / (Precision + Recall) (F3)
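For formula F3 and the "highest 50%" selection used in this example, a minimal Python sketch is given below; the population representation (a list of chromosome/fitness pairs) is an assumption for illustration only.

```python
# Fitness per formula F3 (the harmonic mean of Precision and Recall),
# followed by a "keep the fittest half" selection as used in step 117.
def fitness(precision_value: float, recall_value: float) -> float:
    if precision_value + recall_value == 0:
        return 0.0
    return 2 * precision_value * recall_value / (precision_value + recall_value)  # F3

def select_top_half(scored):
    """scored: list of (chromosome, fitness) pairs; return the fittest half."""
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return [chromosome for chromosome, _ in ranked[: max(1, len(ranked) // 2)]]
```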
After step 117 is performed, control passes to step 118 where, according to one embodiment of the invention, each individual chromosome selected in step 117 is randomly paired with another one of those selected chromosomes, and then mated with that other selected chromosome, if the paired chromosomes are determined to be non-incestuous. For example, in one embodiment, after the chromosomes are paired together in step 118 (Fig. 5a shows an example of two randomly-paired chromosomes Cr1 and Crn), a determination is made as to whether or not the paired chromosomes are incestuous by examining, for each pair, whether or not the values of the chromosomes of the pair differ from one another (e.g., as measured by a Hamming distance) by at least an incest threshold value, such as a predetermined bit string length or some other suitable value. As an example, six of the corresponding bits of the chromosome pair in Fig. 5a differ from one another, and thus, in a case where the incest threshold value is 1/4 of the bit string length, the performance of that portion of step 118 results in a determination that those chromosomes are not incestuous. Thereafter, in accordance with one embodiment of the invention, the chromosomes determined to be non-incestuous are then mated by randomly choosing a cross-over point 300, and then swapping the bits of the pair appearing after the cross-over point so that offspring chromosomes are generated (or this may be accomplished using HUX; see the Eshelman publication). Fig. 5b shows an example of such offspring chromosomes Crk1 and Crk2 generated by the parent chromosomes of Fig. 5a (step 118). The crossover operation may be performed in any suitable manner known in the art, such as that described in relevant portions of the Eshelman publication referred to above.
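A rough sketch of the pairing, incest test, and single-point crossover just described is shown below; the bit strings, the incest threshold of one quarter of the string length, and the helper names are illustrative assumptions (HUX, per the Eshelman publication, would be an alternative recombination operator).

```python
# Illustrative single-point crossover with incest prevention (step 118).
# Chromosomes are bit strings; the incest threshold is a minimum Hamming distance.
import random

def hamming(a: str, b: str) -> int:
    return sum(x != y for x, y in zip(a, b))

def mate(parent1: str, parent2: str, incest_threshold: int):
    """Return two offspring, or None if the pair is too similar to mate."""
    if hamming(parent1, parent2) < incest_threshold:
        return None                                  # incestuous pair: do not mate
    point = random.randrange(1, len(parent1))        # random cross-over point
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2

# e.g., for 24-bit strings an incest threshold of 24 // 4 = 6 could be used.
```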
In accordance with another embodiment of the invention, the production of offspring in step 118 may be performed by, for example, randomly mutating the value of each chromosome by flipping a predetermined portion (e.g., 35%) of the bits of each chromosome, at random (with independent probability), in a manner as known in the art. Fig. 5c shows an example of one of the parent chromosomes Cr1 of Fig. 5a and an offspring chromosome Crk1 resulting from the mutation of that parent chromosome. In still another embodiment of this invention, the production of offspring during step 118 may be performed by randomly choosing a cross-over point and swapping bits in the above-described manner, and then randomly mutating individual bits of the resultant bit strings, or vice versa, in the manner described above.
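The bit-flip mutation alternative can be sketched as follows, again only as an illustration; here each bit is flipped independently with a 35% probability, matching the example proportion mentioned above.

```python
# Illustrative bit-flip mutation: each bit is flipped independently with the
# given probability (e.g., 35%), producing one offspring from one parent.
import random

def mutate(chromosome: str, rate: float = 0.35) -> str:
    return "".join(
        ('1' if bit == '0' else '0') if random.random() < rate else bit
        for bit in chromosome
    )
```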
The performance of step 118 results in there being a plurality of offspring chromosomes Crk1-Crki provided (which hereinafter also are referred to collectively as offspring population K(t)) (assuming, in the case of sexual reproduction, that at least one of the parent chromosome pairs was determined to be non-incestuous in that step, wherein for that embodiment each pair of offspring chromosomes was generated from a corresponding pair of parent chromosomes). After step 118 is performed, control passes to step 120 where each of the chromosomes Crk1-Crki is employed, in lieu of the parent chromosomes Cr1-Crn of initial population P(t), in the content detection algorithm described above in relation to step 112, and then, after that algorithm is performed for each chromosome Crk1-Crki, steps that are the same as steps 113-116 are performed for each of those chromosomes Crk1-Crki. That is, step 120 is performed in the same manner as steps 112-116 described above, except that the offspring chromosomes Crk1-Crki are employed in those steps in place of the parent chromosomes Cr1-Crn of initial population P(t). Since steps 112-116 were already described above, for convenience a further detailed description of those steps will not be made herein. It should be clear to one skilled in the relevant art in view of this description, however, how those steps are performed employing the offspring chromosomes Crk1-Crki. The performance of step 120 results in a determination of a fitness value (F) yielded for each offspring chromosome Crk1-Crki (as in step 116), in the same manner as described above.

Thereafter, control passes to step 122 where, in accordance with one embodiment of the invention, another selection of chromosomes is made, but this time the selection is made from amongst all chromosomes of the previous chromosome population P(t) (e.g., Cr1-Crn) and all chromosomes of offspring population K(t) (e.g., Crk1-Crki), to generate a new population P(t=t+1), by employing the same chromosome fitness selection strategy as that described above in relation to step 117, or any other suitable existing or later developed selection strategy. Thereafter, in step 124 a convergence determination is made, by determining whether (a) the value of the incest threshold is equal to '0' and (b) the fitness (F) of each chromosome selected in step 122 is the same. If either (a) or (b) is not true, then a determination is made as to whether there were no chromosomes selected from population K(t) (i.e., none survived) in step 122. If none were selected in that step, then the value of the incest threshold is decreased by '1' ("N" in step 124), and control then passes back to step 118 where the method then proceeds therefrom in the above-described manner, but to mate the chromosomes of the newly generated population. If, on the other hand, both (a) and (b) are determined to be true in step 124 ("Y" in step 124), then control passes to step 126.
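Steps 118 through 124 can be compressed into a single generational loop in the CHC style; the sketch below reuses the hypothetical mate() and select_top_half() helpers from the earlier sketches, and the evaluate() callback (which would run the detection algorithm of steps 112/120 and return a fitness) is assumed rather than shown.

```python
# Illustrative CHC-style generation: offspring compete with parents for
# survival, and the incest threshold decays when no offspring survive.
import random

def next_generation(population, evaluate, incest_threshold):
    random.shuffle(population)
    offspring = []
    for p1, p2 in zip(population[0::2], population[1::2]):   # random pairing (step 118)
        children = mate(p1, p2, incest_threshold)
        if children:
            offspring.extend(children)
    scored = [(c, evaluate(c)) for c in population + offspring]   # steps 120 and 122
    survivors = select_top_half(scored)
    if not any(child in survivors for child in offspring):        # no offspring survived
        incest_threshold = max(0, incest_threshold - 1)           # step 124, "N" branch
    return survivors, incest_threshold
```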
In step 126, a determination is made as to whether or not the method should be terminated. In accordance with one embodiment of the invention, that step is performed by determining if either (i) a predetermined number of chromosomes of offspring population K(t) have been evaluated in step 120 since the method first began in step 100, or (ii) a restart step 130 (described below) has been performed a predetermined number of times since the method began in step 100. In other embodiments of the invention, step 126 may be performed to determine if both of the conditions (i) and (ii) have been satisfied, or, in other embodiments, the determination may be made as to only one of the conditions (i) and (ii), although it should be noted that other suitable types of decisions besides those described herein may be employed instead, depending on applicable operating criteria.
If the performance of step 126 results in a determination of "Yes" ("Y" in step 126), control passes to step 128, which will be described below. Otherwise, if step 126 results in a determination of "No" ("N" at step 126), control passes to step 130, where a soft restart procedure is performed. In a preferred embodiment of the invention, the soft restart procedure of step 130 is performed by copying the chromosome (of the population P(t=t+1)) which, among all of the chromosomes of population P(t) evaluated in the previous performance of step 116 and offspring population K(t) evaluated in the previous performance of step 120, yielded the highest fitness (F) value (among all chromosomes of newly generated population P(t=t+1)), to provide plural (e.g., fifty) copies of that chromosome (also referred to herein as a current best one of all of those chromosomes). Preferably, each of the resulting copies, except one, is then mutated by flipping a predetermined proportion (e.g., 35%) of the bits of the copy, at random without replacement. As a result of step 130, a single, non-mutated copy of the chromosome which yielded the highest fitness (F) value, and a plurality of mutated versions of that copied chromosome, are provided. Those chromosomes (including the non-mutated and mutated copies) collectively form a new chromosome population P(t). Thereafter, control passes from step 130 back to step 112, wherein the method then continues in the above-described manner, but with the new chromosome population P(t) being employed for the various thresholds of the algorithm.

Step 128 of Fig. 3 will now be described. In step 128 the controller 10 stores information in the memory 15 specifying that the threshold values of a current best chromosome remaining in the population P(t=t+1), after both steps 124 and 126 consecutively resulted in determinations of "Yes", be employed in future operations for detecting the presence of commercial content in video streams. Those threshold values, which in this example are ST2, UIST2, MinCT2, MaxCT2, RT2, and DFST2 (Fig. 2), are considered to be the best (e.g., "optimum" or high performance) of all the chromosome threshold values ST1-STn, UIST1-UISTn, MinCT1-MinCTn, MaxCT1-MaxCTn, RT1-RTn, DFST1-DFSTn (Fig. 2), respectively, for enabling the content detection algorithm represented in Figs. 4a and 4b to detect commercial content in a video information stream with a maximum degree of accuracy.
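The soft restart of step 130 described above may be sketched as follows; the population size of fifty and the 35% mutation proportion are the example values from the text, and the bit-string representation is the same illustrative assumption used in the earlier sketches.

```python
# Illustrative soft restart (step 130): keep one unmutated copy of the best
# chromosome and fill the rest of the new population with heavily mutated
# copies, flipping ~35% of the bit positions chosen without replacement.
import random

def soft_restart(best: str, population_size: int = 50, proportion: float = 0.35):
    def heavy_mutation(chromosome: str) -> str:
        bits = list(chromosome)
        n_flips = max(1, round(proportion * len(bits)))
        for i in random.sample(range(len(bits)), n_flips):
            bits[i] = '1' if bits[i] == '0' else '0'
        return "".join(bits)
    return [best] + [heavy_mutation(best) for _ in range(population_size - 1)]
```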
Sometime later, it is assumed that a user of the server 2 operates the input user interface 11 to enter information into the server 2 specifying that a selected video stream, such as the video clip originally provided in the memory 15 in step 100 or another video information signal provided to the server 2 (e.g., a downloaded or uploaded video clip, a received broadcast video information stream, or one or more otherwise provided video clips or other video segments, etc.), be evaluated for the presence of, for example, commercial subject matter. As a result, the controller 10 responds by retrieving the optimized chromosome threshold values ST2, UIST2, MinCT2, MaxCT2, RT2, and DFST2 identified in step 128 and then performing the content detection algorithm shown in Figs. 4a and 4b, using the retrieved values for the appropriate thresholds SeparationThreshold, UnicolorInSuccThreshold, MinCommercialThreshold, MaxCommercialThreshold, RestartThreshold, and DistForSuccThreshold, respectively, employed in that algorithm. In this manner, the video information is evaluated for the presence of commercial content, based on those thresholds. The use of those optimized threshold values enables the content detection algorithm to detect commercial content in the video information with a maximum degree of accuracy. Thereafter, the results of evaluation of the video information may then be employed as desired (e.g., to delete or replace the commercial content from the signal, etc.).

The optimized threshold values identified in step 128 also may be provided to other devices, such as the user information appliance 4, for enabling those threshold values to be employed in the content detection algorithm in those devices. For example, instead of or in addition to employing the optimized chromosome threshold values in the content detection algorithm in the server 2, those values may be downloaded or otherwise provided to the user information appliance 4 for storage in the memory 18 of that appliance 4. Thereafter, those values may be retrieved by the controller 16 for use in performing the content detection algorithm (of Figs. 4a and 4b), to evaluate a selected video stream provided in the appliance 4 for the presence of commercial content, in a similar manner as described above in connection with the server 2. In other embodiments, software representing a content detection algorithm, such as, for example, the commercial detection algorithm shown in Figs. 4a and 4b, can be downloaded or be otherwise provided from the server 2 to user information appliances 4, in association with, or separately from, the optimized chromosome threshold values, and those values can then be employed in the algorithm in the information appliances to detect predetermined content in an information stream. Software representing the overall method of Fig. 3 also may be downloaded or be otherwise provided from server 2 to information appliances 4, or be pre-stored in those appliances 4, for enabling that method to be performed in those devices for determining optimum chromosome threshold values, which can then be uploaded or be otherwise provided back to the server 2, if desired, or employed in a suitable content detection algorithm in the appliances 4.

It should be noted that, although the invention is described in the context of step 117 (and part of step 120) being performed to select chromosomes based on their fitnesses yielded as a function of Recall and Precision values, in other embodiments those selections may be made based on an evaluation of only Recall values or only Precision values yielded by chromosomes, or based on any other suitable measure of accuracy, and the measures may be of a scalar or vector type.
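Returning to the use of the evolved thresholds in the detector, a hypothetical usage sketch is shown below; the numeric values are placeholders (they are not the evolved values ST2, UIST2, and so on), and the detector object is the illustrative BlackFrameState class sketched earlier, not the actual product implementation.

```python
# Hypothetical usage: plug evolved threshold values back into the detector
# and run it over the black/unicolor frames of a newly provided stream.
evolved_thresholds = {
    "dist_for_succ": 0.5,      # stands in for DistForSuccThreshold (seconds)
    "separation": 30.0,        # stands in for SeparationThreshold (seconds)
    "unicolor_in_succ": 3,     # stands in for UnicolorInSuccThreshold (count)
}
detector = BlackFrameState(**evolved_thresholds)

# for frame_no in black_frames_of_new_clip:      # produced by the frame detector
#     event = detector.on_black_frame(frame_no)  # 'potential_start', 'potential_end', or None
```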
It also should be noted that, although the invention is described in the context of the content detection algorithm being performed to identify the presence of commercials based on a black frame or unicolor frame detection technique, the invention is not limited for use with only that technique or for detecting only commercial content. It also is within the scope of this invention to employ any other suitable types of now existing or later developed techniques which can be used for detecting any type of predetermined content in analog or digital video signals or any other types of media information (e.g., audio) besides/in addition to video information (examples of at least some other techniques involving video information were discussed above), and any type of low-level, mid-level, or high-level (e.g., the presence of multiple black frames in succession) features that can be extracted, either in the compressed or uncompressed domain, may be employed in those techniques. As can be appreciated by one skilled in the art in view of this description, in cases where other types of techniques are employed besides the black/unicolor frame detection technique referred to in the above description, each technique would employ appropriate types of chromosomes that are suitable for use as parameters in the respective techniques. It should therefore be appreciated that the method of the present invention may be employed to optimize the detection of any type of desired or undesired content included in any type of media information, and is not limited for use only in conjunction with detecting commercial content in video information. Moreover, as used herein, the phrase "information stream" is not intended to limit the invention to on-line applications. Indeed, it is within the scope of this invention to evaluate any applicable type of media information, such as, for example, video information, audio information, combination video/audio information, etc., within any suitable type of environment, whether on-line or off-line, and an information stream may include one or more types of such information, depending on the application of interest and predetermined operating criteria.
Moreover, in one embodiment of the invention, each chromosome may include multiple sets of parameters, wherein each set can be used in a corresponding technique. For example, in addition to the various threshold values shown in Fig. 2 (which can be used in the algorithm of step 112 described above), each chromosome Cr1-Crn (and offspring chromosome) may also include appropriate parameter values for use in other types of techniques, such as an average cut frame distance detection technique, an average cut frame distance trend detection technique, a brand name detection technique, another type of black frame detection technique, etc., and each technique may be run separately for each chromosome, using the appropriate parameter values for that technique. In one embodiment, a user may select (in initial step 100) which technique is desired to be employed, and then, as a result, all individual chromosome parameter values besides those which are suitable for use in the selected technique (e.g., all values besides those shown in Fig. 2, in a case where an algorithm such as that shown in Figs. 4a and 4b is selected) are initialized to '0', so that no results are obtained from the non-selected techniques. In other embodiments, those parameter values need not be set to '0', and each technique may be performed as a separate content detection algorithm for yielding separate results. Chromosomes that include genes specifying other types of information besides thresholds also may be employed in accordance with this invention. For example, a gene may specify that a color histogram should be computed by algorithm A, B, C, or D (not shown). Other genes may specify alternate ways to combine selected features into a final decision about content classification. Also, the chromosome values need not be represented in bit string form, and may instead be represented in any other suitable form. It should therefore be clear that the present invention is not limited to being used only in conjunction with chromosomes that include threshold values represented by bit strings, as described herein.
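One possible way to organize a chromosome that carries parameter sets for several techniques is sketched below; this layout, including the field names and the idea of selecting one active technique, is purely illustrative and is not taken from the patent figures.

```python
# Illustrative container for a multi-technique chromosome: each field holds
# the parameter set for one detection technique, and one technique is marked
# as the active (user-selected) one.
from dataclasses import dataclass, field

@dataclass
class MultiTechniqueChromosome:
    black_frame_params: dict = field(default_factory=dict)   # e.g., the Fig. 2 thresholds
    cut_distance_params: dict = field(default_factory=dict)  # average cut frame distance
    brand_name_params: dict = field(default_factory=dict)    # brand name detection
    selected: str = "black_frame"                             # technique chosen in step 100

    def active_params(self) -> dict:
        return getattr(self, f"{self.selected}_params")
```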
It also should be noted that, although the invention is described in the context of the high performance chromosome threshold values being determined by the server 2, the invention, broadly construed, is not so limited. For example, as described above, it also is within the scope of this invention for the method depicted in Figs. 3, 4a, and 4b to be performed within other suitable devices, such as the user information appliance 4. In those embodiments, the method may be performed by evaluating a sample video clip within the devices (e.g., appliance 4), in the above-described manner, and the sample video clip may be provided in the devices from any source, such as the server 2. Also, in other embodiments, the threshold values employed in the algorithm within server 2 may be provided to the server 2 from an external source, such as information appliances 4.
Although the foregoing description is presented in the context of the method of the invention being implemented using software instructions, in other embodiments hardware circuitry may be used in place of such instructions for implementing the method of the invention. The particular types of circuitry employed would be readily appreciated by those of ordinary skill in the art, in view of this description. Also, while the invention has been described in the context of employing the above-described genetic algorithm to evolve the chromosome values used in a content detection algorithm, in other embodiments, other suitable types of genetic or other types of search algorithms may be employed instead, depending on the application of interest. A multitude of evolutionary algorithms are available that may be employed in accordance with this invention, and the particular choice of evolutionary algorithm for use in this invention is optional. It should therefore be understood that the invention is not limited for use only with the genetic algorithm described herein, and that the use of other types of evolutionary algorithms would be evident to one of ordinary skill in the art in the context of this disclosure.
It should further be noted that, although the invention is described in the context of the method selecting chromosomes based on their fitness in step 117 of Fig. 3, in another embodiment of the invention, no such selection need be performed, and the selection of chromosomes only occurs in step 122 which is performed to select amongst a present group of parent and offspring chromosomes. As can be appreciated in view of this description, in that embodiment, step 118 is performed to mate chromosomes from the present population being evaluated. As has been described in the foregoing description, the present invention provides a novel method for automatically evolving parameter values until those values substantially maximize the accuracy of a media content detection algorithm that is a function of those parameters. As a result, the thresholds enable the algorithm to detect predetermined content in a media information stream with a maximum degree of accuracy. This method is advantageous in that it improves the accuracy of such content detection algorithms automatically, and therefore relieves users of the burden of having to manually select appropriate threshold values. Moreover, by virtue of determining the optimum threshold values automatically, the method of the invention can circumvent attempts made by commercial producers to prevent the successful detection of the commercials by modifying their broadcast commercials.
While the invention has been particularly shown and described with respect to preferred embodiments thereof, it will be understood by those skilled in the art that changes in form and details may be made therein without departing from the scope and spirit of the invention.


CLAIMS:
1. A method for optimizing the performance of an algorithm (112) for detecting predetermined content in a media information stream, the algorithm (112) being a function of a set of parameters, wherein the method comprises the steps of: performing the algorithm (112) at least once to detect the predetermined content in the media information stream, while employing a respective set of parameters in the algorithm (112) for each performance thereof; and automatically evolving at least one respective set of parameters employed in the algorithm (112) to maximize the degree of accuracy at which the algorithm (112) detects the predetermined content in the media information stream.
2. A method as set forth in Claim 1, wherein the media information stream includes at least one of video and audio information, and the predetermined content includes desired or undesired content.
3. A method as set forth in Claim 1, wherein the algorithm (112) detects the predetermined content based on a detection of at least one predetermined feature derived from the media information stream.
4. A method as set forth in Claim 1, wherein the step of automatically evolving includes performing a genetic algorithm to evolve the at least one respective set of parameters.
5. A method as set forth in Claim 1, wherein the media information stream includes a video information stream divided into a plurality of frames, the predetermined content includes at least one commercial, and the algorithm (112) includes the steps of: detecting a plurality of black or unicolor frames in the video information stream; identifying the presence of a beginning portion of a commercial based on the detection of at least one of the plurality of black or unicolor frames; and identifying the presence of an ending portion of the commercial based on the detection of at least one other of the plurality of black or unicolor frames.
6. A method as set forth in Claim 1, wherein the step of automatically evolving comprises the steps of: determining the accuracy at which the algorithm (112) detects the predetermined content in the media information stream for each performance of the algorithm (112); selecting at least one of the respective sets of parameters, based on a result of the step of determining the accuracy; and producing at least one offspring set of parameters based on the at least one set of parameters selected in the selecting step.
7. A method as set forth in Claim 6, wherein the step of automatically evolving further comprises the steps of: further performing the algorithm (112) at least once to detect the presence of the predetermined content in the media information stream, while employing a respective offspring set of parameters, produced in the producing step, in the algorithm (112) for each further performance thereof; - determining the accuracy at which the algorithm (112) detects the predetermined content in the media information stream for each further performance of the algorithm (112); and further selecting one or more of at least one respective set of parameters selected in the selecting step and at least one offspring set of parameters produced in the producing step, based on a result of that step of determining.
8. A method as set forth in Claim 7, wherein the step of automatically evolving further comprises the steps of: determining if there is a convergence of all sets of parameters remaining after the further selecting step; and if there is a convergence, storing a record of at least one of those sets of parameters selected in the further selecting step.
9. A method as set forth in Claim 6, wherein the step of producing comprises: pairing randomly-selected ones of the sets of parameters selected in the selecting step; determining if the sets of parameters paired in the pairing step are incestuous; and - for each paired sets of parameters determined to be non-incestuous, swapping one or more values of the parameters of those sets with one another.
10. A method as set forth in Claim 9, wherein the step of determining if the sets of parameters paired in the pairing step are incestuous comprises: - determining a number of corresponding parameter values of each paired sets of parameters, which differ from one another, if any; and determining if the number of corresponding parameter values determined to be differing from one another is less than a predetermined incest threshold.
11. A method as set forth in Claim 7, wherein the step of automatically evolving further comprises the steps of: determining if there is a convergence of all sets of parameters remaining after the further selecting step; and if there is no convergence, - mutating at least one value of a most optimum one of all the sets of parameters remaining after the further selecting step, to produce plural mutated versions of the most optimum set of parameters; and
- performing at least some steps of the method again, beginning with performing the algorithm (112), but plural times, to detect the presence of the predetermined content in the media information stream while employing the most optimum set of parameters and the mutated versions of the most optimum set of parameters, in respective performances of the algorithm (112).
12. A method as set forth in Claim 11, wherein the step of mutating comprises: - producing plural copies of the most optimum set of parameters; and changing at least one parameter value of each of the plural copies of the most optimum set of parameters.
13. A method as set forth in Claim 8, wherein the step of producing comprises: pairing randomly-selected ones of the sets of parameters selected in the selecting step; determining if the sets of parameters paired in the pairing step are incestuous by: - determining a number of corresponding parameter values of each paired sets of parameters, which differ from one another, if any, and
- determining if the number of corresponding parameter values determined to be differing from one another is less than a predetermined incest threshold; and for each paired sets of parameters determined to be non-incestuous, swapping at least corresponding values of the parameters of those sets with one another, wherein the step of determining if there is a convergence comprises at least one of:
- determining if the predetermined incest threshold is equal to a predetermined value; and - determining if performances of the algorithm (112) employing the sets of parameters remaining after the further selecting step each result in detections of the predetermined content with substantially a same degree of accuracy.
14. A method as set forth in Claim 13, and further comprising: - determining if any offspring sets of parameters remain after the further selecting step is performed; and if no offspring set of parameters remains, decreasing the predetermined incest threshold by a predetermined reduction value.
15. A method as set forth in Claim 8, wherein if there is a convergence of all of the sets of parameters remaining after the further selecting step, performing at least one of: determining if the method has been performed a predetermined number of times; and determining if a predetermined number of offspring sets of parameters has been produced, and wherein if either of those determining steps results in an affirmative determination, performing the step of storing.
16. A method as set forth in Claim 1, wherein the step of automatically evolving includes evolving the at least one respective set of parameters employed in the algorithm (112) to generate an evolved set of parameters which is optimized to enable the algorithm (112) to detect the predetermined content in the media information stream with a maximum degree of accuracy.
17. A method as set forth in Claim 16, and further comprising the step of forwarding at least one of the algorithm (112) and the evolved set of parameters to a predetermined destination (4; 2).
18. A method for evaluating a media information stream, comprising the steps of: performing one or more algorithms (112), each to detect the presence of predetermined content in the media information stream, wherein each algorithm (112) is a function of a corresponding chromosome; and - automatically determining a value, for the chromosome of at least one of the algorithms (112), which enables that algorithm (112) to detect the presence of the predetermined content in the media information stream with an increased degree of accuracy relative to the accuracy achieved when other values are employed.
19. An apparatus (2; 4) for optimizing the performance of an algorithm (112) for detecting predetermined content in a media information stream, the algorithm (112) being a function of a set of parameters, the apparatus (2; 4) comprising: means (10; 16) for performing the algorithm (112) at least once to detect the predetermined content in the media information stream, while employing a respective set of parameters in the algorithm (112) for each performance thereof; and means (10; 6) for automatically evolving at least one respective set of parameters employed in the algorithm (112) to maximize the degree of accuracy at which the algorithm (112) detects the predetermined content in the media information stream.
20. A program product comprising computer-readable code which, when executed, performs a method for optimizing the performance of an algorithm (112) for detecting predetermined content in a media information stream, the algorithm (112) being a function of a set of parameters, the method comprising the steps of: performing the algorithm (112) at least once to detect the predetermined content in the media information stream, while employing a respective set of parameters in the algorithm (112) for each performance thereof; and automatically evolving at least one respective set of parameters employed in the algorithm (112) to maximize the degree of accuracy at which the algorithm (112) detects the predetermined content in the media information stream.
21. A storage medium (15; 18) storing a program having computer-readable code which, when executed, performs a method for optimizing the performance of an algorithm (112) for detecting predetermined content in a media information stream, the algorithm (112) being a function of a set of parameters, the method comprising the steps of: performing the algorithm (112) at least once to detect the predetermined content in the media information stream, while employing a respective set of parameters in the algorithm (112) for each performance thereof; and automatically evolving at least one respective set of parameters employed in the algorithm (112) to maximize the degree of accuracy at which the algorithm (112) detects the predetermined content in the media information stream.
22. A system (1) for exchanging information, comprising: - at least one first information apparatus (2; 4); and at least one second information apparatus (4; 2), comprising:
- an interface (14; 8), coupled to said first information apparatus (2; 4) through an external communication interface (6),
- a memory (18; 15) storing at least a program, and - a controller (16; 10) coupled to said memory (18; 15) and said interface (14;
8), said controller (16; 10) operating under the control of the program stored in said memory (18; 15) for performing a method comprising (a) performing an algorithm (112) at least once to detect predetermined content in a provided media information stream, while employing a respective set of parameters in the algorithm (112) for each performance thereof, wherein the algorithm (112) is a function of the set of parameters, (b) automatically evolving at least one respective set of parameters employed in the algorithm (112) to determine an optimum set of parameters which maximizes the degree of accuracy at which the algorithm (112) detects the predetermined content in the media information stream, and (c) forwarding information representing at least one of the algorithm (112) and the optimum set of parameters to the at least one first information apparatus (2; 4) through the interface (14; 8) and the external communication interface (6).
23. A system (1) as set forth in Claim 22, wherein the first information apparatus (2; 4) comprises: a further interface (8; 14), coupled to said interface (14; 8) of said second information apparatus (4; 2) through the external communication interface (6); and a further controller (10; 16) coupled to said further interface (8; 14), said further controller (10; 16) being responsive to said further interface (8; 14) receiving the information from said second information apparatus (4; 2) for at least one of storing the information in an associated further memory (15; 18) and performing the algorithm (112), while employing the optimum set of parameters in the algorithm (112), to detect the predetermined content in a provided information stream.
24. A system (1) as set forth in Claim 22, wherein the first information apparatus (2; 4) is operable for providing the information stream to said controller (16; 10) through said interface (14; 8) and said external communication interface (6), and wherein said controller (16; 10) performs the method after receiving that provided information stream.
PCT/IB2002/005713 2001-12-31 2002-12-23 Method, apparatus, and program for evolving algorithms for detecting WO2003056832A1 (en)

Priority Applications (5)

Application Number Priority Date Filing Date Title
AU2002367237A AU2002367237A1 (en) 2001-12-31 2002-12-23 Method, apparatus, and program for evolving algorithms for detecting
EP02790652A EP1464178B1 (en) 2001-12-31 2002-12-23 Method, apparatus, and program for evolving algorithms for detecting content in information streams
KR10-2004-7010323A KR20040070290A (en) 2001-12-31 2002-12-23 Method, apparatus, and program for evolving algorithms for detecting
DE60219523T DE60219523D1 (en) 2001-12-31 2002-12-23 METHOD, DEVICE AND PROGRAM FOR DEVELOPING DETECTION ALGORITHMS
JP2003557215A JP4347056B2 (en) 2001-12-31 2002-12-23 Method, apparatus, and program for evolving algorithms for detecting content in an information stream

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US10/029,916 US7337455B2 (en) 2001-12-31 2001-12-31 Method, apparatus, and program for evolving algorithms for detecting content in information streams
US10/029,916 2001-12-31

Publications (1)

Publication Number Publication Date
WO2003056832A1 true WO2003056832A1 (en) 2003-07-10

Family

ID=21851556

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2002/005713 WO2003056832A1 (en) 2001-12-31 2002-12-23 Method, apparatus, and program for evolving algorithms for detecting

Country Status (9)

Country Link
US (1) US7337455B2 (en)
EP (1) EP1464178B1 (en)
JP (1) JP4347056B2 (en)
KR (1) KR20040070290A (en)
CN (1) CN100512430C (en)
AT (1) ATE359672T1 (en)
AU (1) AU2002367237A1 (en)
DE (1) DE60219523D1 (en)
WO (1) WO2003056832A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005228326A (en) * 2004-02-09 2005-08-25 Sap Ag Data processing system, method for displaying customized parameter of application program, and computer program product

Families Citing this family (42)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8875198B1 (en) 2001-08-19 2014-10-28 The Directv Group, Inc. Network video unit
US9602862B2 (en) 2000-04-16 2017-03-21 The Directv Group, Inc. Accessing programs using networked digital video recording devices
US7917008B1 (en) 2001-08-19 2011-03-29 The Directv Group, Inc. Interface for resolving recording conflicts with network devices
US10390074B2 (en) 2000-08-08 2019-08-20 The Directv Group, Inc. One click web records
US9171851B2 (en) * 2000-08-08 2015-10-27 The Directv Group, Inc. One click web records
US8949374B2 (en) 2000-08-08 2015-02-03 The Directv Group, Inc. Method and system for remote television replay control
AU2002243448A1 (en) * 2000-10-24 2002-06-24 Singingfish.Com, Inc. Method of sizing an embedded media player page
WO2004028153A1 (en) * 2002-09-23 2004-04-01 Koninklijke Philips Electronics N.V. A video recorder unit and method of operation therefor
US7735104B2 (en) * 2003-03-20 2010-06-08 The Directv Group, Inc. System and method for navigation of indexed video content
US8752115B2 (en) * 2003-03-24 2014-06-10 The Directv Group, Inc. System and method for aggregating commercial navigation information
US20050193426A1 (en) * 2004-02-27 2005-09-01 Raja Neogi System and method to control fingerprint processing in a media network
TWI275300B (en) * 2004-04-02 2007-03-01 Mstar Semiconductor Inc Method of processing fields of images
WO2005124782A1 (en) * 2004-06-18 2005-12-29 Matsushita Electric Industrial Co., Ltd. Av content processing device, av content processing method, av content processing program, and integrated circuit used in av content processing device
CA2574998C (en) 2004-07-23 2011-03-15 Nielsen Media Research, Inc. Methods and apparatus for monitoring the insertion of local media content into a program stream
JP2007228343A (en) * 2006-02-24 2007-09-06 Orion Denki Kk Digital broadcast receiver
US8671346B2 (en) * 2007-02-09 2014-03-11 Microsoft Corporation Smart video thumbnail
US20090158157A1 (en) * 2007-12-14 2009-06-18 Microsoft Corporation Previewing recorded programs using thumbnails
US8997150B2 (en) * 2008-03-10 2015-03-31 Hulu, LLC Method and apparatus for permitting user interruption of an advertisement and the substitution of alternate advertisement version
US20090320060A1 (en) * 2008-06-23 2009-12-24 Microsoft Corporation Advertisement signature tracking
US8209713B1 (en) 2008-07-11 2012-06-26 The Directv Group, Inc. Television advertisement monitoring system
US8055749B1 (en) * 2008-09-30 2011-11-08 Amazon Technologies, Inc. Optimizing media distribution using metrics
CN102256642B (en) * 2008-12-16 2014-09-17 松下健康医疗器械株式会社 Medication administering device
US10116902B2 (en) 2010-02-26 2018-10-30 Comcast Cable Communications, Llc Program segmentation of linear transmission
US9258175B1 (en) 2010-05-28 2016-02-09 The Directv Group, Inc. Method and system for sharing playlists for content stored within a network
EP2622557B1 (en) 2010-09-27 2019-07-17 Hulu, LLC Method and apparatus for providing directed advertising based on user preferences
US8489526B2 (en) * 2010-11-24 2013-07-16 International Business Machines Corporation Controlling quarantining and biasing in cataclysms for optimization simulations
US9563844B2 (en) 2011-06-30 2017-02-07 International Business Machines Corporation Speculative asynchronous sub-population evolutionary computing utilizing a termination speculation threshold
US8577814B1 (en) * 2011-07-28 2013-11-05 Amazon Technologies, Inc. System and method for genetic creation of a rule set for duplicate detection
US8966520B2 (en) 2011-10-03 2015-02-24 Hulu, LLC Video ad swapping in a video streaming system
US9165247B2 (en) 2012-01-04 2015-10-20 International Business Machines Corporation Using global and local catastrophes across sub-populations in parallel evolutionary computing
US9066159B2 (en) 2012-10-23 2015-06-23 Hulu, LLC User control of ad selection for subsequent ad break of a video
US9064149B1 (en) * 2013-03-15 2015-06-23 A9.Com, Inc. Visual search utilizing color descriptors
US9299009B1 (en) 2013-05-13 2016-03-29 A9.Com, Inc. Utilizing color descriptors to determine color content of images
US9305257B2 (en) 2013-05-20 2016-04-05 International Business Machines Corporation Adaptive cataclysms in genetic algorithms
CN104063313B (en) * 2014-04-15 2017-05-17 深圳英飞拓科技股份有限公司 Intelligent analytical algorithm test system and method
US9369780B2 (en) * 2014-07-31 2016-06-14 Verizon Patent And Licensing Inc. Methods and systems for detecting one or more advertisement breaks in a media content stream
CA2967572C (en) * 2014-11-12 2023-09-26 Analytics Media Group, LLC Media planning system
US10027995B2 (en) * 2016-01-21 2018-07-17 Treepodia Ltd. System and method for generating media content in evolutionary manner
US9872049B1 (en) * 2016-06-30 2018-01-16 SnifferCat, Inc. Systems and methods for dynamic stitching of advertisements
US10083369B2 (en) 2016-07-01 2018-09-25 Ricoh Company, Ltd. Active view planning by deep learning
US10129586B2 (en) * 2016-12-19 2018-11-13 Google Llc Detecting and isolating television program content from recordings of television airings
CN107623863B (en) * 2017-09-21 2020-11-06 广州华多网络科技有限公司 Algorithm testing method and device and server

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0735754A2 (en) * 1995-03-30 1996-10-02 Deutsche Thomson-Brandt Gmbh Method and apparatus for the classification of television signals

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4314285A (en) * 1979-05-11 1982-02-02 Bonner Edgar L Editing system for video apparatus
JPH01284092A (en) * 1988-01-26 1989-11-15 Integrated Circuit Technol Ltd Method and apparatus for discriminating and eliminating specific data from video signal
US5086479A (en) * 1989-06-30 1992-02-04 Hitachi, Ltd. Information processing system using neural network learning function
US5436653A (en) * 1992-04-30 1995-07-25 The Arbitron Company Method and system for recognition of broadcast segments
US5390283A (en) * 1992-10-23 1995-02-14 North American Philips Corporation Method for optimizing the configuration of a pick and place machine
US5333091B2 (en) * 1993-01-08 1996-12-17 Arthur D Little Enterprises Method and apparatus for controlling a videotape player to automatically scan past recorded commercial messages
US6100941A (en) * 1998-07-28 2000-08-08 U.S. Philips Corporation Apparatus and method for locating a commercial disposed within a video data stream
US6366296B1 (en) * 1998-09-11 2002-04-02 Xerox Corporation Media browser using multimodal analysis
US6577346B1 (en) * 2000-01-24 2003-06-10 Webtv Networks, Inc. Recognizing a pattern in a video segment to identify the video segment
US6957200B2 (en) * 2001-04-06 2005-10-18 Honeywell International, Inc. Genotic algorithm optimization method and network

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0735754A2 (en) * 1995-03-30 1996-10-02 Deutsche Thomson-Brandt Gmbh Method and apparatus for the classification of television signals

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
CALIANI M ET AL: "Computer analysis of TV spots: the semiotics perspective", MULTIMEDIA COMPUTING AND SYSTEMS, 1998. PROCEEDINGS. IEEE INTERNATIONAL CONFERENCE ON AUSTIN, TX, USA 28 JUNE-1 JULY 1998, LOS ALAMITOS, CA, USA,IEEE COMPUT. SOC, US, 28 June 1998 (1998-06-28), pages 170 - 179, XP010291591, ISBN: 0-8186-8557-3 *
LIENHART R ET AL: "On the detection and recognition of television commercials", MULTIMEDIA COMPUTING AND SYSTEMS '97. PROCEEDINGS., IEEE INTERNATIONAL CONFERENCE ON OTTAWA, ONT., CANADA 3-6 JUNE 1997, LOS ALAMITOS, CA, USA,IEEE COMPUT. SOC, US, 3 June 1997 (1997-06-03), pages 509 - 516, XP010239226, ISBN: 0-8186-7819-4 *
SANCHEZ J M ET AL: "AudiCom: a video analysis system for auditing commercial broadcasts", MULTIMEDIA COMPUTING AND SYSTEMS, 1999. IEEE INTERNATIONAL CONFERENCE ON FLORENCE, ITALY 7-11 JUNE 1999, LOS ALAMITOS, CA, USA,IEEE COMPUT. SOC, US, 7 June 1999 (1999-06-07), pages 272 - 276, XP010519397, ISBN: 0-7695-0253-9 *
VALKEALAHTI K ET AL: "Reduced multidimensional histograms in color texture description", PATTERN RECOGNITION, 1998. PROCEEDINGS. FOURTEENTH INTERNATIONAL CONFERENCE ON BRISBANE, QLD., AUSTRALIA 16-20 AUG. 1998, LOS ALAMITOS, CA, USA,IEEE COMPUT. SOC, US, 16 August 1998 (1998-08-16), pages 1057 - 1061, XP010297694, ISBN: 0-8186-8512-3 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005228326A (en) * 2004-02-09 2005-08-25 Sap Ag Data processing system, method for displaying customized parameter of application program, and computer program product
JP4675639B2 (en) * 2004-02-09 2011-04-27 エスアーペー アーゲー Data processing system, method for displaying customization parameters of application program, and computer program product

Also Published As

Publication number Publication date
JP2005513967A (en) 2005-05-12
US7337455B2 (en) 2008-02-26
US20030126598A1 (en) 2003-07-03
KR20040070290A (en) 2004-08-06
EP1464178A1 (en) 2004-10-06
ATE359672T1 (en) 2007-05-15
AU2002367237A1 (en) 2003-07-15
JP4347056B2 (en) 2009-10-21
CN1611076A (en) 2005-04-27
DE60219523D1 (en) 2007-05-24
CN100512430C (en) 2009-07-08
EP1464178B1 (en) 2007-04-11

Similar Documents

Publication Publication Date Title
US7337455B2 (en) Method, apparatus, and program for evolving algorithms for detecting content in information streams
US10181015B2 (en) System for identifying content of digital data
US6993245B1 (en) Iterative, maximally probable, batch-mode commercial detection for audiovisual content
JP4202316B2 (en) Black field detection system and method
KR101001172B1 (en) Method and apparatus for similar video content hopping
US6819863B2 (en) System and method for locating program boundaries and commercial boundaries using audio categories
JP4182369B2 (en) Recording / reproducing apparatus and method, and recording medium
EP1624391A2 (en) Systems and methods for smart media content thumbnail extraction
US7742680B2 (en) Apparatus and method for processing signals
EP2172010B1 (en) Digital video recorder collaboration and similar media segment determination
JP2008211777A (en) System and method for indexing commercials in video presentation
EP1067786B1 (en) Data describing method and data processor
EP1383079A2 (en) Method, apparatus, and program for evolving neural network architectures to detect content in media information
US8325803B2 (en) Signal processing apparatus, signal processing method, and program
US7054388B2 (en) Signal detection method and apparatus, relevant program, and storage medium storing the program
JP2007110188A (en) Recording apparatus, recording method, reproducing apparatus, and reproducing method
KR20050033075A (en) Unit for and method of detection a content property in a sequence of video images
JP2007066409A (en) Recording and reproducing apparatus, and recording and reproducing method
CN102034520B (en) Electronic device and content reproduction method
JP2011078028A (en) Electronic equipment, and method and program for generating metadata

Legal Events

Date Code Title Description
AK Designated states
Kind code of ref document: A1
Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SC SD SE SG SK SL TJ TM TN TR TT TZ UA UG UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents
Kind code of ref document: A1
Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR IE IT LU MC NL PT SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: The EPO has been informed by WIPO that EP was designated in this application

WWE Wipo information: entry into national phase
Ref document number: 2002790652
Country of ref document: EP

WWE Wipo information: entry into national phase
Ref document number: 20028264592
Country of ref document: CN
Ref document number: 2003557215
Country of ref document: JP
Ref document number: 1020047010323
Country of ref document: KR

WWP Wipo information: published in national office
Ref document number: 2002790652
Country of ref document: EP

WWG Wipo information: grant in national office
Ref document number: 2002790652
Country of ref document: EP