WO2015038121A1 - Video segmentation by audio selection - Google Patents
- Publication number
- WO2015038121A1 PCT/US2013/059343
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- user
- content
- playback
- scenes
- request
- Prior art date
Classifications
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/10—Indexing; Addressing; Timing or synchronising; Measuring tape travel
- G11B27/34—Indicating arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/783—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/7834—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using audio features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/7867—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, title and artist information, manually generated time, location and usage information, user ratings
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/10—Indexing; Addressing; Timing or synchronising; Measuring tape travel
- G11B27/19—Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier
- G11B27/28—Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/239—Interfacing the upstream path of the transmission network, e.g. prioritizing client content requests
- H04N21/2393—Interfacing the upstream path of the transmission network, e.g. prioritizing client content requests involving handling client requests
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/41—Structure of client; Structure of client peripherals
- H04N21/422—Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
- H04N21/42203—Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS] sound input device, e.g. microphone
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
- H04N21/472—End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
- H04N21/47211—End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for requesting pay-per-view content
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
- H04N21/472—End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
- H04N21/4728—End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for selecting a Region Of Interest [ROI], e.g. for requesting a higher resolution version of a selected region
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/81—Monomedia components thereof
- H04N21/8106—Monomedia components thereof involving special audio data, e.g. different tracks for different languages
Definitions
- the present invention relates to video content playback, and, in particular to the video content playback based upon audio selections.
- video content has been broken (segmented) into scenes either due to editing decisions or how a script is written.
- scenes are typically used for trick play functions (jump to the next scene) or chapters that are accessed on a DVD/Blu-Ray disc.
- the present invention segments video content based upon the audio (music) associations of the video instead of as in the prior art based on scenes associated with the video.
- the present invention provides a framework for breaking up a video for a media asset such as a television show or movie or other content (including multimedia content) into segments that correlate to the audio/music selections instead of scenes as in the prior art.
- a method and apparatus including receiving a user request, determining if the user request is a playback request, displaying to a user, playback options, if the user request is the playback request, receiving the playback options selected by the user, determining if the playback options selected by the user are by audio selection and forwarding scenes that match the audio selection of the user for playback.
- Fig. 1 is a flowchart of an exemplary method for segmenting a video into different segments that correlate to the audio associated with such a video.
- Fig. 2 presents the results of the application of the described method of the present invention.
- Figs. 3A and 3B together form a flowchart of an exemplary embodiment of the method of the present invention.
- Fig. 4 is a block diagram of an exemplary embodiment in accordance with the principles of the present invention.
- the present invention provides a mechanism for breaking up (segmenting) a video (such as a movie or a television show or any other form of content) into audio selections, which can be music selections such as songs or background music.
- the video segments can then be mapped to such audio segments (called M) instead of to the scenes by which a video is typically segmented.
- Video scenes are typically the way a video is constructed either due to editing decisions, how a script is written, commercial break requirements, and the like.
- Fig. 1 is a flowchart of an exemplary method for segmenting a video into different segments that correlate to the audio associated with such a video.
- the invention begins with a video asset that is composed of a number of scenes. Each scene can then be associated with a number (0 to X) of music selections which can be songs associated with different music artists.
- a media asset containing audio and video is ingested into a server or other type of storage device.
- a time reference is assigned to the audio and the video of the media asset.
- such a time reference can be timestamps although other mechanisms can be used for this step.
- an audio recognition mechanism is applied to recognize the different audio segments in the media asset.
- Such audio recognition techniques can include a program such as Shazam, which identifies songs/music selections based on an audio fingerprint of the audio selection, AudioID developed by the Fraunhofer Institute, detection of volume changes in the audio such as periods of silence that exceed a threshold time value, and the like.
- Other audio recognition mechanisms can be used and/or combined with Shazam and/or AudioID.
- the video is segmented by way of the audio selections that were identified in the previous step. That is, instead of having the video scenes control how a video is organized, it will be the audio selections which are used to segment a video. Hence, the beginning and the end of an audio selection will have a video segment corresponding in kind.
- the time basis for the audio and the video segment would then match up using the time reference (e.g., time stamps).
- the audio segment can be referred by a song title, artist, copyright, and the like if such information can be determined by the characteristics of the audio segment, whereby such information can be assigned as metadata to such audio selections.
- Fig. 2 presents the results of the application of the described method of the present invention where a video that is 30 minutes in length is divided according to Music Selections M1, M2, M3, M4, M5, and M6.
- Fig. 2 displays the corresponding scenes S1, S2, S3, and S4.
- Table 1 presents an example of audio selections and the related metadata of such selections.
- a user can specify that a video be played back where the audio segments that correspond to a certain artist (M83, Prince) are used to select the corresponding video segments that are played back (for example, M1 for M83, and M2 and M3 for Prince).
- a user can also be provided with an option to buy the corresponding audio segments in a form such as an MP3 from a service such as M-Go, Amazon, and the like if a video is presented with a corresponding list of audio selections as presented in Table 1.
- a first aspect of the invention lets a user play back all of the scenes that match a particular artist or song title. Hence, if Prince is selected as the relevant artist in the present example, both the 1st scene and the 2nd scene will be played back (from the beginning of such scenes or from where such songs begin, in accordance with a user preference).
- the playback of such scenes can be done in sequential order or in a randomized order.
- a second aspect of the invention will let a user playback "all" of the scenes that pertain to a particular artist or song title where all of such scenes are available to a user.
- Such scenes can be from different movies, television shows, and the like.
- One feature of this aspect of the present invention is that the scenes that are played back can have the SAME song performed by a different artist.
- a cover version of "BLACKBIRD" could be played back with a corresponding scene instead of having the "4th scene" where the Beatles performed the version of "BLACKBIRD". This is a substitute song feature.
- a third aspect will let a user buy MP3 versions of songs from a service provider such as AMAZON, M-GO, and the like.
- alternative versions of songs can be offered as well, such as "Blackbird" by the Beatles together with respective cover versions of the song.
- the invention could be performed by a content server or content distribution system.
- the segmented content could be stored on a user's device including but not limited to televisions, set top boxes, computers, laptops, dual mode smart phones, iPods, iPads, iPhones, and tablets.
- the third aspect of the present invention involving purchasing certain audio would be coordinated with the user's content provider in terms of billing (accounting).
- Other secondary aspects of the third aspect of the present invention could include offering users "deals" if they purchased, for example, "Blackbird" by all artists that recorded that song, or if the user purchased an entire Beatles album on which the song "Blackbird" was included, or if they purchased other movies or movie segments that included the song "Blackbird".
- Figs. 3A and 3B together form a flowchart of an exemplary embodiment of the method of the present invention.
- a user request is received.
- a test is performed to determine if the user request is a playback request.
- a playback request may be by movie, TV show, other content, by scene, or by audio selection. If the user request is a playback request, then at 315, the playback options are displayed to the user.
- the user's playback options are received.
- a test is performed to determine if the playback option selected is by audio selection (rather than by scene as in the prior art).
- If the playback option selected is not by audio selection, the movie, TV show or scene selected by the user is forwarded to the user for playback at 330. If the playback option selected is by audio selection then at 335, a test is performed to determine if all scenes in the current content that match the user's audio selection are to be played back. If all scenes in the current content that match the user's audio selection are to be played back then a test is performed at 340 to determine if the playback is to start at the beginning of each scene of the current content that matches the user's audio selection.
- If the playback is to start at the beginning of each scene of the current content that matches the user's audio selection then the scenes are forwarded (in the order that the scenes occur or shuffled in some random order) to the user for playback at 345. If the playback is not to start at the beginning of each scene of the current content that matches the user's audio selection then at 350 the scenes are forwarded (in the order that the scenes occur or shuffled in some random order), starting at the point where the audio selection begins, to the user for playback. If all scenes in the current content that match the user's audio selection are not to be played back then at 355 a test is performed to determine if all available scenes that match the user's audio selection are to be played back.
- If all available scenes that match the user's audio selection are not to be played back then at 380, the available scenes that match the user's audio selection are displayed. At 385, the user's scene selections from the available scenes are received. Steps/operations 340, 345 and 350 will not be described again since they are the same as described above.
- If the user request was not a playback request, then it is assumed to be a purchase request for audio selections and at 365 a request for audio selections and account information to make the purchase is displayed.
- the user's account information is received and validated.
- the audio selections purchased by the user are transmitted to the user.
- Fig. 4 is a block diagram of an exemplary embodiment in accordance with the principles of the present invention.
- the present invention is practiced at a content provider's equipment/system.
- the user interface module 405 is in bi-directional communication with the user.
- User interface module 405 is also in bi-directional communication with a user request module 410, which parses the user's request to determine if the user's request is a purchase request or a playback request. If the user's request is a purchase request then the user request module 410, which is in bi-directional communication with the purchase request module 415, forwards the user's purchase request to the purchase request module 415.
- the purchase request module 415 is in bi-directional communication with the account information module 420.
- the account information module 420 requests and receives the user's account information via the user request module 410 and the user interface module 405 and validates the user's account information. Once the user's account information is validated the account information module 420 returns confirmation to the purchase request module 415.
- the purchase request module 415 requests and retrieves the user's audio selection information via the user request module 410 and the user interface module 405.
- the user's audio selection information is forwarded to the retrieve audio selection module 425 with which the purchase request module is in bi-directional communication.
- the retrieve audio selection module 425 retrieves the user's audio selection from the content storage and retrieval module 445 with which the retrieve audio selection module 425 is in bi-directional communication.
- If the user's request is a playback request, the user request module 410 forwards the request to the playback request module 430 with which the user request module 410 is in bi-directional communication.
- the playback request module 430 determines if the playback is by scene (or entire content) or by audio selection. If the playback is by scene (or entire content) then the playback request module 430 forwards the user's selection to the retrieve content by scene or entire content module 440 with which the playback request module 430 is in bi-directional communication.
- the retrieve content by scene or entire content module 440 retrieves the user's content selection from the content storage and retrieval module 445 with which the retrieve content by scene or entire content module 440 is in bi-directional communication.
- If the playback is by audio selection, the playback request module 430 forwards the user's selection to the retrieve content by audio selection module 435 with which the playback request module 430 is in bi-directional communication.
- the user's selection information includes whether to playback scenes starting at the beginning of each scene or starting at the point that the audio selection begins.
- the user's selection information also includes whether the user wants to playback scenes that match the user's audio selection in the current content or all available content.
- the retrieve content by audio selection module 435 retrieves the content that matches the user's audio selection from the content storage and retrieval module 445 with which the retrieve content by audio selection module 435 is in bi-directional communication. All content retrieved from the content storage and retrieval module 445 is transmitted (forwarded) to the user on the reverse path from which the request was received via the bi-directional communication paths.
- the means for receiving a user request is the user interface module.
- the means for determining if said user request is a playback request is the user request module.
- the means for displaying to a user, playback options, if said user request is said playback request is the user interface module.
- the means for receiving said playback options selected by said user is the user interface module.
- the means for determining if said playback options selected by said user are by audio selection is the playback request module.
- the means for forwarding scenes that match said audio selection of said user for playback is via retrieve content by audio selection module (having retrieved the selected content from the content storage and retrieval module).
- the retrieve content by audio selection module forwards the retrieved content back to the playback request module which forwards the retrieved content to the user request module which forwards the retrieved content to the user interface module, which forwards the retrieved content to the user for playback.
- the means for determining if said user has selected playing back of all scenes in content currently being viewed by said user is the playback request module.
- the means for determining if said user has selected playing back content starting at a beginning of each scene or said user has selected playing back content where said audio selection selected by said user commences is the playback request module.
- the means for forwarding content selected by said user for playback, if said playback options selected by said user are not by audio selection is via retrieve content by scene or entire content module (having retrieved the selected content from the content storage and retrieval module).
- the retrieve content by scene or entire content module forwards the retrieved content back to the playback request module which forwards the retrieved content to the user request module which forwards the retrieved content to the user interface module, which forwards the retrieved content to the user for playback.
- the means for determining if said user has selected playing back all scenes in all available content, if said user has not selected playing back scenes that match the user's audio selection from said current content is the playback request module.
- the means for displaying a list of all scenes of all available content that match said audio selection of said user is the user interface module.
- the means for receiving said user's scene selection from said list of all scenes of all available content that match said audio selection of said user is the user interface module.
- the means for determining if said user has selected playing back content starting at a beginning of each scene or said user has selected playing back content where said audio selection selected by said user commences is the playback request module.
- the means for receiving said audio selections for purchase and account information from said user is the user interface module.
- the means for validating said user's account information is the account information module.
- the means for transmitting said purchased selections to said user is via the retrieve audio selection module (having retrieved the selected content from the content storage and retrieval module).
- the retrieve audio selection module forwards the retrieved content back to the purchase request module which forwards the retrieved content to the user request module which forwards the retrieved content to the user interface module, which forwards the retrieved content to the user for playback.
- the present invention may be implemented in various forms of hardware, software, firmware, special purpose processors, or a combination thereof.
- Special purpose processors may include application specific integrated circuits (ASICs), reduced instruction set computers (RISCs) and/or field programmable gate arrays (FPGAs).
- the present invention is implemented as a combination of hardware and software.
- the software is preferably implemented as an application program tangibly embodied on a program storage device.
- the application program may be uploaded to, and executed by, a machine comprising any suitable architecture.
- the machine is implemented on a computer platform having hardware such as one or more central processing units (CPU), a random access memory (RAM), and input/output (I/O) interface(s).
- the computer platform also includes an operating system and microinstruction code.
- the various processes and functions described herein may either be part of the microinstruction code or part of the application program (or a combination thereof), which is executed via the operating system.
- various other peripheral devices may be connected to the computer platform such as an additional data storage device and a printing device.
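The request handling of Figs. 3A and 3B, as outlined in the list above, can be pictured as a small decision tree. The following is only a minimal sketch under stated assumptions: the `Request`, `Scope`, and `route` names are hypothetical and do not appear in the patent, and the returned strings merely label the flowchart steps whose numbers are given in the text.

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    """Which matching scenes the user wants (hypothetical names)."""
    CURRENT_CONTENT = "current"   # all matching scenes in the current content
    ALL_AVAILABLE = "all"         # all matching scenes in all available content
    USER_PICKED = "picked"        # user picks from a displayed list of matches


@dataclass
class Request:
    is_playback: bool
    by_audio_selection: bool = False
    scope: Scope = Scope.CURRENT_CONTENT
    from_scene_start: bool = True


def route(request: Request) -> str:
    """Return a label for the flowchart branch that handles the request."""
    if not request.is_playback:
        # Non-playback requests are assumed to be purchase requests.
        return "purchase: ask for audio selections and account info (365)"
    if not request.by_audio_selection:
        return "forward selected movie, show, or scene (330)"
    prefix = ""
    if request.scope is Scope.USER_PICKED:
        prefix = "display matches and receive scene picks (380, 385); "
    if request.from_scene_start:
        return prefix + "forward matching scenes from scene start (345)"
    return prefix + "forward matching scenes from audio start (350)"
```

Keeping the routing as a pure function that returns a label makes each branch easy to check against the flowchart before any retrieval logic is attached.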
Abstract
A method and apparatus are described including receiving a user request, determining if the user request is a playback request, displaying to a user, playback options, if the user request is the playback request, receiving the playback options selected by the user, determining if the playback options selected by the user are by audio selection and forwarding scenes that match the audio selection of the user for playback.
Description
VIDEO SEGMENTATION BY AUDIO SELECTION
FIELD OF THE INVENTION
The present invention relates to video content playback, and, in particular to the video content playback based upon audio selections.
BACKGROUND OF THE INVENTION
Conventionally, video content has been broken (segmented) into scenes either due to editing decisions or how a script is written. Such scenes are typically used for trick play functions (jump to the next scene) or chapters that are accessed on a DVD/Blu-Ray disc.
SUMMARY OF THE INVENTION
The present invention segments video content based upon the audio (music) associations of the video instead of as in the prior art based on scenes associated with the video. The present invention provides a framework for breaking up a video for a media asset such as a television show or movie or other content (including multimedia content) into segments that correlate to the audio/music selections instead of scenes as in the prior art.
A method and apparatus are described including receiving a user request, determining if the user request is a playback request, displaying to a user, playback options, if the user request is the playback request, receiving the playback options selected by the user, determining if the playback options selected by the user are by audio selection and forwarding scenes that match the audio selection of the user for playback.
BRIEF DESCRIPTION OF THE DRAWINGS
The present invention is best understood from the following detailed description when read in conjunction with the accompanying drawings. The drawings include the following figures briefly described below:
Fig. 1 is a flowchart of an exemplary method for segmenting a video into different segments that correlate to the audio associated with such a video.
Fig. 2 presents the results of the application of the described method of the present invention.
Figs. 3A and 3B together form a flowchart of an exemplary embodiment of the method of the present invention.
Fig. 4 is a block diagram of an exemplary embodiment in accordance with the principles of the present invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
The present invention provides a mechanism for breaking up (segmenting) a video (such as a movie or a television show or any other form of content) into audio selections, which can be music selections such as songs or background music. The video segments can then be mapped to such audio segments (called M) instead of to the scenes by which a video is typically segmented. Video scenes are typically the way a video is constructed either due to editing decisions, how a script is written, commercial break requirements, and the like.
Fig. 1 is a flowchart of an exemplary method for segmenting a video into different segments that correlate to the audio associated with such a video. The invention begins with a video asset that is composed of a number of scenes. Each scene can then be associated with a number (0 to X) of music selections which can be songs associated with different music artists. At 105, a media asset containing audio and video is ingested into a server or other type of storage device. At 110, a time reference is assigned to the audio and the video of the media asset. Preferably, such a time reference can be timestamps although other mechanisms can be used for this step. At 115, an audio recognition mechanism is applied to recognize the different audio segments in the media asset. Such audio recognition techniques can include a program such as Shazam, which identifies songs/music selections based on an audio fingerprint of the audio selection, AudioID developed by the Fraunhofer Institute, detection of volume changes in the audio such as periods of silence that exceed a threshold time value, and the like. Other audio recognition mechanisms can be used and/or combined with Shazam and/or AudioID. At 120, the video is segmented by way of the audio selections that were identified in the previous step. That is, instead of having the video scenes control how a video is organized, it will be the audio selections which are used to segment a video. Hence, the beginning and the end of an audio selection will have a video segment corresponding in kind. The time basis for the audio and the video segment would then match up using the time reference (e.g., time stamps). In an alternative embodiment, if the audio segment is identified using an audio recognition technique, the audio segment can be referred to by a song title, artist, copyright, and the like if such information can be determined from the characteristics of the audio segment, whereby such information can be assigned as metadata to such audio selections.
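As an illustration of steps 115 and 120, the following is a minimal sketch of one of the mentioned techniques, namely splitting the audio at periods of silence that exceed a threshold time value and giving each audio segment a video segment with the same time range. The sampling scheme, the threshold values, and all names (`Segment`, `find_audio_segments`, `video_segments_for`) are assumptions made for illustration; the patent does not specify them, and a fingerprinting service could be used instead.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class Segment:
    start: float  # seconds on the shared time reference
    end: float


def find_audio_segments(
    levels: Sequence[Tuple[float, float]],
    duration: float,
    silence_level: float = 0.05,   # assumed volume below which audio is "silent"
    min_silence: float = 2.0,      # assumed threshold time value, in seconds
) -> List[Segment]:
    """Split [0, duration] wherever silence lasts at least min_silence.

    levels is a time-ordered list of (timestamp, volume) samples.
    """
    segments: List[Segment] = []
    seg_start = 0.0
    silence_start: Optional[float] = None
    for t, volume in levels:
        if volume < silence_level:
            if silence_start is None:
                silence_start = t          # a silent run begins here
            continue
        if silence_start is not None and t - silence_start >= min_silence:
            # A long enough silence closes the current audio selection.
            if silence_start > seg_start:
                segments.append(Segment(seg_start, silence_start))
            seg_start = t
        silence_start = None
    # Close the final segment, trimming a long trailing silence if present.
    end = duration
    if silence_start is not None and duration - silence_start >= min_silence:
        end = silence_start
    if end > seg_start:
        segments.append(Segment(seg_start, end))
    return segments


def video_segments_for(audio_segments: Sequence[Segment]) -> List[Segment]:
    """Each audio selection gets a video segment with the same time range."""
    return [Segment(s.start, s.end) for s in audio_segments]
```

Because audio and video share one time reference, the video segmentation is simply a copy of the audio boundaries; metadata such as title or artist could be attached to each segment once a recognition step supplies it.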
Fig. 2 presents the results of the application of the described method of the present invention where a video that is 30 minutes in length is divided according to Music Selections M1, M2, M3, M4, M5, and M6. In addition, Fig. 2 displays the corresponding scenes S1, S2, S3, and S4.
Table 1 presents an example of audio selections and the related metadata of such selections. Using the information listed below, a user can specify that a video be played back where the audio segments that correspond to a certain artist (M83, Prince) are used to select the corresponding video segments that are played back (for example, M1 for M83, and M2 and M3 for Prince). A user can also be provided with an option to buy the corresponding audio segments in a form such as an MP3 from a service such as M-Go, Amazon, and the like if a video is presented with a corresponding list of audio selections as presented in Table 1. Other examples are possible in accordance with the described principles.
A first aspect of the invention lets a user play back all of the scenes that match a particular artist or song title. Hence, if Prince is selected as the relevant artist in the present example, both the 1st scene and the 2nd scene will be played back (from the beginning of such scenes or from where such songs begin, in accordance with a user preference). In addition, the playback of such scenes can be done in sequential order or in a randomized order.
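The first aspect can be sketched as a lookup over per-selection metadata. The following Python example is only an illustration: the record layout, the function name `playback_points`, the placeholder song titles, and all times (in minutes) are assumptions and are not taken from Table 1. Only the pairing of M1 with M83, and of M2 and M3 with Prince in the 1st and 2nd scenes, follows the text.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Selection:
    music_id: str
    artist: str
    title: str
    scene: str
    song_start: float    # minutes; hypothetical values
    scene_start: float   # minutes; hypothetical values


def playback_points(selections: List[Selection], artist: Optional[str] = None,
                    title: Optional[str] = None, from_scene_start: bool = True,
                    shuffle: bool = False,
                    rng: Optional[random.Random] = None) -> List[Tuple[str, float]]:
    """Return (scene, start time) pairs for scenes matching an artist and/or title."""
    matches = [s for s in selections
               if (artist is None or s.artist == artist)
               and (title is None or s.title == title)]
    matches.sort(key=lambda s: s.song_start)   # sequential order by default
    if from_scene_start:
        points, seen = [], set()
        for s in matches:
            if s.scene not in seen:            # play each matching scene once
                seen.add(s.scene)
                points.append((s.scene, s.scene_start))
    else:
        points = [(s.scene, s.song_start) for s in matches]
    if shuffle:
        (rng or random).shuffle(points)        # randomized order on request
    return points


catalog = [
    Selection("M1", "M83", "Placeholder A", "S1", 0.0, 0.0),
    Selection("M2", "Prince", "Placeholder B", "S1", 4.0, 0.0),
    Selection("M3", "Prince", "Placeholder C", "S2", 9.0, 8.0),
]
from_scene = playback_points(catalog, artist="Prince")
from_song = playback_points(catalog, artist="Prince", from_scene_start=False)
```

With these made-up times, selecting Prince yields the 1st and 2nd scenes, either from the start of each scene or from the point where each song begins.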
A second aspect of the invention will let a user play back "all" of the scenes that pertain to a particular artist or song title, where all of such scenes are available to the user. Such scenes can be from different movies, television shows, and the like. One feature of this aspect of the present invention is that the scenes that are played back can have the SAME song performed by a different artist. Hence, a cover version of "BLACKBIRD" could be played back with a corresponding scene instead of having the "4th scene" where the Beatles performed the version of "BLACKBIRD". This is a substitute song feature.
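A short Python sketch of the substitute song feature follows. It assumes a flat, hypothetical library of scene records spanning several pieces of content; the names `SceneMatch` and `scenes_for_title`, the content names, and the cover artist are invented for illustration.

```python
from typing import List, NamedTuple, Optional


class SceneMatch(NamedTuple):
    content: str   # movie or show the scene belongs to (hypothetical names)
    scene: str
    title: str
    artist: str


def scenes_for_title(library: List[SceneMatch], title: str,
                     exclude_artist: Optional[str] = None) -> List[SceneMatch]:
    """Find scenes across all available content that use a song, whoever performs it."""
    wanted = title.casefold()   # match "BLACKBIRD" and "Blackbird" alike
    return [m for m in library
            if m.title.casefold() == wanted and m.artist != exclude_artist]


library = [
    SceneMatch("Movie A", "4th scene", "Blackbird", "The Beatles"),
    SceneMatch("Show B", "2nd scene", "Blackbird", "Cover Artist"),
    SceneMatch("Movie A", "1st scene", "Placeholder B", "Prince"),
]
all_versions = scenes_for_title(library, "BLACKBIRD")
covers_only = scenes_for_title(library, "BLACKBIRD", exclude_artist="The Beatles")
```

Matching on title alone, rather than on title plus artist, is what allows a cover version's scene to stand in for the original performance.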
A third aspect will let a user buy MP3 versions of songs from a service provider such as AMAZON, M-GO, and the like. With this embodiment, alternative versions of songs can be offered as well as "Blackbird" from the Beatles and respective cover versions of the song.
The invention could be performed by a content server or content distribution system. In an alternative embodiment, the segmented content could be stored on a user's device including but not limited to televisions, set top boxes, computers, laptops, dual mode smart phones, iPods, iPads, iPhones, and tablets. As processing power increases and processing devices get smaller (and less costly) much of the above described segmentation could be performed by an end user's device. The third aspect of the present invention involving purchasing certain audio would be coordinated with the user's content provider in terms of billing (accounting). Other secondary aspects of the third aspect of the present invention could include offering user's "deals" if they purchased, for example, "Blackbird" by all artists that recorded that song or perhaps if the user purchased an entire Beatles album on which the song "Blackbird" was included or if they purchased other movies or movie segments that included the song "Blackbird".
Figs. 3A and 3B together form a flowchart of an exemplary embodiment of the method of the present invention. At 305, a user request is received. At 310, a test is performed to determine if the user request is a playback request. A playback request may be by movie, TV show, other content, by scene, or by audio selection. If the user request is a playback request, then at 315, the playback options are displayed to the user. At 320, the user's playback options are received. At 325, a test is performed to determine if the playback option selected is by audio selection (rather than by scene as in the prior art). If the playback option selected is not by audio selection, then the movie, TV show or scene selected by the user is forwarded to the user for playback at 330. If the playback option selected is by audio selection, then at 335 a test is performed to determine if all scenes in the current content that match the user's audio selection are to be played back. If so, a test is performed at 340 to determine if the playback is to start at the beginning of each scene of the current content that matches the user's audio selection. If the playback is to start at the beginning of each such scene, then at 345 the scenes are forwarded to the user for playback, either in the order in which they occur or shuffled (in some random order). If the playback is not to start at the beginning of each such scene, then at 350 the scenes are forwarded to the user for playback, in the order in which they occur or shuffled, starting at the point where the audio selection begins. If all scenes in the current content that match the user's audio selection are not to be played back, then at 355 a test is performed to determine if all available scenes that match the user's audio selection are to be played back.
If all available scenes that match the user's audio selection are to be played back, then at 360 all available scenes that match the user's audio selection are forwarded to the user for playback. It should be understood that the user may play back the received content immediately or store the received content for playback at a later time more convenient for the user.
If all available scenes that match the user's audio selection are not to be played back, then at 380 the available scenes that match the user's audio selection are displayed. At 385, the user's scene selections from the available scenes are received. Steps/operations 340, 345 and 350 will not be described again since they are the same as described above.
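The decision flow of Figs. 3A and 3B can be traced with a small Python sketch that returns the reference numerals visited for a given request. The `Request` fields and the function name `steps_taken` are hypothetical. The sketch also assumes that, after the user picks scenes at 385, the flow re-enters at 340 and proceeds to 345 or 350, and that a non-playback request runs 365, 370 and 375 as described for purchases.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Request:
    is_playback: bool
    by_audio_selection: bool = False
    all_in_current_content: bool = False
    all_available: bool = False
    from_scene_start: bool = True


def steps_taken(req: Request) -> List[int]:
    """Return the flowchart reference numerals visited for a request."""
    steps = [305, 310]
    if not req.is_playback:
        return steps + [365, 370, 375]   # assumed to be a purchase request
    steps += [315, 320, 325]
    if not req.by_audio_selection:
        return steps + [330]             # play the movie, TV show or scene as selected
    steps.append(335)
    if not req.all_in_current_content:
        steps.append(355)
        if req.all_available:
            return steps + [360]         # forward all available matching scenes
        steps += [380, 385]              # user picks from the list of matching scenes
    steps.append(340)
    steps.append(345 if req.from_scene_start else 350)
    return steps


purchase = steps_taken(Request(is_playback=False))
current_from_song = steps_taken(Request(True, True, True, False, False))
picked_scenes = steps_taken(Request(True, True, False, False, True))
```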
If the user request was not a playback request, then it is assumed to be a purchase request for audio selections and at 365 a request for audio selections and account information to make the purchase is displayed. At 370, the user's account information is received and validated. At 375, the audio selections purchased by the user are transmitted to the user.
Fig. 4 is a block diagram of an exemplary embodiment in accordance with the principles of the present invention. The present invention is practiced at a content provider's equipment/system. The user interface module 405 is in bi-directional communication with the user. User interface module 405 is also in bi-directional communication with a user request module 410, which parses the user's request to determine if the user's request is a purchase request or a playback request. If the user's request is a purchase request then the user request module 410, which is in bi-directional communication with the purchase request module 415, forwards the user's purchase request to the purchase request module 415. The purchase request module 415 is in bi-directional communication with the account information module 420. The account information module 420 requests and receives the user's account information via the user request module 410 and the user interface module 405 and validates the user's account information. Once the user's account information is validated, the account information module 420 returns confirmation to the purchase request module 415. The purchase request module 415 then requests and retrieves the user's audio selection information via the user request module 410 and the user interface module 405. The user's audio selection information is forwarded to the retrieve audio selection module 425, with which the purchase request module 415 is in bi-directional communication. The retrieve audio selection module 425 retrieves the user's audio selection from the content storage and retrieval module 445, with which the retrieve audio selection module 425 is in bi-directional communication.
If the user request is determined by the user request module 410 to be a playback request then the user request module 410 forwards the request to the playback request module 430, with which the user request module 410 is in bi-directional communication. The playback request module 430 determines if the playback is by scene (or entire content) or by audio selection. If the playback is by scene (or entire content) then the playback request module 430 forwards the user's selection to the retrieve content by scene or entire content module 440, with which the playback request module 430 is in bi-directional communication. The retrieve content by scene or entire content module 440 retrieves the user's content selection from the content storage and retrieval module 445, with which the retrieve content by scene or entire content module 440 is in bi-directional communication.
If the user's request is determined by the playback request module 430 to be a request to play back by audio selection then the playback request module 430 forwards the user's selection to the retrieve content by audio selection module 435, with which the playback request module 430 is in bi-directional communication. The user's selection information includes whether to play back scenes starting at the beginning of each scene or starting at the point that the audio selection begins. The user's selection information also includes whether the user wants to play back scenes that match the user's audio selection in the current content or in all available content. The retrieve content by audio selection module 435 retrieves the content that matches the user's audio selection from the content storage and retrieval module 445, with which the retrieve content by audio selection module 435 is in bi-directional communication. All content retrieved from the content storage and retrieval module 445 is transmitted (forwarded) to the user on the reverse path from which the request was received via the bi-directional communication paths.
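The request and return paths through the modules of Fig. 4 can be summarized in a brief Python sketch. The module reference numerals come from the description; the path table, the request-kind keys, and the function names are illustrative assumptions. The side exchange with the account information module 420 during a purchase is not modeled.

```python
from typing import Dict, List

# Reference numerals and module names as used in the description of Fig. 4.
MODULES: Dict[int, str] = {
    405: "user interface",
    410: "user request",
    415: "purchase request",
    420: "account information",
    425: "retrieve audio selection",
    430: "playback request",
    435: "retrieve content by audio selection",
    440: "retrieve content by scene or entire content",
    445: "content storage and retrieval",
}

# Forward path of each kind of request (the keys are hypothetical labels).
REQUEST_PATHS: Dict[str, List[int]] = {
    "purchase": [405, 410, 415, 425, 445],
    "playback_by_scene": [405, 410, 430, 440, 445],
    "playback_by_audio": [405, 410, 430, 435, 445],
}


def request_path(kind: str) -> List[int]:
    """Modules a request passes through on its way to content storage."""
    return list(REQUEST_PATHS[kind])


def return_path(kind: str) -> List[int]:
    """Retrieved content travels back along the reverse of the request path."""
    return request_path(kind)[::-1]


def describe(path: List[int]) -> str:
    return " -> ".join(f"{MODULES[n]} module {n}" for n in path)


audio_return = return_path("playback_by_audio")
summary = describe(request_path("purchase")[:2])
```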
The means for receiving a user request is the user interface module. The means for determining if said user request is a playback request is the user request module. The means for displaying to a user, playback options, if said user request is said playback request is the user interface module. The means for receiving said playback options selected by said user is the user interface module. The means for determining if said playback options selected by said user are by audio selection is the playback request module. The means for forwarding scenes that match said audio selection of said user for playback is via retrieve content by audio selection module (having retrieved the selected content from the content storage and retrieval module). The retrieve content by audio selection module forwards the retrieved content back to the playback request module which forwards the retrieved content to the user request module which forwards the retrieved content to the user interface module, which forwards the retrieved content to the user for playback.
The means for determining if said user has selected playing back of all scenes in content currently being viewed by said user is the playback request module. The means for determining if said user has selected playing back content starting at a beginning of each scene or said user has selected playing back content where said audio selection selected by said user commences is the playback request module.
The means for forwarding content selected by said user for playback, if said playback options selected by said user are not by audio selection is via retrieve content by scene or entire content module (having retrieved the selected content from the content storage and retrieval module). The retrieve content by scene or entire content module forwards the retrieved content back to the playback request module which forwards the retrieved content to the user request module which forwards the retrieved content to the user interface module, which forwards the retrieved content to the user for playback.
The means for determining if said user has selected playing back all scenes in all available content, if said user has not selected playing back scenes that match the user's audio selection from said current content is the playback request module. The means for displaying a list of all scenes of all available content that match said audio selection of said user is the user interface module. The means for receiving said user's scene selection from said list of all scenes of all available content that match said audio selection of said user is the user interface module. The means for determining if said user has selected playing back content starting at a beginning of each scene or said user has selected playing back content where said audio selection selected by said user commences is the playback request module.
The means for receiving said audio selections for purchase and account information from said user is the user interface module. The means for validating said user's account information is the account information module. The means for transmitting said purchased selections to said user is via the retrieve audio selection module (having retrieved the selected content from the content storage and retrieval module). The retrieve audio selection module forwards the retrieved content back to the purchase request module which forwards the retrieved content to the user request module which forwards the retrieved content to the user interface module, which forwards the retrieved content to the user for playback.
It is to be understood that the present invention may be implemented in various forms of hardware, software, firmware, special purpose processors, or a combination thereof. Special purpose processors may include application specific integrated circuits (ASICs), reduced instruction set computers (RISCs) and/or field programmable gate arrays (FPGAs). Preferably, the present invention is implemented as a combination of hardware and software. Moreover, the software is preferably implemented as an application program tangibly embodied on a program storage device. The application program may be uploaded to, and executed by, a machine comprising any suitable architecture. Preferably, the machine is implemented on a computer platform having hardware such as one or more central processing units (CPU), a random access memory (RAM), and input/output (I/O) interface(s). The computer platform also includes an operating system and microinstruction code. The various processes and functions described herein may either be part of the microinstruction code or part of the application program (or a combination thereof), which is executed via the operating system. In addition, various other peripheral devices may be connected to the computer platform such as an additional data storage device and a printing device.
It is to be further understood that, because some of the constituent system components and method steps depicted in the accompanying figures are preferably implemented in software, the actual connections between the system components (or the process steps) may differ depending upon the manner in which the present invention is programmed. Given the teachings herein, one of ordinary skill in the related art will be able to contemplate these and similar implementations or configurations of the present invention.
Claims
CLAIMS:
1. A method, said method comprising: receiving a user request; determining if said user request is a playback request; displaying to a user, playback options, if said user request is said playback request; receiving said playback options selected by said user; determining if said playback options selected by said user are by audio selection; and forwarding scenes that match said audio selection of said user for playback.
2. The method according to claim 1, further comprising: determining if said user has selected playing back of all scenes in content currently being viewed by said user; and determining if said user has selected playing back content starting at a beginning of each scene or said user has selected playing back content where said audio selection selected by said user commences.
3. The method according to claim 1, further comprising forwarding content selected by said user for playback, if said playback options selected by said user are not by audio selection.
4. The method according to claim 2, further comprising determining if said user has selected playing back all scenes in all available content, if said user has not selected playing back scenes that match the user's audio selection from said current content.
5. The method according to claim 4, further comprising: displaying a list of all scenes of all available content that match said audio selection of said user; receiving said user's scene selection from said list of all scenes of all available content that match said audio selection of said user; and determining if said user has selected playing back content starting at a beginning of each scene or said user has selected playing back content where said audio selection selected by said user commences.
6. The method according to claim 1, further comprising: receiving said audio selections for purchase and account information from said user; validating said user's account information; and transmitting said purchased selections to said user.
7. A system, comprising: means for receiving a user request; means for determining if said user request is a playback request; means for displaying to a user, playback options, if said user request is said playback request; means for receiving said playback options selected by said user; means for determining if said playback options selected by said user are by audio selection; and means for forwarding scenes that match said audio selection of said user for playback.
8. The system according to claim 7, further comprising: means for determining if said user has selected playing back of all scenes in content currently being viewed by said user; and means for determining if said user has selected playing back content starting at a beginning of each scene or said user has selected playing back content where said audio selection selected by said user commences.
9. The system according to claim 7, further comprising means for forwarding content selected by said user for playback, if said playback options selected by said user are not by audio selection.
10. The system according to claim 8, further comprising means for determining if said user has selected playing back all scenes in all available content, if said user has not selected playing back scenes that match the user's audio selection from said current content.
11. The system according to claim 10, further comprising: means for displaying a list of all scenes of all available content that match said audio selection of said user; means for receiving said user's scene selection from said list of all scenes of all available content that match said audio selection of said user; and means for determining if said user has selected playing back content starting at a beginning of each scene or said user has selected playing back content where said audio selection selected by said user commences.
12. The system according to claim 7, further comprising: means for receiving said audio selections for purchase and account information from said user; means for validating said user's account information; and means for transmitting said purchased selections to said user.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/US2013/059343 WO2015038121A1 (en) | 2013-09-12 | 2013-09-12 | Video segmentation by audio selection |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2015038121A1 true WO2015038121A1 (en) | 2015-03-19 |
Family
ID=49226576
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2015038121A1 (en) |
-
2013
- 2013-09-12 WO PCT/US2013/059343 patent/WO2015038121A1/en active Application Filing
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 13765906; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 13765906; Country of ref document: EP; Kind code of ref document: A1 |