US20100023330A1 - Speed podcasting - Google Patents

Speed podcasting

Info

Publication number
US20100023330A1
Authority
US
United States
Prior art keywords
words
audio
segments
essential
playback
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US12/180,906
Other versions
US9953651B2 (en
Inventor
Corville O. Allen
Albert A. Chung
Binh C. Truong
Kam K. Yee
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp
Priority to US12/180,906
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION reassignment INTERNATIONAL BUSINESS MACHINES CORPORATION ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: TRUONG, BINH C., CHUNG, ALBERT A., ALLEN, CORVILLE O., LEE, KAM K.
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION reassignment INTERNATIONAL BUSINESS MACHINES CORPORATION CORRECTIVE ASSIGNMENT TO CORRECT THE NAME OF ONE OF THE ASSIGNORS FROM LEE, KAM K. TO YEE, KAM K. PREVIOUSLY RECORDED ON REEL 021303 FRAME 0737. ASSIGNOR(S) HEREBY CONFIRMS THE ASSIGNMENT OF ASSIGNORS' INTEREST TO INTERNATIONAL BUSINESS MACHINES CORPORATION. Assignors: TRUONG, BINH C., CHUNG, ALBERT A., ALLEN, CORVILLE O., YEE, KAM K.
Publication of US20100023330A1
Priority to US15/960,534
Application granted
Publication of US9953651B2
Current legal status: Active
Adjusted expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems

Abstract

Embodiments of the present invention address deficiencies of the art in respect to podcasting and provide a method, system and computer program product for speed podcasting. In an embodiment of the invention, a speed podcasting method can include speech recognizing an audio portion of a podcast, parsing the speech recognized audio portion to identify essential words, and playing back only audio segments and corresponding video segments of the podcast including the essential words while excluding from playback audio segments and corresponding video segments of the podcast including non-essential words.

Description

    BACKGROUND OF THE INVENTION
  • 1. Field of the Invention
  • The present invention relates to the field of media playback and more particularly to the field of podcasting for computer communications networks.
  • 2. Description of the Related Art
  • Historically, content has been disseminated for public consumption through each of the print medium, video medium and audio medium. Early forms of content publication relied exclusively upon print media such as newspapers, books and magazines. The early age of motion pictures provided an alternative mode of content distribution, typically through news reels preceding a feature film. Radio also provided a mode of content distribution in which consumers listen to content rather than view or read content. The dawn of television further advanced the distribution of content in an audiovisual medium to a degree rivaling print media. The explosive adoption of the World Wide Web as a primary source of content for consumption, however, remains unprecedented.
  • Media storage and handling technologies, including modern audio and video compression algorithms, when merged with the accessibility of the World Wide Web, provide for a wide variety of modes of content distribution to please every conceivable type of prospective content consumer. Through the gateway of a simple Web page, content browsers can retrieve text, imagery, audio, video and audiovisual materials with a single click of a mouse button. Recent advances in broadband connectivity render streaming media for viewing in a Web browser a simple exercise. However, most media browsing technologies of the Internet require persistent connectivity to the Internet.
  • Notwithstanding, computing users of today are a mobile sort and seldom enjoy broadband connectivity at all times. Yet, computing users have become accustomed to consuming network distributed media content at all hours—particularly at home, while exercising or when traveling out of range of wireless network connectivity. Capitalizing on the mobile nature of computing users, well known large manufacturers of consumer electronics have engaged in a protracted effort to manufacture mobile media playback devices enabled to retrieve and store content from over the Internet—particularly music and videos—and to permit playback through the devices while lacking a network connection at a subsequent time.
  • Arising out of this protracted effort, new forms of media have been developed to capitalize on the nature of the mobile content consumer. Podcasting represents one such new form of media. A podcast is a brief segment of audiovisual material recorded for distribution to mobile video playback devices for playback that is not dependent upon network connectivity. Podcasts are periodic in nature in that end users often subscribe to a podcast and in consequence of a subscription, new editions of the subscribed podcast are downloaded to the mobile video playback device as those new editions become available and as the mobile video playback device obtains network connectivity to the source of the new edition of the podcast.
  • Despite the utility of podcasts as a content distribution medium, podcasts lack some of the elemental convenience factors of traditional print media. For example, in traditional print media a content consumer can scan an article to quickly ascertain whether the article is of interest. To the extent the article is not of interest, there is no need for the content consumer to read the entire article. The same is not true of a podcast. In a podcast, the content consumer must listen to the entirety of the podcast in order to determine whether the podcast is of interest. Recent technologies attempt to address this shortcoming of podcasting through the speech recognition of the audio portion of a podcast to permit keyword searching of the relevant portions of the podcast. Exemplary products produce a heatmap within the playback control of the podcast to indicate where in the podcast a word of interest can be found.
  • BRIEF SUMMARY OF THE INVENTION
  • Embodiments of the present invention address deficiencies of the art in respect to podcasting and provide a novel and non-obvious method, system and computer program product for speed podcasting. In an embodiment of the invention, a speed podcasting method can include speech recognizing an audio portion of a podcast, parsing the speech recognized audio portion to identify essential words, and playing back only audio segments and corresponding video segments of the podcast including the essential words while excluding from playback audio segments and corresponding video segments of the podcast including non-essential words. In this regard, an audio segment can correspond to a phrase or sentence proximate to an identified essential word.
  • In one aspect of the embodiment, playing back only audio segments and corresponding video segments of the podcast including the essential words while excluding from playback audio segments and corresponding video segments of the podcast including non-essential words can include matching words in the speech recognized audio portion to essential words in a data store of essential words, determining audio segments of the audio portion including the matched words, and playing back the determined audio segments while excluding other audio segments of the audio portion from playback.
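  • The matching aspect described above can be sketched in a few lines. This is a minimal illustration only, assuming the transcript has already been split into text segments; the names `words_of` and `select_segments` and the sample data are hypothetical and do not come from the specification.

```python
import re

def words_of(text):
    """Return the lowercase word tokens of one transcript segment."""
    return re.findall(r"[a-z']+", text.lower())

def select_segments(segments, essential_words):
    """Return the indices of segments containing at least one essential word."""
    store = {w.lower() for w in essential_words}  # stand-in for the data store
    return [i for i, seg in enumerate(segments)
            if store.intersection(words_of(seg))]

# Hypothetical transcript segments and essential-word store.
segments = [
    "Well, you know, anyway",
    "the merger closes on Friday",
    "so, um, yeah",
    "shareholders approved the budget",
]
essential = {"merger", "budget"}
kept = select_segments(segments, essential)
print(kept)  # [1, 3]
```

  • Only the segments at the returned indices would be queued for playback; every other segment is skipped.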
  • In another aspect of the embodiment, playing back only audio segments and corresponding video segments of the podcast including the essential words while excluding from playback audio segments and corresponding video segments of the podcast including non-essential words can include identifying non-essential words in the speech recognized audio portion, and excluding audio segments of the audio portion from playback including the identified non-essential words while playing back other audio segments of the audio portion.
  • In yet another aspect of the embodiment, the method also can include consulting a thesaurus to retrieve words synonymous with the matched words, and adding the synonymous words to the data store of essential words. In even yet another aspect of the embodiment, playing back the determined audio segments while excluding other audio segments of the audio portion from playback can include applying a rating to each word in the data store of essential words, associating a playback speed with each rating, excluding from the determined audio segments audio segments with essential words associated with a rating inconsistent with a contemporaneously selected playback speed, and playing back only the determined audio segments while excluding other audio segments of the audio portion from playback.
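  • The thesaurus step can be illustrated as follows. The sketch assumes the thesaurus is a plain mapping from a word to its synonyms and the data store is a set; both representations, and the name `expand_with_synonyms`, are assumptions made for illustration.

```python
def expand_with_synonyms(matched_words, thesaurus, store):
    """Add the synonyms of each matched word to the essential-word store.

    The store is updated in place and also returned for convenience.
    Words with no thesaurus entry contribute nothing.
    """
    for word in matched_words:
        store.update(thesaurus.get(word, ()))
    return store

# Hypothetical thesaurus and data store contents.
thesaurus = {"merger": ["acquisition", "takeover"], "budget": ["appropriation"]}
store = {"merger", "budget", "election"}
matched = ["merger"]  # words of the transcript that matched the store

expand_with_synonyms(matched, thesaurus, store)
print(sorted(store))
# ['acquisition', 'budget', 'election', 'merger', 'takeover']
```

  • Only synonyms of words actually matched in the transcript are added, so "appropriation" stays out of the store in this example.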
  • In another embodiment of the invention, a mobile video playback device can be provided. The device can include a computer program and system configured to play back podcasts stored in the mobile video playback device, a datastore of essential words, a speech recognition engine, and speed podcasting logic executing under management of the operating system. The logic can include program code enabled to speech recognize an audio portion of a selected podcast, to parse the speech recognized audio portion to identify words present in the datastore of essential words, and to play back only audio segments and corresponding video segments of the podcast including the matched words while excluding from playback audio segments and corresponding video segments of the podcast including non-essential words. Optionally, the device also can include a thesaurus such that the program code can be further enabled to locate in the thesaurus words synonymous with the identified words present in the datastore of essential words and to add the synonymous words to the datastore of essential words.
  • Additional aspects of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The aspects of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the appended claims. It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
  • BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS
  • The accompanying drawings, which are incorporated in and constitute part of this specification, illustrate embodiments of the invention and together with the description, serve to explain the principles of the invention. The embodiments illustrated herein are presently preferred, it being understood, however, that the invention is not limited to the precise arrangements and instrumentalities shown, wherein:
  • FIG. 1 is a schematic illustration of a process for speed podcasting;
  • FIG. 2 is a schematic illustration of a mobile podcasting playback device configured for speed podcasting; and,
  • FIG. 3 is a flow chart illustrating a process for speed podcasting in the mobile podcasting playback device of FIG. 2.
  • DETAILED DESCRIPTION OF THE INVENTION
  • Embodiments of the present invention provide a method, system and computer program product for speed podcasting. In accordance with an embodiment of the present invention, an audio portion of a podcast can be speech recognized into a transcript and the words of the transcript can be filtered into a set of essential words and non-essential words. Thereafter, during a speed playback of the podcast, only the video and audio portions corresponding to the essential words can be played back while the portions of the video and audio of the podcast corresponding to the non-essential words can be skipped. Consequently, a content consumer speed podcasting a podcast can quickly ascertain the “gist” of the podcast without being compelled to listen to the entire podcast.
  • In illustration, FIG. 1 schematically depicts a process for speed podcasting. As shown in FIG. 1, a mobile video playback device 110 can be configured to play back a podcast 120 including synchronized audio and video portions. The audio portions 130 of the podcast 120 can be speech recognized to produce a transcript of both essential words 150A and non-essential words 150B. In this regard, non-essential words 150B can include articles, adverbs and adjectives not essential to the subject matter of the podcast 120, while essential words 150A can include nouns and verbs directed to the subject matter of the podcast 120. The placement of the nouns and verbs in a phrase can be determinative in classifying a word as non-essential or essential. Optionally, thesaurus 160 can be consulted to expand the essential words 150A to include synonymous words.
  • Once the speech recognized audio portions 130 of the podcast 120 have been parsed into essential words 150A and non-essential words 150B, only essential audio segments 140A of the audio portions 130 with essential words 150A can be played back during speed podcasting, while non-essential audio segments 140B of the audio portions 130 with non-essential words 150B can be omitted from playback of the podcast 120. As yet another option, some of the essential words 150A can be rated as more relevant than others of the essential words 150A. Consequently, a playback speed setting 170 can specify which of the essential words 150A are to be associated with the essential audio segments 140A during speed podcasting of the podcast 120.
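  • The split into essential words 150A and non-essential words 150B can be sketched by word type. The example below assumes each transcript word already carries a part-of-speech tag from some upstream tagger; the tag names and the function `classify` are hypothetical. It ignores the placement of a word within its phrase, which the description says can also be determinative, and it treats any untagged word type as non-essential, which is an assumption of this sketch.

```python
# Hypothetical tag set; the specification names word types, not tags.
ESSENTIAL_TAGS = {"NOUN", "VERB"}

def classify(tagged_words):
    """Split (word, tag) pairs into essential and non-essential word lists."""
    essential, non_essential = [], []
    for word, tag in tagged_words:
        target = essential if tag in ESSENTIAL_TAGS else non_essential
        target.append(word)
    return essential, non_essential

tagged = [
    ("the", "ARTICLE"), ("council", "NOUN"), ("quickly", "ADVERB"),
    ("approved", "VERB"), ("new", "ADJECTIVE"), ("budget", "NOUN"),
]
essential, non_essential = classify(tagged)
print(essential)      # ['council', 'approved', 'budget']
print(non_essential)  # ['the', 'quickly', 'new']
```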
  • In further illustration, FIG. 2 is a schematic illustration of a mobile podcasting playback device configured for speed podcasting. As shown in FIG. 2, mobile video playback device 210 can include a podcast playback computer program or system 220 and a data store 230 of podcasts 240. The podcast playback computer program or system 220 of the mobile video playback device 210 can be configured to play back selected ones of the podcasts including both audio and video portions of the selected ones of the podcasts. Speed podcasting logic 250 can be coupled to the podcast playback computer program or system 220 and also to a speech recognition engine 260. The speed podcasting logic 250 can include program code enabled to play back only audio segments of a selected one of the podcasts 240 that contain essential words 270 stored in the data store 230 and present in a transcript of the selected one of the podcasts 240 produced by speech recognition engine 260.
  • In operation, the program code of the speed podcasting logic 250 can direct speech recognition engine 260 to speech recognize an audio portion of a selected one of the podcasts 240 into a transcript. The program code of the speed podcasting logic 250 further can parse the transcript into audio segments containing words matching the essential words 270 disposed in data store 230 while excluding other audio segments (presumptively containing non-essential words). Optionally, thesaurus 280 can be consulted to identify words synonymous with the words matching the essential words 270 to augment the essential words 270 in the data store 230. In any event, a subset of audio segments of the audio of the selected one of the podcasts 240 can be produced that contain only essential words 270 in the data store 230. Finally, the program code of the speed podcasting logic 250 can direct the podcast playback computer program or system 220 to play back only the audio segments in the subset during a speed podcasting operation.
  • In yet further illustration of the operation of the mobile podcasting playback device, FIG. 3 is a flow chart illustrating a process for speed podcasting. Beginning in block 310, a podcast can be loaded including synchronized audio and video portions. In block 320, the audio portions of the loaded podcast can be speech recognized to produce a transcript and in block 330, the transcript can be indexed with the audio and video portions to maintain synchronization between the transcript, audio and video portions. In block 340, a playback speed can be determined for speed podcasting of the loaded podcast.
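  • The indexing of block 330 can be pictured as attaching a time span to every recognized word. The sketch below assumes the speech recognition engine reports start and end times in seconds for each word; the `TimedWord` type and `build_index` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TimedWord:
    text: str
    start: float  # seconds from the beginning of the podcast
    end: float

def build_index(timed_words):
    """Map each lowercase word to the (start, end) spans where it is spoken."""
    index = {}
    for w in timed_words:
        index.setdefault(w.text.lower(), []).append((w.start, w.end))
    return index

# Hypothetical recognizer output for a short utterance.
transcript = [
    TimedWord("The", 0.0, 0.2),
    TimedWord("merger", 0.2, 0.7),
    TimedWord("closes", 0.7, 1.1),
    TimedWord("the", 1.1, 1.2),
    TimedWord("deal", 1.2, 1.6),
]
index = build_index(transcript)
print(index["the"])     # [(0.0, 0.2), (1.1, 1.2)]
print(index["merger"])  # [(0.2, 0.7)]
```

  • Because the audio and video portions are synchronized, the same time span can be used to locate the matching video, which is one way the three can be kept in step.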
  • Thereafter, in block 350 by reference to a data store of essential words and also rules for characterizing different word types like articles and pronouns, adverbs and adjectives as non-essential in nature, the transcript can be parsed to filter out non-essential words and also essential words not having a rating high enough to meet a threshold value corresponding to the determined playback speed. Further, in block 360, additional words can be added to the data store of essential words that are synonymous with the essential words identified in the transcript.
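  • The rating threshold of block 350 can be sketched as a lookup from playback speed to a minimum rating. The integer ratings, the three speed levels and the `SPEED_THRESHOLDS` table are all assumptions chosen for illustration; the specification does not fix a rating scale.

```python
# Hypothetical mapping: a faster speed demands a higher word rating.
SPEED_THRESHOLDS = {1: 1, 2: 2, 3: 3}

def words_for_speed(ratings, speed):
    """Return the essential words whose rating meets the speed's threshold."""
    threshold = SPEED_THRESHOLDS[speed]
    return {word for word, rating in ratings.items() if rating >= threshold}

# Hypothetical ratings held alongside the essential words.
ratings = {"merger": 3, "budget": 2, "meeting": 1}

print(sorted(words_for_speed(ratings, 1)))  # ['budget', 'meeting', 'merger']
print(sorted(words_for_speed(ratings, 2)))  # ['budget', 'merger']
print(sorted(words_for_speed(ratings, 3)))  # ['merger']
```

  • At the fastest setting only the highest rated words survive, so fewer segments are played and the podcast finishes sooner.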
  • Finally, in block 370, only audio segments indexed to correspond to the essential words in the transcript can be played back while audio segments indexed to correspond to the non-essential words in the transcript can be omitted. For instance, each audio segment can correspond to a phrase or sentence proximate to an identified essential word. As an example, a set of words beginning from zero or more words preceding an essential word and concluding with zero or more words following the essential word can be considered an audio segment. As another example, all words leading from an essential word to a punctuation mark can be considered an audio segment.
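  • The first segment definition above, a window of words around each essential word, can be sketched as follows. The window sizes `before` and `after`, the function name `window_segments` and the merging of overlapping or touching windows are assumptions of this sketch rather than requirements of the specification.

```python
def window_segments(words, essential, before=1, after=1):
    """Return merged (start, end) word-index ranges around essential words.

    Each range covers `before` words ahead of an essential word and
    `after` words behind it; `end` is exclusive. Ranges that overlap or
    touch are merged so that no word is played twice.
    """
    ranges = []
    for i, word in enumerate(words):
        if word.lower() not in essential:
            continue
        start = max(0, i - before)
        end = min(len(words), i + after + 1)
        if ranges and start <= ranges[-1][1]:
            ranges[-1] = (ranges[-1][0], max(ranges[-1][1], end))
        else:
            ranges.append((start, end))
    return ranges

words = "well the merger closes friday and um the budget passed".split()
ranges = window_segments(words, {"merger", "budget"})
print(ranges)  # [(1, 4), (7, 10)]
print([" ".join(words[s:e]) for s, e in ranges])
# ['the merger closes', 'the budget passed']
```

  • Each word-index range would then be translated to a time span through the transcript index, so that the corresponding audio and video are played back and the rest is omitted.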
  • Embodiments of the invention can take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment containing both hardware and software elements. In a preferred embodiment, the invention is implemented in software, which includes but is not limited to firmware, resident software, microcode, and the like. Furthermore, the invention can take the form of a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system.
  • For the purposes of this description, a computer-usable or computer-readable medium can be any apparatus that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. The medium can be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device) or a propagation medium. Examples of a computer-readable medium include a semiconductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk and an optical disk. Current examples of optical disks include compact disk-read only memory (CD-ROM), compact disk-read/write (CD-R/W) and DVD.
  • A data processing system suitable for storing and/or executing program code will include at least one processor coupled directly or indirectly to memory elements through a system bus. The memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories which provide temporary storage of at least some program code in order to reduce the number of times code must be retrieved from bulk storage during execution. Input/output or I/O devices (including but not limited to keyboards, displays, pointing devices, etc.) can be coupled to the system either directly or through intervening I/O controllers. Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or remote printers or storage devices through intervening private or public networks. Modems, cable modems and Ethernet cards are just a few of the currently available types of network adapters.

Claims (13)

1. A speed podcasting method comprising:
speech recognizing an audio portion of a podcast;
parsing the speech recognized audio portion to identify essential words; and,
playing back only audio segments and corresponding video segments of the podcast including the essential words while excluding from playback audio segments and corresponding video segments of the podcast including non-essential words.
2. The method of claim 1, wherein playing back only audio segments and corresponding video segments of the podcast including the essential words while excluding from playback audio segments and corresponding video segments of the podcast including non-essential words, comprises:
matching words in the speech recognized audio portion to essential words in a data store of essential words;
determining audio segments of the audio portion including the matched words; and,
playing back the determined audio segments while excluding other audio segments of the audio portion from playback.
3. The method of claim 1, wherein playing back only audio segments and corresponding video segments of the podcast including the essential words while excluding from playback audio segments and corresponding video segments of the podcast including non-essential words, comprises:
identifying non-essential words in the speech recognized audio portion; and,
excluding audio segments of the audio portion from playback including the identified non-essential words while playing back other audio segments of the audio portion.
4. The method of claim 2, further comprising:
consulting a thesaurus to retrieve words synonymous with the matched words; and,
adding the synonymous words to the data store of essential words.
5. The method of claim 2, wherein playing back the determined audio segments while excluding other audio segments of the audio portion from playback comprises:
applying a rating to each word in the data store of essential words;
associating a playback speed with each rating;
excluding from the determined audio segments audio segments with essential words associated with a rating inconsistent with a contemporaneously selected playback speed; and,
playing back only the determined audio segments while excluding other audio segments of the audio portion from playback.
6. A mobile video playback device comprising:
an operating system configured to playback podcasts stored in the mobile video playback device;
a datastore of essential words;
a speech recognition engine; and,
speed podcasting logic executing under management of the operating system comprising program code enabled to speech recognize an audio portion of a selected podcast, to parse the speech recognized audio portion to identify words present in the datastore of essential words, and to play back only audio segments and corresponding video segments of the podcast including the matched words while excluding from playback audio segments and corresponding video segments of the podcast including non-essential words.
7. The device of claim 6, wherein non-essential words comprise articles and pronouns.
8. The device of claim 6, further comprising a thesaurus, the program code being further enabled to locate in the thesaurus words synonymous with the identified words present in the datastore of essential words and to add the synonymous words to the datastore of essential words.
9. A computer program product comprising a computer usable medium embodying computer usable program code for speed podcasting, the computer program product comprising:
computer usable program code for speech recognizing an audio portion of a podcast;
computer usable program code for parsing the speech recognized audio portion to identify essential words; and,
computer usable program code for playing back only audio segments and corresponding video segments of the podcast including the essential words while excluding from playback audio segments and corresponding video segments of the podcast including non-essential words.
10. The computer program product of claim 9, wherein the computer usable program code for playing back only audio segments and corresponding video segments of the podcast including the essential words while excluding from playback audio segments and corresponding video segments of the podcast including non-essential words, comprises:
computer usable program code for matching words in the speech recognized audio portion to essential words in a data store of essential words;
computer usable program code for determining audio segments of the audio portion including the matched words; and,
computer usable program code for playing back the determined audio segments while excluding other audio segments of the audio portion from playback.
11. The computer program product of claim 9, wherein the computer usable program code for playing back only audio segments and corresponding video segments of the podcast including the essential words while excluding from playback audio segments and corresponding video segments of the podcast including non-essential words, comprises:
computer usable program code for identifying non-essential words in the speech recognized audio portion; and,
computer usable program code for excluding audio segments of the audio portion from playback including the identified non-essential words while playing back other audio segments of the audio portion.
12. The computer program product of claim 10, further comprising:
computer usable program code for consulting a thesaurus to retrieve words synonymous with the matched words; and,
computer usable program code for adding the synonymous words to the data store of essential words.
13. The computer program product of claim 10, wherein the computer usable program code for playing back the determined audio segments while excluding other audio segments of the audio portion from playback comprises:
computer usable program code for applying a rating to each word in the data store of essential words;
computer usable program code for associating a playback speed with each rating;
computer usable program code for excluding from the determined audio segments audio segments with essential words associated with a rating inconsistent with a contemporaneously selected playback speed; and,
computer usable program code for playing back only the determined audio segments while excluding other audio segments of the audio portion from playback.
US12/180,906 2008-07-28 2008-07-28 Speed podcasting Active 2032-03-30 US9953651B2 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US12/180,906 US9953651B2 (en) 2008-07-28 2008-07-28 Speed podcasting
US15/960,534 US10332522B2 (en) 2008-07-28 2018-04-23 Speed podcasting

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US12/180,906 US9953651B2 (en) 2008-07-28 2008-07-28 Speed podcasting

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US15/960,534 Continuation US10332522B2 (en) 2008-07-28 2018-04-23 Speed podcasting

Publications (2)

Publication Number Publication Date
US20100023330A1 (en) 2010-01-28
US9953651B2 (en) 2018-04-24

Family

ID=41569436

Family Applications (2)

Application Number Title Priority Date Filing Date
US12/180,906 Active 2032-03-30 US9953651B2 (en) 2008-07-28 2008-07-28 Speed podcasting
US15/960,534 Expired - Fee Related US10332522B2 (en) 2008-07-28 2018-04-23 Speed podcasting

Family Applications After (1)

Application Number Title Priority Date Filing Date
US15/960,534 Expired - Fee Related US10332522B2 (en) 2008-07-28 2018-04-23 Speed podcasting

Country Status (1)

Country Link
US (2) US9953651B2 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11250213B2 (en) * 2019-04-16 2022-02-15 International Business Machines Corporation Form-based transactional conversation system design
CN110177298B (en) * 2019-05-27 2021-03-26 湖南快乐阳光互动娱乐传媒有限公司 Voice-based video speed doubling playing method and system
CA3147589A1 (en) * 2019-07-15 2021-01-21 Axon Enterprise, Inc. Methods and systems for transcription of audio data

Citations (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5914941A (en) * 1995-05-25 1999-06-22 Information Highway Media Corporation Portable information storage/playback apparatus having a data interface
US6061675A (en) * 1995-05-31 2000-05-09 Oracle Corporation Methods and apparatus for classifying terminology utilizing a knowledge catalog
US6167370A (en) * 1998-09-09 2000-12-26 Invention Machine Corporation Document semantic analysis/selection with knowledge creativity capability utilizing subject-action-object (SAO) structures
US6466901B1 (en) * 1998-11-30 2002-10-15 Apple Computer, Inc. Multi-language document search and retrieval system
US20020156632A1 (en) * 2001-04-18 2002-10-24 Haynes Jacqueline A. Automated, computer-based reading tutoring systems and methods
US6598039B1 (en) * 1999-06-08 2003-07-22 Albert-Inc. S.A. Natural language interface for searching database
US20050177805A1 (en) * 2004-02-11 2005-08-11 Lynch Michael R. Methods and apparatuses to generate links from content in an active window
US20060190804A1 (en) * 2005-02-22 2006-08-24 Yang George L Writing and reading aid system
US20060271365A1 (en) * 2000-09-18 2006-11-30 International Business Machines Corporation Methods and apparatus for processing information signals based on content
US7207003B1 (en) * 2000-08-31 2007-04-17 International Business Machines Corporation Method and apparatus in a data processing system for word based render browser for skimming or speed reading web pages
US20070130112A1 (en) * 2005-06-30 2007-06-07 Intelligentek Corp. Multimedia conceptual search system and associated search method
US20070288435A1 (en) * 2006-05-10 2007-12-13 Manabu Miki Image storage/retrieval system, image storage apparatus and image retrieval apparatus for the system, and image storage/retrieval program
US20080086303A1 (en) * 2006-09-15 2008-04-10 Yahoo! Inc. Aural skimming and scrolling
US20080155616A1 (en) * 1996-10-02 2008-06-26 Logan James D Broadcast program and advertising distribution system
US20080168134A1 (en) * 2007-01-10 2008-07-10 International Business Machines Corporation System and Methods for Providing Relevant Assets in Collaboration Mediums
US20080221876A1 (en) * 2007-03-08 2008-09-11 Universitat Fur Musik Und Darstellende Kunst Method for processing audio data into a condensed version
US20080276159A1 (en) * 2007-05-01 2008-11-06 International Business Machines Corporation Creating Annotated Recordings and Transcripts of Presentations Using a Mobile Device
US20080282153A1 (en) * 2007-05-09 2008-11-13 Sony Ericsson Mobile Communications Ab Text-content features
US20090030682A1 (en) * 2007-07-23 2009-01-29 Call Catchers, Inc. System and method for publishing media files
US20090187516A1 (en) * 2008-01-18 2009-07-23 Tapas Kanungo Search summary result evaluation model methods and systems
US7983915B2 (en) * 2007-04-30 2011-07-19 Sonic Foundry, Inc. Audio content search engine
US8392183B2 (en) * 2006-04-25 2013-03-05 Frank Elmo Weber Character-based automated media summarization
US8661035B2 (en) * 2006-12-29 2014-02-25 International Business Machines Corporation Content management system and method

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4041467A (en) * 1975-11-28 1977-08-09 Xerox Corporation Transcriber system for the automatic generation and editing of text from shorthand machine outlines
JP4320491B2 (en) * 1999-11-18 2009-08-26 ソニー株式会社 Document processing system, terminal device, document providing device, document processing method, recording medium
US6904171B2 (en) * 2000-12-15 2005-06-07 Hewlett-Packard Development Company, L.P. Technique to identify interesting print articles for later retrieval and use of the electronic version of the articles
GB0309174D0 (en) * 2003-04-23 2003-05-28 Stevenson David W System and method for navigating a web site
JP2006277103A (en) * 2005-03-28 2006-10-12 Fuji Xerox Co Ltd Document translating method and its device
GB2415352B (en) * 2005-07-28 2006-06-28 Wildbird J C Foods Ltd Bird feeder accessory
US7716966B2 (en) * 2006-06-28 2010-05-18 Medtronic Cryocath Lp Mesh leak detection system for a medical device
US8209605B2 (en) * 2006-12-13 2012-06-26 Pado Metaware Ab Method and system for facilitating the examination of documents
CA2679094A1 (en) * 2007-02-23 2008-08-28 1698413 Ontario Inc. System and method for delivering content and advertisements
US7840604B2 (en) * 2007-06-04 2010-11-23 Precipia Systems Inc. Method, apparatus and computer program for managing the processing of extracted data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SpeechSkimmer: a system for interactively skimming recorded speech by Barry Arons, ACM Transactions on Computer-Human Interaction, 1997 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130067333A1 (en) * 2008-10-03 2013-03-14 Finitiv Corporation System and method for indexing and annotation of video content
US9407942B2 (en) * 2008-10-03 2016-08-02 Finitiv Corporation System and method for indexing and annotation of video content
US20100324895A1 (en) * 2009-01-15 2010-12-23 K-Nfb Reading Technology, Inc. Synchronization for document narration
US8903723B2 (en) 2010-05-18 2014-12-02 K-Nfb Reading Technology, Inc. Audio synchronization for document narration with user-selected playback
US9478219B2 (en) 2010-05-18 2016-10-25 K-Nfb Reading Technology, Inc. Audio synchronization for document narration with user-selected playback
GB2502944A (en) * 2012-03-30 2013-12-18 Jpal Ltd Segmentation and transcription of speech
US9786283B2 (en) 2012-03-30 2017-10-10 Jpal Limited Transcription of speech
CN107193841A (en) * 2016-03-15 2017-09-22 北京三星通信技术研究有限公司 Media file accelerates the method and apparatus played, transmit and stored
EP3403415A4 (en) * 2016-03-15 2019-04-17 Samsung Electronics Co., Ltd. Method and device for accelerated playback, transmission and storage of media files
CN111885457A (en) * 2020-07-15 2020-11-03 歌尔科技有限公司 Wireless earphone, audio playing method and computer readable storage medium

Also Published As

Publication number Publication date
US20180240462A1 (en) 2018-08-23
US10332522B2 (en) 2019-06-25
US9953651B2 (en) 2018-04-24

Similar Documents

Publication Publication Date Title
US10332522B2 (en) Speed podcasting
US11197036B2 (en) Multimedia stream analysis and retrieval
US11100096B2 (en) Video content search using captioning data
US9190052B2 (en) Systems and methods for providing information discovery and retrieval
US9148619B2 (en) Music soundtrack recommendation engine for videos
KR100684484B1 (en) Method and apparatus for linking a video segment to another video segment or information source
US20130198268A1 (en) Generation of a music playlist based on text content accessed by a user
US20080201361A1 (en) Targeted insertion of an audio - video advertising into a multimedia object
US20180276298A1 (en) Analyzing user searches of verbal media content
US20140164371A1 (en) Extraction of media portions in association with correlated input
Marujo et al. Keyphrase cloud generation of broadcast news
JP5894149B2 (en) Enhancement of meaning using TOP-K processing
CN107247768A (en) Method for ordering song by voice, device, terminal and storage medium
WO2010147734A1 (en) Method and apparatus for classifying content
US20140163956A1 (en) Message composition of media portions in association with correlated text
Ionescu et al. An audio-visual approach to web video categorization
Bost et al. Serial speakers: a dataset of tv series
US20200257724A1 (en) Methods, devices, and storage media for content retrieval
US9563711B2 (en) Automated surfacing of tagged content in vertical applications
US10699242B2 (en) Automated surfacing of tagged content adjunct to vertical applications
US7949667B2 (en) Information processing apparatus, method, and program
US10678854B1 (en) Approximate string matching in search queries to locate quotes
Gibbon et al. Automated content metadata extraction services based on MPEG standards
US20240048821A1 (en) System and method for generating a synopsis video of a requested duration
US20230401389A1 (en) Enhanced Natural Language Processing Search Engine for Media Content

Legal Events

Date Code Title Description
AS Assignment

Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW Y

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:ALLEN, CORVILLE O.;CHUNG, ALBERT A.;TRUONG, BINH C.;AND OTHERS;REEL/FRAME:021303/0737;SIGNING DATES FROM 20080721 TO 20080724

AS Assignment

Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW Y

Free format text: CORRECTIVE ASSIGNMENT TO CORRECT THE NAME OF ONE OF THE ASSIGNORS FROM LEE, KAM K. TO YEE, KAM K. PREVIOUSLY RECORDED ON REEL 021303 FRAME 0737. ASSIGNOR(S) HEREBY CONFIRMS THE ASSIGNMENT OF ASSIGNORS' INTEREST TO INTERNATIONAL BUSINESS MACHINES CORPORATION;ASSIGNORS:ALLEN, CORVILLE O.;CHUNG, ALBERT A.;TRUONG, BINH C.;AND OTHERS;SIGNING DATES FROM 20080721 TO 20080724;REEL/FRAME:021598/0828

STCF Information on status: patent grant

Free format text: PATENTED CASE

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 4