WO2001030046A2 - Streaming content over a telephone interface - Google Patents

Streaming content over a telephone interface

Publication number
WO2001030046A2
Authority
WO
WIPO (PCT)
Prior art keywords
telephone
content
streaming
audio
streaming content
Prior art date
Application number
PCT/US2000/041429
Other languages
French (fr)
Other versions
WO2001030046A3 (en
Inventor
Hadi Partovi
Michael S. Mccue
Angus Macdonald Davis
Michael M. Plitkins
Anthony Accardi
Original Assignee
Tellme Networks, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US09/426,102 external-priority patent/US6807574B1/en
Priority claimed from US09/431,002 external-priority patent/US6970915B1/en
Application filed by Tellme Networks, Inc. filed Critical Tellme Networks, Inc.
Priority to AU22997/01A priority Critical patent/AU2299701A/en
Publication of WO2001030046A2 publication Critical patent/WO2001030046A2/en
Publication of WO2001030046A3 publication Critical patent/WO2001030046A3/en

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/1066Session management
    • H04L65/1069Session establishment or de-establishment
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/1066Session management
    • H04L65/1096Supplementary features, e.g. call forwarding or call holding
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/2866Architectures; Arrangements
    • H04L67/30Profiles
    • H04L67/306User profiles
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/50Network services
    • H04L67/53Network services using third party service providers
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04MTELEPHONIC COMMUNICATION
    • H04M3/00Automatic or semi-automatic exchanges
    • H04M3/42Systems providing special services or facilities to subscribers
    • H04M3/487Arrangements for providing information services, e.g. recorded voice services or time announcements
    • H04M3/493Interactive information services, e.g. directory enquiries ; Arrangements therefor, e.g. interactive voice response [IVR] systems or voice portals
    • H04M3/4938Interactive information services, e.g. directory enquiries ; Arrangements therefor, e.g. interactive voice response [IVR] systems or voice portals comprising a voice browser which renders and interprets, e.g. VoiceXML
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/30Definitions, standards or architectural aspects of layered protocol stacks
    • H04L69/32Architecture of open systems interconnection [OSI] 7-layer type protocol stacks, e.g. the interfaces between the data link level and the physical level
    • H04L69/322Intralayer communication protocols among peer entities or protocol data unit [PDU] definitions
    • H04L69/329Intralayer communication protocols among peer entities or protocol data unit [PDU] definitions in the application layer [OSI layer 7]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04MTELEPHONIC COMMUNICATION
    • H04M2203/00Aspects of automatic or semi-automatic exchanges
    • H04M2203/20Aspects of automatic or semi-automatic exchanges related to features of supplementary services
    • H04M2203/2061Language aspects
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04MTELEPHONIC COMMUNICATION
    • H04M2242/00Special services or facilities
    • H04M2242/22Automatic class or number identification arrangements
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04MTELEPHONIC COMMUNICATION
    • H04M3/00Automatic or semi-automatic exchanges
    • H04M3/42Systems providing special services or facilities to subscribers
    • H04M3/42025Calling or Called party identification service
    • H04M3/42034Calling party identification service
    • H04M3/42059Making use of the calling party identifier
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04MTELEPHONIC COMMUNICATION
    • H04M3/00Automatic or semi-automatic exchanges
    • H04M3/42Systems providing special services or facilities to subscribers
    • H04M3/42025Calling or Called party identification service
    • H04M3/42034Calling party identification service
    • H04M3/42059Making use of the calling party identifier
    • H04M3/42068Making use of the calling party identifier where the identifier is used to access a profile

Definitions

  • This disclosure relates to the field of streaming content.
  • the disclosure relates to technologies for providing streaming content to users over a telephone interface.
  • the disclosure also relates to identifying and registering users using telephone identifying information and personalizing the content, including the streaming content, presented to them using a profile selected using the telephone identifying information.
  • Streaming content is multimedia data sent as a stream of data.
  • the content is broken into small parts, compressed, sent over a network, uncompressed, and then played on a user's computer. This means that users can begin viewing, or listening to, the content before the entire stream is received.
  • streaming media systems include streaming audio and video systems from Real Networks, Inc., Microsoft Corporation, and Apple Computer, Inc.
  • One problem with such systems is that the streaming data can only be accessed by users' computers.
  • telephone identifying information will be discussed.
  • Many telephone systems that support enhanced user features use telephone identifying information as a basic component.
  • a variety of example systems that use telephone identifying information to provide enhanced user features will be discussed.
  • ANI automatic number identification
  • CID calling number identification
  • NANP North American Numbering Plan
  • ANI telephone identifying information
  • previous systems have been single purpose and typically require reference to other information provided separately.
  • credit card activation lines require separately provided information, e.g. your home phone number from the application.
  • Some systems allow a user to build personalized content over the web.
  • One example is the My Yahoo!™ service provided by Yahoo! of Santa Clara, California at <http://my.yahoo.com/>.
  • the personalized content pages developed on the web are delivered over the web to users accessing the pages with computers.
  • These systems rely on a username and password type system to identify the user rather than telephone identifying information, and the delivery mechanism is different.
  • Still other systems allow users to personalize the content without entering special editing modes. For example, Amazon.com, of Seattle, Washington, keeps track of your purchases and preferences using cookies stored on a customer's web browser.
  • Some telephone systems provide limited customization capabilities. For example, voice mail systems from Octel, a division of Lucent Technologies, allow a user to set preferences for prompt length, but those settings must be made explicitly by each user. Further, customization is limited to a few options like prompt length and outgoing message selection. The user cannot redefine the way the voice mail system works for her/him beyond those narrow customization options. Further, these customizations do not affect the kinds of content, and the presentation is not selected based on telephone identifying information.
  • Amtrak's 1-800-USA-RAIL reservation line uses telephone identifying information to select an initial region. For example, if you call Amtrak's reservation number in the Northeastern United States, the system presents options relating to the Boston-Washington line. However, if you call from California, the system presents information about travel between San Francisco and Los Angeles.
  • the area codes and/or exchanges can then be paired to different scripts or default selections. For example, the area codes for New York City, e.g. "212", could be mapped to the Northeast Corridor while San Francisco, "415", could be mapped to the San Francisco-Los Angeles line.
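The pairing of area codes to scripts described above can be sketched as a simple lookup. This is a minimal illustration, not the method used by any actual reservation line: only the "212" and "415" pairings come from the text, while the function names, the fallback script, and the sample numbers are assumptions.

```python
# Hypothetical table pairing NANP area codes with default scripts.
# Only the "212" and "415" entries are taken from the text.
AREA_CODE_TO_SCRIPT = {
    "212": "Northeast Corridor",
    "415": "San Francisco-Los Angeles",
}


def digits_only(number: str) -> str:
    """Strip punctuation and spaces from a dialed or calling number."""
    return "".join(ch for ch in number if ch.isdigit())


def select_script(calling_number: str, default: str = "General menu") -> str:
    """Pick an initial script from the caller's area code (assumed helper)."""
    digits = digits_only(calling_number)
    # Drop a leading country code "1" on 11-digit NANP numbers.
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    return AREA_CODE_TO_SCRIPT.get(digits[:3], default)
```

A call such as `select_script("+1 (212) 555-0100")` would select the Northeast Corridor script, while an unlisted area code falls back to the assumed default.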
  • Moviefone™ uses the current time at the called number to present appropriate information.
  • the called number can be obtained using the dialed number identification service (DNIS).
  • DNIS dialed number identification service
  • If you call Moviefone™ in the San Francisco Bay Area at 10 o'clock in the morning, only movies starting after 10 o'clock in the morning in the San Francisco Bay Area will be presented to you.
  • If you call the Philadelphia Moviefone™, +1 (215) 222-FILM, from California, you will hear the Philadelphia movie times in Eastern Time.
  • a call to the Philadelphia Moviefone™ will produce information for Philadelphia show times after one o'clock in the afternoon Eastern Time at Philadelphia area theatres.
  • Some free long distance services provide customized advertising to support their services.
  • One example is Free Way™ offered by Broadpoint, of Landover, Maryland, <http://www.broadpoint.com/>. These services require an explicit user registration process, typically using a computer to access a web site, to provide the service with a profile. Once the profile is provided, the advertising is targeted to the particular person's explicitly provided demographic information. In some instances, the advertising may be targeted both based on the caller's demographics and their location. Thus, callers from the San Francisco Bay Area with a particular explicit demographic profile may be presented one ad, while callers from outside the San Francisco Bay Area may be presented with another ad.
  • Another, similar, service is offered by phone by UAccess, Inc., <http://www.uaccess.com/>, by calling +1 (800) UACCESS, and provides consumers targeted advertising based on profile information they enter.
  • Voice systems such as GALAXY from the Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts, have been adapted to provide information about purchasing decisions for used cars. For example, GALAXY has been used to allow for interactive browsing of automobile classified ads. These voice systems are problem domain specific. Further, the systems are designed to locate vehicles matching a particular set of criteria, rather than making actual recommendations.
  • Amazon.com will make book suggestions for users connected to the web via a computer. However, those suggestions are limited to a particular site, e.g. Amazon.com.
  • i. Voice Login
  • a login identifier e.g. credit card number, account number, etc.
  • PIN personal identification number
  • Some systems abbreviate this process by allowing a user calling from a particular phone to shortcut this process slightly. For example, callers using a phone number associated with a particular credit card might only be asked to enter the last four digits of their credit card number together with their billing zip code instead of all sixteen digits of the card number.
  • Other products such as Nuance Verifier™ from Nuance Communications, Menlo Park, California, support voice login capabilities, e.g. you just speak instead of entering a password.
  • a method and apparatus for providing streaming content over telephones is described.
  • the creation of a voice portal is supported by the invention.
  • Embodiments of the invention allow users to place a telephone call to access the voice portal.
  • the user can access many different types of content.
  • This content can include text based content which is read to the user by a text to speech system (e.g., news reports, stock prices, text content of Internet sites), audio content which can be played to the user (e.g., voicemail messages, music), and streaming audio content (e.g., Internet broadcast radio shows, streaming news reports, and streaming live broadcasts).
  • This content can be accessed from many different places. For example, the content can be retrieved from a news feed, a local streaming content server, an audio repository, and/or an Internet based streaming content server.
  • the streaming content allows the user to access live web broadcasts even though the user may not have access to a computer.
  • the system allows audio signals from multiple sources to be combined to enhance the user experience. Additionally, the user can control many aspects of the delivery of the streaming content.
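Combining audio signals from multiple sources can be illustrated with a simple sample-wise mix. The sketch below assumes 16-bit signed PCM samples held in Python lists and an arbitrary gain for the secondary source; neither the sample format nor the function name `mix_pcm16` comes from the text.

```python
def mix_pcm16(primary, secondary, secondary_gain=0.5):
    """Mix two sequences of 16-bit PCM samples into one signal.

    The secondary source (for example, background music under a
    spoken prompt) is scaled by secondary_gain, and the sum is
    clipped to the 16-bit signed range to avoid wrap-around.
    """
    length = max(len(primary), len(secondary))
    mixed = []
    for i in range(length):
        a = primary[i] if i < len(primary) else 0
        b = secondary[i] if i < len(secondary) else 0
        value = int(round(a + secondary_gain * b))
        mixed.append(max(-32768, min(32767, value)))
    return mixed
```

The shorter signal is padded with silence, so a short prompt can be laid over a longer stream without truncating either one.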
  • a computer system supports the voice portal functionality.
  • the system can include an interface to the Public Switched Telephone Network and a system for interfacing with the Internet.
  • the PSTN interface supports communications with the users' telephones, while the Internet interface provides for connectivity with the Internet.
  • Between the two interfaces are a number of subsystems that support the selection, conversion, mixing and/or control of audio content (and in particular, streaming audio content). Importantly, these subsystems also support the personalization of the content. The following describes some of the personalization features of various embodiments of the invention.
  • the personalized content presented during a telephone call is specific to that user based on the profile associated with her/his telephone identifying information. For example, if a user, John, has previously called from the telephone number 650-493-####, he may have indicated he prefers a Southern dialect. Then, upon subsequent calls to the system from his telephone John will be greeted in a Southern dialect based on the profile associated with his number.
  • Example personalizations provided by embodiments of the invention will now be described.
  • the system may personalize the session based on the time and/or date as determined from the telephone identifying information. For example, based on the local time for the calling party, time and/or date appropriate options may be presented. For example, if a user calls from California at noon, a restaurant for lunch may be suggested. However, a caller from London at that same moment might be presented evening entertainment selections because of the eight hour time difference.
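The time-based personalization above can be sketched with fixed offsets per caller region. The offsets below are assumptions (standard time, chosen to match the eight hour difference mentioned in the text), and the hour ranges and suggestion labels are purely illustrative.

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed UTC offsets in hours (standard time, no daylight saving).
REGION_UTC_OFFSET_HOURS = {"California": -8, "London": 0}


def suggestion_for(region: str, utc_now: datetime) -> str:
    """Return a time-appropriate suggestion for a caller's region."""
    offset = timedelta(hours=REGION_UTC_OFFSET_HOURS[region])
    local_hour = utc_now.astimezone(timezone(offset)).hour
    if 11 <= local_hour < 14:
        return "lunch restaurants"
    if 17 <= local_hour < 23:
        return "evening entertainment"
    return "general listings"


# The same instant is noon in California and 8 p.m. in London.
moment = datetime(2000, 1, 15, 20, 0, tzinfo=timezone.utc)
california_choice = suggestion_for("California", moment)
london_choice = suggestion_for("London", moment)
```

In a deployed system the region would be derived from the telephone identifying information rather than passed in by name.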
  • the system may personalize the session based on the caller's locale as determined from the telephone identifying information. For example, callers from Palo Alto, California may hear different selections and options than callers from Washington, D.C. This may include locale specific events, e.g. a county fair, locale specific announcements, e.g. flood watch for Santa Clara County, etc.
  • the system may target advertising based on the caller's demographic and/or psychographic profile. Additionally, the advertising may be targeted based on the telephone identifying information. For example, overall demographic information for a particular area code-exchange combination may be used, e.g. 650-493 corresponds to Palo Alto, California, with an average income of $X. On an international scale, this can be used to the extent that the particular numbering plan can be combined with relevant geographic/demographic/psychographic information about callers. Both types of targeted advertising allow the callers to be qualified, e.g. match the requirements of the advertiser. For example, a San Francisco jewelry store might only want to reach households in the Bay Area with an average household income exceeding $50,000 a year. The targeted advertising can ensure that callers presented with the ad are qualified.
  • the system may adapt the voice character, e.g. the speech patterns and dialect, of the system according to the caller's telephone identifying information and/or the caller's own voice character.
  • the system may reduce the speed at which the system speaks and/or increase the volume. This may be based on the telephone identifying information, e.g. hospital, an explicit request by the user, e.g. "Slow down", and/or implicitly based on the caller's speech patterns and interactions with the system.
  • the system can make purchasing suggestions. For example, close to a holiday like Mother's Day, the system may suggest a gift based on what other people in that locale, e.g. Palo Alto, are buying and/or based on the user's own purchasing history, e.g. she/he bought flowers last year.
  • the voice portal can recommend the purchase of an audio CD based on previous audio CD purchases.
  • the system may use a voice password and/or touch-tone login system when appropriate to distinguish the caller or verify the caller's identity for specific activities.
  • Profiles can be constructed implicitly as the caller uses embodiments of the invention as well as through explicit designation of preferences. For example, the user might specify an existing personalized web site to use in building her/his profile for the voice system. Similarly, for a caller from New York who repeatedly asks for the weather in San Francisco, the system might automatically include the San Francisco weather in the standard weather report without explicit specification, or confirmation.
  • new callers may have an initial profile generated based on one or more database lookups for demographic information based on their telephone identifying information.
  • Fig. 1 illustrates a system including embodiments of the invention used to provide streaming content from the Internet to a telephone.
  • Fig. 2 illustrates the components of a voice portal supporting streaming content delivery and personalized content.
  • Fig. 3 is a process flow diagram supporting personalization and registration of and for users accessing a voice portal over a telephone interface.
  • Fig. 4 is a process flow diagram for personalizing a voice portal over a web based interface.
  • Fig. 5 is a process flow diagram for providing personalized content according to some embodiments of the invention.
  • Fig. 6 is a process flow diagram for providing streaming content to a telephone.
  • a voice portal for presenting streaming content over a telephone interface is described.
  • the voice portal allows users of telephones, including cellular telephones, to access a voice portal by dialing a phone number to listen to streaming content.
  • the information provided over the voice portal may come from the World Wide Web (WWW), databases, third parties, and/or other sources.
  • WWW World Wide Web
  • voice portal refers to the capability of various embodiments of the invention to provide customized voice and/or audio content services to a caller.
  • the voice portal can receive dual-tone multi-frequency (DTMF or touch-tone) commands as well as spoken commands to further control the voice and/or audio content presented and the manner of presentation.
  • DTMF or touch-tone dual-tone multi-frequency
  • voice portal with personalization capabilities.
  • any system that provides a telephone interface, an Internet interface, and can send an extracted audio signal, from streaming content being received by the Internet interface, to the telephone interface, would support the invention.
  • Embodiments of the invention use telephone identifying information to personalize caller interactions with the voice portal. This allows the system to present highly customized information to each caller based on a personal profile the system associates with the telephone identifying information.
  • telephone identifying information will be used to refer to ANI information, CID information, and/or some other technique for automatically identifying the source of a call and/or other call setup information.
  • ANI information typically includes a dialed number identification service (DNIS).
  • CID information may include text data including the subscriber's name and/or address, e.g. "Jane Doe".
  • Other examples of telephone identifying information might include the type of calling phone, e.g. cellular, pay phone, and/or hospital phone.
  • the telephone identifying information may include wireless carrier specific identifying information, e.g. location of wireless phone now, etc.
  • signaling system seven (SS7) information may be included in the telephone identifying information.
  • SS7 signaling system seven
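The kinds of telephone identifying information listed above could be gathered into a single record. The following is only a sketch of one possible data structure; every field name and the preference for ANI over CID are assumptions, not details from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TelephoneIdentifyingInfo:
    """Hypothetical container for call setup information."""
    ani: Optional[str] = None          # automatic number identification
    cid_number: Optional[str] = None   # calling number identification
    cid_name: Optional[str] = None     # CID text data, e.g. "Jane Doe"
    dnis: Optional[str] = None         # dialed number identification service
    phone_type: Optional[str] = None   # e.g. "cellular", "pay phone"
    carrier_location: Optional[str] = None  # wireless carrier specific data

    def calling_number(self) -> Optional[str]:
        """Return the best available calling number (ANI first, by assumption)."""
        return self.ani or self.cid_number
```

A profile lookup could then key on `calling_number()` regardless of which signaling source supplied it.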
  • a user profile is a collection of information about a particular user.
  • the user profile typically includes collections of different information as shown and described more fully in connection with Figure 6.
  • the user profile contains a combination of explicitly made selections and implicitly made selections.
  • Explicitly made selections in the user profile stem from requests by the user to the system. For example, the user might add business news to the main topic list.
  • explicit selections come in the form of a voice, or touch-tone command, to save a particular location, e.g. "Remember this", "Bookmark it", "shortcut this", pound (#) key touch-tone, etc., or through adjustments to the user profile made through the web interface using a computer.
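One way to picture the explicit save commands is a small normalizer that treats the spoken phrases and the pound key as the same request. The phrases and the pound key come from the text; the function name, the returned label, and the normalization rules are assumptions.

```python
# Spoken phrases from the text, lower-cased for comparison.
BOOKMARK_PHRASES = {"remember this", "bookmark it", "shortcut this"}


def interpret_command(user_input: str):
    """Map recognized speech or a touch-tone key to a command label.

    Returns "bookmark" for any of the save commands, or None when the
    input is not one of them (other commands are out of scope here).
    """
    normalized = user_input.strip().lower().rstrip(".!")
    if normalized == "#" or normalized in BOOKMARK_PHRASES:
        return "bookmark"
    return None
```

Treating voice and touch-tone input uniformly keeps the rest of the profile logic independent of how the caller issued the command.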
  • implicit selections come about through the conduct and behavior of the user. For example, if the user repeatedly asks for the weather in Palo Alto, California, the system may automatically provide the Palo Alto weather report without further prompting. In other embodiments, the user may be prompted to confirm the system's implicit choice, e.g. the system might prompt the user "Would you like me to include Palo Alto in the standard weather report from now on?"
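The implicit selection behavior can be sketched as a counter with a threshold. The text says only that the user asks "repeatedly", so the threshold of three, the class name, and the `ask_first` flag are assumptions made for illustration.

```python
from collections import Counter


class ImplicitWeatherPreferences:
    """Track repeated weather requests and promote them to the profile."""

    def __init__(self, threshold: int = 3, ask_first: bool = False):
        self.threshold = threshold      # assumed meaning of "repeatedly"
        self.ask_first = ask_first      # confirm before changing the profile
        self.request_counts = Counter()
        self.standard_report = []

    def record_request(self, city: str):
        """Count a request; return a confirmation prompt if one is needed."""
        self.request_counts[city] += 1
        reached = self.request_counts[city] == self.threshold
        if reached and city not in self.standard_report:
            if self.ask_first:
                return (f"Would you like me to include {city} in the "
                        "standard weather report from now on?")
            self.standard_report.append(city)
        return None

    def confirm(self, city: str) -> None:
        """Apply an implicit choice after the caller says yes."""
        if city not in self.standard_report:
            self.standard_report.append(city)
```

With `ask_first=False` the city is added silently on the third request; with `ask_first=True` the caller hears the prompt and the profile changes only after `confirm` is called.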
  • the system may allow the user to customize the system to meet her/his needs better. For example, the user may be allowed to control the verbosity of prompts, the dialect used, and/or other settings for the system. These customizations can be made either explicitly or implicitly. For example if the user is providing commands before most prompts are finished, the system could recognize that a less verbose set of prompts is needed and implicitly set the user's prompting preference to briefer prompts.
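The implicit verbosity adjustment can be sketched as a check on how often the caller interrupts prompts. The text says only "before most prompts are finished", which is read here as more than half; the minimum sample size and the setting names are assumptions.

```python
def choose_verbosity(interrupted_flags, current="verbose",
                     min_prompts=5, barge_in_ratio=0.5):
    """Pick a prompting preference from recent prompt interruptions.

    interrupted_flags holds one boolean per recent prompt: True when the
    caller gave a command before the prompt finished playing.
    """
    if len(interrupted_flags) < min_prompts:
        return current  # not enough evidence to change the setting
    ratio = sum(interrupted_flags) / len(interrupted_flags)
    return "brief" if ratio > barge_in_ratio else current
```

Requiring a minimum number of prompts avoids switching a new caller to brief prompts after a single interruption.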
  • a topic is any collection of similar content. Topics may be arranged hierarchically as well. For example, a topic might be business news, while subtopics might include stock quotes, market report, and analyst reports. Within a topic different types of content are available. For example, in the stock quotes subtopic, the content might include stock quotes. The distinction between topics and the content within the topics is primarily one of degree in that each topic, or subtopic, will usually contain several pieces of content.
  • the term qualified as it is used in this application refers to whether or not a particular user being presented an advertisement, or other material, meets the demographic and/or psychographic profile requirements for that advertisement, or content. For example, a San Francisco-based bookstore might request that all listeners to its advertisement be located in a particular part of the San Francisco Bay Area. Thus, a user of the system would be qualified if she lived in the designated part of the San Francisco Bay Area.
  • Different embodiments of the invention may qualify users of the system according to different requirements. For example, in some instances advertising, or content, is qualified solely based on telephone identifying information. In other embodiments the telephone identifying information is used in conjunction with other information such as an associated user profile, a reverse telephone number lookup for locale demographics, and/or other information.
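Qualification can be sketched as a check of caller attributes against an advertiser's requirements. The example mirrors the Bay Area jewelry store case with a $50,000 income floor mentioned earlier in the text; the class, field names, and the treatment of unknown income are assumptions.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass
class AdRequirements:
    """Hypothetical advertiser requirements for a qualified caller."""
    locales: Set[str]
    min_household_income: int  # caller must exceed this amount


def is_qualified(caller_locales: Iterable[str],
                 household_income: Optional[int],
                 requirements: AdRequirements) -> bool:
    """Return True when the caller meets every stated requirement."""
    in_locale = bool(requirements.locales & set(caller_locales))
    if household_income is None:
        return False  # assumption: unknown income never qualifies
    return in_locale and household_income > requirements.min_household_income


jewelry_store = AdRequirements({"San Francisco Bay Area"}, 50_000)
```

The income figure could come from a user profile or from locale-level demographics obtained by a reverse lookup on the telephone identifying information.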
  • the term locale refers to any geographic area.
  • the geographic area may be a neighborhood, a city, a county, a metropolitan region, a state, a country, a continent, a group of countries, and/or some other collection of one or more geographic areas, e.g. all United States major metropolitan areas.
  • a single user of the system may be considered to be in several locales.
  • a caller from Palo Alto, California might be in the Palo Alto locale, a Silicon Valley locale, a San Francisco Bay Area locale, a Northern California locale, a California state locale, and a United States locale.
  • the telephone identifying information for a single telephone number can be mapped to a number of system-defined locales.
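The mapping from one telephone number to several system-defined locales can be sketched as a lookup on the area code-exchange prefix. The 650-493 prefix and the nested Palo Alto locales come from the text; the table name, the helper, and the empty result for unknown prefixes are assumptions.

```python
# Hypothetical table keyed by area code plus exchange (first six digits).
LOCALES_BY_AREA_CODE_EXCHANGE = {
    "650493": [
        "Palo Alto",
        "Silicon Valley",
        "San Francisco Bay Area",
        "Northern California",
        "California",
        "United States",
    ],
}


def locales_for(calling_number: str):
    """Return every locale a calling number falls in, narrowest first."""
    digits = "".join(ch for ch in calling_number if ch.isdigit())
    # Drop a leading country code "1" on 11-digit NANP numbers.
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    return list(LOCALES_BY_AREA_CODE_EXCHANGE.get(digits[:6], []))
```

Ordering the list from narrowest to broadest lets content selection try the most specific locale first and fall back outward.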
  • voice character refers to all aspects of speech pronunciation including dialect, speed, volume, gender of speaker, pitch, language, voice talent used, actor, characteristics of speech, and/or other prosody values. Users can adjust the voice character of the system by changing their voice character settings. For example, an elderly user could select voice character settings that provide louder volume and slower speech. Similarly, a caller from the South could adjust the voice character settings to support a Southern dialect.
  • Demographic profiles typically include factual information, e.g. age, gender, marital status, income, etc.
  • Psychographic profiles typically include information about behaviors, e.g. fun loving, analytical, compassionate, fast reader, slow reader, etc. As used in this application, the term demographic profile will be used to refer to both demographic and psychographic profiles.
  • Streaming technology allows a user to see, and/or listen to, streaming content as it is being downloaded.
  • Streaming content is audio and/or video data sent as a stream.
  • the content is broken into small parts, compressed, sent over a network, uncompressed, and then played on a user's computer. This means that users can begin viewing, or listening to, the content before the entire stream is received.
  • video data can include animations.
  • streaming data as opposed to non-streaming data, does not require the user to download an entire data file before the user can begin listening to or viewing the data.
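The difference between streaming and non-streaming delivery can be sketched with a generator that hands over each part as soon as it is ready. Here `zlib` stands in for an audio codec and a byte string stands in for audio data; both are assumptions chosen only to keep the example self-contained.

```python
import zlib


def stream_chunks(data: bytes, chunk_size: int):
    """Sender side: break content into small parts and compress each one."""
    for start in range(0, len(data), chunk_size):
        yield zlib.compress(data[start:start + chunk_size])


def play_stream(chunks, play) -> None:
    """Receiver side: uncompress and play each part as it arrives."""
    for chunk in chunks:
        play(zlib.decompress(chunk))


# The first part is available before the rest has been produced.
stream = stream_chunks(b"abcdefghij", 4)
first_part = zlib.decompress(next(stream))
```

Because the generator is lazy, playback of `first_part` can begin while later parts have not yet been compressed or sent, which is the property that distinguishes streaming from downloading a whole file first.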
  • Figure 1 illustrates a system including embodiments of the invention used to provide streaming content and personalized content to users of telephones according to telephone identifying information.
  • the system of Figure 1 can be used to allow users of standard telephones and cellular telephones to access a voice portal with streaming content from their telephones.
  • Figure 1 includes a telephone 100, a cellular telephone 101, a computer 102, a telephone network 104, an Internet 106, a telephone gateway 107, a web server 108, a voice portal 110, a shared database 112, a personalized site 130, a streaming content server 150, a local streaming content server 160 and a streaming content path 170.
  • the cellular telephone 101 and the telephone 100 are coupled in communication with the telephone network 104.
  • the telephone network 104 is coupled in communication with the telephone gateway 107.
  • the telephone gateway 107 is coupled in communication with the voice portal 110.
  • the computer 102 and the streaming content server 150 are coupled in communication with the Internet 106.
  • the Internet 106 is coupled in communication with the web server 108.
  • the voice portal 110 and the web server 108 are coupled in communication with the shared database 112.
  • the personalized site 130 is coupled in communication with the Internet 106.
  • the local streaming content server 160 is coupled in communication with the voice portal 110.
  • the telephone 100 and the cellular telephone 101 are two different telephone interfaces to the voice portal 110.
  • the telephone 100 and the cellular telephone 101 may be any sort of telephone and/or cellular telephone.
  • the telephone 100 or the cellular telephone 101 may be a land line phone, a PBX telephone, a satellite phone, a wireless telephone, and/or any other type of communication device capable of providing voice communication and/or touch-tone signals over the telephone network 104.
  • any audio signal carrying interface could be used.
  • the telephone network 104 may be the public switched telephone network (PSTN) and/or some other type of telephone network.
  • PSTN public switched telephone network
  • IP Internet Protocol
  • the telephone network 104 is coupled to the telephone gateway 107 that allows the voice communications and/or touch-tone signals from the telephone network 104 to reach the voice portal 110 in usable form. Similarly, the telephone gateway 107 allows audio signals generated by the voice portal 110 to be sent over the telephone network 104 to respective telephones, e.g. the telephone 100.
  • the telephone network 104 generally represents an audio signal carrying network.
  • the computer 102 is a computer such as a personal computer, a thin client computer, a server computer, a handheld computer, a set top box computer, and/or some other type of visual web browsing device.
  • the computer 102 is coupled in communication with the Internet 106, e.g.
  • the computer 102 typically provides a visual interface to the WWW and the web server 108 using web browsing software such as Internet ExplorerTM from Microsoft Corporation, Redmond, Washington.
  • Both the web server 108 and the voice portal 110 are capable of communicating with the shared database 112 to register users and build personal profiles implicitly and/or explicitly, as will be described more fully below.
  • the database 112 stores profiles for each user based on an association between one or more pieces of telephone identifying information and a particular user.
  • the database may have a profile for a user Sarah Smith that is keyed to her home telephone number, e.g. 650-493-####.
  • Sarah could associate other numbers, e.g. work, cellular, etc., with her profile either implicitly, e.g. by repeatedly calling the voice portal 110 from those numbers, or explicitly, e.g. by adding those numbers to the system directly.
  • an existing profile for a web-based portal is adapted for use by the voice portal 110 by associating one or more telephone numbers with the existing profile as stored in the shared database 112.
  • the existing profile may be further modified for use with the voice portal 110 to allow for different preferences between the web and the voice interfaces.
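The association between telephone numbers and profiles in the shared database can be sketched with two dictionaries. This is an in-memory illustration only: the class and method names, the user identifier, the profile fields, and the placeholder numbers are all assumptions.

```python
class SharedProfileStore:
    """Toy stand-in for a profile store keyed by telephone numbers."""

    def __init__(self):
        self.profiles = {}        # user id -> profile dictionary
        self.number_to_user = {}  # telephone number -> user id

    def register(self, user_id: str, number: str, **preferences) -> None:
        """Create a profile and key it to an initial telephone number."""
        self.profiles[user_id] = dict(preferences)
        self.number_to_user[number] = user_id

    def associate_number(self, user_id: str, number: str) -> None:
        """Attach another number (work, cellular, etc.) to a profile."""
        self.number_to_user[number] = user_id

    def profile_for(self, number: str):
        """Look up the profile for a calling number, or None if unknown."""
        user_id = self.number_to_user.get(number)
        return None if user_id is None else self.profiles[user_id]


store = SharedProfileStore()
store.register("sarah.smith", "650-493-0000", topics=["business news"])
store.associate_number("sarah.smith", "650-555-0100")
```

Because several numbers resolve to one user identifier, a call from either the home or the work number reaches the same profile, and a web-based profile could be adapted simply by adding number entries that point to it.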
  • the streaming content server 150 represents any computer system capable of providing streaming content.
  • the streaming content server 150 could be a server provided by Real Networks, Microsoft Corporation, Apple Computer, Inc., or any number of streaming content delivery systems.
  • the local content server 160 is similar to the streaming content server 150 except that the local content server 160 is directly accessible to the voice portal 110 (i.e., not requiring an Internet access).
  • the streaming content path 170 represents logical paths by which streaming content can be delivered to telephones.
  • the streaming content path 170 shows that the streaming content is accessed from the streaming content server 150, or the local streaming content server 160, sent through the voice portal 110, the telephone gateway 107 and the telephone network 104, to be delivered to the telephone 100 (or cellular phone 101).
  • Figure 2 illustrates the components of a voice portal supporting streaming and personalized content. This could be used to support the voice portal 110.
  • the voice portal 110 is coupled in communication with the telephone gateway 107.
  • the voice portal 110 includes a call manager 200, an execution engine 202, a data connectivity engine 220, an evaluation engine 222 and a streaming engine 224.
  • Figure 2 includes elements that may be included in the voice portal 110, or which may be separate from, but coupled to, the voice portal 110.
  • Figure 2 also includes a recognition server 210, a text to speech server 214, an audio repository 212, the local streaming content server 160, the shared database 112, a database 226, the Internet 106, a database 228 and a web site 230.
  • the call manager 200 within the voice portal 110 is coupled to the execution engine 202.
  • the execution engine 202 is coupled to the recognition server 210, the text to speech server 214, the audio repository 212, data connectivity engine 220, the evaluation engine 222 and the streaming engine 224.
  • the voice portal 110 is coupled in communication with the shared database 112, the database 226 and the Internet 106.
  • the Internet 106 is coupled in communication with the streaming content server 150 and the database 228 and the web site 230.
  • the voice portal 110 is implemented using one or more computers.
  • the computers may be server computers such as UNIX workstations, personal computers and/or some other type of computers.
  • Each of the components of the voice portal 110 may be implemented on a single computer, multiple computers and/or in a distributed fashion.
  • each of the components of the voice portal 110 is a functional unit that may be divided over multiple computers and/or multiple processors.
  • the voice portal 110 represents an example of a telephone interface subsystem. Different components may be included in a telephone interface subsystem.
  • a telephone interface subsystem may include one or more of the following components: the call manager 200, the execution engine 202, the data connectivity engine 220, the evaluation engine 222, the streaming engine 224, the audio repository 212, the text to speech server 214 and/or the recognition server 210.
  • the call manager 200 is responsible for scheduling call and process flow among the various components of the voice portal 110.
  • the call manager 200 sequences access to the execution engine 202.
  • the execution engine 202 handles access to the recognition server 210, the text to speech server 214, the audio repository 212, the data connectivity engine 220, the evaluation engine 222 and the streaming engine 224.
  • the recognition server 210 supports voice, or speech, recognition.
  • the recognition server 210 may use Nuance 6™ recognition software from Nuance Communications, Menlo Park, California, and/or some other speech recognition product.
  • the execution engine 202 provides necessary grammars to the recognition server 210 to assist in the recognition process. The results from the recognition server 210 can then be used by the execution engine 202 to further direct the call session.
  • the recognition server 210 may support voice login using products such as Nuance Verifier™ and/or other voice login and verification products.
  • the text to speech server 214 supports the conversion of text to synthesized speech for transmission over the telephone gateway 107.
  • the execution engine 202 could request that the phrase, "The temperature in Palo Alto, California, is currently 58 degrees and rising" be spoken to a caller. That phrase would be translated to speech by the text to speech server 214 for playback over the telephone network on the telephone (e.g. the telephone 100). Additionally the text to speech server 214 may respond using a selected dialect and/or other voice character settings appropriate for the caller.
  • the audio repository 212 may include recorded sounds and/or voices.
  • the audio repository 212 is coupled to one of the databases (e.g. the database 226, the database 228 and/or the shared database 112) for storage of audio files.
  • the audio repository 212 responds to requests from the execution engine 202 to play a specific sound or recording.
  • the audio repository 212 may contain a standard voice greeting for callers to the voice portal 110, in which case the execution engine 202 could request play-back of that particular sound file.
  • the selected sound file would then be delivered by the audio repository 212 through the call manager 200 and across the telephone gateway 107 to the caller on the telephone, e.g. the telephone 100.
  • the telephone gateway 107 may include digital signal processors (DSPs) that support the generation of sounds and/or audio mixing.
  • Some embodiments of the invention include telephony systems from Dialogic, an Intel company.
  • the execution engine 202 supports the execution of multiple threads with each thread operating one or more applications for a particular call to the voice portal 110.
  • a thread may be started to provide the caller a voice interface to the system and for accessing other options.
  • an extensible markup language (XML)-style language is used to program applications.
  • Each application is then written in the XML-style language and executed in a thread on the execution engine 202.
  • an XML-style language such as VoiceXML from the VoiceXML Forum, <http://www.voicexml.org/>, is extended for use by the execution engine 202 in the voice portal 110.
  • the execution engine 202 may access the data connectivity engine 220 for access to databases and web sites (e.g. the shared database 112, the web site 230), the evaluation engine 222 for computing tasks and the streaming engine 224 for presentation of streaming media and audio.
  • the streaming engine 224 may allow users of the voice portal 110 to access streaming audio content, or the audio portion of streaming video content, over the telephone interface.
  • a streaming media broadcast from ZDNet™ could be accessed by the streaming engine 224 for playback through the voice portal.
  • the streaming engine 224 can act as a streaming content client to a streaming content server, e.g., the streaming engine 224 can act like a RealPlayer software client to receive streaming content broadcasts from a Real Networks server.
  • the streaming engine 224 can participate in a streaming content broadcast by acting like a streaming broadcast forwarding server. This second function is particularly useful where multiple users are listening to the same broadcast at the same time (e.g., multiple users may call into the voice portal 110 to listen to the same live streaming broadcast of a company's conference call with the analysts).
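  • The forwarding role described above, where one incoming broadcast serves many simultaneous callers, can be sketched as a simple fan-out. This is an illustrative sketch only; the class and method names are hypothetical and the per-call queues stand in for whatever playback path an actual embodiment uses.

```python
class BroadcastForwarder:
    """Hypothetical fan-out of one upstream stream to many active calls."""

    def __init__(self):
        self._listeners = {}  # call id -> list of chunks queued for playback

    def join(self, call_id):
        self._listeners[call_id] = []

    def leave(self, call_id):
        self._listeners.pop(call_id, None)

    def on_chunk(self, chunk):
        # A single upstream chunk is copied to every active call, so the
        # streaming content server is contacted once rather than once per caller.
        for queue in self._listeners.values():
            queue.append(chunk)
        return len(self._listeners)

    def received(self, call_id):
        return list(self._listeners.get(call_id, []))
```

  • A caller who joins late simply starts receiving chunks from the point at which the call joined, which matches the behavior of a live broadcast.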
  • the data connectivity engine 220 supports access to a variety of databases including databases accessed across the Internet 106, e.g. the database 228, and also access to web sites over the Internet such as the web site 230.
  • the data connectivity engine can access structured query language (SQL) databases, open database connectivity (ODBC) databases, and/or other types of databases.
  • the shared database 112 is represented separately from the other databases in Figure 2; however, the shared database 112 may in fact be part of one of the other databases, e.g. the database 226. Thus, the shared database 112 is distinguished from other databases accessed by the voice portal 110 in that it contains user profile information.
  • the voice portal 110 is able to flexibly handle multiple callers from a single telephone, e.g. Tom and Dick are roommates and both call from 650-493-####. Similarly, the voice portal 110 is able to handle a single caller that uses multiple telephones, e.g. Tom has a cell phone 650-245-####, his home phone 650-493-####, and a work phone 408-301-####.
  • the manner in which the voice portal 110 can handle some of the above situations will be discussed throughout. In the example used while describing Figure 3, the process will be described using a caller Jane Smith as an exemplary caller who has never registered with the voice portal 110 from any telephone and an exemplary caller John Doe who has previously called the voice portal 110 from his telephone 100.
  • at step 300, telephone identifying information is received. This is shown in Figure 1 by call flow arrow 114 representing the transfer of telephone identifying information through the telephone gateway 107 to the voice portal 110. This step occurs after a user has placed a call to the voice portal 110 with a telephone, e.g. the telephone 100.
  • a known profile e.g. is the user registered?
  • Some examples may be illustrative. If Jane Smith uses the cellular telephone 101 to call the voice portal 110 for the first time, her telephone identifying information will not be associated with any existing unique profile in the shared database 112. Therefore, at step 302, the determination would be made that she is not registered and the process would continue at step 304. In contrast, John Doe has previously called the voice portal from the telephone 100 and so his telephone identifying information will be associated with a profile in the shared database 112 and the process would continue at step 306.
  • a new profile is created at step 304.
  • the new profile may be initialized using a variety of information derived from the telephone identifying information and/or predetermined values for the voice portal 110.
  • an initial profile can be created using the calling number, e.g. 650-493-####, included in the telephone identifying information to select initial profile settings.
  • the call flow arrow 116 shows this process on Figure 1. The use of the telephone identifying information to create an initial profile is discussed below in the section "Automatic Profile Initialization".
  • the profile is not initialized using the telephone identifying information.
  • the user may be explicitly queried by the voice portal 110 to create one or more components of the initial profile, e.g. "Please speak your first name", to allow for more personalized prompting by the voice portal 110.
  • the profile is retrieved from the shared database 112 as shown by the call flow arrow 118.
  • the profile can be updated throughout the call based on the user's behavior and actions (implicit preferences) as well as explicit requests from the user to customize the voice portal 110.
  • the personalized content can be presented to the user as shown by the call flow arrow 122 in Figure 1.
  • John Doe, who is calling from the telephone 100, already has a profile in the shared database 112. That profile may indicate that John prefers a southern dialect and likes to hear a quick stock market report immediately on call in.
  • for John, his telephone identifying information serves to log him directly into the system and trigger the personalized behavior unique to him: a quick stock market report in a southern dialect.
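  • The flow of Figure 3 (steps 300 through 306) can be sketched as follows. This is a minimal sketch under stated assumptions: the profile is modeled as a plain dictionary, the default values are invented for illustration, and the telephone numbers are made up.

```python
def handle_call(calling_number, profiles):
    """Return (profile, is_new) for an incoming call.

    `profiles` is a hypothetical mapping of calling number -> profile dict,
    standing in for the shared database 112.
    """
    profile = profiles.get(calling_number)  # step 302: known profile?
    if profile is None:
        # Step 304: create and store a new profile with assumed defaults.
        profile = {
            "numbers": [calling_number],
            "dialect": "default",
            "greeting_content": [],
        }
        profiles[calling_number] = profile
        return profile, True
    # Step 306: an existing profile is retrieved and used to personalize.
    return profile, False


profiles = {
    "650-555-0100": {
        "numbers": ["650-555-0100"],
        "dialect": "southern",
        "greeting_content": ["stock market report"],
    }
}
```

  • Under this sketch a returning caller such as John is logged in by the lookup alone, while a first-time caller such as Jane leaves the call with a stored profile that later calls will find.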
  • the voice portal may support multiple callers from a single telephone. For example, Sarah Brown and John Doe may both use the telephone 100 to call the voice portal 110.
  • the voice portal may prompt for a password or other unique identifier, either as voice or touch-tone, to select among the profiles.
  • the voice portal is configured to minimize the need for a caller to provide a password. Thus, during a single call session, the caller is typically only asked to provide her/his password a single time.
  • some embodiments of the invention may require that a password always be used to complete commercial transactions and/or after the passage of a predetermined period, e.g. ten minutes since last password prompt.
  • the user may adjust her/his profile to allow login without a password for playback features.
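  • The password prompting rules above can be sketched as a single decision function. This is an illustration only: the action names are hypothetical, and the ten minute value is the example period given in the text rather than a required setting.

```python
PASSWORD_TIMEOUT_SECONDS = 10 * 60  # example period from the text


def needs_password(action, seconds_since_auth, playback_opt_out=False):
    """Decide whether to prompt for a password before `action`.

    `seconds_since_auth` is None when the caller has not yet authenticated
    during this call session.
    """
    if action == "playback" and playback_opt_out:
        # The user chose to allow playback features without a password.
        return False
    if action == "purchase":
        # Commercial transactions always require a password in this sketch.
        return True
    if seconds_since_auth is None:
        return True
    # Otherwise prompt again only after the predetermined period has passed.
    return seconds_since_auth > PASSWORD_TIMEOUT_SECONDS
```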
  • a single profile can be associated with multiple calling numbers.
  • the user Jane Doe could specify that both the telephone 100 and the cellular telephone 101 should be associated with her profile.
  • Jane calls from a new telephone e.g. pay phone
  • she/he is prompted as to whether to remember the number for future use.
  • additional telephone identifying information e.g. this is a pay phone
  • voice verification may be used to recognize a caller's voice instead of, or in addition to, using a password or other identification number.
  • Typical events that would require a password, or that the user be authenticated previously with a password, might include adding and removing items from the user profile through explicit commands as well as requests for specific personal information, e.g. that user's stock portfolio, bank account balances, etc.
  • it is not necessary for callers to the voice portal 110 to explicitly specify their preferences using this embodiment of the invention.
  • the callers' behaviors and actions are used by the voice portal 110 to adopt implicit preferences, sometimes after receiving confirmation. For example, behaviors and actions reflecting repeated access to content in a particular topic, or to a particular topic, may cause the voice portal 110 to automatically include the repeatedly requested content in the default message.
  • the system can add the San Francisco weather to the standard weather report.
  • the system may request confirmation before adding the weather report, e.g. "Would you like me to include San Francisco in the standard weather report?"
  • users who repeatedly ask for information about business related issues may find that the system will adjust the main menu to include business.
  • that option may drop off the main menu.
  • the system may ask for confirmation before modifying the menu choices, or the system may notify the user of a modification and/or allow a user to review/change past modifications.
  • the structure and content of the call may change, e.g. San Francisco weather will be announced at the beginning of future calls and sports information may be omitted.
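  • The adoption of implicit preferences can be sketched as a request counter with an optional confirmation step. This is a minimal sketch: the threshold of three requests is an assumption, since the text only says "repeatedly", and the class and method names are hypothetical.

```python
from collections import Counter


class ImplicitPreferences:
    """Promote repeatedly requested content into the caller's defaults."""

    def __init__(self, threshold=3):  # threshold value is an assumption
        self.threshold = threshold
        self.requests = Counter()
        self.defaults = []

    def record_request(self, item, confirm=lambda item: True):
        """Count a request; return True once `item` is part of the defaults."""
        self.requests[item] += 1
        if (
            self.requests[item] >= self.threshold
            and item not in self.defaults
            and confirm(item)  # e.g. "Would you like me to include ...?"
        ):
            self.defaults.append(item)
        return item in self.defaults
```

  • Passing a `confirm` callback that asks the caller models the embodiments that request confirmation; leaving the default in place models the embodiments that change the defaults silently.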
  • the process shown in Figure 4 assumes that a profile has already been created, e.g. by calling for the first time as described above.
  • users may create profiles using the web interface by providing the telephone identifying information for their primary calling phone number and a password.
  • the telephone identifying information provided, here the primary calling phone number, can be used to create the initial profile.
  • the profile is accessed using a computer (e.g. the computer 102) via a web interface.
  • the web interface is provided by a web server (e.g. the web server 108) and allows for access to the shared database 112 as shown by the call flow arrow 120.
  • the user can manually identify content and topics to build her/his profile at step 404.
  • This can be supported by allowing the user to specify topics from a list of topics and then specifying per topic content from a list of content.
  • the topics might include business, sports, news, entertainment, and weather, to name a few.
  • the user could include weather, news, and business in her/his main menu and then further customize the specific content to be presented within those topics. For example, within weather, the user might select the specific cities she/he wants listed in her/his weather menu and/or the cities for which the weather is automatically played.
  • the user can identify a web location with personalized content to use in building her/his profile, e.g. a uniform resource identifier (URI).
  • Figure 1 includes the personalized site 130.
  • the personalized site 130 could be a customized portal web page, e.g. myYahoo!, My Netscape, etc., a home page the user herself/himself has designed, and/or any other web page that includes content of interest to the user.
  • the user can identify the personalized site with a uniform resource identifier (URI), including a login identifier and password if necessary, e.g. for myYahoo!
  • the personalized site 130 can then be accessed and the pertinent user preferences, e.g. news, stocks, selected.
  • the voice portal 110 may present its own content for that particular item, e.g. the version of the
  • step 402 and step 404 can be used together allowing a user to quickly transfer preferences from a web portal to her/his voice portal while still supporting explicit personalization.
  • an existing web portal profile is voice enabled for use by a voice portal through the association of telephone identifying information with the existing web portal.
  • the telephone identifying information e.g. the primary calling number
  • an existing web profile e.g. myYahoo! profile
  • web sites like the personalized site 130 may be accessed using the voice portal 110 in some embodiments of the invention through the use of the data connectivity engine 220 as shown in Figure 2.
  • Some embodiments of the invention may allow users of the voice portal 110 to add to their profile from other web sites. For example, if a user of the computer 102 is accessing a web site (e.g. the personalized site 130), the web site might include a link like "Add this to your voice portal." Thus, for example, from a service such as MapQuest™ or Ameritrade™, the user could click on a link to add a particular piece of content or a particular topic to their portal for the voice portal 110.
  • a user could add her/his "QQQ" stock symbol to her/his profile on the voice portal 110 even though the voice portal 110 may be operated independently of the particular web site.
  • the web browser software on the user's computer can support an option to add a bookmark to the user's profile stored in the shared database 112 for the voice portal 110.
  • a menu option in the browser on the computer 102 might include "Add Page to Voice Portal Shortcuts" and upon selecting that menu option, the current web page would be added to the user's profile on the voice portal 110.
  • a URI on the web server 108 that included the information to be added.
  • the web server 108 might ask for a primary calling phone number and/or a password.
  • a cookie stored by the browser on the computer 102 may be used to obviate one or both of these steps.
  • a confirmation page may be shown including a return link to the originating web page.
  • at step 500, a request is made for content, or a topic. Then one or more of steps 502-510 take place, in parallel or in sequence, and then the content is presented at step 512. Which of steps 502-510 occur for a given request may be determined based on the topic or content requested. For example, step 504 can be omitted when non-time dependent information is presented.
  • the telephone identifying information includes information about the caller's locale independent of any user provided registration information. This information can be derived from telephone routing tables that provide a descriptive name for each area code/exchange combination within the North American Numbering Plan (NANP). Thus, the phone number 650-493-#### would be associated with "Palo Alto, California". Similarly, 650-592-#### would be associated with "San Carlos, California". This information may be directly present in the telephone identifying information provided to the voice portal 110, or may be ascertained from a local exchange routing guide (LERG). For international callers outside the NANP, similar types of telephone identifying information can be mapped to locales within countries to the extent permitted by the particular numbering plan.
  • the city-state combination may correspond to multiple locales for the purposes of the voice portal 110.
  • a county-wide or multi-city locale can be defined that encompasses multiple area code/exchange combinations.
  • a single caller may be in multiple locales.
  • Locale information can be further refined through the use of additional databases, e.g. city/state to zip code databases, street address to five digit zip code databases, reverse lookup databases that map phone numbers to street addresses, longitude-latitude conversion databases, and/or other databases that provide locale related information from telephone identifying information.
  • V and H coordinates might be determined using the telephone identifying information. Those can be further converted to a longitude and latitude to determine the locale.
  • a reverse phone number database could be used to find a specific street address for the telephone identifying information.
  • Examples of the uses for the locale information include: providing locale-appropriate lottery results, providing driving directions to a requested destination, providing locale-appropriate weather reports, providing locale-appropriate show times for movies and other events, e.g. cultural, governmental, etc., traffic reports, yellow page listings, and/or providing other locale-related information.
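  • The locale determination described above amounts to a lookup keyed on the area code/exchange combination. The sketch below uses a two-entry table built from the examples in the text; a real routing table such as the LERG is far larger, and the function name and the handling of a leading "1" are assumptions.

```python
# Hypothetical excerpt of a routing table: (area code, exchange) -> locale.
ROUTING_TABLE = {
    ("650", "493"): "Palo Alto, California",
    ("650", "592"): "San Carlos, California",
}


def locale_for(calling_number):
    """Map a NANP calling number to a descriptive locale, or None."""
    digits = "".join(ch for ch in calling_number if ch.isdigit())
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop the country code
    if len(digits) < 6:
        return None
    # Only the first six digits (area code and exchange) select the locale.
    return ROUTING_TABLE.get((digits[:3], digits[3:6]))
```

  • Because only the area code and exchange are consulted, the lookup works without any registration information from the caller.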
  • the telephone identifying information includes information about the caller's locale independent of any user provided registration information. This information can be derived from telephone routing tables that provide a descriptive name for each area code/exchange combination within the NANP. Thus, the phone number 650-493-#### would be associated with "Palo Alto, California" and thus the correct time zone, Pacific, could be selected as well. This time zone may be directly present in the telephone identifying information provided to the voice portal 110, or may be ascertained from the LERG. For international callers outside the NANP, similar types of telephone identifying information can be mapped to locales within countries to the extent permitted by the particular numbering plan. Thus, callers from United Kingdom numbers would be mapped to British Standard Time.
  • the time zone information allows the voice portal 110 to customize the presentation of information based on the time in the caller's locale. Callers can use a single nationwide, or international, number to reach the voice portal 110, e.g. 800-###-####. The voice portal 110 will use the time zone information to adjust the content presented to each user.
  • the voice portal 110 might report a stock quote to the user while on a Friday evening, the voice portal 110 might suggest a movie. For example, "It is Friday night, would you be interested in seeing a movie?" A "yes" response by the caller will lead to the presentation of a list that is both time and date adapted and locale appropriate. For example, a caller from Palo Alto at six o'clock p.m. on a Friday would hear about show times after six o'clock p.m. in his local area.
  • the voice portal 110 may connect the user to an appropriate transaction system to complete a user requested transaction such as the purchase of an airline ticket, a movie ticket, an audio CD, etc.
  • the voice portal 110 may be able to directly complete the transaction using the data connectivity engine 220 and access to the Internet 106 and/or one or more databases (e.g. the database 226). This process can occur even if the caller has not explicitly provided the voice portal 110 her/his home location or the current time. For example, this personalized content might be presented immediately after step 304 of Figure 3 in step 306.
  • time sensitive information can be presented such as airline schedules, cultural and other events, etc.
  • a caller asking for flight times to New York from a 650-493-#### telephone number might be prompted to select one of the three local airports: San Francisco International, San Jose International, and Oakland International, and then the flight times to New York after the current time in the Pacific time zone would be presented.
  • Some additional examples include customizing the presentation of business reports based on whether or not the market is open; modifying the greeting prompt based on the time of day; and providing traffic information automatically during commute hours, but not at other times.
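  • The time-based customization above can be sketched by converting the current time to the caller's local time and then choosing content. This is an illustration under assumptions: the offset table uses a fixed standard-time offset and ignores daylight saving time, and the hour boundaries for "evening" and for market hours are invented for the example.

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed UTC offsets per locale (standard time, DST ignored).
LOCALE_UTC_OFFSET_HOURS = {"Palo Alto, California": -8}


def local_time(utc_now, locale):
    """Convert an aware UTC datetime to the caller's local time."""
    offset = timezone(timedelta(hours=LOCALE_UTC_OFFSET_HOURS[locale]))
    return utc_now.astimezone(offset)


def suggestion(local_now):
    """Pick an opening suggestion from the local day and hour."""
    if local_now.weekday() == 4 and local_now.hour >= 17:
        return "movies"  # Friday evening
    if local_now.weekday() < 5 and 6 <= local_now.hour < 13:
        return "stock quote"  # assumed weekday market hours, Pacific
    return "main menu"
```

  • The same local time could also be used to drop show times or flight times that have already passed before a list is read to the caller.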
  • Embodiments of the invention support the presentation of targeted advertising, or other content, to callers of the voice portal 110 as shown at step 508.
  • the two primary types of targeted advertising supported by embodiments of the invention will be described.
  • the different types of targeted advertising can be combined as well.
  • Telephone identifying information can be used to reference demographic information about callers from a particular area. For example, if the telephone identifying information includes the calling number 650-493-####, corresponding to Palo Alto, California, general demographic information about callers from that particular region can be used to target the advertising, or other content. Further, if a reverse lookup database is used, the phone number can, in some instances, locate specific demographic information for a given household, or caller.
  • This personalization allows the targeting of advertising to qualified callers by the voice portal 110. For example, an advertiser of expensive luxury vehicles might request that its callers be qualified based on their income, or a particular psychographic attribute, e.g. fun-loving. In that case, the demographic profile corresponding to the telephone identifying information can be used to qualify the caller. Thus, callers from the relatively affluent city of Palo Alto, California might receive the advertising. Similarly, if a particular household meets the requirements based on a reverse lookup, those households can receive the advertising as well.
  • This profile may indicate interests based on the explicit and implicit preferences, e.g. likes sports, and can be used in combination with the telephone identifying information to more closely tailor ads to the caller.
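  • The qualification of callers for an advertisement can be sketched as a comparison between the advertiser's criteria and the best demographic data available. This is a hedged sketch: the dictionary keys, the income figures, and the rule that household data takes precedence over locale data are all assumptions for illustration.

```python
def qualifies(ad, locale_demographics, household=None, interests=()):
    """Return True if the caller meets the advertiser's criteria.

    `household` (from a reverse lookup) is preferred over the general
    `locale_demographics` when it is available.
    """
    source = household if household is not None else locale_demographics
    if source.get("income", 0) < ad.get("min_income", 0):
        return False
    required = ad.get("interest")
    # Interests come from the caller's explicit and implicit preferences.
    return required is None or required in interests


luxury_car_ad = {"min_income": 100_000}      # made-up criterion
sports_ad = {"interest": "sports"}           # made-up criterion
palo_alto = {"income": 120_000}              # made-up locale average
```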
  • the telephone identifying information includes information about the caller's locale independent of any user provided registration information.
  • the locales may be associated with one or more standard voice character settings, e.g. for dialect, and also idiomatic speech.
  • callers from California may receive different prompts and a different dialect from the voice portal than callers from Florida.
  • the telephone identifying information may include information about the type of phone, e.g. pay phone, hospital phone, etc., that can be used to adjust the voice character, e.g. louder and slower speech.
  • the caller's speaking voice may be used to refine the voice character of the system.
  • callers with speech patterns from a particular region of the country may find that after several verbal interactions with the voice portal, the content being presented at step 512 is being spoken using a voice character more suited to their own speech patterns.
  • the voice character for those callers may be slowed and played back louder. Additional examples include allowing users to select different voice actors, different background music and/or sound effects, control the verbosity of prompts, etc.
  • at step 510, content is customized through purchase suggestions.
  • the system can make purchasing suggestions.
  • the suggestions could be based on the caller's locale and what others in that locale have purchased. In other embodiments, the suggestions may be based on the profile of the user relative to other users' purchases. In some embodiments, approaches such as collaborative filtering are used to generate recommendations.
  • Examples of recommendations may include particular goods and services, e.g. flowers for Mom a few days before Mother's Day. Further, the exact suggestion may vary based on the caller's past habits, e.g. in the past you bought chocolates so this year chocolates might be suggested again. Alternatively, if many people from your locale are buying a particular book that might be suggested as well.
  • the particular purchase recommendation may relate to goods and services offered independently, by, and/or in affiliation with the operator of the voice portal 110.
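  • One simple form of collaborative filtering can be sketched as follows. The text does not specify an algorithm, so this overlap-weighted co-purchase count is only one possible approach; the function name and the sample purchase data are invented for the example.

```python
from collections import Counter


def recommend(user, purchases, top_n=3):
    """Suggest items bought by users whose purchases overlap with `user`'s.

    `purchases` maps a user id to the set of items that user has bought.
    """
    mine = purchases.get(user, set())
    scores = Counter()
    for other, theirs in purchases.items():
        if other == user:
            continue
        overlap = len(mine & theirs)
        if overlap == 0:
            continue  # no shared purchases, so no evidence of similar taste
        for item in theirs - mine:
            scores[item] += overlap  # weight by how similar the other user is
    return [item for item, _ in scores.most_common(top_n)]


purchases = {
    "ann": {"flowers", "chocolates"},
    "bob": {"flowers", "book"},
    "cy": {"chocolates", "book", "cd"},
    "dee": {"umbrella"},
}
```

  • Restricting `purchases` to callers from one locale would give the locale-based variant described above.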
  • the system may support the use of one or more passwords, either spoken or touch-tone for login and authentication purposes.
  • the passwords provide for protection against modifications to a user's profile without authentication. Additionally, certain specific actions, e.g. making a purchase, listening to certain types of personalized content, etc., may require authentication.
  • the authentication system will support either a voice or a touch-tone password for users of the voice portal 110. This flexibility addresses situations where the voice password is not working due to line conditions and/or conditions of the calling telephone. Products such as Nuance Verifier™ and/or other voice login and verification products may be used to provide the voice login feature. In some embodiments, both types of authentication may be required.
  • embodiments of the invention may minimize the need for the user to re-authenticate herself/himself, as described above.
  • the password, either voice and/or touch-tone, used for authentication for telephone sessions may be the same as or different from any passwords used for authentication for web access to the profile customization options described in conjunction with Figure 4.
  • the telephone identifying information can be used to select an appropriate demographic profile and list of topics based on the calling locale.
  • a reverse lookup of the calling number provided with the telephone identifying information is used to obtain a specific demographic profile for a caller and/or her/his household. Then the demographic information derived from the locale and/or the reverse lookup are used to set initial profile values. For example, the user's income might be estimated based on the average income for the calling locale, e.g. Palo Alto, California, or from demographic information from the reverse lookup.
  • the caller's initial topics might be selected based on commonly selected topics for her/his locale and/or the preferences available based on the demographic information retrieved by the reverse lookup.
  • initial values may be revised based on a caller's later actions. For example, if the initial estimate of a caller's age is too high, later actions may cause that information to be revised. Similarly, callers may be permitted to explicitly provide certain types of demographic information as needed. For example, the user might provide her/his birth date to a horoscope feature provided by the voice portal 110; in that instance, the birth date might be incorporated into the profile.
  • Fig. 6 illustrates an example use of the system of Fig. 1 to deliver streaming content to a telephone user. Note that some of the blocks in Fig. 6 can be executed in different order and/or performed in parallel.
  • the execution engine 202 receives a request to access an Internet site.
  • This request could come from a number of different sources (e.g., a voice command from the user, a selection by the user of a menu of possible sites, a command from another part of the voice portal 110 because of some personalization choices made by the user). What is important is that some Internet related request is received.
  • the request may correspond to a URI (Uniform Resource Identifier). Note, if the URI corresponds to non-streaming data, then the voice portal 110 acts as described above.
  • information about the streaming content is determined.
  • the type and source of streaming content can have different effects on the streaming engine 224. For example, if the content stream complies with Real Network's standard, then the streaming engine 224 may use the RealPlayer client functional code. The streaming engine 224 could decide to cache a large part of the streaming content if the streaming content server is not local. Additionally, the streaming engine 224 may notify the server to send only the audio streaming content. What is important is that the streaming engine 224 may make decisions about how to access and process the content stream based upon the type and/or location of the Internet site being accessed.
  • the voice portal 110 begins to receive the streaming content. This can include receiving packets of streaming content and converting the streaming content into digital audio data. Alternatively, no conversion is needed where the streaming content is sent in a format compatible with the internal representation of the audio data within the voice portal 110. Note, only a portion of the entire stream is processed at any one time. The stream is received in parts and, as such, is processed in parts.
  • the execution engine may cause the audio data to be mixed with additional audio data. This may be done when user menu prompts/notices from the voice portal 110 are supplied on top of the audio from the streaming audio signal. This mixing may be done in the telephone gateway 107 and/or in the computers running the voice portal 110.
  • Examples of the types of audio signals mixed can include streaming content from the local streaming content server 160, streaming content from the streaming content server 150, audio from the text to speech engine 214, and/or audio content from the audio repository 212.
  • This allows prompts, system notifications, menu options, and other user interface features to be combined with streaming content delivery.
  • the mixing may be accomplished by reducing the power level of the streaming content and adding the second signal. Mixing may also be achieved by inserting an audio signal into the stream. This allows for the insertion of advertisements, system prompts, etc.
  • the audio retrieved from the local streaming content server 160, the text to speech engine 214, and/or the audio repository 212 may need to be converted to a form compatible with the representation of audio within the voice portal 110.
  • the audio repository 212 may store audio signals in many different forms (e.g., .WAV form, MP3, MIDI, etc.); these forms may need to be converted into the internal representation of audio signals within the voice portal 110. This conversion may also include resolving a location of stored audio information. For example, if the audio repository 212 refers to audio data stored at another location (e.g. on the Internet), the voice portal 110 may need to resolve the address, download the audio signal, and then convert the signal to the appropriate internal representation. Note, the voice portal 110 may use multiple internal representations of audio data.
  • the mixed audio signal is sent from the telephone gateway 107, through the PSTN, to the user's telephone.
  • a test is made to determine whether the entire stream has been received. If not, blocks 630 through 660 are repeated.
  • the user receives streaming content from the Internet on their telephone.
  • the following describes important aspects of the invention that enhance the user's experience with the voice portal 110.
  • Restore - allows the user to restore a previous stream to the user's last location.
  • Rate - indicates the rate at which the content stream is being received by the voice portal 110.
  • Jump - allows the user to jump in the stream to subject matter of interest.
  • the voice portal 110 performs voice recognition on the content stream (this may involve caching portions of the content stream) and searches for the corresponding phrase, subject matter, word, etc.
  • the data may include the length, title, source, and/or server type.
  • the voice portal 110 can support channels.
  • the channels correspond to specific content types such as business, sports, etc.
  • the channels may include streaming data. A user could select a channel and have data from various sources combined and delivered to him/her.
  • History capabilities may be included. This allows the user to select from previously accessed Internet sites to replay streaming content or new content from the same site.
  • the voice portal 110 may insert advertisements in the streaming content. This may be done by adding advertisements before, during, or after the streaming content is delivered to the user.
  • the ads may be determined from the user's preferences.
  • Some embodiments of the invention may use text feeds as part of their streaming content. For example, some embodiments of the invention may use the closed caption translation of a portion of an audio broadcast to provide streaming translation facilities to a user.
  • Some embodiments of the invention allow for the mixing of audio from sources other than the local streaming content server 160, the text to speech engine 214, or the audio repository 212.
  • the voice portal 110 may access non-streaming audio data from the Internet to mix with streaming content audio signals.
  • a voice portal for presenting streaming content over a telephone interface has been described.
  • the streaming content may be supplied from the Internet and may be combined with other signals. Personalization of the data may also occur.
  • the personalized information provided over the voice portal may come from the World Wide Web (WWW), databases, third parties, and/or other sources. Telephone identifying information is used by embodiments of the invention to personalize caller interactions with the voice portal.
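The mixing described above (reducing the power level of the streaming content and adding a second signal, or inserting an audio signal into the stream) can be illustrated with a minimal sketch. The sample format (lists of 16-bit PCM integers at a common sample rate), the function names, and the `duck_gain` parameter are assumptions made for illustration; the description does not specify the internal audio representation.

```python
PCM_MIN, PCM_MAX = -32768, 32767  # assumed 16-bit signed PCM range


def mix_with_ducking(stream, prompt, duck_gain=0.5):
    """Overlay a prompt on streaming audio by attenuating the stream.

    While the prompt is playing, each stream sample is scaled by
    duck_gain and the prompt sample is added. After the prompt ends,
    the stream passes through unchanged. Prompt samples beyond the end
    of the stream are dropped in this sketch.
    """
    mixed = []
    for i, sample in enumerate(stream):
        if i < len(prompt):
            value = int(sample * duck_gain) + prompt[i]
        else:
            value = sample
        # Clamp so the sum cannot overflow the assumed sample range.
        mixed.append(max(PCM_MIN, min(PCM_MAX, value)))
    return mixed


def insert_audio(stream, clip, position):
    """Splice a clip (e.g. an advertisement) into the stream at a sample index."""
    return stream[:position] + clip + stream[position:]
```

The first function corresponds to playing a prompt over the stream; the second corresponds to interrupting the stream with inserted audio.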

Abstract

A method and apparatus for providing streaming content over telephones is described. The creation of a voice portal is supported by the invention. Embodiments of the invention allow users to place a telephone call to access the voice portal. The user can access many different types of content. This content can include text based content which is read to the user by a text to speech system (e.g., news reports, stock prices, text content of Internet sites), audio content which can be played to the user (e.g., voicemail messages, music), and streaming audio content (e.g., Internet broadcast radio shows, streaming news reports, and streaming live broadcasts). This content can be accessed from many different places. For example, the content can be retrieved from a news feed, a local streaming content server, an audio repository, and/or an Internet based streaming content server. The streaming content allows the user to access live web broadcasts even though the user may not have access to a computer. The system allows audio signals from multiple sources to be combined to enhance the user experience. Additionally, the user can control many aspects of the delivery of the streaming content. Personalization features of the voice portal allow for an improved user experience during streaming content delivery.

Description

STREAMING CONTENT OVER A TELEPHONE INTERFACE
Field
This disclosure relates to the field of streaming content. In particular, the disclosure relates to technologies for providing streaming content to users over a telephone interface. The disclosure also relates to identifying and registering users using telephone identifying information and personalizing the content, including the streaming content, presented to them using a profile selected using the telephone identifying information.
Description of the Related Art
The following describes streaming content available over computer networks. Then various telephone systems are described.
Streaming content is multimedia data sent as a stream of data. Typically, the content is broken into small parts, compressed, sent over a network, uncompressed, and then played on a user's computer. This means that users can begin viewing, or listening to, the content before the entire stream is received. Examples of streaming media systems include streaming audio and video systems from Real Networks, Inc., Microsoft Corporation, and Apple Computer, Inc. One problem with such systems is that the streaming data can only be accessed by users' computers.
The following describes various techniques used in telephone systems to provide enhanced user features. First, telephone identifying information will be discussed. Many telephone systems that support enhanced user features use telephone identifying information as a basic component. Then, a variety of example systems that use telephone identifying information to provide enhanced user features will be discussed.
1. Telephone Identifying Information
The advent of automatic number identification (ANI) and calling number identification (CNID, CLID, or CID) within the North American Numbering Plan (NANP) has supported the creation of a number of services that use these pieces of telephone identifying information. Comparable systems may be used in other numbering plans and countries to support similar services. For example, when consumers receive credit cards in the mail, they have to call from their home telephone numbers to activate the cards. This is a typical use of ANI. In this instance, the credit card company matches the ANI information provided when the consumer calls to a previously provided telephone number. If the ANI matches the credit card company's records, the credit card company activates the card.
2. Examples of Telephone System Personalization
a. Personalization Generally
With the advent of widely available real-time delivery of telephone identifying information such as ANI, a number of systems have been developed to use that information. One of the most common uses of ANI is for credit card activation. However, previous systems have been single purpose and typically require reference to other information provided separately. For example, credit card activation lines require separately provided information, e.g. your home phone number from the application.
b. Building Personalized Content on the Web
Some systems allow a user to build personalized content over the web. One example is the my Yahoo!™ service provided by Yahoo! of Santa Clara, California at <http://my.yahoo.com/>. The personalized content pages developed on the web are delivered over the web to users accessing the pages with computers. These systems rely on a username and password type system to identify the user rather than telephone identifying information, and the delivery mechanism is different.
c. Interactive Personalization
Still other systems allow users to personalize the content without entering special editing modes. For example, Amazon.com, of Seattle, Washington, keeps track of your purchases and preferences using cookies stored on a customer's web browser. Some telephone systems provide limited customization capabilities. For example, voice mail systems from Octel, a division of Lucent Technologies, allow a user to set preferences for prompt length, but those settings must be made explicitly by each user. Further, customization is limited to a few options like prompt length and outgoing message selection. The user cannot redefine the way the voice mail system works for her/him beyond those narrow customization options. Further, these customizations do not affect the kinds of content, and the presentation is not selected based on telephone identifying information.
d. Locale Selection
Services such as Amtrak's 1-800-USA-RAIL reservation line use telephone identifying information to select an initial region. For example, if you call Amtrak's reservation number in the Northeastern United States, the system presents options relating to the Boston-Washington line. However, if you call from California, the system presents information about travel between San Francisco and Los Angeles.
This can be accomplished by using the calling party's area code and/or exchange included with the telephone identifying information to select a region. The area codes and/or exchanges can then be paired to different scripts or default selections. For example, the area codes for New York City, e.g. "212", could be mapped to the Northeast Corridor while San Francisco, "415", could be mapped to the San Francisco-Los Angeles line.
However this does not change the kind of content presented and it is not user-selected.
e. Time Appropriate Information Presentation
Several services provide information through the telephone. That information may be adapted based on the time of day or date. Some systems provide the information irrespective of the telephone identifying information. One example is Moviefone™, 777-FILM in most locales. Moviefone™ uses the current time at the called number to present appropriate information. The called number can be obtained using the dialed number identification service (DNIS). Thus, if you call Moviefone™ in the San Francisco Bay Area at 10 o'clock in the morning, only movies starting after 10 o'clock in the morning in the San Francisco Bay Area will be presented to you. However, if you call the Philadelphia Moviefone™, +1 (215) 222-FILM, from California, you will hear the Philadelphia movie times in Eastern Time. Thus, at 10 o'clock in the morning Pacific Time, a call to the Philadelphia Moviefone™ will produce information for Philadelphia show times after one o'clock in the afternoon Eastern Time at Philadelphia area theatres.
f. Targeted Advertising
Some free long distance services provide customized advertising to support their services. One example is Free Way™ offered by Broadpoint, of Landover, Maryland, <http://www.broadpoint.com/>. These services require an explicit user registration process, typically using a computer to access a web site, to provide the service with a profile. Once the profile is provided, the advertising is targeted to the particular person's explicitly provided demographic information. In some instances, the advertising may be targeted based on both the caller's demographics and their location. Thus, callers from the San Francisco Bay Area with a particular explicit demographic profile may be presented one ad, while callers from outside the San Francisco Bay Area may be presented with another ad. Another, similar, service is offered by phone by UAccess, Inc., <http://www.uaccess.com/>, by calling +1 (800) UACCESS, and provides consumers targeted advertising based on profile information they enter.
g. Voice Character
Most telephone systems have a small number of voice actors. Continuing with the example of Moviefone™, one actor performs all of the menus and prompts. Other systems may use different voice actors for different subsystems.
These actors are typically selected on a system wide basis and as such, different voices, talents, speeds, characteristics, dialects, and other prosody aspects of the presentation are not user selectable.
h. Purchase Recommendations
Voice systems such as GALAXY from the Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts, have been adapted to provide information about purchasing decisions for used cars. For example, GALAXY has been used to allow for interactive browsing of automobile classified ads. These voice systems are problem domain specific. Further, the systems are designed to locate vehicles matching a particular set of criteria, rather than making actual recommendations.
Other systems are web based. For example, Amazon.com will make book suggestions for users connected to the web via a computer. However, those suggestions are limited to a particular site, e.g. Amazon.com.
i. Voice Login
Most telephone systems require a user to explicitly identify herself/himself by using a combination of a login identifier, e.g. credit card number, account number, etc., and a personal identification number (PIN). Some systems abbreviate this process by allowing a user calling from a particular phone to shortcut this process slightly. For example, callers using a phone number associated with a particular credit card might only be asked to enter the last four digits of their credit card number together with their billing zip code instead of all sixteen digits of the card number. Other products such as Nuance Verifier™ from Nuance Communications, Menlo Park, California, support voice login capabilities, e.g. you just speak instead of entering a password.
j. Initial Profile Generation from Database Lookups
Most systems that provide information over the telephone require users to explicitly answer one or more questions in one form or another, e.g. over the phone, the web, and/or in written form. These questions form a demographic and/or psychographic profile for the user. All of these systems require the user to explicitly provide her/his profile information.
SUMMARY
A method and apparatus for providing streaming content over telephones is described. The creation of a voice portal is supported by the invention. Embodiments of the invention allow users to place a telephone call to access the voice portal. The user can access many different types of content. This content can include text based content which is read to the user by a text to speech system (e.g., news reports, stock prices, text content of Internet sites), audio content which can be played to the user (e.g., voicemail messages, music), and streaming audio content (e.g., Internet broadcast radio shows, streaming news reports, and streaming live broadcasts). This content can be accessed from many different places. For example, the content can be retrieved from a news feed, a local streaming content server, an audio repository, and/or an Internet based streaming content server.
The streaming content allows the user to access live web broadcasts even though the user may not have access to a computer. The system allows audio signals from multiple sources to be combined to enhance the user experience. Additionally, the user can control many aspects of the delivery of the streaming content. In one embodiment, a computer system supports the voice portal functionality. The system can include an interface to the Public Switched Telephone Network and a system for interfacing with the Internet. The PSTN interface supports communications with the users' telephones, while the Internet interface provides for connectivity with the Internet. Between the two interfaces are a number of subsystems that support the selection, conversion, mixing and/or control of audio content (and in particular, streaming audio content). Importantly, these subsystems also support the personalization of the content. The following describes some of the personalization features of various embodiments of the invention.
The personalized content presented during a telephone call is specific to that user based on the profile associated with her/his telephone identifying information. For example, if a user, John, has previously called from the telephone number 650-493-####, he may have indicated he prefers a Southern dialect. Then, upon subsequent calls to the system from his telephone John will be greeted in a Southern dialect based on the profile associated with his number. Example personalizations provided by embodiments of the invention will now be described. The system may personalize the session based on the time and/or date as determined from the telephone identifying information. For example, based on the local time for the calling party, time and/or date appropriate options may be presented. For example, if a user calls from California at noon, a restaurant for lunch may be suggested. However, a caller from London at that same moment might be presented evening entertainment selections because of the eight hour time difference.
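The time-based personalization in the California and London example can be sketched as follows. This is only an illustration: the prefix table, the fixed UTC offsets (which ignore daylight saving time), the hour ranges, and the function names are all assumptions, not details given in this description.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical table mapping a calling-number prefix to a fixed UTC offset
# in hours. A real system would need a complete, DST-aware mapping.
UTC_OFFSET_BY_PREFIX = {
    "+1650": -8,  # Palo Alto, California (standard time)
    "+4420": 0,   # London
}


def caller_local_time(calling_number, now_utc):
    """Estimate the caller's local time from the calling number."""
    for prefix, offset_hours in UTC_OFFSET_BY_PREFIX.items():
        if calling_number.startswith(prefix):
            return now_utc.astimezone(timezone(timedelta(hours=offset_hours)))
    return now_utc  # unknown prefix: fall back to UTC


def suggest(calling_number, now_utc):
    """Pick a time-appropriate option for the caller (illustrative hour ranges)."""
    hour = caller_local_time(calling_number, now_utc).hour
    if 11 <= hour < 14:
        return "lunch restaurants"
    if 17 <= hour < 23:
        return "evening entertainment"
    return "general topics"
```

With these assumed offsets, two calls placed at the same instant (20:00 UTC) yield a lunch suggestion for the California number and evening entertainment for the London number.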
The system may personalize the session based on the caller's locale as determined from the telephone identifying information. For example, callers from Palo Alto, California may hear different selections and options than callers from Washington, D.C. This may include locale specific events, e.g. a county fair, locale specific announcements, e.g. flood watch for Santa Clara County, etc.
The system may target advertising based on the caller's demographic and/or psychographic profile. Additionally, the advertising may be targeted based on the telephone identifying information. For example, overall demographic information for a particular area code-exchange combination may be used, e.g. 650-493 corresponds to Palo Alto, California, with an average income of $X. On an international scale, this can be used to the extent that the particular numbering plan can be combined with relevant geographic/demographic/psychographic information about callers. Both types of targeted advertising allow the callers to be qualified, e.g. match the requirements of the advertiser. For example, a San Francisco jewelry store might only want to reach households in the Bay Area with an average household income exceeding $50,000 a year. The targeted advertising can ensure that callers presented with the ad are qualified. The system may adapt the voice character, e.g. the speech patterns and dialect, of the system according to the caller's telephone identifying information and/or the caller's own voice character. Thus, for example, for callers who speak more slowly the system may reduce the speed at which the system speaks and/or increase the volume. This may be based on the telephone identifying information, e.g. hospital, an explicit request by the user, e.g. "Slow down", and/or implicitly based on the caller's speech patterns and interactions with the system.
Based on the caller's profile, as retrieved through the telephone identifying information, and/or demographic information from other sources, e.g. locale based and/or reverse phone directory lookup, the system can make purchasing suggestions. For example, close to a holiday like Mother's Day, the system may suggest a gift based on what other people in that locale, e.g. Palo Alto, are buying and/or based on the user's own purchasing history, e.g. she/he bought flowers last year. Similarly, the voice portal can recommend the purchase of an audio CD based on previous audio CD purchases. The system may use a voice password and/or touch-tone login system when appropriate to distinguish the caller or verify the caller's identity for specific activities.
These customizations may be combined in a variety of fashions. Thus, for example, the time, the locale, and a preferred dialect may be used to present a purchase recommendation. Profiles can be constructed implicitly as the caller uses embodiments of the invention as well as through explicit designation of preferences. For example, the user might specify an existing personalized web site to use in building her/his profile for the voice system. Similarly, for a caller from New York who repeatedly asks for the weather in San Francisco, the system might automatically include the San Francisco weather in the standard weather report without explicit specification, or confirmation.
Additionally, new callers may have an initial profile generated based on one or more database lookups for demographic information based on their telephone identifying information.
BRIEF DESCRIPTION OF THE FIGURES
Fig. 1 illustrates a system including embodiments of the invention used to provide streaming contents from the Internet to a telephone.
Fig. 2 illustrates the components of a voice portal supporting streaming content delivery and personalized content.
Fig. 3 is a process flow diagram supporting personalization and registration of and for users accessing a voice portal over a telephone interface.
Fig. 4 is a process flow diagram for personalizing a voice portal over a web based interface.
Fig. 5 is a process flow diagram for providing personalized content according to some embodiments of the invention.
Fig. 6 is a process flow diagram for providing streaming content to a telephone.
DETAILED DESCRIPTION
A. Introduction
A voice portal for presenting streaming content over a telephone interface is described. The voice portal allows users of telephones, including cellular telephones, to access a voice portal by dialing a phone number to listen to streaming content. The information provided over the voice portal may come from the World Wide Web (WWW), databases, third parties, and/or other sources.
The term voice portal refers to the capability of various embodiments of the invention to provide customized voice and/or audio content services to a caller. The voice portal can receive dual-tone multi-frequency (DTMF or touch-tone) commands as well as spoken commands to further control the voice and/or audio content presented and the manner of presentation. Note that not all embodiments of the invention use a voice portal, and in particular, not all embodiments use a voice portal with personalization capabilities. Generally, any system that provides a telephone interface, an Internet interface, and can send an extracted audio signal, from streaming content being received by the Internet interface, to the telephone interface, would support the invention.
Embodiments of the invention use telephone identifying information to personalize caller interactions with the voice portal. This allows the system to present highly customized information to each caller based on a personal profile the system associates with the telephone identifying information.
The invention will be described in greater detail as follows. First, a number of definitions useful to understanding the invention are presented. Then, the hardware and software architecture is presented in the System Overview. Then, a series of sections describe the various personalization features provided by different embodiments of the invention. Then, a section describes the streaming content capabilities of the system.
B. Definitions
1. Telephone Identifying Information
For the purposes of this application, the term telephone identifying information will be used to refer to ANI information, CID information, and/or some other technique for automatically identifying the source of a call and/or other call setup information. For example, ANI information typically includes a dialed number identification service (DNIS). Similarly, CID information may include text data including the subscriber's name and/or address, e.g. "Jane Doe". Other examples of telephone identifying information might include the type of calling phone, e.g. cellular, pay phone, and/or hospital phone. Additionally, the telephone identifying information may include wireless carrier specific identifying information, e.g. location of wireless phone now, etc. Also, signaling system seven (SS7) information may be included in the telephone identifying information.
2. User Profile
A user profile is a collection of information about a particular user. The user profile typically includes collections of different information as shown and described more fully in connection with Figure 6. Notably, the user profile contains a combination of explicitly made selections and implicitly made selections. Explicitly made selections in the user profile stem from requests by the user to the system. For example, the user might add business news to the main topic list. Typically, explicit selections come in the form of a voice, or touch-tone command, to save a particular location, e.g. "Remember this", "Bookmark it", "shortcut this", pound (#) key touch-tone, etc., or through adjustments to the user profile made through the web interface using a computer.
In contrast, implicit selections come about through the conduct and behavior of the user. For example, if the user repeatedly asks for the weather in Palo Alto, California, the system may automatically provide the Palo Alto weather report without further prompting. In other embodiments, the user may be prompted to confirm the system's implicit choice, e.g. the system might prompt the user "Would you like me to include Palo Alto in the standard weather report from now on?"
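One way to picture implicit selection is a simple request counter that adds a city to the standard weather report once it has been requested often enough, optionally after asking the user to confirm. The class name, the threshold of three requests, and the `confirm` callback are hypothetical; the description does not state how "repeatedly" is measured.

```python
from collections import Counter


class ImplicitTopics:
    """Track repeated weather requests and promote them to the standard report."""

    def __init__(self, threshold=3):
        self.threshold = threshold      # assumed number of requests before promotion
        self.requests = Counter()
        self.standard_report = set()

    def record_request(self, city, confirm=None):
        """Record a request; return True if the city was newly added.

        confirm, if given, is called with the city and must return True
        for the city to be added (the confirmation-prompt embodiment).
        If the user declines, this sketch asks again on the next request.
        """
        self.requests[city] += 1
        if city in self.standard_report or self.requests[city] < self.threshold:
            return False
        if confirm is not None and not confirm(city):
            return False
        self.standard_report.add(city)
        return True
```

Passing no `confirm` callback models the embodiment that adds the report without further prompting.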
Additionally, the system may allow the user to customize the system to meet her/his needs better. For example, the user may be allowed to control the verbosity of prompts, the dialect used, and/or other settings for the system. These customizations can be made either explicitly or implicitly. For example if the user is providing commands before most prompts are finished, the system could recognize that a less verbose set of prompts is needed and implicitly set the user's prompting preference to briefer prompts.
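The implicit verbosity adjustment could be approximated by tracking how often the caller issues a command before a prompt finishes. In this sketch the names, the minimum sample of ten prompts, and the one-half threshold (a reading of "most prompts") are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class PromptStats:
    """Counts of prompts played and prompts the caller interrupted."""
    prompts_played: int = 0
    prompts_interrupted: int = 0

    def record(self, interrupted):
        self.prompts_played += 1
        if interrupted:
            self.prompts_interrupted += 1


def preferred_verbosity(stats, min_prompts=10, threshold=0.5):
    """Return "brief" when the caller interrupts more than the threshold share of prompts."""
    if stats.prompts_played < min_prompts:
        return "standard"  # not enough evidence yet
    ratio = stats.prompts_interrupted / stats.prompts_played
    return "brief" if ratio > threshold else "standard"
```

Requiring a minimum number of prompts before switching avoids changing the setting on the basis of one or two early interruptions.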
3. Topics and Content
A topic is any collection of similar content. Topics may be arranged hierarchically as well. For example, a topic might be business news, while subtopics might include stock quotes, market report, and analyst reports. Within a topic different types of content are available. For example, in the stock quotes subtopic, the content might include stock quotes. The distinction between topics and the content within the topics is primarily one of degree in that each topic, or subtopic, will usually contain several pieces of content.
4. Qualified
The term qualified as it is used in this application refers to whether or not a particular user being presented an advertisement, or other material, meets the demographic and/or psychographic profile requirements for that advertisement, or content. For example, a San Francisco-based bookstore might request that all listeners to its advertisement be located in a particular part of the San Francisco Bay Area. Thus, a user of the system would be qualified if she lived in the designated part of the San Francisco Bay Area. Different embodiments of the invention may qualify users of the system according to different requirements. For example, in some instances advertising, or content, is qualified solely based on telephone identifying information. In other embodiments the telephone identifying information is used in conjunction with other information such as an associated user profile, a reverse telephone number lookup for locale demographics, and/or other information.
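A qualification check of this kind might look like the following sketch, which tests a caller's locales and, optionally, an income requirement against an advertisement's requirements. The field names and the treatment of the income figure as an inclusive minimum are assumptions for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class AdRequirements:
    """Requirements an advertiser places on listeners (hypothetical fields)."""
    locales: set = field(default_factory=set)  # empty set means any locale
    min_household_income: int = 0


@dataclass
class CallerProfile:
    """Caller attributes derived from telephone identifying information and/or a profile."""
    locales: set
    household_income: int


def is_qualified(caller, ad):
    """Return True if the caller meets the advertisement's requirements."""
    in_locale = not ad.locales or bool(caller.locales & ad.locales)
    return in_locale and caller.household_income >= ad.min_household_income
```

For a store that wants only San Francisco Bay Area listeners, `AdRequirements(locales={"San Francisco Bay Area"})` would qualify any caller whose locales include that area.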
5. Locale
As used in this application, the term locale refers to any geographic area. The geographic area may be a neighborhood, a city, a county, a metropolitan region, a state, a country, a continent, a group of countries, and/or some other collection of one or more geographic areas, e.g. all United States major metropolitan areas.
For this reason, a single user of the system may be considered to be in several locales. For example, a caller from Palo Alto, California, might be in the Palo Alto locale, a Silicon Valley locale, a San Francisco Bay Area locale, a Northern California locale, a California state locale, and a United States locale.
Thus, the telephone identifying information for a single telephone number can be mapped to a number of system-defined locales.
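The mapping from a single telephone number to several locales can be sketched with lookup tables keyed on the area code and exchange. The tables below are hypothetical and contain only the examples mentioned in this description (650-493 for Palo Alto, 212 for New York City); the function name and the number parsing are also assumptions.

```python
# Hypothetical tables, ordered from the most specific locale to the least specific.
LOCALES_BY_EXCHANGE = {
    ("650", "493"): [
        "Palo Alto", "Silicon Valley", "San Francisco Bay Area",
        "Northern California", "California", "United States",
    ],
}
LOCALES_BY_AREA_CODE = {
    "650": ["San Francisco Bay Area", "Northern California", "California", "United States"],
    "212": ["New York City", "New York", "United States"],
}


def locales_for(number):
    """Return the system-defined locales for a NANP calling number."""
    digits = "".join(ch for ch in number if ch.isdigit())
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop the country code
    if len(digits) != 10:
        return []
    area, exchange = digits[:3], digits[3:6]
    # Prefer the area code plus exchange match, then fall back to the area code.
    return LOCALES_BY_EXCHANGE.get((area, exchange)) or LOCALES_BY_AREA_CODE.get(area, [])
```

Returning an ordered list lets a caller of this function choose the most specific locale for which content or advertising is available.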
6. Voice Character
The term voice character as it is used in this application refers to all aspects of speech pronunciation including dialect, speed, volume, gender of speaker, pitch, language, voice talent used, actor, characteristics of speech, and/or other prosody values. Users can adjust the voice character of the system by changing their voice character settings. For example, an elderly user could select voice character settings that provide louder volume and slower speech. Similarly, a caller from the South could adjust the voice character settings to support a Southern dialect.
7. Demographic and Psychographic Profiles
Both demographic profiles and psychographic profiles contain information relating to a user. Demographic profiles typically include factual information, e.g. age, gender, marital status, income, etc. Psychographic profiles typically include information about behaviors, e.g. fun loving, analytical, compassionate, fast reader, slow reader, etc. As used in this application, the term demographic profile will be used to refer to both demographic and psychographic profiles.
8. Streaming Content
Streaming technology allows a user to see, and/or listen to, streaming content as it is being downloaded. Streaming content is audio and/or video data sent as a stream. Typically, the content is broken into small parts, compressed, sent over a network, uncompressed, and then played on a user's computer. This means that users can begin viewing, or listening to, the content before the entire stream is received. Note, video data can include animations. Importantly, streaming data, as opposed to non-streaming data, does not require the user to download an entire data file before the user can begin listening to or viewing the data.
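The break, compress, send, uncompress, play sequence can be sketched with two generators. This is an illustration only: `zlib` stands in for an audio codec, the four-byte part size is arbitrary, and the function names are assumptions rather than anything specified here.

```python
import zlib
from typing import Iterable, Iterator


def send_stream(content: bytes, part_size: int = 4) -> Iterator[bytes]:
    """Break content into small parts and compress each part for sending."""
    for start in range(0, len(content), part_size):
        yield zlib.compress(content[start:start + part_size])


def play_stream(parts: Iterable[bytes]) -> Iterator[bytes]:
    """Uncompress each part as it arrives so playback can begin immediately."""
    for part in parts:
        yield zlib.decompress(part)
```

Because both functions are generators, the first part is available for playback before the remaining parts have been produced, which is the property that distinguishes streaming from downloading a whole file.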
C. System Overview
First, the hardware and software architecture of a system including an embodiment of the invention will be described with reference to Figures 1-2. Figure 1 illustrates a system including embodiments of the invention used to provide streaming content and personalized content to users of telephones according to telephone identifying information. The system of Figure 1 can be used to allow users of standard telephones and cellular telephones to access a voice portal with streaming content from their telephones.
The following lists the elements of Figure 1 and describes their interconnections. Figure 1 includes a telephone 100, a cellular telephone 101, a computer 102, a telephone network 104, an Internet 106, a telephone gateway 107, a web server 108, a voice portal 110, a shared database 112, a personalized site 130, a streaming content server 150, a local streaming content server 160 and a streaming content path 170. The cellular telephone 101 and the telephone 100 are coupled in communication with the telephone network 104. The telephone network 104 is coupled in communication with the telephone gateway 107. The telephone gateway 107 is coupled in communication with the voice portal 110. The computer 102 and the streaming content server 150 are coupled in communication with the Internet 106. The Internet 106 is coupled in communication with the web server 108. The voice portal 110 and the web server 108 are coupled in communication with the shared database 112. The personalized site 130 is coupled in communication with the Internet 106. The local streaming content server 160 is coupled in communication with the voice portal 110. The following describes each of the elements of Figure 1 in greater detail. The use of each of the elements will be described further in conjunction with the sections describing the personalization features.
The telephone 100 and the cellular telephone 101 are two different telephone interfaces to the voice portal 110. The telephone 100 and the cellular telephone 101 may be any sort of telephone and/or cellular telephone. For example, the telephone 100 or the cellular telephone 101 may be a land line phone, a PBX telephone, a satellite phone, a wireless telephone, and/or any other type of communication device capable of providing voice communication and/or touch-tone signals over the telephone network 104. However, any audio signal carrying interface could be used. The telephone network 104 may be the public switched telephone network (PSTN) and/or some other type of telephone network. For example, some embodiments of the invention may allow users with a voice over Internet Protocol (IP) phone to access the voice portal 110. The telephone network 104 is coupled to the telephone gateway 107 that allows the voice communications and/or touch-tone signals from the telephone network 104 to reach the voice portal 110 in usable form. Similarly, the telephone gateway 107 allows audio signals generated by the voice portal 110 to be sent over the telephone network 104 to respective telephones, e.g. the telephone 100. The telephone network 104 generally represents an audio signal carrying network.
The computer 102 is a computer such as a personal computer, a thin client computer, a server computer, a handheld computer, a set top box computer, and/or some other type of visual web browsing device. The computer 102 is coupled in communication with the Internet 106, e.g. by a dial-up connection, a digital subscriber loop (DSL), a cable modem, and/or some other type of connection. This allows the computer 102 to communicate with the web server 108. The computer 102 typically provides a visual interface to the WWW and the web server 108 using web browsing software such as Internet Explorer™ from Microsoft Corporation, Redmond, Washington.
Both the web server 108 and the voice portal 110 are capable of communicating with the shared database 112 to register users and to build personal profiles implicitly and/or explicitly, as will be described more fully below. The database 112 stores profiles for each user based on an association between one or more pieces of telephone identifying information and a particular user. Thus, the database may have a profile for a user Sarah Smith that is keyed to her home telephone number, e.g. 650-493-####. Additionally, Sarah could associate other numbers, e.g. work, cellular, etc., with her profile either implicitly, e.g. by repeatedly calling the voice portal 110 from those numbers, or explicitly, e.g. by adding those numbers to the system directly.
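The association between telephone numbers and profiles can be modeled as a two-way index. The sketch below is a hypothetical in-memory stand-in for the shared database 112; the class name, method names, and the fictional 555 numbers are assumptions for illustration.

```python
class ProfileStore:
    """Hypothetical in-memory stand-in for the shared database 112."""

    def __init__(self):
        self._profiles = {}    # profile id -> profile dict
        self._by_number = {}   # telephone number -> set of profile ids

    def add_profile(self, profile_id, name):
        self._profiles[profile_id] = {"name": name, "numbers": set()}

    def associate(self, profile_id, number):
        """Key an existing profile to one more telephone number."""
        self._profiles[profile_id]["numbers"].add(number)
        self._by_number.setdefault(number, set()).add(profile_id)

    def profiles_for(self, number):
        """Return the ids of every profile keyed to this number."""
        return sorted(self._by_number.get(number, set()))
```

Keeping the index in both directions lets one profile own several numbers (home, work, cellular) while one number can still resolve to several profiles.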
In some embodiments, an existing profile for a web-based portal is adapted for use by the voice portal 110 by associating one or more telephone numbers with the existing profile as stored in the shared database 112. In these embodiments, the existing profile may be further modified for use with the voice portal 110 to allow for different preferences between the web and the voice interfaces.
The call flow arrows 114-122 shown on Figure 1 will be described in greater detail below. The streaming content server 150 represents any computer system capable of providing streaming content. The streaming content server 150 could be a server provided by Real Networks, Microsoft Corporation, Apple Computer, Inc., or any number of streaming content delivery systems. The local streaming content server 160 is similar to the streaming content server 150 except that the local streaming content server 160 is directly accessible to the voice portal 110 (i.e., not requiring an Internet access).
The streaming content path 170 represents logical paths by which streaming content can be delivered to telephones. The streaming content path 170 shows that the streaming content is accessed from the streaming content server 150, or the local streaming content server 160, sent through the voice portal 110, the telephone gateway 107 and the telephone network 104, to be delivered to the telephone 100 (or cellular phone 101).
Figure 2 illustrates the components of a voice portal supporting streaming and personalized content. This could be used to support the voice portal 110.
The following lists the elements of Figure 2 and describes their interconnections. The voice portal 110 is coupled in communication with the telephone gateway 107. The voice portal 110 includes a call manager 200, an execution engine 202, a data connectivity engine 220, an evaluation engine 222 and a streaming engine 224. Additionally, Figure 2 includes elements that may be included in the voice portal 110, or which may be separate from, but coupled to, the voice portal 110. Thus Figure 2 also includes a recognition server 210, a text to speech server 214, an audio repository 212, the local streaming content server 160, the shared database 112, a database 226, the Internet 106, a database 228 and a web site 230. The call manager 200 within the voice portal 110 is coupled to the execution engine 202. The execution engine 202 is coupled to the recognition server 210, the text to speech server 214, the audio repository 212, the data connectivity engine 220, the evaluation engine 222 and the streaming engine 224. The voice portal 110 is coupled in communication with the shared database 112, the database 226 and the Internet 106. The Internet 106 is coupled in communication with the streaming content server 150, the database 228 and the web site 230. The following describes each of the elements of Figure 2 in greater detail. The use of each of the elements will be described further in conjunction with the sections describing the personalization features and the streaming content features.
Typically, the voice portal 110 is implemented using one or more computers. The computers may be server computers such as UNIX workstations, personal computers and/or some other type of computers. Each of the components of the voice portal 110 may be implemented on a single computer, multiple computers and/or in a distributed fashion. Thus, each of the components of the voice portal 110 is a functional unit that may be divided over multiple computers and/or multiple processors. The voice portal 110 represents an example of a telephone interface subsystem. Different components may be included in a telephone interface subsystem. For example, a telephone interface subsystem may include one or more of the following components: the call manager 200, the execution engine 202, the data connectivity engine 220, the evaluation engine 222, the streaming engine 224, the audio repository 212, the text to speech server 214 and/or the recognition server 210.
The call manager 200 is responsible for scheduling call and process flow among the various components of the voice portal 110. The call manager 200 sequences access to the execution engine 202. Similarly, the execution engine 202 handles access to the recognition server 210, the text to speech server 214, the audio repository 212, the data connectivity engine 220, the evaluation engine 222 and the streaming engine 224.
The recognition server 210 supports voice, or speech, recognition. The recognition server 210 may use Nuance 6™ recognition software from Nuance Communications, Menlo Park, California, and/or some other speech recognition product. The execution engine 202 provides necessary grammars to the recognition server 210 to assist in the recognition process. The results from the recognition server 210 can then be used by the execution engine 202 to further direct the call session. Additionally, the recognition server 210 may support voice login using products such as Nuance Verifier™ and/or other voice login and verification products.
The text to speech server 214 supports the conversion of text to synthesized speech for transmission over the telephone gateway 107. For example, the execution engine 202 could request that the phrase, "The temperature in Palo Alto, California, is currently 58 degrees and rising" be spoken to a caller. That phrase would be translated to speech by the text to speech server 214 for playback over the telephone network on the telephone (e.g. the telephone 100). Additionally the text to speech server 214 may respond using a selected dialect and/or other voice character settings appropriate for the caller.
The audio repository 212 may include recorded sounds and/or voices. In some embodiments the audio repository 212 is coupled to one of the databases (e.g. the database 226, the database 228 and/or the shared database 112) for storage of audio files. Typically, the audio repository server 212 responds to requests from the execution engine 202 to play a specific sound or recording.
For example, the audio repository 212 may contain a standard voice greeting for callers to the voice portal 110, in which case the execution engine 202 could request play-back of that particular sound file. The selected sound file would then be delivered by the audio repository 212 through the call manager 200 and across the telephone gateway 107 to the caller on the telephone, e.g. the telephone 100. Additionally, the telephone gateway 107 may include digital signal processors (DSPs) that support the generation of sounds and/or audio mixing. Some embodiments of the invention include telephony systems from Dialogic, an Intel company.
The execution engine 202 supports the execution of multiple threads with each thread operating one or more applications for a particular call to the voice portal 110. Thus, for example, if the user has called in to the voice portal 110, a thread may be started to provide her/him a voice interface to the system and for accessing other options. In some embodiments of the invention an extensible markup language (XML)-style language is used to program applications. Each application is then written in the XML-style language and executed in a thread on the execution engine 202. In some embodiments, an XML-style language such as VoiceXML from the VoiceXML Forum, <http://www.voicexml.org/>, is extended for use by the execution engine 202 in the voice portal 110.
Additionally, the execution engine 202 may access the data connectivity engine 220 for access to databases and web sites (e.g. the shared database 112, the web site 230), the evaluation engine 222 for computing tasks and the streaming engine 224 for presentation of streaming media and audio. The streaming engine 224 may allow users of the voice portal 110 to access streaming audio content, or the audio portion of streaming video content, over the telephone interface. For example, a streaming media broadcast from ZDNet™ could be accessed by the streaming engine 224 for playback through the voice portal. The streaming engine 224 can act as a streaming content client to a streaming content server, e.g., the streaming engine 224 can act like a RealPlayer software client to receive streaming content broadcasts from a Real Networks server. Additionally, the streaming engine 224 can participate in a streaming content broadcast by acting like a streaming broadcast forwarding server. This second function is particularly useful where multiple users are listening to the same broadcast at the same time (e.g., multiple users may call into the voice portal 110 to listen to the same live streaming broadcast of a company's conference call with the analysts).
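The forwarding role can be illustrated with a small fan-out structure in which each part of a broadcast is received once from the upstream server and copied to every active call. The class and method names below are hypothetical; the sketch does not reflect how any particular streaming product implements forwarding.

```python
class ForwardingStream:
    """Hypothetical forwarder: one upstream feed copied to many call sessions."""

    def __init__(self):
        self._listeners = {}   # call id -> list of audio parts delivered so far

    def join(self, call_id):
        self._listeners[call_id] = []

    def leave(self, call_id):
        self._listeners.pop(call_id, None)

    def on_upstream_part(self, part: bytes):
        # Each part is fetched from the streaming content server once,
        # then copied to every caller currently listening.
        for received in self._listeners.values():
            received.append(part)

    def received(self, call_id):
        return list(self._listeners.get(call_id, []))
```

A caller who joins a live broadcast late hears only the parts that arrive after joining, which matches the behavior expected of a live conference call feed.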
The data connectivity engine 220 supports access to a variety of databases including databases accessed across the Internet 106, e.g. the database 228, and also access to web sites over the Internet such as the web site 230. In some embodiments the data connectivity engine can access structured query language (SQL) databases, open database connectivity (ODBC) databases, and/or other types of databases. The shared database 112 is represented separately from the other databases in Figure 2; however, the shared database 112 may in fact be part of one of the other databases, e.g. the database 226. Thus, the shared database 112 is distinguished from other databases accessed by the voice portal 110 in that it contains user profile information.
Having described the hardware and software architecture supporting various embodiments of the invention, the various personalization features provided by different embodiments of the invention will now be described. After the personalization features, the streaming content aspects of various embodiments of the invention are described in detail.
D. Telephone Driven Profile Building
Turning to Figure 3, the process of creating a profile using a telephone interface will be described. This process will be described with reference to the call flow arrows shown on Figure 1 as well.
The voice portal 110 is able to flexibly handle multiple callers from a single telephone, e.g. Tom and Dick are roommates and both call from 650-493-####. Similarly, the voice portal 110 is able to handle a single caller that uses multiple telephones, e.g. Tom has a cell phone 650-245-####, his home phone 650-493-####, and a work phone 408-301-####. The manner in which the voice portal 110 can handle some of the above situations will be discussed throughout. In the example used while describing Figure 3, the process will be described using a caller Jane Smith as an exemplary caller who has never registered with the voice portal 110 from any telephone and an exemplary caller John Doe who has previously called the voice portal 110 from his telephone 100.
First, at step 300, telephone identifying information is received. This is shown in Figure 1 by call flow arrow 114 representing the transfer of telephone identifying information through the telephone gateway 107 to the voice portal 110. This step occurs after a user has placed a call to the voice portal 110 with a telephone, e.g. the telephone 100.
Next, at step 302, a determination is made as to whether the telephone identifying information corresponds to a known profile, e.g. is the user registered? Some examples may be illustrative. If Jane Smith uses the cellular telephone 101 to call the voice portal 110 for the first time, her telephone identifying information will not be associated with any existing unique profile in the shared database 112. Therefore, at step 302, the determination would be made that she is not registered and the process would continue at step 304. In contrast, John Doe has previously called the voice portal from the telephone 100 and so his telephone identifying information will be associated with a profile in the shared database 112 and the process would continue at step 306.
If the telephone identifying information is not associated with an existing profile in the shared database 112, a new profile is created at step 304. The new profile may be initialized using a variety of information derived from the telephone identifying information and/or predetermined values for the voice portal 110. Thus, for example, when Jane Smith calls for the first time from the cellular telephone 101, an initial profile can be created using the calling number, e.g. 650-493-####, included in the telephone identifying information to select initial profile settings. The call flow arrow 116 shows this process on Figure 1. The use of the telephone identifying information to create an initial profile is discussed below in the section "Automatic Profile Initialization".
In some embodiments, the profile is not initialized using the telephone identifying information. In other embodiments, the user may be explicitly queried by the voice portal 110 to create one or more components of the initial profile, e.g. "Please speak your first name", to allow for more personalized prompting by the voice portal 110. Once a profile is created, the process continues at step 306.
At step 306, the profile is retrieved from the shared database 112 as shown by the call flow arrow 118. The profile can be updated throughout the call based on the user's behavior and actions (implicit preferences) as well as explicit requests from the user to customize the voice portal 110. Once a profile is selected at step 306, the personalized content can be presented to the user as shown by the call flow arrow 122 in Figure 1. For example, John Doe, who is calling from the telephone 100, already has a profile in the shared database 112. That profile may indicate that John prefers a southern dialect and likes to hear a quick stock market report immediately on call in. Thus, for John, his telephone identifying information serves to log him directly into the system and trigger the personalized behavior unique to him: a quick stock market report in a southern dialect. In contrast, a different caller, Sarah Brown, from a different telephone will be provided different personalized content based on that telephone identifying information.
The voice portal may support multiple callers from a single telephone. For example, Sarah Brown and John Doe may both use the telephone 100 to call the voice portal 110. In the case where two or more profiles are identified with the same telephone identifying information, the voice portal may prompt for a password or other unique identifier, either as voice or touch-tone, to select among the profiles. However, as a general matter, the voice portal is configured to minimize the need for a caller to provide a password. Thus, during a single call session, the caller is typically only asked to provide her/his password a single time. However, some embodiments of the invention may require that a password always be used to complete commercial transactions and/or after the passage of a predetermined period, e.g. ten minutes since last password prompt.
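Steps 300 through 306, including the case of several profiles sharing one telephone, can be sketched as a single function. The dictionary layout, field names, and return convention below are assumptions made for illustration; the application does not prescribe a data model, and a real system would not compare passwords in plain text.

```python
def handle_call(database, calling_number, password=None):
    """Sketch of steps 300-306 of Figure 3.

    `database` maps a calling number to a list of profile dicts. This
    layout and the field names are hypothetical.
    """
    # Step 300: telephone identifying information (here, only the
    # calling number) is received.
    profiles = database.get(calling_number)

    # Step 302: does the number correspond to a known profile?
    if not profiles:
        # Step 304: create a new profile initialized from the number.
        profiles = [{"number": calling_number, "password": None,
                     "preferences": {}}]
        database[calling_number] = profiles

    # Step 306: retrieve the profile.
    if len(profiles) == 1:
        return profiles[0]
    # Several callers share this telephone: the password selects one.
    for profile in profiles:
        if password is not None and profile["password"] == password:
            return profile
    return None  # no match yet; the caller would be prompted for a password
```

Returning `None` in the shared-telephone case leaves it to the caller of this function to issue the voice or touch-tone prompt and try again.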
In some embodiments, the user may adjust her/his profile to allow login without a password for playback features.
Also, a single profile can be associated with multiple calling numbers. For example, the user Jane Doe could specify that both the telephone 100 and the cellular telephone 101 should be associated with her profile. Similarly, if Jane calls from a new telephone, e.g. a pay phone, she can provide her existing telephone number and her password to access her profile. In some embodiments, whenever the user calls from a new telephone number, she/he is prompted as to whether to remember the number for future use. In some embodiments, additional telephone identifying information, e.g. this is a pay phone, is used so that the caller is not prompted to associate telephone numbers that are likely to be single time uses with her/his profile. Similarly, voice verification may be used to recognize a caller's voice instead of, or in addition to, using a password or other identification number.
Typical events that would require a password, or that the user be authenticated previously with a password, might include adding and removing items from the user profile through explicit commands as well as requests for specific personal information, e.g. that user's stock portfolio, bank account balances, etc.
It is not necessary for callers to the voice portal 110 to explicitly specify their preferences using this embodiment of the invention. The callers' behaviors and actions are used by the voice portal 110 to adopt implicit preferences, sometimes after receiving confirmation. For example, behaviors and actions reflecting repeated access to particular content within a topic, or to a particular topic, may cause the voice portal 110 to automatically include the repeatedly requested content in the default message.
For example, if a caller from New York City repeatedly asks for the weather in San Francisco, the system can add the San Francisco weather to the standard weather report. Alternatively, the system may request confirmation before adding the weather report, e.g. "Would you like me to include San Francisco in the standard weather report?" Similarly, at the level of topics, users who repeatedly ask for information about business related issues may find that the system will adjust the main menu to include business. Similarly, if that same user never asks for sports scores, that option may drop off the main menu. In some embodiments, the system may ask for confirmation before modifying the menu choices, or the system may notify the user of a modification and/or allow a user to review/change past modifications. As a result, the structure and content of the call may change, e.g. San Francisco weather will be announced at the beginning of future calls and sports information may be omitted.
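The adoption of implicit preferences can be sketched as a request counter with an optional confirmation step. The threshold of three requests, the class name, and the `confirm` callback are all assumptions chosen for illustration; the application does not state how many repetitions trigger a change.

```python
from collections import Counter


class ImplicitPreferences:
    """Hypothetical tracker that promotes repeatedly requested content."""

    def __init__(self, threshold=3):
        self.threshold = threshold       # assumed number of requests before promotion
        self.request_counts = Counter()
        self.default_report = []         # content played automatically on future calls

    def record_request(self, content, confirm=lambda content: True):
        """Count a request; return True if the content was just added."""
        self.request_counts[content] += 1
        if (self.request_counts[content] >= self.threshold
                and content not in self.default_report
                and confirm(content)):
            self.default_report.append(content)
            return True
        return False
```

The `confirm` callback stands in for a prompt such as "Would you like me to include San Francisco in the standard weather report?"; passing a callback that always returns `True` models the embodiments that adopt the preference without asking.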
Through the use of this process, the need for a specialized editing mode of the type seen on customizable web portals is reduced. The user's actions and behaviors shape the options presented to her/him, thus reducing the need to explicitly pick topics and/or content in an editing mode. However, some embodiments of the invention may allow for explicit profile creation in an editing mode over the web, see below, and/or over the telephone. Also, users are typically permitted to add and remove topics and/or items at will with explicit commands, e.g. "Remember this", "Remove", "Add to my stock list", etc.
E. Web Driven Profile Building
Turning to Figure 4, the process of modifying a profile for use over a telephone interface over the web will be described. This process will be described with reference to the call flow arrows shown on Figure 1 as well.
The process shown in Figure 4 assumes that a profile has already been created, e.g. by calling for the first time as described above. However, in some embodiments of the invention, users may create profiles using the web interface by providing the telephone identifying information for their primary calling phone number and a password. As is the case with the telephone registration process described in step 304, the telephone identifying information provided, here the primary calling phone number, can be used to create the initial profile.
Starting at step 400, the profile is accessed using a computer (e.g. the computer 102) via a web interface. The web interface is provided by a web server (e.g. the web server 108) and allows for access to the shared database 112 as shown by the call flow arrow 120.
Once the user has signed in to access her/his profile on the computer, she/he can manually identify content and topics to build her/his profile at step 404. This can be supported by allowing the user to specify topics from a list of topics and then specifying per topic content from a list of content. For example, the topics might include business, sports, news, entertainment, and weather, to name a few. The user could include weather, news, and business in her/his main menu and then further customize the specific content to be presented within those topics. For example, within weather, the user might select the specific cities she/he wants listed in her/his weather menu and/or the cities for which the weather is automatically played.
Alternatively, at step 402, the user can identify a web location with personalized content to use in building her/his profile, e.g. a uniform resource identifier (URI). For example, Figure 1 includes the personalized site 130. The personalized site 130 could be a customized portal web page, e.g. myYahoo!, My Netscape, etc., a home page the user herself/himself has designed, and/or any other web page that includes content of interest to the user. The user can identify the personalized site with a URI, including a login identifier and password if necessary, e.g. for myYahoo! The personalized site 130 can then be accessed and the pertinent user preferences, e.g. news, stocks, selected. Taking the example of a customized portal site, the main topics selected, e.g. horoscopes, and the content within, e.g. Sagittarius, could be adopted. However, the voice portal 110 may present its own content for that particular item, e.g. the version of the Sagittarius horoscope on the voice portal 110, not the version from the personalized site 130. The processes of step 402 and step 404 can be used together, allowing a user to quickly transfer preferences from a web portal to her/his voice portal while still supporting explicit personalization.
Alternatively, in some embodiments of the invention, an existing web portal profile is voice enabled for use by a voice portal through the association of telephone identifying information with the existing web portal. In this embodiment, at step 402, the telephone identifying information, e.g. the primary calling number, is associated with an existing web profile, e.g. myYahoo! profile, stored in the shared database 112 and that existing web profile is then usable from the voice portal 110 either using voice or touch-tone commands. Additionally, web sites like the personalized site 130 may be accessed using the voice portal 110 in some embodiments of the invention through the use of the data connectivity engine 220 as shown in Figure 2.
F. Profile Building Via Other Web Sites
Some embodiments of the invention may allow users of the voice portal 110 to add to their profile from other web sites. For example, if a user of the computer 102 is accessing a web site (e.g. the personalized site 130), the web site might include a link like "Add this to your voice portal." Thus, for example, from a service such as MapQuest™ or Ameritrade™, the user could click on a link to add a particular piece of content or a particular topic to their portal for the voice portal 110.
For example, a user could add her/his "QQQ" stock symbol to her/his profile on the voice portal 110 even though the voice portal 110 may be operated independently of the particular web site.
This type of web based profile building allows for widespread profile building for the voice portal 110 from a variety of Internet sites. Also, in some embodiments, the web browser software on the user's computer (e.g. the computer 102) can support an option to add a bookmark to the user's profile stored in the shared database 112 for the voice portal 110. For example, a menu option in the browser on the computer 102 might include "Add Page to Voice Portal Shortcuts" and upon selecting that menu option, the current web page would be added to the user's profile on the voice portal 110.
This would typically be accomplished by accessing a URI on the web server 108 that included the information to be added. At that point, the web server 108 might ask for a primary calling phone number and/or a password. In some embodiments, a cookie stored by the browser on the computer 102 may be used to obviate one or both of these steps. After the user provides the information, or it is accepted automatically, a confirmation page may be shown including a return link to the originating web page. Several example URIs for adding content are shown below:
<http://www.voiceportal.com/add.cgi?topic=stock%20quote&content=QQQ>
<http://www.voiceportal.com/add.cgi?shortcut=MapQuest&ref=www.mapquest.com/voice.vxml>
<http://www.voiceportal.com/add.cgi?shortcut=myYahoo&ref=my.yahoo.com/voice.vxml&login=jdoe>
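A web site wishing to offer such a link could assemble the URI from its parts. The following sketch uses Python's standard `urllib.parse` module; the helper name `build_add_uri` is hypothetical, and the host, script path, and parameter names are simply those appearing in the example URIs.

```python
from urllib.parse import quote, urlencode

# Host and script path as they appear in the example URIs.
ADD_ENDPOINT = "http://www.voiceportal.com/add.cgi"


def build_add_uri(**params):
    """Return an add-to-profile URI with percent-encoded query parameters."""
    # quote_via=quote encodes spaces as %20 (not "+"); safe="/" keeps the
    # slashes inside the ref parameter readable.
    return ADD_ENDPOINT + "?" + urlencode(params, quote_via=quote, safe="/")
```

For instance, `build_add_uri(topic="stock quote", content="QQQ")` yields the first example URI, with the space in "stock quote" encoded as `%20`.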
These examples are illustrative of the various types of URIs that can be placed as links on web sites to allow users of the voice portal 110 to further customize their profile.
G. Locale Based Personalization
Turning to Figure 5, the basic personalization framework used by several embodiments of the invention is presented. At step 500, a request is made for content, or a topic. Then one or more of steps 502-510 take place, in parallel or sequence, and then the content is presented at step 512. Which of steps 502-510 occur for a given request may be determined based on the topic or content requested. For example, step 504 can be omitted when non-time dependent information is presented.
Turning to step 502, the customization of content based on the calling locale will now be described. The telephone identifying information includes information about the caller's locale independent of any user provided registration information. This information can be derived from telephone routing tables that provide a descriptive name for each area code/exchange combination within the North American Numbering Plan (NANP). Thus, the phone number 650-493-#### would be associated with "Palo Alto, California". Similarly, 650-592-#### would be associated with "San Carlos, California". This information may be directly present in the telephone identifying information provided to the voice portal 110, or may be ascertained from a local exchange routing guide (LERG). For international callers outside the NANP, similar types of telephone identifying information can be mapped to locales within countries to the extent permitted by the particular numbering plan. The city-state combination may correspond to multiple locales for the purposes of the voice portal 110. For example, a county-wide or multi-city locale can be defined that encompasses multiple area code/exchange combinations. Thus, a single caller may be in multiple locales. Locale information can be further refined through the use of additional databases, e.g. city/state to zip code databases, street address to five digit zip code databases, reverse lookup databases that map phone numbers to street addresses, longitude-latitude conversion databases, and/or other databases that provide locale related information from telephone identifying information. Thus, for example, V and H coordinates might be determined using the telephone identifying information. Those can be further converted to a longitude and latitude to determine the locale. Alternatively, a reverse phone number database could be used to find a specific street address for the telephone identifying information.
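The area code/exchange lookup can be sketched as a table keyed by the first six digits of a NANP number. The two table entries are the examples given above; the table name, function name, and the handling of a leading country code are assumptions, and a production table would be derived from routing data such as the LERG rather than written by hand.

```python
import re

# Hypothetical excerpt of a routing table keyed by (area code, exchange).
EXCHANGE_LOCALES = {
    ("650", "493"): "Palo Alto, California",
    ("650", "592"): "San Carlos, California",
}


def locale_from_number(calling_number: str):
    """Return the descriptive locale name for a NANP number, or None."""
    digits = re.sub(r"\D", "", calling_number)   # keep digits only
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]                      # drop the NANP country code
    if len(digits) != 10:
        return None
    return EXCHANGE_LOCALES.get((digits[:3], digits[3:6]))
```

The returned city-state name could then be expanded into the county-wide or multi-city locales that the voice portal defines.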
Examples of the uses for the locale information include: providing locale-appropriate lottery results, providing driving directions to a requested destination, providing locale-appropriate weather reports, providing locale-appropriate show times for movies other events, e.g. cultural, governmental, etc., traffic reports, yellow page listings, and/or providing other locale-related information.
H. Time/Date Based Personalization
Turning to step 504, the customization of content based on the time and/or date will now be described. The telephone identifying information includes information about the caller's locale independent of any user provided registration information. This information can be derived from telephone routing tables that provide a descriptive name for each area code/exchange combination within the NANP. Thus, the phone number 650-493-#### would be associated with "Palo Alto, California" and thus the correct time zone, Pacific, could be selected as well. This time zone may be directly present in the telephone identifying information provided to the voice portal 110, or may be ascertained from the LERG. For international callers outside the NANP, similar types of telephone identifying information can be mapped to locales within countries to the extent permitted by the particular numbering plan. Thus, callers from United Kingdom numbers would be mapped to British Standard Time.
The time zone information allows the voice portal 110 to customize the presentation of information based on the time in the caller's locale. Callers can use a single nationwide, or international, number to reach the voice portal 110, e.g. 800-###-####. The voice portal 110 will use the time zone information to adjust the content presented to each user.
Thus, during the lunch hour, the voice portal 110 might report a stock quote to the user while on a Friday evening, the voice portal 110 might suggest a movie. For example, "It is Friday night, would you be interested in seeing a movie?" A "yes" response by the caller will lead to the presentation of a list that is both time and date adapted and locale appropriate. For example, a caller from Palo Alto at six o'clock p.m. on a Friday would hear about show times after six o'clock p.m. in his local area.
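The time and date adaptation just described can be sketched as follows. This is a minimal Python illustration under stated assumptions: the locale-to-offset table, the function names, and the "after 5 p.m. on Friday" rule are hypothetical, and daylight saving time is ignored to keep the example short.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical mapping from locale to a fixed UTC offset in hours.
# Daylight saving time is deliberately ignored in this sketch.
LOCALE_UTC_OFFSET = {"Palo Alto, California": -8}


def local_time(locale: str, now_utc: datetime) -> datetime:
    """Convert the portal's UTC clock to the caller's local time."""
    offset = timedelta(hours=LOCALE_UTC_OFFSET[locale])
    return now_utc.astimezone(timezone(offset))


def friday_evening(locale: str, now_utc: datetime) -> bool:
    """True when it is Friday at or after 5 p.m. in the caller's locale."""
    now_local = local_time(locale, now_utc)
    return now_local.weekday() == 4 and now_local.hour >= 17


def upcoming_show_times(locale: str, now_utc: datetime, show_times: list[str]) -> list[str]:
    """Keep only show times ('HH:MM', local) at or after the caller's local time."""
    current = local_time(locale, now_utc).strftime("%H:%M")
    return [t for t in show_times if t >= current]
```

For a call arriving at 02:00 UTC on a Saturday, the Palo Alto caller's local time is 6 p.m. on Friday, so the movie prompt would apply and only show times from 18:00 onward would be listed.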
If necessary, the voice portal 110 may connect the user to an appropriate transaction system to complete a user requested transaction such as the purchase of an airline ticket, a movie ticket, an audio CD, etc. However, in many instances, the voice portal 110 may be able to directly complete the transaction using the data connectivity engine 220 and access to the Internet 106 and/or one or more databases (e.g. the database 226). This process can occur even if the caller has not explicitly provided the voice portal 110 her/his home location or the current time. For example, this personalized content might be presented immediately after step 304 of Figure 3 in step 306.
Similarly, other time sensitive information can be presented such as airline schedules, cultural and other events, etc. Thus, for example a caller asking for flight times to New York from a 650-493-#### telephone number might be prompted to select one of the three local airports: San Francisco International, San Jose International, and Oakland International, and then the flight times to New York after the current time in the Pacific time zone would be presented.
Some additional examples include customizing the presentation of business reports based on whether or not the market is open; modifying the greeting prompt based on the time of day; and providing traffic information automatically during commute hours, but not at other times.
I. Targeted Advertising
Embodiments of the invention support the presentation of targeted advertising, or other content, to callers of the voice portal 110 as shown at step 506. The two primary types of targeted advertising supported by embodiments of the invention will be described. The different types of targeted advertising can be combined as well.
1. Based Solely on Telephone Identifying Information
Telephone identifying information can be used to reference demographic information about callers from a particular area. For example, if the telephone identifying information includes the calling number 650-493-####, corresponding to Palo Alto, California, general demographic information about callers from that particular region can be used to target the advertising, or other content. Further, if a reverse lookup database is used, the phone number can, in some instances, locate specific demographic information for a given household, or caller.
This personalization allows the targeting of advertising to qualified callers by the voice portal 110. For example, an advertiser of expensive luxury vehicles might request that its callers be qualified based on their income, or a particular psychographic attribute, e.g. fun-loving. In that case, the demographic profile corresponding to the telephone identifying information can be used to qualify the caller. Thus, callers from the relatively affluent city of Palo Alto, California might receive the advertising. Similarly, if a particular household meets the requirements based on a reverse lookup, those households can receive the advertising as well.
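One way to picture the qualification step is the short Python sketch below. Everything in it is an assumption for illustration: the income figures are placeholders rather than real demographic data, "Anytown, USA" is a made-up locale, and the `Advertisement` type and `qualified_ads` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional

# Placeholder demographic figures for illustration only; not real data.
LOCALE_DEMOGRAPHICS = {
    "Palo Alto, California": {"income": 110_000},
    "Anytown, USA": {"income": 40_000},
}


@dataclass
class Advertisement:
    name: str
    min_income: int  # hypothetical qualification rule set by the advertiser


def qualified_ads(locale: str, ads: list, household: Optional[dict] = None) -> list:
    """Return the names of ads whose income requirement the caller meets.

    A household record from a reverse lookup, when present, takes
    precedence over the locale-wide demographic profile.
    """
    profile = household if household is not None else LOCALE_DEMOGRAPHICS.get(locale)
    if profile is None:
        return []
    return [ad.name for ad in ads if profile["income"] >= ad.min_income]
```

With a luxury vehicle ad that requires a high income, a caller is qualified either through the locale profile or through a household record that meets the requirement on its own.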
2. Based on Profile
Another source of information about the caller is the profile used by the shared database 112. This profile may indicate interests based on the explicit and implicit preferences, e.g. likes sports, and can be used in combination with the telephone identifying information to more closely tailor ads to the caller.
For example, if the caller has added movie and entertainment information to her/his profile, either explicitly or implicitly, advertising related to movies and entertainment could be favored over other qualified advertising based on the caller's profile. Other examples include providing brokerage, and other financial services, advertisements to callers who frequently check stock quotes and/or have a customized stock list.
J. Adaptive Voice Character
Turning to step 508, the customization of content through adaptive voice character will now be described. The telephone identifying information includes information about the caller's locale independent of any user provided registration information. The locales may be associated with one or more standard voice character settings, e.g. for dialect, and also idiomatic speech. Thus, callers from California may receive different prompts and a different dialect from the voice portal than callers from Florida.
Similarly, the telephone identifying information may include information about the type of phone, e.g. pay phone, hospital phone, etc., that can be used to adjust the voice character, e.g. louder and slower speech.
Additionally, the caller's speaking voice may be used to refine the voice character of the system. Thus, callers with speech patterns from a particular region of the country may find that after several verbal interactions with the voice portal, the content being presented at step 512 is being spoken using a voice character more suited to their own speech patterns. Similarly, in response to callers who request that information be repeated several times, the voice character for those callers may be slowed and played back louder. Additional examples include allowing users to select different voice actors, different background music and/or sound effects, control the verbosity of prompts, etc.
K. Purchase Recommendations
Turning to step 510, the customization of content through purchase suggestions will now be described.
Based on the caller's profile, as retrieved through the telephone identifying information, and/or demographic information from other sources, e.g. locale based and/or reverse lookup, the system can make purchasing suggestions.
The suggestions could be based on the caller's locale and what others in that locale have purchased. In other embodiments, the suggestions may be based on the profile of the user relative to other users' purchases. In some embodiments, approaches such as collaborative filtering are used to generate recommendations.
Examples of recommendations may include particular goods and services, e.g. flowers for Mom a few days before Mother's Day. Further, the exact suggestion may vary based on the caller's past habits, e.g. in the past you bought chocolates so this year chocolates might be suggested again. Alternatively, if many people from your locale are buying a particular book, that might be suggested as well. The particular purchase recommendation may relate to goods and services offered independently, by, and/or in affiliation with the operator of the voice portal 110.
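The collaborative filtering approach mentioned above is not specified further in the text. As one possible illustration, the Python sketch below scores items bought by other callers according to how much their purchase histories overlap with the caller's own (Jaccard similarity). The function name, the data layout, and the choice of similarity measure are assumptions, not the method of any particular embodiment.

```python
from collections import defaultdict


def recommend(target: str, purchases: dict, top_n: int = 3) -> list:
    """Suggest items the target caller has not bought yet.

    purchases maps a caller identifier to the set of items that caller bought.
    Each candidate item is scored by the summed similarity of the callers
    who bought it, where similarity is the Jaccard overlap of purchase sets.
    """
    mine = purchases.get(target, set())
    scores = defaultdict(float)
    for other, theirs in purchases.items():
        if other == target:
            continue
        union = mine | theirs
        if not union:
            continue
        similarity = len(mine & theirs) / len(union)
        if similarity == 0:
            continue  # callers with nothing in common contribute nothing
        for item in theirs - mine:
            scores[item] += similarity
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]
```

Restricting `purchases` to callers from the same locale would give the locale-based variant described in the text.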
L. Voice Login
As discussed above, the system may support the use of one or more passwords, either spoken or touch-tone for login and authentication purposes. The passwords provide for protection against modifications to a user's profile without authentication. Additionally, certain specific actions, e.g. making a purchase, listening to certain types of personalized content, etc., may require authentication. Typically, the authentication system will support either a voice or a touch-tone password for users of the voice portal 110. This flexibility addresses situations where the voice password is not working due to line conditions and/or conditions of the calling telephone. Products such as Nuance Verifier™ and/or other voice login and verification products may be used to provide the voice login feature. In some embodiments, both types of authentication may be required.
Once logged in, or authenticated, embodiments of the invention may minimize the need for the user to re-authenticate herself/himself, as described above. Additionally, the password, either voice and/or touch-tone, used for authentication for telephone sessions may be the same as or different from any passwords used for authentication for web access to the profile customization options described in conjunction with Figure 4.
M. Automatic Profile Initialization
As discussed in conjunction with Figure 3, it may be desirable to initialize the profile using the telephone identifying information. The telephone identifying information can be used to select an appropriate demographic profile and list of topics based on the calling locale. In other embodiments, a reverse lookup of the calling number provided with the telephone identifying information is used to obtain a specific demographic profile for a caller and/or her/his household. Then the demographic information derived from the locale and/or the reverse lookup is used to set initial profile values. For example, the user's income might be estimated based on the average income for the calling locale, e.g. Palo Alto, California, or from demographic information from the reverse lookup. Similarly, the caller's initial topics might be selected based on commonly selected topics for her/his locale and/or the preferences available based on the demographic information retrieved by the reverse lookup.
These initial values may be revised based on a caller's later actions. For example, if the initial estimate of a caller's age is too high, later actions may cause that information to be revised. Similarly, callers may be permitted to explicitly provide certain types of demographic information as needed. For example, the user might provide her/his birth date to a horoscope feature provided by the voice portal 110; in that instance, the birth date might be incorporated into the profile.
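The initialization and later revision of a profile might look like the following Python sketch. The field names, the default values, and both function names are hypothetical, and the income figure is a placeholder rather than real demographic data.

```python
# Hypothetical per-locale defaults; the figures are placeholders, not real data.
LOCALE_DEFAULTS = {
    "Palo Alto, California": {
        "estimated_income": 100_000,
        "topics": ["stocks", "traffic"],
    },
}


def initialize_profile(locale, reverse_lookup=None):
    """Build an initial profile from locale defaults and an optional reverse lookup."""
    profile = {"estimated_income": None, "topics": [], "birth_date": None}
    for key, value in LOCALE_DEFAULTS.get(locale, {}).items():
        # Copy lists so that later edits do not alter the shared defaults.
        profile[key] = list(value) if isinstance(value, list) else value
    if reverse_lookup:
        # Household-specific data overrides the locale-wide estimate.
        profile.update(reverse_lookup)
    return profile


def revise_profile(profile, **explicit_values):
    """Return a copy of the profile updated with values learned later."""
    updated = dict(profile)
    updated.update(explicit_values)
    return updated
```

A birth date supplied to a horoscope feature, for instance, would be folded in with `revise_profile(profile, birth_date=...)` without disturbing the values that were estimated at initialization.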
N. Streaming Content Delivery
The preceding discussion has focused on personalization of content for users accessing the voice portal 110. This section describes special treatment of streaming content within various embodiments of the invention. Fig. 6 illustrates an example use of the system of Fig. 1 to deliver streaming content to a telephone user. Note that some of the blocks in Fig. 6 can be executed in different order and/or performed in parallel.
At block 610, the execution engine 202 receives a request to access an Internet site. This request could come from a number of different sources (e.g., a voice command from the user, a selection by the user of a menu of possible sites, a command from another part of the voice portal 110 because of some personalization choices made by the user). What is important is that some Internet related request is received. In some embodiments, the request may correspond to a URI (Uniform Resource Identifier). Note, if the URI corresponds to non-streaming data, then the voice portal 110 acts as described above.
At block 620, information about the streaming content is determined. The type and source of streaming content can have different effects on the streaming engine 224. For example, if the content stream complies with Real Network's standard, then the streaming engine 224 may use the RealPlayer client functional code. The streaming engine 224 could decide to cache a large part of the streaming content if the streaming content server is not local. Additionally, the streaming engine 224 may notify the server to send only the audio streaming content. What is important is that the streaming engine 224 may make decisions about how to access and process the content stream based upon the type and/or location of the Internet site being accessed.
At block 630, the voice portal 110 begins to receive the streaming content. This can include receiving packets of streaming content and converting the streaming content into digital audio data. Alternatively, no conversion is needed where the streaming content is sent in a format compatible with the internal representation of the audio data within the voice portal 110. Note, only a portion of the entire stream is processed at any one time. The stream is received in parts and, as such, is processed in parts.
At block 640, the execution engine may cause the audio data to be mixed with additional audio data. This may be done when user menu prompts/notices from the voice portal 110 are supplied over the top of the audio from the streaming audio signal. This mixing may be done in the telephone gateway 107 and/or in the computers running the voice portal 110. Examples of the types of audio signals mixed can include streaming content from the local streaming content server 160, streaming content from the streaming content server 150, audio from the text to speech engine 214, and/or audio content from the audio repository 212. This allows prompts, system notifications, menu options, and other user interface features to be combined with streaming content delivery. The mixing may be accomplished by reducing the power level of the streaming content and adding the second signal. Mixing may also be achieved by inserting an audio signal into the stream. This allows for the insertion of advertisements, system prompts, etc. Note that, like the audio signal from the streaming content, the audio retrieved from the local streaming content server 160, the text to speech engine 214, and/or the audio repository 212, may need to be converted to a form compatible with the representation of audio within the voice portal 110.
For example, the audio repository 212 may store audio signals in many different forms (e.g., .WAV form, MP3, MIDI, etc.); these forms may need to be converted into the internal representation of audio signals within the voice portal 110. This conversion may also include resolving a location of stored audio information. For example, if the audio repository 212 refers to audio data stored at another location (e.g. on the Internet), the voice portal 110 may need to resolve the address, download the audio signal, and then convert the signal to the appropriate internal representation. Note, the voice portal 110 may use multiple internal representations of audio data.
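The two mixing approaches described for block 640, reducing the power level of the stream and adding a second signal, or inserting a second signal into the stream, can be sketched in Python as below. The sketch assumes both signals have already been converted to the same internal representation, taken here to be lists of signed 16-bit PCM samples at the same sample rate; the function names and the attenuation factor are illustrative assumptions.

```python
def mix_overlay(stream, prompt, duck_gain=0.25):
    """Attenuate the stream while the prompt plays and add the two signals.

    Samples are signed 16-bit integers. Stream samples beyond the end of the
    prompt pass through at full level. Prompt samples beyond the end of the
    stream are dropped in this sketch.
    """
    mixed = []
    for i, sample in enumerate(stream):
        if i < len(prompt):
            value = int(sample * duck_gain) + prompt[i]
        else:
            value = sample
        # Clip to the 16-bit range so the sum cannot overflow.
        mixed.append(max(-32768, min(32767, value)))
    return mixed


def mix_insert(stream, clip, position):
    """Insert a clip (e.g. an advertisement) into the stream at a sample index."""
    return stream[:position] + clip + stream[position:]
```

The overlay form suits prompts and notices that should be heard on top of the content, while the insert form suits advertisements or system prompts that interrupt it.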
At block 650, the mixed audio signal is sent from the telephone gateway 107, through the PSTN, to the user's telephone.
At block 660, a test is made to determine whether the entire stream has been received. If not, blocks 630 through 660 are repeated.
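The repetition of blocks 630 through 660 amounts to processing the stream one part at a time. A minimal Python sketch of that loop follows; the `decode`, `mix`, and `send` callables are hypothetical stand-ins for the conversion, mixing, and telephone gateway steps, and the iterable `chunks` stands in for the incoming packets.

```python
def deliver_stream(chunks, decode, mix, send):
    """Process a content stream in parts and return how many parts were sent.

    chunks: iterable yielding successive parts of the stream (block 630).
    decode: converts a part to the internal audio representation (block 630).
    mix:    combines the audio with prompts or other audio (block 640).
    send:   hands the mixed audio to the telephone side (block 650).
    The loop ends when the iterable is exhausted, which plays the role of
    the end-of-stream test at block 660.
    """
    sent = 0
    for chunk in chunks:
        audio = decode(chunk)
        mixed = mix(audio)
        send(mixed)
        sent += 1
    return sent
```

Because `chunks` may be a generator fed by the network, only one part of the stream needs to be held at a time, which matches the note that the stream is received and processed in parts.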
Thus, the user receives streaming content from the Internet on their telephone. The following describes important aspects of the invention that enhance the user's experience with the voice portal 110.
1. User Controls
Through touch tone and/or voice commands, the user can control the delivery of the streaming content. Various embodiments of the invention allow for variations and combinations of the following commands:
• Volume control - increase/decrease the volume of the audio.
• Start/stop/pause/skip/fast forward/rewind - allows the user to control the stream. Some of these commands use caching of the content by the voice portal 110. Where the streaming content server supports these commands, the streaming engine 224 may pass these commands to the server.
• Stream location - indicates how far into the stream the user is and/or how long the stream is and/or how much is left in the stream.
• Switch streams - allows the user to switch to another content stream.
• Restore - allows the user to restore a previous stream to the user's last location.
• Rate - indicates the rate at which the content stream is being received by the voice portal 110.
• Jump - allows the user to jump in the stream to subject matter of interest. To use this command, the voice portal 110 performs voice recognition on the content stream (this may involve caching portions of the content stream) and searches for the corresponding phrase, subject matter, word, etc.
• Extra - this command tells the user about the metadata for a stream. The data may include the length, title, source, and/or server type.
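A few of the commands listed above can be modeled with a small state holder, sketched here in Python. The class name, the ten-second skip interval, the zero-to-ten volume scale, and the command strings are all assumptions for illustration; the sketch covers only a cached stream and does not model passing commands through to a streaming content server.

```python
class CachedStreamPlayer:
    """Tracks playback state for a cached stream and applies user commands."""

    SKIP_SECONDS = 10  # hypothetical step for fast forward and rewind

    def __init__(self, total_seconds):
        self.total = total_seconds
        self.position = 0
        self.paused = False
        self.volume = 5  # hypothetical scale from 0 to 10

    def handle(self, command):
        """Apply one recognized command and return a short status string."""
        if command == "pause":
            self.paused = True
        elif command == "start":
            self.paused = False
        elif command == "fast forward":
            self.position = min(self.total, self.position + self.SKIP_SECONDS)
        elif command == "rewind":
            self.position = max(0, self.position - self.SKIP_SECONDS)
        elif command == "volume up":
            self.volume = min(10, self.volume + 1)
        elif command == "volume down":
            self.volume = max(0, self.volume - 1)
        elif command == "stream location":
            remaining = self.total - self.position
            return f"{self.position} of {self.total} seconds, {remaining} remaining"
        else:
            return "unrecognized command"
        return "ok"
```

The same `handle` entry point could be driven by either the voice recognition result or a decoded touch tone, since both reduce to a command string in this sketch.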
2. Additional Features
This section describes additional features of various embodiments of the invention.
The voice portal 110 can support channels. The channels correspond to specific content types such as business, sports, etc. The channels may include streaming data. A user could select a channel and have data from various sources combined and delivered to him/her.
History capabilities may be included. This allows the user to select from previously accessed Internet sites to replay streaming content or new content from the same site.
To support the business, the voice portal 110 may insert advertisements in the streaming content. This may be done by adding advertisements before, during, or after the streaming content is delivered to the user. The ads may be determined from the user's preferences.
Some embodiments of the invention may use text feeds as part of their streaming content. For example, some embodiments of the invention may use the closed caption translation of a portion of an audio broadcast to provide streaming translation facilities to a user.
Some embodiments of the invention allow for the mixing of audio from sources other than the local streaming content server 160, the text to speech engine 214, or the audio repository 212. As noted above, the voice portal 110 may access non-streaming audio data from the Internet to mix with streaming content audio signals.
O. Conclusion
Thus, a voice portal for presenting streaming content over a telephone interface has been described. The streaming content may be supplied from the Internet and may be combined with other signals. Personalization of the data may also occur. The personalized information provided over the voice portal may come from the World Wide Web (WWW), databases, third parties, and/or other sources. Telephone identifying information is used by embodiments of the invention to personalize caller interactions with the voice portal.
The foregoing description of various embodiments of the invention has been presented for purposes of illustration and description. It is not intended to limit the invention to the precise forms disclosed. Many modifications and equivalent arrangements will be apparent.

Claims

What is claimed is:
1. A method of providing streaming content from the Internet to a telephone using a computer system, the computer system including a telephone interface system coupled in communications with an Internet access system, the telephone interface system being coupled in communications with the telephone, the method comprising: receiving an Internet access request, the Internet access request corresponding to an Internet site outside of the computer system; receiving the streaming content from the Internet site, the streaming content including an audio portion; and sending at least the audio portion of the streaming content over the telephone interface system to send an audio signal, corresponding to the audio portion, to the telephone.
2. The method of claim 1, wherein the receiving the Internet access request comprises receiving a verbal request to access the Internet site and performing voice recognition on the verbal request to determine the Internet access request.
3. The method of claim 1, wherein the receiving the Internet access request comprises receiving a series of one or more touch tone signals and decoding the series of one or more touch tone signals to determine the Internet access request.
4. The method of claim 1, further comprising providing a menu corresponding to accessible Internet sites, and wherein the receiving the Internet access request corresponds to receiving a selection from the menu.
5. The method of claim 1, further comprising determining the type of streaming content and converting the corresponding type of streaming content into an audio portion.
6. The method of claim 1, wherein the Internet access system includes a web server, wherein the web server connects to the Internet site to receive the streaming content.
7. The method of claim 1, wherein the streaming content is received as packet data from a packet switched network and wherein the telephone interface system communicates the audio signal to the Public Switched Telephone Network.
8. The method of claim 1, wherein the Internet access request is translated into a URI.
9. The method of claim 1, wherein the computer system includes personalized content information and wherein the Internet access request is generated from the personalized content information.
10. The method of claim 1, wherein the computer system further comprises a local streaming content system, the local streaming content system including a second streaming content, the second streaming content including at least a second audio portion, and wherein the method further comprises accessing the local streaming content system to provide a second audio signal corresponding to the second audio portion to the telephone.
11. The method of claim 10, further comprising mixing the audio portion with the second audio portion and wherein the audio signal corresponds to the mix.
12. The method of claim 10, wherein the computer system further comprises an audio repository, wherein the method further comprises accessing the audio content from the repository to provide to the telephone, and mixing the audio portion with the second audio portion and the audio content.
13. The method of claim 1, wherein the computer system further comprises an audio repository and wherein the method further comprises accessing the audio content from the repository to provide to the telephone.
14. The method of claim 1, further comprising receiving a command, the command corresponding to a request by the telephone user to adjust the sending of the audio portion, and adjusting the sending of the audio portion.
15. The method of claim 13, wherein the command corresponds to a pause command, and wherein the sending the audio portion is paused and wherein a pause command is sent out to the Internet site.
16. The method of claim 1, wherein the command corresponds to a pause command, and wherein the sending the audio portion is paused and wherein at least the audio portion of the streaming content is cached during the pause.
17. A computer system delivering streaming content to a telephone, the streaming content being received from an Internet streaming content server, the computer system comprising: means for receiving an Internet access request, the Internet access request corresponding to an Internet site outside of the computer system; means for receiving the streaming content from the Internet streaming content server, the streaming content including an audio portion; and means for sending at least the audio portion of the streaming to the telephone.
18. A computer system to deliver streaming content from the Internet to a telephone, the computer system comprising: an Internet interface including at least one program to receive the streaming content from the Internet and extract a streaming audio signal from the streaming content; a telephone interface to send an audio signal to the telephone, the audio signal corresponding to the streaming audio signal; and a control subsystem to control the Internet interface and the telephone interface.
19. The computer system of claim 18, further comprising an audio repository storing audio sounds, and the computer system includes a second program to cause at least some sounds to be mixed with the streaming audio signal.
20. The computer system of claim 19, wherein the second computer mixes by adding the at least some sounds into the streaming audio signal.
21. The computer system of claim 19, wherein the second computer mixes by inserting the at least some sounds into the streaming audio signal.
22. The computer system of claim 19, wherein at least one sound is an advertisement.
23. The computer system of claim 22, wherein the computer system includes personal preference information and wherein the advertisement is chosen based at least partially upon the personal preference information.
24. The computer system of claim 19, wherein at least one sound is a system prompt.
25. The computer system of claim 18, wherein the telephone interface subsystem includes a call manager, the call manager supporting multiple simultaneous telephone calls over the telephone interface, at least one of the simultaneous telephone calls receiving the streaming audio signal.
Priority Applications (1)

Application Number Priority Date Filing Date Title
AU22997/01A AU2299701A (en) 1999-10-22 2000-10-20 Streaming content over a telephone interface

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US09/426,102 US6807574B1 (en) 1999-10-22 1999-10-22 Method and apparatus for content personalization over a telephone interface
US09/426,102 1999-10-22
US09/431,002 1999-11-01
US09/431,002 US6970915B1 (en) 1999-11-01 1999-11-01 Streaming content over a telephone interface

Publications (2)

Publication Number Publication Date
WO2001030046A2 true WO2001030046A2 (en) 2001-04-26
WO2001030046A3 WO2001030046A3 (en) 2001-09-07

Family

ID=27026910

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2000/041429 WO2001030046A2 (en) 1999-10-22 2000-10-20 Streaming content over a telephone interface

Country Status (2)

Country Link
AU (1) AU2299701A (en)
WO (1) WO2001030046A2 (en)

Cited By (126)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2381697A (en) * 2001-11-01 2003-05-07 Intellprop Ltd Telecommunications services apparatus
EP1311102A1 (en) * 2001-11-08 2003-05-14 Hewlett-Packard Company Streaming audio under voice control
GB2382262A (en) * 2001-11-13 2003-05-21 Intellprop Ltd Telecommunications system with access to audio recordings
EP1329829A2 (en) * 2002-01-16 2003-07-23 mediaBeam GmbH Method for the acquisition and distribution of data provided by a web page
EP1398947A2 (en) * 2002-09-13 2004-03-17 Sharp Kabushiki Kaisha Broadcast program recording method, communication control device and mobile communication device
EP1449351A1 (en) * 2001-11-15 2004-08-25 Highwired Technologies, Inc. Method and apparatus for a mixed-media messaging delivery system
EP1564945A1 (en) * 2004-02-10 2005-08-17 Alcatel VXML streaming for a unified messaging system with telephonic user interface
GB2376421B (en) * 2001-06-12 2005-10-05 Freeline Comm Ltd Apparatus for playing a game
US7085960B2 (en) 2001-10-30 2006-08-01 Hewlett-Packard Development Company, L.P. Communication system and method
US7106837B2 (en) 2001-08-31 2006-09-12 Mitel Networks Corporation Split browser
US7274672B2 (en) 2001-10-31 2007-09-25 Hewlett-Packard Development Company, L.P. Data processing system and method
CN100440968C (en) * 2004-04-09 2008-12-03 华为技术有限公司 System and method for broadcasting flow media data on demand
US7757173B2 (en) * 2003-07-18 2010-07-13 Apple Inc. Voice menu system
US8892446B2 (en) 2010-01-18 2014-11-18 Apple Inc. Service orchestration for intelligent automated assistant
US9262612B2 (en) 2011-03-21 2016-02-16 Apple Inc. Device access using voice authentication
US9300784B2 (en) 2013-06-13 2016-03-29 Apple Inc. System and method for emergency calls initiated by voice command
US9330720B2 (en) 2008-01-03 2016-05-03 Apple Inc. Methods and apparatus for altering audio output signals
US9338493B2 (en) 2014-06-30 2016-05-10 Apple Inc. Intelligent automated assistant for TV user interactions
US9368114B2 (en) 2013-03-14 2016-06-14 Apple Inc. Context-sensitive handling of interruptions
US9430463B2 (en) 2014-05-30 2016-08-30 Apple Inc. Exemplar-based natural language processing
US9483461B2 (en) 2012-03-06 2016-11-01 Apple Inc. Handling speech synthesis of content for multiple languages
US9495129B2 (en) 2012-06-29 2016-11-15 Apple Inc. Device, method, and user interface for voice-activated navigation and browsing of a document
US9502031B2 (en) 2014-05-27 2016-11-22 Apple Inc. Method for supporting dynamic grammars in WFST-based ASR
US9576574B2 (en) 2012-09-10 2017-02-21 Apple Inc. Context-sensitive handling of interruptions by intelligent digital assistant
US9582608B2 (en) 2013-06-07 2017-02-28 Apple Inc. Unified ranking with entropy-weighted information for phrase-based semantic auto-completion
US9620104B2 (en) 2013-06-07 2017-04-11 Apple Inc. System and method for user-specified pronunciation of words for speech synthesis and recognition
US9620105B2 (en) 2014-05-15 2017-04-11 Apple Inc. Analyzing audio input for efficient speech and music recognition
US9626955B2 (en) 2008-04-05 2017-04-18 Apple Inc. Intelligent text-to-speech conversion
US9633660B2 (en) 2010-02-25 2017-04-25 Apple Inc. User profiling for voice input processing
US9633674B2 (en) 2013-06-07 2017-04-25 Apple Inc. System and method for detecting errors in interactions with a voice-based digital assistant
US9633004B2 (en) 2014-05-30 2017-04-25 Apple Inc. Better resolution when referencing to concepts
US9646609B2 (en) 2014-09-30 2017-05-09 Apple Inc. Caching apparatus for serving phonetic pronunciations
US9646614B2 (en) 2000-03-16 2017-05-09 Apple Inc. Fast, language-independent method for user authentication by voice
US9668121B2 (en) 2014-09-30 2017-05-30 Apple Inc. Social reminders
US9697820B2 (en) 2015-09-24 2017-07-04 Apple Inc. Unit-selection text-to-speech synthesis using concatenation-sensitive neural networks
US9697822B1 (en) 2013-03-15 2017-07-04 Apple Inc. System and method for updating an adaptive speech recognition model
US9711141B2 (en) 2014-12-09 2017-07-18 Apple Inc. Disambiguating heteronyms in speech synthesis
US9715875B2 (en) 2014-05-30 2017-07-25 Apple Inc. Reducing the need for manual start/end-pointing and trigger phrases
US9721566B2 (en) 2015-03-08 2017-08-01 Apple Inc. Competing devices responding to voice triggers
US9734193B2 (en) 2014-05-30 2017-08-15 Apple Inc. Determining domain salience ranking from ambiguous words in natural speech
US9760559B2 (en) 2014-05-30 2017-09-12 Apple Inc. Predictive text input
US9785630B2 (en) 2014-05-30 2017-10-10 Apple Inc. Text prediction using combined word N-gram and unigram language models
US9798393B2 (en) 2011-08-29 2017-10-24 Apple Inc. Text correction processing
US9818400B2 (en) 2014-09-11 2017-11-14 Apple Inc. Method and apparatus for discovering trending terms in speech requests
US9842105B2 (en) 2015-04-16 2017-12-12 Apple Inc. Parsimonious continuous-space phrase representations for natural language processing
US9842101B2 (en) 2014-05-30 2017-12-12 Apple Inc. Predictive conversion of language input
US9858925B2 (en) 2009-06-05 2018-01-02 Apple Inc. Using context information to facilitate processing of commands in a virtual assistant
US9865280B2 (en) 2015-03-06 2018-01-09 Apple Inc. Structured dictation using intelligent automated assistants
US9886432B2 (en) 2014-09-30 2018-02-06 Apple Inc. Parsimonious handling of word inflection via categorical stem + suffix N-gram language models
US9886953B2 (en) 2015-03-08 2018-02-06 Apple Inc. Virtual assistant activation
US9899019B2 (en) 2015-03-18 2018-02-20 Apple Inc. Systems and methods for structured stem and suffix language models
US9922642B2 (en) 2013-03-15 2018-03-20 Apple Inc. Training an at least partial voice command system
US9934775B2 (en) 2016-05-26 2018-04-03 Apple Inc. Unit-selection text-to-speech synthesis based on predicted concatenation parameters
US9953088B2 (en) 2012-05-14 2018-04-24 Apple Inc. Crowd sourcing information to fulfill user requests
US9959870B2 (en) 2008-12-11 2018-05-01 Apple Inc. Speech recognition involving a mobile device
US9966068B2 (en) 2013-06-08 2018-05-08 Apple Inc. Interpreting and acting upon commands that involve sharing information with remote devices
US9966065B2 (en) 2014-05-30 2018-05-08 Apple Inc. Multi-command single utterance input method
US9972304B2 (en) 2016-06-03 2018-05-15 Apple Inc. Privacy preserving distributed evaluation framework for embedded personalized systems
US9971774B2 (en) 2012-09-19 2018-05-15 Apple Inc. Voice-based media searching
US10043516B2 (en) 2016-09-23 2018-08-07 Apple Inc. Intelligent automated assistant
US10049663B2 (en) 2016-06-08 2018-08-14 Apple Inc. Intelligent automated assistant for media exploration
US10049668B2 (en) 2015-12-02 2018-08-14 Apple Inc. Applying neural network language models to weighted finite state transducers for automatic speech recognition
US10057736B2 (en) 2011-06-03 2018-08-21 Apple Inc. Active transport based notifications
US10067938B2 (en) 2016-06-10 2018-09-04 Apple Inc. Multilingual word prediction
US10074360B2 (en) 2014-09-30 2018-09-11 Apple Inc. Providing an indication of the suitability of speech recognition
US10078631B2 (en) 2014-05-30 2018-09-18 Apple Inc. Entropy-guided text prediction using combined word and character n-gram language models
US10079014B2 (en) 2012-06-08 2018-09-18 Apple Inc. Name recognition system
US10083688B2 (en) 2015-05-27 2018-09-25 Apple Inc. Device voice control for selecting a displayed affordance
US10089072B2 (en) 2016-06-11 2018-10-02 Apple Inc. Intelligent device arbitration and control
US10101822B2 (en) 2015-06-05 2018-10-16 Apple Inc. Language input correction
US10108612B2 (en) 2008-07-31 2018-10-23 Apple Inc. Mobile device having human language translation capability with positional feedback
US10127911B2 (en) 2014-09-30 2018-11-13 Apple Inc. Speaker identification and unsupervised speaker adaptation techniques
US10127220B2 (en) 2015-06-04 2018-11-13 Apple Inc. Language identification from short strings
US10134385B2 (en) 2012-03-02 2018-11-20 Apple Inc. Systems and methods for name pronunciation
US10170123B2 (en) 2014-05-30 2019-01-01 Apple Inc. Intelligent assistant for home automation
US10176167B2 (en) 2013-06-09 2019-01-08 Apple Inc. System and method for inferring user intent from speech inputs
US10185542B2 (en) 2013-06-09 2019-01-22 Apple Inc. Device, method, and graphical user interface for enabling conversation persistence across two or more instances of a digital assistant
US10186254B2 (en) 2015-06-07 2019-01-22 Apple Inc. Context-based endpoint detection
US10192552B2 (en) 2016-06-10 2019-01-29 Apple Inc. Digital assistant providing whispered speech
US10199051B2 (en) 2013-02-07 2019-02-05 Apple Inc. Voice trigger for a digital assistant
US10223066B2 (en) 2015-12-23 2019-03-05 Apple Inc. Proactive assistance based on dialog communication between devices
US10241644B2 (en) 2011-06-03 2019-03-26 Apple Inc. Actionable reminder entries
US10241752B2 (en) 2011-09-30 2019-03-26 Apple Inc. Interface for a virtual digital assistant
US10249300B2 (en) 2016-06-06 2019-04-02 Apple Inc. Intelligent list reading
US10255907B2 (en) 2015-06-07 2019-04-09 Apple Inc. Automatic accent detection using acoustic models
US10269345B2 (en) 2016-06-11 2019-04-23 Apple Inc. Intelligent task discovery
US10276170B2 (en) 2010-01-18 2019-04-30 Apple Inc. Intelligent automated assistant
US10283110B2 (en) 2009-07-02 2019-05-07 Apple Inc. Methods and apparatuses for automatic speech recognition
US10289433B2 (en) 2014-05-30 2019-05-14 Apple Inc. Domain specific language for encoding assistant dialog
US10297253B2 (en) 2016-06-11 2019-05-21 Apple Inc. Application integration with a digital assistant
US10318871B2 (en) 2005-09-08 2019-06-11 Apple Inc. Method and apparatus for building an intelligent automated assistant
US10356243B2 (en) 2015-06-05 2019-07-16 Apple Inc. Virtual assistant aided communication with 3rd party service in a communication session
US10354011B2 (en) 2016-06-09 2019-07-16 Apple Inc. Intelligent automated assistant in a home environment
US10366158B2 (en) 2015-09-29 2019-07-30 Apple Inc. Efficient word encoding for recurrent neural network language models
US10410637B2 (en) 2017-05-12 2019-09-10 Apple Inc. User-specific acoustic models
US10446141B2 (en) 2014-08-28 2019-10-15 Apple Inc. Automatic speech recognition based on user feedback
US10446143B2 (en) 2016-03-14 2019-10-15 Apple Inc. Identification of voice inputs providing credentials
US10482874B2 (en) 2017-05-15 2019-11-19 Apple Inc. Hierarchical belief states for digital assistants
US10490187B2 (en) 2016-06-10 2019-11-26 Apple Inc. Digital assistant providing automated status report
US10496753B2 (en) 2010-01-18 2019-12-03 Apple Inc. Automatically adapting user interfaces for hands-free interaction
US10509862B2 (en) 2016-06-10 2019-12-17 Apple Inc. Dynamic phrase expansion of language input
US10521466B2 (en) 2016-06-11 2019-12-31 Apple Inc. Data driven natural language event detection and classification
US10553209B2 (en) 2010-01-18 2020-02-04 Apple Inc. Systems and methods for hands-free notification summaries
US10552013B2 (en) 2014-12-02 2020-02-04 Apple Inc. Data detection
US10567477B2 (en) 2015-03-08 2020-02-18 Apple Inc. Virtual assistant continuity
US10568032B2 (en) 2007-04-03 2020-02-18 Apple Inc. Method and system for operating a multi-function portable electronic device using voice-activation
US10593346B2 (en) 2016-12-22 2020-03-17 Apple Inc. Rank-reduced token representation for automatic speech recognition
US10592095B2 (en) 2014-05-23 2020-03-17 Apple Inc. Instantaneous speaking of content on touch devices
US10607140B2 (en) 2010-01-25 2020-03-31 Newvaluexchange Ltd. Apparatuses, methods and systems for a digital conversation management platform
US10659851B2 (en) 2014-06-30 2020-05-19 Apple Inc. Real-time digital assistant knowledge updates
US10671428B2 (en) 2015-09-08 2020-06-02 Apple Inc. Distributed personal assistant
US10679605B2 (en) 2010-01-18 2020-06-09 Apple Inc. Hands-free list-reading by intelligent automated assistant
US10691473B2 (en) 2015-11-06 2020-06-23 Apple Inc. Intelligent automated assistant in a messaging environment
US10705794B2 (en) 2010-01-18 2020-07-07 Apple Inc. Automatically adapting user interfaces for hands-free interaction
US10706373B2 (en) 2011-06-03 2020-07-07 Apple Inc. Performing actions associated with task items that represent tasks to perform
US10733993B2 (en) 2016-06-10 2020-08-04 Apple Inc. Intelligent digital assistant in a multi-tasking environment
US10747498B2 (en) 2015-09-08 2020-08-18 Apple Inc. Zero latency digital assistant
US10755703B2 (en) 2017-05-11 2020-08-25 Apple Inc. Offline personal assistant
US10762293B2 (en) 2010-12-22 2020-09-01 Apple Inc. Using parts-of-speech tagging and named entity recognition for spelling correction
US10791176B2 (en) 2017-05-12 2020-09-29 Apple Inc. Synchronization and task delegation of a digital assistant
US10791216B2 (en) 2013-08-06 2020-09-29 Apple Inc. Auto-activating smart responses based on activities from remote devices
US10810274B2 (en) 2017-05-15 2020-10-20 Apple Inc. Optimizing dialogue policy decisions for digital assistants using implicit feedback
US11010550B2 (en) 2015-09-29 2021-05-18 Apple Inc. Unified language modeling framework for word prediction, auto-completion and auto-correction
US11025565B2 (en) 2015-06-07 2021-06-01 Apple Inc. Personalized prediction of responses for instant messaging
US11217255B2 (en) 2017-05-16 2022-01-04 Apple Inc. Far-field extension for digital assistant services
US11587559B2 (en) 2015-09-30 2023-02-21 Apple Inc. Intelligent device identification

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0841831A2 (en) * 1996-11-07 1998-05-13 AT&T Corp. Wan-based voice gateway
EP0845894A2 (en) * 1996-11-05 1998-06-03 Boston Technology Inc. A system for accessing multimedia mailboxes and messages over the internet and via telephone
EP0847179A2 (en) * 1996-12-04 1998-06-10 AT&T Corp. System and method for voiced interface with hyperlinked information
US5799063A (en) * 1996-08-15 1998-08-25 Talk Web Inc. Communication system and method of providing access to pre-recorded audio messages via the Internet
US5915001A (en) * 1996-11-14 1999-06-22 Vois Corporation System and method for providing and using universally accessible voice and speech data files

Cited By (174)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9646614B2 (en) 2000-03-16 2017-05-09 Apple Inc. Fast, language-independent method for user authentication by voice
GB2376421B (en) * 2001-06-12 2005-10-05 Freeline Comm Ltd Apparatus for playing a game
US7106837B2 (en) 2001-08-31 2006-09-12 Mitel Networks Corporation Split browser
US7085960B2 (en) 2001-10-30 2006-08-01 Hewlett-Packard Development Company, L.P. Communication system and method
US7274672B2 (en) 2001-10-31 2007-09-25 Hewlett-Packard Development Company, L.P. Data processing system and method
GB2381697A (en) * 2001-11-01 2003-05-07 Intellprop Ltd Telecommunications services apparatus
GB2381697B (en) * 2001-11-01 2004-11-17 Intellprop Ltd Telecommunications services apparatus
EP1311102A1 (en) * 2001-11-08 2003-05-14 Hewlett-Packard Company Streaming audio under voice control
GB2382262A (en) * 2001-11-13 2003-05-21 Intellprop Ltd Telecommunications system with access to audio recordings
EP1449351A1 (en) * 2001-11-15 2004-08-25 Highwired Technologies, Inc. Method and apparatus for a mixed-media messaging delivery system
US7213259B2 (en) 2001-11-15 2007-05-01 Highwired Technologies, Inc. Method and apparatus for a mixed-media messaging delivery system
EP1449351A4 (en) * 2001-11-15 2005-10-19 Highwired Technologies Inc Method and apparatus for a mixed-media messaging delivery system
EP1329829A2 (en) * 2002-01-16 2003-07-23 mediaBeam GmbH Method for the acquisition and distribution of data provided by a web page
EP1329829A3 (en) * 2002-01-16 2004-04-14 mediaBeam GmbH Method for the acquisition and distribution of data provided by a web page
EP1398947A3 (en) * 2002-09-13 2007-08-01 Sharp Kabushiki Kaisha Broadcast program recording method, communication control device and mobile communication device
EP1398947A2 (en) * 2002-09-13 2004-03-17 Sharp Kabushiki Kaisha Broadcast program recording method, communication control device and mobile communication device
US7636544B2 (en) 2002-09-13 2009-12-22 Sharp Kabushiki Kaisha Broadcast program recording method, communication control device, and mobile communication device
US7757173B2 (en) * 2003-07-18 2010-07-13 Apple Inc. Voice menu system
EP1564945A1 (en) * 2004-02-10 2005-08-17 Alcatel VXML streaming for a unified messaging system with telephonic user interface
CN100440968C (en) * 2004-04-09 2008-12-03 华为技术有限公司 System and method for broadcasting flow media data on demand
US10318871B2 (en) 2005-09-08 2019-06-11 Apple Inc. Method and apparatus for building an intelligent automated assistant
US8930191B2 (en) 2006-09-08 2015-01-06 Apple Inc. Paraphrasing of user requests and results by automated digital assistant
US8942986B2 (en) 2006-09-08 2015-01-27 Apple Inc. Determining user intent based on ontologies of domains
US9117447B2 (en) 2006-09-08 2015-08-25 Apple Inc. Using event alert text as input to an automated assistant
US10568032B2 (en) 2007-04-03 2020-02-18 Apple Inc. Method and system for operating a multi-function portable electronic device using voice-activation
US10381016B2 (en) 2008-01-03 2019-08-13 Apple Inc. Methods and apparatus for altering audio output signals
US9330720B2 (en) 2008-01-03 2016-05-03 Apple Inc. Methods and apparatus for altering audio output signals
US9626955B2 (en) 2008-04-05 2017-04-18 Apple Inc. Intelligent text-to-speech conversion
US9865248B2 (en) 2008-04-05 2018-01-09 Apple Inc. Intelligent text-to-speech conversion
US10108612B2 (en) 2008-07-31 2018-10-23 Apple Inc. Mobile device having human language translation capability with positional feedback
US9959870B2 (en) 2008-12-11 2018-05-01 Apple Inc. Speech recognition involving a mobile device
US11080012B2 (en) 2009-06-05 2021-08-03 Apple Inc. Interface for a virtual digital assistant
US10795541B2 (en) 2009-06-05 2020-10-06 Apple Inc. Intelligent organization of tasks items
US10475446B2 (en) 2009-06-05 2019-11-12 Apple Inc. Using context information to facilitate processing of commands in a virtual assistant
US9858925B2 (en) 2009-06-05 2018-01-02 Apple Inc. Using context information to facilitate processing of commands in a virtual assistant
US10283110B2 (en) 2009-07-02 2019-05-07 Apple Inc. Methods and apparatuses for automatic speech recognition
US10679605B2 (en) 2010-01-18 2020-06-09 Apple Inc. Hands-free list-reading by intelligent automated assistant
US10496753B2 (en) 2010-01-18 2019-12-03 Apple Inc. Automatically adapting user interfaces for hands-free interaction
US11423886B2 (en) 2010-01-18 2022-08-23 Apple Inc. Task flow identification based on user intent
US10276170B2 (en) 2010-01-18 2019-04-30 Apple Inc. Intelligent automated assistant
US9318108B2 (en) 2010-01-18 2016-04-19 Apple Inc. Intelligent automated assistant
US10706841B2 (en) 2010-01-18 2020-07-07 Apple Inc. Task flow identification based on user intent
US10705794B2 (en) 2010-01-18 2020-07-07 Apple Inc. Automatically adapting user interfaces for hands-free interaction
US10553209B2 (en) 2010-01-18 2020-02-04 Apple Inc. Systems and methods for hands-free notification summaries
US8892446B2 (en) 2010-01-18 2014-11-18 Apple Inc. Service orchestration for intelligent automated assistant
US9548050B2 (en) 2010-01-18 2017-01-17 Apple Inc. Intelligent automated assistant
US8903716B2 (en) 2010-01-18 2014-12-02 Apple Inc. Personalized vocabulary for digital assistant
US10607140B2 (en) 2010-01-25 2020-03-31 Newvaluexchange Ltd. Apparatuses, methods and systems for a digital conversation management platform
US11410053B2 (en) 2010-01-25 2022-08-09 Newvaluexchange Ltd. Apparatuses, methods and systems for a digital conversation management platform
US10607141B2 (en) 2010-01-25 2020-03-31 Newvaluexchange Ltd. Apparatuses, methods and systems for a digital conversation management platform
US10984327B2 (en) 2010-01-25 2021-04-20 Newvaluexchange Ltd. Apparatuses, methods and systems for a digital conversation management platform
US10984326B2 (en) 2010-01-25 2021-04-20 Newvaluexchange Ltd. Apparatuses, methods and systems for a digital conversation management platform
US9633660B2 (en) 2010-02-25 2017-04-25 Apple Inc. User profiling for voice input processing
US10049675B2 (en) 2010-02-25 2018-08-14 Apple Inc. User profiling for voice input processing
US10762293B2 (en) 2010-12-22 2020-09-01 Apple Inc. Using parts-of-speech tagging and named entity recognition for spelling correction
US10102359B2 (en) 2011-03-21 2018-10-16 Apple Inc. Device access using voice authentication
US9262612B2 (en) 2011-03-21 2016-02-16 Apple Inc. Device access using voice authentication
US10706373B2 (en) 2011-06-03 2020-07-07 Apple Inc. Performing actions associated with task items that represent tasks to perform
US10241644B2 (en) 2011-06-03 2019-03-26 Apple Inc. Actionable reminder entries
US10057736B2 (en) 2011-06-03 2018-08-21 Apple Inc. Active transport based notifications
US11120372B2 (en) 2011-06-03 2021-09-14 Apple Inc. Performing actions associated with task items that represent tasks to perform
US9798393B2 (en) 2011-08-29 2017-10-24 Apple Inc. Text correction processing
US10241752B2 (en) 2011-09-30 2019-03-26 Apple Inc. Interface for a virtual digital assistant
US10134385B2 (en) 2012-03-02 2018-11-20 Apple Inc. Systems and methods for name pronunciation
US9483461B2 (en) 2012-03-06 2016-11-01 Apple Inc. Handling speech synthesis of content for multiple languages
US9953088B2 (en) 2012-05-14 2018-04-24 Apple Inc. Crowd sourcing information to fulfill user requests
US10079014B2 (en) 2012-06-08 2018-09-18 Apple Inc. Name recognition system
US9495129B2 (en) 2012-06-29 2016-11-15 Apple Inc. Device, method, and user interface for voice-activated navigation and browsing of a document
US9576574B2 (en) 2012-09-10 2017-02-21 Apple Inc. Context-sensitive handling of interruptions by intelligent digital assistant
US9971774B2 (en) 2012-09-19 2018-05-15 Apple Inc. Voice-based media searching
US10199051B2 (en) 2013-02-07 2019-02-05 Apple Inc. Voice trigger for a digital assistant
US10978090B2 (en) 2013-02-07 2021-04-13 Apple Inc. Voice trigger for a digital assistant
US9368114B2 (en) 2013-03-14 2016-06-14 Apple Inc. Context-sensitive handling of interruptions
US9922642B2 (en) 2013-03-15 2018-03-20 Apple Inc. Training an at least partial voice command system
US9697822B1 (en) 2013-03-15 2017-07-04 Apple Inc. System and method for updating an adaptive speech recognition model
US9620104B2 (en) 2013-06-07 2017-04-11 Apple Inc. System and method for user-specified pronunciation of words for speech synthesis and recognition
US9582608B2 (en) 2013-06-07 2017-02-28 Apple Inc. Unified ranking with entropy-weighted information for phrase-based semantic auto-completion
US9966060B2 (en) 2013-06-07 2018-05-08 Apple Inc. System and method for user-specified pronunciation of words for speech synthesis and recognition
US9633674B2 (en) 2013-06-07 2017-04-25 Apple Inc. System and method for detecting errors in interactions with a voice-based digital assistant
US10657961B2 (en) 2013-06-08 2020-05-19 Apple Inc. Interpreting and acting upon commands that involve sharing information with remote devices
US9966068B2 (en) 2013-06-08 2018-05-08 Apple Inc. Interpreting and acting upon commands that involve sharing information with remote devices
US10176167B2 (en) 2013-06-09 2019-01-08 Apple Inc. System and method for inferring user intent from speech inputs
US10185542B2 (en) 2013-06-09 2019-01-22 Apple Inc. Device, method, and graphical user interface for enabling conversation persistence across two or more instances of a digital assistant
US9300784B2 (en) 2013-06-13 2016-03-29 Apple Inc. System and method for emergency calls initiated by voice command
US10791216B2 (en) 2013-08-06 2020-09-29 Apple Inc. Auto-activating smart responses based on activities from remote devices
US9620105B2 (en) 2014-05-15 2017-04-11 Apple Inc. Analyzing audio input for efficient speech and music recognition
US10592095B2 (en) 2014-05-23 2020-03-17 Apple Inc. Instantaneous speaking of content on touch devices
US9502031B2 (en) 2014-05-27 2016-11-22 Apple Inc. Method for supporting dynamic grammars in WFST-based ASR
US9760559B2 (en) 2014-05-30 2017-09-12 Apple Inc. Predictive text input
US9430463B2 (en) 2014-05-30 2016-08-30 Apple Inc. Exemplar-based natural language processing
US10497365B2 (en) 2014-05-30 2019-12-03 Apple Inc. Multi-command single utterance input method
US11257504B2 (en) 2014-05-30 2022-02-22 Apple Inc. Intelligent assistant for home automation
US11133008B2 (en) 2014-05-30 2021-09-28 Apple Inc. Reducing the need for manual start/end-pointing and trigger phrases
US10170123B2 (en) 2014-05-30 2019-01-01 Apple Inc. Intelligent assistant for home automation
US10169329B2 (en) 2014-05-30 2019-01-01 Apple Inc. Exemplar-based natural language processing
US9785630B2 (en) 2014-05-30 2017-10-10 Apple Inc. Text prediction using combined word N-gram and unigram language models
US10083690B2 (en) 2014-05-30 2018-09-25 Apple Inc. Better resolution when referencing to concepts
US9633004B2 (en) 2014-05-30 2017-04-25 Apple Inc. Better resolution when referencing to concepts
US10078631B2 (en) 2014-05-30 2018-09-18 Apple Inc. Entropy-guided text prediction using combined word and character n-gram language models
US10289433B2 (en) 2014-05-30 2019-05-14 Apple Inc. Domain specific language for encoding assistant dialog
US9966065B2 (en) 2014-05-30 2018-05-08 Apple Inc. Multi-command single utterance input method
US9842101B2 (en) 2014-05-30 2017-12-12 Apple Inc. Predictive conversion of language input
US9715875B2 (en) 2014-05-30 2017-07-25 Apple Inc. Reducing the need for manual start/end-pointing and trigger phrases
US9734193B2 (en) 2014-05-30 2017-08-15 Apple Inc. Determining domain salience ranking from ambiguous words in natural speech
US9338493B2 (en) 2014-06-30 2016-05-10 Apple Inc. Intelligent automated assistant for TV user interactions
US10904611B2 (en) 2014-06-30 2021-01-26 Apple Inc. Intelligent automated assistant for TV user interactions
US10659851B2 (en) 2014-06-30 2020-05-19 Apple Inc. Real-time digital assistant knowledge updates
US9668024B2 (en) 2014-06-30 2017-05-30 Apple Inc. Intelligent automated assistant for TV user interactions
US10446141B2 (en) 2014-08-28 2019-10-15 Apple Inc. Automatic speech recognition based on user feedback
US9818400B2 (en) 2014-09-11 2017-11-14 Apple Inc. Method and apparatus for discovering trending terms in speech requests
US10431204B2 (en) 2014-09-11 2019-10-01 Apple Inc. Method and apparatus for discovering trending terms in speech requests
US10074360B2 (en) 2014-09-30 2018-09-11 Apple Inc. Providing an indication of the suitability of speech recognition
US9886432B2 (en) 2014-09-30 2018-02-06 Apple Inc. Parsimonious handling of word inflection via categorical stem + suffix N-gram language models
US9668121B2 (en) 2014-09-30 2017-05-30 Apple Inc. Social reminders
US9646609B2 (en) 2014-09-30 2017-05-09 Apple Inc. Caching apparatus for serving phonetic pronunciations
US10127911B2 (en) 2014-09-30 2018-11-13 Apple Inc. Speaker identification and unsupervised speaker adaptation techniques
US9986419B2 (en) 2014-09-30 2018-05-29 Apple Inc. Social reminders
US10552013B2 (en) 2014-12-02 2020-02-04 Apple Inc. Data detection
US11556230B2 (en) 2014-12-02 2023-01-17 Apple Inc. Data detection
US9711141B2 (en) 2014-12-09 2017-07-18 Apple Inc. Disambiguating heteronyms in speech synthesis
US9865280B2 (en) 2015-03-06 2018-01-09 Apple Inc. Structured dictation using intelligent automated assistants
US10567477B2 (en) 2015-03-08 2020-02-18 Apple Inc. Virtual assistant continuity
US9721566B2 (en) 2015-03-08 2017-08-01 Apple Inc. Competing devices responding to voice triggers
US9886953B2 (en) 2015-03-08 2018-02-06 Apple Inc. Virtual assistant activation
US10311871B2 (en) 2015-03-08 2019-06-04 Apple Inc. Competing devices responding to voice triggers
US11087759B2 (en) 2015-03-08 2021-08-10 Apple Inc. Virtual assistant activation
US9899019B2 (en) 2015-03-18 2018-02-20 Apple Inc. Systems and methods for structured stem and suffix language models
US9842105B2 (en) 2015-04-16 2017-12-12 Apple Inc. Parsimonious continuous-space phrase representations for natural language processing
US10083688B2 (en) 2015-05-27 2018-09-25 Apple Inc. Device voice control for selecting a displayed affordance
US10127220B2 (en) 2015-06-04 2018-11-13 Apple Inc. Language identification from short strings
US10356243B2 (en) 2015-06-05 2019-07-16 Apple Inc. Virtual assistant aided communication with 3rd party service in a communication session
US10101822B2 (en) 2015-06-05 2018-10-16 Apple Inc. Language input correction
US10255907B2 (en) 2015-06-07 2019-04-09 Apple Inc. Automatic accent detection using acoustic models
US10186254B2 (en) 2015-06-07 2019-01-22 Apple Inc. Context-based endpoint detection
US11025565B2 (en) 2015-06-07 2021-06-01 Apple Inc. Personalized prediction of responses for instant messaging
US10671428B2 (en) 2015-09-08 2020-06-02 Apple Inc. Distributed personal assistant
US11500672B2 (en) 2015-09-08 2022-11-15 Apple Inc. Distributed personal assistant
US10747498B2 (en) 2015-09-08 2020-08-18 Apple Inc. Zero latency digital assistant
US9697820B2 (en) 2015-09-24 2017-07-04 Apple Inc. Unit-selection text-to-speech synthesis using concatenation-sensitive neural networks
US11010550B2 (en) 2015-09-29 2021-05-18 Apple Inc. Unified language modeling framework for word prediction, auto-completion and auto-correction
US10366158B2 (en) 2015-09-29 2019-07-30 Apple Inc. Efficient word encoding for recurrent neural network language models
US11587559B2 (en) 2015-09-30 2023-02-21 Apple Inc. Intelligent device identification
US10691473B2 (en) 2015-11-06 2020-06-23 Apple Inc. Intelligent automated assistant in a messaging environment
US11526368B2 (en) 2015-11-06 2022-12-13 Apple Inc. Intelligent automated assistant in a messaging environment
US10049668B2 (en) 2015-12-02 2018-08-14 Apple Inc. Applying neural network language models to weighted finite state transducers for automatic speech recognition
US10223066B2 (en) 2015-12-23 2019-03-05 Apple Inc. Proactive assistance based on dialog communication between devices
US10446143B2 (en) 2016-03-14 2019-10-15 Apple Inc. Identification of voice inputs providing credentials
US9934775B2 (en) 2016-05-26 2018-04-03 Apple Inc. Unit-selection text-to-speech synthesis based on predicted concatenation parameters
US9972304B2 (en) 2016-06-03 2018-05-15 Apple Inc. Privacy preserving distributed evaluation framework for embedded personalized systems
US10249300B2 (en) 2016-06-06 2019-04-02 Apple Inc. Intelligent list reading
US11069347B2 (en) 2016-06-08 2021-07-20 Apple Inc. Intelligent automated assistant for media exploration
US10049663B2 (en) 2016-06-08 2018-08-14 Apple Inc. Intelligent automated assistant for media exploration
US10354011B2 (en) 2016-06-09 2019-07-16 Apple Inc. Intelligent automated assistant in a home environment
US10490187B2 (en) 2016-06-10 2019-11-26 Apple Inc. Digital assistant providing automated status report
US10067938B2 (en) 2016-06-10 2018-09-04 Apple Inc. Multilingual word prediction
US10192552B2 (en) 2016-06-10 2019-01-29 Apple Inc. Digital assistant providing whispered speech
US10509862B2 (en) 2016-06-10 2019-12-17 Apple Inc. Dynamic phrase expansion of language input
US10733993B2 (en) 2016-06-10 2020-08-04 Apple Inc. Intelligent digital assistant in a multi-tasking environment
US11037565B2 (en) 2016-06-10 2021-06-15 Apple Inc. Intelligent digital assistant in a multi-tasking environment
US10297253B2 (en) 2016-06-11 2019-05-21 Apple Inc. Application integration with a digital assistant
US10269345B2 (en) 2016-06-11 2019-04-23 Apple Inc. Intelligent task discovery
US10521466B2 (en) 2016-06-11 2019-12-31 Apple Inc. Data driven natural language event detection and classification
US10089072B2 (en) 2016-06-11 2018-10-02 Apple Inc. Intelligent device arbitration and control
US11152002B2 (en) 2016-06-11 2021-10-19 Apple Inc. Application integration with a digital assistant
US10553215B2 (en) 2016-09-23 2020-02-04 Apple Inc. Intelligent automated assistant
US10043516B2 (en) 2016-09-23 2018-08-07 Apple Inc. Intelligent automated assistant
US10593346B2 (en) 2016-12-22 2020-03-17 Apple Inc. Rank-reduced token representation for automatic speech recognition
US10755703B2 (en) 2017-05-11 2020-08-25 Apple Inc. Offline personal assistant
US11405466B2 (en) 2017-05-12 2022-08-02 Apple Inc. Synchronization and task delegation of a digital assistant
US10791176B2 (en) 2017-05-12 2020-09-29 Apple Inc. Synchronization and task delegation of a digital assistant
US10410637B2 (en) 2017-05-12 2019-09-10 Apple Inc. User-specific acoustic models
US10810274B2 (en) 2017-05-15 2020-10-20 Apple Inc. Optimizing dialogue policy decisions for digital assistants using implicit feedback
US10482874B2 (en) 2017-05-15 2019-11-19 Apple Inc. Hierarchical belief states for digital assistants
US11217255B2 (en) 2017-05-16 2022-01-04 Apple Inc. Far-field extension for digital assistant services

Also Published As

Publication number Publication date
WO2001030046A3 (en) 2001-09-07
AU2299701A (en) 2001-04-30

Similar Documents

Publication Publication Date Title
US6970915B1 (en) Streaming content over a telephone interface
US6842767B1 (en) Method and apparatus for content personalization over a telephone interface with adaptive personalization
WO2001030046A2 (en) Streaming content over a telephone interface
US7376586B1 (en) Method and apparatus for electronic commerce using a telephone interface
US8271331B2 (en) Integrated, interactive telephone and computer network communications system
US7908383B2 (en) Method and apparatus for phone application state management mechanism
US8938060B2 (en) Technique for effectively providing personalized communications and information assistance services
US7457397B1 (en) Voice page directory system in a voice page creation and delivery system
US6895084B1 (en) System and method for generating voice pages with included audio files for use in a voice page delivery system
US5850433A (en) System and method for providing an on-line directory service
US7933389B2 (en) System and method generating voice sites
US8023622B2 (en) Technique for call context based advertising through an information assistance service
US7447299B1 (en) Voice and telephone keypad based data entry for interacting with voice information services
US20010012335A1 (en) Preference based telecommunication information service
US20040140989A1 (en) Content subscription and delivery service
US20120077472A1 (en) Technique for effectively providing a personalized information assistance service
US11153425B2 (en) System and method for providing interactive services
US20040047453A1 (en) Variable automated response system
US11232461B2 (en) System and method for causing messages to be delivered to users of a distributed voice application execution system
US20220247863A1 (en) System and method for placing telephone calls using a distributed voice application execution system architecture
US7941481B1 (en) Updating an electronic phonebook over electronic communication networks
US8488767B2 (en) Technique for selective presentation of information in response to a request for information assistance service
US20050025293A1 (en) Calling card access to internet portal using interactive voice response (IVR) system
CA2453499A1 (en) Technique for effectively providing personalized communications and information assistance services

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 EP: the EPO has been informed by WIPO that EP was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101)
AK Designated states

Kind code of ref document: A3

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

32PN EP: public notification in the EP bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 69(1)EPC DATED 08.07.02

122 EP: PCT application non-entry in European phase
NENP Non-entry into the national phase

Ref country code: JP