WO1997040611A1 - Method and apparatus for information retrieval using audio interface - Google Patents
Method and apparatus for information retrieval using audio interface
- Publication number
- WO1997040611A1 (PCT/US1997/003690)
- Authority
- WO
- WIPO (PCT)
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
- G10L13/027—Concept to speech synthesisers; Generation of natural phrases from machine-based concepts
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L9/00—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
- H04L9/40—Network security protocols
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/42—Systems providing special services or facilities to subscribers
- H04M3/487—Arrangements for providing information services, e.g. recorded voice services or time announcements
- H04M3/493—Interactive information services, e.g. directory enquiries ; Arrangements therefor, e.g. interactive voice response [IVR] systems or voice portals
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/42—Systems providing special services or facilities to subscribers
- H04M3/487—Arrangements for providing information services, e.g. recorded voice services or time announcements
- H04M3/493—Interactive information services, e.g. directory enquiries ; Arrangements therefor, e.g. interactive voice response [IVR] systems or voice portals
- H04M3/4938—Interactive information services, e.g. directory enquiries ; Arrangements therefor, e.g. interactive voice response [IVR] systems or voice portals comprising a voice browser which renders and interprets, e.g. VoiceXML
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M2201/00—Electronic components, circuits, software, systems or apparatus used in telephone systems
- H04M2201/40—Electronic components, circuits, software, systems or apparatus used in telephone systems using speech recognition
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M2201/00—Electronic components, circuits, software, systems or apparatus used in telephone systems
- H04M2201/60—Medium conversion
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M2207/00—Type of exchange or network, i.e. telephonic medium, in which the telephonic communication takes place
- H04M2207/40—Type of exchange or network, i.e. telephonic medium, in which the telephonic communication takes place terminals with audio html browser
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M7/00—Arrangements for interconnection between switching centres
- H04M7/006—Networks other than PSTN/ISDN providing telephone service, e.g. Voice over Internet Protocol (VoIP), including next generation networks with a packet-switched transport layer
Definitions
- the present invention relates to information retrieval in general. More particularly, the present invention relates to information retrieval over a network utilizing an audio user interface.
- the amount of information available over communication networks is large and growing at a fast rate.
- the most popular of such networks is the Internet, which is a network of linked computers around the world.
- Much of the popularity of the Internet may be attributed to the World Wide Web (WWW) portion of the Internet.
- the WWW is a portion of the Internet in which information is typically passed between server computers and client computers using the Hypertext Transfer Protocol (HTTP).
- a server stores information and serves (i.e. sends) the information to a client in response to a request from the client.
- the clients execute computer software programs, often called browsers, which aid in the requesting and displaying of information. Examples of WWW browsers are Netscape Navigator, available from Netscape Communications, Inc., and the Internet Explorer, available from Microsoft Corp.
- URLs are described in detail in Berners-Lee, T., et al., Uniform Resource Locators, RFC 1738, Network Working Group, 1994, which is incorporated herein by reference. For example, the URL http://www.hostname.com/document1.html identifies the document document1.html on the host www.hostname.com.
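As an illustrative aside (not part of the original disclosure), the way such a URL decomposes into a scheme, a host, and a document path can be sketched with Python's standard library:

```python
from urllib.parse import urlsplit

# Split the example URL into its scheme, host, and document path.
parts = urlsplit("http://www.hostname.com/document1.html")

scheme = parts.scheme      # protocol used to request the document
host = parts.hostname      # server that stores the document
path = parts.path          # document identifier on that server
```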
- HTML documents contain information which is used by the browser to display information to a user at a computer display screen.
- An HTML document may contain text, logical structure commands, hypertext links, and user input commands. If the user selects (for example by a mouse click) a hypertext link from the display, the browser will request another document from a server.
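A minimal sketch, using Python's standard html.parser (illustrative only; no particular browser implementation is implied), of how a browser might extract the hypertext links that trigger requests for further documents:

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect the href targets of <a> anchor tags in an HTML document."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Anchor tags carry the hypertext links a user may select.
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)

extractor = LinkExtractor()
extractor.feed('<p>Welcome. See the <a href="/document2.html">next page</a>.</p>')
```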
- HTML browsers are based upon textual and graphical user interfaces.
- documents are presented as images on a computer screen.
- images include, for example, text, graphics, hypertext links, and user input dialog boxes.
- All user interaction with the WWW is through a graphical user interface.
- although audio data is capable of being received and played back at a user computer (e.g. a .wav or .au file), such receipt of audio data is secondary to the graphical interface of the WWW.
- audio data may be sent as a result of a user request, but there is no means for a user to interact with the WWW using an audio interface.
- the present invention provides a method and apparatus for retrieving information from a document server using an audio interface device (e.g. a telephone).
- An interpreter is provided which receives documents from a document server operating in accordance with a document serving protocol.
- the interpreter interprets the document into audio data which is provided to the audio user interface.
- the interpreter also receives audio user input from the audio interface device.
- the interpreter interprets the audio user input into user data which is appropriate to be sent to the document server in accordance with the document serving protocol and provides the user data to the server.
- the interpreter may be located within the audio user interface, within the document server, or disposed in a communication channel between the audio user interface and the document server.
- a telecommunications network node for carrying out the audio browsing functions of the present invention is included as a node in a telecommunications network, such as a long distance telephone network.
- An audio channel is established between the audio interface device and the node.
- a document serving protocol channel is established between the node and the document server.
- the node receives documents served by the document server in accordance with the document serving protocol and interprets the documents into audio data appropriate for the audio user interface.
- the node then sends the audio data to the audio interface device via the audio channel.
- the node also receives audio user input (e.g. DTMF tones or speech) from the audio interface device and interprets the audio user input into user data appropriate for the document server. The node then sends the user data to the document server in accordance with the document serving protocol.
- the document server is a World Wide Web document server which communicates with clients via the hypertext transfer protocol.
- a user can engage in an audio browsing session with a World Wide Web document server via an audio interface
- the World Wide Web document server can treat such a browsing session in a conventional manner and does not need to know whether the particular browsing session is being initiated from a client executing a conventional graphical browser or from an audio interface device.
- the necessary interpreting functions are carried out in the telecommunications network node and these functions are transparent to both a user using the audio interface device and the World Wide Web document server operating in accordance with the hypertext transfer protocol.
- Fig. 1 shows a diagram of a telecommunications system which is suitable to practice the present invention
- Fig. 2 is a block diagram of the components of the audio processing node.
- Fig. 3 is a block diagram of the components of the audio interpreter node.
- Fig. 4 is a block diagram of a document server.
- Fig. 5 is an example audio-HTML document.
- Fig. 6 is an example HTML document.
- Fig. 7 is a block diagram of an embodiment in which the audio browsing functions are implemented at a user interface device.
- Fig. 8 is a block diagram of the components of the user interface device of Fig. 7.
- Fig. 9 is a block diagram of an embodiment in which the audio browsing functions are implemented at an audio browsing document server.
- Fig. 10 is a block diagram of the components of the audio browsing document server of Fig. 9.
- Fig. 11 is a block diagram of an embodiment in which the audio interpreting functions are implemented at an audio interpreter document server.
- Fig. 12 is a block diagram of the components of the audio interpreter document server of Fig. 11.
- Fig. 1 shows a diagram of a telecommunications system 100 which is suitable to practice the present invention.
- An audio interface device such as telephone 110 is connected to a local exchange carrier (LEC) 120. Audio interface devices other than a telephone may also be used. For example, the audio interface device could be a multimedia computer having telephony capabilities.
- a user of telephone 110 places a telephone call to a telephone number associated with information provided by a document server, such as document server 160.
- the document server 160 is part of communication network 162.
- network 162 is the Internet.
- Telephone numbers associated with information accessible through a document server, such as document server 160, are set up so that they are routed to special telecommunication network nodes, such as audio browsing adjunct 150.
- the audio browsing adjunct 150 is a node in telecommunications network 102 which is a long distance telephone network.
- the call is routed to the LEC 120, which further routes the call to a long distance carrier switch 130 via trunk 125.
- Long distance network 102 would generally have other switches similar to switch 130 for routing calls. However, only one switch is shown in Fig. 1 for clarity.
- switch 130 in the telecommunications network 102 is an "intelligent" switch, in that it contains (or is connected to) a processing unit 131 which may be programmed to carry out various functions. Such use of processing units in telecommunications network switches, and the programming thereof, is well known in the art.
- upon receipt of the call at switch 130, the call is then routed to the audio browsing adjunct 150.
- the routing of calls through a telecommunications network is well known in the art and will not be described further herein.
- audio browsing services in accordance with the present invention are provided only to users who are subscribers to an audio browsing service provided by the telecommunication network 102 service provider.
- a database 140 connected to switch 130 contains a list of such subscribers.
- Switch 130 performs a database 140 lookup to determine if the call originated from a subscriber to the service.
- One way to accomplish this is to store a list of calling telephone numbers (ANI) in database 140.
- the LEC 120 provides switch 130 with the ANI of the telephone 110.
- the switch 130 performs a database 140 lookup to determine if the ANI is included in the list of subscribers to the audio browsing service stored in database 140. If the ANI is present in that list, then the switch 130 routes the call to the audio browsing adjunct 150 in accordance with the present invention. If the ANI does not belong to a subscriber to the audio browsing service, then an appropriate message may be sent to telephone 110.
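The ANI screening performed by switch 130 amounts to a set-membership lookup against the subscriber list in database 140; a minimal sketch (the telephone numbers are hypothetical, illustrative only):

```python
# Hypothetical subscriber ANIs, as stored in database 140.
SUBSCRIBERS = {"1234567890", "9085550100"}

def route_call(ani):
    """Route a call to the audio browsing adjunct only for subscribers."""
    if ani in SUBSCRIBERS:
        return "audio_browsing_adjunct"
    # Non-subscribers instead receive an appropriate announcement.
    return "non_subscriber_announcement"
```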
- the audio browsing adjunct 150 contains an audio processing node 152 and an audio interpreter node 154, both of which will be described in further detail below.
- the audio browsing adjunct 150 provides the audio browsing functionality in accordance with the present invention.
- upon receipt of the call from telephone 110, the audio browsing adjunct 150 establishes a communication channel with the document server 160 associated with the called telephone number via link 164.
- link 164 is a socket connection over TCP/IP, the establishment of which is well known in the art.
- Audio browsing adjunct 150 and the document server 160 communicate with each other using a document serving protocol.
- a document serving protocol is a communication protocol for the transfer of information between a client and a server.
- a client requests information from a server by sending a request to the server and the server responds to the request by sending a document containing the requested information to the client.
- a document serving protocol channel is established between the audio browsing adjunct 150 and the document server 160 via link 164.
- the document serving protocol is the Hypertext Transfer Protocol (HTTP). This protocol is well known in the art of WWW communication and is described in detail in Berners-Lee, T. and Connolly, D., Hypertext Transfer Protocol (HTTP), Working Draft of the Internet Engineering Task Force, 1993, which is incorporated herein by reference.
- the audio browsing adjunct 150 communicates with the document server 160 using the HTTP protocol.
- the document server 160 behaves as if it were communicating with any conventional WWW client executing a conventional graphical browser.
- the document server 160 serves documents to the audio browsing adjunct 150 in response to requests it receives over link 164.
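The requests sent over link 164 are ordinary HTTP requests; a sketch of how an HTTP/1.0-era GET request for a URL might be assembled (illustrative only, not the patent's implementation):

```python
from urllib.parse import urlsplit

def build_http_get(url):
    """Build the text of a simple HTTP/1.0 GET request for the given URL."""
    parts = urlsplit(url)
    path = parts.path or "/"
    return (
        f"GET {path} HTTP/1.0\r\n"
        f"Host: {parts.hostname}\r\n"
        f"\r\n"
    )

request = build_http_get("http://www.att.com/~phone/greeting")
```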
- a document is a collection of information.
- the document may be a static document in that the document is pre-defined at the server 160 and all requests for that document result in the same information being served.
- the document could be a dynamic document, whereby the information which is served in response to a request is dynamically generated at the time the request is made.
- dynamic documents are generated by scripts, which are programs executed by the server 160 in response to a request for information.
- a URL may be associated with a script.
- when the server 160 receives a request including that URL, the server 160 will execute the script to generate a dynamic document, and will serve the dynamically generated document to the client which requested the information.
- the use of scripts to dynamically generate documents is well known in the art.
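A dynamic document is simply a document assembled at the time of the request; a minimal sketch of such a script (illustrative; the patent does not specify any script's contents):

```python
import time

def greeting_script():
    """Generate a document at request time, as a server-side script would."""
    timestamp = time.strftime("%H:%M")
    return (
        "<HTML><BODY>"
        f"Hello! The current time is {timestamp}."
        "</BODY></HTML>"
    )

document = greeting_script()
```

Each request executes the script anew, so the served information can differ from one request to the next.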
- the documents served by server 160 include text, logical structure commands, hypertext links, and user input commands.
- One characteristic of these documents is that the physical structure of the information contained in the document (i.e., the physical layout view of the information when displayed at a client executing a conventional graphics browser), is not defined.
- a document contains logical structure commands, which are inte ⁇ reted at a browser to define a physical layout.
- logical structure commands include emphasis commands, new paragraph commands, etc.
- the syntactic structure of such commands may conform to the conventions of a more general purpose document structuring language, such as Standard Generalized Markup Language (SGML), which is described in Goldfarb, Charles, The SGML Handbook, Clarendon Press, 1990, which is incorporated by reference herein.
- these documents are Hypertext Markup Language (HTML) documents.
- HTML is a well known language based on SGML which is used to define documents which are served by WWW servers. HTML is described in detail in Berners-Lee, T. and Connolly, D., Hypertext Markup Language (HTML), Working Draft of the Internet Engineering Task Force, 1993, which is incorporated herein by reference.
- when an HTML document is received by a client executing a conventional browser, the browser interprets the HTML document into an image and displays the image upon a computer display screen.
- upon receipt of a document from document server 160, the audio browsing adjunct 150 converts the document into audio data. The details of such conversion will be discussed in further detail below.
- the audio data is then sent to telephone 110 via switch 130 and LEC 120.
- the user of telephone 110 can access information from document server 160 via an audio interface.
- the user can send audio user input from the telephone 110 back to the audio browsing adjunct 150.
- This audio user input may be, for example, speech signals or DTMF tones.
- the audio browsing adjunct 150 converts the audio user input into user data or instructions which are appropriate for transmitting to the document server 160 via link 164 in accordance with the HTTP protocol.
- the user data or instructions are then sent to the document server 160 via the document serving protocol channel.
- user interaction with the document server is via an audio user interface. In this manner, a user can engage in a browsing session with a WWW document server via an audio interface.
- the document server can treat such a browsing session in a conventional manner and does not need to know whether a particular browsing session is being initiated from a client executing a conventional graphical browser or from an audio interface such as a telephone.
- the audio browsing adjunct 150 within the network 102 interprets the documents being served by document server 160 into audio data appropriate to be sent to telephone 110.
- the audio browsing adjunct 150 interprets audio user input received at telephone 110 into user data appropriate to be received by the document server 160.
- a more detailed description of an advantageous embodiment will now be given in conjunction with an example browsing session. Assume a user at telephone 110 dials the number (123) 456-7890, which has been set up to be associated with information accessible through document server 160 and therefore routed to audio browsing adjunct 150.
- the call gets routed to LEC 120, at which point LEC 120 recognizes the telephone number as one which is to be routed to long distance network 102, and more particularly to switch 130.
- upon receipt of the call, switch 130 in turn routes the call to the audio browsing adjunct 150 via link 132. Thus, there is established an audio channel between telephone 110 and audio browsing adjunct 150.
- the audio processing node 152 comprises a telephone network interface module 210, a DTMF decoder/generator 212, a speech recognition module 214, a text to speech module 216, and an audio play/record module 218, each of which is connected to an audio bus 220.
- Telephone numbers are used herein for example purposes only. There is no significance to the use of any particular telephone number other than for exemplification of the present invention. No reference to actual telephone numbers is intended.
- the audio processing node 152 contains a central processing unit 224, memory unit 228, and a packet network interface 230, each of which is connected to the control/data bus 222.
- the overall functioning of the audio processing node 152 is controlled by the central processing unit 224.
- Central processing unit 224 operates under control of executed computer program instructions 232 which are stored in memory unit 228.
- Memory unit 228 may be any type of machine readable storage device.
- memory unit 228 may be a random access memory (RAM), a read only memory (ROM), a programmable read only memory (PROM), an erasable programmable read only memory (EPROM), an electronically erasable programmable read only memory (EEPROM), a magnetic storage media (i.e. a magnetic disk), or an optical storage media (i.e. a CD-ROM).
- the audio processing node 152 may contain various combinations of machine readable storage devices, which are accessible by the central processing unit 224, and which are capable of storing a combination of computer program instructions 232 and data 234.
- the telephone network interface module 210 handles the low level interaction between the audio processing node 152 and telephone network switch 130.
- module 210 consists of one or more analog tip/ring loop start telephone line terminations.
- central processing unit 224 is able to control link 132 via control data bus 222.
- Control functions include on-hook/off-hook, ring detection, and far-end on-hook detection.
- module 210 includes one or more channelized digital interfaces, such as T1/DS1, E1, or PRI. Signaling can be in-band or out-of-band.
- the DTMF decoder/generator 212 handles the conversion of DTMF tones into digital data and the generation of DTMF tones from digital data.
- the speech recognition module 214 performs speech recognition of speech signals originating at user telephone 110 and received over the audio bus 220. Such speech signals are processed and converted into digital data by the speech recognition module 214.
- the text to speech module 216 converts text of documents received from document server 160 into audio speech signals to be transmitted to a user at telephone 110.
- the audio play/record module 218 is used to play and record audio data.
- modules 210, 212, 214, 216, and 218 are shown as separate functional modules in Fig. 2.
- the functionality of each of modules 212, 214, 216, and 218 may be implemented in hardware, software, or a combination of hardware and software, using well known signal processing techniques.
- the functionality of module 210 may be implemented in hardware or a combination of hardware and software, using well known signal processing techniques. The functioning of each of these modules will be described in further detail below in conjunction with the example.
- the packet network interface 230 is used for communication between the audio processing node 152 and the audio interpreter node 154.
- the audio browsing adjunct 150 also contains an audio interpreter node 154 which is connected to the audio processing node 152.
- the audio interpreter node 154 is shown in further detail in Fig. 3.
- Audio interpreter node 154 contains a central processing unit 302, a memory 304, and two packet network interfaces 306 and 308 connected by a control/data bus 310.
- the overall functioning of the audio interpreter node 154 is controlled by the central processing unit 302.
- Central processing unit 302 operates under control of executed computer program instructions 312 which are stored in memory unit 304.
- Memory unit 304 may be any type of machine readable storage device.
- memory unit 304 may be a random access memory (RAM), a read only memory (ROM), a programmable read only memory (PROM), an erasable programmable read only memory (EPROM), an electronically erasable programmable read only memory (EEPROM), a magnetic storage media (i.e. a magnetic disk), or an optical storage media (i.e. a CD-ROM).
- the audio interpreter node 154 may contain various combinations of machine readable storage devices, which are accessible by the central processing unit 302, and which are capable of storing a combination of computer program instructions 312 and data 314.
- the dialed telephone number (DN) is provided to switch 130 from the local exchange carrier 120 in a manner which is well known in the art, and in turn, the DN is provided to the audio browsing adjunct 150 from switch 130.
- a list of URLs which are associated with DNs is stored as data 234 in memory 228. Assume in the present example the DN (123) 456-7890 is associated with URL http://www.att.com/~phone/greeting.
- the list of URLs associated with various DNs may alternatively be stored in a network database, such as database 140, instead of locally at the audio browsing adjunct 150.
- the central processing unit 224 of the audio processing node 152 sends a signal to network switch 130 to request a lookup to database 140.
- the switch would request the URL from database 140 and return the resulting URL to the audio processing node 152.
- the communication between the audio processing node 152, switch 130, and database 140 may be via an out of band signaling system, such as SS7, which is well known in the art.
- An advantage to this configuration is that a plurality of audio browsing adjuncts may be present in the network 102, and each may share a single database 140. In this manner, only one database 140 needs to be updated with URLs and associated DNs.
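Whether held locally as data 234 or in database 140, the lookup maps a dialed number to a URL; a minimal sketch (hypothetical data, illustrative only):

```python
# Hypothetical DN-to-URL association, as stored in data 234 or database 140.
DN_TO_URL = {
    "1234567890": "http://www.att.com/~phone/greeting",
}

def url_for_dn(dn):
    """Return the URL associated with a dialed number, or None if unknown."""
    return DN_TO_URL.get(dn)
```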
- after receiving the URL associated with the DN, the central processing unit 224 of the audio processing node 152 sends a message (including the URL) to the audio interpreter node 154 instructing the audio interpreter node 154 to initiate an audio interpreting/browsing session. Such a message is passed from the central processing unit 224 to the packet network interface 230 via the control data bus 222.
- the message is sent from packet network interface 230 of the audio processing node 152 to the packet network interface 306 of the audio interpreting node 154 via connection 153.
- the audio processing node 152 and the audio interpreter node 154 are collocated and thus form an integrated audio browsing adjunct 150.
- alternatively, the audio processing node 152 and the audio interpreter node 154 may be geographically separated. Several such alternate embodiments are described below.
- the connection 153 may be a packet data network connection (e.g., TCP/IP connection over Ethernet) which is well known in the art.
- the audio interpreter node 154 receives a message via packet network interface 306 that it is to initiate a new audio interpreting/browsing session.
- the central processing unit 302 is capable of controlling multiple audio interpreting/browsing sessions for multiple users simultaneously. Such multiprocess execution by a processor is well known, and generally entails the instantiation of a software process for controlling each of the sessions.
- upon the initiation of an audio interpreting/browsing session, the audio interpreting node 154 sends an HTTP request for URL http://www.att.com/~phone/greeting to the document server 160 over connection 164.
- the document server 160 is associated with the host name www.att.com.
- Document server 160 is shown in further detail in Fig. 4.
- Document server 160 is a computer containing a central processing unit 402 connected to a memory 404.
- the functions of the document server 160 are controlled by the central processing unit 402 executing computer program instructions 416 stored in memory 404.
- the document server 160 receives requests for documents from the audio interpreter node 154 via connection 164 and packet network interface 440.
- the central processing unit 402 interprets the requests and retrieves the requested information from memory 404.
- Such requests may be for HTML documents 408, audio-HTML documents 410, audio files 412, or graphics files 414.
- HTML documents 408 are well known and contain conventional HTML instructions for use by a conventional graphical browser.
- An audio-HTML document is similar to an HTML document but has additional instructions which are particularly directed to interpretation by the audio interpreter node 154 in accordance with the present invention. Such instructions which are particular to the audio browsing aspects of the present invention will be identified herein as audio-HTML instructions. The details of audio-HTML documents and audio-HTML instructions will be described in further detail below.
- Audio files 412 are files which contain audio information.
- Graphics files 414 are files which contain graphical information.
- a URL identifies a particular document on a particular document server.
- Memory 404 may also contain scripts 418 for dynamically generating HTML documents and audio-HTML documents.
- an HTTP request for URL http://www.att.com/~phone/greeting is received by the document server 160 from the audio interpreter node 154 via connection 164.
- the document server interprets this URL and retrieves an audio-HTML page from memory 404 under central processing unit 402 control.
- the central processing unit 402 then sends this audio-HTML document to the audio interpreter node 154 via packet network interface 440 and link 164.
- the audio-HTML document 500 which is sent in response to the request for URL http://www.att.com/~phone/greeting, and which is received by the audio interpreter node 154, is shown in Fig. 5.
- the audio interpreter node 154 begins interpreting the document 500 as follows.
- the <HEAD> section, lines 502-506 of the document 500, including the title of the page, is not converted into voice, and is ignored by the audio interpreter node 154.
- the <TITLE> section may be interpreted using text to speech as described below.
- the text "Hello!" at line 508 in the <BODY> section of the document 500 is sent from the audio interpreter node 154 to the audio processing node 152 via packet network interface 306 and link 153.
- the audio interpreter node 154 sends instructions to the audio processing node 152 that the text is to be processed by the text to speech module 216.
- the audio processing node 152 receives the text and instructions via the packet network interface 230, and the text is supplied to the text to speech module 216 via control/data bus 222.
- the text to speech module 216 generates the audio signal to play "Hello!" and sends the signal to the telephone network interface module 210 via audio bus 220.
- the telephone network interface module 210 then sends the audio signal to telephone 110.
- text to speech conversion is well known and conventional text to speech techniques may be used by the text to speech module 216. For example, the punctuation "!" in the text may be interpreted as increased volume when the text is converted to speech.
- Line 510 of document 500 is a form instruction, and the audio interpreter node 154 does not send anything to the audio processing node 152 in connection with this instruction.
- the audio interpreter node 154 interprets line 510 to indicate that it will be expecting a future response from the user, and that this response is to be given as an argument to the script identified by http://machine:8888/hastings-bin/getscript.sh.
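Passing the eventual user response to the identified script follows ordinary HTTP form conventions; a sketch (the action URL and the parameter name "choice" are hypothetical, not taken from the document):

```python
from urllib.parse import urlencode

def form_submission_url(action_url, user_response):
    """Append the user's response to the form's action URL as a query string.

    The parameter name "choice" is hypothetical; the actual field name would
    be defined by the form instruction in the served document.
    """
    return action_url + "?" + urlencode({"choice": user_response})

url = form_submission_url("http://example.com/cgi-bin/getscript.sh", "2")
```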
- Line 512 is an audio-HTML instruction.
- the audio interpreter node 154 interprets line 512 by sending an HTTP request to server 160 for the audio file identified by www-spr.ih.att.com/~hastings/annc/greeting.mu8, which resides in memory 404 in storage area 412.
- the document server 160 retrieves the audio file from memory 404 and sends it to the audio interpreter node 154 via link 164.
- upon receipt of the audio file, the audio interpreter node 154 sends the file, along with instructions indicating that the file is to be played by the audio play/record module 218, to the audio processing node 152.
- upon receipt of the file and instructions, the audio processing node 152 routes the audio file to the audio play/record module 218.
- the audio play/record module 218 generates an audio signal which is sent to the telephone network interface module 210 via audio bus 220.
- the telephone network interface module 210 then sends the audio signal to the telephone 110.
- the user at telephone 110 hears the contents of the audio file www-spr.ih.att.com/~hastings/annc/greeting.mu8 at the speaker of telephone 110.
- Lines 514-516 are audio-HTML instructions.
- the audio interpreter node 154 does not send line 514 to the audio processing node 152.
- Line 514 indicates that a prompt is to be played to the user.
- Italic type is used herein to indicate text which is played as audio speech.
- the audio will ask the user to make a selection based upon some criteria, and the audio interpreter node 154 will wait for a response from the user at telephone 110. Also, as a result of processing line 516, the central processing unit 302 sends a message to the audio processing node 152 instructing the telephone network interface module 210 to be prepared to receive audio input.
- the user responds with audio user input from telephone 110.
- the audio user input may be in the form of DTMF tones generated by the user pressing a key on the keypad of telephone 110. For example, if the user presses "2" on the telephone 110 keypad, the DTMF tone associated with "2" is received by the audio processing node 152 via the telephone network interface module 210.
- the audio signal is recognized as a DTMF tone by the central processing unit 224, and instructions are passed to telephone network interface module 210 to send the signal to the DTMF decoder/generator 212 via the audio bus 220.
- the central processing unit 224 instructs the DTMF decoder/generator 212 to convert the DTMF tone into digital data and to pass the digital data to the packet network interface 230 for transmission to the audio interpreter node 154.
- upon receipt of the signal, the audio interpreter node 154 recognizes that the user has responded with choice 2, which corresponds to the value "Jim" as indicated by line 520 of the audio-HTML document 500. Thus, the audio interpreter node 154 sends the value "Jim" associated with the variable "collectvar" to the script http://machine:8888/hastings-bin/getscript.sh identified in line 510 of document 500.
- if the user's choice is not recognized, the audio interpreter node 154 instructs the text to speech module 216 to generate a speech signal "choice not understood, try again", and that signal is provided to the user at telephone 110.
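The choice-resolution step described above can be sketched roughly as follows. This is an illustrative sketch, not the patent's code; the query-string form of the script argument is an assumption, and the choice table shows only the entry named in the example.

```python
# Illustrative sketch of resolving a DTMF digit against a
# prompt-and-collect choice table and forming the script request.
choices = {"2": "Jim"}  # from line 520 of document 500; other entries omitted

def resolve_dtmf(digit, choices, script_url, varname="collectvar"):
    if digit not in choices:
        return None  # caller then plays "choice not understood, try again"
    # Supply the chosen value to the script identified in line 510.
    return "%s?%s=%s" % (script_url, varname, choices[digit])
```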
- audio user input may be in the form of a voice signal.
- the voice signal is received by the audio processing node 152 via the telephone network interface module 210.
- the audio signal is recognized as a voice signal by the central processing unit 224, and instructions are passed to telephone network interface module 210 to send the signal to the speech recognition module 214 via the audio bus 220.
- the central processing unit 224 instructs the speech recognition module 214 to convert the voice signal into digital data and to pass the data to the packet network interface 230 for transmission to the audio interpreter node 154.
- the audio interpreter node 154 processes the data as described above in conjunction with the DTMF audio user input.
- the speech recognition module 214 operates in accordance with conventional speech recognition techniques which are well known in the art.
- Hypertext links often appear in HTML documents. When displayed on the screen of a computer executing a conventional graphical browser, a hypertext link will be graphically identified (e.g. underlined). If a user graphically selects a link, for example by clicking on the link with a mouse, then the browser generates a request for the document indicated by the link and sends the request to the document server.
- This page gives you a choice of links to follow to other World Wide Web pages. Please click on one of the links below.
- the user would then select one of the links using a graphical pointing device such as a mouse. If the user selects the link click here for information on cars then the browser would generate a request for the document identified by the URL http://www.abc.com/cars.html. If the user selects the link click here for information on trucks then the browser would generate a request for the document identified by the URL http://www.abc.com/trucks.html.
- the audio interpreter node 154 sends an instruction to the audio processing node 152, instructing the DTMF decoder/generator 212 to generate a tone to the telephone 110.
- the tone could be generated by the audio interpreter node 154 sending an instruction to the audio processing node 152, instructing the audio play/record module 218 to play an audio file containing tone audio.
- the particular tone is one which is used to signify the beginning of a hypertext link to the user.
- the audio interpreter node 154 then supplies the text of the hypertext link, "click here for information on cars", to the audio processing node 152 with an instruction indicating that the text is to be processed by the text to speech module 216.
- the speech audio signal "click here for information on cars" is provided to the telephone 110.
- the audio interpreter node 154 then sends an instruction to the audio processing node 152, instructing the DTMF decoder/generator 212 to generate a tone to the telephone 110.
- This particular tone is one which is used to signify the end of a hypertext link to the user.
- the tones used to signify the beginning and end of hypertext links may be the same or different tones.
- the ending tone is followed by a pause.
- alternatively, a hypertext link may be identified by speech audio signals such as "begin link [hypertext] end link".
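The link-rendering sequence just described (begin tone, spoken link text, end tone, pause) might be sketched as an event list; the event names here are illustrative only, not terms from the patent.

```python
# Sketch of rendering a hypertext link as audio: a begin tone, the link
# text as speech, an end tone, then a pause awaiting user input.
def render_link(link_text):
    return [
        ("tone", "begin-link"),
        ("speech", link_text),
        ("tone", "end-link"),
        ("pause", "await-input"),
    ]
```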
- if the user wishes to follow the link, then the user supplies user audio input during the pause. For example, suppose the user wanted to follow the link "click here for information on cars". The user would enter audio input during the pause following the generated speech audio signal for the link.
- the audio input may be, for example, a DTMF tone generated by pressing a key on the telephone 110 keypad.
- the DTMF tone is received by the audio processing node 152 and processed by the DTMF decoder/generator 212.
- Data representing the DTMF tone is provided to the audio inte ⁇ reter node 154 via the control/data bus 222, packet network interface 230, and link 153.
- upon receipt of the signal, the audio interpreter node 154 recognizes that the signal has been received during the pause following the selected link, and the audio interpreter node 154 generates a request for the WWW document identified by the URL http://www.abc.com/cars.html, which is associated with the selected link. Alternatively, audio user input for selecting a hypertext link may be in the form of a speech signal.
- Another type of link is a hypertext anchor link.
- An anchor link allows a user to jump to a particular location within a single HTML document.
- the browser displays the portion of the document indicated by the link.
- the audio interpreter node 154 will begin interpreting the document at the point specified by the link.
- line 620 of document 600 contains a hypertext anchor to the portion of the document at line 625. This hypertext link is identified to the user in a manner similar to that of the hypertext links which identify new HTML documents, as described above.
- the hypertext anchor links may be distinguished by, for example, a different audio tone or a generated speech signal identifying the link as an anchor link. If the user selects the anchor link at line 620, then the audio interpreter node 154 will skip down to the text at line 625 and will begin interpreting the HTML document 600 at that point.
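Resuming interpretation at an anchor might be sketched as below, assuming anchors take the standard HTML `<A NAME="...">` form; the parsing is deliberately naive and illustrative.

```python
# Minimal sketch: locate the named anchor so interpretation can resume
# at that point in the document (falling back to the top if absent).
def anchor_offset(html, name):
    marker = '<A NAME="%s">' % name
    pos = html.find(marker)
    return pos + len(marker) if pos >= 0 else 0
```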
- the advantageous embodiment described above in conjunction with Fig. 1 is configured such that the audio browsing adjunct 150, including the audio processing node 152 and the audio interpreter node 154, is embodied in a telecommunications network node located within a long distance telecommunications network 102. This configuration provides the advantage that the audio browsing functions in accordance with the present invention can be provided to telephone network subscribers by the telephone network 102 service provider.
- One such alternate configuration is shown in Fig. 7, in which the functions of the audio browsing adjunct are shown implemented at a user interface device 700.
- the functions of the audio processing node 152, along with the functions of the audio interpreter node 154, are integrated within the single user interface device 700.
- the user interface device 700 communicates with the document server 160 through a communication link 702.
- Link 702 is similar to link 164 which was described above in connection with Fig. 1.
- link 702 may be a socket connection over TCP/IP, the establishment of which is well known in the art.
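Such a socket connection might be established roughly as follows. This is a generic HTTP/1.0-style sketch; the host, port, and path are illustrative values, not addresses from this document.

```python
import socket

# Build a minimal HTTP request of the kind that could be carried over a
# TCP/IP socket connection such as link 702. HTTP/1.0 framing is assumed.
def build_request(host, path):
    return "GET %s HTTP/1.0\r\nHost: %s\r\n\r\n" % (path, host)

def fetch(host, port, path):
    """Open a socket to the document server, send the request, and read
    the full response until the server closes the connection."""
    s = socket.create_connection((host, port))
    try:
        s.sendall(build_request(host, path).encode("ascii"))
        chunks = []
        while True:
            data = s.recv(4096)
            if not data:
                break
            chunks.append(data)
        return b"".join(chunks)
    finally:
        s.close()
```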
- User interface device 700 is shown in further detail in Fig. 8.
- User interface device 700 comprises a keypad/keyboard 802 and a microphone 804 for accepting user input, and a speaker 806 for providing audio output to the user.
- the user interface device 700 also comprises a keypad/keyboard interface module 816 connected to a control/data bus 824.
- the user interface device 700 also comprises a codec 810, a speech recognition module 818, a text to speech module 820, and an audio play/record module 822, each of which is connected to an audio bus 808 and the control/data bus 824 as shown in Fig. 8.
- the codec 810 contains an analog to digital converter 812 and a digital to analog converter 814, both of which are controlled by a central processing unit 826 via the control/data bus 824.
- the analog to digital converter 812 converts analog audio user input from microphone 804 into digital audio signals and provides the digital audio signals to the audio bus 808.
- the digital to analog converter 814 converts digital audio signals from the audio bus 808 to analog audio signals to be sent to the speaker 806.
- the keypad/keyboard interface module 816 receives input from the keypad/keyboard 802 and provides the input to the control data bus 824.
- the speech recognition module 818, the text to speech module 820, and the audio play/record module 822, perform the same functions, and are similarly configured, as modules 214, 216, and 218, respectively, which were described above in conjunction with Fig. 2.
- the user interface device 700 contains a packet network interface 834 for connecting to a packet network, such as the Internet, via link 702.
- the user interface device 700 contains central processing unit 826 and a memory unit 828, both of which are connected to the control/data bus 824.
- the overall functioning of the user interface device 700 is controlled by the central processing unit 826.
- Central processing unit 826 operates under control of executed computer program instructions 830 which are stored in memory unit 828.
- Memory unit 828 also contains data 832.
- the user interface device 700 implements the functions of the audio processing node 152 and the audio interpreter node 154, which were described above in conjunction with the embodiment of Fig. 1. These functions are implemented by the central processing unit 826 executing computer program instructions 830.
- the computer program instructions 830 would include program instructions which are the same as, or similar to: 1) computer program instructions 232 implementing the functions of the audio processing node 152; and 2) computer program instructions 312 implementing the functions of the audio interpreter node 154.
- the functioning of the audio processing node 152 and the audio interpreter node 154 were described in detail above, and will not be described in further detail here.
- Central processing unit 826 is capable of executing multiple processes at the same time, and in this way implements the functions of the audio processing node 152 and the audio interpreter node 154.
- This multiprocess functioning is illustrated in Fig. 8 where the central processing unit 826 is shown executing audio interpreting/browsing process 836 and audio processing process 838.
- a user of user interface device 700 would request a URL using keypad/keyboard 802 or microphone 804. If the keypad/keyboard 802 is used to request a URL, the keypad/keyboard interface module 816 would provide the requested URL to the central processing unit 826 via the control/data bus 824.
- if the microphone 804 is used to request a URL, the user's voice is received by microphone 804, digitized by analog to digital converter 812, and passed to the speech recognition module 818 via the audio bus 808.
- the speech recognition module 818 would then provide the requested URL to the central processing unit 826 via the control/data bus 824.
- upon receipt of the URL, the central processing unit 826 initiates an audio browsing/interpreting session by instantiating an audio interpreting/browsing process 836.
- the audio interpreting/browsing process 836 sends an HTTP request to the document server 160 via the packet network interface 834 in a manner similar to that described above in conjunction with the embodiment of Fig. 1.
- the audio interpreting/browsing process 836 interprets the document in accordance with the audio browsing techniques of the present invention.
- the audio resulting from the interpretation of the document is provided to the user via the speaker 806 under control of the audio processing process 838.
- a user of the user interface device 700 can provide audio user input to the user interface device via the microphone 804.
- Figs. 7 and 8 show the user interface device 700 communicating directly with the document server 160 in the packet network 162.
- the user interface device 700 could be configured to communicate with the document server 160 via a standard telephone connection.
- in this configuration, the packet network interface 834 would be replaced with a telephone interface circuit, which would be controlled by central processing unit 826 via control/data bus 824.
- User interface device 700 would then initiate a telephone call to the document server via the telephone network.
- the document server 160 would terminate the call from the user interface device 700 using hardware similar to the telephone network interface module 210 (Fig. 2). Alternatively, the call could be terminated within the telephone network, with the termination point providing a packet network connection to the document server 160.
- the functions of the audio browsing adjunct 150 (including the functions of the audio processing node 152 and the audio interpreter node 154) and the document server 160 are implemented within an audio browsing document server 900.
- calls are routed from a telephone 110, through LEC 120, switch 130, and another LEC 902, to the audio browsing document server 900.
- the audio browsing document server 900 could be reached from a conventional telephone 110 via a telephone network.
- the audio browsing document server 900 is also connected to the Internet via a link 904.
- the audio browsing document server 900 is shown in further detail in Fig. 10.
- the audio browsing document server 900 comprises a telephone network interface module 1010, a DTMF decoder/generator 1012, a speech recognition module 1014, a text to speech module 1016, and an audio play/record module 1018, each of which is connected to an audio bus 1002 and a control/data bus 1004, as shown in Fig. 10.
- Each of these modules 1010, 1012, 1014, 1016, and 1018 perform the same functions, and are similarly configured, as modules 210, 212, 214, 216, and 218, respectively, which were described above in conjunction with Fig. 2.
- the audio browsing document server 900 contains a packet network interface 1044 for connecting to a packet network, such as the Internet, via link 904.
- the packet network interface 1044 is similar to the packet network interface 230 described above in conjunction with Fig. 2. Further, the audio browsing document server 900 contains a central processing unit 1020 and a memory unit 1030, both of which are connected to the control/data bus 1004. The overall functioning of the audio browsing document server 900 is controlled by the central processing unit 1020.
- Central processing unit 1020 operates under control of executed computer program instructions 1032 which are stored in memory unit 1030.
- Memory unit 1030 also contains data 1034, HTML documents 1036, audio-HTML documents 1038, audio files 1040, and graphics files 1042.
- the audio browsing document server 900 implements the functions of the audio processing node 152, the audio interpreter node 154, and the document server 160, which were described above in conjunction with the embodiment of Fig. 1. These functions are implemented by the central processing unit 1020 executing computer program instructions 1032.
- the computer program instructions 1032 would include program instructions which are the same as, or similar to: 1) computer program instructions 232 implementing the functions of the audio processing node 152; 2) computer program instructions 312 implementing the functions of the audio interpreter node 154; and 3) computer program instructions 416 implementing the functions of the document server 160.
- the functioning of the audio processing node 152, the audio interpreter node 154, and the document server 160 were described in detail above, and will not be described in further detail here.
- Central processing unit 1020 is capable of executing multiple processes at the same time, and in this way implements the functions of the audio processing node 152, the audio interpreter node 154, and the document server 160. This multiprocess functioning is illustrated in Fig. 10 where the central processing unit 1020 is shown executing audio interpreting/browsing process 1022, document serving process 1024, and audio processing process 1026.
- a call placed by telephone 110 to a telephone number associated with information accessible through the audio browsing document server 900 is routed to the audio browsing document server 900 via LEC 120, switch 130, and LEC 902. It is noted that a plurality of telephone numbers may be associated with various information accessible through the audio browsing document server 900, and each such telephone number would be routed to the audio browsing document server 900.
- the ringing line is detected through the telephone network interface module 1010 under control of the audio processing process 1026.
- upon detection of the call, the central processing unit 1020 performs a lookup to determine the URL which is associated with the dialed number (DN) of the call.
- the DN is provided to the audio browsing document server 900 from the LEC 902 in a manner which is well known in the art.
- a list of DN's with associated URL's is stored as data 1034 in memory 1030.
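The DN-to-URL lookup might be sketched as a simple table. The DN values below are hypothetical; the URLs reuse examples appearing earlier in this document.

```python
# Hypothetical DN-to-URL table of the kind stored as data 1034 in memory 1030.
dn_to_url = {
    "5551234": "http://www.abc.com/cars.html",
    "5555678": "http://www.abc.com/trucks.html",
}

def url_for_dn(dn, table):
    # None signals that no document is associated with the dialed number.
    return table.get(dn)
```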
- upon receipt of the URL associated with the DN, the central processing unit 1020 initiates an audio browsing/interpreting session by instantiating an audio interpreting/browsing process 1022.
- the audio interpreting/browsing process 1022 sends an HTTP request to the document serving process 1024 which is co-executing on the central processing unit 1020.
- the document serving process 1024 performs the document server functions as described above in conjunction with document server 160 in the embodiment shown in Fig. 1.
- the central processing unit 1020 retrieves the document associated with the URL from memory 1030.
- the audio interpreting/browsing process 1022 then interprets the document in accordance with the audio browsing techniques of the present invention.
- the audio resulting from the interpretation of the document is provided to the user under control of the audio processing process 1026.
- a user of telephone 110 can provide audio user input to the audio browsing document server 900 in a manner similar to that described above in conjunction with the embodiment of Fig. 1.
- the audio processing node 152 and the audio interpreter node 154 were collocated. However, the functions of the audio processing node 152 and the audio interpreter node 154 may be geographically separated as shown in Fig. 11.
- the audio processing node 152 is contained within the telecommunications network 102 and an audio interpreter document server 1100 is contained within the packet network 162.
- the functioning of the audio processing node 152 is as described above in conjunction with the embodiment of Fig. 1.
- the audio interpreter document server 1100, which implements the functions of a document server, such as document server 160, and the functions of the audio interpreter node 154, is shown in further detail in Fig. 12.
- the audio interpreter document server 1100 contains a packet network interface 1202 connected to link 153 and to a control/data bus 1204.
- the audio interpreter document server 1100 contains a central processing unit 1206 and a memory unit 1212, both of which are connected to the control/data bus 1204. The overall functioning of the audio interpreter document server 1100 is controlled by the central processing unit 1206. Central processing unit 1206 operates under control of executed computer program instructions 1214 which are stored in memory unit 1212. Memory unit 1212 also contains data 1216, HTML documents 1218, audio-HTML documents 1220, audio files 1222, and graphics files 1224.
- the audio interpreter document server 1100 implements the functions of the audio interpreter node 154 and the document server 160, which were described above in conjunction with the embodiment of Fig. 1. These functions are implemented by the central processing unit 1206 executing computer program instructions 1214.
- the computer program instructions 1214 would include program instructions which are the same as, or similar to: 1) computer program instructions 312 implementing the functions of the audio interpreter node 154; and 2) computer program instructions 416 implementing the functions of the document server 160.
- the functioning of the audio interpreter node 154 and the document server 160 were described in detail above, and will not be described in further detail here.
- Central processing unit 1206 is capable of executing multiple processes at the same time, and in this way implements the functions of the audio interpreter node 154 and the document server 160. This multiprocess functioning is illustrated in Fig. 12 where the central processing unit 1206 is shown executing audio interpreting/browsing process 1208 and document serving process 1210.
- the audio processing node 152 communicates with the audio interpreter document server 1100 over link 153 in a manner similar to that described above in conjunction with Fig. 1.
- the audio interpreting/browsing process 1208 communicates with the document serving process 1210 through the central processing unit 1206 via inter-process communication.
- the audio browsing aspects of the present invention may be implemented in various ways, such that the audio processing functions, the audio interpreting/browsing functions, and the document serving functions may be integrated or separate, depending on the particular configuration.
- One skilled in the art would recognize that there are other possible configurations for providing the audio browsing functions of the present invention.
- the present invention may be used in conjunction with standard HTML documents, which are generally intended to be used with conventional graphics browsers, or with audio-HTML documents which are created specifically for use in accordance with the audio browsing features of the present invention.
- for audio interpretation of standard HTML documents, many standard text to speech conversion techniques may be used.
- the following section describes the techniques which may be used to convert standard HTML documents into audio data.
- the techniques described herein for converting HTML documents into audio data are exemplary only, and various other techniques for converting HTML documents into audio signals could be readily implemented by one skilled in the art given this disclosure.
- Standard text passages are interpreted using conventional text to speech conversion techniques which are well known.
- the text is interpreted as it is encountered in the document, and such interpretation continues until the user supplies audio input (e.g. to answer a prompt or follow a link), or a prompt is reached in the document.
- the end of a sentence is interpreted by adding a pause to the audio, and paragraph marks <p> are interpreted by inserting a longer pause.
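The pause rules above might be sketched as follows; the pause durations are assumptions for illustration, not values from this document.

```python
SENTENCE_PAUSE_MS = 300   # assumed duration for an end-of-sentence pause
PARAGRAPH_PAUSE_MS = 800  # assumed longer pause for a paragraph mark

def pause_after(token):
    """Pause (in milliseconds) to insert after a text token or <p> mark."""
    if token == "<p>":
        return PARAGRAPH_PAUSE_MS
    if token and token[-1] in ".!?":
        return SENTENCE_PAUSE_MS
    return 0
```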
- Text styles may be inte ⁇ reted as follows.
- <DFN>word</DFN> Read text as an independent unit (e.g. using inflection and setting off with pauses).
- Image instructions are specifications in HTML which indicate that a particular image is to be inserted into the document.
- An example of an HTML image instruction is as follows:
- This instruction indicates that the image file "image.gif" is to be retrieved from the machine defined in the URL and displayed by the client browser.
- Certain conventional graphic browsers do not support image files, and therefore, HTML image instructions sometimes include alternate text to be displayed instead of the image.
- the text "image of car" is included as an alternative to the image file.
- in accordance with the audio browsing techniques of the present invention, if an image instruction contains a text alternative, then the text is processed and converted to speech and the speech signal is provided to the user.
- in this example, the speech signal "image of car" would be provided to a user at telephone 110.
- a speech signal is generated indicating that an image with no text alternative was encountered (e.g. "A picture without an alternative description").
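The alternate-text rule might be sketched as below; the attribute parsing is deliberately naive and illustrative, not the patent's implementation.

```python
import re

# Speak the ALT text of an image instruction if present; otherwise fall
# back to the default announcement described above.
def image_speech(img_tag):
    m = re.search(r'ALT="([^"]*)"', img_tag, re.IGNORECASE)
    if m:
        return m.group(1)
    return "A picture without an alternative description"
```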
- the default is that the red box is checked.
- the user would be able to change this default by checking either the blue or green box.
- the above sequence of instructions would be processed into a speech signal provided to the user as follows: The following selections may be toggled by pressing # during the pause: red currently checked (pause), blue (pause), green (pause).
- the user can toggle the item preceding the pause.
- a second press of the # key will move the user out of the input sequence.
- the user may press *r to repeat the list of options.
- the user may select the checkbox options using voice signal input.
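The toggle behavior described above might be modeled as below. This is a sketch of the interaction only: the set of "#" presses is represented as the set of options during whose pause the key was pressed.

```python
# Toggle checkbox options per the described interaction: pressing "#"
# during an option's pause flips that option's checked state.
def toggle(options, checked, pressed_during):
    checked = set(checked)
    for name in options:
        if name in pressed_during:
            if name in checked:
                checked.discard(name)
            else:
                checked.add(name)
    return checked
```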
- HTML documents can be browsed in accordance with the audio browsing techniques of the present invention.
- audio-HTML instructions may be introduced into conventional HTML documents. These audio- HTML instructions are described below.
- a voice source instruction:
- a collect name instruction:
- <COLLECT NAME="collectvar"> specifies the beginning of a prompt-and-collect sequence. Such a collect name instruction is followed by a prompt instruction and a set of choice instructions. When the user makes a choice, as indicated by audio user input, the results of the user choice are supplied to the document server associated with the variable collectvar.
- the collect name instruction, along with an associated prompt-and-collect sequence, is described in detail in conjunction with lines 514-524 of the example document 500 of Fig. 5.
- This instruction causes the audio browsing adjunct 150 to pause and wait for DTMF input from the user.
- the user inputs a DTMF sequence by pressing keys on the keypad of telephone 110, with the end of the sequence indicated by pressing the # key.
- the DTMF input is processed as described above in conjunction with the example HTML document 500.
- the decoded DTMF signal is then supplied to the document server associated with the variable varname.
- the MAXLENGTH parameter indicates the maximum length, in DTMF inputs, allowed for the input. If the user enters more than the maximum number of DTMF keys (in this example 5), then the system ignores the excess input.
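The collection rule just described might be sketched as follows: keys are accepted until "#" is pressed, and input beyond MAXLENGTH is ignored. This is a sketch, not the patent's code.

```python
def collect_dtmf(keys, maxlength=5):
    """Collect DTMF keys until '#', ignoring digits beyond maxlength."""
    digits = []
    for k in keys:
        if k == "#":
            break  # '#' terminates the input sequence
        if len(digits) < maxlength:
            digits.append(k)  # excess input beyond maxlength is ignored
    return "".join(digits)
```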
- the SPEECH input instruction:
- <INPUT TYPE="SPEECH"
- This instruction causes the audio browsing adjunct 150 to pause and to wait for speech input from the user.
- the user inputs a speech signal by speaking into the microphone of telephone 110.
- the speech input is processed as described above in conjunction with the example HTML document 500.
- the speech signal is then supplied to the document server associated with the variable varname.
- the MAXLENGTH parameter indicates that the maximum length of the speech input is 5 seconds.
- the audio-HTML instructions described herein are exemplary of the types of audio-HTML instructions which may be implemented to exploit the advantages of the audio browsing techniques of the present invention. Additional audio-HTML instructions could be readily implemented by one skilled in the art given this disclosure.
- the audio browsing adjunct 150 supports various navigation instructions.
- users may use conventional techniques for navigating through a document.
- Such conventional techniques include text sliders for scrolling through a document, cursor movement, and instructions such as page up, page down, home, and end.
- users may navigate through documents using audio user input, either in the form of DTMF tones or speech, as follows.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP97915886A EP0834229A1 (en) | 1996-04-22 | 1997-03-18 | Method and apparatus for information retrieval using audio interface |
JP9538046A JPH11510977A (en) | 1996-04-22 | 1997-03-18 | Method and apparatus for extracting information using audio interface |
IL12264797A IL122647A (en) | 1996-04-22 | 1997-03-18 | Method and apparatus for information retrieval using audio interface |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US63580196A | 1996-04-22 | 1996-04-22 | |
US08/635,801 | 1996-04-22 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO1997040611A1 true WO1997040611A1 (en) | 1997-10-30 |
Family
ID=24549170
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US1997/003690 WO1997040611A1 (en) | 1996-04-22 | 1997-03-18 | Method and apparatus for information retrieval using audio interface |
Country Status (7)
Country | Link |
---|---|
EP (1) | EP0834229A1 (en) |
JP (1) | JPH11510977A (en) |
KR (1) | KR19990028327A (en) |
CA (1) | CA2224712A1 (en) |
IL (1) | IL122647A (en) |
MX (1) | MX9710150A (en) |
WO (1) | WO1997040611A1 (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP4565585B2 (en) * | 2000-04-13 | 2010-10-20 | キヤノン株式会社 | Data processing apparatus, data processing method, and recording medium |
US7735101B2 (en) | 2006-03-28 | 2010-06-08 | Cisco Technology, Inc. | System allowing users to embed comments at specific points in time into media presentation |
1997
- 1997-03-18 JP JP9538046A patent/JPH11510977A/en not_active Ceased
- 1997-03-18 IL IL12264797A patent/IL122647A/en not_active IP Right Cessation
- 1997-03-18 EP EP97915886A patent/EP0834229A1/en not_active Withdrawn
- 1997-03-18 KR KR1019970709642A patent/KR19990028327A/en active Search and Examination
- 1997-03-18 CA CA002224712A patent/CA2224712A1/en not_active Abandoned
- 1997-03-18 WO PCT/US1997/003690 patent/WO1997040611A1/en not_active Application Discontinuation
- 1997-12-15 MX MX9710150A patent/MX9710150A/en not_active IP Right Cessation
Non-Patent Citations (6)
Title |
---|
Atkins, D. L. et al.: "Integrated Web and Telephone Service Creation", Bell Labs Technical Journal, vol. 2, no. 1, 1 January 1997, pages 19-35, XP002036350 * |
Baggia, P. et al.: "A Man-Machine Dialogue System for Speech Access to E-Mail Information Using the Telephone: Implementation and First Results", CSELT Technical Report on Eurospeech 1991 (ed. Tosco, F.), vol. 20, no. 1, 1 March 1992, pages 79-83, XP000314315 * |
Fischell, D. R. et al.: "Interactive Voice Technology Applications", AT&T Technical Journal, vol. 69, no. 5, 1 September 1990, pages 61-76, XP000224080 * |
Gessler, Kotula: "PDAs as Mobile WWW Browsers", Computer Networks and ISDN Systems, 1 December 1995, pages 53-59, XP002037371 * |
Page, J. H. et al.: "The Laureate Text-to-Speech System - Architecture and Applications", BT Technology Journal, vol. 14, no. 1, 1 January 1996, pages 57-67, XP000554639 * |
Riccio, A. et al.: "Voice Based Remote Data Base Access", Proceedings of the European Conference on Speech Communication and Technology (Eurospeech), Paris, 26-28 September 1989 (eds. Tubach, J. P., Mariani, J. J.), vol. 1, pages 561-564, XP000209922 * |
Cited By (42)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6771743B1 (en) * | 1996-09-07 | 2004-08-03 | International Business Machines Corporation | Voice processing system, method and computer program product having common source for internet world wide web pages and voice applications |
WO1998035491A1 (en) * | 1997-02-05 | 1998-08-13 | British Telecommunications Public Limited Company | Voice-data interface |
WO1999046920A1 (en) * | 1998-03-10 | 1999-09-16 | Siemens Corporate Research, Inc. | A system for browsing the world wide web with a traditional telephone |
EP0942576A3 (en) * | 1998-03-11 | 2002-01-16 | AT&T Corp. | Method and apparatus for facilitating access to the internet from a telephone network |
EP0942576A2 (en) * | 1998-03-11 | 1999-09-15 | AT&T Corp. | Method and apparatus for facilitating access to the internet from a telephone network |
WO1999054801A2 (en) * | 1998-04-20 | 1999-10-28 | Sun Microsystems, Inc. | Method and apparatus of supporting an audio protocol in a network |
US6675054B1 (en) | 1998-04-20 | 2004-01-06 | Sun Microsystems, Inc. | Method and apparatus of supporting an audio protocol in a network environment |
WO1999054801A3 (en) * | 1998-04-20 | 2000-06-15 | Sun Microsystems Inc | Method and apparatus of supporting an audio protocol in a network |
EP1101343A4 (en) * | 1998-07-24 | 2004-08-18 | Motorola Inc | Telecommunication audio processing systems and methods therefor |
EP1101343A1 (en) * | 1998-07-24 | 2001-05-23 | Motorola, Inc. | Telecommunication audio processing systems and methods therefor |
EP1133734A4 (en) * | 1998-10-02 | 2005-12-14 | Ibm | Conversational browser and conversational systems |
EP1133734A2 (en) * | 1998-10-02 | 2001-09-19 | International Business Machines Corporation | Conversational browser and conversational systems |
WO2000025486A1 (en) * | 1998-10-23 | 2000-05-04 | Nokia Networks Oy | Method and apparatus for distributing an audio or video information |
EP1131940A4 (en) * | 1998-11-17 | 2004-12-15 | Telstra New Wave Pty Ltd | A data access system and method |
EP1131940A1 (en) * | 1998-11-17 | 2001-09-12 | Telstra R & D Management Pty. Ltd. | A data access system and method |
JP2002532018A (en) * | 1998-12-01 | 2002-09-24 | Nuance Communications | Method and system for forming and browsing an audio web |
US6240391B1 (en) | 1999-05-25 | 2001-05-29 | Lucent Technologies Inc. | Method and apparatus for assembling and presenting structured voicemail messages |
US6393107B1 (en) | 1999-05-25 | 2002-05-21 | Lucent Technologies Inc. | Method and apparatus for creating and sending structured voicemail messages |
US6459774B1 (en) | 1999-05-25 | 2002-10-01 | Lucent Technologies Inc. | Structured voicemail messages |
WO2000076192A1 (en) * | 1999-06-08 | 2000-12-14 | Aplio, Societe Anonyme | Method and system for accessing, via a computerised communication network such as internet, a multimedia voice server |
FR2794924A1 (en) * | 1999-06-08 | 2000-12-15 | Aplio Sa | METHOD AND SYSTEM FOR ACCESSING A MULTIMEDIA VOICE SERVER VIA AN INTERNET COMPUTER COMMUNICATION NETWORK |
EP1096766A3 (en) * | 1999-10-29 | 2004-06-30 | Nortel Networks Limited | Methods and systems for providing access to stored audio data over a network |
US7376710B1 (en) | 1999-10-29 | 2008-05-20 | Nortel Networks Limited | Methods and systems for providing access to stored audio data over a network |
US7308462B1 (en) | 1999-10-29 | 2007-12-11 | Nortel Networks Limited | Methods and systems for building and distributing audio packages |
EP1096766A2 (en) * | 1999-10-29 | 2001-05-02 | Nortel Networks Limited | Methods and systems for providing access to stored audio data over a network |
EP1240775B1 (en) * | 1999-12-10 | 2011-10-12 | Deutsche Telekom AG | Communication system and method for establishing an internet connection by means of a telephone |
US6424945B1 (en) | 1999-12-15 | 2002-07-23 | Nokia Corporation | Voice packet data network browsing for mobile terminals system and method using a dual-mode wireless connection |
EP1122636A3 (en) * | 2000-02-03 | 2007-11-14 | Siemens Corporate Research, Inc. | System and method for analysis, description and voice-driven interactive input to html forms |
EP1122636A2 (en) | 2000-02-03 | 2001-08-08 | Siemens Corporate Research, Inc. | System and method for analysis, description and voice-driven interactive input to html forms |
WO2001061894A3 (en) * | 2000-02-18 | 2002-05-30 | Penguinradio Inc | Method and system for providing digital audio broadcasts and digital audio files via a computer network |
WO2001061894A2 (en) * | 2000-02-18 | 2001-08-23 | Penguinradio, Inc. | Method and system for providing digital audio broadcasts and digital audio files via a computer network |
SG98374A1 (en) * | 2000-03-14 | 2003-09-19 | Egis Comp Systems Pte Ltd | A client and method for controlling communications thereof |
US6823311B2 (en) | 2000-06-29 | 2004-11-23 | Fujitsu Limited | Data processing system for vocalizing web content |
EP1168300A1 (en) * | 2000-06-29 | 2002-01-02 | Fujitsu Limited | Data processing system for vocalizing web content |
EP1168799A2 (en) * | 2000-06-30 | 2002-01-02 | Fujitsu Limited | Data processing system with vocalisation mechanism |
EP1168799A3 (en) * | 2000-06-30 | 2005-12-14 | Fujitsu Limited | Data processing system with vocalisation mechanism |
EP1178656A1 (en) * | 2000-08-02 | 2002-02-06 | Passcall Advanced Technologies Ltd | System and method for computerless surfing of an information network |
EP1233590A1 (en) * | 2001-02-19 | 2002-08-21 | Sun Microsystems, Inc. | Content provider for a computer system |
US7406525B2 (en) | 2001-02-19 | 2008-07-29 | Sun Microsystems, Inc. | Content provider and method for a computer system |
EP1246439A1 (en) * | 2001-03-26 | 2002-10-02 | Alcatel | System and method for voice controlled internet browsing using a permanent D-channel connection |
US8644465B2 (en) | 2002-11-29 | 2014-02-04 | Streamwide | Method for processing audio data on a network and device therefor |
EP2452306A1 (en) * | 2009-07-08 | 2012-05-16 | Onering S.R.L. | Device for collecting and managing documents and for controlling viewing thereof and method for using the device |
Also Published As
Publication number | Publication date |
---|---|
MX9710150A (en) | 1998-07-31 |
IL122647A (en) | 2002-05-23 |
EP0834229A1 (en) | 1998-04-08 |
IL122647A0 (en) | 1998-08-16 |
JPH11510977A (en) | 1999-09-21 |
KR19990028327A (en) | 1999-04-15 |
CA2224712A1 (en) | 1997-10-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO1997040611A1 (en) | Method and apparatus for information retrieval using audio interface | |
KR100566014B1 (en) | Methods and devices for voice conversation over a network using parameterized conversation definitions | |
CA2351899C (en) | Method of using speech recognition to initiate a wireless application protocol (wap) session | |
US7054818B2 (en) | Multi-modal information retrieval system | |
US6738803B1 (en) | Proxy browser providing voice enabled web application audio control for telephony devices | |
US6445694B1 (en) | Internet controlled telephone system | |
US6430175B1 (en) | Integrating the telephone network and the internet web | |
US7486664B2 (en) | Internet controlled telephone system | |
US7340043B2 (en) | Voice extensible markup language-based announcements for use with intelligent network services | |
US20060064499A1 (en) | Information retrieval system including voice browser and data conversion server | |
US20050111439A1 (en) | Voice integrated VOIP system | |
US7263177B1 (en) | Method and system for operating interactive voice response systems tandem | |
JP2000502849A (en) | Access to communication services | |
WO2002033942A1 (en) | Providing blended interface for wireless information services | |
WO2003063137A1 (en) | Multi-modal information delivery system | |
WO2004006131A1 (en) | An arrangement and a method relating to access to internet content | |
CA2315168A1 (en) | Method and apparatus for allowing selective disposition of an incoming telephone call during an internet session | |
US6285683B1 (en) | Method and apparatus for providing extended capability telephone services via an automated server | |
Danielsen | The promise of a voice-enabled Web | |
US7336771B2 (en) | Voice extensible markup language enhancements of intelligent network services | |
WO2002093402A1 (en) | Method and system for creating pervasive computing environments | |
US20150029898A1 (en) | Method, apparatus, and article of manufacture for web-based control of a call server | |
US20030223555A1 (en) | Enabling legacy interactive voice response units to accept multiple forms of input | |
US20040141595A1 (en) | Voice extensible markup language-based web interface with intelligent network services | |
Guedhami et al. | Web Enabled Telecommunication Service Control Using VoxML |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states | Kind code of ref document: A1; Designated state(s): CA IL JP KR MX |
AL | Designated countries for regional patents | Kind code of ref document: A1; Designated state(s): AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL PT SE |
ENP | Entry into the national phase | Ref document number: 2224712; Country of ref document: CA; Kind code of ref document: A; Format of ref document f/p: F |
ENP | Entry into the national phase | Ref country code: JP; Ref document number: 1997 538046; Kind code of ref document: A; Format of ref document f/p: F |
WWE | Wipo information: entry into national phase | Ref document number: 1019970709642; Country of ref document: KR |
WWE | Wipo information: entry into national phase | Ref document number: 1997915886; Country of ref document: EP |
121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
WWP | Wipo information: published in national office | Ref document number: 1997915886; Country of ref document: EP |
WWP | Wipo information: published in national office | Ref document number: 1019970709642; Country of ref document: KR |
WWR | Wipo information: refused in national office | Ref document number: 1019970709642; Country of ref document: KR |
WWW | Wipo information: withdrawn in national office | Ref document number: 1997915886; Country of ref document: EP |