WO2001059541A2 - Method and apparatus for accessing web pages - Google Patents

Method and apparatus for accessing web pages

Info

Publication number
WO2001059541A2
Authority
WO
WIPO (PCT)
Prior art keywords
voice
client computer
web page
user
audio pointer
Prior art date
Application number
PCT/US2001/003067
Other languages
French (fr)
Other versions
WO2001059541A3 (en)
Inventor
Dominic Pang
Original Assignee
Dominic Pang Corp.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dominic Pang Corp. filed Critical Dominic Pang Corp.
Priority to AU2001233149A1
Publication of WO2001059541A2
Publication of WO2001059541A3

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/90 Details of database functions independent of the retrieved data types
    • G06F 16/95 Retrieval from the web
    • G06F 16/954 Navigation, e.g. using categorised browsing


Abstract

A method and apparatus for user-friendly access to information located on the Internet. The method includes a user at a client computer selecting a language, speaking an audio pointer in that language, identifying a particular voice bridge server computer based on the selected language, transmitting the audio pointer to the voice bridge server computer, matching the audio pointer against samples stored at the voice bridge server computer, retrieving the corresponding URL or URLs, and transmitting the URL or URLs to the voice activated client computer (Fig. 2). The voice activated client computer thereafter uses the URL or URLs to retrieve the information they point to and displays it on a content viewer.

Description

METHOD AND APPARATUS FOR ACCESSING WEB PAGES
TECHNICAL FIELD The present invention relates in general to a system and method for providing user-friendly interfaces for accessing information sources over a communication network, and, in particular, to a system and method for accessing web pages over the Internet.
BACKGROUND Public data communication networks, such as the Internet, have revolutionized access to information by providing a powerful new system for disseminating information such as news, product information, advertisements, images, samples of music and video clips, etc. to distantly located persons or entities.
In general, the Internet includes a number of interconnected computers, usually called server computers or servers, which store such information. Servers receive requests for such information from distantly located computers, typically operated by users, which are usually called client computers or clients. The servers respond by transmitting the requested information to the client computer via the Internet. The most popular current method for these request-response interactions between a client computer and a server is a protocol called the Hyper Text Transfer Protocol (HTTP). This protocol is typically executed over a transport layer protocol such as Transmission Control Protocol/Internet Protocol (TCP/IP), which establishes and maintains a connection between the client computer and the server, which, in turn, are interconnected with numerous other servers. To understand the present invention, it is helpful to understand the way the client interacts with the server. The client includes a computer program called a web browser, or simply, a "browser," that provides an interactive interface for displaying the information retrieved from the vast resources of the Internet. A user operating a client launches the browser on the client. Thereafter, the user typically specifies a network address from which desired information is to be retrieved, e.g., a web page on the World Wide Web. This address is generally expressed in the form of a Uniform Resource Locator (URL), which contains a domain name and subdirectory information for the server from which the information is retrieved.
When the user identifies a desired web page, the browser transmits a request for that page to an appropriate server via the Internet. In doing so, the client establishes a TCP/IP connection with its Internet Service Provider (ISP) and, through that provider, requests a piece of information from the appropriate server in the form of an HTTP message packet. In response, the client receives a response message, which is typically a packet of data in the Hyper Text Markup Language (HTML); this packet is also referred to as a web page. The browser then displays the page for the user on a display screen attached to the client. Persons of ordinary skill in the art understand that there are several variations on the methods of identifying desired web pages, such as a text entry area, a dialog box, or others.
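The request-response exchange described above can be illustrated with a short, self-contained sketch using Python's standard library; the URL is a placeholder, and any reachable web server would respond analogously.

```python
# Minimal sketch of the browser's request/response exchange described above.
# The URL is a placeholder; any reachable web server would behave similarly.
import urllib.request

url = "http://www.example.com/index.html"            # network address in URL form
with urllib.request.urlopen(url) as response:         # HTTP GET over a TCP/IP connection
    html = response.read().decode("utf-8", errors="replace")  # HTML payload (the "web page")
print(html[:200])   # a browser would render this markup instead of printing it
```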
Each web page is located at a network address that can be expressed in the URL format. URLs are typically expressed in Roman characters such as those used in the English language. For most web pages, and in particular most "home" pages for web sites, a domain name is chosen so that it is easy for users to remember. For example, companies and organizations with widely recognized company names such as "Coca-Cola®", "Motorola®", and "IBM®" typically use their trademarked corporate names as the domain name for their home page, e.g., www.motorola.com and www.ibm.com. Although this addressing system is convenient for users familiar with a language founded on a Romanized alphabet system, it is not convenient for the large portion of the global population whose native languages do not employ such an alphabet.
For example, languages such as Persian, Hebrew, Japanese, Korean, and Chinese do not use the Romanized alphabet. The Chinese language, for instance, uses characters or symbols which require about 17 strokes for each symbol. In countries where such languages are spoken, the names of established companies are typically written with characters from their native languages, and therefore cannot be expressed using characters of the Roman alphabet. Accordingly, since URLs must be formed from Roman characters, these company names cannot be used as Internet domain names. If such a company selects a domain name formed of Roman characters, this selected name will have little or no meaning to its customers because the characters and domain names are unfamiliar to them. The same problem arises when companies established in the United States or in other countries (which use domain names in a Romanized alphabet) try to market their products and services in certain foreign countries. Since customers in these areas are not familiar with the Romanized alphabet, and are most likely more familiar with the translated names of these corporations in Korean, Chinese, or whatever local language is at issue, the domain names for such companies (even if well known in the United States) often have little meaning to these foreign customers and are therefore not the most effective in such markets. Further, even if the domain name system allowed the use of other types of characters, such as Chinese, Korean, Persian, Hebrew or Japanese characters, it would be difficult for users to enter such characters into a computer using a conventional keyboard such as a personal computer (PC) keyboard, which is the standard input device for computers installed worldwide regardless of the native language. For example, there are at least five different, and complex, methods for inputting Chinese into a personal computer through the standard keyboard. One method, called Cang Jie, breaks down Chinese characters into 26 building blocks, or radicals, on a PC keyboard. A method used in Taiwan, Zhuyin, uses a set of 37 phonetic symbols on the PC keyboard for input purposes. Another system used in China, Pinyin, uses standard Romanized letters. In China, there is also a five-stroke method called Wu Bi, while Hong Kong-based Ziran has come up with a 10-stroke system.
Numerous methods have been invented for entering Chinese textual characters using a keyboard. U.S. Pat. No. 6,009,444 to Chen discloses a method of entering text for the Zhuyin phonetic Chinese language, in which a character is representable as a first symbol selected from a first subset of symbols and a second symbol selected from a second subset of symbols, where the first and second subsets are mutually exclusive. A first key on which is displayed a first subset of symbols is activated (e.g., any one of keys 1-6). A candidate first symbol is displayed in response to the step of activating the first key. A second key is activated on which is displayed a second subset of symbols (e.g., any one of keys 7-0). The candidate first symbol is fixed and a candidate second symbol is displayed in response to activating the second key. A third key can be activated (e.g., any one of keys 7-0), on which is displayed a further subset of symbols, whereupon the candidate second symbol is fixed.
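The candidate-then-fix key sequence attributed to the Chen patent can be sketched roughly as follows; the symbol tables and key assignments are invented stand-ins rather than the actual Zhuyin layout, so this only illustrates the two-subset entry sequence.

```python
# Illustrative sketch of the two/three-key entry scheme described above: a key from the
# first subset proposes a candidate first symbol, a key from a disjoint second subset fixes
# it and proposes a second symbol, and a further key would fix that one in turn.
# The symbol tables below are invented stand-ins, not the actual Zhuyin layout.
FIRST_SUBSET = {"1": "ㄅ", "2": "ㄆ"}        # keys 1-6 in the patent
SECOND_SUBSET = {"7": "ㄚ", "8": "ㄛ"}       # keys 7-0 in the patent

def enter_character(key_presses):
    first = FIRST_SUBSET[key_presses[0]]      # candidate first symbol displayed
    second = SECOND_SUBSET[key_presses[1]]    # first symbol fixed, candidate second displayed
    return first + second                     # a third key press would fix the second symbol

print(enter_character(["1", "8"]))            # -> ㄅㄛ
```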
Even with these techniques, these characters are difficult to type into a computer, and any domain names formed from such characters would be difficult for English-speaking customers to understand. This illustrates, by way of example, the numerous obstacles to implementing a multi-lingual URL system, even if it were possible to overcome the complications of inputting non-Roman characters.
SUMMARY OF THE INVENTION The invention relates generally to an improved system and method for retrieving web pages from a plurality of server computers connected to a public communication network such as the Internet. The system includes a plurality of voice activated client computers, and at least one server computer which shall be designated as a voice bridge server. Each voice activated client computer includes a presentation device, such as a computer display, for presenting the content of web pages to a user. It also includes a recording mechanism having a microphone for recording audio pointers, which represent the names, input through speech, assigned to web pages which the user desires to access.
Each voice activated client computer also has a communication mechanism for connecting to said public communication network to obtain desired web pages. When a user speaks the assigned name of a desired web page, which shall be called an audio pointer, the communication mechanism first transmits the recorded audio pointer via said public communication network to a remote voice bridge server connected to the Internet.
The voice bridge server includes an audio pointer database for storing, for each of a plurality of Internet web pages, a corresponding audio representation of a voice name assigned to the web page (the "audio pointer"). It also stores a network address for each such web page. When the voice bridge server receives from a remotely located voice activated client computer a recording of a spoken audio pointer of a desired web page, it compares the received recording to data in the audio pointer database to determine the network address for the desired web page. It then transmits the network address for said desired web page to said remotely located voice activated client computer. When the voice activated client computer receives from the voice bridge server a network address for the web page corresponding to the audio pointer, it transmits a request for the desired web page to a remotely located web page server identified by the network address, and receives said web page from said remotely located web page server for presentation by a browser program. In a preferred embodiment, the system includes a plurality of voice bridge servers, each for handling audio pointers in one specific language. In this embodiment, each voice activated client computer includes a language selection mechanism for allowing the user to select a desired language for the audio pointers. The voice activated client computer also includes a voice bridge selection mechanism which, based upon the selected language, selects a corresponding remote voice bridge server. The communication mechanism of the voice activated client computer then transmits each recorded audio pointer to the corresponding voice bridge server for that language.
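A minimal sketch of the lookup performed by the voice bridge server follows; the feature-vector representation, the similarity measure, and the example URLs are assumptions made only to keep the sketch self-contained, not the patent's actual matching method.

```python
# Illustrative sketch of the voice bridge lookup: for each registered web page the server
# keeps an audio pointer (here reduced to a toy feature vector) and a URL, and returns the
# URL whose stored pointer is closest to the received recording.
def similarity(a, b):
    # toy measure: compare equal-length feature sequences element by element
    return -sum((x - y) ** 2 for x, y in zip(a, b))

audio_pointer_db = {
    # stored audio pointer (feature vector)  ->  registered network address
    (0.1, 0.9, 0.3): "http://www.example.com/",
    (0.8, 0.2, 0.5): "http://www.example.org/",
}

def lookup(received_pointer):
    best = max(audio_pointer_db, key=lambda stored: similarity(stored, received_pointer))
    return audio_pointer_db[best]

print(lookup((0.15, 0.85, 0.35)))   # -> http://www.example.com/
```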
In other embodiments, voice bridge servers can be deployed based on geographic regions to service clients in those regions. Each such server may contain multiple databases, each for handling a different language commonly used in said region. BRIEF DESCRIPTION OF THE DRAWINGS These and other objects, features and advantages of the invention will be more readily apparent from the following detailed description of presently preferred embodiments and the appended claims, with reference to the accompanying drawings, where like numerals represent like parts, and in which:
FIG. 1 is a block diagram of a plurality of conventional client computers, voice activated client computers, conventional web servers and voice bridge server computers coupled to a communication network.
FIG. 2 is a block diagram of an illustrative voice activated client computer. FIG. 3 is a block diagram of an exemplary voice bridge server computer.
FIG. 4 is a flow-chart depicting the flow of execution in a preferred voice activated client computer configuration.
FIG. 5 is a flow-chart depicting the operation of a voice bridge server computer. FIG. 6 is a flow-chart of procedures performed by a voice activated client computer once the voice bridge server returns the search results based upon the user's spoken audio pointers.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
System Overview FIG. 1 depicts a plurality of conventional client computers 100, conventional web servers 102, voice activated client computers 104 and voice bridge server computers 106 all connected to a communication network such as the Internet 108.
The web servers 102 are conventional computer servers which are configured to store and disseminate web pages. Conventional client computers 100 transmit requests for these web pages directly to web servers 102, as is well known in the art. In response to such requests, the web servers transmit web pages over the Internet 108 for display at the requesting client computer 100.
Voice activated client computer 104 Referring now to FIG. 2, in a preferred embodiment, each voice activated client computer 104 includes a processor 200 (which includes a microprocessor such as the Pentium® III and memory such as semiconductor memory), a display 202 such as a CRT or a flat-panel display, a storage device 204 such as a fixed disk drive, and input devices 208 such as a keyboard or a pointing device such as a mouse. It also includes a speech input device such as a microphone 206, and an audio output device 210, both of which are connected to a DSP 224 that handles coding, decoding, and data I/O between the audio hardware and the software that utilizes it.
Further, each client computer 104 is also equipped with a communication mechanism 212 that enables communication with other devices connected to networks, such as the Internet 108 via communication protocols such as TCP/IP. In general, the connection to the Internet 108 can be established via an Internet Service Provider (ISP) such as America Online, Inc., or via an office Local Area Network. In a preferred embodiment, a transmitter and receiver are both included in a network card.
Each voice activated client computer 104 includes a browser program 220 for accessing web pages from web servers 102 in the same manner as conventional client computers 100. Each voice activated client computer 104 is enhanced according to the principles of the present invention to include a mechanism that allows a user to request web pages by speaking a predetermined name or phrase. Accordingly, as shown in FIG. 2, each voice activated client computer 104 includes a Client Side Software (CSS) program 222. Though a conventional computer such as a personal computer is illustratively depicted as the voice activated client computer 104, in alternative embodiments, the present inventive principles can be practiced by enhancing devices such as the Web-TV™ device marketed by Microsoft Corporation of Redmond, Washington; a hand-held computer such as the Palm Pilot™ marketed by 3-COM Corporation of Santa Clara, California; the AOL- TV™ device marketed by America Online, Inc. of Dulles, Virginia; or the device used in conjunction with the Wireless Web™ service from Sprint Corporation of Westwood, Kansas.
Voice Bridge Server Computer 106 Referring to FIG. 3, each voice bridge server computer 106 is preferably implemented by installing voice bridge application software onto a general purpose computer that includes a processor 300 (including a microprocessor such as an Alpha™ microprocessor and associated semiconductor memory), a storage device 302 such as a fixed disk drive, and a data communications device 304 such as a 3-COM™ network card to connect to the Internet 108 using a protocol such as (TCP/IP). In general, the connection to the Internet 108 can be established via an ISP such as America Online, Inc., or a direct connection such as a T-l or a Digital Subscriber Line connection. Further, in a preferred embodiment, the voice bridge server 106 includes one or more database(s) 314 such as the Oracle Relational Database Management System.
The voice bridge server computer 106 is additionally configured to execute an operating system such as Windows-NT™ or Linux, and web server software such as the one marketed by Netscape® Corporation of Sunnyvale, California or obtainable from sources such as Apache. The web server software is configured to interface with a communication device to receive packets of messages from computers connected to the Internet 108, decipher the information in the packets, and act according to instructions provided in the packets.
The processor 300 executes a Server Side Software (SSS) program 310 and a Voice Bridge Application program (VBA) 312. Preferably, the VBA 312 includes a voice recognition module (not shown) and a database 314 containing samples of audio pointers that sponsors would prefer to use to designate their web sites. The voice recognition module can be purchased or licensed from companies that specialize in the analysis of spoken waveforms. The voice recognition module is preferably tuned to a lower accuracy setting to accommodate a larger number of users with distinct speaking styles or accents. In preferred embodiments, the SSS 310 and the VBA 312 are written in standard programming languages such as Java or C++, or implemented in part as middleware components that interface with the database 314.
Operation of the Invention In a preferred embodiment, the present invention is implemented in part on the voice activated client computer 104 and in part on the voice bridge server computer 106.
Operation of the Voice Activated Client Computer 104 Referring to FIG. 4, which depicts the steps performed by the voice activated client computer 104: as a first step, a user installs the CSS 222 on a standard client computer 104 (step 400) equipped with a DSP and a microphone. This installation can be accomplished by any of the standard methods known to persons skilled in the art, which include installation via a portable medium such as a CD-ROM or floppy disk, or downloading via the Internet 108. Such a system is referred to as a voice activated client computer. Once the CSS 222 is installed on the voice activated client computer 104, it displays an icon on the desktop of the voice activated client computer's display screen 202. Then the user activates or "launches" the CSS 222 by using the mouse and clicking once or twice on the icon (step 402). When launched, the CSS 222 preferably opens a window on the voice activated client computer 104 and prompts the user for a preferred language selection by displaying a language selection list (step 404). In a preferred embodiment, this selection list includes a menu of options selectable by the user, radio buttons, and/or check boxes and the like. Alternatively, if a user so desires, a preferred language can be selected while installing the CSS 222. In this case, any language choice that is so selected is stored in the client computer 104 as a "default" language. When the user thereafter launches the CSS 222, the default language is assumed to be selected and therefore no further selection is necessary. However, if after the CSS 222 is launched the user wishes to change the language selected, he can override the default language by selecting a different language from a selection box as explained above.
After the user selects a language, the CSS 222 assigns and stores internally a user identifier for the user (step 406). This user identifier is preferably a unique identifier, which could be established by generating a random number and appending the random number to a unique number on the client computer 104. This unique number is obtained from the user's computer license key, network card identifier, a transformation of the user's name or other such method. This step is only applicable when the CSS runs for the first time.
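A minimal sketch of one way to form such an identifier, assuming the network card (MAC) address is used as the machine-unique number; the exact scheme and format are assumptions.

```python
# Sketch of the user-identifier step described above: a random number appended to a number
# unique to the client machine. Using the network card (MAC) address is one of the options
# the text mentions; the exact format below is an assumption.
import random
import uuid

machine_number = uuid.getnode()                          # 48-bit number derived from the network card
user_identifier = f"{machine_number:012x}-{random.getrandbits(32):08x}"
print(user_identifier)                                   # stored locally and sent with each request
```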
The CSS 222 then prompts the user for an audio pointer in the selected language which identifies a web page which the user wishes to view (step 408). This prompt is preferably an audio prompt or a visual prompt. In the case of an audio prompt, the user is beeped to indicate when to start speaking. In the case of a visual prompt, the user is shown a window with a pre-designated area comprising a button to press before speaking the phrase. The CSS 222 also instructs or activates the DSP 224 to start listening to the speaker's voice as received by the microphone 206 (step 410). Preferably, the DSP 224 is configured to wait and listen for a predetermined (expiration) period of time, for example, 2-3 seconds before timing out. If the user speaks the audio pointer during this time period, the CSS 222 instructs the DSP 224 to record the audio pointer in a file on the fixed disk drive 204 (steps 412, 416). In one embodiment, the stored audio pointer is a wave format (with a ".wav" extension) file or an audio format (with a ".au" extension) file that is encoded according to a method known to persons skilled in the art. If, on the other hand, the user does not speak a phrase during the expiration period, the CSS 222 expires (times out) and prompts the user again (steps 412, 414).
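The record-or-timeout behavior of steps 410-416 might look roughly like the following sketch; real audio capture depends on the DSP and its driver, so capture_from_dsp() is a hypothetical stand-in that returns silence to keep the example self-contained.

```python
# Sketch of steps 410-416: wait a few seconds for speech, then save the utterance as a
# ".wav" file. capture_from_dsp() is a placeholder; a real implementation would read PCM
# samples from the microphone/DSP.
import struct
import wave

SAMPLE_RATE = 8000
TIMEOUT_SECONDS = 3                      # the "expiration period" described above

def capture_from_dsp(seconds, rate):
    return [0] * (seconds * rate)        # placeholder: silent 16-bit samples

samples = capture_from_dsp(TIMEOUT_SECONDS, SAMPLE_RATE)
if any(abs(s) > 500 for s in samples):                         # crude "did the user speak?" check
    with wave.open("audio_pointer.wav", "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)                               # 16-bit samples
        wav_file.setframerate(SAMPLE_RATE)
        wav_file.writeframes(struct.pack(f"<{len(samples)}h", *samples))
else:
    print("No speech detected; prompt the user again (step 414).")
```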
The CSS 222 then verifies whether a network connection is established for the voice activated client computer 104 (step 418). Preferably, this verification is done by sending a "ping" signal to a remote computer such as the voice bridge server 106 over the network 108. If a connection is not present, the CSS 222 instructs the voice activated client computer to establish a network connection (step 420) via a standard method.
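A rough sketch of the connectivity check in step 418; since a true ICMP ping requires privileged sockets, a plain TCP connection attempt to an assumed voice bridge host stands in for the "ping" described.

```python
# Sketch of step 418: check whether the network is reachable by attempting a TCP connection
# to an assumed voice bridge host. The hostname and port are placeholders.
import socket

def network_available(host="voicebridge.example.com", port=80, timeout=2.0):
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if not network_available():
    print("No connection; establish one via the standard method (step 420).")
```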
Based on the language selected by the user, the CSS 222 identifies a particular voice bridge server computer 106 (step 422) from among a plurality of voice bridge servers 106, each of which is designated to serve a different language. It should be noted, however, that there may be only one voice bridge server 106 designated to service a plurality of languages, or a particular language group comprising a plurality of dialects, in which case the CSS will be requesting data from one of many databases on a voice bridge server. Other embodiments, which are easily understood by persons skilled in the art, include a plurality of server computers that collectively comprise a single voice bridge server 106.
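Step 422 amounts to a simple language-to-server mapping, sketched below; the hostnames are invented placeholders, and a single server holding per-language databases would satisfy the text equally well.

```python
# Sketch of step 422: pick the voice bridge server for the selected language.
VOICE_BRIDGE_SERVERS = {
    "zh": "zh.voicebridge.example.com",
    "ko": "ko.voicebridge.example.com",
    "pt": "pt.voicebridge.example.com",
}

def select_voice_bridge(language, default="voicebridge.example.com"):
    return VOICE_BRIDGE_SERVERS.get(language, default)

print(select_voice_bridge("zh"))
```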
The CSS 222 transmits the user's unique identifier and the recorded audio pointer to the voice bridge server 106 (step 424). This is accomplished by using a method such as file transfer protocol (ftp), HTTP, and others known to persons of ordinary skill in the art. The CSS then removes the file from the client computer.
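One plausible realization of step 424, assuming HTTP is the chosen transport; the endpoint URL, header names, and payload layout are assumptions, as the text only requires ftp, HTTP, or a similar method.

```python
# Sketch of step 424: send the user identifier and the recorded audio pointer to the voice
# bridge server over HTTP, then delete the local file, as described above.
import os
import urllib.request

def send_audio_pointer(user_identifier, wav_path,
                       server="http://voicebridge.example.com/lookup"):
    with open(wav_path, "rb") as f:
        body = f.read()
    request = urllib.request.Request(
        server,
        data=body,
        headers={"Content-Type": "audio/wav", "X-User-Identifier": user_identifier},
    )
    with urllib.request.urlopen(request) as response:
        result = response.read()         # the voice bridge's reply (matched address or addresses)
    os.remove(wav_path)                  # the CSS then removes the file from the client computer
    return result
```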
Operation of the Voice Bridge Server Computer 106 Referring now to FIG. 5, during an initialization step 550, a voice bridge service provider (VBSP), which is the entity that operates the voice bridge server 106, invites other entities — organizations, individuals or companies (hereinafter collectively referred to as "sponsors") — to do business with the VBSP. In order to do business with the VBSP, the sponsor first opens an account with the VBSP by filling out an online or paper form, which requests the sponsor's name, address, billing information and the like. Once a sponsor fills out the form and (optionally) pays a prescribed fee, a sponsor account is established and recorded in the database 314.
After registration, the sponsor may provide a plurality of network addresses (e.g., URLs) for registration with the VBSP. For each network address that the sponsor wishes to register, the sponsor preferably provides the address (either in a symbolic form or in a dotted decimal notation) and a corresponding word or phrase in a particular language of interest, which the system will use as an "audio pointer" to the network address. In some cases, the sponsor may provide at least one audio pointer for each of a plurality of languages of interest. This selected audio pointer need not have any relationship to the contents of the network address; it is only a symbolic name — such as a nickname or a memorable phrase — which the sponsor wishes to designate as the name for that network address. The audio pointer can be arbitrarily chosen and need not be related in any way to the particular network address; it simply identifies that network address. Preferably, the audio pointer is recorded by a native speaker of the selected language(s) of interest in a particular vernacular or dialect.
For each network address registered by a sponsor, the SSS 310 stores the address together with the corresponding designated audio pointer in the audio pointer database 314 (step 552). A single sponsor may also register a plurality of audio pointers for a given web page, or alternatively a single audio pointer may be associated with a list of addresses.
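A sketch of the registration record kept in step 552, using an in-memory SQLite table purely to keep the example self-contained (the text mentions a relational database such as Oracle); the column names are assumptions.

```python
# Sketch of step 552: store each registered network address with its designated audio pointer.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE audio_pointers (
    sponsor_id   INTEGER,
    language     TEXT,
    audio_sample BLOB,     -- recorded pronunciation of the audio pointer
    url          TEXT      -- registered network address
)""")
db.execute(
    "INSERT INTO audio_pointers VALUES (?, ?, ?, ?)",
    (1, "pt", b"...waveform bytes...", "http://www.example.com.br/"),
)
db.commit()
```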
Since an aspect of the present invention is to match sounds in an entire phrase with similar stored sound samples, rather than recognizing the exact words spoken by a user, confusing or like-sounding words or phrases — such as "Apple Computer" and "April Computer," wherein the latter phrase is the former phrase mis-pronounced — are either removed from or not entered into the database 314. By thus avoiding confusing phrases, the accuracy of the matching process at a later stage is enhanced (step 554).
The SSS 140 "listens" for any incoming requests from voice activated client computers 106 (Step 556). When, as described in step 424, the client computer 104 transmits the audio pointer and the user's identifier to the voice bridge server 106, the voice bridge server 106 receives them and stores them in memory (step 558).
In one embodiment, the SSS 310 searches database 314 to determine if the VBSP has a prior usage record for the user identifier (step 559). If no prior usage record exists, the SSS 310 instructs the VBA 312 to store the user's identifier in user access records maintained on disk 302 (step 560). If the user has an existing user access record, then it is retrieved from disk 302 to assist in interpreting the new request.
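Steps 559-560 reduce to a lookup-or-create on the user access records; a minimal sketch, with an assumed record layout, follows.

```python
# Sketch of steps 559-560: look up a prior usage record for the user identifier, creating an
# empty one on first contact. The record layout (a list of previously matched URLs) is an
# assumption.
user_access_records = {}          # user identifier -> list of previously matched URLs

def get_or_create_access_record(user_identifier):
    if user_identifier not in user_access_records:
        user_access_records[user_identifier] = []      # first visit: start an empty record
    return user_access_records[user_identifier]
```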
After receiving the user's audio pointer as in step 558, the SSS 310 transfers control of execution to the VBA 312 (step 562). The VBA 312 preferably removes from the received audio pointer any leading or trailing silence and compares this received pointer to the audio pointers stored in the database 314 to locate the stored audio pointers which correspond to the received audio pointer (step 566). In other embodiments, intra-phrase silence is also removed.
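The leading/trailing silence removal in step 566 can be sketched as a simple amplitude-threshold trim; the threshold and raw-sample representation are assumptions, and a production system would more likely work on framed energy.

```python
# Sketch of the silence removal described above: drop leading and trailing samples whose
# amplitude stays below a threshold.
def trim_silence(samples, threshold=500):
    start = 0
    end = len(samples)
    while start < end and abs(samples[start]) < threshold:
        start += 1
    while end > start and abs(samples[end - 1]) < threshold:
        end -= 1
    return samples[start:end]

print(trim_silence([0, 0, 12, 900, 1200, 700, 3, 0]))   # -> [900, 1200, 700]
```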
If the VBA finds a single audio pointer that matches the received audio pointer, the VBA 312 retrieves from the database 314 the corresponding network address (URL) or addresses and forwards it to the SSS 310 (step 566).
After the matching step, the SSS 310 transmits the corresponding network address or addresses to the voice activated client computer 104 (step 568) to permit the client computer to request the web page from the appropriate web server 102. The VBA 312 may find a plurality of registered audio pointers which seem to match the received audio pointer, for example, if two registered audio pointers sound similar (e.g., "April Computers" and "Apple Computers"). In this case, the SSS 310 gathers all the matched URLs and compares them to this user's user access record to determine the priority of all the matched URLs. The SSS then forwards all this information to the CSS in order for the user to pick the desired URL. It should be noted that these multiple matches could be the result either of a confusing or similar sounding name, or of the sponsor having registered a plurality of matching URLs to be returned whenever a user speaks a particular word or phrase. The former is the case when a user mis-pronounces a word which matches two close audio pointers and the match therefore results in two URLs. The latter could be the case, for example, when a Portuguese company registers two URLs, one for use in Portugal and another for use in Brazil. A user uttering the same audio pointer or phrase in the Portuguese language will therefore match both URLs, as intended by the sponsor, and these are returned to the user for a further selection by the user at the voice activated client computer 104.
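A sketch of how the multiple-match case might be prioritized using the user access record; the ranking rule (URLs the user has visited before rank higher) is an assumption, since the text only requires some priority derived from the record.

```python
# Sketch of the multiple-match case: order the matched URLs by how often they appear in the
# user's access record before returning them for the user to choose from.
def prioritize(matched_urls, access_record):
    visit_counts = {url: access_record.count(url) for url in matched_urls}
    return sorted(matched_urls, key=lambda url: visit_counts[url], reverse=True)

record = ["http://april.example.com/", "http://apple.example.com/", "http://apple.example.com/"]
print(prioritize(["http://apple.example.com/", "http://april.example.com/"], record))
# -> ['http://apple.example.com/', 'http://april.example.com/']
```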
Advantageously, the SSS 310 maintains user access records of every access by a user of the voice bridge server 106 for use in assisting the voice recognition module in interpreting subsequent requests. The user access records typically include information identifying the web pages which the user has accessed in the past. However, they may also include information regarding the speech characteristics of the user and other information useful to the voice recognition module. In alternative preferred embodiments, every input and output between the SSS 310 and the VBA 312 and between the client computer 104 and the voice bridge server 106 is also logged (step 570).
Operation of the Voice Activated Client Computer 104
Referring to FIG. 6, the CSS 222 receives from the voice bridge server 106 the network addresses which matched the audio pointer sent by the client computer to the voice bridge server (step 680). If a plurality of network addresses matched the audio pointer, the CSS 222 displays them in a selection list on the display device, arranged by priority, and allows the user to make a selection from the list (step 682).
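On the client side, steps 680 and 682 amount to receiving a ranked list of addresses and, when more than one is returned, letting the user choose. The console sketch below is illustrative only; a real CSS would drive a graphical selection list, and the function and parameter names are assumptions. The chosen address would then simply be handed to the browser, as described next.

```python
from typing import List

def choose_network_address(addresses: List[str]) -> str:
    """Steps 680-682 (illustrative): if several addresses matched the audio
    pointer, present them in priority order and let the user pick one."""
    if len(addresses) == 1:
        return addresses[0]
    for i, url in enumerate(addresses, start=1):
        print(f"{i}. {url}")  # stand-in for the on-screen selection list
    choice = int(input("Select the desired page: "))
    return addresses[choice - 1]
```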
The CSS 222 then instructs the browser 220 to send a request for the desired web page to the selected network address supplied by the voice bridge server 106 (step 684). The selected page is received from the remote web server 102 and displayed on the display device.

The foregoing discloses a method for providing web access based on a user-selectable language, which language need not use an alphabet system such as the Roman alphabet. Persons skilled in the art may make several modifications and rearrangements without departing from the spirit and scope of the invention and without undue experimentation. For example, some of the steps are optional, and the order of the steps described herein can be changed. The functions performed by the VBA 312 and the SSS 310 could be performed by hardware or software according to any other method as designed by a person skilled in the art. The web browser can be any content viewer, and the web page can be any content available on the Internet. All such deviations, departures, modifications and rearrangements should be construed to be within the spirit and scope of the appended claims.

Claims

What is claimed is:
1. A voice activated client computer for retrieving web pages from a plurality of server computers connected to a public communication network, comprising:
a presentation device for presenting the content of web pages to a user;
a voice name recording mechanism comprising a microphone and a recording mechanism for recording from said microphone audio pointers which represent names assigned to desired web pages; and
a communication mechanism for communicating with at least one remote server computer connected to said public communication network to obtain said desired web pages,
wherein said communication mechanism transmits said recorded audio pointer to a remote voice bridge server connected to said public communication network, receives from said remote voice bridge server a network address or addresses for said web page or pages corresponding to such audio pointers, transmits a request for said web page or pages to a remote server identified by said network address, and receives said web pages from said remote web page server for presentation by said presentation device.
2. The voice activated client computer of claim 1 wherein said network address is a uniform resource locator.
3. The voice activated client computer of claim 1 wherein said presentation device comprises a display for displaying the content of said web pages.
4. The voice activated client computer of claim 1 wherein said presentation device comprises a loudspeaker for playing audio representations of the content of said web pages.
5. The voice activated client computer of claim 1 further comprising: a language selection mechanism for allowing the user to select a desired language for said audio pointer, a voice bridge selection mechanism which, based upon said selected language, selects a remote voice bridge server for said selected language, and wherein said communication mechanism transmits said recorded audio pointer to said selected remote voice bridge server.
6. A voice bridge server computer for supplying network addresses for each of a plurality of Internet web pages, comprising:
an audio pointer database for storing, for each of a plurality of Internet web pages, corresponding audio pointer data representative of a spoken name assigned to said web page and at least one network address for said page;
a receiving mechanism for receiving from a remote client computer a recording of an audio pointer representative of the assigned name of a desired web page, and for receiving an identifier identifying the remote client computer;
a voice recognition module for comparing said received audio pointer from said client to data in said audio pointer database to determine the network address for said desired web page; and
a transmission mechanism for transmitting the network address for said desired web page to said remote client computer.
7. The voice bridge server computer of claim 6 wherein said network address is a uniform resource locator.
8. A system for retrieving web pages from a plurality of server computers connected to a public communication network, comprising: a plurality of voice activated client computers, and at least one voice bridge server computer,
wherein each said voice activated client computer comprises:
a presentation device for presenting the content of web pages to a user;
a voice name recording mechanism comprising a microphone and a recording mechanism for recording from said microphone an audio pointer which represents a spoken name assigned to a desired web page; and
a communication mechanism for communicating with a remote server computer connected to said public communication network to obtain said desired web page, wherein said communication mechanism transmits said recorded audio pointer to a remote voice bridge server connected to said public communication network, receives from said remote voice bridge server a network address for said web page corresponding to said audio pointer, transmits a request for said web page to a remote server identified by said network address, and receives said web page from said remote web page server for presentation by said presentation device,
and wherein said at least one voice bridge server comprises:
an audio pointer database for storing, for each of a plurality of Internet web pages, corresponding sample audio pointer data representative of a spoken name assigned to said web page and at least one network address for said page;
a receiving mechanism for receiving from a remote client computer a recording of an audio pointer assigned to said desired web page, and an identifier identifying the remote client computer;
a voice recognition module for comparing said received recording to data in said audio pointer database to determine the network address for said desired web page; and
a transmission mechanism for transmitting the network address for said desired web page to said remote client computer.
The system of claim 8 further comprising a user access record store which includes, for each user, the user's history of accessed web pages and vocal characteristics information useful to the voice recognition module for purposes of increasing accuracy.
9. The system of claim 8 wherein said presentation device comprises a display for displaying the content of said web pages.
10. The system of claim 8 wherein said presentation device comprises a loudspeaker for playing audio representations of the content of said web pages.
11. The system of claim 8 further comprising a plurality of voice bridge servers, each maintaining an audio pointer database for a given language, and wherein each said voice activated client computer further comprises: a language selection mechanism for allowing the user to select a desired language for said audio pointer, a voice bridge selection mechanism which, based upon said selected language, selects a remote voice bridge server for said selected language, and wherein said communication mechanism transmits said recorded audio pointer to said selected remote voice bridge server.
12. An apparatus for delivering a web page to a client computer connected to a communication network, comprising: a processor; a web browser program executed by the processor; a microphone; a display device; a first transmitting device; and a first receiving device,
wherein the processor is configured to allow a user to select a language, to assign a unique user identifier to the user, to receive an audio pointer via the microphone from the user in the selected language, said audio pointer comprising a term recognizable in the selected language, to locate an appropriate voice bridge server computer based on the language selected by the user, to transmit the user identifier and the audio pointer via the first transmitting device over the communication network to said server computer, to receive via the first receiving device a Universal Resource Locator (URL) from said voice bridge server computer, whereupon a web page located at the URL is displayed in the web browser program on the display device.
13. The apparatus of claim 12, wherein the web page is designed in an HTML format.
14. The apparatus of claim 12, wherein the distantly located computer comprises: a second processor, said processor communicatively coupled to
(a) a second receiving device;
(b) a matching device;
(c) a mapping device;
(d) a delivering device;
(e) a memory device; and
(f) a data gathering device,
wherein the memory device comprises executable code to instruct the second processor to receive the audio pointer from the client computer via the second receiving device, to instruct the matching device to match the audio pointer with a matching sample of such audio pointer, to instruct the mapping device to map the matching speech sample to a Universal Resource Locator (URL) that does not necessarily have any relationship to the audio pointer, to instruct the delivering device to deliver a web page pointed to by the URL to the client computer, and wherein the data gathering device will record the entire process for summary or to assist in matching, including the identity of the desired web page, the user identifier, and vocal characteristics of the user.
15. A user-friendly system to provide Internet access, the system comprising:
an audio pointer database, said database comprising: at least one numerical value derived from at least one audio pointer spoken by a native speaker of a given language; and at least one network address corresponding to the numerical value;
a server computer communicatively coupled to a client computer via a communications network, and to said database, the client computer comprising: a client computer microphone to receive an audio pointer from a user; a client computer transmitter to transmit the audio pointer as spoken by the user to the server computer; and a client computer receiver to receive a web page from the Internet;
a server receiver coupled to the voice bridge server computer, said voice bridge server receiver configured to receive the audio pointer transmitted from the client computer; and
a matching device coupled to the voice bridge server computer, said matching device configured to derive a numerical value from the audio pointer and to match it to a network address by matching an entry in said audio pointer database,
wherein the server computer retrieves a web page located at the network address and transmits it to the client computer.
16. The system of claim 15, wherein said client computer is communicatively coupled to said communications network via a wire-line or a wireless connection.
17. The system of claim 15, wherein said network address is a Universal Resource Locator.
18. The system of claim 15, wherein the matching device comprises computer- executable software code stored on a computer-readable medium.
19. A user-friendly method of providing web access comprising the steps of: specifying a language; receiving an audio pointer from a speaker in the specified language; locating a server computer based on the selection of the language; matching the audio pointer to a network address by referring to an audio pointer database; retrieving a web page located at the network address; and displaying said web page.

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2001233149A AU2001233149A1 (en) 2000-02-09 2001-01-30 Method and apparatus for accessing web pages

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US50130100A 2000-02-09 2000-02-09
US09/501,301 2000-02-09

Publications (2)

Publication Number Publication Date
WO2001059541A2 true WO2001059541A2 (en) 2001-08-16
WO2001059541A3 WO2001059541A3 (en) 2006-05-18

Family

ID=23992961

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2001/003067 WO2001059541A2 (en) 2000-02-09 2001-01-30 Method and apparatus for accessing web pages

Country Status (2)

Country Link
AU (1) AU2001233149A1 (en)
WO (1) WO2001059541A2 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10424147B2 (en) 2011-03-03 2019-09-24 J.J. Mackay Canada Limited Parking meter with contactless payment

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5974449A (en) * 1997-05-09 1999-10-26 Carmel Connection, Inc. Apparatus and method for providing multimedia messaging between disparate messaging platforms
US6157705A (en) * 1997-12-05 2000-12-05 E*Trade Group, Inc. Voice control of a server

Also Published As

Publication number Publication date
WO2001059541A3 (en) 2006-05-18
AU2001233149A8 (en) 2006-11-02
AU2001233149A1 (en) 2001-08-20

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase in:

Ref country code: JP

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
DPE2 Request for preliminary examination filed before expiration of 19th month from priority date (pct application filed from 20040101)