USRE42904E1 - System and apparatus for dynamically generating audible notices from an information network - Google Patents

System and apparatus for dynamically generating audible notices from an information network

Info

Publication number
USRE42904E1
Authority
US
United States
Prior art keywords
information
program instructions
format
phonemes
client
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
US11/119,493
Inventor
James H. Stephens, Jr.
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zarbana Digital Fund LLC
Original Assignee
Frederick Monocacy LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US09/427,233 external-priority patent/US6557026B1/en
Application filed by Frederick Monocacy LLC filed Critical Frederick Monocacy LLC
Priority to US11/119,493 priority Critical patent/USRE42904E1/en
Assigned to FREDERICK MONOCACY LLC reassignment FREDERICK MONOCACY LLC ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: MORPHISM, L.L.C.
Assigned to MORPHISM, L.L.C. reassignment MORPHISM, L.L.C. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: STEPHENS, JAMES H.
Application granted granted Critical
Publication of USRE42904E1 publication Critical patent/USRE42904E1/en
Assigned to ZARBAÑA DIGITAL FUND LLC reassignment ZARBAÑA DIGITAL FUND LLC MERGER (SEE DOCUMENT FOR DETAILS). Assignors: FREDERICK MONOCACY LLC
Anticipated expiration legal-status Critical
Expired - Lifetime legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/487 Arrangements for providing information services, e.g. recorded voice services or time announcements
    • H04M3/493 Interactive information services, e.g. directory enquiries; Arrangements therefor, e.g. interactive voice response [IVR] systems or voice portals
    • H04M3/4938 Interactive information services, e.g. directory enquiries; Arrangements therefor, e.g. interactive voice response [IVR] systems or voice portals comprising a voice browser which renders and interprets, e.g. VoiceXML
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2201/00 Electronic components, circuits, software, systems or apparatus used in telephone systems
    • H04M2201/60 Medium conversion

Definitions

  • This invention relates generally to devices for browsing information on an information network. More specifically, this invention relates to an apparatus and system for receiving personalized information from an information network in audio format using distributed text-to-speech processing.
  • a number of different information networks are available that allow access to information contained on their computers, with the Internet being one that is generally known to the public.
  • the capabilities, usefulness, and amount of information available from information networks are ever-increasing.
  • users often subscribe to one or more information services that are accessible via an information network.
  • a user must browse the information network for information that is of interest to them.
  • a user must interrupt their use of an application program, such as spreadsheets or word processing programs, to browse the information network.
  • Even messages sent from information networks to users via e-mail or instant messaging facilities require the user to take specific action to learn the content of the messages.
  • Although subscription services and portal services allow a user to customize the format and, to a certain extent, the content of the information provided, a user must still manually navigate to the various sources of information to see if there is anything of interest to them. Still further, a user often has to sift through a lot of information that is of no interest to them, thereby consuming more time than necessary.
  • Another drawback to current capabilities is that the user typically is not informed immediately when information of interest becomes available, but rather, must enter commands to browse the information sources, and therefore may not receive information of interest as soon as it is available.
  • the information is typically in conventional orthography and the output is synthetic speech.
  • the input is provided in the form of a digital signal which represents the characters of conventional orthography.
  • the primary output is also a digital signal representing an acoustic waveform corresponding to the synthetic speech.
  • Digital-to-analog conversion is a well known technique for producing analog signals which can drive audio speakers.
  • the signal may have any convenient implementation, e.g. electrical, magnetic, electromagnetic or optical.
  • Speech converters usually include two major sub-units namely an analyzer and a synthesizer.
  • the analyzer divides the original input signal into small textual elements.
  • the synthesizer converts each of these small elements into a short segment of digital waveform and it also joins these together to produce the output.
  • One category of linguistic processors is designated as “converters” in that they change the nature of the symbols utilized. For example, a “converter” alters a signal representing a word or other linguistic element in graphemes into a signal representing the same element in phonemes using a grapheme to phoneme dictionary.
  • This dictionary requires a large amount of storage space, and it is therefore preferable to store and maintain one dictionary in a central location, such as a network server, so that it may be accessed by several users, instead of storing and maintaining separate copies of the dictionary on each user's workstation.
  • the benefits of maintaining large resources on servers are both ease of maintenance and reduced client system resource requirements. Further, converting the phonemes to an audio signal generates a large amount of data, and transferring the data in audio format requires a large amount of bandwidth.
  • TTS engines that use different algorithms for transforming text data to audio data.
  • these other TTS engines also involve converting text data to an intermediate format that requires less storage than the data in audio format. Therefore, it is also desirable to distribute other types of TTS engines between at least two data processors in a manner which optimizes processing time, data transfer, and storage space efficiency.
  • the present invention provides a system for converting information from a text format to an audio format, wherein the text to speech conversion is distributed among two or more data processors.
  • One data processor executes a first set of program instructions to receive information in text format from a data source, to convert the information from the text format to an intermediate format, such as phonemes, and to transmit the information in the intermediate format to the second data processor.
  • the second data processor executes a second set of program instructions to convert the information from the intermediate format to the audio format.
  • the first data processor, such as a network server, includes one or more databases to aid TTS synthesis, such as one or more grapheme to phoneme dictionaries, that are accessible by multiple users.
  • the second data processor is a client side data processor, such as a client workstation.
  • the present invention provides a computer program product for dynamically generating audible notices from an information network using distributed text to speech processing.
  • the information network includes a client processor and a remote processor, such as a network server.
  • the computer program product includes a first set of program instructions that are executed on the remote processor that generate an intermediate representation of the information, such as a phonemic representation.
  • the computer program product further includes a second set of program instructions that are executed on the client side processor that allow a user to preselect at least one data source that is accessible from the information network, to receive information from the at least one preselected data source, and to convert the information from a text format to an audio format based on the intermediate representation of the information.
  • the first set of program instructions utilize a dictionary for translating graphemes to phonemes that is stored in a location that is accessible by the first set of program instructions.
  • the present invention provides a method for dynamically generating audible notices from an information network which includes preselecting at least one data source from the information network, receiving information from the at least one preselected data source, converting the information from a text format to an intermediate format in a remote processor, converting the information from the intermediate format to an audio format in a client processor, and transmitting audio signals representative of the information in audio format.
  • the text is converted into an intermediate phonemic representation using a dictionary for translating graphemes to phonemes.
  • the dictionary is stored in a location that is accessible by the remote processor.
  • the phonemes are converted to audio output signals in the client processor.
  • Each embodiment of the present invention distributes the text to speech processing so that multiple users can take advantage of resources requiring a large amount of storage space from a remote, centralized processor, such as a network server.
  • Intermediate processing of the information is performed at the remote processor to take advantage of the centralized resources, thus reducing the amount of data transfer from the remote processor to the client processor.
  • the information, in intermediate format, is then transferred to the client processor, where it is converted to audio output signals. This feature also advantageously reduces data transfer requirements, since audio output format typically requires a large amount of data storage compared to the intermediate format.
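The division of work described above can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the tiny dictionary, the phoneme symbols, the `_` word-boundary marker, and the stand-in "synthesizer" that only emits silent PCM of a plausible length. It is not the patent's implementation; it only shows why the phonemic payload sent over the network is much smaller than the audio produced on the client.

```python
# Hypothetical toy grapheme-to-phoneme dictionary. A real dictionary is large,
# which is why the text keeps it on the remote processor.
G2P_DICTIONARY = {
    "stock": ["S", "T", "AA", "K"],
    "price": ["P", "R", "AY", "S"],
    "up": ["AH", "P"],
}


def server_text_to_phonemes(text):
    """Remote side: convert text to the intermediate (phonemic) format."""
    phonemes = []
    for word in text.lower().split():
        # Unknown words are spelled out letter by letter in this sketch.
        phonemes.extend(G2P_DICTIONARY.get(word, list(word.upper())))
        phonemes.append("_")  # assumed word-boundary marker
    return phonemes


def client_phonemes_to_audio(phonemes, sample_rate=8000, ms_per_phoneme=80):
    """Client side: stand-in synthesizer returning silent 16-bit mono PCM."""
    samples_per_phoneme = sample_rate * ms_per_phoneme // 1000
    return bytes(2 * samples_per_phoneme * len(phonemes))


intermediate = server_text_to_phonemes("stock price up")
wire_payload = " ".join(intermediate).encode("ascii")  # what crosses the network
audio = client_phonemes_to_audio(intermediate)         # produced locally
print(len(wire_payload), "bytes sent;", len(audio), "bytes of audio generated")
```

Under these assumed parameters the intermediate payload is a few dozen bytes while the audio is several kilobytes, which is the bandwidth argument the description makes.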
  • FIG. 1 is a block diagram of a system for accessing an information network found in the prior art.
  • FIG. 1a is a block diagram of an example of a computer workstation found in the prior art with which the present invention may be utilized.
  • FIG. 2 is a block diagram of a two-tier architecture for providing speech-synthesized information in accordance with the present invention.
  • FIG. 3 is a block diagram of a three-tier architecture for providing speech-synthesized information in accordance with the present invention.
  • FIG. 4 is a block diagram of a two-tier architecture for providing speech-synthesized information with distributed text to speech processing in accordance with the present invention.
  • FIG. 5 is a block diagram of a three-tier architecture for providing speech-synthesized information with distributed text to speech processing in accordance with the present invention.
  • the method and apparatus of the present invention are applicable to devices that access a computerized information network.
  • a number of different information networks are available that allow access to information contained on their computers, with the Internet being one that is generally known to the public. While the Internet is used herein as an example of how the present invention is utilized, it is important to recognize that the present invention is also applicable to other information networks and information systems such as Intranets, database management systems, and document retrieval systems.
  • An example of a typical Internet connection 110 found in the prior art is shown in FIG. 1.
  • a user that wishes to access information on the Internet typically has a computer workstation 112 that executes an application program known as browser 114 .
  • Workstation 112 establishes a communication link 116 with web server 118 such as a dial-up wired connection with a modem, a direct link such as a T1 or ISDN line, or a wireless connection through a cellular or satellite network.
  • workstation 112 sends a request for information, such as a search for documents pertaining to a specified topic, or a specific web page to web server 118 .
  • Each web server 118 , 120 , 122 , 124 on the Internet has a known address which the user must supply to the browser 114 in order to connect to the appropriate web server 118 , 120 , 122 , or 124 . If the information is not available on the user's web server 118 , a central link such as backbone 126 allows web servers 118 , 120 , 122 , 124 to communicate with one another to supply the requested information. Because web servers 118 , 120 , 122 , 124 can contain more than one web page, the user will also specify in the address which particular web page he wants to view.
  • the web servers 118 , 120 , 122 , 124 execute a web server application program, often referred to as a portal, which monitors requests, services requests for the information on that particular web server, and transmits the information to the user's workstation 112 .
  • a display generated by browser 114 to present information provided by a program on the server side is then presented on computer workstation 112 .
  • the display typically includes one or more areas for the user to enter commands and to view the information presented.
  • a web page is primarily visual data that is intended to be displayed on the display device, such as the monitor of user's workstation 112 .
  • When web server 118 receives a web page request, it will transmit a document, generally written in a markup language such as hypertext markup language (HTML) or extensible markup language (XML), across communication link 116 to the requesting browser 114 .
  • Communication link 116 may be one or a combination of different data transmission systems, such as a direct dial-up modem connected to a telephone line, dedicated high-speed data links such as T1 or ISDN lines, and even wireless networks which transmit information via satellite or cellular networks.
  • Browser 114 interprets the markup language and outputs the web page to the monitor of user workstation 112 .
  • This web page displayed on the user's display may contain text, graphics, and links (which are addresses of other web pages). These other web pages (i.e., those represented by links) may be on the same or on different web servers 118 , 120 , 122 , 124 . The user can go to these other web pages by clicking on the links using a mouse or other pointing device.
  • When web server 118 receives a search request, the request is sent to the server containing the search engine specified by the user.
  • the search engine then compiles one or more pages containing a list of links to web pages on other web servers 120 , 122 , 124 that may contain information relevant to the user's request.
  • the search engine transmits the page(s) in markup language back to the requesting web server. This entire system of web pages with links to other web pages on other servers across the world is known as the “World Wide Web”.
  • Workstation 112 and/or web servers 118 are computer systems, such as computer system 130 as shown in FIG. 1a .
  • Computer system 130 includes central processing unit (CPU) 132 connected by host bus 134 to various components including main memory 136 , storage device controller 138 , network interface 140 , audio and video controllers 142 , and input/output devices 144 connected via input/output (I/O) controllers 146 .
  • this system encompasses all types of computer systems including, for example, mainframes, minicomputers, workstations, servers, personal computers, Internet terminals, network appliances, notebooks, palm tops, personal digital assistants, and embedded systems.
  • I/O peripheral devices often include speaker systems 152 , graphics devices 154 , and other I/O devices 144 such as display monitors, keyboards, mouse-type input devices, floppy and hard disk drives, DVD drives, CD-ROM drives, and printers.
  • Many computer systems also include network capability, terminal devices, modems, televisions, sound devices, voice recognition devices, electronic pen devices, and mass storage devices such as tape drives. The number of devices available to add to personal computer systems continues to grow; however, computer system 130 may include fewer components than shown in FIG. 1a and described herein.
  • the peripheral devices usually communicate with processor 132 over one or more buses 134 , 156 , 158 , with the buses communicating with each other through the use of one or more bridges 160 , 162 .
  • Computer system 130 may be one of many workstations or servers connected to a network such as a local area network (LAN), a wide area network (WAN), or a global information network such as the Internet through network interface 140 .
  • CPU 132 can be constructed from one or more microprocessors and/or integrated circuits.
  • Main memory 136 stores programs and data that CPU 132 may access.
  • an operating system program is loaded into main memory 136 .
  • the operating system manages the resources of computer system 130 , such as CPU 132 , audio controller 142 , storage device controller 138 , network interface 140 , I/O controllers 146 , and host bus 134 .
  • the operating system reads one or more configuration files to determine the hardware and software resources connected to computer system 130 .
  • main memory 136 includes the operating system, configuration file, and one or more application programs with related program data.
  • Application programs can run with program data as input, and output their results as program data in main memory 136 or to one or more mass storage devices through a memory controller (not shown) and storage device controller 138 .
  • CPU 132 executes one or more application programs, including one or more programs to establish a connection to a computer network through network interface 140 .
  • the application programs may be embodied in one executable module or may be a collection of routines that are executed as required.
  • Operating systems commonly use “windows”, as well known in the art, to present information about or from an application program.
  • Each application program typically has its own window that is generated when the application program is executing.
  • Each window may be minimized to an icon, maximized to fill the display, overlaid in front of other windows, and underlaid behind other windows.
  • Storage device controller 138 allows computer system 130 to retrieve and store data from mass storage devices such as magnetic disks (hard disks, diskettes), and optical disks (DVD and CD-ROM).
  • the information from the DASD can be in many forms including application programs and program data.
  • Data retrieved through storage device controller 138 is usually placed in main memory 136 where CPU 132 can process it.
  • audio controller 142 is connected to PCI bus 156 in FIG. 1a , but may be connected to the ISA bus 158 or reside on the motherboard (not shown) in alternative embodiments.
  • Although computer system 130 is shown to contain only a single main CPU 132 and a single system bus 134 , those skilled in the art will appreciate that the present invention may be practiced using a computer system that has multiple CPUs 132 and/or multiple busses 134 .
  • the interfaces that are used in the preferred embodiment may include separate, fully programmed microprocessors that are used to off-load computationally intensive processing from CPU 132 , or may include input/output (I/O) adapters to perform similar functions.
  • PCI bus 156 is used as an exemplar of any input-output devices attached to any I/O bus;
  • AGP bus 159 is used as an exemplar of any graphics bus;
  • graphics device 154 is used as an exemplar of any graphics controller;
  • host-to-PCI bridge 160 and PCI-to-ISA bridge 162 are used as exemplars of any type of bridge. Consequently, as used herein the specific exemplars set forth in FIG. 1 are intended to be representative of their more general classes. In general, use of any specific exemplar herein is also intended to be representative of its class and the non-inclusion of such specific devices in the foregoing list should not be taken as indicating that limitation is desired.
  • FIG. 2 shows a block diagram of components included in one embodiment of notice system 200 for dynamically generating audible notices from an information network according to the present invention.
  • Notice system 200 allows a user to customize delivery of information based on, for example, the data source and a user's profile.
  • Notice system 200 provides the information in speech-synthesized format as well as on the user's workstation display as the information becomes available.
  • Notice system 200 may perform the following functions independently or in conjunction with other components in Internet connection 110 :
  • One benefit of notice system 200 is that the user does not have to monitor data sources manually because notice system 200 presents the headlines in audible format as they become available. The user does not have to take any action to receive up-to-date news as it appears, nor does the user have to interrupt his work to check data sources manually. For example, if a user subscribes to one or more services that provide world news and/or financial data sources, notice system 200 could be configured to report when the price of one or more specified stocks moves up or down by more than a given percent as the change is published by the stock quote data source. Further, the information will be output to the display associated with workstation 112 even when the window for notice system 200 is not visible on the user's screen.
  • When the user hears a spoken headline of interest, he or she can use the display generated by notice system 200 to access one or more hyperlinks leading to page(s) that contain the full story for the headline.
  • the user can specify criteria and parameters to prioritize reported stories, such criteria including, but not limited to user preferences, noteworthiness, and story metadata (e.g., a specified importance, expiration date, and/or urgency).
  • program instructions can be included in client 204 to monitor user behavior and generate criteria and parameters based on the user's previous interaction with notice system 200 .
  • Notice system 200 also presents this news in text format in a browser window, which need not be visible when the story arrives. As the data sources post news stories, notice system 200 announces the headlines. Notice system 200 includes one or more news summary pages listing all of the recent headlines. Each headline is a hyperlink to the web page that contains the full story. Optionally, summary pages may provide additional information with each headline. For example, the summary pages may include additional story text, graphics, or links.
  • Notice system 200 also includes text-to-speech (TTS) engine 208 , sound player 210 , data source monitor 212 , and data source story adapter 214 .
  • TTS engine 208 includes program instructions for synthesizing speech into a standard audio format from textual input, such as markup language, and is commercially available from a variety of manufacturers.
  • TTS engine 208 may reside in client 204 or be a component in remote services 216 , e.g., TTS engine 226 .
  • a “story” in notice system 200 includes some or all of the following components:
  • the story URL points to a web page (usually on the data source's site) that contains the full story.
  • Notice system 200 specifies a default set of data sources, such as data sources 218 , 220 , 222 .
  • a story can also define new data sources, however. By including an optional source definition, a story can announce the new sources of information to users.
  • Another optional component of a story is a set of one or more parameters, which some data sources require to access information.
  • a financial data source requires a stock symbol to retrieve price quotes for a particular stock.
  • Notice system 200 can accommodate zero, one, or more parameters for a particular data source.
  • a story may optionally contain a variety of other information such as an identification, a time stamp, the name of the author of a story, graphics, audio, video, advertisements, keywords, and categorization information. If a story does not have a time stamp, notice system 200 automatically assigns one to it.
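A story record along these lines could be modeled as follows. This is a sketch under stated assumptions: the class name `Story` and every field name are hypothetical, and only the components the text actually names (headline, story URL, optional parameters, optional source definition, time stamp) are represented. The one behavior taken directly from the text is that a story arriving without a time stamp is assigned one automatically.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Story:
    """Hypothetical in-memory form of a story; field names are illustrative."""
    headline: str
    url: str                                         # points to the full story
    parameters: dict = field(default_factory=dict)   # zero, one, or more parameters
    source_definition: Optional[str] = None          # set only for "source stories"
    timestamp: Optional[datetime] = None

    def __post_init__(self):
        # Assign a time stamp automatically when the data source supplied none.
        if self.timestamp is None:
            self.timestamp = datetime.now(timezone.utc)


quote = Story(
    headline="ACME up 2 percent",
    url="https://news.example/acme",      # placeholder URL
    parameters={"symbol": "ACME"},
)
print(quote.timestamp.isoformat())
```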
  • Client 204 outputs the story's headline in audible format using sound player 210 .
  • the story's headline may be marked up in a speech synthesis markup language.
  • Story formats are available from a virtually unlimited variety of subscriber and non-subscriber data sources, such as data sources 218 , 220 , and 222 .
  • Notice system 200 includes a syntax for a textual representation of a story. This story syntax is also referred to as “story format”.
  • Information that is in a foreign format (i.e., not in story format) from data sources 218 and 220 is converted to story format in data source story adapters 214 , 224 .
  • Stories that are supplied in story format, such as from data source 222 , do not require conversion.
  • Adapters 214 , 224 are usually designed to convert source from one specific foreign format to story format.
  • the syntax for story format is defined by an XML document type definition (DTD), which allows a developer to define keyword assignments for tags and their associated parameters, as known in the art.
  • data sources 218 , 220 may provide information in story format, or, alternatively, client 204 may include one or more adapters to convert information from foreign formats to story format.
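An adapter of the kind described above takes one specific foreign format and emits story format. The sketch below assumes a foreign format of `SYMBOL,PRICE` text lines and invents the element names `story`, `headline`, `url`, and `parameter`; the actual DTD is not given in the text, so these names, the function name, and the placeholder URL are all hypothetical.

```python
import xml.etree.ElementTree as ET


def csv_quote_to_story(line, quote_url_base="https://quotes.example/"):
    """Convert one 'SYMBOL,PRICE' line (assumed foreign format) to story XML."""
    symbol, price = [part.strip() for part in line.split(",")]
    story = ET.Element("story")
    ET.SubElement(story, "headline").text = f"{symbol} is trading at {price}"
    ET.SubElement(story, "url").text = quote_url_base + symbol
    # The stock symbol is the parameter this data source needs for lookups.
    ET.SubElement(story, "parameter", name="symbol").text = symbol
    return ET.tostring(story, encoding="unicode")


story_xml = csv_quote_to_story("ACME, 12.50")
print(story_xml)
```

Because each adapter handles exactly one foreign format, supporting a new data source means writing one more small function like this rather than changing the client.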
  • the present invention allows a user to specify one or more data sources 218 , 220 , 222 from which to receive information, as well as one or more noteworthiness criteria for selecting stories presented to the user by notice system 200 . If a data source has a noteworthiness criterion, notice system 200 reads a new story from that data source only if the story satisfies the criterion. The noteworthiness criteria that are available for selection are based on the type of information provided by a particular data source.
  • a stock quote data source noteworthiness criterion could be “price change greater than 1% from the last announced price”. If the data source supplies more than one criterion, the user can select a conjunction or disjunction of criteria. Furthermore, a criterion can be parameterized, in which case the user supplies one or more parameters. For example, “percentage change in trading volume” is a parameterized stock quote criterion. The user could specify a parameter of “2%” to be informed of a volume change greater than 102% or less than 98% of the previously reported volume.
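Parameterized criteria and their conjunction or disjunction can be sketched as small composable predicates. The function names `percent_change`, `all_of`, and `any_of`, and the dictionary layout of a quote, are assumptions made for this illustration; the thresholds mirror the 1% price and 2% volume examples in the text.

```python
def percent_change(field_name, threshold_percent):
    """Build a criterion: true when a field moved by more than the threshold."""
    def criterion(previous, current):
        old, new = previous[field_name], current[field_name]
        if old == 0:
            return new != 0
        return abs(new - old) / abs(old) * 100 > threshold_percent
    return criterion


def all_of(*criteria):
    """Conjunction: every criterion must be satisfied."""
    return lambda previous, current: all(c(previous, current) for c in criteria)


def any_of(*criteria):
    """Disjunction: at least one criterion must be satisfied."""
    return lambda previous, current: any(c(previous, current) for c in criteria)


price_moved = percent_change("price", 1)    # "price change greater than 1%"
volume_moved = percent_change("volume", 2)  # user-supplied parameter of 2%

last_reported = {"price": 100.0, "volume": 1000}
latest = {"price": 100.5, "volume": 1030}

noteworthy_either = any_of(price_moved, volume_moved)(last_reported, latest)
noteworthy_both = all_of(price_moved, volume_moved)(last_reported, latest)
```

Here the price moved by 0.5% and the volume by 3%, so the disjunction is satisfied and the conjunction is not.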
  • Data sources 218 , 220 , 222 publish stories and include the following components:
  • the description URL points to a web page that describes data source 218 , 220 , 222 .
  • Notice system 200 uses the stories URL to get the latest stories from data sources 218 , 220 , 222 .
  • the range of topics for stories is unlimited.
  • a product catalog can be specified as a data source.
  • the stories are announcements of new products, discontinued products, improved products, etc.
  • a weather forecast data source publishes forecast “stories”.
  • the automobile section of the classified advertisement section of a newspaper publishes classified ad “stories” about cars that are for sale.
  • a ticker tape publishes stock quote “stories”.
  • a user may specify a data source category, which is a group of related data sources.
  • a “World News” data source category would contain data sources for world news stories. It would also contain data source categories for different countries and/or regions of the world such as Asia and the Middle East. A data source may belong to zero or more data source categories.
  • Notice system 200 includes a default set of data sources 218 , 220 , 222 .
  • a story can define a new data source. Such stories are referred to as source stories.
  • a user reading a source story can subscribe to the source the story announces.
  • a user can also manually enter a definition for a web-based format source. The definition requires at least the URL for data source stories. If a data source adapter 214 , 224 is available, a user on a fat client notice system 200 can specify the location of the adapter. In this case, notice system 200 will download and install adapter 214 , 224 .
  • Client 204 includes browser 202 which interprets documents and scripts that are typically written in mark-up language. Client 204 generates a news page that is refreshed automatically via a ‘Refresh’ META tag or other mechanism for refreshing the display. The refresh rate can adapt to the rate of arrival of new stories or a refresh command may be pushed from miniserver 206 when a new story is sent to browser 202 . Client 204 also either plays audio served from remote TTS engine 226 , or the client invokes local TTS engine 208 to generate speech. If remote TTS engine 226 is used, browser 202 must be capable of playing audio. If local TTS engine 208 is used, either browser 202 , TTS engine 208 , or another set of program instructions in client 204 must be capable of playing audio.
  • Remote services 216 perform five primary functions: data source monitoring, data source management, data source interfacing, state management, and client services.
  • Notice system 200 includes capabilities for client 204 to pull stories from data sources 218 , 220 , 222 , and for remote services 216 to push stories to client 204 .
  • data source monitor 212 polls data sources 218 , 220 , 222 periodically to check the availability of new stories.
  • the polling schedules can be fairly complex including an adaptive scheduler, which increases the polling frequency with the rate of arrival of new stories.
  • the adaptive scheduler reduces the polling rate as the rate of arrival of new stories decreases.
  • Static schedulers are also included, for example, hourly polling during business hours.
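One simple way to realize an adaptive schedule is to shorten the polling interval when a poll returns new stories and lengthen it when it does not. The halving and 1.5x back-off factors, the bounds, and the function name are assumptions chosen for this sketch; the text does not specify the adaptation rule.

```python
def next_poll_interval(current_interval, new_stories,
                       min_interval=30.0, max_interval=3600.0):
    """Return the next polling interval in seconds for one data source."""
    if new_stories > 0:
        proposed = current_interval / 2      # stories arriving: poll more often
    else:
        proposed = current_interval * 1.5    # quiet source: back off
    # Keep the interval inside the configured bounds.
    return max(min_interval, min(max_interval, proposed))


interval = 600.0
for arrivals in [3, 2, 0, 0]:                # stories found on successive polls
    interval = next_poll_interval(interval, arrivals)
```

A static schedule, such as hourly polling during business hours, would bypass this function and use a fixed interval instead.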
  • Data source management includes the creation, modification, and deletion of data sources 218 , 220 , 222 .
  • Miniserver 206 manages state information including user registrations, subscriptions, data source definitions, stories, user preferences, user profiles, data source profiles, data source categories, and other information. Miniserver 206 stores most of the state information in relational databases.
  • Client services are all of the services notice system 200 requires including new story reports, subscription modifications, and user preferences modifications.
  • notice system 200 provides an optional auto-personalization feature whereby the user can choose to have notice system 200 model the user's interests. With this model, notice system 200 can automatically subscribe the user to sources relevant to the user's interests. Notice system 200 can also direct relevant stories to the user from data sources to which the user doesn't subscribe.
  • Notice system 200 can categorize data sources 218 , 220 , 222 with either explicit data (e.g., as part of a data source definition) or derived data (from, e.g., machine learning techniques). Notice system 200 may categorize stories as well. A story can belong to one or more story categories. Each data source 218 , 220 , 222 is a de facto story category. Notice system 200 can use any story data—or data derived from the story—to categorize it.
  • Notice system 200 also monitors and dynamically logs its overall state, including story arrival rates, errors, usage data, and other information.
  • Notice system 200 may serve audio advertisements with headlines. These audio ads can be personalized based on the headlines, the user's profile, and other information. Notice system 200 may also place advertisements on the summary pages served to client 204. The advertisements can be personalized based on the data source, current stories, the user's profile, and other information that may be customized by the user. Further, data sources 218, 220, 222 can also deliver ads in their data source markup language as “stories”, or in their stories.
  • A three-tier embodiment of the present invention for notice system 300 is shown in FIG. 3, including client 302, server 304, and remote services 306.
  • Notice system 300 provides capabilities and advantages that are virtually identical to notice system 200 including providing customized delivery of stories in speech-synthesized format as well as in a window on a display as the stories become available, auto-personalization, categorizing data sources 218 , 220 , 222 , and categorizing stories.
  • client 302 is a “thin” client architecture
  • client 204 in notice system 200 FIG. 2
  • Miniserver 206 provides enough functionality in client 204 to eliminate any requirement for a separate server, such as server 304 in notice system 300 .
  • Client 302 in notice system 300 further includes browser 308 , TTS engine 310 , and sound player 210 .
  • Server 304 includes miniserver 314 , data source story adapter 214 , and data source monitor 212 .
  • TTS engine 320 resides in server 304 , thereby replacing TTS engine 310 in client 302 .
  • the TTS engine may be located on the client side, (e.g. TTS engine 208 or 310 ) or in a computer system that is remote from the client side (e.g., TTS engine 226 or 320 ).
  • Two issues that arise when TTS is performed remote from the client side are the computational resources required to convert text to speech, and the bandwidth required to transfer speech data from the remote processor to the client side.
  • One alternative is to distribute TTS engines 208 , 310 throughout notice system 200 or 300 to reduce bandwidth and computational burden on a single TTS engine.
  • functions of TTS engines 208 , 226 , 310 , 320 can be broken down into a composition of functions g(f(x)).
  • One type of known TTS engine 208 , 226 , 310 , 320 involves expanding text (x) into phonemes in the function f(x) and requires a large dictionary for translating graphemes to phonemes.
  • a phoneme is a component part or unit in the pronunciation of a word in the sound system of a language.
  • the function g(f(x)) computes sounds that represent the phonemes and could be more computationally intensive compared to the function f(x). Converting the phonemes to representative sounds, also referred to as audio data, generates a large amount of data, even when audio compression schemes are utilized. Ideally, this conversion is performed on client side 204 , 302 to alleviate the need to transfer a large amount of audio data from remote services 216 or server 304 .
  • f(x) is distributed in TTS engine 426 where remote services 216 has storage capacity for the large word-to-phoneme dictionary 428 .
  • g(f(x)) is distributed in TTS engine 408 on client side 204 , thereby offloading heavy computational workload and data transfer requirements from remote services 216 .
  • f(x) is distributed in TTS engine 520 in server 304 , which has storage capacity for the large word-to-phoneme dictionary 522
  • g(f(x)) is distributed to TTS engine 510 on client side 302 , thereby offloading heavy computational workload and data transfer requirements from server 304 .
  • Server 304 performs data source monitoring via data source monitor 212 , as discussed hereinabove for notice system 200 .
  • Server 304 also manages state information including user registrations, subscriptions, data source definitions, stories, user preferences, user profiles, data source profiles, data source categories, and other information.
  • Server 304 stores most of the state information in relational databases.
  • server 304 may perform data source interfacing, such as converting information in a foreign format to story format using data source adapter 214 .
  • a required data source adapter such as data source adapter 224 , may reside in remote services 306 .
  • Notice system 300 also includes capabilities for client 302 to pull stories from data sources 218 , 220 , 222 , and for remote services 216 to push stories to client 302 , through server 304 .
  • data source monitor 212 polls data sources 218 , 220 , 222 periodically to check the availability of new stories in a manner similar to that described in the discussion for notice system 200 hereinabove.
  • Notice systems such as notice systems 200 and 300 may serve audio advertisements with headlines. These audio ads can be personalized based on the headlines, the user's profile, and other information. Notice systems 200, 300 may also place advertisements on the summary pages served to clients 204, 302, respectively. The advertisements can be personalized based on the data source, current stories, the user's profile, and other information that may be customized by the user. Further, data sources 218, 220, 222 can also deliver ads in their data source markup language as “stories”, or in their stories.
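The adaptive scheduler mentioned above for data source monitor 212 raises the polling frequency when new stories arrive and lowers it when they stop. The following is a minimal sketch of one possible policy, assuming a simple multiplicative adjustment; the class name `AdaptivePoller`, its bounds, and its step factor are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass


@dataclass
class AdaptivePoller:
    """Hypothetical adaptive scheduler: the polling interval follows story arrivals."""

    interval_s: float = 300.0       # current delay between polls (assumed default)
    min_interval_s: float = 30.0    # fastest polling allowed (assumed bound)
    max_interval_s: float = 3600.0  # slowest polling allowed (assumed bound)
    factor: float = 2.0             # multiplicative step (assumed)

    def update(self, new_stories: int) -> float:
        """Adjust the interval after a poll that returned `new_stories` items."""
        if new_stories > 0:
            # Stories are arriving: shorten the interval (poll more often).
            self.interval_s = max(self.min_interval_s, self.interval_s / self.factor)
        else:
            # Nothing new: lengthen the interval (poll less often).
            self.interval_s = min(self.max_interval_s, self.interval_s * self.factor)
        return self.interval_s


poller = AdaptivePoller()
after_busy_poll = poller.update(new_stories=3)   # interval shrinks
after_quiet_poll = poller.update(new_stories=0)  # interval grows back
```

A static schedule, such as hourly polling during business hours, would simply bypass `update` and keep a fixed interval.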

Abstract

An apparatus and method for converting information from a text format to an audio format using distributed processing. A first set of computer readable program instructions receive information from a data source, convert the information from the text format to an intermediate format, and transmit the information in the intermediate format to a second data processor. A second set of program instructions, executable on the second data processor, are also included to convert the information from the intermediate format to the audio format. The first set of program instructions are executed on a remote, or server side, data processor, while the second set of program instructions are executed on a client side data processor. The first set of program instructions expand the information in the text format into phonemes using a grapheme to phoneme dictionary. The second set of program instructions convert the phonemes to audio output signals.

Description

This is a continuation-in-part of application Ser. No. 09/409,000, filed Sep. 29, 1999, entitled “System and Apparatus For Dynamically Generating Audible Notices From An Information Network.”
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates generally to devices for browsing information on an information network. More specifically, this invention relates to an apparatus and system for receiving personalized information from an information network in audio format using distributed text-to-speech processing.
2. Description of the Related Art
A number of different information networks are available that allow access to information contained on their computers, with the Internet being one that is generally known to the public. The capabilities, usefulness, and amount of information available from information networks are ever-increasing. Further, users often subscribe to one or more information services that are accessible via an information network. Currently, a user must browse the information network for information that is of interest to them. Oftentimes, a user must interrupt their use of an application program, such as spreadsheets or word processing programs, to browse the information network. Even messages sent from information networks to users via e-mail or instant messaging facilities require the user to take specific action to learn the content of the messages. Additionally, while some subscription services and portal services allow a user to customize the format and, to a certain extent, the content, of the information provided, a user must still manually navigate to the various sources of information to see if there is anything of interest to them. Still further, a user often has to sift through a lot of information that is of no interest to them, thereby consuming more time than necessary. Another drawback to current capabilities is that the user typically is not informed immediately when information of interest becomes available, but rather, must enter commands to browse the information sources, and therefore may not receive information of interest as soon as it is available.
In the prior art, systems are available to provide information requested from an information network in aural format; however, these systems require interaction with the user and do not automatically provide the information in which the user has indicated an interest as the information becomes available.
It is therefore desirable to provide users with the ability to prescreen information from various, selected sources, to reduce the amount of time required to find items of interest to the user.
It is also desirable to provide users with relevant information as soon as possible after the news becomes available.
It is also desirable to provide a summary of news items of interest to the user, and to allow the user to access more in-depth information regarding a particular summary.
It is further desirable to receive the information aurally, thereby allowing the user to receive information of interest without being required to interrupt their activity to manipulate or view the information.
There are several known methods for converting information from text format to audio format for output to an audio output device such as an audio speaker system. The information is typically in conventional orthography and the output is synthetic speech. The input is provided in the form of a digital signal which represents the characters of conventional orthography. The primary output is also a digital signal representing an acoustic waveform corresponding to the synthetic speech. Digital-to-analog conversion is a well known technique for producing analog signals which can drive audio speakers. The signal may have any convenient implementation, e.g. electrical, magnetic, electromagnetic or optical.
Speech converters usually include two major sub-units namely an analyzer and a synthesizer. The analyzer divides the original input signal into small textual elements. The synthesizer converts each of these small elements into a short segment of digital waveform and it also joins these together to produce the output.
It will be appreciated that the linguistic analysis of a sentence is exceedingly complicated since it involves many different linguistic tasks, and a wide variety of linguistic processors are commercially available, each of which is capable of doing at least one of the tasks. Further, different portions of the linguistic analysis can be distributed among at least two different data processors.
One category of linguistic processors is designated as “converters” in that they change the nature of the symbols utilized. For example, a “converter” alters a signal representing a word or other linguistic element in graphemes into a signal representing the same element in phonemes using a grapheme to phoneme dictionary. This dictionary requires a large amount of storage space, and it is therefore preferable to store and maintain one dictionary in a central location, such as a network server, so that it may be accessed by several users, instead of storing and maintaining separate copies of the dictionary on each user's workstation. The benefits of maintaining large resources on servers are both ease of maintenance and reduced client system resource requirements. Further, converting the phonemes to an audio signal generates a large amount of data, and transferring the data in audio format requires a large amount of bandwidth.
The invention disclosed in U.S. patent application Ser. No. 09/409,000, filed Sep. 29, 1999, entitled “System and Apparatus For Dynamically Generating Audible Notices From An Information Network” discloses a text-to-speech (TTS) engine that resides either in a client-side processor, in a server-side processor, or which is distributed among data processors in the system. TTS processing functions are computationally intensive and some tasks require a large amount of storage space and bandwidth for data transfer. Therefore, it is further desirable to distribute the TTS engine between at least two data processors in a manner which optimizes processing time, data transfer, and storage space efficiency.
In addition to grapheme to phoneme TTS converters, there are other TTS engines that use different algorithms for transforming text data to audio data. Typically, these other TTS engines also involve converting text data to an intermediate format that requires less storage than the data in audio format. Therefore, it is also desirable to distribute other types of TTS engines between at least two data processors in a manner which optimizes processing time, data transfer, and storage space efficiency.
SUMMARY OF THE INVENTION
In one embodiment, the present invention provides a system for converting information from a text format to an audio format, wherein the text to speech conversion is distributed among two or more data processors. One data processor executes a first set of program instructions to receive information in text format from a data source, to convert the information from the text format to an intermediate format, such as phonemes, and to transmit the information in the intermediate format to the second data processor. The second data processor executes a second set of program instructions to convert the information from the intermediate format to the audio format. In one embodiment, the first data processor, such as a network server, includes one or more databases to aid TTS synthesis, such as one or more grapheme to phoneme dictionaries, that are accessible by multiple users. The second data processor is a client side data processor, such as a client workstation.
In another embodiment, the present invention provides a computer program product for dynamically generating audible notices from an information network using distributed text to speech processing. The information network includes a client processor and a remote processor, such as a network server. The computer program product includes a first set of program instructions that are executed on the remote processor that generate an intermediate representation of the information, such as a phonemic representation. The computer program product further includes a second set of program instructions that are executed on the client side processor that allow a user to preselect at least one data source that is accessible from the information network, to receive information from the at least one preselected data source, and to convert the information from a text format to an audio format based on the intermediate representation of the information.
In one embodiment, the first set of program instructions utilize a dictionary for translating graphemes to phonemes that is stored in a location that is accessible by the first set of program instructions.
In another embodiment, the present invention provides a method for dynamically generating audible notices from an information network which includes preselecting at least one data source from the information network, receiving information from the at least one preselected data source, converting the information from a text format to an intermediate format in a remote processor, converting the information from the intermediate format to an audio format in a client processor, and transmitting audio signals representative of the information in audio format. In one embodiment, the text is converted into an intermediate phonemic representation using a dictionary for translating graphemes to phonemes. The dictionary is stored in a location that is accessible by the remote processor. The phonemes are converted to audio output signals in the client processor.
Each embodiment of the present invention distributes the text to speech processing so that multiple users can take advantage of resources requiring a large amount of storage space from a remote, centralized processor, such as a network server. Intermediate processing of the information is performed at the remote processor to take advantage of the centralized resources, thus reducing the amount of data transfer from the remote processor to the client processor. The information, in intermediate format, is then transferred to the client processor, where it is converted to audio output signals. This feature also advantageously reduces data transfer requirements, since audio output format typically requires a large amount of data storage compared to the intermediate format.
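The split described above can be illustrated with a toy pipeline in which the remote processor performs the text-to-phoneme step and the client performs the phoneme-to-audio step. This is only a sketch under stated assumptions: the two-word dictionary `G2P`, the phoneme symbols, and the placeholder tone synthesis are hypothetical stand-ins, not the dictionary or synthesizer of any actual TTS engine.

```python
import math
import struct
from typing import List

# Toy stand-in for the large server-side grapheme-to-phoneme dictionary
# (hypothetical entries, for illustration only).
G2P = {
    "stock": ["S", "T", "AA", "K"],
    "up": ["AH", "P"],
}


def text_to_phonemes(text: str) -> List[str]:
    """Server-side step: expand text into phonemes using the dictionary."""
    phonemes: List[str] = []
    for word in text.lower().split():
        phonemes.extend(G2P.get(word, ["?"]))  # "?" marks a word not in the dictionary
    return phonemes


def phonemes_to_audio(phonemes: List[str], rate: int = 8000,
                      samples_per_phoneme: int = 800) -> bytes:
    """Client-side step: render each phoneme as 16-bit PCM.

    A real synthesizer would produce speech sounds; here each phoneme is a
    placeholder sine tone so that the output size is realistic.
    """
    out = bytearray()
    for index, _ in enumerate(phonemes):
        freq = 200.0 + 40.0 * index  # arbitrary placeholder pitch
        for t in range(samples_per_phoneme):
            sample = int(32767 * 0.3 * math.sin(2 * math.pi * freq * t / rate))
            out += struct.pack("<h", sample)
    return bytes(out)


# What crosses the network: the compact intermediate (phonemic) format.
intermediate = " ".join(text_to_phonemes("stock up")).encode("ascii")
# What the client produces locally: the much larger audio format.
audio = phonemes_to_audio(intermediate.decode("ascii").split())
```

In this toy case the intermediate payload is 13 bytes while the rendered audio is 9,600 bytes, which is the kind of gap that motivates sending the intermediate format rather than audio to the client.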
The foregoing has outlined rather broadly the objects, features, and technical advantages of the present invention so that the detailed description of the invention that follows may be better understood.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram of a system for accessing an information network found in the prior art.
FIG. 1a is a block diagram of an example of a computer workstation found in the prior art with which the present invention may be utilized.
FIG. 2 is a block diagram of a two-tier architecture for providing speech-synthesized information in accordance with the present invention.
FIG. 3 is a block diagram of a three-tier architecture for providing speech-synthesized information in accordance with the present invention.
FIG. 4 is a block diagram of a two-tier architecture for providing speech-synthesized information with distributed text to speech processing in accordance with the present invention.
FIG. 5 is a block diagram of a three-tier architecture for providing speech-synthesized information with distributed text to speech processing in accordance with the present invention.
The present invention may be better understood, and its numerous objects, features, and advantages made apparent to those skilled in the art by referencing the accompanying drawings. The use of the same reference symbols in different drawings indicates similar or identical items.
DETAILED DESCRIPTION
The method and apparatus of the present invention are applicable to devices that access a computerized information network. A number of different information networks are available that allow access to information contained on their computers, with the Internet being one that is generally known to the public. While the Internet is used herein as an example of how the present invention is utilized, it is important to recognize that the present invention is also applicable to other information networks and information systems such as Intranets, database management systems, and document retrieval systems.
An example of a typical Internet connection 110 found in the prior art is shown in FIG. 1. A user that wishes to access information on the Internet typically has a computer workstation 112 that executes an application program known as browser 114. Workstation 112 establishes a communication link 116 with web server 118 such as a dial-up wired connection with a modem, a direct link such as a T1 or ISDN line, or a wireless connection through a cellular or satellite network. When the user enters a request for information by entering commands in browser 114, workstation 112 sends a request for information, such as a search for documents pertaining to a specified topic, or a specific web page to web server 118. Each web server 118, 120, 122, 124 on the Internet has a known address which the user must supply to the browser 114 in order to connect to the appropriate web server 118, 120, 122, or 124. If the information is not available on the user's web server 118, a central link such as backbone 126 allows web servers 118, 120, 122, 124 to communicate with one another to supply the requested information. Because web servers 118, 120, 122, 124 can contain more than one web page, the user will also specify in the address which particular web page he wants to view. The web servers 118, 120, 122, 124 execute a web server application program, often referred to as a portal, which monitors requests, services requests for the information on that particular web server, and transmits the information to the user's workstation 112. A display generated by browser 114 to present information provided by a program on the server side is then presented on computer workstation 112. The display typically includes one or more areas for the user to enter commands and to view the information presented.
In the prior art, a web page is primarily visual data that is intended to be displayed on the display device, such as the monitor of user's workstation 112. When web server 118 receives a web page request, it will transmit a document, generally written in a markup language such as hypertext markup language (HTML) or extensible markup language (XML), across communication link 116 to the requesting browser 114. Communication link 116 may be one or a combination of different data transmission systems, such as a direct dial-up modem connected to a telephone line, dedicated high-speed data links such as T1 or ISDN lines, and even wireless networks which transmit information via satellite or cellular networks. Browser 114 interprets the markup language and outputs the web page to the monitor of user workstation 112. This web page displayed on the user's display may contain text, graphics, and links (which are addresses of other web pages). These other web pages (i.e., those represented by links) may be on the same or on different web servers 118, 120, 122, 124. The user can go to these other web pages by clicking on the links using a mouse or other pointing device. When web server 118 receives a search request, the request is sent to the server containing the search engine specified by the user. The search engine then compiles one or more pages containing a list of links to web pages on other web servers 120, 122, 124 that may contain information relevant to the user's request. The search engine transmits the page(s) in markup language back to the requesting web server. This entire system of web pages with links to other web pages on other servers across the world is known as the “World Wide Web”.
Workstation 112 and/or web servers 118, 120, 122, 124 are computer systems, such as computer system 130 as shown in FIG. 1a. Computer system 130 includes central processing unit (CPU) 132 connected by host bus 134 to various components including main memory 136, storage device controller 138, network interface 140, audio and video controllers 142, and input/output devices 144 connected via input/output (I/O) controllers 146. Those skilled in the art will appreciate that this system encompasses all types of computer systems including, for example, mainframes, minicomputers, workstations, servers, personal computers, Internet terminals, network appliances, notebooks, palm tops, personal digital assistants, and embedded systems. Typically computer system 130 also includes cache memory 150 to facilitate quicker access between processor 132 and main memory 136. I/O peripheral devices often include speaker systems 152, graphics devices 154, and other I/O devices 144 such as display monitors, keyboards, mouse-type input devices, floppy and hard disk drives, DVD drives, CD-ROM drives, and printers. Many computer systems also include network capability, terminal devices, modems, televisions, sound devices, voice recognition devices, electronic pen devices, and mass storage devices such as tape drives. The number of devices available to add to personal computer systems continues to grow; however, computer system 130 may include fewer components than shown in FIG. 1a and described herein.
The peripheral devices usually communicate with processor 132 over one or more buses 134, 156, 158, with the buses communicating with each other through the use of one or more bridges 160, 162. Computer system 130 may be one of many workstations or servers connected to a network such as a local area network (LAN), a wide area network (WAN), or a global information network such as the Internet through network interface 140.
CPU 132 can be constructed from one or more microprocessors and/or integrated circuits. Main memory 136 stores programs and data that CPU 132 may access. When computer system 130 starts up, an operating system program is loaded into main memory 136. The operating system manages the resources of computer system 130, such as CPU 132, audio controller 142, storage device controller 138, network interface 140, I/O controllers 146, and host bus 134. The operating system reads one or more configuration files to determine the hardware and software resources connected to computer system 130.
During operation, main memory 136 includes the operating system, configuration file, and one or more application programs with related program data. Application programs can run with program data as input, and output their results as program data in main memory 136 or to one or more mass storage devices through a memory controller (not shown) and storage device controller 138. CPU 132 executes one or more application programs, including one or more programs to establish a connection to a computer network through network interface 140. The application programs may be embodied in one executable module or may be a collection of routines that are executed as required. Operating systems commonly use “windows”, as well known in the art, to present information about or from an application program. Each application program typically has its own window that is generated when the application program is executing. Each window may be minimized to an icon, maximized to fill the display, overlaid in front of other windows, and underlaid behind other windows.
Storage device controller 138 allows computer system 130 to retrieve data from and store data to mass storage devices such as magnetic disks (hard disks, diskettes), and optical disks (DVD and CD-ROM). The information from these mass storage devices can be in many forms including application programs and program data. Data retrieved through storage device controller 138 is usually placed in main memory 136 where CPU 132 can process it.
One skilled in the art will recognize that the foregoing components and devices are used as examples for sake of conceptual clarity and that various configuration modifications are common. For example, audio controller 142 is connected to PCI bus 156 in FIG. 1a, but may be connected to the ISA bus 158 or reside on the motherboard (not shown) in alternative embodiments. As further example, although computer system 130 is shown to contain only a single main CPU 132 and a single system bus 134, those skilled in the art will appreciate that the present invention may be practiced using a computer system that has multiple CPUs 132 and/or multiple buses 134. In addition, the interfaces that are used in the preferred embodiment may include separate, fully programmed microprocessors that are used to off-load computationally intensive processing from CPU 132, or may include input/output (I/O) adapters to perform similar functions. Further, PCI bus 156 is used as an exemplar of any input-output devices attached to any I/O bus; AGP bus 159 is used as an exemplar of any graphics bus; graphics device 154 is used as an exemplar of any graphics controller; and host-to-PCI bridge 160 and PCI-to-ISA bridge 162 are used as exemplars of any type of bridge. Consequently, as used herein the specific exemplars set forth in FIG. 1a are intended to be representative of their more general classes. In general, use of any specific exemplar herein is also intended to be representative of its class and the non-inclusion of such specific devices in the foregoing list should not be taken as indicating that limitation is desired.
FIG. 2 shows a block diagram of components included in one embodiment of notice system 200 for dynamically generating audible notices from an information network according to the present invention. Notice system 200 allows a user to customize delivery of information based on, for example, the data source and a user's profile. Notice system 200 provides the information in speech-synthesized format as well as on the user's workstation display as the information becomes available. Notice system 200 may perform the following functions independently or in conjunction with other components in Internet connection 110:
    • play headline audio for new, noteworthy stories as those stories appear;
    • present the user with textual (typically HTML-rendered) story headlines;
    • allow the user to select a headline to view the entire story;
    • allow the user to subscribe and unsubscribe to data sources; and
    • allow the user to set various preferences (e.g., monitoring schedules).
One benefit of notice system 200 is that the user does not have to monitor data sources manually because notice system 200 presents the headlines in audible format as they become available. The user does not have to take any action to receive up-to-date news as it appears, nor does the user have to interrupt his work to check data sources manually. For example, if a user subscribes to one or more services that provide world news and/or financial data sources, notice system 200 could be configured to report when the price of one or more specified stocks moves up or down by more than a given percent as the change is published by the stock quote data source. Further, the information will be output to the display associated with workstation 112 even when the window for notice system 200 is not visible on the user's screen. When the user hears a spoken headline of interest, he or she can use the display generated by notice system 200 to access one or more hyperlinks leading to page(s) that contain the full story for the headline. The user can specify criteria and parameters to prioritize reported stories, such criteria including, but not limited to, user preferences, noteworthiness, and story metadata (e.g., a specified importance, expiration date, and/or urgency). Further, program instructions can be included in client 204 to monitor user behavior and generate criteria and parameters based on the user's previous interaction with notice system 200.
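The stock-price example above amounts to a threshold rule applied to each newly published quote. A minimal sketch follows, assuming the previous and current prices are already available; the function name `price_move_notice`, the symbol "XYZ", and the headline wording are hypothetical.

```python
from typing import Optional


def price_move_notice(symbol: str, previous: float, current: float,
                      threshold_pct: float) -> Optional[str]:
    """Return a headline if the price moved by more than `threshold_pct` percent.

    Returns None when the move does not exceed the threshold, meaning there
    is nothing to announce.
    """
    change_pct = (current - previous) / previous * 100.0
    if abs(change_pct) <= threshold_pct:
        return None
    direction = "up" if change_pct > 0 else "down"
    return f"{symbol} {direction} {abs(change_pct):.1f} percent"


headline = price_move_notice("XYZ", previous=100.0, current=106.0, threshold_pct=5.0)
```

A headline string produced this way could then be handed to the text-to-speech path, while a `None` result leaves the user undisturbed.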
Notice system 200 also presents this news in text format in a browser window, which need not be visible when the story arrives. As the data sources post news stories, notice system 200 announces the headlines. Notice system 200 includes one or more news summary pages listing all of the recent headlines. Each headline is a hyperlink to the web page that contains the full story. Optionally, summary pages may provide additional information with each headline. For example, the summary pages may include additional story text, graphics, or links.
Notice system 200 also includes text-to-speech (TTS) engine 208, sound player 210, data source monitor 212, and data source story adapter 214. Notice system 200 is a two-tier system having client 204 communicating directly with remote services 216. TTS engine 208 includes program instructions for synthesizing speech into a standard audio format from textual input, such as markup language, and is commercially available from a variety of manufacturers. In the embodiment of the present invention shown in FIG. 2, TTS engine 208 may reside in client 204 or be a component in remote services 216, e.g., TTS engine 226.
A “story” in notice system 200 includes some or all of the following components:
    • headline;
    • story URL;
    • optional source definition;
    • optional identification;
    • optional parameter;
    • optional timestamp;
    • optional advertisement; and
    • optional additional data.
The story URL points to a web page (usually on the data source's site) that contains the full story. Notice system 200 specifies a default set of data sources, such as data sources 218, 220, 222. A story can also define new data sources, however. By including an optional source definition, a story can announce the new sources of information to users.
Another optional component of a story is a set of one or more parameters, which some data sources require to access information. For example, a financial data source requires a stock symbol to retrieve price quotes for a particular stock. Notice system 200 can accommodate zero, one, or more parameters for a particular data source.
A story may optionally contain a variety of other information such as an identification, a time stamp, the name of the author of a story, graphics, audio, video, advertisements, keywords, and categorization information. If a story does not have a time stamp, notice system 200 automatically assigns one to it. Client 204 outputs the story's headline in audible format using sound player 210. The story's headline may be marked up in a speech synthesis markup language.
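The story components listed above can be sketched as a simple record. This is a minimal illustration only: the class name, field names, and types are assumptions, since the text names the components but does not prescribe a representation. The one behavior taken from the text is that a story lacking a time stamp is assigned one automatically.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Story:
    """Hypothetical record for one story; only headline and URL are required."""
    headline: str
    story_url: str
    source_definition: Optional[dict] = None        # announces a new data source
    identification: Optional[str] = None
    parameters: dict = field(default_factory=dict)  # e.g. {"symbol": "XYZ"}
    timestamp: Optional[datetime] = None
    advertisement: Optional[str] = None
    additional_data: dict = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Per the text: a story without a time stamp is assigned one.
        if self.timestamp is None:
            self.timestamp = datetime.now(timezone.utc)


story = Story(headline="XYZ rises 2%", story_url="http://example.com/xyz")
```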
Stories are available from a virtually unlimited variety of subscriber and non-subscriber data sources, such as data sources 218, 220, and 222. Notice system 200 includes a syntax for a textual representation of a story. This story syntax is also referred to as “story format”. Information that is in a foreign format (i.e., not in story format) from data sources 218 and 220 is converted to story format in data source story adapters 214, 224. Stories that are supplied in story format, such as from data source 222, do not require conversion. Adapters 214, 224 are usually designed to convert source from one specific foreign format to story format. In one embodiment, the syntax for story format is defined by an XML document type definition (DTD), which allows a developer to define keyword assignments for tags and their associated parameters, as known in the art. Thus, data sources 218, 220 may provide information in story format, or, alternatively, client 204 may include one or more adapters to convert information from foreign formats to story format.
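As a hedged illustration of a data source story adapter, the sketch below converts one line of an invented foreign format ("SYMBOL, PRICE") into an XML story. The DTD for story format is not reproduced in this text, so the element names (story, headline, url, parameter), the function name, and the example URL are all assumptions.

```python
import xml.etree.ElementTree as ET


def adapt_quote_line(line: str, base_url: str) -> str:
    """Convert a hypothetical 'SYMBOL, PRICE' line into an XML story."""
    symbol, price = (part.strip() for part in line.split(","))
    story = ET.Element("story")  # element names are illustrative only
    ET.SubElement(story, "headline").text = f"{symbol} last traded at {price}"
    ET.SubElement(story, "url").text = f"{base_url}?symbol={symbol}"
    ET.SubElement(story, "parameter", name="symbol").text = symbol
    return ET.tostring(story, encoding="unicode")


xml_text = adapt_quote_line("XYZ, 41.50", "http://example.com/quote")
```

An adapter of this kind handles exactly one foreign format, which matches the statement that adapters are usually designed for one specific format.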
A user does not necessarily want to hear the headlines of all new stories from all available data sources. Otherwise, a user would be inundated with constant updates of information. For example, a user who subscribes to stock quotes would hear a continuous stream of price updates. Accordingly, the present invention allows a user to specify one or more data sources 218, 220, 222 from which to receive information, as well as one or more noteworthiness criteria for selecting stories presented to the user by notice system 200. If a data source has a noteworthiness criterion, notice system 200 reads a new story from that data source only if the story satisfies the criterion. The noteworthiness criteria that are available for selection are based on the type of information provided by a particular data source. For example, a stock quote data source noteworthiness criterion could be “price change greater than 1% from the last announced price”. If the data source supplies more than one criterion, the user can select a conjunction or disjunction of criteria. Furthermore, a criterion can be parameterized, in which case the user supplies one or more parameters. For example, “percentage change in trading volume” is a parameterized stock quote criterion. The user could specify a parameter of “2%” to be informed when the volume rises above 102% or falls below 98% of the previously reported volume.
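The parameterized criteria and their conjunction or disjunction can be sketched as small predicate functions. The function names and the dictionary layout of a quote are assumptions made for illustration. Each criterion compares the last announced quote with the new one.

```python
from typing import Callable, Dict

Quote = Dict[str, float]                    # e.g. {"price": 100.0, "volume": 1000}
Criterion = Callable[[Quote, Quote], bool]  # (last announced, new) -> noteworthy?


def price_change_over(percent: float) -> Criterion:
    """Noteworthy if price moved more than `percent` from the last announced price."""
    def check(last: Quote, new: Quote) -> bool:
        return abs(new["price"] - last["price"]) > last["price"] * percent / 100
    return check


def volume_change_over(percent: float) -> Criterion:
    """Noteworthy if volume moved more than `percent` from the last reported volume."""
    def check(last: Quote, new: Quote) -> bool:
        return abs(new["volume"] - last["volume"]) > last["volume"] * percent / 100
    return check


def all_of(*criteria: Criterion) -> Criterion:
    """Conjunction: every criterion must hold."""
    return lambda last, new: all(c(last, new) for c in criteria)


def any_of(*criteria: Criterion) -> Criterion:
    """Disjunction: at least one criterion must hold."""
    return lambda last, new: any(c(last, new) for c in criteria)


last = {"price": 100.0, "volume": 1000.0}
new = {"price": 100.5, "volume": 1030.0}
either = any_of(price_change_over(1), volume_change_over(2))
both = all_of(price_change_over(1), volume_change_over(2))
```

With these sample quotes the price moved 0.5% and the volume 3%, so the disjunction is satisfied and the conjunction is not.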
Data sources 218, 220, 222 publish stories and include the following components:
    • name
    • description URL
    • stories URL
    • optional schedule
    • optional data source groups
    • optional additional data
The description URL points to a web page that describes data source 218, 220, 222. Notice system 200 uses the stories URL to get the latest stories from data sources 218, 220, 222. The range of topics for stories is unlimited. For example, a product catalog can be specified as a data source. The stories are announcements of new products, discontinued products, improved products, etc. A weather forecast data source publishes forecast “stories”. The automobile section of the classified advertisement section of a newspaper publishes classified ad “stories” about cars that are for sale. A ticker tape publishes stock quote “stories”.
Further, a user may specify a data source category, which is a group of related data sources. For example, a “World News” data source category would contain data sources for world news stories. It would also contain data source categories for different countries and/or regions of the world such as Asia and the Middle East. A data source may belong to zero or more data source categories.
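A data source category that holds both data sources and nested categories can be modeled as a small tree. The sketch below is illustrative only: the class names, field names, and example URLs are assumptions, and the text does not state how categories are stored.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataSource:
    """Hypothetical record mirroring the listed data source components."""
    name: str
    description_url: str
    stories_url: str
    schedule: Optional[str] = None


@dataclass
class Category:
    """A group of related data sources, possibly with nested categories."""
    name: str
    sources: List[DataSource] = field(default_factory=list)
    subcategories: List["Category"] = field(default_factory=list)

    def all_sources(self) -> List[DataSource]:
        # Collect this category's sources, then recurse into nested categories.
        found = list(self.sources)
        for sub in self.subcategories:
            found.extend(sub.all_sources())
        return found


asia = Category("Asia", [DataSource("Asia Wire", "http://example.com/a", "http://example.com/a/stories")])
world = Category(
    "World News",
    [DataSource("Global Desk", "http://example.com/g", "http://example.com/g/stories")],
    [asia],
)
names = [source.name for source in world.all_sources()]
```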
Notice system 200 includes a default set of data sources 218, 220, 222. In addition, a story can define a new data source. Such stories are referred to as source stories. A user reading a source story can subscribe to the source the story announces. A user can also manually enter a definition for a web-based format source. The definition requires at least the URL for data source stories. If a data source adapter 214, 224 is available, a user on a fat client notice system 200 can specify the location of the adapter. In this case, notice system 200 will download and install adapter 214, 224.
Client 204 includes browser 202 which interprets documents and scripts that are typically written in mark-up language. Client 204 generates a news page that is refreshed automatically via a ‘Refresh’ META tag or other mechanism for refreshing the display. The refresh rate can adapt to the rate of arrival of new stories or a refresh command may be pushed from miniserver 206 when a new story is sent to browser 202. Client 204 also either plays audio served from remote TTS engine 226, or the client invokes local TTS engine 208 to generate speech. If remote TTS engine 226 is used, browser 202 must be capable of playing audio. If local TTS engine 208 is used, either browser 202, TTS engine 208, or another set of program instructions in client 204 must be capable of playing audio.
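A news page that refreshes itself through a 'Refresh' META tag can be generated as in the following sketch. The function name, page layout, and default refresh interval are assumptions. Only the META refresh mechanism and the headline hyperlinks come from the text.

```python
from html import escape
from typing import Iterable, Tuple


def render_news_page(stories: Iterable[Tuple[str, str]], refresh_seconds: int = 30) -> str:
    """Build an HTML page of (headline, url) links that reloads periodically."""
    items = "\n".join(
        f'<li><a href="{escape(url, quote=True)}">{escape(headline)}</a></li>'
        for headline, url in stories
    )
    return (
        "<html><head>"
        f'<meta http-equiv="refresh" content="{int(refresh_seconds)}">'
        "<title>Recent headlines</title></head>"
        f"<body><ul>\n{items}\n</ul></body></html>"
    )


page = render_news_page([("A & B merge", "http://example.com/1")], refresh_seconds=15)
```

An adaptive refresh rate could be obtained by passing a smaller `refresh_seconds` when stories are arriving quickly.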
Remote services 216 perform five primary functions: data source monitoring, data source management, data source interfacing, state management, and client services.
Notice system 200 includes capabilities for client 204 to pull stories from data sources 218, 220, 222, and for remote services 216 to push stories to client 204. For data sources that do not push stories to client 204, data source monitor 212 polls data sources 218, 220, 222 periodically to check the availability of new stories. The polling schedules can be fairly complex including an adaptive scheduler, which increases the polling frequency with the rate of arrival of new stories. The adaptive scheduler reduces the polling rate as the rate of arrival of new stories decreases. Static schedulers are also included, for example, hourly polling during business hours.
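One possible form of the adaptive scheduler is sketched below. The halving and doubling rule and the interval bounds are assumptions chosen for illustration. The text states only that the polling frequency rises and falls with the arrival rate of new stories.

```python
class AdaptivePollScheduler:
    """Shorten the polling interval when stories arrive; lengthen it when idle."""

    def __init__(self, min_interval: float = 60.0,
                 max_interval: float = 3600.0, start: float = 600.0) -> None:
        self.min_interval = min_interval
        self.max_interval = max_interval
        self.interval = start

    def next_interval(self, new_stories: int) -> float:
        """Return seconds to wait before the next poll, given the last poll's yield."""
        if new_stories > 0:
            # New stories arrived: poll twice as often, down to the minimum interval.
            self.interval = max(self.min_interval, self.interval / 2)
        else:
            # Nothing new: back off, up to the maximum interval.
            self.interval = min(self.max_interval, self.interval * 2)
        return self.interval


scheduler = AdaptivePollScheduler()
after_busy_poll = scheduler.next_interval(new_stories=3)
after_idle_poll = scheduler.next_interval(new_stories=0)
```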
Data source management includes the creation, modification, and deletion of data sources 218, 220, 222.
Miniserver 206 manages state information including user registrations, subscriptions, data source definitions, stories, user preferences, user profiles, data source profiles, data source categories, and other information. Miniserver 206 stores most of the state information in relational databases.
Client services are all of the services notice system 200 requires including new story reports, subscription modifications, and user preferences modifications.
In one embodiment, notice system 200 provides an optional auto-personalization feature whereby the user can choose to have notice system 200 model the user's interests. With this model, notice system 200 can automatically subscribe the user to sources relevant to the user's interests. Notice system 200 can also direct relevant stories to the user from data sources to which the user doesn't subscribe.
Notice system 200 can categorize data sources 218, 220, 222 with either explicit data (e.g., as part of a data source definition) or derived data (from, e.g., machine learning techniques). Notice system 200 may categorize stories as well. A story can belong to one or more story categories. Each data source 218, 220, 222 is a de facto story category. Notice system 200 can use any story data—or data derived from the story—to categorize it.
Notice system 200 also monitors and dynamically logs its overall state, including story arrival rates, errors, usage data, and other information.
Notice system 200 may serve audio advertisements with headlines. These audio ads can be personalized based on the headlines, the user's profile, and other information. Notice system 200 may also place advertisements on the summary pages served to client 204. The advertisements can be personalized based on the data source, current stories, the user's profile, and other information that may be customized by the user. Further, data sources 218, 220, 222 can also deliver ads in their data source markup language as “stories”, or in their stories.
A three-tier embodiment of the present invention for notice system 300 is shown in FIG. 3, including client 302, server 304, and remote services 306. Notice system 300 provides capabilities and advantages that are virtually identical to notice system 200 including providing customized delivery of stories in speech-synthesized format as well as in a window on a display as the stories become available, auto-personalization, categorizing data sources 218, 220, 222, and categorizing stories. One of the differences between notice system 200 and notice system 300 is that client 302 is a “thin” client architecture, whereas client 204 in notice system 200 (FIG. 2) is a “fat” client architecture. Miniserver 206 provides enough functionality in client 204 to eliminate any requirement for a separate server, such as server 304 in notice system 300.
Client 302 in notice system 300 further includes browser 308, TTS engine 310, and sound player 210. Server 304 includes miniserver 314, data source story adapter 214, and data source monitor 212. In an alternate embodiment, TTS engine 320 resides in server 304, thereby replacing TTS engine 310 in client 302. In both notice system 200 (FIG. 2) and notice system 300, the TTS engine may be located on the client side, (e.g. TTS engine 208 or 310) or in a computer system that is remote from the client side (e.g., TTS engine 226 or 320).
Two issues that arise when TTS is performed remote from the client side are the computational resources required to convert text to speech, and the bandwidth required to transfer speech data from the remote processor to the client side. One alternative is to distribute TTS engines 208, 310 throughout notice system 200 or 300 to reduce bandwidth and computational burden on a single TTS engine. In many types of text to speech converters, functions of TTS engines 208, 226, 310, 320 can be broken down into a composition of functions g(f(x)). One type of known TTS engine 208, 226, 310, 320 involves expanding text (x) into phonemes in the function f(x) and requires a large dictionary for translating graphemes to phonemes. A phoneme is a component part or unit in the pronunciation of a word in the sound system of a language. The function g(f(x)) computes sounds that represent the phonemes and could be more computationally intensive compared to the function f(x). Converting the phonemes to representative sounds, also referred to as audio data, generates a large amount of data, even when audio compression schemes are utilized. Ideally, this conversion is performed on client side 204, 302 to alleviate the need to transfer a large amount of audio data from remote services 216 or server 304.
Thus, in an embodiment of the two-tier architecture 400 shown in FIG. 4, f(x) is distributed in TTS engine 426 where remote services 216 has storage capacity for the large word-to-phoneme dictionary 428. Further, g(f(x)) is distributed in TTS engine 408 on client side 204, thereby offloading heavy computational workload and data transfer requirements from remote services 216.
Likewise, in an embodiment of the three-tier architecture 500 shown in FIG. 5, f(x) is distributed in TTS engine 520 in server 304, which has storage capacity for the large word-to-phoneme dictionary 522, and g(f(x)) is distributed to TTS engine 510 on client side 302, thereby offloading heavy computational workload and data transfer requirements from server 304.
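The split of g(f(x)) between the remote side and the client side can be illustrated with a toy example. Everything concrete here is an assumption: the two-word dictionary, the phoneme symbols, and the placeholder tones that stand in for real speech synthesis. The sketch shows only that the phoneme payload sent over the network is far smaller than the audio generated from it on the client.

```python
import math
import struct
from typing import List

# Toy grapheme-to-phoneme dictionary (hypothetical; a real one is large and
# is kept on the remote side, where storage is available).
DICTIONARY = {
    "stock": ["S", "T", "AA", "K"],
    "up": ["AH", "P"],
}


def f(text: str) -> List[str]:
    """Remote side: expand text into phonemes using the dictionary."""
    phonemes: List[str] = []
    for word in text.lower().split():
        phonemes.extend(DICTIONARY.get(word, []))  # unknown words are skipped
    return phonemes


def g(phonemes: List[str], rate: int = 8000, samples_per_phoneme: int = 800) -> bytes:
    """Client side: turn phonemes into 16-bit PCM (placeholder tones, not speech)."""
    samples = bytearray()
    for phoneme in phonemes:
        freq = 200 + 40 * (sum(map(ord, phoneme)) % 20)  # arbitrary tone per phoneme
        for i in range(samples_per_phoneme):
            value = int(12000 * math.sin(2 * math.pi * freq * i / rate))
            samples += struct.pack("<h", value)
    return bytes(samples)


payload = " ".join(f("Stock up")).encode("ascii")  # what crosses the network
audio = g(payload.decode("ascii").split())         # generated on the client
```

In this toy run the network payload is 13 bytes, while the client produces 9600 bytes of audio data.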
Server 304 performs data source monitoring via data source monitor 212, as discussed hereinabove for notice system 200. Server 304 also manages state information including user registrations, subscriptions, data source definitions, stories, user preferences, user profiles, data source profiles, data source categories, and other information. Server 304 stores most of the state information in relational databases. Further, server 304 may perform data source interfacing, such as converting information in a foreign format to story format using data source adapter 214. Alternatively, a required data source adapter, such as data source adapter 224, may reside in remote services 306.
Notice system 300 also includes capabilities for client 302 to pull stories from data sources 218, 220, 222, and for remote services 216 to push stories to client 302, through server 304. For data sources that do not push stories to client 302 via server 304, data source monitor 212 polls data sources 218, 220, 222 periodically to check the availability of new stories in a manner similar to that described in the discussion for notice system 200 hereinabove.
Notice systems, such as notice systems 200 and 300, may serve audio advertisements with headlines. These audio ads can be personalized based on the headlines, the user's profile, and other information. Notice systems 200, 300 may also place advertisements on the summary pages served to clients 204, 302, respectively. The advertisements can be personalized based on the data source, current stories, the user's profile, and other information that may be customized by the user. Further, data sources 218, 220, 222 can also deliver ads in their data source markup language as “stories”, or in their stories.
While the invention has been described with respect to the embodiments and variations set forth above, these embodiments and variations are illustrative and the invention is not to be considered limited in scope to these embodiments and variations. For example, although a TTS engine for converting graphemes to phonemes has been discussed as an example of a TTS engine that may utilize the present invention, the present invention may also be utilized with other similar functions which compute intermediate representations and generate a relatively small amount of data compared to the final audio output. Further, several different databases may be included in one remote location, such as grapheme to phoneme dictionaries for a variety of different languages. Accordingly, various other embodiments and modifications and improvements not described herein may be within the spirit and scope of the present invention, as defined by the following claims.

Claims (35)

What is claimed is:
1. A system for converting information from a text format to an audio format, the system comprising:
at least one data source;
a server side data processor;
a client side data processor;
a first set of program instructions executable on the server side data processor, the first set of program instructions including:
first program instructions being operable to receive information from the at least one data source;
second program instructions being operable to convert the information from the text format to an intermediate phonemic representation;
third program instructions being operable to transmit the information in the intermediate phonemic representation to the client side data processor; and
a second set of program instructions executable on the client side data processor, the second set of program instructions including:
fourth program instructions being operable to convert the information from the intermediate phonemic representation to the audio format.
2. The system, as set forth in claim 1, wherein the second set of program instructions convert the information in the text format into phonemes.
3. The system, as set forth in claim 2, further comprising:
a dictionary for translating the text to phonemes, wherein the dictionary is stored in a location that is accessible by the first server side data processor.
4. The system, as set forth in claim 3, wherein the fourth program instructions convert the phonemes to audio output signals.
5. The computer program product, as set forth in claim 7, further comprising:
a dictionary for translating graphemes to phonemes, wherein the dictionary is stored in a location that is accessible by the first function.
6. A computer program product for dynamically generating audible notices from an information network, the information network including a client processor and a remote processor, the computer program product comprising:
first program instructions being operable to allow a user to preselect at least one data source, wherein the at least one data source is accessible from the information network;
second program instructions being operable to receive information from the at least one preselected data source; and
third program instructions being operable to convert the information from a text format to an audio format, wherein the third program instructions perform a first function for generating an intermediate phonemic representation of the information and a second function for generating an audio representation of the information based on the intermediate phonemic representation of the information, wherein the first function is performed in the remote processor and the second function is performed in the client processor.
7. The computer program product, as set forth in claim 6, wherein the first function converts the information in the text format into phonemes.
8. The computer program product, as set forth in claim 7, wherein the second function generates a representation of sounds based on the phonemes.
9. The computer program product of claim 6, wherein said second program instructions are operable to receive information from the at least one preselected data source as information becomes available from the data source, and wherein said third program instructions are operable to automatically convert the information from a text format to an audio format.
10. A method for dynamically generating audible notices from an information network, the method comprising:
preselecting at least one data source, wherein the at least one data source is accessible from the information network;
receiving information from the at least one preselected data source;
converting the information from a text format to an intermediate phonemic representation in a remote processor;
converting the information from the intermediate phonemic representation to an audio format in a client processor; and
transmitting audio signals representative of the information in audio format.
11. The method, as set forth in claim 10, wherein said converting the information from a text format to an intermediate phonemic representation comprises converting the information in the text format into phonemes.
12. The method, as set forth in claim 11, wherein said converting the information in the text format into phonemes comprises accessing a dictionary for translating graphemes to phonemes, wherein the dictionary is stored in a location that is accessible by the remote processor.
13. The method, as set forth in claim 11, wherein said converting the information from the intermediate phonemic representation to an audio format comprises generating a representation of sounds based on the phonemes.
14. The method, as set forth in claim 10, wherein the remote processor is a server side processor.
15. The system, as set forth in claim 18, wherein the second remote program instructions convert the information in the text format into phonemes.
16. The system, as set forth in claim 18, wherein the remote computer system is a server computer system.
17. The method of claim 10, wherein said information is received from the at least one preselected data source as information becomes available, and said transmitting of audio signals representative of the information in audio format occurs automatically.
18. A system for dynamically generating audible notices from an information network, the system comprising:
at least one data source;
a client computer system;
a remote computer system;
a set of remote program instructions executable on the remote computer system, the remote program instructions including:
first remote program instructions being operable to receive information from a data source;
second remote program instructions being operable to convert the information from a text format to an intermediate phonemic representation;
third remote program instructions being operable to transmit the information in intermediate phonemic representation to the client computer system; and
a set of client program instructions executable on the client computer system, the client program instructions including:
first client program instructions being operable to convert the information from the intermediate phonemic representation to an audio format.
19. The system, as set forth in claim 18, further comprising:
a dictionary for translating the text to phonemes, wherein the dictionary is stored in a location that is accessible by the remote computer system.
20. The system, as set forth in claim 15, wherein the first client program instructions convert the phonemes to audio output signals.
21. A client system, comprising:
a processor; and
a memory coupled to the processor and configured to store program instructions that are executable by the processor to cause the client system to:
receive, over an information network from a remote system, information in an intermediate phonemic representation, wherein the intermediate phonemic representation is derived from a text format of the information; and
convert the information from the intermediate phonemic representation to an audio format.
22. The system as recited in claim 21, wherein the information network comprises a wide-area network (WAN) and wherein the remote system comprises a web server.
23. The system as recited in claim 21, wherein to convert the information from the intermediate phonemic representation to an audio format, the program instructions are executable to convert the phonemes to audio output signals.
24. The system as recited in claim 23, wherein the program instructions are executable to convert the phonemes to the audio output signals automatically in response to receiving the information from the remote system.
25. The system as recited in claim 21, wherein the program instructions are executable to select a data source for the information prior to receiving the information in the phonemic representation.
26. A method, comprising:
a client device receiving, over a wide-area network (WAN) from a server, information in an intermediate phonemic representation, wherein the intermediate phonemic representation is derived from a text format of the information; and
the client device converting the information from the intermediate phonemic representation to an audio format.
27. The method as recited in claim 26, wherein the server comprises a web server and wherein the WAN comprises the Internet.
28. The method as recited in claim 27, said converting comprises converting the phonemes to audio output signals.
29. The method as recited in claim 28, said converting the phonemes to the audio output signals is performed automatically in response to the receiving of the information from the web server.
30. The method as recited in claim 27, further comprising selecting a data source for the information prior to said receiving.
31. An article of manufacture comprising a non-transitory computer readable storage medium having program instructions stored thereon that, in response to execution by a client computer system, cause the client computer system to perform operations including:
receiving, over an information network from a remote server computer system, information in an intermediate phonemic representation, wherein the intermediate phonemic representation is derived from a text format of the information; and
converting the information from the intermediate phonemic representation to an audio format.
32. The article of manufacture of claim 31, wherein the information network comprises a wide-area network (WAN) and wherein the remote server computer system comprises a web server.
33. The article of manufacture of claim 31, wherein converting the information from the intermediate phonemic representation to an audio format further comprises converting the phonemes to audio output signals.
34. The article of manufacture of claim 33, the operations further comprising converting the phonemes to the audio output signals automatically in response to receiving the information from the remote server computer system.
35. The article of manufacture of claim 31, the operations further comprising selecting a data source for the information prior to receiving the information in the phonemic representation.
US11/119,493 1999-09-29 2005-04-29 System and apparatus for dynamically generating audible notices from an information network Expired - Lifetime USRE42904E1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US11/119,493 USRE42904E1 (en) 1999-09-29 2005-04-29 System and apparatus for dynamically generating audible notices from an information network

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US40900099A 1999-09-29 1999-09-29
US09/427,233 US6557026B1 (en) 1999-09-29 1999-10-26 System and apparatus for dynamically generating audible notices from an information network
US11/119,493 USRE42904E1 (en) 1999-09-29 2005-04-29 System and apparatus for dynamically generating audible notices from an information network

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US09/427,233 Reissue US6557026B1 (en) 1999-09-29 1999-10-26 System and apparatus for dynamically generating audible notices from an information network

Publications (1)

Publication Number Publication Date
USRE42904E1 (en) 2011-11-08

Family

ID=23618647

Family Applications (1)

Application Number Title Priority Date Filing Date
US11/119,493 Expired - Lifetime USRE42904E1 (en) 1999-09-29 2005-04-29 System and apparatus for dynamically generating audible notices from an information network

Country Status (1)

Country Link
US (1) USRE42904E1 (en)

Cited By (72)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100169367A1 (en) * 2006-05-04 2010-07-01 Samsung Electronics Co., Ltd. Method and device for selecting a word to be defined in mobile communication terminal having an electronic dictionary
US20130110854A1 (en) * 2011-10-26 2013-05-02 Kimber Lockhart Preview pre-generation based on heuristics and algorithmic prediction/assessment of predicted user behavior for enhancement of user experience
US8719445B2 (en) 2012-07-03 2014-05-06 Box, Inc. System and method for load balancing multiple file transfer protocol (FTP) servers to service FTP connections for a cloud-based service
US8745267B2 (en) 2012-08-19 2014-06-03 Box, Inc. Enhancement of upload and/or download performance based on client and/or server feedback information
US8868574B2 (en) 2012-07-30 2014-10-21 Box, Inc. System and method for advanced search and filtering mechanisms for enterprise administrators in a cloud-based environment
US8892679B1 (en) 2013-09-13 2014-11-18 Box, Inc. Mobile device, methods and user interfaces thereof in a mobile device platform featuring multifunctional access and engagement in a collaborative environment provided by a cloud-based platform
US8914900B2 (en) 2012-05-23 2014-12-16 Box, Inc. Methods, architectures and security mechanisms for a third-party application to access content in a cloud-based platform
US8990151B2 (en) 2011-10-14 2015-03-24 Box, Inc. Automatic and semi-automatic tagging features of work items in a shared workspace for metadata tracking in a cloud-based content management system with selective or optional user contribution
US8990307B2 (en) 2011-11-16 2015-03-24 Box, Inc. Resource effective incremental updating of a remote client with events which occurred via a cloud-enabled platform
US9015601B2 (en) 2011-06-21 2015-04-21 Box, Inc. Batch uploading of content to a web-based collaboration environment
US9019123B2 (en) 2011-12-22 2015-04-28 Box, Inc. Health check services for web-based collaboration environments
US9027108B2 (en) 2012-05-23 2015-05-05 Box, Inc. Systems and methods for secure file portability between mobile applications on a mobile device
US9054919B2 (en) 2012-04-05 2015-06-09 Box, Inc. Device pinning capability for enterprise cloud service and storage accounts
US9063912B2 (en) 2011-06-22 2015-06-23 Box, Inc. Multimedia content preview rendering in a cloud content management system
US9117087B2 (en) 2012-09-06 2015-08-25 Box, Inc. System and method for creating a secure channel for inter-application communication based on intents
US9135462B2 (en) 2012-08-29 2015-09-15 Box, Inc. Upload and download streaming encryption to/from a cloud-based platform
US9197718B2 (en) 2011-09-23 2015-11-24 Box, Inc. Central management and control of user-contributed content in a web-based collaboration environment and management console thereof
US9195519B2 (en) 2012-09-06 2015-11-24 Box, Inc. Disabling the self-referential appearance of a mobile application in an intent via a background registration
US9195636B2 (en) 2012-03-07 2015-11-24 Box, Inc. Universal file type preview for mobile devices
US9213684B2 (en) 2013-09-13 2015-12-15 Box, Inc. System and method for rendering document in web browser or mobile device regardless of third-party plug-in software
US9237170B2 (en) 2012-07-19 2016-01-12 Box, Inc. Data loss prevention (DLP) methods and architectures by a cloud service
US9292833B2 (en) 2012-09-14 2016-03-22 Box, Inc. Batching notifications of activities that occur in a web-based collaboration environment
US9311071B2 (en) 2012-09-06 2016-04-12 Box, Inc. Force upgrade of a mobile application via a server side configuration file
US9369520B2 (en) 2012-08-19 2016-06-14 Box, Inc. Enhancement of upload and/or download performance based on client and/or server feedback information
US9396245B2 (en) 2013-01-02 2016-07-19 Box, Inc. Race condition handling in a system which incrementally updates clients with events that occurred in a cloud-based collaboration platform
US9396216B2 (en) 2012-05-04 2016-07-19 Box, Inc. Repository redundancy implementation of a system which incrementally updates clients with events that occurred via a cloud-enabled platform
US9413587B2 (en) 2012-05-02 2016-08-09 Box, Inc. System and method for a third-party application to access content within a cloud-based platform
US9483473B2 (en) 2013-09-13 2016-11-01 Box, Inc. High availability architecture for a cloud-based concurrent-access collaboration platform
US9495364B2 (en) 2012-10-04 2016-11-15 Box, Inc. Enhanced quick search features, low-barrier commenting/interactive features in a collaboration platform
US9507795B2 (en) 2013-01-11 2016-11-29 Box, Inc. Functionalities, features, and user interface of a synchronization client to a cloud-based environment
US9519526B2 (en) 2007-12-05 2016-12-13 Box, Inc. File management system and collaboration service and integration capabilities with third party applications
US9519886B2 (en) 2013-09-13 2016-12-13 Box, Inc. Simultaneous editing/accessing of content by collaborator invitation through a web-based or mobile application to a cloud-based collaboration platform
US9535909B2 (en) 2013-09-13 2017-01-03 Box, Inc. Configurable event-based automation architecture for cloud-based collaboration platforms
US9553758B2 (en) 2012-09-18 2017-01-24 Box, Inc. Sandboxing individual applications to specific user folders in a cloud-based service
US9558202B2 (en) 2012-08-27 2017-01-31 Box, Inc. Server side techniques for reducing database workload in implementing selective subfolder synchronization in a cloud-based environment
US9575981B2 (en) 2012-04-11 2017-02-21 Box, Inc. Cloud service enabled to handle a set of files depicted to a user as a single file in a native operating system
US9602514B2 (en) 2014-06-16 2017-03-21 Box, Inc. Enterprise mobility management and verification of a managed application by a content provider
US9628268B2 (en) 2012-10-17 2017-04-18 Box, Inc. Remote key management in a cloud-based environment
US9633037B2 (en) 2013-06-13 2017-04-25 Box, Inc. Systems and methods for synchronization event building and/or collapsing by a synchronization component of a cloud-based platform
US9652741B2 (en) 2011-07-08 2017-05-16 Box, Inc. Desktop application for access and interaction with workspaces in a cloud-based content management system and synchronization mechanisms thereof
US9665349B2 (en) 2012-10-05 2017-05-30 Box, Inc. System and method for generating embeddable widgets which enable access to a cloud-based collaboration platform
US9691051B2 (en) 2012-05-21 2017-06-27 Box, Inc. Security enhancement through application access control
US9705967B2 (en) 2012-10-04 2017-07-11 Box, Inc. Corporate user discovery and identification of recommended collaborators in a cloud platform
US9712510B2 (en) 2012-07-06 2017-07-18 Box, Inc. Systems and methods for securely submitting comments among users via external messaging applications in a cloud-based platform
US9734817B1 (en) * 2014-03-21 2017-08-15 Amazon Technologies, Inc. Text-to-speech task scheduling
US9756022B2 (en) 2014-08-29 2017-09-05 Box, Inc. Enhanced remote key management for an enterprise in a cloud-based environment
US9773051B2 (en) 2011-11-29 2017-09-26 Box, Inc. Mobile platform file and folder selection functionalities for offline access and synchronization
US9794256B2 (en) 2012-07-30 2017-10-17 Box, Inc. System and method for advanced control tools for administrators in a cloud-based service
US9792320B2 (en) 2012-07-06 2017-10-17 Box, Inc. System and method for performing shard migration to support functions of a cloud-based service
US9805050B2 (en) 2013-06-21 2017-10-31 Box, Inc. Maintaining and updating file system shadows on a local device by a synchronization client of a cloud-based platform
US9894119B2 (en) 2014-08-29 2018-02-13 Box, Inc. Configurable metadata-based automation and content classification architecture for cloud-based collaboration platforms
US9953036B2 (en) 2013-01-09 2018-04-24 Box, Inc. File system monitoring in a system which incrementally updates clients with events that occurred in a cloud-based collaboration platform
US9959420B2 (en) 2012-10-02 2018-05-01 Box, Inc. System and method for enhanced security and management mechanisms for enterprise administrators in a cloud-based environment
US9965745B2 (en) 2012-02-24 2018-05-08 Box, Inc. System and method for promoting enterprise adoption of a web-based collaboration environment
US9978040B2 (en) 2011-07-08 2018-05-22 Box, Inc. Collaboration sessions in a workspace on a cloud-based content management system
US10038731B2 (en) 2014-08-29 2018-07-31 Box, Inc. Managing flow-based interactions with cloud-based shared content
US10110656B2 (en) 2013-06-25 2018-10-23 Box, Inc. Systems and methods for providing shell communication in a cloud-based platform
US10200256B2 (en) 2012-09-17 2019-02-05 Box, Inc. System and method of a manipulative handle in an interactive mobile user interface
US10229134B2 (en) 2013-06-25 2019-03-12 Box, Inc. Systems and methods for managing upgrades, migration of user data and improving performance of a cloud-based platform
US10235383B2 (en) 2012-12-19 2019-03-19 Box, Inc. Method and apparatus for synchronization of items with read-only permissions in a cloud-based environment
US10452667B2 (en) 2012-07-06 2019-10-22 Box, Inc. Identification of people as search results from key-word based searches of content in a cloud-based environment
US10509527B2 (en) 2013-09-13 2019-12-17 Box, Inc. Systems and methods for configuring event-based automation in cloud-based collaboration platforms
US10530854B2 (en) 2014-05-30 2020-01-07 Box, Inc. Synchronization of permissioned content in cloud-based environments
US10554426B2 (en) 2011-01-20 2020-02-04 Box, Inc. Real time notification of activities that occur in a web-based collaboration environment
US10574442B2 (en) 2014-08-29 2020-02-25 Box, Inc. Enhanced remote key management for an enterprise in a cloud-based environment
US10599671B2 (en) 2013-01-17 2020-03-24 Box, Inc. Conflict resolution, retry condition management, and handling of problem files for the synchronization client to a cloud-based platform
US10725968B2 (en) 2013-05-10 2020-07-28 Box, Inc. Top down delete or unsynchronization on delete of and depiction of item synchronization with a synchronization client to a cloud-based platform
US10846074B2 (en) 2013-05-10 2020-11-24 Box, Inc. Identification and handling of items to be ignored for synchronization with a cloud-based platform by a synchronization client
US10866931B2 (en) 2013-10-22 2020-12-15 Box, Inc. Desktop application for accessing a cloud collaboration platform
US10915492B2 (en) 2012-09-19 2021-02-09 Box, Inc. Cloud-based platform enabled with media content indexed for text-based searches and/or metadata extraction
US11210610B2 (en) 2011-10-26 2021-12-28 Box, Inc. Enhanced multimedia content preview rendering in a cloud content management system
US11232481B2 (en) 2012-01-30 2022-01-25 Box, Inc. Extended applications of multimedia content previews in the cloud-based content management system

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5944786A (en) 1996-12-04 1999-08-31 Quinn; Ken Automatic notification of receipt of electronic mail (e-mail) via telephone system without requiring log-on to e-mail server
US5963217A (en) 1996-11-18 1999-10-05 7Thstreet.Com, Inc. Network conference system using limited bandwidth to generate locally animated displays
US6021433A (en) 1996-01-26 2000-02-01 Wireless Internet, Inc. System and method for transmission of data
US6076060A (en) * 1998-05-01 2000-06-13 Compaq Computer Corporation Computer method and apparatus for translating text to sound
US6088673A (en) * 1997-05-08 2000-07-11 Electronics And Telecommunications Research Institute Text-to-speech conversion system for interlocking with multimedia and a method for organizing input data of the same
US6122682A (en) * 1997-03-24 2000-09-19 Toyota Jidosha Kabushiki Kaisha Communication system for controlling data processing according to a state of a communication terminal device
US6181736B1 (en) 1997-03-25 2001-01-30 Nxi Communications, Inc. Network communication system
US6182041B1 (en) * 1998-10-13 2001-01-30 Nortel Networks Limited Text-to-speech based reminder system
US6282511B1 (en) * 1996-12-04 2001-08-28 At&T Voiced interface with hyperlinked information

Cited By (98)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100169367A1 (en) * 2006-05-04 2010-07-01 Samsung Electronics Co., Ltd. Method and device for selecting a word to be defined in mobile communication terminal having an electronic dictionary
US9092403B2 (en) * 2006-05-04 2015-07-28 Samsung Electronics Co., Ltd. Method and device for selecting a word to be defined in mobile communication terminal having an electronic dictionary
US20140229824A1 (en) * 2006-05-04 2014-08-14 Samsung Electronics Co., Ltd. Method and device for selecting a word to be defined in mobile communication terminal having an electronic dictionary
US20140229825A1 (en) * 2006-05-04 2014-08-14 Samsung Electronics Co., Ltd. Method and device for selecting a word to be defined in mobile communication terminal having an electronic dictionary
US9400772B2 (en) * 2006-05-04 2016-07-26 Samsung Electronics Co., Ltd. Method and device for selecting a word to be defined in mobile communication terminal having an electronic dictionary
US10460021B2 (en) 2006-05-04 2019-10-29 Samsung Electronics Co., Ltd. Method and device for selecting a word to be defined in mobile communication terminal having an electronic dictionary
US9519526B2 (en) 2007-12-05 2016-12-13 Box, Inc. File management system and collaboration service and integration capabilities with third party applications
US10554426B2 (en) 2011-01-20 2020-02-04 Box, Inc. Real time notification of activities that occur in a web-based collaboration environment
US9015601B2 (en) 2011-06-21 2015-04-21 Box, Inc. Batch uploading of content to a web-based collaboration environment
US9063912B2 (en) 2011-06-22 2015-06-23 Box, Inc. Multimedia content preview rendering in a cloud content management system
US9652741B2 (en) 2011-07-08 2017-05-16 Box, Inc. Desktop application for access and interaction with workspaces in a cloud-based content management system and synchronization mechanisms thereof
US9978040B2 (en) 2011-07-08 2018-05-22 Box, Inc. Collaboration sessions in a workspace on a cloud-based content management system
US9197718B2 (en) 2011-09-23 2015-11-24 Box, Inc. Central management and control of user-contributed content in a web-based collaboration environment and management console thereof
US8990151B2 (en) 2011-10-14 2015-03-24 Box, Inc. Automatic and semi-automatic tagging features of work items in a shared workspace for metadata tracking in a cloud-based content management system with selective or optional user contribution
US11210610B2 (en) 2011-10-26 2021-12-28 Box, Inc. Enhanced multimedia content preview rendering in a cloud content management system
US9098474B2 (en) * 2011-10-26 2015-08-04 Box, Inc. Preview pre-generation based on heuristics and algorithmic prediction/assessment of predicted user behavior for enhancement of user experience
US20130110854A1 (en) * 2011-10-26 2013-05-02 Kimber Lockhart Preview pre-generation based on heuristics and algorithmic prediction/assessment of predicted user behavior for enhancement of user experience
US8990307B2 (en) 2011-11-16 2015-03-24 Box, Inc. Resource effective incremental updating of a remote client with events which occurred via a cloud-enabled platform
US9015248B2 (en) 2011-11-16 2015-04-21 Box, Inc. Managing updates at clients used by a user to access a cloud-based collaboration service
US9773051B2 (en) 2011-11-29 2017-09-26 Box, Inc. Mobile platform file and folder selection functionalities for offline access and synchronization
US11853320B2 (en) 2011-11-29 2023-12-26 Box, Inc. Mobile platform file and folder selection functionalities for offline access and synchronization
US11537630B2 (en) 2011-11-29 2022-12-27 Box, Inc. Mobile platform file and folder selection functionalities for offline access and synchronization
US10909141B2 (en) 2011-11-29 2021-02-02 Box, Inc. Mobile platform file and folder selection functionalities for offline access and synchronization
US9019123B2 (en) 2011-12-22 2015-04-28 Box, Inc. Health check services for web-based collaboration environments
US11232481B2 (en) 2012-01-30 2022-01-25 Box, Inc. Extended applications of multimedia content previews in the cloud-based content management system
US9965745B2 (en) 2012-02-24 2018-05-08 Box, Inc. System and method for promoting enterprise adoption of a web-based collaboration environment
US10713624B2 (en) 2012-02-24 2020-07-14 Box, Inc. System and method for promoting enterprise adoption of a web-based collaboration environment
US9195636B2 (en) 2012-03-07 2015-11-24 Box, Inc. Universal file type preview for mobile devices
US9054919B2 (en) 2012-04-05 2015-06-09 Box, Inc. Device pinning capability for enterprise cloud service and storage accounts
US9575981B2 (en) 2012-04-11 2017-02-21 Box, Inc. Cloud service enabled to handle a set of files depicted to a user as a single file in a native operating system
US9413587B2 (en) 2012-05-02 2016-08-09 Box, Inc. System and method for a third-party application to access content within a cloud-based platform
US9396216B2 (en) 2012-05-04 2016-07-19 Box, Inc. Repository redundancy implementation of a system which incrementally updates clients with events that occurred via a cloud-enabled platform
US9691051B2 (en) 2012-05-21 2017-06-27 Box, Inc. Security enhancement through application access control
US9027108B2 (en) 2012-05-23 2015-05-05 Box, Inc. Systems and methods for secure file portability between mobile applications on a mobile device
US9280613B2 (en) 2012-05-23 2016-03-08 Box, Inc. Metadata enabled third-party application access of content at a cloud-based platform via a native client to the cloud-based platform
US9552444B2 (en) 2012-05-23 2017-01-24 Box, Inc. Identification verification mechanisms for a third-party application to access content in a cloud-based platform
US8914900B2 (en) 2012-05-23 2014-12-16 Box, Inc. Methods, architectures and security mechanisms for a third-party application to access content in a cloud-based platform
US9021099B2 (en) 2012-07-03 2015-04-28 Box, Inc. Load balancing secure FTP connections among multiple FTP servers
US8719445B2 (en) 2012-07-03 2014-05-06 Box, Inc. System and method for load balancing multiple file transfer protocol (FTP) servers to service FTP connections for a cloud-based service
US9792320B2 (en) 2012-07-06 2017-10-17 Box, Inc. System and method for performing shard migration to support functions of a cloud-based service
US9712510B2 (en) 2012-07-06 2017-07-18 Box, Inc. Systems and methods for securely submitting comments among users via external messaging applications in a cloud-based platform
US10452667B2 (en) 2012-07-06 2019-10-22 Box, Inc. Identification of people as search results from key-word based searches of content in a cloud-based environment
US9237170B2 (en) 2012-07-19 2016-01-12 Box, Inc. Data loss prevention (DLP) methods and architectures by a cloud service
US9794256B2 (en) 2012-07-30 2017-10-17 Box, Inc. System and method for advanced control tools for administrators in a cloud-based service
US8868574B2 (en) 2012-07-30 2014-10-21 Box, Inc. System and method for advanced search and filtering mechanisms for enterprise administrators in a cloud-based environment
US9729675B2 (en) 2012-08-19 2017-08-08 Box, Inc. Enhancement of upload and/or download performance based on client and/or server feedback information
US8745267B2 (en) 2012-08-19 2014-06-03 Box, Inc. Enhancement of upload and/or download performance based on client and/or server feedback information
US9369520B2 (en) 2012-08-19 2016-06-14 Box, Inc. Enhancement of upload and/or download performance based on client and/or server feedback information
US9558202B2 (en) 2012-08-27 2017-01-31 Box, Inc. Server side techniques for reducing database workload in implementing selective subfolder synchronization in a cloud-based environment
US9135462B2 (en) 2012-08-29 2015-09-15 Box, Inc. Upload and download streaming encryption to/from a cloud-based platform
US9450926B2 (en) 2012-08-29 2016-09-20 Box, Inc. Upload and download streaming encryption to/from a cloud-based platform
US9117087B2 (en) 2012-09-06 2015-08-25 Box, Inc. System and method for creating a secure channel for inter-application communication based on intents
US9195519B2 (en) 2012-09-06 2015-11-24 Box, Inc. Disabling the self-referential appearance of a mobile application in an intent via a background registration
US9311071B2 (en) 2012-09-06 2016-04-12 Box, Inc. Force upgrade of a mobile application via a server side configuration file
US9292833B2 (en) 2012-09-14 2016-03-22 Box, Inc. Batching notifications of activities that occur in a web-based collaboration environment
US10200256B2 (en) 2012-09-17 2019-02-05 Box, Inc. System and method of a manipulative handle in an interactive mobile user interface
US9553758B2 (en) 2012-09-18 2017-01-24 Box, Inc. Sandboxing individual applications to specific user folders in a cloud-based service
US10915492B2 (en) 2012-09-19 2021-02-09 Box, Inc. Cloud-based platform enabled with media content indexed for text-based searches and/or metadata extraction
US9959420B2 (en) 2012-10-02 2018-05-01 Box, Inc. System and method for enhanced security and management mechanisms for enterprise administrators in a cloud-based environment
US9705967B2 (en) 2012-10-04 2017-07-11 Box, Inc. Corporate user discovery and identification of recommended collaborators in a cloud platform
US9495364B2 (en) 2012-10-04 2016-11-15 Box, Inc. Enhanced quick search features, low-barrier commenting/interactive features in a collaboration platform
US9665349B2 (en) 2012-10-05 2017-05-30 Box, Inc. System and method for generating embeddable widgets which enable access to a cloud-based collaboration platform
US9628268B2 (en) 2012-10-17 2017-04-18 Box, Inc. Remote key management in a cloud-based environment
US10235383B2 (en) 2012-12-19 2019-03-19 Box, Inc. Method and apparatus for synchronization of items with read-only permissions in a cloud-based environment
US9396245B2 (en) 2013-01-02 2016-07-19 Box, Inc. Race condition handling in a system which incrementally updates clients with events that occurred in a cloud-based collaboration platform
US9953036B2 (en) 2013-01-09 2018-04-24 Box, Inc. File system monitoring in a system which incrementally updates clients with events that occurred in a cloud-based collaboration platform
US9507795B2 (en) 2013-01-11 2016-11-29 Box, Inc. Functionalities, features, and user interface of a synchronization client to a cloud-based environment
US10599671B2 (en) 2013-01-17 2020-03-24 Box, Inc. Conflict resolution, retry condition management, and handling of problem files for the synchronization client to a cloud-based platform
US10846074B2 (en) 2013-05-10 2020-11-24 Box, Inc. Identification and handling of items to be ignored for synchronization with a cloud-based platform by a synchronization client
US10725968B2 (en) 2013-05-10 2020-07-28 Box, Inc. Top down delete or unsynchronization on delete of and depiction of item synchronization with a synchronization client to a cloud-based platform
US9633037B2 (en) 2013-06-13 2017-04-25 Box, Inc. Systems and methods for synchronization event building and/or collapsing by a synchronization component of a cloud-based platform
US10877937B2 (en) 2013-06-13 2020-12-29 Box, Inc. Systems and methods for synchronization event building and/or collapsing by a synchronization component of a cloud-based platform
US11531648B2 (en) 2013-06-21 2022-12-20 Box, Inc. Maintaining and updating file system shadows on a local device by a synchronization client of a cloud-based platform
US9805050B2 (en) 2013-06-21 2017-10-31 Box, Inc. Maintaining and updating file system shadows on a local device by a synchronization client of a cloud-based platform
US10229134B2 (en) 2013-06-25 2019-03-12 Box, Inc. Systems and methods for managing upgrades, migration of user data and improving performance of a cloud-based platform
US10110656B2 (en) 2013-06-25 2018-10-23 Box, Inc. Systems and methods for providing shell communication in a cloud-based platform
US10044773B2 (en) 2013-09-13 2018-08-07 Box, Inc. System and method of a multi-functional managing user interface for accessing a cloud-based platform via mobile devices
US8892679B1 (en) 2013-09-13 2014-11-18 Box, Inc. Mobile device, methods and user interfaces thereof in a mobile device platform featuring multifunctional access and engagement in a collaborative environment provided by a cloud-based platform
US11822759B2 (en) 2013-09-13 2023-11-21 Box, Inc. System and methods for configuring event-based automation in cloud-based collaboration platforms
US9213684B2 (en) 2013-09-13 2015-12-15 Box, Inc. System and method for rendering document in web browser or mobile device regardless of third-party plug-in software
US9704137B2 (en) 2013-09-13 2017-07-11 Box, Inc. Simultaneous editing/accessing of content by collaborator invitation through a web-based or mobile application to a cloud-based collaboration platform
US11435865B2 (en) 2013-09-13 2022-09-06 Box, Inc. System and methods for configuring event-based automation in cloud-based collaboration platforms
US9483473B2 (en) 2013-09-13 2016-11-01 Box, Inc. High availability architecture for a cloud-based concurrent-access collaboration platform
US9535909B2 (en) 2013-09-13 2017-01-03 Box, Inc. Configurable event-based automation architecture for cloud-based collaboration platforms
US9519886B2 (en) 2013-09-13 2016-12-13 Box, Inc. Simultaneous editing/accessing of content by collaborator invitation through a web-based or mobile application to a cloud-based collaboration platform
US10509527B2 (en) 2013-09-13 2019-12-17 Box, Inc. Systems and methods for configuring event-based automation in cloud-based collaboration platforms
US10866931B2 (en) 2013-10-22 2020-12-15 Box, Inc. Desktop application for accessing a cloud collaboration platform
US9734817B1 (en) * 2014-03-21 2017-08-15 Amazon Technologies, Inc. Text-to-speech task scheduling
US10530854B2 (en) 2014-05-30 2020-01-07 Box, Inc. Synchronization of permissioned content in cloud-based environments
US9602514B2 (en) 2014-06-16 2017-03-21 Box, Inc. Enterprise mobility management and verification of a managed application by a content provider
US10038731B2 (en) 2014-08-29 2018-07-31 Box, Inc. Managing flow-based interactions with cloud-based shared content
US11146600B2 (en) 2014-08-29 2021-10-12 Box, Inc. Configurable metadata-based automation and content classification architecture for cloud-based collaboration platforms
US10708321B2 (en) 2014-08-29 2020-07-07 Box, Inc. Configurable metadata-based automation and content classification architecture for cloud-based collaboration platforms
US9756022B2 (en) 2014-08-29 2017-09-05 Box, Inc. Enhanced remote key management for an enterprise in a cloud-based environment
US10708323B2 (en) 2014-08-29 2020-07-07 Box, Inc. Managing flow-based interactions with cloud-based shared content
US9894119B2 (en) 2014-08-29 2018-02-13 Box, Inc. Configurable metadata-based automation and content classification architecture for cloud-based collaboration platforms
US10574442B2 (en) 2014-08-29 2020-02-25 Box, Inc. Enhanced remote key management for an enterprise in a cloud-based environment
US11876845B2 (en) 2014-08-29 2024-01-16 Box, Inc. Configurable metadata-based automation and content classification architecture for cloud-based collaboration platforms

Similar Documents

Publication Publication Date Title
USRE42904E1 (en) System and apparatus for dynamically generating audible notices from an information network
US6557026B1 (en) System and apparatus for dynamically generating audible notices from an information network
US8250467B2 (en) Deriving menu-based voice markup from visual markup
US6018710A (en) Web-based interactive radio environment: WIRE
US8694319B2 (en) Dynamic prosody adjustment for voice-rendering synthesized data
US6636853B1 (en) Method and apparatus for representing and navigating search results
US6721781B1 (en) Method of providing an alternative audio format of a web page in response to a request for audible presentation of the same
US8781840B2 (en) Retrieval and presentation of network service results for mobile device using a multimodal browser
US6295530B1 (en) Internet service of differently formatted viewable data signals including commands for browser execution
Chisholm et al. Web content accessibility guidelines
US20020003547A1 (en) System and method for transcoding information for an audio or limited display user interface
CN117194609A (en) Providing command bundle suggestions for automated assistants
US20070073756A1 (en) System and method configuring contextual based content with published content for display on a user interface
US20070101313A1 (en) Publishing synthesized RSS content as an audio file
EP2037377A1 (en) Methods and systems for providing supplemental contextual content
KR20000011423A (en) Display Screen and Window Size Related Web Page Adaptation System
JP3864197B2 (en) Voice client terminal
US20070100629A1 (en) Porting synthesized email data to audio files
CN1279804A (en) System and method for auditorially representing pages of SGML data
GB2364802A (en) Electronic document delivery and transformation
CN1177150A (en) Web server mechanism for processing function calls for dynamic data queries in web page
US20070100631A1 (en) Producing an audio appointment book
US6922733B1 (en) Method for coordinating visual and speech web browsers
JP2002014893A (en) Web page guiding server for user who use screen reading out software
JP2009151541A (en) Optimum information presentation method in retrieval system

Legal Events

Date Code Title Description
AS Assignment

Owner name: MORPHISM, L.L.C., TEXAS

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:STEPHENS, JAMES H.;REEL/FRAME:023941/0059

Effective date: 1999-08-30

Owner name: FREDERICK MONOCACY LLC, CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MORPHISM, L.L.C.;REEL/FRAME:023941/0084

Effective date: 2004-08-17

RF Reissue application filed

Effective date: 2011-11-07

CC Certificate of correction
FPAY Fee payment

Year of fee payment: 12

AS Assignment

Owner name: ZARBANA DIGITAL FUND LLC, DELAWARE

Free format text: MERGER;ASSIGNOR:FREDERICK MONOCACY LLC;REEL/FRAME:037338/0692

Effective date: 2015-08-11