WO1998016055A1 - Text communication systems - Google Patents

Text communication systems

Info

Publication number
WO1998016055A1
Authority
WO
WIPO (PCT)
Prior art keywords
word
message
words
entered
telephone
Prior art date
Application number
PCT/GB1997/002724
Other languages
French (fr)
Inventor
Jeffrey Wilson
Original Assignee
Intellprop Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from GB9620773A external-priority patent/GB2317982B/en
Application filed by Intellprop Limited
Priority to AU45653/97A priority Critical patent/AU4565397A/en
Publication of WO1998016055A1 publication Critical patent/WO1998016055A1/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/42382 Text-based messaging services in telephone networks such as PSTN/ISDN, e.g. User-to-User Signalling or Short Message Service for fixed networks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/72 Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/724 User interfaces specially adapted for cordless or mobile telephones
    • H04M1/72403 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
    • H04M1/7243 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages
    • H04M1/72436 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages for text messaging, e.g. SMS or e-mail
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M11/00 Telephonic communication systems specially adapted for combination with other electrical systems
    • H04M11/06 Simultaneous speech and data transmission, e.g. telegraphic transmission over the same conductors
    • H04M11/066 Telephone sets adapted for data transmission
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/50 Centralised arrangements for answering calls; Centralised arrangements for recording messages for absent or busy subscribers; Centralised arrangements for recording messages
    • H04M3/53 Centralised arrangements for recording incoming messages, i.e. mailbox systems
    • H04M3/5322 Centralised arrangements for recording incoming messages, i.e. mailbox systems for recording text messages

Definitions

  • This invention relates to communication systems which can be used to send text messages such as alphanumeric messages from a telephone terminal to a desired destination.
  • Known communication systems include selective call or paging systems which generally use automatic means for sending tone or numeric messages to a receiver and manual means to send alphanumeric messages to an alphanumeric receiver.
  • some systems exist which allow letter input from an ordinary MF (multi-frequency) telephone using a predefined code.
  • numeric and alphanumeric pagers that work as follows. If a subscriber wishes a numeric pager user to telephone them, they dial a telephone number consisting of a predefined prefix followed by the number of the required pager. Upon answer they then press the star key twice. As the system captures the caller's calling line identity, their telephone number is then transmitted to the numeric pager. This is a very fast and effective means of transmitting the caller's telephone number, but does not allow specific numeric or alphanumeric messages to be sent.
  • British Telecom operates a paging bureau which allows alphanumeric messages to be sent to any alphanumeric pager.
  • This service provides for the caller to speak the required message to a bureau operator, who then enters the message via a keyboard for transmission to the requested pager.
  • the need to route all alphanumeric messages through a bureau operator leads to relatively high operating costs which are then reflected in the pager user's service charges.
  • a further proposal has involved the use of a telephone keypad having letters as well as numbers.
  • Various standards exist for allocating letters to a numerical telephone keypad, one such being the ISO (International Standards Organisation) standard format. All the standards require a plurality of letters to be associated with each number; for example, the ISO standard associates three or four letters with each number key.
  • the problem that then arises is that, although words can be spelt by means of telephone keypad letter entry (and this entry method is in itself well known), there will be occasions when two or more words will have the same numeric code, in this document such words being referred to as "numnates" (i.e. numerical cognates).
  • the code 63 would in the ISO format be used for the entry of both "of" and "me".
  • a text communication system comprising means for decoding messages entered by the use of telephone dialling means having number entry and a plurality of letters associated with each of at least some of the numbers, the message decoding means comprising a telephony server operable to read back words identified as a number sequence but entered letter by letter by the telephone dialling means, and operable in response to a predetermined input from the telephone dialling means indicating an unwanted word to read back an alternative word associated with the same number sequence, the message decoding means being operable to validate a wanted word in response to further input from the telephone dialling means, the message decoding means further including predictive means, responsive to at least one preceding entered word to set a list of alternative words in a hierarchy depending on the at least one preceding entered word, the telephony server being operable to read back the complete message upon completion of input, the system including means for communicating the complete message.
  • One particular system according to the preferred embodiment of the invention utilises a message entry system based on a dictionary of approximately 12,500 words including place names and approximately 4,500 first names.
  • Each word in the dictionary is assigned a part of speech, or in the case of words that can have multiple parts of speech, a code indicating all of the parts of speech.
  • Each word in the dictionary is defined as a word that is spoken differently from any other word with the same spelling, or any word that is spelt differently from any other word. It is now possible to create a prediction hierarchy based on the following principles. Firstly, in which domain is the system being used? For example, is it for paging messages, SMS messages, fax messages, in-company e-mail, teletext, classified ad entry, etc.? Selecting and defining the domain allows the predictor results to be modified as and when appropriate. Secondly, at any given time in the message there will be key subject pointers so that if "chips" were a subject of discussion we would have an indicator from previous entries as to whether the subject is fish and chips or integrated circuits. Finally, an nth order predictor is used to provide a ranked prediction for any numnate entry where the elements of the predictor can be individual words and/or parts of speech or grammatical position.
  • the telephony server will, in response to a predetermined input from the telephone dialling means, such as the number of times that a particular key has been activated, read back one of the letters associated with the number on that key. For example, if a particular key represents 'A', 'B' or 'C', a single key activation may represent 'A', double activation may represent 'B', and triple activation may represent 'C'.
  • Figure 1 shows an automated text entry messaging system according to an embodiment of the invention
  • Figures 2A to 2E respectively show five different versions of telephone keypad which can be used for text entry in the system of Figure 1; and Figure 3 is a template or illustration showing correspondence between numbers and letters for the preferred keypad shown in Figure 2A.
  • the preferred automated messaging system comprises a telephony server 10 connected to a messaging transmission system, such as a paging transmission system 11 which can transmit to a multiplicity of pagers including a specific pager 12.
  • the telephony server 10 is accessed from a telephone 13 via a telephone network 14.
  • the telephony server 10 may, for example, be a Telsis Hi-Call, particular features of which are described in International Patent Application Publication No. WO 92/22165. In that publication, the telephony server is referred to as a voice services equipment (VSE). Other terms include voice response system (VRS) or interactive voice response (IVR) equipment.
  • the telephony server 10 provides a line interface, tone detection, voice store and a programmable environment including an appropriate program and data structure.
  • Associated with the telephony server 10, and optionally forming part of the telephony server 10, are a vocabulary list and part-of-speech (POS) code store 15 and a corresponding speech output means 16.
  • the vocabulary list and POS code store 15 includes a list of text words to be recognised by the system, and corresponding part-of-speech codes associated with the text words, and also a translation table for translating number key inputs into text words.
  • the speech output means 16 has the ability to provide voice-processed speech output of the words in the store 15, as well as of individual numbers and letters.
  • a caller wishing to send a message to the pager 12 would dial from the telephone 13 a telephone number that can be one or more predefined numbers or a telephone number related in some way to the pager number.
  • 'dial' is intended to cover any form of telephone entry means by which telephone numbers (and the letters represented thereby) can be input by a caller.
  • a predefined pager will be selected based on some other information, which may be the caller's telephone number or other data, or at some time during the interaction between the caller and the telephony server the pager number will be entered either directly or indirectly.
  • the telephony server 10 will answer the call routed via the telephone network 14 and interact with the caller in order to accept in a manner acceptable for widespread use the input of text messages for onward transmission via the paging transmission system 11 to the pager 12.
  • the telephony server 10 is associated with the vocabulary list and POS code store 15 which includes a stored list of text words, namely a vocabulary list, with associated part-of-speech codes, and also means for correlating information input by the caller with the words as well as individual letters or numbers, by means of which the equipment is able to decode messages entered by the use of a telephone dialling means, such as a telephone keypad, and to read back the entered messages by the speech output means 16 for confirmation of correct entry.
  • the telephone 13 may be provided with letters shown on the number keys, for example in accordance with Figure 2A.
  • If the telephone to be used does not have letters on the number keys, it may be possible to provide a template or simply an illustration as in Figure 2A (or any of Figures 2B to 2E as appropriate) for the user to work out which number keys are to be activated. If a template or illustration is provided, the preferred version is as shown in Figure 3 (corresponding to the ISO layout of Figure 2A), provision being made for entering an apostrophe, as described below.
  • key number 0 zero
  • the system will say "spelling" or the like.
  • the code for each letter is then entered, using multiple key activation where necessary. Each time that the hash key is depressed, the system will say the letter and then await the next.
  • the code "* * #" is entered, whereupon the system will say the complete message and ask the user to confirm or validate the message, for example by pressing the key '1' to confirm or the key '0' to reject the message. If '0' is pressed, it is then possible to start with a completely new message. If '1' is pressed, the system will confirm acceptance of the message, which will then be sent to the transmission system 11 for onward transmission/reception and display.
  • the system is configured, as required, for a specified domain, i.e. paging, classified ad entry, etc.
  • it monitors the subject matter by monitoring the use of nouns and adjectives in preceding text in order that decisions can be made on the basis of context.
  • an nth order word predictor is used so that the ranked order of numnates can be varied on a dynamic basis; in this way the behaviour of the system to the user is very effective in its ability to correctly rank the numnates so that selection of alternatives is only infrequently required.
  • the grammatical information for each word in the store 15 is derived from the associated POS code.
  • the predictor provides the system with a means to propose words in an order that seems logically acceptable to the user.
  • "of" is at least ten times more frequent than "me"
  • the correct response ["call me"] to the entry of 2255# 63# can be provided by a predictive element which avoids the message feedback of "call of" which would result from a system based on word frequency ordering.
  • "call of" is an atypical combination in English language messages.
  • Each rule can consist of up to 64 subrules and a subrule may depend upon message position or on the previous word or previous word's part of speech.
  • the preferred vocabulary was selected to be suitable for message applications. As a result of the ISO letter layout the current vocabulary has resulted in almost 1,500 entry codes which are associated with numnates.
  • Words followed by (h) are homographs for which more than one spoken form exists.
  • Table 3 shows the part of speech codes used by a first order predictor system, first order meaning that only the preceding word (or lack of word) is taken into account in the prediction.
  • the first seven codes (1 to 7) refer to phrase position; “som” means “start of message”; therefore, code "2" indicates that the current entry is the very first word entered in the particular message. It will be seen from the table that there are also codes for comma, full stop, question mark, and also when the previous word entry was spelt or where digits were entered. An explanation of the part of speech codes is given in Table 4.
  • the store 15 will include all predictor codes associated with a particular word.
  • Part-of-speech codes (description, code, example):
      ns                ppns       I
      per pronoun. s    pps        he
      per pronoun       perp       I/he
      pos pronoun       posp       my
      ref pronoun       refp       myself
      dec pronoun       decp       mine
      pronoun           pron       me
      contact verb      cverb      call
      modal verb        mverb      shall
      ns verb           nsverb     give
      s verb            sverb      gives
      name verb         nmverb     admire
      ing verb          ingverb    doing
      pres.verb         presv      go
      past.verb         pastv      went
      verb              verb       do
      verb sep          sep        up (turn up)
      month             mon        July
      gen month         gmon       July's
      first name        fname      Joan
      gen f name        gfname     Jeff's
      place             place      London
      iplace            iplace     Newcastle
      mplace            mplace     upon
      fplace            fplace     Tyne
  • Adjectives: An "a adjective" would be preceded by the "a" form of the indefinite article and an "an adjective" would be preceded by the "an" form of the indefinite article.
  • Adverbs: An "a adverb" would be preceded by the "a" form of the indefinite article and an "an adverb" would be preceded by the "an" form of the indefinite article.
  • noun refers to the genitive form of nouns, here taken to be the possessive singular form.
  • the noun category itself covers all noun forms.
  • Preposition: This category includes all English prepositions, some common examples are
  • Query words: These words are frequently used to start a phrase and immediately imply a question. They are <who>, <what>, <when>, <how>, <why>, <where>.
  • Numbers: "Number" is the general category and there are separate markers for cardinals
  • Pronouns are broken into a number of categories. The coarse breakdown is into personal pronouns (e.g. <I>), possessive pronouns (e.g. <my>), reflexive pronouns (e.g. <myself>) and declarative pronouns (e.g. <mine>).
  • the personal pronouns are segmented into two, with the first form consisting of <I>, <you>, <we>, <they>, which are normally associated with verbs in the present tense that do not end in "s", and secondly into the third person singular forms where the verb in the present tense often ends in "s". These include <he>, <she>, <it>, <someone>, <everyone>.
  • Verbs: A contact verb is defined as one which implies contact and could be immediately followed by a pronoun or name.
  • a non-contact verb that could be followed by a name is identified as a "name verb”.
  • Modal verbs are <may>, <might>, <should>, <can>, <could>, <would>, <will>.
  • the "verb sep" indicator marks the separable part of English separable verbs, such as "up" from the verb "to turn up".
  • This category includes <up>, <down>, <out>, <off>, <on>, <in>, <over> and <by>.
  • Month: The months category is self-explanatory and there is also a category for the genitive case.
  • First names: There is a first name category and a second category for the genitive forms. Places: There are four categories for places. The first place category is applicable to all word entries from place names; however, the other three categories indicate in a combined name the order information, whether initial, medial or final.
  • Negation: These words have a negative connotation and include <no>, <not>, <never> and <nor>.
  • Noun-Verb: This selection is made when there are only two numnates, one of which is a noun only and the other is a verb only. In this instance the default subrule structure is for three subrules as follows:
  • indefinite article: noun, verb; definite article: noun, verb; adjective: noun, verb; default: verb, noun.
  • Word-Name: This selection is made when there are only two numnates, one of which is a word and the other a first name.
  • the default rule structure in this instance is
  • contact verb: Name, word; name verb: Name, word; default: word, Name
  • Word-Place: This selection is made when there are only two numnates, one of which is a word and the other a place name.
  • the default rule structure in this instance is: in: Place, word; to: Place, word; default: word, Place
  • W$ is the previous word or part of speech.
  • the entry code is transferred into a string L$ of length L. If L ≤ 6 then L$ is used to point directly into a code table consisting of 1,000,000 16-bit word values. If L > 6 then L$ is passed through a hashing function that mathematically maps each of approximately 8,000 values into a unique 6-digit code which is used to point into a "long word" code table consisting of 1,000,000 16-bit word values.
  • each active entry in the code tables has the following form.
  • Zero If there is no word associated with this code entry - in this case the word must be "spelt" using the spelling mode. [The system logs all spelt words so that any frequently used word for the given domain that is not in the initial vocabulary can be subsequently added.]
  • Positive Integer If there is only one word (i.e. one spoken form) then this integer is the system file number for that word. The system uses this file number to speak back the entered "word".
  • Negative Integer If there is more than one word associated with this entry code then the modulus of this negative integer provides the rule number for this code entry.
  • Each rule has a number of subrules which provide predicted rankings for each of the numnates depending on the preceding text.
  • the system is capable of logging all instances where a user did not select the highest ranked numnate in order that additional subrules can be developed to improve the first choice success rate.
  • this "exception" information can be used to improve the parts of speech definitions.
  • the current system envisages a manual or semi-automatic process for enhancing system performance but automatic means can be implemented in order to improve performance.
  • the telephony server 10 may initially request the user to activate some key combination which will uniquely identify the key layout, or possibly a characteristic word which will provide different (but unique) number codes depending on the key layout.
  • the user may enter, in response to an appropriate request, the name and/or serial number of the telephone, or respond appropriately to a list of such names and/or numbers spoken by the system, such that an affirmative reply will set the appropriate translation mode within the system.
  • the system can be used for the immediate transmission of messages
  • the addition of a database 17 allows other facilities such as reminder services to be offered as well.
  • the user could enter date/time dependent reminders via the telephony server 10 to the database 17, for example providing notification of meetings, birthdays, anniversaries and the like; when the entered date/time matches the current date/time, the appropriate reminder will be sent from the database 17 via the telephony server 10 and the transmission system 11, or alternatively direct from the database 17 to the transmission system 11, to the required pager, which can be either the user's own pager or one belonging to a third party.
  • the system can also be used to expand the facilities available from a telephone system, such as the provision of contact services.
  • a facility can be provided by the telephony server 10 to hold the incoming call (for example, for a predetermined time) whereupon the pager user can telephone the system and be linked up with the caller.
  • the system can be used to control other aspects of telephone systems as well.
  • the telephone network 14 is a public network but it will be apparent that communication between the telephones and the telephony server 10 could in appropriate circumstances be provided by a PABX system instead or as well.
  • the telephony server 10 can be set up to deliver a personalised acknowledgement or greeting when a caller rings in to deliver a message for a particular user. This could be either by way of voice synthesis or could be a prerecorded message. Other types of voice interaction can also be provided by the telephony server 10. For example, if personnel within a company each carry a pager, and the caller does not know an individual's number, it could be possible for the caller to get the required information by specifying the company name, whereupon a list of numbers and associated personnel would be reproduced, or alternatively the system could be configured so that the caller inputs the individual's initials or other letter combinations representing the name so that no knowledge of numbers is required or offered.
  • the telephone dialling means as described above is in the form of a layout of keys on a keypad, such as one using MF dialling.
  • any other form of dialling may be used, even pulse dialling although this would involve some reduction in user convenience and speed of use.
  • a further alternative would be a screen-based processing system in which a representation of the keypad is displayed on the screen, and individual keys may be activated by the use of a mouse, cursor keys or other input entry devices.
  • the words included in the vocabulary list and part-of-speech code store 15 can be selected as desired, and may be customised to fit specific requirements.
  • the vocabulary list may include names, places and parts of addresses in order to simplify use of the system. Common misspellings, slang and US spellings are further examples which can be included in the list.
  • Hyphenated words can be treated in various ways; preferably the hyphen is ignored and the word is input as either two separate words or as a single merged word.
  • the technique described is also applicable to other languages such as German, Italian etc. and also to other alphabets either with appropriate handsets or through the use of templates; in addition the technique can also be used with non-alphabetical languages (e.g. Chinese) that permit alphabetic representation.
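
The code-table lookup and default subrule structures listed above can be sketched as follows. This is only an illustration under stated assumptions: a Python dict stands in for the 1,000,000-entry table, the hashing step for long entry codes is omitted because the text does not specify it, and every file number, rule number and subrule shown is hypothetical. The codes "cverb" and "pron" follow the part-of-speech table above; "prep" is an assumed code for prepositions.

```python
# Sparse stand-in for the direct-index code table: the text describes a
# 1,000,000-entry table indexed by the entry code; a dict is used here.
# All file numbers, rule numbers and subrules below are illustrative.
CODE_TABLE = {
    2255: 101,   # positive: exactly one word, value is its file number
    63: -1,      # negative: numnates, abs(value) is the rule number
}

WORD_FILES = {101: "call", 102: "of", 103: "me"}   # file number -> word
POS_CODES = {"call": "cverb", "of": "prep", "me": "pron"}  # "prep" is assumed

# Rule number -> ordered subrules. Each subrule pairs a condition on the
# previous word's part-of-speech code with a ranking of file numbers;
# the condition None marks the default subrule.
RULES = {
    1: [
        ("cverb", [103, 102]),   # after a contact verb: "me", then "of"
        (None, [102, 103]),      # default: "of", then "me"
    ],
}


def lookup(code, previous_word=None):
    """Return the ranked words for an entry code of up to six digits."""
    entry = CODE_TABLE.get(int(code), 0)
    if entry == 0:
        return []                            # no word: caller must spell it
    if entry > 0:
        return [WORD_FILES[entry]]           # single word
    previous_pos = POS_CODES.get(previous_word)
    for condition, ranking in RULES[abs(entry)]:
        if condition is None or condition == previous_pos:
            return [WORD_FILES[n] for n in ranking]
    return []
```

With these hypothetical tables, `lookup("63", "call")` ranks "me" ahead of "of", while `lookup("63")` with no preceding word falls through to the default subrule and ranks "of" first. An empty list corresponds to the zero entry, where the caller would switch to spelling mode.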

Abstract

In a communication system allowing text messages to be entered on the keypad of a telephone (13), for example for onward transmission by a transmitter (11) to a selected receiver (12) for display on the receiver, a telephony server (10) provides confirmation of correct message entry by speaking back individual words and then the complete message to the caller who has entered the message. The text message can be entered on a numerical keypad also having letters shown on the number keys. When a series of key inputs corresponds to more than one word, the caller can reject the word spoken back by the telephony server (10) until the required word is spoken. A vocabulary list of text words is held in a store (15) along with corresponding part-of-speech codes which indicate the grammar/syntax of each of the words. The order in which a list of possible words is spoken back to the caller is changed according to prediction rules based on at least the immediately preceding word and/or its part-of-speech code.

Description

TEXT COMMUNICATION SYSTEMS
This invention relates to communication systems which can be used to send text messages such as alphanumeric messages from a telephone terminal to a desired destination.
Known communication systems include selective call or paging systems which generally use automatic means for sending tone or numeric messages to a receiver and manual means to send alphanumeric messages to an alphanumeric receiver. However, some systems exist which allow letter input from an ordinary MF (multi-frequency) telephone using a predefined code.
As speech recognition capability improves, there exists the possibility of automatic input of pager messages, but given the large potential vocabulary and large variation found between native speakers this approach is not yet considered fully practicable. In one known paging system, automated services exist for numeric and alphanumeric pagers that work as follows. If a subscriber wishes a numeric pager user to telephone them, they dial a telephone number consisting of a predefined prefix followed by the number of the required pager. Upon answer they then press the star key twice. As the system captures the caller's calling line identity, their telephone number is then transmitted to the numeric pager. This is a very fast and effective means of transmitting the caller's telephone number, but does not allow specific numeric or alphanumeric messages to be sent. For subscribers with alphanumeric pagers, it is possible to enter a message using two keystrokes for each letter; however, this system suffers from a number of disadvantages as the caller receives no feedback on the message input. The combination of awkward entry and lack of feedback makes the service difficult for widespread use.
In the United Kingdom, British Telecom operates a paging bureau which allows alphanumeric messages to be sent to any alphanumeric pager. This service provides for the caller to speak the required message to a bureau operator, who then enters the message via a keyboard for transmission to the requested pager. The need to route all alphanumeric messages through a bureau operator leads to relatively high operating costs which are then reflected in the pager user's service charges.
Traditionally, pagers have been available on a subscription basis with calls to the service priced relatively cheaply. However, a number of new service offerings are now available and one of these, known as "Calling Party Pays" paging, allows a pager user to buy a pager with a one-off payment, with the service revenue being gained from the calling party initiating message transmission, calls being made to premium rate numbers from which the paging operator receives a share of the call revenue. This approach has allowed numeric pagers to be available on a non-subscription basis and generally numeric messages can be input from any telephone using either MF signalling or speech recognition of the digits, i.e. speech recognition using a limited vocabulary in order to achieve good recognition performance.
Current systems do not encourage widespread use of Calling Party Pays paging for alphanumeric pagers since the provision of bureau operators in such a system would result in the costs of the calls being too high for widespread use, except in countries where appropriately-skilled labour costs are low. Even then, efficient labour utilisation is difficult to achieve since busy times requiring larger numbers of operators may be difficult to predict. This problem could to some extent be addressed by the use of recorded call systems in which the operators subsequently process recorded calls, but this would involve both increased system costs and delays in the onward transmission of messages.
It has also been proposed, for example for use in teletext systems, to use two-digit numerical codes, each representing an individual letter, for input of text messages on a letter-by-letter basis. However, this is a slow and cumbersome method of composing messages of any significant length, and also requires the user to have access to a table showing correspondence between letters and codes.
A further proposal has involved the use of a telephone keypad having letters as well as numbers. Various standards exist for allocating letters to a numerical telephone keypad, one such being the ISO (International Standards Organisation) standard format. All the standards require a plurality of letters to be associated with each number; for example, the ISO standard associates three or four letters with each number key. The problem that then arises is that, although words can be spelt by means of telephone keypad letter entry (and this entry method is in itself well known), there will be occasions when two or more words will have the same numeric code, in this document such words being referred to as "numnates" (i.e. numerical cognates). As an example, the code 63 would in the ISO format be used for the entry of both "of" and "me". It has been proposed to resolve this difficulty by using published word order information and offering the user a word choice based on frequency of use, each word upon request being read back to the user in word frequency order and alternatives offered if necessary. For example, in one set of data relating to word frequency, "of" is ten times more likely to occur than "me". Thus, in compiling the message "call me tonight", the user would enter:
2255# and hear "call"
63# and hear "of"
# and hear "me"
8664448# and hear "tonight".
It would appear that the need to press the hash (#) key one further time to hear an alternative choice should not impose too much difficulty for message entry. However, the fact that each time that, for example, "me" is required, the system will always offer "of" as a first choice can lead to considerable irritation. Also, although only one extra key needs to be pressed, there is an additional significant delay due to having to hear both words and, more importantly, there is a problem in understanding that arises from the user having heard and transferred to short term memory the apparent phrase "call of me tonight".
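The way numnates arise can be illustrated with a short sketch. It assumes the letter allocation implied by the codes in Table 1 (2 = ABC, 3 = DEF, 4 = GHI, 5 = JKL, 6 = MNO, 7 = PQRS, 8 = TUV, 9 = WXYZ); the function names are illustrative only and do not come from the described system.

```python
# Letter allocation assumed from the codes quoted in this document.
ISO_KEYS = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_DIGIT = {ch: d for d, letters in ISO_KEYS.items() for ch in letters}


def word_to_code(word):
    """Return the digit string a caller would key for `word` (ASCII letters only)."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in word.lower() if ch.isalpha())


def find_numnates(words):
    """Group words by code and keep only the codes shared by two or more words."""
    groups = {}
    for w in words:
        groups.setdefault(word_to_code(w), []).append(w)
    return {code: ws for code, ws in groups.items() if len(ws) > 1}
```

For instance, `find_numnates(["call", "ball", "of", "me", "tonight"])` groups "call" with "ball" and "of" with "me", while "tonight" has a code of its own.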
According to the invention there is provided a text communication system comprising means for decoding messages entered by the use of telephone dialling means having number entry and a plurality of letters associated with each of at least some of the numbers, the message decoding means comprising a telephony server operable to read back words identified as a number sequence but entered letter by letter by the telephone dialling means, and operable in response to a predetermined input from the telephone dialling means indicating an unwanted word to read back an alternative word associated with the same number sequence, the message decoding means being operable to validate a wanted word in response to further input from the telephone dialling means, the message decoding means further including predictive means, responsive to at least one preceding entered word to set a list of alternative words in a hierarchy depending on the at least one preceding entered word, the telephony server being operable to read back the complete message upon completion of input, the system including means for communicating the complete message.
In a preferred embodiment, it has been found that, in particular with a restricted vocabulary list such as one including only everyday words, many letter combinations will in fact include just one commonly-used word. For those key inputs that represent more than one word, alternative words can be selected from a list, the order or hierarchy of which depends on one or more preceding entered words.
One particular system according to the preferred embodiment of the invention utilises a message entry system based on a dictionary of approximately 12,500 words including place names and approximately 4,500 first names.
Each word in the dictionary is assigned a part of speech, or in the case of words that can have multiple parts of speech, a code indicating all of the parts of speech.
Each word in the dictionary is defined as a word that is spoken differently from any other word with the same spelling, or any word that is spelt differently from any other word. It is now possible to create a prediction hierarchy based on the following principles. Firstly, in which domain is the system being used? For example, is it for paging messages, SMS messages, fax messages, in-company e-mail, teletext, classified ad entry, etc.? Selecting and defining the domain allows the predictor results to be modified as and when appropriate. Secondly, at any given time in the message there will be key subject pointers so that if "chips" were a subject of discussion we would have an indicator from previous entries as to whether the subject is fish and chips or integrated circuits. Finally, an nth order predictor is used to provide a ranked prediction for any numnate entry where the elements of the predictor can be individual words and/or parts of speech or grammatical position.
Much research work is currently underway on collocations, which are often analysed on the basis of words found within a predetermined number of words either side of the defined word. Obviously, for message entry we have no knowledge of following words, but predictors can be built on the basis of an analysis of large amounts of text. The initial message vocabulary for message entry has approximately 1,500 numnate codes and rules have been created for each numnate code. As a simple example of a first order predictor, in the case of "me" and "of" , a simple subrule is that if the preceding word is a contact verb, for example, "call", "phone", "ring" , "contact", etc. , then the predicted word order is "me" first and "of" second. In fact, the practical likelihood of "of" in this case is very small, but the system is configured so that all alternatives associated with the entry code can be selected in order to provide the greatest flexibility. If words are not present in the vocabulary list of the system, then they may be spelt using a spelling mode.
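The "me"/"of" subrule just described can be sketched in a few lines. This is only an illustration: the set of contact verbs is limited to the four examples named in the text, and the function name is hypothetical.

```python
# Contact verbs named as examples in the text; a real vocabulary would mark many more.
CONTACT_VERBS = {"call", "phone", "ring", "contact"}


def rank_me_of(previous_word):
    """Rank the numnates for code 63 given the preceding word (None at start of message)."""
    if previous_word is not None and previous_word.lower() in CONTACT_VERBS:
        return ["me", "of"]   # "call me", "ring me", ...
    return ["of", "me"]       # default ordering, "of" being the more frequent word
```

Both words stay in the returned list in every case, reflecting the statement that all alternatives associated with the entry code remain selectable.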
In the preferred embodiment, it is possible to enter individual letters if the word to be input is not in a vocabulary list of the telephony server. Assuming that the number input is associated with more than one letter, the telephony server will, in response to a predetermined input from the telephone dialling means, such as the number of times that a particular key has been activated, read back one of the letters associated with the number on that key. For example, if a particular key represents 'A', 'B' or 'C', a single key activation may represent 'A', double activation may represent 'B', and triple activation may represent 'C'.
In the preferred system, the fact that the complete message is read back to the caller prior to onward transmission provides a further opportunity for any errors in the message to be identified and corrected before transmission.
The invention will now be described by way of example with reference to the accompanying drawings, throughout which like parts are referred to by like references, and in which:
Figure 1 shows an automated text entry messaging system according to an embodiment of the invention;
Figures 2A to 2E respectively show five different versions of telephone keypad which can be used for text entry in the system of Figure 1; and
Figure 3 is a template or illustration showing correspondence between numbers and letters for the preferred keypad shown in Figure 2A.
Referring to Figure 1 of the drawings, the preferred automated messaging system comprises a telephony server 10 connected to a messaging transmission system, such as a paging transmission system 11 which can transmit to a multiplicity of pagers including a specific pager 12. The telephony server 10 is accessed from a telephone 13 via a telephone network 14.
The telephony server 10 may, for example, be a Telsis Hi-Call, particular features of which are described in International Patent Application Publication No. WO 92/22165. In that publication, the telephony server is referred to as a voice services equipment (VSE). Other terms include voice response system (VRS) or interactive voice response (IVR) equipment. The telephony server 10 provides a line interface, tone detection, voice store and a programmable environment including an appropriate program and data structure.
Associated with the telephony server 10, and optionally forming part of the telephony server 10, are a vocabulary list and part-of-speech (POS) code store 15 and a corresponding speech output means 16. The vocabulary list and POS code store 15 includes a list of text words to be recognised by the system, and corresponding part-of-speech codes associated with the text words, and also a translation table for translating number key inputs into text words. The speech output means 16 has the ability to provide voice-processed speech output of the words in the store 15, as well as of individual numbers and letters.
The operation of the telephony server 10, the vocabulary list and POS code store 15 and the speech output means 16 will now be described in the context of automating message entry and transmission to pagers.
A caller wishing to send a message to the pager 12 would dial from the telephone 13 a telephone number that can be one or more predefined numbers or a telephone number related in some way to the pager number. The use of the word 'dial' is intended to cover any form of telephone entry means by which telephone numbers (and the letters represented thereby) can be input by a caller.
In the event that the pager number does not form some part of the telephone number dialled, then either a predefined pager will be selected based on some other information, which may be the caller's telephone number or other data, or at some time during the interaction between the caller and the telephony server the pager number will be entered either directly or indirectly.
The telephony server 10 will answer the call routed via the telephone network 14 and interact with the caller in order to accept in a manner acceptable for widespread use the input of text messages for onward transmission via the paging transmission system 11 to the pager 12.
As described above, the telephony server 10 is associated with the vocabulary list and POS code store 15 which includes a stored list of text words, namely a vocabulary list, with associated part-of-speech codes, and also means for correlating information input by the caller with the words as well as individual letters or numbers, by means of which the equipment is able to decode messages entered by the use of a telephone dialling means, such as a telephone keypad, and to read back the entered messages by the speech output means 16 for confirmation of correct entry.
In order for effective input to be possible, the telephone 13 may be provided with letters shown on the number keys, for example in accordance with Figure 2A.
This is the above-mentioned ISO number/letter layout being introduced in the UK; other layouts such as those shown in Figures 2B to 2E may be used, for example in other countries such as the USA, and the access to the vocabulary list and POS code store 15 associated with the telephony server 10 would need to be arranged accordingly to allow for any different layout, as discussed below.
If the telephone to be used does not have letters on the number keys, it may be possible to provide a template or simply an illustration as in Figure 2A (or any of Figures 2B to 2E as appropriate) for the user to work out which number keys are to be activated. If a template or illustration is provided, the preferred version is as shown in Figure 3 (corresponding to the ISO layout of Figure 2A), provision being made for entering an apostrophe, as described below.
Operation of the preferred system will be described by referring to specific examples of word input, these corresponding to the key layout shown in Figures 2A and 3. In general, in the preferred system, to enter a word, it is sufficient to press the keys which contain each of the letters in turn, then to press the hash (#) key.
For example, to enter "OFFICE", the following keys would be depressed: "6 3 3 4 2 3 #". Assuming that the word is in the vocabulary list and speech code store 15 associated with the telephony server 10, the word "office" will be identified from the number key inputs, and the speech output means 16 will cause the system to say "office" as soon as the hash key is activated and will then await the next word.
It may sometimes be necessary to enter numbers, such as for time or date identification purposes. This is done by firstly depressing the star (*) key. For example, to enter "636", the following keys would be depressed: "* 6 3 6 #". The system will say "six, three, six" as soon as the hash key is activated.
If a desired word is not in the vocabulary list store 15, the system will say "no entry" or similar upon attempted entry. In that case, it is possible to spell the message letter by letter; multiple activation of the corresponding key is used, as follows. For example, the number 2 represents 'A', 'B' or 'C' in Figures 2A and 3. One activation of the key represents 'A', two represent 'B' and three represent 'C'. In order to enter spelling mode, key number 0 (zero) is depressed, whereupon the system will say "spelling" or the like. The code for each letter is then entered, using multiple key activation where necessary. Each time that the hash key is depressed, the system will say the letter and then await the next.
For example, to enter "ESHER", first input "0" to enter spelling mode, then "3 3 #" for 'E', "7 7 7 7 #" for 'S', "4 4 #" for 'H', "3 3 #" for 'E', and "7 7 7 #" for 'R'. Each letter will be spoken back by the system upon each depression of the hash key. At the end of the spelled word, the key "0" is depressed once more, and the system will confirm by saying "end spelling" or the like.
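The multiple-activation scheme used in spelling mode can be sketched as follows. The sketch decodes only the letter groups keyed between the opening and closing "0"; the key allocation is the one assumed from Figures 2A and 3, and the function name is illustrative.

```python
ISO_KEYS = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}


def decode_spelling(keys):
    """Decode hash-terminated letter groups such as '3 3 # 7 7 7 7 #' into text."""
    letters = []
    for group in keys.replace(" ", "").split("#"):
        if not group:
            continue                      # nothing keyed after the final '#'
        digit = group[0]
        if digit not in ISO_KEYS or set(group) != {digit}:
            raise ValueError(f"invalid letter group: {group!r}")
        options = ISO_KEYS[digit]
        if len(group) > len(options):
            raise ValueError(f"too many activations of key {digit}")
        letters.append(options[len(group) - 1])   # n activations select the nth letter
    return "".join(letters).upper()
```

Applied to the key sequence of the example, `decode_spelling("3 3 # 7 7 7 7 # 4 4 # 3 3 # 7 7 7 #")` yields "ESHER".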
In order to include an apostrophe (') in a word, the number key '1' is depressed as shown in the template or illustration of Figure 3.
As indicated above, there are situations in which different common words will have the same code, for example "RAIN" and "PAIN" both have the code 7246. In such cases, upon input of "7 2 4 6 #", the system will say whichever of the words with that code has been predicted to be more likely, as described below. If the spoken-back word is the desired word, the user proceeds to the next word in the message to be input in the usual way; if not, the hash key is pressed again, and the system will say the alternate word. If there are more than two possible words, the user continues to depress the hash key until the desired word is spoken. Again, once this has occurred, the user proceeds to the next word in the message to be input. When the user comes to the end of the list of words, the system may say "no entry" whereupon the user can continue to press the hash key to hear the list of words again in turn, or may press the key "0" to enter the spelling mode as discussed above.
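The read-back behaviour on repeated presses of the hash key can be sketched as a simple cycle. This is an illustration under stated assumptions: the ranked list is taken as given (the order "rain", "pain" in the usage note is hypothetical), and the "no entry" announcement is treated as one step in the cycle.

```python
def spoken_after(ranked, extra_hash_presses):
    """Return what the system would say after the initial '#' plus further '#' presses.

    The ranked words are offered in turn, then "no entry", after which
    further presses go through the list again.
    """
    cycle = list(ranked) + ["no entry"]
    return cycle[extra_hash_presses % len(cycle)]
```

With a ranked list of `["rain", "pain"]`, no further press gives "rain", one press gives "pain", two give "no entry", and a third returns to "rain".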
During message entry the user can enter "***" and the system will speak back the message entered so far.
Once the message has been completed, the code "* * #" is entered, whereupon the system will say the complete message and ask the user to confirm or validate the message, for example by pressing the key "1" to confirm or the key "0" to reject the message. If "0" is pressed, it is then possible to start with a completely new message. If "1" is pressed, the system will confirm acceptance of the message, which will then be sent to the transmission system 11 for onward transmission/reception and display.
The manner in which words are predicted by the system, in the event that numnates exist, in other words if a numerical code has more than one word associated with it, will now be described.
Firstly, the system is configured, as required, for a specified domain, i.e. paging, classified ad entry, etc. Secondly, it monitors the subject matter by monitoring the use of nouns and adjectives in preceding text in order that decisions can be made on the basis of context. Finally, an nth order word predictor is used so that the ranked order of numnates can be varied on a dynamic basis; in this way the behaviour of the system to the user is very effective in its ability to correctly rank the numnates so that selection of alternatives is only infrequently required. As explained below, the grammatical information for each word in the store 15 is derived from the associated POS code.
In addition, in those cases where the first ranked numnate does not match the user's intention, the predictor provides the system with a means to propose words in an order that seems logically acceptable to the user. Although in one of the word frequency studies "of" is at least ten times more frequent than "me", the correct response ("call me") to the entry of 2255# 63# can be provided by a predictive element which avoids the message feedback of "call of" which would result from a system based on word frequency ordering. Clearly, "call of" is an atypical combination in English language messages.
Whereas a general grammar of English must address all words in the dictionary, there is, in fact, no particular need for a telephone-based text entry predictor to incorporate rules for many of the words, as these are uniquely specified by the associated code. It is, however, desirable to track parts of speech when these are not unambiguously defined by the text input.
The vocabulary of the preferred implementation described here consists of approximately 12,500 words and approximately 4,500 first names. There are approximately 1,500 codes for which numnates exist and predictive rules have been produced for each of these codes. These predictive rules are invoked by the respective speech codes, and the processor in the telephony server 10 applies the rules in the context of the preceding entered words (if any).
Each rule can consist of up to 64 subrules and a subrule may depend upon message position or on the previous word or previous word's part of speech.
The preferred vocabulary was selected to be suitable for message applications. With the ISO letter layout, the current vocabulary yields almost 1,500 entry codes which are associated with numnates.
Some of these common numnates are considered below in Table 1. In addition to the common numnates there are other words which are homographs (different pronunciations of words) of each other, such as "use" (noun) and "use" (verb). Once again, it is important that the rules produce meaningful rankings. Table 2 lists homographs which could form a part of the vocabulary in British English.
TABLE 1
COMMON NUMNATES
Words followed by (h) are homographs for which more than one spoken form exists.
23 be Ad
26 an a.m. am Bo
29 by Cy
43 he if
46 go in
63 me of
66 no on Mo
69 my Oz
76 so p.m.
87 up us
223 bad ace Aad Abe
226 can ban
227 car bar cap
228 act bat cat Abu
233 add bed Bee
243 age aid bid
247 air Cis
263 and cod Bod
269 any boy box cow coy bow(h) Amy Cox
287 cup bus
288 but cut
327 far ear
329 day fax
343 die did
359 fly Ely
363 end doe
393 eye dye
446 him gin
447 his hip
468 got gnu hot
487 its Gus
529 law jaw lay Jay
538 let jet Kev
562 job lob
568 lot Lou
569 low joy
626 man mam nan
629 may Max
633 off odd Ned
645 oil nil
663 one nod Noe One
669 now mow
688 out nut
729 pay say saw
733 red see ref
738 set pet
743 she pie Sid
747 sir pip
749 six shy
766 son Ron Soo
786 run rum sum sun
829 tax Tay
836 ten Udo
843 the tie
866 too ton Tom
873 use(h)
896 two Tyn
927 war was
929 way wax
938 yet wet
946 who win
2253 able bake bald cake
2255 call ball
2273 base care case acre
2328 beat Beau
2582 club Alva
2583 blue clue
2639 body Andy Cody
2663 come bond bone
2665 book cook cool
2684 both anti
2769 army brow
3278 east dart fast
3327 fear dear
3463 find dine fine
3483 five dive
3663 food done fond Enne
3667 door Enos
3696 down Enzo
3733 free Fred
3837 ever eves
3855 full dull Fulk
4253 half gale Hale
4263 game hand
4273 hard hare
4283 have gate gave hate
4323 head Hebe
4327 hear gear heap
4373 here herd Gerd
4444 high Gigi
4653 hold gold golf hole Iole
4663 good home gone Ione
4687 hour hots
4739 grey grew
5263 land lane Jane Kane
5283 late Kate
5337 keep Jeep
5377 less Jess Kerr
5433 life lied
5463 kind line lime
5483 live(h) jive
5646 join logo John
5664 long Joni Kong
5673 lose lord Jose
5683 love loud
6253 make male
6263 name Mame
6327 near neap
6397 news mews
6463 mind mine nine mime
6673 more nose
6683 move note
7243 page paid raid said
7253 sale pale sake Ralf
7263 same rand sand
7278 part past
7283 rate save rave
7323 read(h)
7325 real peak seal
7335 seek peel
7336 seem seen
7363 send pend Rene
7433 side ride shed
7473 rise pipe ripe
7653 role pole sold Rolf Sole
7663 some roof Rome
7666 room soon
7678 sort port post
7827 star pubs
7829 stay quay ruby
7837 step sues
7867 stop pump rump runs suns
7873 sure pure surf
8253 take tale Vale
8255 talk tall
8436 them then Theo
8439 they view tidy
9255 walk wall
9327 year wear Wear
9355 well Yell
9378 west wept
9433 wide wife
9673 word were
9675 work York
22733 based cased
22779 carry Barry
22873 cause abuse(h)
24246 again chain
24453 child Chile
25246 claim Alain Albin Blain
25673 close(h)
27696 brown crown
34448 eight digit
36789 forty empty
46873 house(h)
53278 least leapt
63337 offer needs odder
63837 never meter
64448 might night
66639 money moody
73224 reach peach
74273 share phase sh
74448 right sight
76863 pound sound ro
78278 start quart
78425 quick stick
78483 quite suite
82583 value valve
84373 there these
84733 three tired Ti
92837 water waves
233673 before afford
628837 matter natter
646883 minute(h)
668437 mother movies
686237 number ounces
732663 second(h)
732673 record(h)
732766 reason season
737678 report resort
967537 worker worlds
6334237 officer offices
7762377 process(h)
7763823 produce(h)
266648833 committee committed
TABLE 2
HOMOGRAPHS
abstract      incline     resume
abuse         increase    row
alternate     insert      segment
annex         insult      separate
appropriate   invalid     sow
associate     lead        subject
attribute     live        suspect
august        lives       tear
close         minute      transform
combine       misuse      transport
compact       moderate    upset
compound      object      use
conduct       perfect     wind
console       permit
content       polish
console       present
contest       proceed
contract      produce
contrast      progress
convert       project
convict       protest
delegate      read
deliberate    recall
duplicate     record
estimate      refresh
excuse        refund
export        refuse
graduate      reject
import        research
The predictive process is used in order to ensure that numnates are ranked appropriately. This can be a multi-level process based on grammatical constraints involving rules of syntax and, if appropriate, semantics, and can also incorporate pragmatic information.
Common parts of speech in English are adjectives, adverbs, nouns, pronouns, verbs and names. However, the system can also recognise other parts of speech and, in the preferred system, it is more important that the rules are appropriate to each of the numnates rather than to a general English language environment.
Individual words can in themselves be considered parts of speech, and the predictor uses word-specific information as well as part of speech information.
The following Table 3 shows the part of speech codes used by a first order predictor system, first order meaning that only the preceding word (or lack of word) is taken into account in the prediction. The first seven codes (1 to 7) refer to phrase position; "som" means "start of message"; therefore, code "2" indicates that the current entry is the very first word entered in the particular message. It will be seen from the table that there are also codes for comma, full stop, question mark, and also when the previous word entry was spelt or where digits were entered. An explanation of the part of speech codes is given in Table 4.
The part of speech code definitions overlap and each word in the vocabulary is marked with all appropriate parts of speech. Thus the store 15 will include all predictor codes associated with a particular word.
In view of the fact that the vocabulary of the system includes many first names, an important part of the speech categories is the contact verb, this being a verb occurring before pronouns such as "me" and/or first names.
TABLE 3
PART OF SPEECH CODES
Code  Brief Description   Shortform  Example
1     som , . ?
2     som
3     ,
4     .
5     ?
6     spelling
7     digits
8     a adjective         aadj       big
9     an adjective        anadj      awful
10    adjective           adj        small
11    a adverb            aadv       quickly
12    an adverb           anadv      angrily
13    adverb              adv        slowly
14    conjunction         conj       and
15    demonstrative       dem        that
16    interjection        interj     oh
17    preposition         prep       to
18    query word          qword      who
19    a noun              anoun      man
20    an noun             announ     eagle
21    sing. noun          snoun      man
22    pl. noun            plnoun     men
23    noun                noun       man
24    gen noun            gnoun      man's
25    cardinal            card       three
26    ordinal             ord        third
27    number              num        ten
      per pronoun.ns      ppns       I
      per pronoun.s       pps        he
      per pronoun         perp       I/he
      pos pronoun         posp       my
      ref pronoun         refp       myself
      dec pronoun         decp       mine
      pronoun             pron       me
      contact verb        cverb      call
      modal verb          mverb      shall
      ns verb             nsverb     give
      s verb              sverb      gives
      name verb           nmverb     admire
      ing verb            ingverb    doing
      pres. verb          presv      go
      past. verb          pastv      went
      verb                verb       do
      verb sep            sep        up (turn up)
      month               mon        July
      gen month           gmon       July's
      first name          fname      Joan
      gen f name          gfname     Jeff's
      place               place      London
      iplace              iplace     Newcastle
      mplace              mplace     upon
      fplace              fplace     Tyne
TABLE 4
PARTS OF SPEECH DESCRIPTIONS
Adjectives: An "a adjective" would be preceded by the "a" form of the indefinite article and an "an adjective" would be preceded by the "an" form of the indefinite article.
Adverbs: An "a adverb" would be preceded by the "a" form of the indefinite article and an "an adverb" would be preceded by the "an" form of the indefinite article.
Conjunctions: The following are marked as conjunctions in the vocabulary: <and>, <but>, <however>, <if>.
Demonstratives: <This>, <that>, <these>, <those>.
Interjections: <Oh>
Nouns: An "a noun" would be preceded by the "a" form of the indefinite article and an "an noun" would be preceded by the "an" form of the indefinite article.
Singular and plural noun descriptors are self explanatory and "gen noun" refers to the genitive form of nouns, here taken to be the possessive singular form. The noun category itself covers all noun forms.
Preposition: This category includes all English prepositions; some common examples are <to>, <on>, <at>.
Query words: These words are frequently used to start a phrase and immediately imply a question. They are <who>, <what>, <when>, <how>, <why>, <where>.
Numbers: "Number" is the general category and there are separate markers for cardinals (e.g. "one") and ordinals (e.g. "first").
Pronouns: Pronouns are broken into a number of categories. The coarse breakdown is into personal pronouns (e.g. <I>), possessive pronouns (e.g. <my>), reflexive pronouns (e.g. <myself>) and declarative pronouns (e.g. <mine>). In addition, the personal pronouns are segmented into two, with the first form consisting of <I>, <you>, <we>, <they>, which are normally associated with verbs in the present tense that do not end in "s", and secondly into the third person singular forms where the verb in the present tense often ends in "s". These include <he>, <she>, <it>, <someone>, <everyone>.
Verbs: A contact verb is defined as one which implies contact and could be immediately followed by a pronoun or name.
A non-contact verb that could be followed by a name is identified as a "name verb".
Modal verbs are <may>, <might>, <should>, <can>, <could>, <would>, <will>.
The "ns verb" and "s verb" markers are used to differentiate regular English verbs in the present tense, where the third person singular takes an "s".
Present and past markers are self explanatory, and "ing" form markers refer to any verb ending in "ing".
The "verb sep" indicator marks the separable part of English separable verbs, such as "up" in the verb "to turn up". This category includes <up>, <down>, <out>, <off>, <on>, <in>, <over> and <by>.
Month: The months category is self explanatory and there is also a category for the genitive case.
Names: There is a first name category and a second category for the genitive forms.
Places: There are four categories for places. The first place category is applicable to all word entries from place names; however, the other three categories indicate in a combined name the order information, whether initial, medial or final.
Negation: These words have a negative connotation and include <no>, <not>, <never> and <nor>.
The rules used in the prediction technique provide for a number of predefined structures and these are described below.
i) Unique: Where the unique option has been selected only one of the numnates is selected and there is only one unique default subrule with a single file entry (the selected numnate).
ii) Default: Where default has been selected this indicates that there is only one subrule and therefore in all instances the ranked order is fixed.
iii) Alphabetic: If default and alphabetic are selected this indicates that there is only one default subrule in which the numnates are ranked alphabetically.
iv) Noun-Verb: This selection is made when there are only two numnates, one of which is a noun only and the other is a verb only. In this instance the default subrule structure is for three subrules as follows:
indefinite article (a or an as appropriate): noun, verb
definite article: noun, verb
adjective: noun, verb
default: verb, noun
v) Word-Name: This selection is made when there are only two numnates, one of which is a word and the other a first name. The default rule structure in this instance is
contact verb: Name, word
name verb: Name, word
default: word, Name
vi) Word-Place: This selection is made when there are only two numnates, one of which is a word and the other a place name. The default rule structure in this instance is
in: Place, word
to: Place, word
default: word, Place
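The Noun-Verb, Word-Name and Word-Place structures lend themselves to small template functions. The sketch below is an assumption about how such templates might be represented, not the system's actual data format: each rule is an ordered list of (condition, ranking) pairs, and `context` is a hypothetical set holding the previous word together with its part-of-speech labels.

```python
def noun_verb_rule(noun, verb):
    """Subrules of the Noun-Verb structure, in the order listed in the text."""
    return [
        ("indefinite article", [noun, verb]),
        ("definite article", [noun, verb]),
        ("adjective", [noun, verb]),
        ("default", [verb, noun]),
    ]


def word_name_rule(word, name):
    return [
        ("contact verb", [name, word]),
        ("name verb", [name, word]),
        ("default", [word, name]),
    ]


def word_place_rule(word, place):
    return [
        ("in", [place, word]),
        ("to", [place, word]),
        ("default", [word, place]),
    ]


def rank(rule, context):
    """Return the ranking of the first subrule whose condition appears in `context`."""
    for condition, ranking in rule:
        if condition != "default" and condition in context:
            return ranking
    return dict(rule)["default"]
```

Taking the Table 1 pair "work"/"York" as a Word-Place example, a preceding "to" ranks "York" first, whereas any other context falls back to "work" first.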
Some specific rules based on a first order predictor are given below as illustrations (W$ is the previous word or part of speech).
go:in
W$        Ranking
"are"     in go
"go"      in go
"her"     in go
"him"     in go
"is"      in go
"it"      in go
"its"     in go
"them"    in go
noun      in go
default   go in

far:ear
W$        Ranking
"an"      ear far
"deaf"    ear far
"her"     ear far
"his"     ear far
"inner"   ear far
"its"     ear far
"middle"  ear far
"my"      ear far
"one"     ear far
"wear"    ear far
"with"    ear far
"wore"    ear far
"worn"    ear far
"your"    ear far
default   far ear
The operation of the system is described below for a specific domain; however, rules can be domain dependent and although the rules illustrated here are fixed, rules can be dependent upon subject matter.
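The two illustrative rules given above (go:in and far:ear) can be written down as data and applied by a first order predictor. The sketch below is one possible encoding; the class and function names are hypothetical, and the only facts taken from the text are the trigger words, the rankings and the limit of 64 subrules per rule.

```python
from typing import NamedTuple, Tuple

MAX_SUBRULES = 64  # the text states that a rule can hold up to 64 subrules


class Subrule(NamedTuple):
    kind: str                   # "word" matches W$ literally, "pos" matches its part of speech
    value: str
    ranking: Tuple[str, ...]


class Rule(NamedTuple):
    subrules: Tuple[Subrule, ...]
    default: Tuple[str, ...]


def make_rule(words, pos_tags, matched, default):
    subrules = tuple(Subrule("word", w, matched) for w in words)
    subrules += tuple(Subrule("pos", p, matched) for p in pos_tags)
    if len(subrules) > MAX_SUBRULES:
        raise ValueError("a rule may hold at most 64 subrules")
    return Rule(subrules, default)


GO_IN = make_rule(
    ["are", "go", "her", "him", "is", "it", "its", "them"], ["noun"],
    matched=("in", "go"), default=("go", "in"))

FAR_EAR = make_rule(
    ["an", "deaf", "her", "his", "inner", "its", "middle", "my", "one",
     "wear", "with", "wore", "worn", "your"], [],
    matched=("ear", "far"), default=("far", "ear"))


def rank(rule, prev_word=None, prev_pos=()):
    """Apply the first matching subrule; fall back to the rule's default ranking."""
    for sub in rule.subrules:
        if sub.kind == "word" and prev_word is not None and sub.value == prev_word.lower():
            return sub.ranking
        if sub.kind == "pos" and sub.value in prev_pos:
            return sub.ranking
    return rule.default
```

A higher order predictor could be approximated by letting each subrule inspect more than one preceding word, which is left out here to keep the sketch short.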
For word entry, the entry code is transferred into a string L$ of length L. If L ≤ 6 then L$ is used to point directly into a code table consisting of 1,000,000 16-bit word values. If L > 6 then L$ is passed through a hashing function that mathematically maps each of approximately 8,000 values into a unique 6-digit code which is used to point into a "long word" code table consisting of 1,000,000 16-bit word values.
Although there are fewer than 20,000 active entries in this 4 Mbyte table structure, this design approach provides a fast and easy look-up mechanism. Each entry in the code tables takes one of the following forms.
Zero: If there is no word associated with this code entry; in this case the word must be "spelt" using the spelling mode. [The system logs all spelt words so that any frequently used word for the given domain that is not in the initial vocabulary can be subsequently added.]
Positive Integer: If there is only one word (i.e. one spoken form) then this integer is the system file number for that word. The system uses this file number to speak back the entered "word".
Negative Integer: If there is more than one word associated with this entry code then the modulus of this negative integer provides the rule number for this code entry.
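The code table scheme can be sketched with two arrays of 1,000,000 signed 16-bit values. The hashing function for long codes is not specified in the text, so the sketch substitutes a dictionary that hands out sequential slots; that substitution, the class name and the file and rule numbers used in the usage note are all assumptions.

```python
from array import array

TABLE_SIZE = 1_000_000  # one slot per possible six-digit index


class CodeTables:
    def __init__(self):
        self.short = array("h", [0]) * TABLE_SIZE   # codes of up to six digits
        self.long = array("h", [0]) * TABLE_SIZE    # "long word" table
        self._long_index = {}                       # stand-in for the unspecified hashing function

    def _slot(self, code, create=False):
        if len(code) <= 6:
            return self.short, int(code)            # the code itself is the index
        if code not in self._long_index:
            if not create:
                return self.long, None
            self._long_index[code] = len(self._long_index)
        return self.long, self._long_index[code]

    def set_word(self, code, file_number):
        table, index = self._slot(code, create=True)
        table[index] = file_number                  # positive: a single spoken form

    def set_rule(self, code, rule_number):
        table, index = self._slot(code, create=True)
        table[index] = -rule_number                 # negative: numnates, consult a rule

    def lookup(self, code):
        table, index = self._slot(code)
        value = 0 if index is None else table[index]
        if value == 0:
            return ("spell", None)                  # no entry: word must be spelt
        if value > 0:
            return ("word", value)
        return ("rule", -value)
```

After `set_word("633423", 101)` and `set_rule("63", 7)`, a lookup of "633423" returns the file number 101, a lookup of "63" returns rule 7, and any unassigned code returns the "spell" outcome.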
Each rule has a number of subrules which provide predicted rankings for each of the numnates depending on the preceding text.
As an example, a subrule would exist for 63 such that whereas the default ranking would be
1 of
2 me
the ranking after CALL would be
1 me
2 of
so that in a message such as
CALL ME TONIGHT
the entry 63 would result in the selection "ME" whereas in a message such as
SOME OF US ARE ...
the entry 63 would result in the selection "OF" .
Although one version of this system is based on a first order predictor, it is a simple matter to modify the subrule data structure in order to provide multiple order prediction.
The system is capable of logging all instances where a user did not select the highest ranked numnate in order that additional subrules can be developed to improve the first choice success rate.
In a similar manner, this "exception" information can be used to improve the parts of speech definitions.
The current system envisages a manual or semi-automatic process for enhancing system performance but automatic means can be implemented in order to improve performance.
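The logging of "exception" cases described above can be sketched with a counter keyed by entry code, preceding word and the word finally chosen. The class name, the record layout and the threshold are illustrative assumptions; the text states only that such instances are logged for later rule development.

```python
from collections import Counter


class SelectionLog:
    """Counts the occasions on which the caller rejected the first ranked numnate."""

    def __init__(self):
        self.misses = Counter()

    def record(self, code, previous_word, ranked, chosen):
        if ranked and chosen != ranked[0]:
            self.misses[(code, previous_word, chosen)] += 1

    def candidates(self, min_count=2):
        """Contexts seen at least `min_count` times, most frequent first."""
        return [key for key, n in self.misses.most_common() if n >= min_count]
```

A context that recurs in `candidates()` suggests a new subrule to be written, whether by the manual or semi-automatic process mentioned in the text.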
The system has been described as operating with a telephone keyboard layout as shown in Figure 2A. As mentioned above, the translation between number key activation and the text words in the vocabulary list store 15 will need to be changed if a different keyboard layout, such as one of the layouts shown in Figures 2B to 2E, is used instead. In a simple system, the translation between a particular keyboard layout and the associated letters may be fixed, and this will be adequate in situations where it is definite that no other keyboard layout will be used. However, when it is possible that different users may have different keyboard layouts, some means of layout identification should be provided. For example, in a script-determined system, the telephony server 10 may initially request the user to activate some key combination which will uniquely identify the key layout, or possibly a characteristic word which will provide different (but unique) number codes depending on the key layout. Alternatively, the user may enter, in response to an appropriate request, the name and/or serial number of the telephone, or respond appropriately to a list of such names and/or numbers spoken by the system, such that an affirmative reply will set the appropriate translation mode within the system.
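Layout identification by means of a characteristic word can be sketched as follows. The letter allocations of Figures 2B to 2E are not reproduced in the text, so the second layout below (Q and Z on key 1) is purely hypothetical, as are the layout names and the probe word "quiz".

```python
LAYOUTS = {
    # Allocation assumed from the codes quoted in this document.
    "iso": {"2": "abc", "3": "def", "4": "ghi", "5": "jkl",
            "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz"},
    # Hypothetical alternative, for illustration only.
    "qz_on_1": {"1": "qz", "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
                "6": "mno", "7": "prs", "8": "tuv", "9": "wxy"},
}


def encode(word, layout):
    lookup = {ch: digit for digit, letters in layout.items() for ch in letters}
    return "".join(lookup[ch] for ch in word.lower())


def identify_layout(probe_word, keyed_code, layouts=LAYOUTS):
    """Return the single layout under which `probe_word` produces `keyed_code`, else None."""
    matches = [name for name, layout in layouts.items()
               if encode(probe_word, layout) == keyed_code]
    return matches[0] if len(matches) == 1 else None
```

A probe word is useful only if it contains letters that the candidate layouts place on different keys; a word such as "call" yields the same code under both layouts above and so identifies neither.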
Although, as described above, the system can be used for the immediate transmission of messages, the addition of a database 17 allows other facilities such as reminder services to be offered as well. Thus, for example, by the use of appropriate codes, the user could enter date/time dependent reminders via the telephony server 10 to the database 17, for example providing notification of meetings, birthdays, anniversaries and the like; when the entered date/time matches the current date/time, the appropriate reminder will be sent from the database 17 via the telephony server 10 and the transmission system 11, or alternatively direct from the database 17 to the transmission system 11, to the required pager, which can be either the user's own pager or one belonging to a third party.
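A reminder facility of this kind might be sketched as follows, as an illustration only. The names Reminder, ReminderStore and pop_due, and the pager identifiers, are hypothetical. The sketch treats a reminder as due once its entered date/time is at or before the current date/time, which is an assumption intended to tolerate a database that is polled at intervals rather than continuously.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class Reminder:
    due: datetime   # date/time entered by the user
    pager_id: str   # user's own pager or a third party's
    text: str
    sent: bool = False


class ReminderStore:
    """In-memory stand-in for the reminder part of the database."""

    def __init__(self) -> None:
        self._items: List[Reminder] = []

    def add(self, due: datetime, pager_id: str, text: str) -> None:
        self._items.append(Reminder(due, pager_id, text))

    def pop_due(self, now: datetime) -> List[Reminder]:
        """Return unsent reminders due at or before `now`, earliest first."""
        due = [r for r in self._items if not r.sent and r.due <= now]
        for reminder in due:
            reminder.sent = True  # each reminder is handed over only once
        return sorted(due, key=lambda r: r.due)


store = ReminderStore()
store.add(datetime(1997, 10, 3, 9, 0), "pager-001", "MEETING AT TEN")
store.add(datetime(1997, 12, 25, 8, 0), "pager-002", "HAPPY BIRTHDAY")
for reminder in store.pop_due(datetime(1997, 10, 3, 9, 0)):
    print(reminder.pager_id, reminder.text)  # pager-001 MEETING AT TEN
```

The reminders returned by pop_due would then be passed on for transmission in the same way as an immediately entered message.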
The system can also be used to expand the facilities available from a telephone system, such as the provision of contact services. Thus, for example, if a message to a pager user, entered as described above, gives an indication that the caller wishes to speak to the pager user at that instant, a facility can be provided by the telephony server 10 to hold the incoming call (for example, for a predetermined time) whereupon the pager user can telephone the system and be linked up with the caller.
The system can be used to control other aspects of telephone systems as well.
As long as the telephone system is set up to recognise text word commands sent to it, these can be entered by means of the preferred text-entry communication system.
As discussed above, the telephone network 14 is a public network but it will be apparent that communication between the telephones and the telephony server 10 could in appropriate circumstances be provided by a PABX system instead or as well.
If desired, the telephony server 10 can be set up to deliver a personalised acknowledgement or greeting when a caller rings in to deliver a message for a particular user. This could be either by way of voice synthesis or could be a prerecorded message. Other types of voice interaction can also be provided by the telephony server 10. For example, if personnel within a company each carry a pager, and the caller does not know an individual's number, it could be possible for the caller to get the required information by specifying the company name, whereupon a list of numbers and associated personnel would be reproduced, or alternatively the system could be configured so that the caller inputs the individual's initials or other letter combinations representing the name so that no knowledge of numbers is required or offered.
Although the invention has been described in the context of a paging system whereby entered messages are communicated for display on selected pagers, it will be apparent that a similar technique can be used in any system requiring text messages or commands to be entered by telephone, such as for onward transmission to a required party. Examples of other such systems include electronic mail systems, teletext systems, SMS (short message service) telephone systems which provide displays on mobile telephones for the communication of text information, computer access systems, classified advertisement entry systems, and fax entry systems.
The telephone dialling means as described above is in the form of a layout of keys on a keypad, such as one using MF dialling. However, any other form of dialling may be used, even pulse dialling although this would involve some reduction in user convenience and speed of use. A further alternative would be a screen-based processing system in which a representation of the keypad is displayed on the screen, and individual keys may be activated by the use of a mouse, cursor keys or other input entry devices.
It will be apparent that the words included in the vocabulary list and part-of-speech code store 15 (and the corresponding spoken versions from the speech output means 16) can be selected as desired, and may be customised to fit specific requirements. For example, the vocabulary list may include names, places and parts of addresses in order to simplify use of the system. Common misspellings, slang and US spellings are further examples which can be included in the list. Hyphenated words can be treated in various ways; preferably the hyphen is ignored and the word is input as either two separate words or as a single merged word.
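The preferred treatment of hyphenated words might be sketched as follows, for illustration only. The keypad mapping, the example word "CO-OP" and the function names are assumptions of this sketch. For a hyphenated vocabulary entry, the function returns both forms of entry mentioned above: the single merged number sequence and the separate sequences for each part.

```python
from typing import Dict, List, Tuple

# Hypothetical keypad mapping, assumed for this sketch only.
KEYS: Dict[str, str] = {"2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
                        "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ"}
LETTER_TO_KEY: Dict[str, str] = {
    letter: key for key, letters in KEYS.items() for letter in letters
}


def encode(word: str) -> str:
    """Translate a word without hyphens into its number sequence."""
    return "".join(LETTER_TO_KEY[ch] for ch in word.upper())


def hyphen_variants(word: str) -> Tuple[str, List[str]]:
    """Ignore hyphens; return (merged sequence, per-part sequences)."""
    parts = [part for part in word.split("-") if part]
    merged = encode("".join(parts))
    separate = [encode(part) for part in parts]
    return merged, separate


print(hyphen_variants("CO-OP"))  # ('2667', ['26', '67'])
```

Storing both forms against the same vocabulary entry would allow the caller to use either style of entry.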
The technique described is also applicable to other languages, such as German and Italian, and to other alphabets, either with appropriate handsets or through the use of templates; in addition, the technique can be used with non-alphabetical languages (e.g. Chinese) that permit alphabetic representation.

Claims

1. A text communication system comprising means for decoding messages entered by the use of telephone dialling means having number entry and a plurality of letters associated with each of at least some of the numbers, the message decoding means comprising a telephony server operable to read back words identified as a number sequence but entered letter by letter by the telephone dialling means, and operable in response to a predetermined input from the telephone dialling means indicating an unwanted word to read back an alternative word associated with the same number sequence, the message decoding means being operable to validate a wanted word in response to further input from the telephone dialling means, the message decoding means further including predictive means, responsive to at least one preceding entered word to set a list of alternative words in a hierarchy depending on the at least one preceding entered word, the telephony server being operable to read back the complete message upon completion of input, the system including means for communicating the complete message.
2. A system according to claim 1, wherein the predictive means is further responsive to an indication that the word being entered is the first word of a message.
3. A system according to claim 1 or claim 2, wherein the predictive means is further responsive to an indication that previous entry related to (i) punctuation, (ii) a spelt word, or (iii) entered digits.
4. A system according to claim 1, claim 2 or claim 3, wherein the predictive means is responsive only to the immediately preceding entered word or other entry.
5. A system according to claim 1, claim 2 or claim 3, wherein the predictive means is responsive to two or more of the immediately preceding entered words or other entries.
6. A system according to any one of the preceding claims, wherein the telephony server is further operable, when set in letter entry mode, to read back letters entered by the telephone dialling means in response to predetermined input from the telephone dialling means.
7. A system according to claim 6, wherein the predetermined input from the telephone dialling means involves activation of a predetermined number key a different number of times respectively corresponding to the different letters associated with that number key.
8. A system according to any one of the preceding claims, including means for storing a vocabulary list of text words for identification of words entered letter by letter by a caller entering the message, and for identification of alternative words if necessary, part-of-speech codes also being stored for each of the words in the vocabulary list.
9. A system according to claim 8, including speech output means having a store of speech words corresponding to the vocabulary list in the text word store, for reading back speech words to the caller.
10. A system according to claim 8 or claim 9, wherein the message decoding means includes means for correlating the input from the telephone dialling means with the list of text words, at least some of the number code inputs from the telephone dialling means each corresponding to more than one word in the list.
11. A system according to claim 10, wherein the means for correlating the input from the telephone dialling means with the list of text words has the ability for the correlation to be changed to allow for different formats of telephone dialling means.
12. A system according to claim 11, wherein the correlating means is changeable in script-determined form.
13. A system according to any one of the preceding claims, wherein correspondence between numbers and letters on the telephone dialling means is as set out in any of Figures 2A to 2E of the accompanying drawings.
14. A system according to any one of the preceding claims, wherein the message decoding means can be set into a number receiving mode upon respective input from the telephone dialling means, such that the resulting message can include numbers.
15. A system according to any one of the preceding claims, wherein the message decoding means can delete an entered word, letter or number, upon receipt from the telephone dialling means of a delete code.
16. A system according to any one of the preceding claims, wherein activation of a predetermined key by the telephone dialling means indicates completion of a word, letter or number.
17. A system according to claim 16, wherein further activation of the predetermined key causes the telephony server to read back an alternative word.
18. A system according to claim 16 or claim 17, wherein the predetermined key is a hash (#) key.
19. A system according to any one of the preceding claims, wherein the communicating means comprises means for transmitting the complete message to a selected receiver.
20. A system according to claim 19, wherein the transmitting means and the receivers form part of a paging system.
21. A system according to claim 19, wherein the transmitting means and the receivers form part of a mobile telephone system provided with a short message service facility.
22. A system according to claim 19, wherein the transmitting means and the receivers form part of an electronic mail system, a teletext system, a computer access system, a classified advertisement entry system or a fax entry system.
23. A system according to any one of the preceding claims, wherein the system is operable to control a function of a telephone system in response to an entered text message.
24. A system according to any one of the preceding claims, including a database for storing time and/or date dependent messages and for forwarding each message at the appropriate time and/or date.

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU45653/97A AU4565397A (en) 1996-10-04 1997-10-03 Text communication systems

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB9620773.3 1996-10-04
GB9620773A GB2317982B (en) 1991-06-04 1996-10-04 Text communication systems

Publications (1)

Publication Number Publication Date
WO1998016055A1 (en) 1998-04-16

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB1997/002724 WO1998016055A1 (en) 1996-10-04 1997-10-03 Text communication systems

Country Status (2)

Country Link
AU (1) AU4565397A (en)
WO (1) WO1998016055A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6961700B2 (en) 1996-09-24 2005-11-01 Allvoice Computing Plc Method and apparatus for processing the output of a speech recognition engine
US7761175B2 (en) 2001-09-27 2010-07-20 Eatoni Ergonomics, Inc. Method and apparatus for discoverable input of symbols on a reduced keypad
USRE43082E1 (en) 1998-12-10 2012-01-10 Eatoni Ergonomics, Inc. Touch-typable devices based on ambiguous codes and methods to design such devices
US8200865B2 (en) 2003-09-11 2012-06-12 Eatoni Ergonomics, Inc. Efficient method and apparatus for text entry based on trigger sequences

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4649563A (en) * 1984-04-02 1987-03-10 R L Associates Method of and means for accessing computerized data bases utilizing a touch-tone telephone instrument
US4650927A (en) * 1984-11-29 1987-03-17 International Business Machines Corporation Processor-assisted communication system using tone-generating telephones
EP0319193A2 (en) * 1987-11-30 1989-06-07 Bernard N. Riskin Method and apparatus for identifying words entered on DTMF pushbuttons
US5200988A (en) * 1991-03-11 1993-04-06 Fon-Ex, Inc. Method and means for telecommunications by deaf persons utilizing a small hand held communications device

Also Published As

Publication number Publication date
AU4565397A (en) 1998-05-05

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AU CA SG US

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL PT SE

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase

Ref document number: 09254985

Country of ref document: US

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: CA