WO2000022597A1 - Method for computer-aided foreign language instruction - Google Patents

Method for computer-aided foreign language instruction

Info

Publication number
WO2000022597A1
WO2000022597A1 PCT/US1999/023264
Authority
WO
WIPO (PCT)
Prior art keywords
host
student
response
setting
computer
Prior art date
Application number
PCT/US1999/023264
Other languages
French (fr)
Inventor
David Lawrence Topolewski
Luther Marvin Shannon
Original Assignee
Planetlingo Inc.
Priority date
Filing date
Publication date
Application filed by Planetlingo Inc. filed Critical Planetlingo Inc.
Priority to AU62927/99A priority Critical patent/AU6292799A/en
Publication of WO2000022597A1 publication Critical patent/WO2000022597A1/en

Classifications

    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B19/00 Teaching not covered by other main groups of this subclass
    • G09B19/06 Foreign languages
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B5/00 Electrically-operated educational appliances
    • G09B5/06 Electrically-operated educational appliances with both visual and audible presentation of the material to be studied
    • G09B5/065 Combinations of audio and video presentations, e.g. videotapes, videodiscs, television systems
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B7/00 Electrically-operated teaching apparatus or devices working with questions and answers
    • G09B7/02 Electrically-operated teaching apparatus or devices working with questions and answers of the type wherein the student is expected to construct an answer to the question which is presented or wherein the machine gives an answer to the question presented by a student
    • G09B7/04 Electrically-operated teaching apparatus or devices working with questions and answers of the type wherein the student is expected to construct an answer to the question which is presented or wherein the machine gives an answer to the question presented by a student characterised by modifying the teaching programme in response to a wrong answer, e.g. repeating the question, supplying a further explanation

Abstract

A foreign spoken language instructional method for use by a student utilizing a multi-media computer in which the computer is programmed to display virtual settings that would be encountered by the foreign student in everyday life in the country in which the foreign language being learned is spoken. For example, one of the settings could be a fast food restaurant. The computer is also programmed to create a virtual host that is appropriate for that setting, and to allow, through the use of automatic speech recognition and natural language understanding technologies, a real-life conversation to occur between the student and the host that is appropriate to the setting. The computer will have stored or have access to a vocabulary and library of possible responses and statements by both the host and the student, along with a conversational tree utilizing those vocabularies and libraries that will enable the conversation to follow any of several twists and turns, rather than a precise, pre-set, structured command-response protocol.

Description

DESCRIPTION
Method For Computer-Aided Foreign Language Instruction
Background Of The Invention
1. Field of the Invention
This invention relates to the field of foreign language instruction generally, and particularly to a multi-media, computer-aided system for helping a student learn a foreign language in a simulated conversational environment.
2. Background
Learning a foreign language has been a universal pursuit literally since Biblical times and the Tower of Babel. This pursuit is more prevalent and important today than ever before. As the global community becomes more and more interdependent, as international travel and commerce become more and more the norm, and as the world becomes more and more linked by the aptly named World Wide Web, being able to communicate with people from other nations and cultures is becoming more of a necessity and less of a scholarly pursuit. Indeed, because the English language has become the international language of business, learning English has taken on prerequisite status in developed and developing nations around the world. In Japan, for example, not only are several years of classroom instruction in English required course work in the Japanese equivalent of elementary and high school, but augmenting that class work with private instruction in English is quite common.
One has only to travel outside the United States to be amazed at the degree to which children and adults in other countries are taking the time and expending the effort to learn English. Even in the United States learning a foreign language is a common goal.
Up to the present time, the process of learning a foreign language has followed one or another of several time-honored methods - classroom instruction in which an instructor leads the students through the learning process using face-to-face interaction and textual materials; flash cards; self-instruction audio or video tapes in which words and phrases are spoken on the tape amid pauses for the student to repeat the phrase back to the tape player.
With the advent of personal computing, foreign language instructional software has also now become available. See, for example, those described in U.S. Patent Nos. 5,810,599 and 5,697,789. The currently available programs, however, are essentially transpositions of the previously available non-interactive media such as flash cards and tapes, sometimes augmented with entertaining or educational graphics or visuals. (For example, the '789 patent includes a video display of an animated representation of a person's lips as the person is pronouncing the selected words.) However, the currently available foreign language instructional software does not provide a truly interactive system which approximates or simulates a teacher-student interactive environment. Of all of these currently available educational systems, it is generally agreed that the interactive approach of student-to-teacher (necessarily well-trained and experienced), and preferably in a one-on-one setting, is the most effective and efficient way to learn a foreign language, and is much preferable to the various non-interactive systems such as flash cards or tapes. Unfortunately, most teaching and interaction is done without the benefit of context.
Face-to-face interactive instruction is generally best for several reasons. First, in the interactive student-teacher environment there can be real time communication that is tailored exclusively to that student's particular progress, strengths and weaknesses. Second, the teacher is able to receive and process a variety of responses that may include improper pronunciation and grammatical construction, and still provide the correct response, and a correcting of the erroneous portion of the student's response. Thirdly, there is a recognized physio-linguistic phenomenon in which what a person sees in the way of visual cues affects what that person simultaneously perceives aurally. This is sometimes referred to as the McGurk Effect. Specifically, during childhood a person learns to associate certain mouth/facial positions and expressions with certain vocal sounds. In other words, when the person combines the visual cues simultaneously with the spoken word, understanding and comprehension are remarkably improved. This phenomenon is readily understood by reference to the preference of business people to have face-to-face meetings rather than telephone conferences. While the latter is often more convenient, it is generally believed that communication in the face-to-face meetings is typically better than in the telephone conference which is devoid of visual cues, even though every spoken word will have been heard precisely.
However, the face-to-face student/teacher system has its drawbacks as well. In the classroom setting, there are multiple students, so a truly personalized approach is not possible. Plus, the classroom setting inherently involves bringing a number of people together at the same place and time. This takes scheduling, and forces the student to adapt his or her schedule to that of the class, rather than being able to have the teacher available on-call. The classroom system is also typically limited to at most a few hours a week. Therefore, the system cannot provide a truly personalized educational experience. While personal tutoring solves some of these drawbacks, it does not solve all of them, and has some of its own. For example, even though the personal tutoring system involves just two people, it still requires scheduling. It can also be expensive. And, the tutor is not always going to be available at any time or place, at the student's beck and call.
Therefore, there exists a need in the art for an improved foreign language instructional system that incorporates the benefits of these prior art systems while minimizing their drawbacks.
Summary Of The Invention
Such an improved system is provided in a method that utilizes a commercially available personal computer capable of what is now commonly referred to as multi-media computing, and which comprises the steps of storing in the computer's memory (or otherwise providing for access) for display on the monitor a visual simulation of a real-life setting, such as the interior of a fast food restaurant, or a bank, or a doctor's office, for only a few examples; storing in the computer's memory (or otherwise providing for access) for display on the monitor an animated character appropriate to the real-life setting, such as, for example, an order taker in the fast food restaurant setting, a teller in a bank, or the receptionist, nurse or doctor in the doctor's office in the examples set forth above; storing in memory conversational statements for the host that are appropriate to the setting displayed; storing in the computer's memory (or otherwise providing for access), for replay through the computer's speaker system, a library of possible responses by the student to each of the host's conversational statements; having the host initiate a conversation with the student in the foreign language that is appropriate for the setting displayed; allowing the student to respond to the host verbally in the foreign language; converting the verbal response to text utilizing automatic speech recognition technology that has been stored in the computer's memory (or is otherwise accessible); interpreting the meaning of the response using natural language understanding technology that has been stored in the computer's memory (or is otherwise accessible); comparing the converted response to a library of possible responses by the student that has been stored in the computer memory or is otherwise accessible; selecting the most appropriate responsive host statement from the library of stored responses; causing that response to be played on the computer speaker system and synchronized with the visual display of the host "mouthing" the audible response; and continuing in this fashion through a plurality of conversational turns comprising a statement or inquiry by the host, a response by the student, a response by the host, and so on until the conversation appropriate for the setting is completed.
By bringing together several separate technologies in a novel, non-obvious, fully integrated way, a foreign language instructional method is provided that closely simulates the teacher-student interactive environment via a multi-media, computer-assisted system, thereby allowing the student, on his or her own personal computer, to simultaneously enjoy a truly interactive learning situation and the personal convenience of being able to study when, where and for how long the student alone dictates.
This invention enables simulated, situational conversations to take place between the student and one of several alternative "teachers" — a server attached to the telephone; a stand-alone computer or other stand-alone device; or a personal computer attached to the Internet. For example, a student with a standard personal computer can "enter" a fast food restaurant and engage in a situational conversation with an on-screen host. After the simulated host opens the conversation with a greeting and a first inquiry (such as "Hello, welcome to Cyber Cafe, can I help you?"), the system will await a verbal response from the student.
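The step sequence above amounts to a simple dialogue loop: present the host, listen, recognize, interpret, select a reply, and play it back with synchronized animation. The Python sketch below is purely illustrative of that loop; every name in it (HostTurn, run_conversation, recognizer, interpreter, player) is a hypothetical stand-in rather than anything named in the patent or in a particular product, and a real system would substitute its own ASR, NLU and playback components.

```python
# Illustrative only: a minimal version of the conversational loop described
# above. Every name here is a hypothetical stand-in, not part of the patent.
from dataclasses import dataclass, field


@dataclass
class HostTurn:
    text: str                     # what the host says in the foreign language
    audio_file: str               # recorded speech played through the speakers
    animation: str                # lip-sync/gesture data for the on-screen host
    branches: dict = field(default_factory=dict)  # student intent -> next HostTurn


def run_conversation(opening: HostTurn, recognizer, interpreter, player) -> None:
    """Drive one simulated exchange: host speaks, student answers, repeat."""
    turn = opening
    while turn is not None:
        player.play(turn.audio_file, turn.animation)   # host statement + animation
        if not turn.branches:                          # conversation is complete
            return
        heard = recognizer.transcribe()                # ASR: speech -> text
        intent = interpreter.classify(heard)           # NLU: text -> intent
        # Follow the matching limb of the conversation tree, or re-prompt the
        # same turn if the response was not understood.
        turn = turn.branches.get(intent, turn)
```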
By incorporating Automatic Speech Recognition technology and Natural Language Understanding technology, the student's response will be "interpreted" by the computer and compared against a pre-programmed set of expected responses and/or knowledge representation bases. Whereas prior art language instruction systems required precise responses, this invention will accept a large variety of acceptable grammars (or responses) within a constrained topic. Indeed, this invention will accept hundreds or thousands of variations of spoken input, both correct and incorrect, and then provide a response to the student based upon the system's understanding of what the student has said. These responses can include, for example, an answer, a request for more information, a correction, a rejection, or a hint.
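As a deliberately simplified illustration of accepting many phrasings within a constrained topic, the matcher below maps several surface forms onto a single intent. It is only a sketch under the assumption of plain keyword spotting; the intent names and keyword lists are invented for the example, and a real NLU engine would be far more sophisticated.

```python
# Sketch only: keyword-based intent matching for a constrained fast-food topic.
# Intent names and keyword sets are illustrative assumptions, not from the patent.
from typing import Optional

ORDER_INTENTS = {
    "order_hamburger": {"hamburger", "burger", "cheeseburger"},
    "order_pizza": {"pizza"},
    "order_fries": {"fries"},
}


def classify_order(utterance: str) -> Optional[str]:
    """Return the first intent whose keywords appear in the recognized text."""
    text = utterance.lower()
    for intent, keywords in ORDER_INTENTS.items():
        if any(keyword in text for keyword in keywords):
            return intent
    return None  # not understood: the host can re-prompt, hint, or correct


assert classify_order("I'll have a hamburger please") == "order_hamburger"
assert classify_order("Give me a burger") == "order_hamburger"
assert classify_order("I want a cheeseburger") == "order_hamburger"
```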
The vocal response will be "communicated" to the student via the computer speakers and also by the animated host on-screen, whose facial movements will be synchronized with the sound output from the speakers and will be properly animated to visually cue the student as well.
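One plausible, entirely hypothetical way to keep the host's facial movements in step with the audio is to attach a time-stamped schedule of mouth shapes and gestures to each recorded utterance. The sketch below assumes such pre-authored cue lists; the shape names and timings are invented for illustration and are not taken from the patent.

```python
# Hypothetical sketch of audio/animation synchronization: each host utterance
# carries cues (mouth shapes, gestures) keyed to offsets in the audio clip.
from dataclasses import dataclass
from typing import List


@dataclass
class AnimationCue:
    time_sec: float  # offset into the recorded audio
    shape: str       # mouth shape or gesture to show at that moment


ITS_OVER_THERE = [
    AnimationCue(0.00, "mouth_closed"),
    AnimationCue(0.10, "mouth_ih"),       # "It's"
    AnimationCue(0.35, "mouth_oh"),       # "over"
    AnimationCue(0.70, "mouth_eh"),       # "there"
    AnimationCue(0.75, "gesture_point"),  # an accompanying visual cue
    AnimationCue(1.10, "mouth_closed"),
]


def cues_due(cues: List[AnimationCue], elapsed_sec: float) -> List[str]:
    """Return the shapes whose scheduled time has already passed during playback."""
    return [cue.shape for cue in cues if cue.time_sec <= elapsed_sec]
```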
The invention will also include a conversation "tree" that will allow the interaction between the student and the host to follow the path dictated by the student. For example, in the Cyber Cafe, if the student orders a hamburger, he or she would be able to do so in a variety of ways - for example "I'll have a hamburger please." "Give me a burger." "I want a cheeseburger." All of these would be recognized by the computer as appropriate responses, to which the host would inquire about how well done the student wanted it cooked, whether the student wants onions and pickles, or french fries, etc. If the student asks for pizza instead, the method of this invention will recognize the response, and will inquire about thin or thick crust, toppings, etc.
Detailed Description Of The Preferred Embodiment
In the presently preferred embodiment, the method of this invention is designed for use with a typical commercially available personal computer capable of multi-media processing, having a processing unit, memory, monitor, operating system, CD-ROM drive, keyboard, mouse (or other equivalent device), speakers, and microphone, such as those computers widely available from innumerable sources such as IBM, Compaq, Dell, and Gateway.
The computer memory is pre-loaded in the conventional way with both Automatic Speech Recognition (ASR) and Natural Language Understanding (NLU) engines. ASR technologies of various types are now available commercially. Some of these systems are speaker-dependent, meaning that they are customized, over time, to recognize one speaker's phrasing and pronunciation. Some systems recognize discrete speech patterns in which the spoken words are separated by short pauses. Others have the capability to recognize continuous speech. Some have very small vocabularies and libraries. Others have quite large vocabularies and libraries of phrases. Regardless of the system, all of them convert the spoken word to text, then analyze the text to determine its meaning.
In the preferred embodiment, the ASR and NLU products offered by Unisys are utilized along with the Unisys Natural Language Assistant (NLA), and stored in the computer memory according to the manufacturer's specifications and directions. These commercially available technologies allow the computer to recognize and interpret conversational English, for example, so that interactions can occur in a natural way, without complex menus or artificially constrained vocabularies or syntax.
The memory of the computer is also loaded in the conventional way, for display on the computer's monitor, with a visual simulation of a real-life situational setting that a person would typically encounter during a trip to the country in which the foreign language being learned is spoken. For example, if the method of this invention is being used to teach the English language, the situational setting could be a fast food restaurant, or a bank, or an airport, or a post office, or a doctor's office, or a car rental agency, or a hotel, or a bus, etc. Those skilled in the art will understand that there are many, many situational settings that a tourist will encounter, each of which has its own situational conversation vocabularies. A tourist who, by virtue of his or her computer, has visited a virtual American fast food restaurant and successfully ordered food many times before actually walking into one in the United States will be far better able to handle the real life situation than a tourist who has only memorized the English words hamburger, french fries and a cola.
The computer memory also has stored, in the conventional way, for display on the monitor in conjunction with the setting, a "host" who is appropriate for the setting. For example, if the setting is a fast food restaurant, the host will be the order taker who works behind the counter and cash register, who greets the customer (in this instance the student), takes the order, retrieves and delivers the food to the student, states the total cost, collects the money, and makes the change. In the bank setting, the host could be the teller, who also greets the bank customer-student, inquires as to or receives the student's input on the desired transaction, and then completes the transaction.
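A simple way to picture the pairing of settings with context-appropriate hosts and conversation material is a small lookup table. The sketch below is purely illustrative; the file names, host labels, and tree identifiers are assumptions made for the example, not taken from the patent.

```python
# Illustrative registry pairing each virtual setting with a context-appropriate
# host and its conversation tree. All entries are invented for the example.
SETTINGS = {
    "fast_food_restaurant": {"backdrop": "cyber_cafe_interior.png",
                             "host": "order_taker",
                             "conversation_tree": "cyber_cafe_tree"},
    "bank":                 {"backdrop": "bank_lobby.png",
                             "host": "teller",
                             "conversation_tree": "bank_tree"},
    "doctors_office":       {"backdrop": "clinic_reception.png",
                             "host": "receptionist",
                             "conversation_tree": "clinic_tree"},
}


def load_setting(name: str) -> dict:
    """Fetch the backdrop, host, and conversation tree for a chosen setting."""
    return SETTINGS[name]
```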
For each of the virtual settings, the computer memory has stored in a conventional way a conversation tree that is appropriate to the setting. The conversation tree will comprise a vocabulary of words and phrases that enable the host and the student to have a simulated real life conversation that is appropriate to the setting. The conversation will be initiated by the host. For example, in the Cyber Cafe, the host would welcome the student, then ask what he or she wanted. Using ASR, NLU and NLA, the computer will be able to recognize a large number of variant responses. If the student responds with any statement that is recognized as asking for a hamburger, the "limb" of the conversation tree dealing with a hamburger order is pursued. The host may be programmed to ask, in response to an order for a hamburger, how the student wants it cooked. An appropriate response (again recognized by the combination of ASR, NLU and NLA) would then elicit the next logical and appropriate statement or inquiry by the host - whether the student wants it with or without onions, for example. The computer will have been programmed to ask for a repeated response from the student if the first response is not understood or is incorrect. After a pre-determined number of inaccurate or inappropriate responses, the computer will provide a translation of the host's statement or inquiry into the student's native language, then restart the conversation.
Each language and culture has visual cues that are quite important in communication. The combination of the spoken word with the appropriate visual cue provides the most effective communication. To enhance the learning experience in the preferred embodiment of the method of this invention, the computer memory has stored in the conventional way data that will allow the host, at the same time it is "mouthing" the words that are being played on the computer's speakers, to also provide those visual cues that typically accompany that statement. As a very simple example, in the fast food restaurant during some portion of the conversation the host may advise the customer/student of the location of the napkin dispenser by saying "It's over there." In real life, the fast food worker would likely also point in the right direction. In the preferred embodiment of this invention, the simulated host would also point at the same time as the host mouthed the words.
The method of this invention will include any number of such settings, which can be accessed through computer disk or CD-ROM. The student will be able to experience a simulated trip to the foreign country. For example, utilizing the method of this invention, a Japanese businessman or businesswoman will be able to have virtually visited the United States, and successfully communicated his or her way in English from the airport, to the hotel, to a restaurant, to a bank, and to a business meeting many times before actually setting foot in the United States.
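Read as data plus a small amount of control logic, the branching, re-prompting, and native-language fallback described above might look roughly like the sketch below. The node layout, the limit of two misunderstood responses, and all strings are assumptions made only for illustration.

```python
# Sketch of a conversation-tree node with re-prompt and translation fallback.
# The structure, the MAX_MISSES limit, and the example text are assumptions.
from dataclasses import dataclass, field
from typing import Optional, Tuple

MAX_MISSES = 2  # assumed number of misunderstood responses before falling back


@dataclass
class Node:
    prompt: str                                   # host's statement or inquiry
    translation: str                              # native-language fallback text
    branches: dict = field(default_factory=dict)  # recognized intent -> next Node


def advance(node: Node, intent: Optional[str], misses: int) -> Tuple[Node, int]:
    """Follow a limb of the tree, re-prompt, or fall back to a translation."""
    if intent in node.branches:
        return node.branches[intent], 0           # understood: pursue that limb
    if misses + 1 >= MAX_MISSES:
        print(node.translation)                   # show the native-language help
        return node, 0                            # then restart from this prompt
    return node, misses + 1                       # ask the student to try again


# A two-level fragment of the Cyber Cafe tree:
doneness = Node("How would you like it cooked?", "<native-language translation>")
greeting = Node("Hello, welcome to Cyber Cafe, can I help you?",
                "<native-language translation>",
                branches={"order_hamburger": doneness})
```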
While the preferred embodiment of the method of this invention has been set forth above, it will be obvious to those skilled in the art that many modifications are possible to the embodiments disclosed without departing from the inventive concept claimed below.
Therefore, this patent and the protection it provides are not limited to the embodiments set forth above, but are of the full scope of the following claims, including equivalents thereof.

Claims

Claims
1. A foreign spoken language instructional method for use by a student utilizing a multi-media computer, the computer having a processor, memory, an operating system, a monitor, a microphone, a modem and speakers, the memory including automatic speech recognition and natural voice understanding technologies, the method comprising the steps of: a. storing in memory for display on the monitor a visual simulation of a real-life setting, such as the interior of a fast food restaurant, or a bank, or a doctor's office, for only a few examples; b. storing in memory for display on the monitor an animated character appropriate to the real-life setting, such as, for example, an order taker in the fast food restaurant setting, a teller in a bank, or the receptionist, nurse or doctor in the doctor's office in the examples set forth above; c. storing in memory conversational statements for the host that are appropriate to the setting displayed; d. storing in memory a library of possible responses by the student to each of the host's conversational statements; e. having the host initiate a conversation with the student in the foreign language that is appropriate for the setting displayed; f. allowing the student to respond to the host verbally in the foreign language; g. converting the verbal response to text utilizing automatic speech recognition technology; h. comparing the converted response to a library of possible responses by the student; i. selecting the most appropriately responsive host statement from the library of stored responses; j. causing that response to be played on the computer speakers and synchronized with the visual display of the host "mouthing" the audible response; k. continuing in this fashion through a plurality of conversational turns comprising a statement or inquiry by the host, a response by the student, a response by the host, and so on until the conversation appropriate for the setting is completed.
2. The invention of claim 1 further comprising the step of having the animated host simultaneously provide the appropriate visual cues that correspond to the statement being made by the host in that foreign language.
3. The invention of claim 1 further comprising the step of simultaneously displaying on the monitor the text of the response by the host, if desired by the student.
4. The invention of claim 1 further comprising the step of utilizing the stored natural language technology to analyze the student's response so as to be able to recognize many different syntaxes, grammars, sentence structures to discern the concept being communicated by the student.
5. A foreign language instructional method for use by a student utilizing a multi-media computer, the computer having a processor, memory, an operating system, a monitor, a microphone and speakers, the memory including automatic speech recognition and natural voice understanding technologies, the computer further having stored in memory or otherwise being accessible a visual simulation of a real-life setting, an animated character appropriate to the real-life setting, conversational statements for the host that are appropriate to the setting displayed, a library of possible responses by the student to each of the host's conversational statements, the method comprising the steps of: a. recalling for display on the monitor a visual simulation of a real-life setting, such as the interior of a fast food restaurant, or a bank, or a doctor's office, for only a few examples; b. recalling for display on the monitor an animated character appropriate to the real-life setting, such as, for example, an order taker in the fast food restaurant setting, a teller in a bank, or the receptionist, nurse or doctor in the doctor's office in the examples set forth above; c. having the host initiate a conversation with the student in the foreign language that is appropriate for the setting displayed; d. allowing the student to respond to the host verbally in the foreign language; e. converting the verbal response to text utilizing automatic speech recognition technology; f. comparing the converted response to a library of possible responses by the student;
AMENDED CLAIMS
[received by the International Bureau on 24 March 2000 (24.03.00); original claims 1-3 and 5-7 amended; original claims 4 and 8 cancelled; new claims 9-14 added; remaining claims unchanged (8 pages)]
1. A foreign spoken language instructional method for use by a student utilizing a multi-media computer, the computer having a processor, memory, an operating system, a monitor, a microphone, and speakers, the memory including automatic speech recognition and natural language understanding technologies, the method comprising the steps of: a. storing in memory for display on the monitor a visual simulation of a background contextual setting; b. storing in memory for display on the monitor an animated character visually tailored to the context of the displayed background setting; c. storing in memory a library of possible conversational statements for the host that correspond to the context of the displayed background setting; d. storing in memory a library of possible responses by the student to each of the host's conversational statements; e. having the host initiate a conversation with the student corresponding to the displayed background setting in a selected foreign language; f. converting a foreign language verbal response from the student to text utilizing an automatic speech recognizer; g. utilizing the stored natural language technology to analyze the student's response so as to recognize a plurality of different syntaxes, grammars, and sentence structures to discern the concept being communicated by the student; h. comparing the converted response to the library of possible responses by the student; i. selecting a host statement from the library of stored responses that corresponds to the translated verbal response of the student; j. causing the host response to be played on the computer speakers synchronized with a visual display of the host "mouthing" the audible response; k. repeating steps (f) through (j) through a plurality of conversational exchanges comprising a statement or inquiry by the host, a response by the student, a response by the host, and so on until a conversation corresponding to the context of the background setting is completed.
2. The method of claim 1 further comprising the step of having the animated host simultaneously provide visual cues corresponding to the statement being made by the host in the selected foreign language.
3. The method of claim 1 further comprising the step of selectively displaying on the monitor the text of the response by the host.
4. (Canceled)
5. A foreign language instructional method for use by a student utilizing a multimedia computer, the computer having a processor, memory, an operating system, a monitor, a microphone and speakers, the memory including automatic speech recognition and natural language understanding technologies, the computer further having stored in memory or otherwise being accessible a visual simulation of a background setting, an animated character visually tailored to the context of the background setting, a library of conversational statements for the host that correspond to the context of the background setting, and a library of possible responses by the student to each of the host's conversational statements, the method comprising the steps of: a. recalling for display on the monitor a visual simulation of a background contextual setting; b. recalling for display on the monitor an animated character visually tailored to the context of the displayed background setting; c. having the host initiate a conversation with the student corresponding to the displayed background setting in a selected foreign language; d. converting a foreign language verbal response from the student to text utilizing an automatic speech recognizer; e. utilizing the stored natural language technology to analyze the student's response so as to recognize a plurality of different syntaxes, grammars, and sentence structures to discern the concept being communicated by the student; f. comparing the converted response to the library of possible responses by the student; g. selecting a host statement from the library of stored responses that corresponds to the translated verbal response of the student; h. causing the response to be played on the computer speakers synchronized with a visual display of the host "mouthing" the audible response; i. repeating steps (d) through (h) through a plurality of conversational exchanges comprising a statement or inquiry by the host, a response by the student, a response by the host, and so on until a conversation corresponding to the context of the background setting is completed.
6. The method of claim 5 further comprising the step of having the animated host simultaneously provide visual cues corresponding to the statement being made by the host in the selected foreign language.
7. The method of claim 5 further comprising the step of selectively displaying on the monitor the text of the response by the host.
8. (Canceled)
9. A method for computer-aided language interaction involving a multi-media computing device, the computing device having a processor, memory, an operating system, a graphical display, a microphone, and speakers, the memory including automatic speech recognition and natural language understanding technologies, the method comprising the steps of: a. storing in memory for display on the graphical display a visual simulation of a background contextual setting; b. storing in memory for display on the graphical display a host character visually tailored to the context of the displayed background setting; c. storing in memory a library of possible conversational statements for the host that correspond to the context of the displayed background setting; d. storing in memory a library of possible responses by a student to each of the host's conversational statements; e. having the host initiate a conversation with the student corresponding to the displayed background setting in a selected language; f. analyzing a verbal response from the student utilizing an automatic speech recognizer and the stored natural language technology so as to recognize a plurality of different syntaxes, grammars, and sentence structures to discern the concept being communicated by the student; g. comparing the analyzed response to the library of possible responses by the student; h. selecting a host statement from the library of stored responses that corresponds to the translated verbal response of the student; i. causing the host response to be played on the computer speakers; j. repeating steps (f) through (i) through a plurality of conversational exchanges comprising a statement or inquiry by the host, a response by the student, a response by the host, and so on until a conversation corresponding to the context of the background setting is completed.
10. The method of claim 9 wherein the host character is animated, further comprising the steps of synchronizing said host response with a visual display of the host "mouthing" the audible response, and having the host simultaneously provide visual cues corresponding to the statement being made by the host in the selected language.
11. The method of claim 9 further comprising the step of selectively displaying on the graphical display the text of the response by the host.
12. A method for conversation-based language interaction involving a multi-media computer, the computer having a processor, memory, an operating system, a monitor, a microphone and speakers, the memory including automatic speech recognition and natural language understanding technologies, the computer further having stored in memory or otherwise being accessible a visual simulation of a background setting, a host character visually tailored to the context of the background setting, a library of conversational statements for the host that correspond to the context of the background setting, and a library of possible responses by the user to each of the host's conversational statements, the method comprising the steps of: a. recalling for display on the monitor a visual simulation of a background contextual setting; b. recalling for display on the monitor a host character visually tailored to the context of the displayed background setting; c. having the host initiate a conversation with a user corresponding to the displayed background setting in a selected language; d. analyzing a verbal response from the user to text utilizing an automatic speech recognizer and the stored natural language technology so as to recognize a plurality of different syntaxes, grammars, and sentence structures to discern the concept being communicated by the user; e. comparing the analyzed response to the library of possible responses by the user; f. selecting a host statement from the library of stored responses that corresponds to the translated verbal response of the user; g. causing the response to be played on the computer speakers; h. repeating steps (d) through (g) through a plurality of conversational exchanges comprising a statement or inquiry by the host, a response by the user, a response by the host, and so on until a conversation corresponding to the context of the background setting is completed.
13. The method of claim 12, wherein said host character is animated, further comprising the steps of synchronizing the host response with a visual display of the host "mouthing" the host response, and having the animated host simultaneously provide visual cues corresponding to the statement being made by the host in the selected language.
14. The method of claim 12 further comprising the step of selectively displaying on the monitor the text of the response by the host.
Statement Under Article 19
The ISA cites WO 98/11523 (Appleby) with respect to claims 1-3 and 5-7 as an allegedly pertinent document.
Claims 1 and 5 have been amended to incorporate the recitals of claims 4 and 8, respectively. Claim 4, for example, recites the step of "utilizing the stored natural language technology to analyze the student's response so as to recognize a plurality of different syntaxes, grammars, and sentence structures to discern the concept being communicated by the student," and claim 8 contains similar recitals. Claims 4 and 8 were not found by the ISA to lack either novelty or an inventive step over Appleby. Accordingly, claims 1 and 5 as amended are believed to be both novel and to contain an inventive step over the cited items.
Claims 2-3 and 6-7 are each dependent upon one of the aforementioned independent claims.
Claims 9 and 12 are new independent claims. Similar to original claims 4 and 8, and as combined with further subject matter appearing generally in claims 1 and 5, they each recite the step of "analyzing a verbal response from the student utilizing an automatic speech recognizer and the stored natural language technology so as to recognize a plurality of different syntaxes, grammars, and sentence structures to discern the concept being communicated by" the student or user. New claims 10-11 and 13-14 depend upon claims 9 and 12. It is respectfully submitted that new claims 9-14 are thus novel and contain an inventive step over Appleby.
PCT/US1999/023264 1998-10-15 1999-10-06 Method for computer-aided foreign language instruction WO2000022597A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU62927/99A AU6292799A (en) 1998-10-15 1999-10-06 Method for computer-aided foreign language instruction

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US17335798A 1998-10-15 1998-10-15
US09/173,357 1998-10-15

Publications (1)

Publication Number Publication Date
WO2000022597A1 (en) 2000-04-20

Family

ID=22631646

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1999/023264 WO2000022597A1 (en) 1998-10-15 1999-10-06 Method for computer-aided foreign language instruction

Country Status (3)

Country Link
AU (1) AU6292799A (en)
TW (1) TW448379B (en)
WO (1) WO2000022597A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2000043975A1 (en) * 1999-01-26 2000-07-27 Microsoft Corporation Virtual challenge system and method for teaching a language
EP1217609A2 (en) * 2000-12-22 2002-06-26 Hewlett-Packard Company Speech recognition
WO2012174506A1 (en) * 2011-06-17 2012-12-20 Rosetta Stone, Ltd. System and method for language instruction using visual and/or audio prompts
JP2020030400A (en) * 2018-08-23 2020-02-27 國立台湾師範大学 Method of education and electronic device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1997021201A1 (en) * 1995-12-04 1997-06-12 Bernstein Jared C Method and apparatus for combined information from speech signals for adaptive interaction in teaching and testing
EP0801370A1 (en) * 1996-04-09 1997-10-15 HE HOLDINGS, INC. dba HUGHES ELECTRONICS System and method for multimodal interactive speech and language training
US5717828A (en) * 1995-03-15 1998-02-10 Syracuse Language Systems Speech recognition apparatus and method for learning
WO1998011523A1 (en) * 1996-09-13 1998-03-19 British Telecommunications Public Limited Company Training apparatus and method
CA2228917A1 (en) * 1997-04-14 1998-10-14 At&T Corp. System and method for providing remote automatic speech recognition services via a packet network

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5717828A (en) * 1995-03-15 1998-02-10 Syracuse Language Systems Speech recognition apparatus and method for learning
WO1997021201A1 (en) * 1995-12-04 1997-06-12 Bernstein Jared C Method and apparatus for combined information from speech signals for adaptive interaction in teaching and testing
EP0801370A1 (en) * 1996-04-09 1997-10-15 HE HOLDINGS, INC. dba HUGHES ELECTRONICS System and method for multimodal interactive speech and language training
WO1998011523A1 (en) * 1996-09-13 1998-03-19 British Telecommunications Public Limited Company Training apparatus and method
CA2228917A1 (en) * 1997-04-14 1998-10-14 At&T Corp. System and method for providing remote automatic speech recognition services via a packet network
EP0872827A2 (en) * 1997-04-14 1998-10-21 AT&T Corp. System and method for providing remote automatic speech recognition services via a packet network

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2000043975A1 (en) * 1999-01-26 2000-07-27 Microsoft Corporation Virtual challenge system and method for teaching a language
US6234802B1 (en) 1999-01-26 2001-05-22 Microsoft Corporation Virtual challenge system and method for teaching a language
EP1217609A2 (en) * 2000-12-22 2002-06-26 Hewlett-Packard Company Speech recognition
EP1217609A3 (en) * 2000-12-22 2004-02-25 Hewlett-Packard Company Speech recognition
WO2012174506A1 (en) * 2011-06-17 2012-12-20 Rosetta Stone, Ltd. System and method for language instruction using visual and/or audio prompts
US9911349B2 (en) 2011-06-17 2018-03-06 Rosetta Stone, Ltd. System and method for language instruction using visual and/or audio prompts
JP2020030400A (en) * 2018-08-23 2020-02-27 國立台湾師範大学 Method of education and electronic device

Also Published As

Publication number Publication date
AU6292799A (en) 2000-05-01
TW448379B (en) 2001-08-01

Similar Documents

Publication Publication Date Title
US7778948B2 (en) Mapping each of several communicative functions during contexts to multiple coordinated behaviors of a virtual character
US20100304342A1 (en) Interactive Language Education System and Method
US20050255431A1 (en) Interactive language learning system and method
US20020150869A1 (en) Context-responsive spoken language instruction
Bernstein et al. Subarashii: Encounters in Japanese spoken language education
US20030028378A1 (en) Method and apparatus for interactive language instruction
WO2000030059A1 (en) Method and apparatus for increased language fluency
WO2005099414A2 (en) Comprehensive spoken language learning system
Rypa et al. VILTS: A tale of two technologies
Ehsani et al. An interactive dialog system for learning Japanese
JP6993745B1 (en) Lecture evaluation system and method
Aktuğ Common pronunciation errors of seventh grade EFL learners: A case from Turkey
Holden III Extensive listening: A new approach to an old problem
WO2000022597A1 (en) Method for computer-aided foreign language instruction
Alimin Developing listening materials for the tenth graders of Islamic senior high school
WO2002050799A2 (en) Context-responsive spoken language instruction
JP2001337594A (en) Method for allowing learner to learn language, language learning system and recording medium
Andriani The Use of Natural Reader Software in Teaching Pronunciation and Speaking Performances
Çekiç The effects of computer assisted pronunciation teaching on the listening comprehension of Intermediate learners
JP6998420B2 (en) Task-oriented digital language learning method
Lê et al. Speech-enabled tools for augmented interaction in e-learning applications
TWI227449B (en) Match-making system and method for on-line language learning
Havrylenko ESP LISTENING IN ONLINE LEARNING TO UNIVERSITY STUDENTS
PARAMIDA A Thesis
Montgomery Self-reported listening strategies by students in an intensive English language program

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AU BR CA CN JP KR MX SG

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
122 Ep: pct application non-entry in european phase