US20090070100A1 - Methods, systems, and computer program products for spoken language grammar evaluation

Methods, systems, and computer program products for spoken language grammar evaluation

Info

Publication number
US20090070100A1
Authority
US
United States
Prior art keywords
grammar
candidate
spoken
spoken language
language grammar
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US11/853,076
Inventor
Rajni Bajaj
Sreeram V. Balakrishnan
Mridula Bhandari
Lyndon J. D'Silva
Sandeep Jindal
Pooja Kumar
Nitendra Rajput
Ashish Verma
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nuance Communications Inc
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp filed Critical International Business Machines Corp
Priority to US11/853,076
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: BHANDARI, MRIDULA; BALAKRISHNAN, SREERAM V.; BAJAJ, RAJNI; KUMAR, POOJA; D'SILVA, LYNDON J.; JINDAL, SANDEEP; RAJPUT, NITENDRA; VERMA, ASHISH
Priority to US12/055,623
Publication of US20090070100A1
Assigned to NUANCE COMMUNICATIONS, INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: INTERNATIONAL BUSINESS MACHINES CORPORATION

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/253 Grammatical analysis; Style critique
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B7/00 Electrically-operated teaching apparatus or devices working with questions and answers
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B19/00 Teaching not covered by other main groups of this subclass
    • G09B19/04 Speaking

Abstract

A method, system, and computer program product for spoken language grammar evaluation are provided. The method includes playing a recorded question to a candidate, recording a spoken answer from the candidate, and converting the spoken answer into text. The method further includes comparing the text to a grammar database, calculating a spoken language grammar evaluation score based on the comparison, and outputting the spoken language grammar evaluation score.

Description

    BACKGROUND OF THE INVENTION
  • The present disclosure relates generally to linguistic analysis, and, in particular, to spoken language grammar evaluation.
  • Written and spoken language grammar skills of a person are often uncorrelated. This is because several factors exist in the spoken form of a language but not in the written form, such as spontaneity, the absence of visual aids, and just-in-time sentence composition. Therefore, written grammar tests may not be suitable for judging the spoken grammar skills of people.
  • In today's global world, where people with differing native languages are required to converse in foreign languages, it would be beneficial to develop an automated approach to improve people's conversational language skills through interactive grammatical analysis. Accordingly, there is a need in the art for automated spoken language grammar evaluation.
  • BRIEF SUMMARY OF THE INVENTION
  • Embodiments of the invention include a method for spoken language grammar evaluation. The method includes playing a recorded question to a candidate, recording a spoken answer from the candidate, and converting the spoken answer into text. The method further includes comparing the text to a grammar database, calculating a spoken language grammar evaluation score based on the comparison, and outputting the spoken language grammar evaluation score.
  • Additional embodiments include a system for spoken language grammar evaluation. The system includes a host system in communication with a user system, where the user system provides audio input and output for a candidate. The system further includes a grammar database in communication with the host system, and a grammar assessment tool (GAT) executing upon the host system. The GAT sends a recorded question to the candidate. The user system plays the recorded question and records a spoken answer. The GAT receives the spoken answer from the candidate, and initiates a conversion of the spoken answer into text. The GAT further compares the text to the grammar database, calculates a spoken language grammar evaluation score based on the comparison, and outputs the spoken language grammar evaluation score.
  • Further embodiments include a computer program product for spoken language grammar evaluation. The computer program product includes a storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for implementing a method. The method includes sending a recorded question to a candidate, receiving a spoken answer from the candidate, and initiating a conversion of the spoken answer into text. The method further includes comparing the text to a grammar database, calculating a spoken language grammar evaluation score based on the comparison, and outputting the spoken language grammar evaluation score.
  • Other systems, methods, and/or computer program products according to embodiments will be or become apparent to one with skill in the art upon review of the following drawings and detailed description. It is intended that all such additional systems, methods, and/or computer program products be included within this description, be within the scope of the present invention, and be protected by the accompanying claims.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The subject matter which is regarded as the invention is particularly pointed out and distinctly claimed in the claims at the conclusion of the specification. The foregoing and other objects, features, and advantages of the invention are apparent from the following detailed description taken in conjunction with the accompanying drawings in which:
  • FIG. 1 illustrates one example of a block diagram of a system upon which spoken language grammar evaluation may be implemented in exemplary embodiments;
  • FIG. 2 illustrates one example of a flow diagram describing a process for spoken language grammar test development in accordance with exemplary embodiments; and
  • FIG. 3 illustrates one example of a flow diagram describing a process for spoken language grammar evaluation in accordance with exemplary embodiments.
  • The detailed description explains the preferred embodiments of the invention, together with advantages and features, by way of example with reference to the drawings.
  • DETAILED DESCRIPTION OF THE INVENTION
  • Exemplary embodiments, as shown and described by the various figures and the accompanying text, provide methods, systems and computer program products for spoken language grammar evaluation. In exemplary embodiments, a question or sentence is played as audio content to a candidate, and spoken utterances of the candidate, in response thereto, are evaluated for grammatical correctness. Here, a “candidate” refers to a user whose spoken language grammar is under evaluation. A speech recognition system may be employed to convert the candidate's speech into text. In general, speech recognition systems can be error prone, such that converted text generated by a speech recognition system may not be exactly what the candidate said. Thus, performing a grammar test based only on the text as generated by the speech recognition system may provide incorrect results. In exemplary embodiments, the candidate's sentences are restricted through making the candidate listen to a sentence, and then prompting the candidate to speak a grammatically correct version of the sentence. This technique ensures that the sentence spoken by the candidate is among the sentences that can be correctly converted to text by the speech recognition system. Moreover, this method may increase spoken language grammar evaluation accuracy. Further, since the entire evaluation can be performed as spoken interactions, factors such as spontaneity, just-in-time sentence composition, and other such factors are incorporated in evaluating the candidate's spoken language grammar.
  • Turning now to the drawings, it will be seen that in FIG. 1 there is a block diagram of a system 100 upon which spoken language grammar evaluation is implemented in exemplary embodiments. The system 100 of FIG. 1 includes a host system 102 in communication with user systems 104 over a network 106. In exemplary embodiments, the host system 102 is a high-speed processing device (e.g., a mainframe computer) including at least one processing circuit (e.g., a CPU) capable of reading and executing instructions, and handling numerous interaction requests from the user systems 104. The host system 102 may function as an application server, a database management server, and/or a web server. The user systems 104 may comprise desktop, laptop, or general-purpose computer devices that provide an interface for candidates to perform spoken language grammar evaluation. System administrators of the host system 102 may also access the host system 102 via the user systems 104, performing such tasks as developing grammar test content. While only a single host system 102 is shown in FIG. 1, it will be understood that multiple host systems can be implemented, each in communication with one another via direct coupling or via one or more networks. For example, multiple host systems may be interconnected through a distributed network architecture. The single host system 102 may also represent a cluster of hosts collectively performing processes as described in greater detail herein. In alternate exemplary embodiments, the host system 102 is integrated with a user system 104 as a single personal computer or workstation.
  • In exemplary embodiments, the user systems 104 interface with audio input and output devices, such as a microphone 108 and a speaker 110. In alternate exemplary embodiments, the user systems 104 are mobile devices, such as Web-enabled wireless phones, with the microphone 108 and speaker 110 integrated into the user systems 104. Using the microphone 108, a candidate may record responses to questions or other statements output via the speaker 110. The user systems 104 may include Web browsing software and/or other communication technologies to exchange information with the host system 102 via the network 106.
  • The network 106 may be any type of communications network known in the art. For example, the network 106 may be an intranet, extranet, or an internetwork, such as the Internet, or a combination thereof. The network 106 can include wireless, wired, and/or fiber optic links.
  • In exemplary embodiments, the host system 102 accesses and stores data in a data storage device 112. The data storage device 112 refers to any type of storage and may comprise a secondary storage element, e.g., hard disk drive, tape, or a storage subsystem that is internal or external to the host system 102. Types of data that may be stored in the data storage device 112 include files and databases, such as audio and textual information. It will be understood that the data storage device 112 shown in FIG. 1 is provided for purposes of simplification and ease of explanation and is not to be construed as limiting in scope. To the contrary, there may be multiple data storage devices 112 utilized by the host system 102. In support of spoken language grammar evaluation, the data storage device 112 may store questions 114, model answers 116, answer speech recognition (ASR) grammar 118, candidate answers 120, and training data 122, as further described herein.
  • In exemplary embodiments, the host system 102 executes various applications, including a grammar assessment tool (GAT) 124 and a speech recognition system (SRS) 126. An operating system and other applications, e.g., business applications, a web server, etc., may also be executed by the host system 102 as dictated by the needs of the enterprise of the host system 102. In exemplary embodiments, the GAT 124 performs spoken language grammar evaluation in response to a request received from the user systems 104. The GAT 124 may send the questions 114 to the user systems 104 to elicit spoken responses from candidates. The spoken responses are returned to the host system 102 and may be stored as the candidate answers 120. In exemplary embodiments, the SRS 126 converts the candidate answers 120 from speech into text. The GAT 124 may compare the text output of the SRS 126 to the ASR grammar 118 as developed using the model answers 116, and calculate an associated spoken language grammar evaluation score. The GAT 124 may also calculate a total weighted spoken language grammar evaluation score as a summation of multiple responses from a candidate, weighted relative to the difficulty of each question as determined from the training data 122. In alternate exemplary embodiments, the SRS 126 performs the comparison of converted text to the ASR grammar 118, and calculates the spoken language grammar evaluation score.
  • Although the GAT 124 and the SRS 126 are shown as separate applications executing on the host system 102, it will be understood that the applications may be merged or further subdivided as a single application, multiple applications, or any combination thereof. Moreover, while described as applications, the GAT 124 and the SRS 126 can be implemented as plug-ins, applets, modules, scripts, or other such formats known in the art. In alternate exemplary embodiments, the processing associated with the GAT 124 and the SRS 126 is split between the host system 102 and the user systems 104, e.g., a distributed computing architecture. In alternate exemplary embodiments, the host system 102 accesses the SRS 126 over the network 106 (e.g., the Internet), if the SRS 126 is available as a hosted service on another networked system (not depicted). The details of developing a spoken language grammar test and a process for spoken language grammar evaluation are further provided herein.
  • Turning now to FIG. 2, a process 200 for spoken language grammar test development will now be described in accordance with exemplary embodiments, and in reference to the system 100 of FIG. 1. An administrator can perform the process 200 to configure the data stored in the data storage device 112 for spoken language grammar testing. At block 202, a question is selected for the spoken language grammar test. Here, a “question” may be any statement that elicits a candidate response, but the question need not be in the form of an inquiry. At block 204, the question is recorded in an audio format. The recorded question may be written to the questions 114 for use during grammar testing. At block 206, possible text answers to the question are identified. The possible answers may include both grammatically correct and incorrect answers that are anticipated. For example, a question could be, “I am owning a big car.” Possible answers could include those listed in table 1.
  • TABLE 1
    Possible Answers
    MODEL ANSWERS CORRECT?
    I am owning a big car. No.
    I own a big car. Yes.
    I have a big car. Yes.
    I owe a big car. Yes.
    I am driving a big car. Yes.
    I drive a big car. Yes.
  • The possible answers can be manually generated. In alternate exemplary embodiments, the possible answers are automatically generated using a technique known in the art, such as the techniques taught by Uchimoto, K., Sekine, S., and Isahara, H., "Text Generation from Keywords," Proc. COLING, 2002; and/or John Lee and Stephanie Seneff, "Automatic Grammar Correction for Second-Language Learners," Interspeech-ICSLP (Pittsburgh), 17-21 Sep. 2006. In exemplary embodiments, the possible answers are stored in the model answers 116. At block 208, the model answers 116 are grammatically analyzed and the results are written to a grammar database, i.e., the ASR grammar 118. Grammatical analysis may include compiling a list of correct answers, as well as anticipated incorrect answers. Having explicit incorrect answers in the list can help to increase evaluation confidence when one of the answers in the list is recorded. The speech recognition grammar thus encapsulates the numerous possible ways that a sentence (i.e., the question) can be made grammatically correct. Grammatically incorrect answers may be written to the ASR grammar 118 and flagged as incorrect to assist in determining whether the candidate's grammar under analysis is correct or incorrect. The process 200 can be repeated to generate a set of questions 114 and possible answers to form one or more grammar tests, with additional model answers 116 and ASR grammar 118 written to the data storage device 112.
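  • As an illustration only (not part of the patent), the records produced by process 200 can be represented as a small in-memory structure pairing each question with its anticipated answers and correctness flags. The Python sketch below is hypothetical: the GrammarEntry name and its fields are invented for this example, and a real system would persist the data in the data storage device 112.

      from dataclasses import dataclass, field

      @dataclass
      class GrammarEntry:
          # One test item: the stimulus sentence plus all anticipated
          # answers, each flagged correct or incorrect (cf. Table 1).
          question: str
          answers: dict = field(default_factory=dict)  # text -> is_correct

          def add_answer(self, text, correct):
              # Normalize case and trailing punctuation so recognizer
              # output can be matched reliably later.
              self.answers[text.lower().rstrip('.')] = correct

      entry = GrammarEntry("I am owning a big car.")
      entry.add_answer("I am owning a big car.", False)  # flagged incorrect
      entry.add_answer("I own a big car.", True)
      entry.add_answer("I have a big car.", True)
      entry.add_answer("I drive a big car.", True)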
  • Turning now to FIG. 3, a process 300 for spoken language grammar evaluation will now be described in accordance with exemplary embodiments, and in reference to the system 100 of FIG. 1. In exemplary embodiments, a candidate initiates spoken language grammar evaluation via a request from a user system 104 to the GAT 124. In response thereto, the GAT 124 sends a recorded question from the questions 114 to the user system 104. At block 302, the recorded question is played to the candidate. The candidate may listen to the recorded question as audio output through the speaker 110. In exemplary embodiments, the recorded question is a sentence containing one or more grammatical errors. The candidate is prompted to speak a corrected form of the sentence without grammatical errors.
  • At block 304, a spoken answer from the candidate is recorded. The spoken answer may be input through the microphone 108, transmitted to the host system 102, received by the GAT 124, and written to the candidate answers 120.
  • At block 306, the GAT 124 initiates the SRS 126 to convert the spoken answer from the candidate answers 120 into text. The SRS 126 may use any process known in the art to convert from recorded audio into text. In exemplary embodiments, the SRS 126 applies a limited conversion vocabulary based on the data stored in the ASR grammar 118. At block 308, the GAT 124 compares the text to the contents of the ASR grammar 118 (i.e., the grammar database). In alternate exemplary embodiments, the SRS 126 performs the comparison in block 308. The comparison matches the candidate's response with one of the possible correct answers that is present in the ASR grammar 118. If the candidate speaks a grammatically correct sentence, then the SRS 126 may correctly convert the spoken answer into text using the ASR grammar 118. If the candidate's response is not present in the ASR grammar 118, then the SRS 126 may not be able to correctly convert the spoken answer into text.
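  • The patent does not specify how the SRS 126 limits its conversion vocabulary; one minimal way to approximate the effect, shown in the hypothetical Python sketch below, is to re-score the recognizer's raw hypothesis against the sentences in the ASR grammar 118 and keep only a sufficiently close match. The constrain_to_grammar name and the threshold value are assumptions for illustration.

      import difflib

      def constrain_to_grammar(raw_hypothesis, allowed_sentences, threshold=0.7):
          # Compare the raw hypothesis with every sentence in the ASR
          # grammar and keep the closest; below the threshold, treat the
          # utterance as outside the grammar (a conversion failure).
          best_sentence, best_ratio = None, 0.0
          for sentence in allowed_sentences:
              ratio = difflib.SequenceMatcher(
                  None, raw_hypothesis.lower(), sentence.lower()).ratio()
              if ratio > best_ratio:
                  best_sentence, best_ratio = sentence, ratio
          return best_sentence if best_ratio >= threshold else None

      # e.g. constrain_to_grammar("i own a big car", list(entry.answers))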
  • At block 310, the GAT 124 calculates a spoken language grammar evaluation score based on the comparison. In alternate exemplary embodiments, the SRS 126 performs the calculation in block 310. If the sentence recorded by the candidate (i.e., the text of the spoken answer) is one that exists in the ASR grammar 118, a higher score is assigned to the response than for an incorrect answer. If the candidate's spoken answer cannot be matched with one of the possible correct answers in the ASR grammar 118, it may be assumed that the candidate has spoken a grammatically incorrect sentence, since all of the grammatically correct possibilities are stored in the ASR grammar 118 per the process 200 of FIG. 2. A failure to locate a response in the ASR grammar 118 may result in a score of zero for the associated question. When the candidate's response is located in the ASR grammar 118, the text of the candidate's response is matched with the list of sentences from the model answers 116 to identify correct and incorrect grammar. In exemplary embodiments, the candidate is awarded a score of one for correct grammar and zero for incorrect grammar.
  • In determining whether a sentence is grammatically correct or incorrect, a lower grammatical word weight may be assigned to words that do not play a critical role in distinguishing correct from incorrect grammar. For example, the emphasized words in table 2 may be ignored or assigned a low grammatical word weight in determining whether the candidate's spoken language grammar was correct or incorrect for the given question.
  • TABLE 2
    Words with a Lower Grammatical Word Weight in
    Possible Answers
    MODEL ANSWERS CORRECT?
    I am owning a big car. No.
    I own a big car. Yes.
    I have a big car. Yes.
    I owe a big car. Yes.
    I am driving a big car. Yes.
    I drive a big car. Yes.
  • Thus, a speech recognition error in one of the emphasized words in table 2, which carry lower grammatical significance, will not degrade the performance of the overall spoken language grammar evaluation system. This makes the process 300 for spoken language grammar evaluation robust, since it can absorb some speech recognition errors.
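  • A hedged Python sketch of the matching and scoring in blocks 308-310 follows. It is not the patent's implementation: grammar_score is an invented name, and because the emphasis in table 2 is lost in this rendering, the low-weight word set in the usage example is a placeholder.

      def grammar_score(response, model_answers, low_weight_words):
          # model_answers: answer text -> True (correct) / False (incorrect).
          # Words with a low grammatical word weight are dropped before
          # matching, so a recognition error in one of them cannot flip
          # the correct/incorrect verdict.
          def reduced(sentence):
              return tuple(w for w in sentence.lower().rstrip('.').split()
                           if w not in low_weight_words)

          response_key = reduced(response)
          for answer, correct in model_answers.items():
              if reduced(answer) == response_key:
                  return 1 if correct else 0
          return 0  # no match in the ASR grammar: assumed incorrect

      # Placeholder low-weight words; the patent's emphasized words are
      # not recoverable from this rendering.
      score = grammar_score("I own big car.",
                            {"I own a big car.": True,
                             "I am owning a big car.": False},
                            low_weight_words={"a", "big", "car"})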
  • In exemplary embodiments, spoken language grammar evaluation includes assessing various properties of the candidate's grammar, such as use of articles, prepositions, subject-verb agreement, and word order. To evaluate such properties, responses to a large number of questions 114, e.g., twenty or more, are sought from the candidate. The questions 114 are designed to cover the various properties of the candidate's grammar. Further, the method of calculating a final score from the individual question scores may be based on the ability to differentiate between "good" candidates and "bad" candidates. Good candidates are those who are strong in grammar (e.g., more correct answers), and bad candidates are those who are weak in grammar (e.g., fewer correct answers), as identified through evaluation performance results. To identify which of the questions 114 are most valuable for differentiating candidates, the input from several candidates can be selected for analysis, with the results stored in the training data 122. A training process may include evaluating the scores of multiple candidates on a common grammar test to establish a scoring weight for the relative difficulty of each question 114.
  • In exemplary embodiments, the weight for each question q in the common grammar test is calculated as follows. Let a be the number of candidates who answer q correctly and whose overall score in the grammar test is >=3; let b be the number of candidates who answer q incorrectly and whose overall score is <=1; let c be the number of candidates who answer q correctly but whose overall score is <=1; and let d be the number of candidates who answer q incorrectly but whose overall score is >=3. The score for question q is then given by the equation: [a + b - (c + d)] / [a + b + c + d].
  • This formulation assumes that candidates are evaluated in five categories (0, 1, 2, 3, and 4, with 4 being the best). Similar formulations for the other categories can be calculated by modifying the equation and threshold values. The weighting emphasizes questions that are answered correctly by candidates who primarily receive high scores, or answered incorrectly by candidates who primarily receive low scores. In exemplary embodiments, the training process generates a set of weights, one per question, which are stored in the training data 122. The training data 122 can be manually verified to ensure that it is accurate.
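  • The weight formula transcribes directly into code. The sketch below is hypothetical (question_weight is an invented name, and the guard against an empty denominator is an added assumption); it takes, for one question, each training candidate's correctness on that question paired with the candidate's overall 0-4 test score.

      def question_weight(results):
          # results: iterable of (answered_correctly, overall_score) pairs.
          a = sum(1 for ok, s in results if ok and s >= 3)      # correct, high scorer
          b = sum(1 for ok, s in results if not ok and s <= 1)  # incorrect, low scorer
          c = sum(1 for ok, s in results if ok and s <= 1)      # correct, low scorer
          d = sum(1 for ok, s in results if not ok and s >= 3)  # incorrect, high scorer
          total = a + b + c + d  # candidates with an overall score of 2 drop out
          return (a + b - (c + d)) / total if total else 0.0  # ranges over [-1, 1]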
  • At block 312, the GAT 124 outputs the spoken language grammar evaluation score. The GAT 124 may also output a summation of the weighted spoken language grammar evaluation scores for the candidate as the weighted sum of each question attempted by the candidate, applying the weights calculated in the training data 122.
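  • A minimal sketch of the weighted summation in block 312, under the same illustrative assumptions (the names are invented; the scores are the per-question 0/1 values described above):

      def weighted_total(question_scores, question_weights):
          # question_scores: question id -> 0/1 score for this candidate.
          # question_weights: question id -> weight from the training data 122.
          return sum(score * question_weights[q]
                     for q, score in question_scores.items())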
  • Technical effects of exemplary embodiments include spoken language grammar evaluation of a candidate. A weighted score for a candidate can be calculated to establish the relative performance of the candidate as compared to other candidates. Advantages of exemplary embodiments include performing a grammar evaluation as a spoken response to a predetermined question to enhance a candidate's spontaneity and just-in-time sentence composition ability without visual assistance. Further advantages include applying a grammatical word weight to determine the grammatical correctness of the candidate's spoken answer by reducing the effect of speech recognition errors in words that are deemed non-critical to the grammar evaluation.
  • As described above, embodiments can be embodied in the form of computer-implemented processes and apparatuses for practicing those processes. In exemplary embodiments, the invention is embodied in computer program code executed by one or more network elements. Embodiments include computer program code containing instructions embodied in tangible media, such as floppy diskettes, CD-ROMs, hard drives, universal serial bus (USB) flash drives, or any other computer-readable storage medium, wherein, when the computer program code is loaded into and executed by a computer, the computer becomes an apparatus for practicing the invention. Embodiments include computer program code, for example, whether stored in a storage medium, loaded into and/or executed by a computer, or transmitted over some transmission medium, such as over electrical wiring or cabling, through fiber optics, or via electromagnetic radiation, wherein, when the computer program code is loaded into and executed by a computer, the computer becomes an apparatus for practicing the invention. When implemented on a general-purpose microprocessor, the computer program code segments configure the microprocessor to create specific logic circuits.
  • While the invention has been described with reference to exemplary embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted for elements thereof without departing from the scope of the invention. In addition, many modifications may be made to adapt a particular situation or material to the teachings of the invention without departing from the essential scope thereof. Therefore, it is intended that the invention not be limited to the particular embodiment disclosed as the best mode contemplated for carrying out this invention, but that the invention will include all embodiments falling within the scope of the appended claims. Moreover, the terms first, second, etc. do not denote any order or importance; rather, they are used to distinguish one element from another. Furthermore, the terms a, an, etc. do not denote a limitation of quantity, but rather denote the presence of at least one of the referenced item.

Claims (2)

1-20. (canceled)
21. A method for spoken language grammar evaluation, comprising:
determining training data as a collection of responses to a plurality of questions, wherein the plurality of questions are designed to cover multiple grammar properties;
calculating a question weight for the plurality of questions as a function of test data from a plurality of candidates, wherein the plurality of candidates includes a mix of candidates that are identified as weak and strong in grammar;
storing the training data;
playing a recorded question to a candidate, wherein the recorded question is one of the plurality of questions;
recording a spoken answer from the candidate;
converting the spoken answer into text;
comparing the text to a grammar database;
calculating a spoken language grammar evaluation score based on the comparison, including a weighted spoken language grammar evaluation score as a function of the question weight; and
outputting a summation of the weighted spoken language grammar evaluation scores for the candidate, wherein the summation is adjusted in response to additional spoken answers from the candidate.
US11/853,076 2007-09-11 2007-09-11 Methods, systems, and computer program products for spoken language grammar evaluation Abandoned US20090070100A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US11/853,076 US20090070100A1 (en) 2007-09-11 2007-09-11 Methods, systems, and computer program products for spoken language grammar evaluation
US12/055,623 US7966180B2 (en) 2007-09-11 2008-03-26 Methods, systems, and computer program products for spoken language grammar evaluation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US11/853,076 US20090070100A1 (en) 2007-09-11 2007-09-11 Methods, systems, and computer program products for spoken language grammar evaluation

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US12/055,623 Continuation US7966180B2 (en) 2007-09-11 2008-03-26 Methods, systems, and computer program products for spoken language grammar evaluation

Publications (1)

Publication Number Publication Date
US20090070100A1 true US20090070100A1 (en) 2009-03-12

Family

ID=40432826

Family Applications (2)

Application Number Title Priority Date Filing Date
US11/853,076 Abandoned US20090070100A1 (en) 2007-09-11 2007-09-11 Methods, systems, and computer program products for spoken language grammar evaluation
US12/055,623 Expired - Fee Related US7966180B2 (en) 2007-09-11 2008-03-26 Methods, systems, and computer program products for spoken language grammar evaluation

Family Applications After (1)

Application Number Title Priority Date Filing Date
US12/055,623 Expired - Fee Related US7966180B2 (en) 2007-09-11 2008-03-26 Methods, systems, and computer program products for spoken language grammar evaluation

Country Status (1)

Country Link
US (2) US20090070100A1 (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8775184B2 (en) * 2009-01-16 2014-07-08 International Business Machines Corporation Evaluating spoken skills
US9378650B2 (en) * 2009-09-04 2016-06-28 Naomi Kadar System and method for providing scalable educational content
US9384678B2 (en) * 2010-04-14 2016-07-05 Thinkmap, Inc. System and method for generating questions and multiple choice answers to adaptively aid in word comprehension
US9235566B2 (en) 2011-03-30 2016-01-12 Thinkmap, Inc. System and method for enhanced lookup in an online dictionary
US9530329B2 (en) * 2014-04-10 2016-12-27 Laurence RUDOLPH System and method for conducting multi-layer user selectable electronic testing
US9953646B2 (en) 2014-09-02 2018-04-24 Belleau Technologies Method and system for dynamic speech recognition and tracking of prewritten script
KR102396983B1 (en) * 2015-01-02 2022-05-12 삼성전자주식회사 Method for correcting grammar and apparatus thereof
WO2017044415A1 (en) 2015-09-07 2017-03-16 Voicebox Technologies Corporation System and method for eliciting open-ended natural language responses to questions to train natural language processors
US9448993B1 (en) 2015-09-07 2016-09-20 Voicebox Technologies Corporation System and method of recording utterances using unmanaged crowds for natural language processing
US9519766B1 (en) 2015-09-07 2016-12-13 Voicebox Technologies Corporation System and method of providing and validating enhanced CAPTCHAs
US9401142B1 (en) 2015-09-07 2016-07-26 Voicebox Technologies Corporation System and method for validating natural language content using crowdsourced validation jobs
WO2017044409A1 (en) 2015-09-07 2017-03-16 Voicebox Technologies Corporation System and method of annotating utterances based on tags assigned by unmanaged crowds
CN107067834A (en) * 2017-03-17 2017-08-18 麦片科技(深圳)有限公司 Point-of-reading system with oral evaluation function
CN109035896B (en) * 2018-08-13 2021-11-05 广东小天才科技有限公司 Oral training method and learning equipment

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070015121A1 (en) 2005-06-02 2007-01-18 University Of Southern California Interactive Foreign Language Teaching

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5540589A (en) * 1994-04-11 1996-07-30 Mitsubishi Electric Information Technology Center Audio interactive tutor
US5717828A (en) * 1995-03-15 1998-02-10 Syracuse Language Systems Speech recognition apparatus and method for learning
US6224383B1 (en) * 1999-03-25 2001-05-01 Planetlingo, Inc. Method and system for computer assisted natural language instruction with distracters
US20020086268A1 (en) * 2000-12-18 2002-07-04 Zeev Shpiro Grammar instruction with spoken dialogue
US6985862B2 (en) * 2001-03-22 2006-01-10 Tellme Networks, Inc. Histogram grammar weighting and error corrective training of grammar weights
US7440895B1 (en) * 2003-12-01 2008-10-21 Lumenvox, Llc. System and method for tuning and testing in a speech recognition system
US20050281395A1 (en) * 2004-06-16 2005-12-22 Brainoxygen, Inc. Methods and apparatus for an interactive audio learning system

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100332217A1 (en) * 2009-06-29 2010-12-30 Shalom Wintner Method for text improvement via linguistic abstractions
US20130185057A1 (en) * 2012-01-12 2013-07-18 Educational Testing Service Computer-Implemented Systems and Methods for Scoring of Spoken Responses Based on Part of Speech Patterns
US9514109B2 (en) * 2012-01-12 2016-12-06 Educational Testing Service Computer-implemented systems and methods for scoring of spoken responses based on part of speech patterns
CN104951219A (en) * 2014-03-25 2015-09-30 华为技术有限公司 Text input method for mobile terminal and mobile terminal
CN108154735A (en) * 2016-12-06 2018-06-12 爱天教育科技(北京)有限公司 Oral English Practice assessment method and device
CN110164447A (en) * 2019-04-03 2019-08-23 苏州驰声信息科技有限公司 A kind of spoken language methods of marking and device
CN110164447B (en) * 2019-04-03 2021-07-27 苏州驰声信息科技有限公司 Spoken language scoring method and device

Also Published As

Publication number Publication date
US7966180B2 (en) 2011-06-21
US20090070111A1 (en) 2009-03-12

Similar Documents

Publication Publication Date Title
US7966180B2 (en) Methods, systems, and computer program products for spoken language grammar evaluation
US8990082B2 (en) Non-scorable response filters for speech scoring systems
US9704413B2 (en) Non-scorable response filters for speech scoring systems
US8392190B2 (en) Systems and methods for assessment of non-native spontaneous speech
JP4002401B2 (en) Subject ability measurement system and subject ability measurement method
US8271281B2 (en) Method for assessing pronunciation abilities
US9224383B2 (en) Unsupervised language model adaptation for automated speech scoring
US9652999B2 (en) Computer-implemented systems and methods for estimating word accuracy for automatic speech recognition
US7778834B2 (en) Method and system for assessing pronunciation difficulties of non-native speakers by entropy calculation
US20130185057A1 (en) Computer-Implemented Systems and Methods for Scoring of Spoken Responses Based on Part of Speech Patterns
US9443193B2 (en) Systems and methods for generating automated evaluation models
US9262941B2 (en) Systems and methods for assessment of non-native speech using vowel space characteristics
US9837070B2 (en) Verification of mappings between phoneme sequences and words
US9799228B2 (en) Systems and methods for natural language processing for speech content scoring
CN103559892A (en) Method and system for evaluating spoken language
KR20160008949A (en) Apparatus and method for foreign language learning based on spoken dialogue
US20080126094A1 (en) Data Modelling of Class Independent Recognition Models
KR20110068491A (en) Grammar error simulation apparatus and method
McTear et al. Evaluating the conversational interface
Loukina et al. Automated scoring across different modalities
Cook et al. Elicited imitation for prediction of OPI test scores
Neumeyer et al. Webgrader: a multilingual pronunciation practice tool
US20050144010A1 (en) Interactive language learning method capable of speech recognition
Tadimeti et al. Evaluation of off-the-shelf speech recognizers on different accents in a dialogue domain
US11588800B2 (en) Customizable voice-based user authentication in a multi-tenant system

Legal Events

Date Code Title Description
AS Assignment

Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW YORK

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:BAJAJ, RAJNI;BALAKRISHNAN, SREERAM V.;BHANDARI, MRIDULA;AND OTHERS;REEL/FRAME:019807/0031;SIGNING DATES FROM 20070703 TO 20070907

AS Assignment

Owner name: NUANCE COMMUNICATIONS, INC., MASSACHUSETTS

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:INTERNATIONAL BUSINESS MACHINES CORPORATION;REEL/FRAME:022689/0317

Effective date: 20090331

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION