US6374217B1 - Fast update implementation for efficient latent semantic language modeling - Google Patents
- Publication number
- US6374217B1
- Authority
- US
- United States
- Prior art keywords
- pseudo
- document
- language model
- vector
- document vector
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/1815—Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/19—Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
- G10L15/197—Probabilistic grammars, e.g. word n-grams
Definitions
- the present invention relates to speech recognition and, more particularly, to language modeling in large-vocabulary speech recognition systems.
- Speech recognition is typically the process of converting an acoustic signal into a linguistic message.
- the resulting message may need to contain just enough information to reliably communicate a speaker's goal.
- it may be critical that the resulting message represent a verbatim transcription of a sequence of spoken words. In either event, an accurate statistical, or stochastic, language model is desirable for successful recognition.
- stochastic language models are commonly used in speech recognition systems to constrain acoustic analyses, to guide searches through various text hypotheses, and to aid in the determination of final text transcriptions. Therefore, it is vital that a stochastic language model be easily implementable and highly reliable.
- Available language modeling techniques have proven less than adequate for many real world applications. For example, while many existing models perform satisfactorily in small-vocabulary contexts in which the range of spoken words input to a recognition system is severely limited (e.g., to 1000 words or less), relatively few known models are even tractable in large-vocabulary contexts in which the range of possible spoken words is virtually unlimited (e.g., 20,000 words or more).
- Traditionally, language models have relied upon the classic n-gram paradigm to define the probability of occurrence, within a spoken vocabulary, of all possible sequences of n words. Because it emphasizes word order, the n-gram paradigm is properly cast as a syntactic approach to language modeling. Also, because it provides probabilities for relatively small groups of words (i.e., n is typically less than the number of words in a sentence of average length), the n-gram paradigm is said to impose local language constraints on the speech recognition process.
- Given a language model consisting of a set of a priori n-gram probabilities, a conventional speech recognition system can define a “most likely” linguistic output message based on an acoustic input signal.
- because the n-gram paradigm does not contemplate word meaning, and because limits on available processing and memory resources preclude the use of models in which n is made large enough to incorporate global language constraints, models based purely on the n-gram paradigm are not always sufficiently reliable. This is particularly true in modern, large-vocabulary applications.
- latent semantic analysis is a data-driven technique which, given a corpus of training text, describes which words appear in which global contexts (e.g., which documents). This allows words to be represented as vectors in a convenient vector space.
- the full power of latent semantic analysis has yet to be exploited.
- the various known semantic models may ultimately prove beneficial in certain applications, the inherent lack of tight local word order constraints in such models may ultimately prevent their widespread acceptance and use.
- Stochastic language modeling plays a central role in large-vocabulary speech recognition, where it is typically used to constrain the acoustic analysis, guide the search through various (partial) text hypotheses, and contribute to the determination of the final transcription.
- a new class of statistical language models has recently been introduced that exploits both syntactic and semantic information. This approach embeds latent semantic analysis (LSA), which is used to capture meaningful word associations in the available context, into the standard n-gram paradigm, which relies on the probability of occurrence in the language of all possible strings of n words.
- a method and apparatus for a fast update implementation for efficient latent semantic language modeling in a hybrid stochastic language model which seamlessly combines syntactic and semantic analyses is provided.
- Speech or acoustic signals are received, features are extracted from the signals, and an acoustic vector sequence is produced from the signals by a mapping from words and documents of the signals.
- the speech signals are processed directly using a language model produced by integrating a latent semantic analysis into an n-gram probability.
- the latent semantic analysis language model probability is computed using a first pseudo-document vector expressed in terms of a second pseudo-document vector.
- Expressing the first pseudo-document vector in terms of the second pseudo-document vector comprises updating the second pseudo-document vector directly in latent semantic analysis space in order to produce the first pseudo-document vector in response to at least one addition of a candidate word of the received speech signals. Updating precludes mapping the sparse representations for a current word and pseudo-document to vectors for a current word and pseudo-document for each addition of a candidate word of the received speech signals, wherein a number of computations of the processing are reduced by a value approximately equal to a vocabulary size. A linguistic message representative of the received speech signals is generated.
- FIG. 1 is a speech recognition system of an embodiment of the present invention.
- FIG. 2 is a computer system hosting a speech recognition system of an embodiment of the present invention.
- FIG. 3 is a computer system memory hosting a speech recognition system of an embodiment of the present invention.
- FIG. 4 is a speech recognition system of a preferred embodiment of the present invention.
- FIG. 5 is a flowchart for a speech recognition system comprising a fast update implementation for efficient latent semantic language modeling of an embodiment of the present invention.
- FIG. 1 is a speech recognition system 100 of an embodiment of the present invention comprising a transducer 130 , a signal pre-processor 120 , a recognition processor 160 , an acoustic model 170 , a lexicon 180 , and a language model 190 .
- the language model 190 of an embodiment of the present invention comprises latent semantic analysis (LSA), but the embodiment is not so limited.
- the signal pre-processor 120 includes an analog-to-digital (A/D) converter 140 and a feature extractor 150 .
- An acoustic signal is input to the transducer 130 , and an output of the transducer 130 is coupled to an input of the A/D converter 140 .
- An output of the A/D converter 140 is in turn coupled to an input of the feature extractor 150 , and an output of the feature extractor 150 is coupled to an input of the recognition processor 160 .
- the recognition processor 160 receives input from a set of acoustic models 170 , the lexicon 180 , and the language model 190 and produces a linguistic message output.
- FIG. 2 is a computer system 200 hosting the speech recognition system (SRS) of one embodiment.
- the computer system 200 comprises, but is not limited to, a system bus 201 that allows for communication among a processor 202 , a digital signal processor 208 , a memory 204 , and a mass storage device 207 .
- the system bus 201 is also coupled to receive inputs from a keyboard 222 , a pointing device 223 , and a speech signal input device 225 , but is not so limited.
- the system bus 201 provides outputs to a display device 221 and a hard copy device 224 , but is not so limited.
- FIG. 3 is the computer system memory 310 hosting the speech recognition system of one embodiment.
- An input device 302 provides speech signals to a digitizer and bus interface 304 .
- the digitizer 304 samples and digitizes the speech signals for further processing.
- the digitizer and bus interface 304 allows for storage of the digitized speech signals in the speech input data memory component 318 of memory 310 via the system bus 308 .
- the digitized speech signals are processed by a digital processor 306 using algorithms and data stored in the components 312 - 322 of the memory 310 .
- the algorithms and data that are used in processing the speech signals are stored in components of the memory 310 comprising, but not limited to, a hidden Markov model (HMM) training and recognition processing computer program 312 , a viterbi processing computer program code and storage 314 , a preprocessing computer program code and storage 316 , language model memory 320 , and acoustic model memory 322 .
- an acoustic speech signal is input to the system 100 using the transducer 130 , which may be for example a microphone.
- a corresponding analog electrical signal, output by the transducer 130 is then converted to digital form by the A/D converter 140 .
- the resulting digital speech samples are then processed in successive time intervals within the feature extractor 150 , using conventional methods, to produce a sequence of acoustic feature vectors.
- the resulting feature vectors are optionally converted, using known vector quantization techniques, into a sequence of discrete feature code-words representative of various acoustic prototypes.
- the feature vectors, or code-words are then transformed by the recognition processor 160 to produce an appropriate linguistic message output, but the embodiment is not so limited.
- the recognition processor 160 utilizes the set of acoustic models 170 , the lexicon 180 , and the language model 190 , in combination, to constrain and make workable the transformation process, but is not so limited.
- the set of acoustic models 170 (e.g., well-known hidden Markov models) is used to evaluate the feature vectors output by the feature extractor 150 against basic units of speech, such as phonemes or allophones. The most likely basic units of speech are then processed, in accordance with information provided by the lexicon 180 and the language model 190 , to generate the final linguistic message output.
- the lexicon 180 of an embodiment defines the vocabulary of the recognition system 100 in terms of the basic speech elements (words), and the language model 190 defines allowable sequences of vocabulary items.
- the language model 190 may be a stochastic language model which provides a set of a priori probabilities, each probability indicating a likelihood that a given word may occur in a particular context. Such a set of a priori probabilities may be used, for example, to help search for and prioritize candidate output messages based on sequences of basic speech elements. Note, however, that the precise method by which the recognition processor 160 utilizes the language model 190 to create an output message from a sequence of basic speech units is not necessary for an understanding of the present invention as long as an LSA component is used at some point.
- the language model 190 may be a single-span, or single-context, language model, wherein the language model 190 may be a syntactic model (e.g., an n-gram model), providing a set of a priori probabilities based on a local word context, or it may be a semantic model (e.g., a latent semantic model), providing a priori probabilities based on a global word context.
- the language model 190 provides a set of n-gram a priori probabilities, each of which defines the likelihood that a particular word within the system vocabulary (defined by the lexicon 180 ) will occur immediately following a string of n−1 words which are also within the system vocabulary.
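- the language model 190 provides, for each word $w_q$ in an available vocabulary $V$, a conditional probability $\Pr(w_q \mid H_q^{(l)})$ that the word will occur given a local context, or history, $H_q^{(l)}$ consisting of the preceding string of n−1 words: $$\Pr(w_q \mid H_q^{(l)}) = \Pr(w_q \mid w_{q-1} w_{q-2} \ldots w_{q-n+1}) \quad (1)$$
- given a set of a priori probabilities defined in accordance with equation (1), the recognition processor 160 can search for, and assess the likelihood of, various text hypotheses in producing the output message.
- the a priori probabilities $\Pr(w_q \mid H_q^{(l)})$ can be estimated during a training phase using existing text databases.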
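As a rough illustration of how such local a priori probabilities might be estimated from a text database, the following sketch computes bigram (n = 2) relative frequencies. The function name `train_bigram`, the toy corpus, and the choice of unsmoothed counts are assumptions made for this example; the source does not specify an estimation procedure.

```python
from collections import Counter

def train_bigram(tokens):
    """Return a function estimating Pr(word | previous word) from raw counts."""
    history_counts = Counter(tokens[:-1])            # how often each word serves as a history
    pair_counts = Counter(zip(tokens, tokens[1:]))   # how often each adjacent (prev, word) pair occurs

    def prob(word, prev):
        if history_counts[prev] == 0:
            return 0.0  # unseen history; a practical system would smooth or back off
        return pair_counts[(prev, word)] / history_counts[prev]

    return prob

# Hypothetical toy training text.
corpus = "the cat sat on the mat the cat ran".split()
prob = train_bigram(corpus)
p_cat_given_the = prob("cat", "the")   # "the" is followed by "cat" in 2 of its 3 uses as a history
```

Unsmoothed counts assign zero probability to unseen word pairs, which is one source of the unreliable estimates that the text attributes to local-span models in large-vocabulary settings.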
- the Linguistic Data Consortium sponsored by the Advanced Research Project Agency (ARPA) provides a wide range of application-specific databases which can be used for training purposes.
- unreliable estimates and a lack of global constraints render the local-span n-gram model impractical in many large-vocabulary applications.
- semantic analyses may provide single-span language models incorporating global constraints.
- the language model 190 may provide a set of global conditional probabilities, each defining a likelihood that a particular word within the system vocabulary will occur given a specified global context.
- the global context might comprise, for example, documents selected from a set of training documents which are tailored, prior to recognition, to suit a particular application.
- the global context might be dynamic with respect to the recognition process, comprising for example a relatively long (e.g., 1000-word) text message representing the most recent output of the recognition processor 160 .
- the latent semantic model provides, for every word $w_q$ in a system vocabulary $V$, a conditional probability $\Pr(w_q \mid H_q^{(g)})$ that the word will occur given a global context, or history, $H_q^{(g)}$ consisting of a broad word span, or document, $d_q$: $$\Pr(w_q \mid H_q^{(g)}) = \Pr(w_q \mid d_q) = \sum_{k=1}^{K} \Pr(w_q \mid C_k)\,\Pr(C_k \mid d_q) \quad (2)$$
- $C_k$ denotes one of a set of K word clusters which span the underlying word/document space. These clusters can be interpreted as a convenient representation of the semantic events occurring in the training database.
- equation (2) translates the fact that the probability of a word depends on its importance relative to each semantic event as well as the importance of the semantic event itself.
- the intermediate probabilities $\Pr(w_q \mid C_k)$ and $\Pr(C_k \mid d_q)$ may be obtained using suitable multi-variate distributions, wherein such distributions are induced by appropriate distance measures defined in the vector space representation which results from the singular value decomposition framework of latent semantic analysis.
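The cluster mixture of equation (2) can be sketched numerically as follows. The function name `lsa_word_prob` and all probability values are hypothetical; in the described system the two factors would come from distributions defined in the LSA vector space rather than from hand-written lists.

```python
def lsa_word_prob(word_given_cluster, cluster_given_doc):
    """Equation (2): sum over clusters k of Pr(w | C_k) * Pr(C_k | d)."""
    return sum(p_w * p_c for p_w, p_c in zip(word_given_cluster, cluster_given_doc))

# Hypothetical values for K = 3 word clusters.
p_word_given_cluster = [0.20, 0.01, 0.05]   # Pr(w_q | C_k) for k = 1..3
p_cluster_given_doc = [0.70, 0.20, 0.10]    # Pr(C_k | d_q); sums to 1 over the clusters

p = lsa_word_prob(p_word_given_cluster, p_cluster_given_doc)
```

Here the first cluster dominates the result because it is both likely given the document and assigns the word a comparatively high probability, which is the behavior the text describes for equation (2).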
- the recognition processor 160 may employ a set of semantic a priori probabilities defined in accordance with equation (2) to search for and prioritize various text hypotheses when generating output messages.
- this semantic single-span model does not incorporate potentially useful local language constraints.
- an embodiment of the present invention teaches that the problems described herein associated with conventional, single-span systems may be overcome by strategically integrating the beneficial features of both language model types. Therefore, the present invention teaches that it is possible to combine local constraints, such as those provided by the n-gram paradigm, with global constraints, such as those provided by a latent semantic model, to integrate both syntactic and semantic information into a single, hybrid language model.
- FIG. 4 is a speech recognition system of a preferred embodiment of the present invention.
- the exemplary system 400 comprises a transducer 130 , a signal pre-processor 120 , a hybrid recognition processor 420 , an acoustic model 170 , a lexicon 180 , and a hybrid, multiple-span language model 410 .
- the signal pre-processor 120 of an embodiment comprises an analog-to-digital (A/D) converter 140 and a feature extractor 150 , but is not so limited.
- An acoustic signal is input to the transducer 130 , and an output of the transducer 130 is coupled to an input of the A/D converter 140 .
- An output of the A/D converter 140 is in turn coupled to an input of the feature extractor 150 , and an output of the feature extractor 150 is coupled to an input of the hybrid recognition processor 420 .
- the hybrid recognition processor 420 receives input from the acoustic model 170 , the lexicon 180 , and the hybrid language model 410 and produces a linguistic message output.
- the transducer 130 , the signal pre-processor 120 , the acoustic model 170 , and the lexicon 180 function as described herein with respect to FIG. 1 .
- the hybrid processor 420 of an embodiment of the present invention carries out speech recognition using a hybrid language model 410 which combines local and global language constraints to realize both syntactic and semantic modeling benefits.
- the hybrid processing of an embodiment of the present invention, in contrast to typical approaches that require two passes, is computed in a single recognition pass using an integrated model with the fast update implementation of the present invention.
- the fast update implementation for efficient latent semantic language modeling, by reducing the number of calculations necessary to perform recognition, eliminates the necessity of using a two-pass approach requiring a first pass to generate a first set of likelihoods, or scores, for a group of “most likely” candidate output messages and a second pass to process the first set of scores and produce a second set of improved, hybrid scores.
- one single-span paradigm is made subordinate to another by making appropriate assumptions with respect to conditional probabilities which are used to construct the composite, multi-span paradigm.
- subordinating the n-gram paradigm to the latent semantic paradigm amounts to driving the recognition process using global constraints while fine-tuning it using local constraints.
- subordinating the latent semantic paradigm to the n-gram paradigm yields a recognition process which proceeds locally while taking global constraints into account.
- latent semantic analysis is subordinated to the n-gram paradigm to effectively integrate semantic information into a search that is primarily syntactically driven.
- the resulting language model is therefore properly described as a modified n-gram incorporating large-span semantic information.
- an integrated paradigm is defined by computing a conditional probability $\Pr(w_q \mid H_q^{(h)})$ that a particular word $w_q$ will occur, given a hybrid history $H_q^{(h)}$ comprising a local history $H^{(l)}$ and a global history $H^{(g)}$.
- the local history $H^{(l)}$ includes a string of n−1 words $w_{q-1} w_{q-2} \ldots w_{q-n+1}$, as is described above with respect to the n-gram paradigm.
- the global history $H^{(g)}$ includes a broad word span, or document, $d_q$, as is described above with respect to latent semantic analysis.
- Such a composite conditional probability can be written, generally, as follows:
- $$\Pr(w_q \mid H_q^{(h)}) = \Pr(w_q \mid H_q^{(l)}, H_q^{(g)}) = \Pr(w_q \mid w_{q-1} w_{q-2} \ldots w_{q-n+1}, d_q) \quad (3)$$
- the composite conditional probability $\Pr(w_q \mid H_q^{(h)})$ that a particular word $w_q$ will occur, given an immediate context $w_{q-1} w_{q-2} \ldots w_{q-n+1}$ and a relevant document $d_q$, can be computed explicitly by dividing the probability of the particular word $w_q$ and the document $d_q$, given the immediate context, by a summation which includes, for every individual word $w_i$ in the system vocabulary $V$, the probability of the individual word $w_i$ and the document $d_q$, given the immediate context.
- the composite conditional probability can be written as follows: $$\Pr(w_q \mid H_q^{(h)}) = \frac{\Pr(w_q, d_q \mid w_{q-1} w_{q-2} \ldots w_{q-n+1})}{\sum_{w_i \in V} \Pr(w_i, d_q \mid w_{q-1} w_{q-2} \ldots w_{q-n+1})} \quad (4)$$
- the probability of the particular word $w_q$, given the relevant document $d_q$, is assumed to be independent of the immediate context $w_{q-1} w_{q-2} \ldots w_{q-n+1}$ of the word $w_q$.
- the probability of the particular word $w_q$ and the document $d_q$, given the immediate context $w_{q-1} w_{q-2} \ldots w_{q-n+1}$,
- can then be computed as a product of the probability of the particular word $w_q$, given the document $d_q$, and the probability of the document $d_q$, given the immediate context $w_{q-1} w_{q-2} \ldots w_{q-n+1}$. Therefore, the numerator of equation (4) can be expanded as:
- $$\Pr(w_q, d_q \mid w_{q-1} w_{q-2} \ldots w_{q-n+1}) = \Pr(w_q \mid d_q)\,\Pr(d_q \mid w_{q-1} w_{q-2} \ldots w_{q-n+1}) \quad (5)$$
- the probability of the relevant document $d_q$ for the particular word $w_q$, given the immediate context, is further assumed to be equal to the probability of the word $w_q$, given its immediate context $w_{q-1} w_{q-2} \ldots w_{q-n+1}$.
- Such an assumption effectively subordinates the latent semantic model to the n-gram paradigm. In other words, the assumption is that, on the basis of just the immediate context $w_{q-1} w_{q-2} \ldots w_{q-n+1}$ …
- equation (5) can be simplified as:
- $$\Pr(w_q, d_q \mid w_{q-1} w_{q-2} \ldots w_{q-n+1}) = \Pr(w_q \mid d_q)\,\Pr(w_q \mid w_{q-1} w_{q-2} \ldots w_{q-n+1}) \quad (6)$$
- a composite conditional probability is computed by dividing the product of the probability of the particular word $w_q$, given the document $d_q$, and the probability of the particular word $w_q$, given the immediate context $w_{q-1} w_{q-2} \ldots w_{q-n+1}$, by a summation which includes, for every individual word $w_i$ in the system vocabulary $V$, a product of the probability of the individual word $w_i$, given the document $d_q$, and the probability of the individual word $w_i$, given the immediate context $w_{q-1} w_{q-2} \ldots w_{q-n+1}$ of the particular word $w_q$.
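The composite probability described above is a product of the n-gram and latent semantic terms, renormalized over the vocabulary. A minimal sketch follows, assuming both component probabilities are already available as dictionaries keyed by word; the function name `integrated_prob`, the three-word vocabulary, and the numeric values are hypothetical.

```python
def integrated_prob(word, ngram_probs, lsa_probs):
    """Product of Pr(w | local context) and Pr(w | document), normalized over the vocabulary."""
    numerator = ngram_probs[word] * lsa_probs[word]
    denominator = sum(ngram_probs[w] * lsa_probs[w] for w in ngram_probs)
    return numerator / denominator

# Hypothetical component probabilities for a three-word vocabulary.
ngram_probs = {"stock": 0.5, "stalk": 0.4, "sock": 0.1}   # Pr(w_i | immediate context)
lsa_probs = {"stock": 0.6, "stalk": 0.1, "sock": 0.3}     # Pr(w_i | d_q)

hybrid = {w: integrated_prob(w, ngram_probs, lsa_probs) for w in ngram_probs}
```

In this toy case the n-gram term alone barely separates "stock" from "stalk", while the document-level term strongly favors "stock", so the combined score sharpens the preference. Note that the denominator runs over every vocabulary word, which is why the cost of each latent semantic term matters for a single-pass search.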
- the composite conditional probability is then used as an integrated paradigm to achieve single-pass recognition in an embodiment of the present invention. Consequently, integrated a priori probabilities are computed using the composite conditional probability formula, and the integrated a priori probabilities are used to search for and prioritize candidate linguistic messages.
- each element of the integrated paradigm can be computed in a straightforward manner according to the n-gram and latent semantic paradigms described above.
- the integrated paradigm of the exemplary embodiment is easily implemented using available resources.
- the exemplary integrated paradigm can be interpreted in the context of Bayesian estimation. Therefore, if the conditional probability Pr(w q
- rescoring N-best lists with the integrated models of an embodiment of the present invention significantly improves recognition accuracy.
- an embodiment of the present invention reduces the computational cost so that LSA language modeling can be included inside the search.
- M (the vocabulary size) and N (the number of training documents) are on the order of ten thousand and one hundred thousand, respectively; the training corpus T might comprise a couple hundred million words, but the embodiment is not so limited.
- the LSA approach defines a mapping between the sets V and T and a vector space S, whereby each word $w_i$ in V and each document $d_j$ in T is represented by a vector in S. This mapping follows from the singular value decomposition (SVD) of the matrix of co-occurrences between words and documents.
- the (M × N) word-document matrix of an embodiment is denoted as W, and its order-R SVD is written $W \approx U S V^T$, where:
- U is the (M × R) matrix of left singular vectors $u_i$ (1 ≤ i ≤ M)
- S is the (R × R) diagonal matrix of singular values
- V is the (N × R) matrix of right singular vectors $v_j$ (1 ≤ j ≤ N)
- R ≪ M (≪ N) is the order of the decomposition
- T denotes matrix transposition.
- the ith left singular vector u i can be viewed as the representation of the ith word w i in a vector space of dimension R.
- the jth right singular vector v j can be viewed as the representation of the jth document d j in the same vector space of dimension R.
- the space S sought is the space spanned by the left and right singular vectors.
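- the LSA language model probability is written $\Pr(w_q \mid H_{q-1}^{(lsa)}) = \Pr(w_q \mid \tilde{d}_{q-1})$.
- $w_q$ is the current word and $H_{q-1}^{(lsa)}$ is the associated history for this word, i.e., the current document so far (also referred to as the current pseudo-document), $\tilde{d}_{q-1}$.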
- This is done in three steps comprising, but not limited to: (i) construct sparse representations $w_q$ and $\tilde{d}_{q-1}$ for the current word and pseudo-document, (ii) use equations 9 and 10 to map these quantities to vectors $u_q$ and $\tilde{v}_{q-1}$ in the space S, and (iii) use a suitable measure in S to evaluate the closeness between $u_q$ and $\tilde{v}_{q-1}$.
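The three steps can be sketched as follows. Equations 9 and 10 are not reproduced in this text, so the sketch assumes the standard LSA fold-in $\tilde{v} = \tilde{d}^T U S^{-1}$ for the pseudo-document, takes row i of U as the word vector, and uses a cosine between vectors scaled by $S^{1/2}$ as the closeness measure. The matrices `U` and `S`, the unweighted relative-frequency representation, and all function names are hypothetical choices for illustration.

```python
import math

# Hypothetical LSA factors for a 4-word vocabulary (M = 4) and R = 2.
U = [[0.5, 0.1],
     [0.4, -0.2],
     [0.1, 0.8],
     [0.2, 0.6]]   # row i is the word vector u_i
S = [2.0, 1.0]     # singular values (diagonal of S)

def sparse_pseudo_document(word_ids, vocab_size):
    """Step (i): length-M vector of relative frequencies of the words seen so far."""
    d = [0.0] * vocab_size
    for i in word_ids:
        d[i] += 1.0 / len(word_ids)
    return d

def map_document(d):
    """Step (ii): v = d^T U S^{-1}; touches all M rows of U, so it costs O(M R)."""
    return [sum(d[i] * U[i][r] for i in range(len(U))) / S[r] for r in range(len(S))]

def closeness(u, v):
    """Step (iii): cosine of the angle between u S^{1/2} and v S^{1/2}."""
    a = [x * math.sqrt(s) for x, s in zip(u, S)]
    b = [x * math.sqrt(s) for x, s in zip(v, S)]
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

history = [0, 1, 0]                                   # word indices seen so far
v = map_document(sparse_pseudo_document(history, len(U)))
score_word0 = closeness(U[0], v)                      # a word already in the history
score_word2 = closeness(U[2], v)                      # a word pointing in a different direction
```

With these made-up factors, word 0 scores much closer to the pseudo-document than word 2. Repeating `map_document` for every candidate word extension is the cost that the fast update is intended to avoid.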
- the mapping of equation 2 is pre-computed as part of the SVD decomposition.
- the mapping of equation 3 requires O(MR) floating point operations each time it is invoked.
- An embodiment of the present invention uses a fast update implementation that exploits the sequential nature of pseudo-documents for a fast update algorithm.
- the document context remains largely unchanged, with only the most recent candidate word added. Taking advantage of this fact allows the new pseudo-document vector to be expressed directly in terms of the old pseudo-document vector, instead of each time re-computing the mapping from scratch.
- the word-document matrix W is a matrix of elements $w_{i,j}$, where $w_{i,j}$ represents the weighted count of word $w_i$ in document $d_j$.
- equation 17 requires only O(R) floating point operations.
- the pseudo-document vector can be updated directly in the LSA space at a fraction of the computational cost typically required to map the sparse representation to the space S.
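Equation 17 is not reproduced in this text, so the following is only a sketch of how such an update might look under stated assumptions: each occurrence of word i contributes a fixed weight, the pseudo-document is normalized by its length n, and the mapping into S is the linear fold-in $\tilde{v} = \tilde{d}^T U S^{-1}$. Under those assumptions the new vector is a rescaled copy of the old one plus a single scaled word vector. The names `full_mapping`, `fast_update`, and `WEIGHT`, and all numeric values, are hypothetical.

```python
# Hypothetical LSA factors: 4-word vocabulary (M = 4), R = 2.
U = [[0.5, 0.1], [0.4, -0.2], [0.1, 0.8], [0.2, 0.6]]   # row i is the word vector u_i
S = [2.0, 1.0]                                           # singular values
WEIGHT = [0.9, 0.7, 0.8, 0.6]                            # assumed fixed per-word weights

def full_mapping(word_ids):
    """Rebuild the sparse pseudo-document and map it from scratch: O(M R) work."""
    n = len(word_ids)
    d = [0.0] * len(U)
    for i in word_ids:
        d[i] += WEIGHT[i] / n
    return [sum(d[i] * U[i][r] for i in range(len(U))) / S[r] for r in range(len(S))]

def fast_update(v_old, n_old, new_word):
    """Update the old vector directly in the LSA space: O(R) work."""
    n_new = n_old + 1
    return [(n_old * v_old[r] + WEIGHT[new_word] * U[new_word][r] / S[r]) / n_new
            for r in range(len(S))]

history = [0, 1, 0]
v_old = full_mapping(history)
v_fast = fast_update(v_old, len(history), new_word=2)   # incremental update
v_full = full_mapping(history + [2])                    # recomputed from scratch
```

Because the assumed mapping is linear, `v_fast` and `v_full` agree up to floating point rounding, while `fast_update` touches only R numbers instead of all M rows of U. This matches the O(MR) versus O(R) operation counts stated in the text.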
- FIG. 5 is a flowchart for speech recognition using a fast update implementation for efficient latent semantic language modeling of an embodiment of the present invention. Operation begins at step 502 , at which speech or acoustic signals are received. Features are extracted from the speech signals, at step 504 , and an acoustic vector sequence is produced from the received speech signals by a mapping from words and documents of the received speech signals. The speech signals are processed directly using a language model produced by integrating a latent semantic analysis into an n-gram probability. The hybrid n-gram plus latent semantic analysis language model probability is computed, at step 506 , using a first pseudo-document vector expressed in terms of a second pseudo-document vector.
- Expressing the first pseudo-document vector in terms of the second pseudo-document vector comprises updating the second pseudo-document vector directly in latent semantic analysis space in order to produce the first pseudo-document vector in response to at least one addition of a candidate word of the received speech signals. Updating precludes mapping the sparse representations for a current word and pseudo-document to vectors for a current word and pseudo-document for each addition of a candidate word of the received speech signals, wherein a number of computations of the processing are reduced by a value approximately equal to a vocabulary size.
- Computation of the probability is accomplished by constructing sparse representations for a current word and pseudo-document, mapping the sparse representations for a current word and pseudo-document to vectors for a current word and pseudo-document, and evaluating the closeness between the vectors for a current word and pseudo-document.
- a linguistic message representative of the received speech signals is generated, at step 508 .
Abstract
Description
Claims (24)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US09/267,334 US6374217B1 (en) | 1999-03-12 | 1999-03-12 | Fast update implementation for efficient latent semantic language modeling |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US09/267,334 US6374217B1 (en) | 1999-03-12 | 1999-03-12 | Fast update implementation for efficient latent semantic language modeling |
Publications (1)
Publication Number | Publication Date |
---|---|
US6374217B1 (en) | 2002-04-16 |
Family
ID=23018362
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US09/267,334 Expired - Lifetime US6374217B1 (en) | 1999-03-12 | 1999-03-12 | Fast update implementation for efficient latent semantic language modeling |
Country Status (1)
Country | Link |
---|---|
US (1) | US6374217B1 (en) |
Cited By (149)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20010014859A1 (en) * | 1999-12-27 | 2001-08-16 | International Business Machines Corporation | Method, apparatus, computer system and storage medium for speech recongnition |
US20020038207A1 (en) * | 2000-07-11 | 2002-03-28 | Ibm Corporation | Systems and methods for word prediction and speech recognition |
US20020042711A1 (en) * | 2000-08-11 | 2002-04-11 | Yi-Chung Lin | Method for probabilistic error-tolerant natural language understanding |
US20020111803A1 (en) * | 2000-12-20 | 2002-08-15 | International Business Machines Corporation | Method and system for semantic speech recognition |
US20030105632A1 (en) * | 2000-05-23 | 2003-06-05 | Huitouze Serge Le | Syntactic and semantic analysis of voice commands |
US6651059B1 (en) * | 1999-11-15 | 2003-11-18 | International Business Machines Corporation | System and method for the automatic recognition of relevant terms by mining link annotations |
US6772120B1 (en) * | 2000-11-21 | 2004-08-03 | Hewlett-Packard Development Company, L.P. | Computer method and apparatus for segmenting text streams |
US20040199389A1 (en) * | 2001-08-13 | 2004-10-07 | Hans Geiger | Method and device for recognising a phonetic sound sequence or character sequence |
US20060230036A1 (en) * | 2005-03-31 | 2006-10-12 | Kei Tateno | Information processing apparatus, information processing method and program |
US7124081B1 (en) * | 2001-09-28 | 2006-10-17 | Apple Computer, Inc. | Method and apparatus for speech recognition using latent semantic adaptation |
US7177871B1 (en) * | 2000-06-19 | 2007-02-13 | Petrus Wilhelmus Maria Desain | Method and system for providing communication via a network |
US20070112755A1 (en) * | 2005-11-15 | 2007-05-17 | Thompson Kevin B | Information exploration systems and method |
US20070219793A1 (en) * | 2006-03-14 | 2007-09-20 | Microsoft Corporation | Shareable filler model for grammar authoring |
US20070239431A1 (en) * | 2006-03-30 | 2007-10-11 | Microsoft Corporation | Scalable probabilistic latent semantic analysis |
US20080091430A1 (en) * | 2003-05-14 | 2008-04-17 | Bellegarda Jerome R | Method and apparatus for predicting word prominence in speech synthesis |
US20080228928A1 (en) * | 2007-03-15 | 2008-09-18 | Giovanni Donelli | Multimedia content filtering |
US20080306742A1 (en) * | 2006-08-14 | 2008-12-11 | International Business Machines Corporation | Apparatus, method, and program for supporting speech interface design |
US20090099841A1 (en) * | 2007-10-04 | 2009-04-16 | Kubushiki Kaisha Toshiba | Automatic speech recognition method and apparatus |
GB2463909A (en) * | 2008-09-29 | 2010-03-31 | Toshiba Res Europ Ltd | Speech recognition utilising determination of a back-off group of code words in using an acoustic model. |
GB2463908A (en) * | 2008-09-29 | 2010-03-31 | Toshiba Res Europ Ltd | Speech recognition utilising a hybrid combination of probabilities output from a language model and an acoustic model. |
US8103505B1 (en) | 2003-11-19 | 2012-01-24 | Apple Inc. | Method and apparatus for speech synthesis using paralinguistic variation |
CN103345923A (en) * | 2013-07-26 | 2013-10-09 | 电子科技大学 | Sparse representation based short-voice speaker recognition method |
US8682660B1 (en) * | 2008-05-21 | 2014-03-25 | Resolvity, Inc. | Method and system for post-processing speech recognition results |
US8713021B2 (en) | 2010-07-07 | 2014-04-29 | Apple Inc. | Unsupervised document clustering using latent semantic density analysis |
US8868409B1 (en) * | 2014-01-16 | 2014-10-21 | Google Inc. | Evaluating transcriptions with a semantic parser |
US8892446B2 (en) | 2010-01-18 | 2014-11-18 | Apple Inc. | Service orchestration for intelligent automated assistant |
US9026431B1 (en) * | 2013-07-30 | 2015-05-05 | Google Inc. | Semantic parsing with multiple parsers |
US20150278194A1 (en) * | 2012-11-07 | 2015-10-01 | Nec Corporation | Information processing device, information processing method and medium |
US9251139B2 (en) * | 2014-04-08 | 2016-02-02 | TitleFlow LLC | Natural language processing for extracting conveyance graphs |
US9262612B2 (en) | 2011-03-21 | 2016-02-16 | Apple Inc. | Device access using voice authentication |
US9300784B2 (en) | 2013-06-13 | 2016-03-29 | Apple Inc. | System and method for emergency calls initiated by voice command |
US9330720B2 (en) | 2008-01-03 | 2016-05-03 | Apple Inc. | Methods and apparatus for altering audio output signals |
US9338493B2 (en) | 2014-06-30 | 2016-05-10 | Apple Inc. | Intelligent automated assistant for TV user interactions |
US9368114B2 (en) | 2013-03-14 | 2016-06-14 | Apple Inc. | Context-sensitive handling of interruptions |
US9430463B2 (en) | 2014-05-30 | 2016-08-30 | Apple Inc. | Exemplar-based natural language processing |
US9483461B2 (en) | 2012-03-06 | 2016-11-01 | Apple Inc. | Handling speech synthesis of content for multiple languages |
US9495129B2 (en) | 2012-06-29 | 2016-11-15 | Apple Inc. | Device, method, and user interface for voice-activated navigation and browsing of a document |
US9502031B2 (en) | 2014-05-27 | 2016-11-22 | Apple Inc. | Method for supporting dynamic grammars in WFST-based ASR |
US9535906B2 (en) | 2008-07-31 | 2017-01-03 | Apple Inc. | Mobile device having human language translation capability with positional feedback |
US9576574B2 (en) | 2012-09-10 | 2017-02-21 | Apple Inc. | Context-sensitive handling of interruptions by intelligent digital assistant |
US9582608B2 (en) | 2013-06-07 | 2017-02-28 | Apple Inc. | Unified ranking with entropy-weighted information for phrase-based semantic auto-completion |
US9606986B2 (en) | 2014-09-29 | 2017-03-28 | Apple Inc. | Integrated word N-gram and class M-gram language models |
US9620104B2 (en) | 2013-06-07 | 2017-04-11 | Apple Inc. | System and method for user-specified pronunciation of words for speech synthesis and recognition |
US9620105B2 (en) | 2014-05-15 | 2017-04-11 | Apple Inc. | Analyzing audio input for efficient speech and music recognition |
US9626955B2 (en) | 2008-04-05 | 2017-04-18 | Apple Inc. | Intelligent text-to-speech conversion |
US9633660B2 (en) | 2010-02-25 | 2017-04-25 | Apple Inc. | User profiling for voice input processing |
US9633674B2 (en) | 2013-06-07 | 2017-04-25 | Apple Inc. | System and method for detecting errors in interactions with a voice-based digital assistant |
US9633004B2 (en) | 2014-05-30 | 2017-04-25 | Apple Inc. | Better resolution when referencing to concepts |
US9646614B2 (en) | 2000-03-16 | 2017-05-09 | Apple Inc. | Fast, language-independent method for user authentication by voice |
US9646609B2 (en) | 2014-09-30 | 2017-05-09 | Apple Inc. | Caching apparatus for serving phonetic pronunciations |
US20170149964A1 (en) * | 2004-05-03 | 2017-05-25 | Somatek | System and method for providing particularized audible alerts |
US9668121B2 (en) | 2014-09-30 | 2017-05-30 | Apple Inc. | Social reminders |
US9697820B2 (en) | 2015-09-24 | 2017-07-04 | Apple Inc. | Unit-selection text-to-speech synthesis using concatenation-sensitive neural networks |
US9697822B1 (en) | 2013-03-15 | 2017-07-04 | Apple Inc. | System and method for updating an adaptive speech recognition model |
US20170200066A1 (en) * | 2016-01-13 | 2017-07-13 | Adobe Systems Incorporated | Semantic Natural Language Vector Space |
US20170200065A1 (en) * | 2016-01-13 | 2017-07-13 | Adobe Systems Incorporated | Image Captioning with Weak Supervision |
US9711141B2 (en) | 2014-12-09 | 2017-07-18 | Apple Inc. | Disambiguating heteronyms in speech synthesis |
US9715875B2 (en) | 2014-05-30 | 2017-07-25 | Apple Inc. | Reducing the need for manual start/end-pointing and trigger phrases |
US9721566B2 (en) | 2015-03-08 | 2017-08-01 | Apple Inc. | Competing devices responding to voice triggers |
US9734193B2 (en) | 2014-05-30 | 2017-08-15 | Apple Inc. | Determining domain salience ranking from ambiguous words in natural speech |
US9760559B2 (en) | 2014-05-30 | 2017-09-12 | Apple Inc. | Predictive text input |
US9785630B2 (en) | 2014-05-30 | 2017-10-10 | Apple Inc. | Text prediction using combined word N-gram and unigram language models |
US9798393B2 (en) | 2011-08-29 | 2017-10-24 | Apple Inc. | Text correction processing |
US9818400B2 (en) | 2014-09-11 | 2017-11-14 | Apple Inc. | Method and apparatus for discovering trending terms in speech requests |
US9842105B2 (en) | 2015-04-16 | 2017-12-12 | Apple Inc. | Parsimonious continuous-space phrase representations for natural language processing |
US9842101B2 (en) | 2014-05-30 | 2017-12-12 | Apple Inc. | Predictive conversion of language input |
US9858925B2 (en) | 2009-06-05 | 2018-01-02 | Apple Inc. | Using context information to facilitate processing of commands in a virtual assistant |
US9865280B2 (en) | 2015-03-06 | 2018-01-09 | Apple Inc. | Structured dictation using intelligent automated assistants |
US9886953B2 (en) | 2015-03-08 | 2018-02-06 | Apple Inc. | Virtual assistant activation |
US9886432B2 (en) | 2014-09-30 | 2018-02-06 | Apple Inc. | Parsimonious handling of word inflection via categorical stem + suffix N-gram language models |
US9899019B2 (en) | 2015-03-18 | 2018-02-20 | Apple Inc. | Systems and methods for structured stem and suffix language models |
US9922642B2 (en) | 2013-03-15 | 2018-03-20 | Apple Inc. | Training an at least partial voice command system |
US9934775B2 (en) | 2016-05-26 | 2018-04-03 | Apple Inc. | Unit-selection text-to-speech synthesis based on predicted concatenation parameters |
US9953088B2 (en) | 2012-05-14 | 2018-04-24 | Apple Inc. | Crowd sourcing information to fulfill user requests |
US9959870B2 (en) | 2008-12-11 | 2018-05-01 | Apple Inc. | Speech recognition involving a mobile device |
US9966065B2 (en) | 2014-05-30 | 2018-05-08 | Apple Inc. | Multi-command single utterance input method |
US9966068B2 (en) | 2013-06-08 | 2018-05-08 | Apple Inc. | Interpreting and acting upon commands that involve sharing information with remote devices |
US9972304B2 (en) | 2016-06-03 | 2018-05-15 | Apple Inc. | Privacy preserving distributed evaluation framework for embedded personalized systems |
US9971774B2 (en) | 2012-09-19 | 2018-05-15 | Apple Inc. | Voice-based media searching |
US10043516B2 (en) | 2016-09-23 | 2018-08-07 | Apple Inc. | Intelligent automated assistant |
US10049663B2 (en) | 2016-06-08 | 2018-08-14 | Apple, Inc. | Intelligent automated assistant for media exploration |
US10049668B2 (en) | 2015-12-02 | 2018-08-14 | Apple Inc. | Applying neural network language models to weighted finite state transducers for automatic speech recognition |
US10057736B2 (en) | 2011-06-03 | 2018-08-21 | Apple Inc. | Active transport based notifications |
US10067938B2 (en) | 2016-06-10 | 2018-09-04 | Apple Inc. | Multilingual word prediction |
US10074360B2 (en) | 2014-09-30 | 2018-09-11 | Apple Inc. | Providing an indication of the suitability of speech recognition |
US10079014B2 (en) | 2012-06-08 | 2018-09-18 | Apple Inc. | Name recognition system |
US10078631B2 (en) | 2014-05-30 | 2018-09-18 | Apple Inc. | Entropy-guided text prediction using combined word and character n-gram language models |
US10083688B2 (en) | 2015-05-27 | 2018-09-25 | Apple Inc. | Device voice control for selecting a displayed affordance |
US10089072B2 (en) | 2016-06-11 | 2018-10-02 | Apple Inc. | Intelligent device arbitration and control |
US10101822B2 (en) | 2015-06-05 | 2018-10-16 | Apple Inc. | Language input correction |
US10127911B2 (en) | 2014-09-30 | 2018-11-13 | Apple Inc. | Speaker identification and unsupervised speaker adaptation techniques |
US10127220B2 (en) | 2015-06-04 | 2018-11-13 | Apple Inc. | Language identification from short strings |
US10134385B2 (en) | 2012-03-02 | 2018-11-20 | Apple Inc. | Systems and methods for name pronunciation |
US10170123B2 (en) | 2014-05-30 | 2019-01-01 | Apple Inc. | Intelligent assistant for home automation |
US10176167B2 (en) | 2013-06-09 | 2019-01-08 | Apple Inc. | System and method for inferring user intent from speech inputs |
US10186254B2 (en) | 2015-06-07 | 2019-01-22 | Apple Inc. | Context-based endpoint detection |
US10185542B2 (en) | 2013-06-09 | 2019-01-22 | Apple Inc. | Device, method, and graphical user interface for enabling conversation persistence across two or more instances of a digital assistant |
US10192552B2 (en) | 2016-06-10 | 2019-01-29 | Apple Inc. | Digital assistant providing whispered speech |
US10199051B2 (en) | 2013-02-07 | 2019-02-05 | Apple Inc. | Voice trigger for a digital assistant |
US10223066B2 (en) | 2015-12-23 | 2019-03-05 | Apple Inc. | Proactive assistance based on dialog communication between devices |
US10241752B2 (en) | 2011-09-30 | 2019-03-26 | Apple Inc. | Interface for a virtual digital assistant |
US10241644B2 (en) | 2011-06-03 | 2019-03-26 | Apple Inc. | Actionable reminder entries |
US10249300B2 (en) | 2016-06-06 | 2019-04-02 | Apple Inc. | Intelligent list reading |
US10255907B2 (en) | 2015-06-07 | 2019-04-09 | Apple Inc. | Automatic accent detection using acoustic models |
US10269345B2 (en) | 2016-06-11 | 2019-04-23 | Apple Inc. | Intelligent task discovery |
US10276170B2 (en) | 2010-01-18 | 2019-04-30 | Apple Inc. | Intelligent automated assistant |
US10283110B2 (en) | 2009-07-02 | 2019-05-07 | Apple Inc. | Methods and apparatuses for automatic speech recognition |
US10289433B2 (en) | 2014-05-30 | 2019-05-14 | Apple Inc. | Domain specific language for encoding assistant dialog |
US10297253B2 (en) | 2016-06-11 | 2019-05-21 | Apple Inc. | Application integration with a digital assistant |
US10318871B2 (en) | 2005-09-08 | 2019-06-11 | Apple Inc. | Method and apparatus for building an intelligent automated assistant |
US10354011B2 (en) | 2016-06-09 | 2019-07-16 | Apple Inc. | Intelligent automated assistant in a home environment |
US10356243B2 (en) | 2015-06-05 | 2019-07-16 | Apple Inc. | Virtual assistant aided communication with 3rd party service in a communication session |
US10366158B2 (en) | 2015-09-29 | 2019-07-30 | Apple Inc. | Efficient word encoding for recurrent neural network language models |
US10410637B2 (en) | 2017-05-12 | 2019-09-10 | Apple Inc. | User-specific acoustic models |
US10446143B2 (en) | 2016-03-14 | 2019-10-15 | Apple Inc. | Identification of voice inputs providing credentials |
US10446141B2 (en) | 2014-08-28 | 2019-10-15 | Apple Inc. | Automatic speech recognition based on user feedback |
US10482874B2 (en) | 2017-05-15 | 2019-11-19 | Apple Inc. | Hierarchical belief states for digital assistants |
US10490187B2 (en) | 2016-06-10 | 2019-11-26 | Apple Inc. | Digital assistant providing automated status report |
CN110517693A (en) * | 2019-08-01 | 2019-11-29 | 出门问问(苏州)信息科技有限公司 | Audio recognition method, device, electronic equipment and computer readable storage medium |
US10496753B2 (en) | 2010-01-18 | 2019-12-03 | Apple Inc. | Automatically adapting user interfaces for hands-free interaction |
US10509862B2 (en) | 2016-06-10 | 2019-12-17 | Apple Inc. | Dynamic phrase expansion of language input |
US10521466B2 (en) | 2016-06-11 | 2019-12-31 | Apple Inc. | Data driven natural language event detection and classification |
US10529322B2 (en) * | 2017-06-15 | 2020-01-07 | Google Llc | Semantic model for tagging of word lattices |
US10540989B2 (en) | 2005-08-03 | 2020-01-21 | Somatek | Somatic, auditory and cochlear communication system and method |
US10553209B2 (en) | 2010-01-18 | 2020-02-04 | Apple Inc. | Systems and methods for hands-free notification summaries |
US10552013B2 (en) | 2014-12-02 | 2020-02-04 | Apple Inc. | Data detection |
US10568032B2 (en) | 2007-04-03 | 2020-02-18 | Apple Inc. | Method and system for operating a multi-function portable electronic device using voice-activation |
US10567477B2 (en) | 2015-03-08 | 2020-02-18 | Apple Inc. | Virtual assistant continuity |
US10592095B2 (en) | 2014-05-23 | 2020-03-17 | Apple Inc. | Instantaneous speaking of content on touch devices |
US10593346B2 (en) | 2016-12-22 | 2020-03-17 | Apple Inc. | Rank-reduced token representation for automatic speech recognition |
US10659851B2 (en) | 2014-06-30 | 2020-05-19 | Apple Inc. | Real-time digital assistant knowledge updates |
US10671428B2 (en) | 2015-09-08 | 2020-06-02 | Apple Inc. | Distributed personal assistant |
US10679605B2 (en) | 2010-01-18 | 2020-06-09 | Apple Inc. | Hands-free list-reading by intelligent automated assistant |
US10691473B2 (en) | 2015-11-06 | 2020-06-23 | Apple Inc. | Intelligent automated assistant in a messaging environment |
US10705794B2 (en) | 2010-01-18 | 2020-07-07 | Apple Inc. | Automatically adapting user interfaces for hands-free interaction |
US10706373B2 (en) | 2011-06-03 | 2020-07-07 | Apple Inc. | Performing actions associated with task items that represent tasks to perform |
US10733993B2 (en) | 2016-06-10 | 2020-08-04 | Apple Inc. | Intelligent digital assistant in a multi-tasking environment |
US10747498B2 (en) | 2015-09-08 | 2020-08-18 | Apple Inc. | Zero latency digital assistant |
US10755703B2 (en) | 2017-05-11 | 2020-08-25 | Apple Inc. | Offline personal assistant |
US10762293B2 (en) | 2010-12-22 | 2020-09-01 | Apple Inc. | Using parts-of-speech tagging and named entity recognition for spelling correction |
US10791176B2 (en) | 2017-05-12 | 2020-09-29 | Apple Inc. | Synchronization and task delegation of a digital assistant |
US10791216B2 (en) | 2013-08-06 | 2020-09-29 | Apple Inc. | Auto-activating smart responses based on activities from remote devices |
US10789041B2 (en) | 2014-09-12 | 2020-09-29 | Apple Inc. | Dynamic thresholds for always listening speech trigger |
US10810274B2 (en) | 2017-05-15 | 2020-10-20 | Apple Inc. | Optimizing dialogue policy decisions for digital assistants using implicit feedback |
US11010550B2 (en) | 2015-09-29 | 2021-05-18 | Apple Inc. | Unified language modeling framework for word prediction, auto-completion and auto-correction |
US11025565B2 (en) | 2015-06-07 | 2021-06-01 | Apple Inc. | Personalized prediction of responses for instant messaging |
CN113158643A (en) * | 2021-04-27 | 2021-07-23 | 广东外语外贸大学 | Novel text readability assessment method and system |
US11217255B2 (en) | 2017-05-16 | 2022-01-04 | Apple Inc. | Far-field extension for digital assistant services |
US11587559B2 (en) | 2015-09-30 | 2023-02-21 | Apple Inc. | Intelligent device identification |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5621859A (en) | 1994-01-19 | 1997-04-15 | Bbn Corporation | Single tree method for grammar directed, very large vocabulary speech recognizer |
US5712957A (en) | 1995-09-08 | 1998-01-27 | Carnegie Mellon University | Locating and correcting erroneously recognized portions of utterances by rescoring based on two n-best lists |
US5828999A (en) * | 1996-05-06 | 1998-10-27 | Apple Computer, Inc. | Method and system for deriving a large-span semantic language model for large-vocabulary recognition systems |
US5835893A (en) * | 1996-02-15 | 1998-11-10 | Atr Interpreting Telecommunications Research Labs | Class-based word clustering for speech recognition using a three-level balanced hierarchical similarity |
US5839106A (en) | 1996-12-17 | 1998-11-17 | Apple Computer, Inc. | Large-vocabulary speech recognition using an integrated syntactic and semantic statistical language model |
US6208971B1 (en) * | 1998-10-30 | 2001-03-27 | Apple Computer, Inc. | Method and apparatus for command recognition using data-driven semantic inference |
- 1999-03-12: US application US09/267,334, patent US6374217B1 (en), not active, Expired - Lifetime
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5621859A (en) | 1994-01-19 | 1997-04-15 | Bbn Corporation | Single tree method for grammar directed, very large vocabulary speech recognizer |
US5712957A (en) | 1995-09-08 | 1998-01-27 | Carnegie Mellon University | Locating and correcting erroneously recognized portions of utterances by rescoring based on two n-best lists |
US5835893A (en) * | 1996-02-15 | 1998-11-10 | Atr Interpreting Telecommunications Research Labs | Class-based word clustering for speech recognition using a three-level balanced hierarchical similarity |
US5828999A (en) * | 1996-05-06 | 1998-10-27 | Apple Computer, Inc. | Method and system for deriving a large-span semantic language model for large-vocabulary recognition systems |
US5839106A (en) | 1996-12-17 | 1998-11-17 | Apple Computer, Inc. | Large-vocabulary speech recognition using an integrated syntactic and semantic statistical language model |
US6208971B1 (en) * | 1998-10-30 | 2001-03-27 | Apple Computer, Inc. | Method and apparatus for command recognition using data-driven semantic inference |
Non-Patent Citations (5)
Title |
---|
Bahl, L. et al., "A Maximum Likelihood Approach to Continuous Speech Recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-5, No. 2, Mar. 1983, pp. 179-190. |
Bellegarda, J. et al., "A Novel Word Clustering Algorithm Based On Latent Semantic Analysis," IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings, May 7-10, 1996, Marriott Marquis Hotel, Atlanta, GA., USA. pp. 172-175. |
Bellegarda, J., "A Latent Semantic Analysis Framework For Large-Span Language Modeling," Proc. EuroSpeech '97 Rhodes, Greece, Sep. 1997, pp. 1451-1454. |
Bellegarda, J., "A Multispan Language Modeling Framework for Large Vocabulary Speech Recognition," IEEE Transactions on Speech and Audio Processing, vol. 6, No. 5, Sep. 1998, pp. 456-467. |
Katz, S., "Estimation of Probabilities from Sparse Data for the Language Model Component of a Speech Recognizer," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-35, No. 3, Mar. 1987, pp. 400-401. |
Cited By (213)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6651059B1 (en) * | 1999-11-15 | 2003-11-18 | International Business Machines Corporation | System and method for the automatic recognition of relevant terms by mining link annotations |
US20010014859A1 (en) * | 1999-12-27 | 2001-08-16 | International Business Machines Corporation | Method, apparatus, computer system and storage medium for speech recongnition |
US6917910B2 (en) * | 1999-12-27 | 2005-07-12 | International Business Machines Corporation | Method, apparatus, computer system and storage medium for speech recognition |
US9646614B2 (en) | 2000-03-16 | 2017-05-09 | Apple Inc. | Fast, language-independent method for user authentication by voice |
US20030105632A1 (en) * | 2000-05-23 | 2003-06-05 | Huitouze Serge Le | Syntactic and semantic analysis of voice commands |
US7177871B1 (en) * | 2000-06-19 | 2007-02-13 | Petrus Wilhelmus Maria Desain | Method and system for providing communication via a network |
US7359852B2 (en) * | 2000-07-11 | 2008-04-15 | International Business Machines Corporation | Systems and methods for natural spoken language word prediction and speech recognition |
US8150693B2 (en) | 2000-07-11 | 2012-04-03 | Nuance Communications, Inc. | Methods and apparatus for natural spoken language speech recognition |
US8000966B2 (en) | 2000-07-11 | 2011-08-16 | Nuance Communications, Inc. | Methods and apparatus for natural spoken language speech recognition with word prediction |
US20020038207A1 (en) * | 2000-07-11 | 2002-03-28 | Ibm Corporation | Systems and methods for word prediction and speech recognition |
US6920420B2 (en) | 2000-08-11 | 2005-07-19 | Industrial Technology Research Institute | Method for probabilistic error-tolerant natural language understanding |
US20020042711A1 (en) * | 2000-08-11 | 2002-04-11 | Yi-Chung Lin | Method for probabilistic error-tolerant natural language understanding |
US6772120B1 (en) * | 2000-11-21 | 2004-08-03 | Hewlett-Packard Development Company, L.P. | Computer method and apparatus for segmenting text streams |
US20020111803A1 (en) * | 2000-12-20 | 2002-08-15 | International Business Machines Corporation | Method and system for semantic speech recognition |
US6937983B2 (en) * | 2000-12-20 | 2005-08-30 | International Business Machines Corporation | Method and system for semantic speech recognition |
US20040199389A1 (en) * | 2001-08-13 | 2004-10-07 | Hans Geiger | Method and device for recognising a phonetic sound sequence or character sequence |
US7966177B2 (en) * | 2001-08-13 | 2011-06-21 | Hans Geiger | Method and device for recognising a phonetic sound sequence or character sequence |
US7124081B1 (en) * | 2001-09-28 | 2006-10-17 | Apple Computer, Inc. | Method and apparatus for speech recognition using latent semantic adaptation |
US20080091430A1 (en) * | 2003-05-14 | 2008-04-17 | Bellegarda Jerome R | Method and apparatus for predicting word prominence in speech synthesis |
US7778819B2 (en) | 2003-05-14 | 2010-08-17 | Apple Inc. | Method and apparatus for predicting word prominence in speech synthesis |
US8103505B1 (en) | 2003-11-19 | 2012-01-24 | Apple Inc. | Method and apparatus for speech synthesis using paralinguistic variation |
US10694030B2 (en) | 2004-05-03 | 2020-06-23 | Somatek | System and method for providing particularized audible alerts |
US20170149964A1 (en) * | 2004-05-03 | 2017-05-25 | Somatek | System and method for providing particularized audible alerts |
US10104226B2 (en) * | 2004-05-03 | 2018-10-16 | Somatek | System and method for providing particularized audible alerts |
US20060230036A1 (en) * | 2005-03-31 | 2006-10-12 | Kei Tateno | Information processing apparatus, information processing method and program |
US10540989B2 (en) | 2005-08-03 | 2020-01-21 | Somatek | Somatic, auditory and cochlear communication system and method |
US11878169B2 (en) | 2005-08-03 | 2024-01-23 | Somatek | Somatic, auditory and cochlear communication system and method |
US10318871B2 (en) | 2005-09-08 | 2019-06-11 | Apple Inc. | Method and apparatus for building an intelligent automated assistant |
US7676463B2 (en) * | 2005-11-15 | 2010-03-09 | Kroll Ontrack, Inc. | Information exploration systems and method |
US20070112755A1 (en) * | 2005-11-15 | 2007-05-17 | Thompson Kevin B | Information exploration systems and method |
US7865357B2 (en) * | 2006-03-14 | 2011-01-04 | Microsoft Corporation | Shareable filler model for grammar authoring |
US20070219793A1 (en) * | 2006-03-14 | 2007-09-20 | Microsoft Corporation | Shareable filler model for grammar authoring |
US7844449B2 (en) * | 2006-03-30 | 2010-11-30 | Microsoft Corporation | Scalable probabilistic latent semantic analysis |
US20070239431A1 (en) * | 2006-03-30 | 2007-10-11 | Microsoft Corporation | Scalable probabilistic latent semantic analysis |
US20080306742A1 (en) * | 2006-08-14 | 2008-12-11 | International Business Machines Corporation | Apparatus, method, and program for supporting speech interface design |
US7729921B2 (en) * | 2006-08-14 | 2010-06-01 | Nuance Communications, Inc. | Apparatus, method, and program for supporting speech interface design |
US8930191B2 (en) | 2006-09-08 | 2015-01-06 | Apple Inc. | Paraphrasing of user requests and results by automated digital assistant |
US9117447B2 (en) | 2006-09-08 | 2015-08-25 | Apple Inc. | Using event alert text as input to an automated assistant |
US8942986B2 (en) | 2006-09-08 | 2015-01-27 | Apple Inc. | Determining user intent based on ontologies of domains |
US20080228928A1 (en) * | 2007-03-15 | 2008-09-18 | Giovanni Donelli | Multimedia content filtering |
US8626930B2 (en) | 2007-03-15 | 2014-01-07 | Apple Inc. | Multimedia content filtering |
US10568032B2 (en) | 2007-04-03 | 2020-02-18 | Apple Inc. | Method and system for operating a multi-function portable electronic device using voice-activation |
US8311825B2 (en) * | 2007-10-04 | 2012-11-13 | Kabushiki Kaisha Toshiba | Automatic speech recognition method and apparatus |
US20090099841A1 (en) * | 2007-10-04 | 2009-04-16 | Kubushiki Kaisha Toshiba | Automatic speech recognition method and apparatus |
US9330720B2 (en) | 2008-01-03 | 2016-05-03 | Apple Inc. | Methods and apparatus for altering audio output signals |
US10381016B2 (en) | 2008-01-03 | 2019-08-13 | Apple Inc. | Methods and apparatus for altering audio output signals |
US9865248B2 (en) | 2008-04-05 | 2018-01-09 | Apple Inc. | Intelligent text-to-speech conversion |
US9626955B2 (en) | 2008-04-05 | 2017-04-18 | Apple Inc. | Intelligent text-to-speech conversion |
US8682660B1 (en) * | 2008-05-21 | 2014-03-25 | Resolvity, Inc. | Method and system for post-processing speech recognition results |
US10108612B2 (en) | 2008-07-31 | 2018-10-23 | Apple Inc. | Mobile device having human language translation capability with positional feedback |
US9535906B2 (en) | 2008-07-31 | 2017-01-03 | Apple Inc. | Mobile device having human language translation capability with positional feedback |
GB2463909A (en) * | 2008-09-29 | 2010-03-31 | Toshiba Res Europ Ltd | Speech recognition utilising determination of a back-off group of code words in using an acoustic model. |
GB2463908A (en) * | 2008-09-29 | 2010-03-31 | Toshiba Res Europ Ltd | Speech recognition utilising a hybrid combination of probabilities output from a language model and an acoustic model. |
GB2463908B (en) * | 2008-09-29 | 2011-02-16 | Toshiba Res Europ Ltd | Speech recognition apparatus and method |
GB2463909B (en) * | 2008-09-29 | 2010-08-11 | Toshiba Res Europ Ltd | Speech recognition apparatus and method |
US9959870B2 (en) | 2008-12-11 | 2018-05-01 | Apple Inc. | Speech recognition involving a mobile device |
US10475446B2 (en) | 2009-06-05 | 2019-11-12 | Apple Inc. | Using context information to facilitate processing of commands in a virtual assistant |
US9858925B2 (en) | 2009-06-05 | 2018-01-02 | Apple Inc. | Using context information to facilitate processing of commands in a virtual assistant |
US10795541B2 (en) | 2009-06-05 | 2020-10-06 | Apple Inc. | Intelligent organization of tasks items |
US11080012B2 (en) | 2009-06-05 | 2021-08-03 | Apple Inc. | Interface for a virtual digital assistant |
US10283110B2 (en) | 2009-07-02 | 2019-05-07 | Apple Inc. | Methods and apparatuses for automatic speech recognition |
US11423886B2 (en) | 2010-01-18 | 2022-08-23 | Apple Inc. | Task flow identification based on user intent |
US10553209B2 (en) | 2010-01-18 | 2020-02-04 | Apple Inc. | Systems and methods for hands-free notification summaries |
US10705794B2 (en) | 2010-01-18 | 2020-07-07 | Apple Inc. | Automatically adapting user interfaces for hands-free interaction |
US9548050B2 (en) | 2010-01-18 | 2017-01-17 | Apple Inc. | Intelligent automated assistant |
US8892446B2 (en) | 2010-01-18 | 2014-11-18 | Apple Inc. | Service orchestration for intelligent automated assistant |
US8903716B2 (en) | 2010-01-18 | 2014-12-02 | Apple Inc. | Personalized vocabulary for digital assistant |
US10276170B2 (en) | 2010-01-18 | 2019-04-30 | Apple Inc. | Intelligent automated assistant |
US10706841B2 (en) | 2010-01-18 | 2020-07-07 | Apple Inc. | Task flow identification based on user intent |
US10679605B2 (en) | 2010-01-18 | 2020-06-09 | Apple Inc. | Hands-free list-reading by intelligent automated assistant |
US9318108B2 (en) | 2010-01-18 | 2016-04-19 | Apple Inc. | Intelligent automated assistant |
US10496753B2 (en) | 2010-01-18 | 2019-12-03 | Apple Inc. | Automatically adapting user interfaces for hands-free interaction |
US9633660B2 (en) | 2010-02-25 | 2017-04-25 | Apple Inc. | User profiling for voice input processing |
US10049675B2 (en) | 2010-02-25 | 2018-08-14 | Apple Inc. | User profiling for voice input processing |
US8713021B2 (en) | 2010-07-07 | 2014-04-29 | Apple Inc. | Unsupervised document clustering using latent semantic density analysis |
US10762293B2 (en) | 2010-12-22 | 2020-09-01 | Apple Inc. | Using parts-of-speech tagging and named entity recognition for spelling correction |
US10102359B2 (en) | 2011-03-21 | 2018-10-16 | Apple Inc. | Device access using voice authentication |
US9262612B2 (en) | 2011-03-21 | 2016-02-16 | Apple Inc. | Device access using voice authentication |
US11120372B2 (en) | 2011-06-03 | 2021-09-14 | Apple Inc. | Performing actions associated with task items that represent tasks to perform |
US10706373B2 (en) | 2011-06-03 | 2020-07-07 | Apple Inc. | Performing actions associated with task items that represent tasks to perform |
US10241644B2 (en) | 2011-06-03 | 2019-03-26 | Apple Inc. | Actionable reminder entries |
US10057736B2 (en) | 2011-06-03 | 2018-08-21 | Apple Inc. | Active transport based notifications |
US9798393B2 (en) | 2011-08-29 | 2017-10-24 | Apple Inc. | Text correction processing |
US10241752B2 (en) | 2011-09-30 | 2019-03-26 | Apple Inc. | Interface for a virtual digital assistant |
US10134385B2 (en) | 2012-03-02 | 2018-11-20 | Apple Inc. | Systems and methods for name pronunciation |
US9483461B2 (en) | 2012-03-06 | 2016-11-01 | Apple Inc. | Handling speech synthesis of content for multiple languages |
US9953088B2 (en) | 2012-05-14 | 2018-04-24 | Apple Inc. | Crowd sourcing information to fulfill user requests |
US10079014B2 (en) | 2012-06-08 | 2018-09-18 | Apple Inc. | Name recognition system |
US9495129B2 (en) | 2012-06-29 | 2016-11-15 | Apple Inc. | Device, method, and user interface for voice-activated navigation and browsing of a document |
US9576574B2 (en) | 2012-09-10 | 2017-02-21 | Apple Inc. | Context-sensitive handling of interruptions by intelligent digital assistant |
US9971774B2 (en) | 2012-09-19 | 2018-05-15 | Apple Inc. | Voice-based media searching |
US20150278194A1 (en) * | 2012-11-07 | 2015-10-01 | Nec Corporation | Information processing device, information processing method and medium |
US10978090B2 (en) | 2013-02-07 | 2021-04-13 | Apple Inc. | Voice trigger for a digital assistant |
US10199051B2 (en) | 2013-02-07 | 2019-02-05 | Apple Inc. | Voice trigger for a digital assistant |
US9368114B2 (en) | 2013-03-14 | 2016-06-14 | Apple Inc. | Context-sensitive handling of interruptions |
US9922642B2 (en) | 2013-03-15 | 2018-03-20 | Apple Inc. | Training an at least partial voice command system |
US9697822B1 (en) | 2013-03-15 | 2017-07-04 | Apple Inc. | System and method for updating an adaptive speech recognition model |
US9966060B2 (en) | 2013-06-07 | 2018-05-08 | Apple Inc. | System and method for user-specified pronunciation of words for speech synthesis and recognition |
US9582608B2 (en) | 2013-06-07 | 2017-02-28 | Apple Inc. | Unified ranking with entropy-weighted information for phrase-based semantic auto-completion |
US9620104B2 (en) | 2013-06-07 | 2017-04-11 | Apple Inc. | System and method for user-specified pronunciation of words for speech synthesis and recognition |
US9633674B2 (en) | 2013-06-07 | 2017-04-25 | Apple Inc. | System and method for detecting errors in interactions with a voice-based digital assistant |
US9966068B2 (en) | 2013-06-08 | 2018-05-08 | Apple Inc. | Interpreting and acting upon commands that involve sharing information with remote devices |
US10657961B2 (en) | 2013-06-08 | 2020-05-19 | Apple Inc. | Interpreting and acting upon commands that involve sharing information with remote devices |
US10176167B2 (en) | 2013-06-09 | 2019-01-08 | Apple Inc. | System and method for inferring user intent from speech inputs |
US10185542B2 (en) | 2013-06-09 | 2019-01-22 | Apple Inc. | Device, method, and graphical user interface for enabling conversation persistence across two or more instances of a digital assistant |
US9300784B2 (en) | 2013-06-13 | 2016-03-29 | Apple Inc. | System and method for emergency calls initiated by voice command |
CN103345923A (en) * | 2013-07-26 | 2013-10-09 | 电子科技大学 | Sparse representation based short-voice speaker recognition method |
CN103345923B (en) * | 2013-07-26 | 2016-05-11 | 电子科技大学 | Short-voice speaker recognition method based on sparse representation |
US9026431B1 (en) * | 2013-07-30 | 2015-05-05 | Google Inc. | Semantic parsing with multiple parsers |
US10791216B2 (en) | 2013-08-06 | 2020-09-29 | Apple Inc. | Auto-activating smart responses based on activities from remote devices |
US8868409B1 (en) * | 2014-01-16 | 2014-10-21 | Google Inc. | Evaluating transcriptions with a semantic parser |
US20160117312A1 (en) * | 2014-04-08 | 2016-04-28 | TitleFlow LLC | Natural language processing for extracting conveyance graphs |
US9251139B2 (en) * | 2014-04-08 | 2016-02-02 | TitleFlow LLC | Natural language processing for extracting conveyance graphs |
US10521508B2 (en) * | 2014-04-08 | 2019-12-31 | TitleFlow LLC | Natural language processing for extracting conveyance graphs |
US9620105B2 (en) | 2014-05-15 | 2017-04-11 | Apple Inc. | Analyzing audio input for efficient speech and music recognition |
US10592095B2 (en) | 2014-05-23 | 2020-03-17 | Apple Inc. | Instantaneous speaking of content on touch devices |
US9502031B2 (en) | 2014-05-27 | 2016-11-22 | Apple Inc. | Method for supporting dynamic grammars in WFST-based ASR |
US9430463B2 (en) | 2014-05-30 | 2016-08-30 | Apple Inc. | Exemplar-based natural language processing |
US10170123B2 (en) | 2014-05-30 | 2019-01-01 | Apple Inc. | Intelligent assistant for home automation |
US11257504B2 (en) | 2014-05-30 | 2022-02-22 | Apple Inc. | Intelligent assistant for home automation |
US11133008B2 (en) | 2014-05-30 | 2021-09-28 | Apple Inc. | Reducing the need for manual start/end-pointing and trigger phrases |
US10289433B2 (en) | 2014-05-30 | 2019-05-14 | Apple Inc. | Domain specific language for encoding assistant dialog |
US9842101B2 (en) | 2014-05-30 | 2017-12-12 | Apple Inc. | Predictive conversion of language input |
US10078631B2 (en) | 2014-05-30 | 2018-09-18 | Apple Inc. | Entropy-guided text prediction using combined word and character n-gram language models |
US9966065B2 (en) | 2014-05-30 | 2018-05-08 | Apple Inc. | Multi-command single utterance input method |
US9785630B2 (en) | 2014-05-30 | 2017-10-10 | Apple Inc. | Text prediction using combined word N-gram and unigram language models |
US10083690B2 (en) | 2014-05-30 | 2018-09-25 | Apple Inc. | Better resolution when referencing to concepts |
US9734193B2 (en) | 2014-05-30 | 2017-08-15 | Apple Inc. | Determining domain salience ranking from ambiguous words in natural speech |
US9715875B2 (en) | 2014-05-30 | 2017-07-25 | Apple Inc. | Reducing the need for manual start/end-pointing and trigger phrases |
US10169329B2 (en) | 2014-05-30 | 2019-01-01 | Apple Inc. | Exemplar-based natural language processing |
US10497365B2 (en) | 2014-05-30 | 2019-12-03 | Apple Inc. | Multi-command single utterance input method |
US9633004B2 (en) | 2014-05-30 | 2017-04-25 | Apple Inc. | Better resolution when referencing to concepts |
US9760559B2 (en) | 2014-05-30 | 2017-09-12 | Apple Inc. | Predictive text input |
US9338493B2 (en) | 2014-06-30 | 2016-05-10 | Apple Inc. | Intelligent automated assistant for TV user interactions |
US9668024B2 (en) | 2014-06-30 | 2017-05-30 | Apple Inc. | Intelligent automated assistant for TV user interactions |
US10904611B2 (en) | 2014-06-30 | 2021-01-26 | Apple Inc. | Intelligent automated assistant for TV user interactions |
US10659851B2 (en) | 2014-06-30 | 2020-05-19 | Apple Inc. | Real-time digital assistant knowledge updates |
US10446141B2 (en) | 2014-08-28 | 2019-10-15 | Apple Inc. | Automatic speech recognition based on user feedback |
US10431204B2 (en) | 2014-09-11 | 2019-10-01 | Apple Inc. | Method and apparatus for discovering trending terms in speech requests |
US9818400B2 (en) | 2014-09-11 | 2017-11-14 | Apple Inc. | Method and apparatus for discovering trending terms in speech requests |
US10789041B2 (en) | 2014-09-12 | 2020-09-29 | Apple Inc. | Dynamic thresholds for always listening speech trigger |
US9606986B2 (en) | 2014-09-29 | 2017-03-28 | Apple Inc. | Integrated word N-gram and class M-gram language models |
US10074360B2 (en) | 2014-09-30 | 2018-09-11 | Apple Inc. | Providing an indication of the suitability of speech recognition |
US9668121B2 (en) | 2014-09-30 | 2017-05-30 | Apple Inc. | Social reminders |
US9646609B2 (en) | 2014-09-30 | 2017-05-09 | Apple Inc. | Caching apparatus for serving phonetic pronunciations |
US9886432B2 (en) | 2014-09-30 | 2018-02-06 | Apple Inc. | Parsimonious handling of word inflection via categorical stem + suffix N-gram language models |
US10127911B2 (en) | 2014-09-30 | 2018-11-13 | Apple Inc. | Speaker identification and unsupervised speaker adaptation techniques |
US9986419B2 (en) | 2014-09-30 | 2018-05-29 | Apple Inc. | Social reminders |
US10552013B2 (en) | 2014-12-02 | 2020-02-04 | Apple Inc. | Data detection |
US11556230B2 (en) | 2014-12-02 | 2023-01-17 | Apple Inc. | Data detection |
US9711141B2 (en) | 2014-12-09 | 2017-07-18 | Apple Inc. | Disambiguating heteronyms in speech synthesis |
US9865280B2 (en) | 2015-03-06 | 2018-01-09 | Apple Inc. | Structured dictation using intelligent automated assistants |
US9886953B2 (en) | 2015-03-08 | 2018-02-06 | Apple Inc. | Virtual assistant activation |
US9721566B2 (en) | 2015-03-08 | 2017-08-01 | Apple Inc. | Competing devices responding to voice triggers |
US10311871B2 (en) | 2015-03-08 | 2019-06-04 | Apple Inc. | Competing devices responding to voice triggers |
US10567477B2 (en) | 2015-03-08 | 2020-02-18 | Apple Inc. | Virtual assistant continuity |
US11087759B2 (en) | 2015-03-08 | 2021-08-10 | Apple Inc. | Virtual assistant activation |
US9899019B2 (en) | 2015-03-18 | 2018-02-20 | Apple Inc. | Systems and methods for structured stem and suffix language models |
US9842105B2 (en) | 2015-04-16 | 2017-12-12 | Apple Inc. | Parsimonious continuous-space phrase representations for natural language processing |
US10083688B2 (en) | 2015-05-27 | 2018-09-25 | Apple Inc. | Device voice control for selecting a displayed affordance |
US10127220B2 (en) | 2015-06-04 | 2018-11-13 | Apple Inc. | Language identification from short strings |
US10101822B2 (en) | 2015-06-05 | 2018-10-16 | Apple Inc. | Language input correction |
US10356243B2 (en) | 2015-06-05 | 2019-07-16 | Apple Inc. | Virtual assistant aided communication with 3rd party service in a communication session |
US10255907B2 (en) | 2015-06-07 | 2019-04-09 | Apple Inc. | Automatic accent detection using acoustic models |
US11025565B2 (en) | 2015-06-07 | 2021-06-01 | Apple Inc. | Personalized prediction of responses for instant messaging |
US10186254B2 (en) | 2015-06-07 | 2019-01-22 | Apple Inc. | Context-based endpoint detection |
US11500672B2 (en) | 2015-09-08 | 2022-11-15 | Apple Inc. | Distributed personal assistant |
US10671428B2 (en) | 2015-09-08 | 2020-06-02 | Apple Inc. | Distributed personal assistant |
US10747498B2 (en) | 2015-09-08 | 2020-08-18 | Apple Inc. | Zero latency digital assistant |
US9697820B2 (en) | 2015-09-24 | 2017-07-04 | Apple Inc. | Unit-selection text-to-speech synthesis using concatenation-sensitive neural networks |
US11010550B2 (en) | 2015-09-29 | 2021-05-18 | Apple Inc. | Unified language modeling framework for word prediction, auto-completion and auto-correction |
US10366158B2 (en) | 2015-09-29 | 2019-07-30 | Apple Inc. | Efficient word encoding for recurrent neural network language models |
US11587559B2 (en) | 2015-09-30 | 2023-02-21 | Apple Inc. | Intelligent device identification |
US11526368B2 (en) | 2015-11-06 | 2022-12-13 | Apple Inc. | Intelligent automated assistant in a messaging environment |
US10691473B2 (en) | 2015-11-06 | 2020-06-23 | Apple Inc. | Intelligent automated assistant in a messaging environment |
US10049668B2 (en) | 2015-12-02 | 2018-08-14 | Apple Inc. | Applying neural network language models to weighted finite state transducers for automatic speech recognition |
US10223066B2 (en) | 2015-12-23 | 2019-03-05 | Apple Inc. | Proactive assistance based on dialog communication between devices |
US20170200065A1 (en) * | 2016-01-13 | 2017-07-13 | Adobe Systems Incorporated | Image Captioning with Weak Supervision |
US20170200066A1 (en) * | 2016-01-13 | 2017-07-13 | Adobe Systems Incorporated | Semantic Natural Language Vector Space |
US9811765B2 (en) * | 2016-01-13 | 2017-11-07 | Adobe Systems Incorporated | Image captioning with weak supervision |
US9792534B2 (en) * | 2016-01-13 | 2017-10-17 | Adobe Systems Incorporated | Semantic natural language vector space |
US10446143B2 (en) | 2016-03-14 | 2019-10-15 | Apple Inc. | Identification of voice inputs providing credentials |
US9934775B2 (en) | 2016-05-26 | 2018-04-03 | Apple Inc. | Unit-selection text-to-speech synthesis based on predicted concatenation parameters |
US9972304B2 (en) | 2016-06-03 | 2018-05-15 | Apple Inc. | Privacy preserving distributed evaluation framework for embedded personalized systems |
US10249300B2 (en) | 2016-06-06 | 2019-04-02 | Apple Inc. | Intelligent list reading |
US11069347B2 (en) | 2016-06-08 | 2021-07-20 | Apple Inc. | Intelligent automated assistant for media exploration |
US10049663B2 (en) | 2016-06-08 | 2018-08-14 | Apple Inc. | Intelligent automated assistant for media exploration |
US10354011B2 (en) | 2016-06-09 | 2019-07-16 | Apple Inc. | Intelligent automated assistant in a home environment |
US11037565B2 (en) | 2016-06-10 | 2021-06-15 | Apple Inc. | Intelligent digital assistant in a multi-tasking environment |
US10192552B2 (en) | 2016-06-10 | 2019-01-29 | Apple Inc. | Digital assistant providing whispered speech |
US10733993B2 (en) | 2016-06-10 | 2020-08-04 | Apple Inc. | Intelligent digital assistant in a multi-tasking environment |
US10067938B2 (en) | 2016-06-10 | 2018-09-04 | Apple Inc. | Multilingual word prediction |
US10490187B2 (en) | 2016-06-10 | 2019-11-26 | Apple Inc. | Digital assistant providing automated status report |
US10509862B2 (en) | 2016-06-10 | 2019-12-17 | Apple Inc. | Dynamic phrase expansion of language input |
US10269345B2 (en) | 2016-06-11 | 2019-04-23 | Apple Inc. | Intelligent task discovery |
US10521466B2 (en) | 2016-06-11 | 2019-12-31 | Apple Inc. | Data driven natural language event detection and classification |
US10297253B2 (en) | 2016-06-11 | 2019-05-21 | Apple Inc. | Application integration with a digital assistant |
US11152002B2 (en) | 2016-06-11 | 2021-10-19 | Apple Inc. | Application integration with a digital assistant |
US10089072B2 (en) | 2016-06-11 | 2018-10-02 | Apple Inc. | Intelligent device arbitration and control |
US10043516B2 (en) | 2016-09-23 | 2018-08-07 | Apple Inc. | Intelligent automated assistant |
US10553215B2 (en) | 2016-09-23 | 2020-02-04 | Apple Inc. | Intelligent automated assistant |
US10593346B2 (en) | 2016-12-22 | 2020-03-17 | Apple Inc. | Rank-reduced token representation for automatic speech recognition |
US10755703B2 (en) | 2017-05-11 | 2020-08-25 | Apple Inc. | Offline personal assistant |
US11405466B2 (en) | 2017-05-12 | 2022-08-02 | Apple Inc. | Synchronization and task delegation of a digital assistant |
US10791176B2 (en) | 2017-05-12 | 2020-09-29 | Apple Inc. | Synchronization and task delegation of a digital assistant |
US10410637B2 (en) | 2017-05-12 | 2019-09-10 | Apple Inc. | User-specific acoustic models |
US10810274B2 (en) | 2017-05-15 | 2020-10-20 | Apple Inc. | Optimizing dialogue policy decisions for digital assistants using implicit feedback |
US10482874B2 (en) | 2017-05-15 | 2019-11-19 | Apple Inc. | Hierarchical belief states for digital assistants |
US11217255B2 (en) | 2017-05-16 | 2022-01-04 | Apple Inc. | Far-field extension for digital assistant services |
US10529322B2 (en) * | 2017-06-15 | 2020-01-07 | Google Llc | Semantic model for tagging of word lattices |
CN110517693B (en) * | 2019-08-01 | 2022-03-04 | 出门问问(苏州)信息科技有限公司 | Speech recognition method, speech recognition device, electronic equipment and computer-readable storage medium |
CN110517693A (en) * | 2019-08-01 | 2019-11-29 | 出门问问(苏州)信息科技有限公司 | Speech recognition method, device, electronic equipment and computer-readable storage medium |
CN113158643A (en) * | 2021-04-27 | 2021-07-23 | 广东外语外贸大学 | Novel text readability assessment method and system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US6374217B1 (en) | | Fast update implementation for efficient latent semantic language modeling |
US5839106A (en) | | Large-vocabulary speech recognition using an integrated syntactic and semantic statistical language model |
US6477488B1 (en) | | Method for dynamic context scope selection in hybrid n-gram+LSA language modeling |
CN108417210B (en) | | Word embedding language model training method, word recognition method and system |
US9412365B2 (en) | | Enhanced maximum entropy models |
Mangu et al. | | Finding consensus in speech recognition: word error minimization and other applications of confusion networks |
EP0570660B1 (en) | | Speech recognition system for natural language translation |
US8150693B2 (en) | | Methods and apparatus for natural spoken language speech recognition |
US20070179784A1 (en) | | Dynamic match lattice spotting for indexing speech content |
Heigold et al. | | Equivalence of generative and log-linear models |
US20030055640A1 (en) | | System and method for parameter estimation for pattern recognition |
US6314400B1 (en) | | Method of estimating probabilities of occurrence of speech vocabulary elements |
Kadyan et al. | | A comparative study of deep neural network based Punjabi-ASR system |
Chien et al. | | Joint acoustic and language modeling for speech recognition |
Yamamoto et al. | | Multi-class composite N-gram language model |
Erdogan et al. | | Using semantic analysis to improve speech recognition performance |
Fukada et al. | | Automatic generation of multiple pronunciations based on neural networks |
JP3961780B2 (en) | | Language model learning apparatus and speech recognition apparatus using the same |
Iyer et al. | | Transforming out-of-domain estimates to improve in-domain language models. |
US20010003174A1 (en) | | Method of generating a maximum entropy speech model |
JP3088364B2 (en) | | Spoken language understanding device and spoken language understanding system |
KR20040069060A (en) | | Method and apparatus for continous speech recognition using bi-directional n-gram language model |
Matsubara et al. | | Stochastic dependency parsing of spontaneous Japanese spoken language |
Naptali et al. | | Word co-occurrence matrix and context dependent class in lsa based language model for speech recognition |
JP3035239B2 (en) | | Speaker normalization device, speaker adaptation device, and speech recognition device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | AS | Assignment | Owner name: APPLE COMPUTER, INC., CALIFORNIA; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: BELLEGARDA, JEROME R.; REEL/FRAME: 009829/0081; Effective date: 19990311 |
| | STCF | Information on status: patent grant | Free format text: PATENTED CASE |
| | FEPP | Fee payment procedure | Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
| | FPAY | Fee payment | Year of fee payment: 4 |
| | AS | Assignment | Owner name: APPLE INC., CALIFORNIA; Free format text: CHANGE OF NAME; ASSIGNOR: APPLE COMPUTER, INC., A CALIFORNIA CORPORATION; REEL/FRAME: 019399/0918; Effective date: 20070109 |
| | FPAY | Fee payment | Year of fee payment: 8 |
| | FPAY | Fee payment | Year of fee payment: 12 |