DE19723294A1

DE19723294A1 - Pattern recognition method for speech or written data

Info

Publication number: DE19723294A1
Application number: DE19723294A
Authority: DE
Inventors: Juergen Franke; Joachim Gloger; Eberhard Mandler; Alfred Dr Kaltenmeier
Original assignee: Daimler Benz AG
Current assignee: Mercedes Benz Group AG
Priority date: 1997-06-04
Filing date: 1997-06-04
Publication date: 1998-12-10
Anticipated expiration: 2017-06-05
Also published as: DE19723294C2

Abstract

The pattern recognition method is based on hidden Markov models and is used to recognize spoken and written information. Feature vectors are transformed into symbol vectors by vector quantization. The quantization is performed using polynomial classifiers, which are obtained by a closed-form computation that involves no iteration.

Description

The invention relates to a pattern recognition method according to the preamble of claim 1.

For pattern recognition, in particular the recognition of speech or handwriting, recognition systems based on hidden Markov models (HMMs) have long been the preferred approach; see, for example, [KCGM93]. In such recognition systems, multidimensional feature vectors are obtained from the test objects and mapped onto symbols of a multidimensional symbol vector space. The list of these symbols, which are assigned to different states of the underlying HMM, is called the codebook. These symbols can in turn be regarded as coefficients of a multidimensional symbol vector.

For the transformation of the feature vectors into the symbol vector space, referred to as vector quantization, a basic distinction is made between continuous, semi-continuous and discrete models. In most cases the continuous models are ruled out by the processing effort they entail.

In the discrete model, a so-called hard decision is made when assigning a feature vector to a symbol. In the semi-continuous models, by contrast, a feature vector is assigned to several symbols, as a rule with different weights. To determine these weights, a statistical distribution, in particular a Gaussian or a gamma distribution, is assumed, for example, for the clusters in the state-dependent feature vector space. The semi-continuous models generally achieve a substantially better recognition rate than the discrete models.
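The difference between a hard decision and a weighted assignment can be illustrated with a minimal Python sketch. The codebook entries, the isotropic Gaussian weighting, the `variance` value and the `top_k` cut-off are all assumptions chosen for illustration; the source does not prescribe them.

```python
import math

def squared_distance(v, c):
    """Squared Euclidean distance between a feature vector and a codebook entry."""
    return sum((a - b) ** 2 for a, b in zip(v, c))

def hard_assign(v, centroids):
    """Discrete model: return the index of the single nearest codebook entry."""
    dists = [squared_distance(v, c) for c in centroids]
    return dists.index(min(dists))

def soft_assign(v, centroids, variance=1.0, top_k=2):
    """Semi-continuous model: weight the top_k symbols by an assumed
    isotropic Gaussian density and normalise the weights to sum to 1."""
    scores = [math.exp(-squared_distance(v, c) / (2.0 * variance))
              for c in centroids]
    best = sorted(range(len(centroids)), key=lambda i: scores[i],
                  reverse=True)[:top_k]
    total = sum(scores[i] for i in best)
    return {i: scores[i] / total for i in best}

# Hypothetical two-dimensional codebook with three symbols.
centroids = [(0.0, 0.0), (1.0, 0.0), (4.0, 4.0)]
v = (0.4, 0.1)
print(hard_assign(v, centroids))   # one symbol index
print(soft_assign(v, centroids))   # several symbols with weights
```

The hard decision discards the information that `v` also lies fairly close to the second codebook entry, whereas the weighted assignment passes that information on to the HMM.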

The object of the present invention is to further improve the recognition performance of a pattern recognition method of this type without appreciably increasing the effort required for recognition.

The invention is described in claim 1. The dependent claims contain advantageous embodiments and developments of the invention.

The invention is based on the finding that the statistical distribution assumed in conventional semi-continuous models often describes the actual distribution of the feature vectors in the state-dependent vector space only very unsatisfactorily. In the present invention, by contrast, no assumption whatsoever is made about a particular distribution function. Instead, the coefficients of a polynomial classifier are determined from representative training data, and in the vector quantization this classifier provides a substantially more accurate approximation of the distribution actually present in the vector space.

In an advantageous embodiment, the polynomial classifier is trained with moment matrices that also arise when training the HMMs. The vector quantization is preferably carried out in a multi-stage branched manner: several polynomial sub-classifiers are provided, which form the nodes of the branching and are arranged in a tree-like structure. Compared with a single classifier, this has the advantage that the evaluation can be restricted to only the best paths of the tree, so that substantially fewer polynomials have to be evaluated. These sub-classifiers are preferably implemented as binary polynomial classifiers. Advantageously, the dimension of the feature vectors is reduced by means of a linear discriminant analysis, in a manner known per se, before the vector quantization is carried out.

The invention is illustrated below using a recognition system for cursive script. Such a recognition system is described, for example, in [KCGM93], and reference is made to that publication for details of such an exemplary recognition system.

Starting from a binary pixel image of a handwritten sequence of letters or digits, an essential task when using an HMM recognizer is to generate from the binary image a sequence of vectors as input to the HMM recognizer. Such preprocessing can include, for example, the transition from the pixel representation to a contour description, which can be followed by normalization steps such as writing-line estimation, slant correction, rotation of the script image and transition to a skeleton representation. Such preprocessing measures are known and in common use in various forms. A more detailed account of the preprocessing measures of such a recognition system is given, for example, in [CGM93]. Setting up an HMM recognizer is complex and, since it is sufficiently known from the prior art in various embodiments, is not described in detail here. To generate feature vectors from the preprocessed image, a narrow window is, for example, moved step by step across the script image in the assumed writing direction, and at each step features are extracted from the window content which together form a feature vector. This procedure is applied both in a training phase to script images of known content and in a subsequent recognition phase to script images of unknown content.
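The sliding-window step can be sketched as follows in Python. The three features computed here (ink density, vertical centre of gravity, and background-to-ink transitions in the middle column) are hypothetical placeholders; the source does not state which features are extracted, and [CGM93] should be consulted for the actual feature set.

```python
def window_features(image, width=3, step=1):
    """Slide a narrow window from left to right over a binary image
    (a list of rows containing 0/1) and return one feature vector per
    window position. The features themselves are illustrative assumptions."""
    n_rows, n_cols = len(image), len(image[0])
    vectors = []
    for left in range(0, n_cols - width + 1, step):
        ink = [(r, c) for r in range(n_rows)
               for c in range(left, left + width) if image[r][c]]
        # Fraction of ink pixels in the window.
        density = len(ink) / (n_rows * width)
        # Vertical centre of gravity in [0, 1]; 0.5 for an empty window.
        if ink:
            centre = sum(r for r, _ in ink) / len(ink) / max(n_rows - 1, 1)
        else:
            centre = 0.5
        # Number of background-to-ink transitions down the middle column.
        mid = left + width // 2
        transitions = sum(1 for r in range(1, n_rows)
                          if image[r][mid] and not image[r - 1][mid])
        vectors.append((density, centre, transitions))
    return vectors

# Tiny hypothetical script image: 4 rows, 5 columns.
image = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0],
]
for vec in window_features(image):
    print(vec)
```

The same function would be applied to training images of known content and to images of unknown content during recognition, so that both phases see feature vectors of identical layout.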

Given a sufficiently large number of training patterns, the feature vectors obtained from these patterns form accumulations, so-called clusters, in the multidimensional feature vector space. These clusters can be analysed in a manner known per se, for example with the well-known and predominantly used LBG algorithm.

According to an advantageous procedure known from the prior art, as the window is moved, not only the static features of a window position but also the differences between its features and those of neighbouring window positions are determined and processed further, which leads to a correspondingly higher-dimensional feature vector. The following explanations also apply to such and similar variants of feature extraction, without this needing to be pointed out explicitly each time.
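A minimal Python sketch of such difference features is given below. The central difference over the two neighbouring window positions, and the repetition of the edge vector at the sequence boundaries, are assumptions of this sketch; the source only states that differences to neighbouring positions are used.

```python
def add_deltas(vectors):
    """Append to each static feature vector the (assumed) central
    difference between its right and left neighbours, doubling the
    dimension of the feature vector."""
    out = []
    last = len(vectors) - 1
    for t, v in enumerate(vectors):
        prev = vectors[max(t - 1, 0)]      # repeat first vector at the left edge
        nxt = vectors[min(t + 1, last)]    # repeat last vector at the right edge
        delta = [(n - p) / 2.0 for n, p in zip(nxt, prev)]
        out.append(list(v) + delta)
    return out

static = [[1.0, 2.0], [3.0, 2.0], [7.0, 0.0]]
for vec in add_deltas(static):
    print(vec)
```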

In a first training phase of the HMM recognizer, possible paths through the model structures and assignments of individual feature vectors to HMM states are determined, e.g. using the so-called forward-backward method, and the coefficients of the HMMs are thereby obtained. This is done on the basis of the previously performed state-independent vector quantization training (one or more so-called codebooks) and of predefinable structures of the HMMs for the possible object classes. In this process, moment matrices are also determined from the assigned feature vectors for the various states, which form the symbols of a single new state-dependent codebook. The moment matrices are used to estimate a linear discriminant transformation matrix, which in turn serves to reduce the dimension of the feature vectors. These method steps are known per se and in common use and are described in detail in [Fuk90].

When a vector quantizer based on assumed statistical distributions, e.g. Gaussian distributions, is used, the parameters of the distribution, such as the centroid (mean) and the covariance, can be determined from the moment matrices. However, such assumed statistical distributions often approximate the actual distributions only inadequately.
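As a brief illustration of how Gaussian parameters follow from moments: assuming that, for a given state, the first-order moment E{v} and the second-order moment E{vv^T} of the feature vectors v are available (the exact layout of the moment matrices is not specified in the source), the mean μ and covariance Σ are

$$\mu = E\{v\}, \qquad \Sigma = E\{vv^{T}\} - \mu\mu^{T}$$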

According to the present invention, the use of a polynomial classifier takes an entirely different approach, in which no a priori assumptions are made about a particular statistical distribution. The coefficients of the polynomial classifier are determined on the basis of the feature vectors obtained in the training phase.

Advantageously, the previously formed moment matrices are used instead of the feature vectors, which avoids the step of forming these matrices from the feature vectors that the coefficient estimation would otherwise require. In addition, this route advantageously makes it possible to dispense with explicitly labelling the class membership of the individual feature vectors, since the moment matrices already belong to particular HMM states, and these states are precisely the symbols to be generated by the later polynomial-classifier-based vector quantizer.

For reasons of computational effort, it is advantageous here too to work with the dimension reduced relative to the original feature vector space, by applying the linear discriminant transformation to the feature vectors or to the moment matrices.

The polynomial structure is preferably specified as a complete structure, but it can also be thinned out to reduce the computational effort; it remains unchanged during the training phase. The discriminant function d(v) of the polynomial for the various object classes is given by

$$d(v) = A^{T} x(v)$$

where A is the coefficient matrix and x(v) is the function that maps the original feature vector v onto the polynomial structure list x.
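The mapping x(v) and the evaluation of the discriminant function can be sketched in Python as follows. The choice of a complete polynomial of degree 2, the ordering of the terms and the example coefficient matrix are assumptions for illustration only.

```python
from itertools import combinations_with_replacement

def poly_terms(v, degree=2):
    """Map a feature vector v onto a complete polynomial structure list
    x(v): the constant 1, then all monomials of v up to the given degree."""
    x = [1.0]
    for d in range(1, degree + 1):
        for idx in combinations_with_replacement(range(len(v)), d):
            term = 1.0
            for i in idx:
                term *= v[i]
            x.append(term)
    return x

def discriminant(A, v, degree=2):
    """Evaluate d(v) = A^T x(v). A is a list of rows, one row per
    polynomial term and one column per symbol class."""
    x = poly_terms(v, degree)
    n_classes = len(A[0])
    return [sum(A[j][k] * x[j] for j in range(len(x)))
            for k in range(n_classes)]

v = [2.0, 3.0]
print(poly_terms(v))   # [1.0, 2.0, 3.0, 4.0, 6.0, 9.0]

# Hypothetical coefficient matrix: 6 polynomial terms, 2 classes.
A = [[0.0, 1.0], [1.0, 0.0], [0.0, 0.0],
     [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
print(discriminant(A, v))   # [2.0, 1.0]
```

Thinning out the structure, as mentioned above, would correspond to dropping entries from the list returned by `poly_terms` together with the matching rows of `A`.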

For each node of the path branching tree (trellis), the probability that a feature vector is to be assigned to a particular state of an HMM can be determined. This leads to new moment matrices E{xx^T} and E{xy^T}, with y as the symbol class vector, from which the coefficient matrix is obtained as

$$A = E\{xx^{T}\}^{-1}\, E\{xy^{T}\}$$

A detailed description of these and further aspects of polynomial classifiers is given in [Sch96].
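The estimation of the coefficient matrix from the two moment matrices can be sketched in pure Python as follows. The quadratic structure list, the one-dimensional toy data and the use of hard 0/1 state indicators for y are assumptions of this sketch; in the system described, y would carry the state assignment probabilities obtained during HMM training.

```python
def expand(v):
    """Hypothetical complete quadratic structure list x(v): 1, v_i, v_i*v_j."""
    n = len(v)
    return ([1.0] + list(v)
            + [v[i] * v[j] for i in range(n) for j in range(i, n)])

def moment_matrices(samples):
    """Accumulate E{xx^T} and E{xy^T} from (v, y) pairs, where y is the
    symbol class vector (state membership, hard or soft)."""
    samples = list(samples)
    n, m = len(expand(samples[0][0])), len(samples[0][1])
    exx = [[0.0] * n for _ in range(n)]
    exy = [[0.0] * m for _ in range(n)]
    for v, y in samples:
        x = expand(v)
        for i in range(n):
            for j in range(n):
                exx[i][j] += x[i] * x[j] / len(samples)
            for k in range(m):
                exy[i][k] += x[i] * y[k] / len(samples)
    return exx, exy

def solve(M, B):
    """Solve M A = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [list(M[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("E{xx^T} is singular; thin out the structure list")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [a / p for a in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def train(samples):
    """A = E{xx^T}^-1 E{xy^T}, computed in one pass without iteration."""
    exx, exy = moment_matrices(samples)
    return solve(exx, exy)

def discriminant(A, v):
    """d(v) = A^T x(v): one value per symbol class."""
    x = expand(v)
    return [sum(A[j][k] * x[j] for j in range(len(x)))
            for k in range(len(A[0]))]

# Toy data: one feature, two states (symbols).
samples = ([((float(v),), (1.0, 0.0)) for v in (-2, -1, 0)]
           + [((float(v),), (0.0, 1.0)) for v in (1, 2, 3)])
A = train(samples)
print(discriminant(A, (-1.5,)))   # first component dominates
print(discriminant(A, (2.5,)))    # second component dominates
```

Because the moment matrices are sums over the training data, they can be accumulated alongside the HMM training and solved once at the end, which is what makes the reuse of existing moment matrices attractive.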

The use of a polynomial classifier offers the advantage of a mathematically closed-form solution without iterations, at least when the least mean square error is used as the optimization criterion.
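Why the solution is closed-form can be seen from a short derivation, given here as a sketch in the notation used for the discriminant function. The mean square error between the symbol class vector y and the discriminant A^T x is

$$S^{2} = E\{\lVert y - A^{T}x \rVert^{2}\}$$

Setting its derivative with respect to A to zero gives a linear system in A,

$$\frac{\partial S^{2}}{\partial A} = -2\,E\{x\,(y - A^{T}x)^{T}\} = 0 \quad\Longrightarrow\quad E\{xx^{T}\}\,A = E\{xy^{T}\}$$

so A follows from a single matrix inversion (or linear solve), provided E{xx^T} is non-singular.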

To reduce the computational effort, the vector quantization is preferably carried out not by a single one-stage polynomial classifier but in multiple stages, with successive branching in consecutive stages, using several polynomial sub-classifiers arranged in a tree structure. The sub-classifiers are particularly advantageously implemented as binary polynomial classifiers.

Setting up the polynomial classifier in a training phase will, as a rule, involve a higher computational effort than for a vector quantizer with an assumed statistical distribution function. However, this effort is incurred only once, in the training phase. The computational effort in the later recognition phase is no higher, or only insignificantly higher, than for vector quantization based on statistical distribution functions. When traversing the branching tree of sub-classifiers, the processing effort of the vector quantization can be reduced further by cutting off insignificant paths (pruning).
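A tree of binary sub-classifiers with pruning can be sketched in Python as follows. The dictionary-based tree layout, the linear node polynomials, the clipping of node outputs to [0, 1], the multiplication of weights along a path and the `threshold` value are all assumptions of this sketch, not details taken from the source.

```python
def node_score(coeffs, v):
    """Binary polynomial sub-classifier (linear here for brevity):
    estimate, clipped to [0, 1], that v belongs to the right subtree."""
    s = coeffs[0] + sum(c * f for c, f in zip(coeffs[1:], v))
    return min(1.0, max(0.0, s))

def quantize(node, v, threshold=0.1, weight=1.0, out=None):
    """Descend the tree and return {symbol: weight} for all leaves whose
    path weight stays at or above the pruning threshold."""
    if out is None:
        out = {}
    if "symbol" in node:               # leaf: emit the symbol with its weight
        out[node["symbol"]] = weight
        return out
    p_right = node_score(node["coeffs"], v)
    for child, p in ((node["left"], 1.0 - p_right),
                     (node["right"], p_right)):
        if weight * p >= threshold:    # pruning: skip insignificant paths
            quantize(child, v, threshold, weight * p, out)
    return out

# Hypothetical two-stage tree with four symbols.
tree = {
    "coeffs": [0.5, 0.5, 0.0],         # root decides on the first feature
    "left": {"coeffs": [0.5, 0.0, 0.5],
             "left": {"symbol": 0}, "right": {"symbol": 1}},
    "right": {"coeffs": [0.5, 0.0, 0.5],
              "left": {"symbol": 2}, "right": {"symbol": 3}},
}
v = (0.9, 0.0)
print(quantize(tree, v))                  # left subtree is pruned
print(quantize(tree, v, threshold=0.0))   # no pruning: all four symbols
```

With pruning, the sub-classifier in the left subtree is never evaluated for this input, which is the source of the saving in polynomial evaluations. A threshold that is too high can prune every path and return an empty result, so a practical system would need a fallback such as always following the best branch.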

According to initial investigations, the improved description of the actual distribution in the feature vector space by the polynomial classifier leads to a significant reduction in the error rate compared with an assumed normal distribution. With otherwise identical settings of the recognition system, for example, the error rate was 5.6% compared with 6.3% for an assumed normal distribution.

Bibliography

[KCGM93] A. Kaltenmeier, T. Caesar, J.M. Gloger, and E. Mandler. Sophisticated topology of hidden Markov models for cursive script recognition. In Proceedings of the Second International Conference on Document Analysis and Recognition, pp. 139-142, Tsukuba Science City, Japan, 20-22 October 1993. IEEE Computer Society Press.
[CGM93] T. Caesar, J.M. Gloger, and E. Mandler. Preprocessing and feature extraction for a handwriting recognition system. In Proceedings of the Second International Conference on Document Analysis and Recognition, pp. 408-411, Tsukuba Science City, Japan, 20-22 October 1993. IEEE Computer Society Press.
[Fuk90] Keinosuke Fukunaga. Introduction to Statistical Pattern Recognition. Academic Press Inc., San Diego, CA, 2nd edition, 1990.
[Sch96] Jürgen Schürmann. Pattern Classification. John Wiley & Sons Inc., New York, 1996.

Claims

1. A pattern recognition method based on hidden Markov models, in which feature vectors are transformed into symbol vectors by means of vector quantization, characterized in that the vector quantization is carried out by means of a polynomial classifier without any assumption about the statistical distributions present in the state-dependent feature space.

2. The method according to claim 1, characterized in that the vector quantization is carried out in a multi-stage branched manner, with individual polynomial sub-classifiers forming the nodes of the branching.

3. The method according to claim 2, characterized in that the polynomial sub-classifiers are binary polynomial classifiers.

4. The method according to any one of claims 1 to 3, characterized in that the dimension of the feature vectors is reduced and the vector quantization is carried out on the basis of the dimension-reduced feature vectors.

5. The method according to any one of claims 1 to 4, characterized in that the coefficients of the polynomial classifier are estimated from feature vectors determined in a training phase or from quantities derived therefrom.

6. The method according to claim 5, characterized in that, in a training phase of the HMM recognizer, moment matrices are formed from the feature vectors and these moment matrices are also used to estimate the coefficients of the polynomial classifier or classifiers.