DE19723294C2

DE19723294C2 - Pattern recognition methods

Info

Publication number: DE19723294C2
Application number: DE19723294A
Authority: DE
Inventors: Juergen Franke; Joachim Gloger; Eberhard Mandler; Alfred Kaltenmeier
Original assignee: DaimlerChrysler AG
Current assignee: Mercedes Benz Group AG
Priority date: 1997-06-04
Filing date: 1997-06-04
Publication date: 2003-06-18
Anticipated expiration: 2017-06-05
Also published as: DE19723294A1

Description

The invention relates to a pattern recognition method according to the preamble of claim 1.

For pattern recognition methods, in particular for recognizing speech or handwriting, recognition systems based on hidden Markov models (HMMs) have long been the preferred choice; see, for example, [KCGM93]. In such recognition systems, multidimensional feature vectors are extracted from the test objects and mapped onto symbols of a multidimensional symbol vector space. The list of these symbols, which are assigned to different states of the underlying HMM, is called a codebook. These symbols can in turn be regarded as coefficients of a multidimensional symbol vector.

For the transformation of the feature vectors into the symbol vector space, which is referred to as vector quantization, a distinction is made in principle between continuous, semi-continuous, and discrete models. In most cases the continuous models are ruled out because of the processing effort they entail.

In the discrete model, a so-called hard decision is made when assigning a feature vector to a symbol. In the semi-continuous models, by contrast, a feature vector is assigned to several symbols, as a rule with different weights. To determine these weights, a statistical distribution, in particular a Gaussian or a gamma distribution, is assumed, for example, for the clusters in the state-dependent feature vector space. The semi-continuous models generally show a considerably better recognition rate than the discrete models.
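
The difference between a hard and a weighted assignment can be sketched as follows. This is a minimal illustration only: the function names, the toy codebook, and the isotropic Gaussian with a common `sigma` are assumptions and are not taken from the patent.

```python
import math

def hard_assign(v, codebook):
    """Discrete model: return the index of the single nearest codebook entry."""
    dists = [sum((a - b) ** 2 for a, b in zip(v, c)) for c in codebook]
    return dists.index(min(dists))

def soft_assign(v, codebook, sigma=1.0):
    """Semi-continuous model: weight every symbol by an assumed isotropic
    Gaussian density centred on its codebook entry, normalised to sum to 1."""
    dens = [math.exp(-sum((a - b) ** 2 for a, b in zip(v, c)) / (2 * sigma ** 2))
            for c in codebook]
    total = sum(dens)
    return [d / total for d in dens]

# Hypothetical two-dimensional codebook with three symbols.
codebook = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
v = (0.9, 0.1)

best = hard_assign(v, codebook)      # one symbol only
weights = soft_assign(v, codebook)   # one weight per symbol
print(best, [round(w, 3) for w in weights])
```

The hard decision discards the fact that `v` lies almost halfway between the first two entries, whereas the weighted assignment preserves it.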

The object of the present invention is to further improve the recognition performance of a pattern recognition method of this kind without appreciably increasing the effort required for recognition.

The invention is described in claim 1. The dependent claims contain advantageous embodiments and developments of the invention.

The invention is based on the finding that the statistical distribution assumed in conventional semi-continuous models often describes the actual distribution of the feature vectors in the state-dependent vector space only very unsatisfactorily. In the present invention, by contrast, no assumption whatsoever is made about a particular distribution function. Instead, the coefficients of a polynomial classifier are determined from representative training data; in the vector quantization, this classifier provides a considerably more accurate approximation of the distribution actually present in the vector space.

In an advantageous embodiment, the polynomial classifier is trained with moment matrices that are also produced when training the HMMs. The vector quantization is preferably carried out with multi-stage branching: several polynomial sub-classifiers are provided, which form the nodes of the branching and are arranged in a tree-like structure. Compared with a single classifier, this has the advantage that processing can be restricted to only the best paths of the tree, so that considerably fewer polynomials have to be evaluated. These sub-classifiers are preferably implemented as binary polynomial classifiers.

Advantageously, the dimension of the feature vectors is reduced by means of a linear discriminant analysis, in a manner known per se, before the vector quantization is carried out.

The invention is illustrated below with reference to a recognition system for cursive handwriting. Such a recognition system is described, for example, in [KCGM93], to which reference is made for details of such an exemplary recognition system.

Starting from a binary pixel image of a handwritten sequence of letters or digits, an essential task when using an HMM recognizer is to generate from the binary image a sequence of vectors as input variables for the HMM recognizer. Such preprocessing may comprise, for example, the transition from the pixel representation to a contour description, which may then be followed by normalization steps such as writing line estimation, slant correction, rotation of the handwriting image, and transition to a skeleton representation. Such preprocessing measures are known and customary in various forms. A more detailed account of the preprocessing measures of such a recognition system is given, for example, in [CGM93]. The setup of an HMM recognizer is complex and, since it is sufficiently well known from the prior art in various embodiments, is not described in detail here. To generate feature vectors from the preprocessed image, a narrow window is, for example, moved step by step across the handwriting image in the assumed writing direction, and at each step features are extracted from the window content which together form a feature vector. This procedure is applied both in a training phase to handwriting images of known content and in a subsequent recognition phase to handwriting images of unknown content.
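
The sliding-window step can be sketched as follows. The patent does not specify which features are extracted; the three features used here (ink density, mean row of the ink, number of rows containing ink), the function name `window_features`, and the window parameters are illustrative assumptions.

```python
def window_features(image, width=3, step=1):
    """Slide a window of `width` columns over a binary image (list of rows of 0/1)
    and return one feature vector per window position.

    The features are placeholders chosen for illustration only."""
    n_rows, n_cols = len(image), len(image[0])
    vectors = []
    for left in range(0, n_cols - width + 1, step):
        # Row index of every ink pixel inside the current window.
        ink_rows = [r for r in range(n_rows)
                    for c in range(left, left + width) if image[r][c]]
        ink = len(ink_rows)
        density = ink / (n_rows * width)            # fraction of ink pixels
        centre = sum(ink_rows) / ink if ink else 0.0  # mean row of the ink
        extent = float(len(set(ink_rows)))          # rows that contain ink
        vectors.append((density, centre, extent))
    return vectors

# Tiny hypothetical binary image, 3 rows by 5 columns.
image = [
    [0, 1, 1, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 1, 1, 0],
]
features = window_features(image)
for vec in features:
    print(vec)
```

The same routine would be applied unchanged to training images and to images of unknown content, so that both phases see feature vectors of identical structure.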

Given a sufficiently large number of training patterns, the feature vectors obtained from these patterns show accumulations, so-called clusters, in the multidimensional feature vector space. These clusters can be analyzed in a manner known per se, for example with the well-known and most widely used LBG algorithm.

According to an advantageous procedure known from the prior art, when the window is moved, not only the static features of a window position but also the differences between these features and those of adjacent window positions are determined and processed further, which leads to a feature vector of correspondingly higher dimension. The following explanations also apply to such and similar variants of feature extraction, without this having to be pointed out explicitly in each case.
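
A minimal sketch of such difference features is given below. The function name `add_deltas` and the handling of the first and last window position (the missing neighbour is replaced by the position itself) are assumptions made for illustration.

```python
def add_deltas(vectors):
    """Extend each static feature vector by its differences to the previous
    and to the next window position.

    At the edges the missing neighbour is replaced by the vector itself,
    so the corresponding differences are zero (an assumption)."""
    out = []
    last = len(vectors) - 1
    for t, cur in enumerate(vectors):
        prev = vectors[max(t - 1, 0)]
        nxt = vectors[min(t + 1, last)]
        back = [c - p for c, p in zip(cur, prev)]   # difference to the left neighbour
        fwd = [n - c for n, c in zip(nxt, cur)]     # difference to the right neighbour
        out.append(list(cur) + back + fwd)
    return out

# Three window positions with two static features each.
static = [[1.0, 0.0], [2.0, 1.0], [4.0, 1.0]]
extended = add_deltas(static)
print(extended[1])
```

With two static features, each extended vector has six components, which illustrates why a subsequent dimension reduction is attractive.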

In a first training phase of the HMM recognizer, possible paths through the model structures and assignments of individual feature vectors to HMM states are determined, for example using the so-called forward-backward method, on the basis of the previously performed state-independent vector quantization training (one or more so-called codebooks) and of predefinable HMM structures for the possible object classes; the coefficients of the HMMs are obtained in the process. In this context, moment matrices are also determined from the assigned feature vectors for the various states, which form the symbols of a single new state-dependent codebook. The moment matrices are used to estimate a linear discriminant transformation matrix, which in turn serves to reduce the dimension of the feature vectors. These method steps are known per se and customary, and are described in detail in [Fuk90].

When a vector quantizer based on assumed statistical distributions, e.g. Gaussian distributions, is used, the parameters of the distribution, such as the mean (centroid) and the covariance, can be determined from the moment matrices. However, such assumed statistical distributions often approximate the actual distributions only inadequately.
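
How mean and covariance follow from accumulated moments can be sketched as below, using the standard identity cov = E{v v^T} - m m^T with m = E{v}. The function names and the toy data are assumptions; the patent does not prescribe a particular way of storing the moments.

```python
def accumulate(vectors):
    """Accumulate the count, the sum of v and the sum of outer products v v^T."""
    dim = len(vectors[0])
    sum_v = [0.0] * dim
    sum_vv = [[0.0] * dim for _ in range(dim)]
    for v in vectors:
        for i in range(dim):
            sum_v[i] += v[i]
            for j in range(dim):
                sum_vv[i][j] += v[i] * v[j]
    return len(vectors), sum_v, sum_vv

def gaussian_from_moments(n, sum_v, sum_vv):
    """Derive mean and covariance from the moments: cov = E{v v^T} - m m^T."""
    dim = len(sum_v)
    mean = [s / n for s in sum_v]
    cov = [[sum_vv[i][j] / n - mean[i] * mean[j] for j in range(dim)]
           for i in range(dim)]
    return mean, cov

# Hypothetical feature vectors assigned to one HMM state.
state_vectors = [(1.0, 2.0), (3.0, 2.0), (2.0, 5.0)]
mean, cov = gaussian_from_moments(*accumulate(state_vectors))
print(mean, cov)
```

The same accumulated moments can therefore feed either a Gaussian model or, as described next, a classifier that makes no distributional assumption.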

According to the present invention, the use of a polynomial classifier takes an entirely different approach, in which no a priori assumptions at all are made about a particular statistical distribution. The coefficients of the polynomial classifier are determined on the basis of the feature vectors obtained in the training phase.

Advantageously, the previously formed moment matrices are used here instead of the feature vectors, which avoids the step of forming these matrices from the feature vectors, a step that is required anyway when determining the coefficients. In addition, the explicit labeling of the class membership of the individual feature vectors can advantageously be bypassed in this way, since the moment matrices already belong to specific HMM states, and these states are precisely the symbols to be generated by the later polynomial-classifier-based vector quantizer.

For reasons of computational effort, it is advantageous here too to work with the dimension reduced relative to the original feature vector space, by applying the linear discriminant transformation to the feature vectors or to the moment matrices.

The polynomial structure is preferably specified as a complete structure, but it can also be thinned out to reduce the computational effort; it remains unchanged during the training phase. For the discriminant function d(v) of the polynomial for the various object classes, the following holds:

d(v) = A^T x(v)

where A is the coefficient matrix and x(v) is the function that maps the original feature vector v onto the polynomial structure list x.
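
The mapping x(v) and the evaluation of d(v) = A^T x(v) can be sketched as follows. The polynomial degree of 2, the ordering of the terms, and the function names are assumptions for illustration; the patent only states that the structure is preferably complete.

```python
from itertools import combinations_with_replacement

def structure_list(v, degree=2):
    """Map feature vector v to a complete polynomial structure list x(v):
    the constant 1 followed by all monomials of the components up to `degree`."""
    x = [1.0]
    for d in range(1, degree + 1):
        for idx in combinations_with_replacement(range(len(v)), d):
            term = 1.0
            for i in idx:
                term *= v[i]
            x.append(term)
    return x

def discriminant(A, v, degree=2):
    """Evaluate d(v) = A^T x(v). A has one row per polynomial term
    and one column per class (symbol)."""
    x = structure_list(v, degree)
    n_classes = len(A[0])
    return [sum(A[k][c] * x[k] for k in range(len(x))) for c in range(n_classes)]

v = (2.0, 3.0)
x = structure_list(v)   # terms: 1, v0, v1, v0*v0, v0*v1, v1*v1

# Hypothetical coefficient matrix: class 0 picks the term v0, class 1 the term v1*v1.
A = [[0, 0], [1, 0], [0, 0], [0, 0], [0, 0], [0, 1]]
d = discriminant(A, v)
print(x, d)
```

A thinned-out structure would simply omit some entries of `x` and the corresponding rows of `A`.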

For each node of the path branching tree (trellis), the probability that a feature vector is to be assigned to a particular state of an HMM can be determined. This leads to new moment matrices E{x x^T} and E{x y^T}, with y as the symbol class vector, from which the coefficient matrix can be determined according to

A = E{x x^T}^{-1} E{x y^T}

A detailed description of these and further aspects of polynomial classifiers is given in [Sch96].
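
The computation of A from the two moment matrices can be sketched as below. The sketch solves E{x x^T} A = E{x y^T} by Gauss-Jordan elimination instead of forming the inverse explicitly, which gives the same result; the function names, the linear toy structure x = (1, v), and the sample data are assumptions. No regularization is included, so a singular moment matrix would raise an error.

```python
def solve(M, B):
    """Solve M A = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [list(M[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [a / p for a in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def train(samples):
    """samples: list of (x, y), with x the polynomial structure list and y the
    vector of state (symbol) probabilities for that feature vector.
    Returns A = E{x x^T}^{-1} E{x y^T}."""
    n = len(samples)
    k, c = len(samples[0][0]), len(samples[0][1])
    Exx = [[0.0] * k for _ in range(k)]
    Exy = [[0.0] * c for _ in range(k)]
    for x, y in samples:
        for i in range(k):
            for j in range(k):
                Exx[i][j] += x[i] * x[j] / n
            for j in range(c):
                Exy[i][j] += x[i] * y[j] / n
    return solve(Exx, Exy)

# Toy data: scalar feature v, structure list x = (1, v), two symbols with
# probabilities (1 - v, v), so an exact linear fit exists.
samples = [([1.0, v], [1.0 - v, v]) for v in (0.0, 1.0, 0.5)]
A = train(samples)
print(A)
```

Because y may hold probabilities rather than hard 0/1 labels, the state probabilities from HMM training can be used directly as targets.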

The use of a polynomial classifier offers the advantage of a mathematically closed-form solution without iterations, at least when the least mean square error is used as the optimization criterion.

To reduce the computational effort, the vector quantization is preferably carried out not by means of a single, single-stage polynomial classifier but in multiple stages, with successive branching in consecutive stages, using several polynomial sub-classifiers in the manner of a tree structure. The sub-classifiers are particularly advantageously implemented as binary polynomial classifiers.

As a rule, setting up the polynomial classifier in a training phase involves a higher computational effort in that phase than a vector quantizer with an assumed statistical distribution function. However, this effort arises only once, in the training phase. The computational effort in the later recognition phase is no higher, or only insignificantly higher, than for vector quantization based on statistical distribution functions. When traversing the branching tree of the sub-classifiers, the processing effort of the vector quantization can be reduced further by cutting off insignificant paths (pruning).
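
Tree-structured quantization with pruning can be sketched as follows. Several details are assumptions not given in the patent: each binary sub-classifier output is clipped to [0, 1] and read as the score of the right branch, a path score is the product of the scores along the path, and a branch is dropped when its path score falls below a fixed `threshold`. The weights of the surviving symbols are not renormalized.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

@dataclass
class Node:
    coeffs: Optional[Sequence[float]] = None  # binary classifier (internal nodes)
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    symbol: Optional[int] = None              # codebook symbol (leaves)

def right_score(coeffs, x):
    """Binary polynomial classifier a^T x, clipped to [0, 1] (assumption)."""
    d = sum(a * xi for a, xi in zip(coeffs, x))
    return min(1.0, max(0.0, d))

def quantize(node, x, threshold=0.1, score=1.0):
    """Descend the tree and return ({symbol: weight}, number of polynomials
    evaluated). Branches with a path score below `threshold` are pruned."""
    if node.symbol is not None:
        return {node.symbol: score}, 0
    p = right_score(node.coeffs, x)
    weights, evals = {}, 1
    for child, s in ((node.left, score * (1.0 - p)), (node.right, score * p)):
        if s >= threshold:
            w, e = quantize(child, x, threshold, s)
            weights.update(w)
            evals += e
    return weights, evals

# Hypothetical two-stage tree over the structure list x = (1, v).
leaf = lambda s: Node(symbol=s)
tree = Node(coeffs=[0.0, 1.0],
            left=Node(coeffs=[0.5, 0.0], left=leaf(0), right=leaf(1)),
            right=Node(coeffs=[0.5, 0.0], left=leaf(2), right=leaf(3)))

weights, evals = quantize(tree, [1.0, 0.95])
print(weights, evals)
```

In this toy case the left subtree is pruned at the root, so only two of the three sub-classifiers are evaluated; with deeper trees the saving grows accordingly.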

According to initial investigations, the improved description of the actual distribution in the feature vector space by the polynomial classifier leads to a significant reduction of the error rate compared with an assumed normal distribution: with otherwise identical settings of the recognition system, for example, to an error rate of 5.6% compared with 6.3% for an assumed normal distribution.

Bibliography

[KCGM93] A. Kaltenmeier, T. Caesar, J. M. Gloger, and E. Mandler. Sophisticated topology of hidden Markov models for cursive script recognition. In Proceedings of the Second International Conference on Document Analysis and Recognition, pp. 139-142, Tsukuba Science City, Japan, 20-22 October 1993. IEEE Computer Society Press.
[CGM93] T. Caesar, J. M. Gloger, and E. Mandler. Preprocessing and feature extraction for a handwriting recognition system. In Proceedings of the Second International Conference on Document Analysis and Recognition, pp. 408-411, Tsukuba Science City, Japan, 20-22 October 1993. IEEE Computer Society Press.
[Fuk90] Keinosuke Fukunaga. Introduction to Statistical Pattern Recognition. Academic Press Inc., San Diego, CA, 2nd edition, 1990.
[Sch96] Jürgen Schürmann. Pattern Classification. John Wiley & Sons Inc., New York, 1996.

Claims

1. A pattern recognition method based on hidden Markov models in which feature vectors are transformed into symbol vectors by means of vector quantization, characterized in that the vector quantization is performed by means of a polynomial classifier without any assumption about the statistical distributions present in the state-dependent feature space.

2. The method according to claim 1, characterized in that the vector quantization is branched in multiple stages, with individual polynomial sub-classifiers forming the nodes of the branching.

3. The method according to claim 2, characterized in that the polynomial sub-classifiers are binary polynomial classifiers.

4. The method according to any one of claims 1 to 3, characterized in that the dimension of the feature vectors is reduced and the vector quantization is performed on the dimension-reduced feature vectors.

5. The method according to any one of claims 1 to 4, characterized in that the coefficients of the polynomial classifier are estimated from the feature vectors determined in a training phase or from variables derived from them.

6. The method according to claim 5, characterized in that, in a training phase of the HMM recognizer, moment matrices are formed from the feature vectors, and these moment matrices are also used to estimate the coefficients of the polynomial classifier or polynomial classifiers.