WO1996009607A1 - Handwritten pattern recognizer - Google Patents
- Publication number: WO1996009607A1 (PCT/US1995/011664)
- Authority: WO, WIPO (PCT)
- Prior art keywords: pattern, parameter, parameters, parameters include, match
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/254—Fusion techniques of classification results, e.g. of results related to same input data
Definitions
- the final parameter, parameter 12, is defined as the ratio of the overall height of the pattern to its width, or:
Abstract
A handwritten pattern recognition system for recognizing an input pattern is provided. The system has a plurality of parameter determining units (26), each determining the value of a desired parameter for an input pattern to be recognized. The system also includes a pattern match determiner (24) which produces match values for each parameter of the input pattern with its corresponding parameter of each reference pattern (22). The match determiner (24) also produces an overall match value for each reference pattern. A pattern classifier (44) selects the reference pattern whose parameter set is 'closest', by some matching criterion, to that of the input pattern.
Description
HANDWRITTEN PATTERN RECOGNIZER
FIELD OF THE INVENTION
The present invention relates to pattern recognition systems in general and to systems for recognizing handwritten patterns, such as letters, numbers and signatures, in particular.
BACKGROUND OF THE INVENTION
Various handwritten pattern recognition systems are known in the art and they have varying degrees of success at recognition. These systems typically assume some particular structure of the characters (patterns) under investigation and utilize the structure to improve their recognition ability.
An example prior art system is shown in Fig. 1, to which reference is now made. It typically includes a digitizer 10, a segmenter 12, a feature extractor 14, a classifier 16 and a reference character database 18. The digitizer 10 converts an input pattern into a series of paired position (x,y), and sometimes also pressure P, coordinates of sample points along the stroke. The segmenter 12 divides the input pattern into separate characters (e.g. if the input pattern was a handwritten "the", the segmenter 12 would divide the separate strokes into the characters "t", "h" and "e"). The feature extractor 14 extracts the features of each character and transforms each character into a standard format, called a "compressed model". The classifier 16 then compares the standardized input character against the standardized reference characters stored in the reference database 18. The reference character which has the best match, by some criterion or criteria, is output as the recognized character. U.S. Patent 4,284,975 to Odaka and U.S. Patent 4,607,386 to Morita et al. describe representative systems.
U.S. Patent 4,040,009 to Kadota et al. describes a system which assumes a certain structure for the patterns being recognized and utilizes this knowledge to resolve ambiguities among characters that, from the compressed model, are otherwise indistinguishable. The classifier 16 of the system of Kadota et al. has two recognition phases. The first phase divides the reference characters into "confusion groups" where the members of each confusion group are indistinguishable from each other. In the second phase, an a priori pair-wise matrix of pair-wise specific features is created. Each pair-wise feature discriminates between a pair of reference characters based on the distance of each reference to the relevant feature. Other patents which describe this approach are U.S. Patents 4,718,102 and 4,531,231, both to Crane et al.
Unfortunately, the criteria for recognizing confusion groups and for defining pair-wise features are based on the writing style of the particular reference characters in the database. As a result, the prior art systems cannot recognize characters which have a significantly different writing style.
U.S. Patent 5,125,039 to Hawkins describes a system which records the occurrence of features in an unknown object and compares the result with dictionary entries for the reference characters. The dictionary entries indicate that, for the reference character, each feature either occurs or does not occur (i.e. they are binary features). The feature list of the unknown object is XOR'd with the feature list of each reference character and the unknown object is assigned the identity of the reference character to which it has the best XOR match.
SUMMARY OF THE PRESENT INVENTION
Applicants have realized that a) there are global parameters, such as lengths of strokes, and local parameters, such as locations of features of interest, and b) all parameters are equally important in recognition. Furthermore, some parameters have a range of values and are not binary in nature. Recognition can be improved by utilizing these realizations with a multi-objective recognition criterion.
The list of parameters which the system identifies is not unique but the set of parameters should reasonably define the expected types of patterns and their expected variation. Possible parameters include the aspect ratio of the height of the pattern's bounding rectangle to its width and the relative length of the first stroke from pen-down to the first features of interest, such as a sharp angle change or a local minimum or maximum.
It is therefore an object of the present invention to provide a handwritten pattern recognition system having a plurality of parameter determining units, each determining the value of a desired parameter for an input pattern to be recognized. The system also includes a pattern match determiner which produces match values for each parameter of the input pattern with its corresponding parameter of each reference pattern. The match determiner also produces an overall match value for each reference pattern. A pattern classifier selects the reference pattern whose parameter set is "closest", by some matching criterion, to that of the input pattern.
Additionally, in accordance with a preferred embodiment of the present invention, the pattern classifier includes best candidate means for selecting the reference pattern with the smallest match value. Alternatively, the pattern classifier includes K nearest neighbor means which selects the group of reference patterns having the K smallest match values, divides the group into classes according to which type of pattern they represent and, if available, selects the class having the most reference patterns therein.
Moreover, in accordance with a preferred embodiment of the present invention, the parameters are local parameters, global parameters and stroke-based parameters.
Finally, in accordance with a preferred embodiment of the present invention, the input pattern is provided as a sequence of sample points.
BRIEF DESCRIPTION OF THE DRAWINGS
The present invention will be understood and appreciated more fully from the following detailed description taken in conjunction with the drawings in which:
Fig. 1 is a block diagram illustration of a prior art handwritten pattern recognition system;
Fig. 2 is a block diagram illustration of novel parameter extraction and classification units forming part of a handwritten pattern recognition system of the present invention;
Fig. 3 is a flow chart illustration of the operations of a parameter set comparator forming part of the parameter extraction and classification units of Fig. 2;
Figs. 4A and 4B are flow chart illustrations of the operations of a pattern classifier forming part of the parameter extraction and classification units of Fig. 2; and
Figs. 5A, 5B, 5C, 5D, 5E and 5F are illustrations of letters indicating various elements useful in determining parameters.
DETAILED DESCRIPTION OF A PREFERRED EMBODIMENT
The present invention is a handwritten pattern recognition system. As such, it comprises a digitizer 10 and a segmenter 12, as in the prior art. Fig. 2, to which reference is now made, details the elements of its parameter extractor, labeled 20, its reference pattern database, labeled 22, and its classifier, labeled 24.
The parameter extractor 20 receives sample points along line 23 from the segmenter 12. Parameter extractor 20 comprises a plurality of independent parameter determination modules 26, each determining a different parameter, such as length, aspect ratio, etc., of the input sample points, and a parameter concatenator 28 which produces a parameter set, on line 43, from the output of the modules 26.
Each parameter determination module 26 comprises a parameter generator 30 and a normalizer 32. The parameter generators 30 each generate a single parameter $f_i$ (which can have a range of values which include the null value) and the corresponding normalizer 32 normalizes the parameter $f_i$ (i varies from 1 to I, the number of parameters) to provide the parameters with some set of standard units. For example, each normalizer 32 normalizes its parameter $f_i$ with a pre-determined standard deviation value corresponding thereto. The standard deviation value for $f_i$ is produced by determining the value of the parameter $f_i$ for all reference patterns in a large reference database and taking the first standard deviation thereof. The output of normalizer 32 is a normalized parameter $f'_i$ and the output of the concatenator 28 is a parameter set $F(f'_i)$ whose elements, due to the normalization, are all in standard units and can thus be compared.
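The normalization described above can be sketched as follows. This is a minimal illustration, assuming that each normalizer divides its raw parameter by the standard deviation of that parameter over the reference database. The function names are hypothetical, `None` stands in for the null value, and the handling of a zero standard deviation is an assumption that the text does not address.

```python
import statistics


def reference_std(values):
    """Standard deviation of one parameter over the reference patterns,
    ignoring null (None) entries."""
    valid = [v for v in values if v is not None]
    return statistics.pstdev(valid)


def normalize(value, std):
    """Divide a raw parameter by its pre-determined standard deviation.

    A null parameter stays null. A zero spread is also mapped to null,
    which is an assumption and not something the text specifies.
    """
    if value is None or std == 0:
        return None
    return value / std


def parameter_set(raw_params, stds):
    """Concatenate the normalized parameters into one parameter set."""
    return [normalize(v, s) for v, s in zip(raw_params, stds)]
```

For example, `parameter_set([4.0, None], [2.0, 1.0])` yields a set whose second element is still null, so later stages can skip it.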
The classifier 24 comprises a reference pattern selector 40, a parameter set comparator 42, and a pattern classifier 44. One at a time, the reference pattern selector 40 selects, from reference database 22, the reference parameter sets $F_j(f'_{ij})$ (j varies from 1 to M, the number of reference characters) to be compared to the input parameter set $F(f'_i)$ produced by the parameter concatenator 28. The parameter set comparator 42 receives the reference parameter set $F_j(f'_{ij})$ along line 49 and the input parameter set $F(f'_i)$ along the line 43 and compares the reference parameter set $F_j(f'_{ij})$ with the input parameter set $F(f'_i)$. The parameter set comparator 42 produces, along a line 45, a comparison value for each reference parameter set $F_j(f'_{ij})$. The pattern classifier 44 selects the reference parameter set $F_j(f'_{ij})$ which is closest, by some match criterion described hereinbelow, to the input parameter set $F(f'_i)$. The reference pattern corresponding to the selected reference parameter set is put out, along line 47, as the matched pattern.
It will be appreciated that reference pattern selector 40 can select all of the patterns in the database 22 or it can select a portion thereof. For example, if another type of classifier has already processed the input pattern and determined that a group of reference patterns are similar to the input pattern, the other classifier can so indicate to the reference pattern selector 40 and it can choose only those patterns of the group found by the other classifier.
In accordance with a preferred embodiment of the present invention, the parameters can be any type of parameters which describe the expected patterns. The parameter set should describe local and global features of patterns and/or of strokes of patterns in order to cover as much of the variations in handwriting as possible. For example, the difference between a u and a v is a local one, centered around the sharpness of the curve. Other differences, such as angle of the letters, are more global. By considering many possible features of patterns, the noise in handwritten patterns, caused by non-rigid hands or by sheer laziness in writing, affects the results to a lesser degree than if only certain types of criteria are utilized.
It is noted that the parameters are independent of each other and are processed in parallel. This is in contrast to the prior art which first sorts in accordance with the global conditions and only afterwards considers local conditions.
Fig. 3 illustrates, in flow chart format, the operations of the parameter set comparator 42 for each reference parameter set $F_j(f'_{ij})$. For each of the N normalized parameters $f'_i$, the parameter set comparator 42 first determines, in step 52, whether or not the normalized parameter is null in the input parameter set $F(f'_i)$ or the reference parameter set $F_j(f'_{ij})$. If one or both sets have a null value, the comparator 42 returns to step 50 and increments the value of i. Otherwise, in step 54, the Euclidean distance $D_i$ between the two normalized parameters $f'_{ij}$ and $f'_i$ is determined. The process is repeated for all of the N normalized parameters.
Although the parameters are independent, they are not all equally sensitive measures of shape. Therefore, they are combined together in a weighted fashion to produce the match value $M_j$. This occurs in step 56, which defines the match value $M_j$ as a weighted and normalized sum of distances $D_i$ over the set of valid normalized parameters. Thus, $M_j$ is defined as:
$$M_j = \frac{\sum_{i\text{-valid}} W_i D_i}{\text{Number-of-valid-features}} \tag{1}$$
The weights $W_i$ are determined by an off-line optimization process performed on a very large number of reference characters. The process optimizes the quality of the recognition by selecting the weighting of the parameters.
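The comparison of Fig. 3 and the match value of step 56 can be sketched as below. This is an illustration only, assuming that the match value is the weighted sum of per-parameter distances divided by the number of valid (non-null) parameters, as the prose describes. The function name is hypothetical, `None` represents a null parameter, and returning `None` when no parameter is valid is an assumption.

```python
def match_value(input_set, reference_set, weights):
    """Weighted, normalized sum of distances over the valid parameters."""
    total = 0.0
    valid = 0
    for f_in, f_ref, w in zip(input_set, reference_set, weights):
        # Step 52: skip the parameter if it is null in either set.
        if f_in is None or f_ref is None:
            continue
        # Step 54: Euclidean distance between two scalar parameters.
        distance = abs(f_in - f_ref)
        total += w * distance
        valid += 1
    if valid == 0:
        return None  # assumption: no comparable parameters, no match value
    # Step 56: normalize by the number of valid parameters.
    return total / valid
```

Dividing by the count of valid parameters keeps match values comparable between reference patterns that have different numbers of null parameters.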
Pattern classifier 44 selects, among the match values $M_j$, the pattern which best matches the input pattern in accordance with some voting procedure. Two exemplary classification procedures are illustrated in Figs. 4A and 4B. Fig. 4A illustrates a "best candidate voting scheme" and Fig. 4B illustrates a "group voting scheme".
The best candidate scheme described in Fig. 4A is simply the selection of the smallest match value and the production of the reference pattern having the corresponding index. The specific steps involve initializing the INDEX and MATCH values (step 58), looping over j (step 59), comparing $M_j$ to the current value of MATCH (step 60) and storing (step 61) $M_j$ in MATCH and j in INDEX only if $M_j$ is smaller than the current value of MATCH. The value of INDEX once the loop on j has finished is the index of the reference pattern with the best match.
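A minimal sketch of the best candidate loop follows. The function name is hypothetical, and initializing MATCH to infinity and skipping null match values are assumptions, since the text does not state the initial values.

```python
def best_candidate(match_values):
    """Return the index of the smallest match value, or None if there is none."""
    index = None            # step 58: initialize INDEX
    match = float("inf")    # step 58: initialize MATCH (assumed initial value)
    for j, m_j in enumerate(match_values):   # step 59: loop over j
        # Step 60: compare the match value with the current MATCH.
        if m_j is not None and m_j < match:
            match = m_j     # step 61: store the match value in MATCH
            index = j       # step 61: store j in INDEX
    return index
```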
The voting scheme of Fig. 4B assumes some noise in the match values and attempts to reduce its effect by selecting the best K candidates having the K lowest match values $M_j$. The K candidates are reviewed to determine if there is any group of candidates which are different versions of the same pattern. The matched pattern is that pattern which has the largest group. This method is also known as the "K Nearest Neighbor" method and is described in the article by Fukunaga, K. and Hostetler, L.D., "K-Nearest Neighbor Bayes Risk Estimation", IEEE Transactions on Information Theory, IT-21, 1975, p. 285, which article is incorporated herein by reference.
The specific steps shown in Fig. 4B are:
a) selecting (step 62) the K patterns, where K is odd, with the smallest match values $M_j$ and storing their indices j in a manner similar to that described with reference to Fig. 4A;
b) reviewing (step 64) the K patterns to determine if there is a single group which has the most members;
c) if there is a single group, selecting (step 68) one of the patterns in the group as the representative matched pattern;
d) if not, determining (step 66) if there is more than one group with the same number of members;
e) if not, selecting (step 70) the pattern with the smallest match value;
f) if yes, selecting (step 69) the group which is the largest group with the smallest average match value.
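The group voting steps can be sketched as follows. This is an illustration under stated assumptions: the function name and the `labels` list (the pattern type of each reference pattern) are hypothetical, and the representative of the winning group is taken to be its member with the smallest match value, which the text leaves open. When every group has a single member, the smallest-average rule reduces to selecting the pattern with the smallest match value.

```python
from collections import defaultdict


def group_vote(match_values, labels, k=3):
    """Return the index of the matched reference pattern, or None."""
    # Step 62: indices of the K patterns with the smallest match values.
    ranked = sorted(range(len(match_values)), key=lambda j: match_values[j])[:k]
    if not ranked:
        return None
    # Step 64: group the K candidates by the pattern they represent.
    groups = defaultdict(list)
    for j in ranked:
        groups[labels[j]].append(j)
    largest = max(len(members) for members in groups.values())
    tied = [members for members in groups.values() if len(members) == largest]
    if len(tied) == 1:
        # Step 68: a single group has the most members.
        winners = tied[0]
    else:
        # Step 69: among equally large groups, take the smallest average.
        winners = min(
            tied,
            key=lambda members: sum(match_values[j] for j in members) / len(members),
        )
    # Assumed choice of representative: the best-matching group member.
    return min(winners, key=lambda j: match_values[j])
```

With `match_values = [0.1, 0.3, 0.2, 0.9]`, `labels = ["a", "b", "b", "a"]` and `k = 3`, the two "b" candidates outvote the single closest "a" candidate.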
As mentioned before, the parameters can be any types of parameters which describe the expected patterns. With reference to Figs. 5A, 5B, 5C, 5D, 5E and 5F, the following is an exemplary set of parameters useful for identifying alphanumeric characters.
The first and second parameters are the ratios between the length of the first stroke and the length of its projection on the horizontal and vertical axes, respectively. A stroke is defined as the sample points between the pen-down and pen-up points.
Fig. 5A shows three letters, A, B and C, and their projections 80 and 82 on the horizontal and vertical axes, respectively. Since the letters are approximately the same height, their projections 82 on the vertical axis are approximately equal. However, along the horizontal axis, their projections 80 are very different. In fact, the horizontal projection 80 of the first stroke of the letter B is just a point.
The first and second parameters are formally defined as:
$$f_1 = \frac{\text{length-of-first-stroke}}{x_{max} - x_{min}} \tag{2}$$
$$f_2 = \frac{\text{length-of-first-stroke}}{y_{max} - y_{min}} \tag{3}$$
The third parameter is defined as the ratio of the lengths of the first and second strokes, or:
$$f_3 = \frac{\text{length-of-first-stroke}}{\text{length-of-second-stroke}} \tag{4}$$
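The first three parameters can be sketched as below, assuming a stroke is a list of (x, y) sample points and that stroke length is the sum of distances between neighboring sample points. The function names are hypothetical. Returning a null (`None`) when a projection or the second stroke has zero length is an assumption, motivated by the case of the letter B whose horizontal projection is a point.

```python
import math


def stroke_length(points):
    """Sum of distances between neighboring sample points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def projection_ratios(stroke):
    """First and second parameters: length over horizontal and vertical projection."""
    length = stroke_length(stroke)
    xs = [x for x, _ in stroke]
    ys = [y for _, y in stroke]
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)
    # Assumption: a zero-length projection gives a null parameter.
    f1 = length / width if width > 0 else None
    f2 = length / height if height > 0 else None
    return f1, f2


def stroke_length_ratio(first, second):
    """Third parameter: ratio of the lengths of the first and second strokes."""
    if second is None:
        return None  # no second stroke
    second_length = stroke_length(second)
    if second_length == 0:
        return None  # assumption: degenerate second stroke gives null
    return stroke_length(first) / second_length
```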
The fourth parameter is defined as the length of the portion of the first stroke beginning at the pen-down point and ending at the first feature of interest, such as a sharp angle change, a local maximum or minimum or any other pre-defined feature. The fifth parameter is defined as the length of the portion of the first stroke beginning at the pen-up point and moving backwards to the last feature of interest. These parameters are illustrated in Fig. 5B which shows the letters y, a, W and w. The fourth parameters are labeled 84, the fifth parameters are labeled 86 and the features of interest are labeled 85. For the letters y and a, which have one sharp angle change 85, the fourth and fifth parameters 84 and 86 end at the same point. The letter W has three sharp angle changes and the letter w has one local maximum.
The sharp angle change can be defined in any appropriate manner. In one embodiment, it is determined by reviewing the values of the local tangent angles at each sample point and selecting the sample point whose neighbors have significantly different tangent angles. The local maximum or minimum is defined as any point whose y or x coordinate is either larger or smaller, respectively, than those of the preceding and succeeding J points, where J is typically four.
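One possible way to detect these features of interest is sketched below in Python. The function names, the 60-degree threshold for a "significantly different" tangent angle and the decision to skip points that have fewer than J neighbors on either side are assumptions of this sketch; the text leaves them open.

```python
import math


def sharp_angle_changes(points, threshold=math.radians(60)):
    """Indices of sample points where the tangent direction turns sharply.

    The tangent angle is approximated by the direction of the segment
    entering and the segment leaving each interior sample point.
    """
    found = []
    for i in range(1, len(points) - 1):
        (x0, y0), (x1, y1), (x2, y2) = points[i - 1], points[i], points[i + 1]
        angle_in = math.atan2(y1 - y0, x1 - x0)
        angle_out = math.atan2(y2 - y1, x2 - x1)
        turn = abs(angle_out - angle_in)
        turn = min(turn, 2 * math.pi - turn)  # wrap into [0, pi]
        if turn > threshold:
            found.append(i)
    return found


def local_extrema(values, j=4):
    """(index, kind) pairs for coordinates larger or smaller than the
    J preceding and J succeeding values; kind is "max" or "min"."""
    found = []
    for i in range(j, len(values) - j):
        neighbours = values[i - j:i] + values[i + 1:i + j + 1]
        if all(values[i] > v for v in neighbours):
            found.append((i, "max"))
        elif all(values[i] < v for v in neighbours):
            found.append((i, "min"))
    return found
```

Passing the list of y coordinates to `local_extrema` gives vertical extrema, and passing the x coordinates gives horizontal ones.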
The formal definitions for the fourth and fifth parameters are:
$$f_4 = \sum_{\text{pen-down}}^{\text{first-feature-of-interest}} \Delta l \tag{5}$$
$$f_5 = \sum_{\text{pen-up}}^{\text{last-feature-of-interest}} \Delta l \tag{6}$$
where Δl is the distance between neighboring sample points.
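The two sums can be illustrated with a short Python sketch. It assumes the features of interest have already been located and are passed in as sample-point indices; the names `path_length` and `partial_lengths` are hypothetical.

```python
import math


def path_length(points, start, end):
    """Sum of the distances between neighboring sample points from index start to end."""
    return sum(math.dist(points[i], points[i + 1]) for i in range(start, end))


def partial_lengths(stroke, feature_indices):
    """Parameters 4 and 5 for one stroke.

    feature_indices holds the indices of the features of interest.
    Both parameters are null (None) when the stroke has no such feature.
    """
    if not feature_indices:
        return None, None
    first, last = min(feature_indices), max(feature_indices)
    f4 = path_length(stroke, 0, first)               # pen-down to first feature
    f5 = path_length(stroke, last, len(stroke) - 1)  # last feature to pen-up
    return f4, f5
```

Summing forwards from the last feature to the pen-up point gives the same length as moving backwards from pen-up, so the direction of traversal does not affect the value.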
The sixth parameter is the distance in the horizontal direction between the pen-down point and the first feature of interest of the first stroke. This is shown in Fig. 5C for the letters y and g and is labeled 90. The parameter is defined as:
$$f_6 = \left| x_{\text{pen-down}} - x_{\text{first-feature-of-interest}} \right| \tag{7}$$
The seventh parameter, shown in Fig. 5D by reference numeral 92, is the distance along the horizontal direction between the pen-up point and the last feature of interest in the vertical direction of the first stroke, or:
$$f_7 = \left| x_{\text{last-feature-of-interest}} - x_{\text{pen-up}} \right| \tag{8}$$
The eighth parameter, shown in Fig. 5E and labeled 94, is the distance along the horizontal axis between the pen-down and pen-up points of the first stroke, or:
$$f_8 = \left| x_{\text{pen-down}} - x_{\text{pen-up}} \right| \tag{9}$$
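The three horizontal distances can be computed together, as in the following hedged Python sketch. The function name `horizontal_distances` is hypothetical, and the sketch does not distinguish between feature types: for the seventh parameter the caller is assumed to pass only those feature indices that qualify as features in the vertical direction.

```python
def horizontal_distances(stroke, feature_indices):
    """Parameters 6, 7 and 8 for one stroke of (x, y) sample points.

    Parameters 6 and 7 are null (None) when the stroke has no feature
    of interest; parameter 8 needs only the pen-down and pen-up points.
    """
    x_down = stroke[0][0]   # pen-down point
    x_up = stroke[-1][0]    # pen-up point
    f8 = abs(x_down - x_up)
    if not feature_indices:
        return None, None, f8
    x_first = stroke[min(feature_indices)][0]
    x_last = stroke[max(feature_indices)][0]
    f6 = abs(x_down - x_first)
    f7 = abs(x_last - x_up)
    return f6, f7, f8
```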
The ninth and tenth parameters are similar to the fourth and fifth parameters but for the second stroke.
If there is no second stroke, parameters 9 and 10 receive null values. Similarly, for any of the above parameters, if the stroke has no feature of interest, the parameter receives a null value.
The eleventh parameter is defined as the distance between the centers of the first two strokes. The letter T is shown in Fig. 5F and the centers of the first and second strokes are labeled 96 and 98, respectively. The distance between them is labeled 100. Formally, parameter 11 is defined as:
$$f_{11} = \sqrt{\left(x_{cg}^{\text{first-stroke}} - x_{cg}^{\text{second-stroke}}\right)^2 + \left(y_{cg}^{\text{first-stroke}} - y_{cg}^{\text{second-stroke}}\right)^2} \tag{10}$$
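A minimal Python sketch of the eleventh parameter follows. Taking the center of a stroke to be the mean of its sample points is an assumption of this sketch, as are the function names and the `None` result when there is no second stroke.

```python
import math


def stroke_center(stroke):
    """Center of a stroke, taken here as the mean of its sample points (assumption)."""
    n = len(stroke)
    return (sum(x for x, _ in stroke) / n, sum(y for _, y in stroke) / n)


def center_distance(first, second):
    """Parameter 11: Euclidean distance between the centers of the first two strokes."""
    if not second:
        return None  # assumption: no second stroke gives a null value
    return math.dist(stroke_center(first), stroke_center(second))
```

For a letter T drawn as a horizontal bar from (0, 2) to (2, 2) and a vertical bar from (1, 2) to (1, 0), the two centers are (1, 2) and (1, 1).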
The final parameter, parameter 12, is defined as the ratio of the overall height of the pattern to its width, or:
$$f_{12} = \left(\frac{y_{\max} - y_{\min}}{x_{\max} - x_{\min}}\right)_{\text{whole-pattern}} \tag{11}$$
It will be appreciated that other parameters can also be included and that it is not necessary to include any or all of the above-described parameters.
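For completeness, parameter 12 (the height-to-width ratio of the whole pattern) might be computed as in the Python sketch below. The function name `aspect_ratio` and the `None` result for a pattern of zero width are assumptions, not part of the description.

```python
def aspect_ratio(strokes):
    """Parameter 12: overall height of the pattern divided by its overall width.

    strokes is a list of strokes, each a list of (x, y) sample points.
    """
    xs = [x for stroke in strokes for x, _ in stroke]
    ys = [y for stroke in strokes for _, y in stroke]
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)
    # Assumption: a pattern with zero width yields None rather than an error.
    return height / width if width > 0 else None
```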
It will be appreciated by persons skilled in the art that the present invention is not limited to what has been particularly shown and described hereinabove. Rather the scope of the present invention is defined by the claims which follow:
Claims
1. A handwritten pattern recognition system for recognizing an input pattern, the system comprising: a. a first plurality of independent parameter determination modules, each receiving said input pattern, for determining a set of parameters of said input pattern, wherein said parameters comprise at least one of local parameters, global parameters and stroke-based parameters; b. a reference pattern database storing sets of reference parameters, one for each of a second plurality of reference patterns; c. a parameter match determiner for determining a parameter match value of each input parameter with its corresponding parameter of said reference parameter sets and for producing overall match values for at least selected ones of said reference parameter sets as a function of said parameter match values for each reference parameter in said selected reference parameter sets; and d. a pattern classifier for classifying said input pattern as one of said reference patterns by voting among said overall match values.
2. A system according to claim 1 and wherein said pattern classifier comprises best candidate means for selecting the reference pattern with the smallest match value.
3. A system according to claim 1 and wherein said pattern classifier comprises K nearest neighbor means for selecting the group of reference patterns having the K smallest match values, for dividing said group into classes according to which type of pattern they represent and, if available, for selecting the class having the most reference patterns therein.
4. A system according to claim 1 and wherein said parameter determination modules comprise parameter generators and normalizers.
5. A system according to claim 1 and wherein said input pattern is provided as a sequence of sample points.
6. A system according to claim 1 and wherein said parameters include the length of a stroke from pen-down to the first feature of interest.
7. A system according to claim 1 and wherein said parameters include the length of a stroke from pen-up back to the last feature of interest.
8. A system according to claim 1 and wherein said parameters include the horizontal distance from pen-down to the first feature of interest.
9. A system according to claim 1 and wherein said parameters include the horizontal distance from pen-up back to the last feature of interest.
10. A system according to claim 7 and wherein said features of interest comprise at least one of local vertical minimum, local vertical maximum, local horizontal minimum, local horizontal maximum and sharp angle change.
11. A system according to claim 1 and wherein said parameters include the ratio of the length of a stroke to its projection on one of the horizontal and vertical axes.
12. A system according to claim 1 and wherein said parameters include the ratio of the lengths of the first two strokes of one pattern.
13. A system according to claim 1 and wherein said parameters include the distance between the centers of the first two strokes of one pattern.
14. A system according to claim 1 and wherein said parameters include the aspect ratio of the pattern.
15. A system according to claim 1 and wherein said reference pattern database is a portion of said database as selected by another handwriting recognition unit.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU36760/95A AU3676095A (en) | 1994-09-22 | 1995-09-20 | Handwritten pattern recognizer |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
IL111,039 | 1994-09-22 | ||
IL11103994A IL111039A (en) | 1994-09-22 | 1994-09-22 | Handwritten pattern recognizer |
Publications (1)
Publication Number | Publication Date |
---|---|
WO1996009607A1 (en) | 1996-03-28 |
Family
ID=11066571
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US1995/011664 WO1996009607A1 (en) | 1994-09-22 | 1995-09-20 | Handwritten pattern recognizer |
Country Status (4)
Country | Link |
---|---|
US (1) | US6023529A (en) |
AU (1) | AU3676095A (en) |
IL (1) | IL111039A (en) |
WO (1) | WO1996009607A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6647145B1 (en) | 1997-01-29 | 2003-11-11 | Co-Operwrite Limited | Means for inputting characters or commands into a computer |
Families Citing this family (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3702978B2 (en) * | 1996-12-26 | 2005-10-05 | ソニー株式会社 | Recognition device, recognition method, learning device, and learning method |
US6111985A (en) * | 1997-06-06 | 2000-08-29 | Microsoft Corporation | Method and mechanism for providing partial results in full context handwriting recognition |
US6167411A (en) * | 1998-06-22 | 2000-12-26 | Lucent Technologies Inc. | User interface for entering and editing data in data entry fields |
US6304667B1 (en) | 2000-06-21 | 2001-10-16 | Carmen T. Reitano | System and method for incorporating dyslexia detection in handwriting pattern recognition systems |
US6724936B1 (en) | 2000-08-23 | 2004-04-20 | Art-Advanced Recognition Technologies, Ltd. | Handwriting input device and method using a single character set |
US7177473B2 (en) * | 2000-12-12 | 2007-02-13 | Nuance Communications, Inc. | Handwriting data input device with multiple character sets |
WO2003023696A1 (en) * | 2001-09-12 | 2003-03-20 | Auburn University | System and method of handwritten character recognition |
US20050152600A1 (en) * | 2004-01-14 | 2005-07-14 | International Business Machines Corporation | Method and apparatus for performing handwriting recognition by analysis of stroke start and end points |
US7298904B2 (en) * | 2004-01-14 | 2007-11-20 | International Business Machines Corporation | Method and apparatus for scaling handwritten character input for handwriting recognition |
US7756337B2 (en) * | 2004-01-14 | 2010-07-13 | International Business Machines Corporation | Method and apparatus for reducing reference character dictionary comparisons during handwriting recognition |
US7490033B2 (en) * | 2005-01-13 | 2009-02-10 | International Business Machines Corporation | System for compiling word usage frequencies |
JP5735126B2 (en) * | 2013-04-26 | 2015-06-17 | 株式会社東芝 | System and handwriting search method |
US9525638B2 (en) | 2013-10-15 | 2016-12-20 | Internap Corporation | Routing system for internet traffic |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5341438A (en) * | 1992-07-22 | 1994-08-23 | Eastman Kodak Company | Method and apparatus for segmenting and classifying unconstrained handwritten characters |
US5361379A (en) * | 1991-10-03 | 1994-11-01 | Rockwell International Corporation | Soft-decision classifier |
US5392363A (en) * | 1992-11-13 | 1995-02-21 | International Business Machines Corporation | On-line connected handwritten word recognition by a probabilistic method |
Family Cites Families (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPS51118333A (en) * | 1975-04-11 | 1976-10-18 | Hitachi Ltd | Pattern recognition system |
US4193056A (en) * | 1977-05-23 | 1980-03-11 | Sharp Kabushiki Kaisha | OCR for reading a constraint free hand-written character or the like |
JPS5580183A (en) * | 1978-12-12 | 1980-06-17 | Nippon Telegr & Teleph Corp <Ntt> | On-line recognition processing system of hand-written character |
US4531231A (en) * | 1983-01-19 | 1985-07-23 | Communication Intelligence Corporation | Method for distinguishing between complex character sets |
US4718102A (en) * | 1983-01-19 | 1988-01-05 | Communication Intelligence Corporation | Process and apparatus involving pattern recognition |
JPS6079485A (en) * | 1983-10-06 | 1985-05-07 | Sharp Corp | Handwriting character recognition processing device |
US5060277A (en) * | 1985-10-10 | 1991-10-22 | Palantir Corporation | Pattern classification means using feature vector regions preconstructed from reference data |
JPH01246678A (en) * | 1988-03-29 | 1989-10-02 | Toshiba Corp | Pattern recognizing device |
US5048100A (en) * | 1988-12-15 | 1991-09-10 | Michael Kuperstein | Self organizing neural network method and system for general classification of patterns |
US5125039A (en) * | 1989-06-16 | 1992-06-23 | Hawkins Jeffrey C | Object recognition system |
US5577135A (en) * | 1994-03-01 | 1996-11-19 | Apple Computer, Inc. | Handwriting signal processing front-end for handwriting recognizers |
Date | Office | Application Number | Publication | Status |
---|---|---|---|---|
1994-09-22 | IL | IL11103994A | IL111039A | Not active: IP Right Cessation |
1995-09-14 | US | US08/528,293 | US6023529A | Not active: Expired - Fee Related |
1995-09-20 | AU | AU36760/95A | AU3676095A | Not active: Abandoned |
1995-09-20 | WO | PCT/US1995/011664 | WO1996009607A1 | Active: Application Filing |
Also Published As
Publication number | Publication date |
---|---|
AU3676095A (en) | 1996-04-09 |
US6023529A (en) | 2000-02-08 |
IL111039A0 (en) | 1994-11-28 |
IL111039A (en) | 1998-08-16 |
Legal Events
Code | Title | Description |
---|---|---|
AK | Designated states | Kind code of ref document: A1. Designated state(s): AM AT AU BB BG BR BY CA CH CN CZ DE DK EE ES FI GB GE HU IS JP KE KG KP KR KZ LK LR LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK TJ TM TT UA UG UZ VN |
AL | Designated countries for regional patents | Kind code of ref document: A1. Designated state(s): KE MW SD SZ UG AT BE CH DE DK ES FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN ML MR NE SN TD TG |
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | |
121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
REG | Reference to national code | Ref country code: DE. Ref legal event code: 8642 |
122 | Ep: pct application non-entry in european phase | |