WO1998055957A1 - Partial results in full context handwriting recognition - Google Patents

Partial results in full context handwriting recognition

Info

Publication number
WO1998055957A1
Authority
WO
WIPO (PCT)
Prior art keywords
alternates
alternate
character
previous
cost
Application number
PCT/US1998/011481
Other languages
French (fr)
Inventor
Patrick M. Haluptzok
Gregory N. Hullender
Original Assignee
Microsoft Corporation
Application filed by Microsoft Corporation
Priority to AU77229/98A
Publication of WO1998055957A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/22Character recognition characterised by the type of writing
    • G06V30/226Character recognition characterised by the type of writing of cursive writing
    • G06V30/2268Character recognition characterised by the type of writing of cursive writing using stroke segmentation
    • G06V30/2276Character recognition characterised by the type of writing of cursive writing using stroke segmentation with probabilistic networks, e.g. hidden Markov models

Definitions

  • the invention relates generally to the input of user information into computer systems, and more particularly to the recognition of handwritten characters input by a user.
  • Context-based recognition is done in one of two ways, i.e., a backward context approach or a full context approach.
  • Backward context recognition analyzes previous characters, if any, to determine the best translation of the current character.
  • a benefit of backward context recognition is that each entered character is recognized and immediately displayed for user verification.
  • a full context approach looks at the characters both before and after each character in a string to help in the recognition thereof.
  • Backward context recognition is less accurate than full context recognition because by only looking backwards, only half of the available context is used.
  • a handwritten string such as "|O0" may be intended to be "one-zero-zero" or the first three letters of a word such as "loops," and both readings are reasonably probable.
  • the backwards context recognition has to make its initial decision based solely on the first character, which may be incorrect, particularly if the character shape closely resembles the shape of one or more other characters. Thereafter, subsequent characters are analyzed relative to this initial guess, e.g., a letter more likely follows a letter than a number. As can be appreciated, if the initial guess is incorrect, with backwards context recognition some or all subsequent characters may also be incorrect, further compounding the error.
  • Full (forwards and backwards) context recognition is thus more accurate than backwards recognition because of the ability to use all rather than half of the available context information.
  • a first approach to full context recognition is to wait for the user to complete all characters in a given string (with the user indicating completion) and then analyze the full string.
  • Full context thus provides a relatively high level of recognition accuracy, but also has a number of drawbacks associated therewith.
  • the user receives no feedback while entering the characters until the user indicates that the string is complete.
  • Second, there is a noticeably long delay between receipt of the string and the completion of the recognition operation.
  • the memory requirements for keeping a potentially unlimited amount of electronic ink in the memory for a potentially unlimited amount of characters would overwhelm many systems, particularly hand-held (palmtop) computing devices in which handwriting recognition is commonplace.
  • Another object is to provide a full context recognition method and mechanism that facilitates rapid visual feedback while having relatively low memory requirements.
  • Yet another object is to provide a method and mechanism of the above kind that is fast, reliable, cost-efficient, flexible and extensible.
  • the present invention provides a method and mechanism for recognizing handwritten characters previously entered into the system from the context of subsequent handwritten characters presently being entered into the system.
  • each handwritten character entered is associated with a plurality of alternate code points therefor, with each alternate code point having probability information associated therewith.
  • the alternate code points therefor are placed at the end of a queue, and a cost is determined from each alternate code point to any immediately preceding alternate in the queue. The cost is based on the probability information of the alternates and a transition cost therebetween. Then, the lowest cost path back from each of the alternates at the end of the queue to an alternate at the beginning of the queue is determined.
  • each lowest cost path back converges at (traverses) a common alternate in the queue
  • the common alternate and any alternates previous thereto on the path back are recognized as the code points for each of the handwritten characters associated therewith.
  • the alternates corresponding to recognized code points are removed from the queue, and the code points appropriately displayed on a screen as recognized characters, to allow editing thereof or the like.
  • FIGURE 1 is a block diagram representing a computer system into which the present invention may be incorporated;
  • FIG. 2 is a representation of an exemplary touch- sensitive screen for inputting user information to the system and for outputting system information to the user;
  • FIG. 3 is a block diagram representing functional components for recognizing handwritten characters according to the present invention;
  • FIGS. 4 - 6 represent an exemplary sequence of handwritten user characters with corresponding character alternatives returned by a shape matcher and buffered and analyzed for context recognition thereof;
  • FIGS. 7 - 12 represent an alternate exemplary sequence of handwritten user characters with corresponding character alternatives returned by a shape matcher and buffered and analyzed for context recognition;
  • FIG. 13 is a representation of a simplified state machine used to perform context recognition on the characters of FIGS. 7 - 12;
  • FIG. 14 is a representation of a lattice of alternative paths and showing cost information of some of the paths for the buffered alternative characters of FIG. 9;
  • FIGS. 15 - 16 comprise a flow diagram representing the general steps taken to perform full context recognition while displaying partial results in accordance with one aspect of the present invention
  • FIG. 17 is a representation of a dictionary state machine used to perform context recognition
  • FIG. 18 is a representation of a bigram state machine used to perform context recognition
  • FIG. 19 is a representation of a lattice of alternative paths resulting from the state machines of FIGS. 17 and 18.
  • in FIG. 1 there is shown a computer system generally designated 20 into which the present invention may be incorporated.
  • the computer system 20 includes a processor 22 operatively connected to storage 24, the storage including read only memory (ROM), random access memory (RAM) and non-volatile storage such as a hard disk drive, optical drive or the like.
  • the processor 22 also connects through I/O circuitry 28 to one or more input devices 30₁ - 30₃, such as a keyboard and/or a pointing device such as a mouse.
  • the system 20 includes means for inputting pen-based information, including handwritten characters, such as a pen and/or pen-sensitive (e.g., touch-sensitive) input device 30₃.
  • the system 20 also includes at least one local output device 34 connected to the I/O circuitry 28 for communicating information, such as via a graphical user interface, to the user of the system 20.
  • the output device 34 may include a liquid crystal display screen or the like integrated with the pen-input device 30₃, which, taken together, can be considered a pen-sensitive screen 36 (FIG. 2).
  • a preferred system 20 is a hand-held personal computing device running on the Windows CE operating system loaded in the storage 24. At least one application program such as a word processing program may also be loaded into the storage 24.
  • the pen-sensitive screen 36 includes a visible manifestation of a text buffer 38 for displaying and allowing edit control of the text characters that have been input into and recognized by the system 20.
  • the system 20 recognizes the character in part by its shape, from among a set of possible computer characters (e.g., ASCII or Unicode code points).
  • the system 20 also recognizes the character in part from its full context, i.e., its surrounding characters both forward and backward.
  • the code point is transferred to the text buffer 38, i.e., recognized character buffer 38, and displayed therein in an appropriate font and font size, when it is established that the additional context cannot change the code point regardless of further character input.
  • the touch screen 36 preferably includes one or more button input areas 40₁ - 40₃ (FIG. 2), which may provide a virtual keyboard through which the user may directly input characters, character-based editing commands and so on.
  • the buttons may represent characters, such that when contacted by a pen, the character is input into the system as if it were typed by the user on a conventional keyboard.
  • the user may use such buttons to enter editing commands, such as Backspace or Enter commands, by touching the screen 36 at an appropriately displayed button in one of the areas 40₁ - 40₃.
  • the display/touch-sensitive screen 36 typically includes two or more pen input display areas 42 - 44.
  • the pen input display areas 42 - 44 are used to receive handwritten characters from the user in the form of strokes.
  • the user typically alternates character input between the boxes 42 - 44. For example, a user will write in the first box 42, with the system echoing the pen movement on the screen by darkening or coloring the contacted areas, giving the user the impression that the pen has ink therein.
  • when the user begins writing in the second box 43, it is assumed that the character in the first box 42 is complete.
  • the stroke information from the first box 42 is thus sent to a recognition process 48, and the displayed strokes (ink) in the first box 42 are erased.
  • the recognizer runs on a separate thread whereby it can recognize a character while the user is writing the next character.
  • the same actions occur in the second box 43, i.e., the strokes are sent for character recognition thereof and the second box 43 erased upon a detection that the user is now writing in the third box 44 (or has returned to write in the first box 42).
  • a time-out when no pen activity is detected is also treated as a completed character, as is a direct command from the user, entered via one of the button input areas 40₁ - 40₃ or the like, indicating that the user is finished with a character.
  • the recognition process 48 receives the character and sends the character data (e.g., the stroke or bitmap information) to a shape matcher 50.
  • the shape matcher 50 compares the character data against shape data 52 maintained in the storage 24.
  • the shape data 52 may be organized in one or more databases containing character shape information accumulated from samples taken from thousands or even millions of users.
  • the shape matcher 50 returns a list of possible alternates 54 (i.e., possible code points) for the character, each alternate having a probability (or some other weighted value) associated therewith indicative of the likelihood of the user-entered character's match with the sample data. For example, the leftmost part of FIG. 4 shows one such (hypothetical) ranked list 54₁ of five alternates for a handwritten character intended by the user to be a capital "P": the lowercase "p" may have a fifty percent chance of being what the user intended, the uppercase "P" a forty-eight percent chance, the "D" a one percent chance, and so on.
  • the shape matcher 50 may return in the list 54 any number of alternative characters, or a fixed number of alternates such as the most-probable twenty.
  • the preferred shape matcher 50 returns a cost proportional to how unlikely each alternative is.
  • the cost equals the negative of the natural log of the probability, i.e., cost = -ln(p), which allows the lowest cost path to be determined as described below. Negative natural logs are used since some of the probabilities can be very small percentages, and since summing the costs (performed by a known algorithm) corresponds to multiplying the percentages.
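  • as a concrete illustration (a minimal Python sketch; the helper name and list format are illustrative, not from the patent), the conversion from the shape matcher's percentages to costs looks like this:

        import math

        def to_costs(alternates):
            # Convert (code_point, probability) pairs from the shape matcher
            # into (code_point, cost) pairs, where cost = -ln(probability).
            return [(cp, -math.log(p)) for cp, p in alternates]

        # The "P"-shaped character of FIG. 4: 50% "p", 48% "P", 1% "D", ...
        ranked = to_costs([("p", 0.50), ("P", 0.48), ("D", 0.01)])
        # [("p", 0.6931), ("P", 0.7340), ("D", 4.6052)] -- lower cost = more likely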
  • each of the alternates of the preceding character has a cost associated with transitioning to each of the alternates of the most recent character. As can be appreciated, this forms a lattice of possible paths (e.g., FIG. 14) between the most recent character's alternates and the previous character's alternates.
  • a state machine 58, statistically derived from a language model or some other source and kept in the storage 24, maintains information needed to determine the costs of transitioning from each alternative of the last-written character to each of the alternatives of the preceding character.
  • the recognition process 48 uses a context computation process 60 to compute the total cost to reach each of the character alternatives of the last written character.
  • a preferred context computation process uses the algorithm described in "The Viterbi Algorithm," G. David Forney, Jr., Proceedings of the IEEE, Vol. 61, No. 3 (1973).
  • the Viterbi algorithm is known in the character recognition art and is thus not described in detail herein, except to summarize that the algorithm finds the lowest cost path from among a lattice of possible paths between nodes, wherein each node has a cost associated therewith and each branch between any two nodes has a cost associated therewith.
  • the cost of each path, including the costs of the alternates and the transition costs, is computed as each character is written.
  • the lowest cost path to each of the last-written character's alternates is saved along with its cost. More particularly, a backpointer from each of the last-written alternates, pointing along the lowest cost backpath to a predecessor alternate, is saved, along with the cost, in a pointer-cost data table 62 or the like for use by the recognition process 48 as described below.
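  • the per-character update can be sketched as follows (illustrative Python; the data layout and names are assumptions, not the patent's implementation). Each new alternate records its total cost and a backpointer to the cheapest predecessor:

        def viterbi_step(prev_column, new_alternates, transition_cost):
            # prev_column:     list of (code_point, total_cost_so_far) for the
            #                  previous character box
            # new_alternates:  list of (code_point, shape_cost) from the shape matcher
            # transition_cost: function (prev_cp, new_cp) -> cost, e.g. the negative
            #                  natural log of the language model transition probability
            column = []
            for cp, shape_cost in new_alternates:
                best_cost, best_back = min(
                    (prev_cost + transition_cost(prev_cp, cp) + shape_cost, i)
                    for i, (prev_cp, prev_cost) in enumerate(prev_column)
                )
                column.append((cp, best_cost, best_back))  # back = index into prev_column
            return column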
  • FIG. 4 shows five alternates in a list 54₂ within the context recognition buffer 56 for the second box, intended by the user to be a lowercase "a" character.
  • Each of the alternates in the list 54₂ has a cost to get thereto from the alternates in the list 54₁. The lowest of these costs is saved as a backpointer, shown as an arrow in FIGS. 4 - 6. Note that the various characters and paths shown in this example are only for the purposes of providing a conceptual illustration, and are not necessarily derivable from any real data.
  • the lowest cost path to get to the lowercase "o" character in the list 54₂ is from the lowercase "p" alternate in the list 54₁, the lowest cost path to the "a" in 54₂ is from the capital "P" in 54₁, and so on.
  • backpointers are stored for each of these lowest cost paths.
  • not all of the previous character's alternates (e.g., the "b" and "f" in the list 54₁ of FIG. 4) have a backpointer thereto, since they are not on any of the last-written alternates' lowest cost paths.
  • conversely, those alternates that do have backpointers pointing thereto (shown as shaded in FIGS. 4 - 6) are considered to be active.
  • FIG. 6 a "Pa” will be placed in the recognized character buffer 38 and the lists 54 ⁇ and 54 2 removed from the context recognition buffer 56.
  • FIGS. 7 - 12 show the characters as they are added to the system (e.g., the user has just begun writing in the next box), with the underscore character in FIG. 12 representing a space character.
  • the simplified state machine 60₁ represented in FIG. 13 is used to determine transition costs, with only lowercase letters (including the space character) and the numerical digits zero through nine allowed.
  • the set of lowercase letters, represented by "l.c." in FIG. 13, has a seventy-five percent chance of being followed by another lowercase letter and a twenty-five percent chance of being followed by a number.
  • the set of numbers, represented by an "N" in FIG. 13, has a seventy-five percent chance of being followed by another number and a twenty-five percent chance of being followed by a letter.
  • more complex state machines based on various language models or the like may be used to determine transition costs in actual systems.
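  • the two rules above translate into a transition cost function such as the following sketch (the patent specifies only the probabilities, not this code; note that the space character tests as a non-digit and thus falls in the letter class, as the state machine requires). A function of this shape can serve as the transition_cost argument of the viterbi_step sketch above:

        import math

        def transition_cost(prev_cp, new_cp):
            # 75% chance of staying in the same class (letter-to-letter or
            # digit-to-digit), 25% chance of switching classes, per FIG. 13.
            same_class = prev_cp.isdigit() == new_cp.isdigit()
            return -math.log(0.75 if same_class else 0.25)  # 0.2877 or 1.3863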
  • the system considers a character to be entered when the user begins writing in another box, stops writing for a sufficiently long period of time (times out) , or directly commands the system to enter the character.
  • the recognition process 48 begins when an input character's data is received at step 800 (FIG. 15), as shown in FIG. 7 (wherein the user has just begun to write in another box).
  • the recognition process 48 passes the character information (e.g., coordinates representing the strokes, a bitmap or some other appropriate format) to the shape matcher 50 for recognition thereof.
  • the shape matcher 50 returns a ranked list of alternates 54₁, and the ranked list 54₁ is appended to the (currently empty) context recognition buffer 56.
  • for the vertical stroke, the two top alternatives are a one ("1") and a lowercase letter-L ("l"), each having a forty-nine percent probability (cost = -ln(0.49) = 0.7133); other character alternates such as an "i" (assumed undotted) and a "t" (uncrossed) may also be returned with much lower probabilities but are not separately shown.
  • the process determines that the user is not done writing, an act which would force the end of the recognition process as described below. For now, the process 48 continues to step 808.
  • at step 808, a set of currently active alternates is established as all of the alternates returned in the list 54₁ (by the shape matcher 50) for the last character written, the vertical stroke.
  • Step 810 is a computation process for determining the lowest cost paths between alternates of adjacent entries, but since this is the first entry, the only cost is the cost of each alternate at this time.
  • the process thus continues to step 820 (FIG. 16) to test if the number of active alternates in the current set is one. Since the shape matcher 50 has returned more than one alternate for this first character, all of which are active, the process continues to step 822.
  • Step 822 tests to determine if the current set of active alternates belongs to the first character in the context recognition buffer 56. At this time, this is true, so step 822 returns to step 800 (FIG. 15) to await the next handwritten character.
  • the circular-shaped character information is received at step 800, and sent to the shape matcher at step 802.
  • a list of alternates 54₂, including a lowercase "o" and a zero ("0"), is returned and placed in the context recognition buffer 56 at step 804, the zero shown with a slash therethrough in FIG. 9.
  • Step 806 again tests to determine if the user is done writing, and since the user is still writing, the process continues to step 808.
  • the currently active alternates are the alternates of the last box written, i.e., the "o" and the "0" (plus others, not separately shown).
  • the context recognition process 48 uses the state machine 60₁, along with the costs returned by the shape matcher with each alternate, to determine the lowest cost path from the previous alternates to each of the alternates returned for the last written (circular shaped) character.
  • FIG. 14 shows the lattice of paths and the cost information for the paths, shown as percentage probabilities for comparative readability purposes. For additional clarity, the following table sets forth the transition costs for this lattice (with the lowercase "l" shown as a capital and the numbers written out to avoid confusion) based on the state machine 60₁ of FIG. 13:
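  • (the table itself did not survive extraction; the following reconstruction follows directly from the 75/25 rules of the state machine 60₁, with costs computed as negative natural logs of the probabilities)

        From "L" (the letter el) to "o":    75%, cost -ln(0.75) = 0.2877
        From "L" (the letter el) to "zero": 25%, cost -ln(0.25) = 1.3863
        From "one" (the digit)   to "o":    25%, cost -ln(0.25) = 1.3863
        From "one" (the digit)   to "zero": 75%, cost -ln(0.75) = 0.2877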
  • the context computation process 60 (Viterbi algorithm) is used at step 810 to determine the lowest cost paths to the alternates of the last written character by summing the costs. As apparent from an analysis of FIG. 14 and as shown in FIG. 9, backpointers from the zero ("0") to the one ("1") and from the "o" to the letter "l" show the results of the computation.
  • the process 48 then continues to step 820 of FIG. 16. Since there are at least these two alternates, step 820 of FIG. 16 branches to step 822, wherein it is determined that the list 54₂ does not represent the first character in the context recognition buffer 56.
  • step 822 thus branches to step 824, where the currently active alternates are changed to the alternates representing the previous character that are active (have backpointers thereto), i.e., the one ("1") and the letter "l" in the list 54₁.
  • each of the alternates in the list 54₂ that are not separately shown also point to either the one ("1") alternate or the letter "l" alternate. This is because of the low cost of these top two alternates relative to the possible transition costs under the rules of the state machine 60₁.
  • step 820 branches to step 822, where this time it is determined that the set of currently active alternates belongs to the list representing the first character in the context recognition buffer 56. Step 822 thus returns to step 800 of FIG. 15.
  • step 820 again fails to detect any list with only one active alternate therein before the first character is reached at step 822.
  • the next entered character begins to change the context in a way that will ultimately decide the characters in the string. More particularly, the "p" character in the list 54₄ receives a high probability, such as ninety-five percent (i.e., a low cost, equal to 0.0513), while the other characters such as the "8" receive a much lower probability, such as four percent (cost equals 3.2189). However, this time through the recognition process 48, the costs for all of the alternates for the last written character are computed at step 810, with a backpointer assigned to each as shown in FIG. 11.
  • the backpointer from the "8" points to the previous character's zero ("0") alternate.
  • the user completes the next, s-shaped character and begins another character (by causing a space character to be entered in some manner) as shown in FIG. 12.
  • the s-shaped character is sent to the shape matcher 50.
  • a "5" and an "s” is returned at step 804 with some cost assigned to each, along with other alternates in the list 54 5 .
  • the user is not finished writing at step 806, and thus at step 808 the currently active alternates are the "5" and the "s" (plus others, not separately shown) of the list 54₅.
  • at step 810, because of the low cost of the "p," the cost from the "5" to the "p" is still lower than from the "5" to the "8," notwithstanding the transition cost differential from the state machine 60₁.
  • indeed, 95 percent times 25 percent (0.2375) is greater than 4 percent times 75 percent (0.03), and thus the backpointer from the "5" is to the preceding "p" alternate instead of the preceding "8" alternate.
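  • in cost terms (a quick check using the numbers above; the shape cost of the "5" itself is common to both backpaths and drops out of the comparison): via the "p", 0.0513 + (-ln 0.25) = 0.0513 + 1.3863 = 1.4376; via the "8", 3.2189 + (-ln 0.75) = 3.2189 + 0.2877 = 3.5066. The path through the "p" therefore wins despite the cheaper digit-to-digit transition.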
  • the process continues to step 820 of FIG. 16. At step 820, because there are at least these two alternates, the process branches to step 822.
  • Step 822 determines that this is not the first character in the context recognition buffer 56, and thus continues to step 824 where the currently active alternate set is changed to point to the "p" alternate in the previous list of alternates. Note that only the "p” alternate is placed in the set because only the "p” alternate is active, i.e., has a backpointer thereto.
  • step 820 determines that the number of currently active alternates is now one, and thus branches to step 826.
  • the sole currently active alternate, the "p" alternate, is placed, along with its preceding alternates on its lowest cost backpath (i.e., "loop"), into the recognized character buffer 38.
  • the lists 54₁ - 54₄ corresponding to these now-recognized characters are removed from the context recognition buffer 56, whereby the list 54₅ containing the "5" and the "s" alternates becomes the first entry in the context recognition buffer 56.
  • the above-described process thus places partial results into the recognized character buffer 38 as soon as available, yet uses a full context recognition process that is as accurate as waiting for an entire string to be completed before analysis thereof.
  • the string "loop" was somewhat of an extreme example chosen to show the advantages over backwards-only recognition, in practice it has been found that one to three characters are ordinarily all that are needed by the process of the present invention before the backpointers converge to a common alternate. This provides acceptably rapid visual feedback along with the improved accuracy of full context recognition, while requiring only a small amount of memory for electronic ink.
  • characters are not later changed regardless of further characters input into the string.
  • step 806 will detect that there are no more characters forthcoming from which forward context recognition can be accomplished. At such a time, the process 48 branches to step 812 to find the lowest-cost path back in order to determine the best characters of those remaining. Note that step 812 is just the conventional use of the Viterbi algorithm, used as if the remaining string was the full string entered all-at-once for recognition.
  • the above-described simplified state machine has the properties that every state is connected to every other state and that any given code point always transitions to the same state regardless of the state in which it started: any digit always goes to the digit state and any letter always goes to the letter state. In such an instance, the code point always uniquely determines the state, so there is no need to store the state separately. Moreover, every state is a valid state in which an input string may end.
  • the simplified machine is a bigram state transition machine, which has one state per character in the alphabet; each state is connected to every other state (including itself).
  • a slightly more complex bigram state machine is arranged as a table of transition costs corresponding to the probabilities of the letter bigrams implied by the states, i.e., the cost to transition from "a" to "b" is derived from the probability of seeing an "ab" bigram in normal text.
  • the shape matcher returns a list of code point, score pairs.
  • the set of lists maintained by the context recognition process (engine) is actually a list of triples, i.e., (code point, score, state), along with a backpointer.
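  • as a data structure, each entry is then effectively the quadruple used in the example below (an illustrative Python rendering; the field names are not from the patent):

        from dataclasses import dataclass
        from typing import Optional

        @dataclass
        class Entry:
            code_point: str                 # proposed character for this ink box
            state: int                      # language model state reached
            score: float                    # accumulated path cost (lower is better)
            back: Optional["Entry"] = None  # lowest cost predecessor entry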
  • the shape matcher returns a list of alternates which is made up of character code points with associated probabilities.
  • the context recognition process (i.e., the context engine) then generates, for each code point alternate, the language model states that can be reached via that code point.
  • the language model states correspond to different paths through the character alternate arrays that are valid character transitions in the language model.
  • the context recognition process computes the total path score for each of the new language model states by summing the cost of the character alternative assigned by the shape matcher with the cost of the transition from the previous language model state to the current language model state.
  • FIG. 17 shows a simplified dictionary state machine 70 having only three words therein which are relevant to the following example, i.e., "cattle", "oat" and "octant". Note that the "v" shown in states 13, 7 and 14 indicates a valid end of a word (termination) as described below.
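  • such a dictionary state machine is simply a trie over the word list, with word-end states flagged as valid; a minimal Python sketch (the state numbering will not match FIG. 17, but the shape is the same):

        def build_trie(words):
            # transitions[state][char] -> next state; 'valid' holds the states
            # at which a word may legally end (the "v" states of FIG. 17).
            transitions, valid = [{}], set()
            for word in words:
                state = 0
                for ch in word:
                    if ch not in transitions[state]:
                        transitions.append({})
                        transitions[state][ch] = len(transitions) - 1
                    state = transitions[state][ch]
                valid.add(state)
            return transitions, valid

        transitions, valid = build_trie(["cattle", "oat", "octant"])
        # From the shared start state 0, only "c" and "o" lead anywhere, which
        # is why other first-character alternates are pruned against this machine.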
  • FIG. 18 shows a simplified eight-character bigram state machine 72 that will also be used in the present language model example, with the character for transitioning to each state shown as a character in parentheses within the numbered state.
  • the bigram state machine 72 starts with a low probability, such as one percent, corresponding to the fact that most words (ninety-nine percent) are expected to be in the dictionary 70. This means that the process selects a dictionary word if one is reasonable, but can still select a non-dictionary word. Note that more than one such state machine is included so that the user is not limited to writing only dictionary words; otherwise, as will become apparent below, all paths would be pruned whenever a word was not in the dictionary. Indeed, a bigram state machine such as 72, with each state being valid to terminate an input string (and thus which cannot fail), is ordinarily provided for this purpose.
  • the context recognition uses the language model state alternative generating process mentioned above, e.g., on the first character box, every code point in the first box's ranked list of codepoint alternatives is used to generate the list of language model states that could be transitioned to.
  • each code point alternative generates two language model state alternatives, with the one at the top pointing to the dictionary 70 and the one at the bottom pointing to the bigram state machine 72. Note that since the same character can transition to two different language model states at the same time, this is the same as creating a single nondeterministic state machine from the combination of the various state machines.
  • Nondeterministic state machines are described by Knuth, The Art of Computer Programming, Addison-Wesley (1973) .
  • both state machines 70, 72 share a common initial start state, indicated by the zero state in both FIGS. 17 and 18.
  • four initial language model state alternatives are created from the first box's proposed codepoint alternatives of a "c" (.75 chance) and an "o" (.25 chance), each code point receiving both a dictionary state and a bigram state.
  • when the code point alternate list for the next ink box is returned, i.e., the "a" (.75 chance) and the "c" (.25 chance), the context recognition process generates the language model state alternatives that can be transitioned to from the previous box's four language model state alternates via the two code point alternates returned in the alternate list.
  • the two code point alternatives thus generate as many as eight language model states in this example. However, based on the state machine 70, not all of these eight state machine alternatives are viable. More particularly, as shown in FIG. 19, the previous "c" can transition to the "a", and the previous "o" to either the "a" or the "c" alternates, but the dictionary state machine 70 does not allow the previous "c" to transition to another "c". This eliminates the corresponding potential quadruple from the list. Moreover, some of the states in the language model state alternate list for this character box are duplicates. As the Viterbi algorithm is being used to find the lowest cost path through the language model states, only the lowest cost path to a state needs to be kept.
  • the highest cost (lowest probability) duplicates are pruned, whereby the character that made the transition with the lowest cost path to a previous state is kept.
  • the search for duplicates and their removal is preferably done as the list is built, thus dynamically pruning the number of paths.
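  • the pruning rule amounts to keeping, per language model state, only the cheapest entry as the next column is assembled (a sketch using the Entry structure above):

        def add_entry(column, entry):
            # column: dict mapping language model state -> best Entry so far.
            # Keep only the lowest cost path to each state (Viterbi pruning).
            kept = column.get(entry.state)
            if kept is None or entry.score < kept.score:
                column[entry.state] = entry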
  • the quadruples for the second box are [a, 3, (.75*.75*.99*.5), ←c,1], [a, 100, (.75*.75*.01*.125*.125), ←c,101], [a, 4, (.25*.75*.99*.5*.5), ←o,2], [a, 100, (.25*.75*.01*.125*.125), ←o,104], [c, 101, (.25*.75*.01*.125*.125), ←c,101], [c, 5, (.25*.25*.99*.5*.5), ←o,2], and [c, 101, (.25*.25*.01*.125*.125), ←o,104], where the "←" represents a backpointer to a previous (code point alternate, state) pair.
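  • decoding one of these by way of example (an interpretation consistent with the machines of FIGS. 17 and 18): the entry [a, 3, (.75*.75*.99*.5), ←c,1] reaches dictionary state 3 via "c" then "a"; its probability is the product of the shape probabilities of the "c" (.75) and the "a" (.75), the ninety-nine percent prior on the dictionary machine, and a .5 branching probability for choosing the "c" edge out of the two edges leaving the start state (the "c" state has only the single "a" edge, contributing no further factor). The bigram entries instead carry the one percent prior and a .125 (one-in-eight) probability per transition.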
  • FIG. 19 shows the remaining code points, states and backpointers, at this time only looking back from Roman numerals II to I.
  • the backpointers from the second alternates are to two distinct characters, "c" and "o", and thus for now no partial results may be displayed. The display of partial results will be described below at the time in the example that such partial results are capable of being safely displayed.
  • the next character entered is either a "t” (75% probability) or an "x" (25% probability) .
  • states 3, 4 and 5 can all transition via a “t” to states 6, 7 and 8, respectively, while from the bigram state machine 72, states 100 and 101 can transition to state 105 via a "t".
  • none of the states 3, 4 or 5 can validly transition to another state via the "x" alternate, and thus these dictionary "x" quadruples drop out.
  • the "x” is a valid bigram transition, of course, from state 100 to state 106 and state 101 to state 106.
  • the valid quadruples are thus [t, 6, (PROB), ←a,3]; [t, 105, (PROB), ←a,100]; [t, 7, (PROB), ←a,4]; [t, 105, (PROB), ←c,101]; [t, 8, (PROB), ←a,5]; [x, 106, (PROB), ←a,100] and [x, 106, (PROB), ←c,101].
  • the algorithm that determines which characters can be displayed safely is modified to search backwards through the queue of language model states for each character box, until all the still-active language model states for an ink box have converged to the same proposed alternate character. Note that this does not mean the paths must converge (since each state machine will have its own path, if any, pointing to its own state), but that any remaining paths represent the same character.
  • the algorithm includes a rule stating that the requirement that all the paths represent the same character must hold for all ink boxes back to the first ink box written before the character alternate can safely be displayed.
  • the backpointers from the fourth character alternates (IV) are to a common character, "t", either via a (t, 6) or a (t, 105) .
  • each of these "t” backpointers are to an "a” character, which in turn are to a "c” character.
  • this search allows a sophisticated language model made up of state machines, of which some may be dictionaries organized as state machines (a trie state machine in the present implementation).
  • This enables a dictionary search with forward and backward context, while providing partial results that cannot be changed by further input.
  • state machines can be added.
  • a special state machine for numbers can be added, such as one which knows about commas, currency symbols, and decimal points.
  • Another state machine for computer file names, or Internet URLs can be added.
  • a user may add a personal supplementary dictionary of private words.
  • other considerations are handled wherein the context recognition process creates a list of language model state entries from the ranked code point list provided by the shape matcher. A first consideration is when the same character can go to more than one state in the same state machine.
  • a third consideration for which the context recognition process creates multiple language model state entries from a single code point in the shape matcher's alternate list is when there is more than one state machine, as described above. This usually happens only at the beginning of input, but it can also happen (for example) when a hyphen, dash, slash, or space is encountered. Because it is important that there be at least one valid state machine for any input, the bigram state machine is used, via which, at the beginning of input, each element in the first ranked list gets a bigram state and a dictionary state. It is also useful to include state machines for currency amounts, file names, Internet URLs, and so on.
  • the dictionary itself may be nondeterministic, to handle compound forms, e.g., words or terms containing a space (such as "habeas corpus"), a hyphen (such as "T-shirt") or a slash (such as "I/O").
  • any two dictionary words separated by space, hyphen, or slash are allowed, but it is even more desirable in certain instances to allow a non-dictionary word only when it is part of such a compound.
  • backpointers will point back two or more columns rather than always pointing to the immediately previous column of alternates.
  • the full context recognition method and mechanism enables rapid visual feedback while having relatively low memory requirements.
  • the method and mechanism function with existing recognition components such as shape matching components and language model context components, and are fast, reliable, cost-efficient, flexible and extensible.

Abstract

A method for displaying partial results of full context handwriting recognition. As handwritten characters are entered (30), a shape matcher (50) associates each character with alternate code points, each alternate code point having probability information. The alternate code points are placed at the end of a queue (38), and a cost (58) is determined from each alternate code point to any immediately preceding alternate. The cost is based on the probability information of the alternates and a transition cost. The lowest cost path back from each of the alternates at the end of the queue to an alternate at the beginning is determined. If each lowest cost path back converges to a common alternate, the common alternate and any previous alternates on the path back are recognized as the code points for each of the handwritten characters. The alternates corresponding to these code points are removed, and the code points appropriately displayed on a screen (34) as recognized characters, to allow editing.

Description

Partial Results in Full Context Handwriting Recognition
FIELD OF THE INVENTION
The invention relates generally to the input of user information into computer systems, and more particularly to the recognition of handwritten characters input by a user.
BACKGROUND OF THE INVENTION
Many handwritten characters resemble others; for example, a handwritten lowercase letter "o" resembles a "0" (zero) as well as an uppercase "O." Moreover, the shape of any handwritten character typically varies from user to user, and often varies from instance to instance when written by a single user. Consequently, in addition to shape matching, the accuracy of handwritten character recognition is improved when the character is also analyzed with respect to its context. Indeed, some Japanese symbols are virtually indistinguishable from one another absent surrounding context.
Context-based recognition is done in one of two ways, i.e., a backward context approach or a full context approach. Backward context recognition analyzes previous characters, if any, to determine the best translation of the current character. A benefit of backward context recognition is that each entered character is recognized and immediately displayed for user verification. In contrast, a full context approach looks at the characters both before and after each character in a string to help in the recognition thereof.
Backward context recognition is less accurate than full context recognition because by only looking backwards, only half of the available context is used. By way of example, a handwritten string such as "|O0" may be intended to be "one-zero-zero" or the first three letters of a word such as "loops." As can be appreciated, both are reasonably probable. If the next character were available for analysis, the chance of correctly recognizing the string may be greatly improved. For example, a "5" would strongly indicate that the first three characters were part of a number ("1005") while a "p" would strongly indicate the first three characters were part of a word ("loop"). However, the backwards context recognition has to make its initial decision based solely on the first character, which may be incorrect, particularly if the character shape is one which closely resembles the shape of one or more other characters. Thereafter, subsequent characters are analyzed relative to this initial guess, e.g., a letter more likely follows a letter than a number. As can be appreciated, if the initial guess is incorrect, with backwards context recognition, some or all subsequent characters may also be incorrect, further compounding the error.
Full (forwards and backwards) context recognition is thus more accurate than backwards recognition because of the ability to use all rather than half of the available context information. A first approach to full context recognition is to wait for the user to complete all characters in a given string (with the user indicating completion) and then analyze the full string. Full context thus provides a relatively high level of recognition accuracy, but also has a number of drawbacks associated therewith. First, the user receives no feedback while entering the characters until the user indicates that the string is complete. Second, there is a noticeably long delay between receipt of the string and the completion of the recognition operation. Lastly, and possibly most important, the memory requirements for keeping a potentially unlimited amount of electronic ink in the memory for a potentially unlimited amount of characters would overwhelm many systems, particularly hand-held (palmtop) computing devices in which handwriting recognition is commonplace.
A compromise solution has been attempted in which backwards context recognition is initially performed to provide instantaneous feedback using only small amounts of memory, while full context recognition is later performed on those backwards-recognized characters to increase recognition accuracy. However, this solution has the effect of changing characters that have already been displayed, and possibly verified as correct by the user, to character values that may be incorrect. Indeed, in most cases, unilaterally changing a character after display and verification of the correctness thereof has been found so annoying to users that this is an unacceptable solution, even though overall recognition accuracy may be increased.
OBJECTS AND SUMMARY OF THE INVENTION
Accordingly, it is a general object of the present invention to provide an improved method and mechanism for performing full-context recognition.
Another object is to provide a full context recognition method and mechanism that facilitates rapid visual feedback while having relatively low memory requirements.
In accomplishing those objects, it is a related object to provide a method and mechanism as characterized above that functions with existing recognition components such as shape matching components and language model context components.
Yet another object is to provide a method and mechanism of the above kind that is fast, reliable, cost- efficient, flexible and extensible.
Briefly, the present invention provides a method and mechanism for recognizing handwritten characters previously entered into the system from the context of subsequent handwritten characters presently being entered into the system. Using a shape matcher, each handwritten character entered is associated with a plurality of alternate code points therefor, with each alternate code point having probability information associated therewith. As each character is received, the alternate code points therefor are placed at the end of a queue, and a cost is determined from each alternate code point to any immediately preceding alternate in the queue. The cost is based on the probability information of the alternates and a transition cost therebetween. Then, the lowest cost path back from each of the alternates at the end of the queue to an alternate at the beginning of the queue is determined. If each lowest cost path back converges at (traverses) a common alternate in the queue, the common alternate and any alternates previous thereto on the path back are recognized as the code points for each of the handwritten characters associated therewith. The alternates corresponding to recognized code points are removed from the queue, and the code points appropriately displayed on a screen as recognized characters, to allow editing thereof or the like.
Other objects and advantages will become apparent from the following detailed description when taken in conjunction with the drawings, in which:
BRIEF DESCRIPTION OF THE DRAWINGS
FIGURE 1 is a block diagram representing a computer system into which the present invention may be incorporated;
FIG. 2 is a representation of an exemplary touch-sensitive screen for inputting user information to the system and for outputting system information to the user;
FIG. 3 is a block diagram representing functional components for recognizing handwritten characters according to the present invention;
FIGS. 4 - 6 represent an exemplary sequence of handwritten user characters with corresponding character alternatives returned by a shape matcher and buffered and analyzed for context recognition thereof;
FIGS. 7 - 12 represent an alternate exemplary sequence of handwritten user characters with corresponding character alternatives returned by a shape matcher and buffered and analyzed for context recognition;
FIG. 13 is a representation of a simplified state machine used to perform context recognition on the characters of FIGS. 7 - 12;
FIG. 14 is a representation of a lattice of alternative paths and showing cost information of some of the paths for the buffered alternative characters of FIG. 9;
FIGS. 15 - 16 comprise a flow diagram representing the general steps taken to perform full context recognition while displaying partial results in accordance with one aspect of the present invention;
FIG. 17 is a representation of a dictionary state machine used to perform context recognition;
FIG. 18 is a representation of a bigram state machine used to perform context recognition; and
FIG. 19 is a representation of a lattice of alternative paths resulting from the state machines of FIGS. 17 and 18.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
Turning to the drawings and referring first to FIG. 1, there is shown a computer system generally designated 20 into which the present invention may be incorporated. The computer system 20 includes a processor 22 operatively connected to storage 24, the storage including read only memory (ROM), random access memory (RAM) and non-volatile storage such as a hard disk drive, optical drive or the like. As can be appreciated, the non-volatile storage can be used in conjunction with the RAM to provide a relatively large amount of virtual memory via well-known swapping techniques.
The processor 22 also connects through I/O circuitry 28 to one or more input devices 30₁ - 30₃, such as a keyboard and/or a pointing device such as a mouse. The system 20 includes means for inputting pen-based information, including handwritten characters, such as a pen and/or pen-sensitive (e.g., touch-sensitive) input device 30₃. The system 20 also includes at least one local output device 34 connected to the I/O circuitry 28 for communicating information, such as via a graphical user interface, to the user of the system 20. The output device 34 may include a liquid crystal display screen or the like integrated with the pen-input device 30₃, which, taken together, can be considered a pen-sensitive screen 36 (FIG. 2).
A preferred system 20 is a hand-held personal computing device running on the Windows CE operating system loaded in the storage 24. At least one application program such as a word processing program may also be loaded into the storage 24.
As shown in FIG. 2, the pen-sensitive screen 36 includes a visible manifestation of a text buffer 38 for displaying and allowing edit control of the text characters that have been input into and recognized by the system 20. As described in detail below, if the character is a handwritten character, the system 20 recognizes the character in part by its shape, from among a set of possible computer characters (e.g., ASCII or Unicode code points). The system 20 also recognizes the character in part from its full context, i.e., its surrounding characters both forward and backward. In accordance with one aspect of the invention and as described in detail below, the code point is transferred to the text buffer 38, i.e., recognized character buffer 38, and displayed therein in an appropriate font and font size, when it is established that the additional context cannot change the code point regardless of further character input.
The touch screen 36 preferably includes one or more button input areas 40₁ - 40₃ (FIG. 2), which may provide a virtual keyboard through which the user may directly input characters, character-based editing commands and so on. For example, the buttons may represent characters, such that when contacted by a pen, the character is input into the system as if it were typed by the user on a conventional keyboard. In addition, the user may use such buttons to enter editing commands, such as Backspace or Enter commands, by touching the screen 36 at an appropriately displayed button in one of the areas 40₁ - 40₃.
As shown in FIG. 2, for inputting handwritten characters, the display/touch-sensitive screen 36 typically includes two or more pen input display areas 42 - 44. In general, the pen input display areas 42 - 44 are used to receive handwritten characters from the user in the form of strokes. The user typically alternates character input between the boxes 42 - 44. For example, a user will write in the first box 42, with the system echoing the pen movement on the screen by darkening or coloring the contacted areas, giving the user the impression that the pen has ink therein. When the user begins writing in the second box 43, it is assumed that the character in the first box 42 is complete. The stroke information from the first box 42 is thus sent to a recognition process 48, and the displayed strokes (ink) in the first box 42 are erased. Note that the recognizer runs on a separate thread whereby it can recognize a character while the user is writing the next character. Similarly, the same actions occur in the second box 43, i.e., the strokes are sent for character recognition thereof and the second box 43 erased upon a detection that the user is now writing in the third box 44 (or has returned to write in the first box 42). A time-out when no pen activity is detected is also treated as a completed character, as is a direct command from the user, entered via one of the button input areas 40₁ - 40₃ or the like, indicating that the user is finished with a character.
In accordance with one aspect of the invention and as best shown in FIG. 3, when the user inputs a character, the recognition process 48 receives the character and sends the character data (e.g., the stroke or bitmap information) to a shape matcher 50. The shape matcher 50 compares the character data against shape data 52 maintained in the storage 24. The shape data 52 may be organized in one or more databases containing character shape information accumulated from samples taken from thousands or even millions of users. In keeping with the invention, the shape matcher 50 returns a list of possible alternates 54 (i.e., possible code points) for the character, each alternate having a probability (or some other weighted value) associated therewith indicative of the likelihood of the user-entered character's match with the sample data. For example, the leftmost part of FIG. 4 shows one such (hypothetical) ranked list 54₁ of five alternates for the handwritten character (intended by the user to be a capital "P") input via one of the boxes 42 - 44. Based upon the actual shape versus the sample shape data 52, as determined by the shape matcher 50, the lowercase "p" may have a fifty percent chance (likelihood) of being what the user intended, the uppercase "P" a forty-eight percent chance, the "D" a one percent chance, and so on, totaling (or closely approaching) one hundred percent. Note that in the actual system, the shape matcher 50 may return in the list 54 any number of alternative characters, or a fixed number of alternates such as the most-probable twenty. Moreover, instead of a percentage, the preferred shape matcher 50 returns a cost proportional to how unlikely each alternative is. The cost equals the negative of the natural log of the percentage probability, allowing the lowest cost path to be determined as described below. Negative natural logs are used since some of the probabilities can be very small percentages, and since summing the costs (performed by a known algorithm) corresponds to multiplying the percentages.
Once the shape matcher 50 has returned the ranked list of character alternatives 54, the recognition process 48 appends the list 54 to a context recognition buffer 56 or the like containing similar ranked lists for any preceding characters which have not yet been recognized. In accordance with another aspect of the invention, each of the alternates of the preceding character has a cost associated with transitioning to each of the alternates of the most recent character. As can be appreciated, this forms a lattice of possible paths (e.g., FIG. 14) between the most recent character's alternates and the previous character's alternates. A state machine 58, statistically derived from a language model or some other source and kept in the storage 24, maintains information needed to determine the costs of transitioning from each alternative of the last-written character to each of the alternatives of the preceding character. The recognition process 48 then uses a context computation process 60 to compute the total cost to reach each of the character alternatives of the last written character. A preferred context computation process uses the algorithm described in "The Viterbi Algorithm," G. David Forney, Jr., Proceedings of the IEEE, Vol. 61, No. 3 (1973). The Viterbi algorithm is known in the character recognition art and is thus not described in detail herein, except to summarize that the algorithm finds the lowest cost path from among a lattice of possible paths between nodes, wherein each node has a cost associated therewith and each branch between any two nodes has a cost associated therewith.
Thus, as shown in FIGS. 4 - 6, as each character is written, the cost of each path, including the costs of the alternates and the transition costs, is computed. The lowest cost path to each of the last-written character's alternates is saved along with its cost. More particularly, a backpointer from each of the last-written alternates, pointing along the lowest cost backpath to a predecessor alternate, is saved, along with the cost, in a pointer-cost data table 62 or the like for use by the recognition process 48 as described below.
By way of example, FIG. 4 shows five alternates in a list 54₂ within the context recognition buffer 56 for the second box, intended by the user to be a lowercase "a" character. Each of the alternates in the list 54₂ has a cost to get thereto from the alternates in the list 54₁. The lowest of these costs is saved as a backpointer, shown as an arrow in FIGS. 4 - 6. Note that the various characters and paths shown in this example are only for the purposes of providing a conceptual illustration, and are not necessarily derivable from any real data. Thus, in this example, the lowest cost path to get to the lowercase "o" character in the list 54₂ is from the lowercase "p" alternate in the list 54₁, the lowest cost path to the "a" in 54₂ is from the capital "P" in 54₁, and so on. As shown in FIG. 4, backpointers are stored for each of these lowest cost paths. As can be appreciated, in a typical case, not all of the previous character's alternates (e.g., the "b" and "f" in the list 54₁ of FIG. 4) have a backpointer thereto, since they are not on any of the last-written alternates' lowest cost paths. Conversely, those alternates that do have backpointers pointing thereto (shown as shaded in FIGS. 4 - 6) are considered to be active.
As characters are added, such as shown in FIGS. 5 - 6, the process is repeated. Backpointers are only saved to previous alternates that are active, since only active alternates can be on the lowest cost path back. Thus, as shown in FIG. 5, the "t"-shaped character's alternates in the list 54₃ all point to either the "o" or the "a" alternate in the list 54₂, and thus the "u," "c" and "n" alternates are no longer active. Similarly, as shown in FIG. 6, all of the alternates in the list 54₄ corresponding to the fourth box have backpointers to either the "t" or the "E" of the third box's list 54₃. In accordance with one aspect of the invention, when the backpointers converge to one active alternate in a given list of alternates, that alternate and any of its preceding alternates on the path back therefrom are considered to be recognized. Those alternates are then placed as code points in the recognized character (text) buffer 38 for displaying to the user, and the lists containing those alternates are removed from the context recognition buffer 56. This is possible because further context, no matter what characters are entered, will not change this convergence. Thus, in the example shown in FIG. 6, a "Pa" will be placed in the recognized character buffer 38 and the lists 54₁ and 54₂ removed from the context recognition buffer 56.
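A sketch of this convergence-and-flush step, under the same hypothetical layout as the sketch above (each alternate recording a 'backpointer' index into the previous list), might be:

    def flush_recognized(buffer, text_buffer):
        # buffer: lists of alternates, oldest first; text_buffer: recognized
        # code points. When every surviving path funnels through a single
        # alternate in some earlier list, that alternate and all of its
        # predecessors are final, and their lists can be removed.
        active = set(range(len(buffer[-1])))
        for col in range(len(buffer) - 1, 0, -1):
            active = {buffer[col][i]['backpointer'] for i in active}
            if len(active) == 1:                  # backpointers have converged
                j = active.pop()
                recognized = []
                for k in range(col - 1, -1, -1):  # walk the single path back
                    recognized.append(buffer[k][j]['char'])
                    j = buffer[k][j]['backpointer']
                text_buffer.extend(reversed(recognized))
                del buffer[:col]                  # drop the recognized lists
                return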
Turning to an explanation of the invention with particular reference to FIGS. 7 - 12 and the flow diagram of FIGS. 15 - 16, in the following example the user intends to input the string "loops_" into the system by writing into the input boxes 42 - 44 as described above. For purposes of readability, FIGS. 7 - 12 show the characters as they are added to the system (e.g., the user has just begun writing in the next box), with the underscore character in FIG. 12 representing a space character. Also, for purposes of simplicity, in this example the simplified state machine 60₁ represented in FIG. 13 is used to determine transition costs, with only lowercase letters (including the space character) and the numerical digits zero through nine allowed. According to the rules of the simplified state machine 60₁, the set of lowercase letters, represented by "l.c." in FIG. 13, has a seventy-five percent chance of being followed by another lowercase letter and a twenty-five percent chance of being followed by a number. The set of numbers, represented by an "N" in FIG. 13, has a seventy-five percent chance of being followed by another number and a twenty-five percent chance of being followed by a letter. Of course, it is understood that more complex state machines based on various language models or the like may be used to determine transition costs in actual systems.
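Expressed as an illustrative sketch, the transition cost of this simplified state machine reduces to a character-class test, with the negative natural log turning the stated probabilities into additive costs (the function name is hypothetical):

    import math

    def simple_transition_cost(prev_char, curr_char):
        # Simplified two-class machine of FIG. 13: a lowercase letter (or
        # space) is followed by another lowercase letter with probability
        # 0.75 and by a digit with probability 0.25, and symmetrically for
        # digits. Returns -ln(probability) so costs add along a path.
        same_class = prev_char.isdigit() == curr_char.isdigit()
        return -math.log(0.75 if same_class else 0.25)

Such a function could serve as the transition_cost argument of the viterbi_step sketch given earlier.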
As described above, the system considers a character to be entered when the user begins writing in another box, stops writing for a sufficiently long period of time (times out), or directly commands the system to enter the character. Thus, the recognition process 48 begins when an input character's data is received at step 800 (FIG. 15), as shown in FIG. 7 (wherein the user has just begun to write in another box). At step 802, the recognition process 48 passes the character information (e.g., coordinates representing the strokes, a bitmap or some other appropriate format) to the shape matcher 50 for recognition thereof. As shown in FIG. 8, at step 804 the shape matcher 50 returns a ranked list of alternates 54₁, and the ranked list 54₁ is appended to the (currently empty) context recognition buffer 56. In the present example, as shown in FIG. 8, for the vertical stroke entered by the user, the two top alternates are a one ("1") and a lowercase letter-L ("l"), each having a forty-nine percent probability, or alternatively, cost = -ln(0.49) = 0.7133. Other character alternates, such as an "i" (assumed undotted) and a "t" (uncrossed), may also be returned with much lower probabilities but are not separately shown. At step 806, since the user is writing in another box (e.g., input box 43), the process determines that the user is not done writing, an act which would force the end of the recognition process as described below. For now, the process 48 continues to step 808. At step 808, a set of currently active alternates is established as all of the alternates returned in the list 54₁ (by the shape matcher 50) for the last character written, the vertical stroke. (Step 810, described below, is a computation process for determining the lowest cost paths between alternates of adjacent entries, but since this is the first entry, the only cost is the cost of each alternate at this time.) The process thus continues to step 820 (FIG. 16) to test whether the number of active alternates in the current set is one. Since the shape matcher 50 has returned more than one alternate for this first character, all of which are active, the process continues to step 822.
Step 822 tests to determine if the current set of active alternates belongs to the first character in the context recognition buffer 56. At this time, this is true, so step 822 returns to step 800 (FIG. 15) to await the next handwritten character.
As shown in FIG. 8, sometime later the user begins to write another character in a third box, having completed the circular-shaped character in the second box. Accordingly, the circular-shaped character information is received at step 800 and sent to the shape matcher at step 802. As shown in FIG. 9, a list of alternates 54₂, including a lowercase "o" and a zero ("0"), is returned and placed in the context recognition buffer 56 at step 804, the zero shown with a slash therethrough in FIG. 9. Both choices again have forty-nine percent probabilities, or alternatively, cost = -ln(0.49) = 0.7133.
Step 806 again tests to determine whether the user is done writing, and since the user is still writing, the process continues to step 808. At step 808, the currently active alternates are the alternates of the last box written, i.e., the "o" and the "0" (plus others, not separately shown). Then, at step 810, the context recognition process 48 uses the state machine 60₁, along with the costs returned by the shape matcher with each alternate, to determine the lowest cost path from the previous character's alternates to each of the alternates returned for the last-written (circular-shaped) character. FIG. 14 shows the lattice of paths and the cost information for the paths, shown as percentage probabilities for comparative readability purposes. For additional clarity, the following table sets forth the transition costs for this lattice (with the lowercase "l" shown as a capital and the numbers written out to avoid confusion) based on the state machine 60₁ of FIG. 13:
    From alternate    To alternate    Probability    Cost (-ln)
    one ("1")         zero ("0")      75%            0.2877
    one ("1")         "o"             25%            1.3863
    L ("l")           zero ("0")      25%            1.3863
    L ("l")           "o"             75%            0.2877
In keeping with the invention, as described above, the context computation process 60 (the Viterbi algorithm) is used at step 810 to determine the lowest cost paths to the alternates of the last-written character by summing the costs. As apparent from an analysis of FIG. 14 and as shown in FIG. 9, backpointers from the zero ("0") to the one ("1") and from the "o" to the "l" show the results of the computation. The process 48 then continues to step 820 of FIG. 16. Since there are at least these two alternates, step 820 of FIG. 16 branches to step 822, wherein it is determined that the list 54₂ does not represent the first character in the context recognition buffer 56. As a result, step 822 branches to step 824, where the currently active alternates are changed to the alternates representing the previous character that are active (i.e., have backpointers thereto), namely the one ("1") and the "l" in the list 54₁. Note that in the present example, each of the alternates in the list 54₂ that are not separately shown also has to point to either the one ("1") alternate or the "l" alternate. This is because of the low cost of these top two alternates relative to the possible transition costs under the rules of the state machine 60₁.
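The arithmetic behind these backpointers can be checked directly from the illustrative numbers above; for example, the path reaching the "o" through the letter "l" is cheaper than the path through the digit "1":

    import math

    cost = lambda p: -math.log(p)
    via_letter_l = cost(0.49) + cost(0.75) + cost(0.49)  # about 1.714
    via_digit_1 = cost(0.49) + cost(0.25) + cost(0.49)   # about 2.813
    assert via_letter_l < via_digit_1  # so the "o" points back to the "l"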
The process then returns to step 820 with the new set of currently active alternates, wherein it is determined that more than one alternate is active in the list 54₁. Accordingly, step 820 branches to step 822, where this time it is determined that the set of currently active alternates belongs to the list representing the first character in the context recognition buffer 56. Step 822 thus returns to step 800 of FIG. 15.
The next character, another circular-shaped entry, is handled in substantially the same manner as was the previous character. Thus, assuming at step 804 that the shape matcher returns approximately the same probabilities, the "o" alternate in the list 54₃ (FIG. 10) is assigned a backpointer to the "o" alternate in the list 54₂, and the zero ("0") in the list 54₃ a backpointer to the zero ("0") alternate in the list 54₂. As is apparent from FIG. 10, for each loop through steps 820 - 824, step 820 again fails to detect any list with only one active alternate therein before the first character is reached at step 822.
The next entered character, however, the lowercase "p" character, begins to change the context in a way that will ultimately decide the characters in the string. More particularly, the "p" character in the list 54₄ receives a high probability, such as ninety-five percent (i.e., a low cost, equal to 0.0513), while the other characters such as the "8" receive a much lower probability, such as four percent (cost equals 3.2189). However, this time through the recognition process 48, the cost for each of the alternates for the last-written character is computed at step 810, with a backpointer assigned to each as shown in FIG. 11. Thus, although the "8" is unlikely, at this time it is still considered a viable possibility, and thus, according to the rules of the state machine 60₁, the backpointer from the "8" points to the previous character's zero ("0") alternate. Finally, the user completes the next, s-shaped character and begins another character (by causing a space character to be entered in some manner) as shown in FIG. 12. At step 802, the s-shaped character is sent to the shape matcher 50. As shown in FIG. 12, a "5" and an "s" are returned at step 804 with some cost assigned to each, along with other alternates in the list 54₅. Again, in the present example the user is not finished writing at step 806, and thus at step 808 the currently active alternates are the "5" and the "s" (plus others, not separately shown) of the list 54₅.
At step 810, because of the low cost of the "p," the cost from the "5" to the "p" is still lower than from the "5" to the "8," notwithstanding the transition cost differential from the state machine 60₁. In other words, using the numbers stated above, 95 percent times 25 percent is greater than 4 percent times 75 percent, and thus the backpointer from the "5" is to the preceding "p" alternate instead of the preceding "8" alternate. As can be appreciated, this is true of any other alternates not separately shown. Once the backpointers have been computed at step 810, the process continues to step 820 of FIG. 16. At step 820, because there are at least these two alternates, the process branches to step 822. Step 822 determines that this is not the first character in the context recognition buffer 56, and thus continues to step 824, where the currently active alternate set is changed to point to the "p" alternate in the previous list of alternates. Note that only the "p" alternate is placed in the set because only the "p" alternate is active, i.e., has a backpointer thereto.
As a result, step 820 determines that the number of currently active alternates is now one, and thus branches to step 826. At step 826, the sole currently active alternate, the "p" alternate, is placed, along with its preceding alternates on its lowest cost backpath (i.e., "loop"), into the recognized character buffer 38. Then, at step 828, the lists 54₁ - 54₄ corresponding to these now-recognized characters are removed from the context recognition buffer 56, whereby the list 54₅ containing the "5" and the "s" alternates becomes the first entry in the context recognition buffer 56.
As can be appreciated, the above-described process thus places partial results into the recognized character buffer 38 as soon as available, yet uses a full context recognition process that is as accurate as waiting for an entire string to be completed before analysis thereof. Although the string "loop" was somewhat of an extreme example chosen to show the advantages over backwards-only recognition, in practice it has been found that one to three characters are ordinarily all that are needed by the process of the present invention before the backpointers converge to a common alternate. This provides acceptably rapid visual feedback along with the improved accuracy of full context recognition, while requiring only a small amount of memory for electronic ink. Significantly, once recognized with the above process, characters are not later changed regardless of further characters input into the string.
Lastly, when the user has finished writing, step 806 will detect that there are no more characters forthcoming from which forward context recognition can be accomplished. At such a time, the process 48 branches to step 812 to find the lowest-cost path back in order to determine the best characters of those remaining. Note that step 812 is just the conventional use of the Viterbi algorithm, used as if the remaining string was the full string entered all-at-once for recognition.
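Under the hypothetical layout of the earlier sketches, this final backtrace might look as follows; it simply selects the cheapest alternate in the newest list and follows backpointers to the start of the buffer:

    def final_flush(buffer, text_buffer):
        # Ordinary Viterbi backtrace: no further context will arrive, so the
        # cheapest surviving path is taken as-is.
        last = buffer[-1]
        j = min(range(len(last)), key=lambda i: last[i]['total_cost'])
        recognized = []
        for col in range(len(buffer) - 1, -1, -1):
            recognized.append(buffer[col][j]['char'])
            j = buffer[col][j]['backpointer']
        text_buffer.extend(reversed(recognized))
        buffer.clear()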
The above-described simplified state machine has the properties that every state is connected to every other state and that any given code point always transitions to the same state regardless of the state in which it started: any digit always goes to the digit state and any letter always goes to the letter state. In such an instance, the code point always uniquely determines the state, so there is no need to store the state separately. Moreover, every state is a valid termination state. As can be appreciated, the simplified machine is a bigram state transition machine, which has one state per character in the alphabet, with each state connected to every other state (including itself). A slightly more complex bigram state machine is arranged as a table having transition costs corresponding to the probabilities of the letter bigrams implied by the states, i.e., the cost to transition from "a" to "b" is the probability of seeing an "ab" bigram in normal text. The shape matcher returns a list of (code point, score) pairs. The set of lists maintained by the context recognition process (engine) is actually a list of triples, i.e., (code point, score, state), along with a backpointer.
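One plausible in-memory form for such an entry, with hypothetical field names (the patent enumerates the contents, not a concrete layout), is:

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Entry:
        code_point: str             # character proposed by the shape matcher
        score: float                # accumulated cost along the best path
        state: int                  # language model state reached
        backpointer: Optional[int]  # index of predecessor in previous list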
However, more complex state machines which implement language models are often desirable, such as a trie-structured dictionary state machine, or a nondeterministic state machine. With a dictionary state machine, the same letter has different states depending on what letters have come before. Moreover, most states only allow transition on a few letters. For example, only a "u" is a valid transition from the "q" state.
To accomplish context recognition in accordance with the present invention using more complex state machines, as before, for each box of ink the shape matcher returns a list of alternates made up of character code points with associated probabilities. The context recognition process (i.e., the context engine) takes this list of alternates and, for each alternate, generates a list of all language model states that could be transitioned to with this character from the previous list of language model states. The language model states correspond to different paths through the character alternate arrays that are valid character transitions in the language model. The context recognition process computes the total path score for each of the new language model states by summing the cost of the character alternate assigned by the shape matcher with the cost of the transition from the previous language model state to the current language model state.
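A sketch of this state-generation step, assuming hypothetical state machine interfaces (a next_state method returning None for an invalid transition, a transition_cost method returning a negative log probability, and a machine_id field recording which machine a previous entry's state belongs to), might be:

    def expand(prev_entries, shape_alternates, machines):
        # For each (code point, shape cost) pair from the shape matcher,
        # generate every language model state reachable from any surviving
        # previous entry in any of the attached state machines.
        new_entries = []
        for ch, shape_cost in shape_alternates:
            for idx, prev in enumerate(prev_entries):
                machine = machines[prev.machine_id]
                nxt = machine.next_state(prev.state, ch)
                if nxt is None:
                    continue  # this language model forbids the transition
                total = prev.score + shape_cost + machine.transition_cost(prev.state, ch)
                new_entries.append((ch, nxt, total, idx))
        return new_entries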
By way of example, assume the user is inputting the word "cattle" in lowercase as shown in FIG. 19. For purposes of simplicity, in the present example, as shown in FIG. 19, only two character alternates are returned for each written box, the top choice having a seventy-five percent probability associated therewith, and the bottom choice a twenty-five percent probability. FIG. 17 shows a simplified dictionary state machine 70 having only three words therein which are relevant to the following example, i.e., "cattle", "oat" and "octant". Note that the "v" shown in states 13, 7 and 14 indicates a valid end of a word (termination) as described below. FIG. 18 shows a simplified eight-character bigram state machine 72 that will also be used in the present language model example, with the character for transitioning to each state shown in parentheses within the numbered state. The bigram state machine 72 starts with a low probability, such as one percent, corresponding to the fact that most words (ninety-nine percent) are expected to be in the dictionary 70. This means that the process selects a dictionary word if one is reasonable, but still can select a non-dictionary word. Note that more than one such state machine is included so that the user is not limited to writing only dictionary words; otherwise, as will become apparent below, all paths would be pruned whenever a word was not in the dictionary. Indeed, a bigram state machine such as 72, with each state being valid to terminate an input string (and thus which cannot fail), is ordinarily provided for this purpose.
To integrate other state machines with the dictionary state machine, the context recognition uses the language model state alternate generating process mentioned above, e.g., on the first character box, every code point in the first box's ranked list of code point alternates is used to generate the list of language model states that could be transitioned to. For this example, each code point alternate generates two language model state alternatives, with the one at the top pointing to the dictionary 70 and the one at the bottom pointing to the bigram state machine 72. Note that since the same character can transition to two different language model states at the same time, this is the same as creating a single nondeterministic state machine from the combination of the various state machines.
Nondeterministic state machines are described by Knuth, The Art of Computer Programming, Addison-Wesley (1973) .
Note that both state machines 70, 72 share a common initial start state, indicated by the zero state in both FIGS. 17 and 18. Thus, in the present example, four initial language model state alternatives are created from the first box's proposed code point alternates of "c" and "o" that were returned, with a list of quadruples maintained for each language model state alternative generated that includes the code point, state, cost and a backpointer. At this time, the quadruples are [c, 1, (.75*.99*.5), NUL]; [c, 101, (.75*.01*.125), NUL]; [o, 2, (.25*.99*.5), NUL]; and [o, 104, (.25*.01*.125), NUL]. The costs are shown as probabilities and are determined by the probability of the code point (.75 for "c" and .25 for "o"), times the cost of the state machine (.99 for the dictionary 70, .01 for the bigram 72), times the cost of the transition from the initial state (.5 in the dictionary, since two possible branches exist, and .125 in the bigram, since eight possible branches exist). Note that in actual state machines, many more branches would exist and the probabilities assigned to each would sum to one, but not necessarily be equal, e.g., the chance of the first character in a word within a dictionary beginning with an "s" is much greater than a word beginning with an "x." Dictionary states are generally named after prefixes, and thus the "c" is in the "c" state and the "o" is in the "o" state with respect to the dictionary. Note that in a full dictionary trie, these are 32-bit indexes into a table, and not actually strings. When the code point alternate list for the next ink box is returned, i.e., the "a" (.75 chance) and the "c" (.25 chance), the context recognition process generates the language model state alternatives that can be transitioned to from the previous box's four language model state alternates via the two code point alternates returned in the alternate list. The two code point alternates thus generate as many as eight language model states in this example. However, based on the state machine 70, not all of these eight state machine alternatives are viable. More particularly, as shown in FIG. 17, with respect to the dictionary 70, the previous "c" can transition to the "a", and the previous "o" to either the "a" or the "c" alternates, but the dictionary state machine 70 does not allow for the previous "c" to transition to another "c". This eliminates the potential quadruple therefor from the list. Moreover, some of the states in the language model state alternate list for this character box are duplicates. As the Viterbi algorithm is being used to find the lowest cost path through the language model states, it is evident to one familiar with the Viterbi algorithm that only the lowest cost path to a state needs to be kept.
In each such duplicate instance, the highest cost (lowest probability) duplicates are pruned, whereby the character that made the transition with the lowest cost path to a previous state is kept. The search for duplicates and their removal is preferably done as the list is built, thus dynamically pruning the number of paths.
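A minimal sketch of this dynamic pruning, treating costs as negative log probabilities so that lower is better, might be:

    def prune_duplicates(entries):
        # Viterbi observation: only the cheapest path into a given language
        # model state can lie on the overall lowest cost path, so duplicate
        # states are dropped as the list is built.
        best = {}
        for entry in entries:
            ch, state, cost, backptr = entry
            if state not in best or cost < best[state][2]:
                best[state] = entry
        return list(best.values())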
In the present example, the quadruples for the second box are [a, 3, (.75*.75*.99*.5), →c,1]; [a, 100, (.75*.75*.01*.125*.125), →c,101]; [a, 4, (.25*.75*.99*.5*.5), →o,2]; [a, 100, (.25*.75*.01*.125*.125), →o,104]; [c, 101, (.25*.75*.01*.125*.125), →c,101]; [c, 5, (.25*.25*.99*.5*.5), →o,2]; and [c, 101, (.25*.25*.01*.125*.125), →o,104], where the "→" represents a backpointer to a previous (code point alternate, state) pair. As is apparent, there are duplicate (a, 100) and (c, 101) states (it is irrelevant that the character transitions are also identical), and as described above, only the lowest cost (highest probability) of these are kept. Thus, the fourth and seventh of the above-listed seven quadruples are removed. FIG. 19 shows the remaining code points, states and backpointers, at this time only looking back from Roman numeral II to I. For now, it should be noted that the backpointers from the second alternates are to two distinct characters, "c" and "o", and thus for now no partial results may be displayed. The display of partial results will be described below at the time in the example when such partial results are capable of being safely displayed.
The next character entered is either a "t" (75% probability) or an "x" (25% probability). From the dictionary state machine 70, states 3, 4 and 5 can all transition via a "t" to states 6, 7 and 8, respectively, while from the bigram state machine 72, states 100 and 101 can transition to state 105 via a "t". However, in the dictionary 70, none of the states 3, 4 or 5 can validly transition to another state via the "x" alternate, and thus these dictionary "x" quadruples drop out. The "x" is a valid bigram transition, of course, from state 100 to state 106 and from state 101 to state 106. The valid quadruples, this time shown without values for the probabilities (which can be determined as described above) for purposes of simplicity, are thus [t, 6, (PROB), →a,3]; [t, 105, (PROB), →a,100]; [t, 7, (PROB), →a,4]; [t, 105, (PROB), →c,101]; [t, 8, (PROB), →a,5]; [x, 106, (PROB), →a,100] and [x, 106, (PROB), →c,101]. By evaluating the costs, the latter of the two (t, 105) duplicates and the latter of the two (x, 106) duplicates are the lower probability duplicates of each, and thus removed, whereby at this time the remaining code point alternates, states, and backpointers are as shown in FIG. 19 from Roman numeral III back to Roman numeral I.
Again, it should be noted for now that the backpointers from any state on back are to at least two distinct characters, and thus no partial results may be safely displayed at this time. Following the above explanation, it should be apparent that the next (fourth) character entered, which also has a "t" and an "x" returned as alternates therefor, has only three quadruples after excluding invalid transitions and removing duplicates. These remaining quadruples, shown without values for their probabilities, are [t, 9, (PROB), →t,6]; [t, 105, (PROB), →t,105]; and [x, 106, (PROB), →t,105]. Note that this is because there was no valid "t" or "x" transition from state 7 or 8 to another state in the dictionary 70, and no valid "x" transition from state 6. Moreover, the lowest cost duplicates in the bigram state machine 72 were the (t, 105) state that had a backpointer to the (t, 105) state and the (x, 106) state that had a backpointer to the (t, 105) state, and thus the higher cost (i.e., lower probability) duplicates were removed. The code points, states and backpointers thus appear as in FIG. 19 from Roman numeral IV back to Roman numeral I.
In keeping with the present invention, the algorithm that determines which characters can be displayed safely is modified to search backwards through the queue of language model states for each character box, until all of the still-active language model states for an ink box have converged to the same proposed alternate character. Note that this does not mean the paths must converge (since each state machine will have its own path, if any, pointing to its own state), but rather that any remaining paths represent the same character. In the case that the language model is a nondeterministic state machine, the algorithm includes a rule stating that the requirement that all of the paths represent the same character has to hold for all ink boxes back to the first ink box written before a character alternate may be safely displayed. For example, if the only two active paths left were for "paternal" and "pottery," it would appear that at the third letter "t" the paths have merged for the same character. However, the characters are not the same all the way back to the first "p" character, and thus these first three characters in the string are not considered safely recognized.
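One way to sketch this modified test, reusing the hypothetical Entry layout above (how to treat a lone agreed first character on otherwise diverging paths is a policy detail left open here), is:

    def safe_prefix(columns):
        # columns[i]: surviving entries for ink box i. Walking back from the
        # newest column, collect the set of characters still reachable in
        # each earlier box; a prefix is displayable only while every box
        # from the first proposes exactly one character.
        reached = set(range(len(columns[-1])))
        proposals = [None] * (len(columns) - 1)
        for col in range(len(columns) - 2, -1, -1):
            reached = {columns[col + 1][i].backpointer for i in reached}
            proposals[col] = {columns[col][i].code_point for i in reached}
        prefix = []
        for chars in proposals:
            if len(chars) != 1:
                break  # first disagreement ends the safely displayable prefix
            prefix.append(next(iter(chars)))
        return prefix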
In the present example, however, the backpointers from the fourth character alternates (IV) are to a common character, "t", either via a (t, 6) or a (t, 105). In turn, each of these "t" backpointers is to an "a" character, which in turn is to a "c" character. As a result, even though on different paths, the first three characters, "cat", can be safely written to the recognized character (text) buffer 38 for displaying to the user.
When the inking process is completed, and completion of the recognition is forced by a time out or the user hitting the recognize button, a final symbol "end" is generated. Only those states that can transition to the end state on "end" are considered valid. For example, if the user hit "recognize" after entering only "catt," the states for the dictionary path would be discarded, since the state numbered 9 is not a valid termination point. States which are valid termination points are marked as such, such as by having a specific bit set therein. As can be appreciated, if all dictionary states are discarded, in the present example only bigram states would remain. In this manner, even with the low probability (one percent) assigned to bigram states, the bigram states may become the ones used for context recognition. Note, however, that the "cat" string, which was already written to the text buffer, is not changed by this termination regardless of which path is used.
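In code, this termination filter could be as simple as a flag test; the bit position shown is purely hypothetical:

    TERMINATION_BIT = 0x80000000  # hypothetical marker for valid end states

    def finalize(entries):
        # On the synthetic "end" symbol, keep only those paths whose language
        # model state is a valid place to terminate an input string.
        return [e for e in entries if e.state & TERMINATION_BIT]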
As can be appreciated, implementing this search allows a sophisticated language model made up of state machines, some of which may be dictionaries organized as state machines (a trie state machine in the present implementation). This enables a dictionary search with forward and backward context, while providing partial results that cannot be changed by further input.
In addition, using this technique, other state machines can be added. For example, a special state machine for numbers can be added, such as one which knows about commas, currency symbols, and decimal points. Another state machine for computer file names or Internet URLs can be added. Significantly, a user may add a personal supplementary dictionary of private words. Moreover, other considerations are handled when the context recognition process creates a list of language model state entries from the ranked code point list provided by the shape matcher. A first consideration is when the same character can go to more than one state in the same state machine. For example, this happens with upper and lower case letters, i.e., most words are valid in all lower case (e.g., "word"), with an initial capital (e.g., "Word"), and in all capitals (e.g., "WORD"); other variations (e.g., "wOrD") are prohibited. Second, some words need to be capitalized (e.g., "Florida" or "FLORIDA", but not "florida"), and a few are always all capitalized (e.g., "USA"). Of course, in the computer, a capital 'F' and a lowercase 'f' are different letters, so if the first letter is 'F' the context engine needs to match that to both the "f" state and the "F" state. The dictionary state machine itself is slightly augmented to make this work, i.e., it "remembers" whether it has capitalized any letters or not. As can be appreciated by one skilled in the art, this is easily done by adding an appropriate number of bits to the state number, which is equivalent to creating a dictionary state machine with many more states to represent all the states a word can be in.
A third consideration for which the context recognition process creates multiple language model state entries from a single code point in the shape matcher's alternate list is when there is more than one state machine, as described above. This usually happens only at the beginning of input, but it can also happen (for example) when a hyphen, dash, slash, or space is encountered. Because it is important that there be at least one valid state machine for any input, the bigram state machine is used, via which, at the beginning of input, each element in the first ranked list gets a bigram state and a dictionary state. It is also useful to include state machines for currency amounts, file names, Internet URLs, and so on.
To this end, the dictionary itself may be nondeterministic, to handle compound forms, e.g., words or terms containing a space (such as "habeas corpus"), a hyphen (such as "T-shirt") or a slash (such as "I/O"). Normally, any two dictionary words separated by a space, hyphen, or slash are allowed, but it is even more desirable in certain instances to allow a non-dictionary word only when it is part of such a compound. This is accomplished by adding the compound form (including the space, dash, slash and so on) to the dictionary and then letting the context recognition process split each character in the next ranked list, i.e., one going back to the initial dictionary state and the other continuing from the space, dash, or slash. For example, with "I/O", the "I" is acceptable by itself, but the "O" is probably not, whereby the form "I/O" is added. Lastly, although the described embodiment uses characters written in boxes, it will be apparent to one skilled in the art that this algorithm also works for the unconstrained case. It can be readily appreciated that in the unconstrained case, some of the arrows (backpointers) will point back two or more columns rather than always pointing to the immediately previous column of alternates.
As can be seen from the foregoing detailed description, there is provided an improved method and mechanism for performing full-context recognition. The full context recognition method and mechanism enables rapid visual feedback while having relatively low memory requirements. The method and mechanism function with existing recognition components such as shape matching components and language model context components, and is fast, reliable, cost-efficient, flexible and extensible.
While the invention is susceptible to various modifications and alternative constructions, certain illustrated embodiments thereof are shown in the drawings and have been described above in detail. It should be understood, however, that there is no intention to limit the invention to the specific forms disclosed, but on the contrary, the intention is to cover all modifications, alternative constructions, and equivalents falling within the spirit and scope of the invention.

Claims

WHAT IS CLAIMED IS:
1. A method for recognizing a handwritten character previously input into a computer system when another character has been input into the system thereafter, comprising the steps of: receiving a plurality of current alternates corresponding to a latest handwritten character, each of the current alternates having information associated therewith corresponding to a probability; determining the cost from each of the current alternates to each of the previous alternates of a previously input handwritten character, the determination based on probability information of each previous alternate, probability information of each current alternate and a transition cost therebetween; determining the lowest cost from each of the current alternates to one of the previous alternates, and, if the lowest cost of each of the current alternates is to a common previous alternate, recognizing the common previous alternate as a code point for the previous character.
2. The method of claim 1 wherein the step of recognizing the common previous alternate as a code point for the previous character further comprises the steps of determining the lowest cost path back from the common previous alternate to a preceding alternate thereof and recognizing the preceding alternate as a code point.
3. The method of claim 1 wherein the step of determining the lowest cost from each of the current alternates to one of the previous alternates includes the step of saving a backpointer pointing from each of the current alternates to the previous alternate having the lowest cost path thereto.
4. The method of claim 3, wherein if the lowest cost of each of the current alternates is not to a common previous alternate, further comprising the steps of, determining if an earlier set of preceding alternates representing an earlier character precedes the previous alternate, and if so, determining the lowest cost from each of the previous alternates having a backpointer thereto to one of the preceding alternates, and, if the lowest cost of each previous alternate having a backpointer thereto converges to a common preceding alternate thereof, recognizing the common preceding alternate as a code point for the earlier character.
5. The method of claim 4 wherein the step of recognizing the common preceding alternate as a code point for the earlier character further comprises the steps of, determining the lowest cost path back from the common preceding alternate to a still-earlier preceding alternate thereof, and recognizing the still-earlier preceding alternate as a code point.
6. The method of claim 1 wherein the step of recognizing the common previous alternate as a code point includes the step of displaying the code point.
7. The method of claim 1 wherein the step of receiving a plurality of current alternates corresponding to a latest handwritten character includes the steps of receiving information of the current handwritten character, passing the information to a shape matcher, and receiving a response from the shape matcher, the response including the alternates and the probability information associated with each alternate.
8. The method of claim 7 wherein a list of language model states is generated from the list of character alternates returned by the shape matcher, the list of language model states corresponding to different paths through the character alternates that are valid character transitions in a language model.
9. The method of claim 1 wherein the probability information is a negative natural log of a probability value, and the transition cost is a negative natural log of a probability value, and wherein the step of determining the cost from each of the current alternates to each of the previous alternates of a previously input handwritten character includes the step of summing the natural log values .
10. A mechanism for recognizing a handwritten character previously input into a computer system when another character has been input into the system thereafter, comprising: means for receiving a plurality of current alternates corresponding to a latest handwritten character, each of the current alternates having information associated therewith corresponding to a probability; means for determining the cost from each of the current alternates to each of the previous alternates of a previously input handwritten character, the determination based on probability information of each previous alternate, probability information of each current alternate and a transition cost therebetween; means for determining the lowest cost from each of the current alternates to one of the previous alternates, means for detecting when the lowest cost of each of the current alternates is to a common previous alternate, and, means for recognizing the common previous alternate as a code point for the previous character when the lowest cost of each of the current alternates is to a common previous alternate.
11. The mechanism of claim 10 wherein the means for recognizing the common previous alternate as a code point for the previous character further comprises means for determining the lowest cost path back from the common previous alternate to a preceding alternate thereof, and means for recognizing the preceding alternate as a code point.
12. The mechanism of claim 10 wherein the means for determining the lowest cost from each of the current alternates to one of the previous alternates includes means for saving a backpointer pointing from each of the current alternates to the previous alternate having the lowest cost path thereto.
13. The mechanism of claim 12 further comprising means for detecting when each of the current alternates is not to a common previous alternate, means for determining if an earlier set of preceding alternates representing an earlier character precedes the previous alternate, means for determining the lowest cost from each of the previous alternates having a backpointer thereto to one of the preceding alternates, and means for recognizing the common preceding alternate as a code point for the earlier character when the lowest cost of each previous alternate having a backpointer thereto converges to a common preceding alternate thereof.
14. The mechanism of claim 13 wherein the means for recognizing the common preceding alternate as a code point for the earlier character further comprises, means for determining the lowest cost path back from the common preceding alternate to a still-earlier preceding alternate thereof, and means for recognizing the still- earlier preceding alternate as a code point.
15. The mechanism of claim 10 wherein the means for recognizing the common previous alternate as a code point includes means for displaying the code point.
16. The mechanism of claim 10 wherein the means for receiving a plurality of current alternates corresponding to a latest handwritten character includes, means for receiving information of the current handwritten character, means for passing the information to a shape matcher, and means for receiving a response from the shape matcher, the response including the alternates and the probability information associated with each alternate.
17. The mechanism of claim 16 wherein a list of language model states is generated from the list of character alternates returned by the shape matcher, the list of language model states corresponding to different paths through the character alternates that are valid character transitions in a language model.
18. The mechanism of claim 10 wherein the probability information is a negative natural log of a probability value, and the transition cost is a negative natural log of a probability value, and wherein the means for determining the cost from each of the current alternates to each of the previous alternates of a previously input handwritten character includes means for summing the natural log values.
19. In a computer system, a method of recognizing at least one handwritten character entered into the system as subsequent handwritten characters are entered into the system, wherein each handwritten character is associated with a plurality of alternate code points therefor, with each alternate code point having probability information associated therewith, the method comprising the steps of, placing the alternate code points for each handwritten character entered at the end of a queue, determining a cost from each alternate in the queue to any immediately preceding alternate in the queue, the cost based on the probability information of the alternates and a transition cost therebetween, determining the lowest cost path back from each of the alternates at the end of the queue to an alternate at the beginning of the queue, and determining if each lowest cost path back converges at a common alternate in the queue, and if so, recognizing the common alternate and any alternates previous thereto on the path back as the code points for each of the handwritten characters associated therewith.
20. In a computer system, a method of recognizing at least one handwritten character entered into the system based on the context of a subsequent handwritten character entered into the system, comprising the steps of:
(a) receiving an entered handwritten character;
(b) passing the handwritten character to a shape matcher and receiving a plurality of alternates therefor from the shape matcher, each alternate having a cost associated therewith corresponding to the likelihood of matching the character;
(c) identifying the received alternates as a set of currently active alternates;
(d) appending the alternates to a queue; (e) determining if there is a previous set of alternates prior to the currently active alternates in the queue, and if not, returning to step (a);
(f) determining the cost from each of the currently active alternates to each of the alternates in the previous set of alternates, the determination based on the cost of each previous alternate, the cost of each currently active alternate and a transition cost therebetween; (g) determining the lowest cost path from each of the currently active alternates to one of the previous alternates and associating a backpointer with each of the lowest cost paths;
(h) if the backpointers of each of the current alternates point to a common previous alternate, recognizing the common previous alternate as a code point for the previous character, following backpointers on a path from the common previous alternate through any earlier alternates thereof until the beginning of the queue is reached, recognizing each earlier alternate on the path as a code point for each earlier character, and returning to step (a); and (i) if the backpointers of each of the current alternates do not point to a common previous alternate, re-identifying the set of currently active alternates as the previous alternates having backpointers thereto, and returning to step (e).
PCT/US1998/011481 1997-06-06 1998-06-04 Partial results in full context handwriting recognition WO1998055957A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU77229/98A AU7722998A (en) 1997-06-06 1998-06-04 Partial results in full context handwriting recognition

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US08/870,558 US6111985A (en) 1997-06-06 1997-06-06 Method and mechanism for providing partial results in full context handwriting recognition
US08/870,558 1997-06-06

Publications (1)

Publication Number Publication Date
WO1998055957A1 true WO1998055957A1 (en) 1998-12-10

Family

ID=25355645

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1998/011481 WO1998055957A1 (en) 1997-06-06 1998-06-04 Partial results in full context handwriting recognition

Country Status (3)

Country Link
US (2) US6111985A (en)
AU (1) AU7722998A (en)
WO (1) WO1998055957A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1678661A1 (en) * 2003-08-21 2006-07-12 Microsoft Corporation Electronic ink processing
WO2008100951A2 (en) * 2007-02-12 2008-08-21 Google Inc. Contextual input method
EP2009579A1 (en) * 2004-11-12 2008-12-31 Microsoft Corporation Analysis alternates in context trees

Families Citing this family (46)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6111985A (en) * 1997-06-06 2000-08-29 Microsoft Corporation Method and mechanism for providing partial results in full context handwriting recognition
ES2202070T3 (en) 1999-03-18 2004-04-01 602531 British Columbia Ltd. DATA ENTRY FOR PERSONAL INFORMATIC DEVICES.
US7293231B1 (en) * 1999-03-18 2007-11-06 British Columbia Ltd. Data entry for personal computing devices
US6717580B1 (en) * 2000-02-02 2004-04-06 America Online, Inc. Method and apparatus using a list of <minimum, size, gap, offset > quadruple to encode unicode characters in an upper/lowercase mapping
US7031993B1 (en) 2000-02-18 2006-04-18 Ge Medical Systems Global Technology Company, Llc Method and apparatus for fast natural log(X) calculation
US20020057842A1 (en) * 2000-06-02 2002-05-16 Yuen Henry C. Smart handwriting recognition apparatus and methods
US6771817B1 (en) * 2000-08-22 2004-08-03 Microsoft Corporation Method and system for extending ink word data structures while maintaining version compatibility
US6785417B1 (en) 2000-08-22 2004-08-31 Microsoft Corp Method and system for searching for words in ink word documents
US6836759B1 (en) 2000-08-22 2004-12-28 Microsoft Corporation Method and system of handling the selection of alternates for recognized words
US6801660B1 (en) 2000-08-22 2004-10-05 Microsoft Corporation Method and system for maintaining alternates in association with recognized words
US6754386B1 2000-08-22 2004-06-22 Microsoft Corporation Method and system of matching ink processor and recognizer word breaks
US6763137B1 (en) * 2000-09-14 2004-07-13 Canon Kabushiki Kaisha Recognition and clustering of connected components in bi-level images
CA2323856A1 (en) * 2000-10-18 2002-04-18 602531 British Columbia Ltd. Method, system and media for entering data in a personal computing device
JP3974359B2 (en) * 2000-10-31 2007-09-12 株式会社東芝 Online character recognition apparatus and method, computer-readable storage medium, and online character recognition program
US7158678B2 (en) * 2001-07-19 2007-01-02 Motorola, Inc. Text input method for personal digital assistants and the like
US6879718B2 (en) * 2001-11-06 2005-04-12 Microsoft Corp. Efficient method and system for determining parameters in computerized recognition
JP4006239B2 (en) * 2002-02-21 2007-11-14 株式会社日立製作所 Document search method and search system
US7010165B2 (en) * 2002-05-10 2006-03-07 Microsoft Corporation Preprocessing of multi-line rotated electronic ink
US6986106B2 (en) 2002-05-13 2006-01-10 Microsoft Corporation Correction widget
US20030215145A1 (en) * 2002-05-14 2003-11-20 Microsoft Corporation Classification analysis of freeform digital ink input
US7137076B2 (en) * 2002-07-30 2006-11-14 Microsoft Corporation Correcting recognition results associated with user input
US7886236B2 (en) * 2003-03-28 2011-02-08 Microsoft Corporation Dynamic feedback for gestures
US7567239B2 (en) * 2003-06-26 2009-07-28 Motorola, Inc. Method and system for message and note composition on small screen devices
US7418442B1 (en) * 2003-09-30 2008-08-26 Microsoft Corporation Ink alternates and plain text search
US8074184B2 2003-11-07 2011-12-06 Microsoft Corporation Modifying electronic documents with recognized content or other associated data
US6989822B2 (en) * 2003-11-10 2006-01-24 Microsoft Corporation Ink correction pad
US7848573B2 (en) * 2003-12-03 2010-12-07 Microsoft Corporation Scaled text replacement of ink
US7506271B2 (en) * 2003-12-15 2009-03-17 Microsoft Corporation Multi-modal handwriting recognition correction
WO2005070114A2 (en) * 2004-01-09 2005-08-04 Nickel Brand Software, Inc. Brand recognition system
WO2006028438A1 (en) * 2004-09-01 2006-03-16 Hewlett-Packard Development Company, L.P. System, method, and apparatus for continuous character recognition
JP2008515044A (en) * 2004-10-05 2008-05-08 クルシアルテック シーオー.,エルティーディー Character input method for portable device using pointer and portable device
US8552984B2 (en) * 2005-01-13 2013-10-08 602531 British Columbia Ltd. Method, system, apparatus and computer-readable media for directing input associated with keyboard-type device
WO2006090404A1 (en) * 2005-02-23 2006-08-31 Hewlett-Packard Development Company, L.P. System, method, and apparatus for accomodating variability in chunking the sub-word units of online handwriting
US7817857B2 (en) * 2006-05-31 2010-10-19 Microsoft Corporation Combiner for improving handwriting recognition
US8050500B1 (en) 2006-07-06 2011-11-01 Senapps, LLC Recognition method and system
CA2581824A1 (en) * 2007-03-14 2008-09-14 602531 British Columbia Ltd. System, apparatus and method for data entry using multi-function keys
CN105204617B (en) * 2007-04-11 2018-12-14 谷歌有限责任公司 The method and system integrated for Input Method Editor
US7953279B2 (en) 2007-06-28 2011-05-31 Microsoft Corporation Combining online and offline recognizers in a handwriting recognition system
US20110060985A1 (en) * 2009-09-08 2011-03-10 ABJK Newco, Inc. System and Method for Collecting a Signature Using a Smart Device
US8463731B2 (en) * 2010-02-17 2013-06-11 Google Inc. Translating user interaction with a touch screen into text
JP2014127188A (en) * 2012-12-27 2014-07-07 Toshiba Corp Shaping device and method
US10228819B2 2013-02-04 2019-03-12 602531 British Columbia Ltd. Method, system, and apparatus for executing an action related to user selection
US9665246B2 (en) 2013-04-16 2017-05-30 Google Inc. Consistent text suggestion output
US8825474B1 (en) * 2013-04-16 2014-09-02 Google Inc. Text suggestion output using past interaction data
JP2015176241A (en) * 2014-03-13 2015-10-05 株式会社東芝 Database management device and address recognition device
KR20200103236A (en) * 2019-02-22 2020-09-02 삼성전자주식회사 Method and apparatus for displaying an input based on handwriting

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5377281A (en) * 1992-03-18 1994-12-27 At&T Corp. Knowledge-based character recognition
US5526444A (en) * 1991-12-10 1996-06-11 Xerox Corporation Document image decoding using modified branch-and-bound methods
US5615286A (en) * 1995-05-05 1997-03-25 Bell Communications Research, Inc. Method for determining a most likely sequence of states

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4589142A (en) * 1983-12-28 1986-05-13 International Business Machines Corp. (Ibm) Method and apparatus for character recognition based upon the frequency of occurrence of said characters
US5289270A (en) * 1986-10-28 1994-02-22 Canon Kabushiki Kaisha Image processing apparatus for performing density conversion
US4876731A (en) * 1988-02-19 1989-10-24 Nynex Corporation Neural network model in pattern recognition using probabilistic contextual information
US5067165A (en) * 1989-04-19 1991-11-19 Ricoh Company, Ltd. Character recognition method
US5151950A (en) * 1990-10-31 1992-09-29 Go Corporation Method for recognizing handwritten characters using shape and context analysis
US5321773A (en) * 1991-12-10 1994-06-14 Xerox Corporation Image recognition method using finite state networks
US5392363A (en) * 1992-11-13 1995-02-21 International Business Machines Corporation On-line connected handwritten word recognition by a probabilistic method
IL111039A (en) * 1994-09-22 1998-08-16 Advanced Recognition Tech Handwritten pattern recognizer
TW338815B (en) * 1995-06-05 1998-08-21 Motorola Inc Method and apparatus for character recognition of handwritten input
US6111985A (en) * 1997-06-06 2000-08-29 Microsoft Corporation Method and mechanism for providing partial results in full context handwriting recognition
US6061472A (en) * 1997-06-06 2000-05-09 Microsoft Corporation Method and mechanism to reduce handwriting recognizer errors using multiple decision trees

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5526444A (en) * 1991-12-10 1996-06-11 Xerox Corporation Document image decoding using modified branch-and-bound methods
US5377281A (en) * 1992-03-18 1994-12-27 At&T Corp. Knowledge-based character recognition
US5615286A (en) * 1995-05-05 1997-03-25 Bell Communications Research, Inc. Method for determining a most likely sequence of states

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1678661A1 (en) * 2003-08-21 2006-07-12 Microsoft Corporation Electronic ink processing
EP1678661A4 (en) * 2003-08-21 2007-06-27 Microsoft Corp Electronic ink processing
EP2009579A1 (en) * 2004-11-12 2008-12-31 Microsoft Corporation Analysis alternates in context trees
US7518749B2 (en) 2004-11-12 2009-04-14 Microsoft Corporation Analysis alternates in context trees
WO2008100951A2 (en) * 2007-02-12 2008-08-21 Google Inc. Contextual input method
WO2008100951A3 (en) * 2007-02-12 2009-03-19 Google Inc Contextual input method
JP2010518534A (en) * 2007-02-12 2010-05-27 グーグル・インコーポレーテッド Contextual input method
US8028230B2 (en) 2007-02-12 2011-09-27 Google Inc. Contextual input method
TWI475406B (en) * 2007-02-12 2015-03-01 Google Inc Contextual input method

Also Published As

Publication number Publication date
US6418239B1 (en) 2002-07-09
US6111985A (en) 2000-08-29
AU7722998A (en) 1998-12-21

Similar Documents

Publication Publication Date Title
US6111985A (en) Method and mechanism for providing partial results in full context handwriting recognition
JP2726568B2 (en) Character recognition method and device
KR900000602B1 (en) Acknowledge method and device of maturar language
JP3885974B2 (en) Handwriting recognition system and method
US5467407A (en) Method and apparatus for recognizing cursive writing from sequential input information
US5724449A (en) Stroke syntax input device
US5724457A (en) Character string input system
EP0555350A1 (en) Method for pattern recognition
JP3782467B2 (en) Character input method and apparatus
JPH10198766A (en) Device and method for recognizing character, and storage medium
JPH07191986A (en) Sentence input device
JPH07105217A (en) Automatic correcting device for input character
JPH0689368A (en) Method for forming prototype assembly for specified user for handwritten character recognition
CA2497586C (en) Method and apparatus for recognizing cursive writing from sequential input information
JPH0677252B2 (en) Japanese data input processor
JPH0756666B2 (en) Information recognition method
JP3022790B2 (en) Handwritten character input device
JP3157995B2 (en) Character processor
Toyokawa et al. An on-line character recognition system for effective Japanese input
JPH05189621A (en) On-line hand-written character recognizing device
JPH08335250A (en) Misused character correcting device
JPS61234461A (en) Character string correcting system
JPH1091727A (en) Method and device for recognizing handwritten character and its recording medium
JPS63145578A (en) Character recognition post-processing system
JPH0684019A (en) Period recognizing device in hand-written input character processor

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE GH GM GW HU ID IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG UZ VN YU ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW SD SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

NENP Non-entry into the national phase

Ref country code: CA

NENP Non-entry into the national phase

Ref country code: JP

Ref document number: 1999502846

Format of ref document f/p: F

122 Ep: pct application non-entry in european phase