US20060045344A1 - Handprint recognition test deck - Google Patents

Handprint recognition test deck

Info

Publication number
US20060045344A1
Authority
US
United States
Prior art keywords
handprint
snippets
character
snippet
variable
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US10/933,002
Inventor
K. Paxton
William DiBacco
Steven Spiwak
Craig Towne
Manuel Trevisan
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ADI LLC
Original Assignee
ADI LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ADI LLC
Priority to US10/933,002
Assigned to ADI, LLC reassignment ADI, LLC ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: TREVISAN, MANUEL, TOWNE, CRAIG A., DIBACCO, WILLIAM L., SPIWAK, STEVEN P., PAXTON, K. BRADLEY
Publication of US20060045344A1
Priority to US13/446,835 (US8498485B2)
Abandoned

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/32 Digital ink
    • G06V30/333 Preprocessing; Feature extraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/217 Validation; Performance evaluation; Active pattern learning techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/19 Recognition using electronic means
    • G06V30/191 Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/1916 Validation; Performance evaluation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G06V30/412 Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables

Definitions

  • the well-defined TRUTH file 14 is being constructed to contain the “answer” expected from the system using the form supplied.
  • the HPC snippets 28 are digitally pasted together to construct the set of handprint field snippets 33 described above. This set of handprint field snippets 33 will be subsequently printed on the printout 10 .
  • the HPC snippet database 30 can be used in conjunction with the VDSM software 34 , and data field information as well as the field descriptions, to create snippets that are placed on the fringe of acceptable units; such as a snippet touching a boundary or another character, with at least one point.
  • the HP snippets 28 can also be cast into a vector representation if necessary, or placed in reference to a boundary by one or more of the HP snippets.
  • a form image 42 , the page description file 16 , and the handprinted field snippets 28 are processed through variable data printing software (VDP) 40 to create the Digital Test Deck™ (DTD™) file 38 .
  • the DTD™ file 38 is sent to a printer 22 , which may be offset or digital, to produce the DTD™ printout 10 .
  • This printout and the DTD™ TRUTH file 14 comprise the system output that is available to the customer for test purposes.
  • the forms of the DTD™ 10 , which contain simulated human handprint, look like forms prepared by a real person even though they are printed by a digital printer and contain perfectly known data. Using these decks, a forms processing system may be tested for accuracy and efficiency, regardless of the technology used for the data capture. More specifically, the state of the processing system may be reliably assessed at any point in time.
  • a well-defined TRUTH FILE 14 is also developed by the handprint recognition system using extracted data from the forms to accurately represent the data placed within the DTD™. The truth file 14 would contain the “answer” that is expected from a handprint recognition system, and can be used to determine the accuracy of said system.
  • FIG. 2 shows a blank form, which is a portion of the Year 2000 Decennial Census “short” form.
  • This is an example of a form that would benefit from the described system and method.
  • There are places on this (blank) form 20 , called fields 44 , where the respondent is asked to print the answers to questions posed by the Census Bureau. When the respondent completes these fields, the form might look like FIG. 3 , displaying fictitious data 46 in fields 44 . Most people would say this was an actual Census form image, albeit a rather neat one, but it was actually created using a handprint font on a computer. This is one example of a digital test form. A suitable number of different digital test forms would constitute the test deck.
  • The basic properties of the test deck as defined above are:
  • the fourth bullet indicates that using handprint fonts results in an excessively “neat” simulated form, so the form is of limited use in actually measuring OCR data capture quality per se.
  • using actual handprint “snippets” 28 in the creation of the test deck 10 gives a very realistic appearance, as shown in FIG. 4 (again using fictitious data).
  • the test deck 10 is used to:
  • This invention enables “outside-in” testing. If a perfectly known input is inserted into the system, and (mostly) the correct answers come out, then it is unlikely that there is anything seriously wrong in between.
  • “inside-out” testing analyses all possible internal variables such as a measure of the lamp intensity on the left-hand side of the scanner, or the speed of the scanner transport, etc.
  • the problem with an “inside-out” approach is that it can literally fill file cabinets with data, and yet it may be the one element or factor that was not tested that causes the system to fail or create erroneous data.
  • the “outside-in” approach used in this invention is advantageous because testing is simple, cost-effective, accurate, and consistent.
  • FIG. 5 shows a typical forms processing system 50 which uses automatic recognition (OCR) to do the bulk of the data capture workload, and KFI for data capture of those rejected fields for which the OCR system is not confident.
  • the terms Accept Rate and Accuracy Rate are used as a measure of the accuracy of the system under test.
  • the Accept Rate is the fraction of the fields in which the OCR has high confidence, usually expressed as a percent.
  • These “accepted” fields are the ones scored for OCR accuracy. Accepted fields are not sent to keyers except for quality assurance purposes. Among the OCR “accepted” (non-blank) fields, the fraction that is “correct” is the Accuracy Rate, usually expressed as a percent.
  • the Error Rate for either OCR or OMR is defined as 100% minus the Accuracy Rate in percent. So for example, if the Accuracy Rate is 99.2%, the Error Rate is 0.8%.
  • the Reject Rate for recognition is 100% minus the Accept Rate in percent. So, if the Accept Rate is 80%, the Reject Rate is 20%.
  • Rejected fields are the low confidence fields from the OCR, and are sent to keyers to be keyed because the automatic OCR isn't sure it has the correct answer.
  • for OMR, the Accuracy Rate is that fraction of all the check-box fields that are correct, usually including blanks. Blanks are commonly included in scoring OMR because there is no way for the computer to know if a check-box contains a mark or not without looking at it, and so an empty check-box which is properly identified as such is considered a “correct” output of OMR.
  • Scoring includes the calculation of the accuracy of an OCR or keying (data collection) system.
  • a TRUTH File 14 , also referred to as the groundtruth or answer file, contains the set of values expected as output from an OCR/OMR system upon extracting the respondent completed information from a form. When the present invention is used, this TRUTH File 14 can be generated as described below.
  • A portion of the test deck system 12 is shown schematically in FIG. 6 where all of the fields 44 in all the forms are being tested together.
  • the fields 44 are pulled out one at a time for testing. If the handprint was J-O-S-E and the resultant ASCII was JOSE, that would be a correct field, termed a “hard match”, meaning each and every character is correct in a field. On the other hand, if the handprint was C-H-A-O and if the resultant ASCII was CHAD, there would be an error using the hard match criterion.
  • FIG. 7 shows a mathematical representation of the results of applying this test deck system 12 .
  • FIG. 8 is a plot of the measurement error as a function of the number of samples.
  • FIG. 8 shows that many samples may be needed to test the test deck system 12 properly. This invention makes creating a large number of samples cost-effective compared to previous manual methods.
  • Six basic types of test materials used to test forms processing systems are:
  • the test decks described herein could comprise a wide variety of printed forms in addition to questionnaires; for example, bank checks, shipping labels, and other types of printed forms.
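The scoring terms defined in the list above (hard match, Accept Rate, Reject Rate, Accuracy Rate, Error Rate) can be illustrated with a minimal sketch. The record layout (`truth`, `output`, `accepted`) and the function names are assumptions made for illustration; the patent does not specify any data format or code for scoring.

```python
from dataclasses import dataclass


@dataclass
class FieldResult:
    truth: str      # expected value, as recorded in the TRUTH file
    output: str     # ASCII returned by the recognition system
    accepted: bool  # True if the OCR reported high confidence for the field


def hard_match(truth: str, output: str) -> bool:
    """A field is correct only if each and every character is correct."""
    return truth == output


def score(results):
    """Return the four rates, in percent, over the non-blank write-in fields."""
    fields = [r for r in results if r.truth != ""]
    accepted = [r for r in fields if r.accepted]
    if not fields or not accepted:
        raise ValueError("need at least one non-blank accepted field to score")
    accept_rate = 100.0 * len(accepted) / len(fields)
    correct = sum(hard_match(r.truth, r.output) for r in accepted)
    accuracy_rate = 100.0 * correct / len(accepted)
    return {
        "accept_rate": accept_rate,
        "reject_rate": 100.0 - accept_rate,      # rejected fields go to keyers
        "accuracy_rate": accuracy_rate,
        "error_rate": 100.0 - accuracy_rate,
    }


results = [
    FieldResult("JOSE", "JOSE", True),    # hard match
    FieldResult("CHAO", "CHAD", True),    # accepted but wrong: counts as an error
    FieldResult("SMITH", "SMITH", True),
    FieldResult("LEE", "LEE", True),
    FieldResult("KIM", "K1M", False),     # low confidence: rejected, not scored
]
rates = score(results)
print(rates)
```

In this sketch the rejected field does not affect the Accuracy Rate, which mirrors the definition above: accuracy is measured only over the fields the OCR accepted.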

Abstract

A system and method for creating a test deck to qualify and test forms processing systems, including preparing a handprint snippet data base containing labeled handprint image snippets representing a unique hand, preparing a form description file and a data content file, selecting handprint snippets from the handprint snippet data base to formulate a form using the data content file, creating a form image using the selected snippets according to the form description file and printing the form image.

Description

    BACKGROUND OF THE INVENTION
  • 1. Field of the Invention
  • The present invention relates generally to forms processing technology and more particularly to a system and method to create test materials that can evaluate and help improve systems that recognize handprinted fields.
  • 2. Background Art
  • Forms processing technology can encompass many types of systems and many steps, including capturing hand-printed data from questionnaires and putting the data into a computer. Many organizations doing forms processing use traditional “heads-down” keying from paper (KFP) and “heads-up” keying from image (KFI).
  • Traditional “heads-down” keying from paper (KFP) is an approach that involves human keyers sitting in front of a computer terminal, and looking down at a form placed on a rack. They read the data placed on the form by the respondent, and manually key this data into the computer using a KFP software package. People are not very consistent when performing routine tasks for long periods of time, and this process is prone to human error. A major source of error with KFP is placing the data in the wrong field.
  • “Heads-up” keying from image (KFI) is an approach that uses an electronic scanner to scan the forms before sending the electronic image of the form to a computer screen, along with fields where the human keyer is to key in data. The name comes from the fact that the keyer is looking straight ahead at the screen at all times. This method tends to be more accurate since it greatly reduces the incorrect field problem mentioned above for KFP. It is also often faster than KFP. Unfortunately, it still involves humans, who are a constant source of errors.
  • A major problem with capturing handprinted data from forms filled out by human respondents is measuring the accuracy and efficiency of the total system. Former testing methods, such as using a handprint font, may be satisfactory for production readiness tests, but clearly are not adequate to claim to measure handprint recognition accuracy and efficiency. This is partly because they are considered “too neat” and would thus give an artificially high estimate of accuracy relative to the “real world.”
  • Creating a test deck manually gives realistic variability but is a time-consuming and demanding task, and it yields only one unique deck. Creating a corresponding TRUTH file to evaluate the accuracy of the system's processing adds significantly to the complexity of this task. A truth file is an accurate representation of the handprinted data on the forms. To be able to produce a deck of handprinted forms and the corresponding TRUTH file in a more timely and efficient manner would provide a valuable quality assurance tool.
  • Typical data collection forms request a hand-printed response (rather than cursive) when clarity is required. However, there is great variability in the handprints prepared by the population in general. In order to perform adequate system testing and shakedown, a sufficient number of examples is required so that realistic variability can be characterized for use during testing and system evaluation.
  • Automation and cost savings can be realized by incorporating handprint Optical Character Recognition (OCR), so that computer systems attempt to recognize the handprinted fields, and send low confidence fields to KFI. The method of this patent creates test materials that make it possible to assess the accuracy of an entire form processing system more easily and consistently, as well as measure efficiency, regardless of whether KFP, KFI, OCR, or all of these are used in any combination. This invention simplifies the expensive and laborious process associated with the (hand made) test forms wherein the forms are keyed twice from paper, and discrepancies verified and corrected by a third person.
  • BRIEF SUMMARY OF THE INVENTION
  • Our invention is a method to measure handprint recognition accuracy and efficiency by creating a test deck to qualify and test handprint recognition systems. This includes preparing a handprint snippet database containing labeled handprint image snippets that, collectively or in part, may approximate actual respondents' handprinted characters. The method includes creating a page layout file, preparing a form description file as well as a data content file, and then selecting handprint snippets from the handprint snippet database to populate a form, creating a “completed” form image using the selected snippets according to the form description file. When printed, these forms may constitute a Digital Test Deck™.
  • BRIEF DESCRIPTION OF THE DRAWING(S)
  • FIG. 1 schematically illustrates the test deck creation method.
  • FIG. 2 is an example of a blank form.
  • FIG. 3 is an example of a filled out form.
  • FIG. 4 is a form populated with actual handprint snippets.
  • FIG. 5 shows a form processing system.
  • FIG. 6 shows a testing concept at the field level.
  • FIG. 7 shows a diagram of the mathematical system for measuring error.
  • FIG. 8 shows an example plot of error results for this system as a function of the number of samples.
  • DETAILED DESCRIPTION OF THE INVENTION
  • A digitally created printout 10 of this invention is created using a test deck system 12 as shown in FIG. 1. The printout 10 is sometimes referred to as the Digital Test Deck™ (DTD™), or simply the test deck 10, and can be used in a handprint recognition system to test the system for accuracy and to improve the efficiency of the forms processing technology. When accompanied by a TRUTH file 14, which accurately represents the data printed on the form and defines the expected output of the forms processing system, the test deck system 12 produces a very realistic and accurate set of testing forms. The DTD™ printout 10 can be used for production readiness and for baseline testing of forms processing systems, such as those for shipping labels, bank checks and surveys, for accuracy and efficiency.
  • The test deck system 12 for creating the DTD™ printout 10 includes a file such as an Adobe PDF or PostScript file that includes instructions to describe an appropriate form 15. It also includes a page description file (PD file) 16 formed by page description software 18. The PD file 16 may be a QuarkXPress file. The PD file 16 contains the instructions necessary to determine what form to use and how that form is to be filled out. The form may be a “blank” form 20 (See FIG. 2) that can be directed by the PD file 16 to be sent to a printer 22, which could be a traditional offset printer, a color digital printer, or another type of printer, and which then prints the page images of the forms in duplex or simplex, depending on the requirement.
  • This test deck system 12 also includes a form description file (FDF) 24 that describes the characteristics of the form's data fields. The FDF 24 contains information such as field length and whether the field is a check box or a write-in field, for marks or characters respectively. Field length can be measured in the number of marks or characters. The test deck system also includes a variable data database (VDDB) 26 that describes the desired content of the simulated respondent entries. The VDDB 26 is an ASCII database that tells what goes where on the printout. In the case where the field is a check box field, the VDDB 26 describes which check boxes are checked and which are blank and the type of mark that is expected. When the field is a write-in field, the VDDB 26 contains a simulated response, such as a last name, “SMITH.” The VDDB 26 allows customization depending on the form 20 used and the job requirements.
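The relationship between the FDF 24 and the VDDB 26 described above can be sketched as two small data structures plus a consistency check. The field names, the `kind` values, and the set of allowed mark types are hypothetical; the patent states only that the FDF holds field type and length and that the VDDB is an ASCII database of desired content.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    name: str
    kind: str    # "write_in" or "check_box" (assumed labels)
    length: int  # capacity in characters (write-in) or marks (check box)


# Form description: characteristics of each data field on the form.
FDF = [
    FieldSpec("last_name", "write_in", 12),
    FieldSpec("owns_home", "check_box", 1),
]

# Variable data: one record per simulated respondent.
# For a check box, "" means blank; any other value names the mark type.
VDDB = [
    {"last_name": "SMITH", "owns_home": "X"},
    {"last_name": "JONES", "owns_home": ""},
]

ALLOWED_MARKS = {"", "X", "/"}  # hypothetical mark types


def validate(record, fdf):
    """Return a list of problems found when checking a record against the FDF."""
    problems = []
    for spec in fdf:
        value = record.get(spec.name, "")
        if spec.kind == "write_in" and len(value) > spec.length:
            problems.append(f"{spec.name}: {len(value)} characters exceed length {spec.length}")
        elif spec.kind == "check_box" and value not in ALLOWED_MARKS:
            problems.append(f"{spec.name}: unknown mark type {value!r}")
    return problems


for record in VDDB:
    print(record, validate(record, FDF))
```

Checking each record before any form is populated keeps a response from overflowing its field.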
  • The test deck system 12 shown in FIG. 1 uses Handprint Character (HPC) snippets 28 organized in an HPC snippet database 30. The HPC snippet database 30 can include characters, letters, symbols, or parts thereof, where a single character is defined by the set of legal symbols for the particular user or job requirement. The HPC snippet database 30 has an appropriate identification/numbering scheme for subsequent incorporation into the test deck system 12. The individual HPC snippet 28 is preferably an image clip containing a handprinted symbol (character, mark or punctuation mark) obtained from a real person. By using the digitally created printout 10 with HPC snippets 28, as described below, it is possible to test the accuracy of form processing by an Optical Character Recognition (OCR) method and/or system with the assurance that the results are realistic.
  • Optical character recognition is often referred to as OCR or ICR (Intelligent Character Recognition). Optical character recognition refers to the process of automatically recognizing write-in fields from the scanned image of the form. Originally, OCR referred to the recognition of machine print, and is used by government institutions, but has also been used as a generic name for recognition of handprint as well. Sometimes, the type of OCR used for handprint is called “Handprint OCR.” Optical Mark Recognition (OMR) is a related process and refers to the process of automatically recognizing the answers to check-box questions from the scanned image of the form. It is a more advanced version of its predecessor which was known as “mark sense” which determined if a particular circle or other shape was completely filled in by the respondent.
  • The VDDB 26 contains acceptable responses for each data field on the form, effectively simulating the data that would be obtained from a real respondent who had completed the form, but with the advantages for accuracy and efficiency mentioned above. The VDDB 26 can be created from a dictionary of acceptable data for each field. The VDDB 26 is capable of specifying the placement of letters such as upper case or lower case to more closely simulate actual respondent data and/or changing the placement of the handprint character snippet in reference to a boundary by one or more of the handprint character snippet's position, angle, or size. These letters can extend below the line as if written by hand. These letters include those that have extensions such as g, j, p, q, and y, and can be composed from one or more HPC snippets 28 as will be discussed in detail below.
  • The test deck system 12 shown in FIG. 1 incorporates a HPC snippet database 30 of characters, symbols and/or digital images. The HPC snippet database 30 classifies the characters and/or digital images by similarity and/or a feature set. The HPC snippet database 30 contains all the valid or verified character images which have been collected to date in the project. These can come from one or more different sources and be mixed together using a computer. One or more of the sources can include computer-generated samples. The HPC snippet database 30 works in conjunction with Field Dictionaries 31 and/or, alternatively, with subfield Dictionaries, which can be ASCII files. This dictionary 31 contains a table of valid entries for any given field on the blank form 20. When there is a complex linkage between fields in a form, such as between the preparer's gender and first name, the dictionary can be subdivided into subfield dictionaries. This method creates the test deck used to qualify and test handprint recognition systems by preparing a handprint snippet database containing labeled handprint image snippets; preparing a form description file and page description file to describe a form; preparing a variable data database that describes the desired content of the simulated respondent entries using the handprint character snippets; populating the form using the variable data database in conjunction with the form description file, the page description file, and the handprint snippet database; and printing the completed form.
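The linkage between fields described above can be illustrated with a minimal Python sketch. It is not taken from the specification: the dictionary contents, the field names, and the functions `make_vddb_record` and `make_vddb` are hypothetical, and a real VDDB would be driven by the form description file.

```python
import random

# Hypothetical field dictionary: a table of valid entries per field.
FIELD_DICTIONARY = {
    "last_name": ["SMITH", "GARCIA", "NGUYEN"],
    "gender": ["M", "F"],
}

# Hypothetical subfield dictionaries: first names are keyed by gender,
# modeling the gender / first-name linkage mentioned in the text.
FIRST_NAME_BY_GENDER = {
    "M": ["JOSE", "CHAD", "DAVID"],
    "F": ["MARIA", "LINDA", "ANNA"],
}


def make_vddb_record(rng):
    """Draw one simulated respondent record that respects the linkage."""
    gender = rng.choice(FIELD_DICTIONARY["gender"])
    return {
        "gender": gender,
        "first_name": rng.choice(FIRST_NAME_BY_GENDER[gender]),
        "last_name": rng.choice(FIELD_DICTIONARY["last_name"]),
    }


def make_vddb(n_forms, seed=0):
    """Build a reproducible list of records, one per simulated form."""
    rng = random.Random(seed)
    return [make_vddb_record(rng) for _ in range(n_forms)]
```

Seeding the generator is a design choice made here so that the same deck content can be regenerated on demand, in keeping with the "reproducible as required" property of the test deck.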
  • The test deck system uses a “Hand” 32 defined as an “adequate” supply of characters or symbols, preferably from the HPC snippets database 30, to create the set of handprint field snippets 33 which satisfy the field data requirements for the VDDB file 26. The process is controlled by the variable data snippet maker (VDSM) software 34. The hand 32 can be a Homogeneous Hand, which is a Hand in which all the characters have similar dimensional characteristics and features (i.e., slant, line width, height, etc.), so that the set of handprint field snippets 33 created from this homogeneous hand 32 look as though they were filled in by one individual. VDSM software 34 describes the logic that controls handprint field snippet generation, including the production of the set of handprint field snippets 33 that are to be put on the form image 36 of the appropriate form 15. The VDSM software 34 uses the data field information found in the VDDB file 26 and the field description found in the FDF 24 to appropriately select HPC snippets 28 and electronically paste the set of handprint field snippets 33 together. The handprint field snippets 33 are subsequently electronically pasted onto the digital test deck file 38, using variable data printing software 40 to vary or raster images as is known by those in the art. It is sometimes advantageous to incorporate a random or, alternatively, a defined noise into the data to simulate certain environmental or expected effects.
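As an illustration of pasting character snippets from one hand into a field snippet, the following Python sketch represents each snippet as a tiny text bitmap. It is an assumption-laden toy, not the VDSM software: the `SNIPPET_DB` layout, the `hand_A` label, the glyph shapes, and the function `compose_field` are all hypothetical.

```python
# Hypothetical snippet database keyed by (hand id, character).
# Each bitmap is a list of equal-length rows; '#' is ink, '.' is blank.
SNIPPET_DB = {
    ("hand_A", "J"): ["..#", "..#", "#.#", ".#."],
    ("hand_A", "O"): [".#.", "#.#", "#.#", ".#."],
    ("hand_A", "S"): [".##", "#..", "..#", "##."],
    ("hand_A", "E"): ["###", "##.", "#..", "###"],
}


def compose_field(text, hand_id, db, gap=1):
    """Paste the character snippets of one hand side by side.

    Using a single hand_id for the whole field mimics a homogeneous
    hand, so the field looks as if one person wrote it.
    """
    if not text:
        return []
    # Raises KeyError when the hand has no snippet for a character.
    glyphs = [db[(hand_id, ch)] for ch in text]
    height = len(glyphs[0])
    if any(len(g) != height for g in glyphs):
        raise ValueError("snippets in one field must share a height")
    spacer = "." * gap  # blank columns between adjacent characters
    return [spacer.join(g[row] for g in glyphs) for row in range(height)]


field = compose_field("JOSE", "hand_A", SNIPPET_DB)
print("\n".join(field))
```

A production system would operate on scanned image clips and would vary position, angle, and size; the text bitmaps here only show the selection and pasting logic.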
  • At the same time the set of handprint field snippets 33 are used to create the digital test deck 38, the well-defined TRUTH file 14 is being constructed to contain the “answer” expected from the system using the form supplied. The HPC snippets 28 are digitally pasted together to construct the set of handprint field snippets 33 described above. This set of handprint field snippets 33 will be subsequently printed on the printout 10. The HPC snippet database 30 can be used in conjunction with the VDSM software 34, and data field information as well as the field descriptions, to create snippets that are placed on the fringe of acceptable limits, such as a snippet touching a boundary or another character in at least one point. The HPC snippets 28 can also be cast into a vector representation if necessary, or placed in reference to a boundary by one or more of the snippet's position, angle, or size.
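Fringe placement can be sketched as a small geometry helper. The Python below is illustrative only: it assumes pixel units with the origin at the top-left corner of a rectangular field box, and the function name `fringe_placements` and its keys are hypothetical. It returns one centered offset and four offsets that each touch a different boundary.

```python
def fringe_placements(box_w, box_h, snip_w, snip_h):
    """Return top-left offsets for five placements of a snippet in a box.

    Assumes pixel coordinates with (0, 0) at the box's top-left corner.
    The centered placement clears every boundary only when the box is
    at least two pixels larger than the snippet in each dimension.
    """
    if snip_w > box_w or snip_h > box_h:
        raise ValueError("snippet does not fit inside the field box")
    cx = (box_w - snip_w) // 2
    cy = (box_h - snip_h) // 2
    return {
        "centered": (cx, cy),
        "touch_left": (0, cy),
        "touch_right": (box_w - snip_w, cy),
        "touch_top": (cx, 0),
        "touch_bottom": (cx, box_h - snip_h),
    }


placements = fringe_placements(box_w=40, box_h=60, snip_w=20, snip_h=30)
```

Generating all five variants of the same character gives a recognition system both an easy case and four boundary-touching cases with identical ink, which isolates the effect of placement.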
  • A form image 42, the page description file 16, and the handprinted field snippets 28 are processed through variable data printing software (VDP) 40 to create the Digital Test Deck™ (DTD™) file 38. The DTD™ file 38 is sent to a printer 22, which may be offset or digital, to produce the DTD™ printout 10. This printout and the DTD™ TRUTH file 14 comprise the system output that is available to the customer for test purposes.
  • Digital Test Deck™
  • The DTD™ 10, containing simulated human handprint, looks like a form prepared by a real person even though it is printed by a digital printer and contains perfectly known data. Using these decks, a forms processing system may be tested for accuracy and efficiency, regardless of the technology used for the data capture. More specifically, the state of the processing system may be reliably assessed at any point in time. A well-defined TRUTH FILE 14 is also developed by the handprint recognition system using extracted data from the forms to accurately represent the data placed within the DTD™. The truth file 14 would contain the “answer” that is expected from a handprint recognition system, and can be used to determine the accuracy of said system.
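One simple way to picture a truth file is as a table of expected values keyed by form and field. The Python sketch below writes such a table as CSV; the column names, the CSV format, and the function `write_truth_file` are assumptions made for illustration, since the specification does not define a file layout.

```python
import csv
import io


def write_truth_file(vddb_records, out):
    """Write one row per (form, field) holding the expected answer.

    vddb_records is a list of dicts, one per form, mapping a field
    name to the value that was placed on that form.
    """
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["form_id", "field", "expected"])
    for form_id, record in enumerate(vddb_records, start=1):
        for field, value in sorted(record.items()):
            writer.writerow([form_id, field, value])


buf = io.StringIO()
write_truth_file([{"first_name": "JOSE", "last_name": "GARCIA"}], buf)
print(buf.getvalue())
```

Because the rows are derived from the same records that populate the forms, the expected answers are known without any manual keying.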
  • FIG. 2 shows a blank form, which is a portion of the Year 2000 Decennial Census “short” form. This is an example of a form that would benefit from the described system and method. There are places on this (blank) form 20, called fields 44, where the respondent is asked to print the answers to questions posed by the Census Bureau. When the respondent completes these fields, the form might look like FIG. 3 displaying fictitious data 46 in fields 44. Most people would say this was an actual Census form image, albeit a rather neat one, but it was actually created using a handprint font on a computer. This is one example of a digital test form. A suitable number of different digital test forms would constitute the test deck.
  • The basic properties of the test deck as defined above are:
      • looks and feels like a real form with handprinted responses, but really printed on a high quality digital color (or black & white) printer;
      • form content designed to test critical system aspects;
      • reproducible as required;
      • complements, but does not replace, forms with “real data” content;
      • consistent test input;
      • test the data capture system “end-to-end;” and
      • know the “truth” perfectly.
  • The fourth bullet indicates that using handprint fonts results in a rather excessively “neat” simulated form, and so the form is of limited use in actually measuring OCR data capture quality per se. However, using actual handprint “snippets” 28 in the creation of the test deck 10 gives a very realistic appearance, as shown in FIG. 4 (again using fictitious data).
  • The test deck 10 is used to:
      • verify correct operation of critical system components;
      • establish a measurable system performance baseline;
      • test system operation at each software/hardware change;
      • test daily production operational readiness before scanning;
      • test consistency of system between scan operations, sites and over time; and
      • verify system “improvements.”
  • This invention enables “outside-in” testing. If a perfectly known input is inserted into the system, and (mostly) the correct answers come out, then it is unlikely that there is anything seriously wrong in between. Alternatively, “inside-out” testing analyzes all possible internal variables, such as a measure of the lamp intensity on the left-hand side of the scanner, or the speed of the scanner transport, etc. The problem with an “inside-out” approach is that it may literally fill up file cabinets with data, and yet it will be the one element or factor that is not tested that causes the system to fail or create erroneous data. The “outside-in” approach used in this invention is advantageous because testing is simple, cost-effective, accurate, and consistent.
  • How to Measure Accept Rate and Accuracy of a Forms Processing System
  • FIG. 5 shows a typical forms processing system 50 which uses automatic recognition (OCR) to do the bulk of the data capture workload, and KFI for data capture of those rejected fields for which the OCR system is not confident. The terms Accept Rate and Accuracy Rate are used as a measure of the accuracy of the system under test. In automated recognition of hand printed fields, the Accept Rate is the fraction of the fields in which the OCR has high confidence, usually expressed as a percent. These “accepted” fields are the ones scored for OCR accuracy, and they are not sent to keyers except for quality assurance purposes. Among the OCR “accepted” fields, the fraction of the (non-blank) fields that are “correct” is the Accuracy Rate, usually expressed as a percent. Also shown are the steps taken to measure the accept rate and accuracy of the system. Finally, the Error Rate for either OCR or OMR is defined as 100% minus the Accuracy Rate in percent. So, for example, if the Accuracy Rate is 99.2%, the Error Rate is 0.8%.
  • Related to the error rate is the Reject Rate. The Reject Rate for recognition is 100% minus the Accept Rate in percent. So, if the Accept Rate is 80%, the Reject Rate is 20%. Rejected fields are the low confidence fields from the OCR, and are sent to keyers to be keyed because the automatic OCR isn't sure it has the correct answer. For OMR, the Accuracy Rate is that fraction of all the check-box fields that are correct, usually including blanks. Blanks are commonly included in scoring OMR because there is no way for the computer to know if a check-box contains a mark or not without looking at it, and so an empty check-box which is properly identified as such is considered a “correct” output of OMR. Scoring (also called determining an accuracy rate) includes the calculation of the accuracy of an OCR or keying (data collection) system. A TRUTH File 14, also referred to as the groundtruth or answer file, contains the set of values expected as output from an OCR/OMR system upon extracting the respondent completed information from a form. When the present invention is used, this TRUTH File 14 can be generated as described below.
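The four rates defined above can be computed directly from per-field outcomes. The following Python sketch is illustrative: the input layout (a list of accepted/correct flags for non-blank fields) and the function name `recognition_rates` are assumptions, not part of the specification.

```python
def recognition_rates(fields):
    """Compute Accept, Reject, Accuracy, and Error Rates in percent.

    fields is a list of (accepted, correct) boolean pairs, one per
    non-blank field. Accuracy is measured over accepted fields only.
    """
    if not fields:
        raise ValueError("at least one field is required")
    accepted = [correct for was_accepted, correct in fields if was_accepted]
    if not accepted:
        raise ValueError("no accepted fields, so accuracy is undefined")
    accept_rate = 100.0 * len(accepted) / len(fields)
    accuracy_rate = 100.0 * sum(accepted) / len(accepted)
    return {
        "accept_rate": accept_rate,
        "reject_rate": 100.0 - accept_rate,
        "accuracy_rate": accuracy_rate,
        "error_rate": 100.0 - accuracy_rate,
    }


# Hypothetical outcomes: 8 of 10 fields accepted, 7 of those 8 correct.
sample = [(True, True)] * 7 + [(True, False)] + [(False, False)] * 2
rates = recognition_rates(sample)
```

Rejected fields are excluded from the accuracy calculation because they are routed to keyers rather than taken from the OCR output.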
  • A portion of the test deck system 12 is shown schematically in FIG. 6 where all of the fields 44 in all the forms are being tested together. The fields 44 are pulled out one at a time for testing. If the handprint was J-O-S-E and the resultant ASCII was JOSE, that would be a correct field, termed a “hard match”, meaning each and every character is correct in a field. On the other hand, if the handprint was C-H-A-O and the resultant ASCII was CHAD, there would be an error using the hard match criterion.
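The hard match criterion is an exact, whole-field comparison. A minimal Python sketch, using the two examples from the text and hypothetical function names `hard_match` and `hard_match_rate`:

```python
def hard_match(truth, recognized):
    """A field is correct only if every character matches the truth."""
    return truth == recognized


def hard_match_rate(pairs):
    """Fraction of (truth, recognized) field pairs that hard match."""
    if not pairs:
        raise ValueError("at least one field pair is required")
    return sum(hard_match(t, r) for t, r in pairs) / len(pairs)


# The two fields discussed above: one correct, one with a single
# misread character, which fails the whole field.
pairs = [("JOSE", "JOSE"), ("CHAO", "CHAD")]
rate = hard_match_rate(pairs)
```

Under this criterion a field with one wrong character counts the same as a field that is entirely wrong, so field-level accuracy is stricter than character-level accuracy.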
  • Here, the total error estimate is:
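  • $$\hat{\varepsilon}_T = \varepsilon_O A_O + \varepsilon_i (1 - A_O) + \varepsilon_t$$
      • $\varepsilon_O$ = OCR Error
      • $A_O$ = OCR Accept Rate
      • $\varepsilon_i$ = KFI Error
      • $\varepsilon_t$ = “Truth” Error
      • $\varepsilon_T$ = Total Data Error, and the Estimator is shown with a ˆ over the $\varepsilon_T$.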
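As a worked example, the estimate can be read as the OCR error weighted by the accept rate, plus the KFI error weighted by the rejected fraction, plus the truth error. The Python sketch below evaluates that reading; the function name `total_error_estimate` and all numeric values are hypothetical, and rates are expressed as fractions rather than percents.

```python
def total_error_estimate(ocr_error, accept_rate, kfi_error, truth_error):
    """Estimate total data error from its three contributions.

    Accepted fields carry the OCR error, rejected fields (1 - accept
    rate) carry the KFI error, and the truth error is added on top.
    All arguments are fractions between 0 and 1.
    """
    if not 0.0 <= accept_rate <= 1.0:
        raise ValueError("accept_rate must be a fraction between 0 and 1")
    return (ocr_error * accept_rate
            + kfi_error * (1.0 - accept_rate)
            + truth_error)


# Hypothetical values: 0.8% OCR error, 80% accept rate, 2% KFI error,
# and a truth error of zero assumed for illustration.
estimate = total_error_estimate(0.008, 0.80, 0.02, 0.0)
```

With these assumed inputs the OCR term contributes 0.0064 and the KFI term 0.004, so keying of the rejected 20% accounts for a large share of the total despite its small volume.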
  • FIG. 7 shows a mathematical representation of the results of applying this test deck system 12.
  • How to Associate Measurement Error with Number of Samples
  • If an accuracy in the neighborhood of p=99%, corresponding to an error rate of q=1%, is needed, then the following relationship approximately describes the one-sigma sampling error in the estimate: $$\sigma = \sqrt{\frac{pq}{n}}$$
    where n is the number of samples. FIG. 8 is a plot of the measurement error as a function of the number of samples.
  • Using this method, it is possible to determine how many samples are needed to obtain the desired level of quality in estimating the desired system accuracy. FIG. 8 shows that many samples may be needed to test the test deck system 12 properly. This invention makes creating a large number of samples cost-effective compared to previous manual methods.
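The relationship between sample count and measurement error can be explored numerically. The Python sketch below uses the binomial standard error, sigma = sqrt(p*q/n) with q = 1 - p, and inverts it to find a sample count for a target sigma; the function names `sampling_sigma` and `samples_needed` are hypothetical.

```python
import math


def sampling_sigma(p, n):
    """One-sigma sampling error for an accuracy p measured on n samples."""
    if n <= 0:
        raise ValueError("n must be positive")
    return math.sqrt(p * (1.0 - p) / n)


def samples_needed(p, target_sigma):
    """Smallest whole number of samples giving sigma <= target_sigma.

    Obtained by solving sigma = sqrt(p*q/n) for n and rounding up.
    """
    if target_sigma <= 0:
        raise ValueError("target_sigma must be positive")
    return math.ceil(p * (1.0 - p) / target_sigma ** 2)


# Around 99% accuracy, how many fields pin the estimate to about 0.1%?
n = samples_needed(0.99, 0.001)
sigma = sampling_sigma(0.99, n)
```

Because sigma falls only with the square root of n, halving the measurement error requires roughly four times as many samples, which is why inexpensive sample generation matters.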
  • Test Materials for Forms Processing Systems
  • Six basic types of test materials are used to test forms processing systems:
      • 1. Blank forms;
      • 2. Forms hand-marked by volunteers;
      • 3. Real forms filled out by respondents;
      • 4. Images of real forms on CD-ROM;
      • 5. Lithographically printed forms with simulated respondent marks;
      • 6. Digitally printed forms with simulated respondent marks.
        Each of these types of test materials has a purpose, and has advantages and disadvantages. By a suitable combination of these materials, tests may be devised to cover all testing needs.
  • While the invention has been described in connection with various embodiments, it is not intended to limit the scope of the invention to the particular form set forth; on the contrary, it is intended to cover such alternatives, modifications, and equivalents as may be included within the spirit and scope of the invention as defined by the appended claims. In particular, the test decks described herein could be comprised of a wide variety of printed forms in addition to questionnaires; for example, bank checks, shipping labels, and other types of printed forms.

Claims (18)

1. A method for creating a test deck to qualify and test handprint recognition systems, the method comprising:
(a) preparing a handprint snippet database containing labeled handprint image snippets;
(b) preparing a form description file and page description file to describe a form;
(c) preparing a variable data database that describes the desired content of the simulated respondent entries using the handprint character snippets;
(d) populating the form using the variable data database in conjunction with the form description file, the page description file, and the handprint snippet database; and
(e) printing the completed form.
2. The method of claim 1 further comprising where the handprint character snippets are grouped representing a unique Hand.
3. The method of claim 1 further comprising incorporated noise into the handprint snippet database.
4. The method of claim 1 further comprising changing the placement of the handprint character snippet in reference to a boundary by one or more of the handprint character snippet's position, angle, or size.
5. The method of claim 1 further comprising incorporating handprint character snippets from more than one source.
6. The method of claim 1 further comprising creating variable handprint character snippets on fringe of acceptable limits so that the snippet touches the boundary or another character in at least one point.
7. The method of claim 1 wherein the handprint snippets are cast into a vector representation.
8. The method of claim 1 further comprising software to help define the placement of handprint character snippets on the form.
9. The method of claim 1 further comprising creating the handprint character snippet database by:
(a) collecting multiple handprint character snippets sampled from each contribution; and
(b) mixing the samples.
10. The method of claim 9, wherein the samples are computer generated.
11. The method of claim 9, wherein multiple contributions are used.
12. A system of creating a test deck from handprint character snippets to qualify and test handprint recognition systems (OCR) comprising:
(a) a database of handprint character snippets for use in creating a variable handprint character snippet or field snippet;
(b) data relating to the test deck, including a form description file a page description, and a variable data database;
(c) variable data printing software receiving the data from a) and b); and
(d) printing software for printing one or more variable characters snippets using the variable data database to create a digital test deck.
13. The system of claim 12 further comprising computing software for controlling the image raster processor while rastering the variable characters or field snippets.
14. The system of claim 12, further comprising rastering the handprint character snippets to incorporate changes in angle, size, or position.
15. The system of claim 12, further comprising the variable data database coupled to multiple sources of handprint character snippets.
16. The system of claim 12, further comprising variable character snippets, including various variable character snippets positioned in relation to four boundaries of a rectangle.
17. The system of claim 16, further comprising including the variable character data database and handprint character snippets that have been rastered four times resulting in one version not touching a boundary and the other four touching each of four different boundaries.
18. The system of claim 12 further comprising printing the digital test deck comprising a plurality of forms for use in qualifying and testing an OCR system.
US10/933,002 2004-09-02 2004-09-02 Handprint recognition test deck Abandoned US20060045344A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US10/933,002 US20060045344A1 (en) 2004-09-02 2004-09-02 Handprint recognition test deck
US13/446,835 US8498485B2 (en) 2004-09-02 2012-04-13 Handprint recognition test deck

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US10/933,002 US20060045344A1 (en) 2004-09-02 2004-09-02 Handprint recognition test deck

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US13/446,835 Continuation US8498485B2 (en) 2004-09-02 2012-04-13 Handprint recognition test deck

Publications (1)

Publication Number Publication Date
US20060045344A1 true US20060045344A1 (en) 2006-03-02

Family

ID=35943135

Family Applications (2)

Application Number Title Priority Date Filing Date
US10/933,002 Abandoned US20060045344A1 (en) 2004-09-02 2004-09-02 Handprint recognition test deck
US13/446,835 Active US8498485B2 (en) 2004-09-02 2012-04-13 Handprint recognition test deck

Country Status (1)

Country Link
US (2) US20060045344A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017055610A1 (en) 2015-10-01 2017-04-06 Basf Se Process for recovering a mixture comprising a (thio)phosphoric acid derivative
US10961264B2 (en) 2015-12-01 2021-03-30 Basf Se Process for isolating a (thio)phosphoric acid derivative

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4837842A (en) * 1986-09-19 1989-06-06 Holt Arthur W Character and pattern recognition machine and method
US4941189A (en) * 1987-02-25 1990-07-10 Lundy Electronics & Systems, Inc. Optical character reader with skew recognition
US5214718A (en) * 1986-10-06 1993-05-25 Ampex Systems Corporation Scan-in polygonal extraction of video images
US5416898A (en) * 1992-05-12 1995-05-16 Apple Computer, Inc. Apparatus and method for generating textual lines layouts
US5448375A (en) * 1992-03-20 1995-09-05 Xerox Corporation Method and system for labeling a document for storage, manipulation, and retrieval
US5455875A (en) * 1992-12-15 1995-10-03 International Business Machines Corporation System and method for correction of optical character recognition with display of image segments according to character data
US5694494A (en) * 1993-04-12 1997-12-02 Ricoh Company, Ltd. Electronic retrieval of information from form documents
US5805747A (en) * 1994-10-04 1998-09-08 Science Applications International Corporation Apparatus and method for OCR character and confidence determination using multiple OCR devices
US5854957A (en) * 1996-05-22 1998-12-29 Minolta Co., Ltd. Image formation apparatus that can have waiting time before image formation reduced
US5933531A (en) * 1996-08-23 1999-08-03 International Business Machines Corporation Verification and correction method and system for optical character recognition
US6426806B2 (en) * 1998-03-31 2002-07-30 Canon Kabushiki Kaisha Routing scanned documents with scanned control sheets
US6654495B1 (en) * 1999-04-27 2003-11-25 International Business Machines Corporation Method and apparatus for removing ruled lines
US6658166B1 (en) * 2000-03-08 2003-12-02 International Business Machines Corporation Correction of distortions in form processing
US6661919B2 (en) * 1994-08-31 2003-12-09 Adobe Systems Incorporated Method and apparatus for producing a hybrid data structure for displaying a raster image

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5550930A (en) * 1991-06-17 1996-08-27 Microsoft Corporation Method and system for training a handwriting recognizer at the time of misrecognition
US5293429A (en) * 1991-08-06 1994-03-08 Ricoh Company, Ltd. System and method for automatically classifying heterogeneous business forms
US5544257A (en) * 1992-01-08 1996-08-06 International Business Machines Corporation Continuous parameter hidden Markov model approach to automatic handwriting recognition
JP3362913B2 (en) * 1993-05-27 2003-01-07 松下電器産業株式会社 Handwritten character input device
EP0654755B1 (en) * 1993-11-23 2000-08-02 International Business Machines Corporation A system and method for automatic handwriting recognition with a writer-independent chirographic label alphabet
US6738526B1 (en) * 1999-07-30 2004-05-18 Microsoft Corporation Method and apparatus for filtering and caching data representing images
SE519014C2 (en) * 2001-03-07 2002-12-23 Decuma Ab Ideon Res Park Method and apparatus for recognizing a handwritten pattern
CN101452531B (en) * 2008-12-01 2010-11-03 宁波新然电子信息科技发展有限公司 Identification method for handwriting latin letter

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080212845A1 (en) * 2007-02-26 2008-09-04 Emc Corporation Automatic form generation
US7886219B2 (en) * 2007-02-26 2011-02-08 Emc Corporation Automatic form generation
US20080221977A1 (en) * 2007-03-02 2008-09-11 Adi, Llc Method for Statistical Process Control for Data Entry Systems
US20080219557A1 (en) * 2007-03-02 2008-09-11 Adi, Llc Process Performance Evaluation for Enterprise Data Systems
US20080235263A1 (en) * 2007-03-02 2008-09-25 Adi, Llc Automating Creation of Digital Test Materials
US8055104B2 (en) 2007-03-02 2011-11-08 Adi, Llc Process performance evaluation for Enterprise data systems
US9070027B2 (en) * 2007-03-02 2015-06-30 Adi, Llc Process performance evaluation for rules-driven processing
US20120041948A1 (en) * 2007-03-02 2012-02-16 Exactdata, Llc Process performance evaluation for rules-driven processing
US8983170B2 (en) * 2008-01-18 2015-03-17 Mitek Systems, Inc. Systems and methods for developing and verifying image processing standards for mobile deposit
US10685223B2 (en) 2008-01-18 2020-06-16 Mitek Systems, Inc. Systems and methods for mobile image capture and content processing of driver's licenses
US20210103723A1 (en) * 2008-01-18 2021-04-08 Mitek Systems, Inc. Systems and methods for developing and verifying image processing standards for mobile deposit
US10909362B2 (en) * 2008-01-18 2021-02-02 Mitek Systems, Inc. Systems and methods for developing and verifying image processing standards for mobile deposit
US20130223721A1 (en) * 2008-01-18 2013-08-29 Mitek Systems Systems and methods for developing and verifying image processing standards for moble deposit
US10192108B2 (en) * 2008-01-18 2019-01-29 Mitek Systems, Inc. Systems and methods for developing and verifying image processing standards for mobile deposit
US10607073B2 (en) 2008-01-18 2020-03-31 Mitek Systems, Inc. Systems and methods for classifying payment documents during mobile image processing
US20190228222A1 (en) * 2008-01-18 2019-07-25 Mitek Systems, Inc. Systems and methods for developing and verifying image processing standards for mobile deposit
US10558972B2 (en) 2008-01-18 2020-02-11 Mitek Systems, Inc. Systems and methods for mobile image capture and processing of documents
US20120030151A1 (en) * 2010-07-30 2012-02-02 Adi, Llc. Method and system for assessing data classification quality
US8498948B2 (en) * 2010-07-30 2013-07-30 Adi, Llc Method and system for assessing data classification quality
US10360447B2 (en) 2013-03-15 2019-07-23 Mitek Systems, Inc. Systems and methods for assessing standards for mobile image quality
US11157731B2 (en) 2013-03-15 2021-10-26 Mitek Systems, Inc. Systems and methods for assessing standards for mobile image quality
US10042839B2 (en) * 2016-06-21 2018-08-07 International Business Machines Corporation Forms processing method
US20210064867A1 (en) * 2019-08-28 2021-03-04 Fuji Xerox Co., Ltd. Information processing apparatus and non-transitory computer readable medium

Also Published As

Publication number Publication date
US8498485B2 (en) 2013-07-30
US20120218575A1 (en) 2012-08-30

Similar Documents

Publication Publication Date Title
US8498485B2 (en) Handprint recognition test deck
CN109101469B (en) Extracting searchable information from digitized documents
CN110597806A (en) Wrong question set generation and answer statistics system and method based on reading and amending identification
CN110766014A (en) Bill information positioning method, system and computer readable storage medium
CN106846961B (en) Electronic test paper processing method and device
US20050207635A1 (en) Method and apparatus for printing documents that include MICR characters
US8768241B2 (en) System and method for representing digital assessments
CN109271951B (en) Method and system for improving accounting and auditing efficiency
US20080235263A1 (en) Automating Creation of Digital Test Materials
CN107808154B (en) Method and device for extracting cash register bill information
CN110689013A (en) Automatic marking method and system based on feature recognition
CN109886256A (en) Intelligence evaluation and test equipment and system
CN106778717A (en) A kind of test and appraisal table recognition methods based on image recognition and k nearest neighbor
CN109684957A (en) A kind of method and system showing system data according to paper form automatically
Mozaffari et al. IfN/Farsi-Database: a database of Farsi handwritten city names
JP5905690B2 (en) Answer processing device, answer processing method, program, and seal
CN210038810U (en) Intelligent evaluation equipment and system
Fornés et al. On the use of textural features for writer identification in old handwritten music scores
US20030108243A1 (en) Adaptive technology for automatic document analysis
Zhang et al. Computational method for calligraphic style representation and classification
JP2004029107A (en) System for processing marking of answer
Tomaschek Evaluation of off-the-shelf OCR technologies
JPH11202749A (en) Image processor and distribution system and method using the image processor
Balinsky et al. Aesthetic measure of alignment and regularity
CN114821618A (en) Analysis method for OFD reading software display effect

Legal Events

Date Code Title Description
AS Assignment

Owner name: ADI, LLC, NEW YORK

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:PAXTON, K. BRADLEY;DIBACCO, WILLIAM L.;SPIWAK, STEVEN P.;AND OTHERS;REEL/FRAME:016152/0813;SIGNING DATES FROM 20040816 TO 20040830

STCB Information on status: application discontinuation

Free format text: ABANDONED -- AFTER EXAMINER'S ANSWER OR BOARD OF APPEALS DECISION