WO2001006416A2 - Intelligent mapping of field names in an electronic form with standard field names - Google Patents

Intelligent mapping of field names in an electronic form with standard field names

Info

Publication number
WO2001006416A2
Authority
WO
WIPO (PCT)
Prior art keywords
legacy
field name
field
standard
name
Prior art date
Application number
PCT/US2000/040415
Other languages
French (fr)
Other versions
WO2001006416A3 (en
Inventor
Geoffrey W. Simons
Original Assignee
Infospace, Inc.
Priority date
Filing date
Publication date
Application filed by Infospace, Inc. filed Critical Infospace, Inc.
Priority to AU78801/00A priority Critical patent/AU7880100A/en
Publication of WO2001006416A2 publication Critical patent/WO2001006416A2/en
Publication of WO2001006416A3 publication Critical patent/WO2001006416A3/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/151 Transformation

Definitions

  • the present invention relates generally to computer software for examining form documents over a computer network. More particularly, the present invention provides a method and system for automatically deriving associations between fields in an electronic form with a list of predefined fields on a server computer.
  • e-commerce and performing various other types of transactions, typically between businesses and individuals.
  • One step generally necessary in performing transactions over the Internet is having the individual fill in some type of electronic or online form so the business or vendor can get some basic information about the customer.
  • many methods may be used to assist a user in filling out such an electronic form document.
  • One or more of these methods may involve a vendor registering its form with a third-party where the third-party facilitates, primarily by automating, the process of filling in the form by the customer.
  • One method of registering a vendor's electronic form with a third-party service provider typically involves the necessary step of matching fields in the vendor's form with the third-party's standard fields. Presently, this is done without involving the vendor. Employees of the third-party upload or otherwise obtain a copy of a vendor's form and examine each field in the form. They then match each of the fields with one of the third-party's own standard fields. This process is repeated for each vendor (some of whom may have more than one form) the third-party desires to register with its service, without involving or consulting with the vendor.
  • a method of mapping a standard field name with a legacy field name from an electronic legacy form is described.
  • a legacy field name is extracted from an online vendor form.
  • a list of standard field names that are predicted to most likely map to the legacy field name is created.
  • a standard field name from the list that actually maps to the legacy field name is selected.
  • a knowledge base used to create the list of predicted standard field names is micro-adjusted to reflect the selection of the predicted standard field name that actually maps to the legacy field name.
  • the legacy field name is extracted by parsing the online legacy form and creating a document object model.
  • the list of predicted standard field names is drawn from a standard field name list having multiple field names commonly found in vendor forms.
  • an initial training set upon which a knowledge base is built is created by manually entering field mapping data between standard fields and legacy fields.
  • the list of predicted standard field names is created utilizing one or more evidence variables to determine standard field names most likely to map to the legacy field name and where one of the variables is relative distance of a term from the legacy field name being examined.
  • the knowledge base used to create the list of predicted standard field names is a vector set having multiple vectors, wherein a vector represents a standard field name.
  • the list of predicted standard field names is created using a linear algebra algorithm.

Brief Description of the Drawings

  • FIG. 1 is an overview block diagram showing components of a field mapping system in accordance with one embodiment of the present invention.
  • FIG. 2 is a block diagram depicting components of a field mapping system in accordance with one embodiment of the present invention.
  • FIG. 3A is a flow diagram describing a process for building an initial training set or knowledge base to be used in the field mapping system in accordance with one embodiment of the present invention.
  • FIG. 3B is a block diagram of an object model in accordance with one embodiment of the present invention.
  • FIG. 4 is a flow diagram of a process of registering a form with a third-party service after a training set has been created in accordance with one embodiment of the present invention.
  • FIG. 5 is a table depicting weight values and fields used to calculate probabilities of particular standard fields in accordance with one embodiment of the present invention.
  • FIG. 6 is a block diagram of a typical computer system suitable for implementing an embodiment of the present invention.

Detailed Description of the Preferred Embodiment

  • Reference will now be made in detail to a preferred embodiment of the invention. An example of the preferred embodiment is illustrated in the accompanying drawings. While the invention will be described in conjunction with a preferred embodiment, it will be understood that it is not intended to limit the invention to one preferred embodiment. To the contrary, it is intended to cover alternatives, modifications, and equivalents as may be included within the spirit and scope of the invention as defined by the appended claims.
  • a system and method for intelligently mapping fields in an electronic form with standard fields from a set of predefined field names as described in the various figures.
  • This intelligent field mapping method is one aspect of an automatic form filling procedure for online or electronic transactions, such as those over the Internet.
  • a system and method for automatic form filling is described in Application No. 09/231,644, titled “Server For Enabling The Automatic Insertion Of Data Into Electronic Forms On A User Computer”, (Attorney Dkt: MLLTP001) and in Application No. 09/231,254, (Attorney Dkt: MLLTP002) titled “Method and Apparatus for Client Side Automatic Electronic Form Completion", both filed January 15, 1999, and incorporated by reference herein.
  • third-party service or "third-party”
  • legacy fields in the vendor form need to be mapped or paired with a standard field from a set of field names created and maintained by the third-party (or at least those legacy fields that have a corresponding standard field).
  • the method of the present invention facilitates the matching process by providing the vendor, desiring to register its form, with standard field names most likely to match each of the legacy field names.
  • the field mapping process of the present invention "learns" more about which standardized fields are more likely to match a given legacy field name. Through this field mapping process, the registration process for a vendor evolves and becomes more efficient since the vendor merely has to select a standard field from a short list of "most likely" matching standard fields.
  • FIG. 1 is an overview block diagram showing components of a field mapping system in accordance with one embodiment of the present invention.
  • Form 102 is an electronic form containing legacy fields and labels for such fields to be filled in by a customer.
  • the legacy fields are given names when form 102 is created, such as fname and lname or lastname and firstname, etc., which are typically not seen or known to the customer.
  • the labels are the text seen by the customer or user describing these fields, such as "First Name," "Last Name," "Street Address," etc.
  • form 102 is uploaded to a third-party server 104.
  • server 104 contains, among other information, a field map domain database 106 described in greater detail below. Domain database 106 is created using a metric described below and is used to determine the most likely matching standard field names.
  • This last step is not part of the present invention and is described in greater detail in pending patent applications incorporated by reference above.
  • Customer 108 and browser 110 are shown for completeness and are relevant in discussions below regarding the construction of document object models ("DOMs") based on the uploaded vendor forms.
  • FIG. 2 is a block diagram depicting components of a field mapping system in accordance with one embodiment of the present invention. Shown are vendor form 102, a list of third-party standard field names 202, a field mapping component 204, and a field mapping list 206. Vendor form 102 is described in FIG. 1 above. It is typically an HTML document created and maintained by a vendor. In other embodiments, form 102 can be an XML document. Form 102 contains various fields, of which three are shown (in brackets) with accompanying labels or text: First Name 208: [fname] 210, Last Name 212: [lname] 214, and Middle Initial 216: [minitial] 218.
  • fname 210, lname 214, and minitial 218 are examples of legacy field names and can be categorized as field objects.
  • Labels First Name 208, Last Name 212, and Middle Initial 216 are each text strings and can be further decomposed into "First,” “Name,” “Last,” “Name,” “Middle,” and “Initial.”
  • the legacy field names and text strings are created and maintained by the vendor.
  • Standard field name list 202 is created and maintained by a third-party service for automatic form filling.
  • field name list 202 contains about 70 to 80 of the most common fields found in most customer-oriented forms for purchasing goods or services in North America and internationally.
  • when a vendor registers an electronic form, generally many of the fields in the form will have a corresponding matching field in field name list 202.
  • Shown as part of field name list 202 are three examples of standard field names: user_name_first 220, user_name_last 222, and user_name_middle 224, collectively making up a customer's name.
  • names of the standard fields and the number of such fields can vary depending on what the third-party service provider believes is most suitable.
  • Field mapping component 204 collects "intelligence" as new forms are registered and performs the field mapping function using various data sources, such as field map domain database 106, to create a field mapping list 206 (i.e., a list of associated field names). Field mapping component 204 is described in greater detail below. Initially (i.e., when no forms have been registered), mapping component 204 uses a training set which it then builds upon as new forms are registered. In the described embodiment, the training set, described below, is created “manually” in that vendors or the third-party service match fields without the benefit of a "most likely" matching field list.
  • Field mapping list 206 is an abstract representation of the final "output" of the intelligent field mapping process of the present invention.
  • legacy field names found in form 102, such as fname 210, lname 214, and minitial 218.
  • Associated with each legacy field name is a selection list of most likely corresponding standard field names.
  • fname 210 has a selection list having, as a first and most likely choice, user_name_first 220, followed by the second most likely choice, user_name_last 222, and finally, by user_name_middle 224. If there are fields preceding fname 210 and text string 208, those fields may also be in the selection list. Whether they appear in the selection list depends, for example, in part, on how often they immediately precede fname 210 in other vendor forms. In other embodiments, evidence variables other than proximity and frequency can be used by the learning engine, alone or in combination with other evidence variables, to determine the most likely matching standard field names.
  • Standard field name user_name_last 222 will likely always be in the selection list since it is almost always very close to user_name_first 220. The same is true for user_name_middle 224, although this field is less predictable: it may appear after lname 214, between fname 210 and lname 214, or not appear at all.
  • the selections are presented to the vendor in the form of a pull-down window with only the first, most-likely selection showing initially, and the other choices in the pull-down window. In other embodiments, the selections can be presented to the user in other formats.
  • FIG. 3A is a flow diagram describing a process for building an initial training set or knowledge base to be used in the field mapping system in accordance with one embodiment of the present invention.
  • the training set is an initial knowledge base that is created "manually" in that legacy fields are mapped to standard fields preferably by the vendor. This is done without providing a short list or most likely standard field to the person performing the mapping.
  • a vendor's customer form is uploaded to a server under the control of the third party service provider.
  • the vendor uploads one form onto the server in order to register it with the third-party service.
  • multiple forms from one vendor can be uploaded.
  • at step 304, the form, typically an HTML document, is parsed using an HTML parser, thereby creating a document object model, or DOM.
  • An example of an object model is shown in FIG. 3B.
  • Such a parsing procedure is well known in the field of Internet programming as is the creation of DOMs (essentially a tree of nodes).
  • Some typical parsing algorithms are widely used in JavaScript, browser programs (e.g., Netscape Navigator or Microsoft Internet Explorer), and in the W3C.
  • the HTML parsers used in the present invention should be as "forgiving" as the HTML parsers utilized in the widely used Web browsers.
  • DOM 330 created is a linear object model made up of a linear series of linked nodes.
  • Object 330 can be seen as the raw model that represents the legacy form. It should be noted here that the object model is not necessarily created solely for field mapping component 204 and can be used by other components of the broader automatic form filling system. Following the same example of text and fields in form 102 from above, a corresponding DOM 330 would include a node 332 of type TEXT and value "First," followed by another TEXT node 334 with value "Name," followed by a node 336 of type LFIELD (legacy field) having a value "fname," and so on.
  • one unique feature of object model 330 in the described embodiment is the inclusion of an "unknown" node 338 for each legacy field node in the object model.
  • This node of type "unknown” is parallel to and at the same level as a legacy field, such as "fname.”
  • the HTML parser of the described embodiment inserts these nodes in the object model and creates a set of unknown nodes.
  • These object models containing the normal nodes and "unknown" nodes are assigned a unique identifier and stored.
  • HTML is linear
  • the parser can create a linear object model in which it is possible, in subsequent steps, to determine the distance other nodes in the model are from a particular node (i.e., it is possible to count nodes ahead of or behind a given node to determine a distance).
  • a vendor form is defined as a set of nodes of three types: text (labels), legacy fields, and unknowns.
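The linear object model described above can be sketched as follows. This is a minimal illustration built on Python's standard `html.parser`, not the parser used in the described embodiment. The names `Node`, `LinearFormParser`, `parse_form`, and `node_distance` are hypothetical, and modeling the "unknown" node as an empty `standard` slot on each legacy field node is an assumption made for brevity.

```python
from dataclasses import dataclass
from html.parser import HTMLParser
from typing import List, Optional


@dataclass
class Node:
    kind: str                       # "TEXT" or "LFIELD"
    value: str                      # label word or legacy field name
    standard: Optional[str] = None  # the parallel "unknown" slot (LFIELD only)


class LinearFormParser(HTMLParser):
    """Flatten an HTML form into a linear series of nodes (hypothetical sketch)."""

    def __init__(self):
        super().__init__()
        self.nodes: List[Node] = []

    def handle_starttag(self, tag, attrs):
        # Form controls with a name attribute become legacy field nodes.
        if tag in ("input", "select", "textarea"):
            name = dict(attrs).get("name")
            if name:
                self.nodes.append(Node("LFIELD", name))

    def handle_data(self, data):
        # Each word of visible label text becomes its own TEXT node.
        for word in data.split():
            word = word.strip(":")
            if word:
                self.nodes.append(Node("TEXT", word))


def parse_form(html: str) -> List[Node]:
    parser = LinearFormParser()
    parser.feed(html)
    parser.close()
    return parser.nodes


def node_distance(i: int, j: int) -> int:
    """Distance is the number of nodes between two positions in the model."""
    return abs(i - j)


SAMPLE = '<form>First Name: <input name="fname"> Last Name: <input name="lname"></form>'
nodes = parse_form(SAMPLE)
```

Because the model is a flat list, counting nodes ahead of or behind a given node reduces to index arithmetic.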
  • a human performs a manual mapping of each legacy field with a standard field.
  • the object model representing the form is retrieved and all nodes representing a legacy field are extracted along with each associated unknown node or field.
  • the process of extracting nodes of one or more particular types from an object model is well known in the field of computer and Internet programming.
  • a corresponding standard field is chosen, if appropriate or available.
  • the object model is retrieved or reconstructed with the unknown nodes or fields eliminated and replaced with the standard field names entered by the vendor. Any unknown nodes associated with legacy fields which did not have a matching standard field are unaltered and are essentially ignored in subsequent processing in the described embodiment.
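A minimal sketch of this replacement step, assuming each node is a plain dictionary and that the vendor's choices arrive as a mapping from legacy field name to standard field name. The function name `apply_mapping` and the dictionary keys are hypothetical.

```python
def apply_mapping(nodes, mapping):
    """Return a copy of the object model with unknown slots filled in.

    nodes:   list of dicts such as {"type": "LFIELD", "value": "fname", "standard": None}
    mapping: {legacy_field_name: standard_field_name} chosen manually
    """
    completed = []
    for node in nodes:
        node = dict(node)  # copy so the stored raw model is left untouched
        if node["type"] == "LFIELD":
            # Legacy fields without a matching standard field keep None,
            # so later processing can simply ignore them.
            node["standard"] = mapping.get(node["value"])
        completed.append(node)
    return completed
```

Leaving unmatched unknowns as `None` mirrors the text: they are unaltered and skipped in subsequent processing.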
  • This manually entered mapping data is stored in a field map domain as a form object having individual elements and associated clauses.
  • a clause is a data element and a privacy practice, not relevant to the present invention.
  • the completed object model containing the legacy fields and the standard field names is stored in a field map domain database.
  • the object model is compressed before being stored in the domain database.
  • the entire object model uncompressed is stored should a different algorithm or learning engine be utilized for the field mapping.
  • the current algorithm or a different type of algorithm can be retrained to use different data in the object model.
  • term or label nodes such as "First" or "Street Address" in the object model can also be mapped or associated with a standard field by the vendor. "First" can be associated with the standard field user_name_first, for example, and this added intelligence can be used by the learning engine or algorithm to better predict which standard field names map to a given legacy field.
  • at step 310, the process of entering one vendor form into a training set is complete. The process is repeated for each form used to initially build the training set.
  • intelligence data from 150 to 200 forms is used to initially build the training set. This number can, of course, vary depending on the desired initial accuracy or intelligence of the field mapping system.
  • FIG. 4 is a flow diagram of a process of registering a form with a third-party service after a training set has been created in accordance with one embodiment of the present invention.
  • an online vendor desiring to take advantage of the third-party's automatic form filling service is required to register the form with the third-party. It is during this registration process that the vendor must map legacy fields in its form with standard field names created and maintained by the third-party.
  • the present invention describes a method and system of facilitating the field mapping process by providing the vendor with the most likely standard field name to map to a particular legacy field followed by a list of other likely standard fields that might match.
  • Steps 402 and 404 of FIG. 4 are substantially similar to steps 302 and 304 of FIG. 3A.
  • the vendor uploads a form it desires to register to the third-party server.
  • the form is parsed using an HTML parser.
  • An object model having nodes for terms, legacy fields, and "unknown" nodes associated with each legacy field node is created.
  • the parser also extracts the legacy field nodes from the object model at step 404.
  • the field mapping system predicts which standard field names are most likely to match each legacy field (determined by examining the extracted legacy field nodes from the object model). That is, it makes its best predictions as to what the unknown nodes in the object model are in terms of standard field names. In the described embodiment this is done using a linear algebra algorithm and is described in greater detail below.
  • the predicted standard field names for the legacy fields are presented to the user.
  • the vendor initially sees one selection (the most likely standard field) followed by a pull-down menu of three or four other likely fields in descending order.
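The presentation step can be sketched as a simple ranking: the highest-scoring standard field is shown first and the next few go into the pull-down menu. The function name `selection_list` and the default of four alternatives are assumptions for illustration; the scores themselves would come from the field mapping system.

```python
def selection_list(scores, max_alternatives=4):
    """Split scored standard fields into a top pick and pull-down alternatives.

    scores: {standard_field_name: score}, higher meaning more likely.
    Returns (top_choice, alternatives) with alternatives in descending order.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    if not ranked:
        return None, []
    return ranked[0], ranked[1:1 + max_alternatives]
```

For example, `selection_list({"user_name_first": 0.9, "user_name_last": 0.4})` would offer `user_name_first` first, with `user_name_last` as the only alternative.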
  • the vendor either selects the standard field name presented (i.e., the one the field mapper determined should be the corresponding standard field for the particular legacy field) or one of the other likely standard field names. If the vendor selects the first choice, it is essentially confirming the field mapper's prediction. If the vendor selects one of the others or none at all, it is implicitly correcting the field mapper's intelligence.
  • this confirmation or correction feedback from the vendor is used to micro-adjust the inference base or rule set of the field map system.
  • the inference base is compressed into linear algebra tables as described below.
  • any corrections to or confirmations of the field map intelligence by the vendor are used to correct one evidence variable, namely, the relative distance of certain terms from the legacy field being examined. The distance is measured in terms of the number of nodes (in the object model) from a legacy field node. The role of this distance variable and how it is used is described in greater detail below.
  • other evidence variables can be used (e.g., a Boolean variable indicating whether a term is before or after a particular legacy field node) and, thus, input from the vendor can be used to micro-adjust the weights of those variables as well.
  • a linear algebra algorithm is used as a learning engine for building the inference base of the field map system.
  • the learning engine can be based on a Bayesian network or a neural network.
  • the linear algebra algorithm used in the described embodiment uses a vector set or matrix to represent a form where each row is keyed based on a standard field. Once a matrix, described below, is created for a form, its values are averaged or combined with corresponding values in a "master" matrix which represents a constantly adjusted inference base for the field map system. Predictions as to legacy field/standard field mappings are derived from this inference base or master matrix.
  • a selected form $l$ includes a series of labels (terms) and fields (legacy fields).
  • $T^l$ is created for a form $l$.
  • This set of vectors only corresponds to nodes that were in the selected form $l$.
  • Each vector represents one term i contained in form $l$, such as "First," "Last," or "Street."
  • $T^l_i = \sum_j w^l_{ij} P_j$
  • $T^l_{\text{First}} = \tfrac{1}{3}\,P(\text{user\_name\_first}) + \tfrac{1}{6}\,P(\text{user\_name\_last}) + \cdots$
  • the variable w represents a weight calculated based on the distance a term node or field node is from a particular standard field. The calculation of this weight is described below.
  • Each vector is a sum of values where each value is the equivalent of a dot product of a weight (e.g., 1/3, 1/6, ...) and an unknown node in the object model. The operation performed is similar to a dot product taken in an inverted manner. Essentially, each vector represents how term i is associated with standard field nodes, symbolized by $P_j$.
  • the weights of nodes in the vicinity of term i are inversely proportional to the distance of the node from the term.
  • the distance is determined by the number of nodes from the term i node in object model 330. This distance is essentially the same as the relative position of terms in the original HTML document of form $l$.
  • the nodes are in the following order: "First” "Name” [fname] "Last” "Name” and [lname]
  • the node for term "First” is two away from the [fname] node and five away from the [lname] node, for example.
  • a node is zero distance away from itself.
  • each term i in form $l$ has a vector of weights, $w^l_{i0}$, $w^l_{i1}$, $w^l_{i2}$, and so on, associated with each node. If the weight falls below a certain threshold value, the node is ignored and not included in the vector for that term.
  • the variable d is the distance a node is from term i. If a node has a distance of two, it is assigned a weight of 1/3; if it has a distance of one, it has a weight of 1/2. Thus, the farther a node is from term i, the less weight is accorded to that node. In the described embodiment, if a node has a weight less than 1/8, it is not included in the vector $T^l_i$ for term i.
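The weighting rule can be sketched as below. The closed form w = 1/(d + 1) is an assumption: the text gives only the data points (distance one gives 1/2, distance two gives 1/3, and a node at zero distance from itself gets a weight of one), and this formula is the simplest one consistent with all three. The names `weight`, `weights_for_term`, and `THRESHOLD` are hypothetical.

```python
from fractions import Fraction

THRESHOLD = Fraction(1, 8)  # nodes weighted below this are left out


def weight(d: int) -> Fraction:
    """Assumed weighting: inversely proportional to distance, w = 1 / (d + 1)."""
    return Fraction(1, d + 1)


def weights_for_term(num_nodes: int, term_index: int) -> dict:
    """Map node index -> weight for every node close enough to the term."""
    result = {}
    for j in range(num_nodes):
        w = weight(abs(j - term_index))
        if w >= THRESHOLD:  # a weight of exactly 1/8 is still included
            result[j] = w
    return result
```

Under this assumption, a node up to seven positions away still contributes, while a node eight positions away (weight 1/9) is dropped.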
  • the set of vectors $T^l$ represents vectors for each term in form $l$.
  • Each vector, in turn, can be seen as a series of (weight, standard field) pairs as described above. The number of these pairs in each vector can vary, as can the number of vectors in the set of vectors representing a form, since forms can have a varying number of terms.
  • the inference base is the center of intelligence of the field map system and is continually modified when new forms are registered. In other words, when a new $T^l$ is entered into the system, data from the vectors in $T^l$ are used to update or, more accurately, micro-adjust data in the inference base.
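A sketch of how the per-form vector set might be assembled from a completed object model. It assumes the weighting w = 1/(d + 1) with a 1/8 cutoff, produces one vector per node position rather than per distinct term, and sums weights if the same standard field occurs more than once; none of these choices is confirmed by the text. The function name `form_vectors` is hypothetical.

```python
from fractions import Fraction

THRESHOLD = Fraction(1, 8)


def form_vectors(nodes):
    """Build (term, {standard_field: weight}) pairs for one form.

    nodes: list of (value, standard_field_or_None) in document order, where
    legacy field nodes carry the standard field chosen for their unknown.
    """
    vectors = []
    for i, (term, _) in enumerate(nodes):
        vec = {}
        for j, (_, standard) in enumerate(nodes):
            if standard is None:
                continue  # only mapped unknowns contribute pairs
            w = Fraction(1, abs(i - j) + 1)
            if w >= THRESHOLD:
                vec[standard] = vec.get(standard, Fraction(0)) + w
        vectors.append((term, vec))
    return vectors


FORM = [
    ("First", None), ("Name", None), ("fname", "user_name_first"),
    ("Last", None), ("Name", None), ("lname", "user_name_last"),
]
vectors = form_vectors(FORM)
```

With this sample form, the vector for "First" holds 1/3 for user_name_first and 1/6 for user_name_last, matching the distances of two and five given earlier.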
  • One primary difference in calculating values for the inference base is that it is not specific to one form $l$; thus, subscript $l$ does not appear in any of the equations.
  • the inference base is also a series of vectors, similar to the matrix for a single form 1. However, the number of vectors is much larger since each unique term found in all forms registered has a vector in the inference base, which could range in the hundreds and possibly thousands.
  • the matrix $T_i$ is a sum of vectors $T^l_i$, described below, where each vector represents a term found in at least one of the registered forms and that term's association with other nodes.
  • the weight given to a particular standard field to represent that field's association with a particular term, such as a legacy field or text string, is an average of weights given to the same standard field with respect to the same particular term over all previously registered forms.
  • the weight $W_{ij}$ is calculated by summing all the weights from individual forms, represented by $w^l_{ij}$, as described above, and dividing by the number of times $n_i$ the term has appeared in previously registered forms: $W_{ij} = \frac{1}{n_i} \sum_l w^l_{ij}$.
  • the matrix $T_i$ is the average of individual form matrices $T^l_i$.
  • the field map system keeps a count of the number of times a term appears in a particular form.
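The averaging can be sketched with running sums and counts, so that each newly registered form micro-adjusts the stored weights. The class name `InferenceBase` and its methods are hypothetical, and counting one appearance per term per registered form is an assumption about how the count is kept.

```python
from collections import defaultdict
from fractions import Fraction


class InferenceBase:
    """Running average of per-form weights, keyed by term and standard field."""

    def __init__(self):
        self.sums = defaultdict(dict)   # term -> {standard_field: summed weight}
        self.counts = defaultdict(int)  # term -> number of appearances

    def register(self, form_vectors):
        """Fold one form's vectors, {term: {standard_field: weight}}, into the base."""
        for term, vec in form_vectors.items():
            self.counts[term] += 1
            for field, w in vec.items():
                self.sums[term][field] = self.sums[term].get(field, Fraction(0)) + w

    def weight(self, term, field):
        """Average weight of a standard field for a term over all registered forms."""
        n = self.counts.get(term, 0)
        if n == 0:
            return Fraction(0)
        return self.sums.get(term, {}).get(field, Fraction(0)) / n
```

A form in which the term appears without the field still increments the count, which lowers the average for that pairing.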
  • the inference base matrix is used to calculate probabilities that a particular unknown or standard field is a matching standard field for a given legacy field.
  • Weights w are only those weights that exceed a threshold value, such as 1/8. Any fields having weights lower than the threshold are not worth considering in the equation.
  • $T_i$ is normalized by dividing $T_i$ by the magnitude of $T_i$.
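A sketch of the normalization step, assuming "magnitude" means the Euclidean norm of the vector's weights; the text does not say which norm is intended. The function name `normalize` is hypothetical.

```python
import math


def normalize(vec):
    """Scale a {standard_field: weight} vector to unit Euclidean length."""
    magnitude = math.sqrt(sum(w * w for w in vec.values()))
    if magnitude == 0:
        return dict(vec)  # nothing to scale in an empty or all-zero vector
    return {field: w / magnitude for field, w in vec.items()}
```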
  • the sums expressed in the formula are depicted in the form of a table shown in FIGURE 5.
  • FIG. 5 is a table depicting weight values and fields used to calculate probabilities of particular standard fields in accordance with one embodiment of the present invention.
  • the final step in the linear algebra learning algorithm used by the field map system is calculating a probability that a legacy field maps to a given standard field.
  • the calculation based on the formula $U_k = \sum_j \sum_i w^l_{ik} W_{ij} P_j$ can be shown more explicitly in table 500.
  • Columns 502 represent standard field names that can possibly match (i.e., have weights above a threshold value) any of the legacy fields in form $l$. Thus, the number of columns in table 500 can vary depending on the form $l$ being parsed. Shown in table 500 are two sample columns: user_name_first 504 and user_name_last 506. Rows 508 represent all terms and legacy fields in form $l$ and are drawn from the form object model. As with columns 502, the number of rows can vary depending on the form being parsed.
  • Shown in table 500 are four sample rows for "First," "Name," [fname], and "Last."
  • each term in rows 508 has a corresponding weight column 510 containing individual weight values w, previously calculated as described above.
  • This weight corresponds to a particular term's distance from an unknown field.
  • $w^l_{ik}$ is the weight between the unknown k and node i.
  • the term "First” has a weight of 1/3 with respect to legacy field [fname].
  • "Name” and "Last” have higher weights since they are closer to [fname].
  • Legacy field [fname] has a weight of one with respect to itself.
  • a cell 512 (intersection of a row and column) contains another weight value, specifically $W_{ij}$, needed to complete the sum operations to calculate the probability that a particular unknown $U_k$ matches a standard field $P_j$.
  • Cell 512 holds the value .35. This is the weight given to the term "First" with respect to the standard field user_name_first.
  • in another cell 514, representing [fname] and user_name_first, $W_{ij}$ is very high since, drawing from past form registrations, the legacy field [fname] has frequently matched standard field user_name_first.
  • the weight would be higher, such as .99, if all previous forms called the field for the customer's first name "fname." Vendors can call this field various other names, such as [firstname] or [name_first], and so on, which somewhat dilutes the possibility that [fname] matches user_name_first.
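The table-driven sum can be sketched as follows: for one unknown, each row's local weight is multiplied by the inference-base weight in each column, and the products are added per column. The function name `score_unknown` is hypothetical, and every inference-base value in the example other than the .35 for "First" and user_name_first is invented purely for illustration. The results are unnormalized scores used for ranking rather than calibrated probabilities.

```python
def score_unknown(local_weights, inference_base):
    """Rank standard fields for one unknown node.

    local_weights:  {term: weight of that term relative to the unknown}
    inference_base: {term: {standard_field: averaged weight}}
    Returns [(standard_field, score), ...] in descending score order.
    """
    scores = {}
    for term, w_local in local_weights.items():
        for field, w_base in inference_base.get(term, {}).items():
            scores[field] = scores.get(field, 0.0) + w_local * w_base
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Rows of the table for the unknown next to [fname].
LOCAL = {"First": 1 / 3, "Name": 1 / 2, "fname": 1.0, "Last": 1 / 2}

# Hypothetical inference-base values; only the .35 appears in the text.
BASE = {
    "First": {"user_name_first": 0.35, "user_name_last": 0.10},
    "Name": {"user_name_first": 0.30, "user_name_last": 0.30},
    "fname": {"user_name_first": 0.90},
    "Last": {"user_name_first": 0.10, "user_name_last": 0.40},
}

ranked = score_unknown(LOCAL, BASE)
```

With these illustrative numbers, user_name_first ranks ahead of user_name_last, driven mostly by the strong [fname] row.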
  • the present invention employs various computer-implemented operations involving data stored in computer systems. These operations include, but are not limited to, those requiring physical manipulation of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated.
  • the operations described herein that form part of the invention are useful machine operations.
  • the manipulations performed are often referred to in terms such as producing, identifying, running, determining, comparing, executing, downloading, or detecting. It is sometimes convenient, principally for reasons of common usage, to refer to these electrical or magnetic signals as bits, values, elements, variables, characters, data, or the like. It should be remembered, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities.
  • the present invention also relates to a server computer or similar system for performing the aforementioned operations.
  • the system may be specially constructed for the required purposes, or it may be a general purpose computer selectively activated or configured by a computer program stored in the computer.
  • the processes presented above are not inherently related to any particular computer or other computing apparatus.
  • various general-purpose computers may be used with programs written in accordance with the teachings herein, or, alternatively, it may be more convenient to construct a more specialized computer system, such as a privacy bank server, to perform the required operations.
  • FIG. 6 is a block diagram of a general purpose computer system 600 suitable for carrying out the processing in accordance with one embodiment of the present invention.
  • FIG. 6 illustrates one embodiment of a general purpose computer system.
  • Computer system 600, made up of various subsystems described below, includes at least one microprocessor subsystem (also referred to as a central processing unit, or CPU) 602. That is, CPU 602 can be implemented by a single-chip processor or by multiple processors. It should be noted that in reconfigurable computing systems, CPU 602 can be distributed amongst a group of programmable logic devices. In such a system, the programmable logic devices can be reconfigured as needed to control the operation of computer system 600. In this way, the manipulation of input data is distributed amongst the group of programmable logic devices.
  • CPU 602 is a general purpose digital processor which controls the operation of the computer system 600. Using instructions retrieved from memory, the CPU 602 controls the reception and manipulation of input data, and the output and display of data on output devices.
  • CPU 602 is coupled bi-directionally with a first primary storage 604, typically a random access memory (RAM), and uni-directionally with a second primary storage area 606, typically a read-only memory (ROM), via a memory bus 608.
  • primary storage 604 can be used as a general storage area and as scratch-pad memory, and can also be used to store input data and processed data. It can also store programming instructions and data, in the form of a field map domain database or a standard field name list, to name just two examples, in addition to other data and instructions for processes operating on CPU 602, and is typically used for fast transfer of data and instructions in a bi-directional manner over the memory bus 608.
  • primary storage 606 typically includes basic operating instructions, program code, data and objects used by the CPU 602 to perform its functions.
  • Primary storage devices 604 and 606 may include any suitable computer-readable storage media, described below, depending on whether, for example, data access needs to be bi-directional or uni-directional.
  • CPU 602 can also directly and very rapidly retrieve and store frequently needed data in a cache memory 610.
  • a removable mass storage device 612 provides additional data storage capacity for the computer system 600, and is coupled either bi-directionally or uni-directionally to CPU 602 via a peripheral bus 614.
  • a specific removable mass storage device commonly known as a CD-ROM typically passes data uni-directionally to the CPU 602, whereas a floppy disk can pass data bi-directionally to the CPU 602.
  • Storage 612 may also include computer-readable media such as magnetic tape, flash memory, signals embodied on a carrier wave, PC-CARDS, portable mass storage devices, holographic storage devices, and other storage devices.
  • a fixed mass storage 616 also provides additional data storage capacity and is coupled bi-directionally to CPU 602 via peripheral bus 614.
  • the most common example of mass storage 616 is a hard disk drive. Generally, access to these media is slower than access to primary storages 604 and 606.
  • Mass storage 612 and 616 generally store additional programming instructions, data, and the like that typically are not in active use by the CPU 602. It will be appreciated that the information retained within mass storage 612 and 616 may be incorporated, if needed, in standard fashion as part of primary storage 604 (e.g., RAM) as virtual memory.
  • peripheral bus 614 is used to provide access to other subsystems and devices as well.
  • these include a display monitor 618 and adapter 620, a printer device 622, a network interface 624, an auxiliary input/output device interface 626, a sound card 628 and speakers 630, and other subsystems as needed.
  • the network interface 624 allows CPU 602 to be coupled to another computer, computer network, or telecommunications network using a network connection as shown. Through the network interface 624, it is contemplated that the CPU 602 might receive information, e.g., data objects or program instructions, from another network, or might output information to another network in the course of performing the above-described method steps. Information, often represented as a sequence of instructions to be executed on a CPU, may be received from and outputted to another network, for example, in the form of a computer data signal embodied in a carrier wave.
  • An interface card or similar device and appropriate software implemented by CPU 602 can be used to connect the computer system 600 to an external network and transfer data according to standard protocols.
  • method embodiments of the present invention may execute solely upon CPU 602, or may be performed across a network such as the Internet, intranet networks, or local area networks, in conjunction with a remote CPU that shares a portion of the processing.
  • Additional mass storage devices may also be connected to CPU 602 through network interface 624.
  • Auxiliary I/O device interface 626 represents general and customized interfaces that allow the CPU 602 to send and, more typically, receive data from other devices such as microphones, touch-sensitive displays, transducer card readers, tape readers, voice or handwriting recognizers, biometrics readers, cameras, portable mass storage devices, and other computers.
  • Also coupled to the CPU 602 is a keyboard controller 632 via a local bus 634 for receiving input from a keyboard 636 or a pointer device 638, and sending decoded symbols from the keyboard 636 or pointer device 638 to the CPU 602.
  • the pointer device may be a mouse, stylus, track ball, or tablet, and is useful for interacting with a graphical user interface.
  • embodiments of the present invention further relate to computer storage products with a computer readable medium that contain program code for performing various computer-implemented operations.
  • the computer-readable medium is any data storage device that can store data which can thereafter be read by a computer system.
  • the media and program code may be those specially designed and constructed for the purposes of the present invention, or they may be of the kind well known to those of ordinary skill in the computer software arts.
  • Examples of computer-readable media include, but are not limited to, all the media mentioned above: magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROM disks; magneto-optical media such as floptical disks; and specially configured hardware devices such as application-specific integrated circuits (ASICs), programmable logic devices (PLDs), and ROM and RAM devices.
  • the computer-readable medium can also be distributed as a data signal embodied in a carrier wave over a network of coupled computer systems so that the computer-readable code is stored and executed in a distributed fashion.
  • Examples of program code include both machine code, as produced, for example, by a compiler, and files containing higher level code that may be executed using an interpreter.
  • FIG. 6 is but an example of a computer system suitable for use with the invention. Other computer architectures having different configurations of subsystems may also be utilized.

Abstract

Methods, apparatus, and computer program products are disclosed for facilitating the mapping of legacy field names in an online vendor form with standard field names as one step in an automatic online form filling process. When a vendor registers a form with a third-party automatic form filling service, fields in the vendor's form must be mapped with standardized field names created and maintained by the third-party service. A process is described for intelligent mapping of a legacy field name in an electronic legacy form with one or more standard field names, thereby saving the vendor time and reducing the chance of error in the form registration process. A method of mapping a standard field name with a legacy field name from an electronic legacy form is described. A legacy field name is extracted from an online vendor form. A list of standard field names that are predicted to most likely map to the legacy field name is created using a linear algebra algorithm. A standard field name from the list that actually maps to the legacy field name is selected. A knowledge base of vectors used to create the list of predicted standard field names is micro-adjusted to reflect the selection of the predicted standard field name that actually maps to the legacy field name.

Description

INTELLIGENT MAPPING OF FIELD NAMES IN AN ELECTRONIC FORM
WITH STANDARD FIELD NAMES
CROSS-REFERENCE TO RELATED APPLICATIONS
A system and method for automatic form filling is described in Application No. 09/231,644, titled "Server For Enabling The Automatic Insertion Of Data Into Electronic Forms On A User Computer", (Attorney Dkt: MLLTP001) and in Application No. 09/231,254, (Attorney Dkt: MLLTP002) titled "Method and Apparatus for Client Side Automatic Electronic Form Completion", both filed January 15, 1999.
Background of the Invention
Field of the Invention
The present invention relates generally to computer software for examining form documents over a computer network. More particularly, the present invention provides a method and system for automatically deriving associations between fields in an electronic form with a list of predefined fields on a server computer.
Discussion of Related Art
One result of the recent explosion in the use of computers is the amount of communication that now takes place between separate computers and computer systems. Many methods and systems exist for communications between computers or computer systems. This is reflected in many contexts, such as in the growth of the Internet or, more specifically, the World Wide Web. For purposes of the following discussion, several methods and systems will be described with reference to the Internet and to Web sites as a matter of convenience. It should be understood, however, that this is not intended to limit the scope of this discussion, and that many other applicable devices and protocols for computer communications exist, such as "Intranets," closed proxy networks, enterprise-wide networks, direct modem to modem connections, etc.
In recent years, the Internet has been used for commerce (often referred to as "e-commerce") and performing various other types of transactions, typically between businesses and individuals. One step generally necessary in performing transactions over the Internet is having the individual fill in some type of electronic or online form so the business or vendor can get some basic information about the customer. In general, many methods may be used to assist a user in filling out such an electronic form document. One or more of these methods may involve a vendor registering its form with a third-party where the third-party facilitates, primarily by automating, the process of filling in the form by the customer.
One method of registering a vendor's electronic form with a third-party service provider (where the "service" consists of automated form filling) typically involves the necessary step of matching fields in the vendor's form with the third-party's standard fields. Presently, this is done without involving the vendor. Employees of the third-party upload or otherwise obtain a copy of a vendor's form and examine each field in the form. They then match each of the fields with one of the third-party's own standard fields. This process is repeated for each vendor (some of whom may have more than one form) the third-party desires to register with its service, without involving or consulting with the vendor.
One of the drawbacks to this manual approach to matching form field names is that it is, by its nature, manually intensive if the third-party wants to register hundreds or thousands of forms from different vendors. This would require a large investment in time and labor to initially set up and to continually maintain and expand. Furthermore, because the matching is done manually by human beings, the chances of clerical error are high.
At another level, drawbacks from the manual approach to mapping field names are more significant. When done manually, the third-party could mismatch a vendor field name with one of the standard third-party field names thereby associating a meaning with a vendor, or legacy, field that the vendor did not intend. Another significant drawback stems from not having the vendor participate in the mapping process. By not having the vendor or form owner involved in the mapping process, the third-party is unable to take into account privacy considerations desired by the vendor. Although online transactions are on the rise, consumers are increasingly wary and sophisticated with respect to the privacy or protection of any personal information they enter online. Generally speaking, consumers, at a minimum, want to know how their information will be used once they have entered it. For example, will it be used strictly for the single transaction being conducted at the moment or will it be saved by the vendor, perhaps to be used in a mailing list or in a database of consumer preferences.
In any case, many online vendors want to have some type of privacy negotiation associated with any forms being registered with the third-party service. If no privacy features are associated with a particular form, consumers will be increasingly reluctant to provide information, especially certain types of information such as billing information or consumer preferences. However, if the vendor is involved in the matching process, the process should be efficient and easy for the vendor, otherwise the vendor is less likely to register its forms with the third-party. Therefore, it would be desirable to have a process in which the vendor or online merchant performs the matching between the legacy field names and the standardized third-party field names. Furthermore, it would be desirable to make the matching procedure for the vendor efficient, preferably by examining the legacy field name and then providing the most likely matching standard field names from which the vendor can select the correct one. This intelligent mapping process should have the ability to evolve or learn as corrections are made and more forms are registered with the third-party service.
Summary of the Invention
According to the present invention, methods, apparatus, and computer program products are disclosed for intelligent mapping of a legacy field name in an electronic legacy form with one or more standard field names. In one aspect of the present invention a method of mapping a standard field name with a legacy field name from an electronic legacy form is described. A legacy field name is extracted from an online vendor form. A list of standard field names that are predicted to most likely map to the legacy field name is created. A standard field name from the list that actually maps to the legacy field name is selected. A knowledge base used to create the list of predicted standard field names is micro-adjusted to reflect the selection of the predicted standard field name that actually maps to the legacy field name. In one embodiment the legacy field name is extracted by parsing the online legacy form and creating a document object model. In another embodiment the list of predicted standard field names is drawn from a standard field name list having multiple field names commonly found in vendor forms. In yet another embodiment an initial training set upon which a knowledge base is built is created by manually entering field mapping data between standard fields and legacy fields. In yet another embodiment the list of predicted standard field names is created utilizing one or more evidence variables to determine standard field names most likely to map to the legacy field name and where one of the variables is relative distance of a term from the legacy field name being examined. In yet another embodiment the knowledge base used to create the list of predicted standard field names is a vector set having multiple vectors, wherein a vector represents a standard field name. In yet another embodiment the list of predicted standard field names is created using a linear algebra algorithm.
Brief Description of the Drawings
The invention will be better understood by reference to the following description taken in conjunction with the accompanying drawings in which:
FIG. 1 is an overview block diagram showing components of a field mapping system in accordance with one embodiment of the present invention.
FIG. 2 is a block diagram depicting components of a field mapping system in accordance with one embodiment of the present invention.
FIG. 3A is a flow diagram describing a process for building an initial training set or knowledge base to be used in the field mapping system in accordance with one embodiment of the present invention.
FIG. 3B is a block diagram of an object model in accordance with one embodiment of the present invention.
FIG. 4 is a flow diagram of a process of registering a form with a third-party service after a training set has been created in accordance with one embodiment of the present invention.
FIG. 5 is a table depicting weight values and fields used to calculate probabilities of particular standard fields in accordance with one embodiment of the present invention.
FIG. 6 is a block diagram of a typical computer system suitable for implementing an embodiment of the present invention.
Detailed Description of the Preferred Embodiment
Reference will now be made in detail to a preferred embodiment of the invention. An example of the preferred embodiment is illustrated in the accompanying drawings. While the invention will be described in conjunction with a preferred embodiment, it will be understood that it is not intended to limit the invention to one preferred embodiment. To the contrary, it is intended to cover alternatives, modifications, and equivalents as may be included within the spirit and scope of the invention as defined by the appended claims.
In accordance with one embodiment of the present invention, there is provided a system and method for intelligently mapping fields in an electronic form (referred to as legacy fields) with standard fields from a set of predefined field names as described in the various figures. This intelligent field mapping method is one aspect of an automatic form filling procedure for online or electronic transactions, such as those over the Internet. A system and method for automatic form filling is described in Application No. 09/231,644, titled "Server For Enabling The Automatic Insertion Of Data Into Electronic Forms On A User Computer", (Attorney Dkt: MLLTP001) and in Application No. 09/231,254, (Attorney Dkt: MLLTP002) titled "Method and Apparatus for Client Side Automatic Electronic Form Completion", both filed January 15, 1999, and incorporated by reference herein.
The automatic form filling procedure described in these applications requires that a merchant or vendor form be registered with a third-party automatic form filling provider ("third-party service" or "third-party"). In order for a form to be registered with the third-party service for automatic form filling, legacy fields in the vendor form need to be mapped or paired with a standard field from a set of field names created and maintained by the third-party (or at least those legacy fields that have a corresponding standard field).
The method of the present invention facilitates the matching process by providing the vendor, desiring to register its form, with standard field names most likely to match each of the legacy field names. As the number of vendor forms registered with the third-party service increases, the field mapping process of the present invention "learns" more about which standardized fields are more likely to match a given legacy field name. Through this field mapping process, the registration process for a vendor evolves and becomes more efficient since the vendor merely has to select a standard field from a short list of "most likely" matching standard fields.
To further illustrate the foregoing, FIG. 1 is an overview block diagram showing components of a field mapping system in accordance with one embodiment of the present invention. For example, a vendor or merchant wanting to do business over a network, such as the Internet, registers a form 102, such as a purchase order form to be filled out by customers, with a third-party service. Form 102 is an electronic form containing legacy fields and labels for such fields to be filled in by a customer. The legacy fields are given names when form 102 is created, such as fname and lname or lastname and firstname, etc., which are typically not seen or known to the customer. The labels are the text seen by the customer or user describing these fields such as "First Name," "Last Name," "Street Address," etc. Typically, when an online customer visits the vendor's Web site and decides to make a purchase, the customer fills in each field "manually" by typing in the information, usually starting with a first name, last name, home address, phone number, billing address, and so on. In order to be registered, form 102 is uploaded to a third-party server 104. As mentioned above, one aspect of the registration process is mapping legacy field names with third-party standard field names.
Server 104 contains, among other information, a field map domain database 106 described in greater detail below. Domain database 106 is created using a metric described below and is used to determine the most likely matching standard field names. A client "customer" 108 running a Web browser 110 already registered with the third-party (by providing the third-party with personal information) has form 102 automatically filled in with information associated with customer 108. This last step is not part of the present invention and is described in greater detail in pending patent applications incorporated by reference above. Customer 108 and browser 110 are shown for completeness and are relevant in discussions below regarding the construction of document object models ("DOMs") based on the uploaded vendor forms.
FIG. 2 is a block diagram depicting components of a field mapping system in accordance with one embodiment of the present invention. Shown are vendor form 102, a list of third-party standard field names 202, a field mapping component 204, and a field mapping list 206. Vendor form 102 is described in FIG. 1 above. It is typically an HTML document created and maintained by a vendor. In other embodiments, form 102 can be an XML document. Form 102 contains various fields, of which three are shown (in brackets) with accompanying labels or text: First Name 208: [fname] 210, Last Name 212: [lname] 214, and Middle Initial 216: [minitial] 218. Of these items or objects, fname 210, lname 214, and minitial 218 are examples of legacy field names and can be categorized as field objects. Labels First Name 208, Last Name 212, and Middle Initial 216 are each text strings and can be further decomposed into "First," "Name," "Last," "Name," "Middle," and "Initial." The legacy field names and text strings are created and maintained by the vendor.
Standard field name list 202 is created and maintained by a third-party service for automatic form filling. In the described embodiment, field name list 202 contains about 70 to 80 of the most common fields found in most customer-oriented forms for purchasing goods or services in North America and internationally. When a vendor registers an electronic form, generally, many of the fields in the form will have a corresponding matching field in field name list 202. Shown as part of field name list 202 are three examples of standard field names: user_name_first 220, user_name_last 222, and user_name_middle 224, collectively making up a customer's name. In other embodiments, names of the standard fields and the number of such fields can vary depending on what the third-party service provider believes is most suitable. Other examples of field names found in field name list 202 include billing_street_address, user_home_phone, credit_card_type, and so on. Field mapping component 204 collects "intelligence" as new forms are registered and performs the field mapping function using various data sources, such as field map domain database 106, to create a field mapping list 206 (i.e., a list of associated field names). Field mapping component 204 is described in greater detail below. Initially (i.e., when no forms have been registered), mapping component 204 uses a training set which it then builds upon as new forms are registered. In the described embodiment, the training set, described below, is created "manually" in that vendors or the third-party service match fields without the benefit of a "most likely" matching field list.
After the training set is created and the mapping component is running, an inference base built using a learning engine is used to provide the vendor with a selection of standard field names for a particular legacy field name. Field mapping list 206 is an abstract representation of the final "output" of the intelligent field mapping process of the present invention. On the left-side column of list 206 are legacy field names found in form 102, such as fname 210, lname 214, and minitial 218. Associated with each legacy field name is a selection list of most likely corresponding standard field names. For example, fname 210 has a selection list having, as a first and most likely choice, user_name_first 220, followed by the second most likely choice, user_name_last 222, and finally, by user_name_middle 224. If there are fields preceding fname 210 and text string 208, those fields may also be in the selection list. Whether they appear in the selection list depends, for example, in part, on how often they immediately precede fname 210 in other vendor forms. In other embodiments, evidence variables other than proximity and frequency can be used by the learning engine, alone or in combination with other evidence variables, to determine the most likely matching standard field names.
Standard field name user_name_last 222 will likely always be in the selection list since it is almost always very close to user_name_first 220. The same is true for user_name_middle 224, although this field is less predictable: it may appear after lname 214, between fname 210 and lname 214, or not appear at all. In the described embodiment, the selections are presented to the vendor in the form of a pull-down window with only the first, most-likely selection showing initially, and the other choices in the pull-down window. In other embodiments, the selections can be presented to the user in other formats.
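The selection list described above can be pictured with a deliberately simplified sketch. It ranks candidate standard field names for a legacy field by how often each was chosen in earlier mappings and shows the top candidate as the default entry of the pull-down. This uses frequency alone as the evidence variable; it is not the learning engine of the described embodiment, and the `observations` data and the `selection_list` helper are hypothetical names introduced only for illustration.

```python
from collections import Counter

# Hypothetical record of earlier mappings: (legacy field name, standard field name chosen).
observations = [
    ("fname", "user_name_first"),
    ("fname", "user_name_first"),
    ("fname", "user_name_last"),
    ("lname", "user_name_last"),
]


def selection_list(legacy_name, observations, limit=3):
    """Return standard field names for legacy_name, most frequently chosen first."""
    counts = Counter(std for legacy, std in observations if legacy == legacy_name)
    return [std for std, _ in counts.most_common(limit)]


choices = selection_list("fname", observations)
# The first entry is what the pull-down would show initially; the rest are alternatives.
default_choice = choices[0] if choices else None
```

A legacy field name that has never been seen yields an empty list, in which case the vendor would have to pick from the full standard field name list.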
FIG. 3A is a flow diagram describing a process for building an initial training set or knowledge base to be used in the field mapping system in accordance with one embodiment of the present invention. The training set is an initial knowledge base that is created "manually" in that legacy fields are mapped to standard fields preferably by the vendor. This is done without providing a short list or most likely standard field to the person performing the mapping. At step 302 a vendor's customer form is uploaded to a server under the control of the third party service provider. In the described embodiment, the vendor uploads one form onto the server in order to register it with the third-party service. In other embodiments, multiple forms from one vendor can be uploaded.
At step 304 the form, typically an HTML document, is parsed using an HTML parser thereby creating a document object model, or DOM. An example of an object model is shown in FIG. 3B. Such a parsing procedure is well known in the field of Internet programming as is the creation of DOMs (essentially a tree of nodes). Some typical parsing algorithms are widely used in JavaScript, browser programs (e.g., Netscape Navigator or Microsoft Internet Explorer), and in W3C. The HTML parsers used in the present invention should be as "forgiving" as the HTML parsers utilized in the widely used Web browsers. In the described embodiment, DOM 330 created is a linear object model made up of a linear series of linked nodes. Object 330 can be seen as the raw model that represents the legacy form. It should be noted here that the object model is not necessarily created solely for field mapping component 204 and can be used by other components of the broader automatic form filling system. Following the same example of text and fields in form 102 from above, a corresponding DOM 330 would include a node 332 of type TEXT and value "First" followed by another TEXT node 334 with value "Name" followed by a node 336 of type LFIELD (legacy field) having a value "fname," and so on.
However, one unique feature of object model 330 in the described embodiment is the inclusion of an "unknown" node 338 for each legacy field node in the object model. This node of type "unknown" is parallel to and at the same level as a legacy field, such as "fname." Thus, the HTML parser of the described embodiment inserts these nodes in the object model and creates a set of unknown nodes. These object models containing the normal nodes and "unknown" nodes are assigned a unique identifier and stored. It is also useful to note here that because HTML is linear, the parser can create a linear object model in which it is possible, in subsequent steps, to determine the distance other nodes in the model are from a particular node (i.e., it is possible to count nodes ahead of or behind a given node to determine a distance). Thus, a vendor form is defined as a set of nodes of three types: text (labels), legacy fields, and unknowns.
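As a minimal sketch of the linear object model described above, the following uses Python's standard `html.parser` module to flatten a form into an ordered list of TEXT and LFIELD nodes, attaches a parallel "unknown" placeholder to each legacy field, and counts node distances. The `Node` class, the `parse_form` and `term_distances` helpers, and the choice to hold the unknown node as an attribute of its legacy field (so that it does not add to distances) are assumptions made for illustration, not details taken from the described embodiment.

```python
from dataclasses import dataclass
from html.parser import HTMLParser
from typing import List, Optional


@dataclass
class Node:
    kind: str                         # "TEXT", "LFIELD", or "UNKNOWN"
    value: str
    unknown: Optional["Node"] = None  # parallel placeholder, set on LFIELD nodes only


class LinearFormParser(HTMLParser):
    """Collects label terms and legacy field names in document order."""

    def __init__(self) -> None:
        super().__init__()
        self.nodes: List[Node] = []

    def handle_data(self, data: str) -> None:
        # Decompose visible label text into individual TEXT terms.
        for term in data.replace(":", " ").split():
            self.nodes.append(Node("TEXT", term))

    def handle_starttag(self, tag, attrs) -> None:
        if tag in ("input", "select", "textarea"):
            name = dict(attrs).get("name")
            if name:
                # Each legacy field carries an "unknown" node at the same level.
                self.nodes.append(Node("LFIELD", name, unknown=Node("UNKNOWN", "")))


def parse_form(html: str) -> List[Node]:
    parser = LinearFormParser()
    parser.feed(html)
    parser.close()
    return parser.nodes


def term_distances(nodes: List[Node], field_index: int):
    """Signed node distance of every TEXT term from the node at field_index."""
    return [(n.value, i - field_index) for i, n in enumerate(nodes) if n.kind == "TEXT"]


form_html = '<form>First Name: <input name="fname"> Last Name: <input name="lname"></form>'
nodes = parse_form(form_html)
fname_index = next(i for i, n in enumerate(nodes) if n.value == "fname")
```

Because the model is a flat list, a negative distance means the term precedes the field and a positive distance means it follows; `term_distances(nodes, fname_index)` places "Name" one node before `fname` and "Last" one node after it.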
At step 306, a human performs a manual mapping of each legacy field with a standard field. In this step, the object model representing the form is retrieved and all nodes representing a legacy field are extracted along with each associated unknown node or field. The process of extracting nodes of one or more particular types from an object model is well known in the field of computer and Internet programming. For each legacy field, a corresponding standard field is chosen, if appropriate or available. Once the mapping between the legacy field names and standard field names is complete, at step 308 the object model is retrieved or reconstructed with the unknown nodes or fields eliminated and replaced with the standard field names entered by the vendor. Any unknown nodes associated with legacy fields which did not have a matching standard field are unaltered and are essentially ignored in subsequent processing in the described embodiment. This manually entered mapping data is stored in a field map domain as a form object having individual elements and associated clauses. A clause is a data element and a privacy practice, not relevant to the present invention. At step 310 the completed object model containing the legacy fields and the standard field names is stored in a field map domain database. In the described embodiment the object model is compressed before being stored in the domain database. Although storing a compressed object model representing a form saves storage space (the number of forms could potentially reach hundreds of thousands), data not necessary to the algorithm described below is lost during compression. If uncompressed object models of mapped forms were stored, no data would be lost.
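Steps 306 through 310 can be sketched as follows, under the assumption that each node is a plain dictionary and that the unknown slot of a legacy field is a `standard` entry holding `None`. The helper `apply_manual_mapping`, the in-memory `field_map_domain` dictionary standing in for the database, and the form identifier are hypothetical; compression and the privacy clauses are left out.

```python
import copy


def apply_manual_mapping(nodes, mapping):
    """Return a copy of the object model with unknown slots filled from mapping.

    Legacy fields absent from mapping keep standard=None and are ignored later.
    """
    completed = copy.deepcopy(nodes)
    for node in completed:
        if node["type"] == "LFIELD" and node["value"] in mapping:
            node["standard"] = mapping[node["value"]]
    return completed


# Object model retrieved for one uploaded form (step 306).
parsed_form = [
    {"type": "TEXT", "value": "First"},
    {"type": "TEXT", "value": "Name"},
    {"type": "LFIELD", "value": "fname", "standard": None},
    {"type": "TEXT", "value": "Promo"},
    {"type": "LFIELD", "value": "promo_code", "standard": None},
]

# Choices made by the person doing the mapping; promo_code has no standard field.
manual_mapping = {"fname": "user_name_first"}

# Steps 308 and 310: rebuild the model and store it under a unique identifier.
field_map_domain = {}
field_map_domain["form-0001"] = apply_manual_mapping(parsed_form, manual_mapping)
```

Working on a deep copy keeps the raw parsed model intact, which matters if the same model is shared with other components of the form filling system.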
In another embodiment, the entire uncompressed object model is stored should a different algorithm or learning engine be utilized for the field mapping. By storing an uncompressed object model, the current algorithm or a different type of algorithm can be retrained to use different data in the object model. For example, in another embodiment term or label nodes such as "First" or "Street Address" in the object model can also be mapped or associated with a standard field by the vendor. "First" can be associated with the standard field user_name_first, for example, and this added intelligence can be used by the learning engine or algorithm to better predict which standard field names map to a given legacy field.
After step 310 the process of entering one vendor form into a training set is complete. The process is repeated for each form used to initially build the training set. In the described embodiment, intelligence data from 150 to 200 forms is used to initially build the training set. This number can, of course, vary depending on the desired initial accuracy or intelligence of the field mapping system.
FIG. 4 is a flow diagram of a process of registering a form with a third-party service after a training set has been created in accordance with one embodiment of the present invention. Once the training set has been created and the field map system is in full operation, an online vendor desiring to take advantage of the third-party's automatic form filling service is required to register the form with the third-party. It is during this registration process that the vendor must map legacy fields in its form with standard field names created and maintained by the third-party. As described above, the present invention describes a method and system of facilitating the field mapping process by providing the vendor with the most likely standard field name to map to a particular legacy field followed by a list of other likely standard fields that might match. For many vendor forms having dozens of fields, intelligently suggesting standard names to the vendor can save the vendor from having to go through a potentially tedious and time-consuming process (otherwise the vendor would have to search through a long list of standard field names for each field in its form, thereby deterring some vendors from registering with the third-party service).
Steps 402 and 404 of FIG. 4 are substantially similar to steps 302 and 304 of FIG. 3A. At step 402 the vendor uploads a form it desires to register to the third-party server. At step 404 the form is parsed using an HTML parser. An object model having nodes for terms, legacy fields, and "unknown" nodes associated with each legacy field node is created. The parser also extracts the legacy field nodes from the object model at step 404. At step 406 the field mapping system predicts which standard field names are most likely to match each legacy field (determined by examining the extracted legacy field nodes from the object model). That is, it makes its best predictions as to what the unknown nodes in the object model are in terms of standard field names. In the described embodiment this is done using a linear algebra algorithm and is described in greater detail below.
At step 408 the predicted standard field names for the legacy fields are presented to the user. In the described embodiment, the vendor initially sees one selection (the most likely standard field) followed by a pull-down menu of three or four other likely fields in descending order. The vendor either selects the standard field name presented (i.e., the one the field mapper determined should be the corresponding standard field for the particular legacy field) or one of the other likely standard field names. If the vendor selects the first choice, it is essentially confirming the field mapper's prediction. If the vendor selects one of the others or none at all, it is implicitly correcting the field mapper's intelligence.
At step 410 this confirmation or correction feedback from the vendor (provided by the vendor simply from using the field mapping component) is used to micro-adjust the inference base or rule set of the field map system. The inference base is compressed into linear algebra tables as described below. In the described embodiment any corrections to or confirmations of the field map intelligence by the vendor are used to correct one evidence variable, namely, the relative distance of certain terms from the legacy field being examined. The distance is measured in terms of the number of nodes (in the object model) from a legacy field node. The role of this distance variable and how it is used is described in greater detail below. In other embodiments, other evidence variables can be used (e.g., a Boolean variable indicating whether a term is before or after a particular legacy field node) and, thus, input from the vendor can be used to micro-adjust the weights of those variables as well. Once the vendor settings are fed to the field map inference base, the process of registering a form is complete.
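The parsing at step 404 can be illustrated with a minimal sketch, assuming Python's standard `html.parser` module stands in for the HTML parser. The names `Node`, `FormModelBuilder`, `build_object_model`, and `legacy_fields` are hypothetical and do not appear in the described embodiment; representing the "unknown" node as an empty `standard` attribute on each legacy field node is likewise an assumption made for brevity.

```python
from dataclasses import dataclass
from html.parser import HTMLParser
from typing import List, Optional


@dataclass
class Node:
    kind: str                       # "term" or "legacy"
    value: str                      # label word or legacy field name
    standard: Optional[str] = None  # None plays the role of the "unknown" node


class FormModelBuilder(HTMLParser):
    """Builds a linear object model: nodes in document order."""

    FIELD_TAGS = {"input", "select", "textarea"}  # assumed set of field tags

    def __init__(self):
        super().__init__()
        self.nodes: List[Node] = []

    def handle_starttag(self, tag, attrs):
        # A named form control becomes a legacy field node.
        if tag in self.FIELD_TAGS:
            name = dict(attrs).get("name")
            if name:
                self.nodes.append(Node("legacy", name))

    def handle_data(self, data):
        # Each visible word becomes a term node.
        for word in data.split():
            self.nodes.append(Node("term", word))


def build_object_model(html: str) -> List[Node]:
    builder = FormModelBuilder()
    builder.feed(html)
    builder.close()
    return builder.nodes


def legacy_fields(nodes: List[Node]) -> List[str]:
    """Extract the legacy field names from the object model."""
    return [n.value for n in nodes if n.kind == "legacy"]
```

For a form containing the text "First Name", an input named `fname`, the text "Last Name", and an input named `lname`, this sketch yields six nodes in that order, which is the ordering the distance measure in the following paragraphs relies on.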
In the described embodiment a linear algebra algorithm is used as a learning engine for building the inference base of the field map system. In other embodiments the learning engine can be based on a Bayesian network or a neural network. Generally, the linear algebra algorithm used in the described embodiment uses a vector set or matrix to represent a form where each row is keyed based on a standard field. Once a matrix, described below, is created for a form, its values are averaged or combined with corresponding values in a "master" matrix which represents a constantly adjusted inference base for the field map system. Predictions as to legacy field/standard field mappings are derived from this inference base or master matrix.
A selected form $l$ includes a series of labels (terms) and fields (legacy fields). In the first stage of the linear algebra algorithm a set of vectors, $T^l$, is created for form $l$. This set of vectors only corresponds to nodes that were in the selected form $l$. Each vector represents one term $i$ contained in form $l$, such as "First", "Last", or "Street." Thus, if one form had the term "Email" the vector set for that form would have a vector corresponding to that term, whereas another vendor form that did not contain a field for "Email" would not have a vector for that term $i$. A vector for a term in form $l$ is represented by the following formula: $T_i^l = \sum_j w_{ij}^l P_j$. Thus, for a given term or node $i$ in a form $l$, a vector is created. An example of a single vector for the term "First" would be: $T_{\text{First}}^l = \tfrac{1}{3} P_{\text{user\_name\_first}} + \tfrac{1}{6} P_{\text{user\_name\_last}} + \dots$ The variable $w$ represents a "weight" calculated based on the distance a term node or field node is from a particular standard field. The calculation of this weight is described below. Each vector is a sum of values where each value is the equivalent of a dot product of a weight (e.g., 1/3, 1/6, . . .) and an unknown node in the object model. The operation performed is similar to a dot product taken in an inverted manner. Essentially, each vector represents how term $i$ is associated with standard field nodes, symbolized by $P_j$.
The weights of nodes in the vicinity of term $i$ are inversely related to the distance of the node from the term. The distance is determined by the number of nodes from the term $i$ node in object model 330. This distance is essentially the same as the relative position of terms in the original HTML document of form $l$. Thus, if the nodes are in the following order: "First", "Name", [fname], "Last", "Name", and [lname], the node for term "First" is two away from the [fname] node and five away from the [lname] node, for example. A node is zero distance away from itself. Each vector $T_i^l$ (for term $i$ in form $l$) has a vector of weights, $w_{i0}^l$, $w_{i1}^l$, $w_{i2}^l$, and so on, associated with each node. If the weight falls below a certain threshold value, the node is ignored and not included in the vector for that term. For example, while [lname] is sufficiently close to "First" to be included in the vector for "First", a term much further down the form, such as "Payment" or "Email", may be too far from "First" to have any significant weight and therefore not be included in the vector. In the described embodiment, the weight of a particular node with respect to a term is calculated based on the following formula: $w = 1/(1+d)$, hence decreasing as the distance of the node from the term grows. The variable $d$ is the distance a node is from term $i$. If a term has a distance of two, it is assigned a weight of 1/3; if it has a distance of one, it has a weight of 1/2. Thus, the farther a node is from term $i$, the less weight is accorded to that node. In the described embodiment, if a node has a weight less than 1/8, it is not included in the vector $T_i^l$ for term $i$.
The set of vectors $T^l$ represents vectors for each term in form $l$. Each vector, in turn, can be seen as a series of (weight, standard field) pairs as described above. The number of these pairs in each vector can vary, as can the number of vectors in the set of vectors representing a form since forms can have varying numbers of terms.
$T_{\text{First}}^l = \tfrac{1}{3} P_{\text{user\_name\_first}} + \tfrac{1}{6} P_{\text{user\_name\_last}} + \dots$
$T_{\text{Name}}^l = \tfrac{1}{3} P_{\text{user\_name\_first}} + \tfrac{1}{6} P_{\text{user\_name\_last}} + \dots$
$T_{\text{fname}}^l = 1 \cdot P_{\text{user\_name\_first}} + \tfrac{1}{4} P_{\text{user\_name\_last}} + \dots$
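The weight formula and threshold described above can be sketched as follows. This is an illustrative sketch only: the function names `weight` and `node_vectors`, the use of list positions as node distances, and the choice to accumulate weights when one standard field occurs more than once are assumptions, not details taken from the described embodiment. Exact fractions are used so the values can be compared directly with the sample vectors.

```python
from fractions import Fraction

THRESHOLD = Fraction(1, 8)  # weights below this are left out of a vector


def weight(distance):
    """Weight of a node at the given distance: w = 1 / (1 + d)."""
    return Fraction(1, 1 + distance)


def node_vectors(labels, standard_at):
    """Return one (label, {standard_field: weight}) pair per node.

    labels      -- node labels in document order (terms and legacy fields)
    standard_at -- {position: standard field name} for mapped legacy fields
    """
    vectors = []
    for i, label in enumerate(labels):
        vec = {}
        for pos, field in standard_at.items():
            w = weight(abs(i - pos))
            if w >= THRESHOLD:
                # Assumption: accumulate if a standard field occurs twice.
                vec[field] = vec.get(field, 0) + w
        vectors.append((label, vec))
    return vectors


# Node order used in the example above, with the manual mapping applied.
labels = ["First", "Name", "[fname]", "Last", "Name", "[lname]"]
standard_at = {2: "user_name_first", 5: "user_name_last"}
vectors = node_vectors(labels, standard_at)
```

With this node order, the entry for "First" holds 1/3 for user_name_first (distance two) and 1/6 for user_name_last (distance five), and the entry for [fname] holds 1 and 1/4, matching the first and last sample vectors.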
The inference base is the center of intelligence of the field map system and is continually modified when new forms are registered. In other words, when a new $T^l$ is entered into the system, data from the vectors in $T^l$ are used to update or, more accurately, micro-adjust data in the inference base. One primary difference in calculating values for the inference base is that it is not specific to one form $l$; thus, the index $l$ does not appear in any of the equations. The inference base is also a series of vectors, similar to the matrix for a single form $l$. However, the number of vectors is much larger since each unique term found in all forms registered has a vector in the inference base, which could range in the hundreds and possibly thousands. The equation used to calculate a vector in the inference base is similar to that for a form: $T_i = \sum_j w_{ij} P_j$, for a term $i$ and a standard field $j$.
The matrix $T_i$ is a sum of vectors $T_i^l$, described above, where each vector represents a term found in at least one of the registered forms and that term's association with other nodes. The weight given to a particular standard field to represent that field's association with a particular term, such as a legacy field or text string, is an average of weights given to the same standard field with respect to the same particular term over all previously registered forms. Each vector in the inference base matrix is calculated based on the formula $T_i = \sum_j w_{ij} P_j$. The weight $w_{ij}$ is calculated by summing all the weights from individual forms, represented by $w_{ij}^l$, as described above, divided by the number of times $n_i$ the term has appeared in previously registered forms. The matrix $T_i$ is the average of individual form matrices $T_i^l$. The field map system keeps a count of the number of times a term appears in a particular form. A count vector $N$ is maintained using the formula $N = \sum_i n_i N_i$, where $n_i$ is the number of times node $i$ appeared in all the registered forms. Thus, when calculating the average, the sum of the number of times a term appears is determined first. The row corresponding to the term is divided by the sum to obtain the average weight for that term with respect to each standard field name in that row. Each element (node) in a row is divided by the number of times it appears in a particular form.
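The averaging described above can be sketched as a running sum plus a count per term. This is a minimal sketch under stated assumptions: the class name `InferenceBase` and its methods are hypothetical, and each occurrence of a term is counted once toward $n_i$, which is one reading of the counting rule.

```python
from collections import defaultdict


class InferenceBase:
    """Keeps summed weights and occurrence counts for every term seen."""

    def __init__(self):
        # term -> standard field -> sum of per-form weights
        self.sums = defaultdict(lambda: defaultdict(float))
        # term -> number of occurrences n_i across registered forms
        self.counts = defaultdict(int)

    def add_form(self, vectors):
        """Fold one form's (term, {standard_field: weight}) pairs into the base."""
        for term, vec in vectors:
            self.counts[term] += 1
            for field, w in vec.items():
                self.sums[term][field] += w

    def weights(self, term):
        """Average weight w_ij of each standard field j for the given term."""
        n = self.counts.get(term, 0)
        if n == 0:
            return {}
        return {field: total / n for field, total in self.sums[term].items()}
```

Keeping the sums and counts separate means a newly registered form only nudges each average, which is consistent with the micro-adjustment described for step 410; a vendor's confirmed or corrected mapping could be folded in through the same `add_form` call.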
The inference base matrix is used to calculate probabilities that a particular unknown or standard field is a matching standard field for a given legacy field. In the described embodiment a general formula used to derive probabilities for an unknown field $k$ in form $l$ is $U_k^l = \sum_i w_{ik}^l T_i$. Weights $w$ are only those weights that exceed a threshold value, such as 1/8. Any fields having weights lower than the threshold are not worth considering in the equation. In the described embodiment $T_i$ is normalized by dividing $T_i$ by the magnitude of $T_i$. A normalized $T_i$ is, for a particular legacy term $i$, a vector of sums of the dot product of weight and a standard field, based on the formula $T_i = \sum_j w_{ij} P_j$, as described above. A substitution can be made replacing $T_i$ in the general formula $U_k^l = \sum_i w_{ik}^l T_i$ with $\sum_j w_{ij} P_j$. By performing this substitution, $U_k^l$ can now be expressed in terms of weights and standard field names, $P_j$. This makes calculating the probabilities for $U_k^l$ practical. The probabilities can now be expressed as a sum within a sum: $U_k^l = \sum_i w_{ik}^l \left( \sum_j w_{ij} P_j \right)$. This can be rearranged to express the probabilities more explicitly in terms of $P_j$ or standard field: $U_k^l = \sum_j \left( \sum_i w_{ik}^l w_{ij} \right) P_j$, where the sum within the parentheses represents the probability that the unknown field $k$ in form $l$ is the standard field $P_j$. The final result, that $U_k^l$ be expressed as a linear combination of possible $P_j$'s, is then achieved. The sums expressed in the formula are depicted in the form of a table shown in FIG. 5.
FIG. 5 is a table depicting weight values and fields used to calculate probabilities of particular standard fields in accordance with one embodiment of the present invention. The final step in the linear algebra learning algorithm used by the field map system is calculating a probability that a legacy field maps to a given standard field. The calculation based on the formula $U_k^l = \sum_j \left( \sum_i w_{ik}^l w_{ij} \right) P_j$ can be shown more explicitly in table 500. Columns 502 represent standard field names that can possibly match (i.e., have weights above a threshold value) any of the legacy fields in form $l$. Thus, the number of columns in table 500 can vary depending on the form $l$ being parsed. Shown in table 500 are two sample columns: user_name_first 504 and user_name_last 506. Rows 508 represent all terms and legacy fields in form $l$ and are drawn from the form object model. As with columns 502, the number of rows can vary depending on the form being parsed.
Shown in table 500 are four sample rows for "First", "Name", [fname], and "Last."
In the described embodiment each term in rows 508 has a corresponding weight column 510 containing individual weight values $w_{ik}^l$, previously calculated as described above. This weight corresponds to a particular term's distance from an unknown field. Thus, $w_{ik}^l$ is the weight between the unknown $k$ and node $i$. In table 500, the term "First" has a weight of 1/3 with respect to legacy field [fname]. "Name" and "Last" have higher weights since they are closer to [fname]. Legacy field [fname] has a weight of one with respect to itself. A cell 512 (intersection of a row and column) contains another weight value, specifically $w_{ij}$, needed to complete the sum operations to calculate the probability that a particular unknown $U$ matches a standard field $P_j$. Cell 512, for example, holds the value .35. This is the weight given to the term "First" with respect to the standard field user_name_first. The variable $w_{ij}$ is calculated as $w_{ij} = \frac{1}{n_i} \sum_l w_{ij}^l$. In another cell 514 representing [fname] and user_name_first, $w_{ij}$ is very high since, drawing from past form registrations, the legacy field [fname] has frequently matched standard field user_name_first. The weight would be higher, such as .99, if all previous forms called the field for the customer's first name "fname." Vendors can call this field various other names, such as [firstname] or [name_first], and so on, which somewhat dilutes the possibility that [fname] matches user_name_first.
Once the weight values are in table 500, probabilities that a particular standard field matches a legacy field can be calculated. As noted above in the formula for unknown $U_k^l = \sum_j \left( \sum_i w_{ik}^l w_{ij} \right) P_j$, operations are performed for a particular column (standard field name). The product of $w_{ik}^l$ and $w_{ij}$ is first taken for each row. These values are then summed to derive a particular probability. Using sample values from table 500, the calculations to be performed for user_name_first would include: $P_{\text{user\_name\_first}}$ = [(1/3)*.35] + [(1/2)*.5] + [1*.9] + [(1/2)*.5] + . . . These calculations result in a value representing a probability that user_name_first maps to [fname]. A similar calculation is done for $P_{\text{user\_name\_last}}$: [(1/3)*.35] + [(1/2)*.2] + [1*.4] + [(1/2)*.6] + . . . These probability values are then examined and the standard field having the highest probability is presented to the vendor during registration. In the described embodiment, the standard fields having the second, third, and fourth highest probability values are presented to the vendor in a pop-down window should the first, most-likely, candidate standard field not be the one that maps to the legacy field. These calculations lead to the formation of logical legacy field/standard fields list 206 of FIG. 2. Table 500 is logically created for each legacy field in vendor form $l$. Through this process, potential standard fields for each legacy field in a form are derived and presented to the vendor.
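The column-wise calculation over table 500 can be sketched as below, using only the four sample rows quoted above. The function name `rank_standard_fields` and its argument layout are hypothetical, and the resulting values are left unnormalized in this sketch, so they are best read as relative scores for ordering the candidates.

```python
def rank_standard_fields(form_rows, base_weights, top_n=4):
    """Score standard fields for one unknown field and return the best ones.

    form_rows    -- list of (term, w_ik) pairs: each node's weight relative to
                    the unknown field k in the form being registered
    base_weights -- {term: {standard_field: w_ij}} from the inference base
    """
    scores = {}
    for term, w_ik in form_rows:
        for field, w_ij in base_weights.get(term, {}).items():
            # Inner sum over rows i of w_ik * w_ij, kept per column j.
            scores[field] = scores.get(field, 0.0) + w_ik * w_ij
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


# Sample rows and cell values quoted from table 500 for legacy field [fname].
form_rows = [("First", 1 / 3), ("Name", 1 / 2), ("[fname]", 1.0), ("Last", 1 / 2)]
base_weights = {
    "First": {"user_name_first": 0.35, "user_name_last": 0.35},
    "Name": {"user_name_first": 0.5, "user_name_last": 0.2},
    "[fname]": {"user_name_first": 0.9, "user_name_last": 0.4},
    "Last": {"user_name_first": 0.5, "user_name_last": 0.6},
}
ranked = rank_standard_fields(form_rows, base_weights)
```

On these sample values user_name_first ranks ahead of user_name_last, so it would be the first selection shown to the vendor, with the remaining entries of `ranked` filling the list of alternatives.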
The present invention employs various computer-implemented operations involving data stored in computer systems. These operations include, but are not limited to, those requiring physical manipulation of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. The operations described herein that form part of the invention are useful machine operations. The manipulations performed are often referred to in terms such as producing, identifying, running, determining, comparing, executing, downloading, or detecting. It is sometimes convenient, principally for reasons of common usage, to refer to these electrical or magnetic signals as bits, values, elements, variables, characters, data, or the like. It should be remembered, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities.
The present invention also relates to a server computer or similar system for performing the aforementioned operations. The system may be specially constructed for the required purposes, or it may be a general purpose computer selectively activated or configured by a computer program stored in the computer. The processes presented above are not inherently related to any particular computer or other computing apparatus. In particular, various general-purpose computers may be used with programs written in accordance with the teachings herein, or, alternatively, it may be more convenient to construct a more specialized computer system, such as a privacy bank server, to perform the required operations. FIG. 6 is a block diagram of a general purpose computer system 600 suitable for carrying out the processing in accordance with one embodiment of the present invention. FIG. 6 illustrates one embodiment of a general purpose computer system. Other computer system architectures and configurations can be used for carrying out the processing of the present invention. Computer system 600, made up of various subsystems described below, includes at least one microprocessor subsystem (also referred to as a central processing unit, or CPU) 602. That is, CPU 602 can be implemented by a single-chip processor or by multiple processors. It should be noted that in reconfigurable computing systems, CPU 602 can be distributed amongst a group of programmable logic devices. In such a system, the programmable logic devices can be reconfigured as needed to control the operation of computer system 600. In this way, the manipulation of input data is distributed amongst the group of programmable logic devices. CPU 602 is a general purpose digital processor which controls the operation of the computer system 600. Using instructions retrieved from memory, the CPU 602 controls the reception and manipulation of input data, and the output and display of data on output devices.
CPU 602 is coupled bi-directionally with a first primary storage 604, typically a random access memory (RAM), and uni-directionally with a second primary storage area 606, typically a read-only memory (ROM), via a memory bus 608. As is well known in the art, primary storage 604 can be used as a general storage area and as scratch-pad memory, and can also be used to store input data and processed data. It can also store programming instructions and data, in the form of a field map domain database or a standard field name list, to name just two examples, in addition to other data and instructions for processes operating on CPU 602, and is typically used for fast transfer of data and instructions in a bi-directional manner over the memory bus 608. Also as well known in the art, primary storage 606 typically includes basic operating instructions, program code, data and objects used by the CPU 602 to perform its functions. Primary storage devices 604 and 606 may include any suitable computer-readable storage media, described below, depending on whether, for example, data access needs to be bi-directional or uni-directional. CPU 602 can also directly and very rapidly retrieve and store frequently needed data in a cache memory 610.
A removable mass storage device 612 provides additional data storage capacity for the computer system 600, and is coupled either bi-directionally or uni-directionally to CPU 602 via a peripheral bus 614. For example, a specific removable mass storage device commonly known as a CD-ROM typically passes data uni-directionally to the CPU 602, whereas a floppy disk can pass data bi-directionally to the CPU 602. Storage 612 may also include computer-readable media such as magnetic tape, flash memory, signals embodied on a carrier wave, PC-CARDS, portable mass storage devices, holographic storage devices, and other storage devices. A fixed mass storage 616 also provides additional data storage capacity and is coupled bi-directionally to CPU 602 via peripheral bus 614. The most common example of mass storage 616 is a hard disk drive. Generally, access to these media is slower than access to primary storages 604 and 606.
Mass storage 612 and 616 generally store additional programming instructions, data, and the like that typically are not in active use by the CPU 602. It will be appreciated that the information retained within mass storage 612 and 616 may be incorporated, if needed, in standard fashion as part of primary storage 604 (e.g., RAM) as virtual memory.
In addition to providing CPU 602 access to storage subsystems, the peripheral bus 614 is used to provide access to other subsystems and devices as well. In the described embodiment, these include a display monitor 618 and adapter 620, a printer device 622, a network interface 624, an auxiliary input/output device interface 626, a sound card 628 and speakers 630, and other subsystems as needed.
The network interface 624 allows CPU 602 to be coupled to another computer, computer network, or telecommunications network using a network connection as shown. Through the network interface 624, it is contemplated that the CPU 602 might receive information, e.g., data objects or program instructions, from another network, or might output information to another network in the course of performing the above-described method steps. Information, often represented as a sequence of instructions to be executed on a CPU, may be received from and outputted to another network, for example, in the form of a computer data signal embodied in a carrier wave. An interface card or similar device and appropriate software implemented by CPU 602 can be used to connect the computer system 600 to an external network and transfer data according to standard protocols. That is, method embodiments of the present invention may execute solely upon CPU 602, or may be performed across a network such as the Internet, intranet networks, or local area networks, in conjunction with a remote CPU that shares a portion of the processing. Additional mass storage devices (not shown) may also be connected to CPU 602 through network interface 624. Auxiliary I/O device interface 626 represents general and customized interfaces that allow the CPU 602 to send and, more typically, receive data from other devices such as microphones, touch-sensitive displays, transducer card readers, tape readers, voice or handwriting recognizers, biometrics readers, cameras, portable mass storage devices, and other computers. Also coupled to the CPU 602 is a keyboard controller 632 via a local bus 634 for receiving input from a keyboard 636 or a pointer device 638, and sending decoded symbols from the keyboard 636 or pointer device 638 to the CPU 602. The pointer device may be a mouse, stylus, track ball, or tablet, and is useful for interacting with a graphical user interface. 
In addition, embodiments of the present invention further relate to computer storage products with a computer readable medium that contain program code for performing various computer-implemented operations. The computer-readable medium is any data storage device that can store data which can thereafter be read by a computer system. The media and program code may be those specially designed and constructed for the purposes of the present invention, or they may be of the kind well known to those of ordinary skill in the computer software arts. Examples of computer-readable media include, but are not limited to, all the media mentioned above: magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROM disks; magneto-optical media such as floptical disks; and specially configured hardware devices such as application-specific integrated circuits (ASICs), programmable logic devices (PLDs), and ROM and RAM devices. The computer-readable medium can also be distributed as a data signal embodied in a carrier wave over a network of coupled computer systems so that the computer-readable code is stored and executed in a distributed fashion. Examples of program code include both machine code, as produced, for example, by a compiler, or files containing higher level code that may be executed using an interpreter.
It will be appreciated by those skilled in the art that the above described hardware and software elements are of standard design and construction. Other computer systems suitable for use with the invention may include additional or fewer subsystems. In addition, memory bus 608, peripheral bus 614, and local bus 634 are illustrative of any interconnection scheme serving to link the subsystems. For example, a local bus could be used to connect the CPU to fixed mass storage 616 and display adapter 620. The computer system shown in FIG. 6 is but an example of a computer system suitable for use with the invention. Other computer architectures having different configurations of subsystems may also be utilized.
Although the foregoing invention has been described in some detail for purposes of clarity of understanding, it will be apparent that certain changes and modifications may be practiced within the scope of the appended claims. Furthermore, it should be noted that there are alternative ways of implementing both the process and apparatus of the present invention. For example, a Bayesian network or neural network can be used instead of a linear algebra based learning system to create the list of predicted standard field names and maintain the master knowledge base. Accordingly, the present embodiments are to be considered as illustrative and not restrictive, and the invention is not to be limited to the details given herein, but may be modified within the scope and equivalents of the appended claims.

Claims

What is claimed is:
1. A method of mapping a standard field name with a legacy field name from an electronic legacy form, the method comprising: extracting a legacy field name from an electronic legacy form; creating a list of one or more predicted standard field names that are most likely to map to the legacy field name; selecting a predicted standard field name from the list that maps to the legacy field name; and adjusting a knowledge base used to create the list of one or more predicted standard field names to reflect the selection of the predicted standard field name that maps to the legacy field name.
2. A method as recited in claim 1 wherein extracting a legacy field name further comprises: parsing the electronic legacy form; and creating a document object model.
3. A method as recited in claim 2 wherein the document object model is a linear object model.
4. A method as recited in claim 1 wherein the list of one or more predicted standard field names are drawn from a standard field name list containing a plurality of field names commonly found in vendor forms.
5. A method as recited in claim 1 further comprising: creating an initial training set containing manually entered field mapping data.
6. A method as recited in claim 1 wherein the knowledge base is stored in a field map domain database.
7. A method as recited in claim 1 wherein creating a list of one or more predicted standard field names further comprises: utilizing one or more evidence variables to determine standard field names most likely to map to the legacy field name.
8. A method as recited in claim 7 wherein at least one evidence variable is relative distance of a term from the legacy field name being examined.
9. A method as recited in claim 1 wherein the knowledge base is a vector set having a plurality of vectors, wherein a vector represents a standard field name.
10. A method as recited in claim 1 wherein the list of one or more standard field names is created using a linear algebra algorithm.
11. A computer-readable medium containing programmed instructions arranged to map a standard field name with a legacy field name from an electronic legacy form, the computer-readable medium including programmed instructions for: extracting a legacy field name from an electronic legacy form; creating a list of one or more predicted standard field names that are most likely to map to the legacy field name; selecting a predicted standard field name from the list that maps to the legacy field name; and adjusting a knowledge base used to create the list of one or more predicted standard field names to reflect the selection of the predicted standard field name that maps to the legacy field name.
PCT/US2000/040415 1999-07-19 2000-07-18 Intelligent mapping of field names in an electronic form with standard field names WO2001006416A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU78801/00A AU7880100A (en) 1999-07-19 2000-07-18 Intelligent mapping of field names in an electronic form with standard field names

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US35753099A 1999-07-19 1999-07-19
US09/357,530 1999-07-19

Publications (2)

Publication Number Publication Date
WO2001006416A2 true WO2001006416A2 (en) 2001-01-25
WO2001006416A3 WO2001006416A3 (en) 2004-01-15

Family

ID=23406001

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2000/040415 WO2001006416A2 (en) 1999-07-19 2000-07-18 Intelligent mapping of field names in an electronic form with standard field names

Country Status (2)

Country Link
AU (1) AU7880100A (en)
WO (1) WO2001006416A2 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1998004976A1 (en) * 1996-07-25 1998-02-05 Lextron Systems, Inc. Apparatus and methods to enhance web browsing on the internet
WO1998032289A2 (en) * 1997-01-17 1998-07-23 The Board Of Regents Of The University Of Washington Method and apparatus for accessing on-line stores

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
"Gator offers one-click shopping at over 5000 e-commerce sites today" GATOR.COM COMPANY INFO, 14 June 1999 (1999-06-14), XP002145278 Retrieved from the Internet: <URL:www.gator.com/company/press/pr061499b.html> [retrieved on 2000-08-18] *
BLATTNER M M ET AL: "A user interface for computer-based message translation" SYSTEM SCIENCES, 1989. VOL.IV: EMERGING TECHNOLOGIES AND APPLICATIONS TRACK, PROCEEDINGS OF THE TWENTY-SECOND ANNUAL HAWAII INTERNATIONAL CONFERENCE ON KAILUA-KONA, HI, USA 3-6 JAN. 1989, WASHINGTON, DC, USA,IEEE COMPUT. SOC. PR, US, 3 January 1989 (1989-01-03), pages 43-51, XP010015135 ISBN: 0-8186-1914-7 *
BLATTNER M M ET AL: "DATA STRUCTURE AND FORMAT CONVERSION USING SYNTACTIVE INFERENCE1" RECENT ADVANCES WITH OXYGEN IN IRON AND STEEL MAKING, LONDON, BUTTERWORTHS, GB, 7 October 1987 (1987-10-07), pages 416-421, XP000042358 *
MARET P ET AL: "MULTIMEDIA INFORMATION INTERCHANGE: WEB FORMS MEET DATA SERVERS" PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON MULTIMEDIA COMPUTING AND SYSTEMS, LOS ALAMITOS, CA, US, vol. 2, 7 June 1999 (1999-06-07), pages 499-505, XP000964627 *

Cited By (42)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1596310A3 (en) * 2004-05-12 2007-08-01 Microsoft Corporation Intelligent autofill
EP1898601A3 (en) * 2006-09-08 2008-05-07 Ricoh Company, Ltd. System, method, and computer program product for identification of vendor and model name of a remote device among multiple network protocols
US7552111B2 (en) 2006-09-08 2009-06-23 Ricoh Co., Ltd. System, method, and computer program product for identification of vendor and model name of a remote device among multiple network protocols
WO2008046218A1 (en) * 2006-10-20 2008-04-24 Her Majesty The Queen, In Right Of Canada As Represented By The Minister Of Health Through The Public Health Agency Of Canada Method and apparatus for creating a configurable browser-based forms application
US8336022B2 (en) 2006-10-20 2012-12-18 Her Majesty the Queen in Right of Canada as Represented by the Minister of Health Through the Public Health Agency of Canada Method and apparatus for creating a configurable browser-based forms application
US8694518B2 (en) 2007-06-14 2014-04-08 Colorquick, L.L.C. Method and apparatus for database mapping
US11393057B2 (en) 2008-10-17 2022-07-19 Zillow, Inc. Interactive real estate contract and negotiation tool
US10817914B1 (en) 2010-12-06 2020-10-27 Metarail, Inc. Systems, methods and computer program products for triggering multiple deep-linked pages, apps, environments, and devices from single ad click
US10839431B1 (en) 2010-12-06 2020-11-17 Metarail, Inc. Systems, methods and computer program products for cross-marketing related products and services based on machine learning algorithms involving field identifier level adjacencies
US10963926B1 (en) 2010-12-06 2021-03-30 Metarail, Inc. Systems, methods and computer program products for populating field identifiers from virtual reality or augmented reality environments, or modifying or selecting virtual or augmented reality environments or content based on values from field identifiers
US10929896B1 (en) 2010-12-06 2021-02-23 Metarail, Inc. Systems, methods and computer program products for populating field identifiers from in-store product pictures or deep-linking to unified display of virtual and physical products when in store
US10839430B1 (en) 2010-12-06 2020-11-17 Metarail, Inc. Systems, methods and computer program products for populating field identifiers from telephonic or electronic automated conversation, generating or modifying elements of telephonic or electronic automated conversation based on values from field identifiers
US10152734B1 (en) 2010-12-06 2018-12-11 Metarail, Inc. Systems, methods and computer program products for mapping field identifiers from and to delivery service, mobile storefront, food truck, service vehicle, self-driving car, delivery drone, ride-sharing service or in-store pickup for integrated shopping, delivery, returns or refunds
US10262342B2 (en) 2010-12-06 2019-04-16 Metarail, Inc. Deep-linking system, method and computer program product for online advertisement and E-commerce
US9633378B1 (en) * 2010-12-06 2017-04-25 Wayfare Interactive, Inc. Deep-linking system, method and computer program product for online advertisement and E-commerce
US10789626B2 (en) 2010-12-06 2020-09-29 Metarail, Inc. Deep-linking system, method and computer program product for online advertisement and e-commerce
CN102184204A (en) * 2011-04-28 2011-09-14 常州大学 Auto fill method and system of intelligent Web form
US10108928B2 (en) 2011-10-18 2018-10-23 Dotloop, Llc Systems, methods and apparatus for form building
US11176518B2 (en) 2011-10-18 2021-11-16 Zillow, Inc. Systems, methods and apparatus for form building
US10635692B2 (en) 2012-10-30 2020-04-28 Ubiq Security, Inc. Systems and methods for tracking, reporting, submitting and completing information forms and reports
US10614099B2 (en) 2012-10-30 2020-04-07 Ubiq Security, Inc. Human interactions for populating user information on electronic forms
US11258837B1 (en) 2013-02-11 2022-02-22 Zillow, Inc. Electronic content sharing
US10826951B2 (en) 2013-02-11 2020-11-03 Dotloop, Llc Electronic content sharing
US11621983B1 (en) 2013-02-11 2023-04-04 MFTB Holdco, Inc. Electronic content sharing
US10976885B2 (en) 2013-04-02 2021-04-13 Zillow, Inc. Systems and methods for electronic signature
US11494047B1 (en) 2013-04-02 2022-11-08 Zillow, Inc. Systems and methods for electronic signature
US10552525B1 (en) * 2014-02-12 2020-02-04 Dotloop, Llc Systems, methods and apparatuses for automated form templating
WO2016011456A1 (en) * 2014-07-18 2016-01-21 FHOOSH, Inc. Systems and methods for locating, identifying and mapping electronic form fields
US10733364B1 (en) 2014-09-02 2020-08-04 Dotloop, Llc Simplified form interface system and method
US11120512B1 (en) 2015-01-06 2021-09-14 Intuit Inc. System and method for detecting and mapping data fields for forms in a financial management system
US11734771B2 (en) 2015-01-06 2023-08-22 Intuit Inc. System and method for detecting and mapping data fields for forms in a financial management system
US9858129B2 (en) 2016-02-16 2018-01-02 International Business Machines Corporation Dynamic copy content retrieval
CN107783950A (en) * 2017-04-11 2018-03-09 平安医疗健康管理股份有限公司 Package insert processing method and processing device
US10853567B2 (en) * 2017-10-28 2020-12-01 Intuit Inc. System and method for reliable extraction and mapping of data to and from customer forms
US11354495B2 (en) 2017-10-28 2022-06-07 Intuit Inc. System and method for reliable extraction and mapping of data to and from customer forms
US20190129931A1 (en) * 2017-10-28 2019-05-02 Intuit Inc. System and method for reliable extraction and mapping of data to and from customer forms
US11349656B2 (en) 2018-03-08 2022-05-31 Ubiq Security, Inc. Systems and methods for secure storage and transmission of a data stream
US10762581B1 (en) 2018-04-24 2020-09-01 Intuit Inc. System and method for conversational report customization
CN112347320A (en) * 2020-11-05 2021-02-09 杭州数梦工场科技有限公司 Associated field recommendation method and device for data table field
CN117009998A (en) * 2023-08-29 2023-11-07 上海倍通医药科技咨询有限公司 Data inspection method and system
CN117113947A (en) * 2023-10-25 2023-11-24 天衣(北京)科技有限公司 Form filling system, method, electronic equipment and storage medium
CN117113947B (en) * 2023-10-25 2024-01-23 天衣(北京)科技有限公司 Form filling system, method, electronic equipment and storage medium

Also Published As

Publication number Publication date
WO2001006416A3 (en) 2004-01-15
AU7880100A (en) 2001-02-05

Similar Documents

Publication Publication Date Title
WO2001006416A2 (en) Intelligent mapping of field names in an electronic form with standard field names
US7496543B1 (en) Pricing engine for electronic commerce
US8260682B2 (en) Systems and methods for online selection of service providers and management of service accounts
US8412718B1 (en) System and method for determining originality of data content
CN110738545A (en) Product recommendation method and device based on user intention identification, computer equipment and storage medium
EP1166216A2 (en) Server for enabling the automatic insertion of data into electronic forms on a user computer
CN106997549A (en) The method for pushing and system of a kind of advertising message
US11227217B1 (en) Entity transaction attribute determination method and apparatus
CA2548320A1 (en) Method and apparatus for optimizing product distribution strategies and product mixes to increase profitability in complex computer aided pricing of products and services
CN110659318A (en) Big data based strategy pushing method and system and computer equipment
CN114303164A (en) Supplier invoice reconciliation and payment using event driven platform
JP2020520609A (en) Electronic message filtering
CN103778358B (en) A kind of method and system for realizing shopping online
CN107908615A (en) A kind of method and apparatus for obtaining search term corresponding goods classification
CN111461827B (en) Push method and device for product evaluation information
CN114663155A (en) Advertisement putting and selecting method and device, equipment, medium and product thereof
CN110276660A (en) A kind of goods distribution method, device and its system
US20050165596A1 (en) Method and apparatus for determining expected values in the presence of uncertainty
US8856094B2 (en) Remote segmentation system and method
WO2000042540A2 (en) Method and apparatus for client side automatic electronic form completion
CN115423040A (en) User portrait identification method and AI system of interactive marketing platform
US20020194052A1 (en) Method and system for analyzing application needs of an entity
CN113689233A (en) Advertisement putting and selecting method and corresponding device, equipment and medium thereof
CN111813999A (en) Method for improving expandability of intelligent contract field of Etheng
US11532023B2 (en) System and method for streamlining a checkout process of e-commerce websites

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 EP: the EPO has been informed by WIPO that EP was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101)
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 EP: PCT application non-entry in European phase
NENP Non-entry into the national phase

Ref country code: JP