WO2002019164A1

WO2002019164A1 - Virtual aggregate fields

Info

Publication number: WO2002019164A1
Application number: PCT/US2001/011418
Authority: WO
Inventors: Walter Lindsay
Original assignee: Contivo, Inc.
Priority date: 2000-08-29
Filing date: 2001-04-06
Publication date: 2002-03-07
Also published as: JP2004507840A; EP1330735A1; AU2001251451A1; CA2420788A1; US6694338B1

Abstract

A method comprising defining one or more aggregate virtual fields (Figure 6) for a first document (610) using meta-data associated with the first document (615) is disclosed.

Description

VIRTUAL AGGREGATE FIELDS

FIELD OF INVENTION

The invention is related to the field of representation and translation of electronic documents.

BACKGROUND OF THE INVENTION

Some documents have fields that must be combined or massaged to produce data that matches a desired "meaning". For example, the following structure may be present in Document1:

Store_Identifier (group)

DUNS_Number (9-digit field)

Store_Number (4-digit field)

And in Document2:

Location (11-digit field)

Where the Location is the DUNS_Number with the Store_Number concatenated to it. In this case, a "meaning" is part of multiple fields in Document1.

Conventional solutions to mapping require users to write customized code to handle such cases. For example, suppose Vendor1 has a mapping tool that Customer1 uses. Customer1 is mapping Document1 to Document2. Customer1 writes code like:

Target.Location = concat(Source.Store_Identifier.DUNS_Number, Source.Store_Identifier.Store_Number)
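
By way of illustration only, the hand-written mapping above could look like the following in Python. This is a sketch under stated assumptions: the nested-dictionary layout, the function name `map_document1_to_document2`, and the sample digit strings are hypothetical and are not taken from any vendor's mapping tool.

```python
def map_document1_to_document2(source: dict) -> dict:
    """Hand-written, one-off mapping from Document1 to Document2."""
    store = source["Store_Identifier"]
    # Location is the DUNS_Number with the Store_Number concatenated to it.
    return {"Location": store["DUNS_Number"] + store["Store_Number"]}


document1 = {
    "Store_Identifier": {
        "DUNS_Number": "123456789",  # made-up 9-digit value
        "Store_Number": "0042",      # made-up 4-digit value
    }
}
document2 = map_document1_to_document2(document1)
```

Every new pair of documents would need another function of this kind, which is the one-off cost discussed in the following paragraphs.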

Locations in a document can have more than one meaning. This means that mapping is hard to automate. Instead, the mappings must be manually done and require customized code, which does not allow reuse of mapping knowledge and rules.

Conventional solutions have several disadvantages. First, both mapping and the mapping rules are one-off. That is, each time a user wants to define how to perform a document translation, similar code must be written and tested. This increases the time needed to define how to translate from the source to the target document.

Furthermore, both mapping and the mapping rules depend on user-written code. This makes it hard to automatically validate the integrity of the mapping. It also sets a minimum bar for the skill level of anyone trying to define a mapping, as they must then know all the document locations that might hold a particular meaning, and must be skilful enough to write the code to handle the case. This imposes a maintenance burden, as fixing a problem in a mapping requires altering code.

The mapping and the mapping rules are translation-language dependent. The code that must be written and tested depends on the underlying translation engine that will translate the documents. Thus, mapping rules are translation-engine dependent, and a translation defined for one translation engine will likely need adjusting to work on a different translation engine. Moving a transform from one translation engine to another is difficult.

The source and target mappings must be significantly different. The code for handling the case described above will differ whether the document is the source or the target document. If one has mapped from A to B, mapping from B to A requires major rework, as the code for the mapping would have to be rewritten using different logic.

Conventional mapping tools use superficial similarities in field names or document structure as the basis for automapping. They cannot automap to virtual structures, forcing users to write code.

SUMMARY OF THE INVENTION

A method comprising defining one or more aggregate virtual fields for a first document using meta-data associated with the first document is disclosed.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention is illustrated by way of example and not limitation in the figures of the accompanying drawings, in which like references indicate similar elements, and in which:

Figure 1 illustrates one embodiment of a data structure for part of a document.

Figure 2 illustrates another embodiment of a data structure for the same document as Figure 1.

Figure 3 illustrates one embodiment of a data structure including the information needed to create the virtual aggregate field of Figure 2.

Figure 4 is an example of a network that uses aggregate virtual fields to translate documents.

Figure 5 is an example of a computer system that uses aggregate virtual fields to translate documents.

Figure 6 is an example of a translation system that uses aggregate virtual fields to translate documents.

Figure 7 is an embodiment of a method for automatically generating a transform using aggregate virtual fields.

DETAILED DESCRIPTION

A transform to perform mappings can be automatically generated using virtual aggregate fields. The virtual aggregate fields can be used to create a named field in the document, and associate it with a set of fields with a "map-to" and/or a "map-from" rule. All mapping and mapping rule operations can then be applied on the new field. The transform to perform the mappings can then be automatically generated, as discussed below.

Figure 1 depicts a data structure for part of a document. Group1 has other groups under it, GroupA, GroupB and GroupC. GroupB has other groups under it, including Group1, which has field Field1.3 in it. The full path through this document, from the top as shown down to Field1.3, is described as:

Group1/GroupB/Group1/Field1.3

Field1.3 and Field3.2 hold information that together represents a "meaning" of the document. Figure 1 is an example of one embodiment of meta-data and shows one particular instance of GroupB. In many concrete documents, a structure like GroupB can repeat many times, and each instance of GroupB can have different data in its fields.
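
As a minimal sketch, the path notation can be resolved against meta-data held as nested dictionaries. The dictionary rendering, the placeholder leaf value, and the helper `resolve` are assumptions made for illustration; only the groups and the field named in the path above are included.

```python
# Hypothetical nested-dict rendering of part of the meta-data of Figure 1.
metadata = {
    "Group1": {
        "GroupA": {},
        "GroupB": {
            "Group1": {"Field1.3": "field"},  # placeholder leaf value
        },
        "GroupC": {},
    }
}


def resolve(tree: dict, path: str):
    """Walk a '/'-separated path from the top of the tree to a group or field."""
    node = tree
    for part in path.split("/"):
        node = node[part]
    return node


leaf = resolve(metadata, "Group1/GroupB/Group1/Field1.3")
top_level_groups = sorted(resolve(metadata, "Group1"))
```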

Enabling and Disabling a Virtual Aggregate Field

Figure 2 depicts a data structure for the same document as Figure 1, except that field F_Meaning1 is defined. That is, if a mapping accesses field F_Meaning1 in a source document, the transform will automatically use the MapFrom code. If a mapping accesses field F_Meaning1 in a target document, the transform will automatically use the MapTo code.

F_Meaning1 is an enabled virtual aggregate field. That is, it appears in the document. There are events, such as a user action in the GUI, that trigger the virtual aggregate field to be enabled - inserted into the document. Similarly, a virtual aggregate field can be disabled and disappear from the document.
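
One possible way to model enabling and disabling is a per-field flag in the document meta-data, sketched below. The class `DocumentMetadata` and its method names are hypothetical; the description does not prescribe how the enabled state is stored.

```python
class DocumentMetadata:
    """Meta-data for one document, with switchable virtual aggregate fields."""

    def __init__(self, real_fields):
        self.real_fields = list(real_fields)
        self.virtual_enabled = {}            # virtual field name -> enabled?

    def define_virtual(self, name):
        self.virtual_enabled[name] = False   # defined, but not yet in the document

    def enable(self, name):
        self.virtual_enabled[name] = True    # the field now appears in the document

    def disable(self, name):
        self.virtual_enabled[name] = False   # the field disappears again

    def visible_fields(self):
        shown = [n for n, on in self.virtual_enabled.items() if on]
        return self.real_fields + shown


meta = DocumentMetadata(["Field1.3", "Field3.2"])
meta.define_virtual("F_Meaning1")
before = meta.visible_fields()
meta.enable("F_Meaning1")
enabled_view = meta.visible_fields()
meta.disable("F_Meaning1")
after = meta.visible_fields()
```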

Source Data for Defining Virtual Aggregate Fields

The descriptions of virtual aggregate fields can be stored in an external data source. Figure 3 illustrates a data structure including the information needed to create the virtual aggregate field of Figure 2. The information needed to generate a virtual group is:

• Name - the name of the new field.

• ParentGroup - the location of the new field.

• Description - a textual description of the general meaning of the new field.

• ParticipatingFields - the fields that hold the source data of the new field.

• MapTo - code that, given the value of the virtual field, populates the ParticipatingFields.

• MapFrom - code that, given the ParticipatingFields, produces the value of the virtual aggregate field.
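
The six items above could be held in a record such as the following. This is a sketch only: the Python types, the use of callables for MapTo and MapFrom, and the split at position 9 (taken from the 9-digit DUNS_Number of the background example) are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class VirtualAggregateField:
    name: str                                   # Name
    parent_group: str                           # ParentGroup
    description: str                            # Description
    participating_fields: List[str]             # ParticipatingFields
    map_to: Callable[[str], Dict[str, str]]     # MapTo: value -> participating fields
    map_from: Callable[[Dict[str, str]], str]   # MapFrom: participating fields -> value


location = VirtualAggregateField(
    name="Location",
    parent_group="Store_Identifier",
    description="DUNS number followed by store number",
    participating_fields=["DUNS_Number", "Store_Number"],
    map_to=lambda value: {"DUNS_Number": value[:9], "Store_Number": value[9:]},
    map_from=lambda fields: fields["DUNS_Number"] + fields["Store_Number"],
)

combined = location.map_from({"DUNS_Number": "123456789", "Store_Number": "0042"})
restored = location.map_to(combined)
```

Keeping MapTo and MapFrom side by side in one record is what lets the same definition serve a document whether it is the source or the target.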

Using Virtual Groups in Source Documents to Automatically Generate a Transform

Users can apply mapping rules to meta-data to map from a field under a virtual aggregate field in the source document to the relevant location(s) in the target document. A virtual aggregate field in a source document can be treated just like any other field. Whatever operations - move, or any other mapping rule that might be applied to other fields - apply to such fields.

A transform is the code used by a translation engine to convert one concrete document to another. A transform is generated by applying mapping rules to the meta-data of the source and target documents. After the mapping rules and the meta-data, including virtual aggregate fields, are defined, a transform can be automatically generated, which performs the following processing on fields in a virtual aggregate field defined for a concrete source document:

Apply the MapFrom function to the participating fields, and return its value.
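
A minimal sketch of this source-side step follows, relying on the definition given for Figure 3, in which MapFrom takes the participating fields and produces the value of the virtual aggregate field. The helper `read_virtual_field`, the function `concat_location`, and the sample data are hypothetical.

```python
from typing import Callable, Dict, List


def read_virtual_field(
    group: Dict[str, str],
    participating_fields: List[str],
    map_from: Callable[[Dict[str, str]], str],
) -> str:
    """Return the virtual field's value for one instance of the parent group."""
    inputs = {name: group[name] for name in participating_fields}
    return map_from(inputs)


def concat_location(fields: Dict[str, str]) -> str:
    return fields["DUNS_Number"] + fields["Store_Number"]


# A concrete document may repeat the parent group; each instance is read separately.
instances = [
    {"DUNS_Number": "123456789", "Store_Number": "0001"},
    {"DUNS_Number": "123456789", "Store_Number": "0002"},
]
values = [
    read_virtual_field(g, ["DUNS_Number", "Store_Number"], concat_location)
    for g in instances
]
```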

Note that simple extensions to support default values, etc. are within the scope of the invention. For example, a default value for a participating field in a virtual aggregate field could be added to the information in Figure 3. Looping functionality could also be supported to create a virtual aggregate field that combines the information of multiple structures into a single value.

Using Virtual Groups in Target Documents to Automatically Generate a Transform

Users can map from location(s) in the source document to a virtual aggregate field in the target document. A virtual aggregate field in a target document can be treated just like any other field. Whatever operations - move, or any other manipulation rule that might be applied to other fields - apply to virtual aggregate fields.

After a transformation is defined, a transform can be automatically generated, which will do this processing on virtual aggregate fields defined for a concrete target document:

Apply the MapTo function on the data mapped to the virtual aggregate field, in order to populate the participating fields.

Note that simple extensions to support default values, etc. are within the scope of the invention. For example, a default value for a participating field of a virtual aggregate field could be added to the information of Figure 3, such that the participating field is conditionally populated. Users can write the code of the MapTo and MapFrom functions.
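
The target-side step, together with the default-value extension, might be sketched as follows, relying on the definition given for Figure 3, in which MapTo takes the value of the virtual field and populates the participating fields. The helper `write_virtual_field`, the `defaults` parameter, and the rule that a default is used only when MapTo yields no value are assumptions, not behaviour stated in the description.

```python
from typing import Callable, Dict, Optional


def write_virtual_field(
    target_group: Dict[str, str],
    value: str,
    map_to: Callable[[str], Dict[str, Optional[str]]],
    defaults: Optional[Dict[str, str]] = None,
) -> None:
    """Populate the participating fields of target_group from a virtual field value."""
    defaults = defaults or {}
    for name, field_value in map_to(value).items():
        if field_value:                    # MapTo produced a usable value
            target_group[name] = field_value
        elif name in defaults:             # hypothetical default-value extension
            target_group[name] = defaults[name]


def split_location(value: str) -> Dict[str, Optional[str]]:
    # The first 9 characters are the DUNS number; the rest is the store number.
    return {"DUNS_Number": value[:9], "Store_Number": value[9:] or None}


full, partial = {}, {}
write_virtual_field(full, "1234567890042", split_location)
write_virtual_field(partial, "123456789", split_location, {"Store_Number": "0000"})
```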

Hardware Overview

According to the present invention, a host computer system transmits and receives data over a computer network or standard telephone line. According to one embodiment, the steps of accessing, downloading, and manipulating the data, as well as other aspects of the present invention are implemented by a central processing unit (CPU) in the host computer executing sequences of instructions stored in a memory. The memory may be a random access memory (RAM), read-only memory (ROM), a persistent store, such as a mass storage device, or any combination of these devices. Execution of the sequences of instructions causes the CPU to perform steps according to the present invention. The instructions may be loaded into the memory of the host computer from a storage device, or from one or more other computer systems over a network connection. For example, a server computer may transmit a sequence of instructions to the host computer in response to a message transmitted to the server over a network by the host. As the host receives the instructions over the network connection, it stores the instructions in memory. The host may store the instructions for later execution or execute the instructions as they arrive over the network connection. In some cases, the downloaded instructions may be directly supported by the CPU. In other cases, the instructions may not be directly executable by the CPU, and may instead be executed by an interpreter that interprets the instructions. In other embodiments, hardwired circuitry may be used in place of, or in combination with, software instructions to implement the present invention. Thus, the present invention is not limited to any specific combination of hardware circuitry and software, nor to any particular source for the instructions executed by the host computer.

Figure 4 illustrates a system 400 in which a host computer 402 is connected to a remote computer 404 through a network 410. The network interface between host computer 402 and remote 404 may also include one or more routers, such as routers 406 and 408, which serve to buffer and route the data transmitted between the host and client computers. Network 410 may be the Internet, a Wide Area Network (WAN), a Local Area Network (LAN), or any combination thereof. The remote computer 404 may be a World-Wide Web (WWW) server that stores data in the form of 'web pages' and transmits these pages as Hypertext Markup Language (HTML) files over the Internet network 410 to host computer 402. To access these files, host computer 402 runs a 'web browser', which is simply an application program for accessing and providing links to web pages available on various Internet sites. Host computer 402 is also configured to communicate to telephone system 412 through a telephone interface, typically a modem.

Figure 5 is a block diagram of a representative networked computer, such as host computer 402 illustrated in Figure 4. The computer system 500 includes a processor 502 coupled through a bus 501 to a random access memory (RAM) 504, a read only memory (ROM) 506, and a mass storage device 507. Mass storage device 507 could be a disk or tape drive for storing data and instructions. A display device 520 for providing visual output is also coupled to processor 502 through bus 501. Keyboard 521 is coupled to bus 501 for communicating information and command selections to processor 502. Another type of user input device is cursor control unit 522, which may be a device such as a mouse or trackball, for communicating direction commands that control cursor movement on display 520. Also coupled to processor 502 through bus 501 is an audio output port 524 for connection to speakers that output audio signals produced by computer 500.

Further coupled to processor 502 through bus 501 is an input/output (I/O) interface 525, and a network interface device 523 for providing a physical and logical connection between computer system 500 and a network. Network interface device 523 is used by various communication applications running on computer 500 for communicating over a network medium and may represent devices such as an ethernet card, ISDN card, or similar devices.

Modem 526 interfaces computer system 500 to a telephone line and translates digital data produced by the computer into analog signals that can be transmitted over standard telephone lines, such as by telephone system 412 in Figure 4. In an embodiment of the present invention, modem 526 provides a hardwired interface to a telephone wall jack; however, modem 526 could also represent a wireless modem for communication over cellular telephone networks. It should be noted that the architecture of Figure 4 is provided only for purposes of illustration, and that a host computer used in conjunction with the present invention is not limited to the specific architecture shown.

Figure 6 shows an example of the groups and fields of two different documents, a source document format 610 and a target document format 620. In this embodiment, the document is a purchase order. However, the document may convey any information that one person or business wants to send to another person or business. The source group 615 includes the source fields of name, address, city, description, price, quantity, and total. The target group 625 includes the fields name, location, information, cost, number, and amount. Although the formats of the fields in the source and target groups are structurally different, they have similarities and common abstractions such as name, amount, and place to ship the goods. Thus, the names of the fields in groups 615 and 625 may be different, such as "price" and "cost," for example, but the data 617 and 627 contained in these fields is the same.

A virtual field that corresponds to a field in the source and target groups 615 and 625 can be used to capture these common abstractions using meta-data. For example, meta-data associated with the source document can be used by the mapping engine to define one or more aggregate virtual fields. The meta-data used to define the aggregate virtual fields can be obtained from a data structure such as the data structure of Figure 3. After the aggregate virtual fields are defined, the mapping engine can apply mapping rules to the meta-data associated with the source group, including the aggregate virtual fields, to automatically generate a transform. The transform is then provided to the translation engine, which uses the transform to convert the source document into the target document.

A mapping engine 650 creates a translation map, as shown in Figure 6. The translation map is used by a translation engine 630 to convert, or translate a message from a source format to a target format. The translation map is a metadata level description of the fields in the source document that will be used to populate a field in the target document.
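
For illustration, a translation map of this kind can be pictured as a table from each target field to the source fields that populate it. In the sketch below only the pairing of "price" with "cost" is stated in the description of Figure 6; every other pairing, the space-joining rule, the function `translate`, and the sample purchase order are assumptions.

```python
# Target field -> source fields that populate it. Only "price" -> "cost" is
# stated in the text; every other pairing here is an illustrative assumption.
translation_map = {
    "name": ["name"],
    "location": ["address", "city"],
    "information": ["description"],
    "cost": ["price"],
    "number": ["quantity"],
    "amount": ["total"],
}


def translate(source: dict, tmap: dict) -> dict:
    """Populate each target field by joining its source fields with a space."""
    return {t: " ".join(str(source[s]) for s in srcs) for t, srcs in tmap.items()}


order = {
    "name": "Acme", "address": "1 Main St", "city": "Springfield",
    "description": "Widget", "price": "2.50", "quantity": "4", "total": "10.00",
}
translated = translate(order, translation_map)
```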

Figure 7 shows an embodiment of a method for automatically generating a transform using aggregate virtual fields. One or more aggregate virtual fields for a first document are defined, step 710. The aggregate virtual fields are defined using meta-data contained in the data structure of Figure 3. One or more of these aggregate virtual fields are enabled, so that the enabled aggregate virtual fields appear in the first document, step 720. One or more of the aggregate virtual fields may be disabled, so that the disabled aggregate virtual fields do not appear in the first document, step 730. Mapping rules to map data from fields in the first document to fields in a second document are defined, step 740. Then, a transform to convert the first document into the second document is automatically generated by applying the mapping rules to the meta-data, including the enabled aggregate virtual fields, of the first and second documents.
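
The method can be sketched end to end as follows. This is one possible reading, not the claimed implementation: the dictionary layout, the rule format of (source field, target field) pairs, the function `generate_transform`, and the field `Store_Name` are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# Step 710: define aggregate virtual fields from meta-data (name -> definition).
virtual_fields = {
    "Location": {
        "participating": ["DUNS_Number", "Store_Number"],
        "map_from": lambda f: f["DUNS_Number"] + f["Store_Number"],
    }
}

# Step 720: enable the fields that should appear in the first document.
enabled = {"Location"}

# Step 740: mapping rules as (source field, target field) pairs.
rules: List[Tuple[str, str]] = [("Location", "Location"), ("Store_Name", "Name")]


def generate_transform(rules, virtual_fields, enabled) -> Callable[[Dict[str, str]], Dict[str, str]]:
    """Build a function that converts a first document into a second document."""
    def transform(doc):
        out = {}
        for src, dst in rules:
            if src in enabled:
                # Enabled virtual field: compute its value with MapFrom.
                definition = virtual_fields[src]
                inputs = {n: doc[n] for n in definition["participating"]}
                out[dst] = definition["map_from"](inputs)
            else:
                out[dst] = doc[src]  # ordinary field: plain move
        return out
    return transform


transform = generate_transform(rules, virtual_fields, enabled)
result = transform({"DUNS_Number": "123456789", "Store_Number": "0042", "Store_Name": "Main St"})
```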

Virtual aggregate fields thus provide several advantages. First, the virtual aggregate fields allow code to automap between source and target documents. The automapping code enables aggregate virtual fields as needed - if it discovers that a field under an aggregate virtual field that could potentially be enabled is required by the automapping mechanism, it enables the aggregate virtual field.
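
The enable-on-demand behaviour might be sketched as below. Exact-name matching is used purely as a stand-in for whatever matching criterion the automapping mechanism applies; that criterion, the function `automap`, and the field names `Name` and `Full_Address` are assumptions.

```python
def automap(source_fields, disabled_virtual, target_fields):
    """Match target fields by name, enabling virtual fields only when needed."""
    rules, newly_enabled = [], []
    for target in target_fields:
        if target in source_fields:
            rules.append((target, target))
        elif target in disabled_virtual:
            newly_enabled.append(target)    # enable on demand
            rules.append((target, target))
    return rules, newly_enabled


rules, newly_enabled = automap(
    source_fields=["DUNS_Number", "Store_Number", "Name"],
    disabled_virtual=["Location", "Full_Address"],
    target_fields=["Name", "Location"],
)
```

In this sketch `Full_Address` stays disabled because no target field requires it.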

Second, merely doing the mapping tends to be sufficient if an aggregate virtual field is involved. Once the virtual aggregate field is defined, the user does not need to write code to identify when the data of the fields of an aggregate virtual field have a particular meaning, or write code to convert between the aggregate virtual field and its constituent fields.

Third, the ability to write code is not compromised. This new technique can co-exist with the older way of doing things. Fourth, transformation instructions can be successfully generated for the translation engines.

Fifth, mapping from document A to B is much closer to mapping from B to A than without this invention. Thus, the mapping from B to A has been made closer to the transposition of the mapping from A to B. Mapping one direction then provides most of the information needed to map the other direction. If users had to write code to map from A to B, such a transposition would be far more work. With this invention, transposing a mapping is far less work.

Sixth, mapping to or from fields under an aggregate virtual field is translation-engine independent (assuming the code of the MapTo and MapFrom functions can support the target translation engines). We merely generate the code appropriate for that translation engine when writing out the transform in the way that translation engine requires.
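
The transposition described in the fifth advantage can be sketched as follows: because the virtual field carries both MapFrom and MapTo, reversing the direction amounts to swapping each rule's two sides. The virtual field name `Store_Location`, the rule format, and the functions `a_to_b` and `b_to_a` are hypothetical.

```python
virtual = {
    "Store_Location": {
        "map_from": lambda f: f["DUNS_Number"] + f["Store_Number"],
        "map_to": lambda v: {"DUNS_Number": v[:9], "Store_Number": v[9:]},
    }
}

rules_a_to_b = [("Store_Location", "Location")]            # A's virtual field -> B's field
rules_b_to_a = [(dst, src) for src, dst in rules_a_to_b]   # transposed rules


def a_to_b(doc_a):
    # A is the source, so the virtual field is read with MapFrom.
    return {dst: virtual[src]["map_from"](doc_a) for src, dst in rules_a_to_b}


def b_to_a(doc_b):
    # A is now the target, so the same virtual field is written with MapTo.
    doc_a = {}
    for src, dst in rules_b_to_a:
        doc_a.update(virtual[dst]["map_to"](doc_b[src]))
    return doc_a


doc_a = {"DUNS_Number": "123456789", "Store_Number": "0042"}
doc_b = a_to_b(doc_a)
round_trip = b_to_a(doc_b)
```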

Seventh, without needing to perform complicated analyses of user-written code, we can validate mappings to and from fields under aggregate virtual fields, as most cases do not require the user to write code. Because fewer mappings require the user to write code, mapping difference checking is easier. This way, once the MapTo and MapFrom functions have been validated, mapping difference checking is simplified.

Eighth, a non-programmer can do most of the work of mapping. Ninth, maps are more translation-engine independent, as less overall code is needed. Tenth, creating a map is faster, as automapping has a better hit rate. Eleventh, maps have fewer bugs, as users don't need to write code. Thus, debugging a mapping is faster. Twelfth, time to market is faster for users. Thirteenth, this mechanism works with virtual fields and groups.

Fourteenth, nesting of virtual groups works. That is, the meaning of a structure in a concrete document can depend on several qualifier fields as discussed above.

These and other embodiments of the present invention may be realized in accordance with these teachings and it should be evident that various modifications and changes may be made in these teachings without departing from the broader spirit and scope of the invention. The specification and drawings are, accordingly, to be regarded in an illustrative rather than restrictive sense and the invention measured only in terms of the claims.

Claims

What is claimed is:

1. A method comprising: defining one or more aggregate virtual fields for a first document using meta-data associated with the first document.

2. The method of claim 1 further comprising enabling one or more aggregate virtual fields so that the enabled aggregate virtual fields appear in the first document.

3. The method of claim 2 further comprising defining mapping rules to map data from fields in the first document to fields in a second document.

4. The method of claim 3 further comprising automatically generating a transform to convert the first document into the second document by applying the mapping rules to the meta-data and the enabled virtual aggregate fields.

5. The method of claim 1, wherein each aggregate virtual field is defined using meta-data contained in a data structure, said data structure having a name data element, a parent group data element, a participating fields data element, a map-to data element, and a map-from data element.

6. A computer readable medium having instructions which, when executed by a processing system, cause the system to: define one or more aggregate virtual fields for a first document using meta-data associated with the first document.

7. The medium of claim 6 wherein the executed instructions further cause the system to enable one or more aggregate virtual fields so that the enabled aggregate virtual fields appear in the first document.

8. The medium of claim 7 wherein the executed instructions further cause the system to define mapping rules to map data from fields in the first document to fields in a second document.

9. The medium of claim 8 wherein the executed instructions further cause the system to automatically generate a transform to convert the first document into the second document by applying the mapping rules to the meta-data and the enabled virtual aggregate fields.

10. The medium of claim 6, wherein each aggregate virtual field is defined using meta-data contained in a data structure, said data structure having a name data element, a parent group data element, a participating fields data element, a map-to data element, and a map-from data element.

11. An apparatus comprising: means for associating meta-data with a first document; and means for defining one or more aggregate virtual fields for the first document using meta-data associated with the first document.

12. The apparatus of claim 11 further comprising means for enabling one or more aggregate virtual fields so that the enabled aggregate virtual fields appear in the first document.

13. The apparatus of claim 12 further comprising means for defining mapping rules to map data from fields in the first document to fields in a second document.

14. The apparatus of claim 13 further comprising means for automatically generating a transform to convert the first document into the second document by applying the mapping rules to the meta-data and the enabled virtual aggregate fields.

15. The apparatus of claim 11, wherein each aggregate virtual field is defined using meta-data contained in a data structure, said data structure having a name data element, a parent group data element, a participating fields data element, a map-to data element, and a map-from data element.