US20080177783A1 - System and program product for providing high performance data lookup - Google Patents

Info

Publication number
US20080177783A1
Authority
US
United States
Prior art keywords
documents
index
index keys
request
computer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US12/054,792
Inventor
Matthew J. Bangel
Scott D. Hicks
James A. Martin Jr.
Douglas G. Murray
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Google LLC
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp
Priority to US12/054,792
Publication of US20080177783A1
Assigned to GOOGLE INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: INTERNATIONAL BUSINESS MACHINES CORPORATION
Priority to US13/617,726 (US20130073558A1)
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: MURRAY, DOUGLAS G., HICKS, SCOTT D., MARTIN, JAMES A., JR., BANGEL, MATTHEW J.
Assigned to GOOGLE LLC. CHANGE OF NAME (SEE DOCUMENT FOR DETAILS). Assignors: GOOGLE INC.

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31: Indexing; Data structures therefor; Storage structures
    • G06F16/313: Selection or weighting of terms for indexing
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10: TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S: TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00: Data processing: database and file management or data structures
    • Y10S707/99941: Database schema or data structure
    • Y10S707/99942: Manipulating data structure, e.g. compression, compaction, compilation
    • Y10S707/99943: Generating database or data structure, e.g. via user interface

Definitions

  • the present invention generally relates to data lookup. Specifically, the present invention relates to a system and program product for providing high performance data lookup (e.g., document retrieval).
  • IT information technology
  • a growing number of organizations are turning to IT-based solutions for their data storage needs.
  • an organization can store a countless number of “documents” electronically while consuming very little physical space.
  • Such an IT-based approach can not only save overhead costs, but also allow for improved redundancy.
  • computerized access can be provided for authorized individuals from virtually any location.
  • the present invention provides a method, system and program product for providing high performance data lookup.
  • index keys are generated for a set of documents. This is typically accomplished by examining the set of documents, and connecting data values extracted from the set of documents to yield the index keys. Once the index keys are generated, an index view will be generated into which the index keys are populated. Using the index keys in the index view, an agent will automatically obtain the set of documents (i.e., in the background). Then, when a user requests one of the documents, the document will already have been retrieved from storage. As such, it can readily be provided to the user.
  • a first aspect of the present invention provides a method for providing high performance data lookup, comprising: extracting data values from each of a set of documents; creating index keys for the set of documents using the extracted data values; populating the index keys into an index view; and automatically obtaining the set of documents using the index keys in the index view.
  • a second aspect of the present invention provides a method for providing high performance data lookup, comprising: generating index keys for a set of documents; populating an index view with the index keys; automatically obtaining the set of documents using the index keys; receiving a request for a desired document; and retrieving the desired document from the obtained set of documents based on the request and the index keys.
  • a third aspect of the present invention provides a system for providing high performance data lookup, comprising: means for generating index keys for a set of documents; means for populating an index view with the index keys; means for automatically obtaining the set of documents using the index keys; means for receiving a request for a desired document; and means for retrieving the desired document from the obtained set of documents based on the request and the index keys.
  • a fourth aspect of the present invention provides a program product stored on a computer readable medium for providing high performance data lookup, the computer readable medium comprising program code for performing the following steps: generating index keys for a set of documents; populating an index view with the index keys; automatically obtaining the set of documents using the index keys; receiving a request for a desired document; and retrieving the desired document from the obtained set of documents based on the request and the index keys.
  • a fifth aspect of the present invention provides a method for deploying an application for providing high performance data lookup, comprising: providing a computer infrastructure being operable to: generate index keys for a set of documents; populate an index view with the index keys; automatically obtain the set of documents using the index keys; receive a request for a desired document; and retrieve the desired document from the obtained set of documents based on the request and the index keys.
  • a sixth aspect of the present invention provides computer software for deploying an application for providing high performance data lookup, the computer software comprising instructions for causing a computer system to perform the following functions: generate index keys for a set of documents; populate an index view with the index keys; automatically obtain the set of documents using the index keys; receive a request for a desired document; and retrieve the desired document from the obtained set of documents based on the request and the index keys.
  • a seventh aspect of the present invention provides a view for indexing documents, comprising: an index key portion for storing index keys for a set of documents, wherein each of the index keys includes a plurality of data values extracted from a corresponding one of the set of documents, and wherein the plurality of data values for the index keys are separated by a connector.
  • An eighth aspect of the invention provides a computer-readable medium that includes computer program code to enable a computer infrastructure to provide high performance data lookup.
  • a ninth aspect of the invention provides a business method for providing high performance data lookup.
  • FIG. 1 shows an illustrative system for providing high performance data lookup according to the present invention.
  • FIG. 2 shows an illustrative index view according to the present invention.
  • FIG. 3 shows an illustrative method flow diagram according to the present invention.
  • index keys are generated for a set of documents. This is typically accomplished by examining the set of documents, and connecting data values extracted from the set of documents to yield the index keys. Once the index keys are generated, an index view will be generated into which the index keys are populated. Using the index keys in the index view, an agent will automatically obtain the set of documents (i.e., in the background). Then, when a user requests one of the documents, the document will already have been retrieved from storage. As such, it can readily be provided to the user. It should be understood that as used herein, the term “document” is intended to refer to any collection of data that is stored electronically.
  • system 10 includes a computer infrastructure 12, which comprises a computer system 14 that can perform the various process steps described herein.
  • Computer system 14 is intended to represent any type of computer system capable of carrying out the teachings of the present invention.
  • computer system 14 could be a laptop computer, a desktop computer, a workstation, a handheld device, a server, etc.
  • computer system 14 can be deployed and/or operated by a service provider that is providing high performance data lookup for users such as user 16.
  • user 16 could directly access computer system 14 as shown, or could operate their own independent computer systems that communicate with computer system 14 over a network (e.g., the Internet, a wide area network (WAN), a local area network (LAN), a virtual private network (VPN), etc.).
  • communications between computer system 14 and the user-operated computer system can occur via any combination of various types of communications links.
  • the communication links can comprise addressable connections that may utilize any combination of wired and/or wireless transmission methods.
  • connectivity could be provided by conventional TCP/IP sockets-based protocol, and an Internet service provider could be used to establish connectivity to the Internet.
  • performance lookup system 40 is shown implemented on computer system 14 as computer program code.
  • computer system 14 is shown including a processing unit 20, a memory 22, a bus 24, and input/output (I/O) interfaces 26. Further, computer system 14 is shown in communication with external I/O devices/resources 28 and one or more storage systems 30.
  • processing unit 20 executes computer program code, such as performance lookup system 40, that is stored in memory 22 and/or storage system(s) 30.
  • processing unit 20 can read and/or write data, to/from memory 22, storage system(s) 30, and/or I/O interfaces 26.
  • Bus 24 provides a communication link between each of the components in computer system 14.
  • External devices 28 can comprise any devices (e.g., keyboard, pointing device, display, etc.) that enable a user to interact with computer system 14 and/or any devices (e.g., network card, modem, etc.) that enable computer system 14 to communicate with one or more other computing devices, such as those in organization 18 and/or operated by user 16.
  • Computer infrastructure 12 is only illustrative of various types of computer infrastructures for implementing the invention.
  • computer infrastructure 12 comprises two or more computing devices (e.g., a server cluster) that communicate over a network to perform the various process steps of the invention.
  • computer system 14 is only representative of various possible computer infrastructures that can include numerous combinations of hardware.
  • computer system 14 can comprise any specific purpose computing article of manufacture comprising hardware and/or computer program code for performing specific functions, any computing article of manufacture that comprises a combination of specific purpose and general purpose hardware/software, or the like.
  • the program code and hardware can be created using standard programming and engineering techniques, respectively.
  • processing unit 20 may comprise a single processing unit, or be distributed across one or more processing units in one or more locations, e.g., on a client and server.
  • memory 22 and/or storage system 30 can comprise any combination of various types of data storage and/or transmission media that reside at one or more physical locations.
  • I/O interfaces 26 can comprise any system for exchanging information with one or more external devices 28 .
  • one or more additional components (e.g., system software, math co-processing unit, etc.) not shown in FIG. 1 can be included in computer system 14.
  • if computer system 14 comprises a handheld device or the like, it is understood that one or more external devices 28 (e.g., a display) and/or storage system(s) 30 could be contained within computer system 14, not externally as shown.
  • Storage system 30 can be any type of system (e.g., a database) capable of providing storage for information under the present invention.
  • storage system 30 could include one or more storage devices, such as a magnetic disk drive or an optical disk drive.
  • storage system 30 includes data distributed across, for example, a local area network (LAN), wide area network (WAN) or a storage area network (SAN) (not shown).
  • LAN local area network
  • WAN wide area network
  • SAN storage area network
  • additional components such as cache memory, communication systems, system software, etc., may be incorporated into computer system 14 .
  • computer systems operated by user 16 will likely contain computerized components similar to computer system 14 .
  • organization 18 and documents 50 could be contained within infrastructure 12 . They are shown as independent systems for illustrative purposes only.
  • Shown in memory 22 of computer system 14 is performance lookup system 40, which includes index key system 42, index view system 44, document retrieval system 46 and request processing system 48. Operation of each of these systems is discussed further below. However, it is understood that some of the various systems shown in FIG. 1 can be implemented independently, combined, and/or stored in memory for one or more separate computer systems 14 that communicate over a network. Further, it is understood that some of the systems/functionality may not be implemented and/or additional systems/functionality may be included as part of the present invention. Still yet, it is understood that the depiction of these systems shown in FIG. 1 is illustrative only and that the same functionality could be achieved with a different configuration. That is, the functionality of these systems could be combined into fewer systems, or broken down into additional systems.
  • index key system 42 will create an index key for each document.
  • index key system 42 will analyze documents 50 and extract data values therefrom. These data values will be connected and separated by a separator to yield strings of data (with each string corresponding to a particular document).
  • the data values are positioned in the index keys in a descending hierarchical fashion.
  • index view system 44 will generate an index view into which the index keys are populated.
  • index view 60 includes a key window 62 where index keys 64 are listed.
  • the index keys 64 shown each include multiple data values as extracted from a corresponding document.
  • each index key 64 includes five or more data values.
  • each data value is separated from the next by a separator such as a tilde (~).
  • the data values are arranged within each index key in a descending hierarchical fashion (e.g., year 2004 is first, month 04 is second, etc.).
  • Document type window 66 allows a specific type of document 68 to be selected for display of its corresponding index keys in key window 62 .
  • document retrieval system 46 will automatically retrieve the documents using their index keys 64 (FIG. 2).
  • document retrieval system 46 includes an automated agent or the like that analyzes the index keys 64, and obtains the corresponding documents 50.
  • the documents 50 can be considered “local” to computer system 14 (e.g., in memory 22 or storage system 30).
  • request processing system 48 will parse the request to determine what document is being requested, and then retrieve that document by cross-referencing the index key for that document.
  • request processing system 48 can analyze the request, and generate a user key for the requested document.
  • the user key can resemble or be similar to the index key for that document.
  • request processing system 48 can be configured similar to index key system 42 .
  • a user key that is determined to be identical or sufficiently similar to an index key upon comparison could correspond to the requested document.
  • Such document would then be retrieved (e.g., from local storage of computer system 14) and returned to or displayed to user 16.
  • First step S1 is to generate index keys for a set of documents. In general, this involves examining the set of documents, and connecting data values extracted from the set of documents to yield the index keys.
  • Second step S2 is to populate an index view with the index keys.
  • Third step S3 is to automatically obtain the set of documents using the index keys.
  • Fourth step S4 is to receive a request for a desired document and fifth step S5 is to retrieve the desired document from the obtained set of documents based on the request and the index keys.
  • the invention provides a computer-readable medium that includes computer program code to enable a computer infrastructure to provide high performance data lookup within organizations.
  • the computer-readable medium includes program code that implements each of the various process steps of the invention.
  • the term “computer-readable medium” comprises one or more of any type of physical embodiment of the program code.
  • the computer-readable medium can comprise program code embodied on one or more portable storage articles of manufacture (e.g., a compact disc, a magnetic disk, a tape, etc.), on one or more data storage portions of a computing device, such as memory 22 (FIG. 1) and/or storage system 30 (FIG. 1) (e.g., a fixed disk, a read-only memory, a random access memory, a cache memory, etc.).
  • the invention provides a business method that performs the process steps of the invention on a subscription, advertising, and/or fee basis. That is, a service provider, such as an Internet Service Provider, could offer to provide high performance data lookup as described above.
  • the service provider can create, maintain, support, etc., a computer infrastructure, such as computer infrastructure 12 (FIG. 1) that performs the process steps of the invention for one or more customers.
  • the service provider can receive payment from the customer(s) under a subscription and/or fee agreement and/or the service provider can receive payment from the sale of advertising content to one or more third parties.
  • the invention provides a method of providing high performance data lookup.
  • a computer infrastructure such as computer infrastructure 12 (FIG. 1)
  • one or more systems for performing the process steps of the invention can be obtained (e.g., created, purchased, used, modified, etc.) and deployed to the computer infrastructure.
  • the deployment of a system can comprise one or more of (1) installing program code on a computing device, such as computer system 14 (FIG. 1), from a computer-readable medium; (2) adding one or more computing devices to the computer infrastructure; and (3) incorporating and/or modifying one or more existing systems of the computer infrastructure to enable the computer infrastructure to perform the process steps of the invention.
  • the terms “program code” and “computer program code” are synonymous and mean any expression, in any language, code or notation, of a set of instructions intended to cause a computing device having an information processing capability to perform a particular function either directly or after either or both of the following: (a) conversion to another language, code or notation; and/or (b) reproduction in a different material form.
  • program code can be embodied as one or more of: an application/software program, component software/a library of functions, an operating system, a basic I/O system/driver for a particular computing and/or I/O device, and the like.
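
As a rough illustration of the index keys and user keys described in the definitions above, the following Python sketch joins extracted data values with a tilde separator in descending hierarchical order, then compares a user key against the index keys. The field names in KEY_FIELDS, all sample values other than year 2004 and month 04, and the prefix-based reading of "sufficiently similar" are assumptions made for this example; the source does not specify them.

```python
SEPARATOR = "~"  # the tilde separator mentioned in the text

# Hypothetical field order, most significant value first (descending hierarchy).
KEY_FIELDS = ("year", "month", "day", "department", "doc_id")


def build_index_key(document: dict) -> str:
    """Connect the data values extracted from one document into an index key."""
    return SEPARATOR.join(str(document[field]) for field in KEY_FIELDS)


def build_user_key(request: dict) -> str:
    """Build a (possibly partial) user key from the leading fields a request supplies."""
    values = []
    for field in KEY_FIELDS:
        if field not in request:
            break  # stop at the first missing level of the hierarchy
        values.append(str(request[field]))
    return SEPARATOR.join(values)


def match_keys(user_key: str, index_keys: list[str]) -> list[str]:
    """Return index keys identical to the user key, or extending it at a separator.

    Treating a shared prefix as "sufficiently similar" is an assumption of this sketch.
    """
    prefix = user_key + SEPARATOR
    return [key for key in index_keys if key == user_key or key.startswith(prefix)]
```

Because the values are ordered from most to least significant, a partial user key such as one built from only a year and a month selects every index key under that branch of the hierarchy.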

Abstract

Under the present invention, index keys are generated for a set of documents. This is typically accomplished by examining the set of documents, and connecting data values extracted from the set of documents to yield the index keys. Once the index keys are generated, an index view will be generated into which the index keys are populated. Using the index keys in the index view, an agent will automatically obtain the set of documents (i.e., in the background). Then, when a user requests one of the documents, the document will already have been retrieved from storage. As such, it can readily be provided to the user. It should be understood that as used herein, the term “document” is intended to refer to any type of electronically stored data.
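
The flow summarized in the abstract can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the class name LookupService, the dictionary standing in for document storage, the make_key callback, and the use of a background thread as the "agent" are all hypothetical choices for illustration.

```python
import threading


class LookupService:
    """Hypothetical sketch: index view plus a background agent that prefetches documents."""

    def __init__(self, storage, make_key):
        self._storage = storage    # stand-in for slow storage: doc_id -> document
        self._make_key = make_key  # builds an index key from a document
        self.index_view = {}       # index key -> doc_id
        self._local = {}           # index key -> document already obtained
        self._lock = threading.Lock()

    def build_index_view(self):
        """Generate an index key per document and populate the index view."""
        for doc_id, document in self._storage.items():
            self.index_view[self._make_key(document)] = doc_id

    def _prefetch(self):
        """Agent body: obtain every document listed in the index view."""
        for key, doc_id in list(self.index_view.items()):
            document = self._storage[doc_id]
            with self._lock:
                self._local[key] = document

    def start_agent(self):
        """Run the agent in the background and return its thread."""
        agent = threading.Thread(target=self._prefetch, daemon=True)
        agent.start()
        return agent

    def retrieve(self, user_key):
        """Serve a request from the already-obtained documents, or None if absent."""
        with self._lock:
            return self._local.get(user_key)
```

In this sketch a request is answered only from the local copy, which mirrors the idea that the document has already been retrieved from storage by the time the user asks for it; a real system would also need a fallback for documents the agent has not yet fetched.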

Description

  • The current application is a continuation application of co-pending U.S. patent application Ser. No. 11/095,997, filed on Mar. 31, 2005, which is hereby incorporated by reference.
  • BACKGROUND OF THE INVENTION
  • 1. Field of the Invention
  • The present invention generally relates to data lookup. Specifically, the present invention relates to a system and program product for providing high performance data lookup (e.g., document retrieval).
  • 2. Related Art
  • As the use of information technology (IT) continues to increase, a growing number of organizations are turning to IT-based solutions for their data storage needs. For example, today an organization can store a countless number of “documents” electronically while consuming very little physical space. Such an IT-based approach can not only save overhead costs, but also allow for improved redundancy. Moreover, when storing documents electronically, computerized access can be provided for authorized individuals from virtually any location.
  • Unfortunately, electronic document storage has various drawbacks. For example, in order to provide efficient access to electronic documents, they must be indexed in some manner. Moreover, requests for documents must be handled correctly. Due to the manner in which the documents can be stored, there is often a latency involved with their retrieval.
  • In view of the foregoing, there exists a need for a method, system and program product for providing high performance data lookup. Specifically, a need exists for a methodology and a “view” in which stored documents can be indexed for rapid retrieval.
  • SUMMARY OF THE INVENTION
  • In general, the present invention provides a method, system and program product for providing high performance data lookup. Under the present invention, index keys are generated for a set of documents. This is typically accomplished by examining the set of documents, and connecting data values extracted from the set of documents to yield the index keys. Once the index keys are generated, an index view will be generated into which the index keys are populated. Using the index keys in the index view, an agent will automatically obtain the set of documents (i.e., in the background). Then, when a user requests one of the documents, the document will already have been retrieved from storage. As such, it can readily be provided to the user.
  • A first aspect of the present invention provides a method for providing high performance data lookup, comprising: extracting data values from each of a set of documents; creating index keys for the set of documents using the extracted data values; populating the index keys into an index view; and automatically obtaining the set of documents using the index keys in the index view.
  • A second aspect of the present invention provides a method for providing high performance data lookup, comprising: generating index keys for a set of documents; populating an index view with the index keys; automatically obtaining the set of documents using the index keys; receiving a request for a desired document; and retrieving the desired document from the obtained set of documents based on the request and the index keys.
  • A third aspect of the present invention provides a system for providing high performance data lookup, comprising: means for generating index keys for a set of documents; means for populating an index view with the index keys; means for automatically obtaining the set of documents using the index keys; means for receiving a request for a desired document; and means for retrieving the desired document from the obtained set of documents based on the request and the index keys.
  • A fourth aspect of the present invention provides a program product stored on a computer readable medium for providing high performance data lookup, the computer readable medium comprising program code for performing the following steps: generating index keys for a set of documents; populating an index view with the index keys; automatically obtaining the set of documents using the index keys; receiving a request for a desired document; and retrieving the desired document from the obtained set of documents based on the request and the index keys.
  • A fifth aspect of the present invention provides a method for deploying an application for providing high performance data lookup, comprising: providing a computer infrastructure being operable to: generate index keys for a set of documents; populate an index view with the index keys; automatically obtain the set of documents using the index keys; receive a request for a desired document; and retrieve the desired document from the obtained set of documents based on the request and the index keys.
  • A sixth aspect of the present invention provides computer software for deploying an application for providing high performance data lookup, the computer software comprising instructions for causing a computer system to perform the following functions: generate index keys for a set of documents; populate an index view with the index keys; automatically obtain the set of documents using the index keys; receive a request for a desired document; and retrieve the desired document from the obtained set of documents based on the request and the index keys.
  • A seventh aspect of the present invention provides a view for indexing documents, comprising: an index key portion for storing index keys for a set of documents, wherein each of the index keys includes a plurality of data values extracted from a corresponding one of the set of documents, and wherein the plurality of data values for the index keys are separated by a connector.
  • An eighth aspect of the invention provides a computer-readable medium that includes computer program code to enable a computer infrastructure to provide high performance data lookup.
  • A ninth aspect of the invention provides a business method for providing high performance data lookup.
  • The illustrative aspects of the present invention are designed to solve the problems herein described and other problems not discussed, which are discoverable by a skilled artisan.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 shows an illustrative system for providing high performance data lookup according to the present invention.
  • FIG. 2 shows an illustrative index view according to the present invention.
  • FIG. 3 shows an illustrative method flow diagram according to the present invention.
  • It is noted that the drawings of the invention are not to scale. The drawings are intended to depict only typical aspects of the invention, and therefore should not be considered as limiting the scope of the invention. In the drawings, like numbering represents like elements between the drawings.
  • DETAILED DESCRIPTION OF THE DRAWINGS
  • As indicated above, the present invention provides a method, system and program product for providing high performance data lookup. Under the present invention, index keys are generated for a set of documents. This is typically accomplished by examining the set of documents, and connecting data values extracted from the set of documents to yield the index keys. Once the index keys are generated, an index view will be generated into which the index keys are populated. Using the index keys in the index view, an agent will automatically obtain the set of documents (i.e., in the background). Then, when a user requests one of the documents, the document will already have been retrieved from storage. As such, it can readily be provided to the user. It should be understood that as used herein, the term “document” is intended to refer to any collection of data that is stored electronically.
  • Referring now to FIG. 1, a system 10 for providing high performance data lookup according to the present invention is shown. As depicted, system 10 includes a computer infrastructure 12, which comprises a computer system 14 that can perform the various process steps described herein. Computer system 14 is intended to represent any type of computer system capable of carrying out the teachings of the present invention. For example, computer system 14 could be a laptop computer, a desktop computer, a workstation, a handheld device, a server, etc. In addition, as will be further described below, computer system 14 can be deployed and/or operated by a service provider that is providing high performance data lookup for users such as user 16. It should be appreciated that user 16 could directly access computer system 14 as shown, or could operate their own independent computer systems that communicate with computer system 14 over a network (e.g., the Internet, a wide area network (WAN), a local area network (LAN), a virtual private network (VPN), etc.). In the case of the latter, communications between computer system 14 and the user-operated computer system can occur via any combination of various types of communications links. For example, the communication links can comprise addressable connections that may utilize any combination of wired and/or wireless transmission methods. Where communications occur via the Internet, connectivity could be provided by conventional TCP/IP sockets-based protocol, and an Internet service provider could be used to establish connectivity to the Internet.
  • In any event, assume that user 16 is authorized to access documents 50 as maintained by organization 18. Under the present invention, high performance data lookup of documents 50 is provided. To provide this functionality, performance lookup system 40 is shown implemented on computer system 14 as computer program code. To this extent, computer system 14 is shown including a processing unit 20, a memory 22, a bus 24, and input/output (I/O) interfaces 26. Further, computer system 14 is shown in communication with external I/O devices/resources 28 and one or more storage systems 30. In general, processing unit 20 executes computer program code, such as performance lookup system 40, that is stored in memory 22 and/or storage system(s) 30. While executing computer program code, processing unit 20 can read and/or write data to/from memory 22, storage system(s) 30, and/or I/O interfaces 26. Bus 24 provides a communication link between each of the components in computer system 14. External devices 28 can comprise any devices (e.g., keyboard, pointing device, display, etc.) that enable a user to interact with computer system 14 and/or any devices (e.g., network card, modem, etc.) that enable computer system 14 to communicate with one or more other computing devices, such as those in organization 18 and/or operated by user 16.
  • Computer infrastructure 12 is only illustrative of various types of computer infrastructures for implementing the invention. For example, in one embodiment, computer infrastructure 12 comprises two or more computing devices (e.g., a server cluster) that communicate over a network to perform the various process steps of the invention. Moreover, computer system 14 is only representative of various possible computer infrastructures that can include numerous combinations of hardware. To this extent, in other embodiments, computer system 14 can comprise any specific purpose computing article of manufacture comprising hardware and/or computer program code for performing specific functions, any computing article of manufacture that comprises a combination of specific purpose and general purpose hardware/software, or the like. In each case, the program code and hardware can be created using standard programming and engineering techniques, respectively. In addition, processing unit 20 may comprise a single processing unit, or be distributed across one or more processing units in one or more locations, e.g., on a client and server. Similarly, memory 22 and/or storage system 30 can comprise any combination of various types of data storage and/or transmission media that reside at one or more physical locations. Further, I/O interfaces 26 can comprise any system for exchanging information with one or more external devices 28. Still further, it is understood that one or more additional components (e.g., system software, math co-processing unit, etc.) not shown in FIG. 1 can be included in computer system 14. However, if computer system 14 comprises a handheld device or the like, it is understood that one or more external devices 28 (e.g., a display) and/or storage system(s) 30 could be contained within computer system 14, not externally as shown.
  • Storage system 30 can be any type of system (e.g., a database) capable of providing storage for information under the present invention. To this extent, storage system 30 could include one or more storage devices, such as a magnetic disk drive or an optical disk drive. In another embodiment, storage system 30 includes data distributed across, for example, a local area network (LAN), wide area network (WAN) or a storage area network (SAN) (not shown). Although not shown, additional components, such as cache memory, communication systems, system software, etc., may be incorporated into computer system 14. Moreover, although not shown for brevity purposes, any computer systems operated by user 16 will likely contain computerized components similar to computer system 14. It should also be understood that organization 18 and documents 50 could be contained within infrastructure 12. They are shown as independent systems for illustrative purposes only.
  • Shown in memory 22 of computer system 14 is performance lookup system 40, which includes index key system 42, index view system 44, document retrieval system 46 and request processing system 48. Operation of each of these systems is discussed further below. However, it is understood that some of the various systems shown in FIG. 1 can be implemented independently, combined, and/or stored in memory for one or more separate computer systems 14 that communicate over a network. Further, it is understood that some of the systems/functionality may not be implemented and/or additional systems/functionality may be included as part of the present invention. Still yet, it is understood that the depiction of these systems shown in FIG. 1 is illustrative only and that the same functionality could be achieved with a different configuration. That is, the functionality of these systems could be combined into fewer systems, or broken down into additional systems.
  • Under the present invention, high performance lookup of documents 50 is provided. First, index key system 42 will create an index key for each document. To create the index keys, index key system 42 will analyze documents 50 and extract data values therefrom. These data values will be connected, with a separator between adjacent values, to yield strings of data (with each string corresponding to a particular document). In addition, as will be shown below in conjunction with FIG. 2, the data values are positioned in the index keys in a descending hierarchical fashion.
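The key construction described above can be illustrated with a minimal Python sketch. It is not taken from the specification: the field names in `FIELD_ORDER`, the sample document, and the function name `make_index_key` are hypothetical, and only the year-then-month ordering and the tilde separator follow the description of FIG. 2.

```python
SEPARATOR = "~"

# Hypothetical field order, most general value first (descending hierarchy).
FIELD_ORDER = ("year", "month", "department", "doc_type", "doc_id")


def make_index_key(document, fields=FIELD_ORDER, sep=SEPARATOR):
    """Connect the data values extracted from one document into one index key."""
    values = []
    for name in fields:
        value = str(document[name])
        # A value containing the separator would make the key ambiguous.
        if sep in value:
            raise ValueError(f"value for {name!r} contains the separator")
        values.append(value)
    return sep.join(values)


# Hypothetical extracted data values for a single document.
doc = {"year": 2004, "month": "04", "department": "sales",
       "doc_type": "invoice", "doc_id": "000123"}
key = make_index_key(doc)
print(key)
```

Rejecting values that contain the separator is a design choice of this sketch; the specification does not say how such values would be handled.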
  • Once the index keys have been generated, index view system 44 will generate an index view into which the index keys are populated. Referring now to FIG. 2, index view 60 is shown in greater detail. As depicted, index view 60 includes a key window 62 where index keys 64 are listed. The index keys 64 shown each include multiple data values as extracted from a corresponding document. In a typical embodiment, each index key 64 includes five or more data values. As further shown, each data value is separated from the next by a separator such as a tilde (~). In addition, as indicated above, the data values are arranged within each index key in a descending hierarchical fashion (e.g., year 2004 is first, month 04 is second, etc.). Document type window 66 allows a specific type of document 68 to be selected for display of its corresponding index keys in key window 62.
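One way to picture the index view is as a collection of index keys grouped by document type, from which the keys for a selected type can be listed. The Python sketch below is illustrative only: the class `IndexView`, its methods, and the sample keys are hypothetical. It also assumes the leading data values are fixed-width (e.g., zero-padded months), so that a plain string sort follows the descending hierarchy of the key.

```python
from collections import defaultdict


class IndexView:
    """Hypothetical in-memory stand-in for index view 60."""

    def __init__(self):
        self._keys_by_type = defaultdict(list)

    def populate(self, doc_type, index_key):
        """Add one index key under its document type."""
        self._keys_by_type[doc_type].append(index_key)

    def keys_for(self, doc_type):
        """Return the index keys of the selected document type, sorted."""
        return sorted(self._keys_by_type.get(doc_type, []))


view = IndexView()
view.populate("invoice", "2004~04~sales~invoice~000123")
view.populate("invoice", "2003~12~sales~invoice~000045")
view.populate("report", "2004~01~hr~report~000007")

for index_key in view.keys_for("invoice"):
    print(index_key)
```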
  • Referring back to FIG. 1, once index view 60 has been populated, document retrieval system 46 will automatically retrieve the documents using their index keys 64 (FIG. 2). Specifically, under the present invention, document retrieval system 46 includes an automated agent or the like that analyzes the index keys 64, and obtains the corresponding documents 50. At that point, the documents 50 can be considered “local” to computer system 14 (e.g., in memory 22 or storage system 30).
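The background retrieval step can be sketched as a small agent that walks the index keys and copies each corresponding document into local storage before any request arrives. This is a simplified illustration under stated assumptions: `REMOTE_STORE` is a hypothetical dictionary standing in for the organization's storage, `prefetch_documents` is a hypothetical name, and a thread is used only to suggest background operation.

```python
import threading

# Hypothetical stand-in for documents 50 held by the organization,
# addressed here directly by index key.
REMOTE_STORE = {
    "2003~12~sales~invoice~000045": "Invoice 45 body",
    "2004~01~hr~report~000007": "Report 7 body",
    "2004~04~sales~invoice~000123": "Invoice 123 body",
}


def prefetch_documents(index_keys, fetch, local_cache):
    """Copy every document named by an index key into the local cache."""
    for key in index_keys:
        if key not in local_cache:
            local_cache[key] = fetch(key)


local_cache = {}
agent = threading.Thread(
    target=prefetch_documents,
    args=(sorted(REMOTE_STORE), REMOTE_STORE.__getitem__, local_cache),
)
agent.start()   # runs in the background while other work continues
agent.join()    # joined here only so the demonstration is deterministic

print(len(local_cache), "documents are now local")
```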
  • Then, if user 16 requests a certain document, user 16 will issue a request via a user view that is received by request processing system 48. Upon receipt, request processing system 48 will parse the request to determine what document is being requested, and then retrieve that document by cross-referencing the index key for that document. In one embodiment, request processing system 48 can analyze the request and generate a user key for the requested document. The user key can resemble or be similar to the index key for that document. To this extent, request processing system 48 can be configured similarly to index key system 42. A user key that is determined, upon comparison, to be identical or sufficiently similar to an index key could correspond to the requested document. Such document would then be retrieved (e.g., from local storage of computer system 14) and returned to or displayed to user 16.
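The comparison of a user key against the index keys can be sketched as follows. The specification does not define "sufficiently similar"; this example assumes, purely for illustration, a similarity ratio from Python's standard `difflib` module with a hypothetical cutoff of 0.9. The names `find_document` and `local_docs` and the sample keys are likewise hypothetical.

```python
import difflib


def find_document(user_key, local_docs, cutoff=0.9):
    """Return the locally held document whose index key matches the user key.

    An identical key wins outright; otherwise the closest index key is
    accepted only if its similarity ratio reaches the (assumed) cutoff.
    """
    if user_key in local_docs:
        return local_docs[user_key]
    close = difflib.get_close_matches(user_key, list(local_docs), n=1, cutoff=cutoff)
    return local_docs[close[0]] if close else None


# Hypothetical documents already obtained and stored locally by index key.
local_docs = {
    "2004~04~sales~invoice~000123": "Invoice 123 body",
    "2003~12~hr~report~000007": "Report 7 body",
}

exact = find_document("2004~04~sales~invoice~000123", local_docs)
near = find_document("2004~04~sales~invoice~00123", local_docs)  # one digit short
missing = find_document("1999~01~legal~memo~999999", local_docs)
```

A stricter cutoff reduces the chance of returning the wrong document when index keys differ only in a trailing identifier; an implementation could instead require identical keys.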
  • Referring now to FIG. 3, a method flow diagram 100 according to the present invention is shown. First step S1 is to generate index keys for a set of documents. In general, this involves examining the set of documents, and connecting data values extracted from the set of documents to yield the index keys. Second step S2 is to populate an index view with the index keys. Third step S3 is to automatically obtain the set of documents using the index keys. Fourth step S4 is to receive a request for a desired document and fifth step S5 is to retrieve the desired document from the obtained set of documents based on the request and the index keys.
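Steps S1 through S5 can be strung together in a compact Python sketch. It is a simplified illustration, not the claimed implementation: the three-field key, the `location` attribute, the dictionary-based stores, and the function names are all hypothetical, and exact key matching is assumed in step S5.

```python
SEP = "~"
FIELDS = ("year", "month", "doc_type")  # hypothetical data values


def make_key(values):
    """Connect data values in descending hierarchical order."""
    return SEP.join(str(values[f]) for f in FIELDS)


def run_flow(records, remote_store, request):
    # S1: generate an index key for each document.
    key_to_location = {make_key(r): r["location"] for r in records}
    # S2: populate the index view (here, a sorted list of keys).
    index_view = sorted(key_to_location)
    # S3: obtain every document ahead of any request.
    local = {k: remote_store[key_to_location[k]] for k in index_view}
    # S4: receive the request and derive a user key from it.
    user_key = make_key(request)
    # S5: retrieve the desired document from the locally held set.
    return local.get(user_key)


records = [
    {"year": 2004, "month": "04", "doc_type": "invoice", "location": "loc-1"},
    {"year": 2004, "month": "01", "doc_type": "report", "location": "loc-2"},
]
remote_store = {"loc-1": "Invoice body", "loc-2": "Report body"}

found = run_flow(records, remote_store,
                 {"year": 2004, "month": "04", "doc_type": "invoice"})
print(found)
```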
  • While shown and described herein as a method and system for providing high performance data lookup, it is understood that the invention further provides various alternative embodiments. For example, in one embodiment, the invention provides a computer-readable medium that includes computer program code to enable a computer infrastructure to provide high performance data lookup within organizations. To this extent, the computer-readable medium includes program code that implements each of the various process steps of the invention. It is understood that the term “computer-readable medium” comprises one or more of any type of physical embodiment of the program code. In particular, the computer-readable medium can comprise program code embodied on one or more portable storage articles of manufacture (e.g., a compact disc, a magnetic disk, a tape, etc.), on one or more data storage portions of a computing device, such as memory 22 (FIG. 1) and/or storage system 30 (FIG. 1) (e.g., a fixed disk, a read-only memory, a random access memory, a cache memory, etc.).
  • In another embodiment, the invention provides a business method that performs the process steps of the invention on a subscription, advertising, and/or fee basis. That is, a service provider, such as an Internet Service Provider, could offer to provide high performance data lookup as described above. In this case, the service provider can create, maintain, support, etc., a computer infrastructure, such as computer infrastructure 12 (FIG. 1) that performs the process steps of the invention for one or more customers. In return, the service provider can receive payment from the customer(s) under a subscription and/or fee agreement and/or the service provider can receive payment from the sale of advertising content to one or more third parties.
  • In still another embodiment, the invention provides a method of providing high performance data lookup. In this case, a computer infrastructure, such as computer infrastructure 12 (FIG. 1), can be provided and one or more systems for performing the process steps of the invention can be obtained (e.g., created, purchased, used, modified, etc.) and deployed to the computer infrastructure. To this extent, the deployment of a system can comprise one or more of (1) installing program code on a computing device, such as computer system 14 (FIG. 1), from a computer-readable medium; (2) adding one or more computing devices to the computer infrastructure; and (3) incorporating and/or modifying one or more existing systems of the computer infrastructure to enable the computer infrastructure to perform the process steps of the invention.
  • As used herein, it is understood that the terms “program code” and “computer program code” are synonymous and mean any expression, in any language, code or notation, of a set of instructions intended to cause a computing device having an information processing capability to perform a particular function either directly or after either or both of the following: (a) conversion to another language, code or notation; and/or (b) reproduction in a different material form. To this extent, program code can be embodied as one or more of: an application/software program, component software/a library of functions, an operating system, a basic I/O system/driver for a particular computing and/or I/O device, and the like.
  • The foregoing description of various aspects of the invention has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form disclosed, and obviously, many modifications and variations are possible. Such modifications and variations that may be apparent to a person skilled in the art are intended to be included within the scope of the invention as defined by the accompanying claims.

Claims (12)

1. A system for providing high performance data lookup, comprising:
means for generating index keys for a set of documents;
means for populating an index view with the index keys;
means for automatically obtaining the set of documents using the index keys prior to a request for a desired document from a user;
means for receiving a request for the desired document; and
means for retrieving the desired document from the obtained set of documents based on the request and the index keys.
2. The system of claim 1, wherein the means for generating the index keys, comprises:
means for extracting data values from the set of documents; and
means for connecting data values to yield the index strings, wherein each of the index keys comprises a plurality of data values obtained from a corresponding one of the set of documents.
3. The system of claim 2, wherein the plurality of data values comprises at least five data values.
4. The system of claim 1, wherein the obtaining step is performed by an automated agent.
5. The system of claim 1, wherein the request is made by the user via a user view.
6. The system of claim 1, further comprising:
means for generating a user key based on the request; and
means for comparing the user key to the index keys to identify the desired document among the obtained set of documents.
7. A computer program product having a computer readable medium for providing high performance data lookup, the computer program product comprising:
program code stored on a computer readable medium, which when executed would cause the computer to:
generate index keys for a set of documents;
populate an index view with the index keys;
automatically obtain the set of documents using the index keys prior to a request for a desired document by a user;
receive the request for the desired document; and
retrieve the desired document from the obtained set of documents based on the request and the index keys.
8. The program product of claim 7, wherein the computer program product further comprises program code stored on a computer readable medium, which when executed would cause the computer to:
extract data values from the set of documents; and
connect data values to yield the index strings, wherein each of the index keys comprises a plurality of data values obtained from a corresponding one of the set of documents.
9. The program product of claim 8, wherein the plurality of data values comprises at least five data values.
10. The program product of claim 7, wherein the obtaining step is performed by an automated agent.
11. The program product of claim 7, wherein the request is made by the user via a user view.
12. The program product of claim 7, wherein the computer program product further comprises program code stored on a computer readable medium, which when executed would cause the computer to:
generate a user key based on the request; and
compare the user key to the index keys to identify the desired document among the obtained set of documents.
US12/054,792 2005-03-31 2008-03-25 System and program product for providing high performance data lookup Abandoned US20080177783A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US12/054,792 US20080177783A1 (en) 2005-03-31 2008-03-25 System and program product for providing high performance data lookup
US13/617,726 US20130073558A1 (en) 2005-03-31 2012-09-14 System and program product for providing high performance data lookup

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US11/095,997 US7386570B2 (en) 2005-03-31 2005-03-31 Method, system and program product for providing high performance data lookup
US12/054,792 US20080177783A1 (en) 2005-03-31 2008-03-25 System and program product for providing high performance data lookup

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US11/095,997 Continuation US7386570B2 (en) 2005-03-31 2005-03-31 Method, system and program product for providing high performance data lookup

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US13/617,726 Continuation US20130073558A1 (en) 2005-03-31 2012-09-14 System and program product for providing high performance data lookup

Publications (1)

Publication Number Publication Date
US20080177783A1 true US20080177783A1 (en) 2008-07-24

Family

ID=37071837

Family Applications (3)

Application Number Title Priority Date Filing Date
US11/095,997 Active 2026-03-29 US7386570B2 (en) 2005-03-31 2005-03-31 Method, system and program product for providing high performance data lookup
US12/054,792 Abandoned US20080177783A1 (en) 2005-03-31 2008-03-25 System and program product for providing high performance data lookup
US13/617,726 Abandoned US20130073558A1 (en) 2005-03-31 2012-09-14 System and program product for providing high performance data lookup

Country Status (1)

Country Link
US (3) US7386570B2 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10885593B2 (en) * 2015-06-09 2021-01-05 Microsoft Technology Licensing, Llc Hybrid classification system

Patent Citations (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030196078A1 (en) * 1992-06-30 2003-10-16 Wise Adrian P. Data pipeline system and data encoding method
US20020066007A1 (en) * 1992-06-30 2002-05-30 Wise Adrian P. Multistandard video decoder and decompression system for processing encoded bit streams including pipeline processing and methods relating thereto
US6697930B2 (en) * 1992-06-30 2004-02-24 Discovision Associates Multistandard video decoder and decompression method for processing encoded bit streams according to respective different standards
US6435737B1 (en) * 1992-06-30 2002-08-20 Discovision Associates Data pipeline system and data encoding method
US5832479A (en) * 1992-12-08 1998-11-03 Microsoft Corporation Method for compressing full text indexes with document identifiers and location offsets
US5649183A (en) * 1992-12-08 1997-07-15 Microsoft Corporation Method for compressing full text indexes with document identifiers and location offsets
US5745896A (en) * 1994-01-18 1998-04-28 Borland Int Inc Referential integrity in a relational database management system
US5992737A (en) * 1996-03-25 1999-11-30 International Business Machines Corporation Information search method and apparatus, and medium for storing information searching program
US5852822A (en) * 1996-12-09 1998-12-22 Oracle Corporation Index-only tables with nested group keys
US6094649A (en) * 1997-12-22 2000-07-25 Partnet, Inc. Keyword searches of structured databases
US6678687B2 (en) * 1998-01-23 2004-01-13 Fuji Xerox Co., Ltd. Method for creating an index and method for searching an index
US6493717B1 (en) * 1998-06-16 2002-12-10 Datafree, Inc. System and method for managing database information
US7039636B2 (en) * 1999-02-09 2006-05-02 Hitachi, Ltd. Document retrieval method and document retrieval system
US7720799B2 (en) * 1999-04-13 2010-05-18 Indraweb, Inc. Systems and methods for employing an orthogonal corpus for document indexing
US6714927B1 (en) * 1999-08-17 2004-03-30 Ricoh Company, Ltd. Apparatus for retrieving documents
US7188104B2 (en) * 1999-08-17 2007-03-06 Ricoh Company, Ltd. Apparatus for retrieving documents
US6421686B1 (en) * 1999-11-15 2002-07-16 International Business Machines Corporation Method of replicating data records
US6457029B1 (en) * 1999-12-22 2002-09-24 International Business Machines Corporation Computer method and system for same document lookup with different keywords from a single view
US7275061B1 (en) * 2000-04-13 2007-09-25 Indraweb.Com, Inc. Systems and methods for employing an orthogonal corpus for document indexing
US6842878B1 (en) * 2000-09-29 2005-01-11 International Business Machines Corporation Method to document relations between objects using a graphical interface tree component
US20020046131A1 (en) * 2000-10-16 2002-04-18 Barry Boone Method and system for listing items globally and regionally, and customized listing according to currency or shipping area
US6751628B2 (en) * 2001-01-11 2004-06-15 Dolphin Search Process and system for sparse vector and matrix representation of document indexing and retrieval
US7328204B2 (en) * 2001-01-11 2008-02-05 Aric Coady Process and system for sparse vector and matrix representation of document indexing and retrieval
US20060242102A1 (en) * 2005-04-21 2006-10-26 Microsoft Corporation Relaxation-based approach to automatic physical database tuning
US20090063400A1 (en) * 2007-09-05 2009-03-05 International Business Machines Corporation Apparatus, system, and method for improving update performance for indexing using delta key updates

Also Published As

Publication number Publication date
US20060224614A1 (en) 2006-10-05
US7386570B2 (en) 2008-06-10
US20130073558A1 (en) 2013-03-21

Legal Events

Date Code Title Description
AS Assignment

Owner name: GOOGLE INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:INTERNATIONAL BUSINESS MACHINES CORPORATION;REEL/FRAME:027463/0594

Effective date: 20111228

AS Assignment

Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW Y

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:BANGEL, MATTHEW J.;HICKS, SCOTT D.;MARTIN, JAMES A., JR.;AND OTHERS;SIGNING DATES FROM 20050316 TO 20050330;REEL/FRAME:031511/0663

STCB Information on status: application discontinuation

Free format text: ABANDONED -- AFTER EXAMINER'S ANSWER OR BOARD OF APPEALS DECISION

AS Assignment

Owner name: GOOGLE LLC, CALIFORNIA

Free format text: CHANGE OF NAME;ASSIGNOR:GOOGLE INC.;REEL/FRAME:044142/0357

Effective date: 20170929