WO1996030845A1 - Method and apparatus for improved information storage and retrieval system

Publication number
WO1996030845A1
WO1996030845A1 PCT/US1996/001260 US9601260W WO9630845A1 WO 1996030845 A1 WO1996030845 A1 WO 1996030845A1 US 9601260 W US9601260 W US 9601260W WO 9630845 A1 WO9630845 A1 WO 9630845A1
Authority
WO
WIPO (PCT)
Prior art keywords
column
row
rows
oid
record
Application number
PCT/US1996/001260
Inventor
Scott Wlaschin
Robert M. Gordon
Louise J. Wannier
Clay Gordon
Original Assignee
Dex Information Systems
Family has litigation
Application filed by Dex Information Systems
Priority to EP96905298A (EP0818010A1)
Priority to AU49104/96A (AU4910496A)
Publication of WO1996030845A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/221 Column-oriented storage; Management thereof
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 Data processing: database and file management or data structures
    • Y10S707/912 Applications of a database
    • Y10S707/917 Text
    • Y10S707/953 Organization of data
    • Y10S707/954 Relational
    • Y10S707/99931 Database or file accessing
    • Y10S707/99933 Query processing, i.e. searching
    • Y10S707/99934 Query formulation, input preparation, or translation
    • Y10S707/99941 Database schema or data structure
    • Y10S707/99943 Generating database or data structure, e.g. via user interface

Definitions

  • the present invention relates generally to a method and apparatus for storing, retrieving, and distributing various kinds of data, and more particularly, to an improved database architecture and method for using the same.
  • databases comprised a simple "flat file" with an associated index.
  • Application programs, as opposed to the database program itself, managed the relationships between these files, and a user typically performed queries entirely at the application program level.
  • the introduction of relational database systems shifted many tasks from applications programs to database programs.
  • the currently existing database management systems comprise two main types, those that follow the relational model and those that follow the object oriented model.
  • the relational model sets out a number of rules and guidelines for organizing data items, such as data normalization.
  • a relational database management system (RDBMS) is a system that adheres to these rules.
  • RDBMS databases require that each data item be uniquely classified as a particular instance of a 'relation'.
  • Each set of relations is stored in a distinct 'table'.
  • Each row in the table represents a particular data item, and each column represents an attribute that is shared over all data items in that table.
  • the pure relational model places a number of restrictions on data items. For example, a data item cannot have attributes other than the columns described for its table. Further, an item cannot point directly to another item. Instead, 'primary keys' (unique identifiers) must be used to reference other items. Typically, these restrictions cause RDBMS databases to include a large number of tables that require a relatively large amount of time to search.
  • the object oriented database model, derived from the object-oriented programming model, is an alternative to the relational model. As in the relational model, each data item must be classified uniquely as belonging to a single class, which defines its attributes. Key features of the object-oriented model are: 1) each item has a unique system-generated object identification number that can be used for exact retrieval; 2) different types of data items can be stored together; and 3) predefined functions or behavior can be created and stored with a data item.
  • both the relational and object oriented models share important limitations with regard to data structures and searching. Both models require data to be input according to a defined field structure and thus do not completely support full text data entry. Although some databases allow records to include a text field, such text fields are not easily searched. The structural requirements of current databases require a
  • word and image processors that allow unstructured data entry do not provide efficient data retrieval mechanisms and a separate text retrieval or data management tool is required to retrieve data.
  • the current information management systems do not provide the capability of integrating full text or graphics data entry with the searching mechanisms of a database.
  • the separation of database from other programs such as word processors has created a large amount of text and other files that cannot be integrated with current databases.
  • Various database, spreadsheet, image, word processing, electronic mail and other types of files may not currently be accessed in a single database that contains all of this information.
  • Various programs provide integration between spreadsheet, word processing and database programs but, as previously described, current databases do not support effective searching in unstructured files.
  • the present invention overcomes the limitations of both the relational database model and object oriented database model by providing a database with increased flexibility, faster search times and smaller memory requirements and that supports text attributes. Further, the database of the present invention does not require a programmer to preconfigure a structure to which a user must adapt data entry. Many algorithms and techniques are required by applications that deal with these kinds of information.
  • the present invention provides for the integration, into a single database engine, of support for these techniques, and shifts the programming from the application to the database, as will be described below.
  • the present invention also provides for the integration, into a single database, of preexisting source files developed under various types of application programs such as other databases, spreadsheets and word processing programs.
  • the present invention allows users to control all of the data that are relevant to them without sacrificing the security needs of a centralized data repository.
  • the present invention improves upon prior art information search and retrieval systems by employing a flexible, self-referential table to store data.
  • the table of the present invention may store any type of data, both structured and unstructured, and provides an interface to other application programs such as word processors that allows for integration of all the data for such application programs into a single database.
  • the present invention also supports a variety of other features including hypertext.
  • the table of the present invention comprises a plurality of rows and columns. Each row has an object identification number (OID) and each column also has an OID.
  • a row corresponds to a record and a column corresponds to an attribute such that the intersection of a row and a column comprises a cell that may contain data for a particular record related to a particular attribute. A cell may also point to another record.
  • columns are entered as rows in the table and the record corresponding to a column contains various information about the column. This renders the table self referential and provides numerous advantages, as will be discussed in this Specification.
  • the present invention includes an index structure to allow for rapid searches. Text from each cell is stored in a key word index which itself is stored in the table. The text cells include pointers to the entries in the key word index and the key word index contains pointers to the cells. This two way association provides for extended queries. The invention further includes weights and filters for such extended queries.
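  • As a minimal sketch of the two way association described above, the following keeps one mapping from key words to cells and a second mapping from cells back to their key words. The names `KeywordIndex`, `add_cell`, `search` and `extended_search` are hypothetical, and the one-hop expansion used for the extended query is an assumption made for illustration; the weights and filters mentioned above are not modeled.

```python
import re
from collections import defaultdict

# Hypothetical names throughout; a cell is addressed as (row OID, column OID).
CellRef = tuple[int, int]


class KeywordIndex:
    def __init__(self) -> None:
        self.cells_by_word: dict[str, set[CellRef]] = defaultdict(set)
        self.words_by_cell: dict[CellRef, set[str]] = defaultdict(set)

    def add_cell(self, row_oid: int, col_oid: int, text: str) -> None:
        cell = (row_oid, col_oid)
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            self.cells_by_word[word].add(cell)  # index entry points to cell
            self.words_by_cell[cell].add(word)  # cell points to index entry

    def search(self, word: str) -> set[CellRef]:
        return set(self.cells_by_word.get(word.lower(), set()))

    def extended_search(self, word: str) -> set[CellRef]:
        """Assumed one-hop expansion: cells sharing any word with a direct hit."""
        direct = self.search(word)
        related: set[CellRef] = set()
        for cell in direct:
            for other in self.words_by_cell[cell]:
                related |= self.cells_by_word[other]
        return related - direct
```

  • Because both directions are stored, a query can move from a word to its cells and from those cells to further words without rescanning the table.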
  • the present invention includes a thesaurus and knowledge base that enhances indexed searches.
  • the thesaurus is stored in the table and allows a user to search for synonyms and concepts and also provides a weighting mechanism to rank the relevance of retrieved records.
  • An application support layer includes a word processor, a password system, hypertext and other functions.
  • the novel word processor of the present invention is integrated with the table of the present invention to allow cells to be edited with the word processor.
  • the table may be interfaced with external documents which allows a user to retrieve data from external documents according to the enhanced retrieval system of the present invention.
  • FIG. 1 is a functional block diagram illustrating one possible computer system incorporating the teachings of the present invention.
  • FIG. 2 is a block diagram illustrating the main components of the present invention.
  • FIG. 3 illustrates the table structure of the database of the present invention.
  • FIG. 4 is a flow chart for a method of computing object identification numbers (OID's) that define rows and columns in the table of Fig. 1.
  • FIG. 5 is a part of the table of Fig. 2 illustrating the column
  • FIG. 6 is a flow chart for a method of searching the table of Fig. 2.
  • FIG. 7a is a flow chart for synchronizing columns of the table of Fig. 2.
  • FIG. 7b illustrates the results of column synchronization.
  • FIG. 8a illustrates a reference within one column to another column.
  • FIG. 8b illustrates an alternate embodiment for referring to another column within a column.
  • FIG. 9 illustrates a "Record Contents" column of the present invention that indicates which columns of a particular record have values.
  • FIG. 10 illustrates a folder structure that organizes records. The folder structure is stored within the table of Fig. 2.
  • FIG. 11 illustrates the correspondence between cells of the table of Fig. 2 and a sorted key word index.
  • FIG. 12 illustrates the "anchors" within a cell that relate a word in a cell to a key word index record.
  • FIG. 13 illustrates key word index records stored in the table of Fig. 2.
  • FIG. 14 illustrates the relationship between certain data records and key word index records.
  • FIG. 15 illustrates the relationship of Fig. 14 in graphical form.
  • FIG. 16a illustrates an extended search in graphical form.
  • FIG. 16b illustrates a further extended search in graphical form.
  • FIG. 17 illustrates the thesaurus structure of the present invention stored in the table of Fig. 2.
  • FIG. 18 illustrates prior art hypertext.
  • FIG. 19 illustrates the hypertext features of the present invention.
  • FIG. 20a illustrates a character and word box structure of the word processor of the present invention.
  • FIG. 20b illustrates the word and horizontal line box structure of the word processor of the present invention.
  • FIG. 20c illustrates the vertical box structure of the word processor of the present invention.
  • FIG. 21 illustrates the box tree structure of the word processor of the present invention.
  • FIG. 22a illustrates the results of a prior art sorting algorithm.
  • FIG. 22b illustrates the results of a sorting algorithm according to the present invention.
  • FIG. 23 illustrates the correspondence between cells of the table of Fig. 2 and a sorted date index.
  • the manipulations performed are often referred to in terms, such as adding or comparing, which are commonly associated with mental operations performed by a human operator. No such capability of a human operator is necessary, or desirable in most cases, in any of the operations described herein which form part of the present invention; the operations are machine operations.
  • Useful machines for performing the operations of the present invention include general purpose digital computers or other similar digital devices. In all cases there should be borne in mind the distinction between the method operations in operating a computer and the method of computation itself.
  • the present invention relates to method steps for operating a computer in processing electrical or other (e.g., mechanical, chemical) physical signals to generate other desired physical signals.
  • the present invention also relates to apparatus for performing these operations.
  • This apparatus may be specially constructed for the required purposes or it may comprise a general purpose computer as selectively activated or reconfigured by a computer program stored in the computer.
  • the algorithms presented herein are not inherently related to a particular computer or other apparatus.
  • various general purpose machines may be used with programs written in accordance with the teachings herein, or it may prove more convenient to construct more specialized apparatus to perform the required method steps. The required structure for a variety of these machines will appear from the description given below.
  • the present invention discloses methods and apparatus for data storage, manipulation and retrieval. Although the present invention is described with reference to specific block diagrams, and table entries, etc., it will be appreciated by one of ordinary skill in the art that such details are disclosed simply to provide a more thorough understanding of the present invention. It will therefore be apparent to one skilled in the art that the present invention may be practiced without these specific details.
  • Figure 1 illustrates an information storage and retrieval system structured in accordance with the teachings of the present invention.
  • the information storage and retrieval system includes a computer 20 which comprises four major components. The first of these is an input/output (I/O) circuit 22, which is used to communicate information in appropriately structured form to and from other portions of the computer 20.
  • computer 20 includes a central processing unit (CPU) 24 coupled to the I/O circuit 22 and to a memory 26.
  • a keyboard 30 is provided for inputting data and commands into the computer 20 through the I/O circuit 22, as is well known.
  • a CD ROM 34 is coupled to the I/O circuit 22 for providing additional programming capacity to the system illustrated in Figure 1. It will be appreciated that additional devices may be coupled to the computer 20 for storing data, such as magnetic tape drives, buffer memory devices, and the like.
  • a device control 36 is coupled to both the memory 26 and the I/O circuit 22, to permit the computer 20 to communicate with multi-media system resources. The device control 36 controls operation of the multi-media resources to interface the multi-media resources to the computer 20.
  • a display monitor 43 is coupled to the computer 20 through the I/O circuit 22.
  • a cursor control device 45 includes switches 47 and 49 for signaling the CPU 24 in accordance with the teachings of the present invention.
  • the cursor control device 45 (commonly referred to as a "mouse") permits a user to select various command modes, modify graphic data, and input other data utilizing switches 47 and 49. More particularly, the cursor control device 45 permits a user to selectively position a cursor 39 at any desired location on a display screen 37 of the display 43.
  • the cursor control device 45 and the keyboard 30 are examples of a variety of input devices which may be utilized in accordance with the teachings of the present invention. Other input devices, including for example, trackballs, touch screens, data gloves or other virtual reality devices may also be used in conjunction with the invention as disclosed herein.
  • FIG. 2 is a block diagram of the information storage and retrieval system of the present invention.
  • the present invention includes an internal database 52 that further includes a record oriented database 74 and a free-text database 76.
  • the database 52 may receive data from a plurality of external sources 50, including word processing documents 58, spreadsheets 60 and database files 62.
  • the present invention includes an application support system that interfaces the external sources 50 with the database 52.
  • a plurality of indexes 54 including a keyword index 78 and other types of indexes such as phonetic, special sorting for other languages, and market specific such as chemical, legal and medical, store sorted information provided by the database 52.
  • a knowledge system 56 links information existing in the indexes 54.
  • the Specification will first describe the structure and features of the database 52. Next, the Specification will describe the index 54 and its implementation for searching the database 52. The Specification will then describe the knowledge system 56 that further enhances the index 54 by providing synonyms and other elements. Finally, the Specification will describe an interface between the external application programs 50 and the database 52, including a novel structured word processor and a novel password scheme.
  • FIG. 3 illustrates the storage and retrieval structure of the present invention.
  • the storage and retrieval structure of the present invention comprises a table 100.
  • the structure of the table 100 is a logical structure and not necessarily a physical structure.
  • the memories 26 and 32 configured according to the teachings of the present invention need not store the table 100 contiguously.
  • the table 100 further comprises a plurality of rows 110 and a plurality of columns 120.
  • a row corresponds to a record while a column corresponds to an attribute of a record and the defining characteristics of the column are stored in a row 108.
  • the intersection of a row and a column comprises a particular cell.
  • Each row is assigned a unique object identification number (OID) stored in column 120 and each column also is assigned a unique OID, indicated in brackets and stored in row 108.
  • row 110 has an OID equal to
  • the column 122 has an OID equal to 101.
  • the OID's for both rows and columns may be used as pointers and a cell 134 may store an OID. The method for assigning the OID's will also be discussed below.
  • each row may include information in each column. However, a row need not, and generally will not, have data stored in every column.
  • row 110 corresponds to a company as shown in a cell 130. Since companies do not have titles, cell 132 is unused.
  • the type of information associated with a column is known as a 'domain'. Standard domains supported in most database systems include text, number, date, and Boolean.
  • the present invention includes other types of domains such as the OID domain that points to a row or column.
  • the present invention further supports 'user-defined' domains, whereby all the behavior of the domain can be determined by a user or programmer. For example, a user may configure a domain to include writing to and reading from a storage medium and handling operations such as equality testing and comparisons.
  • individual cells may be accessed according to their row and column OID's.
  • Using the cell as the unit of storage improves many standard data management operations that previously required the entire object or record. Such operations include versioning, security, hierarchical storage management, appending to remote partitions, printing, and other operations.
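  • A minimal sketch of cell-level access, assuming a dictionary keyed by (row OID, column OID) pairs. The class name `SparseTable` and the column OIDs 102 and 103 are hypothetical; only cells that actually hold data occupy storage, which mirrors the statement that a row generally will not have data in every column.

```python
from typing import Any, Optional


class SparseTable:
    """Hypothetical cell store: one entry per populated (row OID, column OID)."""

    def __init__(self) -> None:
        self._cells: dict[tuple[int, int], Any] = {}

    def set_cell(self, row_oid: int, col_oid: int, value: Any) -> None:
        self._cells[(row_oid, col_oid)] = value

    def get_cell(self, row_oid: int, col_oid: int) -> Optional[Any]:
        # An unused cell simply has no entry.
        return self._cells.get((row_oid, col_oid))

    def record(self, row_oid: int) -> dict[int, Any]:
        """Collect the populated cells of one row, keyed by column OID."""
        return {c: v for (r, c), v in self._cells.items() if r == row_oid}


# Assumed column OIDs for illustration only.
LABEL_COLUMN, TITLE_COLUMN = 102, 103
table = SparseTable()
table.set_cell(1100, LABEL_COLUMN, "DEXIS")
```

  • In this sketch `table.get_cell(1100, TITLE_COLUMN)` returns `None`, since a company record has no title and the cell was never written.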
  • Each column has an associated column definition, which determines the properties of the column, such as the domain of the column, the name of the column, whether the column is required and other properties that may relate to a column.
  • the table 100 supports columns that include unstructured, free text data.
  • the column definition is stored as a record in the table 100 of Figure 3.
  • the "Employed By" column 126 has a corresponding row 136.
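  • The idea of storing a column definition as a record can be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: the OIDs 1, 2, 3, 40 and 2000, the helper names `define_column` and `column_names`, and the name "Jane Doe" are all hypothetical.

```python
# Hypothetical OIDs for the columns that describe other columns.
LABEL, DOMAIN, REQUIRED = 1, 2, 3
EMPLOYED_BY = 40  # assumed OID for the "Employed By" column

# row OID -> {column OID -> value}; column definitions live in the same table.
table: dict[int, dict[int, object]] = {}


def define_column(col_oid: int, label: str, domain: str, required: bool = False) -> None:
    """A column definition is itself a row whose cells describe the column."""
    table[col_oid] = {LABEL: label, DOMAIN: domain, REQUIRED: required}


# The describing columns are defined through the same mechanism,
# so the table describes itself.
define_column(LABEL, "Label", "text", required=True)
define_column(DOMAIN, "Domain", "text", required=True)
define_column(REQUIRED, "Required", "boolean")
define_column(EMPLOYED_BY, "Employed By", "oid")


def column_names(row_oid: int) -> dict[object, object]:
    """Render a row using the labels found in its columns' definition rows."""
    return {table[c][LABEL]: v for c, v in table[row_oid].items()}


table[2000] = {LABEL: "Jane Doe", EMPLOYED_BY: 1100}
```

  • Adding a column then amounts to one more call to `define_column`, after which existing rows may store a value under the new column OID.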
  • the addition of rows that correspond to columns renders the table 100 self-referential. New columns may be easily appended to the table 100 by creating a new column definition record. The new column is then immediately available for use in existing records.

Dates

  • Dates can be specified numerically and textually.
  • An example of a numerical date is "11/6/67" and an example of a textual date is "November 6, 1967."
  • Textual entries are converted to dates using standard algorithms and lookup tables.
  • a date value can store both original text and the associated date to which the text is converted, which allows the date value to be displayed in the format in which it was originally entered.
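  • A minimal sketch of a date value that keeps both the entered text and the converted date. The class name `DateValue`, the list of accepted formats, and the rule that two-digit years mean 19xx are assumptions chosen so that the example "11/6/67" converts to 1967; the specification refers only to standard algorithms and lookup tables.

```python
from dataclasses import dataclass
from datetime import date, datetime

# Assumed set of accepted formats; month names rely on an English locale.
_FORMATS = ("%m/%d/%y", "%m/%d/%Y", "%B %d, %Y")


@dataclass(frozen=True)
class DateValue:
    original: str  # text exactly as entered, used for redisplay
    value: date    # converted date, used for comparison and sorting

    @classmethod
    def parse(cls, text: str) -> "DateValue":
        cleaned = text.strip()
        for fmt in _FORMATS:
            try:
                parsed = datetime.strptime(cleaned, fmt).date()
            except ValueError:
                continue
            if fmt.endswith("%y"):
                # Assumption: two-digit years are read as 19xx.
                parsed = parsed.replace(year=1900 + parsed.year % 100)
            return cls(text, parsed)
        raise ValueError(f"unrecognized date: {text!r}")
```

  • Two values parsed from differently formatted text compare equal on `value` while each still redisplays its own `original`.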
  • Numeric values are classified as either a whole number (Integer) or fractional number.
  • Integers are stored as variable length structures, which can represent arbitrarily large numbers. All data structures and indexes use this format which ensures that there are no limits in the system.
  • Fractional numbers are represented by a <numerator/denominator> pair of variable length integers. As with dates, a numeric value can store both the original text ("4 1/2 inches") and the associated number (4.5). This allows the numeric value to be redisplayed in the format in which it was originally entered.
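  • As an illustration, Python's `fractions.Fraction` already stores a numerator/denominator pair of arbitrary-precision integers, so it can stand in for the variable length representation described above. The class name `NumericValue` and the two parsing patterns are hypothetical and handle only whole numbers, simple fractions and mixed numbers.

```python
import re
from dataclasses import dataclass
from fractions import Fraction

# Assumed input shapes: "4 1/2 ...", "3/4 ...", or "12 ...".
_MIXED = re.compile(r"\s*(?:(\d+)\s+)?(\d+)\s*/\s*(\d+)")
_WHOLE = re.compile(r"\s*(\d+)")


@dataclass(frozen=True)
class NumericValue:
    original: str    # text exactly as entered, used for redisplay
    value: Fraction  # exact numerator/denominator pair

    @classmethod
    def parse(cls, text: str) -> "NumericValue":
        match = _MIXED.match(text)
        if match:
            whole, num, den = match.groups()
            value = Fraction(int(whole or 0)) + Fraction(int(num), int(den))
        else:
            match = _WHOLE.match(text)
            if match is None:
                raise ValueError(f"unrecognized number: {text!r}")
            value = Fraction(int(match.group(1)))
        return cls(text, value)
```

  • Parsing "4 1/2 inches" in this sketch yields the exact value 9/2 while `original` retains the entered text.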
  • a record can be associated with a 'record type'.
  • the record type can be used simply as a category, but also can be used to determine the behavior of records.
  • the record type might specify certain columns that are required by all records of that type and, as with columns, the type definitions are stored as records in the table 100.
  • column 122 includes the type definition for each record.
  • the column 122 stores pointers to rows defining a particular column type.
  • the row 136 is a "Field" type column and contains a pointer in a cell 133 to a row 135 that defines "Field" type columns.
  • the "Type Column" 122 of the row 135 points to a type called "Type," which is defined in a row 140.
  • "Type" has a type column that points to itself.
  • Record types may constrain the values that a record of that type may contain. For example, the record type 'Person' may require that records of type 'Person' have a valid value in the 'Name' column, the 'Phone' column, and any other columns.
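  • A minimal sketch of record types stored as records, assuming string column keys and small integer OIDs that are purely illustrative. The record with OID 1 plays the role of "Type" and points to itself; the helper `missing_required` is a hypothetical name for the check that a 'Person' record has its required columns.

```python
# Hypothetical column keys and OIDs for illustration.
TYPE_COL, LABEL_COL, REQUIRED_COL = "type", "label", "required"

table: dict[int, dict[str, object]] = {
    1: {LABEL_COL: "Type", TYPE_COL: 1},  # "Type" has a type column pointing to itself
    2: {LABEL_COL: "Person", TYPE_COL: 1, REQUIRED_COL: ["Name", "Phone"]},
    3: {TYPE_COL: 2, "Name": "Ann"},      # a 'Person' record missing its phone
}


def missing_required(row_oid: int) -> list[str]:
    """List the columns that the record's type requires but the record lacks."""
    record = table[row_oid]
    type_def = table[record[TYPE_COL]]
    return [c for c in type_def.get(REQUIRED_COL, []) if record.get(c) in (None, "")]
```

  • Because the type is just another cell of the record, changing the value stored under `TYPE_COL` changes which constraints apply.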
  • the type of a record is an attribute of the record and thus may change at any time.

Creating a unique OID

  • Figure 4 is a flow chart of the method for assigning OID's.
  • the CPU 24 running the database program stored in the memory 26 requests a timestamp from the operating system.
  • the system determines whether the received timestamp is identical to a previous timestamp. If the timestamps are identical, block 210 branches to block 220 and a tiebreaker is incremented to resolve the conflict between the identical timestamps.
  • the system determines whether the tiebreaker has reached its limit, and, if so, the system branches to block 200 to retrieve a new time stamp. Otherwise, the system branches to block 214 where the system requests a session identification which is unique to the user session.
  • the session identification is derived from the unique serial number of the application installed on the user's machine.
  • the session identification may be used to determine the type of object. For example, dates are independent of any particular machine, and so an OID for a date may have a fixed session identification.
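  • The flow described above can be sketched as follows. The class name `OidGenerator`, the injectable `clock`, the default tiebreaker limit, and the choice to return the three parts as a tuple are assumptions; this portion of the text does not state how the timestamp, tiebreaker and session identification are packed into a single OID.

```python
import time
from typing import Callable, Optional


class OidGenerator:
    """Hypothetical generator combining timestamp, tiebreaker and session id."""

    def __init__(self, session_id: int,
                 clock: Callable[[], int] = time.time_ns,
                 tiebreaker_limit: int = 255) -> None:
        self._session_id = session_id
        self._clock = clock
        self._limit = tiebreaker_limit
        self._last_ts: Optional[int] = None
        self._tiebreaker = 0

    def next_oid(self) -> tuple[int, int, int]:
        while True:
            ts = self._clock()  # request a timestamp
            if ts != self._last_ts:
                # New timestamp: reset the tiebreaker.
                self._last_ts, self._tiebreaker = ts, 0
                break
            if self._tiebreaker < self._limit:
                # Identical timestamp: resolve the conflict with the tiebreaker.
                self._tiebreaker += 1
                break
            # Tiebreaker at its limit: loop to retrieve a new timestamp.
        return (ts, self._tiebreaker, self._session_id)
```

  • With a clock that returns 5, 5, 5, 6 and a limit of 1, three calls in this sketch yield (5, 0, s), (5, 1, s) and (6, 0, s) for session identification s.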
  • The particular type of OID and its length are constant throughout a single database but may vary between databases. A flag indicating which type of OID is to be used may be embedded in the header of each database.

OID Domains

  • OID domains are used to store OID's, which are pointers to other records. An efficient query can use these OID's to go directly to another record, rather than searching through columns.
  • the present invention includes a novel technique for determining an OID from the textual description. Conversion from text to an OID may also be necessary when a user is entering information into a record. For example, in Figure 3, the user may be entering information in the "Employed By" column 126, and wish to specify the text "DEXIS" and have it converted to OID #1100. For this purpose, special columns are required that provide a definition for how the search and conversion is performed.
  • Figure 6 is a flow chart for searching the table 100 configured according to the structure illustrated in Figure 5.
  • a user enters text through the keyboard 30 or mouse 45 for a particular column that the user wishes to search.
  • the system retrieves the search path for the column to be searched from the information stored in column 146 as illustrated in Figure 5.
  • a cell 146 in the row 136 contains the search path information for the "Employed By" column 126 of Figure 3.
  • the search path information for the "Employed By" field indicates that the folders called "contacts" and "departments" should be searched for a company with the label "DEXIS."
  • the system searches the table 100 according to the retrieved search path information.
  • the routine searches for a record that has an entry in the label column 124 of Figure 3 that is the same as the text being searched for, and that is of the same class, as indicated in column 122 of Figure 3. Folders will be further described below.
  • the system determines whether it has found any items matching the user's search text. If no items have been found, at block 158, the system prompts the user on the display screen 37 to create a new record. If the user wishes to create a new record, control passes to block 162 and the system creates a new record. At block 164, the OID of the new record is returned. If the user does not wish to create a new record, a "NIL" string is returned, as shown at block 160.
  • the system determines whether it has found more than one item, as illustrated in block 166. If only one item has been located, its OID is returned at block 168. If more than one item has been located, the system displays the list of items to the user at block 170 and the user selects a record from the list. At block 172, the OID of the selected record is returned, which, in the above example, is #1100, the OID of the record for the company "DEXIS."
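  • The decision flow of Figure 6 can be sketched as a single function. The record keys ("oid", "label", "type", "folder"), the callback parameters `choose`, `confirm_create` and `create`, and the use of `None` in place of the "NIL" string are assumptions made for illustration.

```python
from typing import Callable, Optional, Sequence

Record = dict  # hypothetical keys: "oid", "label", "type", "folder"


def resolve_oid(text: str,
                search_path: Sequence[str],
                record_type: str,
                records: Sequence[Record],
                choose: Callable[[list], Record],
                confirm_create: Callable[[str], bool],
                create: Callable[[str], int]) -> Optional[int]:
    # Search only the folders named in the column's search path, for records
    # whose label equals the text and whose type matches.
    matches = [r for r in records
               if r["folder"] in search_path
               and r["type"] == record_type
               and r["label"] == text]
    if not matches:
        # No items found: offer to create a new record, else return "NIL".
        return create(text) if confirm_create(text) else None
    if len(matches) == 1:
        return matches[0]["oid"]
    # More than one item: let the user select from the list.
    return choose(matches)["oid"]
```

  • Passing the user interaction in as callbacks keeps the lookup itself free of display code, which is a design choice of this sketch rather than something the text prescribes.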
  • various features may be added to the search mechanism as described with reference to Figure 6.
  • further restrictions may be added to the search; the search may be relaxed by allowing prefix matching or fuzzy matching instead of strict matching; and the search may be widened by using the 'associative search' techniques described below.
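  • As a hedged illustration of relaxing the match, the sketch below offers strict, prefix and fuzzy modes over a list of labels. The function name `match_labels`, the `mode` and `cutoff` parameters, and the use of the standard library's `difflib.get_close_matches` for fuzzy matching are assumptions; the text does not say which fuzzy technique is used.

```python
from difflib import get_close_matches


def match_labels(text: str, labels: list[str],
                 mode: str = "strict", cutoff: float = 0.7) -> list[str]:
    """Return the labels that match `text` under the chosen (assumed) mode."""
    if mode == "strict":
        return [label for label in labels if label == text]
    if mode == "prefix":
        # Case-insensitive prefix comparison is a choice made for this sketch.
        return [label for label in labels if label.lower().startswith(text.lower())]
    if mode == "fuzzy":
        # difflib ranks candidates by similarity ratio; cutoff is in [0, 1].
        return get_close_matches(text, labels, n=max(len(labels), 1), cutoff=cutoff)
    raise ValueError(f"unknown mode: {mode!r}")
```

  • Each relaxed mode can return more candidates than strict matching, in which case the list would be presented to the user as in the multiple-match branch of Figure 6.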
  • Records may have interrelationships and it is often desirable to maintain consistency between interrelated records.
  • a record including data for a company may include information regarding employees of that company, as illustrated in row 110 of Figure 3.
  • the employees that work for that company may have a record that indicates, by a pointer, their employer, as illustrated by row 138 of Figure 3.
  • the employee column of a company should point to employees whose employer column points to that company.
  • the present invention includes a synchronization technique to ensure that whenever interrelated records are added or removed, the interrelationships between the columns are properly updated.
  • the system synchronizes interrelated records by adding a "Synchronize With" column 144 to the table 100 as illustrated in Figure 5. Since the value in the columns defines the relatedness between records, the rows 136 and 139 corresponding to columns contain information within the "Synchronize With" column 144 that indicates which other columns are to be synchronized with the columns corresponding to rows 136 and 139.
  • the "Employed By" column 126 is synchronized with the "Employees" column by an OID pointer in the "Synchronize With" column 144 to the "Employees" column, represented by row 139.
  • the "Employees" column is synchronized with the "Employed By" column 126 by a pointer in the "Synchronize With" column 144 to the "Employed By" column 126, represented by row 136.
  • the "Employees" column of the previous employer is updated to eliminate the pointer to the ex-employee and, correspondingly, a pointer to the employee is added to the "Employees" column of the new employer. Synchronization may need to occur whenever a column is changed, whether by addition or subtraction of a reference to another column, or when entire records are added or eliminated from the table 100.
  • Figure 7a is a flow chart for synchronizing records when a user adds or deletes a record.
  • the system makes a backup of the original list of references to other rows, which are simply the OID's of those other rows, so that it can later determine which OIDS have been added or removed. Only these changes need to be synchronized.
  • the system generates a new list of references by adding or deleting the specified OID.
  • the system determines whether the relevant column is synchronized with another column. If it is not, then the system branches to block 186 and the update is complete. If the column is synchronized with another column, the system determines whether it is already in a synchronization routine. If this were not done, the routine would get into an endless recursive loop. If the system is already in a synchronization routine, the system branches to 190 and the update is complete.
  • the system performs actual synchronization.
  • the system finds an OID that has been added or subtracted from the column (C1) of the record (R1) being altered.
  • the system retrieves the record (R2) corresponding to the added or subtracted OID at block 194.
  • the system determines the synchronization column (C2) of the column (C1) at block 196 and locates that field in the record (R2) corresponding to the added or subtracted OID. For example, if an employee is fired from a job, and the employer's "Employees" field changed accordingly, the system would look up the value of the "Synchronize With" column 144 for the "Employees" column which is contained in the cell 147 as illustrated in Figure 5.
  • the system locates the "Employed By” field of the record for the fired employee.
  • the located cell, (R2:C2) is updated by adding or subtracting the OID.
  • the "Employed By" field of the employee would be changed to no longer point to the previous employer by simply removing the employer's OID from that field.
  • the system branches back to block 192 to update any other OID additions or subtractions. If the system has processed all of the OID's, then the routine exits as illustrated at blocks 200 and 202.
  • Figure 7b illustrates the results of column synchronization of the "Employed By” field and the “Employees” field. As shown, the pointers in the records of these two columns are consistent with one another.
Columns within columns
  • a column may contain within it a reference to another column in the same record.
  • a 'name' column may contain a reference to both a 'first name' and a 'last name' column. The value of the 'name' column can then be reconstructed from the values of the other two columns.
  • Figures 8a and 8b illustrate two possible implementations for reconstructing a value from one or more columns within the same record.
  • Figure 8a illustrates a table 210 that includes a "First Name” column 220, a "Last Name” column 222 and a "Name” column 224.
  • a record 226 for "John Smith” has the first name "John” in the "First Name” column 220 and the last name "Smith” in the column 222.
  • Figure 8b employs a variant of the referencing scheme illustrated in Figure 8a.
  • Figure 8b illustrates a table 230 that includes a "First Name” column 232, a "Last Name” column 234 and a "Name” column 236.
  • a record 238 for "John Smith” has the first name "John” in the "First Name” column 232 and the last name "Smith” in the column 234.
  • the name field 236 returns the text "The name is John Smith” by referencing the fields by defined variables 'fn' and 'ln' as shown in column 236.
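A column whose value is reconstructed from other columns of the same record may be sketched as below. The sketch assumes the variable names `fn` and `ln` for the first-name and last-name fields; the use of Python's `string.Template`, the `variables` mapping and the `read_cell` function are illustrative choices, not part of the described system.

```python
from string import Template

# Hypothetical record: the 'Name' cell holds a template that refers to
# the variables fn and ln instead of storing the name itself.
record = {
    "First Name": "John",
    "Last Name": "Smith",
    "Name": Template("The name is $fn $ln"),
}
# Assumed mapping from template variables to columns of the same record.
variables = {"fn": "First Name", "ln": "Last Name"}


def read_cell(rec, column):
    """Return a cell value, expanding references to other columns of the record."""
    value = rec[column]
    if isinstance(value, Template):
        return value.substitute({var: rec[col] for var, col in variables.items()})
    return value


print(read_cell(record, "Name"))  # The name is John Smith
```

Because the value is rebuilt on every read, a change to the "First Name" or "Last Name" cell is reflected in the "Name" column without any further update.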
  • a given row may contain values for any column.
  • to determine all of the columns that might be used by a record would involve scanning every possible column.
  • the table 100 illustrated in Figure 3 includes a RecordContents column that indicates those columns within which a particular record has stored values.
  • Figure 9 illustrates the table 100 with a "RecordContents" column 127 that includes pointers to the columns containing values for a particular record.
  • the "RecordContents" column 127 for row 110 has pointers to the column 124 and a column 125 but does not have a pointer to the column 126 because the row 110 does not have a value for the column 126.
  • the "RecordContents" column 127 has a defining row 129.
  • the cell containing the record contents can be versioned, providing the ability to do record versioning.
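The purpose of the "RecordContents" column can be illustrated with a small sketch. The sparse, per-column storage, the function names and the placeholder values are assumptions made for the example; only the row and column numerals are taken from the text.

```python
# Hypothetical sparse storage: column OID -> {row OID: value}. Values are placeholders.
columns = {124: {}, 125: {}, 126: {}}
record_contents = {}   # plays the role of column 127: row OID -> column OIDs in use


def set_cell(row, column, value):
    """Store a value and record the column in the row's RecordContents list."""
    columns.setdefault(column, {})[row] = value
    record_contents.setdefault(row, set()).add(column)


def read_record(row):
    """Fetch a whole record by following RecordContents instead of scanning every column."""
    return {col: columns[col][row] for col in sorted(record_contents.get(row, ()))}


set_cell(110, 124, "value A")
set_cell(110, 125, "value B")
set_cell(138, 126, "value C")
print(read_record(110))  # {124: 'value A', 125: 'value B'}
```

Reading row 110 touches only columns 124 and 125; column 126 is never consulted because the row holds no value there.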
  • the table 100 includes a data type defined as a folder.
  • Figure 10 illustrates the structure of a folder.
  • the table 100 includes a "Parent Folder” column 240 and a "Folder Children” column 242.
  • a folder has a corresponding record.
  • a folder entitled “Contacts” has a corresponding row 244 as illustrated in Figure 10.
  • Contacts folder includes pointers to those records that belong to the folder. Similarly, those records that belong to a folder include a pointer to that folder in the "Parent Folder" column 240.
  • the folder structure illustrated in Figure 10 facilitates searching. As previously described, a column may be searched according to a folder specified in the column definition. If a folder is searched, the system accesses the record corresponding to the folder and then searches all of the records pointed to by that folder. Further, the synchronization feature described above may be used to generate the list of items in a folder. For example, in Figure 10, the 'Folder Parent' and 'Folder Children' columns may be synchronized. When the 'Folder Parent' field 240 for record 138 is set to reference the 'Contacts' folder represented by row 244, the list of items in the 'Contacts' folder
  • the present invention includes an indexing system that provides for rapid searching of text included in any cell in the table 100.
  • Each key phrase is extracted from a cell and stored in a list format according to a predefined hierarchy. For example, the list may be alphabetized, providing for very rapid searching of a particular name.
  • Figure 11 illustrates the extraction of text from the table 100 to a list
  • the list 250 is shown separately from the table 100 for purposes of illustration but, in the preferred embodiment, the list 250 comprises part of the table 100.
  • the list 250 stores cell identification numbers for each word in the list where a cell identification number may be of the format <record OID, column OID>.
  • the word "Ventura" occurs in cells 252, 254 and 256 that correspond to different rows and different columns.
  • “Ventura” in the list 250 contains a pointer, or cell identification number, to cells 252, 254 and 256.
  • each cell stores the references to the key phrases within it using 'anchors'.
  • an anchor contains a location (such as the start and stop offset within the text), and an identification number. Both the text and the anchor are stored in the cell 252.
  • Other kinds of domains also support anchors.
  • graphical images support the notion of 'hot spots' where the anchor position is a point on the image.
  • each key phrase is stored as a record in the database and the OID of the record equals the identification number described with reference to Figure 12.
  • One column stores the name of the key phrase and another stores the list of cell identification numbers that include that phrase.
  • Key phrases may have comments of their own, which may also be indexed.
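The relationship between key phrases, cell identification numbers and anchors may be sketched as follows. The cell texts, OIDs and column names are invented for illustration, every word is indexed (one of several possible extraction schemes), and the lower-cased word itself stands in for the key phrase identification number that an anchor would hold.

```python
import re
from collections import defaultdict

# Hypothetical cells keyed by (record OID, column OID); the text is illustrative.
cells = {
    ("#252", "City"): "Ventura",
    ("#254", "Notes"): "Met in Ventura last week",
    ("#256", "Address"): "12 Main St, Ventura",
}

index = defaultdict(list)     # key phrase -> list of cell identification numbers
anchors = defaultdict(list)   # cell id -> list of (start, stop, key phrase)

for cell_id, text in cells.items():
    for match in re.finditer(r"[A-Za-z]+", text):
        word = match.group().lower()
        index[word].append(cell_id)                                   # phrase -> cell
        anchors[cell_id].append((match.start(), match.end(), word))   # cell -> phrase

print(sorted(index["ventura"]))
```

The two dictionaries hold the same links in opposite directions, which is what later allows a search to move from a phrase to its cells and back again.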
  • the sorted list 250 as illustrated in Figure 11 is stored as a Folder, as illustrated in Figure 13.
  • a cell identification field 274 maintains the cells that include the term corresponding to that record.
  • the "Parent Folder" column 240 for each of the terms on the list 250 indicates that the parent folder is an index with a title "Natural.”
  • the "Natural" folder has a row 276 that has pointers in the "Folder Children” column 242 to all of the terms in the list 250.
  • the "Natural" folder corresponds to an index sorted by a specific type of algorithm.
  • Computer programs generally sort using a standard collating sequence such as ASCII.
  • the present invention provides an improvement over this type of sorting and the improved sorting technique corresponds to the "Natural" folder. Records in the "Natural" folder are sorted according to the following rules: 1) A key phrase may occur at more than one point in the list. In particular:
  • Key phrases may be permuted and stored under each permutation. For example: 'John Smith' can be stored under 'John' and also under 'Smith'. Noise words such as 'a' and 'the' are ignored in the permutation.
  • Key phrases which are numeric or date oriented may be stored under each possible location. For example: '1984' can be stored under the digits '1984' and also under 'One thousand, nine hundred...', and 'nineteen eighty four'.
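The permutation rule may be sketched as below. The noise-word list, the function names and the choice to represent an index entry as a (heading word, key phrase) pair are assumptions; the numeric and date expansions are not covered by this sketch.

```python
NOISE_WORDS = {"a", "an", "the", "of"}   # assumed noise-word list


def index_entries(phrase):
    """Return one (heading word, phrase) entry per word that is not a noise word."""
    return [(word.lower(), phrase)
            for word in phrase.split()
            if word.lower() not in NOISE_WORDS]


def build_index(phrases):
    """Collect the entries for all phrases and sort them by heading word."""
    entries = [entry for phrase in phrases for entry in index_entries(phrase)]
    return sorted(entries)


for heading, phrase in build_index(["John Smith", "The Smith Report"]):
    print(heading, "->", phrase)
```

With this scheme 'John Smith' is reachable from both 'john' and 'smith', while 'the' never becomes a heading.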
  • To generate a sorted list, the system must first extract the key phrases or words from the applicable cells.
  • the combination of structured information and text allows various combinations of key phrase extraction to be used.
  • every word is indexed, which is typical for standard text retrieval systems.
  • in column extraction, the whole contents of the column are indexed, which corresponds to a standard database system.
  • in automatic analysis, the contents of the text are analyzed and key phrases are extracted based on matching phrases, semantic context, and other factors.
  • the user or application explicitly marks the key phrase for indexing.
  • the date indexing scheme is very similar to the text indexing scheme as previously described. Important dates are extracted from the text and added to an 'Important Date' list. Each important date is represented by a 'Important Date' record. The 'Important Date' records are stored in a 'Important Dates' folder, which is sorted by date.
  • Figure 23 illustrates the correspondence between cells of the table of Figure 2 and a sorted date index.
  • Important Date records are assigned special predetermined OIDS since they always have the same identity in any system. Assigning predetermined OID's to dates allows Important Dates to be shared across systems.
  • the predetermined OID is generated by using a special session identification number that signifies that the OID is an Important Date. In this case, the timestamp represents the value of the Important Date itself, not the time that it was created.
Associative Queries
  • a sorted key word list is generated from the text in cells and the list is stored in a folder whose records point to the text cells.
  • the associations between the list of records with text and the list of key phrases is two-way since the cells that include text point to the key words.
  • Figure 14 illustrates this two way correspondence. Each record can point to multiple key phrases, and each key phrase can point to multiple records.
  • Figure 15 is a graphical representation of the two way association between records and the key word list.
  • Each record in a plurality of records 298 through 300 may point to one or more important word entries 310 through 312.
  • each important word entry may point to one or more records.
  • a single level search involves starting at one node (on either side of the graph) and following the links to the other side. For example, a user may wish to find the records including the word "Shasta.” First, the important word index would be accessed to find the word "Shasta" and the records pointed to by this word would then be retrieved. This search is indicated by the arrows 314 and 316 where word "Shasta" corresponds to cell 318.
  • Figure 16a illustrates this concept.
  • the term “Shasta” may correspond to a dog with extraordinary intelligence such that in one record, "Shasta” is described as a dog and another record, 'Shasta' is described as a genius. If the user wishes to find the words associated with 'Shasta', the system locates "Shasta" in the "Important Words” folder which points to the records including the word "Shasta.” In turn, the records pointed to contain pointers to the "Important Words" list for each indexed word in the record. Since "Shasta” appears with "dog” and "genius" in the records, these words are retrieved by the system.
  • Figure 16b illustrates an additional level of searching.
  • the word “genius” may occur in records referring to Dirac, and the word “dog” associated with “Checkers,” such that the multilevel search illustrated in Figure 16b results in a retrieval of "Dirac” and “Checkers” when provided with the word "Shasta.”
  • a relevance ranking can be created based on weights associated with each link and type of key word, and the records can be displayed in order of descending relevance.
  • the relevance is based on the distance from all nodes. In this way, only nodes which are near all the initial nodes will have a high relevance. Many other relevance rankings apart from distance may be used.
  • filters can be used to constrain the links that are followed.
  • the search may be filtered such that only the type "Person" is listed such that, in the above example, Shasta will be associated with Dirac but not Checkers.
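The single level and multilevel searches described above may be sketched as a walk over the two-way links. The record identifiers, the `word_type` labels and the function names are hypothetical, and for simplicity the filter is applied to the final result rather than to each link as it is followed.

```python
# Hypothetical two-way links between records and important words.
record_words = {
    "r1": {"shasta", "dog"},
    "r2": {"shasta", "genius"},
    "r3": {"dirac", "genius"},
    "r4": {"checkers", "dog"},
}
word_type = {"dirac": "Person", "checkers": "Dog"}   # assumed type labels


def records_for(word):
    """Word side to record side: the records that contain the word."""
    return {rec for rec, words in record_words.items() if word in words}


def associated_words(words):
    """One level: word -> records -> the other words in those records."""
    found = set()
    for word in words:
        for rec in records_for(word):
            found |= record_words[rec]
    return found - set(words)


def multilevel(word, levels=2, only_type=None):
    """Repeat the one-level step, collecting every newly reached word."""
    seen, frontier = {word}, {word}
    for _ in range(levels):
        frontier = associated_words(frontier) - seen
        seen |= frontier
    result = seen - {word}
    if only_type is not None:
        result = {w for w in result if word_type.get(w) == only_type}
    return result


print(sorted(multilevel("shasta", levels=1)))               # ['dog', 'genius']
print(sorted(multilevel("shasta", only_type="Person")))     # ['dirac']
```

A relevance ranking could be layered on top by recording the level at which each word is first reached; that is left out of this sketch.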
  • the present invention includes a knowledge base and thesaurus to further improve searching capabilities.
  • Each important word record (term) included within the thesaurus contains a pointer to a 'concept' record.
  • Each concept record contains pointers to other concept records, and to the terms that are included within the bounds of that concept.
  • Figure 17 illustrates the structure of the thesaurus.
  • the table 100 includes a "Parent Concept” column 352, a "Concept Name” column 354, a "Synonyms” column 356, a "More Specific Terms” column 358, a "More General Terms” column 360 and a "See Also” column 362.
  • a concept record 350 defines the concept "IBM” and the Synonyms column 356 points to records that are synonymous with IBM, a record 364 with a label field with the value "IBM” and a record 366 with a label field with the value "International Business Machines.”
  • the records 364 and 366 have pointers in the "parent concept” field that point to the parent concept record 350.
  • the thesaurus structure illustrated in Figure 17 provides for greater flexibility than exact synonyms.
  • the "More Specific Terms” column 358 of the concept record 350 associated with "IBM” points to a concept record 368 associated with the IBM PC with an assigned weight of 100%, where the weight percentage reflects the similarity between the initial term “IBM” and the related term “IBM PC.”
  • the "More General Terms” column 360 of the concept record 350 associated with "IBM” points to a concept record 372 associated with Computer Companies with an assigned weight of 60%.
  • the “See also” column points to a record associated with the concept "Microsoft” with a weight of 70%.
  • the Thesaurus illustrated in Figure 17 enhances the searching mechanisms previously described with reference to Figures 14-16b.
  • the system first locates the record associated with a key word and locates the parent concept record pointed to by the key word record. The system may then follow some or all of the pointers in the columns 356, 358, 360 and 352 and return the OIDs stored in the 'Concept Name' column 354.
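Query expansion through the thesaurus may be sketched as follows. The weights are those of the Figure 17 example; the dictionary layout, the `expand` function, the `min_weight` cutoff and the weight of 1.0 given to synonyms are assumptions made for illustration.

```python
# Hypothetical concept records; the weights follow the Figure 17 example.
concepts = {
    "IBM": {
        "synonyms": ["IBM", "International Business Machines"],
        "more_specific": [("IBM PC", 1.00)],
        "more_general": [("Computer Companies", 0.60)],
        "see_also": [("Microsoft", 0.70)],
    },
}
# Each term points to its parent concept.
parent_concept = {"IBM": "IBM", "International Business Machines": "IBM"}


def expand(term, min_weight=0.0):
    """Return {related name: weight} for a search term, via its parent concept."""
    concept = concepts[parent_concept[term]]
    related = {name: 1.0 for name in concept["synonyms"]}   # assumed synonym weight
    for column in ("more_specific", "more_general", "see_also"):
        for name, weight in concept[column]:
            if weight >= min_weight:
                related[name] = weight
    return related


print(expand("International Business Machines", min_weight=0.65))
```

Raising `min_weight` narrows the expansion to closely related concepts; lowering it widens the search.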
  • any other columns may be used to extend the knowledge and information stored therein.
  • the system can store any kind of relationship, including relationships other than thesaural relationships, between key phrases, concepts and other records.
  • the database of the present invention has been described without reference to its interface with applications that may use the invention as their primary storage and retrieval system.
  • the present database includes an interface to support applications programs.
  • Components in the application support system include external document support, hypertext, document management and workflow, calendaring and scheduling, security and other features.
  • the present invention includes various user interface components that have been developed to provide full access to the structure of the database of the present invention.
  • a new kind of structured word processor will be presented. The Specification will describe each component of the application support system separately.
  • the present invention supports indexing of external documents.
  • the table 100 stores the filenames of documents, such as word processor documents, where the contents of the files are not directly stored in the database.
  • the document names may be stored in a column with a specialized "External
  • the external documents may reside in the mass memory 32 or on a multi-source that interfaces with the system through device control 36.
  • an external document is converted into a plain text format.
  • Key phrases are then extracted as previously described.
  • fields in the text can be determined and mapped to fields within the database.
  • a 'Memo' document may contain the text: 'To: John Smith. From: Mary Doe'.
  • This text can be mapped to the fields called 'to' and 'from', and the values of these fields set accordingly.
  • the analysis of the text in this way can be changed for different types of external documents such as memos, legal documents, spread sheets, computer source code and any other type of document.
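The mapping of labeled text to database fields may be sketched with regular expressions. The patterns, the `MEMO_FIELDS` table and the `extract_fields` function are assumptions for a 'Memo' document type; other document types would need their own patterns.

```python
import re

# Assumed patterns for a 'Memo' document: a label followed by text up to a period.
MEMO_FIELDS = {
    "to": re.compile(r"To:\s*([^.\n]+)"),
    "from": re.compile(r"From:\s*([^.\n]+)"),
}


def extract_fields(text, patterns=MEMO_FIELDS):
    """Map labeled regions of plain text to database field values."""
    fields = {}
    for name, pattern in patterns.items():
        match = pattern.search(text)
        if match:
            fields[name] = match.group(1).strip()
    return fields


print(extract_fields("To: John Smith. From: Mary Doe"))
# {'to': 'John Smith', 'from': 'Mary Doe'}
```

Passing a different `patterns` table is one way the analysis could be changed per document type.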
  • a start and stop point within the text is determined.
  • a list of anchors of the format previously described, <start, stop, key phrase>, is generated by the parser and stored within the table 100 under the external document domain.
Viewing external documents
  • the stored anchors are overlaid on top of the document such that it appears that the external document has been marked with hypertext.
  • the corresponding anchor is determined from the various start and stop coordinates.
  • the OID of the key phrase corresponding to the anchor is stored within the anchor, and can be used for the purposes of retrieving the key phrase record or initiating a query as previously described.
  • the present invention supports Hypertext.
  • Hypertext systems typically associate a region of text with a pointer to another record, as illustrated in Figure 18. This creates a 'hard-coded' link between the source and the target. When a user clicks on the source region, the target record is loaded and displayed. If the target record is absent, the hypertext jump will fail, possibly with serious consequences.
  • each hypertext region is associated with a key phrase, not a normal record.
  • the application can then display on the display screen 37 either the highest ranked item, or present all the retrieved items and allow the user to pick the one to access.
  • the user may want to access a single 'default' item.
  • This item can be determined automatically, by picking the item at the top of the dynamically generated list, or manually, by letting the user pick the item explicitly and then preserving this choice in the anchor itself.
  • the database of the present invention includes a novel Structured Word Processor that may be used in conjunction with the table 100.
  • the structured word processor of the present invention uses the "boxes and glue" paradigm introduced by Donald Knuth in TeX. According to this paradigm, a page of text is created by starting with individual characters and concatenating the characters to form larger units, called “boxes," and then combining these boxes into yet larger boxes.
  • Figure 20a illustrates three character boxes 400, 402 and 404 concatenated to form a word box 406.
  • Figure 20b illustrates four word boxes 410, 412, 414 and the word box 406 combined to form a horizontal line box 408.
  • Horizontal boxes are used for words and other text tokens that are spaced horizontally inside another box, such as a line (or column width).
  • Figure 20c illustrates the combination of the horizontal line box 408 with another horizontal line box 4242 to form a vertical box 420.
  • Boxes may be attached to other boxes with "glue.”
  • the glue can stretch or shrink, as needed. For example, in a justified sentence, the white space between words is stretched to force the words to line up at the right edge of the column.
  • Glue can be used for between-character (horizontal) spacing, between-word (horizontal) spacing including “tab” glue, that "sticks” to tab markings.
  • Glue may also be used for between-line (vertical) spacing and between-paragraph (vertical) spacing.
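The stretching of glue in a justified line may be sketched as follows. The function name, the equal distribution of the extra space across all gaps and the example widths are assumptions; real glue may stretch or shrink by different amounts in different places.

```python
def justify(word_widths, line_width, natural_space):
    """Return the x offset of each word box after stretching the glue.

    Assumes at least two words and that every gap stretches equally.
    """
    gaps = len(word_widths) - 1
    natural = sum(word_widths) + gaps * natural_space
    stretch = (line_width - natural) / gaps      # extra space added to each glue
    offsets, x = [], 0.0
    for width in word_widths:
        offsets.append(x)
        x += width + natural_space + stretch
    return offsets


print(justify([10, 20, 10], line_width=60, natural_space=5))  # [0.0, 20.0, 50.0]
```

In the example the last word starts at 50 and is 10 wide, so it ends exactly at the right edge of the 60-unit column.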
  • each word and field definition is converted into boxes.
  • the system organizes these boxes into a tree structure of line boxes and paragraph boxes, as illustrated in Figure 21.
  • the record structure hierarchy 460 represents the record structure of the table 100 where a record 462 corresponds to a row in the table 100 and the record 462 includes a plurality of attributes, including attribute 464, that correspond to the columns of the table 100.
  • the attributes may include a variety of items.
  • the attribute 464 includes text, represented by block 466, field references represented by block 468 and other items as shown.
  • the layout hierarchy 470 comprises a document 472 which in turn comprises a plurality of pages, including page 474.
  • the page 474 comprises a plurality of paragraphs including paragraphs 430 and 431 and the paragraph 430 comprises a plurality of lines, including lines 432 and 434.
  • the paragraph 431 includes line 436.
  • the word processor of the present invention allows the document 472 to be inserted into the record 462 by providing a plurality of boxes, including boxes 438, 440 and 442, common to both the record structure hierarchy 460 and the layout hierarchy 470.
  • the box 438 corresponds to part of the line 432 and comprises part of the text of attribute 464 as illustrated by block 466.
  • the box 440 corresponds to part of the line 434 and may comprise a field reference as indicated by block 468.
  • the shared box structure as illustrated in Figure 21 allows any type of word processing document to interface with any record in the table 100.
  • each box is kept as a bitmap, and its height and width are known, so the system displays the tree structure 450 by displaying all of the bitmaps corresponding to the boxes in the tree. If the tree is changed, for example, by adding a new word, only the new word box and a relatively small number of adjacent boxes need be recalculated. Similarly, line breaks or restructuring of a paragraph does not alter most of the word boxes, which may be reused, and only the lineboxes need be recalculated.
  • a user may click a cursor on a part of the text.
  • the system locates the word box or glue that is being edited by recursively descending through the tree structure 450.
  • the word processor supports multiple fonts and special effects such as subscripts, dropcaps and other features including graphic objects.
  • a word in a different font than a base font is in a different box and may have a different height from other boxes on a line.
  • the height of a linebox is the height of the largest wordbox within it. Effects within a word can be handled by breaking a word into subboxes with no glue between them. Again, the height of a wordbox is the height of the largest box within it.
  • Graphic objects, such as bitmaps, may be treated and formatted as a fixed width box.
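The sizing rules for nested boxes may be sketched as below. The `Box` class and the `hbox` and `vbox` helpers are hypothetical, glue is left out for brevity, and the dimensions are arbitrary.

```python
from dataclasses import dataclass, field


@dataclass
class Box:
    width: float
    height: float
    children: list = field(default_factory=list)


def hbox(children):
    """Lay boxes side by side: widths add up, height is that of the tallest child."""
    return Box(sum(c.width for c in children),
               max(c.height for c in children),
               list(children))


def vbox(children):
    """Stack boxes vertically: heights add up, width is that of the widest child."""
    return Box(max(c.width for c in children),
               sum(c.height for c in children),
               list(children))


word = hbox([Box(6, 10), Box(6, 14), Box(6, 10)])   # middle character in a larger font
line = hbox([word, Box(30, 10)])                    # the word plus a fixed width graphic
page = vbox([line, hbox([Box(40, 12)])])
print(line.height, page.height)                     # 14 26
```

Since each box caches its own size, changing one character only requires recomputing the boxes on the path from that character to the root.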
  • the word processor of the present invention may be used to edit records in the table 100.
  • the text associated with each field in a record can be considered a "paragraph" for the purposes of inter-field spacing, text flow within a field, and other formatting parameters. Storing all the fields in the same way during text-editing allows the movement of text and "flow" to appear natural.
  • the text being edited is divided into fields, with each field corresponding to a column in the underlying database.
  • the positions and sizes of the attributes are not fixed but are dynamic and all the features of a word-processor such as fonts and embedded graphics are available to edit the record fields.
  • the word processor of the present invention allows existing fields to be added by typing the prefix of a field name and pressing a button. The system then completes the rest of the field name automatically.
  • the word processor of the present invention supports other database features. For example, new fields can be created by a user by using a popup dialog box. Similarly, references to other records or important words can be added by a dialog box. With particular regard to the table 100 of the present invention, OID references may support fields within other fields and a particular field within other fields supports the use of 'templates,' where a template is a list of field references embedded in text.
  • templates allow a user to build dynamic forms quickly and easily without having to use complicated form drawing tools.
  • the user interface for the word processor of the present invention allows a user to switch between two modes of data entry.
  • the word-processor of the present invention is used for flexible entry into one record at a time, while a columnar view is used for entering data in columns. The user can switch back and forth between these two views with no loss of data and switching from the word processor to the columnar view will cause the fields that were entered in the single item to become the columns to be displayed in the columnar view.
  • the password should not be made of common words, because an aggressor can use a brute force approach and a dictionary to guess the password;
  • the password should be longer rather than shorter; and c) the password should be changed often, so that even if it is stolen it will not be valid for long.
  • a password should never be written down or embedded into a login script and should always be interactive.
  • a user's identity is determined through an extensive question and answer session.
  • the responses to certain personal questions very quickly identify the user with high accuracy. Even an accurate mimic will eventually fail to answer correctly if the question and answer session is prolonged.
  • sample questions might be: 'What is your favorite breakfast cereal?'; 'Where were you in April 1990?' 'What color is your toothbrush?'. These questions are wide ranging and hard to mimic.
  • the user creates the list of questions and corresponding answers, which are then stored. Because the user has complete control over the questions, the user may find the process of creating the questions and answers enjoyable, and as a result, change the questions and answer list more frequently, further enhancing system security.
  • a user creates a list of 50-100 questions and answers that are encrypted and stored.
  • the questions can be entirely new, or can be based on a large database of interesting questions.
  • the system randomly selects one of the questions related to that user and presents the question to the user.
  • the user then types in a response, which is matched against the correct answer.
  • the matching can be fuzzy and associative, as described above. If the response matches correctly, access is allowed.
  • more security may be provided by repeatedly asking questions until a certain risk threshold is reached. For example, if the answer to 'What color is your toothbrush?' is the single word 'Red', then brute force guessing may be effective in this one case. In this scenario, repeatedly asking questions will diminish the probability of brute force success.
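The effect of repeated questioning can be made concrete. If a guesser has probability p of answering any one question correctly and the questions are treated as independent (an assumption), the chance of answering k questions in a row is p to the power k. The sketch below keeps asking until that chance falls to a chosen risk threshold; the function name, the parameters and the exact (rather than fuzzy) answer matching are all illustrative.

```python
import random


def ask_until_safe(questions, answer_fn, guess_probability, risk_threshold, rng=random):
    """Ask randomly ordered questions until the chance that a guesser would
    have passed all of them is at or below risk_threshold."""
    remaining = list(questions.items())
    rng.shuffle(remaining)                       # random selection of questions
    risk = 1.0
    for question, expected in remaining:
        response = answer_fn(question)
        if response.strip().lower() != expected.strip().lower():
            return False                         # wrong answer: deny access
        risk *= guess_probability                # p ** (questions answered so far)
        if risk <= risk_threshold:
            return True                          # risk is low enough: allow access
    return False                                 # ran out of questions first


stored = {
    "What color is your toothbrush?": "Red",
    "What is your favorite breakfast cereal?": "Oats",
    "Where were you in April 1990?": "Paris",
}
print(ask_until_safe(stored, lambda q: stored[q], guess_probability=0.1, risk_threshold=0.05))
```

With a guess probability of 0.1 per question, two correct answers bring the risk to about 0.01, below the 0.05 threshold used here. The stored answers in the example are invented.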

Abstract

The information management and database system of the present invention comprises a flexible, self-referential table that stores data. The table of the present invention may store any type of data, both structured and unstructured, and provides an interface to other application programs. The table of the present invention comprises a plurality of rows and columns. Each row has an object identification number (OID) and each column also has an OID. A row corresponds to a record and a column corresponds to a field such that the intersection of a row and a column comprises a cell that may contain data for a particular record related to a particular field. A cell may also point to another record. To enhance searching and to provide for synchronization between columns, columns are entered as rows in the table and the record corresponding to a column contains various information about the column. The table includes an index structure for extended queries.

Description

METHOD AND APPARATUS FOR IMPROVED
INFORMATION STORAGE AND RETRIEVAL SYSTEM
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates generally to a method and apparatus for storing, retrieving, and distributing various kinds of data, and more particularly, to an improved database architecture and method for using the same.
2. Description of the Related Art
Over the past 30 years, computers have become increasingly important in storing and managing information. During this time, many database products have been developed to allow users to store and manipulate information and to search for desired information. The continuing growth of the information industry creates a demand for more powerful databases.
The database products have evolved over time. Initially, databases comprised a simple "flat file" with an associated index. Application programs, as opposed to the database program itself, managed the relationships between these files and a user typically performed queries entirely at the application program level. The introduction of relational database systems shifted many tasks from applications programs to database programs. The currently existing database management systems comprise two main types, those that follow the relational model and those that follow the object oriented model.
The relational model sets out a number of rules and guidelines for organizing data items, such as data normalization. A relational database management system (RDBMS) is a system that adheres to these rules. RDBMS databases require that each data item be uniquely classified as a particular instance of a 'relation'. Each set of relations is stored in a distinct 'table'. Each row in the table represents a particular data item, and each column represents an attribute that is shared over all data items in that table.
The pure relational model places a number of restrictions on data items. For example, each data item cannot have attributes other than those columns described for the table. Further, an item cannot point directly to another item. Instead, 'primary keys' (unique identifiers) must be used to reference other items. Typically, these restrictions cause RDBMS databases to include a large number of tables that require a relatively large amount of time to search. Further, the number of tables occupies a large amount of computer memory.
The object oriented database model, derived from the object-oriented programming model, is an alternative to the relational model. Like the relational model, each data item must be classified uniquely as belonging to a single class, which defines its attributes. Key features of the object-oriented model are: 1) each item has a unique system-generated object identification number that can be used for exact retrieval; 2) different types of data items can be stored together; and 3) predefined functions or behavior can be created and stored with a data item.
Apart from the limitations previously described, both the relational and object oriented models share important limitations with regard to data structures and searching. Both models require data to be input according to a defined field structure and thus do not completely support full text data entry. Although some databases allow records to include a text field, such text fields are not easily searched. The structural requirements of current databases require a programmer to predefine a structure and subsequent data entry must conform to that structure. This is inefficient where it is difficult to determine the structure of the data that will be entered into a database.
Conversely, word and image processors that allow unstructured data entry do not provide efficient data retrieval mechanisms and a separate text retrieval or data management tool is required to retrieve data. Thus, the current information management systems do not provide the capability of integrating full text or graphics data entry with the searching mechanisms of a database.
The separation of database from other programs such as word processors has created a large amount of text and other files that cannot be integrated with current databases. Various database, spreadsheet, image, word processing, electronic mail and other types of files may not currently be accessed in a single database that contains all of this information. Various programs provide integration between spreadsheet, word processing and database programs but, as previously described, current databases do not support effective searching in unstructured files.
The present invention overcomes the limitations of both the relational database model and object oriented database model by providing a database with increased flexibility, faster search times and smaller memory requirements and that supports text attributes. Further, the database of the present invention does not require a programmer to preconfigure a structure to which a user must adapt data entry. Many algorithms and techniques are required by applications that deal with these kinds of information. The present invention provides for the integration, into a single database engine, of support for these techniques, and shifts the programming from the application to the database, as will be described below. The present invention also provides for the integration, into a single database, of preexisting source files developed under various types of application programs such as other databases, spreadsheets and word processing programs. In addition, the present invention allows users to control all of the data that are relevant to them without sacrificing the security needs of a centralized data repository.
BRIEF SUMMARY OF THE INVENTION
The present invention improves upon prior art information search and retrieval systems by employing a flexible, self-referential table to store data. The table of the present invention may store any type of data, both structured and unstructured, and provides an interface to other application programs such as word processors that allows for integration of all the data for such application programs into a single database. The present invention also supports a variety of other features including hypertext.
The table of the present invention comprises a plurality of rows and columns. Each row has an object identification number (OID) and each column also has an OID. A row corresponds to a record and a column corresponds to an attribute such that the intersection of a row and a column comprises a cell that may contain data for a particular record related to a particular attribute. A cell may also point to another record. To enhance searching and to provide for synchronization between columns, columns are entered as rows in the table and the record corresponding to a column contains various information about the column. This renders the table self referential and provides numerous advantages, as will be discussed in this Specification. The present invention includes an index structure to allow for rapid searches. Text from each cell is stored in a key word index which itself is stored in the table. The text cells include pointers to the entries in the key word index and the key word index contains pointers to the cells. This two way association provides for extended queries. The invention further includes weights and filters for such extended queries.
The present invention includes a thesaurus and knowledge base that enhances indexed searches. The thesaurus is stored in the table and allows a user to search for synonyms and concepts and also provides a weighting mechanism to rank the relevance of retrieved records.
An application support layer includes a word processor, a password system, hypertext and other functions. The novel word processor of the present invention is integrated with the table of the present invention to allow cells to be edited with the word processor. In addition, the table may be interfaced with external documents which allows a user to retrieve data from external documents according to the enhanced retrieval system of the present invention.
These and numerous other advantages of the present invention will be apparent from the following description.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a functional block diagram illustrating one possible computer system incorporating the teachings of the present invention.
FIG. 2 is a block diagram illustrating the main components of the present invention.
FIG. 3 illustrates the table structure of the database of the present invention.
FIG. 4 is a flow chart for a method of computing object identification numbers (OID's) that define rows and columns in the table of Fig. 1.
FIG. 5 is a part of the table of Fig. 2 illustrating the column synchronization feature of the present invention.
FIG. 6 is a flow chart for a method of searching the table of Fig. 2.
FIG. 7a is a flow chart for synchronizing columns of the table of Fig. 2.
FIG. 7b illustrates the results of column synchronization.
FIG. 8a illustrates a reference within one column to another column.
FIG. 8b illustrates an alternate embodiment for referring to another column within a column.
FIG. 9 illustrates a "Record Contents" column of the present invention that indicates which columns of a particular record have values.
FIG. 10 illustrates a folder structure that organizes records. The folder structure is stored within the table of Fig. 2.
FIG. 11 illustrates the correspondence between cells of the table of Fig. 2 and a sorted key word index.
FIG. 12 illustrates the "anchors" within a cell that relate a word in a cell to a key word index record.
FIG. 13 illustrates key word index records stored in the table of Fig. 2.
FIG. 14 illustrates the relationship between certain data records and key word index records.
FIG. 15 illustrates the relationship of Fig. 14 in graphical form.
FIG. 16a illustrates an extended search in graphical form.
FIG. 16b illustrates a further extended search in graphical form.
FIG. 17 illustrates the thesaurus structure of the present invention stored in the table of Fig. 2.
FIG. 18 illustrates prior art hypertext.
FIG. 19 illustrates the hypertext features of the present invention.
FIG. 20a illustrates a character and word box structure of the word processor of the present invention.
FIG. 20b illustrates the word and horizontal line box structure of the word processor of the present invention.
FIG. 20c illustrates the vertical box structure of the word processor of the present invention.
FIG. 21 illustrates the box tree structure of the word processor of the present invention.
FIG. 22a illustrates the results of a prior art sorting algorithm.
FIG. 22b illustrates the results of a sorting algorithm according to the present invention.
FIG. 23 illustrates the correspondence between cells of the table of Fig. 2 and a sorted date index.
NOTATION AND NOMENCLATURE
The detailed descriptions which follow are presented largely in terms of algorithms and symbolic representations of operations on data bits within a computer memory. These descriptions and representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art.
An algorithm is here, and generally, conceived to be a self-consistent sequence of steps leading to a desired result. These steps are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It proves convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like. It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities.
Further, the manipulations performed are often referred to in terms, such as adding or comparing, which are commonly associated with mental operations performed by a human operator. No such capability of a human operator is necessary, or desirable in most cases, in any of the operations described herein which form part of the present invention; the operations are machine operations. Useful machines for performing the operations of the present invention include general purpose digital computers or other similar digital devices. In all cases there should be borne in mind the distinction between the method operations in operating a computer and the method of computation itself. The present invention relates to method steps for operating a computer in processing electrical or other (e.g., mechanical, chemical) physical signals to generate other desired physical signals.
The present invention also relates to apparatus for performing these operations. This apparatus may be specially constructed for the required purposes or it may comprise a general purpose computer as selectively activated or reconfigured by a computer program stored in the computer. The algorithms presented herein are not inherently related to a particular computer or other apparatus. In particular, various general purpose machines may be used with programs written in accordance with the teachings herein, or it may prove more convenient to construct more specialized apparatus to perform the required method steps. The required structure for a variety of these machines will appear from the description given below.
DETAILED DESCRIPTION OF THE INVENTION
The present invention discloses methods and apparatus for data storage, manipulation and retrieval. Although the present invention is described with reference to specific block diagrams, and table entries, etc., it will be appreciated by one of ordinary skill in the art that such details are disclosed simply to provide a more thorough understanding of the present invention. It will therefore be apparent to one skilled in the art that the present invention may be practiced without these specific details.
Reference to Appendices and Copyright Notice
Although the detailed description provides a complete disclosure of the invention, Appendices including source and object code disks of the invention and a sample database printout accompany this Specification. The appendix contains material protected under copyright law. The material in the appendices may be reproduced as it appears in the Patent and Trademark Office patent file or records but the owner reserves all other copyright rights in the appendices.
System Hardware
Referring to Figure 1, the hardware configuration of the present invention is conceptually illustrated. Figure 1 illustrates an information storage and retrieval system structured in accordance with the teachings of the present invention. As illustrated, the information storage and retrieval system includes a computer 23 which comprises four major components. The first of these is an input/output (I/O) circuit 22, which is used to communicate information in appropriately structured form to and from other portions of the computer 23. In addition, computer 20 includes a central processing unit (CPU) 24 coupled to the I/O circuit 22 and to a memory 26. These elements are those typically found in most computers and, in fact, computer 23 is intended to be representative of a broad category of data processing devices. Also shown in Figure 1 is a keyboard 30 for inputting data and commands into computer 23 through the I/O circuit 22, as is well known.
Similarly, a CD ROM 34 is coupled to the I/O circuit 22 for providing additional programming capacity to the system illustrated in Figure 1. It will be appreciated that additional devices may be coupled to the computer 20 for storing data, such as magnetic tape drives, buffer memory devices, and the like. A device control 36 is coupled to both the memory 26 and the I/O circuit 22, to permit the computer 23 to communicate with multi-media system resources. The device control 36 controls operation of the multi-media resources to interface the multi-media resources to the computer 23.
A display monitor 43 is coupled to the computer 20 through the I/O circuit 22. A cursor control device 45 includes switches 47 and 49 for signaling the CPU 24 in accordance with the teachings of the present invention. The cursor control device 45 (commonly referred to as a "mouse") permits a user to select various command modes, modify graphic data, and input other data utilizing switches 47 and 49. More particularly, the cursor control device 45 permits a user to selectively position a cursor 39 at any desired location on a display screen 37 of the display 43. It will be appreciated that the cursor control device 45 and the keyboard 30 are examples of a variety of input devices which may be utilized in accordance with the teachings of the present invention. Other input devices, including, for example, trackballs, touch screens, data gloves or other virtual reality devices, may also be used in conjunction with the invention as disclosed herein.
System Architecture
Figure 2 is a block diagram of the information storage and retrieval system of the present invention. As illustrated in the Figure, the present invention includes an internal database 52 that further includes a record oriented database 74 and a free-text database 76. The database 52 may receive data from a plurality of external sources 50, including word processing documents 58, spreadsheets 60 and database files 62. As will be described more fully below, the present invention includes an application support system that interfaces the external sources 50 with the database 52. To efficiently retrieve information stored in the database 52, a plurality of indexes 54 including a keyword index 78 and other types of indexes such as phonetic, special sorting for other languages, and market specific such as chemical, legal and medical, store sorted information provided by the database 52. To organize the information in the indexes 54, a knowledge system 56 links information existing in the indexes 54.
The organization illustrated in Figure 2 is for conceptual purposes and, in actuality, the database 52, the indexes 54 and the knowledge system 56 are stored in the same table, as will be described more fully below. This Specification will first describe the structure and features of the database 52. Next, the Specification will describe the index 54 and its implementation for searching the database 52. The Specification will then describe the knowledge system 56 that further enhances the index 54 by providing synonyms and other elements. Finally, the Specification will describe an interface between the external application programs 50 and the database 52, including a novel structured word processor and a novel password scheme.
Figure 3 illustrates the storage and retrieval structure of the present invention. The storage and retrieval structure of the present invention comprises a table 100. The structure of the table 100 is a logical structure and not necessarily a physical structure. Thus, the memories 26 and 32 configured according to the teachings of the present invention need not store the table 100 contiguously.
The table 100 further comprises a plurality of rows 110 and a plurality of columns 120. A row corresponds to a record while a column corresponds to an attribute of a record and the defining characteristics of the column are stored in a row 108. The intersection of a row and a column comprises a particular cell.
Each row is assigned a unique object identification number (OID) stored in column 120 and each column also is assigned a unique OID, indicated in brackets and stored in row 108. For example, row 110 has an OID equal to 1100 while the column 122 has an OID equal to 101. As will be described more fully below, the OID's for both rows and columns may be used as pointers and a cell 134 may store an OID. The method for assigning the OID's will also be discussed below.
As illustrated in Figure 3, each row, corresponding to a record, may include information in each column. However, a row need not, and generally will not, have data stored in every column. For example, row 110 corresponds to a company as shown in a cell 130. Since companies do not have titles, cell 132 is unused.
The type of information associated with a column is known as a 'domain'. Standard domains supported in most database systems include text, number, date, and Boolean. The present invention includes other types of domains such as the OID domain that points to a row or column. The present invention further supports 'user-defined' domains, whereby all the behavior of the domain can be determined by a user or programmer. For example, a user may configure a domain to include writing to and reading from a storage medium and handling operations such as equality testing and comparisons.
According to the present invention, individual cells may be accessed according to their row and column OID's. Using the cell as the unit of storage improves many standard data management operations that previously required the entire object or record. Such operations include versioning, security, hierarchical storage management, appending to remote partitions, printing, and other operations.
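The cell-level addressing described above can be illustrated with a minimal sketch. It assumes a sparse mapping keyed by (row OID, column OID); the class name `CellTable`, its methods, and the column OIDs 102 and 103 are hypothetical and are not taken from the Specification. Only the row OID 1100 and the column OID 101 come from the example above.

```python
class CellTable:
    """Logical table whose unit of storage is the cell, keyed by (row OID, column OID)."""

    def __init__(self):
        self._cells = {}  # (row_oid, col_oid) -> value

    def set_cell(self, row_oid, col_oid, value):
        self._cells[(row_oid, col_oid)] = value

    def get_cell(self, row_oid, col_oid, default=None):
        # An unused cell simply has no entry, so a row need not fill every column.
        return self._cells.get((row_oid, col_oid), default)

    def row(self, row_oid):
        # Collect only the cells that actually hold a value for this row.
        return {c: v for (r, c), v in self._cells.items() if r == row_oid}


# OID 101 (type column) and OID 1100 (row 110) are from the text;
# 102 (label) and 103 (title) are hypothetical stand-ins.
TYPE_COL, LABEL_COL, TITLE_COL = 101, 102, 103

table = CellTable()
table.set_cell(1100, TYPE_COL, "Company")
table.set_cell(1100, LABEL_COL, "DEXIS")
```

Because the title cell was never written, `table.get_cell(1100, TITLE_COL)` returns the default, which mirrors the unused cell 132 in the company example.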
Column definitions
Each column has an associated column definition, which determines the properties of the column, such as the domain of the column, the name of the column, whether the column is required and other properties that may relate to a column. The table 100 supports columns that include unstructured, free text data.
The column definition is stored as a record in the table 100 of Figure 3. For example, the "Employed By" column 126 has a corresponding row 136. The addition of rows that correspond to columns renders the table 100 self-referential. New columns may be easily appended to the table 100 by creating a new column definition record. The new column is then immediately available for use in existing records.
Dates
Dates can be specified numerically and textually. An example of a numerical date is "11/6/67" and an example of a textual date is "November 6, 1967." Textual entries are converted to dates using standard algorithms and lookup tables. A date value can store both original text and the associated date to which the text is converted, which allows the date value to be displayed in the format in which it was originally entered.
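As an illustration only, a date value that keeps both the original text and the converted date might be sketched as follows. The names `DateValue` and `parse_date`, the month/day/year ordering, and the rule that two-digit years fall in the 1900s are assumptions made for this sketch; the Specification does not state which conversion algorithms are used.

```python
import re
from dataclasses import dataclass
from datetime import date

# Lookup table from month name to month number.
MONTHS = {name: number for number, name in enumerate(
    ["january", "february", "march", "april", "may", "june", "july",
     "august", "september", "october", "november", "december"], start=1)}

_NUMERIC = re.compile(r"(\d{1,2})/(\d{1,2})/(\d{2}|\d{4})$")
_TEXTUAL = re.compile(r"([A-Za-z]+)\s+(\d{1,2}),\s*(\d{4})$")


@dataclass(frozen=True)
class DateValue:
    original_text: str  # what the user typed, kept for redisplay
    value: date         # the date the text was converted to


def parse_date(text: str) -> DateValue:
    cleaned = text.strip()
    m = _NUMERIC.match(cleaned)
    if m:
        month, day, year = (int(g) for g in m.groups())
        if year < 100:
            year += 1900  # assumption: two-digit years mean 19xx
        return DateValue(text, date(year, month, day))
    m = _TEXTUAL.match(cleaned)
    if m and m.group(1).lower() in MONTHS:
        return DateValue(text, date(int(m.group(3)),
                                    MONTHS[m.group(1).lower()],
                                    int(m.group(2))))
    raise ValueError(f"unrecognized date: {text!r}")
```

Under these assumptions both example entries convert to the same date, while each value still redisplays in the form in which it was entered.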
Numbers
Numeric values are classified as either a whole number (Integer) or fractional number. In the preferred embodiment, Integers are stored as variable length structures, which can represent arbitrarily large numbers. All data structures and indexes use this format which ensures that there are no limits in the system.
Fractional numbers are represented by a <numerator/denominator> pair of variable length integers. As with dates, a numeric value can store both the original text ("4 1/2 inches") and the associated number (4.5). This allows the numeric value to be redisplayed in the format in which it was originally entered.
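A minimal sketch of such a numeric value is given below. It assumes Python's `fractions.Fraction`, which holds a numerator and denominator as arbitrary-precision integers, as a stand-in for the variable length integer pair; the names `NumericValue` and `parse_number` and the accepted input forms are hypothetical.

```python
import re
from dataclasses import dataclass
from fractions import Fraction

_MIXED = re.compile(r"\s*(-?)(\d+)\s+(\d+)\s*/\s*(\d+)")  # e.g. "4 1/2"
_FRACTION = re.compile(r"\s*(-?\d+)\s*/\s*(\d+)")          # e.g. "1/2"
_INTEGER = re.compile(r"\s*(-?\d+)")                       # e.g. "42"


@dataclass(frozen=True)
class NumericValue:
    original_text: str  # kept so the value can be redisplayed as entered
    value: Fraction     # numerator/denominator pair of unbounded integers


def parse_number(text: str) -> NumericValue:
    # Try the most specific pattern first; trailing text such as units is ignored.
    m = _MIXED.match(text)
    if m:
        sign = -1 if m.group(1) else 1
        whole, num, den = int(m.group(2)), int(m.group(3)), int(m.group(4))
        return NumericValue(text, sign * (whole + Fraction(num, den)))
    m = _FRACTION.match(text)
    if m:
        return NumericValue(text, Fraction(int(m.group(1)), int(m.group(2))))
    m = _INTEGER.match(text)
    if m:
        return NumericValue(text, Fraction(int(m.group(1))))
    raise ValueError(f"unrecognized number: {text!r}")
```

With this sketch, "4 1/2 inches" is stored as the pair 9/2 alongside the original text.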
Type definitions
A record can be associated with a 'record type'. The record type can be used simply as a category, but also can be used to determine the behavior of records. For example, the record type might specify certain columns that are required by all records of that type and, as with columns, the type definitions are stored as records in the table 100. In Figure 3, column 122 includes the type definition for each record. The column 122 stores pointers to rows defining a particular column type. For example, the row 136 is a "Field" type column and contains a pointer in a cell 133 to a row 135 that defines "Field" type columns. The "Type Column" 122 of the row 135 points to a type called "Type," which is defined in a row 140. "Type" has a type column that points to itself.
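The chain of type pointers in this example can be sketched as follows. The sketch uses the figure reference numerals 136, 135 and 140 as stand-in row identifiers, which is an assumption made for readability since the actual OID's of those rows are not given; the function name `type_chain` is hypothetical.

```python
TYPE_COL = 101  # OID of the type column 122, as given in the text

# row identifier -> {column OID -> value}; reference numerals stand in for OIDs
rows = {
    136: {TYPE_COL: 135},  # "Employed By" column definition is of type "Field"
    135: {TYPE_COL: 140},  # "Field" is itself of type "Type"
    140: {TYPE_COL: 140},  # "Type" points to itself
}


def type_chain(rows, row_id):
    """Follow type pointers from a row until a type already visited is reached."""
    chain = [row_id]
    while True:
        next_id = rows[chain[-1]][TYPE_COL]
        if next_id in chain:  # self-reference (or any cycle) ends the walk
            return chain
        chain.append(next_id)
```

The self-pointing "Type" row is what terminates the walk, so no separate schema outside the table is needed to describe types.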
Record types, as defined by their corresponding rows, may constrain the values that a record of that type may contain. For example, the record type 'Person' may require that records of type 'Person' have a valid value in the 'Name' column, the 'Phone' column, and any other columns. The type of a record is an attribute of the record and thus may change at any time.
Creating a unique OID
As previously described, the system must generate a unique OID when columns and rows are formed. Figure 4 is a flow chart of the method for assigning OID's.
At block 200 of Figure 4, the CPU 24 running the database program stored in the memory 26 requests a timestamp from the operating system. At block 210, the system determines whether the received timestamp is identical to a previous timestamp. If the timestamps are identical, block 210 branches to block 220 and a tiebreaker is incremented to resolve the conflict between the identical timestamps. At block 222, the system determines whether the tiebreaker has reached its limit, and, if so, the system branches to block 200 to retrieve a new time stamp. Otherwise, the system branches to block 214 where the system requests a session identification which is unique to the user session.
In the preferred embodiment, the session identification is derived from the unique serial number of the application installed on the user's machine. For certain OID's which are independent of any particular machine, the session identification may be used to determine the type of object. For example, dates are independent of any particular machine, and so an OID for a date may have a fixed session identification.
Returning to block 210, if the timestamps are not identical, control passes to block 212 where the tiebreaker is set to zero and control then passes to block 214. As previously described, at block 214, the system requests a session identification which is unique to the user session. Control then passes to block 216 where the session identification, timestamp and tiebreaker are combined into a bit array, which becomes the OID. Since the OID is a variable length structure, any number of bits may be used, depending on the precision required, the resolution of the operating system clock, and the number of users. In the preferred embodiment, the OID is 64 bits long where the timestamp comprises the first 32 bits, the tiebreaker comprises the next 10 bits and the session identification comprises 22 bits.
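The flow of Figure 4 can be sketched as below, using the 32/10/22 bit split of the preferred embodiment. The class name `OidGenerator`, the injectable `clock` parameter, the use of whole seconds as the timestamp, and the placement of the timestamp in the most significant bits are assumptions made for this sketch.

```python
import time

TIMESTAMP_BITS, TIEBREAKER_BITS, SESSION_BITS = 32, 10, 22


class OidGenerator:
    def __init__(self, session_id, clock=lambda: int(time.time())):
        if not 0 <= session_id < (1 << SESSION_BITS):
            raise ValueError("session_id must fit in 22 bits")
        self._session_id = session_id
        self._clock = clock           # hypothetical hook so the clock can be replaced
        self._last_timestamp = None
        self._tiebreaker = 0

    def next_oid(self):
        while True:
            # Block 200: request a timestamp.
            timestamp = self._clock() & ((1 << TIMESTAMP_BITS) - 1)
            # Block 210: compare with the previous timestamp.
            if timestamp != self._last_timestamp:
                self._last_timestamp = timestamp
                self._tiebreaker = 0  # block 212
                break
            # Blocks 220/222: bump the tiebreaker unless it is exhausted.
            if self._tiebreaker + 1 < (1 << TIEBREAKER_BITS):
                self._tiebreaker += 1
                break
            # Tiebreaker exhausted: loop back to block 200 for a new timestamp.
        # Blocks 214/216: combine timestamp, tiebreaker and session identification.
        return ((timestamp << (TIEBREAKER_BITS + SESSION_BITS))
                | (self._tiebreaker << SESSION_BITS)
                | self._session_id)
```

Two OID's requested within the same clock tick differ only in the tiebreaker field, and OID's from different sessions differ in the low 22 bits.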
The particular type of OID and its length is constant throughout a single database but may vary between databases. A flag indicating which type of OID is to be used may be embedded in the header of each database.
OID Domains
OID domains are used to store OID's, which are pointers to other records. An efficient query can use these OID's to go directly to another record, rather than searching through columns.
If a user wishes to search a column to find a record or records with a certain item in the column, and does not know the OID of the item, the present invention includes a novel technique for determining an OID from the textual description. Conversion from text to an OID may also be necessary when a user is entering information into a record. For example, in Figure 3, the user may be entering information in the "Employed By" column 126, and wish to specify the text "DEXIS" and have it converted to OID #1100. For this purpose, special columns are required that provide a definition for how the search and conversion is performed.
Figure 6 is a flow chart for searching the table 100 configured according to the structure illustrated in Figure 5. At block 150, a user enters text through the keyboard 30 or mouse 45 for a particular column that the user wishes to search. At block 152, the system retrieves the search path for the column to be searched from the information stored in column 146 as illustrated in Figure 5. Continuing with the above example, a cell 146 in the row 136 contains the search path information for the "Employed By" column 126 of Figure 3. The search path information for the "Employed By" field indicates that the folders called "\contacts" and "\departments" should be searched for a company with the label "DEXIS."
Returning to Figure 5, the system searches the table 100 according to the retrieved search path information. For each folder specified in the search path, the routine searches for a record that has an entry in the label column 124 of Figure 3 that is the same as the text being searched for, and is of the same class, as indicated in column 122 of Figure 3. Folders will be further described below.
At block 156, the system determines whether it has found any items matching the user's search text. If no items have been found, at block 158, the system prompts the user on the display screen 37 to create a new record. If the user wishes to create a new record, control passes to block 162 and the system creates a new record. At block 164, the OID of the new record is returned. If the user does not wish to create a new record, a "NIL" string is returned, as shown at block 160.
If the system has located at least one item, the system determines whether it has found more than one item, as illustrated in block 166. If only one item has been located, its OID is returned at block 168. If more than one item has been located, the system displays the list of items to the user at block 170 and the user selects a record from the list. At block 172, the OID of the selected record is returned, which, in the above example, is #1100, the OID of the record for the company "DEXIS."
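The decision flow of Figure 6 can be sketched as a single function. Everything named here is hypothetical: the function `resolve_oid`, the dictionary layout of folders and records, and the three callbacks that stand in for user interaction. `None` is used in place of the "NIL" result.

```python
def resolve_oid(text, record_type, search_path, folders,
                choose, confirm_create, create_record):
    """Convert a label to an OID by searching the folders on a column's search path."""
    # Search each folder for records whose label and type both match.
    matches = [
        rec["oid"]
        for folder in search_path
        for rec in folders.get(folder, [])
        if rec["label"] == text and rec["type"] == record_type
    ]
    if not matches:                      # blocks 156/158: nothing found
        if confirm_create(text):         # block 162: user agrees to create a record
            return create_record(text, record_type)  # block 164
        return None                      # block 160: stands in for NIL
    if len(matches) == 1:                # blocks 166/168: exactly one match
        return matches[0]
    return choose(matches)               # blocks 170/172: user picks from the list


# Folder names and OID 1100 follow the "DEXIS" example; the record layout is assumed.
folders = {
    "\\contacts": [{"oid": 1100, "label": "DEXIS", "type": "Company"}],
    "\\departments": [],
}
path = ["\\contacts", "\\departments"]
```

Passing the interaction steps as callbacks keeps the lookup logic separate from how the list or prompt is displayed, which is a design choice of this sketch rather than something the Specification prescribes.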
In alternate embodiments, various features may be added to the search mechanism as described with reference to Figure 6. For example, further restrictions may be added to the search; the search may be relaxed by allowing prefix matching or fuzzy matching instead of strict matching; and the search may be widened by using the 'associative search' techniques described below.
Two way synchronized links
Records may have interrelationships and it is often desirable to maintain consistency between interrelated records. For example, a record including data for a company may include information regarding employees of that company, as illustrated in row 110 of Figure 3. Similarly, the employees that work for that company may have a record that indicates, by a pointer, their employer, as illustrated by row 138 of Figure 3. Thus, the employee column of a company should point to employees whose employer column points to that company. The present invention includes a synchronization technique to ensure that whenever interrelated records are added or removed, the interrelationships between the columns are properly updated.
The system synchronizes interrelated records by adding a "Synchronize With" column 144 to the table 100 as illustrated in Figure 5. Since the value in the columns defines the relatedness between records, the rows 136 and 139 corresponding to columns contain information within the "Synchronize With" column 144 that indicates which other columns are to be synchronized with the columns corresponding to rows 136 and 139. With reference to Figure 5, the "Employed By" column 126 is synchronized with the "Employees" column by an OID pointer in the "Synchronize With" column 144 to the "Employees" column, represented by row 139. Similarly, the "Employees" column is synchronized with the "Employed By" column 126 by a pointer in the "Synchronize With" column 144 to the "Employed By" column 126, represented by row 136. Thus, whenever an employee changes companies, such that the employee's "Employed By" column changes, the "Employees" column of the previous employer is updated to eliminate the pointer to the ex-employee and, correspondingly, the "Employees" column of the new employer is updated to add a pointer to the employee. Synchronization may need to occur whenever a column is changed, whether by addition or subtraction of a reference to another column, or when entire records are added to or eliminated from the table 100.
Figure 7a is a flow chart for synchronizing records when a user adds or deletes a record. At block 180, the system makes a backup of the original list of references to other rows, which are simply the OID's of those other rows, so that it can later determine which OID's have been added or removed. Only these changes need to be synchronized. At block 182, the system generates a new list of references by adding or deleting the specified OID. At block 184, the system determines whether the relevant column is synchronized with another column. If it is not, then the system branches to block 186 and the update is complete. If the column is synchronized with another column, the system determines whether it is already in a synchronization routine. If this were not done, the routine would get into an endless recursive loop. If the system is already in a synchronization routine, the system branches to block 190 and the update is complete.
Otherwise, the system performs actual synchronization. At block 192, the system finds an OID that has been added or subtracted from the column (C1) of the record (R1) being altered. The system retrieves the record (R2) corresponding to the added or subtracted OID at block 194. The system determines the synchronization column (C2) of the column (C1) at block 196 and locates that field in the record (R2). For example, if an employee is fired from a job, and the employer's "Employees" field is changed accordingly, the system would look up the value of the "Synchronize With" column 144 for the "Employees" column, which is contained in the cell 147 as illustrated in Figure 5. Since cell 147 points to the "Employed By" field, the system locates the "Employed By" field of the record for the fired employee. At block 198 of Figure 7a, the located cell (R2:C2) is updated by adding or subtracting the OID of the record (R1). Continuing with the above example, the "Employed By" field of the employee would be changed to no longer point to the previous employer by simply removing the employer's OID from that field. The system branches back to block 192 to update any other OID additions or subtractions. If the system has processed all of the OID's, then the routine exits as illustrated at blocks 200 and 202.
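The synchronization routine of Figure 7a can be sketched as follows. The function name `set_refs`, the dictionary-based table, the string column identifiers and the row OID's 1200 and 2001 are hypothetical; only OID 1100 is taken from the earlier example. The `_in_sync` flag plays the role of the check that prevents endless recursion.

```python
def set_refs(table, sync_with, row, col, new_refs, _in_sync=False):
    """Replace the OID list in cell (row, col) and mirror the change in the partner column."""
    old_refs = set(table.get((row, col), ()))  # block 180: back up the original list
    new_refs = set(new_refs)
    table[(row, col)] = new_refs               # block 182: store the new list
    partner = sync_with.get(col)               # block 184: is the column synchronized?
    if partner is None or _in_sync:            # blocks 186/190: nothing more to do
        return
    for other in new_refs - old_refs:          # OIDs added to (R1:C1)
        refs = set(table.get((other, partner), ()))
        refs.add(row)                          # block 198: add R1 to (R2:C2)
        set_refs(table, sync_with, other, partner, refs, _in_sync=True)
    for other in old_refs - new_refs:          # OIDs removed from (R1:C1)
        refs = set(table.get((other, partner), ()))
        refs.discard(row)                      # block 198: remove R1 from (R2:C2)
        set_refs(table, sync_with, other, partner, refs, _in_sync=True)


# Each column names the column it is synchronized with, as in "Synchronize With".
sync_with = {"EmployedBy": "Employees", "Employees": "EmployedBy"}
```

Only the difference between the old and new reference lists is propagated, which matches the statement that only the changes need to be synchronized.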
Figure 7b illustrates the results of column synchronization of the "Employed By" field and the "Employees" field. As shown, the pointers in the records of these two columns are consistent with one another.
Columns within columns
A column may contain within it a reference to another column in the same record. For example, a 'name' column may contain a reference to both a 'first name' and a 'last name' column. The value of the 'name' column can then be reconstructed from the values of the other two columns. Figures 8a and 8b illustrate two possible implementations for reconstructing a value from one or more columns within the same record.
Figure 8a illustrates a table 210 that includes a "First Name" column 220, a "Last Name" column 222 and a "Name" column 224. A record 226 for "John Smith" has the first name "John" in the "First Name" column 220 and the last name "Smith" in the column 222. The name field 224 returns the text "The name is John Smith" by referencing the fields in brackets, according to the format < fieldRef field = 'Column Name' > as shown in column 224.
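As a rough illustration of the bracketed referencing scheme, the following Python sketch replaces each field reference with the value of the named column in the same record. The regular expression, the dictionary-based record and the function name `resolve` are assumptions made for this example; references are resolved one level deep only.

```python
import re

# Assumed syntax, modeled on the format quoted above: <fieldRef field='Column Name'>
FIELD_REF = re.compile(r"<\s*fieldRef\s+field\s*=\s*'([^']*)'\s*>")


def resolve(record, column):
    """Return the value of `column` with each field reference replaced by the referenced value."""
    def substitute(match):
        # A reference to a column with no value resolves to an empty string (an assumption).
        return str(record.get(match.group(1), ""))
    return FIELD_REF.sub(substitute, record[column])


# Hypothetical record corresponding to record 226.
record = {
    "First Name": "John",
    "Last Name": "Smith",
    "Name": "The name is <fieldRef field='First Name'> <fieldRef field='Last Name'>",
}
```

With this record, `resolve(record, "Name")` yields the text "The name is John Smith".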
Figure 8b employs a variant of the referencing scheme illustrated in Figure 8a. Figure 8b illustrates a table 230 that includes a "First Name" column 232, a "Last Name" column 234 and a "Name" column 236. A record 238 for "John Smith" has the first name "John" in the "First Name" column 232 and the last name "Smith" in the column 234. The name field 236 returns the text "The name is John Smith" by referencing the fields by defined variables 'fn' and 'ln' as shown in column 236. The variables are defined according to the format variable := fieldAt (parameter, 'Column Name') and the variables may be referenced in a return statement as shown in column 236.
Record Contents
As previously described, a given row may contain values for any column. However, to determine all of the columns that might be used by a record would involve scanning every possible column. To avoid this problem, in the preferred embodiment, the table 100 illustrated in Figure 3 includes a "RecordContents" column that indicates those columns within which a particular record has stored values.
Figure 9 illustrates the table 100 with a "RecordContents" column 127 that includes pointers to the columns containing values for a particular record. For example, the "RecordContents" column 127 for row 110 has pointers to the column 124 and a column 125 but does not have a pointer to the column 126 because the row 110 does not have a value for the column 126. As previously described, since every column has a corresponding row that defines the column, the "RecordContents" column 127 has a defining row 129. Like any cell, the cell containing the record contents can be versioned, providing the ability to do record versioning.
Folders
To provide increased efficiency in managing information, the table 100 includes a data type defined as a folder. Figure 10 illustrates the structure of a folder. As illustrated in the Figure, the table 100 includes a "Parent Folder" column 240 and a "Folder Children" column 242. A folder has a corresponding record. For example, a folder entitled "Contacts" has a corresponding row 244 as illustrated in Figure 10. The "Folder Children" column 242 of the "Contacts" folder includes pointers to those records that belong to the folder. Similarly, those records that belong to a folder include a pointer to that folder in the "Parent Folder" column 240.
The folder structure illustrated in Figure 10 facilitates searching. As previously described, a column may be searched according to a folder specified in the column definition. If a folder is searched, the system accesses the record corresponding to the folder and then searches all of the records pointed to by that folder. Further, the synchronization feature described above may be used to generate the list of items in a folder. For example, in Figure 10, the 'Parent Folder' and 'Folder Children' columns may be synchronized. When the 'Parent Folder' field 240 for the record represented by row 138 is set to reference the 'Contacts' folder represented by row 244, the list of items in the 'Contacts' folder ('Folder Children') is automatically updated to store a reciprocal reference to the record represented by row 138 by including its OID, 1100, in the "Folder Children" column 242.
Text indexing system
The present invention includes an indexing system that provides for rapid searching of text included in any cell in the table 100. Each key phrase is extracted from a cell and stored in a list format according to a predefined hierarchy. For example, the list may be alphabetized, providing for very rapid searching of a particular name.
Figure 11 illustrates the extraction of text from the table 100 to a list 250. The list 250 is shown separately from the table 100 for purposes of illustration but, in the preferred embodiment, the list 250 comprises part of the table 100. The list 250 stores cell identification numbers for each word in the list where a cell identification number may be of the format <record OID, column OID>. For example, the word "Ventura" occurs in cells 252, 254 and 256 that correspond to different rows and different columns. The word "Ventura" in the list 250 contains a pointer, or cell identification number, to cells 252, 254 and 256.
Similarly, each cell stores the references to the key phrases within it using 'anchors'. As illustrated in Figure 12, an anchor contains a location (such as the start and stop offset within the text), and an identification number. Both the text and the anchor are stored in the cell 252. Other kinds of domains also support anchors. For example, graphical images support the notion of 'hot spots' where the anchor position is a point on the image.
As previously described, each key phrase is stored as a record in the database and the OID of the record equals the identification number described with reference to Figure 12. One column stores the name of the key phrase and another stores the list of cell identification numbers that include that phrase. Key phrases may have comments of their own, which may also be indexed.
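The relationship between key phrases and cell identification numbers resembles an inverted index. The sketch below assumes full text extraction of single words and a dictionary of cells keyed by (record OID, column OID); the OID values, the sample text and the name `build_index` are hypothetical and serve only to illustrate the data structure.

```python
import re
from collections import defaultdict


def build_index(table):
    """Map each word to the set of (record OID, column OID) cells that contain it."""
    index = defaultdict(set)
    for (record_oid, column_oid), text in table.items():
        for word in re.findall(r"[A-Za-z0-9]+", text):
            index[word.upper()].add((record_oid, column_oid))
    return index


# Hypothetical cells keyed by (record OID, column OID).
table = {
    (1100, 10): "Ventura office",
    (1101, 11): "Lives in Ventura",
    (1102, 10): "Shasta",
}
index = build_index(table)
words = sorted(index)  # an alphabetized word list, comparable to the list 250
```

Here `index["VENTURA"]` holds the two cell identification numbers (1100, 10) and (1101, 11), which lie in different rows and different columns.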
The sorted list 250 as illustrated in Figure 11 is stored as a Folder, as illustrated in Figure 13. A cell identification field 274 maintains the cells that include the term corresponding to that record. The "Parent Folder" column 240 for each of the terms on the list 250 indicates that the parent folder is an index with a title "Natural." The "Natural" folder has a row 276 that has pointers in the "Folder Children" column 242 to all of the terms in the list 250.
The "Natural" folder corresponds to an index sorted by a specific type of algorithm. Computer programs generally sort using a standard collating sequence such as ASCII. The present invention provides an improvement over this type of sorting and the improved sorting technique corresponds to the "Natural" folder. Records in the "Natural" folder are sorted according to the following rules:
1) A key phrase may occur at more than one point in the list. In particular:
1a) Key phrases may be permuted and stored under each permutation. For example: 'John Smith' can be stored under 'John' and also under 'Smith'. Noise words such as 'a' and 'the' are ignored in the permutation.
1b) Key phrases which are numeric or date oriented may be stored under each possible location. For example: '1984' can be stored under the digits '1984' and also under 'One thousand, nine hundred...', and 'nineteen eighty four'.
2) Numbers are sorted naturally. For example, '20' comes after '3' and before '100'.
3) Prefixes in key phrases are ignored. For example, 'The Big Oak' is sorted under 'Big'.
4) Key phrases are stemmed, so that 'Computers' and 'Computing' map to the identical key phrase record.
The preferred embodiment of the routine for generating positions for entering the key phrases into the 'Natural' folder is as follows:
1) Capitalize the key phrase to avoid case sensitivity problems. For example: 'John Smith the 1st' becomes 'JOHN SMITH THE 1ST'.
2) Each word in the key phrases is stemmed using standard techniques. E.g., "COMPUTERS" becomes "COMPUT".
3) Permute the key phrase. This results in a new set of multiple key phrases based on the original key phrase. For example 'JOHN SMITH THE 1ST' produces the set {'JOHN SMITH THE 1ST'; 'SMITH THE 1ST JOHN'; 'THE 1ST JOHN SMITH'; '1ST JOHN SMITH THE'}.
4) Noise prefixes are eliminated. In the example above, the third entry, 'THE 1ST JOHN SMITH', is eliminated. If no phrases are left after elimination, the original phrase is used. For example, an entry for 'TO BE OR NOT TO BE' would be preserved even if all noise words were eliminated.
5) For each result, numbers and dates are expanded to all possible text representations, and text representations are converted to numeric. For example: '1ST JOHN SMITH THE' generates the set: {'1ST JOHN SMITH THE'; 'FIRST JOHN SMITH THE'}.
6) Finally, each modified key phrase is used to determine the position of a reference to the main key phrase record, and an entry is made in the folder accordingly. For example, '1ST JOHN SMITH THE' is stored between '1' and '2', while 'FIRST JOHN SMITH THE' is stored after 'FIR' and before 'FIS.'
Figure 22a illustrates the results of a prior art sorting algorithm while Figure 22b illustrates the results of a sorting algorithm according to the present invention.
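The position-generating routine above can be approximated by the following Python sketch. It is a simplified illustration under stated assumptions: stemming (step 2) is omitted, the noise word list and the ordinal table are tiny hypothetical stand-ins, permutation is implemented as rotation of the word order to match the example, and `natural_key` is one possible way to make numbers sort naturally.

```python
import re

NOISE = {"A", "AN", "THE"}                                    # hypothetical noise word list
ORDINALS = {"1ST": "FIRST", "2ND": "SECOND", "3RD": "THIRD"}  # tiny illustrative table


def index_entries(phrase):
    """Generate the strings under which a key phrase would be filed (stemming omitted)."""
    words = phrase.upper().split()                                   # step 1: capitalize
    rotations = [words[i:] + words[:i] for i in range(len(words))]  # step 3: permute
    kept = [r for r in rotations if r[0] not in NOISE] or [words]   # step 4: drop noise prefixes
    entries = []
    for r in kept:                                                   # step 5: expand numbers
        entries.append(" ".join(r))
        if any(w in ORDINALS for w in r):
            entries.append(" ".join(ORDINALS.get(w, w) for w in r))
    return entries


def natural_key(entry):
    """Sort key that compares runs of digits numerically, so '3' < '20' < '100'."""
    parts = re.findall(r"\d+|\D+", entry)
    return [(0, int(p), "") if p.isdigit() else (1, 0, p) for p in parts]
```

Sorting the generated entries with `sorted(entries, key=natural_key)` then gives the filing order for step 6 under these assumptions.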
Extracting the key phrases
To generate a sorted list, the system must first extract the key phrases or words from the applicable cells. The combination of structured information and text allows various combinations of key phrase extraction to be used. In full text extraction, every word is indexed, which is typical for standard text retrieval systems. In column extraction, the whole contents of the column are indexed which corresponds to a standard database system. According to a third type of extraction, automatic analysis, the contents of the text are analyzed and key phrases are extracted based on matching phrases, semantic context, and other factors. Finally, in manual selection extraction, the user or application explicitly marks the key phrase for indexing.
Date Indexing System
The date indexing scheme is very similar to the text indexing scheme as previously described. Important dates are extracted from the text and added to an 'Important Date' list. Each important date is represented by an 'Important Date' record. The 'Important Date' records are stored in an 'Important Dates' folder, which is sorted by date.
The important dates are extracted from the text. The system may search for numeric dates, such as '4/5/94' or date-oriented text, such as "Tomorrow", "next Tuesday" or "Christmas". Figure 23 illustrates the correspondence between cells of the table of Figure 2 and a sorted date index.
Important Date records are assigned special predetermined OID's since they always have the same identity in any system. Assigning predetermined OID's to dates allows Important Dates to be shared across systems. The predetermined OID is generated by using a special session identification number that signifies that the OID is an Important Date. In this case, the timestamp represents the value of the Important Date itself, not the time that it was created.
Associative Queries
As previously described, a sorted key word list is generated from the text in cells and the list is stored in a folder whose records point to the text cells. The association between the list of records with text and the list of key phrases is two-way since the cells that include text point to the key words. Figure 14 illustrates this two-way correspondence. Each record can point to multiple key phrases, and each key phrase can point to multiple records.
Figure 15 is a graphical representation of the two-way association between records and the key word list. Each record in a plurality of records 298 through 300 may point to one or more important word entries 310 through 312. Similarly, each important word entry may point to one or more records. A single level search involves starting at one node (on either side of the graph) and following the links to the other side. For example, a user may wish to find the records including the word "Shasta." First, the important word index would be accessed to find the word "Shasta" and the records pointed to by this word would then be retrieved. This search is indicated by the arrows 314 and 316 where the word "Shasta" corresponds to cell 318. Similarly, a user may wish to locate all of the important words included in a particular record, indicated by the arrows 320 and 322 in Figure 15.
The search can be extended by repeatedly following the links back and forth to the desired level. Figure 16a illustrates this concept. As an example, the term "Shasta" may correspond to a dog with extraordinary intelligence such that in one record, "Shasta" is described as a dog and in another record, "Shasta" is described as a genius. If the user wishes to find the words associated with "Shasta", the system locates "Shasta" in the "Important Words" folder which points to the records including the word "Shasta." In turn, the records pointed to contain pointers to the "Important Words" list for each indexed word in the record. Since "Shasta" appears with "dog" and "genius" in the records, these words are retrieved by the system.
This type of searching may be extended indefinitely. Figure 16b illustrates an additional level of searching. Continuing with the above example, the word "genius" may occur in records referring to Dirac, and the word "dog" associated with "Checkers," such that the multilevel search illustrated in Figure 16b results in a retrieval of "Dirac" and "Checkers" when provided with the word "Shasta."
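The multilevel search can be viewed as a breadth-first traversal of the two-way graph between key words and records. The following Python sketch is one possible illustration; the record identifiers, the two dictionaries and the function name `associated_words` are hypothetical, and weights, relevance ranking and filters are left out.

```python
def associated_words(start, word_to_records, record_to_words, levels=1):
    """Follow word -> record -> word links `levels` times and return the words reached."""
    seen = {start}
    frontier = {start}
    for _ in range(levels):
        records = set()
        for word in frontier:
            records |= word_to_records.get(word, set())
        reached = set()
        for record in records:
            reached |= record_to_words.get(record, set())
        frontier = reached - seen   # only words not visited before are expanded next
        seen |= frontier
    return seen - {start}


# Hypothetical records mirroring the example above.
record_to_words = {
    "r1": {"Shasta", "dog"},
    "r2": {"Shasta", "genius"},
    "r3": {"genius", "Dirac"},
    "r4": {"dog", "Checkers"},
}
word_to_records = {}
for rec, ws in record_to_words.items():
    for w in ws:
        word_to_records.setdefault(w, set()).add(rec)
```

With one level the search from "Shasta" reaches "dog" and "genius"; with two levels it also reaches "Dirac" and "Checkers".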
A relevance ranking can be created based on weights associated with each link and type of key word, and the records can be displayed in order of descending relevance. In the preferred embodiment, if two or more nodes are used as the starting point, the relevance is based on the distance from all nodes. In this way, only nodes which are near all the initial nodes will have a high relevance. Many other relevance rankings apart from distance may be used.
To refine the search, filters can be used to constrain the links that are followed. For example, the search may be filtered so that only records of the type "Person" are listed; in the above example, Shasta will then be associated with Dirac but not Checkers.
Knowledge base and thesaurus
The present invention includes a knowledge base and thesaurus to further improve searching capabilities.
Each important word record (term) included within the thesaurus contains a pointer to a 'concept' record. Each concept record contains pointers to other concept records, and to the terms that are included within the bounds of that concept. Figure 17 illustrates the structure of the thesaurus. The table 100 includes a "Parent Concept" column 352, a "Concept Name" column 354, a "Synonyms" column 356, a "More Specific Terms" column 358, a "More General Terms" column 360 and a "See Also" column 362. A concept record 350 defines the concept "IBM" and the Synonyms column 356 points to records that are synonymous with IBM, a record 364 with a label field with the value "IBM" and a record 366 with a label field with the value "International Business Machines." The records 364 and 366 have pointers in the "parent concept" field that point to the parent concept record 350.
The thesaurus structure illustrated in Figure 17 provides for greater flexibility than exact synonyms. The "More Specific Terms" column 358 of the concept record 350 associated with "IBM" points to a concept record 368 associated with the IBM PC with an assigned weight of 100%, where the weight percentage reflects the similarity between the initial term "IBM" and the related term "IBM PC." Similarly, the "More General Terms" column 360 of the concept record 350 associated with "IBM" points to a concept record 372 associated with Computer Companies with an assigned weight of 60%. The "See Also" column 362 points to a record associated with the concept "Microsoft" with a weight of 70%, where the weight percentage reflects the similarity between the initial term "IBM" and the related term "Microsoft."
The Thesaurus illustrated in Figure 17 enhances the searching mechanisms previously described with reference to Figures 14-16b. The system first locates the record associated with a key word and locates the parent concept record pointed to by the key word record. The system may then follow some or all of the pointers in the columns 356, 358, 360 and 352 and return the OID's stored in the 'Concept Name' column 354.
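A thesaurus lookup of this kind might be sketched as follows. The dictionary layout, the `min_weight` parameter and the function name `expand` are assumptions made for illustration, as is the choice to treat synonyms as having a weight of 100; only the weights 100%, 60% and 70% come from the example of Figure 17.

```python
# Hypothetical concept records; keys stand in for the thesaurus columns.
concepts = {
    "IBM": {
        "synonyms": ["IBM", "International Business Machines"],
        "more_specific": [("IBM PC", 100)],
        "more_general": [("Computer Companies", 60)],
        "see_also": [("Microsoft", 70)],
    },
}
# Hypothetical "Parent Concept" pointers of the term records.
parent_concept = {"IBM": "IBM", "International Business Machines": "IBM"}


def expand(term, min_weight=0):
    """Return (name, weight) pairs related to `term`, highest weight first."""
    concept = concepts.get(parent_concept.get(term))
    if concept is None:
        return []
    # Assumption: synonyms are treated as fully similar (weight 100).
    related = [(name, 100) for name in concept["synonyms"] if name != term]
    for column in ("more_specific", "more_general", "see_also"):
        related += concept[column]
    related = [(name, weight) for name, weight in related if weight >= min_weight]
    return sorted(related, key=lambda pair: -pair[1])
```

For example, `expand("IBM", min_weight=70)` keeps the synonym, "IBM PC" and "Microsoft" but drops "Computer Companies", whose weight is 60.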
Since key phrases and concepts are stored as records in this system, any other columns may be used to extend the knowledge and information stored therein. In particular, through the use of OID's, the system can store any kind of relationship, including relationships other than thesaural relationships, between key phrases, concepts and other records.
Application Support
The database of the present invention has been described without reference to its interface with applications that may use the invention as their primary storage and retrieval system. As previously described with reference to Figure 2, the present database includes an interface to support applications programs. Components in the application support system include external document support, hypertext, document management and workflow, calendaring and scheduling, security and other features.
Further, the present invention includes various user interface components that have been developed to provide full access to the structure of the database of the present invention. In particular, a new kind of structured word processor will be presented. The Specification will describe each component of the application support system separately.
External documents
The present invention supports indexing of external documents. The table 100 stores the filenames of documents, such as word processor documents, where the contents of the files are not directly stored in the database. The document names may be stored in a column with a specialized "External Document" domain. The external documents may reside in the mass memory 32 or on a multi-source that interfaces with the system through device control 36.
To index documents external to the table 100, prior to processing, an external document is converted into a plain text format. Key phrases are then extracted as previously described. In particular, fields in the text can be determined and mapped to fields within the database. For example, a 'Memo' document may contain the text: 'To: John Smith. From: Mary Doe'. This text can be mapped to the fields called 'to' and 'from', and the values of these fields set accordingly. The analysis of the text in this way can be changed for different types of external documents such as memos, legal documents, spread sheets, computer source code and any other type of document. For each extracted key phrase, a start and stop point within the text is determined. A list of anchors of the format previously described, <start, stop, key phrase>, is generated by the parser and stored within the table 100 under the external document domain.
Viewing external documents
When a user views an external document on the display screen 37, the stored anchors are overlaid on top of the document such that it appears that the external document has been marked with hypertext. When the user clicks the switches 45 or 47 of the mouse 50 on a section of the external document display, the corresponding anchor is determined from the various start and stop coordinates. The OID of the key phrase corresponding to the anchor is stored within the anchor, and can be used for the purposes of retrieving the key phrase record or initiating a query as previously described.
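Determining the anchor under a click amounts to a range lookup over the stored start and stop values. The sketch below assumes character offsets into the plain text form of the document and a half-open range, neither of which is specified above; the `Anchor` type, the OID values and the function name `anchor_at` are hypothetical.

```python
from typing import NamedTuple, Optional


class Anchor(NamedTuple):
    start: int       # offset of the first character of the key phrase
    stop: int        # offset one past the last character (an assumption)
    phrase_oid: int  # OID of the key phrase record stored within the anchor


def anchor_at(anchors, offset) -> Optional[Anchor]:
    """Return the first anchor whose range contains `offset`, or None if there is none."""
    for anchor in anchors:
        if anchor.start <= offset < anchor.stop:
            return anchor
    return None


anchors = [Anchor(4, 14, 5001), Anchor(22, 30, 5002)]  # hypothetical OIDs
```

A click at offset 5 resolves to the anchor whose key phrase OID is 5001, which can then be used to retrieve the key phrase record or to start a query.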
Dynamic hypertext
The present invention supports Hypertext. Hypertext systems typically associate a region of text with a pointer to another record, as illustrated in Figure 18. This creates a 'hard-coded' link between the source and the target. When a user clicks on the source region, the target record is loaded and displayed. If the target record is absent, the hypertext jump will fail, possibly with serious consequences.
The present system uses a new approach based on a dynamic association between records. In the preferred embodiment, each hypertext region is associated with a key phrase, not a normal record. When the user clicks the switches 45 or 47 of the mouse 50 on the source region, all the records associated with the key phrase are retrieved and ranked using any of the associative search techniques previously described. As illustrated in Figure 19, the application can then display on the display screen 37 either the highest ranked item, or present all the retrieved items and allow the user to pick the one to access.
In certain applications, the user may want to access a single 'default' item. This item can be determined automatically, by picking the item at the top of the dynamically generated list, or manually, by letting the user pick the item explicitly and then preserving this choice in the anchor itself.
The generic word processor
The database of the present invention includes a novel Structured Word Processor that may be used in conjunction with the table 100.
The structured word processor of the present invention uses the "boxes and glue" paradigm introduced by Donald Knuth in TEX. According to this paradigm, a page of text is created by starting with individual characters and concatenating the characters to form larger units, called "boxes," and then combining these boxes into yet larger boxes. Figure 20a illustrates three character boxes 400, 402 and 404 concatenated to form a word box 406. Figure 20b illustrates four word boxes 410, 412, 414 and the word box 406 combined to form a horizontal line box 408. Horizontal boxes are used for words and other text tokens that are spaced horizontally inside another box, such as a line (or column width). Figure 20c illustrates the combination of the horizontal line box 408 with another horizontal line box 4242 to form a vertical box 420.
Vertical boxes are used for paragraphs and other objects that are spaced vertically inside other boxes, such as page height.
Boxes may be attached to other boxes with "glue." The glue can stretch or shrink, as needed. For example, in a justified sentence, the white space between words is stretched to force the words to line up at the right edge of the column. Glue can be used for between-character (horizontal) spacing and between-word (horizontal) spacing, including "tab" glue that "sticks" to tab markings. Glue may also be used for between-line (vertical) spacing and between-paragraph (vertical) spacing.
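The stretching of inter-word glue in a justified line can be illustrated with a short calculation. The Python sketch below is a deliberate simplification that distributes the slack evenly across all gaps; it is not the actual line-breaking method of TEX, which also limits how far glue may stretch or shrink. The function name and parameters are assumptions.

```python
def justify(word_widths, line_width, natural_space):
    """Return the x position of each word box when inter-word glue is stretched evenly."""
    gaps = len(word_widths) - 1
    if gaps < 1:
        return [0.0] * len(word_widths)
    slack = line_width - sum(word_widths) - gaps * natural_space
    glue = natural_space + slack / gaps   # negative slack shrinks the glue instead
    positions, x = [], 0.0
    for width in word_widths:
        positions.append(x)
        x += width + glue
    return positions
```

For word boxes of widths 30, 20 and 40 on a line of width 110 with a natural space of 5, each glue stretches to 10, so the last box ends exactly at the right edge of the line.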
When a record of the table 100 is edited, each word and field definition is converted into boxes. The system organizes these boxes into a tree structure of line boxes and paragraph boxes, as illustrated in Figure 21. Shown there is a record hierarchy 460, corresponding to the hierarchy of a record, and a layout hierarchy 470, corresponding to the hierarchy of a layout such as a document generated according to the word processor described with reference to Figures 20a - 20c. The record structure hierarchy 460 represents the record structure of the table 100 where a record 462 corresponds to a row in the table 100 and the record 462 includes a plurality of attributes, including attribute 464, that correspond to the columns of the table 100. In turn, the attributes may include a variety of items. For example, the attribute 464 includes text, represented by block 466, field references represented by block 468 and other items as shown.
The layout hierarchy 470 comprises a document 472 which in turn comprises a plurality of pages, including page 474. The page 474 comprises a plurality of paragraphs including paragraphs 430 and 431 and the paragraph 430 comprises a plurality of lines, including lines 432 and 434. The paragraph 431 includes line 436.
The word processor of the present invention allows the document 472 to be inserted into the record 462 by providing a plurality of boxes, including boxes 438, 440 and 442, common to both the record structure hierarchy 460 and the layout hierarchy 470. For example, the box 438 corresponds to part of the line 432 and comprises part of the text of attribute 464 as illustrated by block 466. Similarly, the box 440 corresponds to part of the line 434 and may comprise a field reference as indicated by block 468. Thus, the shared box structure as illustrated in Figure 21 allows any type of word processing document to interface with any record in the table 100.
Conceptually, each box is kept as a bitmap, and its height and width are known, so the system displays the tree structure 450 by displaying all of the bitmaps corresponding to the boxes in the tree. If the tree is changed, for example, by adding a new word, only the new word box and a relatively small number of adjacent boxes need be recalculated. Similarly, line breaks or restructuring of a paragraph does not alter most of the word boxes, which may be reused, and only the lineboxes need be recalculated.
To edit the tree structure 450 as illustrated in Figure 21, a user may click a cursor on a part of the text. The system locates the word box or glue that is being edited by recursively descending through the tree structure 450.
The word processor supports multiple fonts and special effects such as subscripts, dropcaps and other features including graphic objects. A word in a different font than a base font is in a different box and may have a different height from other boxes on a line. The height of a linebox is the height of the largest wordbox within it. Effects within a word can be handled by breaking a word into subboxes with no glue between them. Again, the height of a wordbox is the height of the largest box within it. Graphic objects, such as bitmaps, may be treated and formatted as a fixed width box.
The word processor of the present invention may be used to edit records in the table 100. The text associated with each field in a record can be considered a "paragraph" for the purposes of inter-field spacing, text flow within a field, and other formatting parameters. Storing all the fields in the same way during text-editing allows the movement of text and "flow" to appear natural.
As previously described, the text being edited is divided into fields, with each field corresponding to a column in the underlying database. Unlike a traditional static data entry form, the positions and sizes of the attributes are not fixed but are dynamic and all the features of a word-processor such as fonts and embedded graphics are available to edit the record fields.
Similarly, all of the features of a database such as lookups and mailmerge are available to the word processor. All of the attributes that apply to data entry for a particular field are enforced by the word processor. Such attributes might include a mask (such as ###-####), existence requirements, range and value constraints, etc. The fields can be explicitly labelled, or hidden and implied.
The word processor of the present invention allows existing fields to be added by typing the prefix of a field name and pressing a button. The system then completes the rest of the field name automatically. The word processor of the present invention supports other database features. For example, new fields can be created by a user by using a popup dialog box. Similarly, references to other records or important words can be added by a dialog box. With particular regard to the table 100 of the present invention, OID references may support fields within other fields and a particular field within other fields supports the use of 'templates,' where a template is a list of field references embedded in text. For example, the template "Enter the first name here <fieldref id=firstName> and the last name here <fieldref id=lastName>" would appear to the user as "Enter the first name here: John and the last name here: Doe." Templates allow a user to build dynamic forms quickly and easily without having to use complicated form drawing tools.
The user interface for the word processor of the present invention allows a user to switch between two modes of data entry. The word-processor of the present invention is used for flexible entry into one record at a time, while a columnar view is used for entering data in columns. The user can switch back and forth between these two views with no loss of data and switching from the word processor to the columnar view will cause the fields that were entered in the single item to become the columns to be displayed in the columnar view.
Finally, the 'fields within fields' that are apparent in the word processor view become separated into columns in a columnar view. The user can then make changes in columnar mode, and then, when switching back to the word processor view, the columns become combined once again.
Passwords
It is often required that access to particular data items be restricted to certain users. In order to apply these restrictions, an information management system must determine the identity of the user requesting access. This is currently done in two ways: physically measuring a unique quality of the user, or requesting information from the user. Most current information management systems rely on the second approach, by using 'passwords'. However, to avoid security problems with a password system, three guidelines are applied to passwords:
a) the password should not be made of common words, because an aggressor can use a brute force approach and a dictionary to guess the password;
b) the password should be longer rather than shorter; and
c) the password should be changed often, so that even if it is stolen it will not be valid for long.
Finally, a password should never be written down or embedded into a login script and should always be interactive.
According to the present password system, a user's identity is determined through an extensive question and answer session. The responses to certain personal questions very quickly identify the user with high accuracy. Even an accurate mimic will eventually fail to answer correctly if the question and answer session is prolonged.
For example, sample questions might be: 'What is your favorite breakfast cereal?'; 'Where were you in April 1990?' 'What color is your toothbrush?'. These questions are wide ranging and hard to mimic.
Furthermore, the correct responses are natural English sentences, with an extremely large solution space, so that a brute force approach is unlikely to be successful.
To improve the effectiveness of the response, an exact matching of user response and stored answer is not required and 'fuzzy' and 'associative' matching can be used according to the synonym, thesaurus and other features of the present invention.
According to the password system of the present invention, the user creates the list of questions and corresponding answers, which are then stored. Because the user has complete control over the questions, the user may find the process of creating the questions and answers enjoyable, and as a result, change the questions and answer list more frequently, further enhancing system security.
According to the preferred embodiment, a user creates a list of 50-100 questions and answers that are encrypted and stored. The questions can be entirely new, or can be based on a large database of interesting questions. When the user logs on the system, the system randomly selects one of the questions related to that user and presents the question to the user. The user then types in a response, which is matched against the correct answer. The matching can be fuzzy and associative, as described above. If the response matches correctly, access is allowed.
In an alternate embodiment, more security may be provided by repeatedly asking questions until a certain risk threshold is reached. For example, if the answer to 'What color is your toothbrush?' is the single word 'Red', then brute force guessing may be effective in this one case. In this scenario, repeatedly asking questions will diminish the probability of brute force success.
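The effect of repeated questioning can be estimated with a simple calculation. The sketch below assumes that every question can be guessed independently with the same probability, which is an idealization introduced here rather than something stated above; the function name and parameters are hypothetical.

```python
def questions_needed(p_guess, risk_threshold):
    """Smallest k with p_guess**k <= risk_threshold, assuming independent questions."""
    if not (0 < p_guess < 1 and 0 < risk_threshold < 1):
        raise ValueError("both arguments must be strictly between 0 and 1")
    k = 1
    while p_guess ** k > risk_threshold:
        k += 1
    return k
```

Under these assumptions, if an aggressor has a 50% chance of guessing any single answer, seven questions bring the chance of guessing all of them below 1%.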
Summary
While the invention has been described in conjunction with the preferred embodiment, it is evident that numerous alternatives, modifications, variations and uses will be apparent to those skilled in the art in light of the foregoing description. Many other adaptations of the present invention are possible.

Claims

What is claimed is:
1. A storage and retrieval system for data in a computer system including a memory, a central processing unit and a display, said storage and retrieval system including:
memory configuring means for configuring said memory according to a logical table, said logical table including:
a plurality of rows, each said row including an object identification number (OID) to identify each said row, each said row corresponding to a record of information;
a plurality of columns intersecting said plurality of rows to define a plurality of cells, each said column including an OID to identify each said column; and wherein
at least one of said rows has an OID equal to the OID to a corresponding one of said columns, said at least one row including information defining said corresponding column.
2. The system of claim 1 wherein said information defining said column includes information for searching said column.
3. The system of claim 2 wherein said information for searching said column includes a search path that references a folder, said folder including a group of rows of a similar type.
4. The system of claim 1 wherein:
said information defining said column includes information for synchronizing said column with a different column; and
said system further includes synchronizing means for synchronizing said column with a different column.
5. The system of claim 4 wherein said information for synchronizing said column with a different column includes a pointer to said different column.
6. The system of claim 1 wherein:
at least one of said plurality of rows includes information defining the type of a different row; and
at least one of said plurality of rows includes a cell that contains a pointer to said row including row type information.
7. The system of claim 1 wherein at least one of said columns defines cells that include a plurality of pointers to other columns within the same record, said pointers indicating those columns within the same record that contain defined values.
8. The system of claim 1 wherein at least one of said rows is a folder type row, said folder type row including at least one cell that contains a plurality of pointers to a plurality of other rows included within said folder.
9. The system of claim 8 wherein said plurality of other rows included within said folder each includes a cell that contains a pointer to said folder type row.
10. The system of claim 1 wherein said OID's are variable length and include data related to a session identification number and a timestamp.
11. A storage and retrieval system for data in a computer system including a memory, a central processing unit and a display, said storage and retrieval system including:
memory configuring means for configuring said memory according to a logical table, said logical table including:
a plurality of rows, each said row including an object identification number (OID) to identify each said row, each said row corresponding to a record of information;
a plurality of columns intersecting said plurality of rows to define a plurality of cells, each said column including an OID to identify each said column; and wherein
at least one of said rows contains a cell that contains a pointer to a different row; and
searching means for searching said table for said pointer.
12. The system of claim 11 wherein at least one of said plurality of rows includes information defining the type of a different row.
13. The system of claim 11 wherein at least one of said columns defines cells that include a plurality of pointers to other columns within the same record, said pointers indicating those columns within the same record that contain defined values.
14. The system of claim 11 wherein at least one of said rows is a folder type row, said folder type row including at least one cell that contains a plurality of pointers to a plurality of other rows included within said folder.
15. The system of claim 14 wherein said plurality of other rows included within said folder each includes a cell that contains a pointer to said folder type row.
16. The system of claim 11 wherein said OID's are variable length.
17. A storage and retrieval system for data in a computer system including a memory, a central processing unit and a display, said storage and retrieval system including:
memory configuring means for configuring said memory according to a logical table, said logical table including:
a plurality of rows, each said row including an object identification number (OID) to identify each said row, each said row corresponding to a record of information;
a plurality of columns intersecting said plurality of rows to define a plurality of cells, each said column including an OID to identify each said column; and
indexing means for indexing data stored in said table, said indexing means further including:
searching means for searching a plurality of cells within said table for at least two key words, said searching means capable of searching a column containing unstructured text and a column containing structured data; and
inserting means for inserting into said table rows corresponding to said at least two key words.
18. The system of claim 17 wherein:
each of said inserted rows includes a cell that contains a pointer to a searched cell that contains the keyword corresponding to said inserted row; and
said searched cells that contain a keyword corresponding to said inserted row contain a pointer to said inserted row.
19. The system of claim 18 wherein said pointer to said searched cell includes the OID's of the column and row defining said searched cell.
20. The system of claim 18 wherein said searched cells include anchors that mark said key words.
21. The system of claim 17 wherein one of said plurality of rows of said table includes a folder type row that includes a plurality of pointers to said key words.
22. The system of claim 17 wherein said searching means further includes:
means for searching for every word in a text cell;
means for searching for every entry in a column;
means for searching for data based on automatic analysis; and
means for searching for data marked by a user.
23. A storage and retrieval system for data in a computer system including a memory, a central processing unit and a display, said storage and retrieval system including:
memory configuring means for configuring said memory according to a logical table, said logical table including:
a plurality of rows, each said row including an object identification number (OID) to identify each said row, each said row corresponding to a record of information;
a plurality of columns intersecting said plurality of rows to define a plurality of cells, each said column including an OID to identify each said column; and wherein
at least one of said cells includes a pointer to an index record; and
indexing means for indexing data stored in said table, said indexing means further including:
searching means for searching said table for at least two key words; and
record creation means for creating index records for at least two key words, said index records including one or more pointers to cells in said table that contain said key words.
24. The system of claim 23 further including querying means, said querying means further including:
index look-up means for locating one of said index records according to the query of a user;
record retrieval means for retrieving at least one cell in said table pointed to by said located index record.
25. The system of claim 24 wherein said index look-up means further includes means for locating at least one of said index records pointed to by said at least one retrieved cell.
26. The system of claim 25 wherein said index look-up means and said record retrieval means each contain weighting means for weighting key words and retrieved cells according to pre-defined search criteria.
27. The system of claim 25 wherein said index look-up means and said record retrieval means each contain filtering means for filtering key words and retrieved cells according to pre-defined search criteria.
28. The system of claim 23 wherein said indexing means further includes means for indexing external documents.
29. A method for storing and retrieving data in a computer system including a memory, a central processing unit and a display, said method including the steps of:
providing an element for configuring said memory according to a logical table, said logical table including:
a plurality of rows, each said row including an object identification number (OID) to identify each said row, each said row corresponding to a record of information;
a plurality of columns intersecting said plurality of rows to define a plurality of cells, each said column including an OID to identify each said column; and wherein
at least one of said rows has an OID equal to the OID to a corresponding one of said columns, said at least one row including information defining said corresponding column.
30. The method of claim 29 wherein said information defining said column includes information for searching said column.
31. The method of claim 30 wherein said information for searching said column includes a search path that references a folder, said folder including a group of rows of a similar type.
32. The method of claim 29 wherein:
said method further includes the step of providing a component for synchronizing said column with a different column; and
said information defining said column includes information for synchronizing said column with a different column.
33. The method of claim 32 wherein said information for synchronizing said column with a different column includes a pointer to said different column.
34. The method of claim 29 wherein:
at least one of said plurality of rows includes information defining the type of a different row; and
at least one of said plurality of rows includes a cell that contains a pointer to said row including row type information.
35. The method of claim 29 wherein at least one of said columns defines cells that include a plurality of pointers to other columns within the same record, said pointers indicating those columns within the same record that contain defined values.
36. The method of claim 35 wherein at least one of said rows is a folder type row, said folder type row including at least one cell that contains a plurality of pointers to a plurality of other rows included within said folder.
37. The method of claim 36 wherein said plurality of other rows included within said folder each includes a cell that contains a pointer to said folder type row.
38. The method of claim 29 wherein said OID's are variable length and include data related to a session identification number and a timestamp.
39. A method for storing and retrieving data in a computer system including a memory, a central processing unit and a display, said method including the steps of:
providing an element for configuring said memory according to a logical table, said logical table including:
a plurality of rows, each said row including an object identification number (OID) to identify each said row, each said row corresponding to a record of information;
a plurality of columns intersecting said plurality of rows to define a plurality of cells, each said column including an OID to identify each said column; and wherein
at least one of said rows contains a cell that contains a pointer to a different row; and
providing an element for searching said table for said pointer.
40. The method of claim 39 wherein at least one of said plurality of rows includes information defining the type of a different row.
41. The method of claim 39 wherein at least one of said columns defines cells that include a plurality of pointers to other columns within the same record, said pointers indicating those columns within the same record that contain defined values.
42. The method of claim 39 wherein at least one of said rows is a folder type row, said folder type row including at least one cell that contains a plurality of pointers to a plurality of other rows included within said folder.
43. The method of claim 42 wherein said plurality of other rows included within said folder each includes a cell that contains a pointer to said folder type row.
44. The method of claim 39 wherein said OID's are variable length.
45. A method for storing and retrieving data in a computer system including a memory, a central processing unit and a display, said method including the steps of:
providing an element for configuring said memory according to a logical table, said logical table including:
a plurality of rows, each said row including an object identification number (OID) to identify each said row, each said row corresponding to a record of information;
a plurality of columns intersecting said plurality of rows to define a plurality of cells, each said column including an OID to identify each said column; and
providing an element for indexing data stored in said table, said step of providing an element for indexing data further including the steps of:
providing an element for searching a plurality of cells within said table for at least two key words, said searching element capable of searching a column containing unstructured text and a column containing structured data; and
providing an element for inserting into said table rows corresponding to said at least two key words.
46. The method of claim 45 wherein:
each of said inserted rows includes a cell that contains a pointer to a searched cell that contains the keyword corresponding to said inserted row; and
said searched cells that contain a keyword corresponding to said inserted row contain a pointer to said inserted row.
47. The method of claim 46 wherein said pointer to said searched cell includes the OID's of the column and row defining said searched cell.
48. The method of claim 46 wherein said searched cells include anchors that mark said key words.
49. The method of claim 45 wherein one of said plurality of rows of said table includes a folder type row that includes a plurality of pointers to said key words.
50. The method of claim 45 wherein said step of providing said searching element further includes the steps of:
providing an element for searching for every word in a text cell;
providing an element for searching for every entry in a column;
providing an element for searching for data based on automatic analysis; and
providing an element for searching for data marked by a user.
51. A method for storing and retrieving data in a computer system including a memory, a central processing unit and a display, said method including the steps of:
providing an element for configuring said memory according to a logical table, said logical table including:
a plurality of rows, each said row including an object identification number (OID) to identify each said row, each said row corresponding to a record of information;
a plurality of columns intersecting said plurality of rows to define a plurality of cells, each said column including an OID to identify each said column; and wherein
at least one of said cells includes a pointer to an index record; and
providing an element for indexing data stored in said table, said step of providing an element for indexing data further including the steps of:
providing an element for searching said table for at least two key words; and
providing an element for creating index records for at least two key words, said index records including one or more pointers to cells in said table that contain said key words.
52. The method of claim 51 further including the step of providing an element for querying said table, said step of providing an element for querying said table further including the steps of:
providing an element for locating one of said index records according to the query of a user;
providing an element for retrieving at least one cell in said table pointed to by said located index record.
53. The method of claim 52 wherein said step of providing an element for locating one of said index records further includes the steps of providing an element for locating at least one of said index records pointed to by said at least one retrieved cell.
54. The method of claim 53 further including the step of providing an element for weighting key words and retrieved cells according to pre-defined search criteria.
55. The method of claim 53 further including the step of providing an element for filtering key words and retrieved cells according to pre-defined search criteria.
56. The method of claim 51 further including the step of providing an element for indexing external documents.
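For illustration only, the logical table recited in claims 1 and 29, in which a row has an OID equal to the OID of the column that the row defines, might be modeled as below. This is a sketch in Python under stated assumptions and is not the claimed implementation: the LogicalTable class, its method names, the integer OIDs drawn from a counter, and the bootstrap 'name' column are all hypothetical.

```python
import itertools
from typing import Any, Dict


class LogicalTable:
    """Rows and columns share one OID space; a column is defined by a row with its OID."""

    def __init__(self) -> None:
        self._oids = itertools.count(1)  # hypothetical OID source (plain integers)
        # rows[row_oid][column_oid] -> cell value; a missing key is an undefined cell
        self.rows: Dict[int, Dict[int, Any]] = {}
        # Bootstrap column that stores column names; its defining row names itself.
        self.name_col = next(self._oids)
        self.rows[self.name_col] = {self.name_col: "name"}

    def add_column(self, name: str) -> int:
        """Create a column by inserting the row that defines it; return the shared OID."""
        oid = next(self._oids)
        self.rows[oid] = {self.name_col: name}
        return oid

    def add_row(self, cells: Dict[int, Any]) -> int:
        """Insert an ordinary record; the keys of `cells` are column OIDs."""
        oid = next(self._oids)
        self.rows[oid] = dict(cells)
        return oid

    def column_definition(self, column_oid: int) -> Dict[int, Any]:
        """Look up the row whose OID equals the column's OID."""
        return self.rows[column_oid]
```

With this sketch, calling add_column with the name 'title' returns an OID that serves both as the column key in ordinary records and as the key of the row describing that column. A cell could equally hold the OID of another row, which is one way to read the pointers recited in the folder and index claims.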
PCT/US1996/001260 1995-03-28 1996-02-01 Method and apparatus for improved information storage and retrieval system WO1996030845A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP96905298A EP0818010A1 (en) 1995-03-28 1996-02-01 Method and apparatus for improved information storage and retrieval system
AU49104/96A AU4910496A (en) 1995-03-28 1996-02-01 Method and apparatus for improved information storage and retrieval system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US08/383,752 US5729730A (en) 1995-03-28 1995-03-28 Method and apparatus for improved information storage and retrieval system
US08/383,752 1995-03-28

Publications (1)

Publication Number Publication Date
WO1996030845A1 true WO1996030845A1 (en) 1996-10-03

Family

ID=23514567

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1996/001260 WO1996030845A1 (en) 1995-03-28 1996-02-01 Method and apparatus for improved information storage and retrieval system

Country Status (5)

Country Link
US (4) US5729730A (en)
EP (1) EP0818010A1 (en)
AU (1) AU4910496A (en)
CA (1) CA2216719A1 (en)
WO (1) WO1996030845A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1258812A1 (en) * 2001-05-17 2002-11-20 Peter Pressmar Virtual database of heterogeneous data structures
US7146356B2 (en) 2003-03-21 2006-12-05 International Business Machines Corporation Real-time aggregation of unstructured data into structured data for SQL processing by a relational database engine
US7974681B2 (en) 2004-03-05 2011-07-05 Hansen Medical, Inc. Robotic catheter system
US9477749B2 (en) 2012-03-02 2016-10-25 Clarabridge, Inc. Apparatus for identifying root cause using unstructured data

Families Citing this family (228)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6182121B1 (en) * 1995-02-03 2001-01-30 Enfish, Inc. Method and apparatus for a physical storage architecture having an improved information storage and retrieval system for a shared file environment
US5729730A (en) * 1995-03-28 1998-03-17 Dex Information Systems, Inc. Method and apparatus for improved information storage and retrieval system
US6182074B1 (en) * 1995-07-25 2001-01-30 British Telecommunications Public Limited Company Automated generation of control interface to controlled element of telecommunications network
US6067552A (en) * 1995-08-21 2000-05-23 Cnet, Inc. User interface system and method for browsing a hypertext database
JPH09128466A (en) * 1995-10-26 1997-05-16 Casio Comput Co Ltd Method and device for processing table
US5799297A (en) * 1995-12-15 1998-08-25 Ncr Corporation Task workflow management system and method including an external program execution feature
JPH09233458A (en) * 1996-02-28 1997-09-05 Toshiba Corp Method and device for selecting image
JP3952518B2 (en) * 1996-03-29 2007-08-01 株式会社日立製作所 Multidimensional data processing method
AU3214697A (en) * 1996-06-03 1998-01-05 Electronic Data Systems Corporation Automated password reset
US6006227A (en) 1996-06-28 1999-12-21 Yale University Document stream operating system
US20030164856A1 (en) 1996-06-28 2003-09-04 Randy Prager Desktop, stream-based, information management system
US6457004B1 (en) * 1997-07-03 2002-09-24 Hitachi, Ltd. Document retrieval assisting method, system and service using closely displayed areas for titles and topics
DE19627472A1 (en) * 1996-07-08 1998-01-15 Ser Systeme Ag Database system
US5809502A (en) * 1996-08-09 1998-09-15 Digital Equipment Corporation Object-oriented interface for an index
US5832500A (en) * 1996-08-09 1998-11-03 Digital Equipment Corporation Method for searching an index
US5745890A (en) * 1996-08-09 1998-04-28 Digital Equipment Corporation Sequential searching of a database index using constraints on word-location pairs
JP3747525B2 (en) 1996-08-28 2006-02-22 株式会社日立製作所 Parallel database system search method
US6209005B1 (en) * 1996-12-23 2001-03-27 Apple Computer, Inc. Method and apparatus for generating and linking documents to contacts in an organizer
US6067540A (en) * 1997-02-28 2000-05-23 Oracle Corporation Bitmap segmentation
US5890147A (en) * 1997-03-07 1999-03-30 Microsoft Corporation Scope testing of documents in a search engine using document to folder mapping
US7111009B1 (en) 1997-03-14 2006-09-19 Microsoft Corporation Interactive playlist generation using annotations
JPH10333953A (en) * 1997-04-01 1998-12-18 Kokusai Zunou Sangyo Kk Integrated data base system and computer-readable recording medium recording program for managing its data base structure
US6272495B1 (en) * 1997-04-22 2001-08-07 Greg Hetherington Method and apparatus for processing free-format data
US5924100A (en) * 1997-05-06 1999-07-13 International Business Machines Corp. Flexible object representation of relational database cells having nontraditional datatypes
US5974413A (en) * 1997-07-03 1999-10-26 Activeword Systems, Inc. Semantic user interface
US6012057A (en) * 1997-07-30 2000-01-04 Quarterdeck Corporation High speed data searching for information in a computer system
US6006223A (en) * 1997-08-12 1999-12-21 International Business Machines Corporation Mapping words, phrases using sequential-pattern to find user specific trends in a text database
US6065003A (en) * 1997-08-19 2000-05-16 Microsoft Corporation System and method for finding the closest match of a data entry
US5983368A (en) * 1997-08-26 1999-11-09 International Business Machines Corporation Method and system for facilitating hierarchical storage management (HSM) testing
US5929857A (en) * 1997-09-10 1999-07-27 Oak Technology, Inc. Method and apparatus for dynamically constructing a graphic user interface from a DVD data stream
US6134563A (en) * 1997-09-19 2000-10-17 Modernsoft, Inc. Creating and editing documents
US6915265B1 (en) * 1997-10-29 2005-07-05 Janice Johnson Method and system for consolidating and distributing information
US6134558A (en) * 1997-10-31 2000-10-17 Oracle Corporation References that indicate where global database objects reside
US6108664A (en) * 1997-10-31 2000-08-22 Oracle Corporation Object views for relational data
US6058391A (en) * 1997-12-17 2000-05-02 Mci Communications Corporation Enhanced user view/update capability for managing data from relational tables
US6374256B1 (en) * 1997-12-22 2002-04-16 Sun Microsystems, Inc. Method and apparatus for creating indexes in a relational database corresponding to classes in an object-oriented application
US6345258B1 (en) * 1997-12-30 2002-02-05 William E. Pickens Information system for new home builders
US6219670B1 (en) * 1998-02-18 2001-04-17 International Business Machines Corporation Method and apparatus for filtering a table list before opening with a graphical user interface
US6247018B1 (en) 1998-04-16 2001-06-12 Platinum Technology Ip, Inc. Method for processing a file to generate a database
US6134582A (en) * 1998-05-26 2000-10-17 Microsoft Corporation System and method for managing electronic mail messages using a client-based database
US6330589B1 (en) * 1998-05-26 2001-12-11 Microsoft Corporation System and method for using a client database to manage conversation threads generated from email or news messages
JP3494920B2 (en) 1998-05-28 2004-02-09 インクリメント・ピー株式会社 Map information providing system and map information search method
US7162689B2 (en) * 1998-05-28 2007-01-09 Oracle International Corporation Schema evolution in replication
GB2341250A (en) * 1998-09-04 2000-03-08 Balaena Limited Database structure avoids duplication of stored data
US6956593B1 (en) 1998-09-15 2005-10-18 Microsoft Corporation User interface for creating, viewing and temporally positioning annotations for media content
US7051275B2 (en) 1998-09-15 2006-05-23 Microsoft Corporation Annotations for multiple versions of media content
US6226650B1 (en) * 1998-09-17 2001-05-01 Synchrologic, Inc. Database synchronization and organization system and method
US6363389B1 (en) * 1998-09-24 2002-03-26 International Business Machines Corporation Technique for creating a unique quasi-random row identifier
US6523028B1 (en) * 1998-12-03 2003-02-18 Lockhead Martin Corporation Method and system for universal querying of distributed databases
US6279004B1 (en) 1998-12-22 2001-08-21 International Business Machines Corporation Database index key versioning
US6279003B1 (en) 1998-12-22 2001-08-21 International Business Machines Corporation Self-describing multiple key format database index
US6334122B1 (en) * 1998-12-23 2001-12-25 Advanced Micro Devices, Inc. Method and apparatus for translating variable names to column names for accessing a database
US6711624B1 (en) * 1999-01-13 2004-03-23 Prodex Technologies Process of dynamically loading driver interface modules for exchanging data between disparate data hosts
US6457014B1 (en) * 1999-03-26 2002-09-24 Computer Associates Think, Inc. System and method for extracting index key data fields
US6370534B1 (en) * 1999-06-01 2002-04-09 Pliant Technologies, Inc. Blocking techniques for data storage
JP3914662B2 (en) 1999-06-30 2007-05-16 株式会社日立製作所 Database processing method and apparatus, and medium storing the processing program
US6463439B1 (en) * 1999-07-15 2002-10-08 American Management Systems, Incorporated System for accessing database tables mapped into memory for high performance data retrieval
US20020029217A1 (en) * 1999-08-09 2002-03-07 Raycam System Technology Telephone number inquiry method and database for all residents
US7099898B1 (en) 1999-08-12 2006-08-29 International Business Machines Corporation Data access system
US6532476B1 (en) 1999-11-13 2003-03-11 Precision Solutions, Inc. Software based methodology for the storage and retrieval of diverse information
US6633879B1 (en) 2000-01-04 2003-10-14 International Business Machines Corporation Method and system for optimizing direct tables and trees
US6742035B1 (en) * 2000-02-28 2004-05-25 Novell, Inc. Directory-based volume location service for a distributed file system
US20020029207A1 (en) * 2000-02-28 2002-03-07 Hyperroll, Inc. Data aggregation server for managing a multi-dimensional database and database management system having data aggregation server integrated therein
US7363290B1 (en) 2000-04-14 2008-04-22 Wachovia Corporation Item capture research system
US6684202B1 (en) * 2000-05-31 2004-01-27 Lexis Nexis Computer-based system and method for finding rules of law in text
US9038108B2 (en) * 2000-06-28 2015-05-19 Verizon Patent And Licensing Inc. Method and system for providing end user community functionality for publication and delivery of digital media content
GB0015896D0 (en) * 2000-06-28 2000-08-23 Twi Interactive Inc Multimedia publishing system
US8126313B2 (en) * 2000-06-28 2012-02-28 Verizon Business Network Services Inc. Method and system for providing a personal video recorder utilizing network-based digital media content
US7016900B2 (en) 2000-06-30 2006-03-21 Boris Gelfand Data cells and data cell generations
US7178098B2 (en) * 2000-07-13 2007-02-13 International Business Machines Corporation Method and system in an electronic spreadsheet for handling user-defined options in a copy/cut—paste operation
US7146561B2 (en) 2000-07-13 2006-12-05 International Business Machines Corporation Method and system in an electronic spreadsheet for comparing series of cells
US7272783B2 (en) 2000-07-13 2007-09-18 International Business Machines Corporation Method and system in an electronic spreadsheet for managing and handling user-defined options
US6973456B1 (en) * 2000-08-10 2005-12-06 Ross Elgart Database system and method for organizing and sharing information
AU2001287090A1 (en) * 2000-09-08 2002-03-22 Btg International Limited Customizing a legal document by extracting components from database
EP1189148A1 (en) * 2000-09-19 2002-03-20 UMA Information Technology AG Document search and analysing method and apparatus
US6718336B1 (en) * 2000-09-29 2004-04-06 Battelle Memorial Institute Data import system for data analysis system
JP2002117074A (en) * 2000-10-04 2002-04-19 Hitachi Ltd Information retrieving method
US7233940B2 (en) * 2000-11-06 2007-06-19 Answers Corporation System for processing at least partially structured data
US6898592B2 (en) * 2000-12-27 2005-05-24 Microsoft Corporation Scoping queries in a search engine
US20020152064A1 (en) * 2001-04-12 2002-10-17 International Business Machines Corporation Method, apparatus, and program for annotating documents to expand terms in a talking browser
US6904428B2 (en) * 2001-04-18 2005-06-07 Illinois Institute Of Technology Intranet mediator
US7219094B2 (en) 2001-05-10 2007-05-15 Siemens Medical Solutions Health Services Corporation Method and system for providing an adaptive interface for use in interrogating an application
US7970260B2 (en) * 2001-06-27 2011-06-28 Verizon Business Global Llc Digital media asset management system and method for supporting multiple users
US8990214B2 (en) * 2001-06-27 2015-03-24 Verizon Patent And Licensing Inc. Method and system for providing distributed editing and storage of digital media over a network
US20070089151A1 (en) * 2001-06-27 2007-04-19 Mci, Llc. Method and system for delivery of digital media experience via common instant communication clients
US8972862B2 (en) 2001-06-27 2015-03-03 Verizon Patent And Licensing Inc. Method and system for providing remote digital media ingest with centralized editorial control
US20060236221A1 (en) * 2001-06-27 2006-10-19 Mci, Llc. Method and system for providing digital media management using templates and profiles
US7461077B1 (en) 2001-07-31 2008-12-02 Nicholas Greenwood Representation of data records
US20030115082A1 (en) * 2001-08-24 2003-06-19 Jacobson Vince C. Mobile productivity tool for healthcare providers
US7747943B2 (en) * 2001-09-07 2010-06-29 Microsoft Corporation Robust anchoring of annotations to content
US7158994B1 (en) 2001-09-28 2007-01-02 Oracle International Corporation Object-oriented materialized views
US7480854B2 (en) * 2001-10-02 2009-01-20 International Business Machines Corporation Data conversion system and method
WO2007109890A1 (en) * 2006-03-29 2007-10-04 Mathieu Audet Multi-dimensional locating system and method
US7680817B2 (en) * 2001-10-15 2010-03-16 Maya-Systems Inc. Multi-dimensional locating system and method
US7606819B2 (en) * 2001-10-15 2009-10-20 Maya-Systems Inc. Multi-dimensional locating system and method
US6907422B1 (en) * 2001-12-18 2005-06-14 Siebel Systems, Inc. Method and system for access and display of data from large data sets
WO2003058284A1 (en) * 2001-12-31 2003-07-17 Lockheed Martin Corporation Methods and system for hazardous material early detection for use with mail and other objects
US7269651B2 (en) * 2002-09-26 2007-09-11 International Business Machines Corporation E-business operations measurements
US8086720B2 (en) * 2002-01-31 2011-12-27 International Business Machines Corporation Performance reporting in a network environment
US7412502B2 (en) * 2002-04-18 2008-08-12 International Business Machines Corporation Graphics for end to end component mapping and problem-solving in a network environment
US8527620B2 (en) 2003-03-06 2013-09-03 International Business Machines Corporation E-business competitive measurements
US20030149698A1 (en) * 2002-02-01 2003-08-07 Hoggatt Dana L. System and method for positioning records in a database
US20030154197A1 (en) * 2002-02-13 2003-08-14 Permutta Technologies Flexible relational data storage method and apparatus
US6996558B2 (en) 2002-02-26 2006-02-07 International Business Machines Corporation Application portability and extensibility through database schema and query abstraction
US7111020B1 (en) * 2002-03-26 2006-09-19 Oracle International Corporation Incremental refresh of materialized views containing rank function, and rewrite of queries containing rank or rownumber or min/max aggregate functions using such a materialized view
US7243301B2 (en) 2002-04-10 2007-07-10 Microsoft Corporation Common annotation framework
US20030196116A1 (en) * 2002-04-15 2003-10-16 Todd Troutman Electronic mail blocking system
US7568151B2 (en) * 2002-06-27 2009-07-28 Microsoft Corporation Notification of activity around documents
US7240046B2 (en) 2002-09-04 2007-07-03 International Business Machines Corporation Row-level security in a relational database management system
US20040049473A1 (en) * 2002-09-05 2004-03-11 David John Gower Information analytics systems and methods
US20080058106A1 (en) * 2002-10-07 2008-03-06 Maya-Systems Inc. Multi-dimensional locating game system and method
US7133885B2 (en) * 2002-11-26 2006-11-07 International Business Machines Corporation Database management system using offsets in entries with at least one varying-length column
AU2003901968A0 (en) * 2003-04-23 2003-05-15 Wolfgang Flatow A universal database schema
AU2004232862B2 (en) * 2003-04-23 2005-05-26 Certainedge Pty Ltd A universal database schema
US7765211B2 (en) * 2003-04-29 2010-07-27 International Business Machines Corporation System and method for space management of multidimensionally clustered tables
US7103588B2 (en) * 2003-05-05 2006-09-05 International Business Machines Corporation Range-clustered tables in a database management system
US7349918B2 (en) * 2003-06-30 2008-03-25 American Express Travel Related Services Company, Inc. Method and system for searching binary files
US7363581B2 (en) * 2003-08-12 2008-04-22 Accenture Global Services Gmbh Presentation generator
US7899843B2 (en) * 2003-09-19 2011-03-01 International Business Machines Corporation Expanding the scope of an annotation to an entity level
US7620624B2 (en) * 2003-10-17 2009-11-17 Yahoo! Inc. Systems and methods for indexing content for fast and scalable retrieval
US7174353B2 (en) * 2003-10-24 2007-02-06 International Business Machines Corporation Method and system for preserving an original table schema
US7900133B2 (en) 2003-12-09 2011-03-01 International Business Machines Corporation Annotation structure type determination
US8296304B2 (en) * 2004-01-26 2012-10-23 International Business Machines Corporation Method, system, and program for handling redirects in a search engine
US7293005B2 (en) * 2004-01-26 2007-11-06 International Business Machines Corporation Pipelined architecture for global analysis and index building
US7499913B2 (en) * 2004-01-26 2009-03-03 International Business Machines Corporation Method for handling anchor text
US7424467B2 (en) 2004-01-26 2008-09-09 International Business Machines Corporation Architecture for an indexer with fixed width sort and variable width sort
US7976539B2 (en) 2004-03-05 2011-07-12 Hansen Medical, Inc. System and method for denaturing and fixing collagenous tissue
US7890497B2 (en) * 2004-04-14 2011-02-15 Oracle International Corporation Using estimated cost to schedule an order for refreshing a set of materialized views (MVS)
US8478742B2 (en) * 2004-04-14 2013-07-02 Oracle Corporation Using estimated cost to refresh a set of materialized views (MVS)
US7734602B2 (en) 2004-04-14 2010-06-08 Oracle International Corporation Choosing whether to use a delayed index maintenance depending on the portion of the materialized view (MV) changed
US20060004725A1 (en) * 2004-06-08 2006-01-05 Abraido-Fandino Leonor M Automatic generation of a search engine for a structured document
US7814096B1 (en) * 2004-06-08 2010-10-12 Yahoo! Inc. Query based search engine
US7428535B1 (en) 2004-06-25 2008-09-23 Apple Inc. Automatic relevance filtering
US7886264B1 (en) 2004-06-25 2011-02-08 Apple Inc. Automatic conversion for disparate data types
US7971186B1 (en) 2004-06-25 2011-06-28 Apple Inc. Automatic execution flow ordering
US20060074836A1 (en) * 2004-09-03 2006-04-06 Biowisdom Limited System and method for graphically displaying ontology data
US20060053173A1 (en) * 2004-09-03 2006-03-09 Biowisdom Limited System and method for support of chemical data within multi-relational ontologies
US20060053382A1 (en) * 2004-09-03 2006-03-09 Biowisdom Limited System and method for facilitating user interaction with multi-relational ontologies
US20060053171A1 (en) * 2004-09-03 2006-03-09 Biowisdom Limited System and method for curating one or more multi-relational ontologies
US20060053175A1 (en) * 2004-09-03 2006-03-09 Biowisdom Limited System and method for creating, editing, and utilizing one or more rules for multi-relational ontology creation and maintenance
US20060053174A1 (en) * 2004-09-03 2006-03-09 Bio Wisdom Limited System and method for data extraction and management in multi-relational ontology creation
US7496593B2 (en) 2004-09-03 2009-02-24 Biowisdom Limited Creating a multi-relational ontology having a predetermined structure
US7505989B2 (en) 2004-09-03 2009-03-17 Biowisdom Limited System and method for creating customized ontologies
US7493333B2 (en) * 2004-09-03 2009-02-17 Biowisdom Limited System and method for parsing and/or exporting data from one or more multi-relational ontologies
US20060053172A1 (en) * 2004-09-03 2006-03-09 Biowisdom Limited System and method for creating, editing, and using multi-relational ontologies
US20060074833A1 (en) * 2004-09-03 2006-04-06 Biowisdom Limited System and method for notifying users of changes in multi-relational ontologies
US7461064B2 (en) 2004-09-24 2008-12-02 International Business Machines Corporation Method for searching documents for ranges of numeric values
US7606793B2 (en) * 2004-09-27 2009-10-20 Microsoft Corporation System and method for scoping searches using index keys
US20060085470A1 (en) * 2004-10-15 2006-04-20 Matthias Schmitt Database record templates
US7610272B2 (en) * 2004-11-29 2009-10-27 Sap Ag Materialized samples for a business warehouse query
US7363306B1 (en) * 2005-01-27 2008-04-22 Hewlett-Packard Development Company, L.P. Method and system for graphical representation
US7565217B2 (en) * 2005-04-01 2009-07-21 International Business Machines Corporation Traversal of empty regions in a searchable data structure
CA2609916A1 (en) * 2005-05-31 2006-12-07 Siemens Medical Solutions Usa, Inc. System and method for data sensitive filtering of patient demographic record queries
US8156079B1 (en) 2005-06-30 2012-04-10 Emc Corporation System and method for index processing
US7966292B1 (en) 2005-06-30 2011-06-21 Emc Corporation Index processing
US8161005B1 (en) 2005-06-30 2012-04-17 Emc Corporation Efficient index processing
US7849048B2 (en) 2005-07-05 2010-12-07 Clarabridge, Inc. System and method of making unstructured data available to structured data analysis tools
US7849049B2 (en) 2005-07-05 2010-12-07 Clarabridge, Inc. Schema and ETL tools for structured and unstructured data
US8417693B2 (en) * 2005-07-14 2013-04-09 International Business Machines Corporation Enforcing native access control to indexed documents
US7428524B2 (en) * 2005-08-05 2008-09-23 Google Inc. Large scale data storage in sparse tables
US7567973B1 (en) * 2005-08-05 2009-07-28 Google Inc. Storing a sparse table using locality groups
US7668846B1 (en) 2005-08-05 2010-02-23 Google Inc. Data reconstruction from shared update log
US20070107012A1 (en) * 2005-09-07 2007-05-10 Verizon Business Network Services Inc. Method and apparatus for providing on-demand resource allocation
US8631226B2 (en) * 2005-09-07 2014-01-14 Verizon Patent And Licensing Inc. Method and system for video monitoring
US9401080B2 (en) 2005-09-07 2016-07-26 Verizon Patent And Licensing Inc. Method and apparatus for synchronizing video frames
US9076311B2 (en) * 2005-09-07 2015-07-07 Verizon Patent And Licensing Inc. Method and apparatus for providing remote workflow management
US7627609B1 (en) 2005-09-30 2009-12-01 Emc Corporation Index processing using transformed values
US7698325B1 (en) 2005-09-30 2010-04-13 Emc Corporation Index processing for legacy systems
US7752211B1 (en) * 2005-09-30 2010-07-06 Emc Corporation Adaptive index processing
US7954049B2 (en) 2006-05-15 2011-05-31 Microsoft Corporation Annotating multimedia files along a timeline
US20070276852A1 (en) * 2006-05-25 2007-11-29 Microsoft Corporation Downloading portions of media files
US7801856B2 (en) * 2006-08-09 2010-09-21 Oracle International Corporation Using XML for flexible replication of complex types
US20080114733A1 (en) * 2006-11-14 2008-05-15 Microsoft Corporation User-structured data table indexing
US8826123B2 (en) * 2007-05-25 2014-09-02 9224-5489 Quebec Inc. Timescale for presenting information
US8069404B2 (en) 2007-08-22 2011-11-29 Maya-Systems Inc. Method of managing expected documents and system providing same
US8601392B2 (en) 2007-08-22 2013-12-03 9224-5489 Quebec Inc. Timeline for presenting information
US9348912B2 (en) * 2007-10-18 2016-05-24 Microsoft Technology Licensing, Llc Document length as a static relevance feature for ranking search results
US20090106221A1 (en) * 2007-10-18 2009-04-23 Microsoft Corporation Ranking and Providing Search Results Based In Part On A Number Of Click-Through Features
CA2657835C (en) 2008-03-07 2017-09-19 Mathieu Audet Documents discrimination system and method thereof
US8812493B2 (en) * 2008-04-11 2014-08-19 Microsoft Corporation Search results ranking using editing distance and document information
CA2666016C (en) 2008-05-15 2014-07-22 Mathieu Audet Method for building a search algorithm and method for linking documents with an object
US8893017B2 (en) 2008-05-29 2014-11-18 Adobe Systems Incorporated Tracking changes in a database tool
CA2677921C (en) 2008-09-12 2017-06-13 Mathieu Ma Audet Method of managing groups of arrays of documents
US8484351B1 (en) 2008-10-08 2013-07-09 Google Inc. Associating application-specific methods with tables used for data storage
US20100169348A1 (en) * 2008-12-31 2010-07-01 Evrichart, Inc. Systems and Methods for Handling Multiple Records
US8250026B2 (en) 2009-03-06 2012-08-21 Peoplechart Corporation Combining medical information captured in structured and unstructured data formats for use or display in a user application, interface, or view
US8140517B2 (en) * 2009-04-06 2012-03-20 International Business Machines Corporation Database query optimization using weight mapping to qualify an index
US8738635B2 (en) 2010-06-01 2014-05-27 Microsoft Corporation Detection of junk in search result ranking
US9058093B2 (en) 2011-02-01 2015-06-16 9224-5489 Quebec Inc. Active element
US20120317104A1 (en) * 2011-06-13 2012-12-13 Microsoft Corporation Using Aggregate Location Metadata to Provide a Personalized Service
US11341166B2 (en) 2011-09-01 2022-05-24 Full Circle Insights, Inc. Method and system for attributing metrics in a CRM system
US10621206B2 (en) * 2012-04-19 2020-04-14 Full Circle Insights, Inc. Method and system for recording responses in a CRM system
CA2790799C (en) 2011-09-25 2023-03-21 Mathieu Audet Method and apparatus of navigating information element axes
US8990675B2 (en) 2011-10-04 2015-03-24 Microsoft Technology Licensing, Llc Automatic relationship detection for spreadsheet data items
US9069748B2 (en) 2011-10-04 2015-06-30 Microsoft Technology Licensing, Llc Selective generation and display of data items associated with a spreadsheet
US9430114B1 (en) 2011-11-03 2016-08-30 Pervasive Software Data transformation system, graphical mapping tool, and method for creating a schema map
US9495462B2 (en) 2012-01-27 2016-11-15 Microsoft Technology Licensing, Llc Re-ranking search results
US8938428B1 (en) 2012-04-16 2015-01-20 Emc Corporation Systems and methods for efficiently locating object names in a large index of records containing object names
US9519693B2 (en) 2012-06-11 2016-12-13 9224-5489 Quebec Inc. Method and apparatus for displaying data element axes
US9646080B2 (en) 2012-06-12 2017-05-09 9224-5489 Quebec Inc. Multi-functions axis-based interface
US9836759B2 (en) 2012-08-06 2017-12-05 Randolph Ken Georgi Universal transaction associating identifier
US9811579B1 (en) * 2012-11-21 2017-11-07 Christopher A. Olson Document relational mapping
US9141669B2 (en) * 2013-01-22 2015-09-22 Go Daddy Operating Company, LLC Configuring an origin server content delivery using a pulled data list
US10372842B2 (en) * 2013-03-14 2019-08-06 Xerox Corporation Method and device for calibrating and updating a power model
EP2784699A1 (en) 2013-03-29 2014-10-01 Pilab S.A. Computer-implemented method for storing unlimited amount of data as a mind map in relational database systems
EP3159815A1 (en) 2013-06-30 2017-04-26 Pilab S.A. Database hierarchy-independent data drilling
EP2843567B1 (en) 2013-08-30 2017-05-10 Pilab S.A. Computer-implemented method for improving query execution in relational databases normalized at level 4 and above
EP2843568A1 (en) 2013-08-30 2015-03-04 Pilab S.A. Computer implemented method for creating database structures without knowledge on functioning of relational database system
US10210050B1 (en) * 2013-12-23 2019-02-19 EMC IP Holding Company LLC Consistency group driven backup
US10089380B2 (en) * 2014-01-07 2018-10-02 Samsung Electronics Co., Ltd. Method and apparatus for operating electronic device
US9460174B2 (en) 2014-05-20 2016-10-04 IfWizard Corporation Method for transporting relational data
US11636408B2 (en) 2015-01-22 2023-04-25 Visier Solutions, Inc. Techniques for manipulating and rearranging presentation of workforce data in accordance with different data-prediction scenarios available within a graphical user interface (GUI) of a computer system, and an apparatus and hardware memory implementing the techniques
US10402759B2 (en) * 2015-01-22 2019-09-03 Visier Solutions, Inc. Systems and methods of adding and reconciling dimension members
US10157400B1 (en) 2015-02-26 2018-12-18 Randolph Georgi Interoperable reward currency system, method, and apparatus
US10614478B1 (en) 2015-02-26 2020-04-07 Randolph Georgi Directed digital currency system, method, and apparatus
US20210319172A1 (en) * 2015-04-30 2021-10-14 Workiva Inc. Computing device for multiple cell linking
WO2017186774A1 (en) 2016-04-26 2017-11-02 Pilab S.A. Systems and methods for querying databases
US10068207B2 (en) 2016-06-17 2018-09-04 Snap-On Incorporated Systems and methods to generate repair orders using a taxonomy and an ontology
US20180032552A1 (en) * 2016-08-01 2018-02-01 Georgia Tech Research Corporation Configurable Hyper-Referenced Associative Object Schema
US9843974B1 (en) 2016-10-13 2017-12-12 Qualcomm Incorporated Communication beam soft handover
RU2650032C1 (en) * 2017-03-20 2018-04-06 Алексей Петрович Семенов Electronic database and method of its formation
CA3007166A1 (en) 2017-06-05 2018-12-05 9224-5489 Quebec Inc. Method and apparatus of aligning information element axes
US11443108B2 (en) 2020-08-17 2022-09-13 Workiva Inc. System and method for document management using branching
US11100281B1 (en) 2020-08-17 2021-08-24 Workiva Inc. System and method for maintaining links and revisions
US11681734B2 (en) 2020-12-09 2023-06-20 International Business Machines Corporation Organizing fragments of meaningful text
CN112699094B (en) * 2021-03-23 2021-07-13 中国信息通信研究院 File storage method, data retrieval method, corresponding device and system
US11354362B1 (en) 2021-05-06 2022-06-07 Workiva Inc. System and method for copying linked documents
US11762668B2 (en) 2021-07-06 2023-09-19 Servicenow, Inc. Centralized configuration data management and control

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0274392A2 (en) * 1987-01-08 1988-07-13 Wang Laboratories Inc. Improved relational data base system
EP0336586A2 (en) * 1988-04-08 1989-10-11 International Business Machines Corporation Data storage, retrieval and transmission in computer systems
US5201046A (en) * 1990-06-22 1993-04-06 Xidak, Inc. Relational database management system and method for storing, retrieving and modifying directed graph data structures

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5226161A (en) * 1987-08-21 1993-07-06 Wang Laboratories, Inc. Integration of data between typed data structures by mutual direct invocation between data managers corresponding to data types
JPH0370048A (en) * 1989-08-09 1991-03-26 Hitachi Ltd Dictionary generating method
US5295256A (en) * 1990-12-14 1994-03-15 Racal-Datacom, Inc. Automatic storage of persistent objects in a relational schema
JPH04271443A (en) * 1991-02-27 1992-09-28 Canon Inc Method and device for data base constitution
JP3177999B2 (en) * 1991-04-25 2001-06-18 カシオ計算機株式会社 System configuration diagram creation device
US5305389A (en) * 1991-08-30 1994-04-19 Digital Equipment Corporation Predictive cache system
JP2711204B2 (en) * 1992-03-09 1998-02-10 インターナショナル・ビジネス・マシーンズ・コーポレイション How to generate a relational database user interface
US5359724A (en) * 1992-03-30 1994-10-25 Arbor Software Corporation Method and apparatus for storing and retrieving multi-dimensional data in computer memory
US5459860A (en) * 1992-10-05 1995-10-17 International Business Machines Corporation Computerized system and process for managing a distributed database system
US5557787A (en) * 1993-02-18 1996-09-17 Fuji Xerox Co., Ltd. Table generating apparatus employing heading, layout, and table script data
JPH06251007A (en) * 1993-02-23 1994-09-09 Fuji Xerox Co Ltd Table data input device
US5303146A (en) * 1993-03-11 1994-04-12 Borland International, Inc. System and methods for improved scenario management in an electronic spreadsheet
US5560005A (en) * 1994-02-25 1996-09-24 Actamed Corp. Methods and systems for object-based relational distributed databases
US5729730A (en) * 1995-03-28 1998-03-17 Dex Information Systems, Inc. Method and apparatus for improved information storage and retrieval system
US5630005A (en) * 1996-03-22 1997-05-13 Cirrus Logic, Inc Method for seeking to a requested location within variable data rate recorded information

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1258812A1 (en) * 2001-05-17 2002-11-20 Peter Pressmar Virtual database of heterogeneous data structures
US6920457B2 (en) 2001-05-17 2005-07-19 Peter Pressmar Virtual database of heterogeneous data structures
US7146356B2 (en) 2003-03-21 2006-12-05 International Business Machines Corporation Real-time aggregation of unstructured data into structured data for SQL processing by a relational database engine
US7974681B2 (en) 2004-03-05 2011-07-05 Hansen Medical, Inc. Robotic catheter system
US9477749B2 (en) 2012-03-02 2016-10-25 Clarabridge, Inc. Apparatus for identifying root cause using unstructured data
US10372741B2 (en) 2012-03-02 2019-08-06 Clarabridge, Inc. Apparatus for automatic theme detection from unstructured data

Also Published As

Publication number Publication date
AU4910496A (en) 1996-10-16
US6151604A (en) 2000-11-21
US5893087A (en) 1999-04-06
EP0818010A1 (en) 1998-01-14
US6163775A (en) 2000-12-19
US5729730A (en) 1998-03-17
CA2216719A1 (en) 1996-10-03

Similar Documents

Publication Publication Date Title
US5729730A (en) Method and apparatus for improved information storage and retrieval system
US6182121B1 (en) Method and apparatus for a physical storage architecture having an improved information storage and retrieval system for a shared file environment
US5499359A (en) Methods for improved referential integrity in a relational database management system
Chu Information representation and retrieval in the digital age
US6523030B1 (en) Sort system for merging database entries
Zloof Office-by-Example: A business language that unifies data and word processing and electronic mail
US7487154B2 (en) Method and apparatus for generating page-level security in a computer generated report
Raymond et al. Hypertext and the Oxford English dictionary
Kowalski Information retrieval systems: theory and implementation
US5787416A (en) Methods for hypertext reporting in a relational database management system
US5991776A (en) Database system with improved methods for storing free-form data objects of data records
US6772156B1 (en) Method and apparatus for creating and displaying a table of content for a computer-generated report having page-level security
Ellis et al. In search of the unknown user: indexing, hypertext and the World Wide Web
Porter Implementing a probabilistic information retrieval system
JPH0484271A (en) Intra-information retrieval device
WO2000026839A9 (en) Advanced model for automatic extraction of skill and knowledge information from an electronic document
Borko The conceptual foundations of information systems
JPH0744579A (en) Logical structure sentence retrieval system
KR970010030B1 (en) Picture search system
Harrison et al. On integrated bibliography processing
Ingwersen et al. Means to improved subject access and representation in modern information retrieval
EP1101176A1 (en) Method and apparatus for a physical storage architecture having an improved information storage and retrieval system for a shared file environment
Kowarski et al. The document concept in a data base
Maarek Introduction to Information Retrieval for Software Reuse.
Hupfeld Hierarchical Structures in Attribute-based Namespaces and their Application to Browsing

Legal Events

Date Code Title Description
AK Designated states
Kind code of ref document: A1
Designated state(s): AL AM AT AU AZ BB BG BR BY CA CH CN CZ DE DK EE ES FI GB GE HU IS JP KE KG KP KR KZ LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK TJ TM TR TT UA UG UZ VN

AL Designated countries for regional patents
Kind code of ref document: A1
Designated state(s): KE LS MW SD SZ UG AT BE CH DE DK ES FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN ML MR NE SN TD TG

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101)

121 EP: the EPO has been informed by WIPO that EP was designated in this application

ENP Entry into the national phase
Ref document number: 2216719
Country of ref document: CA
Ref country code: CA
Ref document number: 2216719
Kind code of ref document: A
Format of ref document f/p: F

WWE WIPO information: entry into national phase
Ref document number: 1996905298
Country of ref document: EP

WWP WIPO information: published in national office
Ref document number: 1996905298
Country of ref document: EP

REG Reference to national code
Ref country code: DE
Ref legal event code: 8642

WWW WIPO information: withdrawn in national office
Ref document number: 1996905298
Country of ref document: EP