WO1998041933A1 - Method for implementing an associative memory based on a digital trie structure - Google Patents

Method for implementing an associative memory based on a digital trie structure

Publication number
WO1998041933A1
WO1998041933A1 PCT/FI1998/000192 FI9800192W WO9841933A1 WO 1998041933 A1 WO1998041933 A1 WO 1998041933A1 FI 9800192 W FI9800192 W FI 9800192W WO 9841933 A1 WO9841933 A1 WO 9841933A1
Authority
WO
WIPO (PCT)
Prior art keywords
node
nodes
bits
address
trie
Prior art date
Application number
PCT/FI1998/000192
Other languages
Finnish (fi)
French (fr)
Other versions
WO1998041933A8 (en)
Inventor
Matti Tikkanen
Jukka-Pekka Iivonen
Original Assignee
Nokia Telecommunications Oy
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nokia Telecommunications Oy
Priority to EP98908123A (patent EP0976066A1)
Priority to AU66240/98A (patent AU6624098A)
Publication of WO1998041933A1
Publication of WO1998041933A8
Priority to US09/389,574 (patent US6505206B1)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9027 Trees
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 Data processing: database and file management or data structures
    • Y10S707/953 Organization of data
    • Y10S707/956 Hierarchical
    • Y10S707/99941 Database schema or data structure
    • Y10S707/99942 Manipulating data structure, e.g. compression, compaction, compilation

Definitions

  • the present invention generally relates to implementation of an associative memory, particularly to implementation of an associative memory based on a digital trie structure.
  • the solution in accordance with the invention is intended for use primarily in connection with central memory databases, and it can be used in conjunction with all memories based on a digital trie structure.
  • The prior art unidimensional directory structure termed digital trie (the word “trie” is derived from the English word “retrieval”) is the underlying basis of the principle of the present invention.
  • Digital tries can be implemented in two types: bucket tries, and tries having no buckets.
  • a digital bucket trie structure is a tree-shaped structure composed of two types of nodes: buckets and trie nodes.
  • a bucket is a data structure containing a number of data units or a number of pointers to data units or a number of search key/pointer pairs (the number may include only one data unit, one pointer or one key/pointer pair).
  • a trie node is an array guiding the retrieval, having a size of two to the power of k (2^k) elements. If an element in a trie node is in use, it refers either to a trie node at the next level in the directory tree or to a bucket. In other cases, the element is free (empty).
  • Search in the database proceeds by examining the search key (which in the case of a subscriber database in a mobile telephone network or a telephone exchange, for instance, is typically the binary numeral corresponding to the telephone number of the subscriber) k bits at a time.
  • the bits to be searched are selected in such a way that at the root level of the structure (in the first trie node), k leftmost bits are searched; at the second level of the structure, k bits next to the leftmost bits are searched, etc.
  • the bits to be searched are interpreted as an unsigned binary integer that is employed directly to index the element array contained in the trie node, the index indicating a given element in the array. If the element indicated by the index is free, the search will terminate as unsuccessful.
  • the routine branches off in the trie node either to a trie node at the next level or to a bucket. If the element refers to a bucket containing a key, the key stored therein is compared with the search key. The entire search key is thus compared only after the search has encountered a bucket. Where the keys are equal, the search is successful, and the desired data unit is obtained at the storage address indicated by the pointer of the bucket. Where the keys differ, the search terminates as unsuccessful.
  • a bucketless trie structure has no buckets, but reference to a data unit is effected from a trie node at the lowest level of a tree-shaped hierarchy, called a leaf node. Unlike buckets, the leaf nodes in a bucketless structure cannot contain data units but only pointers to data units. Also a bucket structure has leaf nodes, and hence trie nodes containing at least one pointer to a bucket (bucket structure) or to a data unit (bucketless structure) are leaf nodes. The other nodes in the trie are internal nodes. Trie nodes may thus be either internal nodes or leaf nodes.
  • Buckets are denoted with references A, B, C, D...H...M, N, O and P. Thus a bucket is a node that does not point to a lower level in the tree.
  • Trie nodes are denoted with references IN1...IN5 and elements in the trie node with reference NE in Figure 1.
  • a pointer is stored in each bucket to that storage location in the database SD at which the actual data, e.g. the telephone number of the pertinent subscriber and other information relating to that subscriber, is to be found.
  • the actual subscriber data may be stored in the database for instance as a sequential file of the type shown in the figure.
  • the search is performed on the basis of the search key of record H, for example, by first extracting from the search key the two leftmost bits (01) and interpreting them, which delivers the second element of node IN1 , containing a pointer to node IN3 at the next level. At this level, the two next bits (11) are extracted from the search key, thus yielding the fourth element of that node, pointing to record H.
  • a bucket may contain (besides a search key) an actual data file (also called by the more generic term data unit).
  • the data relating to subscriber A (Figure 1) may be located in bucket A, the data relating to subscriber B in bucket B, etc.
  • a key-pointer pair is stored in the bucket, and in the second embodiment a key and actual data are stored, even though the key is not indispensable.
  • the search key may also be multidimensional. In other words, it may comprise a number of attributes (for example the family name and one or more forenames of a subscriber).
  • a multidimensional trie structure is disclosed in international application No. PCT/FI95/00319 (published under number WO 95/34155).
  • address computation is performed in such a way that a given predetermined number of bits at a time is selected from each dimension independently of the other dimensions.
  • a fixed limit independent of the other dimensions is set for each dimension in any individual node of the trie structure, by predetermining the number of search key bits to be searched in each dimension.
  • the memory circuit requirement can be curbed when the distribution of the values of the search keys is known in advance, in which case the structure can be implemented in a static form.
  • the size of the nodes must vary dynamically as the key distribution changes. When the key distribution is uniform, the node size may be increased to make the structure flatter.
  • the node size can be maintained small, which will enable locally a more uniform key distribution and thereby smaller storage space occupancy.
  • Dynamic changes in node size presuppose implementation of address computation in such a way that in each node of the tree-shaped hierarchy constituted by the digital trie structure, a node-specific number of bits is selected from the bit string constituted by the search keys employed.
  • the choice between a fixed node size and a dynamically changing node size is dependent for example on for what kind of application the memory is intended, for example what the number of retrievals, insertions and deletions to be made in the database is and what the proportions of these operations are.
  • memories based on the digital trie structure are nevertheless attended by the problem of how the empty space inevitably created in the structure can be modelled in such a way that storage space occupancy will be as low as possible and memory efficiency (speed of memory operations) as good as possible.
  • the first of these discloses a structure employing buckets and the second a structure not employing buckets.
  • the basic idea of the invention is to compress such nodes in a digital trie structure that provide only a single path downward in a tree-shaped hierarchy.
  • the data needed to proceed in the structure and for reorganization of nodes is stored in such a compressed node, without any storage space being required for (an) element array(s).
  • the empty space present in the trie structure can be modelled in such a way that storage space occupancy in the structure will remain small with uniform as well as non-uniform key distributions.
  • the solution enables the number of memory references requiring computation time to be minimized, thus making the efficiency (speed) of the memory as good as possible.
  • each chain made up by successive compressed nodes is replaced with a single collecting node. This enables elimination of chains made up by successive compressed nodes as a result of limited word length. Elimination of chains will further improve memory efficiency and curb the need for storage space.
  • the solution in accordance with the invention also ensures effective performance of set operations, as the structure is an order-preserving digital trie.
  • Figure 1 illustrates the use of a unidimensional digital trie structure in the maintenance of subscriber data in a telephone exchange
  • Figure 2 shows a multidimensional trie structure
  • Figure 3 shows a memory structure in accordance with the invention
  • Figure 4 illustrates implementation of address computation in the memory of the invention
  • Figure 5 illustrates the structure of a trie node of the memory when the memory employs dynamic node size
  • Figures 6a and 6b illustrate the principle of forming a compressed node
  • Figures 7a and 7b show an example of the maintenance of the memory structure
  • Figure 8 illustrates the structure of a compressed node employed in the memory
  • Figure 9a illustrates the limitation posed by the word length employed on combining the nodes
  • Figure 9b shows the structure of a collecting node to be formed from the node chain of Figure 9a
  • Figure 10 shows the memory arrangement in accordance with the invention on block diagram level.
  • the trie structure has a multidimensional (generally n-dimensional) implementation.
  • a multidimensional structure is otherwise fully similar to the unidimensional structure described at the beginning, but the element array contained in the trie node is multidimensional.
  • Figure 2 exemplifies a two-dimensional 2^2 × 2^1 structure, in which one dimension in the element array comprises four elements and the other dimension two elements. Buckets pointed to from the elements in the trie node are indicated with circles in the figure.
  • the size of the trie node in the direction of each dimension is $2^{k_i}$ elements, and the total number of elements S in the trie node is also a power of two: $S = \prod_{i=1}^{n} 2^{k_i} = 2^{k_1 + k_2 + \dots + k_n}$
  • n integers (n>2), each of which may have a value in the range {0,1...2^k
  • the predetermined fixed parameter is the total length of the search key in each dimension. If for example one dimension of the search key has 256 attributes (such as first names) at most, the total length of the search key is 8 bits.
  • Figure 3 shows an example of a node N10 used in the directory structure of the memory in accordance with the invention, employing a three-dimensional search key.
  • the linearization is an arithmetic operation that can be performed on arrays of all sizes. Hence, it is irrelevant whether the trie nodes or their element arrays are considered to be unidimensional or multidimensional, as multidimensional arrays are linearized in any case to be unidimensional.
  • the elements in the array are numbered starting from zero (as shown in Figure 3), the number of the last element being one less than the product of the sizes of all dimensions.
  • the number of an element is the sum of the products of each coordinate (for example in the three-dimensional case, the x, y and z coordinates) and the sizes of the dimensions preceding it. The number thus computed is employed directly to index the unidimensional array.
  • the element number $VA_n$ is calculated in accordance with the above with the formula: $VA_n = x + 4y + 8z$, where x ∈ {0,1,2,3}, y ∈ {0,1} and z ∈ {0,1,2,3,4,5,6,7}.
  • When the (n-dimensioned) element array of a trie node of an n-dimensional trie structure is linearized, in accordance with the above the size of each dimension is $2^{k_i}$, where $k_i$ is the number of bits to be searched at a time in the dimension concerned. If a coordinate in accordance with the dimension is denoted by reference $a_j$ (j ∈ {0,1,2...n}), the linearization can be written out as
  • the linearization can be carried out by performing a multiplication in accordance with formula (3); yet it is expedient to perform the linearization by forming from the search key bits a bit string by known methods, the corresponding numeral indicating the element whose content provides the basis for proceeding in the directory tree.
  • bit interleaving is a more efficient (rapid) method than the multiplication in accordance with formula (3), since when bit interleaving is used multiplications will be converted to additions and bit shifts, which are faster to perform.
  • the most common way to implement bit interleaving is the 'z ordering'.
  • Another possible bit interleaving method is the line ordering. In the present invention, it is advantageous to use line ordering, as it affords the most efficient address computation in memory searches, but any known bit interleaving method may be employed, as long as the same method is employed in all nodes of the structure.
  • Figure 4 illustrates an example of address computation performed in the trie structure in accordance with the invention.
  • the memory employs dynamically changing node sizes and that the space is three-dimensional (dimensions x, y and z).
  • the search keys are listed one below another in the figure.
  • the indexing bits of a unidimensional element array are shown in frames denoted by continuous lines.
  • the leftmost bit in search key a_y and the leftmost bit in search key a_z.
  • In z ordering, the order of the bits is always as presently shown, in other words, the first bit of the first dimension is first extracted, thereafter the first bit of the second dimension, thereafter the first bit of the third dimension, etc.
  • the second bits are extracted from the different dimensions, starting from the first dimension. In this way, the following node-specific element array indices are obtained: 0 (node N1), 11 (node N2), 110 (node N3), 10 (node N4), 1010 (node N5), 10 (node N6) and 1100 (node N7).
  • bit interleaving method such as line ordering
  • the frames denoted by broken lines and the arrows pertaining to them illustrate the forming of an element array index in node N5, the memory employing bit interleaving with line ordering.
  • In line ordering, all bits of each dimension are extracted at a time.
  • the minimum number of bits to be extracted from the search keys of the different dimensions is first calculated in the node. This is obtained by dividing the number of bits searched in the node by the number of the dimensions and by truncating the obtained result to the closest integer.
  • the number of bits to be searched in node N5 is four and the number of dimensions three, which gives a minimum number of one (that is, at least one bit must be extracted from the search key of each dimension). Thereafter it is still to be calculated how many additional bits must be extracted from the search keys of the different dimensions.
  • the result 1 thus means that one additional bit is to be extracted. Extraction of additional bits is always started from the first searched dimension. In this exemplary case, one additional bit is thus extracted from the search key of dimension z. If the result had been two, one additional bit from the search key of dimension z and one additional bit from the search key of dimension x would have been extracted.
  • bit string 1001 is obtained as the element array index of node N5; this bit string is depicted in the lower portion of Figure 4.
  • Figure 5 illustrates the structure of an ordinary trie node when dynamically changing node size is employed. In its minimum configuration, the node thus comprises only two parts: a field indicating the number of bits to be searched in the node (reference 51) and an element array (reference 52), the number of elements in the array corresponding to a power of two. For proceeding in the directory tree, in addition to the number of bits to be searched the type of each node must be known.
  • This data can be stored in the directory structure for example in each node or in the pointer of the parent of the node.
  • information can be encoded in the pointer on whether a zero pointer (an empty element) is concerned or whether the pointer points to an ordinary trie node, a bucket or a compressed trie node (which will be described hereinbelow).
  • the encoding may be for example of the type shown in the figure.
  • information on whether the pointer points to an uncompressed node, a compressed node or a data unit is stored.
  • the node need not contain anything other than an element array.
  • compressed nodes are formed from the nodes of the trie structure in certain cases. If an ordinary trie node has only one child, this means that only one path downward in the tree passes through said trie node.
  • a trie node containing only a single pointer (path downward) is replaced with a compressed node in which the number of bits searched in said path and the computed array index value are disclosed.
  • compression also means that at least two child nodes are always maintained for ordinary (uncompressed) trie nodes in the memory structure, that is, an individual (ordinary) trie node has pointers to at least two different lower-level nodes (child nodes).
  • a compressed node replaces one or more successive internal nodes, each of which has one child, and hence the above-stated one child cannot be a bucket (or a leaf in a structure that has no buckets). Hence, a child node must be an ordinary trie node in order for compression to be possible.
  • the memory in accordance with the invention thus comprises two types of trie nodes: ordinary trie nodes containing an element array in accordance with Figure 5, and compressed nodes that will be described in the following.
  • Figures 6a and 6b illustrate the principle of forming a compressed node. For simplicity, all nodes are presumed to have a size of two elements.
  • Figures 7a and 7b show a local maintenance example when data units and associated keys are deleted from a database.
  • Figure 7a shows an initial situation in which the memory structure comprises trie nodes N111...N113 and buckets L2...L4. Thereafter bucket L2 and the pointer/record contained therein is deleted from the memory, as a result of which nodes N111 and N112 can be replaced with a compressed node CN, in which the index of the pointer contained in the node and the number of bits searched in the path replaced by the compressed node are disclosed.
  • the compressed node is in principle similar to an ordinary trie node, but instead of the entire large-size element array with only one pointer being stored, the index of the pointer concerned and the number of bits searched in the path are stored.
  • This creates the compressed node CN in accordance with Figure 7b, in which the number of bits searched in said path (3) and the index corresponding to said pointer (binary 101, i.e. 5, when bit interleaving with line ordering is used) are disclosed.
  • a compressed node thus has a virtual array replacing the information contained in the one or more node arrays existing in the path. If the compressed node replaces several ordinary trie nodes, the number of searched bits indicated in the compressed node is equal to the sum of the numbers of bits searched in the replaced nodes.
  • Figure 8 illustrates the structure of a compressed node.
  • the minimum configuration of the node comprises 3 parts: field 120 indicating the number of searched bits, field 121 storing the value of the array index, and field 122 storing a pointer to a child node.
  • the compressed node is in need of this data in order for the search to proceed with the correct value at the compressed node as well, and in order for the restructuring of the node to be possible in connection with changes in the memory structure. (Without information on the number of searched bits, the array index value cannot be calculated from the search key, and on the other hand without the array index value the calculated value could not be compared to the value stored in the node.)
  • If a collision occurs in the compressed node in connection with an insertion, i.e. the compressed node will have a new pointer, it is studied which bit in order distinguishes the index of the initial pointer and the index of the new pointer. Accordingly, a structure replacing the initial compressed node is created, in which the new compressed node comprises the index bit number insofar as there are common bits. In addition, one or more trie nodes are created in the structure at points corresponding to those bits in which the indices differ from one another.
  • If the compressed node is preceded by one or more compressed nodes or a chain of trie nodes providing only a single path, it is advantageous in view of storage space requirement and memory efficiency to further combine said nodes. Moreover, in view of memory efficiency it is advantageous to carry out the combination of nodes in such a way that only in the compressed node that is the last (lowest) in the chain the number of searched bits is smaller than the word length in the computer used. In other words, nodes are combined in such a way that the number of searched bits will be as large as possible in each compressed node. For example, three successive compressed nodes in which the numbers of searched bits are 5, 10 and 15 can be combined into one compressed node in which the number of searched bits is 30.
  • the search path or part thereof is replaced with a chain made up by several successive compressed nodes, in which the number of searched bits is the same as the word bit number, for example 32 in the Intel architecture, except for the last node where the number of bits is smaller than or equal to the word bit number.
  • Such a situation is depicted in Figure 9a, showing three successive compressed nodes CN1...CN3.
  • the numbers of bits searched in the nodes are denoted by references b, b' and b" and the values of the array indices contained in the nodes with i, i' and i", respectively.
  • the number of searched bits has a maximum value (providing that a 32-bit computer architecture is used). It is advantageous to form from a chain of several successive compressed nodes resulting from limited word length a single node collecting such compressed nodes.
  • This collecting node is formed in such a way that the pointer of the collecting node is set to point to the child of the compressed node that is last in said chain, the sum of the numbers of bits searched in the compressed nodes in the chain is set as the number of bits B searched in the collecting node, and the array indices (i.e. search words) produced by bit interleaving are inserted in the list or table T of the node in the order in which they appear in the successive compressed nodes.
  • the collecting node will be a node CN4 as shown in Figure 9b, comprising three parts: field 130 containing a pointer to said lower-level node, field 131 containing the number of searched bits B (the above sum), and list or table T containing in succession the array indices produced by bit interleaving.
  • This third part thus has a varying size.
  • the number of indices is three, since the example of Figure 9a comprises three successive nodes.
  • the number of elements (i.e., indices) EN in table T is obtained from the number of searched bits B as follows:
  • nodes containing one child will also be created in conjunction with uniform key distributions when the n-dimension of the structure is sufficiently large.
  • a bucket cannot be preceded by a compressed node, but the parent node of a bucket is always either an ordinary trie node or an empty element.
  • a compressed node cannot point to a bucket, but it always points either to another compressed node or to an ordinary trie node.
  • An empty element means that if the total number of records is smaller than the number of pointers/records that the bucket can accommodate, a tree-shaped structure is not needed yet, but one bucket will suffice in the structure (in which case said node is conceptually preceded by an empty element). It is advantageous to proceed in this way at the initial phase of starting up the memory. It is thus worthwhile starting building up the tree-shaped structure only when this is necessary.
  • the retrievals, insertions and deletions to be carried out in the memory are performed in a manner known per se.
  • the memory may also employ functional updating implemented by known methods by copying the path from root to buckets.
  • the above-described compression principle also relates to a bucketless trie structure.
  • the equivalent of a bucket is a data unit (to which a leaf node in the bucketless structure points).
  • Figure 10 shows a memory in accordance with the invention on block diagram level.
  • Each dimension has a dedicated input register, and hence there is a total of n input registers.
  • the search key of each dimension is stored in these input registers, denoted by references R1...Rn, each key in a register of its own.
  • the input registers are connected to a register TR in which the above-described search word is formed in accordance with the bit interleaving method employed.
  • the register TR is connected via adder S to the address input of memory MEM.
  • the output of the memory in turn is connected to address register AR, the output of which in turn is connected to adder S. Initially the bits selected from each register are read into the common register TR in the correct order.
  • the initial address of the first trie node is first stored in the address register AR, and the address obtained as an offset address from register TR is added to the initial address in adder S.
  • the resulting address is supplied to the address input of the memory MEM, and the data output of the memory provides the initial address of the next trie node, the address being written into the address register AR over the previous address stored therein.
  • the next selected bits are again loaded from the input registers into the common register TR in the correct order, and the array address thus obtained is added to the initial address of the relevant array (i.e., trie node), obtained from the address register AR.
  • This address is again supplied to the address input of the memory MEM, the data output of the memory thereafter providing the initial address of the next node.
  • the above-described procedure is repeated until the desired point has been accessed and recordal can be performed or the desired record read.
  • Control logic CL attends to the compression and to the correct number of bits being extracted from the registers in each node. If dynamically changing node sizes are employed in the memory, the control logic also attends to maintenance of node sizes.
  • the rapidity of the address computation can be influenced by the type of hardware configuration chosen. Since progress is by way of the above-stated bit manipulations, address computation can be accelerated by shifting from use of one processor to a multiprocessor environment in which parallel processing is carried out. An alternative implementation to the multiprocessor environment is an ASIC circuit.
  • Compression may, for example, be implemented in part of the memory only. The structure may also be implemented for keys of variable length. As was already stated at the beginning, the solution can be applied regardless of whether fixed or changing node size is employed in the memory.
  • a bucket is a data structure that may also contain another trie structure.
  • several directory structures in accordance with the present invention can be linked in succession in such a way that another directory structure (that is, another trie structure) is stored in a bucket, or a pointer contained in a bucket or a leaf points to another directory structure. Reference from a bucket or a leaf is made directly to the root node of the next directory structure.
  • a bucket contains at least one element so that the type of an individual element is selected from a group comprising a data unit, a pointer to a stored data unit, a pointer to another directory structure and another directory structure.
  • the detailed implementation of buckets is dependent on the application. In many cases, all elements in buckets may be of the same type, being e.g. either a data unit or a pointer to a data unit.
  • the bucket may contain element pairs in such a way that all pairs in the bucket are either pointer to data unit/pointer to directory structure pairs or data unit/pointer to a directory structure pairs or data unit/directory structure pairs. In such a case, for example, the prefix of the character string may be stored in the data unit and the search may be continued from the directory structure that is the pair of the data unit.
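
The linearization rule and the line-ordering bit allocation described in the definitions above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the implementation of the patent: the function names `linearize` and `line_ordering_bit_counts` are hypothetical, coordinates are passed as plain integers, and the order in which the dimensions are searched is supplied by the caller.

```python
from typing import Dict, Sequence


def linearize(coords: Sequence[int], bits: Sequence[int]) -> int:
    """Map an n-dimensional element coordinate to a flat array index.

    coords[j] is the coordinate in dimension j and bits[j] is the number of
    bits searched in that dimension, so the dimension has 2**bits[j] elements.
    Each coordinate is weighted by the product of the sizes of the
    dimensions preceding it. (Hypothetical helper, for illustration only.)
    """
    index = 0
    weight = 1  # product of the sizes of the preceding dimensions
    for coord, k in zip(coords, bits):
        if not 0 <= coord < (1 << k):
            raise ValueError("coordinate out of range for its dimension")
        index += coord * weight
        weight <<= k  # multiply the weight by 2**k
    return index


def line_ordering_bit_counts(node_bits: int, search_order: Sequence[str]) -> Dict[str, int]:
    """Split the bits searched in one node across the dimensions.

    Every dimension gets the truncated quotient node_bits // n; the
    remaining bits are handed out one each, starting from the first
    searched dimension. (Hypothetical helper, for illustration only.)
    """
    minimum, extra = divmod(node_bits, len(search_order))
    counts = {dim: minimum for dim in search_order}
    for dim in search_order[:extra]:
        counts[dim] += 1
    return counts
```

With dimension sizes 4, 2 and 8, `linearize((3, 1, 7), (2, 1, 3))` yields 63, one less than the product of the sizes. For a node that searches four bits in three dimensions, searched in the order z, x, y, `line_ordering_bit_counts(4, ["z", "x", "y"])` assigns two bits to z and one bit each to x and y.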

Abstract

The invention relates to a method for implementing a memory. The memory is implemented as a directory structure comprising a tree-shaped hierarchy having nodes at several different levels, wherein an individual node can be (i) a trie node comprising an array wherein an individual element may contain the address of a lower node in the tree-shaped hierarchy and wherein an individual element may also be empty, the number of elements in the array corresponding to a power of two, or (ii) a bucket containing at least one element so that the type of an individual element in the bucket is selected from a group including a data unit, a pointer to a stored data unit, a pointer to a node in another directory structure and another directory structure. To optimize storage space occupancy and memory efficiency, in at least part of the directory structure sets of successive trie nodes are replaced with compressed nodes in such a way that an individual set made up by successive trie nodes, from each of which there is only one address to a trie node at a lower level, is replaced with a compressed node (CN) storing an address to the node that the lowest node in the set to be replaced points to, information on the value of the search word by means of which said address is found, and information on the total number of bits from which search words are formed in the set to be replaced. The invention also relates to a structure in which buckets are not employed.
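
The search step at a compressed node, as summarized in the abstract, might look like the following Python sketch. It is a simplified illustration under assumptions that are not in the source: the key is unidimensional and held as an integer, `key_len` is its length in bits, `consumed` counts the bits already searched, and the names `CompressedNode` and `step_compressed` are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple


@dataclass
class CompressedNode:
    bits: int    # total number of bits searched in the replaced path
    index: int   # search-word value by means of which the child is found
    child: Any   # node that the lowest replaced trie node pointed to


def step_compressed(node: CompressedNode, key: int, key_len: int,
                    consumed: int) -> Optional[Tuple[Any, int]]:
    """Advance the search through one compressed node.

    Extracts the next node.bits bits of the key (leftmost bits first) and
    compares them with the stored index. Returns (child, new_consumed) on a
    match, or None when the key does not follow the single stored path.
    """
    remaining = key_len - consumed
    if node.bits > remaining:
        return None  # not enough key bits left to form the search word
    shift = remaining - node.bits
    word = (key >> shift) & ((1 << node.bits) - 1)
    if word != node.index:
        return None  # the virtual array has no pointer at this index
    return node.child, consumed + node.bits
```

Storing only the bit count and the index is what lets the node stand in for one or more element arrays: a lookup needs the bit count to form the search word and the index to compare it against.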

Description

Method for implementing an associative memory based on a digital trie structure
Field of the Invention
The present invention generally relates to implementation of an associative memory, particularly to implementation of an associative memory based on a digital trie structure. The solution in accordance with the invention is intended for use primarily in connection with central memory databases, and it can be used in conjunction with all memories based on a digital trie structure.
Background of the Invention
The prior art unidimensional directory structure termed digital trie (the word "trie" is derived from the English word "retrieval") is the underlying basis of the principle of the present invention. Digital tries can be implemented in two types: bucket tries, and tries having no buckets. A digital bucket trie structure is a tree-shaped structure composed of two types of nodes: buckets and trie nodes. A bucket is a data structure containing a number of data units or a number of pointers to data units or a number of search key/pointer pairs (the number may include only one data unit, one pointer or one key/pointer pair). A trie node, on the other hand, is an array guiding the retrieval, having a size of two to the power of k ($2^k$) elements. If an element in a trie node is in use, it refers either to a trie node at the next level in the directory tree or to a bucket. In other cases, the element is free (empty).
Search in the database proceeds by examining the search key (which in the case of a subscriber database in a mobile telephone network or a telephone exchange, for instance, is typically the binary numeral corresponding to the telephone number of the subscriber) k bits at a time. The bits to be searched are selected in such a way that at the root level of the structure (in the first trie node), k leftmost bits are searched; at the second level of the structure, k bits next to the leftmost bits are searched, etc. The bits to be searched are interpreted as an unsigned binary integer that is employed directly to index the element array contained in the trie node, the index indicating a given element in the array. If the element indicated by the index is free, the search will terminate as unsuccessful. If the element refers to a trie node at the next level, k next bits extracted from the search key are searched at that level in the manner described above. As a result of comparison, the routine branches off in the trie node either to a trie node at the next level or to a bucket. If the element refers to a bucket containing a key, the key stored therein is compared with the search key. The entire search key is thus compared only after the search has encountered a bucket. Where the keys are equal, the search is successful, and the desired data unit is obtained at the storage address indicated by the pointer of the bucket. Where the keys differ, the search terminates as unsuccessful.
A bucketless trie structure has no buckets, but reference to a data unit is effected from a trie node at the lowest level of a tree-shaped hierarchy, called a leaf node. Unlike buckets, the leaf nodes in a bucketless structure cannot contain data units but only pointers to data units. Also a bucket structure has leaf nodes, and hence trie nodes containing at least one pointer to a bucket (bucket structure) or to a data unit (bucketless structure) are leaf nodes. The other nodes in the trie are internal nodes. Trie nodes may thus be either internal nodes or leaf nodes. By means of buckets, the need for reorganizing the directory structure can be postponed, as a large number of pointers/data units can be accommodated in the buckets until a time when the need for reorganization arises.
The solution in accordance with the invention can be applied to a bucket structure as well as a bucketless structure. In the following, bucket structures will nevertheless be used as examples.
Figure 1 illustrates an example of a digital trie structure in which the key has a length of 4 bits and k=2, and thus each trie node has $2^2 = 4$ elements, and two bits extracted from the key are searched at each level. Buckets are denoted with references A, B, C, D...H...M, N, O and P. Thus a bucket is a node that does not point to a lower level in the tree. Trie nodes are denoted with references IN1...IN5 and elements in the trie node with reference NE in Figure 1.
In the exemplary case of Figure 1, the search keys for the buckets shown are as follows: A=0000, B=0001, C=0010,..., H=0111,... and P=1111. In this case, a pointer is stored in each bucket to that storage location in the database SD at which the actual data, e.g. the telephone number of the pertinent subscriber and other information relating to that subscriber, is to be found. The actual subscriber data may be stored in the database for instance as a sequential file of the type shown in the figure. The search is performed on the basis of the search key of record H, for example, by first extracting from the search key the two leftmost bits (01) and interpreting them, which delivers the second element of node IN1, containing a pointer to node IN3 at the next level. At this level, the two next bits (11) are extracted from the search key, thus yielding the fourth element of that node, pointing to record H. Instead of a pointer, a bucket may contain (besides a search key) an actual data file (also called by the more generic term data unit). Thus for example the data relating to subscriber A (Figure 1) may be located in bucket A, the data relating to subscriber B in bucket B, etc. Thus in the first embodiment of an associative memory, a key-pointer pair is stored in the bucket, and in the second embodiment a key and actual data are stored, even though the key is not indispensable.
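The search procedure described above can be sketched in Python as follows. This is only an illustrative model, not the implementation of the patent: a trie node is modelled as a Python list, a bucket as a (key, pointer) tuple holding a single pair, and the names `search`, `build_figure1_trie` and the "record X" strings are assumptions introduced for the example. The key length of 4 bits and k=2 follow the Figure 1 example.

```python
K = 2  # bits examined per level, as in the Figure 1 example


def search(root, key, k=K):
    """Follow `key` (a string of '0'/'1') through the trie, k bits at a time.

    A list models a trie node, a (stored_key, pointer) tuple models a bucket,
    and None models a free (empty) element.
    """
    node, pos = root, 0
    while isinstance(node, list):
        if pos + k > len(key):
            return None                    # key exhausted before reaching a bucket
        index = int(key[pos:pos + k], 2)   # k bits read as an unsigned integer
        node = node[index]
        pos += k
    if node is None:
        return None                        # free element: unsuccessful search
    stored_key, pointer = node
    # The entire key is compared only once a bucket has been reached.
    return pointer if stored_key == key else None


def build_figure1_trie(k=K, key_len=4):
    """Build a two-level trie holding the 16 keys 0000...1111 (buckets A...P)."""
    labels = "ABCDEFGHIJKLMNOP"
    root = [None] * (1 << k)
    for value in range(1 << key_len):
        key = format(value, f"0{key_len}b")
        node = root
        for pos in range(0, key_len - k, k):
            index = int(key[pos:pos + k], 2)
            if node[index] is None:
                node[index] = [None] * (1 << k)
            node = node[index]
        node[int(key[key_len - k:], 2)] = (key, "record " + labels[value])
    return root


root = build_figure1_trie()
print(search(root, "0111"))  # record H
```

With the key 0111, the first two bits (01) select the second element of the root and the next two bits (11) select the fourth element of the child node, which matches the walk described for record H.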
The search key may also be multidimensional. In other words, it may comprise a number of attributes (for example the family name and one or more forenames of a subscriber). Such a multidimensional trie structure is disclosed in international application No. PCT/FI95/00319 (published under number WO 95/34155). In said structure, address computation is performed in such a way that a given predetermined number of bits at a time is selected from each dimension independently of the other dimensions. Hence, a fixed limit independent of the other dimensions is set for each dimension in any individual node of the trie structure, by predetermining the number of search key bits to be searched in each dimension. With such a structure, the memory circuit requirement can be curbed when the distribution of the values of the search keys is known in advance, in which case the structure can be implemented in a static form. If the possibility of reorganizing the structure in accordance with the current key distribution to be optimal in terms of efficiency and storage space occupancy is desired, the size of the nodes must vary dynamically as the key distribution changes. When the key distribution is uniform, the node size may be increased to make the structure flatter. On the other hand, with non-uniform key distributions in connection with which storage space occupancy will present a problem in memory structures employing dynamic node size, the node size can be maintained small, which will enable locally a more uniform key distribution and thereby smaller storage space occupancy. Dynamic changes in node size presuppose implementation of address computation in such a way that in each node of the tree-shaped hierarchy constituted by the digital trie structure, a node-specific number of bits is selected from the bit string constituted by the search keys employed.
The choice between a fixed node size and a dynamically changing node size is dependent for example on for what kind of application the memory is intended, for example what the number of retrievals, insertions and deletions to be made in the database is and what the proportions of these operations are.
Irrespective of whether a fixed or changing node size is used in the memory, memories based on the digital trie structure are nevertheless attended by the problem of how the empty space inevitably created in the structure can be modelled in such a way that storage space occupancy will be as low as possible and memory efficiency (speed of memory operations) as good as possible.
Summary of the Invention
It is an objective of the present invention to provide a solution to the above problem. This objective is achieved with the method defined in the independent claims. The first of these discloses a structure employing buckets and the second a structure not employing buckets. The basic idea of the invention is to compress such nodes in a digital trie structure that provide only a single path downward in a tree-shaped hierarchy. The data needed to proceed in the structure and for reorganization of nodes is stored in such a compressed node, without any storage space being required for (an) element array(s). On account of the solution of the invention, the empty space present in the trie structure can be modelled in such a way that storage space occupancy in the structure will remain small with uniform as well as non-uniform key distributions. Furthermore, the solution enables the number of memory references requiring computation time to be minimized, thus making the efficiency (speed) of the memory as good as possible.
In accordance with a preferred embodiment of the invention, each chain made up by successive compressed nodes is replaced with a single collecting node. This enables elimination of chains made up by successive compressed nodes as a result of limited word length. Elimination of chains will further improve memory efficiency and curb the need for storage space. The solution in accordance with the invention also ensures effective performance of set operations, as the structure is an order-preserving digital trie.
Brief Description of the Drawings
In the following the invention and its preferred embodiments will be described in closer detail with reference to examples in accordance with the accompanying drawings, in which
Figure 1 illustrates the use of a unidimensional digital trie structure in the maintenance of subscriber data in a telephone exchange,
Figure 2 shows a multidimensional trie structure,
Figure 3 shows a memory structure in accordance with the invention,
Figure 4 illustrates implementation of address computation in the memory of the invention,
Figure 5 illustrates the structure of a trie node of the memory when the memory employs dynamic node size,
Figures 6a and 6b illustrate the principle of forming a compressed node,
Figures 7a and 7b show an example of the maintenance of the memory structure,
Figure 8 illustrates the structure of a compressed node employed in the memory,
Figure 9a illustrates the limitation posed by the word length employed on combining the nodes,
Figure 9b shows the structure of a collecting node to be formed from the node chain of Figure 9a, and
Figure 10 shows the memory arrangement in accordance with the invention on block diagram level.
Detailed Description of the Invention
As stated previously, in the present invention the trie structure has a multidimensional (generally n-dimensional) implementation. Such a multidimensional structure is otherwise fully similar to the unidimensional structure described at the beginning, but the element array contained in the trie node is multidimensional. Figure 2 exemplifies a two-dimensional $2^2 \times 2^1$ structure, in which one dimension in the element array comprises four elements and the other dimension two elements. Buckets pointed to from the elements in the trie node are indicated with circles in the figure.
Address computation in the multidimensional case is performed on the same principle as in the unidimensional case. The fundamental difference, however, resides in that instead of a single element array index, an index is calculated for each dimension in the element array (n indices). Each dimension thus has a search key space of its own $\{0, 1, \ldots, 2^{v_i}-1\}$ ($v_i$ is the length of the search key in bits in each dimension and $i \in \{1, \ldots, n\}$).
The size of the trie node in the direction of each dimension is $2^{k_i}$ elements, and the total number of elements S in the trie node is also a power of two:
$$S = \prod_{i=1}^{n} 2^{k_i} = 2^{k_1} \times 2^{k_2} \times 2^{k_3} \times \cdots = 2^N \qquad (1)$$
All elements in a trie node having n dimensions can thus be pointed to by n integers ($n \geq 2$), each of which may have a value in the range $\{0, 1, \ldots, 2^{k_i}-1\}$. Thus the predetermined fixed parameter is the total length of the search key in each dimension. If for example one dimension of the search key has 256 attributes (such as first names) at most, the total length of the search key is 8 bits.
Figure 3 shows an example of a node N10 used in the directory structure of the memory in accordance with the invention, employing a three-dimensional search key. In the direction of the first dimension (x), the trie node has $2^2 = 4$ elements, in the direction of the second dimension (y) $2^1 = 2$ elements, and in the direction of the third dimension (z) $2^3 = 8$ elements, which gives a total of $2^6 = 64$ elements in the trie node, numbered 0...63. Since the memory space in practical hardware implementations (for example computer equipment) is unidimensional, the multidimensional array is linearized, i.e. converted to be unidimensional, in the address computation operation (that is, in proceeding in the directory tree). The linearization is an arithmetic operation that can be performed on arrays of all sizes. Hence, it is irrelevant whether the trie nodes or their element arrays are considered to be unidimensional or multidimensional, as multidimensional arrays are linearized in any case to be unidimensional.
In linearization, the elements in the array are numbered starting from zero (as shown in Figure 3), the number of the last element being one less than the product of the sizes of all dimensions. The number of an element is the sum of the products of each coordinate (for example in the three-dimensional case, the x, y and z coordinates) and the sizes of the dimensions preceding it. The number thus computed is employed directly to index the unidimensional array.
In the case of the trie node shown in Figure 3, the element number $VA_n$ is calculated in accordance with the above with the formula:
$$VA_n = x + y \times 4 + z \times 4 \times 2 \qquad (2)$$
where $x \in \{0,1,2,3\}$, $y \in \{0,1\}$ and $z \in \{0,1,2,3,4,5,6,7\}$. Thus for example for element 54 we obtain from the coordinates thereof (2,1,6): $2 + 1 \times 4 + 6 \times 4 \times 2 = 2 + 4 + 48 = 54$. When the (n-dimensioned) element array of a trie node of an n-dimensional trie structure is linearized, in accordance with the above the size of each dimension is $2^{k_i}$, where $k_i$ is the number of bits to be searched at a time in the dimension concerned. If a coordinate in accordance with the dimension is denoted by reference $a_j$ ($j \in \{0,1,2,\ldots,n\}$), the linearization can be written out as
$$\sum_{j=1}^{n} a_j \prod_{i=0}^{j-1} 2^{k_i}, \qquad \forall j: a_j \in \{0, 1, 2, \ldots, 2^{k_j}-1\} \wedge k_0 = 0 \qquad (3)$$
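The linearization can be checked numerically with a short Python sketch. The function names `linearize` and `delinearize` are hypothetical, introduced only for illustration. Because every dimension size is a power of two, multiplying by the sizes of the preceding dimensions is written here as a left shift by the sum of their bit counts.

```python
def linearize(coords, bits_per_dim):
    """Element number: each coordinate times the sizes of the preceding dimensions.

    coords       -- one coordinate per dimension, e.g. (x, y, z)
    bits_per_dim -- number of bits searched per dimension; dimension size is 2**bits
    """
    element, shift = 0, 0
    for a, k in zip(coords, bits_per_dim):
        if not 0 <= a < (1 << k):
            raise ValueError("coordinate outside the range of its dimension")
        element += a << shift   # a * (product of the sizes of preceding dimensions)
        shift += k
    return element


def delinearize(element, bits_per_dim):
    """Inverse operation: recover the coordinates from an element number."""
    coords = []
    for k in bits_per_dim:
        coords.append(element & ((1 << k) - 1))
        element >>= k
    return tuple(coords)


# The node of Figure 3: 4 x 2 x 8 elements, i.e. 2, 1 and 3 bits per dimension.
print(linearize((2, 1, 6), (2, 1, 3)))   # 54
print(delinearize(54, (2, 1, 3)))        # (2, 1, 6)
```

The first call reproduces the worked example for element 54 with coordinates (2,1,6).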
The linearization can be carried out by performing a multiplication in accordance with formula (3); yet it is expedient to perform the linearization by forming from the search key bits a bit string by known methods, the corresponding numeral indicating the element whose content provides the basis for proceeding in the directory tree. Such a linearization method is termed bit interleaving. Bit interleaving is a more efficient (rapid) method than the multiplication in accordance with formula (3), since when bit interleaving is used multiplications will be converted to additions and bit shifts, which are faster to perform. The most common way to implement bit interleaving is the 'z ordering'. Another possible bit interleaving method is the line ordering. In the present invention, it is advantageous to use line ordering, as it affords the most efficient address computation in memory searches, but any known bit interleaving method may be employed, as long as the same method is employed in all nodes of the structure.
Figure 4 illustrates an example of address computation performed in the trie structure in accordance with the invention. In the figure, it has been presumed that the memory employs dynamically changing node sizes and that the space is three-dimensional (dimensions x, y and z). It has further been presumed that search key ax in the direction of dimension x is ax = 011011, search key ay in the direction of dimension y is ay = 110100 and search key az in the direction of dimension z is az = 101010. The search keys are listed one below another in the figure. In the nodes of the trie structure, the indexing bits of a unidimensional element array are shown in frames denoted by continuous lines. These frames illustrate how a global search key is divided into local search keys (element array indices), each being used in one node of the trie structure. All frames denoted by continuous lines relate to the first bit interleaving method, i.e. the z ordering. The nodes in the structure are denoted by references N1...N7 in the order of progress. In the first node (N1) (at the uppermost level) only a single bit is employed, which is the leftmost bit in the search key of dimension x (which is a logical zero). Thereafter the routine proceeds in the direction of the arrow to the next node (N2), in which the number of bits forming the local search key is two. These are the leftmost bit in search key ay and the leftmost bit in search key az. In z ordering, the order of the bits is always as presently shown, in other words, the first bit of the first dimension is first extracted, thereafter the first bit of the second dimension, thereafter the first bit of the third dimension, etc. After the first bit of the last dimension, the second bits are extracted from the different dimensions, starting from the first dimension. 
In this way, the following node-specific element array indices are obtained: 0 (node N1), 11 (node N2), 110 (node N3), 10 (node N4), 1010 (node N5), 10 (node N6) and 1100 (node N7).
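The z ordering of this example can be reproduced with a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and the per-node bit counts 1, 2, 3, 2, 4, 2 and 4 are read off from the lengths of the indices listed above for nodes N1...N7.

```python
def z_order_stream(keys):
    """Interleave the keys: first bit of every dimension, then second bit, etc."""
    return "".join("".join(bits) for bits in zip(*keys))


def split_into_node_indices(stream, bits_per_node):
    """Cut the interleaved bit string into node-specific element array indices."""
    indices, pos = [], 0
    for k in bits_per_node:
        indices.append(stream[pos:pos + k])
        pos += k
    return indices


a_x, a_y, a_z = "011011", "110100", "101010"
stream = z_order_stream((a_x, a_y, a_z))
print(stream)  # 011110101010101100
print(split_into_node_indices(stream, [1, 2, 3, 2, 4, 2, 4]))
# ['0', '11', '110', '10', '1010', '10', '1100']
```

The resulting list agrees with the node-specific indices given for nodes N1...N7.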
Alternatively, some other known bit interleaving method, such as line ordering, may be employed in the memory. In Figure 4, the frames denoted by broken lines and the arrows pertaining to them illustrate the forming of an element array index in node N5, the memory employing bit interleaving with line ordering. In the example of the figure, it has further been presumed that progress has been made in nodes N1...N4 so far that the first bit searched in node N5 is the third from the left in the search key in dimension z. In line ordering, all bits of each dimension are extracted at a time.
When line ordering is employed, the minimum number of bits to be extracted from the search keys of the different dimensions is first calculated in the node. This is obtained by dividing the number of bits searched in the node by the number of the dimensions and by truncating the obtained result to an integer (rounding down). In this exemplary case, the number of bits to be searched in node N5 is four and the number of dimensions three, which gives a minimum number of one (that is, at least one bit must be extracted from the search key of each dimension). Thereafter it is still to be calculated how many additional bits must be extracted from the search keys of the different dimensions. The number of additional bits A is obtained from the formula A= k mod n, where k is the number of bits to be searched in the node and n is the number of dimensions. In this exemplary case, the result is A= 4 mod 3 = 1. The result 1 thus means that one additional bit is to be extracted. Extraction of additional bits is always started from the first searched dimension. In this exemplary case, one additional bit is thus extracted from the search key of dimension z. If the result had been two, one additional bit from the search key of dimension z and one additional bit from the search key of dimension x would have been extracted.
Hence, in this exemplary case one bit from the search key of each dimension and additionally one bit from the search key of dimension z is extracted. Since in employing line ordering all bits of a dimension are extracted at a time, all bits (10) to be taken from dimension z are extracted first, thereafter all bits (0) to be used from the search key of dimension x, and lastly all bits (1) to be extracted from the search key of dimension y. Thus, when line ordering is employed, the bit string 1001 is obtained as the element array index of node N5; this bit string is depicted in the lower portion of Figure 4.
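The line ordering computation for node N5 can be sketched as follows. The function `line_order_index` and its parameters are hypothetical names chosen for this illustration. It is assumed that the caller knows how many bits of each dimension have already been consumed in nodes N1...N4 (three from x, three from y and two from z in this example) and which dimension is searched first in the node (z here).

```python
def line_order_index(keys, consumed, first_dim, k):
    """Form the element array index of one node using line ordering.

    keys      -- search key bit strings, one per dimension
    consumed  -- number of bits already used from each dimension
    first_dim -- position in `keys` of the first searched dimension in this node
    k         -- number of bits searched in this node
    """
    n = len(keys)
    minimum, additional = divmod(k, n)   # floor(k / n) and A = k mod n
    parts = []
    for offset in range(n):
        d = (first_dim + offset) % n     # dimensions in search order
        count = minimum + (1 if offset < additional else 0)
        start = consumed[d]
        parts.append(keys[d][start:start + count])  # all bits of a dimension at once
    return "".join(parts)


keys = ["011011", "110100", "101010"]    # a_x, a_y, a_z
print(line_order_index(keys, consumed=[3, 3, 2], first_dim=2, k=4))  # 1001
```

The additional bits are handed out starting from the first searched dimension, so with k=5 the same call would take two bits from z, two from x and one from y.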
Since in the memory of the invention the address computation is performed by using bit interleaving known per se, the address computation will not be described in further detail.
Since the order of bits in the local search key (element array index) to be formed in each node is constant, only the number of bits to be used must be known in the bit string formation performed in each node. This data is stored in each node. In addition, only an element array must be present in each ordinary trie node. Figure 5 illustrates the structure of an ordinary trie node when dynamically changing node size is employed. In its minimum configuration, the node thus comprises only two parts: a field indicating the number of bits to be searched in the node (reference 51) and an element array (reference 52), the number of elements in the array corresponding to a power of two. For proceeding in the directory tree, in addition to the number of bits to be searched the type of each node must be known. This data can be stored in the directory structure for example in each node or in the pointer of the parent of the node. By means of the two "extra" bits of the pointer (a and b, Figure 5), information can be encoded in the pointer on whether a zero pointer (an empty element) is concerned or whether the pointer points to an ordinary trie node, a bucket or a compressed trie node (which will be described hereinbelow). The encoding may be for example of the type shown in the figure. In the case of a bucketless structure, information on whether the pointer points to an uncompressed node, a compressed node or a data unit is stored.
If fixed node size is employed in the memory, the number of searched bits need not be stored in the node. In this case, therefore, the node need not contain anything but an element array.
To minimize storage space occupancy and to improve memory efficiency, compressed nodes are formed from the nodes of the trie structure in certain cases. If an ordinary trie node has only one child, this means that only one path downward in the tree passes through said trie node. In accordance with the invention, a trie node containing only a single pointer (path downward) is replaced with a compressed node in which the number of bits searched in said path and the computed array index value are disclosed. Since it is advantageous from the point of view of storage space occupancy to form compressed nodes from single-child trie nodes throughout the entire memory structure, compression also means that at least two child nodes are always maintained for ordinary (uncompressed) trie nodes in the memory structure, that is, an individual (ordinary) trie node has pointers to at least two different lower-level nodes (child nodes). A compressed node replaces one or more successive internal nodes, each of which has one child, and hence the above-stated one child cannot be a bucket (or a leaf in a structure that has no buckets). Hence, a child node must be an ordinary trie node in order for compression to be possible. From the point of view of optimizing storage space, it is thus advantageous to always maintain at least two child nodes for trie nodes preceding a bucket as well (i.e., if the bucket is preceded by a trie node having a size of two elements, said trie node always has two child nodes).
The memory in accordance with the invention thus comprises two types of trie nodes: ordinary trie nodes containing an element array in accordance with Figure 5, and compressed nodes that will be described in the following. Figures 6a and 6b illustrate the principle of forming a compressed node. For simplicity, all nodes are presumed to have a size of two elements. Figure 6a shows a trie structure comprising six nodes, having only one path for the five uppermost nodes. This trie structure of five nodes can be replaced with one element array shown in Figure 6b. Since the structure has a single path for these nodes, only one element of the array is in use, which in this exemplary case is element 18 circled in the figure (18=01010 when the bits are taken in line order, i.e. the x bits first and thereafter the y bits). Thus, for the five uppermost nodes the trie structure can be replaced with a compressed node in which the number of bits to be searched (5) and the value of the array index (18) are stored. Figures 7a and 7b show a local maintenance example when data units and associated keys are deleted from a database. Figure 7a shows an initial situation in which the memory structure comprises trie nodes N111...N113 and buckets L2...L4. Thereafter bucket L2 and the pointer/record contained therein is deleted from the memory, as a result of which nodes N111 and N112 can be replaced with a compressed node CN, in which the index of the pointer contained in the node and the number of bits searched in the path replaced by the compressed node are disclosed. Hence, the compressed node is in principle similar to an ordinary trie node, but instead of the entire large-size element array with only one pointer being stored, the index of the pointer concerned and the number of bits searched in the path are stored.
This creates the compressed node CN in accordance with Figure 7b, in which the number of bits searched in said path (3) and the index corresponding to said pointer (101=5 when bit interleaving with line ordering is used) are disclosed. A compressed node thus has a virtual array replacing the information contained in the one or more node arrays existing in the path. If the compressed node replaces several ordinary trie nodes, the number of searched bits indicated in the compressed node is equal to the sum of the numbers of bits searched in the replaced nodes.
Figure 8 illustrates the structure of a compressed node. The minimum configuration of the node comprises 3 parts: field 120 indicating the number of searched bits, field 121 storing the value of the array index, and field 122 storing a pointer to a child node. The compressed node is in need of this data in order for the search to proceed with the correct value at the compressed node as well, and in order for the restructuring of the node to be possible in connection with changes in the memory structure. (Without information on the number of searched bits, the array index value cannot be calculated from the search key, and on the other hand without the array index value the calculated value could not be compared to the value stored in the node.)
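The three-field compressed node and its use in a search can be modelled with the following Python sketch. It is a simplified illustration under stated assumptions, not the patented implementation: the key is unidimensional, so the indices of the replaced nodes are simply concatenated in path order; buckets are modelled as plain strings; the word length limit is ignored; and the bit counts chosen for nodes N111 and N112 (1 and 2) are hypothetical, the text giving only their sum (3) and the resulting index (101). The class and function names are likewise assumptions.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class TrieNode:
    bits: int            # number of bits searched in the node (reference 51)
    elements: List[Any]  # element array with 2**bits entries (reference 52)


@dataclass
class CompressedNode:
    bits: int            # field 120: number of searched bits
    index: int           # field 121: value of the array index
    child: Any           # field 122: pointer to the child node


def compress(node):
    """Replace a run of single-child trie nodes starting at `node`."""
    bits, index, current = 0, 0, node
    while isinstance(current, TrieNode):
        used = [(i, e) for i, e in enumerate(current.elements) if e is not None]
        # Compression requires exactly one child, and that child must itself
        # be an ordinary trie node (not a bucket).
        if len(used) != 1 or not isinstance(used[0][1], TrieNode):
            break
        i, child = used[0]
        index = (index << current.bits) | i
        bits += current.bits
        current = child
    return CompressedNode(bits, index, current) if bits else node


def lookup(node, key):
    """Walk the structure with `key` (a bit string); return a bucket or None."""
    pos = 0
    while isinstance(node, (TrieNode, CompressedNode)):
        if pos + node.bits > len(key):
            return None
        value = int(key[pos:pos + node.bits], 2)
        pos += node.bits
        if isinstance(node, CompressedNode):
            if value != node.index:   # compare with the stored index value
                return None
            node = node.child
        else:
            node = node.elements[value]
    return node


# Situation of Figure 7a after bucket L2 has been deleted (layout assumed).
n113 = TrieNode(1, ["L3", "L4"])
n112 = TrieNode(2, [None, n113, None, None])
n111 = TrieNode(1, [None, n112])
cn = compress(n111)
print(cn.bits, format(cn.index, "b"))  # 3 101
print(lookup(cn, "1010"))              # L3
```

The lookup shows why both stored values are needed: the bit count tells how many key bits to read, and the stored index is what the value read from the key is compared against.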
If a collision occurs in the compressed node in connection with an insertion, i.e. the compressed node will have a new pointer, it is determined which bit position is the first to distinguish the index of the initial pointer from the index of the new pointer. Accordingly, a structure replacing the initial compressed node is created, in which the new compressed node covers the index bits insofar as they are common to both indices. In addition, one or more trie nodes are created in the structure at points corresponding to those bits in which the indices differ from one another.
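Finding the bit at which the two indices diverge can be sketched as below. Only this first step of collision handling is shown; how the replacement trie nodes are sized is not specified in the text and is left out. The function name `split_point` and the sample values are assumptions made for illustration.

```python
def split_point(old_index, new_index, bits):
    """Return (number of common leading bits, value of the common prefix).

    Both indices are `bits` wide and are compared from the leftmost bit.
    """
    diff = old_index ^ new_index
    if diff == 0:
        return bits, old_index            # identical indices: no collision
    common = bits - diff.bit_length()     # leading bits shared by both indices
    return common, old_index >> (bits - common)


# Hypothetical 5-bit indices 10010 and 10110: they share the prefix "10".
print(split_point(0b10010, 0b10110, 5))   # (2, 2)
```

In this example a new compressed node would keep the two common bits, and an ordinary trie node would be needed from the third bit onward, where the indices differ.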
If the compressed node is preceded by one or more compressed nodes or a chain of trie nodes providing only a single path, it is advantageous in view of storage space requirement and memory efficiency to further combine said nodes. Moreover, in view of memory efficiency it is advantageous to carry out the combination of nodes in such a way that only in the compressed node that is the last (lowest) in the chain the number of searched bits is smaller than the word length in the computer used. In other words, nodes are combined in such a way that the number of searched bits will be as large as possible in each compressed node. For example, three successive compressed nodes in which the numbers of searched bits are 5, 10 and 15 can be combined into one compressed node in which the number of searched bits is 30. Likewise, for example three successive compressed nodes (or three successive ordinary trie nodes providing only one path) in which the numbers of searched bits are 10, 10 and 15 can be combined into two compressed nodes in which the numbers of searched bits are 32 and 3, the word length employed being 32. Hence, it is attempted to obtain in as many compressed nodes as possible a number of searched bits corresponding to the word length of the computer, and the possible "superfluous" bits are left for the compressed node that is lowest in the hierarchy. However, compressed nodes cannot be combined so as to make the number of bits searched in one node higher than the word length in the computer employed. Particularly in multidimensional cases (n>3), it has been found to be common that there are so many successive nodes containing one child that the path cannot be represented by a single compressed node.
Therefore, the search path or part thereof is replaced with a chain made up by several successive compressed nodes, in which the number of searched bits is the same as the word bit number, for example 32 in the Intel architecture, except for the last node where the number of bits is smaller than or equal to the word bit number.
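The way the searched bits of a single-path chain are redistributed over compressed nodes can be illustrated with a small Python function. The name `pack_bit_counts` is hypothetical, and the default word length of 32 follows the example in the text.

```python
def pack_bit_counts(bit_counts, word_length=32):
    """Redistribute the searched bits of a single-path chain of nodes.

    Every resulting compressed node searches `word_length` bits, except
    possibly the last (lowest) one, which receives the remaining bits.
    """
    total = sum(bit_counts)
    full_nodes, remainder = divmod(total, word_length)
    return [word_length] * full_nodes + ([remainder] if remainder else [])


print(pack_bit_counts([5, 10, 15]))    # [30]
print(pack_bit_counts([10, 10, 15]))   # [32, 3]
```

Both results match the examples above: 5, 10 and 15 bits fit into one node of 30 bits, while 10, 10 and 15 bits give two nodes of 32 and 3 bits.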
Such a situation is depicted in Figure 9a, showing three successive compressed nodes CN1...CN3. The numbers of bits searched in the nodes are denoted by references b, b' and b" and the values of the array indices contained in the nodes with i, i' and i", respectively. In the two uppermost nodes, the number of searched bits has a maximum value (providing that a 32-bit computer architecture is used). It is advantageous to form from a chain of several successive compressed nodes resulting from limited word length a single node collecting such compressed nodes. This collecting node is formed in such a way that the pointer of the collecting node is set to point to the child of the compressed node that is last in said chain, the sum of the numbers of bits searched in the compressed nodes in the chain is set as the number of bits B searched in the collecting node, and the array indices (i.e. search words) produced by bit interleaving are inserted in the list or table T of the node in the order in which they appear in the successive compressed nodes. Thus, the collecting node will be a node CN4 as shown in Figure 9b, comprising three parts: field 130 containing a pointer to said lower-level node, field 131 containing the number of searched bits B (the above sum), and list or table T containing in succession the array indices produced by bit interleaving. This third part thus has a varying size. In the example of the figure, the number of indices is three, since the example of Figure 9a comprises three successive nodes. The number of elements (i.e., indices) EN in table T is obtained from the number of searched bits B as follows:
$$EN = \begin{cases} B / W, & \text{if } B \,\mathrm{MOD}\, W = 0 \\ \lfloor B / W \rfloor + 1, & \text{if } B \,\mathrm{MOD}\, W \neq 0 \end{cases}$$
where $\lfloor \; \rfloor$ is a floor function truncating decimals from the number, W is the word length used, e.g. 32, and MOD refers to modulo arithmetic. Thus, the number of indices need not be stored in the collecting node as separate data, but it can be found on the basis of the number of searched bits.
The number of bits B' needed to calculate the last index in the table (denoted by reference b" in the figure), which does not necessarily equal the word length, is obtained as follows:
$$B' = \begin{cases} W, & \text{if } B \,\mathrm{MOD}\, W = 0 \\ B \,\mathrm{MOD}\, W, & \text{if } B \,\mathrm{MOD}\, W \neq 0 \end{cases}$$
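The two quantities derived from B can be computed as in the following Python sketch, which also assembles a collecting node from a chain of compressed nodes. The names `table_layout` and `collect`, the tuple representation of nodes, and the sample bit counts and index values are all assumptions made for illustration.

```python
def table_layout(B, W=32):
    """Return (EN, B'): the number of indices in table T and the number of
    bits needed to calculate the last index, both derived from B alone."""
    if B % W == 0:
        return B // W, W
    return B // W + 1, B % W


def collect(chain):
    """Form a collecting node from successive compressed nodes.

    chain -- list of (bits, index, child) tuples in top-down order
    Returns (pointer, B, T) corresponding to fields 130, 131 and table T.
    """
    pointer = chain[-1][2]                     # child of the last node in the chain
    B = sum(bits for bits, _, _ in chain)      # total number of searched bits
    T = [index for _, index, _ in chain]       # indices in order of appearance
    return pointer, B, T


# Hypothetical chain in the manner of Figure 9a: 32, 32 and 6 searched bits.
chain = [(32, 0xDEADBEEF, None), (32, 0x12345678, None), (6, 0b101101, "child")]
pointer, B, T = collect(chain)
print(B, table_layout(B))   # 70 (3, 6)
```

Here the number of entries in T equals EN, which is why the count of indices does not have to be stored separately in the collecting node.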
By forming a collecting node from several successive compressed nodes, the number of memory references (pointers) can be reduced further. In present-day computer architecture, comprising caches of various levels, memory references require considerable computation time, and hence the computation time will be diminished. At the same time, the need for storage space for pointers is eliminated.
By means of compressed nodes, the storage requirement can be effectively minimized particularly in conjunction with non-uniform key distributions, since by means of compression the depth of the structure can be arbitrarily increased on a local basis without an increased storage space requirement. Nodes containing only one child are created not only with non-uniform key distributions but also with uniform key distributions when the number of dimensions n of the structure is sufficiently large. As was already indirectly stated above, in the memory in accordance with the invention a bucket cannot be preceded by a compressed node, but the parent node of a bucket is always either an ordinary trie node or an empty element. Hence, a compressed node cannot point to a bucket, but it always points either to another compressed node or to an ordinary trie node. An empty element means that if the total number of records is smaller than the number of pointers/records that the bucket can accommodate, a tree-shaped structure is not needed yet, but one bucket will suffice in the structure (in which case said bucket is conceptually preceded by an empty element). It is advantageous to proceed in this way at the initial phase of starting up the memory. It is thus worthwhile to start building up the tree-shaped structure only when this is necessary.
In other respects, the retrievals, insertions and deletions to be carried out in the memory are performed in a manner known per se. In this regard, reference is made e.g. to the international application mentioned at the beginning, providing a more detailed description of collision situations in association with insertions, for example. Instead of conventional deletion updating, the memory may also employ functional updating implemented by known methods by copying the path from root to buckets.
As already stated at the beginning, the above-described compression principle also relates to a bucketless trie structure. In such a case, the equivalent of a bucket is a data unit (to which a leaf node in the bucketless structure points).
Figure 10 shows a memory in accordance with the invention on block diagram level. Each dimension has a dedicated input register, and hence there is a total of n input registers. The search key of each dimension is stored in these input registers, denoted by references R1...Rn, each key in a register of its own. The input registers are connected to a register TR in which the above-described search word is formed in accordance with the bit interleaving method employed. The register TR is connected via adder S to the address input of memory MEM. The output of the memory in turn is connected to address register AR, the output of which in turn is connected to adder S. Initially the bits selected from each register are read into the common register TR in the correct order. The initial address of the first trie node is first stored in the address register AR, and the address obtained as an offset address from register TR is added to the initial address in adder S. The resulting address is supplied to the address input of the memory MEM, and the data output of the memory provides the initial address of the next trie node, the address being written into the address register AR over the previous address stored therein. Thereafter the next selected bits are again loaded from the input registers into the common register TR in the correct order, and the array address thus obtained is added to the initial address of the relevant array (i.e., trie node), obtained from the address register AR. This address is again supplied to the address input of the memory MEM, the data output of the memory thereafter providing the initial address of the next node. The above-described procedure is repeated until the desired point has been accessed and the record can be stored or the desired record read.
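The register-level loop described for Figure 10 can be simulated in software. The following is a simplified sketch under several assumptions that are not taken from the application: the memory MEM is modeled as a flat Python list, all trie nodes have the same fixed size, the number of levels is known in advance, compressed nodes and the control logic CL are omitted, and the interleaving takes one bit from each dimension in turn, most significant bit first. The names `search_word`, `lookup` and `EMPTY` are hypothetical.

```python
from typing import List, Optional, Sequence

EMPTY = -1  # marker for an empty array element (assumption of this sketch)


def search_word(keys: Sequence[int], key_bits: int, start: int, count: int) -> int:
    """Form the search word held in register TR by bit interleaving.

    Position p of the interleaved bit string is bit (p // n) of
    dimension (p % n), counted from the most significant bit.
    """
    n = len(keys)
    word = 0
    for p in range(start, start + count):
        dim, depth = p % n, p // n
        bit = (keys[dim] >> (key_bits - 1 - depth)) & 1
        word = (word << 1) | bit
    return word


def lookup(mem: List[int], root: int, keys: Sequence[int], key_bits: int,
           bits_per_node: int, levels: int) -> Optional[int]:
    """Walk the trie: AR holds the node base address, TR the array offset."""
    ar = root          # address register AR: initial address of the first node
    consumed = 0       # number of interleaved bits used so far
    for _ in range(levels):
        tr = search_word(keys, key_bits, consumed, bits_per_node)
        consumed += bits_per_node
        address = ar + tr      # adder S combines base address and offset
        ar = mem[address]      # data output of MEM overwrites AR
        if ar == EMPTY:
            return None        # empty element: the key is not present
    return ar                  # address reached after the last level
```

As a usage illustration, with two 2-bit keys and 2 bits searched per node, the keys (0b10, 0b01) interleave to the bit string 1001, so the first node is indexed with 2 and the second with 1.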
Control logic CL attends to the compression and to the correct number of bits being extracted from the registers in each node. If dynamically changing node sizes are employed in the memory, the control logic also attends to maintenance of node sizes.
The speed of the address computation can be influenced by the type of hardware configuration chosen. Since progress is by way of the above-stated bit manipulations, address computation can be accelerated by shifting from the use of one processor to a multiprocessor environment in which parallel processing is carried out. An alternative implementation to the multiprocessor environment is an ASIC circuit.
Even though the invention has been described above with reference to examples in accordance with the accompanying drawings, it is obvious that the invention is not to be so restricted, but it can be modified within the scope of the inventive idea disclosed in the appended claims. Compression may, for example, be implemented in part of the memory only. The structure may also be implemented for keys of variable length. As was already stated at the beginning, the solution can be applied regardless of whether fixed or changing node size is employed in the memory. Hence, when the appended claims recite that in the node a given number of bits is selected from the bit string made up by the search keys employed, this shall be construed to cover both alternatives. Also, the address computation may continue in the bucket, provided that unsearched bits remain. The definition of a bucket given at the beginning is thus to be broadened to read that a bucket is a data structure that may also contain another trie structure. Hence, several directory structures in accordance with the present invention can be linked in succession in such a way that another directory structure (that is, another trie structure) is stored in a bucket, or a pointer contained in a bucket or a leaf points to another directory structure. Reference from a bucket or a leaf is made directly to the root node of the next directory structure.
Generally, it may be stated that a bucket contains at least one element so that the type of an individual element is selected from a group comprising a data unit, a pointer to a stored data unit, a pointer to another directory structure and another directory structure. The detailed implementation of buckets is dependent on the application. In many cases, all elements in buckets may be of the same type, being e.g. either a data unit or a pointer to a data unit. On the other hand, for instance in an application in which character strings are stored in the memory the bucket may contain element pairs in such a way that all pairs in the bucket are either pointer to data unit/pointer to directory structure pairs or data unit/pointer to a directory structure pairs or data unit/directory structure pairs. In such a case, for example, the prefix of the character string may be stored in the data unit and the search may be continued from the directory structure that is the pair of the data unit.
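The character-string case mentioned above, in which a bucket holds pairs of a data unit and a directory structure, can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the classes `Directory` and `Bucket`, the function `contains`, the use of a dictionary keyed by the first character as a stand-in for a full trie-based directory structure, and the convention that a pair without a continuing directory (`None`) marks the end of a stored string.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Directory:
    # Stand-in for a whole directory structure: maps a first character
    # to the bucket reached by the address computation.
    buckets: Dict[str, "Bucket"] = field(default_factory=dict)


@dataclass
class Bucket:
    # Each element pair: (data unit holding a prefix, next directory or None).
    pairs: List[Tuple[str, Optional[Directory]]] = field(default_factory=list)


def contains(directory: Directory, text: str) -> bool:
    """Search for text, continuing into the paired directory after a prefix."""
    bucket = directory.buckets.get(text[:1])
    if bucket is None:
        return False
    for prefix, sub in bucket.pairs:
        if not text.startswith(prefix):
            continue
        rest = text[len(prefix):]
        if sub is None:
            if rest == "":
                return True   # the data unit holds the complete string
        elif contains(sub, rest):
            return True       # the remainder was found in the next directory
    return False
```

In this model the strings "network" and "nets" could share the data unit "net", with the remainders "work" and "s" stored in the directory structure paired with it.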

Claims

1. A method for implementing a memory, in which memory data is stored as data units for each of which a dedicated storage space is assigned in the memory, in accordance with which method
- the memory is implemented as a directory structure comprising a tree-shaped hierarchy having nodes at several different hierarchical levels, wherein an individual node can be (i) a trie node comprising an array wherein an individual element may contain the address of a lower node in the tree-shaped hierarchy and wherein an individual element may also be empty, the number of elements in the array corresponding to a power of two, or (ii) a bucket containing at least one element so that the type of an individual element in the bucket is selected from a group including a data unit, a pointer to a stored data unit, a pointer to a node in another directory structure and another directory structure,
- address computation performed in the directory structure comprises the steps of
- (a) selecting in the node at the uppermost level of the tree-shaped hierarchy a given number of bits from the bit string formed by the search keys employed, forming from the selected bits a search word with which the address of the next node is sought in the node, and proceeding to said node,
- (b) selecting from the unselected bits in the bit string formed by the search keys employed a given number of bits and forming from the selected bits a search word with which the address of a further new node at a lower level is sought from the array of the node that has been accessed,
- repeating step (b) until an empty element is encountered or until the address of the new node at a lower level is the address of a bucket,
characterized in that in at least part of the directory structure, sets of successive trie nodes are replaced with compressed nodes in such a way that an individual set made up by successive trie nodes, from each of which there is only one address to a trie node at a lower level, is replaced with a compressed node (CN) storing an address to the node that the lowest node in the set to be replaced points to, information on the value of the search word by means of which said address is found, and information on the total number of bits from which search words are formed in the set to be replaced.
2. A method as claimed in claim 1, characterized in that replacement is carried out throughout the entire directory structure in such a way that all said sets are replaced with compressed nodes.
3. A method as claimed in claim 1, characterized in that replacement is also carried out on a set comprising only one trie node, the total number of bits to be stored corresponding to the number of bits from which a search word is formed in said trie node.
4. A method as claimed in claim 1, characterized in that several successive compressed nodes are formed in the directory structure in such a way that at least in the compressed node at the uppermost level a number of search key bits to be searched corresponding to the word length employed is collected.
5. A method as claimed in claim 1, characterized in that several successive compressed nodes are combined into one new compressed node, the number of bits stored in the new node being the sum of the numbers obtained from the nodes to be combined.
6. A method as claimed in claim 4, characterized in that a chain made up by successive compressed nodes wherein the number of bits searched in at least two uppermost nodes corresponds to the word length employed, is replaced with one collecting node (CN4) comprising
- an address to the node to which the lowest node in the chain contains an address,
- the sum of the numbers of searched bits obtained from the nodes in the chain, and
- the search word values contained in the chain nodes in sequence.
7. A method as claimed in claim 1, characterized in that in all uncompressed trie nodes of the memory, at least two addresses to a lower- level node are maintained.
8. A method for implementing a memory, in which memory data is stored as data units for each of which a dedicated storage space is assigned in the memory, in accordance with which method
- the memory is implemented as a directory structure comprising a tree-shaped hierarchy having nodes at several different hierarchical levels, wherein an individual node can be (i) an internal node comprising an array wherein an individual element may contain the address of a lower node in the tree-shaped hierarchy and wherein an individual element may also be empty, the number of elements in the array corresponding to a power of two, or (ii) a leaf containing at least one element the type of which is one from a group including a pointer to a stored data unit and a pointer to a node in another directory structure,
- address computation performed in the directory structure comprises the steps of
- (a) selecting in the node at the uppermost level of the tree-shaped hierarchy a given number of bits from the bit string formed by the search keys employed, forming from the selected bits a search word with which the address of the next node is sought in the node, and proceeding to said node,
- (b) selecting from the unselected bits in the bit string formed by the search keys employed a given number of bits and forming from the selected bits a search word with which the address of a further new node at a lower level is sought from the array of the node that has been accessed,
- repeating step (b) until an empty element is encountered or until the address of the new node at a lower level is the address of a leaf,
characterized in that in at least part of the directory structure, sets of successive internal nodes are replaced with compressed nodes in such a way that an individual set made up by successive internal nodes, from each of which there is only one address to an internal node at a lower level, is replaced with a compressed node (CN) storing an address to the node that the lowest node in the set to be replaced points to, information on the value of the search word by means of which said address is found, and information on the total number of bits from which search words are formed in the set to be replaced.
9. A method as claimed in claim 8, characterized in that replacement is performed in the entire directory structure in such a way that all said sets are replaced with compressed nodes.
10. A method as claimed in claim 8, characterized in that replacement is also carried out on a set comprising only one internal node, the total number of bits to be stored corresponding to the number of bits from which a search word is formed in said internal node.
11. A method as claimed in claim 8, characterized in that several successive compressed nodes are formed in the directory structure in such a way that at least in the compressed node at the uppermost level a number of search key bits to be searched corresponding to the word length employed is collected.
12. A method as claimed in claim 8, characterized in that several successive compressed nodes are combined into one new compressed node, the number of bits stored in the new node being the sum of the numbers obtained from the nodes to be combined.
13. A method as claimed in claim 11, characterized in that a chain made up by successive compressed nodes wherein the number of bits searched in at least two uppermost nodes corresponds to the word length employed is replaced with one collecting node (CN4) comprising
- an address to the node to which the lowest node in the chain contains an address,
- the sum of the numbers of searched bits obtained from the nodes in the chain, and
- the search word values contained in the chain nodes in sequence.
14. A method as claimed in claim 8, characterized in that in all uncompressed internal nodes of the memory, at least two addresses to a lower-level node are maintained.
PCT/FI1998/000192 1997-03-14 1998-03-04 Method for implementing an associative memory based on a digital trie structure WO1998041933A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP98908123A EP0976066A1 (en) 1997-03-14 1998-03-04 Method for implementing an associative memory based on a digital trie structure
AU66240/98A AU6624098A (en) 1997-03-14 1998-03-04 Method for implementing an associative memory based on a digital trie structure
US09/389,574 US6505206B1 (en) 1997-03-14 1999-09-03 Method for implementing an associative memory based on a digital trie structure

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
FI971067 1997-03-14
FI971067A FI102426B1 (en) 1997-03-14 1997-03-14 Method for implementing memory

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US09/389,574 Continuation US6505206B1 (en) 1997-03-14 1999-09-03 Method for implementing an associative memory based on a digital trie structure

Publications (2)

Publication Number Publication Date
WO1998041933A1 true WO1998041933A1 (en) 1998-09-24
WO1998041933A8 WO1998041933A8 (en) 1999-06-03

Family

ID=8548389

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/FI1998/000192 WO1998041933A1 (en) 1997-03-14 1998-03-04 Method for implementing an associative memory based on a digital trie structure

Country Status (5)

Country Link
US (1) US6505206B1 (en)
EP (1) EP0976066A1 (en)
AU (1) AU6624098A (en)
FI (1) FI102426B1 (en)
WO (1) WO1998041933A1 (en)

Families Citing this family (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6563439B1 (en) * 2000-10-31 2003-05-13 Intel Corporation Method of performing Huffman decoding
US6735595B2 (en) 2000-11-29 2004-05-11 Hewlett-Packard Development Company, L.P. Data structure and storage and retrieval method supporting ordinality based searching and data retrieval
US6785687B2 (en) 2001-06-04 2004-08-31 Hewlett-Packard Development Company, L.P. System for and method of efficient, expandable storage and retrieval of small datasets
US6671694B2 (en) * 2001-06-04 2003-12-30 Hewlett-Packard Development Company, L.P. System for and method of cache-efficient digital tree with rich pointers
US6816856B2 (en) * 2001-06-04 2004-11-09 Hewlett-Packard Development Company, L.P. System for and method of data compression in a valueless digital tree representing a bitset
US6654760B2 (en) * 2001-06-04 2003-11-25 Hewlett-Packard Development Company, L.P. System and method of providing a cache-efficient, hybrid, compressed digital tree with wide dynamic ranges and simple interface requiring no configuration or tuning
US6807618B1 (en) * 2001-08-08 2004-10-19 Emc Corporation Address translation
JP2004164318A (en) * 2002-11-13 2004-06-10 Hitachi Ltd Generation management method for backup data and storage controller to be used for this method
DE10329680A1 (en) * 2003-07-01 2005-02-10 Universität Stuttgart Processor architecture for exact pointer identification
US8180802B2 (en) * 2003-09-30 2012-05-15 International Business Machines Corporation Extensible decimal identification system for ordered nodes
WO2005052812A1 (en) * 2003-10-28 2005-06-09 France Telecom Trie-type memory device comprising a compression mechanism
US7634500B1 (en) 2003-11-03 2009-12-15 Netlogic Microsystems, Inc. Multiple string searching using content addressable memory
US7814129B2 (en) * 2005-03-11 2010-10-12 Ross Neil Williams Method and apparatus for storing data with reduced redundancy using data clusters
US8356021B2 (en) * 2005-03-11 2013-01-15 Ross Neil Williams Method and apparatus for indexing in a reduced-redundancy storage system
US8051252B2 (en) * 2005-03-11 2011-11-01 Ross Neil Williams Method and apparatus for detecting the presence of subblocks in a reduced-redundancy storage system
US7565380B1 (en) * 2005-03-24 2009-07-21 Netlogic Microsystems, Inc. Memory optimized pattern searching
US7353332B2 (en) * 2005-10-11 2008-04-01 Integrated Device Technology, Inc. Switching circuit implementing variable string matching
US7783654B1 (en) 2006-09-19 2010-08-24 Netlogic Microsystems, Inc. Multiple string searching using content addressable memory
US7676444B1 (en) 2007-01-18 2010-03-09 Netlogic Microsystems, Inc. Iterative compare operations using next success size bitmap
JP4402120B2 (en) * 2007-01-24 2010-01-20 株式会社エスグランツ Bit string search device, search method and program
US20080288527A1 (en) * 2007-05-16 2008-11-20 Yahoo! Inc. User interface for graphically representing groups of data
US8122056B2 (en) 2007-05-17 2012-02-21 Yahoo! Inc. Interactive aggregation of data on a scatter plot
US7739229B2 (en) 2007-05-22 2010-06-15 Yahoo! Inc. Exporting aggregated and un-aggregated data
US7756900B2 (en) * 2007-05-22 2010-07-13 Yahoo!, Inc. Visual interface to indicate custom binning of items
WO2008147918A2 (en) 2007-05-25 2008-12-04 Experian Information Solutions, Inc. System and method for automated detection of never-pay data sets
US8874663B2 (en) * 2009-08-28 2014-10-28 Facebook, Inc. Comparing similarity between documents for filtering unwanted documents
US9152727B1 (en) 2010-08-23 2015-10-06 Experian Marketing Solutions, Inc. Systems and methods for processing consumer information for targeted marketing applications
US9378304B2 (en) * 2013-01-16 2016-06-28 Google Inc. Searchable, mutable data structure
US9529851B1 (en) 2013-12-02 2016-12-27 Experian Information Solutions, Inc. Server architecture for electronic data quality processing
US11093450B2 (en) * 2017-09-27 2021-08-17 Vmware, Inc. Auto-tuned write-optimized key-value store
CN108319667B (en) * 2018-01-22 2021-03-05 上海星合网络科技有限公司 Multidimensional knowledge system display method and device
US11204905B2 (en) * 2018-06-27 2021-12-21 Datastax, Inc. Trie-based indices for databases

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5276868A (en) * 1990-05-23 1994-01-04 Digital Equipment Corp. Method and apparatus for pointer compression in structured databases
EP0650131A1 (en) * 1993-10-20 1995-04-26 Microsoft Corporation Computer method and storage structure for storing and accessing multidimensional data
WO1995034155A2 (en) * 1994-06-06 1995-12-14 Nokia Telecommunications Oy A method for storing and retrieving data and a memory arrangement
WO1996000945A1 (en) * 1994-06-30 1996-01-11 International Business Machines Corp. Variable length data sequence matching method and apparatus
JPH08194719A (en) * 1994-11-16 1996-07-30 Fujitsu Ltd Retrieval device and dictionary and text retrieval method

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2001390C (en) 1988-12-19 1997-12-30 Ming-Chien Shan View composition in a data-base management system
AU620994B2 (en) 1989-07-12 1992-02-27 Digital Equipment Corporation Compressed prefix matching database searching
US5319777A (en) 1990-10-16 1994-06-07 Sinper Corporation System and method for storing and retrieving information from a multidimensional array
US6115716A (en) * 1997-03-14 2000-09-05 Nokia Telecommunications Oy Method for implementing an associative memory based on a digital trie structure

Cited By (54)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2000075804A1 (en) * 1999-06-02 2000-12-14 Nokia Coproration Functional memory based on a trie structure
US6675171B2 (en) 1999-06-02 2004-01-06 Nokia Corporation Memory based on a digital trie structure
US6691131B2 (en) 1999-06-02 2004-02-10 Nokia Corporation Functional memory based on a trie structure
WO2000075805A1 (en) * 1999-06-02 2000-12-14 Nokia Corporation Memory based on a digital trie structure
US6879983B2 (en) 2000-10-12 2005-04-12 Qas Limited Method and apparatus for retrieving data representing a postal address from a plurality of postal addresses
US7366726B2 (en) 2000-10-12 2008-04-29 Qas Limited Method and apparatus for retrieving data representing a postal address from a plurality of postal addresses
US7908242B1 (en) 2005-04-11 2011-03-15 Experian Information Solutions, Inc. Systems and methods for optimizing database queries
US8065264B1 (en) 2005-04-11 2011-11-22 Experian Information Solutions, Inc. Systems and methods for optimizing database queries
US11257126B2 (en) 2006-08-17 2022-02-22 Experian Information Solutions, Inc. System and method for providing a score for a used vehicle
US7912865B2 (en) 2006-09-26 2011-03-22 Experian Marketing Solutions, Inc. System and method for linking multiple entities in a business database
US10963961B1 (en) 2006-10-05 2021-03-30 Experian Information Solutions, Inc. System and method for generating a finance attribute from tradeline data
US11631129B1 (en) 2006-10-05 2023-04-18 Experian Information Solutions, Inc System and method for generating a finance attribute from tradeline data
US10121194B1 (en) 2006-10-05 2018-11-06 Experian Information Solutions, Inc. System and method for generating a finance attribute from tradeline data
US10891691B2 (en) 2007-01-31 2021-01-12 Experian Information Solutions, Inc. System and method for providing an aggregation tool
US10650449B2 (en) 2007-01-31 2020-05-12 Experian Information Solutions, Inc. System and method for providing an aggregation tool
US9619579B1 (en) 2007-01-31 2017-04-11 Experian Information Solutions, Inc. System and method for providing an aggregation tool
US11908005B2 (en) 2007-01-31 2024-02-20 Experian Information Solutions, Inc. System and method for providing an aggregation tool
US11443373B2 (en) 2007-01-31 2022-09-13 Experian Information Solutions, Inc. System and method for providing an aggregation tool
US10402901B2 (en) 2007-01-31 2019-09-03 Experian Information Solutions, Inc. System and method for providing an aggregation tool
US10078868B1 (en) 2007-01-31 2018-09-18 Experian Information Solutions, Inc. System and method for providing an aggregation tool
US10437895B2 (en) 2007-03-30 2019-10-08 Consumerinfo.Com, Inc. Systems and methods for data verification
US11308170B2 (en) 2007-03-30 2022-04-19 Consumerinfo.Com, Inc. Systems and methods for data verification
US11347715B2 (en) 2007-09-27 2022-05-31 Experian Information Solutions, Inc. Database system for triggering event notifications based on updates to database records
US10528545B1 (en) 2007-09-27 2020-01-07 Experian Information Solutions, Inc. Database system for triggering event notifications based on updates to database records
US9690820B1 (en) 2007-09-27 2017-06-27 Experian Information Solutions, Inc. Database system for triggering event notifications based on updates to database records
US10075446B2 (en) 2008-06-26 2018-09-11 Experian Marketing Solutions, Inc. Systems and methods for providing an integrated identifier
US11769112B2 (en) 2008-06-26 2023-09-26 Experian Marketing Solutions, Llc Systems and methods for providing an integrated identifier
US11157872B2 (en) 2008-06-26 2021-10-26 Experian Marketing Solutions, Llc Systems and methods for providing an integrated identifier
US9684905B1 (en) 2010-11-22 2017-06-20 Experian Information Solutions, Inc. Systems and methods for data verification
US10176233B1 (en) 2011-07-08 2019-01-08 Consumerinfo.Com, Inc. Lifescore
US11665253B1 (en) 2011-07-08 2023-05-30 Consumerinfo.Com, Inc. LifeScore
US10798197B2 (en) 2011-07-08 2020-10-06 Consumerinfo.Com, Inc. Lifescore
US9853959B1 (en) 2012-05-07 2017-12-26 Consumerinfo.Com, Inc. Storage and maintenance of personal data
US11356430B1 (en) 2012-05-07 2022-06-07 Consumerinfo.Com, Inc. Storage and maintenance of personal data
US9697263B1 (en) 2013-03-04 2017-07-04 Experian Information Solutions, Inc. Consumer data request fulfillment system
US10580025B2 (en) 2013-11-15 2020-03-03 Experian Information Solutions, Inc. Micro-geographic aggregation system
US10102536B1 (en) 2013-11-15 2018-10-16 Experian Information Solutions, Inc. Micro-geographic aggregation system
US11107158B1 (en) 2014-02-14 2021-08-31 Experian Information Solutions, Inc. Automatic generation of code for attributes
US10262362B1 (en) 2014-02-14 2019-04-16 Experian Information Solutions, Inc. Automatic generation of code for attributes
US10936629B2 (en) 2014-05-07 2021-03-02 Consumerinfo.Com, Inc. Keeping up with the joneses
US9576030B1 (en) 2014-05-07 2017-02-21 Consumerinfo.Com, Inc. Keeping up with the joneses
US11620314B1 (en) 2014-05-07 2023-04-04 Consumerinfo.Com, Inc. User rating based on comparing groups
US10019508B1 (en) 2014-05-07 2018-07-10 Consumerinfo.Com, Inc. Keeping up with the joneses
US11010345B1 (en) 2014-12-19 2021-05-18 Experian Information Solutions, Inc. User behavior segmentation using latent topic detection
US10445152B1 (en) 2014-12-19 2019-10-15 Experian Information Solutions, Inc. Systems and methods for dynamic report generation based on automatic modeling of complex data structures
US10242019B1 (en) 2014-12-19 2019-03-26 Experian Information Solutions, Inc. User behavior segmentation using latent topic detection
US10678894B2 (en) 2016-08-24 2020-06-09 Experian Information Solutions, Inc. Disambiguation and authentication of device users
US11550886B2 (en) 2016-08-24 2023-01-10 Experian Information Solutions, Inc. Disambiguation and authentication of device users
US11227001B2 (en) 2017-01-31 2022-01-18 Experian Information Solutions, Inc. Massive scale heterogeneous data ingestion and user resolution
US11681733B2 (en) 2017-01-31 2023-06-20 Experian Information Solutions, Inc. Massive scale heterogeneous data ingestion and user resolution
US11734234B1 (en) 2018-09-07 2023-08-22 Experian Information Solutions, Inc. Data architecture for supporting multiple search models
US10963434B1 (en) 2018-09-07 2021-03-30 Experian Information Solutions, Inc. Data architecture for supporting multiple search models
US11941065B1 (en) 2019-09-13 2024-03-26 Experian Information Solutions, Inc. Single identifier platform for storing entity data
US11880377B1 (en) 2021-03-26 2024-01-23 Experian Information Solutions, Inc. Systems and methods for entity resolution

Also Published As

Publication number Publication date
FI102426B (en) 1998-11-30
WO1998041933A8 (en) 1999-06-03
EP0976066A1 (en) 2000-02-02
AU6624098A (en) 1998-10-12
US6505206B1 (en) 2003-01-07
FI971067A (en) 1998-09-15
FI971067A0 (en) 1997-03-14
FI102426B1 (en) 1998-11-30

Similar Documents

Publication Publication Date Title
EP0976066A1 (en) Method for implementing an associative memory based on a digital trie structure
EP1008063B1 (en) Method for implementing an associative memory based on a digital trie structure
US6115716A (en) Method for implementing an associative memory based on a digital trie structure
WO1998041932A1 (en) Method for implementing an associative memory based on a digital trie structure
EP1125223B1 (en) Compression of nodes in a trie structure
EP0772836B1 (en) A method for storing and retrieving data and a memory arrangement
JP3992495B2 (en) Functional memory based on tree structure
JP3849279B2 (en) Index creation method and search method
Waldvogel et al. Scalable high-speed prefix matching
US6675171B2 (en) Memory based on a digital trie structure
AU7738596A (en) Storage and retrieval of ordered sets of keys in a compact 0-complete tree
AU2004225060B2 (en) A computer implemented compact 0-complete tree dynamic storage structure and method of processing stored data
US20030208495A1 (en) User selectable editing protocol for fast flexible search engine
Han Improved fast integer sorting in linear space
JPH09179743A (en) Method and device for sorting element
Franceschini et al. Optimal cache-oblivious implicit dictionaries
Franceschini et al. Optimal implicit and cache-oblivious dictionaries over unbounded universes
Köppl Load-Balancing Succinct B Trees
Iliopoulos et al. Massively parallel suffix array construction
Iliopoulos et al. Massively Parallel Suffix Array Construction

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE GH GM GW HU ID IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW SD SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN ML MR NE SN TD TG

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
121 Ep: the epo has been informed by wipo that ep was designated in this application
AK Designated states

Kind code of ref document: C1

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE GH GM GW HU ID IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZW

AL Designated countries for regional patents

Kind code of ref document: C1

Designated state(s): GH GM KE LS MW SD SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN ML MR NE SN TD TG

WR Later publication of a revised version of an international search report
WWE Wipo information: entry into national phase

Ref document number: 1998908123

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 09389574

Country of ref document: US

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

WWP Wipo information: published in national office

Ref document number: 1998908123

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: JP

Ref document number: 1998540158

Format of ref document f/p: F

NENP Non-entry into the national phase

Ref country code: CA

WWW Wipo information: withdrawn in national office

Ref document number: 1998908123

Country of ref document: EP