WO2000007123A1 - Methods of deleting information in n-gram tree structures

Publication number
WO2000007123A1
WO2000007123A1 PCT/US1999/017133 US9917133W WO0007123A1 WO 2000007123 A1 WO2000007123 A1 WO 2000007123A1 US 9917133 W US9917133 W US 9917133W WO 0007123 A1 WO0007123 A1 WO 0007123A1
Authority
WO
WIPO (PCT)
Prior art keywords
node
tokens
deletion
leaf
counts
Prior art date
Application number
PCT/US1999/017133
Other languages
French (fr)
Inventor
Tao Zhang
Original Assignee
Triada, Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Triada, Ltd. filed Critical Triada, Ltd.
Priority to AU52394/99A priority Critical patent/AU5239499A/en
Publication of WO2000007123A1 publication Critical patent/WO2000007123A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9027 Trees

Abstract

Methods of executing delete operations in an NGRAM tree structure are disclosed. One may choose a starting point for a given deletion procedure at the root node, a leaf node, or an internal node of the NGRAM tree structure. During the deletion, one or more leaf nodes of the NGRAM tree may be constrained, and the deletion may be propagated up and down through the nodes of the tree at different levels.

Description

METHODS OF DELETING INFORMATION IN N-GRAM TREE STRUCTURES
Field of the Invention
This invention relates generally to deleting information values in an NGRAM system, similar to the deletion of records in a relational database. In particular, the invention is directed toward the deletion of information values in both single and joined NGRAM tree structures.
Background of the Invention
In a relational database, deletion of records is carried out in two steps. First, a query is performed with a constraint imposed on one or more fields, after which the records that satisfy the constraint are obtained. Then the obtained records may be deleted by eliminating them from the database.
In an NGRAM system, of the type described in commonly assigned U.S. Patent Nos. 5,245,337; 5,293,164; 5,592,667, a record is represented by a set of tokens (or indices) stored at all non-leaf nodes, and dictionary data values stored at all leaf nodes. Deletion of a record means a set of deletions and modifications of the involved tokens and dictionaries. In an NGRAM tree structure, if a constraint is imposed on a leaf node, one may carry out deletion of records starting from that leaf node and propagating throughout the NGRAM tree structure.
For example, to delete records having a given set of values at leaf node L, the deletion operation may start at the leaf node L. If a constraint is imposed on a number of leaf nodes, deletion of records may start at the closest common ancestor node, followed by a propagation of the deletion throughout the NGRAM structure.
In general, there are two different strategies for deletion of records in an NGRAM system. One is to carry out the deletion physically throughout the NGRAM memory structure. The other is to set a number of flags at each node structure, indicating the tokens and dictionaries related to the records being deleted. In the latter case, no physical deletion is carried out in the memory structure; this is equivalent to a virtual deletion.
The former strategy may lead to a very sophisticated and complex procedure. However, the resulting structure after deletion is clean and conducive to general data manipulation such as performing a query and carrying out data mining or OLAP analyses. On the other hand, the latter deletion strategy may be less demanding, but may be problematic with respect to general data manipulation. This invention is primarily directed toward the physical deletion of records in an NGRAM system.
Summary of the Invention
This invention resides in methods of executing delete operations in an NGRAM system. Various alternative procedures are disclosed, based upon what information properties the system has before the deletion and on what information properties one may wish to preserve in the system after the deletion.
In general, there are two kinds of NGRAM systems that have different information properties. One is an undisturbed system, which corresponds to an original NGRAM system transformed from a raw database. This system has, among other properties, a fundamental NGRAM information property: any two consecutive indices in a look-down (or output lookup) table at any non-leaf node in an NGRAM system satisfy:
$$t_{i+1} - t_i < 2, \quad i = 1, 2, \ldots, C - 1 \qquad (1)$$
where $C$ is the cardinality of the node, and $t_i$ and $t_{i+1}$ are the $i$th and $(i+1)$th indices. Here, a look-down table is a stream of left or right child tokens whose addresses are the parent node tokens.
A fundamental structure function may be used to describe the relationship between any two pairing parent and child tokens. This function is given by:
$$t_p = t_c + N_{repeat} \qquad (2)$$
where $t_c$ is a child token, $t_p$ is the pairing parent token, and $N_{repeat}$ is the number of repeated child tokens appearing before the child token $t_c$.
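As an illustration only, the two properties can be checked on a small look-down stream. The sketch below assumes that the structure function reads $t_p = t_c + N_{repeat}$, that parent tokens are the 1-based addresses of the stream entries, and that the function is evaluated at the first appearance of each child token. The function names and the sample streams are hypothetical and do not come from the disclosure.

```python
def satisfies_property_1(look_down):
    """Check t[i+1] - t[i] < 2 for every pair of consecutive child tokens."""
    return all(b - a < 2 for a, b in zip(look_down, look_down[1:]))


def structure_function_holds(look_down):
    """Check t_p == t_c + N_repeat at the first appearance of each child token.

    Assumption: parent tokens are the 1-based addresses of the stream entries.
    """
    seen = set()
    n_repeat = 0  # repeated child tokens encountered so far
    for t_p, t_c in enumerate(look_down, start=1):
        if t_c in seen:
            n_repeat += 1
        else:
            if t_p != t_c + n_repeat:
                return False
            seen.add(t_c)
    return True


# Hypothetical look-down streams (child tokens addressed by parent tokens).
undisturbed = [1, 1, 1, 2, 3, 4, 2, 3]
disturbed = [1, 3, 1, 2, 3, 4, 2, 3]  # child token 3 appears before 2

print(satisfies_property_1(undisturbed), structure_function_holds(undisturbed))
print(satisfies_property_1(disturbed), structure_function_holds(disturbed))
```

In this reading, a stream whose child tokens are numbered in order of first appearance passes both checks, while a stream that introduces a token out of order fails them.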
The other kind of NGRAM system is a disturbed or reassigned system, which does not preserve the above information property. The relationships between parent and child tokens can follow any sequence other than that described by the above structure function. Generally speaking, there may exist an infinite number of sequences, each of which corresponds to a reassigned NGRAM system. Such reassignment is typically used for a specific purpose. Most notably, reassignment is used to conserve memory by reassigning the left and right look-down tables in the hashing sequence; that is, the left look-down table is sorted for left hashing, and the right look-down table is sorted for right hashing.
In an undisturbed NGRAM system, there are two principal procedures for deletion, depending on the resulting structure one may wish to have after a deletion. If an undisturbed NGRAM system is expected after a deletion, a general procedure is employed, which requires reassignment of the memory structures in order to preserve the original information properties. The reassigned deletion procedure leaves no sign of disturbance due to deletion. All the information properties preserved before a deletion are unchanged after the deletion, including the information property described above. This deletion is the most difficult to implement, but is also the most general. If it is not necessary to retain an undisturbed structure, a less general, partial procedure may be used, which requires no reassignment of the memory structure.
In a disturbed NGRAM system, only the less general deletion procedure without reassignment of undeleted tokens may be used. The resulting system after a deletion is also a disturbed system.
One may choose a starting point for a given deletion procedure, such as the root node in an NGRAM tree structure, since this is the place where a record number is identified. Once a record number is identified for a record being deleted, the procedure is as follows:
- delete the token at the root node that describes the record being deleted,
- propagate the deletion down to every lower level node, and
- delete all the tokens and dictionary values that describe different parts of the record being deleted.
This procedure is instructive, but less efficient than it could be, because the record numbers for the records being deleted are usually identified by performing a query constrained at one or more leaf nodes. If the constraint is imposed on a single leaf node, two traversals between the leaf node and the root node are necessary for a deletion starting at the root node. However, only one traversal between the two nodes is needed if the deletion starts at the leaf node where the constraint is imposed.
If a constraint is imposed on a number of leaf nodes, deletion may start at the closest common ancestor node of all constrained leaf nodes. The procedure then propagates the deletion up and down in all directions. Such a procedure is disclosed herein, although at least a portion of the procedure may be applied directly to the deletion starting at the root node as well.
Brief Description of the Drawings
FIGURE 1 shows the main branch and side branches as subtree structures, wherein the main branch starts with a leaf node and ends with the root node, and wherein all dotted circles indicate subtree structures; and
FIGURE 2 shows the main branch and side branches as subtree structures, wherein the main branch starts with a non-leaf node and ends with the root node, and wherein all dotted circles indicate subtree structures.
Detailed Description of the Invention
Before describing the invention in detail, several terms will be defined, as follows:
An information value is a value that can be either a raw data value or a token (index) representing the memory address of a combination of data values.
A complete deletion of a token in a non-leaf node removes the token and the corresponding count from the node memory structure, because the count of deletions is equal to the count of occurrences. A complete deletion of a dictionary value in a leaf node removes the value and the corresponding count from the leaf-node memory structure, since the count of deletions is equal to the count of occurrences.
A partial deletion of a token in a non-leaf node modifies the token and the associated count in the node memory structure, because the count of deletions is less than the count of occurrences.
A partial deletion of a dictionary value in a leaf node modifies the value and the associated count in the leaf-node memory structure, because the count of deletions is less than the count of occurrences.
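The distinction between complete and partial deletion can be sketched as a comparison of two counts. The sketch below is a simplified illustration, assuming that a node's count file can be modelled as a mapping from token to count of occurrences; the function names are hypothetical, and only the counts are updated here, not the tokens themselves.

```python
def classify_deletions(occurrences, deletions):
    """Split requested deletions into complete and partial ones.

    occurrences: token -> count of occurrences stored at the node
    deletions:   token -> count of deletions requested for that token
    """
    complete, partial = {}, {}
    for token, n_del in deletions.items():
        n_occ = occurrences[token]
        if n_del > n_occ:
            raise ValueError(f"token {token}: more deletions than occurrences")
        if n_del == n_occ:
            complete[token] = n_del  # token and count are removed
        else:
            partial[token] = n_del   # token stays, count is reduced
    return complete, partial


def apply_deletions(occurrences, deletions):
    """Return the counts that remain after complete and partial deletions."""
    complete, partial = classify_deletions(occurrences, deletions)
    remaining = dict(occurrences)
    for token in complete:
        del remaining[token]
    for token, n_del in partial.items():
        remaining[token] -= n_del
    return remaining


counts = {1: 3, 2: 1, 3: 2}   # hypothetical count file
requested = {1: 1, 2: 1}      # delete one occurrence of token 1 and of token 2
print(classify_deletions(counts, requested))
print(apply_deletions(counts, requested))
```

Here token 2 is deleted completely because its single occurrence is removed, while token 1 is deleted partially and keeps a reduced count.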
The main branch of an NGRAM tree structure is the segment between a starting node for deletion and the root node. All leaf and non-leaf nodes on the main branch are main-branch nodes. All other nodes that are child nodes of the main-branch nodes are side-branch nodes.
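A minimal sketch of this definition follows, assuming a binary tree described by a hypothetical mapping from each non-leaf node to its left and right children. The node names are illustrative. The definition is applied literally, so the children of a non-leaf starting node are also reported as side-branch nodes.

```python
def main_branch(parent, start):
    """Return the nodes from the starting node up to the root, in order."""
    branch = [start]
    while branch[-1] in parent:
        branch.append(parent[branch[-1]])
    return branch


def side_branch_nodes(children, branch):
    """Return children of main-branch nodes that are not on the main branch."""
    on_branch = set(branch)
    return [
        child
        for node in branch
        for child in children.get(node, ())
        if child not in on_branch
    ]


# Hypothetical tree: root R, internal nodes A and B, leaves L1..L4.
children = {"R": ("A", "B"), "A": ("L1", "L2"), "B": ("L3", "L4")}
parent = {c: p for p, pair in children.items() for c in pair}

branch = main_branch(parent, "L1")
print(branch)
print(side_branch_nodes(children, branch))
```

Starting at leaf L1, the main branch is L1, A, R, and the side-branch nodes are the leaf L2 and the non-leaf node B, which is the root of its own subtree.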
A hybrid node is a node that stores at least two sets of information values. One set must be a set of tokens, and the other set can be either a set of raw data values or a set of tokens of a different type.
A foreign node is a node storing values that represent references to the referenced node values. The referenced node values are tokens (indices) that represent the memory addresses of the referenced data values.
A look-down table is a mapping between tokens of a non-leaf node and tokens of its child node, which stores a set of the child tokens in such a sequence that the pairing tokens of the parent node are incremental. The purpose is to obtain the pairing child token for any given parent token. This is a one-to-one operation. A more conventional term for a look-down table is an "output lookup table."
A lookup table is defined as a mapping between tokens of a non-leaf node and tokens of its child node, which stores a set of the parent tokens in a hashing sequence in which all parent tokens pairing with the same child token are stored in the same hashing list. The purpose is to obtain all the pairing parent tokens for a given child token, by looking up the hashing list indexed by the child token. This is a one-to-many operation. A more conventional term for a lookup table is an "input lookup table" or "lookup hash table."
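The relationship between the two tables can be sketched as follows. This is an illustration under simple assumptions: a look-down table is modelled as a Python list whose 1-based positions are the parent tokens, and the hashing lists are modelled as a dictionary. The function names and sample data are hypothetical.

```python
from collections import defaultdict


def build_lookup(look_down):
    """Build hashing lists: child token -> list of parent tokens paired with it."""
    lookup = defaultdict(list)
    for parent_token, child_token in enumerate(look_down, start=1):
        lookup[child_token].append(parent_token)
    return dict(lookup)


def parents_of(lookup, child_tokens):
    """Collect every parent token paired with any of the given child tokens."""
    return sorted(p for c in child_tokens for p in lookup.get(c, []))


look_down = [1, 1, 1, 2, 3, 4, 2, 3]  # hypothetical stream of child tokens
lookup = build_lookup(look_down)

print(look_down[4 - 1])          # one-to-one: child token paired with parent 4
print(lookup[2])                 # one-to-many: parent tokens paired with child 2
print(parents_of(lookup, [2, 4]))
```

The second helper mirrors how a deletion of child tokens would gather the parent tokens that must also be deleted, by reading the hashing lists indexed by those child tokens.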
Reassignment of a node structure is to reassign the mapping between tokens of a non-leaf node and tokens of its child node, or to reassign the mapping between tokens of a leaf node and dictionaries at the leaf node. Compacting a tree is to save memory storage by reassigning all non-leaf node structures in a specified hashing sequence.
DELETION IN AN N-GRAM TREE STRUCTURE
A deletion operation starts from one end of the main branch where the end node is not the root node, and propagates through all the main-branch nodes as well as all the side-branch nodes and their substructures.
Deletion Starting at a Leaf Node
Suppose a constraint is imposed on a leaf node, say leaf node $N_1$. The constrained leaf node defines one end of the main branch. The root node is the other end. Assume there are $M$ main-branch nodes $N_1, N_2, \ldots, N_M$, where $N_2$ is the parent node of node $N_1$, $N_3$ is the parent node of node $N_2$, ..., and $N_M$ is the root node and the parent node of node $N_{M-1}$. The constraint is such that all records that have dictionary values equal to $d_1, d_2, \ldots, d_{m_1}$ at node $N_1$ are deleted. We may start the deletion from node $N_1$.
Deletion at a leaf node $N_1$ on the main branch:
• Find dictionary values $d_1, d_2, \ldots, d_{m_1}$ and their addresses $t^1_1, t^1_2, \ldots, t^1_{m_1}$ in the leaf node memory structure file. Remove the dictionary values $d_1, d_2, \ldots, d_{m_1}$ from the memory structure file.
• Find the corresponding counts in the count file and remove the counts from the count file.
• Propagate tokens $t^1_1, t^1_2, \ldots, t^1_{m_1}$ up to the parent node $N_2$ for deletion.
Set $i = 2$ and propagate the deletion up to node $N_i$ by following the procedure described below for deletion at a non-leaf node $N_i$ on the main branch. The propagation stops at the root node and all side-branch leaf nodes.
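The leaf-node step can be sketched as below. This is a simplified illustration, assuming the leaf-node memory structure file and the count file can each be modelled as a dictionary keyed by token (the address of a dictionary value). The function name and sample data are hypothetical.

```python
def delete_at_leaf(dictionary, counts, values):
    """Remove dictionary values and their counts; return their tokens.

    dictionary: token -> dictionary value (stand-in for the memory structure file)
    counts:     token -> count of occurrences (stand-in for the count file)
    values:     dictionary values named by the constraint
    """
    targets = set(values)
    tokens = sorted(t for t, v in dictionary.items() if v in targets)
    for t in tokens:
        del dictionary[t]  # remove the dictionary value
        del counts[t]      # remove the corresponding count
    return tokens          # these tokens are propagated to the parent node


leaf_dictionary = {1: "red", 2: "green", 3: "blue"}
leaf_counts = {1: 4, 2: 1, 3: 2}

to_propagate = delete_at_leaf(leaf_dictionary, leaf_counts, ["green", "blue"])
print(to_propagate, leaf_dictionary, leaf_counts)
```

The returned tokens play the role of the tokens propagated up to the parent node for deletion in the next step.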
Deletion Starting at a Non-Leaf Node
Suppose a constraint is imposed on a number of leaf nodes. Find the closest common ancestor node, $N_1$, of the constrained leaf nodes. The common ancestor node $N_1$ defines one end of the main branch. The root node is the other end. Assume there are $M$ main-branch nodes $N_1, N_2, \ldots, N_M$, where $N_2$ is the parent node of node $N_1$, $N_3$ is the parent node of node $N_2$, ..., and $N_M$ is the root node and the parent node of node $N_{M-1}$. The constraint is such that all records consisting of combinations of data values described by tokens $t^1_1, t^1_2, \ldots, t^1_{m_1}$ at node $N_1$ are deleted, where $m_1$ is the number of tokens being deleted. The tokens being deleted are obtained by performing a query with the specified conditions set on the constrained leaf nodes.
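Finding the closest common ancestor of the constrained leaf nodes can be sketched as follows, assuming the tree is available as a hypothetical child-to-parent mapping. The disclosure does not prescribe a particular search method; intersecting the paths to the root is only one straightforward choice, and the node names are illustrative.

```python
def path_to_root(parent, node):
    """Return the nodes from the given node up to the root, in order."""
    path = [node]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


def closest_common_ancestor(parent, leaves):
    """Return the deepest node that lies on the root path of every given leaf."""
    first, *rest = leaves
    common = set(path_to_root(parent, first))
    for leaf in rest:
        common &= set(path_to_root(parent, leaf))
    # Walking up from the first leaf, the first shared node is the closest one.
    for node in path_to_root(parent, first):
        if node in common:
            return node


# Hypothetical tree: root R, internal nodes A and B, leaves L1..L4.
parent = {"A": "R", "B": "R", "L1": "A", "L2": "A", "L3": "B", "L4": "B"}

print(closest_common_ancestor(parent, ["L1", "L2"]))
print(closest_common_ancestor(parent, ["L1", "L3"]))
```

With a single constrained leaf, the function returns that leaf itself, which corresponds to the case of a deletion starting at a leaf node.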
1. Deletion at a non-leaf node $N_1$ on the main branch:
• Store tokens $t^1_1, t^1_2, \ldots, t^1_{m_1}$ in an array D. Store their counts of deletions, which are equal to the counts of occurrences, in an array CD. Set arrays R, CR, and X to NULL. Carry out the deletion in the subtree structure in which node $N_1$ is the root node, by following the procedure for deletion at a side-branch node, to be described below.
• Propagate tokens $t^1_1, t^1_2, \ldots, t^1_{m_1}$ up to the parent node $N_2$ for deletion. Set $i = 2$ and continue with the following procedure.
2. Deletion at a non-leaf node $N_i$ on the main branch:
• Load a lookup table of the mapping between the tokens of node $N_i$ and the tokens of node $N_{i-1}$ from the memory structure of node $N_i$. The lookup table is in a hashing form, as a set of hashing lists. The tokens of node $N_{i-1}$ are the list indices of the hashing lists, in which the tokens of node $N_i$ are the list elements. Find the $t^{i-1}_1$-th, $t^{i-1}_2$-th, ..., $t^{i-1}_{m_{i-1}}$-th lists, which contain the tokens of node $N_i$ to be deleted. Get all the tokens of node $N_i$ from these lists for deletion. Assume these tokens are $t^i_1, t^i_2, \ldots, t^i_{m_i}$, where $m_i \geq m_{i-1}$, and that their counts of deletions $c^i_1, c^i_2, \ldots, c^i_{m_i}$ are equal to the respective counts of occurrences.
• Remove all the left and right child tokens that pair with tokens $t^i_1, t^i_2, \ldots, t^i_{m_i}$ from the memory structure of node $N_i$. Remove all the corresponding counts from the count file.
• Assume the sibling node of node $N_{i-1}$ is $S_{i-1}$. Get all the tokens of the sibling node $S_{i-1}$ that pair with tokens $t^i_1, t^i_2, \ldots, t^i_{m_i}$. Eliminate repeated tokens by summing the counts of deletions for each distinct token. Assume these tokens are $s^i_1, s^i_2, \ldots, s^i_{k_i}$, where $k_i \geq m$. The corresponding counts of deletions are $c_1, c_2, \ldots, c_{k_i}$.
• Identify the tokens among $s^i_1, s^i_2, \ldots, s^i_{k_i}$ for complete deletion. Such a token may be identified by checking whether its count of deletions is equal to its count of occurrences. Store the identified tokens and their counts of deletions in arrays D and CD.
• The remaining tokens among $s^i_1, s^i_2, \ldots, s^i_{k_i}$ are for partial deletion, since the count of deletions is less than the corresponding count of occurrences for each of them. Store the tokens for partial deletion and their counts of deletions in arrays R and CR.
• If an undisturbed NGRAM system is not required after deletion, propagate the deleted tokens in D and R, and their counts of deletions in CD and CR, to the sibling node $S_{i-1}$ by following the procedure for deletion at a side-branch node, to be described below.
• If an undisturbed NGRAM system is required after deletion, reassignment of undeleted tokens is in order:
- Reassign the undeleted tokens for nodes $N_{i-1}$ and $N_i$ using the following formula:
$$t^* = t - N_d \qquad (3)$$
where $t$ is a given undeleted token, $t^*$ is the corresponding token after reassignment, and $N_d$ is the number of tokens being deleted that are less than $t$.
- Identify each token in R for deletion of the first appearance. Store each such token, paired with the largest token appearing before the first undeleted appearance of that token, in an array X.
- Reassign the undeleted tokens of node $S_{i-1}$ by using the reassignment schemes and formulas for deletion at a side-branch node, to be described below.
- Propagate arrays D, R, X, CD, and CR to the sibling node $S_{i-1}$ by following the procedure for deletion at a side-branch node, to be described below.
• If i = M, stop. Otherwise, set i = i + l and repeat the procedure for deletion at a non-leaf node N= on the main branch. Deletion at a Side-Branch Node
In general, a side-branch node can be a leaf or non-leaf node. A side-branch node may be considered as the root node of a subtree structure. The subtree structure may be a single-leaf tree structure if the side-branch node is a leaf node; in this case, only the procedure for deletion at a leaf node is used. Or the subtree structure may be a non-cardinal tree structure if the side-branch node is a non-leaf node; in this case, deletion at a side-branch node is treated as deletion starting at the root node of a tree structure, with the deletion propagated down throughout the tree structure.
In the following, we describe a general procedure for deletion of a set of tokens starting from the root node in a tree or subtree structure. The procedure describes how to carry out deletion in a left-hashing memory structure in an explicit form. The procedure applies to all three different cases:
1) undisturbed NGRAM tree structure before and after deletion;
2) undisturbed NGRAM tree structure before deletion and reassigned NGRAM tree structure after deletion; and
3) reassigned NGRAM tree structure before and after deletion.
In the case of 1), the exact procedure described in this section may be followed. In the cases of 2) and 3), the procedure may be simplified by neglecting reassignment of undeleted tokens. It is also possible that part of an NGRAM tree structure is reassigned and the rest is not. For example, output tokens in the root-node memory structure may be dropped in an undisturbed NGRAM tree structure. In this case, the mapping between the root-node output tokens and record numbers is lost, which is equivalent to reassignment of the root-node tokens. The other internal nodes, however, are not necessarily reassigned.
1. If it is a leaf node, go to deletion at a leaf node. If arrays D, R, X, CD, and CR are not all NULLs, go to the next step. Otherwise, identify the root-node tokens for deletion and get the count for each root-node token being deleted, from the count file for the root-node. Store all the root-node tokens for complete deletion and their counts of deletions in two arrays D[n] and CD[n] respectively, where n is the number of the root-node tokens being deleted. Set arrays R, X, and CR, to be NULL. In the following, D stores tokens for complete deletion and CD stores their counts of deletions. Array R stores tokens for partial deletions and CR stores the counts of deletions. Array X stores every token for partial deletion of the first appearance and the largest undeleted token appearing before the first undeleted appearance of the token being deleted partially.
2. From arrays R and D, get a set of n parent-node tokens being deleted (partially and completely) at a non-leaf node, and find pairs of left- and right-child tokens associated with the parent-node tokens in correspondence. Store the three sets of tokens in a composite buffer B sequentially, in the order of a stream of left-child tokens followed by a stream of right-child tokens followed by a stream of parent-node tokens. Sort the left-child tokens in the composite buffer B in an incremental sequence. Subsort the right-child tokens which pair with the same left-child token.
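As a rough illustration of step 2, the following Python sketch builds the composite buffer and sorts it. It assumes the child pairs are already available in memory through a hypothetical mapping `pairs_of` (parent-node token to its left- and right-child tokens); how the pairs are actually looked up in the left-hashing memory structure is not modeled here, and the function name is illustrative only.

```python
def build_sorted_buffer(parent_tokens, pairs_of):
    """Return three parallel streams (left, right, parent) for buffer B.

    parent_tokens: parent-node tokens being deleted (taken from R and D).
    pairs_of: hypothetical dict, parent token -> (left token, right token).
    Rows are sorted by left-child token; rows sharing a left-child token
    are sub-sorted by right-child token.
    """
    rows = [(pairs_of[p][0], pairs_of[p][1], p) for p in parent_tokens]
    rows.sort(key=lambda row: (row[0], row[1]))
    left = [row[0] for row in rows]
    right = [row[1] for row in rows]
    parent = [row[2] for row in rows]
    return left, right, parent
```

Keeping the three streams parallel means that position k in each stream still describes one pair and its parent token after sorting, which is what the later steps rely on when they sum counts per child token.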
3. Get unique left-child tokens from B and sum over the counts of deletions for each unique left-child token.
(a) If a left-child token appears once and once only in B, get count of deletions for the left-child token, from the count of deletions for the pairing parent-node token. The count of deletions for the parent-node token is obtained from either CD (if the parent-node token is in D for complete deletion) or CR (if the parent-node token is in R for partial deletion).
(b) If a left-child token appears more than once in B, get count of deletions for the left-child token by summing over counts of deletions for all the pairing parent-node tokens, stored in CD and/or CR.
Store the unique left-child tokens and their counts of deletions in two arrays T_l[n_l] and CT_l[n_l] respectively, where n_l is the number of unique left-child tokens in B. For each token in T_l (say T_l[i], i = 0, 1, 2, ..., n_l - 1), compare its count of deletions in CT_l (CT_l[i]) with its count of occurrences. The count of occurrences can be obtained either by reading the (T_l[i] + 1)th count in the count file for the left-child node, or by summing over all the counts of occurrences for the pairing parent tokens.
(a) If the count of deletions, CT_l[i], is smaller than the count of occurrences, it is a partial deletion.
i. Store the token and its count of deletions, CT_l[i], in two arrays R_l[m_l] and CR_l[m_l] for partial deletion, where m_l is the total number of tokens for partial deletion.
ii. If the first appearance is to be deleted, store the token and the largest token before the first undeleted appearance of the token being partially deleted in array X_l.
(b) If CT_l[i], the count of deletions, is equal to the count of occurrences, that is, the (T_l[i] + 1)th count in the left-child count file, store the token and its count of deletions, CT_l[i], in two arrays D_l[k_l] and CD_l[k_l] for complete deletion, where k_l is the total number of tokens for complete deletion. Here, n_l = m_l + k_l.
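The bookkeeping in step 3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the NGRAM system: the count file is stood in for by a hypothetical dictionary `child_occurrences`, the arrays CD and CR are merged into one hypothetical dictionary `parent_del_count`, and array X (first-appearance tracking) is left out.

```python
from collections import defaultdict


def classify_child_deletions(child_tokens, parent_tokens,
                             parent_del_count, child_occurrences):
    """Sum deletion counts per unique child token, then split the tokens
    into partial deletions (R, CR) and complete deletions (D, CD).

    child_tokens[k] pairs with parent_tokens[k] (one row of buffer B).
    parent_del_count: parent token -> count of deletions (CD and CR merged).
    child_occurrences: child token -> count of occurrences (assumed stand-in
    for the child node's count file).
    """
    summed = defaultdict(int)
    for child, parent in zip(child_tokens, parent_tokens):
        summed[child] += parent_del_count[parent]

    R, CR, D, CD = [], [], [], []
    for token in sorted(summed):
        deleted = summed[token]
        occurrences = child_occurrences[token]
        if deleted > occurrences:
            raise ValueError("more deletions than occurrences for token %r" % token)
        if deleted == occurrences:
            D.append(token)      # every occurrence goes away
            CD.append(deleted)
        else:
            R.append(token)      # some occurrences survive
            CR.append(deleted)
    return R, CR, D, CD
```

The same function would serve for the right-child tokens in step 4 by passing the right-child stream and the right-child occurrence counts instead.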
4. Get unique right-child tokens from B and sum over the counts of deletions for each unique right-child token.
(a) If a right-child token appears once and once only in B, get count of deletions for the right-child token from the count of deletions for the pairing parent-node token. The count for the parent-node token is obtained from either CD (if the parent-node token is in D for a complete deletion) or CR (if the parent-node token is in R for a partial deletion).
(b) If a right-child token appears more than once in B, get the count of deletions for the right-child token by summing over the counts of deletions for all the pairing parent-node tokens, stored in CD and/or CR.
Store the unique right-child tokens and their counts of deletions in two arrays T_r[n_r] and CT_r[n_r] respectively, where n_r is the number of unique right-child tokens in B. For each token in T_r (say T_r[i], i = 0, 1, 2, ..., n_r - 1), compare its count of deletions in CT_r (CT_r[i]) with its count of occurrences. The count of occurrences can be obtained either by reading the (T_r[i] + 1)th count in the count file for the right-child node, or by summing over all the counts of occurrences for the pairing parent tokens.
(a) If CT_r[i], the count of deletions, is smaller than the count of occurrences, it is a partial deletion.
i. Store the token and its count of deletions, CT_r[i], in two arrays R_r[m_r] and CR_r[m_r] for partial deletion, where m_r is the total number of tokens for partial deletion.
ii. If the first appearance is to be deleted, store the token and the largest token before the first undeleted appearance of the token being partially deleted in array X_r.
(b) If CT_r[i] is equal to the (T_r[i] + 1)th count in the count file for the right-child node, store the token and its count of deletions, CT_r[i], in two arrays D_r[k_r] and CD_r[k_r] for complete deletion, where k_r is the total number of tokens for complete deletion. Here, n_r = m_r + k_r.
5. Delete tokens in a node memory structure:
(a) Partial deletion:
i. Recall that X stores parent-node tokens b_1, b_2, ..., b_N and e_1, e_2, ..., e_N, where b_1 is the smallest token for deletion of the first appearance, and e_1 is the largest token before the first undeleted appearance of token b_1. In general, b_i is the i-th smallest token for deletion of the first appearance, and e_i is the largest token before the first undeleted appearance of token b_i, where i = 1, 2, ..., N. N is the number of tokens for deletion of the first appearance. If N is not zero, do the following:
A. If a parent-node token is not among b_1, b_2, ..., b_N in X, reassign it by
t' = t - (i - j) (4)
where i is the number of tokens among b_1, b_2, ..., b_N which are smaller than t, and j is the number of tokens among e_1, e_2, ..., e_N which are smaller than t.
B. If a parent-node token is among b_1, b_2, ..., b_N in X, let us say b_k, k = 1, 2, ..., N, reassign it by
t' = t + (e_k - t) - (i - 1 - j) = e_k - (i - 1 - j) (5)
where i is the number of tokens among b_1, b_2, ..., b_N which appear before the first undeleted reappearance of t, and j is the number of tokens among e_1, e_2, ..., e_N which appear before the first undeleted reappearance of t.
The above result is shown by an example: assume the streams of left- and right-child tokens are 1,1,1,2,3,4,2,3 and 1,2,3,4,5,2,6,3, respectively. Suppose we want to delete the second and third pairs. Since the first appearance of the left-child token 1 is not to be deleted, no reassignment of left-child tokens is needed. The first appearances of the right-child tokens 2 and 3 are deleted, so reassignment of right-child tokens is in order. According to equation (4), the right-child tokens 4, 5, and 6 become 2, 3, and 5 respectively. The right-child tokens 2 and 3 become 4 and 6 respectively, according to equation (5). The new right-child tokens are 1(1), 2(4), 3(5), 4(2), 5(6), 6(3), where tokens in brackets are old ones and the others are the corresponding reassigned tokens. Another example shows the meaning of the first undeleted reappearance. Deleting the second, third, and sixth tokens among 1,2,3,4,5,2,6,3,2 leads to 1(1), 2(4), 3(5), 4(6), 5(3), 6(2).
ii. Similarly, X_l stores left-child tokens b_1, b_2, ..., b_{N_l} and e_1, e_2, ..., e_{N_l}, where b_1 is the smallest token for deletion of the first appearance, and e_1 is the largest token before the first undeleted appearance of token b_1. In general, b_i is the i-th smallest token for deletion of the first appearance, and e_i is the largest token before the first undeleted appearance of token b_i, where i = 1, 2, ..., N_l. N_l is the number of tokens for deletion of the first appearance. If N_l is not zero, do the following:
A. If a left-child token is not among b_1, b_2, ..., b_{N_l} in X_l, reassign it by t' = t - (i - j), where i is the number of tokens among b_1, b_2, ..., b_{N_l} which are smaller than t, and j is the number of tokens among e_1, e_2, ..., e_{N_l} which are smaller than t.
B. If a left-child token is among b_1, b_2, ..., b_{N_l} in X_l, let us say b_k, k = 1, 2, ..., N_l, reassign it by t' = e_k - (i - 1 - j), where i is the number of tokens among b_1, b_2, ..., b_{N_l} which appear before the first undeleted reappearance of t, and j is the number of tokens among e_1, e_2, ..., e_{N_l} which appear before the first undeleted reappearance of t.
iii. Analogously, X_r stores right-child tokens b_1, b_2, ..., b_{N_r} and e_1, e_2, ..., e_{N_r}, where b_1 is the smallest token for deletion of the first appearance, and e_1 is the largest token before the first undeleted appearance of token b_1. In general, b_i is the i-th smallest token for deletion of the first appearance, and e_i is the largest token before the first undeleted appearance of token b_i, where i = 1, 2, ..., N_r. N_r is the number of tokens for deletion of the first appearance. If N_r is not zero, do the following:
A. If a right-child token is not among b_1, b_2, ..., b_{N_r} in X_r, reassign it by t' = t - (i - j), where i is the number of tokens among b_1, b_2, ..., b_{N_r} which are smaller than t, and j is the number of tokens among e_1, e_2, ..., e_{N_r} which are smaller than t.
B. If a right-child token is among b_1, b_2, ..., b_{N_r} in X_r, let us say b_k, k = 1, 2, ..., N_r, reassign it by t' = e_k - (i - 1 - j), where i is the number of tokens among b_1, b_2, ..., b_{N_r} which appear before the first undeleted reappearance of t, and j is the number of tokens among e_1, e_2, ..., e_{N_r} which appear before the first undeleted reappearance of t.
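The two worked examples above can be checked with a short Python sketch. The first function does not use array X at all: it simply renumbers the surviving tokens in order of their first surviving appearance, which is an assumption about what an undisturbed numbering means, chosen because it reproduces the mappings quoted in the examples. The second function evaluates equation (4) directly for tokens that are not among b_1, ..., b_N. Both function names are hypothetical.

```python
def renumber_by_first_appearance(stream, deleted_positions):
    """Map old token -> new token after removing the given 1-based positions.

    Surviving tokens are renumbered 1, 2, 3, ... in the order in which
    they first appear among the positions that are not deleted.
    """
    mapping = {}
    for position, token in enumerate(stream, start=1):
        if position in deleted_positions:
            continue
        if token not in mapping:
            mapping[token] = len(mapping) + 1
    return mapping


def reassign_eq4(t, b, e):
    """Equation (4): t' = t - (i - j) for a token t that is not among b.

    i counts the tokens in b smaller than t; j counts the tokens in e
    smaller than t.
    """
    i = sum(1 for x in b if x < t)
    j = sum(1 for x in e if x < t)
    return t - (i - j)
```

For the first example, `renumber_by_first_appearance([1, 2, 3, 4, 5, 2, 6, 3], {2, 3})` gives the old-to-new mapping 1→1, 4→2, 5→3, 2→4, 6→5, 3→6, matching 1(1), 2(4), 3(5), 4(2), 5(6), 6(3). With b = [2, 3] and e = [5, 6] (worked out by hand from the definitions for that stream), `reassign_eq4` maps 4, 5, and 6 to 2, 3, and 5, in agreement with the text.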
(b) Complete deletion:
Reassignment of tokens after partial deletion should be taken into account.
i. Get the left-hashing lists whose list indices appear in R_l. Remove list elements that store parent-node tokens appearing in D and the corresponding right-child tokens.
ii. Remove hashing lists whose list indices appear in D_l (all the output tokens in these lists must appear in D).
iii. Sort the left-child tokens in D_l. Reassign left-child tokens by t' = t - (i + 1) if D_l[i] < t < D_l[i + 1], where i = 0, 1, 2, ..., k_l - 2. For those left-child tokens that are larger than D_l[k_l - 1], perform t' = t - k_l. Here, t is a left-child token before the deletion and t' is the reassigned left-child token.
iv. Sort the right-child tokens in D_r. Reassign right-child tokens by t' = t - (i + 1) if D_r[i] < t < D_r[i + 1], where i = 0, 1, 2, ..., k_r - 2. For those right-child tokens that are larger than D_r[k_r - 1], perform t' = t - k_r. Here, t is a right-child token before the deletion and t' is the reassigned right-child token.
v. Sort the parent-node tokens in D. Reassign parent-node tokens by t' = t - (i + 1) if D[i] < t < D[i + 1], where i = 0, 1, 2, ..., k - 2, and k is the total number of tokens in D. For those parent-node tokens that are larger than D[k - 1], perform t' = t - k. Here, t is a parent-node token before the deletion and t' is the reassigned parent-node token.
6. Deletion in a count file:
(a) Reassign counts of occurrences for tokens appearing in R by C' = C - CR[i], i = 0, 1, 2, ..., m - 1, where m is the total number of tokens in R. C is a count of occurrences, stored in the count file for a parent-node token appearing in R. CR[i] is the count of deletions for the same token.
(b) Remove counts of occurrences from the count file for parent-node tokens appearing in D, since the count of deletions is equal to the count of occurrences for each token in D. Reassignment of positions for counts due to reassignment of parent-node tokens should be taken into account.
7. Deletion in a list count file:
(a) Reassign the list count for each of the lists whose list indices appear in R_l by C' = C - C'', where C'' is the total number of parent-node tokens in the list that appear in D. C is the list count stored in the list count file for a partially deleted list. C' is the reassigned list count.
(b) Remove list counts from the list count file for lists whose list indices appear in D_l.
Reassignment of positions for counts due to reassignment of left-child tokens should be taken into account.
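The reassignment rule in step 5(b) amounts to subtracting, from each surviving token, the number of completely deleted tokens that are smaller than it, which is also the form of equation (3). The Python sketch below illustrates that rule together with the count-file update of step 6. It assumes the count file can be modeled as an in-memory list in which the count for token t sits at zero-based index t (the "(t + 1)th count"); the function names are hypothetical.

```python
from bisect import bisect_left


def reassign_after_complete_deletion(t, deleted_sorted):
    """Return t' = t - (number of deleted tokens smaller than t).

    deleted_sorted is the sorted array D; t must be a surviving token.
    If D[i] < t < D[i + 1] this gives t - (i + 1); if t is larger than
    every entry of D it gives t - k, where k = len(D).
    """
    return t - bisect_left(deleted_sorted, t)


def update_count_file(counts, R, CR, D):
    """Apply steps 6(a) and 6(b) to a list-based stand-in for a count file.

    counts[t] is the count of occurrences of token t.
    R, CR: partially deleted tokens and their counts of deletions.
    D: completely deleted tokens, whose counts are removed.
    Removing entries shifts later counts down, which keeps positions
    consistent with the reassigned tokens.
    """
    new_counts = list(counts)
    for token, deletions in zip(R, CR):
        new_counts[token] -= deletions
    dead = set(D)
    return [c for token, c in enumerate(new_counts) if token not in dead]
```

For instance, with D = [2, 5, 7] the surviving tokens 3, 6, and 9 are reassigned to 2, 4, and 6, and a count list [3, 2, 4, 1] with R = [0], CR = [1], D = [2] becomes [2, 2, 1].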
8. Set R = R_l, CR = CR_l, D = D_l, CD = CD_l, X = X_l, and n = n_l. If the left-child node is an internal node, propagate the deletion down to the left-child node, and repeat the above procedure for the left-child node. If the left-child node is a leaf node, proceed to deletion at a leaf node.
9. Set R = R_r, CR = CR_r, D = D_r, CD = CD_r, X = X_r, and n = n_r. If the right-child node is an internal node, propagate the deletion down to the right-child node, and repeat the above procedure for the right-child node. If the right-child node is a leaf node, proceed to deletion at a leaf node.
10. Deletion at a leaf node:
(a) For each leaf token appearing in R, reassign its count in the count file by C' = C - CR[i], i = 0, 1, 2, ..., m - 1, where m is the total number of tokens in R. C represents the original count of occurrences for a leaf token, stored in the count file. CR[i] represents the count of deletions for the same leaf token being deleted. C' is the reassigned count. Now, X stores leaf-node tokens b_1, b_2, ..., b_N and e_1, e_2, ..., e_N, where b_1 is the smallest token for deletion of the first appearance, and e_1 is the largest token before the first undeleted appearance of token b_1. In general, b_i is the i-th smallest token for deletion of the first appearance, and e_i is the largest token before the first undeleted appearance of token b_i, where i = 1, 2, ..., N. N is the number of tokens for deletion of the first appearance. If N is not zero, do the following:
i. If a leaf token is not among b_1, b_2, ..., b_N in X, reassign it by t' = t - (i - j), where i is the number of tokens among b_1, b_2, ..., b_N which are smaller than t, and j is the number of tokens among e_1, e_2, ..., e_N which are smaller than t.
ii. If a leaf token is among b_1, b_2, ..., b_N in X, let us say b_k, k = 1, 2, ..., N, reassign it by t' = e_k - (i - 1 - j), where i is the number of tokens among b_1, b_2, ..., b_N which appear before the first undeleted reappearance of t, and j is the number of tokens among e_1, e_2, ..., e_N which appear before the first undeleted reappearance of t.
(b) For each leaf token appearing in D, remove its count of occurrences from the count file, and delete the dictionary it represents. Relocations of the dictionaries indexed by the reassigned leaf tokens should be done accordingly. Reassigned positions of counts in the count file should also be consistent with the reassignment of the corresponding tokens after deletion.
Figure 1 shows the main branch and side branches as subtree structures. The main branch starts with a leaf node and ends with the root node. All dotted circles indicate subtree structures.
Figure 2 shows the main branch and side branches as subtree structures. The main branch starts with a non-leaf node and ends with the root node. All dotted circles indicate subtree structures.
Deletion in Join NGRAM Tree Structures
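Previous sections describe how to perform deletions in a single NGRAM tree structure. In join NGRAM tree structures, tokens being deleted may be referenced by the tokens of a foreign node in another tree structure, so the descriptions need modification. Beyond trivial rejection of the deletion, the options are: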
(1) Change the foreign node, whose tokens reference tokens being deleted, to a hybrid node. The foreign-node tokens that reference the tokens being deleted are replaced by the data values that are represented by the referenced tokens;
(2) Propagate the deletion of the referenced tokens to the NGRAM tree structure where the foreign node is located. To do this, we get all the foreign-node tokens that reference the deleted tokens and propagate them up to the root node of the tree structure where the foreign node is located. Then, we perform deletions of the resulting root-node tokens, starting at the root node; or
(3) Reassign the foreign-node tokens that reference the deleted tokens to some other values. This involves updating operations.
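Option (1) can be pictured with a small Python sketch. The representation is an assumption made only for illustration: each entry of the foreign node is tagged as either a token that still references the other tree structure or a data value that has been inlined, and the referenced leaf dictionary is modeled as a plain mapping from token to data value. Reassignment of the surviving referenced tokens is not shown.

```python
def to_hybrid_node(foreign_tokens, referenced_dictionary, deleted_tokens):
    """Replace references to deleted tokens by the data values they stood for.

    foreign_tokens: tokens stored in the foreign node.
    referenced_dictionary: hypothetical dict, referenced token -> data value.
    deleted_tokens: referenced tokens that are about to be deleted.
    Returns a list of tagged entries: ("token", t) for a reference that is
    kept, ("value", v) for a reference replaced by its data value.
    """
    deleted = set(deleted_tokens)
    hybrid = []
    for t in foreign_tokens:
        if t in deleted:
            hybrid.append(("value", referenced_dictionary[t]))
        else:
            hybrid.append(("token", t))
    return hybrid
```

After this replacement the foreign node no longer depends on the deleted tokens, so the deletion in the referenced tree structure can proceed as in the single-tree case.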
I claim:

Claims

1. A method of deleting a record in an NGRAM tree structure having a plurality of nodes at different levels containing token and dictionary values, the method comprising the steps of: choosing a starting point containing information identifying the record to be deleted; deleting a token at the root node which describes the record; propagating the deletion down to every lower-level node in the tree structure; and at each lower-level node, deleting all the tokens and dictionary values associated with the record being deleted.
2. The method of claim 1, wherein the starting point is the root node of the tree structure.
3. The method of claim 1, wherein the NGRAM tree structure represents a disturbed NGRAM system, and wherein the tree structure after deletion also represents a disturbed system.
4. The method of claim 1, wherein the NGRAM tree structure represents an undisturbed NGRAM system, and wherein the method further includes the step of reassigning undeleted tokens such that the tree structure after deletion is also an undisturbed system.
5. The method of claim 1, wherein the step of choosing a starting point containing information identifying the record to be deleted includes the step of identifying a record to be deleted by performing a query constrained at one or more leaf nodes.
6. The method of claim 5, wherein the constraint is imposed on a single leaf node, and wherein the method further includes the step of traversing between the leaf node and the root node as necessary for deletion starting at the root node.
7. The method of claim 5, wherein the constraint is imposed on a plurality of leaf nodes, and wherein the method further includes the step of traversing between the nodes as needed if the deletion starts at the leaf node where the constraint is imposed.
8. The method of claim 5, wherein the constraint is imposed on a plurality of leaf nodes, and wherein the method further includes the steps of: initiating the deletion at the closest common ancestor node of all constrained leaf nodes; and propagating the deletion up and down the tree structure.
9. A method of performing a delete operation in an NGRAM tree structure having a main branch, one or more side branches, and a plurality of leaf nodes, comprising the steps of: imposing a constraint on one or more of the leaf nodes; and initiating the deletion operation with respect to the constrained node.
10. The method of claim 9, wherein only a single leaf node is constrained, and wherein the deletion operation is initiated with respect to this node.
11. The method of claim 9, wherein a plurality of leaf nodes are constrained, and wherein the deletion operation is initiated at the closest common ancestor node of all constrained leaf nodes.
12. The method of claim 9, further including the step of propagating the deletions throughout the main branch and side branches of the tree structure.
13. The method of claim 9, further including the step of deleting tokens with respect to a non-leaf node on the main branch by: removing all the tokens being deleted from the node structure and from the parent node structure, except for the root node; and removing all the corresponding counts of occurrences from the node count file.
14. The method of claim 9, further including the step of deleting dictionary values for a leaf node on the main branch by: removing all the dictionaries being deleted from the leaf-node structure; and removing the corresponding leaf tokens from its parent node structure.
15. The method of claim 9, further including the step of deleting tokens for a non-leaf node on a side branch by: removing tokens whose counts of deletions are equal to the counts of occurrences from the node structure and from its parent node structure; removing the corresponding counts of occurrences from the count file of the node.
16. The method of claim 9, further including the step of deleting tokens for a non-leaf node on a side branch by: modifying tokens whose counts of deletions are less than the counts of occurrences by subtracting the counts of deletions from the corresponding counts of occurrences respectively.
17. The method of claim 9, further including the step of deleting a dictionary for a leaf node on a side branch by: removing dictionaries whose counts of deletions are equal to the counts of occurrences from the leaf node structure; removing the corresponding tokens from its parent node structure; and removing the corresponding counts of occurrences from the count file of the leaf node.
18. The method of claim 9, further including the step of deleting a dictionary for a leaf node on a side branch by: modifying dictionaries whose counts of deletions are less than the counts of occurrences by subtracting the counts of deletions from the corresponding counts of occurrences respectively.
19. The method of claim 9, further including the step of reassigning undeleted and modified tokens and dictionaries in order to preserve complete information properties if such properties were preserved prior to the deletion operation.
20. A method of performing delete operations in a join NGRAM tree structure, comprising the steps of: changing a foreign node to a hybrid node; and a) replacing the foreign node tokens that reference the tokens being deleted by data values described by the referenced tokens; or b) propagating the deletion of the referenced tokens through the NGRAM tree structure where the foreign node is located.
PCT/US1999/017133 1998-07-28 1999-07-28 Methods of deleting information in n-gram tree structures WO2000007123A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU52394/99A AU5239499A (en) 1998-07-28 1999-07-28 Methods of deleting information in n-gram tree structures

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US12361198A 1998-07-28 1998-07-28
US09/123,611 1998-07-28

Publications (1)

Publication Number Publication Date
WO2000007123A1 true WO2000007123A1 (en) 2000-02-10

Family

ID=22409709

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1999/017133 WO2000007123A1 (en) 1998-07-28 1999-07-28 Methods of deleting information in n-gram tree structures

Country Status (2)

Country Link
AU (1) AU5239499A (en)
WO (1) WO2000007123A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4899148A (en) * 1987-02-25 1990-02-06 Oki Electric Industry Co., Ltd. Data compression method
US5706365A (en) * 1995-04-10 1998-01-06 Rebus Technology, Inc. System and method for portable document indexing using n-gram word decomposition
US5758024A (en) * 1996-06-25 1998-05-26 Microsoft Corporation Method and system for encoding pronunciation prefix trees

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ANDERSSON ET AL: "Faster Uniquely Represented Dictionaries", PROCEEDINGS OF THE ANNUAL SYMPOSIUM ON FOUNDATION OF COMPUTER SCIENCE, vol. SYMP.32, October 1991 (1991-10-01), USA, pages 642 - 649, XP000326131 *
CHOI KONG-RIM ET AL: "T*-Tree: A Main Memory Database Index Structure for Real Time Applications", PROCEEDINGS OF THE 1996 THIRD INTERNATIONAL WORKSHOP ON REAL-TIME COMPUTING SYSTEMS AND APPLICATIONS, October 1996 (1996-10-01), USA, pages 81 - 88, XP002919655 *

Also Published As

Publication number Publication date
AU5239499A (en) 2000-02-21

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE GH GM HR HU ID IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG UZ VN YU ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW SD SL SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase