US20010025320A1 - Multi-language domain name service - Google Patents

Multi-language domain name service

Publication number
US20010025320A1
Authority
US
United States
Prior art keywords
domain name
linguistic
node
encoding
tree structure
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US09/792,438
Inventor
Ching Seng
Jun Yin
Mingliang Jiang
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
I-DNS NET INTERNATIONAL Pte Ltd
Original Assignee
I-DNS NET INTERNATIONAL Pte Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by I-DNS NET INTERNATIONAL Pte Ltd filed Critical I-DNS NET INTERNATIONAL Pte Ltd
Priority to US09/792,438 priority Critical patent/US20010025320A1/en
Assigned to I-DNS.NET INTERNATIONAL PTE LTD. reassignment I-DNS.NET INTERNATIONAL PTE LTD. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: JUN, YIN, MINGLIANG, JIANG, SENG, CHING HONG
Publication of US20010025320A1 publication Critical patent/US20010025320A1/en
Assigned to I-DNS.NET INTERNATIONAL PTE LTD. reassignment I-DNS.NET INTERNATIONAL PTE LTD. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: JIANG, MINGLIANG, SENG, CHING HONG, YIN, JUN
Abandoned legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00 Network arrangements, protocols or services for addressing or naming
    • H04L61/45 Network directories; Name-to-address mapping
    • H04L61/4505 Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols
    • H04L61/4511 Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols using domain name system [DNS]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/126 Character encoding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00 Network arrangements, protocols or services for addressing or naming
    • H04L61/30 Managing network names, e.g. use of aliases or nicknames
    • H04L61/301 Name conversion
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2101/00 Indexing scheme associated with group H04L61/00
    • H04L2101/30 Types of network names
    • H04L2101/32 Types of network names containing non-Latin characters, e.g. Chinese domain names

Definitions

  • the present invention relates to the Domain Name Service used to resolve network domain names into corresponding network addresses. More particularly, the invention relates to an alternative or modified Domain Name Service that accepts domain names provided in many different encoding formats, not just ASCII.
  • the Internet has evolved from a purely research and academic entity to a global network that reaches a diverse community with different languages and cultures. In all areas the Internet has progressed to address the localization needs of its audience.
  • Content on the World Wide Web is now published in many different languages as multilingual-enabled software applications proliferate. It is possible to send an e-mail message to another person in Chinese or to view a World Wide Web page in Japanese.
  • the Internet today relies entirely on the Domain Name System to resolve human readable names to numeric IP addresses and vice versa.
  • the Domain Name System (DNS) is still based on a subset of the Latin-1 alphabet and is thus still mainly English.
  • e-mail addresses, Web addresses, and other Internet addressing formats adopt ASCII as the global standard to guarantee interoperation. No provision is made to allow for e-mail or Web addresses to be in a non-ASCII native language. The implication is that any user of the Internet has to have some basic knowledge of ASCII characters.
  • Unicode is a character encoding system in which nearly every character of most important languages is uniquely mapped to a 16-bit value. Since Unicode has laid down the foundations for a unique, non-overlapping encoding system, some researchers have begun to explore how Unicode can be used as the basis for a future DNS namespace, which can embrace the rich diversity of languages present in the world today. See M. Durst, “Internationalization of Domain Names,” Internet Draft “draft-duerst-dns-i18n-02.txt,” which can be found at the IETF home page, http://www.ietf.cnri.reston.va.us/ID.html, July 1998. This document is incorporated herein by reference in its entirety and for all purposes. The new namespace should be able to offer multilingual and multiscript functionality that will make it easier for non-English speakers to use the Internet.
  • Adopting Unicode as the standard character set for a new Domain Name System avoids overlapping code space for different language scripts. In this way, it may allow the Internet community to use domain names in their native scripts, such as names of the form multilingual.com.
  • In a name of the form multilingual.com, the popular “.com” (“dot com”) top level domain is represented in ASCII characters, but the second and lower level domains are represented in a non-ASCII format.
  • Such formats allow non-Roman characters.
  • the non-ASCII encoding type BIG-5 encodes Chinese characters.
  • a Chinese language second level domain name could be registered and used with a com top-level domain name.
  • the BIG-5 encoded second level domain name would first have to be converted to an ASCII representation.
  • the transformed multilingual.com second level domain could then be used by conventional name servers to resolve the address.
  • the present invention pertains to methods and apparatus that detect the linguistic encoding type of a digital string encoding a domain name. This is accomplished using a tree or graph comprised of nodes holding linguistic digits representing the digital sequence of a character or a portion of a character. These nodes are compared against digital sequences of characters in the domain name under consideration. Each comparison results in a step down the graph. Then another comparison is performed, often with the next successive character in the domain name. Ultimately the process reaches a terminal node of the graph. This node specifies the encoding type of the domain name under consideration.
  • One specific aspect of the invention pertains to a method of detecting a linguistic encoding type of a domain name.
  • Such method may be characterized by the following sequence: (a) receiving a digital representation of the domain name; and (b) using the digital representation to traverse a tree structure having multiple nodes connected by paths and having terminal nodes that uniquely identify distinct encoding types.
  • the method detects the linguistic encoding type of the domain name.
  • the tree structure may take various forms.
  • the tree structure is a ternary tree structure.
  • the nodes of the tree structure comprise digital sequences of linguistic digits from characters of multiple encoding types.
  • the method traverses the tree structure by considering individual characters of the domain name (or portions of those characters) to determine how to move between nodes on the tree structure.
  • the tree structure is traversed by comparing digital representations of linguistic digits in the nodes of the tree structure against digital representations of individual characters of the domain name or portions of those characters. The comparisons determine how to move between nodes on the tree structure. For example, if the digital value of a node's linguistic digit is greater than the digital value of the corresponding character of the domain name, one path is chosen. Other paths are chosen if the comparison shows different relationships between the digital values.
  • the method also involves reversing the sequence of the digital representation of the domain name prior to using the representation to traverse the tree structure. In this manner, a digital representation of a last character of the domain name is compared to a root node on the tree structure. Next, a digital representation of a next to last character of the domain name is compared to a second node of the tree structure. The method continues in this manner (i) using a next previous character of the domain name to identify a next lower level node of the tree structure; and (ii) repeating (i) until reaching a terminal node of the tree structure. Ultimately, a terminal node of the tree structure is reached. Typically, the terminal node itself specifies the linguistic encoding type of the domain name.
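  • The traversal described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Node` fields (`lo`, `eq`, `hi`, `encoding`), the helper `chain`, and the hand-built two-name tree are all assumptions introduced here, and each linguistic digit is assumed to be one byte.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    digit: int                      # linguistic digit held by this node (one byte here)
    lo: Optional["Node"] = None     # path taken when the name's digit is smaller
    eq: Optional["Node"] = None     # path taken when the digits match
    hi: Optional["Node"] = None     # path taken when the name's digit is larger
    encoding: Optional[str] = None  # set only on terminal nodes


def chain(digits: bytes, encoding: str) -> Node:
    """Build a straight run of 'eq' links; the last node is the terminal node."""
    node = Node(digits[-1], encoding=encoding)
    for d in reversed(digits[:-1]):
        node = Node(d, eq=node)
    return node


def detect_encoding(root: Optional[Node], encoded_name: bytes) -> Optional[str]:
    """Walk the tree with the reversed name; return the terminal node's encoding."""
    digits = encoded_name[::-1]     # last character is compared to the root first
    node, i = root, 0
    while node is not None and i < len(digits):
        d = digits[i]
        if d < node.digit:
            node = node.lo
        elif d > node.digit:
            node = node.hi
        elif i == len(digits) - 1:
            return node.encoding    # matched the final digit
        else:
            node = node.eq
            i += 1
    return None                     # name is not in the tree


# Hand-built tree holding two names that share the reversed ".com" suffix.
gb_name = bytes.fromhex("cca8cde5") + b".com"   # GB-2312 bytes quoted in this document
root = chain(gb_name[::-1], "GB-2312")
branch = root
for _ in range(4):                  # step past 'm', 'o', 'c', '.'
    branch = branch.eq
branch.lo = chain(b"abc"[::-1], "ASCII")        # 0x63 < 0xE5, so it hangs off 'lo'

print(detect_encoding(root, gb_name))           # GB-2312
print(detect_encoding(root, b"abc.com"))        # ASCII
```

  • Reversing the name means that names sharing a top-level domain share the first few nodes, so the tree only branches where the encoded names actually differ.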
  • the apparatus may be characterized by the following features: (a) one or more processors; (b) memory coupled to said one or more processors and configured to store a tree structure having multiple nodes connected by paths and having terminal nodes that uniquely identify distinct encoding types; and (c) a network interface configured to receive domain names from network nodes.
  • the one or more processors are configured or designed to traverse the tree structure using information from a domain name to thereby detect the linguistic encoding type of the domain name.
  • the tree structure may have a form as described above.
  • the apparatus also includes a logical module for converting the domain name from its linguistic encoding type to a DNS compatible encoding type (e.g., ASCII).
  • the apparatus may also include a logical module for resolving domain names in the DNS compatible encoding type.
  • Another specific aspect of the invention pertains to methods for creating an encoding detection tree of the types described above (e.g., ternary tree structures). Such methods may be characterized by the following sequence: (a) receiving a representation of a digitally represented first domain name which is encoded in a first linguistic encoding type; (b) adding the first domain name, and its first linguistic encoding type, to the encoding detection tree to create a first path through the encoding detection tree; (c) receiving a representation of a digitally represented second domain name which is encoded in a second linguistic encoding type; and (d) adding the second domain name, and its second linguistic encoding type, to the encoding detection tree to create a second path through the encoding detection tree. Part of the method may also involve determining whether the first domain name (or some part of it) already exists in the encoding detection tree.
  • the first and second paths each include separate terminal nodes, one or more intermediate nodes, and a common root node.
  • the system may add an identifier of the first and second linguistic encoding types to the terminal nodes of the first and second paths, respectively.
  • the system may also add a sequence of the first domain name to the terminal node of the first path.
  • the encoding detection tree presents the domain names in reverse order.
  • the first path in the tree presents the first domain name in reverse order of linguistic digits when moving from the root node to the terminal node.
  • the first domain name is included in the tree by adding a new node to the encoding detection tree for each linguistic digit of the first domain name having a digital sequence that does not appear at a corresponding location in the encoding detection tree.
  • the positions of the new nodes with respect to existing nodes are determined by comparing the digital sequence of an existing node with the digital sequence of a corresponding linguistic digit from the first domain name.
  • the process may also include adding to the encoding detection tree a linguistic equivalent node of one of the linguistic digits in the first domain name.
  • Yet another aspect of the invention pertains to apparatus for creating an encoding detection tree.
  • Such apparatus may be characterized by the following features: (a) one or more processors; (b) memory coupled to said one or more processors and configured to store a partially created tree structure having multiple nodes connected by paths and having terminal nodes that uniquely identify distinct encoding types; and (c) an interface configured to receive domain names from a collection of domain names.
  • the one or more processors are configured or designed to receive representations of digitally represented domain names which are encoded in linguistic encoding types and add those domain names, together with their linguistic encoding types, to the encoding detection tree to create paths through the encoding detection tree.
  • Another aspect of the invention pertains to computer program products including a machine-readable medium on which are stored program instructions for implementing a portion of or an entire method as described above. Any of the methods of this invention may be represented, in whole or in part, as program instructions that can be provided on such computer readable media.
  • FIG. 1 is a block diagram of an exemplary system for resolving a non-ASCII domain name to its numeric IP address.
  • FIG. 2 is a process flow diagram of operations between a client and two multilingual DNS servers according to an embodiment of the invention.
  • FIG. 3A is a flow chart of the conversion of the domain name from one linguistic encoding type to a second linguistic encoding type according to an embodiment of the invention.
  • FIG. 3B is a block diagram of a multilingual domain name server according to an embodiment of the invention.
  • FIG. 4A is a schematic diagram of the reversed linguistic digit sequence of a domain name and a corresponding encoding detection tree according to an embodiment of the invention.
  • FIG. 4B is a flow chart of an algorithm to determine an encoding type of a domain name using an encoding detection tree according to an embodiment of the invention.
  • FIG. 4C is a flow chart of an algorithm, used with the algorithm of FIG. 4B, to search a list of terminal nodes according to an embodiment of the invention.
  • FIG. 4D is a schematic diagram of another example of an encoding detection tree.
  • FIG. 5A is a flow chart of an algorithm to construct the data structure according to an embodiment of the invention.
  • FIG. 5B is a schematic diagram of the reversed linguistic digit sequence, the linguistic encoding type, the linguistic equivalent, and the encoding detection tree according to an embodiment of the invention.
  • FIG. 5C is a flow chart of an algorithm to check whether a linguistic digit has been inserted into the encoding detection tree according to an embodiment of the invention.
  • FIGS. 5D-5I are schematic diagrams of the reversed linguistic digit sequence, the linguistic equivalent, the encoding detection tree, and the pointers according to an embodiment of the invention when a linguistic digit sequence “A1A2B1B2.com” is added to the encoding detection tree.
  • FIG. 5J is a flow chart of an algorithm to adjust sub-EDT structure according to an embodiment of the invention.
  • FIGS. 5K-5V are schematic diagrams of the reversed linguistic digit sequence, the linguistic equivalent, the encoding detection tree, and the pointers according to an embodiment of the invention when a linguistic digit sequence “A1A2B1B2.com” is added to the encoding detection tree.
  • FIGS. 5W-5AG are schematic diagrams of the reversed linguistic digit sequence, the linguistic equivalent, the encoding detection tree, and the pointers according to an embodiment of the invention when a linguistic digit sequence “C1C2D1D2.com” is added to the encoding detection tree.
  • FIG. 5AH is a schematic diagram of the reversed linguistic digit sequence, the linguistic equivalent, the encoding detection tree, and the pointers according to an embodiment of the invention when a linguistic digit sequence “D1E2F1F2.com” is added to the encoding detection tree.
  • FIG. 5AI is a flow chart of an algorithm to reinsert a sub-EDT structure according to an embodiment of the invention.
  • FIG. 5AJ is a schematic diagram of the reversed linguistic digit sequence of a domain name and a corresponding encoding detection tree according to an embodiment of the invention.
  • FIG. 6 is a schematic diagram depicting the construction of an encoding detection tree using the procedure depicted in FIG. 5AI, when a sub-EDT structure is destroyed and reinserted in the encoding detection tree.
  • FIG. 7 is a simplified block diagram of a typical computer system of the type that may be employed to implement the procedures of this invention.
  • the present invention provides a technology for efficiently and accurately identifying encoding types of domain names. It uses a tree or graph structure having nodes corresponding to “linguistic digits.” In a typical application, these linguistic digits are sequentially compared against digital representations of characters in the domain name. Each comparison results in a decision on which available path to take in the graph structure. This moves a pointer through a tree sequentially until reaching a terminal node associated with an encoding type. Thus, at the end of the process, the encoding type is detected. This information can then be employed to convert the characters of the multilingual domain name to a format compatible with the DNS standard (e.g., RFC 1035).
  • the present invention transforms multilingual multiscript names to a form that is compliant with DNS (e.g., DNS as explained in RFC1035). These transformed names may then be relayed as DNS queries to a conventional DNS server.
  • An exemplary process of how a localized domain name is resolved to its numeric IP address is illustrated by FIG. 1 below.
  • DNS is a hierarchical, domain-based naming scheme and a distributed database system for implementing this naming scheme. It is primarily used for mapping host names and e-mail destinations to IP addresses, but can be used for other purposes. As mentioned, DNS is defined in RFCs 1034 and 1035.
  • the DNS protocol is currently based upon a subset of ASCII, and is thus limited to the Latin alphabet. Numerous other encodings provide digital representations for other character sets of the world. Examples include BIG5 and GB-2312 for Chinese character scripts (traditional and simplified respectively), Shift-JIS and EUC-JP for Japanese character scripts, KSC-5601 for Korean character scripts, and the extended ASCII characters for French and German characters, for instance.
  • The Unicode Standard is a “universal linguistic encoding type” that provides the capacity to encode all the characters used in the written languages of the world.
  • domain name strings in various different encoding types are all first converted to Unicode and then, if necessary, to ASCII.
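  • A minimal sketch of this two-stage conversion follows. The first stage uses Python's standard codecs to decode bytes into Unicode. The second stage is purely an assumption for illustration: this document does not specify the ASCII mapping at this point, so the hypothetical `to_dns_ascii` simply hex-encodes the UTF-8 bytes of each non-ASCII label behind an invented `x-` prefix.

```python
def to_unicode(encoded_name: bytes, encoding: str) -> str:
    """Stage 1: decode a name from its detected encoding type into Unicode."""
    return encoded_name.decode(encoding)


def to_dns_ascii(name: str) -> str:
    """Stage 2 (hypothetical mapping): leave ASCII labels alone, hex-encode the rest."""
    labels = []
    for label in name.split("."):
        if label.isascii():
            labels.append(label)          # no conversion necessary
        else:
            labels.append("x-" + label.encode("utf-8").hex())
    return ".".join(labels)


raw = "中文.com".encode("big5")           # a BIG5-encoded domain name
unicode_name = to_unicode(raw, "big5")
print(unicode_name)                       # 中文.com
print(to_dns_ascii(unicode_name))         # x-e4b8ade69687.com
print(to_dns_ascii("abc.com"))            # abc.com
```

  • Routing every encoding through Unicode means that only one ASCII transform is needed, rather than one per source encoding.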
  • Unicode uses a 16-bit encoding that provides code points for more than 65,000 characters.
  • Unicode scripts include Latin, Greek, Cyrillic, Armenian, Hebrew, Arabic, Devanagari, Bengali, Gurmukhi, Gujarati, Oriya, Tamil, Telugu, Kannada, Malayalam, Thai, Lao, Georgian, Vietnamese, Japanese Kana, the complete set of modern Korean Hangul, and a unified set of Chinese/Japanese/Korean (CJK) ideographs. Many more scripts and characters are to be added shortly, including Ethiopic, Canadian Syllabics, Cherokee, additional rare ideographs, Sinhala, Syriac, Burmese, Khmer, and Braille.
  • a single 16-bit number is assigned to each code element defined by the Unicode Standard.
  • Each of these 16-bit numbers is called a code value and, when referred to in text, is listed in hexadecimal form following the prefix “U+”.
  • For example, the code value U+0041 is the hexadecimal number 0041 (equal to the decimal number 65). It represents the character “A” in the Unicode Standard.
  • Each character is also assigned a unique name that specifies it and no other. For example, U+0041 is assigned the character name “LATIN CAPITAL LETTER A.” U+0A1B is assigned the character name “GURMUKHI LETTER CHA.” These Unicode names are identical to the names for the same characters in ISO/IEC 10646.
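  • The code values and character names mentioned above can be checked with Python's standard `unicodedata` module. The helper name `describe` is an assumption introduced for this sketch.

```python
import unicodedata


def describe(ch: str) -> tuple:
    """Return the U+ notation, decimal value, and Unicode character name."""
    return (f"U+{ord(ch):04X}", ord(ch), unicodedata.name(ch))


for ch in ("A", "\u0A1B"):
    print(describe(ch))
```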
  • the Unicode Standard groups characters together by scripts in code blocks.
  • a script is any system of related characters.
  • the standard retains the order of characters in a source set where possible.
  • If the characters of a script are traditionally arranged in a certain order (alphabetic order, for example), the Unicode Standard arranges them in its code space using the same order whenever possible.
  • Code blocks vary greatly in size. For example, the Cyrillic code block does not exceed 256 code values, while the CJK code block has a range of thousands of code values.
  • Code elements are grouped logically throughout the range of code values, called the “codespace.”
  • the coding starts at U+0000 with the standard ASCII characters, and continues with Greek, Cyrillic, Hebrew, Arabic, Indic and other scripts; then followed by symbols and punctuation.
  • the code space continues with Hiragana, Katakana, and Bopomofo.
  • the unified Han ideographs are followed by the complete set of modern Hangul.
  • a surrogate range of code values is reserved for future expansion with UTF-16.
  • Towards the end of the codespace is a range of code values reserved for private use, followed by a range of compatibility characters.
  • the compatibility characters are character variants that are encoded only to enable transcoding to earlier standards and old implementations, which made use of them.
  • Character encoding standards define not only the identity of each character and its numeric value, or code position, but also how this value is represented in bits.
  • the Unicode Standard endorses at least three forms that correspond to ISO/IEC 10646 transformation formats, UTF-7, UTF-8 and UTF-16.
  • the ISO/IEC 10646 transformation formats UTF-7, UTF-8 and UTF-16 are essentially ways of turning the encoding into the actual bits that are used in implementation.
  • UTF-16 assumes 16-bit characters and allows for a certain range of characters to be used as an extension mechanism in order to access an additional million characters using 16-bit character pairs.
  • the Unicode Standard, Version 2.0, Addison Wesley Longman (1996) (with updates and additions added via “The Unicode Standard, Version 2.1”) has adopted this transformation format as defined in ISO/IEC 10646. This reference is incorporated herein by reference in its entirety and for all purposes.
  • the second transformation format is known as UTF-8.
  • UTF-8 is a way of transforming all Unicode characters into a variable length encoding of bytes. It has the advantages that the Unicode characters corresponding to the familiar ASCII set end up having the same byte values as ASCII, and that Unicode characters transformed into UTF-8 can be used with much existing software without extensive software rewrites.
  • the Unicode Consortium also endorses the use of UTF-8 as a way of implementing the Unicode Standard. Any Unicode character expressed in the 16-bit UTF-16 form can be converted to the UTF-8 form and back without loss of information.
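  • The properties described above (ASCII-compatible UTF-8, 16-bit pairs in UTF-16, and lossless conversion between the two) can be observed with Python's built-in codecs. The function names and sample characters below are choices made for this sketch only.

```python
def utf16_to_utf8(data: bytes) -> bytes:
    """Convert big-endian UTF-16 bytes to UTF-8."""
    return data.decode("utf-16-be").encode("utf-8")


def utf8_to_utf16(data: bytes) -> bytes:
    """Convert UTF-8 bytes back to big-endian UTF-16."""
    return data.decode("utf-8").encode("utf-16-be")


ascii_char = "A"              # U+0041
cjk_char = "\u4e2d"           # U+4E2D, a CJK ideograph
beyond_16_bits = "\U00010400" # needs a pair of 16-bit values in UTF-16

print(ascii_char.encode("utf-8").hex())          # 41, same byte value as ASCII
print(cjk_char.encode("utf-8").hex())            # e4b8ad, three bytes
print(beyond_16_bits.encode("utf-16-be").hex())  # d801dc00, a 16-bit pair

original = (ascii_char + cjk_char + beyond_16_bits).encode("utf-16-be")
assert utf8_to_utf16(utf16_to_utf8(original)) == original  # no information lost
```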
  • Linguistic encoding type: any character or glyph encoding type (e.g., ASCII or BIG5) now known or used in the future.
  • Each encoding type has its own mapping between linguistic characters (e.g., “a” in the Latin alphabet and an “o” with an umlaut in the German alphabet) and corresponding digital representations (e.g., hexadecimal number 0041 for “A” in ASCII).
  • Digital sequence: a particular sequence of ones and zeros, hexadecimal characters, or other constituents in a digital representation.
  • Encoded domain name: the digital sequence of domain name characters represented in binary or hexadecimal for example. More specifically, the string of concatenated digital representations for the characters comprising the domain name under consideration is the “encoded domain name.”
  • For example, the ASCII encoded domain name for abc.com is 0x61 0x62 0x63 0x2E 0x63 0x6F 0x6D.
  • The GB-2312 encoded domain name of “Taiwan.com” is 0xCC 0xA8 0xCD 0xE5 0x2E 0x63 0x6F 0x6D.
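  • The two encoded domain names above can be reproduced with Python's standard `ascii` and `gb2312` codecs. The helper names are assumptions for this sketch, and the Chinese name is written here as 台湾.com, which is the string whose GB-2312 bytes match the sequence quoted above.

```python
def encoded_domain_name(name: str, encoding: str) -> bytes:
    """Concatenate the digital representations of each character of the name."""
    return name.encode(encoding)


def hex_digits(raw: bytes) -> str:
    """Render an encoded domain name as space-separated hexadecimal bytes."""
    return " ".join(f"0x{b:02X}" for b in raw)


print(hex_digits(encoded_domain_name("abc.com", "ascii")))
print(hex_digits(encoded_domain_name("台湾.com", "gb2312")))
```

  • Each two-character hexadecimal value printed here is one 8-bit unit, so the two Chinese characters occupy four units while the ASCII characters occupy one unit each.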
  • Encoding detection tree: a tree or graph structure used to unambiguously determine the encoding types of arbitrary bit strings that are digital sequences of domain names. In typical examples, using information in the digital sequence of the domain name, an encoding detection algorithm can traverse the tree to reach a leaf node (terminal node). There the encoding type can be unambiguously determined. Specific examples of such trees and their use are presented below.
  • Linguistic digit: the digital sequence employed at nodes of an encoding detection tree. In one embodiment, it is 8 bits long. It is typically derived from the digital representation of an encoded linguistic character. For example, one linguistic digit employed in an encoding detection tree may be the value hexadecimal number 0041, which is the ASCII representation of “A.” In another example, a linguistic digit is one byte of a two byte string used to represent a particular Chinese character in GB-2312.
  • the length of the linguistic digit may be chosen to provide an optimal balance between the size of the encoding detection tree and the speed at which it can be traversed. Smaller linguistic digits (1 bit in the extreme case) require more nodes and hence more storage space. Larger linguistic digits require longer comparison times.
  • Linguistic equivalent: refers to two nominally different characters that have very similar linguistic meanings. Examples include uppercase and lowercase characters in the Latin and Greek alphabets.
  • DNS encoding type: an encoding type supported by the DNS protocol of a network or the Internet; e.g., a limited set of ASCII characters specified in RFC 1035.
  • Non-DNS encoding type: an encoding type not supported by the DNS protocol under consideration, e.g., BIG5 under the RFC 1035 standard.
  • Universal linguistic encoding type: any linguistic encoding type, now known or developed in the future, that encompasses more than one character or glyph set within its encoding. Unicode is one example.
  • BIG5, ISO-8859-11, and GB-2312 are others.
  • a network 10 used in a first embodiment of this invention includes a client 12, a corresponding node 14 with whom client 12 wishes to communicate, a multilingual DNS (“m1DNS”) server 16 and a conventional DNS server 18.
  • the m1DNS server 16 may listen on a DNS port (currently addressed to the domain name port 53) for multilingual domain name queries in place of a normal DNS server.
  • Server 16 may include the Berkeley Internet Name Domain (‘BIND’ and its executable version ‘named’) which is a widely used DNS server written by Paul Vixie (http://www.isc.org/).
  • client 12 is used by a Chinese student who wishes to inquire about employment in a Hong Kong business that operates corresponding node 14.
  • the student has previously communicated with the business and has obtained the domain name of that business.
  • the domain name is provided in native Chinese characters.
  • Client 12 is outfitted with a keyboard that can type Chinese language characters and is configured with software that can recognize encoded Chinese characters and accurately display them on a computer screen.
  • the student prepares a message to the Hong Kong business, encloses her resume, and types in the Chinese domain name as the destination.
  • the system shown in FIG. 1 takes the following actions.
  • the corresponding node domain name is submitted, in the native language, to m1DNS server 16 via a DNS request.
  • the m1DNS server 16 recognizes that the domain name is not in a format that can be handled by a conventional DNS server. Therefore it translates the Chinese domain name to a format that can be used with a conventional DNS server (normally a limited set of the ASCII characters).
  • the m1DNS server 16 then repackages the DNS request, with the translated corresponding node domain name, and transmits that request to conventional DNS server 18.
  • DNS server 18 uses the normal DNS protocol to obtain a network address for the domain name it received in the DNS request. The resulting network address is the network address of corresponding node 14.
  • DNS server 18 packages that network address according to conventional DNS protocol and forwards the address back to m1DNS server 16.
  • the m1DNS server 16 transmits the needed network address back to client 12, where it is placed in the student's message.
  • the message is packetized, with each packet having a destination network address corresponding to node 14.
  • Client 12 then sends the message packets over the Internet to node 14.
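  • The sequence of actions in FIG. 1 can be sketched as a small proxy function. Everything named here is an assumption for illustration: `translate` is a made-up ASCII mapping (hex of UTF-8 bytes behind an invented `x-` prefix), `ZONE` is an in-memory stand-in for conventional DNS server 18, and the encoding is passed in explicitly rather than detected with an encoding detection tree.

```python
from typing import Callable, Optional


def translate(name: str) -> str:
    """Hypothetical ASCII mapping, for illustration only."""
    return ".".join(
        label if label.isascii() else "x-" + label.encode("utf-8").hex()
        for label in name.split(".")
    )


# Stand-in for conventional DNS server 18: a fixed table of ASCII names.
# 192.0.2.10 is an address reserved for documentation.
ZONE = {translate("台湾.com"): "192.0.2.10"}


def conventional_lookup(ascii_name: str) -> Optional[str]:
    return ZONE.get(ascii_name)


def m1dns_resolve(
    raw_name: bytes,
    encoding: str,
    lookup: Callable[[str], Optional[str]] = conventional_lookup,
) -> Optional[str]:
    """Translate a natively encoded name if needed, then query the conventional server."""
    name = raw_name.decode(encoding)
    if not name.isascii():          # not usable by a conventional DNS server
        name = translate(name)
    return lookup(name)             # the address is relayed back to the client


query = "台湾.com".encode("gb2312")  # what client 12 submits
print(m1dns_resolve(query, "gb2312"))
```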
  • Although FIG. 1 shows the m1DNS server 16 and conventional DNS server 18 as separate blocks, often the two entities can be represented as a single logical block. Often both entities will reside on a single hardware device, such as a network workstation. Further, the functions of the two entities can be executed using a single block of program code or tightly coupled blocks of program code.
  • FIG. 2 shows a network having multiple m1DNS servers, each of which performs the logical operations of m1DNS server 16 and conventional DNS server 18.
  • client 12 is depicted by a vertical line on the left-hand side of the figure
  • a default m1DNS server 17 is depicted by a vertical line in the center of the figure
  • a second m1DNS server 19 is depicted by a vertical line on the right-hand side of the figure.
  • an application running on client 12 generates a message intended for a network destination.
  • the domain name for that destination is input in non-DNS compatible text encoding format.
  • the text is encoded in a linguistic encoding type that digitally represents the characters of the text.
  • ASCII is but one linguistic encoding type.
  • the invention handles a wide range of encoding types. Examples of some in wide use include GB2312, BIG5, Shift-JIS, EUC-JP, KSC5601, extended ASCII, and others.
  • After the client application creates the message at 203, the client operating system creates a DNS request to resolve the domain name at 205.
  • the DNS request may resemble a conventional DNS request in most regards. However, the domain name provided in the request will be provided in a non-DNS encoding format.
  • the client operating system transmits its DNS request to default m 1 DNS server 17 at 207 . Note that the client operating system may be configured to send DNS requests to m 1 DNS server 17 . In other words, the default DNS server of client 12 is m 1 DNS server 17 .
  • Default m 1 DNS server 17 extracts the encoded domain name from the DNS request and generates a transformed DNS request presenting the domain name in a DNS compatible encoding format (presently the reduced set ASCII specified in RFC 1035). See 209 .
  • Server 17 attempts to resolve the DNS compatible domain name. It may use the conventional DNS protocol for this purpose; i.e., to obtain the IP address of the domain name used in the client's communication. If server 17 cannot itself resolve the domain name presented to it, it will attempt to identify another m 1 DNS server that is authoritative for the domain name under consideration. Regardless of the outcome of operation 209 , default m 1 DNS server 17 then transmits a message back to client 12 . See 217 . This message may include the IP address of the domain name under consideration or it may include a reference to another m 1 DNS server.
  • client 12 then sends its DNS request (with the multilingual domain name) to second m 1 DNS server 19 . See 219 .
  • Server 19 attempts to resolve the request locally. See 221 . Regardless of its success, it sends a reply to client 12 . See 223 . That reply will include either the IP address of the multilingual domain name, the name of a referred server, or a failure message.
  • client 12 either sends a communication using the IP address of the resolved multilingual domain name or reports a failure to establish a connection (because servers 17 and 19 each failed to resolve the domain name). See 225 . It is of course possible that server 19 sent a referral for yet another multilingual domain name server. If that is the case, then client 12 may try to send its multilingual domain name request to the newly referred server.
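  • The referral-following behavior of client 12 in FIG. 2 can be sketched roughly as below. This is a simplified illustration, not the patented implementation: the `Reply` record, the in-memory `SERVERS` table, the server labels "17" and "19" used as dictionary keys, the `max_hops` limit, and the IP address are all assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Reply:
    """Reply from a name server: an IP address, a referral, or neither (failure)."""
    ip: Optional[str] = None
    referral: Optional[str] = None


def resolve_with_referrals(name: bytes, default_server: str,
                           servers: Dict[str, Dict[bytes, Reply]],
                           max_hops: int = 4) -> Optional[str]:
    """Ask the default server first, then follow referrals until an IP or a failure."""
    server = default_server
    for _ in range(max_hops):
        reply = servers.get(server, {}).get(name, Reply())
        if reply.ip is not None:
            return reply.ip        # resolved: the client can contact the node
        if reply.referral is None:
            return None            # failure: neither an address nor a referral
        server = reply.referral    # resend the request to the referred server
    return None                    # gave up after max_hops referrals


# Hypothetical data: server "17" refers the client to server "19".
NAME = b"\xcc\xa8\xcd\xe5.com"     # domain name in a non-DNS encoding
SERVERS = {
    "17": {NAME: Reply(referral="19")},
    "19": {NAME: Reply(ip="192.0.2.10")},
}
address = resolve_with_referrals(NAME, "17", SERVERS)
```

  • The hop limit is a design choice of this sketch; it simply keeps a chain of referrals from looping indefinitely.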
  • the domain name must, at some point, be converted from a non-DNS encoding type to a DNS compatible encoding type. In the above examples, this is accomplished with a m 1 DNS server (or a proxy m 1 DNS server). This need not be the case, however, as the functionality necessary for conversion may be embodied elsewhere. In alternative embodiments, the functions performed by the m 1 DNS server are implemented in whole (or in part) on the client and/or on the DNS name server.
  • operations including detecting an encoding type, translating a non-DNS encoded domain name to a DNS encoded domain name and identifying a default name server are implemented on an Internet application (e.g., a multilingual-enabled Web browser).
  • code detection and code conversion are automatically done prior to dispatching a DNS resolution request to a DNS name server.
  • operations 305 - 311 can be implemented entirely on a proxy m 1 DNS server.
  • Other embodiments include collapsing all or some fraction of these operations into a conventional DNS name server.
  • code for some m 1 DNS functions can be collapsed into BIND code as a compilable module.
  • the conversion of the domain name from one linguistic encoding type to a second linguistic encoding type is performed at 209 or 221 (depending upon which of servers 17 and 19 is the authoritative server).
  • this conversion may take place via a process 301 .
  • the process begins at 303 with the system identifying the encoding type of the domain name in the DNS request. This is necessary when the system may be confronted with multiple different encoding types.
  • the detection will involve analyzing a bit string making up the domain name under consideration. A preferred approach to this process is described below in detail.
  • an application can present explicitly defined linguistic encoding which obviates the need for encoding type detection.
  • After the encoding type has been identified, the system next determines whether the domain name was encoded in a DNS compatible encoding type at 305. Currently, that requires determining whether the domain name is encoded in the reduced set ASCII encoding type. If so, further conversion is unnecessary and process control is directed to 311, which will be described below.
  • the domain name is encoded in a non-DNS format.
  • process control is directed to 307 where the system translates the domain name to a universal encoding type.
  • this universal encoding type is Unicode.
  • the characters identified in the native encoding type are then identified in the Unicode standard and converted to the Unicode digital sequences for those characters.
  • the newly translated domain name is then further transformed from the universal encoding type to a DNS compatible encoding type. See 309 .
  • this final encoding type may be the reduced set of ASCII specified in RFC 1035.
  • the system need only determine which conventional DNS name server it should forward the domain name to.
  • the DNS request might be forwarded to a top-level name server.
  • the Chinese government may maintain a root name server for Chinese language domain names
  • the Japanese government or a Japanese corporation may maintain a root name server for Japanese language domain names
  • the Indian government may maintain a root name server for Hindi language domain names, etc.
  • the system must identify the appropriate name server at 311 as indicated in FIG. 3A. After this has been accomplished, the conversion process is complete and the DNS request can be transmitted to the DNS system for handling according to convention.
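  • Operations 305 through 309 of FIG. 3A can be illustrated with the following Python sketch. It assumes the encoding type has already been detected and is passed in as a Python codec name. The `utf5_encode` helper is an assumption based on the UTF-5 scheme named elsewhere in this description: each code point is written in hexadecimal and its first digit is shifted into the range G to V. Leaving ASCII-only labels such as "com" untouched is also an assumption of this sketch.

```python
import re

# Reduced ASCII set for host names: letters, digits, hyphens, dots between labels.
_DNS_COMPATIBLE = re.compile(rb"[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)*")


def utf5_encode(text: str) -> str:
    """UTF-5-style encoding (assumed): hex digits, first digit mapped to G..V."""
    out = []
    for ch in text:
        digits = format(ord(ch), "X")
        lead = chr(ord("G") + int(digits[0], 16))  # marks the start of a character
        out.append(lead + digits[1:])
    return "".join(out)


def to_dns_compatible(raw: bytes, encoding: str) -> str:
    """Convert a domain name from its native encoding to a DNS compatible form."""
    if _DNS_COMPATIBLE.fullmatch(raw):         # 305: already DNS compatible
        return raw.decode("ascii")
    unicode_name = raw.decode(encoding)        # 307: native encoding -> Unicode
    labels = unicode_name.split(".")
    return ".".join(                           # 309: Unicode -> DNS compatible
        label if label.isascii() else utf5_encode(label) for label in labels
    )


converted = to_dns_compatible(b"\xcc\xa8\xcd\xe5.com", "gb2312")
```

  • The result contains only characters from the reduced ASCII set, so it can be handed to a conventional DNS name server selected at 311.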
  • the process depicted in FIG. 3A is performed solely on a m 1 DNS server.
  • some of the process may be performed on a client or a conventional DNS server.
  • 303 and 305 could be performed on a client and 309 could be performed on a conventional DNS server.
  • the name server is co-located with the m 1 DNS server. So operation 311 would involve nothing more than determining that the server performing the encoding type detection and conversion can also resolve a DNS request for the domain name in question.
  • A preferred division of labor for the m 1 DNS function is depicted in FIG. 3B.
  • a m 1 DNS server 327 performs the necessary detection of encoding type and conversion to a DNS compatible format. Server 327 also performs normal DNS resolution.
  • An encoding detection tree (EDT) 321 and associated logic performs the operations of FIG. 3A.
  • a normal DNS resolution subsystem 323 performs the standard DNS resolving protocol.
  • the EDT and associated logic detects all necessary linguistic encoding types and can convert all encoding types to Unicode (or other suitable universal encoding type).
  • a client 325 submits a domain name for a corresponding node 331 in its native language.
  • the m 1 DNS server 327 receives the domain name and a conventional DNS resolution sub-system 323 performs the standard DNS resolving protocol. It returns the IP address for corresponding node 331 , allowing client 325 to communicate directly with node 331 .
  • EDT 321 and associated logic runs on a machine (identified by i2.i-dns.com for example) on a designated port (e.g., a port number 2000 ). It accepts a whole portion of a digitally represented domain name in any linguistic encoding type and returns a whole portion of a digitally represented domain name in Unicode transformed to a DNS encoding type (UTF-5). Normal DNS subsystem 323 returns an IP address for the domain name under consideration.
  • FIGS. 4A-4C depict an embodiment of this invention for identifying an encoding type of a domain name string using an encoding detection tree of this invention.
  • an encoding detection tree 401 (which may also be viewed technically as a “graph”) includes various nodes such as node 403 and connections between those nodes such as “eq” connection 405 .
  • FIGS. 4B and 4C present a process flow for using a tree structure such as tree 401 to unambiguously detect an encoding type.
  • At the beginning of the process flow 450 depicted in FIG. 4B, the linguistic encoding type, T, is unknown.
  • the international domain name system (e.g., an m 1 DNS server as described above)
  • the reversed linguistic digit sequence S′ is depicted by sequence 407 (for the domain name A 1 A 2 B 1 B 2 .com).
  • A 1 A 2 represent one 16 bit Chinese character encoded in BIG5
  • B 1 B 2 represent a second Chinese character also encoded in BIG5.
  • the international domain name system next sets a pointer P 1 to a first digit of the reversed sequence S′ and sets a pointer P 2 to the root of the encoding detection tree 401 . See operation 454 .
  • Pointers P 1 and P 2 are depicted in FIG. 4A. During the course of the encoding detection process, these pointers move from digit to digit (in the case of P 1 ) and from node to node (in the case of P 2 ).
  • the international domain name system compares the value at pointer P 1 against the value at pointer P 2 . See 456 .
  • This comparison involves the digital values of the character (or portion of a character) from the domain name at the current location of pointer P 1 and the linguistic digit represented in the node currently at pointer P 2 .
  • pointer P 2 is initially at node 403 , which corresponds to the digital value of m.
  • the linguistic digit at node 403 will be the ASCII value of the letter “m.”
  • the value at pointer P 1 is also the digital value for m. Therefore, the comparison presented at 456 indicates that the values at pointers P 1 and P 2 are equal.
  • process flow 450 proceeds to 458 where the position of pointer P 2 is moved to the equal child node of parent node 403 .
  • the equal child node of parent node 403 is the node 409 .
  • This node contains the linguistic digit of the letter “o” in ASCII.
  • the international domain name system determines whether the pointer P 2 is currently pointing to a terminal node of the tree structure. If so, it will determine the encoding type from that node. In the current example, however, there are many additional nodes between the pointer P 2 and the terminal node. (Examples of terminal nodes are indicated by nodes 421 and 429 in FIG. 4A.) Therefore, decision 460 is answered in the negative. Then process control is directed to a decision 462 , which determines whether the next digit in the domain name string represents the end of that string. In the example at hand, the pointer P 1 currently points to the m character. Therefore, several more digits exist between the end of string 407 and the current digit. Hence, decision 462 is answered in the negative.
  • Process control is next directed to block 464 where the international domain name system moves pointer P 1 to the next digit of S′, the reversed character sequence 407 . See FIG. 4A. In the example at hand, this results in P 1 moving from the letter m to the letter o in sequence 407 . Process control is then directed back to decision operation 456 .
  • decision block 460 determines whether the new location of P 2 is a terminal node. As it is not in this case, the process moves to decision block 462 where it determines that the next digit in the S′ character string is not the end of that string. Then, process block 464 moves P 1 to the next digit in S′, the letter “c.”
  • the pointer P 1 is located at the c and the “.” respectively.
  • the pointer P 2 points to the node 413 containing the linguistic digit that is the digital representation of character B 2 .
  • the pointer P 1 points to the character B 2 in reverse sequence 407 .
  • the international domain name system compares the values located at pointers P 1 and P 2 . See 456 . As before, these values are equal. Therefore, the location of pointer P 2 moves down the “eq” path to node 415 , which harbors the linguistic digit for B 1 .
  • each encoding type represented at the terminal node is also associated with its unique character string, S.
  • the international domain name system can then search through a list of character strings for the exact match of the sequence S. See 470 . When the exact match of the digital sequence S is found, the corresponding encoding type is selected.
  • FIG. 4C depicts one example of a process flow that may be employed to search through the list of terminal nodes as depicted at process block 470 .
  • a process 480 begins at 482 with selection of the first terminal node in the list of terminal nodes. See 482 .
  • the process normalizes the sequence S associated with terminal node L based on the linguistic encoding type of that sequence. See 484 .
  • the process determines whether the linguistic digital sequence S of the domain name under consideration matches the linguistic digital sequence associated with the terminal node under consideration. See 486 . Assuming that this is the case, process control is directed to block 488 where system returns the terminal node currently visited in the list.
  • This terminal node has an associated encoding type, which is the encoding type of the domain name under consideration.
  • process control is directed to decision block 490, which checks to determine whether the end of the list of terminal nodes has been reached. If not, the next terminal node, T, in the list is visited at 492. From there, process control is directed back to block 484 and the process continues as described above. Now, in the case where the end of the list of terminal nodes has been reached but no matching strings have been found, decision block 490 will be answered in the affirmative. As a result, the system sets a pointer to the list of terminal nodes to point to nothing. See 494. From there, process control is directed to 488, which returns no match in this case.
  • the need for the process of FIG. 4C can be understood as follows.
  • the traversing path may lead to a list of terminal nodes rather than a single match. Therefore, some mechanism to determine which terminal node is correct is required.
  • ??.com is a valid GB encoded binary sequence and also a valid iso8859-1 encoded binary sequence. Both of them have the same traversing path in the encoding detection table and will be chained up if both were previously inserted into the encoding detection tree.
  • For iso8859-1 encoded characters, upper case and lower case are considered linguistically equivalent, while for GB encoded characters, the case of the character is sometimes significant. Chinese characters in GB are double-byte characters, so both of their bytes are case significant.
  • the detection tree could have a terminal node with detectable string as “\0xEC\0xA8\0xED\0xE5\0x2E\0x63\0x6F\0x6D” (all valid iso8859-1 characters are lower-case) and an encoding type of “iso8859-1”.
  • the encoding tree could also have another terminal node with detectable string as “\0xCC\0xA8\0xCD\0xE5\0x2E\0x63\0x6F\0x6D” (valid GB characters will be preserved, while ASCII characters will be lower cased since they are not case significant) and an encoding type of “GB”. Both of them are chained up under the same traversing path.
  • the normalization process (see 484 ) will utilize the encoding information contained in the terminal node and lower case characters that are not case significant in the query string and then do exact match on the normalized query string with detectable string stored in the terminal node.
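  • The search through chained terminal nodes in FIG. 4C, including the normalization at 484, can be sketched as follows. The two detectable strings are the ones given above; the function names, the representation of a chain as a list of (detectable string, encoding type) pairs, the order of the chain, and the rule that any byte of 0x80 or above starts a double-byte GB character are assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def normalize(query: bytes, encoding: str) -> bytes:
    """Lower-case only the bytes that are not case significant in `encoding`."""
    if encoding == "iso8859-1":
        # Every iso8859-1 character is a single byte with a simple case mapping.
        return query.decode("latin-1").lower().encode("latin-1")
    out = bytearray()
    i = 0
    while i < len(query):
        b = query[i]
        if b >= 0x80 and i + 1 < len(query):
            out += query[i:i + 2]        # double-byte character: keep both bytes
            i += 2
        else:
            out += bytes([b]).lower()    # ASCII byte: case is not significant
            i += 1
    return bytes(out)


def match_terminal(query: bytes,
                   chain: List[Tuple[bytes, str]]) -> Optional[str]:
    """Return the encoding of the first terminal node whose string matches exactly."""
    for detectable, encoding in chain:
        if normalize(query, encoding) == detectable:
            return encoding
    return None                          # end of list reached without a match


CHAIN = [
    (b"\xcc\xa8\xcd\xe5\x2e\x63\x6f\x6d", "GB"),
    (b"\xec\xa8\xed\xe5\x2e\x63\x6f\x6d", "iso8859-1"),
]
gb_result = match_terminal(b"\xcc\xa8\xcd\xe5.COM", CHAIN)
iso_result = match_terminal(b"\xec\xa8\xed\xe5.COM", CHAIN)
```

  • Because the first exact match wins, the order in which terminal nodes are chained matters whenever a query normalizes to more than one detectable string.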
  • Referring again to FIGS. 4A and 4B, consider the possibility where the domain name in question is A 1 A 2 B 1 B 2 .coM.
  • the domain name is linguistically equivalent to the previous domain name under consideration. It so happens that it is presented with an upper case letter “M,” rather than the lower case letter “m.”
  • the pointer P 1 points to the M in the sequence, S′, and the pointer P 2 points to the root node 403 of tree 401 .
  • the system will discover that the value at P 1 is less than the value at P 2 . This is because the digital sequence representing M has a lower value than the digital sequence representing m.
  • process control proceeds to a process block 472 , which moves the pointer P 2 to the low child node branching from root node 403 . In this case, that is node 423 , populated with the digital sequence associated with the M.
  • the system next determines whether pointer P 2 is pointing to nothing. See decision block 473. In this case, that is not true, so process control is directed back to decision block 456, where the value associated with pointer P 1 and the value at the new position of pointer P 2 are compared. This time, the values will match, so process control is directed to 458. There, the pointer P 2 is moved down the “eq” branch to node 409, where the linguistic digit for the letter “o” resides. The process then proceeds down the tree, loop by loop, until reaching terminal node 421 as described in the previous example.
  • process control loops back to 456, where the value at P 1 (F 2 ) is compared with the value at the new location of pointer P 2 (node 425 ). This time, the comparison will indicate a match. Then, process block 458 moves pointer P 2 down the eq path on tree 401 to a node 427. The process will continue in this manner until reaching a terminal node 429. There, the encoding type of domain name D 1 E 2 F 1 F 2 .com will be identified.
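  • A minimal Python sketch of the traversal in FIG. 4B follows. It assumes 8-bit linguistic digits and, for brevity, records the encoding types on the node holding the last digit rather than on separate terminal nodes. The `insert` helper is a plain ternary search tree insert without linguistic equivalents, and the BIG5 byte values are hypothetical example data.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    digit: int                                   # one 8-bit linguistic digit
    lokid: Optional["Node"] = None
    eqkid: Optional["Node"] = None
    hikid: Optional["Node"] = None
    encodings: List[str] = field(default_factory=list)  # set on the last digit


def insert(node: Optional[Node], digits: bytes, encoding: str, i: int = 0) -> Node:
    """Insert the (already reversed) digit sequence; returns the subtree root."""
    d = digits[i]
    if node is None:
        node = Node(d)
    if d < node.digit:
        node.lokid = insert(node.lokid, digits, encoding, i)
    elif d > node.digit:
        node.hikid = insert(node.hikid, digits, encoding, i)
    elif i + 1 < len(digits):
        node.eqkid = insert(node.eqkid, digits, encoding, i + 1)
    else:
        node.encodings.append(encoding)
    return node


def detect(root: Optional[Node], name: bytes) -> List[str]:
    """Walk P1 along the reversed name S' and P2 down the tree."""
    s_rev = name[::-1]
    p1, p2 = 0, root
    while p2 is not None and p1 < len(s_rev):
        digit = s_rev[p1]
        if digit < p2.digit:
            p2 = p2.lokid                 # value at P1 is lower: take the lo branch
        elif digit > p2.digit:
            p2 = p2.hikid                 # value at P1 is higher: take the hi branch
        elif p1 == len(s_rev) - 1:
            return list(p2.encodings)     # last digit matched: encoding type found
        else:
            p1 += 1                       # equal: advance P1 and follow eq
            p2 = p2.eqkid
    return []                             # P2 points to nothing: no match


root = None
root = insert(root, b"\xa4\xa4\xa4\xe5.com"[::-1], "BIG5")
root = insert(root, b"\xcc\xa8\xcd\xe5.com"[::-1], "GB2312")
found = detect(root, b"\xcc\xa8\xcd\xe5.com")
```

  • Reversing the name means the shared suffix (here ".com") is matched first, so names under the same top-level domain share the upper part of the tree.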
  • the tree should embody representations of all domain names that are registered with a particular host system—e.g., a particular Internet Service Provider.
  • the encoding detection tree is periodically rebuilt when new multilingual domain names are registered.
  • the tree is recomputed every time a new domain name is registered. More typically, the tree is computed only after a defined number of new domains have been registered since the tree was last computed or a set length of time has expired (e.g., 12 hours) since the tree was last computed.
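  • The rebuild policy just described can be expressed as a small predicate. This is only a sketch: the function name and the threshold of 100 new registrations are assumptions, while the 12 hour interval is the example value given above.

```python
def should_rebuild(new_names: int, seconds_since_build: float,
                   max_new_names: int = 100,
                   max_age_seconds: float = 12 * 60 * 60) -> bool:
    """Rebuild once enough names have accumulated or the tree is old enough."""
    return new_names >= max_new_names or seconds_since_build >= max_age_seconds
```

  • Setting `max_new_names` to 1 reproduces the alternative in which the tree is recomputed every time a new domain name is registered.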
  • the registrar may enforce certain restrictions on registration. In a preferred embodiment, two restrictions are imposed. First, the registrar should not register two domain names having the exact same digital sequence. Second, it should not register two domain names that are linguistically equivalent.
  • the domain names grasshopper.com and GrassHopper.COM are linguistically equivalent.
  • domain names are case insensitive. Entities and individuals obtaining domain names expect to own rights to all linguistic equivalents of a given name.
  • the registrar should prevent registration of two linguistically equivalent domain names.
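  • The two registration restrictions can be illustrated with the sketch below. The `Registrar` class and its method are hypothetical, and treating "linguistically equivalent" as "equal after decoding to Unicode and lower-casing" is an assumption that covers the case-insensitivity example above but not necessarily every equivalence a registrar might enforce.

```python
class Registrar:
    """Rejects duplicate digital sequences and linguistically equivalent names."""

    def __init__(self) -> None:
        self._sequences = set()   # raw digital sequences already registered
        self._canonical = set()   # decoded, lower-cased forms already registered

    def register(self, raw: bytes, encoding: str) -> bool:
        canonical = raw.decode(encoding).lower()
        if raw in self._sequences:       # restriction 1: same digital sequence
            return False
        if canonical in self._canonical:  # restriction 2: linguistic equivalent
            return False
        self._sequences.add(raw)
        self._canonical.add(canonical)
        return True


registrar = Registrar()
first = registrar.register(b"grasshopper.com", "ascii")
second = registrar.register(b"GrassHopper.COM", "ascii")
```

  • Here the second call is refused because it differs from the first only in case.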
  • the encoding detection tree preferably contains paths for multiple linguistic equivalents of a single domain name.
  • the tree is designed considering one or more of four objectives:
  • the encoding detection tree is extended from a data structure called “ternary search tree.”
  • Each node of the tree is associated with a record holding a single linguistic digit and pointers to its children nodes.
  • the linguistic digit represents the digital encoding of a particular character (or a portion of that character) in a particular encoding type.
  • the size of the linguistic digit used in the nodes is chosen to balance rapid searching against low memory usage.
  • each node contains a one bit linguistic digit. This structure could be searched very fast, but would occupy too much memory. Trees having 16 bit linguistic digits at each node would occupy less memory, but would be searched more slowly.
  • each node of the tree includes 8 bits.
  • each node can only have at most three children nodes, which are named as “lokid”, “eqkid” and “hikid”.
  • the linguistic digit stored with a node is compared against the digital sequences of characters in the domain name under analysis. “Lokid” will be visited if the digital value of the linguistic digit from the appropriate position of the incoming digital sequence is less than the value of the linguistic digit held by the current node of the tree. If both of the digits have the same value, “eqkid” is visited. When the digital value of the linguistic digit from the appropriate position of the incoming digital sequence is greater than the value of the linguistic digit held by the current node of the tree, “hikid” will be visited.
  • Non-ternary search trees may be employed in alternative embodiments.
  • FIG. 4D depicts a slightly more complex version of a ternary encoding detection tree. This tree would be suitable for a multilingual domain name system that registers “.com” and “.tm” top level domain names. The tree itself would be traversed in the manner described above for tree 401 of FIG. 4A.
  • an algorithm 500 for building the encoding detection tree will be described.
  • the algorithm 500 is executed by a computer system (e.g., system 700 , which will be described later referring to FIG. 7) periodically.
  • the system adds a new linguistic digit sequence of a domain name to its encoding detection tree.
  • An exemplary process in which the system ultimately creates the EDT shown in FIG. 4A will be described in detail below.
  • the system receives three linguistic digit sequences, namely, “A 1 A 2 B 1 B 2 .com,” “C 1 C 2 D 1 D 2 .com,” and “D 1 E 2 F 1 F 2 .com” in this order.
  • the EDT may be stored in various types of memory devices in the system as long as its topological structure is kept precisely.
  • the routine 500 receives a linguistic digit sequence S (i.e., A 1 A 2 B 1 B 2 .com), and a linguistic encoding type of the linguistic digit sequence S (e.g., GB2312).
  • the system then reverses the order of the linguistic digit sequence S, and substitutes the reversed sequence of S for the reversed linguistic digit sequence S′.
  • the system substitutes a linguistic encoding type of the linguistic digit sequence S for a linguistic encoding type T.
  • the reversed linguistic sequence S′, and the linguistic encoding type T are “moc.B 2 B 1 A 2 A 1 ,” and “GB2312,” respectively.
  • FIG. 5B illustrates the reversed linguistic sequence S′, and the linguistic encoding type T.
  • the system may store the linguistic encoding type T, which is represented in ASCII characters.
  • the system may store a code corresponding to the linguistic encoding type T in order to reduce a memory space for storing the variable T.
  • the routine 500 initializes pointers P 1 , P 1 ′, P 2 , and P 2 ′ as indicated in FIG. 5B.
  • the pointer P 1 which points to a current position in the linguistic digit sequence, is set to the first digit of the sequence S′, namely, “m.”
  • the pointer P 1 ′ points to a linguistic equivalent of a linguistic digit pointed by the pointer P 1 , which is “M.”
  • a linguistic equivalent of a linguistic digit in ASCII code is an upper case letter of the linguistic digit.
  • the pointer P 2 which points to a currently-visited node in the EDT, is initially set to a root of the EDT.
  • the pointer P 2 ′ which points to a currently-visited node in the EDT for insertion of the linguistic digit at the location of pointer P 1 ′, is also set to the root of the EDT. Since the EDT has no node, the pointers P 2 and P 2 ′ point to a null node 539 shown by the broken line.
  • the system stores the linguistic equivalent of the linguistic digit (e.g., “M”) in a single-digit (e.g., one-byte) buffer.
  • the system may store the linguistic equivalent of the linguistic digit in a row of multiple-digit buffer having the same structure as the one where the sequence S′ is stored. In both cases, if the linguistic equivalent does not exist, those buffers do not store any digit.
  • Block 505 checks whether a linguistic digit pointed by the pointer P 1 exists in the EDT by calling a subroutine 540 labeled as “check_existence(D, P),” which will be described in detail below referring to FIG. 5C.
  • the subroutine 540 takes two local variables: D, which is a linguistic digit passed from the main routine 500 for checking the existence, and P, which is a pointer to the currently-visited node of the EDT (e.g., P 2 ).
  • the subroutine 540 returns the local variable P as a computed result to the main routine 500 .
  • FIG. 5C illustrates the subroutine 540 , check_existence(D, P) for checking whether a linguistic digit has already been inserted into the EDT.
  • the subroutine 540 takes a linguistic digit Value(P 1 ) and the pointer P 2 , and returns the result as a pointer P 3 .
  • the function “Value(P 1 )” returns a linguistic digit which the P 1 points to.
  • the function Value(P 1 ), and the pointer P 2 are substituted for the linguistic digit D, and the pointer P, respectively.
  • a decision 541 is made based on whether P 2 points to a null node.
  • Block 549 creates a new empty node 559 shown by the solid line in FIG. 5D, at a position which the pointer P 2 points to.
  • a “null node” (e.g., 539 ) means that the pointer P 2 points to a position where a node has not been created, while an “empty node” (e.g., 559 ) means that the pointer P 2 points to a node which stores no linguistic digit therein.
  • a null node is represented by a circle drawn by the broken line
  • an empty node is represented by a circle drawn by the solid line.
  • block 551 returns the position of the empty node 559 which the pointer P 2 points to, to block 505 as the pointer P 3 as shown in FIG. 5D.
  • the pointer P 3 points to the same node as the pointers P 2 , and P 2 ′ do, which is the empty node 559 .
  • a decision 507 is made based on whether the pointer P 3 points to an empty node. Since the subroutine 540 returns the pointer P 3 , which points to the empty node 559 , control moves to block 509 .
  • Block 509 substitutes the linguistic digit Value(P 1 ) for the linguistic digit Value(P 3 ).
  • the linguistic digit Value(P 1 ) i.e., “m,” is substituted for the linguistic digit which the pointer P 3 points to, i.e., the formerly empty node 559 , as shown in FIG. 5E.
  • control moves to a decision 511 .
  • the decision 511 is made based on whether the linguistic digit which the pointer P 1 ′ points to is an “empty value,” or has no value.
  • the pointer P 1 ′ points to a linguistic digit of “M” as shown in FIG. 5B, which is not a digit with an empty value, and thus, control moves to block 513 .
  • Block 513 checks whether a linguistic digit pointed by the pointer P 1 ′ exists in the EDT by calling a subroutine 540 .
  • the decision 541 is made based on whether the pointer P 2 ′ points to a null node.
  • control moves to a decision 543 .
  • the decision 543 is made based on whether the pointer P 2 ′ points to an empty node. Since the pointer P 2 ′ points to the node 559 storing the linguistic digit “m,” which is not an empty node, control moves to a decision 545 .
  • the decision 545 is made based on whether Value(D) is greater than, equal to, or less than Value(P), where the local variables D, and P are the pointers P 1 ′, and P 2 ′ from the main routine 500 , respectively.
  • the pointer P 1 ′ points to “M” (FIG. 5B)
  • the pointer P 2 ′ points to “m” (FIG. 5E).
  • control proceeds to block 547 , which moves the pointer P to a “lokid” child node of the “m” node (i.e., the node 559 ) which the pointer P (i.e., P 2 ′) originally points to, as shown in FIG. 5F.
  • the lokid child node of the “m” node has no node, which is indicated by the broken line in FIG. 5F. Control then returns to the decision 541 .
  • Control returns from the subroutine 540 and proceeds to a decision 515 in the main routine 500.
  • the decision 515 is made based on whether the pointer P 3 ′ points to an empty node. Since the subroutine 540 returns the new empty node as the pointer P 3 ′, control moves to block 517 .
  • Block 517 substitutes the linguistic digit Value(P 1 ′) for the linguistic digit Value(P 3 ′).
  • the linguistic digit Value(P 1 ′) i.e., “M” is substituted for the linguistic digit which the pointer P 3 ′ points to as shown in FIG. 5I.
  • control moves to a decision 519 .
  • the decision 519 is made based on whether an eqkid of the node which the pointer P 3 points to is equal to an eqkid of the node which the pointer P 3 ′ points to.
  • As an exception, if both (i) an eqkid of the node which the pointer P 3 points to, and (ii) an eqkid of the node which the pointer P 3 ′ points to are null nodes, control moves on the “NO” branch and proceeds to block 521.
  • Block 521 calls a subroutine 560 , which is labeled as “adjust_subgraph(P, P′),” as described in detail referring to FIG. 5J.
  • the subroutine 560 takes two local variables P, which is a pointer to a linguistic digit in the EDT, and P′, which is a pointer to a linguistic equivalent of the linguistic digit in the EDT.
  • a decision 555 is made based on whether both (i) the “eqkid” child node of the node which the pointer P 3 points to, and (ii) the “eqkid” child node of the node which the pointer P 3 ′ points to, are null nodes.
  • Block 557 creates a new empty node under (i) the eqkid child node of the node which the pointer P 3 points to, and (ii) the eqkid child node of the node which the pointer P 3 ′ points to, as shown in FIG. 5K.
  • Control returns from the subroutine 560 and proceeds to a decision 523 in the main routine 500 .
  • the decision 523 is made based on whether the pointer P 1 has reached the end of the linguistic digit sequence S′. As shown in FIG. 5B, the pointer P 1 still points to the first linguistic digit of the linguistic digit sequence S′. Thus, control moves to block 525 .
  • the system sets the pointers P 1 , P 1 ′, P 2 , and P 2 ′ to add the next digit in the sequence to the EDT as shown in FIG. 5L.
  • the system moves the pointer P 1 to a next linguistic digit of the linguistic digit sequence S′, namely, “o.”
  • the system sets the pointer P 1 ′ to a linguistic equivalent of the linguistic digit pointed by the pointer P 1 , which is “O.”
  • the system also sets the pointers P 2 and P 2 ′ to the empty node which was created under the eqkid child nodes of the “m” node and the “M” node. Then, control returns to block 505 .
  • the main routine 500 again calls the subroutine check_existence(Value(P 1 ), P 2 ).
  • the pointer P 2 does not point to a null node as shown in FIG. 5L, and thus, control moves to block 543 .
  • the pointer P 2 points to an empty node as shown in FIG. 5L. Therefore, control moves to block 551 , which returns the local variable P, i.e., the pointer P 2 which points to the empty node, to block 505 of the main routine 500 .
  • the system sets the pointer P 3 to the same node as the pointers P 2 and P 2 ′ point to, as shown in FIG. 5M. Control returns from the subroutine 540 to block 505 of the main routine 500, and proceeds to block 507.
  • decision 507 causes control to move to block 509 , which substitutes Value(P 1 ) for Value(P 3 ).
  • the system stores the linguistic digit “o” in the node which the pointer P 3 points to as shown in FIG. 5N.
  • Control moves on to the decision 511 .
  • In block 511, since the linguistic digit which the pointer P 1 ′ points to is not empty, control moves to block 513, which again calls the subroutine check_existence(Value(P 1 ′), P 2 ′). Control jumps from block 513 of the main routine 500 to block 541 of the subroutine 540.
  • In block 541, the pointer P 2 ′ does not point to a null node as shown in FIG. 5N, and thus control moves to block 543.
  • the pointer P 2 ′ is not an empty node, and thus control proceeds to block 545 .
  • Block 545 compares Value(P 1 ′) (i.e., “O”) and Value(P 2 ′) (i.e., “o”).
  • control proceeds to block 547, which moves the pointer P to a “lokid” child node of the “o” node which the pointer P (i.e., P 2 ′) originally points to, as shown in FIG. 5O.
  • the lokid child node of the “o” node has no node, which is indicated by the broken line in FIG. 5O. Control then returns to the decision 541 .
  • FIG. 5O shows the pointers and variables of the system immediately after the execution of block 509 .
  • Blocks 523 , and 525 move the pointers to the next character without adding a portion of the tree structure representing the linguistic equivalents (e.g., “M,” “O,” and “C”).
  • the portion representing the linguistic equivalents, namely, “M,” “O,” and “C,” is created by blocks 513 , 515 , and 517 .
  • In block 523 , the pointer P 1 still has not reached the end of the linguistic digit sequence S′, and control moves to block 525 .
  • Block 525 sets the pointers P 1 , P 1 ′, P 2 , and P 2 ′ as indicated in FIG. 5Q, and control returns to block 505 .
  • Block 505 creates an empty node at a position which the pointer P 2 points to, and sets the pointer P 3 to the newly created empty node as shown in FIG. 5R.
  • the scheme in the subroutine 540 is similar to that described referring to FIGS. 5L and 5M. Then, control returns from the subroutine 540 to block 509 of the main routine 500 .
  • In block 507 , the pointer P 3 points to an empty node as shown in FIG. 5R, and thus, control moves to block 509 .
  • Block 509 substitutes the linguistic digit which the pointer P 1 points to, i.e., B 2 , for a node which the pointer P 3 points to, as shown in FIG. 5S.
  • the pointer P 1 ′ points to an empty value.
  • Block 525 sets the pointers P 1 , P 1 ′, P 2 , and P 2 ′ as shown in FIG. 5T.
  • By iterating the process described referring to FIGS. 5 Q- 5 T, the system creates the three nodes “B 1 ,” “A 2 ,” and “A 1 ” under the eqkid node of the node of “B 2 ,” as shown in FIG. 5U. Control moves on to blocks 511 , and then 523 . In the decision 523 , this time, the pointer P 1 points to the end of the reversed linguistic digit sequence S′, and therefore, control proceeds to a decision 527 . The decision 527 is made based on whether the tree structure of “m-M-o-O-c-C-.-B 2 -B 1 -A 2 -A 1 ” has a terminal node. Here, the tree structure has no terminal node, and thus, control proceeds to block 529 .
  • Block 529 creates a new terminal node N 1 which is associated with the tree structure “m-M-o-O-c-C-.-B 2 -B 1 -A 2 -A 1 ”.
  • the terminal node N 1 contains the linguistic digit sequence S, and the linguistic encoding type T of the sequence S.
  • the terminal node N is sometimes referred to as a “leaf” node of the tree structure, which uniquely identifies distinct encoding types, thereby unambiguously specifying the linguistic encoding type T of the domain name corresponding to the tree pathway (e.g., “m-M-o-O-c-C-.-B 2 -B 1 -A 2 -A 1 ”).
  • FIG. 5V illustrates the tree structure “m-M-o-O-c-C-.-B 2 -B 1 -A 2 -A 1 ,” and the associated terminal node N 1 containing the linguistic digit sequence S, and the linguistic encoding type T, which are “A 1 A 2 B 1 B 2 .com,” and “GB2312,” respectively.
  • the routine 500 receives a linguistic digit sequence S (i.e., C 1 C 2 D 1 D 2 .com), and a linguistic encoding type of the linguistic digit sequence S (e.g., BIG5).
  • the computer system then reverses the order of the linguistic digit sequence S, and substitutes the reversed sequence of S for the reversed linguistic digit sequence S′.
  • the system substitutes a linguistic encoding type of the linguistic digit sequence S for a linguistic encoding type T.
  • the reversed linguistic sequence S′, and the linguistic encoding type T are “moc.D 2 D 1 C 2 C 1 ,” and “BIG5,” respectively.
  • FIG. 5W illustrates the reversed linguistic sequence S′, and the linguistic encoding type T.
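  • The reversal step described above can be sketched as follows. This is a minimal illustration only, assuming the linguistic digits have already been split into list elements; the function name `reverse_digits` and the placeholder strings such as "C1" are hypothetical and stand in for the actual digit values.

```python
def reverse_digits(digits):
    """Return the linguistic digits in reverse order (S -> S')."""
    return list(reversed(digits))


# "C1C2D1D2.com" split into linguistic digits; "C1", "C2", ... are placeholders
# for the encoded digit values, which the text does not spell out.
S = ["C1", "C2", "D1", "D2", ".", "c", "o", "m"]
T = "BIG5"  # linguistic encoding type supplied together with S

S_reversed = reverse_digits(S)
print("".join(S_reversed), T)
```

Reversing first means the shared suffix (here ".com") is consumed at the top of the tree, so names under the same top-level domain share their leading nodes.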
  • the routine 500 initializes pointers P 1 , P 1 ′, P 2 , and P 2 ′ as indicated in FIG. 5W.
  • the pointer P 1 which points to a current position in the linguistic digit sequence, is set to the first digit of the sequence S′, namely, “m.”
  • the pointer P 1 ′ points to a linguistic equivalent of a linguistic digit pointed by the pointer P 1 , which is “M.”
  • the linguistic equivalent of a linguistic digit in ASCII code is a capital letter of the linguistic digit.
  • the pointer P 2 which points to a currently-visited node in the EDT, is set to a root of the EDT, namely a node storing “m.”
  • the pointer P 2 ′ which points to a currently-visited node in the EDT for insertion of the pointer P 1 ′, is also set to the root of the EDT, i.e., the “m” node.
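  • The notion of a linguistic equivalent used for the pointer P 1 ′ can be sketched as a small helper. This is an assumption-laden illustration: the name `linguistic_equivalent` is hypothetical, and it covers only the ASCII case described in the text (a lowercase letter maps to its capital; digits such as "." have no equivalent).

```python
import string


def linguistic_equivalent(digit):
    """Return the capital letter for a lowercase ASCII letter, else None.

    None models the "empty value" that P1' points to when the current
    linguistic digit has no equivalent (for example ".").
    """
    if len(digit) == 1 and digit in string.ascii_lowercase:
        return digit.upper()
    return None


for d in ["m", "o", "c", "."]:
    print(repr(d), "->", repr(linguistic_equivalent(d)))
```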
  • Block 505 checks whether a linguistic digit pointed by the pointer P 1 exists in the EDT by calling the subroutine 540 check_existence(D, P).
  • the decision 541 is made based on whether the pointer P, i.e., the pointer P 2 , points to a null node.
  • the pointer P 2 points to a node with a linguistic digit of “m,” and thus, control moves to the decision 543 .
  • the decision 543 is made based on whether the pointer P points to an empty node.
  • the pointer P 2 points to the “m” node, not an empty node, and thus, control proceeds to the decision 545 .
  • the decision 545 is made based on whether Value(D) is greater than, equal to, or less than Value(P), where the local variables D, and P are the pointers P 1 , and P 2 from the main routine 500 , respectively.
  • the pointer P 1 points to “m,” and the pointer P 2 points to “m.” Therefore, control moves to block 551 , which returns the position of the pointer P 2 to the main routine 500 as the pointer P 3 .
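  • The three decisions of the subroutine 540 walked through above (null node, empty node, three-way comparison) can be sketched as below. This is a simplified sketch, not the patented routine: the `Node` class, its field names, and the convention that `digit is None` marks an empty node are assumptions, and the sketch expects a non-null starting node, creating an empty node whenever it would otherwise step onto a null child.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    digit: Optional[str] = None      # None marks an "empty" node (assumption)
    lokid: Optional["Node"] = None   # child for smaller digit values
    eqkid: Optional["Node"] = None   # child for the next digit in the sequence
    hikid: Optional["Node"] = None   # child for larger digit values


def check_existence(digit, node):
    """Return the node holding `digit`, or a fresh empty node where it belongs."""
    while True:
        if node.digit is None:       # decision 543: empty node, hand it back
            return node
        if digit == node.digit:      # decision 545, equal: digit already stored
            return node
        side = "lokid" if digit < node.digit else "hikid"
        child = getattr(node, side)
        if child is None:            # decision 541: null position, create empty node
            child = Node()
            setattr(node, side, child)
        node = child
```

Because the capital "M" has a smaller code value than "m", a call such as `check_existence("M", Node("m"))` ends on a newly created empty lokid node, which matches the lokid step described for the linguistic equivalents.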
  • the pointer P 3 points to the “m” node as shown in FIG. 5X, which is not an empty node, and thus, control moves to the decision 511 .
  • the decision 511 is made based on whether the pointer P 1 ′ is an empty value.
  • the pointer P 1 ′ points to a linguistic digit, “M,” which is not an empty value, and thus, control moves on to block 513 .
  • the subroutine 540 checks whether the EDT has already stored the linguistic digit which the pointer P 2 ′ points to. Specifically, referring to FIG. 5C, the decision 541 checks if the pointer P 2 ′ points to a null node.
  • the pointer P 2 ′ points to an “m” node as shown in FIG. 5X, and thus, control moves to the decision 543 .
  • the decision 543 checks if the pointer P 2 ′ points to an empty node. Since the pointer P 2 ′ does not point to an empty node, control moves to the decision 545 . Here, the relationship Value(D) < Value(P) (i.e., Value(“M”) < Value(“m”)) is satisfied, and thus control moves to block 547 , where the system moves the pointer P to the lokid node of the “m” node which the pointer P 2 ′ currently points to. Then, the subroutine 540 returns the position of the pointer P to the main routine 500 as the pointer P 3 ′. As a result, the pointers P, and P 3 ′ are set to the “M” node of the EDT as shown in FIG. 5X.
  • Control returns from the subroutine 540 to block 513 of the main routine 500 .
  • the pointer P 3 ′ points to an “M” node, and thus, control moves to the decision 519 .
  • the decision 519 is made based on whether the eqkid node of the node which the pointer P 3 points to, and the eqkid node of the node which the pointer P 3 ′ points to are the same or not.
  • the pointer P 3 points to the “m” node, and the pointer P 3 ′ points to the “M” node.
  • the system sets the pointers P 1 , P 1 ′, P 2 , and P 2 ′ to add the next digit in the sequence to the EDT as shown in FIG. 5Y.
  • Control loops back to block 505 to check if the EDT has the linguistic digit which the pointer P 2 points to. Since the pointer P 2 points to the “o” node now, control proceeds to the decisions 541 , 543 , and 545 .
  • control moves to block 551 , which returns the position of the pointer P 2 to the block 505 of the main routine 500 as the pointer P 3 , as shown in FIG. 5Z.
  • In decision 507 , since the pointer P 3 does not point to an empty node, control goes on to the decision 511 , which checks whether the pointer P 1 ′ points to an empty value.
  • the pointer P 1 ′ points to an “O” digit, and thus, control moves to block 513 .
  • Block 513 checks if the EDT has the linguistic digit which the pointer P 2 ′ points to. Since the pointer P 2 ′ points to the “o” node now, control proceeds to the decisions 541 , 543 , and 545 . Here, the relationship Value(D) < Value(P) (i.e., Value(“O”) < Value(“o”)) is satisfied, and thus control moves to block 547 , where the system moves the pointer P to the lokid node of the “o” node which the pointer P 2 ′ currently points to. Then, the subroutine 540 returns the position of the pointer P to the main routine 500 as the pointer P 3 ′. As a result, the pointers P, and P 3 ′ are set to the “O” node of the EDT as shown in FIG. 5Z.
  • the pointers are then set as shown in FIG. 5AA, which is similar to FIG. 5Y, when the block 525 is executed.
  • Control then loops back to block 505 to check if the EDT has the linguistic digit, “.” (dot).
  • Control jumps from the main routine 500 to the subroutine 540 , where control proceeds to decisions 541 , 543 , 545 , and the block 551 .
  • the subroutine 540 returns the position of the pointer P to the main routine 500 as the pointer P 3 , thereby setting the pointer P 3 as shown in FIG. 5AB.
  • Control returns to decision 507 of the main routine 500 , and further goes to blocks 509 , and then 511 since the pointer P 3 is not an empty node.
  • the pointer P 1 ′ points to an empty value since the linguistic digit of “.” has no linguistic equivalent as shown in FIG. 5AB.
  • Block 525 sets the pointers P 1 , P 1 ′, P 2 , and P 2 ′ as shown in FIG. 5AC.
  • Control loops back to block 505 , which calls the subroutine 540 .
  • Control moves to the decisions 541 , 543 , and 545 since the pointer P 2 points to neither a null node nor an empty node.
  • the decision 545 is made based on whether Value(D) is greater than, equal to, or less than Value(P), where the local variables D, and P are the pointers P 1 , and P 2 from the main routine 500 , respectively.
  • the pointer P 1 points to “D 2 ”, and the pointer P 2 points to “B 2 .” Supposing that Value(“D 2 ”) < Value(“B 2 ”), control proceeds to block 547 , which moves the pointer P to a “lokid” child node of the “B 2 ” node which the pointer P (i.e., the pointer P 2 ′) originally points to, as shown in FIG. 5AC. The lokid child node of the “B 2 ” node has no node, which is indicated by the broken line in FIG. 5AC. Control then returns to the decision 541 .
  • Block 509 substitutes the linguistic digit Value(P 1 ) for the linguistic digit Value(P 3 ).
  • the linguistic digit Value(P 1 ) i.e., “D 2 ” is substituted for the linguistic digit which the pointer P 3 points to, as shown in FIG. 5AE.
  • control moves to the decision 511 .
  • the pointer P 1 ′ points to an empty value, and thus control goes to the decision 523 , and then block 525 , which set the pointers P 1 , P 1 ′, P 2 , and P 2 ′ as shown in FIG. 5AF.
  • the positions of the pointers P 1 , P 1 ′, P 2 , and P 2 ′ are similar to those of FIG. 5T. Accordingly, by iterating the process described referring to FIGS. 5 T- 5 V, the routine 500 creates the EDT as illustrated by FIG. 5AG.
  • the decision 545 is made based on whether Value(D) is greater than, equal to, or less than Value(P), where the local variables D, and P are the pointers P 1 , and P 2 from the main routine 500 , respectively.
  • control proceeds to block 553 , which moves the pointer P to a “hikid” child node of the “B 2 ” node which the pointer P (i.e., the pointer P 2 ′) originally points to, as shown in FIG. 5AH.
  • the hikid child node of the “B 2 ” node has no node, which is indicated by the broken line in FIG. 5AH. Control then returns to the decision 541 .
  • the pointer P now points to a null node since the hikid of the “B 2 ” node has no node.
  • the process following the decision 541 is similar to that described referring to FIGS. 5 AC- 5 AG, and ultimately creates the EDT illustrated in FIG. 4A.
  • control follows nodes 403 , 409 , 411 , 413 , 415 , 417 , and 419 since each of the values of the linguistic digits for A 1 A 2 B 1 B 2 .com matches the corresponding one of the values of the linguistic digits for X 1 X 2 Y 1 Y 2 .com.
  • since the pointer P 1 points to the end of the string S′, namely, X 1 , control goes to the “YES” branch of the decision 523 in FIG. 5A.
  • the terminal node 421 contains the linguistic digit sequence S, which is “A 1 A 2 B 1 B 2 .com,” and the linguistic encoding type T, which is “GB2312.”
  • Block 531 creates a new terminal node 431 which is associated with the tree structure “m-M-o-O-c-C-.-Y 2 -Y 1 -X 2 -X 1 ”.
  • the terminal node 431 contains the linguistic digit sequence, and the linguistic encoding type, which are “X 1 X 2 Y 1 Y 2 .com,” and “BIG5,” respectively.
  • the new terminal node 431 is appended at the end of the tree structure of “A 1 A 2 B 1 B 2 .com.”
  • the tree has two leaf nodes, namely the terminal nodes 421 , and 431 .
  • FIG. 5AJ illustrates the completed EDT, in which the value of each corresponding linguistic digit matches with each other for two different domain names.
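  • The case above, in which two domain names have identical digit values under different encodings, can be reproduced with Python's built-in codecs. This is an illustrative sketch only: the byte pair 0xA4 0xA4 is an example chosen here, not one taken from the patent, and the `leaves` list is a hypothetical stand-in for the two terminal nodes.

```python
# One digit sequence that is valid under both encodings.
raw = b"\xa4\xa4"

as_big5 = raw.decode("big5")      # interpreted as a BIG5 character
as_gb2312 = raw.decode("gb2312")  # interpreted as a GB2312 character

# The same tree path therefore needs one leaf per candidate encoding,
# each recording the encoding type alongside the decoded sequence.
leaves = [("GB2312", as_gb2312), ("BIG5", as_big5)]

for encoding, text in leaves:
    print(encoding, repr(text))
```

Since the bytes alone do not settle which reading is intended, a path with more than one leaf signals that the encoding type cannot be decided from the digit values alone.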
  • the tree building algorithm applies the third domain name as follows. Initially, pointer P 2 points to node E 2 and pointer P 1 points to the last letter “a” in the domain name aba. See block 503 of FIG. 5A. Next, the algorithm runs the check_existence subroutine 540 of FIG. 5C. At 545 in this subroutine, the algorithm determines that digital values of “a” and “E 2 ” are equal. Thus, it returns a pointer P 3 pointing to node E 2 . Now, returning to primary algorithm 500 , the process continues through decision blocks 507 and 511 .
  • decision 515 determines that P 3 ′ is not pointing to an empty node. Then decision 519 directs the process to 521 , where the adjust-EDT structure subroutine 560 is executed. There, the system determines that both P 3 and P 3 ′ point to a non-null node. Therefore, each of decisions 555 , 559 , and 561 are answered in the negative and process control is directed to process block 563 , where a “reinsert_subparagraph” subroutine is executed. This subroutine is depicted in FIG. 5AI by process 570 .
  • In process 570 , the system initially creates pointers P and P′ that point to nodes located immediately under pointers P 3 and P 3 ′.
  • P points to node E 1 and P′ points to node D 3 .
  • the subroutine determines whether P′ is null. In this case, it is not, so the process moves to 573 where the check_existence subroutine is executed. This time the input parameters are P and P′. Executing this program as such compares the values of D 3 and E 1 and moves P to hikid from E 1 . Thereafter a new empty node is created at that position and a new pointer P′′ is returned at the position of this new node.
  • a decision block 575 determines whether P′′ points to an empty node. It does in this case, so a process operation 577 inserts the digit at the position of P′. In this case, that is the digit D 3 .
  • Embodiments of the present invention relate to an apparatus for performing the above-described m 1 DNS operations.
  • This apparatus may be specially constructed (designed) for the required purposes, or it may be a general-purpose computer selectively activated or configured by a computer program stored in the computer.
  • the processes presented herein are not inherently related to any particular computer or other apparatus.
  • various general-purpose machines may be used with programs written in accordance with the teachings herein, or it may be more convenient to construct a more specialized apparatus to perform the required method operations. The required structure for a variety of these machines will appear from the description given above.
  • embodiments of the present invention further relate to computer readable media that include program instructions for performing various computer-implemented operations.
  • the media may also include, alone or in combination with the program instructions, data files, data structures, tables, and the like.
  • the media and program instructions may be those specially designed and constructed for the purposes of the present invention, or they may be of the kind well known and available to those having skill in the computer software arts.
  • Examples of computer-readable media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROM disks; magneto-optical media such as floptical disks; and hardware devices that are specially configured to store and perform program instructions, such as read-only memory devices (ROM) and random access memory (RAM).
  • the media may also be a transmission medium such as optical or metallic lines, wave guides, etc. including a carrier wave transmitting signals specifying the program instructions, data structures, etc.
  • Examples of program instructions include both machine code, such as produced by a compiler, and files containing higher level code that may be executed by the computer using an interpreter.
  • FIG. 7 illustrates a typical computer system in accordance with an embodiment of the present invention.
  • the computer system 700 includes any number of processors 702 (also referred to as central processing units, or CPUs) that are coupled to storage devices including primary storage 706 (typically a random access memory, or “RAM”) and primary storage 704 (typically a read only memory, or “ROM”).
  • primary storage 704 acts to transfer data and instructions uni-directionally to the CPU and primary storage 706 is used typically to transfer data and instructions in a bi-directional manner. Both of these primary storage devices may include any suitable type of the computer-readable media described above.
  • a mass storage device 708 is also coupled bi-directionally to CPU 702 and provides additional data storage capacity and may include any of the computer-readable media described above.
  • the mass storage device 708 may be used to store programs, data and the like and is typically a secondary storage medium such as a hard disk that is slower than primary storage. It will be appreciated that the information retained within the mass storage device 708 , may, in appropriate cases, be incorporated in standard fashion as part of primary storage 706 as virtual memory.
  • a specific mass storage device such as a CD-ROM 714 may also pass data uni-directionally to the CPU.
  • CPU 702 is also coupled to an interface 710 that includes one or more input/output devices such as video monitors, track balls, mice, keyboards, microphones, touch-sensitive displays, transducer card readers, magnetic or paper tape readers, tablets, styluses, voice or handwriting recognizers, or other well-known input devices such as, of course, other computers.
  • CPU 702 optionally may be coupled to a computer or telecommunications network using a network connection as shown generally at 712 . With such a network connection, it is contemplated that the CPU might receive information from the network (e.g., requests to resolve domains), or might output information to the network in the course of performing the above-described method operations.
  • the above-described devices and materials will be familiar to those of skill in the computer hardware and software arts.
  • the CPU 702 may take various forms. It may include one or more general purpose microprocessors that are selectively configured or reconfigured to implement the functions described herein. Or it may include one or more specially designed processors or microcontrollers that contain logic and/or circuitry for implementing the functions described herein. Any of the logical devices serving as CPU 702 may be designed as general purpose microprocessors, microcontrollers, application specific integrated circuits (ASICs), digital signal processors (DSPs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), and the like. They may execute instructions under the control of the hardware, firmware, software, reconfigurable hardware, combinations of these, etc.
  • the hardware elements described above may be configured (usually temporarily) to act as one or more software modules for performing the operations of this invention.
  • separate modules created from program instructions for detecting an encoding type, transforming that encoding type, and identifying a default name server may be stored on mass storage device 708 or 714 and executed on CPU 702 in conjunction with primary memory 706 . See FIG. 3B for example.

Abstract

A multilingual apparatus detects the linguistic encoding type of a digital string encoding a domain name. It accomplishes this using a tree or graph comprised of nodes holding linguistic digits representing the digital sequence of a character or a portion of a character. These nodes are compared against digital sequences of characters in the domain name under consideration. Each comparison results in a step down the graph. Then another comparison is performed, often with the next successive character in the domain name. Ultimately the process reaches a terminal node of the graph. This terminal node specifies the encoding type of the domain name under consideration.

Description

    CROSS REFERENCE TO RELATED APPLICATIONS
  • This application is a continuation-in-part of U.S. patent application Ser. No. 09/258,690 filed Feb. 26, 1999 in the name of Ching Hong Seng et al. and titled “MULTI-LANGUAGE DOMAIN NAME SERVICE.” That application is incorporated herein by reference for all purposes.[0001]
  • BACKGROUND OF THE INVENTION
  • The present invention relates to the Domain Name Service used to resolve network domain names into corresponding network addresses. More particularly, the invention relates to an alternative or modified Domain Name Service that accepts domain names provided in many different encoding formats, not just ASCII. [0002]
  • The Internet has evolved from a purely research and academic entity to a global network that reaches a diverse community with different languages and cultures. In all areas the Internet has progressed to address the localization needs of its audience. Today, electronic mail is exchanged in most languages. Content on the World Wide Web is now published in many different languages as multilingual-enabled software applications proliferate. It is possible to send an e-mail message to another person in Chinese or to view a World Wide Web page in Japanese. [0003]
  • The Internet today relies entirely on the Domain Name System to resolve human readable names to numeric IP addresses and vice versa. The Domain Name System (DNS) is still based on a subset of Latin-1 alphabet, thus still mainly English. To provide universality, e-mail addresses, Web addresses, and other Internet addressing formats adopt ASCII as the global standard to guarantee interoperation. No provision is made to allow for e-mail or Web addresses to be in a non-ASCII native language. The implication is that any user of the Internet has to have some basic knowledge of ASCII characters. [0004]
  • While this does not pose a problem to technical or business users who, generally speaking, are able to understand English as an international language of science, technology, business and politics, it is a stumbling block to the rapid proliferation of the Internet to countries where English is not widely spoken. In those countries, the Internet neophyte must understand basic English as a prerequisite to send e-mail in her own native language because the e-mail address cannot support the native language even though the e-mail application can. Corporate intranets have to use ASCII to name their department domain names and Web documents simply because the protocols do not support anything other than ASCII in the domain name field even though filenames and directory paths can be multilingual in the native locale. [0005]
  • Moreover, users of European languages have to approximate their domain names without accents and so on. A company like Citroën wishing to have a corporate identity has to approximate itself to the closest ASCII equivalent and use “www.citroen.fr” and Mr Francois from France has to constantly bear the irritation of deliberately mis-typing his e-mail address as “francois@email.fr” (as a fictitious example). [0006]
  • Currently, user-ids in an e-mail address field can be in multilingual scripts as operating systems can be localized to provide fonts in the relevant locale. Directories and filenames can also be rendered in multilingual scripts. However, the domain name portion of these names is restricted to the characters permitted by the Internet standard in RFC1035, the standard setting forth the Domain Name System. [0007]
  • Based on RFC1035, valid domain names are currently restricted to a subset of the ISO-8859 Latin 1 alphabet, which comprises the alphabet letters A-Z (case insensitive), numbers 0-9 and the hyphenation symbol (-) only. This restriction effectively makes a domain name support English or languages with a romanized form, such as Malay or Romaji in Japanese, or a roman transliteration, such as transliterated Tamil. No other script is acceptable; even the extended ASCII characters cannot be used. [0008]
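  • The character restriction described above can be checked with a short sketch. It is a hedged illustration: the helper name `uses_only_permitted_characters` is hypothetical, and it tests only the character set named in the text (letters, digits, hyphen), not the length or positional rules that RFC1035 also imposes on labels.

```python
import re

# Letters A-Z in either case, digits 0-9, and the hyphen, as listed in the text.
_PERMITTED = re.compile(r"[A-Za-z0-9-]+")


def uses_only_permitted_characters(label):
    """Return True if every character of the label is a letter, digit, or hyphen."""
    return _PERMITTED.fullmatch(label) is not None


for label in ["citroen", "citroën", "geneve-city", "genève-city"]:
    print(label, uses_only_permitted_characters(label))
```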
  • Unicode is a character encoding system in which nearly every character of most important languages is uniquely mapped to a 16 bit value. Since Unicode has laid down the foundations for unique non-overlapping encoding system, some researchers have begun to explore how Unicode can be used as the basis for a future DNS namespace, which can embrace the rich diversity of languages present in the world today. See M. Durst, “Internationalization of Domain Names,” Internet Draft “draft-duerst-dns-i18n-02.txt,” which can be found at the IETF home page, http://www.ietf.cnri.reston.va.us/ID.html, July 1998. This document is incorporated herein by reference in its entirety and for all purposes. The new namespace should be able to offer multilingual and multiscript functionality that will make it easier for non-English speakers to use the Internet. [0009]
  • Adopting Unicode as the standard character set for a new Domain Name System avoids overlapping code space for different language scripts. In this way, it may allow the Internet community to use domain names in their native scripts such as [0010]
  • www.citroën.ch
  • www.genève-city.ch
  • Unfortunately, several difficulties would preclude modifying the DNS server and client applications to implement a multilingual Domain Name System. For example, all future client applications and all future DNS servers have to be modified. As both client and server have to be modified for the system to work, the transition from the old system to the new system could be difficult. Further, very few available client applications use native Unicode. Instead, most multilingual client applications use non-Unicode encodings, and have strong followings. [0011]
  • One proposed compromise solution to this problem is the so-called “multilingual.com.” In this approach, the popular “.com” (“dot com”) top level domain is represented in ASCII characters, but the second and lower level domains are represented in a non-ASCII format. Such formats allow non-Roman characters. For example, the non-ASCII encoding type BIG-5 encodes Chinese characters. Thus, a Chinese language second level domain name could be registered and used with a .com top-level domain name. However, to make use of the existing infrastructure for resolving domain names, the BIG-5 encoded second level domain name would first have to be converted to an ASCII representation. The transformed multilingual.com second level domain could then be used by conventional name servers to resolve the address. [0012]
  • In view of these and other issues, it would be highly desirable to have a technique allowing the many linguistic encodings to be used with DNS. [0013]
  • SUMMARY OF THE INVENTION
  • The present invention pertains to methods and apparatus that detect the linguistic encoding type of a digital string encoding a domain name. This is accomplished using a tree or graph comprised of nodes holding linguistic digits representing the digital sequence of a character or a portion of a character. These nodes are compared against digital sequences of characters in the domain name under consideration. Each comparison results in a step down the graph. Then another comparison is performed, often with the next successive character in the domain name. Ultimately the process reaches a terminal node of the graph. This node specifies the encoding type of the domain name under consideration. [0014]
  • One specific aspect of the invention pertains to a method of detecting a linguistic encoding type of a domain name. Such method may be characterized by the following sequence: (a) receiving a digital representation of the domain name; and (b) using the digital representation to traverse a tree structure having multiple nodes connected by paths and having terminal nodes that uniquely identify distinct encoding types. By using the digital representation to traverse the tree in this manner and reach a terminal node, the method detects the linguistic encoding type of the domain name. [0015]
  • The tree structure may take various forms. In a preferred embodiment, the tree structure is a ternary tree structure. Typically, the nodes of the tree structure comprise digital sequences of linguistic digits from characters of multiple encoding types. [0016]
  • Typically, the method traverses the tree structure by considering individual characters of the domain name (or portions of those characters) to determine how to move between nodes on the tree structure. In specific embodiments, the tree structure is traversed by comparing digital representations of linguistic digits in the nodes of the tree structure against digital representations of individual characters of the domain name or portions of those characters. The comparisons determine how to move between nodes on the tree structure. For example, if the digital value of a node's linguistic digit is greater than the digital value of the corresponding character of the domain name, one path is chosen. Other paths are chosen if the comparison shows different relationships between the digital values. [0017]
  • In certain specific embodiments, the method also involves reversing the sequence of the digital representation of the domain name prior to using the representation to traverse the tree structure. In this manner, a digital representation of a last character of the domain name is compared to a root node on the tree structure. Next, a digital representation of a next to last character of the domain name is compared to a second node of the tree structure. The method continues in this manner (i) using a next previous character of the domain name to identify a next lower level node of the tree structure; and (ii) repeating (i) until reaching a terminal node of the tree structure. Ultimately, a terminal node of the tree structure is reached. Typically, the terminal node itself specifies the linguistic encoding type of the domain name. [0018]
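  • The reversed traversal described above can be sketched with a small ternary tree. This is a simplified sketch under stated assumptions, not the claimed implementation: each byte is treated as one linguistic digit, the names `Node`, `insert`, and `detect` are hypothetical, linguistic equivalents are ignored, names must be non-empty, and the encoding type is stored on the last digit's node rather than on a separate terminal node.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Node:
    digit: int
    lokid: Optional["Node"] = None
    eqkid: Optional["Node"] = None
    hikid: Optional["Node"] = None
    terminal: Optional[Tuple[bytes, str]] = None  # (domain name, encoding type)


def insert(root, name, encoding):
    """Add a non-empty byte string to the tree in reversed order; return the root."""
    digits = name[::-1]
    if root is None:
        root = Node(digits[0])
    node, i = root, 0
    while True:
        d = digits[i]
        if d < node.digit:                      # smaller value: go to lokid
            if node.lokid is None:
                node.lokid = Node(d)
            node = node.lokid
        elif d > node.digit:                    # larger value: go to hikid
            if node.hikid is None:
                node.hikid = Node(d)
            node = node.hikid
        elif i == len(digits) - 1:              # last digit matched: mark terminal
            node.terminal = (name, encoding)
            return root
        else:                                   # match: advance to the next digit
            i += 1
            if node.eqkid is None:
                node.eqkid = Node(digits[i])
            node = node.eqkid


def detect(root, name):
    """Traverse with the reversed name; return the encoding type or None."""
    digits = name[::-1]
    node, i = root, 0
    while node is not None:
        d = digits[i]
        if d < node.digit:
            node = node.lokid
        elif d > node.digit:
            node = node.hikid
        elif i == len(digits) - 1:
            return node.terminal[1] if node.terminal else None
        else:
            i += 1
            node = node.eqkid
    return None


root = None
root = insert(root, "中文.com".encode("big5"), "BIG5")
root = insert(root, "中文.com".encode("gb2312"), "GB2312")
print(detect(root, "中文.com".encode("gb2312")))
```

Both names share the nodes for "m", "o", "c", and "." and diverge only at the first non-ASCII byte, which is the effect the reversal is meant to achieve.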
  • Another aspect of the invention pertains to apparatus for detecting the linguistic encoding type of a domain name. The apparatus may be characterized by the following features: (a) one or more processors; (b) memory coupled to said one or more processors and configured to store a tree structure having multiple nodes connected by paths and having terminal nodes that uniquely identify distinct encoding types; and (c) a network interface configured to receive domain names from network nodes. The one or more processors are configured or designed to traverse the tree structure using information from a domain name to thereby detect the linguistic encoding type of the domain name. The tree structure may have a form as described above. [0019]
  • In a preferred embodiment, the apparatus also includes a logical module for converting the domain name from its linguistic encoding type to a DNS compatible encoding type (e.g., ASCII). The apparatus may also include a logical module for resolving domain names in the DNS compatible encoding type. [0020]
  • Another specific aspect of the invention pertains to methods for creating an encoding detection tree of the types described above (e.g., ternary tree structures). Such methods may be characterized by the following sequence: (a) receiving a representation of a digitally represented first domain name which is encoded in a first linguistic encoding type; (b) adding the first domain name, and its first linguistic encoding type, to the encoding detection tree to create a first path through the encoding detection tree; (c) receiving a representation of a digitally represented second domain name which is encoded in a second linguistic encoding type; and (d) adding the second domain name, and its second linguistic encoding type, to the encoding detection tree to create a second path through the encoding detection tree. Part of the method may also involve determining whether the first domain name (or some part of it) already exists in the encoding detection tree. [0021]
  • Generally, the first and second paths each include separate terminal nodes, one or more intermediate nodes, and a common root node. As part of the process, the system may add an identifier of the first and second linguistic encoding types to the terminal nodes of the first and second paths, respectively. The system may also add a sequence of the first domain name to the terminal node of the first path. In a preferred embodiment, the encoding detection tree presents the domain names in reverse order. Thus, for example, the first path in the tree presents the first domain name in reverse order of linguistic digits when moving from the root node to the terminal node. [0022]
  • In one approach to constructing the tree, the first domain name is included in the tree by adding a new node to the encoding detection tree for each linguistic digit of the first domain name having a digital sequence that does not appear at a corresponding location in the encoding detection tree. The positions of the new nodes with respect to existing nodes are determined by comparing the digital sequence of an existing node with the digital sequence of a corresponding linguistic digit from the first domain name. The process may also include adding to the encoding detection tree a linguistic equivalent node for one of the linguistic digits in the first domain name. [0023]
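The construction steps above resemble insertion into a ternary search tree. The following is a sketch under that assumption only: `Node`, `insert`, `add_domain`, and `count` are hypothetical names, the byte values are placeholders, and linguistic equivalent nodes are left out for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    digit: int                       # 8-bit linguistic digit
    lo: Optional["Node"] = None
    eq: Optional["Node"] = None
    hi: Optional["Node"] = None
    encoding: Optional[str] = None   # identifier stored on the terminal node
    name: Optional[bytes] = None     # original digit sequence, also on the terminal


def insert(node: Optional[Node], digits: bytes, encoding: str,
           name: bytes, i: int = 0) -> Node:
    d = digits[i]
    if node is None:
        node = Node(d)               # new node only where the digit is missing
    if d < node.digit:
        node.lo = insert(node.lo, digits, encoding, name, i)
    elif d > node.digit:
        node.hi = insert(node.hi, digits, encoding, name, i)
    elif i + 1 < len(digits):
        node.eq = insert(node.eq, digits, encoding, name, i + 1)
    else:
        node.encoding, node.name = encoding, name
    return node


def add_domain(root: Optional[Node], name: bytes, encoding: str) -> Node:
    if not name:
        raise ValueError("empty domain name")
    return insert(root, name[::-1], encoding, name)   # stored in reverse order


def count(node: Optional[Node]) -> int:
    if node is None:
        return 0
    return 1 + count(node.lo) + count(node.eq) + count(node.hi)


root = add_domain(None, b"\xa4\xa4\xa4\xe5.com", "BIG5")
root = add_domain(root, b"\xcc\xa8\xcd\xe5.com", "GB2312")
print(count(root))
```

Because both names end in the same five digits when reversed, the second insertion reuses five existing nodes and adds only three new ones.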
  • Yet another aspect of the invention pertains to apparatus for creating an encoding detection tree. Such apparatus may be characterized by the following features: (a) one or more processors; (b) memory coupled to said one or more processors and configured to store a partially created tree structure having multiple nodes connected by paths and having terminal nodes that uniquely identify distinct encoding types; and (c) an interface configured to receive domain names from a collection of domain names. In this apparatus, the one or more processors are configured or designed to receive representations of digitally represented domain names which are encoded in linguistic encoding types and add those domain names, together with their linguistic encoding types, to the encoding detection tree to create paths through the encoding detection tree. [0024]
  • Another aspect of the invention pertains to computer program products including machine-readable media on which are stored program instructions for implementing a portion of or an entire method as described above. Any of the methods of this invention may be represented, in whole or in part, as program instructions that can be provided on such computer readable media. [0025]
  • These and other features and advantages of the invention will be described in more detail below with reference to the associated drawings. [0026]
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 is a block diagram of an exemplary system for resolving a non-ASCII domain name to its numeric IP address. [0027]
  • FIG. 2 is a process flow diagram of operations between a client and two multilingual DNS servers according to an embodiment of the invention. [0028]
  • FIG. 3A is a flow chart of the conversion of the domain name from one linguistic encoding type to a second linguistic encoding type according to an embodiment of the invention. [0029]
  • FIG. 3B is a block diagram of a multilingual domain name server according to an embodiment of the invention. [0030]
  • FIG. 4A is a schematic diagram of the reversed linguistic digit sequence of a domain name and a corresponding encoding detection tree according to an embodiment of the invention. [0031]
  • FIG. 4B is a flow chart of an algorithm to determine an encoding type of a domain name using an encoding detection tree according to an embodiment of the invention. [0032]
  • FIG. 4C is a flow chart of an algorithm, used with the algorithm of FIG. 4B, to search a list of terminal nodes according to an embodiment of the invention. [0033]
  • FIG. 4D is a schematic diagram of another example of an encoding detection tree. [0034]
  • FIG. 5A is a flow chart of an algorithm to construct the data structure according to an embodiment of the invention. [0035]
  • FIG. 5B is a schematic diagram of the reversed linguistic digit sequence, the linguistic encoding type, the linguistic equivalent, and the encoding detection tree according to an embodiment of the invention. [0036]
  • FIG. 5C is a flow chart of an algorithm to check whether a linguistic digit has been inserted into the encoding detection tree according to an embodiment of the invention. [0037]
  • FIGS. 5D-5I are schematic diagrams of the reversed linguistic digit sequence, the linguistic equivalent, the encoding detection tree, and the pointers according to an embodiment of the invention when a linguistic digit sequence “A1A2B1B2.com” is added to the encoding detection tree. [0038]
  • FIG. 5J is a flow chart of an algorithm to adjust sub-EDT structure according to an embodiment of the invention. [0039]
  • FIGS. 5K-5V are schematic diagrams of the reversed linguistic digit sequence, the linguistic equivalent, the encoding detection tree, and the pointers according to an embodiment of the invention when a linguistic digit sequence “A1A2B1B2.com” is added to the encoding detection tree. [0040]
  • FIGS. 5W-5AG are schematic diagrams of the reversed linguistic digit sequence, the linguistic equivalent, the encoding detection tree, and the pointers according to an embodiment of the invention when a linguistic digit sequence “C1C2D1D2.com” is added to the encoding detection tree. [0041]
  • FIG. 5AH is a schematic diagram of the reversed linguistic digit sequence, the linguistic equivalent, the encoding detection tree, and the pointers according to an embodiment of the invention when a linguistic digit sequence “D1E2F1F2.com” is added to the encoding detection tree. [0042]
  • FIG. 5AI is a flow chart of an algorithm to reinsert a sub-EDT structure according to an embodiment of the invention. [0043]
  • FIG. 5AJ is a schematic diagram of the reversed linguistic digit sequence of a domain name and a corresponding encoding detection tree according to an embodiment of the invention. [0044]
  • FIG. 6 is a schematic diagram depicting the construction of an encoding detection tree using the procedure depicted in FIG. 5AI, when a sub-EDT structure is destroyed and reinserted in the encoding detection tree. [0045]
  • FIG. 7 is a simplified block diagram of a typical computer system of the type that may be employed to implement the procedures of this invention. [0046]
  • DETAILED DESCRIPTION OF THE INVENTION
  • 1. INTRODUCTION [0047]
  • The present invention provides a technology for efficiently and accurately identifying encoding types of domain names. It uses a tree or graph structure having nodes corresponding to “linguistic digits.” In a typical application, these linguistic digits are sequentially compared against digital representations of characters in the domain name. Each comparison results in a decision on which available path to take in the graph structure. This moves a pointer through a tree sequentially until reaching a terminal node associated with an encoding type. Thus, at the end of the process, the encoding type is detected. This information can then be employed to convert the characters of the multilingual domain name to a format compatible with the DNS standard (e.g., RFC 1035). [0048]
  • While the invention is described below in terms of a “multilingual.com” embodiment, other domain name formats may be employed as well. In general, any domain name system that recognizes domain names from more than one encoding type can be used with this invention. Depending upon the range of acceptable domain names, the encoding detection graph/tree will have various forms. [0049]
  • In a typical embodiment, the present invention transforms multilingual multiscript names to a form that is compliant with DNS (e.g., DNS as explained in RFC1035). These transformed names may then be relayed as DNS queries to a conventional DNS server. An exemplary process of how a localized domain name is resolved to its numeric IP address is illustrated by FIG. 1 below. [0050]
  • As background for this invention, understand that programs rarely refer to hosts and other resources by their binary network addresses. Instead of binary numbers, they use ASCII strings, such as www.pobox.org.sg. Nevertheless, the network itself only understands binary addresses, so some mechanism is required to convert the ASCII strings to network addresses. This mechanism is provided by the Domain Name System. [0051]
  • The essence of DNS is a hierarchical, domain-based naming scheme and a distributed database system for implementing this naming scheme. It is primarily used for mapping host names and e-mail destinations to IP addresses, but can be used for other purposes. As mentioned, DNS is defined in RFCs 1034 and 1035. [0052]
  • As noted, the DNS protocol is currently based upon a subset of ASCII, and is thus limited to the Latin alphabet. Numerous other encodings provide digital representations for other character sets of the world. Examples include BIG5 and GB-2312 for Chinese character scripts (traditional and simplified respectively), Shift-JIS and EUC-JP for Japanese character scripts, KSC-5601 for Korean character scripts, and the extended ASCII characters for French and German characters, for instance. [0053]
  • Beyond these language-specific encoding types, there exists the Unicode standard (a “universal linguistic encoding type”) that provides the capacity to encode all the characters used in the written languages of the world. In a preferred embodiment, domain name strings in various different encoding types are all first converted to Unicode and then to ASCII—if necessary. [0054]
  • Unicode uses a 16-bit encoding that provides code points for more than 65,000 characters. Unicode scripts include Latin, Greek, Cyrillic, Armenian, Hebrew, Arabic, Devanagari, Bengali, Gurmukhi, Gujarati, Oriya, Tamil, Telugu, Kannada, Malayalam, Thai, Lao, Georgian, Tibetan, Japanese Kana, the complete set of modern Korean Hangul, and a unified set of Chinese/Japanese/Korean (CJK) ideographs. Many more scripts and characters are to be added shortly, including Ethiopic, Canadian Syllabics, Cherokee, additional rare ideographs, Sinhala, Syriac, Burmese, Khmer, and Braille. [0055]
  • A single 16-bit number is assigned to each code element defined by the Unicode Standard. Each of these 16-bit numbers is called a code value and, when referred to in text, is listed in hexadecimal form following the prefix “U+”. For example, the code value U+0041 is the hexadecimal number 0041 (equal to the decimal number 65). It represents the character “A” in the Unicode Standard. [0056]
  • Each character is also assigned a unique name that specifies it and no other. For example, U+0041 is assigned the character name “LATIN CAPITAL LETTER A.” U+0A1B is assigned the character name “GURMUKHI LETTER CHA.” These Unicode names are identical to the names for the same characters in ISO/IEC 10646. [0057]
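The code values and character names quoted above can be checked with Python's standard `unicodedata` module. The helper name `code_value` is an assumption introduced only for this illustration.

```python
import unicodedata


def code_value(ch: str) -> str:
    """Format a character's code value with the conventional U+ prefix."""
    return f"U+{ord(ch):04X}"


for ch in ("A", "\u0a1b"):
    # Prints the code value, its decimal equivalent, and the character name.
    print(code_value(ch), ord(ch), unicodedata.name(ch))
```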
  • The Unicode Standard groups characters together by scripts in code blocks. A script is any system of related characters. The standard retains the order of characters in a source set where possible. When the characters of a script are traditionally arranged in a certain order—alphabetic order, for example—the Unicode Standard arranges them in its code space using the same order whenever possible. Code blocks vary greatly in size. For example, the Cyrillic code block does not exceed 256 code values, while the CJK code block has a range of thousands of code values. [0058]
  • Code elements are grouped logically throughout the range of code values, called the “codespace.” The coding starts at U+0000 with the standard ASCII characters, and continues with Greek, Cyrillic, Hebrew, Arabic, Indic and other scripts, followed by symbols and punctuation. The code space continues with Hiragana, Katakana, and Bopomofo. The unified Han ideographs are followed by the complete set of modern Hangul. A surrogate range of code values is reserved for future expansion with UTF-16. Towards the end of the codespace is a range of code values reserved for private use, followed by a range of compatibility characters. The compatibility characters are character variants that are encoded only to enable transcoding to earlier standards and old implementations, which made use of them. [0059]
  • Character encoding standards define not only the identity of each character and its numeric value, or code position, but also how this value is represented in bits. The Unicode Standard endorses at least three forms that correspond to ISO/IEC 10646 transformation formats, UTF-7, UTF-8 and UTF-16. [0060]
  • The ISO/IEC 10646 transformation formats UTF-7, UTF-8 and UTF-16 are essentially ways of turning the encoding into the actual bits that are used in implementation. UTF-16 assumes 16-bit characters and allows for a certain range of characters to be used as an extension mechanism in order to access an additional million characters using 16-bit character pairs. The Unicode Standard, Version 2.0, Addison Wesley Longman (1996) (with updates and additions added via “The Unicode Standard, Version 2.1”) has adopted this transformation format as defined in ISO/IEC 10646. This reference is incorporated herein by reference in its entirety and for all purposes. [0061]
  • The second transformation format is known as UTF-8. This is a way of transforming all Unicode characters into a variable length encoding of bytes. It has the advantages that the Unicode characters corresponding to the familiar ASCII set end up having the same byte values as ASCII, and that Unicode characters transformed into UTF-8 can be used with much existing software without extensive software rewrites. The Unicode Consortium also endorses the use of UTF-8 as a way of implementing the Unicode Standard. Any Unicode character expressed in the 16-bit UTF-16 form can be converted to the UTF-8 form and back without loss of information. [0062]
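The properties of UTF-8 and UTF-16 described above can be illustrated with Python's built-in codecs. The sample string is an arbitrary choice made for this sketch.

```python
text = "A\u4e2d"                       # an ASCII letter plus one CJK ideograph

utf16 = text.encode("utf-16-be")       # two 16-bit code values, 4 bytes
utf8 = text.encode("utf-8")            # variable length: 1 byte + 3 bytes

# ASCII characters keep their ASCII byte values in UTF-8.
print(utf8[:1] == "A".encode("ascii"))

# UTF-16 -> UTF-8 -> UTF-16 loses no information.
back = utf16.decode("utf-16-be").encode("utf-8").decode("utf-8").encode("utf-16-be")
print(back == utf16)

# A code point beyond the 16-bit range needs a pair of 16-bit values.
pair = chr(0x10000).encode("utf-16-be")
print(pair.hex())
```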
  • 2. TERMINOLOGY [0063]
  • Some of the terms used herein are not commonly used in the art. Other terms may have multiple meanings in the art. Therefore, the following definitions are provided as an aid to understanding the description that follows. The invention as set forth in the claims should not necessarily be limited to these definitions. [0064]
  • Linguistic encoding type—any character or glyph encoding type (e.g., ASCII or BIG5) now known or used in the future. Each encoding type has its own mapping between linguistic characters (e.g., “a” in the Latin alphabet and an “o” with an umlaut in the German alphabet) and corresponding digital representations (e.g., hexadecimal number 0041 for “A” in ASCII). [0065]
  • Digitally represented—the way characters are presented as a result of encoding (e.g., in a bit stream, a hexadecimal format, etc.) [0066]
  • Digital sequence—a particular sequence of ones and zeros, hexadecimal characters, or other constituents in a digital representation. [0067]
  • Encoded domain name—the digital sequence of domain name characters represented in binary or hexadecimal for example. More specifically, the string of concatenated digital representations for the characters comprising the domain name under consideration is the “encoded domain name.” For example, the ASCII encoded domain name for abc.com is \0x61\0x62\0x63\0x2E\0x63\0x6F\0x6D. As another example, the GB-2312 encoded domain name of “Taiwan.com” is \0xCC\0xA8\0xCD\0xE5\0x2E\0x63\0x6F\0x6D. [0068]
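The two encoded domain names above can be reproduced with Python's standard codecs. The helper `show` is a hypothetical name, and the sketch assumes Python's `gb2312` codec matches the GB-2312 mapping intended in the text.

```python
def show(encoded: bytes) -> str:
    """Render bytes in the \\0xNN notation used in the definition above."""
    return "".join(f"\\0x{b:02X}" for b in encoded)


ascii_name = "abc.com".encode("ascii")
print(show(ascii_name))        # \0x61\0x62\0x63\0x2E\0x63\0x6F\0x6D

# The GB-2312 byte string quoted in the text, decoded back to characters.
gb_name = bytes.fromhex("CCA8CDE52E636F6D")
decoded = gb_name.decode("gb2312")
print(len(decoded), decoded.endswith(".com"))
```

The eight GB-2312 bytes decode to only six characters because each of the two Chinese characters occupies two bytes.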
  • Encoding detection tree—This is a tree or graph structure used to unambiguously determine the encoding types of arbitrary bit strings that are digital sequences of domain names. In typical examples, using information in the digital sequence of the domain name, an encoding detection algorithm can traverse the tree to reach a leaf node (terminal node). There the encoding type can be unambiguously determined. Specific examples of such trees and their use are presented below. [0069]
  • Linguistic digit—this is the digital sequence employed at nodes of an encoding detection tree. In one embodiment, it is 8 bits long. It is typically derived from the digital representation of an encoded linguistic character. For example, one linguistic digit employed in an encoding detection tree may be the value hexadecimal number 0041, which is the ASCII representation of “A.” In another example, a linguistic digit is one byte of a two byte string used to represent a particular Chinese character in GB-2312. [0070]
  • The length of the linguistic digit may be chosen to provide an optimal balance between the size of the encoding detection tree and the speed at which it can be traversed. Smaller linguistic digits (1 bit in the extreme case) require more nodes and hence more storage space. Larger linguistic digits require longer comparison times. [0071]
  • Linguistic equivalent—this refers to two nominally different characters that have very similar linguistic meanings. Examples include uppercase and lowercase characters in the Latin and Greek alphabets. [0072]
  • DNS encoding type—an encoding type supported by the DNS protocol of a network or the Internet; e.g., a limited set of ASCII characters specified in RFC 1035. [0073]
  • Non-DNS encoding type—an encoding type not supported by the DNS protocol under consideration, e.g., BIG5 under the RFC 1035 standard. [0074]
  • Universal linguistic encoding type—any linguistic encoding type, now known or developed in the future, that encompasses more than one character or glyph set within its encoding. Unicode is one example. BIG5, iso-8859-11, and GB-2312 are others. [0075]
  • 3. INTERNATIONAL DOMAIN NAME SYSTEMS [0076]
  • Turning now to FIG. 1, some important components of a network 10 used in a first embodiment of this invention include a client 12, a corresponding node 14 with whom client 12 wishes to communicate, a multilingual DNS (“m1DNS”) server 16 and a conventional DNS server 18. The m1DNS server 16 may listen on a DNS port (currently addressed to the domain name port 53) for multilingual domain name queries in place of a normal DNS server. Server 16 may include the Berkeley Internet Name Domain (‘BIND’ and its executable version ‘named’) which is a widely used DNS server written by Paul Vixie (http://www.isc.org/). [0077]
  • To understand the role of these components, assume that client 12 is used by a Chinese student who wishes to inquire about employment in a Hong Kong business that operates corresponding node 14. The student has previously communicated with the business and has obtained the domain name of that business. The domain name is provided in native Chinese characters. Client 12 is outfitted with a keyboard that can type Chinese language characters and is configured with software that can recognize encoded Chinese characters and accurately display them on a computer screen. [0078]
  • Now, the student prepares a message to the Hong Kong business, encloses her resume, and types in the Chinese domain name as the destination. When she instructs client 12 to send the message to corresponding node 14, the system shown in FIG. 1 takes the following actions. First, the corresponding node domain name is submitted, in the native language, to m1DNS server 16 via a DNS request. The m1DNS server 16 recognizes that the domain name is not in a format that can be handled by a conventional DNS server. Therefore it translates the Chinese domain name to a format that can be used with a conventional DNS server (normally a limited set of the ASCII characters). The m1DNS server 16 then repackages the DNS request, with the translated corresponding node domain name, and transmits that request to conventional DNS server 18. DNS server 18 then uses the normal DNS protocol to obtain a network address for the domain name it received in the DNS request. The resulting network address is the network address of corresponding node 14. DNS server 18 packages that network address according to conventional DNS protocol and forwards the address back to m1DNS server 16. The m1DNS server 16, in turn, transmits the needed network address back to client 12, where it is placed in the student's message. The message is packetized, with each packet having a destination network address corresponding to node 14. Client 12 then sends the message packets over the Internet to node 14. [0079]
  • While FIG. 1 shows the m1DNS server 16 and conventional DNS server 18 as separate blocks, often the two entities can be represented as a single logical block. Often both entities will reside on a single hardware device, such as a network workstation. Further, the functions of the two entities can be executed using a single block of program code or tightly coupled blocks of program code. FIG. 2 shows a network having multiple m1DNS servers, each of which performs the logical operations of m1DNS server 16 and conventional DNS server 18. [0080]
  • As shown in FIG. 2, client 12 is depicted by a vertical line on the left-hand side of the figure, a default m1DNS server 17 is depicted by a vertical line in the center of the figure, and a second m1DNS server 19 is depicted by a vertical line on the right-hand side of the figure. [0081]
  • Initially, at 203, an application running on client 12 generates a message intended for a network destination. The domain name for that destination is input in non-DNS compatible text encoding format. Thus, the text is encoded in a linguistic encoding type that digitally represents the characters of the text. As mentioned, ASCII is but one linguistic encoding type. In preferred embodiments, the invention handles a wide range of encoding types. Examples of some in wide use include GB2312, BIG5, Shift-JIS, EUC-JP, KSC5601, extended ASCII, and others. [0082]
  • After the client application creates the message at 203, the client operating system creates a DNS request to resolve the domain name at 205. The DNS request may resemble a conventional DNS request in most regards. However, the domain name provided in the request will be provided in a non-DNS encoding format. The client operating system transmits its DNS request to default m1DNS server 17 at 207. Note that the client operating system may be configured to send DNS requests to m1DNS server 17. In other words, the default DNS server of client 12 is m1DNS server 17. [0083]
  • Default m1DNS server 17 extracts the encoded domain name from the DNS request and generates a transformed DNS request presenting the domain name in a DNS compatible encoding format (presently the reduced set ASCII specified in RFC 1035). See 209. Server 17 then attempts to resolve the DNS compatible domain name. It may use the conventional DNS protocol for this purpose; i.e., to obtain the IP address of the domain name used in the client's communication. If server 17 cannot itself resolve the domain name presented to it, it will attempt to identify another m1DNS server that is authoritative for the domain name under consideration. Regardless of the outcome of operation 209, default m1DNS server 17 then transmits a message back to client 12. See 217. This message may include the IP address of the domain name under consideration or it may include a reference to another m1DNS server. [0084]
  • If it does not include the IP address, client 12 then sends its DNS request (with the multilingual domain name) to second m1DNS server 19. See 219. Server 19 then attempts to resolve the request locally. See 221. Regardless of its success, it sends a reply to client 12. See 223. That reply will include either the IP address of the multilingual domain name, the name of a referred server, or a failure message. Upon receipt of the reply, client 12 either sends a communication using the IP address of the resolved multilingual domain name or reports a failure to establish a connection (because servers 17 and 19 each failed to resolve the domain name). See 225. It is of course possible that server 19 sent a referral for yet another multilingual domain name server. If that is the case, then client 12 may try to send its multilingual domain name request to the newly referred server. [0085]
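The client-side referral handling of FIG. 2 can be sketched as a small loop. Everything here is a simulation under stated assumptions: servers are modeled as in-memory callables rather than network peers, and the names `make_server`, `resolve`, `server17`, `server19`, the reply tuples, and the address 192.0.2.10 (a documentation address) are all hypothetical.

```python
from typing import Callable, Dict, Optional, Tuple

Reply = Tuple[str, Optional[str]]      # ("address", ip) | ("referral", server) | ("failure", None)
Server = Callable[[bytes], Reply]


def make_server(records: Dict[bytes, str], referral: Optional[str] = None) -> Server:
    """Build a toy server that answers from `records` or refers elsewhere."""
    def answer(name: bytes) -> Reply:
        if name in records:
            return ("address", records[name])
        if referral is not None:
            return ("referral", referral)
        return ("failure", None)
    return answer


def resolve(name: bytes, default: str, servers: Dict[str, Server],
            max_hops: int = 4) -> Optional[str]:
    """Query the default server, then follow referrals until an answer or failure."""
    server = default
    for _ in range(max_hops):
        kind, value = servers[server](name)
        if kind == "address":
            return value               # client can now send its message
        if kind != "referral" or value not in servers:
            return None                # report failure to connect
        server = value                 # retry with the referred server
    return None


name = b"\xcc\xa8\xcd\xe5.com"
servers = {
    "server17": make_server({}, referral="server19"),
    "server19": make_server({name: "192.0.2.10"}),
}
print(resolve(name, "server17", servers))
print(resolve(b"unknown.com", "server17", servers))
```

The `max_hops` bound is a design choice of this sketch to stop referral loops; the source text does not specify one.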
  • As indicated above, the domain name must, at some point, be converted from a non-DNS encoding type to a DNS compatible encoding type. In the above examples, this is accomplished with a m1DNS server (or a proxy m1DNS server). This need not be the case, however, as the functionality necessary for conversion may be embodied elsewhere. In alternative embodiments, the functions performed by the m1DNS server are implemented in whole (or in part) on the client and/or on the DNS name server. [0086]
  • In a specific embodiment, operations including detecting an encoding type, translating a non-DNS encoded domain name to a DNS encoded domain name and identifying a default name server (operations 305-311 of the FIG. 3A flow chart discussed below) are implemented on an Internet application (e.g., a multilingual-enabled Web browser). In this embodiment, code detection and code conversion are automatically done prior to dispatching a DNS resolution request to a DNS name server. [0087]
  • In another specific embodiment, operations 305-311 can be implemented entirely on a proxy m1DNS server. Other embodiments include collapsing all or some fraction of these operations into a conventional DNS name server. For example, code for some m1DNS functions can be collapsed into BIND code as a compilable module. [0088]
  • In FIG. 2, the conversion of the domain name from one linguistic encoding type to a second linguistic encoding type (compatible with DNS) is performed at 209 or 221 (depending upon which of servers 17 and 19 is the authoritative server). As shown in FIG. 3A, in accordance with a preferred embodiment of this invention, this conversion may take place via a process 301. The process begins at 303 with the system identifying the encoding type of the domain name in the DNS request. This is necessary when the system may be confronted with multiple different encoding types. Typically, the detection will involve analyzing a bit string making up the domain name under consideration. A preferred approach to this process is described below in detail. However, in alternative embodiments, an application can present explicitly defined linguistic encoding which obviates the need for encoding type detection. [0089]
  • After the encoding type has been identified, the system next determines whether the domain name was encoded in a DNS compatible encoding type at 305. Currently, that requires determining whether the domain name is encoded in the reduced set ASCII encoding type. If so, further conversion is unnecessary and process control is directed to 311, which will be described below. [0090]
  • In the interesting case, the domain name is encoded in a non-DNS format. When this occurs, process control is directed to 307 where the system translates the domain name to a universal encoding type. In a preferred embodiment, this universal encoding type is Unicode. In this case, the characters identified in the native encoding type are then identified in the Unicode standard and converted to the Unicode digital sequences for those characters. [0091]
  • The newly translated domain name is then further transformed from the universal encoding type to a DNS compatible encoding type. See 309. Thus, this final encoding type may be the reduced set of ASCII specified in RFC 1035. Note that the translation from the DNS incompatible format to the DNS compatible format takes place in two operations through an intermediate universal encoding type. This two operation procedure will be detailed below. It should be understood, however, that it may be possible to directly convert, in one operation, the DNS incompatible domain name to the DNS compatible domain name. This may be accomplished in a system having multiple conversion algorithms, each designed to convert a specific encoding type to ASCII (or some other future DNS-compatible encoding type). In one example, these algorithms may be modeled after the “Dürst algorithm” identified above. Many other suitable algorithms are known or can be developed with routine effort. [0092]
  • With a DNS compatible domain name now in hand, the system need only determine which conventional DNS name server it should forward the domain name to. According to normal DNS protocol, the DNS request might be forwarded to a top-level name server. As will be described in more detail below, it may be convenient to have different root name servers handle different linguistic domains. For example, the Chinese government may maintain a root name server for Chinese language domain names, the Japanese government or a Japanese corporation may maintain a root name server for Japanese language domain names, the Indian government may maintain a root name server for Hindi language domain names, etc. In any event, the system must identify the appropriate name server at 311 as indicated in FIG. 3A. After this has been accomplished, the conversion process is complete and the DNS request can be transmitted to the DNS system for handling according to convention. [0093]
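Operations 305 through 309 of process 301 can be sketched as a short pipeline. This sketch makes two substitutions that are assumptions, not the method of this document: the encoding type is supplied explicitly by the caller instead of being detected, and the final step uses Python's built-in `idna` codec (Punycode-based) as a stand-in for the DNS-compatible transformation described here. The names `DNS_COMPATIBLE` and `to_dns_compatible` are hypothetical.

```python
import re

# Reduced ASCII set accepted by conventional DNS: letters, digits, hyphen, dot.
DNS_COMPATIBLE = re.compile(rb"[A-Za-z0-9.-]+")


def to_dns_compatible(encoded: bytes, encoding: str) -> bytes:
    # 305: already DNS compatible? Then no conversion is needed.
    if DNS_COMPATIBLE.fullmatch(encoded):
        return encoded
    # 307: translate from the native encoding type to Unicode.
    unicode_name = encoded.decode(encoding)
    # 309: transform Unicode to an ASCII-only form (stand-in codec).
    return unicode_name.encode("idna")


print(to_dns_compatible(b"abc.com", "ascii"))
converted = to_dns_compatible(b"\xcc\xa8\xcd\xe5.com", "gb2312")
print(converted)
```

Routing through Unicode means one decoder per native encoding type plus a single Unicode-to-ASCII step, rather than a separate direct converter for every encoding type.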
  • Preferably, the process depicted in FIG. 3A is performed solely on a m1DNS server. However, some of the process may be performed on a client or a conventional DNS server. For example, 303 and 305 could be performed on a client and 309 could be performed on a conventional DNS server. [0094]
  • In some preferred embodiments, such as those depicted in FIG. 2, the name server is co-located with the m1DNS server. So operation 311 would involve nothing more than determining that the server performing the encoding type detection and conversion can also resolve a DNS request for the domain name in question. [0095]
  • A preferred division of labor for the m1DNS function is depicted in FIG. 3B. As shown there, a m1DNS server 327 performs the necessary detection of encoding type and conversion to a DNS compatible format. Server 327 also performs normal DNS resolution. An encoding detection tree (EDT) 321 and associated logic performs the operations of FIG. 3A. In addition, a normal DNS resolution subsystem 323 performs the standard DNS resolving protocol. [0096]
  • The EDT and associated logic detects all necessary linguistic encoding types and can convert all encoding types to Unicode (or other suitable universal encoding type). In the depicted embodiment, a client 325 submits a domain name for a corresponding node 331 in its native language. The m1DNS server 327 receives the domain name and a conventional DNS resolution sub-system 323 performs the standard DNS resolving protocol. It returns the IP address for corresponding node 331, allowing client 325 to communicate directly with node 331. [0097]
  • In one implementation, EDT 321 and associated logic runs on a machine (identified by i2.i-dns.com for example) on a designated port (e.g., a port number 2000). It accepts a whole portion of a digitally represented domain name in any linguistic encoding type and returns a whole portion of a digitally represented domain name in Unicode transformed to a DNS encoding type (UTF-5). Normal DNS subsystem 323 returns an IP address for the domain name under consideration. [0098]
  • As indicated in the discussion of FIG. 3A and elsewhere, when the system handles multiple encoding types, it must be capable of distinguishing one encoding type from the next. See block 303. An example of this process employing an encoding detection tree is detailed in FIGS. 4A-4C. [0099]
  • 4. MATCHING ALGORITHM [0100]
  • FIGS. 4A-4C depict an embodiment of this invention for identifying an encoding type of a domain name string using an encoding detection tree of this invention. As shown in FIG. 4A, an encoding detection tree 401 (which may also be viewed technically as a “graph”) includes various nodes such as node 403 and connections between those nodes such as “eq” connection 405. FIGS. 4B and 4C present a process flow for using a tree structure such as tree 401 to unambiguously detect an encoding type. [0101]
  • To understand how the matching algorithm works, consider a very simple registry having only 3 registered domain names: A1A2B1B2.com, C1C2D1D2.com and E1E2F1F2.com. As described in more detail below, a registry with these domain names will be used to generate the tree structure 401 depicted in FIG. 4A. When an international domain name system is presented with a domain name, it detects the encoding type of that domain name using the tree 401. Imagine, for the moment, that the international domain name system receives the domain name A1A2B1B2.com. [0102]
  • Consider now the process flow 450 depicted in FIG. 4B. At the beginning of the process, the linguistic encoding type, T, is unknown. At 452, the international domain name system (e.g., an m1DNS server as described above) receives the linguistic digit sequence, S, for the domain name of interest and reverses that sequence to produce a reversed digit sequence S′. In FIG. 4A, the reversed linguistic digit sequence S′ is depicted by sequence 407 (for the domain name A1A2B1B2.com). Assume for example that “.com” is represented in ASCII, that A1A2 represents one 16-bit Chinese character encoded in BIG5, and that B1B2 represents a second Chinese character also encoded in BIG5. [0103]
  • Returning to process 450 in FIG. 4B, the international domain name system next sets a pointer P1 to a first digit of the reversed sequence S′ and sets a pointer P2 to the root of the encoding detection tree 401. See operation 454. Pointers P1 and P2 are depicted in FIG. 4A. During the course of the encoding detection process, these pointers move from digit to digit (in the case of P1) and from node to node (in the case of P2). [0104]
  • Next, the international domain name system compares the value at pointer P1 against the value at pointer P2. See 456. This comparison involves the digital values of the character (or portion of a character) from the domain name at the current location of pointer P1 and the linguistic digit represented in the node currently at pointer P2. In the example of FIG. 4A, pointer P2 is initially at node 403, which corresponds to the digital value of m. In the case of a multilingual “.com” system, the linguistic digit at node 403 will be the ASCII value of the letter “m.” The value at pointer P1 is also the digital value for m. Therefore, the comparison presented at 456 indicates that the values at pointers P1 and P2 are equal. With this outcome, process flow 450 (FIG. 4B) proceeds to 458 where the position of pointer P2 is moved to the equal child node of parent node 403. In this case, the equal child node of parent node 403 is the node 409. This node contains the linguistic digit of the letter “o” in ASCII. [0105]
  • Next, at 460, the international domain name system determines whether the pointer P2 is currently pointing to a terminal node of the tree structure. If so, it will determine the encoding type from that node. In the current example, however, there are many additional nodes between the pointer P2 and the terminal node. (Examples of terminal nodes are indicated by nodes 421 and 429 in FIG. 4A.) Therefore, decision 460 is answered in the negative. Then process control is directed to a decision 462, which determines whether the next digit in the domain name string represents the end of that string. In the example at hand, the pointer P1 currently points to the m character. Therefore, several more digits exist between the end of string 407 and the current digit. Hence, decision 462 is answered in the negative. [0106]
  • Process control is next directed to block 464 where the international domain name system moves pointer P1 to the next digit of S′, the reversed character sequence 407. See FIG. 4A. In the example at hand, this results in P1 moving from the letter m to the letter o in sequence 407. Process control is then directed back to decision operation 456. [0107]
  • This time through the process, the locations of pointers P1 and P2 have changed. P1 points to the letter o and P2 points to the node 409 containing the linguistic digit for the letter o. At 456, the system determines that the digital representations at pointers P1 and P2 are equal. Therefore, as before, the process flow is directed to block 458, where pointer P2 moves down the equal child path to a node 411. [0108]
  • Next, decision block 460 determines whether the new location of P2 is a terminal node. As it is not in this case, the process moves to decision block 462 where it determines that the next digit in the S′ character string is not the end of that string. Then, process block 464 moves P1 to the next digit in S′, the letter “c.” [0109]
  • During the next two passes through the process flow 450, the pointer P1 is located at the c and the “.” respectively. At the end of these cycles, the pointer P2 points to the node 413 containing the linguistic digit that is the digital representation of character B2. Also, at this time, the pointer P1 points to the character B2 in reverse sequence 407. Now, the international domain name system compares the values located at pointers P1 and P2. See 456. As before, these values are equal. Therefore, the location of pointer P2 moves down the “eq” path to node 415, which harbors the linguistic digit for B1. Because this is not a terminal node and because the current location of pointer P1 is upstream from the end of reversed sequence 407, the process loops back to decision block 456. It proceeds in this manner through nodes 417 and 419, corresponding to linguistic digits A2 and A1. When proceeding through the loop associated with linguistic digit A1, the pointer P2 moves, at 458, to a terminal node 421. At this point, decision block 460 is answered in the affirmative. As a consequence, process control is directed to a new decision block, decision 466, where it is determined whether there is only one terminal node, in other words, whether only one encoding type is associated with the terminal node that has been reached. [0110]
  • In the situation at hand, there is only one encoding type associated with the terminal node 421. Therefore, decision 466 is answered in the affirmative. At 468, the international domain name system identifies the linguistic encoding type, T, associated with the terminal node 421. [0111]
  • It is possible that, coincidentally, two domain names having different encoding types have the same digital sequence. When this occurs, the terminal node will include two separate encoding types. In this situation, decision block 466 is answered in the negative. To account for this situation, each encoding type represented at the terminal node is also associated with its unique character string, S. The international domain name system can then search through a list of character strings for the exact match of the sequence S. See 470. When the exact match of the digital sequence S is found, the corresponding encoding type is selected. [0112]
  • FIG. 4C depicts one example of a process flow that may be employed to search through the list of terminal nodes as depicted at process block 470. In FIG. 4C, a process 480 begins at 482 with selection of the first terminal node in the list of terminal nodes. Next, the process normalizes the sequence S associated with terminal node L based on the linguistic encoding type of that sequence. See 484. Next, the process determines whether the linguistic digital sequence S of the domain name under consideration matches the linguistic digital sequence associated with the terminal node under consideration. See 486. Assuming that this is the case, process control is directed to block 488 where the system returns the terminal node currently visited in the list. This terminal node has an associated encoding type, which is the encoding type of the domain name under consideration. [0113]
  • Assume for the moment that the comparison rendered at 486 indicates that the sequences are not identical. As a result, process control is directed to decision block 490, which checks to determine whether the end of the list of terminal nodes has been reached. If not, the next terminal node, T, in the list is visited at 492. From there, process control is directed back to block 484 and the process continues as described above. Now, in the case where the end of the list of terminal nodes has been reached but no matching strings have been found, decision block 490 will be answered in the affirmative. As a result, the system sets a pointer to the list of terminal nodes to point to nothing. See 494. From there, process control is directed to 488, which returns no match in this case. [0114]
  • The need for the process of FIG. 4C can be understood as follows. The traversing path may lead to a list of terminal nodes rather than a single match. Therefore, some mechanism to determine which terminal node is correct is required. For example “??.com” is a valid GB encoded binary sequence and also a valid iso8859-1 encoded binary sequence. Both of them have the same traversing path in the encoding detection tree and will be chained up if both were previously inserted into the encoding detection tree. For iso8859-1 encoded characters, upper case and lower case are considered linguistically equivalent, while for GB encoded characters, the case of the character is sometimes significant. Chinese characters in GB will be double bytes. Therefore, these two bytes are case significant. So, for iso8859-1 encoding type, the detection tree could have a terminal node with detectable string as “\0xEC\0xA8\0xED\0xE5\0x2E\0x63\0x6F\0x6D” (all valid iso8859-1 characters are lower-case) and an encoding type of “iso8859-1”. And, for the GB encoding type, the encoding tree could also have another terminal node with detectable string as “\0xCC\0xA8\0xCD\0xE5\0x2E\0x63\0x6F\0x6D” (valid GB characters will be preserved, while ASCII characters will be lower cased since they are not case significant) and an encoding type of “GB”. Both of them are chained up under the same traversing path. [0115]
  • If a match request comes in for “\0xCC\0xA8\0xCD\0xE5\0x2E\0x63\0x6F\0x6D”, after lower casing all the ASCII characters, it will match with the GB encoded string exactly and will take precedence over the iso8859-1 encoded string. And if a match request comes in as “\0xCC\0xA8\0xED\0xE5\0x2E\0x63\0x6F\0x6D”, it will not match with the GB encoded string after lower casing all the ASCII characters but will match with the iso8859-1 encoded string exactly after lower casing the iso8859-1 characters and ASCII characters. [0116]
  • The normalization process (see 484) uses the encoding information contained in the terminal node to lower-case those characters in the query string that are not case significant, and then performs an exact match of the normalized query string against the detectable string stored in the terminal node. [0117]
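  • The normalization and exact-match step described above can be illustrated with a minimal Python sketch. The function names `normalize` and `match_chain`, the list layout of the chain, and the choice to list the GB entry first (so that an exact GB match takes precedence) are assumptions made for illustration; the byte strings are the two detectable strings from the example.

```python
from typing import List, Optional, Tuple

def normalize(query: bytes, encoding: str) -> bytes:
    """Lower-case only the digits that are not case significant for `encoding`."""
    if encoding == "iso8859-1":
        # Assumption: every iso8859-1 letter is treated as case insensitive.
        return query.decode("latin-1").lower().encode("latin-1")
    if encoding == "GB":
        # Assumption: double-byte GB characters use bytes >= 0x80 and are kept
        # as-is; only ASCII upper-case letters are lower-cased.
        return bytes(b + 0x20 if 0x41 <= b <= 0x5A else b for b in query)
    raise ValueError(f"no normalization rule for {encoding!r}")

# Hypothetical chain of terminal entries sharing one traversing path:
# (encoding type, detectable string). GB is listed first so it wins ties.
CHAIN: List[Tuple[str, bytes]] = [
    ("GB", b"\xcc\xa8\xcd\xe5.com"),
    ("iso8859-1", b"\xec\xa8\xed\xe5.com"),
]

def match_chain(query: bytes, chain: List[Tuple[str, bytes]]) -> Optional[str]:
    """Visit each terminal entry in turn (482/492), normalize the query with
    that entry's encoding (484), and compare for an exact match (486)."""
    for encoding, detectable in chain:
        if normalize(query, encoding) == detectable:
            return encoding          # 488: match found
    return None                      # 494/488: end of list, no match
```

  • With this chain, `match_chain(b"\xcc\xa8\xcd\xe5.com", CHAIN)` selects the GB entry, while `match_chain(b"\xcc\xa8\xed\xe5.com", CHAIN)` fails the GB comparison and falls through to the iso8859-1 entry, mirroring the two requests discussed above.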
  • Returning now to FIGS. 4A and 4B, consider the possibility where the domain name under question is A1A2B1B2.coM. In this case, the domain name is linguistically equivalent to the previous domain name under consideration. It so happens that it is presented with an upper case letter “M,” rather than the lower case letter “m.” On the first pass through process 450, the pointer P1 points to the M in the sequence, S′, and the pointer P2 points to the root node 403 of tree 401. At 456, where the values at P1 and P2 are compared, the system will discover that the value at P1 is less than the value at P2. This is because the digital sequence representing M has a lower value than the digital sequence representing m. As a result, process control proceeds to a process block 472, which moves the pointer P2 to the low child node branching from root node 403. In this case, that is node 423, populated with the digital sequence associated with the M. [0118]
  • From block 472, the system next determines whether pointer P2 is pointing to nothing. See decision block 473. In this case, that is not true, so process control is directed back to decision block 456, where the value associated with pointer P1 and the value at the new position of pointer P2 are compared. This time, the values will match, so process control is directed to 458. There, the pointer P2 is moved down the “eq” branch to node 409, where the linguistic digit for the letter “o” resides. The process then proceeds down the tree, loop by loop, until reaching terminal node 421 as described in the previous example. [0119]
  • Considering the domain name D1E2F1F2.com, the procedure will traverse tree 401 as described above until pointer P1 reaches linguistic digit F2 and pointer P2 reaches node 413, containing linguistic digit B2. At that point, the comparison of the values at P1 and P2 (decision block 456) indicates that the digital value of linguistic digit F2 is greater than the digital value of linguistic digit B2. At this point, process control is directed to a block 474. There, the pointer P2 is moved down the “hi” path to a node 425. The system checks whether P2 points to nothing (473). As that is not the case, process control loops back to 456 where the value of P1 (F2) is compared with the value at the new location of pointer P2 (node 425). This time, the comparison will indicate a match. Then, process block 458 moves pointer P2 down the eq path on tree 401 to a node 427. The process will continue in this manner until reaching a terminal node 429. There, the encoding type of domain name D1E2F1F2.com will be identified. [0120]
  • One other case of interest should be discussed in the context of matching linguistic digit sequences. Specifically, if one ever encounters a situation where the current location of P2 is not a terminal node but the next digit of the reversed sequence S′ is the end of that sequence, then the encoding type cannot be determined unambiguously. This situation is captured in process 450 when decision block 462 is answered in the affirmative. At that point, a process block 476 returns a “not found” message. The same result occurs when decision block 473 is answered in the affirmative (i.e., P2 points to nothing). [0121]
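  • The traversal of process 450 can be sketched in Python as follows. This is a simplified illustration, not the disclosed implementation: the `Node` class, the `insert` helper (a plain ternary insert that adds no linguistic equivalents), and the placeholder byte sequences are assumptions. The sketch also assumes that no registered name is a suffix of another, and it compares raw sequences at 470 without the normalization of FIG. 4C.

```python
from typing import List, Optional, Tuple

class Node:
    def __init__(self, digit: Optional[int] = None) -> None:
        self.digit = digit                    # one 8-bit linguistic digit
        self.lo: Optional["Node"] = None
        self.eq: Optional["Node"] = None
        self.hi: Optional["Node"] = None
        # Non-empty only on terminal nodes: (encoding type, sequence S).
        self.terminals: List[Tuple[str, bytes]] = []

def insert(root: Optional[Node], name: bytes, encoding: str) -> Node:
    """Hypothetical helper: plain ternary insert of the reversed name."""
    digits = name[::-1]
    if root is None:
        root = Node(digits[0])
    node = root
    for i, d in enumerate(digits):
        while node.digit != d:                # walk lo/hi siblings
            slot = "lo" if d < node.digit else "hi"
            if getattr(node, slot) is None:
                setattr(node, slot, Node(d))
            node = getattr(node, slot)
        if node.eq is None:                   # terminal node after last digit
            node.eq = Node() if i == len(digits) - 1 else Node(digits[i + 1])
        node = node.eq
    node.terminals.append((encoding, name))
    return root

def detect(root: Optional[Node], name: bytes) -> Optional[str]:
    """Return the encoding type T, or None for "not found" (block 476)."""
    if not name:
        return None
    s_rev = name[::-1]                        # 452: reverse S into S'
    p1, p2 = 0, root                          # 454: P1 at first digit, P2 at root
    while p2 is not None:                     # 473: P2 points to nothing?
        d = s_rev[p1]
        if d < p2.digit:                      # 456 then 472: low child
            p2 = p2.lo
        elif d > p2.digit:                    # 456 then 474: high child
            p2 = p2.hi
        else:
            p2 = p2.eq                        # 458: equal child
            if p2 is not None and p2.terminals:        # 460: terminal node
                if len(p2.terminals) == 1:             # 466
                    return p2.terminals[0][0]          # 468
                for enc, stored in p2.terminals:       # 470: exact match
                    if stored == name:
                        return enc
                return None
            if p1 + 1 == len(s_rev):          # 462: end of S' reached
                return None
            p1 += 1                           # 464: advance P1
    return None

# Placeholder registry of three names (byte values chosen for illustration).
root = None
for seq, enc in [(b"\xa4\xa4\xa4\xe5.com", "BIG5"),
                 (b"\xcc\xa8\xcd\xe5.com", "GB2312"),
                 (b"\xc5\xe4\xf6.com", "ISO-8859-1")]:
    root = insert(root, seq, enc)
# Hand-link an "M" low child that shares the equal child of "m" (cf. node 423).
root.lo = Node(ord("M"))
root.lo.eq = root.eq
```

  • In this sketch, a query ending in “.coM” first takes the low branch from the root, then rejoins the shared “o” node, as in the second example above.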
  • 5. TREE STRUCTURE [0122]
  • To completely ensure that every encoding type can be detected, the tree should embody representations of all domain names that are registered with a particular host system—e.g., a particular Internet Service Provider. Thus, in one embodiment, the encoding detection tree is periodically rebuilt when new multilingual domain names are registered. In an extreme example, the tree is recomputed every time a new domain name is registered. More typically, the tree is computed only after a defined number of new domains have been registered since the tree was last computed or a set length of time has expired (e.g., 12 hours) since the tree was last computed. [0123]
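  • A rebuild trigger of this kind might be sketched as below. The class name `RebuildPolicy`, its method names, and the default threshold of 100 new names are hypothetical; only the 12-hour interval comes from the example in the text.

```python
import time
from typing import Callable

class RebuildPolicy:
    """Decides when the encoding detection tree should be recomputed."""

    def __init__(self, max_new_names: int = 100,
                 max_age_seconds: float = 12 * 3600,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_new_names = max_new_names      # assumed threshold
        self.max_age_seconds = max_age_seconds  # e.g., 12 hours
        self.clock = clock
        self.new_names = 0
        self.last_build = clock()

    def note_registration(self) -> None:
        """Record that one new multilingual domain name was registered."""
        self.new_names += 1

    def should_rebuild(self) -> bool:
        too_many = self.new_names >= self.max_new_names
        too_old = self.clock() - self.last_build >= self.max_age_seconds
        return too_many or too_old

    def mark_rebuilt(self) -> None:
        """Reset the counters after the tree has been recomputed."""
        self.new_names = 0
        self.last_build = self.clock()
```

  • Setting `max_new_names=1` corresponds to the extreme case in which the tree is recomputed on every registration.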
  • To ensure that the tree can unambiguously distinguish encoding types for every registered domain name, the registrar may enforce certain restrictions on registration. In a preferred embodiment, two restrictions are imposed. First, the registrar should not register two domain names having the exact same digital sequence. Second, it should not register two domain names that are linguistically equivalent. [0124]
  • Considering the first restriction, unrelated domain names in different encoding types might coincidentally have the same digital representation. If this situation were allowed to occur, then the encoding detection system would be unable to unambiguously determine the encoding type when presented with one of these domain names. Hence the system could not guarantee that it would return the proper IP address. [0125]
  • Considering the second restriction, the domain names grasshopper.com and GrassHopper.COM are linguistically equivalent. In the traditional Latin alphabet based domain name system, domain names are case insensitive. Entities and individuals obtaining domain names expect to own rights to all linguistic equivalents of a given name. Hence, the registrar should prevent registration of two linguistically equivalent domain names. To allow this, the encoding detection tree preferably contains paths for multiple linguistic equivalents of a single domain name. [0126]
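  • The two registration restrictions can be illustrated with a small Python sketch. The `Registrar` class and its return strings are hypothetical, and linguistic equivalence is reduced here to ASCII case folding, which is an assumption; a real registrar would apply the equivalence rules of each encoding type.

```python
from typing import Dict, Set

def ascii_lower(name: bytes) -> bytes:
    """Fold ASCII upper-case letters; other bytes are left untouched."""
    return bytes(b + 0x20 if 0x41 <= b <= 0x5A else b for b in name)

class Registrar:
    def __init__(self) -> None:
        self._sequences: Dict[bytes, str] = {}   # digital sequence -> encoding
        self._equivalents: Set[bytes] = set()    # case-folded sequences

    def register(self, name: bytes, encoding: str) -> str:
        # Restriction 1: no two names with the exact same digital sequence,
        # even if they were submitted under different encoding types.
        if name in self._sequences:
            return "rejected: identical digital sequence"
        # Restriction 2: no two linguistically equivalent names.
        if ascii_lower(name) in self._equivalents:
            return "rejected: linguistically equivalent"
        self._sequences[name] = encoding
        self._equivalents.add(ascii_lower(name))
        return "registered"
```

  • For example, after `grasshopper.com` has been registered, a request for `GrassHopper.COM` is refused under the second restriction.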
  • Preferably, the tree is designed considering one or more of four objectives: [0127]
  • 1. Enable the Domain Name System with multilingual capability; [0128]
  • 2. Require a reasonably short period of time to build a data structure for a Domain Name System hosting a large number of domain names; [0129]
  • 3. Require relatively little memory consumption for a Domain Name System hosting a large number of domain names; and [0130]
  • 4. Enable efficient detection of the linguistic encoding type of a digital sequence. [0131]
  • In a preferred embodiment, the encoding detection tree is extended from a data structure called a “ternary search tree.” Each node of the tree is associated with a record holding a single linguistic digit and pointers to its children nodes. The linguistic digit represents the digital encoding of a particular character (or a portion of that character) in a particular encoding type. The size of the linguistic digit used in the nodes is chosen to balance rapid searching against low memory usage. At one extreme, each node contains a one bit linguistic digit. This structure could be searched very fast, but would occupy too much memory. Trees having 16 bit linguistic digits at each node would occupy less memory, but would be searched more slowly. In a specific embodiment, each node of the tree includes 8 bits. [0132]
  • In the example shown in FIG. 4A, each node can have at most three children nodes, which are named “lokid”, “eqkid” and “hikid”. During encoding type detection, as explained, the linguistic digit stored with a node is compared against the digital sequences of characters in the domain name under analysis. “Lokid” will be visited if the digital value of the linguistic digit from the appropriate position of the incoming digital sequence is less than the value of the linguistic digit held by the current node of the tree. If both of the digits have the same value, “eqkid” is visited. When the digital value of the linguistic digit from the appropriate position of the incoming digital sequence is greater than the value of the linguistic digit held by the current node of the tree, “hikid” will be visited. Non-ternary search trees may be employed in alternative embodiments. [0133]
  • FIG. 4D depicts a slightly more complex version of a ternary encoding detection tree. This tree would be suitable for a multilingual domain name system that registers “.com” and “.tm” top level domain names. The tree itself would be traversed in the manner described above for tree 401 of FIG. 4A. [0134]
  • 6. BUILDING A TREE STRUCTURE [0135]
  • Referring to FIG. 5A, an algorithm 500 for building the encoding detection tree (EDT) will be described. The algorithm 500 is executed by a computer system (e.g., system 700, which will be described later referring to FIG. 7) periodically, for example, when the system adds a new linguistic digit sequence of a domain name to its encoding detection tree. An exemplary process in which the system ultimately creates the EDT shown in FIG. 4A will be described in detail below. Here, it is supposed that the system receives three linguistic digit sequences, namely, “A1A2B1B2.com,” “C1C2D1D2.com,” and “D1E2F1F2.com” in this order. The EDT may be stored in various types of memory devices in the system as long as its topological structure is kept precisely. [0136]
  • ADDING “A1A2B1B2.com” TO EDT [0137]
  • First, the process for adding the linguistic digit sequence, “A1A2B1B2.com” to the EDT will be described below referring to FIGS. 4A, and 5A-5V. [0138]
  • In block 501, the routine 500 receives a linguistic digit sequence S (i.e., A1A2B1B2.com), and a linguistic encoding type of the linguistic digit sequence S (e.g., GB2312). The system then reverses the order of the linguistic digit sequence S, and substitutes the reversed sequence of S for the reversed linguistic digit sequence S′. Also in block 501, the system substitutes a linguistic encoding type of the linguistic digit sequence S for a linguistic encoding type T. Here, the reversed linguistic sequence S′, and the linguistic encoding type T are “moc.B2B1A2A1,” and “GB2312,” respectively. FIG. 5B illustrates the reversed linguistic sequence S′, and the linguistic encoding type T. [0139]
  • The system may store the linguistic encoding type T, which is represented in ASCII characters. Alternatively, the system may store a code corresponding to the linguistic encoding type T in order to reduce the memory space needed to store the variable T. [0140]
  • In block 503, the routine 500 initializes pointers P1, P1′, P2, and P2′ as indicated in FIG. 5B. The pointer P1, which points to a current position in the linguistic digit sequence, is set to the first digit of the sequence S′, namely, “m.” The pointer P1′ points to a linguistic equivalent of a linguistic digit pointed by the pointer P1, which is “M.” As mentioned, a linguistic equivalent of a linguistic digit in ASCII code is an upper case letter of the linguistic digit. The pointer P2, which points to a currently-visited node in the EDT, is initially set to a root of the EDT. The pointer P2′, which points to a currently-visited node in the EDT for insertion of the linguistic digit at the location of pointer P1′, is also set to the root of the EDT. Since the EDT has no node, the pointers P2 and P2′ point to a null node 539 shown by the broken line. [0141]
  • The system stores the linguistic equivalent of the linguistic digit (e.g., “M”) in a single-digit (e.g., one-byte) buffer. Alternatively, the system may store the linguistic equivalent of the linguistic digit in a multiple-digit buffer having the same structure as the one in which the sequence S′ is stored. In both cases, if the linguistic equivalent does not exist, those buffers do not store any digit. [0142]
  • Block 505 checks whether a linguistic digit pointed by the pointer P1 exists in the EDT by calling a subroutine 540 labeled as “check_existence(D, P),” which will be described in detail below referring to FIG. 5C. The subroutine 540 takes two local variables: D, which is a linguistic digit passed from the main routine 500 for checking the existence, and P, which is a pointer to the currently-visited node of the EDT (e.g., P2). The subroutine 540 returns the local variable P as a computed result to the main routine 500. [0143]
  • FIG. 5C illustrates the subroutine 540, check_existence(D, P) for checking whether a linguistic digit has already been inserted into the EDT. Specifically, in block 505, the subroutine 540 takes a linguistic digit Value(P1) and the pointer P2, and returns the result as a pointer P3. In this specification, the function “Value(P1)” returns a linguistic digit which the P1 points to. In the case of FIG. 5B, the function Value(P1), and the pointer P2 are substituted for the linguistic digit D, and the pointer P, respectively. A decision 541 is made based on whether P2 points to a null node. Here, since the pointer P2 points to the null node 539 as shown in FIG. 5B, control moves to block 549. Block 549 creates a new empty node 559 shown by the solid line in FIG. 5D, at a position which the pointer P2 points to. [0144]
  • Here, a “null node” (e.g., 539) means that the pointer P2 points to a position where a node has not been created, while an “empty node” (e.g., 559) means that the pointer P2 points to a node which stores no linguistic digit therein. In the drawings of this document, a null node is represented by a circle drawn by the broken line, and an empty node is represented by a circle drawn by the solid line. Next, block 551 returns the position of the empty node 559 which the pointer P2 points to, to block 505 as the pointer P3 as shown in FIG. 5D. In other words, the pointer P3 points to the same node as the pointers P2, and P2′ do, which is the empty node 559. [0145]
  • Referring back to FIG. 5A, a decision 507 is made based on whether the pointer P3 points to an empty node. Since the subroutine 540 returns the pointer P3, which points to the empty node 559, control moves to block 509. Block 509 substitutes the linguistic digit Value(P1) for the linguistic digit Value(P3). Here, the linguistic digit Value(P1), i.e., “m,” is substituted for the linguistic digit which the pointer P3 points to, i.e., the formerly empty node 559, as shown in FIG. 5E. Then, control moves to a decision 511. [0146]
  • The decision 511 is made based on whether the linguistic digit which the pointer P1′ points to is an “empty value,” or has no value. Here, the pointer P1′ points to a linguistic digit of “M” as shown in FIG. 5B, which is not a digit with an empty value, and thus, control moves to block 513. Block 513 checks whether a linguistic digit pointed by the pointer P1′ exists in the EDT by calling a subroutine 540. Referring to FIG. 5C again, the decision 541 is made based on whether the pointer P2′ points to a null node. Here, since the pointer P2′ points to the node 559 which now stores a linguistic digit of “m” as shown in FIG. 5E, control moves to a decision 543. The decision 543 is made based on whether the pointer P2′ points to an empty node. Since the pointer P2′ points to the node 559 storing the linguistic digit “m,” which is not an empty node, control moves to a decision 545. [0147]
  • The decision 545 is made based on whether Value(D) is greater than, equal to, or less than Value(P), where the local variables D, and P are the pointers P1′, and P2′ from the main routine 500, respectively. Here, the pointer P1′ points to “M” (FIG. 5B), and the pointer P2′ points to “m” (FIG. 5E). Supposing that Value(“M”)<Value(“m”), control proceeds to block 547, which moves the pointer P to a “lokid” child node of the “m” node (i.e., the node 559) which the pointer P (i.e., P2′) originally points to, as shown in FIG. 5F. The lokid child node of the “m” node has no node, which is indicated by the broken line in FIG. 5F. Control then returns to the decision 541. [0148]
  • At the decision 541, the pointer P now points to a null node since the lokid of the “m” node has no node. Thus, control goes to block 549, where the check_existence routine 540 creates a new empty node at the position to which the pointer P points. As a result, after block 549 has been executed, the pointer P points to the new empty node at the lokid of the “m” node as shown in FIG. 5G. Then, control moves to block 551, which returns the position of the pointer P as the pointer P3′ as shown in FIG. 5H. [0149]
  • Control returns from the subroutine 540 and proceeds to a decision 515 in the main routine 500. The decision 515 is made based on whether the pointer P3′ points to an empty node. Since the subroutine 540 returns the new empty node as the pointer P3′, control moves to block 517. Block 517 substitutes the linguistic digit Value(P1′) for the linguistic digit Value(P3′). Here, the linguistic digit Value(P1′), i.e., “M” is substituted for the linguistic digit which the pointer P3′ points to as shown in FIG. 5I. Then, control moves to a decision 519. [0150]
  • The decision 519 is made based on whether an eqkid of the node which the pointer P3 points to is equal to an eqkid of the node which the pointer P3′ points to. As an exceptional rule, if both (i) an eqkid of the node which the pointer P3 points to, and (ii) an eqkid of the node which the pointer P3′ points to are null nodes, control moves on the “NO” branch, and proceeds to block 521. Block 521 calls a subroutine 560, which is labeled as “adjust_subgraph(P, P′),” as described in detail referring to FIG. 5J. [0151]
  • The subroutine 560 takes two local variables: P, which is a pointer to a linguistic digit in the EDT, and P′, which is a pointer to a linguistic equivalent of the linguistic digit in the EDT. A decision 555 is made based on whether both (i) the “eqkid” child node of the node which the pointer P3 points to, and (ii) the “eqkid” child node of the node which the pointer P3′ points to, are null nodes. Here, as shown in FIG. 5I, (i) the eqkid child node of the node which the pointer P3 points to is a null node, and (ii) the eqkid child node of the node which the pointer P3′ points to is a null node. Thus, control proceeds to block 557. Block 557 creates a new empty node under (i) the eqkid child node of the node which the pointer P3 points to, and (ii) the eqkid child node of the node which the pointer P3′ points to, as shown in FIG. 5K. [0152]
  • Control returns from the subroutine 560 and proceeds to a decision 523 in the main routine 500. The decision 523 is made based on whether the pointer P1 has reached the end of the linguistic digit sequence S′. As shown in FIG. 5B, the pointer P1 still points to the first linguistic digit of the linguistic digit sequence S′. Thus, control moves to block 525. In block 525, the system sets the pointers P1, P1′, P2, and P2′ to add the next digit in the sequence to the EDT as shown in FIG. 5L. In block 525, the system moves the pointer P1 to a next linguistic digit of the linguistic digit sequence S′, namely, “o.” The system sets the pointer P1′ to a linguistic equivalent of the linguistic digit pointed by the pointer P1, which is “O.” The system also sets the pointers P2 and P2′ to the empty node which was created under the eqkid child nodes of the “m” node and the “M” node. Then, control returns to block 505. [0153]
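  • One pass of the build loop walked through above (blocks 505 through 525) can be approximated by the following Python sketch. It is a simplification under stated assumptions: the root is created by the caller as an empty node, the "equal" outcome of decision 545 is assumed to return the matching node, the shared empty eqkid is always created up front (the text creates it via adjust_subgraph only when an equivalent exists), ASCII letters are treated as equivalent in both case directions, and the case where the two eqkids already differ is not handled. The names `Node`, `ascii_equivalent`, and `add_digit` are hypothetical.

```python
from typing import Optional

class Node:
    """digit None marks an "empty node"; a child that is None is a "null node"."""
    def __init__(self) -> None:
        self.digit: Optional[int] = None
        self.lokid: Optional["Node"] = None
        self.eqkid: Optional["Node"] = None
        self.hikid: Optional["Node"] = None

def check_existence(digit: int, p: Node) -> Node:
    """Sketch of subroutine 540, starting from a non-null node p."""
    while True:
        if p.digit is None or p.digit == digit:   # 543: empty node (or match)
            return p                              # 551
        slot = "lokid" if digit < p.digit else "hikid"   # 545/547
        child = getattr(p, slot)
        if child is None:                         # 541: null node
            child = Node()                        # 549: create empty node
            setattr(p, slot, child)
            return child                          # 551
        p = child

def ascii_equivalent(digit: int) -> Optional[int]:
    """Case counterpart of an ASCII letter, or None (the "empty value")."""
    if 0x61 <= digit <= 0x7A:
        return digit - 0x20
    if 0x41 <= digit <= 0x5A:
        return digit + 0x20
    return None

def add_digit(p2: Node, digit: int) -> Node:
    """Add one digit (and its equivalent) below p2; return the next P2."""
    p3 = check_existence(digit, p2)               # 505
    p3.digit = digit                              # 509
    if p3.eqkid is None:
        p3.eqkid = Node()                         # cf. 557: new empty node
    equivalent = ascii_equivalent(digit)          # 511
    if equivalent is not None:
        p3_alt = check_existence(equivalent, p2)  # 513
        p3_alt.digit = equivalent                 # 517
        p3_alt.eqkid = p3.eqkid                   # both eqkids share one node
    return p3.eqkid                               # 525: P2 for the next digit

# Build the path for a placeholder name, digit by digit over S'.
root = Node()
p2 = root
for d in reversed(b"\xcc\xa8.com"):
    p2 = add_digit(p2, d)
```

  • After this loop, the root holds “m”, its lokid holds “M”, and both share a single eqkid holding “o”, which corresponds to the state reached in FIGS. 5K through 5N.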
  • In block 505, the main routine 500 again calls the subroutine check_existence(Value(P1), P2). At block 541 of the subroutine 540, the pointer P2 does not point to a null node as shown in FIG. 5L, and thus, control moves to block 543. At block 543, the pointer P2 points to an empty node as shown in FIG. 5L. Therefore, control moves to block 551, which returns the local variable P, i.e., the pointer P2 which points to the empty node, to block 505 of the main routine 500. In other words, the system sets the pointer P3 to the same node as the pointers P2 and P2′ point to, as shown in FIG. 5M. Control returns from the subroutine 540 to block 505 of the main routine 500, and proceeds to block 507. [0154]
  • At this point, as shown in FIG. 5M, the pointer P3 points to an empty node. Thus, decision 507 causes control to move to block 509, which substitutes Value(P1) for Value(P3). Here, the system stores the linguistic digit “o” in the node which the pointer P3 points to as shown in FIG. 5N. Control moves on to the decision 511. In block 511, since the linguistic digit which the pointer P1′ points to is not empty, control moves to block 513, which again calls the subroutine check_existence(Value(P1′), P2′). Control jumps from block 513 of the main routine 500 to block 541 of the subroutine 540. [0155]
  • In block 541, the pointer P2′ does not point to a null node as shown in FIG. 5N, and thus control moves to block 543. At block 543, the pointer P2′ does not point to an empty node, and thus control proceeds to block 545. Block 545 compares Value(P1′) (i.e., “O”) and Value(P2′) (i.e., “o”). Supposing that Value(“O”)<Value(“o”), control proceeds to block 547, which moves the pointer P to a “lokid” child node of the “o” node which the pointer P (i.e., P2′) originally points to, as shown in FIG. 5O. The lokid child node of the “o” node has no node, which is indicated by the broken line in FIG. 5O. Control then returns to the decision 541. [0156]
  • [0157] The pointers P1, P1′, P2, P2′, and P shown in FIG. 5O are located similarly to those in FIG. 5F. Therefore, the system creates the EDT illustrated in FIG. 5P by executing the above scheme for another iteration as described referring to FIGS. 5F-5O. FIG. 5P shows the pointers and variables of the system immediately after the execution of block 509.
  • [0158] The “.” (dot) which the pointer P1 points to has no linguistic equivalent. Therefore, as indicated in FIG. 5P, the pointer P1′, which is a pointer to the linguistic equivalent of Value(P1), points to an empty value. Referring back to FIG. 5A, in block 511, the pointer P1′ points to an empty value, and thus, control moves to block 523. Unlike other characters (e.g., “m,” “o,” and “c”) which have linguistic equivalents, the “.” leads to blocks 523 and 525, bypassing blocks 513, 515, and 517. Blocks 523 and 525 move the pointers to the next character without adding a portion of the tree structure representing the linguistic equivalents (e.g., “M,” “O,” and “C”). As described earlier, for the other characters having linguistic equivalents, like “m,” “o,” and “c,” the portions representing the linguistic equivalents, namely, “M,” “O,” and “C,” are created by blocks 513, 515, and 517.
  • [0159] In block 523, the pointer P1 still has not reached the end of the linguistic digit sequence S′, and control moves to block 525. Block 525 sets the pointers P1, P1′, P2, and P2′ as indicated in FIG. 5Q, and control returns to block 505. Block 505 creates an empty node at a position which the pointer P2 points to, and sets the pointer P3 to the newly created empty node as shown in FIG. 5R. The scheme in the subroutine 540 is similar to that described referring to FIGS. 5L and 5M. Then, control returns from the subroutine 540 to block 509 of the main routine 500.
  • [0160] In block 507, the pointer P3 points to an empty node as shown in FIG. 5R, and thus, control moves to block 509. Block 509 substitutes the linguistic digit which the pointer P1 points to, i.e., B2, for a node which the pointer P3 points to, as shown in FIG. 5S. Then, in the decision 511, the pointer P1′ points to an empty value. Thus, control moves to block 523. In the decision 523, the end of the linguistic digit sequence has not been reached yet, and thus, control continues to block 525. Block 525 sets the pointers P1, P1′, P2, and P2′ as shown in FIG. 5T.
  • [0161] By iterating the process described referring to FIGS. 5Q-5T, the system creates the three nodes “B1,” “A2,” and “A1” under the eqkid node of the node of “B2,” as shown in FIG. 5U. Control moves on to block 511, and then to block 523. In the decision 523, this time, the pointer P1 points to the end of the reversed linguistic digit sequence S′, and therefore, control proceeds to a decision 527. The decision 527 is made based on whether the tree structure of “m-M-o-O-c-C-.-B2-B1-A2-A1” has a terminal node. Here, the tree structure has no terminal node, and thus, control proceeds to block 529.
  • [0162] Block 529 creates a new terminal node N1 which is associated with the tree structure “m-M-o-O-c-C-.-B2-B1-A2-A1”. The terminal node N1 contains the linguistic digit sequence S, and the linguistic encoding type T of the sequence S. The terminal node N1 is sometimes referred to as a “leaf” node of the tree structure, which uniquely identifies distinct encoding types, thereby unambiguously specifying the linguistic encoding type T of the domain name corresponding to the tree pathway (e.g., “m-M-o-O-c-C-.-B2-B1-A2-A1”). The system calls subroutines detect_string and detect_type, which return the linguistic digit sequence S and the linguistic encoding type T, respectively, to the main routine 500. FIG. 5V illustrates the tree structure “m-M-o-O-c-C-.-B2-B1-A2-A1,” and the associated terminal node N1 containing the linguistic digit sequence S and the linguistic encoding type T, which are “A1A2B1B2.com” and “GB2312,” respectively.
  • [0163] ADDING “C1C2D1D2.com” TO EDT
  • [0164] Now the process for adding the linguistic digit sequence, “C1C2D1D2.com” to the EDT, which already contains the sequence, “A1A2B1B2.com,” will be described referring to FIGS. 4A, 5A, and 5W-5AG.
  • [0165] In block 501, the routine 500 receives a linguistic digit sequence S (i.e., C1C2D1D2.com), and a linguistic encoding type of the linguistic digit sequence S (e.g., BIG5). The computer system then reverses the order of the linguistic digit sequence S, and substitutes the reversed sequence of S for the reversed linguistic digit sequence S′. In block 501, the system substitutes a linguistic encoding type of the linguistic digit sequence S for a linguistic encoding type T. Here, the reversed linguistic sequence S′, and the linguistic encoding type T are “moc.D2D1C2C1,” and “BIG5,” respectively. FIG. 5W illustrates the reversed linguistic sequence S′, and the linguistic encoding type T.
  • [0166] In block 503, the routine 500 initializes pointers P1, P1′, P2, and P2′ as indicated in FIG. 5W. The pointer P1, which points to a current position in the linguistic digit sequence, is set to the first digit of the sequence S′, namely, “m.” The pointer P1′ points to a linguistic equivalent of the linguistic digit pointed to by the pointer P1, which is “M.” The linguistic equivalent of a linguistic digit in ASCII code is a capital letter of the linguistic digit. The pointer P2, which points to a currently-visited node in the EDT, is set to a root of the EDT, namely a node storing “m.” The pointer P2′, which points to a currently-visited node in the EDT for insertion of the pointer P1′, is also set to the root of the EDT, i.e., the “m” node.
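As a rough illustration of blocks 501 and 503, the following Python sketch reverses a linguistic digit sequence and looks up the linguistic equivalent of each digit. It is not the implementation described in the figures: the function names `reverse_digits` and `linguistic_equivalent` are hypothetical, multi-byte digits are represented by placeholder labels such as "A1", and only lowercase ASCII letters are assumed to have an equivalent.

```python
def reverse_digits(digits):
    """Return the reversed sequence S' for a list of linguistic digits S."""
    return list(reversed(digits))


def linguistic_equivalent(digit):
    """Return the capital letter for a lowercase ASCII digit, else None.

    None stands in for the "empty value" that P1' points to when a digit
    (for example ".") has no linguistic equivalent.
    """
    if len(digit) == 1 and "a" <= digit <= "z":
        return digit.upper()
    return None


# Placeholder labels stand in for the two-byte digits of "A1A2B1B2.com".
S = ["A1", "A2", "B1", "B2", ".", "c", "o", "m"]
S_prime = reverse_digits(S)
equivalents = [linguistic_equivalent(d) for d in S_prime]
```

With these assumptions, the reversed sequence starts with "m", "o", "c", and only those three digits have a non-empty equivalent.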
  • [0167] Block 505 checks whether a linguistic digit pointed to by the pointer P1 exists in the EDT by calling the subroutine 540 check_existence(D, P). Referring again to FIG. 5C, the decision 541 is made based on whether the pointer P, i.e., the pointer P2, points to a null node. Here, the pointer P2 points to a node with a linguistic digit of “m,” and thus, control moves to the decision 543. The decision 543 is made based on whether the pointer P points to an empty node. The pointer P2 points to the “m” node, not an empty node, and thus, control proceeds to the decision 545. The decision 545 is made based on whether Value(D) is greater than, equal to, or less than Value(P), where the local variables D and P are the pointers P1 and P2 from the main routine 500, respectively. Here, the pointer P1 points to “m,” and the pointer P2 points to “m.” Therefore, control moves to block 551, which returns the position of the pointer P2 to the main routine 500 as the pointer P3.
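The decisions 541, 543, and 545 of the subroutine 540 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the code behind FIG. 5C: the `Node` class and its field names are hypothetical, an "empty" node is modeled as a node whose value is `None`, Python's string ordering stands in for the comparison of Value(D) and Value(P), and the null-node case (blocks 541 and 549) is folded into the descent to the lokid or hikid child.

```python
class Node:
    """One EDT node; a value of None marks an empty node awaiting a digit."""

    def __init__(self, value=None):
        self.value = value
        self.lokid = None
        self.eqkid = None
        self.hikid = None


def check_existence(digit, node):
    """Return the node for `digit` on this level, creating an empty one if absent."""
    while True:
        if node.value is None:      # decision 543: empty node, caller fills it
            return node
        if digit == node.value:     # decision 545: values are equal
            return node
        # blocks 547 / 553: move to the lokid or hikid child
        slot = "lokid" if digit < node.value else "hikid"
        child = getattr(node, slot)
        if child is None:           # decision 541 / block 549: null, so create
            child = Node()
            setattr(node, slot, child)
        node = child


root = Node("m")
p3 = check_existence("m", root)        # digit already stored at the root
p3_prime = check_existence("M", root)  # "M" < "m", so an empty lokid is created
```

The caller is then expected to store the digit in any empty node that is returned, as block 509 does for P3.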
  • [0168] In the decision 507, the pointer P3 points to the “m” node as shown in FIG. 5X, which is not an empty node, and thus, control moves to the decision 511. The decision 511 is made based on whether the pointer P1′ is an empty value. Here, the pointer P1′ points to a linguistic digit, “M,” which is not an empty value, and thus, control moves on to block 513. In block 513, the subroutine 540 checks whether the EDT has already stored the linguistic digit which the pointer P2′ points to. Specifically, referring to FIG. 5C, the decision 541 checks if the pointer P2′ points to a null node. The pointer P2′ points to an “m” node as shown in FIG. 5X, and thus, control moves to the decision 543.
  • [0169] The decision 543 checks if the pointer P2′ points to an empty node. Since the pointer P2′ does not point to an empty node, control moves to the decision 545. Here, the relationship Value(D)<Value(P) (i.e., Value(“M”)<Value(“m”)) is satisfied, and thus control moves to block 547, where the system moves the pointer P to the lokid node of the “m” node which the pointer P2′ currently points to. Then, the subroutine 540 returns the position of the pointer P to the main routine 500 as the pointer P3′. As a result, the pointers P, and P3′ are set to the “M” node of the EDT as shown in FIG. 5X.
  • [0170] Control returns from the subroutine 540 to block 513 of the main routine 500. Next, in the decision 515, the pointer P3′ points to an “M” node, and thus, control moves to the decision 519. The decision 519 is made based on whether the eqkid node of the node which the pointer P3 points to and the eqkid node of the node which the pointer P3′ points to are the same. Here, the pointer P3 points to the “m” node, and the pointer P3′ points to the “M” node. The eqkid nodes of the “m” node and the “M” node are the same, namely, the “o” node, as shown in FIG. 5X. Consequently, control moves to the decision 523, where control proceeds further to block 525 since the end of the sequence S′ has not been reached yet.
  • [0171] In block 525, the system sets the pointers P1, P1′, P2, and P2′ to add the next digit in the sequence to the EDT as shown in FIG. 5Y. Control loops back to block 505 to check if the EDT has the linguistic digit which the pointer P2 points to. Since the pointer P2 points to the “o” node now, control proceeds to the decisions 541, 543, and 545. By comparing the value of the “o” pointed by the pointer P1 with the value of the “o” pointed by the pointer P2, control moves to block 551, which returns the position of the pointer P2 to the block 505 of the main routine 500 as the pointer P3, as shown in FIG. 5Z. In the decision 507, since the pointer P3 does not point to an empty node, control goes on to the decision 511, which checks whether the pointer P1′ points to an empty value. Here, the pointer P1′ points to an “O” digit, and thus, control moves to block 513.
  • [0172] Block 513 checks if the EDT has the linguistic digit which the pointer P2′ points to. Since the pointer P2′ points to the “o” node now, control proceeds to the decisions 541, 543, and 545. Here, the relationship Value(D)<Value(P) (i.e., Value(“O”)<Value(“o”)) is satisfied, and thus control moves to block 547, where the system moves the pointer P to the lokid node of the “o” node which the pointer P2′ currently points to. Then, the subroutine 540 returns the position of the pointer P to the main routine 500 as the pointer P3′. As a result, the pointers P, and P3′ are set to the “O” node of the EDT as shown in FIG. 5Z.
  • [0173] By repeating the process described referring to FIGS. 5X-5Z, the pointers P1, P1′, P2, P2′, P3, and P3′ are set as shown in FIG. 5AA, which is similar to FIG. 5Y, when the block 525 is executed. Control then loops back to block 505 to check if the EDT has the linguistic digit, “.” (dot). Control jumps from the main routine 500 to the subroutine 540, where control proceeds to decisions 541, 543, 545, and the block 551. As a result, the subroutine 540 returns the position of the pointer P to the main routine 500 as the pointer P3, thereby setting the pointer P3 as shown in FIG. 5AB.
  • [0174] Control returns to decision 507 of the main routine 500, and further goes to blocks 509, and then 511 since the pointer P3 does not point to an empty node. In the decision 511, the pointer P1′ points to an empty value since the linguistic digit of “.” has no linguistic equivalent as shown in FIG. 5AB. Thus, control moves to the decision 523, and then block 525 since the end of the sequence S′ has not been reached. Block 525 sets the pointers P1, P1′, P2, and P2′ as shown in FIG. 5AC.
  • [0175] Control loops back to block 505, which calls the subroutine 540. Control moves to the decisions 541, 543, and 545 since the pointer P2 points to neither a null node nor an empty node. The decision 545 is made based on whether Value(D) is greater than, equal to, or less than Value(P), where the local variables D, and P are the pointers P1, and P2 from the main routine 500, respectively.
  • [0176] Here, the pointer P1 points to “D2”, and the pointer P2 points to “B2.” Supposing that Value(“D2”)<Value(“B2”), control proceeds to block 547, which moves the pointer P to a “lokid” child node of the “B2” node which the pointer P (i.e., the pointer P2′) originally points to, as shown in FIG. 5AC. The lokid child node of the “B2” node has no node, which is indicated by the broken line in FIG. 5AC. Control then returns to the decision 541. Significantly, since the linguistic digit “D2” does not have a value equal to that of the linguistic digit “B2,” the structure of the EDT diverges for the first time at the “B2” node. This causes the subsequent nodes for C1C2D1D2.com to follow a unique path different from that of A1A2B1B2.com below the “.” level, as shown in FIG. 5AG.
  • [0177] At the decision 541, the pointer P now points to a null node since the lokid of the “B2” node has no node. Thus, control goes to block 549, which creates a new empty node at a position which the pointer P points to. As a result, after block 549, the pointer P points to the new empty node at the lokid of the “B2” node as shown in FIG. 5AC. Then, control moves to block 551, which returns the position of the pointer P as the pointer P3 as shown in FIG. 5AD.
  • [0178] Referring back to FIG. 5A, the decision 507 is made based on whether the pointer P3 points to an empty node. Since the subroutine 540 returns the pointer P3 which points to the empty node, control moves to block 509. Block 509 substitutes the linguistic digit Value(P1) for the linguistic digit Value(P3). Here, the linguistic digit Value(P1), i.e., “D2” is substituted for the linguistic digit which the pointer P3 points to, as shown in FIG. 5AE. Then, control moves to the decision 511.
  • [0179] In the decision 511, the pointer P1′ points to an empty value, and thus control goes to the decision 523, and then block 525, which sets the pointers P1, P1′, P2, and P2′ as shown in FIG. 5AF. The positions of the pointers P1, P1′, P2, and P2′ are similar to those of FIG. 5T. Accordingly, by iterating the process described referring to FIGS. 5T-5V, the routine 500 creates the EDT as illustrated by FIG. 5AG.
  • [0180] ADDING “D1E2F1F2.com” TO EDT
  • [0181] Now the process for adding the linguistic digit sequence, “D1E2F1F2.com” to the EDT, which already contains the linguistic digit sequences, “A1A2B1B2.com,” and “C1C2D1D2.com,” will be described below. The computer system adds the sequence, “D1E2F1F2.com” to the EDT by executing the main routine 500, which calls the subroutine 540, in a similar manner as described in detail above referring to FIGS. 5W-5AG with respect to adding “C1C2D1D2.com.” Adding the linguistic digit sequence “D1E2F1F2.com” to the EDT is similar to adding the sequence “C1C2D1D2.com” to the EDT with the exception that the node “F2” is created in a hikid node of the node “B2” shown in FIG. 4A.
  • [0182] Specifically, after block 525 sets the pointers P1, P1′, P2, and P2′ as shown in FIG. 5AH, control loops back to block 505, which calls the subroutine 540. Control moves to the decisions 541, 543, and 545 since the pointer P2 points to neither a null node nor an empty node. The decision 545 is made based on whether Value(D) is greater than, equal to, or less than Value(P), where the local variables D, and P are the pointers P1, and P2 from the main routine 500, respectively. Here, the pointer P1 points to “F2”, and the pointer P2 points to “B2.” Supposing that Value(“B2”)<Value(“F2”), control proceeds to block 553, which moves the pointer P to a “hikid” child node of the “B2” node which the pointer P (i.e., the pointer P2′) originally points to, as shown in FIG. 5AH. The hikid child node of the “B2” node has no node, which is indicated by the broken line in FIG. 5AH. Control then returns to the decision 541.
  • [0183] At the decision 541, the pointer P now points to a null node since the hikid of the “B2” node has no node. The process following the decision 541 is similar to that described referring to FIGS. 5AC-5AG, and ultimately creates the EDT illustrated in FIG. 4A.
  • [0184] Finally, a hypothetical case will be described below. Suppose that a new domain name “X1X2Y1Y2.com” encoded by BIG5 is being added to the EDT 401 shown in FIG. 4A, and that the values of the linguistic digits X1, X2, Y1, and Y2 are exactly the same as those of the linguistic digits A1, A2, B1, and B2, which are encoded by GB2312. This is a rare case, but it can happen, especially when characters with the same value are used in different linguistic encoding types.
  • [0185] In this case, control follows nodes 403, 409, 411, 413, 415, 417, and 419 since each of the values of the linguistic digits for A1A2B1B2.com matches the corresponding one of the values of the linguistic digits for X1X2Y1Y2.com. Finally, when the pointer P1 points to the end of the string S′, namely, X1, control goes to “YES” branch of the decision 523 in FIG. 5A. Unlike the cases for A1A2B1B2.com, C1C2D1D2.com, and D1E2F1F2.com, in block 527, control moves on to block 531 since the terminal node 421 is not null. Specifically, the terminal node 421 contains the linguistic digit sequence S, which is “A1A2B1B2.com,” and the linguistic encoding type T, which is “GB2312.”
  • [0186] Block 531 creates a new terminal node 431 which is associated with the tree structure “m-M-o-O-c-C-.-Y2-Y1-X2-X1”. The terminal node 431 contains the linguistic digit sequence and the linguistic encoding type, which are “X1X2Y1Y2.com” and “BIG5,” respectively. In block 533, the new terminal node 431 is appended at the end of the tree structure of “A1A2B1B2.com.” As a result, the tree has two leaf nodes, namely the terminal nodes 421 and 431. FIG. 5AJ illustrates the completed EDT, in which the values of the corresponding linguistic digits of two different domain names match each other.
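The way two domain names with identical digit values end up sharing one tree path with two terminal nodes can be sketched in Python as below. This is a simplified illustration under several assumptions: the names `Node`, `insert`, and `detect` are hypothetical, the parallel nodes for linguistic equivalents (“M,” “O,” “C”) are omitted, terminal nodes are modeled as a list of (sequence, encoding type) pairs on the last node of a path, and placeholder labels such as "v1" stand in for the actual digit values.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.lokid = self.eqkid = self.hikid = None
        self.terminals = []  # (domain name, encoding type) pairs ending here


def insert(root, name, digits, encoding):
    """Insert the reversed digit list under root and return the root."""
    seq = list(reversed(digits))
    if root is None:
        root = Node(seq[0])
    node = root
    for i, d in enumerate(seq):
        while d != node.value:  # walk or extend the lokid/hikid chain
            slot = "lokid" if d < node.value else "hikid"
            if getattr(node, slot) is None:
                setattr(node, slot, Node(d))
            node = getattr(node, slot)
        if i + 1 < len(seq):    # descend to the next level through eqkid
            if node.eqkid is None:
                node.eqkid = Node(seq[i + 1])
            node = node.eqkid
    node.terminals.append((name, encoding))
    return root


def detect(root, digits):
    """Return every (name, encoding type) stored for this digit sequence."""
    seq = list(reversed(digits))
    node = root
    for i, d in enumerate(seq):
        while node is not None and d != node.value:
            node = node.lokid if d < node.value else node.hikid
        if node is None:
            return []
        if i + 1 < len(seq):
            node = node.eqkid
    return node.terminals


shared = ["v1", "v2", "v3", "v4", ".", "c", "o", "m"]  # identical values in both encodings
other = ["w1", "w2", "w3", "w4", ".", "c", "o", "m"]
edt = insert(None, "A1A2B1B2.com", shared, "GB2312")
edt = insert(edt, "C1C2D1D2.com", other, "BIG5")
edt = insert(edt, "X1X2Y1Y2.com", shared, "BIG5")
candidates = detect(edt, shared)
```

In this sketch a lookup of the shared digit values returns both candidates, which mirrors the two leaf nodes 421 and 431; how a resolver chooses between them is outside what the sketch covers.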
  • [0187] THE REINSERT ROUTINE
  • [0188] The preceding discussion has shown how the algorithm can build up a tree structure without moving or adjusting previously completed substructures within the tree. However, it may at times become desirable to adjust a substructure within the tree. This might occur when, for example, neighboring nodes in the tree coincidentally represent linguistic equivalents of a character in some encoding type. Consider the situation in FIG. 6 where the registry contains three domain names: “A1A2E1E2” in a first encoding type, “C1D3H3” in a second encoding type, and “aba” in ASCII. An encoding detection tree 601 is created when the first two domain names are considered. Now assume that by coincidence the digital sequence of E2 in its encoding type is identical to the ASCII value of “a” and the digital sequence of H3 in its encoding type is identical to the ASCII value of “A.”
  • [0189] The tree building algorithm applies the third domain name as follows. Initially, pointer P2 points to node E2 and pointer P1 points to the last letter “a” in the domain name aba. See block 503 of FIG. 5A. Next, the algorithm runs the check_existence subroutine 540 of FIG. 5C. At 545 in this subroutine, the algorithm determines that the digital values of “a” and “E2” are equal. Thus, it returns a pointer P3 pointing to node E2. Now, returning to primary algorithm 500, the process continues through decision blocks 507 and 511. At 511, it is determined that P1′ is not an empty value; it is “A.” Then, at 513, the algorithm runs the check_existence subroutine 540 using P1′ = “A” and P2′ = the E2 node. Two loops through subroutine 540 return pointer P3′ pointing to the H3 node.
  • [0190] Now, returning to the main algorithm, decision 515 determines that P3′ is not pointing to an empty node. Then decision 519 directs the process to 521, where the adjust-EDT structure subroutine 560 is executed. There, the system determines that both P3 and P3′ point to a non-null node. Therefore, each of decisions 555, 559, and 561 is answered in the negative and process control is directed to process block 563, where a “reinsert_subparagraph” subroutine is executed. This subroutine is depicted in FIG. 5AI by process 570.
  • [0191] In process 570, the system initially creates pointers P and P′ that point to nodes located immediately under pointers P3 and P3′. In the example tree 601 of FIG. 6, P points to node E1 and P′ points to node D3. At 571, the subroutine determines whether P′ is null. In this case, it is not, so the process moves to 573 where the check_existence subroutine is executed. This time the input parameters are P and P′. Executing this program as such compares the values of D3 and E1 and moves P to hikid from E1. Thereafter a new empty node is created at that position and a new pointer P″ is returned at the position of this new node. Now returning to routine 570, a decision block 575 determines whether P″ points to an empty node. It does in this case, so a process operation 577 inserts the digit at the position of P′. In this case, that is the digit D3.
  • [0192] The process then moves pointer P′ recursively down the branch on which it sits and destroys the previous node of P′. See 579 and 581. Process control then returns to 571 where the procedure for removing the next node on the P′ branch is executed. The process continues in this manner until the last node remaining in the P′ branch has been moved to the P branch. Then P′ points to a null node and the process completes as indicated at 571.
  • [0193] In general, reinsertion of a substructure is necessary when a digit and its linguistic equivalent digit have both been inserted into the EDT and both of them have some substructures. Because both of them should now converge, the substructures of one need to be redistributed to the substructures of the other to ensure no information is lost.
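One way to picture the redistribution is the Python sketch below, which moves every node of one sibling level into another so that the two branches converge. It is only a sketch under assumptions and not a transcription of process 570: the names `Node` and `merge_level` are hypothetical, terminal nodes are left out, the ordering of digits is Python's string ordering of placeholder labels, and a digit present on both branches is assumed to be handled by merging the children recursively.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.lokid = self.eqkid = self.hikid = None


def merge_level(dst, src):
    """Move all nodes of the level rooted at src into the level rooted at dst."""
    if src is None:
        return dst
    lo, hi, eq = src.lokid, src.hikid, src.eqkid
    src.lokid = src.hikid = None  # detach src from its former siblings
    if dst is None:
        dst = src
    else:
        node = dst
        while True:
            if src.value == node.value:
                # Same digit on both branches: merge the children, drop src.
                node.eqkid = merge_level(node.eqkid, eq)
                break
            slot = "lokid" if src.value < node.value else "hikid"
            if getattr(node, slot) is None:
                setattr(node, slot, src)  # new digit: keep its whole subtree
                break
            node = getattr(node, slot)
    dst = merge_level(dst, lo)
    return merge_level(dst, hi)


# Branch under E2: E1 -> A2 -> A1. Branch under H3: D3 -> C1.
e1 = Node("E1")
e1.eqkid = Node("A2")
e1.eqkid.eqkid = Node("A1")
d3 = Node("D3")
d3.eqkid = Node("C1")
merged = merge_level(e1, d3)
```

With these placeholder labels "D3" sorts before "E1", so D3 lands on the lokid side of E1; in the FIG. 6 example the actual digit values place it on the hikid side. Either way, D3 keeps its child C1, so nothing stored on the moved branch is lost.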
  • [0194] 6. HARDWARE/SOFTWARE
  • [0195] Embodiments of the present invention relate to an apparatus for performing the above-described m1DNS operations. This apparatus may be specially constructed (designed) for the required purposes, or it may be a general-purpose computer selectively activated or configured by a computer program stored in the computer. The processes presented herein are not inherently related to any particular computer or other apparatus. In particular, various general-purpose machines may be used with programs written in accordance with the teachings herein, or it may be more convenient to construct a more specialized apparatus to perform the required method operations. The required structure for a variety of these machines will appear from the description given above.
  • [0196] In addition, embodiments of the present invention further relate to computer readable media that include program instructions for performing various computer-implemented operations. The media may also include, alone or in combination with the program instructions, data files, data structures, tables, and the like. The media and program instructions may be those specially designed and constructed for the purposes of the present invention, or they may be of the kind well known and available to those having skill in the computer software arts. Examples of computer-readable media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROM disks; magneto-optical media such as floptical disks; and hardware devices that are specially configured to store and perform program instructions, such as read-only memory devices (ROM) and random access memory (RAM). The media may also be a transmission medium such as optical or metallic lines, wave guides, etc. including a carrier wave transmitting signals specifying the program instructions, data structures, etc. Examples of program instructions include both machine code, such as produced by a compiler, and files containing higher level code that may be executed by the computer using an interpreter.
  • [0197] FIG. 7 illustrates a typical computer system in accordance with an embodiment of the present invention. The computer system 700 includes any number of processors 702 (also referred to as central processing units, or CPUs) that are coupled to storage devices including primary storage 706 (typically a random access memory, or “RAM”) and primary storage 704 (typically a read only memory, or “ROM”). As is well known in the art, primary storage 704 acts to transfer data and instructions uni-directionally to the CPU and primary storage 706 is used typically to transfer data and instructions in a bi-directional manner. Both of these primary storage devices may include any suitable type of the computer-readable media described above. A mass storage device 708 is also coupled bi-directionally to CPU 702 and provides additional data storage capacity and may include any of the computer-readable media described above. The mass storage device 708 may be used to store programs, data and the like and is typically a secondary storage medium such as a hard disk that is slower than primary storage. It will be appreciated that the information retained within the mass storage device 708 may, in appropriate cases, be incorporated in standard fashion as part of primary storage 706 as virtual memory. A specific mass storage device such as a CD-ROM 714 may also pass data uni-directionally to the CPU.
  • [0198] CPU 702 is also coupled to an interface 710 that includes one or more input/output devices such as video monitors, track balls, mice, keyboards, microphones, touch-sensitive displays, transducer card readers, magnetic or paper tape readers, tablets, styluses, voice or handwriting recognizers, or other well-known input devices such as, of course, other computers. Finally, CPU 702 optionally may be coupled to a computer or telecommunications network using a network connection as shown generally at 712. With such a network connection, it is contemplated that the CPU might receive information from the network (e.g., requests to resolve domains), or might output information to the network in the course of performing the above-described method operations. The above-described devices and materials will be familiar to those of skill in the computer hardware and software arts.
  • [0199] The CPU 702 may take various forms. It may include one or more general purpose microprocessors that are selectively configured or reconfigured to implement the functions described herein. Or it may include one or more specially designed processors or microcontrollers that contain logic and/or circuitry for implementing the functions described herein. Any of the logical devices serving as CPU 702 may be designed as general purpose microprocessors, microcontrollers, application specific integrated circuits (ASICs), digital signal processors (DSPs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), and the like. They may execute instructions under the control of the hardware, firmware, software, reconfigurable hardware, combinations of these, etc.
  • [0200] The hardware elements described above may be configured (usually temporarily) to act as one or more software modules for performing the operations of this invention. For example, separate modules created from program instructions for detecting an encoding type, transforming that encoding type, and identifying a default name server may be stored on mass storage device 708 or 714 and executed on CPU 702 in conjunction with primary memory 706. See FIG. 3B for example.
  • [0201] Although the foregoing invention has been described in some detail for purposes of clarity of understanding, it will be apparent that certain changes and modifications may be practiced within the scope of the appended claims.

Claims (57)

What is claimed is:
1. A method, implemented on an apparatus, of detecting a linguistic encoding type of a domain name, the method comprising:
receiving a digital representation of the domain name; and
using said representation to traverse a tree structure having multiple nodes connected by paths and having terminal nodes that uniquely identify distinct encoding types, thereby detecting the linguistic encoding type of the domain name.
2. The method of claim 1, wherein the tree structure is traversed by considering individual characters of the domain name or portions of said characters to determine how to move between nodes on the tree structure.
3. The method of claim 1, wherein the tree structure is traversed by comparing digital representations of linguistic digits in the nodes of the tree structure against digital representations of individual characters of the domain name or portions of said characters, and thereby determining how to move between nodes on the tree structure.
4. The method of claim 3, wherein a linguistic digit of a root of the tree structure is compared against a digital representation of a last character of the domain name and
wherein a linguistic digit of a terminal node of the tree structure is compared against a digital representation of a first character of the domain name.
5. The method of claim 4, wherein the terminal node specifies the linguistic encoding type of the domain name.
6. The method of claim 1, further comprising reversing the sequence of the digital representation of the domain name prior to using said representation to traverse the tree structure, wherein a digital representation of a last character of the domain name is compared to a root node on the tree structure.
7. The method of claim 6, wherein a digital representation of a next to last character of the domain name is compared to a second node of the tree structure.
8. The method of claim 6, further comprising:
(a) using a next previous character of the domain name to identify a next lower level node of the tree structure; and
(b) repeating (a) until reaching a terminal node of the tree structure.
9. The method of claim 1, wherein the tree structure is a ternary tree structure.
10. The method of claim 1, wherein the nodes of the tree structure comprise digital sequences of linguistic digits from characters of multiple encoding types.
11. A computer program product comprising a machine readable medium on which is stored program code instructions for performing a method of detecting a linguistic encoding type of a domain name, the program instructions comprising:
program code for receiving a digital representation of the domain name; and
program code for using said representation to traverse a tree structure having multiple nodes connected by paths and having terminal nodes that uniquely identify distinct encoding types, thereby detecting the linguistic encoding type of the domain name.
12. The computer program product of
claim 11
, wherein the tree structure is traversed by executing program code for considering individual characters of the domain name or portions of said characters to determine how to move between nodes on the tree structure.
13. The computer program product of
claim 11
, wherein the tree structure is traversed by executing program code for comparing digital representations of linguistic digits in the nodes of the tree structure against digital representations of individual characters of the domain name or portions of said characters, and thereby determining how to move between nodes on the tree structure.
14. The computer program product of
claim 13
, wherein a linguistic digit of a root of the tree structure is compared against a digital representation of a last character of the domain name and
wherein a linguistic digit of a terminal node of the tree structure is compared against a digital representation of a first character of the domain name.
15. The computer program product of
claim 14
, wherein the terminal node specifies the linguistic encoding type of the domain name.
16. The computer program product of
claim 11
, further comprising program code for reversing the sequence of the digital representation of the domain name prior to using said representation to traverse the tree structure, wherein executing program code compares a digital representation of a last character of the domain name to a root node on the tree structure.
17. The computer program product of
claim 16
, wherein a digital representation of a next to last character of the domain name is compared to a second node of the tree structure.
18. The computer program product of claim 16, further comprising:
(a) program code for using a next previous character of the domain name to identify a next lower level node of the tree structure; and
(b) program code for repeating (a) until reaching a terminal node of the tree structure.
19. The computer program product of claim 11, wherein the tree structure is a ternary tree structure.
20. The computer program product of claim 11, wherein the nodes of the tree structure comprise digital sequences of linguistic digits from characters of multiple encoding types.
21. An apparatus for detecting a linguistic encoding type of a domain name, the apparatus comprising:
one or more processors;
memory coupled to said one or more processors and configured to store a tree structure having multiple nodes connected by paths and having terminal nodes that uniquely identify distinct encoding types; and
a network interface configured to receive domain names from network nodes;
wherein the one or more processors are configured or designed to traverse the tree structure using information from a domain name to thereby detect the linguistic encoding type of the domain name.
22. The apparatus of claim 21, wherein the tree structure is a ternary tree structure.
23. The apparatus of claim 21, wherein the nodes of the tree structure comprise digital sequences of linguistic digits from characters of multiple encoding types.
24. The apparatus of claim 21, wherein the one or more processors are further configured or designed to traverse the tree structure by considering individual characters of the domain name or portions of said characters to determine how to move between nodes on the tree structure.
25. The apparatus of claim 21, wherein the one or more processors are further configured or designed to traverse the tree structure by comparing digital representations of linguistic digits in the nodes of the tree structure against digital representations of individual characters of the domain name or portions of said characters, and thereby determining how to move between nodes on the tree structure.
26. The apparatus of claim 21, wherein the one or more processors are further configured or designed to (i) reverse the sequence of the digital representation of the domain name prior to using said representation to traverse the tree structure, and (ii) compare a digital representation of a last character of the domain name to a root node on the tree structure.
27. The apparatus of claim 26, wherein the one or more processors are further configured to compare a digital representation of a next to last character of the domain name to a second node of the tree structure.
28. The apparatus of claim 21, further comprising a logical module for converting the domain name from its linguistic encoding type to a DNS compatible encoding type.
29. The apparatus of claim 28, further comprising a logical module for resolving domain names in the DNS compatible encoding type.
30. The apparatus of claim 28, wherein the DNS compatible encoding type is ASCII.
31. A method, implemented on an apparatus, of creating an encoding detection tree comprising nodes connected by paths, with the paths representing domain names in various encoding types, the method comprising:
receiving a representation of a digitally represented first domain name which is encoded in a first linguistic encoding type;
adding the first domain name, and its first linguistic encoding type, to the encoding detection tree to create a first path through the encoding detection tree;
receiving a representation of a digitally represented second domain name which is encoded in a second linguistic encoding type; and
adding the second domain name, and its second linguistic encoding type, to the encoding detection tree to create a second path through the encoding detection tree.
32. The method of claim 31, further comprising determining whether the first domain name already exists in the encoding detection tree.
33. The method of claim 31, wherein the first and second paths each comprise separate terminal nodes, one or more intermediate nodes, and a common root node.
34. The method of claim 33, further comprising adding an identifier of the first and second linguistic encoding types to the terminal nodes of the first and second paths, respectively.
35. The method of claim 34, further comprising adding a sequence of the first domain name to the terminal node of the first path.
36. The method of claim 33, wherein the first path presents the first domain name in reverse order of linguistic digits when moving from the root node to the terminal node.
37. The method of claim 31, wherein each parent node of the encoding detection tree branches into three children nodes at most.
38. The method of claim 31, wherein adding the first domain name comprises adding a new node to the encoding detection tree for each linguistic digit of the first domain name having a digital sequence that does not appear at a corresponding location in the encoding detection tree.
39. The method of claim 31, wherein the position of the new nodes with respect to existing nodes is determined by comparing the digital sequence of an existing node with the digital sequence of a corresponding linguistic digit from the first domain name.
40. The method of claim 31, wherein adding the first domain name includes adding to the encoding detection tree a linguistic equivalent node of one of the linguistic digits in the first domain name.
41. A computer program product comprising a machine readable medium on which is provided program instructions for creating an encoding detection tree comprising nodes connected by paths, with the paths representing domain names in various encoding types, the instructions comprising:
program code for receiving a representation of a digitally represented first domain name which is encoded in a first linguistic encoding type;
program code for adding the first domain name, and its first linguistic encoding type, to the encoding detection tree to create a first path through the encoding detection tree;
program code for receiving a representation of a digitally represented second domain name which is encoded in a second linguistic encoding type; and
program code for adding the second domain name, and its second linguistic encoding type, to the encoding detection tree to create a second path through the encoding detection tree.
42. The computer program product of claim 41, further comprising program code for determining whether the first domain name already exists in the encoding detection tree.
43. The computer program product of claim 41, wherein the first and second paths each comprise separate terminal nodes, one or more intermediate nodes, and a common root node.
44. The computer program product of claim 43, further comprising program code for adding an identifier of the first and second linguistic encoding types to the terminal nodes of the first and second paths, respectively.
45. The computer program product of claim 44, further comprising program code for adding a sequence of the first domain name to the terminal node of the first path.
46. The computer program product of claim 43, wherein the first path presents the first domain name in reverse order of linguistic digits when moving from the root node to the terminal node.
47. The computer program product of claim 41, wherein each parent node of the encoding detection tree branches into three children nodes at most.
48. The computer program product of claim 41, wherein the program code for adding the first domain name comprises program code for adding a new node to the encoding detection tree for each linguistic digit of the first domain name having a digital sequence that does not appear at a corresponding location in the encoding detection tree.
49. The computer program product of claim 41, wherein the position of the new nodes with respect to existing nodes is determined by comparing the digital sequence of an existing node with the digital sequence of a corresponding linguistic digit from the first domain name.
50. The computer program product of claim 41, wherein the program code for adding the first domain name includes program code for adding to the encoding detection tree a linguistic equivalent node of one of the linguistic digits in the first domain name.
51. An apparatus for creating an encoding detection tree comprising nodes connected by paths, with the paths representing domain names in various encoding types, the apparatus comprising:
one or more processors;
memory coupled to said one or more processors and configured to store a partially created tree structure having multiple nodes connected by paths and having terminal nodes that uniquely identify distinct encoding types; and
an interface configured to receive domain names from a collection of domain names;
wherein the one or more processors are configured or designed to receive representations of digitally represented domain names which are encoded in linguistic encoding types and add those domain names, together with their linguistic encoding types, to the encoding detection tree to create paths through the encoding detection tree.
52. The apparatus of claim 51, wherein the interface is configured to receive domain names from a registry of domain names.
53. The apparatus of claim 51, wherein the paths each comprise separate terminal nodes, one or more intermediate nodes, and a common root node.
54. The apparatus of claim 53, wherein the one or more processors are further designed or configured to add identifiers of the linguistic encoding types to the terminal nodes of the paths.
55. The apparatus of claim 54, wherein the one or more processors are further designed or configured to add a sequence of the domain names to the terminal nodes of the paths.
56. The apparatus of claim 53, wherein the paths present the domain names in reverse order of linguistic digits when moving from the root node to the terminal node.
57. The apparatus of claim 51, wherein each parent node of the encoding detection tree branches into three children nodes at most.
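The tree creation and traversal recited in the claims above can be illustrated with a minimal sketch. It is not the applicants' implementation: it assumes each "linguistic digit" is one byte of the encoded name, it uses a generic ternary search tree (at most three children per node), and the names `Node`, `EncodingDetectionTree`, `add`, and `detect` are hypothetical. Names are stored in reverse order, so the last byte of a name is compared against the root and the terminal node carries the encoding identifier.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    digit: int                       # assumed: one byte of the encoded name
    lo: Optional["Node"] = None      # child for smaller digits at this position
    eq: Optional["Node"] = None      # child for the next digit of the same name
    hi: Optional["Node"] = None      # child for larger digits at this position
    encoding: Optional[str] = None   # set only on terminal nodes


class EncodingDetectionTree:
    """Hypothetical ternary tree keyed on reversed byte sequences."""

    def __init__(self) -> None:
        self.root: Optional[Node] = None

    def add(self, name: bytes, encoding: str) -> bool:
        """Add an encoded name; return False if the path already exists."""
        digits = name[::-1]  # reverse, so the last byte meets the root
        if not digits:
            raise ValueError("empty domain name")
        if self.root is None:
            self.root = Node(digits[0])
        node, i = self.root, 0
        while True:
            d = digits[i]
            if d < node.digit:
                if node.lo is None:
                    node.lo = Node(d)
                node = node.lo
            elif d > node.digit:
                if node.hi is None:
                    node.hi = Node(d)
                node = node.hi
            elif i == len(digits) - 1:
                if node.encoding is not None:
                    return False  # name already present; first entry is kept
                node.encoding = encoding
                return True
            else:
                i += 1
                if node.eq is None:
                    node.eq = Node(digits[i])
                node = node.eq

    def detect(self, name: bytes) -> Optional[str]:
        """Return the stored encoding identifier, or None if unknown."""
        digits = name[::-1]
        node, i = self.root, 0
        while node is not None and i < len(digits):
            d = digits[i]
            if d < node.digit:
                node = node.lo
            elif d > node.digit:
                node = node.hi
            elif i == len(digits) - 1:
                return node.encoding
            else:
                node = node.eq
                i += 1
        return None


tree = EncodingDetectionTree()
tree.add("中文.com".encode("gb2312"), "GB2312")
tree.add("中文.com".encode("big5"), "BIG5")
print(tree.detect("中文.com".encode("big5")))
```

In this sketch a byte sequence that is valid in two encodings keeps the identifier it was first registered with; how such collisions should be resolved is left open here.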
US09/792,438 1999-02-26 2001-02-23 Multi-language domain name service Abandoned US20010025320A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US09/792,438 US20010025320A1 (en) 1999-02-26 2001-02-23 Multi-language domain name service

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US09/258,690 US6314469B1 (en) 1999-02-26 1999-02-26 Multi-language domain name service
US09/792,438 US20010025320A1 (en) 1999-02-26 2001-02-23 Multi-language domain name service

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US09/258,690 Continuation-In-Part US6314469B1 (en) 1999-02-26 1999-02-26 Multi-language domain name service

Publications (1)

Publication Number Publication Date
US20010025320A1 (en) 2001-09-27

Family

ID=22981715

Family Applications (3)

Application Number Title Priority Date Filing Date
US09/258,690 Expired - Lifetime US6314469B1 (en) 1999-02-26 1999-02-26 Multi-language domain name service
US09/792,438 Abandoned US20010025320A1 (en) 1999-02-26 2001-02-23 Multi-language domain name service
US09/823,523 Expired - Lifetime US6446133B1 (en) 1999-02-26 2001-03-30 Multi-language domain name service

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US09/258,690 Expired - Lifetime US6314469B1 (en) 1999-02-26 1999-02-26 Multi-language domain name service

Family Applications After (1)

Application Number Title Priority Date Filing Date
US09/823,523 Expired - Lifetime US6446133B1 (en) 1999-02-26 2001-03-30 Multi-language domain name service

Country Status (10)

Country Link
US (3) US6314469B1 (en)
EP (1) EP1059789A3 (en)
JP (1) JP3492580B2 (en)
KR (1) KR100444757B1 (en)
CN (2) CN1238804C (en)
EA (1) EA002513B1 (en)
HK (2) HK1029418A1 (en)
SG (1) SG91854A1 (en)
TW (1) TW461209B (en)
WO (1) WO2000050966A2 (en)

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020116459A1 (en) * 2001-02-16 2002-08-22 Microsoft Corporation System and method for passing context-sensitive information from a first application to a second application on a mobile device
US20020174186A1 (en) * 2001-05-15 2002-11-21 Koichi Hashimoto Electronic mail typestyle processing device
US20030005157A1 (en) * 1999-11-26 2003-01-02 Edmon Chung Network address server
US20030084038A1 (en) * 2001-11-01 2003-05-01 Verisign, Inc. Transactional memory manager
US20040081434A1 (en) * 2002-10-15 2004-04-29 Samsung Electronics Co., Ltd. Information storage medium containing subtitle data for multiple languages using text data and downloadable fonts and apparatus therefor
US20050044394A1 (en) * 2001-11-09 2005-02-24 Wenhu Wang Method of the information secure
US20050210149A1 (en) * 2004-03-03 2005-09-22 Kimball Jordan L Method, system, and computer useable medium to facilitate name preservation across an unrestricted set of TLDS
US20060036767A1 (en) * 1999-06-22 2006-02-16 Ryan William K Method and apparatus for multiplexing internet domain names
US20070073839A1 (en) * 2000-06-27 2007-03-29 Edmon Chung Electronic Mail Server
US20080071909A1 (en) * 2006-09-14 2008-03-20 Michael Young System and method for facilitating distribution of limited resources
US20080109411A1 (en) * 2006-10-24 2008-05-08 Michael Young Supply Chain Discovery Services
US7584089B2 (en) 2002-03-08 2009-09-01 Toshiba Corporation Method of encoding and decoding for multi-language applications
US20100114559A1 (en) * 2008-10-30 2010-05-06 Yookyung Kim Short text language detection using geographic information
US20110022675A1 (en) * 2008-03-10 2011-01-27 Afilias Limited Platform independent idn e-mail storage translation
US20110054881A1 (en) * 2009-09-02 2011-03-03 Rahul Bhalerao Mechanism for Local Language Numeral Conversion in Dynamic Numeric Computing
US20110106924A1 (en) * 2009-10-30 2011-05-05 Verisign, Inc. Internet Domain Name Super Variants
US20110225246A1 (en) * 2010-03-10 2011-09-15 Afilias Limited Alternate e-mail delivery
US8077974B2 (en) 2006-07-28 2011-12-13 Hewlett-Packard Development Company, L.P. Compact stylus-based input technique for indic scripts
US20180336192A1 (en) * 2017-05-18 2018-11-22 Wipro Limited Method and system for generating named entities
US20200344209A1 (en) * 2011-12-29 2020-10-29 Verisign, Inc. Methods and systems for creating new domains
US11269836B2 (en) * 2019-12-17 2022-03-08 Cerner Innovation, Inc. System and method for generating multi-category searchable ternary tree data structure
US20220247711A1 (en) * 2019-12-11 2022-08-04 CallFire, Inc. Domain management and synchronization system
US11958889B2 (en) 2018-10-24 2024-04-16 Janssen Pharmaceuticals, Inc. Compositions of phosphorylated tau peptides and uses thereof

Families Citing this family (167)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6802068B1 (en) * 1996-10-16 2004-10-05 International Business Machines Corporation Addressless internetworking
US7136932B1 (en) * 1999-03-22 2006-11-14 Eric Schneider Fictitious domain name method, product, and apparatus
US6760746B1 (en) * 1999-09-01 2004-07-06 Eric Schneider Method, product, and apparatus for processing a data request
US6963871B1 (en) * 1998-03-25 2005-11-08 Language Analysis Systems, Inc. System and method for adaptive multi-cultural searching and matching of personal names
US8855998B2 (en) 1998-03-25 2014-10-07 International Business Machines Corporation Parsing culturally diverse names
US8812300B2 (en) 1998-03-25 2014-08-19 International Business Machines Corporation Identifying related names
US6738827B1 (en) * 1998-09-29 2004-05-18 Eli Abir Method and system for alternate internet resource identifiers and addresses
EP1003114A1 (en) * 1998-11-17 2000-05-24 International Business Machines Corporation Method of interconnecting computers and computer network
US6314469B1 (en) * 1999-02-26 2001-11-06 I-Dns.Net International Pte Ltd Multi-language domain name service
US6182148B1 (en) * 1999-03-18 2001-01-30 Walid, Inc. Method and system for internationalizing domain names
US7188138B1 (en) * 1999-03-22 2007-03-06 Eric Schneider Method, product, and apparatus for resource identifier registration and aftermarket services
US9141717B2 (en) * 1999-03-22 2015-09-22 Esdr Network Solutions Llc Methods, systems, products, and devices for processing DNS friendly identifiers
US8037168B2 (en) 1999-07-15 2011-10-11 Esdr Network Solutions Llc Method, product, and apparatus for enhancing resolution services, registration services, and search services
US6338082B1 (en) * 1999-03-22 2002-01-08 Eric Schneider Method, product, and apparatus for requesting a network resource
US8667051B2 (en) 1999-03-22 2014-03-04 Esdr Network Solutions Llc Real-time communication processing method, product, and apparatus
USRE43690E1 (en) 1999-03-22 2012-09-25 Esdr Network Solutions Llc Search engine request method, product, and apparatus
CN1176432C (en) * 1999-07-28 2004-11-17 国际商业机器公司 Method and system for providing national language inquiry service
AU6646200A (en) * 1999-08-30 2001-03-26 Ying Tuo Method and apparatus for using non-english characters in domain names and e-mailaddresses
USRE44207E1 (en) 1999-09-01 2013-05-07 Esdr Network Solutions Llc Network resource access method, product, and apparatus
JP2001125915A (en) * 1999-10-28 2001-05-11 Fujitsu Ltd Information retrieving device
US7107325B1 (en) * 1999-11-15 2006-09-12 Insweb Corporation System and method for optimizing and processing electronic pages in multiple languages
KR100383861B1 (en) * 2000-01-28 2003-05-12 주식회사 한닉 Korean dns system
KR100433982B1 (en) * 2000-02-03 2004-06-04 (주)넷피아닷컴 System for accessing web page using real names and method thereof
US7149964B1 (en) * 2000-02-09 2006-12-12 Microsoft Corporation Creation and delivery of customized content
US6697806B1 (en) * 2000-04-24 2004-02-24 Sprint Communications Company, L.P. Access network authorization
US20020019800A1 (en) * 2000-05-16 2002-02-14 Ideaflood, Inc. Method and apparatus for transacting divisible property
US7076541B1 (en) * 2000-06-05 2006-07-11 Register.Com, Inc. Method and apparatus providing distributed domain management capabilities
WO2002001312A2 (en) * 2000-06-28 2002-01-03 Inter China Network Software Company Limited Method and system of intelligent information processing in a network
US7020602B1 (en) * 2000-08-21 2006-03-28 Kim Ki S Native language domain name registration and usage
FR2813678A1 (en) * 2000-09-07 2002-03-08 Joint Forture Technology Inter Linguistic conversion for domain names used on the Internet, uses name server to associate domain name in non-English language with IP address
US6823184B1 (en) * 2000-09-08 2004-11-23 Fuji Xerox Co., Ltd. Personal digital assistant for generating conversation utterances to a remote listener in response to a quiet selection
US7286649B1 (en) 2000-09-08 2007-10-23 Fuji Xerox Co., Ltd. Telecommunications infrastructure for generating conversation utterances to a remote listener in response to a quiet selection
US7106852B1 (en) 2000-09-08 2006-09-12 Fuji Xerox Co., Ltd. Telephone accessory for generating conversation utterances to a remote listener in response to a quiet selection
US6941342B1 (en) 2000-09-08 2005-09-06 Fuji Xerox Co., Ltd. Method for generating conversation utterances to a remote listener in response to a quiet selection
US7013279B1 (en) 2000-09-08 2006-03-14 Fuji Xerox Co., Ltd. Personal computer and scanner for generating conversation utterances to a remote listener in response to a quiet selection
AU2001296537A1 (en) 2000-10-02 2002-04-15 Enic Corporation Determining alternative textual identifiers, such as for registered domain names
WO2002031702A1 (en) 2000-10-09 2002-04-18 Enic Corporation Registering and using multilingual domain names
US20020083029A1 (en) * 2000-10-23 2002-06-27 Chun Won Ho Virtual domain name system using the user's preferred language for the internet
US6988269B1 (en) * 2000-10-24 2006-01-17 Litton Industries Inc. Employment of instruction in program supported by server application to cause execution of program unsupported by the server application
CA2427266C (en) * 2000-11-01 2005-03-29 Snapnames.Com, Inc. Registry-integrated internet domain name acquisition system
US8583745B2 (en) * 2000-11-16 2013-11-12 Opendesign, Inc. Application platform
US7900143B2 (en) * 2000-12-27 2011-03-01 Intel Corporation Large character set browser
US6393445B1 (en) * 2001-01-04 2002-05-21 Institute For Information Industry System for transforming Chinese character forms in real-time between a traditional character form and a simplified character form
US20020146181A1 (en) * 2001-02-06 2002-10-10 Azam Syed Aamer System, method and computer program product for a multi-lingual text engine
US7809854B2 (en) * 2002-02-12 2010-10-05 Open Design, Inc. Logical routing system
DE10110240B4 (en) * 2001-02-28 2005-07-07 Characterisation Gmbh Method for providing IP addresses to Internet characters containing special characters
EP1364500B1 (en) * 2001-03-02 2005-11-30 Alcatel Internetworking, Inc. Method and apparatus for classifying querying nodes
US7500017B2 (en) * 2001-04-19 2009-03-03 Microsoft Corporation Method and system for providing an XML binary format
US20040044791A1 (en) * 2001-05-22 2004-03-04 Pouzzner Daniel G. Internationalized domain name system with iterative conversion
US7603403B2 (en) * 2001-05-30 2009-10-13 International Business Machines Corporation Localization in distributed computer environments
US20030182447A1 (en) * 2001-05-31 2003-09-25 Schilling Frank T. Generic top-level domain re-routing system
US7194529B2 (en) * 2001-07-12 2007-03-20 Abb Inc. Method and apparatus for the delivery and integration of an asset management system into an existing enterprise network
US6961932B2 (en) * 2001-08-15 2005-11-01 Microsoft Corporation Semantics mapping between different object hierarchies
US7076785B2 (en) * 2001-08-15 2006-07-11 Microsoft Corporation Lazy loading with code conversion
US20030056004A1 (en) * 2001-09-19 2003-03-20 Abb Inc. Method and apparatus for the routing of messages in an asset management system
US7171457B1 (en) * 2001-09-25 2007-01-30 Juniper Networks, Inc. Processing numeric addresses in a network router
US7284056B2 (en) * 2001-10-04 2007-10-16 Microsoft Corporation Resolving host name data
US20030093465A1 (en) * 2001-10-31 2003-05-15 International Business Machines Corporation Management strategies for internationalization in a distributed computer environment
US20030093562A1 (en) * 2001-11-13 2003-05-15 Padala Chandrashekar R. Efficient peer to peer discovery
US7546143B2 (en) 2001-12-18 2009-06-09 Fuji Xerox Co., Ltd. Multi-channel quiet calls
US7565402B2 (en) * 2002-01-05 2009-07-21 Eric Schneider Sitemap access method, product, and apparatus
US7225222B1 (en) * 2002-01-18 2007-05-29 Novell, Inc. Methods, data structures, and systems to access data in cross-languages from cross-computing environments
US7178104B1 (en) * 2002-02-15 2007-02-13 Microsoft Corporation System and method for generating structured documents in a non-linear manner
US20030172119A1 (en) * 2002-03-06 2003-09-11 International Business Machines Corporation Method and system for dynamically sending email notifications with attachments in different communication languages
US20030191647A1 (en) * 2002-04-05 2003-10-09 Kam David M. Method & system for managing web pages, and telecommunications via multilingual keywords and domains
AU2003226952A1 (en) * 2002-04-22 2003-11-03 Thomas Arnfeldt Andersen Digital identity and method of producing same
US20030204553A1 (en) * 2002-04-24 2003-10-30 Eamoon Neylon Information handling system and program for use on a network, and a process of forming a relationship between global resources and local descriptions of those resources
KR100463208B1 (en) * 2002-08-05 2004-12-23 (주)하우앤와이 Internal Natural Domain Service System with Local Name Servers for Flexible Top-Level Domains
US8775675B2 (en) * 2002-08-30 2014-07-08 Go Daddy Operating Company, LLC Domain name hijack protection
US7130878B2 (en) * 2002-08-30 2006-10-31 The Go Daddy Group, Inc. Systems and methods for domain name registration by proxy
US7627633B2 (en) * 2002-08-30 2009-12-01 The Go Daddy Group, Inc. Proxy email method and system
US7426576B1 (en) * 2002-09-20 2008-09-16 Network Appliance, Inc. Highly available DNS resolver and method for use of the same
DE10244747A1 (en) 2002-09-25 2004-04-15 Siemens Ag Medical system architecture for the transfer of data, examination images and messages between imaging units, servers and computers, said system employing a proxy server for data transfer and being suitable for DICOM applications
US7127707B1 (en) 2002-10-10 2006-10-24 Microsoft Corporation Intellisense in project upgrade
US7254642B2 (en) * 2003-01-30 2007-08-07 International Business Machines Corporation Method and apparatus for local IP address translation
US20040205732A1 (en) * 2003-04-11 2004-10-14 Paul Parkinson Cross-platform porting tool for web applications
US7886075B2 (en) * 2003-05-16 2011-02-08 Cisco Technology, Inc. Arrangement for retrieving routing information for establishing a bidirectional tunnel between a mobile router and a correspondent router
US7257592B2 (en) * 2003-06-26 2007-08-14 International Business Machines Corporation Replicating the blob data from the source field to the target field based on the source coded character set identifier and the target coded character set identifier, wherein the replicating further comprises converting the blob data from the source coded character set identifier to the target coded character set identifier
US7877432B2 (en) * 2003-07-08 2011-01-25 The Go Daddy Group, Inc. Reseller program for registering domain names through resellers' web sites
US20050010392A1 (en) * 2003-07-10 2005-01-13 International Business Machines Corporation Traditional Chinese / simplified Chinese character translator
US20050010391A1 (en) * 2003-07-10 2005-01-13 International Business Machines Corporation Chinese character / Pin Yin / English translator
US8137105B2 (en) * 2003-07-31 2012-03-20 International Business Machines Corporation Chinese/English vocabulary learning tool
US20050027547A1 (en) * 2003-07-31 2005-02-03 International Business Machines Corporation Chinese / Pin Yin / english dictionary
US7441228B2 (en) * 2003-09-08 2008-10-21 Sap Ag Design-time representation for a first run-time environment with converting and executing applications for a second design-time environment
EP1692626A4 (en) * 2003-09-17 2008-11-19 Ibm Identifying related names
NO325313B1 (en) * 2003-12-10 2008-03-25 Kurt Arthur Seljeseth Intentional addressing and resource request in computer networks
US8200475B2 (en) * 2004-02-13 2012-06-12 Microsoft Corporation Phonetic-based text input method
US8037203B2 (en) * 2004-02-19 2011-10-11 International Business Machines Corporation User defined preferred DNS reference
US20070005586A1 (en) * 2004-03-30 2007-01-04 Shaefer Leonard A Jr Parsing culturally diverse names
US7769766B1 (en) * 2004-05-24 2010-08-03 Sonicwall, Inc. Method and an apparatus to store content rating information
US8015169B1 (en) 2004-05-24 2011-09-06 Sonicwall, Inc. Method and an apparatus to request web pages and content rating information thereof
US20050278347A1 (en) * 2004-05-26 2005-12-15 Wolf Werner G Method and system for extendable data conversion architecture
CN1965310A (en) * 2004-06-04 2007-05-16 拿丕.Com有限公司 Native language internet address system
US7594236B2 (en) * 2004-06-28 2009-09-22 Intel Corporation Thread to thread communication
US7409334B1 (en) * 2004-07-22 2008-08-05 The United States Of America As Represented By The Director, National Security Agency Method of text processing
US20060020667A1 (en) * 2004-07-22 2006-01-26 Taiwan Semiconductor Manufacturing Company, Ltd. Electronic mail system and method for multi-geographical domains
US9015263B2 (en) 2004-10-29 2015-04-21 Go Daddy Operating Company, LLC Domain name searching with reputation rating
US20060168020A1 (en) 2004-12-10 2006-07-27 Network Solutions, Llc Private domain name registration
US7516062B2 (en) * 2005-04-19 2009-04-07 International Business Machines Corporation Language converter with enhanced search capability
US7603482B2 (en) * 2005-04-22 2009-10-13 Microsoft Corporation DNS compatible PNRP peer name encoding
US7882116B2 (en) * 2005-05-18 2011-02-01 International Business Machines Corporation Method for localization of programming modeling resources
US20070004438A1 (en) * 2005-07-01 2007-01-04 Alec Brusilovsky Method and apparatus enabling PTT (push-to-talk) communications between legacy PSTN, cellular and wireless 3G terminals
US7148824B1 (en) * 2005-08-05 2006-12-12 Xerox Corporation Automatic detection of character encoding format using statistical analysis of the text strings
US7624020B2 (en) * 2005-09-09 2009-11-24 Language Weaver, Inc. Adapter for allowing both online and offline training of a text to text system
CN1859167B (en) * 2005-11-04 2012-08-08 华为技术有限公司 Exciting method for network telephone terminal configuration
KR100713824B1 (en) * 2005-12-30 2007-05-07 위니아만도 주식회사 Method for checking error of a kimchi refrigerator
US7642937B2 (en) * 2006-01-09 2010-01-05 Taiwan Semiconductor Manufacturing Co., Ltd. Character conversion methods and systems
EP2511833B1 (en) 2006-02-17 2020-02-05 Google LLC Encoding and adaptive, scalable accessing of distributed translation models
US8457007B2 (en) * 2006-02-24 2013-06-04 Marvell World Trade Ltd. Global switch resource manager
US9047234B1 (en) * 2006-06-05 2015-06-02 Thomson Reuters (Markets) Llc Data context passing between non-interfaced application programs in a common framework
US20080005671A1 (en) * 2006-07-01 2008-01-03 Ahangama Jayantha Chandrakumar Lossless Romanizing Schemes for Classic Sinhala and Tamil
TW200812819A (en) * 2006-09-15 2008-03-16 Inventec Appliances Corp Method of converting word codes
US7694016B2 (en) * 2007-02-07 2010-04-06 Nominum, Inc. Composite DNS zones
US20080235383A1 (en) * 2007-03-22 2008-09-25 Eric Schneider Methods, Systems, Products, And Devices For Generating And Processing DNS Friendly Identifiers
US8271263B2 (en) * 2007-03-30 2012-09-18 Symantec Corporation Multi-language text fragment transcoding and featurization
CN101690119B (en) * 2007-06-25 2013-11-27 西门子公司 Method for forwarding data in scattered data network
US7996523B2 (en) * 2008-01-17 2011-08-09 Fluke Corporation Free string match encoding and preview
US8463610B1 (en) * 2008-01-18 2013-06-11 Patrick J. Bourke Hardware-implemented scalable modular engine for low-power speech recognition
US7958261B2 (en) * 2008-02-14 2011-06-07 Microsoft Corporation Domain name cache control system generating series of varying nonce-bearing domain names based on a function of time
US7865618B2 (en) * 2008-02-22 2011-01-04 Microsoft Corporation Defeating cache resistant domain name systems
WO2009111526A1 (en) * 2008-03-05 2009-09-11 Taylor Precision Products, Inc. Digital weather station
US7698688B2 (en) * 2008-03-28 2010-04-13 International Business Machines Corporation Method for automating an internationalization test in a multilingual web application
US7814229B1 (en) * 2008-04-04 2010-10-12 Amazon Technologies, Inc. Constraint-based domain name system
TWI383316B (en) * 2008-10-29 2013-01-21 Inventec Appliances Corp Automatic service-providing system and method
US20100169492A1 (en) * 2008-12-04 2010-07-01 The Go Daddy Group, Inc. Generating domain names relevant to social website trending topics
US20100146119A1 (en) * 2008-12-04 2010-06-10 The Go Daddy Group, Inc. Generating domain names relevant to current events
US20100146001A1 (en) * 2008-12-04 2010-06-10 The Go Daddy Group, Inc. Systems for generating domain names relevant to current events
US9026611B2 (en) 2009-03-26 2015-05-05 Nec Corporation DNS name resolution system, override agent, and DNS name resolution method
IL198426A0 (en) * 2009-04-27 2010-02-17 Netanel Raisch Multi-languages idn system
KR101625930B1 (en) * 2009-10-30 2016-06-01 삼성전자 주식회사 Mobile terminal and communicating method of the same
US8706728B2 (en) * 2010-02-19 2014-04-22 Go Daddy Operating Company, LLC Calculating reliability scores from word splitting
US9058393B1 (en) 2010-02-19 2015-06-16 Go Daddy Operating Company, LLC Tools for appraising a domain name using keyword monetary value data
US8515969B2 (en) * 2010-02-19 2013-08-20 Go Daddy Operating Company, LLC Splitting a character string into keyword strings
US8909558B1 (en) 2010-02-19 2014-12-09 Go Daddy Operating Company, LLC Appraising a domain name using keyword monetary value data
US8725815B2 (en) * 2011-03-30 2014-05-13 Afilias Limited Transmitting messages between internationalized email systems and non-internationalized email systems
US9002926B2 (en) 2011-04-22 2015-04-07 Go Daddy Operating Company, LLC Methods for suggesting domain names from a geographic location data
KR101258174B1 (en) * 2011-06-17 2013-04-25 한국항공대학교산학협력단 Automatic encoding detection system
US10565666B2 (en) 2011-09-26 2020-02-18 Verisign, Inc. Protect intellectual property (IP) rights across namespaces
US10237231B2 (en) 2011-09-26 2019-03-19 Verisign, Inc. Multiple provisioning object operation
US9319377B2 (en) * 2011-10-26 2016-04-19 Hewlett-Packard Development Company, L.P. Auto-split DNS
US9515988B2 (en) 2011-10-26 2016-12-06 Aruba Networks, Inc. Device and method for split DNS communications
TW201322247A (en) * 2011-11-23 2013-06-01 Inst Information Industry Device, method and computer readable storage medium for storing the method for displaying multiple language characters
CN104412255A (en) * 2012-06-29 2015-03-11 株式会社战略经营研究所 Document processing system, electronic document, document processing method, and program
US9218335B2 (en) 2012-10-10 2015-12-22 Verisign, Inc. Automated language detection for domain names
CN103037028B (en) * 2012-12-10 2015-09-16 中国科学院计算机网络信息中心 Method and system for DNS resolution supporting variant domain names
CN103037030B (en) * 2012-12-10 2016-01-27 中国科学院计算机网络信息中心 Support the method and system of domain name group dns resolution
CN104182402A (en) * 2013-05-22 2014-12-03 腾讯科技(深圳)有限公司 Browser interface address bar input control method and browser interface address bar input control system
US20140350933A1 (en) * 2013-05-24 2014-11-27 Samsung Electronics Co., Ltd. Voice recognition apparatus and control method thereof
US20150039599A1 (en) * 2013-08-01 2015-02-05 Go Daddy Operating Company, LLC Methods and systems for recommending top level and second level domains
US10437897B2 (en) 2013-08-01 2019-10-08 Go Daddy Operating Company, LLC Methods and systems for recommending packages of domain names for registration
US9684918B2 (en) 2013-10-10 2017-06-20 Go Daddy Operating Company, LLC System and method for candidate domain name generation
US9715694B2 (en) 2013-10-10 2017-07-25 Go Daddy Operating Company, LLC System and method for website personalization from survey data
EP2871819A1 (en) 2013-11-12 2015-05-13 Verisign, Inc. Multiple provisioning object operation
US10140282B2 (en) 2014-04-01 2018-11-27 Verisign, Inc. Input string matching for domain names
CN104980527A (en) * 2014-04-11 2015-10-14 政务和公益机构域名注册管理中心 Analytic method for variant domain name in domain name system (DNS)
US9953105B1 (en) 2014-10-01 2018-04-24 Go Daddy Operating Company, LLC System and method for creating subdomains or directories for a domain name
US9779125B2 (en) 2014-11-14 2017-10-03 Go Daddy Operating Company, LLC Ensuring accurate domain name contact information
US9785663B2 (en) 2014-11-14 2017-10-10 Go Daddy Operating Company, LLC Verifying a correspondence address for a registrant
US9720666B2 (en) * 2015-09-23 2017-08-01 Oracle International Corporation Densely stored strings
US9619588B1 (en) 2016-03-31 2017-04-11 Ca, Inc. Detecting and displaying cumulative lossy transformation
US10579347B2 (en) * 2017-11-03 2020-03-03 International Business Machines Corporation Self re-encoding interpreted application
CN108810187B (en) * 2018-03-01 2021-05-07 赵建文 Network system for butting voice service through block chain
CN108111547B (en) * 2018-03-06 2021-03-19 深圳互联先锋科技有限公司 Domain name health monitoring method and system
CN109542507A (en) * 2018-10-26 2019-03-29 深圳点猫科技有限公司 GBK code processing method and electronic equipment based on educational system
US11374901B2 (en) * 2020-09-24 2022-06-28 Apple Inc. Network address compression for electronic devices
CN113420570A (en) * 2021-07-01 2021-09-21 沈阳创思佳业科技有限公司 Method, system and device for improving translation accuracy
CN116306391B (en) * 2023-02-28 2024-01-02 师细会 Character string processing system and method for integrated circuit design

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5721895A (en) * 1992-03-17 1998-02-24 International Business Machines Corporation Computer program product and program storage device for a data transmission dictionary for encoding, storing, and retrieving hierarchical data processing information for a computer system
US5784071A (en) * 1995-09-13 1998-07-21 Apple Computer, Inc. Context-based code convertor
US5784069A (en) * 1995-09-13 1998-07-21 Apple Computer, Inc. Bidirectional code converter
US5793381A (en) * 1995-09-13 1998-08-11 Apple Computer, Inc. Unicode converter
US6104711A (en) * 1997-03-06 2000-08-15 Bell Atlantic Network Services, Inc. Enhanced internet domain name server
US6108703A (en) * 1998-07-14 2000-08-22 Massachusetts Institute Of Technology Global hosting system
US6125395A (en) * 1999-10-04 2000-09-26 Piiq.Com, Inc. Method for identifying collections of internet web sites with domain names
US6131095A (en) * 1996-12-11 2000-10-10 Hewlett-Packard Company Method of accessing a target entity over a communications network
US6154777A (en) * 1996-07-01 2000-11-28 Sun Microsystems, Inc. System for context-dependent name resolution
US6161008A (en) * 1998-11-23 2000-12-12 Nortel Networks Limited Personal mobility and communication termination for users operating in a plurality of heterogeneous networks
US6182119B1 (en) * 1997-12-02 2001-01-30 Cisco Technology, Inc. Dynamically configurable filtered dispatch notification system
US6182148B1 (en) * 1999-03-18 2001-01-30 Walid, Inc. Method and system for internationalizing domain names
US6314469B1 (en) * 1999-02-26 2001-11-06 I-Dns.Net International Pte Ltd Multi-language domain name service

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US579381A (en) * 1897-03-23 Skirt-placket fastener
US5319776A (en) * 1990-04-19 1994-06-07 Hilgraeve Corporation In transit detection of computer virus with safeguard
DK170490B1 (en) * 1992-04-28 1995-09-18 Multi Inform As Data Processing Plant
WO1997010556A1 (en) * 1995-09-13 1997-03-20 Apple Computer, Inc. Unicode converter
US6115378A (en) * 1997-06-30 2000-09-05 Sun Microsystems, Inc. Multi-layer distributed network element
AUPO977997A0 (en) 1997-10-14 1997-11-06 Pouflis, Jason The utilisation of multi-lingual names on the internet

Patent Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5721895A (en) * 1992-03-17 1998-02-24 International Business Machines Corporation Computer program product and program storage device for a data transmission dictionary for encoding, storing, and retrieving hierarchical data processing information for a computer system
US5784071A (en) * 1995-09-13 1998-07-21 Apple Computer, Inc. Context-based code convertor
US5784069A (en) * 1995-09-13 1998-07-21 Apple Computer, Inc. Bidirectional code converter
US5793381A (en) * 1995-09-13 1998-08-11 Apple Computer, Inc. Unicode converter
US6154777A (en) * 1996-07-01 2000-11-28 Sun Microsystems, Inc. System for context-dependent name resolution
US6131095A (en) * 1996-12-11 2000-10-10 Hewlett-Packard Company Method of accessing a target entity over a communications network
US6104711A (en) * 1997-03-06 2000-08-15 Bell Atlantic Network Services, Inc. Enhanced internet domain name server
US6182119B1 (en) * 1997-12-02 2001-01-30 Cisco Technology, Inc. Dynamically configurable filtered dispatch notification system
US6108703A (en) * 1998-07-14 2000-08-22 Massachusetts Institute Of Technology Global hosting system
US6161008A (en) * 1998-11-23 2000-12-12 Nortel Networks Limited Personal mobility and communication termination for users operating in a plurality of heterogeneous networks
US6314469B1 (en) * 1999-02-26 2001-11-06 I-Dns.Net International Pte Ltd Multi-language domain name service
US6446133B1 (en) * 1999-02-26 2002-09-03 I-Dns.Net International Pte Ltd. Multi-language domain name service
US6182148B1 (en) * 1999-03-18 2001-01-30 Walid, Inc. Method and system for internationalizing domain names
US6125395A (en) * 1999-10-04 2000-09-26 Piiq.Com, Inc. Method for identifying collections of internet web sites with domain names

Cited By (38)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8543732B2 (en) * 1999-06-22 2013-09-24 William Kenneth Ryan Method and apparatus for multiplexing internet domain names
US20060036767A1 (en) * 1999-06-22 2006-02-16 Ryan William K Method and apparatus for multiplexing internet domain names
US7136901B2 (en) * 1999-11-26 2006-11-14 Neteka Inc. Electronic mail server
US20030005157A1 (en) * 1999-11-26 2003-01-02 Edmon Chung Network address server
US20030046353A1 (en) * 1999-11-26 2003-03-06 Edmon Chung Electronic mail server
US20070073839A1 (en) * 2000-06-27 2007-03-29 Edmon Chung Electronic Mail Server
US20020116459A1 (en) * 2001-02-16 2002-08-22 Microsoft Corporation System and method for passing context-sensitive information from a first application to a second application on a mobile device
US7325032B2 (en) * 2001-02-16 2008-01-29 Microsoft Corporation System and method for passing context-sensitive information from a first application to a second application on a mobile device
US20020174186A1 (en) * 2001-05-15 2002-11-21 Koichi Hashimoto Electronic mail typestyle processing device
US8630988B2 (en) 2001-11-01 2014-01-14 Verisign, Inc. System and method for processing DNS queries
US20030084038A1 (en) * 2001-11-01 2003-05-01 Verisign, Inc. Transactional memory manager
US20070100808A1 (en) * 2001-11-01 2007-05-03 Verisign, Inc. High speed non-concurrency controlled database
US20090106211A1 (en) * 2001-11-01 2009-04-23 Verisign, Inc. System and Method for Processing DNS Queries
US20050044394A1 (en) * 2001-11-09 2005-02-24 Wenhu Wang Method of the information secure
US7584089B2 (en) 2002-03-08 2009-09-01 Toshiba Corporation Method of encoding and decoding for multi-language applications
US20040081434A1 (en) * 2002-10-15 2004-04-29 Samsung Electronics Co., Ltd. Information storage medium containing subtitle data for multiple languages using text data and downloadable fonts and apparatus therefor
US20100266265A1 (en) * 2002-10-15 2010-10-21 Samsung Electronics Co., Ltd. Information storage medium containing subtitle data for multiple languages using text data and downloadable fonts and apparatus therefor
US20050210149A1 (en) * 2004-03-03 2005-09-22 Kimball Jordan L Method, system, and computer useable medium to facilitate name preservation across an unrestricted set of TLDS
US8077974B2 (en) 2006-07-28 2011-12-13 Hewlett-Packard Development Company, L.P. Compact stylus-based input technique for indic scripts
US9344379B2 (en) 2006-09-14 2016-05-17 Afilias Limited System and method for facilitating distribution of limited resources
US20080071909A1 (en) * 2006-09-14 2008-03-20 Michael Young System and method for facilitating distribution of limited resources
US20080109411A1 (en) * 2006-10-24 2008-05-08 Michael Young Supply Chain Discovery Services
US20110022675A1 (en) * 2008-03-10 2011-01-27 Afilias Limited Platform independent idn e-mail storage translation
US20100114559A1 (en) * 2008-10-30 2010-05-06 Yookyung Kim Short text language detection using geographic information
US8548797B2 (en) * 2008-10-30 2013-10-01 Yahoo! Inc. Short text language detection using geographic information
US20110054881A1 (en) * 2009-09-02 2011-03-03 Rahul Bhalerao Mechanism for Local Language Numeral Conversion in Dynamic Numeric Computing
US9454514B2 (en) * 2009-09-02 2016-09-27 Red Hat, Inc. Local language numeral conversion in numeric computing
US8341252B2 (en) 2009-10-30 2012-12-25 Verisign, Inc. Internet domain name super variants
US20110106924A1 (en) * 2009-10-30 2011-05-05 Verisign, Inc. Internet Domain Name Super Variants
US20110225246A1 (en) * 2010-03-10 2011-09-15 Afilias Limited Alternate e-mail delivery
US20200344209A1 (en) * 2011-12-29 2020-10-29 Verisign, Inc. Methods and systems for creating new domains
US20180336192A1 (en) * 2017-05-18 2018-11-22 Wipro Limited Method and system for generating named entities
US10467346B2 (en) * 2017-05-18 2019-11-05 Wipro Limited Method and system for generating named entities
US11958889B2 (en) 2018-10-24 2024-04-16 Janssen Pharmaceuticals, Inc. Compositions of phosphorylated tau peptides and uses thereof
US20220247711A1 (en) * 2019-12-11 2022-08-04 CallFire, Inc. Domain management and synchronization system
US11706189B2 (en) * 2019-12-11 2023-07-18 CallFire, Inc. Domain management and synchronization system
US11269836B2 (en) * 2019-12-17 2022-03-08 Cerner Innovation, Inc. System and method for generating multi-category searchable ternary tree data structure
US11748325B2 (en) 2019-12-17 2023-09-05 Cerner Innovation, Inc. System and method for generating multicategory searchable ternary tree data structure

Also Published As

Publication number Publication date
CN1238804C (en) 2006-01-25
KR20000076575A (en) 2000-12-26
EP1059789A3 (en) 2003-08-13
JP3492580B2 (en) 2004-02-03
TW461209B (en) 2001-10-21
KR100444757B1 (en) 2004-08-16
EP1059789A2 (en) 2000-12-13
EA002513B1 (en) 2002-06-27
CN1812407A (en) 2006-08-02
WO2000050966A2 (en) 2000-08-31
EA200000136A2 (en) 2000-08-28
JP2000253067A (en) 2000-09-14
WO2000050966A3 (en) 2000-12-14
US6446133B1 (en) 2002-09-03
HK1096499A1 (en) 2007-06-01
CN1812407B (en) 2012-03-21
HK1029418A1 (en) 2001-03-30
US20010047429A1 (en) 2001-11-29
CN1266237A (en) 2000-09-13
SG91854A1 (en) 2002-10-15
EA200000136A3 (en) 2001-04-23
US6314469B1 (en) 2001-11-06

Similar Documents

Publication Publication Date Title
US20010025320A1 (en) Multi-language domain name service
US6560596B1 (en) Multiscript database system and method
Faltstrom et al. Internationalizing domain names in applications (IDNA)
US8874624B2 (en) Method and system for seamlessly accessing remotely stored files
JP4518529B2 (en) Methods and systems for domain name internationalization
AU631276B2 (en) Name resolution in a directory database
JP2502021B2 (en) Multibyte data conversion method and system
US20040044791A1 (en) Internationalized domain name system with iterative conversion
US20030115040A1 (en) International (multiple language/non-english) domain name and email user account ID services system
KR20010103670A (en) Method and system for accessing information on a network using message aliasing functions having shadow callback functions
US6049869A (en) Method and system for detecting and identifying a text or data encoding system
US7272792B2 (en) Kana-to-kanji conversion method, apparatus and storage medium
RU2251729C2 (en) Method and device for registering and using domain names on native language
KR20010066754A (en) system for using domain names in the user's preferred language on the internet
Sun Internationalization of the handle system-a persistent global name service
AU4003700A (en) Multi-language domain name service
JP2003521844A (en) System and method for communicating across various communication applications using a single address string
Dry et al. The Internet: an introduction
WO2001090955A2 (en) Internationalized domain name system with iterative conversion
Lin et al. Variant chinese domain name resolution
Larsen Internet with an Accent: Towards a Standardization of Diacritics
Pan et al. LDAP Cross-searching for Traditional and Simplified Chinese
Haddouti et al. Towards Arabic Rendering Issues—MHTML Approach
Seng et al. iDNS-The Next Big Step in the Internet Saga
Ksar ISO/IEC 10646

Legal Events

Date Code Title Description
AS Assignment

Owner name: I-DNS.NET INTERNATIONAL PTE LTD., SINGAPORE

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:SENG, CHING HONG;JUN, YIN;MINGLIANG, JIANG;REEL/FRAME:011820/0477

Effective date: 20010424

AS Assignment

Owner name: I-DNS.NET INTERNATIONAL PTE LTD., SINGAPORE

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:SENG, CHING HONG;YIN, JUN;JIANG, MINGLIANG;REEL/FRAME:012574/0856

Effective date: 20010424

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION