WO2001017197A2 - Providing state information in a stateless data communication protocol - Google Patents

Publication number
WO2001017197A2
Authority
WO
WIPO (PCT)
Prior art keywords
site
state information
name
server
names
Prior art date
Application number
PCT/EP2000/005693
Other languages
French (fr)
Other versions
WO2001017197A3 (en)
Inventor
Olaf Walkowiak
Paul Sponagl
Original Assignee
Sevenval Ag
Priority date
Filing date
Publication date
Priority claimed from EP99116993A (EP1081612B1)
Application filed by Sevenval Ag
Priority to AU62641/00A (AU6264100A)
Publication of WO2001017197A2
Publication of WO2001017197A3

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00 Network arrangements, protocols or services for addressing or naming
    • H04L61/35 Network arrangements, protocols or services for addressing or naming involving non-standard use of addresses for implementing network functionalities, e.g. coding subscription information within the address or functional addressing, i.e. assigning an address to a function
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00 Network arrangements, protocols or services for addressing or naming
    • H04L61/45 Network directories; Name-to-address mapping
    • H04L61/4505 Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols
    • H04L61/4511 Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols using domain name system [DNS]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2101/00 Indexing scheme associated with group H04L61/00
    • H04L2101/30 Types of network names
    • H04L2101/35 Types of network names containing special prefixes

Definitions

  • the present invention concerns the field of data transmission via stateless data communication protocols.
  • the data may, for example, include hypertext documents, and the data communication protocol may be a protocol used in the World Wide Web (WWW).
  • Use of the invention is intended for all applications in which the provision of state information is desired in the context of stateless data communication protocols.
  • possible applications of the present invention are in the fields of electronic commerce or online databases or online dictionaries or online games.
  • EP 0 812 088 A2 discloses a method for preserving state in a stateless network protocol.
  • all documents sent to a client are modified by embedding the state information in the hyperlinks of the documents.
  • the state information is encoded as a CGI call in the file path specification of the hyperlink.
  • This method requires an elaborate converter program for modifying the documents requested by the client. The converter program must parse these documents and modify the hyperlinks contained therein. This process requires a large amount of computing power. Furthermore, it is difficult to adapt the converter program to new document markup languages or complex document structures.
  • An object of the present invention therefore is to avoid the above-mentioned problems.
  • a further object is to create a way of providing state information in a stateless data communication protocol with very little effort.
  • Yet a further object is to keep the necessary programming work at a minimum while allowing maximum flexibility.
  • Still a further object is that the invention should be usable with a wide variety of data communication protocols, server programs, and document markup languages.
  • a further object is that use of the invention should keep the server load low when processing user requests.
  • the invention is based on the fundamental idea to encode the state information not in a file path specification of a hyperlink, but in the site name. This is a radical departure from prior art approaches and practice.
  • site names have been considered as a rare commodity, such that the seemingly very wasteful idea of encoding state information in site names has not been considered.
  • the present invention overcomes this prejudice by teaching several ways for efficient data communication where state information is encoded in site names, wherein each of a cluster of site names designates the same server site.
  • the present invention teaches to make a single server site accessible by a variety ("cluster") of site names, wherein each possible state information corresponds to at least one site name in the cluster.
  • the state information is provided to the server site each time the server site is accessed using the full site name, and the state information is preserved between subsequent access actions as long as the full site name is used by the client.
  • the client further extracts the state information from the received site name for immediate or possible later use.
  • the site name may be an internet host name comprising an internet domain and further information. Since the normally used internet browsers store the full site name, the state information is provided to the server each time an internet page at the server site is accessed.
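  • As a minimal sketch of the idea, the following Python fragment builds a site name from a state value and recovers the state from a received site name. The domain "foo.de" is taken from the sample embodiment; the function names and the choice of placing the state in front of the domain are assumptions made only for illustration.

```python
BASE_DOMAIN = "foo.de"  # assumed second and top level domain of the server site


def build_site_name(state: str, base_domain: str = BASE_DOMAIN) -> str:
    """Return a site name of the cluster that carries `state` in its lower level labels."""
    return f"{state}.{base_domain}"


def extract_state(site_name: str, base_domain: str = BASE_DOMAIN):
    """Return the state part of `site_name`, or None if no state part is present."""
    name = site_name.lower().rstrip(".")
    suffix = "." + base_domain
    if not name.endswith(suffix):
        return None  # bare domain or a name outside the cluster
    return name[: -len(suffix)] or None
```

  • Because the browser keeps using the full site name, `extract_state` can be applied to every request without touching the delivered documents.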
  • a "site" is a location at which a plurality of documents or files or other data may potentially be accessed. This means that the path name of a single document is not considered to refer to a site.
  • An example of a site is an internet host name, i.e., a hierarchically structured domain. Even if no documents are actually present, the possibility exists to store and access a plurality of documents at the server site addressed by the host name, said documents being distinguished by their respective file or path names.
  • a "stateless protocol" is in particular a data transmission protocol that does not have the notion of "connections" and therefore does not provide direct information about the beginning, the continuation or the end of a user session.
  • an internet protocol like, for example, the HTTP protocol (hypertext transfer protocol) is used.
  • the HTTP protocol is defined in the proposed standard document RFC 2616 by R. Fielding et al. This document is available at the internet address www.w3c.org, and its contents are hereby incorporated in their entirety.
  • the present invention is used in the context of internet communication wherein a plurality of information pages is accessed by the user. By maintaining the state information, individual users and/or user sessions may be distinguished and identified. This is helpful even if the accessed documents are not modified in response to the state information because it allows tracking of the path the user chooses when browsing through the available documents (so-called "session tracking").
  • the state information influences the response of the server in some way.
  • This influence may be, for example, that the server outputs different pages in response to the state information (e.g., an order confirmation page if address data is available for the user identified by the state, and an address entry form otherwise), or that information in a page is modified in accordance with the state information (e.g., the total price of all goods in a shopping basket is shown).
  • some programming may of course be necessary to implement the desired dependency of the information provided by the server from the state information.
  • the present invention may be used in connection with the transmission of all kinds of data, but applications are preferred in which documents and in particular hypertext documents are concerned.
  • hypertext documents may be written in any kind of markup language like SGML (standard generalized markup language) or HTML (hypertext markup language) or XML (extensible markup language).
  • the documents may be generated by any mechanism including script processing of PHP (see www.php.net) or ASP (active server page) code.
  • the state-preserving hyperlinks in the documents must of course be written in a way that the encoded state information is not destroyed. For example, hyperlinks pointing to the same server site will not include a host name, but just a (absolute or relative) file path specification (unless the state shall be deleted when the hyperlink is followed).
  • the present invention has substantial advantages over the method known from EP 0 812 088 A2 since the transmitted documents do not have to be analyzed and modified for enabling the state preservation feature. Therefore, no special provisions are normally required for using other markup or scripting languages or document authoring tools. This is especially important in the internet field because of its rapid development and the multitude of presently available and possible future extensions.
  • the present invention can readily be used with streaming data formats and extensions and plugins like those known under the trademarks Flash and Shockwave available at www.macromedia.com.
  • the server site is identified by an address, preferably an internet IP address, and a nameserver is used for translating the site name referring to the server site to the corresponding address.
  • the nameserver is preferably configured in a way that all site names of the site name cluster are mapped to the same address, such that the same server site is accessed irrespective of the encoded state information.
  • the site name containing the encoded state information is preferably transmitted to the server in a header field according to the HTTP protocol. This header field may be the host header field or the referrer header field or another suitable header field.
  • the server may then extract and decode the state information from the full site name. This preferred embodiment, of course, requires that the server site is configured such that communication requests carrying any site name from the site name cluster are accepted.
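  • How a server might read the full site name from the HTTP host header field can be sketched as follows. This is an illustrative Python fragment, not the implementation of the embodiments; the function name is hypothetical, and the simple port handling assumes host names rather than IPv6 literals.

```python
def host_from_request(raw_request: str):
    """Return the value of the Host header of a raw HTTP request, or None if absent."""
    header_block = raw_request.split("\r\n\r\n", 1)[0]
    for line in header_block.split("\r\n")[1:]:  # skip the request line
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "host":
            # Drop an optional ":port" suffix (assumes no IPv6 literal).
            return value.strip().split(":")[0]
    return None
```

  • The returned site name can then be decoded into state information by whatever encoding the server site uses.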
  • a single server site may be designated by a few different second level domain names, and the choice of the actual name out of this predetermined set of name portions may confer the whole or parts of the state information.
  • the number of possible states is very high (e.g., more than 100 or more than 10000 or more than 1000000), such that it would not be possible in practice to register a corresponding number of second level domain names with a centralized registration authority.
  • the first communication request issued by the client may already contain the state information encoded in the site name.
  • the user may have bookmarked the full site name (including the state information) in an earlier session, or he/she may have received it by e-mail, or he/she might have copied it from a printed advertisement.
  • Means are provided in preferred embodiments of the invention for determining whether or not a valid state information is present in the site name received at the server site. If no valid state information is present (e.g., because the lifetime of the state information has expired or because the state information has an invalid format or because the user typed in a "standard" site name not containing any state information), a new identifying state information is created in some embodiments. A redirection instruction to the site name containing the new state information may then be sent to the client. Additionally or as an alternative, the user may be asked to register, e.g., by filling out some registration form.
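  • The decision described above (serve the request if valid state information is present, otherwise create new state information and redirect) might look roughly like the following Python sketch. The identifier format (32 hexadecimal characters from `uuid4`) and the function name are assumptions chosen for illustration, not the format used in the embodiments.

```python
import re
import uuid

BASE_DOMAIN = "foo.de"                      # assumed domain of the server site
STATE_FORMAT = re.compile(r"[0-9a-f]{32}")  # hypothetical format of a valid identifier


def check_site_name(site_name: str):
    """Return ("serve", state) for valid state, else ("redirect", new_site_name)."""
    suffix = "." + BASE_DOMAIN
    name = site_name.lower()
    state = name[: -len(suffix)] if name.endswith(suffix) else ""
    if STATE_FORMAT.fullmatch(state):
        return ("serve", state)
    # No valid state (e.g. the user typed "www.foo.de"): create a new identifier.
    new_state = uuid.uuid4().hex
    return ("redirect", new_state + suffix)
```

  • A registration form, as mentioned as an alternative, would simply be another branch at the point where the redirect target is produced.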
  • the state information may be any kind of information in any encoding.
  • the state information is an identifier associated with the user or with the current user session. This identifier may be used for accessing a user record contained in a database at the side of the server. Full personal information about the user and/or the current session may be stored in the user record.
  • This kind of identifying state information, which acts as a key to a further database access, is normally invariable during a plurality of request/reply events or even during the whole user session or the whole time the user is registered with a service provider. In some embodiments, however, the state information is modified during a user session in order to reflect, for example, the contents of a shopping basket or a changing score in an online game. Arbitrarily long user sessions are possible, and the state information may be changed as often as desired. Any change of the state information corresponds to a change of the current site name within the site name cluster.
  • the state information is not restricted to being merely an identifier. Generally, any kind of data in any encoding may be contained in the state information.
  • the encoding may be such that the state information is expressed in a very compact way within the limitations of the character set permissible in site names.
  • the user might prefer a human-readable encoding of the state information.
  • a human-readable encoding should be avoided in order to prevent unauthorized access.
  • meaningful text, e.g., advertising slogans or phrases containing human-readable information
  • a numerical state code N could be represented by the N-th text phrase contained in a predetermined codebook.
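  • The codebook idea can be illustrated with a short Python sketch. The phrases below are invented examples, and the sketch counts from zero, so that state code N selects the entry at index N.

```python
# Hypothetical codebook; each phrase must be usable as a host name label.
CODEBOOK = ["fresh-deals-today", "best-prices-in-town", "new-arrivals-weekly"]


def encode_state(n: int) -> str:
    """Represent the numerical state code n by the n-th phrase of the codebook."""
    return CODEBOOK[n]


def decode_state(phrase: str):
    """Return the numerical state code for a phrase, or None if it is not in the codebook."""
    try:
        return CODEBOOK.index(phrase)
    except ValueError:
        return None
```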
  • the state information may be encoded very compactly, and some text message may be put in front of it.
  • Common internet browsers display the site name in a so-called "URL line". If the site name is longer than the length of the "URL line", only the initial portion of the site name is shown.
  • the site name may be an internet host name. This is a hierarchical name structure comprising a top level domain, a second level domain and possibly further, lower level domains. Often the top and the second level domains are just referred to as "domain".
  • An internet host name does not contain any file path part specifying a file name within the server site identified by the host name.
  • the internet URI http://www.foo.de/prd.html comprises the host name www.foo.de (containing the domain foo.de) and the file path specification (in this case a simple file name) prd.html.
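  • The decomposition of the sample URI can be reproduced with Python's standard `urllib.parse` module. The helper name is hypothetical, and taking the last two labels as the "domain" is a simplification that fits the sample domain foo.de but not every top level domain.

```python
from urllib.parse import urlsplit


def split_uri(uri: str):
    """Return (host name, domain, file path specification) of an HTTP URI."""
    parts = urlsplit(uri)
    host = parts.hostname or ""
    labels = host.split(".")
    domain = ".".join(labels[-2:])  # top and second level domain (simplified)
    return host, domain, parts.path
```

  • Note that the path is returned in its absolute form, with a leading "/".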
  • the different parts of an internet URI (uniform resource identifier) are described in more detail in the HTTP standard mentioned above.
  • the site name may be an internet IP address or IP number according to version 6 of the IP standard containing more than 32 bits.
  • the IP numbering system may also be considered as a hierarchical naming scheme. In other preferred embodiments, however, all site names in the site name cluster are mapped to a single (or at most a few) IP numbers. This is especially preferred if IP numbers according to the present versions of the IP standard are used since the number space for such 32 bit IP numbers is rather limited.
  • the present invention may also be used in some embodiments for maintaining state information when the user changes from one server site to another server site.
  • the part of the site name identifying the server site is changed, but the part containing the state information is maintained.
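  • A sketch of such a hand-over between server sites, written in Python under the assumption that the state occupies the labels in front of the site's domain, might look as follows. The function name is hypothetical; "bar.de" is one of the sample domains mentioned for the server.

```python
def move_state_to_site(site_name: str, old_site: str, new_site: str) -> str:
    """Keep the state part of `site_name` but exchange the part identifying the site."""
    suffix = "." + old_site
    if not site_name.endswith(suffix):
        raise ValueError(f"{site_name!r} does not belong to {old_site!r}")
    state = site_name[: -len(suffix)]
    return f"{state}.{new_site}"
```

  • A link from a shop at foo.de to another site would then use the returned name as its host part, so that the state travels with the user.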
  • a shopping site may contain a link to an internet payment agency, and details of the payment may be encoded in the site name of this link.
  • the state information may, in some embodiments, comprise validity information like a verification number or some timestamp or some data defining the validity period of the state information.
  • the state information may be considered invalid if the predetermined validity period has expired. If there is no lifespan limitation, the state information may be regarded as a "timeless cookie" that can be used over and over again if the user, for example, bookmarks an URI containing the state information.
  • the computer program product and the apparatus of the present invention preferably also comprise the features mentioned above and/or in the dependent claims in connection with the inventive method.
  • Fig. 1 is a representation of two hypertext documents
  • Fig. 2 is a block diagram of some of the components involved when accessing a hypertext document
  • Fig. 3 is a flow diagram of a prior art method for data transmission using a stateless data communication protocol
  • Fig. 4a and Fig. 4b are message sequence diagrams of a first sample embodiment of the present invention in which state information is provided and preserved.
  • Fig. 1 depicts two hypertext documents 10, 12 having the filenames "index.html" and "prd.html", respectively. Both hypertext documents 10, 12 are written in the page description language HTML, but any other markup language that may or may not comprise hyperlinks may also be used.
  • the first hypertext document 10 ("index.html") comprises contents 14 and a hyperlink 16 pointing to the second hypertext document 12 ("prd.html"), which in turn comprises contents 18.
  • the hyperlink 16 consists of an anchor tag <A ...> ... </A> containing identifying text ("product") and a HREF attribute.
  • the value of the HREF attribute is a file name or a file path specification, i.e. an address not containing a host name.
  • Fig. 2 shows an arrangement generally used for accessing and viewing hypertext documents.
  • a client 20 is provided by a general purpose computer of a user executing a well-known browser program.
  • the browser program may be one of the internet browsers available under the trademarks "Netscape Navigator" or "Microsoft Internet Explorer", and the user's computer may be a standard personal computer.
  • the client 20 accesses a server 22 via a computer network 24, for example the internet.
  • a stateless communication protocol, in the present sample embodiment the HTTP protocol supported by TCP, is used for data communication. In this protocol, requests sent by the client 20 are answered by replies of the server 22.
  • the protocol is called "stateless" because subsequent requests and replies do not unambiguously refer to each other (the HTTP referrer header may be ambiguous).
  • the server 22 is a powerful general purpose computer having a hardware and software configuration known per se.
  • the server 22 may use the well-known Apache software available at www.apache.org.
  • the server 22 comprises a control unit 26 communicating with a data storage unit 28 for storing documents like the hypertext documents 10, 12.
  • the server 22 provides a plurality of server sites 30.
  • Each of the server sites 30 is identified with one top and second level domain like, for example, the domains "foo.de", "baz.de" and "bar.de".
  • Each server site 30 can be considered as a virtual server in itself, having its own unique IP number and providing access to all documents stored at any path location of the corresponding domain.
  • the control unit 26 analyzes and processes the requests of the browser or client 20, accesses, if necessary, documents in the data storage unit 28, and generates and sends corresponding replies to the client 20.
  • the client 20 may further access a nameserver 32 via a network 24'.
  • the nameserver 32 is used for mapping host names into the corresponding IP numbers.
  • the nameserver 32 is a complex distributed computer network, and one of the computers contained therein is shown in Fig. 2 with reference numeral 34.
  • the nameserver 32 implements the internet DNS (Domain Name System).
  • the network 24' is also part of the internet and transmits hostname lookup requests using the UDP protocol.
  • the operation of the nameserver 32 is well known per se.
  • Fig. 2 shows, as an example, the client 20 displaying the first hypertext document 10 in a browser window 36.
  • the contents 14 are displayed as formatted text.
  • the identifying text enclosed in the anchor tags of the hyperlink 16 is shown underlined for designating the presence of a hyperlink. If the user performs a mouse click 38 (shown in Fig. 2 as a dotted arrow) on the hyperlink 16, the corresponding hypertext document 12 is called and will be displayed in the window 36.
  • Fig. 3 shows, as an example, a communication sequence according to the prior art between the client 20 and the server site 30.
  • the sequence starts by the user typing the host name "www.foo.de" into an entry field of the browser (user action 40).
  • the client 20 sends a hostname lookup query to the nameserver 32 in sending action 42.
  • the preliminary steps of identifying the relevant nameserver 32 have not been shown here.
  • the nameserver 32 accesses a zonefile definition in which the host name "www.foo.de" is associated with a particular IP number, in the present example the IP number 192.168.4.10 (step 44). This IP number is sent back to the client 20 (sending action 46). The client 20 then opens a TCP connection to this IP number and sends a first request 50 to the server site 30 (action 48).
  • the first request 50 comprises the request line "GET / HTTP/1.1" and a host header field containing the requested host name, i.e. "www.foo.de", as its value.
  • the method "GET" designates the kind of request, and "HTTP/1.1" designates the version of the HTTP communication protocol.
  • the interposed character "/" designates an empty file name (file path specification), meaning that the predetermined main page of the server site 30 is requested.
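  • For illustration, the request described above can be assembled as plain text. The following Python sketch shows only the request line and the host header field; a real browser sends further header fields, and the function name is hypothetical.

```python
def build_get_request(host: str, path: str = "/") -> str:
    """Return a minimal HTTP/1.1 GET request carrying the site name in the Host header."""
    return f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n"
```

  • Calling `build_get_request("www.foo.de")` yields the request line "GET / HTTP/1.1" followed by "Host: www.foo.de", which corresponds to the first request 50.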
  • the server 22 accesses the first hypertext document 10 ("index.html") as the main page of the server site 30 (step 52).
  • This first hypertext document 10 ("index.html") is sent to the client 20 in sending action 54, and its contents 14 are displayed by the browser running on the client 20 in the browser window 36 (step 56). This has already been illustrated in Fig. 2.
  • a second request 62 is sent to the server site 30 in sending step 60.
  • This second request 62 is similar to the first request 50, but it comprises the file name "prd.html" in the absolute file path specification "/prd.html". This file name has been obtained by the browser from the argument of the HREF attribute of the hyperlink 16.
  • the server 22 receives the second request 62 directed to the IP number of the server site 30 and in response accesses the second hypertext document 12 ("prd.html") in step 64.
  • This document 12 is sent to the client 20 in step 66, and the contents 18 of document 12 are displayed in the browser window 36 in step 68.
  • the sample run shown in Fig. 4a again starts with a user action 70 in which the user wishes to access the server site named "www.foo.de".
  • a hostname lookup is performed in steps 72 and 74.
  • These steps correspond to steps 42 and 44 of Fig. 3, with the exception that the DNS zonefile of the nameserver 32 contains an entry specifying that all host names having a top level domain "de" and a second level domain "foo" are to be mapped to the IP number 192.168.4.10 irrespective of the third and any lower level domains given in the host name.
  • The lookup therefore yields the IP number 192.168.4.10, which is sent to the client 20 in sending step 76.
  • the browser running on the client 20 generates a first request 80, which is exactly the same as the request 50 shown in Fig. 3, and sends the request 80 to the server site 30 in sending step 78.
  • Upon receipt of the request 80 at the server site 30, it is first checked in test 82 whether or not the host name supplied as the value of the HTTP host header field contains valid state information.
  • the state information would have a predetermined format and would be contained in the third level domain part of the hostname.
  • This third level domain part of the first request 80 is "www", which is not considered to be a valid encoded state information. Therefore the "no" branch of test 82 is chosen, and a new state information (called “id” in the present sample embodiment) is generated in step 84.
  • the state information "id” in the present sample embodiment is a unique alphanumerical identifier.
  • the generated state information "id" is embedded in a new site name. Since, in the present sample embodiment, the site name cluster identifying the server site 30 comprises all host names having the top level domain "de", the second level domain "foo" and arbitrary lower level domains, a wide variety of encodings is possible.
  • the sample encoding shown in the present embodiment is a simple textual concatenation using "w” as the fourth level domain part and the state information "id" as the third level domain part.
  • the new host name containing the state information "id" therefore reads "w.id.foo.de".
  • a redirection command 90 is sent to the client 20 in the status line of the HTTP reply (status code 3xx).
  • the redirection target is the new host name "w.id.foo.de" containing the state information.
  • Receiving the redirection command 90 causes the client 20 to disregard any further contents of the reply and to direct any further requests to the server site identified by the new host name.
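  • A redirection reply of this kind can be sketched in Python as follows. The text only specifies a status code of the 3xx class; the choice of "302 Found", the "http" scheme and the function name are assumptions made for the example.

```python
def build_redirect(new_host: str, path: str = "/") -> str:
    """Return a minimal HTTP reply redirecting the client to the new site name."""
    location = f"http://{new_host}{path}"
    return (
        "HTTP/1.1 302 Found\r\n"   # assumed member of the 3xx status class
        f"Location: {location}\r\n"
        "Content-Length: 0\r\n"
        "\r\n"
    )
```

  • The client then repeats its request against the host named in the Location field, which is how the state-carrying site name enters the browser's URL entry field.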
  • the new host name is also displayed in the browser's URL entry field. All in all, the effect of receiving the redirect command 90 is essentially the same as if the user had typed the new host name "w.id.foo.de" into the browser's URL entry field.
  • the client 20 will now perform a new host name lookup procedure using the new host name "w.id.foo.de".
  • This host name is sent to the nameserver 32 in sending action 92.
  • the nameserver 32 is configured in a way to disregard any further host name parts if only the top and second level domains "foo.de" are present. Therefore the DNS lookup step 94 will yield the IP number 192.168.4.10 also for the new host name "w.id.foo.de". This IP number is transferred to the client 20 in step 96.
  • the client 20 next generates a second request 100 containing the new host name "w.id.foo.de" in the HTTP host header field.
  • This request 100 is sent via the TCP protocol to the IP number 192.168.4.10.
  • the server site 30 is configured in a way corresponding to the zonefile configuration of the nameserver 32. In more detail, the server site 30 will accept all requests directed to the server site's IP number if only the top and second level domain fields of the host name are "foo.de". Lower level domain fields are disregarded.
  • the original intention behind this feature was that several distinct server sites could possibly be accessed using a single IP number because of the present shortage of IP numbers. It is an inventive merit of the present inventors to have found a novel way of using this feature for implementing the present invention.
  • the first hypertext document 10 is accessed at the server site 30 (step 102) and sent to the client 20 (step 104).
  • the contents 14 of this document are displayed in the browser window 36 in step 106.
  • Upon receipt of the HTTP request 112, a check is made by the server site 30 whether or not valid state information is encoded in the host name supplied in the HTTP host header field (test 114). This is assumed in the present situation, and the "yes" branch of test 114 is followed.
  • the further steps of accessing the second hypertext document 12 (step 116), sending it to the client 20 (step 118) and displaying its contents 18 (step 120) are identical to steps 64, 66 and 68 shown in Fig. 3. Again, it should be noted that no real-time processing of the second hypertext document 12 is necessary for maintaining the state information, although such processing may be performed for other reasons.
  • a second, third, fourth and fifth sample embodiment of the present invention will now be described showing the actual programming code used in possible implementations.
  • a suitable DNS zonefile record is required that maps all site names contained in the site name cluster into a single IP number.
  • Such a zonefile record for the sample top and second level domain "foo.de" is shown in the following.
  • the parenthesized numbers at the left hand side are just line numbers used for easy reference and are not part of the actual files:
  • line 10 is the standard header and lines 11-15 contain standard timing information.
  • Line 16 is the so-called A record for the domain foo.de, line 17 contains nameserver information, and line 18 contains mail exchange information.
  • the definition in line 19 is the decisive one with respect to the present invention. This definition configures the nameserver 32 such that every host name lookup query for any hostname XXXX.foo.de (where XXXX is an arbitrary character sequence) will be answered by the IP number 192.168.4.10.
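  • The effect of such a wildcard definition can be imitated with a small Python lookup table. This is a simplified model for illustration only: the table contents are assumptions based on the description (in particular that the bare domain maps to the same IP number), and real DNS wildcard matching follows more detailed rules.

```python
# Hypothetical zone data: one record for the bare domain and one wildcard record.
ZONE = {
    "foo.de": "192.168.4.10",
    "*.foo.de": "192.168.4.10",
}


def resolve(host_name: str):
    """Return the IP number for `host_name`, honouring wildcard entries, or None."""
    name = host_name.lower().rstrip(".")
    if name in ZONE:
        return ZONE[name]
    labels = name.split(".")
    # Strip labels from the left until a matching wildcard entry is found.
    for i in range(1, len(labels)):
        wildcard = "*." + ".".join(labels[i:])
        if wildcard in ZONE:
            return ZONE[wildcard]
    return None
```

  • With this table, "www.foo.de" and "w.id.foo.de" resolve to the same IP number, while names outside the domain are not resolved.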
  • This embodiment uses version 1.3.6 or higher of an Apache server program containing the optional modules mod_php and mod_unique.
  • the Apache webserver is configured to provide a virtual host (corresponding to the server site 30) for the IP number 192.168.4.10 (lines 30-33 and 39).
  • Line 33 defines that this virtual host answers HTTP requests directed to any site name ending in "foo.de” on the HTTP host header field.
  • this configuration corresponds to that of the nameserver 32 in that the information contained in the third and lower level domain fields of the supplied host name is disregarded.
  • Line 35 specifies that any transmitted document will be processed by a PHP interpreter program. This is important since the functionality of PHP will be used for implementing the invention in the present embodiment.
  • Lines 34 and 36-38 contain administrative information less relevant for understanding the present invention:
  • the PHP interpreter program is configured such that a PHP script contained in the file "prepend.inc" will be added before any file that is delivered by the server. This functionality is achieved by the following line 40 in the PHP configuration file "php-ini”:
  • auto_prepend_file = "/usr/local/shop/www/docs/prepend.inc"
  • line 51 extracts the encoded state information from the host name supplied in the HTTP host name field (which is available via the variable $HTTP_HOST).
  • the validity of the state information is checked in lines 52-55 (corresponding to tests 82 and 114 in Fig. 4a and Fig. 4b). In particular, a test as to the format of the state information is made in line 52. Only if the encoded state information comprises exactly 30 characters, a timestamp is extracted in line 54 and a check is made whether or not the lifetime of the state information (10800 seconds in the present example) has expired.
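  • The validity test can be sketched in Python. The length of 30 characters and the lifetime of 10800 seconds are taken from the text; where the timestamp sits inside the state information is not stated, so the layout used here (the first 10 characters hold a decimal Unix timestamp) is purely an assumption, as is the function name.

```python
import time

STATE_LENGTH = 30      # required length of the encoded state information (from the text)
LIFETIME = 10800       # lifetime in seconds (from the text)
TIMESTAMP_DIGITS = 10  # assumed layout: leading decimal Unix timestamp


def is_valid_state(state: str, now=None) -> bool:
    """Check the format of the state information and whether its lifetime has expired."""
    if len(state) != STATE_LENGTH:
        return False
    stamp = state[:TIMESTAMP_DIGITS]
    if not (stamp.isascii() and stamp.isdigit()):
        return False
    now = time.time() if now is None else now
    return 0 <= now - int(stamp) <= LIFETIME
```

  • Dropping the final comparison and returning True after the format test would give the unlimited validity period of the alternative embodiments.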
  • Line 54 can be omitted in alternative embodiments in order to obtain an unlimited validity period of the state information.
  • If no valid state information has been found (line 56, corresponding to the "no" branch of test 82), a new host name containing a newly generated unique identifier is sent back to the client 20 in a redirect command (line 60, corresponding to steps 84 and 86).
  • Lines 57-59 contain additional HTTP header values for preventing an undesired caching on the side of the internet browser. In the case of a redirection, processing of the client's request ends in line 61.
  • $SERVER_NAME = strtr($SERVER_NAME, "*", " ");
  • the code in lines 71-72 tests for the presence of encoded state information (having a length of 19 characters) in either the HTTP host header (variable $HTTP_HOST) or the HTTP referrer header (variable $HTTP_REFERER). If such state information is found, it is extracted in lines 71-72 and stored in the variables $HTTP_COOKIE (only if the HTTP cookie header has been empty; see line 73) and $SESSION_ID (line 74). Processing then continues by accessing and sending the requested file to the client 20 (corresponding to steps 116, 118 in Fig. 4b).
  • the extracted state information can be used to modify the file in any desired way using, e.g., the PHP processing possibilities.
  • Lines 77-80 are executed if no encoded state information has been detected. In this case, a uniquely identifying string having 19 characters is obtained (line 77). This string is further encoded in line 77 by mapping "@" characters to "." characters because the former are not permitted in host names according to the RFC 1035 and RFC 2616 standards. A similar encoding of the server name is made in line 78. Line 79 contains the redirection command to the newly created site name. Processing of the request ends in line 80. All in all, these steps correspond to steps 84-88 shown in Fig. 4a.
  • This embodiment is a very general possibility of implementing the present invention in the context of all kinds of page description languages like ASP, SHTML and so on. It is assumed in this embodiment that an internet shop using cookies shall be converted to the method of the present invention.
  • An Apache server program having the optional module "mod_rewrite" is used.
  • a zonefile record is used as defined in lines 10-20 above.
  • the virtual host definition of the Apache program is assumed to contain lines 30-34 and 36-39 given above.
  • the following lines 90-96 are inserted into the virtual host definition at an appropriate place:
  • the host name contained in the HTTP host header field is analyzed and any state information contained therein is stored in the variable SESSION_ID.
  • the second rewrite rule (lines 93 and 94) removes and stores any cookie information for further reference.
  • the third and last rewrite rule (lines 95 and 96) concerns the case that no valid state information is found in the host name. Then a redirection command to a newly generated site name is issued. The letter "R" at the end of line 96 indicates the redirection command, and the letter "L" indicates that there are no further rewrite rules. This completes the description of the fourth sample embodiment.
  • the fifth sample embodiment of the invention is similar to the fourth one. It also uses rewrite rules and allows a very general application of the present invention in connection with a wide variety of page description languages.
  • the difference to the fourth sample embodiment is that lines 91-96 shown above are replaced by the following lines 100-129.
  • lines 110-125 are identical to line 109 and have therefore been omitted for the sake of brevity
  • the mode of operation of the fifth sample embodiment corresponds to that of the third sample embodiment shown above, but rewrite rules have been used instead of PHP commands.
  • Encoded state information stored in either the HTTP referrer or the HTTP host header field is found and extracted into the $SESSION_ID variable in lines 100-102. If such information has been found, it is stored in the $HTTP_COOKIE variable (if $HTTP_COOKIE is empty; lines 103-105), and program execution ends (directive "L" in lines 105 and 107). If no valid state information has been found, a new unique identifier is generated (line 108) and encoded to bring it into conformity with RFC 1035 name space requirements (lines 109-128). Line 129 is identical to line 96 described above and causes redirection to the newly generated host name. This concludes the description of the fifth sample embodiment.
  • the procedure call or variable access "unique_id" was used to obtain a unique identifier.
  • this procedure call or variable access returns a tightly encoded identifier that is generally meaningless for human beings.
  • a different procedure call or variable access is used for obtaining a unique identifier that additionally comprises some meaningful text or consists of meaningful text. For example, an advertising slogan or a greeting message or some information for the user may be contained in the returned identifier.
  • the meaningful text may be unique in that it is selected from a large phrasebook, or it may be made unique by concatenating it with the results of the "unique_id" procedure call or variable access.
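
The combination of meaningful text with a unique identifier described in the last items may be illustrated by the following minimal Python sketch. The phrasebook entries, the function name and the use of a random UUID as a stand-in for the "unique_id" call are assumptions made for illustration only and are not taken from the original listing.

```python
import uuid

# Hypothetical phrasebook; the description only requires "a large phrasebook".
PHRASES = ["welcome-to-our-shop", "free-shipping-today", "thank-you-for-visiting"]

def readable_identifier(counter: int) -> str:
    """Return a meaningful phrase made unique by an appended identifier."""
    phrase = PHRASES[counter % len(PHRASES)]
    unique = uuid.uuid4().hex  # assumed stand-in for the "unique_id" call
    return f"{phrase}-{unique}"
```

With these sample phrases the result stays below the limit of 63 characters that RFC 1035 imposes on a single host name label.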

Abstract

In a method for providing state information in a stateless data communication protocol, said state information being provided between a client and a server site, said server site being accessible at each of a cluster of site names, one site name of said cluster of site names is used for accessing said server site, said site name containing the encoded state information. A computer program product and an apparatus comprise corresponding features. The invention creates a way of providing state information in a stateless data communication protocol with very little effort.

Description

PROVIDING STATE INFORMATION IN A STATELESS DATA COMMUNICATION PROTOCOL
COPYRIGHT NOTICE
A portion of the disclosure of this document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office files or records, but otherwise reserves all copyright rights whatsoever.
FIELD OF THE INVENTION
The present invention concerns the field of data transmission via stateless data communication protocols. The data may, for example, include hypertext documents, and the data communication protocol may be a protocol used in the World Wide Web (WWW). Use of the invention is intended for all applications in which the provision of state information is desired in the context of stateless data communication protocols. For example, possible applications of the present invention are in the fields of electronic commerce or online databases or online dictionaries or online games.
BACKGROUND OF THE INVENTION
EP 0 812 088 A2 discloses a method for preserving state in a stateless network protocol. In this method, all documents sent to a client are modified by embedding the state information in the hyperlinks of the documents. In each modified hyperlink, the state information is encoded as a CGI call in the file path specification of the hyperlink. This method requires an elaborate converter program for modifying the documents requested by the client. The converter program must parse these documents and modify the hyperlinks contained therein. This process requires a high amount of computing power. Furthermore, it is difficult to adapt the converter program to new document markup languages or complex document structures.
Another method wherein the state information is embedded into the file path specification of all qualifying hyperlinks is described in European patent application no. 98 120 671.7 filed by the present inventors on 05 November 1998. The entire contents of this earlier application are hereby incorporated by reference. This method also requires that the server is customized in a rather complex way.
Further methods for embedding the state information in hyperlink file path fields are known. These methods use database functions on the side of the server to individually create each document delivered to a user according to predetermined rules. Defining these rules and programming the database are very complex tasks that require the expensive work of specialist programmers.
US 5,774,670 shows the use of so-called "cookies" for storing state information in the user's computer under control of the server. This method, however, is not very popular because of the alleged or real risk that data stored in the user's computer is accessed or modified without authorization. For this reason many users choose to configure their browser in a way that no cookies will be accepted.
OBJECTS AND SUMMARY OF THE INVENTION
An object of the present invention therefore is to avoid the above-mentioned problems. A further object is to create a way of providing state information in a stateless data communication protocol with very little effort. Yet a further object is to keep the necessary programming work at a minimum while allowing maximum flexibility. Still a further object is that the invention should be usable with a wide variety of data communication protocols, server programs, and document markup languages. A further object is that use of the invention should keep the server load low when processing user requests. According to the invention, these and other objects are solved by the method, the computer program product and the apparatus defined in the independent claims. The dependent claims concern preferred embodiments of the invention.
The invention is based on the fundamental idea to encode the state information not in a file path specification of a hyperlink, but in the site name. This is a radical departure from prior art approaches and practice. Up to now, site names have been considered as a rare commodity, such that the seemingly very wasteful idea of encoding state information in site names has not been considered. The present invention overcomes this prejudice by teaching several ways for efficient data communication where state information is encoded in site names, wherein each of a cluster of site names designates the same server site.
In other words, the present invention teaches to make a single server site accessible by a variety ("cluster") of site names, wherein each possible state information corresponds to at least one site name in the cluster. The state information is provided to the server site each time the server site is accessed using the full site name, and the state information is preserved between subsequent access actions as long as the full site name is used by the client. The client further extracts the state information from the received site name for immediate or possible later use. In the example of WWW servers and browsers, the site name may be an internet host name comprising an internet domain and further information. Since the normally used internet browsers store the full site name, the state information is provided to the server each time an internet page at the server site is accessed.
Thus the present invention offers a very convenient and flexible way of providing state information. The programming work needed to implement the present invention is extremely small. A few lines of code are sufficient to configure the communication network and to modify well-known server programs for performing the method of the invention. The client program and the accessed documents do not have to be modified at all. In the terminology used herein, a "site" is a location at which a plurality of documents or files or other data may potentially be accessed. This means that the path name of a single document is not considered to refer to a site. An example of a site is an internet host name, i.e., a hierarchically structured domain. Even if no documents are actually present, the possibility exists to store and access a plurality of documents at the server site addressed by the host name, said documents being distinguished by their respective file or path names.
A "stateless protocol" is in particular a data transmission protocol that does not have the notion of "connections" and therefore does not provide direct information about the beginning, the continuation or the end of a user session. In preferred embodiments of the invention, an internet protocol like, for example, the HTTP protocol (hypertext transfer protocol) is used. The HTTP protocol is defined in the proposed standard document RFC 2616 by R. Fielding et al. This document is available at the internet address www.w3c.org, and its contents are hereby incorporated in their entirety.
If the present invention is used in the context of internet communication wherein a plurality of information pages is accessed by the user while maintaining the state information, individual users and/or user sessions may be distinguished and identified. This is helpful even if the accessed documents are not modified in response to the state information because it allows tracking of the path the user chooses when browsing through the available documents (so-called "session tracking").
In preferred embodiments of the present invention, however, the state information influences the response of the server in some way. This influence may be, for example, that the server outputs different pages in response to the state information (e.g., an order confirmation page if address data is available for the user identified by the state, and an address entry form otherwise), or that information in a page is modified in accordance with the state information (e.g., the total price of all goods in a shopping basket is shown). In such embodiments, some programming may of course be necessary to implement the desired dependency of the information provided by the server from the state information.
Even if the transmitted documents are modified to some extent depending on the state information, the computing load put on the server for providing a document is usually very low. This is in complete contrast to the method known from EP 0 812 088 A2 where each delivered document must be parsed and analyzed in order to find and modify all hyperlinks.
The present invention may be used in connection with the transmission of all kinds of data, but applications are preferred in which documents and in particular hypertext documents are concerned. Such hypertext documents may be written in any kind of markup language like SGML (standard generalized markup language) or HTML (hypertext markup language) or XML (extensible markup language). Furthermore, the documents may be generated by any mechanism including script processing of PHP (see www.php.net) or ASP (active server page) code. The state-preserving hyperlinks in the documents must of course be written in a way that the encoded state information is not destroyed. For example, hyperlinks pointing to the same server site will not include a host name, but just a (absolute or relative) file path specification (unless the state shall be deleted when the hyperlink is followed).
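
The effect of hyperlinks without a host name can be illustrated with Python's standard URL resolution, which follows the same rules a browser applies. The state value "abc123" and the page names are purely illustrative assumptions.

```python
from urllib.parse import urljoin, urlsplit

# Page currently displayed; "abc123" is a made-up state value in the host name.
current_page = "http://w.abc123.foo.de/index.html"

# Relative and absolute file path specifications keep the current host name.
relative_target = urljoin(current_page, "prd.html")
absolute_path_target = urljoin(current_page, "/prd.html")

# A hyperlink that names a host replaces it, so the encoded state is dropped.
with_host_target = urljoin(current_page, "http://www.foo.de/prd.html")
```

Only the third form loses the state, which matches the case in which the state shall be deleted when the hyperlink is followed.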
Again, the present invention has substantial advantages over the method known from EP 0 812 088 A2 since the transmitted documents do not have to be analyzed and modified for enabling the state preservation feature. Therefore, no special provisions are normally required for using other markup or scripting languages or document authoring tools. This is especially important in the internet field because of its rapid development and the multitude of presently available and possible future extensions. The present invention can readily be used with streaming data formats and extensions and plugins like those known under the trademarks Flash and Shockwave available at www.macromedia.com.
In preferred embodiments of the invention, the server site is identified by an address, preferably an internet IP address, and a nameserver is used for translating the site name referring to the server site to the corresponding address. The nameserver is preferably configured in a way that all site names of the site name cluster are mapped to the same address, such that the same server site is accessed irrespective of the encoded state information. When the server site is contacted, the site name containing the encoded state information is preferably transmitted to the server in a header field according to the HTTP protocol. This header field may be the host header field or the referrer header field or another suitable header field. The server may then extract and decode the state information from the full site name. This preferred embodiment, of course, requires that the server site is configured such that communication requests carrying any site name from the site name cluster are accepted.
It is one of the merits of the inventors to have shown that the configuration of both the nameserver and the server site in the way described in the previous paragraph are possible with very little programming effort. This is especially true in preferred embodiments in which a hierarchical site name system is used. In this case, it is preferred that all site names in the cluster of site names coincide with respect to the highest hierarchy level and possibly one or more lower hierarchy levels. For example, the top and second level part of an internet host name may be used to identify the server site, while lower level parts are used for holding the state information. It is also possible to use some levels of the hierarchical naming scheme for both purposes at the same time. For example, a single server site may be designated by a few different second level domain names, and the choice of the actual name out of this predetermined set of name portions may confer the whole or parts of the state information. In preferred embodiments, the number of possible states is very high (e.g., more than 100 or more than 10000 or more than 1000000), such that it would not be possible in practice to register a corresponding number of second level domain names with a centralized registration authority. Generally, the first communication request issued by the client may already contain the state information encoded in the site name. For example, the user may have bookmarked the full site name (including the state information) in an earlier session, or he/she may have received it by e-mail, or he/she might have copied it from a printed advertisement. Means are provided in preferred embodiments of the invention for determining whether or not a valid state information is present in the site name received at the server site. 
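
The division of a hierarchical site name into a part identifying the server site and a part available for state information may be sketched as follows. This is a minimal illustration under the assumption that the top and second level parts "foo.de" identify the site; the function name is hypothetical.

```python
from typing import Optional, Tuple

# Assumed site-identifying part: second level "foo", top level "de".
SITE_LABELS = ("foo", "de")

def state_labels(host: str) -> Optional[Tuple[str, ...]]:
    """Return the lower-level labels of a name in the cluster, or None
    if the host name does not belong to the cluster at all."""
    labels = tuple(host.lower().rstrip(".").split("."))
    if len(labels) < len(SITE_LABELS) or labels[-len(SITE_LABELS):] != SITE_LABELS:
        return None
    return labels[:-len(SITE_LABELS)]
```

For a name such as "w.id.foo.de" the function returns the labels "w" and "id", which the server may then examine for valid state information.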
If no valid state information is present (e.g., because the lifetime of the state information has expired or because the state information has an invalid format or because the user typed in a "standard" site name not containing any state information), a new identifying state information is created in some embodiments. A redirection instruction to the site name containing the new state information may then be sent to the client. Additionally or as an alternative, the user may be asked to register, e.g., by filling out some registration form.
The state information may be any kind of information in any encoding. In preferred embodiments of the invention, the state information is an identifier associated with the user or with the current user session. This identifier may be used for accessing a user record contained in a database at the side of the server. Full personal information about the user and/or the current session may be stored in the user record. This kind of identifying status information, which acts as a key to a further database access, is normally invariable during a plurality of request/reply events or even during the whole user session or the whole time the user is registered with a service provider. In some embodiments, however, the state information is modified during a user session in order to reflect, for example, the contents of a shopping basket or a changing score in an online game. Arbitrarily long user sessions are possible, and the state information may be changed as often as desired. Any change of the state information corresponds to a change of the current site name within the site name cluster.
The state information is not restricted to being merely an identifier. Generally, any kind of data in any encoding may be contained in the state information. The encoding may be such that the state information is expressed in a very compact way within the limitations of the character set permissible in site names. However, it is also possible to use an encoding that presents some meaningful information to the user when the site name (containing the encoded state information) is shown in the URL field of common browsers. For example, the user might prefer a human-readable encoding of the state information. In other applications, a human-readable encoding (that may be easy to guess or memorize) should be avoided in order to prevent unauthorized access.
In some embodiments, meaningful text (e.g., advertising slogans or phrases containing human-readable information) is used to encode the state information. For example, a numerical state code N could be represented by the N-th text phrase contained in a predetermined codebook. It is also possible to use a variety of different phrases for encoding one and the same state information. For example, the state information may be encoded very compactly, and some text message may be put in front of it. Common internet browsers display the site name in a so-called "URL line". If the site name is longer than the length of the "URL line", only the initial portion of the site name is shown. Thus, by prepending a comparatively long text message in front of the site name, only the text message is shown to the user, and the possibly confusing state information (as well as further parts of the site name) may effectively be hidden. The idea of using a site name containing some meaningful text portion that plays no role in designating the actual server site is also considered as an invention in its own right, regardless of whether or not state information is also encoded into the site name.
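
The codebook encoding and the prepended text message may be sketched in a few lines of Python. The codebook contents, the function names and the sample site name are assumptions chosen for illustration.

```python
# Hypothetical codebook: state code N is represented by the N-th phrase.
CODEBOOK = [
    "best-prices-in-town",
    "new-arrivals-every-week",
    "thanks-for-shopping-with-us",
]

def encode_state(code: int) -> str:
    """Map a numerical state code to its phrase."""
    return CODEBOOK[code]

def decode_state(phrase: str) -> int:
    """Recover the numerical state code from a phrase."""
    return CODEBOOK.index(phrase)

def with_message(message: str, site_name: str) -> str:
    """Put a text message in front of a site name as an additional label."""
    return f"{message}.{site_name}"
```

In the second variant the message carries no state of its own; it merely occupies the visible initial portion of the site name.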
In preferred embodiments, the site name may be an internet host name. This is a hierarchical name structure comprising a top level domain, a second level domain and possibly further, lower level domains. Often the top and the second level domains are just referred to as "domain". An internet host name does not contain any file path part specifying a file name within the server site identified by the host name. For example, the internet URI http://www.foo.de/prd.html comprises the host name www.foo.de (containing the domain foo.de) and the file path specification (in this case a simple file name) prd.html. The different parts of an internet URI (uniform resource identifier) are described in more detail in the HTTP standard mentioned above.
In other preferred embodiments, the site name may be an internet IP address or IP number according to version 6 of the IP standard containing more than 32 bits. The IP numbering system may also be considered as a hierarchical naming scheme. In other preferred embodiments, however, all site names in the site name cluster are mapped to a single (or at most a few) IP numbers. This is especially preferred if IP numbers according to the present versions of the IP standard are used since the number space for such 32 bit IP numbers is rather limited.
The present invention may also be used in some embodiments for maintaining state information when the user changes from one server site to another server site. In this case, the part of the site name identifying the server site is changed, but the part containing the state information is maintained. For example, a shopping site may contain a link to an internet payment agency, and details of the payment may be encoded in the site name of this link.
The state information may, in some embodiments, comprise validity information like a verification number or some timestamp or some data defining the validity period of the state information. The state information may be considered invalid if the predetermined validity period has expired. If there is no lifespan limitation, the state information may be regarded as a "timeless cookie" that can be used over and over again if the user, for example, bookmarks a URI containing the state information.
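
A timestamp-based validity check may be sketched as follows. The layout of the state string (a unique part, a hyphen, and the issue time in hexadecimal) is an assumption for illustration only; the lifetime of 10800 seconds is the example value mentioned for the PHP-based embodiment.

```python
import time
from typing import Optional

LIFETIME_SECONDS = 10800  # example lifetime; None models a "timeless cookie"

def make_state(unique: str, now: Optional[float] = None) -> str:
    """Append the issue time to a unique part (assumed layout)."""
    issued = int(time.time() if now is None else now)
    return f"{unique}-{issued:x}"

def is_valid(state: str, now: Optional[float] = None,
             lifetime: Optional[int] = LIFETIME_SECONDS) -> bool:
    """Check the format and, if a lifetime is set, the age of the state."""
    unique, sep, stamp = state.rpartition("-")
    if not sep or not unique:
        return False
    try:
        issued = int(stamp, 16)
    except ValueError:
        return False
    if lifetime is None:
        return True
    current = time.time() if now is None else now
    return 0 <= current - issued <= lifetime
```

Passing lifetime=None skips the age test and thereby corresponds to state information without a lifespan limitation.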
The computer program product and the apparatus of the present invention preferably also comprise the features mentioned above and/or in the dependent claims in connection with the inventive method.
Further objects and advantages will become apparent from the drawings and the following detailed description.
DETAILED DESCRIPTION OF SAMPLE EMBODIMENTS
Five sample embodiments of the present invention and several further alternative embodiments will now be explained in more detail. Reference is made to the schematic drawings, in which:
Fig. 1 is a representation of two hypertext documents,
Fig. 2 is a block diagram of some of the components involved when accessing a hypertext document,
Fig. 3 is a flow diagram of a prior art method for data transmission using a stateless data communication protocol, and
Fig. 4a and Fig. 4b are message sequence diagrams of a first sample embodiment of the present invention in which state information is provided and preserved.
Fig. 1 depicts two hypertext documents 10, 12 having the filenames "index.html" and "prd.html", respectively. Both hypertext documents 10, 12 are written in the page description language HTML, but any other markup language that may or may not comprise hyperlinks may also be used. The first hypertext document 10 ("index.html") comprises contents 14 and a hyperlink 16 pointing to the second hypertext document 12 ("prd.html"), which in turn comprises contents 18. The hyperlink 16 consists of an anchor tag <A ...> ... </A> containing identifying text ("product") and a HREF attribute. The value of the HREF attribute is a file name or a file path specification, i.e. an address not containing a host name. The data structures depicted in Fig. 1 are known per se and are described in detail, e.g., in the book "The HTML Sourcebook" by Ian S. Graham, John Wiley & Sons, New York, 1995.
Fig. 2 shows an arrangement generally used for accessing and viewing hypertext documents. A client 20 is provided by a general purpose computer of a user executing a well-known browser program. For example, the browser program may be one of the internet browsers available under the trademarks "Netscape Navigator" or "Microsoft Internet Explorer", and the user's computer may be a standard personal computer. The client 20 accesses a server 22 via a computer network 24, for example the internet. A stateless communication protocol, in the present sample embodiment the HTTP protocol supported by TCP, is used for data communication. In this protocol, requests sent by the client 20 are answered by replies of the server 22. The protocol is called "stateless" because subsequent requests and replies do not unambiguously refer to each other (the HTTP referrer header may be ambiguous).
The server 22 is a powerful general purpose computer having a hardware and software configuration known per se. For example, the server 22 may use the well-known Apache software available at www.apache.org. At the hardware level, the server 22 comprises a control unit 26 communicating with a data storage unit 28 for storing documents like the hypertext documents 10, 12. At the functional level, the server 22 provides a plurality of server sites 30. Each of the server sites 30 is identified with one top and second level domain like, for example, the domains "foo.de", "baz.de" and "bar.de". Each server site 30 can be considered as a virtual server in itself, having its own unique IP number and providing access to all documents stored at any path location of the corresponding domain. For each server site 30, the control unit 26 analyzes and processes the requests of the browser or client 20, accesses, if necessary, documents in the data storage unit 28, and generates and sends corresponding replies to the client 20.
The client 20 may further access a nameserver 32 via a network 24'. The nameserver 32 is used for mapping host names into the corresponding IP numbers. The nameserver 32 is a complex distributed computer network, and one of the computers contained therein is shown in Fig. 2 with reference numeral 34. In the present sample embodiment, the nameserver 32 implements the internet DNS (domain name system). The network 24' is also part of the internet and transmits hostname lookup requests using the UDP protocol. The operation of the nameserver 32 is well known per se.
Fig. 2 shows, as an example, the client 20 displaying the first hypertext document 10 in a browser window 36. The contents 14 are displayed as formatted text. The identifying text enclosed in the anchor tags of the hyperlink 16 is shown underlined for designating the presence of a hyperlink. If the user performs a mouse click 38 (shown in Fig. 2 as a dotted arrow) on the hyperlink 16, the corresponding hypertext document 12 is called and will be displayed in the window 36.
At the level of abstraction of Fig. 2, this drawing represents prior art systems as well as embodiments of the present invention.
Fig. 3 shows, as an example, a communication sequence according to the prior art between the client 20 and the server site 30. The sequence starts by the user typing the host name "www.foo.de" into an entry field of the browser (user action 40). In response to this user command, the client 20 sends a hostname lookup query to the nameserver 32 in sending action 42. For the sake of simplicity, the preliminary steps of identifying the relevant nameserver 32 have not been shown here. These preliminary steps are well-known per se and are not the subject of the present invention.
In response to receiving the lookup query, the nameserver 32 accesses a zonefile definition in which the host name "www.foo.de" is associated with a particular IP number, in the present example the IP number 192.168.4.10 (step 44). This IP number is sent back to the client 20 (sending action 46). The client 20 then opens a TCP connection to this IP number and sends a first request 50 to the server site 30 (action 48). The first request 50 comprises the request line "get / http/1.1" and a host header field containing the requested host name, i.e. "www.foo.de", as its value. In the request line the method "get" designates the kind of request, and "http/1.1" designates the version of the HTTP communication protocol. The interposed character "/" designates an empty file name (file path specification) meaning that the predetermined main page of the server site 30 is requested.
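
The structure of such a request may be made concrete by the following Python sketch, which composes the request line and the host header field in the upper-case spelling used on the wire and reads the host header field back. The function names are hypothetical.

```python
from typing import Optional

def build_request(host: str, path: str = "/") -> str:
    """Compose a minimal HTTP/1.1 request: request line plus host header."""
    return f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n"

def host_header(request: str) -> Optional[str]:
    """Return the value of the host header field, if present."""
    for line in request.split("\r\n")[1:]:
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "host":
            return value.strip()
    return None
```

The host header field is the place in which the full site name, and hence any state information encoded therein, reaches the server.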
In response to the first request 50, the server 22 accesses the first hypertext document 10 ("index.html") as the main page of the server site 30 (step 52). This first hypertext document 10 ("index.html") is sent to the client 20 in sending action 54, and its contents 14 are displayed by the browser running on the client 20 in the browser window 36 (step 56). This has already been illustrated in Fig. 2.
If the user now performs a mouse click 58 onto the underlined hyperlink 16, a second request 62 is sent to the server site 30 in sending step 60. This second request 62 is similar to the first request 50, but it comprises the file name "prd.html" in the absolute file path specification "/prd.html". This file name has been obtained by the browser from the argument of the HREF attribute of the hyperlink 16. The server 22 receives the second request 62 directed to the IP number of the server site 30 and in response accesses the second hypertext document 12 ("prd.html") in step 64. This document 12 is sent to the client 20 in step 66, and the contents 18 of document 12 are displayed in the browser window 36 in step 68.
The message sequence diagrams of Fig. 4a and Fig. 4b represent essentially the same sample run as shown in Fig. 3. However, in this case state information is generated, transmitted and preserved between subsequent request/reply pairs. This feature offers substantial advantages, e.g. in the field of electronic commerce.
The sample run shown in Fig. 4a again starts with a user action 70 in which the user wishes to access the server site named "www.foo.de". A hostname lookup is performed in steps 72 and 74. These steps correspond to steps 42 and 44 of Fig. 3 with the exception that the DNS zonefile of the nameserver 32 contains an entry specifying that all host names having a top level domain "de" and a second level domain "foo" are to be mapped to the IP number 192.168.4.10 irrespectively of the third and any lower level domains given in the host name. The nameserver 32 consequently outputs the IP number 192.168.4.10, which is sent to the client 20 in sending step 76. The browser running on the client 20 generates a first request 80, which is exactly the same as the request 50 shown in Fig. 3, and sends the request 80 to the server site 30 in sending step 72.
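
The behaviour of such a zonefile entry may be modelled, purely for illustration, by a small lookup function. This sketch is not a nameserver implementation; the table and the function name are assumptions.

```python
from typing import Optional

# Model of the wildcard entry: every name under "foo.de" maps to one IP number.
WILDCARD_ZONE = {"foo.de": "192.168.4.10"}

def resolve(host: str) -> Optional[str]:
    """Return the IP number for any host name within a wildcard zone."""
    name = host.lower().rstrip(".")
    for zone, address in WILDCARD_ZONE.items():
        if name == zone or name.endswith("." + zone):
            return address
    return None
```

Both "www.foo.de" and a name carrying state information such as "w.id.foo.de" therefore yield the same IP number.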
Upon receipt of the request 80 at the server site 30, it is first checked in test 82 whether or not the host name supplied as the value of the HTTP host header field contains a valid state information. In the present sample embodiment, the state information would have a predetermined format and would be contained in the third level domain part of the hostname. This third level domain part of the first request 80 is "www", which is not considered to be a valid encoded state information. Therefore the "no" branch of test 82 is chosen, and a new state information (called "id" in the present sample embodiment) is generated in step 84. The state information "id" in the present sample embodiment is a unique alphanumerical identifier.
In the subsequent step 86, the generated state information "id" is embedded in a new site name. Since, in the present sample embodiment, the site name cluster identifying the server site 30 comprises all host names having the top level domain "de", the second level domain "foo" and arbitrary lower level domains, a wide variety of encodings is possible. The sample encoding shown in the present embodiment is a simple textual concatenation using "w" as the fourth level domain part and the state information "id" as the third level domain part. The new host name containing the state information "id" therefore reads "w.id.foo.de".
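The encoding of steps 84 and 86 and the format check of test 82 can be sketched in Python as follows. This is only an illustrative sketch, not the implementation of the embodiment: the function names, the use of `uuid4` as the source of the identifier, and the rule that a valid identifier is a 32-character hexadecimal string are all assumptions made for the example.

```python
import re
import uuid

BASE_DOMAIN = "foo.de"                      # top and second level domain of the cluster
ID_FORMAT = re.compile(r"[0-9a-f]{32}")     # assumed "predetermined format" of the id


def new_state_id():
    """Generate a unique alphanumerical identifier (step 84)."""
    return uuid.uuid4().hex


def embed_state(state_id):
    """Build the new site name 'w.<id>.foo.de' (step 86)."""
    return "w." + state_id + "." + BASE_DOMAIN


def extract_state(host):
    """Return the id found in the third level domain part, or None (test 82)."""
    labels = host.lower().split(".")
    if len(labels) < 3 or ".".join(labels[-2:]) != BASE_DOMAIN:
        return None
    candidate = labels[-3]
    return candidate if ID_FORMAT.fullmatch(candidate) else None
```

With these definitions, `extract_state("www.foo.de")` yields no state information because "www" does not have the assumed format, whereas a name produced by `embed_state` returns the identifier it was built from.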
As the next step 88, a redirection command 90 is sent to the client 20 in the status line of the HTTP reply (status code 3xx). The redirection target is the new host name "w.id.foo.de" containing the state information. Receiving the redirection command 90 causes the client 20 to disregard any further contents of the reply and to direct any further requests to the server site identified by the new host name. The new host name is also displayed in the browser's URL entry field. All in all, the effect of receiving the redirect command 90 is essentially the same as if the user had typed the new host name "w.id.foo.de" into the browser's URL entry field.
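A redirection reply of this kind might be assembled as shown in the following Python sketch. The text only specifies a status code of the 3xx class; the choice of "302 Found", the HTTP/1.0 version string, the `Content-Length` header and the function name `redirect_reply` are assumptions for illustration.

```python
def redirect_reply(new_host, path="/"):
    """Return the raw bytes of an HTTP redirect to http://<new_host><path>."""
    lines = [
        "HTTP/1.0 302 Found",                    # status line, 3xx class (302 assumed)
        "Location: http://" + new_host + path,   # redirection target with the new host name
        "Content-Length: 0",                     # no body; the client ignores it anyway
        "",
        "",                                      # blank line terminating the header block
    ]
    return "\r\n".join(lines).encode("ascii")
```

Calling `redirect_reply("w.id.foo.de", "/index.html")` produces a reply whose `Location` header carries the new host name together with the originally requested path.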
The client 20 will now perform a new host name lookup procedure using the new host name "w.id.foo.de". This host name is sent to the nameserver 32 in sending action 92. As mentioned above, the nameserver 32 is configured to disregard any further host name parts if only the top and second level domains "foo.de" are present. Therefore the DNS lookup step 94 will yield the IP number 192.168.4.10 also for the new host name "w.id.foo.de". This IP number is transferred to the client 20 in step 96.
The client 20 next generates a second request 100 containing the new host name "w.id.foo.de" in the HTTP host header field. This request 100 is sent via the TCP protocol to the IP number 192.168.4.10. The server site 30 is configured in a way corresponding to the zonefile configuration of the nameserver 32. In more detail, the server site 30 will accept all requests directed to the server site's IP number if only the top and second level domain fields of the host name are "foo.de". Lower level domain fields are disregarded. It is a standard feature in many internet server programs like the Apache program mentioned above that an evaluation and test of the host name supplied in the HTTP host header field is possible. The original intention behind this feature was that several distinct server sites could possibly be accessed using a single IP number because of the present shortage of IP numbers. It is an inventive merit of the present inventors to have found a novel way of using this feature for implementing the present invention.
The further steps in the present sample run are shown in Fig. 4b. Similar to the corresponding steps in the prior art method, the first hypertext document 10 is accessed at the server site 30 (step 102) and sent to the client 20 (step 104). The contents 14 of this document are displayed in the browser window 36 in step 106.
When the user follows the hyperlink 16 by means of a mouse click 108, a corresponding HTTP request 112 will be generated and sent to the server site 30 in step 110. This request 112 contains the new host name generated in steps 84, 86 since this name has been preserved by the browser running on the client 20. It is important to note that the HREF attribute of the hyperlink 16 only contains a file path specification and no host name part. The present host name is therefore maintained. The requirement that hyperlinks must not set new host names if the state information shall be preserved is the only restriction imposed by the present sample embodiment on the structure and contents of the hypertext documents. It should further be noted that no processing of the first hypertext document 10 has been necessary for preserving the state information. Thus the present embodiment needs much less computing power of the server 22 than prior art methods. Of course, in alternative embodiments the state information could be used to appropriately modify the first hypertext document 10.
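The way a browser resolves the HREF attribute against the current page can be illustrated with Python's standard `urllib.parse.urljoin`. The URLs are the sample names used in this description; the variable names are chosen for the example only.

```python
from urllib.parse import urljoin

# Page currently displayed, reached via the host name carrying the state information.
current_page = "http://w.id.foo.de/index.html"

# A path-only HREF keeps the host name, and with it the state information.
kept = urljoin(current_page, "/prd.html")

# An HREF with its own host name replaces the current one; the state is lost.
lost = urljoin(current_page, "http://www.foo.de/prd.html")
```

Here `kept` still addresses "w.id.foo.de", while `lost` addresses "www.foo.de", which is why the hyperlinks must not set new host names if the state information is to be preserved.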
Upon receipt of the HTTP request 112, a check is made by the server site 30 whether or not valid state information is encoded in the host name supplied in the HTTP host header field (test 114). This is assumed in the present situation, and the "yes" branch of test 114 is followed. The further steps of accessing the second hypertext document 12 (step 116), sending it to the client 20 (step 118) and displaying its contents 18 (step 120) are identical to steps 64, 66 and 68 shown in Fig. 3. Again, it should be noted that no real time processing of the second hypertext document 12 is necessary for maintaining the state information, although such processing may be performed for other reasons.
The request/response mechanism shown in Fig. 4b can be continued as often as desired while preserving the encoded state information. It is remarked that the "yes" branch of test 82 in Fig. 4a would roughly correspond to a jump to step 102, and the "no" branch of test 114 in Fig. 4b would roughly correspond to a jump back to step 84. This completes the description of the first sample embodiment.
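The control flow of Fig. 4a and Fig. 4b, in which both tests lead either to serving the document or to a redirection, can be summarized in a short Python sketch. The decimal counter used as identifier, the exact validity rule and the function names are stand-ins assumed for the example; the embodiment only requires some unique identifier in a predetermined format.

```python
import itertools

_counter = itertools.count(1)   # stand-in source of unique identifiers (assumption)


def has_valid_state(host):
    """Tests 82 and 114: assumed format 'w.<digits>.foo.de'."""
    parts = host.split(".")
    return (len(parts) == 4 and parts[0] == "w"
            and parts[1].isdigit() and parts[2:] == ["foo", "de"])


def handle_request(host, path):
    """Serve the document if the host name carries state, otherwise redirect."""
    if has_valid_state(host):
        return ("serve", path)                        # steps 102 and 116
    state_id = str(next(_counter))                    # step 84
    new_host = "w." + state_id + ".foo.de"            # step 86
    return ("redirect", "http://" + new_host + path)  # step 88
```

A first call with the host name "www.foo.de" returns a redirection; repeating the request with the host name from that redirection returns the "serve" result, and every later request under the same host name does so as well.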
A second, third, fourth and fifth sample embodiment of the present invention will now be described, showing the actual programming code used in possible implementations. For all five sample embodiments described herein, a suitable DNS zonefile record is required that maps all site names contained in the site name cluster to a single IP number. Such a zonefile record for the sample top and second level domain "foo.de" is shown in the following. The parenthesized numbers at the left hand side are just line numbers used for easy reference and are not part of the actual files:
(10) @ IN SOA ns lur.s . office . acmec na . αe
(11) ( 1999080501 ; Serial
(12) 28800 ; Refresh
(13) 7200 ; Retry
(14) 604800 ; Expire
(15) 86400 ) ; Minimum TTL
(16) IN A 192.168.4.10
(17) IN NS ns
(18) IN MX 5 mail
(19) * IN A 192.168.4.10
(20) IN HINFO IBM-PC UNIX
In this zonefile record, line 10 is the standard header and lines 11-15 contain standard timing information. Line 16 is the so-called A record for the domain foo.de, line 17 contains nameserver information, and line 18 contains mail exchange information. The definition in line 19 is the decisive one with respect to the present invention. This definition configures the nameserver 32 such that every host name lookup query for any hostname XXXX.foo.de (where XXXX is an arbitrary character sequence) will be answered by the IP number 192.168.4.10.
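The effect of the wildcard definition in line 19 can be mimicked by the following Python sketch. It is a simplification under stated assumptions: the function name `lookup` is hypothetical, and the finer rules of real DNS wildcard matching (for example, the precedence of explicitly defined names) are not modelled.

```python
ZONE = "foo.de"
ADDRESS = "192.168.4.10"


def lookup(host):
    """Return the address for foo.de and any name below it, else None."""
    name = host.rstrip(".").lower()          # tolerate a trailing dot and mixed case
    if name == ZONE or name.endswith("." + ZONE):
        return ADDRESS
    return None
```

Both `lookup("www.foo.de")` and `lookup("w.id.foo.de")` give 192.168.4.10, while a name outside the zone, such as "www.bar.de", is not answered.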
The following portions of code concern the second sample embodiment of the invention. This embodiment uses version 1.3.6 or higher of an Apache server program containing the optional modules mod_php and mod_unique. First the Apache webserver is configured to provide a virtual host (corresponding to the server site 30) for the IP number 192.168.4.10 (lines 30-33 and 39). Line 33 defines that this virtual host answers HTTP requests directed to any site name ending in "foo.de" on the HTTP host header field. In other words, this configuration corresponds to that of the nameserver 32 in that the information contained in the third and lower level domain fields of the supplied host name is disregarded. Line 35 specifies that any transmitted document will be processed by a PHP interpreter program. This is important since the functionality of PHP will be used for implementing the invention in the present embodiment. Lines 34 and 36-38 contain administrative information less relevant for understanding the present invention:
(30) NameVirtualHost 192.168.4.10
(31) Listen 192.168.4.10:80
(32) <VirtualHost 192.168.4.10:80>
(33) ServerName *.foo.de
(34) DocumentRoot /usr/local/shop/www/docs/ww.foo.de
(35) AddType application/x-httpd-php3 .php3 .html .htm
(36) ErrorLog /usr/local/shop/var/log/httpd-error_log
(37) CustomLog /usr/local/shop/var/log/httpd-access_log BUY ORLD
(38) ServerAdmin webmaster@foo.de
(39) </VirtualHost>
The PHP interpreter program is configured such that a PHP script contained in the file "prepend.inc" will be added before any file that is delivered by the server. This functionality is achieved by the following line 40 in the PHP configuration file "php-ini":
(40) auto_prepend_file = "/usr/local/shop/www/docs/prepend.inc"
The file "prepend.inc" mentioned above contains the following PHP script:
(50) <?
(51) ereg("^([^\.]*)\.(.+)", $HTTP_HOST, $m);
(52) if( strlen($m[1]) == 30 ) {
(53) $SESSION_ID = $m[1];
(54) if (substr($SESSION_ID,21,9) < time()-10800 ) unset($SESSION_ID);
(55) }
(56) if ( !isset( $SESSION_ID ) ) {
(57) header("Expires: Date: Tue, 11 Nov 1987 11:11:11 GMT");
(58) header("Pragma: no-cache");
(59) header("Cache-Control: no-cache");
(60) header("Location: http://w$UNIQUE_ID"."0".sprintf("%.10d",time()).".".$HTTP_HOST.$REQUEST_URI);
(61) exit;
(62) }
(63) ?>
In this PHP script, line 51 extracts the encoded state information from the host name supplied in the HTTP host name field (which is available via the variable $HTTP_HOST). The validity of the state information is checked in lines 52-55 (corresponding to tests 82 and 114 in Fig. 4a and Fig. 4b). In particular, a test as to the format of the state information is made in line 52. Only if the encoded state information comprises exactly 30 characters, a timestamp is extracted in line 54 and a check is made whether or not the lifetime of the state information (10800 seconds in the present example) has expired. Line 54 can be omitted in alternative embodiments in order to obtain an unlimited validity period of the state information.
If no valid state information has been found (line 56), a new host name containing a newly generated unique identifier is sent back to the client 20 in a redirect command (line 60). Referring to Fig. 4a, line 56 corresponds to test 82 ("no" branch), and line 60 corresponds to steps 84 and 86. Lines 57-59 contain additional HTTP header values for preventing an undesired caching on the side of the internet browser. In the case of a redirection, processing of the client's request ends in line 61.
If valid state information has been found, lines 57-61 are skipped and the requested document is sent to the client 20. This corresponds to test 114 ("yes" branch) and steps 116 and 118 in Fig. 4b.
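The validity check of lines 52-55 (format test followed by a lifetime test) may be restated in Python as follows. This is a sketch under an assumed identifier layout, namely the letter "w", a 19-character unique part and a 10-digit timestamp, giving 30 characters in total; the character offsets therefore differ from those in the listing, and the function names are hypothetical.

```python
import time

LIFETIME = 10800     # validity period in seconds, as in the listing
ID_LENGTH = 30       # required length of the encoded state information


def make_session_id(unique, now=None):
    """Assumed layout: 'w' + 19-character unique part + 10-digit timestamp."""
    now = int(time.time() if now is None else now)
    return "w" + unique[:19].ljust(19, "0") + "%010d" % now


def valid_session_id(host, now=None):
    """Return the id if the first host label is a fresh 30-character id, else None."""
    now = time.time() if now is None else now
    label = host.split(".", 1)[0]
    if len(label) != ID_LENGTH:              # format test (line 52)
        return None
    stamp = label[-10:]                      # assumed position of the timestamp
    if not stamp.isdigit() or int(stamp) < now - LIFETIME:
        return None                          # lifetime expired (line 54)
    return label
```

Omitting the timestamp comparison in `valid_session_id` corresponds to omitting line 54 and yields an unlimited validity period.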
The third sample embodiment will now be described. This embodiment is similar to the second sample embodiment described above in that it also uses the scripts shown in lines 10-20, 30-39 and 40 and the PHP processing feature. However, the file "prepend.inc" contains the following lines 70-82:
(70) <?
(71) if( ereg( "^http://B(.{19})\.", $HTTP_REFERER, $m )
(72) || ereg( "^B(.{19})W\.", $HTTP_HOST, $m ) ) {
(73) if( !$HTTP_COOKIE ) $HTTP_COOKIE = $m[0];
(74) $SESSION_ID = $m[0];
(75) }
(76) else {
(77) $UNIQUE_ID = strtr( $UNIQUE_ID, "@", "." );
(78) $SERVER_NAME = strtr( $SERVER_NAME, "*", " " );
(79) header("Location: http://B$UNIQUE_ID$SERVER_NAME$REQUEST_URI");
(80) exit;
(81) }
(82) ?>
The code in lines 71-72 tests for the presence of encoded state information (having a length of 19 characters) in either the HTTP host header (variable $HTTP_HOST) or the HTTP referrer header (variable $HTTP_REFERER). If such state information is found, it is extracted in lines 71-72 and stored in the variables $HTTP_COOKIE (only if the HTTP cookie header has been empty; see line 73) and $SESSION_ID (line 74). Processing then continues by accessing and sending the requested file to the client 20 (corresponding to steps 116, 118 in Fig. 4b). The extracted state information can be used to modify the file in any desired way using, e.g., the PHP processing possibilities.
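The two-place search for the state information (referrer header or host header) can be expressed in Python roughly as follows. The patterns are simplified assumptions: they expect the letter "B", exactly 19 arbitrary characters and a dot, and the function name `find_state` is hypothetical.

```python
import re

HOST_RE = re.compile(r"^B(.{19})\.")
REFERER_RE = re.compile(r"^http://B(.{19})\.")


def find_state(host, referer=""):
    """Look for a 19-character identifier in the referrer first, then in the host name."""
    match = REFERER_RE.match(referer or "") or HOST_RE.match(host or "")
    return match.group(1) if match else None
```

Checking the referrer as well allows the state information to be recovered when a request arrives under a host name without state but was triggered from a page whose URL carried it.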
Lines 77-80 are executed if no encoded state information has been detected. In this case, a uniquely identifying string having 19 characters is obtained (line 77). This string is further encoded in line 77 by mapping "@" characters to "." characters because the former are not permitted in host names according to the RFC 1035 and RFC 2616 standards. A similar encoding of the server name is made in line 78. Line 79 contains the redirection command to the newly created site name. Processing of the request ends in line 80. All in all, these steps correspond to steps 84-88 shown in Fig. 4a.
The fourth sample embodiment of the invention will now be described. This embodiment is a very general possibility of implementing the present invention in the context of all kinds of page description languages like ASP, SHTML and so on. It is assumed in this embodiment that an internet shop using cookies shall be converted to the method of the present invention. An Apache server program having the optional module "mod_rewrite" is used.
For implementing this fourth sample embodiment, a zonefile record is used as defined in lines 10-20 above. Furthermore the virtual host definition of the Apache program is assumed to contain lines 30-34 and 36-39 given above. The following lines 90-96 are inserted into the virtual host definition at an appropriate place:
(90) RewriteEngine On
(91) RewriteCond %{HTTP_HOST} ^B([^\.]{19})W\.(.+)$
(92) RewriteRule ^(.+) $1 [E=SESSION_ID:%1]
(93) RewriteCond %{HTTP_COOKIE} ^.+$
(94) RewriteRule ^(.+) $1 [E=HTTP_COOKIE:%1]
(95) RewriteCond %{HTTP_HOST} !^B([^\.]{19})W\.(.+)$
(96) RewriteRule ^/(.+)$ http://B%{ENV:UNIQUE_ID}.foo.de/$1 [R,L]
In the first rewrite rule defined in lines 91 and 92, the host name contained in the HTTP host header field is analyzed and any state information contained therein is stored in the variable SESSION_ID. The second rewrite rule (lines 93 and 94) removes and stores any cookie information for further reference. The third and last rewrite rule (lines 95 and 96) concerns the case that no valid state information is found in the host name. Then a redirection command to a newly generated site name is issued. The letter "R" at the end of line 96 indicates the redirection command, and the letter "L" indicates that there are no further rewrite rules. This completes the description of the fourth sample embodiment.
The fifth sample embodiment of the invention is similar to the fourth one. It also uses rewrite rules and allows a very general application of the present invention in connection with a wide variety of page description languages. The difference from the fourth sample embodiment is that lines 91-96 shown above are replaced by the following lines 100-129. In the listing, lines 110-125 are identical to line 109 and have therefore been omitted for the sake of brevity:
(100) RewriteCond %{HTTP_REFERER} ^http://B(.{19})\. [OR,NC]
(101) RewriteCond %{HTTP_HOST} ^B(.{19})W\.
(102) RewriteRule ^(.+) $1 [E=SESSION_ID:%1]
(103) RewriteCond %{ENV:SESSION_ID} ^.+$
(104) RewriteCond %{HTTP_COOKIE} ^$
(105) RewriteRule ^(.+) $1 [E=HTTP_COOKIE:%1,L]
(106) RewriteCond %{ENV:SESSION_ID} ^.+$
(107) RewriteRule ^(.+) $1 [L]
(108) RewriteRule ^(.*)$ %{ENV:UNIQUE_ID} [E=NOTE_IT_TEMP:$1]
(109) RewriteRule (.*)\@(.*) $1\.$2 [C]
(126) RewriteRule (.*)\@(.*) $1\.$2 [C]
(127) RewriteRule (.*)\@(.*) $1\.$2
(128) RewriteRule ^(.*)$ %{ENV:NOTE_IT_TEMP} [E=UNIQUE_ID:$1]
(129) RewriteRule (.*)$ http://B%{ENV:UNIQUE_ID}.foo.de/$1 [R,L]
The mode of operation of the fifth sample embodiment corresponds to that of the third sample embodiment shown above, but rewrite rules have been used instead of PHP commands. Encoded state information stored in either the HTTP referrer or the HTTP host header field is found and extracted into the $SESSION_ID variable in lines 100-102. If such information has been found, it is stored in the $HTTP_COOKIE variable (if $HTTP_COOKIE is empty; lines 103-105), and program execution ends (directive "L" in lines 105 and 107). If no valid state information has been found, a new unique identifier is generated (line 108) and encoded to bring it into conformity with RFC 1035 name space requirements (lines 109-128). Line 129 is identical to line 96 described above and causes redirection to the newly generated host name. This concludes the description of the fifth sample embodiment.
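The encoding performed by the chain of identical rules in lines 109-127 replaces one "@" character per rule, so that nineteen rules suffice for a 19-character identifier. The following Python sketch imitates this behaviour; the function name `encode_unique_id` and the early exit that stands in for the chaining directive "C" are assumptions for illustration.

```python
import re

ONE_AT = re.compile(r"(.*)@(.*)")   # same shape as the pattern of line 109


def encode_unique_id(unique_id, max_rules=19):
    """Replace '@' by '.', one occurrence per pass, as a chain of rules would."""
    for _ in range(max_rules):
        replaced = ONE_AT.sub(r"\1.\2", unique_id, count=1)
        if replaced == unique_id:    # no '@' left: the chain stops here
            break
        unique_id = replaced
    return unique_id
```

Because the first group is greedy, each pass rewrites the last remaining "@" character, and the identifier is free of "@" characters once all passes have run.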
In all sample embodiments described above, the procedure call or variable access "unique_id" was used to obtain a unique identifier. In the presently described embodiments, this procedure call or variable access returns a tightly encoded identifier that is generally meaningless for human beings. In alternative embodiments, however, a different procedure call or variable access is used for obtaining a unique identifier that additionally comprises some meaningful text or consists of meaningful text. For example, an advertising slogan or a greeting message or some information for the user may be contained in the returned identifier. The meaningful text may be unique in that it is selected from a large phrasebook, or it may be made unique by concatenating it with the results of the "unique_id" procedure call or variable access. Consider, as an example, the case that some instance of the procedure call or variable access "unique_id" returns the unique identifier "1234". In the alternative embodiments mentioned in this paragraph, the term "Always_buy_ACME_products_1234" may then be used for generating the new site name to which the user is redirected. This site name will read "Always_buy_ACME_products_1234.foo.de", and it will be displayed in the URL line of the browser window 36 after the redirection has taken place.
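The construction of such a readable site name is sketched below in Python. The function name `slogan_site_name` and the joining of words with underscores are assumptions derived from the example in the preceding paragraph; strict host name rules only permit letters, digits and hyphens and limit each label to 63 characters, so a practical variant might use hyphens and shorter texts.

```python
def slogan_site_name(slogan, unique_id, base_domain="foo.de"):
    """Join the slogan words and the unique id with underscores and append the domain."""
    label = "_".join(slogan.split() + [unique_id])
    return label + "." + base_domain
```

For the slogan "Always buy ACME products" and the identifier "1234", this yields the site name given in the example above.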
It can thus be seen that the invention can be used for providing state information in a stateless data communication protocol in a convenient and flexible way requiring very little effort. The particulars contained in the above description of sample embodiments should not be construed as limitations of the scope of the invention, but rather as exemplifications of preferred embodiments thereof. Many other variations are possible and will be readily apparent to persons skilled in the art. For example, while several distinct sample embodiments of the present invention have been explained above, it is apparent that a multitude of combinations of the features described in these sample embodiments and in the introductory part of this specification are possible. Accordingly, the scope of the invention should be determined not by the embodiments illustrated, but by the appended claims and their legal equivalents.

Claims

1. A method for providing state information in a stateless data communication protocol, said state information being provided between a client (20) and a server site (30), said server site (30) being accessible at each of a cluster of site names, said method comprising the step of using one site name of said cluster of site names for accessing said server site (30), said site name containing the encoded state information.
2. The method of claim 1, wherein said server site (30) is identified by an address, said method comprising the further steps of: providing a nameserver (32) with said site name containing said encoded state information, receiving said address of said server site (30) from said nameserver (32), and contacting said server site (30) at said address and supplying said site name containing said encoded state information to said server site (30).
3. The method of claim 2, wherein said address is an internet IP address, and/or wherein said nameserver (32) is an internet DNS nameserver.
4. The method of claim 2 or claim 3, wherein said site name is supplied to said server site (30) in an HTTP header field.
5. The method of one of claims 1 to 4, further comprising the initial steps of: contacting said server site (30) using a site name of said cluster of site names, said site name not containing a valid encoded state information, receiving a redirection instruction to another site name of said cluster of site names, said other site name containing an encoded state information supplied by said server site (30), and using said other site name for accessing said server site (30).
6. A method for providing state information in a stateless data communication protocol, said state information being provided between a client (20) and a server site (30), said server site (30) being accessible at each of a cluster of site names, said method comprising the steps of: receiving one site name of said cluster of site names, said site name containing the encoded state information, and extracting said state information from said site name.
7. The method of claim 6, wherein said site name is received in an internet host header field.
8. The method of claim 6 or claim 7, further comprising the step of using said extracted state information for providing information depending on said state information to said client (20).
9. The method of one of claims 6 to 8, further comprising the steps of modifying said state information and embedding at least one site name containing the encoded modified state information into said information provided to said client (20).
10. The method of one of claims 1 to 9, wherein said state information is preserved throughout a plurality of communication events between said client (20) and said server site (30).
11. The method of one of claims 1 to 10, wherein said stateless data communication protocol is an internet protocol.
12. The method of claim 11, wherein said stateless data communication protocol is the HTTP protocol.
13. The method of one of claims 1 to 12, wherein said data communication protocol uses a hierarchical site name system, and wherein the portions of all site names in said cluster of site names starting from the top hierarchy level down to a predetermined hierarchy level either are identical or are contained in a predetermined set of site name portions.
14. The method of one of claims 1 to 13, wherein said site name is an internet host name.
15. The method of one of claims 1 to 14, wherein said site name is an internet IP number containing more than 32 bits.
16. The method of one of claims 1 to 15, wherein said site names are distinct from file path specifications defining documents within said server site (30).
17. The method of one of claims 1 to 16, wherein at least one further server site (30) is provided, said further server site (30) being accessible at each of a further cluster of site names, said method comprising the step of using one site name of said further cluster of site names for accessing said further server site (30), said site name containing the encoded previous state information.
18. The method of one of claims 1 to 17, wherein said state information comprises information for identifying the user and/or commercial information related to the user and/or information regarding the validity of said state information.
19. The method of one of claims 1 to 18, wherein said state information is an identifier associated with a user or with a current user session.
20. A computer program product for execution by a general purpose computer for providing state information in a stateless data communication protocol, said computer program product including instructions for making said general purpose computer perform the steps of the method of one of claims 1 to 19.
21. An apparatus comprising at least one general purpose computer programmed for performing the steps of the method of one of claims 1 to 20.
PCT/EP2000/005693 1999-08-28 2000-06-20 Providing state information in a stateless data communication protocol WO2001017197A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU62641/00A AU6264100A (en) 1999-08-28 2000-06-20 Providing state information in a stateless data communication protocol

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
EP99116993.9 1999-08-28
EP99116993A EP1081612B1 (en) 1999-08-28 1999-08-28 Providing state information in a stateless data communication protocol
US41709199A 1999-10-13 1999-10-13
US09/417,091 1999-10-13

Publications (2)

Publication Number Publication Date
WO2001017197A2 true WO2001017197A2 (en) 2001-03-08
WO2001017197A3 WO2001017197A3 (en) 2001-10-11

Family

ID=26153097

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2000/005693 WO2001017197A2 (en) 1999-08-28 2000-06-20 Providing state information in a stateless data communication protocol

Country Status (2)

Country Link
AU (1) AU6264100A (en)
WO (1) WO2001017197A2 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0812088A2 (en) * 1996-06-07 1997-12-10 International Business Machines Corporation Preserving state in stateless network protocols
US5774670A (en) * 1995-10-06 1998-06-30 Netscape Communications Corporation Persistent client state in a hypertext transfer protocol based client-server system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ARUN IYENGAR: "Dynamic Argument Embedding: Preserving State on the World Wide Web" IEEE INTERNET COMPUTING, 1 March 1997 (1997-03-01), XP002164484 *

Also Published As

Publication number Publication date
AU6264100A (en) 2001-03-26
WO2001017197A3 (en) 2001-10-11

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ CZ DE DE DK DK DM DZ EE EE ES FI FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
AK Designated states

Kind code of ref document: A3

Designated state(s): AE AG AL AM AT AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ CZ DE DE DK DK DM DZ EE EE ES FI FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: JP