WO2001063478A2 - Modifying contents of a document during delivery - Google Patents

Modifying contents of a document during delivery

Info

Publication number
WO2001063478A2
Authority
WO
WIPO (PCT)
Prior art keywords
document
computer system
web page
facility
information
Prior art date
Application number
PCT/US2001/003610
Other languages
French (fr)
Other versions
WO2001063478A3 (en
Inventor
Issac J. Roth
Jason D. Campbell
Original Assignee
Zack Network, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zack Network, Inc.
Priority to AU2001236650A
Publication of WO2001063478A2
Publication of WO2001063478A3

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/957 Browsing optimisation, e.g. caching or content distillation

Definitions

  • the present invention is directed to the fields of electronic document delivery and processing.
  • The users of computer systems connected to the Internet commonly use a "web browser" computer program to request, receive, and display documents and other data made available via the World Wide Web by "web server" computer systems that are also connected to the Internet.
  • Such documents are generally called “web pages,” and related groups of web pages served from the same web server are generally called a “web site.”
  • Web pages are typically encoded in HyperText Markup Language ("HTML").
  • Each piece of data made available via the World Wide Web, such as a web page or an image or other portion of a web page that is provided separately from the web page, is identified by an address called a Uniform Resource Locator ("URL").
  • a web browser issues a HyperText Transfer Protocol request ("HTTP request") containing its URL.
  • A user causes his or her web browser to issue an HTTP request for the URL of the web page. The user may do so by, for example, typing the URL into a URL field of the browser, or by selecting a link on a currently displayed web page or a bookmark in a bookmark list.
  • the browser transmits the HTTP request to the web server on which the web page resides, which replies with an HTTP response containing the requested web page, also called an HTML document.
  • When the browser receives the HTTP response, it displays the contained web page.
  • the user may retrieve and display a web page describing a particular product and offering it for sale. If the user has a tentative interest in purchasing the product, the user may wish to read a review of the product, determine whether the product can be purchased for a lower price from another vendor, or identify alternatives to the product. While this additional information is often available via the Internet, to obtain it the user generally must invest significant additional effort locating and retrieving the web pages containing this additional information. Further, even after the user has invested this additional effort, the web pages containing the additional information typically visually replace or obscure the original web page, making it difficult for the user to refer to both at the same time.
  • Figure 1 is a network diagram showing the environment in which the facility preferably operates.
  • Figure 2 is a high-level block diagram of the proxy server computer system.
  • Figure 3 is a flow diagram showing the steps preferably performed by the facility when an HTTP request is received from the client.
  • Figure 4 is a data flow diagram showing the makeup of a sample document processing pipeline.
  • Figure 5 is a flow diagram showing the steps preferably performed by the facility when it receives an HTTP response from a web server.
  • Figure 6 is a display diagram showing the original web page as it would be rendered from the HTTP response body shown in Code Block 4 if received by a browser without having been modified by the facility.
  • Figure 7 is a display diagram showing the sample web page as rendered by the client's browser after being modified by the facility.
  • Figure 8 is a display diagram showing the rendered modified web page with its supplemental information bar visually minimized.
  • Figure 9 is a data flow diagram showing conventional proxying of a secure web conversation.
  • Figure 10 is a data flow diagram showing proxying of a secure web conversation by the facility.
  • a software facility for modifying the contents of a document during its delivery (“the facility”) is provided.
  • the facility is particularly useful for adding information to, and otherwise modifying, display documents such as HTML documents and other types of web pages.
  • The facility may be used to identify information relating to a web page during its delivery and supplement the web page with this additional information.
  • the facility can preferably extract certain information from a document, then use the extracted information to determine how to modify the document. Further, the facility may use information extracted from a series of documents requested by a particular user to construct a profile of the user, which the facility in turn uses to determine how to modify a subsequent document.
  • the facility receives documents during their delivery to a destination. For example, the facility preferably receives web pages during their delivery to the client computer system that requested each web page. This is preferably accomplished by configuring document requestors, such as web browsers executing on client computer systems, to route their requests for documents, and therefore the responses to those requests, through the facility.
  • the facility selects one or more document processing specifications for use in modifying the document.
  • Some document processing specifications specify particular modifications to a document. For example, various document processing specifications may specify the addition of editorial information to a document, the deletion of particular types of advertising from a document, or the addition of new advertising to a document.
  • some document processing specifications specify the extraction of particular information from a document. For example, one document processing specification can specify the extraction of information about a product that is described in a document.
  • the facility preferably utilizes both procedural document processing specifications, such as document processing scripts, and nonprocedural document processing specifications, such as document processing templates.
  • the facility may add to a document either content or references to content.
  • the facility may further add user-activatable controls to a document.
  • the facility constructs, at least for a portion of the documents it processes, a sequential document processing pipeline.
  • the pipeline is an ordered sequence of document processing specifications to which a document is subjected in order. For example, the facility may construct a pipeline of three document processing specifications.
  • The facility first subjects the contents of the document to the first document processing specification.
  • Once the contents of the document are processed in accordance with the first document processing specification, they are passed to the second document processing specification.
  • the facility preferably uses several performance optimizations to improve the timeliness with which modified documents are delivered to their destinations.
  • the facility preferably streams out the document to the destination as it is processing the document, rather than sending the entire document to the destination only once processing of the document has been completed. This expedites the display of the document at the destination, as many document rendering programs used by clients, such as web browsers, can begin to render a document before the document is received in its entirety.
  • the facility preferably includes this information by reference. This enables the document to be delivered before the time cost of such processing is incurred, allowing a user to view portions of the modified documents while the document rendering program is resolving the included references and waiting for the associated processing by the third party information provider.
  • If the facility is adding different portions of information to a document by reference that are from different sources or are the products of different processing, the facility preferably adds to the document separate references to these portions. This enables document rendering programs that process multiple references in a document asynchronously to send separate reference resolution requests for each portion, and to display each portion as soon as it is received.
  • FIG. 1 is a network diagram showing the environment in which the facility preferably operates.
  • a client computer system 101 connects to the Internet 120 via an Internet Service Provider ("ISP") 110.
  • traffic passing between client computer system 101 and the Internet 120 is redirected by a layer 4 switch 113 through a proxy server 111 and a router 112.
  • the facility, which processes web pages passing from web servers such as web servers 131-133 to the client computer system is preferably implemented in proxy server 111.
  • an alternate configuration shows the proxy server to be utilized by client computer systems using other ISPs, such as client computer system 151 using ISP 140.
  • a proxy auto-configuration file is sent to the browser residing on the client computer system 151 which directs the browser to send requests through the proxy server 111.
  • the proxy server 111 may also be installed in a collocation facility not associated with an ISP.
  • FIG. 2 is a high-level block diagram of the proxy server computer system.
  • the proxy server 111 contains a memory 210.
  • the memory 210 preferably contains the facility 211, as well as scripts 212, profiles 213, and cookies 214 used by the facility. While items 211-214 are preferably stored in memory while being used, those skilled in the art will appreciate that these items, or portions of them, may be transferred between memory and a persistent storage device 202 for purposes of memory management and data integrity.
  • the proxy server further contains one or more central processing units (CPUs) 201 for executing programs, such as programs comprising the facility 211, and a computer-readable medium drive 203 for reading information or installing programs such as those comprising the facility from computer-readable media, such as a floppy disk, a CD ROM or a DVD.
  • FIG. 3 is a flow diagram showing the steps preferably performed by the facility when an HTTP request is received from the client.
  • the facility determines the host and path of the web server to which the HTTP request is directed. If the HTTP request is a proxy request, the facility extracts the host and the path from its URL. If the HTTP request is not a proxy request, the facility extracts the host from the host header field of the HTTP request.
  • Code Block 1 shows a sample HTTP request received by the facility.
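  • The host and path determination of step 301 can be illustrated with a minimal sketch. It is not the facility's code: the function name host_and_path, the sample URLs, and the assumption that header lines are separated by CRLF and are not folded are all introduced here for illustration only.

```python
from urllib.parse import urlsplit


def host_and_path(request_text: str) -> tuple[str, str]:
    """Return (host, path) for a raw HTTP request (illustrative only)."""
    lines = request_text.split("\r\n")
    _method, target, _version = lines[0].split(" ", 2)
    if "://" in target:
        # Proxy request: the request line carries an absolute URL,
        # so both host and path are taken from that URL.
        parts = urlsplit(target)
        return parts.netloc, parts.path or "/"
    # Non-proxy request: the host comes from the Host header field.
    for line in lines[1:]:
        if not line:  # blank line ends the header
            break
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            return value.strip(), urlsplit(target).path or "/"
    raise ValueError("request has no Host header field")
```

  • The query string, if any, is deliberately dropped in this sketch; a real proxy would need to carry it through to the web server.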
  • the facility uses the host and path determined in step 301 to determine the sequence of processing specifications to apply in processing the HTTP response that will be produced by the web server in response to the HTTP request.
  • The facility preferably maintains a table mapping regular expressions matching a set of URIs (host and path combined) to sequences of document processing specification identifiers, each identifying a document processing specification.
  • The facility determines in step 302 that, based upon the web server host and path extracted from the sample HTTP request, the following sequence of processing specifications will be applied to the corresponding HTTP response: first, a document processing specification to extract product data from the web page; second, a document processing specification to add supplemental information to the web page; and third, a document processing specification to rewrite secure references in the web page to refer to the facility.
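  • A table of this kind might be sketched as follows. The patterns, the specification identifiers, the example host, and the first-match-wins rule are assumptions made for illustration; the description does not say how overlapping patterns are resolved.

```python
import re

# Hypothetical table: (compiled URI pattern, ordered specification identifiers).
SPEC_TABLE = [
    (re.compile(r"^www\.example\.com/products/"),
     ["extract_product_data", "add_supplemental_info", "rewrite_secure_refs"]),
    (re.compile(r"^www\.example\.com/"),
     ["rewrite_secure_refs"]),
]


def specs_for(host: str, path: str) -> list[str]:
    """Return the specification sequence for the first pattern that matches."""
    uri = host + path  # host and path combined, as in the table description
    for pattern, spec_ids in SPEC_TABLE:
        if pattern.search(uri):
            return list(spec_ids)
    return []  # no match: the response would pass through unmodified
```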
  • These three document processing specifications are implemented as procedural scripts expressed in a scripting language similar to Perl. These scripts are shown below in Code Blocks 6, 7, and 8, respectively, and discussed further below.
  • In step 303, the facility constructs, in accordance with the sequence determined in step 302, a processing pipeline that will later be used by the facility to process the HTTP response generated by the web server in response to this HTTP request.
  • Step 303 involves instantiating a document processing, or "parser," object for each processing specification, storing in each document processing object a pointer to the corresponding document processing specification, and linking the objects together in the same sequence as that specified for the document processing specifications.
  • the facility preferably registers the constructed pipeline with an HTTP callback so that, when the corresponding HTTP response is returned from the server, it is submitted to the pipeline for processing.
  • FIG. 4 is a data flow diagram showing the makeup of a sample document processing pipeline.
  • the document processing pipeline 420 is comprised of three document processing specifications 421-423.
  • The facility submits the portion of the HTML document 410 to a first document processing specification 421 for extracting product data from the document.
  • The document processing specification 421 processes the received portion of the HTML document, as well as any earlier-received HTML that has been retained by this document processing specification for further processing, to extract product data from the document.
  • Any HTML content completely processed by document processing specification 421 is passed to document processing specification 422, which processes the HTML to add supplemental information to the web page.
  • Any HTML content completely processed by document processing specification 422 is passed to document processing specification 423 to rewrite secure references occurring in the HTML content.
  • Any HTML content completely processed by document processing specification 423 is transmitted to the client as HTML portion 430.
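  • The data flow of FIG. 4 can be approximated by a chain of parser objects, each of which forwards only the content it has completely processed and retains the rest. In the sketch below, the class name Parser, the helper build_pipeline, and the rule that "completely processed" means "up to the last newline" are simplifying assumptions, not the behavior of the ZCL scripts.

```python
from typing import Callable, Optional


class Parser:
    """One pipeline stage: transforms complete lines, retains the remainder."""

    def __init__(self, transform: Callable[[str], str]):
        self.transform = transform
        self.pending = ""                    # content retained for further processing
        self.next: Optional["Parser"] = None

    def feed(self, chunk: str) -> str:
        self.pending += chunk
        cut = self.pending.rfind("\n") + 1   # end of the last complete line
        ready, self.pending = self.pending[:cut], self.pending[cut:]
        out = self.transform(ready) if ready else ""
        return self.next.feed(out) if self.next else out

    def flush(self) -> str:
        # End of input: retained data is sent on without further processing.
        out, self.pending = self.pending, ""
        if self.next:
            return self.next.feed(out) + self.next.flush()
        return out


def build_pipeline(transforms) -> Parser:
    """Link one Parser per transform, in the given order; return the head."""
    parsers = [Parser(t) for t in transforms]
    for current, following in zip(parsers, parsers[1:]):
        current.next = following
    return parsers[0]
```

  • Calling feed on the head of the chain returns whatever has emerged from the last stage so far, which is the content that could be streamed to the client immediately.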
  • In step 304, the facility generates an HTTP request to send to the web server based upon the HTTP request received from the client.
  • the HTTP request generated by the facility in step 304 is derived from the HTTP request received by the facility from the client shown in Code Block 1.
  • the actual request is generated by connecting to the web server identified by the host portion of the URI in the proxy request and sending the path information as the request.
  • This step is the same processing typically performed in a proxy server.
  • In step 305, if the cookie maintenance service provided by the facility is active for this client, web server, and request, then the facility continues in step 306; otherwise, the facility continues in step 307.
  • The cookie maintenance service is typically active for all pages requested from sites for which secure requests need to be processed.
  • In step 306, the facility adds a cookie for this client and web server to the HTTP request generated in step 304.
  • The facility preferably maintains a table of client cookie values and expiration dates indexed by client and sub-indexed by domain and path. This is similar to the cookie storage normally performed by the browser when the facility is not present.
  • After step 306, the facility continues in step 307.
  • In step 307, the facility sends the generated HTTP request to the web server, using the host and path determined in step 301. After step 307, these steps conclude.
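  • The cookie table used in step 306 might be organized as nested dictionaries, as in the sketch below. The function names, the cookie name "uid", the client identifiers, and the simplified domain-suffix and path-prefix matching are assumptions for illustration; browser-grade cookie matching has more rules than are shown here.

```python
import time
from typing import Optional

# Layout: client -> domain -> path -> cookie name -> (value, expiry or None).
# Expiry is given in seconds since the epoch; None means "no expiration date".


def set_cookie(jar: dict, client: str, domain: str, path: str,
               name: str, value: str, expires: Optional[float] = None) -> None:
    """Store or overwrite one cookie value for a client."""
    jar.setdefault(client, {}).setdefault(domain, {}).setdefault(path, {})[name] = (
        value, expires)


def cookie_header(jar: dict, client: str, host: str, path: str,
                  now: Optional[float] = None) -> Optional[str]:
    """Build the Cookie header value to add to the outgoing request, if any."""
    now = time.time() if now is None else now
    pairs = []
    for domain, by_path in jar.get(client, {}).items():
        if host != domain and not host.endswith("." + domain):
            continue                      # cookie belongs to another domain
        for cookie_path, cookies in by_path.items():
            if not path.startswith(cookie_path):
                continue                  # cookie is scoped to another path
            for name, (value, expires) in cookies.items():
                if expires is None or expires > now:
                    pairs.append(f"{name}={value}")
    return "; ".join(pairs) if pairs else None
```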
  • FIG. 5 is a flow diagram showing the steps preferably performed by the facility when it receives an HTTP response from a web server.
  • the facility first receives the header of the HTTP response from the web server.
  • a sample HTTP response header is shown in Code Block 2.
  • Lines 4-5 of the sample received header shown in Code Block 2 contain a set-cookie command instructing the browser that receives it to set a cookie named "USER ID" to a value of "01203" for the domain "lava-lampsdirect.com.”
  • the sample received header contains a content-length field indicating that the body of the HTTP response that follows the header is 3686 bytes long.
  • The facility replaces a content-length header field in a received header with an indication that the proxy server's connection with the client computer system will be explicitly closed after the entire response has been sent to the client. By replacing the content-length header field in this manner, the facility makes it possible to begin sending the processed response to the client before it has received and processed the original response from the web server in its entirety, and therefore before it knows the final length of the processed response.
  • In step 503, if the cookie maintenance service is active for this client and web server, then the facility continues in step 504; otherwise, the facility continues in step 505.
  • In step 504, the facility removes from the received header any set-cookie fields, and processes the set-cookie command of each such set-cookie field against the table of client cookie values maintained by the facility to set the cookie values specified in the commands. After step 504, the facility continues in step 505.
  • In step 505, the facility sends the header to the client.
  • Code Block 3 below shows the contents of the sample received header shown in Code Block 2 transmitted to the client after modification by the facility.
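  • The header modifications described above (replacing the content-length field and, when the cookie maintenance service is active, removing set-cookie fields) can be sketched as a single pass over the header lines. The function name, the sample header values, and the assumption that each header field occupies one unfolded line are introduced for illustration.

```python
def rewrite_response_header(header_lines, cookie_service_active):
    """Return (lines to send to the client, captured set-cookie commands)."""
    out, captured = [], []
    for line in header_lines:
        name = line.partition(":")[0].strip().lower()
        if name == "content-length":
            # The final length is unknown while streaming, so signal instead
            # that the connection will be closed after the response.
            out.append("Connection: close")
        elif name == "set-cookie" and cookie_service_active:
            # Withheld from the client; to be applied to the facility's table.
            captured.append(line.partition(":")[2].strip())
        else:
            out.append(line)
    return out, captured
```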
  • It can be seen that step 502 replaced the content-length field on line 6 of Code Block 2 with the connection close field on line 4 of Code Block 3. It can further be seen that in step 504 the facility removed the set-cookie field occurring on lines 4-5 of Code Block 2. In step 506, the facility receives a portion of the body of the
  • In step 507, the facility submits the received portion of the body to the document processing pipeline constructed in response to the corresponding HTTP request.
  • the facility processes body data through the pipeline constructed for the HTTP response in step 303 by passing output generated by each parser object to the next one.
  • In step 509, after processing the data through the pipeline, the facility sends to the client any content emerging from the pipeline.
  • In step 510, if the portion of the body received in step 506 contains the end of the HTTP response, then the facility continues in step 511; otherwise, the facility continues in step 506 to receive the next portion of the body.
  • In step 511, the facility explicitly closes the connection with the client and flushes the pipeline by sending an EOF (end of file) message to the pipeline. This message indicates to the parser objects that no more HTML data is forthcoming, so any data still being accumulated should be sent on without further processing.
  • In step 512, the facility deletes the document processing pipeline. After step 512, these steps conclude.
  • Code Block 4 shows a sample HTTP response body received by the facility.
  • FIG. 6 is a display diagram showing the original web page as it would be rendered from the HTTP response body shown in Code Block 4 if received by a browser without having been modified by the facility.
  • a browser window 600 contains a web page rendering window 610.
  • The web page rendering window 610 in turn contains the rendered contents of the unmodified web page.
  • the rendered web page includes information about a lava-lamp product, including a product name 621, a product item number 622, a textual description of the product 623, a price for the product 624, and a picture of the product 625.
  • the rendered web page further includes a purchasing control 631 that may be operated by the user in order to buy the product from the publisher of the web page, as well as a secure link 641 to a separate account management web page published by the publisher of the current web page.
  • Code Block 5 shows the HTTP response body preferably transmitted to the client after modification by the facility in accordance with document processing specifications 421-423 shown in Figure 4.
  • Figure 7 is a display diagram showing the sample web page as rendered by the client's browser after being modified by the facility.
  • the application of document processing pipeline 420 shown in Figure 4 can be seen in comparing Figures 6 and 7.
  • the facility extracts from the web page and stores in its database such information as the item brand "Hot Lava” and description "motion lamp” 621, the item number "groovy 321" 622 and the item price "$99.99" 624.
  • the facility modifies the title bar of the browser window 600 to include a user and a pod name as shown in the title bar of browser window 700.
  • the facility further adds a supplemental information bar 750 to the web page rendering area 710.
  • The supplemental information bar includes links, such as a link 751 to add the product described in the web page to a wish list maintained for the user, a link 752 to a page containing information for the user, a link 753 to disable the facility and/or the display of the supplemental information bar, a link 754 to a help page, and a link 755 for switching to a different user.
  • the supplemental information bar further contains information produced by three features: a Community feature 760, an AuctionWatch feature 770, and a PriceCompare feature 780.
  • The Community feature contains links 761 and 762 to two products that are popular with other users that have requested information about and/or purchased the lava-lamp product.
  • the AuctionWatch feature contains two links 771 and 772 to on-line auctions in which the lava-lamp product is offered for sale.
  • the PriceCompare feature contains two links 781 and 782 to other web-based merchants that are offering the lava-lamp product for sale at lower prices.
  • the facility modifies the URL for secure link 641 to refer to the facility rather than to the web server to which it originally referred.
  • Figure 8 is a display diagram showing the rendered modified web page with its supplemental information bar visually minimized.
  • the supplemental information bar 750 has been replaced with a much smaller supplemental information bar icon 890 by selecting a minimize control of the supplemental information bar 750. This enables the user to view any content of the original web page that may be obscured by the display of the supplemental information bar 750.
  • the facility uses procedural scripts as document processing specifications.
  • Such procedural scripts are preferably expressed in a command language similar to Perl called "ZCL.”
  • ZCL is described at a high level herein, and is described in greater detail in Appendix 1 which follows hereafter.
  • The role of a parser in the document processing pipeline is to process the HTML content that it receives from the previous link in the chain by applying its ZCL script to that stream of data.
  • a simple ZCL script comprises a set of "SELECT" statements that indicate portions of the incoming page to which to apply commands. The commands to be applied are attached to each "SELECT" statement.
  • the parser accumulates all HTML data until it encounters data that matches the regular expression that indicates that the current "SELECT" statement has been satisfied. When the termination expression is satisfied, the parser then applies all the commands to the data grabbed from the page, returns the processed data, and goes on to process the next "SELECT" statement. Commands in a "SELECT" block can perform two general functions.
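  • The accumulate-until-terminator behavior of a "SELECT" statement can be approximated in Python as shown below. This is not ZCL: the class name, the use of a single command function, and the choice to exclude the terminating text from the selected data are assumptions made for illustration.

```python
import re


class SelectStatement:
    """Accumulates data until a termination expression matches (illustrative)."""

    def __init__(self, terminator: str, command):
        self.terminator = re.compile(terminator)
        self.command = command      # applied to the data selected from the page
        self.buffer = ""
        self.done = False

    def feed(self, chunk: str) -> str:
        if self.done:
            return chunk            # statement satisfied: later data passes through
        self.buffer += chunk
        match = self.terminator.search(self.buffer)
        if match is None:
            return ""               # keep accumulating
        self.done = True
        selected = self.buffer[:match.start()]
        rest = self.buffer[match.start():]
        self.buffer = ""
        return self.command(selected) + rest
```

  • In a full script, control would move on to the next "SELECT" statement once this one is satisfied; the pass-through branch stands in for that here.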
  • In addition to "SELECT"-based processing, ZCL also supports a more flexible, global search-and-replace function called "PipelinedMatch".
  • This command applies a regular expression-based search and replace to the whole HTML page while forwarding HTML data as soon as a match is processed or it is clear that no match will be found in that text.
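  • The forwarding behavior described for "PipelinedMatch" can be illustrated with a streaming search-and-replace that holds back only text that could still turn out to be the start of a match. The sketch substitutes a literal search string for the regular expression to keep the hold-back rule simple; the class name and the sample strings are hypothetical.

```python
class StreamingReplace:
    """Replace every occurrence of a literal string in a chunked stream."""

    def __init__(self, needle: str, replacement: str):
        if not needle:
            raise ValueError("needle must be non-empty")
        self.needle = needle
        self.replacement = replacement
        self.held = ""              # tail that might begin a match

    def feed(self, chunk: str) -> str:
        data = self.held + chunk
        out = []
        while True:
            index = data.find(self.needle)
            if index < 0:
                break
            out.append(data[:index] + self.replacement)
            data = data[index + len(self.needle):]
        # Hold back the longest tail that is a proper prefix of the needle;
        # everything before it clearly cannot match and is forwarded now.
        keep = 0
        for k in range(min(len(self.needle) - 1, len(data)), 0, -1):
            if self.needle.startswith(data[-k:]):
                keep = k
                break
        out.append(data[:len(data) - keep])
        self.held = data[len(data) - keep:]
        return "".join(out)

    def flush(self) -> str:
        out, self.held = self.held, ""
        return out
```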
  • the facility stores data collected during HTML processing in a database resident on the proxy server computer system.
  • This database stores data "grabbed" from the pages for each user by ZCL scripts.
  • the database is designed with a dynamic schema, which is updated whenever new ZCL scripts are installed for use by the facility.
  • Scripts for web pages containing consumer-oriented data generally grab product descriptions, as well as more detailed information such as item type, price, and URL of the product page.
  • The facility builds a user profile by grabbing data from pages the user visits. Each time the user visits a product page (i.e., a page describing a specific product), a data extraction script stores the product's description and price. The facility processes this data to extract the last n items that the user viewed. In one preferred embodiment, the facility maintains information on the 50 items most recently viewed. The facility uses these items to generate a profile of the user's interests. The facility characterizes a user's profile according to keyword frequency and item type frequency. Keyword frequency is the number of times a given keyword occurs in the recorded items. Item type frequency is the number of times a given item type (product category) occurs. The facility records the user's top keywords and item types and uses them to target advertisements and other content that are relevant to the user's recent browsing.
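  • A profile of this kind can be sketched with a bounded queue and two frequency counts. The class and method names, the whitespace-based keyword splitting, and the sample products are assumptions for illustration; the description does not specify how keywords are tokenized.

```python
from collections import Counter, deque


class UserProfile:
    """Tracks the most recently viewed items and summarizes them."""

    def __init__(self, max_items: int = 50):
        # Oldest entries fall off automatically once max_items is reached.
        self.items = deque(maxlen=max_items)   # (description, item_type) pairs

    def record_view(self, description: str, item_type: str) -> None:
        self.items.append((description, item_type))

    def top_keywords(self, n: int = 3) -> list:
        counts = Counter(
            word.lower() for description, _ in self.items
            for word in description.split())
        return [word for word, _ in counts.most_common(n)]

    def top_item_types(self, n: int = 3) -> list:
        counts = Counter(item_type for _, item_type in self.items)
        return [item_type for item_type, _ in counts.most_common(n)]
```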
  • Code Block 6 is a sample document processing script corresponding to the product data extraction document processing specification 421 shown in Figure 4, to which the contents of the body of the HTTP response shown in Code Block 4 are first subjected. This script extracts product data from the body of the HTTP response for use in profiling the user retrieving this web page.
  • the script shown in Code Block 6 extracts product data from the web page and stores it in the database.
  • A ZCL statement in lines 14-17 of Code Block 6 processes data until the first occurrence of the string "<b><font". This string occurs in the page just before the heading of the page (in this case "LAVA-LAMPS" in lines 24-25 of Code Block 4).
  • the empty command clause allows the parser to pass data on as soon as it is clear that it doesn't contain the termination string. Thus, as data comes in, it is passed on to the next ZCL script in the chain.
  • The ZCL statements in lines 30-38 and 39-44 advance the HTML stream to just before the brand name, and then extract the brand name "Hot Lava", from lines 32-33 of Code Block 4, into the item_brand variable.
  • The next ZCL statements in Code Block 6 are structured similarly to the first four, with several pairs of statements to advance to the next data item, and then to extract the data item into the ZCL variables item_description, item_number, and item_price.
  • the final statement at lines 89-99 of Code Block 6 stores the acquired text into the database, indexed by domain name and user name.
  • ZCL variables and database fields have the extracted values shown below in Table 1.
  • Table 1
      item_subtype       ItemSubType       viewed   LAVA-LAMPS
      item_brand         ItemBrand         viewed   Hot Lava
      item_description   ItemDescription   viewed   motion lamp
      item_number        ItemNumber        viewed   groovy 321
      item_price         ItemPrice         viewed   $99.99
  • the facility preferably adds to certain web pages a subwindow called a "supplemental information bar" that contains additional information relating to the web page.
  • the supplemental information bar is preferably implemented as a DHTML widget, which enables the information contained in it to be displayed in a relatively unobtrusive manner.
  • the supplemental information bar floats above the page and can be "minimized” so that it doesn't obscure the existing page content.
  • the supplemental information bar allows easy substitution of features, which enables a "slotting" strategy. That is, slots in the supplemental information bar can be sold.
  • the supplemental information bar code is capable of placing different features on different pages.
  • The supplemental information bar is preferably implemented using a Perl function which generates HTML text which is inserted into the original page between the </BODY> and </HTML> tags. This ensures that all the important HTML text in the page will have been processed by the time the supplemental information bar function is called. This allows ZCL parsers earlier in the chain to grab all relevant information out of the page so the variables can be passed to the supplemental information bar.
  • The supplemental information bar pulls in external content from server-side scripts, such as CGI scripts, that are called from the browser by elements able to reference external objects, such as <DIV> tags in Netscape, or <IFRAME> tags in Internet Explorer.
  • the ZCL script that places the supplemental information bar into the page generates the URLs for the CGI requests, passing the appropriate data grabbed from the page as parameters.
  • the product description is passed to a CGI script which generates a price comparison.
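  • Generating such a CGI request URL amounts to encoding the grabbed values as query parameters. In the sketch below, the function name, the script path, and the parameter names are hypothetical; only the sample values come from the page described above.

```python
from urllib.parse import urlencode


def feature_url(script_path: str, **params: str) -> str:
    """Build a CGI request URL carrying data grabbed from the page."""
    # urlencode percent-escapes reserved characters and encodes spaces as "+".
    return f"{script_path}?{urlencode(params)}"
```

  • For example, feature_url("/facility-cgi/price_compare", description="motion lamp", price="$99.99") yields a URL whose query string carries both values in escaped form.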
  • The supplemental information bar uses <LAYER> and <DIV> tags to place external content into the page.
  • The content is generated by CGI scripts which are handed parameters grabbed from the page.
  • A <LAYER> tag is used to include the external content.
  • <DIV> tags are used instead, with an <IFRAME> holding the actual content generated by the script.
  • Once the CGI content is received, it is copied from the <IFRAME> into the <DIV>. Note that to accomplish this, a special proxy command has been implemented.
  • In order to prevent malicious web sites from capturing data from users by copying data from one frame to another, Internet Explorer enforces a security measure preventing access of frames from a web server other than the one running the script.
  • A proxy command is used to direct the proxy that requests for URIs whose path component begins with a certain keyword should be forwarded to the internal CGI server instead of being forwarded to the web server as they normally would be. This means that even though the browser appears to be sending the request to the original web server, the response is actually generated by the facility's CGI server.
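  • The routing rule described above can be sketched as a simple prefix test on the request path. The keyword "/facility-cgi/" and the internal server address are hypothetical placeholders; the description does not name either.

```python
INTERNAL_PREFIX = "/facility-cgi/"                  # hypothetical path keyword
INTERNAL_CGI_SERVER = "cgi.internal.example:8080"   # hypothetical address


def upstream_for(host: str, path: str) -> str:
    """Choose the server that should actually receive the request."""
    if path.startswith(INTERNAL_PREFIX):
        # The browser believes it is addressing `host`, which keeps the
        # frame same-origin, but the facility's CGI server answers.
        return INTERNAL_CGI_SERVER
    return host
```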
  • the CGI scripts are passed a cookie which identifies the user making the request.
  • the scripts can use this information to customize the data returned for the request.
  • An example of a CGI script application would be using collaborative filtering to deliver recommendations related to the product or web page being displayed. Collaborative filtering attempts to determine interests of one user by correlating the user's product browsing history with the histories of other users. Identifying similar users allows the script to determine what other products were interesting to those users.
  • Code Block 7 is a sample document processing script corresponding to the supplemental information addition document processing specification 422 shown in Figure 4, to which the contents of the body of the HTTP response shown in Code Block 4 are subjected after they are subjected to the sample document processing script shown in Code Block 6 above.
  • This script adds a supplemental information bar to the web page containing information relating to the content of the web page and the user, including links to additional related information.
  • the script in Code Block 7 uses information extracted in the script shown in Code Block 6 to add the user's name to the title bar of the web page and to add supplemental information to the web page in a supplemental information bar.
  • the "PipelinedMatch" statement in lines 15-18 of Code Block 7 performs a global replacement in the page that adds the user's name and proxy server site to the title bar of the web page.
  • a Perl function (InsertUsername) is used to generate the actual HTML content. For example, the facility inserts the user and pod name string "[janedoe@podl]" before the page title "Acme. electronics" in line 3 of Code Block 5.
  • the user and pod name string are generated by an "InsertUsername" Perl function that takes ZCL variables containing the user's name and the pod's name as parameters.
  • the PipelinedMatch statement consumes data and passes it on as soon as it determines that the match expression doesn't match the contents of the accumulated data. Given the chaining of the two scripts in this example, the match will occur in the first batch of data received from the ZCL script shown in Code Block 6 (the batch ending in "<b><font").
  • the PipelinedMatch statement in lines 1-13 of Code Block 7 uses the variables set by the ZCL script shown in Code Block 6 to call a Perl function that generates the supplemental information bar.
  • the supplemental information bar is added to the sample web page at lines 95-269 of Code Block 5.
  • the call to the Perl function specifies the content features to include in the supplemental information bar.
  • the parameters passed to the Perl script include the URL of a CGI script to produce the content for the feature as well as parameters passed to the CGI script for use in producing the content. For example, for a PriceCompare feature preferably provided by the facility, a description of the item whose price is to be compared is ultimately passed to the CGI script.
  • the parameters to the CGI script are expressed, at least in some cases, in terms of ZCL variables whose values are extracted from the web page by the script shown in Code Block 6.
  • the ZCL variables are replaced with their values.
  • the $%item_description variable name shown in line 7 of Code Block 7 is replaced with its value extracted from the sample web page, "Hot Lava motion lamp."
  • When invoked, the Perl function adds, for each specified feature, a CGI script call to the HTML content that the function generates and that the ZCL script adds to the web page.
  • the facility preferably adds the CGI script call shown in lines 261-263 of Code Block 5.
  • When the client receives the modified web page, the facility generates, for each CGI script call added by the Perl function, an additional HTTP request invoking the script and passing the parameters. In the corresponding HTTP response, the facility receives the HTML content for the feature generated by the CGI script and displays this HTML content in the supplemental information bar. For example, the content for the PriceCompare feature is displayed in area 780 of the supplemental information bar.
  • An alternate implementation of the supplemental information bar processing uses a proxy command to invoke a Perl module which acts as a customized parser object.
  • This Perl module externally behaves like a parser object for ZCL scripts, but actually is not associated with any script. Instead, it contains Perl code which is optimized to search for the appropriate place to insert the DHTML content to implement the supplemental information bar.
  • the variables initialized by parser objects earlier in the pipeline are passed to this Perl code, allowing selection of features to include based on available data. This allows the facility to avoid placing features in the bar which generate no useful information in the context of a given page.
  • Code Block 8 below is a sample document processing script corresponding to the secure reference re-writing document processing specification 423 shown in Figure 4, to which the contents of the body of the HTTP response shown in Code Block 4 are subjected after they are subjected to the sample document processing script shown in Code Block 7 above.
  • This script replaces any secure HTTP references to web servers with secure HTTP references to the facility.
  • the facility preferably replaces the secure HTTP reference to a web server in line 82 of Code Block 4, "https://www.lava-lampsdirect.com/cgi-bin/myaccount.cgi," with a secure HTTP reference to the facility in lines 83-84 of Code Block 5, "https://secure.zacknetwork.com//www.lava-lampsdirect.com/cgi-bin/myaccount.cgi".
  • Non-secure HTTP references to web servers, such as the HTTP reference to a web server in line 25 of Code Block 4, "/lavalamps.html", on the other hand, are not replaced with an HTTP reference to the facility, as can be seen at line 26 of Code Block 5.
  • the facility uses several additional mechanisms. These mechanisms redirect the secure connection to an instance of the proxy server running in a secure mode (the "secure proxy") so that it is a part of the communication between the user and the web site.
  • the secure proxy is responsible for script processing on secure web pages.
  • Secure web conversations are generally conducted using the SSL protocol.
  • This protocol entails a key exchange between the two endpoints of the conversation and subsequent establishment of a secure channel which cannot be understood by a third party.
  • a normal web proxy server simply forwards the keys and encrypted data stream without being able to decrypt the actual data that is transferred.
  • Figure 9 is a data flow diagram showing conventional proxying of a secure web conversation.
  • the diagram shows that, when the client 910 generates a secure request 911, the request is encrypted before leaving the client using a key set negotiated between the client and web server for the secure conversation, as shown by the wavy line extending from the request.
  • the request 911 remains encrypted while it passes through a conventional proxy server 920, and until it reaches the web server 930.
  • the web server 930 uses the key set negotiated between the client and web server for the secure conversation to decrypt the request 911, as shown by the transition from a wavy line to a straight line in web server 930.
  • the web server 930 generates a secure response 931, which it encrypts with the key set before transmitting it to the client.
  • the secure response 931 remains encrypted throughout its transmission to the client, including the time during which it is passing through the conventional proxy server 920.
  • the client decrypts the secure response using the key set for the secure conversation. Accordingly, the conventional proxy server 920 never has access to the secure request 911 or the secure response 931 in plaintext form.
  • In order to be able to rewrite secure web pages, the proxy server must participate in this key exchange. For an HTTP conversation, this means that the browser must be directed to connect directly to the proxy instead of the ultimate web site. That is, the browser must connect to a URL which looks like it is resident on the proxy server, not on the actual site. A consequence of this redirection is that cookies (which are presented only to the site that sets them) will not be passed for the actual site, as the browser doesn't know that the request is actually intended for a different site.
  • Figure 10 is a data flow diagram showing proxying of a secure web conversation by the facility.
  • the diagram shows that a secure conversation using the facility is actually implemented using two different secure conversations: a first secure conversation between the client and the proxy server, and a second secure conversation between the proxy server and the web server.
  • the client 1010 generates secure request 1011, which it encrypts using a key set negotiated with the proxy server before sending to the proxy server 1020.
  • When the proxy server 1020 receives the encrypted secure request 1011, the proxy server decrypts the secure request using the key set negotiated with the client for the first secure conversation. Once it has decrypted the request, the proxy server may read, copy, modify, filter, or otherwise process the secure request in plaintext form.
  • the proxy server then re-encrypts the secure request using a key set negotiated with the web server for the second secure conversation, and forwards the re-encrypted secure request to the web server 1030.
  • When the web server receives the re-encrypted request, the web server decrypts it using the key set negotiated with the proxy server for the second secure conversation.
  • the web server generates a secure response 1031, which it encrypts using the key set negotiated with the proxy server for the second secure conversation, and transmits the encrypted secure response to the proxy server.
  • the proxy server decrypts the encrypted secure response using the key set negotiated with the web server for the second secure conversation. After so decrypting the secure response, the proxy server can read, copy, modify, filter, or otherwise process the secure response in plaintext form.
  • the proxy server then re-encrypts the decrypted secure response using the key set negotiated with the client for the first secure conversation, and transmits the re-encrypted secure response to the client.
  • the client decrypts the re-encrypted secure response using the key set negotiated with the proxy server for the first secure conversation.
  • the secure "proxy" is implemented as a web server, not a web proxy server. It supports URL requests which contain the site from which to retrieve the data.
  • the URL that the browser requests actually specifies the secure proxy server as the location of the page. For example, a request from "https://www.asite.com/apath" would be requested as
  • the proxy auto-configuration file provided to the browser at login time by the facility is set up to pass all requests to "secure.zacknetwork.com" through without proxying.
  • the facility preferably rewrites all sources from which a secure request may originate. In general, these points are either links or redirection requests. Hyperlinks from one page to another often point to a secure page. These links must be rewritten. Redirection requests (e.g., HTTP 302 replies) are also often used to direct a browser to a new page, which could be a secure page. These redirection requests must also be rewritten to point to the secure proxy.
  • Links that are external to the click stream cannot be rewritten by this mechanism.
  • links are preferably rewritten using the normal ZCL mechanisms. This means that a script such as the script in Code Block 8 is written to process the insecure (standard HTTP) pages and rewrite all links so that the old URL now points to the facility (for example, the URL "https://www.asite.com/apath" is rewritten as "https://secureproxy.zacknetwork.com/www.asite.com/apath").
  • Redirection requests generally cannot be rewritten by ZCL script as the location of the redirection is returned in a header field.
  • the facility therefore allows specification of a proxy command that indicates that redirection requests returned from a set of URLs should be rewritten.
  • the requests are rewritten by prepending the name of the secure proxy server to the Location header field specified in the original response.
  • the above-described facility could be adapted or extended in various ways.
  • the facility may be straightforwardly adapted to operate in computing environments other than the Internet.
  • the facility may modify various types of documents besides web pages.
  • the facility may use kinds of document processing specifications other than those described above. Further, the facility may apply some document processing specifications in parallel rather than sequentially. While the foregoing description makes reference to preferred embodiments, the scope of the invention is defined solely by the claims that follow and the elements recited therein.
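
The secure reference rewriting described above, in which https links in a page and Location headers in redirection responses are pointed at the secure proxy, can be sketched roughly as follows. This is a minimal illustration in Python rather than ZCL; the function names are hypothetical, and the proxy host name and the single-slash URL form are assumptions taken from the examples in the text.

```python
import re

# Assumed secure proxy host name, taken from the example URLs in the text.
SECURE_PROXY = "secure.zacknetwork.com"


def rewrite_secure_links(html):
    """Point https:// references at the secure proxy.

    Non-secure (http://) and relative references are left unchanged, and
    references that already name the secure proxy are not rewritten again.
    """
    pattern = r"https://(?!" + re.escape(SECURE_PROXY) + r"/)"
    return re.sub(pattern, "https://" + SECURE_PROXY + "/", html)


def rewrite_location(location):
    """Prepend the secure proxy name to a secure Location header value."""
    prefix = "https://"
    already = location.startswith(prefix + SECURE_PROXY + "/")
    if location.startswith(prefix) and not already:
        return prefix + SECURE_PROXY + "/" + location[len(prefix):]
    return location


page = '<A HREF="https://www.asite.com/apath">My account</A>'
print(rewrite_secure_links(page))
print(rewrite_location("https://www.asite.com/apath"))
```

A real implementation would restrict the rewriting to attribute values and to the set of URLs named in the proxy command; the plain text substitution here is only meant to show the URL transformation.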

Abstract

A facility for modifying the contents of a document during its delivery is described. Based upon information associated with the document, the facility selects one of a number of information addition specifications each specifying a distinct manner of adding information to a document. The facility applies the selected information addition specification to the document to add information to the document as specified by the selected information addition specification. The facility then resumes the delivery of the document containing the added information.

Description

MODIFYING CONTENTS OF A DOCUMENT DURING DELIVERY
TECHNICAL FIELD
The present invention is directed to the fields of electronic document delivery and processing.
BACKGROUND
The users of computer systems connected to the Internet commonly use a "web browser" computer program to request, receive, and display documents and other data made available via the World Wide Web by "web server" computer systems that are also connected to the Internet. Such documents are generally called "web pages," and related groups of web pages served from the same web server are generally called a "web site." Web pages are typically encoded in HyperText Markup Language ("HTML").
Each piece of data made available via the World Wide Web, such as a web page or an image or other portion of a web page that is provided separately from the web page, is identified by an address called a Uniform Resource Locator ("URL"). In order to retrieve such a piece of data, a web browser issues a HyperText Transfer Protocol request ("HTTP request") containing its URL. Accordingly, in order to retrieve and display a particular web page, a user causes his or her web browser to issue an HTTP request for the URL of the web page. The user may do so by, for example, typing the URL into a URL field of the browser, or by selecting a link on a currently displayed web page or a bookmark in a bookmark list. The browser transmits the HTTP request to the web server on which the web page resides, which replies with an HTTP response containing the requested web page, also called an HTML document. When the browser receives the HTTP response, it displays the contained web page.
In this manner, the user may retrieve and display a web page describing a particular product and offering it for sale. If the user has a tentative interest in purchasing the product, the user may wish to read a review of the product, determine whether the product can be purchased for a lower price from another vendor, or identify alternatives to the product. While this additional information is often available via the Internet, to obtain it the user generally must invest significant additional effort locating and retrieving the web pages containing this additional information. Further, even after the user has invested this additional effort, the web pages containing the additional information typically visually replace or obscure the original web page, making it difficult for the user to refer to both at the same time.
Accordingly, an automated facility for identifying additional information relating to a web page during its delivery and supplementing the web page with this additional information would have significant utility.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 is a network diagram showing the environment in which the facility preferably operates.
Figure 2 is a high-level block diagram of the proxy server computer system.
Figure 3 is a flow diagram showing the steps preferably performed by the facility when an HTTP request is received from the client.
Figure 4 is a data flow diagram showing the makeup of a sample document processing pipeline.
Figure 5 is a flow diagram showing the steps preferably performed by the facility when it receives an HTTP response from a web server.
Figure 6 is a display diagram showing the original web page as it would be rendered from the HTTP response body shown in Code Block 4 if received by a browser without having been modified by the facility.
Figure 7 is a display diagram showing the sample web page as rendered by the client's browser after being modified by the facility.
Figure 8 is a display diagram showing the rendered modified web page with its supplemental information bar visually minimized.
Figure 9 is a data flow diagram showing conventional proxying of a secure web conversation.
Figure 10 is a data flow diagram showing proxying of a secure web conversation by the facility.
DETAILED DESCRIPTION
A software facility for modifying the contents of a document during its delivery ("the facility") is provided. The facility is particularly useful for adding information to, and otherwise modifying, display documents such as HTML documents and other types of web pages. In particular, the facility may be used to identify information relating to a web page during its delivery and supplement the web page with this additional information.
The facility can preferably extract certain information from a document, then use the extracted information to determine how to modify the document. Further, the facility may use information extracted from a series of documents requested by a particular user to construct a profile of the user, which the facility in turn uses to determine how to modify a subsequent document. The facility receives documents during their delivery to a destination. For example, the facility preferably receives web pages during their delivery to the client computer system that requested each web page. This is preferably accomplished by configuring document requestors, such as web browsers executing on client computer systems, to route their requests for documents, and therefore the responses to those requests, through the facility.
For a particular document, based upon such information as its source and/or its contents, the facility selects one or more document processing specifications for use in modifying the document. Some document processing specifications specify particular modifications to a document. For example, various document processing specifications may specify the addition of editorial information to a document, the deletion of particular types of advertising from a document, or the addition of new advertising to a document. Additionally, some document processing specifications specify the extraction of particular information from a document. For example, one document processing specification can specify the extraction of information about a product that is described in a document.
The facility preferably utilizes both procedural document processing specifications, such as document processing scripts, and nonprocedural document processing specifications, such as document processing templates. In accordance with various document processing specifications, the facility may add to a document either content or references to content. The facility may further add user-activatable controls to a document. In a further embodiment, the facility constructs, at least for a portion of the documents it processes, a sequential document processing pipeline. The pipeline is an ordered sequence of document processing specifications to which a document is subjected in order. For example, the facility may construct a pipeline of three document processing specifications. As the contents of the document are received, the facility subjects them to the first document processing specification. As the contents of the document are processed in accordance with the first document processing specification, they are passed to the second document processing specification. Similarly, as the contents of the document are processed in accordance with the second document processing specification, they are passed to the third document processing specification. Finally, as the contents of the document are processed in accordance with the third document processing specification, they are forwarded to the destination. The use of such a document processing pipeline provides powerful document processing functionality, expedited document processing, and modularity in the design of document processing specifications.
The facility preferably uses several performance optimizations to improve the timeliness with which modified documents are delivered to their destinations. In a first optimization, the facility preferably streams out the document to the destination as it is processing the document, rather than sending the entire document to the destination only once processing of the document has been completed. This expedites the display of the document at the destination, as many document rendering programs used by clients, such as web browsers, can begin to render a document before the document is received in its entirety.
In a second optimization, where the facility or a third party information provider must perform significant processing in order to produce information to be added to a document, the facility preferably includes this information by reference. This enables the document to be delivered before the time cost of such processing is incurred, allowing a user to view portions of the modified documents while the document rendering program is resolving the included references and waiting for the associated processing by the third party information provider. As part of this optimization, if the facility is adding different portions of information to a document by reference that are from different sources or are the products of different processing, the facility preferably adds to the document separate references to these portions, enabling document rendering programs that process multiple references in a document asynchronously to send separate reference resolution requests for each portion, and display each portion as soon as it is received.
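As a rough illustration of the second optimization, the sketch below builds one reference per added portion so that a browser able to resolve references asynchronously can fetch and display each portion independently. The use of <IFRAME> elements follows the Internet Explorer approach mentioned elsewhere in this description; the function name, the "Reviews" feature, and the script URLs are hypothetical.

```python
from html import escape
from urllib.parse import urlencode


def reference_markup(features):
    """Build one <IFRAME> reference per feature.

    `features` maps a feature name to (script URL, parameter dict). Each
    portion gets its own reference, so slow processing for one feature
    does not delay the display of the others.
    """
    parts = []
    for name, (script_url, params) in features.items():
        src = script_url + "?" + urlencode(params)
        parts.append(
            '<IFRAME NAME="%s" SRC="%s"></IFRAME>' % (escape(name), escape(src))
        )
    return "\n".join(parts)


markup = reference_markup({
    "PriceCompare": ("/cgi/pricecompare.cgi", {"item": "Hot Lava motion lamp"}),
    "Reviews": ("/cgi/reviews.cgi", {"item": "Hot Lava motion lamp"}),
})
print(markup)
```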
By modifying and forwarding documents in this manner, the facility provides powerful document modification functionality whose impact on the delivery schedule of the document is minimized, and whose operation is independent of the participation of the publisher of the document.
Figure 1 is a network diagram showing the environment in which the facility preferably operates. A client computer system 101 connects to the Internet 120 via an Internet Service Provider ("ISP") 110. At the ISP 110, traffic passing between client computer system 101 and the Internet 120 is redirected by a layer 4 switch 113 through a proxy server 111 and a router 112. The facility, which processes web pages passing from web servers such as web servers 131-133 to the client computer system, is preferably implemented in proxy server 111. In an alternate configuration, the proxy server 111 is used not only by client computer systems such as client computer system 101 that use ISP 110, which operates the proxy server 111, but also by client computer systems using other ISPs, such as client computer system 151 using ISP 140. In this alternate configuration, a proxy auto-configuration file is sent to the browser residing on the client computer system 151, which directs the browser to send requests through the proxy server 111. Those skilled in the art will appreciate that the proxy server 111 may also be installed in a collocation facility not associated with an ISP.
Figure 2 is a high-level block diagram of the proxy server computer system. The proxy server 111 contains a memory 210. The memory 210 preferably contains the facility 211, as well as scripts 212, profiles 213, and cookies 214 used by the facility. While items 211-214 are preferably stored in memory while being used, those skilled in the art will appreciate that these items, or portions of them, may be transferred between memory and a persistent storage device 202 for purposes of memory management and data integrity. The proxy server further contains one or more central processing units (CPUs) 201 for executing programs, such as programs comprising the facility 211, and a computer-readable medium drive 203 for reading information or installing programs such as those comprising the facility from computer-readable media, such as a floppy disk, a CD ROM or a DVD. While preferred embodiments are described in terms of the environment described above, those skilled in the art will appreciate that the facility may be implemented in a variety of other environments, including various other combinations of computer systems or similar devices connected in various ways. The operation of the facility to process an HTTP request and the resulting HTTP response is described in detail below. To more fully illustrate the design and operation of the facility, this description is conducted in conjunction with an example.
Figure 3 is a flow diagram showing the steps preferably performed by the facility when an HTTP request is received from the client. In step 301, the facility determines the host and path of the web server to which the HTTP request is directed. If the HTTP request is a proxy request, the facility extracts the host and the path from its URL. If the HTTP request is not a proxy request, the facility extracts the host from the host header field of the HTTP request.
Code Block 1 below shows a sample HTTP request received by the facility.
GET http://www.lava-lampsdirect.com/lava-lamp.html HTTP/1.0
CODE BLOCK 1
The sample HTTP request in Code Block 1 is a proxy request, so the facility in step 301 extracts from its URL the host, "www.lava-lampsdirect.com", and its path, "lava-lamp.html".
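Step 301 can be sketched as follows. This is an illustrative Python fragment, not the facility's own code; the function name is hypothetical, and the header fields are assumed to be available as a dictionary.

```python
from urllib.parse import urlsplit


def host_and_path(request_line, headers):
    """Return (host, path) for an HTTP request, as in step 301."""
    _method, target, _version = request_line.split(" ", 2)
    parts = urlsplit(target)
    if parts.scheme and parts.netloc:
        # Proxy request: the request target is an absolute URL.
        return parts.netloc, parts.path
    # Non-proxy request: the host comes from the Host header field.
    return headers.get("Host", ""), parts.path


host, path = host_and_path(
    "GET http://www.lava-lampsdirect.com/lava-lamp.html HTTP/1.0", {}
)
print(host, path)
```

Note that this sketch keeps the leading slash of the path and drops any query string, which may differ from how the facility represents the path.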
In step 302, the facility uses the host and path determined in step 301 to determine the sequence of processing specifications to apply in processing the HTTP response that will be produced by the web server in response to the HTTP request. In order to do so, the facility preferably maintains a table that maps regular expressions, each matching a set of URIs (host and path combined), to sequences of document processing specification identifiers, each identifying a document processing specification.
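A table of this kind might look like the following Python sketch. The patterns, the specification identifiers, and the first-match-wins lookup order are all assumptions made for illustration; the text does not say how overlapping patterns are resolved.

```python
import re

# Hypothetical table: each entry pairs a pattern over "host/path" with an
# ordered list of document processing specification identifiers.
SPEC_TABLE = [
    (re.compile(r"^www\.lava-lampsdirect\.com/.*\.html$"),
     ["extract_product_data", "add_supplemental_info", "rewrite_secure_refs"]),
    (re.compile(r"^.*$"), []),  # default: no processing
]


def select_specifications(host, path):
    """Return the specification sequence for the first matching pattern."""
    uri = host + "/" + path.lstrip("/")
    for pattern, spec_ids in SPEC_TABLE:
        if pattern.match(uri):
            return spec_ids
    return []


print(select_specifications("www.lava-lampsdirect.com", "/lava-lamp.html"))
```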
In accordance with the example, the facility determines in step 302 that, based upon the web server host and path extracted from the sample HTTP request, the following sequence of processing specifications will be applied to the corresponding HTTP response: first, a document processing specification to extract product data from the web page; second, a document processing specification to add supplemental information to the web page; and third, a document processing specification to rewrite secure references in the web page to refer to the facility. In the example, these three document processing specifications are implemented as procedural scripts expressed in a scripting language similar to Perl. These scripts are shown below in Code Blocks 6, 7, and 8, respectively, and discussed further below.
In step 303, the facility constructs, in accordance with the sequence determined in step 302, a processing pipeline that will later be used by the facility to process the HTTP response generated by the web server in response to this HTTP request. In a preferred embodiment, step 303 involves instantiating a document processing, or "parser," object for each processing specification, storing in each document processing object a pointer to the corresponding document processing specification, and linking the objects together in the same sequence as that specified for the document processing specifications. The facility preferably registers the constructed pipeline with an HTTP callback so that, when the corresponding HTTP response is returned from the server, it is submitted to the pipeline for processing.
Figure 4 is a data flow diagram showing the makeup of a sample document processing pipeline. The document processing pipeline 420 comprises three document processing specifications 421-423. When a portion of an HTML document is received by the facility from the web server on which it originated, the facility submits the portion of the HTML document 410 to a first document processing specification 421 for extracting product data from the document. The document processing specification 421 processes the received portion of the HTML document, as well as any earlier-received HTML content that has been retained by this document processing specification for further processing, to extract product data from the document. Any HTML content completely processed by document processing specification 421 is passed to document processing specification 422, which processes the HTML to add supplemental information to the web page. Any HTML content completely processed by document processing specification 422 is passed to document processing specification 423 to rewrite secure references occurring in the HTML content. Any HTML content completely processed by document processing specification 423 is transmitted to the client as HTML portion 430.
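The chaining of parser objects described above can be sketched as follows. This is a simplified Python model under stated assumptions: each document processing specification is stood in for by a plain text-transforming function, and a stage retains only an unterminated tag at the end of its input. The class and function names are hypothetical, and the actual ZCL parser objects are likely more elaborate.

```python
class Parser:
    """One pipeline stage; passes on content it has completely processed."""

    def __init__(self, transform):
        self.transform = transform  # stands in for a processing specification
        self.pending = ""           # content retained for further processing
        self.next = None            # next stage, or None for the last stage

    def feed(self, chunk, sink):
        data = self.pending + chunk
        # Hold back an unterminated tag so it is never split across chunks.
        cut = data.rfind("<")
        if cut != -1 and ">" not in data[cut:]:
            data, self.pending = data[:cut], data[cut:]
        else:
            self.pending = ""
        self._emit(self.transform(data), sink)

    def close(self, sink):
        # Flush whatever is still retained, then close the downstream stage.
        self._emit(self.transform(self.pending), sink)
        self.pending = ""
        if self.next:
            self.next.close(sink)

    def _emit(self, text, sink):
        if not text:
            return
        if self.next:
            self.next.feed(text, sink)
        else:
            sink.append(text)  # last stage: forward to the client


def build_pipeline(transforms):
    """Link one Parser per transform, in order; expects at least one."""
    stages = [Parser(t) for t in transforms]
    for current, following in zip(stages, stages[1:]):
        current.next = following
    return stages[0]


sent_to_client = []
head = build_pipeline([
    lambda s: s.replace("<TITLE>", "<TITLE>[janedoe] "),
    lambda s: s.replace("https://", "https://secure.example/"),
])
for chunk in ['<HTML><TIT', 'LE>Lamps</TITLE><A HREF="https://shop/">', 'x</A></HTML>']:
    head.feed(chunk, sent_to_client)
head.close(sent_to_client)
print("".join(sent_to_client))
```

Because each stage forwards content as soon as it is done with it, the first pieces of the page can reach the client before the last pieces have arrived from the web server.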
In step 304, the facility generates an HTTP request to send to the web server based upon the HTTP request received from the client. In terms of the example, the HTTP request generated by the facility in step 304 is derived from the HTTP request received by the facility from the client shown in Code Block 1. The actual request is generated by connecting to the web server identified by the host portion of the URI in the proxy request and sending the path information as the request. This step is the same processing typically performed in a proxy server. In step 305, if the cookie maintenance service provided by the facility is active for this client, web server, and request, then the facility continues in step 306, else the facility continues in step 307. The cookie maintenance service is typically active for all pages requested from sites for which secure requests need to be processed. In step 306, the facility adds a cookie for this client and web server to the HTTP request generated in step 304. In order to support step 306, the facility preferably maintains a table of client cookie values and expiration dates indexed by client and sub-indexed by domain and path. This is similar to the cookie storage normally performed by the browser when the facility is not present.
After step 306, the facility continues to step 307. In step 307, the facility sends the generated HTTP request to the web server, using the host and path determined in step 301. After step 307, these steps conclude.
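The cookie table used in step 306 can be pictured with the following Python sketch. The class, its method names, and the simplified domain and path matching are assumptions for illustration only; real cookie matching rules are more involved.

```python
from collections import defaultdict
from datetime import datetime


class CookieTable:
    """Cookie values and expiration dates, indexed by client, then (domain, path)."""

    def __init__(self):
        # client id -> (domain, path) -> {cookie name: (value, expires)}
        self._table = defaultdict(lambda: defaultdict(dict))

    def set(self, client, domain, path, name, value, expires):
        self._table[client][(domain, path)][name] = (value, expires)

    def cookie_header(self, client, host, request_path, now):
        """Return the Cookie header value to add to a request in step 306."""
        pairs = []
        for (domain, path), cookies in self._table[client].items():
            host_ok = host == domain or host.endswith("." + domain)
            if host_ok and request_path.startswith(path):
                for name, (value, expires) in cookies.items():
                    if expires is None or expires > now:
                        pairs.append(name + "=" + value)
        return "; ".join(pairs)


table = CookieTable()
table.set("janedoe", "lava-lampsdirect.com", "/", "USERID", "01203",
          datetime(2000, 2, 23))
print(table.cookie_header("janedoe", "www.lava-lampsdirect.com",
                          "/lava-lamp.html", datetime(2000, 2, 17)))
```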
Figure 5 is a flow diagram showing the steps preferably performed by the facility when it receives an HTTP response from a web server. In step 501, the facility first receives the header of the HTTP response from the web server. A sample HTTP response header is shown in Code Block 2.
1 HTTP/1.1 200 OK
2 Date: Thu, 17 Feb 2000 03:54:56 GMT
3 Server: Apache/1.3.6 (Unix) mod_Perl/1.21 mod_ssl/2.3.9 OpenSSL/0.9.3a
4 Set-Cookie: USERID=01203; path=/; domain=lava-lampsdirect.com; expires=Wed,
5 23-Feb-2000 00:00:00 GMT
6 Content-length: 3686
CODE BLOCK 2
Lines 4-5 of the sample received header shown in Code Block 2 contain a set-cookie command instructing the browser that receives it to set a cookie named "USERID" to a value of "01203" for the domain "lava-lampsdirect.com." On line 6, the sample received header contains a content-length field indicating that the body of the HTTP response that follows the header is 3686 bytes long. In step 502, the facility replaces a content-length header field in a received header with an indication that the proxy server's connection with the client computer system will be explicitly closed after the entire response has been sent to the client. By replacing the content-length header field in this manner, the facility makes it possible to begin sending the processed response to the client before it has received and processed the original response from the web server in its entirety, and therefore before it knows the final length of the processed response.
In step 503, if the cookie maintenance service is active for this client and web server, then the facility continues in step 504, else the facility continues in step 505. In step 504, the facility removes from the received header any set-cookie fields, and processes the set-cookie command of each such set-cookie field against the table of client cookie values maintained by the facility to set the cookie values specified in the commands. After step 504, the facility continues in step 505.
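Steps 502 through 504 can be sketched in Python as below. The sketch assumes the header has already been split into lines, with any continuation line joined to the field it continues; the function name rewrite_response_header and its return shape are hypothetical.

```python
from typing import List, Tuple


def rewrite_response_header(lines: List[str],
                            strip_cookies: bool) -> Tuple[List[str], List[str]]:
    """Return (header lines to send to the client, removed set-cookie values)."""
    to_client: List[str] = []
    cookies: List[str] = []
    for line in lines:
        name, _, value = line.partition(":")
        key = name.strip().lower()
        if key == "content-length":
            # Step 502: the final length is not yet known, so signal that the
            # connection will be closed when the response is complete.
            to_client.append("Connection: Close")
        elif key == "set-cookie" and strip_cookies:
            # Step 504: withhold the field; the caller records it in the
            # cookie table maintained for this client.
            cookies.append(value.strip())
        else:
            to_client.append(line)
    return to_client, cookies
```

Applied to the header of Code Block 2, the sketch yields the status line, the Date field, the Server field, and a Connection: Close field, and returns the USERID cookie command separately for storage.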
In step 505, the facility sends the header to the client. Code Block 3 below shows the contents of the sample received header shown in Code Block 2 transmitted to the client after modification by the facility.
1 HTTP/1.1 200 OK
2 Date: Thu, 17 Feb 2000 03:54:56 GMT
3 Server: Apache/1.3.6 (Unix) mod_perl/1.21 mod_ssl/2.3.9 OpenSSL/0.9.3a
4 Connection: Close
CODE BLOCK 3
It can be seen that the facility in step 502 replaced the content-length field on line 6 of Code Block 2 with the connection close field on line 4 of Code Block 3. It can further be seen that in step 504 the facility removed the set-cookie field occurring on lines 4-5 in Code Block 2.
In step 506, the facility receives a portion of the body of the HTTP response from the web server. In step 507, the facility submits the received portion of the body to the document processing pipeline constructed in response to the corresponding HTTP request. In step 508, the facility processes body data through the pipeline constructed for the HTTP response in step 303 by passing output generated by each parser object to the next one. In step 509, after processing the data through the pipeline, the facility sends to the client any content emerging from the pipeline. In step 510, if the portion of the body received in step 506 contains the end of the HTTP response, then the facility continues in step 511, else the facility continues in step 506 to receive the next portion of the body. In step 511, the facility explicitly closes the connection with the client and flushes the pipeline by sending an EOF (end of file) message to the pipeline. This message indicates to the parser objects that no more HTML data is forthcoming, so any data still being accumulated should be sent on without further processing. In step 512, the facility deletes the document processing pipeline. After step 512, these steps conclude.
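The loop formed by steps 506 through 511 can be sketched in Python as follows. The callables feed, flush, send, and close stand in for the pipeline and the client connection and are assumptions of this sketch; the sketch also assumes that data released by the EOF flush is sent to the client before the connection is closed.

```python
from typing import Callable, Iterable


def relay_body(portions: Iterable[bytes],
               feed: Callable[[bytes], bytes],
               flush: Callable[[], bytes],
               send: Callable[[bytes], None],
               close: Callable[[], None]) -> None:
    """Stream a response body through a pipeline to the client."""
    for portion in portions:      # step 506: next portion from the web server
        ready = feed(portion)     # steps 507-508: run it through the pipeline
        if ready:
            send(ready)           # step 509: forward whatever has emerged
    tail = flush()                # step 511: EOF, release accumulated data
    if tail:
        send(tail)
    close()                       # step 511: explicitly close the connection


class HoldIncompleteTags:
    """Toy one-stage pipeline that withholds data after the last '>'."""

    def __init__(self) -> None:
        self.buf = bytearray()

    def feed(self, portion: bytes) -> bytes:
        self.buf += portion
        cut = self.buf.rfind(b">") + 1      # 0 when no '>' has been seen
        out = bytes(self.buf[:cut])
        del self.buf[:cut]
        return out

    def flush(self) -> bytes:
        out = bytes(self.buf)
        self.buf.clear()
        return out
```

Because the client was told in the header that the connection will be closed, it treats the close in step 511 as the end of the body, so no length needs to be known in advance.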
Code Block 4 below shows a sample HTTP response body received by the facility.
1 <HTML>
2 <HEAD>
3 <title>Acme.electronics</title>
4 <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
5 </head>
6
7 <body bgcolor="#FFFFFF">
8 <div id="Layer1" style="position:absolute; width:462px; height:450px; z-index:1;
9 left: 2px; top: 1px">
10 <table width="450" border="0" align="left" vspace="0" hspace="0" cellspacing="0" cellpadding="0">
11 <tr valign="top" bgcolor="#FFFFFF">
12 <td colspan="4"><img src="lava-lampsdirect.gif" width="450" height="100"
13 border="1"></td>
14 </tr>
15 <tr align="left" valign="top">
16 <td bgcolor="#CCFFFF" width="130" align="center" rowspan="4" height="225"
17 valign="top">
18 <p><font face="Verdana, Arial, Helvetica, sans-serif" size="2"><br>
19 HOME<br>
20 </font></p>
21 <p><font face="Verdana, Arial, Helvetica, sans-serif"
22 size="2">FURBYS<br>
23 </font> </p>
24 <p><font face="Verdana, Arial, Helvetica, sans-serif" size="2"><b><font
25 color="#339900"><a href="/lavalamps.html">LAVA-LAMPS</a><br>
26 </font></b></font></p>
27 <p><font face="Verdana, Arial, Helvetica, sans-serif" size="2">CLEARANCE
28 <br>
29 </font></p>
30 </td>
31 <td align="center" colspan="2" bgcolor="#CCFFFF" width="150">
32 <p><b><font face="Verdana, Arial, Helvetica, sans-serif" size="4">Hot
33 Lava<font size="2">®<br>
34 </font></font></b><b><font face="Verdana, Arial, Helvetica, sans
35 -serif" size="4"><font size="2">motion
36 lamp</font></font></b></p>
37 </td>
38 <td width="150" bgcolor="#CCFFFF" valign="bottom"><font face="Verdana,
39 Arial, Helvetica, sans-serif" size="2"><b>Style
40 # groovy321</b></font></td>
41 </tr>
42 <tr align="left" valign="top">
43 <td rowspan="3" colspan="2" bordercolor="#999999" width="150"><img
44 src="lava-lamp.gif" width="106" height="250"></td>
45 <td width="150" bgcolor="#CCFFFF"><font size="1" face="Verdana, Arial,
46 Helvetica, sans-serif">
47 <br>
48 The Hot Lava Brand Motion Lamp (AKA lava-lamp): Dark as night, lava
49 action
50 is all that is seen. Watch as the lava creates ever-changing and
51 endlessly
52 fascinating shapes and formations. Lava has a calming effect, as it
53 gently
54 shifts and pulses before your eyes, stress and tension slip peacefully
55 away.<br>
56 </font></td>
57 </tr>
58 <tr align="left" valign="top">
59 <td width="150" bgcolor="#CCFFFF"><font face="Verdana, Arial, Helvetica,
60 sans-serif" size="2" color="#FF0033"><b><font color="#000000">Our
61 Price:</font> $99.99<br>
62 </b></font></td>
63 </tr>
64 <tr align="left" valign="top">
65 <td width="150" align="center" bgcolor="#CCFFFF"><img src="buy.gif"
66 width="36" height="36"></td>
67 </tr>
68 <tr align="center" valign="top" bgcolor="#FF33FF">
69 <td colspan="4">&nbsp;</td>
70 </tr>
71 <tr align="center" valign="bottom" bgcolor="#FF33FF">
72 <td width="99"><b><font face="Verdana, Arial, Helvetica, sans-serif"><font
73 size="1">COMPANY
74 INFO.</font> </font></b></td>
75 <td width="94"><b><font face="Verdana, Arial, Helvetica, sans-serif"
76 size="1">CONTACT
77 US</font></b></td>
78 <td width="94"><b><font face="Verdana, Arial, Helvetica, sans-serif"
79 size="1">TECH.
80 SUPPORT</font></b></td>
81 <td width="163"><b><font face="Verdana, Arial, Helvetica, sans-serif"
82 size="1"><a href="https://www.lava-lampsdirect.com/cgi-bin/myaccount.cgi">MY
83 ACCOUNT</a> </font></b></td>
84 </tr>
85 <tr align="center" valign="bottom" bgcolor="#FF33FF">
86 <td width="450" colspan="4"> <b></b><b></b><b><font face="Verdana, Arial,
87 Helvetica, sans-serif" size="1">
88 </font></b><b><br>
89 </b></td>
90 </tr>
91 </table>
92 </div>
93 </body>
94 </html>
CODE BLOCK 4
Figure 6 is a display diagram showing the original web page as it would be rendered from the HTTP response body shown in Code Block 4 if received by a browser without having been modified by the facility. A browser window 600 contains a web page rendering window 610. The web page rendering window 610 in turn contains the rendered contents of the unmodified web page. The rendered web page includes information about a lava-lamp product, including a product name 621, a product item number 622, a textual description of the product 623, a price for the product 624, and a picture of the product 625. The rendered web page further includes a purchasing control 631 that may be operated by the user in order to buy the product from the publisher of the web page, as well as a secure link 641 to a separate account management web page published by the publisher of the current web page.
Code Block 5 shows the HTTP response body preferably transmitted to the client after modification by the facility in accordance with document processing specifications 421-423 shown in Figure 4.
1 <HTML>
2 <HEAD>
3 <title>[janedoe@pod1] Acme.electronics</title>
4 <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
5 </head>
6
7 <body bgcolor="#FFFFFF">
8 <div id="Layer1" style="position:absolute; width:462px; height:450px; z-index:1;
9 left: 2px; top: 1px">
10 <table width="450" border="0" align="left" vspace="0" hspace="0"
11 cellspacing="0" cellpadding="0">
12 <tr valign="top" bgcolor="#FFFFFF">
13 <td colspan="4"><img src="lava-lampsdirect.gif" width="450" height="100"
14 border="1"></td>
15 </tr>
16 <tr align="left" valign="top">
17 <td bgcolor="#CCFFFF" width="130" align="center" rowspan="4" height="225"
18 valign="top">
19 <p><font face="Verdana, Arial, Helvetica, sans-serif" size="2"><br>
20 HOME<br>
21 </font></p>
22 <p><font face="Verdana, Arial, Helvetica, sans-serif"
23 size="2">FURBYS<br>
24 </font> </p>
25 <p><font face="Verdana, Arial, Helvetica, sans-serif" size="2"><b><font
26 color="#339900"><a href="/lavalamps.html">LAVA-LAMPS</a><br>
27 </font></b></font></p>
28 <p><font face="Verdana, Arial, Helvetica, sans-serif" size="2">CLEARANCE
29 <br>
30 </font></p>
31 </td>
32 <td align="center" colspan="2" bgcolor="#CCFFFF" width="150">
33 <p><b><font face="Verdana, Arial, Helvetica, sans-serif" size="4">Hot
34 Lava<font size="2">®<br>
35 </font></font></b><b><font face="Verdana, Arial, Helvetica, sans
36 -serif" size="4"><font size="2">motion
37 lamp</font></font></b></p>
38 </td>
39 <td width="150" bgcolor="#CCFFFF" valign="bottom"><font face="Verdana,
40 Arial, Helvetica, sans-serif" size="2"><b>Style
41 # groovy321</b></font></td>
42 </tr>
43 <tr align="left" valign="top">
44 <td rowspan="3" colspan="2" bordercolor="#999999" width="150"><img
45 src="lava-lamp.gif" width="106" height="250"></td>
46 <td width="150" bgcolor="#CCFFFF"><font size="1" face="Verdana, Arial,
47 Helvetica, sans-serif">
48 <br>
49 The Hot Lava Brand Motion Lamp (AKA lava-lamp): Dark as night, lava
50 action
51 is all that is seen. Watch as the lava creates ever-changing and
52 endlessly
53 fascinating shapes and formations. Lava has a calming effect, as it
54 gently
55 shifts and pulses before your eyes, stress and tension slip peacefully
56 away.<br>
57 </font></td>
58 </tr>
59 <tr align="left" valign="top">
60 <td width="150" bgcolor="#CCFFFF"><font face="Verdana, Arial, Helvetica,
61 sans-serif" size="2" color="#FF0033"><b><font color="#000000">Our
62 Price:</font> $99.99<br>
63 </b></font></td>
64 </tr>
65 <tr align="left" valign="top">
66 <td width="150" align="center" bgcolor="#CCFFFF"><img src="buy.gif"
67 width="36" height="36"></td>
68 </tr>
69 <tr align="center" valign="top" bgcolor="#FF33FF">
70 <td colspan="4">&nbsp;</td>
71 </tr>
72 <tr align="center" valign="bottom" bgcolor="#FF33FF">
73 <td width="99"><b><font face="Verdana, Arial, Helvetica, sans-serif"><font
74 size="1">COMPANY
75 INFO.</font> </font></b></td>
76 <td width="94"><b><font face="Verdana, Arial, Helvetica, sans-serif"
77 size="1">CONTACT
78 US</font></b></td>
79 <td width="94"><b><font face="Verdana, Arial, Helvetica, sans-serif"
80 size="1">TECH.
81 SUPPORT</font></b></td>
82 <td width="163"><b><font face="Verdana, Arial, Helvetica, sans-serif"
83 size="1"><a href="https://secure.zacknetwork.com//www.lava-lampsdirect.com/cgi-
84 bin/myaccount.cgi">MY ACCOUNT</a> </font></b></td>
85 </tr>
86 <tr align="center" valign="bottom" bgcolor="#FF33FF">
87 <td width="450" colspan="4"> <b></b><b></b><b><font face="Verdana, Arial,
88 Helvetica, sans-serif" size="1">
89 </font></b><b><br>
90 </b></td>
91 </tr>
92 </table>
93 </div>
94 </body>
95
96 <!--This is the container for the "open" box.-->
97 <DIV CLASS="menbar1" ID="menbar1" STYLE="position:absolute; top:0; left:-2700;
98 visibility:hide; height:95; background-color:transparent; layer-background-color:
99 transparent" zIndex="1001" ALIGN="right">
100
101
102 <table cellspacing="0" cellpadding="0" border="0"><tr>
103 <a href="map1"><img
104 src="http://podcgi.zacknetwork.com/skins_images/zackbox4.gif" usemap="#map2"
105 border="0"></a><map name="map2">
106 <area coords="0,0,87,24" href="#" onMouseOver="status='Close ZackBar'; return
107 true;" onClick="slideMenu(1);">
108 <area coords="141,7,253,22" href="javascript: return false;"
109 onMouseOver="status='Add to My Wishlist'; return
110 true;" onClick="window.open('http://podcgi.zacknetwork.com/cgi
111 -bin/addToWishlist.cgi?description=&price=&url=', 'win1',
112 'directories=no, fullscreen=no, height=350, width=400, location=no, scrollbars=no,
113 status=no, toolbar=no, left=100, top=100, screenX=100, screenY=100');">
114 <area coords="284,7,326,22" href="javascript: return false;"
115 onMouseOver="status='Go to myZack Page'; return true;"
116 onClick="window.open('http://www.myzack.com', 'win2')">
117 <area coords="362,7,380,22" href="javascript: return false;"
118 onMouseOver="status='Exit'; return true;"
119 onClick="window.open('http://podcgi.zacknetwork.com/cgi-bin/logout.cgi',
120 'win3')">
121 <area shape="circle" coords="438,15,10" href="javascript: return false;"
122 onMouseOver="status='Help'; return true;" onClick="window.open('$help',
123 'win4')"></map></td></tr>
124 </table>
125
126
127 </DIV>
128
129 <!--This is the container for the "closed" box.-->
130 <DIV CLASS="menbar2" ID="menbar2" STYLE="position:absolute; top:0; left:
131 -50; visibility:hide; height:95; width:10; margin:0; border:0px solid
132 #99FF99; background-color:transparent; layer-background-color:transparent"
133 zIndex="1000" ALIGN="right">
134 <table cellspacing="0" cellpadding="0" border="0" height="95"><tr><td
135 valign="bottom">
136 <A href="#" onClick="slideMenu(2);"><img name="outbartab"
137 src="http://podcgi.zacknetwork.com/skins_images/ztab.gif" border="0"></A>
138 </td></tr></table>
139 </DIV>
140
141 <script language="JavaScript1.2"
142 src="http://podcgi.zacknetwork.com/skins_images/boxfuncs_ns.js">
143
144 </script>
145
146 <div id="cell0pre" style="position:absolute; margin:0; height:30; width:70;
147 z-index:1100; top:500; left:-400; overflow:clip;">
148 <font face="sans serif, helvetica, arial, verdana" size="2"><b>Hello
149 janedoe<br><a href="http://www.myzack.com">Not janedoe?</a></b></font>
150 </div>
151 <script language="Javascript1.2">
152
153 dynObjectArray[numChunks] = "cell0pre";
154 xOffset[numChunks] = 375;
155 yOffset[numChunks] = 30;
156 numChunks++;
157
158 </script>
159 <div id="cell1pre" style="position:absolute; margin:0; height:25; width:300;
160 z-index:1100; top:500; left:-400; overflow:clip;">
161 <font face="sans serif, helvetica, arial, verdana"
162 size="2"><b>Community</b></font>
163 </div>
164 <script language="Javascript1.2">
165
166 dynObjectArray[numChunks] = "cell1pre";
167 xOffset[numChunks] = 15;
168 yOffset[numChunks] = 45;
169 numChunks++;
170
171 </script>
172 <div id="cell2pre" style="position:absolute; margin:0; height:25; width:350;
173 z-index:1100; top:500; left:-400; overflow:clip;">
174 <font face="sans serif, helvetica, arial, verdana"
175 size="2"><b>AuctionWatch</b></font>
176 </div>
177 <script language="Javascript1.2">
178
179 dynObjectArray[numChunks] = "cell2pre";
180 xOffset[numChunks] = 15;
181 yOffset[numChunks] = 75;
182 numChunks++;
183
184 </script>
185 <div id="cell3pre" style="position:absolute; margin:0; height:25; width:350;
186 z-index:1100; top:500; left:-400; overflow:clip;">
187 <font face="sans serif, helvetica, arial, verdana"
188 size="2"><b>PriceCompare</b></font>
189 </div>
190 <script language="Javascript1.2">
191
192 dynObjectArray[numChunks] = "cell3pre";
193 xOffset[numChunks] = 15;
194 yOffset[numChunks] = 105;
195 numChunks++;
196
197 </script>
198 <div id="cell4pre" style="position:absolute; margin:0; height:25; width:300;
199 z-index:1100; top:500; left:-400; overflow:clip;">
200 <font face="sans serif, helvetica, arial, verdana" size="2">Loading ...</font>
201 </div>
202 <script language="Javascript1.2">
203
204 dynObjectArray[numChunks] = "cell4pre";
205 xOffset[numChunks] = 95;
206 yOffset[numChunks] = 45;
207 numChunks++;
208 </script>
209 <div id="cell5pre" style="position:absolute; margin:0; height:25; width:350;
210 z-index:1100; top:500; left:-400; overflow:clip;">
211 <font face="sans serif, helvetica, arial, verdana" size="2">Loading ...</font>
212 </div>
213 <script language="Javascript1.2">
214 dynObjectArray[numChunks] = "cell5pre";
215 xOffset[numChunks] = 95;
216 yOffset[numChunks] = 75;
217 numChunks++;
218
219 </script>
220 <div id="cell6pre" style="position:absolute; margin:0; height:25; width:350;
221 z-index:1100; top:500; left:-400; overflow:clip;">
222 <font face="sans serif, helvetica, arial, verdana" size="2">Loading ...</font>
223 </div>
224 <script language="Javascript1.2">
225
226 dynObjectArray[numChunks] = "cell6pre";
227 xOffset[numChunks] = 95;
228 yOffset[numChunks] = 105;
229 numChunks++;
230
231 </script>
232 <script language="JavaScript1.2">
233
234 slideMenu(2);
235
236 </script><div id="cell4" style="position:absolute; margin:0; height:25;
237 width:300; z-index:1100; top:500; left:-400; overflow:clip; background:
238 transparent;">
239 <layer src="http://podcgi.zacknetwork.com/cgi-
240 bin/community.cgi?ItemDescription=Hot+Lava+motion+lamp&browsertype=ns&fontsize=
241 2"></layer>
242 </div><script language="Javascript1.2">
243
244 document.layers[dynObjectArray[4]].visibility = "hide";
245 dynObjectArray[4] = "cell4";
246
247 </script><div id="cell5" style="position:absolute; margin:0; height:25;
248 width:350; z-index:1100; top:500; left:-400; overflow:clip; background:
249 transparent;">
250 <layer src="http://podcgi.zacknetwork.com/cgi-
251 bin/auction.cgi?description=Hot+Lava+motion+lamp&title=&artist=&browsertype=ns&f
252 ontsize=2"></layer>
253 </div><script language="Javascript1.2">
254
255 document.layers[dynObjectArray[5]].visibility = "hide";
256 dynObjectArray[5] = "cell5";
257
258 </script><div id="cell6" style="position:absolute; margin:0; height:25;
259 width:350; z-index:1100; top:500; left:-400; overflow:clip; background:
260 transparent;">
261 <layer src="http://podcgi.zacknetwork.com/cgi-
262 bin/pricecompare.cgi?description=Hot+Lava+motion+lamp&title=&artist=&format=&isb
263 n=&director=&dvd=&vhs=&type=&username=&browsertype=ns&fontsize=2"></layer>
264 </div><script language="Javascript1.2">
265
266 document.layers[dynObjectArray[6]].visibility = "hide";
267 dynObjectArray[6] = "cell6";
268
269 </script>
270 </html>
CODE BLOCK 5
Figure 7 is a display diagram showing the sample web page as rendered by the client's browser after being modified by the facility. The application of document processing pipeline 420 shown in Figure 4 can be seen in comparing Figures 6 and 7. In applying the item data extraction document processing specification 421, the facility extracts from the web page and stores in its database such information as the item brand "Hot Lava" and description "motion lamp" 621, the item number "groovy321" 622 and the item price "$99.99" 624.
In applying the supplemental information addition document processing specification 422, the facility modifies the title bar of the browser window 600 to include a user and a pod name as shown in the title bar of browser window 700. The facility further adds a supplemental information bar 750 to the web page rendering area 710. The supplemental information bar includes links, such as a link 751 to add the product described in the web page to a wish list maintained for the user, a link 752 to a page containing information for the user, a link 753 to disable the facility and/or the display of the supplemental information bar, a link 754 to a help page, and a link 755 for switching to a different user. The supplemental information bar further contains information produced by three features: a Community feature 760, an AuctionWatch feature 770, and a PriceCompare feature 780. The Community feature contains links 761 and 762 to two products that are popular with other users that have requested information about and/or purchased the lava-lamp product. The AuctionWatch feature contains two links 771 and 772 to on-line auctions in which the lava-lamp product is offered for sale. The PriceCompare feature contains two links 781 and 782 to other web-based merchants that are offering the lava-lamp product for sale at lower prices.
In applying the secure reference rewriting document processing specification 423, the facility modifies the URL for secure link 641 to refer to the facility rather than to the web server to which it originally referred.
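The effect of the secure reference rewriting can be sketched in Python. The proxy host is the one that appears in Code Block 5; the function name rewrite_secure_refs, the restriction to href attributes, and the regular expression are assumptions of this sketch and not the facility's actual script.

```python
import re

# Host of the facility as it appears in the rewritten link of Code Block 5.
SECURE_PROXY = "https://secure.zacknetwork.com/"

# Hypothetical pattern: a double-quoted href attribute with an https URL.
_SECURE_HREF = re.compile(r'(href\s*=\s*")https://([^"]+)"', re.IGNORECASE)


def rewrite_secure_refs(html: str) -> str:
    """Point secure links at the facility, keeping the original host in the path."""
    def repl(match: "re.Match[str]") -> str:
        target = match.group(2)
        if target.startswith("secure.zacknetwork.com/"):
            return match.group(0)          # already refers to the facility
        return f'{match.group(1)}{SECURE_PROXY}/{target}"'
    return _SECURE_HREF.sub(repl, html)
```

Keeping the original host and path inside the rewritten URL lets the facility recover the intended destination when the browser later follows the link.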
Figure 8 is a display diagram showing the rendered modified web page with its supplemental information bar visually minimized. By comparing Figure 8 to Figure 7, it can be seen that the supplemental information bar 750 has been replaced with a much smaller supplemental information bar icon 890 by selecting a minimize control of the supplemental information bar 750. This enables the user to view any content of the original web page that may be obscured by the display of the supplemental information bar 750.
In one embodiment, the facility uses procedural scripts as document processing specifications. Such procedural scripts are preferably expressed in a command language similar to Perl called "ZCL." ZCL is described at a high level herein, and is described in greater detail in Appendix 1 which follows hereafter.
The function of a parser in the document processing pipeline is to process the HTML content that it receives from the previous link in the chain by applying its ZCL script to that stream of data. A simple ZCL script comprises a set of "SELECT" statements that indicate portions of the incoming page to which to apply commands. The commands to be applied are attached to each "SELECT" statement. The parser accumulates all HTML data until it encounters data that matches the regular expression that indicates that the current "SELECT" statement has been satisfied. When the termination expression is satisfied, the parser then applies all the commands to the data grabbed from the page, returns the processed data, and goes on to process the next "SELECT" statement. Commands in a "SELECT" block can perform two general functions. They modify the HTML data, and/or they grab data from the HTML page and store them in ZCL variables for later use. "MatchAndReplace" commands that perform substitution based on regular expressions perform the HTML modification. Data retrieval is performed by "GrabText" commands that assign data using the result of regular expression matching.
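The "SELECT" processing described above can be approximated by the following Python sketch. It is not ZCL and not the facility's parser: the classes Select and SelectParser, the tuple layout of the commands, and the omission of early forwarding for empty command clauses are all simplifying assumptions.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Select:
    """Hypothetical stand-in for one SELECT statement."""
    until: str                                                      # termination regex
    grabs: List[Tuple[str, str]] = field(default_factory=list)     # (regex, variable)
    replaces: List[Tuple[str, str]] = field(default_factory=list)  # (regex, replacement)


class SelectParser:
    def __init__(self, selects: List[Select], variables: Dict[str, str]):
        self.selects = list(selects)
        self.variables = variables     # shared with other parsers in the chain
        self.buffer = ""
        self.index = 0                 # SELECT statement currently being satisfied

    def feed(self, data: str) -> str:
        """Accumulate data; return the blocks whose SELECT has been satisfied."""
        self.buffer += data
        out = []
        while self.index < len(self.selects):
            sel = self.selects[self.index]
            hit = re.search(sel.until, self.buffer)
            if hit is None:
                break                  # termination expression not yet seen
            block, self.buffer = self.buffer[:hit.end()], self.buffer[hit.end():]
            for pattern, var in sel.grabs:          # GrabText-style commands
                found = re.search(pattern, block)
                if found:
                    self.variables[var] = found.group(1)
            for pattern, repl in sel.replaces:      # MatchAndReplace-style commands
                block = re.sub(pattern, repl, block)
            out.append(block)
            self.index += 1
        if self.index == len(self.selects):
            out.append(self.buffer)    # no statements left: pass data straight on
            self.buffer = ""
        return "".join(out)

    def flush(self) -> str:
        """On EOF, send on whatever is still being accumulated."""
        rest, self.buffer = self.buffer, ""
        return rest
```

Because the variables dictionary is passed in rather than owned by the parser, values grabbed by one parser remain visible to parsers later in the chain.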
All the parsers in the chain share data assigned to variables. This means that variables set early in the chain may be used by ZCL scripts later in the chain.
In addition to "SELECT"-based processing, ZCL also supports a more flexible, global search-and-replace function called "PipelinedMatch". This command applies a regular expression-based search and replace to the whole HTML page while forwarding HTML data as soon as a match is processed or it is clear that no match will be found in that text.
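A streaming search-and-replace of this kind can be sketched in Python as follows. The sketch assumes that no match is longer than a fixed number of characters, so that only a short tail of the data has to be held back between portions; the class name PipelinedMatch is borrowed from the command, but the holdback parameter and the implementation are assumptions, not the facility's code.

```python
import re


class PipelinedMatch:
    """Sketch of a global replace that forwards data as early as possible."""

    def __init__(self, pattern: str, replace: str, holdback: int = 20):
        self.regex = re.compile(pattern)
        self.replace = replace      # replacement template (re.sub syntax)
        self.holdback = holdback    # longest text that may still start a match
        self.pending = ""

    def feed(self, data: str) -> str:
        self.pending += data
        out = []
        pos = 0
        for match in self.regex.finditer(self.pending):
            out.append(self.pending[pos:match.start()])
            out.append(match.expand(self.replace))   # match processed: forward it
            pos = match.end()
        rest = self.pending[pos:]
        keep = min(len(rest), self.holdback)
        out.append(rest[:len(rest) - keep])          # cannot contain a match start
        self.pending = rest[len(rest) - keep:]       # may be a partial match
        return "".join(out)

    def flush(self) -> str:
        """On EOF, apply the replacement to the tail and send it on."""
        out, self.pending = self.regex.sub(self.replace, self.pending), ""
        return out
```

For example, a "<title>" tag split across two received portions is still matched, because the fragment "<tit" stays in the held-back tail until the rest of the tag arrives.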
The facility stores data collected during HTML processing in a database resident on the proxy server computer system. This database stores data "grabbed" from the pages for each user by ZCL scripts. The database is designed with a dynamic schema, which is updated whenever new ZCL scripts are installed for use by the facility.
Scripts for web pages containing consumer-oriented data generally grab product descriptions, as well as more detailed information such as item type, price, and URL of the product page.
The facility builds a user profile by grabbing data from pages the user visits. Each time the user visits a product page (i.e., a page describing a specific product), a data extraction script stores the product's description and price. The facility processes this data to extract the last n items that the user viewed. In one preferred embodiment, the facility maintains information on the 50 items most recently viewed. The facility uses these items to generate a profile of the user's interests. The facility characterizes a user's profile according to keyword frequency and item type frequency. Keyword frequency is the number of times a given keyword occurs in the recorded items. Item type frequency is the number of times a given item type (product category) occurs. The facility records the user's top keywords and item types and uses them to target advertisements and other content that are relevant to the user's recent browsing.
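The profile computation can be illustrated with a short Python sketch. The class UserProfile, its method names, and the tokenization of descriptions into lowercase words are assumptions made for the example; the description above specifies only the two frequencies and the limit on recently viewed items.

```python
import re
from collections import Counter, deque
from typing import Deque, Tuple


class UserProfile:
    """Hypothetical profile built from the most recently viewed items."""

    def __init__(self, max_items: int = 50):
        # Oldest entries are discarded once max_items have been recorded.
        self.items: Deque[Tuple[str, str, str]] = deque(maxlen=max_items)

    def record_view(self, description: str, item_type: str, price: str) -> None:
        self.items.append((description, item_type, price))

    def keyword_frequency(self) -> Counter:
        """Number of times each keyword occurs in the recorded descriptions."""
        counts: Counter = Counter()
        for description, _, _ in self.items:
            counts.update(re.findall(r"[a-z0-9]+", description.lower()))
        return counts

    def item_type_frequency(self) -> Counter:
        """Number of times each item type occurs in the recorded items."""
        return Counter(item_type for _, item_type, _ in self.items)
```

The top entries of each counter (for instance via Counter.most_common) could then be used to select advertisements or other content.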
Code Block 6 below is a sample document processing script corresponding to the product data extraction document processing specification 421 shown in Figure 4, to which the contents of the body of the HTTP response shown in Code Block 4 are first subjected. This script extracts product data from the body of the HTTP response for use in profiling the user retrieving this web page.
1 ; This is a sample ZCL (Zack Command Language) script that captures the
2 ; Item Description, Item Brand, Item Price, Manufacturer's Number, Item Type,
3 ; and Item Subtype from the sample HTML page (zcl_patent_demo.html).
4 ; The regular expression for the site would be similar to
5 ; RegEx: /patentdemo/products/.*
6
7 Example:
8 SetVar (
9 ) {
10 ; var name value
11 item_type home
12 }
13
14 Select (
15 until = <b><font
16 )
17 {}
18
19 Select (
20 until = </a>
21 )
22 {
23 GrabText (
24 match = lamps.html">([^<]*)<br>
25 list = item_subtype
26 paren = 1
27 )
28 }
29
Select (
until
)
{}
Select (
until = <br>
)
{
GrabText (
match = >([^<]*)<font
list = item_brand
paren = 1
)
}
Select (
until
)
Select (
until = </font>
)
{
GrabText (
match = >([^<]*)</font>
list = item_description
paren = 1
)
}
Select (
until = </b></font></td>
)
{
GrabText (
match = Style\s*#\s*([^<]*)</b></font>
list = item_number
paren = 1
)
}
Select (
until = Price:
)
{}
Select (
until = <br>
)
{
GrabText (
match = </font>\s*([^<]*)<br>
list = item_price
paren = 1
)
}
89 DBStore (
90 table = lavalampsdirect.com
91 )
92 {
93 ItemDescription_viewed item_description all
94 ItemBrand_viewed item_brand all
95 ItemPrice_viewed item_price all
96 ItemNumber_viewed item_number all
97 ItemType_viewed item_type all
98 ItemSubType_viewed item_subtype all
99 }
CODE BLOCK 6
The script shown in Code Block 6 extracts product data from the web page and stores it in the database. A ZCL statement in lines 14-17 of Code Block 6 processes data until the first occurrence of the string "<b><font". This string occurs in the page just before the heading of the page (in this case "LAVA-LAMPS" in lines 24-25 of Code Block 4). The empty command clause allows the parser to pass data on as soon as it is clear that it doesn't contain the termination string. Thus, as data comes in, it is passed on to the next ZCL script in the chain.
The ZCL statement in lines 19-28 of Code Block 6 extracts the heading of the page, "LAVA-LAMPS," from line 25 of Code Block 4 into the variable named "item_subtype".
In a manner analogous to the ZCL statements in lines 14-17 and 19-28, respectively, of Code Block 6, the ZCL statements in lines 30-38 and 39-44 advance the HTML stream to just before the brand name, and then extract the brand name "Hot Lava", from lines 32-33 of Code Block 4, into the item_brand variable.
The next ZCL statements in Code Block 6 are structured similarly to the first four, with several pairs of statements to advance to the next data item, and then to extract the data item into the ZCL variables item_description, item_number, and item_price.
The final statement at lines 89-99 of Code Block 6 stores the acquired text into the database, indexed by domain name and user name.
As a result, ZCL variables and database fields have the extracted values shown below in Table 1.
ZCL Variable        Database Field            Value
item_subtype        ItemSubType_viewed        LAVA-LAMPS
item_brand          ItemBrand_viewed          Hot Lava
item_description    ItemDescription_viewed    motion lamp
item_number         ItemNumber_viewed         groovy321
item_price          ItemPrice_viewed          $99.99
TABLE 1
The facility preferably adds to certain web pages a subwindow called a "supplemental information bar" that contains additional information relating to the web page. The supplemental information bar is preferably implemented as a DHTML widget, which enables the information contained in it to be displayed in a relatively unobtrusive manner. The supplemental information bar floats above the page and can be "minimized" so that it doesn't obscure the existing page content.
By using external content, the supplemental information bar allows easy substitution of features, which enables a "slotting" strategy. That is, slots in the supplemental information bar can be sold. The supplemental information bar code is capable of placing different features on different pages.
The supplemental information bar is preferably implemented using a Perl function which generates HTML text which is inserted into the original page between the </BODY> and </HTML> tags. This ensures that all the important HTML text in the page will have been processed by the time the supplemental information bar function is called. This allows ZCL parsers earlier in the chain to grab all relevant information out of the page so the variables can be passed to the supplemental information bar. The supplemental information bar pulls in external content from server-side scripts, such as CGI scripts, that are called from the browser by elements able to reference external objects, such as <DIV> tags in Netscape, or <IFRAME> tags in Internet Explorer. The ZCL script that places the supplemental information bar into the page generates the URLs for the CGI requests, passing the appropriate data grabbed from the page as parameters. In the example above, the product description is passed to a CGI script which generates a price comparison.
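The generation of CGI request URLs from grabbed page data can be sketched in Python. The host name and query parameter names below are those visible in Code Block 5; the helper names cgi_url and layer_tag are hypothetical, and the facility itself is described as using a Perl function for this purpose.

```python
from typing import Mapping, Union
from urllib.parse import urlencode


def cgi_url(server: str, script: str,
            params: Mapping[str, Union[str, int]]) -> str:
    """Build the URL of a CGI request, encoding grabbed values as parameters."""
    return f"http://{server}/cgi-bin/{script}?{urlencode(params)}"


def layer_tag(url: str) -> str:
    """Wrap the URL in a <layer> element, as used for Netscape browsers."""
    return f'<layer src="{url}"></layer>'


# Values that an earlier script in the chain is assumed to have grabbed.
grabbed = {"item_brand": "Hot Lava", "item_description": "motion lamp"}
description = f'{grabbed["item_brand"]} {grabbed["item_description"]}'
community = cgi_url("podcgi.zacknetwork.com", "community.cgi",
                    {"ItemDescription": description,
                     "browsertype": "ns", "fontsize": 2})
```

Here urlencode replaces spaces with "+" signs, which is the form in which the product description appears in the layer URLs of Code Block 5.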
The supplemental information bar uses <LAYER> and <DIV> tags to place external content into the page. The content is generated by CGI scripts which are handed parameters grabbed from the page. In Netscape browsers, a <LAYER> tag is used to include the external content. In Internet Explorer, <DIV> tags are used instead, with an <IFRAME> holding the actual content generated by the script. When the CGI content is received, it is copied from the <IFRAME> into the <DIV>. Note that to accomplish this, a special proxy command has been implemented. In order to prevent malicious web sites from capturing data from users by copying data from one frame to another, Internet Explorer enforces a security measure preventing access to frames from a web server other than the one running the script. This means that the copying can only be achieved if the CGI request is served from the same site as the original page, so the CGI must appear to be resident on the original server. A proxy command is used to direct the proxy that requests for URIs whose path component begins with a certain keyword should be forwarded to the internal CGI server instead of being forwarded to the web server as they normally would be. This means that even though the browser appears to be sending the request to the original web server, the response is actually generated by the facility's CGI server.
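The routing decision made by the proxy command can be sketched in Python. The keyword "/zack-cgi/" and the host name "cgi.internal.example" are placeholders invented for this sketch; the description says only that the path begins with "a certain keyword" and that matching requests go to the internal CGI server.

```python
from typing import Tuple
from urllib.parse import urlsplit


def route(uri: str,
          keyword: str = "/zack-cgi/",
          cgi_host: str = "cgi.internal.example") -> Tuple[str, str]:
    """Return (host to contact, path and query to request) for a proxied URI."""
    parts = urlsplit(uri)
    path = parts.path or "/"
    if path.startswith(keyword):
        # Forward to the internal CGI server, dropping the keyword prefix.
        host, path = cgi_host, path[len(keyword) - 1:]
    else:
        host = parts.netloc          # normal case: the original web server
    if parts.query:
        path = f"{path}?{parts.query}"
    return host, path
```

From the browser's point of view both kinds of request are addressed to the original site, so content returned by the CGI server passes the same-server check applied to frames.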
The CGI scripts are passed a cookie which identifies the user making the request. The scripts can use this information to customize the data returned for the request. An example of a CGI script application would be using collaborative filtering to deliver recommendations related to the product or web page being displayed. Collaborative filtering attempts to determine interests of one user by correlating the user's product browsing history with the histories of other users. Identifying similar users allows the script to determine what other products were interesting to those users.
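One simple form of collaborative filtering can be sketched in Python as follows. The description does not say which similarity measure is used; this sketch assumes Jaccard similarity between sets of viewed products, and the function name recommend and the sample histories are invented for illustration.

```python
from collections import Counter
from typing import Dict, List, Set


def recommend(histories: Dict[str, Set[str]], user: str, k: int = 2) -> List[str]:
    """Suggest products viewed by users whose histories overlap with this user's."""
    mine = histories[user]
    scores: Counter = Counter()
    for other, items in histories.items():
        if other == user:
            continue
        union = mine | items
        similarity = len(mine & items) / len(union) if union else 0.0
        for item in items - mine:          # only products the user has not seen
            scores[item] += similarity
    ranked = sorted((item for item, score in scores.items() if score > 0),
                    key=lambda item: (-scores[item], item))
    return ranked[:k]
```

Products seen only by users with no overlapping history receive a score of zero and are never recommended.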
Code Block 7 below is a sample document processing script corresponding to the supplemental information addition document processing specification 422 shown in Figure 4, to which the contents of the body of the HTTP response shown in Code Block 4 are subjected after they are subjected to the sample document processing script shown in Code Block 6 above. This script adds a supplemental information bar to the web page containing information relating to the content of the web page and the user, including links to additional related information.
1 PipelinedMatch (
2 match=\<\/html\>(? ' . *")
3 replace=$&ZackBox($username, http://$CGI_SERVER, cgi-
4 bin/community.cgi?description=$%item_description&price=$%item_price&url=$%
5 url, cgi-bin/newpopular.cgi?ItemDescription=$%item_description, cgi-
6 bin/auctionwatch.cgi?description=$%item_description&title=$%item_title&artist
[lines 7-11: Figure imgf000031_0001]
12 buffer=20
13 ) {}
14
15 PipelinedMatch (
16 match=<title>
17 replace=<title>$&InsertUsername ($username, $POD_NAME)
18 ) {}
CODE BLOCK 7
The script in Code Block 7 uses information extracted in the script shown in Code Block 6 to add the user's name to the title bar of the web page and to add supplemental information to the web page in a supplemental information bar.
The "PipelinedMatch" statement in lines 15-18 of Code Block 7 performs a global replacement in the page that adds the user's name and proxy server site to the title bar of the web page. A Perl function (InsertUsername) is used to generate the actual HTML content. For example, the facility inserts the user and pod name string "[janedoe@podl]" before the page title "Acme. electronics" in line 3 of Code Block 5. The user and pod name string is generated by the "InsertUsername" Perl function, which takes ZCL variables containing the user's name and the pod's name as parameters. The PipelinedMatch statement consumes data and passes it on as soon as it determines that the match expression doesn't match the contents of the accumulated data. Given the chaining of the two scripts in this example, the match will occur in the first batch of data received from the ZCL script shown in Code Block 6 (the batch ending in "<b><font").
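The pipelined behavior, in which data is passed on as soon as it can no longer contribute to a match, can be sketched in Python as follows. This is a simplified illustration that matches a literal string rather than a regular expression; the class name `StreamingReplacer` and its methods are hypothetical and are not part of ZCL.

```python
class StreamingReplacer:
    """Replace a literal string in a stream while buffering as little as possible."""

    def __init__(self, needle: str, replacement: str) -> None:
        self.needle = needle
        self.replacement = replacement
        self.pending = ""  # unprocessed tail that may be the start of a match

    def feed(self, chunk: str) -> str:
        """Consume one batch of data and return whatever can be passed on."""
        data = self.pending + chunk
        out = []
        pos = 0
        while True:
            hit = data.find(self.needle, pos)
            if hit < 0:
                break
            out.append(data[pos:hit])
            out.append(self.replacement)
            pos = hit + len(self.needle)
        rest = data[pos:]
        keep = self._partial_match_length(rest)
        out.append(rest[:len(rest) - keep])       # cannot match: pass it on now
        self.pending = rest[len(rest) - keep:]    # might match: hold it back
        return "".join(out)

    def flush(self) -> str:
        """Return any held-back data once the stream has ended."""
        out, self.pending = self.pending, ""
        return out

    def _partial_match_length(self, data: str) -> int:
        # Longest suffix of `data` that is a proper prefix of the needle.
        for size in range(min(len(self.needle) - 1, len(data)), 0, -1):
            if data.endswith(self.needle[:size]):
                return size
        return 0
```

Feeding the batches `"<html><head><ti"` and `"tle>Acme</title>"` to `StreamingReplacer("<title>", "<title>[janedoe@podl] ")` passes `"<html><head>"` on immediately, holds back only `"<ti"`, and completes the replacement when the second batch arrives. Several such objects can be chained by feeding the output of one into the next.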
The PipelinedMatch statement in lines 1-13 of Code Block 7 uses the variables set by the ZCL script shown in Code Block 6 to call a Perl function that generates the supplemental information bar. The supplemental information bar is added to the sample web page at lines 95-269 of Code Block 5. The call to the Perl function specifies the content features to include in the supplemental information bar. For each such feature, the parameters passed to the Perl function include the URL of a CGI script to produce the content for the feature as well as parameters passed to the CGI script for use in producing the content. For example, for a PriceCompare feature preferably provided by the facility, a description of the item whose price is to be compared is ultimately passed to the CGI script. The parameters to the CGI script are expressed, at least in some cases, in terms of ZCL variables whose values are extracted from the web page by the script shown in Code Block 6. Before calling the Perl function, the ZCL variables are replaced with their values. For example, the $%item_description variable name shown in line 7 of Code Block 7 is replaced with its value extracted from the sample web page, "Hot Lava motion lamp." When the Perl function is invoked, for each specified feature, it adds a CGI script call to the HTML content generated by the Perl function and added to the web page by the ZCL script. For example, for the PriceCompare feature, the facility preferably adds the CGI script call shown in lines 261-263 of Code Block 5. When the client receives the modified web page, for each CGI script call added by the Perl function, the facility generates an additional HTTP request invoking the script and passing the parameters. In the corresponding HTTP response, the facility receives the HTML content for the feature generated by the CGI script and displays this HTML content in the supplemental information bar. For example, the content for the PriceCompare feature is displayed in area 780 of the supplemental information bar.
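The replacement of ZCL variables with their values before the Perl function is called can be illustrated with a short Python sketch. The `$%name` syntax is taken from Code Block 7; the function name `expand_variables`, and the choice to percent-encode each value so that it is safe inside a query string, are assumptions not stated in the text.

```python
import re
from urllib.parse import quote


def expand_variables(template: str, variables: dict) -> str:
    """Replace each $%name reference with the URL-encoded value of that variable."""

    def lookup(match):
        value = variables.get(match.group(1), "")  # unset variables become empty
        return quote(value, safe="")

    return re.sub(r"\$%(\w+)", lookup, template)
```

With the value extracted from the sample page, `expand_variables("cgi-bin/newpopular.cgi?ItemDescription=$%item_description", {"item_description": "Hot Lava motion lamp"})` yields `cgi-bin/newpopular.cgi?ItemDescription=Hot%20Lava%20motion%20lamp`.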
An alternate implementation of the supplemental information bar processing uses a proxy command to invoke a Perl module which acts as a customized parser object. This Perl module externally behaves like a parser object for ZCL scripts, but actually is not associated with any script. Instead, it contains Perl code which is optimized to search for the appropriate place to insert the DHTML content to implement the supplemental information bar. The variables initialized by parser objects earlier in the pipeline are passed to this Perl code, allowing selection of features to include based on available data. This allows the facility to avoid placing features in the bar which generate no useful information in the context of a given page.
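The selection of features based on available data might look like the following Python sketch. Only the PriceCompare feature is named in the text; the other feature names and the table of required variables are hypothetical.

```python
# Map each feature to the page variables it needs. Only "PriceCompare"
# appears in the text; the remaining entries are illustrative.
FEATURE_REQUIREMENTS = {
    "PriceCompare": ["item_description"],
    "AuctionWatch": ["item_description", "item_title"],
    "Community": ["item_description", "item_price", "url"],
}


def select_features(variables: dict) -> list:
    """Return the features whose required variables were all extracted from the page."""
    return [
        feature
        for feature, needed in FEATURE_REQUIREMENTS.items()
        if all(variables.get(name) for name in needed)
    ]
```

A page from which only an item description could be extracted would, under these assumptions, receive a bar containing the PriceCompare feature alone.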
Code Block 8 below is a sample document processing script corresponding to the secure reference re-writing document processing specification 423 shown in Figure 4, to which the contents of the body of the HTTP response shown in Code Block 4 are subjected after they are subjected to the sample document processing script shown in Code Block 7 above. This script replaces any secure HTTP references to web servers with secure HTTP references to the facility.
1 PipelinedMatch (
2 match=href="https://
3 replace=href="https://secure.zacknetwork.com/https:
4 ) {}
CODE BLOCK 8
In accordance with this script, the facility preferably replaces the secure HTTP reference to a web server in line 82 of Code Block 4, "https://www.lava-lampsdirect.com/cgi-bin/myaccount.cgi," with a secure HTTP reference to the facility in lines 83-84 of Code Block 5, "https://secure.zacknetwork.com//www.lava-lampsdirect.com/cgi-bin/myaccount.cgi". Non-secure HTTP references to web servers, such as the HTTP reference to a web server in line 25 of Code Block 4, "/lavalamps.html", on the other hand, are not replaced with an HTTP reference to the facility, as can be seen at line 26 of Code Block 5.
In order to process secure web sites, the facility uses several additional mechanisms. These mechanisms redirect the secure connection to an instance of the proxy server running in a secure mode (the "secure proxy") so that it is a part of the communication between the user and the web site. The secure proxy is responsible for script processing on secure web pages.
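The link rewriting performed by the script in Code Block 8 can be approximated in Python as follows. This is a sketch rather than a translation of the ZCL script: it produces the single-slash form "https://secure.zacknetwork.com/www.asite.com/apath" used elsewhere in the description, and the function name `rewrite_secure_links` is hypothetical.

```python
SECURE_PROXY = "secure.zacknetwork.com"  # host name used in the description's examples


def rewrite_secure_links(html: str) -> str:
    """Point every secure (https) href at the secure proxy.

    Non-secure and relative references do not contain 'href="https://'
    and are therefore left untouched.
    """
    return html.replace('href="https://', f'href="https://{SECURE_PROXY}/')
```

For instance, `<a href="https://www.asite.com/apath">` becomes `<a href="https://secure.zacknetwork.com/www.asite.com/apath">`, while `<a href="/lavalamps.html">` is unchanged.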
Secure web conversations are generally conducted using the SSL protocol. This protocol entails a key exchange between the two endpoints of the conversation and subsequent establishment of a secure channel which cannot be understood by a third party. A normal web proxy server simply forwards the keys and encrypted data stream without being able to decrypt the actual data that is transferred.
Figure 9 is a data flow diagram showing conventional proxying of a secure web conversation. The diagram shows that, when the client 910 generates a secure request 911, the request is encrypted before leaving the client using a key set negotiated between the client and web server for the secure conversation, as shown by the wavy line extending from the request. The request 911 remains encrypted while it passes through a conventional proxy server 920, and until it reaches the web server 930. The web server 930 uses the key set negotiated between the client and web server for the secure conversation to decrypt the request 911, as shown by the transition from a wavy line to a straight line in web server 930.
Similarly, the web server 930 generates a secure response 931, which it encrypts with the key set before transmitting it to the client. The secure response 931 remains encrypted throughout its transmission to the client, including the time during which it is passing through the conventional proxy server 920. When the secure response 931 reaches the client 910, the client decrypts the secure response using the key set for the secure conversation. Accordingly, the conventional proxy server 920 never has access to the secure request 911 or the secure response 931 in plaintext form.
In order to be able to rewrite secure web pages, the proxy server must participate in this key exchange. For an HTTP conversation, this means that the browser must be directed to connect directly to the proxy instead of the ultimate web site. That is, the browser must be connecting to a URL which looks like it is resident on the proxy server, not on the actual site. A consequence of this redirection is that cookies (which are presented only to the site that sets them) will not be passed for the actual site, as the browser doesn't know that the request is actually intended for a different site.
Figure 10 is a data flow diagram showing proxying of a secure web conversation by the facility. The diagram shows that a secure conversation using the facility is actually implemented using two different secure conversations: a first secure conversation between the client and the proxy server, and a second secure conversation between the proxy server and the web server. The client 1010 generates secure request 1011, which it encrypts using a key set negotiated with the proxy server before sending to the proxy server 1020. When the proxy server 1020 receives the encrypted secure request 1011, the proxy server decrypts the secure request using the key set negotiated with the client for the first secure conversation. Once it has decrypted the request, the proxy server may read, copy, modify, filter, or otherwise process the secure request in plaintext form. The proxy server then re-encrypts the secure request using a key set negotiated with the web server for the second conversation, and forwards the re-encrypted secure request to the web server 1030. When the web server receives the re-encrypted request, the web server decrypts the re-encrypted request using the key set negotiated with the proxy server for the second secure conversation.
Similarly, the web server generates a secure response 1031, which it encrypts using the key set negotiated with the proxy server for the second secure conversation, and transmits the encrypted secure response 1031 to the proxy server. The proxy server decrypts the encrypted secure response using the key set negotiated with the web server for the second secure conversation. After so decrypting the secure response, the proxy server can read, copy, modify, filter, or otherwise process the secure response in plaintext form. The proxy server then re-encrypts the decrypted secure response using the key set negotiated with the client for the first secure conversation, and transmits the re-encrypted secure response to the client. The client decrypts the re-encrypted secure response using the key set negotiated with the proxy server for the first secure conversation. Thus, it can be seen that the facility provides access in the proxy server to a secure request and response in plaintext form without sacrificing the security of the secure request and response.
The secure "proxy" is implemented as a web server, not a web proxy server. It supports URL requests which contain the site from which to retrieve the data. The URL that the browser requests actually specifies the secure proxy server as the location of the page. For example, a request for "https://www.asite.com/apath" would be issued as "https://secure.zacknetwork.com/www.asite.com/apath". The proxy auto-configuration file provided to the browser at login time by the facility is set up to pass all requests to "secure.zacknetwork.com" through without proxying.
In order to ensure that the facility processes all secure web requests, the facility preferably rewrites all sources from which a secure request may originate. In general, these sources are either links or redirection requests. Hyperlinks from one page to another often point to a secure page. These links must be rewritten. Redirection requests (e.g., HTTP 302 replies) are also often used to direct a browser to a new page, which could be a secure page. These redirection requests must also be rewritten to point to the secure proxy. Note that links that are external to the click stream (e.g., bookmarks, links in email, etc.) cannot be rewritten by this mechanism. At this point, no server-side technique is known to address this.
Links are preferably rewritten using the normal ZCL mechanisms. This means that a script such as the script in Code Block 8 is written to process the non-secure (standard HTTP) pages and rewrite all secure links so that the old URL now points to the facility (for example, the URL "https://www.asite.com/apath" is rewritten as "https://secureproxy.zacknetwork.com/www.asite.com/apath").
Redirection requests generally cannot be rewritten by ZCL script as the location of the redirection is returned in a header field. The facility therefore allows specification of a proxy command that indicates that redirection requests returned from a set of URLs should be rewritten. The requests are rewritten by prepending the name of the secure proxy server to the Location header field specified in the original response.
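The secure proxy's URL convention and the rewriting of redirection responses can be sketched together in Python. The function names are hypothetical, the host name "secure.zacknetwork.com" is taken from the examples in the description, and the check that limits rewriting to redirections returned from a configured set of URLs is omitted for brevity.

```python
from urllib.parse import urlsplit

SECURE_PROXY = "secure.zacknetwork.com"  # host name used in the description's examples


def to_proxy_url(origin: str) -> str:
    """Prepend the secure proxy's name to a secure URL; leave other URLs alone."""
    if not origin.startswith("https://"):
        return origin
    return f"https://{SECURE_PROXY}/" + origin[len("https://"):]


def to_origin_url(proxied: str) -> str:
    """Recover the real site and path from a URL addressed to the secure proxy."""
    parts = urlsplit(proxied)
    # The first path segment names the real host; the rest is the real path.
    host, _, rest = parts.path.lstrip("/").partition("/")
    if not host:
        raise ValueError("no origin host in path")
    url = f"https://{host}/{rest}"
    if parts.query:
        url += "?" + parts.query
    return url


def rewrite_redirect(status: int, headers: dict) -> dict:
    """Rewrite the Location header of a redirection so it points at the secure proxy."""
    rewritten = dict(headers)
    if 300 <= status < 400 and "Location" in headers:
        rewritten["Location"] = to_proxy_url(headers["Location"])
    return rewritten
```

Here `to_proxy_url("https://www.asite.com/apath")` gives `"https://secure.zacknetwork.com/www.asite.com/apath"`, and `to_origin_url` reverses the mapping when the secure proxy receives the request and must contact the actual web server.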
It will be understood by those skilled in the art that the above-described facility could be adapted or extended in various ways. For example, the facility may be straightforwardly adapted to operate in computing environments other than the Internet. Also, the facility may modify various types of documents besides web pages. Additionally, the facility may use kinds of document processing specifications other than those described above. Further, the facility may apply some document processing specifications in parallel rather than sequentially. While the foregoing description makes reference to preferred embodiments, the scope of the invention is defined solely by the claims that follow and the elements recited therein.

Claims

We claim:
1. A computer-readable medium whose contents cause a data processing system to add information to a document for display during its delivery by: based upon information associated with the document, selecting one of a plurality of information addition specifications each specifying a distinct manner of adding information to a document; applying the selected information addition specification to the document to add information to the document in accordance with the manner of adding information to a document specified by the selected information addition specification; and resuming the delivery of the document.
2. The computer-readable medium of claim 1 wherein the document originated at a web server, and wherein the selecting selects an information addition specification based upon the identity of the web server.
3. The computer-readable medium of claim 1 wherein the document has a header, and wherein the selecting selects an information addition specification based upon contents of the header.
4. The computer-readable medium of claim 1 wherein the selected information addition specification designates document sections and, for each designated section, either indicates information to be added to the designated section or specifies that no information is to be added to the designated section, and wherein applying the selected information addition specification includes identifying the section of the document corresponding to each section designation of the selected information addition specification, and wherein resuming delivery of the document includes, for each identified section of the document to which the information addition specification indicates information to be added, transmitting the identified section directly in response to adding the indicated information to the identified section, and wherein resuming delivery of the document includes, for each identified section of the document to which the information addition specification indicates that no information is to be added, transmitting the identified section directly in response to identifying the identified section.
5. The computer-readable medium of claim 4 wherein the selected information addition specification designates document sections using regular expression matching.
6. A computer memory containing an augmented web page data structure, comprising: contents generated by a server computer system; and contents added to the augmented web page data structure by a computer system other than the server computer system, such that displaying the augmented web page data structure causes both the contents generated by the server computer system and the contents added to the augmented web page data structure by a computer system other than the server computer system to be displayed.
7. The computer memory of claim 6 wherein the contents added to the augmented web page data structure by a computer system other than the server computer system represent an additional visual feature.
8. The computer memory of claim 6 wherein the contents added to the augmented web page data structure by a computer system other than the server computer system represent a DHTML element.
9. The computer memory of claim 6 wherein the contents added to the augmented web page data structure by a computer system other than the server computer system represent a sliding content bar.
10. The computer memory of claim 6 wherein the contents added to the augmented web page data structure by a computer system other than the server computer system represent a frame.
11. The computer memory of claim 6 wherein the contents added to the augmented web page data structure by a computer system other than the server computer system represent a layer.
12. A data processing system for modifying a document during its delivery, comprising: a document receiving subsystem that receives the document; a modification specification selection subsystem that selects one of a plurality of modification specifications each specifying a distinct manner of modifying a document based upon information associated with the document; a modification specification application subsystem that applies the selected modification specification to the document to modify the document in accordance with the manner of modifying a document specified by the selected information addition specification; and a document transmission subsystem for transmitting the modified document to resume its delivery.
13. A data signal conveying a modified display document, comprising contents generated by a first computer system and subsequently modified by a computer system other than the first computer system, such that displaying the modified display document causes the contents generated by the first computer system to be displayed as modified by a computer system other than the first computer system.
14. A computer memory containing a web page modification specification, comprising: an indication of a selected location within a web page; and an indication of a selected modification to the web page to perform at the selected location, such that the indication of a location within a web page can be used to identify the selected location, and the indication of a selected modification to the web page to perform can be used to perform the selected modification at the identified location.
15. The computer memory of claim 14 wherein the modification indication is an insertion.
16. The computer memory of claim 14 wherein the modification indicated by the indication is a substitution.
17. The computer memory of claim 14 wherein the modification specification is procedural.
18. The computer memory of claim 14 wherein the modification specification is non-procedural.
19. The computer memory of claim 14 wherein the indication of a selected location both identifies a segment of the web page and identifies particular content within the segment.
20. The computer memory of claim 14 wherein the computer memory contains a plurality of web page modification specifications, and wherein the computer memory further contains, for each web page modification specification, an identification of web pages that are to be modified using the web page modification specification.
21. A method in a data processing system for processing a document, comprising: determining that a document of a particular document type is to be processed; from a multiplicity of document processing modules, selecting a plurality of document processing modules that is adapted to processing documents of the document type; selecting an order for the selected document processing modules that is adapted to processing documents of the document type; and constructing a document processor for processing the document by assembling the selected document processing modules in the selected order.
22. The method of claim 21 wherein the selected sequence specifies a first document processing module and a last document processing module, and wherein the constructing involves assigning the document as the input to the first document processing module, assigning as the input to the document processing modules after the first document processing module in the selected sequence the preceding document processing module, and assigns as the output of the last document processing module in the sequence the modified document.
23. The method of claim 21 wherein the document is a web page.
24. The method of claim 21 wherein the web page is transmitted from a web server to a client computer system via an intermediate computer system, and wherein the method is performed in the intermediate computer system.
25. The method of claim 21 wherein the web page is transmitted from a web server to a client computer system via an intermediate computer system in response to a request transmitted from the client computer system to the web server via the intermediate computer system, and wherein the determining, selecting, and constructing is performed in the intermediate computer system in response to receiving the request from the client computer system.
26. The method of claim 21 wherein modules assembled in the constructing are each an instance of a document parser configured to process an input document in accordance with a document processing specification corresponding to the document processing module.
27. A computer memory containing a document processor pipeline for processing documents, comprising a plurality of document processing modules each comprising an instance of a parser configured to process an input document in accordance with a document processing specification corresponding, the document processing modules being assembled in a sequence reflecting an order in which the document processing modules will process a document, such that a document may be processed by the document processor pipeline by being processed by each document processing module, in the order reflected by the sequence of document processing modules.
PCT/US2001/003610 2000-02-24 2001-02-05 Modifying contents of a document during delivery WO2001063478A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2001236650A AU2001236650A1 (en) 2000-02-24 2001-02-05 Modifying contents of a document during delivery

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US51321700A 2000-02-24 2000-02-24
US09/513,217 2000-02-24

Publications (2)

Publication Number Publication Date
WO2001063478A2 true WO2001063478A2 (en) 2001-08-30
WO2001063478A3 WO2001063478A3 (en) 2004-02-12

Family

ID=24042323

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2001/003610 WO2001063478A2 (en) 2000-02-24 2001-02-05 Modifying contents of a document during delivery

Country Status (2)

Country Link
AU (1) AU2001236650A1 (en)
WO (1) WO2001063478A2 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5959623A (en) * 1995-12-08 1999-09-28 Sun Microsystems, Inc. System and method for displaying user selected set of advertisements
US6014638A (en) * 1996-05-29 2000-01-11 America Online, Inc. System for customizing computer displays in accordance with user preferences
WO2000008583A1 (en) * 1998-08-07 2000-02-17 E2 Software Corporation Network contact tracking system

Also Published As

Publication number Publication date
AU2001236650A1 (en) 2001-09-03
WO2001063478A3 (en) 2004-02-12

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase in:

Ref country code: JP