WO2000019339A1 - Selective connection flow redirection for transparent network devices - Google Patents
- Publication number
- WO2000019339A1 (PCT/US1999/022326)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- server
- selective
- message
- connections
- connection
Classifications
- H04L67/1008—Server selection for load balancing based on parameters of servers, e.g. available memory or workload
- H04L67/1006—Server selection for load balancing with static server selection, e.g. the same server being selected for a specific client
- H04L67/1023—Server selection for load balancing based on a hash applied to IP addresses or costs
- H04L67/1001—Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
- H04L67/10015—Access to distributed or replicated servers, e.g. using brokers
- H04L67/565—Conversion or adaptation of application format or content
- H04L67/5651—Reducing the amount or size of exchanged application data
- H04L67/568—Storing data temporarily at an intermediate stage, e.g. caching
- H04L61/00—Network arrangements, protocols or services for addressing or naming
- H04L69/16—Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]
- H04L69/161—Implementation details of TCP/IP or UDP/IP stack architecture; Specification of modified or new header fields
- H04L69/22—Parsing or analysis of headers
- H04L69/329—Intralayer communication protocols among peer entities or protocol data unit [PDU] definitions in the application layer [OSI layer 7]
- G06F16/9574—Browsing optimisation, e.g. caching or content distillation of access to content, e.g. by caching
Definitions
- the most popular user environment provides access to content on an equal basis through the use of the client and server communication model.
- certain computers known as “servers” are used to store and provide information.
- Other computers in the network known as “clients” allow the users to view documents through the use of a computer program known as a browser that requests a copy of the document be sent from host servers down to the client.
- Documents are typically requested by the client browser program specifying an address which identifies the host server which stores the document .
- the browser specifies a document, for example by the so-called Internet Protocol (IP) address of its host
- the server retrieves the document from its local disk and transmits the document over network to the client.
- the connection between the client and host server is then terminated.
- a common solution for the present bottlenecks within the Internet is to deploy higher speed hardware.
- Such solutions include the deployment of digital subscriber line (xDSL) and cable modem technology to speed up the physical layer communication paths between the end users and points of presence.
- Gigabit speed routers and optical fiber backbones are also being proposed to alleviate congestion within the network itself.
- server clusters and load balancers are being deployed to assist with the dispatching of Web pages more efficiently.
- Document caching provides a way to intercept client requests for the same document with the cache serving copies of the original document to multiple client locations.
- the process for providing document files to the client computers changes from the normal process.
- the intermediate cache server may instead be requested to obtain the document.
- as the document is being transmitted down to the client computer, a copy is stored at the intermediate cache server. Therefore, when another client computer connected to the same network path requests the same content as the first user, rather than requiring the request to travel all the way back to the host server, the request may be served from the local cache server.
- cache techniques are typically sub-optimal in one way or another. For example, most Web browser programs have a built-in cache that keeps copies of recently viewed content within the client computer itself. If the same content is requested again, the browser simply retrieves it from local storage instead of going out to the network. However, when a browser cache services only one end user, content often expires before it can be reused.
- a browser-redirected cache server may also be deployed to service multiple end users. Such a cache server is a separate computer that sits inside a gateway or other point of presence. End users configure their Web browsers to redirect all HTTP traffic to the cache server instead of the locations implied by the Uniform Resource Locators (URLs). The cache server then returns the requested Web page if it has a copy.
- Such a cache server therefore acts as a proxy, receiving all requests and examining them to determine if it can fulfill them locally.
- with proxy servers, it is typically necessary to configure the client browser, proxy server, routers, or other network infrastructure equipment in order to cause the request messages to be redirected to the proxy server. This creates configuration management difficulties, in that reconfiguration of browsers typically requires administrative overhead on the part of the humans who manage the networks.
- if a primary cache cannot satisfy a request, it queries a secondary cache, which in turn may query a tertiary cache, and so forth. If none of the caches in the hierarchy has the desired content, the primary cache ultimately ends up forwarding the document request to the originally requested host.
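The hierarchical query chain described above can be sketched in a few lines of Python. This is an illustrative sketch, not part of the patent: the `fetch` and `origin_fetch` names and the use of dicts for cache tiers are assumptions made for the example.

```python
def fetch(url, caches, origin_fetch):
    """Consult each cache tier in order (primary first); only if every
    tier misses is the request forwarded to the originally requested host.
    caches: list of dict tiers.  origin_fetch: stand-in for contacting
    the origin server over the network."""
    for cache in caches:
        if url in cache:
            return cache[url]          # served from some tier of the hierarchy
    doc = origin_fetch(url)            # every tier missed: go to the origin
    caches[0][url] = doc               # the primary cache keeps a copy for reuse
    return doc

primary, secondary = {}, {"http://example.com/a": "<html>A</html>"}
doc = fetch("http://example.com/a", [primary, secondary], lambda u: "<origin>")
```

Here the secondary tier answers the first request, so the origin is never contacted; a request for an uncached URL would fall through to `origin_fetch` and be stored in the primary cache.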
- These caching schemes also fall short in some way. Forced redirection of HTTP traffic turns such cache servers into single points of failure. If a cache server overloads or malfunctions, access to the network is blocked. Recovery is especially awkward with browser-redirected caching, since every end user's Web browser then points explicitly at the broken server.
- Cache servers are, in particular, notoriously difficult to optimize. In certain configurations, they quickly become overloaded, in that the number of connections they are expected to maintain with the user locations exceeds what their processing power can handle. Time spent determining whether to accept connections, cache documents, and/or refuse connections then overloads the cache server further, compounding the loss of performance. In other situations, the cache servers are underloaded and not enough traffic is routed to them; they then represent a large investment of resources that is not providing optimum utilization.
- the present invention is a technique for implementing a content server such as a cache server together with a message redirector which off-loads connection processing functions.
- the message redirector performs a function of filtering traffic away from the content server so that the number of connections for which the content server can contribute utility is maximized. This is done by performing a load-shedding function: while the content server is overloaded, new connection requests are passed straight through rather than accepted.
- the message redirector is a three-port, transparent bridge with enhanced features such as filtering and traffic redirection.
- the bridge function permits a cache server to be transparently installed in-line between routers, switches, and other network backbone infrastructure.
- connection selectivity function performed by the message redirector provides increased hit rate for the cache server as measured in the number of objects delivered from the cache versus the number of objects which must be retrieved from elsewhere in the network.
- This connection selectivity functionality is implemented by maintaining a list of objects that are stored in the cache, as has been done in the past. However, in the present invention, the list itself is kept in a distinct location outside of the cache server, in the redirector.
- the message redirector first looks for a new connection request.
- a request may take the form of a SYN packet.
- the Internet Protocol (IP) address associated with the new connection request is then compared to the local selective connection table in the message redirector.
- for a predetermined period of time, referred to as the selectivity period, a SYN request is only routed up to the cache server if its address is in the selective connection table.
- every so often a new selectivity connection table is generated by the cache server.
- the cache server periodically scans this list of objects stored in the cache to identify a subset of the most requested objects.
- An object may be a domain, such as a full IP address, or may be a sub-net mask address.
- This list of popular requested addresses is then sent down to the message redirector. Therefore, it can be understood that optimization of the load on the cache server can be achieved by breaking the pool of available connections into two groups: (1) a group of selective connections derived from sites with a high probability of a hit, and (2) a smaller group of non-selective connections.
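The periodic SCT generation step described above (scanning the cache for the most requested objects) can be sketched as follows. This is a hedged illustration: `build_selective_table` and the flat `request_log` list are assumptions made for the example; the real cache server would scan its own object store.

```python
from collections import Counter

def build_selective_table(request_log, top_n=3):
    """Return the top_n most-requested origin-server addresses, i.e. the
    'selective' group expected to have a high cache hit rate.
    request_log: one origin IP address per cached-object request."""
    counts = Counter(request_log)
    return [addr for addr, _ in counts.most_common(top_n)]

log = ["10.0.0.5", "10.0.0.5", "192.0.2.7",
       "10.0.0.5", "192.0.2.7", "198.51.100.1"]
table = build_selective_table(log, top_n=2)
# the two most popular origin addresses form the selective group;
# connections to other addresses are handled as non-selective
```

The resulting list would then be sent down to the message redirector to populate the selective connection table.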
- Fig. 3 is a more detailed view of the transparent message redirector.
- Figs. 4A, 4B, and 4C illustrate the format of certain data structures used by the redirector.
- Fig. 5 is a flow chart of the steps performed by the message redirector to support layer two (L2) bridge functionality.
- Fig. 6 is a flow chart illustrating how selectivity and connection tagging/tracking is implemented in the filter logic.
- Fig. 9 is a flow chart for closing a connection in the redirector.
- Fig. 10 is a table of connection tagging objects.
- Fig. 11 is a diagram illustrating how tags are appended or inserted into a network message.
- Fig. 12 is a plot of an exemplary optimum selectivity period versus number of offered connections .
- Fig. 13 is an exemplary selectivity function.
- ISP peering points, interchange points in a large-scale enterprise network, central offices in a local exchange carrier network, metropolitan area exchanges, or other points through which the network message traffic is concentrated.
- the cache server 150 is deployed at an intermediate point in the network 100.
- the cache server 150 is configured to cache Web pages on a local disk 155.
- a redirector 140 and cache server 150 then cooperate to determine whether the Web page can be served from the local disk 155. If so, the Web pages are returned through the routers 114-1,114-2 through the network connections 112 to the client 110.
- the original request is then forwarded on to travel through a second set of routers 114-3, 114-4, through other network connections 118, to eventually arrive at the originally requested HTTP server 170.
- the host port h0 provides a connection for passing message traffic to and from the host cache server 150.
- This port h0 may typically use a more tightly coupled physical layer connection, such as a local area network connection, or may share a computer bus structure, as will be described in more detail below.
- the redirector 140 also contains a controller to set the position of switches in the cut-through switch 130, to permit message traffic to be routed up to the cache server 150 or passed straight through between the routers 114-2 and 114-3.
- the message redirector 140 is a type of message processor with certain functions that will be described in greater detail herein. In the case where the server 150 is a cache server for serving cached Web content, the message redirector 140 and cache server 150 cooperate to provide a transparent HTTP object cache for the network 100.
- the network cache 150 monitors HTTP traffic flow between the routers 114 and stores copies of sufficiently popular Web pages. Subsequent requests for the stored Web pages, for example from the HTTP client 110, are then retrieved from the disk 155 local to the cache server 150 rather than from the network HTTP server 170. This results in a significant reduction in network line utilization and improves user response time by reducing the number of hops between the client 110 and originally requested server 170, and also by providing multiple sources for popular Web content. In addition, this advantage occurs in a transparent manner so that the cache server 150 may be introduced into the network without specific reconfiguration of the HTTP client 110 or HTTP server 170.
- the host cache server 150 may actually consist of a number of hosts 250-0, 250-1, ..., 250-2 and associated storage devices cooperating to increase the performance of the overall cache server 150.
- the redirector 140 may physically be implemented as a number of three port devices 240-0, 240-1, ..., 240-2 connected in series.
- the series-connected redirectors 240 thus each provide physical access to the network 100 for each respective host 250-0, 250-1, ..., 250-2 at a respective host port h0, h1, ..., h2.
- other multi-processing arrangements are possible, such as having the redirectors 240 arranged in parallel to share a computer bus type interconnection and the like.
- the redirector 140 performs certain critical filtering operations to optimize the routing of messages to the cache server 150. These functions present messages to the cache server 150 in a manner which optimizes its utilization.
- a back pressure function provides control over how many new connections will be offered to the cache server 150, depending upon its reported load.
- the selectivity function provides most of the rest of the logic necessary for transparency. Its primary function is to bridge packets that the host server 150 is not processing straight from port e0 out through to port e1. Packets that are being processed by the cache server 150 are passed up the stack through the one or more host ports h0, h1, ..., h2.
- the selectivity feature is in effect used to attempt to "hijack" only those packets which have a high probability of being related to connections for HTTP objects or documents which the cache server 150 has already stored or, in the case of a new connection request, is capable of servicing.
- a hardware block diagram of a message redirector 140 which implements these features is shown in Fig. 3.
- the L2 and data path control circuitry 350 is used to control how packets are routed to and from the various elements of the redirector 140. It comprises circuits that control the state of the internal bus 304, to allow packets to be moved from the eO port through to the el port, or from one of the eO or el ports up to the redirector logic 340 and/or packet payload memory 350.
- the redirector logic 340 may be replicated for each host port h0, h1, ..., h2, or the redirector logic 340 may control more than one host port.
- the payload memory 350 is used as a temporary buffer to store the payload portions of a packet while the rest of the redirector logic 340 is determining where to route the packets .
- the internal bus 304 may be an industry standard PCI bus and the NICs 302 may be integrated circuit chips. This configuration may be the least expensive for volume manufacturing in the case of a single host port implementation. In other implementations, the NICs 302 may be individual circuit cards, and the bus 304 an external bus. This "external box" configuration may be more desirable for servicing multiple hosts 150.
- Figs. 4A, 4B, and 4C are more detailed views of various data structures used in the redirector 140.
- Fig. 4B illustrates a connection tracking object table 420 which will be described in greater detail in connection with Figs. 7 through 11. It includes entries associated with active connections that the cache server 150 is presently servicing, including an Internet Protocol source (IPs) address field 421, an IP destination (IPd) field 422, and a TCP source (Ts) field.
- a port number field 420 indicates information relating to how to route packets internal to the cache server 150.
- Fig. 4C illustrates a selective address table 430.
- this table is used to determine whether or not a packet is actually routed up to the cache server 150 based upon a number of conditions.
- the entries in this table 430 include at least an IP address field 431 and a mask 432.
- An optional rating field 433 may be used to support a feature known as weighted selectivity; a port number field 434 is used in implementations supporting more than one host port h0, h1, ..., h2.
- Fig. 5 is a more detailed flow chart of certain operations performed by the redirector logic 340 to perform the selectivity and connection tagging functions.
- State 518 is next entered in which the L2 destination address is used to determine a tentative L2 forwarding decision.
- This decision, referred to herein as the FD L2 decision, is indicated by reading at least the port number field 412 after finding the associated destination address in the MAC address field 411. Whichever bits are set in this field indicate to the redirector logic 340 the port, e.g., e0, e1, or h0 (or h1, ..., h2 if present), to which the packet might be routed, i.e., a tentative routing decision.
- state 520 if the static/dynamic bit is set to indicate a static address, then this indicates a type of packet which is intended not for network connected devices 110 or 170, but rather a "forus" management layer packet intended for the cache server host 150 itself. This bit may also be set in the case of a MAC layer broadcast address, as shown in Fig. 4A.
- state 522 is entered in which the packet is forwarded using the FD L2 decision.
- the packet received is not an HTTP packet, such as if the TCP header port number is not set equal to "80" , then the packet is simply forwarded, or bridged, in state 612 using the FD L2 decision.
- IP fragmentation occurs because HTTP packets are sometimes fragmented into multiple IP packets. In such an instance, they will at this point need to be reassembled before they can be passed up the stack (assuming, for example, that the TCP header is present only in the first packet) .
- the packet is examined to determine if it indicates a new connection (or "flow"), such as if it includes a SYN packet.
- SYN packets indicate the beginning of a request for a connection for an HTTP object. If the packet is not a SYN packet, then this packet relates to a connection which has already been set up. Processing proceeds to state 650 in which other attributes of the connection are examined to determine if and how the packet is to be handled. If the packet is a SYN packet, then a new TCP connection is being requested.
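The SYN test described above amounts to inspecting the flag bits of the TCP header. The sketch below is illustrative only; the function name and the assumption that the header is available as raw bytes are mine, but the flag-byte position follows the standard TCP header layout.

```python
import struct

SYN = 0x02
ACK = 0x10

def is_new_connection(tcp_header: bytes) -> bool:
    """True if the segment is a pure SYN (a new connection request)
    rather than traffic on an established flow.  Byte 13 of the TCP
    header carries the flag bits."""
    flags = tcp_header[13]
    return bool(flags & SYN) and not (flags & ACK)

# minimal 20-byte TCP header: src port 1234, dst port 80, only SYN set
hdr = struct.pack("!HHIIBBHHH", 1234, 80, 0, 0, (5 << 4), SYN, 8192, 0, 0)
```

A SYN-ACK (the server side of the handshake) would fail this test, which matches the flow chart's distinction between new connection requests and packets on connections already set up.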
- a selective connectivity feature determines whether or not a new connection should actually be established with the cache server 150. State 616 specifically determines if a maximum number of connections are already being serviced by the cache server 150. If this is the case, then processing proceeds to state 618 where the packet is simply bridged out of the redirector using the FD L2 tentative decision.
- processing proceeds to the selective connection state 620, where it is determined if the redirector is in a "selective" or "non-selective" mode.
- connection selectivity function is a feature which attempts to maintain a list of the IP addresses of the servers 170 that contain the most popular objects stored in the cache server 150.
- a selective connection table (SCT) generation process, executing as part of the cache server, is responsible for generating information to permit the redirector 140 to maintain the list, referred to as the selective connection table (SCT) 430.
- This selective connection table 430 allows the message redirector 140 to hunt for connection requests (SYNs) that have a higher probability of a hit in the cache server 150, given that their destination IP address already has content loaded in the cache server 150.
- This selectivity feature also allows the cache server 150 to effectively shift the optimum cache locality point, because the redirector need compare each new connection against only a small list of IP addresses.
- the selective connection state switches to a non-selective mode. In this non-selective mode, any occurring SYN will be permitted to be routed up to the cache.
- in the selective mode, only SYN requests which already have their associated IP addresses and/or sub-net masks stored in the selective connection table 430 are permitted to be routed up to the cache server.
- in the non-selective mode, the next SYN will be routed up.
- the system provides an N/K selective-to-non-selective behavior.
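The alternation between selective and non-selective modes can be sketched as a simple cyclic counter. This is an assumption-laden illustration: the `SelectivityTimer` class, the `tick()` interface, and the idea of counting in abstract ticks are mine; the patent only specifies the N/K selective-to-non-selective behavior.

```python
class SelectivityTimer:
    """Selective for n ticks out of every n + k: during the n selective
    ticks only table-listed SYNs are routed up; during the remaining k
    ticks any SYN may be routed up to the cache."""
    def __init__(self, n: int, k: int):
        self.n, self.k = n, k
        self.ticks = 0

    def tick(self) -> None:
        self.ticks = (self.ticks + 1) % (self.n + self.k)

    def selective(self) -> bool:
        return self.ticks < self.n

timer = SelectivityTimer(n=3, k=1)
modes = []
for _ in range(8):
    modes.append(timer.selective())
    timer.tick()
```

With n=3 and k=1 the redirector is selective three-quarters of the time, admitting an occasional arbitrary SYN so that new popular sites can still reach the cache.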
- connection selectivity function can be provided from state 620 as follows.
- in state 620 the contents of the timer 314 are read and used to determine if the selective mode should be entered. If the timer indicates that it is not time to enter the selective mode, then processing can exit from state 620 to prepare the new connection by tagging it in state 660.
- state 622 is entered to look up the IP address of the SYN request.
- state 624 if this address is not located in the selective connection table 430, then the new connection will not be permitted to be maintained. In this instance, the packet is forwarded out of the redirector 140 using the tentative L2 decision FD L2 in state 626. The connection therefore will not be serviced locally. However, if the destination address is on the selective table 430, then processing will continue with state 650.
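The lookup against the IP address field 431 and mask field 432 of table 430 can be sketched as masked comparison. The table contents and the `on_selective_table` name below are hypothetical; the masking logic is the standard way an address/mask pair matches both single hosts and whole sub-nets.

```python
import ipaddress

# hypothetical selective address table: (address, mask) pairs,
# mirroring fields 431 and 432 of table 430
selective_table = [
    ("192.0.2.7", "255.255.255.255"),   # a single popular origin server
    ("10.1.0.0", "255.255.0.0"),        # an entire popular sub-net
]

def on_selective_table(ip: str) -> bool:
    """True if the SYN's destination IP matches any table entry under
    its mask, i.e. the connection should be routed up to the cache."""
    addr = int(ipaddress.IPv4Address(ip))
    for entry, mask in selective_table:
        e = int(ipaddress.IPv4Address(entry))
        m = int(ipaddress.IPv4Address(mask))
        if addr & m == e & m:           # compare only the masked bits
            return True
    return False
```

A hit here means the destination already has content loaded in the cache, so the SYN is worth "hijacking"; a miss means the packet is simply bridged through using the FD L2 decision.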
- redirector is not in selective mode in state 620 or if IP destination is on the selective address table, then a connection tracking object and associated tag is assigned in state 650.
- Assigning a connection tracking object for each active connection between the cache server 150 and a client on the network serves to off-load data processing at the redirector 140. For example, when a connection is to be maintained between the cache server 150 and client 110, multiple messages are typically exchanged between the client 110 and cache server 150. Since the redirector 140 and cache server 150 service multiple connections or communication sessions simultaneously, inbound messages from multiple clients 110, therefore, must be analyzed to determine to which connection a corresponding message should be directed.
- Fig. 11 is a diagram illustrating how a tag 1125 is either appended to an original message 1115, thus forming a tagged message 1130, or optionally overwritten into an Ethernet field of the original message 1115 to create an embedded tag within the tagged message 1120.
- This process of assigning an index tag to a connection and appending the corresponding index tag number simplifies bit manipulation at the cache server 150 because the cache server 150 receiving a tagged message from the redirector 140 needs only to read the tag to determine the associated connection to which the message pertains. Otherwise, many bits of information such as the IP source and destination and TCP source and destination address of the received message would have to be analyzed at the cache server 150 to determine the corresponding connection.
- Fig. 10 is an array of N connection tracking objects 1000 for maintaining information associated with a particular connection.
- a similar array of connection tracking objects 1000 is maintained by both the cache server 150 and redirector 140. Accordingly, this provides the redirector 140 and cache server 150 a shorthand way of communicating to which connection a message pertains.
- the easy-to-read tag of a message passed between redirector 140 and cache server 150 indicates the connection to which the message pertains.
- Each tag number is an index number corresponding to the connection entry in the connection tracking object array 1000.
- connection tag #1 is an index pointer for the first object entry in connection tracking object table 1000
- connection tag #2 is an index pointer for the second object entry in connection tracking object table 1000
- so on for each of an array of N connection tag objects 1005.
- a connection and corresponding connection tag object 1005 must be established for the newly received data message and related subsequent messages. This process involves assigning a free connection tracking object 1005 in the connection tracking object table 1000 for the new connection.
- After a connection tracking object 1005 is assigned for a new connection, the information associated with the connection is stored in the new connection tracking object 1005 in state 655. For example, the IP and TCP source and destination addresses of the connection are stored in the connection tracking object 1005 so that the TCP and IP source and destination addresses of other received messages can be compared to those in the connection array 1000 to determine whether a message pertains to an active connection.
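The assignment of a free tracking object and its index tag can be sketched as below. The field names, the table size, and the `assign_tag` helper are assumptions for illustration; the key point, matching the description above, is that the tag is simply the object's index in the array 1000.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ConnectionTrackingObject:
    """One entry of the connection tracking object table 1000 (sketch)."""
    active: bool = False
    ip_src: str = ""
    ip_dst: str = ""
    tcp_src: int = 0
    tcp_dst: int = 0

N = 4  # illustrative table size; a real redirector would use a larger array
table = [ConnectionTrackingObject() for _ in range(N)]

def assign_tag(ip_src, ip_dst, tcp_src, tcp_dst) -> Optional[int]:
    """Find a free tracking object, record the connection 4-tuple, and
    return its index, which serves as the tag appended to messages."""
    for tag, obj in enumerate(table):
        if not obj.active:
            table[tag] = ConnectionTrackingObject(
                True, ip_src, ip_dst, tcp_src, tcp_dst)
            return tag
    return None  # table full: the SYN would simply be bridged through

tag = assign_tag("198.51.100.9", "192.0.2.7", 40000, 80)
```

Because the cache server keeps a mirrored array, either side can name a connection by this small integer instead of re-comparing the full IP/TCP 4-tuple.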
- a status 1010 of the connection tracking object is maintained, signaling whether a connection is active or inactive.
- the status 1010 of the newly created connection tracking object 1005 would be set active.
- a message type i.e., UDP, TCP ...
- a port number stored in the connection tracking object 1005 identifies to which cache server 150 a connection pertains in the event that the system includes multiple cache servers.
- the corresponding index is appended or incorporated in the network message.
- the tag is stored in the Ethernet field or link layer.
- the newly tagged message is forwarded to the cache server 150 and is processed based on network layer 3 information. It is common for failures to occur in any networking system. Therefore, active connections are monitored for activity or communication between the cache server 150 and clients 110. If the communication on a given connection is inactive for a predetermined time, the connection tracking objects at both the redirector 140 and the cache server 150 are closed, i.e., set inactive, freeing resources for new connections. This grace time can depend on the availability of resources and present traffic through the redirector 140. For instance, when the redirector 140 is saturated with traffic and there are no resources to open new connections, the grace time for a presumed failed connection may be shorter, since the resources are better used to service other requests.
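The inactivity sweep described above can be sketched as follows. The dict-based table, the `last_seen` map, and the `close_idle_connections` name are assumptions; the variable grace time reflects the description's point that a busy redirector may shorten it.

```python
import time

def close_idle_connections(table, last_seen, grace, now=None):
    """Mark a connection inactive if no message has been seen for longer
    than the grace time.  last_seen maps tag -> timestamp of the most
    recent packet on that connection; grace may be shortened when the
    redirector is short of free connection objects."""
    now = time.time() if now is None else now
    closed = []
    for tag, obj in enumerate(table):
        if obj["active"] and now - last_seen.get(tag, now) > grace:
            obj["active"] = False      # free the tracking object
            closed.append(tag)
    return closed

table = [{"active": True}, {"active": True}]
last_seen = {0: 100.0, 1: 190.0}
closed = close_idle_connections(table, last_seen, grace=30.0, now=200.0)
```

Connection 0 has been idle for 100 time units and is closed; connection 1, last seen 10 units ago, survives. In the real system the same close would be applied at both the redirector and the cache server to keep their arrays mirrored.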
- Fig. 7 is a flow chart illustrating how messages received from the redirector 140 are processed at the cache server 150. The process involves first receiving a message from the redirector 140 in state 705. If the received message in state 710 does not include a connection tag, the message is sent to the appropriate socket using standard Unix TCP/IP routing in state 715.
- If there is a connection tag associated with the received message in state 710, the message is passed on to state 720 to determine whether it is a SYN message. If not, the tagged message is directed within the cache server 150 to the connection running on the TCP state machine corresponding to the tag in state 730. Again, the tag number is an index to the proper TCP state machine, or session connection, corresponding to the tagged message.
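The Fig. 7 dispatch decisions can be sketched as follows. The dictionary message representation and the return strings are placeholders for illustration, not part of the described system.

```python
# Sketch of the Fig. 7 flow: untagged messages take standard routing
# (state 715); a tagged SYN opens a new state machine at the tag's
# index (state 720); other tagged messages go to the existing state
# machine that the tag indexes (state 730).
def dispatch(msg, state_machines):
    tag = msg.get("tag")
    if tag is None:
        return "standard_routing"          # state 715: normal TCP/IP routing
    if msg.get("syn"):
        state_machines[tag] = "open"       # new connection at this tag index
        return f"opened:{tag}"
    return f"existing:{tag}"               # tag indexes the proper session
```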
- If the message received from the redirector 140 includes a SYN message and a tag in state 720, this indicates that a new connection is to be opened for the requested object.
- The corresponding tag is the index number, in the connection tracking object array 1000, of the connection to be established and used for future communication.
- a connection tracking object 1005 is created including the information as described in Fig. 10.
- Both the redirector 140 and the cache server 150 track a particular connection based, at least in part, upon the content of each message. As a result, the arrays in the redirector 140 and the cache server 150 mirror each other, i.e., both arrays include substantially identical information, supporting the harmonious processing of messages. After it is determined to which connection a message pertains, the message is processed in state 730 on the appropriate TCP state machine in the cache server 150.
- Fig. 8 is a flow chart illustrating the process associated with closing or maintaining a connection in the cache server 150.
- First, the cache server 150 determines in state 805 whether an object requested by a node has been properly serviced and the associated connection should be freed. If communication for a particular connection is not complete in state 805, the connection is maintained for further communication between the requesting node, such as a client 110, and the cache server 150 in state 810.
- Otherwise, the connection is closed in state 815, where the status 1010 of the connection tracking object 1005 is set inactive to indicate that the connection tracking object 1005 and corresponding tag are free for a new connection.
- Additionally, a message associated with closing the connection tracking object 1005 is sent to the redirector 140 in state 820 so that the corresponding object in the redirector's 140 connection tracking object array 1000 is also closed.
- Both devices maintain their connection tracking object arrays 1000 by each decoding the contents of a message to determine whether to open a new connection.
- Alternatively, a connection and corresponding connection tracking object 1005 at the redirector 140 can be closed based on the detection of a FIN message, which indicates that the message is the last in a related stream of messages.
- Fig. 9 is a flow chart illustrating how messages received from the cache server 150 are processed by the redirector 140. Messages are first received from the cache server 150 in state 905. It is then determined whether the message includes a FIN in state 925. If not, the message is routed to the network in state 935. If the message includes a FIN in state 925, a "time wait" function is performed in state 927. Following the time wait, the connection tracking object 1005 associated with the message is deleted in state 930, because the FIN message indicates the last of the data messages sent between a requesting node, such as a client 110, and the cache server 150 for a particular connection. Based on this method of closing a connection in the redirector 140 and the cache server 150, the associated connection tracking object arrays 1000 appropriately mirror each other.
- Alternatively, a connection tracking object is closed based upon a direct order from the cache server 150. For example, if a connection tracking object is to be closed, the cache server optionally transmits a message to the redirector 140 to close the particular connection tracking object 1005.
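The Fig. 9 flow at the redirector can be sketched as below. The time-wait step is recorded rather than implemented as a real delay, and the message representation is an assumption for illustration.

```python
# Sketch of the Fig. 9 flow: non-FIN messages are simply routed to the
# network (state 935); a FIN triggers a time-wait (state 927) and then
# deletion of the tracking entry (state 930), keeping the redirector's
# array mirrored with the cache server's.
def handle_from_cache(msg, conn_array, steps):
    if not msg.get("fin"):                 # state 925: FIN present?
        steps.append("route")              # state 935
        return
    steps.append("time_wait")              # state 927
    conn_array.pop(msg["tag"], None)       # state 930: delete tracking object
    steps.append("route")
```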
- Messages transmitted over the network in state 535 are "unmarked," i.e., the tag is removed, before they are sent out over the network.
- The tag is a code understood only by the cache server 150 and the redirector 140, between which tagged messages are passed.
- the appropriate information from the connection tracking object is incorporated back into the message for proper routing if it is not already there.
- The appropriate IP and TCP source and destination addresses are incorporated in the message. If a connection tag was appended to the network message, it is deleted so as not to interfere with subsequent routing of the message on the network.
- The implementation of a selective connectivity period provides a natural means of controlling the new connection acceptance rate. For example, consider the case where the cache server 150 is hunting for selective connections but the population of selective connections is low. In this case, the new-connection SYNs allowed to be routed up to the cache server 150 are spaced at intervals of the selectivity period, t, plus the average SYN arrival interval.
- The selectivity time period thus provides a natural load control mechanism.
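The spacing described above implies a simple bound on the acceptance rate. This worked example assumes only the interval model stated in the text: new-connection SYNs arrive at the cache roughly every selectivity period t plus average SYN inter-arrival time a.

```python
# Worked example of the admission spacing: with selectivity period t
# and average SYN inter-arrival interval a (both in seconds), new
# connections are admitted roughly every t + a seconds, giving an
# acceptance rate of about 1 / (t + a) connections per second.
def admission_interval(t_secs, avg_syn_interval_secs):
    return t_secs + avg_syn_interval_secs

def acceptance_rate(t_secs, avg_syn_interval_secs):
    return 1.0 / admission_interval(t_secs, avg_syn_interval_secs)
```

For instance, with t = 100 ms and a 20 ms average SYN spacing, new connections are admitted roughly every 120 ms, i.e., about 8.3 per second.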
- The number of offered connections (O_c) is the actual number of connections passing through the network 100.
- The number of serviceable connections (S_c) is the number of connections that the cache server 150 can actually service at any point in time. In general, the number of offered connections (O_c) will exceed the number of serviceable connections (S_c), since the cache server 150 has finite capacity.
- The goal is to obtain a higher hit rate for the cache server 150, as measured by the number of objects delivered from the cache server 150 as opposed to the number of objects that must be retrieved from the HTTP servers 19.
- Setting the selectivity period to zero causes the cache server 150 to attempt to service all of the offered connections.
- If the selective connection period is set to a relatively high value, such as 100 milliseconds, the cache server 150 will likely service a connection count that is under its maximum capacity and thus spend most of its time hunting for SYNs that are on its selectivity list.
- By adjusting the selectivity period setting, one can provide an optimum connection load for the cache server 150.
- To find this setting, the server 150 may preferably use a successive approximation approach, first setting the selectivity period to a predetermined value, such as fifty percent of a known maximum value, and then moving it up and down until the connection load runs just slightly below the maximum. When this point is reached, the selectivity period is increased just enough to allow the server to run at an optimum rate.
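The successive-approximation search might look like the following sketch, where measure_load is a caller-supplied probe of the connection load at a given period setting; the halving step schedule is an assumption, chosen so the search converges like a bisection.

```python
# Sketch of successive-approximation tuning: start at half a known
# maximum period, then move the period down when the server has spare
# capacity and up when it is at or over capacity, halving the step
# each iteration so the setting converges.
def tune_period(measure_load, max_capacity, max_period, iterations=20):
    period = 0.5 * max_period
    step = 0.25 * max_period
    for _ in range(iterations):
        load = measure_load(period)
        if load < max_capacity:
            period -= step      # spare capacity: be less selective
        else:
            period += step      # at/over capacity: be more selective
        step /= 2.0
    return max(period, 0.0)
```

With a load that decreases as the period grows, the search settles at the period where the load just reaches the server's capacity.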
- Fig. 12 shows an example of a plot of the number of serviced connections as a function of the selectivity period.
- A horizontal dotted line 1200 indicates the maximum number of serviceable connections. By starting out at a relatively high value, such as 100 milliseconds, for the selectivity period, the number of serviced connections is relatively low.
- As the selectivity period is reduced, the number of serviced connections gradually increases until a point is reached, such as at 1210, at which the maximum number of serviceable connections is attained. This setting, or one slightly below it, is the desired setting for the selectivity period. It ensures that the cache server 150 still attempts a sufficient number of new requests without becoming overloaded, which maximizes the hit rate in the cache.
- a natural time delay spacing for new connection requests is thus provided by setting the selectivity period to a value that slightly exceeds the system's maximum selectivity connection capacity.
- Another refinement is to adjust connection selectivity as a function of the load, varying the selectivity period as the load increases or decreases.
- The selectivity period setting, as detected by the counter, would therefore not be a constant but rather a variable stored in a hardware memory that is loaded by either the server 150 or an intelligent subsystem, such as the NICs 302.
- a NIC 302 uses the current connection count as an index into the selectivity array and reads out a period setting to use.
- In this example, the server 150 runs efficiently at a maximum load of 100 to 1400 connections. As the load approaches the maximum, the server 150 becomes increasingly selective by increasing the selectivity period.
- the selectivity period can actually reach infinity, meaning that the only requests to be processed are connections that have entries on the selectivity table 60.
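The table lookup described above, in which the current connection count indexes a precomputed array of period settings, might be sketched as follows. The particular table shape used here (zero under light load, rising steeply near capacity, infinite at capacity) is one illustrative choice of selectivity function, not the function from the specification.

```python
import math

# Sketch of a load-indexed selectivity table: one period entry per
# possible connection count, read out by indexing with the current
# count (as the NIC lookup described above).
MAX_CONNECTIONS = 1400

def build_selectivity_table(max_conns=MAX_CONNECTIONS, knee=1000):
    table = []
    for count in range(max_conns + 1):
        if count < knee:
            table.append(0.0)                      # light load: not selective
        elif count < max_conns:
            frac = (count - knee) / (max_conns - knee)
            table.append(0.1 * frac / (1 - frac))  # rises steeply near capacity
        else:
            table.append(math.inf)                 # full: selective entries only
    return table

def period_for_load(table, current_count):
    return table[min(current_count, len(table) - 1)]
```

An infinite period at full load reproduces the behavior described in the text: only connections with entries on the selectivity table are accepted.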
- The FI function, shown in Fig. 13, represents one of many possible functions that can be loaded into the selectivity function table.
- The server has numerous functions that are loaded based on the system's "state." Examples of system state include the number of objects stored on the cache system and a rating of the current selectivity list (i.e., the quality of the current selectivity list is calculated and the corresponding selectivity function is applied for the given case).
- There may be different selectivity functions FI for various system "states," where the selectivity period is a function of the load and the selectivity function shape is itself a function of system state.
- Metrics which may be used for the system state include:
- the timer value is preferably set by a function running in the cache server 150.
- The cache server 150 also maintains a connection service process which actually services active connections; that is, it accepts HTTP requests on connections once active and provides the requested objects from the cache server 150.
- Suppose N objects stored in the cache begin with the IP address aa.bb (or some other set, described by the address K, containing N stored objects).
- R_AK = (M_O1 + M_O2 + M_O3 + ... + M_ON)/N + cN
- R_AK = [(M_O1 + M_O2 + M_O3 + ... + M_ON)/N + cN]/dB. Dividing by the number of bits B in the mask (times a constant, d) provides a lower rating for masks that are longer. This allows the most 'focused' sub-net combinations to yield better ratings.
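The two rating formulas can be sketched directly; the helper names and example values are illustrative, with M_Oi taken as the per-object hit metrics and B as the mask length in bits, and c and d as tuning constants, as in the text.

```python
# Sketch of the rating formulas above: the rating for address set K is
# the mean hit metric of its N stored objects plus c*N, optionally
# divided by d*B where B is the mask length in bits.
def rating(hit_metrics, c):
    n = len(hit_metrics)
    return sum(hit_metrics) / n + c * n

def rating_per_mask_bits(hit_metrics, c, d, mask_bits):
    # Longer masks (larger B) divide the rating down further.
    return rating(hit_metrics, c) / (d * mask_bits)
```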
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU64018/99A AU6401899A (en) | 1998-09-28 | 1999-09-27 | Selective connection flow redirection for transparent network devices |
EP99951610A EP1125225A1 (en) | 1998-09-28 | 1999-09-27 | Selective connection flow redirection for transparent network devices |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10206698P | 1998-09-28 | 1998-09-28 | |
US60/102,066 | 1998-09-28 | ||
US40448399A | 1999-09-23 | 1999-09-23 | |
US09/404,483 | 1999-09-23 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2000019339A1 true WO2000019339A1 (en) | 2000-04-06 |
Family
ID=26798966
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US1999/022326 WO2000019339A1 (en) | 1998-09-28 | 1999-09-27 | Selective connection flow redirection for transparent network devices |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP1125225A1 (en) |
AU (1) | AU6401899A (en) |
WO (1) | WO2000019339A1 (en) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO1998017039A1 (en) * | 1996-10-14 | 1998-04-23 | Mirror Image Internet Ab | Internet communication system |
1999
- 1999-09-27 AU AU64018/99A patent/AU6401899A/en not_active Abandoned
- 1999-09-27 WO PCT/US1999/022326 patent/WO2000019339A1/en not_active Application Discontinuation
- 1999-09-27 EP EP99951610A patent/EP1125225A1/en not_active Withdrawn
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO1998017039A1 (en) * | 1996-10-14 | 1998-04-23 | Mirror Image Internet Ab | Internet communication system |
Non-Patent Citations (2)
Title |
---|
HEDDAYA A ET AL: "WebWave: globally load balanced fully distributed caching of hot published documents", INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS,XX,XX, 27 May 1997 (1997-05-27), pages 160 - 168, XP002075421 * |
R. TEWARI ET AL: "UTCS Technical Report: Beyond Hierarchies: Design Considerations for Distributed Caching on the Internet", February 1998, DEPT OF COMPUTER SCIENCES, THE UNIVERSITY OF TEXAS AT AUSTIN, AUSTIN, TX, USA, XP002130410 * |
Also Published As
Publication number | Publication date |
---|---|
EP1125225A1 (en) | 2001-08-22 |
AU6401899A (en) | 2000-04-17 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
ENP | Entry into the national phase |
Ref country code: AU Ref document number: 1999 64018 Kind code of ref document: A Format of ref document f/p: F |
|
AK | Designated states |
Kind code of ref document: A1 Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CR CU CZ DE DK DM EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW |
|
AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): GH GM KE LS MW SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | ||
WWE | Wipo information: entry into national phase |
Ref document number: 1999951610 Country of ref document: EP |
|
REG | Reference to national code |
Ref country code: DE Ref legal event code: 8642 |
|
WWP | Wipo information: published in national office |
Ref document number: 1999951610 Country of ref document: EP |
|
WWW | Wipo information: withdrawn in national office |
Ref document number: 1999951610 Country of ref document: EP |