US20060224773A1 - Systems and methods for content-aware load balancing - Google Patents

Systems and methods for content-aware load balancing

Info

Publication number
US20060224773A1
US20060224773A1 (Application US11/094,905)
Authority
US
United States
Prior art keywords
request
servers
satisfying
content
load balancer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US11/094,905
Inventor
Louis Degenaro
Lei Gao
Arun Iyengar
Jian Yin
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp
Priority to US11/094,905
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION. Assignors: YIN, JIAN; DEGENARO, LOUIS R.; GAO, LEI; IYENGAR, ARUN KWANGIL
Priority to CN200680004598.5A
Priority to PCT/EP2006/061130
Publication of US20060224773A1
Priority to US12/132,811
Legal status: Abandoned

Classifications

    • G06F 9/5083 Techniques for rebalancing the load in a distributed system (hierarchy: G Physics > G06 Computing; Calculating or Counting > G06F Electric digital data processing > G06F 9/00 Arrangements for program control > G06F 9/06 using stored programs > G06F 9/46 Multiprogramming arrangements > G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU])
    • H04L 67/1001 Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers (hierarchy: H Electricity > H04 Electric communication technique > H04L Transmission of digital information, e.g. telegraphic communication > H04L 67/00 Network arrangements or protocols for supporting network services or applications > H04L 67/01 Protocols > H04L 67/10 Protocols in which an application is distributed across nodes in the network)
    • H04L 67/1004 Server selection for load balancing
    • H04L 67/1008 Server selection for load balancing based on parameters of servers, e.g. available memory or workload
    • H04L 67/1017 Server selection for load balancing based on a round robin mechanism
    • H04L 67/1023 Server selection for load balancing based on a hash applied to IP addresses or costs
    • H04L 67/10015 Access to distributed or replicated servers, e.g. using brokers

Abstract

Improved load balancing techniques are disclosed. For example, in one illustrative aspect of the invention, a method of satisfying requests in a system comprised of a plurality of servers comprises the following steps. At least one load balancer is provided for routing requests to the plurality of servers. At the at least one load balancer, a request sent from a client is obtained. At the at least one load balancer, the request is examined. Costs of satisfying the request by at least two of the plurality of servers are estimated. The estimation is based on at least one of a number and a cost of at least one remote access for satisfying the request. The request is routed to a server of the plurality of servers with a low estimated cost of satisfying the request.

Description

    FIELD OF THE INVENTION
  • The present invention generally relates to information systems and, more particularly, to techniques for content-aware load balancing in such information systems.
  • BACKGROUND OF THE INVENTION
  • In general, an information system is a data processing system that provides some form of response to a user upon a user's request. The Internet or World Wide Web (WWW or the “web”) is easily the most ubiquitous information system that exists today.
  • Scalable web sites associated with the Internet typically comprise one or more load balancers for routing requests to a plurality of servers. The techniques used for load balancing the requests can have a significant effect on performance of the overall system. If requests are routed in a content-aware fashion, then the load balancer is aware of the contents of a request and can make more intelligent routing decisions.
  • One of the drawbacks to content-aware routing compared with content-unaware routing is that content-aware routing usually incurs significantly more overhead. Therefore, the benefits for performing content-aware routing must be significant enough to justify the higher overhead.
  • Content-aware routing techniques have been proposed, for example, as described in V. Pai et al., “Locality-Aware Request Distribution in Cluster-Based Network Servers,” Proceedings of ASPLOS-VIII, October 1998, the disclosure of which is incorporated by reference herein. However, the content-aware routing disclosed in the above-referenced work is primarily intended for static requests and thus uses techniques for selecting servers which are not always well suited for dynamic requests. Techniques described in C. S. Yang et al., “Efficient Support for Content-Based Routing in Web Server Clusters,” Proceedings of the 2nd USENIX/IEEE Symposium on Internet Technologies and Systems (USITS '99), the disclosure of which is incorporated by reference herein, provide content-aware routing only for static content. Thus, existing work in content-aware routing is not sufficient to handle the data partitioning problems which occur in major deployments.
  • Accordingly, a need exists for techniques which overcome the above-mentioned and other limitations associated with existing content-aware routing techniques.
  • SUMMARY OF THE INVENTION
  • The present invention provides improved load balancing techniques.
  • For example, in a first aspect of the invention, a method of satisfying requests in a system comprised of a plurality of servers comprises the following steps. At least one load balancer is provided for routing requests to the plurality of servers. At the at least one load balancer, a request sent from a client is obtained. At the at least one load balancer, the request is examined. Costs of satisfying the request by at least two of the plurality of servers are estimated. The estimation is based on at least one of a number and a cost of at least one remote access for satisfying the request. The request is routed to a server of the plurality of servers with a low estimated cost of satisfying the request.
  • The step of routing may be accomplished by classifying a request into a partition and routing the request to a server hosting the partition. In the step of providing at least one load balancer, the at least one load balancer may comprise at least one content-unaware load balancer routing requests to a plurality of content-aware load balancers. Further, in the step of providing at least one load balancer, at least one of the plurality of content-aware load balancers may reside on at least one of the plurality of servers.
  • The step of estimating costs may further comprise the steps of examining at least one parameter included in the request and using information about how data are partitioned among the plurality of servers to estimate at least one of numbers and costs of remote accesses for satisfying the request. The step of obtaining a request may further comprise obtaining a request that is sent using the Transmission Control Protocol/Internet Protocol and the step of examining the request may further comprise accepting a TCP connection. The step of estimating costs may comprise estimating at least one of a measure of resource utilization and a desired service level for satisfying the request. Further, the step of estimating costs may vary with respect to at least one of a number of servers and time. Still further, in the step of obtaining, at the at least one load balancer, a request from a client, the request may comprise a request for dynamic data.
  • In a second aspect of the invention, a method of satisfying requests in a system comprised of a plurality of servers comprises the following steps. At least one load balancer is provided for routing requests to the plurality of servers. At the at least one load balancer, a request sent from a client is obtained. At the at least one load balancer, the request is examined. Costs of satisfying the request by at least two of the plurality of servers are estimated. Information obtained in the estimating step is sent from the load balancer to a server of the plurality of servers along with the request. At the server, the information is used to satisfy the request.
  • In a third aspect of the invention, a system for satisfying requests from at least one client comprises a plurality of servers, at least one content-aware load balancer for routing requests to the plurality of servers, and at least one cost analyzer associated with the at least one content-aware load balancer for estimating costs of satisfying requests by different servers based on at least one of numbers and costs of remote accesses for satisfying requests.
  • In a fourth aspect of the invention, apparatus for satisfying requests from at least one client in a system comprised of a plurality of servers comprises a memory and at least one processor coupled to the memory and operative to: (i) obtain a request from a client; (ii) examine content of the request; and (iii) estimate costs of satisfying the request by at least two of the plurality of servers based on estimating at least one of a number and a cost of at least one remote access for satisfying the request. The request may then be routed to a server of the plurality of servers with a low estimated cost of satisfying the request.
  • In a fifth aspect of the invention, a method of satisfying requests from at least one client in a system comprised of a plurality of servers comprises the following steps. A request is obtained from a client. Content of the request is examined. Costs of satisfying the request by at least two of the plurality of servers are estimated. The estimation is based on at least one of a number and a cost of at least one remote access for satisfying the request. The request may then be routed to a server of the plurality of servers with a low estimated cost of satisfying the request.
  • In a sixth aspect of the invention, an article of manufacture for use in satisfying requests from at least one client in a system comprised of a plurality of servers comprises a machine readable medium containing one or more programs which when executed implement the steps of obtaining a request from a client, examining content of the request, and estimating costs of satisfying the request by at least two of the plurality of servers based on estimating at least one of a number and a cost of at least one remote access for satisfying the request. The request may then be routed to a server of the plurality of servers with a low estimated cost of satisfying the request.
  • In a seventh aspect of the invention, a method of providing a load balancing service comprises the step of a service provider providing a service to a customer which comprises obtaining a request from a client, examining content of the request, and estimating costs of satisfying the request by at least two of the plurality of servers based on estimating at least one of a number and a cost of at least one remote access for satisfying the request. The request may then be routed to a server of the plurality of servers with a low estimated cost of satisfying the request.
  • These and other objects, features and advantages of the present invention will become apparent from the following detailed description of illustrative embodiments thereof, which is to be read in connection with the accompanying drawings.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 is a diagram illustrating a server system architecture according to an embodiment of the present invention;
  • FIG. 2 is a diagram illustrating a system for scaling a content-aware load balancer, according to an embodiment of the present invention;
  • FIG. 3 is a diagram illustrating a system in which content-aware load balancers are integrated with servers, according to an embodiment of the present invention;
  • FIG. 4 is a diagram illustrating a method for load balancing requests, according to an embodiment of the present invention;
  • FIG. 5 is a diagram illustrating a method for selecting a server based on cost, according to an embodiment of the present invention; and
  • FIG. 6 is a diagram illustrating a computing system in accordance with which one or more components/steps of a load balancing system may be implemented, according to an embodiment of the present invention.
  • DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS
  • The present invention will be explained below in the context of an illustrative Internet or web-based client-server environment. However, it is to be understood that the present invention is not limited to such Internet or web implementations. Rather, the invention is more generally applicable to any request-based environment in which it would be desirable to provide enhanced load balancing performance.
  • Furthermore, content that is to be served in response to a request may be referred to generally herein as an “object.” An “object” may take on many forms and it is to be understood that the invention is not limited to any particular form. For example, an object may be an electronic document such as one or more web pages. One skilled in the art could use the invention in a variety of different electronic document formats including, but not limited to, HTML (HyperText Markup Language) documents, XML (eXtensible Markup Language) documents, text documents in other formats, and binary documents. Also, the phrase “electronic document” may also be understood to comprise one or more of text data, binary data, one or more byte streams, etc. Thus, the invention is not limited to any particular type of data object. Still further, it is to be understood that the term “overhead” may include, but is not limited to, computer CPU (central processing unit) cycles, network bandwidth consumption, disk, I/O (input/output), etc.
  • Referring initially to FIG. 1, a server system architecture, according to an embodiment of the present invention, is illustrated. As shown, one or more clients (e.g., 102-1 . . . 102-M) send one or more requests to a server system 103. It is to be appreciated that the clients (by way of example only, personal computers, personal digital assistants, cellular phones, etc.) may be coupled to server system 103 via the Internet or some other wired and/or wireless communications network. The invention is not limited to any particular communications network. Also, it is to be understood that the term “request” is not limited only to requests for the return of data content from the server system. Depending on the particular application, a request may have different purposes and/or seek different results. The invention is not limited to any particular type of request.
  • Server system 103 includes a load balancer 104, a cost analyzer 106, and a plurality of servers 108-1 . . . 108-N. Cost analyzer 106 aids load balancer 104 in determining which one of the plurality of servers a request should be routed to. The load balancer may be content aware. In other words, load balancer 104 may have the ability to determine the contents of a request in order to make more intelligent routing decisions.
  • A content-aware load balancer can incur significantly more overhead for handling requests than a content-unaware load balancer. For example, if communication is taking place via Transmission Control Protocol/Internet Protocol (TCP/IP), then a content-aware load balancer would typically accept a TCP connection in order to examine the contents of a request. This step incurs considerable overhead and would normally not be required by a content-unaware load balancer.
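  • By way of a non-limiting illustration, the following Python sketch shows where that extra cost arises: the balancer must complete the TCP accept and read part of the request before any routing decision can be made. The listening port, backend addresses, and hash-based placeholder routing rule are assumptions made for the sketch, not part of the disclosed method.

```python
# Minimal sketch of a content-aware front end: it must accept the TCP
# connection and read the request before it can choose a backend, which
# is the extra overhead discussed above. Host/port values are illustrative.
import socket

BACKENDS = [("10.0.0.1", 8080), ("10.0.0.2", 8080), ("10.0.0.3", 8080)]

def choose_backend(request_bytes):
    # Placeholder routing rule: hash the request line so equal requests
    # land on the same server. A real cost analyzer would parse request
    # parameters here (see the later sketches).
    request_line = request_bytes.split(b"\r\n", 1)[0]
    return BACKENDS[hash(request_line) % len(BACKENDS)]

def serve_forever(listen_port=8000):
    lb = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    lb.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    lb.bind(("", listen_port))
    lb.listen(128)
    while True:
        client, _ = lb.accept()          # TCP handshake: the content-aware cost
        data = client.recv(4096)         # read enough of the request to examine it
        host, port = choose_backend(data)
        upstream = socket.create_connection((host, port))
        upstream.sendall(data)                 # forward the request
        client.sendall(upstream.recv(65536))   # relay a (truncated) response
        upstream.close()
        client.close()

if __name__ == "__main__":
    serve_forever()
```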
  • Principles of the invention provide features for scaling up content-aware load balancers to handle high request rates. FIG. 2 depicts one such feature.
  • Referring now to FIG. 2, a system is illustrated for scaling a content-aware load balancer, according to an embodiment of the present invention. As shown, server system 202 includes a content-unaware load balancer 204 and a plurality of content-aware load balancers 206-1 . . . 206-P.
  • Requests received from one or more clients (not shown) are initially routed to content-unaware load balancer 204 which subsequently routes requests to one or more of the plurality of content-aware load balancers 206-1 . . . 206-P. The content-unaware load balancer has higher throughput than any of the individual content-aware load balancers. A variety of schemes may be used by the content-unaware load balancer for sending requests to the content-aware load balancers including, but not limited to, round robin or methods considering the load on the content-aware load balancers.
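  • A minimal sketch of this two-tier arrangement follows, with Python objects standing in for the balancers. Round robin is shown for the content-unaware tier; the class names, balancer names, and server names are illustrative only.

```python
# Sketch of the two-tier arrangement of FIG. 2: a cheap content-unaware
# tier spreads connections over several content-aware balancers.
# Round robin is shown; a load-based policy could be substituted.
import itertools

class ContentUnawareTier:
    def __init__(self, content_aware_balancers):
        self._cycle = itertools.cycle(content_aware_balancers)

    def dispatch(self, connection):
        # No request parsing here: the connection is handed off untouched,
        # which is what keeps this tier's per-request cost low.
        balancer = next(self._cycle)
        return balancer.handle(connection)

class ContentAwareBalancer:
    def __init__(self, name, servers):
        self.name, self.servers = name, servers

    def handle(self, connection):
        # The expensive examination of request content happens only here.
        return f"{self.name} examined {connection!r} and will pick among {self.servers}"

if __name__ == "__main__":
    tier = ContentUnawareTier([
        ContentAwareBalancer("ca-lb-1", ["s1", "s2"]),
        ContentAwareBalancer("ca-lb-2", ["s3", "s4"]),
    ])
    for req in ["GET /a", "GET /b", "GET /c"]:
        print(tier.dispatch(req))
```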
  • Referring now to FIG. 3, a generalization of FIG. 2 is depicted in which content-aware load balancers are integrated with (e.g., reside on) one or more servers. That is, as shown, server system 302 includes content-unaware load balancer 304 and a plurality of content-aware load balancers 306-1 . . . 306-R respectively integrated with a plurality of servers 308-1 . . . 308-R.
  • Referring now to FIG. 4, a method 400 is illustrated for load balancing requests, according to an embodiment of the present invention. The server system architecture of FIG. 1 will be referenced to illustrate the steps of method 400. However, it is to be appreciated that other server system architectures (e.g., those shown in FIGS. 2 and 3, as well as others not expressly shown) may be employed.
  • In step 402, a load balancer (e.g., 104 of FIG. 1) receives a request. In step 404, a cost analyzer (e.g., 106 of FIG. 1) examines the request and identifies a server (e.g., 108-1 . . . 108-N) which will likely incur a low cost for satisfying the request. In step 406, the request is routed to a server identified in step 404.
  • There are a variety of ways in which step 404 may be implemented. FIG. 5 depicts one illustrative method.
  • Referring now to FIG. 5, a method 500 is illustrated for selecting a server based on cost, according to an embodiment of the present invention. In step 502, the request is examined. If the TCP/IP protocol is being used for communication, step 502 may involve accepting a TCP connection. The request may contain one or more parameters.
  • In one embodiment, data are partitioned among the servers 108-1 . . . 108-N asymmetrically. The proper server to handle a request depends on the request. For example, suppose that data are partitioned based on a name parameter. If there are three servers, then one server could handle names beginning with A through I, a second server could handle names beginning with J through R, and a third server could handle names beginning with S through Z.
  • When a request is received by the load balancer (e.g., step 402 of FIG. 4), in the scenario described above, a parameter corresponding to the request would contain the name. For example, if the name was “Smith,” then the request would be routed to the third server. If the name was “Jones,” then the request would be routed to the second server.
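  • The alphabetic-range example may be sketched as follows; the server identifiers and the handling of names outside A-Z are assumptions made for the illustration.

```python
# Sketch of the name-based partitioning example: three alphabetic ranges,
# one per server. Server identifiers are illustrative.
RANGES = [
    ("A", "I", "server-1"),
    ("J", "R", "server-2"),
    ("S", "Z", "server-3"),
]

def route_by_name(name):
    first = name[:1].upper()
    for lo, hi, server in RANGES:
        if lo <= first <= hi:
            return server
    raise ValueError(f"no partition handles names starting with {first!r}")

assert route_by_name("Smith") == "server-3"
assert route_by_name("Jones") == "server-2"
```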
  • In another scenario, the data are partitioned among the servers (108-1 . . . 108-N) in a more complicated fashion. A request routed to a server may result in a number of different accesses to non-local data depending on how the request is routed. By “non-local,” it is meant that the data sought are not stored at the server that initially receives the request, but rather are stored on a server or other network element remote from that server. For example, if the request is routed to the first server, this might result in three remote accesses. If the request is routed to the second server, this might result in one remote access. If the request is routed to the third server, this might result in eight remote accesses. These determinations would be made in step 504 based on the contents of the request.
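  • A sketch of this remote-access estimate appears below. The partition map, the data-item names, and the helper that derives which items a request touches are all assumed for illustration; the point is only that the analyzer counts, for each candidate server, the accesses that would be non-local.

```python
# Sketch of the remote-access estimate in step 504: given which data items a
# request touches and where each item lives, count the accesses that would be
# non-local for each candidate server. The partition map and item names are
# illustrative assumptions.
PARTITION_MAP = {           # data item -> server holding it
    "orders":    "server-1",
    "inventory": "server-2",
    "customers": "server-2",
    "pricing":   "server-3",
}

def items_touched(request):
    # Assumed helper: a real system would derive this from request parameters
    # and knowledge of how the application's data are laid out.
    return request["items"]

def remote_access_counts(request, servers):
    counts = {}
    for server in servers:
        counts[server] = sum(
            1 for item in items_touched(request)
            if PARTITION_MAP.get(item) != server
        )
    return counts

if __name__ == "__main__":
    req = {"items": ["orders", "inventory", "customers"]}
    print(remote_access_counts(req, ["server-1", "server-2", "server-3"]))
    # {'server-1': 2, 'server-2': 1, 'server-3': 3} -> server-2 is cheapest
```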
  • It is to be appreciated that one of the features of the invention, but not the only feature, which distinguishes it from existing techniques such as, for example, the ASPLOS-VIII and USITS'99 papers mentioned above is that the inventive techniques can be used for requests for dynamic data as well as for static data. By contrast, the above-referenced papers describe content-aware routing techniques which are only well-suited for static data. A request for static data is a request for data, such as a file, which exists at the time that a request is made. A request for dynamic data is one in which program code is executed in order to satisfy the request. For example, ordering an item at an e-commerce web site would typically be implemented as at least one dynamic request. The order might result in several database accesses, persistent state changes at the server, and an acknowledgement which is generated on-the-fly and sent back to the client in response to the order.
  • Satisfying dynamic requests is typically much more complicated than satisfying static requests. The overhead for dynamic requests is also generally much higher. The inventive methods used for selecting servers are thus different from those which have been proposed for selecting servers for static content.
  • An application may be analyzed to determine how it can best be partitioned to run on a set of servers. Partitions are often defined to have little interactions with other partitions and thus the state needing to be shared is minimized. Logically, each partition may be associated with a part of a computational task, which may include the code and state to process a specific set of requests.
  • Partitioning can be done statically by analyzing the business logic of the application. These partitions can be further refined at run time based on on-line workload statistics. Thus, the cost analyzer not only is preferably aware of the initial partitions but also may implement the logic to repartition the application online.
  • The cost analyzer may take a global view of the application to determine how to partition the application to minimize the cost of processing a particular set of requests. The cost of processing requests generally includes central processing unit (CPU) overhead and also the communication overhead of synchronizing the underlying state that may be shared by several backend servers. Thus, the cost analyzer may also determine how widely a piece of data should be replicated. While replicating a piece of data widely may increase the overall capacity of processing the requests associated with this piece of data, it may also increase the cost of synchronizing the data across all of the replicas. Thus, it is desirable for the cost analyzer to balance consistency requirements, request rates, and synchronization patterns of the data.
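  • The following sketch illustrates one possible, assumed cost model of this kind: per-request CPU work is spread across replicas while synchronization work grows with the number of replicas, and the analyzer picks the replication degree with the lowest estimated total. The specific formula and constants are illustrative, not derived from the disclosure.

```python
# Illustrative cost model for choosing a replication degree. The formula is a
# stand-in: per-request CPU cost falls as replicas share the load, while the
# cost of synchronizing writes grows with the number of replicas.
def estimated_cost(request_rate, write_fraction, cpu_cost, sync_cost, replicas):
    serving = request_rate * cpu_cost / replicas                      # work per replica
    syncing = request_rate * write_fraction * sync_cost * (replicas - 1)
    return serving + syncing

def best_replication_degree(request_rate, write_fraction,
                            cpu_cost=1.0, sync_cost=5.0, max_replicas=8):
    return min(range(1, max_replicas + 1),
               key=lambda r: estimated_cost(request_rate, write_fraction,
                                            cpu_cost, sync_cost, r))

if __name__ == "__main__":
    # Read-heavy data tends toward wide replication; write-heavy data does not.
    print(best_replication_degree(request_rate=1000, write_fraction=0.01))
    print(best_replication_degree(request_rate=1000, write_fraction=0.5))
```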
  • Conversely, not replicating data may also be considered by the cost analyzer. It may be more advantageous to route requests for particular data to one or a few servers hosting said data in order to avoid data replication costs. Further, the aggregate caching capabilities of a set of servers may be better utilized. For example, each of three servers in a server set, all equally capable of servicing any one request, may be able to cache only one of three frequently used large objects at any one time due to size constraints. The cost analyzer may choose to place large object A on server 1, large object B on server 2, and large object C on server 3. Subsequently, requests for each object would be routed to a server according to its cached location. In this example, each large object is able to be cached on at least one server, which results in overall improved performance. Thus, the cost analyzer can improve horizontal scalability of applications by making intelligent decisions about requests for service and where to service them based upon request content.
  • The underlying data held by servers may need to be migrated before request routing can be changed. Migrating underlying data typically comes with a cost. Thus, the cost analyzer may also take the cost of state migration into account to optimize routing.
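  • A toy sketch of folding migration cost into the decision follows; the threshold comparison and the numbers used are assumptions, intended only to show that a one-time migration cost is weighed against projected remote-access savings.

```python
# Sketch of accounting for state migration: a partition is moved only when the
# projected savings in remote-access cost over some horizon outweigh the
# one-time cost of migrating its state. All numbers are illustrative.
def should_migrate(remote_cost_per_request, expected_requests, migration_cost):
    projected_savings = remote_cost_per_request * expected_requests
    return projected_savings > migration_cost

if __name__ == "__main__":
    # Cheap to keep serving remotely for a trickle of requests...
    print(should_migrate(remote_cost_per_request=2.0,
                         expected_requests=10, migration_cost=500.0))      # False
    # ...but worth migrating once the partition becomes hot.
    print(should_migrate(remote_cost_per_request=2.0,
                         expected_requests=10_000, migration_cost=500.0))  # True
```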
  • Since remote accesses are costly, it is desirable to minimize them. In step 506, the system selects a server to route the request to based on the costs determined in step 504. In the remote-access example above, the second server would normally be the one selected since it requires only one remote access. If the second server is highly loaded compared to the first server, however, the first server might be selected in certain cases.
  • When the system determines costs of routing requests to different servers in step 504, this determination does not have to be completely accurate. In several cases, an estimate will suffice. In performing this determination, the system might execute code that determines parameters of the request, which servers need to be contacted to satisfy the request, or other information that can help satisfy the request. A straightforward approach would be to determine this information once by the cost analyzer and a second time by the server to which the request is routed. This approach incurs overhead due to the redundant calculations. An optimization to alleviate these redundant calculations is for the cost analyzer to store the relevant information in the request and send the augmented request to a server. The server then accesses information stored in the augmented request by the cost analyzer to obtain relevant information for satisfying the request and avoiding redundant calculations. A compiler can perform program transformation techniques on the program code for satisfying requests to fully or partially automate this optimization.
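  • One way to picture the augmented-request optimization is sketched below. The “x-routing-info” field, the dictionary-based request, and the analyze helper are hypothetical; the disclosure specifies only that information computed by the cost analyzer travels with the request so the chosen server need not recompute it.

```python
# Sketch of the "augmented request" optimization: the cost analyzer records
# what it already computed (parsed parameters, the servers that must be
# contacted) inside the request it forwards, so the chosen server does not
# repeat that work. The field names are illustrative, not a defined protocol.
def analyze(request):
    # Stand-in for the cost analyzer's examination of the request.
    params = dict(pair.split("=") for pair in request["query"].split("&"))
    plan = {"remote_servers": ["server-2"], "estimated_remote_accesses": 1}
    return params, plan

def augment_and_route(request):
    params, plan = analyze(request)
    request["x-routing-info"] = {"params": params, "plan": plan}
    return request          # forwarded to the selected server

def handle_at_server(request):
    info = request.get("x-routing-info")
    if info is not None:
        params, plan = info["params"], info["plan"]   # reuse, no re-parsing
    else:
        params, plan = analyze(request)               # fall back if absent
    return f"serving with params={params}, plan={plan}"

if __name__ == "__main__":
    req = {"query": "name=Jones&item=widget"}
    print(handle_at_server(augment_and_route(req)))
```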
  • The cost analyzer may consider partition definition, partition-to-server assignment, and desired optimization. Partition definition is the process of classifying requests into partitions. Partition-to-server assignment is the process of deciding on which server a classified request is to be handled (in other words, on which server a partition is to be located). Desired optimization considers how to best distribute partitions amongst available servers. Individual partitions can be moved from one server to another on demand.
  • The cost analyzer can make dynamic determinations for each of the control variables. For example, cost analyzer off-line analysis of system utilization may recommend a different classification into partitions scheme; or recommend a different allocation of partitions to servers. Further, cost analysis results may be different for varying numbers of servers in the server set, and may vary over time.
  • For varying numbers of servers, one partition-to-server allocation scheme may be optimal for two servers, another for three servers, another for four servers, and so forth. For example, say the partitions are named {0, 1, 2, 3, . . . , 9}. For two servers, cost analysis may determine that the optimal assignment is for partitions 0-6 to be assigned to server 1 and partitions 7-9 to server 2. For three servers, the cost analyzer may recommend partitions 0-3 be assigned to server 1, partitions 4-7 be assigned to server 2, and partitions 8-9 be assigned to server 3.
  • For time variability, the desired cost optimization function may specify one assignment of partitions to servers for the period 8AM to 5PM, then another assignment during the period 5PM to 8AM.
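  • The server-count and time-of-day variability can be sketched as follows, reusing the ten-partition example above. The daytime tables follow the text; the off-hours rotation is a made-up alternate assignment included only to show that the mapping may change with time.

```python
# Sketch of partition-to-server assignment that varies with the number of
# servers and with time of day, using the 10-partition example above. The
# specific tables and the 8AM/5PM boundary follow the text; everything else
# is illustrative.
from datetime import time

ASSIGNMENTS_BY_SERVER_COUNT = {
    2: {p: ("server-1" if p <= 6 else "server-2") for p in range(10)},
    3: {p: ("server-1" if p <= 3 else "server-2" if p <= 7 else "server-3")
        for p in range(10)},
}

def assignment_for(server_count, now):
    table = ASSIGNMENTS_BY_SERVER_COUNT[server_count]
    if time(8, 0) <= now < time(17, 0):
        return table                              # daytime assignment
    # Off-hours: hypothetical alternate table, e.g. shifted by one server.
    servers = sorted(set(table.values()))
    rotate = {s: servers[(i + 1) % len(servers)] for i, s in enumerate(servers)}
    return {p: rotate[s] for p, s in table.items()}

if __name__ == "__main__":
    print(assignment_for(3, time(10, 0))[8])   # 'server-3' during the day
    print(assignment_for(3, time(22, 0))[8])   # rotated assignment at night
```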
  • The cost analyzer need not optimize on “best” utilization of resources from the system's perspective. A simple load balancing technique is to route requests round-robin among a set of servers, or to choose the server with the least utilized CPU. Instead, the cost analyzer may classify requests and route them based upon said classification results, even though this may not result in the “best” utilization from the system's perspective.
  • Cost analysis may consider quality of service requirements. For example, a “gold” customer may be directed to a “fast” speed partition, a “silver” customer may be directed to a “medium” speed partition, and a “bronze” customer may be directed to a “slow” speed partition. That is, the cost analyzer may not always seek to optimize from the system's perspective, but rather relative to the request's importance.
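  • A sketch of such service-level-based classification follows; the tier names follow the text, while the request representation and the default to the lowest tier are assumptions.

```python
# Sketch of quality-of-service-aware classification: the customer tier carried
# in the request selects the partition speed class, independent of which
# assignment would be "best" from the system's point of view. Tier names
# follow the text; the request format is illustrative.
TIER_TO_PARTITION = {"gold": "fast", "silver": "medium", "bronze": "slow"}

def classify_by_service_level(request):
    tier = request.get("customer_tier", "bronze")   # default to the lowest tier
    return TIER_TO_PARTITION.get(tier, "slow")

if __name__ == "__main__":
    print(classify_by_service_level({"customer_tier": "gold"}))    # fast
    print(classify_by_service_level({"customer_tier": "silver"}))  # medium
    print(classify_by_service_level({}))                           # slow
```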
  • It is to be further appreciated that the present invention also comprises techniques for providing load balancing services. By way of example, a content provider agrees (e.g., via a service level agreement or some informal agreement or arrangement) with a customer or client to provide content. Then, based on terms of the service contract between the content provider and the content customer, the content provider provides content to the content customer in accordance with one or more of the load balancing methodologies of the invention described herein.
  • Referring finally to FIG. 6, a computing system is illustrated in accordance with which one or more components/steps of a load balancing system (e.g., components and methodologies described in the context of FIGS. 1 through 5) may be implemented, according to an embodiment of the present invention. It is to be understood that the individual components/steps may be implemented on one such computer system, or more preferably, on more than one such computer system. In the case of an implementation on a distributed computing system, the individual computer systems and/or devices may be connected via a suitable network, e.g., the Internet or World Wide Web. However, the system may be realized via private or local networks. The invention is not limited to any particular network.
  • Thus, the computing system shown in FIG. 6 represents an illustrative computing system architecture for a load balancer, a server, a cost analyzer, and/or combinations thereof, within which one or more of the steps of the load balancing techniques of the invention may be executed.
  • As shown, the computer system 600 may be implemented in accordance with a processor 602, a memory 604, I/O devices 606, and a network interface 608, coupled via a computer bus 610 or alternate connection arrangement.
  • It is to be appreciated that the term “processor” as used herein is intended to include any processing device, such as, for example, one that includes a CPU and/or other processing circuitry. It is also to be understood that the term “processor” may refer to more than one processing device and that various elements associated with a processing device may be shared by other processing devices.
  • The term “memory” as used herein is intended to include memory associated with a processor or CPU, such as, for example, RAM, ROM, a fixed memory device (e.g., hard drive), a removable memory device (e.g., diskette), flash memory, etc.
  • In addition, the phrase “input/output devices” or “I/O devices” as used herein is intended to include, for example, one or more input devices (e.g., keyboard, mouse, etc.) for entering data to the processing unit, and/or one or more output devices (e.g., speaker, display, etc.) for presenting results associated with the processing unit.
  • Still further, the phrase “network interface” as used herein is intended to include, for example, one or more transceivers to permit the computer system to communicate with another computer system via an appropriate communications protocol.
  • Accordingly, software components including instructions or code for performing the methodologies described herein may be stored in one or more of the associated memory devices (e.g., ROM, fixed or removable memory) and, when ready to be utilized, loaded in part or in whole (e.g., into RAM) and executed by a CPU.
  • Although illustrative embodiments of the present invention have been described herein with reference to the accompanying drawings, it is to be understood that the invention is not limited to those precise embodiments, and that various other changes and modifications may be made by one skilled in the art without departing from the scope or spirit of the invention.

Claims (22)

1. A method for satisfying requests in a system comprised of a plurality of servers, comprising the steps of:
providing at least one load balancer for routing requests to the plurality of servers;
obtaining, at the at least one load balancer, a request from a client;
examining, at the at least one load balancer, the request;
estimating costs of satisfying the request by at least two of the plurality of servers based on estimating at least one of a number and a cost of at least one remote access for satisfying the request; and
routing the request to a server of the plurality of servers with a low estimated cost of satisfying the request.
2. The method of claim 1, wherein the step of routing is accomplished by classifying a request into a partition and routing the request to a server hosting the partition.
3. The method of claim 1, wherein, in the step of providing at least one load balancer, the at least one load balancer comprises at least one content-unaware load balancer routing requests to a plurality of content-aware load balancers.
4. The method of claim 3, wherein, in the step of providing at least one load balancer, at least one of the plurality of content-aware load balancers resides on at least one of the plurality of servers.
5. The method of claim 1, wherein the step of estimating costs further comprises the steps of:
examining at least one parameter included in the request; and
using information about how data are partitioned among the plurality of servers to estimate at least one of numbers and costs of remote accesses for satisfying the request.
6. The method of claim 1, wherein the step of obtaining a request further comprises obtaining a request that is sent using the Transmission Control Protocol/Internet Protocol and wherein the step of examining further comprises accepting a TCP connection.
7. The method of claim 1, wherein the step of estimating costs comprises estimating at least one of a measure of resource utilization and a desired service level for satisfying the request.
8. The method of claim 1, wherein the step of estimating costs may vary with respect to at least one of a number of servers and time.
9. The method of claim 1, wherein, in the step of obtaining, at the at least one load balancer, a request from a client, the request comprises a request for dynamic data.
10. A method of satisfying requests in a system comprised of a plurality of servers, comprising the steps of:
providing at least one load balancer for routing requests to the plurality of servers;
obtaining, at the at least one load balancer, a request from a client;
examining, at the at least one load balancer, the request;
estimating costs of satisfying the request by at least two of the plurality of servers;
sending, from the load balancer, information obtained in the estimating step to a server of the plurality of servers along with the request; and
using, at the server, the information to satisfy the request.
11. A system for satisfying requests from at least one client, comprising:
a plurality of servers;
at least one content-aware load balancer for routing requests to the plurality of servers; and
at least one cost analyzer associated with the at least one content-aware load balancer for estimating costs of satisfying requests by different servers based on at least one of numbers and costs of remote accesses for satisfying requests.
12. The system of claim 11, wherein the at least one content-aware load balancer comprises a plurality of content-aware load balancers and wherein the system further comprises at least one content-unaware load balancer for routing requests to the plurality of content-aware load balancers.
13. The system of claim 12, wherein at least one of the plurality of content-aware load balancers resides on at least one of the plurality of servers.
14. The system of claim 11, wherein a request is routed by classifying the request into a partition and routing the request to a server hosting the partition.
15. The system of claim 11, wherein the at least one cost analyzer estimates costs by examining at least one parameter included in the request, and using information about how data are partitioned among the plurality of servers to estimate at least one of numbers and costs of remote accesses for satisfying the request.
16. The system of claim 11, wherein a request is sent using the Transmission Control Protocol/Internet Protocol and a TCP connection is accepted when the request is examined.
17. The system of claim 11, wherein the at least one cost analyzer estimates at least one of a measure of resource utilization and a desired service level for satisfying the request.
18. The system of claim 11, wherein estimating costs may vary with respect to at least one of a number of servers and time.
19. Apparatus for satisfying requests from at least one client in a system comprised of a plurality of servers, the apparatus comprising:
a memory; and
at least one processor coupled to the memory and operative to: (i) obtain a request from a client; (ii) examine content of the request; and (iii) estimate costs of satisfying the request by at least two of the plurality of servers based on estimating at least one of a number and a cost of at least one remote access for satisfying the request.
20. A method for satisfying requests from at least one client in a system comprised of a plurality of servers, comprising the steps of:
obtaining a request from a client;
examining content of the request; and
estimating costs of satisfying the request by at least two of the plurality of servers based on estimating at least one of a number and a cost of at least one remote access for satisfying the request.
21. An article of manufacture for use in satisfying requests from at least one client in a system comprised of a plurality of servers, comprising a machine readable medium containing one or more programs which when executed implement the steps of:
obtaining a request from a client;
examining content of the request; and
estimating costs of satisfying the request by at least two of the plurality of servers based on estimating at least one of a number and a cost of at least one remote access for satisfying the request.
22. A method for providing a load balancing service, comprising the step of:
a service provider providing a service to a customer which comprises:
obtaining a request from a client;
examining content of the request; and
estimating costs of satisfying the request by at least two of the plurality of servers based on estimating at least one of a number and a cost of at least one remote access for satisfying the request.
US11/094,905 2005-03-31 2005-03-31 Systems and methods for content-aware load balancing Abandoned US20060224773A1 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
US11/094,905 US20060224773A1 (en) 2005-03-31 2005-03-31 Systems and methods for content-aware load balancing
CN200680004598.5A CN101116056B (en) 2005-03-31 2006-03-29 Systems and methods for content-aware load balancing
PCT/EP2006/061130 WO2006103250A1 (en) 2005-03-31 2006-03-29 Systems and methods for content-aware load balancing
US12/132,811 US8185654B2 (en) 2005-03-31 2008-06-04 Systems and methods for content-aware load balancing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US11/094,905 US20060224773A1 (en) 2005-03-31 2005-03-31 Systems and methods for content-aware load balancing

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US12/132,811 Continuation US8185654B2 (en) 2005-03-31 2008-06-04 Systems and methods for content-aware load balancing

Publications (1)

Publication Number Publication Date
US20060224773A1 US20060224773A1 (en) 2006-10-05

Family

ID=36586162

Family Applications (2)

Application Number Title Priority Date Filing Date
US11/094,905 Abandoned US20060224773A1 (en) 2005-03-31 2005-03-31 Systems and methods for content-aware load balancing
US12/132,811 Expired - Fee Related US8185654B2 (en) 2005-03-31 2008-06-04 Systems and methods for content-aware load balancing

Family Applications After (1)

Application Number Title Priority Date Filing Date
US12/132,811 Expired - Fee Related US8185654B2 (en) 2005-03-31 2008-06-04 Systems and methods for content-aware load balancing

Country Status (3)

Country Link
US (2) US20060224773A1 (en)
CN (1) CN101116056B (en)
WO (1) WO2006103250A1 (en)

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090150565A1 (en) * 2007-12-05 2009-06-11 Alcatel Lucent SOA infrastructure for application sensitive routing of web services
US20090222583A1 (en) * 2008-03-03 2009-09-03 Microsoft Corporation Client-side load balancing
US20090222581A1 (en) * 2008-03-03 2009-09-03 Microsoft Corporation Internet location coordinate enhanced domain name system
US20090222582A1 (en) * 2008-03-03 2009-09-03 Microsoft Corporation Failover in an internet location coordinate enhanced domain name system
US20090260016A1 (en) * 2008-04-11 2009-10-15 Yahoo! Inc. System and/or method for bulk loading of records into an ordered distributed database
US20100179759A1 (en) * 2009-01-14 2010-07-15 Microsoft Corporation Detecting Spatial Outliers in a Location Entity Dataset
US8612134B2 (en) 2010-02-23 2013-12-17 Microsoft Corporation Mining correlation between locations using location history
US8645545B2 (en) 2010-11-24 2014-02-04 International Business Machines Corporation Balancing the loads of servers in a server farm based on an angle between two vectors
US8719198B2 (en) 2010-05-04 2014-05-06 Microsoft Corporation Collaborative location and activity recommendations
US20140243008A1 (en) * 2011-10-25 2014-08-28 Bo Wang Load balancing for charging system clusters
US8966121B2 (en) 2008-03-03 2015-02-24 Microsoft Corporation Client-side management of domain name information
US8972177B2 (en) 2008-02-26 2015-03-03 Microsoft Technology Licensing, Llc System for logging life experiences using geographic cues
US9009177B2 (en) 2009-09-25 2015-04-14 Microsoft Corporation Recommending points of interests in a region
US9246873B2 (en) 2011-12-22 2016-01-26 International; Business Machines Corporation Client-driven load balancing of dynamic IP address allocation
US9261376B2 (en) 2010-02-24 2016-02-16 Microsoft Technology Licensing, Llc Route computation based on route-oriented vehicle trajectories
US9536146B2 (en) 2011-12-21 2017-01-03 Microsoft Technology Licensing, Llc Determine spatiotemporal causal interactions in data
US9593957B2 (en) 2010-06-04 2017-03-14 Microsoft Technology Licensing, Llc Searching similar trajectories by locations
US9683858B2 (en) 2008-02-26 2017-06-20 Microsoft Technology Licensing, Llc Learning transportation modes from raw GPS data
US9754226B2 (en) 2011-12-13 2017-09-05 Microsoft Technology Licensing, Llc Urban computing of route-oriented vehicles
US9871711B2 (en) 2010-12-28 2018-01-16 Microsoft Technology Licensing, Llc Identifying problems in a network by detecting movement of devices between coordinates based on performances metrics
US10288433B2 (en) 2010-02-25 2019-05-14 Microsoft Technology Licensing, Llc Map-matching for low-sampling-rate GPS trajectories
US10542078B1 (en) * 2017-06-13 2020-01-21 Parallels International Gmbh System and method of load balancing traffic bursts in non-real time networks
US10817506B2 (en) * 2018-05-07 2020-10-27 Microsoft Technology Licensing, Llc Data service provisioning, metering, and load-balancing via service units
US11637796B2 (en) * 2014-07-15 2023-04-25 Zebrafish Labs, Inc. Image matching server network implementing a score between a server and an image store

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8631401B2 (en) * 2007-07-24 2014-01-14 Ca, Inc. Capacity planning by transaction type
US8261278B2 (en) * 2008-02-01 2012-09-04 Ca, Inc. Automatic baselining of resource consumption for transactions
US8402468B2 (en) * 2008-03-17 2013-03-19 Ca, Inc. Capacity planning based on resource utilization as a function of workload
CN102075409B (en) * 2009-11-24 2013-03-20 华为技术有限公司 Method and system for processing request message as well as load balancer equipment
US9177004B2 (en) * 2009-11-25 2015-11-03 Bmc Software, Inc. Balancing data across partitions of a table space during load processing
US8260958B2 (en) * 2010-02-24 2012-09-04 F5 Networks, Inc. Reducing energy consumption of servers
CN102469110A (en) * 2010-11-01 2012-05-23 英业达股份有限公司 Load balancing method applied to cluster system
CN103092527A (en) * 2011-10-31 2013-05-08 深圳市快播科技有限公司 Storage method and storage system for small files
CN103309843B (en) * 2012-03-06 2016-03-16 百度在线网络技术(北京)有限公司 The collocation method of server and system
KR102126507B1 (en) 2013-12-09 2020-06-24 삼성전자주식회사 Terminal, system and method of processing sensor data stream
CN105306605B (en) * 2015-12-09 2018-12-25 北京中电普华信息技术有限公司 A kind of double host server systems
CN107911438A (en) * 2017-11-06 2018-04-13 出门问问信息科技有限公司 The method, apparatus and system of data processing
US10963375B1 (en) * 2018-03-23 2021-03-30 Amazon Technologies, Inc. Managing maintenance operations for a distributed system
KR20200084707A (en) 2019-01-03 2020-07-13 삼성전자주식회사 Master device for managing distributed processing of task, task processing device for processing task and method for operating thereof

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030028636A1 (en) * 2001-06-20 2003-02-06 Ludmila Cherkasova System and method for workload-aware request distribution in cluster-based network servers
US20030229710A1 (en) * 2002-06-11 2003-12-11 Netrake Corporation Method for matching complex patterns in IP data streams
US20050076104A1 (en) * 2002-11-08 2005-04-07 Barbara Liskov Methods and apparatus for performing content distribution in a content distribution network
US20060031374A1 (en) * 2001-06-18 2006-02-09 Transtech Networks Usa, Inc. Packet switch and method thereof dependent on application content
US20060168107A1 (en) * 2004-03-16 2006-07-27 Balan Rajesh K Generalized on-demand service architecture for interactive applications
US7222190B2 (en) * 2001-11-02 2007-05-22 Internap Network Services Corporation System and method to provide routing control of information over data networks

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6247056B1 (en) * 1997-02-03 2001-06-12 Oracle Corporation Method and apparatus for handling client request with a distributed web application server
US6189043B1 (en) * 1997-06-09 2001-02-13 At&T Corp Dynamic cache replication in a internet environment through routers and servers utilizing a reverse tree generation
EP1212680B1 (en) * 1999-08-13 2007-07-04 Sun Microsystems, Inc. Graceful distribution in application server load balancing
US20030046394A1 (en) * 2000-11-03 2003-03-06 Steve Goddard System and method for an application space server cluster
US20020194324A1 (en) * 2001-04-26 2002-12-19 Aloke Guha System for global and local data resource management for service guarantees
JP2003256310A (en) * 2002-03-05 2003-09-12 Nec Corp Server load decentralizing system, server load decentralizing apparatus, content management apparatus and server load decentralizing program
CN1235157C (en) * 2002-10-10 2006-01-04 华为技术有限公司 Content-oriented load equalizing method and apparatus
CN100382550C (en) * 2004-09-01 2008-04-16 恒生电子股份有限公司 Method for processing shared data in on-line processing system

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060031374A1 (en) * 2001-06-18 2006-02-09 Transtech Networks Usa, Inc. Packet switch and method thereof dependent on application content
US20030028636A1 (en) * 2001-06-20 2003-02-06 Ludmila Cherkasova System and method for workload-aware request distribution in cluster-based network servers
US20060080388A1 (en) * 2001-06-20 2006-04-13 Ludmila Cherkasova System and method for workload-aware request distribution in cluster-based network servers
US7222190B2 (en) * 2001-11-02 2007-05-22 Internap Network Services Corporation System and method to provide routing control of information over data networks
US20030229710A1 (en) * 2002-06-11 2003-12-11 Netrake Corporation Method for matching complex patterns in IP data streams
US20050076104A1 (en) * 2002-11-08 2005-04-07 Barbara Liskov Methods and apparatus for performing content distribution in a content distribution network
US20060168107A1 (en) * 2004-03-16 2006-07-27 Balan Rajesh K Generalized on-demand service architecture for interactive applications

Cited By (44)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090150565A1 (en) * 2007-12-05 2009-06-11 Alcatel Lucent SOA infrastructure for application sensitive routing of web services
US9683858B2 (en) 2008-02-26 2017-06-20 Microsoft Technology Licensing, Llc Learning transportation modes from raw GPS data
US8972177B2 (en) 2008-02-26 2015-03-03 Microsoft Technology Licensing, Llc System for logging life experiences using geographic cues
US7930427B2 (en) 2008-03-03 2011-04-19 Microsoft Corporation Client-side load balancing
US7991879B2 (en) 2008-03-03 2011-08-02 Microsoft Corporation Internet location coordinate enhanced domain name system
US8275873B2 (en) 2008-03-03 2012-09-25 Microsoft Corporation Internet location coordinate enhanced domain name system
US8458298B2 (en) 2008-03-03 2013-06-04 Microsoft Corporation Failover in an internet location coordinate enhanced domain name system
US20090222582A1 (en) * 2008-03-03 2009-09-03 Microsoft Corporation Failover in an internet location coordinate enhanced domain name system
US20090222581A1 (en) * 2008-03-03 2009-09-03 Microsoft Corporation Internet location coordinate enhanced domain name system
US20090222583A1 (en) * 2008-03-03 2009-09-03 Microsoft Corporation Client-side load balancing
US8966121B2 (en) 2008-03-03 2015-02-24 Microsoft Corporation Client-side management of domain name information
US20090260016A1 (en) * 2008-04-11 2009-10-15 Yahoo! Inc. System and/or method for bulk loading of records into an ordered distributed database
US8893131B2 (en) * 2008-04-11 2014-11-18 Yahoo! Inc. System and/or method for bulk loading of records into an ordered distributed database
US20100179759A1 (en) * 2009-01-14 2010-07-15 Microsoft Corporation Detecting Spatial Outliers in a Location Entity Dataset
US9063226B2 (en) 2009-01-14 2015-06-23 Microsoft Technology Licensing, Llc Detecting spatial outliers in a location entity dataset
US9009177B2 (en) 2009-09-25 2015-04-14 Microsoft Corporation Recommending points of interests in a region
US9501577B2 (en) 2009-09-25 2016-11-22 Microsoft Technology Licensing, Llc Recommending points of interests in a region
US8612134B2 (en) 2010-02-23 2013-12-17 Microsoft Corporation Mining correlation between locations using location history
US9261376B2 (en) 2010-02-24 2016-02-16 Microsoft Technology Licensing, Llc Route computation based on route-oriented vehicle trajectories
US11333502B2 (en) * 2010-02-25 2022-05-17 Microsoft Technology Licensing, Llc Map-matching for low-sampling-rate GPS trajectories
US10288433B2 (en) 2010-02-25 2019-05-14 Microsoft Technology Licensing, Llc Map-matching for low-sampling-rate GPS trajectories
US8719198B2 (en) 2010-05-04 2014-05-06 Microsoft Corporation Collaborative location and activity recommendations
US10571288B2 (en) 2010-06-04 2020-02-25 Microsoft Technology Licensing, Llc Searching similar trajectories by locations
US9593957B2 (en) 2010-06-04 2017-03-14 Microsoft Technology Licensing, Llc Searching similar trajectories by locations
US8645545B2 (en) 2010-11-24 2014-02-04 International Business Machines Corporation Balancing the loads of servers in a server farm based on an angle between two vectors
US8676983B2 (en) 2010-11-24 2014-03-18 International Business Machines Corporation Balancing the loads of servers in a server farm based on an angle between two vectors
US9871711B2 (en) 2010-12-28 2018-01-16 Microsoft Technology Licensing, Llc Identifying problems in a network by detecting movement of devices between coordinates based on performances metrics
US20140243008A1 (en) * 2011-10-25 2014-08-28 Bo Wang Load balancing for charging system clusters
US9754226B2 (en) 2011-12-13 2017-09-05 Microsoft Technology Licensing, Llc Urban computing of route-oriented vehicles
US9536146B2 (en) 2011-12-21 2017-01-03 Microsoft Technology Licensing, Llc Determine spatiotemporal causal interactions in data
US9253144B2 (en) 2011-12-22 2016-02-02 International Business Machines Corporation Client-driven load balancing of dynamic IP address allocation
US9948600B2 (en) 2011-12-22 2018-04-17 International Business Machines Corporation Client-driven load balancing of dynamic IP address allocation
US9246873B2 (en) 2011-12-22 2016-01-26 International; Business Machines Corporation Client-driven load balancing of dynamic IP address allocation
US11637796B2 (en) * 2014-07-15 2023-04-25 Zebrafish Labs, Inc. Image matching server network implementing a score between a server and an image store
US10979493B1 (en) * 2017-06-13 2021-04-13 Parallel International GmbH System and method for forwarding service requests to an idle server from among a plurality of servers
US10542078B1 (en) * 2017-06-13 2020-01-21 Parallels International Gmbh System and method of load balancing traffic bursts in non-real time networks
US10970269B2 (en) 2018-05-07 2021-04-06 Microsoft Technology Licensing, Llc Intermediate consistency levels for database configuration
US10970270B2 (en) 2018-05-07 2021-04-06 Microsoft Technology Licensing, Llc Unified data organization for multi-model distributed databases
US10885018B2 (en) 2018-05-07 2021-01-05 Microsoft Technology Licensing, Llc Containerization for elastic and scalable databases
US11030185B2 (en) 2018-05-07 2021-06-08 Microsoft Technology Licensing, Llc Schema-agnostic indexing of distributed databases
US11321303B2 (en) 2018-05-07 2022-05-03 Microsoft Technology Licensing, Llc Conflict resolution for multi-master distributed databases
US10817506B2 (en) * 2018-05-07 2020-10-27 Microsoft Technology Licensing, Llc Data service provisioning, metering, and load-balancing via service units
US11379461B2 (en) 2018-05-07 2022-07-05 Microsoft Technology Licensing, Llc Multi-master architectures for distributed databases
US11397721B2 (en) 2018-05-07 2022-07-26 Microsoft Technology Licensing, Llc Merging conflict resolution for multi-master distributed databases

Also Published As

Publication number Publication date
US20080235397A1 (en) 2008-09-25
WO2006103250A1 (en) 2006-10-05
CN101116056A (en) 2008-01-30
CN101116056B (en) 2010-05-19
US8185654B2 (en) 2012-05-22

Similar Documents

Publication Publication Date Title
US8185654B2 (en) Systems and methods for content-aware load balancing
US6122666A (en) Method for collaborative transformation and caching of web objects in a proxy network
US8219693B1 (en) Providing enhanced access to stored data
US9667739B2 (en) Proxy-based cache content distribution and affinity
CN1113503C (en) Dynamic routing in internet
US20090327460A1 (en) Application Request Routing and Load Balancing
US20030097429A1 (en) Method of forming a website server cluster and structure thereof
US7793297B2 (en) Intelligent resource provisioning based on on-demand weight calculation
US20050188091A1 (en) Method, a service system, and a computer software product of self-organizing distributing services in a computing network
JP4925231B2 (en) Sending request fragments from a response aggregation surrogate
JP2004501431A (en) Application caching system and method
US9774676B2 (en) Storing and moving data in a distributed storage system
JP2000187609A (en) Method for retrieving requested object and recording device
Seth et al. Dynamic heterogeneous shortest job first (DHSJF): a task scheduling approach for heterogeneous cloud computing systems
US7827141B2 (en) Dynamically sizing buffers to optimal size in network layers when supporting data transfers related to database applications
US10691700B1 (en) Table replica allocation in a replicated storage system
Broberg et al. Task assignment with work-conserving migration
Meira et al. E-representative: a scalability scheme for e-commerce
US20020092012A1 (en) Smart-caching system and method
CN111010453A (en) Service request processing method, system, electronic device and computer readable medium
Ramana et al. AWSQ: an approximated web server queuing algorithm for heterogeneous web server cluster
Patil et al. High quality design and methodology aspects to enhance large scale web services
Jander et al. Service discovery in megascale distributed systems
Lee et al. High performance web server architecture with Kernel-level caching
Jayalakshmi et al. Dynamic data replication across geo-distributed cloud data centres

Legal Events

Date Code Title Description
AS Assignment

Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW Y

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:DEGENARO, LOUIS R.;GAO, LEI;IYENGAR, ARUN KWANGIL;AND OTHERS;REEL/FRAME:016186/0762;SIGNING DATES FROM 20050330 TO 20050405

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION