CONGESTION AVOIDANCE FOR THREADS IN SERVERS
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a computer system, and deals more particularly with a method, system, and computer program product for enhancing performance, reliability, and recoverability of a computer running a multi-threaded server application.
2. Description of the Related Art
A multi-threaded application is a software program that supports concurrent execution by multiple threads (that is, a re-entrant program). A thread is a single execution path within such a program. The threads execute sequentially within one process, under control of the operating system scheduler, which allocates time slices to available threads. A process is an instance of a running program. The operating system maintains information about each concurrent thread that enables the threads to share the CPU in time slices, but still be distinguishable from each other. For example, a different current instruction pointer is maintained for each thread, as are the values of registers. By maintaining some distinct state information, each execution path through the re-entrant program can operate independently, as if separate programs were executing. Other state information, such as virtual memory and file descriptors for open I/O (input/output) streams, is shared by all threads within the process for execution efficiency. On SMP (Symmetric Multiprocessor) machines, several of these threads may be executing simultaneously. The re-entrant program may contain mechanisms to synchronize these shared resources across the multiple execution paths.
Multi-threaded applications are increasingly common on servers running in an Internet environment, as well as in other networking environments such as intranets and extranets. In order to enable many clients to access the same server, the computer that receives and/or processes the client's request typically executes a multi-threaded application. The same instance of the application can then process multiple requests, where separate threads are used to isolate one client's request from the requests of other clients. When a server executes a multithreaded application program, the server may equivalently be referred to as a "threaded server", or "multithreaded server".
The TCP/IP protocol (Transmission Control Protocol/Internet Protocol) is the de facto standard method of transmitting data over networks, and is widely used in Internet transmissions and in other networking environments. TCP/IP uses the concept of a connection between two "sockets" for exchanging data between two computers, where a socket is comprised of an address identifying one of the computers, and a port number that identifies a particular process on that computer. The process identified by the port number is the process that will receive the incoming data for that socket. A socket is typically implemented as a queue by each of the two computers using the connection, whereby the computer sending data on the connection queues the data it creates for transmission, and the computer receiving data on the connection queues arriving data prior to processing that data.
When a multi-threaded server application communicates using a reliable protocol such as TCP/IP, congestion may occur. TCP/IP is considered a "reliable" protocol because messages that are sent to a receiver are buffered by the sender until the receiver acknowledges receipt thereof. If the acknowledgement is not received (e.g. because the message is lost in transmission), then the buffered message can be retransmitted. A limitation is placed on the amount of data that must be buffered at the sender and at the receiver. These limitations are referred to as "window sizes". When the amount of data a sender has sent to the receiver, and for which no acknowledgement has been received, reaches the sender's window size, then the sender is not permitted to send additional data to this receiver.
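The window-size behavior described above can be illustrated with a toy model. This is only a sketch: the class `WindowedSender` and its methods are hypothetical names introduced here, the window is counted in bytes for simplicity, and a refused send stands in for the point at which a real write operation would block.

```python
class WindowedSender:
    """Toy model of a sender limited by a window size (hypothetical helper)."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.unacked = 0  # bytes sent for which no acknowledgement has arrived

    def send(self, nbytes):
        # Refuse the send if it would push unacknowledged data past the window.
        # A real blocking write would stall at this point instead of returning.
        if self.unacked + nbytes > self.window_size:
            return False
        self.unacked += nbytes
        return True

    def acknowledge(self, nbytes):
        # An acknowledgement frees buffer space, reopening the window.
        self.unacked = max(0, self.unacked - nbytes)


sender = WindowedSender(window_size=1000)
first = sender.send(600)    # fits within the window
second = sender.send(600)   # would exceed the window, so it is refused
sender.acknowledge(600)     # receiver acknowledges the first message
third = sender.send(600)    # the window has reopened
```

In this model the second send is refused only because the first has not yet been acknowledged, which is the condition under which a thread performing the write would cease to do productive work.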
When this happens, any subsequent write operations attempted by the sender will "block". In the general case, a write operation is said to "block" when the operation does not return control to the executing program for some period of time. This may be due to any of a number of different factors, such as: congestion in the network; a sent message that is not received by the client; a client that fails to respond in a timely manner; filling up the transport layer buffer until the window size is reached, as described above; etc. If a write operation blocks, then the thread which is processing the write operation ceases to do productive work. A server application using a reliable protocol such as TCP/IP has no way of conclusively predicting whether the write operation used to send data to a particular receiver will block. If there are a relatively small number of threads processing the set of connections for a particular server application, then relatively few blocked write operations can cause the entire server application to be blocked from functioning. With the increasing popularity of multi-threaded applications such as those running on Web servers in the Internet, which may receive thousands or even millions of "hits" (i.e. client requests for processing) per day, the performance, reliability, and recoverability of server applications become a critical concern. Furthermore, because an incoming request to a server application often has a human waiting for the response at the client, processing inefficiencies (such as blocked threads) in a server application must be avoided to the greatest extent possible.
Accordingly, a need exists for a technique by which these inefficiencies in the current implementations of multithreaded server applications can be overcome.
SUMMARY OF THE INVENTION
An object of the present invention is to provide a technique for enhancing the performance, reliability, and recoverability of multi-threaded server applications.
Another object of the present invention is to provide a technique whereby these enhancements are achieved by a priori avoidance of congestion for threads in multi-threaded server applications.
It is another object of the present invention to provide this congestion avoidance by enforcing a policy that limits the number of threads which may handle connections to a particular host.
It is yet another object of the present invention to provide a technique for dynamically adjusting the limit on the number of threads used for connections to a particular host.
It is a further object of the present invention to provide a technique for detecting failures or blocks in communication with a particular receiving host, enabling recovery operations to be attempted.
Still another object of the present invention is to provide a technique for minimizing synchronization requirements, and the negative results that may occur when multiple threads need access to the same synchronized resources.
Other objects and advantages of the present invention will be set forth in part in the description and in the drawings which follow and, in part, will be obvious from the description or may be learned by practice of the invention.
To achieve the foregoing objects, and in accordance with the purpose of the invention as broadly described herein, the present invention provides a system, method, and computer program product for enhancing performance, reliability, and recoverability of a multi-threaded application by avoiding congestion for threads therein. In a first aspect, this technique comprises: executing a plurality of worker threads; receiving a plurality of incoming client requests for connections onto an incoming queue; transferring each of the received client requests for connections from the incoming queue to a wide queue, the wide queue comprising a plurality of queues wherein each of the queues is separately synchronization-protected; and servicing, by the plurality of worker threads, the client requests by retrieving selected ones of the client requests from the wide queue. The transferring may further comprise placing each of the received client requests on a selected one of the plurality of queues using a First-In, First-Out (FIFO) strategy, wherein the selected one of the plurality of queues is selected using a round-robin approach. In this case, the technique further comprises returning the retrieved selected ones of the client requests to the wide queue using the FIFO strategy and the round-robin approach upon completion of the servicing.
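The wide queue of this first aspect can be sketched as follows. This is a minimal illustration and not the implementation of the preferred embodiment: the class name `WideQueue`, its methods, and the small lock guarding the round-robin indices are assumptions made here. Each sub-queue is FIFO and has its own lock, so two threads contend only when they happen to touch the same sub-queue.

```python
import threading
from collections import deque


class WideQueue:
    """Several FIFO queues, each separately lock-protected (hypothetical sketch)."""

    def __init__(self, width):
        self._queues = [deque() for _ in range(width)]
        self._locks = [threading.Lock() for _ in range(width)]
        self._put_index = 0
        self._get_index = 0
        # Assumption: a short-lived lock protects only the round-robin counters.
        self._index_lock = threading.Lock()

    def _next(self, attr):
        # Return the current round-robin position and advance it by one.
        with self._index_lock:
            i = getattr(self, attr)
            setattr(self, attr, (i + 1) % len(self._queues))
            return i

    def put(self, item):
        # Place the item at the tail of the sub-queue chosen round-robin.
        i = self._next("_put_index")
        with self._locks[i]:
            self._queues[i].append(item)

    def get(self):
        # Start at the round-robin position and try each sub-queue once.
        start = self._next("_get_index")
        n = len(self._queues)
        for offset in range(n):
            i = (start + offset) % n
            with self._locks[i]:
                if self._queues[i]:
                    return self._queues[i].popleft()
        return None  # every sub-queue was empty


wq = WideQueue(width=3)
for request in ["a", "b", "c", "d"]:
    wq.put(request)
sizes = [len(q) for q in wq._queues]
drained = [wq.get() for _ in range(4)]
```

A worker thread in this sketch would call `get()`, service the request, and call `put()` to return it to the wide queue upon completion of the servicing.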
In another aspect, this technique comprises: executing a plurality of worker threads; receiving a plurality of incoming client requests onto a queue, wherein each of the client requests is for a connection to a host; retrieving, by individual ones of the worker threads, a selected one of the client requests from the queue; determining a number of connections to the host to which the connection is requested in the selected client request, wherein this number counts the connections which are currently assigned to one or more of the worker threads; processing the selected client request if the number is less than an upper limit, and not processing the selected client request otherwise; and returning the processed client request or the not processed client request to the queue. The upper limit may be a system-wide value. The upper limit may alternatively be a value specific to the host to which the connection is requested.
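The upper-limit check of this aspect might be sketched as below. The class `HostLimiter`, its method names, and the example host names are hypothetical and are introduced only for illustration. The sketch supports both a system-wide limit and an optional host-specific override.

```python
import threading
from collections import Counter


class HostLimiter:
    """Caps how many worker threads may serve one host at a time (sketch)."""

    def __init__(self, default_limit, per_host_limits=None):
        self.default_limit = default_limit                # system-wide value
        self.per_host_limits = dict(per_host_limits or {})  # host-specific values
        self._active = Counter()                          # connections in service
        self._lock = threading.Lock()

    def limit_for(self, host):
        return self.per_host_limits.get(host, self.default_limit)

    def try_acquire(self, host):
        # Process the request only if the host is below its upper limit.
        with self._lock:
            if self._active[host] < self.limit_for(host):
                self._active[host] += 1
                return True
            return False

    def release(self, host):
        # Called when a worker thread stops working on a connection to the host.
        with self._lock:
            if self._active[host] > 0:
                self._active[host] -= 1


limiter = HostLimiter(default_limit=2, per_host_limits={"slow.example": 1})
results = [
    limiter.try_acquire("slow.example"),
    limiter.try_acquire("slow.example"),
    limiter.try_acquire("fast.example"),
    limiter.try_acquire("fast.example"),
    limiter.try_acquire("fast.example"),
]
limiter.release("slow.example")
after_release = limiter.try_acquire("slow.example")
```

A worker thread that receives `False` from `try_acquire` would return the request to the queue unprocessed and move on to other work, rather than risk blocking on a host that already occupies its share of threads.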
When the upper limit is host-specific, the value may be dynamically computed, in which case the technique further comprises: executing a supervisor thread; monitoring, by the supervisor thread, whether connections to each of the hosts succeed or fail; and decrementing the value when the connections to the host fail. Optionally, the value may be incremented when the connections to the host succeed. The monitoring preferably further comprises: setting, by each of the worker threads, a thread time stamp when the worker thread performs active work; comparing, by the supervisor thread, the thread time stamp for each of the worker threads to a system time, thereby computing an elapsed time for the worker thread; and deactivating the worker thread if the elapsed time exceeds a maximum allowable time.
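The supervisor logic described above can be sketched with two small functions. The names `WorkerRecord`, `find_stalled`, and `adjust_limit` are hypothetical, as are the `floor` and `ceiling` bounds on the limit, which the text does not specify. Times are passed in explicitly so the example is deterministic; a running supervisor thread would read the system clock instead.

```python
class WorkerRecord:
    """Per-worker state inspected by the supervisor (hypothetical structure)."""

    def __init__(self, name):
        self.name = name
        self.time_stamp = None  # set by the worker when it performs active work
        self.active = True


def find_stalled(workers, now, max_elapsed):
    """Deactivate workers whose elapsed time exceeds max_elapsed; return names."""
    stalled = []
    for w in workers:
        if w.active and w.time_stamp is not None:
            elapsed = now - w.time_stamp
            if elapsed > max_elapsed:
                w.active = False
                stalled.append(w.name)
    return stalled


def adjust_limit(limit, connection_succeeded, floor=1, ceiling=8):
    """Raise the host's limit on success, lower it on failure (bounds assumed)."""
    if connection_succeeded:
        return min(ceiling, limit + 1)
    return max(floor, limit - 1)


a, b, c = WorkerRecord("a"), WorkerRecord("b"), WorkerRecord("c")
a.time_stamp = 100.0   # started work 10 time units before the check
b.time_stamp = 109.0   # started work 1 time unit before the check
stalled = find_stalled([a, b, c], now=110.0, max_elapsed=5.0)

limit = 3
history = []
for outcome in [False, False, False, True]:
    limit = adjust_limit(limit, outcome)
    history.append(limit)
```

Worker `c` has no time stamp because it is idle, so the supervisor leaves it alone; only a worker that began active work and has not finished within the maximum allowable time is deactivated.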
This aspect may further comprise: providing information for each of the hosts, the information comprising an address of the host and a plurality of in-use flags; setting a selected one of the in-use flags when a particular worker thread is processing work on the connection to a particular host, wherein the selected one of the in-use flags is associated with the particular worker thread; and resetting the selected one of the in-use flags when the particular worker thread stops processing work on the connection to said particular host. Determining the number of currently-assigned connections preferably further comprises counting how many of the in-use flags are set.
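The per-host information with in-use flags might look like the following sketch. The class name `HostRecord`, the use of integer worker identifiers as flag indices, and the example address are assumptions made for illustration. Because each worker thread writes only the flag associated with itself, this layout plausibly lets the flags be updated without a shared lock, although the text here does not state that explicitly.

```python
class HostRecord:
    """Host address plus one in-use flag per worker thread (hypothetical sketch)."""

    def __init__(self, address, num_workers):
        self.address = address
        self.in_use = [False] * num_workers  # index = worker thread identifier

    def set_in_use(self, worker_id):
        # The worker starts processing work on a connection to this host.
        self.in_use[worker_id] = True

    def reset_in_use(self, worker_id):
        # The worker stops processing work on a connection to this host.
        self.in_use[worker_id] = False

    def assigned_connections(self):
        # The number of currently-assigned connections is the count of set flags.
        return sum(self.in_use)


host = HostRecord("192.0.2.10", num_workers=4)
host.set_in_use(0)
host.set_in_use(2)
count_while_busy = host.assigned_connections()
host.reset_in_use(0)
count_after_reset = host.assigned_connections()
```

The count returned by `assigned_connections()` is the value that would be compared against the upper limit before a worker thread processes a request for this host.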
In this aspect, the queue may be a wide queue comprised of a plurality of First-In, First-Out (FIFO) queues.
The present invention will now be described with reference to the following drawings, in which like reference numbers denote the same element throughout.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram of a computer workstation environment in which the present invention may be practiced;
FIG. 2 is a diagram of a networked computing environment in which the present invention may be practiced;
FIG. 3 illustrates the components involved in a preferred embodiment of the present invention; and
FIGS. 4-6 depict flowcharts which set forth the logic with which a preferred embodiment of the present invention may be implemented.
DESCRIPTION OF THE PREFERRED EMBODIMENT
FIG. 1 illustrates a representative workstation hardware environment in which the present invention may be practiced. The environment of FIG. 1 comprises a representative single user computer workstation 10, such as a personal computer, including related peripheral devices. The workstation 10 includes a microprocessor 12 and a bus 14 employed to connect and enable communication between the microprocessor 12 and the components of the workstation 10 in accordance with known techniques. The workstation 10 typically includes a user interface adapter 16, which connects the microprocessor 12 via the bus 14 to one or more interface devices, such as a keyboard 18, mouse 20, and/or other interface devices 22, which can be any user interface device, such as a touch sensitive screen, digitized entry pad, etc. The bus 14 also connects a display device 24, such as an LCD screen or monitor, to the microprocessor 12 via a display adapter 26. The bus 14 also connects the microprocessor 12 to memory 28 and long-term storage 30 which can include a hard drive, diskette drive, tape drive, etc.
The workstation 10 may communicate with other computers or networks of computers, for example via a communications channel or modem 32. Alternatively, the workstation 10 may communicate using a wireless interface at 32, such as a CDPD (cellular digital packet data) card. The workstation 10 may be associated with such other computers in a local area network (LAN) or a wide area network (WAN), or the workstation 10 can be a client in a client/ server arrangement with another computer, etc. All of these configurations, as well as the appropriate communications hardware and software, are known in the art.
FIG. 2 illustrates a data processing network 40 in which the present invention may be practiced. The data processing network 40 may include a plurality of individual networks, such as wireless network 42 and network 44, each of which may include a plurality of individual workstations 10. Additionally, as those skilled in the art will appreciate, one or more LANs may be included (not shown), where a LAN may comprise a plurality of intelligent workstations coupled to a host processor.
Still referring to FIG. 2, the networks 42 and 44 may also include mainframe computers or servers, such as a gateway computer 46 or application server 47 (which may access a data repository 48). A gateway computer 46 serves as a point of entry into each network 44. The gateway 46 may be