WO2004084508A1 - Method and apparatus for controlling congestion in communications network - Google Patents

Method and apparatus for controlling congestion in communications network

Info

Publication number
WO2004084508A1
Authority
WO
WIPO (PCT)
Prior art keywords
packet
sample
match
buffer
discard probability
Prior art date
Application number
PCT/SG2003/000054
Other languages
French (fr)
Inventor
Mehul Motani
Saravanan Govindan
Peng Yong Kong
Original Assignee
Agency For Science, Technology And Research
Priority date
Filing date
Publication date
Application filed by Agency For Science, Technology And Research
Priority to AU2003217146A (AU2003217146A1)
Priority to PCT/SG2003/000054 (WO2004084508A1)
Publication of WO2004084508A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/12 Avoiding congestion; Recovering from congestion
    • H04L47/129 Avoiding congestion; Recovering from congestion at the destination endpoint, e.g. reservation of terminal resources or buffer space
    • H04L47/24 Traffic characterised by specific attributes, e.g. priority or QoS
    • H04L47/2425 Traffic characterised by specific attributes, e.g. priority or QoS for supporting services specification, e.g. SLA
    • H04L47/2433 Allocation of priorities to traffic types
    • H04L47/2441 Traffic characterised by specific attributes, e.g. priority or QoS relying on flow classification, e.g. using integrated services [IntServ]
    • H04L47/29 Flow control; Congestion control using a combination of thresholds
    • H04L47/32 Flow control; Congestion control by discarding or delaying data units, e.g. packets or frames
    • H04L47/41 Flow control; Congestion control by acting on aggregated flows or links

Definitions

  • This invention relates to congestion control in communications networks.
  • In particular, it relates to a method and apparatus for managing buffers in a packet switching network.
  • Communications networks such as the Internet transport information in packet form from source nodes to destination nodes over transmission links, which connect the source nodes and the destination nodes.
  • Generally, source-destination node pairs are connected by a series of intermediate nodes, called routers or switches, whose main function is to route packets received from source and/or previous intermediate nodes to the appropriate subsequent nodes and/or destination nodes.
  • The intermediate nodes also have buffers at their input and/or output ports for temporary storage of packets during periods when the number of packets arriving at the intermediate nodes exceeds the bandwidth capacity of the output transmission link.
  • TCP: Transmission Control Protocol
  • TCP uses an adaptive window-based congestion control mechanism.
  • When a source node detects packet loss, it decreases its transmission rate by reducing the size of its congestion window.
  • A round-trip time is the time from the start of transmission of a packet to the beginning of receipt of an acknowledgement for the transmitted packet.
  • Packet losses caused by congestion lead to abrupt reductions in transmission rates and slow the subsequent growth of the congestion window. This can limit throughput and lower link utilization.
  • TCP has no way of warning the source nodes about the onset of congestion before the buffer limit is reached. This can result in many existing active flows achieving meager throughput and new active flows achieving zero throughput, because the first packet sent by these new active flows is discarded by intermediate nodes along the transmission links.
  • An intermediate node typically requires large buffers to accommodate heavy network traffic.
  • However, large buffers are not a desirable solution because it is extremely difficult to predict the maximum traffic flow and therefore to select the correct buffer size. Further, excessive buffering can result in high delays that may be unfavorable to many applications. Therefore there is a need to provide buffer management schemes to control congestion on the Internet and similar communications networks.
  • RED: Random Early Detection
  • SRED: Stabilized RED
  • CHOKe: CHOose and Kill for unresponsive flows
  • The RED scheme was proposed by S. Floyd and V. Jacobson in "Random Early Detection Gateways for Congestion Avoidance," IEEE/ACM Transactions on Networking, vol. 1, no. 4, pp. 397-413, August 1993.
  • This buffer management scheme relies on adaptive transport protocols like TCP.
  • Source nodes are notified about the onset of congestion at an intermediate node when the average buffer content at the intermediate node exceeds a predetermined threshold.
  • The RED scheme notifies the source node(s) by discarding packets based on a probability that is related to the average occupancy level of the buffer. If the source nodes use TCP, they reduce their transmission rates in response to packet drops, thus reducing congestion at intermediate nodes along the transmission links to destination nodes.
  • Some of the key features of the RED scheme include the ability to avoid synchronization of flows by randomly dropping packets from different active flows, the maintenance of low buffer occupancy levels to ensure short delays and the prevention of bias against active flows with bursty traffic patterns.
  • The RED scheme achieves these features by first computing the average occupancy level of the buffer using an exponentially weighted moving average and comparing it to two thresholds: a lower bound average buffer occupancy threshold (Min_th) and an upper bound average buffer occupancy threshold (Max_th). Incoming packets are discarded with a packet drop probability that is a linear function of the average occupancy level of the buffer when the average occupancy level is between the two average buffer occupancy thresholds.
  • The packet drop probability is zero when the average buffer occupancy level is below Min_th and one when the average buffer occupancy level is above Max_th, as shown in a packet drop probability curve 100 in FIG. 1.
  • Between the two thresholds, the packet drop probability P_d is calculated as follows:
  • $P_d = Max_p \cdot \dfrac{avg - Min_{th}}{Max_{th} - Min_{th}}$
  • Max_p represents a maximum packet drop probability constant and avg represents the average occupancy level of the buffer.
  • Max_p sets the maximum probability for discarding a packet.
  • Max_p can have a value ranging from greater than zero up to one.
  • The average occupancy level of the buffer avg is calculated as follows:
  • $avg = (1 - w)\,avg_{prev} + w\,q$
  • w represents a weight constant, avg_prev represents the immediate previous average occupancy level of the buffer and q represents the current occupancy level of the buffer.
  • the RED scheme works well for random bursty traffic and in situations where congestion is not prolonged. During persistent congestion periods the RED scheme applies a uniform packet drop probability on all arriving packets irrespective of the level of contribution each of the active flows has on the buffer and therefore the state of congestion. That is, the RED scheme is not able to distinguish which of the active flows contribute the most to the congestion and therefore unable to penalize these active flows accordingly. This leads to unfair bandwidth utilization by certain active flows on the Internet and the like communications networks that deploy the RED scheme in the intermediate nodes. Active flows that have high levels of contribution to the congestion at intermediate nodes are referred to as misbehaving flows.
  • The SRED scheme was proposed by T. Ott et al. in "SRED: Stabilized RED," IEEE INFOCOM, vol. 3, pp. 1346-1355, 1999, and in a related US patent no. 6,434,116.
  • the SRED scheme seeks to improve upon the RED scheme by stabilizing the occupancy of a buffer at a level independent of the number of active flows.
  • the SRED scheme achieves this by estimating the number of active flows and finding misbehaving flows.
  • the main idea is to compare the information of a newly arrived packet with information from a randomly chosen entry from a fixed size data structure, called the "Zombie list".
  • the Zombie list contains a list of entries, whereby each entry contains information of a recent packet that traversed the buffer.
  • the SRED scheme is able to identify misbehaving flows.
  • the SRED scheme does not propose a simple router mechanism for penalizing the misbehaving flows.
  • the SRED scheme requires additional resources to store, maintain and operate the Zombie list. This invariably results in the intermediate nodes requiring greater hardware and processing capability, which increases the overall cost of the intermediate nodes.
  • The CHOKe scheme was proposed by R. Pan et al.
  • The CHOKe scheme compares a newly arrived packet with a randomly selected set of packets from the buffer. Once a packet from the randomly selected set is found to originate from the same source node as the newly arrived packet, a match is declared and both the matched packet and the newly arrived packet are discarded. If a match is not found, the newly arrived packet is admitted into the buffer subject to the RED packet drop probability P_d as described in the foregoing.
  • The assumption behind the CHOKe scheme is that the buffer is likely to contain a greater number of packets belonging to misbehaving flows than to normal active flows. Thus, the packets from the misbehaving flows are more likely to be selected for comparison with the newly arrived packet.
  • The CHOKe scheme provides fairer bandwidth utilization for all active flows. However, the CHOKe scheme can result in bursty losses, as each packet drop decision affects both the newly arrived packet and the packets already admitted into the buffer.
  • Because the CHOKe scheme fundamentally relies on the RED packet drop probability P_d, it also suffers from the same drawback of uniform dropping during persistent congestion periods as the RED scheme, since the packet drop probability for each active flow is calculated as a function of the packets from all active flows.
  • a random sampling buffer management (RS) scheme provides an efficient buffer management scheme that exhibits a high degree of fairness in allocating bandwidth to all active flows.
  • The RS scheme penalizes active flows in proportion to the level of contribution of each of the active flows to the overall congestion at the intermediate node.
  • the level of contribution of each of the active flows is determined by randomly sampling packets (the sample) stored in the buffer of the intermediate node and comparing the sample with a newly arrived packet at the intermediate node.
  • a match is declared each time a packet in the sample and the newly arrived packet are determined to have the same flow identifier.
  • the number of matches declared is an indication of the level of contribution the active flow has on the overall state of congestion at the intermediate node. Accordingly, the newly arrived packet is discarded according to a probability based on the number of matches declared.
  • a method for controlling congestion in a communications network comprising a recipient node, the recipient node comprising a buffer for storing packets received at the recipient node, the method comprising the steps of: providing a sample, the sample comprising at least one packet randomly selected from packets stored in the buffer upon receiving a packet at the recipient node, the at least one packet and the received packet each having a flow identifier and a packet size, the flow identifier being indicative of the active flow to which the packet belongs; identifying a match group for grouping the at least one packet in the sample, wherein the flow identifier of each packet in the match group matches the flow identifier of the received packet; generating a discard probability based on the sample and the match group; and discarding the received packet based on the generated discard probability to control congestion in the communications network.
  • an apparatus for controlling congestion in a communications network comprising a recipient node, the recipient node comprising a buffer for storing packets received at the recipient node
  • the apparatus comprising: a sample, the sample comprising at least one packet randomly selected from packets stored in the buffer upon receiving a packet at the recipient node, the at least one packet and the received packet each having a flow identifier and a packet size, the flow identifier being indicative of the active flow to which the packet belongs; means for identifying a match group for grouping the at least one packet in the sample, wherein the flow identifier of each packet in the match group matches the flow identifier of the received packet; means for generating a discard probability based on the sample and the match group; and means for discarding the received packet based on the discard probability to control congestion in the communications network.
  • FIG. 1 shows a prior art packet drop probability curve of the Random Early Detection (RED) scheme
  • FIG. 2 shows a flowchart of the steps performed by an intermediate node for managing its buffers in accordance with a first embodiment of the invention
  • FIG. 3 shows a flowchart of the steps performed by an intermediate node for managing its buffers in accordance with a second embodiment of the invention.
  • a random sampling buffer management (RS) scheme according to the embodiments of the invention is provided hereinafter.
  • The RS scheme provides an efficient buffer management scheme that exhibits a high degree of fairness in allocating bandwidth to all active flows, in addition to the ability to avoid synchronization of active flows by randomly dropping packets, the maintenance of low buffer occupancy levels to ensure short delays and the prevention of bias against active flows with bursty traffic patterns.
  • a packet switching network such as the Internet consists of multiple intermediate nodes linked together by multiple transmission links for transporting information in packet form from one or more source nodes to one or more destination nodes.
  • each intermediate node has a switch or a router, which receives packets from source and/or previous intermediate nodes and redirects these packets to their respective subsequent intermediate or destination nodes.
  • the intermediate nodes may not be able to cope with the large volume of arriving packets due to the limited bandwidth capacity of the outgoing transmission link.
  • High traffic periods occur when many active flows are transmitting information at the same time or when one or more active flows are transmitting large volumes of information over a short time period (i.e. bursty traffic).
  • The intermediate nodes typically discard incoming packets to implicitly signal the source nodes to reduce their transmission rates, thereby reducing the number of packets being transmitted.
  • the sizes of the congestion windows of the source nodes are reduced, typically by one-half.
  • intermediate nodes also have buffers for temporarily storing arriving packets. However, because these buffers cannot be too large, an efficient buffer management scheme is needed to control the congestion at the intermediate nodes and at the same time provide fair bandwidth allocation to all active flows traversing the intermediate nodes.
  • the RS scheme according to the embodiments of the invention operates on the premise that the cause of the congestion at an intermediate node is most likely to be the active flow(s) with the greatest contribution to the state of the buffer. Therefore, to provide a high degree of fairness in the allocation of bandwidth to all active flows traversing the intermediate node, the active flows that contribute the most to the congestion (i.e. misbehaving flows) need to be identified and penalized accordingly. Furthermore, each misbehaving flow is to be penalized in accordance with its level of contribution to the state of the buffer and thereby the state of congestion. This ensures that not all active flows are penalized uniformly, rather each active flow is penalized depending on its contribution to the state of congestion. This results in a high degree of fairness in the utilization of the outgoing transmission link bandwidth.
  • the RS scheme according to a first embodiment of the invention is shown in flowchart 200 in FIG. 2.
  • An intermediate node implementing the RS scheme carries out the steps shown in the flowchart 200 as described hereinafter.
  • the router at an intermediate node receives a newly arrived packet in a step 202
  • the router randomly samples a set of packets (the sample) from the buffer for comparison with the newly arrived packet.
  • the packets in the sample are randomly selected from the buffer one packet at a time, in a step 206, for comparison in the matching process in the steps 208 to 212.
  • The size of the sample, S_size, can range from one packet up to the instantaneous buffer size.
  • The sample size S_size is preferably selected from a range of 20% to 50% of the instantaneous buffer size.
  • A temporary counter C and a matching counter MC are provided in a step 204 for use in the matching process.
  • The temporary counter C is used for counting the number of packets compared during the matching process, while the matching counter MC is used for tracking the number of matches found during the matching process.
  • The temporary counter C and the matching counter MC are both assigned an initial value of "0".
  • Upon acquiring a randomly sampled packet from the buffer in the step 206, the temporary counter C is incremented by one and a step 208 is activated to begin the matching process. In the step 208, if the temporary counter C is less than or equal to the sample size S_size, the requisite number of randomly sampled packets has not yet been compared.
  • The newly arrived packet is then compared with a previously uncompared packet randomly sampled from the buffer in a step 210.
  • Each active flow has a flow identifier.
  • the flow identifier comprises one or a combination of destination and source addresses, destination and source port numbers, protocol identifier, and the like identifiers.
  • a match is found when the newly arrived packet and the randomly sampled packet from the buffer are determined to have the same flow identifier.
  • If a match is found, the matching counter MC is incremented by one in a step 212 to register the number of matches found for the newly arrived packet.
  • The router then reverts to the step 206 to retrieve another un-checked randomly sampled packet from the buffer for comparison.
  • If no match is found, the router reverts directly to the step 206 to compare the next un-checked randomly sampled packet from the buffer.
  • The packet drop probability P_di of a newly arrived packet associated with an active flow i is given by Equation (1).
  • Equation (1) indicates that the newly arrived packet from the active flow i is discarded in proportion to the number of matches found. If the number of matches found is high, then the newly arrived packet from the active flow i has a high probability of being discarded. Thus the bandwidth of the outgoing transmission link is not fully occupied by misbehaving flows.
  • The router preferably generates a random number P.
  • The generated random number P is then compared with the packet drop probability P_di for the newly arrived packet. If the generated random number P is less than the packet drop probability P_di for the newly arrived packet from the active flow i, the newly arrived packet is discarded. Otherwise, the newly arrived packet is admitted into the buffer for transmission to either a subsequent intermediate node or the destination node(s). Alternatively, the newly arrived packet can be discarded once the probability P_di is greater than a predetermined threshold.
  • the RS scheme according to a second embodiment of the invention is shown in a flowchart 300 in FIG. 3.
  • An intermediate node implementing the RS scheme of this second embodiment carries out the steps shown in the flowchart 300 as described hereinafter.
  • the router at an intermediate node receives a newly arrived packet in a step 302
  • the router proceeds to calculate a new average occupancy level of the buffer avg in a step 304.
  • the average occupancy level of the buffer avg is calculated using an exponentially weighted moving average.
  • the exponentially weighted moving average is preferred for its resilient mechanism that prevents drastic changes in the average occupancy level of the buffer due to random bursty traffic patterns.
  • the RS scheme does not bias against active flows exhibiting bursty traffic patterns.
  • the average occupancy level of the buffer avg is calculated as follows:
  • $avg = (1 - w)\,avg_{prev} + w\,q$ (2)
  • w represents a weight constant
  • avg prev represents the immediate previous average occupancy level of the buffer
  • q represents the instantaneous occupancy level of the buffer.
  • The selection of w is dependent on the operational preferences of the service provider that deploys the intermediate nodes. If w is too large, avg follows the instantaneous occupancy level q too closely, so the averaging procedure does not filter out short bursts of traffic. On the other hand, if w is too small, avg responds too slowly to changes in the actual buffer occupancy level, and the router is unable to detect the initial stages of congestion.
  • Preferably, w is chosen between 0.001 and 0.01.
  • the average occupancy level of the buffer avg is essentially calculated using a lowpass filter, where w determines the time constant of the lowpass filter.
  • The router checks whether the new average occupancy level of the buffer avg falls between two congestion thresholds C1_th and C2_th in a step 306, with C1_th being lower than C2_th. The region between C1_th and C2_th indicates the onset of congestion at the intermediate node.
  • The values of the congestion thresholds C1_th and C2_th are dependent on the size of the buffer. Preferably, these values are 50% and 80% of the size of the buffer, respectively.
  • If avg falls outside this region, the router checks whether avg is less than C1_th in a step 308. If avg is less than C1_th, the newly arrived packet is admitted into the buffer in a step 310 for transmission to either a subsequent intermediate node or the destination node. If, in the step 308, avg is not less than C1_th, it follows that avg is equal to or greater than C2_th, indicating that the buffer is in a state of heavy congestion. Thus, the newly arrived packet is discarded directly in a step 312.
  • If avg falls within the congestion region, a sample of packets (the sample) is selected randomly from the buffer for comparison with the newly arrived packet.
  • The packets in the sample are randomly selected from the buffer one packet at a time, in a step 316, for comparison in the matching process in the steps 318 to 324.
  • The size of the sample, S_size, can range from one packet up to the instantaneous size of the buffer.
  • If the sample is too small, the comparison does not provide an adequate representation of the actual state of the buffer.
  • The sample size S_size is preferably selected from a range of 20% to 50% of the instantaneous occupancy level of the buffer.
  • A temporary counter C and a matching counter MC are provided in a step 314 for use in the matching process.
  • The temporary counter C is used for counting the number of packets compared during the matching process, while the matching counter MC is used for tracking the number of matches found during the matching process.
  • The temporary counter C and the matching counter MC are both assigned an initial value of "0".
  • Upon acquiring a randomly sampled packet from the buffer in the step 316, the temporary counter C is incremented by one and a step 318 is activated to begin the matching process.
  • In the step 318, if the temporary counter C is less than or equal to the sample size S_size, the requisite number of randomly sampled packets has not yet been compared.
  • The newly arrived packet is then compared with an un-checked packet randomly sampled from the buffer in a step 322.
  • Each active flow has a flow identifier.
  • the flow identifier comprises one or a combination of destination and source addresses, destination and source port numbers, protocol identifier, and the like identifiers. A match is found when the newly arrived packet and the randomly sampled packet from the buffer are determined to have the same flow identifier.
  • If a match is found, the matching counter MC is incremented by one in a step 324 and the router reverts to the step 316 to retrieve another un-checked randomly sampled packet from the buffer for comparison.
  • If no match is found, the router reverts directly to the step 316 to compare the next un-checked randomly sampled packet from the buffer.
  • The packet drop probability P_di of a newly arrived packet originating from an active flow i is given by Equation (3).
  • Max_p represents the maximum packet drop probability and MC_i represents the number of matches found for the newly arrived packet associated with the active flow i.
  • Max_p sets the maximum probability for discarding a packet.
  • Max_p can have a value ranging from greater than zero up to one.
  • The value of Max_p is preferably chosen from a range of 0.02 to 0.08, although it will be apparent to one skilled in the art that Max_p can also be chosen from the larger range of zero to one.
  • Equation (3) indicates that the newly arrived packet from the active flow i is discarded in proportion to the number of matches found. If the number of matches found is high, then the newly arrived packet from the active flow i has a high probability of being discarded. Thus, the bandwidth of the outgoing transmission link is not fully occupied by misbehaving flows. Equation (3) also shows a random packet drop profile component that relates to the state of the buffer given by the average occupancy level of the buffer avg. The larger the average occupancy level of the buffer avg in comparison to C2_th, the greater the probability that the newly arrived packet from the active flow i is discarded.
  • The random packet drop profile component in Equation (3) gives the RS scheme the ability to avoid synchronization of active flows and maintains the buffer occupancies at an optimum level to ensure optimum throughput and minimal delay.
  • The router preferably generates a random number P.
  • The generated random number P is then compared with the packet drop probability P_di for the newly arrived packet. If the generated random number P is less than the packet drop probability P_di for the newly arrived packet from the active flow i, the newly arrived packet is discarded. Otherwise, the newly arrived packet is admitted into the buffer for transmission to either a subsequent intermediate node or the destination node(s). Alternatively, the newly arrived packet can be discarded once the probability P_di is greater than a predetermined threshold.
  • the steps in the flowcharts 200 and 300 according to the first and second embodiments of the invention, respectively, described in the foregoing are based on the assumption that the packets in the buffer are of equal size. In most cases, this is true because most routers at intermediate nodes split the packets into equal sizes during the buffering process to allow for efficient memory management. However, in cases where the packets in the buffer are not of equal size, the RS scheme can be modified to provide a fairer comparison.
  • the RS scheme according to the first and second embodiments as described in the foregoing is modified to cater for cases where the packets in the buffer are not of equal size. Accordingly, the steps in the flowcharts 200 and 300, as shown in FIG. 2 and FIG. 3 respectively, described in the foregoing are incorporated herein with the exception of the steps 206 and 212 in the flowchart 200 in FIG. 2 and the steps 316 and 324 in the flowchart 300 in FIG. 3.
  • Typically, each packet in the buffer is 1500 bytes in size. However, it is also possible that some packets in the buffer are larger or smaller in size.
  • In this modification, the temporary counter C in the step 206 in the flowchart 200 in FIG. 2 and the step 316 in the flowchart 300 in FIG. 3, along with the matching counter MC in the step 212 in the flowchart 200 in FIG. 2 and the step 324 in the flowchart 300 in FIG. 3, are incremented by the size of the randomly sampled packet. That is, if two randomly sampled packets of sizes 1500 bytes and 500 bytes are checked, the temporary counter C is incremented by 1500 and 500, respectively. If the comparisons result in a match between the newly arrived packet from the active flow i and the randomly sampled packet of 500 bytes, the matching counter MC is incremented by 500.
  • The functions of the steps 206 and 212 in the flowchart 200 in FIG. 2 and the steps 316 and 324 in the flowchart 300 in FIG. 3 are changed as described in the foregoing to cater for the cases where the sizes of the packets in the buffer are not the same.
  • The sample size S_size no longer represents the number of packets as in the first and second embodiments of the invention.
  • Instead, the sample size S_size represents the total size of the sample in bytes.
  • an alternative to randomly sampling packets stored in the buffer is to identify packets from a fixed location in the buffer. Since the buffer operates on a first in first out (FIFO) basis, the movement of the packets in the buffer from the entry position to the exit position provides a random characteristic.
  • the RS scheme can be implemented for use in other switching networks, such as a frame relay network or an Asynchronous Transfer Mode (ATM) network, that use other transport protocols.
  • the RS scheme according to the embodiments of the invention can be implemented in one of software, firmware, special purpose digital logic, or any combination thereof.

Abstract

A random sampling buffer management (RS) scheme for controlling congestion in a packet switching network is disclosed. Upon receiving a newly arrived packet, a new average occupancy level of the buffer is calculated. The newly arrived packet is admitted into the buffer if the calculated average occupancy level of the buffer lies below a congestion region. If the calculated average occupancy level lies within the congestion region, the newly arrived packet is compared to a number of sample packets randomly selected from the packets in the buffer in a matching process. The newly arrived packet is discarded in accordance with a probability that is proportional to the number of matches found in the matching process. The matching process includes declaring a match upon finding that the newly arrived packet is associated with the same active flow as a packet from among the randomly sampled packets, based on their respective flow identifiers, and tracking the number of matches declared.

Description

Method and Apparatus for Controlling Congestion in a Communications Network
Field of the Invention
This invention relates to congestion control in communications networks. In particular, it relates to a method and apparatus for managing buffers in a packet switching network.
Background
Communications networks such as the Internet transport information in packet form from source nodes to destination nodes over transmission links, which connect the source nodes and the destination nodes. Generally, source-destination node pairs are connected by a series of intermediate nodes, called routers or switches, whose main function is to route packets received from source and/or previous intermediate nodes to the appropriate subsequent nodes and/or destination nodes. The intermediate nodes also have buffers at their input and/or output ports for temporary storage of packets during periods when the number of packets arriving at the intermediate nodes exceeds the bandwidth capacity of the output transmission link.
The Internet and many other communications networks use a variety of transport protocols that implement congestion control and error recovery mechanisms for transporting data in the form of packets. One such protocol that is widely implemented is the Transmission Control Protocol (TCP). The TCP uses an adaptive window based congestion control mechanism. In most commercial implementations of TCP, when a source node detects packet loss it decreases its transmission rate by reducing the size of its congestion window. Under most circumstances, as long as the source node does not detect any packet losses, it periodically increases the size of its congestion window by one packet over a time interval of the order of a round-trip time. A round-trip time is the time between the start of transmission of a packet and the beginning of receipt of an acknowledgement for the transmitted packet. Packet losses, as a result of congestion, lead to an abrupt reduction in transmission rates and slow the subsequent growth of the congestion window. This can restrict throughput and lower link utilization. Furthermore, the TCP has no way of warning the source nodes about the onset of congestion before the buffer limit is reached. This can result in many existing active flows achieving meager throughputs and new active flows achieving zero throughput because the first packet sent by these new active flows is discarded by intermediate nodes along the transmission links.
To help alleviate the foregoing problems, an intermediate node typically requires large buffers for accommodating heavy network traffic. However, this is not a desirable solution because it is extremely difficult to predict the maximum traffic flow and therefore select the correct buffer size. Further, excessive buffering can result in high delays that may be unfavorable to many applications. Therefore there is a need to provide buffer management schemes to control congestion on the Internet and the like communications networks.
There are a number of buffer management schemes proposed. These schemes include the Random Early Detection (RED) scheme, the Stabilized RED (SRED) scheme and the "CHOose and Keep for responsive flows, CHOose and Kill for unresponsive flows" (CHOKe) scheme.
The RED scheme is proposed by S. Floyd and V. Jacobson in "Random Early Detection Gateways for Congestion Avoidance," IEEE/ACM Transactions on Networking, vol. 1, no. 4, pp. 397-413, August 1993. This buffer management scheme relies on adaptive transport protocols like TCP. In the RED scheme, source nodes are notified about the onset of congestion at an intermediate node when the average buffer content at the intermediate node exceeds a predetermined threshold. The RED scheme notifies the source node(s) by discarding packets based on a probability that is related to the average occupancy level of the buffer. If the source nodes use the TCP, the source nodes reduce their transmission rates in response to packet drops, thus reducing congestion at intermediate nodes along the transmission links to destination nodes. Some of the key features of the RED scheme include the ability to avoid synchronization of flows by randomly dropping packets from different active flows, the maintenance of low buffer occupancy levels to ensure short delays and the prevention of bias against active flows with bursty traffic patterns.
The RED scheme achieves these features by first computing the average occupancy level of the buffer using an exponentially weighted moving average and comparing it to two thresholds: a lower bound average buffer occupancy threshold (Minth) and an upper bound average buffer occupancy threshold (Maxth). Incoming packets are discarded with a packet drop probability that is a linear function of the average occupancy level of the buffer when the average occupancy level is between the two average buffer occupancy thresholds. The packet drop probability is zero when the average buffer occupancy level is below Minth and one when the average buffer occupancy level is above Maxth, as shown in a packet drop probability curve 100 in FIG. 1. When the average occupancy level of the buffer is between the two thresholds, the packet drop probability Pd is calculated as follows:
Pd = Maxp (avg - Minth) / (Maxth - Minth)
where Maxp represents a maximum packet drop probability constant and avg represents the average occupancy level of the buffer. Maxp sets the maximum probability for discarding a packet. Depending on the operational requirements of the intermediate node, Maxp can have a value ranging from greater than zero up to one. The average occupancy level of the buffer avg is calculated as follows:
avg = (1 - w)avgprev + wq
where w represents a weight constant, avgprev represents the immediate previous average occupancy level of the buffer and q represents the current occupancy level of the buffer. This is essentially a lowpass filter, where w determines the time constant of the lowpass filter. Thus, the average occupancy level of the buffer avg is an exponentially weighted moving average, which is not substantially affected by random bursty traffic.
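The piecewise RED drop profile described above can be sketched as follows. This is a minimal illustration, not the reference RED implementation: the function name and the handling of the exact boundary values are assumptions made here.

```python
def red_drop_probability(avg: float, min_th: float, max_th: float, max_p: float) -> float:
    """Piecewise RED drop probability as described in the text.

    Boundary handling (avg exactly equal to a threshold) is an assumption:
    the linear segment is taken to include both thresholds.
    """
    if avg < min_th:
        return 0.0  # below the lower threshold: never drop
    if avg > max_th:
        return 1.0  # above the upper threshold: always drop
    # Linear ramp from 0 at min_th up to max_p at max_th.
    return max_p * (avg - min_th) / (max_th - min_th)


if __name__ == "__main__":
    for level in (5, 10, 20, 30, 40):
        print(level, red_drop_probability(level, min_th=10, max_th=30, max_p=0.1))
```

Note the jump from Maxp to one at Maxth, which corresponds to the discontinuity in the curve 100 of FIG. 1.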
The RED scheme works well for random bursty traffic and in situations where congestion is not prolonged. During persistent congestion periods the RED scheme applies a uniform packet drop probability on all arriving packets irrespective of the level of contribution each of the active flows has on the buffer and therefore the state of congestion. That is, the RED scheme is not able to distinguish which of the active flows contribute the most to the congestion and therefore unable to penalize these active flows accordingly. This leads to unfair bandwidth utilization by certain active flows on the Internet and the like communications networks that deploy the RED scheme in the intermediate nodes. Active flows that have high levels of contribution to the congestion at intermediate nodes are referred to as misbehaving flows.
The SRED scheme is proposed by T. Ott et al. in "SRED: Stabilized RED," IEEE INFOCOM, vol. 3, pp. 1346-1355, 1999 and a related US patent no. 6,434,116. The SRED scheme seeks to improve upon the RED scheme by stabilizing the occupancy of a buffer at a level independent of the number of active flows. The SRED scheme achieves this by estimating the number of active flows and finding misbehaving flows. The main idea is to compare the information of a newly arrived packet with information from a randomly chosen entry from a fixed size data structure, called the "Zombie list". The Zombie list contains a list of entries, whereby each entry contains information of a recent packet that traversed the buffer. If the two packets are found to be associated with the same active flow, a match is declared. Accordingly, the newly arrived packet is discarded with a packet drop probability that is related to whether a match is declared. By using the "Zombie list," the SRED scheme is able to identify misbehaving flows. However, the SRED scheme does not propose a simple router mechanism for penalizing the misbehaving flows. Further, the SRED scheme requires additional resources to store, maintain and operate the Zombie list. This results in the intermediate nodes requiring greater hardware and processing capability, which increases the overall cost of the intermediate nodes.
The CHOKe scheme is proposed by R. Pan et al. in "CHOKe - A Stateless Active Queue Management Scheme for Approximating Fair Bandwidth Allocation," IEEE INFOCOM vol. 2, pp. 942-951, 2000. The CHOKe scheme compares a newly arrived packet with a randomly selected set of packets from the buffer. Once a packet from the randomly selected set of packets is found to originate from the same source node as the newly arrived packet, a match is declared and both the matched packet from the randomly selected set and the newly arrived packet are discarded. If a match is not found, the newly arrived packet is admitted into the buffer using the RED packet drop probability Pd as described in the foregoing. The assumption behind the CHOKe scheme is that the buffer is likely to contain a greater number of packets belonging to misbehaving flows than to normal active flows. Thus, the packets from the misbehaving flows are more likely to be selected for comparison with the newly arrived packet. By dropping packets belonging to the misbehaving flows, the CHOKe scheme provides fairer bandwidth utilization for all active flows. However, the CHOKe scheme can result in bursty losses as each packet drop decision is effective for both the newly arrived packet and the packets already admitted into the buffer. Further, because the CHOKe scheme fundamentally relies on the RED packet drop probability Pd, it also suffers the same drawback of uniform dropping during persistent congestion periods as in the RED scheme, since the packet drop probability for each active flow is calculated as a function of the packets from all active flows.
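The CHOKe behaviour summarized above can be illustrated with a short sketch. It is a simplified reading of the description in this section, not the published algorithm in full: the Packet type, the function name, the n_candidates parameter and the use of a flow identifier in place of the source node are all hypothetical choices made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    flow_id: str  # hypothetical stand-in for the source node / flow of a packet


def choke_arrival(buffer, packet, red_drop_p, rng, n_candidates=1):
    """Return True if `packet` is admitted into `buffer`, False if it is dropped.

    A match with any randomly drawn candidate drops both the candidate and the
    arrival; otherwise the arrival is dropped with the RED probability.
    """
    k = min(n_candidates, len(buffer))
    for idx in rng.sample(range(len(buffer)), k):
        if buffer[idx].flow_id == packet.flow_id:
            del buffer[idx]  # drop the matched packet already in the buffer
            return False     # and drop the newly arrived packet as well
    if rng.random() < red_drop_p:
        return False         # no match: fall back to the RED drop decision
    buffer.append(packet)
    return True


if __name__ == "__main__":
    rng = random.Random(1)
    queue = [Packet("a"), Packet("a"), Packet("b")]
    print(choke_arrival(queue, Packet("a"), red_drop_p=0.05, rng=rng), len(queue))
```

The double drop on a match is what produces the bursty losses noted above: one decision removes two packets of the same flow.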
It is therefore desirable to provide a buffer management scheme that exhibits the key features of the RED scheme and yet is able to achieve a high degree of fairness in bandwidth utilization without the limitations of the schemes described in the foregoing.
Summary
A random sampling buffer management (RS) scheme according to the embodiments of the invention provides an efficient buffer management scheme that exhibits a high degree of fairness in allocating bandwidth to all active flows. The RS scheme penalizes active flows in proportion to the level of contribution of each of the active flows to the overall congestion at the intermediate node. The level of contribution of each of the active flows is determined by randomly sampling packets (the sample) stored in the buffer of the intermediate node and comparing the sample with a newly arrived packet at the intermediate node. A match is declared each time a packet in the sample and the newly arrived packet are determined to have the same flow identifier. The number of matches declared is an indication of the level of contribution the active flow has on the overall state of congestion at the intermediate node. Accordingly, the newly arrived packet is discarded according to a probability based on the number of matches declared.
Therefore, in accordance with a first aspect of the invention, there is disclosed a method for controlling congestion in a communications network, the network comprising a recipient node, the recipient node comprising a buffer for storing packets received at the recipient node, the method comprising the steps of: providing a sample, the sample comprising at least one packet randomly selected from packets stored in the buffer upon receiving a packet at the recipient node, the at least one packet and the received packet each having a flow identifier and a packet size, the flow identifier being indicative of the active flow to which the packet belongs; identifying a match group for grouping the at least one packet in the sample, wherein the flow identifier of each packet in the match group matches the flow identifier of the received packet; generating a discard probability based on the sample and the match group; and discarding the received packet based on the generated discard probability to control congestion in the communications network.
In accordance with a second aspect of the invention, there is disclosed an apparatus for controlling congestion in a communications network, the communications network comprising a recipient node, the recipient node comprising a buffer for storing packets received at the recipient node, the apparatus comprising: a sample, the sample comprising at least one packet randomly selected from packets stored in the buffer upon receiving a packet at the recipient node, the at least one packet and the received packet each having a flow identifier and a packet size, the flow identifier being indicative of the active flow to which the packet belongs; means for identifying a match group for grouping the at least one packet in the sample, wherein the flow identifier of each packet in the match group matches the flow identifier of the received packet; means for generating a discard probability based on the sample and the match group; and means for discarding the received packet based on the discard probability to control congestion in the communications network.
Brief Description of the Drawings
Embodiments of the invention are described hereinafter with reference to the following drawings, in which:
FIG. 1 shows a prior art packet drop probability curve of the Random Early Detection (RED) scheme;
FIG. 2 shows a flowchart of the steps performed by an intermediate node for managing its buffers in accordance with a first embodiment of the invention; and
FIG. 3 shows a flowchart of the steps performed by an intermediate node for managing its buffers in accordance with a second embodiment of the invention.
Detailed Description
A random sampling buffer management (RS) scheme according to the embodiments of the invention is provided hereinafter. The RS scheme provides an efficient buffer management scheme that exhibits a high degree of fairness in allocating bandwidth to all active flows in addition to the ability to avoid synchronization of active flows by randomly dropping packets, the maintenance of low buffer occupancy levels to ensure short delays and the prevention of bias against active flows with bursty traffic patterns.
A packet switching network such as the Internet consists of multiple intermediate nodes linked together by multiple transmission links for transporting information in packet form from one or more source nodes to one or more destination nodes. Typically, each intermediate node has a switch or a router, which receives packets from source and/or previous intermediate nodes and redirects these packets to their respective subsequent intermediate or destination nodes. During high traffic periods the intermediate nodes may not be able to cope with the large volume of arriving packets due to the limited bandwidth capacity of the outgoing transmission link. High traffic periods occur when many active flows are transmitting information at the same time or when one or more active flows are transmitting large volumes of information over a short time period (i.e. bursty traffic). In such situations, the intermediate nodes typically discard incoming packets to implicitly signal the source nodes to reduce their transmission rates, thereby reducing the number of packets being transmitted. In the case where the source nodes use the TCP, the sizes of the congestion windows of the source nodes are reduced, typically by one-half. To avoid discarding the incoming packets during high traffic periods or bursty traffic periods, intermediate nodes also have buffers for temporarily storing arriving packets. However, because these buffers cannot be too large, an efficient buffer management scheme is needed to control the congestion at the intermediate nodes and at the same time provide fair bandwidth allocation to all active flows traversing the intermediate nodes.
The RS scheme according to the embodiments of the invention operates on the premise that the cause of the congestion at an intermediate node is most likely to be the active flow(s) with the greatest contribution to the state of the buffer. Therefore, to provide a high degree of fairness in the allocation of bandwidth to all active flows traversing the intermediate node, the active flows that contribute the most to the congestion (i.e. misbehaving flows) need to be identified and penalized accordingly. Furthermore, each misbehaving flow is to be penalized in accordance with its level of contribution to the state of the buffer and thereby the state of congestion. This ensures that not all active flows are penalized uniformly, rather each active flow is penalized depending on its contribution to the state of congestion. This results in a high degree of fairness in the utilization of the outgoing transmission link bandwidth.
The RS scheme according to a first embodiment of the invention is shown in flowchart 200 in FIG. 2. An intermediate node implementing the RS scheme carries out the steps shown in the flowchart 200 as described hereinafter. When the router at an intermediate node receives a newly arrived packet in a step 202, the router randomly samples a set of packets (the sample) from the buffer for comparison with the newly arrived packet. To alleviate the need for additional buffering, the packets in the sample are randomly selected from the buffer one packet at a time, in a step 206, for comparison in the matching process in the steps 208 to 212. The size of the sample Ssize can range from one packet up to the instantaneous buffer size. If the sample size Ssize is too small, the comparisons do not provide an adequate representation of the actual state of the buffer. Therefore, misbehaving flows might not be identified and instead normal active flows can be mistaken as misbehaving flows and therefore erroneously penalized. On the other hand, if the sample size Ssize is too large, even though a large sample size Ssize yields a substantially clearer picture of the state of the buffer for identifying the misbehaving flows more accurately, the router requires a great amount of processing time to carry out the matching process. In the first embodiment, to provide a substantially fair bandwidth allocation and at the same time minimize the processing time of the router, the sample size Ssize is preferably selected from a range of 20% to 50% of the instantaneous buffer size.
Before the matching process begins, a temporary counter C and a matching counter MC are provided in a step 204 for use in the matching process. The temporary counter C is used for counting the number of packets compared during the matching process, while the matching counter MC is used for tracking the number of matches found during the matching process. In the step 204, the temporary counter C and the matching counter MC are both assigned the initial value "0". Upon acquiring a randomly sampled packet from the buffer in the step 206, the temporary counter C is incremented by one and a step 208 is activated to begin the matching process. In the step 208, if the temporary counter C is less than or equal to the sample size Ssize, the requisite number of randomly sampled packets has not been compared. In this case, the newly arrived packet is compared with a previously un-compared packet randomly sampled from the buffer in a step 210. Each active flow has a flow identifier. The flow identifier comprises one or a combination of destination and source addresses, destination and source port numbers, protocol identifier, and the like identifiers. A match is found when the newly arrived packet and the randomly sampled packet from the buffer are determined to have the same flow identifier. Each time a match is found, the matching counter MC is incremented by one to register the number of matches found for the newly arrived packet in a step 212. The router then reverts to the step 206 to retrieve another un-checked randomly sampled packet from the buffer for comparison.
However, if a match is not found in the step 210, the router reverts to the step 206 to compare the next un-checked randomly sampled packet from the buffer.
In the step 208, once the requisite number of comparisons (i.e. the sample size Ssize) has been made, the newly arrived packet is discarded with a packet drop probability that is proportional to the matching counter MC in a step 214. In the first embodiment, the packet drop probability Pdi of a newly arrived packet associated with an active flow i is given by:
Pdi = MCi / Ssize (1)
where MCi represents the number of matches found for the newly arrived packet that associates with the active flow i. Equation (1) indicates that the newly arrived packet from the active flow i is discarded in proportion to the number of matches found. If the number of matches found is high, then the newly arrived packet from the active flow i has a high probability of being discarded. Thus the bandwidth of the outgoing transmission link is not fully occupied by misbehaving flows.
Once the packet drop probability Pdi for a newly arrived packet from an active flow i is obtained, the router preferably generates a random number P. The generated random number P is then compared with the packet drop probability Pdi for the newly arrived packet. If the generated random number P is less than the packet drop probability Pdi for the newly arrived packet from the active flow i, the newly arrived packet is discarded. Otherwise, the newly arrived packet is admitted into the buffer for transmission to either a subsequent intermediate node or the destination node(s). Alternatively, the newly arrived packet can be discarded once the probability Pdi is greater than a predetermined threshold.
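The first embodiment (steps 202 to 214) can be sketched in a few lines. This is only an illustration under stated assumptions: equation (1) is taken to be the ratio of matches to sample size, the sample is drawn without replacement so that no packet is compared twice, and the Packet type and function names are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    flow_id: tuple     # e.g. (src address, dst address, src port, dst port, protocol)
    size: int = 1500   # bytes; unused here because packets are assumed equal-sized


def rs_drop_probability(buffer, arrival, sample_size, rng):
    """Assumed form of equation (1): matches found divided by packets compared."""
    s = min(sample_size, len(buffer))  # cannot sample more packets than are buffered
    if s == 0:
        return 0.0                     # empty buffer: nothing to match against
    matches = 0                        # matching counter MC
    for sampled in rng.sample(buffer, s):  # each sampled packet is compared once
        if sampled.flow_id == arrival.flow_id:
            matches += 1
    return matches / s


def rs_admit(buffer, arrival, sample_size, rng):
    """Probabilistic discard: drop when a uniform random number P is below Pdi."""
    p_drop = rs_drop_probability(buffer, arrival, sample_size, rng)
    if rng.random() < p_drop:
        return False
    buffer.append(arrival)
    return True


if __name__ == "__main__":
    rng = random.Random(7)
    heavy, light = ("10.0.0.1", 80), ("10.0.0.2", 80)
    queue = [Packet(heavy)] * 8 + [Packet(light)] * 2
    # 40% of the buffer, inside the preferred 20% to 50% range.
    print(rs_drop_probability(queue, Packet(heavy), sample_size=4, rng=rng))
```

A flow that holds most of the buffer sees a drop probability close to one, while a flow with no packets in the sample is always admitted.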
The RS scheme according to a second embodiment of the invention is shown in a flowchart 300 in FIG. 3. An intermediate node implementing the RS scheme of this second embodiment carries out the steps shown in the flowchart 300 as described hereinafter. When the router at an intermediate node receives a newly arrived packet in a step 302, the router proceeds to calculate a new average occupancy level of the buffer avg in a step 304. Like in the RED scheme, the average occupancy level of the buffer avg is calculated using an exponentially weighted moving average. Although other average occupancy level calculation methods can be used, the exponentially weighted moving average is preferred for its resilient mechanism that prevents drastic changes in the average occupancy level of the buffer due to random bursty traffic patterns. Thus, the RS scheme does not bias against active flows exhibiting bursty traffic patterns. The average occupancy level of the buffer avg is calculated as follows:
avg = (1 - w)avgprev + wq (2)
where w represents a weight constant, avgprev represents the immediate previous average occupancy level of the buffer and q represents the instantaneous occupancy level of the buffer. The selection of w is dependent on the operational preferences of the service provider that deploys the intermediate nodes. If w is too large, the averaging procedure does not filter out transient changes in the buffer occupancy level caused by bursty traffic. On the other hand, if w is too small, the average occupancy level of the buffer avg responds too slowly to changes in the actual buffer occupancy level. Thus, the router is unable to detect the initial stages of congestion. Preferably, w is chosen between 0.001 and 0.01. The average occupancy level of the buffer avg is essentially calculated using a lowpass filter, where w determines the time constant of the lowpass filter.
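The effect of the weight constant w can be seen by running the moving average of equation (2) over a short burst. The sketch below is illustrative only; the function name, the traffic pattern and the two weights are assumptions chosen to contrast a small w with a large one.

```python
def ewma_trace(samples, w, avg0=0.0):
    """Apply avg = (1 - w) * avg_prev + w * q to each occupancy sample q in turn."""
    avg = avg0
    trace = []
    for q in samples:
        avg = (1.0 - w) * avg + w * q
        trace.append(avg)
    return trace


if __name__ == "__main__":
    # Steady occupancy of 10 packets with a 5-sample burst up to 100 packets.
    occupancy = [10] * 50 + [100] * 5 + [10] * 50
    slow = ewma_trace(occupancy, w=0.002, avg0=10.0)  # inside the preferred range
    fast = ewma_trace(occupancy, w=0.5, avg0=10.0)    # far too large
    print(round(max(slow), 2), round(max(fast), 2))
```

With the small weight the average barely moves during the burst, whereas the large weight follows the instantaneous occupancy almost exactly, so a short burst would be treated as congestion.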
Once the new average occupancy level of the buffer avg is obtained, the router checks in a step 306 if the new average occupancy level of the buffer avg falls between two congestion thresholds C1th and C2th, with C1th being lower than C2th. The region between C1th and C2th indicates the onset of congestion at the intermediate node. The values of these congestion thresholds C1th and C2th are dependent on the size of the buffer. Preferably, these values are 50% and 80% of the size of the buffer, respectively. If the average occupancy level of the buffer avg is not within the region between C1th and C2th, the router checks if the new average occupancy level of the buffer avg is less than C1th in a step 308. If the new average occupancy level of the buffer avg is less than C1th, the newly arrived packet is admitted into the buffer in a step 310 for transmission to either a subsequent intermediate node or the destination node. In the step 308, if the average occupancy level of the buffer avg is greater than C1th, it implies that the average occupancy level avg is equal to or greater than C2th, indicating that the buffer is in a state of heavy congestion. Thus, the newly arrived packet is discarded directly in a step 312.
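The three-way decision of steps 306 to 312 can be written as a small classifier. This is a sketch under assumptions: the names are hypothetical, and the treatment of an average exactly equal to C1th (treated as inside the congestion region) is a choice made here, since the text does not specify it.

```python
from enum import Enum


class Action(Enum):
    ADMIT = "admit"      # step 310: below the congestion region
    SAMPLE = "sample"    # steps 314 onward: run the matching process
    DISCARD = "discard"  # step 312: heavy congestion


def classify(avg: float, c1_th: float, c2_th: float) -> Action:
    """Map the average occupancy level onto the three regions described above."""
    if avg < c1_th:
        return Action.ADMIT
    if avg >= c2_th:
        return Action.DISCARD
    return Action.SAMPLE


if __name__ == "__main__":
    buffer_size = 100
    c1, c2 = 0.5 * buffer_size, 0.8 * buffer_size  # preferred 50% and 80% values
    for level in (30, 65, 90):
        print(level, classify(level, c1, c2).value)
```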
However, if the average occupancy level of the buffer avg is found to fall within the region between C1th and C2th in the step 306, a sample of packets (the sample) is selected randomly from the buffer for comparison with the newly arrived packet. To alleviate the need for additional buffering, the packets in the sample are randomly selected from the buffer one packet at a time, in a step 316, for comparison in the matching process in the steps 318 to 324. The size of the sample Ssize can range from one packet up to the instantaneous buffer size of the buffer. Like in the first embodiment, if the sample size Ssize is too small, the comparison does not provide an adequate representation of the actual state of the buffer. Therefore, misbehaving flows might not be identified and instead normal active flows can be mistaken as misbehaving flows and therefore erroneously penalized. On the other hand, if the sample size Ssize is too large, even though a large sample size Ssize yields a substantially clearer picture of the state of the packets in the buffer for identifying the misbehaving flows more accurately, the router requires a great amount of processing time to carry out the matching process. In the second embodiment, to provide substantially fair bandwidth utilization and at the same time minimize the processing time of the router, the sample size Ssize is preferably selected from a range of 20% to 50% of the instantaneous occupancy level of the buffer.
Before the matching process begins, a temporary counter C and a matching counter MC are provided in a step 314 for use in the matching process. The temporary counter C is used for counting the number of packets compared during the matching process, while the matching counter MC is used for tracking the number of matches found during the matching process. In the step 314, the temporary counter C and the matching counter MC are both assigned the initial value "0".
Upon acquiring a randomly sampled packet from the buffer in the step 316, the temporary counter C is incremented by one and a step 318 is activated to begin the matching process. In the step 318, if the temporary counter C is less than or equal to the sample size Ssize, the requisite number of randomly sampled packets has not been compared. In this case, the newly arrived packet is compared with an un-checked packet randomly sampled from the buffer in a step 322. Each active flow has a flow identifier. The flow identifier comprises one or a combination of destination and source addresses, destination and source port numbers, protocol identifier, and the like identifiers. A match is found when the newly arrived packet and the randomly sampled packet from the buffer are determined to have the same flow identifier. Each time a match is found, the matching counter MC is incremented by one to register the number of matches found for the newly arrived packet in a step 324. The router then reverts to the step 316 to retrieve another un-checked randomly sampled packet from the buffer for comparison.
However, if a match is not found in the step 322, the router reverts to the step 316 to compare the next un-checked randomly sampled packet from the buffer.
In the step 318, once the requisite number of comparisons (i.e. the sample size Ssize) has been made, the newly arrived packet is discarded with a packet drop probability that is proportional to the matching counter MC in a step 320. In the second embodiment, the packet drop probability Pdi of a newly arrived packet originating from an active flow i is given by:
[Equation (3); image imgf000016_0001 not reproduced]
where Maxp represents the maximum packet drop probability and MCi represents the number of matches found for the newly arrived packet that associates with the active flow i. Maxp sets the maximum probability for discarding a packet. Depending on the operational requirements of the intermediate node, Maxp can have a value ranging from greater than zero up to one. The value of Maxp is preferably chosen from a range of 0.02 to 0.08, although it is obvious to one skilled in the art that Maxp can also be chosen from a larger range of zero to one.
Equation (3) indicates that the newly arrived packet from the active flow i is discarded in proportion to the number of matches found. If the number of matches found is high, then the newly arrived packet from the active flow i has a high probability of being discarded. Thus, the bandwidth of the outgoing transmission link is not fully occupied by misbehaving flows. Equation (3) also shows a random packet drop profile component that relates to the state of the buffer given by the average occupancy level of the buffer avg. The larger the average occupancy level of the buffer avg in comparison to C2th, the greater the probability that the newly arrived packet from the active flow i is discarded. The random packet drop profile component in equation (3) gives the RS scheme the ability to avoid synchronization of active flows and maintains the buffer occupancies at an optimum level to ensure optimum throughput and minimal delay.
Once the packet drop probability Pdi for a newly arrived packet from an active flow i is obtained, the router preferably generates a random number P. The generated random number P is then compared with the packet drop probability Pdi for the newly arrived packet. If the generated random number P is less than the packet drop probability Pdi for the newly arrived packet from the active flow i, the newly arrived packet is discarded. Otherwise, the newly arrived packet is admitted into the buffer for transmission to either a subsequent intermediate node or the destination node(s). Alternatively, the newly arrived packet can be discarded once the probability Pdi is greater than a predetermined threshold.
The steps in the flowcharts 200 and 300 according to the first and second embodiments of the invention, respectively, described in the foregoing are based on the assumption that the packets in the buffer are of equal size. In most cases, this is true because most routers at intermediate nodes split the packets into equal sizes during the buffering process to allow for efficient memory management. However, in cases where the packets in the buffer are not of equal size, the RS scheme can be modified to provide a fairer comparison.
In accordance with a third embodiment of the invention, the RS scheme according to the first and second embodiments as described in the foregoing is modified to cater for cases where the packets in the buffer are not of equal size. Accordingly, the steps in the flowcharts 200 and 300, as shown in FIG. 2 and FIG. 3 respectively, described in the foregoing are incorporated herein with the exception of the steps 206 and 212 in the flowchart 200 in FIG. 2 and the steps 316 and 324 in the flowchart 300 in FIG. 3.
Typically, each packet in the buffer is 1500 bytes in size. However, it is also possible that some packets in the buffer are larger or smaller in size. To account for the difference in packet sizes while ensuring that all the active flows are treated fairly, the temporary counter C in the step 206 in the flowchart 200 in FIG. 2 and the step 316 in the flowchart 300 in FIG. 3, along with the matching counter MC in the step 212 in the flowchart 200 in FIG. 2 and the step 324 in the flowchart 300 in FIG. 3, are incremented by the size of the randomly sampled packet. That is, if two randomly sampled packets of sizes 1500 bytes and 500 bytes are checked, the temporary counter C is incremented by 1500 and 500, respectively. If the comparisons result in a match between the newly arrived packet from the active flow i and the randomly sampled packet of 500 bytes, the matching counter MC is incremented by 500.
Therefore, the functions of the steps 206 and 212 in the flowchart 200 in FIG. 2 and the steps 316 and 324 in the flowchart 300 in FIG. 3 are changed as described in the foregoing to cater for the cases where the sizes of the packets in the buffer are not the same. Further, the sample size Ssize no longer represents the number of packets as in the first and second embodiments of the invention. In the third embodiment of the invention, the sample size Ssize represents the total size of the sample in bytes.
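The size-weighted counting of the third embodiment can be sketched as follows. This is a minimal Python illustration, assuming each packet carries a flow identifier and a size in bytes; the `Packet` tuple and the function name `weighted_counts` are hypothetical names introduced for the example.

```python
from collections import namedtuple

# Hypothetical packet record: flow identifier and size in bytes.
Packet = namedtuple("Packet", ["flow_id", "size"])

def weighted_counts(arrived, sampled_packets):
    """Return (C, MC), both incremented by sampled packet size.

    C  accumulates the size of every sampled packet checked.
    MC accumulates the size of each sampled packet whose flow
       identifier matches that of the newly arrived packet.
    """
    c = 0
    mc = 0
    for pkt in sampled_packets:
        c += pkt.size
        if pkt.flow_id == arrived.flow_id:
            mc += pkt.size
    return c, mc

# The example from the description: samples of 1500 and 500 bytes,
# with only the 500-byte sample matching the arriving packet's flow.
arrived = Packet("flow-i", 1500)
sample = [Packet("flow-j", 1500), Packet("flow-i", 500)]
c, mc = weighted_counts(arrived, sample)
print(c, mc)  # 2000 500
```

Dividing MC by C for this example gives 500/2000 = 0.25, whereas counting packets instead of bytes would give 1/2. The byte-weighted ratio therefore penalizes the flow according to the buffer space it actually occupies.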
In the foregoing manner, a random sampling buffer management scheme is described according to the embodiments of the invention for addressing one or more of the foregoing disadvantages of conventional buffer management schemes. It will be apparent to one skilled in the art in view of this disclosure that numerous changes, modifications and combinations can be made without departing from the scope and spirit of the invention. For example, an equally effective average occupancy level calculation method can be used to replace the exponentially weighted moving average method. Similarly, the random packet drop probability component in equation (3) can be replaced by any one of numerous known methods of calculating a random probability. In the embodiments of the invention, an alternative to retrieving the randomly sampled packets from the buffer is to retrieve only the information relating to the randomly sampled packets for the matching process. Further, an alternative to randomly sampling packets stored in the buffer is to identify packets from a fixed location in the buffer. Since the buffer operates on a first in first out (FIFO) basis, the movement of the packets in the buffer from the entry position to the exit position provides a random characteristic. Even though the embodiments of the invention are described in the context of the Internet using the TCP for transporting information from source nodes to destination nodes, the RS scheme can be implemented for use in other switching networks, such as a frame relay network or an Asynchronous Transfer Mode (ATM) network, that use other transport protocols. Further, the RS scheme according to the embodiments of the invention can be implemented in one of software, firmware, special purpose digital logic, or any combination thereof.
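For reference, the exponentially weighted moving average mentioned above, in the form avg = (1 - w)avgprev + wq recited in claim 10, can be sketched as follows. This is a minimal Python illustration; the function name `update_average` and the weight value used in the example are assumptions chosen for easy arithmetic, not values taken from the embodiments.

```python
def update_average(avg_prev, q, w):
    """Return the new average occupancy level of the buffer.

    avg_prev: immediate previous average occupancy level
    q:        instantaneous occupancy level of the buffer
    w:        weight constant, assumed here to lie in (0, 1]
    """
    if not 0.0 < w <= 1.0:
        raise ValueError("w must lie in (0, 1]")
    return (1.0 - w) * avg_prev + w * q

# Illustrative run: the buffer jumps from empty to 100 packets and
# stays there; with w = 0.5 the average closes half the gap each time.
avg = 0.0
history = []
for _ in range(3):
    avg = update_average(avg, 100.0, 0.5)
    history.append(avg)
print(history)  # [50.0, 75.0, 87.5]
```

A smaller weight constant makes the average respond more slowly to short bursts, which is the property any replacement averaging method would need to preserve.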

Claims

1. A method for controlling congestion in a communications network, the network comprising a recipient node, the recipient node comprising a buffer for storing packets received at the recipient node, the method comprising the steps of: providing a sample, the sample comprising at least one packet randomly selected from packets stored in the buffer upon receiving a packet at the recipient node, the at least one packet and the received packet each having a flow identifier and a packet size, the flow identifier being indicative of the active flow to which the packet belongs; identifying a match group for grouping the at least one packet in the sample, wherein the flow identifier of each packet in the match group matches the flow identifier of the received packet; generating a discard probability based on the sample and the match group; and discarding the received packet based on the generated discard probability to control congestion in the communications network.
2. The method as in claim 1, wherein the step of identifying the match group comprises the steps of: declaring a match when the flow identifier of the at least one packet in the sample and the received packet is the same; and incrementing a counter in response to the match being declared.
3. The method as in claim 2, wherein the step of incrementing the counter comprises the step of incrementing the counter by a factor relating to the size of the at least one packet in the sample being compared.
4. The method as in claim 1, wherein the step of generating the discard probability comprises the steps of: determining a sample quantity, the sample quantity being the size of the sample; determining a match quantity, the match quantity being the size of the match group; and generating the discard probability from the sample quantity and the match quantity.
5. The method as in claim 4, wherein the step of generating the discard probability comprises the step of calculating the discard probability in accordance with:
$$P_d = \frac{MQ_{count}}{SQ_{count}}$$
where Pd represents the discard probability, MQcount represents the match quantity and SQcount represents the sample quantity.
6. The method as in claim 1, wherein the step of generating the discard probability comprises the steps of: determining a sample quantity, the sample quantity being a summation of the packet size of each of the at least one packet in the sample; determining a match quantity, the match quantity being a summation of the packet size of each packet in the match group; and generating the discard probability from the sample quantity and the match quantity.
7. The method as in claim 6, wherein the step of generating the discard probability comprises the step of calculating the discard probability in accordance with:
$$P_d = \frac{MQ_{size}}{SQ_{size}}$$
where Pd represents the discard probability, MQsize represents the match quantity and SQsize represents the sample quantity.
8. The method as in claim 1, further comprising the steps of: calculating an average occupancy level of the buffer upon receiving the packet at the recipient node; and admitting the received packet into the buffer if the average occupancy level of the buffer is lesser than a first threshold.
9. The method as in claim 8, further comprising the step of discarding the received packet if the average occupancy level of the buffer is greater than a second threshold, wherein the second threshold has a value ranging from a value greater than the first threshold up to the size of the buffer.
10. The method as in claim 8, wherein the step of calculating the average occupancy level of the buffer comprises the step of calculating the average occupancy level of the buffer avg in accordance with:
$$avg = (1 - w)\,avg_{prev} + w\,q$$
where w represents a weight constant, avgprev represents the immediate previous average occupancy level of the buffer and q represents the instantaneous occupancy level of the buffer.
11. The method as in claim 10, wherein the step of generating the discard probability comprises the steps of: determining a sample quantity, the sample quantity being the size of the sample; determining a match quantity, the match quantity being the size of the match group; and calculating the discard probability in accordance with:
Figure imgf000023_0001
where Pd represents the discard probability, Maxp represents a maximum packet drop probability, avg represents the average occupancy level of the buffer, C1th and C2th represent the first and second thresholds, respectively, MQcount represents the match quantity and SQcount represents the sample quantity.
12. The method as in claim 10, wherein the step of generating the discard probability comprises the steps of: determining a sample quantity, the sample quantity being a summation of the packet size of each of the at least one packet in the sample; determining a match quantity, the match quantity being a summation of the packet size of each packet in the match group; and calculating the discard probability in accordance with:
Figure imgf000023_0002
where Pd represents the discard probability, Maxp represents a maximum packet drop probability, avg represents the average occupancy level of the buffer, C1th and C2th represent the first and second thresholds, respectively, MQsize represents the match quantity and SQsize represents the sample quantity.
13. An apparatus for controlling congestion in a communications network, the network comprising a recipient node, the recipient node comprising a buffer for storing packets received at the recipient node, the apparatus comprising: a sample, the sample comprising at least one packet randomly selected from packets stored in the buffer upon receiving a packet at the recipient node, the at least one packet and the received packet each having a flow identifier and a packet size, the flow identifier being indicative of the active flow to which the packet belongs; means for identifying a match group for grouping the at least one packet in the sample, wherein the flow identifier of each packet in the match group matches the flow identifier of the received packet; means for generating a discard probability based on the sample and the match group; and means for discarding the received packet based on the discard probability to control congestion in the communications network.
14. The apparatus as in claim 13, wherein the means for identifying the match group comprises: means for declaring a match when the flow identifier of the at least one packet in the sample and the received packet is the same; and a counter for keeping a count in response to the match being declared.
15. The apparatus as in claim 14, wherein the counter increases the count by a factor relating to the size of the at least one packet in the sample being compared.
16. The apparatus as in claim 13, wherein the means for generating the discard probability comprises: means for determining a sample quantity, the sample quantity being the size of the sample; means for determining a match quantity, the match quantity being the size of the match group; and means for generating the discard probability from the sample quantity and the match quantity.
17. The apparatus as in claim 16, wherein the means for generating the discard probability comprises means for calculating the discard probability in accordance with:
$$P_d = \frac{MQ_{count}}{SQ_{count}}$$
where Pd represents the discard probability, MQcount represents the match quantity and SQcount represents the sample quantity.
18. The apparatus as in claim 13, wherein the means for generating the discard probability comprises: means for determining a sample quantity, the sample quantity being a summation of the packet size of each of the at least one packet in the sample; means for determining a match quantity, the match quantity being a summation of the packet size of each packet in the match group; and means for generating the discard probability from the sample quantity and the match quantity.
19. The apparatus as in claim 18, wherein the means for generating the discard probability comprises means for calculating the discard probability in accordance with:
$$P_d = \frac{MQ_{size}}{SQ_{size}}$$
where Pd represents the discard probability, MQsize represents the match quantity and SQsize represents the sample quantity.
20. The apparatus as in claim 13, further comprising: means for calculating an average occupancy level of the buffer upon receiving the packet at the recipient node; and means for admitting the received packet into the buffer if the average occupancy level of the buffer is lesser than a first threshold.
21. The apparatus as in claim 20, further comprising means for discarding the received packet if the average occupancy level of the buffer is greater than a second threshold, wherein the second threshold has a value ranging from a value greater than the first threshold up to the size of the buffer.
22. The apparatus as in claim 20, wherein the means for calculating the average occupancy level of the buffer comprises means for calculating the average occupancy level of the buffer avg in accordance with:
$$avg = (1 - w)\,avg_{prev} + w\,q$$
where w represents a weight constant, avgprev represents the immediate previous average occupancy level of the buffer and q represents the instantaneous occupancy level of the buffer.
23. The apparatus as in claim 22, wherein the means for generating the discard probability comprises: means for determining a sample quantity, the sample quantity being the size of the sample; means for determining a match quantity, the match quantity being the size of the match group; and means for calculating the discard probability in accordance with:
Figure imgf000026_0001
where Pd represents the discard probability, Maxp represents a maximum packet drop probability, avg represents the average occupancy level of the buffer, C1th and C2th represent the first and second thresholds, respectively, MQcount represents the match quantity and SQcount represents the sample quantity.
24. The apparatus as in claim 22, wherein the means for generating the discard probability comprises: means for determining a sample quantity, the sample quantity being a summation of the packet size of each of the at least one packet in the sample; means for determining a match quantity, the match quantity being a summation of the packet size of each packet in the match group; and means for calculating the discard probability in accordance with:
Figure imgf000027_0001
where Pd represents the discard probability, Maxp represents a maximum packet drop probability, avg represents the average occupancy level of the buffer, C1th and C2th represent the first and second thresholds, respectively, MQsize represents the match quantity and SQsize represents the sample quantity.
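Read together, the threshold tests of claims 8 and 9 and the ratio of claim 5 suggest the following admission sketch. This is a minimal Python illustration under stated assumptions, not the claimed method: the function name `admit` and the `rng` parameter are hypothetical, the handling of an empty sample is an assumption, and between the two thresholds the sketch uses the plain ratio of claim 5 rather than the combined expressions of claims 11 and 12, which are not reproduced in this text.

```python
import random

def admit(avg, c1_th, c2_th, match_qty, sample_qty, rng=random.random):
    """Return True to admit the received packet, False to discard it.

    avg:        average occupancy level of the buffer
    c1_th:      first threshold
    c2_th:      second threshold (greater than c1_th)
    match_qty:  match quantity (count or bytes)
    sample_qty: sample quantity (count or bytes)
    rng:        hypothetical hook returning a uniform number in [0, 1)
    """
    if avg < c1_th:
        return True                   # below the first threshold: admit
    if avg > c2_th:
        return False                  # above the second threshold: discard
    if sample_qty == 0:
        return True                   # assumption: no sample, no evidence to drop
    p_d = match_qty / sample_qty      # ratio of claim 5 (assumed for this region)
    return rng() >= p_d               # discard when the draw falls below p_d
```

Because the match and sample quantities are passed in as plain numbers, the same function serves the count-based and the size-based variants.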
PCT/SG2003/000054 2003-03-20 2003-03-20 Method and apparatus for controlling congestion in communications network WO2004084508A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
AU2003217146A AU2003217146A1 (en) 2003-03-20 2003-03-20 Method and apparatus for controlling congestion in communications network
PCT/SG2003/000054 WO2004084508A1 (en) 2003-03-20 2003-03-20 Method and apparatus for controlling congestion in communications network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/SG2003/000054 WO2004084508A1 (en) 2003-03-20 2003-03-20 Method and apparatus for controlling congestion in communications network

Publications (1)

Publication Number Publication Date
WO2004084508A1 (en) 2004-09-30

Family

ID=33029193

Country Status (2)

Country Link
AU (1) AU2003217146A1 (en)
WO (1) WO2004084508A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106101005B (en) * 2016-08-09 2018-12-25 中南大学 Jamming control method based on block length in a kind of data center network

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6104715A (en) * 1997-04-28 2000-08-15 International Business Machines Corporation Merging of data cells in an ATM network
EP1122916A2 (en) * 2000-02-01 2001-08-08 Alcatel Canada Inc. Dynamic buffering system having integrated radom early detection
EP1128610A2 (en) * 1999-12-06 2001-08-29 Nortel Networks Limited Load adaptive buffer management in packet networks
US6333917B1 (en) * 1998-08-19 2001-12-25 Nortel Networks Limited Method and apparatus for red (random early detection) and enhancements.

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
FLOYD S., JACOBSON V.: "Random early detection gateways for congestion avoidance", IEEE/ACM TRANSACTION ON NETWORKING, vol. 1, no. 4, August 1993 (1993-08-01), pages 397 - 413, XP000415363 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7724664B2 (en) * 2001-06-15 2010-05-25 British Telecommunications Public Limited Company Packet communications network congestion alleviation method and apparatus
WO2008134049A1 (en) * 2007-04-30 2008-11-06 Lucent Technologies Inc. Lightweight bandwidth-management scheme for elastic traffic
US8289851B2 (en) 2007-04-30 2012-10-16 Alcatel Lucent Lightweight bandwidth-management scheme for elastic traffic
EP2600575A1 (en) * 2011-11-29 2013-06-05 Hughes Network Systems, LLC Method and system for controlling tcp traffic with random early detection and window size adjustments
US8705357B2 (en) 2011-11-29 2014-04-22 Hughes Network Systems, Llc Method and system for controlling TCP traffic with random early detection and window size adjustments
EP3086518A4 (en) * 2013-12-10 2016-11-16 Huawei Tech Co Ltd Method for network device congestion avoidance and network device

Also Published As

Publication number Publication date
AU2003217146A1 (en) 2004-10-11

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NI NO NZ OM PH PL PT RO RU SC SD SE SG SK SL TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
121 Ep: the epo has been informed by wipo that ep was designated in this application
122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: JP

WWW Wipo information: withdrawn in national office

Country of ref document: JP