US20150163146A1 - Packet forwarding apparatus and method using flow subgroup based path selection for dynamic load balancing - Google Patents
- Publication number
- US20150163146A1 (application US14/488,307)
- Authority
- US
- United States
- Prior art keywords
- flow
- path
- group
- egress
- packet
- Prior art date
- Legal status
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/12—Avoiding congestion; Recovering from congestion
- H04L47/125—Avoiding congestion; Recovering from congestion by balancing the load, e.g. traffic engineering
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/06—Generation of reports
- H04L43/062—Generation of reports related to network traffic
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L45/00—Routing or path finding of packets in data switching networks
- H04L45/12—Shortest path evaluation
- H04L45/125—Shortest path evaluation based on throughput or bandwidth
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L45/00—Routing or path finding of packets in data switching networks
- H04L45/38—Flow based routing
Definitions
- the disclosed embodiments of the present invention relate to forwarding packets, and more particularly, to a packet processing apparatus and method using the flow subgroup based path selection for dynamic load balancing.
- Routing is the process of selecting the best path from a source node to a destination node in a network environment.
- the current Internet infrastructure consists of interconnected networks.
- a router, also known as a gateway, is a device that connects different networks together.
- the router's main tasks may include discovering paths to various destinations and forwarding packets inside the network or between different networks.
- the header of the received packet is checked.
- the forwarding table lookup is performed to obtain the information to which outgoing (egress) port the packet should be sent.
- the router may employ one routing protocol for packet forwarding.
- For example, the router may use the Equal Cost Multi-Path (ECMP) routing technique, which allows traffic to be split over multiple equal-cost paths.
- When forwarding a packet, the router must decide which next-hop (path) to use.
- One typical method for determining which next-hop (path) to use when routing with ECMP may employ the hash-based path selection. For example, the router first determines a hash value by performing a hash function upon the packet header fields that identify a flow. Multiple next-hops have been assigned unique hash values. Hence, the router uses the hash value derived from the packet to be forwarded to decide which next-hop (path) to use.
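The hash-based next-hop selection described above can be sketched in Python. This is a behavioral illustration only; the hash function, field encoding, and modulo mapping are assumptions, not the patent's exact algorithm.

```python
import hashlib

def select_next_hop(flow_tuple, next_hops):
    # Hash the header fields that identify the flow (e.g., a 5-tuple),
    # so every packet of the same flow maps to the same next-hop.
    key = "|".join(str(field) for field in flow_tuple).encode()
    hash_value = int.from_bytes(hashlib.md5(key).digest()[:2], "big")  # 16-bit value
    # Map the hash value onto one of the equal-cost next-hops.
    return next_hops[hash_value % len(next_hops)]
```

Because the mapping is purely statistical, two next-hops can end up carrying very different numbers of flows, which is exactly the imbalance the rest of the disclosure addresses.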
- the hash-based path selection distributes flows to ECMP paths statistically. As a result, the hash-based path selection cannot guarantee a uniform bandwidth distribution over the ECMP paths. For example, one equal-cost egress path may be selected to deliver more flows, while another equal-cost egress path may be selected to deliver fewer flows. Further, the packet traffic in each flow may not be equal. Thus, there is a need for an innovative packet forwarding scheme which can apply dynamic load balancing to packet traffic over multiple egress paths, thereby achieving a more uniform bandwidth distribution.
- a packet processing apparatus and method using the flow subgroup based path selection for dynamic load balancing are proposed to solve the above-mentioned problem.
- an exemplary packet forwarding apparatus includes a path selection device configured to generate a path selection signal referenced for selecting a destination path from a plurality of egress paths belonging to an egress path group.
- the path selection device includes a flow group based path selection circuit and a flow subgroup based path selection circuit.
- the flow group based path selection circuit is configured to set the path selection signal based on a flow group into which a packet to be forwarded is categorized when a dynamic load balancing function is not applied to forwarding of the packet.
- the flow subgroup based path selection circuit is configured to set the path selection signal based on a flow subgroup into which the packet to be forwarded is categorized when the dynamic load balancing function is applied to forwarding of the packet.
- Flows associated with the egress path group are categorized into a plurality of flow subgroups, the flow subgroups are categorized into a plurality of flow groups, and each of the flows includes a group of packets with same tuple(s).
- an exemplary packet forwarding method includes: generating a path selection signal referenced for selecting a destination path from a plurality of egress paths belonging to an egress path group, wherein the generating of the path selection signal comprises: when a dynamic load balancing function is not applied to forwarding of a packet, performing a flow group based path selection to set the path selection signal based on a flow group into which the packet to be forwarded is categorized; and when the dynamic load balancing function is applied to forwarding of the packet, performing a flow subgroup based path selection to set the path selection signal based on a flow subgroup into which the packet to be forwarded is categorized.
- Flows associated with the egress path group are categorized into a plurality of flow subgroups, the flow subgroups are categorized into a plurality of flow groups, and each of the flows includes a group of packets with same tuple(s).
- FIG. 1 is a block diagram illustrating a generalized packet forwarding apparatus according to an embodiment of the present invention.
- FIG. 2 is a diagram illustrating a flow group hierarchy according to an embodiment of the present invention.
- FIG. 3 is a diagram illustrating a path rate monitor according to an embodiment of the present invention.
- FIG. 4 is a diagram illustrating a heavy flow monitor according to an embodiment of the present invention.
- FIG. 5 is a diagram illustrating a path selection device according to an embodiment of the present invention.
- FIG. 6 is a flowchart illustrating a packet forwarding method according to an embodiment of the present invention.
- One concept of the present invention is to use a flow subgroup based path selection for dynamic load balancing (DLB).
- the path selection (i.e., next-hop selection) can be adjusted for flow subgroups, thus increasing the granularity of adjusting packet traffic over multiple egress paths.
- the packet traffic adjustment is infrequent.
- the packet out-of-order can be handled by existing protocol stack, such as re-ordering or re-transmission in the TCP (Transmission Control Protocol) layer. Further details of the proposed dynamic load balancing design are described as below.
- FIG. 1 is a block diagram illustrating a generalized packet forwarding apparatus according to an embodiment of the present invention.
- the packet forwarding apparatus 100 may be part of a network device such as a router or a switch.
- the packet forwarding apparatus 100 includes a controller 102 , a path selection device 104 , a path rate monitor 106 , and a heavy flow monitor 108 .
- the packet forwarding apparatus 100 may include additional components to provide other functions.
- the controller 102 is configured to control at least the path selection function of the packet forwarding apparatus 100 .
- the controller 102 may be implemented using a processor which executes software (e.g., firmware FW of the packet forwarding apparatus 100 ) to control the path selection function (which includes at least the proposed dynamic load balancing function).
- the controller 102 controls configurations of path selection device 104 , path rate monitor 106 and heavy flow monitor 108 .
- the path selection device 104 is configured to generate a path selection signal ecmp_idx referenced for selecting a destination path from a plurality of egress paths belonging to the same egress path group.
- the proposed dynamic load balancing may be employed by a router using an Equal Cost Multi-Path (ECMP) routing technique. In this case, the egress path group mentioned hereafter is an ECMP path group.
- the proposed dynamic load balancing may be employed by a router using a link aggregation technique. Multiple physical links between two nodes (e.g., routers) may be regarded as a single logical link between the two nodes (e.g., routers).
- the link aggregation splits traffic between multiple paths (i.e., links) belonging to the same egress group.
- the egress path group mentioned hereafter may be a link aggregation group (LAG) for unicast forwarding.
- the path selection device 104 includes a flow group based path selection circuit 112 and a flow subgroup based path selection circuit 114 .
- the flow group based path selection circuit 112 is responsible for the hash-based path selection, while the flow subgroup based path selection circuit 114 is responsible for the dynamic load balancing.
- When the dynamic load balancing function is not applied to forwarding of a packet, the flow group based path selection circuit 112 sets the path selection signal ecmp_idx based on a flow group into which the packet to be forwarded is categorized; and when the dynamic load balancing function is applied to forwarding of a packet, the flow subgroup based path selection circuit 114 sets the path selection signal ecmp_idx based on a flow subgroup into which the packet to be forwarded is categorized.
- FIG. 2 is a diagram illustrating a flow group hierarchy according to an embodiment of the present invention.
- Each packet has packet header fields used to identify a flow.
- the flow is a group of packets with the same tuple(s).
- packets may be categorized into different flows using 5-tuple hash distribution, where a 5-tuple extracted from each packet includes a source IP (internet protocol) address, a source port number, a destination IP address, a destination port number, and a protocol in use.
- packets are categorized into a plurality of flows FL0-FL11 according to certain tuples tp0-tp11 extracted from packet headers of the packets received by the router.
- the flow FL0 is composed of packets with the same tuple(s) tp0, and the flow FL1 is composed of packets with the same tuple(s) tp1, where tp0 ≠ tp1.
- the tuples tp0-tp11 are further used to categorize flows FL0-FL11 into flow subgroups of different flow groups.
- a hash engine with a predetermined hash algorithm is used to generate a hash result (e.g., a 16-bit hash value) pkt_hash according to a selected set of tuples of each packet.
- a more significant bit (MSB) part of the hash result pkt_hash of a packet serves as a flow group index FGI of a flow group into which the packet is categorized, and the rest of the hash result pkt_hash (i.e., a less significant bit (LSB) part) serves as a flow subgroup index FsGI of a flow subgroup into which the packet is categorized. In other words, pkt_hash = {FGI, FsGI}.
- the flow group index FGI alone is enough to distinguish between different flow groups.
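The pkt_hash = {FGI, FsGI} split can be sketched as below. The 8/8 bit split is an illustrative assumption; the disclosure only requires an MSB part and an LSB part of the hash result.

```python
def split_hash(pkt_hash, fgi_bits=8, total_bits=16):
    # MSB part of pkt_hash -> flow group index (FGI);
    # LSB part of pkt_hash -> flow subgroup index (FsGI).
    fsgi_bits = total_bits - fgi_bits
    fgi = pkt_hash >> fsgi_bits
    fsgi = pkt_hash & ((1 << fsgi_bits) - 1)
    return fgi, fsgi
```

Two flows whose hash results share the same FGI land in the same flow group, and they land in the same flow subgroup only if the FsGI parts also match.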
- the flows FL7-FL8 are categorized into the same flow subgroup.
- the flow subgroup index FsGI may be obtained from the rest of the hash result pkt_hash that is not used to serve as the flow group index FGI.
- this is not meant to be a limitation of the present invention.
- the flow subgroup index FsGI may be generated based on another predetermined hash algorithm.
- Using a flow group index FGI to act as an identifier of a flow group, and using a combination of a flow group index FGI and a flow subgroup index FsGI to act as an identifier of a flow subgroup in a flow group, can reduce the memory requirement, thus reducing the production cost of the proposed packet forwarding apparatus 100 .
- this is for illustrative purposes only. Any means capable of categorizing flows associated with the same egress path group into a plurality of flow subgroups and categorizing the flow subgroups into a plurality of flow groups may be employed. For example, full tuples of each packet may be directly used to indicate a flow subgroup of a flow group into which the packet should be categorized.
- the path selection signal ecmp_idx is set by either the flow group based path selection circuit 112 or the flow subgroup based path selection circuit 114 .
- the path selection signal ecmp_idx controls the selection of a destination path used for forwarding the packet.
- the path selection signal ecmp_idx further informs the path rate monitor 106 of the selected egress path.
- the path rate monitor 106 updates a path rate monitor value of the selected egress path based on traffic of the packet.
- the path rate monitor 106 is configured to monitor data rates of egress paths of the same egress path group to generate path rate monitor values, respectively. In other words, when there are different egress path groups, the path rate monitor 106 generates path rate monitor values for egress paths of these egress path groups.
- FIG. 3 is a diagram illustrating a path rate monitor according to an embodiment of the present invention.
- the path rate monitor 106 shown in FIG. 1 may be implemented using the path rate monitor 300 shown in FIG. 3 .
- the path rate monitor 300 includes a lookup table 302 , a comparing circuit 304 corresponding to an egress path group, and a plurality of monitoring circuits 306_1, 306_2, ..., 306_N corresponding to different egress paths of the egress path group, respectively.
- the path rate monitor 300 has one monitoring circuit per path, and has one comparing circuit per egress path group.
- When there are N egress paths in one egress path group (e.g., an ECMP path group or a LAG), the path rate monitor 300 is configured to have N monitoring circuits and a single comparing circuit for the same egress path group.
- FIG. 3 only shows monitoring circuits and one comparing circuit for one egress path group.
- the lookup table 302 can be shared among different egress path groups. Hence, only one lookup table 302 is created in the path rate monitor 106 .
- the lookup table 302 has a plurality of table entries, each storing an adjacency index adj_idx and an associated rate counter pointer rate_cnt_ptr.
- the lookup table 302 may be stored in a static random access memory (SRAM). Hence, any of the table entries can be accessed based on a corresponding memory address pointed to by the path selection signal ecmp_idx.
- an entry index of each table entry in the lookup table 302 is a memory address.
- the adjacency index adj_idx is read to select a destination path from egress paths of the same egress path group, and the associated rate counter pointer rate_cnt_ptr is read to select one monitoring circuit assigned to the selected destination path.
- the rate counter pointers rate_cnt_ptr may be configured by the controller (e.g., a processor running firmware FW) 102 , where more than one rate counter pointer rate_cnt_ptr may be configured to point to the same monitoring circuit.
- the path selection index ecmp_idx is generated to access one of the table entries associated with the specific egress path group. For example, when the first entry shown in FIG. 3 is accessed by the path selection index ecmp_idx, a monitoring circuit 306_1 associated with one egress path of the specific egress path group is selected; when the second entry shown in FIG. 3 is accessed, the same monitoring circuit 306_1 associated with one egress path of the specific egress path group is selected; and when the third entry is accessed by the path selection index ecmp_idx, a different monitoring circuit 306_2 associated with another egress path of the specific egress path group is selected.
- the monitoring circuit assigned to the selected destination path is operative to update its path rate monitor value.
- the monitoring circuits 306_1-306_N generate average path rate values PR_AVG_1-PR_AVG_N as path rate monitor values.
- each of the monitoring circuits 306_1-306_N has a set of rate counters, including a first counter and a second counter.
- the monitoring circuit 306_1 includes a first counter 308_1 and a second counter 309_1; the monitoring circuit 306_2 includes a first counter 308_2 and a second counter 309_2; and the monitoring circuit 306_N includes a first counter 308_N and a second counter 309_N.
- the first counters 308_1-308_N are configured to generate instantaneous path rate values PR_CUR_1-PR_CUR_N, respectively.
- each of the first counters 308_1-308_N is configured to count the number of bytes transmitted through a corresponding egress path during one predetermined period T_upd to generate one instantaneous path rate value when the corresponding egress path is the selected destination path.
- the second counters 309_1-309_N are configured to generate the average path rate values PR_AVG_1-PR_AVG_N, respectively.
- each of the second counters 309_1-309_N is configured to generate a weighted average of the average path rate value and the instantaneous path rate value to update the average path rate value, which acts as the path rate monitor value of the monitoring circuit.
- the operation of each second counter may be expressed using the following equation:
- PR_AVG = PR_AVG × C + PR_CUR × (1 − C)   (1)
- where PR_AVG represents an average path rate value, PR_CUR represents an instantaneous path rate value, and C represents a weighting factor.
- the weighting factor C and the predetermined period T_upd may be configured by the controller (e.g., a processor running firmware FW) 102 .
- the path rate monitor values (e.g., average path rate values PR_AVG_1-PR_AVG_N) generated from the monitoring circuits 306_1-306_N indicate traffic statuses of paths belonging to the same egress path group.
- the comparing circuit 304 is configured to compare each of the path rate monitor values (e.g., average path rate values PR_AVG_1-PR_AVG_N) generated from the monitoring circuits 306_1-306_N with a predetermined threshold value TH_R, and to generate an indication signal S_IND when any of the path rate monitor values exceeds the predetermined threshold value TH_R.
- the predetermined threshold value TH_R may be configured by the controller (e.g., a processor running firmware FW) 102 .
- the weighted averaging performed by each of the second counters 309_1-309_N may be regarded as low-pass filtering.
- setting the path rate monitor value to the average path rate value prevents the path rate monitor value from having a significant variation caused by a sudden packet traffic change.
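The per-path monitoring circuits and the comparing circuit above can be modeled behaviorally as follows. Class and function names are hypothetical, and the counters are plain Python numbers rather than hardware registers.

```python
class MonitoringCircuit:
    """Behavioral model of one monitoring circuit (first + second counter)."""

    def __init__(self, c=0.5):
        self.c = c         # weighting factor C, configured by the controller
        self.pr_cur = 0    # first counter: bytes seen in the current period T_upd
        self.pr_avg = 0.0  # second counter: average path rate PR_AVG

    def count_bytes(self, nbytes):
        # First counter accumulates bytes sent on this egress path.
        self.pr_cur += nbytes

    def end_period(self):
        # Second counter applies equation (1): PR_AVG = PR_AVG*C + PR_CUR*(1-C).
        self.pr_avg = self.pr_avg * self.c + self.pr_cur * (1 - self.c)
        self.pr_cur = 0
        return self.pr_avg

def any_path_overloaded(monitors, th_r):
    # Comparing circuit: raise the indication when any PR_AVG exceeds TH_R.
    return any(m.pr_avg > th_r for m in monitors)
```

With C = 0.5, a sudden burst only moves PR_AVG halfway toward the instantaneous rate each period, which illustrates the low-pass-filtering effect described above.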
- this is for illustrative purposes only, and is not meant to be a limitation of the present invention.
- Any means capable of generating path rate monitor values indicative of traffic statuses of paths belonging to the same egress path group may be employed by the path rate monitor 106 . These alternative designs fall within the scope of the present invention.
- the dynamic load balancing function offered by the flow subgroup based path selection circuit 114 should be properly adjusted to reduce packet traffic on the heavy-loaded path and/or increase packet traffic on light-loaded paths. In this way, the load imbalance can be removed or mitigated to achieve a more uniform bandwidth distribution. Further details of the dynamic load balancing function performed by the flow subgroup based path selection circuit 114 will be described later.
- the heavy flow monitor 108 provides additional information needed for adjusting the dynamic load balancing function offered by the flow subgroup based path selection circuit 114 .
- the heavy flow monitor 108 is configured to capture at least one heavy flow subgroup from flow subgroups of flow groups associated with the same egress path group, wherein traffic of the at least one heavy flow subgroup is higher than traffic of other flow subgroups of the flow groups. Specifically, concerning each egress path group, the heavy flow monitor 108 monitors flows forwarded through egress paths of the egress path group to capture at least one heavy flow subgroup. Hence, the heavy flow monitor 108 captures heavy flow subgroup(s) for one egress path group, and captures heavy flow subgroup(s) for another egress path group.
- FIG. 4 is a diagram illustrating a heavy flow monitor according to an embodiment of the present invention.
- the heavy flow monitor 108 shown in FIG. 1 may be implemented using the heavy flow monitor 400 shown in FIG. 4 .
- the heavy flow monitor 400 employs a hardware-based implementation, and includes a heavy flow monitoring controller 402 and a storage device 404 for one egress path group.
- FIG. 4 shows one heavy flow monitoring controller and one storage device only.
- the heavy flow monitor 400 has multiple sets of a heavy flow monitoring controller and a storage device, where each set of a heavy flow monitoring controller and a storage device is used to capture at least one heavy flow subgroup according to flows forwarded through egress paths of a corresponding egress path group.
- a key function of network performance monitoring is determining how bandwidth is used by flows; in particular, determining which flows use the most bandwidth.
- flows may be categorized into elephant flows each consuming a large amount of bandwidth due to high packet traffic (i.e., a large amount of data) and mice flows each consuming a small amount of bandwidth due to low packet traffic (i.e., a small amount of data).
- elephant flows have a large impact on the load balance of paths. Based on this observation, heavy flow subgroups are identified for setting the proposed dynamic load balancing function performed by the flow subgroup based path selection circuit 114 .
- the heavy flow monitor 400 may be used to track the top-M heavy flow subgroups among flow subgroups of flow groups associated with the same egress path group, where M is an integer determined based on the actual design considerations.
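A software model of tracking the top-M heavy flow subgroups might look like the sketch below. A hardware monitor would use a bounded structure (such as a CAM-backed rate table); this sketch simply keeps an unbounded per-subgroup byte counter, so all names and structures here are assumptions.

```python
from collections import Counter

class HeavyFlowMonitor:
    def __init__(self, m):
        self.m = m              # report the top-M heaviest subgroups
        self.rates = Counter()  # bytes observed per flow subgroup

    def observe(self, subgroup_id, nbytes):
        # Account forwarded bytes to the packet's flow subgroup.
        self.rates[subgroup_id] += nbytes

    def top_m(self):
        # Identifiers of the M subgroups with the highest traffic.
        return [sg for sg, _ in self.rates.most_common(self.m)]
```

The subgroup identifier would typically be the hash result pkt_hash (FGI plus FsGI), so elephant flows surface as heavy subgroups while mice flows stay below the top-M cut.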
- each egress path group is assigned with one heavy flow monitor.
- this is for illustrative purposes only, and is not meant to be a limitation of the present invention.
- a conventional flow-based heavy flow monitoring algorithm may be modified to serve as the heavy flow monitoring algorithm used by the heavy flow monitoring controller 402 to identify heavy flow subgroups.
- an identifier of one captured heavy flow subgroup (e.g., a hash result pkt_hash of one captured heavy flow subgroup) may be stored in a content-addressable memory (CAM) or an SRAM.
- the heavy flow monitor 400 is hardware based, and a rate table may be established for recording data rate information of each flow subgroup.
- Alternatively, the hardware-based heavy flow monitor 400 may employ flow sorting hardware that collaborates with the heavy flow monitoring algorithm. The same objective of identifying heavy flow subgroups can be achieved.
- the heavy flow monitor 108 may be realized using a software-based implementation.
- the heavy flow monitor 108 is a software module (e.g., an sFlow module defined by OpenFlow) executed by a processor. The same objective of capturing at least one heavy flow subgroup from flow subgroups of flow groups associated with an egress path group is achieved.
- the flow subgroup based path selection circuit 114 is designed for dynamic load balancing.
- the flow subgroup based path selection circuit 114 may use a flow rebalance table created and updated for re-directing heavy flow subgroups to light-loaded paths, thus leading to a more uniform bandwidth distribution over different egress paths in the same egress path group (e.g., an ECMP path group or an LAG).
- FIG. 5 is a diagram illustrating a path selection device according to an embodiment of the present invention.
- the path selection device 104 shown in FIG. 1 may be implemented using the path selection device 500 shown in FIG. 5 .
- the path selection device 500 includes a hash-based path selection circuit 502 , a dynamic load balancing circuit 504 , and a multiplexer (MUX) 506 .
- the hash-based path selection circuit 502 is used to realize the flow group based path selection circuit 112 shown in FIG. 1
- the dynamic load balancing circuit 504 is used to realize the flow subgroup based path selection circuit 114 shown in FIG. 1 .
- the hash result pkt_hash may be divided into a flow group index FGI and a flow subgroup index FsGI. It should be noted that the hash result pkt_hash, including the flow group index FGI and the flow subgroup index FsGI, is used to identify a flow subgroup into which the packet to be forwarded is categorized, while the flow group index FGI alone is used to identify a flow group into which the packet to be forwarded is categorized.
- an entry index of each table entry in the lookup table 302 shown in FIG. 3 is a memory address.
- the path selection signal ecmp_idx is set by a memory address where one table entry of the lookup table 302 (which includes the next-hop information) is stored.
- the hash-based path selection circuit 502 may employ any well-known hash-based path selection algorithm to generate a first path selection signal ADDR 0 based on the flow group index FGI. For example, a hash-threshold algorithm may be employed by the hash-based path selection circuit 502 .
- As for the dynamic load balancing circuit 504 , it includes a storage device 512 and an adder 514 .
- the storage device 512 is used to store a flow rebalance table 516 .
- the flow rebalance table 516 may be shared among different egress path groups.
- an entry index of each table entry of the flow rebalance table 516 is a combination of an egress path group index ecmp_grp_idx and a hash result pkt_hash.
- an address offset idx_ofs is output from the flow rebalance table 516 to the adder 514 .
- the entry indices may be stored in a CAM or an SRAM, and the address offsets may be stored in an SRAM; thus the flow rebalance table 516 can be implemented as an SRAM-based search table.
- the lookup table 302 shown in FIG. 3 may also be shared among different egress path groups.
- the table entries associated with the same egress path group may be stored in an allocated memory space with continuous memory addresses.
- one of the table entries associated with the same egress path group may be indexed by a memory address acting as a base address, and the rest of the table entries may be indexed by memory addresses, each being the base address plus one address offset.
- the flow rebalance table 516 may store address offsets only, as shown in FIG. 5 . However, this is not meant to be a limitation of the present invention.
- Alternatively, the adder 514 may be omitted, and the flow rebalance table 516 may be modified to store entry indices of table entries in the lookup table 302 . When a table entry of the flow rebalance table 516 is hit, a stored memory address (i.e., an entry index of a table entry in the lookup table 302 ) is output from the flow rebalance table 516 to serve as the second path selection signal ADDR1.
- the multiplexer 506 is configured to select one of the first path selection signal ADDR0 and the second path selection signal ADDR1 as its output. Specifically, when one table entry of the flow rebalance table 516 is hit, implying that the dynamic load balancing function should be applied to forwarding of the packet, the multiplexer 506 outputs the second path selection signal ADDR1 as the path selection signal ecmp_idx; and when no table entry of the flow rebalance table 516 is hit, implying that there is no need to apply the dynamic load balancing function to forwarding of the packet, the multiplexer 506 outputs the first path selection signal ADDR0 as the path selection signal ecmp_idx.
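The multiplexer behavior — a rebalance-table hit selects ADDR1, a miss falls back to the hash-based ADDR0 — can be sketched as below. The dict-based table and the base-plus-offset addressing are simplifying assumptions about the lookup-table layout.

```python
def resolve_ecmp_idx(base_addr, hash_offset, rebalance_table, ecmp_grp_idx, pkt_hash):
    # ADDR0: result of the flow group based (hash-based) path selection.
    addr0 = base_addr + hash_offset
    # Look up the flow rebalance table with {egress path group, flow subgroup}.
    idx_ofs = rebalance_table.get((ecmp_grp_idx, pkt_hash))
    if idx_ofs is not None:
        # Table hit: dynamic load balancing applies, output ADDR1.
        return base_addr + idx_ofs
    return addr0  # no hit: keep the hash-based selection
```

Flows outside the rebalance table keep their statically hashed path, so only the redirected heavy subgroups risk transient packet reordering.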
- When one or more of the paths in the egress path group are selected for packet forwarding, the path rate monitor 300 will update one or more of the path rate monitor values (i.e., average path rate values PR_AVG_1-PR_AVG_N), and the heavy flow monitor 400 will capture one or more new heavy flow subgroups and/or update one or more existing captured heavy flow subgroups.
- When the comparing circuit 304 detects that one of the path rate monitor values (i.e., average path rate values PR_AVG_1-PR_AVG_N) exceeds the predetermined threshold TH_R, the comparing circuit 304 generates the indication signal S_IND to notify the controller 102 .
- the indication signal S_IND may be an interrupt of the processor.
- the controller 102 reads status registers to find out which {egress path group, path} pair triggered the interrupt.
- the controller 102 reads the path rate monitor values (i.e., average path rate values PR_AVG_1-PR_AVG_N) of the egress path group from the path rate monitor 300 , and reads an identifier of any captured heavy flow subgroup from the heavy flow monitor 400 .
- the controller 102 refers to the path rate monitor values to find out at least one light-loaded path, and regards the at least one light-loaded path as at least one destination path to which one or more flow subgroups are re-directed.
- the controller 102 refers to the captured heavy flow subgroup(s) to find out which flow subgroup(s) should be re-directed.
- the controller 102 makes a decision on how to program/update the flow rebalance table 516 . After the flow rebalance table 516 is updated, packets categorized into a heavy flow subgroup can be re-directed to a light-loaded path.
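A sketch of that controller decision, with hypothetical data structures: the path rate monitor values pick the light-loaded path, and each captured heavy subgroup is pointed at it via a new rebalance-table entry.

```python
def rebalance(path_rates, heavy_subgroups, rebalance_table, ecmp_grp_idx, path_offsets):
    # Pick the light-loaded path: the one with the smallest PR_AVG.
    light_path = min(path_rates, key=path_rates.get)
    # Redirect every captured heavy flow subgroup to that path by
    # programming an address offset into the flow rebalance table.
    for subgroup in heavy_subgroups:
        rebalance_table[(ecmp_grp_idx, subgroup)] = path_offsets[light_path]
    return light_path
```

A real table management policy would be more selective (e.g., moving only enough subgroups to bring the heavy path below TH_R), but the principle — heavy subgroups re-mapped to light paths — is the same.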
- the flow rebalance table 516 may be updated by adding new table entries and/or replacing old table entries when the controller 102 is notified by the indication signal S_IND. Further, since the storage device 512 has a limited storage space, each table entry of the flow rebalance table 516 can be aged out to release the occupied storage space. Besides aging, the controller 102 may employ a table management policy to update the flow rebalance table 516 .
- The controller 102 is further configured to update the flow rebalance table 516 having an entry corresponding to a specific heavy flow subgroup of a specific flow group associated with the egress path group when the specific heavy flow subgroup of the specific flow group is evicted from the heavy flow subgroup(s).
- That is, the specific flow subgroup is no longer one of the top-M heavy flow subgroups.
- The flow rebalance table 516 needs to be updated by replacing or removing a table entry associated with the specific flow subgroup.
- The controller 102 is further configured to update the flow rebalance table 516 having an entry corresponding to a specific egress path of the egress path group when the specific egress path is removed from the egress path group due to a path/link down event.
- The flow rebalance table 516 needs to be managed during path removal of an egress path group.
- Each egress path group is further assigned an enable bit DLB_en in the egress path group table.
- The egress path group table is accessed to read an enable bit DLB_en and an egress path group index of an egress path group, where the enable bit DLB_en is used for determining whether to enable the dynamic load balancing function for the egress path group, and the egress path group index is used to perform table lookup for path selection/next-hop selection when the dynamic load balancing function for the egress path group is enabled.
- The flow group based path selection circuit 112 (e.g., hash-based path selection circuit 502) is used to select a destination path from the selected egress path group.
- The flow subgroup based path selection circuit 114 (e.g., dynamic load balancing circuit 504) is not used to select a destination path from the selected egress path group.
- The heavy flow monitor 108 (e.g., heavy flow monitor 400) does not need to track any heavy flow subgroups forwarded through egress paths of the selected egress path group.
- The path rate monitor 106 (e.g., path rate monitor 300) does not need to update a path rate monitor value corresponding to the destination path selected by the flow group based path selection circuit 112 (e.g., hash-based path selection circuit 502) for forwarding the packet, and the heavy flow monitor 108 (e.g., heavy flow monitor 400) does not need to update a current tracking result of heavy flow subgroups corresponding to the selected egress path group.
- FIG. 6 is a flowchart illustrating a packet forwarding method according to an embodiment of the present invention.
- the packet forwarding method may be employed by the packet forwarding apparatus 100 with the path selection device 104 realized using the path selection device 500 , the path rate monitor 106 realized using the path rate monitor 300 and the heavy flow monitor 108 realized using the heavy flow monitor 400 . Provided that the result is substantially the same, the steps are not required to be executed in the exact order shown in FIG. 6 .
- the packet forwarding method may be briefly summarized as below.
- Step 600: Start.
- Step 602: Receive a packet to be forwarded.
- Step 604: Check if an enable bit DLB_en corresponding to an egress path group selected for the packet to be forwarded indicates that the dynamic load balancing function should be enabled. If yes, go to step 606; otherwise, go to step 612.
- Step 606: Check if a table entry of a flow rebalance table is hit. If yes, go to step 608; otherwise, go to step 612.
- Step 608: Refer to path selection information stored in the hit table entry to set a path selection signal referenced for selecting a destination path from a plurality of egress paths belonging to the selected egress path group.
- Step 610: Update a path rate monitor value corresponding to the selected destination path, and/or update a current tracking result of heavy flow subgroups corresponding to the selected egress path group. Go to step 602 to receive a next packet to be forwarded.
- Step 612: Set the path selection signal according to flow group based path selection (e.g., hash-based path selection).
- Step 614: Check if the enable bit DLB_en corresponding to the selected egress path group indicates that the dynamic load balancing function should be enabled. If yes, go to step 610; otherwise, go to step 602 to receive a next packet to be forwarded.
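Steps 602-614 above can be sketched as a short Python routine. All class and field names (dlb_en, hash_select, monitor_updates) are illustrative assumptions, not identifiers from the disclosure:

```python
from dataclasses import dataclass

# Illustrative sketch of the decision flow in steps 602-614.
@dataclass
class EgressPathGroup:
    paths: list            # egress paths in the group
    dlb_en: bool           # enable bit DLB_en (steps 604/614)
    monitor_updates: int = 0

    def hash_select(self, pkt_hash):
        # step 612: flow group based (hash-based) path selection
        return self.paths[pkt_hash % len(self.paths)]

def forward(pkt_hash, group, rebalance_table):
    """Return the destination path for one packet (steps 604-614)."""
    if group.dlb_en and pkt_hash in rebalance_table:   # steps 604, 606
        path = rebalance_table[pkt_hash]               # step 608: use the hit entry
    else:
        path = group.hash_select(pkt_hash)             # step 612
    if group.dlb_en:                                   # steps 610, 614
        group.monitor_updates += 1                     # update rate/heavy-flow monitors
    return path
```

When DLB is disabled for the group, neither the rebalance table nor the monitors are touched, matching the step 614 "no" branch.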
Abstract
Description
- This application claims the benefit of U.S. provisional application No. 61/912,020, filed on Dec. 5, 2013 and incorporated herein by reference.
- The disclosed embodiments of the present invention relate to forwarding packets, and more particularly, to a packet processing apparatus and method using the flow subgroup based path selection for dynamic load balancing.
- Routing is the process of selecting the best path from a source node to a destination node in a network environment. For example, the current Internet infrastructure consists of interconnected networks. A router, also known as a gateway, is a device that connects different networks together. The router's main tasks may include discovering paths to various destinations and forwarding packets inside the network or between different networks. When a router receives a packet at one of its incoming (ingress) ports, the header of the received packet is checked. When the destination address of the packet is known, the forwarding table lookup is performed to obtain the information to which outgoing (egress) port the packet should be sent.
- The router may employ one routing protocol for packet forwarding. For example, ECMP (Equal Cost Multi-Path) is a technique for routing packets along multiple paths of equal cost. When forwarding a packet, the router must decide which next-hop (path) to use. One typical method for determining which next-hop (path) to use when routing with ECMP may employ the hash-based path selection. For example, the router first determines a hash value by performing a hash function upon the packet header fields that identify a flow. Multiple next-hops have been assigned unique hash values. Hence, the router uses the hash value derived from the packet to be forwarded to decide which next-hop (path) to use.
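A minimal sketch of this hash-based next-hop selection follows, assuming a CRC32 hash over the 5-tuple; real routers use vendor-specific hash functions, so this is only illustrative:

```python
import zlib

# Sketch of hash-based ECMP next-hop selection over a packet's 5-tuple.
# The CRC32 hash and the tuple encoding are assumptions for illustration.
def select_next_hop(five_tuple, next_hops):
    key = "|".join(str(f) for f in five_tuple).encode()
    h = zlib.crc32(key)                  # packets of one flow hash identically
    return next_hops[h % len(next_hops)]
```

Because every packet of a flow carries the same 5-tuple, the flow always maps to the same next-hop, which preserves packet order within the flow but only distributes bandwidth statistically across the paths.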
- The hash-based path selection distributes flows to ECMP paths statistically. As a result, the hash-based path selection cannot guarantee a uniform bandwidth distribution over the ECMP paths. For example, one equal-cost egress path may be selected to deliver more flows, while another equal-cost egress path may be selected to deliver fewer flows. Further, the packet traffic in each flow may not be equal. Thus, there is a need for an innovative packet forwarding scheme which can apply dynamic load balancing to packet traffic over multiple egress paths to thereby achieve a more uniform bandwidth distribution.
- In accordance with exemplary embodiments of the present invention, a packet processing apparatus and method using the flow subgroup based path selection for dynamic load balancing are proposed to solve the above-mentioned problem.
- According to a first aspect of the present invention, an exemplary packet forwarding apparatus is disclosed. The exemplary packet forwarding apparatus includes a path selection device configured to generate a path selection signal referenced for selecting a destination path from a plurality of egress paths belonging to an egress path group. The path selection device includes a flow group based path selection circuit and a flow subgroup based path selection circuit. The flow group based path selection circuit is configured to set the path selection signal based on a flow group into which a packet to be forwarded is categorized when a dynamic load balancing function is not applied to forwarding of the packet. The flow subgroup based path selection circuit is configured to set the path selection signal based on a flow subgroup into which the packet to be forwarded is categorized when the dynamic load balancing function is applied to forwarding of the packet. Flows associated with the egress path group are categorized into a plurality of flow subgroups, the flow subgroups are categorized into a plurality of flow groups, and each of the flows includes a group of packets with same tuple(s).
- According to a second aspect of the present invention, an exemplary packet forwarding method is disclosed. The exemplary packet forwarding method includes: generating a path selection signal referenced for selecting a destination path from a plurality of egress paths belonging to an egress path group, wherein the generating a path selection signal comprises: when a dynamic load balancing function is not applied to forwarding of a packet, performing a flow group based path selection to set the path selection signal based on a flow group into which the packet to be forwarded is categorized; and when the dynamic load balancing function is applied to forwarding of the packet, performing a flow subgroup based path selection to set the path selection signal based on a flow subgroup into which the packet to be forwarded is categorized. Flows associated with the egress path group are categorized into a plurality of flow subgroups, the flow subgroups are categorized into a plurality of flow groups, and each of the flows includes a group of packets with same tuple(s).
- These and other objectives of the present invention will no doubt become obvious to those of ordinary skill in the art after reading the following detailed description of the preferred embodiment that is illustrated in the various figures and drawings.
- FIG. 1 is a block diagram illustrating a generalized packet forwarding apparatus according to an embodiment of the present invention.
- FIG. 2 is a diagram illustrating a flow group hierarchy according to an embodiment of the present invention.
- FIG. 3 is a diagram illustrating a path rate monitor according to an embodiment of the present invention.
- FIG. 4 is a diagram illustrating a heavy flow monitor according to an embodiment of the present invention.
- FIG. 5 is a diagram illustrating a path selection device according to an embodiment of the present invention.
- FIG. 6 is a flowchart illustrating a packet forwarding method according to an embodiment of the present invention.
- Certain terms are used throughout the description and following claims to refer to particular components. As one skilled in the art will appreciate, manufacturers may refer to a component by different names. This document does not intend to distinguish between components that differ in name but not function. In the following description and in the claims, the terms “include” and “comprise” are used in an open-ended fashion, and thus should be interpreted to mean “include, but not limited to . . . ”. Also, the term “couple” is intended to mean either an indirect or direct electrical connection. Accordingly, if one device is coupled to another device, that connection may be through a direct electrical connection, or through an indirect electrical connection via other devices and connections.
- One concept of the present invention is to use a flow subgroup based path selection for dynamic load balancing (DLB). The path selection (i.e., next-hop selection) can be adjusted for flow subgroups, thus increasing the granularity of adjusting packet traffic over multiple egress paths. Further, since the adjustment made to the path selection to re-balance the packet traffic affects the forwarding of packets categorized into a flow subgroup rather than the forwarding of packets categorized into a flow group, the packet traffic adjustment is infrequent. Hence, out-of-order packets can be handled by the existing protocol stack, such as by re-ordering or re-transmission in the TCP (Transmission Control Protocol) layer. Further details of the proposed dynamic load balancing design are described below.
- FIG. 1 is a block diagram illustrating a generalized packet forwarding apparatus according to an embodiment of the present invention. By way of example, the packet forwarding apparatus 100 may be part of a network device such as a router or a switch. As shown in FIG. 1, the packet forwarding apparatus 100 includes a controller 102, a path selection device 104, a path rate monitor 106, and a heavy flow monitor 108. It should be noted that only the components pertinent to the path selection function are shown in FIG. 1. In practice, the packet forwarding apparatus 100 may include additional components to provide other functions. - The
controller 102 is configured to control at least the path selection function of the packet forwarding apparatus 100. In one exemplary design, the controller 102 may be implemented using a processor which executes software (e.g., firmware FW of the packet forwarding apparatus 100) to control the path selection function (which includes at least the proposed dynamic load balancing function). For example, the controller 102 controls configurations of the path selection device 104, the path rate monitor 106 and the heavy flow monitor 108. - The
path selection device 104 is configured to generate a path selection signal ecmp_idx referenced for selecting a destination path from a plurality of egress paths belonging to the same egress path group. The proposed dynamic load balancing may be employed by a router using an Equal Cost Multi-Path (ECMP) routing technique. Hence, the egress path group mentioned hereafter is an ECMP path group. Alternatively, the proposed dynamic load balancing may be employed by a router using a link aggregation technique. Multiple physical links between two nodes (e.g., routers) may be regarded as a single logical link between the two nodes (e.g., routers). Like ECMP, the link aggregation splits traffic between multiple paths (i.e., links) belonging to the same egress group. Hence, the egress path group mentioned hereafter may be a link aggregation group (LAG) for unicast forwarding. - In this embodiment, the
path selection device 104 includes a flow group based path selection circuit 112 and a flow subgroup based path selection circuit 114. For example, the flow group based path selection circuit 112 is responsible for dealing with the hash-based path selection, and the flow subgroup based path selection circuit 114 is responsible for dealing with the dynamic load balancing. Hence, when a dynamic load balancing function is not applied to forwarding of a packet, the flow group based path selection circuit 112 sets the path selection signal ecmp_idx based on a flow group into which the packet to be forwarded is categorized; and when the dynamic load balancing function is applied to forwarding of a packet, the flow subgroup based path selection circuit 114 sets the path selection signal ecmp_idx based on a flow subgroup into which the packet to be forwarded is categorized. -
FIG. 2 is a diagram illustrating a flow group hierarchy according to an embodiment of the present invention. Each packet has packet header fields used to identify a flow. Specifically, the flow is a group of packets with the same tuple(s). For example, packets may be categorized into different flows using 5-tuple hash distribution, where a 5-tuple extracted from each packet includes a source IP (internet protocol) address, a source port number, a destination IP address, a destination port number, and a protocol in use. As shown in FIG. 2, packets are categorized into a plurality of flows FL0-FL11 according to certain tuples tp0-tp11 extracted from packet headers of the packets received by the router. For example, the flow FL0 is composed of packets with the same tuple(s) tp0, and the flow FL1 is composed of packets with the same tuple(s) tp1, where tp0≠tp1. The tuples tp0-tp11 are further used to categorize flows FL0-FL11 into flow subgroups of different flow groups. - In an exemplary design, a hash engine with a predetermined hash algorithm is used to generate a hash result (e.g., a 16-bit hash value) pkt_hash according to a selected set of tuples of each packet. Hence, packets belonging to the same flow should have the same hash result pkt_hash due to the same tuples. For example, a more significant bit (MSB) part of the hash result pkt_hash of a packet is used to serve as a flow group index FGI of a flow group into which the packet is categorized, and the rest of the hash result pkt_hash (i.e., a less significant bit (LSB) part) is used to serve as a flow subgroup index FsGI of a flow subgroup into which the packet is categorized. Since packets belonging to different flow groups may have the same flow subgroup index FsGI, the flow group indices FGI possessed by the packets are needed to distinguish between flow subgroups with the same flow subgroup index FsGI.
In other words, hash results pkt_hash, each including a flow group index FGI and a flow subgroup index FsGI (e.g., pkt_hash={FGI, FsGI}), are used to distinguish between different flow subgroups belonging to different flow groups. However, using the flow group indices FGI is enough to distinguish between different flow groups.
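The MSB/LSB split of the hash result can be sketched as follows. The 16-bit width comes from the example above, while the 6-bit/10-bit split point is an assumed configuration:

```python
HASH_BITS = 16                 # e.g., a 16-bit hash result pkt_hash
FGI_BITS = 6                   # width of the flow group index (assumed split)
FSGI_BITS = HASH_BITS - FGI_BITS

def split_hash(pkt_hash):
    """Return (FGI, FsGI) from a hash result: MSB part -> FGI, LSB part -> FsGI."""
    fgi = pkt_hash >> FSGI_BITS               # more significant bits
    fsgi = pkt_hash & ((1 << FSGI_BITS) - 1)  # less significant bits
    return fgi, fsgi
```

The pair (FGI, FsGI) identifies a flow subgroup within a flow group, while FGI alone identifies the flow group, matching pkt_hash={FGI, FsGI} above.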
- As shown in
FIG. 2, the flows FL0-FL2 are categorized into the flow subgroup FsG0 of the flow group FG0 due to the fact that the flows FL0-FL2 have the same hash result pkt_hash={0, 0}; the flow FL3 is categorized into the flow subgroup FsG1 of the flow group FG0 due to the fact that the flow FL3 has the hash result pkt_hash={0, 1}; the flows FL4-FL5 are categorized into the flow subgroup FsG2 of the flow group FG0 due to the fact that the flows FL4-FL5 have the same hash result pkt_hash={0, 2}; the flow FL6 is categorized into the flow subgroup FsG0 of the flow group FG1 due to the fact that the flow FL6 has the hash result pkt_hash={1, 0}; the flows FL7-FL8 are categorized into the flow subgroup FsG1 of the flow group FG1 due to the fact that the flows FL7-FL8 have the same hash result pkt_hash={1, 1}; and the flows FL9-FL11 are categorized into the flow subgroup FsG2 of the flow group FG1 due to the fact that the flows FL9-FL11 have the same hash result pkt_hash={1, 2}. - As mentioned above, the flow subgroup index FsGI may be obtained from the rest of the hash result pkt_hash that is not used to serve as the flow group index FGI. However, this is not meant to be a limitation of the present invention. In an alternative design, the flow subgroup index FsGI may be generated based on another predetermined hash algorithm.
- Using a flow group index FGI to act as an identifier of a flow group and using a combination of a flow group index FGI and a flow subgroup index FsGI to act as an identifier of a flow subgroup in a flow group can reduce the memory requirement, thus reducing the production cost of the proposed
packet forwarding apparatus 100. However, this is for illustrative purposes only. Any means capable of categorizing flows associated with the same egress path group into a plurality of flow subgroups and categorizing the flow subgroups into a plurality of flow groups may be employed. For example, full tuples of each packet may be directly used to indicate a flow subgroup of a flow group into which the packet should be categorized. - The path selection signal ecmp_idx is set by either the flow group based
path selection circuit 112 or the flow subgroup based path selection circuit 114. The path selection signal ecmp_idx controls the selection of a destination path used for forwarding the packet. When the proposed dynamic load balancing function is applied to forwarding of the packet, the path selection signal ecmp_idx further informs the path rate monitor 106 of the selected egress path. Hence, the path rate monitor 106 updates a path rate monitor value of the selected egress path based on traffic of the packet. Specifically, the path rate monitor 106 is configured to monitor data rates of egress paths of the same egress path group to generate path rate monitor values, respectively. In other words, when there are different egress path groups, the path rate monitor 106 generates path rate monitor values for egress paths of these egress path groups. -
FIG. 3 is a diagram illustrating a path rate monitor according to an embodiment of the present invention. The path rate monitor 106 shown in FIG. 1 may be implemented using the path rate monitor 300 shown in FIG. 3. The path rate monitor 300 includes a lookup table 302, a comparing circuit 304 corresponding to an egress path group, and a plurality of monitoring circuits 306_1, 306_2 . . . 306_N corresponding to different egress paths of the egress path group, respectively. The path rate monitor 300 has one monitoring circuit per path, and has one comparing circuit per egress path group. In other words, when there are N egress paths in one egress path group (e.g., an ECMP path group or an LAG), the path rate monitor 300 is configured to have N monitoring circuits and a single comparing circuit for the same egress path group. For clarity and simplicity, FIG. 3 only shows monitoring circuits and one comparing circuit for one egress path group. It should be noted that the lookup table 302 can be shared among different egress path groups. Hence, only one lookup table 302 is created in the path rate monitor 106. - The lookup table 302 has a plurality of table entries, each storing an adjacency index adj_idx and an associated rate counter pointer rate_cnt_ptr. The lookup table 302 may be stored in a static random access memory (SRAM). Hence, any of the table entries can be accessed based on a corresponding memory address pointed to by the path selection signal ecmp_idx. In other words, an entry index of each table entry in the lookup table 302 is a memory address. When a table entry in the lookup table 302 is accessed based on the path selection signal ecmp_idx, the adjacency index adj_idx is read to select a destination path from egress paths of the same egress path group, and the associated rate counter pointer rate_cnt_ptr is read to select one monitoring circuit assigned to the selected destination path.
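The table access described above can be sketched as follows; the entry contents are illustrative values, not entries from the disclosure:

```python
# Sketch of the lookup-table access: ecmp_idx addresses a table entry whose
# adj_idx selects the destination path and whose rate_cnt_ptr selects the
# monitoring circuit. Entry values below are illustrative assumptions.
lookup_table = [
    {"adj_idx": 10, "rate_cnt_ptr": 0},  # first entry  -> monitoring circuit 306_1
    {"adj_idx": 10, "rate_cnt_ptr": 0},  # second entry -> same circuit 306_1
    {"adj_idx": 11, "rate_cnt_ptr": 1},  # third entry  -> circuit 306_2
]

def resolve(ecmp_idx):
    entry = lookup_table[ecmp_idx]       # entry index acts as a memory address
    return entry["adj_idx"], entry["rate_cnt_ptr"]
```

Note that two entries may share one rate counter pointer, so different table entries can be accounted against the same per-path monitoring circuit.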
In this embodiment, the rate counter pointers rate_cnt_ptr may be configured by the controller (e.g., a processor running firmware FW) 102, where more than one rate counter pointer rate_cnt_ptr may be configured to point to the same monitoring circuit. Consider a case where a received packet is required to be forwarded through one of the egress paths of a specific egress path group: the path selection index ecmp_idx is generated to access one of the table entries associated with the specific egress path group. For example, when the first entry shown in
FIG. 3 is accessed by the path selection index ecmp_idx, a monitoring circuit 306_1 associated with one egress path of the specific egress path group is selected; when the second entry shown in FIG. 3 is accessed by the path selection index ecmp_idx, the same monitoring circuit 306_1 associated with one egress path of the specific egress path group is selected; and when the third entry is accessed by the path selection index ecmp_idx, a different monitoring circuit 306_2 associated with another egress path of the specific egress path group is selected. - The monitoring circuit assigned to the selected destination path is operative to update its path rate monitor value. In this example, the monitoring circuits 306_1-306_N generate average path rate values PR_AVG_1-PR_AVG_N as path rate monitor values. As shown in FIG. 3, each of the monitoring circuits 306_1-306_N has a set of rate counters, including a first counter and a second counter. For example, the monitoring circuit 306_1 includes a first counter 308_1 and a second counter 309_1, the monitoring circuit 306_2 includes a first counter 308_2 and a second counter 309_2, and the monitoring circuit 306_N includes a first counter 308_N and a second counter 309_N. - The first counters 308_1-308_N are configured to generate instantaneous path rate values PR_CUR_1-PR_CUR_N, respectively. For example, each of the first counters 308_1-308_N is configured to count the number of bytes transmitted through a corresponding egress path during one predetermined period T_upd to generate one instantaneous path rate value when the corresponding egress path is the selected destination path. The second counters 309_1-309_N are configured to generate average path rate values PR_AVG_1-PR_AVG_N, respectively. For example, each of the second counters 309_1-309_N is configured to generate a weighted average of an average path rate value and the instantaneous path rate value to update the average path rate value, which acts as a path rate monitor value of the monitoring circuit. The operation of the second counter may be expressed using the following equation. -
PR_AVG = PR_AVG*C + PR_CUR*(1−C) (1)
- The path rate monitor values (e.g., average path rate values PRAVG
— 1-PRAVG— N) generated from the monitoring circuits 306_1-306_N indicate traffic statuses of paths belonging to the same egress path group. In this embodiment, the comparingcircuit 304 is configured to compare each of the path rate monitor values (e.g., average path rate values PRAVG— 1-PRAVG— N) generated from the monitoring circuits 306_1-306_N with a predetermined threshold value TH_R, and generate an indication signal SIND when any of the path rate monitor values exceeds the predetermined threshold value TH_R. The predetermined threshold value TH_R may be configured by the controller (e.g., a processor running firmware FW) 102. - The operation of each of the second counters 309_1-309_N may be regarded as low-pass filtering. Hence, setting the path rate monitor value by the average path rate value can prevent the path rate monitor value from having a significant variation caused by a sudden packet traffic change. However, this is for illustrative purposes only, and is not meant to be a limitation of the present invention. Any means capable of generating path rate monitor values indicative of traffic statuses of paths belonging to the same egress path group may be employed by the
path rate monitor 106. These alternative designs fall within the scope of the present invention. - When one path rate monitor value exceeds the predetermined threshold value TH_R, it implies that the packet traffic on the monitored path is too high and thus results in load unbalance of different paths in the same egress path group. Hence, when the indication signal SIND is asserted to indicate load unbalance, the dynamical load balancing function offered by the flow subgroup based
path selection circuit 114 should be properly adjusted to reduce packet traffic on the heavy-loaded path and/or increase packet traffic on light-loaded paths. In this way, the load unbalance can be removed or mitigated to achieve a more uniform bandwidth distribution. Further details of the dynamical load balancing function performed by the flow subgroup basedpath selection circuit 114 will be described later. - The heavy flow monitor 108 provides additional information needed for adjusting the dynamical load balancing function offered by the flow subgroup based
path selection circuit 114. In this embodiment, the heavy flow monitor 108 is configured to capture at least one heavy flow subgroup from flow subgroups of flow groups associated with the same egress path group, wherein traffic of the at least one heavy flow subgroup is higher than traffic of other flow subgroups of the flow groups. Specifically, concerning each egress path group, the heavy flow monitor 108 monitors flows forwarded through egress paths of the egress path group to capture at least one heavy flow subgroup. Hence, the heavy flow monitor 108 captures heavy flow subgroup(s) for one egress path group, and captures heavy flow subgroup(s) for another egress path group. -
FIG. 4 is a diagram illustrating a heavy flow monitor according to an embodiment of the present invention. The heavy flow monitor 108 shown in FIG. 1 may be implemented using the heavy flow monitor 400 shown in FIG. 4. In this embodiment, the heavy flow monitor 400 employs a hardware-based implementation, and includes a heavy flow monitoring controller 402 and a storage device 404 for one egress path group. For clarity and simplicity, FIG. 4 shows one heavy flow monitoring controller and one storage device only. When there are different egress path groups, the heavy flow monitor 400 has multiple sets of a heavy flow monitoring controller and a storage device, where each set of a heavy flow monitoring controller and a storage device is used to capture at least one heavy flow subgroup according to flows forwarded through egress paths of a corresponding egress path group. - A key function of network performance monitoring is determining how bandwidth is used by flows; in particular, determining which flows use the most bandwidth. In general, flows may be categorized into elephant flows, each consuming a large amount of bandwidth due to high packet traffic (i.e., a large amount of data), and mice flows, each consuming a small amount of bandwidth due to low packet traffic (i.e., a small amount of data). In accordance with the long-tailed nature of network traffic, there are few elephant flows compared to mice flows. That is, most of the flows carry very little traffic. Compared to mice flows, elephant flows have a large impact on load balance of paths. Based on such observation, heavy flow subgroups are identified for setting the proposed dynamic load balancing function performed by the flow subgroup based
path selection circuit 114. - In this embodiment, the heavy flow monitor 400 may be used to track top-M heavy flow subgroups among flow subgroups of flow groups associated with the same egress path group, where M is an integer determined based on the actual design consideration. As mentioned above, each egress path group is assigned with one heavy flow monitor. As shown in
FIG. 4 , the heavy flow monitor 400 is only responsible for monitoring heavy flow subgroups among flow subgroups of flow groups associated with an egress path group having a path group index ecmp_grp_idx=2. However, this is for illustrative purposes only, and is not meant to be a limitation of the present invention. - The heavy
flow monitoring controller 402 may employ a heavy flow monitoring algorithm to identify one or more heavy flow subgroups for the egress path group with the path group index ecmp_grp_idx=2. - By way of example, but not limitation, a conventional flow-based heavy flow monitoring algorithm may be modified to be the heavy flow monitoring algorithm used by the heavy
flow monitoring controller 402 to identify heavy flow subgroups. Further, an identifier of one captured heavy flow subgroup (e.g., a hash result pkt hash of one captured heavy flow subgroup) maybe stored in a content-addressable memory (CAM) or an SRAM, and the associated information, including a hit count hit cnt, a timestamp and a data rate, may be stored in an SRAM. In this embodiment, the heavy flow monitor 400 is hardware based, and a rate table may be established for recording data rate information of each flow subgroup. However, this is for illustrative purposes only, and is not meant to be a limitation of the present invention. In another embodiment, the hardware-based heavy flow monitor 400 may employ flow sorting hardware that collaborates with the heavy flow monitoring algorithm. The same objective of identifying heavy flow subgroups can be achieved. - Alternatively, the heavy flow monitor 108 may be realized using a software-based implementation. For example, the heavy flow monitor 108 is a software module (e.g., an sFlow module defined by OpenFlow) executed by a processor. The same objective of capturing at least one heavy flow subgroup from flow subgroups of flow groups associated with an egress path group is achieved.
- As mentioned above, the flow subgroup based
path selection circuit 114 is designed for dynamic load balancing. In one exemplary design, the flow subgroup based path selection circuit 114 may use a flow rebalance table, created and updated for re-directing heavy flow subgroups to light-loaded paths, thus leading to a more uniform bandwidth distribution over different egress paths in the same egress path group (e.g., an ECMP path group or a LAG). -
FIG. 5 is a diagram illustrating a path selection device according to an embodiment of the present invention. The path selection device 104 shown in FIG. 1 may be implemented using the path selection device 500 shown in FIG. 5. The path selection device 500 includes a hash-based path selection circuit 502, a dynamic load balancing circuit 504, and a multiplexer (MUX) 506. Specifically, the hash-based path selection circuit 502 is used to realize the flow group based path selection circuit 112 shown in FIG. 1, and the dynamic load balancing circuit 504 is used to realize the flow subgroup based path selection circuit 114 shown in FIG. 1. - When there is a packet to be forwarded, an egress path group table (not shown) is checked to determine which egress path group should be used. For example, the egress path group with the egress path group index ecmp_grp_idx=2 is selected, where the selected egress path group includes N egress paths monitored by the monitoring circuits 306_1-306_N shown in
FIG. 3, and flow subgroups forwarded through egress paths of the selected egress path group are monitored by the heavy flow monitoring controller 402 shown in FIG. 4. In addition, the hash result pkt_hash is generated for the packet to be forwarded. For example, the hash result pkt_hash may be divided into a flow group index FGI and a flow subgroup index FsGI. It should be noted that the hash result pkt_hash, including the flow group index FGI and the flow subgroup index FsGI, is used to identify the flow subgroup into which the packet to be forwarded is categorized, while the flow group index FGI alone is used to identify the flow group into which the packet to be forwarded is categorized. - As mentioned above, an entry index of each table entry in the lookup table 302 shown in
FIG. 3 is a memory address. Hence, the path selection signal ecmp_idx is set to a memory address where one table entry of the lookup table 302 (which includes the next-hop information) is stored. The hash-based path selection circuit 502 may employ any well-known hash-based path selection algorithm to generate a first path selection signal ADDR0 based on the flow group index FGI. For example, a hash-threshold algorithm may be employed by the hash-based path selection circuit 502. - Concerning the dynamic
load balancing circuit 504, it includes a storage device 512 and an adder 514. The storage device 512 is used to store a flow rebalance table 516. In one exemplary design, the flow rebalance table 516 may be shared among different egress path groups. Hence, as shown in FIG. 5, an entry index of each table entry of the flow rebalance table 516 is a combination of an egress path group index ecmp_grp_idx and a hash result pkt_hash. When a table entry is hit, an address offset idx_ofs is output from the flow rebalance table 516 to the adder 514. In this embodiment, the entry indices may be stored in a CAM or an SRAM, and the address offsets may be stored in an SRAM; thus, the flow rebalance table 516 can be implemented as an SRAM-based search table. Next, the adder 514 adds the address offset idx_ofs to a base address ADDR_base to generate a second path selection signal ADDR1. That is, ADDR1=ADDR_base+idx_ofs. - The lookup table 302 shown in
FIG. 3 may also be shared among different egress path groups. Preferably, the table entries associated with the same egress path group are stored in an allocated memory space with contiguous memory addresses. Hence, one of the table entries associated with the same egress path group may be indexed by a memory address acting as a base address, and the rest of the table entries may be indexed by memory addresses, each being the base address plus one address offset. To reduce the table size, the flow rebalance table 516 may store address offsets only, as shown in FIG. 5. However, this is not meant to be a limitation of the present invention. Alternatively, the adder 514 may be omitted, and the flow rebalance table 516 may be modified to store entry indices of table entries in the lookup table 302. Hence, when a table entry of the flow rebalance table 516 is hit, a stored memory address (i.e., an entry index of a table entry in the lookup table 302) is output from the flow rebalance table 516 to serve as the second path selection signal ADDR1. - The
multiplexer 506 is configured to select one of the first path selection signal ADDR0 and the second path selection signal ADDR1 as its output. Specifically, when one table entry of the flow rebalance table 516 is hit, implying that the dynamic load balancing function should be applied to forwarding of the packet, the multiplexer 506 outputs the second path selection signal ADDR1 as the path selection signal ecmp_idx; and when no table entry of the flow rebalance table 516 is hit, implying that there is no need to apply the dynamic load balancing function to forwarding of the packet, the multiplexer 506 outputs the first path selection signal ADDR0 as the path selection signal ecmp_idx. - Consider a case where the dynamic load balancing function is enabled for the egress path group with the egress path group index ecmp_grp_idx=2. Initially, the flow rebalance table 516 includes no table entries for load balancing of the egress path group. Since no table entry of the flow rebalance table 516 is hit, the path selection signal ecmp_idx corresponding to a packet categorized into a flow group associated with the egress path group with the egress path group index ecmp_grp_idx=2 is set by the hash-based
path selection circuit 502. When the dynamic load balancing function is enabled for the egress path group, the path rate monitor 300 and the heavy flow monitor 400 are operative to perform the intended functions mentioned above. Initially, all of the instantaneous path rate values PR_CUR_1-PR_CUR_N and average path rate values PR_AVG_1-PR_AVG_N for egress paths of the egress path group with the egress path group index ecmp_grp_idx=2 are set to initial values (e.g., 0). In addition, no heavy flow subgroup forwarded through egress paths of the egress path group with the egress path group index ecmp_grp_idx=2 is initially captured by the heavy flow monitor 400. When one or more of the paths in the egress path group are selected for packet forwarding, the path rate monitor 300 will update one or more of the path rate monitor values (i.e., the average path rate values PR_AVG_1-PR_AVG_N), and the heavy flow monitor 400 will capture one or more new heavy flow subgroups and/or update one or more existing captured heavy flow subgroups. - When the comparing
circuit 304 detects that one of the path rate monitor values (i.e., the average path rate values PR_AVG_1-PR_AVG_N) exceeds the predetermined threshold TH_R, the comparing circuit 304 generates the indication signal S_IND to notify the controller 102. For example, when the controller 102 is a processor running the firmware FW, the indication signal S_IND may be an interrupt of the processor. Next, the controller 102 reads status registers to find out which {egress path group, path} pair triggered the interrupt. Further, the controller 102 reads the path rate monitor values (i.e., the average path rate values PR_AVG_1-PR_AVG_N) of the egress path group from the path rate monitor 300, and reads an identifier of any captured heavy flow subgroup from the heavy flow monitor 400. The controller 102 refers to the path rate monitor values to find at least one light-loaded path, and regards the at least one light-loaded path as at least one destination path to which one or more flow subgroups are re-directed. In addition, the controller 102 refers to the captured heavy flow subgroup(s) to find out which flow subgroup(s) should be re-directed. In other words, based on the information given by the path rate monitor 300 and the heavy flow monitor 400, the controller 102 makes a decision on how to program/update the flow rebalance table 516. After the flow rebalance table 516 is updated, packets categorized into a heavy flow subgroup can be re-directed to a light-loaded path. - As mentioned above, the flow rebalance table 516 may be updated by adding new table entries and/or replacing old table entries when the
controller 102 is notified by the indication signal S_IND. Further, since the storage device 512 has a limited storage space, each table entry of the flow rebalance table 516 can be aged out to release the occupied storage space. Besides aging, the controller 102 may employ a table management policy to update the flow rebalance table 516. - For example, the
controller 102 is further configured to update the flow rebalance table 516 having an entry corresponding to a specific heavy flow subgroup of a specific flow group associated with the egress path group when the specific heavy flow subgroup of the specific flow group is evicted from the heavy flow subgroup(s). In other words, when a specific flow subgroup previously captured by the heavy flow monitoring controller 402 is replaced by another flow subgroup with a higher traffic load, the specific flow subgroup is no longer one of the top-M heavy flow subgroups. Hence, the flow rebalance table 516 needs to be updated by replacing or removing the table entry associated with the specific flow subgroup. - For another example, the
controller 102 is further configured to update the flow rebalance table 516 having an entry corresponding to a specific egress path of the egress path group when the specific egress path is removed from the egress path group due to a path/link down event. In other words, the flow rebalance table 516 needs to be managed during path removal from an egress path group. - In one exemplary design, each egress path group is further assigned an enable bit DLB_en in the egress path group table. The enable bit DLB_en may be set by the controller (e.g., a processor running firmware FW) 102 to indicate whether the proposed dynamic load balancing function should be enabled for the corresponding egress path group. For example, when DLB_en=1 for the egress path group with the egress path group index ecmp_grp_idx=2, the dynamic load balancing function is enabled for packets categorized into flow groups associated with that egress path group; when DLB_en=0, the dynamic load balancing function is disabled for those packets.
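The hash-based selection path described earlier, which splits pkt_hash into a flow group index FGI and a flow subgroup index FsGI and then applies a hash-threshold algorithm (cf. RFC 2992) to the FGI, might be sketched as follows. The 8-bit field widths and the function names are assumptions for illustration, not values taken from the patent:

```python
# Illustrative split of pkt_hash into FGI (upper bits) and FsGI (lower
# bits); the 8/8 bit widths are assumptions for this sketch.
FGI_BITS = 8
FSGI_BITS = 8

def split_hash(pkt_hash):
    # FGI identifies the flow group; (FGI, FsGI) identifies the subgroup.
    fgi = (pkt_hash >> FSGI_BITS) & ((1 << FGI_BITS) - 1)
    fsgi = pkt_hash & ((1 << FSGI_BITS) - 1)
    return fgi, fsgi

def hash_threshold_select(fgi, num_paths):
    # Hash-threshold selection: divide the FGI space into num_paths
    # equal regions and map the FGI to the region it falls in.
    region_size = (1 << FGI_BITS) // num_paths
    return min(fgi // region_size, num_paths - 1)
```

A hash-threshold split keeps most flows on their current path when a path is added or removed, which is one reason it is a common choice for ECMP selection.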
- To put it simply, when a packet is received, the egress path group table is accessed to read an enable bit DLB_en and an egress path group index of an egress path group, where the enable bit DLB_en is used for determining whether to enable the dynamic load balancing function for the egress path group, and the egress path group index is used to perform table lookup for path selection/next-hop selection when the dynamic load balancing function for the egress path group is enabled.
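A software model of the selection described above (the DLB_en check, the flow rebalance table keyed by a combination of ecmp_grp_idx and pkt_hash, the adder producing ADDR1=ADDR_base+idx_ofs, and the multiplexer choosing between ADDR1 and the hash-based ADDR0) might look like the sketch below; the dict-based table and the parameter names are assumptions:

```python
# Software model of the path selection signal ecmp_idx. On a flow
# rebalance table hit (with DLB enabled), the adder output
# ADDR1 = ADDR_base + idx_ofs is selected; otherwise the hash-based
# first path selection signal ADDR0 wins at the multiplexer.
def select_ecmp_idx(dlb_en, ecmp_grp_idx, pkt_hash, addr0, addr_base,
                    rebalance_table):
    key = (ecmp_grp_idx, pkt_hash)
    if dlb_en and key in rebalance_table:
        return addr_base + rebalance_table[key]   # ADDR1 (table hit)
    return addr0                                  # ADDR0 (miss or DLB off)
```

Storing only the offset idx_ofs, as in the sketch, mirrors the table-size reduction discussed above; storing full entry indices instead would simply drop the addition.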
- In a case where the enable bit DLB_en indicates that the proposed dynamic load balancing function is not enabled for a selected egress path group, the flow group based path selection circuit 112 (e.g., the hash-based path selection circuit 502) is used to select a destination path from the selected egress path group, the flow subgroup based path selection circuit 114 (e.g., the dynamic load balancing circuit 504) is not used to select a destination path from the selected egress path group, the path rate monitor 106 (e.g., the path rate monitor 300) does not need to update any path rate monitor values for the selected egress path group, and the heavy flow monitor 108 (e.g., the heavy flow monitor 400) does not need to track any heavy flow subgroups forwarded through egress paths of the selected egress path group.
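The firmware-side decision described earlier (on an interrupt from the comparing circuit 304, regard the lightest-loaded path as the destination and program the flow rebalance table so that captured heavy flow subgroups are re-directed to it) could be sketched as follows; the list/dict representations and the use of the path index as the stored address offset are assumptions of the sketch:

```python
# Sketch of the controller's table-programming decision. avg_rates holds
# the average path rate values PR_AVG_1..PR_AVG_N; heavy_hashes are
# identifiers read from the heavy flow monitor; the destination path
# index doubles as the address offset idx_ofs in this model.
def rebalance(avg_rates, heavy_hashes, ecmp_grp_idx, th_r, rebalance_table):
    if max(avg_rates) <= th_r:
        return rebalance_table                # no path exceeds TH_R
    # Lightest-loaded path becomes the destination for re-direction.
    dest = min(range(len(avg_rates)), key=lambda i: avg_rates[i])
    for pkt_hash in heavy_hashes:
        rebalance_table[(ecmp_grp_idx, pkt_hash)] = dest
    return rebalance_table
```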
- In another case where the enable bit DLB_en indicates that the proposed dynamic load balancing function should be enabled for a selected egress path group, one of the flow group based path selection circuit 112 (e.g., the hash-based path selection circuit 502) and the flow subgroup based path selection circuit 114 (e.g., the dynamic load balancing circuit 504) is used to select a destination path from the selected egress path group. When the flow subgroup based path selection circuit 114 (e.g., the dynamic load balancing circuit 504) is not used to select a destination path from the selected egress path group for a packet due to a table entry miss, the path rate monitor 106 (e.g., the path rate monitor 300) does not need to update a path rate monitor value corresponding to the destination path selected by the flow group based path selection circuit 112 (e.g., the hash-based path selection circuit 502) for forwarding the packet, and the heavy flow monitor 108 (e.g., the heavy flow monitor 400) does not need to update a current tracking result of heavy flow subgroups corresponding to the selected egress path group. However, when the flow subgroup based path selection circuit 114 (e.g., the dynamic load balancing circuit 504) is used to select a destination path from the selected egress path group for a packet due to a table entry hit, the path rate monitor 106 (e.g., the path rate monitor 300) needs to update a path rate monitor value corresponding to the destination path selected by the flow subgroup based path selection circuit 114 (e.g., the dynamic load balancing circuit 504) for forwarding the packet, and the heavy flow monitor 108 (e.g., the heavy flow monitor 400) needs to determine whether to update a current tracking result of heavy flow subgroups corresponding to the selected egress path group.
-
FIG. 6 is a flowchart illustrating a packet forwarding method according to an embodiment of the present invention. The packet forwarding method may be employed by the packet forwarding apparatus 100 with the path selection device 104 realized using the path selection device 500, the path rate monitor 106 realized using the path rate monitor 300, and the heavy flow monitor 108 realized using the heavy flow monitor 400. Provided that the result is substantially the same, the steps are not required to be executed in the exact order shown in FIG. 6. The packet forwarding method may be briefly summarized as below. - Step 600: Start.
- Step 602: Receive a packet to be forwarded.
- Step 604: Check if an enable bit DLB_en corresponding to an egress path group selected for the packet to be forwarded indicates that the dynamic load balancing function should be enabled. If yes, go to step 606; otherwise, go to step 612.
- Step 606: Check if a table entry of a flow rebalance table is hit. If yes, go to step 608; otherwise, go to step 612.
- Step 608: Refer to path selection information stored in the hit table entry to set a path selection signal referenced for selecting a destination path from a plurality of egress paths belonging to the selected egress path group.
- Step 610: Update a path rate monitor value corresponding to the selected destination path, and/or update a current tracking result of heavy flow subgroups corresponding to the selected egress path group. Go to step 602 to receive a next packet to be forwarded.
- Step 612: Set the path selection signal according to flow group based path selection (e.g., hash-based path selection).
- Step 614: Check if the enable bit DLB_en corresponding to the selected egress path group indicates that the dynamic load balancing function should be enabled. If yes, go to step 610; otherwise, go to step 602 to receive a next packet to be forwarded.
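The steps above can be condensed into a single per-packet routine; the callable update_monitors stands in for the monitor updates of step 610 and is an assumption of the sketch:

```python
# Illustrative walk-through of steps 602-614 for one packet.
def forward_packet(dlb_en, rebalance_hit, table_addr, hash_addr,
                   update_monitors):
    # Steps 604/606: DLB enabled and flow rebalance table hit?
    if dlb_en and rebalance_hit:
        ecmp_idx = table_addr    # Step 608: use the hit entry's selection
    else:
        ecmp_idx = hash_addr     # Step 612: flow group (hash) based selection
    if dlb_en:
        # Steps 610/614: update path rate / heavy flow tracking only when
        # dynamic load balancing is enabled for the egress path group.
        update_monitors(ecmp_idx)
    return ecmp_idx
```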
- As a person skilled in the art can readily understand details of each step after reading above paragraphs, further description is omitted here for brevity.
- Those skilled in the art will readily observe that numerous modifications and alterations of the device and method may be made while retaining the teachings of the invention. Accordingly, the above disclosure should be construed as limited only by the metes and bounds of the appended claims.
Claims (20)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/488,307 US9467384B2 (en) | 2013-12-05 | 2014-09-17 | Packet forwarding apparatus and method using flow subgroup based path selection for dynamic load balancing |
CN201410738375.8A CN104702523B (en) | 2013-12-05 | 2014-12-05 | Packet forwarding apparatus and method |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201361912020P | 2013-12-05 | 2013-12-05 | |
US14/488,307 US9467384B2 (en) | 2013-12-05 | 2014-09-17 | Packet forwarding apparatus and method using flow subgroup based path selection for dynamic load balancing |
Publications (2)
Publication Number | Publication Date |
---|---|
US20150163146A1 true US20150163146A1 (en) | 2015-06-11 |
US9467384B2 US9467384B2 (en) | 2016-10-11 |
Family
ID=53272296
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US14/488,307 Active 2035-06-23 US9467384B2 (en) | 2013-12-05 | 2014-09-17 | Packet forwarding apparatus and method using flow subgroup based path selection for dynamic load balancing |
Country Status (2)
Country | Link |
---|---|
US (1) | US9467384B2 (en) |
CN (1) | CN104702523B (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101796372B1 (en) * | 2016-02-12 | 2017-11-10 | 경희대학교 산학협력단 | Apparatus and method of interest forwarding in parallel by using multipath in content-centric network |
US10129127B2 (en) * | 2017-02-08 | 2018-11-13 | Nanning Fugui Precision Industrial Co., Ltd. | Software defined network controller, service function chaining system and trace tracking method |
CN110099412A (en) | 2018-01-31 | 2019-08-06 | 慧与发展有限责任合伙企业 | Uplink is automatically selected based on user's anchor controller |
CN109361603B (en) * | 2018-11-26 | 2021-03-23 | 浪潮思科网络科技有限公司 | Method and system for dynamically adjusting equivalent path flow based on programmable switching chip |
US10901805B2 (en) | 2018-12-18 | 2021-01-26 | At&T Intellectual Property I, L.P. | Distributed load balancing for processing of high-volume data streams |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050080923A1 (en) * | 2003-09-10 | 2005-04-14 | Uri Elzur | System and method for load balancing and fail over |
US7613110B1 (en) * | 2000-05-17 | 2009-11-03 | Cisco Technology, Inc. | Combining multilink and IP per-destination load balancing over a multilink bundle |
US20150107712A1 (en) * | 2012-05-31 | 2015-04-23 | Hitachi Construction Machinery | Multiple valve device |
US20150188823A1 (en) * | 2013-12-03 | 2015-07-02 | Akamai Technologies, Inc. | Virtual private network (VPN)-as-a-service with load-balanced tunnel endpoints |
US20160094463A1 (en) * | 2012-08-15 | 2016-03-31 | Dell Products L.P. | Network switching system using software defined networking applications |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU2003221530A1 (en) * | 2002-03-05 | 2003-09-16 | International Business Machines Corporation | Method and system for ordered dynamic distribution of packet flows over network processors |
CN102215158B (en) * | 2010-04-08 | 2015-04-15 | 杭州华三通信技术有限公司 | Method for realizing VRRP (Virtual Router Redundancy Protocol) flow transmission and routing equipment |
Cited By (39)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9548924B2 (en) * | 2013-12-09 | 2017-01-17 | Nicira, Inc. | Detecting an elephant flow based on the size of a packet |
US10666530B2 (en) * | 2013-12-09 | 2020-05-26 | Nicira, Inc | Detecting and handling large flows |
US10158538B2 (en) * | 2013-12-09 | 2018-12-18 | Nicira, Inc. | Reporting elephant flows to a network controller |
US11811669B2 (en) | 2013-12-09 | 2023-11-07 | Nicira, Inc. | Inspecting operations of a machine to detect elephant flows |
US9967199B2 (en) | 2013-12-09 | 2018-05-08 | Nicira, Inc. | Inspecting operations of a machine to detect elephant flows |
US11539630B2 (en) | 2013-12-09 | 2022-12-27 | Nicira, Inc. | Inspecting operations of a machine to detect elephant flows |
US20150163142A1 (en) * | 2013-12-09 | 2015-06-11 | Nicira, Inc. | Detecting an elephant flow based on the size of a packet |
US11095536B2 (en) | 2013-12-09 | 2021-08-17 | Nicira, Inc. | Detecting and handling large flows |
US10193771B2 (en) * | 2013-12-09 | 2019-01-29 | Nicira, Inc. | Detecting and handling elephant flows |
US20150163144A1 (en) * | 2013-12-09 | 2015-06-11 | Nicira, Inc. | Detecting and handling elephant flows |
US9838276B2 (en) | 2013-12-09 | 2017-12-05 | Nicira, Inc. | Detecting an elephant flow based on the size of a packet |
US9299434B2 (en) * | 2014-01-09 | 2016-03-29 | Netronome Systems, Inc. | Dedicated egress fast path for non-matching packets in an OpenFlow switch |
US20150194215A1 (en) * | 2014-01-09 | 2015-07-09 | Netronome Systems, Inc. | Dedicated egress fast path for non-matching packets in an openflow switch |
US9571400B1 (en) * | 2014-02-25 | 2017-02-14 | Google Inc. | Weighted load balancing in a multistage network using hierarchical ECMP |
US9716658B1 (en) | 2014-02-25 | 2017-07-25 | Google Inc. | Weighted load balancing in a multistage network using heirachical ECMP |
US10341235B2 (en) * | 2014-04-21 | 2019-07-02 | Huawei Technologies Co., Ltd. | Load balancing implementation method, device, and system |
US9736067B2 (en) * | 2014-05-12 | 2017-08-15 | Google Inc. | Prefix-aware weighted cost multi-path group reduction |
US20150326476A1 (en) * | 2014-05-12 | 2015-11-12 | Google Inc. | Prefix-aware weighted cost multi-path group reduction |
US20170099230A1 (en) * | 2014-06-18 | 2017-04-06 | Huawei Technologies Co., Ltd. | Method and apparatus for controlling service data flow |
US10728162B2 (en) * | 2014-06-18 | 2020-07-28 | Huawei Technologies Co., Ltd. | Method and apparatus for controlling service data flow |
CN106209669A (en) * | 2016-06-30 | 2016-12-07 | 中国人民解放军国防科学技术大学 | Towards SDN data center network maximum of probability path stream scheduling method and device |
US11088947B2 (en) * | 2017-05-04 | 2021-08-10 | Liveu Ltd | Device, system, and method of pre-processing and data delivery for multi-link communications and for media content |
US10298500B2 (en) * | 2017-08-07 | 2019-05-21 | Mellanox Technologies Tlv Ltd. | Using consistent hashing for ECMP routing |
US9853900B1 (en) * | 2017-08-07 | 2017-12-26 | Mellanox Technologies Tlv Ltd. | Using consistent hashing for ECMP routing |
US10642739B2 (en) * | 2017-09-11 | 2020-05-05 | Cisco Technology, Inc. | Distributed coordination of caching and processing by networking devices |
US20190079869A1 (en) * | 2017-09-11 | 2019-03-14 | Cisco Technology, Inc. | Distributed coordination of caching and processing by networking devices |
US10432526B2 (en) | 2017-09-25 | 2019-10-01 | Mellanox Technologies Tlv Ltd. | Enhanced traffic distribution using VRF tables |
US20190260670A1 (en) * | 2018-02-19 | 2019-08-22 | Arista Networks, Inc. | System and method of flow aware resilient ecmp |
US10785145B2 (en) * | 2018-02-19 | 2020-09-22 | Arista Networks, Inc. | System and method of flow aware resilient ECMP |
US10880236B2 (en) | 2018-10-18 | 2020-12-29 | Mellanox Technologies Tlv Ltd. | Switch with controlled queuing for multi-host endpoints |
US10680964B1 (en) * | 2018-11-26 | 2020-06-09 | Mellanox Technologies Tlv Ltd. | Rate limiting in a multi-chassis environment by exchanging information between peer network elements |
US11824764B1 (en) * | 2019-07-29 | 2023-11-21 | Innovium, Inc. | Auto load balancing |
US20210075730A1 (en) * | 2019-09-11 | 2021-03-11 | Intel Corporation | Dynamic load balancing for multi-core computing environments |
US11575607B2 (en) * | 2019-09-11 | 2023-02-07 | Intel Corporation | Dynamic load balancing for multi-core computing environments |
US11140083B1 (en) * | 2019-12-06 | 2021-10-05 | Juniper Networks, Inc. | Load balancing over a plurality of packet forwarding components |
US11729101B1 (en) | 2019-12-06 | 2023-08-15 | Juniper Networks, Inc. | Load balancing over a plurality of packet forwarding components |
US20220012207A1 (en) * | 2020-03-11 | 2022-01-13 | Nvidia Corporation | Techniques to transfer data among hardware devices |
US11962518B2 (en) | 2020-06-02 | 2024-04-16 | VMware LLC | Hardware acceleration techniques using flow selection |
CN113660160A (en) * | 2021-08-20 | 2021-11-16 | 烽火通信科技股份有限公司 | UCMP load sharing method and device |
Also Published As
Publication number | Publication date |
---|---|
CN104702523B (en) | 2018-02-27 |
US9467384B2 (en) | 2016-10-11 |
CN104702523A (en) | 2015-06-10 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: MEDIATEK SINGAPORE PTE. LTD., SINGAPORE Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:ZHANG, MING;CHANG, JONATHAN;REEL/FRAME:033753/0588 Effective date: 20140903 |
|
STCF | Information on status: patent grant |
Free format text: PATENTED CASE |
|
MAFP | Maintenance fee payment |
Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 4 |
|
MAFP | Maintenance fee payment |
Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 8 |