US20080267182A1 - Load Balancing Algorithms in Non-Blocking Multistage Packet Switches - Google Patents


Info

Publication number
US20080267182A1
Authority
US
United States
Prior art keywords
input
output
cells
fabric
cell
Legal status
Abandoned
Application number
US12/165,825
Inventor
Aleksandra Smiljanic
Current Assignee
Individual
Original Assignee
Individual
Application filed by Individual
Priority to US12/165,825
Publication of US20080267182A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 49/00: Packet switching elements
    • H04L 49/15: Interconnection of switching modules
    • H04L 49/1515: Non-blocking multistage, e.g. Clos
    • H04L 49/25: Routing or path finding in a switch fabric
    • H04L 49/253: Routing or path finding in a switch fabric using establishment or release of connections between ports
    • H04L 49/254: Centralised controller, i.e. arbitration or scheduling

Definitions

  • In the third load balancing algorithm, input SE i, 0≦i<m, stores m counters associated with different output SEs, c ij , 0≦j<m.
  • a cell arriving to input SE i and bound for the jth output SE is marked to be transmitted through the c ij th output of its SE, i.e. to be transmitted through the c ij th center SE.
  • the counter in question is varied, e.g., incremented modulo l.
  • In the fourth load balancing algorithm, input SE i, 0≦i<m, stores N counters associated with different switch output ports, c ij , 0≦j<N.
  • a cell arriving to input SE i and bound for the jth switch output port is marked to be transmitted through the c ij th output of its SE, i.e. to be transmitted through the c ij th center SE.
  • the counter in question is varied, e.g., incremented modulo l.
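  • The sketch below illustrates how a per-input-SE arbiter could realize the third and fourth algorithms; it is an illustration under assumed names and structure, not the patent's implementation. The flow key is the output SE in the third algorithm and the switch output port in the fourth, and each flow's counter selects the center SE for its next cell.

```python
class InputSEArbiter:
    """Hypothetical arbiter at one input SE (third/fourth algorithms)."""

    def __init__(self, l, key_is_output_port=False):
        self.l = l                                    # number of center SEs
        self.key_is_output_port = key_is_output_port  # False: third, True: fourth
        self.counters = {}                            # c_ij per flow, lazily created

    def mark_cell(self, output_se, output_port):
        j = output_port if self.key_is_output_port else output_se
        c = self.counters.get(j, 0)                   # counter value picks the center SE
        self.counters[j] = (c + 1) % self.l           # c_ij <- (c_ij + 1) mod l
        return c                                      # cell goes via center SE c
```

  • Cells of the same flow are thus spread round-robin over the l center SEs, so the per-SE cell counts of a flow differ by at most one.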
  • the method further comprises grouping cell time slots into frames of length F.
  • the counter of each flow is set at the beginning of each frame.
  • in each frame, input port i can transmit up to a ij cells to output port j.
  • the fabric speedup is defined as:
  • the utilization of the fabric is maximized.
  • D is the maximum tolerable delay
  • T c is cell time slot duration.
  • cells passing through different center SEs may lose correct ordering, i.e. a cell that is transmitted earlier through some center SE may arrive to the output later than a cell that is transmitted later through another center SE. For this reason, cell reordering may be required at the switch outputs.
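  • Such reordering can be implemented, for example, with per-flow sequence numbers and a resequencing buffer at the output; the sketch below is an assumption for illustration, not a mechanism specified by the patent.

```python
import heapq

class Resequencer:
    """Hypothetical per-flow resequencing buffer at a switch output."""

    def __init__(self):
        self.next_seq = 0    # next in-order sequence number to release
        self.pending = []    # min-heap of (seq, cell) pairs that arrived early

    def receive(self, seq, cell):
        heapq.heappush(self.pending, (seq, cell))
        in_order = []
        # release cells from the heap while they continue the sequence
        while self.pending and self.pending[0][0] == self.next_seq:
            in_order.append(heapq.heappop(self.pending)[1])
            self.next_seq += 1
        return in_order
```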
  • the number of flows should fulfill inequality
  • S switching fabric speedup
  • U targeted utilization of the switching fabric
  • D the maximum tolerable delay
  • T c cell time slot duration
  • cell time slots are grouped into frames of length F, and wherein each frame can transmit a ij cells from input port (i) to output port (j), preferably, the number of flows sourced by an input SE or bound for an output SE that are balanced starting from different internal SEs differ by at most one, wherein:
  • speedup is preferably defined as follows:
  • the number of flows sourced by an input SE or bound for an output SE that are balanced starting from different internal SEs differs by at most 1, wherein N f , fulfills:
  • flow synchronization is achieved by resetting counters each frame.
  • Non-blocking is provided without link speedup if l ⁇ n.
  • each input, or input SE will transmit the traffic at equal rates through the connections from input (first stage) to center (second stage) SEs, and, consequently the rate transmitted through any of these connections is:
  • $R' \le \frac{\sum_{i' \in SE_{1i}} s_{i'}}{l} \le \frac{nR}{l} \qquad (1)$
  • s i′ is the rate at which input i′ sends the traffic. If r i′k′ denotes the rate at which input i′ sends the traffic to output k′, then the rate transmitted through a connection from a center (second stage) SE to an output (third stage) SE, say SE 3k , is:
  • $R'' \le \frac{\sum_{i'} \sum_{k' \in SE_{3k}} r_{i'k'}}{l} \le \frac{nR}{l} \qquad (2)$
  • Traffic of each individual flow is balanced independently across the SEs. If many flows transmit cells across some SE at the same time, the cells will experience long delays. Many applications, e.g. voice and video, require rate and delay guarantees. The worst case utilizations for balancing algorithms that provide rate and delay guarantees have been assessed.
  • Time is divided into frames of F cells, and each input-output pair is guaranteed a specified number of time slots per frame, for example a ij time slots are guaranteed to input-output pair (i, j), 0 ⁇ i, j ⁇ N.
  • Each input, and each output can be assigned at most F u time slots per frame, i.e.
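  • The inequality that the "i.e." introduces is not reproduced in the extracted text; from the definitions of a ij and F u it is presumably the standard per-frame admission condition, reconstructed below.

```latex
% Presumed reconstruction of the per-frame admission constraints:
% no input i and no output j is assigned more than F_u time slots per frame.
\sum_{0 \le j < N} a_{ij} \le F_u
\qquad \text{and} \qquad
\sum_{0 \le i < N} a_{ij} \le F_u .
```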
  • N f is the maximum number of flows passing through some connection that are separately balanced.
  • Let F c ′ denote the maximum number of cells per frame sent from a given input SE through a given center SE. It holds that:
  • N f ′ denotes the number of flows sourced by SE 1i that pass through the links from this SE to center SEs.
  • ⁇ x ⁇ is the smallest integer no less than x, i.e. ⁇ x ⁇ x+1.
  • The maximum number of cells sourced by SE 1i that may happen to be transmitted through a given center SE, say SE 2j , has been found. It was assumed that out of the N f ′ flows sourced by SE 1i , N f ′ − n flows are assigned one time slot per frame, and the remaining n flows are assigned max(0, nF u − (N f ′ − n)) time slots per frame. If it happens that the first cells in a frame of all flows are sent through SE 2j , the total number of cells per frame transmitted through SE 2j from SE 1i will be:
  • N f ′′ denotes the number of flows bound to SE 3k that pass through the links from center SEs to this output SE.
  • N f ′′ − n flows that pass through SE 2j may transmit one cell per frame, and n flows may transmit the remaining max(0, nF u − N f ′′ + n) cells. If it happens that the first cells in a frame of all flows are sent through SE 2j , the upper bound in (11) is almost reached, and the claim of the lemma follows.
  • Theorem 2 Maximum utilization of any internal link in the fabric under which all cells pass it within designated frames is:
  • N f is the maximum number of flows sourced by any input SE or bound for any output SE, i.e. the maximum number of flows that are passing through some internal link of the fabric.
  • Lemma 5 In load balancing algorithms with synchronized counters, if:
  • ⁇ x ⁇ is the smallest integer not greater than x i.e. ⁇ x ⁇ x. So, the number of cells from SE 1i through SE 2(n-1) is:
  • $F_c' \le \sum_{0 \le g < N_f'} \left\lfloor \frac{f'_{ig} + (i+g) \bmod n}{n} \right\rfloor \le F_u + \frac{N_f'}{n} \cdot \frac{n-1}{2} \le F_u + \frac{N_f'}{2} \qquad (18)$
  • F c ′ is maximal for:
  • Lemma 6 Maximum utilization of the links from input to center SEs, when the counters are synchronized is:
  • Lemma 7 In load balancing algorithms with synchronized counters, if:
  • Lemma 8 Maximum utilization of the links from center to output SEs when the counters are reset each frame is:
  • Theorem 3 In the algorithms where balancing of different flows is synchronized, maximum utilization of any internal link in the fabric under which all cells pass it within designated frames is:
  • Theorem 3 provides the maximum utilization when both balancing of flows sourced by an input SE, and balancing of flows bound for an output SE are synchronized. This assumption holds in all the algorithms.
  • R is a bit-rate at which data is transmitted through the fibers
  • R c is a bit-rate at which data is transmitted through the fabric connections.
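  • The speedup formula itself is missing from the extracted text. Given these definitions of R and R c , and the requirement that the l internal connections carry what the n external links deliver, a plausible reading (an assumption, not the patent's stated formula) is:

```latex
% Assumed form of the fabric speedup: internal over external transmission capacity,
% so that S = 1 when l = n and R_c = R.
S = \frac{l \, R_c}{n \, R}
```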
  • Theorem 4 The speedup S required to pass all incoming packets with a tolerable delay when counters are not synchronized is:
  • In the first load balancing algorithm, N f = N; in the second, N f = nN.
  • In the third load balancing algorithm, N f = n, because any input SE sources n flows, and each of n input SEs balances one flow for any output SE.
  • In the fourth load balancing algorithm, N f = N, because any input SE sources N flows, and each of n input SEs balances n flows for any output SE.
  • the second load balancing algorithm is least efficient, while the third algorithm is most efficient.
  • the utilization is improved if the frame length is increased.
  • the cell delay is proportional to the frame length. So the maximum frame length is determined by the delay that could be tolerated by the applications, such as interactive voice and video. Assume that the maximum delay that can be tolerated by interactive applications is D, and the cell time slot duration is T c , then
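  • The formula following "then" is missing from the extracted text; since the cell delay is proportional to the frame length, it is presumably a bound of the form below (up to a small constant factor), shown with a worked check for D = 3 ms and T c = 50 ns.

```latex
% Presumed frame-length bound, with a worked numeric example:
F \, T_c \le D
\quad \Longrightarrow \quad
F \le \frac{D}{T_c} = \frac{3\,\mathrm{ms}}{50\,\mathrm{ns}} = 60{,}000 \ \text{cell time slots}.
```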
  • packet delay that can be tolerated by interactive applications is around 150 ms, but only 50-60 ms of this allowed delay can be budgeted for the queueing.
  • a switch delay as low as 3 ms may be required for various reasons. For example, packets might pass through multiple packet switches from their sources to the destinations, and the packet delays through these switches would add up. Also, in order to provide flexible multicasting, the ports should forward packets multiple times through the packet switch, and the packet delay is prolonged accordingly (Chaney et al., Proceedings of INFOCOM 1997, 1:2-11 (1997); A. Smiljanić, “Scheduling of Multicast Traffic in High-Capacity Packet Switches,” IEICE/IEEE Workshop on High-Performance Switching and Routing, May 2002, pp. 29-33).
  • FIG. 2 shows that the fabric utilization decreases as the switch size increases, for various tolerable delays.
  • results are shown for T c = 50 ns and T c = 100 ns.
  • FIG. 3 shows the fabric utilization for the load balancing algorithms that reset counters to the specified values every frame.
  • results are shown for T c = 50 ns and T c = 100 ns.
  • the utilization of the transmission capacity is maximized to 100% by implementing the switching fabric with a speedup.
  • the speedup required to provide non-blocking varies for different load balancing algorithms.
  • required speedups can be obtained from formula (46) to be:
  • FIG. 4 shows the fabric speedup that provides non-blocking through a switch for various delay requirements.
  • results are shown for T c = 50 ns and T c = 100 ns.
  • the second load balancing algorithm requires speedups larger than 4 and 11 in order to provide a delay of less than 3 ms through a switch with 1000 and 4000 ports, respectively.
  • the speedup required when the first and fourth load balancing algorithms are applied is close to 1 for all switch parameters.
  • FIG. 5 shows the fabric speedup that provides non-blocking through a switch for various delay requirements in the case when the counters used for balancing are synchronized.
  • results are shown for T c = 50 ns and T c = 100 ns.
  • the second load balancing algorithm requires speedups larger than 2 and 10 in order to provide a delay of less than 3 ms through a switch with 1000 and 4000 ports, respectively.
  • the required speedup is sometimes decreased when the counters are synchronized. No speedup is needed when the first and fourth load balancing algorithms are applied and the counters are synchronized.
  • It is preferable that cells bound for an output SE are spread equally across center SEs, or that input SEs spread cells across center SEs (N f ≦ N). Since the performance improves as the number of balanced flows decreases, all algorithms for which N f ≦ N perform well. However, the implementation of the algorithms where input SEs balance the traffic may be more complex and, consequently, less scalable. First, inputs have to exchange information with the SE arbiter. Second, counters of the arbiter should be updated n times per cell time slot, which may require advanced processing capability and may limit the number of SE ports, i.e. the total switch capacity. Also, these algorithms assume SEs with shared buffers, whose capacity was shown to be smaller than the capacity of crossbar SEs.
  • FIG. 6 ( a ) shows the switch frame boundaries, while the lower axes in FIG. 6 ( b ) and ( c ) show the port frame boundaries. The signal SB toggles at the switch frame boundaries shown in FIG. 6 ( b ), and the signal FB toggles at the port frame boundaries shown in FIG. 6 ( c ).
  • In each SE, high priority cells are served first, and their number is limited according to the various admission control conditions that were described above. On the other side, there are no limits for low priority cells, which are served if they can get through after the high-priority cells are served. By limiting the number of high-priority cells with the above equation, they are served with the guaranteed delay. If there is any resource left, namely time slots in which some input and output are idle, and there are lower priority cells between them, the lower priority cells are served without any delay guarantees.
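  • A minimal sketch of this two-priority service discipline follows; the queue structures and names are assumptions for illustration, not the patent's implementation.

```python
from collections import deque

def serve_time_slot(high_q: deque, low_q: deque, slots: int):
    """Serve one cell time slot at an SE: high priority first, then leftovers."""
    served = []
    while slots > 0 and high_q:        # admission-controlled, delay-guaranteed
        served.append(high_q.popleft())
        slots -= 1
    while slots > 0 and low_q:         # best effort, no delay guarantees
        served.append(low_q.popleft())
        slots -= 1
    return served
```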
  • a significant amount of traffic on the Internet is multicast in nature; i.e. it carries the information from one source to multiple destinations. Scheduling of multicast packets in switches is a complicated task. If a multicast packet is scheduled to be simultaneously transmitted to all destination outputs, it may be unacceptably delayed. On the other side, if the multicast packet is scheduled to be separately transmitted to all destination outputs, its transmission may consume an unacceptably large portion of the input transmission capacity.
  • a multicast input sends multicast packets to a limited number of destinations, and each multicast destination output that received the packets forwards them to a limited number of destination outputs which have not received them yet, and such forwarding continues until all destination outputs have received all the packets.
  • by selecting P, i.e. the number of destination outputs to which a packet is forwarded from one port, the switch utilization and the guaranteed delay can be adjusted (A. Smiljanić, IEICE/IEEE Workshop on High-Performance Switching and Routing, May 2002, pp. 29-33; A. Smiljanić, IEEE Communication Magazine, November 2002, pp. 72-77).
  • Packets can be forwarded in two ways.
  • In the first way, a port separately transmits a multicast packet to each of its destination ports. Then, the packet flow is determined solely based on its input and output ports, as in the case of unicast packets.
  • In the second way, a port transmits only one copy of a multicast packet to the Clos network. The multicast packet is transmitted through the network until the last SE from which it can reach some destination port, where it is replicated and its copies are routed separately through the remainder of the network. So, the multicast flow is balanced in stages before the packet replication starts. In this case, the packet flow is determined by its input port and its multiple destination ports. Obviously, the number of flows is increased in this way, and the performance of load balancing is degraded.
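  • The limited-fanout forwarding described above can be modeled as rounds in which every port that already holds the packet forwards it to at most P ports that do not; the sketch below is an assumed model for illustration, under which the number of rounds grows roughly logarithmically with the number of destinations.

```python
def forwarding_rounds(num_destinations: int, P: int) -> int:
    """Rounds until all destinations hold the packet, with fanout <= P per port."""
    have = 1                             # the multicast input holds the packet
    rounds = 0
    remaining = num_destinations
    while remaining > 0:
        sent = min(have * P, remaining)  # each holder forwards to at most P new ports
        have += sent
        remaining -= sent
        rounds += 1
    return rounds

# e.g. forwarding_rounds(100, P=2) == 5: holders grow as 1, 3, 9, 27, 81, ...
```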
  • Improvement in the performance of load balancing of unicast and multicast flows in a fabric can be accomplished by increasing the frame length, balancing flows among different internal SEs, implementing the fabric with a speedup, or combinations thereof.
  • the methods of the present invention can be implemented by an article of manufacture which comprises a machine readable medium containing one or more programs which when executed implement the steps of the methods of the present invention.
  • the methods of the present invention can be implemented using a conventional microprocessor programmed according to the teachings of the present specification, as will be apparent to those skilled in the computer art.
  • Appropriate software coding can readily be prepared by skilled programmers based on the teachings of the present disclosure, as will be apparent to those skilled in the software art.
  • the invention may also be implemented by the preparation of application specific integrated circuits (ASIC), configurable logic blocks, field programmable gate arrays, or by interconnecting an appropriate network of conventional circuit components, as will be readily apparent to those skilled in the art.
  • the article of manufacture can comprise a storage medium, which can include, but is not limited to, Random-Access Memory (RAM) for storing lookup tables.
  • the assignment of cells to a flow comprises inputting the i, j designation of a cell into a lookup table, which assigns to the cell an input and output set, an input and output subset, and the flow of the cell.
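  • A minimal sketch of such a lookup follows; the table layout and names are assumptions for illustration, and a hardware realization would keep the table in RAM as noted above.

```python
# Hypothetical flow lookup table: the (i, j) designation of a cell maps to
# its input/output sets, input/output subsets, and flow identifier.
flow_table = {
    # (i, j): (input set, output set, input subset, output subset, flow id)
    (0, 1): (0, 0, 0, 1, 0),
    (1, 0): (0, 0, 1, 0, 1),
}

def assign_flow(i: int, j: int) -> int:
    """Return the flow id for a cell with designation (i, j)."""
    return flow_table[(i, j)][-1]
```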
  • the methods of the present invention can be implemented by an apparatus which comprises a flow control device configured to perform the steps of the invention.
  • the apparatus can also comprise a counter module configured to assign counters to each flow pursuant to the methods of the invention.
  • the present invention also includes a multistage non-blocking fabric which comprises a network of switches that perform the method steps of the invention.
  • the fabric comprises at least one internal switching element (SE) stage, wherein the stage has l internal switching elements, an input SE stage, an output SE stage, input ports which are divided into input sets wherein each input set consists of input ports that transmit through the same input SE, and wherein the input sets are further divided into input subsets, and output ports which are divided into output sets wherein each output set consists of output ports that receive cells through the same output SE, and wherein the output sets are further divided into output subsets, and a flow assignment module wherein the module assigns cells which are received into the fabric to a flow.
  • the assignment module comprises a lookup table.

Abstract

The present invention provides a method for balancing unicast or multicast flows in a multistage non-blocking fabric, wherein the fabric comprises at least one internal switching element (SE) stage, an input SE stage and an output SE stage. The method comprises: (a) receiving cells into the fabric wherein each cell is associated with an input subset and associated with an output subset according to the source and destination address of the cell, (b) assigning each cell to a flow, wherein cells sourced from the same input subset, and bound for the same output subset, or multiple output subsets, are assigned to the same flow, and (c) transmitting flows through the internal SE stage wherein cells of a particular flow are distributed among the internal switching elements, wherein the quantity of cells of each particular flow transmitted at each internal SE differs by at most h, wherein h is positive, whereby the flow in the fabric is balanced.

Description

  • This application claims benefit from U.S. provisional Application Ser. No. 60/496,978, filed on Aug. 21, 2003, which application is incorporated herein by reference in its entirety.
  • TECHNICAL FIELD
  • The invention relates generally to methods, and apparatuses, for balancing data flows through multistage networks.
  • BACKGROUND OF THE INVENTION
  • The Clos circuit switch was proposed by Clos in 1953 at Bell Labs (C. Clos, “A study of non-blocking switching networks,” Bell Systems Technology Journal 32:406-424 (1953)). FIG. 1 shows the connections between switching elements (SE) in a symmetric Clos three-stage switch. This interconnection rule is: the xth SE in some switching stage is connected to the xth input of each SE in the next stage (C. Clos, 32:406-424 (1953); J. Hui, Switching and Traffic Theory for Integrated Broadband Networks, Kluwer Academic Press 1990; F. K. Hwang, The mathematical theory of nonblocking switching networks, World Scientific, 1998). Here, all connections have the same bandwidths. It has been shown that a circuit can be established through the Clos switching fabric without rearranging existing circuits as long as the number of SEs in the second stage is at least twice the number of inputs of an SE in the first stage, i.e. l≧2n. It has also been shown that a circuit can be established through the Clos switching fabric as long as the number of SEs in the second stage is no less than the number of inputs of an SE in the first stage, i.e. l≧n. In the latter case, the number of required SEs and their total capacity are smaller due to the fact that the existing circuits can be rearranged. While the complexity of the switching fabric hardware is reduced, the complexity of the algorithm for a circuit setup is increased. In both cases, the non-blocking property of the Clos architecture has been proven assuming specific algorithms for circuit setup (F. K. Hwang, World Scientific, 1998). Various implications of Clos' findings have been examined in W. Kabacinski et al., “50th anniversary of Clos networks,” IEEE Communication Magazine, 41(10):26-64 (October 2003).
  • The Clos switching fabric can be used for increasing the capacity of packet switches as well. The interconnection of SEs would be the same as in the circuit switch case. However, these SEs should be reconfigured in each cell time slot based on the outputs of outstanding cells. Here, packets are split into cells of a fixed duration which is typically 50 ns (64 bytes at 10 Gb/s). Algorithms for circuit setup in Clos circuit switches cannot be readily applied in Clos packet switches. First, all SEs should be synchronized on a cell-by-cell basis. Then, an implementation of the algorithm that rearranges connections on a cell-by-cell basis in SEs of a rearrangeable non-blocking Clos switch would be prohibitively complex (J. Hui, Kluwer Academic Press 1990). So, the Clos fabric with the larger hardware, l=2n, is needed for a non-blocking packet switch. A scheduling algorithm that would provide non-blocking in a Clos packet switch would require higher processing complexity than its counterpart designed for a cross-bar switch (A. Smiljanić, “Flexible bandwidth allocation in terabit packet switches,” Proceedings of IEEE Conference on High Performance Switching and Routing, June 2000, pp. 233-241; A. Smiljanić, “Flexible Bandwidth Allocation in High-Capacity Packet Switches,” IEEE/ACM Transactions on Networking, April 2002, pp. 287-293). A few heuristics have been proposed to configure SEs in Clos packet switches without assessment of their blocking nature (McDermott et al., “Large-scale IP router using a high-speed optical switch element,” OSA Journal on Optical Networking, www.osa-jon.org, July 2003, pp. 228-241; Oki et al., “Concurrent round-robin-based dispatching schemes for Clos-network switches,” IEEE/ACM Transactions on Networking, 10(6):830-844 (December 2002)).
  • On the other side, it has been recognized that a Clos packet switch in which the traffic load is balanced across the SEs provides non-blocking, i.e. with sufficiently large buffers it passes all the traffic if the outputs are not overloaded. Such an architecture has been described in Chaney et al., “Design of a gigabit ATM switch,” Proceedings of INFOCOM 1997, 1:2-11 (1997) and J. S. Turner, “An optimal nonblocking multicast virtual circuit switch,” Proceedings of INFOCOM 1994, 1:298-305 (1994). Turner showed that the architecture is non-blocking if the traffic of each multicast session is balanced over the SEs in a Benes packet switch. Here the multicast session carries the information between end users in the network.
  • However, the delay that packets experience through the Clos switch has not been assessed. Delay guarantees are important for various applications, for example, interactive voice and video, web browsing, streaming etc. In previous work, flows of data belonging to individual multicast sessions were balanced over switching elements (SEs) in the middle stage. The delay for such a load balancing mechanism is too long. When acceptable delays are guaranteed for sensitive applications, the utilization of the mechanism that balances loads of individual sessions decreases unacceptably with switch size (A. Smiljanic, “Performance load balancing algorithm in Clos packet switches,” Proceedings of IEEE Workshop on High Performance Switching and Routing, 2004; A. Smiljanic, “Load balancing algorithm in Clos packet switches,” Proceedings of IEEE International Conference on Communications, 2004). Accordingly, a challenge in the field is providing a minimum required delay guarantee without unacceptably decreasing fabric utilization.
  • BRIEF DESCRIPTION OF FIGURES
  • FIG. 1 is a diagram of a Clos switching fabric.
  • FIG. 2 is a graph of the switch utilization: solid curves represent the algorithm in which inputs balance flows bound for output SEs, and the algorithm in which input SEs balance flows bound for outputs; dashed curves correspond to the algorithm in which inputs balance flows bound for outputs.
  • FIG. 3 is a graph of the switch utilization when counters are reset each frame, i.e. synchronized: solid curves represent the algorithm in which inputs balance flows bound for output SEs, and the algorithm in which input SEs balance flows bound for outputs; dashed curves correspond to the algorithm in which inputs balance flows bound for outputs.
  • FIG. 4 is a graph of the non-blocking switch speedup: solid curves represent the algorithm in which inputs balance flows bound for output SEs, and the algorithm in which input SEs balance flows bound for outputs; dashed curves correspond to the algorithm in which inputs balance flows bound for outputs.
  • FIG. 5 is a graph of the non-blocking switch speedup when the counters are reset each frame, i.e. synchronized: solid curves represent the algorithm in which inputs balance flows bound for output SEs, and the algorithm in which input SEs balance flows bound for outputs; dashed curves correspond to the algorithm in which inputs balance flows bound for outputs.
  • FIG. 6 is a diagram of a synchronization of the packet scheduling.
  • SUMMARY OF THE INVENTION
  • The present invention pertains to load balancing algorithms for non-blocking multistage packet switches. These algorithms allow for maximization of fabric utilization while providing a guaranteed delay.
  • In one embodiment, the present invention provides a method for balancing unicast or multicast data flow in a multistage non-blocking fabric. The fabric comprises at least one internal switching element (SE) stage, wherein the stage has l internal switching elements, and wherein each internal switching element is associated with a unique numerical identifier.
  • In the method, the input ports of the fabric are grouped into input sets whereby each input set consists of input ports that transmit through the same input SE. The input sets are further divided into input subsets, designated by i. The output ports of the fabric are also grouped into output sets whereby each output set consists of output ports that receive cells through the same output SE. The output sets are further divided into output subsets, designated by j.
  • Data cells are received into the fabric. If a cell is a unicast cell, then the cell is associated with an input subset i and associated with an output subset j based on the input port and the output port of the cell. On the other hand, if a cell is a multicast cell, then the cell is associated with an input subset and associated with multiple output subsets based on the input port and the output ports of the cell. Each cell is then assigned a flow. If the cells are unicast cells, then the cells which are associated with the same input subset i and associated with the same output subset j are assigned to the same flow. On the other hand, if the cells are multicast cells, then the cells which are associated with the same input subset and associated with the output subsets of the same output sets are assigned to the same flow.
  • The flows are then transmitted through the internal SE stage wherein cells of a particular flow are distributed among the internal switching elements. The quantity of the cells of each particular flow transmitted at each internal SE differs by at most h, wherein h is positive, preferably equal to one.
  • In this method, the number of subsets of at least one input set or at least one output set is less than n, wherein n is the number of ports of that input SE or of that output SE. N is the total number of input ports and output ports. Nf is the maximum number of flows whose cells pass any given link. The variables of n, N, Nf, h, i, j and l are natural numbers. One or more flows are received by the fabric simultaneously.
  • Preferably, the flows are distributed among the internal SE stage by using a counter. For example, a unique counter is associated with each flow, designated as cij. The counter for each flow is initialized with a number less than or equal to l. A cell from a particular flow is transmitted through the internal switching element associated with a numerical identifier which is equal to the numerical value of the counter. After the cell has been transmitted through that internal switching element, the numerical value of the counter is changed by decrementing or incrementing the counter modulo l. Thus, if another cell of the particular flow is received, then the cell will be transmitted through the internal switching element associated with the updated numerical value of the counter, i.e. through a different internal SE. Then, after transmission, the counter is again changed by decrementing or incrementing the counter modulo l. This process continues until there are no longer any cells received for the particular flow. The process is performed for cells of each flow.
  • The counters can be varied in any way which would allow for a sufficient variation of the internal switching elements used to transmit cells of the same flow. Preferably, the counter is varied by the following formula: (cij+1) mod l, wherein l is the number of SEs in the internal SE stage.
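  • The following minimal sketch (illustrative names, not the patent's implementation) shows the per-flow counter mechanism: the counter value selects the internal SE for the next cell of the flow and is then incremented modulo l.

```python
class LoadBalancer:
    """Hypothetical per-flow counters c_ij over l internal switching elements."""

    def __init__(self, l: int):
        self.l = l     # number of internal SEs, identified 0 .. l-1
        self.c = {}    # counters c_ij, keyed by flow (i, j)

    def route(self, i: int, j: int) -> int:
        se = self.c.get((i, j), 0)           # SE identifier equals the counter value
        self.c[(i, j)] = (se + 1) % self.l   # c_ij <- (c_ij + 1) mod l
        return se

balancer = LoadBalancer(l=4)
# successive cells of flow (0, 0) traverse SEs 0, 1, 2, 3, 0, ... so the
# per-SE cell counts of the flow differ by at most one (h = 1)
assert [balancer.route(0, 0) for _ in range(5)] == [0, 1, 2, 3, 0]
```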
  • In another embodiment, the present invention provides a flow control device which embodies the methods of the invention.
  • In a further embodiment, the present invention provides a multistage non-blocking fabric which embodies the methods of the invention.
  • For a better understanding of the present invention, reference is made to the following description, taken in conjunction with the accompanying drawings, and the scope of the invention set forth in the claims.
  • DETAILED DESCRIPTION OF THE INVENTION
  • The present invention pertains to load balancing algorithms for balancing data flow in a multistage non-blocking fabric (e.g. packet switching networks). A non-blocking fabric is defined as a fabric in which all the traffic for a given output gets through to its destination as long as the output port is not overloaded. These algorithms allow for maximization of fabric utilization while providing for a guaranteed delay. In these algorithms, either inputs or input SEs may balance traffic, and flows to either output SE or outputs may be balanced separately. A fabric comprises packet switches. A packet switch is a system that is connected to multiple transmission links and does the central processing for the activity of a packet switching network where the network consists of switches, transmission links and terminals. The transmission links are connected to network equipment, such as multiplexers (MUX) and demultiplexers (DMUX). A terminal can be connected to the MUX/DMUX or it can be connected to the packet switch system. Generally, the packet switch consists of input and output transmission link controllers and the switching fabric. The input and output link controllers perform the protocol termination traffic management and system administration related to transmission jobs and packet transmission. These controllers also process the packets to help assist in the control of the internal switching of the switching fabric. The switching fabric of the packet switch performs space-division switching which switches each packet from its source link to its destination link.
  • A multistage fabric for the purposes of this specification comprises several switching element (SE) stages with a web of interconnections between adjacent stages. There is at least one internal switching element (SE) stage, wherein the stage has l internal switching elements, and wherein each internal switching element is associated with a unique numerical identifier. An internal SE stage is a stage that is between the input SE stage and the output SE stage.
  • Each SE stage consists of several basic switching elements where the switching elements perform the switching operation on individual cells. So, each cell is processed by the distributed switching elements without a central control scheme, and thus high throughput switching can be achieved.
  • The methods of the present invention can be applied to packets of variable length or packets of fixed length. If the packets received from the input links are of variable length, they are fragmented into fixed-size cells. Variable-length packets are preferably transmitted according to Ethernet protocol. If the packets arriving to the switch all have a fixed length, no fragmentation is required. Such packets are transmitted in accordance with asynchronous transfer mode (ATM) protocol. For the purposes of this invention, a packet of fixed length or a packet of variable length is referred to as a cell.
  • In the algorithms, the input ports of the fabric are grouped into input sets whereby each input set consists of input ports that transmit through the same input SE. The input sets are divided into input subsets. The output ports of the fabric are also grouped into output sets whereby each output set consists of output ports that receive cells through the same output SE. The output sets are divided into output subsets. Sets can be divided so that each input port and/or each output port belong to only one subset. Alternatively, sets can be divided so that each input port and/or each output port belong to more than one subset. The grouping into sets and division into subsets is made in any efficient manner as would be known by a skilled artisan.
  • For example, a fabric which comprises 100 input ports and 100 output ports can have the ports grouped into sets of five, i.e. input ports 1-5 belong to set one, and output ports 1-5 belong to set one; input ports 6-10 belong to set two, and output ports 6-10 belong to set two; etc. Then the input sets and output sets can be divided into subsets of, for example, even and odd numbered ports. So, in this example, input subsets would be (1,3,5), (2,4), (6,8,10), (7,9) etc.
  • In one preferred embodiment, each input port belongs to one subset. In another preferred embodiment, one or more of the input ports belong to at least two input subsets. Analogously, in one embodiment, each output port belongs to one subset. In another embodiment, one or more of the output ports belong to at least two output subsets.
  • Preferably, the number of subsets, and so the number of flows, is as small as possible. For example, if SEs are cross-bars, the input subsets can be equal to the input ports themselves, and output subsets can be equal to the output sets themselves. Or, if SEs are shared buffers, input subsets can be equal to either input ports or input sets, while output subsets can be equal to the output sets.
  • In some algorithms, input subsets can be equal to either input ports or input sets, while output subsets can be equal to either output ports or output sets. In the first load balancing algorithm of the invention, cells from some input port bound for a particular output SE are spread equally among internal SEs. In the second algorithm, cells from some input port bound for a particular output port are spread equally among internal SEs. In the third and fourth algorithms, the load is balanced by input SEs, e.g., an arbiter associated with each input SE determines to which internal SE a cell will be transmitted. In the third algorithm, cells transmitted from an input SE to some output SE are spread equally across the internal SEs. In the fourth algorithm, cells transmitted from an input SE to some output port are spread equally across the internal SEs.
  • The methods of the invention are used for both unicast and multicast cells. Cells are received into the fabric. Characteristics of cells being transmitted according to the Internet Protocol (IP) are identified from the packet headers. The packet header contains the source IP address, and the destination IP address. From these addresses, the i, j designation of the cell is obtained, where i is the designation of input subset and j is the designation of the output subset. Based on the i, j designations, each cell is assigned a flow by the following algorithms of the invention. A flow can contain an indefinite number of cells.
  • If a cell is a unicast cell, then the cell is associated with an input subset i and associated with an output subset j based on the input port and the output port of the cell. Then the cells which are associated with the same input subset and associated with the same output subset are assigned to the same flow.
  • Alternatively, if a cell is a multicast cell, then the cell is associated with an input subset i and associated with multiple output subsets {j} based on the input port and the multiple output ports of the cell, wherein {j} designates a set of output subsets. Then the cells which are associated with the same input subset and associated with the output subsets of the same output sets are assigned to the same flow.
  • As a way of illustration, using the example above, unicast cells that have the following input ports (x), and output port (y) are assigned to the same flow: (2, 1), (2, 3), (2, 5), (4, 1), (4, 3), (4, 5). As another example, cells that have the following i, j designations are assigned to the same flow: (2, 2), (2, 4), (4, 2), (4, 4).
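  • The example groupings above can be checked with a short script; the set and subset encodings below are assumptions chosen to match the example of 100 ports grouped in fives and split by parity.

```python
def subset(port: int) -> tuple:
    """(set index, parity) under the example grouping: ports 1-5 are set one, etc."""
    return ((port - 1) // 5, port % 2)

def unicast_flow(x: int, y: int) -> tuple:
    """Cells with equal (input subset, output subset) keys share a flow."""
    return (subset(x), subset(y))

# all six cells listed above fall into a single flow, as stated
cells = [(2, 1), (2, 3), (2, 5), (4, 1), (4, 3), (4, 5)]
assert len({unicast_flow(x, y) for x, y in cells}) == 1
```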
  • The number of subsets of at least one input set or at least one output set is less than n, wherein n is the number of ports of that input SE or of that output SE. N is the total number of input ports and output ports. Nf is the maximum number of flows whose cells pass any given link. The variables of n, N, Nf, h, i, j and l are natural numbers. These variables are defined by the particular fabric with which the invention is used as would be known by a skilled artisan. One or more flows are received by the fabric simultaneously.
  • The flows are transmitted through the internal SE stage, wherein cells of a particular flow are distributed among the internal switching elements. The number of cells of each particular flow transmitted at each internal SE differs by at most h, wherein h is positive. Preferably, h is less than 50, less than 25, less than 20, less than 15, less than 10, or less than 5. Most preferably, h is equal to one.
  • An alternate manner by which to generally define flow follows. Two cells of the same unicast flow must be sourced by the same input set or be bound to the same output sets. Two cells of the same multicast flow must be sourced by the same input sets or be bound to the same sets of output sets.
  • Preferably, the flows are distributed among the internal SE stage by using a counter. For example, a unique counter is associated with each flow, designated as cij, wherein i is the numerical identifier of an associated input subset and j is the numerical identifier of an associated output subset;
  • The counter for each flow is initialized with a number less than or equal to l. A cell from a particular flow is transmitted through the internal switching element associated with a numerical identifier which is equal to the numerical value of the counter. After the cell has been transmitted through that internal switching element, the numerical value of the counter is changed by decrementing or incrementing the counter modulo l. Thus, if another cell of the particular flow is received, then the cell will be transmitted through the internal switching element associated with the updated numerical value of the counter, i.e. through a different internal SE. Then, after transmission, the counter is again changed by decrementing or incrementing the counter modulo l. This process continues until there are no longer any cells received for the particular flow. The process is performed for cells of each flow. The variable cij is a natural number.
  • A counter can be varied in any way which allows for a sufficient distribution of cells of the same flow among the internal switching elements. For example, the counter is varied by the formula (cij+p) mod l, wherein gcd(p,l)=1, where gcd denotes the greatest common divisor. Preferably, the counter is varied by the formula (cij+1) mod l, wherein l is the number of SEs in the internal SE stage. Alternatively, the counters can be varied in a random fashion.
  • In the first load balancing algorithm, input port i, 0≦i<N, has m different counters associated with different output SEs, cij, 0≦j<m. Here N=nm is the number of switch input and output ports. A cell arriving to input port i and bound for the jth output SE is marked to be transmitted through the cijth output of its SE, i.e. to be transmitted through the cijth center SE. Then, the counter in question is varied. For example, the counter is incremented modulo l, namely cij←(cij+1) mod l.
  • In the second load balancing algorithm, input i, 0≦i<N, stores N counters associated with different switch outputs, cij, 0≦j<N. A cell arriving to input port i and bound for the jth switch output port is marked to be transmitted through the cijth output of its SE, i.e. to be transmitted through the cijth center SE. Then, the counter in question is varied, e.g., incremented modulo l.
  • In the third load balancing algorithm, input SE i, 0≦i<m, stores m different counters associated with different output SEs, cij, 0≦j<m. A cell arriving to input SE i and bound for the jth output SE is marked to be transmitted through the cijth output of its SE, i.e. to be transmitted through the cijth center SE. Then, the counter in question is varied, e.g., incremented modulo l.
  • In the fourth load balancing algorithm, input SE i, 0≦i<m, stores N counters associated with different switch outputs, cij, 0≦j<N. A cell arriving to input SE i and bound for the jth switch output port is marked to be transmitted through the cijth output of its SE, i.e. to be transmitted through the cijth center SE. Then, the counter in question is incremented modulo l.
  • In certain preferred embodiments of the invention, the method further comprises grouping cell time slots into frames of length F. In some of such embodiments, the counter of each flow is set at the beginning of each frame. The counter is set to cij=(i+j) mod l, where i may be either an input or an input SE, and j may be either an output or an output SE.
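  • The counter mechanics described above admit a compact sketch (hypothetical Python, assuming l internal SEs, a stride p with gcd(p,l)=1, and the per-frame reset cij=(i+j) mod l; the class and method names are illustrative only, not from the specification):

```python
import math

class FlowBalancer:
    """Hypothetical sketch of per-flow counters c_ij over l center SEs."""

    def __init__(self, l: int, p: int = 1):
        assert math.gcd(p, l) == 1, "stride must be coprime with l"
        self.l, self.p = l, p
        self.counters = {}  # flow key (i, j) -> c_ij

    def route(self, i: int, j: int) -> int:
        """Pick the center SE for the next cell of flow (i, j), then vary c_ij."""
        c = self.counters.setdefault((i, j), (i + j) % self.l)
        self.counters[(i, j)] = (c + self.p) % self.l
        return c

    def frame_reset(self):
        """Synchronize flows at a frame boundary: c_ij = (i + j) mod l."""
        for (i, j) in self.counters:
            self.counters[(i, j)] = (i + j) % self.l

# Consecutive cells of one flow visit all l center SEs before repeating:
b = FlowBalancer(l=4)
assert sorted(b.route(0, 1) for _ in range(4)) == [0, 1, 2, 3]
```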
  • In the embodiments wherein cell time slots are grouped into frames of length F, preferably, in each frame input port (i) can transmit up to aij cells to output port (j). The following bounds hold:
  • $$\sum_k a_{ik} \le SF - N_f, \qquad \sum_k a_{ki} \le SF - N_f$$
  • where S is the switching fabric speedup. Preferably, in this embodiment, the fabric speedup is defined as:
  • $$S = 1 + \frac{N_f}{F},$$
  • wherein:
  • $$\sum_k a_{ik} \le F, \qquad \sum_k a_{ki} \le F.$$
  • In this case, the utilization of the fabric is maximized. In this embodiment, with fabric speedup defined in any manner, preferably, at each stage only cells that have arrived in the same frame are transmitted to the next stage, wherein F=D/(3Tc), or F=D/(4Tc) if cells are reordered at the outputs, wherein D is the maximum tolerable delay and Tc is the cell time slot duration. Namely, cells passing through different center SEs may lose correct ordering, i.e. a cell that is transmitted earlier through some center SE may arrive at the output later than a cell that is transmitted later through another center SE. For this reason, cell reordering may be required at the switch outputs. In certain preferred embodiments of the invention, the number of flows should fulfill the inequality

  • $$N_f \le (S-U) \cdot D/T_c,$$
  • where S is switching fabric speedup, U is targeted utilization of the switching fabric, D is the maximum tolerable delay and Tc is cell time slot duration.
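  • A worked numerical check of the two bounds above (illustrative values only; nothing here is prescribed by the invention):

```python
# Speedup that maximizes utilization for F = 1000 slots per frame and
# N_f = 200 flows, per S = 1 + N_f / F:
F, N_f = 1000, 200
print(1 + N_f / F)            # 1.2

# Conversely, the largest admissible number of flows for speedup S = 1,
# target utilization U = 0.9, delay budget D = 3 ms and cell slot
# T_c = 50 ns, per N_f <= (S - U) * D / T_c:
S, U, D, T_c = 1.0, 0.9, 3e-3, 50e-9
print((S - U) * D / T_c)      # approximately 6000 flows
```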
  • In a further embodiment wherein cell time slots are grouped into frames of length F, and wherein in each frame aij cells can be transmitted from input port (i) to output port (j), preferably, the numbers of flows sourced by an input SE or bound for an output SE that are balanced starting from different internal SEs differ by at most one, wherein:
  • $$\sum_k a_{ik} \le \begin{cases} SF - \frac{N_f}{2}, & F \ge \frac{N_f}{S} \\ \frac{(SF)^2}{2N_f}, & F < \frac{N_f}{S} \end{cases}, \qquad \sum_k a_{ki} \le \begin{cases} SF - \frac{N_f}{2}, & F \ge \frac{N_f}{S} \\ \frac{(SF)^2}{2N_f}, & F < \frac{N_f}{S} \end{cases}$$
  • where S is the switching fabric speedup. In this embodiment, speedup is preferably defined as follows:
  • $$S = \begin{cases} 1 + \frac{N_f}{2F}, & F \ge \frac{N_f}{2} \\ \sqrt{\frac{2N_f}{F}}, & F < \frac{N_f}{2} \end{cases},$$
  • and wherein
  • $$\sum_k a_{ik} \le F, \qquad \sum_k a_{ki} \le F,$$
  • whereby utilization of the fabric is maximized. Preferably, in this embodiment, with speedup defined in any manner, at each stage only cells that have arrived in the same frame are transmitted to the next stage, wherein F=D/(3Tc), or F=D/(4Tc) if cells are reordered at the outputs, wherein D is the maximum tolerable delay and Tc is the cell time slot duration.
  • In one embodiment, in the methods of the present invention, the numbers of flows sourced by an input SE or bound for an output SE that are balanced starting from different internal SEs differ by at most 1, wherein Nf fulfills:
  • $$N_f \le \begin{cases} 2(S-U) \cdot F, & U \ge \frac{S}{2} \\ \frac{S^2 F}{2U}, & U < \frac{S}{2} \end{cases}$$
  • where S is the switching fabric speedup, U is the targeted utilization of the switching fabric, D is the maximum tolerable delay and Tc is the cell time slot duration. Preferably, flow synchronization is achieved by resetting the counters each frame. In some proposed algorithms, the counters are set in each frame to cij=(i+j) mod l, where i may be either an input or an input SE, and j may be either an output or an output SE.
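  • For comparison with the unsynchronized bound above, the synchronized-counter bound can be evaluated as follows (an illustrative sketch with sample values; the function name is ours):

```python
def max_flows_synchronized(S: float, U: float, F: int) -> float:
    """N_f bound when counters are reset each frame (see the formula above)."""
    if U >= S / 2:
        return 2 * (S - U) * F
    return S * S * F / (2 * U)

# With S = 1, U = 0.9 and F = 1000, synchronization roughly doubles the
# admissible number of flows relative to (S - U) * F = 100:
print(max_flows_synchronized(1.0, 0.9, 1000))  # approximately 200
```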
  • The methods of the present invention are analyzed in the present specification by means of theorems and proofs thereof, and by means of examples.
  • Theorem 1: Non-blocking is provided without link speedup if l≧n.
  • Proof: Let SEij denote the jth SE in stage i throughout this specification. In all algorithms, each input, or input SE, transmits traffic at equal rates through the connections from input (first-stage) to center (second-stage) SEs; consequently, the rate transmitted through any of these connections is:
  • $$R' = \frac{\sum_{i' \in SE_{1i}} s_{i'}}{l} \le \frac{n \cdot R}{l}, \qquad (1)$$
  • where $s_{i'}$ is the rate at which input i′ sends traffic. If $r_{i'k'}$ denotes the rate at which input i′ sends traffic to output k′, then the rate transmitted through a connection from a center (second-stage) SE to an output (third-stage) SE, say SE3k, is:
  • $$R'' = \frac{\sum_{i'} \sum_{k' \in SE_{3k}} r_{i'k'}}{l} \le \frac{nR}{l}, \qquad (2)$$
  • because the outputs are not overloaded. So, the maximum rate supported by a connection in the fabric should fulfill:
  • $$S = \frac{R_c}{R} \ge \frac{n}{l}, \qquad (3)$$
  • because equality may be reached in (1,2). So, non-blocking is provided without link speedup, i.e. with S=1, if l≧n.
  • Traffic of each individual flow is balanced independently across the SEs. If there are many flows that transmit cells across some SE at the same time, the cells will experience long delays. Many applications, e.g. voice and video, require rate and delay guarantees. The worst-case utilizations for balancing algorithms that provide rate and delay guarantees have been assessed.
  • Time is divided into frames of F cells, and each input-output pair is guaranteed a specified number of time slots per frame; for example, aij time slots are guaranteed to input-output pair (i, j), 0≦i, j<N. Each input and each output can be assigned at most Fu time slots per frame, i.e.
  • $$\sum_k a_{ik} \le F_u, \qquad \sum_k a_{ki} \le F_u. \qquad (4)$$
  • Fu is evaluated in terms of F, N, Nf for various load balancing algorithms, under the assumption that l=n. Here Nf is the maximum number of flows passing through some connection that are separately balanced.
  • It is assumed that there is a coarse synchronization in a switch, i.e. that at some point of time the input ports schedule cells belonging to the same frame. A possible implementation of such a coarse synchronization is described later. The coarse synchronization may introduce an additional delay smaller than the frame duration, but may also simplify the controller implementation. Otherwise, SEs should give priority to the earlier frames, which complicates their schedulers; also, cell resequencing becomes more complex because the maximum jitter is increased. The delay that a cell may experience through the Clos switch is three times the frame duration, D=3FTc, or D=4FTc if cells are reordered at the outputs.
  • The number of cells per frame sent from a given input SE through a given center SE (Fc′) is calculated in terms of Fu, together with the maximal utilization of the connections from input ports to center SEs (Fu/F). Because of the symmetry, the utilization is the same for the connections from center to output SEs, as shown below. Note that all lemmas and theorems hold in large switches where n>10.
  • Lemma 1: Let Fc′ denote the maximum number of cells per frame sent from a given input SE through a given center SE. It holds that:

  • $$F_c' \ge F_u + N_f' - n, \qquad (5)$$
  • where Nf′ denotes the number of flows sourced by SE1i that pass through the links from this SE to center SEs.
  • Proof: Let fig′, 0≦g<Nf′, denote the number of time slots per frame that are guaranteed to the individual flows sourced by SE1i. It follows:
  • $$F_c' \le \sum_g \left\lceil \frac{f_{ig}'}{n} \right\rceil \;\Rightarrow\; F_c' < \sum_g \frac{f_{ig}'}{n} + N_f' \;\Rightarrow\; F_c' < F_u + N_f', \qquad (6)$$
  • where $\lceil x \rceil$ is the smallest integer no less than x, i.e. $\lceil x \rceil < x+1$. The maximum number of cells sourced by SE1i that may happen to be transmitted through a given center SE, say SE2j, is found as follows. Assume that out of the Nf′ flows sourced by SE1i, Nf′−n flows are assigned one time slot per frame, and the remaining n flows share the remaining max(0, nFu−(Nf′−n)) time slots per frame. If the first cells in a frame of all flows happen to be sent through SE2j, the total number of cells per frame transmitted through SE2j from SE1i will be:
  • $$F_c' = \max\left(N_f', \; N_f' - n + n \left\lceil \frac{nF_u - (N_f' - n)}{N} \right\rceil\right) = \max\left(N_f', \; F_u + \frac{(n-1)N_f' - (nF_u - N_f') \bmod N}{n}\right). \qquad (7)$$
  • Note that in this case Fc′ almost reaches the upper bound in (6) for n>10, because n≪N≦Fu, and the claim of the lemma follows.
  • Lemma 2: Maximum utilization of the links from input ports to center SEs is:
  • $$U_a' = \begin{cases} S - \frac{N_f'}{F}, & F \ge \frac{N_f'}{S} \\ 0, & F < \frac{N_f'}{S} \end{cases} \qquad (8)$$
  • Proof. Since Fc′≦SF for any of the internal connections in the fabric, from Lemma 1 it follows that:

  • $$F_u \le SF - N_f'. \qquad (9)$$
  • If (9) holds, all cells pass from SE1i to center SEs within designated frames. So, the maximum utilization of the links from input to center SEs is:
  • $$U_a' = \frac{F_u}{F} = \begin{cases} S - \frac{N_f'}{F}, & F \ge \frac{N_f'}{S} \\ 0, & F < \frac{N_f'}{S} \end{cases}$$
  • where the last approximation holds for large switches for which n>10.
  • Lemma 3: Let Fc″ denote the maximum number of cells per frame sent to a given output SE through a given center SE. It holds that:

  • $$F_c'' \ge F_u + N_f'' - n, \qquad (10)$$
  • where Nf″ denotes the number of flows bound to SE3k that pass through the links from center SEs to this output SE.
  • Proof. Let fkg″, 0≦g<Nf″, denote the number of time slots per frame that are guaranteed to the individual flows bound for SE3k. Similarly, as in the proof of Lemma 1, it holds that:

  • $$F_c'' < F_u + N_f''. \qquad (11)$$
  • Similarly, as in the proof of Lemma 1, out of the Nf″ flows bound for SE3k, Nf″−n flows may transmit one cell per frame that passes through SE2j, and n flows may transmit the remaining max(0, nFu−(Nf″−n)) cells. If the first cells in a frame of all flows happen to be sent through SE2j, the upper bound in (11) is almost reached, and the claim of the lemma follows.
  • Lemma 4: Maximum utilization of the links from center to output SEs is:
  • $$U_a'' = \begin{cases} S - \frac{N_f''}{F}, & F \ge \frac{N_f''}{S} \\ 0, & F < \frac{N_f''}{S} \end{cases} \qquad (12)$$
  • Proof. Maximum utilization of the links from center to output SEs can be derived from Lemma 3 as:
  • $$F_c'' = F_u + N_f'' \le SF \;\Rightarrow\; U_a'' = \frac{F_u}{F} = \begin{cases} S - \frac{N_f''}{F}, & F \ge \frac{N_f''}{S} \\ 0, & F < \frac{N_f''}{S} \end{cases} \qquad (13)$$
  • Theorem 2: Maximum utilization of any internal link in the fabric under which all cells pass it within designated frames is:
  • $$U_a = \begin{cases} S - \frac{N_f}{F}, & F \ge \frac{N_f}{S} \\ 0, & F < \frac{N_f}{S} \end{cases} \qquad (14)$$
  • where Nf is the maximum number of flows sourced by any input SE or bound for any output SE, i.e. the maximum number of flows that are passing through some internal link of the fabric.
  • Proof: Maximum utilization of any internal link in the fabric under which all cells pass it within designated frames can be derived from Lemmas 2 and 4:
  • $$U_a = \min(U_a', U_a'') = \begin{cases} S - \frac{N_f}{F}, & F \ge \frac{N_f}{S} \\ 0, & F < \frac{N_f}{S} \end{cases} \qquad (15)$$
  • where Nf is the maximum number of flows sourced by any input SE or bound to any output SE, i.e. the maximum number of flows that are passing through some internal link of the fabric.
  • Note that Theorem 2 holds for the Benes network with an arbitrary number of stages, as described in Chaney et al., Proceedings of INFOCOM 1997, 1:2-11 and J. S. Turner, Proceedings of INFOCOM 1994, 1:298-305. In that case, the latter definition of Nf holds, i.e. Nf is the maximum number of flows that are passing through some internal link of the fabric.
  • The maximum utilization was calculated for the case when different flows bound for the same SE are not properly synchronized, so that they might send cells within a given frame starting from the same center SE. Alternatively, equal numbers of flows are balanced starting from different center SEs in each frame. For example, flow g of SE1i resets its counter at the beginning of a frame to cig=(i+g) mod n. Or, flow g bound for SE3k resets its counter at the beginning of a frame to ckg=(k+g) mod n. It is assumed that Nf>10n in order to simplify the analysis of load balancing algorithms with synchronized counters.
  • Lemma 5: In load balancing algorithms with synchronized counters, if $F_u \ge \frac{N_f'}{2}$, it holds that:
  • $$F_c' = F_u + \frac{N_f'}{2}, \qquad (16)$$
  • otherwise, if $\frac{10 N_f'}{8N} \le F_u < \frac{N_f'}{2}$, it holds that:
  • $$F_c' = \sqrt{2 F_u N_f'}. \qquad (17)$$
  • Proof: The maximum number of cells that are transmitted from SE1i through SE2(n−1) in the middle stage is calculated; the same result holds for any other center SE. Let fig′ denote the number of cells in flow g of SE1i, which is balanced starting from SE2j at the beginning of each frame, where j=(i+g) mod n. Then, the number of cells in flow g transmitted from SE1i through SE2(n−1) is:
  • $$\left\lfloor \frac{f_{ig}' + (i+g) \bmod n}{n} \right\rfloor,$$
  • where $\lfloor x \rfloor$ is the largest integer not greater than x, i.e. $\lfloor x \rfloor \le x$. So, the number of cells sent from SE1i through SE2(n−1) is:
  • $$F_c' = \sum_{0 \le g < N_f'} \left\lfloor \frac{f_{ig}' + (i+g) \bmod n}{n} \right\rfloor \le F_u + \frac{N_f'}{n} \cdot \frac{n-1}{2} \approx F_u + \frac{N_f'}{2}, \qquad (18)$$
  • for n>10 and Nf′>10n. Note that inequality (18) holds for n>10 and Nf′ mod n≠0 as well. Equality in (18) is reached if:

  • $$f_{ig}' = n - (i+g) \bmod n + n \cdot y_{ig}', \qquad (19)$$
  • where $y_{ig}' \ge 0$ are integers. Values fig′ that satisfy condition (19) exist if it holds that:
  • $$nF_u = \sum_{0 \le g < N_f'} f_{ig}' \ge \sum_{0 \le g < N_f'} \left(n - (i+g) \bmod n\right) = \frac{N_f'}{n} \cdot \frac{n(n+1)}{2} \;\Rightarrow\; F_u \ge \frac{N_f'}{n} \cdot \frac{n+1}{2} \approx \frac{N_f'}{2}, \qquad (20)$$
  • for n>10 and Nf>10n.
    Note that inequality (20) holds for n>10 and Nf′ mod n≠0 as well. When inequality (20) holds, equality in (18) may be reached, and:
  • $$F_c' = F_u + \frac{N_f'}{2}. \qquad (21)$$
  • If inequality (20) does not hold:
  • $$\frac{N_f'}{n} \cdot \frac{z(z+1)}{2} \le nF_u < \frac{N_f'}{n} \cdot \frac{(z+1)(z+2)}{2} \;\Rightarrow\; z = \left\lfloor \frac{-1 + \sqrt{1 + 8NF_u/N_f'}}{2} \right\rfloor, \qquad (22)$$
  • where $0 \le z < n$ is an integer. For $F_u > \frac{10 N_f'}{8N}$:
  • $$z \approx \sqrt{\frac{2NF_u}{N_f'}}. \qquad (23)$$
  • Fc′ is maximal for:
  • $$f_{ig}' = \begin{cases} n - q, & n - z \le q < n, \;\; q = (i+g) \bmod n \\ 0, & 0 \le (i+g) \bmod n < n - z \end{cases} \qquad (24)$$
  • If $\frac{10 N_f'}{8N} \le F_u < \frac{N_f'}{2}$, from (18, 23, 24):
  • $$F_c' = \frac{N_f' z}{n} \approx \sqrt{2 F_u N_f'}. \qquad (25)$$
  • Lemma 6: Maximum utilization of the links from input to center SEs, when the counters are synchronized is:
  • $$U_r' = \begin{cases} S - \frac{N_f'}{2F}, & F \ge \frac{N_f'}{S} \\ \frac{S^2 F}{2N_f'}, & F < \frac{N_f'}{S} \end{cases} \qquad (26)$$
  • Proof: Since $F_c' \le SF$, from Lemma 5 it follows that for $F_u \ge \frac{N_f'}{2}$:
  • $$F_c' = F_u + \frac{N_f'}{2} \le SF \;\Rightarrow\; U_r' = \frac{F_u}{F} \le S - \frac{N_f'}{2F}, \quad F \ge \frac{N_f'}{S}, \qquad (27)$$
  • and for $\frac{10 N_f'}{8N} \le F_u < \frac{N_f'}{2}$:
  • $$F_c' = \sqrt{2 F_u N_f'} \le SF \;\Rightarrow\; U_r' = \frac{F_u}{F} \le \min\left(\frac{N_f'}{2F}, \frac{S^2 F}{2 N_f'}\right). \qquad (28)$$
  • So, the maximum utilization when counters are reset each frame is:
  • $$U_r' = \frac{F_u}{F} \le \begin{cases} S - \frac{N_f'}{2F}, & F_u \ge \frac{N_f'}{2} \\ \min\left(\frac{N_f'}{2F}, \frac{S^2 F}{2N_f'}\right), & \frac{10 N_f'}{8N} \le F_u < \frac{N_f'}{2} \\ \frac{10 N_f'}{8NF}, & F_u < \frac{10 N_f'}{8N} \end{cases} \qquad (29)$$
  • From equations (27, 29), it follows that:
  • $$U_r' = \begin{cases} S - \frac{N_f'}{2F}, & F \ge \frac{N_f'}{S} \\ \frac{S^2 F}{2N_f'}, & F < \frac{N_f'}{S} \end{cases} \qquad (30)$$
  • Here $\frac{10 N_f'}{8NF} \ll 1$ because $N_f' \le F$ and $N \gg 1$, so the range $F_u < \frac{10 N_f'}{8N}$ is not of practical interest and was omitted from the final formula.
  • Lemma 7: In load balancing algorithms with synchronized counters, if $F_u \ge \frac{N_f''}{2}$, it holds that:
  • $$F_c'' = F_u + \frac{N_f''}{2}, \qquad (31)$$
  • otherwise, if $\frac{10 N_f''}{8N} \le F_u < \frac{N_f''}{2}$, it holds that:
  • $$F_c'' = \sqrt{2 F_u N_f''}. \qquad (32)$$
  • Proof. First the maximum number of cells that are transmitted to SE3k through SE2(n-1) in the middle stage is calculated, and the same result holds for any other center SE. Let fkg″ denote the number of cells in flow g transmitted to SE3k that are balanced starting from SE2j at the beginning of each frame, where j=(k+g) mod n. Then, the number of cells in flow g transmitted to SE3k through SE2(n-1) is └(fkg″+(k+g) mod n)/n┘. Similarly, as in the proof of Lemma 5, it holds that:
  • $$F_c'' \le F_u + \frac{N_f''}{2}. \qquad (33)$$
  • If inequality
  • $$F_u \ge \frac{N_f''}{2} \qquad (34)$$
  • holds, equality in (33) may be reached, so:
  • $$F_c'' = F_u + \frac{N_f''}{2}. \qquad (35)$$
  • Similarly, as in the proof of Lemma 5, if it holds that:
  • $$\frac{10 N_f''}{8N} \le F_u < \frac{N_f''}{2}, \qquad (36)$$
  • then:
  • $$F_c'' = \sqrt{2 F_u N_f''}. \qquad (37)$$
  • Lemma 8: Maximum utilization of the links from center to output SEs when the counters are reset each frame is:
  • $$U_r'' = \begin{cases} S - \frac{N_f''}{2F}, & F \ge \frac{N_f''}{S} \\ \frac{S^2 F}{2N_f''}, & F < \frac{N_f''}{S} \end{cases} \qquad (38)$$
  • Proof: Since Fc″≦SF, from Lemma 7 it follows that for Fu≧Nf″/2:
  • $$F_c'' = F_u + \frac{N_f''}{2} \le SF \;\Rightarrow\; U_r'' = \frac{F_u}{F} \le S - \frac{N_f''}{2F}, \quad F \ge \frac{N_f''}{S}, \qquad (39)$$
  • and for $\frac{10 N_f''}{8N} \le F_u < \frac{N_f''}{2}$:
  • $$F_c'' = \sqrt{2 F_u N_f''} \le SF \;\Rightarrow\; U_r'' = \frac{F_u}{F} \le \min\left(\frac{N_f''}{2F}, \frac{S^2 F}{2N_f''}\right). \qquad (40)$$
  • So, maximum utilization of the links from center to output SEs is:
  • $$U_r'' = \frac{F_u}{F} \le \begin{cases} S - \frac{N_f''}{2F}, & F_u \ge \frac{N_f''}{2} \\ \min\left(\frac{N_f''}{2F}, \frac{S^2 F}{2N_f''}\right), & \frac{10 N_f''}{8N} \le F_u < \frac{N_f''}{2} \\ \frac{10 N_f''}{8NF}, & F_u < \frac{10 N_f''}{8N} \end{cases} \qquad (41)$$
  • From equations (39, 41), it follows that:
  • $$U_r'' = \begin{cases} S - \frac{N_f''}{2F}, & F \ge \frac{N_f''}{S} \\ \frac{S^2 F}{2N_f''}, & F < \frac{N_f''}{S} \end{cases} \qquad (42)$$
  • Theorem 3: In the algorithms where balancing of different flows is synchronized, maximum utilization of any internal link in the fabric under which all cells pass it within designated frames is:
  • $$U_r = \begin{cases} S - \frac{N_f}{2F}, & F \ge \frac{N_f}{S} \\ \frac{S^2 F}{2N_f}, & F < \frac{N_f}{S} \end{cases} \qquad (43)$$
  • Proof: Maximum utilization of any internal link in the fabric under which all cells pass it within designated frames is derived from Lemmas 6 and 8 to be:
  • $$U_r = \min(U_r', U_r'') = \begin{cases} S - \frac{N_f}{2F}, & F \ge \frac{N_f}{S} \\ \frac{S^2 F}{2N_f}, & F < \frac{N_f}{S} \end{cases} \qquad (44)$$
  • Note that Theorem 3 provides the maximum utilization when both the balancing of flows sourced by an input SE and the balancing of flows bound for an output SE are synchronized. This assumption holds in all the algorithms.
  • Often, signal transmission over the fibers connecting distant routers requires the most complex and costly hardware. Therefore, it is important to provide the highest utilization of the fiber transmission capacity. For this reason, switching fabrics with a speedup have been previously proposed. Namely, the internal links of the fabric have higher capacity than the external links:
  • $$S = \frac{R_c}{R} \ge 1, \qquad (45)$$
  • where R is a bit-rate at which data is transmitted through the fibers, and Rc is a bit-rate at which data is transmitted through the fabric connections.
  • Theorem 4: The speedup S required to pass all incoming packets with a tolerable delay when counters are not synchronized is:
  • $$S_a \ge 1 + \frac{N_f}{F}, \qquad (46)$$
  • and the speedup when counters are synchronized is:
  • $$S_r \ge \begin{cases} 1 + \frac{N_f}{2F}, & F \ge \frac{N_f}{2} \\ \sqrt{\frac{2N_f}{F}}, & F < \frac{N_f}{2} \end{cases} \qquad (47)$$
  • Proof: It should hold that Fu=F while Fc≦SF, where Fc is the number of cells passing through some internal link per frame. When the counters are not synchronized, from Lemmas 1 and 3 it follows that:

  • $$S_a F \ge \max(F_c', F_c'') = F + N_f,$$
  • and so:
  • $$S_a \ge 1 + \frac{N_f}{F}. \qquad (48)$$
  • When the counters are synchronized, from Lemmas 5 and 7 it follows that:
  • $$S_r F \ge \max(F_c', F_c'') = \begin{cases} F + \frac{N_f}{2}, & F \ge \frac{N_f}{2} \\ \sqrt{2 F N_f}, & \frac{10 N_f}{8N} \le F < \frac{N_f}{2} \end{cases}$$
  • and so:
  • $$S_r \ge \begin{cases} 1 + \frac{N_f}{2F}, & F \ge \frac{N_f}{2} \\ \sqrt{\frac{2 N_f}{F}}, & F < \frac{N_f}{2} \end{cases} \qquad (49)$$
  • because $F \ge N_f > \frac{10 N_f}{8N}$, since N≧2. Note that a speedup smaller than 1 means that no speedup is really needed.
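  • The two speedups of Theorem 4 can be evaluated directly from formulas (48) and (49) (a hypothetical sketch; the function names are ours and the large-switch approximation n>10 is assumed):

```python
import math

def speedup_unsynchronized(N_f: int, F: int) -> float:
    """S_a per formula (48)."""
    return 1 + N_f / F

def speedup_synchronized(N_f: int, F: int) -> float:
    """S_r per formula (49)."""
    if F >= N_f / 2:
        return 1 + N_f / (2 * F)
    return math.sqrt(2 * N_f / F)

# Synchronizing the counters halves the extra speedup when F >= N_f / 2:
print(speedup_unsynchronized(2000, 4000))  # 1.5
print(speedup_synchronized(2000, 4000))    # 1.25
```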
  • The performance of a load balancing algorithm depends on the number of flows that are separately balanced. Let Nf denote the maximum number of balanced flows passing through some internal link. As noted before, Nf is equal to the maximum number of flows sourced by some input SE or bound to some output SE. In the first algorithm Nf=N, because any input SE sources n²=N flows, and each of N inputs balances one flow for any output SE. In the second algorithm, Nf=nN, because any input SE sources nN flows, and each of N inputs balances n flows bound for any output SE. In the third algorithm, Nf=n, because any input SE sources n flows, and each of n input SEs balances one flow for any output SE. In the fourth algorithm, Nf=N, because any input SE sources N flows, and each of n input SEs balances n flows for any output SE.
  • Under the assumption of no speedup, i.e. S=1, the maximum utilizations for the described load balancing algorithms are obtained by substituting Nf in formula (14):
  • $$U_{a1} = U_{a4} = \begin{cases} 1 - \frac{N}{F}, & F \ge N \\ 0, & F < N \end{cases}, \qquad U_{a2} = \begin{cases} 1 - \frac{nN}{F}, & F \ge nN \\ 0, & F < nN \end{cases}, \qquad U_{a3} \approx 1. \qquad (50)$$
  • Thus, the second load balancing algorithm is least efficient, while the third algorithm is most efficient.
  • In order to increase the efficiency of the load balancing algorithms, in one embodiment of the present invention, the frame length is increased. The cell delay is proportional to the frame length, so the maximum frame length is determined by the delay that can be tolerated by the applications, such as interactive voice and video. Assuming that the maximum delay that can be tolerated by interactive applications is D, and the cell time slot duration is Tc, then:
  • $$F \le \frac{D}{3T_c}, \qquad (51)$$
  • and:
  • $$U_{a1} = U_{a4} = \begin{cases} 1 - \frac{3NT_c}{D}, & D \ge 3NT_c \\ 0, & D < 3NT_c \end{cases}, \qquad U_{a2} = \begin{cases} 1 - \frac{3nNT_c}{D}, & D \ge 3nNT_c \\ 0, & D < 3nNT_c \end{cases} \qquad (52)$$
  • The one-way packet delay that can be tolerated by interactive applications is around 150 ms, but only 50-60 ms of this allowed delay can be budgeted for queueing. A switch delay as low as 3 ms may be required for various reasons. For example, packets might pass through multiple packet switches on the way from their sources to their destinations, and the packet delays through these switches add up. Also, in order to provide flexible multicasting, the ports should forward packets multiple times through the packet switch, and the packet delay is prolonged accordingly (Chaney et al., Proceedings of INFOCOM 1997, 1:2-11 (1997); A. Smiljanić, “Scheduling of Multicast Traffic in High-Capacity Packet Switches,” IEICE/IEEE Workshop on High-Performance Switching and Routing, May 2002, pp. 29-33; A. Smiljanić, “Scheduling of Multicast Traffic in High-Capacity Packet Switches,” IEEE Communication Magazine, November 2002, pp. 72-77; and J. S. Turner, Proceedings of INFOCOM 1994, 1:298-305 (1994)).
  • FIG. 2 shows that the fabric utilization decreases as the switch size increases, for various tolerable delays. In FIG. 2(a) Tc=50 ns, while in FIG. 2(b) Tc=100 ns. The solid curves represent the first and fourth algorithms (Nf=N), while the dashed curves correspond to the second algorithm (Nf=nN). The efficiency of the second balancing algorithm might decrease unacceptably as the switch size increases. For example, the utilization of a fabric with 1000 ports drops below 10% for a tolerable delay of 3 ms and Tc=50 ns. On the other hand, for the same tolerable delay and cell duration, the utilization of a fabric with 4000 ports is 90% if the first or the fourth load balancing algorithm is applied. Note that utilizations are lower in FIG. 2(b), where the cell duration is longer, Tc=100 ns. Thus, the first and fourth load balancing algorithms (for which Nf=N) provide superior performance. Balancing flows starting from different center SEs improves the efficiency of load balancing. Namely, at the beginning of each frame, the counters are set to the appropriate values, e.g. cij=(i+j) mod n, where 0≦i<N, 0≦j<n for the first load balancing algorithm; 0≦i, j<N for the second algorithm; and 0≦i<n, 0≦j<N for the fourth algorithm. (The efficiency of the third algorithm is already close to 100%.) Because in all these cases Nf≧N>10n and n>10, the guaranteed utilizations for the enhanced load balancing algorithms are derived by substituting Nf in formula (43) as follows:
  • $$U_{r1} = U_{r4} = \begin{cases} 1 - \frac{N}{2F}, & F \ge N \\ \frac{F}{2N}, & F < N \end{cases}, \qquad U_{r2} = \begin{cases} 1 - \frac{nN}{2F}, & F \ge nN \\ \frac{F}{2nN}, & F < nN \end{cases} \qquad (53)$$
  • It follows that:
  • $$U_{r1} = U_{r4} = \begin{cases} 1 - \frac{3NT_c}{2D}, & D \ge 3NT_c \\ \frac{D}{6NT_c}, & D < 3NT_c \end{cases}, \qquad U_{r2} = \begin{cases} 1 - \frac{3nNT_c}{2D}, & D \ge 3nNT_c \\ \frac{D}{6nNT_c}, & D < 3nNT_c \end{cases} \qquad (54)$$
  • where D is the maximum delay that can be tolerated, and again it is assumed that there is no speedup, i.e. that S=1.
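  • Formula (54) for the first and fourth synchronized algorithms (Nf=N, S=1) is easy to evaluate; the sketch below (hypothetical code with sample parameters) reproduces the 90% utilization mentioned for a 4000-port fabric:

```python
def U_r1(N: int, D: float, T_c: float) -> float:
    """First branch of (54) when D >= 3*N*T_c, second branch otherwise."""
    if D >= 3 * N * T_c:
        return 1 - 3 * N * T_c / (2 * D)
    return D / (6 * N * T_c)

# 4000 ports, T_c = 50 ns, delay budget D = 3 ms:
print(U_r1(N=4000, D=3e-3, T_c=50e-9))  # approximately 0.9
```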
  • FIG. 3 shows the fabric utilization for the load balancing algorithms that reset the counters to the specified values every frame. In FIG. 3(a) Tc=50 ns, while in FIG. 3(b) Tc=100 ns. The solid curves correspond to the first and fourth algorithms (Nf=N), while the dashed curves correspond to the second algorithm (Nf=nN). The efficiency of the second load balancing algorithm is improved, but it is still low in large switches where cells bound for a particular output are spread equally across the center SEs. For example, the utilization of a fabric with 1000 ports drops below 30% for a tolerable delay of 3 ms and Tc=50 ns, and again drops below 10% in a switch with 4000 ports. The efficiency of the first and fourth load balancing algorithms is improved too, i.e. for the same tolerable delay and cell duration the utilization of a fabric with 4000 ports is 90%. Note that utilizations are lower in FIG. 3(b), where the cell duration is longer, Tc=100 ns. Again, the first and fourth load balancing algorithms provide much better performance than the second load balancing algorithm.
  • In another embodiment of the present invention, the utilization of the transmission capacity is maximized to 100% by implementing the switching fabric with a speedup. The speedup required to provide non-blocking varies for different load balancing algorithms. In the simple case when different counters are not synchronized, required speedups can be obtained from formula (46) to be:
  • $$S_{a1} = S_{a4} = 1 + \frac{N}{F}, \qquad S_{a2} = 1 + \frac{nN}{F}. \qquad (55)$$
  • When the counters are synchronized, required speedups are decreased and are obtained from formula (47) as follows:
  • $$S_{r1} = S_{r4} = \begin{cases} 1 + \frac{N}{2F}, & F \ge \frac{N}{2} \\ \sqrt{\frac{2N}{F}}, & F < \frac{N}{2} \end{cases}, \qquad S_{r2} = \begin{cases} 1 + \frac{nN}{2F}, & F \ge \frac{nN}{2} \\ \sqrt{\frac{2nN}{F}}, & F < \frac{nN}{2} \end{cases} \qquad (56)$$
  • Speedups required to pass the packets with a tolerable delay of D can be calculated from formula (55):
  • $$S_{a1} = S_{a4} = 1 + \frac{3NT_c}{D}, \qquad S_{a2} = 1 + \frac{3nNT_c}{D}. \qquad (57)$$
  • When the counters are synchronized, required speedups are decreased and are obtained from formula (56) as follows:
  • $$S_{r1} = S_{r4} = \begin{cases} 1 + \frac{3NT_c}{2D}, & D \ge \frac{3NT_c}{2} \\ \sqrt{\frac{6NT_c}{D}}, & D < \frac{3NT_c}{2} \end{cases}, \qquad S_{r2} = \begin{cases} 1 + \frac{3nNT_c}{2D}, & D \ge \frac{3nNT_c}{2} \\ \sqrt{\frac{6nNT_c}{D}}, & D < \frac{3nNT_c}{2} \end{cases} \qquad (58)$$
  • FIG. 4 shows the fabric speedup that provides non-blocking through a switch for various delay requirements. In FIG. 4(a) Tc=50 ns, while in FIG. 4(b) Tc=100 ns. The solid curves represent the first and fourth algorithms (Nf=N), while the dashed curves correspond to the second algorithm (Nf=nN). If the cell duration is 50 ns, the second load balancing algorithm requires speedups larger than 2 and 10 in order to provide a delay of less than 3 ms through a switch with 1000 and 4000 ports, respectively. If the cell duration is 100 ns, the second load balancing algorithm requires speedups larger than 4 and 11 in order to provide a delay of less than 3 ms through a switch with 1000 and 4000 ports, respectively. On the other hand, the speedup required when the first and fourth load balancing algorithms are applied is close to 1 for all switch parameters.
  • FIG. 5 shows the fabric speedup that provides non-blocking through a switch for various delay requirements in the case when the counters used for balancing are synchronized. In FIG. 5(a) Tc=50 ns, while in FIG. 5(b) Tc=100 ns. The solid curves represent the first and fourth algorithms (Nf=N), while the dashed curves correspond to the second algorithm (Nf=nN). If the cell duration is 50 ns, the second load balancing algorithm requires speedups larger than 2 and 7 in order to provide a delay of less than 3 ms through a switch with 1000 and 4000 ports, respectively. If the cell duration is 100 ns, the second load balancing algorithm requires speedups larger than 2 and 10 in order to provide a delay of less than 3 ms through a switch with 1000 and 4000 ports, respectively. Thus, the required speedup is sometimes decreased when the counters are synchronized. No speedup is needed when the first and fourth load balancing algorithms are applied and the counters are synchronized.
  • Therefore, it is preferred that cells bound for an output SE are spread equally across the center SEs, or that input SEs spread cells across the center SEs (Nf<N). Since the performance improves as the number of balanced flows decreases, all algorithms for which Nf≦N perform well. However, the implementation of the algorithms where input SEs balance the traffic may be more complex and, consequently, less scalable. First, the inputs have to exchange information with the SE arbiter. Second, the counters of the arbiter should be updated n times per cell time slot, which may require advanced processing capability and may limit the number of SE ports, i.e. the total switch capacity. Also, these algorithms assume SEs with shared buffers, whose capacity was shown to be smaller than the capacity of crossbar SEs. Note that in the Turner article (J. S. Turner, Proceedings of INFOCOM 1994, 1:298-305), it was proposed that end-to-end sessions are separately balanced in a switch. In that case Nf≧nN, and consequently the performance is poorer than in the cases examined in this specification.
  • In some cases, there is a coarse synchronization in a switch during the flow of data, i.e. at some point of time the input ports schedule cells belonging to the same frame. In one embodiment of the present invention, if the frames at different ports are not synchronized, correct switch operation can be accomplished in the following way. Frames are delineated by designated packets. One extra bit per packet, FB, is set at the port to denote its frame, and is toggled in each frame. In a given frame, the switch arbiter will schedule only packets received before that frame whose FB equals a specified switch bit, SB. SB toggles in each frame as well. FIG. 6 illustrates this synchronization. The upper axis in FIG. 6(a) shows the switch frame boundaries, while the lower axes in FIG. 6(b) and (c) show the port frame boundaries. At the beginning of each switch frame, SB toggles, and at the beginning of each port frame, FB toggles, as shown. Thus, only packets with FB=SB=0 that have arrived before the switch frame k+2 in FIG. 6(a) will be scheduled in the switch frame k+2; and these are the packets of the upper port frame m+1 in FIG. 6(b). Similarly, packets of the port frame m+2 will be scheduled in the switch frame k+3, etc. In FIG. 6(b), the port is synchronized properly, while in FIG. 6(c) it is not. Namely, packets arriving at the end of the port frame m and packets arriving at the beginning of the port frame m+2 are eligible for scheduling in the switch frame k+3. So, the number of packets bound for some output that are scheduled in frame k+3 might exceed the negotiated number, and some would be blocked. Thus, SB and the FBs have to be properly synchronized: an arbiter sets FB=1−SB if the switch frame boundary preceded the previous port frame boundary (delineation packet), or FB=SB otherwise, where FB is the frame bit of the first packet arriving as the synchronization process started. Although the coarse synchronization may introduce an additional delay smaller than the frame duration, the synchronization simplifies the controller implementation.
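  • The FB/SB rule just described can be summarized in a few lines (a hypothetical sketch of the bookkeeping only, not a normative controller implementation; the class names are ours):

```python
class SwitchArbiter:
    """Tracks the switch bit SB, which toggles at each switch frame."""

    def __init__(self):
        self.SB = 0

    def new_switch_frame(self):
        self.SB ^= 1

    def eligible(self, packet_FB: int) -> bool:
        """An already-received packet is scheduled in the current switch
        frame only if its frame bit FB equals the switch bit SB."""
        return packet_FB == self.SB

class Port:
    """Tracks the port frame bit FB, toggled at each port frame."""

    def __init__(self, arbiter: SwitchArbiter, switch_boundary_first: bool):
        # Synchronization rule from the text: FB = 1 - SB if the switch
        # frame boundary preceded the previous port frame boundary,
        # FB = SB otherwise.
        self.FB = (1 - arbiter.SB) if switch_boundary_first else arbiter.SB

    def new_port_frame(self):
        self.FB ^= 1
```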
  • Multiple priorities can be served in the switch. In each SE, high-priority cells are served first, and their number is limited according to the various admission control conditions that were described above. On the other hand, there are no limits for low-priority cells, which are served if they can get through after the high-priority cells have been served. By limiting the number of high-priority cells according to the above equation, they are served with a guaranteed delay. If there is any resource left, namely time slots in which some input and output are idle and there are lower-priority cells between them, the lower-priority cells are served without any delay guarantees.
  • Multicasting
  • A significant amount of traffic on the Internet is multicast in nature, i.e. it carries information from one source to multiple destinations. Scheduling of multicast packets in switches is a complicated task. If a multicast packet is scheduled to be simultaneously transmitted to all destination outputs, it may be unacceptably delayed. On the other hand, if the multicast packet is scheduled to be separately transmitted to each destination output, its transmission may consume an unacceptably large portion of the input transmission capacity.
  • It has been proposed earlier that multicast packets should be forwarded through high-capacity switches (Chaney et al., Proceedings of INFOCOM 1997, 1:2-11 (1997); A. Smiljanić, IEICE/IEEE Workshop on High-Performance Switching and Routing, May 2002, pp. 29-33; A. Smiljanić, IEEE Communication Magazine, November 2002, pp. 72-77; J. S. Turner, “An optimal nonblocking multicast virtual circuit switch,” Proceedings of INFOCOM 1994, vol. 1, pp. 298-305). Namely, a multicast input sends multicast packets to a limited number of destinations, and each multicast destination output that has received the packets forwards them to a limited number of destination outputs that have not received them yet; such forwarding continues until all destination outputs have received all the packets. By choosing an appropriate forwarding fan-out P, i.e. the number of destination outputs to which a packet is forwarded from one port, the switch utilization and the guaranteed delay can be selected (A. Smiljanić, IEICE/IEEE Workshop on High-Performance Switching and Routing, May 2002, pp. 29-33; A. Smiljanić, IEEE Communication Magazine, November 2002, pp. 72-77).
  • Packets can be forwarded in two ways. In the first case, a port separately transmits a multicast packet to each of its destination ports. Then, the packet flow is determined solely by its input and output ports, as in the case of unicast packets. In the second case, a port transmits only one copy of a multicast packet to the Clos network. The multicast packet is transmitted through the network until the last SE from which it can reach some destination port, where it is replicated and its copies are routed separately through the remainder of the network. So, the multicast flow is balanced in stages before the packet replication starts. In this case, the packet flow is determined by its input port and its multiple destination ports. Obviously, the number of flows is increased in this way, and the performance of load balancing is degraded. On the other hand, the port transmission capacity required for forwarding is smaller. It was shown earlier that P=2 is the most practical choice; then, the improvement in port transmission capacity is smaller than the utilization degradation due to imperfect load balancing, so the first multicasting scheme is recommended. In any case, the performance of the second multicasting scheme is improved when the number of flows is minimized.
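  • One simple model of forwarding with fan-out P (a hypothetical sketch assuming that every port already holding the packet, including the input, forwards it to up to P new destinations in each round; the function name is ours):

```python
def forwarding_rounds(destinations: list, P: int = 2) -> list:
    """Group destination outputs into forwarding rounds; the number of
    holders grows by a factor of (P + 1) per round."""
    rounds, holders, pending = [], 1, list(destinations)
    while pending:
        batch, pending = pending[:holders * P], pending[holders * P:]
        rounds.append(batch)
        holders += len(batch)  # every receiver becomes a forwarder
    return rounds

# Ten destination outputs with P = 2 are covered in three rounds:
print(forwarding_rounds(list(range(10))))
# [[0, 1], [2, 3, 4, 5, 6, 7], [8, 9]]
```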
  • Again, various load balancing algorithms can be performed depending on the definition of the flows that are separately balanced. Similarly, as for unicast transmission, four basic algorithms are provided.
  • In the first algorithm, all cells sourced by some input and bound for some set of P output SEs define one flow. So, for each multicast cell, its output SEs are determined, and the flow is determined by the resulting set of output SEs. There are Nf=n·n(n−1)/2≈nN/2 such flows that are balanced through any link from an input port to a center SE. Recall that the corresponding utilization Ua=1−Nf/F=1−nN/(2F) has been shown to be unsatisfactory.
  • In the second algorithm, all cells sourced by some input and bound for some set of P outputs define one flow. There is an enormous number, Nf=nN(N−1)/2≈nN²/2, of such flows that are balanced through any link from an input port to a center SE, and this algorithm should be avoided by all means.
  • In the third algorithm, all cells sourced by some input SE and bound for some set of P output SEs define one flow. There are Nf=n(n−1)/2≈N/2 such flows that are balanced through any link from an input SE to a center SE. Thus, the performance of the third algorithm will be fine, as shown before.
  • In the fourth algorithm, all cells sourced by some input SE and bound for some set of P outputs define one flow. There is again an enormous number, Nf=N(N−1)/2≈N²/2, of such flows that are balanced through any link from an input SE to a center SE. The fourth algorithm should by all means be avoided. The only well-performing algorithm is more complex to implement, and it assumes SEs with shared buffers, which have smaller capacity than cross-bar SEs.
  • Improvement in the performance of load balancing of unicast and multicast flows in a fabric can be accomplished by increasing the frame length, balancing flows among different internal SEs, implementing the fabric with a speedup, or combinations thereof.
  • Implementation
  • The methods of the present invention can be implemented by an article of manufacture which comprises a machine readable medium containing one or more programs which when executed implement the steps of the methods of the present invention.
  • For example, the methods of the present invention can be implemented using a conventional microprocessor programmed according to the teachings of the present specification, as will be apparent to those skilled in the computer art. Appropriate software coding can readily be prepared by skilled programmers based on the teachings of the present disclosure, as will be apparent to those skilled in the software art. The invention may also be implemented by the preparation of application-specific units, such as application-specific integrated circuits (ASICs), configurable logic blocks, or field programmable gate arrays, or by interconnecting an appropriate network of conventional circuit components, as will be readily apparent to those skilled in the art.
  • The article of manufacture can comprise a storage medium, which can include, but is not limited to, Random-Access Memory (RAM) for storing lookup tables. In one embodiment, the assignment of cells to a flow comprises inputting the i, j designation of a cell into a lookup table, which assigns to the cell an input and output set, an input and output subset, and the flow of the cell.
  • The methods of the present invention can be implemented by an apparatus which comprises: a flow control device configured to perform the steps of the invention. The apparatus can also comprise a counter module configured to assign counters to each flow pursuant to the methods of the invention.
  • The present invention also includes a multistage non-blocking fabric which comprises a network of switches that perform the method steps of the invention. The fabric comprises at least one internal switching element (SE) stage, wherein the stage has l internal switching elements, an input SE stage, an output SE stage, input ports which are divided into input sets wherein each input set consists of input ports that transmit through the same input SE, and wherein the input sets are further divided into input subsets, and output ports which are divided into output sets wherein each output set consists of output ports that receive cells through the same output SE, and wherein the output sets are further divided into output subsets, and a flow assignment module wherein the module assigns cells which are received into the fabric to a flow. The assignment module comprises a lookup table.
  • Thus, while there have been described what are presently believed to be the preferred embodiments of the invention, those skilled in the art will realize that changes and modifications may be made thereto without departing from the spirit of the invention, and it is intended to claim all such changes and modifications as fall within the true scope of the invention.

Claims (1)

1. A method for balancing unicast or multicast flows in a multistage non-blocking fabric, wherein the fabric comprises at least one internal switching element (SE) stage, an input SE stage and an output SE stage, wherein the method comprises:
(a) receiving cells into the fabric wherein each cell is associated with an input subset and associated with an output subset according to the source and destination address of the cell,
(b) assigning each cell to a flow, wherein cells sourced from the same input subset, and bound for the same output subset, or multiple output subsets, are assigned to the same flow, and
(c) transmitting flows through the internal SE stage wherein cells of a particular flow are distributed among the internal switching elements, wherein the quantities of cells of each particular flow transmitted at each internal SE differ by at most h, wherein h is positive,
whereby the flow in the fabric is balanced.