CA1312133C - Integrated packetized voice and data switching system - Google Patents

Integrated packetized voice and data switching system

Info

Publication number
CA1312133C
Authority
CA
Canada
Prior art keywords
data
packets
packet
switching
voice
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
CA000595081A
Other languages
French (fr)
Inventor
Jayant Gurudatta Hemmady
William Paul Lidinsky
Scott Blair Steele
Werner Ulrich
Ronald Clare Weddige
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
AT&T Corp
Original Assignee
American Telephone and Telegraph Co Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US07/175,547 (US4958341A)
Application filed by American Telephone and Telegraph Co Inc
Publication of CA1312133C

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00Packet switching elements
    • H04L49/35Switches specially adapted for specific applications
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00Packet switching elements
    • H04L49/10Packet switching elements characterised by the switching fabric construction
    • H04L49/104Asynchronous transfer mode [ATM] switching fabrics
    • H04L49/105ATM switching elements
    • H04L49/106ATM switching elements using space switching, e.g. crossbar or matrix
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00Packet switching elements
    • H04L49/10Packet switching elements characterised by the switching fabric construction
    • H04L49/104Asynchronous transfer mode [ATM] switching fabrics
    • H04L49/105ATM switching elements
    • H04L49/107ATM switching elements using shared medium
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00Packet switching elements
    • H04L49/15Interconnection of switching modules
    • H04L49/1553Interconnection of ATM switching modules, e.g. ATM switching fabrics
    • H04L49/1576Crossbar or matrix
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00Packet switching elements
    • H04L49/15Interconnection of switching modules
    • H04L49/1553Interconnection of ATM switching modules, e.g. ATM switching fabrics
    • H04L49/1584Full Mesh, e.g. knockout
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00Packet switching elements
    • H04L49/20Support for services
    • H04L49/205Quality of Service based
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00Packet switching elements
    • H04L49/25Routing or path finding in a switch fabric
    • H04L49/253Routing or path finding in a switch fabric using establishment or release of connections between ports
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00Packet switching elements
    • H04L49/25Routing or path finding in a switch fabric
    • H04L49/256Routing or path finding in ATM switching fabrics
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00Packet switching elements
    • H04L49/30Peripheral units, e.g. input or output ports
    • H04L49/3009Header conversion, routing tables or routing tags
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00Packet switching elements
    • H04L49/30Peripheral units, e.g. input or output ports
    • H04L49/3081ATM peripheral units, e.g. policing, insertion or extraction
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00Packet switching elements
    • H04L49/30Peripheral units, e.g. input or output ports
    • H04L49/3081ATM peripheral units, e.g. policing, insertion or extraction
    • H04L49/309Header conversion, routing tables or routing tags
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00Packet switching elements
    • H04L49/50Overload detection or protection within a single switching element
    • H04L49/505Corrective measures
    • H04L49/506Backpressure
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00Packet switching elements
    • H04L49/55Prevention, detection or correction of errors
    • H04L49/557Error correction, e.g. fault recovery or fault tolerance
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00Packet switching elements
    • H04L49/60Software-defined switches
    • H04L49/606Hybrid ATM switches, e.g. ATM&STM, ATM&Frame Relay or ATM&IP
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00Data switching networks
    • H04L12/54Store-and-forward switching systems 
    • H04L12/56Packet switching systems
    • H04L12/5601Transfer mode dependent, e.g. ATM
    • H04L2012/5638Services, e.g. multimedia, GOS, QOS
    • H04L2012/5646Cell characteristics, e.g. loss, delay, jitter, sequence integrity
    • H04L2012/5651Priority, marking, classes
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00Data switching networks
    • H04L12/54Store-and-forward switching systems 
    • H04L12/56Packet switching systems
    • H04L12/5601Transfer mode dependent, e.g. ATM
    • H04L2012/5678Traffic aspects, e.g. arbitration, load balancing, smoothing, buffer management
    • H04L2012/5679Arbitration or scheduling
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00Data switching networks
    • H04L12/54Store-and-forward switching systems 
    • H04L12/56Packet switching systems
    • H04L12/5601Transfer mode dependent, e.g. ATM
    • H04L2012/5687Security aspects
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00Packet switching elements
    • H04L49/15Interconnection of switching modules
    • H04L49/1515Non-blocking multistage, e.g. Clos
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00Packet switching elements
    • H04L49/30Peripheral units, e.g. input or output ports
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00Packet switching elements
    • H04L49/40Constructional details, e.g. power supply, mechanical construction or backplane
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00Packet switching elements
    • H04L49/55Prevention, detection or correction of errors
    • H04L49/555Error detection
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/02Network architectures or network communication protocols for network security for separating internal from external traffic, e.g. firewalls

Abstract

INTEGRATED PACKETIZED VOICE AND DATA SWITCHING SYSTEM
A high capacity metropolitan area network (MAN) is described. Data traffic from users is connected to data concentrators at the edge of the network, and is transmitted over fiber optic data links to a hub where the data is switched.
The hub includes a plurality of data switching modules, each having a control means, and each connected to a distributed control space division switch.
Advantageously, the data switching modules, whose inputs are connected to the concentrators, perform all checking and routing functions, while the 1024x1024 maximum size space division switch, whose outputs are connected to the concentrators, provides a large fan-out distribution network for reaching many concentrators from each data switching module. Distributed control of the space division switch permits several million connection and disconnection actions to be performed each second, while the pipelined and parallel operation within the control means permits each of the 256 switching modules to process at least 50,000 transactions per second. The data switching modules chain groups of incoming packets destined for a common outlet of the space division switch so that only one connection in that switch is required for transmitting each group of chained packets from a data switching module to a concentrator. MAN provides security features including a port identification supplied by the data concentrators, and a check that each packet is from an authorized source user, transmitting on a port associated with that user, to an authorized destination user that is in the same group (virtual network) as the source user.
This arrangement can also be used to switch voice packets, using a voice interface such as a digital switch and a digital voice signal to voice packet converter. In accordance with one embodiment of the invention, a packet switch is used for switching voice packet outputs of the data switching modules and a circuit switch, such as the space division switch, is used for switching data packet outputs. In accordance with another embodiment, voice packets are switched from the data switching modules through the space division switch to a small group of data switching modules, which further switch the voice packets through the circuit switch to a destination concentrator.

Description

INTEGRATED PACKETIZED VOICE AND DATA SWITCHING SYSTEM
Technical Field
This invention relates to arrangements for switching packetized voice and data.
Problem
Integrated telephone voice and data switching systems are becoming available for offering customers integrated services digital network (ISDN) service. In such systems, data is frequently switched by switching data packets using packet switching techniques. The use of packet switching techniques for also switching voice signals converted into packets has been suggested, for example, in J.S. Turner: U.S. Patent 4,491,945 (Turner). Such arrangements offer the opportunity to take advantage of the high speed of modern microelectronic circuitry.
A problem with the technique described, for example, in Turner is that while the internal circuitry of the packet switch is very fast, the communication links connected to these switches cannot be operated at the same high speeds as the internal circuitry of these switches because such an arrangement would require excessive storage in each switch. Also, the data switches do not provide arrangements for performing analysis of the destination of each packet to route the packet appropriately but require that a coded route be generated as part of a header of each packet. Further, these data switches require that voice packets be repeatedly buffered in expensive memory in the many stages of packet switching required in such systems for switching packets to one of a large number of destination communication paths.
Furthermore, these switch architectures are prone to excessive internal congestion caused by focused overloads. Further, large networks for interconnecting many links are expensive using the small modules of the Turner switch.
Solution
The above problems are solved and an advance is made over the prior art in accordance with the principles of this invention wherein voice signals are packetized and the packetized voice signals are switched using a data switching module that includes a group of banks of memory for storing consecutive words of a packet, a group of packet input and a group of packet output handlers and means for distributing data from each of the input handlers to the memory and from the memory to each of the output handlers.
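The idea of storing consecutive words of a packet in consecutive memory banks can be illustrated with a minimal sketch. The function names, the word representation (plain integers), and the `start_module` parameter are assumptions made for illustration; the specification does not define them at this point.

```python
from typing import List


def store_interleaved(words: List[int], num_modules: int,
                      start_module: int = 0) -> List[List[int]]:
    """Place word i of a packet in module (start_module + i) % num_modules.

    Hypothetical model: each memory module is a simple list of words.
    """
    modules: List[List[int]] = [[] for _ in range(num_modules)]
    for i, word in enumerate(words):
        modules[(start_module + i) % num_modules].append(word)
    return modules


def read_interleaved(modules: List[List[int]], length: int,
                     start_module: int = 0) -> List[int]:
    """Read the packet back by visiting the modules in the same rotating order."""
    num_modules = len(modules)
    cursors = [0] * num_modules  # next unread word in each module
    words: List[int] = []
    for i in range(length):
        m = (start_module + i) % num_modules
        words.append(modules[m][cursors[m]])
        cursors[m] += 1
    return words


packet = list(range(10))
banks = store_interleaved(packet, num_modules=4)
restored = read_interleaved(banks, len(packet))
```

Because successive words fall in different banks, several handlers can in principle be served in rotation without any single bank having to run at the full line rate.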
In accordance with one aspect of the invention there is provided a system for switching voice signals comprising: means for converting said voice signals into voice packets; and means, connected to said means for converting, for packet switching said voice packets, comprising: a plurality of input packet handlers and a plurality of output packet handlers; memory access means for controlling storing and reading of said voice packets, comprising a plurality of memory access controllers for storing consecutive words of a voice packet in consecutive members of a plurality of memory modules; and means for distributing said voice packets from said plurality of input packet handlers to said plurality of memory access controllers, for chaining packets to be transmitted to a common group of destinations, and for distributing said chained voice packets from said plurality of memory access controllers to said plurality of output packet handlers.
In accordance with another aspect of the invention there is provided a system for switching voice signals, comprising: a plurality of means for converting said voice signals into voice packets; and a plurality of means, connected to said means for converting, for packet switching said voice packets, comprising: a plurality of input packet handlers and a plurality of output packet handlers; memory access means for controlling storing and reading of said voice packets, comprising a plurality of memory access controllers for storing consecutive words of a voice packet in consecutive members of a plurality of memory modules; means for distributing said voice packets from said plurality of input packet handlers to said plurality of memory access controllers and for distributing said voice packets from said plurality of memory access controllers to said plurality of output packet handlers; and circuit switch means for switching said voice packets between output packet handlers of said plurality of means for packet switching and ones of a plurality of communication paths; wherein said means for packet switching said voice packets comprise means for chaining voice packets in groups, each group for connection over one of said communication paths.
In accordance with another aspect of the invention there is provided a method of switching voice signals comprising the steps of: converting said voice signals to voice packets; transmitting said voice packets to an input packet handler of a data switching means; transmitting data from said input packet handler to a plurality of memory access controllers of said data switching means for controlling storage of voice packets in a plurality of memory modules; chaining packets into groups having a common intermediate destination; and transmitting each of said groups from said plurality of memory access controllers to an output data handler of said data switching means for further transmission to one of said intermediate destinations.
In one embodiment, the outputs of the data switching modules are further switched using a circuit switch with distributed control for setting up large numbers of connections every second. Advantageously, such a circuit switch does not require internal storage. Advantageously, a circuit switch can provide access from any input terminal to a large number of output terminals and if the circuit switch is fast enough, can switch individual packets or groups of packets from a data switching module to a large number of output modules. Advantageously, in spite of the large access provided by such a circuit switch, no further bit synchronization problems are encountered within the circuit switch.
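The chaining step, in which packets bound for a common destination are grouped so that one circuit switch connection serves the whole group, can be sketched as follows. This is a minimal illustration, not the patented mechanism itself: the `Packet` type, its `outlet` field, and the `chain_by_outlet` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Packet:
    outlet: int      # hypothetical: index of the circuit switch outlet
    payload: bytes


def chain_by_outlet(packets: Iterable[Packet]) -> Dict[int, List[Packet]]:
    """Group queued packets by destination outlet, preserving arrival order.

    Each resulting chain can be sent over a single connection, so the
    number of connections equals the number of distinct outlets rather
    than the number of packets.
    """
    chains: Dict[int, List[Packet]] = {}
    for packet in packets:
        chains.setdefault(packet.outlet, []).append(packet)
    return chains


queued = [Packet(7, b"a"), Packet(3, b"b"), Packet(7, b"c"),
          Packet(7, b"d"), Packet(3, b"e")]
chains = chain_by_outlet(queued)
connections_needed = len(chains)   # one connection per chain
```

In this example five queued packets need only two connections, which is the kind of saving the chaining arrangement is aiming at.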
In an alternative embodiment, data packets representing voice signals (voice packets) are switched from the data switching modules through a data switch in order to avoid the circuit set up time limitations of a circuit switch. High priority data packets, and, optionally, any single packet messages, can also be switched through the data switch. Advantageously, the relatively short voice packets can be separated from the data packets representing data, the latter having less rigorous switching delay requirements and, on average, being much longer.
In another alternative embodiment, groups of voice packets are switched from a data and voice packet switch through the space division switch to ones of a group of specialist voice packet switching modules which collect and further switch the groups of voice packets through the circuit switch for connection to the destination. Advantageously, in such an arrangement, voice packets from a source voice and data packet switch destined for a group of destinations can first be assembled into groups of packets destined for a particular specialist voice packet switch, and voice packets from many voice and data packet switches can then be assembled in each voice packet switch into groups destined for a particular destination. Advantageously, the number of circuit switch connections required per voice packet switching interval (i.e., the interval between successive voice packets to a particular receiving customer station) is sharply reduced from the number of connections required for switching such voice packets directly from an initial data switching module to an outlet of the circuit switch for transmission to a destination.
In one embodiment, a local switch is part of the interface between customer voice signals, in analog or digital form, and a packet switching system. The digital output signals from the voice switch are placed on trunks which are connected to a packet assembler/disassembler for packetizing and unpacketizing these signals. Advantageously, such an arrangement permits the complex voice interfaces and control software of a local switch to be used while offering the advantage of a centralized data switching hub for distributing the voice traffic widely. Advantageously, in such an arrangement, data signals from customers can be readily connected to the data switching hub.
For some sources, such as digital private branch exchanges (PBXs), a direct connection is made to the PAD.
Brief Description of the Drawing
FIG. 1 is a graphic representation of the characteristics of the type of communications traffic in a metropolitan area network.
FIG. 2 is a high level block diagram of an exemplary metropolitan area network (referred to herein as MAN) including typical input user stations that communicate via such a network.
FIG. 3 is a more detailed block diagram of the hub of MAN and the units communicating with that hub.
FIGS. 4 and 5 are block diagrams of MAN illustrating how data flows from input user systems to the hub of MAN and back to output user systems.
FIG. 6 is a simplified illustrative example of a type of network which can be used as a circuit switch in the hub of MAN.
FIG. 7 is a block diagram of an illustrative embodiment of a MAN circuit switch and its associated control network.
FIGS. 8 and 9 are flowcharts representing the flow of requests from the data distribution stage of the hub to the controllers of the circuit switch of the hub.
FIG. 10 is a block diagram of one data distribution switch of a hub.
FIGS. 11-14 are block diagrams and data layouts of portions of the data distribution switch of the hub.
FIG. 15 is a block diagram of an operation, administration, and maintenance (OA&M) system for controlling the data distribution stage of the hub.
FIG. 16 is a block diagram of an interface module for interfacing between end user systems and the hub.
FIG. 17 is a block diagram of an arrangement for interfacing between an end user system and a network interface.
FIG. 18 is a block diagram of a typical end user system.
FIG. 19 is a block diagram of a control arrangement for interfacing between an end user system and the hub of MAN.
FIG. 20 is a layout of a data packet arranged for transmission through MAN illustrating the MAN protocol.
FIG. 21 illustrates an alternate arrangement for controlling access from the data distribution switches to the circuit switch control.
FIG. 22 is a block diagram illustrating arrangements for using MAN to switch voice as well as data.
FIG. 23 illustrates an arrangement for synchronizing data received from the circuit switch by one of the data distribution switches.
FIGS. 24 and 26 illustrate alternate arrangements for the hub for switching packetized voice and data.
FIG. 25 is a block diagram of a MAN circuit switch controller.
General Description
The Detailed Description of this specification is a description of an exemplary metropolitan area network (MAN) that incorporates the present invention. Such a network as shown in FIGS. 2 and 3 includes an outer ring of network interface modules (NIMs) 2 connected by fiber optic links 3 to a hub 1. The hub interconnects data and voice packets from any of the NIMs to any other NIM. The NIMs, in turn, are connected via interface modules to user devices connected to the network.

The details of processing voice packets in the MAN are discussed in section 11 with respect to FIGS. 22, 24, and 26. Such an arrangement makes use of a voice user interface, a packet assembler/disassembler (PAD) 1217 (FIG. 22). FIG. 24 further shows an alternate arrangement wherein the packetized voice signals are switched through a packet switch and the data packets are switched through a circuit switch. FIG. 26 further shows another alternate arrangement wherein packetized voice signals are switched twice through packet switching modules and twice through a space division switch; the voice packets enter the data switching module along with data packets from the same NIM and are switched through the space division switch to a small group of packet switches for switching predominantly voice packets; the output of these latter packet switches is then switched once more through the circuit switch to a destination NIM.
Detailed Description
Data networks often are classified by their size and scope of ownership. Local area networks (LANs) are usually owned by a single organization and have a reach of a few kilometers. They interconnect tens to hundreds of terminals, computers, and other end user systems (EUSs). At the other extreme are wide area networks (WANs) spanning continents, owned by common carriers, and interconnecting tens of thousands of EUSs. Between these extremes other data networks have been identified whose scope ranges from a campus to a metropolitan area. The high performance metropolitan area network to be described herein will be referred to as MAN. A table of acronyms and abbreviations is found in Appendix A.
Metropolitan area networks serve a variety of EUSs ranging from simple reporting devices and low intelligence terminals through personal computers to large mainframes and supercomputers. The demands that these EUSs place on a network vary widely. Some may issue messages infrequently while others may issue many messages each second. Some messages may be only a few bytes while others may be files of millions of bytes. Some EUSs may require delivery any time within the next few hours while others may require delivery within microseconds.
This invention of a metropolitan area network is a computer and telephone communications network that has been designed for transmitting broadband low latency data which retains and indeed exceeds the performance characteristics of the highest performance local area networks. A metropolitan area network has size characteristics similar to those of a class 5 or end-office telephone central office; consequently, with respect to size, a metropolitan area network can be thought of as an end-office for data. The exemplary embodiment of the invention, hereinafter called MAN, was designed with this in mind. However, MAN also fits well either as an adjunct to or as part of a switch module for an end-office, thus supporting broadband Integrated Services Digital Network (ISDN) services. MAN can also be effective as either a local area or campus area network. It is able to grow gracefully from a small LAN through campus sized networks to a full MAN.
The rapid proliferation of workstations and their servers, and the growth of distributed computing are major factors that motivated the design of this invention. MAN was designed to provide networking for tens of thousands of diskless workstations and servers and other computers over tens of kilometers, where each user has tens to hundreds of simultaneous and different associations with other computers on the network. Each networked computer can concurrently generate tens to hundreds of messages per second, and require I/O rates of tens to hundreds of millions of bits/second (Mbps). Message sizes may range from hundreds of bits to millions of bits. With this level of performance, MAN is capable of supporting remote procedure calls, interobject communications, remote demand paging, remote swapping, file transfer, and computer graphics. The goal is to move most messages (or transactions as they will be referred to henceforth) from an EUS memory to another EUS memory within less than a millisecond for small transactions and within a few milliseconds for large transactions. FIG. 1 classifies transaction types and shows desired EUS response times as a function of both transaction type and size: simple (i.e., low intelligence) terminals 70, remote procedure calls (RPCs) and interobject communications (IOCs) 72, demand paging 74, memory swapping 76, animated computer graphics 78, computer graphics still pictures 80, file transfers 82, and packetized voice 84. Meeting the response time/transaction speeds of FIG. 1 represents part of the goals of the MAN network. As a calibration, lines of constant bit rate are shown where the bit rate is likely to dominate the response time. MAN has an aggregate bit rate of 150 gigabits per second and can handle 20 million network transactions per second with the exemplary choice of the processor elements shown in FIG. 14. Furthermore, it has been designed to handle traffic overloads gracefully.
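Where bit rate dominates response time, the time to move a transaction is roughly its size divided by the link rate. The sketch below works through that arithmetic, assuming the 150 Mbps external link rate stated elsewhere in this description; the function name and the chosen message sizes are illustrative only, and the figures ignore queuing, processing, and propagation delay.

```python
def serialization_time_s(size_bits: float, rate_bps: float) -> float:
    """Time to clock a message of size_bits onto a link of rate_bps."""
    return size_bits / rate_bps


LINK_RATE_BPS = 150e6          # external link rate, on the order of 150 Mbps
AGGREGATE_RATE_BPS = 150e9     # aggregate bit rate quoted for MAN
TRANSACTIONS_PER_S = 20e6      # transaction rate quoted for MAN

small_s = serialization_time_s(1_000, LINK_RATE_BPS)       # a few hundred bits to 1 kbit
large_s = serialization_time_s(1_000_000, LINK_RATE_BPS)   # a 1 Mbit transaction

# Average transaction size implied if both quoted limits were reached together.
implied_avg_bits = AGGREGATE_RATE_BPS / TRANSACTIONS_PER_S

print(f"1 kbit at 150 Mbps: {small_s * 1e6:.2f} microseconds")
print(f"1 Mbit at 150 Mbps: {large_s * 1e3:.2f} milliseconds")
print(f"implied average transaction: {implied_avg_bits:.0f} bits")
```

On these assumptions a 1 kbit transaction serializes in a few microseconds and a 1 Mbit transaction in a few milliseconds, which is in line with the sub-millisecond and few-millisecond goals stated above.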

MAN is a network which performs switching and routing as many systems do, but also addresses a myriad of other necessary functions such as error handling, user interfacing, and the like. Significant privacy and security features in MAN are provided by an authentication capability. This capability prevents unauthorized network use, enables usage-sensitive billing, and provides non-forgeable source identification for all information. Capability also exists for defining virtual private networks.
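The abstract describes the check in three parts: the source must be an authorized user, it must be transmitting on the port associated with that user, and the destination must be an authorized user in the same group (virtual network) as the source. A minimal sketch of such a check follows. The `User` record, the dictionary-based directory, and the function name are assumptions for illustration and are not taken from the specification.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class User:
    name: str
    port: int                 # hypothetical: the network port registered for this user
    groups: FrozenSet[str]    # hypothetical: virtual networks the user belongs to


def is_authorized(directory: Dict[str, User], source: str,
                  destination: str, arrival_port: int) -> bool:
    """Apply the three checks: known users, matching port, shared group."""
    src = directory.get(source)
    dst = directory.get(destination)
    if src is None or dst is None:
        return False                      # unknown source or destination
    if src.port != arrival_port:
        return False                      # source is not on its registered port
    return bool(src.groups & dst.groups)  # must share at least one virtual network


directory = {
    "alice": User("alice", 3, frozenset({"eng"})),
    "bob": User("bob", 5, frozenset({"eng", "ops"})),
    "carol": User("carol", 9, frozenset({"sales"})),
}
allowed = is_authorized(directory, "alice", "bob", arrival_port=3)
```

Because the port identification is supplied by the concentrator rather than by the end user, a user cannot satisfy the second check simply by claiming another user's name.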
MAN is a transaction-oriented (i.e., connectionless) network. It does not need to incur the overhead of establishing or maintaining connections although a connection veneer can be added in a straightforward fashion if desired.
MAN can also be used for switching packetized voice. Because of the short delay in traversing the network, the priority which may be given to the transmission of single packet entities, and the low variation of delay when the network is not heavily loaded, voice or a mixture of voice and data can be readily supported by MAN. For clarity, the term data as used hereinafter includes digital data representing voice signals, as well as digital data representing commands, numerical data, graphics, programs, data files and other contents of memory.
MAN, though not yet completely built, has been extensively simulated. Many of the capacity estimates presented hereinafter are based on these simulations.
2 ARCHITECTURE AND OPERATION
2.1 Architecture
The MAN network is a hierarchical star architecture with two or three levels depending upon how closely one looks at the topology. FIG. 2 shows the network as consisting of a switching center called a hub 1 linked to network interface modules 2 (NIMs) at the edge of the network.
The hub is a very high performance transaction store-and-forward system that gracefully grows from a small four link system to something very large that is capable of handling over 20 million network transactions per second and that has an aggregate bit rate of 150 gigabits per second.
Radiating out from the hub for distances of up to tens of kilometers are optical fibers (or alternative data channels) called external links (XLs) (connect NIM to MINT), each capable of handling full duplex bit rates on the order of 150 megabits per second. An XL terminates in a NIM.

A NIM, the outer edge of which delineates the edge of the network, acts as a concentrator/demultiplexer and also identifies network ports. It concentrates when moving information into the network and demultiplexes when moving information out of the network. Its purpose in concentrating/demultiplexing is to interface multiple end user systems 26 (EUSs) to the network in such a way as to use the link efficiently and cost effectively.
Up to 20 EUSs 26 can be supported by each NIM depending upon the EUSs' networking needs. Examples of such EUSs are the increasingly common advanced function workstations 4 where the burst rates are already in the 10 Mbps range (with the expectation that much faster systems will soon be available) with average rates orders of magnitude lower. If the EUS needs an average rate that is closer to its burst rate and the average rates are of the same order of magnitude as that of a NIM, then a NIM can either provide multiple interfaces to a single EUS 26 or can provide a single interface with the entire NIM and XL dedicated to that EUS. Examples of EUSs of this type include large mainframes 5 and file servers 6 for the above workstations, local area networks such as ETHERNET® 8 and high performance local area networks 7 such as Proteon® 80, an 80 Mbit token ring manufactured by Proteon Corp., or a system using a fiber distributed data interface (FDDI), an evolving American National Standards Institute (ANSI) standard protocol ring interface. In the latter two cases, the LAN itself may do the concentration and the NIM then degenerates to a single port network interface module. Lower performance local area networks such as ETHERNET 8 and IBM token rings may not need all of the capability that an entire NIM provides. In these cases, the LAN, even though it concentrates, may connect to a port 8 on a multiport NIM.
Within each EUS there is a user interface module (UIM) 13. This unit serves as a high bit rate direct memory access port for the EUS and as a buffer for transactions received from the network. It also off-loads the EUS from MAN interface protocol concerns. Closely associated with the UIM is the MAN EUS resident driver. It works with the UIM to format outgoing transactions, receive incoming transactions, implement protocols, and interface with the EUS's operating system.
A closer inspection (see FIG. 3) of the hub reveals two different functional units - a MAN switch (MANS) 10 and one or more memory interface modules 11 (MINTs). Each MINT is connected to up to four NIMs via XLs 3 and thus can accommodate up to 80 EUSs. The choice of four NIMs per MINT is based upon a number of factors including transaction handling capacity, buffer memory size within the MINT, growability of the network, failure group size, and aggregate bit rate.
Each MINT is connected to the MANS by four internal links 12 (ILs) (connect MINT and MAN switch), one of which is shown for each of the MINTs in FIG. 3. The reason for four links in this case is different than it is for the XLs. Here multiple links are necessary because the MINT will normally be sending information through the MANS to multiple destinations concurrently; a single IL would present a bottleneck. The choice of 4 ILs (as well as many other design choices of a similar nature) was made on the basis of extensive analytical and simulation modeling. The ILs run at the same bit rate as the external links but are very short since the entire hub is colocated.
The smallest hub consists of one MINT with the ILs looped back and no switch. A network based upon this hub includes up to four NIMs and accommodates up to 80 EUSs. The largest hub that is currently envisioned consists of 256 MINTs and a 1024 x 1024 MANS. This hub accommodates 1024 NIMs and up to 20,000 EUSs. By adding MINTs and growing the MANS, the hub and ultimately the entire network grows very gracefully.
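The growth figures quoted above follow from the per-MINT fan-out. The short Python sketch below reproduces that arithmetic; the function name hub_capacity and its dictionary keys are illustrative assumptions rather than terms used in this description.

```python
# Illustrative arithmetic only; names are assumptions, not part of MAN.
NIMS_PER_MINT = 4   # each MINT terminates up to four XLs
EUS_PER_NIM = 20    # each NIM supports up to 20 EUSs
ILS_PER_MINT = 4    # each MINT has four internal links to the MANS


def hub_capacity(mints: int) -> dict:
    """Return upper bounds on NIMs, EUSs and switch ports for a hub."""
    nims = mints * NIMS_PER_MINT
    return {
        "nims": nims,
        "eus": nims * EUS_PER_NIM,
        # An N x N MANS needs one port per internal link.
        "switch_ports": mints * ILS_PER_MINT,
    }


smallest = hub_capacity(1)    # one MINT, ILs looped back, no switch
largest = hub_capacity(256)   # largest hub currently envisioned
print(smallest)
print(largest)
```

For 256 MINTs this yields 1024 NIMs, 1024 switch ports, and 20,480 EUSs, which the text rounds to 20,000.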
2.1.1 LUWUs, Packets, SUWUs, and Transactions
Before going further several terms need to be discussed. EUS transactions are transfers of units of EUS information that are meaningful to the EUS. Such a transaction might be a remote procedure call consisting of a few bytes or the transfer of a 10 megabyte database. MAN recognizes two EUS transaction unit sizes that are called long user work units (LUWUs) and short user work units (SUWUs) for the purposes of this description. While the delimiting size is easily engineerable, usually transaction units of a couple of thousand bits or less are considered SUWUs while larger transaction units are LUWUs. SUWUs are given priority within the network to reduce response time based upon criteria shown in FIG. 1 where it can be seen that the smaller EUS transaction units usually need faster EUS transaction response times. SUWUs are kept intact as a single frame or packet as they move through the network. LUWUs are fragmented into frames or packets, called packets hereinafter, by the transmitting UIM. Packets and SUWUs are sometimes collectively referred to as network transaction units.
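The size-based distinction between the two work unit types can be illustrated with a few lines of Python. This is a sketch only: the description calls the delimiting size "engineerable" and gives just "a couple of thousand bits", so the 2000-bit default and the name classify_transaction are assumptions made for illustration.

```python
# Placeholder limit: the text only says "a couple of thousand bits".
SUWU_LIMIT_BITS = 2000


def classify_transaction(size_bits: int, limit_bits: int = SUWU_LIMIT_BITS) -> str:
    """Return "SUWU" for short work units and "LUWU" for long ones."""
    if size_bits < 0:
        raise ValueError("size must be non-negative")
    return "SUWU" if size_bits <= limit_bits else "LUWU"


# A remote procedure call of a few bytes versus a 10 megabyte database.
print(classify_transaction(8 * 8))
print(classify_transaction(10 * 10**6 * 8))
```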


Transfers through the MAN switch are referred to as switch transactions and the units transferred through the MANS are switch transaction units. They are composed of one or more network transaction units destined for the same NIM.
2.2 Functional Unit Overview
Prior to discussing the operation of MAN, it is useful to provide a brief overview of each major functional unit within the network. The units described are the UIM 13, NIM 2, MINT 11, MANS 10, end user system link (connects NIM and UIM) (EUSL) 14, XL 3, and IL 12 respectively. These units are depicted in FIG. 4.
2.2.1 User Interface Module - UIM 13
This module is located within the EUS and often plugs onto an EUS backplane such as a VME® bus (an IEEE standard bus), an Intel MULTIBUS II®, or a mainframe I/O channel. It is designed to fit on one printed circuit board for most applications. The UIM 13 connects to the NIM 2 over a duplex optical fiber link called the EUS link 14 (EUSL), driven by optical transmitters 97 and 85. This link runs at the same speed as the external link (XL) 3. The UIM has a memory queue 15 used to store information on its way to the network. Packets and SUWUs are stored and forwarded to the NIM using out-of-band flow control.
By way of contrast, a receive buffer memory 90 must exist to receive information from the network. In this case entire EUS transactions may sometimes be stored until they can be transferred into End User System memory. The receive buffer must be capable of dynamic buffer chaining. Partial EUS transactions may arrive concurrently in an interleaved fashion.
Optical Receiver 87 receives signals from optical link 14 for storage in receive buffer memory 90. Control 25 controls UIM 13, and controls exchange of data between transmit first-in-first-out (FIFO) queue 15 or receive buffer memory 90 and a bus interface for interfacing with bus 92 which connects to end user system 26. The details of the control of UIM 13 are shown in FIG. 19.
2.2.2 Network Interface Module - NIM 2
A NIM 2 is the part of MAN that is at the edge of the network. A NIM performs six functions: (1) concentration/demultiplexing including queuing of packets and SUWUs moving toward the MINT and external link arbitration, (2) participation in network security using port identification, (3) participation in congestion control, (4) EUS-to-network control message identification, (5) participation in error handling, and (6) network interfacing. Small queues 94 in memory similar to those 15 found in the UIM exist for each End User System. They receive information from the UIM via link 14 and receiver 88 and store it until XL 3 is available for transmission to the MINT. The outputs of these queues drive a data concentrator 95 which in turn drives an optical transmitter 96. An external link demand multiplexer exists which services demands for the use of the XL. The NIM prefixes a port identification number 600 (FIG. 20) to each network transaction unit flowing toward the MINT. This is used in various ways to provide value added services such as reliable and non-fraudulent sender identification and billing. This prefix is particularly desirable for ensuring that members of a virtual network are protected from unauthorized access by outsiders. A check sequence is processed for error control. The NIM, working with the hub 1, determines congestion status within the network and controls flow from the UIMs under high congestion conditions. The NIM also provides a standard physical and logical interface to the network including flow control mechanisms.
Information flowing from the network to the EUS is passed through the NIM via receiver 89, distributed to the correct UIM by data distributor 86, and sent to destination UIM 13 by transmitter 85 via link 14. No buffering is done at the NIM in this direction.
There are only two types of NIMs. One type (such as shown in FIG. 4 and the upper right of FIG. 3) concentrates while the other type (shown at the lower right of FIG. 3) does not.
2.2.3 Memory and Interface Module - MINT 11
MINTs are located in the hub. Each MINT 11 consists of: (a) up to four external link handlers 16 (XLHs) that terminate XLs and also receive signals from the half of the internal link that moves data from the switch 10 to the MINT; (b) four internal link handlers 17 (ILHs) that generate data for the half of the IL that moves data from a MINT to the switch; (c) a memory 18 for storing data while awaiting a path from the MINT through the switch to the destination NIM; (d) a Data Transport Ring 19 that moves data between the link handlers and the memory and also carries MINT control information; and (e) a control unit 20.
All functional units within the MINT are designed to accommodate the peak aggregate bit rate for data moving concurrently into and out of the MINT. Thus the ring, which is synchronous, has a set of reserved slots for moving information from each XLH to memory and another set of reserved slots for moving information from memory to each ILH. It has a read plus write bit rate of over 1.5 Gbps. The memory is 512 bits wide so that an adequate memory bit rate can be achieved with components having reasonable access times. The size of the memory (16 Mbytes) can be kept small because the occupancy time of information in the memory is also small (about 0.57 milliseconds under full network load). However, this is an engineerable number that can be adjusted if necessary.
The XLHs are bi-directional but not symmetric. Information moving from NIM to MINT is stored in MINT memory. Header information is copied by the XLH and sent to the MINT control for processing. In contrast, information moving from the switch 10 toward a NIM is not stored in the MINT but simply passes through the MINT, without being processed, on its way from MANS 10 output to a destination NIM 2. Due to variable path lengths in the switch, the information leaving the MANS 10 is out of phase with respect to the XL. A phase alignment and scrambler circuit (described in section 6.1) must align the data before transmission to the NIM can occur. Section 4.6 describes the internal link handler (ILH).
The MINT performs a variety of functions including (1) some of the overall routing within the network, (2) participation in user validation, (3) participation in network security, (4) queue management, (5) buffering of network transactions, (6) address translation, (7) participation in congestion control, and (8) the generation of operation, administration, and maintenance (OA&M) primitives.
The control for the MINT is a data flow processing system tailored to the MINT control algorithms. Each MINT is capable of processing up to 80,000 network transactions per second. A fully provisioned hub with 250 MINTs can therefore process 20 million network transactions per second. This is discussed further in section 2.3.
2.2.4 MAN Switch - MANS 10
The MANS consists of two main parts: (a) the fabric 21 through which information passes and (b) the control 22 for that fabric. The control allows the switch to be set up in about 50 microseconds. Special properties of the fabric allow the control to be decomposed into completely independent sub-controllers that can operate in parallel. Additionally, each sub-controller can be pipelined. Thus, not only is the setup time very fast but many paths can be set up concurrently and the "setup throughput" can be made high enough to accommodate high request rates from large numbers of MINTs. MANSs can be made in various sizes ranging from 16 x 16 (handling four MINTs) to 1024 x 1024 (handling 256 MINTs).


2.2.5 End User System Link - EUSL 14
The end user system link 14 connects the NIM 2 to the UIM 13 that resides within the end user's equipment. It is a full duplex optical fiber link that runs at the same rate and in synchronism with the external link on the other side of the NIM. It is dedicated to the EUS to which it is connected. The length of the EUSL is intended to be on the order of meters to 10s of meters. However, there is no reason why it couldn't be longer if economics allow it.
The basic format and data rate for the EUSL for the present embodiment of the invention was chosen to be the same as that of the Metrobus Lightwave System OS-1 link. Whatever link layer data transmission standard is eventually adopted would be used in later embodiments of MAN.
2.2.6 External Links - XL 3
The external link (XL) 3 connects the NIM to the MINT. It is also a full duplex synchronous optical fiber link. It is used in a demand multiplexed fashion by the end user systems connected to its NIM. The length of the XL is intended to be on the order of 10s of kilometers. Demand multiplexing is used for economic reasons. It employs the Metrobus OS-1 format and data rate.
2.2.7 Internal Links - IL 24
The internal link 24 provides connectivity between a MINT and the MAN switch. It is a unidirectional semi-synchronous link that retains frequency but loses the synchronous phase relationship as it passes through the MANS 10. The length of the IL 24 is on the order of meters but could be much longer if economics allowed. The bit rate of the IL is the same as that of OS-1. The format, however, has only limited similarity to OS-1 because of the need to resynchronize the data.
2.3 Software Overview
Using a workstation/server paradigm, each end user system connected to MAN is able to generate over 50 EUS transactions per second consisting of LUWUs and SUWUs. This translates into about 400 network transactions per second (packets and SUWUs). With up to 20 EUSs per NIM, each NIM must be capable of handling up to 8000 network transactions per second with each MINT handling up to four times this amount or 32,000 network transactions per second. These are average or sustained rates. Burst conditions may substantially increase "instantaneous" rates for a single EUS 26. Averaging over a number of EUSs will, however, smooth out individual EUS bursts. Thus while each NIM port must deal with bursts of considerably more than 50 network transactions per second, NIMs (2) and XLs (3) are likely to see only moderate bursts. This is even more true of MINTs 11, each of which serves 4 NIMs. The MAN switch 10 must pass an average of 8 million network transactions per second, but the switch controller does not need to process this many switch requests since the design of the MINT control allows multiple packets and SUWUs going to the same destination NIM to be switched with a single switch setup.
A second factor to be considered is network transaction interarrival time. With rates of 150 Mbps and the smallest network transaction being an SUWU of 1000 bits, two SUWUs could arrive at a NIM or MINT 6.67 microseconds apart. NIMs and MINTs must be able to handle several back-to-back SUWUs on a transient basis.
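The sustained rates and the minimum interarrival time quoted above can be checked with a short calculation. The Python sketch below is illustrative only; the function and parameter names are assumptions, and the default values are the figures given in the text (the 256-MINT count is the largest hub described earlier).

```python
LINK_BPS = 150e6  # XL and IL bit rate, 150 Mbps


def sustained_rates(eus_per_nim=20, tx_per_eus=400, nims_per_mint=4, mints=256):
    """Average network transactions per second at each level of the hub."""
    nim = eus_per_nim * tx_per_eus   # per NIM
    mint = nims_per_mint * nim       # per MINT
    return {"nim": nim, "mint": mint, "switch": mints * mint}


def min_interarrival_us(unit_bits=1000, link_bps=LINK_BPS):
    """Spacing in microseconds of back-to-back units on one link."""
    return unit_bits / link_bps * 1e6


rates = sustained_rates()
gap_us = min_interarrival_us()
print(rates)
print(round(gap_us, 2))
```

With these inputs the per-NIM and per-MINT rates are 8000 and 32,000 transactions per second, the switch total is 8,192,000 (about 8 million), and back-to-back 1000-bit SUWUs are roughly 6.67 microseconds apart.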
The control software in the NIMs and especially the MINTs must deal with this severe real-time transaction processing. The asymmetry and bursty nature of data traffic requires a design capable of processing peak loads for short periods of time. Thus the transaction control software structure must be capable of executing many hundreds of millions of CPU instructions per second (100's of MIPS). Moreover, in MAN, this control software performs a multiplicity of functions including routing of packets and SUWUs, network port identification, queuing of network transactions destined for the same NIM over up to 1000 NIMs (this means real time maintenance of up to 1000 queues), handling of MANS requests and acknowledgements, flow control of source EUSs based on complex criteria, network traffic data collection, congestion control, and a myriad of other tasks.
The MAN control software is capable of performing all of the above tasks in real time. The control software is executed in three major components: NIM control 23, MINT control 20, and MANS control 22. Associated with these three control components is a fourth control structure 25 within the UIM 13 of the End User System 26. FIG. 5 shows this arrangement. Each NIM and MINT has its own control unit. The control units function independently but cooperate closely. This partitioning of control is one of the architectural mechanisms that makes possible MAN's real-time transaction processing capability. The other mechanism that allows MAN to handle high transaction rates is the technique of decomposing the control into a logical array of subfunctions and independently applying processing power to each subfunction. This approach has been greatly facilitated by the use of Transputer® very large scale integration (VLSI) processor devices made by INMOS Corp. The technique basically is as follows:

- Decompose the problem into a number of subfunctions.
- Arrange the subfunctions to form a dataflow structure.
- Implement each subfunction as one or more processes.
- Bind sets of processes to processors, arranging the bound processors in the same topology as the dataflow structure so as to form a dataflow system that will execute the function.
- Iterate as necessary to achieve the real-time performance required.
Brief descriptions of the functions performed by the NIM, MINT, and MANS (most of which are done by the software control for those modules) are given in sections 2.2.2 through 2.2.4. Additional information is given in section 2.4. Detailed descriptions are included later in this description within specific sections covering these subsystems.
2.3.1 Control Processors
The processors chosen for the system implementation are Transputers from INMOS Corp. These 10 million instructions/second (MIPS) reduced instruction set computer (RISC) machines are designed to be connected in an arbitrary topology over 20 Mbps serial links. Each machine has four links with an input and output path capable of simultaneous direct memory access (DMA).
2.3.2 MINT Control Performance
Because of the need to process a large number of transactions per second, the processing of each transaction is broken into serial sections which form a pipeline. Transactions are fed into this pipeline where they are processed simultaneously with other transactions at more advanced stages within the pipe. In addition, there are multiple parallel pipelines each handling unique processing streams simultaneously. Thus, the required high transaction processing rate, where each transaction requires routing and other complex servicing, is achieved by breaking the control structure into such a parallel/pipelined fabric of interconnected processors.
A constraint on MINT control is that any serial processing can take no longer than 1 / (number of transactions per second processed in this pipeline).

A further constraint concerns the burst bandwidth for headers entering the control within an XLH 16. If the time between successive network units arriving at the XLH is less than (header size) / (bandwidth into control) then the XLH must buffer headers. The maximum number of transactions per second assuming uniform arrival is given by:
(bandwidth into control) / (size of transaction header).

An example based upon the effective bit rate of transputer links and the 40 byte MAN network transaction header is:

(8.0 Mb/s for control link) / (320 bit header/transaction) = 25,000 transactions/sec. per XLH,

or one transaction per XLH every 40 microseconds. Because transaction interarrival times can be less than this, header buffering is performed in the XLH.
The MINT must be capable, within this time, of routing, executing billing primitives, making switch requests, performing network control, memory management, operation, administration, and maintenance activities, name serving, and also providing other network services such as yellow page primitives. The parallel/pipelined nature of MINT control 20 achieves these goals.
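The header-bandwidth constraint above reduces to two small formulas. The following Python sketch evaluates them with the figures from the example (an 8.0 Mb/s effective control link and a 40 byte header); the function names are hypothetical and chosen only for illustration.

```python
def max_header_rate(control_link_bps: float, header_bytes: int):
    """Return (transactions per second, microseconds per transaction)."""
    header_bits = header_bytes * 8
    rate = control_link_bps / header_bits
    return rate, 1e6 / rate


def xlh_must_buffer(interarrival_us: float, header_period_us: float) -> bool:
    """Headers must be buffered if units arrive faster than the control
    link can carry their headers."""
    return interarrival_us < header_period_us


rate, period_us = max_header_rate(8.0e6, 40)
print(rate, period_us)
# Back-to-back 1000-bit SUWUs at 150 Mbps arrive about 6.67 us apart.
print(xlh_must_buffer(6.67, period_us))
```

Because 6.67 microseconds is shorter than the 40 microsecond header period, the sketch reports that buffering is needed, in line with the statement that header buffering is performed in the XLH.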
As an example, the allocating and freeing of high-speed memory blocks can be processed completely independently of routing or billing primitives. Transaction flow within a MINT is controlled in a single pipe by the management of the memory block address used for storing a network transaction unit (i.e., packet or SUWU). At the first stage of the pipe, memory management allocates free blocks of high-speed MINT memory. Then, at the next stage, these blocks are paired with the headers and routing translation is done. Then switch units are collected based on memory blocks sent to common NIMs, and to close the loop the memory blocks are freed after the blocks' data is transmitted into the MANS. Billing primitives are simultaneously handled within a different pipe.
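The life cycle of a memory block described above (allocate, pair with a header and route, collect per destination NIM, free after transmission) can be modeled in a few lines. The Python sketch below is a toy, single-threaded model and not the MINT control itself: the class name, the method names, and the assumption that the header already carries a dest_nim field are all illustrative.

```python
from collections import defaultdict, deque


class MintBlockPipeline:
    """Toy model of the memory-block loop inside a MINT."""

    def __init__(self, num_blocks: int):
        self.free_blocks = deque(range(num_blocks))
        # Destination NIM -> list of (block address, header) awaiting a path.
        self.switch_units = defaultdict(list)

    def allocate(self) -> int:
        # Stage 1: memory management hands out a free block address.
        return self.free_blocks.popleft()

    def route(self, block: int, header: dict) -> None:
        # Stages 2 and 3: pair the block with its header and collect it
        # into the switch unit for its destination NIM. Real routing
        # translation is replaced here by a lookup of header["dest_nim"].
        self.switch_units[header["dest_nim"]].append((block, header))

    def transmit(self, dest_nim: int) -> list:
        # Stage 4: send everything queued for this NIM in one switch
        # setup, then return the blocks to the free list.
        sent = self.switch_units.pop(dest_nim, [])
        for block, _header in sent:
            self.free_blocks.append(block)
        return sent


pipe = MintBlockPipeline(num_blocks=4)
for dest in (7, 7, 3):
    pipe.route(pipe.allocate(), {"dest_nim": dest})
sent = pipe.transmit(7)
print(len(sent), list(pipe.free_blocks))
```

In the usage lines, two units bound for NIM 7 leave in a single transmit call and their blocks return to the free list, while the unit for NIM 3 stays queued.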
2.4 MAN Operation
The EUS 26 is viewed by the network as a user with capabilities granted by a network administration. This is analogous to a terminal user logged into a time-sharing system. The user, such as a workstation or a front end processor acting as a concentrator for stations or even networks, will be required to make a physical connection at a NIM port and then identify itself via its MAN name, virtual network identification, and password security. The network adjusts routing tables to map data destined for this name to a unique NIM port. The capabilities of this user are associated with the physical port. The example just given accommodates the paradigm of a portable workstation. Ports may also be configured to have fixed capabilities and possibly be "owned" by one MAN named end user. This gives users dedicated network ports or provides privileged administrative maintenance ports. Source EUSs refer to the destination by MAN names or services, so they are not required to know anything about the dynamic network topology.
The high bit rate and large transaction processing capability internal to the network yield very short response times and provide the EUS with a means to move data in a metropolitan area without undue network considerations. A MAN end user will see EUS-memory-to-EUS-memory response times as low as a millisecond, low error rates, and the ability to send a hundred EUS transactions per second on a sustained basis. This number can expand to several thousand for high performance EUSs. The EUS will send data in whatever size is appropriate to its needs with no maximum upper bound. Most of the limitations on optimizing MAN performance are imposed by the limits of the EUS and applications, not the overhead of the network. The user will supply the following information on transmitting data to the UIM:
- A MAN name and virtual network name for the destination address that is independent of the physical address.
- The size of the data.
- A MAN type field denoting network service required.
- The data.
Network transactions (packets and SUWUs) move along the following logical path (see FIG. 5):

sourceUIM ==> sourceNIM ==> MINT ==> MANS ==> destinationNIM (via MINT) ==> destinationUIM.

Each EUS transaction (i.e., LUWU or SUWU) is submitted to its UIM. Inside the UIM, a LUWU is further fragmented into variable size packets. An SUWU is not fragmented but is logically viewed in its entirety as a network transaction. However, the determination that a network transaction is an SUWU is not made until the SUWU reaches the MINT where the information is used in dynamically categorizing data into SUWUs and packets for optimal network handling. The NIM checks incoming packets from the EUS to verify that they do not violate a maximum packet size. The UIM may pick packet sizes smaller than the maximum depending on EUS stated service. For optimum MINT memory utilization, the packet size is the standard maximum. However under some circumstances, the application may request that a smaller packet size be used because of end user considerations such as timing problems or data availability timing. Additionally, there may be timing limits where the UIM will send what it currently has from the EUS. Even where the maximum size packet is used, the last packet of a LUWU usually is smaller than the maximum size packet.
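Fragmentation of a LUWU in the UIM can be pictured as simple slicing. The Python sketch below is a minimal illustration under stated assumptions: the function name fragment_luwu is hypothetical, and the 4096-byte maximum used in the example is a placeholder, since this section does not state the maximum packet size.

```python
def fragment_luwu(data: bytes, max_packet: int) -> list:
    """Split LUWU data into packets of at most max_packet bytes.

    Every packet except possibly the last has the maximum size.
    """
    if max_packet <= 0:
        raise ValueError("max_packet must be positive")
    return [data[i:i + max_packet] for i in range(0, len(data), max_packet)]


# Placeholder sizes, chosen only to show the short final packet.
packets = fragment_luwu(bytes(10_000), max_packet=4096)
print([len(p) for p in packets])
```

With these placeholder sizes the packet lengths are 4096, 4096, and 1808 bytes, showing why the last packet of a LUWU is usually smaller than the maximum.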
At the transmitting UIM each network transaction (packet or SUWU) is prefixed with a fixed length MAN network header. It is the information within this header which the MAN network software uses to route, bill, offer network services, and provide network control. The destination UIM also uses the information within this header in its job of delivering EUS transactions to the end user. The network transactions are stored in the UIM source transaction queue from which they are transmitted to the source NIM.
Upon receiving network transactions from UIMs, the NIM receives them in queues permanently dedicated to the EUSLs on which the transactions arrived, for forwarding to the MINT 11 as soon as the link 3 becomes available. The control software within the NIM processes the UIM to NIM protocol to identify control messages and prepends a source port number to the transaction that will be used by the MINT to authenticate the transaction. End-user data will never be touched by MAN network software unless the data is addressed to the network as control information provided by the end user. As the transactions are processed, the source NIM concentrates them onto the external link between the source NIM and its MINT. The source NIM to MINT links terminate at a hardware interface in the MINT (the external link handler or XLH 16).
The external link protocol between the NIM and MINT allows the XLH 16 to detect the beginning and end of network transactions. The transactions are immediately moved into a memory 18 designed to handle the 150 Mb/s bursts of data arriving at the XLH. This memory access is via a high-speed time slotted ring 19 which guarantees each 150 Mb/s XLH input and each 150 Mb/s output from the MINT (i.e., MANS inputs) bandwidth with no contention. For example, a MINT which concentrates 4 remote NIMs and has 4 input ports to the center switch must have a burst access bandwidth of at least 1.2 Gb/s. The memory storage is used in fixed length blocks of a size equal to the maximum packet size plus the fixed length MAN header. The XLH moves an address of a fixed size memory block followed by the packet or SUWU data to the memory access ring. The data and network header are stored until the MINT control 20 causes its transmission into the MANS. The MINT control 20 will continually supply the XLHs with free memory block addresses for storing the incoming packets and SUWUs. The XLH also "knows" the length of the fixed size network header. With this information the XLH passes a copy of the network header to MINT control 20. MINT control 20 pairs the header with the block address it had given the XLH for storing the packet or SUWU. Since the header is the only internal representation of the data within MINT control it is vital that it be correct. To ensure sanity due to potential link errors the header has a cyclic redundancy check (CRC) of its own. The path this tuple takes within MINT control must be the same for all packets of any given LUWU (this allows ordering of LUWU data to be preserved). Packet and SUWU headers paired with the MINT memory block address will move through a pipeline of processors. The pipeline allows multiple CPUs to process different network transactions at various stages of MINT processing. In addition, there are multiple pipelines to provide concurrent processing.
MINT control 20 selects an unused internal link 24 and requests a path setup from the IL to the destination NIM (through the MINT attached to that NIM). MAN switch control 22 queues the request and, when (1) the path is available and (2) the XL 3 to the destination NIM is also available, it notifies the source MINT while concurrently setting up the path. This, on average and under full load, takes 50 microseconds. Upon notification, the source MINT transmits all network transactions destined for that NIM, thus taking maximum advantage of the path setup. The internal link handler 17 requests network transactions from the MINT memory and transmits them over the path:

ILH ==> sourceIL ==> MANS ==> destinationIL ==> XLH,
this XLH being attached to the destination NIM. The XLH recovers bit synchronization on the way to the destination NIM. Note that information, as it leaves the switch, simply passes through a MINT on its way to the destination NIM. The MINT doesn't process it in any way other than to recover bit synchronization that has been lost in going through the MANS.
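The request, queue, and grant behavior of the switch control described above can be sketched as a small queue model. The Python code below is a deliberately simplified illustration: it models only the availability of the XL to the destination NIM and treats the fabric path as always available, and the class and method names are assumptions, not names from this description.

```python
from collections import deque


class SwitchControlSketch:
    """Toy model: grant a queued request once its destination XL is idle."""

    def __init__(self):
        self.busy_xls = set()   # destination NIMs whose XL is in use
        self.pending = deque()  # queued (source IL, destination NIM) requests

    def request(self, source_il: int, dest_nim: int) -> None:
        self.pending.append((source_il, dest_nim))

    def grant_ready(self) -> list:
        """Grant, in arrival order, every request whose XL is free."""
        granted, waiting = [], deque()
        while self.pending:
            il, nim = self.pending.popleft()
            if nim in self.busy_xls:
                waiting.append((il, nim))
            else:
                self.busy_xls.add(nim)
                granted.append((il, nim))
        self.pending = waiting
        return granted

    def release(self, dest_nim: int) -> None:
        # Called when the source MINT has finished sending to this NIM.
        self.busy_xls.discard(dest_nim)


ctl = SwitchControlSketch()
ctl.request(0, 5)
ctl.request(1, 5)
ctl.request(2, 9)
print(ctl.grant_ready())   # the second request for NIM 5 must wait
ctl.release(5)
print(ctl.grant_ready())
```

Once a request is granted, the source MINT would send every network transaction queued for that NIM before releasing the path, which is what lets one setup serve many transactions.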


As information (i.e., switch transactions made up of one or more network transactions) arrives at the destination NIM it is demultiplexed into network transactions (packets and SUWUs) and forwarded to the destination UIMs. This is done "on the fly"; there is no buffering in the NIM on the way out of the network.
The receiving UIM 13 will store the network transactions in its receive buffer memory 90 and recreate EUS transactions (LUWUs and SUWUs). A LUWU may arrive at the UIM in packet sized pieces. As soon as at least part of a LUWU arrives, the UIM will notify the EUS of its existence and will, upon instructions from the EUS, transmit under the control of its DMA, partial EUS or whole EUS transactions into the EUS memory in DMA transfer sizes specified by the EUS. Alternate paradigms exist for transfer from UIM to EUS. For instance, an EUS can tell the UIM ahead of time that whenever anything arrives the UIM should transfer it to a specified buffer in EUS memory. The UIM would then not need to announce the arrival of information but would immediately transfer it to the EUS.
2.5 Additional Considerations
2.5.1 Error Handling
In order to achieve latencies in the order of hundreds of microseconds from EUS memory to EUS memory, errors must be handled in a manner that differs from that used by conventional data networks today. In MAN, network transactions have a header check sequence 626 (FIG. 20) (HCS) appended to the header and a data check sequence 646 (FIG. 20) (DCS) appended to the entire network transaction.
Consider the header first. The source UIM generates a HCS before transmission to the source NIM. At the MINT the HCS is checked and, if in error, the transaction is discarded. The destination NIM performs a similar action for a third time before routing the transaction to the destination UIM. This scheme prevents misdelivery of information due to corrupted headers. Once a header is found to be flawed, nothing in the header can be considered reliable and the only option that MAN has is to discard the transaction.
The source UIM is also required to provide a DCS at the end of the user data. This field is checked within the MAN network but no action is taken if errors are found. The information is delivered to the destination UIM, which can check it and take appropriate action. Its use within the network is to identify both EUSL and internal network problems.


Note that there is never any attempt within the network to correct errors using the usual automatic repeat request (ARQ) techniques found in most of today's protocols. The need for low latency precludes this. Error correcting schemes would be too costly except for the headers, and even here the time penalty may be too great as has sometimes been the case in computer systems. However, header error correction may be employed later if experience proves that it is needed and time-wise possible.
Consequently, MAN checks for errors and discards transactions when there is reason to suspect the validity of the headers. Beyond this, transactions are delivered even if flawed. This is a reasonable approach for three reasons. First, intrinsic error rates over optical fibers are of the same order as error rates over copper when common ARQ protocols are employed. Both are in the range of 10^-11 bits per bit. Secondly, graphics applications (which are increasing dramatically) often can tolerate small error rates where pixel images are transmitted; a bit or two per image would usually be fine. Finally, where error rates need to be better than the intrinsic rates, EUS-to-EUS ARQ protocols can be used (as they are today) to achieve these improved error rates.
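The policy just described (discard on a bad header, deliver regardless of a bad data check, never retransmit inside the network) can be sketched as follows. This Python example is illustrative only: CRC-32 from the standard zlib module stands in for the HCS and DCS algorithms, which this section does not specify, the DCS is computed over the user data alone for simplicity, and all function names are assumptions.

```python
import zlib


def checksum(data: bytes) -> int:
    # Stand-in check sequence; the real HCS/DCS algorithms are not given here.
    return zlib.crc32(data)


def make_transaction(header: bytes, data: bytes) -> dict:
    """Build a transaction as the source UIM would, with both checks."""
    return {"header": header, "hcs": checksum(header),
            "data": data, "dcs": checksum(data)}


def check_in_network(tx: dict, fault_log: list):
    """Return tx for delivery, or None if the header cannot be trusted."""
    if checksum(tx["header"]) != tx["hcs"]:
        return None                        # discard; no ARQ in the network
    if checksum(tx["data"]) != tx["dcs"]:
        fault_log.append("DCS mismatch")   # noted for fault isolation only
    return tx                              # delivered even if data is flawed


log = []
tx = make_transaction(b"hdr", b"payload")
print(check_in_network(tx, log) is tx)
print(check_in_network(dict(tx, header=b"hdx"), log))
```

Any end-to-end recovery, such as an EUS-to-EUS ARQ protocol, would sit outside this function, at the end user systems.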
2.5.2 Authentication
MAN provides an authentication feature. This feature assures a destination EUS of the identity of the source EUS for each and every transaction it receives. Malicious users cannot send transactions with forged "signatures". Users are also prevented from using the network free of charge; all users are forced to identify themselves truthfully with each and every transaction that they send into the network, thus providing for accurate usage-sensitive billing. This feature also provides the primitive capability for other features such as virtual private networks.
When an EUS first attaches to MAN, it "logs in" to a well known and privileged Login Server that is part of the network. The login server is in an administrative terminal 350 (FIG. 15) with an attached disk memory 351. The administrative terminal 350 is accessed via an OA&M MINT processor 315 (FIG. 14) and a MINT OA&M monitor 317 in the MINT central control 20, and an OA&M central control (FIG. 15). This login is achieved by the EUS (via its UIM) sending a login transaction to the server through the network. This transaction contains the EUS identification number (its name), its requested virtual network, and a password. In the NIM a port number is prefixed to the transaction before it is forwarded to the MINT for routing to the server. The Login Server notes the id/port pairing and informs the MINT attached to the source NIM of that pairing. It also acknowledges its receipt of the login to the EUS, telling the EUS that it may now use the network.
When using the network, each and every network transaction that is sent to the source NIM from the EUS has, within its header, its source id plus other information in the header described below with respect to FIG. 20. The NIM prefixes the port number to the transaction and forwards it to the MINT, where the pairing is checked. Incorrect pairing results in the MINT discarding the transaction. In the MINT, the prefixed source port number is replaced with a destination port number before it is sent to the destination NIM. The destination NIM uses this destination port number to complete the routing to the destination EUS.
If an EUS wishes to disconnect from the network, it "logs off" in a manner similar to its login. The Login Server informs the MINT of this and the MINT removes the id/port information, thus rendering that port inactive.
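The id/port pairing check described above can be sketched as a small lookup table held by the MINT. This is only an illustration: the class name `Mint`, its method names, and the use of strings for EUS ids are assumptions introduced here, not details taken from the specification.

```python
class Mint:
    """Hypothetical model of the id/port table a MINT keeps for authentication."""

    def __init__(self):
        # EUS id -> NIM port number, as reported by the Login Server at login
        self.port_of = {}

    def login(self, eus_id, port):
        """Record the pairing the Login Server reports after a successful login."""
        self.port_of[eus_id] = port

    def logoff(self, eus_id):
        """Remove the pairing, rendering the port inactive for that id."""
        self.port_of.pop(eus_id, None)

    def admit(self, prefixed_port, source_id):
        """True if the port prefixed by the NIM matches the logged-in pairing.

        A False result corresponds to the MINT discarding the transaction.
        """
        return self.port_of.get(source_id) == prefixed_port


mint = Mint()
mint.login("eus-7", 3)
print(mint.admit(3, "eus-7"))   # source id arrives on the port it logged in from
print(mint.admit(4, "eus-7"))   # forged source id arriving on another port
```

Because the port number is prefixed by the NIM rather than supplied by the EUS, a user cannot alter the value that the table is checked against.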
2.5.3 Guaranteed Ordering
From NIM to NIM the notion of a LUWU does not exist. Even though LUWUs lose their identity within the NIM to NIM envelope, the packets of a given LUWU must follow a path through predetermined XLs and MINTs.
This allows ordering of packets arriving at UIMs to be preserved for a LUWU.
However, packets may be discarded due to flawed headers. The UIM checks for missing packets and notifies the EUS in the event that this occurs.
2.5.4 Virtual Circuits and Infinite LUWUs
The network does not set up a circuit through to the destination but rather switches groups of packets and SUWUs as resources become available.
This does not prevent the EUS from setting up virtual circuits; for example, the EUS could write an infinite size LUWU with the appropriate UIM timing parameters. Such a data stream would appear to the EUS as a virtual circuit, while to the network it would be a never ending LUWU that moves packets at a time.
The implementation of this concept must be handled between the UIM and the EUS protocols since there may be many different types of EUS and UIMs. The end-user can be transmitting multiple data streams to any number of destinations at any one time. These streams are multiplexed on packet and SUWU boundaries on the transmit link between the source UIM and the source NIM.

A parameter, to be adjusted for optimum performance as the system is loaded, limits the time (equivalent to limiting the length of the data stream) that one MINT can send data to a NIM in order to free that NIM to receive data from other MINTs. An initial value of 2 milliseconds appears reasonable based on simulations. The value can be adjusted dynamically in response to traffic patterns in the system, with different values possible for different MINTs or NIMs, and at different times of the day or different days of the week.
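To make the equivalence between a time limit and a data stream length concrete, the short calculation below converts the 2 millisecond initial value into bits. It assumes the link runs at the 150 Mb/s rate quoted elsewhere in this description; the function and constant names are illustrative only.

```python
LINK_RATE_BPS = 150_000_000  # assumption: the 150 Mb/s link rate quoted for MAN


def max_burst_bits(limit_microseconds, rate_bps=LINK_RATE_BPS):
    """Longest data stream, in bits, that fits inside the sending time limit."""
    return limit_microseconds * rate_bps // 1_000_000


print(max_burst_bits(2_000))  # 2 ms limit expressed in microseconds
```

Under that assumption a 2 millisecond limit corresponds to a stream of 300,000 bits, and halving the limit halves the permitted stream length.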

The MAN switch (MANS) is the fast circuit switch at the center of the MAN hub. It interconnects the MINTs, and all end-user transactions must pass through it. The MANS consists of the switch fabric itself (called the data network or DNet), plus the switch control complex (SCC), a collection of controllers and links that operate the DNet fabric. The SCC must receive requests from the MINTs to connect or disconnect pairs of incoming and outgoing internal links (ILs), execute the requests when possible, and inform the MINTs of the outcome of their requests.
These apparently straightforward operations must be carried out at a high performance level. The demands of the MAN switching problem are discussed in the next section. Next, Section 3.2 presents the fundamentals of a distributed-control circuit-switched network that is offered as a basis for a solution to such switching demands. Section 3.3 tailors this approach to the specific needs of MAN and covers some aspects of the control structure that are critical to high performance.
3.1 Characterizing the Problem
First we estimate some numerical values for the demands on the MAN switch. Nominally, the MANS must establish or remove a transaction's connection in fractions of a millisecond in a network with hundreds of ports, each running at 150 Mb/s and each carrying thousands of separately switched transactions per second. Millions of transaction requests per second imply a distributed control structure where numerous pipelined controllers process transaction requests in parallel.
The combination of so many ports, each running at a high speed, has several implications. First, the bandwidth of the network must be at least 150 Gb/s, thus requiring multiple data paths (nominally 150 Mb/s) through the network. Second, a 150 Mb/s synchronous network would be difficult to build (although an asynchronous network needs to recover clock or phase). Third, since inband signaling creates a more complex (self-routing) network fabric and requires buffering within the network, an out-of-band signaling (separate control) approach is desirable.
In MAN, transaction lengths are expected to vary by several orders of magnitude. These transactions can share a single switch, as discussed hereinafter, with adequate delay performance for small transactions. The advantage of a single fabric is that data streams do not have to be separated before switching and recombined afterwards.
A problem to be dealt with is the condition where the requested output port is busy. To set up a connection, the given input and output ports must be concurrently idle (the so-called concurrency problem). If an idle input (output) port waits for the output (input) to become idle, the waiting port is inefficiently utilized and other transactions needing that port are delayed. If the idle port is instead given to other transactions, the original busy destination port may have become idle and busy again in the meantime, thus adding further delay to the original transaction. The delay problem is worse when the port is busy with a large transaction.
Any concurrency resolution strategy requires that each port's busy/idle status be supplied to the controllers concerned with it. To maintain a high transaction rate, this status update mechanism must operate with short delays.
If transaction times are short and most delays are caused by busy ports, an absolutely non-blocking network topology is not required, but the blocking probability should be small enough so as not to add much to delays or burden the SCC with excessive unachievable connection requests.
Broadcast (one to many) connections are a desirable network capability. However, even if the network supports broadcasting, the concurrency problem (here even worse with the many ports involved) must be handled without disrupting other traffic. This seems to rule out the simple strategy of waiting for all destination ports to become idle and broadcasting to all of them at once.
Regardless of the special needs of the MAN network, the MANS satisfies the general requirements for any practical network. Startup costs are reasonable. The network is growable without disrupting existing fabric. The topology is inherently efficient in its use of fabric and circuit boards. Finally, the concerns of operational availability - reliability, fault tolerance, failure-group size and ease of diagnosis and repair - are met.

3.2 General Approach - A Distributed-Control Circuit-Switching Network
In this section we describe the basic approach used in the MANS. It specifically addresses the means by which a large network can be run by a group of controllers operating in parallel and independently of one another. The distributed control mechanism is described in terms of two-stage networks, but with a scheme to extend the approach to multistage networks. Section 3.3 presents details of the specific design for MAN.
A major advantage of our approach is that the plurality of network controllers operate independently of one another using only local information.
Throughput (measured in transactions) is increased because controllers do not burden each other with queries and responses. Also the delay in setting up or tearing down connections is reduced because the number of sequential control steps is minimized. All this is possible because the network fabric is partitioned into disjoint subsets, each of which is controlled solely by its own controller that uses global static information, such as the internal connection pattern of the data network 120, but only local dynamic (network state) data. Thus, each controller sees and handles only those connection requests that use the portion of the network for which it is responsible, and monitors the state of only that portion.
3.2.1 Partitioning Two-Stage Networks
Consider the 9 x 9 two-stage network example in FIG. 6 comprising three input switches IS1 (101), IS2 (102), and IS3 (103), and three output switches OS1 (104), OS2 (105), and OS3 (106). We can partition its fabric into three disjoint subsets. Each subset includes the fabric in a given second stage switch (OSx) plus the fabric (or crosspoints) in the first stage switches (ISy) that connect to the links going to that second stage switch. For example, in FIG. 6, the partition or subset associated with OS1 (104) is shown by a dashed line around the crosspoints in OS1 plus dashed lines around three crosspoints in each of the first stage switches (101,102,103) (those crosspoints being those that connect to the links to OS1).
Now, consider a controller for this subset of the network. It would be responsible for connections from any inlet to any outlet on OS1. The controller would maintain busy/idle status for the crosspoints it controlled. This information is clearly enough to tell whether a connection is possible. For example, suppose an inlet on IS1 is to be connected to an outlet on OS1. We assume that the request is from the inlet, which must be idle. The outlet can be determined to be idle from outlet busy/idle status memory or else from the status of the outlet's three crosspoints in OS1 (all three must be idle). Next, the status of the link between IS1 and OS1 must be checked. This link will be idle if the two crosspoints at each end of the link, which connect the link to the remaining two inlets and outlets, are all idle. If the inlet, outlet, and link are all idle, a crosspoint in each of IS1 and OS1 can be closed to set up the requested connection.
Note that this activity can proceed independently of activities in the other (disjoint) subsets of the network. The reason is that the network has only two stages, so the inlet switches may be partitioned according to their links to second stage switches. In theory this approach applies to any two-stage network, but the usefulness of the scheme depends on the network's blocking characteristics. The network in FIG. 6 would block too frequently, because it can connect at most one inlet on a given inlet switch to an outlet on a given second stage switch.
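A minimal sketch of one such partition controller follows, modelled on the 9 x 9 example with a single link from each first stage switch to the controlled second stage switch. The class and method names are hypothetical, and the sketch tracks link and outlet status directly rather than individual crosspoints, which is one of the two status options mentioned above.

```python
class PartitionController:
    """Hypothetical controller for one second stage switch and its feeding links."""

    def __init__(self, first_stage_count=3, outlet_count=3):
        # Local dynamic state only: one link per first stage switch, plus outlets.
        self.link_busy = [False] * first_stage_count
        self.outlet_busy = [False] * outlet_count
        self.link_of_outlet = {}  # outlet -> first stage switch currently feeding it

    def connect(self, first_stage, outlet):
        """Try to connect an (assumed idle) inlet on first_stage to outlet."""
        if self.outlet_busy[outlet]:
            return "outlet busy"
        if self.link_busy[first_stage]:
            return "blocked"  # the only link from that first stage switch is in use
        self.outlet_busy[outlet] = True
        self.link_busy[first_stage] = True
        self.link_of_outlet[outlet] = first_stage
        return "connected"

    def disconnect(self, outlet):
        """Release the outlet and the link that was feeding it."""
        first_stage = self.link_of_outlet.pop(outlet)
        self.outlet_busy[outlet] = False
        self.link_busy[first_stage] = False


os1 = PartitionController()
print(os1.connect(first_stage=0, outlet=0))  # first inlet on IS1 gets through
print(os1.connect(first_stage=0, outlet=1))  # second inlet on IS1 finds the link in use
```

The second request illustrates the blocking behaviour of the FIG. 6 network: with one link per switch pair, a second inlet on the same first stage switch cannot reach the same second stage switch even though the requested outlet is idle.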
A two-stage network, referred to hereinafter as a Richards network, of the type described in G. W. Richards et al.: "A Two-Stage Rearrangeable Broadcast Switching Network", IEEE Transactions on Communications, v. COM-33, no. 10, October 1985, avoids this problem by wiring each inlet port to multiple appearances spread over different inlet switches. The distributed control scheme operates on a Richards network, even though MAN may not use such Richards network features as broadcast and rearrangement.
3.2.2 Control Network
3.2.2.1 Function
In MAN, requests for connections come from inlets, actually, the central control 20 of the MINTs. These requests must be distributed to the proper switch controller via a control network (CNet). In FIG. 7, both the DNet 120 for circuit-switched transactions and the control CNet 130 are shown. The DNet is a two-stage rearrangeably non-blocking Richards network. Each switch 121,123 includes a rudimentary crosspoint controller (XPC) 122,124 which accepts commands to connect a specified inlet on the switch to a specified outlet by closing the proper crosspoint. The first and second stages' XPCs (122,124) are abbreviated 1SC (first stage controller) and 2SC (second stage controller) respectively.
On the right side of the CNet are 64 MANS controllers 140 (MANSCs) corresponding to and controlling 64 disjoint subsets of the DNet, partitioned by second stage outlet switches as described earlier. Since the controllers and their network are overlaid on the DNet and not integral to the data fabric, they could be replaced by a single controller in applications where transaction throughput is not critical.
3.2.2.2 Structure
The CNet shown in FIG. 7 has special properties. It consists of three similar parts 130,134,135, corresponding to flows of messages from a MINT to a MANSC, orders from a MANSC to an XPC, and acknowledgments (ACKs) or negative acknowledgments (NAKs) from a MANSC to a MINT. Each of the networks 130,134 and 135 is a statistically multiplexed time-division switch, and comprises a bus 132, a group of interfaces 133 for buffering control data to a destination or from a source, and a bus arbiter controller (BAC) 131. The bus arbiter controller controls the gating of control data from an input to the bus. The address of the destination selects the output to which the bus is to be gated. The output is connected to a controller (network 130: a MANSC 140) or an interface (networks 134 and 135: interfaces similar to interface 133). The request inputs and ACK/NAK responses are concentrated by control data concentrators and distributors 136,138, each control data concentrator concentrating data to or from four MINTs. The control data concentrators and distributors simply buffer data from or to the MINTs. The interfaces 133 in the CNet handle statistical demultiplexing and multiplexing (steering and merging) of control messages. Note that the interconnections made by bus 132 for a given request message in the DNet are the same as those requested in the CNet.
3.2.3 Connection Request Scenario
The connection request scenario begins with a connection request message arriving at the left of CNet 130 in a multiplexed stream on one of the message input links 137 from one of the data concentrators 136. This request includes the DNet 120 inlet and outlet to be connected. In the CNet 130, the message is routed to the appropriate link 139 on the right side of the CNet according to the outlet to be connected, which is uniquely associated with a particular second stage switch and therefore also with a particular MANS controller 140.
This MANSC consults a static global directory (such as a ROM) to find which first stage switches carry the requesting inlet. Independently of other MANSCs, it now checks dynamic local data to see whether the outlet is idle and any links from the proper first stage switches are idle. If the required resources are idle, the MANSC sends a crosspoint connect order to its own second stage outlet switch plus another order to the proper first stage switch via network 134.
The latter order includes a header to route it to the correct first stage.
This approach can achieve extremely high transaction throughput for several reasons. All network controllers can operate in parallel, independently of one another, and need not wait for one another's data or go-aheads. Each controller sees only those requests for which it is responsible and does not waste time with other messages. Each controller's operations are inherently sequential and independent functions and thus may be pipelined with more than one request in progress at a time.
The above scenario is not the only possibility. Variables to be considered include broadcast -vs- point-to-point inlets, outlet- -vs- inlet-oriented connection requests, rearrangement -vs- blocking-allowed operation, and disposition of blocked or busy connect requests. Although these choices are already settled for MAN, all these options can be handled with the control topology presented, simply by changing the logic in the MANSCs.
3.2.4 Multistage Networks
This control structure is extendible to multistage Richards networks, where switches in a given stage are recursively implemented as two-stage networks. The resultant CNet is one in which connection requests pass sequentially through S-1 controllers in an S-stage network, where again controllers are responsible for disjoint subsets of the network and operate independently, thus retaining the high throughput potential.
3.3 Specific Design for MAN
In this section we first examine those system attributes that drive the design of the MANS. Next, the data and control networks are described. Finally, the functions of the MANS controller are discussed in detail, including design tradeoffs that affect performance.
3.3.1 System Attributes
3.3.1.1 External and Internal Interfaces
FIG. 7 illustrates a prototypical fully-grown MANS composed of a DNet 120 with 1024 incoming and 1024 outgoing ILs and CNet 22 comprising three control message networks 130,135,134 each with 64 incoming and 64 outgoing message links. The ILs are partitioned into groups of 4, one group for each of 256 MINTs. The DNet is a two-stage network of 64 first stage switches 121 and 64 second stage switches 123. Each switch includes an XPC 122 that takes commands to open and close crosspoints. For each of the DNet's 64 second stages 123, there is an associated MANSC 140 with a dedicated control link to the XPC 124 in its second stage switch.
Each control link and status link interfaces 4 MINTs to the CNet's left-to-right and right-to-left switch planes via 4:1 control data concentrators and distributors 136,138 which are also part of the CNet 22. These may be regarded either as remote concentrators in each 4-MINT group or as parts of their associated 1:64 CNet 130,135 stages; in the present embodiment, they are part of the CNet. A third 64x64 plane 134 of the CNet gives each MANSC 140 a dedicated right-to-left interface 133 with one link to each of the 64 1SCs 122.
Each MINT 11 interfaces with the MANS 10 through its four ILs 12, its request signal to control data concentrator 136, and the acknowledge signal received back from control data distributor 138.
Alternatively, each CNet could have 256 instead of 64 ports on its MINT side, eliminating the concentrators.
3.3.1.2 Size
The MANS diagram in FIG. 7 represents a network needed to switch data traffic for up to 20,000 EUSs. Each NIM is expected to handle and concentrate the traffic of 10 to 20 EUSs onto a 150 Mb/s XL, giving about 1000 XLs (rounded off in binary to 1024). Each MINT serves 4 XLs for a total of 256 MINTs. Each MINT also handles 4 ILs, each with an input and an output termination on the DNet portion of the MANS. The data network thus has 1024 inputs and 1024 outputs. Internal DNet link sizing will be addressed later.
Failure-group size and other considerations lead to a DNet with 32 input links on each first stage switch 121, each of which links is connected to two such switches. There are 16 outputs on each second stage switch 123 of the DNet. Thus, there are 64 of each type of switch and also 64 MANSCs 140 in the CNet, one per second stage switch.
3.3.1.3 Traffic and Consolidation
The "natural" EUS transactions of data to be switched vary in size by several orders of magnitude, from SUWUs of a few hundred bits to LUWUs of a megabit or more. As explained in Section 2.1.1, MAN breaks larger EUS transactions into network transactions or packets of at most a few thousand bits each. But the MANS deals with the switch transaction, defined as the burst of data that passes through one MANS connection per one connect (and disconnect) request. Switch transactions can vary in size from a single SUWU to several LUWUs (many packets) for reasons about to be given. For the rest of Section 3, "transaction" means "switch transaction" except as noted.
For a given total data rate through the MANS, the transaction throughput rate (transactions/second) varies inversely with the transaction size.
Thus, the smaller the transaction size, the greater the transaction throughput must be to maintain the data rate. This throughput is limited by the individual throughputs of the MANSCs (whose connect/disconnect processing delays reduce the effective IL bandwidth) and also by concurrency resolution (waiting for busy outlets). Each MANSC's overhead per transaction is of course independent of transaction size.
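The inverse relation between transaction size and required transaction throughput can be illustrated with a one-line calculation. The transaction sizes used below are hypothetical round numbers chosen for the illustration, and the function name is likewise an assumption.

```python
def transactions_per_second(data_rate_bps, transaction_bits):
    """Transaction rate needed to sustain data_rate_bps with a given transaction size."""
    return data_rate_bps / transaction_bits


# One 150 Mb/s link kept full with 1,000-bit versus 10,000-bit transactions.
print(transactions_per_second(150_000_000, 1_000))
print(transactions_per_second(150_000_000, 10_000))
```

A tenfold increase in transaction size reduces the required transaction rate tenfold for the same data rate, which is the leverage the consolidation strategy relies on.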
Although larger transactions reduce the transaction throughput demands, they will add more delays to other transactions by holding outlets and fabric paths for longer times. A compromise is needed: small transactions reduce blocking and concurrency delays, but large transactions ease the MANSC and MINT workloads and improve the DNet duty cycle. The answer is to let MAN dynamically adjust its transaction sizes under varying loads for the best performance.
The DNet is large enough to handle the offered load, so the switching control complex's (SCC) throughput is the limiting factor. Under light traffic, the switch transactions will be short, mostly single SUWUs and packets. As traffic levels increase so does the transaction rate. As the SCC transaction rate capacity is approached, transaction sizes are dynamically increased to maintain the transaction rate just below the point where the SCC would overload. This is achieved automatically by the consolidation control strategy, whereby each MINT always transmits in a single switch transaction all available SUWUs and packets targeted for a given destination, even though each burst may contain the whole or parts of several EUS transactions. Further increases in traffic will increase the size, but not so much the number, of transactions. Thus fabric and IL utilization improve with load, while the SCC's workload increases only slightly. Section 3.3.3.2.1 explains the feedback mechanism that controls transaction size.
3.3.1.4 Performance Goals
Nevertheless, MAN's data throughput depends on extremely high performance of individual SCC control elements. For example, each XPC 122,124 in the data switch will be ordered to set and clear at least 67,000 connections per second. Clearly, each request must be handled in at most a few microseconds.


Likewise, the MANSCs' functions must be done quickly. We assume that these steps will be pipelined; then the sum of the step processing times will contribute to connect and disconnect delays, and the maximum of these step times will limit transaction throughput. We aim to hold the maximum and sum to a few microseconds and a few tens of microseconds, respectively.
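The distinction between the sum and the maximum of the step times can be made concrete with a small calculation. The four step times below are hypothetical values chosen only to illustrate the relationship; they are not figures from the design.

```python
def pipeline_figures(step_times_us):
    """Return (delay_us, max_requests_per_second) for a pipelined controller.

    Delay through the pipeline is the sum of the step times; throughput is
    bounded by the slowest step, since a new request can enter only as fast
    as that step completes.
    """
    delay_us = sum(step_times_us)
    max_rate = 1_000_000 / max(step_times_us)
    return delay_us, max_rate


delay, rate = pipeline_figures([2.0, 4.0, 3.0, 1.0])  # hypothetical step times
print(delay, rate)
```

With these assumed values each request spends 10 microseconds in the controller, yet the controller can accept a new request every 4 microseconds.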
The resolution of the concurrency problem must also be quick and efficient. Busy/idle status of destination terminals will have to be determined in about 6 microseconds, and the control strategy must avoid burdening MANSCs with unfulfillable connection requests.
One final performance issue relates to the CNet itself. The network and its access links must run at high speeds (probably at least 10 Mb/s) to keep control message transmit times small and so that links will run at low occupancies to minimize the contention delays from statistical multiplexing.
3.3.2 Data Network (DNet)
The DNet is a Richards two-stage rearrangeably non-blocking broadcast network. This topology was chosen not so much for its broadcast capability, but because its two-stage structure allows the network to be partitioned into disjoint subsets for distributed control.
3.3.2.1 Design Parameters
The capabilities of the Richards network derive from the assignment of inlets to multiple appearances on different first stage switches according to a definite pattern. The particular assignment pattern chosen, the number m of multiple appearances per inlet, the total number of inlets, and the number of links between first and second stage switches determine the maximum number of outlets per second stage switch permitted for the network to be rearrangeably non-blocking.
The DNet in FIG. 7 has 1024 inlets, each with two appearances on the first stage switches. There are two links between each first and second stage switch. These parameters along with the pattern of distributing the inlets ensure that with 16 outlets per second stage switch the network will be rearrangeably non-blocking for broadcast.
Since MAN does not use broadcast or rearrangement, those parameters not justified by failure-group or other considerations may be changed as more experience is obtained. For example, if a failure group size of 32 were deemed tolerable, each second stage switch could have 32 outputs, thus reducing the number of second stage switches by a factor of 2. Making such a change would depend on the ability of the SCC control elements each to handle twice as much traffic. In addition, blocking probabilities would increase and it would have to be determined that such an increase would not significantly detract from the performance of the network.
The network has 64 first stage switches 121 and 64 second stage switches 123. Since each inlet has two appearances and there are two links between first and second stage switches, each first stage switch has 32 inlets and 128 outlets and each second stage has 128 inlets and 16 outlets.
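The switch counts and port counts quoted above follow from the stated parameters by simple arithmetic, which the sketch below reproduces. The function name and dictionary keys are illustrative; the default values are the figures given in this section.

```python
def dnet_sizing(inlets=1024, appearances=2, inlets_per_first_stage=32,
                outlets_per_second_stage=16, links_per_pair=2, ils_per_mint=4):
    """Derive DNet switch and port counts from the design parameters."""
    first_stage = inlets * appearances // inlets_per_first_stage
    # The DNet has as many outlets as inlets (1024 of each).
    second_stage = inlets // outlets_per_second_stage
    return {
        "mints": inlets // ils_per_mint,
        "first_stage_switches": first_stage,
        "second_stage_switches": second_stage,
        # Each first stage switch has links_per_pair links to every second stage.
        "first_stage_outlets": second_stage * links_per_pair,
        "second_stage_inlets": first_stage * links_per_pair,
    }


print(dnet_sizing())
print(dnet_sizing(outlets_per_second_stage=32)["second_stage_switches"])
```

The second call corresponds to the failure-group example above: allowing 32 outputs per second stage switch halves the number of second stage switches, and therefore the number of MANSCs.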
3.3.2.2 Operation
Since each inlet has two appearances and since there are two links between each first and second stage switch, any outlet switch can access any inlet on any one of four links. The association of inlets to links is algorithmic and thus may be computed or alternatively read from a table. The path hunt involves simply choosing an idle link (if one exists) from among the four link possibilities.
If none of the four links is idle, a re-attempt to make a connection is made later and is requested by the same MINT. Alternatively, existing connections could be re-arranged to remove the blocking condition, a simple procedure in a Richards network. However, rerouting a connection in midstream could introduce a phase glitch beyond the outlet circuit's ability to recover phase and clock. Thus with present circuitry, it is preferable not to run the MANS as a rearrangeable switch.
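The path hunt can be sketched as a search over the four candidate links. The actual inlet-to-switch assignment pattern of the Richards network is not reproduced here; the sketch simply takes the two first stage switches carrying the inlet as an input, as if read from the static table. All names and the dictionary representation of link status are assumptions.

```python
def path_hunt(first_stage_switches, link_busy, links_per_pair=2):
    """Pick an idle link from the candidates, or return None if all are busy.

    first_stage_switches: the first stage switches on which the inlet appears
        (two in the DNet), as obtained from the static directory.
    link_busy: this controller's local status, keyed by (first_stage, link_index);
        links absent from the mapping are treated as idle.
    """
    for first_stage in first_stage_switches:
        for link_index in range(links_per_pair):
            if not link_busy.get((first_stage, link_index), False):
                return (first_stage, link_index)
    return None  # blocked: the MINT re-attempts the connection later


# Hypothetical inlet appearing on first stage switches 5 and 37.
status = {(5, 0): True, (5, 1): True, (37, 0): True}
print(path_hunt([5, 37], status))
```

Because no rearrangement is attempted, a `None` result is simply reported back and the requesting MINT retries later.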
Each switch in the DNet has an XPC 122,124 on the CNet, which receives messages from the MANSCs telling which crosspoints to operate. No high-level logic is performed by these controllers.
3.3.3 Control Network and MANS Controller Functions
3.3.3.1 Control Network (CNet)
The CNet 130,134,135 briefly described earlier interconnects the MINTs, MANSCs, and 1SCs. It must carry three types of messages: connect/disconnect orders from MINTs to MANSCs using block 130, crosspoint orders from MANSCs to 1SCs using block 134, and ACKs and NAKs from MANSCs back to the MINTs using block 135. The CNet shown in FIG. 7 has three corresponding planes or sections. The private MANSC 140 to 2SC 124 links are shown but are not considered part of the CNet as no switching is required.
In this embodiment, the 256 MINTs access the CNet in groups of 4, resulting in 64 input paths to and 64 output paths from the network. The bus elements in the control network perform merging and routing of message streams.


A request message from a MINT includes the ID of the outlet port to be connected or disconnected. Since the MANSCs are associated one-to-one with second stage switches, this outlet specification identifies the proper MANSC to which the message is routed.
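The one-to-one association between second stage switches and MANSCs means the destination controller can be derived from the outlet ID alone. The sketch below assumes, purely for illustration, that outlets are numbered contiguously with 16 per second stage switch; the specification does not state the numbering scheme, and the function name is hypothetical.

```python
OUTLETS_PER_SECOND_STAGE = 16  # figure given for the DNet in FIG. 7


def mansc_for_outlet(outlet_id, outlets_per_switch=OUTLETS_PER_SECOND_STAGE):
    """Index of the MANSC (and second stage switch) that owns an outlet.

    Assumes contiguous outlet numbering, which is an illustrative choice.
    """
    return outlet_id // outlets_per_switch


print(mansc_for_outlet(0), mansc_for_outlet(16), mansc_for_outlet(1023))
```

Under this numbering the 1024 outlets map onto MANSC indices 0 through 63, so the CNet needs no other information to steer a request.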
The MANSCs transmit acknowledgment (ACK), negative acknowledgment (NAK), and 1SC command messages via the right-to-left portion of the CNet (blocks 134,135). These messages will also be formatted with header information to route the messages to the specified MINTs and 1SCs.
The CNet and its messages raise significant technical challenges.
Contention problems in the CNet may mirror those of the entire MANS, requiring their own concurrency solution. These are apparent in the Control Network shown in FIG. 7. The control data concentrators 136, which merge four lines into one interface, may have contention where more than one message tries to arrive at one time.
The data concentrators 136 have storage for one request from each of the four connected MINTs, and the MINTs ensure that consecutive requests are sent sufficiently far apart that the previous request from a MINT has already been passed on by the concentrator before the next arrives. The MINTs time out if no acknowledgement of a request is received within a prespecified time.
Alternatively, the control data concentrators 136 could simply "OR" any requests received on any input to the output; garbled requests would be ignored and not acknowledged, leading to a time out.
Functionally, what is needed inside the blocks 130,134,135 is a micro-LAN specialized for tiny fixed-length packets, low contention, and minimal delay. Ring nets are easy to interconnect, grow gracefully, and permit simple tokenless add/drop protocols, but they are ill-suited for so many closely packed nodes and have intolerable end-to-end delays.
Since the longest message (a MINT's connect order) has under 32 bits, a parallel bus 132 serves as a CNet fabric that can send a complete message in one cycle. Its arbitration controller 131, in handling contention for the bus, would automatically solve contention for the receivers. Bus components are duplicated for reliability (not shown).
3.3.3.2 MAN Switch Controller (MANSC) Operations
FIGS. 8 and 9 show a flowchart of the MANSC's high level functions. Messages to each MANSC 140 include a connect/disconnect bit, SUWU/packet bit, and the IDs of the MANS input and output ports involved.


3.3.3.2.1 Request Queues; Consolidation (Intake Section, FIG. 8)
Since the rate of message arrivals at each MANSC 140 can exceed its message processing rate, a MANSC provides entrance queues for its messages.
Connect and disconnect requests are handled separately. Connects are not enqueued unless their requested outlets are idle.
Priority and regular packet connect messages are provided separate queues 150,152 so that priority packets can be given higher priority. An entry from the regular packet queue 152 is processed only if the priority queue 150 is empty. This minimizes the priority packets' processing delays at the expense of the regular packets', but it is estimated that priority traffic will not usually be heavy enough to add much to packet delays. Even so, delays are likely to be more user-tolerable with the lower priority large data transactions than with priority transactions. Also, if a packet is one of many pieces of a LUWU, any given packet delay may have no final effect since end-to-end LUWU delay depends only on the last packet.
Both the priority and regular packet queues are short, intended only to cover short-term random fluctuations in message arrivals. If the short-term rate of arrivals exceeds the MANSC's processing rate, the regular packet queue and perhaps the priority queue will overflow. In such cases a control negative acknowledge (CNAK) is returned to the requesting MINT, indicating a MANSC overload. This is no catastrophe, but rather the feedback mechanism in the consolidation strategy that increases switch transaction sizes as traffic gets heavier.
Each MINT combines into one transaction all available packets targeted for a given DNet outlet. ThJs, if a connection request by the MINT results in a 25 CNAK, the next request for the same destination may represent more data to beshipped during the connection, provided more packets of the LUWUs have arrived at the MINT in the meantime. Consolidation need not always add to LUWU
transmission delay, since a LUWU's last packe~ might not be af~ected. This scheme dynam~cally increases effective packet (transaction) sizes to accornmodate 30 the processing capability of the MANSCs.
The priority queue is longer than the regular packet queue to reduce the odds of sending a priority CNAK due to random bursts of requests. Priority packets are less likely to benefit from consolidation than packets recombining into their original LUWUs; this supports the separate, high-priority queue. To force the MINTs to consolidate more packets, we may build the regular packet queue shorter than it "ought" to be. Simulations have indicated that a priority queue of requests capacity and a regular queue of 8 requests capacity is appropriate. The sizes of both queues affect system performance and can be fine-tuned with real experience with a system.
Priority is determined by a priority indicator in the type of service indication 623 (FIG. 20). Voice packets are given priority because of their required low delay. In alternative arrangements, all single packet transactions (SUWUs) may be given priority. Because charges are likely to be higher for high priority service, users will be discouraged from demanding high priority service for the many packets of a long LUWU.
3.3.3.2.2 Busy/Idle Check
When a connect request first arrives at a MANSC, it is detected in test 153, which differentiates it from a disconnect request. The busy/idle status of the destination outlet is checked (test 154). If the destination is busy, a busy negative acknowledge (BNAK) is returned (action 156) to the requesting MINT, which will try again later. Test 158 selects the proper queue (priority or regular packet). The queue is tested (160, 162) to see if it is full. If the specified queue is full, a CNAK (control negative acknowledge) is returned (action 164). Otherwise the request is enqueued in queue 150 or 152 and simultaneously the destination is seized (marked busy) (action 166 or 167). Note that an overworked (full queues) MANSC can still return BNAKs, and that both BNAKs and CNAKs tend to increase transaction sizes through consolidation.
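The intake decisions just described can be sketched in a few lines of Python. This is an illustrative model only, not the hardware of FIG. 8: the class name `Intake`, the method `request`, the string return values, and the default priority queue capacity are assumptions made for the sketch (only the regular queue capacity of 8 and the 16 outlets come from the text).

```python
from collections import deque


class Intake:
    """Illustrative model of the MANSC intake section (names are hypothetical)."""

    def __init__(self, outlets=16, priority_capacity=12, regular_capacity=8):
        # regular_capacity=8 follows the text; priority_capacity is an assumed value.
        self.busy = [False] * outlets      # busy/idle status per outlet
        self.priority_q = deque()          # models queue 150
        self.regular_q = deque()           # models queue 152
        self.disconnect_q = deque()        # models queue 170
        self.priority_capacity = priority_capacity
        self.regular_capacity = regular_capacity

    def request(self, kind, inlet, outlet, priority=False):
        if kind == "disconnect":           # test 153
            self.busy[outlet] = False      # outlet released (action 193)
            self.disconnect_q.append(outlet)
            return "QUEUED"
        if self.busy[outlet]:              # test 154
            return "BNAK"                  # action 156
        if priority:                       # test 158
            queue, capacity = self.priority_q, self.priority_capacity
        else:
            queue, capacity = self.regular_q, self.regular_capacity
        if len(queue) >= capacity:         # tests 160, 162
            return "CNAK"                  # action 164
        self.busy[outlet] = True           # seize the outlet (action 166 or 167)
        queue.append((inlet, outlet))
        return "QUEUED"
```

In this model a request to a busy outlet is refused with a BNAK even when the queues are full, which mirrors the note above that an overloaded MANSC can still return BNAKs.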
The busy/idle check and BNAK handle the concurrency problem. The penalty paid for this approach is that a MINT-to-MANS IL is unusable during the interval between a MINT's issuing a connect request for that IL and its receipt of an ACK or BNAK. Also the CNet jams up with BNAKs and failing requests under heavy MANS loads. Busy/idle checks must be done quickly so as not to degrade the connection request throughput and IL utilization; this explains the performance of a busy test before enqueuing. It may be desirable further to use separate hardware to pre-test outlets for concurrency. Such a procedure would relieve the MANSCs and CNets from repeated BNAK requests, increase the successful request throughput, and permit the MANS to saturate at a higher percentage of its theoretical aggregate bandwidth.
3.3.3.2.3 Path Hunt - MANSC Service Section (FIG. 9)
Priority block 168 gives highest priority to requests from disconnect queue 170, lower priority to requests from the priority queue 150, and lowest priority to requests from the packet queue 152. When a connect request is unloaded from the priority or the regular packet queue, its requested outlet port has already been seized earlier (action 166 or 167), and the MANSC hunts for a path through the DNet. This merely involves looking up first the two inlets to which the incoming IL is connected (action 172) to find the four links with access to that incoming IL and checking their busy status (test 174). If all four are busy, a fabric blocking negative acknowledge (fabric NAK or FNAK) is returned to the requesting MINT, which will try the request again later (action 178). Also the seized destination outlet is released (marked idle) (action 176). We expect FNAKs to be rare.
If the four links are not all busy, an idle one is chosen and seized, first a first stage inlet, then a link (action 180); both are marked busy (action 182).
The inlet and link choices are stored (action 184). Now the MANSC uses its dedicated control path to send a crosspoint connect order to the XPC in its associated second stage switch (action 188); this connects the chosen link to the outlet. At the same time another crosspoint order is sent (via the right-to-left CNet plane 134) to the 1SC (action 186) required to connect the link to the inlet port. Once this order arrives at the 1SC (test 190), an ACK is returned to the originating MINT (action 192).
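The path hunt can be illustrated with a short Python sketch. It is a simplified model under stated assumptions: the function name `path_hunt`, the dictionary-based maps, and the inlet and link labels are hypothetical, and the two crosspoint orders (actions 186 and 188) are represented only by the returned result rather than by actual messages.

```python
def path_hunt(il, outlet, static_map, link_busy, busy_inlets, outlet_busy, path_memory):
    """Hunt for an idle link for an already-seized outlet (illustrative only).

    static_map[il] lists the four (first_stage_inlet, link) candidates that
    give the incoming IL access to this second stage (action 172).
    """
    for inlet, link in static_map[il]:
        if not link_busy[link]:                  # test 174
            busy_inlets.add(inlet)               # seize inlet and link
            link_busy[link] = True               # (actions 180, 182)
            path_memory[outlet] = (inlet, link)  # remember the choice (action 184)
            # Crosspoint orders to the first and second stage controllers
            # (actions 186, 188) would be issued here.
            return "ACK", (inlet, link)
    outlet_busy[outlet] = False                  # release the outlet (action 176)
    return "FNAK", None                          # fabric blocked (action 178)
```

Storing the chosen inlet and link under the outlet ID is what later lets a disconnect be executed from the outlet ID alone.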
3.3.3.2.4 Disconnects
To release network resources as quickly as possible, disconnect requests are handled separately from connect requests and at top priority. They have a separate queue 170, built 16 words long (same as the number of outlets) so it can never overflow. A disconnect is detected in test 153, which receives requests from the MINT and separates connect from disconnect requests. The outlet is released and the request placed in disconnect queue 170 (action 193). Now a new connect request for this same outlet can be accepted even though the outlet is not yet physically disconnected. Due to its higher priority, the disconnect will tear down the switch connections before the new request tries to reconnect the outlet. Once enqueued, a disconnect can always be executed. Only the outlet ID is needed to identify the spent connection; the MANSC recalls this connection's choice of link and crosspoints from local memory (action 195), marks these links idle (action 196) and sends the two XPC orders to release them (actions 186 and 188). Thereafter, test 190 controls the wait for an acknowledgment from the first stage controller and the ACK is sent to the MINT (action 192). If there is no record of this connection, the MANSC returns a "Sanity NAK." The MANSC senses status from the outlet's phase alignment and scramble circuit (PASC) 290 to verify that some data transfer took place.
3.3.3.2.5 Parallel Pipelining
Except for seizure and release of resources, the above steps for one request are independent of other requests' steps in the same MANSC and thus are pipelined to increase MANSC throughput. Still more power is achieved through parallel operations; the path hunt begins at the same time as the busy/idle check.
Note that the transaction rate depends on the longest step in a pipelined process, but the response time for one given transaction (from request to ACK or NAK) is the sum of the step times involved. The latter is improved by parallelism but not by pipelining.
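The distinction between transaction rate and response time can be made concrete with a small calculation. The sketch below is illustrative: the function name `pipeline_metrics` and the step times are invented for the example, and each pipeline stage is modeled as a group of steps that run in parallel, so a stage costs as much as its slowest step.

```python
def pipeline_metrics(stages):
    """Return (transaction_rate, response_time) for a pipeline.

    `stages` is a list of stages; each stage is a list of step times
    that run in parallel, so the stage lasts as long as its slowest step.
    """
    stage_times = [max(steps) for steps in stages]
    transaction_rate = 1 / max(stage_times)  # limited by the longest stage
    response_time = sum(stage_times)         # one request passes through every stage
    return transaction_rate, response_time


# Hypothetical step times (arbitrary units): busy/idle check 2, path hunt 3, orders 1.
serial = pipeline_metrics([[2], [3], [1]])
# Same steps, but the path hunt starts together with the busy/idle check.
overlapped = pipeline_metrics([[2, 3], [1]])
```

With these assumed numbers the transaction rate is the same in both cases (the longest stage still takes 3 units), while the response time drops from 6 to 4 units, which is the effect the paragraph above attributes to parallelism.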
3.3.4 Error Detection and Diagnosis
Costly hardware, message bits, and time-wasting protocols added to the CNet and its nodes to verify every little message are avoided. For example, each crosspoint order from a MANSC to an XPC does not require an echo of the command or even an ACK in return. Instead, MANSCs assume that messages arrive uncorrupted and are acted on correctly, until evidence to the contrary arrives from outside. Audits and cross-checks are enabled only when there is cause for suspicion. The end users, NIMs and MINTs soon discover a defect in the MANS or its control complex and identify the subset of MANS ports involved. Then the diagnostic task is to isolate the problem for repair and interim work-around.
Once a portion of the MANS is suspect, temporary auditing modes could be turned on to catch the guilty parties. For suspected 1SCs and MANSCs, these modes require use of the command ACKs and echoing. Special messages such as crosspoint audits may also be passed through the CNet. This should be done while still carrying a light load of user traffic.
Before engaging these internal self-tests (or perhaps to eliminate them entirely), MAN can run experiments on the MANS to pinpoint the failed circuit, using the MINTs, ILs, and NIMs. For example, if 75% of the test SUWUs sent from a given IL make it to a given outlet, we would conclude that one of the two links from one of that IL's two first stages is defective. (Note this test must be run under load, lest the deterministic MANSC always select the same link.) Further experiments can isolate that link. But if several MINTs are tested and none can send to a particular outlet, then that outlet is marked "out of service" to all MINTs and suspicion is now focused on that second stage and its MANSC.
If other outlets on that stage work, the fault is in the second stage's fabric. These tests use the status lead from each of a MANSC's 16 PASCs.
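The 75% figure follows from simple arithmetic over the four candidate links between an IL and an outlet. The sketch below assumes, as a simplification not stated in the text, that under load each of the four links is chosen equally often; the function name `expected_delivery` is hypothetical.

```python
from fractions import Fraction


def expected_delivery(links_total=4, links_defective=1):
    """Fraction of test SUWUs expected to arrive, assuming every candidate
    link is selected equally often (an assumption made for this sketch)."""
    return Fraction(links_total - links_defective, links_total)


one_bad_link = expected_delivery()                      # one of four links defective
one_bad_first_stage = expected_delivery(4, 2)           # both links of one first stage
```

Under this assumption, one defective link gives a delivery fraction of 3/4, while losing both links from one first stage would give 1/2, so the observed fraction helps distinguish the two cases.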
Coordinating the independent MINTs and NIMs to run these tests requires a central intelligence with low-bandwidth message links to all MINTs and NIMs. Given inter-MINT connectivity (see FIG. 15), any MINT with the needed firmware can take on a diagnostic task. NIMs must be involved anyway to tell whether test SUWUs reach their destinations. Of course any NIM on a working MINT can exchange messages with any other such NIM.
3.4 MAN Switch Controller
FIG. 25 is a diagram of MANSC 140. This is the unit which sends control instructions to data network 120 to set up or tear down circuit connections. It receives orders from control network 130 via link 139 and sends acknowledgments, both positive and negative, back to the requesting MINTs 11 via control network 135. It also sends instructions via control network 134 to first stage switch controller 122 and directly to the second stage controller 124 that is associated with the specific MANSC 140.
Inputs are received from inlet 139 at a request intake port 1402. They are processed by intake control 1404 to see if the requested outlet is busy. The outlet memory 1406 contains busy/idle indications of the outlets for which a MANSC 140 is responsible. If the outlet is idle a connect request is placed into one of two queues 150 and 152 previously described with respect to FIG. 8. If the request is for a disconnect, the request is placed in disconnect queue 170. The outlet map 1406 is updated to mark a disconnected outlet idle. The acknowledge response unit 1408 sends negative acknowledgments if a request is received with an error or if a connect request is made to a busy outlet or if the appropriate queue 150 or 152 is full. Acknowledgment responses are sent via control network 135 back to the requesting MINT 11 via distributor 138. All of these actions are performed under the control of intake control 1404.
Service control 1420 controls the setup of paths in data network 120 and the updating of outlet memory 1406 for those circumstances in which no path is available in the data network between the requesting input link and an available output link. The intake control also updates outlet memory 1406 on connect requests so that a request which is already in the queue will block another request for the same output link.
Service control 1420 examines requests in the three queues 150, 152, and 170. Disconnect requests are always given the highest priority. For disconnect requests, the link memory 1424 and path memory 1426 are examined to see which links should be made idle. The instructions for idling these links are sent to first stage switches from first stage switch order port 1428 and the instructions to second stage switches are sent from second stage switch order port 1430. For connect requests, the static map 1422 is consulted to see which links can be used to set up a path from the requesting input link to the requested output link. Link map 1424 is then consulted to see if appropriate links are available and if so these links are marked busy. Path memory 1426 is updated to show that this path has been set up so that on a subsequent disconnect order the appropriate links can be made idle. All of these actions are performed under the control of service control 1420.
Controllers 1420 and 1404 may be a single controller or separate controllers and may be program controlled or controlled by sequential logic. Because of the high throughput demanded, these controllers must operate at very high speed, which makes a hard-wired controller preferable.
3.5 Control Network
Control message network 130 (FIG. 7) takes outputs 137 from data concentrators 136 and transmits these outputs, representing connect or disconnect requests, to MAN switch controllers 140. Outputs of concentrators 136 are stored temporarily in source registers 133. Bus access controller 131 polls these source registers 133 to see if any have a request to be transmitted. Such requests are then placed on bus 132 whose output is stored temporarily in intermediate register 141. Bus access controller 131 then sends outputs from register 141 to the appropriate one of the MAN switch controllers 140 via link 139 by placing the output of register 141 on bus 142 connected to link 139. The action is accomplished in three phases. During the first phase, the output of register 133 is placed on the bus 132, thence gated to register 141. During the second phase, the output of register 141 is placed on bus 142 and delivered to a MAN switch controller 140. During the third phase, the MAN switch controller signals the source register 133 as to whether the controller has received the request; if so, source register 133 can accept a new input from control data concentrator 136. Otherwise, source register 133 retains the same request data and the bus access controller 131 will repeat the transmission later. The three phases may occur simultaneously for three separate requests. Control networks 134 and 135 operate in a fashion similar to control network 130.

3.6 Summary
A structure to meet the large bandwidth and transaction throughput requirements for the MANS has been described. The data switch fabric is a two-stage Richards network, chosen because its low blocking probability permits a parallel, pipelined distributed switch control complex (SCC). The SCC includes XPCs in all first and second stage switches, an intelligent controller MANSC with each second stage, and the CNet that ties the control pieces together and links them to the MINTs.

The memory and interface module (MINT) provides receive interfaces for the external fiber-optic links, buffer memory, control for routing and link protocols, and transmitters to send collected data over the links to the MAN
switch. In the present design, each MINT serves four network interface modules (NIMs) and has four links to the switch. The MINT is a data switching module.
4.1 Basic Functions
The basic functions of the MINT are to provide the following:
1. A fiber-optic receiver and link protocol handler for each NIM.
2. A link handler and transmitter for each link to the switch.
3. A buffer memory to accumulate packets awaiting transmission across the switch.
4. An interface to the controller for the switch to direct the setup and teardown of network paths.
5. Control for address translation, routing, making efficient use of the switch, orderly transmission of accumulated packets, and management of buffer memory.
6. An interface for operation, administration, and maintenance of the overall system.
7. A control channel to each NIM for operation, administration, and maintenance functions.
4.2 Data Flow
In order to understand the descriptions of the individual functional units that make up a MINT, it is first necessary to have a basic understanding of the general flow of data and control. FIG. 10 shows an overall view of the MINT. Data enters the MINT on a high-speed (10~150 Mbit/s) data channel 3 from each NIM. This data is in the form of packets, on the order of 8 Kilobits long, each with its own header containing routing information. The hardware allows packet sizes in increments of 512 bits to a maximum of 128 Kilobits. Small packet sizes, however, reduce throughput due to the per-packet processing required. Large maximum packet sizes result in wasted memory for transactions of less than a maximum size packet. The link terminates on an external link handler 16 (XLH), which retains a copy of the pertinent header fields as it deposits the entire packet into the buffer memory. This header information, together with the buffer memory address and length, is then passed to the central control 20. The central control determines the destination NIM from the address and adds this block to the list of blocks (if any) awaiting transmission to this same destination. The central control also sends a connection request to the switch controller if there is not already a request outstanding. When the central control receives an acknowledgement from the switch controller that a connection request has been satisfied, the central control transmits the list of memory blocks to the proper internal link handler 17 (ILH). The ILH reads the stored data from memory and transmits it at high speed (probably the same speed as the incoming links) to the MAN switch, which directs it to its destination. As the blocks are transmitted, the ILH informs the central control so that the blocks can be added to the list of free blocks available for use by the XLHs.
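The bookkeeping performed by the central control, and the way a refused request leads to a larger transaction on the next attempt, can be sketched as follows. This is a minimal illustration, not the MINT design: the class `CentralControl`, its method names, and the rule that a retry is issued on demand are assumptions made for the example.

```python
from collections import defaultdict


class CentralControl:
    """Illustrative per-destination block lists with one outstanding request each."""

    def __init__(self):
        self.pending = defaultdict(list)  # destination -> [(address, length), ...]
        self.outstanding = set()          # destinations with a request in flight
        self.requests_sent = []           # log of connection requests (for the demo)

    def _request(self, destination):
        if self.pending[destination] and destination not in self.outstanding:
            self.outstanding.add(destination)
            self.requests_sent.append(destination)

    def packet_arrived(self, destination, address, length):
        # Add the block to the list awaiting this destination, then request a
        # connection unless one is already outstanding.
        self.pending[destination].append((address, length))
        self._request(destination)

    def nak(self, destination):
        # BNAK, CNAK or FNAK: the blocks stay queued for a later retry.
        self.outstanding.discard(destination)

    def retry(self, destination):
        self._request(destination)

    def ack(self, destination):
        # Connection granted: hand the whole accumulated list to the ILH.
        self.outstanding.discard(destination)
        return self.pending.pop(destination, [])
```

Because blocks that arrive while a request is outstanding or awaiting retry join the same list, a later ACK releases all of them in one switch transaction.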
4.3 Memory Modules
The buffer memory 18 (FIG. 4) of the MINT 11 satisfies three requirements:
1. The quantity of memory provides sufficient buffer space to hold the data accumulated (for all destinations) while awaiting switch setups.
2. The memory bandwidth is adequate to support simultaneous activity on all eight links (four receiving and four transmitting).
3. The memory access provides for efficient streaming of data to and from the link handlers.
4.3.1 Organization
Because of the amount of memory required (Megabytes), it is desirable to employ conventional high-density dynamic random access memory (DRAM) parts. Thus, high bandwidth can be achieved only by making the memory wide. The memory is therefore organized into 16 modules 201,...,202 which make up a composite 512-bit word. As will be seen below, memory accesses are organized in a synchronous fashion so that no module ever receives successive requests without sufficient time to perform the required cycles. The range of memory for one MINT 11 in a typical MAN application is 16-64 Mbytes. The number is sensitive to the speed of application of flow control in overload situations.
4.3.2 Time Slot Assigners
The time slot assigners 203,...,204 (TSAs) combine the functions of a conventional DRAM controller and a specialized 8-channel DMA controller. Each receives read/write requests from logic associated with the Data Transport Ring 19 (see 4.4, below). Its setup commands come from dedicated control time slots on this same ring.
4.3.2.1 Control
From a control viewpoint, the TSA appears as a set of registers as shown in FIG. 11. For each XLH there is an associated address register 210 and count register 211. Each ILH also has address 213 and count 214 registers, but in addition has registers containing the next address 215 and count 216, thus allowing a series of blocks to be read from memory in a continuous stream with no inter-block gaps. A special set of registers 220-226 allows the MINT's central control section to access any of the internal registers in the TSA or to perform a directed read or write of any particular word in memory. These registers include a write data register 220 and read data register 221, a memory address register 222, channel status register 223, error register 224, memory refresh row address register 225, and diagnostic control register 226.
4.3.2.2 Operation
In normal operation, the TSA 203 receives only four order types from the ring interface logic: (1) "write" requests for data received by an XLH, (2) "read" requests for an ILH, (3) "new address" commands issued by either an XLH or an ILH, and (4) "idle cycle" indications which tell the TSA to perform a refresh cycle or other special operation. Each order is accompanied by the identity of the link handler involved and, in the case of "write" and "new address" requests, by 32 bits of data.
For a "write" operation, the TSA 203 simply performs a memory write cycle using the address from the register associated with the indicated XLH 16 and the data provided by the ring interface logic. It then increments the address register and decrements the count register. The count register is used in this case only as a safety check since the XLH should provide a new address before overflowing the current block.

For a "read" operation, the TSA 203 must first check whether the channel for this ILH is active. If it is, the TSA performs a memory read cycle using the address from the register for this ILH 17 and presents the data to the ring interface logic. It also increments the address register and decrements the count register. In any case, the TSA provides the interface logic with two "tag" bits which indicate (1) no data available, (2) data available, (3) first word of packet available, or (4) last word of packet available. For case (4), the TSA will load the ILH's address 214 and count 213 registers from its "next address" 216 and "next count" 215 registers, provided that these registers have been loaded by the ILH. If they have not, the TSA marks the channel "inactive."
From the above descriptions, the function of a "new address" operation can be inferred. The TSA 203 receives the link identity, a 24-bit address, and an 8-bit count. For an XLH 16, it simply loads the associated registers. In the case of an ILH 17, the TSA must check whether the channel is active. If it is not, then the normal address 214 and count 213 registers are loaded and the channel is marked active. If the channel is currently active, then the "next address" 216 and "next count" 215 registers must be loaded instead of the normal address and count registers.
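The double-buffered register scheme for an ILH channel can be sketched as below. This is a simplified, single-module illustration: the class `IlhChannel`, the dictionary used as memory, and the rule that the last word of a packet is recognized when the count reaches zero are assumptions of the sketch, and the two tag bits are reduced to returning `None` when no data is available.

```python
class IlhChannel:
    """Illustrative model of one ILH channel inside a TSA."""

    def __init__(self):
        self.active = False
        self.address = 0
        self.count = 0
        self.next = None  # ("next address", "next count") pair, or None if not loaded

    def new_address(self, address, count):
        if not self.active:
            # Idle channel: load the normal registers and mark the channel active.
            self.address, self.count, self.active = address, count, True
        else:
            # Busy channel: park the values in the "next" registers.
            self.next = (address, count)

    def read(self, memory):
        if not self.active:
            return None                       # "no data available"
        word = memory[self.address]
        self.address += 1
        self.count -= 1
        if self.count == 0:                   # last word of this block (assumed rule)
            if self.next is not None:
                self.address, self.count = self.next
                self.next = None              # continue with no inter-block gap
            else:
                self.active = False           # nothing queued: channel goes inactive
        return word
```

Loading a second block while the first is still being read lets consecutive reads cross the block boundary without an idle read in between.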
In an alternative embodiment, the two tag bits are also stored in buffer memory 201,...,202. Advantageously, this permits packet sizes that are not limited to being a multiple of the overall width of the memory (512 bits). In addition, the ILH 17 need not provide the actual length of the packet when reading it, thus relieving the central control 20 of the need to pass along this information to the ILH.
4.4 Data Transport Ring
It is the job of the Data Transport Ring 19 to carry control commands and high-speed data between the link handlers 16,17 and the memory modules 201,...,202. The ring provides sufficient bandwidth to allow all the links to run simultaneously, but carefully apportions this bandwidth so that circuits connecting to the ring are never required to transfer data in high-speed bursts. Instead, a fixed time slot cycle is employed that assigns slots to each circuit at well-spaced intervals. The use of this fixed cycle also means that source and destination addresses need not be carried on the ring itself since they can be readily determined at any point by a properly synchronized counter.

4.4.1 Electrical Description
The ring is 32 data bits wide and is clocked at 24 MHz. This bandwidth is sufficient to support data rates of up to 150 Mbit/s. In addition to the data bits, the ring contains four parity bits, two tag bits, a sync bit to identify the start of a superframe, and a clock signal. Within the ring, single-ended ECL circuitry is used for all signals except the clock, which is differential ECL. The ring interface logic provides connecting circuits with TTL-compatible signal levels.
4.4.2 Time Slot Sequencing Requirements
In order to meet the above objectives, the time slot cycle is subject to a number of constraints:
1. During each complete cycle there must be a unique time slot for each combination of source and destination.
2. Each connecting circuit must see its data time slots appearing at reasonably regular intervals. Specifically, each circuit must have a certain minimum interval between its data time slots.
3. Each link handler must see its data time slots in numerical order by memory module number. (This is to avoid making the link handler shuffle a 512-bit word.)
4. Each TSA must have a known interval during which it can perform a refresh cycle or other miscellaneous memory operation.
5. Since the TSAs in the memory modules must examine every control time slot, there must also be a minimum interval between control time slots.
4.4.3 Time Slot Cycle
Table I shows one data frame of a timing cycle which meets these requirements. One data frame consists of a total of 80 time slots, of which 64 are used for data and the remaining 16 for control. The table shows, for each memory module TSA, the slot during which it receives data from each XLH to be written into memory and during which it must supply data that was read from memory for each ILH. Every fifth slot is a control time slot during which the indicated link handler broadcasts control orders to all the TSAs. For the purposes of this table, XLHs and ILHs are numbered 0-3, and TSAs are numbered 0-15.
TSA 0, for example, during time slot 0 receives data from XLH 0 and must supply data for ILH 0. During slot 17, TSA 0 performs similar operations for XLH 2 and ILH 2. Slot 46 is used for XLH 1 and ILH 1, and slot 63 is used for XLH 3 and ILH 3. The re-use of the same time slot for reading and writing is permissible since XLHs never read from memory and ILHs never write, thus effectively doubling the data bandwidth of the ring.
The control time slots are assigned, in sequence, to the four XLHs, the four ILHs, and the central control (CC). With these nine entities sharing the control time slots, the control frame is 45 time slots long. The 80-slot data frame and the 45-slot control frame come into alignment every 720 time slots. This period is the superframe and is marked by the superframe sync signal.
There is a subtle synchronization condition that must also be met for the ILHs. The words of a block must be sent in sequence beginning with word 0, regardless of where in the ring timing cycle the order was received. To assist in meeting this requirement, the ring interface circuitry provides a special "word 0" sync signal for each ILH. For example, in the timing cycle of Table I a new address might be sent by ILH 0 during time slot 24 (its control time slot). It is necessary to ensure that TSA number 0 is the first TSA to act on this new address (requirement 3 in section 4.4.2) even though the data time slots for reads from TSAs numbered 5 through 15 for ILH 0 immediately follow time slot 24.
Since the number of time slots in the superframe, 720, exceeds the number of elements on the ring, 25, it is apparent that the logical time slots do not have a permanent existence; each time slot is, in effect, created at a particular physical location on the ring and propagates around the ring until it returns to this location, where it vanishes. The effective creation point is different for data time slots than for control time slots.
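The frame arithmetic above can be checked with a short calculation. The sketch below is illustrative and rests on one assumption: control slots are taken to be slots 4, 9, 14, and so on, counted from the start of the superframe, which is consistent with the statements that every fifth slot is a control slot and that slot 24 belongs to ILH 0. The names `control_source`, `SUPERFRAME`, and `PER_LINK_MBPS` are introduced here for the example.

```python
from math import gcd

DATA_FRAME = 80  # time slots per data frame (64 data + 16 control)
CONTROL_SOURCES = ["XLH0", "XLH1", "XLH2", "XLH3",
                   "ILH0", "ILH1", "ILH2", "ILH3", "CC"]
CONTROL_FRAME = 5 * len(CONTROL_SOURCES)  # nine owners, one control slot in five


def control_source(slot):
    """Owner of a control time slot, or None for a data time slot.

    Assumes control slots sit at positions 4, 9, 14, ... from superframe start.
    """
    if slot % 5 != 4:
        return None
    return CONTROL_SOURCES[(slot // 5) % len(CONTROL_SOURCES)]


# The two frames realign after their least common multiple.
SUPERFRAME = DATA_FRAME * CONTROL_FRAME // gcd(DATA_FRAME, CONTROL_FRAME)

# Each link handler owns 16 of the 80 slots (one per memory module); the ring
# carries 32 data bits per slot at 24 MHz.
PER_LINK_MBPS = 32 * 24 * 16 / DATA_FRAME
```

Under these assumptions the control frame is 45 slots, the superframe is 720 slots, and each link handler receives about 153.6 Mbit/s of ring capacity, slightly above the 150 Mbit/s link rate quoted in section 4.4.1.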

TABLE I
RING TIME SLOT ASSIGNMENT
Time Slot | Write to TSA | From XLH | Read from TSA | To ILH | Control Slot Source
[Table entries are not legible in the source text.]
4.4.3.1 Data Time Slots
Data time slots can be considered to originate at the owning XLH. A data time slot is used to carry incoming data to its assigned memory module, at which point it is re-used to carry outgoing data to the corresponding ILH. Since XLHs never receive information from a data time slot, the ring can be considered to be logically broken (for data time slots only) between the ILHs and the XLHs. The two tag bits identify the contents of the data time slots as follows:

11 Empty
10 Data
01 First word of packet
00 Last word of packet
The "first word of packet" is sent only by memory module 0 when it sends the first word of a packet to an ILH. The "last word of packet" indication is sent only by memory module 15 when it sends the end of a packet to an ILH.
4.4.3.2 Control Time Slots
Control time slots originate and terminate at the station of central control 20 on the ring. The link handlers use their assigned control slots only to broadcast orders to the TSAs. The CC is assigned every ninth control time slot. The TSAs receive orders from all control time slots and send responses back to the CC on the CC control time slot.
The two tag bits identify the contents of a control time slot as follows:

11 Empty
10 Data (to or from CC)
01 Order
00 Address & count (from a link handler)
4.5 External Link Handler
The principal function of the XLH is to terminate the incoming high-speed data channel from a NIM, deposit the data in the MINT's buffer memory, and pass the necessary information to the MINT's central control 20 so that the data can be forwarded to its destination. In addition, the XLH terminates an incoming low-speed control channel that is multiplexed on the fiber link. Some of the functions assigned to the low-speed control channel are the transmission of the NIM status and control of flow in the network. It should be noted that the XLH is only terminating the incoming fiber from the NIM. Transmission to the NIM is handled by the internal link handler and the phase alignment and scrambler circuit that will be described later. The XLH uses an onboard processor 268 to interface to the hardware of the MINT central control 20. The four 20 Mbit/sec links coming from this processor provide the connectivity to the central control section of the MINT. FIG. 12 shows an overall view of the XLH.
4.5.1 Link Interface
The XLH contains the fiber optic receiver, clock recovery circuit and descrambler circuit needed to recover data from the fiber. After the data clock is recovered (block 250) and the data descrambled (block 252) the data is then converted from serial to parallel and demultiplexed (block 254) into the high-speed data channel and the low-speed data channel. Low level protocol processing is then performed on the data on the high-speed data channel (block 256) as described in section 5. This results in a data stream consisting of only packet data. The stream of packet data then goes through a first-in-first-out (FIFO) queue 258 to a data steering circuit 260 which steers the header into the header FIFO 266 and sends the complete packet to the XLH's ring interface 262.
4.5.2 Ring Interface
The ring interface 262 logic controls transfer of data from the packet FIFO 258 in the link interface to the MINT's buffer memory. It provides the following functions:
1. Establishing and maintaining synchronization with the ring's timing cycle.
2. Transfer of data from the link interface FIFO to the proper ring time slots.
3. Sending a new address to the memory TSAs when the end of a packet is encountered.
It should be noted that resynchronization with the ring's 16-word (per XLH) timing cycle will have to be performed during the processing of a packet whenever the link interface FIFO becomes temporarily empty. This will be a normal occurrence since the ring's bandwidth is higher than the link's transmission rate. The ring and TSA, however, are designed to accommodate gaps in the data stream. Thus, resynchronization consists simply of waiting for data to become available and for the ring cycle to return to the proper word number, marking the intervening time slots "empty." For example, if the FIFO 258 becomes empty when a word destined for the fifth memory module is needed, it is necessary to ensure that the next word actually sent goes to that memory module, in order to preserve the overall sequence.
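The resynchronization rule can be illustrated with a small simulation. It is a sketch under simplifying assumptions: the function name `schedule` is hypothetical, FIFO availability is given as one boolean per data time slot, and each of the XLH's data time slots is assumed to serve the memory modules in strict rotation.

```python
def schedule(fifo_ready, modules=16):
    """Assign words to an XLH's successive data time slots (illustrative).

    fifo_ready[t] says whether the link interface FIFO holds a word when the
    XLH's t-th data time slot comes by. Returns one (module, word) pair per
    slot, with word None for a slot marked "empty".
    """
    slots = []
    next_word = 0
    for t, ready in enumerate(fifo_ready):
        module = t % modules                       # module served by this slot
        if ready and module == next_word % modules:
            slots.append((module, next_word))      # word goes to its proper module
            next_word += 1
        else:
            slots.append((module, None))           # wait: slot marked "empty"
    return slots


# Four modules for brevity; the FIFO runs dry just before word 2 is due.
example = schedule([True, True, False, True, True, True, True, True], modules=4)
```

In the example, word 2 misses its slot for module 2, so the following slots are left empty until the cycle returns to module 2, after which words 2 and 3 go out in order.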
4.5.3 Control
The control portion of the XLH is responsible for replenishing the free block FIFO 270 and passing the header information about each packet received to the MINT's central control 20 (FIG. 4).
4.5.3.1 Header Processing
At the same time a packet is being transmitted on the ring, the header of the packet is deposited in the header FIFO 266 that is subsequently read by the XLH processor 268. In this header are the source and destination address fields, which the central control will require for routing. In addition, the header checksum is verified to ensure that these fields have not been corrupted. The header information is then packaged with a memory block descriptor (address and length) and sent in a message to the central control 20 (FIG. 4).
4.5.3.2 Interaction with Central Control
There are only two basic interactions with the MINT's central control. The XLH control attempts to keep its free-block FIFO 270 full with block addresses obtained from the memory manager, and it passes header information and memory block descriptors to the central control so that the block can be routed to its destination. The block addresses are subsequently placed on the ring 19 by ring interface 262 upon receipt of the address from control sequencer 272. Both interactions with the central control are carried out over links from XLH processor 268 to the appropriate sections of the central control.
4.6 Internal Link Handler
The internal link handler (ILH) (FIG. 13) is the first part of what can be considered a distributed link controller. At any instant in time this distributed link controller consists of a particular ILH, a path through the switch fabric and a particular Phase Alignment and Scrambler circuit 290 (PASC). The PASC is described in section 6.1. It is the PASC that is actually responsible for the transmission of optical signals over the return fiber of fiber pair 3 to the NIM from the MINT. The information that is transmitted over the fiber comes from the MANS 10, which receives inputs at different times from the ILHs sending to that NIM. This kind of distributed link controller is necessary since path lengths through the MAN switch fabric are not all equal. If the PASC did not align all the information coming from different ILHs to the same reference clock, information received by the NIM would be continually changing its phase and bit alignment.
The combination of the ILH with the PASC is in many ways a mirror image of the XLH. The ILH receives lists of block descriptors from the central control, reads these blocks from memory, and transmits the data over the serial link to the switch. As data is received from memory, the associated block descriptor is sent to the central control's memory manager so that the block can be returned to the free list.
The ILH differs from the XLH in that the ILH performs no special header processing, and the TSAs provide the ILH with additional pipelining so that multiple blocks can be transmitted as a continuous stream if desired.
4.6.1 Link Interface
The link interface 289 provides the serial transmitter for the data channel. Data is transmitted in a frame-synchronous format compatible with the link data format described in section 5. Since the data is received from the ring interface 280 (see below) asynchronously and at a rate somewhat higher than the link's average data rate, the link interface contains a FIFO 282 to provide speed matching and frame synchronization. The data is received from MINT memory via data ring interface 280, stored in FIFO 282, processed by level 1 and 2 protocol handler 286, and transmitted to MAN switch 10 through the parallel to serial converter 288 within link interface 289.
4.6.2 Ring Interface
The ring interface 280 logic controls the transfer of data from the MINT's buffer memory to the FIFO in the link interface. It provides the following functions:
1. Establishing and maintaining synchronization with the ring's timing cycle.
2. Transfer of data from the ring to the link interface FIFO during the proper ring time slots.
3. Notifying the control section when the last word of a packet (memory block) is received.
4. Sending a new address and count (if available) to the memory TSAs 203,...,204 (FIG. 10) when the last word of a packet is received and the condition of the FIFO 282 is such that the new packet will not cause an overflow.
Unlike the XLH, the ILH relies on the TSAs to ensure that data words are received in sequence and with no gaps within a block. Thus, maintaining word synchronization in this case consists simply of looking for unexpected empty data time slots.
4.6.3 Control
The control portion of the ILH, controlled by sequencer 283, is responsible for providing the ring interface with block descriptors received via the processor link interface 284 from the central control and stored therefrom in address FIFO 285, notifying the central control via the processor link interface when blocks have been retrieved from memory, and notifying the central control 20 when transmission of the final block is complete.
4.6.3.1 Interaction with Central Control
There are only three basic interactions with the MINT's central control:
1. Receiving lists of block descriptors.
2. Informing the memory manager of blocks that have been retrieved from memory.
3. Informing the switch request queue manager when all blocks have been transmitted.
In the present design, all of these interactions are carried out over Transputer links to the appropriate sections of the central control.
4.6.3.2 Interaction with TSAs
Like the XLH, the ILH uses its control time slots to send block descriptors (address and lengths) to the TSAs. When the TSAs receive a descriptor from an ILH, however, they will immediately begin reading the block from memory and placing the data on the ring. The length field from an ILH is significant and determines the number of words that will be read by each TSA before moving on to the next block. The TSAs also provide each ILH with registers to hold the next address and length, so that successive blocks can be transmitted without gaps. Flow control is the responsibility of the ILH, however, and a new descriptor should not be sent to the TSAs until there is enough room in the packet FIFO 282 to compensate for reframing time and the difference in transmission rates.
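The flow control condition in the last sentence can be made concrete with a rough calculation. The sketch below is one plausible reading, not the ILH's actual logic: the function names, the parameters, and the idea of expressing reframing time as a fixed allowance in words are all assumptions made for illustration.

```python
import math

def required_headroom(block_words, ring_rate, link_rate, reframe_words):
    """Free FIFO space (in words) wanted before requesting another block.

    While block_words arrive from the ring at ring_rate, the link drains
    the FIFO at link_rate, so the FIFO grows by roughly
    block_words * (1 - link_rate / ring_rate). reframe_words is a
    hypothetical allowance for words that pile up while the link reframes.
    """
    growth = block_words * (1.0 - link_rate / ring_rate)
    return math.ceil(growth) + reframe_words

def may_send_descriptor(fifo_capacity, fifo_fill, headroom):
    """True if the free space in the packet FIFO covers the headroom."""
    return fifo_capacity - fifo_fill >= headroom
```

With a ring twice as fast as the link, a 256-word block and an 8-word reframing allowance, `required_headroom(256, 2.0, 1.0, 8)` gives 136 words, so a descriptor would be held back whenever fewer than 136 words of FIFO space are free.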
4.7 MINT Central Control
FIG. 14 is a block diagram of MINT central control 20. This central control is connected to the four XLH 16s of the MINT, the four ILH 17s of the MINT, to data concentrator 136 and distributor 138 of the switch control (see FIG. 7), and to an OA&M central control 352 shown in FIG. 15. The relationship of the central control 20 with other units will first be discussed.
The MINT central control communicates with XLH 16 to provide memory block addresses for use by the XLH in order to store incoming data in the MINT memory. XLH 16 communicates with the MINT central control to provide the header of a packet to be stored in MINT memory, and the address where that packet is to be stored. Memory manager 302 of MINT central control 20 communicates with ILH 17 to receive information that memory has been released by an ILH because the message stored in those memory blocks has been delivered, so that the released memory can be reused.
When queue manager 311 recognizes that the first network unit arriving for a particular NIM has been queued in switch unit queue 314, which contains FIFO queues 316 for each possible destination NIM, queue manager 311 sends a request to switch setup control 313 to request a connection in MAN switch 10 to that NIM. The request is stored in one of the queues 318 (priority) and 312 (regular) of switch setup control 313. Switch setup control 313 administers these requests according to their priority and sends requests to MAN switch 10, specifically to switch control data concentrator 136. For normal loads, the queues 318 and 312 should be almost empty since requests can normally be made almost immediately and will generally be processed by the appropriate MAN switch controller. For overload conditions, the queues 318 and 312 become a means for deferring transmission of lower priority packets while retaining the relatively fast transmission of priority packets. If experience so dictates, it may be desirable to move a request from the regular queue to the priority queue if a priority packet for that destination NIM is received. Requests queued in queues 318 and 312 do not tie up an IL, an ILH, and an output link of circuit switch 10; this is in contrast to requests in the queues 150,152 (FIG. 8) of an MAN switch controller 140 (FIG. 7).
When switch setup control 313 recognizes that a connection has been established in switch 10, it notifies NIM queue manager 311. The ILH 17 receives data from a FIFO queue 316 in switch unit queue 314 from NIM queue manager 311 to identify a queue of the memory locations of data packets which may be transmitted to the circuit switch, and for each packet, a list of one or more ports on the NIM to which that packet is to be transmitted. NIM queue manager 311 then causes ILH 17 to prefix the port number(s) to each packet and to transmit data for each packet from memory 18 to switch 10. The ILH then proceeds to transmit the packets of the queue and when it has completed this task, notifies the switch setup control 313 that the connection in the circuit switch may be disconnected and notifies memory manager 302 of the identity of the blocks of memory that can now be released because the data has been transmitted.
The MINT central control uses a plurality of high speed processors, each of which has one or more input/output ports. The specific processor used in this implementation is the Transputer manufactured by INMOS Corporation. This processor has four input/output ports. Such a processor can meet the processing demands of the MINT central control.
Packets come into the four XLHs 16. There are four XLH managers 305, source checkers 307, routers 309, and OA&M MINT processors 315, one corresponding to each XLH within the MINT; these processors, operating in parallel to process the data entering each XLH, increase the total data processing capacity of the MINT central control.
The header for each packet entering an XLH is transmitted along with the address where that packet is being stored directly to an associated XLH manager 305, if the header has passed the hardware check of the cyclic redundancy code (CRC) of the header performed by the XLH. If that CRC check fails, the packet is discarded by the XLH which recycles the allocated memory block. The XLH manager passes the header and the identity of allocated memory for the packet to the source checker 307. The XLH manager recycles memory blocks if any of the source checker, router, or NIM queue manager find it impossible to transmit the packet to a destination. Recycled memory blocks get used before memory blocks allocated by the memory manager. Source checker 307 checks whether the source of the packet is properly logged in and whether that source has access to the virtual network of the packet. Source checker 307 passes information about the packet, including the packet address in MINT memory, to router 309 which translates the packet group identification, effectively a virtual network name, and the destination name of the packet in order to find out which output link this packet should be sent on. Router 309 passes the identification of the output link to NIM queue manager 311 which identifies and chains packets received by the four XLHs of this MINT which are headed for a common output link. After the first packet to a NIM queue has been received, the NIM queue manager 311 sends a switch setup request to switch setup control 313 to request a connection to that NIM. NIM queue manager 311 chains these packets in FIFO queues 316 of switch unit queue 314 so that when a switch connection is made in the circuit switch 10, all of these packets may be sent over that connection at one time. Output control signal distributor 138 of the switch control 22 replies with an acknowledgment when it has set up a connection. This acknowledgment is received by switch setup control 313 which informs NIM queue manager 311. NIM queue manager 311 then informs ILH 17 of the list of chained packets in order that ILH 17 may transmit all of these packets. When ILH 17 has completed the transmission of this set of chained packets over the circuit switch, it informs switch setup control 313 to request a disconnect of the connection in switch 10, and informs memory manager 301 that the memory which was used for storing the data of the message is now available for use for a new message. Memory manager 301 sends this release information to memory distributor 303 which distributes memory to the various XLH managers 305 for allocating memory to the XLHs.
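The chaining behaviour of the NIM queue manager can be summarized in a small model. This is an illustrative sketch only: the class and method names, the use of tuples as block descriptors, and the list standing in for messages to switch setup control are assumptions, and priority handling is left out.

```python
from collections import defaultdict, deque

class NimQueueManager:
    """Toy model of per-NIM packet chaining (all names are illustrative)."""

    def __init__(self):
        self.queues = defaultdict(deque)  # one FIFO per destination NIM
        self.setup_requests = []          # requests passed to switch setup control

    def enqueue(self, nim, block_descriptor):
        """Chain a packet; request a connection if it is the first one queued."""
        first = not self.queues[nim]
        self.queues[nim].append(block_descriptor)
        if first:
            self.setup_requests.append(nim)
        return first

    def connection_established(self, nim):
        """Return the whole chain so the ILH can send it over one connection."""
        chain = list(self.queues[nim])
        self.queues[nim].clear()
        return chain
```

Only the first packet queued for a NIM generates a setup request; packets that arrive while the request is pending simply join the chain and are sent over the same connection.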
Source checker 307 also passes billing information to operation, administration and maintenance (OA&M) MINT processor 315 in order to perform billing for that packet and to accumulate appropriate statistics for checking on the data flow within the MINT and, after combination with other statistics, in the MAN network. Router 309 also informs (OA&M) MINT processor 315 of the destination of the packet so that the OA&M MINT processor can keep track of data concerning packet destinations for subsequent traffic analysis. The output of the four OA&M MINT processors 315 are sent to MINT OA&M monitor 317 which summarizes the data collected by the four OA&M MINT processors for subsequent transmission to OA&M central control 352 (FIG. 14).
MINT OA&M monitor 317 also receives information from OA&M central control 352 for making changes via OA&M MINT processor 315 in the router 309 data; these changes reflect additional terminals added to the network, the movement of logical terminals (i.e., terminals associated with a particular user) from one physical port to another, or the removal of physical terminals from the network. Data is also provided from the OA&M central control 352 via the MINT operation, OA&M monitor and the OA&M MINT processor 315 to source checker 307 for such data as a logical user's password and physical port as well as data concerning the privileges of each logical user.
4.8 MINT Operation, Administration, and Maintenance Control System
FIG. 15 is a block diagram of the maintenance and control system of the MAN network. Operation, administration, and maintenance (OA&M) system 350 is connected to a plurality of OA&M central controls 352. These OA&M controls are each connected to a plurality of MINTs, and within each MINT, to the MINT OA&M monitor 317 of MINT central control 20. Since many of the messages from OA&M system 350 must be distributed to all the MINTs, the various OA&M central controls are interconnected by a data ring. This data ring transmits such data as the identification of the network interface module, hence the identification of the output link, of each physical port that is added to the network so that this information may be stored in the router processors 309 of every MINT in the MAN hub.
5 LINKS
5.1 Link Requirements
The links in the MAN system are used to transmit packets between the EUS and the NIM (EUSL) (links 14) and between the NIM and the MAN hub (XL) (links 3). Although the operation and the characteristics of the data that is transferred on these links vary slightly with the particular application, the format used on the links is the same. Having the formats be the same makes it possible to use common hardware and software.
The link format is designed to provide the following features.
1. It provides a high data rate packet channel.
2. It is compatible with the proposed Metrobus "OS-1" format.
3. Interfacing is easier because of the word oriented synchronous format.
4. It defines how "packets" are delimited.
5. It includes a CRC for an entire "packet" (and another for the header).
6. The format ensures transparency of the data within a "packet".
7. The format provides a low bandwidth channel for flow control signaling.
8. Additional low bandwidth channels can be added easily.
9. Data scrambling ensures good transition density for clock recovery.
5.2 MAN Link Description and Reasoning
From a performance point of view, the faster the links are the better MAN will perform. This desire to operate the links as fast as possible is tempered by the fact that faster links cost more. A reasonable tradeoff between speed and cost is to use LED transmitters (like the AT&T ODL-200) and multimode fiber. The use of ODL-200 transmitters and receivers puts an upper limit on the link speed of about 200 Mbit/sec. From the MAN architecture point of view, the exact data rate of the links is not important since MAN does not do synchronous switching. The data rate for the MAN links was chosen to be the same as the data rate of the Metrobus Lightwave System "OS-1" link. The Metrobus format is described in M. S. Schaefer: "Synchronous Optical Transmission Network for the Metrobus Lightwave Network", IEEE International Communications Conference, June 1987, Paper 30B.1.1. Another data rate (and format) that could be used in MAN will come from the specification of SONET, a link layer protocol specified by Bell Communications Research Corp. for 150 Mbit/sec unchannelized links.
5.2.1 Level 1 Link Format
The MAN network uses the low level link format of Metrobus. Information on the link is carried by a simple frame that is continuously repeated. The frame consists of 88 16-bit words. The first word contains a framing sequence and 4 parity bits. In addition to this first word, three other words are overhead words. These overhead words, which are used for internode communications in the Metrobus implementation, are left unused by MAN for the sake of Metrobus compatibility. The word oriented nature of the protocol makes using it much simpler. A simple 16 bit shift register with parallel load can be used to transmit and a similar shift register with parallel read out can be used to receive. At the 146.432 Mbit/sec link data rate, a 16 bit word is transmitted or received every 109 ns. This approach makes it possible to implement much of the link formatting hardware at conventional TTL clock rates. The word oriented nature of the protocol does put some restrictions on the way the link is used, however. To keep the complexity of the hardware reasonable it is necessary to use the bandwidth of the link in units of 16 bit words.
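The frame figures quoted above can be checked with a few lines of arithmetic. The constants are taken from the text; the variable names are illustrative, and the last figure counts every non-overhead word, including words that section 5.2.3 dedicates to low speed channels, so it is an upper bound on the packet channel rather than its exact rate.

```python
LINK_RATE_BPS = 146.432e6  # link data rate from the text
WORD_BITS = 16
FRAME_WORDS = 88
OVERHEAD_WORDS = 4         # framing word plus three other overhead words

# Time to send or receive one 16 bit word, in nanoseconds (about 109 ns).
word_time_ns = WORD_BITS / LINK_RATE_BPS * 1e9

# Bits per frame and frames per second (1408 bits, 104000 frames/s).
frame_bits = FRAME_WORDS * WORD_BITS
frames_per_second = LINK_RATE_BPS / frame_bits

# Rate left after the four overhead words (139.776 Mbit/s).
non_overhead_rate_bps = LINK_RATE_BPS * (FRAME_WORDS - OVERHEAD_WORDS) / FRAME_WORDS
```

A word time of roughly 109 ns corresponds to a word clock of about 9.15 MHz, which is consistent with the remark that the formatting hardware can run at conventional logic clock rates.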
5.2.2 Level 2 Link Format
The link is used to move "packets", the basic unit of information transfer in MAN. To identify packets, the format includes the specification of "SYNC" words and an "IDLE" word. When no packets are being transmitted the "IDLE" word will fill all of the words that make up the primary channel bandwidth (words not reserved for other purposes). Packets are delimited by a leading START SYNC and a trailing END SYNC word. This scheme works well as long as the words with special meanings are never contained in the data within a packet. Since restricting the data that can be sent in a packet is an unreasonable restriction, a transparent data transfer technique must be used. MAN links employ a very simple word stuffing transparency technique. Within the packet data, any occurrence of a special meaning word, like the START SYNC word, is preceded by another special word, the "DLE" word. This word stuffing transparency was chosen because of the simplicity of implementation. This protocol requires simpler, lower speed logic than is required for bit stuffing protocols like HDLC. The technique itself is similar to the time proven techniques used in IBM's BISYNC links. In addition to the word stuffing used to ensure transparency, "FILL" words are inserted if the data rate of the source is slightly less than the link data rate.
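A minimal sketch of word stuffing and its inverse is given below. The numeric values of the special words are hypothetical (the text does not give them), as are the function names; the CRC word and the low speed channels are ignored.

```python
# Hypothetical 16-bit code points; the actual values are not given in the text.
START_SYNC, END_SYNC, IDLE, DLE, FILL = 0xFFF1, 0xFFF2, 0xFFF3, 0xFFF4, 0xFFF5
SPECIAL = {START_SYNC, END_SYNC, IDLE, DLE, FILL}

def frame_packet(data):
    """Delimit a packet and precede every special word in the data with DLE."""
    out = [START_SYNC]
    for word in data:
        if word in SPECIAL:
            out.append(DLE)
        out.append(word)
    out.append(END_SYNC)
    return out

def unframe_stream(stream):
    """Recover the packets from a word stream, undoing the stuffing."""
    packets, current, escaped = [], None, False
    for word in stream:
        if current is None:
            if word == START_SYNC:
                current = []
            continue                  # IDLE words between packets are skipped
        if escaped:
            current.append(word)      # word after DLE is always literal data
            escaped = False
        elif word == DLE:
            escaped = True
        elif word == END_SYNC:
            packets.append(current)
            current = None
        elif word == FILL:
            continue                  # rate-matching filler, not data
        else:
            current.append(word)
    return packets
```

Because the receiver only ever inspects whole 16 bit words, the logic runs at the word rate rather than the bit rate, which is the simplicity argument made in the text.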
The last word in any packet is a cyclic redundancy check (CRC) word. This word is used to ensure that any corruption of the data in a packet can be detected. The CRC word is computed on all of the data in the packet, excluding any special words like "DLE" that may need to be inserted in the data stream for transparency or other reasons. The polynomial that is used to compute the CRC word is the CRC-16 standard.
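For illustration, a bitwise CRC-16 can be written as follows. The text names only "the CRC-16 standard"; this sketch reads that as the conventional polynomial x^16 + x^15 + x^2 + 1 and assumes the widely used reflected form with a zero initial value and no final XOR. The bit ordering, the initial value and the high-byte-first serialization of words are assumptions, not details from the patent.

```python
def crc16(data: bytes) -> int:
    """Bitwise CRC-16, polynomial x^16 + x^15 + x^2 + 1 (reflected: 0xA001).

    Zero initial value and no final XOR are assumed here.
    """
    crc = 0x0000
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc

def packet_crc(words):
    """CRC over the unstuffed 16-bit data words, high byte first (assumed)."""
    payload = b"".join(w.to_bytes(2, "big") for w in words)
    return crc16(payload)
```

The CRC is computed over the data words before any "DLE" or "FILL" words are inserted, so the receiver must strip those words before checking it.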
To ensure good transition density for the optical receivers all of the data is scrambled (e.g., block 296, FIG. 13) prior to transmission. The scrambling makes it less likely that long sequences of ones or zeros will be transmitted on the link even though they may be quite common in the data actually being transmitted. The scrambler and descrambler (e.g., block 252, FIG. 12) are well known in the art. The descrambler design is self synchronizing, which makes it possible to recover from occasional bit errors without having to restart the descrambler.
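The self synchronizing property can be demonstrated with a generic multiplicative scrambler. The text does not give the scrambler polynomial, so the tap positions below are hypothetical, as are the function names; the point of the sketch is only that the descrambler's register is filled from received bits, so an error flushes out after a bounded number of bit times.

```python
TAPS = (6, 7)  # hypothetical tap positions; the text gives no polynomial

def scramble(bits, taps=TAPS):
    """Multiplicative scrambler: each output bit is fed back into the register."""
    state = [0] * max(taps)        # state[k-1] holds the output k bits ago
    out = []
    for b in bits:
        s = b
        for t in taps:
            s ^= state[t - 1]
        out.append(s)
        state = [s] + state[:-1]
    return out

def descramble(bits, taps=TAPS):
    """Feed-forward descrambler: the register holds received bits only."""
    state = [0] * max(taps)
    out = []
    for r in bits:
        d = r
        for t in taps:
            d ^= state[t - 1]
        out.append(d)
        state = [r] + state[:-1]
    return out
```

With these taps, a single bit error on the link corrupts exactly three descrambled bits (the errored position and the positions 6 and 7 bits later), after which the output is correct again without any restart.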
5.2.3 Low Speed Channels and Flow Control
Not all of the payload words in the level 1 format are used for the level 2 format that carries packets. Additional channels are included on the link by dedicating particular words within the frame. These low rate channels 255, 295 (FIGS. 12 and 13) are used for MAN network control purposes. A packet delimiting scheme similar to that used on the primary data channel is used on these low rate channels. The dedicated words that make up low rate channels can be further divided down into individual bits for very low bandwidth channels like the flow control channel. The flow control channel is used on the MAN EUSL (between the EUS and the NIM) to provide hardware level flow control. The flow control channel (bit) from the NIM to the EUS indicates to the EUS link transmitter whether or not it is allowed to transmit more information. The design of the NIM is such that sufficient storage is available to absorb any data that is transmitted prior to the EUS transmitter actually stopping after flow control is asserted. Data transmission can be stopped either between packets or in the middle of a packet transmission. If it is between packets, the next packet will not be sent until flow control is deasserted. If flow control is asserted in the middle of a packet, it is necessary to suspend data transmission immediately and start sending the "Special FILL" code word. This code word, like all others, is escaped with the "DLE" code word when it appears in the body of a packet.
6 SYSTEM CLOCKING
The MAN switch, as described in section 3, is an asynchronous space switch fabric with a very fast setup controller. The data fabric of the switch is designed to reliably propagate digital signals with data rates from DC to in excess of 200 Mbits/second. Since many paths can simultaneously exist through the fabric, the aggregate bandwidth requirements of the MAN hub can be easily met by the fabric. This simple data fabric is not without drawbacks, however. Because of mechanical and electrical constraints in implementing the fabric, it is not possible for all paths through the switch to incur the same amount of delay. Because the variations in path delay between different paths may be much greater than the bit time of the data going through the switch, it is not possible to do synchronous switching. Any time that a path is set up from a particular ILH in a MINT to an output port of the switch, there is no guarantee that data transmitted over that path will have the same relative phase as the data transmitted over a previous path through the switch. To use this high bandwidth switch it is therefore necessary to very quickly synchronize data coming out of a switch port to the clock being used for the synchronous link to the NIM.
6.1 The Phase Alignment and Scrambler Circuit (PASC)
The unit that must do the synchronization of data coming from the switch and drive the outgoing link to the NIM is called the Phase Alignment and Scrambler Circuit (PASC) (block 290, FIG. 13). Since the ILHs and the PASC circuits are all part of the MAN hub, it is possible to distribute the same master clock to all of them. This has several advantages. By using the same clock reference in the PASC as is used to transmit data from the ILH, one can be sure that data cannot be coming into the PASC any faster than it is being moved out of it over the link. This eliminates the need for large FIFOs and elaborate elastic store controllers in the PASC. The fact that the bit rate of all data that comes into a PASC is exactly the same makes the synchronization easier.
The ILH and the PASC can be thought of as a distributed link handler for the format described in the previous section. The ILH creates the basic framing pattern into which the data is inserted and transmits it through the fabric to a PASC. The PASC aligns this framing pattern with its own framing pattern, merges in the low speed control channel and then scrambles the data for transmission.


The PASC synchronizes the incoming data to the reference clock by inserting an appropriate amount of delay into the data path. For this to work the ILH must be transmitting each frame with a reference clock that is slightly advanced from the reference clock used by the PASC. The number of bit times of advance that the ILH requires is determined by the actual minimum delay that may be incurred in getting from the ILH to the PASC. The amount of delay that the PASC must be capable of inserting into the data path is dependent on the possible variation in path delays that may occur for different paths through the switch.
FIG. 23 is a block diagram of an illustrative embodiment of the invention. Unaligned data enters a tapped delay line 1001. The various taps of the delay line are clocked into edge sampling latches 1003,...,1005 by a signal that is 180 degrees out of phase with the reference clock (REFCLK) and is designated as the inverse of REFCLK. The outputs of the edge sampling latches feed selection logic unit 1007 whose output is used to control a selector 1013 described below. Selection logic 1007 includes a set of internal latches for repeating the state of latches 1003,...,1005. The selection logic includes a priority circuit connected to these internal latches, for selecting the highest rank order input which carries a logical "one". The output is a coded identification of this selected input. The selection logic 1007 has two gating signals: a clear signal and a signal from all of a group of internal latches of the selection logic. Between data streams, the clear signal goes to a zero state causing the internal latches to accept new inputs. After the first "one" input has been received from the edge sampling latches 1003,...,1005 in response to the first pulse of a data stream, the state of the transparent latches is maintained until the clear signal goes back to the zero state. The clear signal is set by out of band circuitry which recognizes the presence of a data stream.
The output of the tapped delay line also goes to a series of data latches 1009,...,1011. The input to the data latches is clocked by the reference clock. The outputs of the data latches 1009,...,1011 are the inputs to selector circuit 1013 which selects the output of one of these data latches based on the input from selection logic 1007 and connects this output to the output of the selector 1013, which is the bit aligned data stream as labeled on FIG. 23.
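The interplay of the priority circuit, the clear signal and the selector can be modelled behaviourally. The sketch below is a toy abstraction of FIG. 23, not a circuit description: the class and function names are invented, the latches are plain lists of bits, and the choice of index 0 as the highest rank is an assumption, since the text does not say which end of the delay line has priority.

```python
def priority_select(edge_latches):
    """Return the index of the highest-ranked latch holding a logical one.

    Rank is assumed to follow list order (index 0 is highest). Returns
    None if no latch has captured the first pulse yet.
    """
    for index, bit in enumerate(edge_latches):
        if bit:
            return index
    return None

class TapSelector:
    """Holds the chosen delay-line tap for the duration of one data stream."""

    def __init__(self):
        self.selected = None

    def clear(self):
        # Between data streams: drop the old choice and accept new inputs.
        self.selected = None

    def clock(self, edge_latches, data_latches):
        """One reference-clock tick; returns the aligned bit, or None."""
        if self.selected is None:
            self.selected = priority_select(edge_latches)
        if self.selected is None:
            return None
        return data_latches[self.selected]
```

Once a tap has been chosen on the first pulse of a stream, later changes in the edge latches are ignored until `clear()` is called, mirroring the way the selection is held for the whole data stream.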
After the bits have been aligned, they are fed into a shift register (not shown) with tapped outputs to feed the driver XL3. This is to allow data streams to be transmitted synchronously starting at sixteen bit boundaries. The operation of the shift register and auxiliary circuitry is substantially the same as that of the tapped delay line arrangement.
The selection logic is implemented in commercially available priority selection circuits. The selector is simply a one out of eight selector controlled by the output of the selection logic. If it is necessary to have a finer alignment circuit using a one of sixteen selection, this can be readily implemented using the same principles. The arrangement described herein appears to be especially attractive in situations where there is a common source clock and where the length of each data stream is limited. The common source clock is required since the clock is not derived from the incoming signal, but is, in fact, used to gate an incoming signal appropriately. The limitation on the length of the block is required since a particular gating selection is maintained for the entire block so that if the block length were too long, any substantial amount of phase wandering would cause synchronism to be lost and bits to be dropped.
While in the present embodiment, the signal is passed through a tapped delay line and is sampled by the clock and inverse clock, the alternative arrangement of passing the clock through a tapped delay line and using the delayed clocks to sample the signal could also be used in some applications.
6.2 Clock Distribution
The MAN hub operation is very dependent on the use of a single master reference clock for all of the ILH and PASC units in the system. The master clock must be distributed accurately and reliably to all of the units. In addition to the basic clock frequency that must be distributed, the frame start pulse must be distributed to the PASC and an advanced frame start pulse must be distributed to the ILH. All of these functions are handled by using a single clock distribution link (fiber or twisted pair) going to each unit.
The information that is carried on these clock distribution links comes from a single clock source. This information can be split in the electrical and/or optical domain and transmitted to as many destinations as necessary. There is no attempt to keep the information on all of the clock distribution links exactly in phase since the ILH and PASC are capable of correcting for phase differences no matter what the reason for this difference. The information that is transmitted is simply alternating ones and zeros with two exceptions. The occurrence of two ones in a row indicates an advanced frame pulse and the occurrence of two zeroes in a row indicates a normal frame pulse. Each board that terminates one of these clock distribution links contains a clock recovery module. The clock recovery module is the same as that used for the links themselves. The clock recovery module will provide a very stable bit clock while additional logic extracts the appropriate frame or advanced frame from the data itself. Since the clock recovery modules will continue to oscillate at the correct frequency even without bit transitions for several bit times, even the unlikely occurrence of a bit error will not affect the clock frequency. The logic that looks for the frame or advanced frame signal can also be made tolerant of errors since it is known that the frame pulses are periodic and extraneous pulses caused by bit errors can be ignored.
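The frame marker encoding on the clock distribution link lends itself to a very small decoder. The sketch below works on an already recovered bit sequence; the function name and the returned tuple format are illustrative, and the error tolerance mentioned in the text (discarding pulses that do not fit the known frame period) is not modelled.

```python
def find_frame_pulses(bits):
    """Scan a clock-distribution bit stream for frame markers.

    Returns a list of (index, kind), where index is the position of the
    second bit of a repeated pair and kind is "advanced" for two ones in
    a row or "normal" for two zeroes in a row.
    """
    pulses = []
    for i in range(1, len(bits)):
        if bits[i] == bits[i - 1]:
            pulses.append((i, "advanced" if bits[i] else "normal"))
    return pulses
```

For the stream `[1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0]` the decoder reports an advanced frame pulse at index 5 and a normal frame pulse at index 9; every other position simply alternates and carries only the clock.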
7 NE!lVVORlK INTERFACE MODULE~
7.1 Overview
The network interface module (NIM) connects one or more end user system links (EUSL) to one MAN external link (XL). In so doing, the NIM performs concentration and demultiplexing of network transaction units (i.e. packets and SUWUs), as well as ensuring source identification integrity by affixing a physical "source port number" to each outgoing packet. The latter function, in combination with the network registration service described in Section 2.4, prevents a user from masquerading as another for the purpose of gaining access to unauthorized network-provided services. The NIM thereby represents the boundary of the MAN network proper; NIMs are owned by the network provider, while UIMs (described in Section 8) are owned by the users themselves.
This section describes the basic functions of the NIM in more detail, and presents the NIM architecture.
7.2 Basic Functions
The NIM must perform the following basic functions:
EUS Link interfacing. One or more interfaces must be provided to EUS link(s) (see Section 2.2.5). The downstream link (i.e. from NIM to UIM) consists of a data channel and an out-of-band channel used by the NIM to flow control the upstream link when NIM input buffers become full. Because the downstream link is not flow controlled, the flow control channel on the upstream link is unused. The Data and Header Check Sequences (DCS, HCS) are generated by the UIM on the upstream link, and checked by the UIM on the downstream link.
External Link interfacing. The XL (Section 2.2.6) is very similar to the EUSL, but lacks DCS checking and generation on both ends. This is to allow erroneous, but still potentially useful, data to be delivered to the UIM. The destination port numbers in network transaction units arriving on the downstream XL are checked by the NIM, with illegal values resulting in dropped data.


Concentration and demultiplexing. Network transaction units arriving on the EUSLs contend for and are statistically multiplexed to the outgoing XL. Those arriving on the XL are routed to the appropriate EUSL by mapping the destination port number to one or more EUS links.
Source port identification. The port number of the source UIM is prepended to each network transaction unit going upstream by port number generator 403 (FIG. 16). This port number will be checked against the MAN address by the MINT to prevent unauthorized access to services (including the most basic data transport service) by "imposters".
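The source port identification function can be sketched as follows. This is a minimal model under stated assumptions: the record layouts, the field names, and the registration table mapping each physical port to exactly one MAN address are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass
from typing import Dict

@dataclass(frozen=True)
class TransactionUnit:
    source_address: str   # MAN address claimed by the sender (hypothetical field)
    payload: bytes

@dataclass(frozen=True)
class TaggedUnit:
    source_port: int      # physical port number prepended by the NIM
    unit: TransactionUnit

def nim_tag(port: int, unit: TransactionUnit) -> TaggedUnit:
    """Model of the NIM prepending the physical source port number."""
    return TaggedUnit(port, unit)

def mint_check(tagged: TaggedUnit, registered: Dict[int, str]) -> bool:
    """Model of the MINT check: the claimed MAN address must be the one
    registered for the physical port on which the unit arrived."""
    return registered.get(tagged.source_port) == tagged.unit.source_address
```

Because the port number is attached by provider-owned equipment rather than by the user, a unit claiming another user's address fails the check even though its contents are otherwise well formed.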
7.3 NIM Architecture and Operation
The architecture of the NIM is depicted in FIG. 16. The following subsections briefly describe the operation of the NIM.
7.3.1 Upstream Operation
Incoming network transaction units are received from the UIMs at their EUSL interface 400 receivers 402, are converted to words in serial to parallel converters 404 and are accumulated in FIFO buffers 94. Each EUSL interface is connected to the NIM transmit bus 95, which consists of a parallel data path, and various signals for bus arbitration and clocking. When a network transaction unit has been buffered, the EUSL interface 400 arbitrates for access to the transmit bus 95. Arbitration proceeds in parallel with data transmission on the bus. When the current data transmission is complete, the bus arbiter awards bus ownership to one of the competing EUSL interfaces, which begins transmission. For each transaction, the EUSL port number, inserted at the beginning of each packet by port number generator 403, is transmitted first, followed by the network transaction unit. Within an XL interface 440, the XL transmitter 96 provides the bus clock, and performs parallel to serial conversion 442 and data transmission on the upstream XL 3.
7.3.2 Downstream Operation
Network transaction units arriving from the MINT on the downstream XL 3 are received within XL interface 440 by the XL receiver 446, which is connected via serial to parallel converter 448 to the NIM receive bus 430. The receive bus is similar to, but independent of, the transmit bus. Also connected to the receive bus via a parallel to serial converter 408 are the EUSL interface transmitters 410. The XL receiver performs serial to parallel conversion, provides the receive bus clock, and sources the incoming data onto the bus. Each EUSL interface decodes the EUSL port number associated with the data, and forwards the data to its EUSL if appropriate. More than one EUSL interface may forward the data if required, as in a broadcast or multicast operation. Each decoder 409 checks the receive bus 430 while port number(s) are being transmitted to see if the following packet is destined for the end user of this EUSL interface 400; if so, the packet is forwarded to transmitter 410 for delivery to an EUSL 14. Illegal EUSL port numbers (e.g. violations of the error coding scheme) result in the data being dropped (i.e. not forwarded by any EUSL interface). Decode block 409 is used to gate information destined for a particular EUS link from transmit bus 95 to the parallel/serial converter 408 and transmitter 410.
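The downstream decoding behaviour may be sketched as a simple lookup. The following is illustrative only: the interface names, the set-based representation of port numbers, and the rule that any illegal port number causes the whole unit to be dropped are assumptions made for the example.

```python
from typing import Dict, Iterable, List, Set

def route_downstream(dest_ports: Iterable[int],
                     interface_ports: Dict[str, Set[int]],
                     legal_ports: Set[int]) -> List[str]:
    """Return the EUSL interfaces that forward a unit seen on the receive bus.

    Each interface compares the destination port number(s) with its own;
    several interfaces may match (broadcast or multicast). A unit carrying
    an illegal port number is forwarded by no interface.
    """
    ports = set(dest_ports)
    if not ports or not ports <= legal_ports:
        return []  # dropped: no interface forwards the data
    return [name for name, owned in interface_ports.items() if owned & ports]
```

For example, with three interfaces owning ports 1, 2 and 3 respectively, a unit addressed to ports 1 and 3 is forwarded by the first and third interfaces only.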

8.1 Overview
A user interface module (UIM) consists of the hardware and software necessary to connect one or more end user systems (EUS), local area networks (LAN), or dedicated point-to-point links to a single MAN end user system link (EUSL) 14. Throughout this section, the term EUS will be used to generically refer to any of these network end user systems. Clearly, a portion of the UIM used to connect a particular type of EUS to MAN is dependent on the architecture of that EUS, as well as the desired performance, flexibility, and cost of the implementation. Some of the functions provided by a UIM, however, must be provided by every UIM in the system. It is therefore convenient to view the architecture of a UIM as having two distinct halves: the network interface, which provides the EUS-independent functionality, and the EUS interface, which implements the remainder of the UIM functions for the particular type of EUS being connected.
Not all EUSs will require the performance inherent in a dedicated external link. The concentration provided by a NIM (described in Section 7) is an appropriate way to provide access to a number of EUSs which have stringent response time requirements along with the instantaneous I/O bandwidth necessary to effectively utilize the full MAN data rate, but which do not generate the volume of traffic necessary to efficiently load the XL. Similarly, several EUSs or LANs could be connected to the same UIM via some intermediate link (or the LANs themselves). In this scenario, the UIM acts as a multiplexer by providing several EUS (actually LAN or link) interfaces to go with one network interface. This method is well suited to EUSs which do not allow direct connections to their system busses, and which provide only a link connection that is itself limited in bandwidth. End users can provide their multiplexing or concentration at a UIM and MAN can provide further multiplexing or concentration at the NIM.
This section examines the architectures of both the network interface and EUS interface halves of the UIM. The functions provided by the network interface are described, and the architecture is presented. The heterogeneity of EUSs that may be connected to MAN does not allow such a generic treatment of the EUS interfaces. Instead, the EUS interface design options are explored, and a specific example of an EUS is used to illustrate one possible EUS interface design.
8.2 UIM - Network Interface
The UIM network interface implements the EUS-independent functions of the UIM. Each network interface connects one or more EUS interfaces to a single MAN EUSL.
8.2.1 Basic Functions
The UIM network interface must perform the following functions:
EUS Link interfacing. The interface to the EUS Link includes an optical transmitter and receiver, along with the hardware necessary to perform the link level functions required by the EUSL (e.g. CRC generation and checking, data formatting, etc.).
Data buffering. Outgoing network transaction units (i.e. packets and SUWUs) must be buffered so that they may be transmitted on the fast network link without gaps. Incoming network transaction units are buffered for purposes of speed matching and level three (and above) protocol processing.
Buffer memory management. The packets of one LUWU may arrive at the receive UIM interleaved with those of another LUWU. In order to support this concurrent reception of several LUWUs, the network interface must manage its receive buffer memory in a dynamic fashion, allowing incoming packets to be chained together into LUWUs as they arrive.
Protocol processing. Outgoing LUWUs must be fragmented into packets for transmission into the network. Similarly, incoming packets must be recombined into LUWUs for delivery to the receiving process within the EUS.
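The fragmentation and recombination functions listed above, including the concurrent reception of interleaved LUWUs, can be sketched as follows. The packet layout (LUWU identifier, sequence number, last-packet flag, data) and all names are hypothetical; the actual MAN packet header is not reproduced here.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# Hypothetical packet layout: (luwu_id, sequence number, is_last, data)
Packet = Tuple[int, int, bool, bytes]

def fragment(luwu_id: int, data: bytes, max_payload: int) -> List[Packet]:
    """Split one LUWU into packets carrying at most max_payload bytes each."""
    chunks = [data[i:i + max_payload]
              for i in range(0, len(data), max_payload)] or [b""]
    return [(luwu_id, seq, seq == len(chunks) - 1, chunk)
            for seq, chunk in enumerate(chunks)]

class Reassembler:
    """Chains incoming packets per LUWU, tolerating interleaved arrival."""

    def __init__(self) -> None:
        self._chains: Dict[int, Dict[int, bytes]] = defaultdict(dict)
        self._last_seq: Dict[int, int] = {}

    def accept(self, packet: Packet) -> Optional[bytes]:
        """Store a packet; return the whole LUWU once all parts are present."""
        luwu_id, seq, is_last, data = packet
        self._chains[luwu_id][seq] = data
        if is_last:
            self._last_seq[luwu_id] = seq
        last = self._last_seq.get(luwu_id)
        if last is not None and len(self._chains[luwu_id]) == last + 1:
            chain = self._chains.pop(luwu_id)
            del self._last_seq[luwu_id]
            return b"".join(chain[i] for i in range(last + 1))
        return None
```

Keeping a separate chain per LUWU identifier is what allows the packets of two LUWUs to arrive interleaved without either being corrupted.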
8.2.2 Architectural Options
Clearly, all of the functions enumerated in the previous subsection must be performed in order to interface any EUS to a MAN EUSL. However, some architectural decisions must be made regarding where these functions are performed; i.e., whether they are internal or external to the host itself.
The first two functions must be located external to the host, although for different reasons. The first and lowest level function, that of interfacing to the MAN EUS Link, must be implemented externally simply because it consists of special purpose hardware which is not part of a generic EUS. The EUS link interface simply appears as a bidirectional I/O port to the remainder of the UIM network interface. On the other hand, the second function, data buffering, cannot be implemented in existing host memory because the bandwidth requirements are too stringent. On reception, the network interface must be able to buffer incoming packets or SUWUs back-to-back at the full network data rate (150 Mb/s). This data rate is such that it is generally impossible to deposit incoming packets directly into EUS memory. Similar bandwidth constraints apply to packet and SUWU transmission as well, since they must be completely buffered and then transmitted at the full 150 Mb/s rate. These constraints make it desirable to provide the necessary buffer memory external to the EUS. It should be noted that while FIFO memory will suffice to provide the necessary speed matching for transmission, the lack of flow control on reception along with the interleaving of received packets necessitate that a larger amount of random access memory be provided as receive buffer memory. For MAN, the size of receive buffer memory may range from 256 Kbytes to 1 Mbyte. The particular size depends on the interrupt latency of the host and on the maximum size LUWU allowed by the host software.
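The relation between receive buffer size and host interrupt latency can be made concrete with a short calculation. The sketch assumes that 1 Kbyte means 1024 bytes, that 150 Mb/s means 150 x 10^6 bits per second, and that data arrives continuously at the full link rate with nothing being drained; the function name is hypothetical.

```python
LINK_RATE_BITS_PER_S = 150_000_000  # 150 Mb/s, taken as 150 * 10**6 bits/s

def fill_time_ms(buffer_bytes: int,
                 rate_bits_per_s: int = LINK_RATE_BITS_PER_S) -> float:
    """Time in milliseconds for back-to-back traffic to fill an empty buffer."""
    return buffer_bytes * 8 / rate_bits_per_s * 1000.0

small = fill_time_ms(256 * 1024)    # 256 Kbyte receive buffer
large = fill_time_ms(1024 * 1024)   # 1 Mbyte receive buffer
```

Under these assumptions a 256 Kbyte buffer absorbs roughly 14 ms of uninterrupted traffic and a 1 Mbyte buffer roughly 56 ms, which gives a feel for the host response times the stated range is meant to cover.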
The final two functions involve processing, which could conceivably be performed by the host processor itself. The third function, buffer memory management, involves the timely allocation and deallocation of blocks of receive buffer memory. The latency requirement associated with the allocation operation is stringent, due once more to the high data rates and the possibility of packets arriving back-to-back. However, this can be alleviated (for reasonable burst sizes) by pre-allocating several blocks of memory. It is possible, therefore, for the host processor to manage the receive packet buffers. Similarly, the host processor may or may not assume the burden of the fourth function, that of MAN protocol processing.
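The pre-allocation technique mentioned above can be sketched as a small queue of ready block addresses that is consumed on the fast path and replenished on a slower path. The class name, the `depth` parameter, and the use of integers as block addresses are assumptions made for illustration.

```python
from collections import deque
from typing import Deque, Iterable

class PreallocatedBlocks:
    """Keeps a few receive blocks ready so that packet arrival never waits
    on the (slower) processor that manages the free list."""

    def __init__(self, free_blocks: Iterable[int], depth: int) -> None:
        self._free: Deque[int] = deque(free_blocks)
        self._ready: Deque[int] = deque()
        self._depth = depth
        self.refill()

    def refill(self) -> None:
        """Slow path: top the ready queue back up to its nominal depth."""
        while len(self._ready) < self._depth and self._free:
            self._ready.append(self._free.popleft())

    def take(self) -> int:
        """Fast path: called once per arriving packet."""
        if not self._ready:
            raise RuntimeError("burst exceeded the pre-allocated blocks")
        return self._ready.popleft()

    def release(self, block: int) -> None:
        """Return a delivered block to the free list."""
        self._free.append(block)
```

The depth of the ready queue bounds the burst of back-to-back packets that can be absorbed between two runs of `refill`, which is the sense in which the latency requirement is relaxed only "for reasonable burst sizes".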
The location of these final two functions determines the level at which the EUS connects to the UIM. If the host CPU assumes the burden for packet buffer memory management and MAN protocol processing (the "local" configuration), then the unit of data transferred across the EUS interface is a packet, and the host is responsible for fragmenting and recombining LUWUs. If, on the other hand, those functions are off-loaded to another processor in the UIM, the front end processor (FEP) configuration, the unit of data transferred across the EUS interface is a LUWU. While in theory, subject to interleaving constraints at the EUS interface, the unit of data transferred may be any amount less than or equal to the entire LUWU, and the units delivered by the transmitter need not be the same size as those accepted by the receiver, for a general and uniform solution, useful for a variety of EUSs, the LUWU is to be preferred as the basic unit. The FEP configuration offloads the majority of the processing burden from the host CPU, as well as providing for a higher level EUS interface, thereby hiding the details of network operation from the host. With the FEP, the host knows only about LUWUs, and can control their transmission and reception at a higher, less CPU intensive level.
Although a lower cost interface is possible utilizing the local configuration, the network interface architecture described in the following section is a FEP configuration more characteristic of that required by some of the high performance EUSs that are natural users of a MAN network. An additional reason for choosing the FEP configuration initially is that it is better suited for interfacing MAN to a LAN such as ETHERNET, in which case there is no "host CPU" to provide buffer memory management and protocol processing.
8.2.3 Network Interface Architecture
The architecture of the UIM network interface is depicted in FIG. 17. The following subsections briefly describe the operation of the UIM network interface by presenting scenarios for the transmission and reception of data. An FEP-type architecture is employed, i.e., receive buffer memory management and MAN network layer protocol processing are performed external to the host CPU of the EUS.
8.2.3.1 Transmission of Data
The main responsibilities of the network interface on transmission are to fragment the arbitrary sized transmit user work units (UWUs) into packets (if necessary), encapsulate the user data in the MAN header and trailer, and transmit the data to the network. To begin transmission, a message from the EUS requesting transmission of a LUWU traverses the EUS interface and is handled by network interface processing 450, which also implements memory management and protocol processing functions. For each packet, the protocol processor portion of the interface processing 450 formulates a header and writes it into the transmit FIFO 15. Data for that packet is then transferred across the EUS interface 451 into the transmit FIFO 15 within link handler 460. When the packet is completely buffered, the link handler 460 transmits it onto the MAN EUS link using transmitter 454, followed by the trailer, which was computed by the link handler 460. The link is flow controlled by the NIM to ensure that the NIM packet buffers do not overflow. This transmission process is repeated for each packet. The transmit FIFO 15 contains space for two maximum length packets so that packet transmission may occur at the maximum rate. The user is notified via the EUS interface 451 when the transmission is complete.
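The encapsulation step, in which a header is written first, the data follows, and a trailer computed by the link handler is appended, can be sketched as below. This is an illustration under assumptions: the 32-bit CRC from the Python standard library stands in for the actual check sequence, whose definition is not given in this section, and the header contents are arbitrary bytes.

```python
import struct
import zlib

TRAILER_LEN = 4  # assumption: a single 32-bit check value forms the trailer

def encapsulate(header: bytes, data: bytes) -> bytes:
    """Build header + data, then append a trailer computed over both."""
    body = header + data
    trailer = struct.pack(">I", zlib.crc32(body))
    return body + trailer

def verify(frame: bytes) -> bool:
    """Recompute the check value at the receiver and compare with the trailer."""
    body, trailer = frame[:-TRAILER_LEN], frame[-TRAILER_LEN:]
    return struct.pack(">I", zlib.crc32(body)) == trailer
```

Computing the trailer in the link handler, after the packet is completely buffered, matches the ordering described above: the check value is only known once both header and data have passed through.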
8.2.3.2 Reception of Data
Incoming data is received by receiver 458 and loaded at the 150 Mb/s link rate into elastic buffer 462. Dual-ported video RAM is utilized for the receive buffer memory 90, and the data is unloaded from the elastic buffer and loaded into the shift register 464 of receive buffer memory 90 via its serial access port. Each packet is then transferred from the shift register into the main memory array 466 of the receive buffer memory under the control of the receiver DMA sequencer 452. The block addresses used to perform these transfers are provided by the network interface processing arrangement 450 of UIM 13 via the buffer memory controller 456, which buffers a small number of addresses in hardware to relieve the strict latency requirements which would otherwise be imposed by back-to-back SUWUs. Block 450 is composed of blocks 530, 540, 542, 550, 552, 554, 556, 558, 560, and 562 of FIG. 19. Because the network interface processing has direct access to the buffer memory via its random access port, headers are not stripped off; rather they are placed into buffer memory along with the data. The receive queue manager 558 within 450 handles the headers and, with input from the memory manager 550, keeps track of the various SUWUs and LUWUs as they arrive. The EUS is notified of the arrival of data by the network interface processing arrangement 450 via the EUS interface. The details of how data is delivered to the EUS are a function of the particular EUS interface being employed, and are described, for example, in section 8.3.3.2.
8.3 UIM - EUS Interfaces
8.3.1 Philosophy
This section describes the "half" of the network interface that is EUS dependent. The basic function of the EUS interface is the delivery of data between the EUS memory and the UIM network interface, in both directions. Each particular EUS interface will define the protocol to effect delivery, the format of data and control messages, and the physical path for control and data.


Each side of the interface has to implement a flow control mechanism to protect itself from being overrun. The EUS must be able to control its own memory and the flow of data into it from the network, and the network has to be able to protect itself as well. Only at this basic functional level is it possible to talk about commonality in EUS interfaces. EUS interfaces will be different because of EUS hardware and system software differences. The needs of the applications using the network, coupled with the capabilities of the EUS, will also force interface design decisions dealing with performance and flexibility. There will be numerous interface choices even for a single type of EUS.
This set of choices means that the interface hardware can range from simple designs with few components to complex designs including sophisticated buffering and memory management schemes. Control functions in the interface can range from simple EUS interfaces to handling network level 3 protocols and even higher level protocols for distributed applications. Software in the EUS can also range from straightforward data transmission schemes that fit underneath existing networking software, to more extensive new EUS software that would allow very flexible uses of the network or allow the highest performance that the network has to offer. These interfaces must be tailored to the specific existing EUS hardware and software systems, but there must also be an analysis of the cost of interface features in comparison to the benefits they would deliver to the network applications running in these EUSs.
8.3.2 EUS Interface Design Options
The tradeoff between a front end processor (FEP) and EUS processing is one example of different interface approaches to accomplish the same basic function. Consider variations in receive buffering. A specialized EUS architecture with a high performance system bus could receive network packet messages directly from the network links. However, usually the interface will at least buffer packet messages as they come off the link, before they are delivered into EUS memory. Normally EUSs, either transmitting to or receiving from the network, do not know (or want to know) anything about the internal packet message. In that case, the receiving interface might have to buffer multiple packets that come from the LUWU of data that is the natural sized transmission unit between the transmit and receive EUSs. Each one of these three receive buffering situations is possible and each would require a significantly different EUS interface to transfer data into the EUS memory. If the EUS has a particular need to process network packet messages and has the processing power and system bus performance to devote to that task, then the EUS dependent portion of the network interface would be simple. However, often it will be desirable to off-load that processing into the EUS interface and improve the EUS performance.
Different transmit buffering approaches also illustrate the tradeoff between FEP and EUS processing. For a specialized application, an EUS with high performance processor and bus could send network packet messages directly into the network. But if the application used EUS transaction sizes that were much larger than the packet message size, it might take too much of the EUS processing to produce packet messages on its own. An FEP could offload that work of doing this level 3 network protocol formatting. This would also be the case where the EUS wishes to be independent of the internal network message size, or where it has a diverse set of network applications with a great variation in transmission size.
Depending on the hardware architecture of the EUS, and the level of performance desired, there is the choice between programmed I/O and DMA to move data between EUS memory and the network interface. In the programmed I/O approach, probably both control and data will move over the same physical path. In the DMA approach there will be some kind of shared memory interface to move control information in an EUS interfacing protocol, and a DMA controller in the EUS interface to move data between buffer memory and EUS memory over the EUS system bus without using EUS processor cycles.
There are several alternatives that exist for the location of EUS buffering for network data. The data could be buffered on a front end processor network controller circuit board with its own private memory. This memory can be connected to the EUS by busses using DMA transfer or dual ported memory accessed via a bus or dual ported memory located on the CPU side of a bus using private busses. The application now must access the data. Various techniques are available; some involve mapping the end user work space directly to the address space used by the UIM to store the data. Other techniques require the operating system to further buffer the data and recopy into the user's private address space.
Options exist in writing the driver level software in the EUS that is responsible for moving control and data information over the interface. The driver could also implement the EUS interface protocol processing as well as just moving bits over the interface. For the driver to still run efficiently the protocol processing in the driver might not be very flexible. For more flexibility based on a particular application, the EUS interface protocol processing could be moved up to a higher level. Closer to the application, more intelligence could be applied to the interface decisions, at the expense of more EUS processing time. The EUS could implement various interface protocol approaches for delivery of data to and from the network: prioritization, preemption, etc. Network applications that did not require such flexibility could use a more direct interface to the driver and the network.
So, there are a variety of choices to be made at different levels in the system in both the hardware and the software.
8.3.3 Implementation Example: SUN Workstation Interface
To illustrate the EUS dependent portion of the interface we describe one specific interface. The interface is to the Sun-3 VME bus based workstations manufactured by Sun Microsystems, Inc. This is an example of a single EUS connected to a single network interface. The EUS also allows connection directly to its system bus. The UIM hardware is envisioned as a single circuit board that plugs into the VME bus system bus.
First, there follows a description of the Sun I/O architecture, and then a description of the choices made in designing the interface hardware, the interface protocol, and the connection to new and existing network applications software.
8.3.3.1 SUN Workstation I/O Architecture
The Sun-3's I/O architecture, based on the VME bus structure and its memory management unit (MMU), provides a DMA approach called direct virtual memory access (DVMA). FIG. 17 shows the Sun DVMA. DVMA allows devices on the system bus to do DMA directly to Sun processor memory, and also allows main bus masters to do DMA directly to main bus slaves without going through processor memory. It is called "virtual" because the addresses that a device on the system bus uses to communicate with the kernel are virtual addresses similar to those the CPU would use. The DVMA approach makes sure that all addresses used by devices on the bus are processed by the MMU, just as if they were virtual addresses generated by the CPU. The slave decoder 512 (FIG. 18) responds to the lowest megabyte of VME bus address space (0x0000 0000 -> 0x000f ffff, in the 32 bit VME address space) and maps this megabyte into the most significant megabyte of the system virtual address space (0xff0 0000 -> 0xfff ffff in the 28 bit virtual address space). (0x means that the subsequent characters are hexadecimal characters.) When the driver needs to send the buffer address to the device, it must strip off the high 8 bits from the 28 bit address, so that the address that the device puts on the bus will be in the low megabyte (20 bits) of the VME address space.
In FIG. 18, the CPU 500 drives a memory management unit 502, which is connected to a VME bus 504 and on board memory 506 that includes a buffer 508. The VME bus communicates with DMA devices 510. Other on board bus masters, such as an ETHERNET access chip, can also access memory 508 via MMU 502. Thus, devices can only make DVMA transfers in memory buffers that are reserved as DVMA space in these low (physical) memory areas. The kernel does however support redundant mapping of physical memory pages into multiple virtual addresses. In this way, a page of user memory (or kernel memory) can be mapped into DVMA space in such a way that the data appears in (or comes from) the address space of the process requesting that operation. The driver uses a routine called mbsetup to set up the kernel page maps to support this direct user space DVMA.
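The address arithmetic of the DVMA mapping can be checked with a short sketch. The constants follow from the address ranges given above; the function names are hypothetical and the sketch models only the bit manipulation, not the MMU or the mbsetup routine.

```python
DVMA_BASE = 0xFF00000    # start of the top megabyte of the 28-bit virtual space
VIRT_MAX = 0xFFFFFFF     # highest 28-bit virtual address
LOW_20_MASK = 0x00FFFFF  # low megabyte (20 bits) of the VME address space

def kernel_to_device(vaddr: int) -> int:
    """Strip the high 8 bits of a 28-bit DVMA address for use on the VME bus."""
    if not DVMA_BASE <= vaddr <= VIRT_MAX:
        raise ValueError("address is outside the DVMA megabyte")
    return vaddr & LOW_20_MASK

def device_to_kernel(bus_addr: int) -> int:
    """Map a low-megabyte VME bus address back into the top virtual megabyte."""
    if not 0 <= bus_addr <= LOW_20_MASK:
        raise ValueError("address is outside the low megabyte of VME space")
    return DVMA_BASE | bus_addr
```

For instance, the 28-bit address 0xFF12345 is presented to the device as 0x12345, and the slave decoder's mapping takes 0x12345 on the bus back to 0xFF12345.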
8.3.3.2 SUN UIM - EUS Interface Approach
As mentioned above there are many options in designing a particular interface. With the Sun-3 interface, a DMA transfer approach was designed, an interface with FEP capabilities, an interface with high performance matching the system bus, and an EUS software flexibility to allow various new and existing network applications to use the network. FIG. 19 shows an overview of the interface to the Sun-3.
The Sun-3's are systems with potentially many simultaneous processes running in support of the window system, and multiple users. The DMA and FEP approaches were chosen to offload the Sun processor while the network transfers are taking place. The UIM hardware is envisioned as a single circuit board that plugs into the VME bus system bus. With the chance to connect directly to the system bus it is desirable to attempt the highest performance interface possible. Sun's DVMA provides a means to move data efficiently to and from processor memory. There is a DMA controller 92 in the UIM (FIG. 4) to move data from the UIM to EUS memory and data from EUS memory to the UIM over the bus, and there will be a shared memory interface to move control information in the host interfacing protocol. The front end processor (FEP) approach means that the data from the network is presented to the EUS at a higher level. Level 3 protocol processing has been performed and packets have been linked together into LUWUs, the user's natural sized unit of transmission. With the potential variety of network applications that could be running on the Sun, the FEP approach means that EUS software does not have to be tightly coupled to the internal network packet format.
The Sun-3 DVMA architecture will limit the EUS transaction sizes to a maximum of one megabyte. If user buffers are not locked in, then kernel buffers would be used, as an intermediate step between the device and the user, with the associated performance penalty for the copy operation. If transfers are going to be made directly to user space, using the "mbsetup" approach, the user's space will be locked into memory, not available for swapping, during the whole transfer process. This is a tradeoff; it ties up the resources in the machine, but it may be more efficient if it avoids a copy operation from some other buffer in the kernel.
The Sun system has existing network applications running on ETHERNET, for example, their Network File System (NFS). To run these existing applications on MAN but still leave open the possibility for new applications that could use the expanded capabilities of MAN, we needed flexible EUS software and a flexible interface protocol to be able to simultaneously handle a variety of network applications.
FIG. 19 is a functional overview of the operation and interfaces among the NIM, UIM, and EUS. The specific EUS shown in this illustrative example is a Sun-3 workstation, but the principles apply to other end user systems having greater or lesser sophistication. Consider first the direction from the MINT via the NIM and UIM to the EUS. As shown in FIG. 4, data that is received from MINT 11 over link 3 is distributed to one of a plurality of UIMs 13 over links 14 and is stored in receive buffer memory 90 of such a UIM, from which data is transmitted in a pipelined fashion over an EUS bus 92 having a DMA interface to the appropriate EUS. The control structure for accomplishing this transfer of data is shown in FIG. 19, which shows that the input from the MINT is controlled by a MINT to NIM link handler 520, which transmits its output under the control of router 522 to one of a plurality of NIM to UIM link handlers (N/U LH) 524. MINT/NIM link handler (M/N LH) 520 supports a variant on the Metrobus physical layer protocol. The NIM to UIM link handler 524 also supports the Metrobus physical layer protocol in this implementation, but other protocols could be supported as well. It is possible that different protocols could coexist on the same NIM. The output of the N/U LH 524 is sent over a link 14 to a UIM 13, where it is buffered in receive buffer memory 90 by NIM/UIM link handler 552. The buffer address is supplied by memory manager 550, which manages free and allocated packet buffer lists. The status of the packet reception is obtained by N/U LH 552, which computes and verifies the checksum over header and data, and outputs the status information to receive packet handler 556, which pairs the status with the buffer address received from memory manager 550 and queues the information on a received packet list. Information about received packets is then transferred to receive queue manager 558, which assembles packet information into queues per LUWU and SUWU, and which also keeps a queue of LUWUs and SUWUs about which the EUS has not yet been notified. Receive queue manager 558 is polled for information about LUWUs and SUWUs by the EUS via the EUS/UIM link handler (E/U LH) 540, and responds with notification messages via UIM/EUS link handler (U/E LH) 562. Messages which notify the EUS of the reception of a SUWU also contain the data for the SUWU, thus completing the reception process. In the case of a LUWU, however, the EUS allocates its memory for reception, and issues a receive request via E/U LH 540 to receive request handler 560, which formulates a receive worklist and sends it to resource manager 554, which controls the hardware and effects the data transfer over EUS bus 92 (FIG. 4) via a DMA arrangement. Note that the receive request from the EUS need not be for the entire amount of data in the LUWU; indeed, all of the data may not have even arrived at the UIM when the EUS makes its first receive request. When subsequent data for this LUWU arrives, the EUS will again be notified and will have an opportunity to make additional receive requests. In this fashion, the reception of the data is pipelined as much as possible in order to reduce latency. Following data transfer, receive request handler 560 informs the EUS via U/E LH 562, and directs memory manager 550 to de-allocate the memory for that portion of the LUWU that was delivered, thus making that memory available for new incoming data.
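The pipelined reception described above, in which the EUS is notified, requests part of a LUWU, and is notified again as more data arrives, can be sketched with a small model. The class and method names are hypothetical, a single byte buffer per LUWU stands in for the chained packet buffers, and trimming that buffer stands in for de-allocation by the memory manager.

```python
from typing import Dict, List, Tuple

class ReceiveQueue:
    """Toy model of per-LUWU receive queues with partial delivery."""

    def __init__(self) -> None:
        self._arrived: Dict[int, bytearray] = {}  # undelivered data per LUWU
        self._pending: List[int] = []             # LUWUs not yet notified

    def packet_arrived(self, luwu_id: int, data: bytes) -> None:
        """Queue newly received data and mark the LUWU for notification."""
        self._arrived.setdefault(luwu_id, bytearray()).extend(data)
        if luwu_id not in self._pending:
            self._pending.append(luwu_id)

    def poll(self) -> List[Tuple[int, int]]:
        """EUS poll: report (luwu_id, bytes available) once per new arrival."""
        notes = [(i, len(self._arrived[i])) for i in self._pending]
        self._pending.clear()
        return notes

    def receive(self, luwu_id: int, max_bytes: int) -> bytes:
        """Deliver up to max_bytes and free the delivered portion."""
        buf = self._arrived[luwu_id]
        out = bytes(buf[:max_bytes])
        del buf[:max_bytes]
        return out
```

The EUS may thus begin moving the first part of a LUWU into its own memory before the last packet has reached the UIM, which is the latency benefit the text attributes to pipelining.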
In the reverse direction, i.e., from EUS 26 to MINT 11, the operation is controlled as follows: driver 570 of EUS 26 sends a transmit request to transmit request handler 542 via U/E LH 562. In the case of a SUWU, the transmit request itself contains the data to be transmitted, and transmit request handler 542 sends this data in a transmit worklist to resource manager 554, which computes the packet header and writes both header and data into buffer 15 (FIG. 4), from which it is transmitted to NIM 2 by UIM/NIM link handler 546 when authorized to do so via the flow control protocol in force on link 14. The packet is received at NIM 2 by UIM/NIM link handler 530 and stored in buffer 94. Arbiter 532 then selects among a plurality of buffers 94 in NIM 2 to select the next packet or SUWU to be transmitted under the control of NIM/MINT link handler 534 on MINT link 3 to MINT 11. In the case of a LUWU, transmit request handler 542 decomposes the request into packets and sends a transmit worklist to resource manager 554, which, for each packet, formulates the header, writes the header into buffer 15, controls the hardware to effect the transfer of the packet data over EUS bus 92 via DMA, and directs U/N LH 546 to transmit the packet when authorized to do so. The transmission process is then as described for the SUWU case. In either case, transmit request handler 542 is notified by resource manager 554 when transmission of the SUWU or LUWU is complete, whereupon driver 570 is notified via U/E LH 562 and may release its transmit buffers if desired.
FIG. 19 also shows details of the internal software structure of EUS 26. Two types of arrangements are shown, in one of which (blocks 572, 574, 576, 578, 580) the user system performs level 3 and higher functions. Shown in FIG. 19 is an implementation based on Network of the Advanced Research Projects Administration of the U.S. Department of Defense (ARPAnet) protocols including an internet protocol 580 (level 3), transmission control protocol (TCP) and user datagram protocol (UDP) block 578 (TCP being used for connection-oriented service and UDP being arranged for connectionless service). At higher levels are the remote procedure call (block 576), the network file server (block 574) and the user programs 572. Alternatively, the services of the MAN network can be directly invoked by user programs (block 582) which directly interface with driver 570 as indicated by the null block 584 between the user and the driver.
8.3.3.3 EUS Interface Functions
The main functional parts of the transmit EUS interface are a control interface with the EUS, and a DMA interface to transfer data between the EUS and the UIM over the system bus. When transmitting into the network, control information is received that describes a LUWU or SUWUs to be transmitted and information about the EUS buffers where the data resides. The control information from the EUS includes destination MAN address, destination group (virtual network), LUWU length, and type fields for type of service and higher level protocol type. The DMA interface moves the user data over from the EUS buffers into the UIM. The network interface portion is responsible for formatting the LUWUs and SUWUs into packets and transmitting the packets on the link to the network. The control interface could have several variations for flow control, multiple outstanding requests, priority, and preemption. The UIM is in control of the amount of data that it takes from the EUS memory and sends into the network.
On the receive side, the EUS polls for information about packets that have been received and the control interface responds with LUWU information from the packet header and current information about how much of the EUS transaction has arrived. Over the control interface, the EUS requests to receive data from these messages, and the DMA interface will send the data from memory on the UIM into the EUS memory buffers. The poll and response mechanism in the interface protocol on the receive side gives the EUS considerable flexibility for receiving data from the network. The EUS can receive either partial or entire transactions that have come from the source EUS. It also provides the flow control mechanism for the EUS on receive. The EUS is in control of what it receives, when it receives it, and in what order.
8.3.3.4 SUN Software
This section describes how a typical end user system, a SUN-3 workstation, is connectable to MAN. Other end user systems would use different software. The interface to MAN is relatively straightforward and efficient for a number of systems which have been studied.
8.3.3.4.1 Existing Network Software
The Sun UNIX® operating system is derived from the 4.2BSD UNIX system from the University of California at Berkeley. Like 4.2BSD it contains, as part of the kernel, an implementation of the ARPAnet protocols: internet protocol (IP), transmission control protocol (TCP) for connection-oriented service on top of IP, and user datagram protocol (UDP) for connectionless service on top of IP. Current Sun systems use IP as an internet sublayer in the top half of the network layer. The bottom half of the network layer is a network specific sublayer. It currently consists of driver level software that interfaces to a specific network hardware connection, namely an ETHERNET controller, where the link layer MAC protocol is implemented. ETHERNET is the network currently used to connect Sun workstations. To connect Sun workstations with a MAN network, it is necessary to fit into the framework of this existing networking software. The software for the MAN network interface in the Sun will be driver level software.
The MAN network is naturally a connectionless or datagram type of network. LUWU data with control information forms the EUS transaction crossing the interface into the network. Existing network services can be provided using the MAN network datagram LUWUs as a basis. Software in the Sun will build up both connectionless and connection-oriented transport and application services on top of a MAN datagram network layer. Since the Sun already has a variety of network application software, the MAN driver will provide a basic service with the flexibility to multiplex multiple upper layers. This multiplexing capability will be necessary not just for existing applications but for additional new applications that will use MAN's power more directly.
There needs to be an address translation service function in the EUS at the driver level in the host software. It would allow for IP addresses to be translated into MAN addresses. The address translation service is similar in function to the current Sun address resolution protocol (ARP), but different in implementation. If a particular EUS needs to update its address translation tables, it sends a network message with an IP address to a well known address translation server. The corresponding MAN address will be returned. With a set of such address translation services, MAN can then act as the underlying network for many different, new and existing, network software services in the Sun environment.
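The translation step described above can be sketched as a small driver-level cache. This is only an illustrative sketch: the class name `AddressTranslator`, the `query_server` callable (standing in for the message exchange with the well known address translation server), and the sample addresses are assumptions, not part of the described system.

```python
from typing import Callable, Dict, Optional


class AddressTranslator:
    """Hypothetical driver-level cache mapping IP addresses to MAN addresses."""

    def __init__(self, query_server: Callable[[str], Optional[int]]) -> None:
        # query_server is an assumed stand-in for sending a network message
        # to the well known address translation server and reading the reply.
        self._query_server = query_server
        self._table: Dict[str, int] = {}
        self.server_queries = 0

    def translate(self, ip_address: str) -> int:
        """Return the MAN address for ip_address, asking the server on a miss."""
        if ip_address not in self._table:
            self.server_queries += 1
            man_address = self._query_server(ip_address)
            if man_address is None:
                raise LookupError(f"no MAN address known for {ip_address}")
            self._table[ip_address] = man_address
        return self._table[ip_address]


# Stand-in server backed by a plain dictionary (sample values are made up).
_server_db = {"10.0.0.7": 0x00001234}
translator = AddressTranslator(_server_db.get)
first = translator.translate("10.0.0.7")
second = translator.translate("10.0.0.7")  # served from the local table
```

Caching the reply locally means the server is consulted only when the table lacks an entry, which matches the "update its address translation tables" wording above.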
8.3.3.4.2 Device Driver
On the top side, the driver multiplexes several different queues of LUWUs from the higher protocols and applications for transmission and queues up received LUWUs in several different queues for the higher layers. On the hardware side, the driver sets up DMA transfers to and from user memory buffers. The driver must communicate with the system to map user buffers into memory that can be accessed by the DMA controller over the main system bus.
On transmit, the driver must do address translation on the outgoing LUWUs for those protocol layers that are not using MAN addresses, i.e., the ARPAnet protocols. The MAN destination address and destination group are included in MAN datagram control information that is sent when a LUWU is to be transmitted. Other transmit control information will be LUWU length, fields indicating type of service and higher level protocol, along with the data location for DMA. The UIM uses this control information to form packet headers and to move the LUWU data out of EUS memory.
On receive, the driver will implement a poll/response protocol with the UIM notifying the EUS of incoming data. The poll response will contain control information that gives source address, total LUWU length, amount of data that has arrived up to this point, the type fields indicating higher protocol layers, and some agreed on amount of the data from the message. (For small messages, the whole user message could arrive in this poll response.) The driver itself has the flexibility, based on the type field, to decide how to receive this message and which higher level entity to pass it on up to. For a certain type field, it may just deliver the announcement and pass the reception decision on up to a higher layer. Whichever approach is used, eventually a control request for the delivery of the data from the UIM to the EUS memory is made, which results in a DMA operation by the UIM. EUS buffers to receive the data may be preallocated for the protocol types where the driver handles the reception in a fixed fashion, or the driver may have to get buffer information from a higher layer in the case where it has just passed the announcement on up. This is the type of flexibility we need in the driver to handle both existing and new applications in the Sun environment.
8.3.3.4.3 Raw MAN Interface Software
Later, as applications are written that wish to directly use the capabilities of the MAN network, the address translation function will not be necessary. The MAN datagram control information will be specified directly by special MAN network layer software.
9 MAN Protocols
9.1 Overview
The MAN protocol provides for the delivery of user data from source UIM across the network to destination UIM. The protocol is connectionless, asymmetric for receive and send, implements error detection without correction, and discards layer purity for high performance.
9.2 Message Scenario
The EUS sends datagram transactions called LUWUs into the network. The data that comes from the EUS resides in EUS memory. A control message from the EUS specifies to the UIM the data length, the destination address for this LUWU, the destination group and a type field which could contain information like the user protocol and the network class of service required. Together, the data and the control information form the LUWU. Depending on the type of EUS interface, this data and control can be passed to the UIM in different ways, but it is likely that the data is passed in a DMA transfer.
The UIM will transmit this LUWU into the network. To reduce potential delay, larger LUWUs are not sent into the network as one contiguous stream. The UIM breaks up the LUWU into fragments called packets that can be up to a certain maximum size. A UWU smaller than the maximum size is called a SUWU and will be contained in a single packet. Several EUSs are concentrated at the NIM and packets are transmitted over the link from the UIM to the NIM (the EUSL). Packets from one UIM can be demand multiplexed on the link from the NIM to the MINT (the XL) with packets from other EUSs. Delays are reduced because no EUS has to wait for the completion of a long LUWU from another EUS sharing the link to the MINT. The UIM generates a header for every packet that contains information from the original LUWU transaction, so that each packet can pass through the network from source UIM to destination UIM and be recombined into the same LUWU that was passed into the network by the source EUS. The packet header contains the information for the network layer protocol in the MAN network.
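The fragmentation step can be illustrated with a short sketch. The maximum size of 1024 data bytes and the function name `fragment_uwu` are assumptions chosen for illustration; numbering fragments from 1 with a negative final number follows the Packet Sequence Number description in Section 9.3.2.2.4, under which a single-packet SUWU carries -1.

```python
from typing import List, Tuple

MAX_PACKET_DATA = 1024  # assumed maximum data bytes per packet (not from the text)


def fragment_uwu(data: bytes, max_data: int = MAX_PACKET_DATA) -> List[Tuple[int, int, bytes]]:
    """Split a UWU into (sequence_number, byte_offset, payload) fragments.

    Sequence numbers count up from 1; the last one is negated to mark the
    final packet. byte_offset is the position of the fragment's first byte
    within the whole UWU.
    """
    if not data:
        raise ValueError("a UWU must contain at least one byte")
    if max_data <= 0:
        raise ValueError("max_data must be positive")
    chunks = [data[i:i + max_data] for i in range(0, len(data), max_data)]
    fragments = []
    for index, chunk in enumerate(chunks):
        sequence = index + 1
        if index == len(chunks) - 1:
            sequence = -sequence  # negative marks the last packet
        fragments.append((sequence, index * max_data, chunk))
    return fragments


luwu_fragments = fragment_uwu(b"x" * 2500)
suwu_fragments = fragment_uwu(b"abc")
```

Because each fragment is an independent packet, packets from several EUSs can be interleaved on the XL without one long LUWU blocking the others.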
Before the NIM sends the packet to the MINT on the XL, it adds a NIM/MINT header to the packet message. The header contains the source port number identifying the physical port on the NIM where a particular EUS/UIM is connected. This header is used by the MINT to verify that the source EUS is located at the port where it is authorized to be. This type of additional check is especially important for a data network that serves one or more virtual networks, to ensure privacy for such virtual networks. The MINT uses the packet header to determine the route for the packet, as well as other potential services. The MINT does not change the contents of the packet header. When the ILH in the MINT passes the packet out through the switch to be sent out on the XL to the destination NIM, it places a different port number in the NIM/MINT header. This port number is the physical port on the NIM where the destination EUS/UIM is connected. The destination NIM uses this port number to route the packet on the fly to the proper EUSL.
The various sections of a packet are identified by delimiters according to the link format. Such delimiters occur between the NIM/MINT header 600 and the MAN header 610, and between the MAN header and the rest of the packet. The delimiter at the MAN header/rest of packet border is required to signal the header check sequence circuit to insert or check the header check. The NIM broadcasts a received packet to all ports in the NIM/MINT header field.
When the packet arrives at the destination UIM, the packet header contains the original information from the source UIM necessary to reassemble the source EUS transaction. There is also enough information to allow a variety of EUS receive interface approaches including pipelining or other variations of EUS transaction size, prioritization, and preemption.
9.3 MAN Protocol Description
9.3.1 Link Layer Functions
The link functions are described in Section 5. The functions of message beginning and end demarcation, data transparency, and message check sequences on the EUSL and XL links are discussed there.
A check sequence for the whole packet message is performed at the link level, but instead of corrective action being taken there, an indication of the error is passed on up to the network layer for handling there. A message check sequence error results only in incrementing an error count for administrative purposes, but the message transmission continues. A separate header check sequence is calculated in hardware in the UIM. A header check sequence error detected by the MINT control results in the message being thrown away and an error count being incremented for administrative purposes. At the destination UIM a header check sequence error also results in the message being thrown away. The data check sequence result can be conveyed to the EUS as part of the LUWU arrival notification, and the EUS can determine whether or not to receive the message. These violations of layer purity have been made to simplify the processing at the link layer to increase speed and overall network performance.
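The error-handling policy in this paragraph can be summarized in a small sketch. The names `ErrorCounters`, `handle_at_mint`, and `notify_at_destination_uim`, and the shape of the notification dictionary, are illustrative assumptions; the text specifies only the outcomes (discard on header error, count and continue on message error, report the data check result to the EUS).

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class Disposition(Enum):
    DISCARD = "discard"
    FORWARD = "forward"


@dataclass
class ErrorCounters:
    """Administrative error counts (field names are hypothetical)."""
    header_errors: int = 0
    message_errors: int = 0


def handle_at_mint(header_ok: bool, message_ok: bool, counters: ErrorCounters) -> Disposition:
    """A header error discards the packet; a message error is only counted."""
    if not header_ok:
        counters.header_errors += 1
        return Disposition.DISCARD
    if not message_ok:
        counters.message_errors += 1
    return Disposition.FORWARD


def notify_at_destination_uim(header_ok: bool, message_ok: bool) -> Optional[Dict[str, bool]]:
    """Return an arrival notification, or None if the packet is thrown away.

    The data check result is passed along so the EUS can decide whether
    or not to receive the message.
    """
    if not header_ok:
        return None
    return {"data_check_ok": message_ok}


counters = ErrorCounters()
bad_header = handle_at_mint(False, True, counters)
bad_message = handle_at_mint(True, False, counters)
```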
Other "standard" link layer functions like error correction and flow control are not performed in the conventional manner. There are no acknowledgement messages returned at the link level for error correction (retransmission requests) or for flow control. Flow control is signaled using special bits in the framing pattern. The complexity of X.25-like protocols at the link level can be tolerated for low speed links where the processing overhead will not reduce performance and does increase the reliability of links that have higher error rates. However, it is felt that an acceptable level of error-free throughput will be achieved by the low bit error rates in the fiber optic links in this network (Bit Error Rate less than 10 errors per trillion bits.) Also, because of the large amounts of buffer memory in the MINT and the UIM necessary to handle data from the high-speed links, it was felt that flow control messages would not be necessary or effective.
9.3.2 Network Layer
9.3.2.1 Functions
The message unit that leaves the source UIM and travels all the way to the destination UIM is the packet. The packet is not altered once it leaves the source UIM.
The information in the UIM to UIM message header will allow the following functions to be performed:
- fragmentation of LUWUs at the source UIM,
- recombination of LUWUs at the destination UIM,
- routing to the proper NIM at the MINT,
- routing to the proper UIM/EUS port at the destination NIM,
- MINT transmission of variable length messages (e.g., SUWU, packet, n packets),
- destination UIM congestion control and arrival announcement,
- detection and handling of message header errors,
- addressing of network entities for internal network messages,
- EUS authentication for delivery of network services only to authorized users.
9.3.2.2 Format
FIG. 20 shows the UIM to MINT Message format. The MAN header 610 consists of the Destination Address 612, the Source Address 614, the group (virtual network) identifier 616, group name 618, the type of service 620, the Packet Length (the header plus data in bytes) 622, a type of service indicator 623, a protocol identifier 624 for use by end user systems for identifying the contents of EUS to EUS header 630, and the Header Check Sequence 626. The header is of fixed length, seven 32-bit words or 224 bits long. The MAN header is followed by an EUS to EUS header 630 to process message fragmentation. This header includes a LUWU identifier 632, a LUWU length indicator 634, the packet sequence number 636, the protocol identifier 638 for identifying the contents of the internal EUS protocol which is the header of user data 640, and the number 639 of the initial byte of data of this packet within the total LUWU of information. Finally, user data 640 may be preceded for appropriate user protocols by the identity of the destination port 642 and source port 644. The fields are 32 bits because that is the most efficient length (integers) for present network control processors. Error checking is performed on the header in control software; this is the Header Check Sequence. At the link level, error checking is done over the whole message; this is the Message Check Sequence 646. The NIM/MINT header 600 (explained below) is also shown in the figure for completeness.
The destination address, group identification, type of service, and the source address are placed as the first five fields in the message for efficiency in MINT processing. The destination and group identification are used for routing, the size for memory management, the type fields for special processing, and the source is used for service authentication.
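To make the fixed 224-bit header concrete, the following sketch packs seven 32-bit words with Python's `struct` module. The field order, the treatment of the group name as a 32-bit integer, the pairing of the 16-bit Packet Length with the 16-bit type of service in one word, and the use of `zlib.crc32` as a stand-in header check are all assumptions made for illustration; the text gives the field list and total length but not the exact layout or check algorithm.

```python
import struct
import zlib

# Assumed layout: dest | group id | group name | length(16) + service(16)
#                 | source | protocol  -> six words, then one check word.
_BODY = struct.Struct(">IIIHHII")
_HCS = struct.Struct(">I")
HEADER_BYTES = _BODY.size + _HCS.size  # 28 bytes = seven 32-bit words


def pack_header(dest: int, group_id: int, group_name: int, packet_length: int,
                service_type: int, source: int, protocol: int) -> bytes:
    """Pack the assumed header layout and append a stand-in check word."""
    body = _BODY.pack(dest, group_id, group_name, packet_length,
                      service_type, source, protocol)
    # zlib.crc32 is only a placeholder for the unspecified header check.
    return body + _HCS.pack(zlib.crc32(body))


def header_is_valid(header: bytes) -> bool:
    """Recompute the stand-in check over the first six words and compare."""
    if len(header) != HEADER_BYTES:
        return False
    body = header[:_BODY.size]
    (stored,) = _HCS.unpack(header[_BODY.size:])
    return zlib.crc32(body) == stored


header = pack_header(dest=0x00000042, group_id=7, group_name=1,
                     packet_length=HEADER_BYTES + 100, service_type=0,
                     source=0x00000017, protocol=3)
corrupted = bytes([header[0] ^ 0x01]) + header[1:]
```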
9.3.2.2.1 Destination Address
The Destination Address 612 is a MAN address that specifies to which EUS the packet is being sent. A MAN address is 32 bits long and is a flat address that specifies an EUS connected to the network. (In internal network messages, if the high order bit in the MAN address is set, the address specifies an internal network entity like a MINT or NIM, instead of an EUS.) A MAN address will be permanently assigned to an EUS and will identify an EUS even if it moves to a different physical location on the network. If an EUS moves, it must sign in with a well-known routing authentication server to update the correspondence between its MAN address and the physical port on which it is located. Of course, the port number is supplied by the NIM so the EUS cannot cheat about where it is located.
In the MINT the destination address will be used to determine a destination NIM for routing the message. In the destination NIM the destination address will be used to determine a destination UIM for routing the message.
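A minimal sketch of the high order bit convention follows. The constant and function names are hypothetical, and the sample addresses are made up.

```python
INTERNAL_ENTITY_BIT = 0x80000000  # high order bit of a 32-bit MAN address


def is_internal_entity(man_address: int) -> bool:
    """True if the address names an internal network entity such as a MINT or NIM."""
    if not 0 <= man_address <= 0xFFFFFFFF:
        raise ValueError("a MAN address is 32 bits long")
    return bool(man_address & INTERNAL_ENTITY_BIT)


eus_address = 0x00001234       # made-up EUS address
internal_address = 0x80000005  # made-up internal entity address
```

Because the address is flat, it carries no location information; the mapping from address to NIM and port lives in the routing tables rather than in the address itself.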
9.3.2.2.2 Packet Length
The Packet Length 622 is 16 bits long and represents the length in bytes of this message fragment including the fixed length header and the data. This length is used by the MINT for transmitting the message. It is also used by the destination UIM to determine the amount of data available for delivery to the EUS.
9.3.2.2.3 Type Fields
The type of service field 623 is 16 bits long and contains the type of service specified in the original EUS request. The MINT may look at the type of service and handle the message differently. The destination UIM may also look at the type of service to determine how to deliver the message to the destination EUS, i.e., deliver even if in error. The user protocol 624 assists the EUS driver in multiplexing various streams of data from the network.

9.3.2.2.4 Packet Sequence Number
This is a Packet Sequence Number 636 for this particular LUWU transmission. It helps the receiving UIM recombine the incoming LUWU, so that it can determine if any fragments of the transmission have been lost because of error. The sequence number is incremented for each fragment of the LUWU. The last sequence number is negative to indicate the last packet of a LUWU. (An SUWU would have -1 as the sequence number.) If an infinite length LUWU is being sent, the Packet Sequence Number should wrap around. (See UWU Length, Section 9.3.2.2.7, for an explanation of an infinite length LUWU.)
9.3.2.2.5 Source Address
The Source Address 614 is 32 bits long and is a MAN address that specifies the EUS that sent the message. (See Destination Address for an explanation of MAN address.) The Source Address will be needed in the MINT for network accounting. Coupled with the Port Number 600 from the NIM/MINT header, it is used by the MINT to authenticate the source EUS for network services. The Source Address will be delivered to the destination EUS so that it knows the network address of the EUS that sent the message.
9.3.2.2.6 UWU ID
The UWU ID 632 is a 32 bit number that is used by the destination UIM to recombine a UWU. Note that the recombination job is made easier because fragments cannot get out of order in the network. The UWU ID, along with the Source and Destination Addresses, identifies packets of the same LUWU, or in other words, fragments of the original datagram transaction. The ID must be unique for the source and destination pair for the time that any fragment is in the network.
9.3.2.2.7 UWU Length
The UWU Length 634 is 32 bits long and represents the total length of UWU data in bytes. In the first packet of a LUWU this will allow the destination UIM to do congestion control, and if the LUWU is pipelined into the EUS, it will allow the UIM to begin a LUWU announcement and delivery before the complete LUWU arrives at the UIM.
A Length that is negative indicates an infinite length LUWU, which is like an open channel between two EUSs. Closing down an infinite length LUWU is done by sending a negative Packet Sequence Number. An infinite length LUWU only makes sense where the UIM controls the DMA into EUS memory.
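The recombination rules in Sections 9.3.2.2.4 through 9.3.2.2.7 can be sketched as follows. The class name `Reassembler` and its method are hypothetical, and the sketch assumes finite LUWUs whose sequence numbers start at 1; the wrap-around behavior of infinite length LUWUs is deliberately not modeled.

```python
from typing import Dict, List, Optional, Tuple

# Fragments of one UWU share (source address, destination address, UWU ID).
Key = Tuple[int, int, int]


class Reassembler:
    """Hypothetical destination-UIM recombination of in-order fragments."""

    def __init__(self) -> None:
        self._partial: Dict[Key, List[bytes]] = {}

    def accept(self, source: int, dest: int, uwu_id: int,
               sequence: int, payload: bytes) -> Optional[bytes]:
        """Add one fragment; return the whole UWU when the last one arrives.

        Fragments cannot be reordered in the network, so a sequence number
        other than the next expected one means a fragment was lost.
        """
        key = (source, dest, uwu_id)
        parts = self._partial.setdefault(key, [])
        expected = len(parts) + 1
        if abs(sequence) != expected:
            del self._partial[key]  # abandon the incomplete UWU
            raise ValueError(f"expected fragment {expected}, got {sequence}")
        parts.append(payload)
        if sequence < 0:  # a negative number marks the last packet
            del self._partial[key]
            return b"".join(parts)
        return None


reassembler = Reassembler()
r1 = reassembler.accept(0x17, 0x42, 9, 1, b"abc")
r2 = reassembler.accept(0x17, 0x42, 9, 2, b"def")
r3 = reassembler.accept(0x17, 0x42, 9, -3, b"gh")
suwu = reassembler.accept(0x17, 0x42, 10, -1, b"tiny")
```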


9.3.2.2.8 Header Check Sequence
There is a header check sequence 626, calculated by the transmitting UIM for header information so that the MINT and the destination UIM can determine if the header information was received correctly. The MINT or the destination UIM will not attempt delivery of a packet with a header check sequence error.
9.3.2.2.9 User Data
The user data 640 is the portion of the user UWU data that is transmitted in this fragment of the transmission. Following the data is the overall message check sequence 646 calculated at the link level.
9.3.3 NIM/MINT Layer
9.3.3.1 Functions
This protocol layer consists of a header containing a NIM port number 600. The port number has a one to one correspondence to an EUS connection on the NIM and is prepended by the NIM in block 403 (FIG. 16) so that the user cannot enter false data therein. This header is positioned at the front of a packet message and is not covered by the overall packet message check sequence. It is checked by a group of parity bits in the same word to enhance its error reliability. The incoming message to the MINT contains the source NIM port number to assist in user authentication for network services that might be requested in the type fields. The outgoing message from the MINT contains the destination NIM port number in place of the source port 600 in order to speed the demultiplexing/routing by the NIM to the proper destination EUS. If the packet has a plurality of destination ports in one NIM, a list of these ports is placed at the beginning of the packet so that section 600 of the header becomes several words long.

10.1 General
A system such as MAN is naturally most cost effective when it can serve a large number of customers. Such a large number of customers is likely to include a number of sets of users who require protection from outsiders. Such users can conveniently be grouped into virtual networks. In order to provide still further flexibility and protection, individual users may be given access to a number of virtual networks. For example, all the users of one company may be on one virtual network and the payroll department of that company may be on a separate virtual network. The payroll department users should belong to both of these virtual networks since they may need access to general data about the corporation, but the users outside the payroll department should not be members of the payroll department virtual network since they should not have access to payroll records.
The login procedure method of source checking and the method of routing are the arrangements which permit the MAN system to support a large number of virtual networks while providing an optimum level of protection against unauthorized data access. Further, the arrangement whereby the NIM prepends the user port to every packet gives additional protection against access of a virtual network by an unauthorized user by preventing aliasing.
10.2 Building Up the Authorization Data Base
FIG. 15 illustrates the administrative control of the MAN network. A data base is stored in disk 351 accessed via operation, administration, and maintenance (OA&M) system 350 for authorizing users in response to a login request. For a large MAN network, OA&M system 350 may be a distributed multiprocessor arrangement for handling a large volume of login requests. This data base is arranged so that users cannot access restricted virtual networks of which they are not members. The data base is under the control of three types of super users. The first type would in general be an employee of the common carrier that is supplying MAN service. This super user, referred to for convenience herein as a level 1 super user, assigns a block of MAN names, which would in general consist of a block of numbers, to each user group and assigns type 2 and type 3 super users to particular ones of these names. The level 1 super user also assigns virtual networks to particular MAN groups. Finally, a level 1 super user has the authority to create or destroy a MAN supplied service such as electronic "yellow page" service. A type 2 super user assigns valid MAN names from the block assigned to the particular user community, and assigns physical port access restrictions where appropriate. In addition, a type 2 super user has the authority to restrict access to certain virtual networks by sets of members of his customer community.
Type 3 super users, who are broadly equal in authority to type 2 super users, have the authority to grant MAN names access to their virtual networks. Note that such access can only be granted by a type 3 super user if the MAN name's type 2 super user has allowed this MAN name user the capability of joining this group by an appropriate entry in table 370.

The data base includes table 360 which provides for each user identification 362, the password 361, the group 363 accessible using that password, a list of ports and, for special cases, directory numbers 364 from which that user may transmit and/or receive, and the type of service 365, i.e., receive only, transmit only, or receive and transmit.
The data base also includes user-capability tables 370, 375 for relating users (table 370) to groups (table 375) potentially authorizable for each user. When a user is to be authorized by a super user to access a group, this table is checked to see if that group is in the list of table 370; if not, the request to authorize that user for that group will be rejected. Super users have authority to enter data for their group and their groups in tables 370, 375. Super users also have the authority for their user to move a group from table 375 into the list of groups 363 of the user/group authorization table 360. Thus, for a user to access an outside group, super users from both groups would have to authorize this access.
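The two-step authorization described above can be sketched with two dictionaries. The class and method names are hypothetical, and the tables are reduced to the minimum needed to show that both the user's own super user and the group's super user must act before access is granted.

```python
from typing import Dict, Set


class AuthorizationDataBase:
    """Hypothetical reduction of tables 360 and 370/375 to two mappings."""

    def __init__(self) -> None:
        # Groups each user may potentially join (user-capability tables 370, 375).
        self.potential_groups: Dict[str, Set[str]] = {}
        # Groups each user is actually authorized for (list 363 in table 360).
        self.authorized_groups: Dict[str, Set[str]] = {}

    def allow_potential(self, user: str, group: str) -> None:
        """Step taken by the user's own super user."""
        self.potential_groups.setdefault(user, set()).add(group)

    def authorize(self, user: str, group: str) -> bool:
        """Step taken by the group's super user; rejected without a capability entry."""
        if group not in self.potential_groups.get(user, set()):
            return False
        self.authorized_groups.setdefault(user, set()).add(group)
        return True


db = AuthorizationDataBase()
rejected = db.authorize("alice", "payroll")   # no capability entry yet
db.allow_potential("alice", "payroll")
accepted = db.authorize("alice", "payroll")
```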
10.3 Login Procedure
At login time, a user who has previously been appropriately authorized according to the arrangements described above sends an initial login request message to the MAN network. This message is destined not for any other user, but for the MAN network itself. Effectively, this message is a header only message which is analyzed by the MINT central control. The password, type of login service being requested, MAN group, MAN name and port number are all in the MAN header of a login request, replacing other fields. This is done because only the header is passed by the XLH to the MINT central control, for further processing by the OA&M central control. The login data, which includes the MAN name, the requested MAN group name (virtual network name), and the password, are compared against the login authorization data base 351 to check whether the particular user is authorized to access that virtual network from the physical port to which that user is connected (the physical port was prepended by the NIM prior to reception of the login packet by the MINT). If the user is in fact properly authorized, then the tables in source checker 307 and in router 309 (FIG. 14) are updated. Only the source checker table of the checker that processes the login user's port is updated from a login for terminal operations. If a login request is for receive functions, then the routing tables of all MINTs must be updated to allow that source to receive data from any authorized connectable user of the same group who may be connected to other MINTs to respond to requests. The source checker table 308 includes a list of authorized name/group pairs for each port connected to the NIM that sends the data stream to the XLH for that source checker. The router tables 310 all include entries for all users authorized to receive UWUs. Each entry includes a name/group pair, and the corresponding NIM and port number. The entries in the source checker list are grouped by group identification numbers. The group identification number 616 is part of the header of subsequent packets from the logged in user, and is derived by the OA&M system 350 at login time and sent back by the OA&M system via the MAN switch 10 to the login user. The OA&M system 350 uses the MINT central control's 20 access 19 to the MINT memory 18 to enter the login acknowledge to the login user. On subsequent packets, as they are received in the MINT, the source checker checks the port number, MAN name and MAN group against the authorization table in the source checker with the result that the packet is allowed to proceed or not. The router then checks to see if the destination is an allowable destination for that input by checking the virtual network group name and the destination name. As a result, once a user is logged in, the user can reach any destination that is in the routing tables, i.e., that has previously logged in for access in the read only mode or the read/write mode, and that has the same virtual network group name as requested in the login; in contrast, unauthorized users are blocked in every packet.
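The per-packet source check and routing lookup can be sketched as follows. The class `MintTables`, its methods, and the integer names and group identifiers are illustrative assumptions; the sketch covers a single MINT and leaves out the password check performed at login.

```python
from typing import Dict, Optional, Set, Tuple

NameGroup = Tuple[int, int]  # (MAN name, group identification number)


class MintTables:
    """Hypothetical source checker table and router table for one MINT."""

    def __init__(self) -> None:
        # Source checker: authorized name/group pairs for each NIM port.
        self.source_table: Dict[int, Set[NameGroup]] = {}
        # Router: name/group pair -> (NIM, port) for users allowed to receive.
        self.routing_table: Dict[NameGroup, Tuple[int, int]] = {}

    def login(self, port: int, name: int, group: int, nim: int,
              can_transmit: bool = True, can_receive: bool = True) -> None:
        """Update the tables after a login has been accepted."""
        if can_transmit:
            self.source_table.setdefault(port, set()).add((name, group))
        if can_receive:
            self.routing_table[(name, group)] = (nim, port)

    def route_packet(self, port: int, source: int, group: int,
                     dest: int) -> Optional[Tuple[int, int]]:
        """Return (NIM, port) for the destination, or None if the packet is blocked."""
        if (source, group) not in self.source_table.get(port, set()):
            return None  # source not authorized on this port for this group
        return self.routing_table.get((dest, group))


tables = MintTables()
tables.login(port=3, name=100, group=7, nim=1)
tables.login(port=5, name=200, group=7, nim=2, can_transmit=False)
allowed = tables.route_packet(port=3, source=100, group=7, dest=200)
aliased = tables.route_packet(port=4, source=100, group=7, dest=200)
wrong_group = tables.route_packet(port=3, source=100, group=8, dest=200)
```

Because the port number is supplied by the NIM rather than by the user, a packet claiming another user's name from a different port fails the source check, which is the anti-aliasing protection described in Section 10.1.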
While in the present embodiment, the checking is done for each packet, it could also be done for each user work unit (LUWU or SUWU), with a recorded indication that all subsequent packets of a LUWU whose original packet was rejected are also to be rejected, or by rejecting all LUWUs whose initial packet is missing at the user system.
Those super user logins which are associated with making changes in the login data base are checked in the same way as conventional logins, except that each is recognized in OA&M system 350 as a login request for a user who has authority for changing the data base stored on disk 351.
Super users types 2 and 3 get access to the OA&M system 350 from a computer connected to a user port of MAN. OA&M system 350 derives statistics on billing, usage, authorizations and performance which the super users can access from their computers.
The MAN network can also serve special types of users such as transmit only users and receive only users. An example of a transmit only user is a broadcast stock quotation system or a video transmitter. Outputs of transmit only users are only checked in source checker tables. Receive only units such as printers or monitoring devices are authorized by entries in the routing tables.
11 APPLICATION OF MAN TO VOICE SWITCHING
FIG. 22 shows an arrangement for using the MAN architecture to switch voice as well as data. In order to simplify the application of this architecture to such services, an existing switch, in this case the 5ESS® switch manufactured by AT&T Network Systems, is used. The advantage of using an existing switch is that it avoids the necessity for developing a program to control a local switch, a very large development effort. By using an existing switch as the interface between the MAN and voice users, this effort can be almost completely eliminated. Shown on FIG. 22 is a conventional customer telephone connected to a switching module 1207 of 5ESS switch 1200. This customer telephone could also be a combined integrated services digital network (ISDN) voice and data customer station which can also be connected to a 5ESS switch.
Other customer stations 1202 are connected through a subscriber loop carrier system 1203 which is connected to a switching module 1207. The switching modules 1207 are connected to a time multiplex switch 1209 which sets up connections between switching modules. Two of these switching modules are shown connected to an interface 1210 comprising Common Channel Signaling 7 (CCS 7) signaling channels 1211, pulse code modulation (PCM) channels 1213, and special signaling channels 1215. These are connected to a packet assembler and disassembler 1217 for interfacing with a MAN NIM 2. The function of the PAD is to interface between the PCM signals which are generated in the switch and the packet signals which are switched in the MAN network. The function of the special signaling channel 1215 is to inform PAD 1217 of the source and destination associated with each PCM channel. The CCS 7 channels transmit packets which require further processing by PAD 1217 to get them into the form necessary for switching by the MAN network. To make the system less vulnerable against the failure of equipment or transmission facilities, the switch is shown as being connected to two different NIMs of the MAN network. A digital PBX 1219 also interfaces with packet assembler disassembler 1217 directly. In a subsequent upgrade of the PAD, it would be possible to interface directly with SLC 1203 or with telephones such as integrated services digital network (ISDN) telephones that generate a digital voice bit stream directly.

The NIMs are connected to a MAN hub 1230. The NIMs are connected to MINTs 11 of that hub. The MINTs 11 are interconnected by MAN
switch 22.
For this type of configuration, it is desirable to switch substantial quantities of data as well as voice in order to utilize the capabilities of the MAN hub most effectively. Voice packets, in particular, have very short delay requirements in order to minimize the total delay encountered in transmitting speech from a source to a destination and in order to ensure that there is no substantial interpacket gap which would result in the loss of a portion of the speech signal.
The basic design parameters for MAN have been selected to optimize data switching, and have been adapted in a most straightforward manner as shown in FIG. 22. If a large amount of voice packet switching is required, one or more of the following additional steps can be taken:
1. A form of coding such as adaptive differential PCM (ADPCM) which offers excellent performance at 32 Kbit/second could be used instead of 64 Kbit PCM. Excellent coding schemes are also available which require fewer than 32 Kbit/sec. for good performance.
2. Packets need only be sent when a customer is actually speaking. This reduces the number of packets that must be sent by at least 2:1.
3. The size of the buffer for buffering voice samples could be increased above the storage for 256 voice samples (a two packet buffer) per channel. However, longer voice packets introduce more delay, which may or may not be tolerable depending on the characteristics of the rest of the voice network.
4. Voice traffic might be concentrated in specialist MINTs to reduce the number of switch setup operations for voice packets. Such an arrangement may enlarge the number of customers affected by a failure of a NIM or MINT and might require arrangements for providing alternate paths to another NIM and/or MINT.
5. Alternate hub configurations can be used.
The alternate hub configuration of FIG. 24 is an example of a step 5 solution. A basic problem of switching voice packets is that in order to minimize delay in transmitting voice, the voice packets must represent only a short segment of speech, as low as 20 milliseconds according to some estimates. This corresponds to as many as 50 packets per second for each direction of speech. If a substantial fraction of the input to a MINT represented such voice packets, the circuit switch setup time might be too great to handle such traffic. If only voice traffic were being switched, a packet switch which would not require circuit setup operations might be needed for high traffic situations.
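The packet rate quoted above, and the effect of the coding rates mentioned in step 1, can be checked with a short calculation. The sketch below is illustrative only; the function names are hypothetical, and the 8000 samples per second rate is an assumption based on standard 64 Kbit/sec PCM telephony (8 bits per sample), not a figure stated in this description.

```python
SAMPLE_RATE_HZ = 8000  # assumption: standard telephony PCM sampling rate

def packets_per_second(segment_ms: float) -> float:
    """Packets per second, per direction, for a given speech segment length."""
    return 1000.0 / segment_ms

def payload_bytes(segment_ms: float, bit_rate: int) -> float:
    """Speech payload carried by one packet at the given coding bit rate."""
    return bit_rate * segment_ms / 1000 / 8

def segment_ms_for_samples(samples: int) -> float:
    """Speech duration represented by a packet holding this many samples."""
    return samples * 1000.0 / SAMPLE_RATE_HZ

rate = packets_per_second(20)            # 20 ms segments
pcm_payload = payload_bytes(20, 64000)   # 64 Kbit/sec PCM
adpcm_payload = payload_bytes(20, 32000) # 32 Kbit/sec ADPCM
half_buffer_ms = segment_ms_for_samples(128)  # one packet of a 256 sample, two packet buffer
```

Under these assumptions a 20 millisecond segment gives 50 packets per second, 160 bytes of speech per packet for PCM and 80 bytes for ADPCM. Halving the bit rate halves the payload but leaves the number of packets, and hence the number of switch setup operations, unchanged.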
One embodiment of such a packet switch 1300 comprises a group of MINTs 1313 interconnected like a conventional array of space division switches wherein each MINT 1313 is connected to four others, and enough stages are added to reach all output MINTs 1312 that carry heavy voice traffic. For added protection against equipment failure, the MINTs 1313 of the packet switch 1300 could be interconnected through MANS 10 in order to route traffic around a defective MINT 1313 and to use a spare MINT 1313 instead.
The output bit stream of NIM 2 is connected to one of the inputs (XL) of an input MINT 1311. The packet data traffic leaving input MINT 1311 can continue to be switched through MANS 10. In this embodiment, the data packet output of MANS 10 is merged with the voice packet output of data switch 1300 in an output MINT 1312 which receives the outputs of MANS 10 and data switch 1300 on the XL 16 (input) side and whose IL 17 output is the input bit stream of NIM 2, produced by a PASC circuit 290 (FIG. 13). Input MINT 1311 does not contain the PASC circuit 290 (FIG. 13) for generating the output bit stream to NIM 2. For output MINT 1312 the inputs to the XLs from MANS 10 pass through a phase alignment circuit 292 (FIG. 13) such as that shown in FIG. 23, since such inputs come from many different sources through circuit paths that insert different delays.
This arrangement can also be used for switching high priority data packets through the packet switch 1300 while retaining the circuit switch 10 for switching low priority data packets. With this arrangement, it is not necessary to connect the packet switch 1300 to output MINTs 1312 carrying no voice traffic; in that case, high priority packets to MINTs carrying no voice traffic would have to be routed through circuit switch MANS 10.
FIG. 26 shows another alternate configuration; in this configuration, while data packets are switched once through the circuit switch as previously described, voice packets are switched twice through the space division switch. In FIG. 26, the MINTs 11 are broken down into two groups. The MINTs of the first group, MINT 11-0 through MINT 11-239, are used in the conventional way and have both voice and data packet inputs from the NIMs to which they are connected by a link 3. When one of the MINTs 11-0,...,11-239 recognizes a voice packet, it prepares to send that voice packet through the circuit switch MANS 10 to one of 16 specialist voice packet switch modules, MINTs 11-240,...,11-255.
Each of the MINTs 11-0,...,11-239 can then assemble voice packets in only 16 different groups, one group for each of the voice packet switching modules, MINTs 11-240,...,11-255, so that any circuit connection from one of the MINTs 11-0,...,11-239 can carry voice packets destined for 1/16th of the 960 NIMs connected to the 240 voice and data packet switch modules.
A voice packet or a chained series of voice packets destined for one of the voice packet switch modules, MINTs 11-240,...,11-255, is connected from the output of MANS 10 to an input of such a MINT. The voice packet switch MINT then separates each incoming packet stream into 15 possible destinations and assembles voice packets received from any of the voice and data packet switch modules, MINTs 11-0,...,11-239, for each of the 15 destinations (NIMs) served by each of the voice packet switch modules, MINTs 11-240,...,11-255.
Each of the latter MINTs then transmits a chain of packets for each of the 15 NIMs served by that MINT through MANS 10 to the one of the outlets of MANS 10 that is connected to the correct destination NIM.
This arrangement sharply reduces the number of connections that must be set up through MANS 10 for transmitting voice packets since each voice and data packet MINT has only 16 voice packet destinations (MINTs 11-240,...,11-255) and each voice packet switch MINT, 11-240,...,11-255, has only 15 destinations, i.e., the 15 NIMs that it serves. This is in contrast to a comparable single stage arrangement whereby each voice and data packet switch module must set up connections to up to 960 different NIMs.
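The two-pass chaining of voice packets can be illustrated with a small sketch. It is not taken from the specification: the function names, the packet representation as `(destination NIM, payload)` pairs, and the `nim_to_voice_mint` mapping are all hypothetical, and the assignment of NIMs to specialist MINTs is simply supplied by the caller.

```python
from collections import defaultdict

def chain_for_voice_mints(packets, nim_to_voice_mint):
    """First pass: group voice packets by the specialist MINT serving each NIM.

    A voice and data packet MINT therefore builds at most one chain per
    specialist MINT, however many NIMs the packets are destined for.
    """
    chains = defaultdict(list)
    for dest_nim, payload in packets:
        chains[nim_to_voice_mint[dest_nim]].append((dest_nim, payload))
    return dict(chains)

def chain_for_nims(packets):
    """Second pass: a specialist MINT regroups its packets by destination NIM."""
    chains = defaultdict(list)
    for dest_nim, payload in packets:
        chains[dest_nim].append(payload)
    return dict(chains)

# Toy example with three packets and two hypothetical specialist MINTs.
mapping = {0: 240, 1: 240, 5: 241}
first_pass = chain_for_voice_mints([(0, "a"), (1, "b"), (5, "c")], mapping)
second_pass = chain_for_nims(first_pass[240])
```

Each chain corresponds to one connection set up through MANS 10, so the number of distinct keys in each dictionary is the number of connections needed in that pass.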
12 MINT ACCESS CONTROL TO MAN SWITCH CONTROL
FIG. 21 illustrates one arrangement for controlling access by MINTs 11 to the MAN switch control 22. Each MINT has an associated access controller 1120. A data ring 1102, 1104, 1106 distributes data indicating the availability of output links to each logic and count circuit 1110 of each access controller. Each access controller 1120 maintains a list 1110 of output links such as 1112 to which it wants to send data, each link having an associated priority indicator 1114. A MINT can seize an output link of that list by marking the link unavailable in ring 1102 and transmitting an order to the MAN switch control 22 to set up a path from an ILH of that MINT to the requested output link. When the full data block to be transmitted to that output link has been so transmitted, the MINT marks the output link available in the data transmitted by data ring 1102 which thereby makes that output link available for access by other MINTs.
A problem with using only availability data is that during periods of congestion the time before a particular MINT may get access to an output link can be excessive. In order to even the accessibility of any output link to any MINT, the following arrangement is used. Associated with each link availability indication, called a ready bit transmitted in ring 1102, is a window bit transmitted in ring 1104. The ready bit is controlled by any MINT that seizes or releases an output link. The window bit is controlled by the access controller 1120 of only a single MINT called, for the purposes of this description, the controlling MINT. In this particular embodiment, the controlling MINT for a given output link is the MINT to which the corresponding output link is routed.
The effect of an open window (window bit = 1) is to let the first access controller on the ring that wants to seize an output link and recognizes its availability as the ready bit passes the controller, seize such a link, and to let any controller which tries to seize an unavailable link set the priority indicator 1114 for that unavailable link. The effect of a closed window (window bit = 0) is to permit only controllers which have a priority indicator set for a corresponding available link to seize that available link. The window is closed by the access controller 1120 of the controlling MINT whenever the logic and count circuit 1100 of that controller detects that the output link is not available (ready bit = 0) and is opened whenever that controller detects that that output link is available (ready bit = 1).
The operation of an access controller seizing a link is as follows. If the link is unavailable (ready bit = 0) and the window bit is one, the access controller sets the priority indicator 1114 for that output link. If the link is unavailable and the window bit is zero, the controller does nothing. If the link is available and the window bit is one, the controller seizes the link and marks the ready bit zero to ensure that no other controller seizes the same link. If the link is available and the window bit is zero, then only a controller whose priority indicator 1114 is set for that link can seize that link and will do so by marking the ready bit zero. The action of the access controller of the controlling MINT on the window bit is simpler: that controller simply copies the value of the ready bit into the window bit.
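The four cases above can be written as a small decision function. This is a minimal sketch, not the circuit of FIG. 21: the function names and the tuple return value are hypothetical, and clearing the priority indicator once a link is seized is an assumption that the description does not state explicitly.

```python
def controller_step(ready: int, window: int, wants_link: bool, priority: bool):
    """Decide what one access controller does as a link's bits pass by.

    Returns (new_ready, new_priority, seized).
    """
    if not wants_link:
        return ready, priority, False
    if ready == 0:
        # Link unavailable: register interest only while the window is open.
        if window == 1:
            priority = True
        return ready, priority, False
    # Link available: seize if the window is open, or if priority was set earlier.
    if window == 1 or priority:
        # Assumption: the priority indicator is cleared when the link is seized.
        return 0, False, True
    return ready, priority, False

def controlling_mint_window(ready: int) -> int:
    """The controlling MINT copies the ready bit into the window bit."""
    return ready
```

For example, `controller_step(1, 0, True, True)` seizes the link through a closed window, while `controller_step(1, 0, True, False)` leaves it untouched for a controller that registered its request earlier.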

In addition to the ready and window bits, a frame bit is circulated in ring 1106 to define the beginning of a frame of resource availability data, hence, to define the count for identifying the link associated with each clear and window bit. Data on the three rings 1102, 1104 and 1106 circulates serially and in synchronism through the logic and count circuit 1100 of each MINT.
The result of this type of operation is that those access controllers which are trying to seize an output link and which are located between the unit that first successfully seized that output link and the access controller that controls the window bit have priority and will be served in turn before any other controllers that subsequently may make a request to seize the specific output link.
As a result, an approximately fair distribution of access by all MINTs to all output links is achieved.
If this alternative approach to controlling MINT 11 access control to the MANSC 22 is used, priority is controlled from the MINT. Each MINT maintains a priority and a regular queue for queuing requests, and makes requests for MANSC services first from the MINT priority queue.
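The two-queue discipline can be sketched in a few lines. The class and method names below are hypothetical and the requests are plain strings; the sketch only shows that requests are drawn from the priority queue first and from the regular queue otherwise.

```python
from collections import deque

class MintRequestQueues:
    """Hypothetical model of one MINT's priority and regular request queues."""

    def __init__(self):
        self.priority = deque()
        self.regular = deque()

    def enqueue(self, request, high_priority=False):
        """Queue a request for MANSC service in the appropriate queue."""
        (self.priority if high_priority else self.regular).append(request)

    def next_request(self):
        """Return the next request to present to the MANSC, or None if idle."""
        if self.priority:
            return self.priority.popleft()
        if self.regular:
            return self.regular.popleft()
        return None

queues = MintRequestQueues()
queues.enqueue("regular-1")
queues.enqueue("priority-1", high_priority=True)
queues.enqueue("regular-2")
order = [queues.next_request() for _ in range(4)]
```

Within each queue, requests are served in arrival order; that first in, first out behavior is a choice made for this sketch.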

It is to be understood that the above description is only of one preferred embodiment of the invention. Numerous other arrangements may be devised by one skilled in the art without departing from the spirit and scope of the invention. The invention is thus limited only as defined in the accompanying claims.

APPENDIX A
ACRONYMS AND ABBREVIATIONS

1SC      First Stage Controller
2SC      Second Stage Controller
ACK      Acknowledge
ARP      Address Resolution Protocol
ARQ      Automatic Repeat Request
BNAK     Busy Negative Acknowledge
CC       Central Control
CNAK     Control Negative Acknowledge
CNet     Control Network
CRC      Cyclic Redundancy Check or Code
DNet     Data Network
DRAM     Dynamic Random Access Memory
DVMA     Direct Virtual Memory Access
EUS      End User System
EUSL     End User Link (Connects NIM and UIM)
FEP      Front End Processor
FIFO     First In First Out
FNAK     Fabric Blocking Negative Acknowledge
IL       Internal Link (Connects MINT and MANS)
ILH      Internal Link Handler
IP       Internet Protocol
LAN      Local Area Network
LUWU     Long User Work Unit
MAN      Exemplary Metropolitan Area Network
MANS     MAN Switch
MANSC    MAN/Switch Controller
MINT     Memory and Interface Module
MMU      Memory Management Unit
NAK      Negative Acknowledge
NIM      Network Interface Module
OA&M     Operation, Administration and Maintenance
PASC     Phase Alignment and Scramble Circuit
SCC      Switch Control Complex
SUWU     Short User Work Unit
TCP      Transmission Control Protocol
TSA      Time Slot Assigner
UDP      User Datagram Protocol
UIM      User Interface Module
UWU      User Work Unit
VLSI     Very Large Scale Integration
VME bus  An IEEE Standard Bus
WAN      Wide Area Network
XL       External Link (Connects NIM to MINT)
XLH      External Link Handler
XPC      Crosspoint Controller

Claims (16)

Claims
1. A system for switching voice signals comprising:
means for converting said voice signals into voice packets; and means, connected to said means for converting, for packet switching said voice packets, comprising:
a plurality of input packet handlers and a plurality of output packet handlers;
memory access means for controlling storing and reading of said voice packets, comprising a plurality of memory access controllers for storing consecutive words of a voice packet in consecutive members of a plurality of memory modules; and
means for distributing said voice packets from said plurality of input packet handlers to said plurality of memory access controllers, for chaining packets to be transmitted to a common group of destinations, and for distributing said chained voice packets from said plurality of memory access controllers to said plurality of output packet handlers.
2. A system for switching voice signals, comprising:
a plurality of means for converting said voice signals into voice packets;
and a plurality of means, connected to said means for converting, for packet switching said voice packets, comprising:
a plurality of input packet handlers and a plurality of output packet handlers;
memory access means for controlling storing and reading of said voice packets, comprising a plurality of memory access controllers for storing consecutive words of a voice packet in consecutive members of a plurality of memory modules;
means for distributing said voice packets from said plurality of input packet handlers to said plurality of memory access controllers and for distributing said voice packets from said plurality of memory access controllers to said plurality of output packet handlers; and
circuit switch means for switching said voice packets between output packet handlers of said plurality of means for packet switching and ones of a plurality of communication paths;
wherein said means for packet switching said voice packets comprise means for chaining voice packets in groups, each group for connection over one of said communication paths.
3. The system of claim 2 wherein ones of said plurality of communication paths are connectable to a packet to digital voice signal converter.
4. The system of claim 3 wherein said means for converting said voice signals into voice packets is comprised in a digital switching system connectable to customer stations;
said digital switching systems further comprising means for generating signaling information to said means for converting for signaling terminal identification data for switching packets of a voice connection to a customer station, and for generating signaling information to said means for converting for signaling the identity of a requested customer station to a switch serving that requested customer station.
5. A network for switching first packets comprising data and second packets comprising voice signals, comprising:
first data switching means for switching said first and said second packets, received in said first data switching means, to first and second outputs respectively;
circuit switching means connected to said first outputs for further switching said first packets; and second data switching means connected to said second outputs for further switching said second packets.
6. A system for switching data and voice signals comprising:
digital switching means connectable to customer lines for generating digital speech signals;
means for generating speech channel identification information;

means connected to said digital switching means for converting speech signals into voice packets and responsive to said speech channel identification information for generating headers to said voice packets;
means for concentrating data traffic from and distributing traffic to said means for generating voice packets;
means, connected via data links to said means for concentrating, for packet switching said voice packets comprising:
a plurality of input packet handlers and a plurality of output packet handlers;
memory means for storing said voice packets comprising a plurality of memory modules for storing consecutive words of a voice packet;
means for chaining packets into groups destined for a common means for distributing and for communicating said chaining data to said output packet handlers;
means, controlled by said input packet handlers for distributing said voice packets from said plurality of input packet handlers to said plurality of memory modules and, controlled by said output packet handlers, for assembling said chained groups of voice packets from said plurality of memory modules to said plurality of output packet handlers.
7. The system of claim 6 further comprising:
circuit switching means connected to said means for packet switching for groups of packets from said means for packet switching to ones of data links connected to said means for concentrating data.
8. A method of switching voice and data packets comprising the steps of:
packet switching said voice packets received on inputs of a first packet switch means to first outputs of said first packet switch means and said data packets to second outputs of said first packet switch means;
connecting said first outputs to a circuit switch means and said second outputs to a second packet switch means.
9. A method of switching voice signals comprising the steps of:
converting said voice signals to voice packets;

transmitting said voice packets to an input packet handler of a data switching means;
transmitting data from said input packet handler to a plurality of memory access controllers of said data switching means for controlling storage of voice packets in a plurality of memory modules;
chaining packets into groups having a common intermediate destination;
and transmitting each of said groups from said plurality of memory access controllers to an output data handler of said data switching means for further transmission to one of said intermediate destinations.
10. A network for switching first packets, comprising data, and second packets, comprising information representing voice signals, from a plurality of inlets to a plurality of outlets, comprising:
first and second data switching means; and circuit switching means;
said first data switching means for switching said first and said second packets received from said inlets to said circuit switching means;
said circuit switching means responsive to said first and said second packets received from said first data switching means for switching said first and said second packets to said outlets and said second data switching means, respectively;
said second data switching means responsive to said second packets received from said circuit switching means for switching said second packets to said circuit switching means;
said circuit switching means further responsive to said second packets received from said second data switching means for switching said second packets to said outlets.
11. The network of claim 10 wherein each of said first and second data switching means comprise means for generating control signals for selecting outlets and second data switching means and wherein said circuit switching means is responsive to said control signals for switching a packet received from one of said data switching means to an outlet or a second data switching means selected by a control signal from said one of said data switching means.
12. The network of claim 11 wherein each of said data switching means comprise a plurality of data switching modules, and wherein each of said data switching modules of said first data switching means comprises means for chaining received first data packets destined for a common outlet and for chaining received second data packets destined for a common one of said plurality of data switching modules of said second data switching means, and means for generating control signals for controlling the switching by said circuit switching means of said chained received packets to said common outlet or said one of said plurality of switching modules of said second data switching means.
13. The network of claim 12 wherein each of said data switching modules of said second data switching means comprises means for chaining received second data packets destined for another common outlet and means for generating control signals for switching said chained received packets to said other common outlet.
14. In a data switching system comprising circuit switching means and first and second data switching means, a method for switching first packets comprising data and second packets comprising information representing voice signals from a plurality of inlets to said first data switching means to a plurality of outlets comprising the steps of:
in said first data switching means, data switching said first and said second packets, received from said inlets to said first data switching means, to said circuit switching means;
in said circuit switching means, circuit switching said second packets received from said first data switching means to said second data switching means;
in said second data switching means, data switching said second packets received from said circuit switching means to said circuit switching means; and in said circuit switching means, circuit switching said first packets received from said first data switching means and said second packets received from said second data switching means to said outlets.
15. The method of claim 14 further comprising the steps of generating control signals in said first data switching means for causing said circuit switching means to switch said first packets to said outlets, and said second packets, received from said first data switching means, to said second data switching means.
16. The method of claim 15 wherein said second data switching means comprises at least one module, further comprising the steps of chaining first packets destined for a common outlet and chaining second packets destined for a common module of said second data switching means.
CA000595081A 1988-03-31 1989-03-29 Integrated packetized voice and data switching system Expired - Lifetime CA1312133C (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US07/175,547 US4958341A (en) 1988-03-31 1988-03-31 Integrated packetized voice and data switching system
US175,547 1988-03-31
US07/238,309 US4872160A (en) 1988-03-31 1988-08-30 Integrated packetized voice and data switching system
US238,309 1988-08-30

Publications (1)

Publication Number Publication Date
CA1312133C true CA1312133C (en) 1992-12-29

Family

ID=26871311

Family Applications (1)

Application Number Title Priority Date Filing Date
CA000595081A Expired - Lifetime CA1312133C (en) 1988-03-31 1989-03-29 Integrated packetized voice and data switching system

Country Status (2)

Country Link
US (1) US4872160A (en)
CA (1) CA1312133C (en)

Families Citing this family (80)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0813057B2 (en) * 1989-02-03 1996-02-07 日本電気株式会社 Mixed transfer method of HDLC variable length packet and non-HDLC fixed length packet
US6389010B1 (en) 1995-10-05 2002-05-14 Intermec Ip Corp. Hierarchical data collection network supporting packetized voice communications among wireless terminals and telephones
US5463762A (en) * 1993-12-30 1995-10-31 Unisys Corporation I/O subsystem with header and error detection code generation and checking
US5748633A (en) * 1995-07-12 1998-05-05 3Com Corporation Method and apparatus for the concurrent reception and transmission of packets in a communications internetworking device
US5825774A (en) * 1995-07-12 1998-10-20 3Com Corporation Packet characterization using code vectors
US5651002A (en) * 1995-07-12 1997-07-22 3Com Corporation Internetworking device with enhanced packet header translation and memory
US5796944A (en) * 1995-07-12 1998-08-18 3Com Corporation Apparatus and method for processing data frames in an internetworking device
US5812775A (en) * 1995-07-12 1998-09-22 3Com Corporation Method and apparatus for internetworking buffer management
AU6501496A (en) * 1995-07-19 1997-02-18 Ascom Nexion Inc. Point-to-multipoint transmission using subqueues
JPH11512583A (en) * 1995-09-14 1999-10-26 フジツウ ネットワーク コミュニケーションズ,インコーポレイテッド Transmitter-controlled flow control for buffer allocation in a wide area ATM network
GB9603582D0 (en) 1996-02-20 1996-04-17 Hewlett Packard Co Method of accessing service resource items that are for use in a telecommunications system
US7336649B1 (en) * 1995-12-20 2008-02-26 Verizon Business Global Llc Hybrid packet-switched and circuit-switched telephony system
US6041109A (en) * 1995-12-29 2000-03-21 Mci Communications Corporation Telecommunications system having separate switch intelligence and switch fabric
JP2000517488A (en) * 1996-01-16 2000-12-26 フジツウ ネットワーク コミュニケーションズ,インコーポレイテッド Reliable and flexible multicast mechanism for ATM networks
US6125113A (en) * 1996-04-18 2000-09-26 Bell Atlantic Network Services, Inc. Internet telephone service
US6154445A (en) 1996-04-18 2000-11-28 Bell Atlantic Network Services, Inc. Telephony communication via varied redundant networks
US6069890A (en) * 1996-06-26 2000-05-30 Bell Atlantic Network Services, Inc. Internet telephone service
US6122255A (en) * 1996-04-18 2000-09-19 Bell Atlantic Network Services, Inc. Internet telephone service with mediation
US6438218B1 (en) 1996-04-18 2002-08-20 Robert D. Farris Internet telephone service
US6021126A (en) * 1996-06-26 2000-02-01 Bell Atlantic Network Services, Inc. Telecommunication number portability
US6014379A (en) * 1996-06-26 2000-01-11 Bell Atlantic Network Services, Inc. Telecommunications custom calling services
US5839088A (en) 1996-08-22 1998-11-17 Go2 Software, Inc. Geographic location referencing system and method
US6266328B1 (en) 1996-08-26 2001-07-24 Caritas Technologies, Inc. Dial up telephone conferencing system controlled by an online computer network
US5748905A (en) * 1996-08-30 1998-05-05 Fujitsu Network Communications, Inc. Frame classification using classification keys
GB9620082D0 (en) * 1996-09-26 1996-11-13 Eyretel Ltd Signal monitoring apparatus
US6570871B1 (en) 1996-10-08 2003-05-27 Verizon Services Corp. Internet telephone service using cellular digital vocoder
US6546003B1 (en) 1996-11-21 2003-04-08 Verizon Services Corp. Telecommunications system
US6078582A (en) * 1996-12-18 2000-06-20 Bell Atlantic Network Services, Inc. Internet long distance telephone service
US6064653A (en) * 1997-01-07 2000-05-16 Bell Atlantic Network Services, Inc. Internetwork gateway to gateway alternative communication
US6137869A (en) 1997-09-16 2000-10-24 Bell Atlantic Network Services, Inc. Network session management
US6075783A (en) 1997-03-06 2000-06-13 Bell Atlantic Network Services, Inc. Internet phone to PSTN cellular/PCS system
US6215790B1 (en) 1997-03-06 2001-04-10 Bell Atlantic Network Services, Inc. Automatic called party locator over internet with provisioning
US6574216B1 (en) 1997-03-11 2003-06-03 Verizon Services Corp. Packet data network voice call quality monitoring
US5933490A (en) * 1997-03-12 1999-08-03 Bell Atlantic Network Services, Inc. Overload protection for on-demand access to the internet that redirects calls from overloaded internet service provider (ISP) to alternate internet access provider
US6870827B1 (en) 1997-03-19 2005-03-22 Verizon Services Corp. Voice call alternative routing through PSTN and internet networks
US6292479B1 (en) 1997-03-19 2001-09-18 Bell Atlantic Network Services, Inc. Transport of caller identification information through diverse communication networks
US6272126B1 (en) 1997-07-24 2001-08-07 Bell Atlantic Network Services, Inc. Internetwork telephony with enhanced features
US6418461B1 (en) * 1997-10-06 2002-07-09 Mci Communications Corporation Intelligent call switching node in an intelligent distributed network architecture
JP3805096B2 (en) * 1998-03-13 2006-08-02 富士通株式会社 Voice / data integrated communication device
JP3244054B2 (en) * 1998-07-01 2002-01-07 日本電気株式会社 Method and system for delivering data to nodes in PBX network
US6108331A (en) * 1998-07-10 2000-08-22 Upstate Systems Tec, Inc. Single medium wiring scheme for multiple signal distribution in building and access port therefor
US6442169B1 (en) 1998-11-20 2002-08-27 Level 3 Communications, Inc. System and method for bypassing data from egress facilities
US6614781B1 (en) 1998-11-20 2003-09-02 Level 3 Communications, Inc. Voice over data telecommunications network architecture
EP1135903A2 (en) * 1998-11-30 2001-09-26 Broadcom Corporation Network telephony system
US6885657B1 (en) * 1998-11-30 2005-04-26 Broadcom Corporation Network telephony system
US6584122B1 (en) 1998-12-18 2003-06-24 Integral Access, Inc. Method and system for providing voice and data service
CA2364468A1 (en) * 1999-03-06 2000-09-14 Coppercom, Inc. System and method for administrating call and call feature set-up in a telecommunications network
US20020061309A1 (en) * 2000-03-08 2002-05-23 Garger Stephen J. Production of peptides in plants as N-terminal viral coat protein fusions
US7046778B2 (en) * 2000-03-31 2006-05-16 Coppercom, Inc. Telecommunications portal capable of interpreting messages from an external device
US7324635B2 (en) 2000-05-04 2008-01-29 Telemaze Llc Branch calling and caller ID based call routing telephone features
US6937562B2 (en) 2001-02-05 2005-08-30 Ipr Licensing, Inc. Application specific traffic optimization in a wireless link
US20020159468A1 (en) * 2001-04-27 2002-10-31 Foster Michael S. Method and system for administrative ports in a routing device
US7266609B2 (en) * 2001-04-30 2007-09-04 Aol Llc Generating multiple data streams from a single data source
US8572278B2 (en) 2001-04-30 2013-10-29 Facebook, Inc. Generating multiple data streams from a single data source
US7237033B2 (en) 2001-04-30 2007-06-26 Aol Llc Duplicating switch for streaming data units to a terminal
WO2003105006A1 (en) * 2001-04-30 2003-12-18 America Online, Inc. Load balancing with direct terminal response
US7124166B2 (en) 2001-04-30 2006-10-17 Aol Llc Duplicating digital streams for digital conferencing using switching technologies
US8068832B2 (en) * 2001-11-19 2011-11-29 Nokia Corporation Multicast session handover
US20040128693A1 (en) 2002-12-27 2004-07-01 Weigand Gilbert G. System and method for enabling access to content through a personal channel
US6918001B2 (en) * 2002-01-02 2005-07-12 Intel Corporation Point-to-point busing and arrangement
US8028092B2 (en) 2002-06-28 2011-09-27 Aol Inc. Inserting advertising content
US7376183B2 (en) * 2002-09-09 2008-05-20 Warner Bros. Entertainment, Inc. Post-production processing
US7197071B1 (en) 2002-09-09 2007-03-27 Warner Bros. Entertainment Inc. Film resource manager
US7555017B2 (en) * 2002-12-17 2009-06-30 Tls Corporation Low latency digital audio over packet switched networks
US7278920B1 (en) 2002-12-31 2007-10-09 Warner Bros. Entertainment, Inc. Theater-based gaming system enabling a multi-player game throughout a system of the theaters
US8223745B2 (en) * 2005-04-22 2012-07-17 Oracle America, Inc. Adding packet routing information without ECRC recalculation
US7716359B2 (en) * 2005-05-09 2010-05-11 Microsoft Corporation Method and system for providing an interface through which an application can access a media stack
US7930740B2 (en) * 2005-07-07 2011-04-19 International Business Machines Corporation System and method for detection and mitigation of distributed denial of service attacks
KR100909542B1 (en) 2005-08-01 2009-07-27 Samsung Electronics Co., Ltd. Method and apparatus for interworking voice and multimedia service between a CSI terminal and an IMS terminal
KR100728220B1 (en) 2005-09-29 2007-06-13 Electronics and Telecommunications Research Institute Apparatus and Method of Fault Diagnosis and Data Management for Satellite Ground Station
US8665892B2 (en) * 2006-05-30 2014-03-04 Broadcom Corporation Method and system for adaptive queue and buffer control based on monitoring in a packet network switch
US20090247006A1 (en) * 2008-01-22 2009-10-01 Wi3, Inc., New York Network access point having interchangeable cartridges
US8238538B2 (en) 2009-05-28 2012-08-07 Comcast Cable Communications, Llc Stateful home phone service
CA2848307A1 (en) 2011-08-08 2013-02-14 Novano Corporation Service over ethernet interconnectable wall plate (soeicwp) module
JP6235140B2 (en) * 2013-08-09 2017-11-22 Hewlett Packard Enterprise Development LP Switch assembly
KR102578553B1 (en) 2015-12-10 2023-09-13 Microsoft Technology Licensing, LLC Data-Driven Automated Provisioning for Telecommunications Applications
US10129769B2 (en) 2015-12-31 2018-11-13 Affirmed Networks, Inc. Adaptive peer overload control in mobile networks
EP3403432B1 (en) 2016-01-15 2020-11-18 Microsoft Technology Licensing, LLC Database based redundancy in a telecommunications network
US11741196B2 (en) 2018-11-15 2023-08-29 The Research Foundation For The State University Of New York Detecting and preventing exploits of software vulnerability using instruction tags
WO2020150315A1 (en) 2019-01-15 2020-07-23 Affirmed Networks, Inc. Dynamic auto-configuration of multi-tenant paas components

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4698802A (en) * 1986-03-07 1987-10-06 American Telephone And Telegraph Company And At&T Information Systems Inc. Combined circuit and packet switching system
US4731785A (en) * 1986-06-20 1988-03-15 American Telephone And Telegraph Company Combined circuit switch and packet switching system
US4764919A (en) * 1986-09-05 1988-08-16 American Telephone And Telegraph Company, At&T Bell Laboratories Virtual PBX call processing method

Also Published As

Publication number Publication date
US4872160A (en) 1989-10-03

Similar Documents

Publication Publication Date Title
CA1312133C (en) Integrated packetized voice and data switching system
CA1314955C (en) Identification and authentication of end user systems for packet communications network services
CA1321818C (en) Architecture of the control of a high performance packet switching distribution network
CA1311037C (en) Architecture and organization of a high performance metropolitan area telecommunications packet network
US4958341A (en) Integrated packetized voice and data switching system
CA1310733C (en) User to network interface protocol for packet communications networks
US4897874A (en) Metropolitan area network arrangement for serving virtual data networks
US4977582A (en) Synchronization of non-continuous digital bit streams
CA1315376C (en) Arrangement for switching concentrated telecommunications packet traffic
US4872159A (en) Packet network architecture for providing rapid response time
US4942574A (en) Concurrent resource request resolution mechanism
CA1315373C (en) Distributed control rapid connection circuit switch
US4894824A (en) Control network for a rapid connection circuit switch
US4875206A (en) High bandwidth interleaved buffer memory and control
EP0335562B1 (en) Architecture and organization of a high performance metropolitan area telecommunications packet network
Partridge et al. A 50-Gb/s IP router
McAuley Protocol design for high speed networks
US6343081B1 (en) Method and apparatus for managing contention in a self-routing switching architecture in a port expansion mode
EP0335555B1 (en) User to network interface protocol for packet communications networks
EP0336598B1 (en) Arrangement for switching concentrated telecommunications packet traffic
EP0335563B1 (en) Distributed control rapid connection circuit switch and controlling method
JP2594641C (en)
Traw Applying architectural parallelism in high-performance network subsystems
Ismert ATM network striping
Gandhi A parallel processing architecture for DQDB protocol implementation

Legal Events

Date Code Title Description
MKLA Lapsed