CA2022209A1 - Method of handling errors in software - Google Patents

Method of handling errors in software

Info

Publication number
CA2022209A1
Authority
CA
Canada
Prior art keywords
error
data
memory
module
cpu
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
CA002022209A
Other languages
French (fr)
Inventor
William F. Bruckert
Thomas D. Bissett
James Melvin
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Digital Equipment Corp
Original Assignee
William F. Bruckert
Thomas D. Bissett
James Melvin
Digital Equipment Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by William F. Bruckert, Thomas D. Bissett, James Melvin, and Digital Equipment Corporation
Publication of CA2022209A1
Legal status: Abandoned


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16Error detection or correction of the data by redundancy in hardware
    • G06F11/1629Error detection by comparing the output of redundant processing systems
    • G06F11/1633Error detection by comparing the output of redundant processing systems using mutual exchange of the output between the redundant processing components
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0706Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
    • G06F11/0712Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a virtual computing platform, e.g. logically partitioned systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0706Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
    • G06F11/0721Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment within a central processing unit [CPU]
    • G06F11/0724Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment within a central processing unit [CPU] in a multiprocessor or a multi-core unit
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16Error detection or correction of the data by redundancy in hardware
    • G06F11/1629Error detection by comparing the output of redundant processing systems
    • G06F11/165Error detection by comparing the output of redundant processing systems with continued operation after detection of the error
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0793Remedial or corrective actions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1008Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1415Saving, restoring, recovering or retrying at system level
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16Error detection or correction of the data by redundancy in hardware
    • G06F11/20Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/22Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
    • G06F11/2205Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing using arrangements specific to the hardware being tested
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/30Monitoring
    • G06F11/34Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment

Abstract

Hardware error processing is undertaken to analyze the source of the error and to preserve sufficient information to allow later software error processing.
The hardware error processing also allows, for certain errors, complete recovery without interruption of the sequence of instruction execution.

Description


I. BACKGROUND OF THE INVENTION
This invention relates generally to error processing in fault tolerant computing systems, and more specifically to methods and apparatus for processing errors in such systems which have dual processors operating in synchronism.
Processing errors in a data processing system involves three steps. The first is the detection of the error. The second is the recovery from the error. The third is recording information about the error. Another concern in a redundant processing environment is returning the system to full redundancy after repair.
In a fault tolerant computing system, errors are more costly than in a standard non-fault tolerant computing system. This is because fault tolerant computer systems are always employed in environments where the cost of any downtime is high either in terms of money or safety. Therefore, error processing is an extremely important operation for such systems.
As important as error processing is to a fault tolerant system, it is desirable that such processing not delay the execution of normal data processing operations unnecessarily. Thus a balance must be struck between the desire for efficient processing and the need for effective error processing.

Certain conventional fault tolerant computer systems suspend all operations upon the detection of an error in order to execute software error recovery procedures. Software error recovery procedures, however, can be complex and involved. Usually such procedures take considerable time and force the system to interrupt a potentially crucial task.
Accordingly, it is desirable to construct a system which spends a minimal amount of time executing software operations necessary for error handling. In this manner, the effects of software error handling on a computer system can be reduced by minimizing the time spent on error processing.
It is also desirable for the present invention to handle as many errors as possible in hardware so that error recovery is transparent to the software processes for which the computer system is executing data processing instructions.
II. SUMMARY OF THE INVENTION
This invention attains the desired ends by attempting to locate the source of an error in hardware and disable a processor if it is faulty prior to entering software error handling routines. Also, if an error can be handled entirely in hardware, then no software error handling is necessary.

More specifically, in accordance with the present invention, as embodied and as broadly described herein, a method of recovering from an error is provided comprising several steps which are performed by the data processing system without executing the data processing operations. The data processing system has a plurality of processor portions executing the same set of data processing operations, and the steps are: detecting an error in the data processing system during the execution of a faulting one of the operations; locating the processing portion in which the error was detected; determining whether the detected error is a critical error indicating the processor portion in which the error was detected is incapable of executing the data processing operations normally; reconfiguring data paths, if a critical error is detected, to bypass the processor portion in which the error was detected; and retrying the data processing operations being executed when the error was detected if the error is not a critical error.
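The recited steps can be summarized in code form. The following C sketch is illustrative only and is not part of the patent disclosure; the type and function names (error_report, bypass_processor, retry_operation) are hypothetical stand-ins for hardware actions.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical summary of the claimed recovery method: detect, locate,
 * classify, then either reconfigure around the faulty processor portion
 * or retry the faulting operation. */
typedef struct {
    int  portion;   /* processor portion in which the error was detected */
    bool critical;  /* true if that portion can no longer execute normally */
} error_report;

static void bypass_processor(int portion) {
    printf("reconfiguring data paths to bypass portion %d\n", portion);
}

static void retry_operation(void) {
    printf("retrying the operation that was executing when the error occurred\n");
}

/* Performed without executing normal data processing operations, before
 * any software error handling is invoked. */
void recover_from_error(const error_report *e) {
    if (e->critical)
        bypass_processor(e->portion);  /* faulty portion removed from data path */
    else
        retry_operation();             /* non-critical error: resume transparently */
}

int main(void) {
    error_report transient = { .portion = 0, .critical = false };
    error_report fatal     = { .portion = 1, .critical = true  };
    recover_from_error(&transient);
    recover_from_error(&fatal);
    return 0;
}
```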
III. BRIEF DESCRIPTION OF THE DRAWINGS
The accompanying drawings, which are incorporated in and which constitute a part of this specification, illustrate one embodiment of the invention and, together with the description of the invention, explain the principles of the invention.
Fig. 1 is a block diagram of a preferred embodiment of a fault tolerant computer system which practices the present invention;
Fig. 2 is an illustration of the physical hardware containing the fault tolerant computer system in Fig. 1;
Fig. 3 is a block diagram of the CPU module shown in the fault tolerant computer system shown in Fig. 1;
Fig. 4 is a block diagram of an interconnected CPU module and I/O module for the computer system shown in Fig. 1;
Fig. 5 is a block diagram of a memory module for the fault tolerant computer system shown in Fig. 1;
Fig. 6 is a detailed diagram of the elements of the control logic in the memory module shown in Fig. 5;
Fig. 7 is a block diagram of portions of the primary memory controller of the CPU module shown in Fig. 3;
Fig. 8 is a block diagram of the DMA engine in the primary memory controller of the CPU module of Fig. 3;
Fig. 9 is a diagram of error processing circuitry in the primary memory controller of the CPU module of Fig. 3;
Fig. 10 is a drawing of some of the registers of the cross-link in the CPU module shown in Fig. 3;
Fig. 11 is a block diagram of the elements which route control signals in the cross-links of the CPU module shown in Fig. 3;
Fig. 12 is a block diagram of the elements which route data and address signals in the primary cross-link of the CPU module shown in Fig. 3;
Fig. 13 is a state diagram showing the states for the cross-link of the CPU module shown in Fig. 3;
Fig. 14 is a block diagram of the timing system for the fault tolerant computer system of Fig. 1;
Fig. 15 is a timing diagram for the clock signals generated by the timing system in Fig. 14;
Fig. 16 is a detailed diagram of a phase detector for the timing system shown in Fig. 14;
Fig. 17 is a block diagram of an I/O module for the computer system of Fig. 1;
Fig. 18 is a block diagram of the firewall element in the I/O module shown in Fig. 17;
Fig. 19 is a detailed diagram of the elements of the cross-link pathway for the computer system of Fig. 1;
Figs. 20A-20E are data flow diagrams for the computer system in Fig. 1;
Fig. 21 is a block diagram of zone 20 showing the routing of reset signals;
Fig. 22 is a block diagram of the components involved in resets in the CPU module shown in Fig. 3;
Fig. 23 is a diagram of clock reset circuitry;
Fig. 24 is a flow diagram illustrating an overall hardware error handling procedure for the computer system in Fig. 1;
Figs. 25a and 25b, taken together, are a flow diagram of a procedure for handling CPU I/O errors within the process of Fig. 24;
Fig. 26 is a block diagram showing the error lines and various elements used in error handling procedures for the computer system in Fig. 1;
Fig. 27 is a block diagram showing the location of trace RAMs within the computer system in Fig. 1;
Fig. 28 is a block diagram of a trace RAM for the computer system in Fig. 1;
Fig. 29 is a flow diagram illustrating the procedure for recovering from a DMA error within the overall hardware error processing procedure of Fig. 24;
Fig. 30 is a flow diagram illustrating a procedure for handling CPU/MEM faults within the process of Fig. 24;
Fig. 31 is a flow diagram illustrating an overall software error handling procedure for the computer system in Fig. 1;
Fig. 32 is a flow diagram illustrating the CPU I/O error handler of Fig. 31;
Fig. 33 is a flow diagram illustrating the failed device handler of Fig. 32;
Fig. 34 is an illustration of a system address conversion table used in the computer system in Fig. 1;
Fig. 35 is an illustration of an example of a device driver used in the computer system in Fig. 1;
Fig. 36 is a flow diagram of the CPU/MEM fault handler of Fig. 31;
Fig. 37 is a flow diagram of the clock error handler of Fig. 31;
Fig. 38 is a flow diagram of the NXM error handler of Fig. 31; and
Figs. 39 and 40 are flow diagrams illustrating a procedure for the conversion of rail unique data to system data for the computer system in Fig. 1.
IV. DESCRIPTION OF THE PREFERRED EMBODIMENT
Reference will now be made in detail to a presently preferred embodiment of the invention, an example of which is illustrated in the accompanying drawings.
A. SYSTEM DESCRIPTION
Fig. 1 is a block diagram of a fault tolerant computer system 10 in accordance with the present invention. Fault tolerant computer system 10 includes duplicate systems, called zones. In the normal mode, the two zones 11 and 11' operate simultaneously. The duplication ensures that there is no single point of failure and that a single error or fault in one of the zones 11 or 11' will not disable computer system 10. Furthermore, all such faults can be corrected by disabling or ignoring the device or element which caused the fault. Zones 11 and 11' are shown in Fig. 1 as respectively including duplicate processing systems 20 and 20'. The duality, however, goes beyond the processing system.
Fig. 2 contains an illustration of the physical hardware of fault tolerant computer system 10 and graphically illustrates the duplication of the systems. Each zone 11 and 11' is housed in a different cabinet 12 and 12', respectively. Cabinet 12 includes battery 13, power regulator 14, cooling fans 16, and AC input 17. Cabinet 12' includes separate elements corresponding to elements 13, 14, 16 and 17 of cabinet 12.
As explained in greater detail below, processing systems 20 and 20' include several modules interconnected by backplanes. If a module contains a fault or error, that module may be removed and replaced without disabling computing system 10. This is because processing systems 20 and 20' are physically separate, have separate backplanes into which the modules are plugged, and can operate independently of each other. Thus modules can be removed from and plugged into the backplane of one processing system while the other processing system continues to operate.
In the preferred embodiment, the duplicate processing systems 20 and 20' are identical and contain identical modules. Thus, only processing system 20 will be described completely with the understanding that processing system 20' operates equivalently.
Processing system 20 includes CPU module 30 which is shown in greater detail in Figs. 3 and 4. CPU module 30 is interconnected with CPU module 30' in processing system 20' by a cross-link pathway 25 which is described in greater detail below. Cross-link pathway 25 provides data transmission paths between processing systems 20 and 20' and carries timing signals to ensure that processing systems 20 and 20' operate synchronously.
Processing system 20 also includes I/O modules 100, 110, and 120. I/O modules 100, 110, 120, 100', 110' and 120' are independent devices. I/O module 100 is shown in greater detail in Figs. 1, 4, and 17. Although multiple I/O modules are shown, duplication of such modules is not a requirement of the system. Without such duplication, however, some degree of fault tolerance will be lost.
Each of the I/O modules 100, 110 and 120 is connected to CPU module 30 by dual rail module interconnects 130 and 132. Module interconnects 130 and 132 serve as the I/O interconnect and are routed across the backplane for processing system 20. For purposes of this application, the data pathway including CPU 40, memory controller 70, cross-link 90 and module interconnect 130 is considered as one rail, and the data pathway including CPU 50, memory controller 75, cross-link 95, and module interconnect 132 is considered as another rail. During proper operation, the data on both rails is the same.
B. FAULT TOLERANT SYSTEM PHILOSOPHY
Fault tolerant computer system 10 does not have a single point of failure because each element is duplicated. Processing systems 20 and 20' are each a fail stop processing system, which means that those systems can detect faults or errors in the subsystems and prevent uncontrolled propagation of such faults and errors to other subsystems, but they have a single point of failure because the elements in each processing system are not duplicated.
The two fail stop processing systems 20 and 20' are interconnected by certain elements operating in a defined manner to form a fail safe system. In the fail safe system embodied as fault tolerant computer system 10, the entire computer system can continue processing even if one of the fail stop processing systems 20 and 20' is faulting.
The two fail stop processing systems 20 and 20' are considered to operate in lockstep synchronism because CPUs 40, 50, 40' and 50' operate in such synchronism. There are three significant exceptions. The first is at initialization when a bootstrapping technique brings both processors into synchronism. The second exception is when the processing systems 20 and 20' operate independently (asynchronously) on two different workloads. The third exception occurs when certain errors arise in processing systems 20 and 20'. In this last exception, the CPU and memory elements in one of the processing systems is disabled, thereby ending synchronous operation.
When the system is running in lockstep I/O, only one I/O device is being accessed at any one time. All four CPUs 40, 50, 40' and 50', however, would receive the same data from that I/O device at substantially the same time. In the following discussion, it will be understood that lockstep synchronization of processing systems means that only one I/O module is being accessed.
The synchronism of duplicate processing systems 20 and 20' is implemented by treating each system as a deterministic machine which, starting in the same known state and upon receipt of the same inputs, will always enter the same machine states and produce the same results in the absence of error. Processing systems 20 and 20' are configured identically, receive the same inputs, and therefore pass through the same states. Thus, as long as both processors operate synchronously, they should produce the same results and enter the same state. If the processing systems are not in the same state or produce different results, it is assumed that one of the processing systems 20 and 20' has faulted. The source of the fault must then be isolated in order to take corrective action, such as disabling the faulting module.
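The deterministic lockstep rule just described reduces to a simple comparison. The following C sketch is illustrative only; in the actual system the comparison is performed by hardware on the outputs of the two zones, and the snapshot fields shown here are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical snapshot of externally visible state produced by one zone
 * during a given cycle. */
typedef struct {
    uint32_t machine_state;
    uint32_t output_data;
} zone_snapshot;

/* Identical starting state plus identical inputs must yield identical
 * states and results; any divergence is treated as a fault in one zone. */
bool zones_diverged(const zone_snapshot *z0, const zone_snapshot *z1) {
    return z0->machine_state != z1->machine_state ||
           z0->output_data   != z1->output_data;
}

int main(void) {
    zone_snapshot a = { 0x13, 0xCAFE };
    zone_snapshot b = { 0x13, 0xCAFF };   /* one bit differs */
    if (zones_diverged(&a, &b))
        printf("divergence detected: isolate the faulting zone\n");
    return 0;
}
```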

Error detection generally involves overhead in the form of additional processing time or logic. To minimize such overhead, a system should check for errors as infrequently as possible consistent with fault tolerant operation. At the very least, error checking must occur before data is outputted from CPU modules 30 and 30'. Otherwise, internal processing errors may cause improper operation in external systems, like a nuclear reactor, which is the condition that fault tolerant systems are designed to prevent.
There are reasons for additional error checking. For example, to isolate faults or errors it is desirable to check the data received by CPU modules 30 and 30' prior to storage or use. Otherwise, when erroneous stored data is later accessed and additional errors result, it becomes difficult or impossible to find the original source of errors, especially when the erroneous data has been stored for some time. The passage of time as well as subsequent processing of the erroneous data may destroy any trail back to the source of the error.
"Error latency," which refers to the amount of time an error is stored prior to detection, may cause later problems as well. For example, a seldom-used routine may uncover a latent error when the computer system is already operating with diminished capacity due to a previous error. When the computer system has diminished capacity, the latent error may cause the system to crash.
Furthermore, it is desirable in the dual rail systems of processing systems 20 and 20' to check for errors prior to transferring data to single rail systems, such as a shared resource like memory. This is because there are no longer two independent sources of data after such transfers, and if any error in the single rail system is later detected, then error tracing becomes difficult if not impossible.
C. MODULE DESCRIPTION
1. CPU Module
The elements of CPU module 30 which appear in Fig. 1 are shown in greater detail in Figs. 3 and 4. Fig. 3 is a block diagram of the CPU module, and Fig. 4 shows block diagrams of CPU module 30 and I/O module 100 as well as their interconnections. Only CPU module 30 will be described since the operation of and the elements included in CPU modules 30 and 30' are generally the same.
CPU module 30 contains dual CPUs 40 and 50. CPUs 40 and 50 can be standard central processing units known to persons of ordinary skill. In the preferred embodiment, CPUs 40 and 50 are VAX microprocessors manufactured by Digital Equipment Corporation, the assignee of this application.
Associated with CPUs 40 and 50 are cache memories 42 and 52, respectively, which are standard cache RAMs of sufficient memory size for the CPUs. In the preferred embodiment, the cache RAM is 4K x 64 bits. It is not necessary for the present invention to have a cache RAM, however.
2. Memory Module
Preferably, CPUs 40 and 50 can share up to four memory modules 60. Fig. 5 is a block diagram of one memory module 60 shown connected to CPU module 30.
During memory transfer cycles, status register transfer cycles, and EEPROM transfer cycles, each memory module 60 transfers data to and from primary memory controller 70 via a bidirectional data bus 85. Each memory module 60 also receives address, control, timing, and ECC signals from memory controllers 70 and 75 via buses 80 and 82, respectively. The address signals on buses 80 and 82 include board, bank, and row and column address signals that identify the memory board, bank, and row and column address involved in the data transfer.
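The address fields just listed can be pictured as a simple record. The following C sketch is purely illustrative; the field widths are assumptions, as the text does not specify them.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical grouping of the address fields carried on buses 80 and 82. */
typedef struct {
    uint8_t  board;    /* selects one of up to four memory modules */
    uint8_t  bank;     /* one of eight DRAM banks in the selected module */
    uint16_t row;      /* row address, driven first on the shared lines */
    uint16_t column;   /* column address, driven after the row address */
} mem_address;

int main(void) {
    mem_address a = { .board = 2, .bank = 5, .row = 0x1A3, .column = 0x07C };
    printf("board %u, bank %u, row 0x%X, column 0x%X\n",
           a.board, a.bank, a.row, a.column);
    return 0;
}
```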
As shown in Fig. 5, each memory module 60 includes a memory array 600. Each memory array 600 is a standard RAM in which the DRAMs are organized into eight banks of memory. In the preferred embodiment, fast page mode type DRAMs are used.
Memory module 60 also includes control logic 610, data transceivers/registers 620, memory drivers 630, and an EEPROM 640. Data transceivers/receivers 620 provide a data buffer and data interface for transferring data between memory array 600 and the bidirectional data lines of data bus 85. Memory drivers 630 distribute row and column address signals and control signals from control logic 610 to each bank in memory array 600 to enable transfer of a longword of data and its corresponding ECC signals to or from the memory bank selected by the memory board and bank address signals.
EEPROM 640 to determine the cauxe of the fault. EEPROM 640 is addressed via row addre~ lines from drivers 630 and by EEPROM control signals from control logic 610. EEPROM 640 ' transfers eight bits of data to and from a thirty-two bit internal memory data bus 645.
Control logic 610 routes address signals to the elements of memory module 60 and generates internal timing and control signals. As ~hown in greater de~ail in Fig. 6, control logic 610 includes a primary/mirror designator circuit 612.
Primary/mirror designator c~rcuit 612 receives two sets of memory board address, bank address, row and column ad~
dress, cycle type, and cycle timing signal~ from memory controllers 70 and 75 on buses 80 and 82, and also transfers two sets of ECC signals to or from the memory controllers on buses 80 and 82. Transceivers/registers in designator 612 provide a buffer and interface for transferring these signals to and from memory buses 80 and 82. A primary/mirror multiplexer bit stored in status registers 618 indicates which one of memory controllers 70 and 75 is designated as LAwO~r~Ce- the primary memory controller and which i~ designated as the FINNEC~N. HENDER~ON
F.~RAUO~ RRE7r ~?~ ~ ~TICee~. N W.
~-U~ OTOI-. O C. ~OOO--O . : , ' ~ ~`. u mirror memory controller, and a primary/mirror multiplexer signal is provided from status registers 618 to designator 612.
Primary/mirror designator 612 provides two sets of signals for distribution in control logic 610. One set of signals includes designated primary memory board address, bank address, row and column address, cycle type, cycle tim-ing, and ECC signals. The other set of signals includes designated mirror memory board address, bank address, row and column address, cycle type, cycle timing, and ECC signals.
The primary/mirror multiplexer signal is used by designator 612 to select whether the signals on buses 80 and 82 will be respectively routed to the lines for carrying designated primary signals and to the lines for carrying designated mir-ror signals, or vice-versa.
A number of time division multiplexed bidirectional lines are included in buses 80 and 82. At certain times after the beginning of memory transfer cycles, status register transfer cycles, and EEPROM transfer cycles, ECC
signals corresponding to data on data bus 85 are placed on these time division multiplexed bidirectional lines. If the transfer cycle is a write cycle, memory module 60 receives data and ECC signals from the memory controllers. If the transfer cycle is a read cycle, memory module 60 transmits data and ECC signals to the memory controllers. At other times during transfer cycles, address, control, and timing ~ 2 0 2 2 ~

1 signals are received by memory module 60 on the time divisionmultiplexed bidirectional lines. Preferably, at the begin~
ning of memory transfer cycles, status register transfcr cycles, and EEPROM transfer cycles, memory controllers 70 and 75 transmit memory board addre s, bank address, and cycle type signals on these time~hared lilleE~ to sach memory module 60.
Preferably, row address signals and column address signals are multiplexed on the same row and column address ~-;
lines during transfer cycle~. First, a row addres~
provided to memory module 60 by the memory controller3, fol~
lowed by a column address about sixty nanoseconds later. ;~
A sequencer 616 receives as inputs a system clock signal and a reset signal from CPU module 30, and receives the designated primary cycle timing, designated primary cycle type, designated mirror cycle timing, and designated mirror cycle type signals from the transceivers/registers in de~ignator 612.
Sequencer 616 is a ring counter with associated steering logic th~t generates and distribute~ a number of control and sequence timing signals for the memory module that are needed in order to execute the various types of cycles. The control and ~equence timing signals are generated from the system clock signals, the designated primary cycle timing signals, ., .::
and the designated primary cycle type signals.
~A~ O--~ICC~
Fl#NEC.W. HENDERsoN
F.'~R.~BO~ Cr~RREl r ~n ~ -. N ~1 ~ N~ToN D. C ~OOO-- , 1~0~ --0 ~ 17 ~ :- ~

2022~Q~

1 Sequencer 616 alscJ generates a duplicate set of sequence mirror cycle timing sigr. and the designated mirror cycl~
type signals. These dui i sequence timing si~nals are used for error check ~ e~ transfers of multi-long wordæ of data to an~ rom m~ edule 60 in a fast page mode, each set of co~umn addresses starting with the first set i8 followed by next. 'umn address 120 n~no~econds :~
later, and each lol~ ~rd ol: ca is moved across bus 85 120 nanoseconds a~ter ~ previ , long word of data.
Sequencer 616 also ger.. ,t~s tx/rx register control signals. The tx~rx regist~ nt.rol signals are provided t~
control the operation of data transce$vers/registers 620 and the transceivers/registers in designator 612. The direction o~ da~ flow i~ deterQined ~-y the steering logLc in sequencer 616, which responds the ;ignated primary cycle type signals by gel~ n~ trol and sequence timing signals to in .Le wheth~ .I when data and ECC ~ignals should be written into or 1 fxom the transceivers/
registers in memory module 60. Thus, durinq memory write cycles, status register write cycles, and EEPROM write ~
cycles, data and ECC si~nals will be latched int~ the - . .
transc~ivers~re~isters f buses 80, 82, and 85, while dur ~:
ing memory read cycles ~-u8 regi~ter read cycles, and EEPROM read cycles, da .lnd ECC signals will be latched int.J
" ,c.~
~INN~N. HENDERsoN
,C,.~R~o~. C~RR~l r 6 DUNNeR
~n~ -.. N W.
011.0 C ~000-1~0~ 0 _ 18 -~o . '~: ~ ~ ,: `? i~ ~

~' ~
20222Q~

l the transceiverstregisters from memory array 600, status registers 618, or EEPROM 640 for output to CPU ~odule 30.
Sequencer 616 also generates EEPROM control signals to control the operation of EEPROM 640.
The timing relationships that exi~t in memory module 60 are specified with reference to the rise time of the syctem clock signal, which has a period of thirty nanoseconds. All ~tatus register read and write cycle~, and all memory read and write cycles of a single longword, are performed in ten system clock periods, i.e., 300 nanoseconds. Memory read and write transfer cycles may consist o' multi-longword transfers. For each additional longword that is transferred, ;~
the memory transfer cycle is extended for four additional system clock periods. Memory ref~esh cycles and EEPROM write cycles require at least twelve system clock periods to execute, and EEPROM read cycles require at least twenty system clock periods.
The designated primary cycle timing signal causes sequencer 616 to start generating the sequence timing and control ignals that enable the memory module selected by the memory board address signals to implement a requested cycle.
The transition of the designated primary cycle timing signal ~-to an activa state marks the start of the cycle. The return of the deslgnated primary cycle timing signal to an inactive state marks the end of the cycle.
~W Ot-lCt-FINNEC~N. HENDERSON
F.~R.~O~. G/~RREIr a DUNNER
m~ t-T, W.
w~ir~O~o~, O C ~000- , ~
~ ~ o ~ - 0 : -- 19 - ,;~ ~

20222~1~

1 The sequence timing signals generated by sequencer 616 are associated with the different stata~ entered by the ~ -. .
sequencer as a cycle reque~ted by CPU module 30 is executed.
In order to qpecify the timing relation~hip among these dif~
ferent states (and the timing relatiOnship among sequence timing signals corresponding to each of these states), the discrete states that may be entered by sequencer 616 are `
identified as state~ SEQ IDLE and SEQ 1 to SEQ 19. Each state lasts for a single system clock period (thirty nanosecond~). Entry by sequencer 616 into each different state is triggered by the leading edge of the system clock signal. The leading edges of the system clock signal that cau~e sequencer 616 to enter ~tate~ SEQ ID~E and SEQ 1 to SEQ
19 are referred to as tran~itions T ID~E and Tl to ~19 to relate them to the ~equencer states, i.e., TN is the ~ystem clock signal leading edge that cau~es sequencer 616 to enter state SEQ N.
At times when CPU module 30 i5 not directing memory module 60 to execute a cycle, the designated primary cycle timing signal is not asserted, and the sequencer remains in -state SEQ IDLE. The sequencer is started (enters state SEQ
1) in response to a~sertion by memory controller 70 of the cycle timlng signal on bus 80, provided control logic 610 and sequencer 616 are located in the memory module ~elected by memory board address signals also transmitted from memory FlNNEC~j HENDERSON co~troller 70 on bus 80. The ri8ing edg~ of the first systemF.j~R~O~ RREl r ; D~ ER
I~C ~ ~latcl. N w NOlON D C ~000 1~0~ -0 , ~, . . .

2~22~ ~ 9 : .

1 clock signal following aRsertion of the de~ignated primary cycle active signal corre~ponds to transition Tl.
As indicated previously, in the case of transfers of a single longword to or from memory array 600, the cycle is performed in ten system clock periods. The sequencer proceeds from SEQ IDLE, to ~tates SEQ 1 through SEQ 9, and returns to SEQ IDLE. :
Memory read and write cycles may be extended, however, to transfer additional longwords. MemGry array 600 prefer- ~ :
ably uses "r'ast page mode~ DRAMs. During multi-longword reads and writes, transfers of data to and from the memory array after tran3fer of the first longword are accomplished by repeatedly updating the column addre~3 and regenerating a CAS (column address strobe) signal.
lS During multi-longword transfer cycles, these updates of the column address can be implemented because sequencer 616 repeatedly loops from states SEQ 4 through SEQ 7 until all of the longwords are transferred. For ex.~mple, if three longwords are being read from or written into memory array 600, the sequsncer enters states SEQ IDLE, SEQ 1, S~Q 2, SEQ ~ -3, SEQ 4, SEQ 5; SEQ 6, SEQ 7, SEQ 4, SEQ 5, SEQ 6, SEQ 7, SEQ 4, SEQ 5, SEQ 6, SEQ 7, SEQ 8, SEQ 9, and SEQ IDLE.
During a memory trans~er cycle, the designated primary cycIe timing signal is monitored by 3equencer 616 during transition T6 to determine whether to extend the memory read .~wO~c... or write cycle in order to transfer at least one additional FINNEC~N, HENDER50N
C.'~RAEOW. C~RREtr T ~ c e ~ . N W
~ul~oTo~o O C 000--1~ 0~ 0 _ 21 ~
,:,~ ., 20222Q9 ~ ~ ~

1 longword. At times when the designated primary cycle timing ~ignal is a~ert d during tran~ition T6, t~e ~eq~ r -~n state SEQ 7 will resplf ~o the next s~, ~m clock signal by entering state SEQ 4 i~ ;.ead of anterinl. -.tate SEQ 8.
In ~he ca~ of a ~.ilti-longword transfer, the de~ignated primary cycle timing signal is asserted at least fifteen nanoseconds before the first T6 txansition and remains as-serted until the final longword is transferred. In order to end a memory tran~fer cycle after the final longword has been transferred, the designated primary cycle timing signal is . ~ .
deasserted at lea~t fifteen nanoseconds before the last r6 transition and remains deaqserted for at lea~t ten nanoseconds after tha last T6 transition.
During memory transfer cycles, the de~ignated primary row address signals and the de~ignated primary column address ~:-~ignals are presented at different times by de~ignator 612 in control logic 610 to memory drivers 630 on a set of time division multiplexed lines. The ou~puts of drivers 630 are applied to the address inputs of the DRAMS in memory array 600, and also are returned to control logic 610 for comparison with the designated mirror row and column address signals to check for errors. During status register transfer :
cycles and EEPROM transfer cycles, column address ~ignals are not needed to select a particular storage location.
2S During a memory transfer cycle, row address signals are~^~O'~C~ the first signals presented on the time~hared row and column FINNEC~N. HeNDERSON
F~R~U.~ ARE1 r T ~ W
0~0~. 0 C. ~000~
O ., .10 - 22 -~ f~ ~
2~222~

., ~ ::.
.

1 address lines of buses 80 and 82. During state SEQ IDLE, row addre~,~ 5ignals are tran~ittQd by the me~ory ~ontr~ll~rs the row and column addr~ss lines, and the row address iq stable from at least fifteen nanoseconds before the Tl transition until ten nanoseconds after the T1 tranFition.
Next, column address signals are transmitted by the memory controllers on the row and column address lines, and ~he ~;
column address i8 stable from at least ten nanoseconds before the T3 transition until fifteen nanoseconds after the T4 transition. In the caqe of multi-longword transfers during memory transfer cycle~, sub~equent column address signals are then transmitted on the row and column address line~, and these subsequent column addres~es are stable from ten nanoFeconds before the T6 transition until fifteen ~ ~;
~ .
nano~econdF, after the T7 transition.
Generator/checker 617 receives the two se~s of sequence timing signals generated by sequencer 616. In addition, the designated primary cycle type and bank address signals and the designzted mirror cycle type and bank address signals are transmltted to generator~checker 617 by designator 612. In the generator/checker, a number of primary control signals, i.e., RAS (row addresc strobe), CAS (column address strobe), and WE ~write enable), are generated for distribution to drivers 630, u~ing the primary sequ~nce timing signals and the designated primary cycle type and bank address signals. ~ -~
^WO"'"- A duplicate set of these control signals is generated by FI~NEG~ . HE~DERSON
FAR~EO~' G, RRE1r E7 DuN!`lER
17-~ Il C~t~T, h w W~Ch~hOTOh.O C ~OOO--~0~ 0 20222~

1 generator/checker 617 from the duplicate (mirror) ~equence timing signals and the designated mirror cycle type and bank address signals. These mirror RAS, CAS, and write enable signals are used for error checking.
When the primary cycle type signal~ indicate a memory transfer cycle is being performed, the primary bank address signals identify one selected bank of DRAM~ in memory array 600. Memory drivers 630 include ~epar~te RAS drivers for each bank of DRAM~ in memory array 600. In generator/checker 617, the primary RAS signal is generated during the memory transfer cycle and demultiplexed onto one of the lines con~
necting the generator~checker to the RAS drivers. As a result, only the RAS driver corresponding to the selected DRAM bank receives an asserted RAS signal during the memory tran~fer cycle. During refresh cycle~, the primary RAS
~ignal is not demultiplexed and an asserted RAS signal i~
received by each RAS driver. During statu~ register transfer cycles ~nd EEPROM trnn~fer cysles, the bank address ~ignals are unnece~sary.
Memory drivers 630 also include CAS drivers. In ~ ~
generator/checker 617, the primary CAS signal is generated -i~i during memory transfer cycles and refresh cycles. The primary CAS signal is not demultiplexed and an asserted CAS
signal is received by each CAS driver.
2S During memory write cycles, the primary WE signal is ~wo~e~ generated by generator/checker 617. The asserted WE signal FINNEC~N. HENDEASON
F~R~I~OW CARAE~E

t~- - W
W~ OTO~ o. C ~ooo- , ~o~ o ,,, ~.

20222~ ~

1 is provided by drivers 630 to each DRAM bank in memory array 600. However, a write can only be executed by the selected DRAM bank, which also receives a~serted RAS and CAS signals.
In the preferred embodiment of the inYention, during S memory transfer cycles the primary RAS signal i~ asserted during the T2 transition, is stakle from at least ~en nano-seconds before the T3 transition, and ~s deasserted durlng the last T7 transition. The primary CAS signal is asserted fifteen nanoseconds after each T4 transition, and is . ~ ~
dea~serted during each T7 tran~ition. During memory write cycles the primary WE signal is a~serted during the T3 transition, is stable from at least ten nanoseconds before khe f$rst T4 tran~ition, and is dea~serted during the last T7 transition.
When the primary cycle type signals indicate a memory refresh cycle is being performed, generator/checker 61~
causes memory array 600 to perform memory refresh operations in response to the primary sequencs timing signals provided by sequencer 616. During these refresh operations, the RAS
and CAS signals are generated and distributed by the ;~
generator/checker in reverse order. This mode of rsfresh . ~
requires no external addressing for bank, row, or column.
During transfer cycles, ECC signals are transferred on the time division multiplexed bidirectional lines of buse~ 80 and 82 at times when data i~ being transferred on bu~ 85.
s~o-l-- However, these same line~ are used to transfer control (e.g., F.~R~ W. C~R R E1 r o~O~ o c ~000-~o~ 0 ~ 202220~

1 cycle type) and address (e.g.~ memoxy board address and bank address) signals at other time~ during the transfer cycle.
The transceivers/regis~ers in primary/mirror designator 612 include receivers and transmitters that are respon~ive to sequence timing signals and tx~rx register control signals provided by sequencer 616. The sequence timing signals and tx/rx regi~ter control -ignals enable multiplexing of ECC
siqnals and address and control ~ignals on the time division multiplexed bidirectional line~ of buses 80 and 82.
Preferably, control and address signals, such a~ cycle type, memory board address, and bank address signals, are transmitted by memory controllers 70 and 75 and presented on the time6hared lines of buses 80 and 82 at the beginn$ng of either single or multi-longword tran~fer cycles. The~e signals start their transition ~while the ~equencer i~ in the SEQ IDLE state) concurrent with activation of the cycle tim-ing signal, and remain stable throuqh T2. Therefore, in the transcelvers/regiisters of designator 612, the receivers are enabled and the transmitters are set into their tristate mode at least until the end of state SEQ 2.
The cycle type signals identify which of the following listed functions will be performed by memory array 60 during the cycles memory read, memory wrLte, ~tatus regi~ter read, status register write, EEPROM read, EEPROM write, and refre~h. The designated primary cycle type signals received LAWO~ C~ by designatox 612 are provided to sequ2ncer 616 and used in FINNEC.~N, HE! DERSON
F~R~EO~ ~RRE1r 6~ DUN~JER
IT~C 1~ TII~CtT, ~ ~
~C~ 1~010~ . D. C TOOO--~0~ 0 ~ ' ' `~ A~ ~ V~

20222~.~

1 generating tx/rx control signals and sequence timing signals.
For example, in data transceivers/registers 620 and in the transceivers/registers of designator 612, the receivers are enabled and the transmitters are set into their tristate mode 5 , by ~.equencer 616 throughout a write cycle. Bowever~ in data ;
transceivers~registers 620 and in the transcQivers/registers of designator 612 dur.ng a read cycle, ~he receiver~ are ~.e~
into their trista~e mode and the tran~mitter~. are enabled by sequencer 616 after the cycle type, memory board addre~, and bank address signals have been received at the beginning of the cycle.
In the preferred embodiment, data transferred to or from ~ ~-memory array 600 i~ checked in each memory module 60 using an ~-~
Error Dete~ting Code (EDC), which is preferably the ~ame code -lS requ$red by memory controllers 70 and 75. The preferred code ;~
18 a _ingle bit correcting, double bit detecting, error cor~
recting code (ECC).
During a memory write cycle, memory controller 70 tranRmits at lea~t one longword of data on data bus 85 and ~ ~
slmultaneous.ly transmits a corresponding set of ECC signals ~;
on bus 80. Meanwhile, memory controller 75 transmLts a ~econd set of ECC signals, which also correspond to the longword on data bus 85, on bus 82.
As embodied herein, during a memory ~ri~e cycle the data and the ECC signals for each longword are pre~ented to the .~wOr.,~" receivers of data transceiver3/registers 620 and to the Fl~;EC~!~t;. HE~DERSON
F.~ ~RRE I r ii Dl,'~i~'iER
t~ttT. h W
~h~hl~10~ D C ~000--~0~ 0 - 20222~9 1 receivers of the transceiver~/registers of designator 612.
The data and the ECC signals, which are stable a~ least ten nano econds before the T4 transition and remain stable until fifteen nanoseconds after the T6 tran~ition, are latched into S these transceivers/registers During this time period, :
memory controllers 70 and 75 do not provide addres~ and ~ ~:
control signals on the timeshared lines of bu~es 80 and 82.
The designated primary ECC signals received by designa- :
tor 612 and the longword of data received by transceivers/
re~isters 620 during the memory write cycle are provided to the data input~ of the DRA~s in each of the eight b~nks of memory array 600 and to ECC generator 623. The generated ECC
is compared to the designated primary ECC by comparator 625.
The designated primary ECC signals also are provided to ECC
comp~rators 625, together with the designated mirror ECC
signals. :~ :
As embodied herein, during a memory read cycle, at least :
one longword of data and a corre~ponding set of ECC signals are read from memory array 600 and respectively steered to :: ::
data transceivers~registers 620 and to the transceivers/ ~ :~
registers of designator 612. During transition T7 of the memory read cycle, the data and the ECC signals for each :~
lonqword are available from memory array 600 and are latched into these transceiverR/registers, The data is also presented to the ECC generator 623 and itY output is compared .~WOr.,e... to the ECC read from memory FINNEC~N . HENDERSON
F.~RAW~ RRE1 r ~" ~ T. ~ w O ~ O -t D C ~ O O O C
1~0~ 0 _ 2B -2022~

1 After latching, the data and the ECC signals are presented to data bus 85 and to buses 80 and 82 by the transmitters of data tran~ceivers/regi~ters 620 and by ~he tran~mitters of the transceivers/registers of designator 612.
The same ECC signals are kran~mitted from the transceivers/
registers in designator 612 to memory controller 70 and to memory controller 75. The data and the ECC signals transmit-ted on data bus 85 and on buses 80 and 82 are stable from j-:~
fifteen nanoseconds after the T7 tran0ition until five nanoseconds before the following T6 transition (in the case of a multi-longword transfer) or until five nanoseconds before the following T IDLE transition ~in the case of a single longword trans~er or the last longword of a multi-longword transfer). During thi~ time period, memory control~
lers 70 an~ 75 do not provide address and control signal3 on the timeshared lines of bu~es 80 and 82. The transmitters of :~
data transceivers~registers 620 and the transmitters of the transceivers/registers of designator 612 are set into their tristate mode during the following T IDLE transition.
Compar~tor 614 i9 provided to compare the address, control, and timing signals originating from controller 70 with the corresponding address, control, and timing signals originating from controller 75. The designated primary cycle timing signals, cycle type ~ignals, memory board address signals, and bank address signals, together with the LA~Or~,ct. designated mirror cycle timing signals, cycle type signals, F.~R~ I' G~RRE1 r 6 D ~ E R
17~ A~t~, N W
~AINO~ON O C ~OOO--1~ 0~ 0 r~
20222~

1 memory board address signals, bank addres~ signals, row ad~
dress signals, and column address signal~, are provided from designator 612 to comparator 614. The de~ignsted primary row addre3s signal~ and column addres~ signals are provided from the outputs of drivers 630 to comparator 614. Both set~ of signals are then compared.
If there i9 a miscompare between any of the address, control, and timing signals originating from the memory controllers, comparator 614 generates an appropriate error signal. A~ shown in Figure 6, board address error, bank ad~
dre~ error, row addres~ error, column address error, cycle ..
type address error and cycle timing error signals may be output by the comparator.
Generator/checker 617 compares the primary control and ~. . , tim~ng signals generated by ~equencer 616 an~i generator/
checker 617 using the designated primary bank addres~, cycle type, and cycle timing signals with the mirrox control and timlng signals generated using the designated mirror bank addre~s, cycle type, and cycle timing signals. The two sets of ~equence timing signals are provided by sequencer 616 to generator/checker 617. The primarY RAS, CAS, and W~ signals are provided from the output~ of drivers 630 to generator/
checker 617. As indicated previously, the mirror RAS, CAS, and WE signals are generated internally by the generator/
checker. Generator/checker 617 compares the primary RAS, ~ o~rlee~
FlNtleCAN~ HE!IDERSON
,'~RAE;~;ARRETT
6 Dl.'NNER
3,~,~,.ot,.~w W~ 0104 0 C ~000-IIOJ~ O

,~.

20222~

1 CAS, wE, and sequence timing signal~ to the mirror RAS, CAS, ;~
WE, and sequence timing signals.
If there is a miscompare between any of the control and timing signals originating from sequencer 616 or generator/
checker 617, the generator/checker generates an appropriate error signal. As shown in Figure 6, sequencer error, RAS
error, CAS error, and WE error signals may be output by generator/checker 617.
Error signals are provided from comparator 614 and from generator/checker 617 to address/control error logic 621. In response to receipt of an error signal from comparator 614 or from generator/checker 617, address/control error logic 621 transmits an address/control error signal to CPU module 30 to indicate the detection of a fault due to a miscompare between any address, control, or timing signals. The address/control error signal is sent to error logic in memory controllers 70 and 75 for error handling. The transmission of the address/control error signal to CPU module 30 causes a CPU/MEM fault, which is discussed in greater detail in other sections.
The error signals from comparator 614 and from generator/checker 617 also are provided to status registers 618. In the status registers, the error signals and all of the address, control, timing, data, and ECC signals relevant to the fault are temporarily stored to enable error diagnosis and recovery.

In accordance with one aspect of the invention, only a single thirty-two bit data bus 85 is provided between CPU module 30 and memory module 60. Therefore, memory module 60 cannot compare two sets of data from memory controllers 70 and 75. However, data integrity is verified by memory module 60 without using a duplicate set of thirty-two data lines by checking the two separate sets of ECC signals that are transmitted by memory controllers 70 and 75 to memory module 60.
As shown in Fig. 6, control logic 610 includes ECC generator 623 and ECC comparators 625. The designated primary and mirror ECC signals are provided by designator 612 to the ECC comparators. During a memory write cycle, the designated primary ECC signals are compared to the designated mirror ECC signals. As a result, memory module 60 verifies whether memory controllers 70 and 75 are in agreement and whether the designated primary ECC signals being stored in the DRAMs of memory array 600 during the memory write cycle are correct. Furthermore, the data presented to the data inputs of the DRAMs during the memory write cycle is provided to ECC generator 623. ECC generator 623 produces a set of generated ECC signals that correspond to the data and provides the generated ECC signals to ECC comparators 625. The designated primary ECC signals are compared to the generated ECC signals to verify whether the data transmitted on data bus 85 by memory controller 70 is the same as the data being stored in the DRAMs of memory array 600.
During a memory read cycle, the data read from the selected bank of DRAMs is presented to the ECC generator. The generated ECC signals then are provided to the ECC comparators, which also receive stored ECC signals read from the selected bank of DRAMs. The generated and stored ECC signals are compared by ECC comparators 625.
If there is a miscompare between any of the pairs of ECC signals monitored by ECC comparators 625, the ECC comparators generate an appropriate error signal. As shown in Figure 6, primary/mirror ECC error, primary/generated ECC error, and memory/generated ECC error signals may be output by the ECC comparators.
These ECC error signals from ECC comparators 625 are provided to status registers 618. In the status registers, each of the ECC error signals and all of the address, control, timing, data, and ECC signals relevant to an ECC
fault are temporarily stored to enable error diagnosis and recovery.
An ECC error signal is asserted by ECC comparators 625 on an ECC error line and transmitted to CPU module 30 to indicate the detection of an ECC fault due to a miscompare. The miscompare can occur during either of the two ECC checks performed during a memory write cycle, or during the single ECC check performed during a memory read cycle.
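The two write-cycle checks and the single read-cycle check can be expressed as a small decision routine. This is a minimal C sketch; the function names and the ecc_of() placeholder are hypothetical, and the actual checks are performed in hardware by ECC generator 623 and ECC comparators 625.

#include <stdint.h>
#include <stdio.h>

/* Placeholder ECC function: any deterministic code serves the sketch.
 * The actual ECC used by the memory module is not specified here. */
static uint8_t ecc_of(uint32_t data)
{
    uint8_t ecc = 0;
    for (int i = 0; i < 32; i += 4)       /* fold the 32-bit word into 8 bits */
        ecc ^= (data >> i) & 0xF;
    return ecc;
}

/* Write cycle: two checks, primary vs. mirror ECC, and primary vs.
 * ECC regenerated from the data actually presented to the DRAMs. */
static int check_write(uint32_t dram_data, uint8_t primary_ecc, uint8_t mirror_ecc)
{
    int faults = 0;
    if (primary_ecc != mirror_ecc)        faults |= 1;  /* primary/mirror ECC error */
    if (primary_ecc != ecc_of(dram_data)) faults |= 2;  /* primary/generated ECC error */
    return faults;
}

/* Read cycle: one check, ECC stored with the data vs. ECC regenerated
 * from the data read out of the selected bank. */
static int check_read(uint32_t dram_data, uint8_t stored_ecc)
{
    return (stored_ecc != ecc_of(dram_data)) ? 4 : 0;   /* memory/generated ECC error */
}

int main(void)
{
    uint32_t word = 0xDEADBEEF;
    uint8_t  ecc  = ecc_of(word);
    printf("write faults: %d\n", check_write(word, ecc, ecc)); /* 0: rails agree */
    printf("read faults:  %d\n", check_read(word ^ 1, ecc));   /* 4: stored vs generated mismatch */
    return 0;
}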
As shown in Figure 6, board select logic 627 receives slot signals from a memory backplane. The slot signals specify a unique slot location for each memory module 60. Board select logic 627 then compares the slot signals with the designated primary board address signals transmitted from one of the memory controllers via designator circuit 612. A board selected signal is generated by board select logic 627 if the slot signals are the same as the designated primary board address signals, thereby enabling the other circuitry in control logic 610.
3. Memory Controller
Memory controllers 70 and 75 control the access of CPUs 40 and 50, respectively, to memory module 60 and auxiliary memory elements and, in the preferred embodiment, perform certain error handling operations. The auxiliary memory elements coupled to memory controller 70 include system ROM 43, EEPROM 44, and scratch pad RAM 45. ROM 43 holds certain standard code, such as diagnostics, console drivers, and part of the bootstrap code. EEPROM 44 is used to hold information such as error information detected during the operation of CPU 40, which may need to be modified, but which should not be lost when power is removed. Scratch pad RAM 45 is used for certain operations performed by CPU 40 and to convert rail-unique information (e.g., information specific to conditions on one rail which is available to only one CPU 40 or 50) to zone information (e.g., information which can be accessed by both CPUs 40 and 50).
Equivalent elements 53, 54 and 55 are coupled to memory controller 75. System ROM 53, EEPROM 54, and scratch pad RAM
55 are the same as system ROM 43, EEPROM 44, and scratch pad RAM 45, respectively, and perform the same functions.
The details of the preferred embodiment of primary memory controller 70 can be seen in Figs. 7-9. Mirror memory controller 75 has the same elements as shown in Figs. 7-9, but differs slightly in operation. Therefore, only primary memory controller 70's operation will be described, except where the operation of memory controller 75 differs. Memory controllers 70' and 75' in processing system 20' have the same elements and act the same as memory controllers 70 and 75, respectively.
The elements shown in Fig. 7 control the flow of data, addresses and signals through primary memory controller 70. Control logic 700 controls the state of the various elements in Fig. 7 according to the signals received by memory controller 70 and the state engine of that memory controller which is stored in control logic 700. Multiplexer 702 selects addresses from one of three sources. The addresses can either come from CPU 30 via receiver 705, from the DMA engine 800 described below in reference to Fig. 8, or from a refresh resync address line which is used to generate an artificial refresh during certain bulk memory transfers from one zone to another during resynchronization operations.
The output of multiplexer 702 is an input to multiplexer 710, as is data from CPU 30 received via receiver 705 and data from DMA engine 800. The output of multiplexer 710 provides data to memory module 60 via memory interconnect 85 and driver 715. Driver 715 is disabled for mirror memory control modules 75 and 75' because only one set of memory data is sent to memory modules 60 and 60', respectively.
The data sent to memory interconnect 85 includes either data to be stored in memory module 60 from CPU 30 or DMA
engine 800. Data from CPU 30 and addresses from multiplexer 702 are also sent to DMA engine 800 via this path and also via receiver 745 and ECC corrector 750.
The addresses from multiplexer 702 also provide an input to demultiplexer 720 which divides the addresses into a row/column address portion, a board/bank address portion, and a single board bit. The twenty-two bits of the row/column address are multiplexed onto eleven lines. In the preferred embodiment, the twenty-two row/column address bits are sent to memory module 60 via drivers 721. The single board bit is preferably sent to memory module 60 via driver 722, and the other board/bank address bits are multiplexed with ECC
signals.
Multiplexer 725 combines a normal refresh command for memory controller 70 along with cycle type information from CPU 30 (i.e., read, write, etc.) and DMA cycle type information. The normal refresh command and the refresh resync address both cause memory module 60 to initiate a memory refresh operation.
The output of multiplexer 725 is an input to multiplexer 730 along with the board/bank address from demultiplexer 720. Another input into multiplexer 730 is the output of ECC generator/checker 735. Multiplexer 730 selects one of the inputs and places it on the time-division multiplexed ECC/address lines to memory module 60. Multiplexer 730 allows those time-division multiplexed lines to carry board/bank address and additional control information as well as ECC information, although at different times.
ECC information is received from memory modules 60 via receiver 734 and is provided as an input to ECC generator/
checker 735 to compare the ECC generated by memory module 60 with that generated by memory controller 70.
Another input into ECC generator/checker 735 is the output of multiplexer 740. Depending upon whether the memory transaction is a write transaction or a read transaction, multiplexer 740 receives as inputs the memory data sent to memory module 60 from multiplexer 710 or the memory data received from memory module 60 via receiver 745. Multiplexer 740 selects one of these sets of memory data to be the input to ECC generator/checker 735. Generator/checker 735 then generates the appropriate ECC code which, in addition to being sent to multiplexer 730, is also sent to ECC corrector 750. In the preferred embodiment, ECC corrector 750 corrects any single bit errors in the memory data received from memory module 60.
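The specification describes ECC corrector 750 only functionally; the particular code is not given in this portion of the text. The following C sketch shows one conventional way a single-bit error can be located and corrected: a Hamming-style code in which the syndrome (the XOR of the positions of all set codeword bits) is zero for a valid codeword and otherwise names the failed bit position. The word size and bit layout here are illustrative assumptions, not the memory module's actual ECC.

#include <stdint.h>
#include <stdio.h>

/* Encode 8 data bits into a codeword whose bit positions run 1..12.
 * Positions 1, 2, 4 and 8 hold check bits; the rest hold data bits.
 * Check bits are chosen so that the XOR of the positions of all set
 * bits is zero for an error-free codeword. */
static uint16_t hamming_encode(uint8_t data)
{
    static const int data_pos[8] = { 3, 5, 6, 7, 9, 10, 11, 12 };
    uint16_t cw = 0;
    int syn = 0;
    for (int i = 0; i < 8; i++) {
        if (data & (1 << i)) {
            cw |= (uint16_t)1 << data_pos[i];
            syn ^= data_pos[i];
        }
    }
    for (int p = 1; p <= 8; p <<= 1)      /* set check bits 1, 2, 4, 8 */
        if (syn & p)
            cw |= (uint16_t)1 << p;
    return cw;
}

/* Correct a single-bit error in place and return the data bits.
 * A non-zero syndrome is the position of the flipped bit. */
static uint8_t hamming_correct(uint16_t *cw)
{
    static const int data_pos[8] = { 3, 5, 6, 7, 9, 10, 11, 12 };
    int syn = 0;
    for (int pos = 1; pos <= 12; pos++)
        if (*cw & ((uint16_t)1 << pos))
            syn ^= pos;
    if (syn != 0)                          /* flip the bit the syndrome names */
        *cw ^= (uint16_t)1 << syn;
    uint8_t data = 0;
    for (int i = 0; i < 8; i++)
        if (*cw & ((uint16_t)1 << data_pos[i]))
            data |= (uint8_t)(1 << i);
    return data;
}

int main(void)
{
    uint16_t cw = hamming_encode(0xA7);
    cw ^= 1 << 9;                          /* inject a single-bit error */
    printf("corrected data: 0x%02X\n", hamming_correct(&cw));  /* prints 0xA7 */
    return 0;
}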
The corrected memory data from ECC corrector 750 is then sent to the DMA engine shown in Fig. 8 as well as to multiplexer 752. The other input into multiplexer 752 is error information from the error handling logic described below in connection with Fig. 9. The output of multiplexer 752 is sent to CPU 30 via driver 753.
Comparator 755 compares the data sent from multiplexer 710 to memory module 60 with a copy of that data after it passes through driver 715 and receiver 745. This checking determines whether driver 715 and receiver 745 are operating correctly. The output of comparator 755 is a CMP error signal which indicates the presence or absence of such a comparison error. The CMP error signal feeds the error logic in Fig. 9.
Two other elements in Fig. 7 provide a different kind of error detection. Element 760 is a parity generator. ECC
data, generated either by the memory controller 70 on data to be stored in memory module 60 or generated by memory module 60 on data read from memory module 60, is sent to parity generator 760. The parity signal from generator 760 is sent, via driver 762, to comparator 765. Comparator 765 compares the ECC parity signal from generator 760 with an equivalent ECC parity signal generated by controller 75'.
Parity generator 770 performs the same type of check on the row/column and single bit board address signals received from demultiplexer 720. The address parity signal from parity generator 770 is transmitted by a driver 772 to a comparator 775 which also receives an address parity signal from controller 75. The outputs of comparators 765 and 775 are parity error signals which feed the error logic in Fig. 9.
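The parity cross-check compresses a multi-bit signal on each rail to a single parity bit and compares the two bits across rails. A minimal C sketch follows; the helper names are assumptions, and the real comparison is done in hardware by generators 760 and 770 and comparators 765 and 775.

#include <stdint.h>
#include <stdio.h>

/* Reduce a multi-bit signal to a single even-parity bit, as a
 * hardware parity tree such as generator 760 or 770 would. */
static unsigned parity_of(uint32_t value)
{
    unsigned p = 0;
    while (value) {
        p ^= value & 1u;
        value >>= 1;
    }
    return p;
}

/* Compare the locally generated parity bit with the parity bit
 * received from the other rail; a mismatch raises a parity error. */
static int parity_miscompare(uint32_t local_signal, unsigned remote_parity)
{
    return parity_of(local_signal) != remote_parity;
}

int main(void)
{
    uint32_t ecc_bits = 0x5B;                  /* signal on this rail */
    unsigned remote   = parity_of(0x5B);       /* other rail agrees */
    printf("parity error: %d\n", parity_miscompare(ecc_bits, remote));     /* 0 */
    printf("parity error: %d\n", parity_miscompare(ecc_bits ^ 4, remote)); /* 1 */
    return 0;
}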
Fig. 8 shows the fundamentals of DMA engine 800. In the preferred embodiment, DMA engine 800 resides in memory controller 70, but there is no requirement for such placement. As shown in Fig. 8, DMA engine 800 includes a data router 810, a DMA control 820, and DMA registers 830. Driver 815 and receiver 816 provide an interface between memory controller 70 and cross-link 90.
DMA control 820 receives internal control signals from control logic 700 and, in response, sends control signals to place data router 810 into the appropriate configuration. Control 820 also causes data router 810 to set its configuration to route data and control signals from cross-link 90 to the memory controller 70 circuitry shown in Fig. 7. Data router 810 sends its status signals to DMA control 820 which relays such signals, along with other DMA information, to error logic in Fig. 9.

Registers 830 include a DMA byte counter register 832 and a DMA address register 836. These registers are set to initial values by CPU 40 via router 810. Then, during DMA cycles, control 820 causes, via router 810, the counter register 832 to increment and address register 836 to decrement. Control 820 also causes the contents of address register 836 to be sent to memory module 60 through router 810 and the circuitry in Fig. 7 during DMA operations.
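The stepping of the byte counter and address register can be pictured as a simple loop. This sketch follows the register directions stated above (counter incrementing toward a limit, address decrementing); the structure and function names are hypothetical, and the actual stepping is performed by DMA control 820 through data router 810.

#include <stdint.h>
#include <stdio.h>

/* Hypothetical images of DMA byte counter register 832 and DMA
 * address register 836, initialized by the CPU before the transfer. */
struct dma_regs {
    uint32_t byte_counter;   /* counts bytes transferred so far */
    uint32_t byte_limit;     /* total bytes requested by the CPU */
    uint32_t address;        /* next memory-module address to present */
};

/* One DMA cycle: present the current address, then step the registers
 * (counter up, address down), per the description above. */
static int dma_cycle(struct dma_regs *r)
{
    if (r->byte_counter >= r->byte_limit)
        return 0;                           /* transfer complete */
    printf("presenting address 0x%08X\n", r->address);
    r->byte_counter += 4;                   /* one 32-bit word per cycle */
    r->address      -= 4;
    return 1;                               /* more cycles remain */
}

int main(void)
{
    struct dma_regs r = { 0, 16, 0x0001000C };   /* CPU-programmed initial values */
    while (dma_cycle(&r))
        ;
    printf("bytes moved: %u\n", r.byte_counter);
    return 0;
}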
As explained above, in the preferred embodiment of this invention, the memory controllers 70, 75, 70' and 75' also perform certain fundamental error operations. An example of the preferred embodiment of the hardware to perform such error operations is shown in Fig. 9.
As shown in Fig. 9, certain memory controller internal signals, such as timeout, ECC error and bus miscompare, are inputs into diagnostic error logic 870, as are certain external signals such as rail error, firewall miscompare, and address/control error. In the preferred embodiment, diagnostic error logic 870 receives error signals from the other components of system 10 via cross-links 90 and 95.
Diagnostic error logic 870 forms error pulses from the error signals and from a control pulse signal generated from the basic timing of memory controller 70. The error pulses generated by diagnostic error logic 870 contain certain error information which is stored into appropriate locations in a diagnostic error register 880 in accordance with certain timing signals. System fault error address register 865 stores the address in memory module 60 which CPUs 40 and 50 were communicating with when an error occurred.
The error pulses from diagnostic error logic 870 are also sent to error categorization logic 850 which also receives information from CPU 30 indicating the cycle type (e.g., read, write, etc.). From that information and the error pulses, error categorization logic 850 determines the presence of CPU/IO errors, DMA errors, or CPU/MEM faults.
A CPU/IO error is an error on an operation that is directly attributable to a CPU/IO cycle on bus 46 and may be hardware recoverable, as explained below in regard to resets. DMA errors are errors that occur during a DMA cycle and, in the preferred embodiment, are handled principally by software. CPU/MEM faults are errors for which the correct operation of the CPU or the contents of memory cannot be guaranteed.
The outputs from error categorization logic 850 are sent to encoder 855 which forms a specific error code. This error code is then sent to cross-links 90 and 95 via AND gate 856 when the error disable signal is not present.
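The categorization of an error pulse into a CPU/IO error, a DMA error, or a CPU/MEM fault, and the gating of the resulting error code by the error disable signal, can be sketched as follows. The cycle names and encoding are illustrative assumptions; the real logic is the hardware of error categorization logic 850, encoder 855, and AND gate 856.

#include <stdio.h>

enum cycle_type { CYCLE_CPU_IO, CYCLE_DMA };

enum error_category { ERR_NONE, ERR_CPU_IO, ERR_DMA, ERR_CPU_MEM };

/* Categorize an error pulse using the cycle type reported by the CPU
 * and a flag saying whether CPU or memory state can still be trusted. */
static enum error_category categorize(int error_pulse,
                                      enum cycle_type cycle,
                                      int cpu_mem_state_suspect)
{
    if (!error_pulse)
        return ERR_NONE;
    if (cpu_mem_state_suspect)
        return ERR_CPU_MEM;                /* correct CPU/memory operation not guaranteed */
    return (cycle == CYCLE_DMA) ? ERR_DMA : ERR_CPU_IO;
}

/* Form an error code and gate it with the error disable signal,
 * as AND gate 856 gates the output of encoder 855. */
static unsigned gated_error_code(enum error_category cat, int error_disable)
{
    unsigned code = (unsigned)cat;         /* illustrative encoding only */
    return error_disable ? 0 : code;
}

int main(void)
{
    enum error_category cat = categorize(1, CYCLE_DMA, 0);
    printf("code sent to cross-links: %u\n", gated_error_code(cat, 0)); /* 2 (DMA error) */
    printf("code when disabled:       %u\n", gated_error_code(cat, 1)); /* 0 */
    return 0;
}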
After receiving the error codes, cross-links 90, 95, 90' and 95' send a retry request signal back to the memory controllers. As shown in Fig. 9, an encoder 895 in memory controller 70 receives the retry request signal along with cycle type information and the error signals (collectively shown as cycle qualifiers). Encoder 895 then generates an appropriate error code for storage in a system fault error register 898.
System fault error register 898 does not store the same information as diagnostic error register 880. Unlike system fault error register 898, diagnostic error register 880 only contains rail-unique information, such as an error on one input from a cross-link rail, and zone-unique data, such as an uncorrectable ECC error in memory module 60.
System fault error register 898 also contains several bits which are used for error handling. These include an NXM bit indicating that a desired memory location is missing, an NXIO bit indicating that a desired I/O location is missing, a solid fault bit, and a transient bit. The transient and solid bits together indicate the fault level. The transient bit also causes system fault error address register 865 to freeze.
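The NXM, NXIO, solid, and transient bits can be pictured as a small bitfield. The bit positions in this C sketch are arbitrary illustrative choices, not the register layout of the patent.

#include <stdio.h>

/* Illustrative bit assignments for system fault error register 898.
 * The actual positions are not specified in this portion of the text. */
#define FAULT_NXM        (1u << 0)   /* desired memory location is missing */
#define FAULT_NXIO       (1u << 1)   /* desired I/O location is missing */
#define FAULT_SOLID      (1u << 2)   /* solid (persistent) fault */
#define FAULT_TRANSIENT  (1u << 3)   /* transient fault; also freezes register 865 */

struct fault_state {
    unsigned fault_reg;              /* image of system fault error register 898 */
    int      addr_reg_frozen;        /* system fault error address register 865 */
};

/* Record a fault; the transient bit freezes the fault address register. */
static void record_fault(struct fault_state *s, unsigned bits)
{
    s->fault_reg |= bits;
    if (bits & FAULT_TRANSIENT)
        s->addr_reg_frozen = 1;
}

int main(void)
{
    struct fault_state s = { 0, 0 };
    record_fault(&s, FAULT_NXM | FAULT_TRANSIENT);
    printf("fault register: 0x%X, address register frozen: %d\n",
           s.fault_reg, s.addr_reg_frozen);   /* 0x9, 1 */
    return 0;
}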
Memory controller status register 875, although technically not part of the error logic, is also shown in Fig. 9. Register 875 stores certain status information such as a DMA ratio code in DMA ratio portion 877, an error disable code in error disable portion 878, and a mirror bus driver enable code in mirror bus driver enable portion 876. The DMA ratio code specifies the fraction of memory bandwidth which can be allotted to DMA. The error disable code provides a signal for disabling AND gate 856 and thus the error code. The mirror bus driver enable code provides a signal for enabling the mirror bus drivers for certain data transactions.
4. Cross-link
Data for memory resync, DMA and I/O operations pass
through cross-links 90 and 95. Generally, cross-links 90 and 95 provide communications between CPU module 30, CPU module 30', I/O modules 100, 110, 120, and I/O modules 100', 110', 120' (see Fig. 1).
Cross-links 90 and 95 contain both parallel registers 910 and serial registers 920 as shown in Fig. 10. Both types of registers are used for interprocessor communication in the preferred embodiment of this invention. During normal operation, processing systems 20 and 20' are synchronized and data is exchanged in parallel between processing systems 20 and 20' using parallel registers 910 in cross-links 90/95 and 90'/95', respectively. When processing systems 20 and 20' are not synchronized, most notably during bootstrapping, data is exchanged between cross-links by way of serial registers 920.
The addresses of the parallel registers are in I/O space as opposed to memory space. Memory space refers to locations in memory module 60. I/O space refers to locations such as I/O and internal system registers, which are not in memory module 60.

Within I/O space, addresses can either be in system address space or zone address space. The term "system address space" refers to addresses that are accessible throughout the entire system 10, and thus by both processing systems 20 and 20'. The term "zone address space" refers to addresses which are accessible only by the zone containing the particular cross-link.
The parallel registers shown in Fig. 10 include a communications register 906 and an I/O reset register 908. Communications register 906 contains unique data to be exchanged between zones. Such data is usually zone-unique, such as a memory soft error (it is almost beyond the realm of probability that memory modules 60 and 60' would independently experience the same error at the same time).
Because the data to be stored into register 906 is unique, the address of communications register 906 for purposes of writing must be in zone address space. Otherwise, processing systems 20 and 20', because they are in lockstep synchronization and executing the same series of instructions at substantially the same time, could not store zone-unique data into only the communications register 906 in zone 11; they would have to store that same data into the communications register 906' (not shown) in zone 11'.
The address of communications register 906 for reading, however, is in system address space. Thus, during synchronous operation, both zones can simultaneously read the communications register from one zone and then simultaneously read the communications register from the other zone.
I/O reset register 908 resides in system address space. The I/O reset register includes one bit per I/O module to indicate whether the corresponding module is in a reset state. When an I/O module is in a reset state, it is effectively disabled.
Parallel registers 910 also include other registers, but an understanding of those other registers is not necessary to an understanding of the present invention.
All of the serial cross-link registers 920 are in the zone specific space since they are used either for asynchronous communication or contain only zone specific information. The purpose of the serial cross-link registers and the serial cross-link is to allow processing systems 20 and 20' to communicate even though they are not running in lockstep synchronization (i.e., with phase-locked clocks and the same memory states). In the preferred embodiment, there are several serial registers, but they need not be described to understand this invention.
Control and status register 912 is a serial register which contains status and control flags. One of the flags is an OSR bit 913 which is used for bootstrapping and indicates whether the processing system in the corresponding zone has already begun its bootstrapping process or whether the operating system for that zone is currently running, either because its bootstrapping process has completed, or because it underwent a resynchronization.
Control and status register 912 also contains the mode bits 914 for identifying the current mode of cross-link 90 and thus of processing system 20. Preferably, the mode bits include resync mode bits 915 and cross-link mode bits 916. Resync mode bits 915 identify cross-link 90 as being either in resync slave or resync master mode. The cross-link mode bits 916 identify cross-link 90 as being either in cross-link off, duplex, cross-link master, or cross-link slave mode.
One of the uses for the serial registers is a status read operation which allows the cross-link in one zone to read the status of the other zone's cross-link. Setting a status read request flag 918 in serial control and status register 912 sends a request for status information to cross-link 90'. Upon receipt of this message, cross-link 90' sends the contents of its serial control and status register 912 back to cross-link 90.
Fig. 11 shows some of the elements for routing control and status signals (referred to as control codes) in primary cross-link 90 and mirror cross-link 95. Corresponding cross-link elements exist in the preferred embodiment within cross-links 90' and 95'. These codes are sent between the memory controllers 70 and 75 and the I/O modules coupled to module interconnects 130, 132, 130' and 132'.

Fig. 12 shows the elements in the preferred embodiment of primary cross-link 90 which are used for routing data and address signals. Corresponding cross-link elements exist in cross-links 95, 90' and 95'.
In Fig. 11, the elements for both the primary cross-link 90 and mirror cross-link 95 in processing system 20 are shown, although the hardware is identical, because of an important interconnection between the elements. The circuit elements in mirror cross-link 95 which are equivalent to elements in primary cross-link 90 are shown by the same number, except in the mirror controller the letter "m" is placed after the number.
With reference to Figs. 11 and 12, the elements include latches, multiplexers, drivers and receivers. Some of the latches, such as latches 933 and 933m, act as delay elements to ensure the proper timing through the cross-links and thereby maintain synchronization. As shown in Fig. 11, control codes from memory controller 70 are sent via bus 88 to latch 931 and then to latch 932. The reason for such latching is to provide appropriate delays to ensure that data from memory controller 70 passes through cross-link 90 simultaneously with data from memory controller 70'.
If codes from memory controller 70 are to be sent to processing system 20' via cross-link 90', then driver 937 is enabled. The control codes from memory controller 70 also pass through latch 933 and into multiplexer CSMUXA 935. If control codes are received into primary cross-link 90 from cross-link 90', then their path is through receiver 936 into latch 938 and also into multiplexer 935.
Control codes to multiplexer 935 determine the source of data, that is, either from memory controller 70 or from memory controller 70', and place those codes on the output of multiplexer 935. That output is stored in latch 939, again for proper delay purposes, and driver 940 is enabled if the codes are to be sent to module interconnect 130.
The path for data and address signals, as shown in Fig. 12, is somewhat similar to the path of control signals shown in Fig. 11. The differences reflect the fact that during any one transaction, data and addresses are flowing in only one direction through cross-links 90 and 95, but control signals can be flowing in both directions during that transaction.
For that same reason the data lines in busses 88 and 89 are bidirectional, but the control codes are not.
Data and addresses from the memory controller 70, via bus 88, enter latch 961, then latch 962, and then latch 964. As in Fig. 11, the latches in Fig. 12 provide proper timing to maintain synchronization. Data from memory controller 70' is buffered by receiver 986, stored in latch 988, and then routed to the input of multiplexer MUXA 966. The output of multiplexer 966 is stored in latch 968 and, if driver 969 is enabled, is sent to module interconnect 130.


The path for control codes to be sent to memory controller 70 is shown in Fig. 11. Codes from module interconnect 130 are first stored in latch 941 and then presented to multiplexer CSMUXC 942. Multiplexer 942 also receives control codes from parallel cross-link registers 910 and selects either the parallel register codes or the codes from latch 941 for transmission to latch 943. If those control codes are to be transmitted to cross-link 90', then driver 946 is enabled. Control codes from cross-link 90' (and thus from memory controller 70') are buffered by receiver 947, stored in latch 948, and presented as an input to multiplexer CSMUXD 945. CSMUXD 945 also receives as an input the output of latch 944 which stores the contents of latch 943.
Multiplexer 945 selects either the codes from module interconnect 130 or from cross-link 90' and presents those signals as an input to multiplexer CSMUXE 949. Multiplexer 949 also receives as inputs a code from the decode logic 970 (for bulk memory transfers that occur during resynchronization), codes from the serial cross-link registers 920, or a predetermined error code ERR. Multiplexer 949 then selects one of those inputs, under the appropriate control, for storage in latch 950. If those codes are to be sent to memory controller 70, then driver 951 is activated.
The purpose of the error code ERR, which is an input into multiplexer 949, is to ensure that an error in one of the rails will not cause the CPUs in the same zone as the rails to process different information. If this occurred, CPU module 30 would detect a fault which would cause drastic, and perhaps unnecessary, action. To avoid this, cross-link 90 contains an EXCLUSIVE OR gate 960 which compares the outputs of multiplexers 945 and 945m. If they differ, then gate 960 causes multiplexer 949 to select the ERR code. EXCLUSIVE OR gate 960m similarly causes multiplexer 949m also to select an ERR code. This code indicates to memory controllers 70 and 75 that there has been an error, but avoids causing a CPU module error. The single rail interface to memory module 60 accomplishes the same result for data and addresses.
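The effect of EXCLUSIVE OR gates 960 and 960m on multiplexers 949 and 949m can be sketched as a selection function: when the two rails' candidate codes differ, both rails substitute the predetermined ERR code so the CPUs in the zone never see divergent information. The value of ERR and the function names below are illustrative assumptions.

#include <stdint.h>
#include <stdio.h>

#define ERR_CODE 0xFFu    /* predetermined error code; the actual value is an assumption */

/* Model of one rail's multiplexer 949/949m output stage: if the two
 * rails' candidate codes miscompare (the XOR gate fires), select ERR
 * on this rail so both CPUs receive identical information. */
static uint8_t select_code(uint8_t this_rail_code, uint8_t other_rail_code)
{
    if (this_rail_code != other_rail_code)  /* EXCLUSIVE OR gate detects a difference */
        return ERR_CODE;
    return this_rail_code;
}

int main(void)
{
    /* Rails agree: the code passes through unchanged on both. */
    printf("primary: 0x%02X  mirror: 0x%02X\n",
           select_code(0x21, 0x21), select_code(0x21, 0x21));

    /* Rails disagree: both rails substitute ERR, avoiding a CPU module error. */
    printf("primary: 0x%02X  mirror: 0x%02X\n",
           select_code(0x21, 0x25), select_code(0x25, 0x21));
    return 0;
}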
The data and address flow shown in Fig. 12 is similar to the flow of control signals in Fig. 11. Data and addresses from module interconnect 130 are stored in latch 972 and then provided as an input to multiplexer MUXB 974. Data from the parallel registers 910 provide another input to multiplexer 974. The output of multiplexer 974 is an input to multiplexer MUXC 976 which also receives data and addresses stored in latch 961 that were originally sent from memory controller 70. Multiplexer 976 then selects one of the inputs for storage in latch 978. If the data and addresses, either from the module interconnect 130 or from the memory controller 70, are to be sent to cross-link 90', then driver 984 is enabled.


Data from cross-link 90' is buffered by receiver 986 and stored in latch 988, which also provides an input to multiplexer MUXD 982. The other input of multiplexer MUXD 982 is the output of latch 980 which contains data and addresses from latch 978. Multiplexer 982 then selects one of its inputs which is then stored into latch 990. If the data or addresses are to be sent to memory controller 70, then driver 992 is activated. Data from serial registers 920 are sent to memory controller 70 via driver 994.
The data routing in cross-link 90, and more particularly the control elements in both Figs. 11 and 12, is controlled by several signals generated by decode logic 970, decode logic 971, decode logic 996, and decode logic 998. This logic provides the signals which control multiplexers 935, 942, 945, 949, 966, 974, 976, and 982 to select the appropriate input source. In addition, the decode logic also controls drivers 940, 946, 951, 969, 984, 992, and 994.
Most of the control signals are generated by decode logic 998, but some are generated by decode logic 970, 971, 970m, 971m, and 996. Decode logic 998, 970 and 970m are connected at positions that will ensure that the logic will receive the data and codes necessary for control whether the data and codes are received from its own zone or from the other zone.
The purpose of decode logic 971, 971m and 996 is to ensure that the drivers 937, 937m and 984 are set into the proper state. This "early decode" makes sure that data, addresses and codes will be forwarded to the proper cross-links in all cases. Without such early decode logic, the cross-links could all be in a state with their drivers disabled.
If one of the memory controllers were also disabled, then its cross-links would never receive addresses, data and control codes, effectively disabling all the I/O modules connected to that cross-link.
Prior to describing the driver control signals generated by decode logic 970, 971, 970m, 971m, and 998, it is necessary to understand the different modes that these zones, and therefore the cross-links 90 and 95, can be in. Fig. 13 contains a diagram of the different states A-F, and a table explaining the states which correspond to each mode.
At start-up and in other instances, both zones are in state A, which is known as the OFF mode for both zones. In that mode, the computer systems in both zones are operating independently. After one of the zones' operating systems requests the ability to communicate with the I/O of the other zone, and that request is honored, then the zones enter the master/slave mode, shown as states B and C. In such modes, the zone which is the master has an operating CPU and has control of the I/O modules of its zone and of the other zone.
Upon initiation of resynchronization, the computer system leaves the master/slave modes, either states B or C, and enters a resync slave/resync master mode, which is shown as states E and F. In those modes, the zone that was the master zone is in charge of bringing the CPUs of the other zone on line. If the resynchronization fails, the zones revert to the same master/slave mode that they were in prior to the resynchronization attempt.
If the resynchronization is successful, however, then the zones enter state D, which is the full duplex mode. In this mode, both zones are operating together in lockstep synchronization. Operation continues in this mode until there is a CPU/MEM fault, in which case the system enters one of the two master/slave modes. The slave is the zone whose processor experienced the CPU/MEM fault.
When operating in state D, the full duplex mode, certain errors, most notably clock phase errors, necessitate splitting the system into two independent processing systems. This causes system 10 to go back into state A.
Decode logic 970, 970m, 971, 971m, and 998 (collectively referred to as the cross-link control logic), which are shown in Figs. 11 and 12, have access to the resync mode bits 915 and the cross-link mode bits 916, which are shown in Fig. 10, in order to determine how to set the cross-link drivers and multiplexers into the proper states. In addition, the cross-link decode logic also receives and analyzes a portion of an address sent from memory controllers 70 and 75 during data transactions to extract addressing information that further indicates to the cross-link decode logic how to set the state of the cross-link multiplexers and drivers.
The information needed to set the states of the multiplexers is fairly straightforward once the different modes and transactions are understood. The only determination to be made is the source of the data. Thus when cross-links 90 and 95 are in the slave mode, multiplexers 935, 935m, and 966 will select data, addresses and codes from zone 11'. Those multiplexers will also select data, addresses and codes from the other zone if cross-links 90 and 95 are in full duplex mode, the address of an I/O instruction is for a device connected to an I/O module in zone 11, and the cross-link with the affected multiplexer is in a cross-over mode. In a cross-over mode, the data to be sent on the module interconnect is to be received from the other zone for checking. In the preferred embodiment, module interconnect 130 would receive data, addresses and codes from the primary rail in zone 11 and module interconnect 132 would receive data, addresses and codes from the mirror rail in zone 11'. Alternatively, module interconnect 132 could receive data, addresses and codes from the primary rail in zone 11', which would allow the primary rail of one zone to be compared with the mirror rail of the other zone.
Multiplexers 945, 945m, and 982 will be set to accept data, addresses and codes from whichever zone is the source of the data. This is true both when all the cross-links are in full duplex mode and the data, addresses and codes are received from I/O modules, and when the cross-link is in a resync slave mode and the data, addresses and codes are received from the memory controllers of the other zone.
If the addressing information from memory controllers 70 and 75 indicates that the source of response data and codes is the cross-link's own parallel registers 910, then multiplexers 942, 942m, and 974 are set to select data and codes from those registers. Similarly, if the addressing information from memory controllers 70 and 75 indicates that the source of response data is the cross-link's own serial registers 920, then multiplexers 949 and 949m are set to select data and codes from those registers.
Multiplexers 949 and 949m are also set to select data from decode logic 970 and 970m, respectively, if the information is a control code during memory resync operations, and to select the ERR code if the EXCLUSIVE OR gates 960 and 960m identify a miscompare between the data transmitted via cross-links 90 and 95. In this latter case, the control of the multiplexers 949 and 949m is generated from the EXCLUSIVE OR gates 960 and 960m rather than from the cross-link control logic. Multiplexers 949 and 949m also select codes from serial cross-link registers 920 when those registers are requested, or the output of multiplexers 945 and 945m when those codes are requested. Multiplexers 945 and 945m select


either the outputs from multiplexers 942 and 942m, respectively, or I/O codes from cross-links 90' and 95', respectively.
Multiplexer 976 selects either data and addresses from module interconnect 130 in the case of a transaction with an I/O module, or data and addresses from memory controller 70 when the data and addresses are to be sent to cross-link 90' either for I/O or during memory resynchronization.
Drivers 937 and 937m are activated when cross-links 90 and 95 are in duplex, master or resync master modes. Drivers 940 and 940m are activated for I/O transactions in zone 11.
Drivers 946 and 946m are activated when cross-links 90 and 95 are in the duplex or slave modes. Drivers 951 and 951m are always activated.
Driver 969 is activated during I/O writes to zone 11.
Driver 984 is activated when cross-link 90 is sending data and addresses to I/O in zone 11', or when cross-link 90 is in the resync master mode. Receiver 986 receives data from cross-link 90'. Drivers 992 and 994 are activated when data is being sent to memory controller 70; driver 994 is activated when the contents of the serial cross-link registers 920 are read, and driver 992 is activated during all other reads.
5. Oscillator
When both processing systems 20 and 20' are performing the same functions in the full duplex mode, it is imperative that CPU modules 30 and 30' perform operations at the same rate. Otherwise, massive amounts of processing time will be consumed in resynchronizing processing systems 20 and 20' for I/O and interprocessor error checking. In the preferred embodiment of processing systems 20 and 20', their basic clock signals are synchronized and phase-locked to each other. The fault tolerant computing system 10 includes a timing system to control the frequency of the clock signals to processing systems 20 and 20' and to minimize the phase difference between the clock signals for each processing system.
Fig. 14 shows a block diagram of the timing system of this invention embedded in processing systems 20 and 20'. The timing system comprises oscillator system 200 in CPU module 30 of processing system 20, and oscillator system 200' in CPU module 30' of processing system 20'. The elements of oscillator 200' are equivalent to those for oscillator 200 and both oscillator systems' operation is the same. Thus, only the elements and operation of oscillator system 200 will be described, except if the operations of oscillator systems 200 and 200' differ.
As Fig. 14 shows, much of oscillator system 200, specifically the digital logic, lies inside of cross-link 95, but that placement is not required for the present invention. Oscillator system 200 includes a voltage-controlled crystal oscillator (VCXO) 205 which generates a basic oscillator signal, preferably at 66.66 MHz. The frequency of VCXO 205 can be adjusted by the voltage level at the input.
Clock distribution chip 210 divides down the basic oscillator signal and preferably produces four primary clocks all having the same frequency. For primary CPU 40 the clocks are PCLK L and PCLK H, which are logical inverses of each other. For mirror CPU 50, clock distribution chip 210 produces clock signals MCLK L and MCLK H, which are also logical inverses of each other. The timing and phase relationships of these clock signals are shown in Fig. 15. Preferably, the frequency of clock signals PCLK L, PCLK H, MCLK L, and MCLK H is about 33.33 MHz. Clock chip 210 also produces a phase-locked loop signal CLKC H at 16.66 MHz, also shown in Fig. 15. This phase locked loop signal is sent to clock logic 220 which buffers that signal.
Clock logic buffer 220 sends the CLKC H signal to oscillator 200' for use in synchronization. Clock logic buffer 220' in oscillator 200' sends its own buffered phase-locked loop signal CLKC' H to phase detector 230 in oscillator 200. Phase detector 230 also receives the buffered phase locked loop signal CLKC H from clock logic 220 through delay element 225. Delay element 225 approximates the delay due to the cable run from clock logic buffer 220'.
Phase detector 230 compares its input phase locked loop signals and generates two outputs. One is a phase differences signal 235 which is sent through loop amplifier 240 to the voltage input of VCXO 205. Phase differences signal 235 will cause amplifier 240 to generate a signal to alter the frequency of VCXO 205 to compensate for phase differences.
The other output of phase detector 230 is a phase error signal 236 which indicates possible synchronism faults.
Fig. 16 is a detailed diagram of phase detector 230.
Phase detector 230 includes a phase comparator 232 and a voltage comparator 234. Phase comparator 232 receives the clock signal from delay element 225 (CLKC H) and the phase lock loop clock signal from oscillator 200' (CLKC' H) and generates phase differences signal 235 as a voltage level representing the phase difference of those signals.
If processing system 20 were the "slave" for purposes of clock synchronization, switch 245 would be in the "SLAVE" position (i.e., closed) and the voltage level 235, after being amplified by loop amplifier 240, would control the frequency of VCXO 205. If both switches 245 and 245' are in the "master" position, processing systems 20 and 20' would not be phase-locked and would be running asynchronously (independently).
The voltage level of phase differences signal 235 is also an input to voltage comparator 234, as are two reference voltages, Vref1 and Vref2, representing acceptable ranges of phase lead and lag. If the phase difference is within tolerance, the PHASE ERROR signal will not be activated. If the phase difference is out of tolerance, then the PHASE ERROR signal 236 will be activated and sent to cross-link 95 via clock decoder 220.
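The behavior of phase comparator 232 and voltage comparator 234 can be summarized numerically: the phase difference is converted to a level that steers the VCXO, and the same level is compared against two reference thresholds to decide whether to assert PHASE ERROR. The units and threshold values in this C sketch are illustrative assumptions.

#include <stdio.h>

/* Illustrative thresholds standing in for Vref1 and Vref2: the
 * acceptable range of phase lead and lag, in arbitrary units. */
#define VREF_LAG   -1.0
#define VREF_LEAD   1.0

struct phase_outputs {
    double phase_diff;    /* analogue of phase differences signal 235 */
    int    phase_error;   /* analogue of PHASE ERROR signal 236 */
};

/* Compare the local (delayed) clock phase with the other zone's clock
 * phase; the difference steers the VCXO, and an out-of-tolerance
 * difference raises PHASE ERROR. */
static struct phase_outputs phase_detect(double local_phase, double remote_phase)
{
    struct phase_outputs out;
    out.phase_diff  = local_phase - remote_phase;
    out.phase_error = (out.phase_diff < VREF_LAG) || (out.phase_diff > VREF_LEAD);
    return out;
}

int main(void)
{
    struct phase_outputs ok  = phase_detect(0.2, 0.0);   /* within tolerance */
    struct phase_outputs bad = phase_detect(1.8, 0.0);   /* out of tolerance */
    printf("diff=%.2f error=%d\n", ok.phase_diff,  ok.phase_error);   /* error=0 */
    printf("diff=%.2f error=%d\n", bad.phase_diff, bad.phase_error);  /* error=1 */
    return 0;
}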
6. I/O Module
Fig. 17 shows a preferred embodiment of an I/O module 100. The principles of operation of I/O module 100 are applicable to the other I/O modules as well.
Fig. 18 shows the elements in the preferred embodiment of firewall 1000. Firewall 1000 includes a 16 bit bus interface 1810 to module interconnect 130 and a 32 bit bus interface 1820 for connection to bus 1020 shown in Fig. 17. Interfaces 1810 and 1820 are connected by an internal firewall bus 1815 which also interconnects with the other elements of firewall 1000. Preferably bus 1815 is a parallel bus either 16 or 32 bits wide.
I/O module 100 is connected to CPU module 30 by means of dual rail module interconnects 130 and 132. Each of the module interconnects is received by firewalls 1000 and 1010, respectively. One of the firewalls, which is usually, but not always, firewall 1000, writes the data from module interconnect 130 onto bus 1020. The other firewall, in this case firewall 1010, checks that data against its own copy received from module interconnect 132 using firewall comparison circuit 1840 shown in Fig. 18. That checking is effective due to the lockstep synchronization of CPU modules 30 and 30', which causes data written to I/O module 100 from CPU modules 30 and 30' to be available at firewalls 1000 and 1010 substantially simultaneously.
Firewall comparison circuit 1840 only checks data received from CPU modules 30 and 30'. Data sent to CPU modules 30 and 30' from an I/O device have a common origin and thus do not require checking. Instead, data received from an I/O device to be sent to CPU modules 30 and 30' is checked by an error detection code (EDC), such as a cyclic redundancy check (CRC), which is performed by EDC/CRC generator 1850. EDC/CRC generator 1850 is also coupled to internal firewall bus 1815.
EDC/CRC generator 1850 generates and checks the same EDC/CRC code that is used by the I/O device. Preferably, I/O module 100 generates two EDCs. One, which can also be an EDC/CRC, is used for an interface to a network, such as the Ethernet packet network to which module 100 is coupled (see element 1082 in Fig. 17). The other is used for a disk interface such as disk interface 1072 in Fig. 17.
EDC/CRC coverage is not required between CPU module 30 and I/O module 100 because the module interconnects are duplicated. For example, in CPU module 30, cross-link 90 communicates with firewall 1000 through module interconnect 130, and cross-link 95 communicates with firewall 1010 through module interconnect 132.
A message received from Ethernet network 1082 is checked for a valid EDC/CRC by network control 1080 shown in Fig. 17.

The data, complete with EDC/CRC, is written to a local RAM 1060 also shown in Fig. 17. All data in local RAM 1060 is transferred to memory module 60 using DMA. A DMA control 1890 coordinates the transfer and directs EDC/CRC generator 1850 to check the validity of the EDC/CRC encoded data being transferred.
Most data transfers with an I/O device are done with DMA. Data is moved between main memory and I/O buffer memory. When data is moved from the main memory to an I/O buffer memory, an EDC/CRC may be appended. When the data is moved from I/O buffer memory to main memory, an EDC/CRC may be checked and moved to main memory or may be stripped. When data is moved from the I/O buffer memory through an external device, such as a disk or Ethernet adaptor, the EDC/CRC may be checked locally or at a distant receiving node, or both. The memory data packets may have their EDC/CRC generated at the distant node or by the local interface on the I/O module.
This operation ensures that data residing in or being transferred through a single rail system like I/O module 100 is covered by an error detection code, which is preferably at least as reliable as the communications media the data will eventually pass through. Different I/O modules, for example those which handle synchronous protocols, preferably have an EDC/CRC generator which generates and checks the EDC/CRC codes of the appropriate protocols.
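As one concrete example of the EDC/CRC coverage described above, the following C sketch appends a CRC-32 to a buffer and verifies it on the receiving side. The choice of CRC-32 (the polynomial commonly used on Ethernet frames) is an assumption for illustration; the patent only requires a code at least as reliable as the media the data passes through.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Bitwise CRC-32 (reflected form, polynomial 0xEDB88320), the CRC
 * commonly used on Ethernet frames; chosen here only as an example. */
static uint32_t crc32_of(const uint8_t *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Append the CRC to the data before it leaves the single-rail path. */
static size_t append_crc(uint8_t *buf, size_t len)
{
    uint32_t crc = crc32_of(buf, len);
    memcpy(buf + len, &crc, sizeof crc);   /* host byte order; illustrative only */
    return len + sizeof crc;
}

/* Check the CRC when the data is moved back; returns 1 if it is intact. */
static int check_crc(const uint8_t *buf, size_t total_len)
{
    uint32_t stored;
    size_t len = total_len - sizeof stored;
    memcpy(&stored, buf + len, sizeof stored);
    return crc32_of(buf, len) == stored;
}

int main(void)
{
    uint8_t packet[64] = "data held in local RAM";
    size_t  total = append_crc(packet, 23);
    printf("intact: %d\n", check_crc(packet, total));  /* 1 */
    packet[5] ^= 0x10;                                 /* corrupt one bit in transit */
    printf("intact: %d\n", check_crc(packet, total));  /* 0 */
    return 0;
}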


In general, DMA control 1890 handles the portion of a DMA operation specific to the shared memory controller 1050 and local RAM 1060 being addressed. The 32 bit bus 1020 is driven in two different modes. During DMA setup, DMA control 1890 uses bus 1020 as a standard asynchronous microprocessor bus. The address in local RAM 1060 where the DMA operation will occur is supplied by shared memory controller 1050 and DMA control 1890. During the actual DMA transfer, DMA control 1890 directs DMA control lines 1895 to drive bus 1020 in a synchronous fashion. Shared memory controller 1050 will transfer a 32 bit data word with bus 1020 every bus cycle, and DMA control 1890 keeps track of how many words are left to be transferred. Shared memory controller 1050 also controls local RAM 1060 and creates the next DMA address.
The I/O modules (100, 110, 120) are responsible for controlling the read/write operations to their own local RAM 1060. The CPU module 30 is responsible for controlling the transfer operations with memory array 600. The DMA engine 800 of memory controllers 70 and 75 (shown in Fig. 8) directs the DMA operations on the CPU module 30. This division of labor prevents a fault in the DMA logic on any module from degrading the data integrity on any other module in zones 11 or 11'.
The functions of trace RAM 1872 and trace RAM controller 1870 are described in greater detail below. Briefly, when a fault is detected and the CPUs 40, 40', 50 and 50' and CPU


modules 30 and 30' are notified, various trace RAMs throughout computer system 10 are caused to perform certain functions described below. The communication with the trace RAMs takes place over trace bus 1095. Trace RAM control 1870, in response to signals from trace bus 1095, causes trace RAM 1872 either to stop storing, or to dump its contents over trace bus 1095.
I/O module bus 1020, which is preferably a 32 bit parallel bus, couples to firewalls 1000 and 1010 as well as to other elements of the I/O module 100. A shared memory controller 1050 is also coupled to I/O bus 1020 in I/O module 100. Shared memory controller 1050 is coupled to a local memory 1060 by a shared memory bus 1065, which preferably carries 32 bit data. Preferably, local memory 1060 is a RAM with 256 Kbytes of memory, but the size of RAM 1060 is discretionary. The shared memory controller 1050 and local RAM 1060 provide memory capability for I/O module 100.
Disk controller 1070 provides a standard interface to a disk, such as disks 1075 and 1075' in Fig. 1. Disk controller 1070 is also coupled to shared memory controller 1050 either for use of local RAM 1060 or for communication with I/O module bus 1020.
A network controller 1080 provides an interface to a standard network, such as the ETHERNET network, by way of network interface 1082. Network controller 1080 is also coupled to shared memory controller 1050 which acts as an interface both to local RAM 1060 and I/O module bus 1020.
There is no requirement, however, for any one specific organization or structure of I/O module bus 1020.
PCIM (power and cooling interface module) support element 1030 is connected to I/O module bus 1020 and to an ASCII interface 1032. PCIM support element 1030 allows processing system 20 to monitor the status of the power system (i.e., batteries, regulators, etc.) and the cooling system (i.e., fans) to ensure their proper operation. Preferably, PCIM support element 1030 only receives messages when there is some fault or potential fault indication, such as an unacceptably low battery voltage. It is also possible to use PCIM support element 1030 to monitor all the power and cooling subsystems periodically. Alternatively, PCIM support element 1030 may be connected directly to firewalls 1000 and 1010.
Diagnostics microprocessor 1100 is also connected to the I/O module bus 1020. In general, diagnostics microprocessor 1100 is used to gather error checking information from trace RAMs, such as trace RAM 1872, when faults are detected. That data is gathered into trace buses 1095 and 1096, through firewalls 1000 and 1010, respectively, through module bus 1020, and into microprocessor 1100.


D. INTERPROCESSOR AND INTERMODULE COMMUNICATION
1. Data Paths
The elements of computer system 10 do not by themselves
constitute a fault tolerant system. There needs to be a communications pathway and protocol which allows communication during normal operations and operation during fault detection and correction. Key to such communication is cross-link pathway 25. Cross-link pathway 25 comprises the parallel links, serial links, and clock signals already described. These are shown in Fig. 19. The parallel link includes two identical sets of data and address lines, control lines, interrupt lines, coded error lines, and a soft reset request line. The data and address lines and the control lines contain information to be exchanged between the CPU modules, such as from the module interconnects 130 and 132 (or 130' and 132') or from memory module 60 (60').
The interrupt lines preferably contain one line for each of the interrupt levels available to the I/O subsystem (modules 100, 110, 120, 100', 110' and 120'). These lines are shared by cross-links 90, 95, 90' and 95'.
The coded error lines preferably include codes for synchronizing a console "HALT" request for both zones, one for synchronizing a CPU error for both zones, one for indicating the occurrence of a CPU/memory failure to the other zone, one for synchronizing DMA error for both zones, and one for indicating clock phase error. The error lines from each zone 11 or 11' are inputs to an OR gate, such as OR gate 1990 for zone 11 or OR gate 1990' for zone 11'. The output of each OR gate provides an input to the cross-links of the other zone.
The fault tolerant processing system 10 is designed to continue operating as a dual rail system despite transient faults. The I/O subsystem (modules 100, 110, 120, 100', 110', 120') can also experience transient errors or faults and continue to operate. In the preferred embodiment, an error detected by firewall comparison circuit 1840 will cause a synchronized error report to be made through pathway 25 for CPU directed operations. Hardware in CPU 30 and 30' will cause a synchronized soft reset through pathway 25 and will retry the faulted operation. For DMA directed operations, the same error detection results in synchronous interrupts through pathway 25, and software in CPUs 40, 50, 40' and 50' will restart the DMA operation.
Certain transient errors are not immediately recoverable to allow continued operation in a full-duplex, synchronized fashion. For example, a control error in memory module 60 can result in unknown data in memory module 60. In this situation, the CPUs and memory elements can no longer function reliably as part of a fail safe system so they are removed. Memory array 60 must then undergo a memory resync before the CPUs and memory elements can rejoin the system. The CPU/memory fault code of the coded error lines in pathway 25 indicates to CPU 30' that the CPUs and memory elements of CPU 30 have been faulted.
The control lines, which represent a combination of cycle type, error type, and ready conditions, provide the handshaking between CPU modules (30 and 30') and the I/O modules. Cycle type, as explained above, defines the type of bus operation being performed: CPU I/O read, DMA transfer, DMA setup, or interrupt vector request. Error type defines either a firewall miscompare or a CRC error. "Ready" messages are sent between the CPU and I/O modules to indicate the completion of requested operations.
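The handshake fields carried on the control lines can be pictured as small enumerations. The names below mirror the categories listed in the text; the encodings and the exchange routine are illustrative assumptions only.

#include <stdio.h>

/* Cycle types carried on the control lines between CPU and I/O modules. */
enum bus_cycle {
    CYCLE_CPU_IO_READ,
    CYCLE_DMA_TRANSFER,
    CYCLE_DMA_SETUP,
    CYCLE_INTERRUPT_VECTOR_REQUEST
};

/* Error types: a firewall miscompare or a CRC error. */
enum bus_error { BUS_ERROR_NONE, BUS_ERROR_FIREWALL_MISCOMPARE, BUS_ERROR_CRC };

/* One handshake exchange: the CPU module names a cycle; the I/O module
 * answers with an error type and a ready indication when it completes. */
struct handshake {
    enum bus_cycle cycle;
    enum bus_error error;
    int            ready;
};

static struct handshake run_cycle(enum bus_cycle cycle, int inject_crc_error)
{
    struct handshake h = { cycle, BUS_ERROR_NONE, 0 };
    if (inject_crc_error)
        h.error = BUS_ERROR_CRC;
    h.ready = 1;               /* "Ready" message: requested operation finished */
    return h;
}

int main(void)
{
    struct handshake h = run_cycle(CYCLE_DMA_TRANSFER, 1);
    printf("cycle=%d error=%d ready=%d\n", h.cycle, h.error, h.ready);
    return 0;
}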
The serial cross-link includes two sets of two lines to provide a serial data transfer for a status read, loopback, and data transfer.
The clock signals exchanged are the phase locked clock signals CLKC H and CLKC' H (delayed).
Figs. 20A-D show block diagrams of the elements of CPU modules 30 and 30' and I/O modules 100 and 100' through which data passes during the different operations. Each of those elements has been described previously.
Fig. 20A shows the data pathways for a typical CPU I/O read operation of data from an I/O module 100, such as a CPU I/O register read operation of register data from shared memory controller 1050 (1050'). Such an operation will be referred to as a read of local data, to distinguish it from a DMA read of data from local memory 1060, which usually contains data from an internal device controller. The local data are presumed to be stored in local RAM 1060 (1060') for transfer through shared memory controller 1050 (1050'). For one path, the data pass through firewall 1000, module interconnect 130, to cross-link 90. As seen in Fig. 12, cross-link 90 delays the data from firewall 1000 to memory controller 70 so that the data to cross-link 90' may be presented to memory controller 70' at the same time the data are presented to memory controller 70, thus allowing processing systems 20 and 20' to remain synchronized. The data then proceed out of memory controllers 70 and 70' into CPUs 40 and 40' by way of internal busses 46 and 46'.
A similar path is taken for reading data into CPUs 50 and 50'. Data from the shared memory controller 1050 proceeds through firewall 1010 and into cross-link 95. At that time, the data are routed both to cross-link 95' and through a delay unit inside cross-link 95.
CPU I/O read operations may also be performed for data received from the I/O devices of processing system 20' via a shared memory controller 1050' and local RAM in I/O device 100'.
Although I/O modules 100, 110, and 120 are similar and correspond to I/O modules 100', 110', and 120', respectively, the corresponding I/O modules are not in lockstep synchronization. Using memory controller 1050' and local RAM 1060' for a CPU I/O read, the data would first go to cross-links 90' and 95'. The remaining data path is equivalent to the path from memory controller 1050. The data travel from the cross-links 90' and 95' up through memory controllers 70' and 75' and finally to CPUs 40' and 50', respectively. Simultaneously, the data travel across to cross-links 90 and 95, respectively, and then, without passing through a delay element, the data continue up to CPUs 40 and 50, respectively.
Fig. 20B shows a CPU I/O write operation of local data.
Such local data are transferred from the CPUs 40, 50, 40' and 50' to an I/O module, such as I/O module 100. An example of such an operation is a write to a register in shared memory controller 1050. The data transferred by CPU 40 proceed along the same path but in a direction opposite to that of the data during the CPU I/O read. Specifically, such data pass through bus 46, memory controller 70, various latches (to permit synchronization), firewall 1000, and memory controller 1050. Data from CPU 50' also follow the path of the CPU I/O reads in a reverse direction. Specifically, such data pass through bus 56', memory controller 75', cross-link 95', cross-link 95, and into firewall 1010. As indicated above, firewalls 1000 and 1010 check the data during I/O write operations to check for errors prior to storage.
When writes are performed to an I/O module in the other zone, a similar operation is performed. However, the data from CPUs 50 and 40' are used instead of CPUs 50' and 40.
The data from CPUs 50 and 40' are transmitted through symmetrical paths to shared memory controller 1050'. The data from CPUs 50 and 40' are compared by firewalls 1000' and 1010'. The reason different CPU pairs are used to service I/O write data is to allow checking of all data paths during normal use in a full duplex system. Interrail checks for each zone were previously performed at memory controllers 70, 75, 70' and 75'.
Fig. 20C shows the data paths for DMA read operations.
The data from memory array 600 pass simultaneously into memory controllers 70 and 75 and then to cross-links 90 and 95. Cross-link 90 delays the data transmitted to firewall 1000 so that the data from cross-links 90 and 95 reach firewalls 1000 and 1010 at substantially the same time.
Similar to the CPU I/O write operation, there are four copies of data sent to the various cross-links. At the firewall, only two copies are received. A different pair of data are used when performing reads to zone 11. The data paths for the DMA write operation are shown in Fig. 20D and are similar to those for a CPU I/O read. Specifically, data from shared memory controller 1050' proceed through firewall 1000', cross-link 90' (with a delay), memory controller 70', and into memory array 600'. Simultaneously, the data pass through firewall 1010', cross-link 95' (with a delay), and memory controller 75', at which time they are compared with the data from memory controller 70' during an interrail error check. As with the CPU I/O read, the data in a DMA write operation may alternatively be brought up through shared memory controller 1050 in an equivalent operation.
The data out of cross-link 90~ al~o pa88 through cross-link gO and memory controller 70 and into memory array 600.
The data from cross-link 9S~ pas~ through cross-link 95 and memory controller 75, at which time they are compaxed with the data from memory controller 70~ during a ~imultaneous interrail check.
The data path for a memory resynchronization (resync) operation is shown in Fig. 20E. In this operation the contents of both memory arrays 600 and 600' must be set equal to each other. In memory resync, data from memory array 600' pass through memory controllers 70' and 75' under DMA control, then through cross-links 90' and 95', respectively. The data then enter cross-links 90 and 95 and memory controllers 70 and 75, respectively, before being stored in memory array 600.
2. Resets
The preceding discussions of system 10 have made reference to many different needs for resets. In certain instances not discussed, resets are used for standard functions, such as when power is initially applied to system 10.
Most systems have a single reset which always sets the processor back to some predetermined or initial state, and thus disrupts the processors' instruction flow. Unlike most other systems, however, resets in system 10 do not affect the flow of instruction execution by CPUs 40, 40', 50 and 50' unless absolutely necessary. In addition, resets in system 10 affect only those portions that need to be reset to restore normal operation.
Another aspect of the resets in system 10 is their containment. One of the prime considerations in a fault tolerant system is that no function should be allowed to stop the system from operating should that function fail. For this reason, no single reset in system 10 controls elements of both zones 11 and 11' without direct cooperation between zones 11 and 11'. Thus, in full duplex mode of operation, all resets in zone 11 will be independent of resets in zone 11'. When system 10 is in master/slave mode, however, the slave zone uses the resets of the master zone. In addition, no reset in system 10 affects the contents of memory chips. Thus neither cache memories 42 and 52, scratch pad memories 45 and 55, nor memory module 60 lose any data due to a reset.
There are preferably three classes of resets in system 10: "clock reset," "hard reset," and "soft reset." A clock reset realigns all the clock phase generators in a zone. A clock reset in zone 11 will also initialize CPUs 40 and 50 and memory module 60. A clock reset does not affect the module interconnects 130 and 132 except to realign the clock phase generators on those modules. Even when system 10 is in master/slave mode, a clock reset in the slave zone will not disturb data transfers from the master zone to the slave zone module interconnect. A clock reset in zone 11', however, will initialize the corresponding elements in zone 11'.
In general, a hard reset returns all state devices and registers to some predetermined or initial state. A soft reset only returns state engines and temporary storage registers to their predetermined or initial state. The state engine in a module is the circuitry that defines the state of that module. Registers containing error information and configuration data will not be affected by a soft reset. Additionally, system 10 will selectively apply both hard resets and soft resets at the same time to reset only those elements that need to be reinitialized in order to continue processing.
The hard resets clear system 10 and, as in conventional systems, return system 10 to a known configuration. Hard resets are used after power is applied, when zones are to be synchronized, or to initialize or disable an I/O module. In system 10 there are preferably four hard resets: "power up reset," "CPU hard reset," "module reset," and "device reset." Hard resets can be further broken down into local and system hard resets. A local hard reset only affects logic that responds when the CPU is in the slave mode. A system hard reset is limited to the logic that is connected to cross-link cables 25 and module interconnects 130 and 132.
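The reset classes described above can be summarized informally. The following C sketch is purely illustrative and is not part of the disclosed hardware; the enumerators simply restate the reset classes and the local/system split named in the preceding paragraphs.

    /* Illustrative summary of the reset classes described above.
     * These type names are hypothetical; the resets are hardware
     * signals in system 10, not software constants. */
    enum reset_class {
        RESET_CLOCK,     /* realigns clock phase generators in a zone   */
        RESET_HARD,      /* returns all state devices and registers     */
        RESET_SOFT       /* returns only state engines and temp storage */
    };

    enum hard_reset_kind {
        HARD_POWER_UP,   /* all hard resets plus a clock reset          */
        HARD_CPU,        /* diagnostic reset of a CPU module            */
        HARD_MODULE,     /* sets an I/O module to a known state         */
        HARD_DEVICE      /* device-dependent reset of attached devices  */
    };

    enum hard_reset_scope {
        HARD_LOCAL,      /* logic active when the CPU is in slave mode  */
        HARD_SYSTEM      /* logic on cross-link cables 25 and module
                            interconnects 130 and 132                   */
    };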


The power up reset is used to initialize zones 11 and 11' immediately after power is supplied. The power up reset forces an automatic reset to all parts of the zone. A power up reset is never connected between the zones of system 10 because each zone has its own power supply and will thus experience different length "power-on" events. The power up reset is implemented by applying all hard resets and a clock reset to zone 11 or 11'.
The CPU hard reset is used for diagnostic purposes in order to return a CPU module to a known state. The CPU hard reset clears all information in the CPUs, memory controllers, and memory module status registers in the affected zone. Although the cache memories and memory modules are disabled, the contents of the scratch pad RAMs 45 and 55 and of the memory module 60 are not changed. In addition, unlike the power up reset, the CPU hard reset does not modify the zone identification of the cross-links nor the clock mastership. The CPU hard reset is the sum of all local hard resets that can be applied to a CPU module and a clock reset.
The module hard reset is used to set the I/O modules to a known state, such as during bootstrapping, and is also used to remove a faulting I/O module from the system. The I/O module hard reset clears everything on the I/O module, leaves the firewalls in a diagnostic mode, and disables the drivers.


A device reset is used to reset I/O devices connected to the I/O modules. The resets are device dependent and are provided by the I/O module to which the device is connected.
The other class of resets is soft resets. As explained above, soft resets clear the state engines and temporary registers in system 10, but they do not change configuration information, such as the mode bits in the cross-links. In addition, soft resets also clear the error handling mechanisms in the modules, but they do not change error registers such as system error register 898 and system fault address register 865.
Soft resets are targeted so that only the necessary portions of the system are reset. For example, if module interconnect 130 needs to be reset, CPU 40 is not reset nor are the devices connected to I/O module 100.
There are three unique aspects of soft resets. One is that each zone is responsible for generating its own reset. Faulty error or reset logic in one zone is thus prevented from causing resets in the non-faulted zone.
The second aspect is that the soft reset does not disrupt the sequence of instruction execution. CPUs 40, 40', 50, 50' are reset on a combined clock and hard reset only. Additionally, memory controllers 70, 75, 70' and 75' have those state engines and registers necessary to service CPU instructions attached to hard reset. Thus the soft reset is transparent to software execution.

The third aspect is that the range of a soft reset, that is, the number of elements in system 10 that is affected by a soft reset, is dependent upon the mode of system 10 and the original reset request. In full duplex mode, a soft reset request originating in CPU module 30 will issue a soft reset to all elements of CPU module 30 as well as all firewalls 1000 and 1010 attached to module interconnects 130 and 132. Thus all modules serviced by module interconnects 130 and 132 will have their state engines and temporary registers reset. This will clear the system pipeline of any problem caused by a transient error. Since system 10 is in duplex mode, zone 11' will be doing everything that zone 11 is. Thus CPU module 30' will, at the same time as CPU module 30, issue a soft reset request. The soft reset in zone 11' will have the same effect as the soft reset in zone 11.
When system 10 is in a master/slave mode, however, with CPU module 30' in the slave mode, a soft reset request originating in CPU module 30 will, as expected, issue a soft reset to all elements of CPU module 30 as well as all firewalls 1000 and 1010 attached to module interconnects 130 and 132. Additionally, the soft reset request will be forwarded to CPU module 30' via cross-links 90 and 95, cross-link cables 25, and cross-links 90' and 95'. Parts of module interconnects 130' and 132' will receive the soft reset. In this same configuration, a soft reset request originating from CPU module 30' will only reset memory controllers 70' and 75' and portions of cross-links 90' and 95'.
Soft resets include "CPU soft resets" and "system soft resets." A CPU soft reset is a soft reset that affects the state engines on the CPU module that originated the request. A system soft reset is a soft reset over the module interconnect and those elements directly attached to it. A CPU module can always request a CPU soft reset. A system soft reset can only be requested if the cross-link of the requesting CPU is in duplex mode, master/slave mode, or off mode. A cross-link in the slave mode will take a system soft reset from the other zone and generate a system soft reset to its own module interconnects.
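As a reading aid only, the rules stated above for when each kind of soft reset may be requested can be restated as a small predicate. The mode names and functions below are hypothetical paraphrases, not the actual cross-link logic.

    #include <stdbool.h>

    /* Hypothetical restatement of the soft reset request rules above. */
    enum crosslink_mode { MODE_DUPLEX, MODE_MASTER_SLAVE, MODE_OFF, MODE_SLAVE };

    /* A CPU module may always request a CPU soft reset. */
    static bool cpu_soft_reset_allowed(void) { return true; }

    /* A system soft reset may be requested only if the requesting CPU's
     * cross-link is in duplex, master/slave, or off mode.  A cross-link
     * in slave mode does not originate one; it passes along the system
     * soft reset received from the other zone to its own module
     * interconnects. */
    static bool system_soft_reset_allowed(enum crosslink_mode m)
    {
        return m == MODE_DUPLEX || m == MODE_MASTER_SLAVE || m == MODE_OFF;
    }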
CPU soft resets clear the CPU pipeline following an error condition. The CPU pipeline includes memory interconnects 80 and 82, latches (not shown) in memory controllers 70 and 75, DMA engine 800, and cross-links 90 and 95. The CPU soft reset can also occur following a DMA or I/O time-out. A DMA or I/O time-out occurs when the I/O device does not respond within a specified time period to a DMA or an I/O request.
Fig. 21 shows the reset lines from the CPU modules 30 and 30' to the I/O modules 100, 110, 100', and 110' and to the memory modules 60 and 60'. The CPU module 30 receives a DC OK signal indicating when the power supply has settled.

It is this signal which initializes the power-up reset. CPU module 30' receives a similar signal from its power supply.
One system hard reset line is sent to each I/O module, and one system soft reset is sent to every three I/O modules. The reason that a single hard reset is needed for each module is because the system hard reset lines are used to remove individual I/O modules from system 10. The limitation of three I/O modules for each system soft reset is merely a loading consideration. In addition, one clock reset line is sent for every I/O module and memory module. The reason for using a single line per module is to control the skew by controlling the load.
Fig. 22 shows the elements of CPU module 30 which relate to resets. CPUs 40 and 50 contain clock generators 2210 and 2211, respectively. Memory controllers 70 and 75 contain clock generators 2220 and 2221, respectively, and cross-links 90 and 95 contain clock generators 2260 and 2261, respectively. The clock generators divide down the system clock signals for use by the individual modules.
Memory controller 70 contains reset control circuitry 2230 and a soft reset request register 2235. Memory controller 75 contains reset control circuitry 2231 and a soft reset request register 2236.
Cross-link 90 contains both a local reset generator 2240 and a system reset generator 2250. Cross-link 95 contains a local reset generator 2241 and a system reset generator 2251.

The "local" portion of a cross-link is that portion of the cross-link which remains with the CPU module when that cross-link is in the slave mode and therefore includes the serial registers and some of the parallel registers. The "system" portion of a cross-link is that portion of the cross-link that is needed for access to module interconnects 130 and 132 (or 130' and 132') and cross-link cables 25.
The local reset generators 2240 and 2241 generate resets for CPU module 30 by sending hard and soft reset signals to the local reset control circuits 2245 and 2246 of cross-links 90 and 95, respectively, and to the reset control circuits 2230 and 2231 of memory controllers 70 and 75, respectively. Local cross-link reset control circuits 2245 and 2246 respond to the soft reset signals by resetting their state engines and the latches storing data to be transferred. Those circuits respond to the hard reset signals by taking the same actions as are taken for the soft resets, and by also resetting the error registers and the configuration registers. Reset control circuits 2230 and 2231 respond to hard and soft reset signals in a similar manner.
In addition, the local reset generator 2240 sends clock reset signals to the I/O modules 100, 110 and 120 via module interconnects 130 and 132. The I/O modules 100, 110, and 120 use the clock reset signals to reset their clocks in the manner described below. Soft reset request registers 2235 and 2236 send soft reset request signals to local reset generators 2240 and 2241, respectively.
System reset generators 2250 and 2251 of cross-links 90 and 95, respectively, send system hard reset signals and system soft reset signals to I/O modules 100, 110, and 120 via module interconnects 130 and 132, respectively. I/O modules 100, 110, and 120 respond to the soft reset signals by resetting all registers that are dependent on CPU data or commands. Those modules respond to the hard reset signals by resetting the same registers as soft resets do, and by also resetting any configuration registers.
In addition, the system reset generators 2250 and 2251 also send the system soft and system hard reset signals to the system reset control circuits 2255 and 2256 of each cross-link. System reset control circuits 2255 and 2256 respond to the system soft reset signals and to the system hard reset signals in a manner similar to the response of the local reset control circuits to the local soft and local hard reset signals.
Memory controllers 70 and 75 cause cross-links 90 and 95, respectively, to generate the soft resets when CPUs 40 and 50, respectively, write the appropriate codes into soft reset request registers 2235 and 2236, respectively. Soft reset request registers 2235 and 2236 send soft reset request signals to local reset generators 2240 and 2241, respectively. The coded error signal is sent from memory controller 70 to local reset generators 2240 and 2241.
System soft resets are sent between zones along the same data paths on which data and control signals are sent. Thus, the same philosophy of equalizing delays is used for resets as for data and addresses, and resets reach all of the elements in both zones at approximately the same time.
Hard resets are generated by CPUs 40 and 50 writing the appropriate code into the local hard reset registers 2243 or by the request for a power up reset caused by the DC OK signal. Synchronization circuit 2270 in cross-link 90 includes appropriate delay elements to ensure that the DC OK signal goes to all of the local and system reset generators 2240, 2250, 2241 and 2251 at the same time.
In fact, synchronization of resets is very important in system 10. That is why the reset signals originate in the cross-links. In that way, the resets can be sent to arrive at different modules and elements in the modules approximately synchronously.
With the understanding of the structure in Figs. 21 and 22, the execution of the different hard resets can be better understood. The power up reset generates a system hard reset, a local hard reset, and a clock reset. Generally, cross-links 90, 95, 90' and 95' are initially in both the cross-link off and resync off modes, and with both zones asserting clock mastership.
The CPU/MEM fault reset is automatically activated whenever memory controllers 70, 75, 70' and 75' detect a CPU/MEM fault. The coded error signal is sent from error logic 2237 and 2238 to both cross-links 90 and 95. The CPU module which generated the fault is then removed from system 10 by setting its cross-link to the slave state and by setting the cross-link in the other CPU module to the master state. The non-faulting CPU module will not experience a reset, however. Instead, it will be notified of the fault in the other module through a code in a serial cross-link error register (not shown). The CPU/MEM fault reset consists of a clock reset to the zone with the failing CPU module and a local soft reset to that module.
A resync reset is essentially a system soft reset with a local hard reset and a clock reset. The resync reset is used to bring two zones into lockstep synchronization. If, after a period in which zones 11 and 11' were not synchronized, the contents of the memory modules 60 and 60', including the stored states of the CPU registers, are set equal to each other, the resync reset is used to bring the zones into a compatible configuration so they can restart in a duplex mode.


The resync reset is essentially a CPU hard reset and a clock reset. The resync reset is activated by software writing the resync reset address into one of the parallel cross-link registers. At that time, one zone should be in the cross-link master/resync master mode and the other in the cross-link slave/resync slave mode. A simultaneous reset will then be performed on both zones which, among other things, will set all four cross-links into the duplex mode. Since the resync reset is not a system soft reset, the I/O modules do not receive a reset.
The preferred embodiment of system 10 also ensures that clock reset signals do not reset conforming clocks, only nonconforming clocks. The reason for this is that whenever a clock is reset, it alters the timing of the clocks which in turn affects the operation of the modules with such clocks. If the module was performing correctly and its clock was in the proper phase, then altering its operation would be both unnecessary and wasteful.
Fig. 23 shows a preferred embodiment of circuitry which will ensure that only nonconforming clocks are reset. The circuitry shown in Fig. 23 preferably resides in the clock generators 2210, 2211, 2220, 2221, 2260, and 2261 of the corresponding modules shown in Fig. 22.
In the preferred embodiment, the different clock generators 2210, 2211, 2220, 2221, 2260, and 2261 include a rising edge detector 2300 and a phase generator 2310. The rising edge detector 2300 receives the clock reset signals from the cross-links 90 and 95 and generates a pulse of known duration concurrent with the rising edge of the clock reset signal. That pulse is an input to the phase generator 2310, as are the internal clock signals for the particular module. The internal clock signals for that module are clock signals which are derived from the system clock signals that have been distributed from oscillator systems 200 and 200'.
Phase generator 2310 is preferably a divide-down circuit which forms different phases for the clock signals. Other designs for phase generator 2310, such as recirculating shift registers, can also be used.
Preferably, the rising edge pulse from rising edge detector 2300 causes phase generator 2310 to output a preselected phase. Thus, for example, if phase generator 2310 were a divide-down circuit with several stages, the clock reset rising edge pulse could be a set input to the stage which generates the preselected phase and a reset input to all other stages. If phase generator 2310 were already generating that phase, then the presence of the synchronized clock reset signal would be essentially transparent.
The resets thus organized are designed to provide the minimal disruption to the normal execution of system 10, and only cause the drastic action of interrupting the normal sequences of instruction execution when such drastic action is required. This is particularly important in a dual or multiple zone environment because of the problems of resynchronization which conventional resets cause. Thus, it is preferable to minimize the number of hard resets, as is done in system 10.
E. ERROR HANDLING
Error handling involves error detection, error recovery, and error reporting. Error detection has been discussed above with respect to the comparison elements in memory controllers 70, 75, 70' and 75', memory modules 60 and 60', cross-links 90, 95, 90' and 95', and firewalls 1000, 1010, 1000' and 1010'.
Error recovery in the present invention is designed to minimize the time spent on such recovery and to minimize the overhead which error recovery imposes on normally executing software. There are two aspects to this error recovery: hardware and software. Hardware error recovery is attempted for most faults before software error recovery within the general software error processing process is attempted. If the faults for which hardware error recovery is attempted are transient, error recovery back to fault tolerant lockstep operation may be performed most of the time entirely by the hardware. If hardware error recovery is not successful or is not used, then software error recovery is attempted. Such software recovery is designed to allow CPUs 40, 50 and 40', 50' to perform an orderly transition from normal operation to the error handling process.
Error recovery is complete when the data processing system has determined which module is the source of the error and has disabled the faulty device or otherwise reconfigured the system to bypass the faulty device.
1. Hardware Error Handling and Recovery
In the preferred embodiment of the invention, error recovery is implemented as much as possible at the hardware level. This is done to minimize time spent in the error recovery phase of error handling and to minimize the complexity of the software. Software intervention generally takes more time and causes a greater impact on the rest of the system. This is especially true in a multiprocessor system, such as system 10, where different zones 11 and 11' are in lockstep synchronization with each other. The greater the percentage of the error handling that can take place in hardware, the less will be the impact on the whole system.
There are three basic categories of faults or errors in system 10 which can be resolved using a hardware error recovery algorithm. These errors are a CPU I/O error, a CPU/MEM fault, and a DMA error. The error handling routines for each type of error differ slightly.

Figure 24 illustrates a flow diagram 2400 showing the overall hardware error handling procedure. As with prior explanations, the procedure in process 2400 will be described where possible with reference to zone 11 with the understanding that the process could be executed equivalently with the elements of zone 11'.
Prior to discussing diagram 2400, it is important to understand certain principles of error handling. After a data processing operation is performed, there is a window of time during which information is present which allows an error to be associated with the bus operation which generated the error. The term "bus operation" refers to a complete operation initiated by CPUs 40, 50, 40' or 50' which requires resources, such as memory modules 60 and 60', not directly connected to CPUs 40, 50, 40', or 50'.
As Figure 24 illustrates, after a bus operation is performed (step 2410), a determination is made whether an error occurred. If no error is detected (step 2420), there is no need for hardware error handling and the procedure is complete (step 2440).
If an error is detected, however, hardware error handling must be initiated in the time window following the bus operation that caused the fault. First, the type of error must be identified (step 2430). The error types include CPU I/O error, DMA error, or CPU/MEM fault.
Depending on the data processing instruction or operation being performed by data processing system 10, different hardware error handling procedures will be followed. When a CPU I/O error is detected, a CPU I/O error handler is entered (step 2450). The CPU I/O error generally indicates some type of error occurred in an area peripheral to CPUs 40 and 50, memory module 60, and the portions of memory controllers 70 and 75 interfacing with memory module 60. A CPU I/O error occurs, for example, when there is a time-out of CPU busses 88 and 89 or an I/O miscompare detected at either firewalls 1000 and 1010, memory controllers 70 and 75, or cross-links 90 and 95. For such a situation, CPUs 40 and 50 can be considered capable of continued reliable operation.
The CPU I/O error handling is described below. In general, however, after the CPU I/O hardware error processing is complete, registers will be set to indicate whether the error was transient or solid, and will be loaded with other information for error analysis. A transient fault or error means that a retry of a faulty operation was successful during hardware error recovery. Also, an interrupt (Sys Err) of a predetermined level is set so that CPUs 40 and 50 will execute software error recovery or logging.
If an error is detected during a DMA operation, the DMA error handler is entered (step 2452). This error would be detected during a DMA operation, for example, when there is a time-out of CPU busses 88 and 89 or an I/O miscompare detected at either firewalls 1000 and 1010, memory controllers 70 and 75, or cross-links 90 and 95. Because DMA is operating asynchronously with respect to the operation of CPUs 40 and 50, the principal action of the DMA handler (step 2452) will be to shut down DMA engine 800, and to use various other responses discussed below, such as setting a Sys Err interrupt and a DMA interrupt.
If an error is detected such that the operation of the CPUs 40 or 50 or the contents of memory module 60 are in question, then the error is deemed a CPU/MEM fault and the CPU/MEM fault handler is entered (step 2454). Examples of CPU/MEM faults are a double bit ECC error, a miscompare on data from CPUs 40 and 50, or a miscompare on addresses sent to memory module 60. Detection of a CPU/MEM fault brings into doubt the state of the CPU module 30 and its associated memory module 60. This type of error is considered critical and requires that the CPU memory pair which experienced the CPU/MEM fault shut itself down automatically and the system reconfigure. The questionable state of the faulting CPU or associated memory makes further error processing in hardware or software by the corresponding CPU memory pair unreliable.
The flow diagram of Figure 25 illustrates a preferred process 2500 for handling CPU I/O errors which includes the CPU I/O handler (step 2450) of Figure 24. In the preferred embodiment of the invention, the signals described in this error handling process as well as the other error handling processes are illustrated in Figure 26.
One important aspect of hardware CPU I/O error handling is that some operations which are external to memory controllers 70 and 75 do not have a delay after the operation unless an error signal is received. Therefore, if an error signal is received for data corresponding to such an operation, then the system will delay so that all error reports will propagate to the memory controllers 70 and 75.
The series of operations performed by memory controllers 70 and 75 after a CPU I/O error signal is received (step 2510) are initiated by memory controllers 70 and 75 if one of three conditions exists: (1) a specific signal is transmitted up from cross-links 90 and 95, (2) an error report is generated by memory module 60, or (3) an internal error signal is generated at memory controllers 70 and 75.
The specific signal transmitted from cross-links 90 and 95 is a code that is simultaneously transmitted along the control status lines of busses 88 and 89. In the preferred embodiment of the invention, such a code is generated either when a miscompare is detected at the firewalls 1000 and 1010 or when cross-links 90 and 95 detect a rail miscompare, such as in EXCLUSIVE OR gates 360 and 960m in Fig. 11. If firewalls 1000 and 1010 detect a miscompare, they transmit a predetermined bit pattern to cross-links 90 and 95 via module interconnects 130 and 132, respectively, and that pattern is then retransmitted to memory controllers 70 and 75, respectively.
Memory controllers 70 and 75 send these error signals to diagnostic error register logic 870, shown in Fig. 9, which generates an error pulse. That error pulse sets bits in diagnostic error register 880 (step 2510). The error pulse from diagnostic error logic 870 is an input to error categorization logic 850. The output of error categorization logic 850 is transmitted to encoder 855 which generates an error code (step 2510). The error code is transmitted from AND gate 856 when the hardware error handling is enabled and error disable bit 878 is set accordingly. The error code is then sent to cross-links 90 and 95.
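The flow from error detection to the error code sent to the cross-links can be followed in a behavioral sketch. The C fragment below is only a paraphrase of the path through diagnostic error register logic 870, error categorization logic 850, encoder 855, and AND gate 856; the data types, register layout, and encoding are illustrative assumptions, not the actual hardware design.

    #include <stdint.h>

    /* Behavioral sketch (not the actual hardware) of the CPU I/O error
     * reporting path described above. */
    struct memory_controller {
        uint32_t diagnostic_error_register;   /* register 880            */
        int      hw_error_handling_enabled;   /* gating at AND gate 856  */
    };

    static uint8_t report_cpu_io_error(struct memory_controller *mc,
                                       uint32_t error_bits)
    {
        /* Diagnostic error register logic 870 latches the error bits. */
        mc->diagnostic_error_register |= error_bits;          /* step 2510 */

        /* Error categorization logic 850 and encoder 855 reduce the
         * pulse to a small error code; this encoding is a placeholder. */
        uint8_t error_code = (uint8_t)(error_bits ? 0x1 : 0x0);

        /* AND gate 856: the code leaves the memory controller only when
         * hardware error handling is enabled.                           */
        return mc->hw_error_handling_enabled ? error_code : 0;
    }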
In response to the error code, cross-links 90 and 95 perform a series of hardware operations (step 2520). One of those operations is the assertion of a predetermined error signal on the zone error lines for distribution to system 10 (step 2520). There is one set of four zone error lines per zone as shown in Figs. 19 and 26. The zone error signal for zone 11 is formed when error lines from cross-links 90 and 95 (cross-links 90' and 95' for the error signal of zone 11') are ORed together (OR gates 1990 and 1992 in Fig. 19). This is done so that a consistent error report is generated by cross-links 90 and 95 (and cross-links 90' and 95') to be sent out to the other zone's cross-links.
After distributing the predetermined error signal to the other cross-links (step 2520), error logic circuits in cross-links 90, 95, 90', and 95' simultaneously post a Sys Err interrupt, freeze the trace RAMs, and send a Retry Request (step 2520). Cross-links 90 and 95 post a Sys Err interrupt by setting the Sys Err line (see Fig. 26) which transmits the interrupt to CPUs 40 and 50. Also, cross-links 90 and 95 freeze trace RAMs (step 2520), which are connected to various busses to capture bus information, by setting a global error line (see Fig. 26).
The trace RAMs are frozen to capture the most recent data transferred just prior to the detection of the error. The function of trace RAMs will be briefly described in this section, and their use in error analysis will be discussed in the discussion of software error handling.
In system 10, trace RAMs are preferably located on all major rail data paths. Figure 27 is a block diagram of CPU module 30 and I/O module 100 showing preferred locations of trace RAMs in computer system 10. Of course, other locations may also be selected. The function of the trace RAMs is to permit the identification of the source of errors by tracing the miscomparisons of data between trace RAM contents.
In Figure 27, trace RAMs 2700 and 2705 are located in firewalls 1000 and 1010, respectively, and are coupled to module interconnects 130 and 132, respectively. Trace RAMs 2710, 2715, and 2718, respectively, are located on the interfaces with corresponding busses of cross-link 90, and trace RAMs 2720, 2725, and 2728, respectively, are located on the interfaces with corresponding busses of cross-link 95. A complementary set of trace RAMs is located in processing system 20'.
In zone 11, trace RAMs 2700 and 2718 monitor module interconnect 130, trace RAMs 2705 and 2728 monitor module interconnect 132, trace RAMs 2715 and 2725 monitor cross-link cable 25, trace RAM 2710 monitors bus 88, and trace RAM 2720 monitors bus 89. The corresponding trace RAMs in zone 11' monitor the respective busses.
An example of a trace RAM 2800 is shown in Figure 28. Trace RAM 2800 is preferably organized as a circular buffer which stores the data transferred on the N most recent cycles of the associated bus pathway.
Trace RAM 2800 comprises a buffer register 2805 having inputs coupled to receive data from an associated data path. A load input into buffer 2805 is the output of AND gate 2815. The inputs to AND gate 2815 are a clock signal from the data path, a global error signal generated when a fault is detected, and a trace RAM enable signal from trace RAM decoder 2820.
The trace RAM enable signal enables storage of data from the corresponding bus when the bus is not in an idle state. During bus idle cycles the bus is not being used to transmit data; therefore, the trace RAM does not continue to store the signals present on the bus.
Preferably, the global error signal causes the trace RAM to freeze its data and stop storing additional signals. The inverse of the global error signal is presented to AND gate 2815 so that when the global error signal is asserted, buffer 2805 will cease storing the signals present on the associated data path. The address inputs of buffer 2805 are supplied by a recycling counter 2810 which receives a count signal from AND gate 2815.
Each of the trace RAMs keeps in its memory a copy of the N most recent non-idle transactions on the data pathway associated with it. For example, in Figure 27 trace RAM 2700 keeps a copy of the N most recent transactions on module interconnect 130.
The depth N of the trace RAM 2800 is determined by the total number of bus cycles which are required for the most distant message transferred plus the total number of cycles which would be required to send the global error signal to the trace RAM when an error or fault occurs. In the preferred embodiment, sixteen non-idle bus cycles are stored.
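A trace RAM of this kind behaves like a small circular buffer that stops capturing when the global error signal is raised. The following C model is a hypothetical software analogue of buffer 2805 and recycling counter 2810, with the depth fixed at the sixteen non-idle cycles mentioned above; it is not the hardware design itself.

    #include <stdint.h>
    #include <stdbool.h>

    #define TRACE_DEPTH 16            /* sixteen non-idle bus cycles */

    /* Software model of a trace RAM: buffer 2805 plus recycling counter 2810. */
    struct trace_ram {
        uint32_t entries[TRACE_DEPTH];
        unsigned counter;             /* recycling (wrap-around) counter        */
        bool     frozen;              /* set when the global error line asserts */
    };

    /* Called once per bus cycle.  Idle cycles and frozen trace RAMs store
     * nothing, mirroring the enable and global error inputs to AND gate 2815. */
    static void trace_ram_clock(struct trace_ram *t, uint32_t bus_data,
                                bool bus_idle, bool global_error)
    {
        if (global_error)
            t->frozen = true;         /* keep the N most recent transfers */
        if (t->frozen || bus_idle)
            return;
        t->entries[t->counter] = bus_data;
        t->counter = (t->counter + 1) % TRACE_DEPTH;
    }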
The remaining action taken in direct response to the generation of the error code is the transmission of a retry request. Error logic circuits 2237 and 2238 in cross-links 90 and 95 send a Retry Request (see Fig. 26) in response to the error code (step 2520). The Retry Request causes a series of operations to occur at approximately the same time in memory controllers 70 and 75 (step 2530): incrementing the fault level, freezing the system fault error address register, and sending a soft reset request.
The current hardware error recovery fault level or status resides in two bits of system fault error register 898. These two bits are the transient bit and the solid bit (see Fig. 9). The combination of these two bits is designated as the bus error code. There are three valid values of the bus error code when interpreting CPU I/O faults. One valid value corresponds to a system status in which there are no currently pending errors and no error recovery algorithm is currently being executed. A second valid value of the bus error code corresponds to the system status in which there has been an error on the initial execution of an operation or an error has occurred for which no retry was attempted. The third valid value corresponds to the case of an error occurring after an operation has been retried. The Retry Request is an input to encoder 895 which increments the fault level.
The fault level may be incremented multiple times by multiple errors if the errors occur so frequently that the original fault level was not cleared by the software error processing. Thus, two faults occurring in rapid succession may be seen by the software error processing as being a solid fault.
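The three bus error code values can be read as a small state machine driven by Retry Requests. The sketch below is an interpretive restatement using assumed names for the transient and solid bits; the actual bit positions and encoding in system fault error register 898 are not spelled out here.

    #include <stdbool.h>

    /* Assumed names for the two fault-level bits in system fault error
     * register 898; the real bit positions are not given in this text. */
    struct fault_level { bool transient; bool solid; };

    /* No pending error -> error on first try (or no retry attempted)
     * -> error after a retry ("solid fault").  Each Retry Request raises
     * the level one step, so two faults in rapid succession can look
     * solid to the software error processing. */
    static void retry_request_increment(struct fault_level *fl)
    {
        if (!fl->transient)
            fl->transient = true;     /* first error: transient level    */
        else
            fl->solid = true;         /* error during retry: solid fault */
    }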
The incrementing of the fault level causes the system fault error address register to freeze. The transient bit is set at both the first and second fault levels, but not at the lowest level corresponding to no currently pending errors. The transient bit disables and thereby freezes system fault error address register 865. System fault error address register 865 is contained in memory controllers 70 and 75 and is frozen to allow the current bus operation to be retried and to assist in performing diagnostics.
A Soft Reset Request is sent (step 2530) to cross-links 90 and 95 by memory controllers 70 and 75, respectively, by setting the Soft Reset Request lines shown in Figure 26. Additionally, when memory controllers 70 and 75 receive the Retry Request, they stop DMA engine 800, write an error code into a status register in DMA controller 820 to indicate the type of error, and freeze buses 88 and 89 between memory controllers 70 and 75 and cross-links 90 and 95, respectively.
After the different operations are performed in response to the Retry Request, local soft reset generator 2240 in primary cross-link 90 generates a local soft reset (step 2532) in response to the Soft Reset Request. In response to the local soft reset, retry generators 2610 and 2620 in memory controllers 70 and 75, respectively, reinitiate the pending bus transaction (step 2534). If the retried bus operation is successful and no error signal is received (step 2536), then the hardware error processing for a CPU I/O error is finished (step 2525).
If an error signal is received by the memory controllers 70 and 75, similar hardware operations will be implemented as were implemented when the first error signal was received. Diagnostic error register 880 is set and the error code is generated (step 2538); the error signal is distributed, the Sys Err interrupt is posted, the trace RAMs are frozen and the retry request is sent (step 2539); and the fault level is incremented, the system fault error address register is frozen and a soft reset request is sent (step 2540). Since most of these operations have already been carried out, there will be no change in the trace RAMs, the error address and diagnostic error registers. The fault level, however, will have been incremented to its highest level indicating that there is a "solid fault" in the computer system. This is because the error will have been detected after a retry of a bus operation, and a solid fault occurs when an error is detected during a retry. Then, before exiting the hardware error handling routine for a CPU I/O error 2500, a soft reset is performed (step 2542).

In order to complete the operation being performed by CPUs 40 and 50 so that an interrupt may be taken for software error handling discussed below, a test is made to see whether a read operation was being executed when the error was detected (step 2544). If so, a default operation will be performed (step 2546). The default operation consists of supplying CPU module 30 with consistent data, such as all zeroes, so that the currently executing operation can be completed without a risk of a failure because of a rail data divergence.
Figure 29 illustrates the procedure 2900 for recovering from a DMA error. The sequence of operations which take place at the hardware level (step 2910) is similar to those discussed with respect to the CPU I/O error recovery sequence. The hardware response to a DMA error (step 2910) includes posting a Sys Err interrupt, posting a DMA interrupt, freezing the trace RAMs and stopping the DMA.
First, the DMA is stopped to prevent corruption of data in system 10. Posting of the Sys Err interrupt indicates to the system that an interrupt processing routine should be conducted so that complete recovery can be made from the error. Posting of the DMA interrupt invokes a DMA handler in software to initiate a check of its own operations. The trace RAMs are also frozen so that software error handling will be able to localize the source of the fault.
Even though the DMA is stopped, the remainder of the system is allowed to continue normal operation. However, continued operation of the system when the DMA has been stopped may result in additional errors, such as CPU I/O errors due to a bus time out, since the zone with an inoperative DMA engine will not be able to execute I/O operations.
After the hardware response to a DMA error takes place, the DMA error recovery sequence is completed (step 2920). Further processing of the DMA fault and resumption of the operation of the DMA must occur in software. The software error handling scheme executed by CPUs 40, 50 and 40', 50' is discussed below.
The third type of error which is primarily handled by hardware is the CPU/MEM fault. Figure 30 illustrates CPU/MEM fault error handling procedure 3000.
When the error signals for a CPU/MEM fault are received, they propagate through diagnostic error register logic 870 and through error categorization logic 850 in the memory controller which detected the error. Error categorization logic 850 then posts a CPU/MEM fault signal which is encoded by encoder 855 into a two bit error code indicating a CPU/MEM fault. The two bit error code is transmitted through AND gate 856 to cross-links 90 and 95.
The posting of the CPU/MEM fault (step 3010) causes posting of a Sys Err interrupt, incrementing of the fault level, freezing of the system fault error address register, and freezing of trace RAMs (step 3020), which are described above in the discussion of CPU I/O error handling process 2500.
Upon detection of a CPU/MEM fault, no effort is made to retry the operation since the ability of the current zone to operate correctly and therefore implement any type of error recovery scheme is uncertain at best. Once cross-links 90 and 95 receive the error code indicating a CPU/MEM fault, they immediately reconfigure themselves into the slave mode (step 3025). System 10 is now considered to be operating in a degraded duplex or master/slave mode.
A local soft reset (step 3030) and zone clock reset (step 3040) are performed and the hardware error recovery for a CPU/MEM fault is complete (step 3050).
Two error conditions occur which cause two corresponding bits in system error register 898 to be set. The first is an NXM (nonexistent memory) error which corresponds to a lack of response during a memory operation. The second error condition is an NXIO (nonexistent I/O device) error which corresponds to a lack of response during an I/O operation.
NXM errors are recovered from in software as discussed below. NXIO errors fall within the CPU I/O error type and are handled in hardware according to CPU I/O handler process 2500. An NXIO bit and an NXM bit (see Figure 9) are set for the corresponding NXIO and NXM errors. When the NXM bit is set, DMA engine 800 is disabled, which prevents access to I/O by system 10.
In each of the three types of hardware error recovery, a software error handling process is used after the hardware error recovery procedures to detect the cause and location of the error if possible. Additionally, the software error handling may determine that there is no fault and that the system can be restarted in a normal fully duplex mode. On the other hand, during software error handling it may be determined that a module is bad, and the module will be marked accordingly.
The overall hardware error recovery scheme minimizes time spent in error recovery by allowing the system to continue operation after a transient fault in the CPU I/O error handler process 2500. Additionally, system overhead devoted to error processing is minimized by not attempting to provide for recovery from CPU/MEM faults. The ability of system 10 to recover from a CPU/MEM fault would impose a time penalty to allow for error recovery which in the preferred embodiment would severely degrade system performance.
2. Software Error Handling and Recovery
In order to initiate software error handling, computer system 10 must take a Sys Err interrupt or a DMA interrupt (not shown), whichever is appropriate. The interrupts are used instead of more drastic means, such as a machine check, to allow system 10 to complete the current bus operation. A machine check causes immediate action to be taken and can stop a system in the middle of a bus operation. As discussed briefly with respect to hardware error handling, default information may need to be generated in order to complete a bus operation.
If system 10 is accepting interrupts, then it will initiate a software error handling procedure such as procedure 3100 in Fig. 31. Computer system 10 operates at a given interrupt priority level (IPL) which can be changed. The IPL designates the priority level at which an interrupt must be posted in order to interrupt current computer system operations. If an interrupt is generated with an IPL the same as or lower than the IPL at which computer system 10 is currently running, then the interrupt will not be taken. In the preferred embodiment, the Sys Err interrupt is the highest priority interrupt.
As has been done with other examples, the software error handling will generally be described with respect to the operation of components in zone 11 with the understanding that, when system 10 is functioning in lockstep mode, similar operations will be performed by zone 11'.
If system 10 takes the Sys Err interrupt (step 3110), system 10 initiates a soft reset (step 3112). System 10 then attempts to read the various error registers located in memory controllers 70, 75, 70' and 75', and in cross-links 90, 95, 90' and 95' (step 3114). The memory controller and cross-link error registers, some of which are not shown, store information which is used in software error processing. Two such error registers are system fault error register 898 and system fault error address register 865. These are located in zone address space and should contain identical information for each zone. In the case of a CPU/MEM fault, however, the system fault error registers of the two zones will be different. This difference between the contents of the registers in the two zones can only be tolerated if data processing systems 20 and 20' are no longer in lockstep and system 10 is running in degraded duplex or master/slave mode.

Therefore, if the data from the registers used in error analysis is not consistent (step 3116), meaning there is an error detected or a miscomparison, the zone which detects inconsistent error data will set a CPU/MEM fault causing the hardware error recovery procedure 3000 illustrated in Fig. 30 to be entered (step 3118). This condition arises when the fault occurred in the error logic, and this approach results in the failing element being removed from system 10.
If the error information is consistent (step 3116), software error handling continues, with system 10 identifying the nature of the fault (step 3122) to determine which error handler to employ. In order to identify the error type, error registers, such as system fault error register 898 and system fault error address register 865 in memory controllers 70 and 75 and error registers in cross-links 90 and 95 (not shown), are analyzed. In addition, the NXM error bit in system fault error register 898 must be checked prior to accessing the error registers in cross-links 90 and 95 because access to the cross-links is inhibited while the NXM bit is set.
If the fault detected was a CPU I/O type error, the CPU I/O error handler is entered (step 3124); if the fault is a CPU/MEM fault, the CPU/MEM fault handler is entered (step 3126); if a clock error is detected, the clock error handler is entered (step 3128); and if a nonexistent memory (NXM) error is detected, the NXM handler is entered (step 3130). CPU I/O errors and CPU/MEM faults have been described above with regard to hardware error handling. For software error handling only, CPU I/O errors include DMA errors. An NXM error is an indication that the memory sought to be accessed is not present. A clock error indicates that the two zones cannot be considered as running in lockstep.
32, begins with a trace RAM read (step 3210). The trace RAM read may not be necessary, but it Ls started at this point becau~e it i8 a relatively lengthy process. As explained in the previous section, the trace RAMs we.re ; frozen by the global error signal. The data from the trace RAMs is then read onto trace busses 1095 and 1096 by diagnostlc microprocessor 1100 and into local RAM
1060 in I/0 module 100. Complete sets of trace RAM data from the trace RAN8 of both zones is collected by the I/
0 modules of both zones 11 and 11'.
Analysis of trace RAM data entsils looking at a trace RAM signature. As the trace RAM data i8 read down into I/0 modules 100 and 100~, a trace RAM ~ignature is formed as a string of M bits for each zone. M equals the number of trace RAMs on each rail. Each bit of the trace RAM siqnature corresponds to one pair of trace ~;~
~wo~ce~ RAMs. A trace RAN pair is two trace RAMs on different FINNEC~N. HENDERSON
~A~ R~1 r - -DUNN~R
1~ R~e~ W
W~R~-~OTO-.. D C ~000-- - .
1~0~ 0 - 107 - ~

-20222~
:

. .
1 rails which are located at the sc~me relative position.
In Fig. 27, for example, trace RAM pairis in zone 11 would include trace RAMx 2700/2705, 2718/272e, 2715t ~-2725, and 2710/2720. If a bit in a trace RAM signature is set, there hag been a miscomparison between the trace RAMs ~n a pair.
Next, the NXIO (nonexistent I/O) bit in error status register 898 $8 examined (step 3212). If that bit i8 set, a NXIO error, which indicates a time out during a read to I/O or a write from I/O, has occurred.
If the NXIO bit i8 not ~et (~tep 3212), the trace ~AMs . .
.
are analyzed to assist in determining the device in : .:::
which the error occurred (step 3214). For example, if ~
. : ~ :..
.::
the trace R~N signature bit corresponding to trace RAM
pair 2700/2705 is set then system 10 can determine that the I~O module corresponding to firewall~ 1000 and 1010 - ~
;
is the source of the error.
After the device which is the source of the error :. . :: : . ::: :.:::
~ has been determined, the system may now form an indict~
ment o~ the faulty device (step 3220). This indictment .. . ~:
involves u~ing the error information stored in the ;
various error registers, e.g., system fault error ad- ~ ~ -dress register 865, and the trace RAM analyqis to ;~
identify the specific faulting device. Once the system has identified the faulting device, a determination i ~w O~C~
FlNNeG\N. HENDER~ON
F.~R~GOI' G~RRElr - .-~ DUNNER
q~ T~
O-OI~.O C 000- ' 1~0~ 0 ' ' - 108 - ~
,: .:

~ ' ' 20222~

1 made whether the error is a solid fault or an intermit-tent fault. To determine whether a fault is solid, the first two bits of system fault error register 898 are analyzed.
If the fault is intermittent, rate based thresholding is conducted to determine whether or not the indicated device should be considered as having failed (step 3224). Rate based thresholding involves comparing the number of intermittent errors that have occurred over a given period of time in a faulting device to a predetermined threshold for that device. If the number of errors per unit time for a device is greater than the predetermined threshold (step 3224), then the unit is considered to have failed. If the number of errors per unit time is not greater than the threshold (step 3224), then the principal functions of software error handling are complete and steps are taken to exit software error handling procedure 3200. If the number of intermittent faults is too great or the fault is a solid fault, a failed device handler (step 3226) is called.
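Rate based thresholding reduces to comparing a per-device error count over an observation window against a per-device threshold. The sketch below is a generic illustration of that comparison; the fields, window handling, and threshold values are assumptions rather than values taken from the preferred embodiment.

    #include <stdbool.h>

    /* Generic rate-based thresholding (step 3224).  The text specifies
     * only that the error count per unit time is compared to a
     * predetermined per-device threshold; this layout is assumed. */
    struct device_error_stats {
        unsigned errors_in_window;    /* intermittent errors seen so far     */
        unsigned window_seconds;      /* length of the observation window    */
        unsigned threshold;           /* errors allowed within that window   */
    };

    static bool device_has_failed(const struct device_error_stats *s)
    {
        /* Considered failed when errors per unit time exceed the threshold. */
        return s->errors_in_window > s->threshold;
    }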
Figure 33 illustrates failed device handler procedure 3300. First, appropriate fault information is stored in the EEPROM of the faulted module (step 3310). Such information can include bits indicating that the corresponding module is broken or may be broken. The stored information can also include certain status information as well.
In order to minimize the effec~s of a device :
failure, virtual addre3s of the failed device i8 mapped to a physical address called the ~black hole" (~tep 3314). The "black hole~' is a physical addre~s space which corresponds in ef~'ect to a device to which data may be sent to without experiencing errors in the system and which will return a predetermined set of data, which ~:~
0 i8 preferably all zeros, on a read operation. The map~
ping is performed in the preferred embodiment using a system address conversion table which contains a listing of the virtual addresses and corresponding system ad- : :
dresses for the devices in system 10.
Fig. 34 illustrates an example of a system address conversion table 3400 which is preferably stored in memory arrays 600 and 600'. System address conversion table 3400 includes a virtual address field 3410 and a physical address field 3420. The software uses system address conversion table 3400 to translate or map a device virtual address to its physical address. In addition, the I/O driver routines use the virtual address to identify a corresponding I/O device. Modifying system address conversion table 3400 for a device, therefore, effectively changes the final
destination for data addressed to the virtual address which formerly corresponded to the I/O device.
After the mapping is complete, the next step in the failed device handler is to clear a device present flag in a software table contained in memory array 600 (step 3316). The purpose of clearing the flag is to tell the device driver corresponding to a failed device that the device is considered to have failed.
After the device present flag has been cleared, the system performs a notification of the required repair (step 3318). This notification, in the preferred embodiment, sends a message to appropriate repair personnel. In one embodiment, this message can be sent via a modem to service personnel at a remote location.
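Taken together, steps 3310 through 3318 amount to updating a per-device record and its address mapping. The following sketch is illustrative only; the table layout, the black hole address value, and the function names are assumptions layered on the steps described above.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define BLACK_HOLE_PHYS_ADDR 0xFFFF0000u   /* hypothetical "black hole" physical address */

/* Hypothetical system address conversion table entry (cf. fields 3410/3420). */
struct addr_map_entry {
    uint32_t virtual_addr;   /* address used by the I/O driver            */
    uint32_t physical_addr;  /* where accesses are actually routed        */
    bool     device_present; /* cleared when the device is deemed failed  */
    bool     broken;         /* recorded in the module's EEPROM           */
};

/* Minimal sketch of failed device handler 3300; names are assumptions. */
static void failed_device_handler(struct addr_map_entry *dev)
{
    dev->broken = true;                        /* step 3310: note fault in EEPROM  */
    dev->physical_addr = BLACK_HOLE_PHYS_ADDR; /* step 3314: map to the black hole */
    dev->device_present = false;               /* step 3316: clear device present  */
    printf("repair needed for device at vaddr 0x%08x\n",  /* step 3318: notify     */
           (unsigned)dev->virtual_addr);
}

int main(void)
{
    struct addr_map_entry disk = { 0x8000A000u, 0x0200A000u, true, false };
    failed_device_handler(&disk);
    printf("vaddr 0x%08x now routed to 0x%08x\n",
           (unsigned)disk.virtual_addr, (unsigned)disk.physical_addr);
    return 0;
}
```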
The effect of the failed device handler procedure 3300 can be appreciated by examining the performance of a device driver. Fig. 35 illustrates an example of a device driver 3500 which is an executable block of instructions including a series of I/O instructions to be performed by the corresponding device. Even if the device has failed, the device driver continues to operate normally and execute the I/O instructions. Since the I/O device address space has been mapped to the "black hole" for the failed device, the continued execution of instructions will not generate any additional faults. All device drivers will include a "check device
present" instruction 3510. This instruction checks the device present bit for the corresponding I/O device. If the device present bit is cleared, then the device is considered to have failed and the driver disables itself in an orderly fashion.
Just prior to the "check device present" instruction 3510 there is a clear pipeline instruction 3520. The clear pipeline instruction ensures that all pending I/O instructions are complete so that an error in an immediately preceding instruction will not be missed due to pipeline delays. An example of a "clear pipeline" instruction is a read from a memory controller register. The ability to execute a series of instructions before checking whether the device is considered to have failed saves on the software overhead because it avoids making checks after every operation.
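A device driver organized in this way might look like the following sketch, in which a series of I/O operations is issued, the pipeline is drained, and the device present bit is checked once at the end. The flag variable, the register read used to drain the pipeline, and the loop are assumptions for illustration.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative driver state; in the patent the flag lives in a software
 * table in memory array 600 and the pipeline is drained by reading a
 * memory controller register. The names here are assumptions. */
static volatile bool device_present = true;
static volatile uint32_t memctl_status_reg;

static void issue_io(unsigned n) { printf("I/O operation %u issued\n", n); }

/* "Clear pipeline": a read that forces all pending I/O to complete. */
static void clear_pipeline(void) { (void)memctl_status_reg; }

int main(void)
{
    /* Execute a whole series of I/O instructions without per-operation checks;
     * writes to a failed device fall into the black hole, reads return zeros. */
    for (unsigned n = 0; n < 8; n++)
        issue_io(n);

    clear_pipeline();            /* instruction 3520: drain pending I/O     */
    if (!device_present) {       /* instruction 3510: one check at the end  */
        printf("device failed: driver disabling itself in an orderly fashion\n");
        return 1;
    }
    printf("device still present; driver continues\n");
    return 0;
}
```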
The CPU/IO error handler illustrated in Fig. 32 institutes a number of housekeeping operations after exiting failed device handler 3300 (step 3226), after determining that the device with the error is not considered to have failed after thresholding (step 3224), or after performing a crash dump (step 3232). These housekeeping operations include resetting the trace RAM and error registers (step 3228) and logging the error (step 3230).
Referring back to the software error handling flow of Fig. 31, if the error type is determined to be a CPU/MEM fault (step 3122), then a CPU/MEM fault handler is entered (step 3126). Fig. 36 illustrates an example of a CPU/MEM fault handler.
CPU/MEM fault handler 3600 is a simple software procedure entered in all cases where a CPU/MEM fault is determined to have occurred and for which reliable operation of the CPUs or memory module is unknown. Accordingly, for a system that has a CPU/MEM fault, there is little reliable error processing that can be accomplished. After the CPU/MEM fault handler is entered, the faulting CPU module attempts to move its internal error registers (not shown) to the appropriate EEPROM (step 3612), such as EEPROM 1055. The error registers moved to EEPROM 1055 may very well contain rail unique data because error reporting of a CPU/MEM fault is not always given a chance to propagate to both rails and the system is shut down as quickly as possible during hardware error processing. After the faulting CPU module attempts to move the error registers into its EEPROMs (step 3612), the faulting CPU module immediately enters the console mode (step 3614), and the CPU/MEM fault handler 3600 is complete (step 3616).
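A minimal sketch of this two-step handler is shown below, assuming a register block, an EEPROM image buffer, and a console-mode routine that are stand-ins rather than elements of the specification.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Minimal sketch of CPU/MEM fault handler 3600. The register block,
 * EEPROM buffer, and enter_console_mode() are stand-ins; only the two
 * steps (save error registers, drop to console) come from the text. */
struct cpu_error_regs { uint32_t regs[8]; };

static struct cpu_error_regs faulting_cpu_regs;
static struct cpu_error_regs eeprom_1055_image;   /* copy kept in EEPROM 1055 */

static void enter_console_mode(void)
{
    printf("CPU module halted in console mode (step 3614)\n");
}

static void cpu_mem_fault_handler(void)
{
    /* Step 3612: best-effort copy of (possibly rail-unique) error registers. */
    memcpy(&eeprom_1055_image, &faulting_cpu_regs, sizeof eeprom_1055_image);

    /* Step 3614: no further processing is reliable, so stop here. */
    enter_console_mode();
}

int main(void)
{
    faulting_cpu_regs.regs[0] = 0xDEADBEEF;   /* pretend hardware latched a fault */
    cpu_mem_fault_handler();
    return 0;
}
```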
In the software error handling routine, if the error type is determined to be a clock error (step 3122), then a clock error handler is entered (step 3128). An example of a clock error handler is illustrated in Fig. 37 as procedure 3700.
If a clock error has occurred, it is assumed that no accurate diagnostics or error analysis can be accomplished because the clocks were not synchronized when the error occurred. Therefore, the error registers are cleared (step 3710), and the trace RAMs are unfrozen by deasserting the global error signal (step 3716). Any zone which finds the clock error sets itself to clock master (step 3718).
The zone finding a clock error then executes a check to see whether the cable is installed and the power is on in the other zone. If the cross-link cable 25 is installed (step 3720), and the other zone does not have power (step 3725), then a clock error is logged in the normal fashion (step 3730) and the zone continues. If the cross-link cable 25 is not installed (step 3720) or is installed but the other zone has power (step 3725), then the zone asks whether it is the zone preselected to continue operating under these conditions (step 3735). If so, then the clock error is logged (step 3730), and the zone continues. If the zone is not
the preselected zone (step 3735), then it enters the console mode (step 3740).
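The decision tree of steps 3720 through 3740 can be summarized in the following sketch; the structure fields and helper names are assumptions, and the register-clearing and clock-master steps are only noted in comments.

```c
#include <stdbool.h>
#include <stdio.h>

/* Minimal sketch of clock error handler 3700; the three inputs and the
 * helper names are assumptions layered on the decision tree in the text. */
struct zone_state {
    bool cross_link_cable_installed;  /* step 3720 */
    bool other_zone_has_power;        /* step 3725 */
    bool preselected_to_continue;     /* step 3735 */
};

static void clock_error_handler(struct zone_state *z)
{
    /* Steps 3710/3716/3718: clear error registers, unfreeze trace RAMs,
     * and make this zone its own clock master (not shown). */
    if (z->cross_link_cable_installed && !z->other_zone_has_power) {
        printf("log clock error and continue (step 3730)\n");
    } else if (z->preselected_to_continue) {
        printf("preselected zone: log clock error and continue (step 3730)\n");
    } else {
        printf("not the preselected zone: enter console mode (step 3740)\n");
    }
}

int main(void)
{
    struct zone_state zone11 = { true, true, false };
    clock_error_handler(&zone11);
    return 0;
}
```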
If the error type of the software error handling routine is determined to be a nonexistent memory error (NXM) (step 3122), then the NXM handler is entered (step 3130). The NXM error can be detected if the NXM bit is set in system fault error register 898 illustrated in Fig. 9. The NXM bit in system fault error register 898 is set on two conditions. One is if there is an illegal instruction which the system attempted to execute. Another is if a NXM error was detected due to a lack of response from memory module 60.
An example of a procedure 3800 for handling NXM errors is illustrated in Fig. 38. After the NXM handler is entered (step 3130), the first determination is whether an illegal instruction was attempted (step 3810). If the NXM bit was set because there was an illegal instruction, then the console mode is entered (step 3812), the NXM bit is deasserted (step 3831), and the NXM handler is complete (step 3832).
If there was an actual NXM error, system fault error address register 865 is read (step 3820). System fault error address register 865 contains the address of the memory location in the memory array. The next step is to compare the memory address with the valid memory locations listed in the memory map (step 3826). The purpose
of this comparison is to differentiate hardware errors from software errors.
There are three different situations in which a NXM error would be detected. The first situation is where the system is booting up and the memory is being sized in order to form the memory map. During booting, the software is probing valid and invalid memory locations in memory array 600. To avoid having this situation cause an error, error reporting is disabled during probing of the memory at boot time by elevating the system IPL during memory probing. Thus the NXM error handler would not be entered.
The second situation in which a NXM error is detected is when memory module 60 has experienced a hardware fault which disabled a particular portion of memory array 600, even though that portion was valid when the memory map was formed. This can happen, for example, when one of the memory array cards is simply removed from the system during operation. This is a hardware fault and will make reliable operation of the corresponding CPU module impossible.
The third situation when a NXM error occurs is when software creates an invalid memory address. In this situation, the software is in error.

These three cases are distinguishable in the present situation. As described above, the first situation is distinguished by not entering the NXM error handler. The next two situations are distinguished by checking the memory address at which the NXM error was detected against the valid memory locations in the memory map (step 3826). As can be seen, if the memory module of a zone had a hardware fault and the current memory location was a valid location in the map but for some reason is no longer valid, then a CPU/MEM fault is forced (step 3828). In this way the currently executing task can continue to be executed, since the CPU/MEM fault will cause a hardware error processing routine to reconfigure the system for continued operation in the degraded duplex or master/slave mode.
However, if it is determined that the current memory location was an invalid location and was not present in the valid memory map, then the system determines that the software is in error and a crash dump and error log will have to be performed (step 3830). After these two cases are accounted for (steps 3828 and 3830), the NXM bit is deasserted (step 3831) and the NXM error handler is exited (step 3832). After the NXM bit is deasserted, access to I/O devices will be permitted as discussed above.
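The branch structure of NXM handler 3800 can be sketched as follows, assuming a placeholder memory-map lookup; the address ranges and function names are illustrative and not part of the specification.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Minimal sketch of NXM handler 3800. The memory-map lookup and the two
 * outcome routines are placeholders; the branch structure follows the text. */
static bool address_in_valid_memory_map(uint32_t addr)
{
    return addr < 0x01000000u;   /* assumed: first 16 MB were sized at boot */
}

static void nxm_handler(bool illegal_instruction, uint32_t fault_addr)
{
    if (illegal_instruction) {
        printf("enter console mode (step 3812)\n");
    } else if (address_in_valid_memory_map(fault_addr)) {
        /* Location was valid when the map was built: hardware is at fault. */
        printf("force CPU/MEM fault for reconfiguration (step 3828)\n");
    } else {
        /* Software generated an invalid address: crash dump and error log. */
        printf("software error: crash dump and error log (step 3830)\n");
    }
    printf("deassert NXM bit and exit handler (steps 3831/3832)\n");
}

int main(void)
{
    nxm_handler(false, 0x00400000u);   /* valid location that stopped responding */
    nxm_handler(false, 0x7F000000u);   /* address the software should never form */
    return 0;
}
```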
recovery, a software error handling pro~ess is used ~ -after the hardware error recovery procedures to detect the cause or location of the error if pos~ible. Ad-ditionally, the software error handling may determine that there is no fault and that the system can be restarted in a normal fully duplex mode. On the other hsnd during software error handling it may be determined that a module is bad and the module will be marked ac-cordingly.
In su~mmary, by allowing the system 10 to perform software error recovery only when an interrupt cycle i8 .
reached by system 10 the impact on operations executing when ~n error i~ detected is minimized. Hardware srror ~ ;-recovery facilitates this transparency of error proce6s-ing to normal execution data processing instructions.
Mapping I/O devices to a ~black hole~ and thereby allow- -ing the device drivers to complete a number of I/O
instructions before checking for an error minimizes overhead needed to insure I/O operation~ are performed correctly and not inappropriately interrupted if ad-ditional errors are detected after a first detected er-ror.

3. Conversion of Rail Unique Data to System Data.
Under certain conditions of error processing in fault tolerant computer system 10, data is generated which is unique to a single rail of zones 11 or 11'. In the preferred embodiment of the invention, rail unique data may be stored in diagnostic error register 880 after a CPU/MEM fault. Rail unique data is not limited to diagnostic register 880, however. During diagnostic error analysis, rail unique data will be generated in a variety of locations depending on the registers being tested.
If data processing systems 20 or 20' attempt to move rail unique data from one location to another or to use it in any way, the normal error detection circuitry, such as the data comparison performed by firewalls 1000' and 1010', will signal an error because the data on each rail will not be identical. Thus a mechanism is needed to avoid causing an error during such a transfer.
Furthermore, once rail unique data has been converted into data common to a zone, it is still not usable by fault tolerant system 10 since there would be disagreement between the data in zones 11 and 11'. In order to analyze this data it must be further converted into system data so that there is one consistent copy of data present in each of data processing systems 20 and 20'. This conversion of data must also occur with the
four CPUs 40, 50, 40', 50' running in lockstep synchronization.
The conversion of rail unique data to zone unique data will be described with reference to zone 11 and data processing system 20 for purposes of illustration, with the understanding that analogous procedures may be executed by data processing system 20' in zone 11'.
In the preferred embodiment of the invention, in the procedure for converting rail unique data to system data, as illustrated in Figure 39, the interrupt priority level (IPL) of the computer system 10 is elevated above the level at which miscomparison errors will cause a software error processing routine to be executed (step 3910). At this IPL, computer system 10 will only accept interrupts having a higher priority level than the Sys Err interrupt level.
The error reporting system is also disabled in memory controllers 70 and 75 (step 3912). Error reporting in memory controllers 70 and 75 is disabled by setting error disable bit 878 in memory controller status register 876, which is an input to AND gate 856.
The rail unique data from a particular register, which for this example will be the data from diagnostic error register 880, is moved into scratch pad memories 45 and 55 from the diagnostic error registers in corresponding memory controllers 70 and 75, respectively
(step 3914). Scratch pad memories 45 and 55 are located "above" memory controllers 70 and 75 so that data from registers in memory controllers 70 and 75 does not pass through any error checkers.
This data in scratch pad memories 45 and 55 is then moved down into memory module 60. First, a write operation is executed in which the data in scratch pad memories 45 and 55 is written into memory module 60 at a first location (step 3916). The system default configuration causes data to be written into the addressed memory location of memory module 60 from the primary rail. This write operation to a first memory location results in data from scratch pad 45 being read into memory module 60.
The data from mirror scratch pad memory 55 is written into memory module 60, which requires two operations. First, the memory bus drivers in memory controller 75 must be enabled and those in memory controller 70 must be disabled (step 3918). This is accomplished by setting mirror bus driver enable bit 879 in memory controller status register 876. Next, memory module 60 is commanded to select the ECC for the data from mirror memory controller 75 (step 3920).
Another write operation is then executed in which the data in scratch pad memories 45 and 55 is written into a second memory location (step 3922) different from
the location first written to from scratch pad memories 45 and 55 (step 3916). This write operation to a second memory location causes the data from scratch pad memory 55 to be written into the second memory location in memory module 60 since the mirror rail was chosen as the source of data for the write operation (steps 3918 and 3920).
This series of operations has converted rail unique data to zone unique data. The data from registers located on respective rails of zone 11 is now located in memory module 60 so that it may be used by data processing system 20 without causing miscomparisons. The zones can now be set back to their normal condition by clearing the specific locations in scratch pad memories 45 and 55 that were previously used (step 3924), selecting the primary rail on memory module 60 (step 3926), deselecting the mirror rail bus drivers in memory controller 75 by resetting mirror bus driver enable bit 879 (step 3928), clearing the appropriate error and diagnostic registers (step 3930), and forcing a soft reset in memory controllers 70 and 75 (step 3932).
After the IPL is returned to a level at which the system may accept interrupts (step 3934), system 10 is ready to convert the zone unique data stored at two addresses in each memory module 60 and 60' into data usable by the entire system.
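The sequence of Figure 39 can be summarized in the sketch below, in which every hardware access is reduced to a placeholder function; none of the function names appear in the specification, and only the ordering of the steps follows the text.

```c
#include <stdint.h>
#include <stdio.h>

/* Minimal sketch of the rail-to-zone conversion of Figure 39. Every
 * function below is a placeholder for a hardware access described in the
 * text; none of these names come from the patent itself. */
static uint32_t primary_scratch, mirror_scratch, memory_module_60[2];

static void raise_ipl_above_sys_err(void)       { /* step 3910 */ }
static void disable_memctl_error_reporting(void){ /* step 3912: set bit 878 */ }
static void enable_mirror_bus_drivers(void)     { /* step 3918: set bit 879 */ }
static void select_mirror_ecc(void)             { /* step 3920 */ }
static void restore_defaults_and_reset(void)    { /* steps 3924-3932 */ }
static void restore_ipl(void)                   { /* step 3934 */ }

static void convert_rail_unique_data(uint32_t primary_val, uint32_t mirror_val)
{
    raise_ipl_above_sys_err();
    disable_memctl_error_reporting();

    primary_scratch = primary_val;          /* step 3914: copy into scratch pads     */
    mirror_scratch  = mirror_val;

    memory_module_60[0] = primary_scratch;  /* step 3916: first write, primary rail  */

    enable_mirror_bus_drivers();
    select_mirror_ecc();
    memory_module_60[1] = mirror_scratch;   /* step 3922: second write, mirror rail  */

    restore_defaults_and_reset();
    restore_ipl();
}

int main(void)
{
    convert_rail_unique_data(0x11111111u, 0x22222222u);  /* e.g., register 880 on each rail */
    printf("zone data: primary=0x%08x mirror=0x%08x\n",
           (unsigned)memory_module_60[0], (unsigned)memory_module_60[1]);
    return 0;
}
```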
In order to transform zone unique data into system data, communications register 906 is utilized. Communications register 906 is used to hold unique data to be exchanged between zones. As described previously, the address of the communications register for writing is in zone address space. Thus, during lockstep operation, both zones can simultaneously write the communications register in their respective zones. The address of communications register 906 for reading, however, is in the system address space. In this manner, two zones in lockstep operation can simultaneously read zone unique data using the communications registers.
The method of converting zone unique data into system data is illustrated as procedure 4000 in Figure 40. First, both data processing systems 20 and 20' simultaneously write the desired location from their respective memory modules into their respective communications register (step 4010). Next, both data processing systems write the data from communications register 906 into memory modules 60 and 60' (step 4020). Then both data processing systems write the data from communications register 906' into memory modules 60 and 60' (step 4030). Now the entire system has the same data.
If, as in the case of rail unique data, there are multiple memory module locations with different data, the procedure 4000 would be repeated for each location.
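Procedure 4000 can be sketched by modelling the two zones as two slots that execute the same write and read steps; the data layout is invented for the example, and only the write/read ordering follows the text.

```c
#include <stdint.h>
#include <stdio.h>

/* Minimal sketch of procedure 4000, modelling the two zones as two array
 * slots executing the same steps in lockstep. The data layout is invented;
 * only the write/read sequence follows the text. */
static uint32_t zone_memory[2];    /* one zone-unique value per zone          */
static uint32_t comm_register[2];  /* communications registers 906 and 906'   */
static uint32_t system_copy[2][2]; /* per zone: data from zone 0 and zone 1   */

static void convert_zone_to_system_data(void)
{
    /* Step 4010: each zone writes its own value into its own register
     * (the write address is in zone address space). */
    for (int z = 0; z < 2; z++)
        comm_register[z] = zone_memory[z];

    /* Steps 4020/4030: both zones read register 906, then 906', through the
     * system address space, so each ends up with one consistent pair. */
    for (int z = 0; z < 2; z++) {
        system_copy[z][0] = comm_register[0];
        system_copy[z][1] = comm_register[1];
    }
}

int main(void)
{
    zone_memory[0] = 0xAAAA0000u;   /* zone 11  */
    zone_memory[1] = 0x0000BBBBu;   /* zone 11' */
    convert_zone_to_system_data();
    printf("zone 11 sees %08x %08x, zone 11' sees %08x %08x\n",
           (unsigned)system_copy[0][0], (unsigned)system_copy[0][1],
           (unsigned)system_copy[1][0], (unsigned)system_copy[1][1]);
    return 0;
}
```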

Claims (8)

1. In a data processing system having a plurality of processor portions executing the same set of data processing operations, a method of recovering from an error comprising the steps, performed by the data processing system without executing said data processing operations, of:
detecting an error in said data processing system during the execution of a faulting one of said operations;
locating the processing portion in which the error was detected;
determining whether the detected error is a critical error indicating the processor portion in which the error was detected is incapable of executing said data processing operations normally;
reconfiguring data paths, if a critical error is detected, to bypass the processor portion in which the error was detected; and retrying the data processing operations being executed when the error was detected if the error is not a critical error.
2. The data processing system of claim 1 wherein processing portions are each coupled to a different peripheral portion via a separate data path, and wherein the data paths for each of said processing portions includes two parallel paths for carrying substantially the same information at substantially the same time during desired operation, and wherein the step of detecting an error includes the substep of determining whether the two parallel paths of each of the data paths have different data.
3. The method of claim 2 wherein the step of determining whether the detected error is a critical error includes the substep of determining what type of data processing instruction was being executed when said error was detected.
4. The method of claim 1 further including the step of:
resetting the state of preselected ones of the processor portions if the error is detected during the processing of the faulting data processing instruction.
5. The method of claim 4 wherein the resetting step includes the substep of resetting the preselected ones of the processing portions to an initial state that existed prior to execution of the faulting data processing operation.
6. The method of claim 1, further including the step of performing a default operation after unsuccessfully completing a data processing instruction.
7. The method of claim 6, wherein the step of causing the faulting data processing operation to perform a default operation includes the substep of returning a predetermined pattern of data when the faulting data processing operation is a read operation.
8. The method of claim 6 wherein the step of causing the faulting data processing operation to perform a default operation includes the substep of avoiding execution of the operation when the faulting data processing operation is a write operation.
CA002022209A 1989-08-01 1990-07-30 Method of handling errors in software Abandoned CA2022209A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US07/388,041 US5153881A (en) 1989-08-01 1989-08-01 Method of handling errors in software
US07/388,041 1989-08-01

Publications (1)

Publication Number Publication Date
CA2022209A1 true CA2022209A1 (en) 1991-02-02

Family

ID=23532398

Family Applications (1)

Application Number Title Priority Date Filing Date
CA002022209A Abandoned CA2022209A1 (en) 1989-08-01 1990-07-30 Method of handling errors in software

Country Status (4)

Country Link
US (1) US5153881A (en)
EP (1) EP0414379A3 (en)
JP (1) JPH03182939A (en)
CA (1) CA2022209A1 (en)

Families Citing this family (55)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0476195A1 (en) * 1990-09-19 1992-03-25 International Business Machines Corporation Initial program load for computer workstation
US5297119A (en) * 1990-11-30 1994-03-22 Casio Computer Co., Ltd. Data storage apparatus
US5313625A (en) * 1991-07-30 1994-05-17 Honeywell Inc. Fault recoverable computer system
US5325520A (en) * 1992-01-03 1994-06-28 Amdahl Corporation Invoking hardware recovery actions via action latches
US5317739A (en) * 1992-03-30 1994-05-31 International Business Machines Corp. Method and apparatus for coupling data processing systems
US5390324A (en) * 1992-10-02 1995-02-14 Compaq Computer Corporation Computer failure recovery and alert system
US5640530A (en) * 1992-12-17 1997-06-17 International Business Machines Corporation Use of configuration registers to control access to multiple caches and nonvolatile stores
JPH06188909A (en) * 1992-12-18 1994-07-08 Fujitsu Ltd Abnormal packet processing system
JP2872008B2 (en) * 1993-08-19 1999-03-17 日本電気株式会社 Computer system and method for implementing system reduced operation
US5809525A (en) * 1993-09-17 1998-09-15 International Business Machines Corporation Multi-level computer cache system providing plural cache controllers associated with memory address ranges and having cache directories
US5566298A (en) * 1994-03-01 1996-10-15 Intel Corporation Method for state recovery during assist and restart in a decoder having an alias mechanism
US5539895A (en) * 1994-05-12 1996-07-23 International Business Machines Corporation Hierarchical computer cache system
US5630048A (en) * 1994-05-19 1997-05-13 La Joie; Leslie T. Diagnostic system for run-time monitoring of computer operations
US5594862A (en) * 1994-07-20 1997-01-14 Emc Corporation XOR controller for a storage subsystem
US5594861A (en) * 1995-08-18 1997-01-14 Telefonaktiebolaget L M Ericsson Method and apparatus for handling processing errors in telecommunications exchanges
US5815571A (en) * 1996-10-28 1998-09-29 Finley; Phillip Scott Computer system with secured data paths and method of protection
US5946498A (en) * 1996-11-12 1999-08-31 International Business Machines Corporation Delivery of client remote procedure calls to a server via a request queue utilizing priority and time-out
US6018805A (en) * 1997-12-15 2000-01-25 Recipio Transparent recovery of distributed-objects using intelligent proxies
US6052758A (en) * 1997-12-22 2000-04-18 International Business Machines Corporation Interface error detection and isolation in a direct access storage device DASD system
JP3687373B2 (en) * 1998-12-04 2005-08-24 株式会社日立製作所 Highly reliable distributed system
US7096358B2 (en) * 1998-05-07 2006-08-22 Maz Technologies, Inc. Encrypting file system
US6173351B1 (en) * 1998-06-15 2001-01-09 Sun Microsystems, Inc. Multi-processor system bridge
US6223304B1 (en) 1998-06-18 2001-04-24 Telefonaktiebolaget Lm Ericsson (Publ) Synchronization of processors in a fault tolerant multi-processor system
US7013305B2 (en) 2001-10-01 2006-03-14 International Business Machines Corporation Managing the state of coupling facility structures, detecting by one or more systems coupled to the coupling facility, the suspended state of the duplexed command, detecting being independent of message exchange
US6859866B2 (en) * 2001-10-01 2005-02-22 International Business Machines Corporation Synchronizing processing of commands invoked against duplexed coupling facility structures
US6931576B2 (en) * 2002-01-07 2005-08-16 Sun Microsystems, Inc. Data integrity device providing heightened error protection in a data processing system
US6687791B2 (en) * 2002-01-07 2004-02-03 Sun Microsystems, Inc. Shared cache for data integrity operations
US7698539B1 (en) 2003-07-16 2010-04-13 Banning John P System and method of instruction modification
EP1577773A1 (en) * 2004-03-15 2005-09-21 Siemens Aktiengesellschaft Multiprocessor system with a diagnosis processor for saving the system state and method for operating a multiprocessor system
US7228457B2 (en) * 2004-03-16 2007-06-05 Arm Limited Performing diagnostic operations upon a data processing apparatus with power down support
DE102004047363A1 (en) * 2004-09-29 2006-03-30 Siemens Ag Processor or method for operating a processor and / or operating system in the event of a fault
US8230396B2 (en) * 2006-01-06 2012-07-24 International Business Machines Corporation Apparatus and method to debug a software program
US8806476B2 (en) * 2006-03-14 2014-08-12 International Business Machines Corporation Implementing a software installation process
US20080162966A1 (en) * 2006-12-28 2008-07-03 Motorola, Inc. System and method for handling access to memory modules
CN103443774A (en) * 2007-09-25 2013-12-11 富士通株式会社 Information processor and control method
US20090186344A1 (en) * 2008-01-23 2009-07-23 Caliper Life Sciences, Inc. Devices and methods for detecting and quantitating nucleic acids using size separation of amplicons
US8392762B2 (en) 2008-02-04 2013-03-05 Honeywell International Inc. System and method for detection and prevention of flash corruption
US9785901B2 (en) * 2010-10-04 2017-10-10 International Business Machines Corporation Business process development and run time tool
US20140188829A1 (en) * 2012-12-27 2014-07-03 Narayan Ranganathan Technologies for providing deferred error records to an error handler
US9519532B2 (en) * 2014-01-20 2016-12-13 Lenovo Enterprise Solutions (Singapore) Pte. Ltd. Handling system interrupts with long-running recovery actions
GB2506825B (en) * 2014-02-12 2014-10-15 Ultrasoc Technologies Ltd Functional testing of an integrated circuit chip
US10922203B1 (en) * 2018-09-21 2021-02-16 Nvidia Corporation Fault injection architecture for resilient GPU computing
US11010282B2 (en) 2019-01-24 2021-05-18 International Business Machines Corporation Fault detection and localization using combinatorial test design techniques while adhering to architectural restrictions
US11106567B2 (en) 2019-01-24 2021-08-31 International Business Machines Corporation Combinatoric set completion through unique test case generation
US11010285B2 (en) 2019-01-24 2021-05-18 International Business Machines Corporation Fault detection and localization to generate failing test cases using combinatorial test design techniques
US11099975B2 (en) 2019-01-24 2021-08-24 International Business Machines Corporation Test space analysis across multiple combinatoric models
US11263116B2 (en) 2019-01-24 2022-03-01 International Business Machines Corporation Champion test case generation
US11036624B2 (en) 2019-06-13 2021-06-15 International Business Machines Corporation Self healing software utilizing regression test fingerprints
US10963366B2 (en) 2019-06-13 2021-03-30 International Business Machines Corporation Regression test fingerprints based on breakpoint values
US11422924B2 (en) 2019-06-13 2022-08-23 International Business Machines Corporation Customizable test set selection using code flow trees
US10990510B2 (en) 2019-06-13 2021-04-27 International Business Machines Corporation Associating attribute seeds of regression test cases with breakpoint value-based fingerprints
US11232020B2 (en) 2019-06-13 2022-01-25 International Business Machines Corporation Fault detection using breakpoint value-based fingerprints of failing regression test cases
US10970195B2 (en) 2019-06-13 2021-04-06 International Business Machines Corporation Reduction of test infrastructure
US10970197B2 (en) 2019-06-13 2021-04-06 International Business Machines Corporation Breakpoint value-based version control
JP7351129B2 (en) 2019-07-26 2023-09-27 富士通株式会社 Information processing device and control program for the information processing device

Family Cites Families (58)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AT285689B (en) * 1968-03-29 1970-11-10 Siemens Ag Centrally controlled switching system for telecommunications, in particular telephone technology
US3665173A (en) * 1968-09-03 1972-05-23 Ibm Triple modular redundancy/sparing
US3864670A (en) * 1970-09-30 1975-02-04 Yokogawa Electric Works Ltd Dual computer system with signal exchange system
SE347826B (en) * 1970-11-20 1972-08-14 Ericsson Telefon Ab L M
FR2182259A5 (en) * 1972-04-24 1973-12-07 Cii
US3898621A (en) * 1973-04-06 1975-08-05 Gte Automatic Electric Lab Inc Data processor system diagnostic arrangement
US4099241A (en) * 1973-10-30 1978-07-04 Telefonaktiebolaget L M Ericsson Apparatus for facilitating a cooperation between an executive computer and a reserve computer
US4031372A (en) * 1973-11-06 1977-06-21 Westinghouse Electric Corporation System for manually or automatically transferring control between computers without power generation disturbance in an electric power plant or steam turbine operated by a multiple computer control system
CH623669A5 (en) * 1973-11-14 1981-06-15 Agie Ag Ind Elektronik
US3873819A (en) * 1973-12-10 1975-03-25 Honeywell Inf Systems Apparatus and method for fault-condition signal processing
IT1014277B (en) * 1974-06-03 1977-04-20 Cselt Centro Studi Lab Telecom CONTROL SYSTEM OF PROCESS COMPUTERS OPERATING IN PARALLEL
US4313160A (en) * 1976-08-17 1982-01-26 Computer Automation, Inc. Distributed input/output controller system
US4228496A (en) * 1976-09-07 1980-10-14 Tandem Computers Incorporated Multiprocessor system
US4099234A (en) * 1976-11-15 1978-07-04 Honeywell Information Systems Inc. Input/output processing system utilizing locked processors
SE397013B (en) * 1976-12-17 1977-10-10 Ellemtel Utvecklings Ab METHOD AND DEVICE FOR TRANSFERRING DATA INFORMATION TO TWO PARALLEL WORKING COMPUTER PARTS
US4358823A (en) * 1977-03-25 1982-11-09 Trw, Inc. Double redundant processor
US4141066A (en) * 1977-09-13 1979-02-20 Honeywell Inc. Process control system with backup process controller
US4153318A (en) * 1977-10-17 1979-05-08 Square D Company Bus stab for panelboard assembly
JPS6016664B2 (en) * 1977-10-28 1985-04-26 豊田工機株式会社 data transfer device
US4403282A (en) * 1978-01-23 1983-09-06 Data General Corporation Data processing system using a high speed data channel for providing direct memory access for block data transfers
DE2813383A1 (en) * 1978-03-28 1979-10-11 Siemens Ag DATA TRANSMITTING/RECEIVING EQUIPMENT WITH PARALLEL/SERIAL AND SERIAL/PARALLEL CHARACTER CONVERSION, IN PARTICULAR FOR DATA EXCHANGE BETWEEN COMMUNICATING DATA PROCESSING SYSTEMS
GB2019622B (en) * 1978-04-14 1982-04-07 Lucas Industries Ltd Digital computing apparatus
US4200226A (en) * 1978-07-12 1980-04-29 Euteco S.P.A. Parallel multiprocessing system for an industrial plant
US4270168A (en) * 1978-08-31 1981-05-26 United Technologies Corporation Selective disablement in fail-operational, fail-safe multi-computer control system
US4268902A (en) * 1978-10-23 1981-05-19 International Business Machines Corporation Maintenance interface for a service processor-central processing unit computer system
US4495571A (en) * 1979-01-31 1985-01-22 Honeywell Information Systems Inc. Data processing system having synchronous bus wait/retry cycle
US4245344A (en) * 1979-04-02 1981-01-13 Rockwell International Corporation Processing system with dual buses
US4253147A (en) * 1979-04-09 1981-02-24 Rockwell International Corporation Memory unit with pipelined cycle of operations
US4377843A (en) * 1979-04-19 1983-03-22 Wescom Switching, Inc. Data distribution interface
DE2920994A1 (en) * 1979-05-23 1980-11-27 Siemens Ag DATA SEND / RECEIVER DEVICE WITH PARALLEL / SERIAL AND SERIAL / PARALLEL CHARACTERS CONVERSION, IN PARTICULAR FOR DATA EXCHANGE BETWEEN COMMUNICATING DATA PROCESSING SYSTEMS
DE2926292A1 (en) * 1979-06-29 1981-01-08 Harnischfeger Gmbh PARTICULARLY MOBILE TELESCOPIC BOOM CRANE
US4428044A (en) * 1979-09-20 1984-01-24 Bell Telephone Laboratories, Incorporated Peripheral unit controller
DE3003291C2 (en) * 1980-01-30 1983-02-24 Siemens AG, 1000 Berlin und 8000 München Two-channel data processing arrangement for railway safety purposes
US4321666A (en) * 1980-02-05 1982-03-23 The Bendix Corporation Fault handler for a multiple computer system
FR2477809B1 (en) * 1980-03-10 1987-08-21 Jeumont Schneider SYSTEM FOR FAST TRANSMISSION OF MESSAGES BETWEEN COMPUTERS
US4365293A (en) * 1980-03-28 1982-12-21 Pitney Bowes Inc. Serial communications bus for remote terminals
US4377000A (en) * 1980-05-05 1983-03-15 Westinghouse Electric Corp. Automatic fault detection and recovery system which provides stability and continuity of operation in an industrial multiprocessor control
US4371754A (en) * 1980-11-19 1983-02-01 Rockwell International Corporation Automatic fault recovery system for a multiple processor telecommunications switching control
US4418343A (en) * 1981-02-19 1983-11-29 Honeywell Information Systems Inc. CRT Refresh memory system
US4424565A (en) * 1981-06-22 1984-01-03 Bell Telephone Laboratories, Incorporated Channel interface circuit with high speed data message header field translation and direct memory access
US4486826A (en) * 1981-10-01 1984-12-04 Stratus Computer, Inc. Computer peripheral control apparatus
US4597084A (en) * 1981-10-01 1986-06-24 Stratus Computer, Inc. Computer memory apparatus
JPS5892025A (en) * 1981-11-26 1983-06-01 Hitachi Ltd Data processing system
IT1151351B (en) * 1982-01-19 1986-12-17 Italtel Spa CIRCUIT PROVISION SUITABLE TO CARRY OUT THE EXCHANGE OF DATA BETWEEN A COUPLE OF OPERATING PROCESSORS ACCORDING TO THE MASTER-SLAVE PRINCIPLE
US4541094A (en) * 1983-03-21 1985-09-10 Sequoia Systems, Inc. Self-checking computer circuitry
US4602327A (en) * 1983-07-28 1986-07-22 Motorola, Inc. Bus master capable of relinquishing bus on request and retrying bus cycle
DE3328405A1 (en) * 1983-08-05 1985-02-21 Siemens AG, 1000 Berlin und 8000 München Control elements of a fault-tolerant multicomputer system
US4610013A (en) * 1983-11-08 1986-09-02 Avco Corporation Remote multiplexer terminal with redundant central processor units
US4569017A (en) * 1983-12-22 1986-02-04 Gte Automatic Electric Incorporated Duplex central processing unit synchronization circuit
EP0148297B1 (en) * 1984-01-09 1993-12-15 Hitachi, Ltd. Synchronous decentralized processing system
US4589066A (en) * 1984-05-31 1986-05-13 General Electric Company Fault tolerant, frame synchronization for multiple processor systems
US4751702A (en) * 1986-02-10 1988-06-14 International Business Machines Corporation Improving availability of a restartable staged storage data base system that uses logging facilities
DE3789176T2 (en) * 1986-08-12 1994-09-08 Hitachi Ltd Microprocessor for data transfer retry.
FR2602891B1 (en) * 1986-08-18 1990-12-07 Nec Corp ERROR CORRECTION SYSTEM OF A MULTIPROCESSOR SYSTEM FOR CORRECTING AN ERROR IN A PROCESSOR BY PUTTING THE PROCESSOR INTO CONTROL CONDITION AFTER COMPLETION OF THE MICROPROGRAM RESTART FROM A RESUMPTION POINT
JPS6375963A (en) * 1986-09-19 1988-04-06 Hitachi Ltd System recovery system
JPH0690682B2 (en) * 1987-02-28 1994-11-14 日本電気株式会社 Fault processing method of multiprocessor system
EP0306211A3 (en) * 1987-09-04 1990-09-26 Digital Equipment Corporation Synchronized twin computer system
US4916704A (en) * 1987-09-04 1990-04-10 Digital Equipment Corporation Interface of non-fault tolerant components to fault tolerant system

Also Published As

Publication number Publication date
US5153881A (en) 1992-10-06
EP0414379A3 (en) 1993-02-24
EP0414379A2 (en) 1991-02-27
JPH03182939A (en) 1991-08-08

Similar Documents

Publication Publication Date Title
CA2022209A1 (en) Method of handling errors in software
EP0415545B1 (en) Method of handling errors in software
US5068851A (en) Apparatus and method for documenting faults in computing modules
US5251227A (en) Targeted resets in a data processor including a trace memory to store transactions
CA2022259C (en) Method and apparatus for controlling initiation of bootstrap loading
US5065312A (en) Method of converting unique data to system data
CA1310129C (en) Interface of non-fault tolerant components to fault tolerant system
EP0306244B1 (en) Fault tolerant computer system with fault isolation
US4907228A (en) Dual-rail processor with error checking at single rail interfaces
US5005174A (en) Dual zone, fault tolerant computer system with error checking in I/O writes
EP0415551A2 (en) Protocol for transfer of DMA data
US5001712A (en) Diagnostic error injection for a synchronous bus system
EP1204924B1 (en) Diagnostic caged mode for testing redundant system controllers
US5048022A (en) Memory device with transfer of ECC signals on time division multiplexed bidirectional lines
EP0415552B1 (en) Protocol for read and write transfers
EP0411805B1 (en) Bulk memory transfer during resync
EP0416732B1 (en) Targeted resets in a data processor
EP0415547A2 (en) Method of handling nonexistent memory errors

Legal Events

Date Code Title Description
EEER Examination request
FZDE Discontinued