
SYSTEM AND METHOD FOR RECOVERY
FROM ADDRESS ERRORS

TECHNICAL FIELD

The invention relates to computers and processor systems. More particularly, the invention relates to recovery from address channel errors in a multiprocessor computing system having cache memory.

BACKGROUND ART

In a computer system, the interface between a processor and memory is critically important to the performance of the system. Because fast memory is very expensive, memory in the amount needed to support a processor is generally much slower than the processor. In order to bridge the gap between fast processor cycle times and slow memory access times, cache memory was developed. A cache is a small amount of very fast, zero wait state memory that is used to store a copy of frequently accessed data and instructions from main memory. The microprocessor can operate out of this very fast memory and thereby reduce the number of wait states that must be interposed during memory accesses. When the processor requests data from memory and the data resides in the cache, then a cache read "hit" takes place, and the data from the memory access can be returned to the processor from the cache without incurring wait states. If the data is not in the cache, then a cache read "miss" takes place, and the memory request is forwarded to the system and the data is retrieved from main memory, as would normally be done if the cache did not exist. On a cache miss, the data that is retrieved from the main memory is provided to the processor and is also written into the cache due to the statistical likelihood that this data will be requested again by the processor.

The individual data elements stored in a cache memory are referred to as "lines." Each line of a cache is meant to correspond to one addressable unit of data in the main memory. A cache line thus comprises data and is associated with a main memory address in some way. Schemes for associating a main memory address with a line of cache data include direct mapping, full association and set association, all of which are well known in the art.
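The read hit and read miss behavior described above can be illustrated with a minimal sketch. It assumes direct mapping, one of the association schemes named above, and models main memory as a dictionary; the class name `DirectMappedCache` and its fields are hypothetical and are not taken from this document.

```python
class DirectMappedCache:
    """Toy direct-mapped read cache in front of a slower main memory."""

    def __init__(self, memory, num_lines=4):
        self.memory = memory        # dict: address -> value (stands in for main memory)
        self.num_lines = num_lines  # assumed cache size, in lines
        self.lines = {}             # line index -> (address, value)
        self.hits = 0
        self.misses = 0

    def read(self, address):
        index = address % self.num_lines  # direct mapping: one possible line per address
        entry = self.lines.get(index)
        if entry is not None and entry[0] == address:
            self.hits += 1                # read hit: served from the cache
            return entry[1]
        self.misses += 1                  # read miss: go to main memory
        value = self.memory[address]
        self.lines[index] = (address, value)  # fill the line for later requests
        return value
```

In this sketch, two addresses that map to the same line index displace each other, which is the usual cost of direct mapping.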

The presence of caches should be transparent to the overall system, and various protocols are implemented to achieve such transparency, including write-through and write-back protocols. In a write-through action, data to be stored is written to a cache line and to the main memory at the same time. In a write-back action, data to be stored is written to the cache and only written to the main memory later when the line in the cache needs to be displaced for a more recent line of data or when another processor requires the cached line. Because lines may be written to a cache exclusively in a write-back protocol, precautions must be taken to manage the status of data in a write-back cache, as described in greater detail below.
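The difference between the two write actions can be sketched as follows. This is an illustration only: the class `WritePolicyCache`, the `dirty` set and the `evict` method are hypothetical names, and displacement is triggered by hand rather than by a replacement policy.

```python
class WritePolicyCache:
    """Toy cache contrasting write-through and write-back stores."""

    def __init__(self, memory, write_back):
        self.memory = memory          # dict standing in for main memory
        self.write_back = write_back  # True: write-back, False: write-through
        self.lines = {}               # address -> cached value
        self.dirty = set()            # addresses written but not yet in main memory

    def write(self, address, value):
        self.lines[address] = value
        if self.write_back:
            self.dirty.add(address)       # main memory is updated later
        else:
            self.memory[address] = value  # main memory is updated at the same time

    def evict(self, address):
        """Displace a line, writing it to main memory first if it was modified."""
        if address in self.dirty:
            self.memory[address] = self.lines[address]
            self.dirty.discard(address)
        self.lines.pop(address, None)
```

Until `evict` runs, a write-back line exists only in the cache, which is why the status of such lines must be tracked.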

Cache management is generally performed by a device referred to as a cache controller. A principal cache management objective is the preservation of cache coherency. In computer systems where independent bus masters can access memory, there is a possibility that a bus master, such as another processor, network interface, disk interface, or video graphics card might alter the contents of a main memory location that is duplicated in the cache. When this occurs, the cache is said to hold stale or invalid data. In order to maintain cache coherency, it is necessary for the cache controller to monitor the system bus when the processor does not own the system bus to see if another bus master accesses main memory. This method of monitoring the bus is referred to as "snooping."

The cache controller must monitor the system bus during memory reads by a bus master in a write-back cache design because of the possibility that a previous processor write may have altered a copy of data in the cache that has not been updated in main memory. This is referred to as read snooping. On a "read snoop hit," where the cache contains data not yet updated in main memory, the cache controller generally provides the respective data to main memory, and the requesting bus master generally reads this data en route from the cache controller to main memory, this operation being referred to as "snarfing." The cache controller must also monitor the system bus during memory writes because another bus master may write to or alter a memory location that resides in the cache. This is referred to as write snooping. On a "write snoop hit," the cache entry is either marked invalid in the cache directory by the cache controller, signifying that this entry is no longer correct, or the cache is updated along with main memory. Therefore, when another bus master reads or writes to main memory in a write-back cache design, or writes to main memory in a write-through cache design, the cache controller must latch the system address and perform a cache look-up to see if the main memory location being accessed also resides in the cache. If a copy of the data from this location does reside in the cache, then the cache controller takes the appropriate action depending on whether a read or write snoop hit has occurred. This prevents incoherent data from being stored in main memory and the cache, thereby preserving cache coherency.
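The read snoop and write snoop cases above can be sketched in a few lines. This is a simplified model, not the controller of this document: `SnoopingCache` and its methods are hypothetical names, and on a write snoop hit the sketch always invalidates the entry, which is only one of the two options the text allows.

```python
class SnoopingCache:
    """Toy write-back cache that reacts to accesses by other bus masters."""

    def __init__(self):
        self.lines = {}  # address -> [value, dirty flag]

    def snoop_read(self, address, memory):
        """Another bus master reads `address`; supply modified data on a read snoop hit."""
        line = self.lines.get(address)
        if line is not None and line[1]:
            memory[address] = line[0]  # provide the data to main memory
            line[1] = False            # the cached copy now matches main memory
            return line[0]             # the requester may "snarf" this value en route
        return None                    # no modified copy held here

    def snoop_write(self, address):
        """Another bus master writes `address`; invalidate on a write snoop hit."""
        return self.lines.pop(address, None) is not None
```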

Another consideration in the preservation of cache coherency is the handling of processor writes to memory. When the processor writes to main memory, the memory location must be checked to determine if a copy of the data from this location also resides in the cache. If a processor write hit occurs in a write-back cache design, then the cache location is updated with the new data and main memory may be updated with the new data at a later time or should the need arise. In a write-through cache, the main memory location is generally updated in conjunction with the cache location on a processor write hit. If a processor write miss occurs, the cache controller may ignore the write miss in a write-through cache design because the cache is unaffected in this design. Alternatively, the cache controller may perform a "write-allocate" whereby the cache controller allocates a new line in a cache in addition to passing the data to the main memory. In a write-back cache design, the cache controller generally allocates a new line in the cache when a processor write miss occurs. This generally involves reading the remaining entries to fill the line from main memory before or jointly with providing the write data to the cache. Main memory is updated at a later time should the need arise.

Caches may be designed independently of the microprocessor, in which case the cache is placed on the local bus of the microprocessor and interfaced between the processor and the system bus during the design of the computer system. However, as the density of transistors on a processor chip has increased, processors may be designed with one or more internal caches in order to decrease further memory access times. The internal cache used in these processors is generally small, an exemplary size being 8 k (8192 bytes). In computer systems that utilize processors with one or more internal caches, an external cache is often added to the system to further improve memory access time. The external cache is generally much larger than the internal cache(s), and, when used in conjunction with the internal cache(s), provides a greater overall hit rate than the internal cache(s) would provide alone.

In systems that incorporate multiple levels of caches, when the processor requests data from memory, the internal or first level cache is first checked to see if a copy of the data resides there. If so, then a first level cache hit occurs, and the first level cache provides the appropriate data to the processor. If a first level cache miss occurs, then the second level cache is then checked. If a second level cache hit occurs, then the data is provided from the second level cache to the processor. If a second level cache miss occurs, then the data is retrieved from main memory. This process continues through higher levels of caches, if present. Write operations are similar, with mixing and matching of the operations discussed above being possible.
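The level-by-level lookup order described above can be sketched as a short function. The function name `multilevel_read` and the use of dictionaries for caches and main memory are assumptions made for illustration; line fills on a miss are deliberately left out.

```python
def multilevel_read(address, levels, main_memory):
    """Check each cache level in order; fall back to main memory on a miss at every level.

    `levels` is a list of dicts ordered from the first level outward.
    Returns a (value, source) pair, where source names the level that hit.
    """
    for number, cache in enumerate(levels, start=1):
        if address in cache:
            return cache[address], f"L{number}"  # hit at this level
    return main_memory[address], "memory"        # missed in every cache level
```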

In many instances where multilevel cache hierarchies exist with multiple processors, a property referred to as multilevel inclusion is desired in the hierarchy. Multilevel inclusion provides that a second level (e.g., external) cache is guaranteed to have a copy of what is inside a first level (e.g., internal) cache. In this case, the second level cache holds a superset of the first level cache. Multilevel inclusion obviates the need for all levels of caches to snoop the system bus and thus enables the caches to perform more efficiently. Multilevel inclusion is most popular in multiprocessor systems, where the higher level caches can shield the lower level caches from cache coherency problems and thereby prevent unnecessary snoops that would otherwise occur in the lower level caches if multilevel inclusion were not implemented.

In a multiprocessor system where each processor utilizes a multilevel cache system with inclusion, there may be, for example, a Level 1 (L1) write-through cache associated with each processor and a larger, slower Level 2 (L2) write-back cache, which is still much faster than the main memory. The L2 and L1 caches utilize the MESI (pronounced "messy") protocol for managing the state of each cache line as follows: For each cache line, there is an M, E, S, or I state that indicates the current state of the cache line in the system. According to this well-known protocol, the Exclusive (E) bit indicates that the line only exists in this cache, the Shared (S) bit indicates that the line can be shared by multiple users at one time, the Invalid (I) bit indicates that the line is not available in the cache, and the Modified (M) bit indicates that the line has been changed or modified since it was first written to the cache. This management system improves system performance because unmodified lines need not be written back to the system's main memory.

The L1 cache does not require the Exclusive (E) bit in systems where it is the L2 cache's responsibility to manage line MESI state changes. Thus, the L1 cache may be said to implement the MSI protocol. In these systems, a line marked Exclusive (E) in L2 would be marked Shared (S) in L1. If another processor wants to share a copy of this line, the L2 cache would indicate via its snoop response that the line is Shared (S) and change the state of the L2 copy of the line to Shared (S). Because the L1 line state did not need to be changed, the L1 cache did not need to be involved in the line state change, thus improving performance.

In a multiprocessor environment, snoop latency may be fixed, which means that when a processor makes a storage request on the system bus, all other processors, or bus devices, must respond within a fixed period of time. In the event the storage request is a line read, other processors or devices which have a copy of the line are allowed to respond only with Shared (S) or Modified (M). A processor is not allowed to keep exclusive ownership of the line in this case. If the snoop response is Modified (M), the processor owning the current state of the line must provide the current copy to the requester, and change the state of its copy of the line to Shared (S) or Invalid (I), depending on the snoopy bus protocol. In systems where the L1 cache cannot be snooped, or the L1 cache snoop response cannot meet the fixed response time requirement of the snoopy bus, the L2 cache must mark a line as Modified (M) prior to any processor store to that line.
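The state handling described in the preceding paragraphs can be summarized in a small sketch: the L1 view of an L2 state, and the snoop response of the L2 cache to a line read by another processor. The names `State`, `l1_view` and `snoop_line_read` are hypothetical, the `modified_goes_invalid` flag stands in for the choice left to the snoopy bus protocol, and returning `None` for a cache that holds no copy is an assumption of this sketch.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


def l1_view(l2_state):
    """An L1 cache implementing MSI sees an Exclusive L2 line as Shared."""
    return State.SHARED if l2_state is State.EXCLUSIVE else l2_state


def snoop_line_read(l2_state, modified_goes_invalid=False):
    """Return (snoop response, new L2 state) when another processor reads the line."""
    if l2_state is State.INVALID:
        return None, State.INVALID  # assumption: no copy held, so no S or M response
    if l2_state is State.MODIFIED:
        # The owner supplies the current copy, then drops to Shared or Invalid.
        new_state = State.INVALID if modified_goes_invalid else State.SHARED
        return State.MODIFIED, new_state
    # Exclusive or Shared: respond Shared; exclusive ownership is not kept.
    return State.SHARED, State.SHARED
```

Note that an Exclusive line in L2 changes to Shared without any change to the L1 view, which is the performance benefit noted above.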

An alternative to snoopy protocols is a directory based cache coherency protocol. In a directory based coherency scheme, the system typically contains a single directory having one entry for every address in main memory. Each directory entry identifies the ownership and data state information of each line of main memory. That is, the directory contents are tags. The data states tracked in a directory coherency system may be similar to the data states tracked in a snoopy system (e.g., MESI-based states or something similar).

OBJECTS OF THE INVENTION

An objective of the present invention is to exploit cache coherency information in a multiprocessor computing system in reaction to an address error.

One of the most serious types of errors in a computer system is an address error on a computer bus. A typical computer bus contains a separate address bus (as well as a data bus and control lines), and as address buses are designed in greater widths, the specter of address errors becomes more threatening. An address error is serious because a bus agent that reports the error must be assumed to have no idea what the true memory address target was. The result might be a memory controller providing the wrong data, or worse yet, writing the right data to the wrong memory location. When an address error occurs in a multiprocessor system, one or more processors may have inconsistent views of memory. In order to avoid the continued processing of corrupted data, most computers respond to an address error by generating a fatal error, which in turn causes an immediate failure of the operating system. This is disadvantageous because it does not permit graceful shutdown of applications.

Non-graceful shutdowns cause significant increases in recovery time for applications such as databases. In particular, non-graceful shutdowns may occur before the system has an opportunity to flush open buffers and close open files. As a result, storage files that reside on I/O (input/output) devices may be corrupt and inconsistent. Returning a computer file system to a known good state may require a great deal of time and effort. Typically, archival backups and/or update logs of the storage system are needed. Further, some data that has been entered since the last backup needs to be recreated, if that is even possible.

An advantage of the present invention is a higher likelihood of either avoiding the need for bringing the system down as a result of an address error or providing a window of opportunity in which to conduct a more orderly shutdown of critical operations, in response to an address error.

SUMMARY OF INVENTION

This invention is based upon the recognition that the property of inclusion, offered by an inclusive cache, coherency directory or other coherency filter provides unique opportunities for error recovery. In traditional bus-based MESI coherency systems, only the owner of a cache line knows that he owns it. Thus, if the owner dies or his connection to the system fails, the correct state of memory is totally unknown. The data structures used by inclusive systems to track inclusion (address tags for caches, and directory entries for directories) contain redundant information about the ownership of lines and, in some cases, up-to-date copies of modified data. The present invention is sometimes capable of providing enough data about the state of memory to allow applications to recover from address errors, such as parity errors. The information provided may be sufficient to allow a complete recovery, but more often, the information provided will allow the system to avoid corrupt data and run long enough to permit graceful shutdown of mission critical applications.

According to a method of the present invention, an address error is detected on a local channel, such as a local bus. The coherency states of one or more lines of cache memory associated with the local channel are then read, and actions are taken in response. Reading of coherency states ranges from a complete and active interrogation of all cache lines, to a selective and passive interrogation, such as in responding to snoop requests. If the data state consistency is unknown, such as when the MESI state is Modified (M) or Exclusive (E), then the corresponding data in main memory is poisoned. Poisoning may be accomplished by writing a detectable but unrecoverable error pattern in the main memory. Alternatively, the same effect may be accomplished by signaling a hard error on the system bus. If the data state consistency of an interrogated cache line is Shared (S) or Invalid (I), the line may be ignored or the line marked invalid. If the state of the cached line is valid and consistent, such as the "Modified uncached" (Mu) state in a MuMESI protocol, then the line may be written to main memory or provided to a snoop requester.

DESCRIPTION OF DRAWINGS

FIG. 1 is a block diagram of a multiprocessor computing system according to the present invention.

FIG. 2 is a flowchart of a first method according to the present invention.

FIG. 3 is a flowchart of a second method according to the present invention.

FIG. 4 is a flowchart of a third method according to the present invention.

FIG. 5 is a block diagram of modules according to the present invention.

DETAILED DESCRIPTION OF PREFERRED
EMBODIMENTS

FIG. 1 is a block diagram of a multiprocessor computing system 100. A system bus 105 interconnects major system elements. Although the system bus 105 is illustrated as a bus, a bus is exemplary of any channel (e.g., ring) interconnecting major system elements. Attached to the system bus 105 is a main memory 110. One or more input/output (I/O) devices 115 may also be attached to the system bus 105. Exemplary I/O devices include disk drives, graphics devices and network interfaces. Also attached to the system bus 105 are caches 120 and 125. Although two caches are illustrated, any supportable number of caches is possible. Attached to the caches 120 and 125 are local buses 130 and 135, respectively. As with the system bus 105, the bus structure shown for local buses 130 and 135 is illustrative of any type of channel. Attached to the local bus 130 are processors 140 and 145. Attached to the local bus 135 are processors 150 and 155. Although two processors are illustrated on each local bus, the number of processors is arbitrary within system limits, including only a single processor.

FIGS. 2 and 3 are flowcharts of an address error recovery method for the multiprocessor computing system 100. The flowcharts of FIGS. 2 and 3 illustrate the method in reference to the following example: Assume that the processor 140 executes a read from a given memory address. The processor 145 sees this read on the local bus 130 and detects that the address is erroneous. The address is truly erroneous and cannot be remedied by retransmission or other protocol. Thus, the cache 120 may be inconsistent.

According to the method 200 illustrated in FIG. 2, a detecting step 205 is first taken to determine that the address on the local bus 130 is erroneous. Typically, this detecting step 205 comprises a parity check over the address field. If an error is detected, the local bus 130 is placed in a failed state, according to the placing step 210, so that the local bus 130 is cut off from the rest of the system 100. The placing step 210 has the effect of quiescing the processors 140 and 145 and assuring that any corrupted data that may exist in the processor 140, the processor 145 or the local bus 130 cannot be transferred to the main memory 110 or another part of the system 100. Next, a notifying step 215 is performed to notify another processor of the error. The notified processor may be another processor, such as the processors 150 or 155, or a separate processor, such as a master processor or a maintenance processor (not shown). It is necessary that the notified processor not be isolated from the cache 120 or the main memory 110, so that the processor can execute a recovery routine. Any method of notification can be utilized to reach this non-isolated processor. An exemplary method of notification is signaling an interrupt line, preferably a high priority interrupt line.

In response to the notifying step 215, the non-isolated processor performs an interrogating step 220 on each line in the cache 120. Based upon a coherency state of a line, an appropriate action is taken so as to minimize the impact of the address error. The interrogated coherency states may be stored in the cache 120 in the case of a snoopy system or in a directory (not illustrated) in a directory based coherency system. In a preferred embodiment, the possible coherency states are based upon the MSI, MESI, MOESI, MuMESI or similar protocols, which are collectively referred to as MESI-based schemes. If the MESI state of a given line is Modified uncached (Mu), if this state is implemented, then a main memory writing step 225 is performed, by which the given cache line is written to the main memory 110. The MuMESI protocol augments the basic MESI protocol by utilizing an additional "Modified uncached" (Mu) state. This state basically signifies that a cache line is valid and consistent and is described in greater detail in pending U.S. patent application Ser. No. 09/290,430, entitled "Optimization of MESI/MSI Protocol to Improve L3 Cache Performance" (attorney docket no. HP 10981260-1), which is hereby incorporated by reference. The MOESI protocol is another variation of the MESI protocol and is well known to those skilled in the art.

If the MESI state of the given line is Modified (M), representing that the consistency of this data is unknown, then a poisoning step 230 is performed. The objective of the poisoning step 230 is to ensure that corrupt data will not be used by the system 100 in subsequent computations. According to the poisoning step 230, a detectable but uncorrectable error pattern is written onto the data field corresponding to this line in the main memory 110. Thus,
