EP0359384B1 - Queued posted-write disk write method and apparatus with improved error handling - Google Patents

Publication number
EP0359384B1
EP0359384B1 EP89307948A EP89307948A EP0359384B1 EP 0359384 B1 EP0359384 B1 EP 0359384B1 EP 89307948 A EP89307948 A EP 89307948A EP 89307948 A EP89307948 A EP 89307948A EP 0359384 B1 EP0359384 B1 EP 0359384B1
Authority
EP
European Patent Office
Prior art keywords
queue
disk
write
sector
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
EP89307948A
Other languages
German (de)
French (fr)
Other versions
EP0359384A2 (en)
EP0359384A3 (en)
Inventor
Curtis R. Jones
Robert S. Gready
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Compaq Computer Corp
Original Assignee
Compaq Computer Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Compaq Computer Corp
Publication of EP0359384A2
Publication of EP0359384A3
Application granted
Publication of EP0359384B1
Anticipated expiration
Expired - Lifetime

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 - Error detection; Error correction; Monitoring
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 - Error detection; Error correction; Monitoring
    • G06F11/07 - Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 - Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0706 - Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
    • G06F11/073 - Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a memory management context, e.g. virtual memory or cache management
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 - Error detection; Error correction; Monitoring
    • G06F11/07 - Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 - Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0706 - Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
    • G06F11/0727 - Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a storage system, e.g. in a DASD or network based storage system
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 - Error detection; Error correction; Monitoring
    • G06F11/07 - Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 - Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0793 - Remedial or corrective actions
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 - Error detection; Error correction; Monitoring
    • G06F11/07 - Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 - Error detection or correction of the data by redundancy in operation
    • G06F11/1402 - Saving, restoring, recovering or retrying
    • G06F11/1415 - Saving, restoring, recovering or retrying at system level
    • G06F11/1435 - Saving, restoring, recovering or retrying at system level using file system or storage system metadata
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 - Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 - Addressing or allocation; Relocation
    • G06F12/08 - Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 - Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0804 - Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with main memory updating
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 - Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 - Addressing or allocation; Relocation
    • G06F12/08 - Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 - Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0866 - Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches for peripheral storage systems, e.g. disk cache
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 - Error detection; Error correction; Monitoring
    • G06F11/07 - Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 - Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0751 - Error or fault detection not based on redundancy
    • G06F11/0754 - Error or fault detection not based on redundancy by exceeding limits
    • G06F11/0757 - Error or fault detection not based on redundancy by exceeding limits by exceeding a time limit, i.e. time-out, e.g. watchdogs

Definitions

  • the timeout error logic uses flags to determine where the write operation timed out. This aids in making a determination whether or not the last sector sent to the conventional INT 13H write routine was actually written out to the disk. It also aids in determining whether the timeout was caused by hardware (e.g., controller failure) or software (e.g., conflicting programs) and in giving the user an appropriate error message.
  • each disk read request and each disk write request is checked to determine whether any requested sector is in the queue (meaning that the actual disk sector is out of date or perhaps even inaccessible). Such requests are serviced from the queue buffer; if the sector in question is written to, it is updated in the queue buffer. A suitable alarm prompts the user to back up the disk (the backup request will be serviced in part from the queue).
  • the queue is forced empty (i.e., written out to the disk, or "flushed") before the requested operation is performed. If disk reads are cached, the queue is flushed only if the desired sectors to be read are not present in the cache. This queue flushing is a convenient, low-overhead way of ensuring that the data in the queue is always current.
  • the queue is not sorted, because the queue is small and because most single-sector writes will either be sequential or will vary greatly across the disk.
  • the overhead associated with moving data is very high, so in this embodiment the queue is copied once and not moved.
  • the queue may contain write requests for a number of sectors (i.e., if write requests are generated in quick succession by the task). If successive write requests in the queue are directed to successive disk sectors, a more efficient disk write can be accomplished when emptying the queue by performing a multi-sector write from the queue.
  • the queue in the embodiment described is circular and has a fixed buffer location; furthermore, the head and tail pointers are both placed at the top of the queue when the queue is empty.

Description

This invention relates to a method for queuing posted-write disk write operations, with improved error handling.
Disk Caching
Posted writes to a disk are roughly analogous in some ways to cached reads from a disk. Disk caching, as is well known to those of ordinary skill, is a method of keeping a copy of the information last read from a relatively slow storage device (e.g., a fixed or "hard" disk) in much faster read-write random-access memory (RAM). This permits quicker processing of subsequent requests for that data.
Disk caching typically operates in conjunction with read-operation requests by application programs or other programs (referred to here as "tasks").
Generally speaking, when a task initiates a read operation, it reserves a certain portion of RAM, referred to as a "buffer," and requests that the information stored in one or more disk sectors be copied to the buffer.
Once this information is copied to RAM, the task can manipulate the information much more rapidly than on the disk. In part, this is because manipulation of the information on the disk requires activation of mechanical components of a disk drive, whereas manipulation in RAM is done entirely electronically.
When disk caching is used, each time a task requests that a certain disk sector be read (i.e., copied into RAM), not only is the requested sector read, but in addition certain adjacent sectors are also copied into a special RAM "cache buffer." This is done on the assumption that these disk sectors are likely to be read soon themselves.
Consequently, if one or more of those adjacent sectors is indeed the subject of a subsequent read request, the request can be filled from the RAM cache buffer. Since the relatively slow disk drives need not be activated, and the request is thus filled entirely electronically, this subsequent read request is completed much faster than the first one.
When disk caching is enabled, whenever a read request for specified disk sectors is initiated, the cache buffer is checked first to see if the desired disk sectors have already been read into the cache buffer. Only if the information is not in the cache buffer is an actual disk read operation initiated, whereupon the new information is itself copied to the cache buffer. "Old" information in the cache buffer is removed from the buffer; generally speaking, the information that is removed is the least recently used information.
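The read path just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the cache utility referred to in this document: the class name `SectorCache`, the `read_sector` callable, and the `capacity` and `read_ahead` parameters are all hypothetical.

```python
from collections import OrderedDict


class SectorCache:
    """Toy read cache keyed by sector number (hypothetical, for illustration)."""

    def __init__(self, read_sector, capacity=8, read_ahead=2):
        self.read_sector = read_sector  # assumed callable: sector number -> bytes
        self.capacity = capacity        # maximum number of cached sectors
        self.read_ahead = read_ahead    # adjacent sectors fetched on a miss
        self.entries = OrderedDict()    # least recently used entry comes first
        self.disk_reads = 0

    def _fetch(self, sector):
        self.disk_reads += 1
        return self.read_sector(sector)

    def _store(self, sector, data):
        self.entries[sector] = data
        self.entries.move_to_end(sector)
        while len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # remove the least recently used sector

    def read(self, sector):
        if sector in self.entries:            # hit: no disk activity needed
            self.entries.move_to_end(sector)
            return self.entries[sector]
        data = self._fetch(sector)            # miss: an actual disk read
        for s in range(sector + 1, sector + 1 + self.read_ahead):
            if s not in self.entries:
                self._store(s, self._fetch(s))
        self._store(sector, data)             # stored last, so it is the most recent
        return data
```

A read of sector 10 with `read_ahead=2` touches the disk three times; a following read of sector 11 is then answered from the cache without any disk access.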
Interrupts
The operation of the present invention makes use of a special capability designed into many computer architectures, known as "interrupts" and "interrupt handlers."
Interrupts cause a central processing unit (CPU) of a computer to suspend execution of the current program instruction(s); to execute a specific "interrupt handler" routine or set of instructions; and then to resume execution of the suspended current program instruction(s) at the point where it left off.
Interrupts can be generated by software, i.e., by special instructions, known as "software interrupts," that are built into a program. When the CPU encounters a software interrupt in a program, among other things it executes the associated interrupt handler routine, then returns to execution of the program into which the software interrupt was built. A return from an interrupt-generated call to an interrupt handler routine is known as an interrupt return or "IRET."
Interrupts can also be generated by appropriately designed hardware: many CPUs (e.g., the Intel 8086 family, including the 8086, 8088, 80286, and 80386) are designed so that other hardware components in the computer system can cause an interrupt by transmitting special signals to the CPU.
Disk Write Interrupts
Interrupts are commonly used to initiate disk write operations. A disk write operation typically entails copying of information to the disk from RAM that is in use by a task.
A disk-write interrupt might be generated by a task. For example, a spreadsheet program could initiate such a request in order to save the user's work.
(In this discussion, a computer program itself is sometimes referred to as performing one or another operation. In reality, it is a hardware component such as the CPU that actually performs the operation under control of the program. This is a common shorthand in the art.)
Disk Write Interrupt Handler in ISA BIOS
A common disk-write interrupt in the "industry standard architecture" (ISA) causes execution of a specific interrupt handler (known as INT 13H) that is part of the BIOS (basic input/output services) program. The industry standard architecture is exemplified by, e.g., the IBM PC and the Compaq Deskpro 286.
The BIOS program typically is stored in a read-only memory (ROM) installed in the ISA computer, and so the BIOS program itself is sometimes referred to as simply "the ROM."
In processing the disk write request, the INT 13H interrupt handler does two things (among others). The explanation below uses a write operation to a fixed disk as an example.
First, INT 13H copies the specified data to be written to disk (referred to here as a "write buffer") to a buffer under the control of a disk controller associated with the specified disk drive (e.g., in RAM that is installed with the disk controller and not as part of "main" memory). It then directs the disk controller to copy the data to a specified sector(s) on the disk.
That having been done, INT 13H itself calls the INT 15H WAIT interrupt handler. By default, INT 15H WAIT simply returns to the calling function, i.e., to the INT 13H interrupt handler.
INT 13H then enters a wait loop: in each iteration of the loop, it checks to see if a flag has been set to indicate that a fixed disk hardware interrupt has occurred; if the flag has not been set, the loop continues. In other words, the CPU is now busy waiting for the fixed disk hardware-interrupt-occurred flag to be set, and has not resumed executing the task's instructions.
When the disk controller has completed the write operation, it generates a hardware interrupt that causes the CPU to set the fixed disk hardware-interrupt-occurred flag and execute the INT 15H POST interrupt handler. This routine typically performs an IRET back to the INT 13H wait loop.
Now that the fixed disk hardware-interrupt-occurred flag has been set, the INT 13H wait loop is ended, whereupon INT 13H clears the flag, finishes its processing, and performs an IRET to return control to the task.
In effect, the design of the INT 13H interrupt handler routine forces the CPU to sit idle until the disk controller (which, once activated, does not need the CPU to perform its data-writing functions) has completed its work.
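The cost of this design can be made concrete with a small simulation. The sketch below is not BIOS code; `FakeDiskController`, `blocking_write`, and the `disk_irq` flag name are hypothetical stand-ins, and the controller's "hardware interrupt" is modelled as a flag that becomes set after a fixed number of polls.

```python
class FakeDiskController:
    """Hypothetical controller that completes a write after a fixed number of polls."""

    def __init__(self, polls_until_done):
        self.remaining = polls_until_done
        self.pending = None
        self.disk = {}

    def start_write(self, sector, data):
        self.pending = (sector, bytes(data))

    def poll(self, flags):
        # Stands in for the passage of time; when the write completes, the
        # simulated hardware interrupt sets the interrupt-occurred flag.
        self.remaining -= 1
        if self.remaining <= 0 and self.pending is not None:
            sector, data = self.pending
            self.disk[sector] = data
            self.pending = None
            flags["disk_irq"] = True


def blocking_write(controller, sector, data):
    """Model of the conventional write: start the write, then spin on the flag."""
    flags = {"disk_irq": False}
    controller.start_write(sector, data)
    wasted_iterations = 0
    while not flags["disk_irq"]:   # the wait loop: the task cannot run meanwhile
        controller.poll(flags)
        wasted_iterations += 1
    flags["disk_irq"] = False      # clear the flag before returning to the task
    return wasted_iterations
```

Every iteration counted in `wasted_iterations` represents CPU time that the calling task could have used while the controller worked on its own.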
It will be recognized by those of ordinary skill that the above description relates to a relatively simple case of writing to disk. As is well known to those of ordinary skill, some write operations require multiple hardware interrupts, thereby causing multiple WAITs and POSTs.
Queued Write Operations
Write queuing, generally speaking, involves directing some or all disk write requests to a queue buffer instead of to the disk in question and then returning control to the task. The actual physical writing to disk is performed later on whenever convenient, thus reducing the delay in resuming execution of the calling task.
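As a rough illustration of the general idea only, a write queue can be modelled as a first-in, first-out buffer whose `post` operation returns at once and whose `drain` operation performs the deferred writes. The names `PostedWriteQueue`, `post`, `drain`, and `write_sector` are assumptions made for this sketch.

```python
from collections import deque


class PostedWriteQueue:
    """Minimal model of write queuing (hypothetical names throughout)."""

    def __init__(self, write_sector):
        self.write_sector = write_sector  # assumed callable doing the slow physical write
        self.pending = deque()

    def post(self, sector, data):
        """Queue the write and report it as complete immediately."""
        self.pending.append((sector, bytes(data)))
        return True

    def drain(self):
        """Perform the deferred physical writes in first-in, first-out order."""
        written = 0
        while self.pending:
            sector, data = self.pending.popleft()
            self.write_sector(sector, data)
            written += 1
        return written
```

Between `post` and `drain` the disk does not yet hold the data, which is why the error handling described in this document matters.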
According to the present invention, there is provided a method of maintaining a posted-write queue upon occurrence of a timeout or upon occurrence of a specified type of error, to which queue may be added information sectors, containing data to be stored on a disk, addressed to respective corresponding disk sectors, and from which queue the information sectors may be written out to the respective corresponding disk sectors,
  • the method comprising the steps of:
  • blocking the addition of information sectors to the queue; and
  • attempting to write out to the corresponding disk sector each information sector already added to the queue.
  • The invention further provides an apparatus for maintaining a posted-write queue as defined in claim 5.
    A posted-write queuing program for writing information sectors to disk sectors includes error-handling routines to minimize the risk of data loss upon specified types of errors. Upon timeouts, queuing is suspended and all information sectors pending in the queue are written out to the corresponding disk sectors. Upon specified types of write errors, queuing is discontinued and repeated attempts are made to write out all information sectors to the corresponding disk sectors. For each unsuccessful attempt, the corresponding information sector is saved in the queue; the user is alerted, and subsequent read or write requests directed to the corresponding disk sector are serviced from the saved information sector in the queue.
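The error-handling behaviour summarized above can be sketched as follows. This is a simplified model, not the claimed method itself: the class `GuardedQueue`, the `max_attempts` retry limit, the alert text, and the use of a dictionary keyed by sector number (rather than a FIFO buffer) are all assumptions made for illustration.

```python
class GuardedQueue:
    """Toy posted-write queue with the error handling described above (hypothetical)."""

    def __init__(self, write_sector, read_sector, max_attempts=3):
        self.write_sector = write_sector  # assumed callable: returns True on success
        self.read_sector = read_sector    # assumed callable: sector number -> bytes
        self.max_attempts = max_attempts  # assumed retry limit
        self.pending = {}                 # sector -> data, kept in insertion order
        self.queuing_enabled = True
        self.alerts = []

    def write(self, sector, data):
        data = bytes(data)
        if sector in self.pending:        # disk copy is stale: update the saved sector
            self.pending[sector] = data
            return True
        if self.queuing_enabled:          # normal case: post the write
            self.pending[sector] = data
            return True
        return self.write_sector(sector, data)  # queuing blocked: write through

    def read(self, sector):
        if sector in self.pending:        # serviced from the queue, not the disk
            return self.pending[sector]
        return self.read_sector(sector)

    def on_error(self):
        """Block further queuing, then try to write out every queued sector."""
        self.queuing_enabled = False
        for sector, data in list(self.pending.items()):
            for _ in range(self.max_attempts):
                if self.write_sector(sector, data):
                    del self.pending[sector]
                    break
            else:
                # Every attempt failed: keep the sector in the queue and alert the user.
                self.alerts.append(f"sector {sector} could not be written; back up the disk")
```

After `on_error`, a sector that could not be written stays in `pending`, so later reads and writes addressed to it still see the newest data even though the disk copy is out of date.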
    Posted-Write Queueing in Disk Cache Utility
    The present invention is illustrated by the following description of a set of routines for posting and queueing write requests generated by tasks. The routines operate as part of a disk-cache installable device driver under the MS-DOS operating system (versions 3.10 through 3.39).
    The posted-write queueing routines operate in conjunction with a disk cache utility that maintains a cache buffer and a cache directory. In the illustration described here, the posted-write queueing routines are written as a part of such a disk cache utility, which forms no part of the present invention (except to the extent claimed) and is not otherwise described.
    It will be understood that the description is presented by way of illustration and not as a limitation on the subject matter claimed.
    Substitution of Interrupt Handlers
    Several interrupt handlers in the BIOS program have their behaviour altered (sometimes referred to as "hooked," "trapped," or "grabbed") in the conventional manner. Generally speaking, this involves (a) saving the vectors that are stored in low memory and are associated with the respective interrupt handlers; these vectors ordinarily point to the respective addresses of the normal BIOS interrupt handler routines; and (b) overwriting these vectors with new vectors pointing to the addresses of substitute interrupt handler routines.
    In particular, the respective handlers for INT 13H, for INT 15H WAIT and INT 15H POST, and for timeouts and write errors, are replaced. The Flow Chart sets out a pseudocode description of the substitute routines.
    As is conventional, a substitute routine may call the original interrupt handler; this is done by simply calling the original handler, whose address was saved as part of the overwriting process described above. For example, the Flow Chart shows that this is done in the substitute routine for INT 13H.
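The save-and-overwrite pattern can be illustrated with a dictionary standing in for the vector table. Nothing here touches real interrupt vectors; `vector_table`, `hook`, `bios_int13`, `make_int13_substitute`, and `call_log` are hypothetical names used only for this sketch.

```python
vector_table = {}  # stands in for the interrupt vectors stored in low memory
call_log = []      # records what the substitute routine observed


def bios_int13(request):
    """Placeholder for the normal BIOS handler."""
    return f"BIOS wrote sector {request['sector']}"


vector_table[0x13] = bios_int13


def hook(vector_number, make_substitute):
    saved = vector_table[vector_number]                   # (a) save the original vector
    vector_table[vector_number] = make_substitute(saved)  # (b) overwrite it
    return saved


def make_int13_substitute(original):
    def substitute(request):
        call_log.append(request["sector"])  # the substitute sees the request first
        return original(request)            # then calls the saved original handler
    return substitute
```

After `hook(0x13, make_int13_substitute)`, any caller that dispatches through `vector_table[0x13]` reaches the substitute routine, which in turn can still reach the original handler through the saved address.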
    Memory Allocation for Queue
    The posted-write queuing routines use an 8K FIFO (first-in, first-out) buffer in main memory for queuing write requests. More specifically, the queue utilizes the base memory that is reserved by the cache driver during its installation.
    The queue is kept relatively small to reduce the chance of data loss due to the user turning off the computer when it appears that the write is complete (when in fact it is not).
    Single-Sector Writes
    If the write buffer of a disk write request will occupy no more than one sector on the disk (i.e., if the request is for a single-sector write), then broadly speaking, the write request is queued and posted as complete, and control is returned to the task.
    More specifically, the substitute INT 13H routine queues the write request and jumps to a DEQUEUE routine. The DEQUEUE routine in turn generates its own conventional INT 13H write request that specifies the queue as the write buffer.
    When the conventional INT 13H interrupt handler calls the INT 15H WAIT routine, the substitute INT 15H WAIT routine saves the wait environment and returns to the calling task (instead of to the conventional INT 13H routine that called it), even though the disk controller may not yet have reported completion of the write operation. In effect, this leaves the conventional INT 13H routine in something like a state of suspension.
    When the disk controller does complete its write operation, it generates a hardware interrupt, thus causing the fixed disk hardware-interrupt-occurred flag to be set and causing the INT 15H POST routine to be called. The substitute INT 15H POST routine restores the previously-saved wait environment and returns control to the previously "suspended" conventional INT 13H write routine, in effect reactivating that routine.
    Because the fixed disk hardware-interrupt-occurred flag is now set, the now-reactivated conventional INT 13H write routine does not stay in its wait loop waiting for that interrupt (as described above). Instead, the conventional INT 13H write routine finishes its processing and performs an IRET to return control to the DEQUEUE routine that originally called it.
    The DEQUEUE routine in turn returns control to the calling task, but at the point where the last hardware interrupt occurred, not at the point where the write request was generated.
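    The suspend-and-resume behaviour of the WAIT and POST substitutes can be illustrated with a small simulation. The sketch below is an assumption-laden model: a Python generator plays the part of the conventional INT 13H write routine, its `yield` marks the INT 15H WAIT call, and the class `PostedWriter` and all method names are hypothetical.

```python
log = []


def conventional_int13_write(sector, data, disk):
    """Stand-in for the BIOS write routine, modelled as a generator."""
    log.append("INT 13H: command issued for sector %d" % sector)
    yield "WAIT"          # INT 15H WAIT: the routine is suspended here
    disk[sector] = data   # controller has finished; complete the request
    log.append("INT 13H: finished sector %d" % sector)


class PostedWriter:
    def __init__(self, disk):
        self.disk = disk
        self.suspended = None  # plays the role of the saved wait environment

    def write(self, sector, data):
        """Substitute INT 13H plus DEQUEUE: start the write, return at WAIT."""
        routine = conventional_int13_write(sector, data, self.disk)
        next(routine)              # run until the routine reaches WAIT
        self.suspended = routine   # substitute WAIT saves the environment
        log.append("task: write posted as complete")

    def hardware_interrupt(self):
        """Substitute INT 15H POST: resume the suspended routine."""
        routine, self.suspended = self.suspended, None
        for _ in routine:          # run the routine to completion
            pass
        log.append("task: resumed where the interrupt occurred")


disk = {}
writer = PostedWriter(disk)
writer.write(7, b"payload")
pending = 7 not in disk            # the task regained control before the write landed
writer.hardware_interrupt()
```

    The point of the model is the ordering: the task continues immediately after `write`, and the data only reaches `disk` once the simulated hardware interrupt resumes the suspended routine.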
    Error Handling
    On any timeout, queuing is discontinued (i.e., no more write requests are added to the queue) and all write requests already queued are written out individually. Timeout error logic is employed while waiting for the queue to become empty or not full to ensure that a cache controller failure is the only possible cause of an unrecoverable error.
    The timeout error logic uses flags to determine where the write operation timed out. This aids in making a determination whether or not the last sector sent to the conventional INT 13H write routine was actually written out to the disk. It also aids in determining whether the timeout was caused by hardware (e.g., controller failure) or software (e.g., conflicting programs) and in giving the user an appropriate error message.
    On any sector-not-found write error or address-not-found write error, repeated attempts (e.g., 5 disk resets and retries) are made to write out each sector in the queue to disk, one at a time. If any given sector(s) cannot be written out, queuing is permanently discontinued (until a reset, of course) and the sector(s) in question are saved in the queue.
    Subsequent to such action (until power reset), each disk read request and each disk write request is checked to determine whether any requested sector is in the queue (meaning that the actual disk sector is out of date or perhaps even inaccessible). Such requests are serviced from the queue buffer; if the sector in question is written to, it is updated in the queue buffer. A suitable alarm prompts the user to back up the disk (the backup request will be serviced in part from the queue).
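    The write-error path just described (block the queue, retry each sector, keep the failures, and service later requests from the kept copies) can be sketched as follows. This is a simplified model under stated assumptions: `ErrorHandlingQueue`, `write_sector`, and `retained` are hypothetical names, the disk reset that accompanies each retry is not modelled, and the retry count of 5 is the example figure from the text.

```python
MAX_RETRIES = 5  # the text gives 5 resets and retries as an example


class ErrorHandlingQueue:
    def __init__(self, write_sector):
        self.write_sector = write_sector  # callable(sector, data) -> bool
        self.pending = []                 # (sector, data) pairs in FIFO order
        self.blocked = False              # True once queuing is discontinued
        self.retained = {}                # sector -> data that never reached disk

    def enqueue(self, sector, data):
        """Queue a write; returns False if queuing has been discontinued."""
        if self.blocked:
            return False
        self.pending.append((sector, data))
        return True

    def on_write_error(self):
        """Block queuing, then try to write out each queued sector in turn."""
        self.blocked = True
        for sector, data in self.pending:
            # any() stops at the first successful attempt.
            ok = any(self.write_sector(sector, data) for _ in range(MAX_RETRIES))
            if ok:
                self.retained.pop(sector, None)  # a newer write superseded it
            else:
                self.retained[sector] = data     # save the sector in the queue
        self.pending = []

    def read(self, sector, read_disk):
        """Service a read from the retained copy when the disk copy is stale."""
        if sector in self.retained:
            return self.retained[sector]
        return read_disk(sector)

    def write(self, sector, data):
        """Update a retained sector in place; otherwise write straight through."""
        if sector in self.retained:
            self.retained[sector] = data
            return True
        return self.write_sector(sector, data)


disk = {}
bad_sectors = {9}  # simulated unwritable sector


def write_sector(sector, data):
    if sector in bad_sectors:
        return False
    disk[sector] = data
    return True


q = ErrorHandlingQueue(write_sector)
q.enqueue(3, b"a")
q.enqueue(9, b"b")
q.on_write_error()
```

    In this run sector 3 reaches the simulated disk, sector 9 stays in `retained`, and any later read or write of sector 9 is answered from that retained copy rather than from the disk.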
    Efficiency Considerations
    Queueing of disk writes takes place only for single-sector writes. Only single-sector writes are queued because the queue is small and most writes will be either single-sector or very large multi-sector writes. To accommodate large writes would require too much complexity and overhead; it is regarded as more efficient to handle only the other most common write size, i.e., single-sector writes.
    If either a multi-sector write request or a disk read request is generated, the queue is forced empty (i.e., written out to the disk, or "flushed") before the requested operation is performed. If disk reads are cached, the queue is flushed only if the desired sectors to be read are not present in the cache. This queue flushing is a convenient, low-overhead way of ensuring that the data in the queue is always current.
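    The flush rule in the preceding paragraph reduces to a short decision function. The sketch below is illustrative only; the function name `must_flush`, its parameters, and the use of a Python set to stand in for the cache directory are all assumptions.

```python
def must_flush(op, sectors, cache=None):
    """Return True if the posted-write queue should be flushed first.

    op      -- "read" or "write"
    sectors -- list of disk sector numbers covered by the request
    cache   -- set of cached sector numbers, or None if reads are not cached
    """
    if op == "write":
        # Single-sector writes are queued; multi-sector writes force a flush.
        return len(sectors) > 1
    if op == "read":
        if cache is None:
            return True  # uncached reads always flush the queue first
        # Cached reads flush only when some requested sector is missing.
        return any(s not in cache for s in sectors)
    raise ValueError("unknown operation: %r" % op)
```

    For example, `must_flush("write", [5])` is false because the write is simply queued, whereas `must_flush("read", [5, 7], cache={5, 6})` is true because sector 7 would have to come from the disk.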
    No check is performed for duplication of write requests in the queue. Such a check would theoretically eliminate the extra disk write, but the associated overhead is regarded as not worth the trade-off, because only comparatively rarely will the queue contain two writes to the same disk sector.
    Likewise, the queue is not sorted, because the queue is small and because most single-sector writes will either be sequential or will vary greatly across the disk. The overhead associated with moving data (in sorting or in copying to the disk controller buffer) is very high, so in this embodiment the queue is copied once and not moved.
    At any given time, the queue may contain write requests for a number of sectors (i.e., if write requests are generated in quick succession by the task). If successive write requests in the queue are directed to successive disk sectors, a more efficient disk write can be accomplished when emptying the queue by performing a multi-sector write from the queue.
    Toward this end, the queue in the embodiment described is circular and has a fixed buffer location; furthermore, the head and tail pointers are both placed at the top of the queue when the queue is empty.
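    A fixed-location circular queue that resets its pointers when empty, and that empties itself in runs of successive sectors, might look like the sketch below. Several details are assumptions rather than statements from the text: the 512-byte sector size, the resulting 16 slots (per-entry bookkeeping is ignored), and the names `SectorQueue`, `put`, and `next_run`. One plausible reading of the pointer reset is that it keeps runs from being split by buffer wraparound, so each run is one contiguous block of memory.

```python
SECTOR_SIZE = 512                 # assumed sector size; not stated in the text
SLOTS = 8 * 1024 // SECTOR_SIZE   # 16 slots in an 8K buffer, headers ignored


class SectorQueue:
    def __init__(self):
        self.buffer = bytearray(SLOTS * SECTOR_SIZE)  # fixed buffer location
        self.sectors = [None] * SLOTS                 # disk sector held by each slot
        self.head = 0    # next slot to write out
        self.tail = 0    # next free slot
        self.count = 0

    def put(self, sector, data):
        if self.count == SLOTS:
            raise OverflowError("queue full")
        if len(data) > SECTOR_SIZE:
            raise ValueError("data larger than one sector")
        start = self.tail * SECTOR_SIZE
        # Pad to a full sector so the slice assignment never resizes the buffer.
        self.buffer[start:start + SECTOR_SIZE] = data.ljust(SECTOR_SIZE, b"\0")
        self.sectors[self.tail] = sector
        self.tail = (self.tail + 1) % SLOTS
        self.count += 1

    def next_run(self):
        """Remove and return (first_sector, data) for the run of successive
        disk sectors at the head that is also contiguous in the buffer."""
        if self.count == 0:
            return None
        first = self.sectors[self.head]
        length = 1
        # A run stops at the end of the buffer: past that point the data
        # would no longer be a single contiguous block.
        while (length < self.count
               and self.head + length < SLOTS
               and self.sectors[self.head + length] == first + length):
            length += 1
        start = self.head * SECTOR_SIZE
        data = bytes(self.buffer[start:start + length * SECTOR_SIZE])
        self.head = (self.head + length) % SLOTS
        self.count -= length
        if self.count == 0:
            self.head = self.tail = 0   # both pointers back to the top
        return first, data


q = SectorQueue()
for s in (10, 11, 12, 40):
    q.put(s, b"x")
run1 = q.next_run()   # sectors 10, 11, 12 as one multi-sector write
run2 = q.next_run()   # sector 40 on its own
```

    Each tuple returned by `next_run` corresponds to one write request that could be handed to the disk routine, with the queue buffer itself serving as the write buffer.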

    Claims (7)

    1. A method of maintaining a posted-write queue upon occurrence of a timeout or upon occurrence of a specified type of error, to which queue may be added information sectors, containing data to be stored on a disk, addressed to respective corresponding disk sectors, and from which queue the information sectors may be written out to the respective corresponding disk sectors,
      the method comprising the steps of:
      blocking the addition of information sectors to the queue; and
      attempting to write out each information sector already added to the queue to the corresponding disk sector.
    2. A method according to claim 1, wherein, upon a specified type of error, for each attempt to write to the corresponding disk sector that is unsuccessful, the information sector is saved in the queue.
    3. A method according to claim 2, further including the step of attempting to perform, from the queue, subsequent read requests or write requests directed to a disk sector corresponding to any saved information sector.
    4. A method according to claim 1, wherein
      upon a timeout, the addition of information sectors to the queue is blocked and each information sector already added to the queue is written out to the corresponding disk sector;
      upon a specified type of write error, the addition of information sectors to the queue is blocked, an attempt is made to write out to the corresponding disk sector each information sector already added to the queue, for each attempt that is unsuccessful, the information sector is saved in the queue, and any read request or write request subsequent to the unsuccessful attempt that is directed to a disk sector corresponding to the saved information sector is serviced from the queue.
    5. Apparatus for maintaining a posted-write queue in a computer memory, to which queue may be added information sectors, containing data to be stored on a disk, addressed to respective corresponding disk sectors upon occurrence of a timeout or upon occurrence of a specified type of error, the apparatus including:
      means arranged to prevent the further addition of information sectors to the queue; and
      means arranged to attempt to write out each information sector already added to the queue to the corresponding disk sector.
    6. Apparatus according to claim 5, further including means arranged to save in the queue each information sector that is unsuccessfully written to disk.
    7. Apparatus according to claim 6, further including interrupt handlers and means arranged to service from the queue any subsequent read request or write request received and performed by the interrupt handlers and directed to a disk sector corresponding to any saved information sector.
    EP89307948A 1988-09-16 1989-08-04 Queued posted-write disk write method and apparatus with improved error handling Expired - Lifetime EP0359384B1 (en)

    Applications Claiming Priority (2)

    Application Number Priority Date Filing Date Title
    US245865 1988-09-16
    US07/245,865 US5065354A (en) 1988-09-16 1988-09-16 Queued posted-write disk write method with improved error handling

    Publications (3)

    Publication Number Publication Date
    EP0359384A2 (en) 1990-03-21
    EP0359384A3 (en) 1991-07-10
    EP0359384B1 (en) 1998-01-14

    Family

    ID=22928407

    Family Applications (1)

    Application Number Title Priority Date Filing Date
    EP89307948A Expired - Lifetime EP0359384B1 (en) 1988-09-16 1989-08-04 Queued posted-write disk write method and apparatus with improved error handling

    Country Status (6)

    Country Link
    US (1) US5065354A (en)
    EP (1) EP0359384B1 (en)
    JP (1) JPH0293949A (en)
    KR (1) KR970011213B1 (en)
    CA (1) CA1319440C (en)
    DE (1) DE68928542T2 (en)

    Also Published As

    Publication number Publication date
    CA1319440C (en) 1993-06-22
    DE68928542D1 (en) 1998-02-19
    EP0359384A2 (en) 1990-03-21
    JPH0293949A (en) 1990-04-04
    KR900005326A (en) 1990-04-14
    EP0359384A3 (en) 1991-07-10
    US5065354A (en) 1991-11-12
    KR970011213B1 (en) 1997-07-08
    DE68928542T2 (en) 1998-07-23

    Legal Events

    Date Code Title Description
    PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

    Free format text: ORIGINAL CODE: 0009012

    AK Designated contracting states

    Kind code of ref document: A2

    Designated state(s): BE CH DE ES FR GB GR IT LI NL SE

    PUAL Search report despatched

    Free format text: ORIGINAL CODE: 0009013

    AK Designated contracting states

    Kind code of ref document: A3

    Designated state(s): BE CH DE ES FR GB GR IT LI NL SE

    17P Request for examination filed

    Effective date: 19920109

    17Q First examination report despatched

    Effective date: 19940330

    GRAG Despatch of communication of intention to grant

    Free format text: ORIGINAL CODE: EPIDOS AGRA

    GRAH Despatch of communication of intention to grant a patent

    Free format text: ORIGINAL CODE: EPIDOS IGRA

    GRAA (expected) grant

    Free format text: ORIGINAL CODE: 0009210

    AK Designated contracting states

    Kind code of ref document: B1

    Designated state(s): BE CH DE ES FR GB GR IT LI NL SE

    PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

    Ref country code: LI

    Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

    Effective date: 19980114

    Ref country code: CH

    Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

    Effective date: 19980114

    Ref country code: GR

    Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

    Effective date: 19980114

    Ref country code: ES

    Free format text: THE PATENT HAS BEEN ANNULLED BY A DECISION OF A NATIONAL AUTHORITY

    Effective date: 19980114

    Ref country code: NL

    Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

    Effective date: 19980114

    Ref country code: BE

    Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

    Effective date: 19980114

    REG Reference to a national code

    Ref country code: CH

    Ref legal event code: EP

    ITF It: translation for a ep patent filed

    Owner name: BARZANO' E ZANARDO ROMA S.P.A.

    REF Corresponds to:

    Ref document number: 68928542

    Country of ref document: DE

    Date of ref document: 19980219

    PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

    Ref country code: SE

    Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

    Effective date: 19980414

    ET Fr: translation filed
    NLV1 Nl: lapsed or annulled due to failure to fulfill the requirements of art. 29p and 29m of the patents act
    REG Reference to a national code

    Ref country code: CH

    Ref legal event code: PL

    PLBE No opposition filed within time limit

    Free format text: ORIGINAL CODE: 0009261

    STAA Information on the status of an ep patent application or granted ep patent

    Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

    26N No opposition filed
    REG Reference to a national code

    Ref country code: GB

    Ref legal event code: IF02

    PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

    Ref country code: GB

    Payment date: 20060825

    Year of fee payment: 18

    PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

    Ref country code: FR

    Payment date: 20060831

    Year of fee payment: 18

    PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

    Ref country code: DE

    Payment date: 20061002

    Year of fee payment: 18

    PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

    Ref country code: IT

    Payment date: 20070830

    Year of fee payment: 19

    GBPC Gb: european patent ceased through non-payment of renewal fee

    Effective date: 20070804

    REG Reference to a national code

    Ref country code: FR

    Ref legal event code: ST

    Effective date: 20080430

    PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

    Ref country code: DE

    Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

    Effective date: 20080301

    PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

    Ref country code: FR

    Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

    Effective date: 20070831

    PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

    Ref country code: GB

    Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

    Effective date: 20070804

    PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

    Ref country code: IT

    Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

    Effective date: 20080804