WO2009033971A1 - System and method for splitting data and data control information - Google Patents

System and method for splitting data and data control information

Info

Publication number
WO2009033971A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
array
control information
bus driver
array controller
Prior art date
Application number
PCT/EP2008/061464
Other languages
French (fr)
Inventor
Wolfgang Klausberger
Stefan Abeling
Axel Kochale
Johann Maas
Original Assignee
Thomson Licensing
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Thomson Licensing
Publication of WO2009033971A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655 Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F3/0659 Command handling arrangements, e.g. command buffers, queues, command scheduling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance
    • G06F3/0611 Improving I/O performance in relation to response time
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance
    • G06F3/0613 Improving I/O performance in relation to throughput
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655 Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F3/0656 Data buffering arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0683 Plurality of storage devices
    • G06F3/0689 Disk arrays, e.g. RAID, JBOD

Definitions

  • the present invention relates to the field of mass storage solutions with multiple storage units.
  • exemplary embodiments of the present invention relate to a method and apparatus for splitting data and data control information on storage devices.
  • High-speed data recording in the workflow for digital cinematography may involve a process data rate of 2.5 Gbps and more. To achieve this bandwidth using conventional drives such as hard disk drives or solid state disks, it may be necessary to stream to multiple storage units in parallel. Additionally, the nature of audiovisual or AV streams involves the processing of data in real-time, i.e. with limited delays in a data path.
  • Known systems fulfill these requirements by separating data and control path functions at a higher level. In other words, known systems separate data processing, maintenance and/or user interface functions in order to free the data path from processing as much control information as possible. However, known approaches still require the file system/addressing tasks to be implemented either completely in the data path or to be shared with the control path.
  • One known approach presupposes the data to be mapped into the address space of a control processor, with the control processor setting up all I/O operations to/from storage devices and to/from a communication network.
  • U.S. Patent Application Publication No. 20060112219 to Chawla, et al. purports to disclose a modular data storage system with a control path and a data path.
  • the storage system includes three modular components linked and adapted for independent removal and insertion within the modular data storage system.
  • a service processor is positioned in the control path
  • a data services platform is positioned in the data path and the control path
  • a storage array controller is positioned in the data path and the control path.
  • the data services platform has a host interface interfacing with storage application hosts and includes a control path block linked to the service processor.
  • the platform includes a data path block including data path functions that may be functions partitioned for performance only by the data services platform.
  • the storage array controller includes a control path block linked to the service processor and including control interfaces.
  • the controller includes a data path block including data path functions.
  • U.S. Patent No. 5,802,366 to Row et al. purports to disclose a file server architecture that comprises as separate processors a network controller unit, a file controller unit and a storage processor unit. These units incorporate their own processors, and operate in parallel with a local Unix host processor. All networks are connected to the network controller unit, which performs all protocol processing up through the NFS layer.
  • the virtual file system is implemented in the file control unit, and the storage processor provides high-speed multiplexed access to an array of mass storage devices.
  • the file controller unit controls file information caching through its own local cache buffer, and controls disk data caching through a large system memory which is accessible on a bus by any of the processors.
  • U.S. Patent No. 5,555,390 to Judd, et al. purports to disclose a data storage subsystem and method for transferring data from a storage subsystem to a connected host data processing system.
  • the subsystem comprises a device controller connected to one or more direct access storage devices, e.g. disk drives.
  • the host data processing system issues data transfer commands to the subsystem to initiate transfer of data between the host processing system and the device or devices associated with the data storage subsystem.
  • Read/write data is transferred directly from device to host via a buffer controller.
  • the read command from the host data processing system specifies the data to be transferred and the start address in host memory to which the data should be sent.
  • the device controller of the data storage subsystem is capable of respecifying or amending the start address specified by the host in the read command. This provides a performance benefit for split data transfers.
  • the device controller of the data storage subsystem can specify the host address to which the replacement data should be sent.
  • Under real-time requirements, the data path and control path desirably include a high degree of redundancy or performance reserve, since their construction does not allow a good degree of predictability due to estimated worst-case delays that may occur while processing file system related tasks. This leads to inefficient utilization of hardware resources.
  • An improved system and method that facilitates separation of a data path from a control path in a high-speed data transfer system is desired.
  • a data storage apparatus in accordance with the present invention is recited in claim 1.
  • the data storage apparatus comprises a cache that buffers data received from a data path and an array controller that multiplexes an input stream of data received from the cache.
  • the data storage apparatus additionally comprises a bus driver module that is adapted to associate control information with a portion of an output stream of data received from the array controller.
  • a data storage apparatus in accordance with the present invention may additionally comprise a control path that is adapted to provide the control information to the bus driver module.
  • the control information may identify a sector on an array of storage devices or may more generally relate to an address on the array of storage devices.
  • the stream of data may be transferred in a sequence of burst transfers, for which the bus driver module is adapted to receive a start address only for an initial one of the sequence of burst transfers.
  • the bus driver may be adapted to write the portion of the output stream of data received from the array controller to an array of storage devices at locations as identified by the control information.
  • a method of transferring data in accordance with the present invention is set forth in claim 6. The method comprises buffering data in a cache, delivering the data buffered in the cache to an array controller and delivering data from the array controller to a bus driver module.
  • the method additionally comprises associating control information with a portion of the data received from the array controller.
  • a method in accordance with the present invention may comprise writing the portion of data received from the array controller to an array of storage devices at a location that corresponds to the control information.
  • the control information may identify a sector on an array of storage devices or may relate more generally to an address on an array of storage devices.
  • Data received from the array controller may be transferred to an array of storage devices as a sequence of burst transfers.
  • Fig. 1 is a block diagram of a data recording system in accordance with an exemplary embodiment of the present invention.
  • Fig. 2 is a state diagram that is useful in explaining the operation of an exemplary embodiment of the present invention.
  • Fig. 3 is a process flow diagram that shows a method in accordance with an exemplary embodiment of the present invention.
  • Fig. 1 is a block diagram of a recording system having a system controller module to perform storage functionality on an array of storage units in accordance with an exemplary embodiment of the present invention.
  • the recording system shown in Fig. 1 is generally referred to by the reference number 100.
  • the recording system 100 includes a user interface 102, which allows a user to control the overall operation of the recording system 100 and to view information about the system status and the like.
  • the user interface includes an LCD touchpad display.
  • the recording system 100 includes a system controller module 104.
  • the system controller module 104 is adapted to transfer data to and receive data from an HDD array 114, which comprises a plurality of individual HDDs. Transfers of data clusters to or from the disks of the HDD array 114 are initiated by setting or inscribing appropriate values into registers, indicating the cluster size, the cluster start address and the related command, like "read" or "write".
  • the system controller module 104 includes an embedded software processor system, which is shown in Fig. 1 as PPC 106.
  • PPC is an acronym for Power PC.
  • the PPC 106 communicates via an external control path 116 with external modules.
  • Controlling tasks performed by the PPC 106 may include executing file system maintenance algorithms, communicating with external clients, and controlling user access via an LCD touchpad or other user interface (UI).
  • File system tasks may relate to, for example, an interface between logical addressing and physical addressing of the storage, access rights management, memory allocation, garbage collection, defragmentation, version management, load balancing or the like.
  • data processing in accordance with an exemplary embodiment of the invention is preferably performed without overhead of file system/addressing tasks.
  • the PPC 106 configures the hardware of the system controller module 104 via an internal control path 118.
  • the PPC 106 configures a cache 108, an array control such as a RAID control 110 and a bus driver module 112.
  • the PPC 106 may be adapted to associate control information with data by providing control information to the bus driver module 112 via the internal control path 118.
  • the cache 108 is adapted to transfer data received via a data path 120.
  • real-time data received via the data path 120 are buffered in the cache 108 before being transferred to the RAID control 110.
  • the RAID control 110 then delivers the data to the bus driver module 112 to provide data streaming to the attached devices in the HDD array 114.
  • the recording system 100 may additionally be adapted to stream data from the HDD array 114.
  • the RAID controller 110 multiplexes incoming data streams. Multiplexing an incoming data stream is the process of separating a unitary incoming data stream into a plurality of parallel streams, each of which is intended to be written to a different storage device in an array of storage devices.
  • the skilled person will appreciate that the RAID controller 110 may also be adapted to de-multiplex outgoing data streams, and optionally add redundancy for recovering data from up to two failing HDDs without data loss.
  • the bus driver module 112 splits the data stream word by word into n chunks, each assigned to a corresponding DMA engine serving the attached standard ATA-HDD.
  • the skilled person will appreciate that storage devices operating according to any protocol may be employed in an exemplary embodiment of the present invention. Parallel or serial ATA-HDDs are just two examples of protocols that may be employed.
  • the addressing task, that is, the assigning of data chunks to sector addresses on storage devices, is also done in the bus driver module 112.
  • An exemplary embodiment of the present invention provides an efficient method and apparatus for distributing a high-speed data stream onto multiple devices performed by a "late binding" of data to data control information via the internal control path 118.
  • the expression "late binding" refers to associating data with control information such as an intended sector on an array of storage devices and/or other address information at a time just before the data is to be written to the HDD array.
  • the internal control path 118 may be adapted to provide addressing information directly to the bus driver module 112 and not to the cache 108 or the RAID control 110. The idea is to associate the incoming/outgoing data with assigned clusters on storage devices at a very low level, that is, in the last controller instance that directly communicates with the single storage units or storage devices.
  • the late association of data with control information simplifies the design of the recording system 100 because the control information does not need to be processed until just before the associated data is written to the HDD array 114.
  • An exemplary embodiment of the present invention takes a known principle of DMA one step further.
  • DMA generally allows a processor to be liberated from the resource-consuming task of orchestrating every single memory cell access in a block data transfer. This is achieved by providing dedicated hardware that performs specific functionality.
  • An exemplary embodiment of the present invention extends this concept to situations in which sequences of plural DMA accesses to contiguous memory address ranges occur.
  • an exemplary embodiment of the present invention provides dedicated hardware to perform even the task of setting up, supervising and influencing such sequences of DMA calls. This frees the PPC 106 even further from brute-force routine tasks.
  • An exemplary embodiment of the present invention increases the predictability of a storage system by almost completely relieving the data path 120 of control tasks. This results in a better utilization of hardware resources, which in turn allows redundant components to be dispensed with, and enables the usage of a low-power control processor.
  • Fig. 2 is a state diagram that is useful in explaining the operation of an exemplary embodiment of the present invention.
  • the state diagram is generally referred to by the reference number 200.
  • the state diagram 200 shows, in state-diagram format, an exemplary method of operation for a state machine that improves the performance of a data storage system.
  • a state machine that operates according to Fig. 2 provides late binding of data and control information in accordance with an exemplary embodiment of the present invention.
  • Prior to beginning operation, a system controller, which is denoted as Power PC or PPC in Fig. 2, is in an idle state 202.
  • the PPC initiates a transfer by delivering, via registers, values for the cluster size of the next transfer, the sector start address to be used, and whether the related command is for a read operation or a write operation.
  • cluster sizes may range up to 16 MB per device. These values may be provided using, for example, a serial communication bus protocol like I2C.
  • the registers are denoted as "PPC registers" in Fig. 2.
  • the register values are read, as shown at state 204.
  • An interrupt is delivered to the system controller to signal the initialisation of the storage devices involved in the transfer.
  • the interrupt signals that the PPC can read the status of the previously finished data transmission, and may initiate the next transfer.
  • "Small" or "large" data clusters may be addressed, but read/write transfers are generally performed as 64-kB bursts per device unless the cluster size per device is smaller than 64 kB. In other words, the burst size is derived from the cluster size: it equals the cluster size, with an upper limit of 64 kB.
  • the bus driver module 112 may include logic to calculate the specific sector start address and/or other address information for the ATA devices. In an exemplary embodiment of the present invention, 64-kB bursts are initiated together or simultaneously for all DMA engines.
  • the lower half of the state diagram 200 exhibits a loop structure, where the transfer of the cluster is performed in the form of a sequence of burst transfers.
  • the cluster size and address are determined. This determination involves, for example, incrementing the previously-used address or start address by the burst length and decrementing the cluster size by the burst length.
  • the system controller returns to the idle state 202. If, at state 206, the cluster size is determined to be greater than zero, as shown at state 210, this is taken as an indication that there is data still to be transferred to complete the transaction. Moreover, a new DMA burst transfer is initiated at state 212. During the DMA transaction, the state machine is in a DMA_control state 214. After the burst transfer, a new iteration occurs whenever the attached group of storage devices is ready as a whole. This is symbolically indicated as an HDD group ready state 216 in Fig. 2.
  • the state machine iterates the loop by again calculating the cluster size remaining to be transferred and the address, as shown at state 206.
  • streaming data is not mapped into an address space of a control processor.
  • the PPC 106, shown in Fig. 1, only delivers successive cluster addresses for the accessed file. For larger cluster sizes, the hardware modules need a longer time slot to fill the cluster with streaming data. With sufficiently large clusters and an appropriate data rate, periodic requests for new cluster addresses from the bus driver module 112 to the PPC 106 will come in at relatively long intervals, thus allowing the processor to operate at a very low clock frequency.
  • a start address is provided by the PPC 106 to the bus driver module 112 prior to the first of a sequence of burst transfers.
  • the start address for subsequent transfers is determined by the bus driver module 112, which increments the value of the initial start address.
  • no additional addressing information is provided by the PPC 106 to the bus driver module 112 via the internal control path 118 for the remaining sequence of burst transfers.
  • the bus driver module 112 may be adapted to notify the processor of a completion status of a previously completed data cluster transfer.
  • the bus driver module 112 may be adapted to request from the PPC 106 a next start address to be used in a next cluster transfer to be performed after the current cluster transfer.
  • Fig. 3 is a process flow diagram that shows a method in accordance with an exemplary embodiment of the present invention.
  • the method is generally referred to by the reference number 300.
  • the skilled person will appreciate that the method 300 may be desirably performed by the data storage system 100.
  • An exemplary method in accordance with the present invention may, in addition, be implemented according to the state diagram 200.
  • data is buffered in a cache such as the cache 108.
  • data buffered in the cache is delivered to an array controller such as the RAID controller 110.
  • Data is delivered from the array controller to a bus driver module such as the bus driver module 112 at step 306.
  • the bus driver module associates control information with a portion of data received from the array controller. Finally, the portion of data received from the array controller is written by the bus driver module to an array of storage devices at a location that corresponds to the control information.
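The method steps above (buffering, delivery to the array controller, delivery to the bus driver module, and late association of control information) can be illustrated with a minimal Python sketch. It is not the patented hardware implementation: the names `stripe_words` and `record`, the 4-byte word width, the 512-byte sector size, and the dictionary standing in for the storage array are all assumptions made for illustration.

```python
from collections import deque
from typing import Dict, List, Tuple

WORD_BYTES = 4  # assumed word width; the source does not state one


def stripe_words(data: bytes, n_devices: int) -> List[bytes]:
    """Split a stream word by word into n_devices parallel chunks."""
    chunks = [bytearray() for _ in range(n_devices)]
    for i in range(0, len(data), WORD_BYTES):
        chunks[(i // WORD_BYTES) % n_devices] += data[i:i + WORD_BYTES]
    return [bytes(c) for c in chunks]


def record(stream: List[bytes], n_devices: int, start_sector: int,
           sector_bytes: int = 512) -> Dict[Tuple[int, int], bytes]:
    """Buffer, stripe, then bind sector addresses just before writing."""
    cache = deque(stream)  # buffering: incoming blocks wait in the cache
    array: Dict[Tuple[int, int], bytes] = {}  # (device, sector) -> payload
    sector = start_sector
    while cache:
        block = cache.popleft()                  # cache -> array controller
        chunks = stripe_words(block, n_devices)  # parallel per-device streams
        # Late binding: a sector address is attached only at this last stage,
        # standing in for the bus driver module (hypothetical simplification).
        sectors_used = max(-(-len(c) // sector_bytes) for c in chunks)
        for dev, chunk in enumerate(chunks):
            array[(dev, sector)] = chunk
        sector += sectors_used
    return array
```

Note that neither the cache nor the striping step ever sees an address in this sketch; only the final write loop does, which is the point of the late association described above.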

Abstract

The present invention relates to a device and method for transferring data. An exemplary data storage device comprises a cache that buffers data received from a data path and an array controller that multiplexes an input stream of data received from the cache. The exemplary data storage device additionally comprises a bus driver module that is adapted to associate control information with a portion of an output stream of data received from the array controller. An exemplary method (300) of transferring data comprises buffering (302) data in a cache, delivering (304) the data buffered in the cache to an array controller and delivering (306) data from the array controller to a bus driver module. The exemplary method (300) additionally comprises associating (308) control information with a portion of the data received from the array controller.

Description

System and Method for Splitting Data and Data Control Information
The present invention relates to the field of mass storage solutions with multiple storage units. In particular, exemplary embodiments of the present invention relate to a method and apparatus for splitting data and data control information on storage devices.
High-speed data recording in the workflow for digital cinematography may involve a process data rate of 2.5 Gbps and more. To achieve this bandwidth using conventional drives such as hard disk drives or solid state disks, it may be necessary to stream to multiple storage units in parallel. Additionally, the nature of audiovisual or AV streams involves the processing of data in real-time, i.e. with limited delays in a data path. Known systems fulfill these requirements by separating data and control path functions at a higher level. In other words, known systems separate data processing, maintenance and/or user interface functions in order to free the data path from processing as much control information as possible. However, known approaches still require the file system/addressing tasks to be implemented either completely in the data path or to be shared with the control path. One known approach presupposes the data to be mapped into the address space of a control processor, with the control processor setting up all I/O operations to/from storage devices and to/from a communication network.
U.S. Patent Application Publication No. 20060112219 to Chawla, et al. purports to disclose a modular data storage system with a control path and a data path. The storage system includes three modular components linked and adapted for independent removal and insertion within the modular data storage system. A service processor is positioned in the control path, a data services platform is positioned in the data path and the control path, and a storage array controller is positioned in the data path and the control path. The data services platform has a host interface interfacing with storage application hosts and includes a control path block linked to the service processor. The platform includes a data path block including data path functions that may be functions partitioned for performance only by the data services platform. The storage array controller includes a control path block linked to the service processor and including control interfaces. The controller includes a data path block including data path functions.
U.S. Patent No. 5,802,366 to Row et al. purports to disclose a file server architecture that comprises as separate processors a network controller unit, a file controller unit and a storage processor unit. These units incorporate their own processors, and operate in parallel with a local Unix host processor. All networks are connected to the network controller unit, which performs all protocol processing up through the NFS layer. The virtual file system is implemented in the file control unit, and the storage processor provides high-speed multiplexed access to an array of mass storage devices. The file controller unit controls file information caching through its own local cache buffer, and controls disk data caching through a large system memory which is accessible on a bus by any of the processors.
U.S. Patent No. 5,555,390 to Judd, et al. purports to disclose a data storage subsystem and method for transferring data from a storage subsystem to a connected host data processing system. The subsystem comprises a device controller connected to one or more direct access storage devices, e.g. disk drives. The host data processing system issues data transfer commands to the subsystem to initiate transfer of data between the host processing system and the device or devices associated with the data storage subsystem. Read/write data is transferred directly from device to host via a buffer controller. For a read operation, the read command from the host data processing system specifies the data to be transferred and the start address in host memory to which the data should be sent. The device controller of the data storage subsystem is capable of respecifying or amending the start address specified by the host in the read command. This provides a performance benefit for split data transfers. In addition, if an error occurs during a read operation, the device controller of the data storage subsystem can specify the host address to which the replacement data should be sent.
Moreover, known methods of separating a control path from a data path require either a powerful architecture, which may necessitate a complex data path, or a very fast control processor. Under real-time requirements, the data path and control path desirably include a high degree of redundancy or performance reserve, since their construction does not allow a good degree of predictability due to estimated worst-case delays that may occur while processing file system related tasks. This leads to inefficient utilization of hardware resources.
An improved system and method that facilitates separation of a data path from a control path in a high-speed data transfer system is desired.
A data storage apparatus in accordance with the present invention is recited in claim 1. The data storage apparatus comprises a cache that buffers data received from a data path and an array controller that multiplexes an input stream of data received from the cache. The data storage apparatus additionally comprises a bus driver module that is adapted to associate control information with a portion of an output stream of data received from the array controller.
A data storage apparatus in accordance with the present invention may additionally comprise a control path that is adapted to provide the control information to the bus driver module. The control information may identify a sector on an array of storage devices or may more generally relate to an address on the array of storage devices. In accordance with the present invention, the stream of data may be transferred in a sequence of burst transfers, for which the bus driver module is adapted to receive a start address only for an initial one of the sequence of burst transfers. The bus driver may be adapted to write the portion of the output stream of data received from the array controller to an array of storage devices at locations as identified by the control information. A method of transferring data in accordance with the present invention is set forth in claim 6. The method comprises buffering data in a cache, delivering the data buffered in the cache to an array controller and delivering data from the array controller to a bus driver module. The method additionally comprises associating control information with a portion of the data received from the array controller.
In addition, a method in accordance with the present invention may comprise writing the portion of data received from the array controller to an array of storage devices at a location that corresponds to the control information. The control information may identify a sector on an array of storage devices or may relate more generally to an address on an array of storage devices. Data received from the array controller may be transferred to an array of storage devices as a sequence of burst transfers.
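The idea that the bus driver module receives a start address only for the first burst and derives all later addresses itself can be sketched as follows. This is an illustrative Python model, not the claimed apparatus: the function name `burst_sequence`, the 512-byte sector size, and the handling of a shorter final burst are assumptions, while the 64-kB per-device burst limit is taken from the exemplary embodiment.

```python
from typing import Iterator, Tuple

SECTOR_BYTES = 512            # assumed sector size; not given in the source
MAX_BURST_BYTES = 64 * 1024   # per-device burst limit of the exemplary embodiment


def burst_sequence(start_sector: int,
                   cluster_bytes: int) -> Iterator[Tuple[int, int]]:
    """Yield (sector_address, burst_bytes) for each burst of one cluster.

    Only start_sector comes from the control processor; every later
    address is computed locally by incrementing the previous one.
    """
    if cluster_bytes % SECTOR_BYTES:
        raise ValueError("cluster size must be a whole number of sectors")
    burst = min(cluster_bytes, MAX_BURST_BYTES)
    address, remaining = start_sector, cluster_bytes
    while remaining > 0:
        # Assumption: a final burst may be shorter than the regular burst.
        size = min(burst, remaining)
        yield address, size
        address += size // SECTOR_BYTES  # increment address by burst length
        remaining -= size                # decrement remaining cluster size
```

For example, `list(burst_sequence(1000, 160 * 1024))` steps through three bursts starting at sectors 1000, 1128 and 1256, with no further address supplied from outside the loop.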
A preferred embodiment of the present invention is described with reference to the accompanying drawings. The preferred embodiment merely exemplifies the invention. Plural possible modifications are apparent to the skilled person. The gist and scope of the present invention is defined in the appended claims of the present application.
Fig. 1 is a block diagram of a data recording system in accordance with an exemplary embodiment of the present invention.
Fig. 2 is a state diagram that is useful in explaining the operation of an exemplary embodiment of the present invention.
Fig. 3 is a process flow diagram that shows a method in accordance with an exemplary embodiment of the present invention.
Fig. 1 is a block diagram of a recording system having a system controller module to perform storage functionality on an array of storage units in accordance with an exemplary embodiment of the present invention. The recording system shown in Fig. 1 is generally referred to by the reference number 100. The recording system 100 includes a user interface 102, which allows a user to control the overall operation of the recording system 100 and to view information about the system status and the like. In one exemplary embodiment of the present invention, the user interface includes an LCD touchpad display.
The recording system 100 includes a system controller module 104. The system controller module 104 is adapted to transfer data to and receive data from an HDD array 114, which comprises a plurality of individual HDDs. Transfers of data clusters to or from the disks of the HDD array 114 are initiated by setting or inscribing appropriate values into registers, indicating the cluster size, the cluster start address and the related command, like "read" or "write".
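The register-based initiation described above can be pictured with a minimal sketch. The names `Command` and `TransferRequest`, the byte unit for the cluster size and the sector unit for the start address are assumptions made for illustration; the text only states that the cluster size, the cluster start address and the command are written into registers.

```python
from dataclasses import dataclass
from enum import Enum


class Command(Enum):
    """Commands named in the text; the enum itself is hypothetical."""
    READ = "read"
    WRITE = "write"


@dataclass(frozen=True)
class TransferRequest:
    """Values written into registers to start one cluster transfer."""
    cluster_size: int    # bytes per device (unit is an assumption)
    start_address: int   # cluster start address as a sector number (assumption)
    command: Command

    def __post_init__(self):
        # Reject values that could not describe a real transfer.
        if self.cluster_size <= 0:
            raise ValueError("cluster_size must be positive")
        if self.start_address < 0:
            raise ValueError("start_address must be non-negative")


request = TransferRequest(cluster_size=1 << 20, start_address=2048,
                          command=Command.WRITE)
```

Grouping the three values in one immutable record mirrors the idea that a transfer is fully described once the registers are set.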
The system controller module 104 includes an embedded software processor system, which is shown in Fig. 1 as PPC 106. As used herein, PPC is an acronym for Power PC. The PPC 106 communicates via an external control path 116 with external modules. Controlling tasks performed by the PPC 106 may include executing file system maintenance algorithms, communicating with external clients, controlling the user access via an LCD touchpad or other user interface or UI. File system tasks may relate to, for example, an interface between logical addressing and physical addressing of the storage, access rights management, memory allocation, garbage collection, defragmentation, version management, load balancing or the like. As fully set forth below, data processing in accordance with an exemplary embodiment of the invention is preferably performed without overhead of file system/addressing tasks.
Additionally, the PPC 106 configures the hardware of the system controller module 104 via an internal control path 118. In the exemplary embodiment shown in Fig. 1, the PPC 106 configures a cache 108, an array control such as a RAID control 110 and a bus driver module 112. As set forth below, the PPC 106 may be adapted to associate control information with data by providing control information to the bus driver 112 via the internal control path 118. In the exemplary embodiment shown in Fig. 1, the cache 108 is adapted to transfer data received via a data path 120. Moreover, real-time data received via the data path 120 are buffered in the cache 108 before being transferred to the RAID control 110. The RAID control 110 then delivers the data to the bus driver module 112 to provide data streaming to the attached devices in the HDD array 114. The skilled person will appreciate that the recording system 100 may additionally be adapted to stream data from the HDD array 114.
The RAID controller 110 multiplexes incoming data streams. Multiplexing an incoming data stream is the process of separating a unitary incoming data stream into a plurality of parallel streams, each of which is intended to be written to a different storage device in an array of storage devices. The skilled person will appreciate that the RAID controller 110 may also be adapted to de-multiplex outgoing data streams, and optionally add redundancy for recovering data from up to two failing HDDs without data loss.
The bus driver module 112 splits the data stream word by word into n chunks, each assigned to a corresponding DMA engine serving the attached standard ATA-HDD. The skilled person will appreciate that storage devices operating according to any protocol may be employed in an exemplary embodiment of the present invention. Parallel or serial ATA-HDDs are just two examples of protocols that may be employed. The addressing task, that is the assigning of data chunks to sector addresses on storage devices, is also done in the bus driver module 112.
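As an illustration of the word-by-word split, the following sketch distributes a stream round-robin over n chunks, one per DMA engine. The round-robin order is an assumption (the text only says the split is word by word), and the function names `split_words` and `merge_words` are hypothetical.

```python
def split_words(words, n):
    """Distribute words round-robin into n chunks, one per DMA engine."""
    if n <= 0:
        raise ValueError("n must be positive")
    # Chunk i receives words i, i + n, i + 2n, ...
    return [words[i::n] for i in range(n)]


def merge_words(chunks):
    """Inverse of split_words: interleave the chunks back into one stream."""
    n = len(chunks)
    total = sum(len(chunk) for chunk in chunks)
    return [chunks[i % n][i // n] for i in range(total)]


stream = list(range(10))
chunks = split_words(stream, 4)
```

Reading the data back would apply the inverse interleaving, which is why `merge_words` is shown alongside the split.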
An exemplary embodiment of the present invention provides an efficient method and apparatus for distributing a high-speed data stream onto multiple devices performed by a "late binding" of data to data control information via the internal control path 118. The expression "late binding" refers to associating data with control information such as an intended sector on an array of storage devices and/or other address information at a time just before the data is to be written to the HDD array. Moreover, the internal control path 118 may be adapted to provide addressing information directly to the bus driver module 112 and not to the cache 108 or the RAID control 110. The idea is to associate the incoming/outgoing data with assigned clusters on storage devices at a very low level, that is, in the last controller instance that directly communicates with the single storage units or storage devices. In this manner, it is possible to achieve high-speed data recording performance while using a very low controlling overhead. In particular, the late association of data with control information simplifies the design of the recording system 100 because the control information does not need to be processed until just before the associated data is written to the HDD array 114.
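The late binding described above can be sketched as follows. This is a simplified model under stated assumptions, not the patented hardware: the class `BusDriver`, its methods and the two pass-through stage functions are hypothetical names, and a dictionary stands in for the HDD array. The point it shows is that the cache and RAID stages never handle an address, while the bus driver receives addresses over a separate control path and binds them to data only at write time.

```python
from collections import deque


class BusDriver:
    """Last stage before the disks; the only stage that sees addresses."""

    def __init__(self):
        self.addresses = deque()  # filled via the control path
        self.written = {}         # address -> payload, stands in for the array

    def set_address(self, address):
        """Control path: the processor supplies an address."""
        self.addresses.append(address)

    def write(self, payload):
        """Data path: bind the payload to an address just before writing."""
        address = self.addresses.popleft()
        self.written[address] = payload
        return address


def cache_stage(payload):
    return payload  # buffering only, no address handling


def raid_stage(payload):
    return payload  # striping omitted here, no address handling


driver = BusDriver()
driver.set_address(4096)
driver.write(raid_stage(cache_stage(b"frame-0")))
```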
An exemplary embodiment of the present invention takes a known principle of DMA one step further. In particular, DMA generally frees a processor from the resource-consuming task of orchestrating every single memory cell access in a block data transfer. This is achieved by providing dedicated hardware that performs specific functionality. An exemplary embodiment of the present invention extends this concept to situations in which sequences of plural DMA accesses to contiguous memory address ranges occur. Moreover, an exemplary embodiment of the present invention provides dedicated hardware to perform even the task of setting up, supervising and influencing such sequences of DMA calls. This frees the PPC 106 even further from brute-force routine tasks.
An exemplary embodiment of the present invention increases the predictability of a storage system by almost completely relieving the data path 120 of control tasks. This results in better utilization of hardware resources, which in turn makes it possible to dispense with redundant components and to use a low-power control processor.
Fig. 2 is a state diagram that is useful in explaining the operation of an exemplary embodiment of the present invention. The state diagram is generally referred to by the reference number 200. The state diagram 200 shows, in state-diagram format, an exemplary method of operation for a state machine that improves the performance of a data storage system. In particular, a state machine that operates according to Fig. 2 provides late binding of data and control information in accordance with an exemplary embodiment of the present invention.
Prior to beginning operation, a system controller, which is denoted as Power PC or PPC in Fig. 2, is in an idle state 202. The PPC initiates a transfer by delivering, via registers, values for the cluster size of the next transfer, the sector start address to be used, and whether the related command is for a read operation or a write operation. In one exemplary embodiment of the present invention, cluster sizes may range up to 16 MB per device. These values may be provided using, for example, a serial communication bus protocol like I2C. The registers are denoted as "PPC registers" in Fig. 2. The register values are read, as shown at state 204. An interrupt is delivered to the system controller to signal the initialisation of the storage devices involved in the transfer.
The interrupt signals that the PPC can read the status of the previously finished data transmission and may initiate the next transfer. "Small" or "large" data clusters may be addressed, but read/write transfers are generally performed as 64-kB bursts per device unless the cluster size per device is smaller than 64 kB. In other words, the burst size is derived from the cluster size: it equals the cluster size, subject to an upper limit of 64 kB. The bus driver module 112 may include logic to calculate the specific sector start address and/or other address information for the ATA devices. In an exemplary embodiment of the present invention, 64-kB bursts are initiated simultaneously for all DMA engines.
The lower half of the state diagram 200 exhibits a loop structure, where the transfer of the cluster is performed in the form of a sequence of burst transfers. At state 206, the cluster size and address is determined. This determination involves, for example, incrementing the previously-used address or start address by the burst length and decrementing the cluster size by the burst length.
If the cluster size is determined to be zero, as shown at state 208, this indicates that the DMA transaction has completed. Accordingly, the system controller returns to the idle state 202. If, at state 206, the cluster size is determined to be greater than zero, as shown at state 210, this is taken as an indication that there is data still to be transferred to complete the transaction. Moreover, a new DMA burst transfer is initiated at state 212. During the DMA transaction, the state machine is in a DMA_control state 214. After the burst transfer, a new iteration occurs whenever the attached group of storage devices is ready as a whole. This is symbolically indicated as an HDD group ready state 216 in Fig. 2. The skilled person will appreciate that the idea of waiting until a group of storage devices is ready is applicable not only to HDD storage devices, but also to any suitable storage device that may be employed in a digital storage system such as the recording system 100. When the associated storage devices are ready, the state machine iterates the loop by again calculating the cluster size remaining to be transferred and the address, as shown at state 206.
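The loop in the lower half of the state diagram can be approximated by the following sketch. It is an illustration under assumptions rather than the actual state machine: addresses are counted in 512-byte sectors, 1 kB is taken as 1024 bytes, cluster sizes are multiples of the sector size, and the function name `burst_sequence` is hypothetical. The wait for the HDD group ready state 216 is reduced to a comment.

```python
SECTOR = 512            # bytes per sector (assumption)
MAX_BURST = 64 * 1024   # 64 kB burst limit per device, as stated in the text


def burst_sequence(start_sector, cluster_size):
    """Yield (sector, length_in_bytes) for each burst of one cluster transfer."""
    if cluster_size % SECTOR:
        raise ValueError("cluster_size must be a multiple of the sector size")
    address, remaining = start_sector, cluster_size
    while remaining > 0:                     # state 210: data left to transfer
        length = min(remaining, MAX_BURST)   # burst size capped at 64 kB
        yield address, length                # states 212 and 214: DMA burst
        # State 216: hardware would wait here until the HDD group is ready.
        address += length // SECTOR          # state 206: advance the address
        remaining -= length                  # state 206: shrink the cluster
    # remaining == 0 corresponds to state 208: return to idle.


bursts = list(burst_sequence(1000, 160 * 1024))
```

For a 160-kB cluster this produces two full 64-kB bursts followed by one 32-kB burst, with the start address supplied only once.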
In an exemplary embodiment of the present invention, streaming data is not mapped into an address space of a control processor. Instead, the PPC 106, shown in Fig. 1, only delivers subsequent, consecutive or successive cluster addresses for the accessed file. For larger cluster sizes, a longer time slot is needed by hardware modules to fill the cluster with streaming data. With sufficiently large clusters and an appropriate data rate, periodic requests for new cluster addresses from the bus driver module 112 to the PPC 106 will come in at relatively long intervals, thus allowing the processor to operate at a very low clock frequency.
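The relationship between cluster size, data rate and request interval can be made concrete with a small calculation. The formula below is a simplification that ignores redundancy overhead and any idle time; the function name and the example figures of 8 devices and 200 MB/s are assumptions, while the 16 MB cluster size per device is the upper value mentioned in the text.

```python
def address_request_interval(cluster_size_per_device, devices, data_rate):
    """Seconds between requests for a new cluster address.

    cluster_size_per_device and data_rate use the same byte unit;
    data_rate is the sustained stream rate in bytes per second.
    """
    if data_rate <= 0:
        raise ValueError("data_rate must be positive")
    return cluster_size_per_device * devices / data_rate


MB = 1024 * 1024
interval = address_request_interval(16 * MB, devices=8, data_rate=200 * MB)
```

Under these assumed figures the processor would be asked for a new cluster address roughly every 0.64 seconds, which is long compared with the duration of a single 64-kB burst.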
In one exemplary embodiment of the present invention, a start address is provided by the PPC 106 to the bus driver module 112 prior to the first of a sequence of burst transfers. The start address for subsequent transfers is determined by the bus driver module 112, which increments the value of the initial start address. Moreover, no additional addressing information is provided by the PPC 106 to the bus driver module 112 via the internal control path 118 for the remaining sequence of burst transfers. After receiving a request for a data cluster transfer, the bus driver module 112 may be adapted to notify the processor of a completion status of a previously completed data cluster transfer. In an exemplary embodiment of the present invention, after setting up the first DMA access, the bus driver module 112 may be adapted to request from the PPC 106 a next start address to be used in a next cluster transfer to be performed after the current cluster transfer.
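The division of work between the processor and the bus driver module can be sketched as an event log. This is a hypothetical model: `run_transfers` and the event names are invented for illustration, addresses are counted in whole bursts for simplicity, and an iterator stands in for the processor that supplies start addresses.

```python
def run_transfers(address_source, bursts_per_cluster):
    """Return a log of bursts and next-address requests.

    address_source is an iterator yielding one start address per cluster.
    """
    log = []
    current = next(address_source, None)   # start address of the first cluster
    while current is not None:
        upcoming = None
        for burst in range(bursts_per_cluster):
            # The driver derives every burst address from the start address.
            log.append(("burst", current + burst))
            if burst == 0:
                # After setting up the first burst, ask for the next cluster.
                upcoming = next(address_source, None)
                log.append(("request_next", upcoming))
        current = upcoming
    return log


events = run_transfers(iter([100, 500]), bursts_per_cluster=2)
```

The log shows one request per cluster, issued early, so the next start address is already available when the current cluster completes.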
Fig. 3 is a process flow diagram that shows a method in accordance with an exemplary embodiment of the present invention. The method is generally referred to by the reference number 300. The skilled person will appreciate that the method 300 may be desirably performed by the data storage system 100. An exemplary method in accordance with the present invention may, in addition, be implemented according to the state diagram 200.
At step 302, data is buffered in a cache such as the cache 108. At step 304, data buffered in the cache is delivered to an array controller such as the RAID controller 110. Data is delivered from the array controller to a bus driver module such as the bus driver module 112 at step 306.
At step 308, the bus driver module associates control information with a portion of data received from the array controller. Finally, the portion of data received from the array controller is written by the bus driver module to an array of storage devices at a location that corresponds to the control information.
The skilled person will appreciate that combining any of the above-recited features of the present invention together may be desirable.

Claims

1. Data storage device (100), comprising:
- a cache (108) that buffers data received from a data path (120);
- an array controller (110) that multiplexes an input stream of data received from the cache (108); and
- a bus driver module (112) that is adapted to associate control information with a portion of an output stream of data received from the array controller (110).
2. Data storage device (100) according to claim 1, comprising a control path (118) that is adapted to provide the control information to the bus driver module (112).
3. Data storage device (100) according to claims 1 or 2, wherein the control information identifies a sector on an array of storage devices (114).
4. Data storage device (100) according to claims 1 or 2, wherein the control information relates to an address on an array of storage devices (114).
5. Data storage device (100) according to any preceding claim, wherein the bus driver module (112) is adapted to transfer the output stream of data in a sequence of burst transfers, and wherein the bus driver module (112) is adapted to receive a start address only for an initial one of the sequence of burst transfers.
6. Data storage device (100) according to any preceding claim, wherein the bus driver (112) is adapted to write the portion of the output stream of data received from the array controller (110) to an array of storage devices at locations as identified by the control information.
7. Method (300) of transferring data, comprising:
- buffering (302) data in a cache (108);
- delivering (304) the data buffered in the cache (108) to an array controller (110);
- delivering (306) data from the array controller (110) to a bus driver module (112); and
- associating (308) control information with a portion of the data received from the array controller (110).
8. Method (300) of transferring data according to claim 7, comprising writing (310) the portion of data received from the array controller (110) to an array of storage devices at locations as identified by the control information.
9. Method (300) of transferring data according to claims 7 or 8, wherein the control information identifies a sector on an array of storage devices (114).
10. Method (300) of transferring data according to claims 7 or 8, wherein the control information relates to an address on an array of storage devices (114).
11. Method (300) of transferring data according to one of claims 7 to 10, comprising transferring the data received from the array controller (110) to an array of storage devices (114) as a sequence of burst transfers.
12. Method (300) of transferring data according to claim 11, comprising providing a start address only for an initial one of the sequence of burst transfers.

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP07116321 2007-09-13
EP07116321.6 2007-09-13

Publications (1)

Publication Number Publication Date
WO2009033971A1 true WO2009033971A1 (en) 2009-03-19

Family

ID=40001386

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2008/061464 WO2009033971A1 (en) 2007-09-13 2008-09-01 System and method for splitting data and data control information

Country Status (2)

Country Link
TW (1) TW200912728A (en)
WO (1) WO2009033971A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9235798B2 (en) 2012-07-18 2016-01-12 Micron Technology, Inc. Methods and systems for handling data received by a state machine engine

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0858025A2 (en) * 1997-02-03 1998-08-12 Matsushita Electric Industrial Co., Ltd. Data recorder and method of access to data recorder
WO1999026150A1 (en) * 1997-11-14 1999-05-27 3Ware, Inc. High-performance architecture for disk array controller
US6349357B1 (en) * 1999-03-04 2002-02-19 Sun Microsystems, Inc. Storage architecture providing scalable performance through independent control and data transfer paths
US20040128444A1 (en) * 2002-12-24 2004-07-01 Sung-Hoon Baek Method for storing data in disk array based on block division and method for controlling input/output of disk array by using the same

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9948615B1 (en) * 2015-03-16 2018-04-17 Pure Storage, Inc. Increased storage unit encryption based on loss of trust
CN115496114A (en) * 2022-11-18 2022-12-20 成都戎星科技有限公司 TDMA burst length estimation method based on K-means clustering
CN115496114B (en) * 2022-11-18 2023-04-07 成都戎星科技有限公司 TDMA burst length estimation method based on K-means clustering

Also Published As

Publication number Publication date
TW200912728A (en) 2009-03-16

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 08803447

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 08803447

Country of ref document: EP

Kind code of ref document: A1