[Flowchart fragment from a figure: "COMPUTE ROUND TRIP LATENCY"; "ADD'L BLOCKS?"; "NO"]
COMPREHENSIVE END-TO-END STORAGE AREA NETWORK (SAN) APPLICATION TRANSPORT SERVICE
CROSS REFERENCE TO RELATED APPLICATIONS
This application is related to commonly assigned patent application Ser. No. 10/228,776, filed Aug. 9, 2005, entitled “Asymmetric Data Mirroring”, now U.S. Pat. No. 6,976,186, issued Dec. 13, 2005, which is incorporated herein by reference. This application is also related to commonly assigned patent application Ser. No. 11/203,420, filed Aug. 12, 2005, entitled “Asymmetric Data Mirroring”, now U.S. Pat. No. 7,549,080, issued on Jun. 16, 2009, which is incorporated herein by reference. This application is also related to commonly assigned patent application Ser. No. 11/207,312, filed Aug. 19, 2005, entitled “Method and System for Long Haul Optical Transport for Applications Sensitive to Data Flow Interruption”, now U.S. Pat. No. 7,839,766, issued on Nov. 23, 2010, which is incorporated herein by reference.
BACKGROUND OF THE INVENTION
The present invention relates generally to storage area networks, and more particularly to a comprehensive, end-to-end storage area network (SAN) application transport service.
With the advent of and growth of the Internet, the availability of data has become increasingly important. Many corporations need access to their data during most, if not all, hours of the day. For example, people may be searching the web for a particular piece of information at any time of day. If the information is associated with a corporation’s web site, the corporation may lose customers if their web site is not functioning properly or if the data cannot be retrieved at the time of search. As a result, data storage and availability have become extremely important to businesses in today’s competitive landscape.
Data storage devices may fail as a result of system malfunctions, weather disasters, or other types of unforeseen conditions. Corporations typically have a remote backup storage device to ensure data availability when a local storage device fails. Data redundancy is also referred to as data mirroring and typically involves the submission of simultaneous write requests to multiple storage devices (i.e., the local and remote data storage devices).
Typically, in a data mirroring arrangement, a server is attached or connected to a local data storage device as well as to a remote data storage device with the data from each storage device mirroring that of another (or each other).
The distance at which data can reliably be transmitted to a remote storage device also becomes relevant to performance of a data storage network and data security. Specifically, the shorter the distance between a server and data sites, the more quickly the data can be synchronized at the data sites. Maintaining synchronization between data at mirrored sites is often highly desirable. Synchronization is the ability for data in different data sites to be kept up-to-date so that each data store contains the same information. One way to accomplish synchronization is by all mirrored storage devices acknowledging receipt of an input/output (I/O) request from a requesting application before the application may generate the next I/O request. As the distance between mirrored sites increases, synchronization becomes harder to achieve using existing mirroring techniques as the application generating the I/O request is slowed while awaiting acknowledgment from the remote storage device.
It is possible to obtain synchronization using existing techniques if the physical distance between the mirrored sites is less than approximately twenty-five (25) miles (i.e., 40 km). For greater distances, existing techniques may not provide the synchronization that is needed for maintaining data security in case of a wide-spread calamity.
Also, the greater the distance between the mirrored sites, the less likely a situation (e.g., a weather disaster or a system failure) will affect both the local storage device and the remote storage device. Further, when data is transported over increasing distances, the throughput associated with the data transfer traditionally experiences “throughput droop”. Throughput is defined as the amount of data that can be transmitted across a data channel at any given time. Throughput is often represented graphically relative to distance. Throughput droop is when a throughput curve goes down or “droops” as the transport distance increases. Thus, in order to transmit the maximum amount of data over a data channel, the distance between the server and the remote storage device must often be kept within a reasonable distance (e.g., 25 miles).
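The "throughput droop" described above can be illustrated with a simple model. The sketch below is not from the patent text; it assumes synchronous, block-by-block mirroring in which each block must be acknowledged before the next is sent, and an assumed signal speed in fiber of roughly 200,000 km/s.

```python
# Illustrative model (an assumption, not the patent's method): with
# synchronous block-by-block mirroring, each block waits for a
# round-trip acknowledgment, so effective throughput falls ("droops")
# as the distance to the remote storage device grows.

def synchronous_throughput(block_size_bytes, distance_km,
                           fiber_speed_km_per_s=200_000.0):
    """Throughput (bytes/s) when only one block is in flight at a time.

    Round-trip latency is approximated as 2 * distance / signal speed;
    the remote write time is treated as negligible (acknowledged from
    cache on receipt).
    """
    round_trip_s = 2.0 * distance_km / fiber_speed_km_per_s
    return block_size_bytes / round_trip_s

# The same 64 KiB block yields ever-lower throughput at greater distances:
for km in (40, 400, 4000):
    mb_per_s = synchronous_throughput(64 * 1024, km) / 1e6
    print(f"{km:>5} km: {mb_per_s:.2f} MB/s")
```

Under these assumptions, throughput is inversely proportional to distance, which is why existing synchronous techniques keep the server and remote storage device within a short range.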
A transport interruption event is another problem that may be experienced during data transmission to a remote storage device. This occurs when there is a failure in the data channel. After the failure is recognized, the transmitting party (e.g., server) may then switch the channel used to transmit the data. This is referred to as a switch-to-protect event. When performing this data channel switch, the server has to synchronize the communications with the remote storage device over the new data channel. This resynchronization (after the initial synchronization over the initial data channel) and switching to the new data channel traditionally introduces a disruption (e.g., 40 milliseconds) in the data transmissions until the synchronization is complete.
Thus, there remains a need to provide a comprehensive storage area network (SAN) application transport service that solves the above-mentioned problems.
BRIEF SUMMARY OF THE INVENTION
A system and method for solving the above mentioned problems transmits data on a data channel from a source to a destination. The data channel has a plurality of wavelength channels and an associated throughput. The system and method include a storage application for multicasting data on each of the plurality of wavelength channels, a storage protocol extension device for adjusting the throughput during the multicasting by using buffer credits to determine a capacity of data that can be communicated between the source and the destination, and an application optimization device for managing data channel latency by submitting requests to the source and the destination during a predetermined time period associated with the latency. As described in more detail below, data channel latency is the time required for a signal to traverse the round trip distance between a server (or source) and a remote storage device plus the maximum write time of the remote storage device.
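The predetermined time period just defined (round-trip traversal time plus the maximum write time of the remote storage device) can be sketched as a short calculation. The propagation speed and the sample distance below are illustrative assumptions, not values from the text.

```python
# Sketch of the "predetermined time period": round-trip signal time
# between source and destination plus the remote device's maximum
# write time. The fiber propagation speed is an assumed constant
# (~2/3 the speed of light in vacuum).

LIGHT_IN_FIBER_KM_PER_MS = 200.0  # assumed: ~200 km per millisecond

def predetermined_period_ms(distance_km, max_write_time_ms):
    round_trip_ms = 2.0 * distance_km / LIGHT_IN_FIBER_KM_PER_MS
    return round_trip_ms + max_write_time_ms

# Example: a remote site 1000 km away with a 0.5 ms worst-case
# (cache-acknowledged) write time:
print(predetermined_period_ms(1000, 0.5))  # 10.5 ms
```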
The data channel may be part of an optical network, such as a Fibre Channel network, or a packet-based network, such as a Multiprotocol Label Switching (MPLS) network. The managing of data channel latency can include determining a predetermined time period associated with the latency between the source and the destination, submitting a request to the source and to the destination, and submitting additional requests to the source and the destination during the predetermined time period. The request and the additional requests may be resubmitted to the destination if an acknowledgement is not received. The submission of additional requests to the source and destination may also continue if the acknowledgement is received.
In one embodiment, the system and method determine whether an acknowledgement associated with the request has been received from the destination during the predetermined time period. In another embodiment, the system and method store a copy of each request submitted by the source to the destination in a memory disposed between the source and the destination while the source waits for whether an acknowledgement associated with the request has been received from the destination during the predetermined time period. In one embodiment, the source halts submission of new requests to the destination if the acknowledgement is not received.
These and other advantages of the invention will be apparent to those of ordinary skill in the art by reference to the following detailed description and the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 shows a data mirroring system according to an embodiment of the invention;
FIG. 2 shows a functional block diagram of the layered SAN transport model in accordance with an embodiment of the invention;
FIG. 3 is a flow chart of a data mirroring method in accordance with an embodiment of the invention;
FIG. 4 is a flow chart of an alternative data mirroring method in accordance with an embodiment of the invention; and
FIG. 5 is a block diagram of a layered model for SAN transport in accordance with an embodiment of the invention.
DETAILED DESCRIPTION
Data availability, and therefore data storage, has become vitally important to corporations. A failure of a data storage device may result in millions of dollars lost if a corporation’s data is not available. As a result, corporations often mirror a local data storage device with a remote data storage device.
An exemplary data mirroring system 100 is shown in FIG. 1. The data mirroring system 100 includes a server 110, a local storage device 120 and a remote storage device 130. The server 110 and the remote storage device 130 may be connected via a communication link 140. The communication link 140 may be a cable connection or a wireless connection. The cable may be terrestrial, underwater, etc. It may utilize a fiber optic medium. The communication link 140 can also be any combination of these connections (e.g., one portion is a wired connection and one portion is a wireless connection). The network formed by the arrangement of FIG. 1 may be a public or a private network. Further, the functions performed by the server 110 described above and below may instead be performed by a storage application (as shown in FIG. 2).
There exists asymmetry in the distance between the server 110 and the storage devices 120 and 130. The distance between server 110 and local storage device 120 is negligible relative to the distance between server 110 and remote storage device 130. In an asymmetric data mirroring (ADM) method according to exemplary embodiments of the present invention, the server 110 first submits an I/O request (such as a write request of a block of data) to both the local and remote storage devices 120 and 130 and then continues to make additional I/O requests to the devices 120 and 130 over a predetermined time period while waiting for an acknowledgement from the remote storage device 130 for the submitted I/O request. According to this exemplary embodiment, an acknowledgment (within the predetermined time period or time interval) is required for each submitted request. The predetermined time period represents the time needed for a signal to traverse the round trip distance between the server 110 and the remote storage device 130 plus the maximum write time of the remote storage device 130. This time period may also be referred to as the round trip latency or network latency and may be measured or determined by a processor of the server 110. The write time of the remote storage device 130 may be negligible since the request may first be written to cache associated with the remote storage device 130. Therefore, an acknowledgement may be submitted by the remote storage device 130 upon receipt of the write request from the server 110.
If the server 110 does not receive an acknowledgement from the remote storage device 130 within the predetermined time period, all further requests to the devices 120 and 130 are halted. At this point, the request for which an acknowledgement is not received as well as all additional requests that have been submitted are resubmitted, block by block, by the server 110 to the remote storage device 130.
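The ADM behavior described above (pipeline new writes while earlier ones await acknowledgment; on a timeout, halt new submissions and resubmit the unacknowledged block plus everything sent after it, block by block) can be sketched as below. All class and parameter names are hypothetical illustrations, not an API from the patent.

```python
# Minimal sketch of asymmetric data mirroring, under assumed names.
# `send` transmits one block, `now` returns the current time, and
# `window` is the predetermined time period (round-trip latency plus
# the remote device's maximum write time).
import collections

class AdmMirror:
    def __init__(self, send, now, window):
        self.send = send
        self.now = now
        self.window = window
        self.pending = collections.OrderedDict()  # seq -> (deadline, block)

    def write(self, seq, block):
        """Submit a block without waiting for earlier acknowledgments."""
        self.send(seq, block)
        self.pending[seq] = (self.now() + self.window, block)

    def ack(self, seq):
        """Record an acknowledgment received from the remote device."""
        self.pending.pop(seq, None)

    def check_timeouts(self):
        """Return True if a window expired and pending blocks were resent."""
        if not self.pending:
            return False
        oldest_deadline = next(iter(self.pending.values()))[0]
        if self.now() <= oldest_deadline:
            return False
        # Timeout: resubmit the unacknowledged block and every block
        # submitted after it, in order, then await acknowledgments again.
        for seq, (_, block) in self.pending.items():
            self.send(seq, block)
            self.pending[seq] = (self.now() + self.window, block)
        return True

# Simulated run with a fake clock and a recording transport:
sent = []
clock = {"t": 0.0}
m = AdmMirror(send=lambda s, b: sent.append(s),
              now=lambda: clock["t"], window=0.01)
m.write(1, b"a"); m.write(2, b"b")   # pipelined submissions
m.ack(1)                              # block 1 acknowledged in time
clock["t"] = 0.02                     # block 2's window expires
assert m.check_timeouts()             # halts and resends block 2
assert sent == [1, 2, 2]
```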
It may be appreciated by those skilled in the art that latency between network nodes, such as server 110 and remote storage device 130, may vary as a result of network traffic, etc. Accordingly, multiple readings may be made in order to determine the round trip latency. That is, a number of pings may be submitted by the server 110 to determine the average latency while also noting the minimum and maximum round trip latency values. Furthermore, the latency will change if the distance between the server 110 and the remote storage device 130 is changed for any reason. In this case (i.e., the distance between the server and the remote storage device changes), the latency measurements have to be updated.
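The multiple-reading approach above reduces to a small computation: take repeated round-trip samples and record their average, minimum, and maximum. The sample values below are illustrative, not measurements from the text.

```python
# Reduce repeated round-trip measurements (e.g., pings from the server
# to the remote storage device) to average, minimum, and maximum
# latency, as described above.

def latency_profile(samples_ms):
    if not samples_ms:
        raise ValueError("at least one round-trip sample is required")
    return (sum(samples_ms) / len(samples_ms),
            min(samples_ms), max(samples_ms))

# Illustrative ping samples in milliseconds:
avg, lo, hi = latency_profile([10.2, 9.8, 11.5, 10.0, 10.5])
print(avg, lo, hi)
```

If the distance between the endpoints changes, these samples would simply be retaken and the profile recomputed.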
FIG. 2 shows a more detailed block diagram of a storage area network (SAN) having a server 214 in communication with a local storage device (i.e., source) 208 (shown with a dashed box) and a remote storage device (i.e., destination) 212 (also shown with a dashed box). SAN 200 includes a transport network 204 to transmit data from the source to the destination 212. In one embodiment, server 214 issues an I/O command to write data to the source 208. The source 208 stores the data in local storage 210 and also transmits the data over the transport network to the destination 212 for storage in remote storage 213.
The transport network 204 may include any number of channels (e.g., fibers) and may have any configuration. The transport network 204 may be an optical network, such as a Fiber Optics network, or may be a packet-based network, such as a Multiprotocol Label Switching (MPLS) network.
In the transport network 204, the data is multicasted over at least two separate wavelength channels (or, in the context of an MPLS network, over two separate data channels). The transport network 204 thereby eliminates the switch-to-protect time traditionally needed when a failure occurs. In particular, because the data is being transmitted over multiple channels, no time is needed to switch to a new channel, synchronize a new data transmission between the source 208 and the destination 212, and then transmit the data over the new channel. Instead, the same data is already being transmitted over multiple wavelength channels before a failure occurs. Thus, no synchronization time is needed if a failure occurs.
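The reason multicasting eliminates the switch-to-protect pause can be sketched as follows: because every block already travels on at least two channels, the loss of one channel leaves a copy in flight and no resynchronization is needed. The function and channel interfaces below are hypothetical illustrations.

```python
# Sketch: send the same block on every wavelength channel; delivery
# succeeds if any one copy gets through, so a single channel failure
# causes no switch-to-protect interruption. Channel objects are
# modeled as callables that raise ConnectionError on failure.

def multicast_write(block, channels):
    delivered = False
    for channel in channels:
        try:
            channel(block)          # transmit one copy of the block
            delivered = True
        except ConnectionError:
            continue                # that wavelength failed; others carry on
    if not delivered:
        raise ConnectionError("all wavelength channels failed")
    return delivered

# One channel down, one up: the write still completes immediately.
received = []
def good(block): received.append(block)
def failed(block): raise ConnectionError

assert multicast_write(b"data", [failed, good])
assert received == [b"data"]
```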
The source 208 and destination 212 each have a respective switch 216a, 216b (generally 216). Switch 216 may take the form of a router, Ethernet switch, SAN switch, or any other network element capable of providing input data (e.g., optical