CA2508804A1 - Apparatus and method for a scalable network attach storage system - Google Patents

Apparatus and method for a scalable network attach storage system

Info

Publication number
CA2508804A1
CA2508804A1 (application number CA002508804A)
Authority
CA
Canada
Prior art keywords
nodes
file
termination
node
file server
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
CA002508804A
Other languages
French (fr)
Inventor
Thomas James Edsall
Mario Mazzola
Prem Jain
Silvano Gai
Luca Cafiero
Maurilio De Nicolo
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Cisco Technology Inc
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Publication of CA2508804A1 publication Critical patent/CA2508804A1/en

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • H04L67/1097Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/18File system types
    • G06F16/182Distributed file systems
    • G06F16/1824Distributed file systems implemented using Network-attached Storage [NAS] architecture
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • H04L67/1001Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • H04L67/1001Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L67/1004Server selection for load balancing
    • H04L67/1008Server selection for load balancing based on parameters of servers, e.g. available memory or workload
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/40Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass for recovering from a failure of a protocol instance or entity, e.g. service redundancy protocols, protocol state redundancy or protocol service redirection

Abstract

An apparatus and method for a scalable network attached storage system. The apparatus includes a scalable network attached storage system comprising one or more termination nodes, one or more file server nodes for maintaining file systems, one or more disk controller nodes for accessing storage disks, and a switching fabric coupling the termination nodes, file server nodes, and disk controller nodes. The termination nodes, file server nodes, and disk controller nodes can be scaled as needed to meet user demands. The method includes receiving a connection request from a client; selecting a termination node among a plurality of termination nodes, based on a predetermined metric, to establish a connection with the client in response to the connection request; terminating at the selected termination node a command request received from the client during the connection by extracting a file handle defined by the command request; forwarding the command request to a selected file server node among a plurality of file server nodes; interpreting the command request at the selected file server node and accessing an appropriate disk controller node among a plurality of disk controller nodes; and accessing disk storage through the appropriate disk controller node and serving the accessed data to the client. The number of termination nodes, file server nodes, and disk controller nodes is scalable as needed to meet user demands.

Description

APPARATUS AND METHOD FOR A SCALABLE NETWORK ATTACH
STORAGE SYSTEM
Related Applications The present invention is related to U.S. Application Serial Number 10/313,745 (attorney docket number ANDIP023) filed on December 6, 2002 entitled "Apparatus and Method for A High Availability Data Network Using Replicated Delivery" by Thomas Edsall et al. and U.S. Application Serial Number 10/313,305 (attorney docket number ANDIP018) filed on December 6, 2002 entitled "Apparatus and Method for a Lightweight, Reliable Packet-Based Protocol" by Gai Silvano et al., both filed on the same day and assigned to the same assignee as the present invention, and incorporated herein by reference for all purposes.
BACKGROUND OF THE INVENTION
1. Field of the Invention The present invention relates to data storage, and more particularly, to an apparatus and method for a scalable Network Attached Storage (NAS) system.
2. Background of the Invention With the increasing popularity of Internet commerce and network centric computing, businesses and other organizations are becoming more and more reliant on information. To handle all of this data, various types of storage systems have been developed such as Storage Array Networks (SANs) and Network Attached Storage (NAS). SANs have been developed based on the concept of storing and retrieving data blocks. In contrast, NAS systems are based on the concept of storing and retrieving files.
A typical NAS system is a single monolithic node that performs protocol termination, maintains a file system, manages disk space allocation and includes a number of disks, all managed by one processor at one location. Protocol termination is the conversion of NFS or CIFS requests over TCP/IP received from a client over a network into whatever internal inter-processor communication (IPC) mechanism is defined by the operating system relied on by the system. Some NAS system providers, such as Network Appliance of Sunnyvale, CA, market NAS systems that can process both NFS and CIFS requests so that files can be accessed by both Unix and Windows users respectively. With these types of NAS systems, the protocol termination node includes the capability to translate both NFS and CIFS requests into whatever communication protocol is used within the NAS system. The file system maintains a log of all the files stored in the system. In response to a request from the termination node, the file system retrieves or stores files as needed to satisfy the request. The file system is also responsible for managing files stored on the various storage disks of the system and for locking files that are being accessed. The locking of files is typically done whenever a file is open, regardless of whether it is being written to or read. For example, to prevent a second user from writing to a file that is currently being written to by a first user, the file is locked. A file may also be locked during a read to prevent another termination node from attempting to write or modify that file while it is being read. The disk controller handles a number of responsibilities, such as accessing the disks, managing data mirroring on the disks for back-up purposes, and monitoring the disks for failure and/or replacement. The storage disks are typically arranged in one of a number of different well known configurations, such as a known level of Redundant Array of Independent Disks (i.e., RAID1 or RAID5).
The protocol termination node and file system are usually implemented in microcode or software on a computer server operating either the Windows, Unix or Linux operating systems. Together, the computer, disk controller, and array of storage disks are then assembled into a rack. A typical NAS system is thus assembled and marketed as a stand-alone rack system.
A number of problems are associated with current NAS systems. Foremost, most NAS systems are not scalable. Each NAS system rack maintains its own file system. The file system of one rack does not inter-operate with the file systems of other racks within the information technology infrastructure of an enterprise. It is therefore not possible for the file system of one rack to access the disk space of another rack or vice versa. Consequently, the performance of NAS systems is typically limited to that of a single-rack system. Certain NAS systems are redundant.
However, even these systems do not scale very well and are typically limited to only two or four nodes at most.
Due to the aforementioned problems, the benchmarks (for example, the access rate and the overall response time) used to measure the performance of NAS
systems are relatively poor or even contrived. Often several of these independent systems will be used in parallel to get an aggregate performance. This is not true scaling, however, as these aggregate systems are typically not coordinated.
There are also many drawbacks associated with individual NAS systems. Individual NAS systems all have restrictions on the number of users that can access the system at any one time, the number of files that can be served at one time, and the data throughput (i.e., the rate or wait time before requested files are served). When there are many files stored on an NAS system, and there are many users, a significant amount of system resources is dedicated to managing overhead functions such as the locking of particular files that are being accessed by users. This overhead significantly impedes the overall performance of the system.
Another problem with existing NAS solutions is that the performance of the system cannot be tuned to the particular workload of an enterprise. In a monolithic system, there is a fixed amount of processing power that can be applied to the entire solution independent of the work load. However, some work loads require more bandwidth than others, some require more I/Os per second, some require very large numbers of files with moderate bandwidth and users, and still others require very large total capacity with limited bandwidth and a limited total number of files.
Existing systems typically are not very flexible in how the system can be optimized for these various work loads. They typically require the scaling of all components equally to meet the demands of perhaps only one dimension of the work load such as number of I/Os per second.
Another problem is high availability. This is similar to the scalability problem noted earlier where two or more nodes can access the same data at the same time, but here it is in the context of take-over during a failure. Systems today that do support redundancy typically do so in a one-to-one (1:1) mode whereby one system can back up just one other system. Existing NAS systems typically do not support redundancy for more than one other system.
An NAS architecture that enables multiple termination nodes, file systems, and disk controller nodes to be readily added to the system as required to provide scalability, improve performance and to provide high availability redundancy is therefore needed.
SUMMARY OF THE INVENTION
To achieve the foregoing, and in accordance with the purpose of the present invention, an apparatus and method for a scalable network attached storage system is disclosed. The apparatus includes a scalable network attached storage system, the network attached storage system including one or more termination nodes, one or more file server nodes for maintaining file systems, one or more disk controller nodes for accessing storage disks respectively, and a switching fabric coupling the one or more termination nodes, file server nodes, and disk controller nodes. The one or more termination nodes, file server nodes and disk controller nodes can be scaled as needed to meet user demands. The method includes receiving a connection request from a client, selecting a termination node among the plurality of termination nodes to establish a connection with the client in response to the connection request based on a predetermined metric, terminating at the selected termination node a command request received from the client during the connection by extracting a file handle defined by the command request, forwarding the command request to a selected file server node among a plurality of file server nodes, interpreting the command request at the selected file server node and accessing an appropriate disk controller node among a plurality of disk controller nodes, and accessing disk storage through the appropriate disk controller node and serving the accessed data to the client. The number of termination nodes, file server nodes, and disk controller nodes is scalable as needed to meet user demands.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 is a block diagram of a NAS system having a scalable architecture according to the present invention.
Figures 2A and 2B are flow diagrams illustrating the operation of a load balancer of the NAS system of the present invention.
Figure 3 is a flow chart illustrating the operation of termination nodes in the NAS system of the present invention.
Figures 4A through 4C are flow diagrams illustrating how the NAS system processes a request from a client according to the present invention.
Figure 5 is a block diagram illustrating an actual implementation of the NAS system according to one embodiment of the present invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
Referring to Figure 1, a block diagram of a NAS system having a scalable architecture according to the present invention is shown. The NAS system 10 includes a load balancer 12, one or more termination nodes 14a through 14x, one or more file server nodes 16a through 16y, one or more disk controller nodes 18a through 18z, and a plurality of disks 20. A switching fabric 22 is provided to interconnect the termination nodes 14a through 14x, the file server nodes 16a through 16y, and the disk controller nodes 18a through 18z. In an alternative embodiment, a Storage Array Network (not shown) could be used between the disk controller nodes 18a through 18z and the disks 20. The NAS system is connected to a network 24 through a standard network interconnect. The network 24 can be any type of computing network including a variety of servers and users running various operating systems such as Windows, Unix, Linux, or a combination thereof.
The load balancer 12 receives requests to access files stored on the NAS system 10 from users on the network 24. The main function performed by the load balancer 12 is to balance the number of active connections among the one or more termination nodes 14a through 14x. In other words, the load balancer 12 dynamically assigns user connections so that no one termination node 14 becomes a "bottleneck" due to handling too many connections. In a system 10 having three termination nodes 14, for example, if the first, second and third termination nodes 14 are handling seven (7), eleven (11), and three (3) connections respectively, then the load balancer 12 will forward the next connection to the third termination node 14 since it is handling the fewest number of connections. The load balancer 12 also redistributes connections among remaining termination nodes 14 in the event one fails or in the event a new termination node 14 is added to the NAS system 10. The load balancer 12 can also use other metrics to distribute the load among the various termination nodes 14. For example, the load balancer 12 can distribute the load based on CPU utilization, memory utilization and the number of connections, or any combination thereof.
Referring to Figures 2A and 2B, flow diagrams illustrating the operation of the load balancer 12 of the present invention are shown. Figure 2A illustrates the sequence of the load balancer 12 in maintaining a current list of the available termination nodes 14 in the NAS system 10. Figure 2B illustrates the sequence of the load balancer 12 in balancing the load of connections among the current list of available termination nodes.
In Figure 2A, the load balancer 12 sequences through the following routine. Initially the load balancer 12 determines if a new termination node 14 has been identified as functional (decision diamond 30). If yes, then the list of available termination nodes 14 is updated to include the new termination node 14 (box 32). Regardless of whether a new termination node 14 has been added or not, the load balancer 12 next determines if any of the available termination nodes 14 is non-functional (decision diamond 34). If yes, the non-functional termination node is removed from the available list (box 36). Regardless of whether a non-functional termination node 14 has been identified or not, the aforementioned sequence is repeated (control is returned to diamond 30). In this manner, the load balancer 12 is constantly updating the list of available termination nodes 14 in the NAS system 10.
In Figure 2B, the sequence for balancing connection loads among the available termination nodes 14 of the NAS system 10 is shown. Initially the load balancer 12 determines if it has received a new connection (decision diamond 40). If yes, the load balancer 12 ascertains the current load of each of the available termination nodes 14 in the system 10 (box 42). The termination node 14 with the smallest current load is then identified (box 44). The new connection is then assigned to the termination node 14 with the smallest load (box 46). The aforementioned sequence is repeated for subsequent requests. In this manner, the load balancer 12 is able to prevent bottlenecks by evenly distributing connection loads among the termination nodes 14 of the NAS system 10. As previously noted, the number of connections is but one metric that can be used by the load balancer 12. Other metrics such as CPU utilization and memory utilization could be used. With these embodiments, these other metrics alone or in combination would be considered by the load balancer 12 in assigning a new connection to a termination node 14. It should be noted that once a connection is made to a termination node 14, all subsequent received requests or packets associated with that connection are usually sent to the same termination node 14.
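To make the selection policy concrete, the following is a minimal sketch of the Figure 2A/2B behavior. The node fields, the health flag, and the CPU/memory tie-breaker are illustrative assumptions; the patent does not specify an implementation.

```python
# Minimal sketch of the load balancer behavior described above (Figures 2A/2B).
# Field names and the CPU/memory tie-breaker are assumptions for illustration.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class TerminationNode:
    node_id: str
    connections: int = 0
    cpu_util: float = 0.0   # 0.0 - 1.0
    mem_util: float = 0.0   # 0.0 - 1.0
    healthy: bool = True

class LoadBalancer:
    def __init__(self) -> None:
        self.nodes: List[TerminationNode] = []

    def update_available(self, node: TerminationNode) -> None:
        """Figure 2A: add newly functional nodes, remove failed ones."""
        known_ids = {n.node_id for n in self.nodes}
        if node.healthy and node.node_id not in known_ids:
            self.nodes.append(node)
        elif not node.healthy:
            self.nodes = [n for n in self.nodes if n.node_id != node.node_id]

    def assign_connection(self) -> Optional[TerminationNode]:
        """Figure 2B: assign the new connection to the least-loaded node."""
        candidates = [n for n in self.nodes if n.healthy]
        if not candidates:
            return None
        # Primary metric: connection count; CPU + memory utilization breaks ties.
        best = min(candidates, key=lambda n: (n.connections, n.cpu_util + n.mem_util))
        best.connections += 1
        return best
```

In the three-node example above (7, 11, and 3 active connections), this selection returns the third node, matching the behavior described in the text.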
The termination nodes 14 each perform a number of functions. The termination nodes 14 terminate connection requests received through the load balancer 12 from clients over the network 24. The received connection requests are typically TCP/IP or UDP/IP protocol messages. Termination involves the conversion or translation of the upper layer protocols, usually either NFS or CIFS, into the communication protocol used by the switching fabric 22. The termination nodes also determine which file server node 16 will receive the translated request based on the content of the received NFS or CIFS request. The termination nodes 14 also terminate XDR and RPC messages when NFS requests are received, maintain additional state information with CIFS messages, and are capable of detecting the failure of any of the server nodes 16. XDR is the External Data Representation and RPC is the Remote Procedure Call; these are protocol layers between TCP and NFS. XDR creates a standard data format so that different operating systems can communicate in a common way, and RPC allows one machine to run procedures on a remote machine. In CIFS, the file handle is not global, i.e. it is specific to the connection. This means that each connection for CIFS can have a different file handle for the same file. Since it is desirable for all of the TCP/IP termination nodes 14 to make the same decision as to which node 16 is responsible for a given file independent of the connection, the CIFS handle has to be translated into the handle used internally for the file. Failures may be detected in a number of known ways, for example by sending out periodic messages and acknowledgements between the nodes 16 and the nodes 14.
The selection of the file server node 16a through 16y may depend on a number of factors. One such factor is the range of the file handles served by each file server node 16. When a request is received, the termination node routes the request based on the file handle defined by the request. For example, file server node 16a may be assigned file handle range 100 to 499, file server node 16b may be assigned file handle range 500 to 699, and file server node 16c may be assigned file handle range 700 to 999, etc. Whenever a request is received, the responsible termination node 14 will forward the request to the appropriate file server node 16 based on the file handle defined by the request. It should be noted that the file ranges mentioned herein are only exemplary and they should in no way be construed as somehow limiting the invention.
In other embodiments, certain file server nodes 16 can be pre-assigned to handle certain types of files. For example, if one of the file server nodes 16 is designated to access MPEG files, then any MPEG request is automatically routed by the termination node 14 handling that request to the designated MPEG file server node 16. Examples of other types of files that may have a dedicated file server node 16 include ".doc" files, web pages identified by .htm or .html, or images identified by .jpg, .gif, .bmp, etc.
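The two routing policies just described (fixed file-handle ranges, with optional per-type overrides such as a dedicated MPEG server) can be sketched as follows. The specific ranges, server identifiers, and extension table are example values, not ones taken from the patent.

```python
# Sketch of termination-node routing: file-handle ranges plus optional
# file-type overrides. Ranges, node names, and extensions are examples only.
from bisect import bisect_right
from typing import Dict, List, Tuple

class FileServerRouter:
    def __init__(self) -> None:
        # (first_handle, last_handle, file_server_id), sorted by first_handle.
        self.handle_ranges: List[Tuple[int, int, str]] = [
            (100, 499, "fs-16a"),
            (500, 699, "fs-16b"),
            (700, 999, "fs-16c"),
        ]
        # Dedicated servers for particular file types (e.g. an MPEG server).
        self.type_overrides: Dict[str, str] = {".mpg": "fs-mpeg", ".jpg": "fs-img"}

    def route(self, file_handle: int, file_name: str = "") -> str:
        # Type-based routing takes precedence when a dedicated server exists.
        for ext, server in self.type_overrides.items():
            if file_name.endswith(ext):
                return server
        # Otherwise, locate the range that contains the file handle.
        starts = [r[0] for r in self.handle_ranges]
        idx = bisect_right(starts, file_handle) - 1
        if idx >= 0:
            first, last, server = self.handle_ranges[idx]
            if first <= file_handle <= last:
                return server
        raise KeyError(f"no file server node owns file handle {file_handle}")
```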
Referring to Figure 3, a flow chart illustrating the operation of a termination node 14 is shown. When a request is received from the load balancer 12 (box 50), the responsible termination node 14 terminates either the TCP or UDP protocol running on top of IP (box 52). Thereafter, the termination node 14 determines if the request is either NFS or CIFS (decision diamond 54). If NFS, then the termination node 14 terminates XDR and RPC (box 56). After the XDR and RPC termination, or if the request was CIFS, the termination node 14 next extracts the file handle defined by the request (box 58). The termination node 14 then determines or maps the appropriate file server node 16 to send the request to based on the extracted file handle. For CIFS requests, this mapping is per connection. For NFS requests, the mapping is per system (box 60). In other words, a given file handle may imply one file for a given CIFS connection and the same file handle may imply a different file for a different CIFS connection. Each CIFS connection must therefore keep its own mapping of either a file handle to a node 16 or a file handle to an internal version of the file handle which is consistently mapped to a file for the entire NAS system. The NFS file handles, on the other hand, are already consistent for the entire NAS system, i.e., the file handle to file mapping for one NFS connection is exactly the same on all NFS connections. The termination node 14 converts the request into a common format for both NFS and CIFS (box 62) and then sends the converted request to the appropriate file server node 16 (box 64). The aforementioned sequence is repeated for subsequent requests that are received.
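A compact sketch of this Figure 3 flow is shown below: after TCP/UDP termination, an NFS request carries a system-wide handle, while a CIFS handle must first be mapped from its per-connection value to the internal handle before routing. The record and field names are assumptions, parsing is represented by pre-filled fields, and the router is assumed to expose a route(handle) method like the sketch above.

```python
# Sketch of the Figure 3 termination flow. Field names are illustrative, and
# "router" is assumed to provide route(handle) as in the earlier sketch.
from dataclasses import dataclass
from typing import Dict, Tuple

@dataclass
class ParsedRequest:
    protocol: str        # "NFS" or "CIFS" (decision diamond 54)
    handle: int          # NFS: system-wide handle; CIFS: per-connection handle
    operation: str       # e.g. "READ", "WRITE"
    connection_id: str

@dataclass
class InternalRequest:   # common internal format sent over the fabric (box 62)
    internal_handle: int
    operation: str

def terminate(req: ParsedRequest,
              cifs_handle_map: Dict[Tuple[str, int], int],
              router) -> Tuple[str, InternalRequest]:
    if req.protocol == "CIFS":
        # CIFS handles are connection-specific, so translate to the handle
        # used internally for the file across the whole NAS system (box 58/60).
        internal = cifs_handle_map[(req.connection_id, req.handle)]
    else:
        internal = req.handle        # NFS handles are already system-wide
    target = router.route(internal)  # map handle -> responsible node 16 (box 60)
    return target, InternalRequest(internal, req.operation)  # forward (box 64)
```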
The file server nodes 16 also perform a number of functions within the NAS system 10. Foremost, each file server node 16 implements its own file system. Accordingly, each file server node 16 is responsible for retrieving files through the disk controllers 18a - 18z as necessary to service received requests. Each file server node 16 is also responsible for terminating the requests received from the termination nodes 14 and the disk controller nodes 18.
According to one embodiment, the file server nodes 16 implement a "federated" or "loosely coupled" file system. Each file server node 16 does not have to communicate with the other file server nodes 16 within the NAS system 10. This makes the file server nodes 16 scalable because each file server node 16 does not have to monitor or keep track of the files the other file server nodes 16 are accessing. Each file server node 16 need not check or "ask permission" from the other file server nodes 16 before attempting to access a file. This arrangement significantly reduces management overhead within the NAS system 10.

The individual file server nodes 16 also take responsibility for their name space ranges at the file level. In other words, the granularity of the division of responsibility for the name space between various file server nodes is at the file level. The division of labor among the various file server nodes 16 for regions of the name space, however, may vary dynamically. Any changes in the name space are propagated back to the termination nodes 14 so that they know which file server node 16 is responsible for a particular request (associated with a particular file) from the users.
According to one embodiment, the file server nodes 16 communicate with one another upon creation or transfer of name space among the file server nodes 16. For example, if one file server node has too large a name space and becomes too busy handling all the requests within its name space, then some or all of that name space can be transferred to another file server node 16. Each file server node 16 maintains a table that indicates the name space managed by each of the file server nodes 16a through 16y. When name space is transferred, the table of each file server node 16 is updated. Similarly, when name space is added to the NAS system 10, the table of each file server node 16 is again updated. It should be noted that it is not necessary or even desirable for each node 16 to keep a complete map of the name space. Therefore, in alternative embodiments, each node 16 keeps track of its own name space, i.e. all the files it is currently responsible for, plus the location of all the files that were created on that node 16 that may have been moved to a different node.
It should be noted that the termination nodes 14 should be made aware of the current name space mapping so that they can direct the terminated requests accordingly. If a termination node 14 has a name space mapping that is out of date, it may send the request to the wrong server node 16. That server node 16 may then have to inform the requesting termination node 14 of the change to the name space and the termination node 14 will have to re-issue the request to the correct server node 16.
Each server node 16 therefore keeps track of which server node 16 created a file and where the files have migrated. Consider an example where server node 16a creates file handles in the range 0-999, server node 16b creates file handles in the range 1000-1999, and server node 16c creates file handles in the range 2000-2999. All of the termination nodes 14 are aware of this static configuration and direct file requests accordingly. Assume that server node 16a creates a file "A" with file handle 321. The termination nodes 14 all know that when they see a reference to file handle 321, it falls in the range 0-999 and is therefore sent to server node 16a.
Now assume that file "A" migrates from 16a to 16b due to load balancing. If a request comes into termination node 14a for file handle 321, termination node 14a will send the request to server node 16a. However, server node 16a knows that file handle 321 has migrated to server node 16b. Consequently, server node 16a sends a message back to termination node 14a informing it that file handle 321 is now being handled by server node 16b. Termination node 14a will then send the request to server node 16b and record this exception in its mapping table for all subsequent requests for file handle 321. All subsequent requests for file A will then be forwarded directly to server node 16b by termination node 14a.
Assume again that the same file "A" is migrated from server node 16b to 16c. When another request for file A is received, termination node 14a notes the exception in its mapping table for file handle 321 and sends the request to server node 16b. The server node 16b knows that file handle 321 has migrated to some other node and therefore responds to termination node 14a to remove the exception. Termination node 14a then sends the request to server node 16a according to the default mapping. Server node 16a responds back to termination node 14a that it should send this and all subsequent requests for file handle 321 to server node 16c. All subsequent requests are handled by server node 16c until file A migrates to another server node and the above update sequence is repeated.
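The exception-table behavior walked through above can be sketched as a small mapping object held by each termination node: a static default mapping plus per-handle exceptions that are added, replaced, or cleared by redirection messages from the server nodes. The class and message shapes are assumptions for illustration.

```python
# Sketch of the per-termination-node mapping with migration exceptions.
# default_lookup stands in for the static handle-range mapping; the exact
# redirection message format is an assumption, not the patent's protocol.
from typing import Callable, Dict, Optional

class TerminationMapping:
    def __init__(self, default_lookup: Callable[[int], str]) -> None:
        self.default_lookup = default_lookup      # e.g. FileServerRouter.route
        self.exceptions: Dict[int, str] = {}      # handle -> current owner node

    def lookup(self, handle: int) -> str:
        """Where to send a request: exception first, otherwise the default."""
        return self.exceptions.get(handle, self.default_lookup(handle))

    def apply_redirect(self, handle: int, new_owner: Optional[str]) -> None:
        """Process a redirection message from a server node.

        new_owner = "fs-16b" records (or updates) an exception; None means
        "remove the exception and retry the default mapping", which is how the
        double-migration case described in the text is handled.
        """
        if new_owner is None:
            self.exceptions.pop(handle, None)
        else:
            self.exceptions[handle] = new_owner
```

In the example above, termination node 14a would call apply_redirect(321, "fs-16b") after the first migration, apply_redirect(321, None) when 16b clears the stale exception, and finally apply_redirect(321, "fs-16c") when 16a points it at the file's new owner.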
It is useful to note that with this scheme, the state of all the files does not have to be updated atomically. Only one server node 16 needs to know where a particular file is at any point in time. In the example above, the server node 16a keeps track of the location of file handle 321. Since this information does not need to be distributed atomically, the present invention provides a highly scalable NAS solution.
Another noteworthy aspect with this scheme is that the server node 16 that creates a file handle is responsible for permanently storing information related to that file handle. This is required so that the system 10 knows where all the files are after a catastrophic event, such as a power failure. Since the server node where the file was created (node 16a in the example for file "A") is the single authority of where the file is, it is the only server node responsible for writing this information into stable storage.

In alternative embodiments, updates to the mapping scheme may be implemented in a variety of ways different from the exception handling scheme described above. For example, the nodes 16 can propagate mapping exceptions to the termination nodes 14 as they occur, in the background, without substantially interfering with normal communications between the two sets of nodes 14 and 16. If that propagation has completed, there is no redirection. If it has not completed, there may be some redirection. Overall, the total performance impact is negligible: the redirection typically does not happen (because the file has not moved or the exception entries are already in node 14), or it involves only one level of indirection (because a double move is rare). "Redirection" occurs when node 16a informs node 14a that file 321 is located on node 16b in the first part of the above example. "Propagation" is when the nodes 14 are informed that file 321 has moved to node 16b before the nodes 14 even try to access file 321. This propagation will effectively eliminate the redirection previously described. Since redirection will likely have some performance impact due to the time and processing requirements for the additional messages back and forth between the nodes 14 and the nodes 16, it is desirable to avoid redirection.
There is, however, a window of time between when a file has moved from 16a to 16b and when each of the nodes 14 has updated its mapping table to reflect that move. If a file request comes in from the network during this window of time, there are two possible ways to handle this: (i) block all node 14 access to a file that is moving until the move has completed and the mapping tables in all the nodes 14 have been updated; or (ii) allow the node 14 to access the file at any time, including during the window in which the node 14 has inaccurate information about the current location of the file, and handle this case with redirection. The second option is a practical way to handle the problem and it is a reasonable solution from a performance perspective because the overhead for redirection is not particularly large.
In addition, with propagation of the mapping exceptions from nodes 16 to nodes 14, the probability that an access occurs for a file while the nodes 14 have the wrong location information for that file is fairly small. This further reduces the performance impact of moving files between different nodes 16.
The exception information could also be kept in a central location so that each server node 16 only needs to know about the files it is currently responsible for. If it gets a request for a file handle of a file it does not currently have, it will direct the termination node 14 to consult the central database of exceptions for the current location of the file. This has the benefit that the server nodes 16 only need to keep information for the files that they have, which they are required to maintain anyway.
According to yet another embodiment, the file server nodes 16 can be configured to cache recently and/or frequently accessed files. The advantage of maintaining cached copies is that these files can be immediately served by the file server nodes 16 without the delay of accessing the disks 20. Files can be cached based on the principles of either temporal or spatial locality, or a combination thereof. The cached files can be replaced using any appropriate replacement algorithm for the kind of file being accessed, such as least recently used or first-in first-out, for example.
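A minimal version of such a cache, using least-recently-used eviction keyed by the internal file handle, might look like the following. The capacity and keying scheme are assumptions; the patent leaves the replacement policy open.

```python
# Minimal LRU file cache for a file server node; capacity and keying by the
# internal file handle are assumptions for illustration.
from collections import OrderedDict
from typing import Optional

class FileCache:
    def __init__(self, capacity: int = 1024) -> None:
        self.capacity = capacity
        self._entries: "OrderedDict[int, bytes]" = OrderedDict()

    def get(self, handle: int) -> Optional[bytes]:
        data = self._entries.get(handle)
        if data is not None:
            self._entries.move_to_end(handle)    # mark as most recently used
        return data                              # None -> fetch via disk controller

    def put(self, handle: int, data: bytes) -> None:
        self._entries[handle] = data
        self._entries.move_to_end(handle)
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)    # evict the least recently used file
```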
It should be noted that the file server nodes 16 do communicate with one another to detect failures for redundancy purposes. This communication, however, is relatively insignificant and does not vary depending on the load volume on the system 10.
According to various embodiments, the file server nodes 16 may implement either a dynamic distributed file system such as CODA or a clustered file system. For more information on CODA, see for example "The Coda Distributed File System", by Peter J. Braam, School of Computer Science, Carnegie Mellon University, incorporated by reference herein. Other file systems that may be used include for example UFS (Unix File System) or AFS (Andrew File System).
According to another embodiment, the file server nodes 16 are each capable of locking a file that it is accessing in accordance with a number of possible locking semantics. With exclusive locks, for example, access of a file by one file server node 16 would lock out both read and write attempts by other file server nodes 16. Alternatively, if one file server node 16 is writing to a file, it will place a lock on that file to prevent a second client from writing to that file. However, a read access may be permitted.
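The two locking policies just described, a strict exclusive lock and a write lock that still permits reads, can be sketched with a simple per-file lock object. This is a single-process illustration of the semantics only, not the distributed lock protocol a real NAS system would need.

```python
# Per-file lock sketch covering the two semantics described above. This is a
# local illustration of the policies, not a distributed locking protocol.
class FileLock:
    def __init__(self, exclusive: bool = False) -> None:
        self.exclusive = exclusive   # True: any open excludes all other access
        self.readers = 0
        self.writing = False

    def try_read(self) -> bool:
        if self.exclusive and (self.readers > 0 or self.writing):
            return False             # exclusive semantics: one accessor at a time
        # Relaxed semantics: reads are permitted even while a write holds the lock.
        self.readers += 1
        return True

    def try_write(self) -> bool:
        if self.writing:
            return False             # never two concurrent writers
        if self.exclusive and self.readers > 0:
            return False             # exclusive semantics: readers block the writer
        self.writing = True
        return True

    def release_read(self) -> None:
        self.readers = max(0, self.readers - 1)

    def release_write(self) -> None:
        self.writing = False
```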
Finally, as previously noted, the individual file server nodes 16 can be configured or optimized for handling specific types of requests. With the MPEG example, the responsible file server node 16 can be optimized to pre-fetch the blocks of data from the disks 20 based on the assumption that all the frames in the MPEG file will need to be served. In another example, if a file is used for a database index, an optimization may be to provide more cache memory. This would reduce the occurrence of pre-fetching since the data access pattern will likely be random with bursts of activity on the same location of a file. In another example involving a log file, a single read cache and a relatively large amount of write cache may be provided since the data is primarily write-only and is read only during error recovery. In yet another example, generally small web-type files may be optimized by using a block layout on the disk that is optimized for reads versus writes and for small files versus large files. It should be noted that numerous other specific optimizations could be implemented and that those provided above are merely illustrative and should not be construed as limiting in any way.
The disk controller nodes 18 are responsible for managing the disks 20 respectively. As such, the disk controller nodes 18 are responsible for file mirroring, relocation, and other disk related activities such as those associated with whatever level of RAID is used in the system 10. In addition, the disk controller nodes terminate any requests received from the file server nodes 16, virtualize physical disk space, access the appropriate storage blocks to retrieve requested files, and act as a data block server. The controller nodes 18 also monitor their disks 20 for failure and replacement, and perform mirroring of the data stored on the disks for back-up purposes.
As previously noted, the disks 20 can be arranged in any type of configuration, such as RAID 1 for example. If the disk controller nodes 18 implement RAID 1, for example, they will mirror all the data across two or more physical disks, i.e., each disk controller node 18 will create two copies when a write occurs and will read only one of the copies when a read occurs. With this implementation, the server node 16, on the other hand, thinks that it is writing to a single, standard disk. But in reality, it is writing to a virtual disk that node 18 then implements in physical disk space. In other words, the virtual view of the storage is different than the physical implementation.
In another example, consider a large file system of 360 Gbytes. Currently a single disk of this size is not feasible. Since file systems typically cannot span multiple disks, the file system running on the server node 16 must see a disk that is at least 360 Gbytes. Consequently, the disk controller nodes 18 have to logically concatenate a number of physical disks together to present the desired disk space to the server node 16. In alternative embodiments, other types of storage mediums may be used, such as electro-magnetic tape, CD-ROM, or silicon based memory chips.
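The virtualization described in the last two paragraphs, presenting one large virtual disk by concatenating physical disks and, for RAID 1, writing every block twice while reading one copy, is sketched below. The block-level API, disk geometry, and the placement of the mirror copies are assumptions for illustration.

```python
# Sketch of disk-controller virtualization: concatenate physical disks into one
# virtual disk and, for RAID 1, return two write targets per virtual block.
# Geometry and mirror placement are assumptions, not the patent's layout.
from typing import List, Tuple

class VirtualDisk:
    def __init__(self, disk_sizes_in_blocks: List[int], mirrored: bool = True) -> None:
        self.sizes = disk_sizes_in_blocks    # blocks per primary physical disk
        self.mirrored = mirrored             # RAID 1: keep a second copy

    def total_blocks(self) -> int:
        return sum(self.sizes)               # the size the file server node sees

    def to_physical(self, virtual_block: int) -> Tuple[int, int]:
        """Map a virtual block number to (physical_disk_index, block_on_disk)."""
        offset = virtual_block
        for disk_index, size in enumerate(self.sizes):
            if offset < size:
                return disk_index, offset
            offset -= size
        raise IndexError("virtual block beyond the concatenated disk space")

    def write_targets(self, virtual_block: int) -> List[Tuple[int, int]]:
        """RAID 1 write: the primary copy plus a mirror on a paired disk."""
        disk_index, block = self.to_physical(virtual_block)
        if not self.mirrored:
            return [(disk_index, block)]
        # Assume each primary disk is paired with a mirror of the same geometry.
        return [(disk_index, block), (disk_index + len(self.sizes), block)]
```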
The switching fabric 22 includes a number of switches. In various embodiments, the switching fabric can include Fibre Channel switches, Ethernet switches, or a combination thereof. Similarly, a number of different communication protocols can be used over the switching fabric. For example, TCP/IP or FCP running over Ethernet or Fibre Channel could be used as the communication protocol across the switching fabric 22. In one embodiment, a protocol specifically designed for the NAS system 10, hereafter referred to as the "ABC" protocol, may be used. For a more detailed explanation of the ABC protocol, see U.S. Patent Application Serial No.
10/313,305 (attorney docket number ANDIP018) filed on December 6, 2002, entitled Apparatus and Method for a Lightweight, Reliable, Packet-Based Transport Protocol, and assigned to the same assignee, incorporated by reference herein for all purposes.
Referring to Figures 4A through 4C, flow diagrams illustrating how the NAS system 10 processes a request from a client according to the present invention are shown.
As illustrated in Figure 4A, when a client in the network 24 wishes to access the NAS system 10, the client initiates a connection through the network 24 (box 102). The load balancer 12, in response, selects a termination node 14 as described above (box 104). The selected termination node 14 establishes a connection with the client (box 106). The client then sends the NFS/CIFS command to the selected termination node 14 (box 108), which terminates the TCP/IP request and extracts the NFS/CIFS command (box 110).
As illustrated in Figure 4B, the selected termination node 14 performs any necessary virtual to real file address translations (box 112) and then determines which file server node 16 should receive the request. As previously noted, the file server node 16 is generally selected based on the contents of the request (box 114). The selected file server node 16 interprets the NFS/CIFS command and accesses the appropriate disk controller node 18 (box 116). Thereafter, the disk controller node 18 accesses the appropriate disk 20 and provides the requested file to the selected file server node 16 (box 118).
Finally, as illustrated in Figure 4C, the file server node 16 provides the file to the selected termination node 14 (box 120), which in turn, provides the file to the client over the network 24 (box 122).

Referring to Figure 5, a block diagram illustrating an implementation of the NAS system according to one embodiment of the present invention is shown.
The NAS system 200 includes a pair of load balancers 12a and 12b, a pair of general nodes 202a and 202b, a plurality of termination nodes 14a through 14c, a plurality of file server nodes 16a through 16c, a plurality of disk controller nodes 18a through 18c, and a plurality of disks 20 associated with the disk controller nodes 18a through 18c respectively. The switching fabric 22 of this embodiment includes two Gigabit Ethernet switches 204. Redundant connections are provided between each of the above listed elements for high performance and as back-up in the event one of the connections goes down. The "general nodes 202" are responsible for management of the system. For example, when the administrator logs into the file server to set quotas for users or to set up user access control, the administrator must do this through a node in the system 200. It could be handled by any node in the system, but if there is a dedicated node (or two for redundancy) it makes the implementation easier.
Basically the general nodes 202 are responsible for system configuration and management.
They do not participate in the data path of file access. They may be used for determining when various nodes fail and for implementing policies for data migration from one node 16 to another, all of which do not impact performance.
In this embodiment, TCP/IP is used for communications between users on the network 24 and the termination nodes 14. The ABC protocol is used for communication between the termination nodes 14 and the file server nodes 16. SCSI over ABC is used for communications between the file server nodes 16 and the disk controller nodes 18. Finally, SCSI over Fibre Channel is used for communications between the disk controller nodes 18 and the disks 20.
In one embodiment of the invention, the load balancers 12a and 12b can be implemented in software or microcode executed on one or more computers. In alternative embodiments, the load balancers 12a and 12b can be implemented in a hardware system including one or more application specific logic chips, programmable logic devices such as a Field Programmable Logic Device, or a combination thereof. Similarly, both the termination nodes 14 and the file server nodes 16 can be implemented on computers, such as a server, dedicated hardware, programmable logic, or a combination thereof. Furthermore, one or more of the termination nodes 14 and the file server nodes 16 may be in a single CPU or multiple CPUs and the switching fabric may be replaced by inter- or intra-CPU communication mechanism(s).
The termination nodes 14, file server nodes 16, and the disk controller nodes 18 are each independently scalable within the NAS system of the present invention. If one type of node becomes over-loaded, then additional nodes of that type can be added to the system until the problem is corrected.
The embodiments of the present invention described above are to be considered as illustrative and not restrictive. The invention is not to be limited to the details given herein, but may be modified within the scope and equivalents of the appended claims.

Claims (35)

We claim:
1. An apparatus comprising:

a scalable network attached storage system, the network attached storage system comprising:

one or more termination nodes;

one or more file server nodes for maintaining file systems respectively;
one or more disk controller nodes for accessing storage disks respectively; and a switching fabric coupling the one or more termination nodes, file server nodes, and disk controller nodes, wherein the one or more termination nodes, file server nodes and disk controller nodes can be added to or deleted from the scalable network attached storage system as needed.
2. The apparatus of claim 1, further comprising a load balancer configured to be coupled to the termination nodes, the load balancer configured to balance the load of connections among the one or more termination nodes.
3. The apparatus of claim 2, wherein the load balancer balances the load of connections among the one or more termination nodes based on one or more of the following metrics: the number of connections per termination node; utilization of the termination nodes; memory utilization; or a combination thereof.
4. The apparatus of claim 2, wherein the load balancer is further configured to maintain a current list of the termination nodes as they may be added or deleted from the scalable network attached storage system.
5. The apparatus of claim 2, wherein the load balancer is further configured to forward all requests associated with a connection to the same termination node as the requests are received.
6. The apparatus of claim 1, wherein each of the one or more termination nodes is configured to terminate requests as they are received.
7. The apparatus of claim 6, wherein the requests are either TCP or UDP running on IP.
8. The apparatus of claim 6, wherein the termination nodes are further configured to determine if any received requests are NFS or CIFS.
9. The apparatus of claim 8, wherein the termination nodes are further configured to terminate XDR and RPC for NFS requests.
10. The apparatus of claim 6, wherein the one or more termination nodes are configured to extract the file handle from any request it receives respectively.
11. The apparatus of claim 10, wherein the one or more termination nodes are configured to send the request to a selected one of the file server nodes based on the extracted file handle.
12. The apparatus of claim 11, wherein the one or more termination nodes are configured to send the request to the selected one of the file server nodes in a common format regardless if the request was NFS or CIFS.
13. The apparatus of claim 6, wherein the one or more termination nodes are configured to send the request to a selected file server node based on the type of file defined by the request.
14. The apparatus of claim 1, wherein the one or more termination nodes are configured to detect failures of the one or more file server nodes.
15. The apparatus of claim 1, wherein the one or more file server nodes are each configured to retrieve files through the one or more disk controller nodes as necessary to service any received requests.
16. The apparatus of claim 1, wherein the one or more file server nodes are each configured to terminate any requests received from the termination nodes and the disk controller nodes.
17. The apparatus of claim 1, wherein each of the one or more file server nodes maintains a federated file system that does not keep track of the files accessed by the other file server nodes.
18. The apparatus of claim 1, wherein the file systems maintained by each of the one or more server nodes services a different name space range respectively.
19. The apparatus of claim 18, wherein the different name space ranges serviced by the one or more server nodes is allocated dynamically.
20. The apparatus of claim 19, wherein the name space allocated to each of the one or more server nodes is dynamically propagated to the one or more termination nodes.
21. The apparatus of claim 1, wherein each of the file server nodes is capable of locking a file when accessing that file.
22. The apparatus of claim 21, wherein the file is locked when being read, when being written, or both.
23. The apparatus of claim 1, wherein the one or more file server nodes are each further configured to maintain a cache of recently accessed files that can be served without accessing the storage disks respectively.
24. The apparatus of claim 23, wherein the files in the caches are replaced using a replacement algorithm, the replacement algorithm being one of the following: last recently used, or first in first out.
25. The apparatus of claim 1, wherein the one or more file server nodes are optimized for handling certain types of specific requests.
26. The apparatus of claim 1, wherein the storage disks are arranged in one or more redundant arrays of independent disks.
27. The apparatus of claim 1, wherein each of the disk controller nodes performs one or more of the following functions: file mirroring for backup purposes, file relocation, terminating requests received from the one or more file server nodes, virtualization of disk space, monitoring the storage disks for failure and replacement, and acting as a data block server.
28. The apparatus of claim 1, wherein the switching fabric comprises the following types of switches: Ethernet switches, Fibre Channel switches, or a combination thereof.
29. The apparatus of claim 1, further comprising a storage array network coupled between the one or more disk controller nodes and the storage disks.
30. The apparatus of claim 1, wherein one or more of the termination nodes and the file server nodes are implemented in one or more CPUs and the switching fabric is at least partially implemented using an inter- and/or an intra-CPU communication mechanism.
31. A method comprising:

receiving a connection request from a client;

selecting a termination node among the plurality of termination nodes to establish a connection with the client in response to the connection request based on a predetermined metric;

terminating at the selected termination node a command request received from the client during the connection by extracting a file handle defined by the command request;

forwarding the command request to a selected file server node among a plurality of file server nodes;

interpreting the command request at the selected file server node and accessing an appropriate disk controller node among a plurality of disk controller nodes; and accessing disk storage through the appropriate disk controller node and serving the accessed data to the client.
32. The method of claim 31, wherein the predetermined metric comprises one of the following: the load among the plurality of termination nodes, CPU
utilization, memory utilization, or a combination thereof.
33. The method of claim 32, wherein the forwarding of the command request to a selected file server node is based on the file handle extracted from the command request.
34. The method of claim 32, wherein the forwarding of the command request to a selected file server node is based on the type of file defined by the command request.
35. The method of claim 31, further comprising scaling the number of termination nodes, file server nodes, and disk controller nodes as needed to meet user demands.
CA002508804A 2002-12-06 2003-11-19 Apparatus and method for a scalable network attach storage system Abandoned CA2508804A1 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US10/313,306 US20040139167A1 (en) 2002-12-06 2002-12-06 Apparatus and method for a scalable network attach storage system
US10/313,306 2002-12-06
PCT/US2003/037234 WO2004053677A2 (en) 2002-12-06 2003-11-19 Apparatus and method for a scalable network attach storage system

Publications (1)

Publication Number Publication Date
CA2508804A1 true CA2508804A1 (en) 2004-06-24

Family

ID=32505836

Family Applications (1)

Application Number Title Priority Date Filing Date
CA002508804A Abandoned CA2508804A1 (en) 2002-12-06 2003-11-19 Apparatus and method for a scalable network attach storage system

Country Status (6)

Country Link
US (1) US20040139167A1 (en)
EP (1) EP1570337A2 (en)
CN (1) CN1723434A (en)
AU (1) AU2003291122A1 (en)
CA (1) CA2508804A1 (en)
WO (1) WO2004053677A2 (en)

Families Citing this family (96)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6671773B2 (en) * 2000-12-07 2003-12-30 Spinnaker Networks, Llc Method and system for responding to file system requests
US6868417B2 (en) * 2000-12-18 2005-03-15 Spinnaker Networks, Inc. Mechanism for handling file level and block level remote file accesses using the same server
US7127565B2 (en) * 2001-08-20 2006-10-24 Spinnaker Networks, Inc. Method and system for safely arbitrating disk drive ownership using a timestamp voting algorithm
US7873700B2 (en) * 2002-08-09 2011-01-18 Netapp, Inc. Multi-protocol storage appliance that provides integrated support for file and block access protocols
US6938184B2 (en) * 2002-10-17 2005-08-30 Spinnaker Networks, Inc. Method and system for providing persistent storage of user data
US7475142B2 (en) 2002-12-06 2009-01-06 Cisco Technology, Inc. CIFS for scalable NAS architecture
US7443845B2 (en) * 2002-12-06 2008-10-28 Cisco Technology, Inc. Apparatus and method for a lightweight, reliable, packet-based transport protocol
JP2004227097A (en) * 2003-01-20 2004-08-12 Hitachi Ltd Control method of storage device controller, and storage device controller
JP2004280283A (en) 2003-03-13 2004-10-07 Hitachi Ltd Distributed file system, distributed file system server, and access method to distributed file system
JP4322031B2 (en) * 2003-03-27 2009-08-26 株式会社日立製作所 Storage device
US7346664B2 (en) * 2003-04-24 2008-03-18 Neopath Networks, Inc. Transparent file migration using namespace replication
US7587422B2 (en) * 2003-04-24 2009-09-08 Neopath Networks, Inc. Transparent file replication using namespace replication
US7831641B2 (en) * 2003-04-24 2010-11-09 Neopath Networks, Inc. Large file support for a network file server
JP4329412B2 (en) * 2003-06-02 2009-09-09 株式会社日立製作所 File server system
US20050089054A1 (en) * 2003-08-11 2005-04-28 Gene Ciancaglini Methods and apparatus for provisioning connection oriented, quality of service capabilities and services
US7539143B2 (en) * 2003-08-11 2009-05-26 Netapp, Inc. Network switching device ingress memory system
WO2005029251A2 (en) * 2003-09-15 2005-03-31 Neopath Networks, Inc. Enabling proxy services using referral mechanisms
JP4311636B2 (en) * 2003-10-23 2009-08-12 株式会社日立製作所 A computer system that shares a storage device among multiple computers
JP2005148868A (en) * 2003-11-12 2005-06-09 Hitachi Ltd Data prefetch in storage device
US7366837B2 (en) * 2003-11-24 2008-04-29 Network Appliance, Inc. Data placement technique for striping data containers across volumes of a storage system cluster
US7647451B1 (en) 2003-11-24 2010-01-12 Netapp, Inc. Data placement technique for striping data containers across volumes of a storage system cluster
US7698289B2 (en) 2003-12-02 2010-04-13 Netapp, Inc. Storage system architecture for striping data container content across volumes of a cluster
US7302520B2 (en) * 2003-12-02 2007-11-27 Spinnaker Networks, Llc Method and apparatus for data storage using striping
US7409497B1 (en) 2003-12-02 2008-08-05 Network Appliance, Inc. System and method for efficiently guaranteeing data consistency to clients of a storage system cluster
US20050125456A1 (en) * 2003-12-09 2005-06-09 Junichi Hara File migration method based on access history
US8195627B2 (en) * 2004-04-23 2012-06-05 Neopath Networks, Inc. Storage policy monitoring for a storage network
US8190741B2 (en) * 2004-04-23 2012-05-29 Neopath Networks, Inc. Customizing a namespace in a decentralized storage environment
US7720796B2 (en) * 2004-04-23 2010-05-18 Neopath Networks, Inc. Directory and file mirroring for migration, snapshot, and replication
US7523286B2 (en) * 2004-11-19 2009-04-21 Network Appliance, Inc. System and method for real-time balancing of user workload across multiple storage systems with shared back end storage
US7962689B1 (en) 2005-04-29 2011-06-14 Netapp, Inc. System and method for performing transactional processing in a striped volume set
US7904649B2 (en) * 2005-04-29 2011-03-08 Netapp, Inc. System and method for restriping data across a plurality of volumes
US7698501B1 (en) 2005-04-29 2010-04-13 Netapp, Inc. System and method for utilizing sparse data containers in a striped volume set
US7617370B2 (en) * 2005-04-29 2009-11-10 Netapp, Inc. Data allocation within a storage system architecture
US7698334B2 (en) 2005-04-29 2010-04-13 Netapp, Inc. System and method for multi-tiered meta-data caching and distribution in a clustered computer environment
US7443872B1 (en) 2005-04-29 2008-10-28 Network Appliance, Inc. System and method for multiplexing channels over multiple connections in a storage system cluster
US7657537B1 (en) 2005-04-29 2010-02-02 Netapp, Inc. System and method for specifying batch execution ordering of requests in a storage system cluster
US8627071B1 (en) 2005-04-29 2014-01-07 Netapp, Inc. Insuring integrity of remote procedure calls used in a client and server storage system
US7743210B1 (en) 2005-04-29 2010-06-22 Netapp, Inc. System and method for implementing atomic cross-stripe write operations in a striped volume set
US7484039B2 (en) * 2005-05-23 2009-01-27 Xiaogang Qiu Method and apparatus for implementing a grid storage system
WO2007002855A2 (en) * 2005-06-29 2007-01-04 Neopath Networks, Inc. Parallel filesystem traversal for transparent mirroring of directories and files
US8001580B1 (en) 2005-07-25 2011-08-16 Netapp, Inc. System and method for revoking soft locks in a distributed storage system environment
EP1934838A4 (en) * 2005-09-30 2010-07-07 Neopath Networks Inc Accumulating access frequency and file attributes for supporting policy based storage management
US8131689B2 (en) * 2005-09-30 2012-03-06 Panagiotis Tsirigotis Accumulating access frequency and file attributes for supporting policy based storage management
US8484365B1 (en) 2005-10-20 2013-07-09 Netapp, Inc. System and method for providing a unified iSCSI target with a plurality of loosely coupled iSCSI front ends
EP1949214B1 (en) 2005-10-28 2012-12-19 Network Appliance, Inc. System and method for optimizing multi-pathing support in a distributed storage system environment
US8032896B1 (en) 2005-11-01 2011-10-04 Netapp, Inc. System and method for histogram based chatter suppression
US7730258B1 (en) 2005-11-01 2010-06-01 Netapp, Inc. System and method for managing hard and soft lock state information in a distributed storage system environment
US7587558B1 (en) 2005-11-01 2009-09-08 Netapp, Inc. System and method for managing hard lock state information in a distributed storage system environment
US8255425B1 (en) 2005-11-01 2012-08-28 Netapp, Inc. System and method for event notification using an event routing table
US7526558B1 (en) 2005-11-14 2009-04-28 Network Appliance, Inc. System and method for supporting a plurality of levels of acceleration in a single protocol session
US7797570B2 (en) * 2005-11-29 2010-09-14 Netapp, Inc. System and method for failover of iSCSI target portal groups in a cluster environment
JP2007286897A (en) * 2006-04-17 2007-11-01 Hitachi Ltd Storage system, data management device, and management method therefor
US8788685B1 (en) * 2006-04-27 2014-07-22 Netapp, Inc. System and method for testing multi-protocol storage systems
US8082362B1 (en) 2006-04-27 2011-12-20 Netapp, Inc. System and method for selection of data paths in a clustered storage system
US7840969B2 (en) * 2006-04-28 2010-11-23 Netapp, Inc. System and method for management of jobs in a cluster environment
US8489811B1 (en) 2006-12-29 2013-07-16 Netapp, Inc. System and method for addressing data containers using data set identifiers
US8301673B2 (en) * 2006-12-29 2012-10-30 Netapp, Inc. System and method for performing distributed consistency verification of a clustered file system
US8312046B1 (en) 2007-02-28 2012-11-13 Netapp, Inc. System and method for enabling a data container to appear in a plurality of locations in a super-namespace
US8312214B1 (en) 2007-03-28 2012-11-13 Netapp, Inc. System and method for pausing disk drives in an aggregate
US7827350B1 (en) 2007-04-27 2010-11-02 Netapp, Inc. Method and system for promoting a snapshot in a distributed file system
US7797489B1 (en) 2007-06-01 2010-09-14 Netapp, Inc. System and method for providing space availability notification in a distributed striped volume set
US7984259B1 (en) 2007-12-17 2011-07-19 Netapp, Inc. Reducing load imbalance in a storage system
US7996607B1 (en) 2008-01-28 2011-08-09 Netapp, Inc. Distributing lookup operations in a striped storage system
US8578018B2 (en) * 2008-06-29 2013-11-05 Microsoft Corporation User-based wide area network optimization
SE533007C2 (en) 2008-10-24 2010-06-08 ILT Productions AB Distributed data storage
US7992055B1 (en) 2008-11-07 2011-08-02 Netapp, Inc. System and method for providing autosupport for a security system
US9325790B1 (en) 2009-02-17 2016-04-26 Netapp, Inc. Servicing of network software components of nodes of a cluster storage system
US8117388B2 (en) * 2009-04-30 2012-02-14 Netapp, Inc. Data distribution through capacity leveling in a striped file system
US9372728B2 (en) 2009-12-03 2016-06-21 Ol Security Limited Liability Company System and method for agent networks
EP2712149B1 (en) 2010-04-23 2019-10-30 Compuverde AB Distributed data storage
US9424351B2 (en) 2010-11-22 2016-08-23 Microsoft Technology Licensing, Llc Hybrid-distribution model for search engine indexes
US9195745B2 (en) * 2010-11-22 2015-11-24 Microsoft Technology Licensing, Llc Dynamic query master agent for query execution
US9529908B2 (en) 2010-11-22 2016-12-27 Microsoft Technology Licensing, Llc Tiering of posting lists in search engine index
US9342582B2 (en) 2010-11-22 2016-05-17 Microsoft Technology Licensing, Llc Selection of atoms for search engine retrieval
US8713024B2 (en) 2010-11-22 2014-04-29 Microsoft Corporation Efficient forward ranking in a search engine
CN102693274B (en) * 2011-03-25 2017-08-15 Microsoft Technology Licensing, LLC Dynamic query master agent for query execution
US9495477B1 (en) * 2011-04-20 2016-11-15 Google Inc. Data storage in a graph processing system
US8645978B2 (en) 2011-09-02 2014-02-04 Compuverde Ab Method for data maintenance
US8997124B2 (en) 2011-09-02 2015-03-31 Compuverde Ab Method for updating data in a distributed data storage system
US8769138B2 (en) 2011-09-02 2014-07-01 Compuverde Ab Method for data retrieval from a distributed data storage system
US9626378B2 (en) 2011-09-02 2017-04-18 Compuverde Ab Method for handling requests in a storage system and a storage node for a storage system
US9021053B2 (en) 2011-09-02 2015-04-28 Compuverde Ab Method and device for writing data to a data storage system comprising a plurality of data storage nodes
US8650365B2 (en) 2011-09-02 2014-02-11 Compuverde Ab Method and device for maintaining data in a data storage system comprising a plurality of data storage nodes
CN102331957B (en) * 2011-09-28 2013-08-28 Huawei Technologies Co., Ltd. File backup method and device
US9813491B2 (en) * 2011-10-20 2017-11-07 Oracle International Corporation Highly available network filer with automatic load balancing and performance adjustment
US20130262811A1 (en) * 2012-03-27 2013-10-03 Hitachi, Ltd. Method and apparatus of memory management by storage system
US9172744B2 (en) 2012-06-14 2015-10-27 Microsoft Technology Licensing, Llc Scalable storage with programmable networks
CN104052677B (en) * 2013-03-14 2018-04-10 Alibaba Group Holding Limited Soft load-balancing method and device for data mapping
US20150160864A1 (en) * 2013-12-09 2015-06-11 Netapp, Inc. Systems and methods for high availability in multi-node storage networks
US20150215389A1 (en) * 2014-01-30 2015-07-30 Salesforce.Com, Inc. Distributed server architecture
CN111031033B (en) 2014-06-13 2022-08-16 Pismo Labs Technology Limited Method and system for managing nodes
US10452482B2 (en) * 2016-12-14 2019-10-22 Oracle International Corporation Systems and methods for continuously available network file system (NFS) state data
CN108769151B (en) * 2018-05-15 2019-11-12 New H3C Technologies Co., Ltd. Service processing method and device
US11436524B2 (en) * 2018-09-28 2022-09-06 Amazon Technologies, Inc. Hosting machine learning models
US11562288B2 (en) 2018-09-28 2023-01-24 Amazon Technologies, Inc. Pre-warming scheme to load machine learning models
US11706303B2 (en) * 2021-04-22 2023-07-18 Cisco Technology, Inc. Survivability method for LISP based connectivity

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6105029A (en) * 1997-09-17 2000-08-15 International Business Machines Corporation Retrieving network files through parallel channels
US6515967B1 (en) * 1998-06-30 2003-02-04 Cisco Technology, Inc. Method and apparatus for detecting a fault in a multicast routing infrastructure
US6249801B1 (en) * 1998-07-15 2001-06-19 Radware Ltd. Load balancing
US7506034B2 (en) * 2000-03-03 2009-03-17 Intel Corporation Methods and apparatus for off loading content servers through direct file transfer from a storage center to an end-user
US8281022B1 (en) * 2000-06-30 2012-10-02 EMC Corporation Method and apparatus for implementing high-performance, scaleable data processing and storage systems
US6970939B2 (en) * 2000-10-26 2005-11-29 Intel Corporation Method and apparatus for large payload distribution in a network
US6606690B2 (en) * 2001-02-20 2003-08-12 Hewlett-Packard Development Company, L.P. System and method for accessing a storage area network as network attached storage
US7475142B2 (en) * 2002-12-06 2009-01-06 Cisco Technology, Inc. CIFS for scalable NAS architecture

Also Published As

Publication number Publication date
CN1723434A (en) 2006-01-18
US20040139167A1 (en) 2004-07-15
AU2003291122A1 (en) 2004-06-30
WO2004053677A2 (en) 2004-06-24
EP1570337A2 (en) 2005-09-07
WO2004053677A3 (en) 2005-02-10

Similar Documents

Publication Publication Date Title
US20040139167A1 (en) Apparatus and method for a scalable network attach storage system
US9900397B1 (en) System and method for scale-out node-local data caching using network-attached non-volatile memories
US10963289B2 (en) Storage virtual machine relocation
US9923958B1 (en) Highly available network filer with automatic load balancing and performance adjustment
US9355036B2 (en) System and method for operating a system to cache a networked file system utilizing tiered storage and customizable eviction policies based on priority and tiers
JP5047165B2 (en) Virtualization network storage system, network storage apparatus and virtualization method thereof
US9537710B2 (en) Non-disruptive failover of RDMA connection
EP1859603B1 (en) Integrated storage virtualization and switch system
US7562110B2 (en) File switch and switched file system
US9143566B2 (en) Non-disruptive storage caching using spliced cache appliances with packet inspection intelligence
US9049204B2 (en) Collaborative management of shared resources
US9130968B2 (en) Clustered cache appliance system and methodology
US9906596B2 (en) Resource node interface protocol
JP2005267327A (en) Storage system
WO2002008899A2 (en) Method and apparatus for implementing high-performance, scaleable data processing and storage systems
JP5137409B2 (en) File storage method and computer system
US8756338B1 (en) Storage server with embedded communication agent
US20230315695A1 (en) Byte-addressable journal hosted using block storage device
US20050193021A1 (en) Method and apparatus for unified storage of data for storage area network systems and network attached storage systems
CN111225003B (en) NFS node configuration method and device
US7685223B1 (en) Network-wide service discovery
Eisler et al. Data ONTAP GX: A Scalable Storage Cluster.
KR101023622B1 (en) Adaptive high-performance proxy cache server and caching method
JP2023541069A (en) Active-active storage systems and their data processing methods
KR20140045738A (en) Cloud storage system

Legal Events

Date Code Title Description
EEER Examination request
FZDE Discontinued