WO2017130022A1 - Method for adding storage devices to a data storage system with diagonally replicated data storage blocks - Google Patents
- Publication number: WO2017130022A1 (PCT/IB2016/050386)
- Authority: WO, WIPO (PCT)
- Prior art keywords: storage devices, new, data storage, blocks, existing
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/0604—Improving or facilitating administration, e.g. storage management
- G06F3/0607—Improving or facilitating administration, e.g. storage management by facilitating the process of upgrading existing storage systems, e.g. for improving compatibility between host and storage device
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0629—Configuration or reconfiguration of storage systems
- G06F3/0632—Configuration or reconfiguration of storage systems by initialisation or re-initialisation of storage systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0646—Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
- G06F3/065—Replication mechanisms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/0671—In-line storage system
- G06F3/0683—Plurality of storage devices
- G06F3/0689—Disk arrays, e.g. RAID, JBOD
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/20—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
- G06F11/2053—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant
- G06F11/2094—Redundant storage or storage space
Abstract
New storage devices are efficiently added to a data storage system in which each data storing block is replicated a replication factor times in different storage devices. If the number of new storage devices is greater than the replication factor, then the logical storage space of the data storage system after adding the new storage devices is configured so that new groups of blocks storing sets of replicas, respectively, are allocated diagonally and circularly across the new storage devices.
Description
[0001] Embodiments of the invention relate to the field of data storage systems and, more specifically, to configuring logical storage space when new storage devices are added to an existing data storage system.
DISCUSSION OF THE BACKGROUND
[0002] The manner of organizing storage blocks in a data storage system aims to ensure efficient access and data replication so as to provide adequate fault tolerance in the event of hardware failures. In order to avoid loss of data when a storage device fails, the storage blocks are replicated such that each replica resides in a different storage device. The number of storage devices in a conventional data storage system depends multiplicatively on the replication factor, and this has been an undesirable constraint.
[0003] This constraint has recently been overcome through the development of methods that allocate data storage blocks in a data storage system in a circular and diagonal manner. These methods allocate the blocks one-by-one on different storage devices in a circular list. Replicas of a set (i.e., blocks storing the same data) are allocated to successive devices in the list. When a block is allocated in the "last" device in the list, the next block is allocated to the "first" device in the list. This circular and diagonal manner of organizing/allocating storage blocks renders moot the constraint that the number of added memories depends on the replication factor.
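The circular-list allocation described in this paragraph can be illustrated with a minimal Python sketch. It is only one reading of the background description, not the method of any cited patent; the function name `allocate_on_ring` and the device labels are hypothetical.

```python
from itertools import cycle, islice


def allocate_on_ring(devices, num_replica_sets, replication_factor):
    """Place each replica set on successive devices of a circular list.

    Assumption (not from the source): every device has enough free
    blocks, so only the device order matters.
    """
    if replication_factor > len(devices):
        raise ValueError("need at least as many devices as replicas per set")
    ring = cycle(devices)  # after the "last" device comes the "first" one
    placements = []
    for _ in range(num_replica_sets):
        # Take the next R successive devices; they are distinct because
        # R does not exceed the length of the ring.
        placements.append(list(islice(ring, replication_factor)))
    return placements


if __name__ == "__main__":
    for replica_set in allocate_on_ring(["d0", "d1", "d2", "d3", "d4"], 3, 3):
        print(replica_set)
```

In this sketch five devices hold sets of three replicas, so the device count is not a multiple of the replication factor, yet no set places two replicas on the same device.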
[0004] Various aspects related to data storage systems are continuously improved as described in U.S. Patent Nos. 8832363, 5937425, 7249150, 7680837, 7996636, 8082390, 8099396, 8205065, 8341457, 8417987, 8495417, 8839008, 8560879, and 8595595, and U.S. Patent Application Publication Nos. 2003/0120869, 2003/0191916, 2005/0144514, 2007/0143359, 2010/0042790, 2010/0088296, 2011/0035548, 2011/0213928, and 2012/0290788.
[0005] Automated management of the data storage system storing replicas of data in physically distinct devices is substantially disrupted when new storage devices (i.e., memories or disks) are added. For example, such an expansion of a data storage system that is circularly and diagonally organized/allocated triggers delays due to data copying between pre-existing and new storage devices.
[0006] It is desirable to develop methods for adding new storage devices to a data storage system that minimize the disruption and avoid the need to copy data from one storage device to another.
SUMMARY
[0007] In various embodiments, if the number of added storage devices is greater than the replication factor, then the logical storage space of the data storage system is configured such that groups of blocks storing replicas of the same data are allocated diagonally and circularly across the added storage devices. This manner of organizing/allocating storage space for replica sets reduces the time required to add storage devices (i.e., the added devices are instantly available without delays caused by data copying from one storage device to another). Moreover, the number of added devices does not have to be a multiple of the replication factor.
[0008] According to an embodiment, there is a computer-implemented method for adding new devices to a data storage system in which each data storing block is replicated a replication factor times in different storage devices, thereby forming sets of replicas. The method includes receiving an indication that a number of new storage devices are to be added to the data storage system. If the number of new storage devices is greater than the replication factor, then a logical storage space of the data storage system after adding the new storage devices is configured such that new groups of blocks storing sets of replicas, respectively, are allocated diagonally and circularly across the new storage devices.
[0009] According to another embodiment, there is a data storage system in which each data storing block is replicated a replication factor times in different storage devices, thereby forming sets of replicas. The data storage system includes preexisting storage devices, new storage devices and a processor. The processor is configured to receive an indication that a number of new storage devices are added to the data storage system. Further, if the number of new storage devices is greater than the replication factor, the processor configures a logical storage space of the data storage system after adding the new storage devices so that new groups of blocks storing sets of replicas, respectively, are allocated diagonally and circularly across the new storage devices.
[0010] According to yet another embodiment, there is a non-transitory computer-readable recording medium storing executable codes that, when executed by a computer, make the computer perform a method for adding new storage devices to a data storage system in which each data storing block is replicated a replication factor times in different storage devices, thereby forming sets of replicas. The method includes receiving an indication that a number of new storage devices are added to the data storage system. If the number of new storage devices is greater than the replication factor, then a logical storage space of the data storage system after adding the new storage devices is configured so that new groups of blocks storing sets of replicas, respectively, are allocated diagonally and circularly across the new storage devices.
BRIEF DESCRIPTION OF THE DRAWINGS
[0011] The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate one or more embodiments and, together with the description, explain these embodiments. In the drawings:
[0012] Figure 1 is a block diagram illustrating a data storage system according to one embodiment;
[0013] Figure 2 illustrates a memory configuration according to an embodiment;
[0014] Figure 3 illustrates diagonal and circular allocation of partitions in two sets of disks;
[0015] Figure 4 is a block diagram of a method according to an embodiment; and
[0016] Figure 5 is a flowchart of a method according to an embodiment.
DETAILED DESCRIPTION
[0017] The following description of the exemplary embodiments refers to the accompanying drawings. The same reference numbers in different drawings identify the same or similar elements. The following detailed description does not limit the invention; instead, the scope of the invention is defined by the appended claims. The following embodiments are discussed using terminology of block storage systems (i.e., non-object and non-metadata based systems).
[0018] Reference throughout the specification to "one embodiment" or "an embodiment" means that a particular feature, structure or characteristic described in connection with an embodiment is included in at least one embodiment of the subject matter disclosed. Thus, the appearance of the phrases "in one embodiment" or "in an embodiment" in various places throughout the specification is not necessarily referring to the same embodiment. Further, the particular features, structures or characteristics may be combined in any suitable manner in one or more embodiments.
[0019] Figure 1 is a block diagram illustrating data storage according to one embodiment. One or more clients, such as 101 and 102, are communicatively coupled to storage system 104 over network 103. Here, the term "coupled" is used to indicate that the elements, which may or may not be in direct physical or electrical contact with each other, cooperate or interact with each other. Further, the term "connected" is used in this document to indicate that elements coupled with each other communicate.
[0020] Each of clients 101 and 102 may be a server or a personal computer (e.g., workstation, laptop, netbook, tablet, palm top, mobile phone, smartphone, phablet, multimedia phone, Voice Over Internet Protocol (VOIP) phone, terminal, portable media player, GPS unit, wearable device, gaming system, set-top box, Internet-enabled household appliance, etc.). Network 103 may be any type of network, such as a local area network (LAN), a wide area network (WAN) such as the Internet, a corporate intranet, a metropolitan area network (MAN), a storage area network (SAN), a bus, or a combination thereof, wired and/or wireless.
[0021] Data storage system 104 may include any type of server or cluster of servers, and may have a distributed architecture, or all of its components may be integrated into a single unit. The data storage system may be, for example, a file server (e.g., an appliance used to provide network attached storage (NAS) capability), a block-based storage server (e.g., used to provide SAN capability), a unified storage device (e.g., one which combines NAS and SAN capabilities), a nearline storage device, a direct attached storage (DAS) device, a tape backup device, or essentially any other type of data storage device.
[0022] Storage system 104 includes pre-existing storage devices 110 and newly added storage devices 120. Each of the storage devices may be, for example, a conventional magnetic disk, a magnetic tape storage, a magneto-optical (MO) storage medium, a solid-state disk, a flash memory-based device, or any other type of non-volatile storage device suitable for data. The storage devices may be disk storage media organized into one or more volumes of redundant array of inexpensive disks (RAID). In the following description, the term "disk" is used as short form for "storage device."
[0023] A storage managing processor 105 communicatively coupled to the storage system 104 may be configured to perform allocation and maintenance related to storage system 104. The processor may execute data storage-related management methods by executing software stored on a computer-readable medium, which may also be part of the storage system (e.g., computer-readable medium 130 in Figure 1).
[0024] According to one embodiment, a distributed file system (DFS) may be built on top of storage blocks (e.g., partitions in disks) grouped to store the same data. The location of the blocks in a group may be determined by the administrator of cloud storage, which thus performs a storage manager function.
[0025] The following description refers to a single volume, but multiple volumes and expanding a volume may also be implemented using the same concepts. Figure 2 illustrates memory (i.e., volume) configuration according to an embodiment. The memory includes multiple disk sets. Each disk set is represented as a matrix, between square brackets. Consider M to be the number of disks in a set, and R the replication factor. The number of disks in each set is greater than the replication factor (M>R). There may be different numbers of disks in the different disk sets, but for the sake of simplicity, the number of disks in each of the disk sets illustrated in Figure 2 is M.
[0026] Disks A1, A2, to AM pertain to a first set, and labels Ai1, Ai2, to AiR indicate partitions in the disk Ai (where "i" takes values between 1 and M). The number of partitions in a disk may be equal to the replication factor, but there also may be more partitions.
[0027] The different sets of disks may have been added at different times. In other words, when a number of disks greater than the replication factor are added together, a new set of disks may be formed.
[0028] The data storage system may be managed such that blocks (i.e., partitions) used for storing sets of replicas are allocated diagonally and circularly. This manner of allocating blocks avoids the constraint that ties the number of disks multiplicatively to the replication factor. However, this manner of organizing disks is an option and not a limitation. In a more general view, blocks from different disks are allocated to store copies (replicas) of each block, respectively.
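As a hedged illustration of why diagonal and circular allocation removes the multiplicative constraint, the following Python sketch builds a layout for several disk counts and checks it. The indexing rule (group g uses partition p of disk (g + p) mod M) and the names `diagonal_groups` and `is_valid_layout` are assumptions chosen for illustration; the source does not give a formula.

```python
def diagonal_groups(num_disks, replication_factor):
    """Return groups of (disk_index, partition_index) slots.

    Assumed rule: group g takes partition p on disk (g + p) mod M,
    i.e. it steps diagonally and wraps around circularly.
    """
    return [
        [((g + p) % num_disks, p) for p in range(replication_factor)]
        for g in range(num_disks)
    ]


def is_valid_layout(groups, num_disks, replication_factor):
    """Check the two properties a replica layout needs."""
    # 1. Every group keeps its replicas on distinct disks.
    distinct_disks = all(
        len({disk for disk, _ in group}) == replication_factor
        for group in groups
    )
    # 2. Every (disk, partition) slot is used exactly once.
    slots = [slot for group in groups for slot in group]
    every_slot_once = (
        len(set(slots)) == len(slots) == num_disks * replication_factor
    )
    return distinct_disks and every_slot_once


if __name__ == "__main__":
    R = 4
    for M in range(R + 1, 12):  # includes many non-multiples of R
        print(M, is_valid_layout(diagonal_groups(M, R), M, R))
```

Under these assumptions the layout is valid for every disk count greater than the replication factor, whether or not it is a multiple of it.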
[0029] Figure 3 illustrates, by way of example, diagonal and circular allocation of blocks (also called "partitions") in disk set 310 (which may be pre-existing disks) and disk set 320 (which may be newly added disks). Disk set 310 includes disks A, B, C, D and E (that is, M=5 disks), and disk set 320 includes disks F, G, H, I, J and K (i.e., M=6 disks). The replication factor is 4, that is, there are 3 copies of any block, for a total of 4 replicas in a set of replicas.
[0030] In disk set 310, the disk blocks form groups for storing sets of replicas as follows: the first group includes A1, B2, C3 and D4, the second group includes B1, C2, D3 and E4, the third group includes C1, D2, E3 and A4, the fourth group includes D1, E2, A3 and B4, and the fifth group includes E1, A2, B3 and C4.
[0031] In disk set 320, the disk partitions then form the new groups for storing new sets of replicas as follows: the first new group includes F1, G2, H3 and I4, the second new group includes G1, H2, I3 and J4, the third new group includes H1, I2, J3 and K4, the fourth new group includes I1, J2, K3 and F4, the fifth new group includes J1, K2, F3 and G4, and finally, the sixth new group includes K1, F2, G3 and H4. Different shades and hashes are employed to illustrate the different groups of partitions.
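The groups enumerated in paragraphs [0030] and [0031] follow a regular pattern that can be reproduced with a short sketch. The Python below is illustrative only and is not part of the disclosed embodiments; the function name `diagonal_groups` and the label format (disk letter followed by a 1-based partition index) are assumptions chosen to match the labels of Figure 3.

```python
def diagonal_groups(disks, r):
    """Return one group per disk.

    Group g takes partition j+1 of disk (g + j) mod M, for j = 0..r-1,
    so each group steps diagonally through the disks and wraps around.
    """
    m = len(disks)
    if m < r:
        # Assumption: every replica of a group must sit on a distinct disk.
        raise ValueError("need at least r disks to place r replicas")
    return [
        [f"{disks[(g + j) % m]}{j + 1}" for j in range(r)]
        for g in range(m)
    ]


# Disk sets of Figure 3, replication factor 4.
set_310 = diagonal_groups("ABCDE", 4)
set_320 = diagonal_groups("FGHIJK", 4)

for number, group in enumerate(set_310, start=1):
    print(number, group)
```

With these inputs the third group of disk set 310 is C1, D2, E3 and A4, where the wrap-around from disk E back to disk A is the "circular" part of the allocation.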
[0032] Figure 4 is a block diagram of a method for adding new disks to a data storage system according to an embodiment. At 410, information regarding the number of new disks, K, becomes available. Then, at 420, the number of new disks is compared to the replication factor, R. If K>R (i.e., the "YES" branch emerging from the diamond block labeled 420 in Figure 4), then, at 430, new groups of blocks/partitions are created for storing sets of replicas across the new K disks. If, however, K≤R (i.e., the "NO" branch emerging from the block 420 in Figure 4), then, at 440, the new disks are merged (i.e., organized and allocated) with pre-existing disks. Block 450 in Figure 4 indicates that the upgraded storage system (including the new disks) is ready for use.
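The decision at block 420 can be sketched as a single comparison. The Python below is a minimal illustration under stated assumptions, not the claimed method: the function name `disks_for_new_groups` is hypothetical, and placing the pre-existing disks before the new ones in the merged case is an arbitrary ordering chosen for the example. The "NO" branch is taken whenever K is not greater than R, as stated in paragraph [0036].

```python
def disks_for_new_groups(existing, new, r):
    """Pick the disks across which the new groups will be allocated."""
    k = len(new)
    if k > r:
        # "YES" branch of block 420, block 430: the new disks form their own set.
        return list(new)
    # "NO" branch of block 420, block 440: merge new disks with pre-existing ones.
    # The ordering (existing first) is an assumption made for this sketch.
    return list(existing) + list(new)


own_set = disks_for_new_groups("ABCDE", "FGHIJK", 4)  # K=6 > R=4
merged = disks_for_new_groups("ABCDE", "FG", 4)       # K=2, not greater than R=4
print(own_set)
print(merged)
```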
[0033] Although the storage entities (i.e., blocks or partitions) are allocated in different storage devices, the grouped blocks or partitions (for storing sets of replicas) may be logically contiguous in the memory space (i.e., the starting address of a block/partition in a storage device is immediately after the ending address of another block/partition in another storage device).
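One way to picture the logical contiguity described above is to assign each block of a group an address range that begins immediately after the previous block's range ends. The sketch below assumes a fixed block size and a base address of zero; neither value, nor the function name `logical_extents`, comes from the description.

```python
def logical_extents(group, block_size, base=0):
    """Map each block label of a group to an inclusive (start, end) address range."""
    extents = []
    start = base
    for label in group:
        end = start + block_size - 1
        extents.append((label, start, end))
        # The next block starts immediately after this one ends,
        # even though it resides on a different storage device.
        start = end + 1
    return extents


extents = logical_extents(["A1", "B2", "C3", "D4"], block_size=1024)
for label, start, end in extents:
    print(label, start, end)
```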
[0034] According to an exemplary embodiment, a data storage system in which each data storing block is replicated a replication factor, R, times in different storage devices includes pre-existing storage devices (such as 110 in Figure 1 and 310 in Figure 3), new storage devices (such as 120 in Figure 1 and 320 in Figure 3), and a processor (such as 105 in Figure 1). The processor may be configured to receive an indication that a number, K, of new storage devices are added to the data storage system. The processor may configure a logical storage space of the data storage system after adding new storage devices so that groups of blocks storing sets of replicas, respectively, are allocated diagonally and circularly across the new storage devices.
[0035] The pre-existing logical storage space of the data storage system before adding the new data storage devices may be included without being modified in the logical storage space of the data storage system after adding the new data storage devices. The pre-existing logical storage space of the data storage system before adding the new data storage devices may have also been configured such that pre-existing groups of blocks storing sets of replicas, respectively, were diagonally and circularly allocated across the pre-existing storage devices.
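The statement that the pre-existing logical storage space is included without modification can be read as an append-only extension: existing groups keep their positions and new groups are added after them. The sketch below illustrates that reading only; representing the logical storage space as an ordered list of groups, and the name `extend_logical_space`, are assumptions of this example.

```python
def extend_logical_space(existing_groups, new_groups):
    """Return the extended space with the existing groups left untouched as a prefix."""
    return [list(g) for g in existing_groups] + [list(g) for g in new_groups]


# First two groups of each disk set from Figure 3, written out literally.
existing = [["A1", "B2", "C3", "D4"], ["B1", "C2", "D3", "E4"]]
added = [["F1", "G2", "H3", "I4"], ["G1", "H2", "I3", "J4"]]

extended = extend_logical_space(existing, added)
print(extended[: len(existing)] == existing)
```

Because the existing groups form an unchanged prefix, no data stored on the pre-existing storage devices needs to be moved when the new set is added.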
[0036] If the number of new storage devices is not greater than the replication factor, the processor may configure the logical storage space of the data storage system such that the new groups of blocks storing sets of replicas are allocated diagonally and circularly across both the pre-existing storage devices and the new storage devices. In this case, the processor may also link blocks of the pre-existing storage devices and blocks of the new storage devices to avoid copying data already stored in the blocks of the pre-existing storage devices.
[0037] According to one embodiment, a non-transitory computer readable recording medium (such as memory 130 in Figure 1) stores executable codes which, when executed by a computer, make the computer perform methods for adding new storage devices to a data storage system in which each data storing block is replicated a replication factor times in different storage devices, as previously described.
The disclosed exemplary embodiments provide methods and devices for adding new storage devices to a data storage system in which each data storing block is replicated a replication factor times in different storage devices. If the number of new storage devices is greater than the replication factor, then new groups of blocks storing sets of replicas, respectively, are allocated diagonally and circularly across the new storage devices. It should be understood that this description is not intended to limit the invention. On the contrary, the exemplary embodiments are intended to cover alternatives, modifications and equivalents, which are included in the spirit and scope of the invention as defined by the appended claims. Further, in the detailed description of the exemplary embodiments, numerous specific details are set forth in order to provide a comprehensive understanding of the claimed invention. However, one skilled in the art would understand that various embodiments may be practiced without such specific details.
[0038] Although the features and elements of the present exemplary embodiments are described in the embodiments in particular combinations, each feature or element can be used alone without the other features and elements of the embodiments or in various combinations with or without other features and elements disclosed herein.
[0039] This written description uses examples of the subject matter disclosed to enable any person skilled in the art to practice the same, including making and using any devices or systems and performing any incorporated methods. The patentable scope of the subject matter is defined by the claims, and may include other examples that occur to those skilled in the art. Such other examples are intended to be within the scope of the claims.
Claims
1. A computer-implemented method (400, 500) for adding new storage devices (120, 320) to a data storage system (100, 300) in which each data storing block is replicated a replication factor (R) times in different storage devices, thereby forming sets of replicas, the method comprising:
receiving (410, S510) an indication that a number of new storage devices are to be added to the data storage system; and
if the number of new storage devices is greater than the replication factor (420), then configuring (430, S520) a logical storage space of the data storage system after adding the new storage devices such that new groups of blocks storing sets of replicas, respectively, are allocated diagonally and circularly across the new storage devices.
2. The computer-implemented method of claim 1, wherein a pre-existing logical storage space of the data storage system before adding the new data storage devices is included, without being modified, in the logical storage space.
3. The computer-implemented method of claim 2, wherein, according to the pre-existing logical storage space, pre-existing groups of blocks storing sets of replicas, respectively, have been allocated diagonally and circularly across the pre-existing storage devices.
4. The computer-implemented method of claim 1, further comprising:
if the number of new storage devices is not greater than the replication factor, configuring (440) the logical storage space of the data storage system such that the new groups of blocks storing sets of replicas are allocated diagonally and circularly across both pre-existing storage devices and the new storage devices.
5. The computer-implemented method of claim 4, further comprising:
when the number of new storage devices is not greater than the replication factor, creating links between blocks of the pre-existing storage devices and blocks of the new storage devices according to the logical storage space of the data storage system after adding the new storage devices, to avoid copying data already stored in the blocks of the pre-existing storage devices.
6. The computer-implemented method of claim 1, wherein blocks of a same group are allocated to be logically contiguous in a memory space.
7. A data storage system (104, 300) in which each data storing block is replicated a replication factor (R) times in different storage devices, thereby forming sets of replicas, the data storage system comprising:
pre-existing storage devices (110, 310) and new storage devices (120, 320);
a processor (105) configured
to receive an indication that a number of the new storage devices are added to the data storage system; and
if the number of new storage devices is greater than the replication factor, to configure a logical storage space of the data storage system after adding the new storage devices such that new groups of blocks storing sets of replicas, respectively, are allocated diagonally and circularly across the new storage devices.
8. The data storage system of claim 7, wherein the at least one processor is configured to include without modifying a pre-existing logical storage space of the data storage system before adding the new data storage devices, in the logical storage space of the data storage system after adding the new data storage devices.
9. The data storage system of claim 8, wherein the pre-existing logical storage space has been configured to allocate diagonally and circularly groups of blocks to store sets of replicas, respectively, across the pre-existing storage devices.
10. The data storage system of claim 7, wherein if the number of new storage devices is not greater than the replication factor, the at least one processor configures the logical storage space of the data storage system such that the new groups of blocks storing sets of replicas are allocated diagonally and circularly across both pre-existing storage devices and the new storage devices.
11. The data storage system of claim 10, wherein when the number of new storage devices is not greater than the replication factor, the at least one processor links blocks of the pre-existing storage devices and blocks of the new storage devices, to avoid copying data already stored in the blocks of the pre-existing storage devices.
12. The data storage system of claim 7, wherein the at least one processor allocates blocks of a same new group to be logically contiguous in a memory space.
13. A non-transitory computer readable recording medium (130) storing executable codes which, when executed by a computer, make the computer perform a method (500) for adding new storage devices (120, 320) to a data storage system (104, 300) in which each data storing block is replicated a replication factor (R) times in different storage devices, thereby forming sets of replicas, the method comprising:
receiving (410, S510) an indication that a number of new storage devices are added to the data storage system; and
if the number of new storage devices is greater than the replication factor (420), then configuring (430, S520) a logical storage space of the data storage system after adding the new storage devices such that new groups of blocks storing sets of replicas, respectively, are allocated diagonally and circularly across the new storage devices.
14. The non-transitory computer readable recording medium of claim 13, wherein a pre-existing logical storage space of the data storage system before adding the new data storage devices is included without being modified, in the logical storage space.
15. The non-transitory computer readable recording medium of claim 14, wherein, according to the pre-existing logical storage space, pre-existing groups of blocks storing sets of replicas, respectively, have been allocated diagonally and circularly across the pre-existing storage devices.
16. The non-transitory computer readable recording medium of claim 13, wherein the method further comprises:
if the number of new storage devices is not greater than the replication factor, configuring (440) the logical storage space of the data storage system such that the new groups of blocks storing sets of replicas are allocated diagonally and circularly across both pre-existing storage devices and the new storage devices.
17. The non-transitory computer readable recording medium of claim 16, wherein the method further comprises:
when the number of new storage devices is not greater than the replication factor, creating links between blocks of the pre-existing storage devices and blocks of the new storage devices according to the logical storage space of the data storage system after adding the new storage devices, to avoid copying data already stored in the blocks of the pre-existing storage devices.
18. The non-transitory computer readable recording medium of claim 13, wherein blocks of a same group are allocated to be logically contiguous in a memory space.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/IB2016/050386 WO2017130022A1 (en) | 2016-01-26 | 2016-01-26 | Method for adding storage devices to a data storage system with diagonally replicated data storage blocks |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2017130022A1 true WO2017130022A1 (en) | 2017-08-03 |
Family
ID=55451513
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2017130022A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170220284A1 (en) * | 2016-01-29 | 2017-08-03 | Netapp, Inc. | Block-level internal fragmentation reduction using a heuristic-based approach to allocate fine-grained blocks |
Citations (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5937425A (en) | 1997-10-16 | 1999-08-10 | M-Systems Flash Disk Pioneers Ltd. | Flash file system optimized for page-mode flash technologies |
US20030120809A1 (en) | 2001-12-20 | 2003-06-26 | Bellur Barghav R. | Interference mitigation and adaptive routing in wireless ad-hoc packet-switched networks |
US20030191916A1 (en) | 2002-04-04 | 2003-10-09 | International Business Machines Corporation | Apparatus and method of cascading backup logical volume mirrors |
US20050144514A1 (en) | 2001-01-29 | 2005-06-30 | Ulrich Thomas R. | Dynamic redistribution of parity groups |
US20070143359A1 (en) | 2005-12-19 | 2007-06-21 | Yahoo! Inc. | System and method for recovery from failure of a storage server in a distributed column chunk data store |
US7249150B1 (en) | 2001-07-03 | 2007-07-24 | Network Appliance, Inc. | System and method for parallelized replay of an NVRAM log in a storage appliance |
US20100042790A1 (en) | 2008-08-12 | 2010-02-18 | Netapp, Inc. | Scalable deduplication of stored data |
US7680837B2 (en) | 2005-11-08 | 2010-03-16 | Nec Corporation | File management method for log-structured file system for sequentially adding and storing log of file access |
US20100088296A1 (en) | 2008-10-03 | 2010-04-08 | Netapp, Inc. | System and method for organizing data to facilitate data deduplication |
US20110035548A1 (en) | 2008-02-12 | 2011-02-10 | Kimmel Jeffrey S | Hybrid media storage system architecture |
US7996636B1 (en) | 2007-11-06 | 2011-08-09 | Netapp, Inc. | Uniquely identifying block context signatures in a storage volume hierarchy |
US20110213928A1 (en) | 2010-02-27 | 2011-09-01 | Cleversafe, Inc. | Distributedly storing raid data in a raid memory and a dispersed storage network memory |
US8082390B1 (en) | 2007-06-20 | 2011-12-20 | Emc Corporation | Techniques for representing and storing RAID group consistency information |
US8099396B1 (en) | 2006-12-15 | 2012-01-17 | Netapp, Inc. | System and method for enhancing log performance |
US8205065B2 (en) | 2009-03-30 | 2012-06-19 | Exar Corporation | System and method for data deduplication |
US20120290788A1 (en) | 2006-05-24 | 2012-11-15 | Compellent Technologies | System and method for raid management, reallocation, and restripping |
US8341457B2 (en) | 2010-03-11 | 2012-12-25 | Lsi Corporation | System and method for optimizing redundancy restoration in distributed data layout environments |
US8417987B1 (en) | 2009-12-01 | 2013-04-09 | Netapp, Inc. | Mechanism for correcting errors beyond the fault tolerant level of a raid array in a storage system |
US8495417B2 (en) | 2009-01-09 | 2013-07-23 | Netapp, Inc. | System and method for redundancy-protected aggregates |
US8560879B1 (en) | 2009-04-22 | 2013-10-15 | Netapp Inc. | Data recovery for failed memory device of memory device array |
US8595595B1 (en) | 2010-12-27 | 2013-11-26 | Netapp, Inc. | Identifying lost write errors in a raid array |
US8832363B1 (en) | 2014-01-17 | 2014-09-09 | Netapp, Inc. | Clustered RAID data organization |
US8839008B2 (en) | 2011-09-23 | 2014-09-16 | Broadcom Corporation | System and method for detecting configuration of a power sourcing equipment device connected to a powered device by simultaneously measuring voltage at two terminals of a resistor disposed within the powered device |
Non-Patent Citations (2)
Title |
---|
ANONYMOUS: "md(4): Multiple Device driver aka Software RAID - Linux man page", 27 December 2015 (2015-12-27), XP055301357, Retrieved from the Internet <URL:http://web.archive.org/web/20151227021146/http://linux.die.net/man/4/md> [retrieved on 20160909] * |
ANONYMOUS: "mdadm(8): manage MD devices aka Software RAID - Linux man page", 19 December 2015 (2015-12-19), XP055301360, Retrieved from the Internet <URL:http://web.archive.org/web/20151219130132/http://linux.die.net/man/8/mdadm> [retrieved on 20160909] * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 16707561 Country of ref document: EP Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 16707561 Country of ref document: EP Kind code of ref document: A1 |