WO2017130022A1 - Method for adding storage devices to a data storage system with diagonally replicated data storage blocks - Google Patents

Info

Publication number
WO2017130022A1
Authority
WO
WIPO (PCT)
Prior art keywords
storage devices
new
data storage
blocks
existing
Prior art date
Application number
PCT/IB2016/050386
Other languages
French (fr)
Inventor
Nobin Mathew
Subrata Ghosh
George Madathilparambil George
Prakash PADMANABHAN
Original Assignee
Telefonaktiebolaget Lm Ericsson (Publ)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Telefonaktiebolaget Lm Ericsson (Publ)
Priority to PCT/IB2016/050386
Publication of WO2017130022A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604 Improving or facilitating administration, e.g. storage management
    • G06F3/0607 Improving or facilitating administration, e.g. storage management by facilitating the process of upgrading existing storage systems, e.g. for improving compatibility between host and storage device
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0629 Configuration or reconfiguration of storage systems
    • G06F3/0632 Configuration or reconfiguration of storage systems by initialisation or re-initialisation of storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/065 Replication mechanisms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0683 Plurality of storage devices
    • G06F3/0689 Disk arrays, e.g. RAID, JBOD
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16 Error detection or correction of the data by redundancy in hardware
    • G06F11/20 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/2053 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant
    • G06F11/2094 Redundant storage or storage space

Definitions

  • Embodiments of the invention relate to the field of data storage systems and, more specifically, to configuring logical storage space when new storage devices are added to an existing data storage system.
  • the manner of organizing storage blocks in a data storage system aims to ensure efficient access and data replication so as to provide adequate fault tolerance in the event of hardware failures.
  • the storage blocks are replicated such that each replica resides in a different storage device.
  • the number of storage devices in a conventional data storage system multiplicatively depends on a replication factor, and this has been an undesirable constraint.
  • the logical storage space of the data storage system is configured such that groups of blocks storing replicas of the same data are allocated diagonally and circularly across the added storage devices. This manner of organizing/allocating storage space for replica sets reduces the time required to add storage devices.
  • a computer-implemented method for adding new devices to a data storage system in which each data storing block is replicated a replication factor times in different storage devices, thereby forming sets of replicas.
  • the method includes receiving an indication that a number of new storage devices are to be added to the data storage system. If the number of new storage devices is greater than the replication factor, then a logical storage space of the data storage system after adding the new storage devices is configured such that new groups of blocks storing sets of replicas, respectively, are allocated diagonally and circularly across the new storage devices.
  • each data storing block is replicated a replication factor times in different storage devices, thereby forming sets of replicas.
  • the data storage system includes preexisting storage devices, new storage devices and a processor.
  • the processor is configured to receive an indication that a number of new storage devices are added to the data storage system. Further, if the number of new storage devices is greater than the replication factor, the processor configures a logical storage space of the data storage system after adding the new storage devices so that new groups of blocks storing sets of replicas, respectively, are allocated diagonally and circularly across the new storage devices.
  • a non-transitory computer-readable recording medium storing executable codes that, when executed by a computer, make the computer perform a method for adding new storage devices to a data storage system in which each data storing block is replicated a replication factor times in different storage devices, thereby forming sets of replicas.
  • the method includes receiving an indication that a number of new storage devices are added to the data storage system. If the number of new storage devices is greater than the replication factor, then a logical storage space of the data storage system after adding the new storage devices is configured so that new groups of blocks storing sets of replicas, respectively, are allocated diagonally and circularly across the new storage devices.
  • Figure 1 is a block diagram illustrating a data storage system according to one embodiment
  • Figure 2 illustrates a memory configuration according to an embodiment
  • Figure 3 illustrates diagonal and circular allocation of partitions in two sets of disks
  • Figure 4 is a block diagram of a method according to an embodiment
  • Figure 5 is a flowchart of a method according to an embodiment.
  • Figure 1 is a block diagram illustrating data storage according to one embodiment.
  • One or more clients, such as 101 and 102, are communicatively coupled to storage system 104 over network 103.
  • the term “coupled” is used to indicate that the elements, which may or may not be in direct physical or electrical contact with each other, cooperate or interact with each other.
  • the term “connected” is used in this document to indicate that elements coupled with each other communicate.
  • Each of clients 101 and 102 may be a server or a personal computer (e.g., workstation, laptop, netbook, tablet, palm top, mobile phone, smartphone, phablet, multimedia phone, Voice Over Internet Protocol (VOIP) phone, terminal, portable media player, GPS unit, wearable device, gaming system, set-top box, Internet-enabled household appliance, etc.).
  • Network 103 may be any type of network, such as a local area network (LAN), a wide area network (WAN) such as the Internet, a corporate intranet, a metropolitan area network (MAN), a storage area network (SAN), a bus, or a combination thereof, wired and/or wireless.
  • LAN local area network
  • WAN wide area network
  • MAN metropolitan area network
  • SAN storage area network
  • Data storage system 104 may include any type of server or cluster of servers, and may have a distributed architecture, or all of its components may be integrated into a single unit.
  • the data storage system may be, for example, a file server (e.g., an appliance used to provide network attached storage (NAS) capability), a block-based storage server (e.g., used to provide SAN capability), a unified storage device (e.g., one which combines NAS and SAN capabilities), a nearline storage device, a direct attached storage (DAS) device, a tape backup device, or essentially any other type of data storage device.
  • NAS network attached storage
  • DAS direct attached storage
  • Storage system 104 includes pre-existing storage devices 110 and newly added storage devices 120.
  • Each of the storage devices may be, for example, conventional magnetic disks, magnetic tape storages, magneto-optical (MO) storage media, solid-state disks, flash memory-based devices, or any other type of non-volatile storage devices suitable for data.
  • the storage devices may be disk storage media organized into one or more volumes of redundant array of inexpensive disks (RAID).
  • RAID redundant array of inexpensive disks
  • a storage managing processor 105 communicatively coupled to the storage system 104 may be configured to perform allocation and maintenance related to storage system 104.
  • the processor may execute data storage-related management methods by executing software stored on a computer-readable medium, which may also be part of the storage system (e.g., computer-readable medium 130 in Figure 1).
  • a distributed file system may be built on top of storage blocks (e.g., partitions in disks) grouped to store the same data.
  • the location of the blocks in a group may be determined by the administrator of cloud storage, which thus performs a storage manager function.
  • Figure 2 illustrates memory (i.e., volume) configuration according to an embodiment.
  • the memory includes multiple disk sets. Each disk set is represented as a matrix, between square parentheses.
  • M is the number of disks in a set
  • R is the replication factor.
  • the number of disks in each set is greater than the replication factor (M>R).
  • the number of disks in all the disk sets illustrated in Figure 2 is M.
  • Disks A1, A2, to AM pertain to a first set, and labels Ai1, Ai2 to AiR indicate partitions in the disk Ai (where "i" takes values between 1 and M).
  • the number of partitions in a disk may be equal to the replication factor, but there also may be more partitions.
  • the different sets of disks may have been added at different times. In other words, when a number of disks greater than the replication factor are added together, a new set of disks may be formed.
  • the data storage system may be managed such that blocks (i.e., partitions) used for storing sets of replicas are allocated diagonally and circularly. This manner of allocating blocks avoids the constraint that multiplicatively constrains the number of disks relative to the replication factor. However, this manner of organizing disks is an option and not a limitation; in a more general view, blocks from different disks are allocated to store copies (replicas) of each block, respectively.
  • Figure 3 exemplarily illustrates diagonal and circular allocation of blocks (that are also called "partitions") in disk set 310 (which may be pre-existing disks) and disk set 320 (which may be newly added disks).
  • the replication factor is 4, that is, there are 3 copies of any block, for a total of 4 replicas in a set of replicas.
  • the disk blocks form groups for storing sets of replicas as follows: the first group includes A1, B2, C3 and D4, the second group includes B1, C2, D3 and E4, the third group includes C1, D2, E3 and A4, the fourth group includes D1, E2, A3 and B4, and the fifth group includes E1, A2, B3 and C4.
  • the disk partitions then form the new groups for storing new sets of replicas as follows: the first new group includes F1, G2, H3 and I4, the second new group includes G1, H2, I3 and J4, the third new group includes H1, I2, J3 and K4, the fourth new group includes I1, J2, K3 and F4, the fifth new group includes J1, K2, F3 and G4, and finally, the sixth new group includes K1, F2, G3 and H4.
  • FIG. 4 is a block diagram of a method for adding new disks to a data storage system according to an embodiment.
  • information regarding the number of new disks, K, becomes available.
  • the number of new disks is compared to the replication factor, R. If K>R (i.e., the "YES" branch emerging from the diamond block labeled 420 in Figure 4), then, at 430, new groups of blocks/partitions are created for storing sets of replicas across the new K disks. If, however, K≤R (i.e., the "NO" branch emerging from the block 420 in Figure 4), then, at 440, the new disks are merged (i.e., organized and allocated) with pre-existing disks.
  • Block 450 in Figure 4 indicates that the upgraded storage system (including the new disks) is ready for use.
  • although the storage entities (i.e., blocks or partitions) are allocated in different storage devices, the grouped blocks or partitions (for storing sets of replicas) may be logically contiguous in the memory space, i.e., the starting address of a block/partition in a storage device is immediately after the ending address of another block/partition in another storage device.
  • a data storage system in which each data storing block is replicated a replication factor, R, times in different storage devices includes pre-existing storage devices (such as 110 in Figure 1 and 310 in Figure 3), new storage devices (such as 120 in Figure 1 and 320 in Figure 3), and a processor (such as 105 in Figure 1).
  • the processor may be configured to receive an indication that a number, K, of new storage devices are added to the data storage system.
  • the processor may configure a logical storage space of the data storage system after adding new storage devices so that groups of blocks storing sets of replicas, respectively, are allocated diagonally and circularly across the new storage devices.
  • the pre-existing logical storage space of the data storage system before adding the new data storage devices may be included without being modified in the logical storage space of the data storage system after adding the new data storage devices.
  • the pre-existing logical storage space of the data storage system before adding the new data storage devices may have also been configured such that preexisting groups of blocks storing sets of replicas, respectively, were diagonally and circularly allocated across the pre-existing storage devices.
  • the processor may configure the logical storage space of the data storage system such that the new groups of blocks storing sets of replicas are allocated diagonally and circularly across both the pre-existing storage devices and the new storage devices.
  • the processor may also link blocks of the pre-existing storage devices and blocks of the new storage devices to avoid copying data already stored in the blocks of the pre-existing storage devices.
  • a non-transitory computer-readable recording medium (such as memory 130 in Figure 1) stores executable codes which, when executed by a computer, make the computer perform methods for adding new storage devices to a data storage system in which each data storing block is replicated a replication factor times in different storage devices, as previously described.
  • the disclosed exemplary embodiments provide methods and devices for adding new storage devices to a data storage system in which each data storing block is replicated a replication factor times in different storage devices. If the number of new storage devices is greater than the replication factor, then new groups of blocks storing sets of replicas, respectively, are allocated diagonally and circularly across the new storage devices.

Abstract

New storage devices are efficiently added to a data storage system in which each data storing block is replicated a replication factor times in different storage devices. If the number of new storage devices is greater than the replication factor, then the logical storage space of the data storage system after adding the new storage devices is configured so that new groups of blocks storing sets of replicas, respectively, are allocated diagonally and circularly across the new storage devices.

Description

[0001] Embodiments of the invention relate to the field of data storage systems and, more specifically, to configuring logical storage space when new storage devices are added to an existing data storage system.
DISCUSSION OF THE BACKGROUND
[0002] The manner of organizing storage blocks in a data storage system aims to ensure efficient access and data replication so as to provide adequate fault tolerance in the event of hardware failures. In order to avoid loss of data when a storage device fails, the storage blocks are replicated such that each replica resides in a different storage device. The number of storage devices in a conventional data storage system multiplicatively depends on a replication factor, and this has been an undesirable constraint.
[0003] This constraint has recently been overcome through the development of methods that organize/allocate data storage blocks in a data storage system in a circular and diagonal manner. These methods allocate the blocks one-by-one on different storage devices in a circular list. Replicas of a set (i.e., storing the same data) are allocated to successive devices in the list. When a block is allocated in the "last" device in the list, the next block is allocated to the "first" device in the list. This circular and diagonal manner of organizing/allocating storage blocks renders moot the constraint of adding a number of new memories dependent on the replication factor.
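The circular-list behavior described in [0003] can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `allocate_round_robin` and the device labels are hypothetical, and the single running cursor is one plausible reading of the paragraph, not the method of any cited patent.

```python
from itertools import cycle, islice


def allocate_round_robin(devices, replication_factor, num_sets):
    """Place each replica set on successive devices of a circular list.

    Hypothetical helper: a single cursor walks the device list and wraps
    from the "last" device back to the "first" one.
    """
    if replication_factor > len(devices):
        raise ValueError("need at least as many devices as replicas per set")
    ring = cycle(devices)  # circular list of devices
    # Each islice call consumes the next `replication_factor` devices.
    return [list(islice(ring, replication_factor)) for _ in range(num_sets)]


placements = allocate_round_robin(list("ABCDE"), 3, 5)
for index, placement in enumerate(placements):
    print(index, placement)
```

Because the cursor advances by exactly one replica set at a time, every set lands on distinct devices as long as the replication factor does not exceed the number of devices, and the device count need not be a multiple of the replication factor.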
[0004] Various aspects related to data storage systems are continuously improved as described in U.S. Patent Nos. 8832363, 5937425, 7249150, 7680837, 7996636, 8082390, 8099396, 8205065, 8341457, 8417987, 8495417, 8839008, 8560879, and 8595595, and U.S. Patent Application Publication Nos. 2003/0120869, 2003/0191916, 2005/0144514, 2007/0143359, 2010/0042790, 2010/0088296, 2011/0035548, 2011/0213928, and 2012/0290788.
[0005] Automated management of the data storage system storing replicas of data in physically distinct devices is substantially disrupted when new storage devices (i.e., memories or disks) are added. For example, such an expansion of a data storage system that is circularly and diagonally organized/allocated triggers delays due to data copying between pre-existing and new storage devices.
[0006] It is desirable to develop methods for adding new storage devices to a data storage system that minimize the disruption and avoid the need to copy data from one storage device to another.
SUMMARY
[0007] In various embodiments, if the number of added storage devices is greater than the replication factor, then the logical storage space of the data storage system is configured such that groups of blocks storing replicas of the same data are allocated diagonally and circularly across the added storage devices. This manner of organizing/allocating storage space for replica sets reduces the time required to add storage devices (i.e., the added devices are instantly available without delays caused by data copying from one storage device to another). Moreover, the number of added devices does not have to be a multiple of the replication factor.
[0008] According to an embodiment, there is a computer-implemented method for adding new devices to a data storage system in which each data storing block is replicated a replication factor times in different storage devices, thereby forming sets of replicas. The method includes receiving an indication that a number of new storage devices are to be added to the data storage system. If the number of new storage devices is greater than the replication factor, then a logical storage space of the data storage system after adding the new storage devices is configured such that new groups of blocks storing sets of replicas, respectively, are allocated diagonally and circularly across the new storage devices.
[0009] According to another embodiment, there is a data storage system in which each data storing block is replicated a replication factor times in different storage devices, thereby forming sets of replicas. The data storage system includes pre-existing storage devices, new storage devices and a processor. The processor is configured to receive an indication that a number of new storage devices are added to the data storage system. Further, if the number of new storage devices is greater than the replication factor, the processor configures a logical storage space of the data storage system after adding the new storage devices so that new groups of blocks storing sets of replicas, respectively, are allocated diagonally and circularly across the new storage devices.
[0010] According to yet another embodiment, there is a non-transitory computer-readable recording medium storing executable codes that, when executed by a computer, make the computer perform a method for adding new storage devices to a data storage system in which each data storing block is replicated a replication factor times in different storage devices, thereby forming sets of replicas. The method includes receiving an indication that a number of new storage devices are added to the data storage system. If the number of new storage devices is greater than the replication factor, then a logical storage space of the data storage system after adding the new storage devices is configured so that new groups of blocks storing sets of replicas, respectively, are allocated diagonally and circularly across the new storage devices.
BRIEF DESCRIPTION OF THE DRAWINGS
[0011] The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate one or more embodiments and, together with the description, explain these embodiments. In the drawings:
[0012] Figure 1 is a block diagram illustrating a data storage system according to one embodiment;
[0013] Figure 2 illustrates a memory configuration according to an embodiment;
[0014] Figure 3 illustrates diagonal and circular allocation of partitions in two sets of disks;
[0015] Figure 4 is a block diagram of a method according to an embodiment; and
[0016] Figure 5 is a flowchart of a method according to an embodiment.
DETAILED DESCRIPTION
[0017] The following description of the exemplary embodiments refers to the accompanying drawings. The same reference numbers in different drawings identify the same or similar elements. The following detailed description does not limit the invention; instead, the scope of the invention is defined by the appended claims. The following embodiments are discussed using terminology of block storage systems (i.e., non-object and non-metadata based systems).
[0018] Reference throughout the specification to "one embodiment" or "an embodiment" means that a particular feature, structure or characteristic described in connection with an embodiment is included in at least one embodiment of the subject matter disclosed. Thus, the appearance of the phrases "in one embodiment" or "in an embodiment" in various places throughout the specification is not necessarily referring to the same embodiment. Further, the particular features, structures or characteristics may be combined in any suitable manner in one or more embodiments.
[0019] Figure 1 is a block diagram illustrating data storage according to one embodiment. One or more clients, such as 101 and 102, are communicatively coupled to storage system 104 over network 103. Here, the term "coupled" is used to indicate that the elements, which may or may not be in direct physical or electrical contact with each other, cooperate or interact with each other. Further, the term "connected" is used in this document to indicate that elements coupled with each other communicate.
[0020] Each of clients 101 and 102 may be a server or a personal computer (e.g., workstation, laptop, netbook, tablet, palm top, mobile phone, smartphone, phablet, multimedia phone, Voice Over Internet Protocol (VOIP) phone, terminal, portable media player, GPS unit, wearable device, gaming system, set-top box, Internet-enabled household appliance, etc.). Network 103 may be any type of network, such as a local area network (LAN), a wide area network (WAN) such as the Internet, a corporate intranet, a metropolitan area network (MAN), a storage area network (SAN), a bus, or a combination thereof, wired and/or wireless.
[0021] Data storage system 104 may include any type of server or cluster of servers, and may have a distributed architecture, or all of its components may be integrated into a single unit. The data storage system may be, for example, a file server (e.g., an appliance used to provide network attached storage (NAS) capability), a block-based storage server (e.g., used to provide SAN capability), a unified storage device (e.g., one which combines NAS and SAN capabilities), a nearline storage device, a direct attached storage (DAS) device, a tape backup device, or essentially any other type of data storage device.
[0022] Storage system 104 includes pre-existing storage devices 110 and newly added storage devices 120. Each of the storage devices may be, for example, conventional magnetic disks, magnetic tape storages, magneto-optical (MO) storage media, solid-state disks, flash memory-based devices, or any other type of non-volatile storage devices suitable for data. The storage devices may be disk storage media organized into one or more volumes of redundant array of inexpensive disks (RAID). In the following description, the term "disk" is used as short form for "storage device."
[0023] A storage managing processor 105 communicatively coupled to the storage system 104 may be configured to perform allocation and maintenance related to storage system 104. The processor may execute data storage-related management methods by executing software stored on a computer-readable medium, which may also be part of the storage system (e.g., computer-readable medium 130 in Figure 1).
[0024] According to one embodiment, a distributed file system (DFS) may be built on top of storage blocks (e.g., partitions in disks) grouped to store the same data. The location of the blocks in a group may be determined by the administrator of cloud storage, which thus performs a storage manager function.
[0025] The following description refers to a single volume, but multiple volumes and expanding a volume may also be implemented using the same concepts. Figure 2 illustrates memory (i.e., volume) configuration according to an embodiment. The memory includes multiple disk sets. Each disk set is represented as a matrix, between square parentheses. Consider M to be the number of disks in a set, and R the replication factor. The number of disks in each set is greater than the replication factor (M>R). There may be different numbers of disks in the different disk sets, but for the sake of simplicity, the number of disks in all the disk sets illustrated in Figure 2 is M.
[0026] Disks A1, A2, to AM pertain to a first set, and labels Ai1, Ai2 to AiR indicate partitions in the disk Ai (where "i" takes values between 1 and M). The number of partitions in a disk may be equal to the replication factor, but there also may be more partitions.
[0027] The different sets of disks may have been added at different times. In other words, when a number of disks greater than the replication factor are added together, a new set of disks may be formed.
[0028] The data storage system may be managed such that blocks (i.e., partitions) used for storing sets of replicas are allocated diagonally and circularly. This manner of allocating blocks avoids the constraint that multiplicatively constrains the number of disks relative to the replication factor. However, this manner of organizing disks is an option and not a limitation; in a more general view, blocks from different disks are allocated to store copies (replicas) of each block, respectively.
[0029] Figure 3 exemplarily illustrates diagonal and circular allocation of blocks (that are also called "partitions") in disk set 310 (which may be pre-existing disks) and disk set 320 (which may be newly added disks). Disk set 310 includes disks A, B, C, D and E (that is, M=5 disks), and disk set 320 includes disks F, G, H, I, J and K (i.e., M=6 disks). The replication factor is 4, that is, there are 3 copies of any block, for a total of 4 replicas in a set of replicas.
[0030] In disk set 310, the disk blocks form groups for storing sets of replicas as follows: the first group includes A1, B2, C3 and D4, the second group includes B1, C2, D3 and E4, the third group includes C1, D2, E3 and A4, the fourth group includes D1, E2, A3 and B4, and the fifth group includes E1, A2, B3 and C4.
[0031] In disk set 320, the disk partitions then form the new groups for storing new sets of replicas as follows: the first new group includes F1, G2, H3 and I4, the second new group includes G1, H2, I3 and J4, the third new group includes H1, I2, J3 and K4, the fourth new group includes I1, J2, K3 and F4, the fifth new group includes J1, K2, F3 and G4, and finally, the sixth new group includes K1, F2, G3 and H4. Different shades and hashes are employed to illustrate the different groups of partitions.
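The groups listed for disk sets 310 and 320 follow a simple index rule, which the following Python sketch makes explicit. The function name `diagonal_groups` is hypothetical, and the rule "group g takes partition r+1 of disk (g + r) mod M" is inferred from the groups enumerated in [0030] and [0031] rather than stated as a formula in the text.

```python
def diagonal_groups(disks, replication_factor):
    """Return the diagonal, circular partition groups of one disk set.

    Hypothetical helper. Group g holds partition r+1 of disk (g + r) mod M,
    for r = 0 .. R-1, so a group steps diagonally through the set and wraps
    around at the last disk.
    """
    m = len(disks)
    if m <= replication_factor:
        # The description states M > R for a disk set.
        raise ValueError("a disk set needs more disks than the replication factor")
    return [
        [f"{disks[(g + r) % m]}{r + 1}" for r in range(replication_factor)]
        for g in range(m)
    ]


set_310 = diagonal_groups(list("ABCDE"), 4)
set_320 = diagonal_groups(list("FGHIJK"), 4)
print(set_310[0])  # ['A1', 'B2', 'C3', 'D4']
print(set_320[5])  # ['K1', 'F2', 'G3', 'H4']
```

Each disk contributes exactly one partition per partition index, so every disk in a set of M disks ends up in R groups, whether or not M is a multiple of R.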
[0032] Figure 4 is a block diagram of a method for adding new disks to a data storage system according to an embodiment. At 410, information regarding the number of new disks, K, becomes available. Then, at 420, the number of new disks is compared to the replication factor, R. If K>R (i.e., the "YES" branch emerging from the diamond block labeled 420 in Figure 4), then, at 430, new groups of blocks/partitions are created for storing sets of replicas across the new K disks. If, however, K≤R (i.e., the "NO" branch emerging from the block 420 in Figure 4), then, at 440, the new disks are merged (i.e., organized and allocated) with pre-existing disks. Block 450 in Figure 4 indicates that the upgraded storage system (including the new disks) is ready for use.
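The decision at block 420 can be summarized by a short Python sketch. The names `ExpansionPlan` and `plan_expansion` are hypothetical, and the sketch only records which branch of Figure 4 applies; how the merge at block 440 lays out partitions is not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpansionPlan:
    """Hypothetical record of the branch taken in Figure 4."""
    action: str   # "new_set" (block 430) or "merge" (block 440)
    disks: tuple  # the newly added disks


def plan_expansion(new_disks, replication_factor):
    """Compare K, the number of new disks, with R, as in block 420."""
    k = len(new_disks)
    if k > replication_factor:
        # Block 430: the new disks form their own diagonally allocated set.
        return ExpansionPlan("new_set", tuple(new_disks))
    # Block 440: too few disks for a separate set, so merge with existing ones.
    return ExpansionPlan("merge", tuple(new_disks))


print(plan_expansion(list("FGHIJK"), 4).action)
print(plan_expansion(list("FG"), 4).action)
```

Treating K equal to R as a merge follows the wording "not greater than the replication factor" used for this branch in the description.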
[0033] Although the storage entities (i.e., blocks or partitions) are allocated in different storage devices, the grouped blocks or partitions (for storing sets of replicas) may be logically contiguous in the memory space (i.e., the starting address of a block/partition in a storage device is immediately after the ending address of another block/partition in another storage device).
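One way to picture this logical contiguity is to lay the blocks of a group back to back in a logical address range, so that each block starts where the previous one ends even though the blocks reside on different devices. The Python sketch below is an illustration under stated assumptions, not the layout used by the application: the block size of 1024 address units, the base address of 0, and the function names layout_group and locate are all hypothetical.

```python
BLOCK_SIZE = 1024  # hypothetical block size, in address units


def layout_group(group, base_address, block_size=BLOCK_SIZE):
    """Assign back-to-back logical address ranges to the blocks of one group.

    Each extent is (disk, block, start, end) with 'end' exclusive, so the start
    of one block immediately follows the end of the previous block.
    """
    extents = []
    start = base_address
    for disk, block in group:
        end = start + block_size
        extents.append((disk, block, start, end))
        start = end
    return extents


def locate(extents, logical_address):
    """Map a logical address to (disk, block, offset inside that block)."""
    for disk, block, start, end in extents:
        if start <= logical_address < end:
            return disk, block, logical_address - start
    raise ValueError("address outside this group")


first_group = [("A", 1), ("B", 2), ("C", 3), ("D", 4)]
extents = layout_group(first_group, base_address=0)
print(extents)
print(locate(extents, 1024))
```

Here the range assigned to block B2 begins at the address where the range of block A1 ends, although the two blocks are on different disks.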
[0034] According to an exemplary embodiment, a data storage system in which each data storing block is replicated a replication factor, R, times in different storage devices includes pre-existing storage devices (such as 110 in Figure 1 and 310 in Figure 3), new storage devices (such as 120 in Figure 1 and 320 in Figure 3), and a processor (such as 105 in Figure 1). The processor may be configured to receive an indication that a number, K, of new storage devices are added to the data storage system. The processor may configure a logical storage space of the data storage system after adding new storage devices so that groups of blocks storing sets of replicas, respectively, are allocated diagonally and circularly across the new storage devices.
[0035] The pre-existing logical storage space of the data storage system before adding the new data storage devices may be included without being modified in the logical storage space of the data storage system after adding the new data storage devices. The pre-existing logical storage space of the data storage system before adding the new data storage devices may have also been configured such that pre-existing groups of blocks storing sets of replicas, respectively, were diagonally and circularly allocated across the pre-existing storage devices.
[0036] If the number of new storage devices is not greater than the replication factor, the processor may configure the logical storage space of the data storage system such that the new groups of blocks storing sets of replicas are allocated diagonally and circularly across both the pre-existing storage devices and the new storage devices. In this case, the processor may also link blocks of the pre-existing storage devices and blocks of the new storage devices to avoid copying data already stored in the blocks of the pre-existing storage devices.
[0037] According to one embodiment, a non-transitory computer readable recording medium (such as memory 130 in Figure 1) stores executable codes which, when executed by a computer, make the computer perform methods for adding new storage devices to a data storage system in which each data storing block is replicated a replication factor times in different storage devices, as previously described.
The disclosed exemplary embodiments provide methods and devices for adding new storage devices to a data storage system in which each data storing block is replicated a replication factor times in different storage devices. If the number of new storage devices is greater than the replication factor, then new groups of blocks storing sets of replicas, respectively, are allocated diagonally and circularly across the new storage devices. It should be understood that this description is not intended to limit the invention. On the contrary, the exemplary embodiments are intended to cover alternatives, modifications and equivalents, which are included in the spirit and scope of the invention as defined by the appended claims. Further, in the detailed description of the exemplary embodiments, numerous specific details are set forth in order to provide a comprehensive understanding of the claimed invention. However, one skilled in the art would understand that various embodiments may be practiced without such specific details.
[0038] Although the features and elements of the present exemplary embodiments are described in the embodiments in particular combinations, each feature or element can be used alone without the other features and elements of the embodiments or in various combinations with or without other features and elements disclosed herein.
[0039] This written description uses examples of the subject matter disclosed to enable any person skilled in the art to practice the same, including making and using any devices or systems and performing any incorporated methods. The patentable scope of the subject matter is defined by the claims, and may include other examples that occur to those skilled in the art. Such other examples are intended to be within the scope of the claims.

Claims

WHAT IS CLAIMED IS:
1. A computer-implemented method (400, 500) for adding new storage devices (120, 320) to a data storage system (100, 300) in which each data storing block is replicated a replication factor (R) times in different storage devices, thereby forming sets of replicas, the method comprising:
receiving (410, S510) an indication that a number of new storage devices are to be added to the data storage system; and
if the number of new storage devices is greater than the replication factor (420), then configuring (430, S520) a logical storage space of the data storage system after adding the new storage devices such that new groups of blocks storing sets of replicas, respectively, are allocated diagonally and circularly across the new storage devices.
2. The computer-implemented method of claim 1, wherein a pre-existing logical storage space of the data storage system before adding the new data storage devices is included without being modified, in the logical storage space.
3. The computer-implemented method of claim 2, wherein, according to the pre-existing logical storage space, pre-existing groups of blocks storing sets of replicas, respectively, have been allocated diagonally and circularly across the pre-existing storage devices.
4. The computer-implemented method of claim 1, further comprising: if the number of new storage devices is not greater than the replication factor, configuring (440) the logical storage space of the data storage system such that the new groups of blocks storing sets of replicas to be allocated diagonally and circularly across both pre-existing storage devices and the new storage devices.
5. The computer-implemented method of claim 4, further comprising:
when the number of new storage devices is not greater than the replication factor, creating links between blocks of the pre-existing storage devices and blocks of the new storage devices according to the logical storage space of the data storage system after adding the new storage devices, to avoid copying data already stored in the blocks of the pre-existing storage devices.
6. The computer-implemented method of claim 1, wherein blocks of a same group are allocated to be logically contiguous in a memory space.
7. A data storage system (104, 300) in which each data storing block is replicated a replication factor (R) times in different storage devices, thereby forming sets of replicas, the data storage system comprising:
pre-existing storage devices (110, 310) and new storage devices (120, 320); a processor (105) configured
to receive an indication that a number of the new storage devices are added to the data storage system; and if the number of new storage devices is greater than the replication factor, to configure a logical storage space of the data storage system after adding the new storage devices such that new groups of blocks storing sets of replicas, respectively, are allocated diagonally and circularly across the new storage devices.
8. The data storage system of claim 7, wherein the at least one processor is configured to include without modifying a pre-existing logical storage space of the data storage system before adding the new data storage devices, in the logical storage space of the data storage system after adding the new data storage devices.
9. The data storage system of claim 8, wherein the pre-existing logical storage space has been configured to allocate diagonally and circularly groups of blocks to store sets of replicas, respectively, across the pre-existing storage devices.
10. The data storage system of claim 7, wherein if the number of new storage devices is not greater than the replication factor, the at least one processor configures the logical storage space of the data storage system such that the new groups of blocks storing sets of replicas to be allocated diagonally and circularly across both pre-existing storage devices and the new storage devices.
11. The data storage system of claim 10, wherein when the number of new storage devices is not greater than the replication factor, the at least one processor links blocks of the pre-existing storage devices and blocks of the new storage devices, to avoid copying data already stored in the blocks of the pre-existing storage devices.
12. The data storage system of claim 7, wherein the at least one processor allocates blocks of a same new group to be logically contiguous in a memory space.
13. A non-transitory computer readable recording medium (130) storing executable codes which, when executed by a computer, make the computer perform a method (500) for adding new storage devices (120, 320) to a data storage system (104, 300) in which each data storing block is replicated a replication factor (R) times in different storage devices, thereby forming sets of replicas, the method comprising:
receiving (410, S510) an indication that a number of new storage devices are added to the data storage system; and
if the number of new storage devices is greater than the replication factor (420), then configuring (430, S520) a logical storage space of the data storage system after adding the new storage devices such that new groups of blocks storing sets of replicas, respectively, to be allocated diagonally and circularly across the new storage devices.
14. The non-transitory computer readable recording medium of claim 13, wherein a pre-existing logical storage space of the data storage system before adding the new data storage devices is included without being modified, in the logical storage space.
15. The non-transitory computer readable recording medium of claim 14, wherein, according to the pre-existing logical storage space, pre-existing groups of blocks storing sets of replicas, respectively, have been allocated diagonally and circularly across the pre-existing storage devices.
16. The non-transitory computer readable recording medium of claim 13, wherein the method further comprises:
if the number of new storage devices is not greater than the replication factor, configuring (440) the logical storage space of the data storage system such that the new groups of blocks storing sets of replicas to be allocated diagonally and circularly across both pre-existing storage devices and the new storage devices.
17. The non-transitory computer readable recording medium of claim 16, wherein the method further comprises:
when the number of new storage devices is not greater than the replication factor, creating links between blocks of the pre-existing storage devices and blocks of the new storage devices according to the logical storage space of the data storage system after adding the new storage devices, to avoid copying data already stored in the blocks of the pre-existing storage devices.
18. The non-transitory computer readable recording medium of claim 13, wherein blocks of a same group are allocated to be logically contiguous in a memory space.
PCT/IB2016/050386 2016-01-26 2016-01-26 Method for adding storage devices to a data storage system with diagonally replicated data storage blocks WO2017130022A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/IB2016/050386 WO2017130022A1 (en) 2016-01-26 2016-01-26 Method for adding storage devices to a data storage system with diagonally replicated data storage blocks

Publications (1)

Publication Number Publication Date
WO2017130022A1 true WO2017130022A1 (en) 2017-08-03

Family

ID=55451513

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2016/050386 WO2017130022A1 (en) 2016-01-26 2016-01-26 Method for adding storage devices to a data storage system with diagonally replicated data storage blocks

Country Status (1)

Country Link
WO (1) WO2017130022A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170220284A1 (en) * 2016-01-29 2017-08-03 Netapp, Inc. Block-level internal fragmentation reduction using a heuristic-based approach to allocate fine-grained blocks

Citations (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5937425A (en) 1997-10-16 1999-08-10 M-Systems Flash Disk Pioneers Ltd. Flash file system optimized for page-mode flash technologies
US20030120809A1 (en) 2001-12-20 2003-06-26 Bellur Barghav R. Interference mitigation and adaptive routing in wireless ad-hoc packet-switched networks
US20030191916A1 (en) 2002-04-04 2003-10-09 International Business Machines Corporation Apparatus and method of cascading backup logical volume mirrors
US20050144514A1 (en) 2001-01-29 2005-06-30 Ulrich Thomas R. Dynamic redistribution of parity groups
US20070143359A1 (en) 2005-12-19 2007-06-21 Yahoo! Inc. System and method for recovery from failure of a storage server in a distributed column chunk data store
US7249150B1 (en) 2001-07-03 2007-07-24 Network Appliance, Inc. System and method for parallelized replay of an NVRAM log in a storage appliance
US20100042790A1 (en) 2008-08-12 2010-02-18 Netapp, Inc. Scalable deduplication of stored data
US7680837B2 (en) 2005-11-08 2010-03-16 Nec Corporation File management method for log-structured file system for sequentially adding and storing log of file access
US20100088296A1 (en) 2008-10-03 2010-04-08 Netapp, Inc. System and method for organizing data to facilitate data deduplication
US20110035548A1 (en) 2008-02-12 2011-02-10 Kimmel Jeffrey S Hybrid media storage system architecture
US7996636B1 (en) 2007-11-06 2011-08-09 Netapp, Inc. Uniquely identifying block context signatures in a storage volume hierarchy
US20110213928A1 (en) 2010-02-27 2011-09-01 Cleversafe, Inc. Distributedly storing raid data in a raid memory and a dispersed storage network memory
US8082390B1 (en) 2007-06-20 2011-12-20 Emc Corporation Techniques for representing and storing RAID group consistency information
US8099396B1 (en) 2006-12-15 2012-01-17 Netapp, Inc. System and method for enhancing log performance
US8205065B2 (en) 2009-03-30 2012-06-19 Exar Corporation System and method for data deduplication
US20120290788A1 (en) 2006-05-24 2012-11-15 Compellent Technologies System and method for raid management, reallocation, and restripping
US8341457B2 (en) 2010-03-11 2012-12-25 Lsi Corporation System and method for optimizing redundancy restoration in distributed data layout environments
US8417987B1 (en) 2009-12-01 2013-04-09 Netapp, Inc. Mechanism for correcting errors beyond the fault tolerant level of a raid array in a storage system
US8495417B2 (en) 2009-01-09 2013-07-23 Netapp, Inc. System and method for redundancy-protected aggregates
US8560879B1 (en) 2009-04-22 2013-10-15 Netapp Inc. Data recovery for failed memory device of memory device array
US8595595B1 (en) 2010-12-27 2013-11-26 Netapp, Inc. Identifying lost write errors in a raid array
US8832363B1 (en) 2014-01-17 2014-09-09 Netapp, Inc. Clustered RAID data organization
US8839008B2 (en) 2011-09-23 2014-09-16 Broadcom Corporation System and method for detecting configuration of a power sourcing equipment device connected to a powered device by simultaneously measuring voltage at two terminals of a resistor disposed within the powered device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ANONYMOUS: "md(4): Multiple Device driver aka Software RAID - Linux man page", 27 December 2015 (2015-12-27), XP055301357, Retrieved from the Internet <URL:http://web.archive.org/web/20151227021146/http://linux.die.net/man/4/md> [retrieved on 20160909] *
ANONYMOUS: "mdadm(8): manage MD devices aka Software RAID - Linux man page", 19 December 2015 (2015-12-19), XP055301360, Retrieved from the Internet <URL:http://web.archive.org/web/20151219130132/http://linux.die.net/man/8/mdadm> [retrieved on 20160909] *

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16707561

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16707561

Country of ref document: EP

Kind code of ref document: A1