WO2017087015A1 - Count of metadata operations - Google Patents

Info

Publication number
WO2017087015A1
WO2017087015A1 PCT/US2016/014785 US2016014785W WO2017087015A1 WO 2017087015 A1 WO2017087015 A1 WO 2017087015A1 US 2016014785 W US2016014785 W US 2016014785W WO 2017087015 A1 WO2017087015 A1 WO 2017087015A1
Authority
WO
WIPO (PCT)
Prior art keywords
file
metadata
journal log
user
journal
Prior art date
Application number
PCT/US2016/014785
Other languages
French (fr)
Inventor
Hiro LALWANI
Abubaker SIDDIQUE
Jayasankar Nallasamy
Original Assignee
Hewlett Packard Enterprise Development Lp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hewlett Packard Enterprise Development Lp

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3034 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a storage system, e.g. DASD based or network based
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3466 Performance evaluation by tracing or monitoring
    • G06F11/3476 Data logging
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/40 Data acquisition and logging

Definitions

  • Storage systems typically store data in user files.
  • the user files may be written on a non-transitory computer readable storage medium, such as hard disks, flash memory, tape etc.
  • a user file may include data and metadata.
  • the data includes the information which the user wishes to store in the file, while the metadata may include information relating to the size, location and other attributes of the file.
  • Some file systems keep a record of certain changes to the files in a journal log, which is also known as a transaction log. This can be helpful in re-building the files in the case of a system error.
  • Some file systems employ both a primary storage system and a secondary storage system for redundancy. The contents of the primary storage system may be copied to the secondary storage system so that the secondary storage system may be used as a backup if the primary storage system fails.
  • Figure 1 shows a schematic example of a storage system
  • Figure 2 shows a schematic example of a storage system in more detail
  • Figure 3 shows an example method of storing metadata
  • Figure 4 shows an example of file metadata
  • Figure 5 shows an example journal log file
  • Figure 6 shows an example of a journal attributes record for some example journal log files
  • Figure 7 shows an example method which uses a journal log file
  • Figure 8 shows an example method of determining whether there is an error in a journal log file
  • Figure 9 shows an example method of correcting an error in a journal log file.
  • Figure 1 is a schematic diagram showing an example of a storage system 100 according to the present disclosure.
  • the storage system 100 includes a processor 110 and a non-transitory computer readable storage medium 120.
  • the storage medium 120 stores a number of user files 130, 140, a journal log 170 and machine readable instructions 180 that are executable by the processor 110.
  • the storage medium 120 may include non-volatile storage such as a hard disk, an array of hard disks, flash memory or tape storage.
  • the storage medium may further include volatile storage such as volatile random access memory (RAM) or dynamic random access memory (DRAM).
  • Volatile storage may be used as a buffer to temporarily store data that is about to be written to non-volatile memory, or which has just been read from non-volatile memory.
  • the storage medium 120 is shown as a single part in Figure 1, but in practice may include one, several or all of the different types of storage mentioned above and may be distributed across several physical devices.
  • the storage medium 120 stores a number of user files which form a file system.
  • Figure 1 shows a plurality of files, including a first user file 130 and an Nth user file 140.
  • Each user file includes data (not shown explicitly in Figure 1) and metadata 130A, 140A relating to the user file.
  • the metadata 130A, 140A of a user file 130, 140 may for example include a name, size, location and/or other attributes of the user file.
  • the file system also includes a journal log 170 which is stored on the storage medium 120.
  • the journal log 170 is not a user file. Rather the journal log stores a list of changes that have been made to the file system.
  • the journal log 170 may include metadata operations which describe changes made to the metadata of the user files. There may be a plurality of different possible metadata operations and each metadata operation entry may indicate the type of operation and other information.
  • the journal log may be implemented as a circular buffer. As it includes a list of changes made to the file system, the journal log may be used to replay recent changes to a file system if recent changes are lost due to a system failure.
  • the journal log may be stored in volatile memory and periodically flushed to a non-volatile storage.
  • the journal log 170 is accessible by a journal sub-system of the file system, but it is not a user file and therefore may not be easily accessible to other applications which use the file system. Therefore in some cases the journal log may be copied into a user file to create a journal log file.
  • a journal log file is like a user file and may be accessible in the same manner as a user file, but stores journal log data instead of user data.
  • the journal log file is a file created by the file system, rather than a user, for the specific purpose of storing journal log data.
  • the journal log file may be conveniently accessed by other applications, such as disaster recovery, express query etc. However, because the journal log file is not a reserved system file or data sector, the integrity of the journal log file may not be ensured in the case of system failure or write errors.
  • the present disclosure proposes maintaining, in the metadata of each user file, a count of each type of a plurality of metadata operation types that have been performed on a user file. In this way the integrity of the journal log file may be checked by comparing its contents with the counts in the metadata of the user files.
  • the metadata 130A, 140A of each user file includes a counter 130B, 140B for counting each type of metadata operation.
  • the machine readable instructions 180 are instructions that are executable by the processor 110 to manage certain aspects of the storage system.
  • the instructions may include instructions to handle writing data to the file system and reading data from the file system.
  • the instructions include instructions to write a metadata operation to the journal log and to maintain, in metadata of a user file, a count of each type of metadata operation performed on the user file.
  • Figure 3 shows an example method that may be implemented by the processor 110 of the storage system 100 shown in Figure 1.
  • the storage system receives an instruction to write data to a user file.
  • the storage system writes the data and corresponding metadata to the user file.
  • the metadata may include information about the user file, such as its size and location etc.
  • the metadata may be updated as the file is changed and more data is written to or deleted from the file, to reflect changes in size, name, location, disk sectors which are occupied etc.
  • the storage system writes a metadata operation entry to a journal log.
  • the storage system maintains, in the metadata of the user file, a count of each type of a plurality of metadata operation types that have been performed on the user file.
  • the metadata operation entry which is written to the journal log in block 330 may correspond to the metadata written to the user file in block 320.
  • Each metadata operation may have a type.
  • the journal log may include a plurality of metadata operations and some of the metadata operations may have different types.
  • Figure 2 shows an example which includes a primary storage system 100 and a secondary storage system 200.
  • a client 10 may send write requests to write data to the primary storage system and read requests to read data from the primary storage system.
  • the client may be a user device, such as a computer, a server, a mobile device etc.
  • the client 10 may connect to the primary storage system over a network.
  • the client may likewise connect to the secondary storage system over a network. If the primary storage system 100 fails, or the connection of the client to the primary storage system fails, then the secondary storage system may take over and the client may read data from and write data to the secondary storage system instead.
  • the secondary storage system acts as a backup, which may be used in case the primary storage fails, or becomes inaccessible due to a network failure etc.
  • the secondary storage may be located in the same building or a different building to the primary storage.
  • Data is copied from the primary storage system 100 to the secondary storage system 200.
  • Data may be copied periodically, on demand, or based on the volume of data, or volume of changes to the data or otherwise.
  • the primary and secondary storage systems may be synchronized so that they contain the same information, or the secondary storage system may be updated so that it is just a little behind the primary storage system.
  • data is copied frequently, for instance every 5 seconds, so that the secondary storage is not too far behind the primary storage at any point in time.
  • This copying of data from the primary storage system to the secondary storage system may be referred to as disaster recovery, as it enables the user to access data in the case that a disaster renders the primary storage system inoperable or inaccessible.
  • the primary storage system 100 of Figure 2 is similar to the storage 100 of Figure 1, and so the same reference numerals are used. However, merely by way of example, some more specific details of the structure of the file system in Figure 2 are shown.
  • the storage system 100 of Figure 2 stores a file system including a plurality of user files 130, 140. While just two user files 130, 140 are shown in Figure 2, it is to be understood that there may be any number of user files.
  • Each user file 130, 140 includes metadata 130A, 140A and user data 130C, 140C.
  • the metadata 130A, 140A includes a counter 130B, 140B for counting a number of each type of metadata operation which is performed on the user file.
  • Figure 4 shows an example of user file metadata 400.
  • the metadata may include various records including for instance any one, any combination or all of the following: a file ID record 410, a file attributes record 420, a file extent record 430 and a journal attributes record 440. There may also be other records such as quota information (not shown).
  • the file ID record 410 may for example include a tag number or inode number identifying the file.
  • the file attributes record 420 may include information about the file size, mode and/or number of links.
  • the file extent record 430 may include information about the file user data.
  • the journal attributes record may include counters of metadata operations performed on the file and is described in more detail later.
  • the file system includes a journal log 170, which may also be referred to as a transaction log.
  • the journal log 170 stores information about each metadata operation which is performed on the user files of the file system. For example, before or after a metadata operation is performed to write metadata to a user file, details of the metadata operation may be recorded in the journal log 170.
  • the journal log may thus be useful for rebuilding the file system in the event of a system failure.
  • the journal log 170 may be accessible by a journal log subsystem of the file system. However, as the journal log 170 is not a user file, it may not be easily accessible to other applications. Therefore contents of the journal log 170 may be copied to a journal log file. There may be a plurality of journal log files each covering a respective period of time.
  • Figure 2 shows two journal log files: a first journal log file 150 and a second journal log file 160. However, this is just by way of example and at any one time there may be more or fewer journal log files.
  • journal log files 150, 160 are like user files and accessible to applications in the same way as user files, but are automatically created by the file system, rather than in response to an instruction from a client to store user data.
  • the journal log files 150, 160 may be created by copying data from the journal log 170 periodically, or after a certain number of transactions or at certain intervals.
  • Each journal log file 150, 160 may have the same structure as a user file, for instance including metadata 150A, 160A and data 150C, 160C.
  • the data 150C, 160C of each journal log file is data which has been copied from the journal log 170 and covers metadata operations carried out on the file system during a particular period of time.
  • the metadata 150A, 160A may indicate the period of time which the journal log file relates to.
  • Figure 5 shows an example of a journal log file 150 in more detail. It includes metadata 150A, which may for example be as shown in Figure 4.
  • the metadata 150A may also indicate the time period which the journal log file relates to. For example, this information about the time period may be recorded in a file attributes record of the metadata or elsewhere.
  • the journal log file 150 further includes details of metadata operations 150C-1, 150C-2, 150C-3...150C-N that have been carried out on the file system in the period of time which the journal log file relates to. This may include metadata operations carried out on a plurality of different user files in this period of time.
  • the primary storage system includes machine readable instructions 180, which may include a file system manager 182 for implementing various functions of the storage system 100.
  • the file system manager 182 may handle reading and writing to the primary storage system in response to read and write requests from a client 10.
  • the file system manager 182 may also handle writing to the journal log 170 and creation of the journal log files 150, 160.
  • the file system manager 182 may operate in accordance with the method of Figure 3 to maintain metadata of the user files, maintain the journal log and maintain the counters 130B, 140B etc. in the various files' metadata.
  • the machine readable instructions 180 may also include various other applications, for instance a disaster recovery (DR) application 184 and an Express Query (EQ) application 186.
  • the DR application copies data from the primary storage system 100 to the secondary storage system 200, so that the secondary storage system is available in the case of a disaster which renders the primary storage system inoperable or inaccessible.
  • the data may be copied from the primary storage system to the secondary storage system based on the contents of the journal log file. For example, metadata operations included in a journal log file 150, 160 of the primary storage system may be applied to the secondary storage system so that changes to existing user files, or creation of new user files, on the primary storage system are applied to the secondary storage system.
  • journal log file may be deleted from the primary storage system. For example, a journal log file may be deleted by de-allocating its file space so that it is eventually overwritten.
  • the secondary storage system 200 has a number of user files 230, 240 which are copies of the user files 130, 140 of the primary storage system and created based on the journal log files 150, 160 of the primary storage system.
  • a processor 210 of the secondary storage system may execute machine readable instructions 280 including a secondary file system manager 282 to construct the user files 230, 240 based on the journal log files 150, 160.
  • Each user file 230, 240 may include metadata 230A, 240A and user data 230C, 240C as is the case for the corresponding user files of the primary storage system.
  • the Express Query (EQ) application 186 is an application that carries out certain storage administration and analytics based on file system metadata. In particular it may facilitate analysis of the use of the storage system over a period of time. This may be used, for instance, to help make administrative decisions and/or to understand which objects are frequently accessed. In one example it may generate reports of files which were created or deleted over a particular period of time and the results may be filtered to narrow down to particular types of user, types of file etc. This analysis may be based on the journal log files.
  • DR, EQ and other applications may thus rely on one journal log file or a plurality of journal log files.
  • the journal log file may become corrupted or inaccurate.
  • conventional systems often lack safeguards for the integrity of the journal log file, even though the journal log file plays an important role in critical applications such as disaster recovery.
  • a count of the metadata operations carried out on each user file is stored in the metadata of the user file.
  • there may be a plurality of counters in the user file metadata, each counter counting a respective type of metadata operation.
  • the counters may, for example, be kept in a journal attributes record 440 of the metadata as shown in Figure 4. As will be described below, this may facilitate error checking of the journal log file.
  • Figure 6 shows one example of a journal attributes record in more detail.
  • the journal attributes record may include a respective set of counters for each respective journal log file.
  • Figure 6 shows an example journal attributes record with two sets of counters: a first set 600A for a first journal log file and a second set 600B for a second journal log file.
  • Each set of counters may include a plurality of counters, one counter for each type of the plurality of possible metadata operations.
  • in Figure 6 there is shown a create counter to count the number of create file metadata operations, a make directory counter to count the number of make directory metadata operations, a remove directory counter to count the number of remove directory metadata operations, a symlink counter to count the number of soft links, a write counter to count the number of metadata write operations and a rename counter to count the number of rename file metadata operations.
  • the journal attributes record may store information indicating the time period which each respective set of counters relates to, i.e. the time period of the corresponding journal log file. For example the time period may include a start time and an end time.
  • counters in Figure 6 are just an example.
  • other counters may include counters to count any of the following metadata operations: create, make directory, remove directory, rename, link, get/set attribute, write, unlink, get/set x attribute, remove x attribute, symlink, make node and migrate.
  • the various counters mentioned above may be maintained by incrementing the counter each time the type of metadata operation which it counts is applied to the metadata of the user file.
  • Figure 7 illustrates an example method of disaster recovery that may be implemented by the primary storage system 100.
  • a journal log file 150 is created. The journal log file may for example be created by copying data from the journal log 170 to a user file.
  • a user file storing journal log data may be referred to as a journal log file 150.
  • the journal log file may include all metadata operations carried out in a particular period of time.
  • the primary storage system may create a second journal log file 160 by copying metadata operations carried out during a subsequent period of time. This copying from the journal log 170 to journal log files 150, 160 is shown schematically by the dashed arrowed lines in Figure 2.
  • the primary storage system performs a disaster recovery operation by applying metadata operations of the journal log file to the secondary storage system.
  • this may for instance be initiated by a disaster recovery application 184 of the primary storage system and implemented on the secondary storage system by a file system manager 282 of the secondary storage system which communicates with the disaster recovery application.
  • the secondary storage system may create and/or update user files based on the journal log file or journal log files so that the contents of the secondary storage system reflect those of the primary storage system.
  • the journal log file may be deleted from the primary storage system.
  • the journal log file may be effectively deleted by de-allocating its storage space, so that in time it is overwritten by new data.
  • the corresponding counters in the metadata of the user files may be deleted.
  • the part of the journal attribute records including counters 600A relating to the first journal log file may be deleted, from the metadata of each user file, after the first journal log file has been successfully applied to the secondary storage system.
  • Figure 8 shows a method of determining whether there is an error in a journal log file. This may be carried out by the DR application 184 or other instructions in the machine readable instructions 180 which are executed by processor 110.
  • the number of each type of metadata operation is calculated based on the metadata operation entries in the journal log file that is being checked. For instance, the number of create file metadata operations in the journal log file may be counted, and the number of write metadata operations in the journal log file may be counted etc.
  • the number of each type of metadata operation is calculated based on the counters in the metadata of the user files.
  • the number of create file metadata operations can be calculated by summing the create counters of the journal attributes records of each user file. The same applies to the other types of metadata operations.
  • the numbers calculated in block 810 are compared with the numbers calculated in block 820. If the results are consistent then the journal log file is consistent with the user file metadata and can be assumed to be without error. The method then proceeds to block 840 and ends. However, if the number calculated for any of the metadata operation types is different between block 810 and block 820, then the journal log file is inconsistent with the metadata of the user files. Based on this it can be determined at block 850 that an error has occurred in the journal log file.
  • Figure 9 shows an example method of correcting an error in a journal log file.
  • it is determined that the journal log file is inconsistent with the metadata of the user files. For example, this may be based on a result returned by the method of Figure 8 described above.
  • the journal log file is corrected based on the contents of the user files' metadata.
  • the appropriate correction may be determined by examining the metadata of the user files. For instance, if the number of write metadata operations in the journal log file is greater than the total according to the counters of the metadata of the user files, the method may determine which write metadata operation in the journal log file is not present in the actual metadata of the user files and then delete the incorrect write metadata operation from the journal log file. Similarly, if the number of write metadata operations in the journal log file is less than the total write metadata operations according to the counters of the metadata of the user files, then it can be determined from the metadata of the user files and/or the counters, which write metadata operations are missing from the journal log file. The missing write metadata operations may then be added to the journal log file. If the period of time which any one journal log file covers is relatively short, then the number of metadata operations in each journal log file will be relatively small, making it possible to complete this error detection and correction reasonably quickly.
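The consistency check of Figure 8 can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the function names, the tuple layout of a journal log file entry and the dictionary layout of the per-file counters are hypothetical and are not taken from the disclosure.

```python
from collections import Counter


def counts_from_journal_file(entries):
    """Count each metadata operation type among the journal log file entries."""
    return Counter(op_type for op_type, _file_id in entries)


def counts_from_user_files(per_file_counters):
    """Sum the per-type counters kept in the metadata of each user file."""
    total = Counter()
    for counters in per_file_counters.values():
        total.update(counters)  # adds the counts of this file to the running total
    return total


def find_mismatches(entries, per_file_counters):
    """Return {op_type: (journal_count, metadata_count)} for types that differ."""
    journal = counts_from_journal_file(entries)
    metadata = counts_from_user_files(per_file_counters)
    return {
        op: (journal[op], metadata[op])
        for op in sorted(set(journal) | set(metadata))
        if journal[op] != metadata[op]
    }


# Hypothetical data: entries are (operation type, file id) pairs.
# One write entry for file 1 is missing from the journal log file.
journal_file = [("create", 1), ("write", 1), ("create", 2)]
file_counters = {1: {"create": 1, "write": 2}, 2: {"create": 1}}
mismatches = find_mismatches(journal_file, file_counters)
```

An empty result would correspond to a journal log file that is consistent with the user file metadata; a non-empty result identifies the operation types to examine when correcting the journal log file.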

Abstract

The present invention is a non-transitory computer readable medium, storing machine readable instructions that are executable by a processor to receive an instruction to write data to a user file of a storage system, write the data and corresponding metadata to the user file, write a metadata operation entry to a journal log, and maintain, in the metadata of the user file, a count of each type of a plurality of metadata operation types that have been performed on the user file.

Description

COUNT OF METADATA OPERATIONS
BACKGROUND
[0001] Storage systems typically store data in user files. The user files may be written on a non-transitory computer readable storage medium, such as hard disks, flash memory, tape etc. A user file may include data and metadata. The data includes the information which the user wishes to store in the file, while the metadata may include information relating to the size, location and other attributes of the file.
[0002] Some file systems keep a record of certain changes to the files in a journal log, which is also known as a transaction log. This can be helpful in re-building the files in the case of a system error. Some file systems employ both a primary storage system and a secondary storage system for redundancy. The contents of the primary storage system may be copied to the secondary storage system so that the secondary storage system may be used as a backup if the primary storage system fails.
BRIEF DESCRIPTION OF THE DRAWINGS
[0003] Examples will now be described, by way of non-limiting example only, with reference to the accompanying drawings, in which:
Figure 1 shows a schematic example of a storage system;
Figure 2 shows a schematic example of a storage system in more detail;
Figure 3 shows an example method of storing metadata;
Figure 4 shows an example of file metadata;
Figure 5 shows an example journal log file;
Figure 6 shows an example of a journal attributes record for some example journal log files;
Figure 7 shows an example method which uses a journal log file;
Figure 8 shows an example method of determining whether there is an error in a journal log file; and
Figure 9 shows an example method of correcting an error in a journal log file.
DETAILED DESCRIPTION
[0004] For simplicity and illustrative purposes, the present disclosure is described by referring mainly to an example thereof. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present disclosure. It will be readily apparent however, that the present disclosure may be practiced without limitation to these specific details. In other instances, some methods and structures have not been described in detail so as not to unnecessarily obscure the present disclosure. Throughout the present disclosure, the terms "a", "an" and "a number of" are intended to denote at least one of a particular element. As used herein, the term "includes" means includes but not limited to, the term "including" means including but not limited to. The term "based on" means based at least in part on.
[0005] Figure 1 is a schematic diagram showing an example of a storage system 100 according to the present disclosure. The storage system 100 includes a processor 110 and a non-transitory computer readable storage medium 120. The storage medium 120 stores a number of user files 130, 140, a journal log 170 and machine readable instructions 180 that are executable by the processor 110.
[0006] The storage medium 120 may include non-volatile storage such as a hard disk, an array of hard disks, flash memory or tape storage. The storage medium may further include volatile storage such as volatile random access memory (RAM) or dynamic random access memory (DRAM). Volatile storage may be used as a buffer to temporarily store data that is about to be written to non-volatile memory, or which has just been read from non-volatile memory. For convenience the storage medium 120 is shown as a single part in Figure 1, but in practice may include one, several or all of the different types of storage mentioned above and may be distributed across several physical devices.
[0007] The storage medium 120 stores a number of user files which form a file system. Figure 1 shows a plurality of files, including a first user file 130 and an Nth user file 140. Each user file includes data (not shown explicitly in Figure 1) and metadata 130A, 140A relating to the user file. The metadata 130A, 140A of a user file 130, 140 may for example include a name, size, location and/or other attributes of the user file.
[0008] The file system also includes a journal log 170 which is stored on the storage medium 120. The journal log 170 is not a user file. Rather the journal log stores a list of changes that have been made to the file system. For instance the journal log 170 may include metadata operations which describe changes made to the metadata of the user files. There may be a plurality of different possible metadata operations and each metadata operation entry may indicate the type of operation and other information. In one example, the journal log may be implemented as a circular buffer. As it includes a list of changes made to the file system, the journal log may be used to replay recent changes to a file system if recent changes are lost due to a system failure. The journal log may be stored in volatile memory and periodically flushed to a non-volatile storage.
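As a rough illustration of a journal log kept as a circular buffer, the following Python sketch uses a fixed-capacity deque. The class names, the entry fields and the capacity are assumptions made for illustration only; the disclosure does not specify an entry layout or a buffer size.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class JournalEntry:
    op_type: str  # hypothetical type label, e.g. "create", "write", "rename"
    file_id: int  # identifier of the user file the operation applies to
    detail: str   # free-form description of the metadata change


class JournalLog:
    """Fixed-capacity journal; once full, the oldest entry is overwritten."""

    def __init__(self, capacity: int) -> None:
        self._entries = deque(maxlen=capacity)

    def append(self, entry: JournalEntry) -> None:
        self._entries.append(entry)  # silently drops the oldest entry when full

    def replay(self):
        """Return the retained entries, oldest first, for re-applying changes."""
        return list(self._entries)


log = JournalLog(capacity=3)
for i in range(5):
    log.append(JournalEntry("write", file_id=1, detail=f"size={i}"))
retained = [e.detail for e in log.replay()]
```

Because only the most recent entries are retained, a circular journal of this kind can replay recent changes but not the full history, which is one reason its contents might be copied elsewhere before they are overwritten.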
[0009] The journal log 170 is accessible by a journal sub-system of the file system, but it is not a user file and therefore may not be easily accessible to other applications which use the file system. Therefore in some cases the journal log may be copied into a user file to create a journal log file. A journal log file is like a user file and may be accessible in the same manner as a user file, but stores journal log data instead of user data. The journal log file is a file created by the file system, rather than a user, for the specific purpose of storing journal log data.
[0010] The journal log file may be conveniently accessed by other applications, such as disaster recovery, express query etc. However, because the journal log file is not a reserved system file or data sector, the integrity of the journal log file may not be ensured in the case of system failure or write errors.
[0011] Accordingly, the present disclosure proposes maintaining, in the metadata of each user file, a count of each type of a plurality of metadata operation types that have been performed on a user file. In this way the integrity of the journal log file may be checked by comparing its contents with the counts in the metadata of the user files.
[0012] In the example of Figure 1, the metadata 130A, 140A of each user file includes a counter 130B, 140B for counting each type of metadata operation.
[0013] The machine readable instructions 180 are instructions that are executable by the processor 110 to manage certain aspects of the storage system. For example the instructions may include instructions to handle writing data to the file system and reading data from the file system. In one example the instructions include instructions to write a metadata operation to the journal log and to maintain, in metadata of a user file, a count of each type of metadata operation performed on the user file.
[0014] Figure 3 shows an example method that may be implemented by the processor 110 of the storage system 100 shown in Figure 1.
[0015] At block 310 the storage system receives an instruction to write data to a user file.
[0016] At block 320 in response to the instruction received at block 310, the storage system writes the data and corresponding metadata to the user file.
[0017] For example the metadata may include information about the user file, such as its size and location etc. The metadata may be updated as the file is changed and more data is written to or deleted from the file, to reflect changes in size, name, location, disk sectors which are occupied etc.
[0018] At block 330 the storage system writes a metadata operation entry to a journal log.
[0019] At block 340 the storage system maintains, in the metadata of the user file, a count of each type of a plurality of metadata operation types that have been performed on the user file.
[0020] The metadata operation entry which is written to the journal log in block 330 may correspond to the metadata written to the user file in block 320. Each metadata operation may have a type. Over time as more data and metadata is written to the file system, the journal log may include a plurality of metadata operations and some of the metadata operations may have different types.
[0021] While the method of Figure 3 is initiated by the instruction to write data to a user file, the other blocks may be carried out in any order and are not limited to the particular sequence shown in Figure 3, which is merely by way of example.
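The blocks of Figure 3 can be illustrated with a minimal in-memory sketch. This is not the disclosed implementation: the class names `UserFile` and `StorageSystem`, the dictionary layout of a journal entry and the `journal_capacity` parameter are all assumptions made for illustration, and a bounded `deque` merely stands in for the circular-buffer journal log mentioned above.

```python
from collections import Counter, deque
from dataclasses import dataclass, field


@dataclass
class UserFile:
    """Hypothetical in-memory stand-in for a user file (data plus metadata)."""
    name: str
    data: bytearray = field(default_factory=bytearray)
    size: int = 0                                         # ordinary metadata
    op_counts: Counter = field(default_factory=Counter)   # plays the role of counter 130B/140B


class StorageSystem:
    """Toy model of the method of Figure 3 (names and layout are assumptions)."""

    def __init__(self, journal_capacity=1024):
        self.files = {}
        # Journal log modelled as a circular buffer: the oldest entries drop off.
        self.journal = deque(maxlen=journal_capacity)

    def _record(self, user_file, op_type):
        # Block 330: write a metadata operation entry to the journal log.
        self.journal.append({"type": op_type, "file": user_file.name})
        # Block 340: maintain the per-type count in the user file's metadata.
        user_file.op_counts[op_type] += 1

    def write(self, name, payload):
        """Block 310: handle an instruction to write data to a user file."""
        user_file = self.files.get(name)
        if user_file is None:
            user_file = self.files[name] = UserFile(name)
            self._record(user_file, "create")
        user_file.data.extend(payload)          # block 320: write the data ...
        user_file.size = len(user_file.data)    # ... and the corresponding metadata
        self._record(user_file, "write")
```

In this sketch the journal entry and the counter are always updated together in `_record`, which is what later allows the two to be compared.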
[0022] Figure 2 shows an example which includes a primary storage system 100 and a secondary storage system 200. A client 10 may send write requests to write data to the primary storage system and read requests to read data from the primary storage system. The client may be a user device, such as a computer, a server, a mobile device etc. The client 10 may connect to the primary storage system over a network. The client may likewise connect to the secondary storage system over a network. If the primary storage system 100 fails, or the connection of the client to the primary storage system fails, then the secondary storage system may take over and the client may read data from and write data to the secondary storage system instead.
[0023] Thus the secondary storage system acts as a backup, which may be used in case the primary storage fails, or becomes inaccessible due to a network failure etc. The secondary storage may be located in the same building or a different building to the primary storage. Data is copied from the primary storage system 100 to the secondary storage system 200. Data may be copied periodically, on demand, or based on the volume of data, or volume of changes to the data or otherwise. In this way the primary and secondary storage systems may be synchronized so that they contain the same information or the secondary storage system may be updated so that it is just a little behind the primary storage system. Generally data is copied frequently, for instance every 5 seconds, so that the secondary storage is not too far behind the primary storage at any point in time. This copying of data from the primary storage system to the secondary storage system may be referred to as disaster recovery, as it enables the user to access data in the case that a disaster renders the primary storage system inoperable or inaccessible.
[0024] The primary storage system 100 of Figure 2 is similar to the storage system 100 of Figure 1, and so the same reference numerals are used. However, merely by way of example, some more specific details of the structure of the file system in Figure 2 are shown.
[0025] The storage system 100 of Figure 2 stores a file system including a plurality of user files 130, 140. While just two user files 130, 140 are shown in Figure 2, it is to be understood that there may be any number of user files. Each user file 130, 140 includes metadata 130A, 140A and user data 130C, 140C. The metadata 130A, 140A includes a counter 130B, 140B for counting a number of each type of metadata operation which is performed on the user file.
[0026] Figure 4 shows an example of user file metadata 400. The metadata may include various records including for instance any one, any combination or all of the following: a file ID record 410, a file attributes record 420, a file extent record 430 and a journal attributes record 440. There may also be other records such as quota information (not shown). The file ID record 410 may for example include a tag number or inode number identifying the file. The file attributes record 420 may include information about the file size, mode and/or number of links. The file extent record 430 may include information about the file user data. The journal attributes record may include counters of metadata operations performed on the file and is described in more detail later.
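As a rough illustration of the record layout of Figure 4, the metadata could be modelled as a set of small record types. The field names, default values and the representation of extents as (start block, length) pairs are assumptions chosen for this sketch; the disclosure does not prescribe any of them.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class FileIdRecord:
    """Record 410: identifies the file, e.g. by inode number."""
    inode: int


@dataclass
class FileAttributesRecord:
    """Record 420: file size, mode and number of links."""
    size: int = 0
    mode: int = 0o644
    nlink: int = 1


@dataclass
class FileExtentRecord:
    """Record 430: where the user data lives (assumed: (start_block, length) pairs)."""
    extents: list = field(default_factory=list)


@dataclass
class JournalAttributesRecord:
    """Record 440: one counter per metadata operation type."""
    counters: Counter = field(default_factory=Counter)


@dataclass
class UserFileMetadata:
    """Metadata 400 of a user file, composed of the records above."""
    file_id: FileIdRecord
    attributes: FileAttributesRecord = field(default_factory=FileAttributesRecord)
    extent: FileExtentRecord = field(default_factory=FileExtentRecord)
    journal: JournalAttributesRecord = field(default_factory=JournalAttributesRecord)
```

Using a `Counter` for the journal attributes means that an operation type which has never been performed simply reads as zero.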
[0027] The file system includes a journal log 170, which may also be referred to as a transaction log. The journal log 170 stores information about each metadata operation which is performed on the user files of the file system. For example, before or after a metadata operation is performed to write metadata to a user file, details of the metadata operation may be recorded in the journal log 170. The journal log may thus be useful for rebuilding the file system in the event of a system failure.
[0028] The journal log 170 may be accessible by a journal log subsystem of the file system. However, as the journal log 170 is not a user file, it may not be easily accessible to other applications. Therefore contents of the journal log 170 may be copied to a journal log file. There may be a plurality of journal log files each covering a respective period of time. Figure 2 shows two journal log files: a first journal log file 150 and a second journal log file 160. However, this is just by way of example and at any one time there may be more or fewer journal log files.
[0029] The creation and maintenance of the journal log files may be carried out by the file system, for instance by processor 110 executing machine readable instructions 180 for managing the file system. The journal log files 150, 160 are like user files and accessible to applications in the same way as user files, but are automatically created by the file system, rather than in response to an instruction from a client to store user data. The journal log files 150, 160 may be created by copying data from the journal log 170 periodically, or after a certain number of transactions or at certain intervals. Each journal log file 150, 160 may have the same structure as a user file, for instance including metadata 150A, 160A and data 150C, 160C. The data 150C, 160C of each journal log file is data which has been copied from the journal log 170 and covers metadata operations carried out on the file system during a particular period of time. The metadata 150A, 160A may indicate the period of time which the journal log file relates to.
[0030] Figure 5 shows an example of a journal log file 150 in more detail. It includes metadata 150A, which may for example be as shown in Figure 4. The metadata 150A may also indicate the time period which the journal log file relates to. For example, this information about the time period may be recorded in a file attributes record of the metadata or elsewhere. The journal log file 150 further includes details of metadata operations 150C-1, 150C-2, 150C-3...150C-N that have been carried out on the file system in the period of time which the journal log file relates to. This may include metadata operations carried out on a plurality of different user files in this period of time.
[0031] The primary storage system includes machine readable instructions 180, which may include a file system manager 182 for implementing various functions of the storage system 100. For instance the file system manager 182 may handle reading and writing to the primary storage system in response to read and write requests from a client 10. The file system manager 182 may also handle writing to the journal log 170 and creation of the journal log files 150, 160. The file system manager 182 may operate in accordance with the method of Figure 3 to maintain metadata of the user files, maintain the journal log and maintain the counters 130B, 140B etc in the various files' metadata.
[0032] The machine readable instructions 180 may also include various other applications, for instance a disaster recovery (DR) application 184 and an Express Query (EQ) application 186. The DR application copies data from the primary storage system 100 to the secondary storage system 200, so that the secondary storage system is available in the case of a disaster which renders the primary storage system inoperable or inaccessible. The data may be copied from the primary storage system to the secondary storage system based on the contents of the journal log file. For example, metadata operations included in a journal log file 150, 160 of the primary storage system may be applied to the secondary storage system so that changes to existing user files, or creation of new user files, on the primary storage system are applied to the secondary storage system. This may be done periodically, on demand or after the journal log files have reached a certain size etc. Once the metadata operations in a journal log file have been applied to the secondary storage system, the journal log file may be deleted from the primary storage system. For example, a journal log file may be deleted by de-allocating its file space so that it is eventually overwritten.
[0033] As shown in Figure 2, the secondary storage system 200 has a number of user files 230, 240 which are copies of the user files 130, 140 of the primary storage system and created based on the journal log files 150, 160 of the primary storage system. For example, a processor 210 of the secondary storage system may execute machine readable instructions 280 including a secondary file system manager 282 to construct the user files 230, 240 based on the journal log files 150, 160. Each user file 230, 240 may include metadata 230A, 240A and user data 230C, 240C as is the case for the corresponding user files of the primary storage system.
[0034] The Express Query (EQ) application 186 is an application that carries out certain storage administration and analytics based on file system metadata. In particular it may facilitate analysis of the use of the storage system over a period of time. This may be used, for instance, to help make administrative decisions and/or to understand which objects are frequently accessed. In one example it may generate reports of files which were created or deleted over a particular period of time and the results may be filtered to narrow down to particular types of user, types of file etc. This analysis may be based on the journal log files.
[0035] DR, EQ and other applications may thus rely on one journal log file or a plurality of journal log files. However, if there is an error in writing the journal log file, or a system failure, then the journal log file may become corrupted or inaccurate. As the journal log file is not a reserved file, conventional systems often lack safeguards for the integrity of the journal log file, even though the journal log file plays an important role in critical applications such as disaster recovery.
[0036] However, in examples described herein, a count of the metadata operations carried out on each user file is stored in the metadata of the user file. For example, there may be a plurality of counters in the user file metadata, each counter corresponding to a particular type of metadata operation. The counters may, for example, be kept in a journal attributes record 440 of the metadata as shown in Figure 4. As will be described below, this may facilitate error checking of the journal log file.
[0037] Figure 6 shows one example of a journal attributes record in more detail. The journal attributes record may include a respective set of counters for each respective journal log file. Figure 6 shows an example journal attributes record with two sets of counters: a first set 600A for a first journal log file and a second set 600B for a second journal log file. Each set of counters may include a plurality of counters, one counter for each type of the plurality of possible metadata operations. Thus in Figure 6, there is shown a create counter to count the number of create file metadata operations, a make directory counter to count the number of make directory metadata operations, a remove directory counter to count the number of remove directory metadata operations, a symlink counter to count the number of soft links, a write counter to count the number of metadata write operations and a rename counter to count the number of rename file metadata operations. Likewise there are a create counter, make directory counter, remove directory counter, symlink counter, write counter and rename counter for the second journal log file. The journal attributes record may store information indicating the time period which each respective set of counters relates to, i.e. the time period of the corresponding journal log file. For example the time period may include a start time and an end time.
[0038] The counters in Figure 6 are just an example. In other examples, other counters may include counters to count any of the following metadata operations: create, make directory, remove directory, rename, link, get/set attribute, write, unlink, get/set x attribute, remove x attribute, symlink, make node and migrate. The various counters mentioned above may be maintained by incrementing the counter each time the type of metadata operation which it counts is applied to the metadata of the user file.
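One way to picture the per-journal-log-file counter sets of Figure 6 is sketched below. The names `CounterSet` and `JournalAttributesRecord.increment`, the use of numeric timestamps and the choice of a half-open period (start inclusive, end exclusive) are assumptions for illustration only.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class CounterSet:
    """Counters for one journal log file (e.g. set 600A or set 600B)."""
    start: float   # start time of the journal log file's period
    end: float     # end time of the period (treated as exclusive, by assumption)
    counts: Counter = field(default_factory=Counter)


@dataclass
class JournalAttributesRecord:
    """Journal attributes record holding one counter set per journal log file."""
    sets: list = field(default_factory=list)

    def increment(self, op_type, when):
        """Increment the op_type counter in the set whose period covers `when`."""
        for counter_set in self.sets:
            if counter_set.start <= when < counter_set.end:
                counter_set.counts[op_type] += 1
                return
        raise LookupError(f"no counter set covers time {when}")
```

Keeping the time period next to each set lets a counter set be matched to, and later discarded with, the journal log file it corresponds to.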
[0039] Figure 7 illustrates an example method of disaster recovery that may be implemented by the primary storage system 100.
[0040] At block 710 the primary storage system creates a journal log file 150. The journal log file may for example be created by copying data from the journal log 170 to a user file. A user file storing journal log data may be referred to as a journal log file 150. The journal log file may include all metadata operations carried out in a particular period of time. At a later time, the primary storage system may create a second journal log file 160 by copying metadata operations carried out during a subsequent period of time. This copying from the journal log 170 to journal log files 150, 160 is shown schematically by the dashed arrowed lines in Figure 2.
[0041] At block 720, the primary storage system performs a disaster recovery operation by applying metadata operations of the journal log file to the secondary storage system. This is schematically shown by the solid arrowed lines in Figure 2 and may for instance be initiated by a disaster recovery application 184 of the primary storage system and implemented on the secondary storage system by a file system manager 282 of the secondary storage system which communicates with the disaster recovery application.
[0042] Thus, the secondary storage system may create and/or update user files based on the journal log file or journal log files so that the contents of the secondary storage system reflect those of the primary storage system.
[0043] At block 730 once the contents of a journal log file have been successfully applied to the secondary storage system, the journal log file may be deleted from the primary storage system. For example, the journal log file may be effectively deleted by de-allocating its storage space, so that in time it is overwritten by new data. In addition to deallocating the journal log file, the corresponding counters in the metadata of the user files may be deleted. For example, with reference to Figure 6, the part of the journal attribute records including counters 600A relating to the first journal log file may be deleted, from the metadata of each user file, after the first journal log file has been successfully applied to the secondary storage system.
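The three blocks of Figure 7 can be sketched as a single cycle. Everything concrete here is assumed for illustration: the secondary storage system is reduced to a dictionary of file contents, journal entries are dictionaries with `type`, `file` and optional `payload` keys, only create and write operations are replayed, and `counters` maps each file name to its counter sets keyed by a hypothetical journal log file identifier `log_id`.

```python
from collections import Counter


def apply_op(secondary, op):
    """Replay one metadata operation on the secondary (toy semantics, assumed)."""
    if op["type"] == "create":
        secondary[op["file"]] = b""
    elif op["type"] == "write":
        secondary[op["file"]] = secondary.get(op["file"], b"") + op["payload"]


def disaster_recovery_cycle(journal_log, journal_log_files, counters, secondary, log_id):
    # Block 710: create a journal log file by copying the journal log contents.
    journal_log_files[log_id] = list(journal_log)
    journal_log.clear()
    # Block 720: apply the journal log file's operations to the secondary system.
    for op in journal_log_files[log_id]:
        apply_op(secondary, op)
    # Block 730: delete the journal log file and the counter sets that relate to it.
    del journal_log_files[log_id]
    for per_file_sets in counters.values():
        per_file_sets.pop(log_id, None)
```

In a real system the deletion in block 730 would only follow a confirmed, successful application to the secondary storage system; the sketch omits that acknowledgement.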
[0044] Figure 8 shows a method of determining whether there is an error in a journal log file. This may be carried out by the DR application 184 or other instructions in the machine readable instructions 180 which are executed by processor 110.
[0045] At block 810 the number of each type of metadata operation is calculated based on the metadata operation entries in the journal log file that is being checked. For instance, the number of create file metadata operations in the journal log file may be counted, and the number of write metadata operations in the journal log file may be counted etc.
[0046] At block 820 the number of each type of metadata operation is calculated based on the counters in the metadata of the user files.
[0047] For instance, the number of create file metadata operations can be calculated by summing the create counters of the journal attributes records of each user file. The same applies to the other types of metadata operations.
[0048] At block 830 the number of each type of metadata operation from block 810 is compared with the numbers calculated in block 820. If the results are consistent then the journal log file is consistent with the user file metadata and can be assumed to be without error. The method then proceeds to block 840 and ends. However, if the number calculated for any of the metadata operation types is different between block 810 and block 820, then the journal log file is inconsistent with the metadata of the user files. Based on this it can be determined at block 850 that an error has occurred in the journal log file.
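The comparison of Figure 8 amounts to tallying operation types twice and comparing the tallies. The following is a minimal sketch under assumed data shapes: a journal log file is taken to be a list of dictionaries with a `type` key, and `user_file_counters` maps each user file name to a `Counter` of operation types. The function names are hypothetical.

```python
from collections import Counter


def count_journal_ops(journal_log_file):
    """Block 810: tally operation types from the journal log file entries."""
    return Counter(entry["type"] for entry in journal_log_file)


def count_metadata_ops(user_file_counters):
    """Block 820: sum the per-type counters across the metadata of all user files."""
    total = Counter()
    for counts in user_file_counters.values():
        total.update(counts)
    return total


def journal_log_file_has_error(journal_log_file, user_file_counters):
    """Block 830: a mismatch for any operation type indicates an error (block 850)."""
    from_log = count_journal_ops(journal_log_file)
    from_metadata = count_metadata_ops(user_file_counters)
    all_types = set(from_log) | set(from_metadata)
    return any(from_log[t] != from_metadata[t] for t in all_types)
```

Note that matching totals show only that the counts agree; as the description says, the journal log file is then assumed, not proven, to be without error.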
[0049] Figure 9 shows an example method of correcting an error in a journal log file.
[0050] At block 910 it is determined that the journal log file is inconsistent with the metadata of the user files. For example, this may be based on a result returned by the method of Figure 8 described above.
[0051] At block 920 the journal log file is corrected based on the contents of the user files' metadata.
[0052] For example, if the number of write metadata operations in the journal log file is found to be more than the total number of write metadata operations according to the counters in the metadata for the user files, then the appropriate correction may be determined by examining the metadata of the user files. For instance, the method may determine which write metadata operation in the journal log file is not present in the actual metadata of the user files and then delete the incorrect write metadata operation from the journal log file. Similarly, if the number of write metadata operations in the journal log file is less than the total write metadata operations according to the counters of the metadata of the user files, then it can be determined from the metadata of the user files and/or the counters, which write metadata operations are missing from the journal log file. The missing write metadata operations may then be added to the journal log file. If the period of time which any one journal log file covers is relatively short, then the number of metadata operations in each journal log file will be relatively small, making it possible to complete this error detection and correction reasonably quickly.
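A first step towards such a correction is to locate where the journal log file and the counters disagree. The sketch below covers only that localisation step and rests on assumptions not stated in the description: each journal entry is a dictionary naming both the operation `type` and the `file` it was applied to, and `user_file_counters` maps file names to per-type `Counter` objects. How a surplus entry is chosen for deletion, or how a missing entry is reconstructed, depends on metadata details that are not specified here.

```python
from collections import Counter


def find_discrepancies(journal_log_file, user_file_counters):
    """Return {(file, op_type): delta} for every mismatch.

    delta > 0: the journal log file holds surplus entries to be deleted.
    delta < 0: entries are missing from the journal log file and are to be added.
    """
    in_log = Counter((e["file"], e["type"]) for e in journal_log_file)
    in_metadata = Counter()
    for name, counts in user_file_counters.items():
        for op_type, n in counts.items():
            in_metadata[(name, op_type)] += n
    keys = set(in_log) | set(in_metadata)
    return {k: in_log[k] - in_metadata[k] for k in keys if in_log[k] != in_metadata[k]}
```

Because the counters are kept per user file, the search for the faulty entry can be narrowed to the files reported by this comparison instead of scanning the whole file system.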
[0053] All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and/or all of the steps of any method or process so disclosed, may be combined in any combination, except combinations where at least some of such features and/or steps are mutually exclusive.
[0054] Each feature disclosed in this specification (including any accompanying claims, abstract and drawings), may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise. Thus, unless expressly stated otherwise, each feature disclosed is one example only of a generic series of equivalent or similar features.

Claims

WHAT IS CLAIMED IS:
1. A non-transitory computer readable storage medium, storing machine readable instructions that are executable by a processor to:
receive an instruction to write data to a user file of a storage system;
write the data and corresponding metadata to the user file;
write a metadata operation entry to a journal log; and
maintain in the metadata of the user file a count of each type of a plurality of metadata operation types that have been performed on the user file.
2. The non-transitory computer readable storage medium of claim 1 wherein the metadata for each user file includes a journal attributes record including a plurality of counters, each counter relating to a respective type of metadata operation.
3. The non-transitory computer readable storage medium of claim 2 wherein the metadata for each user file includes a plurality of other records in addition to the journal attributes record, the other records including a file attributes record and a file extent record.
4. The non-transitory computer readable storage medium of claim 1 wherein the type of metadata operations are selected from the list comprising create, make directory, remove directory, rename, write, truncate, symlink and unlink.
5. The non-transitory computer readable storage medium of claim 1 wherein machine readable instructions include instructions to create a journal log file by copying data from the journal log to a user file that is accessible by higher level functions.
6. The non-transitory computer readable storage medium of claim 5 wherein the journal log file is stored in a primary storage system and the machine readable instructions include instructions to perform a disaster recovery operation by applying metadata operations stored in the journal log file to a secondary storage system so as to recreate user files of the primary storage system on the secondary storage system.
7. The non-transitory computer readable storage medium of claim 5 wherein the instructions includes instructions to create a plurality of journal log files, wherein each journal log file corresponds to a period in time and each journal log file includes a start time and an end time, and wherein the journal attribute record of each user file includes separate counters for each journal log file.
8. A storage system comprising a processor and a non-transitory computer readable storage medium storing a file system, the storage system comprising the file system includes:
a plurality of user files storing user data and metadata relating to the user data;
a journal log storing changes made to the metadata of the user files during a period of time,
wherein the metadata of each user file includes information relating to the user data and a count of each type of metadata operation performed on the metadata of the user file in a period of time, and wherein, in response to receiving an instruction to write data to a user file, the processor is to write a metadata entry to the journal log indicating a type of metadata operation and to increment a counter in the metadata of the user file, the counter indicating a number times said type of metadata operation has been performed on the user file.
9. The storage system of claim 8 wherein the processor is to copy contents of the journal log into a journal log file, wherein the journal log file is a user file that is accessible by a disaster recovery process.
10. The storage system of claim 9 wherein the journal log file is stored on a primary storage system and wherein the machine readable instructions include instructions to perform a disaster recovery operation including causing the metadata operations of the journal log file of the primary storage system to be applied to a secondary storage system.
11. The storage system of claim 9 wherein the machine readable instructions include instructions to delete the journal log file and delete counters in the metadata that relate to metadata operations included in the journal log file after a disaster recovery operation has been performed on the journal log file.
12. The storage system of claim 9 wherein the processor includes instructions to periodically copy the journal log to a journal log file in order to create a plurality of journal log files, each corresponding to a respective time period and wherein the metadata of each user file includes separate counters for each respective journal log file.
13. The storage system of claim 8 wherein the type of metadata operations are selected from the list comprising create, make directory, remove directory, rename, get attribute, set attribute, write, truncate, migrate, symlink and unlink.
14. A method of determining an error in a journal log file of a file system, the method comprising:
examining a journal log file including a plurality of metadata operations;
counting the number of each type of metadata operation in the journal log file;
examining a journal attribute record in metadata of each user file in the file system and counting the total number of each type of metadata operation in the journal attribute records of the metadata of the user files; and
determining that there is an error in the journal log file in response to determining that the number of each type of metadata operation in the journal log file is not the same as the total number of each type of metadata operation in the journal attribute records of the metadata of the user files.
15. The method of claim 14 further comprising referring to the user file metadata to determine which metadata operations are missing from the journal log file and updating the journal log file accordingly.
PCT/US2016/014785 2015-11-19 2016-01-25 Count of metadata operations WO2017087015A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN6240CH2015 2015-11-19
IN6240/CHE/2015 2015-11-19

Publications (1)

Publication Number Publication Date
WO2017087015A1 true WO2017087015A1 (en) 2017-05-26

Family

ID=58717582

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2016/014785 WO2017087015A1 (en) 2015-11-19 2016-01-25 Count of metadata operations

Country Status (1)

Country Link
WO (1) WO2017087015A1 (en)

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16866772

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16866772

Country of ref document: EP

Kind code of ref document: A1