US20100332533A1 - Disaster Recovery for Databases - Google Patents
- Publication number
- US20100332533A1 (US12/493,794)
- Authority
- US
- United States
- Prior art keywords
- logs
- computer system
- sessions
- processor unit
- computer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/20—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
- G06F11/2097—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements maintaining the standby controller/processing unit updated
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/20—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
- G06F11/202—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant
- G06F11/2038—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant with a single idle spare processing component
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3409—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
- G06F2201/81—Threshold
Definitions
- the present disclosure relates generally to an improved data processing system and, more specifically, to a method and apparatus for processing data. Still more specifically, the present disclosure relates to a method and apparatus for backing up data for use in disaster recovery.
- a backup device may be, for example, a tape drive, an optical disk drive, a hard drive, or other suitable devices.
- a backup device is attached to, integrated as part of, or otherwise associated with the computer on which the data is located.
- This type of local backup provides a capability to restore data in the event that a hardware failure occurs in the computer.
- This type of backup does not provide a suitable mechanism for restoring data in the event of a disaster.
- a disaster may be an event in which an extensive failure occurs. This failure may involve the loss or destruction of the computer and the backup device. Further, a disaster also may involve damage or loss of other equipment in the same location as the computer. For example, a disaster may be caused by environmental hazards, such as fire, flood, earthquake, power outages, malicious acts, operator errors, or other similar actions.
- One solution is to use a remote backup device. For example, data on a computer may be backed up to a remote site. A backup process may copy data from a computer and send it to a remote storage device. Thus, if a disaster occurs, the data may be restored. Some data, however, may not be restored. Data that has been added to the computer since the last backup was performed is lost. Depending on how often a backup is made, the amount of data lost may not be unreasonable or crucial. Backups may be made daily, hourly, or even continuously, depending on the particular type of backup device used and the schedule that is set.
- a backup copy of data such as a secondary database
- the distance between the location of a primary database and a secondary database may reduce the likelihood that the secondary database is also affected by a disaster affecting the primary database.
- location, processing of data, and access to the data may be directed to the secondary database.
- Another manner in which databases may be synchronized involves generating a record of changes made to the primary database. These changes may then be sent asynchronously to the secondary database. These changes are oftentimes saved in a log generated at the primary database. These logs are then sent to the backup database to allow the backup database to make the same changes that occurred at the primary database.
- a method and apparatus for managing data in a database.
- a determination is made as to whether a number of logs created for a primary database located on a first computer system is ready for transfer to a second computer system.
- a number of sessions is identified based on resources available to transfer the number of logs across a network to the second computer system to form an identified number of sessions in response to a determination that the number of logs is ready to be transferred.
- the first computer system and the second computer system are in communication with the network.
- the number of logs is transferred from the first computer system to the second computer system using the identified number of sessions.
- FIG. 1 is a pictorial representation of a network of data processing systems in which illustrative embodiments may be implemented;
- FIG. 2 is a diagram of a data processing system in accordance with an illustrative embodiment;
- FIG. 3 is a diagram of a database environment in accordance with an illustrative embodiment;
- FIG. 4 is a flowchart of a process for generating a schedule to transfer logs in accordance with an illustrative embodiment; and
- FIG. 5 is a flowchart of a process for transferring logs from a primary database to a secondary database in accordance with an illustrative embodiment.
- aspects of the present invention may be embodied as a system, method, or computer program product. Accordingly, aspects of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.), or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit,” “module,” or “system.” Furthermore, aspects of the present invention may take the form of a computer program product embodied in one or more computer readable medium(s) having computer readable program code embodied thereon.
- the computer readable medium may be a computer readable signal medium or a computer readable storage medium.
- a computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing.
- a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
- a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electromagnetic, optical, or any suitable combination thereof.
- a computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
- Program code embodied on a computer readable medium may be transmitted using any appropriate medium including, but not limited to, wireless, wireline, optical fiber cable, RF, or any suitable combination of the foregoing.
- Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++, or the like, and conventional procedural programming languages, such as the “C” programming language, or similar programming languages.
- the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
- the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
- LAN: local area network
- WAN: wide area network
- Internet Service Provider: for example, AT&T, MCI, Sprint, EarthLink, MSN, GTE, etc.
- These computer program instructions also may be stored in a computer readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
- the computer program instructions also may be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus, or other devices to produce a computer implemented process such that the instructions, which execute on the computer or other programmable apparatus, provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
- With reference to FIGS. 1-2, exemplary diagrams of data processing environments are provided in which advantageous embodiments may be implemented. It should be appreciated that FIGS. 1-2 are only exemplary and are not intended to assert or imply any limitation with regard to the environments in which different embodiments may be implemented. Many modifications to the depicted environments may be made.
- FIG. 1 depicts a pictorial representation of a network of data processing systems in which advantageous embodiments may be implemented.
- Network data processing system 100 is a network of computers in which the advantageous embodiments may be implemented.
- Network data processing system 100 contains network 102 , which is the medium used to provide communications links between various devices and computers connected together within network data processing system 100 .
- Network 102 may include connections, such as wire, wireless communication links, or fiber optic cables.
- server 104 and server 106 connect to network 102 along with storage unit 108 .
- clients 110 , 112 , and 114 connect to network 102 .
- Clients 110 , 112 , and 114 may be, for example, personal computers or network computers.
- server 104 provides data, such as boot files, operating system images, and applications to clients 110 , 112 , and 114 .
- Clients 110 , 112 , and 114 are clients to server 104 in this example.
- server 104 may contain a primary database, while server 106 may contain a secondary database.
- Clients 110 , 112 , and 114 may access server 104 .
- Network data processing system 100 may include additional servers, clients, and other devices not shown.
- Program code located in network data processing system 100 may be stored on a computer recordable storage medium and downloaded to a data processing system or other device for use.
- program code may be stored on a computer recordable storage medium on server 104 and downloaded to client 110 over network 102 for use on client 110 .
- network data processing system 100 is the Internet with network 102 representing a worldwide collection of networks and gateways that use the Transmission Control Protocol/Internet Protocol (TCP/IP) suite of protocols to communicate with one another.
- TCP/IP: Transmission Control Protocol/Internet Protocol
- At the heart of the Internet is a backbone of high-speed data communication lines between major nodes or host computers consisting of thousands of commercial, governmental, educational, and other computer systems that route data and messages.
- network data processing system 100 also may be implemented as a number of different types of networks such as, for example, an intranet, a local area network (LAN), or a wide area network (WAN).
- FIG. 1 is intended as an example and not as an architectural limitation for the different illustrative embodiments.
- the computer program instructions also may be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer-implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
- Data processing system 200 is an example of a computer, such as server 104 or client 110 in FIG. 1 , in which computer usable program code or instructions implementing the processes may be located for the illustrative embodiments.
- data processing system 200 includes communications fabric 202 , which provides communications between processor unit 204 , memory 206 , persistent storage 208 , communications unit 210 , input/output (I/O) unit 212 , and display 214 .
- Processor unit 204 serves to execute instructions for software that may be loaded into memory 206 .
- Processor unit 204 may be a set of one or more processors or may be a multi-processor core, depending on the particular implementation. Further, processor unit 204 may be implemented using one or more heterogeneous processor systems, in which a main processor is present with secondary processors on a single chip. As another illustrative example, processor unit 204 may be a symmetric multi-processor system containing multiple processors of the same type.
- Memory 206 and persistent storage 208 are examples of storage devices 216 .
- a storage device is any piece of hardware that is capable of storing information such as, for example, without limitation, data, program code in functional form, and/or other suitable information either on a temporary basis and/or a permanent basis.
- Memory 206, in these examples, may be, for example, a random access memory or any other suitable volatile or non-volatile storage device.
- Persistent storage 208 may take various forms, depending on the particular implementation.
- persistent storage 208 may contain one or more components or devices.
- persistent storage 208 may be a hard drive, a flash memory, a rewritable optical disk, a rewritable magnetic tape, or some combination of the above.
- the media used by persistent storage 208 may be removable.
- a removable hard drive may be used for persistent storage 208 .
- Communications unit 210, in these examples, provides for communication with other data processing systems or devices.
- communications unit 210 is a network interface card.
- Communications unit 210 may provide communications through the use of either or both physical and wireless communications links.
- Input/output unit 212 allows for the input and output of data with other devices that may be connected to data processing system 200 .
- input/output unit 212 may provide a connection for user input through a keyboard, a mouse, and/or some other suitable input device. Further, input/output unit 212 may send output to a printer.
- Display 214 provides a mechanism to display information to a user.
- Instructions for the operating system, applications, and/or programs may be located in storage devices 216 , which are in communication with processor unit 204 through communications fabric 202 .
- the instructions are in a functional form on persistent storage 208 . These instructions may be loaded into memory 206 for execution by processor unit 204 .
- the processes of the different embodiments may be performed by processor unit 204 using computer-implemented instructions, which may be located in a memory, such as memory 206 .
- These instructions are referred to as program code, computer usable program code, or computer readable program code that may be read and executed by a processor in processor unit 204.
- the program code in the different embodiments may be embodied on different physical or computer readable storage media, such as memory 206 or persistent storage 208 .
- Program code 218 is located in a functional form on computer readable storage media 220 that is selectively removable and may be loaded onto or transferred to data processing system 200 for execution by processor unit 204 .
- Program code 218 and computer readable storage media 220 form computer program product 222 .
- Computer readable storage media 220 may include, for example, an optical or magnetic disk that is inserted or placed into a drive or other device that is part of persistent storage 208 for transfer onto a storage device, such as a hard drive, that is part of persistent storage 208 .
- Computer readable storage media 220 also may take the form of a persistent storage, such as a hard drive, a thumb drive, or a flash memory, that is connected to data processing system 200 . In some instances, computer readable storage media 220 may not be removable from data processing system 200 .
- program code 218 may be transferred to data processing system 200 using computer readable signal media.
- Computer readable signal media may be, for example, a propagated data signal containing program code 218 .
- Computer readable signal media may be an electromagnetic signal, an optical signal, and/or any other suitable type of signal. These signals may be transmitted over communications links, such as wireless communications links, an optical fiber cable, a coaxial cable, a wire, and/or any other suitable type of communications link.
- the communications link and/or the connection may be physical or wireless in the illustrative examples.
- the computer readable signal media also may take the form of non-tangible media, such as communications links or wireless transmissions containing the program code.
- program code 218 may be downloaded over a network to persistent storage 208 from another device or data processing system through a computer readable signal media for use within data processing system 200 .
- program code stored in computer readable storage media in a server data processing system may be downloaded over a network from the server to data processing system 200 .
- the data processing system providing program code 218 may be a server computer, a client computer, or some other device capable of storing and transmitting program code 218 .
- data processing system 200 may include organic components integrated with inorganic components and/or may be comprised entirely of organic components excluding a human being.
- a storage device may be comprised of an organic semiconductor.
- a storage device in data processing system 200 is any hardware apparatus that may store data.
- Memory 206 , persistent storage 208 , and computer program product 222 are examples of storage devices in a tangible form.
- a bus system may be used to implement communications fabric 202 and may be comprised of one or more buses, such as a system bus or an input/output bus.
- the bus system may be implemented using any suitable type of architecture that provides for a transfer of data between different components or devices attached to the bus system.
- a communications unit may include one or more devices used to transmit and receive data, such as a modem or a network adapter.
- a memory may be, for example, memory 206 or a cache such as found in an interface and memory controller hub that may be present in communications fabric 202 .
- the different illustrative embodiments recognize and take into account a number of different considerations. For example, the different illustrative embodiments recognize that, although the use of logs may reduce issues that may occur if the backup database cannot be reached, using logs also may result in undesirable delays and limits to the amount of available bandwidth in a network.
- a log is a file or other suitable data structure that contains changes to be made to a database.
- the different illustrative embodiments recognize and take into account that the currently used processes for recording changes in logs at one database and sending the logs to another database for updating data can be time consuming and use up network resources. For example, a primary database sends information to a secondary database one log at a time. These logs may be very large in size. For example, a log may have 800 or more megabytes.
- the number of logs present at the primary database may reduce availability or responsiveness of the database.
- the number of logs also may reduce the availability or responsiveness of the network on which the database is located. More specifically, the size and number of logs for transfer may cause other requests and transactions to be delayed or stalled. For example, the number of primary database logs and the size of the logs may cause the database to delay or stall other transactions in a manner that is noticeable by other users.
- one or more of the illustrative embodiments provide a method and apparatus for synchronizing databases.
- a determination is made as to whether a number of logs created for a primary database located on the first computer system is ready for transfer to a second computer system.
- a number of sessions is identified based on resources available to transfer the number of logs across a network to the second computer system to form an identified number of sessions.
- the first computer system and the second computer system are in communication with the network.
- the number of logs is transferred from the first computer system to the second computer system using the identified number of sessions.
- Database environment 300 is an example of an environment that may be implemented in network data processing system 100 in FIG. 1 .
- database environment 300 includes primary computer system 302 and secondary computer system 304 . These two systems are connected to network 306 .
- Primary computer system 302 is located in geographic location 308, and secondary computer system 304 is located in geographic location 310, which is remote to geographic location 308 in these examples.
- Primary computer system 302 includes number of computers 312 .
- a number, when referring to items, means one or more items.
- a number of computers is one or more computers.
- Number of computers 312 may be implemented using one or more of data processing system 200 in FIG. 2 .
- primary database 314 and database management process 316 are located on primary computer system 302 .
- Primary database 314 may be located on one computer within number of computers 312
- database management process 316 may be located on another computer.
- both primary database 314 and database management process 316 may be located on the same computer in number of computers 312 in primary computer system 302 .
- Primary database 314 is a structured collection of data stored on one or more of number of computers 312 . This data is stored in the form of records in these examples.
- Database management process 316 is a software component that manages the storage of data within primary database 314 .
- Database management process 316 handles changes 318 made to primary database 314 .
- Changes 318 may include, for example, without limitation, adding new data, deleting data, overwriting data, and other changes to primary database 314 .
- Changes 318 may be made in response to requests from clients 320 .
- primary database 314 is replicated on secondary database 324 to provide a backup of data in primary database 314 .
- secondary database 324 is used in place of primary database 314 .
- damage may occur to number of computers 312 , data may become corrupted in primary database 314 , access to number of computers 312 may be cut off or prevented on network 306 , and/or some other event may occur causing primary database 314 to be inaccessible by clients 320 .
- secondary database 324 and database management process 326 are located on number of computers 322 in secondary computer system 304 .
- Secondary computer system 304 comprises number of computers 322 .
- the computers in number of computers 322 also may be implemented using a data processing system such as, for example, data processing system 200 in FIG. 2 .
- Secondary database 324 is synchronized with primary database 314 in these illustrative examples.
- log generation process 328 in database management process 316 generates logs 330 containing changes 318 . These logs are used to propagate changes 318 from primary database 314 to secondary database 324 .
- logs 330 include number of archived logs 332 and number of active logs 334 . Number of archived logs 332 are logs that have been completed and are ready for transfer to secondary computer system 304 . Number of active logs 334 are logs that are still being written by log generation process 328 .
- a log may be considered an archived log when the log has not been changed for some period of time in an archive directory.
- log generation process 328 may generate a message indicating that a log is present as part of number of archived logs 332 .
- Log generation process 328 may send this message to transfer process 336 .
- the message may include identification of the log, such as a file name or a path to the file.
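The archived-log test described above (a log that has not changed for some period of time in an archive directory) could be sketched as follows. This is only an illustration, not the patented implementation: the function name `find_archived_logs`, the `quiet_seconds` default, and the use of file modification times are all assumptions introduced here.

```python
import time
from pathlib import Path


def find_archived_logs(archive_dir, quiet_seconds=60.0, now=None):
    """Return paths of logs in archive_dir that have not changed recently.

    Assumption: a log counts as "archived" once its modification time is
    at least quiet_seconds in the past. The source only says "some period
    of time"; the 60-second default is illustrative.
    """
    now = time.time() if now is None else now
    archived = []
    for path in sorted(Path(archive_dir).iterdir()):
        # Skip subdirectories; only regular files are treated as logs.
        if path.is_file() and now - path.stat().st_mtime >= quiet_seconds:
            archived.append(str(path))
    return archived
```

The returned paths play the role of the log identification carried in the message to the transfer process.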
- Transfer process 336 sends number of archived logs 332 to receiving process 338 in database management process 326 .
- receiving process 338 updates secondary database 324 with the changes in number of received logs 340 .
- number of archived logs 332 are transferred by transfer process 336 to receiving process 338 using number of sessions 342 .
- Transfer process 336 identifies number of sessions 342 for use in sending number of archived logs 332 to receiving process 338 in database management process 326.
- each session in number of sessions 342 may be used concurrently or simultaneously with other sessions within number of sessions 342 .
- the different sessions within number of sessions 342 may be parallel sessions.
- Number of sessions 342 may be identified using at least one of the number of logs ready to be transferred, an amount of bandwidth available, performance criteria, and other suitable factors.
- the identification of number of sessions 342 is performed based on resources 344 available to transfer number of archived logs 332 from primary computer system 302 to secondary computer system 304 over network 306 .
- Resources 344 include at least one of bandwidth 346 , processor usage 348 , storage 350 , and other suitable types of resources used to transfer data.
- Bandwidth 346 is a rate of data transfer over network 306 .
- Processor usage 348 is processor resources used on one or both of primary computer system 302 and secondary computer system 304 to transfer number of archived logs 332 .
- Storage 350 may be available storage, such as random access memory, hard disk drives, and other storage devices on these computer systems.
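One way to picture how a number of sessions might be identified from available resources is the sketch below. The source does not give a formula, so everything here is an assumption: the `Resources` fields, the per-session costs, the 50 percent bandwidth share, and the rule of always allowing at least one session. Storage is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Resources:
    # Hypothetical measurements; the source names the resource types only.
    available_bandwidth_mbps: float  # bandwidth currently free on the network
    idle_cpu_percent: float          # idle processor capacity, 0-100


def identify_sessions(num_logs, resources, per_session_mbps=100.0,
                      per_session_cpu_percent=10.0, max_bandwidth_share=0.5):
    """Pick how many parallel sessions to use for num_logs archived logs.

    The count is capped by the number of logs, by the share of bandwidth
    the transfer is allowed to use, and by idle processor capacity.
    """
    if num_logs <= 0:
        return 0
    usable_mbps = resources.available_bandwidth_mbps * max_bandwidth_share
    by_bandwidth = int(usable_mbps // per_session_mbps)
    by_cpu = int(resources.idle_cpu_percent // per_session_cpu_percent)
    # Assumption: always allow one session so logs still move when
    # resources are scarce, even if that exceeds the desired share.
    return max(1, min(num_logs, by_bandwidth, by_cpu))
```

For example, with 1000 Mbps free, 50 percent idle processor capacity, and eight logs waiting, this sketch yields five sessions; with only two logs waiting it yields two.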
- one log in number of archived logs 332 is transferred over a session within number of sessions 342 .
- more than one log may be transferred using a session.
- these sessions may be over the same connection or a different connection between primary computer system 302 and secondary computer system 304 .
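The use of parallel sessions can be sketched with a thread pool in which each worker stands in for one session. This is a minimal illustration under stated assumptions: `send_log` is a hypothetical caller-supplied function that sends one log to the receiving process, and mapping one session to one thread is a choice made here, not something the source specifies.

```python
from concurrent.futures import ThreadPoolExecutor


def transfer_logs(log_paths, num_sessions, send_log):
    """Send each log using at most num_sessions concurrent sessions.

    send_log is a hypothetical callable taking one log path and returning
    a result once that log has been sent. Returns {path: result}.
    """
    if not log_paths:
        return {}
    with ThreadPoolExecutor(max_workers=max(1, num_sessions)) as pool:
        # pool.map preserves input order, so results line up with paths.
        results = list(pool.map(send_log, log_paths))
    return dict(zip(log_paths, results))
```

Because a worker picks up the next log as soon as it finishes one, this also covers the case in which more than one log is transferred over a session.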
- Transfer process 336 also generates schedule 352 .
- schedule 352 identifies when data in logs 330 should be transferred from primary computer system 302 to secondary computer system 304 over number of sessions 342 .
- the selection of number of sessions 342 may be made ahead of time or when a transfer of logs 330 is to occur.
- Schedule 352 may be selected in a manner that avoids reducing resources 344 available to transfer data below a threshold or desired level. For example, in some illustrative embodiments, it may be undesirable to use more than around 50 percent of the available amounts of bandwidth 346 . Number of sessions 342 and schedule 352 may be selected to avoid this situation.
- transfer process 336 takes into account amount of data 354 in number of archived logs 332 .
- transfer process 336 takes into account resources 344 available to transfer number of archived logs 332 from primary computer system 302 to secondary computer system 304, as well as amount of data 354.
- schedule 352 may change as resources 344 change within network 306 .
- Schedule 352 may provide for transfers of number of archived logs 332 on a periodic basis and/or based on particular events.
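A schedule that keeps bandwidth use under a threshold might be chosen along the lines of the sketch below. The inputs are assumptions: `background_usage_percent` is a hypothetical per-slot forecast of bandwidth already in use, `transfer_percent` is the share the log transfer would add, and the 50 percent threshold follows the example figure in the text, read here as a cap on total usage.

```python
def pick_transfer_slot(background_usage_percent, transfer_percent,
                       threshold_percent=50):
    """Return the index of the first slot where the transfer fits.

    A slot fits when background usage plus the transfer's own usage stays
    at or below threshold_percent. Returns None if no slot fits.
    """
    for slot, used in enumerate(background_usage_percent):
        if used + transfer_percent <= threshold_percent:
            return slot
    return None
```

Re-running the selection whenever the forecast changes corresponds to the schedule changing as resources change within the network.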
- transfer process 336 may cause log generation process 328 to close a log within number of active logs 334 to generate an archived log within number of archived logs 332 prior to the log normally being closed.
- transfer process 336 may cause log generation process 328 to close logs after the logs reach a desired size as compared to the size of the logs normally generated. For example, if logs in number of active logs 334 are normally closed when the logs reach a size of around 800 megabytes, transfer process 336 may cause a log having a smaller size, such as around 600 megabytes, to be closed. Additionally, by closing a log prior to the normal size at which a log is closed, faster synchronization or backing up of changes 318 may occur.
- a log is closed when changes 318 are no longer written to the log.
- the log becomes an archived log and is ready for transfer in these examples.
- the size of logs may vary, depending on the particular implementation. If a log is to be closed before it reaches the size at which a log is normally closed, a smaller size may be selected. This smaller size may be selected as one that reduces use of resources 344 . This reduction in the use of resources 344 may be one that allows for transfer of number of archived logs 332 without using an amount of resources 344 that is greater than some desired level of usage.
- the closing of logs at an earlier time may be performed in response to schedule 352 .
- schedule 352 indicates that a transfer of logs is to be made from primary database 314 to secondary database 324
- transfer process 336 determines whether number of archived logs 332 is present for transfer. If no logs are present for transfer, transfer process 336 may cause log generation process 328 to close one or more logs in number of active logs 334 . These closed logs may then be transferred to secondary database 324 .
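- The early-close behavior described above can be sketched as follows. This is a minimal illustration only: the `LogStore` class, its fields, and `logs_for_transfer` are hypothetical names introduced here, and logs are modeled as plain strings rather than files.

```python
from dataclasses import dataclass, field


@dataclass
class LogStore:
    """Hypothetical stand-in for the logs kept at the primary database."""
    active: list = field(default_factory=list)    # logs still receiving changes
    archived: list = field(default_factory=list)  # closed logs ready for transfer

    def close_active(self, count=1):
        """Close up to `count` active logs so they become archived logs."""
        for _ in range(min(count, len(self.active))):
            self.archived.append(self.active.pop(0))


def logs_for_transfer(store):
    """Return the archived logs, forcing an early close if none are present."""
    if not store.archived:
        # Nothing is ready, so ask for one active log to be closed early.
        store.close_active(1)
    ready, store.archived = store.archived, []
    return ready
```

When an archived log already exists, the sketch leaves the active logs untouched; an active log is closed early only when nothing is ready for transfer.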
- The illustration of database environment 300 is not meant to imply physical or architectural limitations to the manner in which different advantageous embodiments may be implemented. Other components in addition to and/or in place of the ones illustrated may be used. Some components may be unnecessary in some advantageous embodiments. Also, the blocks are presented to illustrate some functional components. One or more of these blocks may be combined and/or divided into different blocks when implemented in different advantageous embodiments.
- Log generation process 328 may be a process separate from database management process 316.
- Log generation process 328 may be located on a different computer from database management process 316.
- Receiving process 338 may be integrated as part of database management process 326.
- Network 306 may be comprised of multiple networks.
- Network 306 may be a local area network and a wide area network connected to each other.
- Network 306 also may include the Internet.
- Number of archived logs 332 may be compressed and/or encrypted prior to transfer using number of sessions 342 over network 306 to secondary computer system 304.
- The compression of number of archived logs 332 may reduce the amount of data that is transferred over network 306 through number of sessions 342 in a manner that may reduce the use of resources 344 that are available.
- Encryption of number of archived logs 332 may protect sensitive and/or confidential information in number of archived logs 332 from being viewed by unauthorized users.
- Transfer process 336 may monitor updates and may determine that secondary database 324 is behind in updates from primary database 314 beyond a desired threshold or amount. Transfer process 336 may measure the amount of updates by, for example, without limitation, the size of the log or logs, or the time since a log has been transferred. This situation may cause transfer process 336 to generate an indication that secondary database 324 is out of synchronization with primary database 314.
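- The out-of-synchronization check described above can be sketched as a simple threshold test. The function name, parameter names, and the choice to flag when either measure exceeds its limit are assumptions made for illustration; the text only names log size and time since the last transfer as possible measures.

```python
def is_out_of_sync(pending_log_bytes, seconds_since_last_transfer,
                   max_pending_bytes, max_seconds):
    """Report whether the secondary is behind by more than a desired amount.

    Either measure exceeding its (hypothetical) threshold is treated as
    sufficient to raise the indication.
    """
    too_much_data = pending_log_bytes > max_pending_bytes
    too_long_ago = seconds_since_last_transfer > max_seconds
    return too_much_data or too_long_ago
```

For example, with a limit of 800 megabytes of untransferred logs and one hour between transfers, a backlog of 900 megabytes would raise the indication even if a transfer completed a minute ago.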
- With reference now to FIG. 4, a flowchart of a process for generating a schedule to transfer logs is depicted in accordance with an advantageous embodiment.
- The different steps in FIG. 4 may be implemented as computer usable program code in a functional form for execution by computers in database environment 300 in FIG. 3.
- The process illustrated in FIG. 4 may be implemented with number of computers 312 executing program code for transfer process 336 in FIG. 3.
- The program code for transfer process 336 is executed to identify a history of database usage (step 400).
- Transfer process 336 analyzes a history of use of primary database 314 by clients 320. The analysis may be used to identify times during which the primary database is known to be less busy.
- The program code for transfer process 336 is executed to generate a schedule, such as schedule 352, based on the analyzed history (step 402), with the process terminating thereafter.
- Transfer process 336 may generate a schedule that performs more log transfers during periods of time in which the database usage is low as compared to other periods of time.
- The schedule may set more frequent transfers during periods of time when the database is undergoing a volume of changes greater than a threshold level, or for any other reason for which more frequent backups are desirable over a longer period of time.
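- Steps 400 and 402 can be sketched as follows, under stated assumptions. The usage history is assumed to be a list of (hour of day, request count) samples, and an hour is treated as "low usage" when its average falls below a chosen fraction of the overall average. The function name and the `quiet_fraction` parameter are hypothetical; the source does not specify how the history is analyzed.

```python
from collections import defaultdict


def low_usage_hours(history, quiet_fraction=0.5):
    """Return the hours of the day whose average load is well below average.

    history: iterable of (hour_of_day, request_count) samples (assumed format).
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for hour, requests in history:
        totals[hour] += requests
        counts[hour] += 1
    if not totals:
        return []
    averages = {hour: totals[hour] / counts[hour] for hour in totals}
    overall = sum(averages.values()) / len(averages)
    # Hours below the cutoff are candidates for additional log transfers.
    return sorted(h for h, avg in averages.items() if avg < overall * quiet_fraction)
```

A schedule built from this result could place more transfers in the returned hours, in line with the preference for low-usage periods described above.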
- The schedule may be generated using processes other than the one illustrated in FIG. 4.
- Transfer process 336 may generate a schedule in which logs are transferred periodically.
- The frequency at which the transfer of logs occurs also may be based on a priority level.
- A higher priority level may cause the process for transferring logs to occur at different periods of time. For example, logs may be transferred every hour, every five minutes, continuously, or at some other interval.
- The schedule may set transfers during certain periods of time when the database system is least used. This process may be performed by transfer process 336 periodically. The process may be performed in response to a periodic or non-periodic event.
- With reference now to FIG. 5, a flowchart of a process for transferring logs from a primary database to a secondary database is depicted in accordance with an illustrative embodiment.
- The process illustrated in FIG. 5 may be implemented in the form of program code in a functional form. This program code may be executed to run a process to transfer logs from a primary database to a secondary database.
- The process in FIG. 5 may be implemented with number of computers 312 executing program code for transfer process 336 in FIG. 3. This process may be initiated in response to a schedule, such as schedule 352 in FIG. 3.
- The program code for transfer process 336 is executed to determine whether a number of logs created on the primary database on a first computer system are ready for transfer to a second computer system (step 500).
- Step 500 may involve determining whether archived logs are present in the database in this illustrative example. If logs are ready for transfer, the program code for transfer process 336 is executed to identify a number of sessions for use in transferring the logs (step 502).
- The number of sessions may be selected based on a number of resources available for use to transfer the logs. This number of resources may be, for example, the bandwidth available. In other examples, the number of resources considered in selecting the number of sessions may be bandwidth and processor usage. For example, if it is desirable not to use more than around 50 percent of the available bandwidth, then the number of sessions may be selected based on that parameter.
- The program code for transfer process 336 is executed to compress the number of logs to be transferred (step 504). This operation also may include encrypting the logs, depending on the particular implementation. By compressing the number of logs, the amount of data to be transferred over the sessions may be reduced.
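- The compression in step 504 can be illustrated with a short sketch. The source does not name a compression algorithm; zlib from the Python standard library is used here purely as an assumption, and the function names are hypothetical. Encryption is omitted from the sketch.

```python
import zlib


def compress_log(raw: bytes, level: int = 6) -> bytes:
    """Compress the bytes of an archived log before it is sent."""
    return zlib.compress(raw, level)


def decompress_log(payload: bytes) -> bytes:
    """Restore the original log bytes on the receiving side."""
    return zlib.decompress(payload)
```

Logs that record many similar changes tend to compress well, which is what reduces the amount of data sent over the sessions.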
- The program code for transfer process 336 is then executed to open the number of sessions (step 506).
- The number of sessions selected does not exceed the number of logs to be transferred. For example, if four logs are present for transfer, only a maximum of four sessions are opened. In some cases, depending on the resources available and the constraints on those resources, the number of sessions may be less than the number of logs to be transferred.
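- The session selection in steps 502 and 506 can be sketched as a small calculation. The sketch assumes that each session consumes a roughly fixed share of bandwidth, which the source does not state. The function name, the parameters, and the floor of one session are hypothetical choices made for illustration.

```python
def select_session_count(num_logs, available_mbps, per_session_mbps,
                         max_fraction=0.5):
    """Pick how many parallel sessions to open for a batch of archived logs.

    max_fraction mirrors the "around 50 percent" bandwidth example.
    """
    if num_logs <= 0:
        return 0
    if per_session_mbps <= 0:
        raise ValueError("per_session_mbps must be positive")
    budget_mbps = available_mbps * max_fraction
    by_bandwidth = int(budget_mbps // per_session_mbps)
    # Never open more sessions than there are logs. Keeping at least one
    # session is an assumption so that the transfer can still proceed.
    return max(1, min(num_logs, by_bandwidth))
```

With four logs, 1000 Mbps available, and 100 Mbps per session, the bandwidth budget would allow five sessions, but only four are opened because there are only four logs.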
- The program code for transfer process 336 is executed to determine whether a session in the number of sessions opened has failed (step 508). If a session has not failed, the program code for transfer process 336 is executed to transfer the logs over the number of sessions (step 510). The program code for transfer process 336 is then executed to determine whether an error has occurred in transferring a log (step 512). In these examples, an error may be identified by determining whether the received log is of the correct size and by comparing checksums. If an error has not occurred, the process terminates. If an error has occurred in transferring one or more logs, the process returns to step 510 with transfer process 336 transferring the logs that have failed. In some embodiments, only the portions of a log that were not received or received with errors may be resent.
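- The error check in step 512 and the resend loop back to step 510 can be sketched as follows. SHA-256 stands in for the unspecified checksum, the `send` callable is a hypothetical stand-in for a network session that returns the bytes the receiver ended up with, and the `max_attempts` limit is an assumption added so the sketch always terminates.

```python
import hashlib


def log_digest(data: bytes) -> str:
    """Checksum of a log; SHA-256 is an assumed choice."""
    return hashlib.sha256(data).hexdigest()


def received_ok(sent: bytes, received: bytes) -> bool:
    """A transfer is good when both the size and the checksum match."""
    return len(received) == len(sent) and log_digest(received) == log_digest(sent)


def transfer_with_retry(logs, send, max_attempts=3):
    """Send every log and resend only those that failed verification.

    logs: dict mapping log name to bytes.
    send: callable(name, data) returning the bytes as received.
    Returns the names of logs that still failed after max_attempts.
    """
    pending = dict(logs)
    for _ in range(max_attempts):
        failed = {}
        for name, data in pending.items():
            if not received_ok(data, send(name, data)):
                failed[name] = data
        if not failed:
            return []
        pending = failed
    return sorted(pending)
```

In a real deployment the receiving side would compute the checksum and report it back; comparing both byte strings in one place is a simplification for illustration.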
- With reference again to step 508, if a session has failed, the program code for transfer process 336 is executed to determine whether the failure has been caused by a lack of resources needed to transfer the logs (step 514). If the failure is caused by a lack of resources being available, the program code for transfer process 336 is executed to determine whether opened sessions should be closed to reach a desired level of available resources (step 516). If additional sessions are to be closed, the program code for transfer process 336 is executed to close the additional sessions (step 518). The process then proceeds to step 510 as described above.
- With reference again to step 500, if archived logs are not available, the program code for transfer process 336 is executed to send a request to the log generation process to close a number of currently open logs (step 520). When a currently open log file is closed, the process then proceeds to step 502. With reference again to step 516, if additional opened sessions are not to be closed, the process returns to step 510 as described above. Turning back to step 514, if the failure has not been caused by a lack of resources needed to transfer the logs, the process also returns to step 510.
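- The decision in steps 516 and 518 can be sketched as a calculation of how many open sessions to close. As before, the fixed per-session bandwidth, the function name, and the rule of keeping at least one session open are assumptions for illustration and are not taken from the source.

```python
def sessions_to_close(open_sessions, per_session_mbps, available_mbps,
                      max_fraction=0.5):
    """Return how many open sessions to close to get back under the budget."""
    if per_session_mbps <= 0:
        raise ValueError("per_session_mbps must be positive")
    budget_mbps = available_mbps * max_fraction
    # Keep at least one session so that step 510 can still transfer logs.
    allowed = max(1, int(budget_mbps // per_session_mbps))
    return max(0, open_sessions - allowed)
```

A result of zero corresponds to the branch in which no additional sessions are closed and the process returns directly to step 510.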
- The synchronization of data between the databases may be performed more quickly. Further, the cost of compression and transfer also may be reduced by reducing the amount of time needed to transfer a log with a smaller size.
- One or more of the different illustrative embodiments provide a method and apparatus for processing data.
- The different illustrative embodiments provide a method and apparatus for synchronizing changes made at one database with another database.
- The data may be transferred in a manner that reduces usage of resources available on a network.
- A determination is made as to whether a number of logs created on a primary database located on a first computer system is ready for transfer to a second computer system.
- A number of sessions is identified based on resources available to transfer the number of logs across a network to the second computer system. The identified number of sessions is used to transfer the number of logs from the first computer system to the second computer system.
Abstract
Description
- The present disclosure relates generally to an improved data processing system and, more specifically, to a method and apparatus for processing data. Still more specifically, the present disclosure relates to a method and apparatus for backing up data for use in disaster recovery.
- Most users back up data in case of hardware failure, data corruption, or other situations that may involve the loss of data. Backups of a computer can be made to a backup device. A backup device may be, for example, a tape drive, an optical disk drive, a hard drive, or other suitable devices. A backup device is attached to, integrated as part of, or otherwise associated with the computer on which the data is located.
- This type of local backup provides a capability to restore data in the event that a hardware failure occurs in the computer. This type of backup, however, does not provide a suitable mechanism for restoring data in the event of a disaster. A disaster may be an event in which an extensive failure occurs. This failure may involve the loss or destruction of the computer and the backup device. Further, a disaster also may involve damage or loss of other equipment in the same location as the computer. For example, a disaster may be caused by environmental hazards, such as fire, flood, earthquake, power outages, malicious acts, operator errors, or other similar actions.
- One solution is to use a remote backup device. For example, data on a computer may be backed up to a remote site. A backup process may copy data from a computer and send it to a remote storage device. Thus, if a disaster occurs, the data may be restored. Some data, however, may not be restored. Data that has been added to the computer since the last backup was performed is lost. Depending on how often a backup is made, the amount of data lost may not be unreasonable or crucial. Backups may be made daily, hourly, or even continuously, depending on the particular type of backup device used and the schedule that is set.
- For many organizations and users, it is desirable and sometimes essential to provide continuous access to data and processing of data, even in the event of these types of failures. A backup copy of data, such as a secondary database, is often maintained at a remote geographic location. The distance between the location of a primary database and a secondary database may reduce the likelihood that the secondary database is also affected by a disaster affecting the primary database. In the event that a failure occurs at the primary database location, processing of data and access to the data may be directed to the secondary database.
- Currently, many processes and techniques are available for synchronizing databases. One manner in which databases may be synchronized involves applying updates to both databases in a synchronous fashion. In other words, any writes or changes to the primary database are also made to the backup database before actually making the writes or changes to the primary database. Keeping the databases synchronized in this fashion has the benefit of ensuring a minimal loss of data in the event of a disaster.
- Another manner in which databases may be synchronized involves generating a record of changes made to the primary database. These changes may then be sent asynchronously to the secondary database. These changes are oftentimes saved in a log generated at the primary database. These logs are then sent to the backup database to allow the backup database to make the same changes that occurred at the primary database.
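- The log-based approach can be illustrated with a minimal sketch. Databases are modeled as Python dictionaries and a log as a list of (key, value) changes, with a value of `None` marking a deletion. These representations and the function names are assumptions chosen only to show the idea of recording changes at the primary and replaying them later at the secondary.

```python
def record_change(database, log, key, value):
    """Apply a change to the primary and append it to the open log."""
    if value is None:
        database.pop(key, None)  # None marks a deletion in this sketch
    else:
        database[key] = value
    log.append((key, value))


def replay_log(database, log):
    """Apply the logged changes, in order, to the secondary."""
    for key, value in log:
        if value is None:
            database.pop(key, None)
        else:
            database[key] = value
```

Until the log is shipped and replayed, the secondary lags the primary, which is the window of potential data loss that the asynchronous approach accepts.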
- In an illustrative embodiment, a method and apparatus is provided for managing data in a database. A determination is made as to whether a number of logs created for a primary database located on a first computer system is ready for transfer to a second computer system. A number of sessions is identified based on resources available to transfer the number of logs across a network to the second computer system to form an identified number of sessions in response to a determination that the number of logs is ready to be transferred. The first computer system and the second computer system are in communication with the network. The number of logs is transferred from the first computer system to the second computer system using the identified number of sessions.
- FIG. 1 is a pictorial representation of a network of data processing systems in which illustrative embodiments may be implemented;
- FIG. 2 is a diagram of a data processing system in accordance with an illustrative embodiment;
- FIG. 3 is a diagram of a database environment in accordance with an illustrative embodiment;
- FIG. 4 is a flowchart of a process for generating a schedule to transfer logs in accordance with an illustrative embodiment; and
- FIG. 5 is a flowchart of a process for transferring logs from a primary database to a secondary database in accordance with an illustrative embodiment.
- As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method, or computer program product. Accordingly, aspects of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.), or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit,” “module,” or “system.” Furthermore, aspects of the present invention may take the form of a computer program product embodied in one or more computer readable medium(s) having computer readable program code embodied thereon.
- Any combination of one or more computer readable medium(s) may be utilized. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
- A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electromagnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium including, but not limited to, wireless, wireline, optical fiber cable, RF, or any suitable combination of the foregoing.
- Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++, or the like, and conventional procedural programming languages, such as the “C” programming language, or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
- Aspects of the present invention are described below with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
- These computer program instructions also may be stored in a computer readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
- The computer program instructions also may be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus, or other devices to produce a computer implemented process such that the instructions, which execute on the computer or other programmable apparatus, provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
- With reference now to the figures and, in particular, with reference to FIGS. 1-2, exemplary diagrams of data processing environments are provided in which advantageous embodiments may be implemented. It should be appreciated that FIGS. 1-2 are only exemplary and are not intended to assert or imply any limitation with regard to the environments in which different embodiments may be implemented. Many modifications to the depicted environments may be made.
- FIG. 1 depicts a pictorial representation of a network of data processing systems in which advantageous embodiments may be implemented. Network data processing system 100 is a network of computers in which the advantageous embodiments may be implemented. Network data processing system 100 contains network 102, which is the medium used to provide communications links between various devices and computers connected together within network data processing system 100. Network 102 may include connections, such as wire, wireless communication links, or fiber optic cables.
- In the depicted example, server 104 and server 106 connect to network 102 along with storage unit 108. In addition, clients, such as client 110, connect to network 102. Server 104 provides data, such as boot files, operating system images, and applications, to the clients. The clients are clients to server 104 in this example. For example, server 104 may contain a primary database, while server 106 may contain a secondary database. The clients may access the database at server 104. In the event that server 104 is no longer accessible or the database at server 104 cannot be used, the clients may access the database at server 106. Network data processing system 100 may include additional servers, clients, and other devices not shown.
- Program code located in network data processing system 100 may be stored on a computer recordable storage medium and downloaded to a data processing system or other device for use. For example, program code may be stored on a computer recordable storage medium on server 104 and downloaded to client 110 over network 102 for use on client 110.
- In the depicted example, network data processing system 100 is the Internet with network 102 representing a worldwide collection of networks and gateways that use the Transmission Control Protocol/Internet Protocol (TCP/IP) suite of protocols to communicate with one another. At the heart of the Internet is a backbone of high-speed data communication lines between major nodes or host computers consisting of thousands of commercial, governmental, educational, and other computer systems that route data and messages. Of course, network data processing system 100 also may be implemented as a number of different types of networks such as, for example, an intranet, a local area network (LAN), or a wide area network (WAN). FIG. 1 is intended as an example and not as an architectural limitation for the different illustrative embodiments.
- With reference now to FIG. 2, a diagram of a data processing system is depicted in accordance with an illustrative embodiment. Data processing system 200 is an example of a computer, such as server 104 or client 110 in FIG. 1, in which computer usable program code or instructions implementing the processes may be located for the illustrative embodiments. In this illustrative example, data processing system 200 includes communications fabric 202, which provides communications between processor unit 204, memory 206, persistent storage 208, communications unit 210, input/output (I/O) unit 212, and display 214.
- Processor unit 204 serves to execute instructions for software that may be loaded into memory 206. Processor unit 204 may be a set of one or more processors or may be a multi-processor core, depending on the particular implementation. Further, processor unit 204 may be implemented using one or more heterogeneous processor systems, in which a main processor is present with secondary processors on a single chip. As another illustrative example, processor unit 204 may be a symmetric multi-processor system containing multiple processors of the same type.
- Memory 206 and persistent storage 208 are examples of storage devices 216. A storage device is any piece of hardware that is capable of storing information such as, for example, without limitation, data, program code in functional form, and/or other suitable information either on a temporary basis and/or a permanent basis. Memory 206, in these examples, may be, for example, a random access memory, or any other suitable volatile or non-volatile storage device. Persistent storage 208 may take various forms, depending on the particular implementation. For example, persistent storage 208 may contain one or more components or devices. For example, persistent storage 208 may be a hard drive, a flash memory, a rewritable optical disk, a rewritable magnetic tape, or some combination of the above. The media used by persistent storage 208 may be removable. For example, a removable hard drive may be used for persistent storage 208.
- Communications unit 210, in these examples, provides for communication with other data processing systems or devices. In these examples, communications unit 210 is a network interface card. Communications unit 210 may provide communications through the use of either or both physical and wireless communications links.
- Input/output unit 212 allows for the input and output of data with other devices that may be connected to data processing system 200. For example, input/output unit 212 may provide a connection for user input through a keyboard, a mouse, and/or some other suitable input device. Further, input/output unit 212 may send output to a printer. Display 214 provides a mechanism to display information to a user.
- Instructions for the operating system, applications, and/or programs may be located in storage devices 216, which are in communication with processor unit 204 through communications fabric 202. In these illustrative examples, the instructions are in a functional form on persistent storage 208. These instructions may be loaded into memory 206 for execution by processor unit 204. The processes of the different embodiments may be performed by processor unit 204 using computer-implemented instructions, which may be located in a memory, such as memory 206.
- These instructions are referred to as program code, computer usable program code, or computer readable program code that may be read and executed by a processor in processor unit 204. The program code in the different embodiments may be embodied on different physical or computer readable storage media, such as memory 206 or persistent storage 208.
- Program code 218 is located in a functional form on computer readable storage media 220 that is selectively removable and may be loaded onto or transferred to data processing system 200 for execution by processor unit 204. Program code 218 and computer readable storage media 220 form computer program product 222. Computer readable storage media 220 may include, for example, an optical or magnetic disk that is inserted or placed into a drive or other device that is part of persistent storage 208 for transfer onto a storage device, such as a hard drive, that is part of persistent storage 208. Computer readable storage media 220 also may take the form of a persistent storage, such as a hard drive, a thumb drive, or a flash memory, that is connected to data processing system 200. In some instances, computer readable storage media 220 may not be removable from data processing system 200.
- Alternatively, program code 218 may be transferred to data processing system 200 using computer readable signal media. Computer readable signal media may be, for example, a propagated data signal containing program code 218. For example, computer readable signal media may be an electromagnetic signal, an optical signal, and/or any other suitable type of signal. These signals may be transmitted over communications links, such as wireless communications links, an optical fiber cable, a coaxial cable, a wire, and/or any other suitable type of communications link. In other words, the communications link and/or the connection may be physical or wireless in the illustrative examples. The computer readable signal media also may take the form of non-tangible media, such as communications links or wireless transmissions containing the program code.
- In some illustrative embodiments, program code 218 may be downloaded over a network to persistent storage 208 from another device or data processing system through a computer readable signal media for use within data processing system 200. For instance, program code stored in computer readable storage media in a server data processing system may be downloaded over a network from the server to data processing system 200. The data processing system providing program code 218 may be a server computer, a client computer, or some other device capable of storing and transmitting program code 218.
- The different components illustrated for data processing system 200 are not meant to provide architectural limitations to the manner in which different embodiments may be implemented. The different illustrative embodiments may be implemented in a data processing system including components in addition to or in place of those illustrated for data processing system 200. Other components shown in FIG. 2 can be varied from the illustrative examples shown. The different embodiments may be implemented using any hardware device or system capable of executing program code. As one example, data processing system 200 may include organic components integrated with inorganic components and/or may be comprised entirely of organic components excluding a human being. For example, a storage device may be comprised of an organic semiconductor. As another example, a storage device in data processing system 200 is any hardware apparatus that may store data. Memory 206, persistent storage 208, and computer program product 222 are examples of storage devices in a tangible form.
- In another example, a bus system may be used to implement communications fabric 202 and may be comprised of one or more buses, such as a system bus or an input/output bus. Of course, the bus system may be implemented using any suitable type of architecture that provides for a transfer of data between different components or devices attached to the bus system. Additionally, a communications unit may include one or more devices used to transmit and receive data, such as a modem or a network adapter. Further, a memory may be, for example, memory 206 or a cache such as found in an interface and memory controller hub that may be present in communications fabric 202.
- The different illustrative embodiments recognize and take into account a number of different considerations. For example, the different illustrative embodiments recognize that, although the use of logs may reduce issues that may occur if the backup database cannot be reached, using logs also may result in undesirable delays and limits to the amount of available bandwidth in a network. In these illustrative examples, a log is a file or other suitable data structure that contains changes to be made to a database.
- The different illustrative embodiments recognize and take into account that the currently used processes for recording changes in logs at one database and sending the logs to another database for updating data can be time consuming and use up network resources. For example, a primary database sends information to a secondary database one log at a time. These logs may be very large in size. For example, a log may have 800 or more megabytes.
- Also, depending on the bandwidth available, the number of logs present at the primary database may reduce availability or responsiveness of the database. The number of logs also may reduce the availability or responsiveness of the network on which the database is located. More specifically, the size and number of logs for transfer may cause other requests and transactions to be delayed or stalled. For example, the number of primary database logs and the size of the logs may cause the database to delay or stall other transactions in a manner that is noticeable by other users.
- Thus, one or more of the illustrative embodiments provide a method and apparatus for synchronizing databases. In one illustrative embodiment, a determination is made as to whether a number of logs created for a primary database located on a first computer system is ready for transfer to a second computer system. In response to a determination that the number of logs is ready to be transferred, a number of sessions is identified based on resources available to transfer the number of logs across a network to the second computer system to form an identified number of sessions. The first computer system and the second computer system are in communication with the network. The number of logs is transferred from the first computer system to the second computer system using the identified number of sessions.
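As a rough illustration of identifying a number of sessions from available resources, the following Python sketch derives a session count from a bandwidth budget. The function name `identify_session_count`, its parameters, and the per-session bandwidth estimate are assumptions made for this example and do not appear in the specification. The 50 percent default mirrors the example threshold mentioned elsewhere in this description, and the floor of one session is likewise an assumption.

```python
def identify_session_count(num_logs_ready: int,
                           available_bandwidth_mbps: float,
                           per_session_mbps: float,
                           max_bandwidth_fraction: float = 0.5) -> int:
    """Pick how many parallel sessions to open for a log transfer.

    All parameters are hypothetical: the budget is a fraction of the
    currently available bandwidth, and each session is assumed to
    consume roughly per_session_mbps.
    """
    if num_logs_ready <= 0:
        return 0  # no archived logs are ready, so no sessions are needed
    if per_session_mbps <= 0:
        raise ValueError("per_session_mbps must be positive")
    # Bandwidth the transfer is allowed to consume.
    budget = available_bandwidth_mbps * max_bandwidth_fraction
    by_bandwidth = int(budget // per_session_mbps)
    # Assumption: open at least one session so the transfer makes progress,
    # and never open more sessions than there are logs to send.
    return max(1, min(num_logs_ready, by_bandwidth))
```

For instance, with four logs ready, 100 Mbps available, and an assumed 10 Mbps per session, the budget allows five sessions, so the count is capped at four by the number of logs.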
- Turning now to
FIG. 3, a diagram of a database environment is depicted in accordance with an illustrative embodiment. Database environment 300 is an example of an environment that may be implemented in network data processing system 100 in FIG. 1.
- In this illustrative example,
database environment 300 includes primary computer system 302 and secondary computer system 304. These two systems are connected to network 306. Primary computer system 302 is located in geographic location 308, while secondary computer system 304 is located in geographic location 310, which is remote to geographic location 308 in these examples.
- Primary computer system 302 includes number of computers 312. As used herein, a number, when referring to items, means one or more items. For example, a number of computers is one or more computers. Number of computers 312 may be implemented using one or more of data processing system 200 in FIG. 2.
- In this illustrative example,
primary database 314 and database management process 316 are located on primary computer system 302. Primary database 314 may be located on one computer within number of computers 312, while database management process 316 may be located on another computer. In yet other illustrative examples, both primary database 314 and database management process 316 may be located on the same computer in number of computers 312 in primary computer system 302. Primary database 314 is a structured collection of data stored on one or more of number of computers 312. This data is stored in the form of records in these examples.
- Database management process 316 is a software component that manages the storage of data within primary database 314. Database management process 316 handles changes 318 made to primary database 314. Changes 318 may include, for example, without limitation, adding new data, deleting data, overwriting data, and other changes to primary database 314. Changes 318 may be made in response to requests from clients 320.
- In these illustrative examples,
primary database 314 is replicated on secondary database 324 to provide a backup of data in primary database 314. As a result, if an event occurs that renders primary database 314 unusable and/or inaccessible, secondary database 324 is used in place of primary database 314. For example, damage may occur to number of computers 312, data may become corrupted in primary database 314, access to number of computers 312 may be cut off or prevented on network 306, and/or some other event may occur causing primary database 314 to be inaccessible by clients 320.
- In this illustrative example,
secondary database 324 and database management process 326 are located on number of computers 322 in secondary computer system 304. Secondary computer system 304 comprises number of computers 322. The computers in number of computers 322 also may be implemented using a data processing system such as, for example, data processing system 200 in FIG. 2.
- Secondary database 324 is synchronized with primary database 314 in these illustrative examples. In other words, as changes 318 are received and made to primary database 314, changes 318 are propagated to secondary database 324. In these examples, log generation process 328 in database management process 316 generates logs 330 containing changes 318. These logs are used to propagate changes 318 from primary database 314 to secondary database 324. In these illustrative examples, logs 330 include number of archived logs 332 and number of active logs 334. Number of archived logs 332 are logs that have been completed and are ready for transfer to secondary computer system 304. Number of active logs 334 are logs that are still being written by log generation process 328.
- In these illustrative examples, a log may be considered an archived log when the log has not been changed for some period of time in an archive directory. In other illustrative examples, log
generation process 328 may generate a message indicating that a log is present as part of number of archived logs 332. Log generation process 328 may send this message to transfer process 336. The message may include identification of the log, such as a file name or a path to the file.
- Transfer process 336 sends number of archived logs 332 to receiving process 338 in database management process 326. In response to receiving number of archived logs 332, receiving process 338 updates secondary database 324 with the changes in number of received logs 340.
- In these illustrative examples, number of
archived logs 332 are transferred by transfer process 336 to receiving process 338 using number of sessions 342. Transfer process 336 identifies number of sessions 342 for use in sending number of archived logs 332 to receiving process 338 in database management process 326. Additionally, each session in number of sessions 342 may be used concurrently or simultaneously with other sessions within number of sessions 342. In other words, the different sessions within number of sessions 342 may be parallel sessions.
- Number of
sessions 342 may be identified using at least one of the number of logs ready to be transferred, an amount of bandwidth available, performance criteria, and other suitable factors.
- In some examples, the identification of number of
sessions 342 is performed based on resources 344 available to transfer number of archived logs 332 from primary computer system 302 to secondary computer system 304 over network 306. Resources 344 include at least one of bandwidth 346, processor usage 348, storage 350, and other suitable types of resources used to transfer data. Bandwidth 346 is a rate of data transfer over network 306. Processor usage 348 is processor resources used on one or both of primary computer system 302 and secondary computer system 304 to transfer number of archived logs 332. Storage 350 may be available storage, such as random access memory, hard disk drives, and other storage devices on these computer systems.
- In these illustrative examples, one log in number of
archived logs 332 is transferred over a session within number of sessions 342. In some advantageous embodiments, more than one log may be transferred using a session. Further, these sessions may be over the same connection or a different connection between primary computer system 302 and secondary computer system 304.
- Transfer process 336 also generates schedule 352. In these examples, schedule 352 identifies when data in logs 330 should be transferred from primary computer system 302 to secondary computer system 304 over number of sessions 342. In these illustrative examples, the selection of number of sessions 342 may be made ahead of time or when a transfer of logs 330 is to occur. Schedule 352 may be selected in a manner that avoids reducing resources 344 available to transfer data below a threshold or desired level. For example, in some illustrative embodiments, it may be undesirable to use more than around 50 percent of the available amounts of bandwidth 346. Number of sessions 342 and schedule 352 may be selected to avoid this situation.
- In identifying
schedule 352 and number of sessions 342, transfer process 336 takes into account amount of data 354 in number of archived logs 332. As a result, transfer process 336 takes into account resources 344 available for use in transferring number of archived logs 332 from primary computer system 302 to secondary computer system 304 as well as amount of data 354. In these illustrative examples, schedule 352 may change as resources 344 change within network 306. Schedule 352 may provide for transfers of number of archived logs 332 on a periodic basis and/or based on particular events.
- Further,
transfer process 336 may cause log generation process 328 to close a log within number of active logs 334 to generate an archived log within number of archived logs 332 prior to the log normally being closed. In one example, transfer process 336 may cause log generation process 328 to close logs after the logs reach a desired size as compared to the size of the logs normally generated. For example, if logs in number of active logs 334 are normally closed when the logs reach a size of around 800 megabytes, transfer process 336 may cause a log having a smaller size, such as around 600 megabytes, to be closed. Additionally, by closing a log prior to the normal size at which a log is closed, faster synchronization or backing up of changes 318 may occur.
- In these examples, a log is closed when
changes 318 are no longer written to the log. When a log is closed, the log becomes an archived log and is ready for transfer in these examples.
- Although the sizes of around 800 megabytes and around 600 megabytes are described in the illustrative examples, the size of logs may vary, depending on the particular implementation. If a log is closed prior to the log reaching the size when a log is normally closed, the smaller size may be selected. This smaller size may be selected as one that reduces use of
resources 344. This reduction in the use of resources 344 may be one that allows for transfer of number of archived logs 332 without using an amount of resources 344 that is greater than some desired level of usage.
- The closing of logs at an earlier time may be performed in response to
schedule 352. For example, if schedule 352 indicates that a transfer of logs is to be made from primary database 314 to secondary database 324, transfer process 336 determines whether number of archived logs 332 is present for transfer. If no logs are present for transfer, transfer process 336 may cause log generation process 328 to close one or more logs in number of active logs 334. These closed logs may then be transferred to secondary database 324.
- The illustration of
database environment 300 is not meant to imply physical or architectural limitations to the manner in which different advantageous embodiments may be implemented. Other components in addition to and/or in place of the ones illustrated may be used. Some components may be unnecessary in some advantageous embodiments. Also, the blocks are presented to illustrate some functional components. One or more of these blocks may be combined and/or divided into different blocks when implemented in different advantageous embodiments.
- For example, in some advantageous embodiments, log
generation process 328 may be a process separate from database management process 316. Log generation process 328 may be located on a different computer from database management process 316. As yet another example, in some illustrative embodiments, receiving process 338 may be integrated as part of database management process 326. Further, network 306 may be comprised of multiple networks. For example, network 306 may be a local area network and a wide area network connected to each other. In yet other advantageous embodiments, network 306 also may include the Internet.
- As another example, in the different illustrative embodiments, number of
archived logs 332 may be compressed and/or encrypted prior to transfer using number of sessions 342 over network 306 to secondary computer system 304. The compression of number of archived logs 332 may reduce the amount of data that is transferred over network 306 through number of sessions 342 in a manner that may reduce the use of resources 344 that are available. Further, encryption of number of archived logs 332 may protect sensitive and/or confidential information in number of archived logs 332 from being viewed by unauthorized users.
- As yet another example,
transfer process 336 may monitor updates and may determine that secondary database 324 is behind in updates from primary database 314 beyond a desired threshold or amount. Transfer process 336 may measure the amount of updates by, for example, without limitation, the size of the log or logs, or the time since a log has been transferred. This situation may cause transfer process 336 to generate an indication that secondary database 324 is out of synchronization with primary database 314.
- With reference now to
FIG. 4, a flowchart of a process for generating a schedule to transfer logs is depicted in accordance with an advantageous embodiment. In this illustrative example, the different steps in FIG. 4 may be implemented as computer usable program code in a functional form for execution by computers in database environment 300 in FIG. 3. For example, the process illustrated in FIG. 4 may be implemented with number of computers 312 executing program code for transfer process 336 in FIG. 3.
- The program code for
transfer process 336 is executed to identify a history of database usage (step 400). In step 400, transfer process 336 analyzes a history of use of primary database 314 by clients 320. The analysis may be used to identify times during which the primary database is known to be less busy. The program code for transfer process 336 is executed to generate a schedule based on the analyzed history (step 402), with the process terminating thereafter.
- In this manner, a schedule, such as
schedule 352, may be set to transfer archived logs in a manner that reduces the impact to the primary database computer system. For example, transfer process 336 may generate a schedule that performs more log transfers during periods of time in which the database usage is low as compared to other periods of time. As another example, the schedule may set more frequent transfers during periods of time when the database is undergoing a volume of changes greater than a threshold level, or for any other suitable reason for which a more frequent backup is desirable over a longer period of time.
- Of course, the schedule may be generated using processes other than the one illustrated in
FIG. 4. For example, transfer process 336 may generate a schedule in which logs are transferred periodically.
- The frequency at which the transfer of logs occurs also may be based on a priority level. A higher priority level may cause the process for transferring logs to occur at different periods of time. For example, logs may be transferred every hour, every five minutes, continuously, or at some other interval. As another example, the schedule may set transfers during certain periods of time when the database system is least used. This process may be performed by
transfer process 336 periodically. The process may be performed in response to a periodic or non-periodic event.
- Turning now to
FIG. 5, a flowchart of a process for transferring logs from a primary database to a secondary database is depicted in accordance with an illustrative embodiment. The process illustrated in FIG. 5 may be implemented in the form of program code in a functional form. This program code may be executed to run a process to transfer logs from a primary database to a secondary database. In these illustrative examples, the process in FIG. 5 may be implemented with number of computers 312 executing program code for transfer process 336 in FIG. 3. This process may be initiated in response to a schedule, such as schedule 352 in FIG. 3.
- The program code for
transfer process 336 is executed to determine whether a number of logs created on the primary database on a first computer system is ready for transfer to a second computer system (step 500). In this depicted example, step 500 may involve determining whether archived logs are present in the database. If logs are ready for transfer, the program code for transfer process 336 is executed to identify a number of sessions for use in transferring the logs (step 502). In this illustrative example, the number of sessions may be selected based on a number of resources available for use to transfer the logs. This number of resources may be, for example, the bandwidth available. In other examples, the number of resources considered in selecting the number of sessions may be bandwidth and processor usage. For example, if it is desirable not to use more than around 50 percent of the available bandwidth, then the number of sessions may be selected based on that parameter.
- The program code for
transfer process 336 is executed to compress the number of logs to be transferred (step 504). This operation also may include encrypting the logs, depending on the particular implementation. By compressing the number of logs, the amount of data to be transferred over the sessions may be reduced.
- The program code for
transfer process 336 is then executed to open the number of sessions (step 506). In these examples, the number of sessions selected does not exceed the number of logs to be transferred. For example, if four logs are present for transfer, at most four sessions are opened. In some cases, depending on the resources available and the constraints on those resources, the number of sessions may be less than the number of logs to be transferred.
- After the sessions are started, the program code for
transfer process 336 is executed to determine whether a session in the number of sessions opened has failed (step 508). If a session has not failed, the program code for transfer process 336 is executed to transfer the logs over the number of sessions (step 510). The program code for transfer process 336 is then executed to determine whether an error has occurred in transferring a log (step 512). In these examples, an error may be identified by determining whether the received log is of the correct size and by comparing checksums. If an error has not occurred, the process terminates. If an error has occurred in transferring one or more logs, the process returns to step 510 with transfer process 336 transferring the logs that have failed. In some embodiments, only the portions of a log that were not received or received with errors may be resent.
- With reference again to step 508, if a session has failed, the program code for
transfer process 336 is executed to determine whether the failure has been caused by a lack of resources needed to transfer the logs (step 514). If the failure is caused by a lack of resources being available, the program code for transfer process 336 is executed to determine whether opened sessions should be closed to reach a desired level of available resources (step 516). If additional sessions are to be closed, the program code for transfer process 336 is executed to close the additional sessions (step 518). The process then proceeds to step 510 as described above.
- With reference again to step 500, if archived logs are not available, the program code for
transfer process 336 is executed to send a request to the log generation process to close a number of currently open logs (step 520). When a currently open log file is closed, the process then proceeds to step 502. With reference again to step 516, if additional opened sessions are not to be closed, the process returns to step 510 as described above. Turning back to step 514, if the failure has not been caused by a lack of resources needed to transfer the logs, the process also returns to step 510.
- In this manner, the synchronization of data between the databases may be performed more quickly. Further, the cost of compression and transfer also may be reduced by reducing the amount of time needed to transfer a log with a smaller size.
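The main path of the FIG. 5 flow (compress, open sessions, transfer, verify, resend) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the function `transfer_logs`, the `send` callback standing in for a network session, the use of zlib and SHA-256, and the `max_attempts` bound are all hypothetical choices made for illustration. Session-failure handling (steps 508 and 514 through 518) and the request to close active logs (step 520) are not modeled.

```python
import hashlib
import zlib
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def transfer_logs(logs: Dict[str, bytes],
                  send: Callable[[str, bytes], bytes],
                  num_sessions: int,
                  max_attempts: int = 3) -> Dict[str, bytes]:
    """Send compressed logs over parallel sessions and resend failures.

    `send` is a hypothetical transport: it takes a log name and payload
    and returns the bytes the receiver reports having stored.
    """
    # Step 504: compress each log before transfer.
    pending = {name: zlib.compress(data) for name, data in logs.items()}
    received: Dict[str, bytes] = {}
    for _ in range(max_attempts):
        if not pending:
            break
        # Step 506: never open more sessions than there are logs.
        workers = max(1, min(num_sessions, len(pending)))
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # Step 510: transfer the logs concurrently.
            futures = {name: pool.submit(send, name, payload)
                       for name, payload in pending.items()}
            results = {name: f.result() for name, f in futures.items()}
        # Step 512: check size and checksum; keep failed logs for resend.
        failed: Dict[str, bytes] = {}
        for name, payload in pending.items():
            got = results[name]
            same_size = len(got) == len(payload)
            same_sum = (hashlib.sha256(got).digest()
                        == hashlib.sha256(payload).digest())
            if same_size and same_sum:
                received[name] = zlib.decompress(got)
            else:
                failed[name] = payload
        pending = failed
    if pending:
        raise RuntimeError(f"logs failed to transfer: {sorted(pending)}")
    return received
```

The description places no bound on how many times a failed log is resent; the `max_attempts` limit is added here only so that the sketch always terminates.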
- In this manner, one or more of the different illustrative embodiments provide a method and apparatus for processing data. In particular, the different illustrative embodiments provide a method and apparatus for synchronizing changes made at one database with another database. In the illustrative embodiments, the data may be transferred in a manner that reduces usage of resources available on a network. In one or more of the illustrative embodiments, a determination is made as to whether a number of logs created on a primary database located on a first computer system is ready for transfer to a second computer system. In response to a determination that the number of logs is ready to be transferred, a number of sessions is identified based on resources available to transfer the number of logs across a network to the second computer system. The identified number of sessions is used to transfer the number of logs from the first computer system to the second computer system.
- The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
- The corresponding structures, materials, acts, and equivalents of all means or step plus function elements in the claims below are intended to include any structure, material, or act for performing the function in combination with other claimed elements as specifically claimed. The description of the present invention has been presented for purposes of illustration and description, but is not intended to be exhaustive or limited to the invention in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the invention. The embodiment was chosen and described in order to best explain the principles of the invention and the practical application, and to enable others of ordinary skill in the art to understand the invention for various embodiments with various modifications as are suited to the particular use contemplated.
Claims (22)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/493,794 US9223666B2 (en) | 2009-06-29 | 2009-06-29 | Disaster recovery for databases |
PCT/EP2010/058487 WO2011000701A1 (en) | 2009-06-29 | 2010-06-16 | Disaster recovery for databases |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/493,794 US9223666B2 (en) | 2009-06-29 | 2009-06-29 | Disaster recovery for databases |
Publications (2)
Publication Number | Publication Date |
---|---|
US20100332533A1 (en) | 2010-12-30 |
US9223666B2 US9223666B2 (en) | 2015-12-29 |
Family
ID=42732645
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/493,794 Active 2034-03-15 US9223666B2 (en) | 2009-06-29 | 2009-06-29 | Disaster recovery for databases |
Country Status (2)
Country | Link |
---|---|
US (1) | US9223666B2 (en) |
WO (1) | WO2011000701A1 (en) |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5758150A (en) * | 1995-10-06 | 1998-05-26 | Tele-Communications, Inc. | System and method for database synchronization |
US6038665A (en) * | 1996-12-03 | 2000-03-14 | Fairbanks Systems Group | System and method for backing up computer files over a wide area computer network |
US6058462A (en) * | 1998-01-23 | 2000-05-02 | International Business Machines Corporation | Method and apparatus for enabling transfer of compressed data record tracks with CRC checking |
US6192365B1 (en) * | 1995-07-20 | 2001-02-20 | Novell, Inc. | Transaction log management in a disconnectable computer and network |
US6327671B1 (en) * | 1998-11-18 | 2001-12-04 | International Business Machines Corporation | Delta compressed asynchronous remote copy |
US6408310B1 (en) * | 1999-10-08 | 2002-06-18 | Unisys Corporation | System and method for expediting transfer of sectioned audit files from a primary host to a secondary host |
US20030061537A1 (en) * | 2001-07-16 | 2003-03-27 | Cha Sang K. | Parallelized redo-only logging and recovery for highly available main memory database systems |
US6978396B2 (en) * | 2002-05-30 | 2005-12-20 | Solid Information Technology Oy | Method and system for processing replicated transactions parallel in secondary server |
US20060085672A1 (en) * | 2004-09-30 | 2006-04-20 | Satoru Watanabe | Method and program for creating determinate backup data in a database backup system |
US20060242370A1 (en) * | 2005-04-20 | 2006-10-26 | Yoshio Suzuki | Disaster recovery method, disaster recovery system, remote copy method and storage system |
US20080228832A1 (en) * | 2007-03-12 | 2008-09-18 | Microsoft Corporation | Interfaces for high availability systems and log shipping |
Also Published As
Publication number | Publication date |
---|---|
WO2011000701A1 (en) | 2011-01-06 |
US9223666B2 (en) | 2015-12-29 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| AS | Assignment | Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW Y Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:KOCHUKUNJAN, SUNIL VERGHESE;YENNARAM, KONDAL REDDY;REEL/FRAME:022899/0863 Effective date: 20090626 |
| STCF | Information on status: patent grant | Free format text: PATENTED CASE |
| MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 4 |
| AS | Assignment | Owner name: KYNDRYL, INC., NEW YORK Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:INTERNATIONAL BUSINESS MACHINES CORPORATION;REEL/FRAME:057885/0644 Effective date: 20210930 |
| MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 8 |