METHOD FOR RELIABLY UPDATING A DATA GROUP IN A READ-BEFORE-WRITE DATA REPLICATION ENVIRONMENT USING A COMPARISON FILE
CROSS-REFERENCES TO RELATED APPLICATIONS
 This application is a continuation of U.S. patent application Ser. No. 11/093,392 entitled "RELIABLY UPDATING A DATA GROUP IN A READ-BEFORE-WRITE DATA REPLICATION ENVIRONMENT USING A COMPARISON FILE" filed on Mar. 30, 2005 for Henry E. Butterworth et al., and claims priority to U.S. patent application No. 10/867,058 entitled "Apparatus, system, and method for providing efficient disaster recovery storage of data using differencing" and filed on Jun. 14, 2004 for Kenneth Boyd.
FIELD OF THE INVENTION
 This invention relates to data replication and more particularly relates to reliably updating a data group in a data replication environment.
DESCRIPTION OF THE RELATED ART
 Two important objectives of a data replication environment are first to maintain an accurate replica of a data group, and second to maintain a consistent replica of a data group. Maintaining accuracy requires that as data is copied from one storage medium to another, no errors are introduced. Maintaining consistency requires that as data is copied from one storage medium to another, no data is lost or omitted. Accuracy may be ensured in the case of data corruption or media failure by copying a consistent replica of the data from a backup source. Consequently, in a data replication environment, it is important to maintain an up-to-date copy of a data group.
 There are two main types of data replication environments, a synchronous system and an asynchronous system. A synchronous system updates a second storage medium each time a first storage medium is updated. The update to the second storage medium is a part of the update transaction made on the first storage medium. Inconsistencies are less common in a synchronous system because less data is copied with each update to the second storage medium, and less time is required for the copy transaction to be completed. However, synchronous systems are often cumbersome because of the intense usage of network resources required to constantly update the second storage medium.
 Asynchronous systems are often used as an alternative to synchronous systems. An asynchronous system copies an updated data group from a first storage medium to a second storage medium. The first storage medium does not send updates to the second storage medium until updates to the first storage medium are complete. Asynchronous systems are beneficial, because updates to the second storage medium are made less frequently. If one of the storage mediums fails during a data update, the data may be inconsistent. Additionally, if a consistent backup copy of the data is not available, data may be lost or corrupted as a consequence of system corruption or media failure.
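 The contrast between the two environments described above can be illustrated with a minimal Python sketch. It is an illustration only, not part of the disclosed embodiments: the class names `Medium`, `SynchronousReplicator`, and `AsynchronousReplicator`, the `flush` method, and the `transfers` counter are all hypothetical, and each storage medium is modeled as an in-memory dictionary.

```python
class Medium:
    """Toy storage medium: maps a block id to its data (assumption)."""

    def __init__(self):
        self.blocks = {}


class SynchronousReplicator:
    """Mirrors every write to the second medium as part of the same update."""

    def __init__(self, first, second):
        self.first, self.second = first, second
        self.transfers = 0  # number of copy operations sent to the second medium

    def write(self, block_id, data):
        self.first.blocks[block_id] = data
        self.second.blocks[block_id] = data  # part of the same transaction
        self.transfers += 1


class AsynchronousReplicator:
    """Collects writes on the first medium and sends them later as one group."""

    def __init__(self, first, second):
        self.first, self.second = first, second
        self.pending = {}  # updated data group not yet sent
        self.transfers = 0

    def write(self, block_id, data):
        self.first.blocks[block_id] = data
        self.pending[block_id] = data  # second medium is not touched yet

    def flush(self):
        """Send the accumulated data group once updates to the first medium are complete."""
        if self.pending:
            self.second.blocks.update(self.pending)
            self.pending.clear()
            self.transfers += 1
```

 In this sketch, three writes cost the synchronous replicator three transfers, while the asynchronous replicator sends a single group on `flush`. Until `flush` is called, the second medium lags behind the first, which is the window in which a failure can leave the replica inconsistent.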
 From the foregoing discussion, it should be apparent that a need exists for an apparatus, system, and method that reliably update a data group in a data replication environment. Beneficially, such an apparatus, system, and method would provide a highly consistent backup replica of data stored in a data storage environment.
SUMMARY OF THE INVENTION
 The present invention has been developed in response to the present state of the art, and in particular, in response to the problems and needs in the art that have not yet been fully solved by currently available data replication products. Accordingly, the present invention has been developed to provide a method for reliably updating a data group in a data replication environment that overcomes many or all of the above-discussed shortcomings in the art.
 A method of the present invention is presented for reliably updating a data group in a data replication environment. The method in the disclosed embodiments substantially includes the steps necessary to carry out the functions presented above with respect to the operation of the described apparatus and system. In one embodiment, these steps include receiving an updated data group sent from a first storage medium to a second storage medium, comparing the updated data group with a previous data group previously existing on the second storage medium, and writing the updated data group to the second storage medium.
 Reference throughout this specification to features, advantages, or similar language does not imply that all of the features and advantages that may be realized with the present invention should be or are in any single embodiment of the invention. Rather, language referring to the features and advantages is understood to mean that a specific feature, advantage, or characteristic described in connection with an embodiment is included in at least one embodiment of the present invention. Thus, discussion of the features and advantages, and similar language, throughout this specification may, but do not necessarily, refer to the same embodiment.
 Furthermore, the described features, advantages, and characteristics of the invention may be combined in any suitable manner in one or more embodiments. One skilled in the relevant art will recognize that the invention may be practiced without one or more of the specific features or advantages of a particular embodiment. In other instances, additional features and advantages may be recognized in certain embodiments that may not be present in all embodiments of the invention.
 These features and advantages of the present invention will become more fully apparent from the following description and appended claims, or may be learned by the practice of the invention as set forth hereinafter.
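 The three steps named in the summary (receiving an updated data group, comparing it with the previous data group on the second storage medium, and writing it) can be sketched in Python as follows. This is a hedged illustration under stated assumptions, not the claimed method: the function name `update_data_group`, the dictionary model of a storage medium, and the shape of the returned comparison record are hypothetical. The summary does not say how the comparison result is used, so the sketch simply records the previous contents of each differing block and returns them.

```python
def update_data_group(second_medium: dict, updated_group: dict) -> dict:
    """Apply an updated data group to the second medium, reading before writing.

    second_medium: block id -> data currently on the second storage medium.
    updated_group: block id -> data received from the first storage medium.
    Returns a comparison record (assumption): block id -> previous data
    for every block whose contents differ, with None for blocks that are new.
    """
    # Step 1: the updated data group has been received (the argument).

    # Step 2: compare against the previous data group before overwriting it.
    comparison = {}
    for block_id, new_data in updated_group.items():
        previous = second_medium.get(block_id)  # read before write
        if previous != new_data:
            comparison[block_id] = previous

    # Step 3: write the updated data group to the second medium.
    second_medium.update(updated_group)
    return comparison
```

 For example, applying `{0: b"new", 1: b"same"}` to a second medium holding `{0: b"old", 1: b"same"}` leaves the medium equal to the updated group and returns a record containing only block 0, since block 1 did not change.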
BRIEF DESCRIPTION OF THE DRAWINGS
 In order that the advantages of the invention will be readily understood, a more particular description of the invention briefly described above will be rendered by reference to specific embodiments that are illustrated in the appended drawings. Understanding that these drawings depict only typical embodiments of the invention and are not therefore to be considered to be limiting of its scope, the invention will be described and explained with additional specificity and detail through the use of the accompanying drawings, in which:
 FIG. 1 is a schematic block diagram illustrating one embodiment of a system for reliably updating a data group in a data replication environment;
 FIG. 2 is a schematic block diagram illustrating one embodiment of an apparatus for reliably updating a data group in a data replication environment;
 FIG. 3 is a detailed schematic block diagram illustrating another embodiment of an apparatus for reliably updating a data group in a data replication environment;
 FIG. 4 is a schematic flow diagram illustrating one embodiment of a method for reliably updating a data group in a data replication environment;
 FIG. 5 is a schematic flow diagram illustrating one embodiment of a method for reliably writing data in a data replication environment; and
 FIG. 6 is a detailed schematic flow diagram illustrating another embodiment of a method for reliably updating a data group in a data replication environment.
DETAILED DESCRIPTION OF THE INVENTION
 Many of the functional units described in this specification have been labeled as modules, in order to more particularly emphasize their implementation independence. For example, a module may be implemented as a hardware circuit comprising custom VLSI circuits or gate arrays, off-the-shelf semiconductors such as logic chips, transistors, or other discrete components. A module may also be implemented in programmable hardware devices such as field programmable gate arrays, programmable array logic, programmable logic devices or the like.
 Modules may also be implemented in software for execution by various types of processors. An identified module of executable code may, for instance, comprise one or more physical or logical blocks of computer instructions which may, for instance, be organized as an object, procedure, or function. Nevertheless, the executables of an identified module need not be physically located together, but may comprise disparate instructions stored in different locations which, when joined logically together, comprise the module and achieve the stated purpose for the module.
 Indeed, a module of executable code may be a single instruction, or many instructions, and may even be distributed over several different code segments, among different programs, and across several memory devices. Similarly, operational data may be identified and illustrated herein within modules, and may be embodied in any suitable form and organized within any suitable type of data structure. The operational data may be collected as a single data set, or may be distributed over different locations including over different storage devices.
 Reference throughout this specification to 'one embodiment,' 'an embodiment,' or similar language means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, appearances of the phrases 'in one embodiment,' 'in an embodiment,' and similar language throughout this specification may, but do not necessarily, all refer to the same embodiment.
 Reference to a signal bearing medium may take any form capable of generating a signal, causing a signal to be generated, or causing execution of a program of machine-readable instructions on a digital processing apparatus. A signal bearing medium may be embodied by a transmission line, a compact disk, a digital-video disk, a magnetic tape, a Bernoulli drive, a magnetic disk, a punch card, flash memory, integrated circuits, or other digital processing apparatus memory device.
 The schematic flow chart diagrams included are generally set forth as logical flow chart diagrams. As such, the depicted order and labeled steps are indicative of one embodiment of the presented method. Other steps and methods may be conceived that are equivalent in function, logic, or effect to one or more steps, or portions thereof, of the illustrated method. Additionally, the format and symbols employed are provided to explain the logical steps of the method and are understood not to limit the scope of the method. Although various arrow types and line types may be employed in the flow chart diagrams, they are understood not to limit the scope of the corresponding method. Indeed, some arrows or other connectors may be used to indicate only the logical flow of the method. For instance, an arrow may indicate a waiting or monitoring period of unspecified duration between enumerated steps of the depicted method. Additionally, the order in which a particular method occurs may or may not strictly adhere to the order of the corresponding steps shown.
 Furthermore, the described features, structures, or characteristics of the invention may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided, such as examples of programming, software modules, user selections, network transactions, database queries, database structures, hardware modules, hardware circuits, hardware chips, etc., to provide a thorough understanding of embodiments of the invention. One skilled in the relevant art will recognize, however, that the invention may be practiced without one or more of the specific details, or with other methods, components, materials, and so forth. In other instances, well-known structures, materials, or operations are not shown or described in detail to avoid obscuring aspects of the invention.
 FIG. 1 depicts a system 100 for reliably updating a data group in a data replication environment. In one embodiment, the system 100 includes a first storage medium 102, a controller 104, and a second storage medium 106. In such an embodiment, the first storage medium 102 sends updates to the second storage medium 106. The updates may be managed by the controller 104. In one embodiment, the system 100 may also include a peer controller 108 in communication with the controller 104 and the second storage medium 106.
 In one embodiment, the first storage medium 102 is an IBM Enterprise Storage System™ (ESS). The first storage medium 102 may be part of a Storage Area Network (SAN), and receive updates from other computing devices on the SAN. Of course, the invention may be implemented using any suitable data storage medium. Application data stored on the first storage medium 102 may be extremely error sensitive. For example, a banking application may store data about banking transactions. If the data is corrupted, due to system failure or otherwise, significant financial consequences may follow. Therefore, it is desirable to maintain an accurate and consistent backup copy of the data on a second storage medium 106.
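 The arrangement of the system 100 can be pictured with a small Python sketch of how the components reference one another. The sketch is illustrative only and makes several assumptions: the class names `StorageMedium`, `Controller`, and `System`, and the methods `handle_update` and `send_update`, are hypothetical, storage is modeled as a dictionary, and the peer controller 108 is represented only as a reference because its behavior is not described in this passage.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StorageMedium:
    """Toy storage medium holding block id -> data (assumption)."""
    name: str
    blocks: dict = field(default_factory=dict)


@dataclass
class Controller:
    """Manages updates destined for a target storage medium."""
    target: StorageMedium
    peer: Optional["Controller"] = None  # optional peer controller

    def handle_update(self, update: dict) -> None:
        # Apply the received update to the target medium.
        self.target.blocks.update(update)


@dataclass
class System:
    """First medium, controller, and (via the controller) second medium."""
    first: StorageMedium
    controller: Controller

    def send_update(self, update: dict) -> None:
        # The first medium is updated, then sends the update onward
        # through the controller to the second medium.
        self.first.blocks.update(update)
        self.controller.handle_update(update)


# Wiring that mirrors the reference numerals in FIG. 1.
first_102 = StorageMedium("first storage medium 102")
second_106 = StorageMedium("second storage medium 106")
peer_108 = Controller(target=second_106)
controller_104 = Controller(target=second_106, peer=peer_108)
system_100 = System(first=first_102, controller=controller_104)
```

 Calling `system_100.send_update({7: b"txn"})` places the block on both media in this toy model. Both controllers hold a reference to the same second storage medium, which reflects the statement that the peer controller 108 communicates with the controller 104 and the second storage medium 106.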
 In one embodiment, the data stored on the first storage medium 102 is copied to a second storage medium 106. The second storage medium 106 may be the same model as the first storage medium 102. Alternatively, the physical or software platform of the second storage medium 106 may be different from that of the first storage medium 102. Additionally, the first storage medium 102 and the second storage medium 106 may be separated by a large geographical dis