DATA LOCK MANAGEMENT IN A
DISTRIBUTED FILE SERVER SYSTEM
DETERMINES VARIABLE LOCK LIFETIME
IN RESPONSE TO REQUEST TO ACCESS
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to distributed file server systems and, more particularly, to cache memory data lock management for such systems.
2. Description of the Related Art
Computing systems consisting of workstations or terminals connected to a central computer or mainframe by a fast local area network have become widespread. Such systems are used for executing general office tasks, engineering design, software development, and many other applications. Future systems promise to provide an information processing environment that will include thousands of workstations, as well as many mid-size machines or mainframes, all interconnected by one or more local area networks (LANs). The workstations in such systems need to share resources, both for economic reasons and due to the nature of applications. Systems that share resources such as system hardware and software, system data, and user software and data, are known as distributed file systems.
The sharing of data objects such as data files and databases generally is administered through the use of file servers. The file servers comprise computing units that are interposed between individual workstations and the local area network. In addition to managing the sharing of resources, file servers also provide such services as automatic backup and recovery, user mobility, and management of workstation cache memories. The distributed file system allows for replication of files and/or storing data files in cache memory on various levels of the storage hierarchy. A cache memory consists of a memory associated with a particular workstation, in which copies of data objects from a primary location of data storage, such as the file server or a mid-size machine or mainframe, may reside. The workstation carries out operations on its copy of a data object rather than on the data object maintained in the primary location of data storage. Caching of data at a workstation level, if done properly, can improve system performance because it permits data to be accessed without the intervention of the file server. Such caching, however, introduces the problem of data coherency between the cache data and the data at the primary location of data storage. Data coherency refers to ensuring that only a single image of a data object exists or that the system at least performs as if only a single image exists. With large data caches, the data traffic between file servers and workstations required to maintain data coherency can be the dominant factor in effective data access time and cache performance. Most conventional approaches to data coherency fall into one of two categories: those that assume reliable broadcast, and therefore cannot tolerate communication failures, and those that require a check for consistency on every data read, and therefore suffer from excessive overhead.
A system preferably provides failure tolerance, and therefore attempts have been made to reduce overhead while providing coherency.
Conventional cache file systems that have provided data coherency do so by utilizing data locks. When a client, such as a local workstation terminal or an application, requires access to a data object, it must request access to the data object from a file server. Conventional caching systems provide coherency by granting data locks with either a zero lifetime or an infinite lifetime. A zero lifetime lock, for example, is granted when the client workstation will immediately execute a write command upon receiving the lock and then release its data lock, thereafter providing other users with a chance to write data. An infinite lifetime lock is held by the client to whom it was granted for as long as the client desires it.
System efficiency is optimized in the case of data files that are highly shared by granting zero lifetime locks. This minimizes delay resulting from processor failures and from communication failures, such as when a data object is being accessed by one workstation at the same time access is requested by another workstation. System efficiency is improved in the case of system files, which are repeatedly accessed but rarely written, with an infinite lifetime lock. This permits client workstations to complete extended read operations without interruption.
Attempts have been made to combine the advantages of the two lock lifetime extremes. For example, Sturgis, Mitchell and Israel in the July 1980 issue of Operating Systems Review, pp. 55-69, describe a distributed file system that uses breakable locks with time limits. The locks, however, have a minimum lifetime before they can be broken. Because clients in the Sturgis et al. system are not reliably notified when a lock is broken, the system actually resembles a system with locks of zero lifetime. That is, locks are released as soon as read or write operations are completed. U.S. Pat. No. 4,716,528 to Crus describes a method for utilizing a coordinated pair of locking limits to manage data concurrency and adjust lock granularity. Again, however, all the data locks have a predetermined zero or infinite lifetime. U.S. Pat. No. 4,965,719 to Shoens describes a method for increasing throughput and maintaining page coherency in a multiple processing environment with predetermined lock lifetimes.
From the discussion above, it should be apparent that there is a need for a distributed file system in which data coherency is assured and in which optimal lock lifetimes are provided for both highly shared files and rarely written files. The present invention satisfies this need.
SUMMARY OF THE INVENTION
The present invention provides a method of data lock management in a distributed file system in which client workstations serviced by a file server are granted variable lifetime locks when they request access to a data object. As used herein, variable lifetime locks are locks that can assume any lifetime value in the range from zero seconds to infinity and are not limited to the endpoints of the range. The present invention permits the lock lifetimes to be determined in either a static scheme or a dynamic scheme. In the static scheme, certain system parameters, such as the file read/write access ratio, the file access rate, the number of clients sharing the data object, and the number of clients served by the file server, are assumed to have a predetermined value or characteristic, and lock lifetimes are assigned accordingly. In the dynamic scheme of determining lock lifetime, the actual real-time system parameters are utilized to determine the lock lifetime. For example, parameters such as the file read/write access ratio, file access rate, and number of clients sharing a data object are typically calculated in real time by a computer system in the normal course of operation. The present invention uses such readily calculated parameters to dynamically assign a variable lifetime to a lock at the time the lock is granted. Thus, the lock lifetime can be tailored to the immediate operating situation in the system.
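By way of illustration only, a variable lifetime lock can be pictured as a record that carries its own lifetime, with zero and infinity as the two endpoints of the permitted range. The following Python sketch is not taken from the specification; the names `DataLock`, `grant_lock`, and `is_valid`, and the use of a single server clock measured in seconds, are assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class DataLock:
    """A lock on one data object held by one client (illustrative names)."""
    object_id: str
    client_id: str
    granted_at: float  # seconds on the server's clock (assumed)
    lifetime: float    # seconds: 0.0, math.inf, or any value in between

    @property
    def expires_at(self) -> float:
        return self.granted_at + self.lifetime

    def is_valid(self, now: float) -> bool:
        # A zero lifetime lock expires at the moment it is granted;
        # an infinite lifetime lock never expires.
        return now < self.expires_at


def grant_lock(object_id: str, client_id: str, now: float,
               lifetime: float) -> DataLock:
    """Grant a lock whose lifetime is chosen at the time of the grant."""
    if math.isnan(lifetime) or lifetime < 0:
        raise ValueError("lifetime must lie in the range [0, inf]")
    return DataLock(object_id, client_id, now, lifetime)
```

For example, `grant_lock("payroll.dat", "ws22", now=100.0, lifetime=10.0)` yields a lock that is valid at time 105.0 and expired at time 110.0, while a lifetime of `math.inf` reproduces the conventional infinite lifetime lock.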
In another aspect of the invention, client workstations having a current data lock can request renewal of their data lock prior to the expiration of the lock. For example, a workstation might be ready to execute a file backup operation that requires more time to complete than is remaining in the lifetime of the data lock. Rather than complete as much of the backup operation as possible and then compete with other users for access to the data file to complete the operation, the workstation might request a data lock renewal before the lock expires. In this way, the file backup operation is not interrupted by an access request from another workstation during the backup operation. This can reduce overhead for lock management and improve the efficiency of the system.
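The renewal just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed method: the `DataLock` record, the `renew_lock` function, and the policy of never shortening a lock are hypothetical choices made for the example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DataLock:
    client_id: str
    expires_at: float  # seconds on the server's clock (assumed)


def renew_lock(lock: DataLock, client_id: str, now: float,
               needed: float) -> DataLock:
    """Extend an unexpired lock so that it covers `needed` seconds from `now`."""
    if client_id != lock.client_id:
        raise PermissionError("only the holder may renew a lock")
    if now >= lock.expires_at:
        raise TimeoutError("lock already expired; a new lock must be requested")
    # Assumed policy: a renewal never shortens the lock.
    return replace(lock, expires_at=max(lock.expires_at, now + needed))
```

A workstation with 30 seconds left on its lock that is about to start a 120 second backup would call `renew_lock(lock, "ws22", now=100.0, needed=120.0)` before starting, so that the lock then runs until time 220.0 rather than 130.0.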
Other features and advantages of the present invention should be apparent from the following description of the preferred embodiment, which illustrates, by way of example, the principles of the invention.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram of a computer system in accordance with the present invention, showing a plurality of file servers with client workstations all interconnected through a local area network (LAN).
FIG. 2 is a block diagram of a file server and workstation pair of the type illustrated in FIG. 1.
FIG. 3 is a time line illustrating operation of the system shown in FIGS. 1 and 2.
FIG. 4 is a flow diagram illustrating the operation of the system shown in FIGS. 1 and 2.
FIG. 5 is a graph of relative server coherency-induced load as a function of lock lifetime for a simulated system and static performance model.
FIG. 6 is a graph of average coherency delay as a function of lock lifetime for the simulated system and static performance model associated with FIG. 3.
DESCRIPTION OF THE PREFERRED EMBODIMENT
With reference to FIGS. 1 and 2, a computer system 10 is shown having a central processor 12 such as an International Business Machines Corporation (IBM Corporation) System 390 product connected to a plurality of file servers such as IBM Corporation AS400 file servers, three of which 14, 16, 18 are illustrated and all of which are interconnected by a local area network (LAN) 20. Each of the file servers is associated with a plurality of client workstations such as IBM Corporation RS6000 workstations, three of which 22, 24, 26 are illustrated. The client workstations periodically request access to data objects from their respective file server. Each file server, in turn, provides copies of data objects to the client workstations if the requested data object is associated with the particular server. If the data object is associated with a different server or with the central processor 12 system memory, then the server gets the data object via the LAN 20 and provides it to the workstation.
FIG. 2 shows a particular file server 14 and its associated workstations 22, 24, 26. FIG. 2 shows that each file server includes a central processing unit (CPU) 30 and a storage memory device 32. The file server CPU controls access to the storage device, which contains data objects, and communicates with the LAN 20 and the workstations. Each workstation includes a terminal 34, such as a keyboard and monitor combination, a workstation CPU 36, and a cache memory 38. The terminal, CPU, and cache memory of the workstation are interconnected. Data objects are received and data is provided through the terminal 34 and cache memory 38.
As noted above, optimal system performance is obtained when data locks are not limited to a fixed lifetime of either zero or infinity. For example, locks with relatively short lifetimes minimize delay resulting from client and server failures and partitioning communication failures while still reducing overhead from repeated requests for locks. When a server cannot communicate with a client workstation, the file server must delay write operations to a file for which the failed client holds a data lock until that lock lifetime has expired. Short lifetimes minimize this delay and reduce the storage requirements of a file server. Locks with short lifetimes also minimize false contention, which refers to a locking conflict when no actual conflict in file access exists. In particular, false contention can occur when a client writes to a file that is covered by a lock held by another client when the other client is not currently accessing the file. Zero lifetime locks, however, create excessive overhead for files that are repeatedly written. On the other hand, locks with relatively long lifetimes are significantly more efficient for the system on files that are accessed repeatedly and have relatively little sharing of write operations.
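The effect of lock lifetime on write delay after a failure can be illustrated with a short sketch. It assumes, for the purposes of the example only, that a reachable lock holder gives up its lock when the server asks, so that only unreachable holders of unexpired locks force the server to wait; the function name `earliest_write_time` and its arguments are hypothetical.

```python
from typing import Dict


def earliest_write_time(now: float, writer: str,
                        lock_expiry: Dict[str, float],
                        reachable: Dict[str, bool]) -> float:
    """Return the earliest time the server may apply `writer`'s write.

    `lock_expiry` maps each lock-holding client to the expiry time of its
    lock on the file. Clients missing from `reachable` are treated as
    unreachable.
    """
    wait_until = now
    for client, expires_at in lock_expiry.items():
        if client == writer or expires_at <= now:
            continue  # the writer's own lock and expired locks never block
        if not reachable.get(client, False):
            # Assumed rule: the server waits out the lock of a failed client.
            wait_until = max(wait_until, expires_at)
    return wait_until
```

With locks expiring at 105.0 (workstation 22, reachable), 160.0 (workstation 24, unreachable), and 90.0 (workstation 26, unreachable), a write requested by workstation 22 at time 100.0 must wait until 160.0. Had workstation 24 held an infinite lifetime lock, the write would be delayed indefinitely, which is the cost that shorter lifetimes avoid.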
The present invention provides improved system performance by providing variable lifetime data locks. The data locks do not have predetermined lock lifetimes of either zero or infinity. Each file server 14, 16, 18 controls the lifetimes of the data locks it grants. Unlike conventional distributed file systems, the lifetime of a lock is not predetermined for all of the locks granted. Rather, the lifetime of a data lock is determined at the time the lock is granted. In accordance with the invention, the lock lifetime is determined in either a static scheme or a dynamic scheme based on system operating statistics and other parameters. The system operating statistics can include, for example, the file access rate, file read access rate, file write access rate, number of client workstations served by a file server, and number of client workstations sharing a data object. Other lock parameters can include the type of data object for which access is being requested, type of workstation requesting access, and the assigned priority of the data object or workstation.
In a static scheme, the system operating statistics are assumed to have a predetermined value and the lock lifetimes are determined based on achieving maximum efficiency for a system having the predetermined characteristics. The static scheme is static only in the sense that system operating statistics are assumed static. Other parameters, such as type of workstation and assigned priority, are dynamic and variably determine the lock lifetime. In a dynamic scheme, one or more of the system operating statistics used to determine lock lifetimes are calculated in real time. Most distributed file systems calculate the statistics that might be used to determine the lock lifetimes as part of normal system operations. Thus, overhead for the dynamic scheme is not unreasonable. The system parameters and lock lifetimes that result in maximum efficiency are best determined empirically, or through simulation methods well-known to those skilled in the art.
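The difference between the two schemes can be sketched in a few lines. The heuristic below is purely hypothetical and is not the performance model of the invention: it merely lengthens the lifetime for read-mostly data and shortens it as writes and sharing increase. The names `FileStats`, `lifetime_from_stats`, `static_lifetime`, and `dynamic_lifetime`, the assumed statistics, the one second base, and the 60 second cap are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    reads_per_sec: float
    writes_per_sec: float
    sharers: int  # client workstations sharing the data object


# Static scheme: the statistics are assumed in advance (made-up values).
ASSUMED_STATS = FileStats(reads_per_sec=1.0, writes_per_sec=0.05, sharers=2)


def lifetime_from_stats(stats: FileStats, base: float = 1.0,
                        max_lifetime: float = 60.0) -> float:
    """Hypothetical heuristic: lifetime grows with the read/write ratio
    and shrinks with the number of sharers, capped at `max_lifetime`."""
    if stats.writes_per_sec == 0:
        return max_lifetime
    read_write_ratio = stats.reads_per_sec / stats.writes_per_sec
    return min(max_lifetime, base * read_write_ratio / max(stats.sharers, 1))


def static_lifetime(priority_factor: float = 1.0) -> float:
    # Statistics are fixed, but other parameters (here a priority factor)
    # can still vary the resulting lifetime.
    return priority_factor * lifetime_from_stats(ASSUMED_STATS)


def dynamic_lifetime(measured: FileStats,
                     priority_factor: float = 1.0) -> float:
    # Statistics are measured in real time when the lock is granted.
    return priority_factor * lifetime_from_stats(measured)
```

Under these assumptions the static scheme always starts from the same statistics, whereas `dynamic_lifetime(FileStats(2.0, 4.0, 4))` returns a much shorter lifetime for a heavily written, widely shared file than `dynamic_lifetime(FileStats(10.0, 0.0, 5))` does for a file that is never written.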
The present inventor has empirically determined a performance model that has been found to provide optimal