To illustrate our approach, we have modified a lightweight transaction-based system called RVM and the EXODUS storage manager to use remote memory (instead of disks) for synchronous write operations. After studying the performance of both systems, we concluded that they spend a significant amount of their time synchronously writing transaction data to their log file, which is used to implement a two-phase commit protocol. When a transaction commits, all the data it modified are synchronously written to the log (stored as a UNIX file on a magnetic disk). Only after these data have been successfully written to the log is the system allowed to proceed.
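As a rough illustration of the unmodified commit path, the following Python sketch appends a transaction's records to a log file and blocks until the operating system reports them as flushed. The names `write_all`, `commit_sync`, and `demo_commit_sync`, as well as the record format, are hypothetical and are not taken from RVM or EXODUS.

```python
import os
import tempfile


def write_all(fd, data):
    """Write all of `data` to `fd`, retrying after partial writes."""
    view = memoryview(data)
    while len(view) > 0:
        written = os.write(fd, view)
        view = view[written:]


def commit_sync(log_fd, records):
    """Append the transaction's records to the log, then force them to disk."""
    for record in records:
        write_all(log_fd, record)
    # The caller blocks here until the data have been handed to stable storage;
    # this wait is the cost the paper aims to remove from the commit path.
    os.fsync(log_fd)


def demo_commit_sync():
    """Commit one illustrative transaction and return the log contents."""
    path = os.path.join(tempfile.mkdtemp(), "txn.log")
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        commit_sync(fd, [b"T1:x=1\n", b"T1:y=2\n", b"T1:COMMIT\n"])
    finally:
        os.close(fd)
    with open(path, "rb") as f:
        return f.read()
```

Every call to `commit_sync` pays the full latency of a disk flush before the transaction can proceed.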
We have modified both EXODUS and RVM to keep a copy of their log file in remote main memory (as well as on disk). The unmodified systems force all their sensitive data to disk at transaction commit time using synchronous disk write operations. In our modified systems, we substitute each synchronous write operation with the following two operations:
Essentially, we substitute a synchronous disk write operation with a synchronous network write operation plus an asynchronous disk write operation. The asynchronous write has no effect on completion time, since it proceeds in the background, as long as adequate data buffering is provided. At the same time, our systems do not compromise data reliability. Let us examine the steps involved in writing data in our systems:
The transaction is committed after step 2 completes. There appears to be a ``window of vulnerability'' between steps 2 and 3, that is, after the data have been safely written to remote memory (and scheduled to be written to disk) but before they have been safely written to the magnetic disk. If the local system crashes during this interval, the data still in the local main memory buffer cache will be lost. Fortunately, our system can still recover the seemingly lost data, since the same data reside in remote memory as a result of step 1. Data loss can happen only if both the local and the remote system crash during this interval. However, we have argued that the probability of both systems (which are equipped with UPSs) crashing within an interval of a few minutes is comparable to (or even lower than) the probability of a magnetic disk malfunction. Our system therefore provides a level of reliability comparable to that of a magnetic disk.
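The commit path and the recovery argument above can be sketched in a small, single-process Python simulation. This is a minimal sketch under stated assumptions, not the RVM or EXODUS implementation: `RemoteMemory` is a hypothetical in-process stand-in for another workstation's main memory, the background disk write is modelled by an explicit `background_flush` call rather than a real asynchronous write, and the sequence-numbered, tab-separated record format is invented for illustration.

```python
import os
import tempfile


class RemoteMemory:
    """Hypothetical stand-in for a remote workstation's main memory."""

    def __init__(self):
        self.records = []  # (sequence number, payload) pairs

    def write(self, seq, payload):
        # A real system would perform a network write here and return
        # only after the remote node holds the data.
        self.records.append((seq, payload))


def append_to_disk(path, records):
    """Append records to the on-disk log and force them to stable storage."""
    with open(path, "a", encoding="utf-8") as f:
        for seq, payload in records:
            f.write(f"{seq}\t{payload}\n")
        f.flush()
        os.fsync(f.fileno())


def read_disk_log(path):
    """Return the (sequence number, payload) pairs currently on disk."""
    if not os.path.exists(path):
        return []
    out = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            seq, payload = line.rstrip("\n").split("\t", 1)
            out.append((int(seq), payload))
    return out


class MirroredLog:
    """Log whose commits wait for remote memory, not for the disk."""

    def __init__(self, path, remote):
        self.path = path
        self.remote = remote
        self.buffer = []  # local main-memory buffer, lost on a crash
        self.next_seq = 0

    def commit(self, payload):
        seq = self.next_seq
        self.next_seq += 1
        self.remote.write(seq, payload)     # step 1: synchronous remote write
        self.buffer.append((seq, payload))  # step 2: schedule the disk write
        return seq                          # the transaction is committed here

    def background_flush(self):
        # Step 3: the buffered data finally reach the magnetic disk.
        append_to_disk(self.path, self.buffer)
        self.buffer.clear()


def recover(path, remote):
    """After a local crash, copy records missing from disk out of remote memory."""
    on_disk = {seq for seq, _ in read_disk_log(path)}
    missing = [(s, p) for s, p in remote.records if s not in on_disk]
    append_to_disk(path, missing)
    return read_disk_log(path)


def demo():
    path = os.path.join(tempfile.mkdtemp(), "txn.log")
    remote = RemoteMemory()
    log = MirroredLog(path, remote)
    log.commit("T1 x=1")
    log.background_flush()  # T1 is now on disk
    log.commit("T2 y=2")    # committed, but still inside the window
    del log                 # simulated local crash: the buffer is lost
    before = read_disk_log(path)
    after = recover(path, remote)
    return before, after
```

In `demo`, the second transaction is committed but never flushed. The disk log read after the simulated crash lacks it, and `recover` restores it from the remote copy. If the remote node were lost during the same interval, the record would be unrecoverable, which is the double-failure case discussed above.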