Writing Data Safely with the CKPTFile Class : Page 5

Making changes to memory-mapped files is a delicate operation, but far from impossible. Learn how to make safe, atomic changes to memory-mapped files using a checkpoint system that will leave your applications fast and robust and your data impervious to corruption.

The Physical Log
To restore the unmodified document in this scenario, you have to copy the existing data to another file before you overwrite it in the main file. Then, if a failure occurs before a checkpoint completes, the old data can be retrieved from this other file.

The other file is called the physical log, and its use is shown in Figure 7.

Figure 7. Repairing the Datafile Using the Physical Log : Data in the datafile that is changed is first copied to the physical log so that it can be restored later if necessary.
The physical log acts as a copy of your original data. This is conceptually similar to keeping the original file around while writing the new data to a temporary file: at a certain point, you have both the old and new data on disk. But the physical log is more economical. Instead of holding a copy of the entire file, it holds a copy of only the regions that have changed, so it needs far less space when the changes are small relative to the file.

Each of the put() methods in CKPTFile calls a method called markDirty(). This method tells the CKPTFile that a particular region of the file is about to be changed (that is, made dirty). The CKPTFile first copies the region's current contents to the physical log and only then carries out the write.
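The ordering described above can be illustrated with a minimal sketch. This is not the article's CKPTFile: the class name MarkDirtySketch, the in-memory ByteBuffer standing in for the mapped data file, and the list standing in for the physical log are all assumptions made for illustration. It only shows that the old bytes are saved before they are overwritten.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not the article's CKPTFile.
final class MarkDirtySketch {
    // Stand-in for the mapped data file.
    private final ByteBuffer data;
    // Stand-in for the physical log: saved copies of regions, oldest first.
    // (A real log entry would also record where the bytes came from.)
    private final List<byte[]> savedRegions = new ArrayList<>();

    MarkDirtySketch(int size) {
        this.data = ByteBuffer.allocate(size); // zero-filled
    }

    // Save the current contents of [start, start + length) before they change.
    private void markDirty(int start, int length) {
        byte[] original = new byte[length];
        for (int i = 0; i < length; i++) {
            original[i] = data.get(start + i); // absolute get, position untouched
        }
        savedRegions.add(original);
    }

    synchronized void putInt(int position, int value) {
        markDirty(position, Integer.BYTES); // 1. copy the old bytes to the log
        data.putInt(position, value);       // 2. only then overwrite them
    }

    synchronized int getInt(int position) {
        return data.getInt(position);
    }

    synchronized int savedRegionCount() {
        return savedRegions.size();
    }

    synchronized byte[] savedRegion(int index) {
        return savedRegions.get(index).clone();
    }

    public static void main(String[] args) {
        MarkDirtySketch file = new MarkDirtySketch(16);
        file.putInt(4, 42);
        file.putInt(4, 7);
        System.out.println(Arrays.toString(file.savedRegion(0))); // [0, 0, 0, 0]
        System.out.println(Arrays.toString(file.savedRegion(1))); // [0, 0, 0, 42]
        System.out.println(file.getInt(4));                       // 7
    }
}
```

Each saved region holds what the bytes looked like just before the corresponding write, which is exactly what a later repair would need.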

It's important to write to the physical log before writing to the datafile. If you write to the datafile first and the system crashes before writing to the physical log, you'll have no copy of the original data.

The physical log is just another file, wrapped in a class called PhysicalLog. It is opened by CKPTFile the first time a put() method is called. It contains a series of entries, implemented by a class called PhysicalLogEntry (see Listing 2). Each entry contains a chunk of data, along with the start and end positions within the data file.
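To make the shape of a log entry concrete, here is a rough sketch of a record holding a chunk of data plus its start and end positions. It is not the PhysicalLogEntry from Listing 2: the class name LogEntrySketch, the byte layout (two longs followed by the data), the exclusive end position, and the restoreInto() helper are all assumptions chosen for illustration.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch, not the article's PhysicalLogEntry.
final class LogEntrySketch {
    final long start;  // first byte of the region in the data file
    final long end;    // one past the last byte (exclusive end is an assumption)
    final byte[] data; // the original bytes of that region

    LogEntrySketch(long start, byte[] data) {
        this.start = start;
        this.end = start + data.length;
        this.data = data.clone();
    }

    // Assumed on-disk layout: start, end, then the raw bytes.
    byte[] toBytes() {
        ByteBuffer buf = ByteBuffer.allocate(2 * Long.BYTES + data.length);
        buf.putLong(start).putLong(end).put(data);
        return buf.array();
    }

    static LogEntrySketch fromBytes(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        long start = buf.getLong();
        long end = buf.getLong();
        byte[] data = new byte[(int) (end - start)];
        buf.get(data);
        return new LogEntrySketch(start, data);
    }

    // Copy the saved bytes back over the region they came from.
    void restoreInto(ByteBuffer dataFile) {
        for (int i = 0; i < data.length; i++) {
            dataFile.put((int) start + i, data[i]); // absolute put
        }
    }

    public static void main(String[] args) {
        LogEntrySketch entry = new LogEntrySketch(2, new byte[] {1, 2, 3});
        LogEntrySketch copy = LogEntrySketch.fromBytes(entry.toBytes());
        ByteBuffer file = ByteBuffer.wrap(new byte[] {9, 9, 9, 9, 9, 9});
        copy.restoreInto(file);
        System.out.println(java.util.Arrays.toString(file.array())); // [9, 9, 1, 2, 3, 9]
    }
}
```

If the same region is logged more than once between checkpoints, a repair pass in this style would need to apply entries newest first, so that the oldest saved copy is the one that ends up in the data file.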

To checkpoint a data file, you have to flush all written data to disk. Remember that most operating systems cache filesystem data in memory, making reads and writes faster. When you write data to a file, the data isn't necessarily written to disk immediately. Rather, it stays in the cache for a short time while other writes happen. When either enough writes have accumulated or enough time has passed, the operating system writes the modified data to disk.

This greatly increases filesystem speed, but it also makes the system less reliable: a write can appear to succeed while the data is still only in memory. If the system crashes before the operating system writes it out, the data won't be on disk.

This is where checkpointing comes in. To checkpoint a data file, tell the system to force the cached data out to disk. In NIO this is done with the MappedByteBuffer.force() method, which writes any changes made to the mapped buffer out to the storage device. If the file resides on a local storage device, you can assume that the data is on that device when the method returns; for files on non-local storage, such as a network filesystem, no such guarantee is made.
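As a minimal illustration of force(), the following sketch maps a file, writes one int through the mapping, and forces the change out. The class name ForceSketch and the helper methods are hypothetical; the NIO calls themselves (FileChannel.open, FileChannel.map, MappedByteBuffer.force) are standard.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch of forcing a mapped write to storage.
final class ForceSketch {
    // Maps the first `size` bytes of `file`, writes one int, and forces it out.
    static void writeAndForce(Path file, int size, int position, int value)
            throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            MappedByteBuffer buffer =
                    channel.map(FileChannel.MapMode.READ_WRITE, 0, size);
            buffer.putInt(position, value); // may sit in the OS cache for a while
            buffer.force();                 // ask the OS to write it to the device
        }
    }

    // Writes 42 to a temporary file and reads it back through ordinary file I/O.
    static int demo() {
        try {
            Path file = Files.createTempFile("force-sketch", ".dat");
            file.toFile().deleteOnExit();
            writeAndForce(file, 16, 4, 42);
            return ByteBuffer.wrap(Files.readAllBytes(file)).getInt(4);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 42
    }
}
```

Reading the value back only shows that the write is visible through the filesystem; it does not prove durability, which is what force() is for and which can only really be tested by cutting power.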

You can checkpoint a CKPTFile at any time by calling its checkpoint() method. Note, however, that this method is synchronized, as are all the get() and put() methods. This ensures that a checkpoint cannot happen while a write is in progress. That is the desired behavior: checkpoints should happen only in the safe zone between writes.

Once checkpointing completes, you can delete your physical log. The data file on disk now reflects every write made so far, so the saved copies of the old data are no longer needed.
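Put together, the write and checkpoint sequence might look roughly like the sketch below. It is an illustration under stated assumptions rather than the article's CKPTFile: the class name CheckpointSketch, the log entry layout, and the choice to reopen the log file for every write are simplifications invented here. The synchronized methods mirror the locking described above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch, not the article's CKPTFile.
final class CheckpointSketch {
    private final MappedByteBuffer buffer;
    private final Path logPath; // assumed location of the physical log

    CheckpointSketch(Path dataPath, Path logPath, int size) throws IOException {
        this.logPath = logPath;
        try (FileChannel channel = FileChannel.open(dataPath,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            // The mapping stays valid after the channel is closed.
            this.buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, size);
        }
    }

    // Append the region's current bytes to the log and force the log to disk.
    private void markDirty(int position, int length) throws IOException {
        ByteBuffer entry = ByteBuffer.allocate(2 * Integer.BYTES + length);
        entry.putInt(position).putInt(position + length); // start, end (assumed layout)
        for (int i = 0; i < length; i++) {
            entry.put(buffer.get(position + i));
        }
        entry.flip();
        try (FileChannel log = FileChannel.open(logPath,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND)) {
            while (entry.hasRemaining()) {
                log.write(entry);
            }
            log.force(true);
        }
    }

    synchronized void putInt(int position, int value) throws IOException {
        markDirty(position, Integer.BYTES); // log first
        buffer.putInt(position, value);     // then write
    }

    // Synchronized, so it can only run between writes.
    synchronized void checkpoint() throws IOException {
        buffer.force();                 // push the data file's changes to disk
        Files.deleteIfExists(logPath);  // old copies are no longer needed
    }

    // Returns { log exists after a put, log exists after a checkpoint }.
    static boolean[] demo() {
        try {
            Path dir = Files.createTempDirectory("ckpt-sketch");
            Path log = dir.resolve("data.log");
            CheckpointSketch file =
                    new CheckpointSketch(dir.resolve("data.bin"), log, 16);
            file.putInt(0, 42);
            boolean afterPut = Files.exists(log);
            file.checkpoint();
            boolean afterCheckpoint = Files.exists(log);
            return new boolean[] {afterPut, afterCheckpoint};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println("log exists after put: " + result[0]);
        System.out.println("log exists after checkpoint: " + result[1]);
    }
}
```

The log is deleted only after force() returns. Deleting it earlier would reopen the window in which a crash leaves neither a complete new state nor a way back to the old one.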
