The CKPTFile Class
Before we can design our algorithm, we have to decide what part of the Java I/O subsystem we are going to use. Ideally, you would implement the checkpointing facilities as part of the New I/O (NIO) packages that come with JDK 1.4. You could, for example, try to create a class called CKPTMappedByteBuffer, which would be just like a MappedByteBuffer, except that it could process its changes atomically.
What You Need
JDK 1.4 from Sun Microsystems, or a compatible Java development kit
A Java IDE or a suitable command shell
NIO, however, doesn't make it easy to create subclasses of its buffer classes, and the implementation of such a class would be a distraction from the topic at hand. Instead, we'll consider the basic functionality (safe, atomic data writing) in a form that could later be adapted into a buffer.
The main class of our implementation (see Listing 1) is called CKPTFile.
CKPTFile isn't a buffer, but it's similar. It allows you to read and write bytes, either one at a time or in bulk. Bulk transfers can go to or from byte arrays, or to or from actual ByteBuffers. Unlike a buffer, a CKPTFile doesn't keep track of a read/write cursor; you have to specify where you want to read or write each time you do it.
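To make the cursor distinction concrete, here is a small sketch that uses the standard ByteBuffer class rather than CKPTFile itself (whose actual methods appear in Listing 1). The class name CursorContrastSketch and its helper methods are made up for illustration. Relative reads move the buffer's cursor, while absolute reads name the position explicitly, which is the style the text describes for CKPTFile.

```java
import java.nio.ByteBuffer;

// Illustrative only: contrasts cursor-based and position-explicit access
// using the standard ByteBuffer API. This is not the CKPTFile class.
class CursorContrastSketch {

    // Cursor style: each relative get() reads at the current position
    // and advances it by one. Returns the position afterwards.
    static int positionAfterRelativeReads(ByteBuffer buf, int count) {
        for (int i = 0; i < count; i++) {
            buf.get();
        }
        return buf.position();
    }

    // Explicit style: the caller names the index on every call,
    // and the buffer's position is left untouched.
    static byte readAt(ByteBuffer buf, int index) {
        return buf.get(index);
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.wrap(new byte[] {1, 2, 3, 4});
        System.out.println(positionAfterRelativeReads(buf, 2)); // prints 2
        System.out.println(readAt(buf, 3));                     // prints 4
        System.out.println(buf.position());                     // still prints 2
    }
}
```

With the explicit style there is no hidden state to get out of sync, at the cost of passing an offset on every call.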
(If you want to jump ahead, take a peek at the source for Pound.java; this is a test program that uses CKPTFile. In its main method, you can see the CKPTFile in action.)
Memory-Mapped File I/O
Even though CKPTFile isn't a Buffer, it uses a Buffer (a MappedByteBuffer, to be precise). To read and write the data, we use memory-mapped file I/O, which maps the contents of the file directly into an in-memory array (see Figure 3).
The array contains the bytes of the file: to read the file, all you have to do is read the array. This might sound like a terrible waste of RAM, but in practice it is implemented by a low-level operating system service called demand paging. Demand paging allows the file to be loaded into memory only as needed. Thus, the system only loads data that you actually access.
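As a minimal sketch of what mapping a file looks like with the NIO classes, the following maps a file read-only and fetches one byte from it. The class name MapReadSketch and the helper readByteAt are hypothetical names chosen for this example; RandomAccessFile, FileChannel.map, and MappedByteBuffer are the standard JDK 1.4 APIs.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Illustrative only: maps a file into memory and reads a single byte.
class MapReadSketch {

    // Maps the whole file read-only and returns the byte at the given offset.
    // The operating system decides when the underlying pages are loaded.
    static byte readByteAt(File file, int offset) throws IOException {
        RandomAccessFile raf = new RandomAccessFile(file, "r");
        try {
            FileChannel channel = raf.getChannel();
            MappedByteBuffer map =
                channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            return map.get(offset); // absolute read: no cursor involved
        } finally {
            raf.close();
        }
    }

    public static void main(String[] args) throws IOException {
        // Create a small scratch file to map.
        File file = File.createTempFile("mapread", ".dat");
        file.deleteOnExit();
        RandomAccessFile raf = new RandomAccessFile(file, "rw");
        try {
            raf.write(new byte[] {10, 20, 30, 40});
        } finally {
            raf.close();
        }
        System.out.println(readByteAt(file, 2)); // prints 30
    }
}
```

Note that the code never issues an explicit read call against the file after mapping it; reading the buffer is reading the file.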
Likewise, changes made to the array go directly into the file on disk. As with loading, the disk access doesn't necessarily happen right away. In this case, the operating system writes the data out after a delay, according to a scheme that maximizes efficient use of the disk and minimizes delay.
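The write path can be sketched the same way. The example below changes one byte through a read/write mapping and then reads it back with ordinary file I/O. MapWriteSketch and its two helpers are hypothetical names for this illustration. The call to force() asks the operating system to write the changed pages to storage right away instead of waiting for its own schedule; how soon changes become visible without it is platform-dependent.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Illustrative only: writes through a mapping, then reads back with plain I/O.
class MapWriteSketch {

    // Changes one byte of the file by writing into a read/write mapping.
    static void writeByteAt(File file, int offset, byte value) throws IOException {
        RandomAccessFile raf = new RandomAccessFile(file, "rw");
        try {
            FileChannel channel = raf.getChannel();
            MappedByteBuffer map =
                channel.map(FileChannel.MapMode.READ_WRITE, 0, channel.size());
            map.put(offset, value);
            map.force(); // request that dirty pages be written out now
        } finally {
            raf.close();
        }
    }

    // Reads one byte using ordinary (non-mapped) file I/O.
    static byte readWithPlainIO(File file, int offset) throws IOException {
        RandomAccessFile raf = new RandomAccessFile(file, "r");
        try {
            raf.seek(offset);
            return raf.readByte();
        } finally {
            raf.close();
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("mapwrite", ".dat");
        file.deleteOnExit();
        RandomAccessFile raf = new RandomAccessFile(file, "rw");
        try {
            raf.write(new byte[] {1, 2, 3, 4});
        } finally {
            raf.close();
        }
        writeByteAt(file, 1, (byte) 99);
        System.out.println(readWithPlainIO(file, 1)); // prints 99
    }
}
```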
Despite this operating-system trickery, the array really is the file. There is no semantic difference between reading and writing the array and reading and writing the file. And this is dangerous because any change you make can take effect immediately, and if you leave the file in an inconsistent state (by completing only some of a larger set of changes), you can corrupt the data.
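The danger is easy to reproduce. In the following sketch, a mapped file holds two int values that are supposed to always add up to 100, and an update that must change both is interrupted after changing only the first. The scenario, the class name PartialUpdateSketch, and the simulated failure are all invented for illustration; a real failure would be a crash or power loss rather than an exception.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Illustrative only: shows a multi-part update left half-done in a mapping.
class PartialUpdateSketch {

    // Moves 'amount' from the int at offset 0 to the int at offset 4.
    // If failMidway is set, the update stops after the first write.
    static void transfer(MappedByteBuffer map, int amount, boolean failMidway) {
        map.putInt(0, map.getInt(0) - amount);
        if (failMidway) {
            throw new IllegalStateException("simulated crash");
        }
        map.putInt(4, map.getInt(4) + amount);
    }

    // Sets both values to 50, interrupts a transfer of 30,
    // and returns the sum that is left in the mapping.
    static int sumAfterInterruptedTransfer(File file) throws IOException {
        RandomAccessFile raf = new RandomAccessFile(file, "rw");
        try {
            raf.setLength(8);
            FileChannel channel = raf.getChannel();
            MappedByteBuffer map =
                channel.map(FileChannel.MapMode.READ_WRITE, 0, 8);
            map.putInt(0, 50);
            map.putInt(4, 50);
            try {
                transfer(map, 30, true);
            } catch (IllegalStateException e) {
                // The first write has already taken effect; nothing undoes it.
            }
            return map.getInt(0) + map.getInt(4);
        } finally {
            raf.close();
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("partial", ".dat");
        file.deleteOnExit();
        System.out.println(sumAfterInterruptedTransfer(file)); // prints 70, not 100
    }
}
```

Nothing in the mapping records that a larger change was in progress, so there is no way to tell afterwards that the data is wrong.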
In fact, this kind of I/O is often even more dangerous than reading and writing files in the regular way, because file I/O often involves writing a file from scratch, whereas memory-mapped I/O is usually used for modifying a file directly.