DevX HomePage

Learn to Read and Write Microsoft Excel Documents with Jakarta's POI

Jakarta's POI project makes it easy to read and write Excel files and provides support for manipulating other documents based on Microsoft's OLE 2 Compound Document Format. This article contains a step-by-step walkthrough that shows you how to get started with the Jakarta POI project.
he purpose of the Jakarta POI open source project is to provide Java with the ability to manipulate file formats based on Microsoft's OLE 2 Compound Document Format. Currently, the most widely accepted part of the project is the Horrible Spread Sheet Format (HSSF) (click the link to discover a bit of the history behind that name.

HSSF provides:

The HSSF framework is divided into the following packages (see Table 1).

Table 1: Packages in the HSSF framework. The table lists the packages in the HSSF framework and the purpose of each.

Package

Purpose

org.apache.poi.hssf.eventmodel

Handles different events generated in the process of reading and writing Excel documents.

org.apache.poi.hssf.eventusermodel

Provides classes to read Excel documents.

org.apache.poi.hssf.record.formula

Contains classes to handle FORMULA used in Excel document.

org.apache.poi.hssf.usermodel

Contains classes to generate Excel documents.

org.apache.poi.hssf.util

Contains utility classes to handle different attributes of the Excel document.


Using these packages you can interact with existing Excel documents and create new ones. Before starting to develop the example, here's a brief explanation of how the POI models an Excel document.




The Excel Document Model
From the HSSF point of view, the Excel document is divided into a Workbook, one or more Worksheets, Worksheet consists of rows. Rows have multiple cells and cells can have different display attributes and formula attached to them. This overall structure is as depicted in Figure 1.

 
Figure 1: The Excel Document Model. The diagram shows the relationships of objects in Excel's document model.
Following this model, HSSF provides objects and classes at each level. Thus, you can read an existing Excel document as a Workbook object, while you create an Excel document by creating a Workbook object composed of different Worksheet objects.

Installing HSSF
Go to the Jakarta.apache.org/poi site and download the latest binary for the POI project. Install it anywhere in your system. POI uses Jakarta's commons-logging package. The download includes with the Jakarta-commons.jar file and you must have that path in your classpath to run POI based applications. In the next section, you'll see how to create the Excel document shown in Figure 2 from Java, using the Jakarta POI project.

Figure 2: A Sample Excel Document. The sample Excel document shown here was generated with the Jakarta POI.




Build a Practical Example
To generate this Excel document using HSSF, you must:

You can see that creating the spreadsheet requires separate tasks where you create a worksheet, create rows and cells, create a merged region for the worksheet caption, specify different styles for the cells and create a special type of cell to contain the mathematical formula that calculates the total. Finally, you have to save the workbook in the file system.

Creating a new Excel worksheet is simple. First you create a new workbook:

   // create a new workbook
   HSSFWorkbook workBook = new HSSFWorkbook();
Then you can create a new worksheet within that workbook:

   //create a new worksheet
   HSSFSheet sheet = workBook.createSheet();
   etc...
You create rows by calling the Worksheet object's createRow method. Remember, the row index starts from 0.

   // ***************header data******************//
   //create a header row
   HSSFRow headerRow = sheet.createRow((short) 0);
To create the merged area for the caption you want to define a merged region from the first to the third row, and from first to the tenth column. Here's the code:

   //define the area for the header data(row1->row3, 
      col1-->col10)
   sheet.addMergedRegion(new Region(0, (short) 0, 2, 
      (short) 10));
After creating the caption region, create a cell to contain the caption value.

   //create the header data cell
   HSSFCell headerCell = 
      headerRow.createCell((short) 0);
   //add the date to the header cell
   headerCell.setCellValue("The Bowling Score");
Because the header row spans multiple rows and columns, the caption "The Bowling Score" will span the defined merged region.

Worksheet cells have many display properties that you can control through HSSF. To do that, you create an HSSFCellStyle object and apply styles such as cell alignment (CENTER/LEFT/RIGHT). The following code shows how you can set the background colors and the font style for individual cells.

   //create a style for the header cell
   HSSFCellStyle headerStyle = 
      workBook.createCellStyle();
        
   headerStyle.setAlignment(
      HSSFCellStyle.ALIGN_CENTER);
   headerCell.setCellStyle(headerStyle);
   
   
   //create a style for this header columns
   HSSFCellStyle columnHeaderStyle =  
      workBook.createCellStyle();
   
   columnHeaderStyle.setFillBackgroundColor(
      HSSFColor.BLUE_GREY.index);
   
   columnHeaderStyle.setFillForegroundColor(
      HSSFColor.BLUE_GREY.index);
   
   HSSFFont font = workBook.createFont();
   font.setColor(HSSFFont.COLOR_RED);
   columnHeaderStyle.setFont(font);
After defining cell styles, you can apply those styles as follows:

   HSSFCell colHeading1 = 
      columnHeaderRow.createCell((short) 0);
   HSSFCell colHeading2 = 
      columnHeaderRow.createCell((short) 4);
   
   colHeading1.setCellStyle(columnHeaderStyle);
   colHeading2.setCellStyle(columnHeaderStyle);
The Worksheet essentially consists of multiple rows and cells. You can create as many rows and cells as you need and then apply styles and assign data to the cells. The following example creates five rows with two cells in each row. The code sets the cell values last, using the setCellValue method.

   //**************report data rows and cols*********//
   // create 5 rows of data
   HSSFRow row1 = sheet.createRow((short) 5);
   HSSFRow row2 = sheet.createRow((short) 6);
   HSSFRow row3 = sheet.createRow((short) 7);
   HSSFRow row4 = sheet.createRow((short) 8);
   HSSFRow row5 = sheet.createRow((short) 9);
   
   // create the 2 cells for each row
   HSSFCell c11 = row1.createCell((short) 0);
   HSSFCell c12 = row1.createCell((short) 4);
   HSSFCell c21 = row2.createCell((short) 0);
   HSSFCell c22 = row2.createCell((short) 4);
   HSSFCell c31 = row3.createCell((short) 0);
   HSSFCell c32 = row3.createCell((short) 4);
   HSSFCell c41 = row4.createCell((short) 0);
   HSSFCell c42 = row4.createCell((short) 4);
   
   // writing data to the cells
   c11.setCellValue("Sam");
   c12.setCellValue(100);
   
   c21.setCellValue("John");
   c22.setCellValue(50);
   
   c31.setCellValue("Paul");
   c32.setCellValue(25);
   
   c41.setCellValue("Richard");
   c42.setCellValue(20);
One of the main advantages of Excel is that you can attach mathematical formulas to cells. The following code shows how to attach a formula to a cell. The example assigns a formula to calculate the total value of other cells in the worksheet. Notice that, in order to create a cell with a formula, you must first set the cell type to the FORMULA type.

   //create a formula for the total
           totalValue.setCellType(HSSFCell.CELL_TYPE_FORMULA);
   totalValue.setCellFormula("SUM(E6:E9)");
When the workbook is complete, you can save it to a file system by opening a FileOutputStream object for the destination file and writing the Workbook contents to the stream using the write method.

   FileOutputStream stream = new 
      FileOutputStream("c:/dev/src/customs/Book1.xls");
   workBook.write(stream);
It is really as simple as that. Here's the entire procedure in condensed form.




Reading an Excel Document
Reading an existing Excel document is just as simple as creating one. HSSF provides a custom POIFSFileSystem object that specializes in reading Microsoft's OLE 2 documents. Using the POIFSFileSystem object, you construct an HSSFWorkbook object from a specified Excel document. Here's an example.

   //create a POIFSFileSystem object to read the data
   POIFSFileSystem fs = new POIFSFileSystem( 
      new FileInputStream( 
      "c:/dev/src/customs/Book1.xls"));
Next, create a new workbook from the input stream. You can obtain a reference to any of the worksheets within the workbook. Notice that the worksheet index starts at 0.

   // create a workbook out of the input stream
   HSSFWorkbook wb = new HSSFWorkbook(fs);
   // get a reference to the worksheet
       HSSFSheet sheet = wb.getSheetAt(0);
A Little Bit More
Some of you may need to use Java to access Excel from a Web application. In such cases, you may need to construct an Excel document and show it in a browser. That's not difficult but does require a little trick. You send the Excel document as the response, but to make it appear in Excel, you need to set the MIME type of the response to the appropriate type. This method is browser independent. Here's an example:

   try {
      OutputStream out = res.getOutputStream();
   
      // SET THE MIME TYPE
      res.setContentType("application/vnd.ms-excel");
   
      // set content dispostion to attachment in 
      // case the open/save dialog needs to appear
      res.setHeader("Content-disposition", 
         "inline; filename=sample");
         workBook.write(out);
         out.flush();
         out.close();
   } catch (FileNotFoundException fne) {
        System.out.println("File not found...");
   } catch (IOException ioe) {
        System.out.println("IOException..." + 
           ioe.toString());
   }
The following points are important:

This tutorial should help you to get started in the world of Java-to-Microsoft bridges. The POI project offers considerably more than I've discussed here, such as the capability to interact with Microsoft Word and with Adobe's PDF files. The project is very active, so you should monitor the POI project and keep up what's going on.

Samudra Gupta holds a post graduate degree in Information Technology and Management from All India Management Association, India. He has six years of experience in designing and developing web-based applications. He presently works as an independent Java Consultant in the United Kingdom. He is the author of the book "Logging in Java with the JDK1.4 Logging API and Apache log4j, Apress Ltd, April 2003, is an author for JavaWorld, and Java Developer's Journal, and is a monthly contributor to javaboutique.com. When not programming or writing, he plays Contract Bridge. Contact him .


DevX is a division of Jupitermedia Corporation
© Copyright 2007 Jupitermedia Corporation. All Rights Reserved. Legal Notices