devxlogo

Learn to Read and Write Microsoft Excel Documents with Jakarta’s POI

Learn to Read and Write Microsoft Excel Documents with Jakarta’s POI

he purpose of the Jakarta POI open source project is to provide Java with the ability to manipulate file formats based on Microsoft’s OLE 2 Compound Document Format. Currently, the most widely accepted part of the project is the Horrible Spread Sheet Format (HSSF) (click the link to discover a bit of the history behind that name.

HSSF provides:

  • An environmental API for read-only access to Excel (97-2000) format.
  • A full user model API for creating, reading and modifying the Excel files.

The HSSF framework is divided into the following packages (see Table 1).

Table 1: Packages in the HSSF framework. The table lists the packages in the HSSF framework and the purpose of each.

Package

Purpose

org.apache.poi.hssf.eventmodel

Handles different events generated in the process of readingand writing Excel documents.

org.apache.poi.hssf.eventusermodel

Provides classes to read Excel documents.

org.apache.poi.hssf.record.formula

Contains classes to handle FORMULA used in Excel document.

org.apache.poi.hssf.usermodel

Contains classes to generate Excel documents.

org.apache.poi.hssf.util

Contains utility classes to handle different attributes of theExcel document.

Using these packages you can interact with existing Excel documents and create new ones. Before starting to develop the example, here’s a brief explanation of how the POI models an Excel document.

The Excel Document Model
From the HSSF point of view, the Excel document is divided into a Workbook, one or more Worksheets, Worksheet consists of rows. Rows have multiple cells and cells can have different display attributes and formula attached to them. This overall structure is as depicted in Figure 1.

Build a Practical Example
To generate this Excel document using HSSF, you must:

  • Create the header by spanning it across two cells.
  • Leave a gap of one row.
  • Populate the first and second cell of the following row with the displayed data.
  • The last row displays a total, based on a formula that calculates the total value.

You can see that creating the spreadsheet requires separate tasks where you create a worksheet, create rows and cells, create a merged region for the worksheet caption, specify different styles for the cells and create a special type of cell to contain the mathematical formula that calculates the total. Finally, you have to save the workbook in the file system.

Creating a new Excel worksheet is simple. First you create a new workbook:

   // create a new workbook   HSSFWorkbook workBook = new HSSFWorkbook();

Then you can create a new worksheet within that workbook:

   //create a new worksheet   HSSFSheet sheet = workBook.createSheet();   etc...

You create rows by calling the Worksheet object’s createRow method. Remember, the row index starts from 0.

   // ***************header data******************//   //create a header row   HSSFRow headerRow = sheet.createRow((short) 0);

To create the merged area for the caption you want to define a merged region from the first to the third row, and from first to the tenth column. Here’s the code:

   //define the area for the header data(row1->row3,       col1-->col10)   sheet.addMergedRegion(new Region(0, (short) 0, 2,       (short) 10));

After creating the caption region, create a cell to contain the caption value.

   //create the header data cell   HSSFCell headerCell =       headerRow.createCell((short) 0);   //add the date to the header cell   headerCell.setCellValue("The Bowling Score");

Because the header row spans multiple rows and columns, the caption “The Bowling Score” will span the defined merged region.

Worksheet cells have many display properties that you can control through HSSF. To do that, you create an HSSFCellStyle object and apply styles such as cell alignment (CENTER/LEFT/RIGHT). The following code shows how you can set the background colors and the font style for individual cells.

   //create a style for the header cell   HSSFCellStyle headerStyle =       workBook.createCellStyle();           headerStyle.setAlignment(      HSSFCellStyle.ALIGN_CENTER);   headerCell.setCellStyle(headerStyle);         //create a style for this header columns   HSSFCellStyle columnHeaderStyle =        workBook.createCellStyle();      columnHeaderStyle.setFillBackgroundColor(      HSSFColor.BLUE_GREY.index);      columnHeaderStyle.setFillForegroundColor(      HSSFColor.BLUE_GREY.index);      HSSFFont font = workBook.createFont();   font.setColor(HSSFFont.COLOR_RED);   columnHeaderStyle.setFont(font);

After defining cell styles, you can apply those styles as follows:

   HSSFCell colHeading1 =       columnHeaderRow.createCell((short) 0);   HSSFCell colHeading2 =       columnHeaderRow.createCell((short) 4);      colHeading1.setCellStyle(columnHeaderStyle);   colHeading2.setCellStyle(columnHeaderStyle);

The Worksheet essentially consists of multiple rows and cells. You can create as many rows and cells as you need and then apply styles and assign data to the cells. The following example creates five rows with two cells in each row. The code sets the cell values last, using the setCellValue method.

   //**************report data rows and cols*********//   // create 5 rows of data   HSSFRow row1 = sheet.createRow((short) 5);   HSSFRow row2 = sheet.createRow((short) 6);   HSSFRow row3 = sheet.createRow((short) 7);   HSSFRow row4 = sheet.createRow((short) 8);   HSSFRow row5 = sheet.createRow((short) 9);      // create the 2 cells for each row   HSSFCell c11 = row1.createCell((short) 0);   HSSFCell c12 = row1.createCell((short) 4);   HSSFCell c21 = row2.createCell((short) 0);   HSSFCell c22 = row2.createCell((short) 4);   HSSFCell c31 = row3.createCell((short) 0);   HSSFCell c32 = row3.createCell((short) 4);   HSSFCell c41 = row4.createCell((short) 0);   HSSFCell c42 = row4.createCell((short) 4);      // writing data to the cells   c11.setCellValue("Sam");   c12.setCellValue(100);      c21.setCellValue("John");   c22.setCellValue(50);      c31.setCellValue("Paul");   c32.setCellValue(25);      c41.setCellValue("Richard");   c42.setCellValue(20);

One of the main advantages of Excel is that you can attach mathematical formulas to cells. The following code shows how to attach a formula to a cell. The example assigns a formula to calculate the total value of other cells in the worksheet. Notice that, in order to create a cell with a formula, you must first set the cell type to the FORMULA type.

   //create a formula for the total           totalValue.setCellType(HSSFCell.CELL_TYPE_FORMULA);   totalValue.setCellFormula("SUM(E6:E9)");

When the workbook is complete, you can save it to a file system by opening a FileOutputStream object for the destination file and writing the Workbook contents to the stream using the write method.

   FileOutputStream stream = new       FileOutputStream("c:/dev/src/customs/Book1.xls");   workBook.write(stream);

It is really as simple as that. Here’s the entire procedure in condensed form.

  • First, create a HSSFWorkbook object.
  • Obtain a HSSFWorksheet object from the HSSFWorkbook object. Note, that the only way, you can obtain a reference to a Worksheet object is via a Workbook object. There’s no public constructor for the Worksheet object. This makes sense because Excel has no model for Worksheets that exist separately from a Workbook.
  • Similarly, you obtain HSSFRow objects from the Worksheet object and HSSFCell objects from Row objects?the existence of each type depends on its parent.
  • To create headers, create a merged region of rows and cells using the org.apache.poi.hssf.util.Region object. Note that the collections of rows and cells are zero-based.
  • Add the data to the cells using the HSSFCell object’s setCellValue method. Cells support all Java native data types such as String, int, boolean etc. The API also supports some other commonly used data types such as Date and Calendar and lets you set those through the setCellValue() method.
  • The HSSFCellStyle class in HSSF handles display attributes for the cells such as color, font, width, height etc.. You can create an arbitrary style for a particular cell. For complex attributes such as colors and fonts HSSF provides objects such as HSSFColour and HSSFFont.
  • The cells support formulas. The example applies the formula SUM(E6:E9) to the total cell value.
  • Lastly, to physically create the Excel document in the file system, you need to create an OutputStream object and call the HSSFWorkbook’s write method, passing the created OutputStream object as a parameter.

Reading an Excel Document
Reading an existing Excel document is just as simple as creating one. HSSF provides a custom POIFSFileSystem object that specializes in reading Microsoft’s OLE 2 documents. Using the POIFSFileSystem object, you construct an HSSFWorkbook object from a specified Excel document. Here’s an example.

   //create a POIFSFileSystem object to read the data   POIFSFileSystem fs = new POIFSFileSystem(       new FileInputStream(       "c:/dev/src/customs/Book1.xls"));

Next, create a new workbook from the input stream. You can obtain a reference to any of the worksheets within the workbook. Notice that the worksheet index starts at 0.

   // create a workbook out of the input stream   HSSFWorkbook wb = new HSSFWorkbook(fs);   // get a reference to the worksheet       HSSFSheet sheet = wb.getSheetAt(0);

A Little Bit More
Some of you may need to use Java to access Excel from a Web application. In such cases, you may need to construct an Excel document and show it in a browser. That’s not difficult but does require a little trick. You send the Excel document as the response, but to make it appear in Excel, you need to set the MIME type of the response to the appropriate type. This method is browser independent. Here’s an example:

   try {      OutputStream out = res.getOutputStream();         // SET THE MIME TYPE      res.setContentType("application/vnd.ms-excel");         // set content dispostion to attachment in       // case the open/save dialog needs to appear      res.setHeader("Content-disposition",          "inline; filename=sample");         workBook.write(out);         out.flush();         out.close();   } catch (FileNotFoundException fne) {        System.out.println("File not found...");   } catch (IOException ioe) {        System.out.println("IOException..." +            ioe.toString());   }

The following points are important:

  • Use the MIME type string “application/vnd.ms-excel”
  • In the response header, the content-disposition entry defines how to deliver this content to the browser. For example, an inline attribute specifies that the browser should present an Open/Save/Cancel dialog before opening this document. The filename attribute specifies a default file name for the document.

This tutorial should help you to get started in the world of Java-to-Microsoft bridges. The POI project offers considerably more than I’ve discussed here, such as the capability to interact with Microsoft Word and with Adobe’s PDF files. The project is very active, so you should monitor the POI project and keep up what’s going on.

devxblackblue

About Our Editorial Process

At DevX, we’re dedicated to tech entrepreneurship. Our team closely follows industry shifts, new products, AI breakthroughs, technology trends, and funding announcements. Articles undergo thorough editing to ensure accuracy and clarity, reflecting DevX’s style and supporting entrepreneurs in the tech sphere.

See our full editorial policy.

About Our Journalist

©2024 Copyright DevX - All Rights Reserved. Registration or use of this site constitutes acceptance of our Terms of Service and Privacy Policy.