Login | Register   
RSS Feed
Download our iPhone app
Browse DevX
Sign up for e-mail newsletters from DevX


Manage Database Metadata with POI and Excel : Page 2

Set up quick and easy database metadata housekeeping with the Java Apache POI libraries and Microsoft Excel.




Full Text Search: The Key to Better Natural Language Queries for NoSQL in Node.js

The Apache POI overview page describes the project this way:
"The POI project is the master project for developing pure Java ports of file formats based on Microsoft's OLE 2 Compound Document Format. OLE 2 Compound Document Format is used by Microsoft Office Documents, as well as by programs using MFC property sets to serialize their document objects."

Figure 1. POI-Generated Excel Output File

Practically speaking, this means that the POI API enables you to write Java code that interacts with Microsoft Office applications without having to know anything about the underlying C++ Microsoft Foundation Classes. Figure 1 shows an Excel output file containing some of the metadata that you can pull out of your database tables using POI.

A closer look at POI and its code explains how it generated the Excel output. (Click here to download the code.) The data dictionary utility first connects to your database to collect the metadata from the tables. (To connect to any database using Java you need your database vendor's JDBC drivers. The code in this article uses Oracle, so if you try it on your computer, you need to have Oracle's classes12.jar in your classpath.) Lines 46-55 in the constructor make the connection to the database and set up the Excel output file:

System.out.print(" Loading JDBC Driver -> " + driverClass + "\n"); Class.forName(driverClass).newInstance(); System.out.print(" Connecting to -> " + connectionURL + "\n"); this.con = DriverManager.getConnection(connectionURL, userID, userPassword); System.out.print(" Connected as -> " + userID + "\n"); wb = new HSSFWorkbook(); f = new File("c:\\poidatadict.xls");

To access the POI API you must have the POI jar file (see resources for download location) in your classpath. In the snippet above, the HSSFWorkbook variable wb is the top-level object of the Excel object hierarchy. As each table's metadata is accessed, the utility creates a new worksheet in the workbook object, and as each of the table's columns is processed, it adds a new row with three columns to hold the metadata (line 102):

String[] types = {"TABLE"}; resultSet = md.getTables(null, null, "%", types); // Get the table names while (resultSet.next()) { // Get the table name String tableName = resultSet.getString(3); // Get the table's catalog and schema names (if any) String tableCatalog = resultSet.getString(1); String tableSchema = resultSet.getString(2); if(tableSchema.equals("TESTSCHEMA")) { int rowNum = 1; System.out.println("table name: " + tableName); sheet = wb.createSheet(tableName); ResultSet rset = md.getColumns(null, null, tableName.toUpperCase(), "%"); HSSFRow row = sheet.createRow(rowNum); row.createCell(ZERO).setCellValue("Column Name"); row.createCell(ONE).setCellValue("Data Type"); row.createCell(TWO).setCellValue("Description"); while(rset.next()) { rowNum++; HSSFRow dataRow = sheet.createRow(rowNum); dataRow.createCell(ZERO).setCellValue(rset.getString(4)); dataRow.createCell(ONE).setCellValue(codeToText(rset.getInt(5))); //System.out.println("name: " + rset.getString(4) + " type: " +
rset.getString(5)); } rset.close(); }

Since a database could contain multiple schemas, the code first sets up a filter to process only the tables in a particular schema (line 89). For each database table, it creates a new worksheet in the workbook. By passing the tableName variable to the createSheet( ) method, the code automatically sets up the worksheet with the name of the table appearing in the worksheet tab.

This example registers the metadata values for only the column's name and its data type, but you can extract any metadata that interests you as long as your database vendor's JDBC spec implementation supports it (e.g., column size, number of decimal places, whether the column is nullable, whether some examples of other metadata are available using Oracle). Additionally, the utility creates an empty description field so you can add plain language explanations of the column's usage afterwards. Lines 96-98 set up headings for these three columns. Since the utility uses zero-based column numbering when a new cell is created in a row, I employ static constants that are named to match the column (ZERO, ONE, etc.).

The while loop (lines 100-106) evaluates each column in the table and adds its metadata to the cells of the worksheet. The private codeToText( ) method converts Oracle data type codes to plain text as each column's metadata is evaluated. Lines 109-115 write out the data in the workbook object to the physical file on the filesystem and catch any exceptions:

try { fos = new FileOutputStream(f); wb.write(fos); fos.close(); }catch(IOException ioe ){ ioe.printStackTrace(); }

Comment and Contribute






(Maximum characters: 1200). You have 1200 characters left.



Thanks for your registration, follow us on our social networks to keep up-to-date