Cocoon is a powerful tool for publishing content to multiple formats from XML. It can be used for static content, but its most powerful application is publishing dynamically generated XML streams. The XML streams can come from database information wrapped in XML tags, from XHTML Web site content from a remote site (as would be common in portal integration), from Web services, or from a multitude of other sources.
This article describes how to integrate Cocoon with PostgreSQL, a leading open source database. Once you learn how to access PostgreSQL data (or any other relational database) from Cocoon, you can develop robust XML-driven applications with a relational backend. Think about it: XML is highly structured data; relational databases store highly structured data—it’s a natural fit.
Because this article assumes that you already have some familiarity with Cocoon, it does not describe how to set up the tool in detail. If you have not set up Cocoon previously, visit the Cocoon Web site and read the INSTALL.txt file included in the Cocoon distribution.
Why PostgreSQL for Your Relational Database?
When you develop with Cocoon and use XML for your source data, chances are you will need to integrate with a relational database. Most Web sites or applications that store or access large amounts of data interface with a relational database, because relational databases are reliable, have good performance, and are familiar to many developers.
In the open source world, PostgreSQL and MySQL are two of the best-known databases. MySQL is in wider use, but PostgreSQL bears closer resemblance to the commercial database servers. In fact, the Postgres code was commercialized as Illustra, which was later acquired by Informix, now part of IBM. I prefer PostgreSQL for the following reasons:
- It more closely follows the ANSI SQL standard.
- It has had transaction support for longer (and thus is presumably more robust).
- It feels better designed, possibly because, as the successor to the INGRES database project at UC Berkeley, it has really been in development since the 1970s.
Find additional information on MySQL and PostgreSQL at sql-info.de’s MySQL gotchas and PostgreSQL gotchas pages.
Set Up Tomcat and Cocoon
Begin by downloading Tomcat. You will want the Tomcat 5.5.4 release (or later). It should be called “5.5.4 tar.gz”. Unpack the archive once you have downloaded it.
Next, download Cocoon. You should get the latest release (2.1.6 as of the writing of this article). You should also download the patch I created (cocoon-2.1.6_patch_for_java_1.5.diff). To obtain the cocoon-2.1.6_patch_for_java_1.5.diff file, look in the downloadable code accompanying this article. You need the patch because Cocoon 2.1.6 does not compile under Java 1.5. Luckily, the changes are fairly minor: just changing “enum” to “enumeration” (because enum is a reserved keyword in Java 1.5) and fixing some of the build files in Cocoon to recognize Java 1.5.
I run Cocoon in a Tomcat servlet engine on Linux. Your interaction with Cocoon may differ if your setup is different. Since the rest of this article uses Unix commands, you will also have to adjust them if you run Windows rather than Linux.
The versions of software I use are Java 1.5.0_01, Cocoon 2.1.6, Tomcat 5.5.4, and PostgreSQL 8.0.0. Again, if you use different versions, you may have different results. If you use a version of Cocoon released after this article was published (in other words, greater than 2.1.6), you may not need to perform the patch described in the next paragraph because a newer version of Cocoon may already include the necessary changes.
If you use Java 1.5 and Cocoon 2.1.6, type the following commands:
- cd /path/to/cocoon-2.1.6 (Replace /path/to/cocoon-2.1.6 with the real path to the Cocoon directory.)
- patch -p1 < /path/to/cocoon-2.1.6_patch_for_java_1.5.diff
- ./build.sh webapp
- cp -a build/webapp /path/to/tomcat-5.5.4/webapps/cocoon (Replace /path/to/tomcat-5.5.4 with the real path to the Tomcat directory. Note that Tomcat's deployment directory is named webapps.)
Going forward, I use $TOMCAT_HOME to represent /path/to/tomcat-5.5.4 and $COCOON_HOME to represent /path/to/tomcat-5.5.4/webapps/cocoon. When you see these references, replace them with the full path to Tomcat and to Cocoon, respectively, in your filesystem.
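Rather than substituting the paths by hand each time, one option is to export both names as shell variables so that later commands can be pasted as written. The following is a minimal sketch, assuming a POSIX shell. The default path is a placeholder, and the webapps directory name assumes Tomcat's stock layout; adjust it if your deployment directory differs.

```shell
# Placeholder default; set TOMCAT_HOME beforehand or edit this path.
TOMCAT_HOME="${TOMCAT_HOME:-/path/to/tomcat-5.5.4}"
# Assumes Cocoon was copied into Tomcat's default deployment directory.
COCOON_HOME="${COCOON_HOME:-$TOMCAT_HOME/webapps/cocoon}"
export TOMCAT_HOME COCOON_HOME

# Show the values so a wrong path is easy to spot.
echo "TOMCAT_HOME=$TOMCAT_HOME"
echo "COCOON_HOME=$COCOON_HOME"
```

Placing these lines in a shell startup file keeps the variables available in every new terminal session.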
Test that your Tomcat and Cocoon installation works. First, start up Tomcat with the following two commands:
- cd $TOMCAT_HOME/bin
- ./startup.sh
Figure 1. Welcome Screen for Successful Cocoon Installation
View $TOMCAT_HOME/logs/catalina.out and look for something like “INFO: Server startup in 18312 ms” at the very end of the log file (the “18312 ms” will appear different because your computer will probably take a different amount of time to start Tomcat). Once that appears, you know the server has started up.
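Instead of re-opening the log by hand, the check can be scripted. The sketch below polls a log file until the "Server startup in" line appears or a timeout expires. The function name wait_for_startup and the default timeout are illustrative choices, not part of Tomcat.

```shell
# Poll a Tomcat log until the startup line appears or the timeout expires.
# Usage: wait_for_startup LOGFILE [TIMEOUT_SECONDS]
# Returns 0 once the line is found, 1 if the timeout is reached first.
wait_for_startup() {
    logfile=$1
    timeout=${2:-60}
    elapsed=0
    while :; do
        if [ -f "$logfile" ] && grep -q "Server startup in" "$logfile"; then
            return 0
        fi
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
}
```

A call such as wait_for_startup "$TOMCAT_HOME/logs/catalina.out" 120 would wait up to two minutes. Because catalina.out is normally appended to across restarts, a line left over from an earlier run would also match, so this is only a rough check.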
Open a Web browser and go to http://$SERVER_HOSTNAME:8080/cocoon, where $SERVER_HOSTNAME is the name of the host on which you run Tomcat and Cocoon. I have a Linux machine called galleon on which I run Cocoon. To test whether Cocoon works after following the Cocoon installation instructions, I visit http://galleon:8080/cocoon. Replace “galleon” with the name of the machine on which you run Cocoon (localhost is valid if you run Cocoon on your local computer). You should see something like Figure 1 in your browser. If you see it, you have successfully installed Cocoon and are ready to set up and test the PostgreSQL integration.
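The same check can be made from the command line if curl is installed. This is a small sketch: the function names are hypothetical, and port 8080 is assumed to be unchanged from Tomcat's default.

```shell
# Build the Cocoon URL for a given host (defaults to localhost).
# The trailing slash avoids the redirect Tomcat issues for /cocoon.
cocoon_url() {
    echo "http://${1:-localhost}:8080/cocoon/"
}

# Print the HTTP status code returned for the Cocoon welcome page.
# -s silences progress output, -o discards the body, -w prints the code.
cocoon_status() {
    curl -s -o /dev/null -w '%{http_code}' "$(cocoon_url "$1")"
}
```

Running cocoon_status galleon (or cocoon_status localhost) and seeing 200 suggests that the welcome page is being served. The first request after a restart can be slow while Cocoon initializes.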
Integrate PostgreSQL
First, you should shut down Tomcat because you need to install more software and make further changes to your server configuration files. Shut down Tomcat with the following two commands:
- cd $TOMCAT_HOME/bin
- ./shutdown.sh
To integrate PostgreSQL 8.0.0 and Cocoon 2.1.6, simply install a JDBC driver in Tomcat. Tomcat is the container for Cocoon, so a globally accessible driver in Tomcat will be accessible by the Tomcat servlet container, Cocoon, and any other servlets you place within the Tomcat container. Download the latest JDBC driver. The one you want is in the row labeled “8.0” and the column labeled “JDBC 3”. (It is build 309 as of the writing of this article.) The file should be called postgresql-8.0.309.jdbc3.jar or something similar (the build number might differ). Place the file in $TOMCAT_HOME/common/lib.
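The copy step can be wrapped in a small function that refuses to proceed when either path is wrong. This is a sketch only; the function name install_jdbc_driver is hypothetical, and the default destination assumes $TOMCAT_HOME is set as described earlier.

```shell
# Copy a JDBC driver JAR into Tomcat's shared library directory.
# Usage: install_jdbc_driver /path/to/driver.jar [LIB_DIR]
install_jdbc_driver() {
    jar_file=$1
    lib_dir=${2:-$TOMCAT_HOME/common/lib}
    if [ ! -f "$jar_file" ]; then
        echo "driver JAR not found: $jar_file" >&2
        return 1
    fi
    if [ ! -d "$lib_dir" ]; then
        echo "library directory not found: $lib_dir" >&2
        return 1
    fi
    cp "$jar_file" "$lib_dir/" &&
        echo "installed $(basename "$jar_file") in $lib_dir"
}
```

For example, install_jdbc_driver /path/to/postgresql-8.0.309.jdbc3.jar copies the driver named above; substitute the file name you actually downloaded.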
Copying the JAR file to $TOMCAT_HOME/common/lib provides a JDBC interface to PostgreSQL, but it does not actually tell Tomcat or Cocoon that a PostgreSQL database is installed. In order to do that, you have to configure a database connection pool. As the name implies, a database connection pool is an object (a “pool”) that returns database connections. You can either set up the connection pool in Tomcat or in Cocoon—you should not do both. If you want to set it up in Tomcat, consult this Web page. You then just access the JNDI binding from within Cocoon to fetch the DataSource object (which gives you Connection objects upon request). Since this article focuses on Cocoon and not Tomcat, I describe how to set up the connection pool in Cocoon:
- Open $COCOON_HOME/WEB-INF/cocoon.xconf in a text editor.
- Search for the section that contains the following text:
<!-- ..... Start configuration from 'datasources' -->
<!-- If you have an Oracle database, and are using the pool-controller below, you should add the attribute "oradb" and set it to true: That way, the test to see if the server has disconnected the JdbcConnection will function properly. -->
<!-- If you need to ensure an autocommit is set to true or false, then create the following "auto-commit" element: <auto-commit>false</auto-commit> The default is true. -->
jdbc:hsqldb:hsql://localhost:9002
sa
<!-- ..... End configuration from 'datasources' -->
- The preceding excerpt contains one <jdbc> entry, for the HSQLDB sample database, inside the <datasources> element. Right after the closing </jdbc> tag but before the closing </datasources> tag, insert a second <jdbc> entry whose <dburl> value is jdbc:postgresql://localhost/mytestdb and whose <user> value is wchao.
In the entry you just added, you can change the database name from mytestdb to anything you would like. If you change the database name, you should also change the name and logger attributes of the <jdbc> tag and the value of the <dburl> tag. You can also specify a different username and password. In this example, no password is associated with the username “wchao” because I have password-less access to my test database. Normally, you should assign a password; I omitted one only for simplicity.
- Save the cocoon.xconf file and exit your editor.
- Open $COCOON_HOME/WEB-INF/web.xml in your text editor.
- Look for the following section of XML code:
load-class
<!-- For parent ComponentManager sample: org.apache.cocoon.samples.parentcm.Configurator -->
<!-- For IBM WebSphere: com.ibm.servlet.classloader.Handler -->
<!-- For Database Driver: -->
org.hsqldb.jdbcDriver
- After the line that says “org.hsqldb.jdbcDriver”, insert this line: “org.postgresql.Driver”. The load-class section of the web.xml file should now appear as follows:
load-class
<!-- For parent ComponentManager sample: org.apache.cocoon.samples.parentcm.Configurator -->
<!-- For IBM WebSphere: com.ibm.servlet.classloader.Handler -->
<!-- For Database Driver: -->
org.hsqldb.jdbcDriver
org.postgresql.Driver
- Save the web.xml file and exit your editor.
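As a rough guide to what the two additions might look like in full, the sketch below prints both fragments so they can be pasted into the files by hand. It assumes the element names used in the sample configuration files that ship with Cocoon 2.1 (jdbc, pool-controller, dburl, user, and password in cocoon.xconf; init-param, param-name, and param-value in web.xml). The logger value and the pool sizes are illustrative guesses, and the function names are hypothetical. Compare the output with the existing HSQLDB entries in your own files before using it.

```shell
# Print a datasource entry for cocoon.xconf.
# Usage: print_datasource [DB_NAME] [DB_USER]
# The logger value and pool sizes are assumptions, not required values.
print_datasource() {
    db_name=${1:-mytestdb}
    db_user=${2:-wchao}
    cat <<EOF
<jdbc logger="core.datasources.$db_name" name="$db_name">
  <pool-controller min="5" max="10"/>
  <dburl>jdbc:postgresql://localhost/$db_name</dburl>
  <user>$db_user</user>
  <password/>
</jdbc>
EOF
}

# Print the load-class parameter for web.xml with both drivers listed.
print_load_class() {
    cat <<'EOF'
<init-param>
  <param-name>load-class</param-name>
  <param-value>
    <!-- For Database Driver: -->
    org.hsqldb.jdbcDriver
    org.postgresql.Driver
  </param-value>
</init-param>
EOF
}

print_datasource mytestdb wchao
print_load_class
```

The empty password element mirrors the password-less test account described above; a real deployment would supply a value there.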
Integrate PostgreSQL (cont’d)
Now you need to set up the “mytestdb” database. Make sure that PostgreSQL is running and that your user account has superuser privileges (or at least privileges to create a database). Create the mytestdb database with the following command:
- createdb mytestdb
If you do not have sufficient privileges to create the database, have someone else create the database for you. You also have to create some sample tables and insert some sample data so that you can test the integration of Cocoon with PostgreSQL. To save on typing, the downloadable code provides create_table.sql and sample_data.sql. It also provides drop_table.sql, in the event you make a mistake or just want to start fresh again with the database tables. Issue the following commands to create the tables and populate them with data:
- psql -d mytestdb < /path/to/create_table.sql
- psql -d mytestdb < /path/to/sample_data.sql
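The database steps can also be collected into one function that stops at the first failure. This is a sketch under a few assumptions: the helper names run and setup_db are hypothetical, the SQL files are assumed to sit together in one directory, and setting DRY_RUN to a non-empty value only prints the commands instead of executing them.

```shell
# Run a command, or only print it when DRY_RUN is set to a non-empty value.
run() {
    if [ -n "$DRY_RUN" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create the database, then load the schema and the sample rows.
# Usage: setup_db DB_NAME SQL_DIR
setup_db() {
    db=$1
    sql_dir=$2
    run createdb "$db" || return 1
    # ON_ERROR_STOP makes psql exit with an error on the first failed statement.
    run psql -v ON_ERROR_STOP=1 -d "$db" -f "$sql_dir/create_table.sql" || return 1
    run psql -v ON_ERROR_STOP=1 -d "$db" -f "$sql_dir/sample_data.sql"
}
```

Calling setup_db mytestdb /path/to with DRY_RUN=1 set first shows the three commands without touching the database, which is a cheap way to confirm the paths.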
At this point, you are ready to start up Tomcat again. Type the following to start Tomcat:
- cd $TOMCAT_HOME/bin
- ./startup.sh
You need to create the Cocoon code to access the database. One of the nice things about Tomcat is that it supports live reloading of code. Cocoon also checks for updated code. This enables you to add new XSPs and incrementally develop your application. Find the myapp.tgz file in the downloadable code. Perform the following steps:
- cd $COCOON_HOME
- tar xzf /path/to/cocoon_myapp.tgz
This article assumes you have familiarity with basic Cocoon concepts such as sitemaps, XSPs, and pipelines. If you have no idea what those concepts are, the Cocoon site is a good resource: http://cocoon.apache.org/2.1/. The tar command you just executed reads the archive file and creates a myapp directory in $COCOON_HOME. It then unpacks a few files: index.xsp, person_table.xslt, sitemap.xmap, and xml2html.xslt. The meat of the application is in index.xsp and person_table.xslt, which you should look over. It is a fairly simple application, but it does provide you with evidence that Cocoon is communicating with the PostgreSQL database named mytestdb.
Figure 2. RDBMS Table Data Wrapped in XML Tags
Point your Web browser at http://$SERVER_HOSTNAME:8080/cocoon/myapp/index.xml. You should see something like Figure 2.
Figure 2 shows the data from the relational database tables, only it’s wrapped in XML tags. You can verify this by looking through the data in the mytestdb database directly. Type “psql -d mytestdb” and then issue SQL commands such as “select * from person_tbl;” or “select * from address_tbl;”. For another view on the same data, point your Web browser at http://$SERVER_HOSTNAME:8080/cocoon/myapp/index.html. You should see something like Figure 3.
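For a quick, non-interactive comparison, the row counts can be pulled from the command line and checked against the number of records that appear in the index.xml output. A minimal sketch: the function names and the PSQL and DB_NAME variables are illustrative, while the table names are the ones queried above.

```shell
# Client command used to reach the database; override PSQL if psql is
# not on your PATH.
PSQL=${PSQL:-psql}

# Print the number of rows in one table of the test database.
# -A (unaligned) and -t (tuples only) reduce the output to the bare value.
count_rows() {
    "$PSQL" -d "${DB_NAME:-mytestdb}" -A -t -c "select count(*) from $1;"
}

# Report the row count for each table queried above.
report_counts() {
    for table in person_tbl address_tbl; do
        echo "$table: $(count_rows "$table")"
    done
}
```

Running report_counts prints one line per table.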
Figure 3. Index.html HTML View of the RDBMS Table Data
The index.html HTML view simply goes through a different stylesheet than the index.xml XML view, which you can verify by examining sitemap.xmap and reviewing the pipelines.
Put PostgreSQL and Cocoon to Good Use
As you can see, accessing PostgreSQL (or another relational database) from Cocoon is useful. Even in XML-driven application development, or perhaps I should say especially in XML-driven application development, data from a relational backend is important. XML is highly structured data. Relational databases also store highly structured data. It’s a natural fit. Almost any significant XML-driven project is still going to involve a relational backend, and now that you know how to access PostgreSQL data from your Cocoon XSPs, you can develop rich data-driven applications with one of the leading open source relational databases.
Cocoon lets you access databases through XSPs and SQLTransformer, as well as in actions. For further details on writing code to access databases via these three methods, consult the following resources from the Cocoon Web site:
- SQLTransformer (PDF)
- Data sources