
Sizing Up Open Source Java Persistence

A few months ago, I was teaching a class on JDBC. After a particularly tedious JDBC lab, I alluded to the many Java persistence frameworks that help Java developers avoid some of the tedious and error-prone JDBC coding needed to get Java objects in and out of a relational database. I was asked, “Which is best?” My response was the standard instructor/consultant answer: “Well, it depends.” I could tell by the students’ facial expressions that my answer was less than satisfactory. There are some differences in the way each framework performs the work, but the results are supposed to be the same, at least in the general sense of getting data in and out of the database.

What follows is a study I have made over the past few months to more suitably answer the question posed by my students and others looking at Java persistence options. This is not an article that will teach you how to persist Java objects in each of the technologies; although the attached code provides a couple of examples in each framework that may help in that endeavor. Instead, this article considers the following questions:

  • How easy is it to obtain and set up the persistence frameworks?
  • What is the learning curve for each framework? Are there ample resources for help and support when using the frameworks?
  • How well do the frameworks adapt to existing database and object models?
  • How does each framework perform?
  • What’s the impact of the frameworks on the amount of Java code, configuration files, and byte code footprint?
  • How do the frameworks deal with persistent objects across transactions?
  • Do the frameworks assist with cascading operations across object relationships?

Is there a clear winner or choice for your project? Is there a Java persistence framework that can claim the title of “Best-in-breed”? Unfortunately, as you will see, the answer to “which framework is best” might still be “it depends.” In fact, as you will also see, there are a lot of caveats and footnotes in my results. A direct comparison is possible, but not under the exact same conditions in all cases, thus the need to qualify the results. However, I do think you will find the results a good reference for understanding some of the strengths and weaknesses of the frameworks and a good place to start when examining Java persistence options. If for no other reason, the code and information provided with this article can help you more quickly devise your own tests.

The Contenders
What is a Java persistence framework (also referred to as a Java object-relational mapping mechanism)? It’s the part of a program that moves data in Java objects to and from a relational data store. There are several persistence frameworks. Of course, you can also write your own using the JDBC API.

The goal of these frameworks is to reduce the amount of Java code (in other words your own JDBC code) needed to get the Java object data in and out of the database. This article compares the following frameworks:

  • Hibernate, Version 3: The very popular open source persistence framework
  • JPOX JDO, Version 1.2: The reference implementation of Sun’s Java Data Objects (JDO) 2.0 persistence specification.
  • Castor JDO, Version 1.0.5: An open source framework that provides Java-to-XML binding and Java-to-SQL persistence. (Side note: Even though JDO is in its name, Castor JDO is NOT a JDO implementation. To avoid confusion, I’ll refer to it simply as Castor in this article.)
  • iBatis for Java, Version 2.2: Another open source, lightweight data persistence framework.
  • JDBC: The JDBC API is provided with any standard Java Standard Edition download. Homegrown JDBC solutions provide a baseline for comparison for all frameworks.

A student of the Java persistence problem will recognize that this is not a complete list of options. There are several commercial persistence frameworks such as Oracle’s TopLink, Thought Inc.’s CocoBase, and ATG’s SQL repository mechanisms. There are also other lesser-known open source frameworks. Many of these can be found at www.java-source.net. The Spring framework also offers JDBC abstraction, although it “integrates out of the box” with a number of the frameworks examined here. And there are, of course, Enterprise JavaBeans 3.0 and the relatively new Java Persistence API that originated as part of the work of JSR 220 (EJB 3.0). So, while there are other options, this article concentrates on those frameworks that are free (open source), relatively popular, and do not require a Java Enterprise Edition (JEE) container.

Framework Acquisition and Setup
Each of the persistence frameworks is very simple to download and set up. In fact, all of the frameworks here simply require adding the appropriate libraries to the build and execution paths. Most of these, as open source frameworks, require additional open source libraries for tasks such as XML parsing and logging. Except for Castor, the persistence framework download obtained from the project Web site contains both the framework JAR and all the required libraries (except for the database driver). With Castor, you will have to make an extra visit to some additional projects’ Web sites for required libraries. Table 1 provides an overview of what is downloaded and required for using each of the frameworks in your Java application.

| Persistence Framework | Version | Libraries included in download | Additional libraries needed |
| --- | --- | --- | --- |
| Hibernate | 3.0 | antlr.jar, cglib.jar, asm.jar, asm-attrs.jar, commons-collections.jar, commons-logging.jar, hibernate3.jar, jta.jar, dom4j.jar, ehcache.jar, c3p0.jar | database driver |
| Castor | 1.0.5 | castor-1.0.5.jar | commons-logging.jar, xercesImpl.jar, database driver |
| JPOX JDO | 1.2 | jdo.jar, jpox.jar, log4j.jar | database driver |
| iBatis | 2.2.0 | ibatis-common-2.jar, ibatis-sqlmap-2.jar | database driver |
| JDBC | Java 1.5 | JDBC API with standard Java | database driver |

Table 1. What’s required? An overview of what is downloaded and required for using each of the frameworks in your Java application.

IDE Considerations
One consideration when building an application that uses any of these frameworks is the support your favorite IDE gives for XML editing. It’s now common practice to configure applications through XML, and these persistence frameworks make heavy use of XML in object-relational mapping. Most IDEs allow you to create and edit an XML file, but the good ones also provide code assist based on the persistence framework’s Document Type Definitions (DTDs). Each framework’s configuration files are structured by a DTD.

JDO is a little unusual in that it requires an additional pre-deployment step before you can deploy or execute an application. After compiling your persistence-capable Java objects into standard byte code (.class files), JDO requires you to run these classes through a special JDO compilation step called byte code enhancement. Below is an example of the JDO enhancement command, which is run against the persistent classes and produces modified byte code files:

java -cp C:\NetBeansProjects\Persistence\build\web\WEB-INF\classes;c:\java\jpox\jpox-enhancer-1.1.4.jar;c:\java\jpox\jpox.jar;c:\java\jpox\jdo.jar;c:\java\jpox\log4j.jar;c:\java\bcel-5.2\bcel-5.2.jar -Dlog4j.configuration=file:log4j.properties org.jpox.enhancer.JPOXEnhancer -v com\intertech\domain\package.jdo

Therefore, in addition to the libraries listed above, JDO also requires you (and your development environment) to use a JDO tool provided with the JDO libraries. In many cases, you’ll want to create an Ant task to perform this activity.

Learning Curve and Support
Discerning which of the frameworks provides better support and a gentler learning curve is a bit subjective. However, there are a couple of factors to explore that might help quantify the time to spin up and obtain help when needed.

Documentation and Support Sites
The good news is that all of the frameworks explored in this article provide a Web site with examples, tutorials, and online documentation. The detail and organization of these Web sites differ a bit. In my opinion, Hibernate and JPOX JDO offer more documentation than Castor or iBatis. JPOX JDO provides extensive documentation, but because of its size and the lack of a search capability, finding the appropriate example can be a bit unwieldy. Hibernate offers its documentation as a single-page HTML file or PDF, which makes searching easy.

Hibernate and JPOX JDO also offer design tips beyond the simple how-to documentation. JDO frameworks like JPOX also benefit from being an implementation of a specification. The Sun site helps provide more general information and explanations of JDO technology that can assist with any implementation.

All of the frameworks have official and unofficial forums or mailing lists for getting questions answered and for seeking help. The table below is not a complete list, but will give you some places to go for more assistance. This list does not include the framework’s project development forums and mailing lists. Developers working on the framework typically frown on posting end user questions to the development forums for obvious reasons.

Table 2. Persistence Framework Forum/Mailing Lists: This table lists the more popular forum and mailing lists for each.

During the research and work for this article, I posed several questions and issues to many of these sites. I found that helpful responses usually came within one week or less. Only in one case, using the Castor mailing list, did I not get back an answer before this article was published. However, even in this case, I did get a request for more code to be sent to their JIRA developer mailing list for examination and future response.

Help in Books and on the Web
Based on a quick, unscientific appraisal of available resources on the Web and in bookstores, JDBC comes out way ahead of the others in terms of helpful resources; Hibernate is a distant second.

Figure 1. Professional Help: This chart depicts framework skills listed on resumes posted to Monster.com.

If you are a manager, architect, or team lead on a project considering one of these persistence frameworks, the availability of developers with prior experience in the technology is a valid part of your evaluation criteria. Certainly finding someone with JDBC or even Hibernate experience is easier. People with skills in the other frameworks are a little harder to find, as a recent search of resumes on Monster.com shows. While the actual number of people with appropriate persistence framework skills is not known, the chart in Figure 1 depicts framework skills listed on resumes posted to Monster.com. The number of results returned is capped at 1000, so the JDBC results do not reflect the actual number of available resources.

Flexibility: How Well Do They Lie on Top of the Existing Models?
Persistence frameworks should be flexible. A key premise of this examination of Java persistence frameworks is that the database and Java objects should not have to change in order for the framework to do its job. On many software projects, the database or even some of the application business object code may already exist. Even where these have not been created, should the data or object model be formed around what the persistence framework supports? I contend that the answer is a definitive NO! So, the first big caveat regarding this comparison is that if your application or database design can be fashioned around the other and/or the framework that maps one to the other, then the results of this comparison might not align with your needs. Most developers do not have that luxury of a green field in data or object model design.

Happily, one of the biggest surprises of this examination is that, in general, these frameworks do adapt well to existing object and data structures.

The Test Model
In this comparison, I devised two models through which to examine each framework. The two models and their code are attached in this zip file. The first model is really just a simple model to make sure the framework, code, and database are working properly. The single Java class and its accompanying database table are shown in Figure 2.


Figure 2. Simple Java Object/Database Table: This simple Java object is mapped to a single simple database table.

Figure 3. Java Object/Database Table Person Model: Both customers and employees associated to an organization are represented in Java objects and database tables.

The second model is more extensive and meant to test a wider range of framework properties and features (see Figure 3). This model shows inheritance, many-to-many, many-to-one, and one-to-one associations. You can also see that there is a diversity of property types in the classes/tables. People may be associated to a set of addresses. Employees may be associated to organizations and a “user” which represents the person in information systems.

Could the second model be even more complex? Certainly, and your classes and database schemas will likely make this simple model seem trivial. It is, however, enough to get an understanding of some of the basic differences between the frameworks.

The Test DAO Interface
Most experienced developers try to isolate the various tiers of their application. The business objects, like those designed above, do not change, or change rarely, whereas the persistence framework, user interface, etc. might change frequently. In fact, this examination is a case in point where the persistence framework changes frequently from Hibernate, to JDO, etc.

A data access layer (see Figure 4), defined by the interface shown in the code below, was implemented for each persistence framework, regardless of how it is implemented under the covers. In general, the frameworks were all able to lie on top of the existing models and satisfy the needs of this interface with little difficulty:

public interface PersonDAO {
    public static final boolean debug = true;

    public void saveEmployee(Employee employee) throws Exception;
    public void deleteEmployee(Employee employee) throws Exception;
    public List findAllEmployees();
    public Employee findOneEmployee(long id);

    public void saveCustomer(Customer customer) throws Exception;
    public void deleteCustomer(Customer customer) throws Exception;
    public List findAllCustomers();
    public Customer findOneCustomer(long id);

    public void saveSimple(Simple s) throws Exception;
    public void deleteSimple(Simple s) throws Exception;
    public List findAllSimples();
    public Simple findOneSimple(int id);
}
Figure 4. Data Access Layer: This was implemented for each persistence framework.

Figure 4 shows that the tests performed against each of the persistence frameworks required general data access to find, create/save, update, and delete instances of the models in Figure 2 and Figure 3. The data access layer, as defined by the PersonDAO interface, facilitated these common operations in each framework.
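To make the point about isolating tiers concrete, here is a minimal sketch, not taken from the attached code, of calling code that depends only on a DAO interface. SimpleDAO is a hypothetical, trimmed-down stand-in for PersonDAO; the fields of Simple are assumed because Figure 2 is not reproduced here; and InMemorySimpleDAO is an illustrative implementation that merely stands where a Hibernate, JDO, Castor, iBatis, or JDBC implementation would go.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical domain class; the real Simple class's fields are an assumption here.
class Simple {
    private final int id;
    private final String name;
    Simple(int id, String name) { this.id = id; this.name = name; }
    int getId() { return id; }
    String getName() { return name; }
}

// Trimmed-down, hypothetical version of the PersonDAO contract.
interface SimpleDAO {
    void saveSimple(Simple s) throws Exception;
    void deleteSimple(Simple s) throws Exception;
    List<Simple> findAllSimples();
    Simple findOneSimple(int id);
}

// Stand-in implementation backed by a map instead of a database.
class InMemorySimpleDAO implements SimpleDAO {
    private final Map<Integer, Simple> rows = new LinkedHashMap<Integer, Simple>();
    public void saveSimple(Simple s) { rows.put(s.getId(), s); }   // insert or update by id
    public void deleteSimple(Simple s) { rows.remove(s.getId()); }
    public List<Simple> findAllSimples() { return new ArrayList<Simple>(rows.values()); }
    public Simple findOneSimple(int id) { return rows.get(id); }
}

class DaoClient {
    // Business code sees only the interface, so the framework behind it can change freely.
    static int countAfterSaving(SimpleDAO dao, Simple... items) throws Exception {
        for (Simple s : items) {
            dao.saveSimple(s);
        }
        return dao.findAllSimples().size();
    }
}
```

Swapping the persistence technology then means passing a different SimpleDAO implementation to DaoClient; the calling code does not change.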

Flexibility Results
So how well did each persistence framework lie on top of the existing data/object models and implement the access layer? Quite well! In fact, none of the business classes or the database tables had to be modified for these tests.

While the models did not have to change, a couple of small issues and considerations involving the mapping between objects and data tables were discovered:

  • Some of the frameworks rely on and take advantage of table foreign key definitions. Other frameworks have difficulties with some of these definitions. As an example, JDO used a foreign key relationship to help record ids in the related objects when they are created. Those same foreign key definitions created dependency issues (especially when deleting objects) in other frameworks. Although not obvious through the documentation, I believe many of the foreign key issues could eventually be solved by configuration.
  • iBatis does not innately handle a character primitive property/column mapping. So, the gender property on the Person class/table presents some unique challenges for negotiating in or out of Strings as part of the mapping activities. This is because iBatis is based heavily on the capabilities of JDBC PreparedStatements, which do not provide a means to set a character parameter.
  • The intertech_users table has an employee_id column which, as it turns out, is a redundant column. In the database, employees can be related to users through the user_id column on the employees table. When saving an employee object related to a user object some of the frameworks handled populating this redundant column in the database. Others, like Hibernate, saw no need to populate the redundant column.
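The character-mapping issue in the list above can be handled with a small conversion helper. The following is a sketch under assumptions rather than the code used in the tests: GenderMapping and its method names are hypothetical, and the gender column is assumed to hold a one-character string. It relies only on PreparedStatement.setString, since the JDBC API has no setChar method.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical helper for moving a char property in and out of a character column.
final class GenderMapping {

    // A char property travels to the database as a one-character String.
    static String toColumn(char gender) {
        return String.valueOf(gender);
    }

    // Converts the column value back; rejects null or empty values rather than guessing a default.
    static char fromColumn(String value) {
        if (value == null || value.isEmpty()) {
            throw new IllegalArgumentException("gender column is empty");
        }
        return value.charAt(0);
    }

    // Binds the gender at the given 1-based parameter index of an already prepared statement.
    static void bindGender(PreparedStatement ps, int index, char gender) throws SQLException {
        ps.setString(index, toColumn(gender));
    }
}
```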

As you map your objects to database tables, you will undoubtedly come into contact with little issues like this. Give yourself time to deal with issues like these. Depending on the size and complexity of your model, they can be quite common.

Bottom line: the complexity of the code to implement the simple CRUD activities differed substantially (covered later in this article), but each framework was able to implement this base functionality without the need to change the models (object or database).

Performance
Performance and scalability are like the good life. Everyone wants a lot of it, but what are you willing to give to get it? With that trade-off in mind, the performance figures given here for each framework reflect “out of the box” performance for the standard CRUD operations defined in the PersonDAO interface above.

Many things can impact performance. The object property types, database indexes, database driver, etc. can all have huge impacts. Furthermore, each of the frameworks provides features to assist with certain performance issues. For example, many of the frameworks offer a cache that can greatly improve performance for objects that are requested over and over again. So, a huge footnote to the numbers given here is that these performance results should be seen as a rough gauge and a basis for general performance comparison. There is little doubt that the results would vary with fine-tuning and the use of special framework features. However, improving the results would also require more time and expertise. Tuning may also lock you into a unique feature of the framework that would not be supported, or would be supported differently, in another framework.

Editor’s Note: The benchmarks cited in this article were created by the author, who has made every effort to ensure that his methodology is fair and accurate. You should consider the findings here to be representative only. Other independent benchmarks will inevitably have slightly different results.

When examining performance to save and read back simple Java objects with no relationships to other objects, the frameworks all compared well to JDBC. Table 3 shows the averages from conducting 10 tests of each operation.

| Framework | Create 100 Simple objects and persist them (ms) | Read 100 Simple objects (ms) |
| --- | --- | --- |
| Hibernate | 360 | 295 |
| JDBC | 410 | 45 |
| iBatis | 460 | 210 |
| JDO | 560 | 325 |
| Castor | 620 | 480 |

Table 3. Persistence Framework Performance on Simple Objects: The time, in milliseconds, for each of the frameworks to perform the operations listed.
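For readers who want to devise their own tests, averages like those in Table 3 can be gathered with a very small harness. This is a minimal sketch and not the author's actual test code: TimingHarness and its method names are hypothetical, and the operation being timed is supplied by the caller.

```java
import java.util.concurrent.Callable;

// Hypothetical timing helper: runs an operation repeatedly and reports elapsed times.
final class TimingHarness {

    // Runs the operation `runs` times and returns each run's elapsed time in milliseconds.
    static double[] timeRuns(Callable<?> operation, int runs) throws Exception {
        double[] millis = new double[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            operation.call();
            millis[i] = (System.nanoTime() - start) / 1000000.0;
        }
        return millis;
    }

    // Arithmetic mean of the recorded times; 0 for an empty array.
    static double average(double[] millis) {
        double sum = 0;
        for (double m : millis) {
            sum += m;
        }
        return millis.length == 0 ? 0 : sum / millis.length;
    }
}
```

A call such as TimingHarness.timeRuns(() -> dao.findAllSimples(), 10), where dao is whichever DAO implementation is under test, yields ten timings to pass to average. Keeping the individual times, rather than only the mean, makes it easy to see whether the first run is an outlier.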

Things get a little more interesting when you look at operations on complex object graphs. Table 4 shows the time, in seconds, for each of the frameworks to perform the operations listed. These are averages from conducting 10 tests of each operation. The asterisk (*) next to Castor’s result for retrieving 100 employees indicates that the figure is for retrieving 10 employees, not 100! Castor also presented some problems when saving/updating/removing, which will be addressed later in this article.

| Operation | DAO method | JDBC | iBatis | Hibernate | JDO | Castor |
| --- | --- | --- | --- | --- | --- | --- |
| Read 5K Customers from the DB into objects | findAllCustomers | 0.37 | 0.62 | 3.14 | 3.5 | 6.72 |
| Read 5K Employees from the DB into objects | findAllEmployees | 0.51 | 0.77 | 8.09 | 5.64 | 11.07 |
| Retrieve 100 Employees from the DB into objects by id | findOneEmployee | 0.59 | 0.33 | 0.56 | 1.34 | 52.21* |
| Create and insert 10 Employee object graphs and persist to the DB | saveEmployee | 0.73 | 0.9 | 0.67 | 1.23 | |
| Update 10 Employee objects and persist to the DB | saveEmployee | 0.58 | 0.55 | 0.7 | 1.17 | |
| Update and persist 10 Address objects associated to employees | saveEmployee | 0.28 | 0.2 | 0.41 | 0.71 | |
| Remove 10 employees and associated objects (cascade delete) | deleteEmployee | 2.25 | 0.71 | 0.79 | 1.39 | |

Table 4. Persistence Framework Performance: The time, in seconds, for each of the frameworks to perform the operations listed.

When retrieving objects, all the associated objects were eagerly retrieved (versus lazy loading, which is covered later in this article). So the read times are actually for an object graph (customer and address, employee and addresses, user and organization). The update and delete times represent the time it takes for each framework to carry out cascade updates or deletes through the same graph.

As expected, straight JDBC generally performs best. How often do you add a layer to your application and find it performs faster? Frameworks add a layer of code. What the frameworks provide over straight JDBC code is maintainability, flexibility, etc. It should be noted that while JDBC PreparedStatements were used in the JDBC code, the code did not take advantage of the parsing/planning cache that PreparedStatements can offer. So, the JDBC results could be made better still with some changes.
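One common way to give a driver or database the chance to reuse its parsed statement is to prepare a statement once and execute it many times with different parameters. The sketch below illustrates that pattern only; it is not the attached JDBC code. The class name, the table name simple, and the column name name are assumptions, and whether any parsing or planning work is actually saved depends on the driver and database in use.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

// Hypothetical example of reusing one PreparedStatement across many inserts.
final class PreparedStatementReuse {

    // Inserts one row per name and returns the total number of rows the driver reports as inserted.
    static int insertSimples(Connection conn, List<String> names) throws SQLException {
        String sql = "INSERT INTO simple (name) VALUES (?)";  // assumed table and column
        int rows = 0;
        // Prepared once, outside the loop, and closed automatically afterwards.
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            for (String name : names) {
                ps.setString(1, name);        // only the parameter changes per iteration
                rows += ps.executeUpdate();
            }
        }
        return rows;
    }
}
```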

As you can see, JDBC actually does not perform best in all cases. When updating or removing an object, JDBC requires many database calls to handle all the cascade effects across object relationships (to address, user, and organization for example). iBatis performs closest to the JDBC code because iBatis is a lightweight framework. iBatis is essentially JDBC code with the SQL moved to XML files. Therefore, if you’re using JDBC or iBatis, your performance is going to be as good (or bad) as your SQL. Hibernate, JPOX JDO, and the like do not require you to know SQL in order to get optimal performance. And some of the SQL can get quite complex, as you’ll see in a bit.

Castor’s performance was especially disappointing: so poor, in fact, that retrieving an object (and its associated objects) by its identifier took too long to chart. According to the Castor documentation, a lock is obtained for each object loaded. I suspect, but am not certain, that this contributes to the longer execution times. Castor experts are encouraged to reply to explain the issue and how performance can be improved.

Each of the frameworks, as well as JDBC, offers improved performance over repeated iterations after an initial execution. Taking a look at Figure 5, you can see that in an operation such as reading five thousand customers and associated addresses, each of the frameworks takes longer on the first execution. Most improve significantly after the first execution. JDO is the only framework whose performance oscillates over multiple iterations.

Figure 5. Performance over Multiple Iterations: This graph shows the time it takes to read 5K customers over 10 successive iterations.

Code Impact
When looking at the performance of code options, it is absolutely necessary to also look at the code impact (size, complexity, maintainability, etc.) of each solution. They are the yin and yang of the development world: give in one and take in the other. While straight JDBC code may perform well in most situations, the difference between the amount of programmer code it requires and the amount required to perform the same task in a persistence framework like Hibernate is stark.

As a small example, take a look at one of the CRUD operations that each of the frameworks must perform; notably that of saving/updating an employee and all associated objects (Listing 1).

This highlights the Java code differences. The amount of JDBC Java code required to save an employee and associated objects is typically more than double, and almost triple, the amount needed when using a persistence framework! By contrast, the persistence code when a framework is used is pretty straightforward. In general, the algorithm is:

Open a connection/transaction
Save the new or modified object
Close the connection/transaction

No SQL. No loading of parameters. No extracting of data from result sets and appropriately setting object properties.
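The three steps above can be sketched generically as follows. PersistenceSession is a hypothetical abstraction invented for this illustration; each real framework has its own session or manager type and its own method names, so treat this as the shape of the code rather than any framework's API. RecordingSession exists only so the sketch can run without a database.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical abstraction over a framework's session/transaction object.
interface PersistenceSession {
    void begin();
    void save(Object root);   // the framework, not the caller, produces the SQL
    void commit();
    void rollback();
    void close();
}

// Demo-only session that records calls instead of talking to a database.
class RecordingSession implements PersistenceSession {
    final List<String> log = new ArrayList<String>();
    public void begin() { log.add("begin"); }
    public void save(Object root) { log.add("save " + root); }
    public void commit() { log.add("commit"); }
    public void rollback() { log.add("rollback"); }
    public void close() { log.add("close"); }
}

final class SaveTemplate {
    static void saveInTransaction(PersistenceSession session, Object root) {
        session.begin();              // 1. open a connection/transaction
        try {
            session.save(root);       // 2. save the new or modified object
            session.commit();
        } catch (RuntimeException e) {
            session.rollback();       // undo partial work, then report the failure
            throw e;
        } finally {
            session.close();          // 3. close the connection/transaction
        }
    }
}
```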

Not shown here is the amount of configuration in XML that is also required. Table 5 gives you the total number of lines of code and configuration needed to get the basic functionality of the PersonDAO interface implemented in each technology. The interface includes the DAO code and necessary utility class in each of the persistence frameworks. Most of the configuration code is in XML but some is also in simple key/value pair configuration files.

| Framework | Lines of Java code | Ratio to JDBC solution | Lines of configuration code | Total lines of code | Total ratio to JDBC |
| --- | --- | --- | --- | --- | --- |
| Hibernate | 249 | 0.37 | 128 | 377 | 0.55 |
| JDO | 263 | 0.39 | 82 | 345 | 0.50 |
| Castor | 296 | 0.43 | 156 | 452 | 0.66 |
| iBatis | 331 | 0.49 | 290 | 621 | 0.91 |
| JDBC | 681 | 1.00 | 4 | 685 | 1.00 |

Table 5. Persistence Framework Code Size: This table represents the total number of lines of Java code and configuration code for implementing the PersonDAO interface.

Even with configuration code, the JDBC solution is almost double the size of some of the solutions that take advantage of a persistence framework. You might note that the iBatis code is not significantly smaller than that of JDBC. In fact, as mentioned earlier, iBatis is more of a lightweight persistence framework. It helps remove SQL from the Java code, but still requires the programmer to write SQL code that handles most of the labor associated with persisting objects and their associations.

The SQL code you have to manage in iBatis or JDBC solutions can become quite nasty; especially when dealing with relationships. Take the relatively simple relationships between Person, Employee, Address, User, and Organization. As seen in the code below, the SQL code necessary to retrieve the data for these objects starts to get pretty ugly. How many of the developers around you can write SQL that includes left joins and left outer joins in the same statement? Ick!

SELECT e.employee_id, lastName, firstName, gender, dateOfBirth, email, phoneNumber, startDate, salary,
       i.user_id, username, password, passwordHint, active,
       o.organization_id, name,
       a.address_id, street, city, state, zip
FROM persons p, employees e
     left outer join person_address pa on e.employee_id = pa.person_id
     left join intertech_users i on e.user_id = i.user_id
     left join organizations o on e.organization_id = o.organization_id
     left join addresses a on pa.address_id = a.address_id
WHERE p.person_id = e.employee_id
ORDER BY e.employee_id

This is one of the areas where frameworks such as Hibernate, JDO implementations, and Castor shine. They allow you to concentrate on Java programming and avoid SQL coding. While these frameworks do not free you from needing to understand relational databases (tables, columns, types, etc.), they do eliminate, or at least reduce, the need to write a lot of complex SQL.

The statistics in Table 5 for the Castor code are probably inflated. I had issues getting Castor to automatically handle the persistence of associated objects like addresses (again, something that will be discussed later in this article). This required extra code to force the persistence of associated objects.

Another caveat with regard to the data shown in Table 5 is that the JDBC and iBatis code does not have some of the association intelligence/awareness that some of the other persistence frameworks offer automatically. No allowance is made for adding or removing addresses in an employee update scenario in the JDBC/iBatis code. Small property changes to an employee, for example, trigger SQL calls for updating the employee row and all the associated object rows. To handle these situations, the JDBC and iBatis code would bloat further but probably perform even faster. This again highlights another of the strengths offered by many of the persistence frameworks, which is covered in the next section.

Lines of code do not always tell the whole story with regard to application size. In some cases, especially on more limited hardware, the size of the executed code (byte code, libraries, etc.) can be an issue. Certainly the number and size of all the persistence framework libraries impact the footprint of an application. Additionally, JDO’s extra byte code enhancement adds significant persistence code to each of the domain objects. Table 6 gives an indication of the size of the persistence framework libraries, persistence code and configuration files as well as the impact on the size of the domain files.

| Framework | Size in bytes of data access management classes (DAO and utils) | Size in bytes of domain classes | Size in bytes of configuration files | Size of required libraries in KB | Total application footprint impact in KB |
| --- | --- | --- | --- | --- | --- |
| JDO | 6,768 | 90,470 | 4,443 | 5,405 | 5506.7 |
| Hibernate | 5,900 | 28,983 | 5,422 | 3,416 | 3456.3 |
| Castor | 7,947 | 28,983 | 5,430 | 2,976 | 3018.4 |
| iBatis | 8,489 | 28,983 | 11,865 | 399 | 448.3 |
| JDBC | 18,945 | 28,983 | 94 | 0 | 48.0 |

Table 6. Persistence Footprint Impact: The libraries add significant size to the application when using a framework.

As you can see, JDBC does not require any additional libraries, but the code that performs the object persistence is significant. Also, surprisingly, JDO’s extra byte code enhancement more than triples the size of domain class byte codes.
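The totals in Table 6 can be reproduced with simple arithmetic. The sketch below assumes the total is the library size in KB plus the three byte counts converted at 1 KB = 1,000 bytes; that conversion is an inference from the published figures, not something the table states. The class and method names are hypothetical.

```java
// Hypothetical helper that recomputes the Table 6 totals from the other columns.
final class Footprint {

    // Library size (already in KB) plus the three byte counts, converted at 1 KB = 1,000 bytes (assumed).
    static double totalKb(long daoBytes, long domainBytes, long configBytes, long librariesKb) {
        return librariesKb + (daoBytes + domainBytes + configBytes) / 1000.0;
    }

    // How many times larger the enhanced domain classes are than the plain ones.
    static double growthFactor(long enhancedBytes, long plainBytes) {
        return (double) enhancedBytes / plainBytes;
    }
}
```

For the JDO row, Footprint.totalKb(6768, 90470, 4443, 5405) comes to about 5506.7 KB, matching the table, and Footprint.growthFactor(90470, 28983) comes to roughly 3.1, which is where the "more than triples" observation comes from.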

While libraries and byte code files are not code that you maintain directly, they can impact how your application may run when resources are in short supply and should be a consideration in limited environments.

Entity Relationships and Cascade Operations
Handling persistence across object relationships is one of the more difficult issues to deal with when writing your own persistence code. Does your application have to keep track of each object and whether modifications have been made to it? If not, how do you determine whether to call on code to save the associated objects (addresses for example) when a simple property change has been made to the root object (in this case an employee or customer)? Imagine a condition where addresses are both added and removed from a person object. How does JDBC code handle this type of operation when the call to persist a person and its associated object graph is received?

Persistence frameworks can really take the burden of these tasks off the developer. In frameworks like JPOX JDO and Hibernate, you simply request to save the root object and, depending on configuration, this persists (or removes) all associations and associated objects in the database accordingly. This is known as cascading operations. A request to save a customer or employee results in a cascading operation to all affected associated objects, automatically! This saves an immense amount of Java code, but also removes many of the headaches that go with managing association details.

Of course, this feature also comes with a fair amount of configuration. You need to spell out the structure and nature of each relationship in XML. The XML that defines these relationships can be less than straightforward. Take, as an example, the configuration of the one-to-one relationship between person/employee and user objects in Hibernate:

  • In the Person.hbm.xml file, relating an Employee to a User in a 1-1 relationship:
  • In the User.hbm.xml file, relating a User to an Employee in a 1-1 relationship:

In Hibernate, from the perspective of a person object, a one-to-one relationship is a “unique” instance of a one-to-many relationship. Huh?? These types of configuration take a lot of time to learn and can be frustrating to figure out. Each framework has its own way of specifying these relationships and how the system should react when objects in a graph are created, updated, removed, or retrieved. To get a flavor for the differences, take a look at how the relationship from Person to Address is configured in both JDO and Hibernate, as shown respectively in the configuration segments below:

  • In the package.jdo file for JPOX JDO, relating Person to Address:
  • In the Person.hbm.xml for Hibernate, relating Person to Address:

Remember, this is configuration code, so a compiler is not going to help you get this right. You have to code, configure, compile, deploy, and execute the code before you know if you got it right.

In my opinion, the mapping of associations in JDO and Castor is a little easier to learn and perhaps a bit more intuitive. However, you will generally find more help in the community when it comes to Hibernate.

While the configuration in Castor seems straightforward, I had many issues in getting associated objects to persist when creating or updating an object graph. According to the documentation, Castor is supposed to offer cascading features. However, I was unable to get these to work properly. I had no problems, on the other hand, setting up the configuration and reading all of the objects into an object graph.

In fact, Castor’s own Web site documentation (which is a bit dated and has not been updated to reflect some recent releases) suggests Castor has, at least in the past, taken a different view of object relationships than do some of the other frameworks. It suggests that “related objects via one-to-one relation are not created/removed automatically.” Further, again according to the documentation, “Castor currently only supports bi-directional relationships.” Uni-directional support has been added, but it is unclear to what level it is supported or works.

As mentioned earlier, there is no automatic cascading in JDBC or iBatis. You have to provide the cascading features in the JDBC or iBatis code, resulting in a lot more code and requiring a lot of forethought about how objects will be assembled, changed, updated, and removed in the application.
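To give a feel for the bookkeeping that hand-written cascading involves, here is a minimal plain-Java sketch. It makes no database calls; it only works out which addresses would need an INSERT and which a DELETE when a person's address collection has been edited. The class name, method names, and the use of strings as address keys are all assumptions made for illustration and do not come from the article's attached code.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper: computes the work a cascading save would have to do.
public class ManualCascade {

    // Addresses in the edited graph but not in the stored one: these need INSERTs.
    static Set<String> toInsert(Set<String> stored, Set<String> edited) {
        Set<String> result = new LinkedHashSet<>(edited);
        result.removeAll(stored);
        return result;
    }

    // Addresses in the stored graph but not in the edited one: these need DELETEs.
    static Set<String> toDelete(Set<String> stored, Set<String> edited) {
        Set<String> result = new LinkedHashSet<>(stored);
        result.removeAll(edited);
        return result;
    }

    public static void main(String[] args) {
        // Address keys as last read from the database (illustrative values).
        Set<String> stored = new LinkedHashSet<>(Arrays.asList("home", "office"));
        // Address keys on the person object after the user's edits.
        Set<String> edited = new LinkedHashSet<>(Arrays.asList("office", "cabin"));

        System.out.println("INSERT: " + toInsert(stored, edited));
        System.out.println("DELETE: " + toDelete(stored, edited));
    }
}
```

Even this small diff assumes the application kept a copy of the stored state to compare against. Detecting changed (rather than added or removed) addresses, and issuing the statements in a safe order, would add still more code.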

Lazy Loading
In a concept somewhat related to cascading operations, lazy loading affects how many objects are created and loaded into memory whenever a root object is requested from a persistence framework. Consider the operation findAllEmployees(). An employee can be associated with any number of addresses, an organization, and a user. The addresses can each have an association to other employees. The organization could be related to other employees. In other words, a single object, when created and loaded with data from the database, is potentially part of a much larger object graph. What part of the graph should be loaded?

In the case of JDBC code, or when using the iBatis framework, you must answer this question and write the code necessary to load the objects for a graph that you need.

When using a persistence framework like Castor, JPOX JDO, or Hibernate, the framework can automatically load as much of the graph as you need, based on configuration. Properties and associated objects that are instantiated and loaded at the time of the initial request of the root object are said to be “eagerly” loaded. Objects/data that are not instantiated and loaded immediately might still be accessed by the application. This data is loaded “lazily.” That is, it gets retrieved and put into objects only when requested by the application.

Lazy loading, and how it is managed by the framework, has an impact on performance based on how many trips to the database are made, the amount of code needed to deal with data not yet loaded, and how much memory is used by unneeded/unused objects.
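The trade-off can be sketched in plain Java without any framework. In the illustration below, a counter stands in for trips to the database, and the employee's addresses are fetched either when the employee is built (eager) or the first time the getter is called (lazy). All names here (LazyLoadingSketch, queryAddresses, databaseTrips) are hypothetical, and the "query" simply returns fixed strings.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

public class LazyLoadingSketch {

    // Counts simulated trips to the database.
    static int databaseTrips = 0;

    // Stand-in for a real query; a real application would run SQL here.
    static List<String> queryAddresses(long employeeId) {
        databaseTrips++;
        return Arrays.asList("home-" + employeeId, "office-" + employeeId);
    }

    static class Employee {
        private final Supplier<List<String>> loader;
        private List<String> addresses; // stays null until loaded

        private Employee(Supplier<List<String>> loader) {
            this.loader = loader;
        }

        // Eager: the addresses are fetched while the employee is being built.
        static Employee eager(long id) {
            Employee e = new Employee(null);
            e.addresses = queryAddresses(id);
            return e;
        }

        // Lazy: only a recipe for fetching the addresses is stored.
        static Employee lazy(long id) {
            return new Employee(() -> queryAddresses(id));
        }

        // The first call on a lazy employee triggers the fetch; later calls reuse it.
        List<String> getAddresses() {
            if (addresses == null) {
                addresses = loader.get();
            }
            return addresses;
        }
    }

    public static void main(String[] args) {
        Employee lazy = Employee.lazy(1);
        System.out.println("Trips after lazy load: " + databaseTrips);
        lazy.getAddresses();
        System.out.println("Trips after first access: " + databaseTrips);
    }
}
```

The lazy employee costs nothing up front but pays for a trip on first access. Multiply that by thousands of employees and the choice between one large query and many small ones becomes the performance question discussed here.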

As an example of how much of an impact lazy versus eager loading can have on performance, Table 7 (below) provides performance statistics for Hibernate and JDO loading of employees with both strategies. In the case of JDO, the initial retrieval averaged more than four times as fast with lazy loading (1,178 ms versus 5,637 ms). Hibernate's saving was smaller, at about 20 percent (6,439 ms versus 8,089 ms). Of course, if associated data is needed, additional requests of the database will be required.

| Run | Hibernate with eager fetch | JDO with deep “fetch groups” | Hibernate with lazy load | JDO with shallow “fetch groups” |
|---|---|---|---|---|
| Run 0 | 12500 | 6203 | 11453 | 3422 |
| Run 1 | 7687 | 1563 | 5922 | 969 |
| Run 2 | 7531 | 11609 | 5828 | 562 |
| Run 3 | 7688 | 1594 | 5984 | 1188 |
| Run 4 | 7500 | 10484 | 5610 | 468 |
| Run 5 | 7437 | 1469 | 5812 | 938 |
| Run 6 | 7422 | 10437 | 6485 | 687 |
| Run 7 | 7656 | 1141 | 5765 | 1516 |
| Run 8 | 7532 | 10625 | 5938 | 484 |
| Run 9 | 7937 | 1250 | 5593 | 1547 |
| Average | 8089 | 5637 | 6439 | 1178 |

Table 7. Lazy vs. Eager Loading: The statistics above are times in milliseconds to load five thousand employee objects using eager fetching and lazy loading.
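For readers who want to check the averages in Table 7 or compute the eager-to-lazy ratio themselves, the small sketch below does the arithmetic. The run times are copied from the table; the class and method names are made up for this illustration, and the averages use integer division, which matches the rounded-down figures shown in the table.

```java
public class Table7Averages {

    // Run times in milliseconds, copied from Table 7 (runs 0 through 9).
    static final long[] HIBERNATE_EAGER = {12500, 7687, 7531, 7688, 7500, 7437, 7422, 7656, 7532, 7937};
    static final long[] JDO_DEEP = {6203, 1563, 11609, 1594, 10484, 1469, 10437, 1141, 10625, 1250};
    static final long[] HIBERNATE_LAZY = {11453, 5922, 5828, 5984, 5610, 5812, 6485, 5765, 5938, 5593};
    static final long[] JDO_SHALLOW = {3422, 969, 562, 1188, 468, 938, 687, 1516, 484, 1547};

    // Mean run time, truncated to whole milliseconds.
    static long average(long[] runs) {
        long sum = 0;
        for (long run : runs) {
            sum += run;
        }
        return sum / runs.length;
    }

    // How many times longer the eager strategy took, on average, than the lazy one.
    static double ratio(long[] eager, long[] lazy) {
        return (double) average(eager) / average(lazy);
    }

    public static void main(String[] args) {
        System.out.println("Hibernate eager average: " + average(HIBERNATE_EAGER));
        System.out.println("Hibernate lazy average:  " + average(HIBERNATE_LAZY));
        System.out.println("JDO deep average:        " + average(JDO_DEEP));
        System.out.println("JDO shallow average:     " + average(JDO_SHALLOW));
        System.out.printf("Hibernate eager/lazy ratio: %.2f%n", ratio(HIBERNATE_EAGER, HIBERNATE_LAZY));
        System.out.printf("JDO deep/shallow ratio:     %.2f%n", ratio(JDO_DEEP, JDO_SHALLOW));
    }
}
```

Note that Run 0 is noticeably slower than the rest in every column, so anyone repeating the experiment may want to look at the averages both with and without that first run.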

An important question to consider when looking at lazy loading is what happens when the connection and/or transaction to the database is closed. What if the application needs access to an associated object or property and it was not loaded as a result of lazy loading? The answer to this question is handled in the next section.

Object State
While hidden under the covers of the API, persistence frameworks must still open a connection to the database and start a transaction in order to perform work with the database. However, unless your application and/or database is being used by only a few users and contention for data is limited, the application must connect, transact, and then disconnect quickly to avoid nasty data contention issues (dirty reads, long-held row locks, etc.). So, how do you allow an application object to go on living while its direct connection to the database is discontinued? When the application reconnects and a new transaction is started, how do the object and the database get realigned when updates have been made to one or both? In other words, how are persistent objects managed across multiple database transactions?

Synchronizing object state with database state is a constant challenge, especially in applications where persistent objects are passed between application tiers. You load objects in one transaction, allow the users to edit the objects, and then save the changes in a new transaction.

Most of the frameworks use the term “detached” for objects that are persistent in the database but no longer tied to it through a live transaction. When the application reconnects, frameworks like Hibernate, Castor, and JDO allow the object to be reattached and its state synchronized automatically.

iBatis does not offer object state management across transactions. The concept of a detached object does not exist. As a developer, you must manage an object’s relationship to the database just as you have to do when using JDBC. Again, this requires a lot more code and detailed design about what to do when the underlying data has changed and when to synchronize the object’s state to the database.

Castor offers the concept of detached objects as well, but to “reattach” the object requires what is called a “long transaction.” In order to have long transactions, the domain objects must implement Castor’s org.exolab.castor.jdo.TimeStampable interface. This interface requires the object and resulting database table to essentially carry an effectivity timestamp; thus impacting the domain and database models! Ouch!
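The idea behind carrying a timestamp on the object can be illustrated without Castor at all. The sketch below is plain Java and is not Castor's API: an in-memory map stands in for the database table, a simple counter stands in for the timestamp, and reattach() is a hypothetical method that refuses to save a detached copy whose timestamp no longer matches the stored row.

```java
import java.util.HashMap;
import java.util.Map;

public class DetachedTimestampSketch {

    static class Employee {
        final long id;
        String name;
        long timestamp; // version marker carried by both the row and the object

        Employee(long id, String name, long timestamp) {
            this.id = id;
            this.name = name;
            this.timestamp = timestamp;
        }

        Employee copy() {
            return new Employee(id, name, timestamp);
        }
    }

    // Stand-in for the database table, keyed by employee id.
    static final Map<Long, Employee> table = new HashMap<>();

    // "First transaction": hand the caller a detached copy of the row.
    static Employee load(long id) {
        return table.get(id).copy();
    }

    // "Second transaction": save only if nobody changed the row in the meantime.
    static boolean reattach(Employee detached) {
        Employee row = table.get(detached.id);
        if (row == null || row.timestamp != detached.timestamp) {
            return false; // stale copy: the row changed after it was detached
        }
        detached.timestamp = row.timestamp + 1; // a counter keeps the sketch deterministic
        table.put(detached.id, detached.copy());
        return true;
    }

    public static void main(String[] args) {
        table.put(1L, new Employee(1, "Ann", 1));
        Employee first = load(1);
        Employee second = load(1);
        first.name = "Anne";
        second.name = "Anna";
        System.out.println("First save accepted:  " + reattach(first));
        System.out.println("Second save accepted: " + reattach(second));
    }
}
```

The sketch also shows why the approach is intrusive: the timestamp field has to exist on both the domain class and the table for the check to work.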

Proxy or Null for Lazy Data
Finally, returning to lazy loading, suppose you request an Employee object, but do not eagerly fetch the addresses associated to the Employee. Your application closes the transaction, leaving the Employee object without addresses. If the application attempts to then access the addresses, what does it get? Tough question?

Some frameworks, like Hibernate or Castor, will substitute a proxy object in place of any object that is not yet loaded and subject to lazy loading. For example, in Hibernate, in place of an actual java.util.Set of addresses, Hibernate will place its own proxy object instance (org.hibernate.collection.PersistentSet). In a transaction, if the application were to request the addresses of an employee, Hibernate would load the addresses on demand and make them available through that same Set reference.

Castor also uses the proxy strategy. However, in order to use lazy loading you must also change the persistent class that will hold the related objects. Examine Castor’s documentation carefully if you want to use lazy loading in Castor.

JDO takes a different approach. It allows lazy loading, but does not load anything into the lazily loaded field like addresses. In fact, if you access the property directly, you will see that the lazy loaded field would contain null. This can cause some issues if you don’t properly access the data with getters and setters:

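    Transaction tx = pm.currentTransaction();
    tx.begin();
    employee = (Employee) pm.getObjectById(com.intertech.domain.Employee.class, new Long(id));
    System.out.println(employee.addresses);      // direct field access: the lazy field is still null
    System.out.println(employee.getAddresses()); // access through the getter: the addresses are available
    tx.commit();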

With an understanding of how each of the frameworks deals with lazily loaded data, let’s return to the question of what happens when you try to access lazily loaded data after the connection or transaction to the database is closed. What if the application needs access to an associated object or property like addresses and it was not loaded as a result of lazy loading? Each of the frameworks throws an exception, but the exception is different for each. Hibernate throws an org.hibernate.LazyInitializationException and indicates the exact problem, namely that you “failed to lazily initialize a collection of role: com.intertech.domain.Person.addresses – no session or session was closed.” JDO throws a javax.jdo.JDODetachedFieldAccessException and indicates “You have just attempted to access field ‘addresses’ yet this field was not detached when you detached the object. Either don’t access this field, or detach the field when detaching the object.”
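The proxy strategy and its failure mode can be imitated in a few lines of plain Java. The sketch below is only an analogy and uses no framework classes: Session and LazySet are hypothetical names, the loader returns fixed data, and an IllegalStateException stands in for the framework-specific exceptions named above.

```java
import java.util.AbstractSet;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;
import java.util.function.Supplier;

public class LazySetProxySketch {

    // Stand-in for a framework session or transaction.
    static class Session {
        boolean open = true;

        void close() {
            open = false;
        }
    }

    // A Set that fetches its contents the first time anyone looks inside it.
    static class LazySet<T> extends AbstractSet<T> {
        private final Session session;
        private final Supplier<Set<T>> loader;
        private Set<T> loaded; // null until the first access

        LazySet(Session session, Supplier<Set<T>> loader) {
            this.session = session;
            this.loader = loader;
        }

        private Set<T> target() {
            if (loaded == null) {
                if (!session.open) {
                    throw new IllegalStateException("cannot lazily load: session is closed");
                }
                loaded = loader.get();
            }
            return loaded;
        }

        @Override
        public Iterator<T> iterator() {
            return target().iterator();
        }

        @Override
        public int size() {
            return target().size();
        }
    }

    public static void main(String[] args) {
        Session session = new Session();
        Set<String> addresses = new LazySet<String>(session,
                () -> new HashSet<String>(Arrays.asList("home", "office")));
        session.close(); // closed before the addresses were ever touched
        try {
            System.out.println(addresses.size());
        } catch (IllegalStateException e) {
            System.out.println("Failed: " + e.getMessage());
        }
    }
}
```

If the collection had been touched while the session was still open, the data would already be in memory and later access would succeed. That is the practical argument for fetching what you know you will need before closing the transaction.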

Miscellaneous Considerations
There are a number of factors that must also be considered when comparing any of the persistence frameworks. Many of these considerations are not quantifiable and, in some cases, are areas for additional research that cannot be covered here.

  • Framework by Specification
    Only one of the frameworks examined here is backed by a Java Community Process (JCP) managed specification. JPOX JDO is, in fact, the reference implementation of the JDO 2.0 spec. If the support and management of a specification is important to your organization’s applications, then JDO, EJBs, or a Java Persistence API implementation is the solution for you. However, stability by specification may be a false promise. As an example, simply look at the differences between EJB 2.1 and EJB 3.0.
  • Database Support
    Most of the persistence frameworks come ready to work with the popular databases and database drivers. Most, if not all, also provide configuration examples of how to connect the framework to the database. If your database is not one of those commonly supported, or if you have unusual configurations, such as mirrored or load balancing databases, then you may have additional issues that do not lend themselves to these solutions.
  • Scalability
    While the experiments shown in this article worked with 10,000 or more row tables, most would consider this sizeable but certainly not close to the limits supported by most relational databases. If your application needs to read/store millions of objects, you have further research to do.
  • Cache
    Persistence frameworks offer an object cache, allowing objects that have been read in from the database to be cached in memory so that subsequent calls to an object with the same identity do not require another trip to the database. This also saves on memory in that a duplicate object should not be created when using the object in cache will do.

    Caching capabilities, strategies, and configuration vary greatly among the different persistence framework solutions. Some, like Hibernate and JDO, offer multiple levels of cache. One level of cache is dedicated to objects used by a single instance of the application while other levels of cache work in distributed environments (multiple instances of the application running and sharing objects), usually in JEE containers.

    If objects that are used frequently, but change rarely, are a big part of your application, consider examining the framework’s caching opportunities.

  • Queries
    JDBC and iBatis rely, obviously, on SQL to retrieve data. Hibernate, JDO, and Castor support different, and sometimes multiple, query mechanisms/languages. Hibernate offers the most options: HQL, Criteria, and native SQL to pull data and retrieve objects from the database. JDO offers JDOQL and native SQL. Castor offers OQL. There are plenty of religious arguments posted on the Web about which of these (HQL, Criteria, JDOQL, and OQL) is best and which may be more like SQL (assuming you believe SQL is an attractive query mechanism). Suffice it to say that all help reduce the amount of actual SQL code in your application, but also require some additional learning time.
  • JEE Container Support
    There is nothing that prevents any of the persistence frameworks examined from being used in an enterprise situation; specifically in a JEE application on a JEE container. In fact, most persistence frameworks can be configured to use standard transactions or other transactions (through a data source established in the container) offered by a JEE server. Additionally, JEE servers often provide other cache and threading mechanisms to better support scalability and performance. How well the framework takes advantage of the JEE container’s transaction manager, cache, thread management, etc., and how these might interfere with the framework’s own capabilities, should also be studied.
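Returning to the Cache item in the list above, the basic idea of an object cache keyed by identity can be sketched in plain Java. This is a bare illustration, not how any of the frameworks implement their caches: the Cache class, its loads counter, and the loader function are all hypothetical, and there is no eviction, expiry, or thread safety.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongFunction;

public class IdentityCacheSketch {

    static class Cache<T> {
        private final Map<Long, T> byId = new HashMap<>();
        private final LongFunction<T> loader; // stand-in for a database read
        int loads = 0;                        // counts simulated database trips

        Cache(LongFunction<T> loader) {
            this.loader = loader;
        }

        // Returns the cached instance for this identity, loading it only once.
        T get(long id) {
            T cached = byId.get(id);
            if (cached == null) {
                loads++;
                cached = loader.apply(id);
                byId.put(id, cached);
            }
            return cached;
        }
    }

    public static void main(String[] args) {
        Cache<StringBuilder> cache =
                new Cache<StringBuilder>(id -> new StringBuilder("employee-" + id));
        StringBuilder first = cache.get(7);
        StringBuilder second = cache.get(7);
        System.out.println("Same instance: " + (first == second));
        System.out.println("Database trips: " + cache.loads);
    }
}
```

The second request returns the very same instance without another load, which is both the time saving and the memory saving described in that item.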

Only You Can Decide
When looking at persistence frameworks vs. JDBC, which works best for your application is going to depend. However, the information provided here should help you understand on what it depends.

As a quick recap, let’s summarize some of the major decision points and factors when comparing persistence frameworks.

The most widely used persistence framework today is Hibernate. Along with a large user community and lots of resources to help you get your Hibernate applications up and running, the performance of Hibernate is better than average (and gets better with Hibernate cache), the persistence code is tight, and the API is easy to understand. In general, Hibernate leaves you with some nice Java code. Hibernate is feature-rich and very extensible. Because of this, configuring persistence in Hibernate can be adventuresome, especially as the models get complex.

JDO is a viable alternative to Hibernate, although certainly not as popular as Hibernate today. It, too, is feature-rich and JDO has the backing of a specification. While its user community appears relatively small, documentation, examples, and helpful resources appear ample. Performance is acceptable. JDO persistence code is very tight and easy to read (including the configuration file, in my opinion), but there is the pesky little byte code enhancement that is required before deploying/executing the application. Some may consider this step problematic for security and/or byte code verification.

iBatis still requires developers to have extensive SQL knowledge. In fact, it offers only minimal, if not negligible, assistance in reducing the amount and complexity of code, when including configuration code, compared to JDBC. iBatis does get SQL code out of the Java files, and offers some convenience in setting SQL statement parameters using JavaBeans and Maps. Object state and association management are still developer-managed issues when using iBatis. It appears that the iBatis user community is small and finding helpful resources may be tougher. However, given its simplicity and lightweight nature, lots of resources may not be necessary. Its performance is about as good as JDBC, and it minimally impacts the overall application footprint.

Castor, in my opinion, is a disappointing option. The performance appears less than viable and I had issues trying to get its association mapping to work properly. I hope these are issues of my making and not problems of the framework. Castor’s take on dependent and independent relationships is a bit different than in other frameworks. The need to use long transactions and set up effectivity timestamps to manage detached objects seems intrusive. The user community is again small and the documentation, help, and resources are limited. The size of the resulting persistence code and the complexity of the API are attractive, but probably no more so than those offered by other frameworks. Castor was a project that was inactive for some time. Recently, there has been a sharp rise of activity and involvement in this project. In fact, during the writing of this article, several dot releases were made available. So perhaps Castor is a project to watch even if you find results similar to those in this article.

Finally, writing your own JDBC code is still an option used by many a developer. With an abundance of persistence framework options, hopefully, this is out of design and not ignorance. Well-written SQL and JDBC code is hard to beat for performance, especially when objects are changing frequently. As I tell my students, JDBC coding isn’t rocket science, but it is incredibly tedious. When you do run into a question or issue, there are a ton of helpful resources. But code maintenance and the issues that a persistence framework manages (object state, association management, etc.) are why the persistence frameworks are so very attractive.

 
