ost software applications require a persistence layer, such as a relational database, to store user data for later retrieval. As a Java developer, you have many choices for your application’s persistence layer. You can use entity beans with either container-managed persistence (CMP) or bean-managed persistence (BMP), or you can execute SQL queries from within your Java code through the Java Database Connectivity (JDBC) API.
Typically, those not familiar enough with SQL will choose CMP entity beans or frameworks such as TopLink and Hibernate, which generate queries and help to isolate them from the Java code. Regardless of which you choose, the resulting code will still contain Java-based queries. This article advises against this approach and proposes an alternative form of persistence that uses database stored procedures.
Using stored procedures to encapsulate data manipulation helps to solve the following problems often found when this logic?the associated queries?exists within the Java layer:
- Client/server data shuttling: When queries are made from Java code, potentially large amounts of data are shuttled back and forth between the application server and the database. This can greatly impact performance.
- Transactions opened from outside the database: A transaction is often created for each query executed from the Java code within an application server. When a single logical operation requires multiple queries, many individual transactions may be created.
- Tight coupling of database schema and Java code: The database structure is exposed to the Java layer. Hence, future changes to database structure may require changes to the Java code. As with Java class structure and member data, the database structure should be abstracted from other layers in the software.
- Software release coordination: Because the Java code depends on the database structure, database releases that contain schema changes must be coordinated with releases of the dependant Java code. It’s often difficult to coordinate such releases. Furthermore, if one system must be rolled-back due to a problem, dependant systems must be rolled back also.
The Proposed Solution
The use of stored procedures helps to solve the problems listed above. For operations that require the lookup and/or update of data across multiple tables, the resulting queries are placed in a stored procedure that encapsulates the details. The advantage of this design is that it places the data-related business logic (the knowledge of where data resides, and how to manipulate it) where it belongs: in the database. The design goal for every application should be to place as much of the business logic into the database layer as stored procedures and limit the amount of Java code in use (see Figure 1). The proposed architecture moves the business logic further into the center of the diagram, out of the Java layer, and into the database layer.
|Figure 1. Move Business Logic into the Database Layer as Stored Procedures|
As a Java developer, this architecture may alarm you as it appears to reduce your development role. In my experience, three reasons explain why this is not the case:
- Many Java developers write their own queries anyway, and this architecture simply moves them into stored procedures.
- You’ll always have more work to do in the Java layer than time and resources allow. Handing off the SQL work to a database team takes pressure off the Java developer.
- The gain in performance and scalability this architecture brings helps reduce the likelihood that a project will fail.
Let’s explore in more detail why this architecture is better.
Reduced Coupling of Business Logic and Data
The amount of data that is shuffled between the Java layer and the database is reduced because the stored procedure returns only the results of the operation, not the data from other tables used to produce the results. Since all of the queries reside within the stored procedure, the entire operation is wrapped in one transaction within the database. This eliminates the individual transactions that would otherwise have been created from the application server. Additionally, since the Java code no longer makes queries to the database directly, the database schema is not exposed.
Calling a stored procedure is very much like a call to a Java class: parameters are passed to a named function, and results are produced. As long as the function signature remains the same, changes to the internal format of the data (the schema) will not require changes to the Java code. Developing loosely coupled software components should be one of your goals.
Let’s look at a code example to help prove these two points. Figure 2 contains a summary of a design that places SQL queries within the Java layer to perform a user login function. Let’s assume that this is a Web-based application in which the following steps occur:
- Perform a SQL query to locate the user within the USER_DATA table, and retrieve the user’s data (username, password, login count, and max logins).
- Check that the password entered matches the password from the database
- If the password matches, perform a query on the USER_LOGIN table to see if the user is already logged in.
- If a row is not found, it means that the user is not logged in. Go to step 3.
- If a row is found, compare the user’s cookie to the cookie returned from the query. If it matches, it means that the user is already logged in, and can proceed to use the system – you are done.
- If the cookies don’t match, it means that the same username is being used on another computer. In this case, you must check that the user login count has not been exceeded (a system may limit a username to one login, multiple logins, or limitless logins). If the count is not exceeded, proceed to the next step. Otherwise, fail the login.
- Insert a row into the USER_LOGIN table with the username, cookie value, and a timestamp.
- Update the USER_DATA table to increment the user’s login count.
|Figure 2. SQL Queries Within the Java Layer to Perform a User Login|
The code in Listing 1 implements this design, and it contains all of the Java code and SQL queries needed to perform the login. In total, four calls are made to the database, and in all cases data is shuttled back and forth. Two of the calls execute SELECT statements; one call executes an INSERT; and the final call executes an UPDATE. In many cases, each call will start an internal database transaction?never mind the transaction the Java code must create to ensure that the four queries execute as one atomic operation.
Problems arise when you want to change your application’s business logic, which may result in changes to both the database and the Java code. Additionally, since database table and column names appear throughout the Java code, changes to the database schema may also require changes to the Java code. While tools such as Hibernate may appear to insulate you from the database schema, they usually don’t solve the data shuttling or multi-transaction problems. Additionally, these tools add a layer of complexity to your software, requiring an added step in the build sequence to generate the appropriate code.
Moving the logic behind user login into a stored procedure solves these problems. The stored procedure in Listing 2 performs all of the processing required for user login.
The username and password are checked; the user count is checked; an entry is placed into the USER_LOGIN table; and finally, the user login count is incremented. All of these steps are completed within a single transaction started from within the database. All of the data shuttling from the previous design has been eliminated, as well as the time it took to create the additional transactions. The result is an increase in performance, and an even greater increase in scalability. Experience with this design has shown that the removal of the data shuttling and extraneous transactions increases the number of concurrent user operations the software can handle.
Listing 3 shows how much simpler the Java code has become.
The calls to the stored procedures are made just like other method calls, albeit with different syntax. The Java code simply specifies the stored procedure name and the parameters. None of the internal database details such as tables, columns, primary keys, joins, and so on, are exposed. If a change is made to the database schema, or even to the business logic, the Java layer will not be impacted.
Additional benefits to data hiding through stored procedures include the independence of release management. As long as stored procedure signatures remain the same, releases to the database system?to optimize code or fix a bug?can be made without requiring a release of the Java code, and vice-versa.
Java and Database Efficiency Gains
As with anything, this approach can be taken too far. Since it’s more difficult to scale database servers than application servers, it’s not wise to put all of an application’s business logic into stored procedures. However, if you follow the general guideline to place data-related business logic into stored procedures, you will see gains in efficiency both at the Java layer and within the database server.