Simplify Java XML Parsing with Jakarta Digester

The Digester framework is a high-level interface that parses an XML stream and populates Java objects with its data, based on rules that you provide to the Digester component. Among the XML parsing options available, the Digester package is one of the simplest to use. With very simple classes and a collection of predefined Rules included, Digester simplifies the parsing of complex XML schemas.

Digester requires an XML parser that conforms to JAXP version 1.1 or later, and it uses that parser's SAX interface for the actual parsing. It is easier to use than SAX alone, however, because Digester hides the complex parsing maintenance. The other main API for XML parsing, DOM, builds the whole document in memory, which makes it impractical for large documents, and large documents are what you deal with most of the time in the real world. Since Digester is just a layer over SAX, the difference in memory usage between DOM and Digester is the same as that between DOM and SAX.

Although Digester does not perform data binding like options such as JAXB and XMLBeans, it gives you the flexibility to create the Java classes that your architecture requires, rather than the ones that the XML semantics demand. The Rules that you provide act as triggers that Digester executes when it matches elements in the XML.

Digester Under the Hood
Digester depends on the following Jakarta Commons components, which must be in the classpath when you use Digester (refer to the Digester status file page for more information about these dependencies):

  • BeanUtils component
  • Collections component

Using and Customizing Digester
Digester is simplest to use when you have a direct mapping between the input XML stream and the Java objects.

To begin creating the Rules, you need to complete the following four steps:

  1. Identify the mapping from the source (i.e., input XML stream) to the output (i.e., the Java objects).
  2. Identify the pattern elements in the XML that contain data you need.
  3. Identify the data components that will hold the data.
  4. Create rules based on your findings and assign them to Digester.

A simple example (input.xml) should make this process clearer. Listing 1 shows the XML that you need to parse.

Listing 1. input.xml

<response>
    <request>
        <name>books</name>
        <value>xml</value>
    </request>
    <matches>20</matches>
</response>

Listings 2 and 3 show the Java classes that you need to populate.

Listing 2. Response Class

public class Response {
    private int _matches = 0;
    private Request _request;

    public Request getRequest() {
        return _request;
    }

    public void setRequest(Request request) {
        _request = request;
    }

    public int getMatches() {
        return _matches;
    }

    public void setMatches(int matches) {
        _matches = matches;
    }
}

Listing 3. Request Class

public class Request {
    private String _name = "";
    private String _value = "";

    public String getName() {
        return _name;
    }

    public void setName(String name) {
        _name = name;
    }

    public String getValue() {
        return _value;
    }

    public void setValue(String value) {
        _value = value;
    }
}

Listing 4 shows the class that parses the XML using Digester. You set the Rules in Digester with this class.

Listing 4. DigesterExample Class

import org.apache.commons.digester.*;
import java.io.Reader;
import java.io.StringReader;

public class DigesterExample {
    public static void main(String ar[]) {
        try {
            Digester digester = new Digester();
            digester.setValidating( false );

            // Create a Response for <response> and a Request for <request>
            digester.addObjectCreate( "response", Response.class );
            digester.addObjectCreate( "response/request", Request.class );

            // Copy element text into the matching bean properties
            digester.addBeanPropertySetter( "response/request/name", "name" );
            digester.addBeanPropertySetter( "response/request/value", "value" );

            // Hand the finished Request to Response.setRequest()
            digester.addSetNext( "response/request", "setRequest" );
            digester.addBeanPropertySetter( "response/matches", "matches" );

            Reader reader = new StringReader(
                "<response>" +
                    "<request>" +
                        "<name>books</name><value>xml</value>" +
                    "</request>" +
                    "<matches>20</matches>" +
                "</response>");
            Response response = (Response)digester.parse( reader );
            System.out.println( response.toString() );
        } catch( Exception exc ) {
            exc.printStackTrace();
        }
    }
}

Listing 4 shows how easy and straightforward using Digester is, especially when you have a direct mapping between the XML and the Java classes. (The Rules that come with the Digester package are sufficient to do the mapping and should serve as a constant reference while you read this article.) The element matching patterns within the class define when a particular Rule is fired. Each Rule extends org.apache.commons.digester.Rule and defines the action that occurs when it is fired.
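To make the idea of element matching patterns more concrete, the following sketch imitates the mechanism on top of plain SAX, using only the JDK. It is not how Digester is implemented; the class name PathRuleSketch, its addRule method, and the map of patterns to callbacks are assumptions made purely for illustration. The handler tracks the path of open elements and fires a callback when that path equals a registered pattern.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration only: fires a callback when the current
// element path exactly equals a registered pattern.
class PathRuleSketch extends DefaultHandler {
    private final Map<String, Consumer<String>> rules = new LinkedHashMap<>();
    private final Deque<String> path = new ArrayDeque<>();
    private final StringBuilder text = new StringBuilder();

    void addRule(String pattern, Consumer<String> action) {
        rules.put(pattern, action);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        path.addLast(qName);   // push the element onto the current path
        text.setLength(0);     // start collecting this element's text
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Look up a rule for the full path, e.g. "response/request/name"
        Consumer<String> action = rules.get(String.join("/", path));
        if (action != null) {
            action.accept(text.toString().trim());
        }
        path.removeLast();
        text.setLength(0);
    }

    // Registers three rules and returns the values they collected.
    static Map<String, String> extract(String xml) throws Exception {
        Map<String, String> found = new LinkedHashMap<>();
        PathRuleSketch handler = new PathRuleSketch();
        handler.addRule("response/request/name", v -> found.put("name", v));
        handler.addRule("response/request/value", v -> found.put("value", v));
        handler.addRule("response/matches", v -> found.put("matches", v));
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return found;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<response><request><name>books</name><value>xml</value></request>"
                + "<matches>20</matches></response>";
        System.out.println(extract(xml));
    }
}
```

Even this small handler has to manage the path and the text buffer by hand, which is the kind of bookkeeping that Digester takes over.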

Digester in the Real World
Although Digester is a straightforward solution, things are not always so ideal in the real world. For example, say you have to parse XML whose elements keep changing based on the input. You typically find such request/response streams in the search world. If you search for a book, you get content that contains data to be populated in the Book object. Searching for a magazine returns content that contains data to be populated in the Magazine object. In such a situation, you end up writing custom Rules.

Listing 5 shows the response from a book search. The content elements are the dynamic section of the response.

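Listing 5. BookResponse.xml

<response>
    <request>
        <name>books</name>
        <value>xml</value>
    </request>
    <matches>2</matches>
    <content>
        <title>book1</title>
        <author>author1</author>
    </content>
    <content>
        <title>book2</title>
        <author>author2</author>
    </content>
</response>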

Listing 6 shows the response from a magazine search. Again, the content section is the dynamic part of the response.

Listing 6. MagazineResponse.xml			magazines		security	      3                                  securityMagazine 1                  securityMagazine2                  securityMagazine3                

You can use the same Digester class you used earlier (DigesterExample) with a little modification to parse XML whose elements continually change this way. Just add the two new methods shown in Listing 7 to the Response class.

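Listing 7. Content Related Methods

// Field that backs the two new methods
private Object _content;

public void addContent(Object o) {
    _content = o;
}

public Object getContent() {
    return _content;
}

Now, you need to create a custom Rule that gets triggered whenever Digester encounters a child of the content element. Listing 8 demonstrates how to use a custom Rule. The extra code that is required is the handling of the two command-line arguments, the setRules call, and the three rules registered for response/content.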

Listing 8. DigesterExample Class Using a Custom Rule

import org.apache.commons.digester.*;
import java.io.*;

public class DigesterExample {
    public static void main(String ar[]) {
        try {
            // ar[0]: name of the content class; ar[1]: name of the ContentBuilder class
            Class contentClass = Class.forName( ar[0] );
            ContentBuilder contentBuilder =
                (ContentBuilder)Class.forName( ar[1] ).newInstance();

            Digester digester = new Digester();
            digester.setValidating( false );

            // Needed for the "?" wildcard pattern used below
            digester.setRules( new ExtendedBaseRules() );

            digester.addObjectCreate( "response", Response.class );
            digester.addObjectCreate( "response/request", Request.class );
            digester.addBeanPropertySetter( "response/request/name", "name" );
            digester.addBeanPropertySetter( "response/request/value", "value" );
            digester.addSetNext( "response/request", "setRequest" );
            digester.addBeanPropertySetter( "response/matches", "matches" );

            // Create the content object, fill it through the custom Rule,
            // then pass it to Response.addContent()
            digester.addObjectCreate( "response/content", contentClass );
            digester.addRule( "response/content/?", new DefaultRule( digester, contentBuilder ) );
            digester.addSetNext( "response/content", "addContent", "java.lang.Object" );

            File input = new File( "input.xml" );
            Response response = (Response)digester.parse( input );
            System.out.println( response.toString() );
        } catch( Exception exc ) {
            exc.printStackTrace();
        }
    }
}

The program takes two command-line arguments, both of which are class names. The first names the content class (contentClass), which is the container for the data in the content elements of Listings 5 and 6. So, for a Book content item, you pass the Book class. The second names the class that is responsible for populating the Book object.

Listing 9 shows the code for a custom Rule. The call getDigester().peek() returns a reference to the object on top of the Digester stack. In this example, that is the content object, whose type depends on the search request.
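Turning a command-line string into a class or an instance is done with reflection. The following is a minimal sketch that uses only the JDK; the helper class ArgClassLoaderSketch and its method names are hypothetical, and JDK classes stand in for the article's Book and builder classes.

```java
// Hypothetical helper: turns class names given as strings into a Class
// object and into a new instance of an expected type.
class ArgClassLoaderSketch {

    // Resolves a class name to its Class object.
    static Class<?> loadClass(String className) throws ClassNotFoundException {
        return Class.forName(className);
    }

    // Instantiates the named class through its no-argument constructor and
    // checks that the result has the expected type.
    static <T> T newInstance(String className, Class<T> expectedType)
            throws ReflectiveOperationException {
        Class<?> raw = Class.forName(className);
        return expectedType.cast(raw.getDeclaredConstructor().newInstance());
    }

    public static void main(String[] args) throws Exception {
        // JDK stand-ins; in the article these would be the content and builder classes.
        Class<?> contentClass = loadClass("java.lang.StringBuilder");
        CharSequence builder = newInstance("java.lang.StringBuilder", CharSequence.class);
        System.out.println(contentClass.getName() + " / " + builder.getClass().getName());
    }
}
```

Both classes must have a public no-argument constructor for this approach to work, which is also what Digester's object-create rule expects of the content class.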

Listing 9. Custom Rule Class (DefaultRule)

import org.apache.commons.digester.Rule;
import org.apache.commons.digester.Digester;

public class DefaultRule extends Rule {
    private Digester _digester;
    private ContentBuilder _builder;

    public DefaultRule(Digester digester, ContentBuilder builder) {
        super();
        _digester = digester;
        _builder = builder;
    }

    // Called with the body text of each element that matches the pattern
    public void body(String namespace, String name, String text)
    throws Exception {
        _builder.addAttribute(name, text, getDigester().peek());
    }
}

Listing 10 shows the BookBuilder class. The object parameter is a reference to the Book object.

Listing 10. BookBuilder Class

public class BookBuilder implements ContentBuilder {
    // Called by DefaultRule for each child element of content
    public void addAttribute(String name, String text, Object object)
    throws Exception {
        Book book = (Book)object;
        if (name.equals("title")) {
            book.setTitle(text);
        } else if (name.equals("author")) {
            book.setAuthor(text);
        }
    }
}
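The article does not show the ContentBuilder interface or the Book class. The following is a minimal sketch of what they might look like. The method name addAttribute is taken from the call in Listing 9; the parameter names, the Book properties, and the ContentBuilderDemo class are assumptions.

```java
// Assumed shape of the interface that the custom Rule calls.
interface ContentBuilder {
    void addAttribute(String name, String text, Object target) throws Exception;
}

// Assumed content bean for a book search.
class Book {
    private String title = "";
    private String author = "";

    public String getTitle() { return title; }
    public void setTitle(String title) { this.title = title; }
    public String getAuthor() { return author; }
    public void setAuthor(String author) { this.author = author; }
}

// Hypothetical demo: makes the same calls the custom Rule would make
// for the child elements of one content element.
class ContentBuilderDemo {
    public static void main(String[] args) throws Exception {
        ContentBuilder builder = (name, text, target) -> {
            Book book = (Book) target;
            if ("title".equals(name)) {
                book.setTitle(text);
            } else if ("author".equals(name)) {
                book.setAuthor(text);
            }
        };
        Book book = new Book();
        builder.addAttribute("title", "book1", book);
        builder.addAttribute("author", "author1", book);
        System.out.println(book.getTitle() + " by " + book.getAuthor());
    }
}
```

Keeping the interface this small means that supporting a new kind of content only requires a new bean and a new builder, not a new Rule.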

Because the pattern response/content/? relies on a wildcard that Digester's default rule matching does not support, you have to tell Digester to use the ExtendedBaseRules class, which allows more kinds of matching patterns. The methods you can override when you extend the Rule object are begin, body, end, and finish.
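As a rough illustration of what the trailing ? in response/content/? means, the sketch below checks whether an element path is a direct child of the parent named in such a pattern. It uses only the JDK and covers just this one case; it is not the ExtendedBaseRules implementation, which supports more pattern kinds, and the class and method names are hypothetical.

```java
// Hypothetical sketch of "parent match" semantics: a pattern that ends in "/?"
// matches any direct child of the named parent element.
class ParentMatchSketch {

    static boolean matches(String pattern, String path) {
        if (!pattern.endsWith("/?")) {
            // Without the wildcard, fall back to an exact match
            return pattern.equals(path);
        }
        String parent = pattern.substring(0, pattern.length() - 2);
        int lastSlash = path.lastIndexOf('/');
        // The path minus its last segment must equal the parent
        return lastSlash >= 0 && path.substring(0, lastSlash).equals(parent);
    }

    public static void main(String[] args) {
        System.out.println(matches("response/content/?", "response/content/title"));
        System.out.println(matches("response/content/?", "response/matches"));
    }
}
```

Under these semantics a single rule covers title, author, or whatever child elements a given search returns, which is what lets one custom Rule handle both the book and the magazine response.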

Just like that, you’ve seen how the Digester package offers simplicity in parsing XML. You can use it with a straightforward mapping, as well as more complex XML schemas, with some simple variations.
