Beyond XML and JSON: YAML for Java Developers

Despite all the buzz generated by dynamic languages (Ruby, Groovy, Python, etc.) and their related frameworks (such as Ruby on Rails), the vast majority of Java developers reading this article deal mostly with pure Java at their day jobs and will continue to do so for many years to come. However, that doesn’t mean that they can’t learn something from the new kids on the block and add a new tool to their arsenals. This article introduces the YAML (short for YAML Ain’t Markup Language) file format (popularized by the Ruby on Rails framework, which uses YAML for all of its configuration files) and shows how it differs from XML and JSON. It goes on to examine YAML’s advantages and drawbacks.

Whitespace Indentation and the JSON Option
The YAML file format is centered on the concept of whitespace indentation, which indicates the hierarchical structure of data in place of nested XML tags or JSON braces ({}) and brackets ([]). YAML is also a superset of JSON, so when it is useful, you may break out of the whitespace flow and adopt a typical JSON-style syntax. Its creators describe it as a “human-friendly data serialization standard for all programming languages.” In my experience, its focus on “human-friendliness” is what sets it apart.

Here’s a sample of pure YAML using whitespace indentation. Fixed width fonts are critical when creating YAML files, because spacing is crucial (see Sidebar 1. YAML and Tabs):

JFrame:
    defaultCloseOperation: JFrame.EXIT_ON_CLOSE
    title: Test Frame
    width: 800
    height: 400
    components:
        - JTextArea:
             name: textArea1
             text: |
               This is a really long text
               that spans multiple lines (but preserves new lines).
               It does not need to be escaped with special brackets,
               CDATA tags, or anything like that
        - JButton:
             name: button2
             text: Button 2

Depending on your needs, you can switch at any point to JSON-style syntax and mix and match it with whitespace indentation, for example:

JFrame:
    defaultCloseOperation: JFrame.EXIT_ON_CLOSE
    title: Test Frame
    width: 800
    height: 400
    components:
        - JTextArea:
             name: textArea1
             text: |
               This is a really long text
               that spans multiple lines (but preserves new lines).
               It does not need to be escaped with special brackets,
               CDATA tags, or anything like that
        - JButton: {name: button2, text: Button 2} #JSON syntax

The ability to switch to JSON-style syntax is especially useful on the lowest-level nodes (i.e., those that have no children of their own). Also, as you’ve probably guessed, the pound sign (#) is used to embed comments into a YAML file.
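Because indentation carries the structure, a stray tab can silently break a file (YAML does not allow tabs for indentation). The following is a minimal sketch, not part of any YAML library, of how you might check a file's indentation from Java before handing it to a parser. The class and method names (IndentCheck, indentOf, indents) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of any YAML library. */
final class IndentCheck {

    /** Returns the number of leading spaces, or throws if a tab appears in the indentation. */
    static int indentOf(String line) {
        int i = 0;
        while (i < line.length()) {
            char c = line.charAt(i);
            if (c == '\t') {
                throw new IllegalArgumentException("tab used for indentation at column " + i);
            }
            if (c != ' ') {
                break; // first character of the actual content
            }
            i++;
        }
        return i;
    }

    /** Computes the indentation of every non-blank line in a YAML snippet. */
    static List<Integer> indents(String yaml) {
        List<Integer> result = new ArrayList<>();
        for (String line : yaml.split("\n")) {
            if (!line.trim().isEmpty()) {
                result.add(indentOf(line));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String yaml = "JFrame:\n    title: Test Frame\n    components:\n        - JButton\n";
        System.out.println(indents(yaml)); // prints [0, 4, 4, 8]
    }
}
```

Deeper indentation means a deeper level in the hierarchy, so the printed numbers mirror the nesting of the sample.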

Elements of YAML Structure
Now that you have seen a quick sample of YAML, let’s delve deeper into the typical elements of YAML structure: hashes, lists, and block literals.

Hashes
You create hashes by indenting the children and separating the key from the value with a colon (:), as follows:

JFrame:
    defaultCloseOperation: JFrame.EXIT_ON_CLOSE
    title: Test Frame
    width: 800
    height: 400

Alternatively, you can create hashes by using the JSON-compatible braces syntax ({}), where each key/value is comma-separated:

JFrame: {defaultCloseOperation: JFrame.EXIT_ON_CLOSE, title: Test Frame, width: 800, height: 400}

Lists
You create lists by prefixing each element of the list with a minus sign (-), combined with whitespace indentation, the cornerstone of YAML:

    components:
        - JTextArea
        - JButton

Alternatively, you can create lists by using the JSON-compatible brackets syntax ([]), for example:

    components: [JTextArea, JButton]
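For a Java developer, it may help to think of a YAML hash as a java.util.Map and a YAML list as a java.util.List. As a rough sketch under that assumption, the hypothetical FlowStyle class below renders nested Java collections in the JSON-style (flow) syntax shown above. It makes no attempt at quoting or escaping, so treat it as an illustration of the mapping rather than a real emitter.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical illustration: hashes become {...}, lists become [...]. */
final class FlowStyle {

    /** Renders nested Map/List/scalar values in flow style (no quoting or escaping). */
    static String render(Object value) {
        if (value instanceof Map) {
            StringJoiner joiner = new StringJoiner(", ", "{", "}");
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                joiner.add(e.getKey() + ": " + render(e.getValue()));
            }
            return joiner.toString();
        }
        if (value instanceof List) {
            StringJoiner joiner = new StringJoiner(", ", "[", "]");
            for (Object item : (List<?>) value) {
                joiner.add(render(item));
            }
            return joiner.toString();
        }
        return String.valueOf(value); // scalars are written as-is
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the keys in insertion order
        Map<String, Object> frame = new LinkedHashMap<>();
        frame.put("title", "Test Frame");
        frame.put("width", 800L);
        frame.put("components", Arrays.asList("JTextArea", "JButton"));
        System.out.println(render(frame));
        // prints {title: Test Frame, width: 800, components: [JTextArea, JButton]}
    }
}
```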

Block Literals
This is where YAML really shines, in particular compared with XML and its ugly CDATA hack. Block literals make inserting large blocks of text into a file trivially easy. You can preserve the newlines in your text by using the vertical line (|) directive as follows:

text: |
   This is a really long text
   that spans multiple lines (but preserves new lines).
   It does not need to be escaped with special brackets,
   CDATA tags, or anything like that

The YAML processor starts the text at the first non-whitespace character of the first line, discards the leading whitespace used for indentation on every line, and preserves all the newlines in your text.

Alternatively, you can use the greater-than (>) directive to tell the YAML processor to fold the text: each line break between consecutive lines is replaced with a space, so the entered text is treated as one long line:

text: >
   This is a really long text
   that spans multiple lines (but preserves new lines).
   It does not need to be escaped with special brackets,
   CDATA tags, or anything like that

Besides these two directives, you can also append a plus or minus sign to control what happens at the end of the block. Vertical line and plus sign (|+) preserves the newlines and also keeps any trailing blank lines, while greater-than and minus sign (>-) folds the newlines and strips the final line break as well.
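To make the difference concrete, here is a simplified Java sketch of what a processor hands back for the common case. It assumes the lines have already had their indentation removed, and it ignores blank lines and more-indented lines, which real processors treat specially. The BlockScalars class and its method names are hypothetical.

```java
/** Hypothetical illustration of block-literal styles; not a YAML parser. */
final class BlockScalars {

    /** Literal style (|): keep every line break and end with a single newline. */
    static String literal(String... lines) {
        return String.join("\n", lines) + "\n";
    }

    /** Folded style (>): join the lines with spaces and end with a single newline. */
    static String folded(String... lines) {
        return String.join(" ", lines) + "\n";
    }

    /** The minus suffix (as in >-): drop the trailing line break(s). */
    static String strip(String text) {
        int end = text.length();
        while (end > 0 && text.charAt(end - 1) == '\n') {
            end--;
        }
        return text.substring(0, end);
    }

    public static void main(String[] args) {
        String[] lines = {"This is a really long text", "that spans multiple lines"};
        System.out.print(literal(lines));        // two lines, as typed
        System.out.print(folded(lines));         // one long line followed by a newline
        System.out.println(strip(folded(lines))); // one long line, no newline of its own
    }
}
```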

YAML vs. XML and JSON
As you can plainly see from the examples so far, YAML is noticeably less verbose than XML. Most of a YAML file’s content is the actual data, not endless lists of opening and closing tags, which themselves are often larger than the data they describe. As such, YAML is much better suited for any sort of data file that you may need to maintain by hand.

On the downside, YAML does not provide the concept of a schema or DTD, so there is no way to verify whether the format of the file is what you expected. XML’s verbosity has its costs, but the overall maturity of that format provides a lot of extra tools for validation that YAML does not have (yet).
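In the absence of a schema, one workaround is to check the loaded data by hand. The sketch below assumes the file has already been parsed into a java.util.Map (as a generic YAML load would typically give you) and verifies that a few expected keys exist with the expected types. The FrameValidator class and the keys it checks are hypothetical choices based on the JFrame sample used in this article.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical hand-rolled check that stands in for a missing schema. */
final class FrameValidator {

    /** Returns a list of problems; an empty list means the structure looks as expected. */
    static List<String> validate(Map<String, Object> frame) {
        List<String> problems = new ArrayList<>();
        requireType(frame, "title", String.class, problems);
        requireType(frame, "width", Number.class, problems);
        requireType(frame, "height", Number.class, problems);
        return problems;
    }

    private static void requireType(Map<String, Object> map, String key,
                                    Class<?> type, List<String> problems) {
        Object value = map.get(key);
        if (value == null) {
            problems.add("missing key: " + key);
        } else if (!type.isInstance(value)) {
            problems.add(key + " should be " + type.getSimpleName()
                    + " but was " + value.getClass().getSimpleName());
        }
    }

    public static void main(String[] args) {
        Map<String, Object> frame = new LinkedHashMap<>();
        frame.put("title", "Test Frame");
        frame.put("width", "wide"); // wrong type on purpose
        System.out.println(validate(frame));
        // prints [width should be Number but was String, missing key: height]
    }
}
```

This is far less than what an XML schema offers, but it catches the most common mistakes in hand-edited files.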

JSON is perfect for any data that is geared towards efficiency and reducing file size, because it wastes almost no space on whitespace or closing tags. However, as the content of a JSON file increases in complexity, it descends into closing-bracket hell. This is most painfully visible in JavaFX Script code (whose declarative syntax closely resembles JSON). Any UI structure beyond the simplest results in a data file that becomes nearly incomprehensible towards the end.

Look at this JavaFX code sample and pay particular attention to how it ends:

                          }
                      }
                    }
                }
              ]
            }
        }
      center: bookPanel
    }

The mix of structural “{}” and list “[]” brackets makes maintaining large JSON-style files by hand quite difficult. YAML solves this issue quite neatly with its whitespace indentation approach, while still allowing you to switch to a JSON-style flow whenever it is acceptable (such as on bottom-level nodes).

YAML Java Libraries and Development Tools
The most popular Java library for processing YAML files is JvYAML. JRuby (the Ruby version that runs as a dynamic language on the Java VM) uses JvYAML in its port of the Ruby on Rails framework. JvYAML provides facilities to perform generic processing of a file (in which case it returns a nested hierarchy of standard Java String, Long, Map, and List objects). You save a file using the static dump() method, and load one via the static load() method, for example:

YAML.dump(Object data, Writer output);
Object data = YAML.load(Reader io);
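Because a generic load returns a plain Object, you have to cast your way through the nested maps and lists. The following sketch does not call JvYAML at all. It builds by hand the structure that a generic load of the earlier sample is assumed to return (abbreviated, with integers as Long, as described above) and then walks it. The LoadedFrame class and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of navigating generically loaded YAML data. */
final class LoadedFrame {

    /** Builds by hand what a generic load of the (abbreviated) sample is assumed to return. */
    static Map<String, Object> sample() {
        Map<String, Object> textArea = new LinkedHashMap<>();
        textArea.put("name", "textArea1");

        Map<String, Object> button = new LinkedHashMap<>();
        button.put("name", "button2");
        button.put("text", "Button 2");

        Map<String, Object> textAreaEntry = new LinkedHashMap<>();
        textAreaEntry.put("JTextArea", textArea);
        Map<String, Object> buttonEntry = new LinkedHashMap<>();
        buttonEntry.put("JButton", button);

        Map<String, Object> frame = new LinkedHashMap<>();
        frame.put("title", "Test Frame");
        frame.put("width", 800L);
        frame.put("height", 400L);
        frame.put("components", Arrays.asList(textAreaEntry, buttonEntry));

        Map<String, Object> root = new LinkedHashMap<>();
        root.put("JFrame", frame);
        return root;
    }

    /** Collects the "name" of every component; the casts are unchecked by necessity. */
    @SuppressWarnings("unchecked")
    static List<String> componentNames(Map<String, Object> root) {
        Map<String, Object> frame = (Map<String, Object>) root.get("JFrame");
        List<Map<String, Object>> components =
                (List<Map<String, Object>>) frame.get("components");
        List<String> names = new ArrayList<>();
        for (Map<String, Object> entry : components) {
            // each list element is a one-key hash such as {JButton: {...}}
            for (Object properties : entry.values()) {
                names.add((String) ((Map<String, Object>) properties).get("name"));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(componentNames(sample())); // prints [textArea1, button2]
    }
}
```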

See Sidebar 2. Open Source Project Based on YAML for information about a YAML-based open source project for building declarative UIs in Java.

In reality, YAML files are usually so simple that you can easily maintain them with any text editor. However, a few specialized text editors provide helpful syntax highlighting. For Eclipse, it’s the Eclipse YAML editor. For NetBeans, you can use the YAML editor that comes with the Ruby pack. However, the YAML editor in NetBeans 6.1 is not very useful; it supports only a small subset of YAML (e.g., it does not support block literals). This is supposed to be fixed by the new YAML editor coming in NetBeans 6.5.

I did not have a chance to test IntelliJ IDEA, but I presume its Ruby on Rails plugin ships with a YAML editor.

Time to Add the YAML Tool
The overly verbose XML format is overkill in most cases. YAML, which is easy to use from Java, could be a very useful alternative on your next project. For further study, visit the YAML page at Wikipedia (which has an excellent overview of advanced YAML features, such as data merging and data casting), as well as the official YAML site.
