Java Package Processes All RSS Formats

Several Java packages work with RSS. Some can read certain formats, while others can write them as well. The class this article features reads all known RSS formats and outputs the converted data in the 2.0 format.

RSS (Really Simple Syndication) is a flexible and efficient format for exchanging structured, dynamically changing data such as news headlines, blogs, job vacancies, new projects, and recent wiki changes. You might not have noticed it yet, but as soon as you get tuned in, you'll discover it is ubiquitous.

You'll find several scattered Java packages that work with RSS. Some can read certain formats, while others can write them as well. The class I feature in this article, com.myjavatools.xml.Rss, reads all known RSS formats (from version 0.90 to 2.0) and outputs all the converted data in the 2.0 format.

You can instantiate com.myjavatools.xml.Rss from any RSS feed, regardless of version. The class gives access to all the RSS elements, and you can also create new RSS containers, add or change their contents, and write RSS data to an output stream. However, it offers no syndication features, such as filtering or selecting items.

A Brief History of RSS
Userland developed RSS in 1997. Netscape immediately adopted it, and Userland and Netscape continued working on RSS formats in parallel. The RSS-DEV working group and the W3C's RDF (Resource Description Framework) group also participated in developing RSS standards. For Userland, RSS stood for Rich Site Summary; for Netscape, it was RDF Site Summary. Eventually, other expansions of RSS emerged: Remote Site Syndication and Really Simple Syndication. The last one is the most popular because it indicates the main function of the RSS format: facilitating content feed syndication.

Essentially, an RSS document contains one <channel> element, which may have a title, a description, and a link, as well as an <image> element and some number of <item> and <textInput> elements. All these elements have at least a title, and <item> typically also has a description and a link to a resource. If only one RSS format (such as XML) existed, that would be the end of the story. Unfortunately, history had other ideas. The following is an example of the Netscape version (0.90) for a document that features an article and a tip I wrote for DevX:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://my.netscape.com/rdf/simple/0.9/">
  <channel>
    <title>devx.com</title>
    <link>http://www.devx.com</link>
    <description>DevX.com--the know-how behind application development.</description>
    <image>
      <title></title>
      <url>http://www.devx.com/assets/devx/7819.gif</url>
      <link>http://www.devx.com/</link>
    </image>
    <item>
      <title>Practical XML for Java Programs</title>
      <link>http://www.devx.com/Java/Article/16571/0</link>
    </item>
    <item>
      <title>Get Nasdaq Quotes Online in Your Java Program</title>
      <link>http://www.devx.com/tips/Tip/14965</link>
    </item>
  </channel>
</rdf:RDF>

Note the absence of item descriptions.
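
To make the structure of the 0.90 document concrete, here is a minimal sketch that reads the channel title and the item titles from it using only the XML parser bundled with the JDK. It does not use com.myjavatools.xml.Rss, whose API is not shown at this point in the article; the class name Rss090Sketch, its helper methods, and the shortened SAMPLE document are assumptions made purely for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper class, not part of com.myjavatools.xml.
public class Rss090Sketch {
    // Default namespace declared by the RSS 0.90 example document.
    static final String RSS_090_NS = "http://my.netscape.com/rdf/simple/0.9/";

    // Shortened copy of the 0.90 example (the <image> element is omitted).
    static final String SAMPLE =
        "<rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\""
        + " xmlns=\"http://my.netscape.com/rdf/simple/0.9/\">"
        + "<channel>"
        + "<title>devx.com</title>"
        + "<link>http://www.devx.com</link>"
        + "<item><title>Practical XML for Java Programs</title>"
        + "<link>http://www.devx.com/Java/Article/16571/0</link></item>"
        + "<item><title>Get Nasdaq Quotes Online in Your Java Program</title>"
        + "<link>http://www.devx.com/tips/Tip/14965</link></item>"
        + "</channel>"
        + "</rdf:RDF>";

    // Parses an XML string into a namespace-aware DOM document.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Returns the trimmed text of the first direct child element with the
    // given local name, or null if the parent has no such child.
    static String childText(Element parent, String localName) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && localName.equals(n.getLocalName())) {
                return n.getTextContent().trim();
            }
        }
        return null;
    }

    // Returns the title of the first <channel> element, or null if absent.
    static String channelTitle(Document doc) {
        NodeList channels = doc.getElementsByTagNameNS(RSS_090_NS, "channel");
        if (channels.getLength() == 0) {
            return null;
        }
        return childText((Element) channels.item(0), "title");
    }

    // Collects the titles of all <item> elements in document order.
    static List<String> itemTitles(Document doc) {
        List<String> titles = new ArrayList<>();
        NodeList items = doc.getElementsByTagNameNS(RSS_090_NS, "item");
        for (int i = 0; i < items.getLength(); i++) {
            titles.add(childText((Element) items.item(i), "title"));
        }
        return titles;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(SAMPLE);
        System.out.println("channel: " + channelTitle(doc));
        for (String title : itemTitles(doc)) {
            System.out.println("item: " + title);
        }
    }
}
```

Running main prints the channel title followed by the two item titles. Because the lookup is tied to the 0.90 namespace, this sketch would find nothing in a feed written in a later, namespace-free version, which illustrates why a reader that handles every version is useful.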

The next Netscape version, 0.91, introduced in July 1999, dropped namespaces and added descriptions to items. Here is how the same example appears in this version:

<rss version="0.91"> <channel> <title>devx.com</title> <link>http://www.devx.com/</link> <description>DevX.com