RSS (Really Simple Syndication) is a flexible and efficient format for exchanging structured, dynamically changing data such as news headlines, blogs, job vacancies, new projects, and recent wiki changes. You might not have noticed it yet, but once you tune in, you’ll discover it is ubiquitous.
You’ll find several scattered Java packages that work with RSS. Some can read certain formats, while others can write them as well. The class I feature in this article, com.myjavatools.xml.Rss, reads all known RSS formats (from version 0.90 to 2.0) and outputs all the converted data in the 2.0 format.
You can instantiate com.myjavatools.xml.Rss from any RSS feed, regardless of version. The class gives access to all the RSS elements, and you can also create new RSS containers, add or change the contents, and write RSS data to an output stream. However, it doesn’t offer any syndication features, so it provides no filtering or selection.
A Brief History of RSS
Userland developed RSS in 1997. Netscape immediately adopted it, and Userland and Netscape continued working on RSS formats in parallel. The RSS-DEV working group and W3C’s RDF (Resource Description Framework) group also participated in developing RSS standards. For Userland, RSS stood for Rich Site Summary; for Netscape, it was RDF Site Summary. Eventually, other explanations for RSS emerged: Remote Site Syndication and Really Simple Syndication. The last one is the most popular because it indicates the main function of the RSS format: facilitating content feed syndication.
Essentially, an RSS document contains one
devx.com http://www.devx.com DevX.com--the know-how behind application development. http://www.devx.com/assets/devx/7819.gif http://www.devx.com/
Practical XML for Java Programs http://www.devx.com/Java/Article/16571/0
Get Nasdaq Quotes Online in Your Java Program http://www.devx.com/tips/Tip/14965
Note the absence of item descriptions.
The next Netscape version, 0.91, introduced in July 1999, dropped namespaces and added descriptions to items. Here is how the same example appears in this version:
devx.com http://www.devx.com/ DevX.com--the know-how behind application development. en-us
Practical XML for Java Programs http://www.devx.com/Java/Article/16571/0 When dealing with XML, you need a convenient representation of the XML data in memory. This article offers Java programmers a solution to achieve this goal: an easy-to-use package for handling XML data in Java.
Get Nasdaq Quotes Online in Your Java Program http://www.devx.com/tips/Tip/14965
Versions 0.92 through 0.94 added various elements to the format, such as webMaster, managingEditor, expirationDate, skipHours, rating, and language. In 2000, an independent group, RSS-DEV, published version 1.0, which was based on 0.90. Version 1.0 turned out to be obscure and hard to read because it made heavy use of the XML namespaces that were popular during those opulent times.
In 2002, Userland published version 2.0, based on 0.92-0.94. Version 2.0 actually had three slightly different flavors:
- September 2002: element dropped
- November 2002: changed from the questionable 1-24 range to 0-23
- January 2003: element returns
Setting these minor discrepancies aside, take a look at an example of version 2.0:
devx.com http://www.devx.com/ DevX.com--the know-how behind application development. en-us
Practical XML for Java Programs http://www.devx.com/Java/Article/16571/0 When dealing with XML, you need a convenient representation of the XML data in memory. This article offers Java programmers a solution to achieve this goal: an easy-to-use package for handling XML data in Java. Vlad Patryshev 2003-07-25
Get Nasdaq Quotes Online in Your Java Program http://www.devx.com/tips/Tip/14965 Get Nasdaq quotes online in your Java program. Vlad Patryshev 2001-11-09
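The sample above shows only the element values. As a rough sketch, the following Java snippet wraps those values in the standard RSS 2.0 element names (rss, channel, item, title, link, description, language, author, pubDate) and reads the item titles back with the JDK's built-in DOM parser. The exact markup is an assumption: in particular, which element carried the date in the original sample is a guess, and the first item description is shortened. The class name Rss20Sample is hypothetical and unrelated to com.myjavatools.xml.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class Rss20Sample {
    // Assumed markup: the sample values wrapped in standard RSS 2.0 element names.
    static final String SAMPLE =
        "<rss version=\"2.0\">"
      + "<channel>"
      + "<title>devx.com</title>"
      + "<link>http://www.devx.com/</link>"
      + "<description>DevX.com--the know-how behind application development.</description>"
      + "<language>en-us</language>"
      + "<item>"
      + "<title>Practical XML for Java Programs</title>"
      + "<link>http://www.devx.com/Java/Article/16571/0</link>"
      + "<description>An easy-to-use package for handling XML data in Java.</description>"
      + "<author>Vlad Patryshev</author>"
      + "<pubDate>2003-07-25</pubDate>"   // element name assumed
      + "</item>"
      + "<item>"
      + "<title>Get Nasdaq Quotes Online in Your Java Program</title>"
      + "<link>http://www.devx.com/tips/Tip/14965</link>"
      + "<description>Get Nasdaq quotes online in your Java program.</description>"
      + "<author>Vlad Patryshev</author>"
      + "<pubDate>2001-11-09</pubDate>"   // element name assumed
      + "</item>"
      + "</channel>"
      + "</rss>";

    // Parses the XML text and returns the title of every item element, in document order.
    static List<String> itemTitles(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList items = doc.getElementsByTagName("item");
            List<String> titles = new ArrayList<>();
            for (int i = 0; i < items.getLength(); i++) {
                titles.add(childText((Element) items.item(i), "title"));
            }
            return titles;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse the feed", e);
        }
    }

    // Returns the trimmed text of the first descendant with the given name, or null if absent.
    static String childText(Element parent, String name) {
        NodeList nodes = parent.getElementsByTagName(name);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        for (String title : itemTitles(SAMPLE)) {
            System.out.println("Item: " + title);
        }
    }
}
```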
RSS.NET provides a comparison chart of all RSS versions.
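Because the versions differ at the root of the document, a reader can usually tell them apart before looking at any items. The sketch below is one possible way to do that with the JDK's DOM parser; it is not how com.myjavatools.xml.Rss detects versions, and the class name RssVersionSniffer is hypothetical. It relies on two conventions: the RDF-based formats (0.90 and 1.0) use an RDF root element and place channel in a version-specific namespace, while the 0.91-2.0 family uses an rss root element with a version attribute.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class RssVersionSniffer {
    // Namespaces used for the channel element by the two RDF-based formats.
    static final String RSS_090_NS = "http://my.netscape.com/rdf/simple/0.9/";
    static final String RSS_10_NS = "http://purl.org/rss/1.0/";

    // Returns "0.90", "1.0", the value of the version attribute, or "unknown".
    static String detect(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed to compare namespaces below
            Element root = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();

            if ("rss".equals(root.getLocalName())) {
                String version = root.getAttribute("version"); // "" when absent
                return version.isEmpty() ? "unknown" : version;
            }
            if ("RDF".equals(root.getLocalName())) {
                if (root.getElementsByTagNameNS(RSS_090_NS, "channel").getLength() > 0) {
                    return "0.90";
                }
                if (root.getElementsByTagNameNS(RSS_10_NS, "channel").getLength() > 0) {
                    return "1.0";
                }
            }
            return "unknown";
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse the feed", e);
        }
    }

    public static void main(String[] args) {
        String rdfFeed =
            "<rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\""
          + " xmlns=\"http://my.netscape.com/rdf/simple/0.9/\"><channel/></rdf:RDF>";
        System.out.println(detect(rdfFeed));
        System.out.println(detect("<rss version=\"2.0\"><channel/></rss>"));
    }
}
```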
RSS in Java
As a decorator class for com.myjavatools.xml.BasicXmlData (see Practical XML for Java Programs), com.myjavatools.xml.Rss inherits all the standard features of XmlData:
- A type name
- A value
- A collection of attributes
- A collection of subelements (which are also instances of BasicXmlData)
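To make those four features concrete, here is a minimal, self-contained sketch of a tree node with the same shape. It is not BasicXmlData itself: the class XmlNode and its methods (attr, add, firstKid) are hypothetical names chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in illustrating the four features listed above.
final class XmlNode {
    final String type;                                            // type name, e.g. "channel"
    final String value;                                           // text value, may be null
    final Map<String, String> attributes = new LinkedHashMap<>(); // attribute name -> value
    final List<XmlNode> kids = new ArrayList<>();                 // subelements, same class

    XmlNode(String type, String value) {
        this.type = type;
        this.value = value;
    }

    // Sets an attribute and returns this node so calls can be chained.
    XmlNode attr(String name, String attrValue) {
        attributes.put(name, attrValue);
        return this;
    }

    // Appends a subelement and returns this node so calls can be chained.
    XmlNode add(XmlNode kid) {
        kids.add(kid);
        return this;
    }

    // Returns the first subelement with the given type name, or null if there is none.
    XmlNode firstKid(String kidType) {
        for (XmlNode kid : kids) {
            if (kid.type.equals(kidType)) {
                return kid;
            }
        }
        return null;
    }

    // Builds a tiny RSS-like tree from the sample values used earlier in the article.
    static XmlNode sampleFeed() {
        return new XmlNode("rss", null)
            .attr("version", "2.0")
            .add(new XmlNode("channel", null)
                .add(new XmlNode("title", "devx.com"))
                .add(new XmlNode("link", "http://www.devx.com/")));
    }

    public static void main(String[] args) {
        XmlNode feed = sampleFeed();
        System.out.println(feed.attributes.get("version"));
        System.out.println(feed.firstKid("channel").firstKid("title").value);
    }
}
```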
The com.myjavatools.xml.Rss class defines three static member classes corresponding to popular RSS subelements: Rss.Image, Rss.TextInput, and Rss.Item. Other subelements, such as cloud, are just plain instances of BasicXmlData.
The class’s constructors instantiate RSS from a File, URL, InputStream, or XmlData. You can also use the default constructor to create an empty RSS container and then build the container by adding items (addItem(Rss.Item item)) and setting various data (using setters such as setImage, setCopyright, setDescription, setTextInput, setWebmaster, and the like; there are 21 setters in total).
Twenty-three getter methods in com.myjavatools.xml.Rss (getCategory(), getCloud(), and so on, up to getWebMaster()) return the values of various RSS elements. Most of these methods return a String, since many elements are just String values (e.g., getCategory(), getRating(), getSkipHours(), and getWebMaster()). Other getters return complex values:
- getCloud() returns an instance of XmlData that contains data as defined for the cloud element.
- getItem(String title) returns an item with the specified title.
- getItems() returns a Collection that contains all the item elements of the RSS instance.
- getTextInput() returns the TextInput element.
Item is the most important element for handling RSS data. So com.myjavatools.xml.Rss needs more functions for that element than just returning a collection of all items and retrieving an item by its title. For further functionality, it also includes more item finders: findByDescription(String description), findByUrl(String url), findByGuid(String guid).
To accelerate item search by title, the class indexes items in the RSS container; getItem(String title) uses a HashMap that works as an index table. Within an RSS object, it also strips descriptions in all elements (channel, item, image, textInput) of their leading and trailing space characters. These characters have no meaning in human-readable descriptions, and their absence makes search by description more reliable. In addition, you can use the methods inherited from BasicXmlData to save an RSS object to a File or to send it to an OutputStream. As previously stated, the output RSS format is always 2.0.
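The indexing and trimming ideas can be sketched in a few lines. The following is an illustration under stated assumptions, not the library's source: ItemIndex and Entry are hypothetical names, and the real class may handle duplicate titles or null values differently. Titles go into a HashMap for constant-time lookup, descriptions are trimmed when stored, and the description finder falls back to a linear scan.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a title index plus description trimming.
final class ItemIndex {
    static final class Entry {
        final String title;
        final String description;

        Entry(String title, String description) {
            this.title = title;
            // Leading and trailing whitespace carries no meaning, so drop it on the way in.
            this.description = description == null ? null : description.trim();
        }
    }

    private final List<Entry> items = new ArrayList<>();
    private final Map<String, Entry> byTitle = new HashMap<>();

    // Stores the entry and indexes it by title; a later entry with the same title
    // replaces the earlier one in the index (an assumption of this sketch).
    void add(Entry entry) {
        items.add(entry);
        byTitle.put(entry.title, entry);
    }

    // Lookup through the hash index; returns null when no such title exists.
    Entry getItem(String title) {
        return byTitle.get(title);
    }

    // Linear scan comparing trimmed descriptions; returns null when nothing matches.
    Entry findByDescription(String description) {
        String wanted = description.trim();
        for (Entry entry : items) {
            if (wanted.equals(entry.description)) {
                return entry;
            }
        }
        return null;
    }

    // Builds a small index from the sample values used earlier in the article.
    static ItemIndex sample() {
        ItemIndex index = new ItemIndex();
        index.add(new Entry("Practical XML for Java Programs",
                            "  An easy-to-use package for handling XML data in Java.\n"));
        index.add(new Entry("Get Nasdaq Quotes Online in Your Java Program",
                            "Get Nasdaq quotes online in your Java program."));
        return index;
    }

    public static void main(String[] args) {
        ItemIndex index = sample();
        System.out.println(index.getItem("Practical XML for Java Programs").description);
        System.out.println(index.findByDescription(" Get Nasdaq quotes online in your Java program. ").title);
    }
}
```

Trimming once at insertion time keeps both the stored data and later comparisons simple, which is the reliability benefit described above.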
Sample JSP Using RSS
Suppose you need to retrieve an RSS feed and display it in your Web page. I’ll use my favorite flavor of Java, JSP, to show you how to do this with com.myjavatools.xml.Rss. Review the following JSP code:
Rss Feed Test
<%
  String url = request.getParameter("url");
  if (url != null) {
    com.myjavatools.xml.Rss rss = new com.myjavatools.xml.Rss(new java.net.URL(url));
%>
Rss feed from url <%=url%>
title: <%=rss.getTitle()%>
category: <%=rss.getCategory()%>
description: <%=rss.getDescription()%>
language: <%=rss.getLanguage()%>
date: <%=rss.getLastBuildDate()%>
webmaster: <%=rss.getWebMaster()%>
title url description date author
<%
    for (java.util.Iterator i = rss.getItems().iterator(); i.hasNext();) {
      com.myjavatools.xml.Rss.Item item = (com.myjavatools.xml.Rss.Item)i.next();
%>
<%=item.getTitle()%> <%=item.getEnclosureUrl()%> <%=item.getDescription()%> <%=item.getPubDate()%> <%=item.getAuthor()%>
<%
    }
%>
<%}%>
If you run this JSP, and enter a popular RSS URL in the form (say, http://slashdot.org/index.rss), you will get the latest Slashdot news formatted in a table (see Listing 1).
Testing and Packaging
The com.myjavatools.xml.Rss class also comes with a set of unit tests, with samples of all versions of RSS. Take a look at a sample code snippet from the included RSS 0.91 Unit Test:
data_091 = new Rss(new URL("http://www.xml.com/cs/xml/query/q/19"));
System.out.println("Title: " + data_091.getTitle());
System.out.println("Webmaster: " + data_091.getWebMaster());
System.out.println("Description: " + data_091.getDescription());
for (Iterator i = data_091.getItems().iterator(); i.hasNext();) {
  Rss.Item item = (Rss.Item)i.next();
  System.out.println(" Item: " + item.getTitle());
  System.out.println(" Description: " + item.getDescription());
  System.out.println(" Link: " + item.getLink());
  System.out.println();
}
See Listing 2 for the full listing, or download the whole package, com.myjavatools.xml, with the library, mjxml.jar. Java documentation for the package is also available.