Java Package Processes All RSS Formats

RSS (Really Simple Syndication) is a flexible and efficient format for exchanging structured, dynamically changing data, such as news headlines, blogs, job vacancies, new projects, and recent wiki changes. You might not have noticed it yet, but as soon as you get tuned in, you’ll discover it is ubiquitous.

You’ll find several scattered Java packages that work with RSS. Some can read certain formats, while others can write them as well. The class I feature in this article, com.myjavatools.xml.Rss, reads all known RSS formats (from version 0.90 to 2.0) and outputs all the converted data in the 2.0 format.

You can instantiate com.myjavatools.xml.Rss from any RSS feed, regardless of version. The class gives access to all the RSS elements, and you also can create new RSS containers, add or change the contents, and write RSS data to an output stream. However, it doesn’t offer any syndication features, so no filtration or selection.

A Brief History of RSS
Userland developed RSS in 1997. Netscape immediately adopted it, and Userland and Netscape continued working on RSS formats in parallel. The RSS-DEV working group and W3C’s RDF (Resource Description Framework) group also participated in developing RSS standards. For Userland, RSS stood for Rich Site Summary; for Netscape, it was RDF Site Summary. Eventually, other explanations for RSS emerged: Remote Site Syndication and Really Simple Syndication. The last one is the most popular since it indicates the main function of the RSS format: facilitating content feed syndication.

Essentially, an RSS document contains one <channel> element, which may have a title, a description, and a link, as well as an <image> element and any number of <item> and <textInput> elements. All these elements have at least a title, and typically also have a description and a link to a resource. If RSS, like XML, existed in only one format, that would be the end of the story. Unfortunately, history had other ideas. The following is an example of the Netscape version (0.90) for a document that features an article and tip I wrote for DevX:

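    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns="http://my.netscape.com/rdf/simple/0.9/">
      <channel>
        <title>devx.com</title>
        <link>http://www.devx.com</link>
        <description>DevX.com--the know-how behind application development.</description>
      </channel>
      <image>
        <url>http://www.devx.com/assets/devx/7819.gif</url>
        <link>http://www.devx.com/</link>
      </image>
      <item>
        <title>Practical XML for Java Programs</title>
        <link>http://www.devx.com/Java/Article/16571/0</link>
      </item>
      <item>
        <title>Get Nasdaq Quotes Online in Your Java Program</title>
        <link>http://www.devx.com/tips/Tip/14965</link>
      </item>
    </rdf:RDF>

Note the absence of item descriptions.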

The next Netscape version, 0.91, which was introduced in July 1999, dropped namespaces and added descriptions to items. Look at how the same example code as above appears in this version:

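    <rss version="0.91">
      <channel>
        <title>devx.com</title>
        <link>http://www.devx.com/</link>
        <description>DevX.com--the know-how behind application development.</description>
        <language>en-us</language>
        <item>
          <title>Practical XML for Java Programs</title>
          <link>http://www.devx.com/Java/Article/16571/0</link>
          <description>When dealing with XML, you need a convenient representation of the XML data in memory. This article offers Java programmers a solution to achieve this goal: an easy-to-use package for handling XML data in Java.</description>
        </item>
        <item>
          <title>Get Nasdaq Quotes Online in Your Java Program</title>
          <link>http://www.devx.com/tips/Tip/14965</link>
        </item>
      </channel>
    </rss>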

Versions 0.92 through 0.94 added various elements to the format, such as webMaster, managingEditor, expirationDate, skipHours, rating, and language. In 2000, an independent group, RSS-DEV, published version 1.0, which was based on 0.90. Version 1.0 turned out to be obscure and unreadable since it widely used the XML namespaces that were popular during those opulent times.

In 2002, Userland published version 2.0, based on 0.92-0.94. Version 2.0 actually had three slightly different flavors:

  • September 2002: element dropped
  • November 2002: hour values in <skipHours> changed from the questionable 1-24 range to 0-23
  • January 2003: element returns

Forgetting these minor discrepancies, take a look at an example of 2.0:

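    <rss version="2.0">
      <channel>
        <title>devx.com</title>
        <link>http://www.devx.com/</link>
        <description>DevX.com--the know-how behind application development.</description>
        <language>en-us</language>
        <item>
          <title>Practical XML for Java Programs</title>
          <link>http://www.devx.com/Java/Article/16571/0</link>
          <description>When dealing with XML, you need a convenient representation of the XML data in memory. This article offers Java programmers a solution to achieve this goal: an easy-to-use package for handling XML data in Java.</description>
          <author>Vlad Patryshev</author>
          <pubDate>2003-07-25</pubDate>
        </item>
        <item>
          <title>Get Nasdaq Quotes Online in Your Java Program</title>
          <link>http://www.devx.com/tips/Tip/14965</link>
          <description>Get Nasdaq quotes online in your Java program.</description>
          <author>Vlad Patryshev</author>
          <pubDate>2001-11-09</pubDate>
        </item>
      </channel>
    </rss>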

RSS.NET publishes a comparison chart of all RSS versions.
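
Because the versions differ in their root element and namespaces, any reader that accepts every format first has to work out which one it was given. The following is a minimal sketch of that step using only the JDK's DOM parser. RssVersionSniffer and detectVersion are hypothetical names for illustration; this is not how com.myjavatools.xml.Rss is necessarily implemented, and it does not try to tell apart the flavors of 2.0.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

/** Hypothetical helper for illustration; not part of com.myjavatools.xml. */
final class RssVersionSniffer {

    // Default namespaces that tell the two RDF-based flavors apart.
    private static final String NS_090 = "http://my.netscape.com/rdf/simple/0.9/";
    private static final String NS_10 = "http://purl.org/rss/1.0/";

    /** Returns "0.90", "1.0", the value of the version attribute, or "unknown". */
    static String detectVersion(String xml) {
        Element root;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            // Feeds come from untrusted sources, so keep the parser locked down.
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            root = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a well-formed XML document", e);
        }

        // 0.91 and later: <rss version="..."> without namespaces.
        if ("rss".equals(root.getLocalName())) {
            String version = root.getAttribute("version");
            return version.isEmpty() ? "unknown" : version;
        }
        // 0.90 and 1.0: <rdf:RDF> with a version-specific default namespace.
        if ("RDF".equals(root.getLocalName())) {
            String defaultNs = root.getAttribute("xmlns");
            if (NS_090.equals(defaultNs)) return "0.90";
            if (NS_10.equals(defaultNs)) return "1.0";
        }
        return "unknown";
    }

    public static void main(String[] args) {
        String netscape = "<rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\""
                + " xmlns=\"http://my.netscape.com/rdf/simple/0.9/\">"
                + "<channel><title>devx.com</title></channel></rdf:RDF>";
        String userland = "<rss version=\"2.0\"><channel><title>devx.com</title></channel></rss>";
        System.out.println(detectVersion(netscape));
        System.out.println(detectVersion(userland));
    }
}
```

Once the version is known, a converter can map each older layout onto the 2.0 element set.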

RSS in Java
As a decorator class for com.myjavatools.xml.BasicXmlData (see Practical XML for Java Programs), com.myjavatools.xml.Rss inherits all the standard features of XmlData:

  • A type name
  • A value
  • A collection of attributes
  • A collection of subelements (which are also instances of BasicXmlData)
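
To make the four features above concrete, here is a minimal sketch of a node type with the same shape. SimpleXmlNode and all of its method names are hypothetical stand-ins chosen for this illustration; the real BasicXmlData API may look quite different.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for an XmlData-style node; not the real BasicXmlData. */
final class SimpleXmlNode {
    private final String type;                                      // type name, e.g. "channel"
    private final String value;                                     // text value, may be null
    private final Map<String, String> attributes = new LinkedHashMap<>();
    private final List<SimpleXmlNode> kids = new ArrayList<>();     // subelements

    SimpleXmlNode(String type) {
        this(type, null);
    }

    SimpleXmlNode(String type, String value) {
        this.type = type;
        this.value = value;
    }

    String getType() { return type; }
    String getValue() { return value; }
    String getAttribute(String name) { return attributes.get(name); }

    SimpleXmlNode setAttribute(String name, String attributeValue) {
        attributes.put(name, attributeValue);
        return this;
    }

    SimpleXmlNode addKid(SimpleXmlNode kid) {
        kids.add(kid);
        return this;
    }

    /** Returns all direct subelements with the given type name. */
    List<SimpleXmlNode> getKids(String kidType) {
        List<SimpleXmlNode> found = new ArrayList<>();
        for (SimpleXmlNode kid : kids) {
            if (kid.type.equals(kidType)) {
                found.add(kid);
            }
        }
        return found;
    }

    /** Returns the value of the first subelement of the given type, or null. */
    String getKidValue(String kidType) {
        List<SimpleXmlNode> found = getKids(kidType);
        return found.isEmpty() ? null : found.get(0).getValue();
    }

    public static void main(String[] args) {
        SimpleXmlNode channel = new SimpleXmlNode("channel")
                .addKid(new SimpleXmlNode("title", "devx.com"))
                .addKid(new SimpleXmlNode("link", "http://www.devx.com/"));
        SimpleXmlNode rss = new SimpleXmlNode("rss")
                .setAttribute("version", "2.0")
                .addKid(channel);
        System.out.println(rss.getAttribute("version"));
        System.out.println(rss.getKids("channel").get(0).getKidValue("title"));
    }
}
```

An RSS-specific class layered on such a node only needs to translate calls like getTitle() into lookups of the matching subelement.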

The com.myjavatools.xml.Rss class defines three static member classes corresponding to popular RSS subelements: Rss.Image, Rss.TextInput, and Rss.Item. Other subelements, such as cloud, are just plain instances of BasicXmlData.

The class's constructors instantiate RSS from a File, URL, InputStream, or XmlData. You can also use the default constructor to create an empty RSS container and then build the container by adding items (addItem(Rss.Item item)) and setting various data (using setters such as setImage, setCopyright, setDescription, setTextInput, setWebmaster, and the like; there are 21 setters in total).

Twenty-three getter methods in com.myjavatools.xml.Rss (getCategory(), getCloud(), etc., up to getWebMaster()) return values as various RSS elements. Most of these methods return a String, since many elements actually are just String values (e.g., getCategory(), getRating(), getSkipHours(), getWebMaster(), etc.). Other getters return complex values:

  • getCloud() returns an instance of XmlData that contains data as defined for the <cloud> element.
  • getItem(String title) returns an item with the specified title.
  • getItems() returns a Collection that contains all the item elements of the RSS instance.
  • getTextInput() returns the TextInput element.

Item is the most important element for handling RSS data, so com.myjavatools.xml.Rss needs more functions for that element than just returning a collection of all items and retrieving an item by its title. For further functionality, it also includes more item finders: findByDescription(String description), findByUrl(String url), and findByGuid(String guid).

To accelerate item search by title, it indexes items in the RSS container; getItem(String title) uses a HashMap that works as an index table. Within an RSS object, it also strips descriptions in all elements (channel, item, image, textInput) of their leading and trailing space characters. These characters have no meaning in human-readable descriptions, and their absence makes search by description more reliable. In addition, you can use the methods inherited from BasicXmlData to save an RSS object to a File or to send it to an OutputStream. As previously stated, the output RSS format is always 2.0.
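
The indexing and trimming described above can be sketched in a few lines. This is an illustration under assumptions, not the package's source: ItemIndex and its nested Item class are hypothetical names, and keeping the first item when two share a title is a choice made here, not documented behavior of com.myjavatools.xml.Rss.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a title index; not the real Rss implementation. */
final class ItemIndex {

    static final class Item {
        final String title;
        final String description;
        final String link;

        Item(String title, String description, String link) {
            this.title = title;
            // Strip leading and trailing whitespace once, when the item is created.
            this.description = description == null ? null : description.trim();
            this.link = link;
        }
    }

    private final List<Item> items = new ArrayList<>();
    private final Map<String, Item> byTitle = new HashMap<>();   // the index table

    void addItem(Item item) {
        items.add(item);
        // Assumption: on duplicate titles the first item stays in the index.
        byTitle.putIfAbsent(item.title, item);
    }

    /** Constant-time lookup through the index. */
    Item getItem(String title) {
        return byTitle.get(title);
    }

    /** Linear scan; trimming on both sides makes the comparison reliable. */
    Item findByDescription(String description) {
        String wanted = description.trim();
        for (Item item : items) {
            if (wanted.equals(item.description)) {
                return item;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ItemIndex index = new ItemIndex();
        index.addItem(new Item("Get Nasdaq Quotes Online in Your Java Program",
                "\n   Get Nasdaq quotes online in your Java program.  ",
                "http://www.devx.com/tips/Tip/14965"));
        Item found = index.findByDescription("Get Nasdaq quotes online in your Java program.");
        System.out.println(found.link);
    }
}
```

Only the title gets an index here because it is the one lookup the article describes as accelerated; the other finders could be given maps of their own in the same way.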

Sample JSP Using RSS
Suppose you need to retrieve an RSS feed and display it in your Web page. I’ll use my favorite flavor of Java, JSP, to show you how to do this with com.myjavatools.xml.Rss. Review the following JSP code:

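    <html>
    <head><title>Rss Feed Test</title></head>
    <body>
    <%
      String url = request.getParameter("url");
      if (url != null) {
        com.myjavatools.xml.Rss rss = new com.myjavatools.xml.Rss(new java.net.URL(url));
    %>
    <h2>Rss feed from url <%=url%></h2>
    <table>
      <tr><td>title:</td><td><%=rss.getTitle()%></td></tr>
      <tr><td>category:</td><td><%=rss.getCategory()%></td></tr>
      <tr><td>description:</td><td><%=rss.getDescription()%></td></tr>
      <tr><td>language:</td><td><%=rss.getLanguage()%></td></tr>
      <tr><td>date:</td><td><%=rss.getLastBuildDate()%></td></tr>
      <tr><td>webmaster:</td><td><%=rss.getWebMaster()%></td></tr>
    </table>
    <table border="1">
      <tr><th>title</th><th>url</th><th>description</th><th>date</th><th>author</th></tr>
    <% for (java.util.Iterator i = rss.getItems().iterator(); i.hasNext();) {
         com.myjavatools.xml.Rss.Item item = (com.myjavatools.xml.Rss.Item)i.next(); %>
      <tr><td><%=item.getTitle()%></td><td><%=item.getEnclosureUrl()%></td><td><%=item.getDescription()%></td><td><%=item.getPubDate()%></td><td><%=item.getAuthor()%></td></tr>
    <% } %>
    </table>
    <% } %>
    <form>
      Enter Rss Url: <input type="text" name="url"> <input type="submit">
    </form>
    </body>
    </html>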

If you run this JSP and enter a popular RSS URL in the form (say, http://slashdot.org/index.rss), you will get the latest Slashdot news formatted in a table (see Listing 1).

Testing and Packaging
The com.myjavatools.xml.Rss class also comes with a set of unit tests, with samples of all versions of RSS. Take a look at a sample code snippet from the included RSS 0.91 unit test:

    data_091 = new Rss(new URL("http://www.xml.com/cs/xml/query/q/19"));
    System.out.println("Title:\t" + data_091.getTitle());
    System.out.println("Webmaster:\t" + data_091.getWebMaster());
    System.out.println("Description:\t" + data_091.getDescription());
    // Print the title, description, and link of every item in the feed.
    for (Iterator i = data_091.getItems().iterator(); i.hasNext();) {
      Rss.Item item = (Rss.Item)i.next();
      System.out.println("  Item:\t" + item.getTitle());
      System.out.println("  Description:\t" + item.getDescription());
      System.out.println("  Link:\t" + item.getLink());
      System.out.println();
    }

See Listing 2 for the full listing, or download the whole package, com.myjavatools.xml, with the library, mjxml.jar. Java documentation for the package is also available.
