I am starting to see the benefits of using XML and XSL in Web site development, but my challenge is convincing the decision makers. Their usual responses are: “Show me” (there aren’t a lot of examples to be found), “What about older browsers?”, and “What’s the point?” My company is a book publisher. What benefits does XML have in this area?
Actually, the issues you bring up are not all that surprising. Developers are finally beginning to “get” XML, but it’s often a hard sell to convince a senior-level manager that this technology isn’t just another “next big thing” that will blow over. I think I can provide at least a little bit of an argument, although I’m certainly not an unbiased source when it comes to this technology.
XML isn’t really a client-side technology, although that sector will be growing soon enough. Thus, for all that we talk about Internet Explorer 5.0 as being the only XML browser, the important thing to keep in mind is that XML is currently used mostly on the server, often to help create “standard” HTML output. To concentrate on IE5 is to be misled into thinking that XML is a Microsoft technology, that it can only work through one browser, and that it has no real relevance to the users of your technology. In your case especially, that is far from the truth.
Consider, for instance, several of the problems that you run into when dealing with books in this crazy market. A typical computer book has an effective life span of something like six months from the date it first hits the shelves, with a significant portion of those books actually having a window of less than three months before they’re pushed from the shelves by newer products. At a capital investment of more than $30,000 a title, this means that anything that can extend the life span of that book should be examined.
One possibility, of course, is to sell the initial book and, within the book, place access codes for pulling down the content online. However, with most books, this particular strategy hasn’t worked terribly well. Converting a book to HTML format is a fairly costly procedure. You often have to spend a lot of time tweaking the output for different browsers, and the Web sites are at best poorly used because two of the key features that would make the book attractive as an online commodity (regular updates and indexability) are simply not there. So many times the publisher foists this task off onto the shoulders of the already overburdened author. However, by the time the book comes out, the author is probably already onto his or her next book and doesn’t really want the task of maintaining a large Web site full of less and less relevant information.
Suppose you look at an XML solution. You provide the author with a Word template; then, when the book is done, a macro converts the structure of the book (primarily by reading the paragraph styles) into an XML document. Some cleanup work would probably need to be done, but far less than would be required to create a good-quality Web site. Note that at this point the style information is contextual rather than syntactic in nature: Sidebar Head, Sidebar Body, Sidebar Bullets, and so forth. In other words, each style names the role a paragraph plays rather than how it should look. Once this data is in XML format, you’re set.
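To make the conversion step a little more concrete, here is a minimal sketch in Python rather than a Word macro. It assumes the manuscript has already been exported as a list of (style name, text) pairs. The style-to-element mapping and the element names (`chapter`, `para`, `sidebar-head`, and so on) are hypothetical, chosen purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from Word paragraph styles to XML element names.
STYLE_TO_ELEMENT = {
    "Chapter Title": "title",
    "Body Text": "para",
    "Sidebar Head": "sidebar-head",
    "Sidebar Body": "sidebar-body",
    "Sidebar Bullets": "sidebar-bullet",
}


def paragraphs_to_xml(paragraphs):
    """Turn (style_name, text) pairs into a flat <chapter> element."""
    chapter = ET.Element("chapter")
    for style, text in paragraphs:
        tag = STYLE_TO_ELEMENT.get(style)
        if tag is None:
            # Unrecognized styles are kept but flagged for the cleanup pass.
            node = ET.SubElement(chapter, "unknown", {"style": style})
        else:
            node = ET.SubElement(chapter, tag)
        node.text = text
    return chapter


sample = [
    ("Chapter Title", "Why XML?"),
    ("Body Text", "XML separates content from presentation."),
    ("Sidebar Head", "Tip"),
    ("Sidebar Body", "Use styles consistently."),
    ("Heading 9", "Stray style"),
]
chapter = paragraphs_to_xml(sample)
xml_text = ET.tostring(chapter, encoding="unicode")
print(xml_text)
```

The `unknown` element is where the manual cleanup comes in: anything the author styled inconsistently shows up there instead of silently disappearing.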
Just a few of the things you can do with the XML, primarily through XSL:
- Table of Contents. You can create a table of contents at any level, for any browser, that would automatically link to the requisite document section.
- Version Control. You could add new sections to the XML document, and let users search for all sections added since some arbitrary date.
- Cross-Browser Support. With the content of the document essentially generic, you can create customized output for any browser simply by applying the appropriate XSL style sheet. This cuts down on system maintenance.
- Cross-referencing. An online reader of the book could cross-link to other books that were also stored in XML format, at any level within the book, using an XLink capability. This is a server capability; from the client’s standpoint the reference link is a simple HTML anchor tag.
- Printable Output. You can create an XSL filter that will output a page or a chapter in a more printer-friendly format.
- Preview Sections. You could designate specific XML sections (at any level of granularity) as open access, meaning that you wouldn’t need an access code to see these sections. By doing this, you can easily and automatically create a Web page that shows only those sections that are accessible (or that simply makes live those sections that are accessible) as a teaser to encourage purchase of new books.
- Adding New Content. XML really doesn’t care where a given section is located, so updates could be added as separate manageable documents without breaking the illusion that the document is effectively a single seamless piece.
- Searchable. You could create a fast keyword search or a (slightly) slower text search of the whole page contents without a significant amount of programming or indexing. With a bit of intelligent design, you could even perform keyword searches across all of the books published by you.
- Subscriptions. This capability could conceivably even be sold through a subscription service. For students, such a service could prove invaluable because it simplifies their research efforts at a cost far lower than purchasing the equivalent books. For professionals like myself, who often have to lug around several dozen pounds of printed material just to do their jobs, an XML-based subscription service would be similarly a godsend.
- Improved Production. XML is a markup language, which means that you can go from XML back into your Word markup, or even (with some initial coding) directly into your FrameMaker or Quark applications. Thus, creating a revised edition for print output would take considerably less work than making the first edition. Simply link those sections in the revised edition that you want to incorporate, de-link those in the older edition that you want to lose, run it through an XSL filter and a layout macro, and you’re 98+% of the way there.
- E-Books. The advent of the electronic book will have the most impact on two areas: textbooks and computer books. There, the demand for current information is highest, the density of information is also the highest, and the adopters of e-book technologies are the greatest in number. E-books have specialized formats for output, but using XML and XSL, you can transform your books automatically into an e-book format with little additional work.
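Several of the items above (table of contents, version control, preview sections, and search) come down to simple queries against the same XML document. The sketch below illustrates the idea under some loud assumptions: the element and attribute names (`section`, `title`, `para`, `id`, `added`, `access`) are invented for this example, and Python’s standard `xml.etree.ElementTree` module stands in for an XSL style sheet, since the Python standard library does not include an XSLT processor. A production system would more likely express the same selections in XSL.

```python
import html
import xml.etree.ElementTree as ET
from datetime import date

# A hypothetical book; every element and attribute name here is an assumption.
BOOK_XML = """
<book title="Sample Book">
  <section id="ch1" added="2000-01-10" access="open">
    <title>Introduction</title>
    <para>XML separates content from presentation.</para>
    <section id="ch1-1" added="2000-01-10" access="paid">
      <title>Who Should Read This</title>
      <para>Developers and editors.</para>
    </section>
  </section>
  <section id="ch2" added="2000-03-05" access="paid">
    <title>Style Sheets</title>
    <para>XSL transforms XML into HTML.</para>
  </section>
</book>
""".strip()


def table_of_contents(node, max_depth, depth=1):
    """Return (depth, id, title) tuples for nested sections down to max_depth."""
    entries = []
    if depth > max_depth:
        return entries
    for section in node.findall("section"):
        title = section.findtext("title", default="")
        entries.append((depth, section.get("id"), title))
        entries.extend(table_of_contents(section, max_depth, depth + 1))
    return entries


def toc_as_html(entries):
    """Render TOC entries as a plain HTML list of anchor links."""
    items = [
        '<li class="level%d"><a href="#%s">%s</a></li>'
        % (level, html.escape(sec_id, quote=True), html.escape(title))
        for level, sec_id, title in entries
    ]
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


def added_since(book, cutoff):
    """Version control: ids of sections added on or after the cutoff date."""
    return [
        s.get("id")
        for s in book.iter("section")
        if date.fromisoformat(s.get("added")) >= cutoff
    ]


def open_sections(book):
    """Preview sections: ids readable without an access code."""
    return [s.get("id") for s in book.iter("section") if s.get("access") == "open"]


def search(book, keyword):
    """Text search: ids of sections whose own paragraphs mention the keyword."""
    needle = keyword.lower()
    return [
        s.get("id")
        for s in book.iter("section")
        if any(needle in (p.text or "").lower() for p in s.findall("para"))
    ]


book = ET.fromstring(BOOK_XML)
print(toc_as_html(table_of_contents(book, max_depth=2)))
print(added_since(book, date(2000, 2, 1)))
print(open_sections(book))
print(search(book, "xsl"))
```

The point is less the specific functions than the fact that all four features read the same source document; none of them requires a separate copy of the book.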
Of course, this does bring up a few negatives, and I’d like to address them as well. This solution is great, but it isn’t magic:
- Transcription. You will need to have someone on staff who massages the initial document into a suitable XML format. Much of the work can be automated (with the use of appropriate Office macros), but you’re dealing with authors, many of whom don’t always understand why the use of styles is important. As this work is basically a copyediting job, it isn’t necessarily a major limitation.
- Potential Lost Revenue. E-books (in the generic sense) are a double-edged sword for any publisher. Those publishers who see their business as selling thinly sliced pieces of wood with liquefied carbon embossed on the surfaces of these pieces will doubtlessly face some major problems with e-books. Those who see their business as the collection and dissemination of information for a fee will do just fine. E-books will cut into print sales revenues, and you cannot reasonably expect to charge as much for access to an online database as you would for the same document in print. On the other hand, your costs come down to the cost of paying the authors for the document, for the transcription of the data into a common format, for the programming of the filters to handle the output into different modes, and for maintenance of the servers. The latter two are essentially one-time costs.
- Branding. A book should advertise itself. A significant amount of the work that goes into putting out a book comes in creating the brand around the book, and publishers work to have thick books because they occupy larger amounts of space on the shelf. Once you move the information onto the Internet, that advertising essentially disappears, and so you have to increase your marketing efforts in other directions to compensate. I would counter this contention, however, with the fact that e-books will raise the profile of the publisher, rather than promote the individual books. You buy a subscription to MyBooks.com (perhaps a limited subscription of thirty days from an online book purchase or a more comprehensive subscription for a longer duration) and will be brought back to MyBooks.com because of the services offered. Subscriptions build consumer loyalty, which leads to more subscriptions as well as an increased likelihood that when the subscriber does buy an offline book it will be a MyBooks book.
- Tools. The number of XML-related tools for document management is currently fairly small, in great part because, up until recently, the XML standards (such as XSL) had not been ratified. This situation is finally changing, however: XSL Transformations (XSLT) is a recommendation, as is XPath. XLink is likely to be recommended within the next month. XML Data will be ratified within the next two to three months. This means that your initial document management systems would probably be ad hoc, although by the time you moved beyond the approval stage, this might no longer be an issue.
Personally, I think XML is the ideal tool for a technical publishing company to start moving into. It makes the transition to e-books considerably less painful, promotes your publishing company better than traditional books, makes maintenance of books easier in view of changing technologies, and transforms your company into an information vendor, rather than a printer and moving company. Indeed, you hold the crucial commodity, the information itself, which will end up producing far more revenue than the tools used to display and process it.