Getting Fancy with FOP-2 : Page 2

A Short XSL-FO Primer
XSL-FO is a page description language designed specifically for fairly sophisticated page content; consequently, it can be surprisingly difficult to master. You won't be throwing away your copy of Quark or Pagemaker any time soon, but don't be surprised to see Pagemaker (also an Adobe product) generating XSL-FO eventually. Adobe is perhaps the prime mover behind XSL-FO, though IBM, Sun, Xerox, and other companies also helped author the XSL Recommendation.

XSL-FO uses the fo: namespace prefix, declared as xmlns:fo="http://www.w3.org/1999/XSL/Format", to identify formatting elements. These are contained within an <fo:root> element, which plays a role much like HTML's <html> tag. The document then defines a set of page masters (the "layout master set"), which can be thought of as templates for different page types; each sets the dimensions and general characteristics of one type of page. For example, you could create two masters, one for left-hand pages and one for right-hand pages, because such pages generally mirror each other's margins. A third master might define a title page.
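As a rough sketch of the left/right/title arrangement just described, the fragment below defines three page masters. The master names (leftPage, rightPage, titlePage) and all of the dimensions are illustrative assumptions; they do not come from the sample document.

```xml
<fo:layout-master-set xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <!-- Left-hand page: the wider margin sits on the right, toward the binding -->
  <fo:simple-page-master master-name="leftPage"
      page-height="11in" page-width="8.5in"
      margin-top="1in" margin-bottom="1in"
      margin-left="1in" margin-right="1.5in">
    <fo:region-body/>
  </fo:simple-page-master>

  <!-- Right-hand page: the mirror image of leftPage -->
  <fo:simple-page-master master-name="rightPage"
      page-height="11in" page-width="8.5in"
      margin-top="1in" margin-bottom="1in"
      margin-left="1.5in" margin-right="1in">
    <fo:region-body/>
  </fo:simple-page-master>

  <!-- Title page: symmetric margins and a deep top margin (assumed values) -->
  <fo:simple-page-master master-name="titlePage"
      page-height="11in" page-width="8.5in"
      margin-top="3in" margin-bottom="1in"
      margin-left="1.5in" margin-right="1.5in">
    <fo:region-body/>
  </fo:simple-page-master>
</fo:layout-master-set>
```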

I created a very simplified (ad hoc) XML schema to describe the sample document included with this article (although you could easily use something like DocBook to do much the same thing). The schema isn't the formatting code; it's just a simple "logical" breakdown of the document (see Listing 1).

Author's Note: The sample document contains an early version of this article, so there may be minor differences between the sample document and the final version.

The XSL-FO markup for this document can look a little intimidating, but it's actually pretty straightforward. All XSL-FO documents begin with an <fo:root> element that contains the markup and declares the fo: namespace:

<fo:root xmlns:fo = "http://www.w3.org/1999/XSL/Format">

The next element should be a layout master set, which is the collection of masters that the document requires. The sample document needs only one simple page master, named (not surprisingly) "mainPage". The name could be pretty much anything, because the master-name attribute just provides a value by which to refer to the page master:

<fo:layout-master-set>
  <fo:simple-page-master master-name="mainPage"
      page-height="11in" page-width="8.5in"
      margin-top="0.75in" margin-bottom="1in"
      margin-left="1.25in" margin-right="1.25in">

The page master defines the height and width of the page, as well as the dimensions of the margins. Note that the units involved can be any standard CSS units: inches (in), centimeters (cm), millimeters (mm), points (pt), etc. These dimensions are printer page dimensions—if you wanted to print to an 11x17 broadside, for example, you'd specify a page-height of "17in" and a page-width of "11in". The margins define the actual "printable" area on that page, given as an offset from the page itself along the respective axis.

The page itself is then broken into three distinct areas: the region-before, used for header information (such as the title of the article); the region-after, which holds footer information such as page numbers; and the region-body, the active area into which the processor inserts the body of the text. The region-body margins are measured from the page margins defined on the page master, while each extent gives the height of the header or footer region, measured from the top or bottom edge of the printable area, respectively:

    <fo:region-body margin-top="0.25in" margin-bottom="0.75in"
        margin-left="0in" margin-right="0in"/>
    <fo:region-before extent="0.5in"/>
    <fo:region-after extent="0.75in"/>
  </fo:simple-page-master>
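To show how page size, margins, and region extents fit together in other units, here is a sketch of two more page masters: an A4 page measured in millimeters and centimeters, and the 11x17 page mentioned above. The master names and every margin value are assumptions chosen for illustration.

```xml
<fo:layout-master-set xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <!-- A4 portrait (210mm x 297mm); units can be mixed freely -->
  <fo:simple-page-master master-name="a4Page"
      page-height="297mm" page-width="210mm"
      margin-top="20mm" margin-bottom="25mm"
      margin-left="30mm" margin-right="30mm">
    <!-- Body margins leave room for the header and footer regions -->
    <fo:region-body margin-top="15mm" margin-bottom="20mm"/>
    <fo:region-before extent="12mm"/>
    <fo:region-after extent="1.5cm"/>
  </fo:simple-page-master>

  <!-- 11x17 page: the height is the long edge, the width the short edge -->
  <fo:simple-page-master master-name="tabloidPage"
      page-height="17in" page-width="11in"
      margin-top="0.75in" margin-bottom="1in"
      margin-left="1in" margin-right="1in">
    <fo:region-body/>
  </fo:simple-page-master>
</fo:layout-master-set>
```

Note that the region-body element comes first inside each page master, followed by the optional header and footer regions.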

This defines one master, but it doesn't specify the order in which page masters are used. You do that in the page sequence master, which can describe both single instances and repeating collections of pages. The "simpleDoc" sequence master in the following example consists of nothing but repeating pages based on the "mainPage" master.

  <fo:page-sequence-master master-name="simpleDoc">
    <fo:repeatable-page-master-alternatives>
      <fo:conditional-page-master-reference master-name="mainPage"/>
    </fo:repeatable-page-master-alternatives>
  </fo:page-sequence-master>
</fo:layout-master-set>
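A page sequence master that mixes single and repeating pages might look like the sketch below: one title page, then left and right pages chosen by page parity. It assumes three simple page masters with the hypothetical names titlePage, leftPage, and rightPage already exist in the layout master set. It also uses the master-reference attribute, which is how the final XSL 1.0 Recommendation spells the reference; the article's listings use master-name for the same purpose.

```xml
<fo:page-sequence-master master-name="bookDoc"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <!-- Exactly one page built from the title master -->
  <fo:single-page-master-reference master-reference="titlePage"/>

  <!-- Every later page: pick a master according to odd or even page number -->
  <fo:repeatable-page-master-alternatives>
    <fo:conditional-page-master-reference master-reference="leftPage"
        odd-or-even="even"/>
    <fo:conditional-page-master-reference master-reference="rightPage"
        odd-or-even="odd"/>
  </fo:repeatable-page-master-alternatives>
</fo:page-sequence-master>
```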

After defining the page sequence master, you can begin adding content. For a given page sequence, adding content involves defining both static content (content such as footers or headers that either does not change or changes predictably, such as page numbers, across multiple pages) and flow content, which consists of the main body of the article. You should declare the static content for the header first:

<fo:page-sequence master-name="simpleDoc">
  <fo:static-content flow-name="xsl-region-before">
    <fo:block font-size="8pt" font-family="sans-serif"
        border-after-color="black" border-after-style="solid"
        border-after-width="0.1pt">
      XML 10 Minute Solution: Getting Fancy With FOP
    </fo:block>
  </fo:static-content>

The master-name attribute in the <fo:page-sequence> element indicates that the page sequence should use the "simpleDoc" page sequence master defined above. The <fo:static-content> element in turn sets the contents of the header (flow-name="xsl-region-before") or the footer (flow-name="xsl-region-after").

Perhaps the most pervasive element in XSL-FO documents is the <fo:block> element, which defines a block region, analogous to a block in CSS or a <div> element in HTML. A block defines a rectangular region of text, whether a paragraph, a collection of paragraphs, a sidebar, a headline, or any other layout element that has a bounding rectangle. The counterpart of <fo:block> is <fo:inline>, which places its contents within the current line of text rather than in a rectangle of their own. An <fo:inline> element can be thought of as analogous to an HTML <span> element or to the CSS display: inline setting.

A note about attributes. Many of the attributes you'll see within both <fo:block> and <fo:inline> elements may seem familiar if you're used to CSS. A big reason for this is that these attributes are based on CSS properties wherever possible. One useful way to envision XSL-FO is to see it as a framework for applying CSS in a purely XML environment. The XSL-FO elements define specific entities (parts of pages, regions, or even sections within text), while the attributes basically provide the media description of how those parts look and act.
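The CSS correspondence is easiest to see side by side. The sketch below pairs a CSS rule (shown in a comment) with a block that carries the same properties as attributes; the rule, the selector name, and the text are made up for illustration.

```xml
<!-- Roughly equivalent CSS:
     p.note { font-size: 11pt; font-family: sans-serif;
              text-align: justify; color: gray; } -->
<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format"
    font-size="11pt" font-family="sans-serif"
    text-align="justify" color="gray">
  Ordinary text with
  <!-- Like an HTML span: restyles part of the line without starting a new block -->
  <fo:inline font-style="italic" color="black">an emphasized phrase</fo:inline>
  in the middle of the line.
</fo:block>
```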

The footer demonstrates that static content isn't really all that static:

  <fo:static-content flow-name="xsl-region-after">
    <fo:block font-size="8pt" font-family="serif" text-align="right">
      Copyright 2001 Cagle Communications -- Page <fo:page-number/>
    </fo:block>
  </fo:static-content>

The footer content includes the <fo:page-number> element, whose value updates from one page to the next. The page number is quite customizable, by the way, employing the same formatting convention that XSLT uses. For example, if you wanted to number pages with capital Roman numerals, you'd add the format attribute (format="I") to the <fo:page-sequence> element (note that you don't add it to the <fo:page-number> element). Doing that would number the pages in the sequence I, II, III, IV, V, VI, etc., inserting the number at the position of the <fo:page-number> element.
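A minimal sketch of Roman-numeral page numbering follows. It assumes a page sequence master named "simpleDoc" as in the listings above, and it writes the reference as master-reference, the spelling used by the final XSL 1.0 Recommendation (the article's listings use master-name). The initial-page-number attribute and the body text are illustrative additions.

```xml
<!-- format takes the same tokens as XSLT's xsl:number: 1, I, i, A, a -->
<fo:page-sequence xmlns:fo="http://www.w3.org/1999/XSL/Format"
    master-reference="simpleDoc"
    format="I" initial-page-number="1">
  <fo:static-content flow-name="xsl-region-after">
    <fo:block font-size="8pt" font-family="serif" text-align="right">
      <!-- Renders as "Page I", "Page II", and so on -->
      Page <fo:page-number/>
    </fo:block>
  </fo:static-content>
  <fo:flow flow-name="xsl-region-body">
    <fo:block>Body text goes here.</fo:block>
  </fo:flow>
</fo:page-sequence>
```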

The final (and arguably most important) part of the document is the <fo:flow> element. For each page sequence, it names the region into which the processor flows the body text. Unlike static content, which is repeated in the same region on every page, flow content runs from one page to the next, with new pages created whenever there's insufficient space on the current one. The vast majority of the contents of an <fo:flow> element will consist of <fo:block> elements.

For example, Listing 2 contains a flow object that shows the title, subtitle, author, and the first paragraph of the article itself:

<fo:flow flow-name="xsl-region-body">
  <fo:block font-size="24pt" font-family="serif" font-weight="bold"
      background-color="blue" color="white" line-height="32pt"
      padding-before="4pt" padding-left="4pt" space-after.optimum="15pt">
    Getting Fancy With FOP
  </fo:block>
  <fo:block font-size="18pt" font-family="sans-serif" font-weight="bold"
      line-height="22pt" space-after.optimum="15pt" margin-right="2.5in">
    Creating Adobe Acrobat Files from XSLT and XSL-FO
  </fo:block>
  <fo:block font-size="14pt" font-family="sans-serif" font-weight="bold"
      line-height="15pt" space-after.optimum="15pt">
    by Kurt Cagle
  </fo:block>
  <fo:block font-size="11pt" font-family="sans-serif" text-align="justify"
      line-height="12.5pt" space-after.optimum="5pt">
    Webster's Dictionary defines a fop as being synonymous with a "dandy,"
    a person (usually male) who spends an inordinate amount of time and
    effort on dress and appearance, sometimes to ludicrous extremes. Think
    of the gold-chain-festooned white-polyester clad lounge lizard of the
    1970s and you'll get the basic idea. However, as with so many other
    terms, FOP has resurfaced with a different meaning&#8212;as an acronym
    for the Formatting-Object Processor, part of the Open Source Apache
    Project. The FOP processor performs an interesting stunt: it converts
    an XSL-FO file into an Adobe Portable Document Format (PDF) file.
  </fo:block>
  <!-- More Code -->
</fo:flow>

Finally, blocks may contain inline elements. An inline element, as mentioned earlier, is an element that is part of the flow of text. For example, in HTML the <b> element is an inline element that sets the font-weight of the enclosed text to "bold". In the article, rather than creating <b> and <i> elements, which give no real clue as to why their content is bold or italic, I have instead defined three distinct inline elements: <concept>, <emph> for emphasis, and <tag> for easily creating an angle-bracketed element name. For example, the document presents the <tag> element visually as:

<fo:inline font-weight="bold"> &lt;<fo:inline color="blue">tag</fo:inline>&gt; </fo:inline>
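One way such logical elements could be turned into formatting objects is with XSLT templates, sketched below. Only the <tag> rendering is taken from the listing above; the italic styling for <emph> and the bold styling for <concept> are assumptions, since the article does not show how those two are presented.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- tag: bold angle brackets around a blue element name -->
  <xsl:template match="tag">
    <fo:inline font-weight="bold">&lt;<fo:inline color="blue"><xsl:apply-templates/></fo:inline>&gt;</fo:inline>
  </xsl:template>

  <!-- emph: assumed to render as italic -->
  <xsl:template match="emph">
    <fo:inline font-style="italic"><xsl:apply-templates/></fo:inline>
  </xsl:template>

  <!-- concept: assumed to render as bold -->
  <xsl:template match="concept">
    <fo:inline font-weight="bold"><xsl:apply-templates/></fo:inline>
  </xsl:template>

</xsl:stylesheet>
```

Keeping the logical names in the source document means the visual treatment can change in one template without touching the article text.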

Note that the example given here is very simple—the full XSL-FO specification is more than 300 printed pages in length, and can be extraordinarily complicated. However, that size makes it robust enough to handle a wide variety of applications.
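Putting the pieces together, a complete minimal document might look like the following sketch. It reuses the "mainPage" dimensions from the listings above but, as a simplifying assumption, points the page sequence directly at that page master instead of going through a page sequence master. It also uses the master-reference spelling from the final XSL 1.0 Recommendation rather than the master-name spelling in the article's listings. The body text is a placeholder.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- 1. Page geometry -->
  <fo:layout-master-set>
    <fo:simple-page-master master-name="mainPage"
        page-height="11in" page-width="8.5in"
        margin-top="0.75in" margin-bottom="1in"
        margin-left="1.25in" margin-right="1.25in">
      <fo:region-body margin-top="0.25in" margin-bottom="0.75in"/>
      <fo:region-before extent="0.5in"/>
      <fo:region-after extent="0.75in"/>
    </fo:simple-page-master>
  </fo:layout-master-set>

  <!-- 2. Content: static header and footer, then the flowing body -->
  <fo:page-sequence master-reference="mainPage">
    <fo:static-content flow-name="xsl-region-before">
      <fo:block font-size="8pt" font-family="sans-serif">
        XML 10 Minute Solution: Getting Fancy With FOP
      </fo:block>
    </fo:static-content>

    <fo:static-content flow-name="xsl-region-after">
      <fo:block font-size="8pt" font-family="serif" text-align="right">
        Page <fo:page-number/>
      </fo:block>
    </fo:static-content>

    <fo:flow flow-name="xsl-region-body">
      <fo:block font-size="24pt" font-family="serif" font-weight="bold"
          space-after.optimum="15pt">
        Getting Fancy With FOP
      </fo:block>
      <fo:block font-size="11pt" font-family="sans-serif"
          text-align="justify" line-height="12.5pt">
        Placeholder body text with an
        <fo:inline font-style="italic">inline</fo:inline> element.
      </fo:block>
    </fo:flow>
  </fo:page-sequence>

</fo:root>
```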
