We're not so far removed from the databases of old. Most databases still apply a specified length to field entries, although the relationship between that length and the way the database stores the information is considerably more complex than it was when databases were essentially single long strings of fixed length. The interfaces for accessing this information have changed as well, so you're probably only peripherally aware of the length relationship. Still, with legacy databases you may run into situations where you're given the data as a file of fixed-width records. Each record contains multiple defined fields, and each field is a smaller string of known size. Usually, a carriage return separates each "record" from the next. One of the benefits of XML is its ability to richly format data through XSLT, but you have to get the information into XML format in the first place; otherwise, XSLT doesn't do you a lot of good.
Fortunately, converting fixed-field length text files into XML is not a terribly difficult undertaking, though you need to be careful about a few "gotchas". After some simple preliminary processing to wrap the data in markup and save it as a well-formed XML document, you can use XSLT to handle most of the real work.
Convert Text File to XML
The one aspect of conversion between text files and XML that you need to watch most carefully, especially when using DOM processing, is that the number of records involved could get large fast. If the files are comparatively small (up to about 5000 records), then you can use recursion techniques to parse lines; the problems appear when you have a large number of records, because most recursive routines will likely end up "blowing the stack", exceeding the maximum depth that the processor can handle. For that reason, it's preferable (and in many respects both easier and faster) to preprocess the source files so that each line becomes an element. After that, you can use standard node-set iterations to walk through each line in the XSLT and generate the individual fields.
For example, a set of fixed-length records might originally be contained in a text file as shown below. Each field is a fixed-length substring that is always found at the same position in the line (unlike a comma- or tab-delimited file, where the fields may be of variable length). A given field has the same length in every record of the source file. Note that in order to make this work properly, there should be no carriage return after the last line.
Fixed Field Length Text
31A201Kurt Cagle 3242.27 Basic
31A202Aleria Delamare 6250.54 Advanced
31A203Gina Delgadio 317.12 Advanced
31A204Sera Anadropolis 4392.15 Basic
31A205Gregor Hauptmann 1224.88 Special
31A206Alexis Porter 92.15 Basic
31A207James Cabal 2215.25 Basic
31A208Micheal Denning 925.66 Advanced
31A209Amaya Kiasabe 866.54 Special
31A210Nathan Lane 936.12 Advanced
... Additional Values ...
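Because every field sits at the same character positions in every record, a record can be split by position alone. The following is a minimal Python sketch of that idea, not the XSLT used in this article. The field names, offsets, and widths in LAYOUT are assumptions chosen for illustration and are not the actual layout of the file above; the sample line is padded to match them.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: (field name, start offset, width).
# The real offsets depend on the source file and are assumed here.
LAYOUT: List[Tuple[str, int, int]] = [
    ("id", 0, 6),
    ("name", 6, 20),
    ("amount", 26, 10),
    ("level", 36, 10),
]


def split_record(line: str) -> Dict[str, str]:
    """Slice one fixed-width line into named fields and trim the padding."""
    return {
        name: line[start:start + width].strip()
        for name, start, width in LAYOUT
    }


# A sample record padded to fit the assumed layout above.
sample = (
    "31A201"
    + "Kurt Cagle".ljust(20)
    + "3242.27".rjust(10)
    + "Basic".ljust(10)
)
record = split_record(sample)
```

Keeping the layout in one table means that a change in field widths only touches LAYOUT, not the slicing logic.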
The preprocessing script works through the source file one line at a time: it places tags around each line, writes the wrapped line to the target text file, and then moves on to the next line. I chose to do this rather than just build the expression as a string in memory because writing to a file places no limit on the size of the text file you're reading, which is always an important issue to consider.
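The line-wrapping step described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the script used for this article: the function name wrap_lines and the element names "lines" and "line" are hypothetical, and the sketch escapes XML special characters and skips empty lines, which the description above does not mention.

```python
from xml.sax.saxutils import escape


def wrap_lines(source_path: str, target_path: str,
               root_tag: str = "lines", line_tag: str = "line") -> int:
    """Stream source_path into target_path, wrapping each line in an element.

    Only one line is held in memory at a time, so the size of the source
    file is not a concern. Returns the number of lines written.
    Tag names are assumptions, not taken from the article.
    """
    count = 0
    with open(source_path, encoding="utf-8") as src, \
            open(target_path, "w", encoding="utf-8") as dst:
        dst.write(f"<{root_tag}>\n")
        for raw in src:
            # Drop the line terminator but keep the field padding intact.
            text = raw.rstrip("\r\n")
            if not text:
                continue  # skip empty lines, such as one after the last record
            # Escape &, < and > so the result stays well-formed.
            dst.write(f"  <{line_tag}>{escape(text)}</{line_tag}>\n")
            count += 1
        dst.write(f"</{root_tag}>\n")
    return count
```

A call such as wrap_lines("records.txt", "records.xml"), with both file names hypothetical, would produce a well-formed document that an XSLT stylesheet can then iterate over one line element at a time.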
At the end of the processing, the text file has been converted to an XML document in this form:
31A201 Kurt Cagle 3242.27 Basic
31A202 Aleria Delamare 6250.54 Advanced
31A203 Gina Delgadio 317.12 Advanced