Converting Fixed-Width Text Records to XML

We’re not so far removed from the databases of old. Most databases still apply a specified length to field entries, although the relationship between that length and the way the database stores the information is considerably more complex than it was when databases were essentially single long strings of fixed length. The interfaces for accessing this information have changed as well, so you’re probably only peripherally aware of the length relationship. Still, with legacy databases you may run into situations where the data arrives as a file of fixed-width records. Each record contains multiple defined fields, and each field is a smaller string of known size. Usually, a carriage return separates each “record” from the next. One of the benefits of XML is its ability to richly format data through XSLT, but you have to get the information into XML format in the first place; otherwise, XSLT doesn’t do you a lot of good.



Fortunately, converting fixed-field length text files into XML is not a terribly difficult undertaking, though you need to be careful about a few “gotchas”. After some simple preliminary processing to wrap the data in markup and save it as a well-formed XML document, you can use XSLT to handle most of the real work.

Convert Text File to XML
The one aspect of conversion between text files and XML that you need to watch most carefully, especially when using DOM processing, is that the number of records involved could get large fast. If the files are comparatively small (up to about 5000 records), then you can use recursion techniques to parse lines; the problems appear when you have a large number of records, because most recursive routines will likely end up “blowing the stack”, exceeding the maximum depth that the processor can handle. For that reason, it’s preferable (and in many respects both easier and faster) to preprocess the source files so that each line becomes an element. After that, you can use standard node-set iterations to walk through each line in the XSLT and generate the individual fields.

For example, a set of fixed-length records might originally be contained in a text file as shown below. Each item is a fixed-length substring that is always found at the same position in the line (unlike a comma- or tab-delimited file, where the fields may be of variable length). Note that in order to make this work properly, there should be no carriage return after the last line. Each field occupies the same number of characters in every record.

Fixed Field Length Text

31A201Kurt        Cagle       3242.27  Basic      
31A202Aleria      Delamare    6250.54  Advanced   
31A203Gina        Delgadio    317.12   Advanced   
31A204Sera        Anadropolis 4392.15  Basic      
31A205Gregor      Hauptmann   1224.88  Special    
31A206Alexis      Porter      92.15    Basic      
31A207James       Cabal       2215.25  Basic      
31A208Micheal     Denning     925.66   Advanced   
31A209Amaya       Kiasabe     866.54   Special    
31A210Nathan      Lane        936.12   Advanced   
... Additional Values ...

To perform the initial processing, I wrote a simple ASP JavaScript program (see Listing 1) that loads the source text document and creates a second document (with an XML extension but treated as text). Although the sample code is in JavaScript, you could easily port it to Java or another language. The program iterates through each line of the first document, wraps a set of tags around each line, writes the wrapped line to the target text file, and then moves on to the next line. I chose to do this rather than build the whole expression as a string in memory because writing to a file places no practical limit on the size of the text file you’re reading, which is always an important issue to consider.
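The wrapping step described above can be sketched roughly as follows. This is not the original ASP listing: it is a minimal plain-JavaScript illustration, and the element names `lines` and `line` are assumptions chosen for the example. The generator yields one piece of output at a time, so a caller can write each piece to the target file rather than assembling the whole document in memory.

```javascript
// Hypothetical element names: the article's own tag names are not reproduced here.
const ROOT_TAG = "lines";
const LINE_TAG = "line";

// Escape the characters that would break well-formedness inside element content.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Yield the XML document one output line at a time, so the caller can write
// each piece to a file instead of building one large string in memory.
function* wrapLines(lines) {
  yield `<${ROOT_TAG}>`;
  for (const line of lines) {
    if (line.length === 0) continue; // ignore an empty line, such as one left by a trailing line break
    yield `<${LINE_TAG}>${escapeXml(line)}</${LINE_TAG}>`;
  }
  yield `</${ROOT_TAG}>`;
}

// Two records from the sample data; the padding inside each line is preserved.
const sample = [
  "31A201Kurt        Cagle       3242.27  Basic      ",
  "31A202Aleria      Delamare    6250.54  Advanced   ",
];
const xmlLines = [...wrapLines(sample)];
console.log(xmlLines.join("\n"));
```

Because `wrapLines` accepts any iterable, the same function could be fed lines read incrementally from a file and its output written to a stream one piece at a time, which keeps memory use flat for large inputs.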

At the end of the processing, the text file has been converted to an XML document in this form:

31A201  Kurt      Cagle        3242.27  Basic
31A202  Aleria    Delamare     6250.54  Advanced
31A203  Gina      Delgadio     317.12   Advanced



Transform Field Values with XSLT
Now, it’s possible to take advantage of the strengths of XSLT to quickly process the fields. For fixed length data, you need to know three distinct pieces of meta-information:

  • the name of the field
  • the field’s associated data type
  • the number of characters allocated to displaying the field data

It is generally preferable to create a separate XML document that contains the relevant field information rather than attempting to store meta-information with the data. The fields.xml file contains meta-content about the text database.

                                 

The type information comes from the XML Schema namespace. Note that while type information is not, strictly speaking, necessary (especially if a schema already exists for the records to be created), it can be useful for processing field information down the road.
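To make the three pieces of meta-information concrete, here is a rough JavaScript sketch of the same idea expressed as a plain array rather than as the fields.xml file. The field names and types are hypothetical; only the widths are taken from the data, inferred by counting characters in the sample records above (6, 12, 12, 9, and 11, for a 50-character record).

```javascript
// Hypothetical field names and types; the widths are inferred from the sample rows.
const fields = [
  { name: "accountId", type: "xs:string", length: 6 },
  { name: "firstName", type: "xs:string", length: 12 },
  { name: "lastName", type: "xs:string", length: 12 },
  { name: "balance", type: "xs:decimal", length: 9 },
  { name: "plan", type: "xs:string", length: 11 },
];

// Total number of characters a well-formed record should contain.
function recordWidth(fieldList) {
  return fieldList.reduce((total, field) => total + field.length, 0);
}

// Convert the per-field lengths into start/end column offsets.
function fieldOffsets(fieldList) {
  let start = 0;
  return fieldList.map((field) => {
    const entry = { name: field.name, start, end: start + field.length };
    start += field.length;
    return entry;
  });
}

console.log(recordWidth(fields));
console.log(fieldOffsets(fields));
```

Comparing `recordWidth(fields)` against the length of each input line is a cheap sanity check that the metadata and the data file agree before any fields are extracted.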

With this information, it’s possible to parse each line in the initial records and correctly extract the relevant field data. Iterating through each record in the recordset is simple, and occurs often enough that it’s worth building a general named template to perform the task:

                                                                                 

The named template getRecordsFromNodeSet takes a recordset as an argument and calls the getFields template over each record.

The getFields named template is a little more complicated, primarily because it involves a certain amount of recursion (see Listing 2). Essentially, the named template works by retaining an index for each field. The template loads the XML file containing the field meta-information from the URL in the parameter $fieldSource using the document() function. The individual field entries are passed into a node-set. This node-set in turn can act like an array to retrieve individual field elements from a given index:

The recursive calls increment the field pointer (which is re-initialized with each record). This pointer can in turn be compared to the number of fields in the meta-information file. Because the recursion here is very shallow (from a handful to a few dozen calls at most), it is in fact preferable (and probably faster) to use XSLT recursion than to process the fields via the DOM.
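The control flow of the two templates can be mirrored in JavaScript to show the shape of the logic. This is an analogy, not the XSLT from Listing 2: the function names echo the template names, the field list is a hypothetical stand-in for the metadata file, and the trimming of padding is an assumption about the desired output. Records are processed by plain iteration, while the fields within one record are handled by shallow recursion on a field pointer.

```javascript
// Hypothetical field layout; widths are inferred from the sample data.
const fields = [
  { name: "accountId", length: 6 },
  { name: "firstName", length: 12 },
  { name: "lastName", length: 12 },
  { name: "balance", length: 9 },
  { name: "plan", length: 11 },
];

// Recursive field extraction: `index` is the field pointer and `offset` is the
// column where the current field starts. Recursion depth equals the number of
// fields, so it stays shallow no matter how many records there are.
function getFields(line, fieldList, index = 0, offset = 0) {
  if (index >= fieldList.length) return {}; // pointer reached the field count: stop
  const field = fieldList[index];
  const value = line.substring(offset, offset + field.length).trim();
  const rest = getFields(line, fieldList, index + 1, offset + field.length);
  return { [field.name]: value, ...rest };
}

// Iterate (not recurse) over the records, restarting the field pointer for each.
function getRecordsFromLines(lines, fieldList) {
  return lines.map((line) => getFields(line, fieldList));
}

const records = getRecordsFromLines(
  [
    "31A201Kurt        Cagle       3242.27  Basic      ",
    "31A202Aleria      Delamare    6250.54  Advanced   ",
  ],
  fields
);
console.log(records);
```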

Display the Resulting XML
The ASP file ProcessTextFile.asp generates the XML record set, saves it, then passes the XML file to another transformation (ProcessRecordset.xsl):

31A201  Kurt  Cagle  3242.27  Basic
31A202  Aleria  Delamare  6250.54  Advanced
31A203  Gina  Delgadio  317.12  Advanced

The processTextFile.asp page accepts a showType parameter. It passes the value to FixedLengthRoutines.xsl to determine whether to include the XML Schema data type information in the final output. You control the setting by using an optional query string parameter. For example, using the URL processTextFile.asp?showType=yes tells the application to include the type definitions, while setting showType to “no” in the URL removes them.
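The effect of the showType switch can be illustrated with a short JavaScript sketch. It is not the stylesheet's actual output format: the `record` element, the field names, and the `type` attribute are assumptions made for the example, and treating any value other than "yes" as "no" is likewise a guess at the default behavior.

```javascript
// Hypothetical field metadata; names and types are illustrative only.
const fields = [
  { name: "accountId", type: "xs:string" },
  { name: "balance", type: "xs:decimal" },
];

// Escape characters that are unsafe in element content.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Serialize one record, adding a type attribute only when showType is "yes".
// Any other value (including a missing parameter) is treated as "no" here.
function recordToXml(values, fieldList, showType) {
  const includeType = showType === "yes";
  const children = fieldList.map((field) => {
    const typeAttr = includeType ? ` type="${field.type}"` : "";
    return `<${field.name}${typeAttr}>${escapeXml(values[field.name])}</${field.name}>`;
  });
  return `<record>${children.join("")}</record>`;
}

const row = { accountId: "31A201", balance: "3242.27" };
console.log(recordToXml(row, fields, "yes"));
console.log(recordToXml(row, fields, "no"));
```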

After you have the information in XML form, you can essentially do anything that you can do with any other XML file. In the sample code, the processTextFile.asp passes the XML to another stylesheet, processRecordset.xsl (see Listing 3), which formats the output in a table and provides a rudimentary way of filtering the output list.

The processRecordset.xsl file displays the records as a table. You can use the $matchTerm parameter as a rudimentary filter on the list. You can also subclass the templates to provide additional functionality, such as currency formatting or color-coding for different types of accounts. To avoid potential namespace conflicts, I defined all these templates in “format” mode. To subclass them in your own stylesheets, you need to use the same mode indicator in your code. You can view the output of the ASP page in Figure 1.

The successful use of XSLT sometimes comes down to knowing when to avoid using XSLT. While it is possible to accomplish the same transformation entirely within XSLT (especially using extensions), there is still a basic need for the DOM to handle limitations in the XSLT model. Regardless of the processing mechanism, however, by converting fixed records into even simple XML you can bootstrap your development dramatically and gain the robust processing capabilities that XSLT brings.

devx-admin
