Why EDI Must Die

Electronic Data Interchange (EDI), a data format used for inter-business messaging, has been partially responsible for reducing costs and increasing productivity in the manufacturing and services industries. Over the years, EDI has evolved into a mechanism for integrating back office systems and communicating with business partners, and it has served its users well. However, it is inflexible, and its inherent and practical problems put a ceiling on its usefulness. To improve on the cost savings and productivity increases that EDI provided, it’s time to find alternatives and begin to retire EDI. The obvious alternative is XML, a maturing technology that solves many, if not all, of EDI’s problems.

EDI’s Problems
EDI is a special-purpose format. EDI’s intended purpose was to provide a common messaging standard for businesses to communicate with other businesses. Rather than having to deal with a different data format for each trading partner, all businesses could convert their proprietary data format to EDI to send or receive messages and then convert EDI to their respective proprietary formats. EDI is an unlikely choice for a use other than inter-business messaging.

In contrast, XML was created as a general purpose data format that can encapsulate any type of data. It can be used for inter-business messaging, application integration, protocol wrappers, data storage, object serialization, and more. Because of its generic nature, XML has enjoyed the kind of broad adoption that typically leads to faster technological advances, strong support from a large user community, and creative uses that are fed back into that community.

EDI lacks obvious looping declarations. The only looping constructs in an EDI (X12) document that explicitly mark the beginning and end of the loop (from outermost to innermost loop) are the interchange, functional group, and transaction set (where the transaction set contains the body of the business document). Looping occurs within a transaction set when a segment that is defined as the beginning of the loop is encountered and ends when no more segments defined in that loop are encountered. The looping is obvious in the document’s structural definition, but less clear in the actual document. This separation of structure from data increases the level of difficulty required to properly extract information from an EDI document.
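To make the implicit looping concrete, here is a minimal Python sketch. The fragment is a simplified, hypothetical purchase-order-style transaction set, and the loop rule (a PO1 segment starts an iteration that may also contain PID segments) is an assumption written into the code, standing in for the structural definition a real parser would have to consult. It is an illustration of the problem, not a usable X12 parser.

```python
# Simplified, hypothetical X12-style fragment: segments end with "~",
# elements are separated by "*". Real interchanges carry many more segments.
RAW = ("ST*850*0001~PO1*1*10*EA~PID*F****Widget~"
       "PO1*2*5*EA~PID*F****Gadget~CTT*2~SE*7*0001~")

# Assumed loop definition. Nothing in the document itself says this;
# it has to come from the transaction set's structural definition.
LOOP_START = "PO1"
LOOP_MEMBERS = {"PO1", "PID"}


def group_loops(raw):
    """Group segments into line-item loops using only the external definition."""
    segments = [s.split("*") for s in raw.split("~") if s]
    loops, current = [], None
    for seg in segments:
        tag = seg[0]
        if tag == LOOP_START:
            current = [seg]        # a new iteration begins implicitly
            loops.append(current)
        elif current is not None and tag in LOOP_MEMBERS:
            current.append(seg)    # still inside the current iteration
        else:
            current = None         # any other segment silently ends the loop
    return loops


loops = group_loops(RAW)
for number, loop in enumerate(loops, start=1):
    print(number, [seg[0] for seg in loop])
```

Note that the only explicit begin/end pair in the fragment is ST/SE. Where each line-item loop starts and stops is inferred entirely from the assumed definition.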

Unlike EDI, XML has real nesting at all levels of a document. Every XML element has a start tag and an end tag (or, for an empty element, a single tag that serves as both), so repeated structures are explicit and easy to recognize. This makes it much easier to determine the nesting structure of a document correctly.
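As a rough illustration of that explicit nesting, the following sketch walks a small XML document and records the depth of every element. The order document and its element names are hypothetical. The point is that the structure is recovered from the tags alone, with no external definition.

```python
import xml.etree.ElementTree as ET

# Hypothetical order document; the element and attribute names are invented.
XML = """<order>
  <line number="1"><quantity unit="EA">10</quantity><description>Widget</description></line>
  <line number="2"><quantity unit="EA">5</quantity><description>Gadget</description></line>
</order>"""


def depth_listing(element, depth=0):
    """Return (depth, tag) pairs in document order."""
    rows = [(depth, element.tag)]
    for child in element:
        rows.extend(depth_listing(child, depth + 1))
    return rows


rows = depth_listing(ET.fromstring(XML))
for depth, tag in rows:
    print("  " * depth + tag)
```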

EDI has no standard parsing API. The lack of a standard parsing API is a severely limiting factor. Anyone who wants to parse EDI data and extract information from it must either reinvent the wheel or purchase an expensive EDI mapping application or a third party library for parsing EDI.

Although XML is a much younger format, it already has DOM, SAX, XSL, XPath, and XQuery, and each has been implemented, often several times, for most major programming languages. To parse XML data and extract certain pieces of information from it or transform it to a different data layout, all you have to do is pick a mainstream language, choose an API supported by that language, and start writing code. You have myriad options, and most are free.
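For instance, Python's standard library ships both a DOM and a SAX implementation. The sketch below extracts the same values from a small, hypothetical order document with each API. The document and the `DescriptionHandler` class are invented for illustration; the `xml.dom.minidom` and `xml.sax` calls are the standard ones.

```python
import xml.dom.minidom
import xml.sax

# Hypothetical document; element names are invented for this example.
XML = ("<order>"
       "<line number='1'><description>Widget</description></line>"
       "<line number='2'><description>Gadget</description></line>"
       "</order>")

# DOM: load the whole tree, then query it.
dom = xml.dom.minidom.parseString(XML)
dom_descriptions = [node.firstChild.data
                    for node in dom.getElementsByTagName("description")]


# SAX: react to events as the parser streams through the document.
class DescriptionHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.in_description = False
        self.descriptions = []

    def startElement(self, name, attrs):
        if name == "description":
            self.in_description = True
            self.descriptions.append("")

    def characters(self, content):
        if self.in_description:
            self.descriptions[-1] += content

    def endElement(self, name):
        if name == "description":
            self.in_description = False


handler = DescriptionHandler()
xml.sax.parseString(XML.encode("utf-8"), handler)

print(dom_descriptions)
print(handler.descriptions)
```

DOM is convenient when the document fits comfortably in memory; SAX avoids building a tree, which matters more as documents grow.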

EDI causes vendor lock-in. Those who continue to use EDI usually end up relying on a particular EDI vendor for software to manage their EDI and are, therefore, “locked in” to that particular EDI vendor. EDI vendors are essentially the sole providers of software tools for EDI. This is another option-limiting position to be in.

In contrast, tools for XML are created by the same community that utilizes XML. Many, if not most, XML tools are free and open source. One need not be bound to specific vendors to create a successful XML implementation.

XML’s Problems
Despite XML’s more flexible structure, standard parsing API, and readily available tools, it’s not without problems.

XML is a verbose format. XML documents are much more verbose than their EDI counterparts, often including redundant data in the document body that EDI leaves in the document definition. Some statistics show comparable XML documents being up to 25 times larger than EDI documents. Two main concerns related to this verbosity revolve around network and processing throughput.

XML’s verbosity stems from its explicit tag and attribute names that mark the beginning and end of looping structures and the data’s intended contents rather than relying on a fixed document format or delimited fields to dictate meaning. However, it’s precisely that explicit nature that makes XML a superior data format to EDI. The increased number of characters that an XML file must contain is a minor trade off for the increased flexibility and usefulness that is gained.
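The trade-off is easy to see on a single record. The sketch below compares the byte count of a hypothetical EDI-style line item with one possible XML rendering of the same data. Both strings are invented, and the resulting ratio describes only this toy example; it says nothing about the size ratios of real business documents.

```python
# Hypothetical line item: item 1, quantity 10, unit "EA" (each).
edi_line = "PO1*1*10*EA~"
xml_line = '<line number="1"><quantity unit="EA">10</quantity></line>'

edi_bytes = len(edi_line.encode("utf-8"))
xml_bytes = len(xml_line.encode("utf-8"))
ratio = xml_bytes / edi_bytes

print(f"EDI: {edi_bytes} bytes, XML: {xml_bytes} bytes, ratio: {ratio:.1f}x")
```

The extra bytes in the XML version are exactly the names ("line", "quantity", "unit") that the EDI version leaves to the document definition.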

XML Usability
Although XML is seen by many as too verbose to be an efficient substitute for EDI, it’s worth visiting the objections.

Network Throughput. A company sending XML documents with an average size of 50 kilobytes would have to send 25 per second to saturate a 10 megabit-per-second network connection. That equates to around two million documents per day, which surpasses the requirements of many (perhaps most) large businesses, so the added network load will likely not be an issue. Sending a 50 kilobyte XML document over TCP/IP is also significantly faster than sending a 2 kilobyte EDI document over a legacy communications link such as a 9600 baud bisync modem (and yes, these are still in use today for EDI).
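The arithmetic behind those figures can be checked in a few lines. This sketch assumes decimal units (1 kilobyte = 1,000 bytes, 1 megabit = 1,000,000 bits) and ignores protocol overhead, so it is an upper bound rather than a measurement.

```python
DOC_SIZE_BYTES = 50 * 1000                # 50 KB average document
LINK_BITS_PER_SECOND = 10 * 1000 * 1000   # 10 megabit-per-second connection
SECONDS_PER_DAY = 24 * 60 * 60

docs_per_second = LINK_BITS_PER_SECOND / (DOC_SIZE_BYTES * 8)
docs_per_day = docs_per_second * SECONDS_PER_DAY

print(docs_per_second)  # 25.0
print(docs_per_day)     # 2160000.0
```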

Processing Throughput. As a test, I wrote a Python script using the Expat XML parsing library to parse through an XML file with repeated five-level-deep nesting, an attribute on each element, and text nodes (filled with white space) between each element. The event-handlers created for the Expat parser simply counted each element as it was encountered. Table 1 shows the results for the Linux test, and Table 2 shows the test results on Windows XP.

Author’s Note: The Linux/Windows tests were run on different computers with different processors, and therefore are not meant to be directly comparable.
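The original test script is not reproduced in this article. A minimal sketch of a comparable test, using the Expat bindings in Python's standard library, might look like the following. The generated document (element names, the wrapping `root` element, and the repeat count) is an assumption made for illustration, so timings from this sketch are not comparable to the figures in the tables.

```python
import time
import xml.parsers.expat


def build_document(repeats):
    """Build repeated five-level nesting with an attribute on each element
    and whitespace-only text between elements. The wrapping <root> element
    is added here only so the result is a single well-formed document."""
    unit = ('<a id="1">\n <b id="2">\n  <c id="3">\n   <d id="4">\n'
            '    <e id="5">\n    </e>\n   </d>\n  </c>\n </b>\n</a>\n')
    return ("<root>\n" + unit * repeats + "</root>\n").encode("utf-8")


def count_elements(data):
    """Parse with Expat; the only handler counts each start tag."""
    count = 0

    def start_element(name, attrs):
        nonlocal count
        count += 1

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start_element
    parser.Parse(data, True)   # True marks this as the final chunk
    return count


data = build_document(1000)
started = time.perf_counter()
total = count_elements(data)
elapsed = time.perf_counter() - started
print(f"{total} elements in {len(data)} bytes, {elapsed:.4f} s")
```

Raising the repeat count scales the file size, which makes it easy to approximate either the single large file or the batch of small files described above.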

Table 1. XML Throughput Using Expat on Linux. The table shows the results obtained using the Expat XML parser on a Fedora Core 3 Linux installation.

| Processor | Operating System | Number of Files | File Size | Process Time |
|---|---|---|---|---|
| P-II 300 | Fedora Core 3 Linux | 1 | 5.7MB | 73 seconds |
| P-II 300 | Fedora Core 3 Linux | 50 | 50KB | 30 seconds |

Table 2. XML Throughput Using Expat on Windows XP. The table shows the results obtained using the Expat XML parser on Windows XP.

| Processor | Operating System | Number of Files | File Size | Process Time |
|---|---|---|---|---|
| P-III 550 | Windows XP | 1 | 5.7MB | 12 seconds |
| P-III 550 | Windows XP | 50 | 50KB | 5 seconds |
| P-IV 3 GHz | Windows XP | 1 | 5.7MB | 1.7 seconds |
| P-IV 3 GHz | Windows XP | 50 | 50KB | 0.7 seconds |

As Table 2 shows, the P-III is capable of parsing around 864,000 50KB documents per day (50 documents in 5 seconds is 10 per second, sustained over 86,400 seconds), and the P-IV is capable of parsing over 6 million. As Table 1 shows, even an old P-II 300 can parse 144,000 50KB documents per day. While processing XML in the enterprise consists of more than just parsing documents and counting elements, the tables show that available XML parsers can perform quite well on both operating systems tested, even on older and slower hardware.
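The per-day figures follow directly from the 50-file rows of the two tables. The sketch below reproduces the extrapolation; it assumes the measured rate could be sustained around the clock, which is a simplification.

```python
SECONDS_PER_DAY = 24 * 60 * 60
FILES_PER_RUN = 50  # each timed run parsed fifty 50KB files

# Process times in seconds, taken from the 50-file rows of Tables 1 and 2.
run_seconds = {
    "P-II 300": 30.0,
    "P-III 550": 5.0,
    "P-IV 3 GHz": 0.7,
}

docs_per_day = {
    cpu: FILES_PER_RUN / seconds * SECONDS_PER_DAY
    for cpu, seconds in run_seconds.items()
}

for cpu, count in docs_per_day.items():
    print(f"{cpu}: about {count:,.0f} documents per day")
```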

Despite the many problems with EDI, it is impossible to dispute that EDI fulfills its intended function. I expect that EDI will remain in wide use for a number of years to come and it will continue to fulfill its intended function. However, that originally intended function is not generating any new benefit. To improve on the cost savings and productivity increases begun by EDI, we need a new data format with new and broader functionality. XML is the obvious choice when easier and more natural integration is the goal. Until the industry adopts a more usable data format and a broader integration methodology, the benefit levels will remain stunted by EDI’s limited capabilities.
