You Just Inherited 1,000,000 Lines of Code — What do You do Next?

Inheriting code from another project or another company is a mixed blessing for developers. Finished code that works is always a plus, but what about code that works and is rotten at the same time? You’ve still got to work with it, maintain it, and possibly build on top of it.

How do you know what you’ve inherited and what to do with it?

Michael Rozlog believes the solution is a formal approach known as software archeology, which enables developers to deconstruct an existing piece of software to find patterns for reuse in future developments.

Rozlog is product manager for Delphi Solutions at Embarcadero Technologies, a provider of tools for developers and database professionals.

“As a developer, at some point you face the daunting task of working on code you didn’t build,” he said. “Software archeology helps you determine how to deconstruct inherited software source code.”

Rozlog said the beauty of this kind of reverse engineering is that plenty of very fast tools exist for various languages.

“With the right tool, you can reverse engineer a piece of software with a million lines of code in less than a week,” he said.


Although getting the information by hand is possible, that approach is tedious and time-consuming. Using a set of tools reduces the time needed to generate data to help you move forward with development, he noted.

“Java developers, for example, can use JBuilder, which gives them software archeology tools and functionality on top of the base Eclipse SDK,” he said.

Rozlog and his developer team at Embarcadero have devised a six-step process for reviewing what is there and what is not, which helps developers define their project development strategy.

Those steps are: visualizing an architectural diagram, understanding the health of the object model, studying the current state of the code, testing the current code, locating the bottlenecks in the source code, and, finally, assessing the adequacy of the documentation.

Visualizing an architectural diagram

“With a tool such as JBuilder 2008, you can reverse and forward engineer Java code,” he said. “This means that if you inherited a large amount of Java code, JBuilder can reverse engineer the code and produce a series of UML diagrams. And since JBuilder uses LiveSource, any changes made to the diagrams will result in the code being changed and vice-versa.”

Ideally, the code and diagrams are always in sync.
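To give a rough sense of what a reverse-engineering tool extracts, the sketch below uses Java's standard reflection API to list a class's parent, fields, and methods as plain text. It does not show how JBuilder or LiveSource works; the `ClassOutline`, `Document`, and `Invoice` names are hypothetical and exist only for the demonstration.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class ClassOutline {
    // Hypothetical "inherited" classes, used only for this illustration.
    static class Document { protected String title; }
    static class Invoice extends Document {
        private double total;
        public double getTotal() { return total; }
    }

    // Returns one text line per relationship or member declared directly on the class.
    static List<String> describe(Class<?> type) {
        List<String> lines = new ArrayList<>();
        String name = type.getSimpleName();

        Class<?> parent = type.getSuperclass();
        if (parent != null && parent != Object.class) {
            lines.add(name + " extends " + parent.getSimpleName());
        }

        // Reflection does not guarantee member order, so sort by name.
        Field[] fields = type.getDeclaredFields();
        Arrays.sort(fields, Comparator.comparing(Field::getName));
        for (Field f : fields) {
            if (f.isSynthetic()) continue; // skip compiler- or tool-generated members
            lines.add(name + " has field " + f.getName() + " : " + f.getType().getSimpleName());
        }

        Method[] methods = type.getDeclaredMethods();
        Arrays.sort(methods, Comparator.comparing(Method::getName));
        for (Method m : methods) {
            if (m.isSynthetic()) continue;
            lines.add(name + " has method " + m.getName() + "() : "
                    + m.getReturnType().getSimpleName());
        }
        return lines;
    }

    public static void main(String[] args) {
        describe(Invoice.class).forEach(System.out::println);
    }
}
```

A real tool works from source or bytecode across the whole project and draws the result as diagrams; this sketch only shows that the structural facts behind a class diagram can be recovered mechanically.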

Understanding the health of the object model

Rozlog said it is also important to get an understanding of the health of the object model.

“One of the fastest ways to do this is to run software metrics on the code,” he said. “Metrics give you information about the code’s construction and strength as well as the weak or problematic spots.”
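As an illustration of what a metric is, here is a minimal sketch that computes two crude numbers for a piece of Java source held in a string: a count of code lines and an approximation of cyclomatic complexity (decision points plus one). The `SimpleMetrics` class is hypothetical, and the keyword counting is a simplification that also scans comments and string literals; real metrics tools parse the code properly.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SimpleMetrics {
    // Tokens treated as decision points in this rough approximation.
    private static final Pattern DECISION =
            Pattern.compile("\\b(if|for|while|case|catch)\\b|&&|\\|\\||\\?");

    // Counts lines that are neither blank nor single-line (//) comments.
    static int codeLines(String source) {
        int count = 0;
        for (String line : source.split("\n")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty() && !trimmed.startsWith("//")) {
                count++;
            }
        }
        return count;
    }

    // Approximates cyclomatic complexity as (number of decision tokens) + 1.
    static int approxComplexity(String source) {
        Matcher matcher = DECISION.matcher(source);
        int decisions = 0;
        while (matcher.find()) {
            decisions++;
        }
        return decisions + 1;
    }

    public static void main(String[] args) {
        String sample = "int f(int x) {\n  if (x > 0 && x < 10) {\n    return 1;\n  }\n  return 0;\n}\n";
        System.out.println("code lines: " + codeLines(sample));
        System.out.println("approx. complexity: " + approxComplexity(sample));
    }
}
```

Run across every file in an inherited code base, even numbers this crude help rank which classes deserve a closer look first.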

Studying the current state of the code

Once developers understand the health of the code from a structural standpoint, they can go on to uncover issues that can cause errors, bugs, or misunderstandings going forward.

Rozlog said some software archeology tools include dozens of code audits that find possible performance issues, potential errors, and duplication of code.
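One of the audits mentioned, duplication of code, can be sketched in a few lines. The example below slides a fixed-size window over a list of source lines and reports any run of lines that appears more than once. The `DuplicateAudit` class and the window size are assumptions made for illustration; production audit tools compare token streams or syntax trees rather than raw text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DuplicateAudit {
    // Maps each repeated run of `window` consecutive lines to the
    // 1-based line numbers where that run starts.
    static Map<String, List<Integer>> findDuplicates(List<String> lines, int window) {
        Map<String, List<Integer>> seen = new LinkedHashMap<>();
        for (int i = 0; i + window <= lines.size(); i++) {
            StringBuilder key = new StringBuilder();
            for (int j = i; j < i + window; j++) {
                // Trim so that indentation differences do not hide a duplicate.
                key.append(lines.get(j).trim()).append('\n');
            }
            seen.computeIfAbsent(key.toString(), k -> new ArrayList<>()).add(i + 1);
        }
        // Keep only runs that occur at least twice.
        seen.values().removeIf(starts -> starts.size() < 2);
        return seen;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "a = 1;", "b = 2;", "print(a);", "  a = 1;", "  b = 2;", "done();");
        findDuplicates(lines, 2).forEach((block, starts) ->
                System.out.println("duplicate block starting at lines " + starts));
    }
}
```

Note that this naive version also flags repeated blank lines or lone braces, which is one reason real tools normalize the code before comparing.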

Testing the current source

Good testing is one of the most important processes for today’s complex systems.

“Ironically, most code that goes through the process of software archeology has very limited testing,” he said. “If you don’t do simple testing on the code, it is hard to harvest patterns, change the code in any meaningful way, or integrate with other systems.”
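A common form of the "simple testing" described here is a characterization test: record what the inherited code does today, then fail loudly if a later change alters it. The sketch below uses plain Java with no test framework. The `legacyShippingCost` method is a hypothetical stand-in for undocumented inherited code, and the recorded values were obtained by running it, not taken from any specification.

```java
final class CharacterizationTest {
    // Hypothetical inherited method whose intent is undocumented.
    static int legacyShippingCost(int weightKg) {
        if (weightKg <= 0) return 0;
        return weightKg < 5 ? 7 : 7 + (weightKg - 5) * 2;
    }

    // Throws if the observed value differs from the value recorded earlier.
    static void expect(int recorded, int observed, String label) {
        if (recorded != observed) {
            throw new AssertionError(
                    label + ": recorded " + recorded + ", observed " + observed);
        }
    }

    // Pins the current behavior at and around the visible thresholds.
    static void run() {
        expect(0, legacyShippingCost(0), "zero weight");
        expect(7, legacyShippingCost(4), "below threshold");
        expect(7, legacyShippingCost(5), "at threshold");
        expect(17, legacyShippingCost(10), "above threshold");
    }

    public static void main(String[] args) {
        run();
        System.out.println("current behavior matches the recorded values");
    }
}
```

Tests like these say nothing about whether the behavior is correct; they only make it safe to refactor, because any unintended change shows up immediately.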

Locating the bottlenecks in the source code

Performance reviews are also essential, said Rozlog.

“Software archeology helps point you to where the code is slow or does not perform well,” he noted. “It can help developers find the exact line or location of the code that is causing the performance issues.”

The general rule is that less than five percent of the code causes 80 percent of the slowdown.
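Without a profiler, the same question (where does the time go?) can be approached by timing named sections of the code and comparing their share of the total. The following is a minimal sketch using `System.nanoTime()`; the `SectionTimer` class and the section names are hypothetical, and hand-rolled timing like this is far coarser than what a dedicated profiling tool reports.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class SectionTimer {
    // Accumulated elapsed nanoseconds per named section.
    private final Map<String, Long> totals = new LinkedHashMap<>();

    // Runs the work and adds its elapsed time to the named section.
    void time(String section, Runnable work) {
        long start = System.nanoTime();
        work.run();
        record(section, System.nanoTime() - start);
    }

    // Adds an already-measured duration to the named section.
    void record(String section, long nanos) {
        totals.merge(section, nanos, Long::sum);
    }

    // Returns the section with the largest accumulated time, or null if none.
    String slowest() {
        String worst = null;
        long max = -1;
        for (Map.Entry<String, Long> entry : totals.entrySet()) {
            if (entry.getValue() > max) {
                max = entry.getValue();
                worst = entry.getKey();
            }
        }
        return worst;
    }

    // Returns the fraction of all recorded time spent in the section.
    double share(String section) {
        long sum = 0;
        for (long value : totals.values()) {
            sum += value;
        }
        return sum == 0 ? 0.0 : (double) totals.getOrDefault(section, 0L) / sum;
    }

    public static void main(String[] args) {
        SectionTimer timer = new SectionTimer();
        timer.time("build string", () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100_000; i++) sb.append(i);
        });
        timer.time("empty loop", () -> { for (int i = 0; i < 1_000; i++) { } });
        System.out.println("slowest section: " + timer.slowest());
    }
}
```

If one section accounts for most of the recorded time, that is where optimization effort pays off, which is the point of the rule of thumb above.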

Assessing the adequacy of the documentation

It is important that any diagrams, tests, metrics, audits and performance data become part of the overall code documentation set.

“When you generate a UML diagram, it becomes part of the overall documentation, and when you run a metric or audit, it becomes part of the documentation,” he said.

