When standards change, your development efforts must often change with them. But change doesn't always have to be painful. If you're trying to upgrade your HTML pages to the latest standards, fix unclosed tags, find and fix deprecated features, and format all your Web pages consistently, HTML Tidy is just what the doctor ordered.
by Alex Hildyard
Mar 29, 2004
You may never have heard of it, but HTML Tidy isn't new. It began life as a free application and is now open source. It was originally written in C as a command-line executable by W3C employee Dave Raggett, before being taken over as an open source initiative in 2000. Somewhat characteristically of open source efforts, it has managed to shun the limelight, yet an ever-increasing number of Web professionals rely on it daily to get their jobs done.
The principal reason it's so popular is that it combines syntactic, semantic, and stylistic advice in a single, highly configurable library. This means it can do more than simply fix unclosed or badly nested tags; it also has sufficient understanding of document structure to perform intelligent contextual cleanup: for example, culling empty paragraphs, removing duplicate attributes, or inlining blocks of text. Marry all this with W3C's recommendations for Web site accessibility and doctype compliance, add a basic understanding of browser differences, throw in a cautionary dollop of really verbose markup from your "smart" HTML export program of choice, and you end up with a package that not only fixes many of your stylistic faults but can even tell you in plain English how to become a better HTML coder. The fact that it exports fully standards-compliant XHTML is just icing on the cake.
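To make "fixing unclosed or badly nested tags" concrete, here is a toy sketch of that kind of repair pass. It is not HTML Tidy's code or algorithm: it uses only Python's standard `html.parser` module, and the names `TagCloser`, `close_open_tags`, and `VOID_TAGS` are invented for this illustration. It keeps a stack of open elements, closes inner elements before outer ones, drops stray end tags, and closes whatever is still open at the end of input. It knows nothing about implied end tags (such as one `<p>` following another), and it discards comments and doctypes, all of which a real tool handles.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag (a partial list, for illustration).
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class TagCloser(HTMLParser):
    """Toy repair pass: re-emits markup and closes tags that were left open."""

    def __init__(self):
        # Keep entities as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []    # pieces of the repaired document
        self.stack = []  # names of currently open elements

    def handle_starttag(self, tag, attrs):
        # Re-emit the start tag exactly as it appeared in the source.
        self.out.append(self.get_starttag_text())
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray end tag with no matching start tag: drop it
        # Close everything opened after the matching start tag, innermost first.
        while self.stack:
            open_tag = self.stack.pop()
            self.out.append(f"</{open_tag}>")
            if open_tag == tag:
                break

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def result(self):
        self.close()  # flush any buffered input
        # Close whatever is still open at end of input.
        while self.stack:
            self.out.append(f"</{self.stack.pop()}>")
        return "".join(self.out)


def close_open_tags(html):
    """Return html with unclosed and mis-nested tags repaired (toy version)."""
    parser = TagCloser()
    parser.feed(html)
    return parser.result()


if __name__ == "__main__":
    print(close_open_tags("<p>Hello <b>world"))    # <p>Hello <b>world</b></p>
    print(close_open_tags("<b><i>text</b></i>"))   # <b><i>text</i></b>
```

The stack is what lets the sketch handle bad nesting: when `</b>` arrives while `<i>` is still open, the inner element is closed first and the later, now redundant `</i>` is discarded. HTML Tidy's contextual cleanup goes well beyond this.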
HTML Tidy's Genesis
Today HTML Tidy exists in many forms. The C library on which it was based has now been ported to most major operating systems (Windows, various flavors of Unix, BSD, MacOS, and DOS), as well as some minor ones (like the Atari 520ST's GEM and the Amiga's OS3). You can download C++, Java, Delphi, Pascal, Perl, Python, and COM wrappers, and there's even a FrontPage 2000 plug-in.
There are currently two GUI implementations of HTML Tidy on the Windows platform: TidyGUI (see Figure 1) and TidyUI (see Figure 2).