A Word of Caution
HTML Tidy is not a panacea for your markup problems, and you should be prepared for the fact that it may change working HTML into reformed HTML or XHTML that no longer "works." This is usually because the "working HTML" in question does not in fact comply with its doctype (explicit or implied), but your particular browser produces what appears to be "correct" behavior anyway. For example, I've been guilty of nesting TABLE tags within SPAN tags. According to the HTML 4.0 Transitional doctype I've been using, that isn't permissible, because SPAN is an inline element and TABLE is a block-level one, but in Internet Explorer 6 I end up with the effect I want all the same. However, if I were to update my doctype to XHTML 1.0, my tables would no longer position "correctly." While you can generally rely on HTML Tidy to alert you to potential problems like this, its resolutions may not make immediate sense unless you appreciate the logic behind its decisions. In this case, HTML Tidy was given the following source:
<span id="span_1">
<table>
<tr><td>Test</td></tr>
</table>
<span id="span_2">
</span>
</span>
Then HTML Tidy rendered the output as follows:
<span id="span_1">
</span>
<table>
<tr>
<td>Test</td>
</tr>
</table>
<span id="span_1">
<span id="span_2">
</span>
</span>
Duplicating the <span> tag might look like an error, but in fact the only sensible way to fix the illegal nesting is to close the <span> before the table and then reopen it afterwards. HTML Tidy's diagnostics, sent to NETTidy's output panel, explain what it has done:
TidyWarning: (6, 1): missing </span> before <table>
TidyWarning: (10, 4): inserting implicit <span>
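If you want to act on these diagnostics programmatically (for example, to jump to the offending position in an editor), the two lines above suggest a simple, regular shape. The following is a minimal Python sketch, not part of HTML Tidy or NETTidy: the names Diagnostic and parse_diagnostic are hypothetical, the pattern is inferred only from the two sample lines shown here, and reading the bracketed pair as (line, column) is an assumption.

```python
import re
from typing import NamedTuple, Optional


class Diagnostic(NamedTuple):
    """One parsed diagnostic line (hypothetical helper type)."""
    level: str    # e.g. "Warning", taken from the "TidyWarning" prefix
    line: int     # assumed to be the source line number
    column: int   # assumed to be the source column number
    message: str  # the free-text explanation


# Pattern inferred from the sample output, e.g.
#   TidyWarning: (6, 1): missing </span> before <table>
_PATTERN = re.compile(
    r"^Tidy(?P<level>\w+):\s*"
    r"\(\s*(?P<line>\d+)\s*,\s*(?P<column>\d+)\s*\):\s*"
    r"(?P<message>.*)$"
)


def parse_diagnostic(text: str) -> Optional[Diagnostic]:
    """Return a Diagnostic for a matching line, or None if it doesn't match."""
    match = _PATTERN.match(text.strip())
    if match is None:
        return None
    return Diagnostic(
        level=match.group("level"),
        line=int(match.group("line")),
        column=int(match.group("column")),
        message=match.group("message"),
    )


if __name__ == "__main__":
    sample = [
        "TidyWarning: (6, 1): missing </span> before <table>",
        "TidyWarning: (10, 4): inserting implicit <span>",
    ]
    for entry in sample:
        print(parse_diagnostic(entry))
```

Returning None for lines that don't match, rather than raising, keeps the sketch tolerant of any other text that might appear in the output panel.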
Despite such minor problems, HTML Tidy is a powerful API for parsing, altering, and formatting HTML, and it continues to be developed and refined. As you've seen, it's easy to incorporate into your .NET projects, and it's worth downloading and using for its diagnostics alone. NETTidy uses only a little of its power; there's a lot still left under the hood, so I encourage you to use it as a springboard for further development in your own projects.