Refactor and Validate Again
Looking again at how you have rewritten the original query, you will notice that the introduction of the left outer join changed the query behavior. In order to retain the same query behavior as the original, you should have used an inner join:
SELECT TableA.* FROM TableA
INNER JOIN TableB ON TableA.id = TableB.refid
Here are the results:
id textfield
----------- --------------------
1 Row one
2 Row two
4 Row four
6 Row six
(4 row(s) affected)
You can now use the EXCEPT clause once again to validate your results:
SELECT * FROM TableA
WHERE EXISTS (SELECT * FROM TableB WHERE TableA.id = TableB.refid)
EXCEPT
SELECT TableA.* FROM TableA
INNER JOIN TableB ON TableA.id = TableB.refid
Here are the results:
id textfield
----------- --------------------
(0 row(s) affected)
Remember, you have to reverse the query to confirm that your SQL results are the same:
SELECT TableA.* FROM TableA
INNER JOIN TableB ON TableA.id = TableB.refid
EXCEPT
SELECT * FROM TableA
WHERE EXISTS (SELECT * FROM TableB WHERE TableA.id = TableB.refid)
Here are the results:
id textfield
----------- --------------------
(0 row(s) affected)
You have now proved that the refactored query does return the same results.
Refactoring with No Harm Done
The row-by-row comparison provided by EXCEPT, combined with the number of rows being returned remaining consistent, provides compelling evidence that your refactored query is correct. In DB2, the EXCEPT operator is extended using the EXCEPT ALL syntax, which returns rows where the row is different and if more rows are on the LHS query than on the RHS query.
The technique shown in this article can save time and effort and provide you with certainty that your refactoring exercise has done no harm. If Hippocrates had been a software professional, he would most certainly have approved.