The Two-task Layer ("Two-Task Common") and Globalization
The two-task layer performs datatype and characterset translation operations between the server environment and the client environment. It carries out these conversions only when required. While connecting, it evaluates the differences between internal data and the target characterset.
In case of a database serving clients multilingual data, the NLS_LANG parameter is among the most important. Typically, the characterset of the database is US7ASCII, WE8ISO8859P1, or UTF8. The database characterset denotes the 'range' of data that may be stored in the database. The NLS_LANG value should match the version of the client's operating system. Matching a value to the backend database characterset is a common mistake that prevents the two-task common layer from performing conversions. It assumes that the data flowing from the client to the server is exactly in the format required by the server, and this can lead to data corruption.
Another very common error is mismatching the client characterset and the database characterset. If a mismatch exists, some characters occurring on the client may not be part of the database characterset. In such a case, the data will get inserted in a corrupt fashion. Be particularly aware of this in export/import. Data corruption issues tend to lie low initially, and when they manifest, it is often too late.
Since multiple NLS_LANG registry entries (in Windows) are possible, in addition to an environment variable (Unix has only the environment variable), the best way to find out which value is being used is by typing
"@.[%NLS_LANG%]" at the SQL*Plus prompt. The error message displayed correctly echoes the NLS_LANG being used.
Being aware of NLS_LANG is particularly important because the errors it causes are difficult to trace and pinpoint. If, for instance, certain inserted characters are showing up as '?' (or