State Mandated Change to Data Elements
The project in which I applied these techniques started when the New York State Health Department sent out a document that specified the data elements they wanted in record structures. The document prescribed the acceptable sources and procedures you could use to fill its regulatory reports, leaving you responsible for assuring that your schema and procedures satisfied their requirements.
This immediately called for coordinating three dimensions:
- The specification's name for the data element
- The data source (database, table and column) where the specification's data element could be found
- The implementation's table and column names, from which the data was going to be conditioned, processed, and reported (see Table 1 for a sample of the specification)
The "meta" aspects of the problem were:
- What is the specification's requirement conceptually? (What is its meaning?)
- From where in your current schema is this information available? (What is its source?)
- To where and in what format must you map that attribute output? (What is its destination?)
|Table 1. Sample of the New York State Health Department Specification|
To coordinate these aspects, I used extended properties. I saved the state's specification as text and then re-opened it in Excel so that I could more easily expand the table with attributes we would need further on in the process. The Excel file adds the attributes necessary to actually do the mapping from this document to the data sources that satisfy them to the implementation stores used to report them. Table 2 presents the names of all the columns in the Excel sheet.
|From the Specification
||Destinations for Reporting
||String Creation Elements
||" 'user'," " dbo,"
||" 'table'," Institutional
||Admit date for the stay
|Table 2. The Names of All the Columns in the Excel Sheet|
As mentioned previously, I collected the data in these columns by completing two simple steps:
- Copying the Word document into Excel
- Filling in the values by running
select * from
information_schema.columns in the database that contains the transaction tables and the database that contains the implementation tables
The work is all between the ears here because you have to know your data well enough to effect the mapping. It is tedious, but without it, knowledge of the meaning of the tables and columns in your organization is reduced to oral history, not science.