An organized system for maintaining and delivering configuration information is an essential component of any enterprise software system. For small enterprises with only a few servers, registry entries and individual configuration files offer a very simple configuration management solution; however, as the size and complexity of an organization grow to include many computers running across several physical locations, keeping all the related software components configured correctly quickly becomes a serious challenge. This article describes the strategies available for storing configuration information, and then proposes a configuration system that you can easily implement yourself.
First, consider the attributes required to provide a useful system for managing configuration properties for a large enterprise of servers and software components. The following list prioritizes some key characteristics of such a system.
- Centralized Storage and Management. Whether the actual settings are stored in a central location or whether a tool provides the appearance of centralized storage, settings should be managed from a central location.
- Localized Settings. The system should be able to assign property values at the level where the software will operate: to a specific piece of software (the module), more generically to a hosting server, or even more generically to a location (a physical location such as a primary datacenter or a disaster recovery facility). This feature supports (for example) setting a log file directory at the server level (all software modules on the same server will log to the same directory) and assigning an error reporting web service at the location level (all software modules running on all servers at a location will share the same value).
- Reliability. Even in the event of occasional network or database failures, the software components must be able to obtain the configuration information required to operate “properly.” While it’s likely that the software will not be able to perform fully during such failures, the available configuration should at least give it the capability to log errors, reroute messages to operational components, or queue work requests until the disturbance has passed.
- Scalability. Even when high numbers of servers and software components are reading configuration information very rapidly, the configuration system should scale with increasing demand.
Storage and Distribution Strategies
Historically, there have been four main ways of storing software configuration information: the registry, .config and .ini files, centralized databases, and custom text or XML files. These strategies all have their strengths (measured in terms of ease of modification, tools availability, retrieval speed, etc.), but they all become problematic as the size and complexity of the enterprise software environment grow (see Table 1).
|Storage strategy|Strengths|Weaknesses|
|---|---|---|
|Registry|Local storage, fast reads, and immune to network and database failures|Large numbers of servers can be very burdensome to maintain|
|Config files|Local storage, fast reads, and immune to network and database failures|Very difficult to maintain due to lack of tools|
|Centralized database|Centralized management|Slow reads, difficult to scale to high numbers of servers|
|Custom files|Local storage, fast reads, and immune to network and database failures|Very difficult to maintain due to lack of tools|
Building a Configuration System
The configuration system described here has two main goals: central configuration storage, and enterprise-wide configuration property distribution.
|Author’s Note: This article uses the following terminology. A “module” is any logical piece of software that is operated or, more importantly, configured as a single entity. A “location” is either a physical or logical group of servers that are configured similarly. A “property value” is a named piece of information affecting the operation of a software system.|
Storing Configuration Data
Before investigating the process of making configuration data available to distributed software components, you need to understand how the data is stored centrally. The configuration system allows a property value to be assigned to any (or none) of the following elements: software module, server, and location.
|Figure 1. Data Model: The data model for the proposed configuration system supports linking property values to modules, servers, and locations.|
Figure 1 shows the data model for the proposed configuration system. The escProperties table gives each property a name (the PropertyLabel field) and defines how often to update the local “caches” of these property values (the ValidFor field). The escModules table defines the various software modules that comprise the enterprise system. The escServers and escLocations tables define the servers and locations that operate as part of the enterprise software system. Note that the foreign key relationship between the locations and servers tables lets the configuration system automatically detect the location given a server name.
In the center of Figure 1, the escPropertyValues table provides a way to assign values to properties. The table’s construction allows a property value to vary depending on the associated module, server, and location. To support property value assignments made without associating them with a specific module, server, or location, you may supply a zero for those values; this gives the table a wildcard matching capability. As an example, inspect the set of escPropertyValues records shown in Table 2.
The first record in Table 2 holds a property value of \\SVRNAME\logfiles\abc for a property with the ID 1017 belonging to module 1113 running on server 729 at location 2239. Similarly, the second row provides the property value \\SVRNAME\logfiles for a property with an ID of 1017, for any module (0) or server (0) at location 2239. Row 3 returns \\OTHERSVRNAME\logfiles when any module running on server 729 at location 2239 requests the value of property ID 1017. Likewise, when module ID 1113 requests the value for property 1017 while running on any server at location 2239, the return value will be \\SVRNAME\logfiles\123. Finally, the last row demonstrates a full wildcard record: the value \\SVRNAME\logfiles\app01 will be returned to any module requesting the value of property 1017 anywhere in the enterprise.
When querying for a property value record, client software must provide the module ID, the server name, and the name for the value it’s attempting to retrieve. The escServers table’s relationship with the escLocations table provides the location of the calling server. Using these values, the SQL query shown below will retrieve the most specific property value based on the input data:
    -- Resolve the calling server by name, then keep only rows that match
    -- the caller exactly or hold the wildcard value 0 in each column.
    SELECT TOP 1 PropertyLabel, PropertyValue, ValidFor
    FROM escPropertyValues pv
    JOIN escProperties p ON pv.PropertyID = p.PropertyID
    JOIN escServers s ON s.ServerName = 'CORP_SVR'
    LEFT JOIN escLocations l ON s.LocationID = l.LocationID
    WHERE p.PropertyLabel = 'Log File Location'
      AND (pv.ModuleID = 1017 OR pv.ModuleID = 0)
      AND (pv.ServerID = s.ServerID OR pv.ServerID = 0)
      AND (pv.LocationID = s.LocationID OR pv.LocationID = 0)
    -- Non-zero (specific) IDs sort ahead of the wildcard 0.
    ORDER BY pv.ModuleID DESC, pv.ServerID DESC, pv.LocationID DESC
To select only the records that are relevant to the client’s request, the query filters only records that either match the specified search criteria exactly, or whose field value is 0 (the wildcard setting). In the preceding query, the three filter expressions in the WHERE clause that reference the module, server, or location provide this functionality. Notice that each clause matches either the specific value provided by the client or a wildcard value.
To select the record that is most specific to the search criteria specified by the client, the query gives precedence to criteria matches in the following prioritized order: module, server, location. It does so by ordering the result set by ModuleID, ServerID, and LocationID in descending order. Non-zero values sort higher in the results than the wildcard zero, which means that by selecting only the TOP 1 record, the query returns the single property value that applies most specifically to the software retrieving it.
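The precedence rules can be tried out with a small, self-contained sketch. It is not the article's sample code: it uses Python's built-in `sqlite3` module instead of SQL Server (so `LIMIT 1` stands in for `TOP 1`), looks properties up by ID rather than by label, and keeps only two of the tables from Figure 1. The rows are the five records described for Table 2; the server names other than `CORP_SVR`, and the mapping of `CORP_SVR` to server 729, are assumptions made for illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database holding the five example rows for property 1017."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE escServers (
            ServerID INTEGER PRIMARY KEY, ServerName TEXT, LocationID INTEGER);
        CREATE TABLE escPropertyValues (
            PropertyID INTEGER, ModuleID INTEGER, ServerID INTEGER,
            LocationID INTEGER, PropertyValue TEXT);
    """)
    # Hypothetical servers: two at location 2239, one at an assumed second location.
    db.executemany("INSERT INTO escServers VALUES (?, ?, ?)", [
        (729, "CORP_SVR", 2239),
        (730, "CORP_SVR2", 2239),
        (801, "DR_SVR", 3000),
    ])
    # A zero in ModuleID, ServerID, or LocationID acts as a wildcard.
    db.executemany("INSERT INTO escPropertyValues VALUES (?, ?, ?, ?, ?)", [
        (1017, 1113, 729, 2239, r"\\SVRNAME\logfiles\abc"),
        (1017, 0, 0, 2239, r"\\SVRNAME\logfiles"),
        (1017, 0, 729, 2239, r"\\OTHERSVRNAME\logfiles"),
        (1017, 1113, 0, 2239, r"\\SVRNAME\logfiles\123"),
        (1017, 0, 0, 0, r"\\SVRNAME\logfiles\app01"),
    ])
    return db


def lookup(db, property_id, module_id, server_name):
    """Return the most specific value for the caller, or None if nothing matches."""
    row = db.execute("""
        SELECT pv.PropertyValue
        FROM escPropertyValues pv
        JOIN escServers s ON s.ServerName = ?
        WHERE pv.PropertyID = ?
          AND (pv.ModuleID = ? OR pv.ModuleID = 0)
          AND (pv.ServerID = s.ServerID OR pv.ServerID = 0)
          AND (pv.LocationID = s.LocationID OR pv.LocationID = 0)
        ORDER BY pv.ModuleID DESC, pv.ServerID DESC, pv.LocationID DESC
        LIMIT 1
    """, (server_name, property_id, module_id)).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    demo = build_demo_db()
    for module_id, server in [(1113, "CORP_SVR"), (2000, "CORP_SVR2"), (2000, "DR_SVR")]:
        print(module_id, server, lookup(demo, 1017, module_id, server))
```

The sketch resolves the calling server by name in its own join, so that a row with a wildcard server can still be compared against the caller's location.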
Together, the tables illustrated in Figure 1 and queries formulated like the one shown above provide a basic system for retrieving the most appropriate configuration data from a central location, and fulfill the primary goal of providing a flexible configuration system. This method allows users to assign a property value either specifically or non-specifically as appropriate. Obviously, it wouldn’t be difficult to write a simple configuration tool that would help users make such property assignments.
Remember, one primary goal of a configuration system is to provide centralized management. The proposed configuration system is based on a centralized SQL Server database. As discussed in the previous section, though, this can lead to scalability problems with large numbers of client systems. Even more troubling, when database or network connectivity fails, client software will be unable to retrieve any configuration information, leading to software failures throughout the enterprise.
To solve that problem, the next phase of this article discusses a strategy for taking the configuration values from the centralized database and distributing the values to all the client servers.
Distributing Configuration Values
The proposed system uses a very simple caching methodology that stores values retrieved from the centralized database locally on the client servers for later retrieval. To keep the process flexible, the class framework illustrated in Figure 2 helps local clients read configuration values from the centralized database, store them in a local cache (the Windows Registry), and keep the values updated appropriately.
|Figure 2. Property Class Framework: This extensible class framework manages both remotely and locally cached property values.|
This configuration caching strategy provides another important advantage. Many applications traditionally retrieve their configuration values at startup; if the central storage system is unavailable at that moment and no local copy exists, the application either cannot start or is left in a precarious state. A persistent local cache lets the application start from previously retrieved values, which results in higher availability.
|Figure 3. Property Value Flowchart: This flowchart depicts the nominal flow that results when a client reads a property value.|
At the heart of the framework is a PropertyValue class that client applications use to read property values. The clients can remain unaware that configuration information is managed remotely and cached locally. Under the covers, the PropertyValue class first checks the local cache for the requested property value. If the value is found locally, it quickly returns the value to the client; otherwise, the class retrieves the value from the central database, stores it locally, and then returns the value to the client. Figure 3 shows the high-level flow used by the PropertyValue class.
The PropertyValue class has a Lifetime property that determines whether cached values have expired. Expiration, in this case, simply means the value should be re-retrieved from the central database. This aging process forces periodic updates to the local copies of property values across all the machines in the enterprise. The escPropertyValues table’s ValidFor field contains the number of seconds before a local property value must be refreshed. When a client requests an expired value, the class refreshes the value from the central database and updates the local value before returning the value to the client.
To cover situations when the central database is unreachable, the class behaves as follows:
- When a property value has not yet been stored locally, the PropertyValue class returns a null value.
- When a property has been stored locally, but has expired, the PropertyValue class returns the locally-stored value.
This scheme both keeps data as current as possible and makes it highly likely that a requested property value will be found locally in the unlikely event of an unreachable central database server.
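The read path and the two fallback rules can be summarized in a short sketch. This is an illustration under stated assumptions, not the sample project's PropertyValue class: an in-memory dictionary stands in for the registry, and `PropertyReader`, `RemoteUnavailable`, and the `fetch_remote` callback (which returns a value and its ValidFor period in seconds) are hypothetical names.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


class RemoteUnavailable(Exception):
    """Raised by the (hypothetical) remote fetcher when the central database is unreachable."""


@dataclass
class CachedValue:
    value: str
    valid_until: float  # epoch seconds after which the value should be refreshed


class PropertyReader:
    """Check the local cache first, refresh expired values, fall back when remote is down."""

    def __init__(self,
                 fetch_remote: Callable[[int, str], Tuple[str, float]],
                 clock: Callable[[], float] = time.time):
        self._fetch_remote = fetch_remote  # returns (value, valid_for_seconds)
        self._clock = clock
        self._cache: Dict[Tuple[int, str], CachedValue] = {}  # stand-in for the registry

    def get_value(self, module_id: int, label: str) -> Optional[str]:
        key = (module_id, label)
        cached = self._cache.get(key)
        if cached is not None and self._clock() < cached.valid_until:
            return cached.value  # fresh local copy, no database round trip
        try:
            value, valid_for = self._fetch_remote(module_id, label)
        except RemoteUnavailable:
            # Never stored locally: return None. Stored but expired: return the stale copy.
            return cached.value if cached is not None else None
        self._cache[key] = CachedValue(value, self._clock() + valid_for)
        return value
```

Injecting the clock and the remote fetcher keeps the expiry logic testable without a database or real waiting.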
Many readers may protest that the solution proposed here implements the central repository using SQL Server. Even more may protest using the Windows Registry as the local repository of cached property values. I could argue that these are probably the best all-purpose choices for central and local storage, respectively. However, the class framework displayed in Figure 2 supports pluggable interchangeability of both the local and remote storage mechanisms. In most cases, SQL Server or another enterprise-class relational database is the best choice for the central repository for property values. For local storage, however, in cases where a server hosts mostly web applications, it may make more sense to employ the ASP.NET cache.
The extensibility mechanisms built into both the LocalStorageClient and the RemoteStorageClient work nearly identically. You can quickly extend the framework to support other local and remote storage mechanisms. For example, suppose you want to store local configuration properties in an XML file. Here’s the procedure.
First, create a new class that inherits from LocalStorageClient, and implement the two MustOverride methods: LocalPropertyValue_Get and LocalPropertyValue_Set. These two methods read and write PropertyValue objects locally. To use an XML file for storage, you will need to write code to support opening and closing the XML file, reading and writing to the XML file, and so forth. The BuildPropertyValue method of the LocalStorageClient base class aids in building a PropertyValue object from the values retrieved from the XML file.
Note that there’s no reference to the mechanism used by the RemoteStorageClient. Nor is there any code used to check whether the locally stored value has expired. That’s because the PropertyValue class handles those tasks. If a property value must be updated from remote storage, the PropertyValue class coordinates the interaction between the RemoteStorageClient and LocalStorageClient as shown in Figure 3.
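To make the extension procedure concrete, here is a minimal sketch of an XML-backed local storage client. It mirrors the shape described above (an abstract base class with a get method, a set method, and a build helper) but is written in Python rather than the sample project's language, and the XML layout, the `PropertyValue` fields, and the method names in snake_case are all assumptions made for illustration.

```python
import os
import xml.etree.ElementTree as ET
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


@dataclass
class PropertyValue:
    module_id: int
    label: str
    value: str
    valid_until: float  # epoch seconds; expiry checks happen outside the storage client


class LocalStorageClient(ABC):
    """Base class: subclasses only implement reading and writing."""

    @abstractmethod
    def local_property_value_get(self, module_id: int, label: str) -> Optional[PropertyValue]:
        ...

    @abstractmethod
    def local_property_value_set(self, prop: PropertyValue) -> None:
        ...

    def build_property_value(self, module_id, label, value, valid_until) -> PropertyValue:
        # Shared helper that converts raw stored strings into a PropertyValue.
        return PropertyValue(int(module_id), label, value, float(valid_until))


class XmlLocalStorageClient(LocalStorageClient):
    """Stores each property as a <property> element in a single XML file."""

    def __init__(self, path: str):
        self._path = path

    def _load(self) -> ET.ElementTree:
        if os.path.exists(self._path):
            return ET.parse(self._path)
        return ET.ElementTree(ET.Element("properties"))

    @staticmethod
    def _find(root, module_id, label):
        for node in root.findall("property"):
            if node.get("module") == str(module_id) and node.get("label") == label:
                return node
        return None

    def local_property_value_get(self, module_id, label):
        node = self._find(self._load().getroot(), module_id, label)
        if node is None:
            return None
        return self.build_property_value(
            module_id, label, node.text or "", node.get("validUntil"))

    def local_property_value_set(self, prop):
        tree = self._load()
        root = tree.getroot()
        node = self._find(root, prop.module_id, prop.label)
        if node is None:
            node = ET.SubElement(root, "property",
                                 module=str(prop.module_id), label=prop.label)
        node.set("validUntil", repr(prop.valid_until))
        node.text = prop.value
        tree.write(self._path, encoding="utf-8", xml_declaration=True)
```

The sketch rewrites the whole file on every update and does no file locking, which is acceptable for a demonstration but would need attention if several processes shared the file.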
Using the Sample Project
The downloadable sample code contains a fully functional project that retrieves property values from a central SQL Server database and stores them locally in the computer’s registry. The property values stored locally honor the ValidFor setting in the escPropertyValues table by updating the local cache from the SQL Server system as specified.
A call to the PropertyValue class’s shared GetValue method is the entry point into the system. You only need to pass GetValue a ModuleID and the name of the property you want to retrieve. The RemoteStorageClient class retrieves the server name, and the database infers the location from the server name through the escServers table. The system returns a PropertyValue object for the requested property if that property value can be retrieved from either local or remote storage.
Populating the Tables
Initially, to populate the database tables, create escProperties rows for each property you plan to implement. Do not create a property row with a property ID of zero, because wildcards are not supported in the property name. Property values will vary widely from one enterprise to the next, but most facilities are likely to have property labels such as LogFileLocation, DebugLevel, and ErrorReportingWebServiceURL.
Next, create a row in the escModules table for each module you want to configure separately. You will also want to create a wildcard row with a ModuleID of zero. My recommendation is to create a module row for every assembly you run on a machine, not just for executables. This will allow you to add specific configuration information for each piece of software on the system.
Finally, add rows to the escServers and escLocations tables for each server and location in your enterprise. Also add wildcard rows to both of these tables for the reasons described above.
The downloadable code includes a sample client application, named escTest. You can use this client to help understand how the framework operates, and to test new local and remote storage frameworks. You can also use it to verify that the SQL Server database has been configured correctly.
After you have retrieved several values and have verified that subsequent retrievals are being returned from the local cache, open the registry and inspect the local property values. Using a registry editor, navigate to HKEY_LOCAL_MACHINE\Software\EnterpriseSoftwareConfiguration. You should find sub-keys for each ModuleID used to retrieve property values. Under each ModuleID sub-key, you will find separate sub-keys for each property label whose values have been retrieved. The PropertyValue and ValidUntil registry values are stored under the property label sub-keys. Figure 4 shows the “Log File Location” property for ModuleID 1003 stored hierarchically in the registry.
|Figure 4. Local Registry Store: The LocalStorageClient_Registry class stores property values hierarchically using the ModuleID and PropertyLabel properties.|
One implication of storing all property values under the module ID is that property values shared by all modules (those with a ModuleID of 0 in the escPropertyValues table) will be stored redundantly. In other words, each ModuleID that queries for the same property value will store its own copy. That’s an acceptable price to pay, because the scheme has several advantages. In addition to being a very simple way to implement the storage, it also simplifies the process of debugging property value storage. Bear in mind that just because a value can be assigned a ModuleID of zero doesn’t mean that all instances of the property value will be the same. The model supports setting a generic property value with a ModuleID of zero, while still letting selected modules have a specific value for the same property.
You should also note that use of the registry for local cache storage is available only on Windows platforms. Other platforms will require a different local storage solution.
Implementing the Project in Your Enterprise
I have intentionally kept the functionality and source code of the sample project simple to illustrate how the system can work flexibly in any organization. Before you implement it, be sure you’re doing it for the right reasons. Remember that the proposed framework is not intended to help cache frequently used application data; in other words, don’t consider it as a general-purpose cache. Many of the techniques used to build the framework are borrowed from such caching designs, but you’d have to optimize the design of the local and remote storage clients differently for higher-volume use or for storing larger chunks of data. As described here, the system functions solely as an alternative to managing software configuration via the registry and/or .NET’s app.config files.
In addition, you should consider making at least some of the modifications in the following list before deploying the project in your organization:
- It would be useful to create a basic web-based configuration tool to make configuration easier for the end-user. The tool should support the use of wildcards in property value assignments.
- If you do not use SQL Server, you should add a RemoteStorageClient class specific to your central configuration repository. Whether you’re modifying the code to use Oracle, the registry, or LDAP, you will find the modification process straightforward, and similar to the client repository modification discussed in this article.
- If your enterprise operates on several different platforms, such as Windows 2000, Windows 2003, or Linux, consider adding an escPlatforms table and a Platform column to the escPropertyValues table so you can support specific property values for each platform. If you do that, remember to link the escServers table to the new escPlatforms table in a manner similar to the link between escServers and escLocations.
- If you operate from a single location, consider removing support for the escLocations table to simplify queries and configuration.
- If you aren’t happy with the prospect of using the registry to store locally cached property values, consider adding a LocalStorageClient to support whatever local storage mechanism you wish.
- Consider adding support for more than one type of LocalStorageClient. As mentioned previously, if the bulk of the applications on a machine are ASP.NET applications, it may make sense to use ASP.NET’s built-in caching for local storage. If configured properly, you could modify the LocalStorageClient’s base class factory method (GetLocalStorageClient) to return a different LocalStorageClient implementation depending on the client application’s identity. Keep in mind, using a strictly in-memory local cache will not help in cases where the application stops and restarts. In such cases, the cached values are lost and will need to be reseeded locally.
- Depending on how your applications are configured, you may consider adding an attribute to the escPropertyValues and/or escProperties tables to support “critical” properties. As coded in the sample project, the normal operation of the LocalStorageClient for a property that cannot be refreshed after its ValidFor period ends is to return the expired property value. However, you could alter this so that when a property marked “critical” expires, the system would return a null value when the property cannot be retrieved from the central repository.
- Using your enterprise’s standard error reporting framework, add support for logging errors when trying to retrieve values from the repository. Error reporting was not implemented in the sample project to keep the project simple.
- Don’t be limited by the perception that a configuration value is generally assigned to a module. While most modules will represent a physical piece of software, a module can also represent an instance of a piece of software. For example, if you have a service that loads multiple instances of the same or similar assemblies, the configuration for the service could instruct it to load specific modules, each of which then reads its own configuration settings. Extending this idea adds greatly to the ways this system can be used to configure your enterprise systems.
- You could implement a local Windows service to help by pre-fetching property values before they expire. By retrieving property values before they expire, the client application will seldom experience the additional delay caused when a local property value must be refreshed.
- Using the Windows service described above, it would probably be helpful to probe local storage, removing property values that are more than a month old. These values are probably not used any longer and should be removed from the system.
- One problem with caching property values locally is that they become out of date when the master copy of the value in the central repository changes. For example, if the ValidFor property is set to one hour, you will likely have servers running with old values for up to an hour. Therefore, another use for a local service would be to receive notifications when property values should be updated immediately. With such a service, you could reduce the time required to push new central values to all the client systems to mere seconds.
- One weakness in the implementation of the sample project is that when the central database is unreachable, all attempts to retrieve expired property values will block, waiting for the database connection to time out. To keep the local storage client responsive, you should implement a flag in the local cache that prevents unnecessary connections to the database when it is known to be unresponsive. As an example, using the registry version of the local storage client, you could add a flag to the root of the EnterpriseSoftwareConfiguration sub-key to indicate that the central database is unresponsive, along with a time after which the system should try connecting again. In this scheme, all attempts to read from remote storage would return null, meaning that property requests would simply return the version stored in the local cache. The flag would limit attempts to access the central database. For example, assuming you set the flag to one minute, only one client per minute would attempt to access the database while it is marked offline. The first client that successfully makes a central connection and retrieves a property would clear the flag, and normal property refreshes would resume.
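The offline flag described in the last item above can be sketched as a small gate in front of the remote read. This is one possible shape, not part of the sample project: `OfflineGate` and `read_remote` are hypothetical names, the flag is kept in process memory rather than in the registry (so it is not shared between processes as the registry version would be), and a `ConnectionError` stands in for whatever exception the database client raises on a timeout.

```python
import time
from typing import Callable, Optional


class OfflineGate:
    """Tracks whether the central database is marked offline and when to retry."""

    def __init__(self, retry_after_seconds: float = 60.0,
                 clock: Callable[[], float] = time.time):
        self._retry_after = retry_after_seconds
        self._clock = clock
        self._retry_at: Optional[float] = None  # None means "not marked offline"

    def should_attempt(self) -> bool:
        if self._retry_at is None:
            return True
        if self._clock() >= self._retry_at:
            # Move the retry time forward so only this caller probes in the next window.
            self._retry_at = self._clock() + self._retry_after
            return True
        return False

    def record_failure(self) -> None:
        self._retry_at = self._clock() + self._retry_after

    def record_success(self) -> None:
        self._retry_at = None


def read_remote(gate: OfflineGate, fetch: Callable[[], str]) -> Optional[str]:
    """Return the remote value, or None when the database is offline or skipped."""
    if not gate.should_attempt():
        return None  # caller falls back to the locally cached value
    try:
        value = fetch()
    except ConnectionError:
        gate.record_failure()
        return None
    gate.record_success()
    return value
```

With a retry period of 60 seconds, at most one call per minute reaches the database while it is marked offline, and the first successful read clears the flag.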
As your organization grows, you will almost certainly feel the pain of keeping software configuration properties on multiple servers synchronized. Consider using the project provided in the sample code as a starting point for building your own enterprise-ready software configuration framework, extending it to fit the specific needs of your enterprise.