Data Validation Using .NET and External Metadata

Using .NET reflection and external metadata makes it easy to add data validation to your objects. Nearly every application that collects data, whether from a Windows- or Web-based form or from a file, needs to validate that the data is in the correct format.

Programmers have developed their own methods of validating input data, with varying degrees of complexity, but most of these methods code the data validation rules into the procedural or object code. As a result, you have to recompile and redistribute your application each time the validation rules change.

Some have attempted to build a better mousetrap over the years, and some of those efforts have resulted in a somewhat improved ability to separate the validation from the rest of the processing so that the rules could change without having to rebuild the entire application. Storing the validation rules externally, such as in a SQL Server database, is one way to accomplish such a task. However, that only solves part of the problem, as the underlying code may still have to change to support database changes. Coding validation rules in DLLs apart from the rest of the system allows you to update the rules without rebuilding the entire application, but you have to stop the application to replace the DLLs. In addition, the inherent lack of Type safety in calling DLL methods and having to load them using LoadLibrary and GetProcAddress results in code that is difficult to understand and maintain.

.NET makes this task easier. At Brierley & Partners we’ve developed a set of classes and interfaces to perform data validation in a fraction of the time it would take using traditional C++ and/or COM components. In this article I’ll show you how we can update our business rules on a moment’s notice without having to shut down the application, allowing us to react more quickly to changing business conditions.

.NET Reflection Is the Key
One of the many advantages to developing systems in .NET is the System.Reflection namespace classes.

Take a look at this sample code to dynamically determine information about a Type at run time:

   // Requires: using System; using System.Reflection;
   Assembly assembly = Assembly.Load(strAssemblyName);
   Type[] arrTypes = assembly.GetTypes();
   foreach (Type t in arrTypes)
   {
      // Public instance methods declared on this Type itself
      MethodInfo[] arrMI = t.GetMethods(
         BindingFlags.Public |
         BindingFlags.Instance |
         BindingFlags.DeclaredOnly);
      foreach (MethodInfo mi in arrMI)
      {
         ParameterInfo[] arrPI = mi.GetParameters();
         foreach (ParameterInfo pi in arrPI)
         {
            Console.WriteLine(pi.Name);
         }
      }
   }

The code calls the GetTypes method on an Assembly object to return all the Types contained in that assembly. From each Type you can query Type information, including its constructors, methods, and interfaces, whether it is abstract, and more. All this information is stored in the assembly’s metadata, which the compiler emits at compile time. The .NET reflection classes read that metadata to report what the assembly contains.

Another important class, System.Activator, contains methods to create types of objects locally or remotely, or obtain references to existing remote objects. You can use the information obtained using reflection to dynamically create an instance of a class located in an external assembly. Once you’ve created that instance you can call any public method passing the information retrieved.

   // "type" is the Type object for MyClass, obtained through reflection
   int param = 2;
   object[] objCtorArgs = new object[] { param };
   MyClass myObject = (MyClass)
      Activator.CreateInstance(type, objCtorArgs);

Now that you have an instance of the class you can call any method contained in that class. However, doing so requires that you know in advance the name of the method and its signature. .NET provides another way to invoke members from an instance: the Type class.

System.Type, the root of all reflection operations, represents a Type inside the system. System.Type is an abstract base class that allows multiple implementations. The system always provides the derived class RuntimeType. In reflection, all classes beginning with the word “Runtime” are created only once per object in the system and support comparison operations.

Using the Activator class and the Type class you can create an instance of a specific Type and invoke a member.

   int param = 2;
   object[] objCtorArgs = new object[] { param };
   object myObject = Activator.CreateInstance(
      type, objCtorArgs);
   object[] args = new object[] { 100.09, 184.45 };
   object result;
   // BindingFlags.Instance is needed so that the lookup
   // includes instance methods
   result = type.InvokeMember("ComputeSum",
      BindingFlags.Public | BindingFlags.Instance |
      BindingFlags.InvokeMethod,
      null, myObject, args);

This sample will invoke a method called ComputeSum on an instance of the specified Type contained in an assembly.
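To make the fragments above concrete, here is a minimal self-contained sketch of the same create-then-invoke sequence. The Calculator class, its constructor argument, and the LateBindingDemo helper are hypothetical, and the target Type is looked up in the running assembly only to keep the example in one file; in the scenario this article describes it would come from a separately loaded assembly.

```csharp
using System;
using System.Reflection;

// Hypothetical target class; in practice it would live in an
// external assembly loaded at run time.
public class Calculator
{
    private readonly int m_scale;

    public Calculator(int scale)
    {
        m_scale = scale;
    }

    public double ComputeSum(double a, double b)
    {
        return (a + b) * m_scale;
    }
}

public static class LateBindingDemo
{
    // Creates an instance of the named Type and invokes a method by name.
    public static object CreateAndInvoke(string typeName, string methodName,
        object[] ctorArgs, object[] methodArgs)
    {
        // Assumption: the Type lives in this assembly. Throws if not found.
        Type type = typeof(LateBindingDemo).Assembly.GetType(typeName, true);

        // Activator picks the constructor that matches ctorArgs.
        object instance = Activator.CreateInstance(type, ctorArgs);

        // Instance must be specified together with Public, otherwise
        // the member lookup finds nothing.
        return type.InvokeMember(methodName,
            BindingFlags.Public | BindingFlags.Instance |
            BindingFlags.InvokeMethod,
            null, instance, methodArgs);
    }

    public static void Main()
    {
        object result = CreateAndInvoke("Calculator", "ComputeSum",
            new object[] { 2 }, new object[] { 100.09, 184.45 });
        Console.WriteLine(result);
    }
}
```

Nothing in CreateAndInvoke refers to Calculator at compile time; the Type name, method name, and arguments are all plain data, which is what allows them to be supplied from outside the program.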

The .NET Framework provides a series of classes that make it easy to load external assemblies, query the assembly metadata for Type information, create instances of objects of those Types, and invoke members. However, the samples so far still hard-code the assembly, Type, and method names, so changing any of them means recompiling the application. We need a level of abstraction that lets the application find out which methods to call, and with which parameters, at run time instead of compile time.

Data Validation
You’ve seen how to use .NET classes to create instances of objects and invoke methods on those instances dynamically. Now you need to provide a layer of abstraction so that the necessary information is available at run time instead of compile time.

In order to unload a dynamically loaded assembly, you must load the assembly into a separate application domain. Use the static method AppDomain.CreateDomain. Then, use the returned AppDomain’s Load method to load an assembly into the domain. When you want to unload the domain, use the static method AppDomain.Unload. This will allow you to unload the validation assemblies in that app domain from the currently running program and replace them with updated versions.
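As a rough illustration of the load-and-unload cycle described above, the sketch below creates a second application domain, loads an assembly from inside it, and then unloads the domain. It assumes the .NET Framework (AppDomain.CreateDomain throws PlatformNotSupportedException on .NET Core and later), and ValidatorLoader and AppDomainDemo are hypothetical names. The loading is done by a MarshalByRefObject helper running inside the new domain, because calling Load on the returned AppDomain from the original domain also loads the assembly into the caller's domain, where it cannot be unloaded.

```csharp
using System;
using System.Reflection;

// Hypothetical helper. Because it derives from MarshalByRefObject,
// calls made from the main domain execute inside the domain that
// created the instance.
public class ValidatorLoader : MarshalByRefObject
{
    // Loads the assembly into the helper's own domain and returns
    // only a string, so no Assembly object crosses the boundary.
    public string LoadAndDescribe(string assemblyName)
    {
        Assembly asm = Assembly.Load(assemblyName);
        return asm.FullName;
    }
}

public static class AppDomainDemo
{
    public static string LoadInSeparateDomain(string assemblyName)
    {
        AppDomain domain = AppDomain.CreateDomain("Validators");
        try
        {
            ValidatorLoader loader =
                (ValidatorLoader)domain.CreateInstanceAndUnwrap(
                    typeof(ValidatorLoader).Assembly.FullName,
                    typeof(ValidatorLoader).FullName);
            return loader.LoadAndDescribe(assemblyName);
        }
        finally
        {
            // Unloading the domain releases every assembly that was
            // loaded only into that domain.
            AppDomain.Unload(domain);
        }
    }

    public static void Main()
    {
        // Uses this program's own assembly so the sketch needs no
        // external files.
        Console.WriteLine(LoadInSeparateDomain(
            typeof(ValidatorLoader).Assembly.FullName));
    }
}
```

Once the domain is unloaded, the assembly file is no longer held by the process and can be replaced with an updated version before a new domain is created.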

We’ll use XML as the data markup language for building the data validators as well as for passing parameters to the validation routine. This allows your application to have a common data representation when communicating between modules. Internally the modules may perform transformations on the data, but the caller never knows about that transformation.

We’ll use interfaces to implement our validation. The syntax for interface declarations is similar to that for class declarations. An interface is like a class in which every member is abstract; it contains only member declarations, such as properties and methods, without function bodies. An interface may not contain field declarations, initializer declarations, or nested class declarations. An interface can extend other interfaces. A class may extend only one base class, but it may implement many interfaces. Implementing multiple interfaces gives a class a form of multiple inheritance that is simpler than that of other object-oriented languages such as C++.

Using an interface to provide validation methods provides an easy way to determine if a particular class object is to be validated.

   IValidator v = foo as IValidator;
   if (null != v)
   {
      v.Initialize(
         bar.GetValidationData(foo.ToString()));
   }

In this example foo is an instance of a class of Type Foo. Foo implements the IValidator interface, so v is not null. You can then call a method on the interface using the instance of the class. If Foo did not implement the IValidator interface, v would be null, and you could not call the interface method.

Note the use of the as operator in the example above. The as operator acts like a cast except that it yields null on conversion failure instead of raising an exception. More formally, an expression of the form:

   expression as type

is equivalent to…

   expression is type ? (type)expression : (type)null

…except that expression is evaluated only once.

The is operator checks whether the run-time type of an object is compatible with a given type. You use it in an expression of the form expression is type, which returns true if the object can be converted to that type without throwing an exception. After a successful is test you still have to cast the object to the type, so the type is checked twice. The as operator performs the check and the conversion in a single step, which makes it more efficient than is followed by a cast.
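The difference between the two operators is easier to see side by side. In this small sketch the IValidator interface is a hypothetical, pared-down stand-in with a single Validate method (the interface used in this article also declares Initialize), and Foo, Plain, and CastDemo exist only for illustration.

```csharp
using System;

// Hypothetical, pared-down interface for illustration only.
public interface IValidator
{
    bool Validate();
}

public class Foo : IValidator
{
    public bool Validate() { return true; }
}

// A class that does not implement IValidator.
public class Plain { }

public static class CastDemo
{
    // "is" followed by a cast: the type is checked twice.
    public static bool ValidateWithIs(object candidate)
    {
        if (candidate is IValidator)
        {
            IValidator v = (IValidator)candidate;
            return v.Validate();
        }
        return false;
    }

    // "as" followed by a null test: the type is checked once.
    public static bool ValidateWithAs(object candidate)
    {
        IValidator v = candidate as IValidator;
        return v != null && v.Validate();
    }

    public static void Main()
    {
        Console.WriteLine(ValidateWithAs(new Foo()));   // prints "True"
        Console.WriteLine(ValidateWithAs(new Plain())); // prints "False"
    }
}
```

Both methods return the same results; the as version simply avoids the second type check and never throws on a failed conversion.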

Now that you know how to use the .NET classes to dynamically load Types and invoke methods at run time, let’s define the data validation rules and store that information. I’ll refer to the data validation rules as metadata; that is, data that describes other data.

For Brierley & Partners’ projects, we store metadata in a series of tables in a Microsoft SQL Server database. Using a series of metadata classes, we load the information from the database and make it available to any program that uses these classes. The sample code presented in the next section uses a simplified XML file to hold the validation information that applications will load at run time.

Deciding where to store the metadata is a secondary consideration. What is important is storing the information you need to do the actual validation. At a minimum you need to know the assembly name, the class Type, and the method name you want to invoke. You’ll also need to know what parameters to pass to the validator and which class members you will be validating. Listing 1 demonstrates an XML document containing the data validation information.

This sample validation file defines the validation that will occur for a class named “Class2.” There is one validator defined that uses a class Type StringValidation contained in an assembly named StringValidation. The validator method to call is named “Validate.” It has four parameters defined: Min, Max, Action, and Value. Min specifies the minimum length of the string, Max the maximum length of the string, Action specifies what action (if any) to perform on the string (more about this later), and Value, which is the value being passed to the validator. Each parameter has a type, which indicates that the parameter is a constructor parameter, an Invocation parameter (used in the Validate method), or both. The “Value” node of each parameter is used by the validator to perform initialization in the case of constructor parameters, and as the source data for the Validate method.
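To show how an application might read such a file, the sketch below parses a small validation document and pulls out the assembly name, class Type, method name, and parameter count. The element and attribute names, the parameter values (1, 20, Trim), and the member name m_strName are all assumptions based only on the description above; they are not taken from Listing 1.

```csharp
using System;
using System.Xml;

public static class MetadataDemo
{
    // Hypothetical layout. Names and values are illustrative only.
    public const string SampleXml =
        "<Validation>" +
        "<Class Name='Class2'>" +
        "<Validator Assembly='StringValidation'" +
        " Type='StringValidation' Method='Validate'>" +
        "<Parameter Name='Min' Type='Constructor'>" +
        "<Value>1</Value></Parameter>" +
        "<Parameter Name='Max' Type='Constructor'>" +
        "<Value>20</Value></Parameter>" +
        "<Parameter Name='Action' Type='Invocation'>" +
        "<Value>Trim</Value></Parameter>" +
        "<Parameter Name='Value' Type='Invocation'>" +
        "<Value>m_strName</Value></Parameter>" +
        "</Validator>" +
        "</Class>" +
        "</Validation>";

    // Returns "assembly|type|method|parameterCount" for the first
    // validator defined for the class, or null if there is none.
    public static string Describe(string xml, string className)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        XmlElement validator = (XmlElement)doc.SelectSingleNode(
            "/Validation/Class[@Name='" + className + "']/Validator");
        if (validator == null)
            return null;

        int paramCount = validator.SelectNodes("Parameter").Count;
        return validator.GetAttribute("Assembly") + "|" +
               validator.GetAttribute("Type") + "|" +
               validator.GetAttribute("Method") + "|" + paramCount;
    }

    public static void Main()
    {
        // prints "StringValidation|StringValidation|Validate|4"
        Console.WriteLine(Describe(SampleXml, "Class2"));
    }
}
```

The three attribute values are exactly what Assembly.Load, Activator.CreateInstance, and InvokeMember need, so nothing about the validator has to be known when the application is compiled.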

You’ll notice that the “Value” node for the Value parameter is the name of a class member. The sample builds a hash table containing a class member name as a key and the name of its property get/set as the value. It also uses a single class to perform all validation on the classes. If you know the name of a property you can use reflection to obtain a PropertyInfo object on a class based on the property name. The PropertyInfo class has GetValue and SetValue methods that you can invoke to retrieve/set the actual data contained in the class member.

   Type t = obj.GetType();
   PropertyInfo info = t.GetProperty("Foo");
   String s = info.GetValue(obj, null).ToString();

Using validation metadata allows you to add, modify, and remove validation from class members without having to recompile any objects. You can also modify the validator and the metadata as needed without having to rebuild the main application.

Putting It All Together
Now that you have defined your data validation you can use the .NET classes to easily validate your data in a consistent manner that is easy to implement. In the future you can add or remove data validation from any particular class and data member without rebuilding your application. If you decide that you need a new validator, you can build that separately from the main application, add the metadata to your validation file (or SQL Server tables), and the application can take advantage of the newly available validation code. Let’s see how it all fits together.

   XmlDocument doc = new XmlDocument();
   doc.Load("DataValidation.xml");

The document is stored in the main application. Each time you create a new instance of a class the application will check to see if it implements the IValidator interface. If it does you need to initialize the validation routines.

   Foo foo = new Foo();
   IValidator v = foo as IValidator;
   if (null != v)
   {
      v.Initialize(
         bar.GetValidationData(foo.ToString()));
   }

The main application class, bar, loaded the data validation information. To get the validation information for Type Foo, the code calls foo.ToString(). By default, the ToString method of a class returns its fully qualified .NET type name. You don’t want a decorated name, so Foo overrides ToString to return just the name of the class, which is easier to read. You’ll use the returned string to search through the metadata to find the validators necessary for that class. The Initialize method is defined on the IValidator interface and takes an XmlDocumentFragment as its lone parameter. Listing 2 shows how to initialize the data validation rules.

The code to create and initialize the validators has been omitted from Listing 2. I want to note one thing here. You can get information about the properties of a class using the Type class, which returns an array of PropertyInfo objects. You will use this to build a hash table of class data members and their associated properties. One extra task you’ll perform is to check each property for a custom attribute. If the property has the custom attribute defined on it, it gets loaded into the hash table. If not, it is ignored. You can use this technique to exclude certain pieces of data that you know you do not need to validate. Listing 3 shows how easy it is to create and use a custom attribute class.
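The attribute check described above can be sketched as follows. This is an illustration under assumptions, not the code from Listing 3: ValidatedFieldAttribute, Customer, and BuildAccessorTable are hypothetical names, and the attribute is assumed to carry the name of the data member that the property wraps.

```csharp
using System;
using System.Collections;
using System.Reflection;

// Hypothetical marker attribute that names the underlying data member.
[AttributeUsage(AttributeTargets.Property)]
public class ValidatedFieldAttribute : Attribute
{
    private readonly string m_fieldName;

    public ValidatedFieldAttribute(string fieldName)
    {
        m_fieldName = fieldName;
    }

    public string FieldName
    {
        get { return m_fieldName; }
    }
}

// Hypothetical class to validate.
public class Customer
{
    private string m_strName = "";
    private int m_nVisits;

    [ValidatedField("m_strName")]
    public string Name
    {
        get { return m_strName; }
        set { m_strName = value; }
    }

    // No attribute, so this property is left out of the table.
    public int Visits
    {
        get { return m_nVisits; }
        set { m_nVisits = value; }
    }
}

public static class AttributeDemo
{
    // Maps data member name -> property name for attributed properties.
    public static Hashtable BuildAccessorTable(Type type)
    {
        Hashtable table = new Hashtable();
        PropertyInfo[] props = type.GetProperties(
            BindingFlags.Public | BindingFlags.Instance);
        foreach (PropertyInfo info in props)
        {
            object[] attrs = info.GetCustomAttributes(
                typeof(ValidatedFieldAttribute), true);
            if (attrs.Length > 0)
            {
                ValidatedFieldAttribute attr =
                    (ValidatedFieldAttribute)attrs[0];
                table[attr.FieldName] = info.Name;
            }
        }
        return table;
    }

    public static void Main()
    {
        Hashtable table = BuildAccessorTable(typeof(Customer));
        Console.WriteLine(table["m_strName"]); // prints "Name"
    }
}
```

Removing the attribute from a property is enough to take it out of validation, without touching the code that builds or consumes the table.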

Each instance of a Type that you need to validate will contain an ArrayList of objects of Type Validator. The Validator class is a proxy for the real validator. In addition to loading the real validator, it is responsible for communicating with the class being validated using the hash table created above to retrieve and store data values. You can do this because of a method defined on the IValidator interface so that you don’t need to know the underlying details of the class you want to validate.

Now that you’ve ensured that the validation routine is available and properly constructed, your initialization is complete. You just need to call the Validate interface method on the class object instance. This method iterates through the list of validators, calling each Validate method, and stores any returned error messages. The Validate method of the Validator class creates an XML document containing all the parameters necessary to invoke the data validation method, and then uses the InvokeMember method of the Type class.

   object[] objArgs = new object[] { doc };
   object result = true;
   // Use reflection services to dynamically invoke the
   // real validator here
   try
   {
      result = m_Type.InvokeMember(m_strMethodName,
         BindingFlags.Default | BindingFlags.InvokeMethod,
         null, m_InstanceObj, objArgs);
   }
   catch (Exception e)
   {
      // InvokeMember wraps exceptions thrown by the target
      // method, so surface the validator's own exception
      if (null != e.InnerException)
         throw e.InnerException;
      else
         throw;   // rethrow, preserving the stack trace
   }
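The reason for looking at InnerException in the catch block is that reflection wraps any exception thrown by the invoked method in a TargetInvocationException. The following minimal sketch demonstrates that behavior; StrictValidator and InvokeDemo are hypothetical, and the validator takes a plain string rather than an XML document to keep the example short.

```csharp
using System;
using System.Reflection;

// Hypothetical validator that rejects empty input by throwing.
public class StrictValidator
{
    public bool Validate(string value)
    {
        if (string.IsNullOrEmpty(value))
            throw new ArgumentException("Value is required.");
        return true;
    }
}

public static class InvokeDemo
{
    // Invokes Validate by name and reports what happened.
    public static string TryValidate(object validator, string value)
    {
        try
        {
            validator.GetType().InvokeMember("Validate",
                BindingFlags.Public | BindingFlags.Instance |
                BindingFlags.InvokeMethod,
                null, validator, new object[] { value });
            return "ok";
        }
        catch (TargetInvocationException e)
        {
            // The validator's own exception is the inner exception.
            Exception inner = e.InnerException;
            return inner.GetType().Name + ": " + inner.Message;
        }
    }

    public static void Main()
    {
        StrictValidator v = new StrictValidator();
        Console.WriteLine(TryValidate(v, "abc")); // prints "ok"
        // prints "ArgumentException: Value is required."
        Console.WriteLine(TryValidate(v, ""));
    }
}
```

Catching TargetInvocationException specifically, as this sketch does, leaves other failures such as a missing method to propagate unchanged.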

You use the existing instance of the object from the external assembly when calling the InvokeMember method of the Type class, passing it an XML document. Because XmlDocument is a reference type, the external assembly class receives a reference to the same document instance. It parses the document, performs its internal routines to validate the data, and modifies the document’s contents as necessary, so any changes are visible to the caller when the call returns. The code then parses the returned XML document and invokes the class property to update the data value, as in the following sample code:

   Type t = obj.GetType();
   PropertyInfo info =
      t.GetProperty(v.GetAccessor(strFieldName));
   object objValue =
      Convert.ChangeType(child.InnerText,
      info.PropertyType);
   info.SetValue(obj, objValue, null);
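As a self-contained illustration of this read-and-write-back step, the sketch below gets and sets a property by name and converts the incoming text to the property's declared type. The Order class and the PropertyDemo helpers are hypothetical, and the invariant culture is an assumption made so that the conversion does not depend on the machine's regional settings.

```csharp
using System;
using System.Globalization;
using System.Reflection;

// Hypothetical class being validated.
public class Order
{
    public string Code { get; set; }
    public int Quantity { get; set; }
}

public static class PropertyDemo
{
    // Reads a property by name and returns its value as text.
    public static string Read(object obj, string propertyName)
    {
        PropertyInfo info = obj.GetType().GetProperty(propertyName);
        object value = info.GetValue(obj, null);
        return value == null
            ? null
            : Convert.ToString(value, CultureInfo.InvariantCulture);
    }

    // Converts text to the property's type and assigns it.
    public static void Write(object obj, string propertyName, string text)
    {
        PropertyInfo info = obj.GetType().GetProperty(propertyName);
        object converted = Convert.ChangeType(
            text, info.PropertyType, CultureInfo.InvariantCulture);
        info.SetValue(obj, converted, null);
    }

    public static void Main()
    {
        Order order = new Order();
        Write(order, "Quantity", "42");
        Write(order, "Code", "A-100");
        Console.WriteLine(Read(order, "Quantity")); // prints "42"
        Console.WriteLine(Read(order, "Code"));     // prints "A-100"
    }
}
```

Convert.ChangeType throws if the text cannot be converted (for example, "abc" for an int property), so a real validator proxy would need to decide how to report that case.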

Although the above sample validates a single data member of a class, you can use it to validate multiple members of a class at one time. We currently do this with address data, using third-party software that we invoke from our AddressValidator through Platform Invoke (P/Invoke), the System.Runtime.InteropServices mechanism for calling functions in external Win32 DLLs. With additional modifications you could easily access information in other classes contained within the class you are currently validating.

Refreshing the Metadata
Metadata has assumed a very important role in the quest to design increasingly generic and extensible software frameworks. Although I’ve gone into a fair level of detail describing how my company uses metadata to drive data validation rules, Brierley & Partners’ CRM (customer relationship management) framework uses metadata on a much broader scale to define the very essence of a customer. In an abstract sense, we define a customer as having a collection of profiles and events. The specific types of profiles and events vary dramatically per implementation and are therefore also defined external to the application. So not only have we removed the business rules surrounding the data from inline code, we’ve also removed the definition of the customer data itself. As the extent of the metadata repository grows, performance quickly becomes an issue.

To allow the generic code base to perform well in a high-transaction environment we decided to cache the metadata within each process where it is required. When an instance of our CRM framework is created, once per process, it immediately loads the metadata definition from a database into local memory. This allows for very quick access to the information that drives the behavior of the system. Note that by placing all of this system-level knowledge (data structure, database access rules, validation rules, etc.) within the back-end framework we can apply it consistently regardless of the origin of the data. For example, processing could originate from a wireless device, a Web site, an XML Web service, or a batch process, but all share the same back end. By caching the metadata we have solved the pending performance issue. However, what happens when the business rules change? In a nutshell, the cache becomes invalid and we need to refresh it. We use .NET remoting as the foundation for various systems management tasks including refreshing the metadata and enabling distributed logging.

Describing the mechanics of .NET’s distributed object infrastructure is beyond the scope of this article. Instead, we’ll highlight how to use .NET remoting to solve our invalidated cache scenario since it relates to data validation.

Our systems management server lives in a traditional Windows NT service and makes itself known to the remoting environment within our implementation of the System.ServiceProcess.ServiceBase OnStart method.

   Hashtable hshtProps = new Hashtable();
   hshtProps.Add("port", 8085);
   BinaryServerFormatterSinkProvider sink =
      new BinaryServerFormatterSinkProvider();
   sink.TypeFilterLevel =
      System.Runtime.Serialization.Formatters.
      TypeFilterLevel.Full;
   m_chan = new TcpChannel(hshtProps, null, sink);
   ChannelServices.RegisterChannel(m_chan);
   WellKnownServiceTypeEntry entry =
      new WellKnownServiceTypeEntry(
         typeof(SystemsManagement.ServerManager),
         "ServerManager",
         WellKnownObjectMode.Singleton);
   RemotingConfiguration.RegisterWellKnownServiceType(entry);

However, in the spirit of extensibility, the recommended approach for defining the characteristics of the server is to externalize these configuration elements into a configuration file and use RemotingConfiguration.Configure.

This server keeps track of each process that needs to be notified when the contents of the metadata have changed. In order to support this functionality, the server implements an interface allowing instances of the framework to register themselves as subscribers. The server, implemented as a Singleton, maintains a list of each subscriber and periodically pings each instance to verify its continued existence. The metadata (e.g. data validation rules) is updated using a GUI client that serves as a publisher. That is, when the metadata is modified it publishes this event to the systems management server, which in turn iterates over the subscriber list telling each instance that its metadata is out of date. Within each instance of the framework all access to the metadata is regulated through a static property, and using synchronization objects, the data can be refreshed properly in a multi-threaded environment.
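The last point, regulating access through a static property guarded by synchronization objects, could look roughly like the sketch below. It is a simplified illustration rather than our framework's code: MetadataCache, its members, and the in-memory LoadFromStore method (which stands in for the database read) are hypothetical, and a single lock is assumed to be sufficient.

```csharp
using System;
using System.Collections.Generic;

public static class MetadataCache
{
    private static readonly object s_lock = new object();
    private static Dictionary<string, string> s_rules; // null = stale
    private static int s_loadCount;

    // Hypothetical loader standing in for the database read.
    private static Dictionary<string, string> LoadFromStore()
    {
        s_loadCount++;
        Dictionary<string, string> rules =
            new Dictionary<string, string>();
        rules["Class2.Name"] = "StringValidation";
        return rules;
    }

    // All readers go through this property. The first access after
    // an invalidation reloads the metadata under the lock.
    public static IDictionary<string, string> Rules
    {
        get
        {
            lock (s_lock)
            {
                if (s_rules == null)
                    s_rules = LoadFromStore();
                return s_rules;
            }
        }
    }

    // Called when the management server reports that the metadata
    // is out of date.
    public static void Invalidate()
    {
        lock (s_lock)
        {
            s_rules = null;
        }
    }

    // Exposed only so the sketch can show when reloads happen.
    public static int LoadCount
    {
        get { lock (s_lock) { return s_loadCount; } }
    }

    public static void Main()
    {
        Console.WriteLine(Rules["Class2.Name"]);
        Invalidate();
        Console.WriteLine(Rules["Class2.Name"]);
        Console.WriteLine(LoadCount);
    }
}
```

Because a refresh replaces the whole dictionary instead of modifying it in place, a thread that obtained the old dictionary before the invalidation can finish its work against a consistent, if slightly stale, snapshot.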

The .NET Framework has allowed Brierley & Partners to build a very extensible CRM platform that is driven by externally defined rules. By combining .NET reflection and remoting we have been able to deploy a highly configurable software platform that is also very responsive and suited to high transaction volumes.
