Browse DevX
Sign up for e-mail newsletters from DevX


Synchronize Your Databases with .NET Web Services (Part II)  : Page 10

When data gets created in many locations, you often need to create a process that collects and copies this data to multiple sites. In this article, you'll see how to use Web services to automatically synchronize remote databases in a decentralized way, by letting each machine query the others until they all contain the same data.




Building the Right Environment to Support AI, Machine Learning and Deep Learning

Extending the Sample Application
It's generally the case that examples you see in articles and books never seem to be exactly what you need for your own applications. So, this section provides some pointers to how you might extend the example for your own needs. There are two obvious areas to consider:
  1. Maintaining a centralized set of configuration data, making administration easier and ensuring each machine uses the same configuration data.
  2. Synchronizing changes to the data on each machine, rather than just adding new rows.

Centralizing Configuration Settings
Each client instance in the example shown in this article uses a local configuration file that lists the other members of the synchronization group. To implement a centralized configuration system, you could store the configuration information in one specific location, and then fetch it from each client as they start the synchronization process. As the configuration information file is in XML format, it would be easy to load this across the network (or across the Internet) using a .NET XmlDocument or XmlTextReader object. Then you would extract the data from the XML document rather than from a local .config file, or simply write the XML directly to disk to be used as the local configuration file. However, in the example shown earlier, each client's local configuration file lists only the other members of the configuration group. To implement a single configuration file that is used by all clients, you would have to include all the members of the synchronization group (all the Web service URLs and machine-specific key values), and then make sure that each client ignores it's own entry as it iterates through the list of Web service URLs. If the machine-specific key is the IP address of the client (as in our example), the client can detect this and simply skip the entry that specifies its own IP address. If you use a different value for the machine-specific keys, you could store this is a separate local file on each client, which it would read and use to identify which of the Web service URLs in the list applies to the local machine (and therefore should be skipped).

Synchronizing Changes to the Data
The example discussed in this article is aimed at synchronizing data by adding only new rows to all the databases within the synchronization group. However, what about synchronizing changes to existing rows in the databases? This is much harder to achieve, as it means that the process must be able to handle updates to existing rows and detect when rows have been deleted. This also brings into play issues such as which of the group of clients will hold the "master" database (in case there are conflicting updates), and what the rules are for handling updated and deleted rows. For example, if the same existing row is changed in more than one database, which version of the row should be retained and replicated to the rest? And if a row is deleted in one database, should it be deleted from all the others? Even more interesting, how would you detect that a row had been deleted anyway?

The DataAdapter object's Update method is clever in that it will automatically push changes to existing rows into a table in a database, as well as adding new rows. It will even delete rows in the table that have been deleted from the DataSet. The problem is that you have to populate the DataSet in such a way that it contains the appropriate set of updated and deleted rows (in other words, the current and original values of each column in each row and their RowState values have to be set correctly to be able to push the changes into the target table). One way to handle updates would be to create a DataSet containing the existing rows in the local database on the client machine, then send this DataSet to the target server where the DataAdapter object's Fill method would be used to refresh the rows in the DataSet with the updated values in that database (which would also collect any added rows). Then the DataSet would be sent back to the client where the Update method would push the changes into the local database.

For this to work, the tables must have a column defined as the primary key. It would also be wise to ensure that each client only updates its own rows between synchronization cycles (in other words, the rows that it has added, or for which it is "responsible"). Each update will over-write any previous values, so the values in a row where changes have been made on more than one machine will be determined by which machine carries out the synchronization process last. If this occurs, you must also repeat the whole synchronization process so that this "final" value ends up in all the synchronized databases. Handling deleted rows is even more difficult. The only way that a DataSet can remove rows from a database table is if the rows both exist in that DataSet and are marked as being deleted. But the Fill method cannot detect when rows have been deleted from a database table. This means that you would have send the current rows from the client to the target machine (as discussed for updates), and then execute some custom code that searches for each row in the target database. If it's not found, that row would be deleted from the DataSet (which sets the appropriate current, original and RowState values). Then, when the DataSet is returned the client and the Update method is called, these rows will be deleted from the local database.

In this article, you've seen a practical application of using Web services to synchronize data across multiple databases over the Internet. While the demonstration application is quite specific, you should recognize that the overall concept and process has much wider applications. For example, you could use it to synchronize almost any kind of data between different data sources, as long as the data can be extracted and exposed as a DataSet by the source Web service, or use the techniques shown here to aggregate data from many different machines and databases, regardless of how widely they're distributed.

Alex Homer is a director of Stonebroom, Ltd., a software development, consulting, and training organization. He was formerly lead technical author and reviewer for Wrox, specializing in Microsoft Web and database technologies. You can reach him through his Web site.
Comment and Contribute






(Maximum characters: 1200). You have 1200 characters left.



Thanks for your registration, follow us on our social networks to keep up-to-date