|The Synchronization Code
The ProcessAllSynchOperations method’s task is to implement the process shown in the schematic in Figure 8. Before it gets called, a couple of other routines (not shown here) collect the values from the configuration file. These routines split the list of Web service URLs (the Web sites from which to collect new rows) into an array named gSourceURLArray; split the list of “key” values? (the matching IP addresses of these sites)? into an array named gSiteIPArray; and split the list of table names into an array named gTableArray.
|Code involved only in writing status information to the Form window and the log file has been removed from this section.
Iterating Through the Web Service URLs
The ProcessAllSynchOperations method iterates through the gSourceURLArray array of Web service URLs using an Integer index to extract the matching key value from the gSiteIPArray. A separate function reads the file containing the last successful update date for that site. The URL, key, and date information are then used in the call to each site’s Web service. If all goes well you get back a DataSet containing all the new rows from all the tables for that site; otherwise you get a DataSet containing only a single table named Errors, or Nothing if the remote Web service fails completely. The method checks for both these conditions. Notice how it extracts the error message if the service returns an “error” DataSet. The method throws an exception that is caught in the main routine within the application, which displays and logs the message and then halts execution.
Sub ProcessAllSynchOperations() ' iterate through all the Web service URLs For iLoop = 0 To gSourceURLArray.Length - 1 ' get values for this source site sSourceURL = gSourceURLArray(iLoop).Trim() sWebSiteIP = gSiteIPArray(iLoop).Trim() dLastUpdate = GetLastDateLoaded(sWebSiteIP) ' call routine to fetch the DataSet from ' the remote Web service Dim oDS As DataSet = FetchTables(sSourceURL, sWebSiteIP, gTableNames, dLastUpdate) ' see if there were any errors with the Web service If oDS Is Nothing Then Throw New Exception( _ "No DataSet returned from Web service") End If If oDS.Tables(0).TableName = "Errors" Then Throw New Exception( "Error reported by Web service: " & _ oDS.Tables(0).Rows(0)(0)) End If ...
Iterating Through the List of Tables
At this point, you know that the service returned a DataSet containing any new rows, so you begin to insert the rows into the local database. You need to create a connection to the local database, open it, and start a transaction over this connection, iterating through the list of tables held in gTableArray. If no table in the DataSet matches the current item in the gTableArray list, (there were no rows added on the source machine), the application simply logs and displays a message. However, if the table does exist, it checks to see if there are any new rows in it using the Count property of the Rows collection:
... Dim oConn As SqlConnection Dim oTrans As SqlTransaction Try ' create Connection, open it and start a transaction oConn = New SqlConnection(gConnStr) oConn.Open() oTrans = oConn.BeginTransaction() ' iterate through all the tables in the list Dim sTableName As String For Each sTableName In gTableArray ' check if table is in DataSet -- ' might not be if there were no new rows If oDS.Tables(sTableName) Is Nothing Then WriteStatus(">> WARNING: no rows received " & _ "for table '" & sTableName & "'") Else ' see how many rows are in this table Dim iRows As Integer = oDS.Tables(sTableName).Rows.Count If iRows > 0 Then Try ...
Inserting New Rows into the Database
To insert the rows automatically, you must attach a DataAdapter to the DataSet, and call its Update method. However, you also need to ensure that this happens within the control of the currently pending transaction, which involves some extra code. Normally, you create a DataAdapter by specifying the connection and a SQL statement that selects an appropriate set of rows:
Dim oDA As New SqlDataAdapter("sql-statement", connection-instance)
However, calling that constructor only sets the SelectCommand property of the DataAdapter. To create a suitable InsertCommand, you can add a CommandBuilder:
Dim oCB As New SqlCommandBuilder(oDA)
Then, when you call the DataAdapter’s Update method, the CommandBuilder generates a suitable parameterized SQL INSERT statement, and assigns it to the CommandText property of the Command object associated with the DataAdapter’s InsertCommand property. It also populates the Parameters collection of that Command with the appropriate set of Parameter instances. If you choose not to use a CommandBuilder, you’ll have to do all this work manually.
However, there’s a problem; the Command objects that the CommandBuilder creates are not added to the pending transaction, and trying to assign the transaction to them afterwards causes an error. To get around this, you have to specify the current pending transaction when you create a new Command object. Subsequently, you can pass that Command to the constructor of the DataAdapter where it will be used as the SelectCommand.
Next, you can use the GetInsertCommand method of a new CommandBuilder to obtain a Command object, assign the pending transaction to it, and then assign it to the InsertCommand of the DataAdapter. Finally, you’re ready to call the Update method to push the rows into the database:
... ' create a new Command for the SelectCommand ' within the current transaction Dim oSCmd As New SqlCommand( _ "SELECT * FROM " & sTableName, oConn, oTrans) ' create DataAdapter using that Command Dim oDA As New SqlDataAdapter(oSCmd) ' create a new CommandBuilder from the DataAdapter Dim oCB As New SqlCommandBuilder(oDA) ' get the InsertCommand Command instance ' from the CommandBuilder Dim oICmd As SqlCommand = oCB.GetInsertCommand() ' attach the current transaction to the InsertCommand oICmd.Transaction = oTrans ' specify this as the InsertCommand of the DataAdapter oDA.InsertCommand = oICmd ' update the database table Dim iCount As Integer = oDA.Update(oDS, sTableName) ...
Committing or Rolling Back the Transaction
If the update succeeds, you can move on to repeat the process for the next table. If an error occurs, the application throws an exception that halts execution, calling the Rollback method of the current transaction first so that no updates will be applied to any of the tables. However, after successfully processing all the tables in the list, you can safely attempt to commit the transaction. The enclosing Try…Catch construct also rolls back all updates if any other error occurs. The Finally section ensures that the connection gets closed when the process completes (it also executes if an error occurs, before raising the exception to the calling routine).
... Catch e As Exception ' error encountered so roll back all the updates oTrans.Rollback() Throw New Exception( _ "Error updating target table " _ & sTableName & " - " & e.Message) End Try End If End If Next ' process next table in list oTrans.Commit() ' all OK so commit all of the updates Catch e As Exception ' error encountered so roll back all the updates oTrans.Rollback() ' and stop, though could just write error message ' and process next URL Throw New Exception( _ "Transaction failed to complete - " & e.Message) Finally oConn.Close() End Try ...
Updating the “Last Updated” File
The ProcessAllSynchOperations routine ends by calling a separate routine named UpdateLastDateLoaded, which just writes the current date and time into the “last updated” file for this Web service URL (you’ll recall that this is a file with a name such as 217_45_1_242.txt). Then the code returns to process the next Web service URL in the list:
... ' update last synchronization date file for this site UpdateLastDateLoaded(sWebSiteIP) Next ' process next site (next Web service URL) End Sub
The synchronization is complete for this client after it processes all the Web service URLs in the configuration list. Meanwhile, all the other machines within the synchronization group will carry out the same process, fetching new rows from each other and adding them to their local databases. You should schedule each machine for a slightly different time, so that they aren’t all trying to synchronize at the same time.
Extending the Sample Application
It’s generally the case that examples you see in articles and books never seem to be exactly what you need for your own applications. So, this section provides some pointers to how you might extend the example for your own needs. There are two obvious areas to consider:
- Maintaining a centralized set of configuration data, making administration easier and ensuring each machine uses the same configuration data.
- Synchronizing changes to the data on each machine, rather than just adding new rows.
Centralizing Configuration Settings
Each client instance in the example shown in this article uses a local configuration file that lists the other members of the synchronization group. To implement a centralized configuration system, you could store the configuration information in one specific location, and then fetch it from each client as they start the synchronization process. As the configuration information file is in XML format, it would be easy to load this across the network (or across the Internet) using a .NET XmlDocument or XmlTextReader object. Then you would extract the data from the XML document rather than from a local .config file, or simply write the XML directly to disk to be used as the local configuration file.
However, in the example shown earlier, each client’s local configuration file lists only the other members of the configuration group. To implement a single configuration file that is used by all clients, you would have to include all the members of the synchronization group (all the Web service URLs and machine-specific key values), and then make sure that each client ignores it’s own entry as it iterates through the list of Web service URLs. If the machine-specific key is the IP address of the client (as in our example), the client can detect this and simply skip the entry that specifies its own IP address. If you use a different value for the machine-specific keys, you could store this is a separate local file on each client, which it would read and use to identify which of the Web service URLs in the list applies to the local machine (and therefore should be skipped).
Synchronizing Changes to the Data
The example discussed in this article is aimed at synchronizing data by adding only new rows to all the databases within the synchronization group. However, what about synchronizing changes to existing rows in the databases?
This is much harder to achieve, as it means that the process must be able to handle updates to existing rows and detect when rows have been deleted. This also brings into play issues such as which of the group of clients will hold the “master” database (in case there are conflicting updates), and what the rules are for handling updated and deleted rows. For example, if the same existing row is changed in more than one database, which version of the row should be retained and replicated to the rest? And if a row is deleted in one database, should it be deleted from all the others? Even more interesting, how would you detect that a row had been deleted anyway?
The DataAdapter object’s Update method is clever in that it will automatically push changes to existing rows into a table in a database, as well as adding new rows. It will even delete rows in the table that have been deleted from the DataSet. The problem is that you have to populate the DataSet in such a way that it contains the appropriate set of updated and deleted rows (in other words, the current and original values of each column in each row and their RowState values have to be set correctly to be able to push the changes into the target table).
One way to handle updates would be to create a DataSet containing the existing rows in the local database on the client machine, then send this DataSet to the target server where the DataAdapter object’s Fill method would be used to refresh the rows in the DataSet with the updated values in that database (which would also collect any added rows). Then the DataSet would be sent back to the client where the Update method would push the changes into the local database.
For this to work, the tables must have a column defined as the primary key. It would also be wise to ensure that each client only updates its own rows between synchronization cycles (in other words, the rows that it has added, or for which it is “responsible”). Each update will over-write any previous values, so the values in a row where changes have been made on more than one machine will be determined by which machine carries out the synchronization process last. If this occurs, you must also repeat the whole synchronization process so that this “final” value ends up in all the synchronized databases.
Handling deleted rows is even more difficult. The only way that a DataSet can remove rows from a database table is if the rows both exist in that DataSet and are marked as being deleted. But the Fill method cannot detect when rows have been deleted from a database table. This means that you would have send the current rows from the client to the target machine (as discussed for updates), and then execute some custom code that searches for each row in the target database. If it’s not found, that row would be deleted from the DataSet (which sets the appropriate current, original and RowState values). Then, when the DataSet is returned the client and the Update method is called, these rows will be deleted from the local database.
In this article, you’ve seen a practical application of using Web services to synchronize data across multiple databases over the Internet. While the demonstration application is quite specific, you should recognize that the overall concept and process has much wider applications. For example, you could use it to synchronize almost any kind of data between different data sources, as long as the data can be extracted and exposed as a DataSet by the source Web service, or use the techniques shown here to aggregate data from many different machines and databases, regardless of how widely they’re distributed.