RSS Feed
Download our iPhone app
Browse DevX
Sign up for e-mail newsletters from DevX


Use Globally Unique Identifiers in Access and SQL Server : Page 2

When your business outgrows your database systems, using globally unique identifier (GUID) values in Access and SQL Server can offer a timely solution.

The GUID datatype can assist with a number of database challenges:
  • Global databases that must guarantee unique key values across many servers
  • Replicated databases that must guarantee unique key values across many servers
  • Large databases that have outgrown incremental primary key values
  • Companies faced with combining database files where primary key values are duplicated

Large databases that have outgrown their incremental primary key values (tables with more than 2 billion records) and developers who must combine two or more systems with duplicate primary key values require key values that are unique across an indeterminate number of servers. When feasible, a cost-effective and efficient solution in both cases is simply to declare the primary key values in all tables suspect and replace them all with GUID values (generated by the GUID datatype). The process requires updating every primary and foreign key value, which might be tedious but it guarantees that none of the primary key values are duplicated. The process won't cost you anything but time.

Microsoft discourages the use of GUID as a primary key, but that's because most people don't really need universal uniqueness. When uniqueness across many systems is vital, however, the GUID datatype is definitely the way to go. Just remember that you pay a price in performance.

GUIDs in Replicated Databases
By and large, the most common use for GUID values is in replicated databases, where users insert new records and update copies of the same records on different servers. At any moment, two database replicas could contain new records, so a row or rows might contain identical primary key values. Adding a GUID value to each record would guarantee that all replicas (as a group) contain a unique primary key, reducing the synchronization problem to a series of simple append queries.

However, updating existing data poses a more difficult problem—users could change the same record in different replicas. Simply replacing an incremental primary key with a GUID primary key won't solve this problem, because you still have n different versions of a row identified by the same GUID. For this reason, Access doesn't convert primary key fields to the GUID datatype in replicated files. It adds GUID values to each record as version trackers. Access changes the incremental primary key to random, which minimizes (but doesn't eliminate) the possibility of duplication within replicas. Then, during synchronization, the Replication Manager automatically detects and resolves primary key collisions, reassigning new primary keys to any duplicated numbers. However, that change handles only inserts. So what about updates and deletes? To that end, the Replication Manager adds four new columns to every table (see Figure 1).

Figure 1: The Concept Behind SQLXML

These new columns are read-only. If you change a value, Access tracks the change by changing the value in the s_Generation and the s_Lineage columns, accordingly. If you modify a column's definition (which you can do only in the design master), Access updates the s_ColLineage column. Thus both data and definition changes are automatically tracked and can be propagated successfully during synchronization.

You might expect replication to incur the same performance penalty as GUID primary keys, but it doesn't. Access searches these extra columns only during synchronization. In addition, typically only a few rows are added or changed between synchronizations, so the process isn't lengthy. (Of course, the longer the interval between synchronizations, the longer the process takes.)

Close Icon
Thanks for your registration, follow us on our social networks to keep up-to-date