he sheer number of objects in a database makes keeping track of them difficult. Keeping track of objects dedicated to a particular table, such as constraints and column names, can be discerned by using sp_help @tablename or the information_schema views. The identification of other objects and their dependencies shared between objects are more problematic, as in the case of stored procedures accepting parameters. As your system grows in complexity, identifying the objects’ dependencies and their purposes can quickly become confusing. The confusion multiplies after the creator of the objects moves on and his or her successor is faced with making sense out of the schema left behind.
Adopt a naming convention that limits the ambiguity and uncertainty surrounding an object and its purpose. The benefits of a consistent naming convention accrue exponentially as the number of objects under your control increases. Ten minutes spent in adopting a naming convention will help you, your co-workers, and successors keep track of your objects and their dependencies.
This month’s 10-Minute Solution discusses some alternative naming conventions for database objects. You can ignore these suggestions in favor of something that makes more sense to your workgroup, but the benefits of consistency in naming database objects cannot be ignored.
As the number of objects in my database grows, how do I keep track of each of their dependencies and purposes?
Adopt a consistent naming convention to limit the ambiguity and uncertainty surrounding an object and its purpose. The benefits of your naming convention will accrue exponentially as the number of objects under your control increases.Getting Started
When you are naming a new database object, you can segment the name to communicate its role in the system. A useful rule of thumb is to have a template for your names: a prefix, a stem, and a modifier that stand for the type, name, and function of the object you’re naming, respectively. This approach can be overkill, but more often than not it makes the name much more communicative. As an example of overkill, it doesn’t make much sense to add the prefix “tbl” to a table name. In this case, your convention could be to assume that a base table is named for the conceptual entity for which it stands, and forget about the prefix. On the other hand, prefixing a view of that table with a “v” would make plain that the user of that object is dealing with some variant (subset or join) of a base table.
A table sometimes represents more than a conceptual entity. For example, when expressing a many-to-many relationship, to decomposing the relationship into its participants is a must. The challenge arises from having to “downshift” the management of many-to-many relationships into terms that the database can handle (i.e., you recognize many-to-many relationships conceptually, but the database can manage only one-to-ones or one-to-manys). So you need to manage the representation of a real world fact in terms the database can handle. For associative relationships or intersection tables, mapping to reality breaks down when you try to cleanly represent the thing you are describing.
How Do the Experts Do It?
Consider how SQL Server’s creators signal the meaning of their own objects. Take a look at a user database or one of the system database tables. You’ll notice a bunch of them are prefixed with “sys”. In user databases, I averred that it makes sense to ignore the prefix rule for the conceptual entities, but with utility tables or helper tables (a temporary table or a calendar table, for example) keeping them segregated from the application-based tables is helpful. I still advocate keeping prefixes off tables when the best you can think of is “tbl”.
In SQL Server’s user database system tables, the convention is to segregate the sys tables from the ones you will create and maintain. Likewise the system stored procedures are prefixed with sp, and some are further modified with MS, help, or repl, communicating more about their application and purpose. A table prefix is a useful convention applied judiciously to entities you or your application shouldn’t touch.
Transactional vs. Analysis Oriented
An extension to this option is to prefix table names with the subject area to which they are related. Naming tables by subject area in a transactional system is a bad idea because a transactional database can be shared among many applications. Transactional databases are application oriented, and their entities need to be shared among the applications they service. A sales order-processing system may reference service products that have no packaging in contrast to an inventory product. Inventory products may be composed of kits, individual inventory items bundled together with accessories. Physically expressing these subtypes (service and inventory items for sale) as tables and naming them with prefixes contributes more confusion than clarity. It makes more sense to use a subtype identifier to distinguish between them and express the bundles in their own intersection table.
Conversely, naming tables for their subject area is an excellent idea for tables that have been migrated to a data warehouse. Grouping tables by subject area in a transaction-oriented system confuses the aims of a transactional system with a data warehouse implementation, where the tables are subject area oriented. This begs the question of the appropriate time to move to two systems, one transaction oriented and the other analysis oriented, and this decision is usually motivated by system performance rather than academics. The subject area prefix in the analysis-oriented implementation is fine.
Plural vs. Single
Another issue in table naming is the plural versus singular argument. Some think tables should be named in the plural, while others think they should be named in the singular. I think that plurals make more sense from a practical perspective. For example, I think of the Employees table as a collection of individual Employee instances. It makes more sense to think that you are selecting an employee instance, a single row, from a collection that houses all the Employees. So while reading a Where clause, it sounds like better grammar to say that you want to select the Employee from Employees where the Employee number is 12.View Naming
Views typically contain joins, so they are hybrids, not base tables that are talking about a single entity. Views can be used to partition tables either into security or access levels, or by subject instead of application. In my experience, they most commonly represent a security mechanism to hide the details in a table from users who don’t need to see all the data in a table. Most commonly, views represent joins between entities.
I prefer to name views with prefixes to distinguish them from tables. You can use views together with base tables in select statements, and I find it convenient to segregate them, especially since views are much more likely to already contain a join. MS adopted the ANSI idea of using Information_Schema, a prefix for the views that deliver the old sysobjects columns. These are views that deliver the content one queries to find out metadata about his or her database. There are Information_Schema.tables (as in select * from information_schema.tables, information_schema.columns, or information_schema.table_constraints). Using these views is much more convenient than having to remember how to navigate the intricacies of the system tables. Using the information schema views simplifies your life because their naming convention communicates their purpose right out front (and yes, they are implementing joins for you).
Suppose you were responsible for a database that supplies brands offering product promotions to a Web application. Further, the database has to keep customer information with each brand. You would structure an account table with a hierarchy in the accounts with brand at the highest level. Fielding a brand with its customers would always wind up being a two-step operation because you would have to collect all the accounts that belonged to a particular brand. In other words, you would be constantly creating a master-detail select statement. You would use the brand id to field the order-taking process on the Web for the same catalog of products offered under different names and prices for each particular brand.
This would be an ideal spot to create views that are specific to a particular brand. You would wind up with views for all the brands you had, say a Sony brand, a Motorola brand, an IBM brand, and an HPCompaq brand. The underlying table would have some redundancy, because if “Joe” bought something from one brand and then bought something else from another brand, he would be an account in two brands. Using views to isolate these associations would create a shortcut for segregating those multiple appearances so that Joe would appear once in any view. I would name the views vw_BrandSony, vw_BrandIBM, etc.
More prosaically, views are also extremely useful for partitioning tables according to privileges and/or authorities granted to their users. This makes it a good security enforcement vehicle for segregating data that you don’t want seen, because you can deny permissions to the table and allow only those columns and rows that are appropriate to the login asking for data. They cannot change what they cannot see.
Stored Procedure Naming
Another place where a naming convention can be useful is in the naming of stored procedures. We all know (or should know) that naming your stored procedures the same way as Microsoft qualifies as a worst practice. When you do that, you force the server to look at the Master database first. The sp_ prefix signals that the code to run is in the Master database. If the procedure is not found there, then the server seeks it in the user’s database. This just wastes time sending the processor out of the intended database, looking for a procedure that is not where its name says it should be, and forcing it back to the user’s database. (Procedures that include their owner names are preferred, but that is a whole article unto itself.) The moral of the story is that you should name your own stored procedures something other than sp_ , like usp_, to avoid sending the procedure call out to Master.
Once you have settled on a different prefix, let’s say usp_, a natural tendency is to name something like usp_InsertCustomer, usp_DeleteCustomer, usp_UpdateCustomer, and usp_SelectCustomer – natural, but not wise! The tree view for procedures becomes ordered by the first few characters of your stored procedure name, causing all the stored procedures to be sorted by function instead of by object. I bet that isn’t what you want.
Try placing their table names immediately after the underscores (some people hate underscores, but I like them in proc names). That way, they will naturally sort in object name order, so all your customer procedures will fall together, all your order procedures will fall together, and if you have a library of utility or other special-purpose stored procedures, then they will fall together in the library, API, or utility section of the tree view.
For triggers, I would reverse the object/function convention in the previous section because of the dedicated nature of their ownership. Triggers are by definition bound to their tables. This gives you have some flexibility in naming them because the owning table circumscribes its ownership. Triggers have other complexities though in that they also are named for specific CRUD functions (well, not the Read part of CRUD, but they do exist for Update, Create (insert) and Delete). These should appear as part of the trigger name to make its purpose explicit. If the trigger is multi-purpose, that is, its action can apply to more than a single database action, then all its parts should be reflected in the trigger name to keep ambiguity at a minimum (e.g., tr_InsUpd_CustomerBalance).Naming Indexes
You should explicitly name indexes also, so you can get a sense of the lineage of the key and avoid having the random numbers appended by SQL Server to your index and constraint names. An index name also should reflect whether the index is clustered or not, unique or not, and by including the referencing and referenced tables, whether it enforces a referential fact or not. As with triggers, indexes are dedicated, owned by a specific table, but because they are external structures, reflecting the owning table’s name makes sense. You wind up with index names such as PK_cl_u_Customer_CustomerNumber (a clustered unique index on the primary key of Customers), FK_cl_CustomerCustomerNumber_OrderNumber (a compound clustered index on CustomerNumber, a foreign key into Orders and Order Number), or ncl_u_OrderDetailNumber_ItemItemNumber (a nonclustered unique index on an order detail and its item number). These would apply to the primary key for a table, a foreign key into a child table, or an indexed column in a single table.
You can make column names more memorable also by applying a naming convention. Some people prefer naming the primary key column ID in the base table, then prefixing the ID with its entity name once it migrates to a child as a foreign key. Once the Customer.ID migrates to Orders, it becomes Orders.CustomerID, giving the reader a clue as to the origin of the key.
I have seen systems that prefix column names with their data types on the assumption that knowing the column’s data type makes stored procedure code writing easier. The reasoning is the data type of the column is carried over to the types of the variables used in the code. My own feeling is that the presence of the Object Browser makes this unnecessary, because you can find this information out just by looking at the Object Browser in Query Analyser.
A few other considerations exist for less commonly used database objects, such as functions, check clauses, defaults, and user-defined types. I use the same rules for naming these that I use for the more “visible” objects. If ownership is defaulted to some other object, I don’t include it in the name and favor including a functional description instead.
Since functions are not owned by anything, I would name them fn_WhatEverItsSupposedToDo. Check clauses are owned, but they have scope in that they can apply to the entire table or to just a column. So including the scope in its name would be nice, as in ck_t_ColumnsInvolvedOneOfWhichIsNotNull or ck_c_HasToBeInARange.
Defaults are prefixed with def_ and user defined types with udt_. In practice, however, I avoid the user-defined types because they create more work than they are worth at maintenance time.
Finally, don’t forget that some strings qualify as reserved words. These can have unintended effects on behavior because they cause errors that aren’t immediately apparent and difficult to track down. The reserved words to avoid are listed in Books On Line.
What’s in a Name? Almost Everything
To maintain understanding and readability, I take advantage of long names in my conventions, but I do downsize to either an alias name for an especially unwieldy column (using the AS keyword) or for a long view name in the From clause, (again, using the AS keyword) when I am actually using the long name in a query. To maintain readability, I use camel case in all my names to make them stand out and be more legible.
The thing to take away from this discussion of naming conventions is that adopting a naming convention helps you keep your thoughts together while you are working with your database. It really doesn’t matter what rule or convention you use, as long as you use one consistently. Will you remember what you were talking about six months from now? Put yourself in your successors’ shoes. Would they know what you were trying to convey? It’s all in the name (well, mostly).