The Features You've Come to Expect
The database comparison boils down to the features that each database engine provides (see Table 1
). Database administrators that have worked with commercial database engines such as Oracle, DB2, or MS-SQL have come to rely on a fairly broad feature set. This section compares these commercial databases with the open-source databases.
MySQL has several different data storage mechanisms available. It originally used ISAM/MyISAM, which was then replaced by the more advanced InnoDB. Other storage mechanisms are available, but this discussion focuses primarily on using InnoDB tables because it typically has the most advanced database feature set and is the default table type in MySQL version 4.x. When choosing a MySQL storage mechanism, make sure you read up on all of the features you plan on implementing. While researching this article, I found that some features exist in certain storage mechanisms, but not in others. Most notably, InnoDB and BDB are the only table types that are transaction-safe. PostgreSQL, on the other hand, uses only one data storage mechanism, the aptly named Postgres storage system.
One of the critical features of any database engine is data integrity. ACID (Atomic, Consistent, Isolated, Durable) compliance is a qualification that assures data integrity. ACID essentially means that when a transaction is performed within a database, either the whole transaction is successful and the information is written to the database, or nothing is written. Both PostgreSQL and MySQL support ACID-compliant transaction functionality.
Both databases also support partial rollbacks of transactions, and they know how to deal with deadlocks. MySQL uses traditional row-level locking. PostgreSQL uses something called Multi Version Concurrency Control (MVCC) by default. MVCC is a little different from row-level locking in that transactions on the database are performed on a snapshot of the data and then serialized. New versions of PostgreSQL support standard row-level locking as an option, but MVCC is the preferred method.
The Advanced Features
PostgreSQL has many of the database features that Oracle, DB2, or MS-SQL has, including triggers, views, inheritance, sequences, stored procedures, cursors, and user-defined data types. MySQL's development version, version 5.0, supports views, stored procedures, and cursors. MySQL's future version, version 5.1, will support triggers. MySQL does, however, support the advanced feature of data partitioning within a database. PostgreSQL does not.
Stored Procedures and Triggers
While PostgreSQL has had support for stored procedures and triggers for quite some time now, MySQL has support for these only in development versions 5.0 and beyond. PostgreSQL's query language, PL/pgSQL, is very similar to Oracle's PL/SQL. In addition, PostgreSQL's procedures and triggers can be written in other languages as well, such as PL/TCL, PL/perl, and PL/python. These additional languages come in two basic flavors, safe and unsafe. Safe allows only for use of things in the programming language that don't affect the host system negatively, such as direct access to the file system.
Oracle is known for the amount of tweaking it allows for databases, especially when it comes to indexing. Overall, experienced Oracle users will probably find the indexing strategies employed by these open-source databases quite primitive. Both PostgreSQL and MySQL support single column, multi-column, unique, and primary key indexes. MySQL supports full text indexes out of the box, and PostgreSQL can support full text indexes with some changes to the database that are included with the source.
Databases hold data, and the types of data that a database can hold are called data types. Both PostgreSQL and MySQL support most standard data types. In the past few years, large object support has become increasingly popular, and both databases support this as well. PostgreSQL supports user-defined data types, while MySQL does not. MySQL and PostgreSQL also both support the storing of geographic features, known as GIS (Geographic Information System). PostgreSQL additionally has network-aware data types that recognize Ipv4 and Ipv6 data types.
Another major feature of enterprise-level databases is support for replication. Both MySQL and PostgreSQL have support for single-master, multi-slave replication scenarios. This base level of replication is included with the distributions of the software, and the source code is open. PostgreSQL offers additional support for multi-master, multi-slave replication from a third-party vendor, as well as additional replication methods.
While both Oracle and DB2 run on multiple platforms, Microsoft's SQL Server is limited to Windows. Both MySQL and PostgreSQL support many different platforms, including Windows, Linux, FreeBSD, and MacOSX. MySQL uses a threaded model for server processes, wherein all of the users connect to a single database daemon for access. PostgreSQL uses a non-threaded model where every new connection to the database gets a new database process.
Database Interface Methods
PostgreSQL and MySQL both support ODBC and JDBC for network connectivity, as well as native database access methods. These native methods provide access via the network in both plain text methods and, for a higher level of security, SSL-encrypted methods.
Another important part of database interface methods is authentication for the database. MySQL uses a simple method to store all of its authentication information inside a table. When users attempt to access a database, MySQL compares their credentials against this database, verifying from which machines the users can connect and to what resources they have access.
PostgreSQL can use a similar method, but it also has some others. For example, it can use a hosts file for database access to define which remote users can connect to which database. It can also use the local authentication systems for database access (e.g., your Unix password would also be your PostgreSQL password).
A number of programming methods also provide ways to access these databases. Both PostgreSQL and MySQL support access via C/C++, Java, Perl, Python, and PHP. PostgresSQL also has internal programming languages for writing stored procedures and triggers, among them are pl/pgsql, pl/tcl, and pl/perl.
When it comes to backups, open-source databases may not completely fulfill your needs. Both databases come with scripts to facilitate a simple text dump of your database data and its schema. Both database solutions also provide methods for doing a hot-database backup, or backing up your database without shutting it down. Many commercial backup tools, such as Vertias NetBackup or Tivoli TSM, have agents that provide online backups of commercial databases. A quick Web search returned only a few vendors that create agents for PostgreSQL and MySQL. The overall coverage appears limited.
Backups also include simple database recovery from soft failures, such as database crashes or unexpected power failures. PostgreSQL uses a system called Write Ahead Logging to provide database consistency checking. MySQL has database consistency checking only under InnoDB table types.
Many people use GUI tools to manage their databases. Many such toolsboth open source and commercialare available for MySQL and PostgreSQL. These tools can be either applications that run natively on your operating system or Web-based tools. Many of these tools are closely modeled after tools available to commercial databases.
Both MySQL and PostgreSQL have database migration utilities to help migrate data from commercial databases. These utilities are available from third parties as either open-source or commercial tools. PostgreSQL also comes with tools to help migrate data from Oracle and MySQL. Obviously, the more complex your schema, the more difficult the conversion will be, and some of these tools may not completely migrate everything perfectly.