
MVCC Explained: How Databases Handle Concurrency


You do not really understand a database until you have watched it fail under load.

The first time I saw it, we had a clean schema, well-indexed tables, and a workload that looked harmless in staging. Then production traffic hit. Suddenly, requests were hanging. CPU was fine. The disk was fine. But sessions were blocked, waiting on locks that seemed to multiply by the second.

That is the moment you meet concurrency control.

Multi-Version Concurrency Control, or MVCC, is the technique modern databases use to let many transactions read and write the same data at the same time without stepping on each other. Instead of locking rows aggressively, the database keeps multiple versions of a row and lets transactions see the version that makes sense for their timeline.

It sounds simple. It is not. But once you understand how MVCC works, you start seeing why PostgreSQL behaves differently from MySQL, why vacuuming matters, and why your long-running transaction can quietly sabotage the entire cluster.

Let us break it down.

What Problem MVCC Actually Solves

Before MVCC, databases mostly relied on locking for concurrency. If you updated a row, other transactions had to wait. If you were reading at a strict isolation level, you might block writers, too.

This leads to two classic problems:

  • Readers blocking writers
  • Writers blocking readers

Under load, that becomes a traffic jam.

To understand the tradeoff, I revisited how major engines describe their models and how practitioners talk about them in the field.

Michael Stonebraker, who led the original Postgres project at Berkeley, has repeatedly emphasized that Postgres was designed around MVCC from the beginning to avoid the read-write blocking that plagued earlier systems. The design goal was clear: reads should not block writes, and writes should not block reads.

Heikki Linnakangas, longtime PostgreSQL core developer, has explained in conference talks that MVCC in Postgres works by creating new row versions on update, rather than overwriting existing ones. The old version stays visible to transactions that started earlier. This design choice avoids in-place updates but shifts complexity to cleanup and storage management.

Mark Callaghan, a former MySQL and Facebook database engineer, has discussed how InnoDB uses undo logs to reconstruct older versions of rows for consistent reads. In his benchmarking work, he often highlights how MVCC interacts with IO patterns and long-running transactions in real production systems.

Three engines, same core idea, very different tradeoffs.

The synthesis is this: MVCC reduces blocking by trading space and background cleanup for concurrency. You get higher throughput under mixed read-write workloads, but you now need to manage version chains, vacuuming, and transaction visibility rules carefully.

The Core Idea: Time Travel for Rows

At its heart, MVCC is about snapshots.

When you start a transaction, the database gives you a logical snapshot of the data at a particular point in time. Every query you run in that transaction sees a consistent view of the database, even if other transactions are modifying rows concurrently.


Here is the mental model that works best in practice:

  1. Every row has metadata about when it was created and, if applicable, when it was deleted.
  2. Every transaction has an ID and a visibility rule.
  3. When you read a row, the database checks whether that row version is visible to your transaction.

If another transaction updates a row, the database:

  • Creates a new version of the row.
  • Marks the old version as expired for future transactions.
  • Keeps the old version around for transactions that started earlier.

No overwriting. Just new versions.

That is why MVCC is often described as append-only at the logical level.
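The visibility rule above can be sketched in a few lines. This is a toy in-memory model for illustration, not any engine's actual algorithm; the names `RowVersion`, `xmin`, and `xmax` are borrowed loosely from PostgreSQL's terminology.

```python
# Toy model of MVCC row-version visibility. Each version records the
# transaction that created it (xmin) and, once superseded, the one that
# expired it (xmax). A snapshot is the set of transactions that had
# committed when the reader's snapshot was taken.
from dataclasses import dataclass
from typing import Optional

@dataclass
class RowVersion:
    value: str
    xmin: int                   # txn id that created this version
    xmax: Optional[int] = None  # txn id that expired it, if any

def visible(version: RowVersion, snapshot: set) -> bool:
    """Visible if the creator had committed when the snapshot was taken,
    and the deleter had not."""
    if version.xmin not in snapshot:
        return False
    return version.xmax is None or version.xmax not in snapshot

# Txn 10 inserted the row; txn 20 later updated it, producing a new version.
chain = [
    RowVersion("old", xmin=10, xmax=20),
    RowVersion("new", xmin=20),
]

early_snapshot = {10}       # taken before txn 20 committed
late_snapshot = {10, 20}    # taken after txn 20 committed

print([v.value for v in chain if visible(v, early_snapshot)])  # ['old']
print([v.value for v in chain if visible(v, late_snapshot)])   # ['new']
```

Both snapshots read the same chain without blocking anyone; each simply sees the version consistent with its own timeline.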

How MVCC Works in Practice (PostgreSQL vs InnoDB)

All MVCC systems share the concept of row versions, but the implementation details matter.

Here is a simplified comparison:

Feature                 | PostgreSQL                            | InnoDB (MySQL)
------------------------|---------------------------------------|---------------------------------
Update behavior         | New row version written in the table  | Row updated in place + undo log
Old versions stored in  | Same table, as dead tuples            | Undo log segments
Cleanup mechanism       | VACUUM                                | Purge thread
Snapshot mechanism      | Transaction ID visibility checks      | Read view with undo reconstruction

This difference drives real operational consequences.

In PostgreSQL, updates create new physical tuples in the table. If you update a row 10 times, you may have 10 versions sitting in the heap until vacuum reclaims them. This can cause table bloat if vacuum is misconfigured or blocked by long-running transactions.

In InnoDB, updates modify the row in place and store previous versions in undo logs. A consistent read may reconstruct older versions by walking the undo chain. That means heavy update workloads stress undo tablespaces and purge threads instead of inflating the main table.

Same idea, different pain points.
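The InnoDB side of that comparison can be sketched too. This is a deliberately simplified model of a consistent read reconstructing an old value by walking an undo chain; the structures here are illustrative stand-ins, not InnoDB's real data layout.

```python
# Sketch of an InnoDB-style consistent read: the table holds only the
# latest version in place; older values are rebuilt by walking undo
# records backwards until we reach a writer the reader may see.
current = {"value": "v3", "txn": 30}

# Undo records, newest first: what the row looked like before each change.
undo_chain = [
    {"value": "v2", "txn": 20},
    {"value": "v1", "txn": 10},
]

def consistent_read(committed: set) -> str:
    """Return the newest value whose writer is committed in the reader's
    view; otherwise keep walking the undo chain."""
    if current["txn"] in committed:
        return current["value"]
    for rec in undo_chain:
        if rec["txn"] in committed:
            return rec["value"]
    raise LookupError("no visible version")

print(consistent_read({10, 20, 30}))  # 'v3' — newest writer is committed
print(consistent_read({10}))          # 'v1' — walks the undo chain back
```

Notice that the deeper a reader's snapshot is in the past, the more undo records each read must traverse, which is exactly why long-running transactions stress undo storage.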

Isolation Levels and What MVCC Actually Guarantees

MVCC interacts tightly with transaction isolation levels. You cannot understand one without the other.

The standard isolation levels are:

  • Read Uncommitted
  • Read Committed
  • Repeatable Read
  • Serializable

Most MVCC databases default to Read Committed or Repeatable Read.

Let us walk through a concrete example.

Assume:

  • Transaction A starts at time T1.
  • Transaction B starts at time T2.
  • Both operate on the same row.

If Transaction B updates the row and commits, what does Transaction A see?

Under Read Committed:

  • Each query sees the latest committed version at the time of the query.
  • If A re-runs the same SELECT, it might see new data.

Under Repeatable Read:

  • A sees a snapshot from T1.
  • Even if B commits, A keeps seeing the old version.

This is the power of MVCC: no blocking, yet consistent snapshots.
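The difference between the two levels comes down to when the snapshot is taken. The toy simulation below (not a real engine) captures that: Read Committed takes a fresh snapshot per statement, while Repeatable Read reuses the one pinned at transaction start.

```python
# (value, committing txn id) pairs for one row's version history.
versions = [("old", 1), ("new", 2)]
committed_now = {1}   # txn 2 (Transaction B) has not committed yet

def read(snapshot):
    # newest version whose writer is in the snapshot
    return max((v for v in versions if v[1] in snapshot),
               key=lambda v: v[1])[0]

# Transaction A starts here; Repeatable Read pins this snapshot at T1.
rr_snapshot = set(committed_now)

committed_now.add(2)  # Transaction B commits its update

read_committed_result = read(set(committed_now))  # fresh snapshot per query
repeatable_read_result = read(rr_snapshot)        # pinned snapshot from T1
print(read_committed_result)   # 'new'
print(repeatable_read_result)  # 'old'
```

Same row, same moment in time, two correct answers, because correctness is defined relative to each transaction's snapshot.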

However, anomalies can still happen. For example, phantom reads under weaker isolation levels, or serialization failures under strict ones. In PostgreSQL’s Serializable isolation, the engine uses predicate locking and conflict tracking on top of MVCC to ensure true serializability, which can lead to transaction retries.


This is where many teams get surprised. MVCC does not mean no contention. It just changes where contention appears.

The Hidden Cost: Long-Running Transactions

If you take only one operational lesson from MVCC, let it be this:

Long-running transactions are dangerous.

Why?

Old row versions cannot be cleaned up while a transaction that might still need them is open.

Let us do a quick back-of-the-envelope example.

  • You have a table with 10 million rows.
  • You update 100,000 rows per minute.
  • A reporting transaction runs for 30 minutes.

In PostgreSQL, all row versions touched during that window must remain until the reporting transaction finishes. That means:

100,000 updates per minute × 30 minutes = 3,000,000 dead tuples

Those dead tuples:

  • Consume disk space
  • Increase table scan costs
  • Increase vacuum work

I have seen this exact pattern cause latency spikes in production systems. A single analytics query left open in a connection pool kept autovacuum from reclaiming space, and the heap grew fast enough to impact cache efficiency.

In InnoDB, the effect is similar but shifts to undo logs. Long-running transactions prevent purge from advancing, causing undo logs to grow and potentially filling undo tablespaces.

The rule is simple: keep transactions short.
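The arithmetic above, plus the mechanism behind it, fits in a few lines. The transaction ids here are hypothetical; the point is that the cleanup horizon is pinned by the oldest still-open transaction.

```python
# Back-of-the-envelope from the example above: dead versions that must
# be retained while a 30-minute reporting transaction stays open.
updates_per_minute = 100_000
report_duration_minutes = 30

retained_dead_tuples = updates_per_minute * report_duration_minutes
print(retained_dead_tuples)  # 3000000

# Cleanup can only advance to the oldest open transaction's snapshot:
# nothing a transaction that old might still need can be reclaimed,
# no matter how long ago it became dead.
open_txn_snapshot_ids = [5_000, 9_800, 9_950]   # hypothetical ids
cleanup_horizon = min(open_txn_snapshot_ids)
print(cleanup_horizon)  # 5000 — everything newer than this must stay
```

One idle session at id 5,000 holds back cleanup for every transaction after it, which is why connection-pool leaks are so damaging.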

How to Design Systems That Play Well with MVCC

Understanding MVCC is not just academic. It changes how you design your application.

Here is how to work with it instead of against it.

1. Keep Transactions Tight and Purposeful

Do not hold transactions open across network calls or user interactions.

Bad pattern:

  • Begin transaction
  • Call external API
  • Wait for user confirmation
  • Commit

Good pattern:

  • Gather all inputs
  • Open transaction
  • Perform database work
  • Commit immediately

Measure transaction duration in milliseconds when possible.
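The good pattern can be sketched with a stand-in transaction wrapper. The `transaction` context manager and `gather_inputs` helper here are hypothetical placeholders, not a real driver API; the shape is what matters: slow work outside, database work inside, duration measured.

```python
# Do slow work first, then open the transaction only around the database
# work, and record how long the transaction stayed open.
import time
from contextlib import contextmanager

durations_ms = []

@contextmanager
def transaction():
    """Pretend transaction that records how long it was held open."""
    start = time.monotonic()
    try:
        yield
    finally:
        durations_ms.append((time.monotonic() - start) * 1000)

def gather_inputs():
    time.sleep(0.05)          # slow external call happens OUTSIDE the txn
    return {"order_id": 42}

inputs = gather_inputs()      # 1. gather all inputs first
with transaction():           # 2. open the transaction
    order_id = inputs["order_id"]  # 3. only database work inside
                                   # 4. commit immediately on exit

print(f"transaction held open for {durations_ms[0]:.2f} ms")
```

With the external call moved outside, the transaction's open time is dominated by the database work itself, which is what you want to measure and alert on.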

2. Choose Isolation Levels Deliberately

Do not default to Serializable because it sounds safer.

Start with:

  • Read Committed for most web workloads
  • Repeatable Read when you need consistent analytical reads

Test for anomalies that matter to your domain, such as lost updates or write skew, and add constraints or explicit locking where necessary.

3. Monitor Version and Cleanup Metrics

In PostgreSQL:

  • Track dead tuples
  • Watch autovacuum activity
  • Monitor replication slots and xmin horizon

In MySQL InnoDB:

  • Monitor the history list length
  • Track undo tablespace growth
  • Watch purge lag

These are not optional metrics in high-write systems.

4. Use Explicit Locks When Needed

MVCC does not remove the need for locks.

If you are implementing something like:

  • Inventory decrement
  • Financial ledger updates
  • Idempotency enforcement

You may still need:

  • Explicit row locks such as SELECT ... FOR UPDATE
  • Unique or check constraints that enforce the invariant
  • Optimistic concurrency checks with retries

MVCC prevents many blocking scenarios, but it does not automatically prevent all logical race conditions.
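The classic logical race is the lost update. The toy in-memory model below shows why snapshot reads alone do not prevent it; the version check standing in for SELECT ... FOR UPDATE (or an equivalent optimistic check) is an illustrative simplification.

```python
# Two transactions both read stock=10, both decrement, both try to write 9.
stock = {"value": 10, "version": 1}

def read_stock():
    return stock["value"], stock["version"]

def write_stock(new_value, expected_version=None):
    """An unconditional write silently loses concurrent updates; a
    version check detects the race so the caller can retry."""
    if expected_version is not None and stock["version"] != expected_version:
        raise RuntimeError("row changed since it was read: retry")
    stock["value"] = new_value
    stock["version"] += 1

a_val, a_ver = read_stock()   # Transaction A reads (10, version 1)
b_val, b_ver = read_stock()   # Transaction B reads (10, version 1)

write_stock(a_val - 1)        # A writes 9; version becomes 2
try:
    write_stock(b_val - 1, expected_version=b_ver)  # B's stale write caught
except RuntimeError as e:
    print("conflict:", e)

print(stock["value"])  # 9, not 8 — B must retry instead of clobbering A
```

Without the check, B's write would also land a 9, silently erasing A's decrement even though no one ever blocked.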


What Is Still Uncertain or Subtle

There are areas where even experienced engineers get tripped up.

One is the interaction between MVCC and indexing. Index entries often point to specific row versions. In PostgreSQL, a bloated table often implies bloated indexes. Vacuuming indexes is separate work and can affect query performance in surprising ways.

Another is replication. Logical replication streams row changes and depends on retained row versions. If you combine long-running transactions with replication slots, you can accidentally retain massive amounts of old data.

Finally, distributed databases layer additional concurrency mechanisms on top of MVCC, such as timestamp ordering or hybrid logical clocks. Systems like CockroachDB and YugabyteDB extend the snapshot concept across nodes, adding complexity around clock skew and transaction coordination.

No one really gets concurrency “for free.” They just move the complexity into different layers.

FAQ

Is MVCC the same as snapshot isolation?

Not exactly. MVCC is a mechanism. Snapshot isolation is a specific isolation level often implemented using MVCC. Many databases use MVCC to implement multiple isolation levels, including snapshot isolation.

Does MVCC eliminate deadlocks?

No. Deadlocks can still occur when transactions acquire locks in conflicting orders, especially with explicit row locks like SELECT FOR UPDATE. MVCC reduces some forms of blocking but does not eliminate lock-based conflicts.

Why does PostgreSQL need VACUUM?

Because old row versions are stored in the same table. VACUUM reclaims space from dead tuples and updates visibility metadata. Without it, tables bloat, and performance degrades.

Can you run without MVCC?

Some systems use alternative concurrency models, such as strict two-phase locking. However, for mixed read-write workloads at scale, MVCC or MVCC-like approaches dominate modern relational databases.

Honest Takeaway

MVCC is one of those concepts that feels abstract until it breaks something you care about.

At a high level, it lets readers and writers coexist by keeping multiple versions of rows. In practice, it introduces new operational responsibilities around cleanup, isolation levels, and transaction design.

If you are building anything beyond a toy application, you need to understand how your database implements MVCC, not just that it does.

The core idea is simple: your database is keeping time-stamped versions of reality so that everyone can pretend they are alone.

The complexity comes from cleaning up the past without breaking the present.

Rashan is a seasoned technology journalist and visionary leader serving as the Editor-in-Chief of DevX.com, a leading online publication focused on software development, programming languages, and emerging technologies. With his deep expertise in the tech industry and his passion for empowering developers, Rashan has transformed DevX.com into a vibrant hub of knowledge and innovation. Reach out to Rashan at [email protected]
