
Primary–Secondary Topology: How It Works and Fails

A primary–secondary topology (sometimes called leader–follower, master–replica, or active–passive) is one of the most common architectures used in distributed systems and databases. It is simple, predictable, and easy to reason about. But it also has sharp edges that appear once systems scale or require strong availability.

Let’s break it down.

In a primary–secondary topology, one node is designated as the primary (leader), and one or more nodes are secondaries (replicas).

The core rule:

  • All writes go to the primary
  • Secondaries replicate the primary’s data
  • Reads may come from secondaries

Basic flow:

Client → Primary (write)
Primary → Secondaries (replication)
Client → Secondary (read)
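The flow above can be sketched in a few lines of Python. The classes here are purely illustrative (not any real database's API), and replication is shown as synchronous for simplicity; real systems usually replicate asynchronously.

```python
class Secondary:
    def __init__(self):
        self.data = {}

    def apply(self, key, value):      # receive a replicated change
        self.data[key] = value

    def read(self, key):
        return self.data.get(key)

class Primary:
    def __init__(self, secondaries):
        self.data = {}
        self.secondaries = secondaries

    def write(self, key, value):      # all writes land on the primary
        self.data[key] = value
        for s in self.secondaries:    # fan the change out to replicas
            s.apply(key, value)

replicas = [Secondary(), Secondary()]
primary = Primary(replicas)
primary.write("article", "v1")
print(replicas[0].read("article"))    # "v1" — reads served by a secondary
```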

Think of it like a newsroom.

  • The primary editor writes the official article.
  • Copy editors receive updates and distribute them.
  • Readers can read from any copy editor, but only the primary editor changes the article.

Why Systems Use It

The topology solves several practical problems.

1. Write coordination

If multiple nodes accept writes independently, conflicts become complex.

Primary–secondary solves this by enforcing:

Single writer → many readers

2. Read scaling

Secondaries can serve read traffic.

Example:

Node         Role                  Traffic
Primary      Writes + some reads   10%
Secondary A  Reads                 45%
Secondary B  Reads                 45%

This dramatically increases read throughput.
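A common way to get this split in practice is a small router in front of the cluster: writes pin to the primary, reads rotate across secondaries. This sketch uses hypothetical names and round-robin selection; real deployments often do this with a load balancer or driver-level read preference.

```python
import itertools

class Router:
    def __init__(self, primary, secondaries):
        self.primary = primary
        self._reads = itertools.cycle(secondaries)  # round-robin reads

    def route(self, op):
        return self.primary if op == "write" else next(self._reads)

router = Router("primary", ["secondary-a", "secondary-b"])
print(router.route("write"))   # primary
print(router.route("read"))    # secondary-a
print(router.route("read"))    # secondary-b
```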

3. Failover capability

If the primary dies, a secondary can be promoted to primary.

This is common in:

  • PostgreSQL replication
  • MySQL replication
  • MongoDB replica sets
  • Redis replication
  • Elasticsearch clusters
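Promotion usually picks the surviving secondary with the most replicated data, so the least history is lost. The sketch below is a simplified model with made-up fields, not any of the systems listed above; real elections also involve quorums and fencing of the old primary.

```python
# Promote the most up-to-date surviving secondary to primary.
def promote(nodes):
    candidates = [n for n in nodes
                  if n["alive"] and n["role"] == "secondary"]
    # the secondary that has replayed the most of the log loses least data
    new_primary = max(candidates, key=lambda n: n["log_position"])
    new_primary["role"] = "primary"
    return new_primary

cluster = [
    {"name": "n1", "role": "primary",   "alive": False, "log_position": 120},
    {"name": "n2", "role": "secondary", "alive": True,  "log_position": 118},
    {"name": "n3", "role": "secondary", "alive": True,  "log_position": 115},
]
print(promote(cluster)["name"])   # n2 — furthest ahead in the log
```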

The Replication Mechanism

Primary–secondary systems replicate changes from the primary to replicas.

Common mechanisms include:

Log shipping

The primary writes changes to a write-ahead log (WAL).

Secondaries replay the log.


Example flow:

Write request
↓
Primary commits transaction
↓
Primary writes to WAL
↓
Secondaries read WAL
↓
Secondaries replay change
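The log-shipping flow can be modeled directly: the primary appends each committed change to a log, and a secondary replays any entries past the position it has already applied. The structures below are illustrative, not a real WAL format.

```python
wal = []              # primary's write-ahead log (list of changes)
primary_state = {}

def primary_write(key, value):
    primary_state[key] = value
    wal.append((key, value))      # the committed change is logged

class Secondary:
    def __init__(self):
        self.state = {}
        self.applied = 0          # WAL position already replayed

    def catch_up(self):
        for key, value in wal[self.applied:]:
            self.state[key] = value   # replay each logged change
        self.applied = len(wal)

primary_write("balance", 500)
replica = Secondary()
replica.catch_up()
print(replica.state["balance"])   # 500
```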

State snapshot + streaming

  1. Replica copies the full dataset
  2. Then receives incremental updates

This is how systems like PostgreSQL streaming replication work.
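The two steps can be sketched as follows: copy the full dataset once, remember the log position at copy time, then apply only entries written after that position. This is a toy model of the idea, not PostgreSQL's actual protocol.

```python
primary_state = {"a": 1, "b": 2}
wal = [("a", 1), ("b", 2)]

def bootstrap_replica():
    snapshot = dict(primary_state)   # step 1: full dataset copy
    position = len(wal)              # remember where the log stood
    return snapshot, position

replica_state, pos = bootstrap_replica()

# step 2: later writes arrive as incremental updates
primary_state["c"] = 3
wal.append(("c", 3))

for key, value in wal[pos:]:         # apply only post-snapshot entries
    replica_state[key] = value

print(replica_state)   # {'a': 1, 'b': 2, 'c': 3}
```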

The Hidden Assumption

Primary–secondary architectures assume something important:

The primary is always reachable and authoritative.

This assumption works well in small deployments.

But once networks get messy, the model begins to break down.

When Primary–Secondary Breaks Down

Several real-world conditions cause problems.

1. Replication Lag

Replication is usually asynchronous.

So secondaries are slightly behind.

Example:

User writes: balance = $500
Primary updates immediately

Secondary still shows:
balance = $200

This leads to stale reads.

Real production issues include:

  • inconsistent dashboards
  • outdated search indexes
  • incorrect financial reads

Many systems expose this as:

read-after-write inconsistency
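The balance example above can be reproduced with a toy asynchronous-replication model: changes sit in a queue before reaching the replica, so a read issued immediately after the write sees the old value.

```python
primary = {"balance": 200}
replica = {"balance": 200}
pending = []                      # replication queue (async shipping)

def write(key, value):
    primary[key] = value
    pending.append((key, value))  # shipped later, not immediately

def replicate_one():
    key, value = pending.pop(0)
    replica[key] = value

write("balance", 500)
print(replica["balance"])   # 200 — stale read, change not yet applied
replicate_one()
print(replica["balance"])   # 500 — replica has caught up
```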

2. Primary Bottleneck

Since all writes go through the primary, it becomes a scaling limit.

If the primary handles:

  • 50k writes/sec
  • heavy transaction logic
  • replication streams

It becomes CPU-, disk-, or network-bound.

You eventually hit a hard ceiling.

Solutions often involve:

  • sharding
  • partitioning
  • multi-leader systems

3. Failover Complexity

If the primary fails, you must elect a new primary.

This introduces problems like:

Split brain

Two nodes believe they are primary.

Example:

Network partition

Cluster A → elects node 1 primary
Cluster B → elects node 2 primary

Now both accept writes.

When the network heals, the system has conflicting histories.

Consensus protocols like Raft or Paxos exist largely to solve this problem.
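The standard defense those protocols build on is a majority quorum: a node may act as primary only if it can reach a strict majority of the cluster. Since at most one side of a partition holds a majority, at most one primary can exist. A minimal sketch of the rule:

```python
def can_be_primary(reachable_nodes, cluster_size):
    # reachable_nodes counts the candidate plus the nodes it can contact
    return reachable_nodes > cluster_size // 2

# 5-node cluster split 3 / 2 by a partition:
print(can_be_primary(3, 5))   # True  — majority side may elect a primary
print(can_be_primary(2, 5))   # False — minority side must not accept writes
```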

4. Network Partitions

In distributed systems, networks fail more often than machines.

A partition might isolate:

Primary + 1 replica
vs
3 replicas

Which side should remain writable?

Systems must choose between:

  • Consistency
  • Availability

This is the CAP theorem tradeoff.

Some systems freeze writes. Others risk divergence.
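That choice can be made concrete with the partition example above. A consistency-first (CP) system rejects writes on the minority side; an availability-first (AP) system accepts them and risks divergence. A hedged sketch of the decision rule:

```python
def handle_write(side_size, cluster_size, mode):
    has_majority = side_size > cluster_size // 2
    if has_majority:
        return "accepted"
    if mode == "CP":
        return "rejected"          # freeze writes, stay consistent
    return "accepted (divergent)"  # AP: stay available, reconcile later

# primary + 1 replica vs. 3 replicas, 5 nodes total
print(handle_write(2, 5, "CP"))   # rejected
print(handle_write(2, 5, "AP"))   # accepted (divergent)
print(handle_write(3, 5, "CP"))   # accepted
```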

5. Long Tail Latency

Replication creates cascading latency.

Example:

Client → Primary (5ms)
Primary → Replica A (20ms)
Replica A → Replica B (40ms)

Large clusters can end up with a multi-second lag during spikes.

Real Systems That Use Primary–Secondary

Examples include:

PostgreSQL

Streaming replication model.

Primary handles all writes.

Replicas replay the WAL.

MySQL

Classic asynchronous replication.

Often used with read replicas behind load balancers.

MongoDB

Replica sets with automatic leader election.

Redis

Primary with replica nodes.

Replication can be async or semi-sync.

When Engineers Move Beyond It

Primary–secondary works great until systems require:

  • massive write scaling
  • low-latency global writes
  • strong multi-region availability

At that point, architectures evolve to:

Multi-primary (multi-leader)

Multiple nodes accept writes.

Examples:

  • Galera Cluster for MySQL
  • CouchDB multi-master replication

Leaderless systems

Clients write to multiple nodes directly.

Examples:

  • DynamoDB
  • Cassandra
  • Riak

These designs trade simplicity for availability and scale.
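In the Dynamo-style leaderless model, the key condition is quorum overlap: with N replicas, writes acknowledged by W nodes and reads from R nodes, choosing W + R > N guarantees the read set intersects the write set, so every read touches at least one up-to-date copy. This sketch shows only the condition, not any real client API.

```python
def quorums_overlap(n, w, r):
    # if W + R > N, any read quorum intersects any write quorum
    return w + r > n

print(quorums_overlap(3, 2, 2))   # True  — common N=3, W=2, R=2 setup
print(quorums_overlap(3, 1, 1))   # False — fast, but reads can be stale
```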

The Core Insight

Primary–secondary topology is popular because it simplifies a hard problem.

Instead of solving distributed consensus for every write, the system centralizes writes into one authority.

But that simplicity creates structural limits:

  • single write bottleneck
  • replication lag
  • complicated failover
  • partition sensitivity

This is why large distributed systems eventually evolve toward more complex topologies.
