You ship a service, traffic doubles, then doubles again. The humble hash table that felt roomy in staging now lives at ninety percent load with overflow chains that read like horror fiction. You could rehash into a larger table, take a write pause, and hope no one notices. Or you can use dynamic hashing, a family of techniques that lets a hash table grow and sometimes shrink in place while staying online.
In plain terms, dynamic hashing is a strategy that keeps the cost of lookups and inserts close to constant even as the key set changes. Instead of rebuilding the entire table when you cross a threshold, it changes only a small part of the structure, usually one bucket at a time. Done right, that keeps latency stable and tail behavior predictable during growth events.
We looked at the two classic designs that production systems keep returning to. Witold Litwin, who introduced linear hashing, showed that splitting a single bucket on demand sustains high load factors without global rehash pauses; Per-Åke Larson later refined the scheme with partial expansions. Ron Fagin, Jürg Nievergelt, Nick Pippenger, and H. R. Strong, the extendible hashing authors, argued for a directory that grows in bits so you only touch buckets that need attention. Storage engineers we spoke with stressed a practical point: dynamic methods are less about mythical O(1) and more about controlling the worst case during live growth. Collectively, these perspectives say the same thing. If you can amortize expansion into micro steps, you win predictable performance under churn.
What Dynamic Hashing Does Under The Hood
Dynamic hashing uses a normal hash function, then adds a growth protocol.
- Split policy decides when to split a bucket. Common triggers are bucket occupancy in records or bytes.
- Addressing decides which bits of the hash to use. You either advance a split pointer linearly through the table (linear hashing) or widen an index of hash bits when needed (extendible hashing).
- Reassignment moves the records that now belong to the new bucket. Only a fraction of a bucket moves per split, not the whole table.
Why this matters: you convert a painful N-sized rehash into many tiny moves that cap the extra latency per write.
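The addressing rule is the subtle part of that protocol. Here is a minimal sketch of linear-hashing addressing, assuming an initial table of n0 buckets and a round-based split pointer; the function name and defaults are ours, not from any particular engine.

```python
# Linear-hashing addressing, minimal sketch (illustrative names).
# A round splits buckets 0..N-1 in order; buckets before the split
# pointer have already been split this round and use one more hash bit.

def bucket_index(h: int, level: int, split_ptr: int, n0: int = 4) -> int:
    """Map a hash to a bucket given the current round (level) and split pointer."""
    i = h % (n0 * 2 ** level)            # address with this round's range
    if i < split_ptr:                    # bucket already split this round?
        i = h % (n0 * 2 ** (level + 1))  # use the doubled range instead
    return i
```

When the split pointer wraps past the last original bucket of the round, the level increments and the pointer resets to zero, so the table grows one bucket at a time rather than by doubling.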
Dynamic Families You Will Actually Use
| Method | How it grows | Memory shape | Typical sweet spot |
|---|---|---|---|
| Linear hashing (Litwin) | Split the bucket pointed to by a moving split pointer. Table size advances by one bucket at a time. | Flat array of buckets, plus optional overflow pages. | Disks or SSDs, write heavy, steady growth, simple to implement. |
| Extendible hashing (Fagin et al.) | Double the directory when a local split would overflow the current bit depth. | Directory of pointers to buckets, directory can double, buckets can be shared. | Mixed read write, skewed keys, fast split targeting, great for transactional stores. |
| Cuckoo variants with stash | Not traditionally dynamic, but can be paired with a growth step. | Two or more tables with relocation. | In-memory with tight latency SLOs, small keys, strict load < 0.95. |
Pick based on constraints. If you hate indirection, linear hashing is friendlier. If you love precise bucket targeting under skew, extendible hashing gives you that. If you want ultra tight probe bounds in RAM, consider a cuckoo hybrid that grows in chunks.
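For contrast with the flat-array shape of linear hashing, the directory mechanics of extendible hashing condense into a short sketch. This is an in-memory toy under our own names, assuming lowest-bits addressing and integer hash keys; a real engine adds overflow handling, concurrency, and persistence.

```python
# Toy extendible hashing (illustrative, not a production design).
# Buckets carry a local depth; the directory has a global depth and
# holds 2**global_depth pointers, several of which may share a bucket.

class Bucket:
    def __init__(self, local_depth, capacity=4):
        self.local_depth = local_depth
        self.capacity = capacity
        self.records = {}

class ExtendibleTable:
    def __init__(self):
        self.global_depth = 1
        self.directory = [Bucket(1), Bucket(1)]

    def _index(self, h):
        return h & ((1 << self.global_depth) - 1)   # lowest g bits

    def insert(self, h, value):
        bucket = self.directory[self._index(h)]
        if len(bucket.records) < bucket.capacity:
            bucket.records[h] = value
            return
        self._split(bucket)
        self.insert(h, value)                        # retry after the split

    def _split(self, bucket):
        if bucket.local_depth == self.global_depth:  # need a wider directory
            self.directory = self.directory * 2      # new slots alias old buckets
            self.global_depth += 1
        bucket.local_depth += 1
        sibling = Bucket(bucket.local_depth, bucket.capacity)
        bit = 1 << (bucket.local_depth - 1)
        # repoint the directory slots whose extra bit selects the sibling
        for i, b in enumerate(self.directory):
            if b is bucket and (i & bit):
                self.directory[i] = sibling
        # move only the records whose hash has the new bit set
        for h in [h for h in bucket.records if h & bit]:
            sibling.records[h] = bucket.records.pop(h)
```

Note that a split touches exactly one bucket and its new sibling; the directory doubling copies pointers, not records.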
How The Mechanics Work, With Numbers
Assume each bucket holds 8 records, average key record is 64 bytes, and you target a load factor of 0.85.
- A bucket hits 7 or 8 records, so it is a split candidate.
- In linear hashing with target 0.85, the expected fraction of records that move during a split is about one half when you advance one hash bit. That is 3 or 4 records per split.
- If you insert 1000 records starting at 0.80 load, a split fires roughly once every seven inserts, on the order of 150 splits in total, which spreads the work. The long tail stays bounded because every expensive insert only migrates a few records.
The important part is the tail. With classic doubling rehash, a single insert can trigger a move of N records. With dynamic methods, a single insert adds O(1) moves in expectation, and the worst case is capped by bucket size plus a spill read.
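The arithmetic behind the amortization claim is short enough to check directly; the numbers below are the section's own assumptions.

```python
# Back-of-the-envelope check of the amortized-split claim.
bucket_capacity = 8
target_load = 0.85
inserts = 1000

# Each split adds one bucket holding capacity * load records, so a split
# is needed roughly once per 8 * 0.85 = 6.8 inserts at steady state.
inserts_per_split = bucket_capacity * target_load
splits = inserts / inserts_per_split            # about 147 splits
moves_per_split = bucket_capacity / 2           # about half the bucket moves
extra_moves_per_insert = splits * moves_per_split / inserts
```

So the migration overhead averages well under one record move per insert, and no single insert ever moves more than one bucket's worth of records.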
Build One That Survives Production
Here is a practitioner flow that has worked in storage engines and key value services.
1) Choose addressing and hash discipline
Use a 64 bit hash with solid avalanche properties. Use the lowest g bits of the hash to index the directory or table, where g is your current global depth. Keep two functions handy: index(h, g) returns the bucket index, and split_bit(h, g) tells you whether a record belongs to the new or old bucket during a split. Salt only if you face adversarial keys.
Pro tip: record the original hash with the record header, not the raw key. You avoid recomputing hashes during splits.
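Under the lowest-bits convention, the two helpers named above are one-liners; a sketch, assuming 64 bit hashes:

```python
def index(h: int, g: int) -> int:
    """Bucket index from the lowest g bits of the hash."""
    return h & ((1 << g) - 1)

def split_bit(h: int, g: int) -> bool:
    """During a split from depth g to g + 1: True means the record
    moves to the new sibling bucket, False means it stays put."""
    return bool(h & (1 << g))
```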
2) Set a hard split threshold and a soft backpressure threshold
Hard threshold could be 8 records or 32 KB of payload per bucket. Soft threshold is 90 percent of that. When a write crosses soft, enqueue the bucket for background split. When a write crosses hard and the queue is behind, perform a foreground split before acknowledging. This keeps SLOs predictable.
Short list of good thresholds:
- Load per bucket in bytes, not only in count.
- Time since last split for that bucket.
- Overflow chain depth if you allow overflow pages.
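The soft/hard policy from this step can be expressed as a small decision function. The constants follow the article's example numbers; the function and its return values are illustrative, not any engine's API.

```python
# Soft/hard split-threshold check, sketched per the thresholds above.
HARD_RECORDS = 8
HARD_BYTES = 32 * 1024
SOFT_FACTOR = 0.90

def split_action(record_count: int, byte_size: int, queue_behind: bool) -> str:
    """Return 'none', 'background', or 'foreground' for a bucket after a write."""
    over_hard = record_count >= HARD_RECORDS or byte_size >= HARD_BYTES
    over_soft = (record_count >= SOFT_FACTOR * HARD_RECORDS
                 or byte_size >= SOFT_FACTOR * HARD_BYTES)
    if over_hard and queue_behind:
        return "foreground"   # split before acknowledging the write
    if over_soft:
        return "background"   # enqueue for the background splitter
    return "none"
```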
3) Implement the split path as an idempotent state machine
Splits get retried after crashes, so persist tiny state:
- Split target identity.
- New bucket allocation pointer.
- Migration cursor within the old bucket.
- Directory update version.
On restart, pick up at the cursor. Idempotence avoids partial double moves.
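A sketch of that persisted state plus the resume loop, with persistence stubbed out as a callback; the field and function names are illustrative.

```python
# Idempotent split state machine, sketched. A real engine would write
# SplitState to its WAL or a metadata page instead of calling persist().
from dataclasses import dataclass

@dataclass
class SplitState:
    old_bucket: int    # split target identity
    new_bucket: int    # new bucket allocation pointer
    cursor: int        # migration cursor within the old bucket
    dir_version: int   # directory update version

def resume_split(state: SplitState, old_records: list, move, persist):
    """Re-runnable: records before state.cursor were already moved."""
    for i in range(state.cursor, len(old_records)):
        move(old_records[i], state.new_bucket)   # must itself be idempotent
        state.cursor = i + 1
        persist(state)                           # crash here -> resume at i + 1
```

Because the cursor is persisted after each move, rerunning the whole procedure after a crash repeats at most one record move, which the idempotent `move` absorbs.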
4) Make lookups tolerant of in-flight splits
While a split is happening, a key can live in either old or new bucket. Gate lookups by consulting a generation bit or a directory version. If versions match, probe the primary bucket. If not, probe both buckets before returning miss. Keep the extra probe rare and cheap.
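The gated lookup sketches out as follows; `index`, `probe`, `bucket_version`, `directory_version`, and `sibling_of` are hypothetical hooks into your own engine, not a real API.

```python
# Version-gated lookup tolerant of an in-flight split (sketch).
def lookup(key_hash, table):
    primary = table.index(key_hash)
    if table.bucket_version(primary) == table.directory_version():
        return table.probe(primary, key_hash)          # fast path
    # Versions disagree: the bucket may be mid-split, so the record can
    # live in either the old or the new bucket. Probe both before a miss.
    hit = table.probe(primary, key_hash)
    if hit is not None:
        return hit
    return table.probe(table.sibling_of(primary), key_hash)
```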
5) Control skew and overflow
Hot keys and skew break pretty pictures. Add two mitigations:
- Detect buckets that split repeatedly within a short window, then give them a local depth bump if you are using extendible hashing.
- If you use linear hashing with overflow pages, cap overflow depth at one page and force a local split when depth would exceed that cap.
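One way to implement the repeated-split detector is a per-bucket timestamp window; the window length and threshold here are placeholders you would tune.

```python
# Hot-bucket detection: flag buckets that split repeatedly inside a
# short window for a local depth bump or extra local capacity.
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 60
MAX_SPLITS_IN_WINDOW = 3

class SplitTracker:
    def __init__(self):
        self.history = defaultdict(deque)   # bucket id -> split timestamps

    def record_split(self, bucket_id: int, now: float = None) -> bool:
        """Record a split; return True if the bucket now looks hot."""
        now = time.monotonic() if now is None else now
        q = self.history[bucket_id]
        q.append(now)
        while q and now - q[0] > WINDOW_SECONDS:
            q.popleft()                     # age out old splits
        return len(q) >= MAX_SPLITS_IN_WINDOW
```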
Performance Planning That You Can Defend In A Review
Suppose you run on NVMe SSD, 4 KB pages, bucket size 4 KB, average record 64 bytes, read latency 120 microseconds per random page, write latency 40 microseconds with batching, compression off.
- Lookup: one bucket read is 120 microseconds median, 240 microseconds p99 if you occasionally follow one overflow page.
- Insert without split: one bucket read plus one page write, about 160 microseconds.
- Insert with split: add reads for old bucket and writes for new bucket. With half of records moving, the extra work is one additional read and one write on average. Budget 320 to 400 microseconds.
- Throughput under growth: if 10 percent of inserts trigger a split at steady state, average insert cost rises by about 16 microseconds, which is easy to hide behind group commit.
These are back-of-the-envelope numbers, but they map closely to what operators report in OLTP systems when splits are amortized.
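The same budget, as explicit arithmetic you can paste into a review doc; all inputs are the assumptions stated above, in microseconds.

```python
# Latency budget for the NVMe scenario above (microseconds).
read_us, write_us = 120, 40

lookup_us = read_us                                    # one bucket read
insert_us = read_us + write_us                         # bucket read + page write
# a split adds about one more read and one more write on average
insert_with_split_us = insert_us + read_us + write_us
split_fraction = 0.10                                  # splits per insert at steady state
avg_insert_us = insert_us + split_fraction * (insert_with_split_us - insert_us)
```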
Pitfalls You Will Meet, And How To Avoid Them
Directory blowup in extendible hashing happens when skewed keys force deep local depths, leaving a huge directory in which many entries alias a handful of buckets. Keep the directory in memory and compress pointer runs. Use huge pages to reduce TLB pressure.
Split storms appear when a traffic spike lands on a small table. Start with a slightly conservative load factor, for example 0.75, then auto raise to 0.85 after the first expansion epoch.
Long overflow chains in linear hashing hurt tail latency. Cap overflow depth at one or two, then force local splits. If your workload has pathological skew, switch hot buckets to higher local capacity or hopscotch style neighborhoods.
Crash safety is not free. Persist the smallest state that lets you resume a split. Test by killing the process at every line in the split path.
FAQ
Is dynamic hashing always faster than a static table with periodic rehash?
No. If your workload is mostly reads with rare batch growth, a simple rehash during maintenance windows can be cheaper and simpler. Dynamic hashing shines when inserts arrive continuously and you cannot afford stop-the-world rebuilds.
Can I use it in memory without indirection penalties?
Yes. Many in-memory stores use power of two bucket arrays and linear hashing to avoid directory cache misses. If your keys are tiny and you want very tight probe bounds, a cuckoo or hopscotch hybrid may beat dynamic growth until you need to scale beyond RAM.
What load factor should I target?
Start at 0.80 to 0.85 for SSD backed stores where page reads dominate. Drop to 0.70 to 0.80 for RAM if you need lower p99. The right answer is the highest value that keeps overflow probability and split rate within your SLO.
Does dynamic hashing help with range queries?
Not directly. Hashing destroys order. If you need ranges, pair dynamic hashing for point lookups with a secondary ordered index, or choose an ordered structure for those keys.
Honest Takeaway
Dynamic hashing trades one big problem for many tiny ones. You avoid catastrophic rehash pauses by splitting a single bucket at a time, which keeps latency stable during growth. It is not magic. You still need careful thresholds, split state that survives crashes, and guardrails for skew. If you do that work, you get a hash table that keeps serving while your data changes under its feet, which is what production really asks of you.