Eventual Consistency

Eventual Consistency

The CAP theorem (also known as Brewer’s theorem) of distributed systems says that you can have two out of these three:

  • Consistency
  • Availability
  • Partitioning

Consistency means that you have the same state across all the machines in your cluster. Availability means that all the data is always accessible and partitioning means that the system can tolerate network partitions (some machines can’t reach other machines in the cluster) without affecting the system’s operation.

It’s pretty clear that if there is a network partition and server A can’t reach server B then any update to A can’t be communicated to B until the network is repaired. That means that when a network partition happens the system can’t remain consistent. If you’re willing to sacrifice availability, then you can just reject all reads and clients will never discover the inconsistency between A and B. So, you can have C/P???a system that can remain consistent (from the user’s point of view) and can tolerate network partitioning, but will sometimes be unavailable (in particular when there is a partition). This can be useful is certain situations, such as financial transactions where it is better to be unavailable than to break consistency.

If you can somehow guarantee that there will be no network partitions by employing massive networking redundancy, then you can have C/A. Every change will propagate to all servers and the system will always be available. It is very difficult to build such systems in practice, but it’s very easy to design systems that rely on uninterrupted connectivity.

Finally, if you’re willing to sacrifice perfect consistency, you can build A/P systems???always available and can tolerate network partitioning, but the data on different servers in the cluster might not always agree. This configuration is very common for some aspects of Web-based systems. The idea is that small temporary inconsistencies are fine and conflicts can be resolved later. For example, if you search Google for the same term from two different machines, in two different geographic locations, it is possible that you’ll receive different results. Actually, if you run the same search twice (and clear your cache) you might get different results. But, this is not a problem for Google???or for its users. Google doesn’t guarantee that there is a “true” answer to a search. It is very likely that the top results will be identical because it takes a lot of effort to change the rank. All the servers (or caching systems) constantly play catch up with the latest and greatest.

The same concept applies to something like the comments on a Facebook post. If you comment, then one of your friends may see it immediately and another friend may see it a little while later. There is no real-time requirement.

In general, distributed systems that are designed for eventual consistency typically still provision enough capacity and redundancy to be pretty consistent under normal operating conditions, but accept that 1% or 0.1% of actions/messages might be delayed.

Share the Post:
Heading photo, Metadata.

What is Metadata?

What is metadata? Well, It’s an odd concept to wrap your head around. Metadata is essentially the secondary layer of data that tracks details about the “regular” data. The regular

XDR solutions

The Benefits of Using XDR Solutions

Cybercriminals constantly adapt their strategies, developing newer, more powerful, and intelligent ways to attack your network. Since security professionals must innovate as well, more conventional endpoint detection solutions have evolved

AI is revolutionizing fraud detection

How AI is Revolutionizing Fraud Detection

Artificial intelligence – commonly known as AI – means a form of technology with multiple uses. As a result, it has become extremely valuable to a number of businesses across

AI innovation

Companies Leading AI Innovation in 2023

Artificial intelligence (AI) has been transforming industries and revolutionizing business operations. AI’s potential to enhance efficiency and productivity has become crucial to many businesses. As we move into 2023, several

data fivetran pricing

Fivetran Pricing Explained

One of the biggest trends of the 21st century is the massive surge in analytics. Analytics is the process of utilizing data to drive future decision-making. With so much of

kubernetes logging

Kubernetes Logging: What You Need to Know

Kubernetes from Google is one of the most popular open-source and free container management solutions made to make managing and deploying applications easier. It has a solid architecture that makes

ransomware cyber attack

Why Is Ransomware Such a Major Threat?

One of the most significant cyber threats faced by modern organizations is a ransomware attack. Ransomware attacks have grown in both sophistication and frequency over the past few years, forcing