The CAP theorem, also known as Brewer’s theorem, is a principle in computer science that a distributed data store can provide a maximum of two of the following three properties:
- Consistency (C)
- Availability (A)
- Partition tolerance (P)
Consistency
Consistency guarantees that every node in a distributed system will return the same (i.e., consistent) data. This means that every read request will return the latest state of the system from the most recent write or an error in the event of a timeout due to failure.
Availability
Availability, on the other hand, guarantees that every request will return a non-error response. However, the response may not reflect the state from the most recent write.
Partition tolerance
Partition tolerance means that the system as a whole will continue to operate despite network failures.
Implications of the CAP theorem
The limitations imposed by the CAP theorem drive the fundamental behaviors of distributed systems in the event of a network partition failure. In a failure scenario, distributed systems can choose either of the following:
- Be consistent and partition tolerant (CP) to return a consistent response, sacrificing temporary outages.
- Prioritize availability and partition tolerance (AP) to always respond to requests even if it means returning stale or conflicting responses.
It is important to note that the CAP theorem does not dictate that a distributed system must always choose CP or AP at all times. The tradeoff only comes into play during a network partition where consistency or availability is favored.
The PACELC theorem is an extension of the CAP theorem which describes trade-offs when the system is running normally in the absence of partitions.
Database designs
Due to the CAP theorem, most databases are designed to prioritize either consistency or availability depending on the use case.
Relational databases such as MySQL, PostgreSQL, and MariaDB choose consistency over availability to provide ACID guarantees. On the other hand, NoSQL databases like MongoDB or Apache Cassandra favor availability to handle high write throughputs and rely on eventual consistency for reads.
Additional resources
- The Inner Workings of Distributed Databases
- Brewer's Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web
- A Critique of the CAP Theorem
- The CAP FAQ