Consistency, Availability, and Partition Tolerance are three aspects of distributed computing. And a fourth not included in the theorem is Latency.
The original CAP theorem said that you could only be guaranteed to get two aspects between the three options, consistency, availability, and partition tolerance.
But here’s where it went wrong. You see, consistency and availability are up to you to decide how to handle. And they don’t even have to be all of one and zero of the other. They can be combined. But a network failure is outside of your ability to control. Sure CAP says you can design around this too. But CAP implies that you could choose to design a system that’s consistent and available. Those are two of the original three choices. This ignores the fact that networks will fail.
Modern distributed systems can no longer ignore partition tolerance. As designs scale out more by using more computers across increasing distances, network disruptions become more frequent. Because partition tolerance is becoming a necessary choice, you’re left with the choice of how you want to deal with the situation when you don’t have a complete system.
If you choose consistency over availability, then when a request for some information is made you can either wait for the network to come back or return an error. Or maybe you wait for a while and then return an error. Either way, the computer making the request won’t wait forever and will timeout eventually on its own. Consistency says that you need to make sure you have the most up-to-date information before replying. Even if you have some information and could reply with what you have, consistency says you double-check first. And since you can’t do that, then an error is best.
Now if you choose availability, then you reply with the information you have. Even if there might be more up-to-date information that you just can’t get to right at that moment. This means that the information should be available. You can’t just make up a response. If you reply with an answer, then it should at least have a chance to be the right answer. If a computer requests some information that a server simply does not have and can’t get to at all because of the network outage, then choosing availability means that you should at least return right away with an error.
Listen to the full episode to hear more including how latency fits into all this.