146: Distributed Computing: It Happened When?

Computers rely on clocks. They coordinate everything. But the clocks on different computers can be slightly off from each other.

This is normally not a problem. Until you need to merge rapid events from multiple computers and put them in order. You have some options and I’ll explain those in this episode.

Let’s say that you have a file backup service that’s busy protecting your documents by copying your files online. This is a form of distributed computing and I’ll describe a real but unlikely scenario involving timestamps. I say unlikely because you really have to go out of your way to do this but it is possible. And because it’s possible, it represents the type of design decisions you need to make as a software developer. Dates and times are tricky enough on a single computer. Adding multiple computers takes the problem to a whole new level.

Listen to the full episode for more insights and tips for dealing with dates and times across multiple computers. Or you can read the full transcript below.

Transcript

Clocks on different computers can be slightly off from each other. This is normally not a problem. Until you need to merge rapid events from multiple computers and put them in order. You have some options and I’ll explain those in this episode.

Let’s first think about cases where this is no problem at all. If you create a document on your computer, then it gets a creation date and a modified date. After all, creating a document means it also gets modified. Then anytime you make additional changes, the modified date changes. How precise do you need these dates to be? Normally, I’d say a date and time down to the nearest minute is good enough. Probably more than enough.

Dates and times are a lot more complicated than they might first appear. I devoted a full five episodes to explaining them. Make sure to listen to episodes 118 through 122 for more information. And episode 117, about the decimal data type, talks about precision.

The reason I say that a file date and time down to the nearest minute is good enough is that usually our need for precise timing information decreases over time. When you first save changes to your document and go back to look at the file information, it’s important for it to show that it was just updated a minute ago. Otherwise, you might think something was wrong and that your changes didn’t get saved. A week later, though, maybe the hour is all you need to know. Maybe you want to check whether you last updated the document before or after an important meeting, or whether it was last changed in the morning or the afternoon. A few months later, you’re probably only interested in the date.

That’s what we need. Computers are different. A computer needs precise dates and times always. So a computer will record when something happened with a lot more precision than it usually displays to us.
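As a rough illustration of that gap between what gets stored and what gets shown, here is a minimal Python sketch. The `display` function, its age thresholds, and the sample timestamp are all made up for this example; real file systems and file browsers make their own choices.

```python
from datetime import datetime, timezone

# A hypothetical stored timestamp, kept with microsecond precision.
stored = datetime(2024, 3, 5, 13, 7, 42, 123456, tzinfo=timezone.utc)

def display(ts: datetime, now: datetime) -> str:
    """Show less detail as the timestamp gets older (thresholds are assumptions)."""
    age = now - ts
    if age.days < 1:
        return ts.strftime("%Y-%m-%d %H:%M")   # recent: down to the minute
    if age.days < 30:
        return ts.strftime("%Y-%m-%d %H:00")   # older: just the hour
    return ts.strftime("%Y-%m-%d")             # much older: date only

# What the computer keeps versus what a person might see a few minutes later.
print(stored.isoformat())
print(display(stored, datetime(2024, 3, 5, 13, 10, tzinfo=timezone.utc)))
```

The stored value never loses precision. Only the text shown to a person gets coarser.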

Now, let’s say that you have a file backup service that’s busy protecting your documents by copying your files online. This is a form of distributed computing and I’ll describe a real but unlikely scenario involving timestamps. I say unlikely because you really have to go out of your way to do this but it is possible. And because it’s possible, it represents the type of design decisions you need to make as a software developer.

Imagine you have two computers both connected to the same backup service, and the system clock on one of them is about five minutes ahead. That’s not very much and within reason. Now, you go to the computer with the later time and modify a document. Let’s say you modify the document right at 1 pm. The online file gets updated. Then you go to your other computer, the one with the earlier time, and modify the same document with a different change at 12:56 pm. It took you a minute to switch computers. The question becomes, which document should be the latest version online? Should you use the document with the newer timestamp that was actually modified first, or the document with the older timestamp that was actually modified most recently?

There’s a case to be made for both sides of this and there’s no absolutely correct answer. The main thing is that you need to be aware of the issue and make a decision. Dates and times are tricky enough on a single computer. Adding multiple computers takes the problem to a whole new level.
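The two-computer backup scenario can be sketched in a few lines of Python. The `Upload` class, the computer names, and the `arrival_order` field are hypothetical and exist only to show how the two policies disagree. This is not how any particular backup service works.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class Upload:
    computer: str
    client_time: datetime   # what the uploading computer's clock claimed
    arrival_order: int      # order in which the service received the upload

uploads = [
    # Edited first, on the computer whose clock runs five minutes ahead.
    Upload("fast-clock", datetime(2024, 3, 5, 13, 0), arrival_order=1),
    # Edited a minute later, on the computer with the earlier clock.
    Upload("slow-clock", datetime(2024, 3, 5, 12, 56), arrival_order=2),
]

# Policy 1: trust the timestamps reported by each computer.
by_client_time = max(uploads, key=lambda u: u.client_time)
# Policy 2: trust the order in which the uploads reached the service.
by_arrival = max(uploads, key=lambda u: u.arrival_order)

print("latest by client clock: ", by_client_time.computer)
print("latest by arrival order:", by_arrival.computer)
```

The two policies pick different winners, and that is the design decision you have to make.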

There are other ways to deal with computers with different clocks that I’ll explain right after this message from our sponsor.

Instead of the previous example, let’s say you again have two computers with clocks five minutes apart. This time, you sign into your bank account on the computer whose clock reads 1 pm and try withdrawing a hundred dollars from your checking account that only has fifty dollars in it. This should fail, and it should fail even if you then use your other computer, which thinks the time is 12:56 pm, to transfer another fifty dollars from your savings account into your checking account. The transfer will succeed. That is assuming the money is available in your savings account. But the withdrawal has already failed and won’t be tried again. What makes this example different? What makes this example have a clear and correct answer while the previous example was not so clear?

The difference is that banks don’t care what your computer thinks the time is. They have their own idea of time that they trust.

You can choose to follow this design as well. Anytime you have a central server computer that receives all requests, then you can use that server to record the order and time of each event. It might look a little strange from the customer’s point of view. Imagine you transferred fifty dollars at 12:55 pm and then the screen refreshes to show the transfer took place at 1 pm. You might think, how could you possibly do something before doing it?
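Here is a minimal sketch of that central-server idea in Python. The `CentralServer` class, its `receive` method, and the sample times are assumptions made for illustration. The clock is passed in so the example runs with predictable values. A real server would read its own system clock.

```python
from datetime import datetime, timezone
from typing import Callable

def utc_now() -> datetime:
    return datetime.now(timezone.utc)

class CentralServer:
    """Stamps every request with the server's own clock (hypothetical design)."""

    def __init__(self, clock: Callable[[], datetime] = utc_now) -> None:
        self._clock = clock
        self.log: list = []

    def receive(self, action: str, client_time: datetime) -> dict:
        entry = {
            "action": action,
            "client_time": client_time,    # kept for reference only
            "server_time": self._clock(),  # the time that actually counts
        }
        self.log.append(entry)
        return entry

# Made-up server clock readings so the output is predictable.
ticks = iter([
    datetime(2024, 3, 5, 13, 0, tzinfo=timezone.utc),
    datetime(2024, 3, 5, 13, 1, tzinfo=timezone.utc),
])
server = CentralServer(clock=lambda: next(ticks))

# The withdrawal comes from the computer that reads 1 pm,
# the transfer from the computer that reads 12:56 pm.
server.receive("withdraw 100", datetime(2024, 3, 5, 13, 0, tzinfo=timezone.utc))
server.receive("transfer 50", datetime(2024, 3, 5, 12, 56, tzinfo=timezone.utc))

order = [e["action"] for e in sorted(server.log, key=lambda e: e["server_time"])]
print(order)
```

Sorting by `client_time` instead would put the transfer first, which is not what happened.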

This can happen because we’re dealing with timestamps that are close to each other and the results can be visible almost instantly. But the same process has been common practice for hundreds of years.

If you’ve ever filed official papers with a government office, then you know that you can sometimes wait days or weeks for the papers to be filed and processed. What happens when this is done? Usually, the papers will get stamped with an official date just like the bank added its own time of 1 pm. This date has no dependency on the date you submitted the papers. Yet, it’s the only date that really counts. This is another example of where the date and time of the customer has no influence on the resulting date and time. When dealing with longer and more manual processes like this, the precision of modern computers doesn’t matter. I mention it to explain that how we handle timestamps hasn’t changed much over the years in any solution that needs to handle activities that start from multiple places.

Here’s where things can get interesting. Computers do enable events to be processed much faster than ever before. Just when you think you have a solution to always use the server time instead of an unreliable customer’s computer time, you run into another problem.

Imagine that the bank grows and gets more and more customers until eventually the single server handling all the online banking requests can no longer keep up. You need to add another server. But this puts you right back in the same problem as before, only this time there is no clear answer. You see, it’s possible and even likely that the clocks on the two servers will be slightly off from each other. At some point, you’re going to hit the same or a similar problem where a transaction handled by one of the servers needs to be coordinated with a different transaction handled by another server.
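To see how small the difference between two servers can be and still cause trouble, here is a Python sketch with invented numbers. The 300 millisecond skew, the event names, and the `true_time` field are all assumptions. In real life nobody gets to see the true time, which is the whole problem.

```python
from datetime import datetime, timedelta

start = datetime(2024, 3, 5, 13, 0, 0)
skew_b = timedelta(milliseconds=-300)   # assume server B's clock runs 300 ms behind

events = [
    # Server A handles a deposit first, and its clock happens to be right.
    {"name": "deposit", "server": "A",
     "true_time": start,
     "stamp": start},
    # Server B handles a withdrawal 100 ms later, but stamps it with its slow clock.
    {"name": "withdrawal", "server": "B",
     "true_time": start + timedelta(milliseconds=100),
     "stamp": start + timedelta(milliseconds=100) + skew_b},
]

merged_by_stamp = [e["name"] for e in sorted(events, key=lambda e: e["stamp"])]
actual_order = [e["name"] for e in sorted(events, key=lambda e: e["true_time"])]

print("order by server timestamps:", merged_by_stamp)
print("order that really happened:", actual_order)
```

Merging the two logs by timestamp puts the withdrawal ahead of the deposit, even though it happened afterward.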

One possible solution would be to use a single server computer that is responsible for just one task: handing out timestamps. The reason it should be limited in what it does is that it needs to keep up with a large and growing demand for those timestamps. So it should be really fast at that job. It also acts as a single source for time that all the other computers trust.

Think of it like this. When you visit a busy grocery store to buy some cheese at the deli, do you wait in a line? Well, maybe. Just like you can wait in line at the bank. But a smart grocery store and a friendlier bank will instead provide a ticket dispenser. All you have to do is pull out your ticket that has a number on it. Then wait for your number to be called. It doesn’t matter what time you feel like it was when you pulled your ticket. The ticket dispenser acts like a central timekeeping device. It’s fast. And it can do this without even using a clock.

This implies that your design might be better served if you stop using timestamps completely when coordinating activities. You could still record the time of the event. But the time becomes more of a general guideline instead of the final deciding factor in which event happened first.
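The ticket dispenser idea can be sketched in Python like this. The `TicketDispenser` class and the `record` helper are hypothetical names, and this version lives inside a single program. A real system would need a shared service that every server can reach, which is a bigger design than this sketch shows.

```python
import itertools
import threading
from datetime import datetime, timezone

class TicketDispenser:
    """Hands out strictly increasing ticket numbers. No clock involved."""

    def __init__(self) -> None:
        self._numbers = itertools.count(1)
        self._lock = threading.Lock()   # one ticket at a time, even with many threads

    def take(self) -> int:
        with self._lock:
            return next(self._numbers)

dispenser = TicketDispenser()

def record(name: str) -> dict:
    return {
        "ticket": dispenser.take(),                 # decides the order
        "name": name,
        "recorded_at": datetime.now(timezone.utc),  # a guideline only
    }

first = record("withdraw 100")
second = record("transfer 50")

# Order events by ticket number, never by the recorded time.
ordered = sorted([second, first], key=lambda e: e["ticket"])
print([(e["ticket"], e["name"]) for e in ordered])
```

The recorded time is still there if somebody wants to look at it, but the ticket number is what settles which event came first.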
