Knowing where to send information through IP is not enough. You have to know how to reliably send information too.

What is TCP and why is it needed?

Imagine for a moment that you need to send a very large package by postal mail. The package is just a bunch of papers so it’s not like you’re trying to send a single bulky item. Even so, let’s assume that the package is going to a different country that has unreliable service, especially for big items. And let’s also assume that the stamps are free. How would you send this?

Well, you could put everything in one big box and just send it. All those papers are an important document, so you arrange for the person you’re sending it to to send you a reply. The reply is very small, so it has a good chance of making it back, but even that is not guaranteed. Okay, you send the package and get no response. Did it arrive? Or did the reply get lost? You don’t know. I’m going to call the reply an acknowledgement, or just ack for short.

Since you got no reply, you decide to send the whole thing again. And again, you get no ack.

What you need is reliable delivery of information, and that’s the main purpose of TCP. You need to be able to break the information into small pieces, which is easy for a document with multiple pages. It’s also easy for electronic information, because it doesn’t matter what you’re sending. It could be a single large presentation, a picture, or a document. All of these can be, and actually need to be, translated into a series of ones and zeros anyway. We can just divide these bits into smaller groups, just like how a line of people waiting outside a store can be let in a few at a time.

Make sure to listen to the full episode for how TCP relates to the problem of sending many pages of a document through an unreliable mail service. Or you can also read the full transcript below. There are several problems that need to be solved including missing packets, packets arriving out of order, duplicate packets, flow control, and error detection.

Transcript

In the last episode, I explained IP, or Internet Protocol. Along with this, you’ll commonly need to work with TCP, or Transmission Control Protocol. Together, these two are referred to as TCP/IP which is pronounced as just TCPIP.

But what is TCP and why is it needed? Imagine for a moment that you need to send a very large package by postal mail. The package is just a bunch of papers so it’s not like you’re trying to send a single bulky item. Even so, let’s assume that the package is going to a different country that has unreliable service, especially for big items. And let’s also assume that the stamps are free. How would you send this?

Well, you could put everything in one big box and just send it. All those papers are an important document, so you arrange for the person you’re sending it to to send you a reply. The reply is very small, so it has a good chance of making it back, but even that is not guaranteed.

Okay, you send the package and get no response. Did it arrive? Or did the reply get lost? You don’t know. I’m going to call the reply an acknowledgement or just ack for short.

Since you got no reply, you decide to send the whole thing again. And again, you get no ack.

You know the address of the person you’re trying to send the document to. That’s similar to an IP address. You realize that you need a better system for sending large documents. And that’s the purpose of TCP.

Now TCP is actually rather complicated. It’s more than I can explain in this podcast without putting you to sleep. So instead, I’m going to continue using the mail example to explain the main concepts.

What you need is reliable delivery of information, and that’s the main purpose of TCP. You need to be able to break the information into small pieces, which is easy for a document with multiple pages. It’s also easy for electronic information, because it doesn’t matter what you’re sending. It could be a single large presentation, a picture, or a document. All of these can be, and actually need to be, translated into a series of ones and zeros anyway. We can just divide these bits into smaller groups, just like how a line of people waiting outside a store can be let in a few at a time.
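Here is a minimal sketch in Python of that idea of dividing the bits into smaller groups. The function name `split_into_chunks` and the chunk size of 10 bytes are made up for illustration. Real TCP picks its own sizes based on the network.

```python
def split_into_chunks(data: bytes, chunk_size: int) -> list[bytes]:
    """Cut data into consecutive pieces of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


# The "document" can be anything, since it is all just bytes in the end.
document = b"page one. page two. page three."
chunks = split_into_chunks(document, 10)

for number, chunk in enumerate(chunks):
    print(number, chunk)

# Gluing the pieces back together in order gives the original document.
assert b"".join(chunks) == document
```

Only the last piece can be shorter than the others, and nothing about the splitting depends on what the bytes mean.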

Back to the example. Instead of sending the entire thing in one big box that’ll probably get lost anyway, you just take a few pages and put them into a small envelope. It takes a while for the envelope to arrive and then it takes a while for the ack to come back. After a week, you finally get the ack you’ve been waiting for. So you think to yourself, “This is great, I’m making progress.”

And you are making progress. You just sent the first three pages and got confirmation that they were received. The problem is that you’ve got thousands of pages that need to be sent. At this rate, you’ll have to ask ten generations of your kids to complete your task for you.

Then you get the idea to send more than one envelope at a time. So the next day, you send ten envelopes and wait for the replies. You get back nine acks a week later. Either one of your envelopes didn’t make it or one of the acks didn’t. But which one?

At the same moment, you also realize that the document you’re trying to send has no page numbers. Even if you got confirmation that all ten envelopes had arrived, there would be no way for the other person to put them together in the right order. Have you ever seen a postal worker putting mail in any kind of order?

You think for a while and come up with a plan. All you need to do is number the envelopes and make sure that the other person numbers the ack messages with the corresponding numbers. You’ll then send envelopes in order and keep track of the numbered acks. If you discover that some ack was not received after a reasonable amount of time, then just send those pages again with the same number that was originally used and continue waiting for an ack.
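The plan above can be sketched as a small simulation. This is only a sketch under my own assumptions: the function name `send_reliably`, the 30 percent loss rate, and the idea of working in "rounds" are all invented for the example, and the unreliable mail service is faked with a random number generator. Real TCP numbers individual bytes instead of whole envelopes and uses timers instead of rounds, but the resend-until-acked idea is the same.

```python
import random


def send_reliably(envelopes, loss_rate=0.3, seed=1, max_rounds=1000):
    """Resend numbered envelopes until every one has been acknowledged.

    Returns (what the receiver ended up with, how many rounds it took).
    """
    if not 0 <= loss_rate < 1:
        raise ValueError("loss_rate must be in [0, 1)")
    rng = random.Random(seed)              # seeded so the run is repeatable
    received = {}                          # receiver side: number -> contents
    unacked = set(range(len(envelopes)))   # sender side: still waiting on these
    rounds = 0
    while unacked:
        rounds += 1
        if rounds > max_rounds:
            raise RuntimeError("gave up after too many rounds")
        acked_this_round = set()
        for number in sorted(unacked):
            if rng.random() < loss_rate:
                continue                   # the envelope was lost on the way
            # A repeat of an envelope that already arrived is harmless,
            # because the number tells the receiver it is the same one.
            received[number] = envelopes[number]
            if rng.random() < loss_rate:
                continue                   # the ack was lost on the way back
            acked_this_round.add(number)
        unacked -= acked_this_round        # only unacked envelopes get resent
    return received, rounds


pages = ["pages 1-3", "pages 4-6", "pages 7-9", "pages 10-12"]
delivered, rounds = send_reliably(pages)
print(f"all {len(delivered)} envelopes delivered after {rounds} round(s)")
```

Notice that the sender never finds out whether it was the envelope or the ack that got lost. It doesn't need to, because resending with the same number is safe either way.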

This is great. You can send smaller envelopes that have a higher chance of arriving, keep track of lost envelopes, and try again. You even work out how many pages to put in each envelope to get a good chance of success without requiring too much work. There is a certain amount of work you need to do to send each envelope, after all, and it doesn’t matter if the envelope contains five pages or ten. More pages per envelope means less work for you. But with too many pages, the envelopes start getting lost. It’s a balance.

And once you get all this worked out, you decide to hire people to help you stuff envelopes. You hire ten workers before you run into another problem.

You realize now that your ten workers are able to send so many envelopes that the other person can’t keep up. The envelopes are arriving but are getting backed up and the replies don’t get sent right away. Since you don’t get the acks, you and your workers start resending envelopes. This causes the other person to get even more behind.

What you need is a way for the receiver to specify, when acknowledging receipt of envelopes, how many more envelopes can be comfortably sent.

This gives you the information to know when you can send more envelopes or when you need to slow down a bit.
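Here is a rough sketch of that idea in Python. Everything in it is an assumption made for illustration: the `Receiver` class, its desk that holds four envelopes, and the notion of a "tick" of time. In real TCP the number the receiver reports back is called the receive window, and it is counted in bytes, not envelopes.

```python
from collections import deque


class Receiver:
    """A receiver with limited desk space who opens a few envelopes per tick."""

    def __init__(self, capacity, process_per_tick):
        if capacity <= 0 or process_per_tick <= 0:
            raise ValueError("capacity and process_per_tick must be positive")
        self.capacity = capacity
        self.process_per_tick = process_per_tick
        self.buffer = deque()    # envelopes that arrived but are not opened yet
        self.processed = []      # envelopes that have been opened, in order

    def window(self):
        """How many more envelopes can comfortably be sent right now."""
        return self.capacity - len(self.buffer)

    def accept(self, envelope):
        if self.window() <= 0:
            raise OverflowError("receiver was sent more than it asked for")
        self.buffer.append(envelope)

    def tick(self):
        for _ in range(min(self.process_per_tick, len(self.buffer))):
            self.processed.append(self.buffer.popleft())


def send_with_flow_control(envelopes, receiver):
    """Send everything, but never more than the receiver says it can take."""
    pending = deque(envelopes)
    ticks = 0
    while pending or receiver.buffer:
        allowed = receiver.window()          # reported along with the acks
        for _ in range(min(allowed, len(pending))):
            receiver.accept(pending.popleft())
        receiver.tick()                      # the receiver catches up a little
        ticks += 1
    return ticks


receiver = Receiver(capacity=4, process_per_tick=2)
ticks = send_with_flow_control(list(range(10)), receiver)
print(f"sent 10 envelopes in {ticks} ticks without overrunning the receiver")
```

The sender with ten workers could go much faster, but it only ever sends as much as the receiver has room for, so nothing piles up and nothing needs to be resent.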

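One final bit of this example that I haven’t mentioned yet is what happens when an envelope gets damaged along the way. It arrives, but it’s obviously been damaged and can’t be trusted. TCP takes care of this too through error checking. Each packet carries a checksum, and a packet that fails the check is thrown away, so it never gets acknowledged and eventually gets sent again.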
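To make the error checking a bit more concrete, here is a sketch of the kind of 16-bit checksum that TCP uses, following the method described in RFC 1071. The function names are my own, and this version only covers the data you hand it. Real TCP also runs the checksum over its header and a few fields from IP, which I'm leaving out.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                       # fold any carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def looks_undamaged(data: bytes, checksum: int) -> bool:
    """The receiver recomputes the checksum and compares it to the one sent."""
    return internet_checksum(data) == checksum


envelope = b"pages 1-3"
stamp = internet_checksum(envelope)          # the sender writes this on the envelope

print(looks_undamaged(envelope, stamp))      # arrived intact
print(looks_undamaged(b"pages 1-8", stamp))  # one character got changed on the way
```

A checksum this simple catches common damage such as a flipped bit, but it is not a guarantee. Some combinations of errors can cancel each other out and slip through.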

In TCP, the envelopes are called packets, or more formally, segments. Each packet contains a portion of the overall data that needs to be sent. The packets are usually sent in order but can arrive out of order. The receiver can put everything back in the proper order and acknowledge what it received. It might sometimes receive a packet that it’s already received and acknowledged. That’s okay. It just means the sender never saw the earlier ack, perhaps because that ack was lost, so the receiver ignores the duplicate and acknowledges again.
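The receiver's side of this can be sketched in a few lines of Python. The function name `reassemble` and the idea of numbering whole packets starting from zero are assumptions for the example. Real TCP numbers the bytes themselves, and its acks say "I have everything up to here" instead of naming one packet at a time.

```python
def reassemble(arrivals):
    """Rebuild the data from (number, payload) pairs listed in arrival order.

    Returns (data, acks), where acks lists the number acknowledged for each arrival.
    """
    seen = {}
    acks = []
    for number, payload in arrivals:
        if number not in seen:
            seen[number] = payload       # first copy of this packet: keep it
        # Acknowledge duplicates too, since the earlier ack may have been lost.
        acks.append(number)
    missing = [n for n in range(max(seen, default=-1) + 1) if n not in seen]
    if missing:
        raise ValueError(f"still waiting for packets {missing}")
    data = b"".join(seen[n] for n in sorted(seen))
    return data, acks


# Packet 1 shows up first and then shows up again as a duplicate.
arrivals = [(1, b"world"), (0, b"hello "), (1, b"world"), (2, b"!")]
data, acks = reassemble(arrivals)
print(data)
print(acks)
```

The numbers do all the work here. They let the receiver sort the pieces, spot a repeat, and notice when something is still missing.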

And TCP has the ability to control the flow of packets. A server computer may be very fast but your own computer may not be able to keep up. Or the receiver may simply have a slower network connection. Separately from protecting the receiver, TCP also has the ability to avoid congesting the network in between.

When you put all this together, you get a reliable, ordered, and error-free delivery of your information. And the best part is that you don’t have to worry about all the flow control or retries. This is all handled by TCP.
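To show how little of this your own code has to deal with, here is a small sketch using Python's standard `socket` module. It starts a tiny server on your own machine, sends it some bytes over TCP, and reads the same bytes back. The function names and the choice of the loopback address `127.0.0.1` are just for the example. Notice that there are no packet numbers, acks, or retries anywhere in it.

```python
import socket
import threading


def serve_one_client(server_sock):
    """Accept one connection, read until the client is done, then echo it all back."""
    conn, _ = server_sock.accept()
    with conn:
        pieces = []
        while True:
            data = conn.recv(4096)
            if not data:                     # empty bytes: the client finished sending
                break
            pieces.append(data)
        conn.sendall(b"".join(pieces))


def echo_roundtrip(message: bytes) -> bytes:
    """Send message to a local echo server over TCP and return what comes back."""
    # Port 0 asks the operating system to pick any free port.
    with socket.create_server(("127.0.0.1", 0)) as server_sock:
        port = server_sock.getsockname()[1]
        worker = threading.Thread(target=serve_one_client, args=(server_sock,), daemon=True)
        worker.start()
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(message)
            client.shutdown(socket.SHUT_WR)  # tell the server we are done sending
            pieces = []
            while True:
                data = client.recv(4096)
                if not data:
                    break
                pieces.append(data)
        worker.join()
    return b"".join(pieces)


message = b"page " * 10_000                  # 50,000 bytes, far more than one packet
reply = echo_roundtrip(message)
print(len(reply), reply == message)
```

The one thing you do still have to handle is that `recv` hands you the data in whatever sized pieces happen to be ready, which is why both sides read in a loop. TCP promises the bytes arrive in order, not that they arrive in the same groupings you sent them in.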