How does one application communicate with another application?

So far, in the last several episodes, I’ve explained how to identify and find computers on a network, how to establish connections, how to think about communication in terms of seven layers, and a few different protocols. There are many more protocols. And there are many different types of networks not based on the internet protocol. Many of the concepts still apply and once you understand all of these things, you should be able to understand completely different ways to transport information between computers.

But there’s one final piece that I haven’t described yet. I’ve never actually seen or heard of a computer itself that wants to talk to another computer. It’s always software running on a computer that does the communicating. And computers can run many different programs at once.

Each running program is called a process. So what we really need to understand is how one process can communicate with another process. That’s the topic of today’s episode.

Specifically, this episode describes how to communicate using sockets over an IP network. Since computers can run multiple processes, just routing communication packets to the right computer isn’t enough. We need some way to specify what process should receive the information. And that’s where ports come in. Think of it like this. When you send a letter to someone, you don’t just put the address, you also put a name on the envelope. The post office doesn’t really need the name to deliver the letter. Sure, they might check it just to make sure. But really, all the post office needs is the address. In the same way, the IP address is all that’s needed to deliver a packet to the right computer. But once it reaches the destination, the port acts like the name. The port lets the computer know which process is interested in receiving the data.

Sockets are resources provided by the operating system or drivers installed as part of the operating system. You may have a library called sockets or a library that uses something called sockets. This is a common name for network programming over an IP network. A socket is a concept that represents a communication channel with another computer.

Make sure to listen to the full episode to learn how to use sockets and ports when writing networking software. And if you subscribe to this podcast in iTunes or another podcast directory, then you’ll get future episodes delivered automatically. You can also read the full transcript below.

Transcript

So far, in the last several episodes, I’ve explained how to identify and find computers on a network, how to establish connections, how to think about communication in terms of seven layers, and a few different protocols. There are many more protocols than I described here. And there are many different types of networks not based on the internet protocol. Many of the concepts still apply and once you understand all of these things, you should be able to understand completely different ways to transport information between computers.

But there’s one final piece that I haven’t described yet. While it’s great that you know all of this, I’ve never actually seen or heard of a computer itself that wants to talk to another computer. It’s always software running on a computer that does the communicating. And computers can run many different programs at once.

Each running program is called a process or sometimes an application. I tend to think of an application as the software itself. Like what you buy from the store. And a process is the running instance of an application. Sometimes, I might mix these terms up. But in general, a process is not something you buy. A process is software that’s running on your computer. So what we really need to understand is how one process can communicate with another process. That’s the topic of today’s episode.

Specifically, I’m going to describe how to communicate using sockets over an IP network. Since computers can run multiple processes, just routing communication packets to the right computer isn’t enough. We need some way to specify what process should receive the information. And that’s where ports come in. Think of it like this. When you send a letter to someone, you don’t just put the address, you also put a name on the envelope. The post office doesn’t really need the name to deliver the letter. Sure, they might check it just to make sure. But really, all the post office needs is the address. In the same way, the IP address is all that’s needed to deliver a packet to the right computer. But once it reaches the destination, the port acts like the name. The port lets the computer know which process is interested in receiving the data.

A port is a 16-bit number with possible values from 0 to 65535. There are well-known port numbers from 0 to 1023 for common services such as email, web sites, transferring files and news, getting the time, and chatting. Other port numbers from 1024 to 49151 can be registered with the Internet Assigned Numbers Authority for specific purposes. And any port number above this, from 49152 to 65535, is available for private or temporary use.
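To make those three ranges concrete, here’s a minimal sketch in Python. The function name port_range and the labels it returns are my own, chosen just for illustration. The boundaries are the ones I just described.

```python
def port_range(port):
    """Return the name of the range that a port number falls in."""
    if not 0 <= port <= 65535:
        # A port is a 16-bit number, so nothing outside 0..65535 is valid.
        raise ValueError(f"port must be between 0 and 65535, got {port}")
    if port <= 1023:
        return "well-known"
    if port <= 49151:
        return "registered"
    return "private or temporary"


for port in (80, 443, 8080, 49152):
    print(port, port_range(port))
```

Ports 80 and 443 come out as well-known, 8080 as registered, and 49152 as private or temporary.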

If you’ve been wondering what the difference is between a server computer and a regular computer, I can explain that now. A server is nothing more than a computer running at least one process that’s listening for data packets to arrive on certain ports and ready to serve responses requested by other computers. A typical home computer also has processes that are responding to communication requests but it’s not usually called a server unless it’s doing something more important like serving web pages or files.

Firewalls can sometimes block network traffic and you may need to open a port or a range of ports in order for an application to work properly. A firewall is a process that examines data transmitted and received and either allows it to proceed or throws it away. It normally stops incoming data for closed ports unless it first notices outgoing packets. What I mean is this, let’s say your firewall detects a data packet arriving from some remote computer. This is like an unwelcome guest. The firewall will just throw the packet away. If instead, you initiate contact first with the remote computer, maybe by visiting a website, then the firewall will notice the outgoing packet and remember where it was sent. Then if it gets an incoming packet from the same remote computer, it will be allowed to pass through.
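If it helps to see that rule written down, here’s a toy model in Python. This is only a sketch of the idea and not how any real firewall is implemented. The class name ToyFirewall and its methods are made up for this example, the IP addresses are reserved documentation addresses, and a real firewall would also track ports, protocols, and timeouts.

```python
class ToyFirewall:
    """Toy model only: remembers who we contacted and which ports are open."""

    def __init__(self, open_ports=()):
        self.open_ports = set(open_ports)  # ports that always accept incoming data
        self.contacted = set()             # remote addresses we sent packets to first

    def outgoing(self, remote_ip):
        """Notice an outgoing packet and remember where it was sent."""
        self.contacted.add(remote_ip)

    def allows_incoming(self, remote_ip, local_port):
        """Allow a packet for an open port or from a remote we contacted first."""
        return local_port in self.open_ports or remote_ip in self.contacted


firewall = ToyFirewall()
print(firewall.allows_incoming("203.0.113.5", 50000))  # unwelcome guest
firewall.outgoing("203.0.113.5")                       # we initiate contact first
print(firewall.allows_incoming("203.0.113.5", 50000))  # now the reply gets through
```

The first check prints False and the second prints True, because the outgoing packet was noticed in between.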

Server computers will have ports opened in their firewall so they can accept incoming data packets anytime. A normal computer will usually have only a few ports open. Another name for normal computers in this case is a client. You’ll sometimes hear about communication designs called client/server. This is a design where a server computer will accept requests from many client computers and respond.

If you install and configure software on your home computer to serve web pages, then you’ve turned your computer into a web server. It can still be a client if you use it to access other servers.

The reason I’m explaining all this about clients and servers is because it affects how you write code to communicate. You may have a library called sockets or a library that uses something called sockets. This is a common name for network programming over an IP network. A socket is a concept that represents a communication channel with another computer. But if you’re using a protocol like UDP, then it can also represent a communication channel with multiple computers even if there are no other computers listening.
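Here’s a small sketch of a UDP socket in Python using the standard socket module. It assumes everything happens on the loopback address 127.0.0.1 so it can run on a single computer, and the function name udp_round_trip is just something I picked for the example. Notice that the sender never connects to anything. It just sends a datagram to a socket address.

```python
import socket


def udp_round_trip(message):
    """Send one datagram over loopback and return (data, sender address)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver:
        receiver.bind(("127.0.0.1", 0))   # port 0 asks the OS to pick a free port
        receiver.settimeout(2.0)          # don't wait forever if nothing arrives
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
            # No connection is made. The datagram is simply addressed and sent.
            sender.sendto(message, receiver.getsockname())
            data, sender_address = receiver.recvfrom(1024)
    return data, sender_address


data, sender_address = udp_round_trip(b"hello")
print(data, sender_address)
```

The sender address that comes back is the loopback IP address together with whatever temporary port the operating system gave the sending socket.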

Sockets are resources provided by the operating system or drivers installed as part of the operating system.

If you’re writing a client application, then you open a socket by specifying you want to connect to a remote socket address and what protocol you want to use. A socket address is the IP address of the remote computer together with the port number. The operating system will open a local port for you using one of the temporary port numbers. You don’t need to specify a well-known port for the client port number when connecting to a server. Any port will work just fine. It’s the server that needs to be ready and waiting on the well-known port.
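A client along these lines might look like the following Python sketch. The function name describe_connection is hypothetical. It uses socket.create_connection from the standard library to open a TCP connection and then asks the socket for both ends of the channel.

```python
import socket


def describe_connection(host, port):
    """Connect over TCP and return (local socket address, remote socket address)."""
    with socket.create_connection((host, port), timeout=5.0) as client:
        # The operating system picked the local port for us.
        local_address = client.getsockname()
        # The remote socket address is the IP address and port we asked for.
        remote_address = client.getpeername()
    return local_address, remote_address
```

If you called describe_connection with a web server’s name and port 80, the remote address would show port 80, and the local address would show whichever temporary port the operating system chose for you.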

If you’re writing a server application, then it’s a little more involved. You first need to bind a socket to a protocol, IP address, and port. Depending on which port you’re trying to bind to, your process may need additional privileges. Many operating systems require administrator privileges in order to bind to any of the well-known ports. This makes it harder for processes to accidentally or maliciously open ports that can start receiving data packets. Binding also makes sure that, by default, only one process can open a socket with a given protocol, IP address, and port. You can’t have multiple processes competing to receive the same data.
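You can see the one-socket-per-address rule for yourself with a short Python sketch. It assumes default socket options, since options such as SO_REUSEPORT can change this behavior on some systems. The function name second_bind_fails is my own, and I’m using two sockets in one process just to keep the example short.

```python
import socket


def second_bind_fails():
    """Return True if a second TCP socket can't bind to an address already in use."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as first:
        first.bind(("127.0.0.1", 0))   # port 0 asks the OS to pick a free port
        first.listen()
        address = first.getsockname()  # the IP address and port actually bound
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as second:
            try:
                second.bind(address)   # same protocol, IP address, and port
            except OSError:
                return True            # the operating system refused the bind
            return False


print(second_bind_fails())
```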

Once you bind a socket on the server, then you need to listen on that socket. For a connection-based protocol, this sets up a queue where incoming connection requests can be lined up. And then the server process will wait for a connection by calling accept. A protocol like UDP doesn’t have connections, so there’s nothing to listen for or accept. In this case, the server just waits on the bound socket for the next datagram to arrive, usually by calling a receive function such as recvfrom.

A protocol like TCP does establish connections. So accept will return with a new socket that connects the local server IP address and port with the IP address and port of the client computer making the connection. Once you have this new socket, the server process can use it to send and receive information with the client. The server process will probably also want to go back to waiting on the original socket in case another client wants to connect.
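The bind, listen, and accept steps for TCP can be sketched in Python like this. It’s a minimal echo server that assumes the loopback address and a port chosen by the operating system so it can run without special privileges. The names serve_echo and start_echo_server are made up for this example, and a real server would handle errors and serve clients concurrently.

```python
import socket
import threading


def serve_echo(listener, connections):
    """Accept clients one at a time and echo back what each one sends."""
    for _ in range(connections):
        # accept returns a new socket for this client plus the client's address.
        conn, client_address = listener.accept()
        with conn:
            data = conn.recv(1024)
            conn.sendall(data)
        # Then go back to waiting on the original socket for the next client.


def start_echo_server(connections=1):
    """Bind, listen, and serve in a background thread. Return the socket address."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0 asks the OS to pick a free port
    listener.listen()                 # queue up incoming connection requests
    address = listener.getsockname()

    def run():
        with listener:
            serve_echo(listener, connections)

    threading.Thread(target=run, daemon=True).start()
    return address


address = start_echo_server(connections=1)
with socket.create_connection(address, timeout=5.0) as client:
    client.sendall(b"hello")
    print(client.recv(1024))
```

The client here connects to the address the server bound to, sends a few bytes, and reads the same bytes back over the new socket that accept created.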

The reason I said in the title of this episode that sockets and ports are the hidden pieces is because most of the time, users don’t have to know anything about them. Many users are aware of IP addresses and domain names and such. But sockets are just a concept in software that allows you to wrap up a communication channel with one or more other computers. And ports are also something that the software normally takes care of.

It is possible and useful to start a server process using a non-standard port. Maybe you already have a web server running on a computer that’s using port 80 for HTTP traffic and you want to run another web server. The two processes can’t both bind to port 80 unless one of them uses a different IP address or protocol. Or maybe you want to set up a web server but don’t have the required permission to open port 80. A common solution is to use port 8080, or just “eighty eighty” as it’s normally pronounced.

If you start a web server configured like this, then any client computer running a web browser won’t be able to view any of the web pages because the web browser will keep trying to connect to port 80. What you have to do is specify the custom port number in the URL and you do this by putting a colon followed by the port number you want to use right after the domain name. You can also put the colon and port number after the IP address in case you don’t have a domain name mapped to the IP address.

This is actually what the browser does for you anyway. When you visit a web page and just provide a URL like www.takeupcode.com, then the browser actually connects to port 80 at the IP address pointed to by www.takeupcode.com. And if you connect over a secure channel by specifying HTTPS instead of just HTTP, then the browser knows to use port 443 instead of port 80. Browsers can do this because both ports 80 and 443 are well-known port numbers for HTTP and HTTPS.
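Here’s a rough sketch in Python of how a program could work out which port a URL refers to. It uses urlsplit from the standard urllib.parse module. The function name port_for_url and the DEFAULT_PORTS table are my own and only cover the two protocols I mentioned.

```python
from urllib.parse import urlsplit

# Well-known ports for the two protocols discussed here.
DEFAULT_PORTS = {"http": 80, "https": 443}


def port_for_url(url):
    """Return the port in the URL, or the well-known port for its protocol."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port              # a colon and port number were given
    return DEFAULT_PORTS[parts.scheme]


print(port_for_url("http://www.takeupcode.com"))        # 80
print(port_for_url("https://www.takeupcode.com"))       # 443
print(port_for_url("http://www.takeupcode.com:8080"))   # 8080
```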
