If you remember from our previous lesson, we talked about the three fundamental breakthroughs in design and thinking the founders of the internet had that allowed it to become this incredibly resilient and massively scalable piece of technology that we all rely upon today in our personal and work lives. These three breakthroughs were:
- Packet Routing: transition from a point to point connection between computers to a packet-based approach in which routers are linked together and your data is broken up into little chunks called packets which were marked with a destination IP address and were sent off with the hope that they would arrive at their destination.
- Best Effort: this transition introduced a degree of unreliability to the connection and it could no longer be guaranteed that our information would arrive at its destination. There would be instances where certain routers on the network would receive more traffic than they could handle and they needed a way to deal with this – so they did the best that they could to process as much information as they could and they simply dropped the rest.
- Protocol Hierarchy: the concept of nested protocols layered in such a way that each provides a specific function. This is a core component of what makes the original design of the internet so future-proof.
What this gave us was a set of interconnected routers to which you would send individual packets, each addressed to a destination IP address, and hope they would get there. Each router would receive a packet, look at the IP header (which wraps whatever payload you had sent) and send the packet on towards its destination. We talked about the IPv4 destination address being a 32-bit number composed of 4 bytes. The router checks that address against its routing table, works out which of its connections leads to the next router on the way, and sends the packet out on that connection. All routers that are connected to the internet have to understand the IP protocol, which was designed to be as simple as it possibly could be. It carries the version number (IPv4 or IPv6) in the first four bits of the first byte of the packet. Right off the bat, this identifies the format of the rest of the packet. For example, IPv6 has a different format than IPv4 since it has 128-bit source and destination addresses, not 32-bit, so the headers are different. The people who designed the internet set it up so that each level within the protocol hierarchy contains the minimum amount of information needed to get the job done. IP itself does not care what a packet carries, and the router doesn’t know or care either.
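To make the "first four bits" idea concrete, here is a minimal Python sketch. The function name ip_version and the sample bytes are our own illustration, not part of any library; the sketch only shows how the version is read from the first byte and how a 32-bit IPv4 address maps to 4 bytes.

```python
import ipaddress


def ip_version(packet: bytes) -> int:
    """Return the IP version stored in the top four bits of the first byte."""
    if not packet:
        raise ValueError("empty packet")
    return packet[0] >> 4


# 0x45 is a typical first byte of an IPv4 header (version 4 in the top bits).
print(ip_version(bytes([0x45])))  # 4
# 0x60 is a typical first byte of an IPv6 header (version 6 in the top bits).
print(ip_version(bytes([0x60])))  # 6

# A 32-bit IPv4 address is just four bytes, or one integer.
addr = ipaddress.IPv4Address("192.168.1.1")
print(addr.packed)  # b'\xc0\xa8\x01\x01'
print(int(addr))    # 3232235777
```

A router only needs this first byte to decide which header layout to expect for the rest of the packet.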
In this article on how the internet works: ICMP and UDP, we take a look at two important protocols within the protocol hierarchy that make the internet possible. ICMP, or Internet Control Message Protocol, can best be thought of as the plumbing or maintenance protocol. UDP, or User Datagram Protocol, is “connectionless” and is a means of transporting data that works better than TCP for some types of content.
A Problem Presents Itself: Router Loops
The designers had to account for something called “router loops.” What is a router loop? Imagine a complex network of interconnected routers. Each of these routers has its own routing table. When it receives an IP packet, it looks at the destination IP and checks its routing table to determine which of its outgoing connections it should use so that the packet reaches its destination. That is all that it does: it takes the packet, checks where it is going, looks at its routing table to figure out which direction to send it, and then sends it on its merry way. Once the router has made that decision, the packet waits in an output queue, and when there is bandwidth available, the router sends it off to its next stop. There is a potential for problems, however, as a router can make a mistake if its routing table is not configured correctly. This would cause the packet to be sent off in the wrong direction (out the wrong interface), which makes it possible for the packet to come back to a router from earlier in its journey. The packet would then get stuck in a loop, going through the same routers over and over again. It would never arrive at its destination and it would never die: a zombie packet.
The Solution: Expiration of Packets
The designers countered this problem with the expiration of packets. We want the packet to arrive where it is supposed to, but we can’t have it live forever in the event that something should happen, else the entire internet would be clogged up with these zombie packets and the entire system would be unusable. They added something called TTL (Time To Live) to the fundamental outer layer, the IP layer. The TTL is a single byte, so it can hold 256 different values (0 to 255). Any router that forwards a packet decrements the TTL value of that packet by one. So whatever TTL a packet has when it arrives at a router, its TTL is one less when it leaves. If the decrement takes the TTL to zero (the packet arrived with a TTL of 1), the router simply drops the packet and does not forward it on. This simple solution solves the issue of packets living forever and clogging up the network.
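The effect of the TTL on a router loop can be sketched with a small simulation. Everything here is hypothetical: the router names, the dictionary-based routing tables and the forward function are our own stand-ins, chosen only to show that a looping packet dies once its TTL runs out.

```python
def forward(routing_tables, start, destination, ttl):
    """Follow next-hop entries until the packet is delivered or its TTL expires.

    routing_tables maps a router name to {destination: next hop}.
    Returns (outcome, path), where outcome is "delivered" or "expired".
    """
    current = start
    path = []
    while True:
        path.append(current)
        if current == destination:
            return "delivered", path
        ttl -= 1  # every router decrements the TTL by one
        if ttl <= 0:
            return "expired", path  # dropped here instead of being forwarded
        current = routing_tables[current][destination]


# Correctly configured tables: A -> B -> C -> D.
healthy = {"A": {"D": "B"}, "B": {"D": "C"}, "C": {"D": "D"}}
print(forward(healthy, "A", "D", ttl=8))
# ('delivered', ['A', 'B', 'C', 'D'])

# Misconfigured table on C sends traffic for D back to A: a router loop.
looping = {"A": {"D": "B"}, "B": {"D": "C"}, "C": {"D": "A"}}
outcome, path = forward(looping, "A", "D", ttl=8)
print(outcome, len(path))  # expired 8
```

Without the TTL check, the second call would never return, which is exactly the zombie packet problem.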
There is one difference here from what we saw before. If you recall, we talked about how a router with more traffic than it could handle would drop the packet and would not send any report back to notify the sender (as this would just create more traffic and make the problem worse). When a packet expires, however, the router does send a message back to the original sender. It sends a maintenance-level ICMP packet whose message is “time exceeded”, which means that the packet didn’t arrive at its destination. It is best to think of Time to Live not as a measurement of time, but as a counter of the number of hops a packet may still make on its way towards its destination.
In the early days of the internet, operating systems set the TTL to a relatively low number (16 or 32) because that was enough at the time; the internet was not that big in the early days. As it grew and ISPs came onboard, they had their own tiers of routers and they connected to other ISPs with their tiers of routers, and the concept of internet diameter came about. The internet diameter is the largest number of hops between the furthest two points anywhere on the internet. When the designers originally came up with the TTL idea, they were trying to conserve bits wherever they could as bandwidth was a precious commodity back then. TTL was given 8 bits and they initially set it to 16 and counted down from that, so if anything was more than 16 routers away, then there would be a problem. With the rapid growth the internet has had since the millennium, operating systems now start the TTL much higher: 64 and 128 are common defaults, and some systems use 255 (its maximum possible value).
One of the things that TTL enables is the ability to trace the route that packets take to reach their destination. The way it normally works is that you emit an IP packet of some sort with a TTL deliberately large enough to get to the other side, its destination. You send it off, and that is all that you hear about it. But if you remember, I mentioned earlier that any router that is responsible for expiring a packet has to send back a notice saying that the packet did not reach its destination (its TTL counted down and reached zero before it could get there). The ICMP packet that this router sends back carries the router’s own IP address as its source, so you learn where the packet died.
Traceroute works by deliberately setting the TTL to one and sending the packet. The first router that it hits decrements the TTL to zero and sends back an ICMP time exceeded message, and the IP address in that message is what traceroute displays on screen as the first hop. Then we send a packet to the same destination, but this time with the TTL set to two. It reaches the first router, which decrements the TTL to one and sends it off to the second router, which decrements the TTL to zero and sends back an ICMP time exceeded message with its IP address. We repeat this process, raising the TTL by one each time, which gives us the IP address of every router along the way that a packet addressed to this particular destination would take. We can now map out these IP addresses to determine where our packet goes prior to reaching its destination.
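The traceroute procedure above can be sketched as a simulation. A real traceroute needs raw sockets and usually administrator rights, so this sketch replaces the network with a plain list of made-up router addresses; simulate_probe and traceroute are hypothetical helper names, not real tools.

```python
def simulate_probe(path, ttl):
    """Send one probe along path (router IPs, with the destination last).

    Returns (responder, message): the address that answers and what it says.
    """
    for router in path[:-1]:
        ttl -= 1  # each router decrements the TTL
        if ttl == 0:
            return router, "time exceeded"
    return path[-1], "reached destination"


def traceroute(path, max_hops=30):
    """Probe with TTL 1, 2, 3, ... and collect the address of every responder."""
    hops = []
    for ttl in range(1, max_hops + 1):
        responder, message = simulate_probe(path, ttl)
        hops.append(responder)
        if message == "reached destination":
            break
    return hops


# Made-up route: three routers, then the destination host.
route = ["10.0.0.1", "203.0.113.1", "198.51.100.7", "192.0.2.50"]
print(traceroute(route))
# ['10.0.0.1', '203.0.113.1', '198.51.100.7', '192.0.2.50']
```

Each pass through the loop corresponds to one line of traceroute output: the probe with TTL n is answered by the n-th router on the route.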
This also allows us to measure the length of time for that round-trip. This is not entirely accurate however as we can never really know when a packet might go a few hops out towards its destination and then come a few hops back – there is no way to know with any certainty which link might have been slow. As with any scientific experiment and statistical analysis, repeating this process increases your result-set and degree of certainty as to which router or routers could be the slow link in the chain.
Ping is similar and can be considered part of the underlying internet plumbing. Ping is another command that many internet-savvy people know, and many others have used it when troubleshooting their internet connection over the phone. To use it, you open a command window and type ping www.websiteaddress.com. Your computer looks up the IP address of www.websiteaddress.com in the same way that your browser does and sends off a packet in that direction. That packet is a standard IP packet with a normal TTL (we don’t want this packet to expire), and its payload is an ICMP packet.
How to ping in Windows:
- Open the Start menu by pressing the Windows key or clicking the Start button
- Type cmd and press Enter to open the command prompt
- Type ping www.website.com (or you can ping an IP address directly by typing ping 192.168.1.1)
- Press Enter
How to ping in Mac OS X:
- Open the terminal. Go to your Applications folder, and in the Utilities folder, select Terminal
- Type ping www.website.com (or you can ping an IP address directly by typing ping 192.168.1.1)
- Press Enter
How to ping in Linux:
- Open the Terminal window. It is most commonly found in the Accessories folder in your Applications directory. (If you are using Ubuntu, you can press Ctrl + Alt + T to open the terminal)
- Type ping www.website.com (or ping 192.168.1.1)
- Press Enter
This brings us back to the nesting of protocols we discussed previously. The IP packet contains an ICMP packet of type 8, which is an echo request. This is the originator asking to verify connectivity, hence the word ping (from sonar, where you send out a burst of sound and listen for the echo that comes back). It is the same concept on the internet. There is a universal agreement that every machine connected to the internet should answer these requests by itself, with no servers, services, or applications running, just the operating system that hosts the IP stack. When that IP stack receives a packet and sees that it contains an ICMP echo request, it immediately, without requiring any further processing, sends an echo reply back to the originator.
This allows network engineers and architects to verify that things are working as they should – that routers are busy routing, and that the links are up. They can ping the destination IP and check to see that they get a response back. This lets them know if their traffic is getting there and back, outside of the bounds of all of the other technology that can get in the way. They can then start to work up the chain to determine problems. These same capabilities have security concerns as well however. Computers on a network reveal themselves by default and if someone pings your IP address, they get a response verifying that yes, someone is at that address. Malicious attackers now know that your computer is out there and can mount attacks against it if they so choose.
Another option available to attackers is flooding a user’s computer with pings, sending a given IP more traffic than it can handle so that it becomes unusable. Traceroute can be used to map the topology of entire networks since every router along the way responds to the sender with its IP address. Attackers can use traceroute to get the IP addresses of intermediate routers inside corporations and ISPs.
Rules Were Made to be Broken
It was because of concerns such as these that the fundamental rules of the internet were broken over time. Most consumer routers nowadays have an option controlling whether or not they respond to a ping. The router holds the public IP address (it sits in front of the private network and performs NAT), so it is the destination that a ping from outside actually reaches. If the original rules were followed, all of these routers would respond to every echo request that they receive with an echo reply.
More and more ISPs are blocking traceroute by configuring their routers not to send time exceeded messages. If a packet expires inside of the ISP’s network, then the router drops it (as expected), but it does not send back the time exceeded message. If you ran the traceroute I mentioned earlier in this article from Montis, you might have noticed that you got back the first few hops, then there was a dead zone, and then the hops started to appear again. This dead zone is a range of routers that have been configured not to send back time exceeded messages when they handle expiring packets. These routers will not reveal their presence (as you noticed). They still pass unexpired packets along through their network, so once the TTL is large enough for a probe to reach a router that does reply with the time exceeded message, the traceroute picks up where it left off.
Some ISPs have also begun to block ICMP traceroutes. Advanced users with state-of-the-art internet probing utilities can use other protocols to perform traceroutes. They can use UDP or TCP because both are encapsulated in the IP packet (the outer wrapping), which is where the destination IP and the TTL live.
Internet Control Message Protocol (ICMP) Uses
We have covered some of the uses of the ICMP protocol already, but let’s review:
- It is the protocol that lets you ping another IP by sending out a type 8 echo request and receiving a type 0 echo reply in response. This enables you to verify connectivity at the lowest level.
- It also handles the time exceeded message (type 11), which a router sends back when it expires a packet.
- It reports destinations that cannot be reached with type 3, destination unreachable. There is a subtype (code) within this to indicate the reason why the destination may be unreachable (see ICMP Control Messages table below). For example, subtype 0 means that the network is unreachable, subtype 1 means that the host is unreachable, etc.
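As an illustration of the echo request in the list above, here is a minimal sketch that builds one in Python. The layout used (type, code, checksum, identifier, sequence number) and the 16-bit ones' complement checksum follow the standard ICMP echo format; the function names and the sample identifier and payload are our own. Actually sending the packet would need a raw socket and elevated privileges, so the sketch stops at building the bytes.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum used by ICMP (and the IP header)."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold any carry bits back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload: bytes = b"") -> bytes:
    """Build an ICMP type 8 (echo request), code 0 message."""
    # The checksum is computed with the checksum field set to zero.
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload


packet = build_echo_request(identifier=0x1234, sequence=1, payload=b"hello")
print(packet[0], packet[1])       # 8 0
print(internet_checksum(packet))  # 0
```

A checksum of 0 over the finished packet is how a receiver confirms that nothing was corrupted in transit. The echo reply has the same layout with the type set to 0.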
ICMP Header Format
Fragmentation becomes a possibility whenever a router receives a packet of a certain size on its incoming link and has to forward it across a link that can only handle smaller packets. This was especially common on connections with telephone lines and modems. There might have been a chunk of the network that was bridged or inter-connected using a high-speed modem or a protocol that wasn’t the IP protocol. That link could carry IP packets, but by virtue of the protocol it was using, it could not send one above a certain size in a single piece. The outgoing interface is configured to know the maximum packet size that it is able to send.
A packet needs to be fragmented when it is too large to be forwarded in one piece across the next link towards its destination. The early designers of the internet recognized that this could be a problem, so they designed into the outer wrapper (the IP wrapper) the ability for packets to become fragmented. When that happens, the router has the ability and permission by default to chop the packet up into two or more pieces, however many are necessary, and send them on as smaller chunks. No router reassembles packets that have been fragmented; reassembly is left to the final destination. A router that receives these fragments sees them just like any other small IP packets and sends them on their way. This works quite well, but there are some protocols where performance is a concern and fragmentation creates issues, such as the audio protocols.
The early designers thought that it would be nice if there was a way to probe the network to find the maximum sized packet we could send without fragmentation, so there is a small field of flag bits (3 bits) in the original IP header. One of these instructs the router not to fragment the packet. It is called the DF flag (DF meaning Don’t Fragment). If that bit is set in an IP packet that must be fragmented in order for it to move on through the router, the router will instead drop it and send back another one of the ICMP low-level maintenance packets saying that the destination is unreachable and that the reason is “fragmentation needed.” Within this ICMP packet will be the maximum size packet that the link it was trying to send the packet out of can handle.
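The fragmentation step described above can be sketched as follows. This is a simplified model, not router code: it assumes a fixed 20-byte IP header and represents each fragment as a tuple instead of real packet bytes. Two details go beyond the article but are part of IPv4: fragment offsets are counted in units of 8 bytes, and a "more fragments" flag marks every piece except the last. The function names are our own.

```python
def fragment(payload: bytes, mtu: int, header_len: int = 20):
    """Split payload so that each piece plus an IP header fits within mtu.

    Returns a list of (offset_in_8_byte_units, more_fragments, data) tuples.
    """
    # Every fragment except the last must carry a multiple of 8 bytes.
    max_data = (mtu - header_len) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any data")
    pieces = []
    for start in range(0, len(payload), max_data):
        chunk = payload[start:start + max_data]
        more = start + max_data < len(payload)
        pieces.append((start // 8, more, chunk))
    return pieces


def reassemble(pieces):
    """Rebuild the payload at the destination host (routers never do this)."""
    data = bytearray()
    for offset, _more, chunk in sorted(pieces, key=lambda piece: piece[0]):
        data[offset * 8:offset * 8 + len(chunk)] = chunk
    return bytes(data)


payload = bytes(range(256)) * 16  # 4096 bytes of sample data
pieces = fragment(payload[:4000], mtu=1500)
print([(offset, more, len(chunk)) for offset, more, chunk in pieces])
# [(0, True, 1480), (185, True, 1480), (370, False, 1040)]
print(reassemble(pieces) == payload[:4000])  # True
```

Losing any one fragment makes the whole original packet unusable, which is one reason performance-sensitive protocols prefer to avoid fragmentation altogether.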
The smallest such limit along the whole route is called the Path MTU (Maximum Transmission Unit), which is the maximum packet size that we are able to use from where we are to where we are trying to get to. Think of it this way: if we are sending a packet out that is too big for any link on its way towards its destination, we can set the Don’t Fragment bit, which tells any router that is unable to forward the packet (because it is too large) to send an error back letting us know the maximum size that particular link can handle. This is a way to proactively discover the maximum packet size that we can send to a destination without fragmentation at any step along the way. Once we have this information, we know that we can’t send any packets larger than that. It is situations like this where the low-level maintenance protocol ICMP really shines and enables a wide range of abilities. When would this be important? It would be most important when dealing with media, such as audio or video.
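The probing idea can be sketched as a small simulation. The list of link MTUs stands in for a real route, and send_with_df stands in for sending a packet with the DF flag set and waiting for an ICMP "fragmentation needed" reply; both names and the sample numbers are hypothetical.

```python
def send_with_df(link_mtus, packet_size):
    """Simulate sending a packet with the Don't Fragment flag set.

    Returns None if the packet fits across every link, otherwise the MTU of
    the first link that is too small (as a router would report it via ICMP).
    """
    for mtu in link_mtus:
        if packet_size > mtu:
            return mtu
    return None


def discover_path_mtu(link_mtus, first_guess):
    """Shrink the packet size until it crosses the whole path unfragmented."""
    size = first_guess
    while True:
        reported = send_with_df(link_mtus, size)
        if reported is None:
            return size
        size = reported  # retry at the size the router told us about


# Made-up path: four links with different maximum packet sizes.
links = [1500, 1492, 576, 1500]
print(discover_path_mtu(links, first_guess=1500))  # 576
```

The sender first learns about the 1492-byte link, retries, then learns about the 576-byte link, and the third attempt gets through.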
User Datagram Protocol or UDP
To recap what we have so far:
- The outer layer is the IP packet, which contains the version number of the IP protocol (version 4 or version 6)
- It contains some flags (such as the Don’t Fragment flag)
- It contains the overall length of the entire packet so that the router knows, as data is coming in, where the packet ends
- It contains both the source and the destination IP addresses
- It contains the TTL (time to live) of the packet
- It contains a checksum which allows each router to verify that the header hasn’t been corrupted as the packet crosses different links towards its destination
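The fields in this recap can be pulled out of a raw IPv4 header with a few lines of Python. This is a sketch that assumes the basic 20-byte header with no options. The sample header is hand-built with documentation addresses, and its checksum is left at zero as a placeholder, so it would not pass a real router's check.

```python
import ipaddress
import struct


def parse_ipv4_header(header: bytes) -> dict:
    """Decode the fixed 20-byte part of an IPv4 header into named fields."""
    (version_ihl, _tos, total_length, _ident, flags_fragment,
     ttl, protocol, checksum, source, destination) = struct.unpack(
        "!BBHHHBBH4s4s", header[:20])
    return {
        "version": version_ihl >> 4,                    # top four bits
        "dont_fragment": bool(flags_fragment & 0x4000),  # the DF flag
        "total_length": total_length,
        "ttl": ttl,
        "protocol": protocol,                           # 1 = ICMP, 17 = UDP
        "checksum": checksum,
        "source": str(ipaddress.IPv4Address(source)),
        "destination": str(ipaddress.IPv4Address(destination)),
    }


# Hand-built sample: IPv4, DF set, TTL 64, carrying UDP, checksum placeholder 0.
sample = struct.pack(
    "!BBHHHBBH4s4s", 0x45, 0, 28, 0, 0x4000, 64, 17, 0,
    ipaddress.IPv4Address("192.0.2.10").packed,
    ipaddress.IPv4Address("198.51.100.1").packed)

fields = parse_ipv4_header(sample)
print(fields["version"], fields["ttl"], fields["dont_fragment"])  # 4 64 True
print(fields["source"], "->", fields["destination"])
# 192.0.2.10 -> 198.51.100.1
```

Notice that nothing in this header mentions a port; the protocol number only says what kind of payload comes next.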
At this point, we don’t have any notion of ports. Ports are not in the IP packet. At the lowest level, all that we care about are the individual IP addresses. It is what happens after the packet gets there where we start to add the next level of complexity. We know that the IP packet carries this ICMP payload for the low-level maintenance tasks, and next up in the level of complexity is that it can also carry a UDP packet. As the designers intended (with the nested protocol hierarchy), the UDP packet is the minimum necessary to add just one more layer of complexity.
The UDP packet contains the source port, the destination port, its own length, a checksum, and then the payload, whatever that may be. That is the point: it can be anything. What UDP adds is some abstraction of what we want to do once we arrive at our destination, and that abstraction is port numbers. Ports are simply 16-bit values carried within the packet; they are just numbers.
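Because the UDP header is so small, it can be sketched in a few lines. The helper names are our own, and the checksum is simply left at zero, which over IPv4 means "no checksum computed"; a real stack would normally fill it in.

```python
import struct


def build_udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Prefix payload with the 8-byte UDP header (four 16-bit fields)."""
    length = 8 + len(payload)  # the length field covers header plus payload
    checksum = 0               # placeholder: zero means "not computed" on IPv4
    return struct.pack("!HHHH", src_port, dst_port, length, checksum) + payload


def parse_udp_datagram(datagram: bytes):
    """Return (source port, destination port, length, checksum, payload)."""
    src_port, dst_port, length, checksum = struct.unpack("!HHHH", datagram[:8])
    return src_port, dst_port, length, checksum, datagram[8:length]


datagram = build_udp_datagram(50000, 53, b"example query")
print(len(datagram))                 # 21
print(parse_udp_datagram(datagram))
# (50000, 53, 21, 0, b'example query')
```

The whole datagram then travels as the payload of an IP packet, which is the nesting of protocols in action.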
The destination port is like the destination IP address. The IP doesn’t contain port numbers, but the UDP packet does, and it contains a destination port which tells the software running on the computer which service to send this to. This way, when a service starts up (say SMTP – Simple Mail Transfer Protocol – for example) in a UNIX machine or on some other server, it registers itself to listen for incoming traffic on port 25 (or whatever port). There exists a sort of agreement that has been universally adhered to as to ports and port assignments. For example, mail servers will listen to ports 110, 143, and 25. DNS servers listen to port 53. Web servers listen to port 80, and secure traffic on port 443. This allows the sender to identify the class of traffic and the type of traffic that it is sending. Essentially, this allows the sender to say that they want to send traffic to some computer or device at this IP address and to the service listening for traffic on this port number.
The port number is a 16-bit value, so it can range from 0 up to 65,535 (port 0 is reserved, leaving 1 to 65,535 for use). By convention, ports up to 1023 are reserved as service ports or server ports, and services typically set themselves up and listen for connections on these ports. In systems such as UNIX, ordinary user processes are unable to listen on those service ports; only services running with the proper permissions are able to listen on ports up to 1023. Other user processes are able to listen on the higher numbered ports. It is for this reason that you will sometimes find someone running a web server on port 8080. This is why you have to put a colon and the port number after the host name in the URL (“:8080”) to manually override your web browser’s normal use of port 80 and tell it to connect on port 8080 instead.
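The port override in a URL is easy to see with Python's standard urllib.parse module. The DEFAULT_PORTS table and the two helper functions are our own illustration of the conventions described above.

```python
from urllib.parse import urlsplit

# Conventional ports mentioned in this article.
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url: str) -> int:
    """Return the port a browser would connect to for this URL."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port  # an explicit ":port" overrides the convention
    return DEFAULT_PORTS[parts.scheme]


def is_service_port(port: int) -> bool:
    """True for the conventionally reserved service ports (1 to 1023)."""
    return 1 <= port <= 1023


print(effective_port("http://www.website.com/"))       # 80
print(effective_port("http://www.website.com:8080/"))  # 8080
print(is_service_port(80), is_service_port(8080))      # True False
```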
Think of UDP as the traffic carrying protocol. Remember that it is unreliable (packets can get dropped by the routers), so we can’t guarantee that it will arrive at its destination. There is no mechanism for the application that generates UDP traffic to know that it arrives at its destination. For this reason, it is the application itself that has to deal with this inherent unreliability.
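Here is a small sketch of what "the application has to deal with it" looks like in practice, using Python's standard socket module on the loopback interface. The function name and the two second timeout are arbitrary choices of ours. Binding to port 0 asks the operating system for any free port, so no special permissions are needed.

```python
import socket


def udp_round_trip(message: bytes, timeout: float = 2.0) -> bytes:
    """Send one datagram to a listener on this machine and return what arrives."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as server, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server.settimeout(timeout)     # UDP itself never reports a loss
        port = server.getsockname()[1]

        # No connection is set up first; the datagram is simply sent.
        client.sendto(message, ("127.0.0.1", port))

        # If the datagram were lost, this would raise socket.timeout and the
        # application would have to decide whether to retry or carry on.
        data, _sender = server.recvfrom(2048)
        return data


print(udp_round_trip(b"hello over UDP"))  # b'hello over UDP'
```

On the loopback interface the datagram will practically always arrive, but across the internet the timeout branch is the one the application has to plan for.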
UDP is a simple protocol that is terrific for audio, video, and real-time communications such as video streaming. If some information gets lost along the way, the audio or video reconstruction on the other end will try to make up for this missing gap. In photos, a missing pixel could be interpolated. In audio, if a packet is lost or delayed too long, the codec has to guess in order to fill in the audio so that there isn’t a deadspot, and it does this based upon the audio that it has already received.
What’s Ahead: Transmission Control Protocol or TCP
At this point, we don’t have congestion control, so we don’t know if things are too busy or if things are missing. We don’t know if packets are arriving out of order either. That is something that the next protocol we will discuss addresses. TCP handles all of these problems transparently for us. The problem is that it introduces overhead, which in some cases can become a real problem. TCP is very useful for downloading files where we have packets arriving and they need to be reassembled in the correct order else our file would be broken. When it is not important and we desire minimal overhead, UDP is our protocol of choice. When we want to make sure that something gets to its destination exactly right, then TCP is our protocol.
If you don’t count IP (the overall container), TCP is the most used protocol. It is THE protocol. All of our downloads, web browsing, and most of what the internet does is over TCP. We will cover this amazing protocol in our next article in the series.
ICMP Control Messages
| Type | Code | Status | Description |
|---|---|---|---|
| 0 – Echo Reply | 0 | | Echo reply (used to ping) |
| 1 and 2 | | | Reserved |
| 3 – Destination Unreachable | 0 | | Destination network unreachable |
| 3 – Destination Unreachable | 1 | | Destination host unreachable |
| 3 – Destination Unreachable | 2 | | Destination protocol unreachable |
| 3 – Destination Unreachable | 3 | | Destination port unreachable |
| 3 – Destination Unreachable | 4 | | Fragmentation required, and DF flag set |
| 3 – Destination Unreachable | 5 | | Source route failed |
| 3 – Destination Unreachable | 6 | | Destination network unknown |
| 3 – Destination Unreachable | 7 | | Destination host unknown |
| 3 – Destination Unreachable | 8 | | Source host isolated |
| 3 – Destination Unreachable | 9 | | Network administratively prohibited |
| 3 – Destination Unreachable | 10 | | Host administratively prohibited |
| 3 – Destination Unreachable | 11 | | Network unreachable for TOS |
| 3 – Destination Unreachable | 12 | | Host unreachable for TOS |
| 3 – Destination Unreachable | 13 | | Communication administratively prohibited |
| 3 – Destination Unreachable | 14 | | Host precedence violation |
| 3 – Destination Unreachable | 15 | | Precedence cutoff in effect |
| 4 – Source Quench | 0 | Deprecated | Source quench (congestion control) |
| 5 – Redirect Message | 0 | | Redirect datagram for the network |
| 5 – Redirect Message | 1 | | Redirect datagram for the host |
| 5 – Redirect Message | 2 | | Redirect datagram for the TOS and network |
| 5 – Redirect Message | 3 | | Redirect datagram for the TOS and host |
| 6 | | Deprecated | Alternate host address |
| 8 – Echo Request | 0 | | Echo request (used to ping) |
| 9 – Router Advertisement | 0 | | Router advertisement |
| 10 – Router Solicitation | 0 | | Router discovery/selection/solicitation |
| 11 – Time Exceeded | 0 | | TTL expired in transit |
| 11 – Time Exceeded | 1 | | Fragment reassembly time exceeded |
| 12 – Parameter Problem: Bad IP Header | 0 | | Pointer indicates the error |
| 12 – Parameter Problem: Bad IP Header | 1 | | Missing a required option |
| 12 – Parameter Problem: Bad IP Header | 2 | | Bad length |
| 13 – Timestamp | 0 | | Timestamp |
| 14 – Timestamp Reply | 0 | | Timestamp reply |
| 15 – Information Request | 0 | Deprecated | Information request |
| 16 – Information Reply | 0 | Deprecated | Information reply |
| 17 – Address Mask Request | 0 | Deprecated | Address mask request |
| 18 – Address Mask Reply | 0 | Deprecated | Address mask reply |
| 19 | | | Reserved for security |
| 20 thru 29 | | | Reserved for robustness experiment |
| 30 – Traceroute | 0 | Deprecated | Information request |
| 31 | | Deprecated | Datagram conversion error |
| 32 | | Deprecated | Mobile host redirect |
| 33 | | Deprecated | Where-are-you (originally meant for IPv6) |
| 34 | | Deprecated | Here-I-am (originally meant for IPv6) |
| 35 | | Deprecated | Mobile registration request |
| 36 | | Deprecated | Mobile registration reply |
| 37 | | Deprecated | Domain name request |
| 38 | | Deprecated | Domain name reply |
| 39 | | Deprecated | SKIP (Simple Key-Management for Internet Protocol) |
| 40 | | | Photuris, Security failures |
| 41 | | | ICMP for experimental mobility protocols |
| 42 thru 255 | | | Reserved |