Going Deeper With DNS
Sometimes you just need to say things out loud to someone else to know that you understand something. That’s what I did with my coworker the other day – just described out loud to him how I thought our internal service worked. It really helped. I got to put my vague thoughts into words, and he offered corrections as needed.
Turns out that, like most things, our service works because of the magic that is the Internet. HTTP requests, DNS lookups, IP addresses, CORS, etc., are all at the core of how it functions. Trying to explain how the service worked reminded me of the code interview prep question I practiced when I was trying to get my first software job: “Explain how the Internet works at a high level.”
The /etc/hosts hack
DNS (Domain Name System) lookup is hierarchical. When you make a request for a domain like google.com, a resolver works its way through a chain of DNS servers (the root servers, then the servers for .com, then the servers responsible for google.com) until it finds a record for google.com that points to a specific IP address.
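To see a lookup from the outside, here is a minimal Python sketch that asks the operating system's resolver for the addresses behind a name. The function name resolve is my own invention; the only real API used is socket.getaddrinfo from the standard library, and the answer depends on whatever your machine's resolver is configured to do.

```python
import socket


def resolve(hostname: str) -> list[str]:
    """Return the distinct IP addresses the OS resolver reports for hostname."""
    # getaddrinfo goes through the system resolver, so local overrides
    # and upstream DNS servers are both consulted as configured.
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]  # sockaddr is (ip, port, ...) for IPv4 and IPv6
        if ip not in addresses:
            addresses.append(ip)
    return addresses
```

Calling resolve("google.com") on a networked machine should hand back one or more addresses; which ones you get will vary by location and time.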
From your laptop, the very first place your computer looks in that chain (on most systems, anyway) is a file called /etc/hosts. It contains a list of IP addresses and domain names, much like the records a DNS server holds. And so, if you put an entry like “127.0.0.1 google.com” in it, then from now on, when you try to load google.com in your browser (unless your browser has a cached version), you will actually be directed to 127.0.0.1 – your own machine.
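The hosts file format is simple enough to parse by hand: an IP address, then one or more names, with # starting a comment. The sketch below works on a sample string rather than your real /etc/hosts, and the names parse_hosts and SAMPLE are made up for illustration. Letting the first entry for a name win is my assumption about typical resolver behavior, not something to rely on everywhere.

```python
def parse_hosts(text: str) -> dict[str, str]:
    """Map each hostname found in hosts-file text to its IP address."""
    mapping: dict[str, str] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            # Assumption: the first entry for a name takes precedence.
            mapping.setdefault(name.lower(), ip)
    return mapping


# Sample contents for illustration only, not a real file.
SAMPLE = """
# local overrides
127.0.0.1   localhost
127.0.0.1   google.com www.google.com
"""

overrides = parse_hosts(SAMPLE)
```

With that sample, overrides["google.com"] is "127.0.0.1", which is the redirect-to-yourself trick in data form.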
This is useful if you want to simulate, say, hitting an internal service that proxies your request to another app. Put the IP address of the internal service next to the name of your app in your /etc/hosts file, and your computer will map the app’s domain to the real live internal service at that IP address, which receives the request and proxies it elsewhere.
Who controls DNS?
So besides just putting it in your /etc/hosts file, how do IP addresses end up with registered domain names?
When you buy a domain (like from Namecheap or GoDaddy), those vendors, called registrars, record it with the registry that runs your top-level domain (.com, .org, and so on). Sitting above the registries is IANA, the arm of ICANN that coordinates the DNS root and the global pool of IP addresses. Yes, deep down in there, there’s a bureaucracy (no offense, IANA) sitting inside the Internet, pulling the strings.
Companies may have their own DNS servers, too. Internal apps and services that don’t need to be reachable from the public Internet can use private IP addresses and internal names that never have to be registered publicly.
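Internal addresses usually come from ranges set aside for private use, such as 10.0.0.0/8 and 192.168.0.0/16. Python's standard ipaddress module can tell you whether an address falls in one of them. The helper name is_internal is mine, and the addresses are arbitrary examples.

```python
import ipaddress


def is_internal(ip: str) -> bool:
    """True if ip sits in a range reserved for private or local use."""
    return ipaddress.ip_address(ip).is_private


# Arbitrary example addresses, chosen only to show both outcomes.
examples = {ip: is_internal(ip) for ip in ("10.1.2.3", "192.168.0.10", "8.8.8.8")}
```

Here the first two come back True and 8.8.8.8 comes back False, since it is a publicly routable address.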
Domain Name vs. Host vs. IP
It used to be the case that subdomains usually had their own unique IP address. Then we learned about Reduce, Reuse, Recycle. Now, it’s common for any one IP address to have many subdomains. So how do we know where exactly to send the request?
Every HTTP request comes with a Host header. The Host header names the host we actually want to reach, so a machine that serves many domains from one IP address can tell which one the request is for. The host is the domain name of the server (as well as the port if a nonstandard port is being used). Getting close to the metal here!
Knowing the hostname and port, we can now send the request to the correct host at the IP address we looked up.
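To make the Host header concrete, here is a small sketch that builds the raw bytes of an HTTP/1.1 GET request. The function name build_request and the example.com hostname are placeholders of mine; the point is just that the hostname travels inside the request, separately from the IP address the connection goes to.

```python
def build_request(host: str, path: str = "/", port: int = 80) -> bytes:
    """Build a bare-bones HTTP/1.1 GET request with a Host header."""
    # The port is only spelled out when it is not the default for HTTP.
    host_value = host if port == 80 else f"{host}:{port}"
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host_value}",
        "Connection: close",
        "",  # blank line ends the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


request = build_request("example.com", port=8080)
```

The same bytes could be sent to any IP address; it is the Host line that tells the machine on the other end which site you meant.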
In Summary: A Midsummer Night’s Dream
Image: John William Waterhouse - Art Renewal Center – description, Public Domain, https://commons.wikimedia.org/w/index.php?curid=39913701
Let’s say you’re making an internal service that receives requests from one lover, inserts some headers into the request, and proxies the request to its ultimate destination – then hands back the response. A go-between romantic messenger service.
Let’s pull in some Shakespeare: your service is Wall, and your lovers are Pyramus and Thisbe.
Here is the information you need:
Let’s put Pyramus behind the Wall. We’ll need a DNS entry that looks like this:
Now, Thisbe sends an HTTP request to Pyramus with the domain name wall.service.com and a Host header that names Pyramus.
The Wall receives the request (because it’s at 18.104.22.168, the IP address that matches the domain), sees the Host header, and passes the request on to Pyramus. Success! (Of course, there will be a response back to Thisbe too, but it’s too saccharine to print here.)
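Here is a rough sketch of the decision the Wall makes, minus any actual networking. Everything specific in it is hypothetical: the pyramus.example hostname, the 10.0.0.7 backend address, and the X-Wall-Seen header are stand-ins I made up, and a real proxy would also need to handle header-name casing, IPv6 literals, and the response path.

```python
# Hypothetical routing table: Host header value -> (backend ip, port).
BACKENDS = {"pyramus.example": ("10.0.0.7", 8080)}


def pick_backend(headers: dict[str, str], backends=BACKENDS):
    """Choose where to forward a request based on its Host header."""
    host = headers.get("Host", "")
    hostname = host.partition(":")[0].lower()  # ignore any :port suffix
    return backends.get(hostname)  # None if the Wall knows no such lover


def add_wall_headers(headers: dict[str, str]) -> dict[str, str]:
    """Return a copy of the headers with the Wall's own header inserted."""
    forwarded = dict(headers)
    forwarded["X-Wall-Seen"] = "true"  # hypothetical header name
    return forwarded


incoming = {"Host": "pyramus.example", "Accept": "*/*"}
destination = pick_backend(incoming)
outgoing = add_wall_headers(incoming)
```

With that table, destination is ("10.0.0.7", 8080) and outgoing carries the extra header, while a request naming an unknown host gets None back and goes nowhere.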