DNS Resolution in HTTP Workflow: How GET Requests Interact with Domain Name System


2 views

When examining application layer protocols, it's crucial to understand that while HTTP and DNS both operate at Layer 7, they serve distinct purposes and interact through the operating system's networking stack rather than direct protocol-to-protocol communication.

Here's what happens when a browser initiates a GET request:

// Browser initiates HTTP request
GET /hello.htm HTTP/1.1
Host: www.pippo.it

The resolution process involves these steps:

  1. Browser checks local DNS cache
  2. OS resolver queries configured DNS servers
  3. TCP connection establishes to resolved IP
  4. HTTP request transmits over TCP

Modern operating systems provide DNS resolution through system libraries. In Linux, this typically involves the getaddrinfo() system call:

#include <netdb.h>
#include <stdio.h>

int main() {
    struct addrinfo hints, *res;
    
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;

    int status = getaddrinfo("www.pippo.it", "80", &hints, &res);
    if (status != 0) {
        fprintf(stderr, "DNS resolution failed: %s\n", gai_strerror(status));
        return 1;
    }
    
    // res now contains IP address information
    freeaddrinfo(res);
    return 0;
}

A minimal HTTP client handling DNS resolution would look like:

import socket
import urllib.parse

def http_get(url):
    parsed = urllib.parse.urlparse(url)
    hostname = parsed.hostname
    path = parsed.path if parsed.path else '/'
    
    # DNS resolution happens here
    ip_address = socket.gethostbyname(hostname)
    
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.connect((ip_address, 80))
        s.sendall(f"GET {path} HTTP/1.1\r\nHost: {hostname}\r\n\r\n".encode())
        return s.recv(4096).decode()

print(http_get("http://www.pippo.it/hello.htm"))

DNS resolution results are cached at multiple levels:

  • Browser DNS cache (typically short-lived)
  • OS-level DNS cache (configurable TTL)
  • ISP or public DNS resolver caches

To optimize the DNS-to-HTTP pipeline:

// Node.js example with DNS caching
const dns = require('dns');
const http = require('http');

const cache = new Map();

async function cachedLookup(hostname) {
    if (cache.has(hostname)) {
        return cache.get(hostname);
    }
    const addresses = await dns.promises.resolve4(hostname);
    cache.set(hostname, addresses[0]);
    return addresses[0];
}

async function fetchWithCachedDNS(url) {
    const { hostname, pathname } = new URL(url);
    const ip = await cachedLookup(hostname);
    
    return new Promise((resolve) => {
        http.get(http://${ip}${pathname}, { headers: { Host: hostname } }, (res) => {
            let data = '';
            res.on('data', (chunk) => data += chunk);
            res.on('end', () => resolve(data));
        });
    });
}

When examining the TCP/IP stack, it's indeed confusing how higher-layer protocols like HTTP appear to depend on DNS resolution while both reside at the Application Layer. The critical insight is that while they're logically at the same layer, they operate sequentially within the network stack implementation.

Here's what happens behind the scenes when you make an HTTP GET request:

// Browser initiates request for http://www.example.com/resource
1. Operating System receives the hostname "www.example.com"
2. OS calls its resolver library (e.g., gethostbyname() in C)
3. Resolver checks local cache → if miss, performs DNS query
4. DNS response provides IP address to the OS
5. OS returns IP to browser process
6. Browser establishes TCP connection to IP
7. Sends HTTP GET request over established connection

The resolution happens through operating system APIs rather than direct protocol interaction:

// Example in Python showing the separation
import socket

# DNS resolution happens first (OS-level)
ip = socket.gethostbyname("www.example.com")

# Then HTTP request is made
conn = socket.create_connection((ip, 80))
conn.send(b"GET /resource HTTP/1.1\r\nHost: www.example.com\r\n\r\n")

While the OSI model shows DNS and HTTP at the same layer, real-world implementations have:

  • DNS client components in the OS kernel
  • Resolution occurring before socket creation
  • HTTP clients relying on resolved IPs from the OS

A Wireshark trace would show:

1. DNS Query  (src: client, dst: 8.8.8.53)
   Question: www.example.com A-record
2. DNS Response (contains 93.184.216.34)
3. TCP SYN to 93.184.216.34:80
4. HTTP GET /resource

Modern browsers implement hybrid approaches:

// Chromium's multi-step resolution process
1. Check HSTS/preload lists
2. Consult DNS cache (in-memory and OS-level)
3. If using DoH, bypass OS resolver
4. For HTTP/3, may initiate DNS and QUIC in parallel

Important clarifications:

  • HTTP never "calls" DNS directly - it's a sequence of operations
  • The OS network stack handles protocol coordination
  • DNS-over-HTTPS changes this dynamic by moving DNS into HTTP