What happens when you visit a website?

How does a website reach your browser?

Imagine you're accessing https://example.com from your browser. Here's what happens step-by-step behind the scenes.

DNS Lookup

Your browser first needs the website's IP address. Essentially, your computer asks, "Where in the world is example.com?"

It first checks any local DNS cache (or your OS's hosts file) to see if it already has an IP address.
If not, it asks a recursive DNS resolver (often operated by your ISP, or a public DNS service) to resolve the name.
That resolver doesn't know the answer initially, so it queries the DNS hierarchy:
1. A root nameserver
2. Then the .com TLD server
3. And finally, the example.com authoritative nameserver
The authoritative server returns the numeric IP (e.g., 93.184.216.34), which flows back through the resolver to your computer.
Along the way, the resolver and your OS cache the result. Each DNS record carries a TTL (time-to-live) that balances freshness against performance: short TTLs suit frequently changing records, long TTLs suit stable ones.
Modern DNS can also use DNSSEC signatures (for authenticity) and DNS-over-HTTPS or DNS-over-TLS (for privacy) to secure these lookups.
Now you have an IP address, like 93.184.216.34.
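The lookup order above (cache first, then root, TLD, and authoritative server) can be sketched without touching the network. Everything in this example is an assumption made for illustration: the three dictionaries stand in for real nameservers, the class name CachingResolver is hypothetical, and 192.0.2.10 is a placeholder address from the documentation range rather than a real answer for example.com.

```python
import time

# Hypothetical stand-ins for the DNS hierarchy (no real network queries).
ROOT = {"com": "tld-com"}                                  # root knows the TLD servers
TLD = {"tld-com": {"example.com": "ns-example"}}           # TLD knows the authoritative server
AUTHORITATIVE = {"ns-example": {"example.com": ("192.0.2.10", 300)}}  # (IP, TTL in seconds)


class CachingResolver:
    """Toy recursive resolver that caches answers until their TTL expires."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._cache = {}            # name -> (ip, expires_at)
        self.upstream_lookups = 0   # how often the hierarchy had to be walked

    def resolve(self, name):
        cached = self._cache.get(name)
        if cached and cached[1] > self._clock():
            return cached[0]                         # still fresh: answer from cache
        self.upstream_lookups += 1
        tld_server = ROOT[name.rsplit(".", 1)[-1]]   # 1. ask a root nameserver
        auth_server = TLD[tld_server][name]          # 2. ask the .com TLD server
        ip, ttl = AUTHORITATIVE[auth_server][name]   # 3. ask the authoritative server
        self._cache[name] = (ip, self._clock() + ttl)
        return ip
```

Calling resolve("example.com") twice in a row walks the hierarchy only once; after the TTL has passed, the next call walks it again.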

Sending a Request

Once your computer has the IP address, it sends a network request to that machine.

On an Ethernet/Wi-Fi LAN, your computer may first use ARP (Address Resolution Protocol) to find the MAC address of the next hop: the server itself if it sits on the same local network, or, far more commonly, your gateway router.
Then it opens a transport connection to the server's port (port 80 for HTTP, 443 for HTTPS). For HTTP/1.1 or HTTP/2, this starts with a TCP three-way handshake (SYN, SYN/ACK, ACK).
1. For HTTPS, a TLS handshake follows immediately, in which the client and server agree on a TLS version and cipher suite.
2. In its first handshake message, the browser uses SNI (Server Name Indication) to tell the server which hostname it is contacting. SNI lets the server present the correct TLS certificate for example.com even if multiple sites share the same IP.
3. The server presents its certificate (which the client validates), and they derive session keys to encrypt the channel.
4. If using HTTP/3, there is no TCP; the browser uses QUIC over UDP, which combines connection setup and the TLS handshake into one step.
After TLS is established, the browser sends the actual HTTP request, which is a request line (e.g., GET /path HTTP/1.1) plus headers. These headers include a Host header (required by HTTP/1.1), so the server knows which site you want, along with other metadata or credentials (e.g., User-Agent, Accept-Language, cookies, auth tokens).
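As a rough sketch of the steps above, the following uses Python's standard socket and ssl modules. The function names build_get_request and fetch_over_tls and the User-Agent value are assumptions made for this example; a real browser sends many more headers and would usually negotiate HTTP/2 or HTTP/3 instead.

```python
import socket
import ssl


def build_get_request(host, path="/"):
    """Serialize a minimal HTTP/1.1 GET request: request line, headers, blank line."""
    lines = [
        f"GET {path} HTTP/1.1",      # request line: method, path, version
        f"Host: {host}",             # required by HTTP/1.1
        "User-Agent: sketch/0.1",    # hypothetical client name
        "Accept-Language: en",
        "Connection: close",         # ask the server to close when done
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch_over_tls(host, path="/", port=443):
    """TCP connect, TLS handshake (with SNI), send the request, read the reply."""
    context = ssl.create_default_context()   # validates certificate chain and hostname
    with socket.create_connection((host, port), timeout=10) as tcp:
        # server_hostname is what gets sent as SNI during the handshake.
        with context.wrap_socket(tcp, server_hostname=host) as tls:
            tls.sendall(build_get_request(host, path))
            chunks = []
            while True:
                data = tls.recv(4096)
                if not data:
                    break
                chunks.append(data)
    return b"".join(chunks)
```

Only fetch_over_tls touches the network; build_get_request can be inspected on its own to see exactly which bytes go over the wire.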

Web Server Responds

On the server side, a web server (like NGINX, Apache, IIS, etc.) is listening on the requested port.

It accepts the connection, reads the HTTP request, and uses the Host header and its virtual-host configuration to pick the right site.
The server then looks up the requested path:
1. If it's a static file, it reads the file from disk.
2. If the page is dynamic, it passes the request to an application backend (e.g., one written in Node.js, Rust, or Go) to generate content.
3. In either case, once the server has the content, it sends an HTTP response back to the client.
If the file was found or the content was generated successfully, this response has status 200 OK and includes HTML (or other data, such as JSON) in the body.
If the file/path doesn't exist, the server returns 404 Not Found (and may serve a "Not Found" page). Servers can also issue redirects, for example a 301 or 302, which points the browser to a different URL.
Throughout this process, the server may log the request and apply access rules or rewrite rules. It also adds response headers such as Content-Type, Content-Length, Cache-Control, or Set-Cookie.
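The decision flow above can be sketched as a single function. This is a toy model, not how NGINX or Apache are implemented: the SITES and REDIRECTS tables, the hello_api handler, and the function names are all hypothetical, and real servers read files from disk rather than from a dictionary.

```python
import json


def hello_api():
    """Hypothetical dynamic handler standing in for an application backend."""
    return json.dumps({"message": "hi"}).encode("utf-8")


# Hypothetical virtual-host table: host -> path -> (content type, bytes or callable).
SITES = {
    "example.com": {
        "/index.html": ("text/html", b"<h1>Hello</h1>"),
        "/api/hello": ("application/json", hello_api),
    },
}
REDIRECTS = {("example.com", "/"): "/index.html"}


def make_response(status, content_type, body, extra_headers=None):
    headers = {"Content-Type": content_type, "Content-Length": str(len(body))}
    headers.update(extra_headers or {})
    return status, headers, body


def handle(host, path):
    """Return (status, headers, body) for a GET request."""
    site = SITES.get(host)                       # pick the site from the Host header
    if site is None:
        return make_response(404, "text/plain", b"Unknown host")
    if (host, path) in REDIRECTS:
        location = REDIRECTS[(host, path)]
        return make_response(301, "text/plain", b"", {"Location": location})
    if path not in site:
        return make_response(404, "text/plain", b"Not Found")
    content_type, content = site[path]
    body = content() if callable(content) else content   # dynamic vs. static content
    return make_response(200, content_type, body)
```

For instance, handle("example.com", "/") yields a 301 with a Location header, while handle("example.com", "/missing") yields a 404.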

Rendering Webpage on the Browser

The browser now has the HTML skeleton.

It begins parsing the HTML to build the DOM (Document Object Model) tree.
As it parses, it finds linked resources such as CSS files (for styling) and media (images, fonts) and fetches them; these can be downloaded in parallel.
The browser parses each CSS file into a CSSOM (CSS Object Model), and then combines the DOM and CSSOM into a render tree that represents the visible content and styles.
The layout engine uses this render tree to compute geometry (the layout or reflow), and then paints pixels to the screen.
Meanwhile, the browser processes JavaScript.
1. Each <script> tag typically causes the browser to download and execute the script.
2. By default, an external <script> will block HTML parsing until it's fetched and run (web developers often use the async or defer attributes so scripts can be loaded without blocking).
3. When JS executes, it can modify the DOM/CSSOM (such as adding content, handling events, etc.) and make the page interactive.
4. For example, it might attach click handlers, animate elements, or fetch more data.
5. By the end of this step, the styled page is visible and interactive in your browser.
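To make the resource-discovery step concrete, here is a small sketch built on Python's standard html.parser module. It only lists what a browser would need to fetch and whether a script would block parsing; it does not build a DOM, CSSOM, or render tree. The class name ResourceFinder and the tuple layout are assumptions made for this example.

```python
from html.parser import HTMLParser


class ResourceFinder(HTMLParser):
    """Collect (kind, url, blocks_parsing) for resources referenced by a page."""

    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)   # boolean attributes such as defer map to None
        if tag == "link" and attrs.get("rel") == "stylesheet" and "href" in attrs:
            self.resources.append(("css", attrs["href"], False))
        elif tag == "img" and "src" in attrs:
            self.resources.append(("image", attrs["src"], False))
        elif tag == "script" and "src" in attrs:
            # An external script blocks parsing unless it is async or defer.
            blocking = "async" not in attrs and "defer" not in attrs
            self.resources.append(("script", attrs["src"], blocking))


PAGE = (
    '<html><head><link rel="stylesheet" href="style.css">'
    '<script src="app.js"></script><script src="late.js" defer></script></head>'
    '<body><img src="logo.png"></body></html>'
)

finder = ResourceFinder()
finder.feed(PAGE)
for kind, url, blocking in finder.resources:
    print(kind, url, "blocks parsing" if blocking else "non-blocking")
```

In this sample page only app.js is reported as blocking, because late.js carries the defer attribute.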

HTTP Protocol

HTTP (Hypertext Transfer Protocol) is the application-level protocol that structures these requests and responses.

It's stateless and request/response-oriented:

A client (browser) sends a request message to the server
The server sends back a response message

Common HTTP Methods

GET: Ask for a page or file (e.g., a picture or homepage).
POST: Send data to the server (e.g., submit a form). Typically used to create new data (the server processes the request body).
PUT/PATCH: Update or replace a resource. PUT generally replaces the entire resource, while PATCH updates parts of it.
DELETE: Delete a resource.
HEAD: Same as GET, but the response contains only headers (no body). Useful for checking whether a file exists without downloading it.

Each method has semantic meaning.

Safe methods (like GET/HEAD) don’t alter state, while others (POST/PUT/DELETE) may change data.
Idempotent methods (GET, HEAD, PUT, DELETE) can be called multiple times and leave the server in the same state as a single call, whereas POST and PATCH are not guaranteed to be idempotent (repeating a POST typically creates a new resource each time).
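The difference between safe, idempotent, and non-idempotent methods is easier to see with a toy in-memory store. The NotesStore class below is hypothetical and only mirrors the method semantics; it is not an HTTP server.

```python
class NotesStore:
    """In-memory resource collection whose methods mirror HTTP method semantics."""

    def __init__(self):
        self.notes = {}
        self._next_id = 1

    def get(self, note_id):
        # Safe: reading never changes server state.
        return self.notes.get(note_id)

    def post(self, text):
        # Not idempotent: every call creates a new note with a new id.
        note_id = self._next_id
        self._next_id += 1
        self.notes[note_id] = text
        return note_id

    def put(self, note_id, text):
        # Idempotent: repeating the call leaves the same final state.
        self.notes[note_id] = text

    def delete(self, note_id):
        # Idempotent: deleting an already deleted note changes nothing.
        self.notes.pop(note_id, None)


store = NotesStore()
store.put(10, "draft")
store.put(10, "draft")     # same state as after the first PUT
store.post("hello")
store.post("hello")        # a second, separate note is created
print(len(store.notes))
```

After these calls the store holds three notes: one from the repeated PUT and two from the repeated POST.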

HTTP Headers

HTTP uses header fields to carry metadata with requests and responses.

Think of these as metadata or extra instructions, like:

What browser are you using
What language do you prefer
Authorization information (e.g., cookies, token)

Request headers tell the server about the client or the request.

The Host header specifies the domain name (and optionally the port) of the server being requested
User-Agent identifies the browser and OS
Accept-Language indicates preferred languages
Cookie or Authorization headers carry session tokens or credentials
These headers let the server tailor its response

There are many more headers, each serving a different purpose.
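As an illustration of how a server might read these headers and tailor its response, here is a minimal sketch. The function names and the supported-language list are assumptions, and the language matching is deliberately simplified: it ignores the q-values that real content negotiation takes into account.

```python
def parse_request_headers(raw):
    """Split a raw header block into a dict keyed by lower-cased header name."""
    headers = {}
    for line in raw.split("\r\n"):
        if not line:
            continue
        name, _, value = line.partition(":")   # split on the first colon only
        headers[name.strip().lower()] = value.strip()
    return headers


def pick_language(headers, supported=("en", "fr"), default="en"):
    """Return the first supported language listed in Accept-Language."""
    for item in headers.get("accept-language", "").split(","):
        lang = item.split(";")[0].strip().lower()   # drop any ;q=... parameter
        if lang in supported:
            return lang
        primary = lang.split("-")[0]                # fr-CA falls back to fr
        if primary in supported:
            return primary
    return default


raw = (
    "Host: example.com:8080\r\n"
    "User-Agent: sketch/0.1\r\n"
    "Accept-Language: fr-CA,fr;q=0.9,en;q=0.8\r\n"
    "Cookie: session=abc123\r\n"
)
headers = parse_request_headers(raw)
print(headers["host"], pick_language(headers))
```

Header names are lower-cased here because HTTP header names are case-insensitive.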

The server sends back a response

The server's HTTP response always begins with a status line containing a status code that indicates the outcome.

Status codes are grouped by class:

2xx (Success): The request succeeded
3xx (Redirection): The client must take additional action to complete the request.
4xx (Client Error): The request was invalid.
5xx (Server Error): The server failed to fulfill a valid request.
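The class of a status code is just its first digit, which the sketch below computes with integer division. Reason phrases come from Python's standard http.HTTPStatus enum; the helper names status_class and describe are assumptions made for this example, and describe raises ValueError for codes the enum does not know.

```python
from http import HTTPStatus

# Classes as listed above, keyed by the first digit of the code.
STATUS_CLASSES = {
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def status_class(code):
    """Map a status code to its class using the hundreds digit."""
    return STATUS_CLASSES.get(code // 100, "Other")


def describe(code):
    """Combine the code, its standard reason phrase, and its class."""
    return f"{code} {HTTPStatus(code).phrase} ({status_class(code)})"


for code in (200, 301, 404, 503):
    print(describe(code))
```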

The server always replies with a status code to let you know what happened:

| Code | Meaning |
| --- | --- |
| 200 | OK – Everything worked |
| 301 | Redirect permanently |
| 302 | Redirect temporarily |
| 400 | Bad request |
| 401 | Unauthorized |
| 403 | Forbidden |
| 404 | Not found |
| 500 | Server error |
| 503 | Server busy/unavailable |

In the response, the status line is followed by the headers (e.g. Content-Type: text/html, Content-Length: 5324), and then by the body (HTML, JSON, image, etc.).
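That three-part layout (status line, headers, body) can be taken apart with a few string operations. The sketch below assumes a complete HTTP/1.x response held in memory with a plain, non-chunked body; parse_response is a hypothetical helper name, not a standard library function.

```python
def parse_response(raw):
    """Split raw HTTP/1.x response bytes into (code, reason, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")       # a blank line ends the headers
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    parts = status_line.split(" ", 2)                # version, code, optional reason
    code = int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return code, reason, headers, body


raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 14\r\n"
    b"\r\n"
    b"<h1>Hello</h1>"
)
code, reason, headers, body = parse_response(raw)
print(code, reason, headers["content-type"], len(body))
```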

If the browser revalidates a cached copy with a conditional request and the content hasn't changed since it was last fetched, the server can respond with 304 Not Modified (with no body) to tell the browser to use its cached copy. Otherwise, it sends the new content.
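One common way to implement this check is with an ETag validator: the server labels each version of the content, and the browser echoes that label back in an If-None-Match header. The text above does not name a specific mechanism, so treat the choice of ETag here as an assumption (If-Modified-Since is another option), along with the helper names and the hash-based tag.

```python
import hashlib


def make_etag(body):
    """Derive a quoted validator from the content (hypothetical scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(request_headers, body):
    """Return (status, headers, body), using 304 when the client's copy is current."""
    etag = make_etag(body)
    if request_headers.get("If-None-Match") == etag:
        return 304, {"ETag": etag}, b""              # no body: use the cached copy
    return 200, {"ETag": etag, "Content-Length": str(len(body))}, body


page = b"<h1>Hello</h1>"
status, headers, _ = respond({}, page)                                  # first visit
revisit_status, _, revisit_body = respond({"If-None-Match": headers["ETag"]}, page)
print(status, revisit_status, len(revisit_body))
```

If the page content changes, the tag no longer matches and the server falls back to a full 200 response.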

TL;DR

A website is made of HTML, CSS, and JavaScript served by a web server.
The browser uses HTTP to ask for files.
Ports + IP + HTTP + status codes help everything work smoothly.
The internet is basically a big "request and response" conversation.

