HTTP Protocols
Introduction
The Hypertext Transfer Protocol (HTTP) is the application-layer protocol used for communication between clients and servers on the web. It enables retrieval of resources such as HTML, JSON, or binary content by defining a standardized format for request and response messages.
HTTP is stateless: the protocol itself keeps no memory between requests, so any continuity across requests must be carried explicitly through cookies, session identifiers, or tokens.
HTTP vs HTTPS
Feature | HTTP | HTTPS |
---|---|---|
Encryption | No | Yes (via TLS) |
Default Port | 80 | 443 |
Visibility | Unencrypted, readable by intermediaries | Encrypted between the client and the server's TLS endpoint |
Certificate Required | No | Yes (X.509 certificate) |
HTTPS is HTTP over TLS (Transport Layer Security). It provides integrity, confidentiality, and authentication using cryptographic certificates and a handshake process.
Request Structure
An HTTP request typically consists of:
- A request line, e.g., `GET /index.html HTTP/1.1`
- Request headers, e.g., `Host`, `User-Agent`, `Accept`, `Content-Type`
- An optional body (for methods like POST or PUT)
Example:

    GET /resource HTTP/1.1
    Host: example.com
    User-Agent: curl/8.0
    Accept: */*
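The request layout described above can be sketched in a few lines of Python. This is a minimal illustration rather than a full client: `build_request` is a hypothetical helper name, and it assumes header values are plain ASCII.

```python
def build_request(method: str, path: str, headers: dict[str, str], body: bytes = b"") -> bytes:
    """Serialize an HTTP/1.1 request: request line, headers, blank line, optional body."""
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Each line ends with CRLF; an empty line separates the headers from the body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


raw = build_request(
    "GET",
    "/resource",
    {"Host": "example.com", "User-Agent": "curl/8.0", "Accept": "*/*"},
)
```

The resulting bytes are what a client would write to the socket. A real client would also set `Content-Length` whenever it sends a body.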
Response Structure
An HTTP response from a server includes:
- A status line, e.g., `HTTP/1.1 200 OK`
- Response headers, e.g., `Content-Type`, `Server`, `Set-Cookie`
- An optional body (e.g., JSON, HTML, binary data)
Example:

    HTTP/1.1 200 OK
    Content-Type: application/json
    Content-Length: 42

    {"status":"ok","data":["value1","value2"]}
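To make the three parts of a response concrete, the following sketch splits a raw response into its status line, headers, and body. `parse_response` is a hypothetical helper and deliberately simplified: it assumes the whole response is already in memory and ignores chunked encoding and repeated headers.

```python
def parse_response(raw: bytes) -> tuple[str, int, str, dict[str, str], bytes]:
    """Split a raw HTTP/1.1 response into version, status code, reason, headers, and body."""
    # The first blank line (CRLF CRLF) marks the end of the headers.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalize them for lookup.
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body


raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: application/json\r\n"
    b"\r\n"
    b'{"status":"ok"}'
)
version, code, reason, headers, body = parse_response(raw)
```

In practice the standard library's `http.client` handles this parsing; the sketch only shows where the boundaries between the parts lie.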
HTTP Methods
Method | Description | Notes |
---|---|---|
GET | Retrieves a resource | Safe and idempotent; should not change server state |
POST | Submits data to a resource | Used for creating resources or triggering actions; not idempotent |
PUT | Replaces a resource | Idempotent |
PATCH | Partially modifies a resource | Efficient for small updates; not guaranteed to be idempotent |
DELETE | Removes a resource | Idempotent; destructive, so use with caution |
HEAD | Same as GET but returns headers only | Used to check metadata or validate a cached copy |
OPTIONS | Returns allowed methods for a resource | Used in CORS preflight |
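The difference between an idempotent method such as PUT and a non-idempotent one such as POST is easiest to see with state. The sketch below uses a hypothetical in-memory `NoteStore` in place of a real server; the method names mirror the HTTP methods only for illustration.

```python
import itertools


class NoteStore:
    """Hypothetical in-memory resource collection keyed by id."""

    def __init__(self) -> None:
        self.notes: dict[int, str] = {}
        self._ids = itertools.count(1)

    def post(self, text: str) -> int:
        # POST creates a new resource on every call, so it is not idempotent.
        note_id = next(self._ids)
        self.notes[note_id] = text
        return note_id

    def put(self, note_id: int, text: str) -> None:
        # PUT replaces the resource at a known id; repeating it changes nothing further.
        self.notes[note_id] = text

    def delete(self, note_id: int) -> None:
        # DELETE is idempotent too: deleting an absent resource leaves the same state.
        self.notes.pop(note_id, None)


store = NoteStore()
store.post("draft")
store.post("draft")      # the second POST creates a second note
store.put(1, "final")
store.put(1, "final")    # the second PUT leaves the store unchanged
```

This property is why clients and proxies can usually retry PUT or DELETE after a timeout, while retrying POST risks duplicates.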
Status Codes
HTTP status codes are three-digit numbers returned by the server to indicate the result of a request. They are grouped into five categories based on their first digit.
Code | Category | Meaning | Example Scenario |
---|---|---|---|
100 | Informational | Continue | Client should continue the request |
101 | Informational | Switching Protocols | Protocol upgrade requested |
200 | Success | OK | Standard response for successful request |
201 | Success | Created | Resource successfully created (e.g., via POST) |
204 | Success | No Content | Request succeeded, no body returned |
301 | Redirection | Moved Permanently | Resource has a new permanent URI |
302 | Redirection | Found | Temporarily redirected to another URI |
304 | Redirection | Not Modified | Cached content is still valid |
400 | Client Error | Bad Request | Malformed syntax or invalid parameters |
401 | Client Error | Unauthorized | Authentication required, or the supplied credentials were rejected |
403 | Client Error | Forbidden | Client lacks permission, even if authenticated |
404 | Client Error | Not Found | Resource not found at requested URI |
405 | Client Error | Method Not Allowed | Method not supported on resource |
408 | Client Error | Request Timeout | Server timed out waiting for request |
429 | Client Error | Too Many Requests | Rate-limiting or abuse prevention |
500 | Server Error | Internal Server Error | Generic server failure |
502 | Server Error | Bad Gateway | Invalid response from upstream server |
503 | Server Error | Service Unavailable | Server temporarily overloaded or down |
504 | Server Error | Gateway Timeout | Upstream server did not respond in time |
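The grouping by first digit can be expressed directly in code. The sketch below combines a category lookup with the reason phrases from Python's standard `http.HTTPStatus` enum; `CATEGORIES` and `describe` are names chosen for this example.

```python
from http import HTTPStatus

CATEGORIES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def describe(code: int) -> str:
    """Return 'code category: reason phrase', using the first digit for the category."""
    category = CATEGORIES[code // 100]
    # HTTPStatus raises ValueError for codes it does not know.
    return f"{code} {category}: {HTTPStatus(code).phrase}"
```

For example, `describe(404)` returns `"404 Client Error: Not Found"`.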
Important Headers
Header | Purpose |
---|---|
Host | Host name (and optional port) of the server being requested; required in HTTP/1.1 |
User-Agent | Identifies the requesting client |
Accept | Lists acceptable response formats |
Authorization | Contains credentials (e.g., Basic or Bearer) |
Content-Type | Declares media type of the request/response body |
Set-Cookie | Instructs browser to store a cookie |
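Referer | URL of the page that linked to or initiated the request |
X-Forwarded-For | De facto header listing the original client IP when the request passes through proxies; client-controllable, so not trustworthy on its own |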
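As an illustration of how the `Authorization` header is formed, the sketch below builds Basic and Bearer values. The helper names and the credentials are placeholders for this example only.

```python
import base64


def basic_auth(username: str, password: str) -> str:
    """Build a Basic Authorization value: base64 of 'username:password'."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def bearer_auth(token: str) -> str:
    """Build a Bearer Authorization value from an already-issued token."""
    return f"Bearer {token}"


headers = {
    "Host": "example.com",
    "Accept": "application/json",
    "Authorization": basic_auth("user", "pass"),
}
```

Base64 is an encoding, not encryption: anyone who sees a Basic header can recover the password, which is one reason these headers should only travel over HTTPS.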
Session Handling
Because HTTP is stateless, persistent user sessions require one or more of the following:
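- Cookies – Small values stored by the client and sent back with each request, typically carrying a session identifier
- Session IDs – Opaque identifiers that reference session state stored on the server
- JWT (JSON Web Tokens) – Signed tokens that carry session claims themselves, so the server can verify them without a lookup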
These mechanisms allow users to remain "logged in" and retain application context across requests.
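One way to combine these ideas is a session cookie whose value is signed by the server, so tampering can be detected. The sketch below is an assumption-laden illustration, not a JWT implementation: `SECRET_KEY`, the cookie name `session`, and the helper names are all hypothetical.

```python
import hashlib
import hmac
import secrets
from http.cookies import SimpleCookie
from typing import Optional

SECRET_KEY = secrets.token_bytes(32)  # hypothetical per-server signing key


def sign_session(session_id: str) -> str:
    """Return 'session_id.signature' so the server can detect tampering."""
    sig = hmac.new(SECRET_KEY, session_id.encode(), hashlib.sha256).hexdigest()
    return f"{session_id}.{sig}"


def verify_session(value: str) -> Optional[str]:
    """Return the session id if the signature is valid, otherwise None."""
    session_id, _, sig = value.rpartition(".")
    expected = hmac.new(SECRET_KEY, session_id.encode(), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through comparison timing.
    if hmac.compare_digest(sig.encode(), expected.encode()):
        return session_id
    return None


def set_cookie_header(session_id: str) -> str:
    """Build the value of a Set-Cookie header carrying the signed session id."""
    cookie = SimpleCookie()
    cookie["session"] = sign_session(session_id)
    cookie["session"]["httponly"] = True
    cookie["session"]["secure"] = True
    cookie["session"]["path"] = "/"
    return cookie["session"].OutputString()
```

On each later request the server would read the `session` cookie, call `verify_session`, and load the matching server-side state only if the signature checks out.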
Caching and Proxying
Caching reduces latency and bandwidth usage. Browsers and proxy servers store cacheable responses and decide how long to reuse them based on headers such as:
- `Cache-Control`
- `ETag`
- `Last-Modified`
- `Expires`
Together these headers define rules such as expiration time, revalidation conditions, and when a cache must be bypassed.
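Revalidation with `ETag` can be sketched as a small server-side function. This is a simplified illustration: `make_etag` and `respond` are hypothetical names, the `max-age=60` value is arbitrary, and a real server would also handle multiple tags and `*` in `If-None-Match`.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive an ETag from the body; hashing the content is one common choice."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, request_headers: dict[str, str]) -> tuple[int, dict[str, str], bytes]:
    """Return (status, headers, body), answering 304 when the client's copy is current."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "max-age=60"}
    if request_headers.get("If-None-Match") == etag:
        # The cached copy is still valid: send headers only, no body.
        return 304, headers, b""
    return 200, headers, body
```

A client that stored the first response sends its `ETag` back in `If-None-Match`; if the content has not changed, the server answers `304 Not Modified` and saves the cost of resending the body.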
TLS Handshake (in HTTPS)
The TLS handshake occurs before any HTTP data is exchanged:
- ClientHello: Client proposes supported ciphers and sends a random value
- ServerHello: Server selects a cipher and provides its certificate
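- Key exchange: Both sides derive shared session keys without sending them over the wire
- Symmetric encryption begins: Application data, including HTTP messages, is encrypted with those keys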
This secures the session and validates the server identity.
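From a client's point of view, the handshake happens when the socket is wrapped in TLS. The sketch below uses Python's standard `ssl` module; `fetch_tls_info` is a hypothetical helper, and the 5-second timeout is an arbitrary choice.

```python
import socket
import ssl


def fetch_tls_info(host: str, port: int = 443) -> tuple[str, str]:
    """Complete a TLS handshake and return (protocol version, cipher name)."""
    # The default context verifies the certificate chain and the hostname.
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=5) as sock:
        # wrap_socket performs the handshake before any HTTP bytes are sent.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version(), tls.cipher()[0]
```

Calling `fetch_tls_info("example.com")` requires network access. If the certificate does not validate for the host, `wrap_socket` raises `ssl.SSLCertVerificationError` instead of returning.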
Diagnostic Tools
Tool | Example Usage | Function |
---|---|---|
curl | curl -v https://example.com | Command-line HTTP client |
wget | wget https://example.com/file.txt | File downloader |
tcpdump | tcpdump -A -i eth0 port 80 | Inspect raw HTTP packets |
mitmproxy | mitmproxy -p 8080 | Intercept and modify traffic |
browser devtools | F12 → Network tab | View request and response headers |
Common Weaknesses
- Sensitive data in query parameters (GET)
- Lack of TLS encryption on login forms
- Overly permissive `Access-Control-Allow-Origin` headers (e.g., `*` on endpoints that return private data)
- Improper use of caching for private data
- Default error pages revealing server information
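Several of these weaknesses can be spotted mechanically. The sketch below is a rough heuristic check, not a scanner: `SENSITIVE_PARAMS` is a hypothetical list of parameter names, and the caching rule is an assumption about what counts as risky.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical list of parameter names treated as sensitive in this sketch.
SENSITIVE_PARAMS = {"password", "token", "api_key", "session"}


def audit(url: str, response_headers: dict[str, str]) -> list[str]:
    """Return findings for a request URL and the headers of its response."""
    findings = []
    parts = urlsplit(url)
    if parts.scheme != "https":
        findings.append("request not sent over TLS")
    names = {name.lower() for name in parse_qs(parts.query, keep_blank_values=True)}
    for name in sorted(SENSITIVE_PARAMS & names):
        findings.append(f"sensitive query parameter: {name}")
    headers = {k.lower(): v for k, v in response_headers.items()}
    if headers.get("access-control-allow-origin") == "*":
        findings.append("wildcard Access-Control-Allow-Origin")
    cache_control = headers.get("cache-control", "")
    # Assumption: a response that sets a cookie should not be stored by shared caches.
    if "set-cookie" in headers and not any(d in cache_control for d in ("no-store", "private")):
        findings.append("cookie-setting response lacks private/no-store")
    return findings
```

For a plain-HTTP login URL with a `password` query parameter, the function reports both the missing TLS and the exposed parameter.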
See also
- curl
- tcpdump