Introduction#
HTTP stands for HyperText Transfer Protocol. HTTP was defined in RFC 1945 and RFC 2616 for HTTP/1.0 and HTTP/1.1 respectively. HTTP has been the backbone of our modern web and internet communication.
HTTP involves two components: a client and a server. The client is an active component that initializes the communication whereas the server is a passive component that waits for incoming communication. They talk to each other using HTTP messages. HTTP defines the structure of these messages to indicate which one is a request sent by clients, and which one is a response sent by servers.
HTTP Message#
As mentioned above, HTTP messages are used to communicate between clients and servers in the HTTP protocol. There are two types of messages: request and response. A request message is sent by the client to a target server to request certain resources. A response message is sent by the server to the client to tell the client the result of the request.
HTTP messages are not some special data structure or complex format. They are basically text messages or strings that have certain structure and format. It will unify the communication between different systems and platforms, and help the servers know how to parse the message and respond to it.
Both requests and responses share the same structure. They consist of three parts: a start line, a header section and an optional body. However, requests and responses have different content in the start line and require different headers to indicate the purpose of the message. The message body is optional and can be empty. Here is the general structure of an HTTP message defined in RFC 2616
generic-message = start-line *(message-header CRLF) CRLF [ message-body ] start-line = Request-Line | Status-Line message-header = field-name ":" [ field-value ] field-name = token field-value = *( field-content | LWS ) field-content = <the OCTETs making up the field-value and consisting of either *TEXT or combinations of token, separators, and quoted-string> message-body = entity-body | <entity-body encoded as per Transfer-Encoding>
HTTP Request#
HTTP Request message is defined in RFC 2616 Section 5.2.
Request = Request-Line *(( general-header | request-header | entity-header ) CRLF) CRLF [ message-body ] Request-Line = Method SP Request-URI SP HTTP-Version CRLF Method = "OPTIONS" | "GET" | "HEAD" | "POST" | "PUT" | "DELETE" | "TRACE" | "CONNECT" Request-URI = "*" | absoluteURI | abs_path | authority HTTP-Version = "HTTP" "/" ( "1.0" | "1.1" | "2" ) general-header = Cache-Control | Connection | Date | Pragma | Trailer | Transfer-Encoding | Upgrade | Via | Warning request-header = Accept | Accept-Charset | Accept-Encoding | Accept-Language | Authorization | Expect | From | Host | If-Match | If-Modified-Since | If-None-Match | If-Range | If-Unmodified-Since | Max-Forwards | Proxy-Authorization | Range | Referer | TE | User-Agent entity-header = [ unknown-header ]
A typical HTTP request message from a browser to a server might look like this:
GET /blogs/how-http-works HTTP/2 Host: www.richardhnguyen.com User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:133.0) Gecko/20100101 Firefox/133.0 Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8 Accept-Language: en-US,en;q=0.5 Accept-Encoding: gzip, deflate, br, zstd Connection: keep-alive Upgrade-Insecure-Requests: 1 Sec-Fetch-Dest: document Sec-Fetch-Mode: navigate Sec-Fetch-Site: cross-site Priority: u=0, i Pragma: no-cache Cache-Control: no-cache TE: trailers
The first line of the request message is called the request line. It contains 3 parts:
- The request method (
GET
in this case) is used to tell the server what action the client wants to perform. There are several request methods in HTTP such asGET
,POST
,PUT
,DELETE
, etc. - The URI (
/index.html
) is the path to the resource that the client wants to access. Depending on the request method, the server might perform different actions on the resource. - The HTTP version (
HTTP/1.1
) is the version of the HTTP protocol that the client is using. Different versions of the HTTP protocol might have different features and formats. For simplicity, HTTP/1.1 is the main focus of this post.
The header section contains additional information about the request and the
client itself. The usage of the headers is versatile. They can be used to
request the resource with specific conditions. However, there are some that are
required in order to make the request valid. For example, the Host
header is
required in HTTP/1.1 to indicate the target server.
Clients can include as many headers as they want to provide necessary
information to the server. Then, how does the server know when the header
section ends? The answer is the empty line (\r\n
). The empty line indicates
the end of the header section and the beginning of the body section.
The final part of the request message is the body. The body is optional, which
means that some requests might have no body at all. For example, a GET
request
usually has no body because it is used to retrieve resources from the server.
However, a POST
request usually has a body because it is used to send data to
the server.
HTTP Response#
HTTP Response message is defined in RFC 2616 Section 6.
Response = Status-Line *(( general-header | response-header | entity-header ) CRLF) CRLF [ message-body ] Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF Status-Code = "100" ; Continue | "101" ; Switching Protocols | "200" ; OK | "201" ; Created | "202" ; Accepted | "203" ; Non-Authoritative Information | "204" ; No Content | "205" ; Reset Content | "206" ; Partial Content | "300" ; Multiple Choices | "301" ; Moved Permanently | "302" ; Found | "303" ; See Other | "304" ; Not Modified | "305" ; Use Proxy | "307" ; Temporary Redirect | "400" ; Bad Request | "401" ; Unauthorized | "402" ; Payment Required | "403" ; Forbidden | "404" ; Not Found | "405" ; Method Not Allowed | "406" ; Not Acceptable | "407" ; Proxy Authentication Required | "408" ; Request Time-out | "409" ; Conflict | "410" ; Gone | "411" ; Length Required | "412" ; Precondition Failed | "413" ; Request Entity Too Large | "414" ; Request-URI Too Large | "415" ; Unsupported Media Type | "416" ; Requested range not satisfiable | "417" ; Expectation Failed | "500" ; Internal Server Error | "501" ; Not Implemented | "502" ; Bad Gateway | "503" ; Service Unavailable | "504" ; Gateway Time-out | "505" ; HTTP Version not supported | extension-code extension-code = 3DIGIT Reason-Phrase = *<TEXT, excluding CR, LF> response-header = Accept-Ranges | Age | ETag | Location | Proxy-Authenticate | Retry-After | Server | Vary | WWW-Authenticate
A typical HTTP response message from a server to a client looks like this:
HTTP/1.1 200 OK Accept-Ranges: bytes Access-Control-Allow-Origin: * Age: 68438 Cache-Control: public, max-age=0, must-revalidate Content-Disposition: inline Content-Length: 161639 Content-Type: text/html; charset=utf-8 Date: Thu, 12 Dec 2024 20:07:37 GMT Etag: "e170d7e754287d0891883f2f5b2fdc8c" Server: Vercel Strict-Transport-Security: max-age=63072000 Vary: RSC, www.Router-State-Tree, www.Router-Prefetch X-Matched-Path: / X-Vercel-Cache: HIT X-Vercel-Id: pdx1::w995t-1734034056964-51f61c3fde64 [ HTML content ]
In HTTP response message, the first line is called the status line. It contains 3 parts:
-
The HTTP version (
HTTP/1.1
) is the version of the HTTP protocol that the server is using. It should be the same as the client’s version. -
The status code (
200
) is a 3-digit number that indicates the result of the request. There are many status codes in HTTP, but the most common ones are200
(OK),404
(Not Found),500
(Internal Server Error), etc. -
The reason phrase (
OK
) is a human-readable description of the status code. It is not used by clients to determine the result of the request, but it is useful for debugging and logging purposes.The reason phrase is removed from later versions of the HTTP protocol, HTTP/2 and HTTP/3. The status code is deemed enough to indicate the result of the request.
The header section are basically the same as in the request message. They contains some general headers and some specific headers that are used for the response. These headers are used to provide additional information about the response so that clients can understand and handle the response properly.
However, there are some headers that are important for the response to be
handled properly. For example, the Content-Type
header is required to tell
the client what type of content the response contains. For example, browsers
use this header to determine how to display the content of a response. If the
Content-Type
header is text/html
, the browser will render the content as
an HTML page. However, if the Content-Type
header is set to text/plain
, the
browser will render the content as plain text.
Another important header is the Content-Length
header. This header is used to
tell the client the length of the response body in bytes. As we mentioned above,
the body section of the response is optional and there is no way for the client
to stop reading the response body. The client will keep reading the response
and cause a blocking issue. Therefore, the Content-Length
header is used to
tell the client how many bytes it should read from the response body. After
reading the specified number of bytes, the client will stop reading the response
body.
The value in Content-Length
header should match the actual length of the body.
If the value is less than the actual length, the client will stop reading the
body prematurely and make the response incomplete. If the value is greater than
the actual length, the client will keep reading the body and cause a blocking
issue as it expects more data to come.
HTTP methods#
HTTP defines several request methods that clients can use to interact with a
resource location in different ways. For example, a GET
request to /login
URI means to retrieve the login page from the server but a POST
request to
the same /login
URI means to send the login credentials that the user wishes
to use to log in.
GET
method#
The GET
method is used to retrieve resources from the server; whatever the
server wants to return to the client. It can be an HTML page, an image, a video
or a JSON object.
A GET
can be also a conditional request when the header section includes these
headers
If-Modified-Since
&If-Unmodified-Since
: The server will return the resource only if it has been modified or not modified respectively since the specified date.If-Match
&If-None-Match
: The server will return the resource only if the entity tag matches or does not match the specified entity tag respectively.If-Range
: The server will return the resource only if the entity tag matches the specified entity tag or the range matches the specified range.
The purpose of these headers is to allow efficient updates of cached information with a minimum amount of data transfer.
For example, a client initially sends a GET
request to retrieve the index page
of a blog. The server returns the page with a Last-Modified
header and an
ETag
header. The client will cache these headers and the content of the body
locally. On sequential requests, the client will include the If-Modified-Since
or If-Match
header to check if the resource has been updated. If the server
verifies and finds that the resource has not been updated, it will return a
304 Not Modified
response with no body. The response implies that the client
can use the cached body to render the page.
POST
method#
The POST
method is mainly used to create a new resource on the server. The
origin server accepts the entity enclosed in the request as a new subordinate of
the resource identified by the request URI.
POST
requests contain a body that describes the entity closed in the request.
The server will use the body to create a new resource. Same as responses with a
body, POST
requests should include the Content-Length
header to indicate the
length of the body. The server will read the body until it reaches the specified
length.
POST
requests should only include the Content-Type
header to indicate the
type of the body. Its purpose is for the server to know how to parse the body
and store it in an appropriate data structure. For example, POST
data from a
form will have the Content-Type: application/x-www-form-urlencoded
header.
Origin servers can use this header to parse the body, extract the form data and
store it for efficient access. Or in REST servers, POST
data are often in JSON
format, so the Content-Type: application/json
header is used to tell the
origin server to parse the body as a JSON object.
If a new resource is created, the origin server should return a 201 Created
status code in the response and include a Location
header to indicate the URI
of the newly created resource. The client can use the URI to access the newly
created resource.
PUT
method#
The PUT
method is also used to send data to the server. With the Request URI,
the server should store the enclosed entity under the supplied Request URI. If
the Request URI refers to an already existing resource, the enclosed entity
should be considered as a modified version of the resource.
PUT
requests are idempotent, which means that multiple identical requests
should have the same effect as a single request. For example, if a client sends
a PUT
request to create a new resource, the server should create the resource
and return a 201 Created
status code. If the client sends the same PUT
request again, the server should return a 200 OK
status code because the
resource has already been created.
The fundamental difference between the
POST
andPUT
requests is reflected in the different meaning of theRequest-URI
. The URI in aPOST
request identifies the resource that will handle the enclosed entity. That resource might be a data-accepting process, a gateway to some other protocol, or a separate entity that accepts annotations. In contrast, the URI in aPUT
request identifies the entity enclosed with the request — the user agent knows what URI is intended and the server MUST NOT attempt to apply the request to some other resource. If the server desires that the request be applied to a different URI,— RFC 2616 Section 9.51
To illustrate this difference, POST
requests means that you let server decide
which resource to be handled with the enclosed entity. On the other hand, PUT
requests means that you tell the server which resource specifically to be
handled with the enclosed entity.
For example, a typical POST
request to a /users
URI means that you let the
server decide how to create with the enclosed entity:
POST /users HTTP/1.1 Content-Type: application/json Content-Length: 100 Connection: close { "fullName": "John Doe", "email": "johndoe@gmail.com", "userName": "johndoe", "password": "a_secret_password", "role": "user", "verified": false }
The URI /users
is ambiguous. It does not specify which user to create and lets
the server decide how to create the user. The result of a successful request can
result with a username, /users/johndoe
, or a user ID generated by the
server, /users/123456
.
On the other hand, a typical PUT
request to a /users/johndoe
URI means that
you tell the server to create a user with the enclosed entity under the /users
URI:
PUT /users/johndoe HTTP/1.1 Content-Type: application/json Content-Length: 80 Connection: close { "fullName": "John Doe", "email": "johndoe@gmail.com", "password": "a_secret_password", "role": "user", "verified": false }
If the user at the URI /users/johndoe
does not exist, the server should create
a new one with the enclosed entity. If the user already exists, the server
should update the user with the same data, depending on how the server is
implemented.
DELETE
method#
The DELETE
method is used to remove a resource identified by the Request URI
from the server. Same as PUT
requests, DELETE
requests tell the server
which resource specifically to be removed. For example,
DELETE /users/johndoe HTTP/1.1 Connection: close
HEAD
method#
The HEAD
method is pretty much identical to the GET
method, but instead of
returning the complete response with status line, headers and body, the server
only returns the status line and headers, and omits the body in the response.
HEAD
requests are often used to check the status of the resource without
transferring the entire body. It’s useful in testing, validity, accessibility,
and recent modification of the resource.
The response to HEAD
requests are identical to the response to GET
requests
except that the body is omitted. That includes the status line and headers. For
example:
HEAD / HTTP/1.1 Host: www.washington.edu
HTTP/1.1 200 OK date: Fri, 13 Dec 2024 00:36:01 GMT server: Apache upgrade: h2 connection: Upgrade last-modified: Thu, 12 Dec 2024 17:30:07 GMT etag: "12eeb-6291610c14c89" accept-ranges: bytes content-length: 77547 content-type: text/html
OPTIONS
method#
The OPTIONS
method is used to describe the communication options for the target resource. In other words, OPTIONS
requests are used to ask the server what methods are allowed on the target resource before sending the actual requests.
OPTIONS
requests are often used in cross-origin requests to check if the
server allows requests with certain methods, certain headers and from certain
origins.
For example, a client wants to know if a POST
request is allowed on a
resource. The client will send an OPTIONS
request to the server to check if
the server allows POST
requests on the resource.
OPTIONS / HTTP/1.1 Host: www.example.com
Typically, the server will respond a 200 OK
or 204 No Content
status code
with the Allow
2 header to in the response to indicate which methods are
allowed on the resource.
HTTP/1.1 200 OK Allow: OPTIONS, GET, HEAD, POST Cache-Control: max-age=604800 Content-Type: text/html; charset=UTF-8 Date: Fri, 13 Dec 2024 05:39:01 GMT Expires: Fri, 20 Dec 2024 05:39:01 GMT Server: EOS (vny/044F) Content-Length: 0
In the response, example.com
allows OPTIONS
, GET
, HEAD
, and POST
.
Therefore, a POST
request can be made to the resource.
In modern web servers, the Allow
header is usually replaced by the
Access-Control-Allow-Methods
header. The Access-Control-Allow-Methods
header
is used in cross-origin requests (CORS)3. It tells the client which methods
are allowed on the resource in the target origins. For example, if an OPTIONS
request is made to this blog page:
OPTIONS /blogs/how-http-works HTTP/1.1 Host: www.richardhnguyen.com
The server will respond with the Access-Control-Allow-Methods
header to tell
the client which methods are allowed on the resource in the target origin.
HTTP/1.1 204 No Content Access-Control-Allow-Methods: OPTIONS, GET, HEAD Access-Control-Allow-Origin: * Age: 0 Cache-Control: public, max-age=0, must-revalidate Content-Disposition: inline Date: Fri, 13 Dec 2024 05:46:18 GMT Server: Vercel Strict-Transport-Security: max-age=63072000 [ ... other headers]
TRACE
and CONNECT
methods#
The last two methods, TRACE
and CONNECT
, are less common and are used mainly
for debugging and proxy purposes. TRACE
requests allow clients to see what is
being received tat the other end of the request chain.
TRACE / HTTP/1.1 Host: www.example.com
The server will respond with the same request that the client sent. The response is useful for debugging purposes to see if the request is being modified by intermediate proxies or servers.
HTTP/1.1 200 OK Content-Type: message/http Content-Length: 100 Connection: close TRACE / HTTP/1.1 Host: www.example.com [ ... other headers]
However, the TRACE
method is considered a security risk because it can expose
sensitive information to attackers. Therefore, it is disabled by default in most
browsers and servers.
The CONNECT
method is used to establish a tunnel to the server identified by
the target resource. Upon successful connection, the client and the server can
blindly send data to each other.
HTTP Status Codes#
HTTP status codes are 3-digit numbers that indicate the result of the request. There are 5 classes of status codes in HTTP status codes
1xx
: Informational - Request received, continuing process2xx
: Success - The action was successfully received, understood, and accepted3xx
: Redirection - Further action must be taken in order to complete the request4xx
: Client Error - The request contains bad syntax or cannot be fulfilled5xx
: Server Error - The server failed to fulfill an apparently valid request
1xx
Informational#
1xx
status codes are used to indicate that the server has received the request
and is processing it. The client should continue with the request. There are
several 1xx
status codes in HTTP, but the most common ones are:
100 Continue
: The server has received the request headers and the client should proceed with the request. The server will send a final response after the request body is received.101 Switching Protocols
: The server is changing protocols. The client should switch to the new protocol and continue with the request.
2xx
Success#
2xx
status codes are the most common status codes in HTTP. They indicate that
the request was successfully processed by the server. There are some common 2xx
status codes:
200 OK
: The request was successful and the server has returned the requested resource.201 Created
: The request was successful and the server has created a new resource.204 No Content
: The request was successful but the server has not returned any content in the response body.206 Partial Content
: The server has returned only part of the requested resource. This status code is used in response toRange
requests.
3xx
Redirection#
3xx
status codes are used to indicate that the client must take additional
action to complete the request. There are several 3xx
status codes in HTTP,
but the most common ones are:
301 Moved Permanently
: The requested resource has been permanently moved to a new location. The client should update its bookmarks and links to the new location.302 Found
: The requested resource has been temporarily moved to a new location. The client should use the new location for this request only.304 Not Modified
: The requested resource has not been modified since the last request. The client can use the cached version of the resource.
4xx
Client Error#
4xx
status codes are used to indicate that there was an error in the request
made by the client. The purpose of these status codes is to inform the client
that the request was not valid. It could be due to invalid syntax, missing
headers, unauthorized access, or other reasons. Basically, the errors are caused
by the client and the server refuses to process the request further.
Some common 4xx
status codes are:
400 Bad Request
: The request was invalid and the server cannot process it.401 Unauthorized
: The client must authenticate itself to get the requested resource.403 Forbidden
: The client does not have permission to access the requested resource.404 Not Found
: The requested resource was not found on the server.405 Method Not Allowed
: The request method is not allowed for the requested resource.408 Request Timeout
: The server timed out while waiting for the request.
4xx
status codes are sometimes implemented in detail by the server to
indicate the exact error. However, if the error is not specific, the server
can return a 400 Bad Request
status code for general client errors.
5xx
Server Error#
5xx
status codes are used to indicate that there was an error on the server
side while processing the request. The purpose of these status codes is to inform the client that the request was valid but the server failed to process it. It could be due to server overload, server crash, server misconfiguration, or other reasons. Basically, the errors are caused by the server and the client should try the request again later.
Some common 5xx
status codes are:
500 Internal Server Error
: The server encountered an unexpected condition that prevented it from fulfilling the request.501 Not Implemented
: The server does not support the functionality required to fulfill the request.502 Bad Gateway
: The server received an invalid response from an upstream server while processing the request.503 Service Unavailable
: The server is currently unavailable due to overload or maintenance.504 Gateway Timeout
: The server did not receive a timely response from an upstream server while processing the request.
501
and 405
Status Code#
The 501 Not Implemented
status code is used to indicate that the server does
not support the functionality required to fulfill the request. For example, if a
client sends a PATCH
request to a server that does not support the PATCH
method, the server should return a 501 Not Implemented
status code.
The 405 Method Not Allowed
status code is used to indicate that the request
method is not allowed for the requested resource. For example, if a client sends
a POST
request to a resource that only allows GET
requests, the server should return a 405 Method Not Allowed
status code.
401
and 403
Status Code#
The 401 Unauthorized
status code is used to indicate that the client must
authenticate itself to get the requested resource. Maybe the client has not
included a cookie or an authorization header in the request. The client
should retry the request with the proper credentials.
The 403 Forbidden
status code is used to indicate that the client understood
the request but refused to fulfill it.
HTTP Persistent Connections#
From many example HTTP messages above, you might notice that clients and servers
often send the header Connection
in HTTP messages. If sent by the client with
the value keep-alive
, it requests the server to keep the connection open. If
send by the server with the value keep-alive
, it indicates that the server
agrees to keep the connection open. This is called a persistent connection.
However, if the value is close
, it indicates that the server will close the
connection after sending the response. The client should establish a new
connection if it wishes to send new data. This is called a non-persistent
connection.
Persistent connections are based on the underlying TCP connection. When a client sends a request to a server, it establishes a TCP connection to the server. The The process of establishing a TCP connection is expensive because it involves many steps such as the TCP handshake, the TLS handshake and the teardown. In reality, the process can be even more complex.
Therefore, persistent connections are used to reduce the overhead of
establishing. This is the main different between HTTP/1.0 and HTTP/1.1. In HTTP/
1.0, the default behavior is to close the connection after sending the
response. In HTTP/1.1, the default behavior is to keep the connection open
unless the server sends a Connection: close
header.
In the example above, for every request, the client has to establish a new TCP connection to the server. After successfully receiving a response, both client and server close the connection. The process repeats for every request.
The example above shows how persistent connections reduce the overhead of TCP
connections. By keeping the connection open, the client can send multiple
requests using the same TCP connection. Unless there is a timeout or the server
sends a Connection: close
header explicitly, the connection will be kept open.
In later version of the HTTP protocol, HTTP/2 and HTTP/3, persistent connections
are handled differently. Therefore, the Connection
header is not used. In
fact, it is prohibited:
HTTP/3 does not use the Connection header field to indicate connection- specific fields; in this protocol, connection-specific metadata is conveyed by other means. An endpoint MUST NOT generate an HTTP/3 field section containing connection-specific fields; any message containing connection- specific fields MUST be treated as malformed.
— RFC 91144
Stateless and Stateful HTTP#
By default, HTTP is a stateless protocol. It means that each request is handled separately and independently. The server does not keep track of the state of the client between requests. Therefore, each client request is required to put some information in the request to identify itself.
For example, when a client successfully logs in to a website, the server provides some authorization token to the client. The client must send the token in every subsequent request to access protected resources. The server does not keep track of the client’s login state. If the client loses the token, it must log in again to get a new one.
But them the question is that how does the server send the token and how does the client send it back in subsequent requests? The answer is Cookie 🍪.
Let’s take a look at the following simple authentication:
Upon a successful login, the server sends a Set-Cookie
header in the response
to the client. The Set-Cookie
header will tell the client’s browser to store
the cookie with the name token
and the value abcxx
. The client’s browser
will automatically include the cookie with the header Cookie
in subsequent
requests to the server. The server will extract the cookie from the header and
verify it with the database. If the cookie is valid, the server will return the
protected resources to the client.
Modern web applications also use different methods to maintain the state of the
client. Some will use the Authorization
header to send the token back to the
server. Some servers send the authorization token in the body of the response
and the client will store it in a local storage. With this method, developers
on client apps need to manually include the token in the Authorization
header.
HTTP over TLS#
Transport Layer Security (TLS) is a cryptographic protocol that provides a secure communication channel between two parties over the Internet. It is primarily used to encrypt the data exchange in HTTP messages between web applications and servers. A HTTP message over TLS is called HTTPS. HTTPS is set up on top of TCP and below the application layer.
What TLS provides:
- Encryption: hides the exchanged data from third parties.
- Authentication: ensures that the data is sent and received by the intended parties.
- Integrity: ensures that the data is not tampered with during the transmission.
In a nutshell, HTTPS provides a secure communication channel in which only the end sender and the end receiver can read the data. Even if the data travel across the Internet, it is encrypted and cannot be read by third parties. Without HTTPS, third parties such as Internet Service Providers (ISPs), gateways and eavesdroppers can read the data and potentially steal sensitive information.
HTTP servers can be configured to listen on many different port numbers as long
as they are valid and not in use. However, the default port number for HTTP and
HTTPS are 80
and 443
respectively. When a client sends a request to a server
without specifying the port number, the default port number is used based on the
the scheme of the URI. For example, if a client sends a request to http://www.example.com
. The default port number is 80
. If the client sends a request to https://www.example.com
, the default port number is 443
.
Conclusion#
HTTP has been the backbone of our modern web applications and the main protocol for communication between clients and servers. By defining a set of rules or formats of messages, HTTP allows clients and servers to transfer data over the Internet. That includes texts, images, videos, or hyperlinks. By learning the basic concepts of HTTP, you can understand how HTTP works under the hood in a basic level.
You can follow my GitHub repository, richardnguyen99/http-from-scratch, to learn more about building your own HTTP server from scratch in C++ without using any third-party HTTP libraries.
HTTP also has many advanced and complex features that are used to solve some common problems in HTTP. For more details, you can refer to my other posts:
- Cross-Origin Resource Sharing (CORS)
- Web caching
- So you would like some cookies
- head-of-line blocking: Differences between HTTP/1.1, HTTP/2 and HTTP/3