Around IT in 256 seconds

#10: HTTP protocol

July 28, 2020 | 3 Minute Read

The HTTP protocol is fundamental to the Internet. It’s a simple request-response protocol where the request is initiated by the client, typically a web browser.

HTTP 1.0, 1.1, 2 and 3

However, HTTP is so pervasive that it’s also used for communication between machines through APIs. The most important characteristics of HTTP are:

  • both request and response contain a set of headers
  • both request and response may contain a body
  • body can be anything: HTML, JSON, binary image, MP3, etc.
  • request contains the name of the resource we’d like to access
  • the kind of access is described by one of a few verbs (methods), like GET and POST
  • a response starts with a status code; the most common are 200 OK and 404 Not Found
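To make the list above concrete, here is a minimal sketch of what an HTTP/1.x message looks like on the wire. The helper names build_request and parse_response are made up for illustration, and the parser is deliberately naive: it assumes a well-formed response with a reason phrase and ignores chunked encoding and repeated headers.

```python
def build_request(method, path, host, headers=None, body=b""):
    """Serialize an HTTP/1.1 request: request line, headers, blank line, body."""
    all_headers = {"Host": host, **(headers or {})}
    if body:
        all_headers["Content-Length"] = str(len(body))
    lines = [f"{method} {path} HTTP/1.1"]  # verb + resource name + version
    lines += [f"{name}: {value}" for name, value in all_headers.items()]
    head = "\r\n".join(lines) + "\r\n\r\n"  # a blank line separates headers from body
    return head.encode("ascii") + body


def parse_response(raw):
    """Split a raw HTTP/1.x response into (status code, reason, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    _version, code, reason = status_line.split(" ", 2)  # e.g. "HTTP/1.1 404 Not Found"
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return int(code), reason, headers, body
```

For example, build_request("GET", "/index.html", "example.com") produces a request line, a single Host header and an empty body, while parse_response hands back the status code separately from the headers and the body.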

But rather than explaining the basics, I’d like to discuss the differences between protocol versions. They matter a lot from a performance perspective.

The first major version of the protocol was 1.0. Version 1.0 has one flaw: it opens a new TCP/IP connection for each request. It’s a big deal: creating a new network connection costs at least one round-trip for the TCP handshake, and more when TLS is involved. Also, TCP/IP has a built-in mechanism called slow start. Each new connection is initially rather slow but gets faster over time as the sender increases the amount of data in flight. This mechanism adapts to varying network conditions dynamically. Unfortunately, if you keep opening and closing connections, they may never reach their full potential. To overcome this issue, the Connection: keep-alive request header was added as an afterthought. It asks the server to keep the connection open even after it finishes the response, so another request may come through the same connection. This is called a persistent connection.
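A back-of-the-envelope model shows why reconnecting hurts. This is only a sketch under loud assumptions: the window doubles every round-trip, nothing is ever lost, an idle connection keeps its window, and the round-trip spent on sending the request itself is ignored. The initial window of 10 segments follows RFC 6928; the function names are made up for this example.

```python
INITIAL_WINDOW = 10  # segments; the initial window suggested by RFC 6928


def send(segments, cwnd):
    """Return (round_trips, cwnd_after) for sending `segments` while the
    window doubles every round-trip (slow start, assuming no packet loss)."""
    round_trips = 0
    while segments > 0:
        segments -= cwnd   # one window's worth of data per round-trip
        round_trips += 1
        cwnd *= 2          # slow start: the window grows exponentially
    return round_trips, cwnd


def total_round_trips(response_sizes, persistent):
    """Round-trips needed for a series of responses, each given in segments."""
    total = 0
    cwnd = None
    for size in response_sizes:
        if cwnd is None or not persistent:
            total += 1               # TCP handshake for a new connection
            cwnd = INITIAL_WINDOW    # a new connection starts slow again
        trips, cwnd = send(size, cwnd)
        total += trips
    return total
```

In this toy model, three responses of 100 segments each take 15 round-trips over three separate connections and 7 over one persistent connection, because only the first response pays for the handshake and the slow start.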

In HTTP 1.1, persistent connections are enabled by default and you have to disable them explicitly using the Connection: close header. Another important addition was HTTP pipelining. In HTTP 1.0, even with persistent connections, a client was not allowed to send a second request before the response to the first one was fully received. Pipelining allows sending multiple requests at once and then simply waiting for the responses. This reduces network round-trips and overall waiting time. However, keep in mind that responses must be delivered in the same order as the requests. So if the first response is really slow, the server is not allowed to return subsequent, faster responses ahead of it. This is called head-of-line blocking.
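Persistent connections are easy to observe with nothing but the Python standard library. The sketch below starts a throwaway local server that counts accepted TCP connections, then fetches the same resource a few times, once over a single reused connection and once with Connection: close on every request. CountingHandler, both fetch functions and count_connections are names invented for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the server keep connections open
    connections = 0                # TCP connections accepted so far
    lock = threading.Lock()

    def setup(self):
        # setup() runs once per accepted connection, not once per request
        super().setup()
        with CountingHandler.lock:
            CountingHandler.connections += 1

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def fetch_reusing_connection(port, n):
    """Send n requests over one persistent connection."""
    conn = http.client.HTTPConnection("127.0.0.1", port)
    for _ in range(n):
        conn.request("GET", "/")
        conn.getresponse().read()  # read fully before reusing the connection
    conn.close()


def fetch_with_new_connections(port, n):
    """Send n requests, each one on its own connection."""
    for _ in range(n):
        conn = http.client.HTTPConnection("127.0.0.1", port)
        conn.request("GET", "/", headers={"Connection": "close"})
        conn.getresponse().read()
        conn.close()


def count_connections(fetch, n=3):
    """Run `fetch` against a fresh local server and return how many TCP
    connections the server accepted."""
    CountingHandler.connections = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        fetch(server.server_address[1], n)
    finally:
        server.shutdown()
        server.server_close()
    return CountingHandler.connections
```

Calling count_connections(fetch_reusing_connection) should report a single connection for three requests, while count_connections(fetch_with_new_connections) should report three.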

HTTP/2 is a huge upgrade to the HTTP protocol. It evolved from the SPDY protocol. First of all, HTTP is now a binary protocol. Secondly, headers are compressed to avoid repetition. Thirdly, under a single persistent connection there can be multiple logical streams, each representing a single interaction. All of the above are important, but I’ll focus on streams. Streams are independent of each other, even though they share the same connection. Therefore, it’s easy to send multiple HTTP requests and receive the responses from fastest to slowest. Responses can even interleave half-way. Head-of-line blocking is reduced: a slow response no longer holds back the ones behind it. Also, a stream doesn’t have to be a request-response interaction. The server can even push data to the client without a request. That sounds weird until you realize that a server can proactively push CSS and JavaScript to the browser even before the browser asks for them. Server push and greater parallelism make HTTP/2 a much faster protocol.
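The difference between in-order delivery and independent streams fits in a few lines. In this toy model each response has a time at which the server has it ready; transfer time and bandwidth sharing are ignored, and all function names are made up for illustration.

```python
from itertools import accumulate


def pipelined_delivery(ready_times):
    """In-order delivery (HTTP 1.1 pipelining): a response cannot leave
    before every response queued ahead of it has left."""
    return list(accumulate(ready_times, max))


def multiplexed_delivery(ready_times):
    """Independent streams (HTTP/2): each response leaves as soon as it is ready."""
    return list(ready_times)


def arrival_order(delivery_times):
    """Indices of the responses in the order the client receives them."""
    return sorted(range(len(delivery_times)), key=delivery_times.__getitem__)
```

With ready times of 5, 1 and 2 seconds, in-order delivery hands over all three responses at second 5, in request order. With independent streams the second and third responses arrive first, and only the slow one takes 5 seconds.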

As of this writing, HTTP/3 is not yet released, but some browsers already support it. It is based on the QUIC protocol and runs over UDP rather than TCP/IP. It turned out that head-of-line blocking emerged one more time at the TCP/IP level, which is unaware of the independent streams in HTTP/2. Moving to UDP is an attempt to solve this problem. A lost packet no longer forces the whole connection to wait for recovery and retransmission. Only a single stream is affected.
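Here is a rough sketch of that last point. It assumes every packet carries data for exactly one stream, which is a simplification, and the function name stalled_streams is invented for this example. Given the packets in sending order and the index of a lost one, it returns the streams that have to wait for the retransmission.

```python
def stalled_streams(packets, lost_index, transport):
    """Return the set of streams that must wait for a retransmission.

    `packets` lists, in sending order, the stream id carried by each packet.
    """
    lost_stream = packets[lost_index]
    if transport == "tcp":
        # TCP delivers bytes strictly in order, so everything sent from the
        # lost packet onwards is held back, whichever stream it belongs to.
        return set(packets[lost_index:])
    if transport == "quic":
        # QUIC orders data per stream, so only the lost packet's stream waits.
        return {lost_stream}
    raise ValueError(f"unknown transport: {transport}")
```

For packets belonging to streams a, b, c, a, b with the second packet lost, the TCP case stalls all three streams, while the QUIC case stalls only stream b.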

Tags: http, quic, spdy
