@adlrocha - A QUIC look to HTTP
A brief history of one of the cores of the web
Alfonso de la Rocha, May 31, 2020
If I asked you what the killer app of the Internet is, what would you say? For me, it is not Google, nor Facebook; it is all of them. The killer app of the Internet is the Web itself. We use the Web for more and more of our daily tasks. We even file our taxes and buy our groceries online (more so considering the times we are living in). Along with DNS, the core protocol of the Web is HTTP, and in this post I will try to walk you through its history and its most recent major updates.
An overview of HTTP
HTTP is a network protocol that allows the fetching of resources, such as HTML documents, from a distant machine. It is the foundation of any data exchange on the Web and it is a client-server protocol, which means requests are initiated by the recipient of the data, usually a Web browser. A complete document is reconstructed from the different sub-documents fetched (text, layout description, images, videos, scripts, and more). Sounds familiar, right? This is how your favorite websites are rendered on your machine.
HTTP is an extensible protocol which has evolved over time. It is an application layer protocol that is sent over TCP, or over a TLS-encrypted TCP connection, though any reliable transport protocol could theoretically be used (this concept will come in handy when we talk about QUIC). Due to its extensibility, it is used to fetch any kind of resource (JSON files, media, data blobs, etc.).
When a client wants to communicate with a server, either the final server or an intermediate proxy, it performs the following steps:
Open a TCP connection: The TCP connection is used to send a request, or several, and receive an answer. The client may open a new connection, reuse an existing connection, or open several TCP connections to the servers.
Send an HTTP message: HTTP messages (before HTTP/2) are human-readable. With HTTP/2, these simple messages are encapsulated in frames, making them impossible to read directly, but the principle remains the same (more about this in the next section).
Close or reuse the connection for further requests.
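To make the steps above a bit more tangible, here is a minimal sketch in Python of what a plain HTTP/1.1 exchange looks like on the wire. The helper names (build_request, parse_status_line, fetch) are my own invention for illustration, and the sketch assumes an unencrypted server on port 80 that honours "Connection: close".

```python
import socket


def build_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request as plain, human-readable text."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",  # ask the server to close the connection after replying
        "",                   # an empty line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(response: bytes) -> tuple[str, int, str]:
    """Split the first line of a response into (version, status code, reason)."""
    status_line = response.split(b"\r\n", 1)[0].decode("ascii")
    parts = status_line.split(" ", 2)
    reason = parts[2] if len(parts) > 2 else ""
    return parts[0], int(parts[1]), reason


def fetch(host: str, path: str = "/", port: int = 80) -> bytes:
    """Open a TCP connection, send one request, and read until the server closes."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(build_request(host, path))
        chunks = []
        while chunk := conn.recv(4096):
            chunks.append(chunk)
    return b"".join(chunks)


if __name__ == "__main__":
    # The request is just text: you can print it and read it.
    print(build_request("example.com").decode("ascii"))
```

Calling fetch("example.com") against a reachable host and passing the result to parse_status_line should give you the protocol version, the status code and the reason phrase, all of which travel as readable text in HTTP/1.x.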
If HTTP pipelining is activated, several requests can be sent without waiting for the first response to be fully received. HTTP pipelining has proven difficult to implement in existing networks, where old pieces of software coexist with modern versions. In HTTP/2, pipelining has been superseded by a more robust mechanism: multiplexing requests as frames over a single connection.
The primary goals for HTTP/2 are to reduce latency by enabling full request and response multiplexing, minimize protocol overhead via efficient compression of HTTP header fields, and add support for request prioritization and server push. HTTP/2 modifies how the data is formatted (framed) and transported between the client and server.
At the core of all performance enhancements of HTTP/2 is the new binary framing layer, which dictates how the HTTP messages are encapsulated and transferred between the client and server. The "layer" refers to a design choice to introduce a new optimized encoding mechanism between the socket interface and the higher HTTP API exposed to our applications. The HTTP semantics, such as verbs, methods, and headers, are unaffected, but the way they are encoded while in transit is different. In this new version of HTTP:
All communication is performed over a single TCP connection that can carry any number of bidirectional streams.
Each stream carries bidirectional messages and has a unique identifier and optional priority information.
Each message is a logical HTTP message, such as a request, or response, which consists of one or more frames.
The frame is the smallest unit of communication that carries a specific type of data—e.g., HTTP headers, message payload, and so on. Frames from different streams may be interleaved and then reassembled via the embedded stream identifier in the header of each frame.
With HTTP/1.x, if the client wants to make multiple parallel requests to improve performance, then multiple TCP connections must be used. HTTP/2 removes this limitation and enables full request and response multiplexing by allowing the client and server to break down an HTTP message into independent frames, interleave them, and then reassemble them on the other end. Starting new TCP connections is an expensive operation, so the fewer connections required to exchange our data, the less overhead we’ll face in the communication.
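The interleaving and reassembly described above can be sketched in a few lines of Python. The 9-byte frame header layout (24-bit length, 8-bit type, 8-bit flags, one reserved bit plus a 31-bit stream identifier) follows the HTTP/2 specification, but everything else is a simplification of mine: the function names are hypothetical, the payloads are arbitrary bytes, and only DATA frames are handled (no HEADERS, flow control or priorities).

```python
import struct
from collections import defaultdict

FRAME_HEADER_LEN = 9
TYPE_DATA = 0x0
FLAG_END_STREAM = 0x1


def encode_frame(frame_type: int, flags: int, stream_id: int, payload: bytes) -> bytes:
    """Prefix a payload with a 9-byte HTTP/2-style frame header."""
    if len(payload) >= 1 << 24:
        raise ValueError("payload does not fit in a 24-bit length field")
    header = len(payload).to_bytes(3, "big")  # 24-bit payload length
    # type (8 bits), flags (8 bits), reserved bit cleared + 31-bit stream id
    header += struct.pack(">BBI", frame_type, flags, stream_id & 0x7FFFFFFF)
    return header + payload


def decode_frames(data: bytes):
    """Yield (type, flags, stream_id, payload) for each frame in a byte string."""
    offset = 0
    while offset < len(data):
        length = int.from_bytes(data[offset:offset + 3], "big")
        frame_type, flags, stream_id = struct.unpack_from(">BBI", data, offset + 3)
        start = offset + FRAME_HEADER_LEN
        yield frame_type, flags, stream_id & 0x7FFFFFFF, data[start:start + length]
        offset = start + length


def reassemble(data: bytes) -> dict[int, bytes]:
    """Group DATA payloads by stream identifier, keeping per-stream order."""
    streams = defaultdict(bytes)
    for frame_type, _flags, stream_id, payload in decode_frames(data):
        if frame_type == TYPE_DATA:
            streams[stream_id] += payload
    return dict(streams)


# Two responses interleaved on one connection (streams 1 and 3).
wire = b"".join([
    encode_frame(TYPE_DATA, 0, 1, b"<html>"),
    encode_frame(TYPE_DATA, 0, 3, b"body{"),
    encode_frame(TYPE_DATA, 0, 1, b"</html>"),
    encode_frame(TYPE_DATA, FLAG_END_STREAM, 3, b"}"),
])

if __name__ == "__main__":
    print(reassemble(wire))
```

The receiver never needs a second connection: the stream identifier in each frame header is enough to put every payload back where it belongs.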
One final advantage of HTTP/2 over HTTP/1.x is the ability to compress HTTP headers. The use of frames allows the compression of headers, and this improvement has a significant impact on performance. This quote from Patrick McManus from Mozilla perfectly illustrates the effect of headers on an average page load:
If you assume that a page has about 80 assets (which is conservative in today’s Web), and each request has 1400 bytes of headers (again, not uncommon, thanks to Cookies, Referer, etc.), it takes at least 7-8 round trips to get the headers out “on the wire.” That’s not counting response time - that’s just to get them out of the client.
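One way to get a feeling for where a number like "7-8 round trips" can come from is a back-of-the-envelope model of TCP slow start. This is my own simplified sketch, not necessarily the model behind the quote: I assume each request's 1400 bytes of headers fill roughly one packet, that the sender's window starts at a given number of packets and doubles every round trip, and that nothing is lost. The function name and the initial window values are assumptions.

```python
def round_trips(packets: int, initial_window: int) -> int:
    """Round trips needed to send `packets` when the window doubles every RTT."""
    sent, window, trips = 0, initial_window, 0
    while sent < packets:
        sent += window   # send a full window in this round trip
        window *= 2      # idealized slow start: the window doubles each RTT
        trips += 1
    return trips


ASSETS = 80          # requests per page, from the quote
HEADER_BYTES = 1400  # header bytes per request, from the quote

total_header_bytes = ASSETS * HEADER_BYTES
packets = ASSETS     # assumption: one packet per request's headers

if __name__ == "__main__":
    print(f"{total_header_bytes} bytes of headers in {packets} packets")
    for initial_window in (1, 2, 4, 10):
        print(f"initial window {initial_window:>2}: {round_trips(packets, initial_window)} round trips")
```

Under these assumptions, a small initial window lands in the same ballpark as the quote, and a larger initial window helps but does not remove the cost. Shrinking the headers reduces the number of packets in the first place.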
QUIC is not only HTTP/3
But despite all the improvements made in HTTP/2, there are still things to be fixed to make it the ultimate transport protocol. The first thing to bear in mind about QUIC is that it is not limited to just transporting HTTP. QUIC was designed to deliver the web, and data in general, faster to end users. In June 2015, an Internet Draft of a specification for QUIC was submitted to the IETF for standardization. A QUIC working group was established in 2016. In October 2018, the IETF's HTTP and QUIC Working Groups jointly decided to call the HTTP mapping over QUIC "HTTP/3" in advance of making it a worldwide standard. So QUIC is not only HTTP/3, but much more.
With HTTP/2, typical browsers do tens or hundreds of parallel transfers over a single TCP connection. If a single packet is dropped or lost in the network somewhere between two endpoints that speak HTTP/2, the entire TCP connection is brought to a halt while the lost packet is re-transmitted and finds its way to the destination, a problem known as head-of-line blocking. All of the advantages gained by multiplexing streams are thrown in the trash. Fixing this issue is not easy, if at all possible, with TCP, and this is one of the reasons why it was decided to build QUIC over UDP.
The fact that QUIC is built over UDP means that there is no built-in message ordering or reliability like in TCP, but not to worry. QUIC uses streams to provide a lightweight, ordered byte-stream abstraction. QUIC streams can be unidirectional, carrying data in one direction from the initiator of the stream to its peer, or bidirectional, allowing data to be sent in both directions. Streams are identified using a Stream ID (a handy piece of information used to put data back in order within its corresponding stream on reception).
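The difference between one ordered byte stream for everything (TCP) and independently ordered streams (QUIC) is easier to see with a toy simulation of the lost-packet scenario described above. This is only a sketch under my own assumptions: each packet carries a chunk of exactly one stream (real QUIC packets can carry frames from several streams), the Packet type and function names are hypothetical, and retransmission is not modelled at all.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    number: int     # position in the overall sending order
    stream_id: int  # which stream the chunk belongs to
    offset: int     # byte offset of `data` within its stream
    data: bytes


def deliver_single_ordered_stream(arrived: list[Packet]) -> dict[int, bytes]:
    """TCP-like: deliver data only up to the first gap in the overall order."""
    by_number = {p.number: p for p in arrived}
    delivered = defaultdict(bytes)
    number = 0
    while number in by_number:
        packet = by_number[number]
        delivered[packet.stream_id] += packet.data
        number += 1
    return dict(delivered)


def deliver_per_stream(arrived: list[Packet]) -> dict[int, bytes]:
    """QUIC-like: every stream is put in order independently, by offset."""
    by_stream = defaultdict(dict)
    for p in arrived:
        if p.data:  # skip empty chunks so the loop below always advances
            by_stream[p.stream_id][p.offset] = p.data
    delivered = {}
    for stream_id, chunks in by_stream.items():
        out, offset = b"", 0
        while offset in chunks:
            out += chunks[offset]
            offset += len(chunks[offset])
        delivered[stream_id] = out
    return delivered


sent = [
    Packet(0, 0, 0, b"AAA"),
    Packet(1, 4, 0, b"BBB"),
    Packet(2, 0, 3, b"aaa"),
    Packet(3, 4, 3, b"bbb"),
]
arrived = [p for p in sent if p.number != 0]  # the very first packet is lost

if __name__ == "__main__":
    print("single ordered stream:", deliver_single_ordered_stream(arrived))
    print("per-stream ordering:  ", deliver_per_stream(arrived))
```

With a single ordered stream nothing can be handed to the application until packet 0 is retransmitted. With per-stream ordering, stream 4 is delivered in full and only stream 0, the one that actually lost data, has to wait.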
QUIC is always secure. There is no clear-text version of the protocol, so negotiating a QUIC connection also means doing cryptography and security with TLS 1.3. Only a few initial handshake packets are sent in the clear before the encryption protocols have been negotiated. The cool thing about this security by design is that by encrypting as much of the communication as possible, we prevent middle-boxes routing our traffic from seeing much of the protocol passing through (for privacy freaks like me, this is groundbreaking).
QUIC offers both 0-RTT and 1-RTT handshakes that reduce the time it takes to negotiate and set up a new connection. Compared with the 3-way handshake of TCP, the improvement is noticeable. The 1-RTT handshake is used to start up a new connection, while the 0-RTT handshake only works if a connection to the host has been established previously and a secret from that connection has been cached (the best way of understanding this feature is the animation below).
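To put some rough numbers on this, here is a small sketch that counts the round trips spent before the first byte of an HTTP request can leave the client. The figures for TCP and TLS are the commonly cited ones for full handshakes (1 round trip for TCP, 2 for TLS 1.2, 1 for TLS 1.3), and the 50 ms round-trip time is an assumption of mine, so treat the result as an illustration rather than a measurement.

```python
RTT_MS = 50  # assumed round-trip time between client and server

# Round trips spent on connection setup before any request data is sent.
SETUP_ROUND_TRIPS = {
    "TCP + TLS 1.2": 1 + 2,          # TCP handshake + full TLS 1.2 handshake
    "TCP + TLS 1.3": 1 + 1,          # TCP handshake + full TLS 1.3 handshake
    "QUIC 1-RTT (first contact)": 1, # transport and crypto handshake combined
    "QUIC 0-RTT (resumed)": 0,       # request sent with the first packet
}


def setup_delay_ms(round_trips: int, rtt_ms: float = RTT_MS) -> float:
    """Time spent on connection setup for a given number of round trips."""
    return round_trips * rtt_ms


if __name__ == "__main__":
    for name, trips in SETUP_ROUND_TRIPS.items():
        print(f"{name:<28} {trips} RTT  ->  {setup_delay_ms(trips):.0f} ms")
```

The saving is paid on every new connection, so it matters most on high-latency links such as mobile networks.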
All of the aforementioned features belong to QUIC, and it is upon them that HTTP/3 is built. HTTP/3 provides HTTP-style transfers, including HTTP header compression, over QUIC streams, offering multiplexing by design (unlike HTTP/2, where multiplexing had to be implemented from scratch in the protocol). HTTP/3 sends data using frames (as in HTTP/2) over a QUIC stream. Thus, in an HTTP/3 request, the client sends its HTTP request on a client-initiated bidirectional QUIC stream. A request consists of a single HEADERS frame (which might optionally be compressed), optionally followed by a series of DATA frames and possibly a final HEADERS frame for trailers. After sending a request, a client closes the stream for sending. The server sends back its HTTP response on the same bidirectional stream: a HEADERS frame, a series of DATA frames and possibly a trailing HEADERS frame. Easy peasy!
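The frame ordering described above (a HEADERS frame, then any number of DATA frames, then optionally a HEADERS frame with trailers) is simple enough to check in a few lines. The sketch below only looks at frame type names as strings. The function name is hypothetical, and the sketch ignores every other frame type and detail of a real HTTP/3 implementation.

```python
def is_valid_message(frames: list[str]) -> bool:
    """Check the order HEADERS, DATA*, optional trailing HEADERS."""
    if not frames or frames[0] != "HEADERS":
        return False  # a message must start with a HEADERS frame
    rest = frames[1:]
    if rest and rest[-1] == "HEADERS":
        rest = rest[:-1]  # a final HEADERS frame carries the trailers
    return all(frame == "DATA" for frame in rest)


if __name__ == "__main__":
    examples = [
        ["HEADERS"],                             # e.g. a GET with no body
        ["HEADERS", "DATA", "DATA"],             # headers plus a body
        ["HEADERS", "DATA", "HEADERS"],          # body followed by trailers
        ["DATA", "HEADERS"],                     # body before headers
        ["HEADERS", "HEADERS", "DATA"],          # data after the trailers
    ]
    for frames in examples:
        print(frames, "->", is_valid_message(frames))
```

The same check applies to both directions, since requests and responses share this shape.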
More yet to come
I tried to find some kind of survey analyzing the distribution of HTTP versions used by connections out in the wild. I would say that the predominant version of HTTP used by web applications currently is HTTP/2, with very little adoption of HTTP/3 and QUIC so far, but these are just guesses. I couldn’t find any reliable source with information about this. I am personally pretty excited to see QUIC working in the wild, and if I find the time, for a future publication I want to try to set up my own web application working over QUIC to see the benefits of this pretty cool piece of engineering. In the meantime, you can always subscribe to the newsletter so you don’t miss a thing. See you next week!