Moving to HTTP/2

I upgraded my server to Ubuntu 16.04, converted my websites over to HTTPS, and locked them in using HSTS. It would be a shame to stop here without making one last change: upgrading to the HTTP/2 protocol.

The web has been spinning along on HTTP 1.1 since 1997. It’s been a good and faithful servant. However, the protocol is also showing its age, leading to gimmicks and workarounds just to more readily process today’s web pages.

In 2015, the powers-that-be that brought us HTTP 1.1 released a new version of the protocol: HTTP/2. According to a Google introduction on the new protocol:

The primary goals for HTTP/2 are to reduce latency by enabling full request and response multiplexing, minimize protocol overhead via efficient compression of HTTP header fields, and add support for request prioritization and server push.

We’ll look at the advantages to HTTP/2, as well as some of the gotchas associated with it. I’ll then cover what I needed to do on my site to get HTTP/2 up and running.

HTTP/2 is a binary rather than textual protocol

HTTP/2 is a binary protocol, rather than a textual one like HTTP 1.1. OK, that sounds peachy but what does it mean for the average website maintainer?

According to the HTTP/2 specification FAQ:

Binary protocols are more efficient to parse, more compact “on the wire”, and most importantly, they are much less error-prone, compared to textual protocols like HTTP/1.x, because they often have a number of affordances to “help” with things like whitespace handling, capitalization, line endings, blank lines and so on.

The communication between client and server is the same, supporting the same verbs such as GET, PUT, and so on. What’s different is the encoding of these messages. HTTP 1.1 uses plain text, while HTTP/2 uses a binary encoding. HTTP 1.1 is human readable, while HTTP/2 is meant for machine access.
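
To make the text-versus-binary difference a little more concrete, here’s a rough shell sketch. The first function prints an HTTP 1.1 request line as the readable text it is. The second prints, in hex, the 9-byte header that starts every HTTP/2 frame (payload length, type, flags, stream id). The function names are made up for illustration, and a real client would also have to send the encoded headers in the frame’s payload.

```shell
# HTTP 1.1: the request starts with a line of readable text.
# (http1_request_line is a hypothetical helper name.)
http1_request_line() {
    printf 'GET %s HTTP/1.1\r\n' "$1"
}

# HTTP/2: every message travels in frames, each starting with a 9-byte header:
# 24-bit payload length, 8-bit type, 8-bit flags, 32-bit stream id (top bit reserved).
# Printed as hex so the bytes are visible in a terminal.
# Arguments: length type flags stream-id (all decimal).
http2_frame_header_hex() {
    printf '%06x%02x%02x%08x\n' "$1" "$2" "$3" "$4"
}

http1_request_line /index.html
# HEADERS frame (type 1), flags END_STREAM + END_HEADERS (1 + 4 = 5),
# stream 1, with an assumed 12-byte payload.
http2_frame_header_hex 12 1 5 1
```

The second call prints 00000c010500000001: compact and easy for a machine to parse, but not something you’d want to read or type by hand.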

Moving from textual to binary won’t matter to most people, but it does matter to those developing client or server applications. For instance, Node.js currently provides an experimental module, http2, implementing the protocol. And, for the most part, major browsers now support HTTP/2.

What happens with browsers that don’t support HTTP/2? The server should seamlessly downgrade to HTTP 1.1. There is, however, a downside to this, which I’ll get into a little later.

HTTP/2 supports header compression

Each transfer of data between server and client carries a load of metadata, or data about data, providing information about the data being transferred. HTTP 1.1 sent this metadata as plain text, which could add several hundred bytes to the transfer. This doesn’t sound like much, except if you consider how many data transfers occur daily.

According to Internet Live Stats, there are at least 40,000 Google searches a second.  This writing will be just one of over 3 million posted today. Who knows how many tweets are being tweeted, and Facebook posts being liked. I imagine that, at this moment, thousands are buying a pair of shoes online.

There’s more:  when you access a single web page, you’re also accessing other resources, such as images and CSS stylesheets.

All of these activities carry a load of metadata describing what’s happening: the verb for the transaction (GET or POST), the URL, and so on. Combined, sending this information as plain text adds an inordinate burden on the internet, which, in turn, slows the entire internet down for all of us.

HTTP/2 changes all of this by using HPACK, a header compression format. Compressing the header data decreases the number of bytes transmitted and increases efficiency, not only of the individual requests but of the internet as a whole.
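
As a simplified illustration of why HPACK saves bytes: the format defines a static table of common headers, and a header that matches a table entry exactly can be sent as a single byte holding the table index with the high bit set. The sketch below hard-codes a handful of entries from that table (RFC 7541, Appendix A). The function names are my own, and real HPACK also has a dynamic table and Huffman coding that this leaves out.

```shell
# Look up a header name/value pair in a small excerpt of the HPACK static table.
# Prints the table index, or fails if the pair is not in this excerpt.
hpack_static_index() {
    case "$1: $2" in
        ":method: GET")       echo 2 ;;
        ":method: POST")      echo 3 ;;
        ":path: /")           echo 4 ;;
        ":path: /index.html") echo 5 ;;
        ":scheme: http")      echo 6 ;;
        ":scheme: https")     echo 7 ;;
        *) return 1 ;;
    esac
}

# An indexed header field is one byte: high bit set (128) plus the 7-bit index.
hpack_indexed_byte() {
    idx=$(hpack_static_index "$1" "$2") || return 1
    printf '%02x\n' $((128 + $idx))
}

hpack_indexed_byte :method GET
hpack_indexed_byte :path /
hpack_indexed_byte :scheme https
```

The three calls print 82, 84, and 87: one byte each for the method, path, and scheme.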

Again, the average person doesn’t see this change, and it has no impact on how we use the web. This is all behind-the-scenes stuff. However, not all of HTTP/2 is as transparent.

HTTP/2 supports multiplexing

When you access a web page using HTTP 1.1, your browser opens TCP (Transmission Control Protocol) connections to the server to request the page’s resources. To keep one web page’s requests from bogging down the entire server, browsers limit the number of TCP connections they make to a single domain, typically to around six. The requests for the page’s resources have to share these connections, which means that some (many) of the requests are blocked until previous requests have been satisfied.

The end result of this connection contention is that a web page with hundreds of images, no matter how small, as well as dozens of CSS and JavaScript files, no matter how tiny, loads much more slowly than a web page with fewer resource requests, even though the resources requested by the latter may be larger.

To get around the connection limitation, people have created some interesting and convoluted workarounds.

One workaround is fairly standard: inlining. This means embedding JavaScript and CSS directly into the HTML page, rather than loading separate resources. With the broader support for SVG (Scalable Vector Graphics), graphics can also be embedded directly into the page. The theme this website uses features several SVG icons embedded in the page in the footer.

A more complex workaround is the use of image sprites. You’ll see these used extensively with online games. Image sprites are small, individual images all incorporated into one image file. When the web page is loaded, the larger image is pre-loaded. When one specific image sprite is needed, CSS is used to hide the rest of the larger image, showing only that one image.

An even more complex workaround to the connection limit is the use of domain sharding.

Since each TCP connection is limited to a specific domain, one way around the connection limit is to split the resources across several different domains, thereby increasing the number of connections. More connections mean less blocking and faster page loading.

Of course, both workarounds, image sprites and domain sharding, come with their issues. For instance, changing just one tiny image in a sprite file means that the entire file has to be downloaded again the next time the page is loaded. With domain sharding, you have to ensure each resource is always served from the same domain, or the resource has to be re-cached, losing any performance benefit. There’s also an additional DNS lookup burden to using many domains, and it’s a heck of a problem once you utilize HTTPS: all domains need their own SSL certificate (unless you’re using an authority that can support wildcard * domains).

To summarize, image sprites and domain sharding are kludges. They’re kludges brought about because HTTP 1.1 doesn’t meet today’s web needs. They’re identical in nature to the one pixel transparent GIF we used to use before we had CSS to manage web page spacing, or the use of HTML tables to control a page’s layout.

The primary goal of HTTP/2, and the change with the most impact, is the protocol’s support for multiplexing. And this is the one that could have the most impact on the average person, if that person is maintaining a website.

With HTTP/2, multiple resources in the same domain can be accessed concurrently on the same connection. No more blocking, and no need for multiple connections, either.
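
A bit of back-of-the-envelope arithmetic shows the difference. This is a deliberately crude model, not a benchmark: it assumes every resource takes one round trip, that a browser opens about six HTTP 1.1 connections per domain, and that an HTTP/2 connection allows 100 concurrent streams. It ignores bandwidth, resource size, and prioritization, and the function name is made up for the example.

```shell
# Number of request "rounds" needed to fetch $1 resources
# when at most $2 requests can be in flight at once (rounded up).
rounds_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

resources=120
rounds_needed "$resources" 6     # HTTP 1.1: assumed 6 connections per domain
rounds_needed "$resources" 100   # HTTP/2: assumed 100 concurrent streams
```

Under those assumptions, a page with 120 small resources needs 20 rounds over HTTP 1.1, but only 2 over a single HTTP/2 connection.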

Think of attending an event at a stadium. At first, only one security checkpoint and entrance are open, and everyone is in a long line, waiting to get in. Now, think about what happens if suddenly ten more security checkpoints and entrances are opened up—the line splits into ten smaller lines, and people move proportionally faster.

Unlike some of the other HTTP/2 protocol changes, this is one that affects web designers and developers. Just as we optimized for HTTP 1.1, now we need to consider optimizing, or perhaps de-optimizing, for HTTP/2.

HTTP/2 on Burningbird

After all the other changes I made to my server it made no sense to stop before making the move to HTTP/2. In for a dime, in for a dollar.

The most important change I had to make in order to incorporate HTTP/2 was to set my site up to serve pages via HTTPS. Most browsers support HTTP/2 only for sites served over SSL.

Next, I had to upgrade Apache. The version of Apache that comes standard with Ubuntu 16.04 (2.4.18) doesn’t provide support for HTTP/2. HTTP/2 support was incorporated into Apache starting with version 2.4.17, but the module isn’t included in Ubuntu’s package, and versions earlier than 2.4.26 are considered insecure. So, I needed to upgrade my Apache to 2.4.27. Doing so required me to use a third-party Apache package.
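
If you want to check where your own server stands before and after an upgrade, something like the following sketch can help. It assumes GNU sort (for the -V version ordering) and the usual "Server version: Apache/x.y.z" line printed by apachectl -v. Both function names are made up for the example.

```shell
# Succeeds (exit 0) when version $1 is at least version $2.
# Relies on GNU sort's -V (version sort) option.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Extract the version number from `apachectl -v` output read on stdin.
apache_version_from() {
    sed -n 's|^Server version: Apache/\([0-9.]*\).*|\1|p'
}

# On a live server: installed=$(apachectl -v | apache_version_from)
# Here, a sample line stands in for the real output.
installed=$(printf 'Server version: Apache/2.4.27 (Ubuntu)\n' | apache_version_from)

if version_at_least "$installed" 2.4.26; then
    echo "Apache $installed meets the 2.4.26 minimum"
else
    echo "Apache $installed is too old for HTTP/2"
fi
```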

After first making a backup of my site using Linode’s snapshot facility, I used Ondřej Surý’s PPA for Apache 2.x to upgrade to 2.4.27 (all commands are run as sudo):

# Tools that provide add-apt-repository
apt-get install software-properties-common python-software-properties
# Add the PPA, refresh the package lists, and upgrade
add-apt-repository ppa:ondrej/apache2
apt-get update
apt-get upgrade

I got a message about Apache2 and other dependencies being held back. I used dist-upgrade to install the held back packages rather than individually install each:

apt-get dist-upgrade

I tested the new Apache installation before proceeding. However, before I could enable HTTP/2 I had to make one more modification to my system.

Apache 2.4.27 doesn’t support HTTP/2 with the prefork MPM (multiprocessing module). However, the standard PHP installation that works with Apache is dependent on the prefork module. So, before I could change the MPM, I needed to change PHP to FastCGI. I followed HTTP2.Pro’s directions, except that I used the 7.0 version to enable FastCGI:

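# Stop Apache while swapping the PHP handler
apachectl stop
# Install the FastCGI module and PHP-FPM
apt-get -y install libapache2-mod-fastcgi php7.0-fpm
# Enable the modules and configuration PHP-FPM relies on
a2enmod proxy_fcgi setenvif
a2enconf php7.0-fpm
# Disable the embedded PHP module, which depends on the prefork MPM
a2dismod php7.0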

These commands install PHP-FPM, enable the FastCGI proxy module, and disable the previously enabled PHP 7.0 Apache module. I then disabled the prefork MPM, enabled the event MPM, and restarted Apache:

apachectl stop
a2dismod mpm_prefork
a2enmod mpm_event
apachectl start
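
A quick way to confirm which MPM is actually loaded is to look through the module list that apachectl -M prints. The sketch below assumes the usual format of that list, with one line per module such as "mpm_event_module (shared)". The function name is made up, and sample text stands in for the live output.

```shell
# Print the name of the loaded MPM (event, worker, or prefork)
# from `apachectl -M` output read on stdin.
loaded_mpm() {
    sed -n 's/^ *mpm_\([a-z]*\)_module.*/\1/p'
}

# On a live server: apachectl -M | loaded_mpm
# Sample module list, for illustration only:
printf ' core_module (static)\n mpm_event_module (shared)\n proxy_fcgi_module (shared)\n' | loaded_mpm
```

If this prints prefork rather than event, the MPM change hasn’t taken effect yet.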

After I checked that FastCGI was working properly with my WordPress-enabled sites, I was ready to enable HTTP/2:

apachectl stop
a2enmod http2
apachectl start

To support the HTTP/2 protocol for my sites, I had to modify the virtual host file for each, adding in the following line:

<VirtualHost *:443>
  Protocols h2 http/1.1
  ...
</VirtualHost>

What this line tells Apache is that the HTTP/2 protocol (h2) should be used when the site is accessed, unless the client doesn’t support it. Then, Apache should use HTTP 1.1.
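
With more than a couple of sites it’s easy to miss one, so a small check like this sketch can list the virtual host files that don’t yet have the line. The function name is hypothetical, and the path in the usage comment is the Ubuntu default, which may differ on your system.

```shell
# Print each given config file that lacks an uncommented Protocols line naming h2.
# The pattern requires "h2" as a whole word, so "h2c" alone does not count.
vhosts_missing_h2() {
    for conf in "$@"; do
        grep -Eiq '^[[:space:]]*Protocols([[:space:]]+[^[:space:]]+)*[[:space:]]+h2([[:space:]]|$)' "$conf" \
            || echo "$conf"
    done
}

# Usage on a live server (path is an assumption):
#   vhosts_missing_h2 /etc/apache2/sites-enabled/*.conf
```

No output means every file passed in already has the line.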

After adding the protocol line to each virtual host, I restarted Apache. I then tested my new HTTP/2 with KeyCDN’s online HTTP/2 test tool.

Success.

If you access this web page with most modern browsers, you’re seeing HTTP/2 in action.

HTTP/2 degrades to HTTP 1.1

In the last section I mentioned that if you access this page with most modern browsers, you’re seeing HTTP/2 in action. As you can see from the Can I Use page for HTTP/2, all of the most used modern browsers support HTTP/2.

What happens if you use a browser that doesn’t support HTTP/2? The connection gracefully degrades back to HTTP 1.1. And therein lies a dilemma.

Once you’ve upgraded your site to using HTTP/2 you can get rid of the kludges. You can get rid of image sprites in favor of individual image files. You can stop inlining script, CSS, and graphics. You can definitely get rid of domain sharding.

When you do, though, if your page is accessed by someone using a client that doesn’t support HTTP/2, your web page performance degrades, perhaps significantly.

So the key is to understand how people access your website. If the vast majority of access is via modern browsers that support HTTP/2, then you can make the protocol upgrade and begin the process of optimizing your pages for the new protocol.
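
One rough way to get a feel for this is to count the protocol versions recorded in your access log. The sketch below assumes Apache’s common or combined log format, where the quoted request line ends with the protocol, and that HTTP/2 requests show up there as HTTP/2.0. The function name is made up, and the log path in the usage comment is the Ubuntu default.

```shell
# Count requests per protocol version from access-log lines read on stdin.
# Lines whose request field is not "METHOD PATH PROTOCOL" are skipped.
protocol_counts() {
    awk -F'"' '{ n = split($2, req, " "); if (n == 3) count[req[3]]++ }
               END { for (p in count) print p, count[p] }' | sort
}

# Usage on a live server (path is an assumption):
#   protocol_counts < /var/log/apache2/access.log
# Sample log lines, for illustration only:
printf '%s\n' \
    '203.0.113.5 - - [01/Sep/2017:10:00:00 +0000] "GET / HTTP/2.0" 200 5120' \
    '203.0.113.9 - - [01/Sep/2017:10:00:01 +0000] "GET /a.css HTTP/1.1" 200 300' \
    '203.0.113.5 - - [01/Sep/2017:10:00:02 +0000] "GET /b.js HTTP/2.0" 200 900' |
    protocol_counts
```

If the HTTP/1.1 count is still a sizable share of the total, it may be too early to drop the old optimizations.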

Summary

Moving to HTTP/2 didn’t improve my website performance to a significant degree, primarily because I don’t use a large number of resources in my pages.

However, by upgrading to HTTP/2 now, I know that if I do add additional resources, my pages will still load quickly. I’m future-proofed.

(Imagekit.io provides a very effective demonstration of the difference between the protocols.)

I’m still exploring the different HTTP/2 optimizations, including the number of concurrent streams (hence resources loaded) my Apache server supports (100 by default), as well as whether to incorporate the WordPress Server Push plugin. Another advantage of HTTP/2 is that it supports server push, which means the site can push a resource to the client even before the client asks for it.
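
For reference, both of these settings map onto mod_http2 directives. The sketch below just prints a configuration snippet with the two directives at their default values. It isn’t something I’ve applied to my server, and the function name and file name are made up for the example.

```shell
# Print a small mod_http2 tuning snippet; both values shown are the defaults.
http2_tuning_conf() {
    cat <<'EOF'
<IfModule http2_module>
    # Maximum number of concurrent streams per HTTP/2 connection
    H2MaxSessionStreams 100
    # Allow the server to push resources to the client
    H2Push on
</IfModule>
EOF
}

http2_tuning_conf
# To try it out, the output could be saved to a hypothetical file such as
# /etc/apache2/conf-available/http2-tuning.conf and enabled with a2enconf.
```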

But these tweaks are best left to another day.

With the addition of HTTP/2 my site upgrade is complete.