How to Build a Death Star Using HTTP/2, IPv6 & Service Workers
So in 1996, a man named Trey Harris was responsible for the email system for the University of North Carolina. And one day Trey was having his morning latte, sat at his desk, when the phone rang. He picked it up and it was a professor who was the head of the statistics department.
So Trey sort of sat up; it’s unusual for a head of department to phone him directly. And the professor explained that everyone in his department had been having trouble with their emails for the last couple of days.
So Trey asked for a little bit more detail and the professor said, “I’m not really sure if it’s like we’re running out of energy or juice or whatever makes emails work. But any email that my department sends can’t travel more than 500 miles.” And Trey sort of spilled a bit of his latte out of his mouth and sat up, and had that predicament where you need to tell someone that they’re wrong about something, but they’re quite an important person and you need to make sure you don’t imply that they’re an idiot.
So he sat there and he said, “That’s not really how email works, but I’m going to look into it and get right back to you.” So he hung up the phone and spent the next couple of hours getting more and more shocked as he realized he was unable to send an email more than 500 miles. This is weird.
So to understand this, you need to understand how emails are sent. Basically, one email server that wants to send an email has to open a connection to another email server, which is going to receive the email, and the amount of time that it takes to do that is called latency. Essentially, what was happening was someone had misconfigured the email server such that the connection timeout was set to 3 milliseconds, which is a ridiculously low amount of time. And so the latency required to connect to email servers too far away was becoming a problem, and if you do the maths and you take a three millisecond timeout and the speed of light, you can work out that in that amount of time, light can travel about 500 and a little bit miles.
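The arithmetic behind the 500-mile figure can be checked in a few lines. This is a rough sketch, assuming a one-way trip at the speed of light in a vacuum (real networks are slower); the constant and function names are illustrative and not from the talk.

```python
# Rough check of the 500-mile email story: how far can light travel
# before a 3 millisecond connection timeout expires?
# Assumption: a one-way trip at the vacuum speed of light.

SPEED_OF_LIGHT_M_PER_S = 299_792_458  # metres per second
METRES_PER_MILE = 1609.344


def max_distance_miles(timeout_seconds: float) -> float:
    """Distance light covers in the given time, in miles."""
    return SPEED_OF_LIGHT_M_PER_S * timeout_seconds / METRES_PER_MILE


if __name__ == "__main__":
    print(f"{max_distance_miles(0.003):.0f} miles")
```

With a 3 millisecond timeout this comes out a little under 560 miles, which matches the "500 and a little bit" in the story.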
And so before I start getting into the weeds of it: by show of hands, who is just really, really excited about HTTP/2 and service workers? Who bought their ticket just for this presentation? You don’t have to be shy, come on. No, not so much? We’ve got a couple of weirdos over there. Yeah. I get excited by this as well, but I probably shouldn’t be the benchmark for any of this stuff.
In a couple of years’ time, I think that’s going to be true of both of these technologies and so I want to go through them today and basically prepare you for that point when in a couple of months’ time, a client calls you and asks about the SEO impact of HTTP/2 or you’re in a meeting with your boss and they ask about service workers, etc. And the good news is that it isn’t actually as scary as you may think it is, and in the next 30 minutes, I’m going to basically take you through everything you need to know as SEOs and you should be able to grasp all of it, and then you can go back to not worrying about it.
Okay, so diving in with HTTP/2 first.
And we’re not going to get too much into the weeds of how it technically breaks down, but really quickly, a really simple example of an HTTP 1.1 request is a GET request. You’ve got GET requests, POST requests and a few other bits. Then there’s the page that I want, the anchorman folder, and then what version of HTTP I’m using. HTTP 1.1 has been around for 15 years or so.
And then you’ve got a Host header: what website I’m actually wanting to get this page from. Some servers will host multiple websites, so I need to say which website I’m interested in. And you may have something like your user agent, which is what version of the program you’re using, or are you on a phone, that sort of thing, and you have a whole bunch of these other headers.
And then the response looks like this. You get a status code, 200, 404, whatever it might be and then you get a whole bunch of headers, something like content-type saying this is some HTML that I’m sending you and there’s a whole bunch of other headers that we don’t need to worry about and then you get the body of the HTML. So you’ve got Ron’s page here, Stay Classy San Diego. And so that’s basically what we need to understand in terms of what the anatomy of a request and response looks like.
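The request and response described above are just structured text, which a short sketch can make concrete. The host name, user agent string and sample body below are placeholders chosen for illustration; only the overall shape (request line, headers, blank line, body) comes from the talk.

```python
# Sketch of the anatomy of an HTTP/1.1 exchange as plain text.
# The host name, user agent and HTML body are made-up placeholders.

def build_request(method: str, path: str, host: str, user_agent: str) -> str:
    """Return the raw text of a simple HTTP/1.1 request."""
    lines = [
        f"{method} {path} HTTP/1.1",   # request line: method, page, protocol version
        f"Host: {host}",               # which website on this server we want
        f"User-Agent: {user_agent}",   # which program is asking
        "",                            # blank line ends the headers
        "",
    ]
    return "\r\n".join(lines)


def parse_response(raw: str) -> tuple[int, dict[str, str], str]:
    """Split a raw response into status code, headers and body."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    status_code = int(status_line.split(" ")[1])
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_code, headers, body


if __name__ == "__main__":
    print(build_request("GET", "/anchorman/", "www.example.com", "ExampleBrowser/1.0"))
    sample = (
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: text/html\r\n"
        "\r\n"
        "<html><body>Stay Classy San Diego</body></html>"
    )
    print(parse_response(sample))
```

The parser is deliberately minimal: it ignores repeated headers and anything beyond a single status line, which is enough to show the three parts the talk cares about.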
And then the next thing we need to understand is that every single HTTP request is for a single file, whether that’s an HTML file or a PNG or a font file or a CSS file, whatever it might be. And so we’re going to talk about HTTP/2 using trucks. We say lorry in England, you say truck, so we’re going to use trucks, like English 2.0 over here. We’re speaking English 1.1 in the UK. So we’re gonna use a lot of trucks, and we can imagine that our browser is going to send a truck, that is the HTTP request, down this road to the server and it’s gonna come back with the response. And that’s how our browser is going to fetch responses.
And without getting too much into the weeds of this either, the road is our TCP connection. So TCP, you may have seen that phrase in your Wi-Fi settings or wherever, and that’s basically the networking protocol that connects the two machines, and then our HTTP requests drive along this connection.
Okay, so we’ve got requests: outbound trucks are carrying our requests and returning trucks have got our responses back from the server. Very simple, but we already have a problem. If we’ve got open-topped trucks here, anyone can look into what is being carried in these trucks. Anyone can look and see what all our secrets are. So if you’re buying something on an e-commerce site, your credit card number’s in these HTTP trucks. And so someone might be able to spy on that or look at your emails, whatever it might be. So we need to fix that. And that’s where HTTPS came in.
So now we’re going to have a tunnel. Someone should have probably shut me down before I got too carried away with the trucks, but we’re going to live with it now. So we got a tunnel and now the trucks are gonna drive through the tunnel. Nobody can look at the trucks anymore. But the important thing to understand is the trucks inside the tunnel are exactly the same as the trucks they were before. The HTTP protocol itself is exactly the same with HTTPS as it is with HTTP. Nothing changes. The only thing that changes is we’re driving through this tunnel. So nobody can look at the data. But the actual request, the response codes, etc. are exactly the same. So we’re going to spend most of the time without the tunnel because it’s easy to look at the road.
So this all sounds great. We’ve got HTTP and we got HTTPS. Everything’s been working fine. So, why do we need to change anything? There’s a number of problems that we need to address.
The first is that even very small requests still take a lot of time because our trucks can only drive at the speed of light. And so basically, a longer road takes a long time for a truck to drive even if it’s carrying a tiny little bit of information. And most web pages are made up of loads of little bits of information, tiny little requests and files, etc., that fit comfortably in our trucks. So the bigger problem is how far they need to drive.
And so the amount of time they take to drive is latency, if you think back to the 500-mile email problem. Okay, the second problem is that most pages are made from many, many files and so there are many, many requests. We have to send a lot of trucks back and forth because you start off by asking for the HTML file and then you realize, oh, we actually need the PNG, that’s this icon at the top of the page. We need this JPEG, that’s a picture of Ron Burgundy, and we need this CSS file, but the CSS file references a font file and so now we need to send more trucks to get that. And so, over time, the number of files that most web pages are made up of has exploded, and nowadays it’s not uncommon to see 50 to 100 different files making up one single web page.
And then the third problem is that we’re increasingly using mobile devices. I didn’t draw a diagram for this, but you can imagine there are potholes on the road and the trucks can’t go as fast. So on a 3G connection the latency is typically 100 to 500 milliseconds. So it takes a long time for a single truck to drive back and forth.
So what does all that add up to?
Now we got a problem. It takes time to build roads. Someone should definitely have stopped me with the trucks. And so we’ve got steamrollers.
Steamrollers are going to build the road. So before we can ever send a truck down this road, before we can ever send a request for a web page, we first need to send our steamroller all the way to the server putting down the tarmac and then coming all the way back, painting the road. And so that’s going to take us another hundred milliseconds every time we build a road.
Modern browsers typically open about six connections maximum. They won’t go beyond that for a whole bunch of networking reasons that we’re not going to get into. And so what happens is you open up six connections to a server, but you need to get 50 or a hundred files. So you end up with a situation where you’ve got a whole bunch of trucks queuing up. We say queueing up in England, you say lining up over here. I’ll try to use English 2.0.
So we’ve got trucks lining up waiting for a road. And this is what it looks like in a waterfall diagram, which you can see in Chrome Dev Tools or if you’re using a platform like webpagetest.org to measure the performance of your site. You see something like this. So the top request is getting the HTML page, then there’s a request for a CSS file, which we won’t worry about too much. And then at this point we’ve got two roads open, so we’re collecting two files. And then we collect another two files when the first two are finished, so over here, these two trucks get back and now these two trucks can leave to go and get their requests. And then we can start the next two when these two get back. And then this is basically new connections being opened. So four steamrollers going.
And so now we’ve got six roads. So now we can see the six requests going, but we have to wait for these to get back, and so on and so forth. This is called head-of-line blocking. Basically, we’ve got a problem in that all of these requests are waiting to have a free connection. And so you can see that over time, this stacks up to quite a lot of impact in terms of how long it takes us to load all these resources. And this is from a study by Google, which was motivated by them trying to understand how they could address this problem. And they established that in order to fix this problem where we’ve got our trucks always waiting, we need to compress this: we need to reduce the impact of latency.
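The queueing described above can be reproduced with a toy model. This is only a sketch under simplifying assumptions: every file takes exactly one round trip, each connection carries one request at a time, and all connections finish their setup at the same moment. The function name and the timings are illustrative, not measurements from the talk.

```python
# Toy model of HTTP/1.1 queueing: each connection carries one request at a
# time, and the browser opens at most `max_connections` of them.
# All timings are illustrative assumptions, not measurements.
import heapq


def load_time_ms(num_files: int, round_trip_ms: float,
                 setup_ms: float, max_connections: int = 6) -> float:
    """Time until the last file arrives, in milliseconds."""
    # Each connection becomes usable once its "road" has been built.
    free_at = [setup_ms] * min(max_connections, num_files)
    heapq.heapify(free_at)
    finished = 0.0
    for _ in range(num_files):
        start = heapq.heappop(free_at)      # wait for the next free connection
        done = start + round_trip_ms        # one truck out, one truck back
        finished = max(finished, done)
        heapq.heappush(free_at, done)
    return finished


if __name__ == "__main__":
    # 60 files, 100 ms round trips, 100 ms to open a connection.
    print(load_time_ms(60, 100, 100))
```

In this model the seventh file cannot even start until one of the first six has come back, so the total grows with every extra batch of six files.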
So, how do we go about doing that? There’s a number of different ways that people have tried doing this over time. Some of you might be familiar with sprite sets, which is basically the idea that we’re going to take a whole bunch of images and we’re going to glue them all together so that they’re one file. And then we’re going to ask our browser to use CSS just to show the part of the image that we’re interested in at any one time. So this star, or this magnifying glass, whatever it might be. And the whole idea of this is that we can put all of these images as one file into one truck so we don’t have all of these round trips going back and forth. But it’s a real hack and it’s a real pain in the ass for the developers, and nobody really enjoys this.
So HTTP/2 to the rescue. With HTTP/2, we’re basically going to introduce a new traffic management system. We’re going to have one connection, but now with HTTP/2 you can have multiple trucks on the road at a time. So we’re going to number them or color code them or whatever we want to do. This is called multiplexing, and this is basically the main thing that HTTP/2 introduces.
And so previously we had the situation looking like this, where we had all of these requests stuck, being queued, waiting one for another. As we move to HTTP/2, the exact same page looks like this. We’ve got the initial steamroller. We only ever have one road so we only have to do that once. And then the HTML arrives and then all of the trucks for all of the CSS, all of the image files, all leave at once. It looks like it’s slower or the same amount of time because the scale is different, but here you can see it takes about two seconds, and with HTTP/2 you can see exactly the same page takes about 1.1 seconds. So we’ve got a significant improvement. As SEOs we often spend a lot of time fighting, trying to get performance things pushed through, and HTTP/2 actually is a huge performance increase in certain scenarios.
And so just as with the move from HTTP to HTTPS, the trucks stay the same. The same is true for HTTP/2. The actual requests stay exactly as they did before. All we’ve done is introduce this new traffic management system. So previously, I showed you this request and in this scenario, it looks exactly the same. All we’ve done is change it to HTTP/2 rather than HTTP 1.1. The response looks exactly the same. We’ve got headers and a response code and we’ve got the HTML all coming together. All of the response codes that you know and love, 301s, 302s, 307s, 308s, what sort of redirect are we doing, all stay exactly as they did before. So none of this changes. So everything you already knew about HTTP is still relevant and you don’t need to worry too much about the roads.
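A back-of-the-envelope comparison shows why the two waterfalls differ. This sketch assumes every file fits in one round trip, bandwidth is unlimited, opening the first connection costs one round trip, and any extra HTTP/1.1 connections open in parallel at no further cost. The function names and numbers are illustrative and are not meant to reproduce the two-second and 1.1-second figures from the slides.

```python
# Back-of-the-envelope comparison of the two waterfalls.
# Assumptions: every file fits in one round trip, bandwidth is unlimited,
# and opening the first connection costs one extra round trip.
import math


def http1_load_ms(num_assets: int, rtt_ms: float, max_connections: int = 6) -> float:
    """HTML first, then assets fetched in batches over a limited pool of connections."""
    setup_and_html = 2 * rtt_ms                      # build the road, fetch the HTML
    batches = math.ceil(num_assets / max_connections)
    return setup_and_html + batches * rtt_ms


def http2_load_ms(num_assets: int, rtt_ms: float) -> float:
    """HTML first, then every asset requested at once over one connection."""
    setup_and_html = 2 * rtt_ms
    return setup_and_html + (rtt_ms if num_assets else 0)


if __name__ == "__main__":
    print(http1_load_ms(60, 100), http2_load_ms(60, 100))
```

The gap widens as latency goes up, which is why the benefit tends to be most visible on slow mobile connections.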
Matt Clayton is up right after the next break talking about performance. Matt’s going to be talking a little bit about HTTP/2 as well and some of the performance benefits in more detail, but he and I both agree that server push at the moment isn’t ready for prime time.
So HTTP/2 requires HTTPS. In practice, browsers will only speak HTTP/2 over an encrypted connection, so you need to understand that you’re not going to be able to have HTTP/2 unless you already have HTTPS in place.
Okay, so, how can you get it? This all sounds great. We can speed up our websites. We don’t even have to do anything and everything keeps working the same. So, how can you do this?
The good news is that your developers don’t need to change anything. Because the trucks are the same, because the HTTP requests are exactly the same, rolling out HTTP/2 is a problem for the server software, so Apache or Nginx or whatever it is that you’re using, rather than being a problem for the developers. So it doesn’t matter if you’re using WordPress or Drupal or whatever it is you’re using. You can essentially just roll out HTTP/2 without actually needing your devs to change any code. How can you do that?
One really simple way is, if you’re already using a CDN such as Cloudflare or Akamai or any of these things, then most of them, if you’re on HTTPS, have a toggle where you can go in and basically say, turn on HTTP/2. And what will happen is browsers will then speak HTTP/2 to Cloudflare, which has got most of your static assets like your images and CSS and doesn’t even need to request them from your server. So it’s really, really easy to roll out that way.
Okay, so we’re SEOs, like this all sounds great, but we shouldn’t be the people who are necessarily like driving this sort of change. So what’s the impact on Google? Does Google understand this?
First thing to understand is Google does not crawl... there’s a mistake on this slide, isn’t there? I’ve got to fix that before I upload the deck. Googlebot does not crawl using HTTP/2.
So on the original version of this deck, I had some hypotheses about how Google actually evaluates site performance as a ranking factor, and I found out from John Mueller last week at a different conference. I spoke to him about it and he confirmed this to me, and it doesn’t seem to be published widely anywhere at the moment, so this is the first time we’re talking about it: Google has switched to using the Chrome User Experience Report. So this is basically the data that Google gets from Chrome users. Anyone who opts into sharing their stats basically uploads, every time they go to a website, how fast that website was, how quick the first paint was, and all of that information. And then Google is using this as a ranking factor.
And so the takeaway from that is that they will notice in terms of if you’re getting the performance increase, they will notice this as it pertains to the ranking.
Okay, so HTTP/2 takeaways.
It can be a quick performance win and CDNs make deployment pretty easy.
You need to have HTTPS so if you haven’t already migrated to HTTPS, then this is another good reason for you to be pushing towards doing that because you can get some benefits from HTTP/2.
Inside Chrome developer tools, you can turn on the protocol column, and H2 is just a nickname for HTTP/2. You can turn this protocol column on and you can see what protocol any connection, any request is using. If, when you’re doing that, you see a protocol called SPDY, pronounced “speedy”, that was basically HTTP/2’s predecessor. That was a project at Google before it became a web standard. They designed this traffic management system, and that’s being retired. So you can essentially ignore SPDY.
There’s a Chrome extension here. This deck will be available to download. So don’t worry about the link, but you can install this Chrome extension and you’ll get a little icon showing you which protocol’s being used.
Then a couple of really important things. HTTP 1.1 and HTTP/2 can exist together on the same server just fine. So all the major browsers speak HTTP/2, but if you’ve got a device that doesn’t, or Googlebot or whatever, they all simply fall back to using HTTP 1.1. So what that means is HTTP/2 is not a migration. It’s not like you’re going to be changing as you did with HTTPS, from HTTP: to HTTPS:. You’re not going to be HTTP/2 now, it’s still going to be HTTPS. So this is not a migration. You can just turn this on. There’s no sort of SEO hoops to jump through.
Okay, so we’ve talked about HTTP/2, now I’m going to talk about service workers, which is a completely separate topic but also plays into the same thing. It’s something that plays into the performance of websites and so we need to understand what this technology is from the SEO side.
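The fallback behaviour can be pictured as a simple choice between what the client offers and what the server supports. In real deployments this selection happens during the TLS handshake (via ALPN, where `h2` and `http/1.1` are the protocol identifiers); the sketch below only mirrors the outcome, and the function name and the default for the no-overlap case are assumptions made for illustration.

```python
# Simplified model of protocol selection between a client and a server.
# Real negotiation happens during the TLS handshake (ALPN); this sketch only
# mirrors the outcome: use HTTP/2 when both sides offer it, else HTTP/1.1.

def choose_protocol(client_offers: list[str], server_supports: list[str]) -> str:
    """Pick the first server-preferred protocol that the client also offers."""
    for protocol in server_supports:
        if protocol in client_offers:
            return protocol
    return "http/1.1"   # assumed default when there is no overlap


if __name__ == "__main__":
    server = ["h2", "http/1.1"]
    print(choose_protocol(["h2", "http/1.1"], server))  # a modern browser
    print(choose_protocol(["http/1.1"], server))        # an HTTP/1.1-only client
```

Because the choice is made per connection, the same URLs serve both kinds of client, which is why turning HTTP/2 on is not a migration.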
So we’re going to have a new type of truck. It’s going to be a flying truck. I’m just joking. We’re done with the trucks. No more trucks.
And then there are two things that turn a single page application into a progressive web application. So taking one of these Angular or React powered sites and turning it into a progressive web application, which you can install onto your home screen and operate like a native app, takes basically a manifest file and a service worker.
The manifest file you can essentially just think of as a blob of JSON that is largely uninteresting that you can think of it as the settings for your progressive web app. So what is your splash screen going to look like? What’s the theme color? Etc. Etc. We’re not going to worry about that today.
And then you’ve got the service worker. And this is how Google describes a service worker. A service worker is a script that your browser runs in the background separate from a web page and hopefully it’ll become clear what I mean by that. So service workers enable a whole bunch of cool stuff. So background sync, offline functionality, push notifications in your Android phone, etc., intelligent caching.
Okay, so I’m short on time, so I want to go through this quite quickly. So where do service workers live? Same deal as we had before: we’ve got a browser that requests something from the server and gets back some HTML.
But in reality, it’s a bit more complicated than that. When the browser wants to request something from the server, what it does first is check the browser cache. Have I already seen this HTML file or this PNG file or whatever it might be? If the cache hasn’t seen it, it sends the request on to the server. It gets a response back that comes back via the cache, and then subsequent requests look like this: I ask for the same thing again, and I get the response back from the cache. Fine.
And this is what happens when we do Control-refresh or F5, or Shift-refresh on a Mac. We’re actually saying, okay, I want to bypass the cache, and I think we’ve all done this when the page doesn’t seem to update and so you hold down Control to refresh. And so in this case, we go bypassing the cache and then we refresh the cache on the way back. Fine.
So what we need to understand is if you do view source or you look in the network tab of Chrome developer tools, you’re seeing the communication that actually happened here. It’s called the network tab, but you’re not actually seeing what happened over here. So even if nothing went over the network, you’ll still see a response in the network tab that looks like, oh, this whole bunch of HTML apparently came over the network. But in reality it came from the cache.
So a service worker could say something like, I recognize this page, you’re requesting the contact page, and I know that you’re going to need these CSS files and image files. So what I’m going to do is I’m going to request the HTML file that you asked for, but I’m also going to send some additional requests to sort of pre-fetch some stuff that I think you’re going to need. Or it might do something like this and say, oh, I can see that the CSS is in the cache already, so I’m going to give the CSS that you’re requesting back to you straight away. But what I’m going to do in the meantime is I’m going to send off a request to the server as well, and I’m going to get a fresh copy of that CSS. So your next request is going to have the updated CSS, but your initial request we’re going to serve really quickly.
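The cache flow and the hard-refresh bypass can be modelled in a few lines. This is a toy sketch, not how any real browser is implemented: the class name, the `hard_refresh` flag and the stand-in fetch function are all assumptions made for illustration, and real caches also honour expiry headers that are ignored here.

```python
# Toy model of a browser cache sitting between the browser and the server.
# `fetch_from_server` is a stand-in for a real network request.
from typing import Callable


class BrowserCache:
    def __init__(self, fetch_from_server: Callable[[str], str]) -> None:
        self._fetch = fetch_from_server
        self._store: dict[str, str] = {}

    def get(self, url: str, hard_refresh: bool = False) -> tuple[str, str]:
        """Return (body, source), where source is 'cache' or 'network'."""
        if not hard_refresh and url in self._store:
            return self._store[url], "cache"
        body = self._fetch(url)          # go all the way to the server
        self._store[url] = body          # refresh the cache on the way back
        return body, "network"


if __name__ == "__main__":
    cache = BrowserCache(lambda url: f"<html>{url}</html>")
    print(cache.get("/contact"))                     # first visit: network
    print(cache.get("/contact"))                     # repeat visit: cache
    print(cache.get("/contact", hard_refresh=True))  # hard refresh: network
```

Note that the caller gets the same kind of answer either way, which is the point made next in the talk: the response looks the same whether or not anything crossed the network.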
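The two behaviours described above, prefetching related files and answering from the cache while fetching a fresh copy, can be sketched as a small model. Real service workers are written in JavaScript and run inside the browser; this Python class only imitates the decision logic. The class name, the URL map and the fetch function are hypothetical, and the refresh runs immediately here rather than in the background.

```python
# Toy model of two service worker strategies: prefetching related files and
# serving from cache while refreshing for the next request.
# The URL map and fetch function are hypothetical; real service workers are
# written in JavaScript and run inside the browser.
from typing import Callable


class ServiceWorkerModel:
    def __init__(self, fetch_from_server: Callable[[str], str],
                 related: dict[str, list[str]]) -> None:
        self._fetch = fetch_from_server
        self._related = related            # page URL -> files it will need
        self.cache: dict[str, str] = {}

    def handle(self, url: str) -> str:
        if url in self.cache:
            stale = self.cache[url]                 # answer straight away
            self.cache[url] = self._fetch(url)      # refresh for next time
            return stale
        body = self._fetch(url)
        self.cache[url] = body
        for extra in self._related.get(url, []):    # prefetch what comes next
            if extra not in self.cache:
                self.cache[extra] = self._fetch(extra)
        return body


if __name__ == "__main__":
    worker = ServiceWorkerModel(lambda url: f"contents of {url}",
                                {"/contact": ["/site.css", "/logo.png"]})
    worker.handle("/contact")
    print(sorted(worker.cache))   # the page plus its prefetched files
```

The trade-off in the second strategy is that the first answer may be slightly out of date, in exchange for being served without waiting on the network.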
Why, I’ve got a different screen on here than I have up there. How long’s that been the case?
This is like a computer game like I’m trying to work out. I’ll start pressing buttons and you guys work it out. There’s supposed to be a little lovely picture of me up there and they changed this one. That’s nice.
Okay, so. Go to the picture of the handsome guy who looks a lot like me and I’m pulling this face. Other direction. I think they’ve got different decks. Doesn’t matter. Okay, we’re going to work it out. We’re going to go to how to spot a service worker.
We’ve got different decks on each screen. This is like a new speaker game.
Okay. So service workers require the tunnel as well. So basically because of the power that service workers have, they’re like, they can intercept all requests going to your domain and basically change the contents in any way they see fit. So registering a service worker requires that you have an HTTPS connection.
Alright, we’re in sync again, this is good.
Okay. So registering a service worker. This is the sort of snippet that you would expect to see for how a service worker is registered. You don’t even need to worry about that. An easy way to spot one is just to look inside the Chrome developer tools. You’ll find a little cog icon, and that indicates that there’s a service worker registered for the domain you’re on and it’s able to start doing stuff. So if you’re doing a site audit and you see this cog, just be really aware that it can basically be changing the way your requests are going and what the responses look like.
You can also go into the application tab and click on service workers and you’ll see, okay, this is the service worker. So you can see theguardian.com, which isn’t really a super active site, has a service worker doing something. We don’t know what. And at this point, if you wanted to, you can unregister a service worker. So you can say stop fiddling with my stuff, but they’ll come back again.
Or you can go to this special Chrome URL and it’ll show you all the service workers that you’ve got registered in your Chrome. Most people I’ve done this with are quite shocked because most people have got dozens of service workers sat inside Chrome already; they’re like gremlins in your browser changing stuff.
So if you did come across service workers and you wanted to understand what’s going on in a page, then basically the same method we used before for bypassing the cache actually entirely bypasses a service worker. If you do a Control-refresh, a hard refresh, you’ll entirely bypass service workers and they won’t impact what’s happening on the page.
I should probably look at this screen because there’s.
Okay, so. Same as with HTTP/2, Googlebot and the Google web rendering service, the headless browser that they use for Fetch as Googlebot and all of that stuff, don’t use service workers at all. So none of this stuff is directly visible to Google. But in exactly the same way as HTTP/2, this stuff will be noticed by the Chrome User Experience Report as Chrome users are sending data to Google. And so if you’ve got a service worker speeding up your website, that will be noticed.
Takeaways, okay, so wrapping all this up quickly.
What have I told you? Basically for spotting HTTP/2, you should turn on the protocol column in Chrome and/or install the Chrome extension I told you about.
Be wary of really cutting edge advice around things like HTTP/2, server push, etc. You should probably just pay attention to Matt. I don’t know if he’s here at the moment. He’s speaking after the break and he’s got a whole bunch of really good advice.
Look for the cog in the network panel to identify whether you’ve got a service worker in play or not.
And basically with all of this stuff, as I’ve said, Googlebot won’t directly notice this. Google Search Console won’t talk about service workers or HTTP/2 and won’t notice the speed-up, but Google will notice, and this is a really good opportunity for a lot of us in this room to get ahead of our competitors, because so far nobody seems to talk about this online. But John Mueller, as I said, explained to me very clearly that the Chrome User Experience Report will notice the impact of these and that will play into ranking factors directly. And so this is a really easy thing, especially HTTP/2, that we can do to get a jump on our competitors.
And so hopefully all of that made sense to you. And the main thing is like SEO shouldn’t be the driving force for either of these things. But as SEOs we do need to have that base knowledge so that when you get the call from your client or you’re in a meeting with your boss, you can actually sort of understand and answer what the impact of these are. The tl;dr is that neither of these things will ever hurt your SEO but the performance benefits would definitely help.
I think I’ve got two minutes for any questions if anyone’s got any. Thank you very much.
I don’t know if we should let you ask questions.
Man from audience:
But okay. I have a question that is based on remembering something that I think you said in a run-through that I think you didn’t say here, and I just want to remember correctly and clarify.
I said all sorts of things in run through that I didn’t say here.
Man from audience:
Like shit, that’s wrong, the lot. But go on.
Man from audience:
You said that a hard refresh hits the server again, even if you actually have a service worker installed. But you had the asterisk about the view source?
Man from audience:
That’s what I thought I remembered. I just clarified that for everyone. Thanks.
There’s another question here.
Any more questions about trucks?
I’ve got a million things I want to ask you and I suspect everyone else does as well but this might be the stuff that we need to sit down over a beer with tonight and take…
Beer is definitely the thing you need when you’re sort of making changes to your web server.
That’s a quote you can put on…
Beer and shots.
Yeah. All right. Let’s give it up for Tom Anthony.
Thank you very much.