Recently I ran into some problems when I upgraded my server to PHP 7.4. The problems were solvable but I figured I’d revert to my installation pre-PHP 7.4 upgrade and then work the issues off-line. But when I tried to re-install my system from a backup, all hell broke loose.
My VPS, or Virtual Private Server, company is Linode, one of the longest-serving companies in the business. As part of my installation, I pay extra for weekly and on-demand backups. I’ve not had an issue with backups in the past, but this time, not only could I not start my server after restoring the backup, I couldn’t even figure out how to start repairing it.
I do know that the problems I had were not unique. In the Linode help system, I found others who also suffered boot failures very similar to mine, all after restoring from a backup.
The long and contentious trouble ticket
I submitted a trouble ticket, and I’ll be honest: I wasn’t the most patient customer they’ve had. The thing is, I understand that problems I introduce are my responsibility, but it’s absolutely essential to be able to depend on our backups. When you start with a pristine machine and the backup fails, the problem is with the backup. And we have no control over backups.
Customer support and I went back and forth for two days, with several people entering the support thread. I received some good help, and good tips. At one point I almost got the system up and running, but the success was short-lived.
I wish I could say I was the model of good behavior, but between you and me…you know me.
I became frustrated, and even angry. I kept asking them why I can’t depend on the backups. Why have a backup service if you can’t depend on the backups?
What did you guys screw up?
Finally, Linode basically told me they weren’t going to offer any more support. After all, Linode is a VPS provider, not a shared host environment, and VPS is basically, ‘ware, there be dragons here. If I wanted more support I’d have to pay for it.
Understandable response. Wrong response.
VPS Provider Responsibilities
I don’t think there’s a person who maintains a VPS who doesn’t understand that we’re responsible when we screw up. Our VPS provider might give us advice, and the community help systems are good places to go for support, but bottom line, we’re on our own. This is the cost of the freedom a VPS gives us.
However, our responsibility ends and the company’s begins where the functionality is outside of our control. Keeping the machines up and running is the company’s responsibility. Ensuring that an out-of-control VPS doesn’t harm others is also the company’s responsibility. Finally, if you offer a backup service, you really need to ensure backups work, or the service is basically useless. You’re giving the impression of security, when there is none.
Though Linode’s response was very polite, it basically told me:
Your problem is not our problem.
No, you would not want to be a fly on the wall in my room when I read this message. If you were, you’d be a dead smear now.
At this point, I was resigned to either painstakingly having to download all of my stuff and try to re-build the server, or just bag it all, including many years of weblog entries.
Thankfully for all involved, the person who told me I was on my own also mentioned something else, a newer service they provide called Images. Way back in the beginning of the questions, I asked if there wasn’t a way I could undelete my files. Evidently, that’s what Images are.
An Image is a snapshot in time of the system. Images are created any time there’s a major change to the VPS, such as in my case, deleting the old contents and restoring from backup.
Now Images are limited to a certain size, and they can be problematic if the image snapshot happens right as a database is being updated. In my case, though, the Image was the thing that saved me. My server Image fit within the size restrictions, and I didn’t have any database activity at the time I deleted my content in preparation for restoring.
Thanks to the Image I was able to recover my system, which is why you’re reading this, right now.
Culprit Unveiled
Once I got the system up and running I was able to immediately discover the problem with the backup. It’s a thing called Canonical Livepatch.
If you install Canonical Livepatch in your system, critical system kernel updates are performed without having to reboot. It was an attractive-sounding concept when I first installed it. Actually, I installed it using Linode’s instructions.
One essential requirement to using Canonical Livepatch is that you need to be running GRUB 2, not a specific Linode kernel. And when the backup was restored, it was restored to Ubuntu 18.04 with GRUB 2. However, when I restored my system from the Image, it was Ubuntu 18.04 running the latest 64-bit kernel. This configuration supposedly should be incompatible with Livepatch, but somehow, this was the state of everything.
My fault? Probably. I was fairly sure I had changed to GRUB 2, but who knows what happened in the last few years since I installed Livepatch.
Regardless, I never found Livepatch to be useful, so I removed it from the system. I then tried a new backup and restored it to a new Linode. It booted up cleanly, no problem.
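A quick check of which kind of kernel is actually running might have flagged the mismatch earlier. The following is only a rough sketch, and it rests on an assumption worth verifying on your own system: that Linode-supplied kernels carry the word "linode" in the release string reported by uname -r, while distribution kernels booted through GRUB 2 do not. The function names are made up for illustration.

```python
import platform


def looks_like_linode_kernel(release: str) -> bool:
    """Heuristic only: assumes Linode-supplied kernels include 'linode'
    in their release string (an assumption, not a documented guarantee)."""
    return "linode" in release.lower()


def describe_kernel(release: str) -> str:
    """Return a one-line, human-readable guess about the kernel's origin."""
    if looks_like_linode_kernel(release):
        return f"{release}: looks like a Linode-supplied kernel"
    return f"{release}: looks like a distribution kernel (for example, booted via GRUB 2)"


if __name__ == "__main__":
    # platform.release() returns the same string as `uname -r` on Linux.
    print(describe_kernel(platform.release()))
```

If the output says Linode-supplied kernel on a system where Livepatch is installed, that would be a hint to look at the boot configuration before trusting a restore.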
Lessons Learned
After all of this, was there anything positive from the experience?
Sure. I am a better person.
And now I’ll cut the crap and say, yes, I did learn valuable stuff, but I’m still a bit pissed about the whole thing.
Anyway, I learned I don’t have to install every new and fancy gewgaw my system supports. I’m not six, I don’t need a new toy.
I also learned that if I have a problem with my system, damn it, just try fixing it first before scrapping it all and restoring from backup. I got lazy, and it bit me in the butt.
I also learned that I need to pay attention to what’s happening with Linode more than I have. I wasn’t aware of Images because I’ve been a Linode customer for so long, I stopped paying attention to Linode. Familiarity and all that.
As for the backups, when I first started programming, my teachers drummed into my head the importance of good backups, and that you really can’t depend on backups you don’t make yourself. Wow, they are shaking their heads at me now. They did way back then, too, but not because of backups.
I still have automatic backups, but once a month I schedule a bit of time to see if the latest backup restores cleanly. At the same time, I also do a database dump and download it to my PC. In addition, I have a snapshot of my server on my PC, and when I finish any major change to files on my server, I download a copy of the changed files.
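Part of that monthly routine can be a small sanity check on the downloaded database dump. The sketch below is one hypothetical way to do it: it confirms the file exists, isn't empty, and isn't older than about a month, and it computes a SHA-256 checksum that can be compared against one taken on the server. The function names and the 31-day threshold are my own assumptions, and a check like this is no substitute for actually restoring the backup.

```python
import hashlib
import time
from pathlib import Path
from typing import List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so a large dump doesn't need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_dump(path: Path, max_age_days: float = 31.0) -> List[str]:
    """Return a list of problems found; an empty list means the basic checks passed."""
    if not path.is_file():
        return [f"{path} does not exist"]
    problems = []
    info = path.stat()
    if info.st_size == 0:
        problems.append(f"{path} is empty")
    age_days = (time.time() - info.st_mtime) / 86400
    if age_days > max_age_days:
        problems.append(f"{path} is {age_days:.0f} days old")
    return problems
```

Calling check_dump(Path("dump.sql")) on the latest download and printing whatever it returns takes a few seconds, which makes it easy to fold into the monthly restore test. The file name here is only a placeholder.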
This experience reminded me that no amount of clever automation makes up for our commitments to our systems. It’s like all of life, really. If we coast along on autopilot, we’re eventually going to run into a fire truck.