The new natechoe.dev backup system
I probably shouldn't admit this while applying to college as a CS major, but until last week I didn't have backups. I know, that's insane, but I've now set up a very technically satisfying backup system which I feel deserves a blog post.
Part 1: Constraints
I want to protect myself from ransomware attacks, so I need a system where even if someone broke into my main server and got root access, they still wouldn't be able to delete backups. In other words, I need a write-only backup system: a client can add new backups, but it can never modify or delete old ones.
I also want to make this system simple, so most of the work should be done server-side. All the client has to do is upload a file to back up, and the server handles everything else. The client shouldn't have to worry about choosing a file path that doesn't collide with anything else, TOCTOU vulnerabilities, or anything else like that. The client uploads a file, and the server says "it worked" or "it didn't work".
Finally, this system should be private. My backups are hosted by BuyVM. I don't trust them (or anyone, really) with all of my email data completely unencrypted. My backups ought to be encrypted before getting sent to or stored on a remote host.
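As a rough illustration of "encrypt before it leaves the machine", here is a minimal Python sketch of the client-side step. It is only an assumption about how this could be done, not the tooling this system actually uses: the helper names are made up, and it assumes gpg is installed with a public key for the given recipient already imported.

```python
import io
import subprocess
import tarfile
from typing import Iterable, List


def make_tarball(paths: Iterable[str]) -> bytes:
    """Pack the given files or directories into an in-memory .tar.gz."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for path in paths:
            tar.add(path)  # directories are added recursively
    return buf.getvalue()


def gpg_encrypt_command(recipient: str) -> List[str]:
    """argv for public-key encryption, reading stdin and writing stdout."""
    return ["gpg", "--batch", "--encrypt", "--recipient", recipient, "--output", "-"]


def encrypt(data: bytes, recipient: str) -> bytes:
    """Encrypt data with gpg (assumed tool); raises if gpg exits non-zero."""
    result = subprocess.run(
        gpg_encrypt_command(recipient),
        input=data,
        capture_output=True,
        check=True,
    )
    return result.stdout
```

Something like encrypt(make_tarball(["/var/mail"]), "me@example.com") would then produce the bytes to upload, where both the path and the recipient are placeholders. With public-key encryption the machine being backed up only needs the public key, so the remote host never sees plaintext.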
Part 2: Options
If you use Linux, your first thought when I said "backups" was probably rsync. The problem is that rsync doesn't encrypt the files it stores on the server. There are some rsync-like programs with encryption, but I don't want to go through all the effort of auditing their source code.
If you don't use Linux, your immediate thought might have been to use FTP, which does technically allow you to configure a write-only server. This might work for some use cases, but the client would have to worry about file name collisions. Ideally, the server handles the creation of file names because that's where the files are.
The next obvious solution would be to create an HTTP server with an API to upload a file as a backup. The client would simply tar the files, encrypt them, then upload them. This would be simple to set up client-side, but it's an absolute pain server-side. I'd have to use some bloated server-side runtime like Node.js or PHP just to redirect some file paths. It would make the system way too big for a simple backup system like this.
It'd be really nice if I could just scp a file to a fixed location and the server handled everything for me. I began looking into creating a server with libssh, before realizing that the scp protocol is actually deprecated and the alternative, SFTP, would take way too long to implement.
Then I had a brainwave. execfs is a custom filesystem that allows you to execute shell commands from the filesystem, meant for use with the C preprocessor. Running cat execfs/(ls) will make execfs actually run the "ls" command and give you its output as the file content. What if I did something similar with my backup system, creating a custom filesystem that redirects writes to some randomly generated file path so that the client only has to write to a single file?
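The "randomly generated file path" part is the piece that takes collision and TOCTOU worries away from the client. Here is a minimal Python sketch of that one step, not the real filesystem code: the function name, the naming scheme (timestamp plus random hex), and the file mode are all assumptions. The point is that O_CREAT together with O_EXCL makes the kernel refuse to open a path that already exists, so there is no gap between checking for a name and creating it.

```python
import os
import secrets
import time
from typing import Tuple


def create_backup_file(backup_dir: str) -> Tuple[str, int]:
    """Create a brand-new file in backup_dir and return (path, fd).

    Hypothetical helper: the name is a timestamp plus random hex.
    """
    while True:
        stamp = time.strftime("%Y%m%dT%H%M%S")
        name = f"{stamp}-{secrets.token_hex(8)}"
        path = os.path.join(backup_dir, name)
        try:
            # O_EXCL: fail atomically if the path already exists,
            # so an existing backup can never be opened or truncated.
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        except FileExistsError:
            continue  # extremely unlikely collision: pick another name
        return path, fd
```

A filesystem handler for the single upload file could call something like this when the file is opened and send every later write to the returned descriptor.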
Part 3: My solution
My backup server has a user called "backupfs" and a systemd service which runs my custom filesystem on /run/sftp-only/backupfs. Writing to /run/sftp-only/backupfs/dev as the "backupfs" user will generate a new file path and redirect any writes to that new file. My sshd_config also contains these lines:
Match User backupfs
ForceCommand internal-sftp
ChrootDirectory /run/sftp-only
PermitTunnel no
AllowAgentForwarding no
AllowTcpForwarding no
X11Forwarding no
AuthorizedKeysFile /var/run/sftp-only/backupfs-authorized-keys.txt
PasswordAuthentication no
To upload a backup, a client can authenticate itself as the "backupfs" user and copy a file with sftp to /backupfs/dev. Most importantly, a client that can authenticate itself as the "backupfs" user can't do much beyond that. If I ever get pwned, the worst the attacker can do is fill up my backup disk space with meaningless data.
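From the client's side, the whole upload can be a single non-interactive sftp call. The following Python wrapper is a sketch under assumptions: the function names and the host are hypothetical, the OpenSSH sftp client is assumed to be installed, and key-based authentication for "backupfs" is assumed to be set up already. It relies on sftp's batch mode ("-b -" reads commands from standard input), where a failed put ends the session with a non-zero exit status.

```python
import subprocess
from typing import List


def sftp_upload_command(host: str) -> List[str]:
    """argv for a non-interactive sftp session as the backupfs user."""
    return ["sftp", "-b", "-", f"backupfs@{host}"]


def sftp_batch(local_path: str) -> str:
    """Batch script copying one local file to the fixed upload path.

    Assumes local_path contains no double quotes.
    """
    return f'put "{local_path}" /backupfs/dev\n'


def upload_backup(host: str, local_path: str) -> bool:
    """Return True if sftp exited successfully, False otherwise."""
    result = subprocess.run(
        sftp_upload_command(host),
        input=sftp_batch(local_path),
        text=True,
        capture_output=True,
    )
    return result.returncode == 0
```

Calling upload_backup("backup.example.com", "backup.tar.gz.gpg") then gives exactly the "it worked" or "it didn't work" answer described in Part 1. Both the host name and the file name here are placeholders.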
Part 4: Retrospective
This entire system including the custom filesystem and sshd configuration takes up just over 300 lines of code and took me around four hours to set up. I'm sure experienced Linux users will tell me about some shiny system that does exactly what I need with a single line of configuration, but this solution is technically interesting enough to stay.