2010-10-20

Creating an Encrypted Subversion Repository on Linux

Why?

I have my source code on a server in the cloud. That makes perfect sense - I want to have my code accessible from everywhere, even if the only person accessing the repository is my own self. Access is secured using SSH with PKI - only whoever has the private key can access the system, no passwords allowed.

While I feel pretty secure about access, it bugs me that the source code is not encrypted at rest. Whoever gains access to a copy of the repository (for instance, from a backup) has the code in cleartext. That's absolutely not good. On the other hand, setting up an encrypted repository is too much of a hassle, and I couldn't find anything online about how to do it.

One rainy day (yes, we have those in Southern California, and we look at them like people in Hawaii look at snow) I decided I had enough of it. I wasn't going to take it anymore. I had to do it.

What Not?

When setting up my encrypted repository, I wanted to avoid the most common mistake: a repository that can be accessed from the machine itself. You see, the problem with most encryption software for drives is that it stores the key with the hardware. If you do it that way, the encryption is pretty pointless.

You could set up encryption so that only people with login access to the machine (who also know the password) can decrypt the repository. This approach works well for encrypted home directories, but my source code access uses SSH keys, so there is no login password to hook into.

So, whatever I did, I needed to pass the credentials (or the path to them) with the request itself. The request would provide location and password, and that would be sufficient to unlock the encrypted file.

How?

My ideal scenario was simple: a Truecrypt container on the server with SVN (Subversion) access. I base the whole description on this combination, and the peculiarities of both come into play at several points.

I chose Truecrypt over, say, CryptFS because the container is a single file. It is completely opaque to the intruder, and I can even set it up so that it's not clear the file is a Truecrypt container at all. (For instance, I could call it "STARWARS.AVI" and make people think it's a bootleg copy of a movie.) With most crypto filesystems, encryption is per file, which means the file names and the existence of individual files (and directories) are visible.

I chose Subversion over, say, git because... well, because my repo is already in Subversion, and because SVN has this neat remote "protocol," which consists of creating a remote, secure connection to the server and executing the commands locally, without a special network protocol involved.

Tricks

The first part is very, very simple: after installing truecrypt and subversion (as well as ssh, which you should already have) you need to create a Truecrypt container. Choose a file container that will eventually hold a Linux (ext3) filesystem, and make it big enough to fit the largest size your repository will ever grow to.

To create the container, simply type truecrypt -t -c on the server. That starts the interactive dialog that creates the encrypted file. Give it any name (I assume here you called it STARWARS.AVI) and location (doesn't really matter). The defaults are fine for everything else: you only provide the file name, the file size, and none for the filesystem. When it comes to the password, choose something really, really good.

[Note: volume creation on the client has the advantage of being able to use the graphical interface, which helps a ton.]

Since we didn't select a filesystem, we have to create one. To do that, we need to learn a little about truecrypt internals - but we'll use it in a moment for the actual subversion trick, so it's not too bad. Here goes: truecrypt creates a series of "devices" that the system uses to talk (indirectly) to the encryption core. That's done because truecrypt lives entirely in user space, and hence encryption is not available on a kernel level.

The devices are called mappers and show up as /dev/mapper/truecrypt*. To Linux, the devices behave like regular block devices (read: like a hard drive). Once you have a mapping in place, you can do anything with it that you would normally do with a drive, including formatting.

To map, you invoke truecrypt with the container name and the no filesystem option:

truecrypt --text --password= STARWARS.AVI --filesystem=none --keyfiles= --protect-hidden=no

(Long options this time to save the pain of explaining.)

Now you should have a mapper mounted - if you type mount, you should see one line that starts with truecrypt. Remember the number after aux_mnt (typically 1 on your first try).
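If you'd rather not eyeball the mount output, the slot number can be fished out with a bit of sed. This is only a sketch: truecrypt_mappers is a name I made up for illustration, and the exact layout of the mount line is an assumption. It relies on nothing more than the aux_mnt marker followed by a number, as described above.

```shell
#!/bin/sh
# Sketch: read `mount` output on stdin and print the /dev/mapper device
# for every TrueCrypt slot found. truecrypt_mappers is a hypothetical
# helper; only the "aux_mnt<N>" marker is taken from the text above.
truecrypt_mappers() {
    # Capture the digits after "aux_mnt" and rewrite the whole line
    # into the matching mapper path; lines without the marker are dropped.
    sed -n 's|.*aux_mnt\([0-9][0-9]*\).*|/dev/mapper/truecrypt\1|p'
}

# Usage: mount | truecrypt_mappers
```

On a first mapping this should print the same /dev/mapper path that the next step formats.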

Now we create the filesystem:

mkfs.ext3 /dev/mapper/truecrypt1

(You may have to be root to do that - in which case add "sudo" at the beginning of the line.)

AutoFS

The "evil" trick that we are going to use next is a dynamic filesystem mounted automatically. AutoFS is a package that allows you to declare that certain directories on your system are special and access to them requires special handling. For instance, I use autoFS to connect to my SSH servers. The directory /ssh on this machine is configured to open an sshfs connection to whichever server I name. So, if I write:

ls /ssh/secure.gawd.at

I get the listing of the home directory on the server secure.gawd.at. (The server doesn't exist, but that's beside the point.)

In this case, we will use what is called an executable map: autoFS invokes a script that you name, and will take the results as the configuration options it needs to mount the filesystem. In our case, the script will first open the truecrypt container and then move on to passing the options for the underlying filesystem to autoFS.

Once more: ls /svn/magic-name -> autoFS -> mapper script -> truecrypt mapper -> mount options -> mount
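To make the contract of an executable map concrete, here is a minimal shell sketch (not the Tcl script itself). autoFS calls the map with the looked-up key as its first argument and reads a single map entry from standard output. The helper name map_entry and the example device path are assumptions for illustration.

```shell
#!/bin/sh
# Sketch of the executable-map contract: autoFS passes the key as "$1"
# and expects one map entry on standard output.
# map_entry is a hypothetical helper; the device path is only an example.
map_entry() {
    device=$1
    # "-fstype=ext3" is the mount option, ":<device>" names a local device.
    printf '%s\n' "-fstype=ext3 :$device"
}

# A real map would first open the container that belongs to key "$1",
# then print the entry for the mapper device it got, for example:
# map_entry /dev/mapper/truecrypt1
```

If the script prints nothing or exits with an error, autoFS treats the key as not found and the directory simply does not appear.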

I wrote the script in Tcl, which is still my favorite shell to use. It requires nothing but Tcl and Tcllib, both packages available in pretty much all Linux distributions (although the Debian people, noted bigots, require you to specify a version number). You can download it here.

Copy the script into the /etc directory. If you didn't already, install autofs on your machine. Now edit the /etc/auto.master file and add a line for this new file. Let's call it auto.enc and link it to the /svn directory.

A little magic setup: you have to create the directory (sudo mkdir /svn); then you have to make the map executable (sudo chmod +x /etc/auto.enc); finally, restart autoFS so that it looks at your map.
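For reference, the auto.master line ends up looking like the one generated below: mount point, then the map, then options. This is a sketch under assumptions: add_master_entry is a hypothetical helper, and the ten-second timeout is the value from the notes at the end of this post. Point it at /etc/auto.master (as root) on a real machine, or at a scratch file to try it out.

```shell
#!/bin/sh
# Sketch: append the master-map line for the executable map to a file.
# add_master_entry is a hypothetical helper. Arguments:
#   $1  path of the master map (e.g. /etc/auto.master, needs root)
#   $2  timeout in seconds (defaults to 10)
add_master_entry() {
    master=$1
    timeout=${2:-10}
    line="/svn /etc/auto.enc --timeout=$timeout"
    # Append only if the exact line is not already present.
    grep -qxF -e "$line" "$master" 2>/dev/null || printf '%s\n' "$line" >> "$master"
}
```

Running it twice leaves a single line in the file, so it is safe to keep in a setup script.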

Now we have to choose the container and map. To make my life easier, I chose to use a .d directory in /etc. If you create a directory /etc/auto.enc.d with (sudo mkdir /etc/auto.enc.d), then all the links inside it will be considered maps. The name of the link is the name of the map, and the location it points to is the file container.

If you want to use the container /enc/data/STARWARS.AVI under the symbolic name repo, then you would do this:

ln -s /enc/data/STARWARS.AVI /etc/auto.enc.d/repo
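The lookup from symbolic name to container file can be sketched in a few lines of shell. This is not the Tcl script, just an illustration of the idea; resolve_container is a made-up name, and refusing names that contain a slash is my own precaution rather than something the original is known to do.

```shell
#!/bin/sh
# Sketch: resolve a symbolic map name to its container file by following
# the link in the .d directory. resolve_container is a hypothetical helper.
#   $1  map name (e.g. repo)
#   $2  directory holding the links (defaults to /etc/auto.enc.d)
resolve_container() {
    # Reject names with a slash so a key cannot escape the directory.
    case $1 in
        */*) return 1 ;;
    esac
    link="${2:-/etc/auto.enc.d}/$1"
    # Only symbolic links count as maps; anything else is ignored.
    [ -L "$link" ] || return 1
    readlink -f "$link"
}

# Usage: resolve_container repo
```

With the link created above, the call in the usage comment would print /enc/data/STARWARS.AVI.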

Grand Finale: Credentials

Now the big question: how does the mapper find the password? That was the big question at the beginning. The way I solved it was to add the password to the name of the directory. Crazy guy!

If you followed the instructions above, then whenever you access /svn/anything, the map is consulted. The map script looks at the data passed in and looks for an "@" sign, which is considered the separator between map and password. So, if you wanted to access the repository repo with the password "secure," you would type in

ls /svn/repo@secure

The script would mount the mapper and tell autoFS to mount the directory as an ext3 file system.
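Splitting the key is the easy part. Here is how it might look in plain shell; split_key is a hypothetical helper, and splitting at the first "@" (so that the password itself may contain one) is an assumption on my part.

```shell
#!/bin/sh
# Sketch: split an autoFS key of the form "name@password" at the first "@".
# split_key is a hypothetical helper; on success it sets the variables
# map_name and map_password, and it fails if the key has no "@".
split_key() {
    case $1 in
        *@*) ;;              # key carries a password
        *)   return 1 ;;     # no "@": nothing to unlock with
    esac
    map_name=${1%%@*}        # everything before the first "@"
    map_password=${1#*@}     # everything after the first "@"
}

# Usage: split_key "repo@secure" && echo "$map_name"
```

Keys without a separator are rejected, so an ls on a random name under /svn never gets as far as trying to open a container.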

But, but, but!!! You are passing the credentials in cleartext! That's bound to be terribly bad!!!

Well, yes and no. The transmission between server and client is SSH, so nobody can see the password in the clear. On the server, the password is in the clear, but it is not logged anywhere (unless you tell SVN to log everything). On the other hand, someone who happens to be on the server when a request comes in is also able to look at the encrypted data, since it is mounted for that period of time. So if an attacker can see the password, the attacker might as well look at the files that are protected.

Epilogue

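Let's just assume you got everything working. You should now be able to create a repository. Note that svnadmin takes a local path, not a URL, so run this on the server itself (for instance, through ssh):

svnadmin create /svn/repo@secure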

Now you should notice that the file STARWARS.AVI has been modified. If you mount it using truecrypt, you will see a series of files in there - files that SVN will continue using from now on whenever you access the encrypted repository. Hooray!
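One wrinkle on the client side: the svn command line treats a trailing "@something" as a peg revision, and the documented way around that is to append one more bare "@" to the URL. A small sketch, assuming the repository is reached over svn+ssh under /svn; repo_url and the host name are placeholders I made up.

```shell
#!/bin/sh
# Sketch: build a client-side URL for the encrypted repository.
# repo_url is a hypothetical helper. The trailing "@" is an empty peg
# revision, so the client does not try to read "secure" as a revision.
#   $1  server host name
#   $2  autoFS key, e.g. repo@secure
repo_url() {
    host=$1
    key=$2
    printf '%s\n' "svn+ssh://${host}/svn/${key}@"
}

# Usage (the host name is a placeholder):
# svn checkout "$(repo_url server.example.com repo@secure)" work
```

Whether your client needs the extra "@" may depend on its version, so try a plain svn info on the URL first.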

Notes and Addenda

1. I set the expiration timeout for the directories low, but not incredibly low - at ten seconds. You do that by specifying the "timeout" parameter in the auto.master file. That way, I can do an svn update; svn commit cycle without requiring a second mount. You can play with the parameters yourself.

2. The encryption scheme could be improved easily by using keyfiles instead of passwords. To do so, you would place a keyfile on a remote location (a web server, maybe) and require the script to get that resource, decrypt it using the password provided, and then use that as the keyfile. The advantage is that you require three pieces of information: the truecrypt container, the encrypted keyfile, and the password to the keyfile, to do your bidding.

3. Disclaimer: this setup works for me, but that's because I munged around for dozens of hours until I figured out all the options and configuration items necessary. If it doesn't work for you, don't sue me. If it works, but it stops working after a while and your brilliant source code is lost forever, don't sue me. Proceed with caution, always make backups, and never rely on the advice of strangers. Mahalo.