2010-12-26

Creating Panoramas with Hugin

Like it? This is what the cliffs and the ocean look like at Black's Beach, California. Christmas Eve 2010. See that steep canyon gouged into the cliff face? That's how I got to the beach, on a trail some call the Goat Trail.

I stood at the bottom of this thing and took a picture of where I had come down. Then I took a picture of La Jolla to the South. Then one of the nudie beach to the North. Then one of the waves and the low sun. Then I decided I was just going to cover the whole world around me, and figure out a way to stitch the pictures together.

Back home, I remembered I had used this tool, Hugin, to create panoramic images a while back. It was a pain: you had to pair up the images and click on a series of points that matched. Image 1, image 2, click, click, click, click, click... Then Hugin would figure out just exactly how badly you had screwed up your pictures by tilting, turning, panning, etc., and would create a panorama. It took about four hours of tedious work.

I wasn't looking forward to that, so I looked on my computer (apt-cache search panorama) for alternatives. I downloaded all of them. There was a GIMP plugin called pandora that initially complained about a missing hugin component (restarting GIMP fixed that) but still didn't seem to do much. There was an "easy to use" application called fotoxx that seemed more primitive than easy to use. There was a CLI app called enblend. Other stuff seemed to be mostly supporting software.

I wept, I gnashed my teeth, and I started Hugin. Lo and behold! It started with a tabbed interface and (miracles!) a WIZARD!!! It promised to ask for whatever it needed, and started by asking for the pictures you want to use as a base. Then it started working, actually giving me feedback so that I didn't have to wonder whether it had died on me. Then it came out with a crude map of the panorama. Perfect! It had figured it all out on its own!

Next came a series of optimization steps, and that's where Hugin needs to be a little more proactive. For instance, it won't tell you that it will adjust exposure, so you worry about how weird things will look with this particular section overexposed. It also shows you how it tilted and modified the original images to make them fit into the panorama - a real miracle.

Once I figured out the optimization step (about ten minutes), I saved the project for later re-use and exported the image. That was fun!

I should mention at this point that Hugin sensibly leaves out the parts of the panorama that aren't covered by any of the component images. That meant a section at the top and one at the bottom were missing and needed to be filled in. To get that done, I used the outstandingly marvelous "Smart Remove Selection" plugin for the GIMP.

Use OpenOffice.org to Create an Image Gallery

I went on this hike on Christmas Eve - a steep trail down a ravine, with the most unusual people and the most unusual features. At the end, a gorgeous beach with surfers and pretty ladies with parasols waiting for them. Huge mansions staring down from lofty cliffs, and an amazing sunset that colored the cliffs a golden hue of honey.

I looked at the pictures, and they were pretty. Unfortunately, due to the unusual nature of the features and the number of different things you were looking at, it wasn't quite clear how (or even if) they belonged together, and what the story behind them was. The pictures needed an explanation. A story had to be told.

I did what I do these days when I record a hike: I posted the pictures on Facebook. Add the pictures, add captions, and wait for comments. That sure works, in a crude way. More generally, though, this approach has several issues:
  • In Facebook, the caption is an afterthought. You read it only if you don't know what the picture is about, or if you happen to stumble upon it
  • Facebook has no way to annotate an image. You cannot highlight portions, add notes, or expand parts for better viewing. You have to edit the image outside Facebook and upload it with annotations
  • Facebook doesn't provide a way to visualize a story as a video, with slide auto-transition
I realized that what I wanted was a video. Something like the DVD authoring software that takes an image folder and makes it into a DVD. You put it into the DVD player and you get a slide show.

Wait! I said the words! Slide show! My gallery was a dreaded Powerpoint™ presentation!!!

Now, every geek hates Powerpoint presentations. That's because we have sat through one too many of them. The one that usually ends it for us is the one where the marketing drone discovered transitions and animations, and now we have to sit through an hour of flashing text and banners that swirl through the air into place. Meanwhile, we are thinking about this bug that needs to be fixed and about the pickup basketball game we are missing.

But slide shows are perfect for the purpose I had in mind. Put a bunch of pictures together, add the text that explains them, and use the presentation software's awesome annotation powers to enhance the image. Stay away from gimmicks as much as possible and you got yourself a video.

I looked at different solutions online, and none of them satisfied me. The problem with cloud software is that every change is painful, as it needs to wait for round trip times. You post a minuscule change, it takes a couple of seconds, then you get a chance to see the results. Will this color show on the image? Is the annotation the right size?

Downloadable software works, but on Linux the quality is so-so. I tried a bunch of different gallery applications, but none of them did what I wanted. Then I caved in and used OpenOffice Impress, the suite's Powerpoint replacement. Impress is amazing in functionality, by the way; my reluctance comes from the fact that it gets really slow when editing large presentations.

When I started creating my gallery, I realized there were really only three types of slides I wanted:
  • Accessory slides - text only, like the title page and the thank you page at the end
  • Vertical layouts for pictures shot in portrait mode
  • Horizontal layouts, for landscape
If I had shot videos, there would have been two more slide types for those, but the layout would presumably have been the same as for the pictures, only with video objects.

I created the title page, then started with a vertical image. I laid it out in the nice presentation layout: title, image on the left, text box on the right. Seemed reasonable enough. Since the next page was the same format, I just copied the one I had, replaced image and text, and got going.

Things got a little tough with landscape images, since they don't leave a lot of room for text. I ended up placing them in the top left corner and wanted the text to flow around them. That turned out to be a problem, since Impress (and, as I recall, Powerpoint) doesn't float text around images the way desktop publishing programs do.

That all went well, though. I had the three slide types, and whenever I added a new picture, I just selected the appropriate slide type, pasted it at the end, replaced the image and the text, and on to the next one.

Now, I told you Impress can be a hog. So after about five slides, the image selection dialog would take 20 seconds to load. Paging through it would take another 10 seconds per page. With over 100 images, that seemed completely impossible to deal with. Fortunately, if you shut down and restarted Impress, things were quicker until you added another five pages.

That was tedious. Oddly, though, paging through the slides was fast, as was adding annotations and editing text. So if I had a script or wizard that generated the slides for me, that would have been marvelous.

I went online and checked how you create wizards for Impress. After all, what I wanted was not much different from a mail merge: get a data source (here, images) and plug it into an output format (here, slides). Not an ounce of information.

Fortunately, I remembered that OpenOffice standardized on XML as its persistence format. I looked at the saved file and realized it was a PKZIP archive. That was to be expected, since PKZIP is also the format underneath Java archives (JAR files).

I unzipped the archive and quickly discerned the various components. Kudos to the OO.o developers for creating a very sensible layout. I looked at the file content.xml, which had the slides in it (as a quick grep with the text on one of the slides revealed). Unlike some XML formats that are binary CDATA stuck into an XML wrapper, this one was very much in the spirit of eXtensible Markup - you could easily see how it worked.
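
Stripped down to the interesting bits, a slide in content.xml looks roughly like this (simplified and from memory, with made-up file names, so don't take the attributes literally):

<draw:page draw:name="page5">
  <draw:frame svg:x="..." svg:y="..." svg:width="..." svg:height="...">
    <draw:image xlink:href="Pictures/IMG_2041.jpg"/>
  </draw:frame>
  <draw:frame svg:x="..." svg:y="...">
    <draw:text-box>
      <text:p>Caption text goes here</text:p>
    </draw:text-box>
  </draw:frame>
</draw:page>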

So I wrote a little script that takes a template file and a set of images, and creates page after page of content. Then it saves the presentation, and you get to load it and modify it: add annotations, remove the images you don't like, and finally save the whole thing again. Then you can take the presentation and show it on its own, or share it using the Impress exporters - there is one for Flash, one for PDF, and one for HTML slideshows. Amazing stuff!
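
For the curious, the core of such a script can be quite small. Here is a sketch in Python (not my exact script): it assumes a hand-made template.odp whose single slide shows a placeholder image called Pictures/placeholder.jpg, clones that slide once per photo, and packs everything back up. All the file names are assumptions for illustration.

#!/usr/bin/env python3
# Sketch only: clone a placeholder slide in an ODP template, once per photo.
# "template.odp" and "Pictures/placeholder.jpg" are assumed names; the template
# is a hand-made presentation whose single slide shows that placeholder image.
import os, re, sys, zipfile

PLACEHOLDER = "Pictures/placeholder.jpg"

def build_gallery(template, output, images):
    with zipfile.ZipFile(template) as zin:
        entries = {name: zin.read(name) for name in zin.namelist()}

    content = entries["content.xml"].decode("utf-8")
    manifest = entries["META-INF/manifest.xml"].decode("utf-8")

    # Find the slide (a draw:page element) that shows the placeholder picture.
    pages = re.findall(r"<draw:page\b.*?</draw:page>", content, re.S)
    template_page = next(p for p in pages if PLACEHOLDER in p)

    new_pages, new_manifest = [], []
    for path in images:
        name = "Pictures/" + os.path.basename(path)
        entries[name] = open(path, "rb").read()   # ship the photo inside the archive
        new_pages.append(template_page.replace(PLACEHOLDER, name))
        new_manifest.append('<manifest:file-entry manifest:media-type="image/jpeg" '
                            'manifest:full-path="%s"/>' % name)

    # Swap the placeholder slide for the generated ones and register the photos.
    entries["content.xml"] = content.replace(
        template_page, "".join(new_pages)).encode("utf-8")
    entries["META-INF/manifest.xml"] = manifest.replace(
        "</manifest:manifest>",
        "".join(new_manifest) + "</manifest:manifest>").encode("utf-8")

    with zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as zout:
        # ODF wants the "mimetype" entry first and uncompressed.
        zout.writestr("mimetype", entries.pop("mimetype"), zipfile.ZIP_STORED)
        for name, data in entries.items():
            zout.writestr(name, data)

if __name__ == "__main__":
    build_gallery("template.odp", "gallery.odp", sys.argv[1:])

Impress will want unique page names and sensible image sizes, so a real script needs a bit more polish than this, but the zip-and-XML surgery above is essentially all there is to it.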

The Live CD Web Server Project (Ongoing Series - Part I)

Imagine burning a CD with your latest and greatest web site, putting it into the computer's CD drive, turning it on, and magically having your web server running. No installation, no configuration, no login - no viruses, no persistent hacking, no version conflicts.

The idea came to me when I was running a large computer cluster for a three-tier application. The front-end web servers were always the worst problem: they handled all the load, they absorbed the majority of the attacks, and they were the entry point for vulnerabilities and denial-of-service abuse.

I had tons of problems to deal with, so I never got to realize this idea: take the software we run on the web servers, make it into a CD, tear out the hard drive from the servers, and make the whole thing run from read-only drives. Stuff it and forget it.

I recently started thinking about the project again. I wanted to take an old machine I had sitting at home and turn it into a minimal web server. It had to run LAMP and Joomla, but not allow persistent updates to the database. The live CD idea sounded like a perfect match, so I started looking into the available distributions.

Turns out nobody seems to have started a project in this direction, despite repeated calls for one. There are tons of Linux distributions on Live CD, but they all ship fixed content: someone puts the content together, and the CD exists to showcase that creator's efforts.

2010-10-20

Creating an Encrypted Subversion Repository on Linux

Why?

I have my source code on a server in the cloud. That makes perfect sense - I want to have my code accessible from everywhere, even if the only person accessing the repository is my own self. Access is secured using SSH with PKI - only whoever has the private key can access the system, no passwords allowed.

While I feel pretty secure about access, it bugs me that the source code is not encrypted at rest. Whoever gains access to a copy of the repository (for instance, from a backup) has the code in cleartext. That's absolutely not good. On the other hand, setting up an encrypted repository seemed like too much of a hassle, and I couldn't find anything online about how to do it.

One rainy day (yes, we have those in Southern California, and we look at them like people in Hawaii look at snow) I decided I had enough of it. I wasn't going to take it anymore. I had to do it.

What Not?

When setting up my encrypted repository, I wanted to avoid the most common mistake: a repository that the machine itself can decrypt on its own. You see, the problem with most drive encryption setups is that they store the key with the hardware. If you do it that way, the encryption is pretty pointless.

You could set up encryption so that only people with login access to the machine (who also know the password) can decrypt the repository. That approach works well for encrypted home directories, but my source code access doesn't use passwords at all.

So, whatever I did, I needed to pass the credentials (or the path to them) with the request itself. The request would provide location and password, and that would be sufficient to unlock the encrypted file.

How?

My ideal scenario was simple: a Truecrypt repository on the server with SVN (Subversion) access. I base the whole description on this combination, and the peculiarities of both come into play at several points.

I chose Truecrypt over, say, CryptFS because the repository is a single file. It is completely opaque to the intruder, and I can even set it up so that it's not obvious the file is a Truecrypt container. (For instance, I could call it "STARWARS.AVI" and make people think it's a bootleg copy of a movie.) With most crypto filesystems, encryption is per file, which means the file names and the existence of individual files (and directories) are visible.

I chose Subversion over, say, git because... well, because my repo is already in Subversion, and because SVN has this neat remote "protocol," which consists of opening a secure connection to the server and executing the commands there, directly on the files, without a special network service involved.

Tricks

The first part is very, very simple: after installing truecrypt and subversion (as well as ssh, which you should already have) you need to create a Truecrypt container. Choose a file container and make it big enough to fit the largest size your repository will ever grow to; it will eventually hold a Linux (ext3) filesystem, but we'll create that separately.

To create the container, simply type truecrypt -t -c on the server. That will start the interactive dialog that creates the encrypted file. Give it any name (I assume here you called it STARWARS.AVI) and any location (it doesn't really matter). The defaults are fine for everything; you'll provide the file name, the file size, and "none" for the filesystem. When it comes to the password, choose something really, really good.

[Note: volume creation on the client has the advantage of being able to use the graphical interface, which helps a ton.]

Since we didn't select a filesystem, we have to create one. To do that, we need to learn a little about truecrypt internals - but we'll use it in a moment for the actual subversion trick, so it's not too bad. Here goes: truecrypt creates a series of "devices" that the system uses to talk (indirectly) to the encryption core. That's done because truecrypt lives entirely in user space, and hence encryption is not available on a kernel level.

The devices are called mappers and reside in /dev/mapper/truecrypt*. To Linux, the devices behave like regular block devices (read: like a hard drive). Once you have a map done, you can do anything with it that you would normally do with a drive, including formatting.

To map, you invoke truecrypt with the container name and the no filesystem option:

truecrypt --text --password= STARWARS.AVI --filesystem=none --keyfiles= --protect-hidden=no

(Long options this time to save the pain of explaining.)

Now you should have a mapper mounted - if you type mount, you should see one line that starts with truecrypt. Remember the number after aux_mnt (typically 1 on your first try).

Now we create the filesystem:

mkfs.ext3 /dev/mapper/truecrypt1

(You may have to be root to do that - in which case add "sudo" at the beginning of the line.)

AutoFS

The "evil" trick that we are going to use next is a dynamic filesystem mounted automatically. AutoFS is a package that allows you to declare that certain directories on your system are special and access to them requires special handling. For instance, I use autoFS to connect to my SSH servers. The directory /ssh on this machine is configured to open an sshfs connection to whichever server I name. So, if I write:

ls /ssh/secure.gawd.at

I get the listing of the home directory on the server secure.gawd.at. (The server doesn't exist, but that's beside the point.)

In this case, we will use what is called an executable map: autoFS invokes a script that you name and takes its output as the configuration options it needs to mount the filesystem. In our case, the script will first open the truecrypt container and then pass the mount options for the underlying filesystem back to autoFS.

Once more, the chain of events: ls /svn/magic-name → autoFS → mapper script → truecrypt mapper → mount options → mount

I wrote the script in Tcl, which is still my favorite shell to use. It requires nothing but Tcl and Tcllib, both packages available in pretty much all Linux distributions (although the Debian people, noted bigots, require you to specify a version number). You can download it here.
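
If you just want the gist without downloading it, here is roughly what such an executable map has to do, sketched in Python rather than Tcl. The truecrypt flags and the mapper number are assumptions on my part, not a transcript of the real script.

#!/usr/bin/env python3
# Sketch of an autofs "program map": autofs runs this with the requested key
# (e.g. "repo@secure") as its only argument and mounts whatever map entry the
# script prints on stdout.
import os, subprocess, sys

key = sys.argv[1]                        # e.g. "repo@secure"
name, _, password = key.partition("@")   # map name left of the "@", password right
container = os.path.realpath("/etc/auto.enc.d/" + name)

# Map the container without mounting a filesystem - same idea as the manual step.
subprocess.check_call(
    ["truecrypt", "--text", "--non-interactive", "--password=" + password,
     "--filesystem=none", "--keyfiles=", "--protect-hidden=no", container])

# Assumes this is the only mapped volume, hence mapper number 1.
print("-fstype=ext3 :/dev/mapper/truecrypt1")

The key point is the last line: whatever map entry the script prints on stdout is what autoFS actually mounts.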

Copy the script into the /etc directory. If you haven't already, install autofs on your machine. Now edit the /etc/auto.master file and add a line for this new map. Let's call the map auto.enc and tie it to the /svn directory.
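
In auto.master, that line ends up looking something like this (paths as chosen above):

/svn  /etc/auto.enc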

A little magic setup: you have to create the directory (sudo mkdir /svn); then you have to make the map executable (sudo chmod +x /etc/auto.enc); finally, restart autoFS so that it looks at your map.
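
Spelled out, and assuming the Debian-style init script for the restart (adjust for your distribution):

sudo mkdir /svn
sudo chmod +x /etc/auto.enc
sudo /etc/init.d/autofs restart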

Now we have to tell the script which containers map to which names. To make my life easier, I chose to use a .d directory in /etc. If you create a directory /etc/auto.enc.d (sudo mkdir /etc/auto.enc.d), then every symbolic link inside it is considered a map: the name of the link is the name of the map, and the location it points to is the file container.

If you want to use the container /enc/data/STARWARS.AVI under the symbolic name repo, then you would do this:

ln -s /enc/data/STARWARS.AVI /etc/auto.enc.d/repo

Grand Finale: Credentials

Now the big question: how does the mapper find the password? That was the puzzle from the beginning. The way I solved it was to add the password to the name of the directory. Crazy guy!

If you followed the instructions above, then whenever you access /svn/anything, the map is consulted. The map script takes the name passed in and looks for an "@" sign, which it treats as the separator between map name and password. So, if you wanted to access the repository repo with the password "secure," you would type in

ls /svn/repo@secure

The script would mount the mapper and tell autoFS to mount the directory as an ext3 file system.

But, but, but!!! You are passing the credentials in cleartext! That's bound to be terribly bad!!!

Well, yes and no. The transmission between server and client is SSH, so nobody can see the password in the clear. On the server, the password is in the clear, but it is not logged anywhere (unless you tell SVN to log everything). On the other hand, someone who happens to be on the server when a request comes in can also look at the decrypted data, since it is mounted for that period of time. So if an attacker can look at the password, the attacker might as well look at the files it protects.

Epilogue

Let's just assume you got everything working - you should now be able to create a repository. svnadmin works on local paths, not URLs, so run it on the server against the automounted directory:

ssh svn svnadmin create /svn/repo@secure

Now you should notice that the file STARWARS.AVI has been modified. If you mount it using truecrypt, you will see a series of files in there - files that SVN will continue using from now on whenever you access the encrypted repository. Hooray!

Notes and Addenda

1. I set the expiration timeout for the directories low, but not incredibly low - at ten seconds. You do that by specifying the "timeout" parameter in the auto.master file. That way, I can do an svn update; svn commit cycle without requiring a second mount. You can play with the parameters yourself.
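
In auto.master that just means appending the option to the line from earlier:

/svn  /etc/auto.enc  --timeout=10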

2. The encryption scheme could be improved easily by using keyfiles instead of passwords. To do so, you would place a keyfile on a remote location (a web server, maybe) and require the script to get that resource, decrypt it using the password provided, and then use that as the keyfile. The advantage is that you require three pieces of information: the truecrypt container, the encrypted keyfile, and the password to the keyfile, to do your bidding.

3. Disclaimer: this setup works for me, but that's because I munged around for dozens of hours until I figured out all the options and configuration items necessary. If it doesn't work for you, don't sue me. If it works, but it stops working after a while and your brilliant source code is lost forever, don't sue me. Proceed with caution, always make backups, and never rely on the advice of strangers. Mahalo.

2010-09-25

The YouTube Conspiracy in Abby/CClive

There is a command line utility available on all Ubuntu derivatives called cclive. If you install it (sudo apt-get install cclive), you can give it a YouTube URL and it will download the video on it. I love using it for backup purposes - I upload a video from the camera, perform all my changes on YouTube, and then cclive the outcome for posterity. Just in case YouTube "forgets" about my latest cam ride.

There is also a GUI for cclive, called Abby. Abby is more than just a frontend for cclive, though: it also helps with pages that have multiple videos on them - playlists or RSS feeds. Abby is hosted on Google Code and written in Qt/C++.

I recently started surfing, so I scoured YouTube for surf instruction videos. There is a set of 12 introductory videos by two Australian pros on there, and I decided to download them to take to the beach. My N900 gladly displays YouTube videos on its gorgeous screen. Unfortunately, the T-Mobile coverage there is fairly bad, so a downloaded video was the only real option.

BPM Detection in Linux

When you do a lot of cardio workouts, it is really good to have music that beats to your rhythm. The pulsating sound gives you energy and pace, both excellent ways to make a good workout great and to make time pass faster. When I get new music on my player (a Sansa Clip+ - the almost perfect player for a Sporty Spice), life is good. An hour of running is gone before I even know it, and when I look at the calorie count, I feel like Michael Phelps freshly crowned with Olympic gold.

At first I would stumble across music that matched the pace. I do lots of different kinds of workouts, so certain songs would work with different segments. I have a running pace, a hiking pace, a mountain climbing pace, a cycling pace, a weight lifting pace, a spinning pace, etc. I would get used to certain songs in certain parts, but that would get old, fast.

Then I got used to counting beats. I would look at the big clock in the gym and count beats for 15 seconds. That would give me a general idea of what I could use a song for. My favorite spinning song, for instance, was "Hazel Eyes," so anything that had the same beat count would be a good replacement.

Then I started getting bored with this random approach and realized I had a library of hundreds of CDs ripped onto my computers. I just had to detect the beat automatically, and then I could simply search for a specific BPM count and get all the matching songs.
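
As an illustration of what that automatic detection can look like, here is a small Python sketch built on the aubio library's beat tracker. The calls follow aubio's own tempo-detection example as best I recall it - treat it as a starting point, not as the tool I settled on.

#!/usr/bin/env python3
# Sketch: estimate the BPM of an audio file with aubio's beat tracker.
# Verify the calls against the aubio version you actually have installed.
import sys
from aubio import source, tempo
from numpy import diff, median

def estimate_bpm(path, samplerate=44100, win_s=1024, hop_s=512):
    stream = source(path, samplerate, hop_s)
    detector = tempo("specdiff", win_s, hop_s, stream.samplerate)
    beats = []
    while True:
        samples, read = stream()
        if detector(samples):                  # a beat fell inside this hop
            beats.append(detector.get_last_s())
        if read < hop_s:                       # end of file
            break
    if len(beats) < 2:
        return 0.0
    return float(median(60.0 / diff(beats)))   # median beat interval -> BPM

if __name__ == "__main__":
    for song in sys.argv[1:]:
        print("%6.1f BPM  %s" % (estimate_bpm(song), song))

Run something like this over the whole library, dump the output into a text file, and a grep for the BPM range you want gives you exactly the lookup I was after.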

2010-09-24

The Rise and Fall of Internet Browsers

It amazes me how, since their very inception, Internet browsers have been subject to periodic meteoric rises and subsequent falls. They do so a lot more than other pieces of software, like operating systems or word processors. It seems people are much more willing to throw out their browsers than virtually any other kind of software.

It all started with the venerable grandfather of them all, Netscape Navigator. Marc Andreessen, the ur-type of the "smart kid with an idea brighter than even he thinks it is, who goes on to believe he's the smartest person on the planet because he got lucky with that idea", and his team created the software and threw it out into the world. Instant success, huge company, enormous IPO. But it was a horrible piece of software, and it got more and more horrible as time wore on.

Netscape was mired in the conflict of the dot-com days: how do you monetize a piece of software without charging the user? It would take almost ten years for Google to show us, but back in the day, it meant shareware. Netscape was selling servers, and the browser was a loss leader. It got all the attention a loss leader gets - it grew more and more bloated, supporting more and more reasons for people to upgrade their servers (and not buy them from anyone else), and in the process it got slower and slower.

Finally, in one of his last acts of Imperial Fiat, Bill Gates decreed that the Internet was not a fad and that Microsoft needed to get in on the action. A few years and a ton of lawsuits later, Netscape was dead (or bought by AOL, which is pretty much the same thing) and Internet Explorer was the only dominant figure in the landscape.

Then IE started showing problems. Not bloat and slowness, although those became more apparent. No, it was security that became the big issue. IE's security model was cooperative and not designed for the abusive exploits of Internet Mafia conglomerates. As a result, surfing certain types of "shady" sites would invariably land your machine in zombie territory, or at least get you a virus infection or two.

When the Mozilla Foundation announced it was working on a brand new browser named Firebird, even the most hopeful were not easily convinced. Navigator was a monster, written by people who needed to get things done no matter how unmanageable the result, and Firebird would have to be a rewrite from scratch to compete.

But it did. Renamed Firefox (sadly), it began a march of conquest that carried it to the top spot in the browser stats. Nowadays, almost half of all Internet users choose Firefox, while IE has only a little more than a third of the market.

Firefox was helped by a series of advantages: it was much faster than IE; it was far more secure; and it had an extensive extension system with loads of useful features - useful for users, whom IE had traditionally ignored in favor of usefulness for companies. Only lately has Firefox started to show weakness, and from the most unlikely of sources.

I invested much time in my Firefox setup. I have the extensions I want, synchronized across my two dozen machines (don't ask) using a sync extension. I have Firefox customizations for nearly everything, and I write my own Greasemonkey scripts. Yet, I started using Google's Chrome browser (Chromium on this laptop). Why? Because Chromium uses "one process per tab".

Why does it matter? Why is it so important to me that each tab have its own process? The answer is Flash. You see, Flash is a giant memory leak. Whenever I land on a page that has Flash on it, memory gets allocated (by the Flash plugin) and never released. After a few hours of heavy browsing, my browser slows down to a crawl. Another few hours, and it's completely unusable. After a day, I have to restart it, and the process of freeing up memory may take upwards of 10 minutes.

Flash on Linux, of course, is an afterthought. The way Adobe treats its Linux users shows all the weaknesses of the technology in a merciless way. First, there is the closed nature of Flash: Linux users cannot suggest modifications or fix bugs, as they do with other software, because the plugin is proprietary.

Then, there is the "one-size-fits-all" approach of the plugin. I find Flash used for controls on web pages (especially ones that require notifications), for e-cards, especially of the inspirational or funny kind, and for online videos. Those three use cases are totally different, and using the same software for each of them is only in the interest of the maker of the software, Adobe, not in the interest of the user.

So, for now, I am forced to leave Firefox through no fault of its own and adopt a different (and very capable) browser, simply because I can't get Flash to behave in FF.