The last few weeks have seen lots of infrastructure improvements. We've been preparing our software for new features and tweaking the servers it all runs on. A lot of the work is, well, pretty boring and frustrating because it doesn't directly bring new features to the site. But it has to be done, and for a small team we've built a pretty awesome system. (The rest of this post gets a little technical, so skip back to YellowBot if you aren't interested. :-) )
For example, one of the changes we made recently was to set up our main database server with a redundant fail-over system. Most of the time this is, well, a waste! (We do use it for some of the "background" data processing too.) But when we have to do work on the server, or when something goes wrong with the primary server, it's invaluable.
A few of our servers, including the database server, are missing an IPMI card. The IPMI cards make remote management of the servers a breeze and save us trips downtown when things go wrong. We got a list together of the boxes missing this handy enhancement and got the cards ordered this week, yay! They should show up next week, but to install them we have to shut down the server! That's where the extra database server comes in handy. We can quietly switch the two in the background and then bring down the primary server for maintenance. When we are done, we switch back and work on the backup server, and if we don't screw anything up, you, the user, won't notice anything on the site!
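For the curious, the switch-and-service order described above can be sketched in a few lines of Python. This is only an illustration of the sequence, not real fail-over tooling: the `DbServer` class, the host names `db1` and `db2`, and the helper functions are all made up for the example, and a real switch would also have to deal with replication and client connections.

```python
from dataclasses import dataclass


@dataclass
class DbServer:
    name: str            # hypothetical host name
    role: str            # "primary" or "standby"
    online: bool = True


def swap_roles(a, b):
    """Exchange the primary/standby roles of two servers."""
    a.role, b.role = b.role, a.role


def service(server, other, log):
    """Take `server` offline for work while `other` serves traffic."""
    if server.role == "primary":
        # Never power off the box that is serving traffic: switch first.
        swap_roles(server, other)
        log.append(f"promote {other.name}")
    if not (other.online and other.role == "primary"):
        raise RuntimeError("no healthy primary to fall back on")
    server.online = False
    log.append(f"install card in {server.name}")
    server.online = True


def rolling_maintenance(primary, backup):
    """Service both servers one at a time and return the steps taken."""
    log = []
    service(primary, backup, log)   # backup carries the site meanwhile
    swap_roles(primary, backup)     # switch back to the original primary
    log.append(f"promote {primary.name}")
    service(backup, primary, log)   # backup is the standby again, no switch needed
    return log


if __name__ == "__main__":
    steps = rolling_maintenance(DbServer("db1", "primary"),
                                DbServer("db2", "standby"))
    print("\n".join(steps))
```

The point of the ordering is simply that at every step one server is online in the primary role, which is why nobody visiting the site should notice the work.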
While I'm being hopelessly technical already, I have to recommend a couple of our favorite related tools. First there's Puppet, a configuration management tool we use to keep the configuration on each server sane. Between Kickstart and Puppet we can get a new server ready to serve traffic 10-15 minutes after it's plugged in to power and the network.
After the servers are running, we have to monitor how they are doing (if nothing else, to know when it's time to order more!). For that we primarily use Munin. The name comes from one of Odin's two ravens in Norse mythology, which fly around the world and bring news and information back to him. A few weeks ago I posted an example of the graphs it can make.
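If you wonder where those graphs get their numbers: a Munin plugin is just a small program that prints graph settings when called with the argument `config` and prints current values otherwise. Here is a minimal sketch in Python that reports the one-minute load average. The field name `load` and the graph labels are choices made for this example, not a copy of any plugin we run, and `os.getloadavg()` assumes a Unix-like system.

```python
import os
import sys


def config_lines():
    """Graph metadata that Munin asks for with the 'config' argument."""
    return [
        "graph_title Load average",
        "graph_vlabel load",
        "graph_category system",
        "load.label load",
    ]


def fetch_lines(load):
    """Current value for the 'load' field, one 'field.value N' line."""
    return [f"load.value {load:.2f}"]


def main(argv):
    if argv[1:2] == ["config"]:
        lines = config_lines()
    else:
        # The first element of getloadavg() is the one-minute average.
        lines = fetch_lines(os.getloadavg()[0])
    print("\n".join(lines))


if __name__ == "__main__":
    main(sys.argv)
```

Munin collects those values on a regular schedule and turns the history into graphs, which is what makes it easy to see when a server is running out of headroom.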