Top Chef: Cloud Edition
Lately at Pure Charity I’ve been using Chef to build our infrastructure on Amazon’s EC2. After using Chef for the last few weeks, I have come to fully embrace it, and I will never again build a Linux box by hand. I know that sounds pretty dramatic, but if you build servers (in the cloud or on bare metal) you NEED to do yourself a favor and look at Chef. Over the coming weeks I’m going to write a series of posts about Chef. This week I’m going to explain what Chef is and why it matters.
Chef is the Warden

Chef assists with configuration of your servers (I’m going to call these nodes) in your collection of servers (I’m going to call the collection a stack). It keeps them in line, kind of like a prison warden. I know what you might be thinking at this point: “I’ve already got scripts that I’ve written, I don’t need a tool to do this. Heck, I’ve even got my scripts checked into version control and shared with the rest of my team.”
Kudos if you are already taking the step to automate your own processes in scripts, and bravo if you are using version control to manage and share these scripts. Do your scripts query servers based on roles to get the list of servers they will be acting on? Maybe LDAP or something similar solves that problem for you, but will your scripts adapt to the software or even hardware specifics of the box? For example, will your scripts check the number of processors on your box and change the Nginx worker_processes setting to accommodate it? What if your database box is on an Ubuntu system, but your application servers are on Red Hat boxes? Will your scripts know the difference?
OHAI!

Chef gives your scripts access to environmental specifics through the Ohai system profiler. Ohai, besides being an awesome name, will give you NICs and transfer stats, users and groups, which versions of Java/Ruby/etc. are installed, and memory information, among other details. This means that your scripts become deterministic: they decide based on what the box actually reports rather than on what you assumed when you wrote them.
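To make that concrete, here is a rough sketch in plain Python (not Chef's recipe DSL) of a script that adapts to a node profile. The two profile dictionaries only mimic the kind of data a profiler like Ohai reports; their values, the helper names, and the package mapping are my own assumptions for illustration.

```python
# Hypothetical node profiles, loosely shaped like profiler output.
# The values are made up for illustration.
UBUNTU_DB = {"platform_family": "debian", "cpu": {"total": 2}}
REDHAT_APP = {"platform_family": "rhel", "cpu": {"total": 8}}

# The Apache web server package is named differently per platform family.
APACHE_PACKAGE = {"debian": "apache2", "rhel": "httpd"}


def render_worker_processes(profile):
    """Return an Nginx config line sized to the node's processor count."""
    return "worker_processes {};".format(profile["cpu"]["total"])


def apache_package_for(profile):
    """Pick the package name that matches the node's platform family."""
    return APACHE_PACKAGE[profile["platform_family"]]


if __name__ == "__main__":
    for label, profile in (("db", UBUNTU_DB), ("app", REDHAT_APP)):
        print(label, render_worker_processes(profile), apache_package_for(profile))
```

The point is that the script itself contains no per-box special cases; everything box-specific comes from the profile it is handed.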
Okay, so you can get information about the box that you’re deploying to and your scripts can account for that, but why does that require a server? The server is for your configuration management. What does that mean? On the Chef server you can set variables that pertain to boxes in a particular role, or maybe to an entire environment.
Think about that for a second. No more magic variables, random IP addresses, or having a script to run in prod and a script to run in staging. Chef takes the configuration out of your scripts and centralizes it in a server. The biggest benefit to this is that each of your nodes can use the Chef server as the authority for its configuration.
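One way to picture this is as a layered lookup: settings attached to a role, with settings attached to an environment laid on top. The sketch below is plain Python with hypothetical attribute names and hostnames, and the "role first, then environment" merge is a deliberate simplification of my own; Chef's real precedence rules are richer than this.

```python
# Hypothetical centrally stored settings; names and values are illustrative only.
ROLE_ATTRIBUTES = {
    "app": {"app_port": 8080, "log_level": "info"},
}
ENVIRONMENT_ATTRIBUTES = {
    "staging": {"db_host": "db.staging.example.com", "log_level": "debug"},
    "production": {"db_host": "db.prod.example.com"},
}


def resolve_config(role, environment):
    """Start from the role's settings, then let the environment override them."""
    config = dict(ROLE_ATTRIBUTES[role])  # copy so the shared defaults stay untouched
    config.update(ENVIRONMENT_ATTRIBUTES[environment])
    return config


if __name__ == "__main__":
    print(resolve_config("app", "staging"))
    print(resolve_config("app", "production"))
```

The same script runs in staging and in prod; only the data it is handed differs.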
So what is the benefit to using Chef over some other technology? The greatest difference is that you get everything under one roof: configuration management, automated building of servers from a base image, and a searchable database of all of your nodes. Plus, recipes can be easily shared because they (should) have no information about your configuration in them. Opscode has its community cookbooks, as does 37signals.
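The searchable database of nodes is what answers the earlier question about getting a list of servers by role. As a rough illustration only, here is that idea in plain Python over an in-memory list; the node records, field names, and the `find_nodes` helper are hypothetical and are not Chef's search API.

```python
# Hypothetical node records, standing in for what a central server would store.
NODES = [
    {"name": "app1", "roles": ["app"], "environment": "production", "ipaddress": "10.0.0.11"},
    {"name": "app2", "roles": ["app"], "environment": "production", "ipaddress": "10.0.0.12"},
    {"name": "app3", "roles": ["app"], "environment": "staging", "ipaddress": "10.0.1.11"},
    {"name": "db1", "roles": ["database"], "environment": "production", "ipaddress": "10.0.0.21"},
]


def find_nodes(nodes, role, environment):
    """Return the nodes that carry the given role in the given environment."""
    return [
        node for node in nodes
        if role in node["roles"] and node["environment"] == environment
    ]


if __name__ == "__main__":
    # Example use: collect the addresses a load balancer should send traffic to.
    for node in find_nodes(NODES, "app", "production"):
        print(node["name"], node["ipaddress"])
```

With a lookup like this, a load balancer config never needs hard-coded IP addresses; it asks for every production app node and uses whatever comes back.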
In my next post I will examine a cookbook for an application using the deploy resource.