Zero-downtime deployment for Rails

Over the past few months, alongside developing the major BIME v6 release, we spent some time improving our deployment process. It is important for us to minimize the time between a fix or a feature being ready for production and its actual deployment.

We started with Continuous Integration software that handles most of the build process. We chose TeamCity as the CI environment, and began by creating build configurations for the builds we needed at the time.

With TeamCity, most of the continuous part of the process is automatically handled. It checks the VCS changes and triggers the build for you. The challenge remaining is the following:

How do you achieve a zero-downtime deployment for a Ruby on Rails application?

The build configuration is divided into three parts:

  • Check out the source
  • Run the tests
  • Package, publish the artifact

The first two parts are easy enough, as they are well integrated into TeamCity. The checkout process is automatic, and the tests can be run with a Rake task.

Then we wrote a custom script that zips the sources, pushes the archive to Amazon S3 and writes the new version number (the commit hash) to a key/value database (Amazon DynamoDB in our case):

 zip -r ../rails.zip *
 aws s3 cp ../rails.zip s3://bucket/rails4/rails.zip
 aws dynamodb put-item --table-name version --item "{\"key\":{\"S\":\"rails_production_timestamp\"},\"value\":{\"S\":\"$BUILD_NUMBER\"}}"
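
The hand-escaped JSON in the put-item call is easy to get wrong. As a possible alternative, the same payload can be generated from Ruby. This is only a sketch: it builds the argument list for the aws CLI without running it, and the helper names version_item_json and put_version_command are illustrative, not part of the script above. The 'dev' fallback for BUILD_NUMBER is likewise an assumption for running it outside TeamCity.

```ruby
require 'json'

# Builds the DynamoDB item in the attribute-value format expected by
# `aws dynamodb put-item --item` (each value is tagged as a string, "S").
def version_item_json(key, version)
  JSON.generate({ 'key' => { 'S' => key }, 'value' => { 'S' => version } })
end

# Returns the command as an argument array, so it can be run with
# system(*cmd) without any shell quoting or escaping.
def put_version_command(table, key, version)
  ['aws', 'dynamodb', 'put-item',
   '--table-name', table,
   '--item', version_item_json(key, version)]
end

if __FILE__ == $PROGRAM_NAME
  # 'dev' is a hypothetical fallback when BUILD_NUMBER is not set.
  cmd = put_version_command('version', 'rails_production_timestamp',
                            ENV.fetch('BUILD_NUMBER', 'dev'))
  puts cmd.last
end
```

Passing the array to system(*cmd) bypasses the shell entirely, so quotes inside the version string cannot break the command.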

After this step, TeamCity considers the build a success.

At this point, the responsibility for updating the code on the production servers is left to the servers themselves. Each server that embeds the Rails application will auto-update via a simple process.

At server startup, a script creates a cron job that runs every 30 seconds (cron itself schedules at one-minute granularity, so a 30-second interval is usually achieved with a second entry delayed by sleep 30). This cron job simply triggers a Rake task that checks the current rails_production_timestamp in DynamoDB and compares it with the version of the Rails app currently running. If the two differ, the task pulls the artifact from Amazon S3, unzips it to a new folder and updates the symbolic link to point to the new version.

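 class RailsUpdater
   # File holding the version of the code currently deployed on this server
   VERSION_FILE = '/home/app/bimeio/rails.version'

   # Automatic update for Rails: compares the version published in DynamoDB
   # with the deployed one and re-runs the startup script when they differ.
   def self.check_update
     dynamo_db = AWS::DynamoDB.new(
       access_key_id: AwsHelper::DYNAMO_ACCESS_KEY,
       secret_access_key: AwsHelper::DYNAMO_SECRET_KEY
     )
     table = dynamo_db.tables['keyvalue']
     table.load_schema

     # Version published by the build for this environment
     item = table.items["rails_#{Rails.env}_timestamp"]
     dynamo_version = item.attributes['value']

     # Deployed version: the first line of the version file
     current_version = File.open(VERSION_FILE, 'r', &:readline).strip

     # exec replaces the current process with the startup script
     exec '/etc/my_init.d/startup.sh' if current_version != dynamo_version
   end
 end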
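
The startup script itself is not shown here. As a rough sketch of its last step, the symbolic link can be switched so that it never disappears, even briefly. The helper names activate_release and record_version, and the ".tmp" suffix, are assumptions made for illustration; downloading and unzipping the artifact are left out.

```ruby
require 'fileutils'

# Points `current_link` at `release_dir` without a moment where the link is
# missing: a temporary symlink is created, then renamed over the old one
# (rename replaces the destination atomically on POSIX filesystems).
def activate_release(release_dir, current_link)
  tmp_link = "#{current_link}.tmp"      # hypothetical temporary name
  FileUtils.rm_f(tmp_link)              # clean up after an interrupted run
  File.symlink(release_dir, tmp_link)
  File.rename(tmp_link, current_link)
end

# Records the deployed version so the next check can compare against it.
def record_version(version_file, version)
  File.write(version_file, "#{version}\n")
end
```

Deleting the old link and then creating the new one would leave a short window in which the application path does not exist; renaming a ready-made link over it avoids that.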

As we use Phusion Passenger for the app server, a simple touch tmp/restart.txt will restart the app server on the new version without affecting client sessions.
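
The same restart signal can be sent from Ruby, for instance at the end of the update task. This is a minimal sketch: request_passenger_restart is a hypothetical helper name, and app_root is assumed to be the directory that the symbolic link points to.

```ruby
require 'fileutils'

# Asks Passenger to reload the application by updating the modification
# time of tmp/restart.txt under the application root.
def request_passenger_restart(app_root)
  restart_file = File.join(app_root, 'tmp', 'restart.txt')
  FileUtils.mkdir_p(File.dirname(restart_file)) # tmp/ may not exist in a fresh release
  FileUtils.touch(restart_file)
  restart_file
end
```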

Conclusion

By leaving the responsibility for building the artifact to TeamCity, and for the update process to the server itself, we can manage a fast, reliable zero-downtime deployment for our Ruby on Rails application.

This process gives us the ability to push both fixes and features quickly and safely in a very Agile way for the benefit of all our customers.

Yannick Chaze