A better approach to restarting a Mongrel cluster

At Karabunga, we use Mongrel a lot. As our Rails applications grow larger, the startup time of a Mongrel process becomes significant.

As you know, restarting a Mongrel cluster is a matter of issuing this command: /etc/init.d/mongrel_cluster restart. Here’s what happens for a cluster of 4 Mongrels:

[Figure: A typical Mongrel cluster restart]

For one of our applications, the “stop” time is about 2 seconds and the “start” time is somewhere around 10 seconds for each Mongrel process. This means that for a cluster of 15, we have a window of about 3 minutes (15 × 12 seconds) during which at least 1 and often many Mongrels are inaccessible. Worse, there’s even a point where no Mongrel is running at all. In a high-traffic production environment, this is not acceptable.
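To make the arithmetic concrete, here is a rough estimate of that window. The method name and its parameters are made up for illustration; the numbers are the ones quoted above.

```ruby
# Back-of-the-envelope estimate of how long a full cluster restart takes
# when every Mongrel is stopped and started in sequence.
# restart_window_seconds is a hypothetical helper, not part of mongrel_cluster.
def restart_window_seconds(mongrels, stop_seconds, start_seconds)
  mongrels * (stop_seconds + start_seconds)
end

window = restart_window_seconds(15, 2, 10)
puts "#{window} seconds (#{window / 60} minutes)"
```

The total is the same whether the script stops everything first and then starts everything, or handles the Mongrels one by one. What changes is how many of them are unavailable at any moment within that window.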

With some hacking, I managed to modify mongrel_cluster_ctl (the script called by /etc/init.d/mongrel_cluster) to avoid the above scenario. My hack makes sure that at most 1 Mongrel is down at any given time. Here’s a graphical representation of my hack:

[Figure: A Mongrel cluster restart with my hack]
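The idea can be sketched in a few lines of Ruby. This is only an illustration of restarting one Mongrel at a time, not the actual change to mongrel_cluster_ctl; the method name, the port numbers and the two callbacks are assumptions made for the example.

```ruby
# Rolling restart sketch: at most one Mongrel is down at any given time.
# rolling_restart, the ports and the stop/start callbacks are hypothetical;
# they stand in for whatever the real control script does per instance.
def rolling_restart(ports, stop:, start:)
  ports.each do |port|
    stop.call(port)   # take exactly one Mongrel down...
    start.call(port)  # ...and bring it back up before touching the next one
  end
end

# Record the order of operations instead of touching real processes.
log = []
rolling_restart(
  [8000, 8001, 8002, 8003],
  stop:  ->(port) { log << "stop #{port}" },
  start: ->(port) { log << "start #{port}" }
)
puts log
```

In a real script, the two callbacks would shell out to the commands that stop and start a single Mongrel instance, and the start step would need to block until that Mongrel is actually serving requests. Otherwise the next one could be stopped too early.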

I believe this is much more efficient. Moreover, when implemented with Swiftiply, downtime is reduced to zero since Swiftiply will detect a “dead” Mongrel and route requests to one that is alive.


4 Comments on "A better approach to restarting a Mongrel cluster"

  1. Jon Wood
    04/01/2008 at 2:27 pm Permalink

    Doesn’t this mean that you’re running two different versions of your application, admittedly for a very short time?

    In most cases that’s probably not a problem, but it’s still something to consider.

  2. Carl Mercier
    04/01/2008 at 3:09 pm Permalink

    Yes it does, but in my case, it wasn’t much of an issue. The big issue was having my app down. I am aware of Seesaw, but I didn’t want to go through all the configuration it requires.

Trackbacks

  1. [...] this, the Rails community has come up with a couple of approaches to mitigate the problem: Seesaw, one-at-a-time restarts, ...

