My Rails (Mongrel) server sometimes dies
It has been a while since I started a Rails project from scratch, so I appear to have missed the Phusion Passenger bandwagon. At first, this seemed like a GREAT idea, since one of the problems I keep running into, and haven't found a really good solution for, is the odd Mongrel failure. This is especially true of a site that, while important, isn't a huge traffic hog. Only after looking at Google Analytics for the week do you find that the site has been down for a few days.
Restarting the Mongrel servers can be a bit of a chore, too. In the past, I have always just relied on my Capistrano deployments to make sure my servers were up and running. There is one problem with that: most clients have no interest in going to the command line and running some incantation to make their server work. They simply want it to run all the time. While I agree that this makes perfect sense, it's not always simple. I may have found the simple answer, but it took a little bit of tweaking.
Passenger makes it simple to run a Rails server via Apache, and avoids the need to stop and start the server manually. When Apache gets a request, Passenger simply spins up an application instance (if one isn't already running) and serves it up. It sounds great on paper, and it is, but you have to do a few things.
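For context, here is roughly what the vhost looks like. This is a minimal sketch; the ServerName and paths are placeholders for your own app:

```
# Minimal Apache vhost for a Rails app under Phusion Passenger.
# ServerName and paths are placeholders; adjust for your own setup.
<VirtualHost *:80>
  ServerName myapp.example.com
  # Passenger picks up the app when DocumentRoot points at its public/ dir
  DocumentRoot /var/www/myapp/public
  <Directory /var/www/myapp/public>
    AllowOverride all
  </Directory>
</VirtualHost>
```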
My Passenger instances lag!
The trick is, Passenger only keeps its instances alive for so long. Once that time passes, they die off to conserve resources. This is not that big of a deal, but it does become a problem for newer sites that are just starting to gain momentum. If a user comes to the site after the default (5 min) idle timeout, that request has to fire up a new instance. I suppose the restart time depends on the server hardware and load, but I don't really see the need to spend more than necessary on hardware to fix this issue. On my server, the lag was ten or fifteen seconds, which seemed like a very long time.
Okay, so I RTFM
It turns out that Passenger has a configuration directive called PassengerPoolIdleTime. It controls how long an instance lives: the number of seconds an instance may remain inactive before it quits.
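For reference, it goes in your Apache configuration (outside any particular vhost); the value shown here is just Passenger's default of five minutes:

```
# In httpd.conf (or wherever your Passenger directives live):
# seconds an application instance may sit idle before it is shut down.
# 300 (five minutes) is the default.
PassengerPoolIdleTime 300
```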
Eureka! That's what I needed. My first guess was to set this to 0, expecting the instance to stay around forever, waiting for a new request. Unfortunately, this isn't how it works. With 0, the docs say only that "application instances will not be shutdown unless it's really necessary". They name some criteria, but I found that there was no real way to tell when the instance was going to die.
A completely kludgey fix
I hate that I resorted to something so kludgey to get this thing to work. I know I should look further, but it's 2:14 AM, and I am just tired of dealing with this, so here is what I ended up doing:
- Set the PassengerPoolIdleTime directive to 600. This will give you a good ten minutes of waiting time.
- Set up a cron job to wget the site every five minutes (a sample crontab entry is below).
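The cron entry is a one-liner; the URL is a placeholder for your own site, and the output is thrown away so cron doesn't mail you every five minutes:

```
# crontab -e
# Hit the site every five minutes so Passenger never idles past 600s.
*/5 * * * * wget -q -O /dev/null http://myapp.example.com/ > /dev/null 2>&1
```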
I have been running it this way for a few hours, and it seems as though everything is working as it should. In order to test it, I have to nose around on the site every few minutes, so those hits will obviously affect the timing. I will know for sure in the morning.
Check those stats!
If you are running Google Analytics, you will have already guessed that this adds hits that shouldn't count against your total: one every five minutes is 12 × 24 = 288 per day. To keep those hits out of your Analytics account, you need only create a filter to exclude them.
I hope this helps any others out there who might be running into the same issue.