The Last Word on Rails Scaling
Posted by Sam
OK, I have to admit that I'm getting a bit tired of all the Rails scaling questions. People seem to think that Twitter's scaling problems are Rails' scaling problems. They aren't. Or, to put it more succinctly: Twitter's scaling problems != Rails scaling problems. Rather than try to convince people that Rails can scale, I'm just going to include a couple of stories about large companies that have scaled Rails to billions of page views a month. Will this finally convince people that Rails scales?
Our first Rails app that scales is brought to us by AT&T. Yellowpages.com was rewritten from a Java app to a Rails app. It serves over 1.4 billion requests a month using 25 servers per data center. That's only 4 more servers than it was using while running under Java. The Java code base weighed in at 125k lines of code while the Rails code base came in at under 20k lines of code, including tests. The much larger Java code base didn't include any tests. The entire site was coded with at most five developers over a three month period. One thing that really stands out with the Rails site is the maintainability. As a developer, would you rather jump into and maintain a site with 125k lines of code or less than 20k lines of code? Over the life of the site AT&T might spend a couple extra grand on servers, but they will more than make it up by having fewer developers and getting new developers up to speed much quicker! And do you think there are more bugs and security issues hiding in 125k lines of code or in way less than 20k? If you want to look at more details you can check out this presentation from RailsConf 2008.
Our next app is brought to us by LinkedIn. LinkedIn built a Facebook app called Bumpersticker that handles 1 billion page views monthly and around 100TB of data each month. This app is obviously also built on Ruby on Rails, otherwise I wouldn't be including it here. ZDNet has a write-up and there's also a video that looks to be from Joyent. This story isn't as interesting as the Yellowpages.com story because it wasn't a rewrite, so it's harder to see the benefits of using Rails as compared to a traditional environment like ASP.NET or Java. I'm including it to point out that you can get the advantages of a great framework, cut development time and costs, and still scale out to a billion page views a month.
So that's it. Can we please stop asking if Rails can scale and stop using these horribly outdated technologies like Java and .Net? Please? Pretty please?
Tags: rubyonrails
Images Belong in the Database
Posted by Sam
Logical Reasons
The more web programming and system administration I do, the more I'm convinced that images and other forms of uploaded content belong in the database. I've pretty strongly suspected this for many years but it's becoming more and more clear that this is just how things need to be. There are many reasons. Some are technical and some are logical, but all signs point to storing images in the database. Below are just a couple.
I'll start with the logical reason first and then get to the technical reasons. Logically all other forms of dynamic content will reside in the database, yet programmer after programmer insists on putting images and other uploaded content on the filesystem. Now you've got some content in the database and it's tied to content on the filesystem. Suddenly the nice clean line is broken. You've blurred the lines on who does what and where it goes. What usually happens is programmers forget to clean up after themselves, so file references are deleted in the database but the files still exist on the file system. You know why this happens? Because you shouldn't have put it there in the first place. Repeat after me - dynamic content in the database and application files in the filesystem. Got it?
Technical Reasons
So now that we've got the logical reasons behind us, what are some of the technical reasons? For starters your database(s) will always be in sync. That's what they are meant to do. That's what they have to do. Your database(s) have to be synchronized. Your site depends on it. It doesn't matter if it's a simple site with a single database server or a huge site with a dozen database servers. You have to keep your database servers in sync. And databases are good at it. It's much, much easier to scale out database servers than file servers. Having all your dynamic content in the database keeps everything in sync.
Some people will wrongly argue that having files in the database will cause slowdowns. To that I say....um maybe, but only if you're doing it wrong. Databases are very fast and have excellent caching, but it's highly likely that multiple web servers can serve images straight from disk faster than a database. Well that's great, but images still belong in the database. And in this case you can have your cake and eat it too. Check out this article about caching images on the file system with Rails. Hrm, one extra line of code lets you cache your image on the file system. Doesn't seem too bad to me. Elegant, simple and lightning fast.
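For what it's worth, here is a rough sketch of the idea in plain Ruby rather than Rails: the database stays the source of truth and the file system is only a cache that can be thrown away at any time. Everything in it is made up for illustration (the `CachedImageStore` class, the `images/<id>.png` layout, and the Hash standing in for an images table); it is not the code from the linked article.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical read-through cache: image bytes live in the "database"
# (a plain Hash here) and are copied to disk the first time they are asked for.
class CachedImageStore
  attr_reader :db_reads

  def initialize(db_rows, cache_dir)
    @db_rows   = db_rows    # stand-in for an images table: id => bytes
    @cache_dir = cache_dir  # directory the web server could serve from
    @db_reads  = 0          # counts how often we actually hit the "database"
  end

  # Return the image bytes, reading from disk when a cached copy exists.
  def fetch(id)
    path = cache_path(id)
    return File.binread(path) if File.exist?(path)

    @db_reads += 1
    data = @db_rows.fetch(id)
    FileUtils.mkdir_p(File.dirname(path))
    File.binwrite(path, data)
    data
  end

  # Drop the cached copy; the next fetch goes back to the database.
  def expire(id)
    FileUtils.rm_f(cache_path(id))
  end

  private

  def cache_path(id)
    File.join(@cache_dir, "images", "#{id}.png")
  end
end
```

The point of the sketch is that losing the cache directory costs nothing: the next request simply rebuilds it from the database, and you call `expire` whenever the image row changes or is deleted.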
The Reason for the Rant
Time after time I'm responsible for deploying other people's crappy software. They invariably come up with some stupid solution that lets them click the checkbox that says their product will work on a load balancer, but they neglect to tell you upfront what ridiculous hoops you have to jump through to get it to work. The most recent product is called Ektron. I can't say what it's like to work with from a programming standpoint, but from an admin standpoint it's a nightmare. Instead of loading the images in the database like they should, they force you to share out files and do this dumb little virtual directory linking to the other servers. It's just annoying and not even close to elegant, but then I have yet to see a CMS that is, so I wasn't surprised. Thankfully the Ruby on Rails team understands what it takes to scale apps and they provide you with a nice foundation. If only the rest of the world would catch up.
Tags: rubyonrails web rant
Engine Yard to Create mod_rubinius
Posted by Sam
In a previous blog I posted a rebuttal of sorts against the DreamHost whine fest. Well at least that's how I saw their blog. The arguments spilled over to another blog posted on Ruby Inside. One of the points that I took issue with was DreamHost saying that the Ruby on Rails folks needed to step up and create something that DreamHost could resell. In their words:
4. Officially support shared hosting environments.
Alastair and I are in agreement that there is already a great way to deploy Rails applications, and it is not Mongrel and [insert favorite web server/proxy solution here]. LiteSpeed is light years ahead of everybody else in Rails support and deployment, and it's also much, much faster than Apache in general. But in all honesty the bigger issue here is that I feel DreamHost needs to step up and put their money where their mouth is. They are the ones making money on hosting, not the Ruby on Rails developers. If they want to make money on Ruby on Rails hosting then they should step up and create a solution.
Well it appears I wasn't the only one who felt that way. Yesterday, the folks at Engine Yard announced that they hired a developer to work on mod_rubinius. Rubinius is a very promising Ruby implementation that is gaining a lot of momentum, and Engine Yard is backing it with the hiring of several developers. It's been written about plenty on the web, but it's nice to see that a hosting company that is making money from Rails hosting is actually putting its money where its mouth is instead of just expecting everybody else to do their work for free and let them reap the benefits.
I couldn't be happier deploying Rails apps on LiteSpeed, but I always welcome alternatives. So well done Engine Yard!
Tags: rubyonrails
New Staging Environment in LiteSpeed
Posted by Sam
In version 3.3.4 of LiteSpeed they have finally added a staging environment for Ruby on Rails. This is a big deal for me because every site I create gets a staging environment. In a previous post I blogged about setting up custom environments for Rails in LiteSpeed, but with the new staging environment the code changes in that blog are no longer required. I'm very happy about this and it's nice to see my only real complaint with LiteSpeed fixed. Although now that they've added the staging option maybe they can expose a web service to add new web sites. It would be great for scripting the setup of new sites!
Tags: litespeed rubyonrails
Easy Ruby on Rails Deployments
Posted by Sam
A couple of days ago a large web hosting company named DreamHost posted a blog about how Ruby on Rails could be improved. Some of what they are wrestling with I also wrestled with when we first started deploying Rails applications. But honestly I figured out how to overcome these problems several years ago so I'm wondering what their problem is. Just to put my money where my mouth is I thought I would touch on their points one by one.
Ruby on Rails needs to be a helluva lot faster. With a proper accelerator it’s nicely usable but without one it’s painful. Ruby itself is a big part of the problem so this one may come down to just simplifying the management of the accelerator technologies, unfortunately. Mongrel seems like a big step in the right direction, even though it’s not Rails-specific. I hope the Rails core developers will be cooperating a lot more closely with Mongrel developers in the future.
OK, where to begin on this. First of all let me say for the record that I don't understand why everybody insists on saying Rails is soooo slow. Judging by all the "Rails is slow" posts all over the net you would think it takes days to load a page, yet these folks handled it just fine. Any technology can fold under traffic and just about any technology, if used correctly, can scale. Is Ruby the fastest language? No, definitely not. Neither is PHP. If you really need speed you need to go with Java or C and you need to code it correctly. A fast language alone is no guarantee of performance. And because PHP has to start up every time a request comes in, it can easily be much slower than Rails.
As for Mongrel being a big step in the right direction, I couldn't disagree more strongly. Mongrel is a nightmare for hosting more than one or two sites. Having to set up clusters that don't grow and shrink based on the load is ridiculous, especially for a shared hosting environment. I can't believe people have latched on to this so much, and if Zed is so brilliant why can't he think of a MUCH better way to handle this? LiteSpeed handles this a million times better. Basically just tell LiteSpeed the maximum number of Rails instances and it handles bringing them up and down to handle the load. Oh yeah, and LiteSpeed is much faster than Apache, so all the way around it's a better setup.
Ruby on Rails needs to more or less work in ANY environment. You can’t just expect your users to set up their servers any which way. There are millions of established systems that cannot simply integrate any bleeding edge technology you think is better this week. If you continue to keep this attitude you are surely shooting yourselves in both feet.
I'm guessing by "any environment" what they are really upset about is that it doesn't work in their environment. Guess what? To support a new technology you often have to make changes. When Java came on the scene you had to add additional application servers, which is far more of a pain than what Rails asks of you. And it does work in pretty much any environment. Your real complaint is that it doesn't work in your broken environment. This is really your problem. There are plenty of places where Rails will work just fine, as David points out in his blog.
You need to maintain backwards compatibility better. Admittedly this is the area where PHP has historically done very poorly, but that’s no reason to not one-up them. Also, Rails is admittedly very young as a development platform and you guys have gotten a LOT of attention very early on. Still, with big hype comes big responsibility. You need to keep the momentum going now.
This is just dumb. Rails is an incredibly new framework and they have great backwards compatibility. In fact, for practical purposes they have full backwards compatibility. If your site doesn't work with the latest version of Rails, just stick with the one you are using. You can have hundreds of different versions of Rails installed at the same time and each app can use whatever version it wishes. This is full backwards compatibility.
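As a rough illustration of what pinning looks like: Rails apps of this era set a `RAILS_GEM_VERSION` constant in config/environment.rb, and RubyGems loads the matching version from whatever is installed. The sketch below mimics that selection in plain Ruby. The list of installed versions, the `pick_rails` helper and the version numbers are all made up for the example; only `Gem::Version` and `Gem::Requirement` are real RubyGems classes.

```ruby
require "rubygems"

# Hypothetical set of Rails versions installed side by side on one server.
INSTALLED_RAILS = %w[1.2.6 2.0.2 2.1.0].map { |v| Gem::Version.new(v) }

# Same shape as the line found in config/environment.rb; the number is illustrative.
RAILS_GEM_VERSION = "2.0.2" unless defined? RAILS_GEM_VERSION

# Illustrative helper: return the installed version that exactly matches
# the pinned one, or nil when that version is not installed.
def pick_rails(installed, pinned)
  requirement = Gem::Requirement.new("= #{pinned}")
  installed.find { |version| requirement.satisfied_by?(version) }
end

chosen = pick_rails(INSTALLED_RAILS, RAILS_GEM_VERSION)
puts "This app runs on Rails #{chosen}"
```

A second app on the same box can pin a different number and never notice that a newer Rails was installed.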
Officially support shared hosting environments. The feeling I get from the Rails community is that Rails is being pushed as some sort of high-end application system and that makes it ok to ignore the vast majority of user web environments. You simply cannot ignore the shared hosting users. In my opinion, the one thing the PHP people did that got them to where they are today is to embrace shared hosting and work hard to make their software work well within it. That means it has to be very lightweight (it may be too late for that in Rails already!), and it has to ‘plug in’ to a wide variety of operating environments with minimal fuss and hassle. Compatibility work like that is not glamorous, exciting, or fun, but it’s gotta be done.
The reason you get the feeling that Rails is some sort of high-end application system is because to some extent you are right. You absolutely can't ignore shared hosting? Really? Seems to have worked OK for Java. Why doesn't DreamHost support Java? Let me guess, because it's too heavyweight? Rails is a full-fledged web development framework. Just like Java, it's always in memory so that it can take advantage of being a long-running process. This is less than ideal for hosting companies that are trying to sell $5 hosting packages, but for building real web applications this is almost a necessity. I'd say not only CAN they ignore shared hosting companies, but they are quite successfully ignoring you. Rails doesn't lend itself to shared hosting as well as PHP does. It's up to the shared hosting companies to figure that out. That's what YOUR business is. Oh, and several hosting companies can and have figured it out, so your public rant is really kind of embarrassing for DreamHost.
Tags: rubyonrails litespeed