April 16th, 2007

Interesting set of posts related to Twitter, Ruby on Rails, and scaling.

It starts with an online interview with Twitter developer Alex Payne, where he mentions some challenges associated with scaling and the use of RoR.

None of these scaling approaches are as fun and easy as developing
for Rails. All the convenience methods and syntactical sugar that
makes Rails such a pleasure for coders ends up being absolutely
punishing, performance-wise. Once you hit a certain threshold of
traffic, either you need to strip out all the costly neat stuff that
Rails does for you (RJS, ActiveRecord, ActiveSupport, etc.) or move
the slow parts of your application out of Rails, or both.

The 'creator' of RoR, David Heinemeier Hansson, responds.

Scaling is the act of removing bottlenecks. When you remove one bottleneck (like application code execution), you tend to reveal another (like database queries). That's natural and means you're making progress. But you have to keep your marbles straight when doing this. If your bottleneck has moved to the database, you probably won't see big results by replacing pretty constructs with ugly ones. In other words, if a database query is taking 0.5 seconds, improving a loop from 0.05 to 0.01 seconds is not worth bothering with at this point.
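To put numbers on that last sentence, here is a rough sketch using only the figures from the quote (a 0.5 second query, a loop going from 0.05 to 0.01 seconds). The assumption that a request consists of just those two steps is mine, purely for illustration.

```python
# Figures taken from the quote above; the two-step request is an assumption.
db_query_s = 0.5      # time spent in the database query
loop_before_s = 0.05  # application loop before optimization
loop_after_s = 0.01   # application loop after optimization

total_before = db_query_s + loop_before_s
total_after = db_query_s + loop_after_s

loop_speedup = loop_before_s / loop_after_s
request_speedup = total_before / total_after
time_saved_pct = 100 * (total_before - total_after) / total_before

print(f"loop speedup:    {loop_speedup:.1f}x")
print(f"request speedup: {request_speedup:.2f}x")
print(f"time saved:      {time_saved_pct:.1f}%")
```

The loop gets about five times faster, yet the request as a whole improves by well under a tenth, because the query still dominates.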

I wasn't quite sure what DHH was implying here. It's not unusual for an infrastructure such as RoR to involve a tradeoff between ease of development and scalability. Luckily, I didn't have to puzzle it out too long. Mark Pilgrim provided a handy interpretive translation of DHH's post.

Rails is an ogre, and ogres have layers.

Comments
1
Bud Gibson - 4:04 pm 4/16/2007

Thanks for the pointer. Pilgrim's post was clear and to the point.

2
Larry - 4:38 pm 4/16/2007

If people are tiring of Rails, it's because of DHH's self-importance.

3
Shelley - 10:04 pm 4/16/2007

I think that's why Mark's post has received the response it has, Larry.

4
McD - 9:34 am 4/17/2007

For me Mark makes the mistake of turning a technical analysis into an ad hominem attack on the technology architect. I laughed and it was brilliant writing, but in the final analysis I also see DHH's view: 95% of web apps are NOT like Twitter. Using Rails for Twitter was a suspect decision since the performance constraints of Ruby and Rails in their current instantiations were well documented.

The Twitter guys came from the Odeo programming team. They had used Rails for Odeo and knew its limits and frustrations for building something that could have the 10x uptake you'd expect from Twitter to be successful.

Whatever you write Twitter in, it would require an incredibly brilliant architecture to scale as the SXSW audience goes crazy documenting every interesting action of their day. The technology behind AOL's IM must be incredibly expensive and difficult to maintain. The design that made YouTube scale must have similar challenges.

Are Ruby, Rails and DHH suspect because Twitter hit a scaling problem? No. Should DHH defend the applicability of Rails for 95% of web apps? Yes.

All else is personality politics. Why am I not surprised? That blogging approach drives more blog hits than a great analysis of web frameworks.

It's also worth stating in conclusion that DHH designed, marketed and modeled one of the most elegant MVC frameworks available. It just works and I can't say that for most Java, C# or other scripting frameworks. They can be made to work but their "out of the zip file" experience is too often fraught with undocumented problems.

5
Lawrence Krubner - 9:36 am 4/17/2007

It must be difficult to do anything that great when you're 27 and then not develop a bit of a god complex about oneself.

6

[…] Shelley Power's blog I commented about this: For me Mark Pilgrim makes the mistake of turning a technical analysis into […]

7

Would it have been easier to scale Twitter if it had been built in plain old PHP? It doesn't feel to me to be a significantly different problem to something like Livejournal. And it's not as though Twitter is a huge site with huge numbers of pages and page function.

The comparison with AIM is apt. Twitter should perhaps have been built as a desktop P2P app in the style of Skype or a combined P2P+Server app like Jabber/IRC. Maybe then we'd have got a properly conversational interface that encouraged dialogues rather than the aggregation of monologues we've actually got. It's somewhat comical watching people trying to answer back on Twitter.

8
Scott - 9:12 pm 4/17/2007

"They can be made to work but their "out of the zip file" experience is too often fraught with undocumented problems."

Having spent a lot of time hunting down gem dependencies and fiddling with mod_rewrite rules and routes, I can say that sometimes installing Rails apps isn't a picnic either. Let's not even talk about how some versions of Rails are specific to particular versions of Ruby and sometimes the app devs freeze to that version of Rails for their app's zip distribution.

"Would it have been easier to scale Twitter if it had been built in plain old PHP?" "And it's not as though Twitter is a huge site with huge numbers of pages and page function."

Part of me wants to say "yes" simply because sites built with PHP have had those Twitter-esque kinds of loads put on them before, so any language bottlenecks were shaken out a long time ago. Ruby + Rails is still new enough that knowing what it can and can't do specifically is hard to pin down. I know that a lot of the optimizations that you make when you need to scale an application up (e.g. query optimization, table/index optimization, clustering databases) are possible in Rails. I don't think it's the reads that are killing Twitter, simply because there is caching going on at multiple levels in the database (except for outlying cases like "Scoble with friends"). I think it's the writes and possibly the SMS/IM integration. Sending out that many IMs and SMS messages in addition to serving HTML/XML/JSON has to take a toll on your pipe.

9

"Would it have been easier to scale Twitter if it had been built in plain old PHP?"

I have the impression that it is easier to drop out of Ruby and into C code when you hit a bottleneck, than it is to drop out of PHP code and into C code. I suppose the way to do it in PHP is to write a C extension to PHP and use that. I'm guessing that means learning a lot about PHP's internals, which sounds like a pain to me.

In a recent interview David Heinemeier Hansson mentions how he used 300 lines of C code to overcome a performance bottleneck when they were building Campfire.

Anyway, is PHP really all that fast? Seems like for any big site one starts to come up with caching schemes, which is what one would also normally do when working with Ruby. For a site like Twitter, where the database queries pretty much have to be constant and very little or no caching is possible, then I think PHP would also run into a bit of strain.

I've worked mostly with PHP for the last 7 years, but I have to say, I prefer Ruby's more dynamic nature. And most of the sites I work on are small enough that scaling issues never come up.

10
Shelley - 8:21 am 4/18/2007

McD, I imagine that Mark was somewhat inspired by DHH's use of "Fuck you" in a slide at a conference as a way of responding to those critical of RoR. What goes around…

Julian, that is funny! People trying to have a dialog with twitter.

Lawrence, and Scott, we have to be careful comparing RoR with PHP. PHP is a language, just as Ruby is. RoR is an infrastructure, where you typically get a tradeoff between ease of initial development and scalability. There are PHP infrastructures that do the same, and I wouldn't necessarily expect them to do better in this type of application.

I've only just started working with Ruby, and I'm less than enamored with it, but will continue until I either see the light, or decide the light was never meant for me. I do know that even Ruby enthusiasts say it's not the fastest language, or the best performer. Leaves me going: why use it for an application that's meant to scale, and quickly?

As for writing in C, brrr, I did my duty with that language. Maybe I'll lose geek cred, but I'd rather eat candy and then go suck a lemon.

11

"PHP is a language, just as Ruby is. RoR is an infrastructure, where you typically get a tradeoff between ease of initial development and scalability. "

Good point. It would be better to compare Symfony or CakePHP to Ruby on Rails, as opposed to PHP to Ruby on Rails.

"I've only just started working with Ruby, and I'm less than enamored with it, but will continue until I either see the light, or decide the light was never meant for me."

If you decide you hate it, it will be fascinating to hear why. You are speaking of Ruby and not of Rails?

"I do know that even Ruby enthusiasts say it's not the fastest language, or the best performer. Leaves me going: why use it for an application that's meant to scale, and quickly?"

I kind of have the impression, at this point, that the crew at 37 Signals would give 2 answers. First, your application will probably never get the traffic that would warrant concern about scaling. Second, if it ever does get that kind of traffic, then rewrite portions of your app in C.

I've already pointed to the interview where DHH spoke of replacing 100 lines of Ruby code with 300 lines of C. This was the post where they argued that scaling was an issue most people would never have to worry about:

“It’s possible my site could be the next MySpace. It could happen, right?!”

Well, yeah, it’s possible. But not likely. Very not likely.

We gave a similar view in “Scale Later” (PDF), an essay in Getting Real:

The truth is the overwhelming majority of web apps are never going to reach that stage. And even if you do start to get overloaded it’s usually not an all-or-nothing issue. You’ll have time to adjust and respond to the problem. Plus, you’ll have more real-world data and benchmarks after you launch which you can use to figure out the areas that need to be addressed.

Allocate your fear properly

When it comes to building a web app, some things create more fear than they should…

Fear: It won’t scale
Truth: You’re not going to be Google overnight."

As always, with this crew, I'm torn between agreeing with them or being amazed by their attitude.

12

Damn. Sorry the italics didn't work. Their quote ends with the line:

"Truth: You’re not going to be Google overnight.""

13

Although Mark's post was funny I'm not sure it was fair. I agree with a lot of what DHH had to say. For a web app with lots of reads, the bottleneck will most likely always be I/O not computation. This means number of web servers, number of database servers, in-memory caching and database schemas are where the optimizations lie and not whether your app is written in PHP, Ruby or C.

I guess it's a testament to how few sites have these problems that people are saying "Ruby is slow" as if the execution time of the application language could possibly be the bottleneck.
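A rough sketch of why in-memory caching matters more than the application language for a read-heavy app. Everything here is hypothetical: `query_database` stands in for a slow database read, and the traffic pattern (1,000 page views across 10 users) is made up for illustration.

```python
from functools import lru_cache

db_calls = 0  # counts how often the pretend database is actually hit


def query_database(user_id):
    """Hypothetical stand-in for a slow database read."""
    global db_calls
    db_calls += 1
    return f"timeline for user {user_id}"


@lru_cache(maxsize=1024)
def cached_timeline(user_id):
    """In-memory cache sitting in front of the database read."""
    return query_database(user_id)


# Simulate 1,000 page views spread evenly over 10 users.
for view in range(1000):
    cached_timeline(view % 10)

print(f"page views: 1000, database reads: {db_calls}")
```

Only the first view per user reaches the database; the rest are served from memory. The catch, of course, is that this only helps reads, and a real cache needs invalidating whenever the underlying data changes.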

14
Arthur - 6:16 pm 4/19/2007

So, the conclusion is that RoR doesn't scale well for these large applications, which doesn't surprise me at all either. I'm with Shelley and I have not even touched RoR (well, I did once, but I didn't inhale).

As for writing C code: I actually enjoy writing C# code (and that is something that hasn't happened to me in a long time).

15
Shelley - 8:39 pm 4/19/2007

Lawrence, yes Ruby, not RoR. I'll let you know, but I do need to give it some time.

Dare, I got the impression that Twitter's problems were somewhat related to using an infrastructure tool, and then having difficulties mapping in multiple databases.

Arthur, I like C#, though I haven't played with it since the first release.

16

Shelley,
I saw that. It seems that was the only issue that had technical merit which unfortunately was obscured by all the rhetoric flying back and forth.

PS: I'll have some more photos from Nigeria when I get back next week. Less pictures of billboards this time, I promise. :)
