The other week, while shooting the shit about programming languages, a colleague mentioned the Twitter Ruby vs. Scala debate and how Twitter had switched to Scala to meet their scaling needs. I wasn't following the ruckus that closely (no dog in the fight), but it sounded really odd to say that one language is more scalable than another.
For one, different people give different meanings to scalability, so any discussion of it is bound to be problematic unless the participants all agree on a single definition. I personally like to use Amazon CTO Werner Vogels's description. I hadn't given the debate much more thought until yesterday, when an article on Twitter's evolving architecture appeared on InfoQ. Most of the commotion stems from Twitter's message queue problem:
The first implementation of the MQ was using Starling, written in Ruby, and did not scale well especially because Ruby’s GC which is not generational. That lead to MQ crashes because at some point the entire queue processing stopped for the GC to finish its job. A decision was made to port the MQ to Scala which is using the more mature JVM GC. The current MQ is only 1,200 lines and it runs on 3 servers.
Now I see the problem. We're muddling the language with the platform. The sentence "a decision was made to port the MQ to Scala which is using the more mature JVM GC" should have read "a decision was made to move to the more mature JVM GC and Scala was chosen as the language". The move to Scala was dictated by the performance and reliability of the Java VM and its garbage collector, not Scala the programming language. Twitter is giving Scala the programming language too much credit. I don't think they are doing this intentionally and they do always note the JVM aspect, but with slides like this
and focusing more on Scala the programming language than the VM platform it's not entirely surprising that some folks took issue and were confused by Scala's magical scaling abilities. And the bit about only 1,200 lines of code omits their dependency on Apache Mina (plain old Java). In essence Twitter moved to Java and the Java platform--it just so happens that Scala is a cool language that can co-exist on the platform.
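To make the language/platform split concrete, here's a minimal sketch of my own (not anything from Twitter's code). The class name `PlatformGc` and its two helper methods are made up for illustration; the only real APIs used are the JDK's `java.lang.management` classes. It asks the running JVM which garbage collectors it is using, which shows that the collector is a property of the runtime the program was started on, not of the source language.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class for illustration; only the java.lang.management calls are JDK APIs.
public class PlatformGc {

    // Names of the garbage collectors the running JVM reports.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    // Total collections so far. A collector reports -1 when its count is
    // undefined, so only positive counts are added.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // The output depends on the JVM and its startup flags, not on this source file.
        System.out.println("Collectors: " + collectorNames());
        System.out.println("Collections so far: " + totalCollections());
    }
}
```

The same calls are available from Scala, Clojure, or JRuby, since they all run on the JVM and can use its class library. Whatever the collector does for your queue, it does the same for every language on the platform.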
And that's Scala's dirty little secret. Scala is only scalable in the sense that it runs on the reliable, high-performing JVM platform. This is why people say that Java the programming language is dead, but long live the virtual machine. Rich Hickey has a good, succinct take on languages and platforms in his rationale for Clojure. In this sense Ruby is doing it the "old way" whereas languages like Clojure and Scala are doing it the "new way" by leveraging the platform. If Scala only ran on the 1.4 JVM, probably no one would even be talking about Scala, because while some of the language constructs are neat, let's face it, they are not revolutionary or entirely novel. What's the saying? Everything interesting in computer science happened between 1950 and 1980.
So, in addition to
- languages don't scale, architectures scale
- frameworks don't scale, solutions scale
I will add the following
- languages don't scale, platforms scale
Incidentally, it should be noted that the creator of Scala calls Scala scalable not because of the platform but because the language can be applied to a "wide range of programming tasks, from writing small scripts to building large systems." But since scalability means different things to different people, we're bound for a disconnect.
I don't think that's a secret, much less a dirty one.
That being said, it would have been more interesting if they had tried Starling on JRuby and compared it to Kestrel.
Posted by: Erik Engbrecht | June 28, 2009 at 04:21 AM
but i like how the title sounds
Posted by: tinou | June 28, 2009 at 09:33 AM
in all seriousness, of course it's not a secret or dirty. but whenever something is poised to make a breakthrough, it's important to be precise. similar issue with scala's concurrency magic. magically bc of actors we're going to get better concurrency--huh? actors are rather low-level if you think about it. but too many slides lead off with "scala makes concurrency a breeze, it has actors."
Posted by: tinou | June 28, 2009 at 10:09 AM
Um, but Scala also has a pretty mature model for Actors. A robust way to handle parallel, scalable immutable message passing is kind of important in this arena, maybe it bears mention.
Posted by: Scott Swank | June 29, 2009 at 09:55 AM
Yes, languages can scale according to the creator of Scala. Go read: http://www.artima.com/scalazine/articles/scalable-language.html
What's with all the negative blogging about Scala? It's just like many people are afraid of it!
Posted by: Danny | June 29, 2009 at 12:40 PM
@Tinou - Yeah, catchy titles are important, and you're absolutely right about the importance of being precise. And agreed that Actors are a relatively low-level concurrency abstraction.
@Scott - As someone intimately familiar with the internals of the Scala Actors library I seriously doubt its maturity.
Posted by: Erik Engbrecht | June 30, 2009 at 03:53 AM
Obviously Java and Scala (and Groovy and JRuby, etc.) all generate JVM bytecode. Speaking theoretically, is it possible that there are valid programs (in bytecode) that Java the language cannot compile to, but Clojure or JRuby or Scala can, and vice versa? I don't know enough about the spec or writing bytecode compilers to know. If it is the case, then it seems reasonable to give some props to Scala. On the other hand, they point to the mature GC of the JVM, so the effect of Scala on performance is probably second order (though important in the reduction and comprehensibility of code).
As to everything interesting in CS having been done between 1950-1980, I think the FP community would disagree.
Posted by: Joe | July 01, 2009 at 11:05 AM
Scarling is a port of Starling to Scala: http://robey.lag.net/2008/05/07/scarling.html
Posted by: rick | July 01, 2009 at 11:13 AM
"In essence Twitter moved to Java and the Java platform--it just so happens that Scala is a cool language that can co-exist on the platform."
As one of the people who pushed for the adoption of Scala at Twitter, I can confirm that the above was our logic. I'm confused, though, as to why you think that's a negative. To my mind, it seems like two wins: a cool language that happens to run on a great platform. Nothing wrong with that.
Your whole piece seems to be predicated on the fact that Odersky defines Scala as "scalable" to a variety of programming tasks, whereas our engineers have presented on "scalability" in terms of architecture. Whatever controversy exists here seems to be invented from your confusion about terminology.
I'm not sure what value there is in tossing off aphorisms like "languages don't scale, platforms scale"; they're glib and, in this case, untrue. A successful language is able to scale because of tight integration with its host platform. Creating a successful language is a holistic problem, and a difficult one at that. The choice to adopt a language should also be holistic.
Posted by: Alex Payne | July 01, 2009 at 12:39 PM
@alex
i definitely don't see any of this as negative. it's very positive. i'm not a one world, one language, one platform guy. i wished scala on .net was still alive. now that would be cool, to be able to run scala on jvm or .net or some other platform.
is the controversy invented? i don't see it as a controversy. but i think you'll have to concede that when you (not you specifically) throw in lines like "we switched to scala, things now fly, and it's only 1,200 locs" some people might be confused w/o the full context.
Posted by: tinou | July 01, 2009 at 01:21 PM
Op might want to take a look at this
http://tiagofernandez.blogspot.com/2009/05/java-integration-with-groovy-jruby-and.html
Posted by: atomi | July 01, 2009 at 05:57 PM
The point is succinctness of expression. They moved to Scala because it was a language they could move to without losing a bunch of the features they were used to with Ruby. The fact that Scala compiles to the Java platform (and does so very well) gives it the performance characteristics of Java. It's not a "dirty little secret", it's the value proposition.
Posted by: Czerwonka | July 01, 2009 at 09:07 PM
Scala is a hybrid Object-Oriented/Functional Programming language on the JVM. When I heard that Twitter was using Scala, I was curious and started collecting all the sites and articles to learn scala programming. If you are interested check the link below for the big list I have gathered (more than 200 sites) for learning scala programming.
http://markthispage.blogspot.com/2009/06/more-than-100-sites-to-study-scala.html
Posted by: Sri | July 17, 2009 at 11:49 PM