The other week, while shooting the shit about programming languages, a colleague mentioned the Twitter Ruby v. Scala debate and how Twitter had switched to Scala to meet their scaling needs. I wasn't following the ruckus that closely (no dog in the fight), but it sounded really odd to say that one language is more scalable than another.
For one, different people give different meanings to scalability, so any discussion of scalability is bound to be problematic unless the participants all agree on a single definition. I personally like to use Amazon CTO Werner Vogels's description. I hadn't given the debate much more thought until yesterday, when an article on Twitter's evolving architecture appeared on InfoQ. Most of the commotion stems from Twitter's message queue problem:
The first implementation of the MQ was using Starling, written in Ruby, and did not scale well especially because Ruby’s GC which is not generational. That lead to MQ crashes because at some point the entire queue processing stopped for the GC to finish its job. A decision was made to port the MQ to Scala which is using the more mature JVM GC. The current MQ is only 1,200 lines and it runs on 3 servers.
Now I see the problem. We're muddling the language with the platform. The sentence "a decision was made to port the MQ to Scala which is using the more mature JVM GC" should have read "a decision was made to move to the more mature JVM GC and Scala was chosen as the language". The move to Scala was dictated by the performance and reliability of the Java VM and its garbage collector, not Scala the programming language. Twitter is giving Scala the programming language too much credit. I don't think they are doing this intentionally and they do always note the JVM aspect, but with slides like this
and focusing more on Scala the programming language than the VM platform it's not entirely surprising that some folks took issue and were confused by Scala's magical scaling abilities. And the bit about only 1,200 lines of code omits their dependency on Apache Mina (plain old Java). In essence Twitter moved to Java and the Java platform--it just so happens that Scala is a cool language that can co-exist on the platform.
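To make the language-versus-platform point concrete, here is a minimal sketch (mine, not anything from Twitter's code) that asks the running JVM which garbage collectors it is using. The class name `PlatformGc` and the helper `collectorNames` are made up for illustration; the `java.lang.management` calls are standard JDK APIs. The answer depends on the JVM and the flags it was started with, and the same calls work unchanged from Scala, Clojure, or any other language hosted on the JVM.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class PlatformGc {
    // Names of the collectors the running JVM reports.
    // These are chosen by the JVM and its startup flags, not by the source language.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Which VM implementation is hosting this program.
        System.out.println("JVM: " + System.getProperty("java.vm.name"));
        // Per-collector counters maintained by the platform.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
    }
}
```

The collector names printed will vary with the JVM version and options, which is rather the point: swap the collector and the program's pause behavior changes without touching a line of source.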
And that's Scala's dirty little secret. Scala is only scalable in the sense that it runs on the reliable, high-performing JVM platform. This is why people say that Java the programming language is dead, but long live the virtual machine. Rich Hickey has a good, succinct take on languages and platforms in his rationale for Clojure. In this sense Ruby is doing it the "old way" whereas languages like Clojure and Scala are doing it the "new way" by leveraging the platform. If Scala only ran on the 1.4 JVM, probably no one would even be talking about Scala, because while some of the language constructs are neat, let's face it, they are not revolutionary or entirely novel. What's the saying, everything interesting in computer science was invented between 1950 and 1980?
So, in addition to
- languages don't scale, architectures scale
- frameworks don't scale, solutions scale
I will add the following
- languages don't scale, platforms scale
Incidentally, it should be noted that the creator of Scala attributes Scala's scalability not to the platform but to the fact that Scala can be applied to a "wide range of programming tasks, from writing small scripts to building large systems." But since scalability means different things to different people, we're bound for a disconnect.