Issue 8 ~ 3/10/2017
If you were to walk into RubyConf and ask 100 people what the most important problem is with Ruby, many of them would call out its speed - or lack thereof.
...but we may be approaching a future where this is no longer the case. As unique as Ruby's speed problems may seem, similar problems have been overcome by many other interpreted languages.
It's a fairly well-trodden path, actually.
- Early versions of the language interpret source code directly, which is very slow.
- Later versions convert the source into byte code which is run on a VM, which is faster.
- The VM is optimized via all kinds of computer science tricks (this is where Ruby 2.4 is).
- A just-in-time (JIT) compiler is added to convert performance-critical code into machine code.
The farther you progress along this path, the more difficult the obstacles are. Fortunately for us, it's becoming more and more possible to leverage work done on other better-funded languages <cough> Java <cough> and apply it to Ruby.
The state of CRuby (MRI)
In Ruby 1.9, Matz's Ruby Interpreter (MRI), the original Ruby interpreter, was retired and replaced with a byte code interpreter called YARV. This gave us a huge performance gain, since executing the new Ruby byte code was much faster than the old way of executing source code "directly."
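You can actually peek at this byte code yourself. CRuby ships with RubyVM::InstructionSequence, which compiles a snippet of source and can disassemble the resulting YARV instructions (the exact instruction names vary between Ruby versions):

```ruby
# CRuby-specific: compile a snippet to YARV byte code and disassemble it.
# The exact instructions printed depend on your Ruby version.
iseq = RubyVM::InstructionSequence.compile("1 + 2")
puts iseq.disasm
```

On a typical CRuby you'll see instructions for pushing the two operands and then a specialized addition instruction, which is exactly the kind of low-level representation the VM optimizations below operate on.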
If Ruby were a less flexible language, there would be all sorts of tricks it could do to optimize execution of the "new" byte code. For example, it could cache method results, replacing 1 + 2 with 3.
But in Ruby, the definition of a method could change at any time. This makes certain kinds of optimizations more complicated, because Ruby needs a way to detect when the optimized code is no longer trustworthy and revert to a "deoptimized version." There has been some work on a deoptimization engine but at the moment it's unclear whether or not it will proceed.
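Here's a tiny illustration of the problem. The "cache" below is hypothetical (it's a plain local variable, not anything MRI actually does), but it shows why a naively cached result becomes stale the moment a method is redefined:

```ruby
# Hypothetical caching, NOT real MRI internals: if the VM cached a
# method's result, a later redefinition would invalidate the cache.
def answer
  1 + 2
end

cached = answer  # imagine the VM "optimized" this call down to the constant 3

def answer       # ...but any Ruby code can redefine the method at runtime
  1 + 2 + 10
end

puts cached  # 3  -- the stale "optimized" value
puts answer  # 13 -- what the program actually means now
```

A deoptimization engine's job is to notice the redefinition and throw away the first, now-wrong result before anyone observes it.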
The race to JIT
To me, the most alluring source of performance gains lies in adding just-in-time (JIT) compilation to Ruby. A JIT compiler looks at your Ruby byte code, identifies hot-spots, and then compiles them into assembly code which can be run directly by your CPU.
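The hot-spot idea can be sketched in a few lines of Ruby. This is a toy model, not how any real JIT is built: the class name, the threshold, and the "compile" step are all made up for illustration, and the compiled form is just the original block rather than actual machine code:

```ruby
# Toy sketch of hot-spot detection: count invocations per method, and
# once a method is "hot," switch to a (pretend) compiled version.
class ToyJIT
  HOT_THRESHOLD = 1000  # made-up number; real JITs tune this carefully

  def initialize
    @calls    = Hash.new(0)
    @compiled = {}
  end

  def run(name, &interpreted)
    @calls[name] += 1
    if @compiled[name]
      @compiled[name].call                    # fast path: already "compiled"
    elsif @calls[name] >= HOT_THRESHOLD
      @compiled[name] = compile(interpreted)  # hot: "compile" it now
      @compiled[name].call
    else
      interpreted.call                        # cold: keep interpreting
    end
  end

  private

  def compile(block)
    block  # stand-in: a real JIT would emit native code here
  end
end

jit = ToyJIT.new
1500.times { jit.run(:add) { 40 + 2 } }  # crosses HOT_THRESHOLD partway through
```

The hard part a real JIT faces, and this sketch dodges entirely, is generating machine code that stays correct when Ruby's dynamism kicks in.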
How did Java get fast? A JIT. How did JS get fast? A JIT. How will Ruby get fast?...
JIT compilation is a very complex process, made more so because Ruby is a very dynamic language. It appears that various people are working on JITs in private, and GitHub is littered with the corpses of failed attempts.
One of the more public efforts is by a team at IBM, which is working on integrating MRI with Eclipse OMR, its suite of open-source language runtime libraries. While these libraries are based on IBM's work optimizing Java, they're actually built in C and C++.
Of course, it could be that Ruby is too dynamic to make good use of a JIT. It could be that the JIT would use too much RAM to be viable. A hundred things could go wrong, but I'm crossing my fingers.
TruffleRuby, JRuby and Rubinius
Of course, if you're a JRuby or Rubinius developer, you're probably screaming to yourself "we've had a JIT for years!"
By building Ruby implementations on top of the Java virtual machine (JVM) and LLVM runtimes, they cut the Gordian knot, getting performance that is in many cases much better than MRI's.
But the JVM wasn't originally built to run Ruby, and neither was its JIT compiler. That means there's room for it to be much more efficient than it is.
The Graal project aims to solve this problem. Graal is an external JIT compiler built specifically for multi-language support. TruffleRuby is an implementation of Ruby built on top of Graal. Not only is TruffleRuby faster than standard JRuby in many benchmarks, but it's also exploring some interesting ways to improve two of JRuby's main problems: C extension compatibility and startup time.
These are exciting times for Ruby. We are - with luck - only a few years away from having either a much faster JIT-powered MRI, or a JVM-powered drop-in replacement for MRI.
Not only will this ensure that Ruby stays viable for future web development, but it also means that we may soon be able to use Ruby for real-time performance-critical applications like - umm - NES emulators. :)
If the future really looks like the Eclipse OMR team believes it will, performance may no longer be a reason to choose one dynamic language over another. We'll finally be able to choose languages purely for criteria like ergonomics and ease-of-use.
Join us next week, when we'll talk about hashes, cryptography and the CIA!