A Critique of InvokeDynamic in JVM
18 Jan 2022
The bytecode instruction InvokeDynamic was introduced in JSR 292 to support fast implementations of dynamic languages, such as JRuby, on the JVM. Benchmarks confirm that JRuby with InvokeDynamic is about twice as fast as JRuby without it. The design has largely achieved its original goal.
However, I’d like to argue that the introduction of InvokeDynamic was a mistake. The main points are the following:
- Supporting fast implementations of dynamic languages is not a worthy goal, as programming languages are embracing static types.
- InvokeDynamic compromised the simplicity of the class file format and bytecode semantics.
- InvokeDynamic complicates platform portability and endangers the JVM’s future in an age when we are witnessing a plethora of new platforms.
The Evolution of Languages
Dynamic languages were popular in the first 10 years of the century. But the wind changed around 2010, when Twitter and several other companies migrated their infrastructure from Ruby to Scala on the JVM.
This suggests that dynamic languages are inadequate as the foundation of modern software.
Consider the active programming languages created in the last decade or so:
- Go (2009)
- Rust (2010)
- Kotlin (2011)
- TypeScript (2012)
- Julia (2012)
- Elm (2012)
- Swift (2014)
- Zig (2015)
It is no surprise that all of these languages are typed! (Julia enjoys an unobtrusive type system.)
What has happened to typical dynamic languages over the past ten years is even more telling:
- Ruby is becoming more typed with RBS and Sorbet.
- Hack, the Facebook dialect of PHP, embraces static typing.
- A boom of typed languages for JavaScript: TypeScript, Scala.js, Elm, etc.
- Dropbox adopted Mypy to statically type-check its Python code.
The importance of type systems has been touted by programming language researchers for decades, and it has finally been widely embraced by both language designers and programmers.
If the popularity of dynamic languages is only transient, and the JVM intends to be a solid piece of foundational software for decades, does it make sense to complicate the JVM for the sake of dynamic languages?
The Price of Simplicity
The JVM is designed around class files and bytecodes. The modularity of class files, the clarity of bytecode semantics and the overall simplicity are widely recognized as a success.
However, the simplicity and elegance of the class file format were compromised by the introduction of InvokeDynamic.
In the Java SE 8 edition of the JVM specification, a search for “java.lang.” yields 80 occurrences, of which 68 are “java.lang.invoke”! The other 12 occurrences of “java.lang.” appear only in examples.
This means that the semantics of bytecodes are now more closely coupled with the standard library, whereas previously they referred only to runtime exceptions defined in the standard library. References to the exceptions are justified, as they are just common data formats that serve as an interface to programmers. That is not the case for InvokeDynamic: its semantics rely on library-level concepts such as bootstrap methods, call sites and method handles.
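To make this coupling concrete, the following minimal sketch imitates by hand what the JVM does the first time it executes an InvokeDynamic instruction: it calls a bootstrap method, receives a call site, and then invokes the method handle bound to that site. Java source cannot emit the instruction directly, so the linkage step is simulated here with the public java.lang.invoke API. The class IndySketch and the methods greet, bootstrap and callThroughSite are hypothetical names chosen purely for illustration.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class IndySketch {
    // Hypothetical target method that the call site will be linked to.
    static String greet(String name) {
        return "Hello, " + name;
    }

    // A bootstrap method: it receives a lookup context, the name and the
    // type of the call site, and returns the CallSite to link against.
    public static CallSite bootstrap(MethodHandles.Lookup lookup,
                                     String name,
                                     MethodType type)
            throws ReflectiveOperationException {
        MethodHandle target = lookup.findStatic(IndySketch.class, name, type);
        // ConstantCallSite: the target can never be changed after linkage.
        return new ConstantCallSite(target);
    }

    // Simulates linkage followed by one invocation through the call site.
    static String callThroughSite(String arg) {
        try {
            MethodType type = MethodType.methodType(String.class, String.class);
            CallSite site = bootstrap(MethodHandles.lookup(), "greet", type);
            return (String) site.dynamicInvoker().invokeExact(arg);
        } catch (Throwable t) {
            // MethodHandle.invokeExact is declared to throw Throwable.
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(callThroughSite("JVM")); // prints "Hello, JVM"
    }
}
```

Even in this tiny sketch, four library classes (MethodHandles.Lookup, MethodType, MethodHandle and CallSite) are needed just to describe what a single instruction does.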
While a high-level language may implement its semantics with the help of runtime libraries, a virtual machine should not specify the semantics of its instructions in terms of the standard library, beyond common data format definitions.
This is because bytecodes, like other formal languages, should admit a formal semantics in terms of heaps, stacks, etc. That is where simplicity comes from: when something can be explained with mathematics, it is simple, precise and clear.
The mathematical lucidity of the JVM was lost with the introduction of InvokeDynamic. The specification became more difficult to understand. The class file format is now complicated by many new entities that exist just for InvokeDynamic: CONSTANT_MethodHandle_info, CONSTANT_InvokeDynamic_info, the BootstrapMethods attribute, etc.
Because the JVM is a piece of foundational software, its reliability and maintainability are vital for the software industry. As C. A. R. Hoare put it, they can only be achieved at the price of simplicity:
Reliability can be purchased only at the price of simplicity.
As an alternative, when there is a compelling need to extend the capabilities of the JVM, introducing new native APIs is a much less damaging approach than introducing new instructions.
Impact on Research and Innovations around JVM
The InvokeDynamic instruction was later used to implement lambda expressions. As a result, nearly all programs now use the instruction directly or indirectly, because lambda expressions are pervasive. The complexity of InvokeDynamic therefore affects nearly every project that takes class files as input.
This is another mistake on top of the mistake of introducing InvokeDynamic in the very beginning. Lambda expressions can be easily implemented via inner classes without resorting to InvokeDynamic.
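As a rough illustration, the two methods below return functions with the same behaviour. With a standard javac, the lambda form is compiled to an InvokeDynamic instruction whose bootstrap method is LambdaMetafactory.metafactory, while the anonymous inner class form is compiled to an ordinary extra class file that is instantiated with a plain `new`. The names LambdaVsInnerClass, viaLambda and viaInnerClass are made up for this sketch.

```java
import java.util.function.Function;

class LambdaVsInnerClass {
    // Lambda form: javac emits an InvokeDynamic call site for this expression.
    static Function<Integer, Integer> viaLambda(int delta) {
        return x -> x + delta;
    }

    // Inner class form: javac emits a separate class and an ordinary
    // constructor call; no InvokeDynamic is involved.
    static Function<Integer, Integer> viaInnerClass(final int delta) {
        return new Function<Integer, Integer>() {
            @Override
            public Integer apply(Integer x) {
                return x + delta;
            }
        };
    }

    public static void main(String[] args) {
        // Both variants capture delta and compute the same result.
        System.out.println(viaLambda(2).apply(40));
        System.out.println(viaInnerClass(2).apply(40));
    }
}
```

Running `javap -c` on the compiled class is one way to see the difference between the two method bodies.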
The OPAL project, an analysis framework for Java, even provides an InvokeDynamic rectifier to remove uses of InvokeDynamic:
If you are a developer of static analyses and suffered from the pain of having to support Invokedynamic (Java 7+) - in other words, Java lambda expressions and method references - then this is the tool for you! OPAL’s project serializer rewrites Invokedynamic calls using plain-old standard Java bytecode instructions to facilitate the writing of static analyses! You no longer have to worry about Invokedynamics created by Java compilers.
The Soot analysis framework takes pains to remove the InvokeDynamic calls introduced by lambda expressions. Almost a decade after its introduction, programming language researchers are still struggling with the general usage of InvokeDynamic.
The problem is not restricted to academic research; it is also a pain for real-world projects.
In Android, D8 replaces the InvokeDynamic calls introduced by lambda expressions with generated inner classes. The same happens in GraalVM Native Image in order to support ahead-of-time compilation.
Java has historically had a reputation for platform portability: compile once, run everywhere. Today Java runs on cloud machines, on mobile phones and on the cards in your wallet.
The fact that InvokeDynamic unnecessarily complicates cross-platform support damages Java’s reputation for platform portability and endangers Java’s future as we witness the emergence of new computing platforms, such as smart watches, glasses and cars.
Conclusion
In this post I have argued that the introduction of InvokeDynamic and its use to implement lambda expressions are two mistakes in the history of Java and the JVM. These mistakes damage the platform’s reputation for portability and may have a negative impact on its future.
I’d like to end the post with a quote from the physicist Richard Feynman:
You can’t make imperfections on a perfect thing — you have to have another perfect thing.