Last week I experimented with Groovy, one of the more popular dynamically typed languages available on the JVM. Specifically, I was trying to mix and match Java and Groovy, starting with a pure Java codebase and migrating certain sections to Groovy where it made sense, beginning with the domain objects. In a nutshell, I found that while Groovy is pretty tubular, my expectations with regard to its interoperability with Java were somewhat optimistic.
After a little thought and some prodding from Marc and Guillaume, it became clear that from an interoperability perspective Groovy use can be categorized into four broad use case scenarios. From simplest to most complex, they are:
- Pure Groovy.
- Groovy invoking Java.
- Java invoking Groovy.
- Groovy invoking Java invoking Groovy invoking… a.k.a. circular dependency.
The first use case is obviously the most trivial. If Groovy code didn’t work, there would be no Groovy project. This is the baseline from which the other use cases are built. The second scenario is where things start to get interesting. The implications of successfully covering the second use case are huge: bytecode generation, JVM as a platform (HotSpot!), and access to third party libraries representing over 10 years of development. Just in case some of you have been hiding under a rock, I should note that this has been possible for a while ;-)
Moving from the second scenario to the third is trivial, as far as Groovy is concerned. Unfortunately, build tools and project management packages such as Maven need to be coerced into compiling the Groovy code before the Java code. This is a challenge, at least for Maven. This inflexibility, though not Groovy’s fault, does pose a problem. It seems the standard way of dealing with this dilemma is to hide all of your Groovy code behind pure Java interfaces, and then move the Java code, the interfaces, and the Groovy code into three separate modules that can be built serially. Obviously, this amounts to erecting an artificial barrier between your Java codebase and your Groovy codebase. However, if the future does indeed hold a harmonious coexistence of multiple first-class languages on the JVM, as Groovy enthusiasts hope, then these issues will fade as Java developers become multilingual and the more popular build systems begin to cater to their needs.
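To make the interface barrier a little more concrete, here is a minimal sketch in Java. Every name in it (Customer, StandInCustomer, load) is hypothetical and made up for illustration. The stand-in implementation is written in Java only so the example runs on its own; in the setup described above it would be a Groovy class compiled by groovyc in a separate module. The point is that the Java side compiles against the interface alone and names the implementation only as a string.

```java
// Hypothetical names throughout; nothing here comes from a real project.
final class InterfaceBarrierSketch {

    /** Would live in the pure-Java "interfaces" module that both sides compile against. */
    public interface Customer {
        String displayName();
    }

    /**
     * Stand-in for the implementation. In the three-module layout this class
     * would be written in Groovy and built in its own module.
     */
    public static class StandInCustomer implements Customer {
        @Override
        public String displayName() {
            return "stand-in customer";
        }
    }

    /**
     * The Java application refers to the implementation by name only, so javac
     * never needs the implementation's source or class file at compile time.
     */
    static Customer load(String implClassName) {
        try {
            return Class.forName(implClassName)
                    .asSubclass(Customer.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot load " + implClassName, e);
        }
    }

    public static void main(String[] args) {
        // A real application would read the class name from configuration
        // instead of referencing the class literal as this demo does.
        Customer customer = load(StandInCustomer.class.getName());
        System.out.println(customer.displayName());
    }
}
```

The cost is visible right in the sketch: the Java code can only ever see what the interface exposes, which is exactly the artificial barrier mentioned above.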
Now we arrive at the fourth and final scenario: circular dependency. Looking back, this is the problem I somewhat naively assumed had already been solved by the Groovy team. The Java domain object which I converted to Groovy implemented an interface defined in Java, and the domain object itself was obviously used throughout the application, which is implemented in Java. Thus, I couldn’t compile the Java code before compiling the Groovy code, because the domain object referenced by the Java code had not and could not yet be compiled. Furthermore, I wouldn’t have been able to fix the problem by compiling the Groovy code first, because the domain object referenced an interface defined in Java, which had not been compiled yet either. Dierk Koenig’s Groovy in Action calls this “the chicken and egg dependency problem”:
Groovy and Java both have no problem accessing, extending, or implementing compiled classes or interfaces from the other language. But at the source code level, neither compiler is really aware of the other language’s source files. If you want to work seamlessly between the two languages, the trick is to always compile dependent classes using the appropriate compiler prior to compiling a class that uses a dependent class.
This sounds simple, but in practice, there are many tricky scenarios, such as compiling a Java file that depends on a Groovy file that depends on a Java file. Before you know it, you can quickly end up with intricate dependencies crossing the boundaries of each language. In the best scenario, you may have to alternate back and forth between the two language compilers until all the relevant classes are compiled. A more likely scenario is that it will become difficult to determine which compiler to call when. The worst case scenario, and it’s not uncommon, occurs when you have circular dependencies. You will reach a deadlock where neither language will compile because it needs the other language to be compiled first.
The obvious solution is to punt: get rid of the circular dependency, revert to either use case two or use case three, and rant about “open sores” software for a couple of days to make yourself feel better. But let’s dig a little deeper and leave the ranting for next week… what would really need to be implemented in order to achieve the final interoperability scenario? JSR 223 (Scripting for the Java Platform) is all about implementing script engines in Java, which then host some target scripting language; not exactly what we’re looking for. JSR 292 (Supporting Dynamically Typed Languages on the Java Platform) is looking to add a new JVM instruction called invokedynamic, allowing method invocation verification to occur dynamically, at runtime. This new bytecode would make compilation of dynamically typed languages easier, but doesn’t solve our present conundrum.
What, then, is the solution? Well, we would probably need to use a single compiler (or compiler framework) for both languages. Classes defined in Java syntax would need to be aware of classes defined in Groovy syntax, and vice versa, before any bytecode is serialized into .class files. I tried to stay away from the compiler classes in college, but it seems fairly obvious that, assuming Eric isn’t lying, you would need a compiler framework providing:
- Pluggable lexical and syntactical analysis (one per language).
- Unified semantic analysis, code optimization and generation.
JSR 199 (Java Compiler API) looks interesting, but it doesn’t provide this sort of pluggable architecture. However, given that Sun GPL’ed the Java compiler not too long ago, it doesn’t seem far-fetched that someone from Groovy, JRuby, Jython or Rhino might start exploring the possibilities.
There’s also a second possibility that might require a little less tinkering. According to Tom Ball, a technology director at Sun who works on Java programming language tools, an ingenious hacker might make use of the JSR 199 compiler API to create a custom JavaFileManager implementation such that calls to getJavaFileForInput() requesting JavaFileObjects of type JavaFileObject.Kind.CLASS would forward the requests to the Groovy compiler. This custom JavaFileManager would behave similarly to the existing GroovyClassLoader, providing a convenient facade for the Groovy compiler, but could possibly be written in such a way that circular dependencies would be resolved correctly.
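For the curious, here is a rough sketch of what the skeleton of such a file manager might look like, using only the standard javax.tools classes. The ForeignCompiler interface is entirely hypothetical: it stands in for whatever facade would sit in front of the Groovy compiler, and the demo stubs it out with fake bytes instead of real bytecode. The sketch only shows the forwarding of CLASS requests. It makes no claim about which file manager methods javac actually consults while resolving symbols, so a working version would quite possibly need to override list() as well, and it does nothing about the circular case on its own.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import javax.tools.ForwardingJavaFileManager;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileManager;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.StandardLocation;
import javax.tools.ToolProvider;

final class ForeignAwareFileManagerSketch {

    /** Hypothetical hook: bytecode for a class the foreign compiler can build, or empty. */
    public interface ForeignCompiler {
        Optional<byte[]> compile(String className);
    }

    /** In-memory class file handed back in place of one found on disk. */
    static final class BytecodeFileObject extends SimpleJavaFileObject {
        private final byte[] bytes;

        BytecodeFileObject(String className, byte[] bytes) {
            super(URI.create("mem:///" + className.replace('.', '/') + Kind.CLASS.extension),
                  Kind.CLASS);
            this.bytes = bytes;
        }

        @Override
        public InputStream openInputStream() {
            return new ByteArrayInputStream(bytes);
        }
    }

    /** Forwards everything to the delegate except CLASS lookups the foreign compiler can satisfy. */
    static final class ForeignAwareFileManager extends ForwardingJavaFileManager<JavaFileManager> {
        private final ForeignCompiler foreign;

        ForeignAwareFileManager(JavaFileManager delegate, ForeignCompiler foreign) {
            super(delegate);
            this.foreign = foreign;
        }

        @Override
        public JavaFileObject getJavaFileForInput(JavaFileManager.Location location,
                                                  String className,
                                                  JavaFileObject.Kind kind) throws IOException {
            if (kind == JavaFileObject.Kind.CLASS) {
                Optional<byte[]> bytes = foreign.compile(className);
                if (bytes.isPresent()) {
                    return new BytecodeFileObject(className, bytes.get());
                }
            }
            return super.getJavaFileForInput(location, className, kind);
        }
    }

    /** Asks the wrapped file manager for a class that only the stub "compiler" knows about. */
    static String demo() {
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler(); // null on a bare JRE
        if (javac == null) {
            throw new IllegalStateException("This sketch needs a JDK, not just a JRE");
        }
        byte[] fake = "not real bytecode".getBytes(StandardCharsets.UTF_8);
        // "demo.Person" is a made-up class name used only for this demonstration.
        ForeignCompiler stub =
                name -> name.equals("demo.Person") ? Optional.of(fake) : Optional.empty();
        try (ForeignAwareFileManager manager = new ForeignAwareFileManager(
                javac.getStandardFileManager(null, null, null), stub)) {
            JavaFileObject file = manager.getJavaFileForInput(
                    StandardLocation.CLASS_PATH, "demo.Person", JavaFileObject.Kind.CLASS);
            try (InputStream in = file.openInputStream()) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The demo calls getJavaFileForInput() directly instead of running a compilation, precisely because I haven't verified how javac would exercise it. The hard part, getting the foreign compiler to resolve references back into Java sources that haven't been compiled yet, is left entirely to the imagination.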
As Dierk puts it:
Until the Java compiler is aware of classes not yet compiled in other languages, you have to use intermediary interfaces or abstract classes in Java to make the interaction between Java and Groovy smoother during the compilation process. Let’s hope some day the Java compilers will provide hooks for interacting with foreign compilers of alternate languages for the JVM.
So. The code is all out there. Does anyone want to bet on how long it takes someone to stop hoping and start scratching their itch?