
Java 7 Bytecode Verifier: Huge backward step for the JVM

Posted by Prashant Deva on March 25, 2013

    The reason the JVM is so popular and has so many languages, IDEs, profilers, debuggers, APMs and other wonderful tools is that the JVM bytecode is extremely simple. You can teach someone pretty much the entire bytecode in an hour, and in the next hour they would be using the ASM framework and manipulating bytecode.

    Stack Frames: Feature turned Flaw

    Enter Stack Map Frames..
    Java 7 brings with it a new bytecode ‘feature’ called Stack Map Frames, which is the most complicated and unnecessary addition to Java yet and which erodes whatever simplicity the JVM bytecode has enjoyed thus far.

    To be fair, Stack Maps were initially introduced in Java 6, but they were an optional part of the class file and every tool maker ignored them. However, Java 7 makes them a compulsory part of the bytecode, opening the can of worms described below.

    What are Stack Map Frames?
    The aforementioned simplicity of Java bytecode came from the fact that it hid away all the black art and gory details of compiler design inside the JVM’s JIT compiler, leaving programmers to deal only with a dead-simple, high-level representation. You didn’t have to understand compiler theory terms like data flow or control flow analysis if you just wanted to inject some bytecode. However, Stack Map Frames make you do exactly that!

    Stack Map Frames essentially require you to keep track of the exact ‘type’ of the local variable table and operand stack at every bytecode instruction. While this may sound simple in theory, in practice it will make you pull your hair out.
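    To give a feel for what that bookkeeping involves, here is a minimal sketch of a toy type tracker for straight-line code. It is only an illustration: the class name FrameTrackerSketch, the string-based instruction format and the tiny instruction subset are assumptions made for this example, and the sketch does no type checking and no branches, which is where the real difficulty starts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy abstract interpreter (hypothetical, for illustration only): it tracks the
// *types*, not the values, held in the local variable table and on the operand
// stack while walking straight-line code.
final class FrameTrackerSketch {

    // Returns one frame description per instruction: the state just before it runs.
    static List<String> trace(String[] code, int maxLocals) {
        String[] locals = new String[maxLocals];
        Arrays.fill(locals, "top");              // "top" marks a slot with no usable type yet
        List<String> stack = new ArrayList<>();  // last element is the top of the stack
        List<String> frames = new ArrayList<>();

        for (String insn : code) {
            frames.add("locals=" + Arrays.toString(locals) + " stack=" + stack);
            String[] parts = insn.split(" ");
            switch (parts[0]) {
                case "iconst_1":
                    stack.add("int");
                    break;
                case "aconst_null":
                    stack.add("null");
                    break;
                case "iload":
                case "aload":
                    // Simplification: push whatever type the slot holds, without checking it.
                    stack.add(locals[Integer.parseInt(parts[1])]);
                    break;
                case "istore":
                case "astore":
                    locals[Integer.parseInt(parts[1])] = pop(stack);
                    break;
                case "iadd":
                    pop(stack);
                    pop(stack);
                    stack.add("int");
                    break;
                default:
                    throw new IllegalArgumentException("unsupported instruction: " + insn);
            }
        }
        return frames;
    }

    private static String pop(List<String> stack) {
        return stack.remove(stack.size() - 1);
    }

    public static void main(String[] args) {
        String[] code = {"iconst_1", "istore 0", "iload 0", "iload 0", "iadd", "istore 1"};
        List<String> frames = trace(code, 2);
        for (int i = 0; i < code.length; i++) {
            System.out.println(code[i] + "  <-  " + frames.get(i));
        }
    }
}
```

    The class file itself only stores frames at branch targets, but a tool that computes them still has to simulate every instruction along the way, and it has to merge the states of all paths wherever control flow joins.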

    Consider the case of code like this and how the popular ASM framework generates Stack Map Frames:
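    A minimal sketch of the kind of code in question, assuming two classes MyThis and MyThat that share a superclass. The names MyBase, MergeExample, pick and describe are made up for this illustration; the use of a after the if/else plays the role of the line the points below refer to.

```java
final class MergeExample {
    // MyThis and MyThat are named in the text; MyBase is an assumed common superclass.
    static class MyBase { String describe() { return "base"; } }
    static class MyThis extends MyBase { @Override String describe() { return "this"; } }
    static class MyThat extends MyBase { @Override String describe() { return "that"; } }

    static String pick(boolean flag) {
        MyBase a;
        if (flag) {
            a = new MyThis();   // on this path the slot for `a` holds a MyThis
        } else {
            a = new MyThat();   // on this path the slot for `a` holds a MyThat
        }
        // Both paths join here, so a frame at this point has to give `a` a single
        // type that covers both: the common superclass of MyThis and MyThat.
        return a.describe();
    }

    public static void main(String[] args) {
        System.out.println(pick(true));
        System.out.println(pick(false));
    }
}
```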

    1. The ‘type’ of a at line 6 will need to be whatever the common superclass of MyThis and MyThat is.

    2. You don’t really know the type hierarchy of the classes during runtime bytecode transformation.

    3. Trying to use reflection to look up the hierarchy can cause the class in question to be loaded. This has a few flaws:

        1. It can cause the class’s static initializer to run at a time completely unexpected by the programmer (leading to bugs in the application).
        2. Since the class is loaded during the transformation, your bytecode instrumentor will not be called when it’s loaded, and its bytecode won’t be transformed.
        3. You can easily get a ClassNotFoundException due to different classloaders being used for sub and super classes.

    4. Also, did I mention you have to be an expert in data flow analysis to actually implement this?
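    The reflection-based lookup from point 3 can be sketched roughly as follows. This is a simplified illustration, not ASM’s actual code: the class and method names are assumptions, and it takes dotted binary names rather than the internal slash-separated form used in class files.

```java
// Simplified sketch of a reflection-based "common superclass" lookup.
final class CommonSuperSketch {

    static String commonSuperClass(String nameA, String nameB, ClassLoader loader) {
        // Loading by name is the step the points above warn about: it pulls both
        // classes into the JVM in the middle of the transformation.
        Class<?> a = load(nameA, loader);
        Class<?> b = load(nameB, loader);
        if (a.isAssignableFrom(b)) {
            return a.getName();
        }
        if (b.isAssignableFrom(a)) {
            return b.getName();
        }
        if (a.isInterface() || b.isInterface()) {
            return "java.lang.Object";
        }
        // Walk up a's superclass chain until we reach a class that b also extends.
        // Object is assignable from every class, so the loop always terminates.
        do {
            a = a.getSuperclass();
        } while (!a.isAssignableFrom(b));
        return a.getName();
    }

    private static Class<?> load(String name, ClassLoader loader) {
        try {
            // initialize=false asks the JVM to load the class without initializing it here.
            return Class.forName(name, false, loader);
        } catch (ClassNotFoundException e) {
            // Happens when the given loader cannot see the class.
            throw new TypeNotPresentException(name, e);
        }
    }

    public static void main(String[] args) {
        ClassLoader cl = CommonSuperSketch.class.getClassLoader();
        System.out.println(commonSuperClass("java.lang.Integer", "java.lang.Long", cl));
        System.out.println(commonSuperClass("java.lang.Integer", "java.lang.String", cl));
    }
}
```

    Everything here hinges on the loader being able to find both classes at transformation time. When it cannot, the lookup fails outright, which is the classloader problem described above.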

    How complex, you ask?
    1. The code to do the Data Flow Analysis for this is so complex that the ASM Framework has over 1,000 lines dedicated to just this, and it took 2 versions to get it right. But even then the fundamental approach is flawed due to the problem of trying to find the ‘common superclass’.

    2. As an example, here is what a Google search for Stack Map Frames shows for me:

    The top 10 results are pretty much all error messages. No mention of what stack map frames even are.

    3. Since the ‘common superclass’ approach used by the ASM Framework doesn’t really work, correct custom implementations are so complex and hard to get right that for Chronon we skipped it altogether and run the JVM with a special switch (described below) that falls back to the old verifier.

    The bullshit and the flaw

    According to the JVM spec, this new ‘feature’ of the JVM allows the bytecode verifier to go a tiny bit faster, because it doesn’t have to do all those type calculations itself; they are already done at compile time.

    The above rationale is so bad and full of bullshit that it deserves to be classified as a design flaw of the JVM. Here is why:

    1. The Bytecode Verifier was never really a source of performance problems. The Java verifier can be turned off entirely with a JVM switch, but most people don’t do it, because there is really no huge performance issue related to the verifier. Remember the verifier is run only once each time a class is loaded, and at that time it’s the I/O that takes up most of the time, not the verifier.
    2. Stack Maps don’t really get rid of the verifier. The verifier still exists and it still runs. Only a tiny portion of the verifier is now turned off.
    3. The verifier still has to have the code to run without stack maps. Remember older JVM versions didn’t require stack maps, and even with JVM 7 a special JVM flag allows users to switch back to the old verification method. Thus the code to do the verification without stack maps still exists, and the change hasn’t really cut down the complexity of the JVM for the JVM designers either. They have to maintain 2 versions of the verifier for pretty much forever. Nobody wins in this game.
    4. Pre Java 7, there was one implementation for checking the type of the variables and stack at every instruction, and it was in the JVM, written by hardcore compiler designers who know how to write efficient compilers. Now, that still needs to exist, but every author of a bytecode manipulation tool also has to write their own implementation, even though they are not compiler designers and it takes time away from writing the real application code. We end up with tons of different, inefficient, buggy implementations which do no good to the users, toolmakers or the Java ecosystem as a whole.
    5. The whole ‘faster’ verification stuff is hogwash. Today almost every single application runs with some form of bytecode instrumentation. If you use Spring AOP, EclipseLink or tools like JRebel, Chronon, YourKit, NewRelic or basically any popular Java tool or framework, your bytecode will be instrumented. Since the bytecode is instrumented (and sometimes multiple times), the stack maps will have to be generated during runtime anyway (defeating the whole purpose of the new verifier), and they will be generated by those buggy and highly inefficient generators written by people who are not compiler designers.

    Bottom line: The new verifier results in slower, more complex and buggier code.

    Mitigation

    The addition of Stack Map Frames only results in a net negative to the Java community and should be abolished.

    I propose the following:

    1. Always use the -XX:-UseSplitVerifier flag when running on JVM 7. It disables the new verifier, which is what requires Stack Map Frames.
    2. The ASM Framework needs to have another algorithm that allows simple additions within the scope of a single basic block and doesn’t require recalculation of the class hierarchy at runtime.
    3. Lastly, we should petition Oracle to make Stack Maps optional for Java 8 and above. This will preserve the code currently generated by the Java 7 compiler, but it will free the world from a completely broken design and an unnecessary bunch of complexity.
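    For the first point, a tool that depends on the old verifier could check at startup whether the flag was actually passed. The following is a minimal sketch under that assumption; the class name VerifierFlagCheck and the warning text are made up for illustration, and the check simply looks for the literal flag among the JVM’s input arguments.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical startup check: warns when the JVM was started without the
// flag that switches back to the old verifier.
final class VerifierFlagCheck {
    static final String FLAG = "-XX:-UseSplitVerifier";

    // Pure helper so the check can be exercised with any argument list.
    static boolean hasFlag(List<String> jvmArgs) {
        return jvmArgs.contains(FLAG);
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself (not the application's own arguments).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        if (!hasFlag(jvmArgs)) {
            System.err.println("Warning: " + FLAG + " is not set; instrumented classes "
                    + "without valid stack map frames may fail verification.");
        }
    }
}
```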

    Update
    Apparently Oracle is deprecating the -XX:-UseSplitVerifier flag in Java 8, which will make this workaround impossible and leave us forever stuck with stack map frames :(

    Comment on bug 8009595 to prevent Oracle from doing so.