Server Mode in the Chronon Recorder

Posted by Prashant Deva on July 21, 2011

    This week we released Chronon 1.5. The big feature of this release is the inclusion of ‘Server Mode’ in the Chronon Recorder.

    What is the Server Mode?

    The Server Mode is designed to allow the Chronon Recorder to be controlled through the Chronon Recording Server.

    It includes features such as:

    • Ability to do dynamic start/stop of the recorder in a running program
      The recorder can stay dormant in your program unless explicitly started from the Recording Server UI.
    • Ability to record long running programs.
    • Ability to split a recording based on a time interval or when the physical size of the recording gets too large.
    • Ability to dynamically modify the set of classes that are being recorded in a running program.
      Thus, you can start recording with, say, include=com.package1.** and later decide to record com.package2.**, all without the need to stop the program.
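
To make the idea of changing the recorded set at runtime concrete, here is a minimal sketch of an include filter that can be extended while other threads are querying it. The class name IncludeFilter, its methods, and the assumed meaning of the ".**" suffix (every class in the package and its subpackages) are illustrative assumptions, not the Recorder's actual API or pattern semantics.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical illustration only; this is not the Chronon Recorder's API.
final class IncludeFilter {
    // Thread-safe list, so patterns can be added while other threads query it.
    private final List<String> patterns = new CopyOnWriteArrayList<>();

    // Adds a pattern at runtime, without restarting anything.
    void include(String pattern) {
        patterns.add(pattern);
    }

    // True if any registered pattern covers the given fully qualified class name.
    boolean isRecorded(String className) {
        for (String pattern : patterns) {
            if (matches(pattern, className)) {
                return true;
            }
        }
        return false;
    }

    // Assumed semantics: "pkg.**" covers every class in pkg and its subpackages;
    // any other pattern must equal the class name exactly.
    static boolean matches(String pattern, String className) {
        if (pattern.endsWith(".**")) {
            // Drop the two asterisks but keep the trailing dot: "com.package1."
            String prefix = pattern.substring(0, pattern.length() - 2);
            return className.startsWith(prefix);
        }
        return pattern.equals(className);
    }
}
```

A caller would create one IncludeFilter, call include("com.package1.**") at startup, and call include("com.package2.**") later while the program keeps running.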

    Future directions

    With the addition of the Server mode in the recorder, we now have 2 distinct modes for the Recorder: 

    1. Developer Mode
    2. Server Mode

    The developer mode is the one you are probably familiar with as that is what is used when you record using the Chronon Eclipse plugin. It records the entire program from beginning to end and is meant for short running programs, as is common in development scenarios.

    Moving forward, we will probably optimize each of these 2 modes for its specific use case. There are a lot of optimizations that we want to put in the Recorder that will add a bit of overhead to the instrumentation time. While this is acceptable for long running server programs, it is not as useful if you are going to run your program for only a few minutes from within Eclipse.

    A good analogy is the server and client JVMs. While the client JVM is optimized for quick startup and does less optimization, the server JVM is meant for long running programs and does much more aggressive optimization.

    We will keep you posted on the specifics of how we proceed with putting optimizations/features in each of these modes of the Chronon Recorder.

     

     

    Is the traditional debugger still relevant in 2011?

    Posted by Prashant Deva on July 15, 2011

      The traditional debugger as we know it hasn’t changed since the dawn of programming, which is to say it has remained pretty much the same since the 1970s. Let’s take a deeper look at some of its fundamental design principles and whether they are still relevant in 2011.

      Traditional Debugger

      Design Principles

      The traditional debugger is designed around the idea that:

      1. Programs are single threaded
      2. Flow of execution is sequential
      3. Bugs are always reproducible
      4. Programs run for short periods of time

      Implementation

      Sequential flow of execution and single threaded-ness

      This principle is clearly reflected in the interface of the debugger which has the ‘stepping’ buttons which allow you to navigate the execution of your program sequentially. There is no well defined semantic for what happens when you say ‘step forward’ in one thread, with respect to all the other threads.

       

      [Screenshot: debugger stepping buttons]

      Reproducible bugs and short runs of a program

      The traditional debugger relies on the ‘breakpoint’ model which assumes that the person debugging has a well defined and fully reproducible set of actions. It also assumes that the program doesn’t run for very long; otherwise you would have to set a breakpoint and wait hours for it to hit.

      Not multithreaded by design

      Although most debuggers can stop and show you the stack frames of all the active threads when you hit a breakpoint, that is more of an evolution of the traditional design of just showing the stack frames of the sole thread which the program is assumed to be running on. The rest of the debugging elements are not designed around the fact that the program flow is not merely sequential and data is being modified by multiple threads.

      But we are in 2011…

      None of the assumptions of the traditional debugger hold true anymore in 2011:

      1. Almost all programs are multi threaded
      2. Flow of execution is not 100% sequential. Data can be modified by multiple threads at the same time.
      3. Bugs are becoming increasingly non reproducible due to race conditions and just the increasing complexity of programs.
      4. Programs run for days, months and even years on servers.

      Anybody who has had to debug a multi threaded program knows that merely showing the active stack frames does not help much in detecting race conditions. Not only that, but just breaking the program modifies the execution and timings of various threads leading to the bug becoming non reproducible while debugging.

      The ‘breakpoint’ model is broken since for long running server side programs you can’t realistically set a breakpoint and wait for days to hit the breakpoint, only to start all over again once you step over a line you didn’t intend to.

      And that leads us to Log files…

      The failure of the debugger to keep up with programs written in the 21st century has led to the rise of logging and huge log files.

      Logging is fundamentally broken by its very nature because:

      1. You are trying to predict, in advance, errors in your program that you don’t even know about.
      2. Since you usually put a logging statement where you think an error might occur, you have usually hardened the code around that area already. Thus in real world situations the program usually breaks where there wasn’t any log statement at all, because the programmer never thought he might encounter an error in that piece of code.
      3. Long running programs generate enormous log files and you usually have to write another set of programs just to parse through those log files.
      4. Writing logging statements is a distraction from programming and clutters the code.

      Thus the obsolescence of the traditional debugger has led to people coding their own custom debugging mechanisms for every program they write.

      Chronon – Reinventing the debugger for 2011

      When we started designing the Chronon Time Travelling Debugger, we built it with programs of the 21st century in mind. Our assumptions were:

      1. Programs are inherently multi-threaded
      2. Flow of execution is not entirely sequential. Data may be changed by other threads. Calls to a method may be interleaved across threads.
      3. Bugs are tough, if not impossible, to reproduce in a multithreaded world with race conditions.
      4. Programs run for (very) long periods of time.

      Implementation

      Record everything, no need to reproduce

      The Chronon Recorder records the entire execution of the Java program. The recording is then subsequently used to debug the program in the time travelling debugger. This ensures that no bugs need to be reproduced.

      No breakpoints, built for long running programs

      Chronon does away with the concept of breakpoints entirely. You can jump to any point in time of the execution of your program instantly. Thus you might have a recording that is 5 hrs long and maybe you want to get to an exception that was thrown after 4 hrs. Chronon allows you to jump to the exception instantly with a click of a button, instead of making you wait for 4 hrs like your traditional debugger would.

      We even came up with the Chronon Recording Server recently, which is specifically designed for long running programs. It takes care of splitting the recording after a predefined time interval or if the physical size of the recording gets too large.
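
The splitting rule described above (split on a time interval or on size, whichever comes first) can be sketched as a small policy object. This is only an illustration under assumed names; SplitPolicy, its fields and the thresholds passed to it are hypothetical and are not Recording Server settings.

```java
// Hypothetical sketch; names and thresholds are illustrative, not Chronon settings.
final class SplitPolicy {
    private final long maxAgeMillis;
    private final long maxSizeBytes;

    SplitPolicy(long maxAgeMillis, long maxSizeBytes) {
        this.maxAgeMillis = maxAgeMillis;
        this.maxSizeBytes = maxSizeBytes;
    }

    // A recording is split as soon as EITHER limit is reached:
    // it has been running for the configured interval, or it has grown too large on disk.
    boolean shouldSplit(long startedAtMillis, long nowMillis, long currentSizeBytes) {
        boolean tooOld = nowMillis - startedAtMillis >= maxAgeMillis;
        boolean tooBig = currentSizeBytes >= maxSizeBytes;
        return tooOld || tooBig;
    }
}
```

A periodic task could call shouldSplit with the current clock and the size of the recording on disk, and start a new recording whenever it returns true.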

      Embraces multi-threaded, non sequential nature of programs

      Although Chronon still has the stepping buttons, including a ‘step back’ button, to allow examining sequential execution of a single thread, the rest of the interface is designed with multithreaded, non-sequential execution in mind.

      [Screenshot: stepping buttons including ‘step back’]

      All the views like the Exceptions view, Variable History and Method History view show you data independently of threads. You can then proceed to filter them by the thread you want to examine.

      [Screenshots: Exceptions view, Method History view, Variable History view]

      Showing data independently of threads and then allowing you to jump to any point in time and examine the sequence that led to that particular state embraces both the multi threaded data manipulation as well as the single threaded sequential nature of the execution of the program.

      Conclusion

      The traditional debugger as we know it is not of much use in 2011. This is the reason people have resorted to the use of log files and custom debugging mechanisms. With Chronon, we have solved a lot of the issues with the traditional debugger and designed it to debug the way modern programs are written and used.

      We believe that in 2011, you should not need to litter your code with logging or any other kind of custom debugging mechanism. Our current product and upcoming enhancements are steps in that direction.

       

      Chronon Recording Server Architecture

      Posted by Prashant Deva on July 1, 2011
        [Figure: Recording Server architecture diagram]

        Here are some details on the architecture of the Chronon Recording Server which recently went into beta.

        As shown in the diagram above:

        Per Machine:

        Each machine being recorded can have a number of JVMs running. Each JVM has a recorder attached to it.

        Each machine also has a ‘controller’ service running on it.

        Controller:

        The controller is the heart of the communications mechanism in the Recording Server product. There is a controller service running on each machine which is controlled by the Recording Server web UI. The web UI talks to the controller, which in turn talks to any of the JVMs being recorded on that machine.

        Server + UI:

        The ‘server’ portion and the web based UI of the server sit on a separate machine and talk to the controller of each machine that is connected to the recording server.

        Design implications

        This design was chosen because:

        Performance
        The recordings are stored locally on the machines being recorded. This reduces network traffic.

        Fault Tolerance
        Any machine can go down without affecting any other machine.

        • If any of the machines being recorded goes down, it doesn’t affect communication with, or recordings on, any other machine.

        • If the Recording Server goes down, each of the machines being recorded continues doing what it was last directed to do, since the recorders are controlled by the Controller service.
          Thus all activity like flushing old recordings or splitting a recording after a time interval keeps happening as scheduled while the Recording Server is down.

        Method Size Limit in Java

        Posted by Prashant Deva on June 12, 2011

          Most people don’t know this, but you cannot have methods of unlimited size in Java.

          Java has a 64k limit on the size of a method’s compiled bytecode.

          What happens if I run into this limit?

          If you run into this limit, the Java compiler will complain with a message along the lines of "code too large".

          You can also run into this limit at runtime if you had an already large method, just below the 64k limit and some tool or library does bytecode instrumentation on that method, adding to the size of the method and thus making it go beyond the 64k limit. In this case you will get a java.lang.VerifyError at runtime.

          This is an issue we ran into with the Chronon recorder, where most large programs would have at least a few large methods, and adding instrumentation to them would cause them to blow past the 64k limit, thus causing a runtime error in the program.

          Before we look into how we went about solving this problem for Chronon, let’s look at the circumstances under which people write such large methods in the first place.

          Where do these large methods come from?

          • Code generators
            As it turns out, most humans don’t in fact write such gigantic methods. We found that most of these large methods were the result of code generators, e.g. the ANTLR parser generator generates some very large methods.

          • Initialization methods
            Initialization methods, especially GUI initialization methods, commonly do all the layout, listener attachment, etc. for every component in one large chunk of code, which results in a single large method.

          • Array initializers
            If you have a large array initialized in your code, e.g.:
            static final byte largeArray[] = {10, 20, 30, 40, 50, 60, 70, 80, …};
            that is translated by the compiler into a method (for a static field like this one, the class’s static initializer) which uses load/store instructions to initialize the array. Thus an array that is too large can cause this error too, which may seem very mysterious to those who don’t know about this limit.

          • Long JSP pages
            Since most JSP compilers put all the JSP code in one method, large JSP pages can make you run into these errors too.

          Of course, these are only a few common cases; there can be many other reasons why your method size is too large.

          How do we get around this issue?

          If you get this error at compile time, it is usually trivial to split your code into multiple methods. It may be a bit harder when the method limit is reached due to automated code generation, as with ANTLR or JSPs, but usually even these tools have provisions that allow you to split the code into chunks, e.g. jsp:include in the case of JSPs.

          Where things get hairy is the second case I talked about earlier, which is when bytecode instrumentation causes the size of your methods to go beyond the 64k limit, which results in a runtime error. Of course you can still look at the method which is causing the issue, and go back and split it. However, this may not be possible if the method is inside a third party library.

          Thus, for the Chronon recorder at least, the way we fixed it was to instrument the method, and then check the method’s size after instrumentation. If the size is above the 64k limit, we go back and ‘deinstrument’ the method, thus essentially excluding it from recording. Since both our Recorder and Time Travelling Debugger are already built from the ground up to deal with excluded code, it wasn’t an issue while recording or debugging the rest of the code.
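
The ‘instrument, then check, then fall back’ approach can be sketched in a few lines. This is a simplified illustration, not the recorder’s implementation: SizeGuard and instrumentOrExclude are hypothetical names, the method body is modelled as a plain byte array, and the instrument function stands in for a real bytecode transformer. The 65535 constant reflects the class file format’s rule that a method’s code array must be shorter than 65536 bytes.

```java
import java.util.function.UnaryOperator;

// Hypothetical sketch of "instrument, check size, fall back to the original".
final class SizeGuard {
    // The class file format requires a method's code array to be shorter than 65536 bytes.
    static final int MAX_CODE_BYTES = 65535;

    // 'instrument' is a stand-in for a real bytecode transformer.
    // Returns the instrumented code if it still fits, otherwise the untouched original,
    // which effectively excludes that method from recording.
    static byte[] instrumentOrExclude(byte[] originalCode, UnaryOperator<byte[]> instrument) {
        byte[] instrumented = instrument.apply(originalCode);
        if (instrumented.length <= MAX_CODE_BYTES) {
            return instrumented;
        }
        return originalCode;
    }
}
```

The design choice here is that an oversized method costs only its own recording: the rest of the class is still instrumented, and the program never sees a method that breaks the limit.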

          That said, the method size limit of 64k is too small and not needed in a world of 64 bit machines. I would urge everyone reading this to go vote on this JVM bug so that this issue can be resolved in some future version of the JVM.

          Time inside a Time Travelling Debugger

          Posted by Prashant Deva on May 27, 2011

            When we were developing Chronon and started using it ourselves, we realized something very intriguing. You see, the various views of Chronon allow you to step not only forward and backward but to any random point in time. For example, using the Variable History view, you can instantly jump to when a variable became ‘null’ or use one of the powerful filters in method history view to jump directly to a particular call of a method.

            The problem

            Since you are not just stepping forward, it is easy to get lost in time.

            For example:

            1. How does one event relate to the other, did it happen before or after the other?
            2. Did I just jump forward or backward when I clicked in the variable history view?
            3. If I did jump backward/forward, by how much did I jump?
            4. Where am I in the execution of my program? Am I near the beginning, the middle or the end?

            The Solution

            Imagine you are a real world time traveler. What is the most important tool in your arsenal?

            A clock.

            We needed some sort of a clock inside Chronon to solve all the above issues.
            Thus we invented the concept of ‘time’.

            Time inside Chronon
            ‘Time’ inside Chronon does not stand for real-world clock time. After all, how precisely can you really measure the time interval between, say, 2 variable assignments? And even if you could, looking at the system clock each time your program executed an instruction would result in a huge performance drain.

             

            Thus ‘time’ in Chronon is merely an application wide counter. It has no relation to clock time. Its sole purpose is to give an ordering to events recorded in a program.
            The only thing you know when you look at a time value is:
            An event at, say, time 5 occurred after any event at time < 5, i.e. 4, 3, … and before any event at time > 5, i.e. 6, 7, …

             

            It is not guaranteed that the next time value after 5 would be 6 (though it is in 99% of cases), it could be 7,8 or any time > 5.
            All you know about time values is that they are ascending in order, thus putting an ordering on events in your application.
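
As a rough illustration of such a counter, the sketch below hands out ascending values from a single application-wide AtomicLong, so that events from any thread get a total order. LogicalClock and its methods are hypothetical names chosen for this example; the recorder’s real counter may work quite differently.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of an application-wide event counter; not Chronon's implementation.
final class LogicalClock {
    private final AtomicLong counter = new AtomicLong();

    // Returns a value strictly greater than every value returned before,
    // no matter which thread asks. It has no relation to wall-clock time.
    long next() {
        return counter.incrementAndGet();
    }

    // The only thing two time values tell you: which event came first.
    static boolean happenedBefore(long timeA, long timeB) {
        return timeA < timeB;
    }
}
```

Each recorded event would be stamped with next(), and two events can then be compared with happenedBefore without ever consulting the system clock.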

             

            Thus all the views in Chronon, like the Variable History, Method History, Thrown Exceptions, etc have a time value so you can see how the events in each relate to the other.

             

            Timeline View
            The ‘Timeline View’ was added to quite literally serve the role of a clock inside Chronon.
            [Screenshot: Timeline view]
            • It literally shows the current time value.
            • The progress bar gives you an idea of how far down the execution of the program you are.
              If the bar is completely filled, you are near the end; if it’s almost empty, you are near the beginning.
            • We also added ‘time bookmarks’ in this view which act as a checkpoint mechanism for anything interesting you might want to return to in the future.

             

             

            Chronon release 1.2

            Posted by Prashant Deva on May 20, 2011

              It has barely been a week since we released update 1.1.1 and we are back again with another update full of more goodies.

              Support for Reflection in the recorder

              The Chronon recorder will now recognize updates to the fields of your object done using the Java Reflection APIs.
              This is especially useful if you use ORM frameworks like Hibernate which use reflection to set the fields of the Java objects. No longer will you see ‘null’ in those fields, but the actual values.
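
For readers unfamiliar with this kind of update, here is a small example of a private field being set through the Java Reflection API, much as an ORM might do when populating an object. The classes ReflectionUpdate and Customer and the helper setField are invented for this illustration; only the java.lang.reflect calls are standard.

```java
import java.lang.reflect.Field;

// Illustrative example of a field update that bypasses normal assignment.
final class ReflectionUpdate {
    static final class Customer {
        private String name; // never assigned directly anywhere in this example

        String getName() {
            return name;
        }
    }

    // Sets a declared field by name, even if it is private.
    static void setField(Object target, String fieldName, Object value) {
        try {
            Field field = target.getClass().getDeclaredField(fieldName);
            field.setAccessible(true); // allow access to the private field
            field.set(target, value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not set field " + fieldName, e);
        }
    }
}
```

There is no ordinary field assignment anywhere in this code, which is the kind of update the reflection support described above is meant to pick up.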

              ‘Copy Value’ for variables in the Debugger

              [Screenshots: ‘Copy Value’ in the Locals, Variable History and Current Line views]

              If the value of a variable has a string that is too large to fit in the Eclipse view, you can right-click and select ‘Copy Value’ to copy a fully formatted version of the string to the clipboard.
              This functionality is supported for all views that show the value of a variable, i.e. the Locals view, Variable History view and Current Line view.

              And of course, lots more bugfixes in the recorder and debugger. If you were running into deadlocks while recording before, you shouldn’t anymore.

              So go ahead and update your Chronon installation!

              Chronon release 1.1.1

              Posted by Prashant Deva on May 14, 2011

                This update brings a ton of improvements. I will list some of the major ones here:

                Support for applications with huge number of threads

                Until now, you could create only 1024 different threads in your application, after which Chronon would throw an exception.
                With this release, if your application is going to create more threads, you can specify that in the recorder config file, e.g.:

                maxrecordedthreads = 3000

                Of course, you can set this option from within Eclipse too.

                [Screenshots: Chronon preferences and launch configuration in Eclipse]

                Note that the number of threads here does not mean ‘the number of threads active at a certain point in time’. It means ‘the total number of threads created during the lifetime of your application’. So if your application frequently creates and destroys new threads instead of using a thread pool, this might be useful to you.
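
The difference between ‘threads created’ and ‘threads active’ is easy to see in a small program. The sketch below (ThreadCount and runSequentially are made-up names for this example) starts a fresh thread for every task and waits for it to finish, so many threads are created over the program’s lifetime although only one extra thread is ever alive at a time.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative example: total threads created versus threads alive at once.
final class ThreadCount {
    // Runs 'tasks' short tasks one after another, each on a brand-new thread.
    // Returns {total threads created, peak number of task threads alive at once}.
    static int[] runSequentially(int tasks) {
        AtomicInteger active = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        int created = 0;
        try {
            for (int i = 0; i < tasks; i++) {
                Thread t = new Thread(() -> {
                    int now = active.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max); // remember the highest count seen
                    active.decrementAndGet();
                });
                created++;
                t.start();
                t.join(); // wait, so the next thread starts only after this one ends
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for a task", e);
        }
        return new int[] {created, peak.get()};
    }
}
```

It is the first of the two returned numbers, the lifetime total, that the maxrecordedthreads option is about.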

                Much faster stack traces in the Stack view

                The Stack view can now create stack traces much faster and won’t crash if the stack trace at a point is extremely deep.

                Recorder no longer deadlocks during shutdown

                This was a problem for some people who had redirected System.out to a custom logging class which they were also recording; this would cause a deadlock in the recorder. No more. We now print the shutdown messages from a different thread. During shutdown, Chronon locks up the rest of the threads while it is persisting data, so if you have System.out redirected to a custom logging class, the printer thread may be blocked while the persistence is taking place. That only means the messages to the console will appear a bit delayed. The recorder will still complete, in the same amount of time as before, but it won’t deadlock.

                We recommend everyone to update their Chronon installation.

                Misconceptions regarding Java heap size arguments

                Posted by Prashant Deva on May 3, 2011

                  There seem to be some misconceptions regarding the Java Heap size arguments.

                  For those who don’t know, Java is peculiar in the sense that you have to specify the amount of memory you want your program to be able to use before you run it. If you fail to do so then, depending on the version and implementation of the JVM, your program will only be allowed to use a fraction of your computer’s total RAM. This is the reason why you might get an OutOfMemoryError even if your machine has 24 GB of RAM and you know your Java program needs far less than that.

                  The way to mitigate this is to set the -Xms and -Xmx parameters when launching your Java program to explicitly set the initial and maximum amount of memory your Java program is allowed to use.

                  Note the use of the word ‘allowed’ in my previous statement. This is the misconception I am talking about. Let’s say you start Eclipse with a setting of -Xmx6g. This does not mean that Eclipse will magically start allocating more memory. All it means is that you have allowed Eclipse to use 6 GB of memory, and that if Eclipse does need that extra memory, perhaps only for a short time, it will be able to use it and won’t crash with an OutOfMemoryError.
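
You can see the difference between ‘allowed’ and ‘actually used’ from inside any Java program with the standard Runtime methods. The class HeapReport below is just an example wrapper written for this post; maxMemory, totalMemory and freeMemory are the real java.lang.Runtime calls.

```java
// Small example: the heap the JVM is ALLOWED to use versus what it uses right now.
final class HeapReport {
    // Upper bound the heap may grow to (what -Xmx controls).
    static long maxAllowedBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // Memory the JVM has currently reserved for the heap.
    static long reservedBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    // Approximate memory occupied by objects at this moment.
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        System.out.println("Allowed (max): " + maxAllowedBytes() / mb + " MB");
        System.out.println("Reserved now:  " + reservedBytes() / mb + " MB");
        System.out.println("Used now:      " + usedBytes() / mb + " MB");
    }
}
```

Running this with different -Xmx values changes the first number, while the ‘used’ number depends only on what the program actually allocates.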

                  The need to predefine the amount of memory usage allowed was a big problem with Chronon, since Chronon does make good use of the memory on your system and gets a huge performance boost the more memory you give it. However, since Eclipse by default only sets a -Xmx value of 384m, we would get a lot of complaints from people saying ‘I have 4 gigs of ram on my machine, why is Chronon still running dog slow’. And the fix would always involve setting -Xmx and -Xms to a higher value.

                  So we decided to use some of the functionality of Eclipse p2, and now when you install the Chronon Time Travelling Debugger Eclipse plugin, we automatically increase -Xms and -Xmx to a much higher value in your eclipse.ini file. This has put an end to the performance complaints, and I bet it has solved a lot of non Chronon related performance problems for some Eclipse users who are now actually able to use their machines to their full potential when running Eclipse.

                  Of course, this has angered some people who tend to freak out when they see the heap size in their Eclipse status bar set to a high value. "What, my eclipse is suddenly taking more memory!" Thus, the point of this post is to make it clear that Eclipse is not magically using more memory when you install Chronon; you are only allowing it to use all the memory of your system, which you paid for with your hard earned cash.

                  Chronon bugfix update 1.1.0.144

                  Posted by Prashant Deva on April 28, 2011

                    We have updated Chronon with some bugfixes related to recording JEE servers from within Eclipse.
                    If you were seeing an exception when trying to record a server, you shouldn’t see it anymore.

                    Everyone please update using the update instructions here.

                     

                    Chronon full version released with Eclipse Java EE support

                    Posted by Prashant Deva on April 25, 2011

                      4 years ago, when I was still in college, every time I would encounter a bug in my program it would be a nightmare. Debugging would involve many, many hours of trying to reproduce the bug and fiddling with breakpoints. Of course, we all know how tough it is to reproduce a bug when it occurs in production, but even while developing it can be a nightmare when you have to go through a whole series of steps that takes a full 5 minutes before you can run your program to the point where it hits the breakpoint where you think the problem might be. Of course, like everyone else on the planet, I would never, ever get that breakpoint right the first time. Thus would begin my frustrating journey as I would keep fiddling with my breakpoints, trying to get the program to break at the exact moment before the error occurred. And of course most of the time I would end up stepping over that erroneous piece of code and end up cringing: if only I could step back! And god forbid, if that error was in a multithreaded program I really wanted to step back, because I might never be able to reproduce the error again.

                      If only I could step back….

                      With the release of the full version of Chronon today, I think it’s safe to say this marks the beginning of a new era for Java and programming in general. No longer is the thought of debugging scary. No longer do we have to worry about reproducing bugs. No longer do we have to wish we could step back. We can.

                      What initially began a few years ago as an idea for a debugger with a ‘step back’ button has morphed into something much greater. We now have a full ‘flight data recorder’ for Java, which can produce recordings to disk, eliminating the need to reproduce bugs entirely. When we started working with the data from the recording, we realized we could make something much bigger than just a ‘step back’ button. The result is all the amazing functionality that the Chronon debugger provides. In fact, once you start using it you realize that you rarely ever need to use any of the stepping buttons at all!

                      Pricing

                      We really do want every Java developer using Chronon technology to record and debug their applications, thus ending the nightmare of debugging entirely. To that end, instead of pricing Chronon so high that the only way to buy it would be to get approval from 5 layers of management and involve tons of sales people, we have decided to make it so affordable that you don’t have to think twice about purchasing.

                      Currently our most expensive plan costs only $35/month (price of a good dinner) and it goes down to just $10/month (price of a movie ticket). We have also decided to offer the product as a subscription. This is another way to keep our prices down, since we don’t have to charge you a large fee for a perpetual license. It also means we don’t have to stall bugfixes and upgrades to entice you to buy the next version. As long as you have an active subscription you can download all the updates for free.

                      What’s new

                      A big part of this release is the new Eclipse Java EE integration. You can now record your Tomcat/JBoss/any other app server right from within Eclipse. Below are some of the screenshots of the functionality:

                      [Screenshots: Eclipse Java EE integration]

                      We have even published a video demoing the Chronon integration with Eclipse Java EE. You can even read about the integration features here.

                      Thanks to the feedback of a huge number of beta testers, we have been able to fix an enormous number of bugs and improve the product a lot. If you used a beta release, you should definitely try the current release; you will have an entirely different, much smoother experience. I will be describing some of the improvements in technical detail in the next few blog posts.

                      That said, we have a lot more in store. If you think the current Chronon is cool, wait till you see what we come up with in the next few versions. We will keep improving the product very frequently in the weeks to come. With our subscription model, you should be getting them as soon as they are available. So go ahead, download Chronon and give it a try with the free 30 day evaluation.

                       
