16 April 2009

Emma - Java Code Coverage Made Easy (Part 3)

Before reading this post, I would suggest going through the previous posts (Part 1 and Part 2). This post can be read on its own, but the earlier ones give the full context of code coverage. The first post covered the importance of code coverage, and the second covered Emma's on-the-fly instrumentation. Before dealing with other ways of using Emma, I would like to give an overview of Emma in this post. The previous posts were meant to help you get off the ground; this one will help you understand Emma itself. It is predominantly theoretical, which is needed to understand how tools are written for Java in general and how Emma is implemented in particular. With the information given here, you will be able to decide how to use Emma effectively.

In the world of Java, you can write tools in whichever way you want. However, Java gives you at least two standard ways of writing tools, both defined as specifications. The first is about "monitoring and managing" runtime behavior. The Java Platform Debug Architecture (JPDA), with its Java Debug Wire Protocol (JDWP), Java Virtual Machine Tool Interface (JVM TI) and Java Debug Interface (JDI), primarily serves this purpose. The concept can be put simply as "watching what the JVM does and taking actions based on JVM behavior". The tool listens to what the JVM is doing through events and implements its own logic in response. The main disadvantage of this approach is the overhead of receiving and processing the events.

The second standard way is to add the logic to the class files themselves. Let us consider code coverage. The objective of any code coverage tool is to measure the coverage line by line. In order to do this, the coverage tool injects the code coverage logic into all class files. Whenever a class is loaded and executed, the code coverage logic runs along with the application logic and records the coverage metrics. This is called bytecode instrumentation. Again, there are two ways of doing this: the first is to modify the application's class files on the hard disk (or any secondary storage) ahead of time, and the second is to add the bytecode on the fly as the classes are loaded.
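To make the idea of injected coverage logic concrete, here is a small hand-written sketch of what a class might look like after instrumentation. It is written as Java source only for readability; a real tool does this at the bytecode level. All the names (CoverageStore, Grader, GRADE_BLOCKS) are made up for this illustration, and this is not the code that Emma actually generates.

```java
import java.util.Arrays;

// Hypothetical names throughout: this mimics by hand what an instrumenter
// could inject into a compiled class. It is not Emma's generated code.
final class CoverageStore {
    // One flag per instrumented block of Grader.grade.
    static final boolean[] GRADE_BLOCKS = new boolean[3];

    // Number of blocks that have been executed at least once.
    static int coveredCount() {
        int n = 0;
        for (boolean hit : GRADE_BLOCKS) {
            if (hit) n++;
        }
        return n;
    }

    static void reset() {
        Arrays.fill(GRADE_BLOCKS, false);
    }
}

final class Grader {
    static String grade(int score) {
        CoverageStore.GRADE_BLOCKS[0] = true;     // injected: method entry
        if (score >= 50) {
            CoverageStore.GRADE_BLOCKS[1] = true; // injected: "pass" branch
            return "pass";
        }
        CoverageStore.GRADE_BLOCKS[2] = true;     // injected: "fail" branch
        return "fail";
    }
}

class InstrumentedDemo {
    public static void main(String[] args) {
        Grader.grade(80); // only exercises the "pass" branch
        System.out.println(CoverageStore.coveredCount() + " of "
                + CoverageStore.GRADE_BLOCKS.length + " blocks covered");
        // prints "2 of 3 blocks covered"
    }
}
```

The application logic of grade is unchanged; the extra assignments only record which blocks ran, which is the overhead you pay for the measurement.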

Emma exploits both methods, so it lets users measure code coverage with either offline or on-the-fly instrumentation. Note that the way coverage is measured does not differ between the two modes; only the method used to inject the instrumentation bytecode differs, as described above.
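As a rough illustration of the on-the-fly variant, the sketch below shows a class loader that passes class bytes through a hook before the JVM defines the class. The names (InstrumentingLoader, instrument, Target, LoadTimeDemo) are hypothetical, the hook returns the bytes unchanged instead of rewriting them, and this is not Emma's own loader. It assumes Java 9 or later (for readAllBytes) and that the compiled .class file is reachable on the classpath.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical class standing in for application code.
class Target {
    static String hello() { return "hello"; }
}

// A minimal sketch of load-time instrumentation, not Emma's actual loader.
class InstrumentingLoader extends ClassLoader {
    private final String targetName;
    int instrumentedClasses = 0;

    InstrumentingLoader(ClassLoader parent, String targetName) {
        super(parent);
        this.targetName = targetName;
    }

    // Placeholder hook: a real tool would rewrite the bytecode here.
    // This sketch returns the bytes unchanged and only counts the call.
    protected byte[] instrument(String name, byte[] original) {
        instrumentedClasses++;
        return original;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        if (!name.equals(targetName)) {
            return super.loadClass(name, resolve); // delegate everything else
        }
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                byte[] bytes = instrument(name, readClassBytes(name));
                c = defineClass(name, bytes, 0, bytes.length);
            }
            if (resolve) resolveClass(c);
            return c;
        }
    }

    // Reads the untouched class file through the parent loader.
    private byte[] readClassBytes(String name) throws ClassNotFoundException {
        String resource = name.replace('.', '/') + ".class";
        try (InputStream in = getParent().getResourceAsStream(resource)) {
            if (in == null) throw new ClassNotFoundException(name);
            return in.readAllBytes();
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }
}

class LoadTimeDemo {
    // Wraps the checked exception so callers stay simple.
    static Class<?> load(InstrumentingLoader loader, String name) {
        try {
            return loader.loadClass(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String name = Target.class.getName();
        InstrumentingLoader loader =
                new InstrumentingLoader(Target.class.getClassLoader(), name);
        Class<?> c = load(loader, name);
        System.out.println("defined by sketch loader: " + (c.getClassLoader() == loader));
        System.out.println("classes passed through hook: " + loader.instrumentedClasses);
    }
}
```

The class file on disk is never touched here; the bytes are intercepted only in memory at load time. Offline instrumentation would apply the same kind of rewrite to the files themselves before the application starts.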

Emma employs on-the-fly instrumentation if you have integrated Emma with IDEs. The command line version of Emma can run in both offline and on-the-fly instrumentation modes. Most of the time, developers and testers use on-the-fly instrumentation, since there is no risk of changing the application's class files on disk when the bytecode is instrumented as classes are loaded. But life is not as easy as it seems. With certain application servers or servlet containers, you cannot use on-the-fly instrumentation because they have custom classloaders. Here, offline instrumentation is the only option. So, before using Emma, you may need to study your application a little and choose the suitable method.

I have also put together a short interactive Flash presentation on Emma, made with Wink. It can be accessed through this link.