Wednesday, February 10, 2010

Say goodbye to awful concurrency bugs -- Showcase of MulticoreSDK on Derby

In my last blog post, I illustrated one of the most notorious kinds of concurrency bug, the deadlock, and showed how MulticoreSDK can find deadlocks without reproducing them. The sample I gave was the classic dining philosophers problem. To verify how effective the tool is, I wanted to apply MulticoreSDK to a real-world application and find a real deadlock.

Eventually I found a real deadlock reported against Derby, an open source relational database implemented in Java. This case became one of our benchmarks for verifying the effectiveness of the MulticoreSDK deadlock detector.

I downloaded the driver program BlobDeadlock.java and the buggy Derby version, then applied MulticoreSDK to the deadlock case with the following steps:

  1. Download MulticoreSDK from its website and install it following the user manual. Suppose MulticoreSDK is extracted under {msdk-cmd}.
  2. Open the KingProperties file in the props folder and set the preference targetClasses = org/apache/derby to instrument and monitor all classes in Derby.
  3. Compile the driver program:
$ mkdir bin && javac -d bin BlobDeadlock.java
  4. Run the driver program with MulticoreSDK (no real deadlock occurs during this execution):
$ java -Dcontest.preferences={msdk-cmd}/prop/KingProperties -javaagent:{msdk-cmd}/lib/ConTest.jar -cp .:bin:derby.jar BlobDeadlock
  5. Run the post analysis against the trace file:
$ java -ea -cp ConTest.jar com.ibm.contest.lock_dis_checker.Main .


Surprisingly, the post analysis found no deadlock cycle. I first checked the trace file generated in step 4 and the threaddump.txt file, which indicates where the deadlock happens. According to threaddump.txt, one of the two threads involved in the deadlock is waiting to acquire a lock at java.util.Observable.deleteObserver(Observable.java:78). I realized that in step 2 we had not asked for Java core classes such as java.util.Observable to be instrumented, so the locks taken in the Observable class were not recorded in the trace file. That is probably why MulticoreSDK did not report the deadlock in Derby.
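To see why an uninstrumented Observable can hide a lock, here is a minimal sketch in plain Java. It relies only on the fact that java.util.Observable.deleteObserver is a synchronized method, so calling it on a subclass instance takes that instance's monitor. The Handle class is a hypothetical stand-in for a Derby class that extends Observable; it is not Derby code, and the 500 ms wait is an arbitrary choice for the demonstration.

```java
import java.util.Observable;
import java.util.Observer;

@SuppressWarnings("deprecation") // Observable is deprecated since Java 9 but still available
public class ObservableMonitorDemo {
    // Hypothetical stand-in for an application class that extends Observable.
    static class Handle extends Observable { }

    // Returns true if deleteObserver() on the handle cannot complete
    // while another thread holds the handle's monitor.
    public static boolean deleteObserverBlocksWhileMonitorHeld() {
        final Handle handle = new Handle();
        final Observer noop = (source, arg) -> { };
        Thread other = new Thread(() -> handle.deleteObserver(noop));
        boolean blocked = false;
        try {
            synchronized (handle) {
                other.start();
                // join() waits on the thread object, so we keep holding handle's monitor.
                other.join(500);
                blocked = other.isAlive();
            }
            // The monitor is released here, so the other thread can finish.
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return blocked;
    }

    public static void main(String[] args) {
        System.out.println("blocked while monitor held: "
                + deleteObserverBlocksWhileMonitorHeld());
    }
}
```

The lock is acquired inside code that belongs to java.util, not to the application package, so a tracer that only instruments application classes would plausibly miss this acquisition.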

I took the following additional steps to instrument the Observable class:

  1. Open the KingProperties file in the props folder and set the preference targetClasses = java/util/Observable.
  2. Instrument the Observable class offline, since the JVM does not give you a chance to instrument preloaded Java core classes at run time:
$ java -cp {msdk-cmd}/lib/ConTest.jar:{$JAVA_HOME/jre/lib} com.ibm.contest.instrumentation.Instrument java.util.jar
After that, I applied MulticoreSDK to the deadlock case again, starting from step 2 above. The deadlock analysis result is shown below.
Listing 1. Potential Deadlocks Results from Derby
Deadlock Cycle 1: [666, 315]
#315->#666 #666->#315
edge #315->#666 consists of:
Thread [java.lang.Thread@1909682643]: lock taken at [java/util/Observable.java:78 deleteObserver(java.util.Observer) org.apache.derby.impl.store.raw.data.BaseContainerHandle@840] inside a different lock taken at [org/apache/derby/impl/store/raw/data/BasePage.java:1720 releaseExclusive() org.apache.derby.impl.store.raw.data.StoredPage@487]
edge #666->#315 consists of:
Thread [java.lang.Thread@1915449899]: lock taken at [org/apache/derby/impl/store/raw/data/BasePage.java:1334 isLatched() org.apache.derby.impl.store.raw.data.StoredPage@487] inside a different lock taken at [org/apache/derby/impl/store/raw/data/BaseContainerHandle.java:408 close() org.apache.derby.impl.store.raw.data.BaseContainerHandle@840]
===================================================
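The two edges in Listing 1 describe the same pair of monitors taken in opposite orders by two threads. The sketch below reproduces that pattern in a few lines of Java. The page and handle objects are hypothetical stand-ins for the StoredPage and BaseContainerHandle instances named in the listing, and the threads are deliberately run one after the other so the program always terminates, much as my Derby run did.

```java
public class InvertedLockOrder {
    // Hypothetical stand-ins for the two monitors named in Listing 1.
    private static final Object page = new Object();
    private static final Object handle = new Object();

    // One thread takes the page monitor first, then the handle monitor.
    static void pageThenHandle(StringBuilder log) {
        synchronized (page) {
            synchronized (handle) {
                log.append("page->handle;");
            }
        }
    }

    // The other thread takes the same monitors in the opposite order.
    static void handleThenPage(StringBuilder log) {
        synchronized (handle) {
            synchronized (page) {
                log.append("handle->page;");
            }
        }
    }

    // Runs the two threads strictly one after the other, so no deadlock can occur.
    public static String runSequentially() {
        StringBuilder log = new StringBuilder();
        Thread a = new Thread(() -> pageThenHandle(log));
        Thread b = new Thread(() -> handleThenPage(log));
        try {
            a.start();
            a.join();
            b.start();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(runSequentially()); // prints "page->handle;handle->page;"
    }
}
```

If the two threads overlapped at the wrong moment, each could hold one monitor while waiting for the other. The sequential run never hits that window, yet both acquisition orders are still visible in the execution.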

MulticoreSDK now reports the same deadlock as the real deadlock case, even though the deadlock never actually surfaced during my run :)
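One common way to report a deadlock that never surfaced is to record the order in which locks are nested and then look for a cycle in the resulting graph. The sketch below illustrates that general idea only; it is not MulticoreSDK's implementation, and the LockOrderGraph class and its methods are names I made up. The node ids 315 and 666 are copied from Listing 1.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class LockOrderGraph {
    // outer lock id -> ids of locks acquired while the outer lock was held
    private final Map<Integer, Set<Integer>> edges = new HashMap<>();

    // Record that some thread acquired 'inner' while already holding 'outer'.
    public void addEdge(int outer, int inner) {
        edges.computeIfAbsent(outer, k -> new HashSet<>()).add(inner);
    }

    // Depth-first search for any cycle in the lock-order graph.
    public boolean hasCycle() {
        Set<Integer> done = new HashSet<>();
        Set<Integer> onPath = new HashSet<>();
        for (Integer node : edges.keySet()) {
            if (visit(node, done, onPath)) {
                return true;
            }
        }
        return false;
    }

    private boolean visit(Integer node, Set<Integer> done, Set<Integer> onPath) {
        if (onPath.contains(node)) {
            return true; // reached a node already on the current path: cycle
        }
        if (done.contains(node)) {
            return false; // fully explored earlier, no cycle through it
        }
        onPath.add(node);
        for (Integer next : edges.getOrDefault(node, Collections.emptySet())) {
            if (visit(next, done, onPath)) {
                return true;
            }
        }
        onPath.remove(node);
        done.add(node);
        return false;
    }

    // Builds the two edges from Listing 1 and checks for a cycle.
    public static boolean listingOneHasCycle() {
        LockOrderGraph graph = new LockOrderGraph();
        graph.addEdge(315, 666);
        graph.addEdge(666, 315);
        return graph.hasCycle();
    }

    public static void main(String[] args) {
        System.out.println("cycle found: " + listingOneHasCycle()); // prints "cycle found: true"
    }
}
```

A cycle is only a potential deadlock. A real detector would also need to confirm that the edges come from different threads, as they do in Listing 1, which this sketch leaves out.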

MulticoreSDK Tool Link
http://www.alphaworks.ibm.com/tech/msdk
