Java/Profiling
From charlesreid1
Contents
Methods
There are various methods for profiling Java code. Some are covered below.
JIP: Java Interactive Profiler
The Java interactive profiler is covered here: http://confluence.concord.org/display/CCTR/attachments/15911/1574.pdf
This is a project available on sourceforge: http://jiprof.sourceforge.net/
Instructions
Start by downloading the source code for the project.
Look in that directory for a folder called profile/
. Look in that folder for a Jar file called profile.jar
. This jar file contains everything you need to profile your program with the JIP.
Store this Jar file somewhere easy to find, as you will be pointing the Java executable to the Jar file when you run the program that you are profiling.
Running Java with JIP
When you run Java, pass the -javaagent
flag, and the location of profile.jar:
$ java -javaagent:/Volumes/noospace/Users/charles/Downloads/jip-src-1.2/profile/profile.jar NQueens objc[21690]: Class JavaLaunchHelper is implemented in both /Library/Java/JavaVirtualMachines/jdk1.8.0_102.jdk/Contents/Home/bin/java and /Library/Java/JavaVirtualMachines/jdk1.8.0_102.jdk/Contents/Home/jre/lib/libinstrument.dylib. One of the two will be used. Which one is undefined. profiler: on remote: off port: 15599 thread-depth: -1 thread.compact.threshold.ms: 10 max-method-count: -1 method.compact.threshold.ms: 10 file: profile.txt track.object.alloc: off output: text debug: off profiler-class: com.mentorgen.tools.profile.runtime.Profile output-method-signatures: no clock-resolution: ms output-summary-only: no exclude:null Accept ClassLoader: sun.misc.Launcher$AppClassLoader ClassLoaderFilter.1: null Using the generic class loader filter. Java Interactive Profiler: starting ------------------ Found 14200 solutions Controller -- shuttingdown
This creates the file profile.txt
, which provides an excellent breakdown of what amount of time was spent in which function calls, from the highest level of the program down to the lowest level.
Example JIP Output
The JIP will create an output file in profile.txt
. Here is example output for a Java N queens problem solution:
+---------------------------------------------------------------------- | File: profile.txt | Date: 2017.03.18 23:40:36 PM +---------------------------------------------------------------------- +------------------------------ | Thread depth limit: Unlimited +------------------------------ +------------------------------ | Thread: 1 +------------------------------ Time Percent ----------------- --------------- Count Total Net Total Net Location ===== ===== === ===== === ========= 1 19491.8 98.1 100.0 0.5 +--NQueens:main () 1 0.1 0.1 0.0 | +--SolutionSaver:<init> () 1 4.3 4.3 0.0 | +--Board:<init> () 1 19389.2 0.6 99.5 | +--NQueens:explore () 1 0.0 0.0 0.0 | | +--Board:size () 1 0.3 0.3 0.0 | | +--Board:legalRows () 14 0.1 0.1 0.0 | | | +--Board:size () 12 0.0 0.0 0.0 | | +--Board:choose () 12 19388.1 0.4 99.5 | | +--NQueens:explore () 12 0.0 0.0 0.0 | | | +--Board:size () 12 0.4 0.3 0.0 | | | +--Board:legalRows () 191 0.1 0.1 0.0 | | | | +--Board:size () 110 0.0 0.0 0.0 | | | +--Board:choose () 110 19387.2 2.3 99.5 | | | +--NQueens:explore () 110 0.0 0.0 0.0 | | | | +--Board:size () 110 1.5 1.2 0.0 | | | | +--Board:legalRows () 1951 0.4 0.4 0.0 | | | | | +--Board:size () 756 0.1 0.1 0.0 | | | | +--Board:choose () 756 19383.0 9.4 99.4 | | | | +--NQueens:explore () 756 0.1 0.1 0.0 | | | | | +--Board:size () 756 10.6 8.0 0.1 | | | | | +--Board:legalRows () 14717 2.6 2.6 0.0 | | | | | | +--Board:size () 4080 0.7 0.7 0.0 | | | | | +--Board:choose () 4080 19361.3 35.5 99.3 0.2 | | | | | +--NQueens:explore () 4080 1.1 1.1 0.0 | | | | | | +--Board:size () 4080 62.6 48.4 0.3 0.2 | | | | | | +--Board:legalRows () 86162 14.2 14.2 0.1 | | | | | | | +--Board:size () 16852 2.8 2.8 0.0 | | | | | | +--Board:choose () 16852 19256.1 114.7 98.8 0.6 | | | | | | +--NQueens:explore () 16852 2.7 2.7 0.0 | | | | | | | +--Board:size () 16852 276.5 209.5 1.4 1.1 | | | | | | | +--Board:legalRows () 382380 67.0 67.0 0.3 0.3 | | | | | | | | +--Board:size () 52856 11.7 11.7 0.1 | | | | | | | +--Board:choose () 52856 18836.2 295.2 96.6 1.5 | | | | | | | +--NQueens:explore () 52856 9.6 9.6 0.0 | | | | | | | | +--Board:size () 52856 903.0 682.9 4.6 3.5 | | | | | | | | +--Board:legalRows () 1278319 220.1 220.1 1.1 1.1 | | | | | | | | | +--Board:size () 120104 22.8 22.8 0.1 0.1 | | | | | | | | +--Board:choose () 120104 17577.2 494.4 90.2 2.5 | | | | | | | | +--NQueens:explore () 120104 20.2 20.2 0.1 0.1 | | | | | | | | | +--Board:size () 120104 2180.6 1641.2 11.2 8.4 | | | | | | | | | +--Board:legalRows () 3075183 539.4 539.4 2.8 2.8 | | | | | | | | | | +--Board:size () 195270 33.8 33.8 0.2 0.2 | | | | | | | | | +--Board:choose () 195270 14813.5 629.7 76.0 3.2 | | | | | | | | | +--NQueens:explore () 195270 36.2 36.2 0.2 0.2 | | | | | | | | | | +--Board:size () 195270 3667.5 2787.1 18.8 14.3 | | | | | | | | | | +--Board:legalRows () 5259936 880.4 880.4 4.5 4.5 | | | | | | | | | | | +--Board:size () 222720 41.7 41.7 0.2 0.2 | | | | | | | | | | +--Board:choose () 222720 10399.6 552.9 53.4 2.8 | | | | | | | | | | +--NQueens:explore () 222720 36.6 36.6 0.2 0.2 | | | | | | | | | | | +--Board:size () 222720 4347.6 3282.2 22.3 16.8 | | | | | | | | | | | +--Board:legalRows () 6278007 1065.4 1065.4 5.5 5.5 | | | | | | | | | | | | +--Board:size () 160964 92.4 92.4 0.5 0.5 | | | | | | | | | | | +--Board:choose () 160964 5342.1 331.4 27.4 1.7 | | | | | | | | | | | +--NQueens:explore () 160964 28.7 28.7 0.1 0.1 | | | | | | | | | | | | +--Board:size () 160964 3340.2 2531.0 17.1 13.0 | | | | | | | | | | | | +--Board:legalRows () 4727379 809.2 809.2 4.2 4.2 | | | | | | | | | | | | | +--Board:size () 68264 12.0 12.0 0.1 | | | | | | | | | | | | +--Board:choose () 68264 1603.7 106.6 8.2 0.5 | | | | | | | | | | | | +--NQueens:explore () 68264 11.4 11.4 0.1 | | | | | | | | | | | | | +--Board:size () 68264 1421.5 1076.9 7.3 5.5 | | | | | | | | | | | | | +--Board:legalRows () 2082052 344.5 344.5 1.8 1.8 | | | | | | | | | | | | | | +--Board:size () 14200 2.4 2.4 0.0 | | | | | | | | | | | | | +--Board:choose () 14200 59.4 29.9 0.3 0.2 | | | | | | | | | | | | | +--NQueens:explore () 14200 2.5 2.5 0.0 | | | | | | | | | | | | | | +--Board:size () 14200 23.2 23.2 0.1 0.1 | | | | | | | | | | | | | | +--Board:toString () 14200 3.7 3.7 0.0 | | | | | | | | | | | | | | +--SolutionSaver:saveSolution () 14200 2.5 2.5 0.0 | | | | | | | | | | | | | +--Board:unchoose () 68264 26.1 26.1 0.1 0.1 | | | | | | | | | | | | +--Board:unchoose () 160964 27.9 27.9 0.1 0.1 | | | | | | | | | | | +--Board:unchoose () 222720 38.8 38.8 0.2 0.2 | | | | | | | | | | +--Board:unchoose () 195270 34.7 34.7 0.2 0.2 | | | | | | | | | +--Board:unchoose () 120104 28.4 28.4 0.1 0.1 | | | | | | | | +--Board:unchoose () 52856 14.2 14.2 0.1 | | | | | | | +--Board:unchoose () 16852 3.3 3.3 0.0 | | | | | | +--Board:unchoose () 4080 0.8 0.8 0.0 | | | | | +--Board:unchoose () 756 0.2 0.2 0.0 | | | | +--Board:unchoose () 110 0.1 0.1 0.0 | | | +--Board:unchoose () 12 0.1 0.1 0.0 | | +--Board:unchoose () 1 0.1 0.1 0.0 | +--SolutionSaver:nSolutions () +-------------------------------------- | Most expensive methods (by net time) | Frame Count Limit: Unlimited +-------------------------------------- Net ------------ Count Time Pct Location ===== ==== === ======== 222720 3282.2 16.8 Board:legalRows 195270 2787.1 14.3 Board:legalRows 160964 2531.0 13.0 Board:legalRows 120104 1641.2 8.4 Board:legalRows 68264 1076.9 5.5 Board:legalRows 6278007 1065.4 5.5 Board:size 5259936 880.4 4.5 Board:size 4727379 809.2 4.2 Board:size 52856 682.9 3.5 Board:legalRows 195270 629.7 3.2 NQueens:explore 222720 552.9 2.8 NQueens:explore 3075183 539.4 2.8 Board:size 120104 494.4 2.5 NQueens:explore 2082052 344.5 1.8 Board:size 160964 331.4 1.7 NQueens:explore 52856 295.2 1.5 NQueens:explore 1278319 220.1 1.1 Board:size 16852 209.5 1.1 Board:legalRows 16852 114.7 0.6 NQueens:explore 68264 106.6 0.5 NQueens:explore 1 98.1 0.5 NQueens:main 160964 92.4 0.5 Board:choose 382380 67.0 0.3 Board:size 4080 48.4 0.2 Board:legalRows 222720 41.7 0.2 Board:choose 222720 38.8 0.2 Board:unchoose 222720 36.6 0.2 Board:size 195270 36.2 0.2 Board:size 4080 35.5 0.2 NQueens:explore 195270 34.7 0.2 Board:unchoose 195270 33.8 0.2 Board:choose 14200 29.9 0.2 NQueens:explore 160964 28.7 0.1 Board:size 120104 28.4 0.1 Board:unchoose 160964 27.9 0.1 Board:unchoose 68264 26.1 0.1 Board:unchoose 14200 23.2 0.1 Board:toString 120104 22.8 0.1 Board:choose 120104 20.2 0.1 Board:size 86162 14.2 0.1 Board:size 52856 14.2 0.1 Board:unchoose 68264 12.0 0.1 Board:choose 52856 11.7 0.1 Board:choose 68264 11.4 0.1 Board:size 52856 9.6 0.0 Board:size 756 9.4 0.0 NQueens:explore 756 8.0 0.0 Board:legalRows 1 4.3 0.0 Board:<init> 14200 3.7 0.0 SolutionSaver:saveSolution 16852 3.3 0.0 Board:unchoose 16852 2.8 0.0 Board:choose 16852 2.7 0.0 Board:size 14717 2.6 0.0 Board:size 14200 2.5 0.0 Board:size 14200 2.5 0.0 Board:unchoose 14200 2.4 0.0 Board:choose 110 2.3 0.0 NQueens:explore 110 1.2 0.0 Board:legalRows 4080 1.1 0.0 Board:size 4080 0.8 0.0 Board:unchoose 4080 0.7 0.0 Board:choose 1 0.6 0.0 NQueens:explore 12 0.4 0.0 NQueens:explore 1951 0.4 0.0 Board:size 12 0.3 0.0 Board:legalRows 1 0.3 0.0 Board:legalRows 756 0.2 0.0 Board:unchoose 1 0.1 0.0 SolutionSaver:<init> 756 0.1 0.0 Board:choose 756 0.1 0.0 Board:size 110 0.1 0.0 Board:unchoose 191 0.1 0.0 Board:size 1 0.1 0.0 SolutionSaver:nSolutions 12 0.1 0.0 Board:unchoose 14 0.1 0.0 Board:size 110 0.0 0.0 Board:choose 110 0.0 0.0 Board:size 1 0.0 0.0 Board:size 12 0.0 0.0 Board:choose 12 0.0 0.0 Board:size +--------------------------------------+ | Most expensive methods summarized | +--------------------------------------+ Net ------------ Count Time Pct Location ===== ==== === ======== 841989 12268.9 62.9 Board:legalRows 24042480 4092.6 21.0 Board:size 856189 2603.1 13.4 NQueens:explore 856188 220.5 1.1 Board:choose 856188 177.1 0.9 Board:unchoose 1 98.1 0.5 NQueens:main 14200 23.2 0.1 Board:toString 1 4.3 0.0 Board:<init> 14200 3.7 0.0 SolutionSaver:saveSolution 1 0.1 0.0 SolutionSaver:<init> 1 0.1 0.0 SolutionSaver:nSolutions
Using JIP Output to Improve Code
We can use the output of the JIP profiling tool to make a major improvement to our N queens problem implementation. Take a look at the last table, which contains quite a bit of information about where our program is spending all its time:
+--------------------------------------+ | Most expensive methods summarized | +--------------------------------------+ Net ------------ Count Time Pct Location ===== ==== === ======== 841989 12268.9 62.9 Board:legalRows 24042480 4092.6 21.0 Board:size 856189 2603.1 13.4 NQueens:explore 856188 220.5 1.1 Board:choose 856188 177.1 0.9 Board:unchoose 1 98.1 0.5 NQueens:main 14200 23.2 0.1 Board:toString 1 4.3 0.0 Board:<init> 14200 3.7 0.0 SolutionSaver:saveSolution 1 0.1 0.0 SolutionSaver:<init> 1 0.1 0.0 SolutionSaver:nSolutions
You'll notice that a HUGE amount of time is spent getting the size of the board, and in legalRows, which is a method that calls the size() method many, many times.
It turns out that this seemingly trivial detail - calling size()
instead of this.size
- is a major bottleneck for the code. It is quite amazing what a big affect this has: check out the same exact problem, but with all calls to size() in the Board class eliminated:
+--------------------------------------+ | Most expensive methods summarized | +--------------------------------------+ Net ------------ Count Time Pct Location ===== ==== === ======== 856189 2722.4 78.9 NQueens:explore 841989 353.2 10.2 Board:legalRows 856188 173.8 5.0 Board:unchoose 856188 159.4 4.6 Board:choose 14200 28.0 0.8 Board:toString 1 11.1 0.3 NQueens:main 14200 4.0 0.1 SolutionSaver:saveSolution 1 0.1 0.0 SolutionSaver:nSolutions 1 0.1 0.0 SolutionSaver:<init> 1 0.1 0.0 Board:<init>
Totally different! Way faster! And this is for the exact same number of queens!
The total time spent in the legalRows class dropped drastically, and now the explore method is the most expensive method called.
Hprof
Hprof is a lower-level profiling tool than the Java Mission Control, which makes it both easier and harder. It gives much more useful information than the JMC method of profiling, although the information is a bit too low level.
Running
You will need to include the following flags when you call Java:
$ java -agentlib:hprof NQueens
Note that hprof has many options, most covered on the official documentation page for hprof: https://docs.oracle.com/javase/7/docs/technotes/samples/hprof.html
Hprof output
I tested hprof on a Java implementation of a recursive backtracking solution to the N queens problem, which made heavy use of integer arrays to keep track of where various queens had been placed and check what squares were valid for placing the next queen.
The result was a file with over 1 million lines, detailing every single function call that occurred and the trace for each. HUUUUGE amount of information.
At the very end of the file (using tail -n 100
) was a summary of what was contained in the lines above:
SITES BEGIN (ordered by live bytes) Sat Mar 18 22:29:07 2017 percent live alloc'ed stack class rank self accum bytes objs bytes objs trace name 1 38.18% 38.18% 14603008 228172 53886464 841976 300465 int[] 2 19.10% 57.28% 7305056 228283 26946784 842087 300271 java.util.LinkedList 3 19.09% 76.37% 7301920 228185 26943648 841989 300463 java.util.LinkedList$ListItr 4 14.50% 90.87% 5545128 231047 20548512 856188 300461 java.util.LinkedList$Node 5 3.56% 94.43% 1363200 14200 1363200 14200 300470 char[] 6 1.70% 96.13% 650544 5073 3197984 24586 300010 char[] 7 0.89% 97.02% 340800 14200 340800 14200 300471 java.util.LinkedList$Node 8 0.89% 97.91% 340800 14200 340800 14200 300469 java.lang.String 9 0.36% 98.27% 136080 2835 681600 14200 300467 char[] 10 0.24% 98.51% 92152 1000 92152 1000 300000 char[] 11 0.18% 98.69% 68040 2835 340800 14200 300466 java.lang.StringBuilder 12 0.14% 98.83% 52536 597 348656 3962 300468 char[] 13 0.07% 98.90% 26312 518 26312 518 300000 java.lang.Object[] 14 0.07% 98.96% 25504 8 25504 8 300000 byte[] 15 0.06% 99.03% 24280 990 24280 990 300000 java.lang.String 16 0.02% 99.04% 6672 97 7184 105 300268 char[] 17 0.01% 99.05% 4640 257 4640 257 300000 java.lang.Integer 18 0.01% 99.07% 4288 118 4288 118 300000 java.util.Hashtable$Entry SITES END
This means FORTY PERCENT of the code was spent on array operations. That's a lot - but not terribly surprising, since this implementation an N queens solution makes heavy use of arrays.
Examining heap dumps with jhat
You can use Hprof to create a heap dump by passing that option along with the hprof flag:
$ java -agentlib:hprof=heap=dump NQueens
This will create a very large output file in java.hprof.txt.
Now examine it with the jhat utility:
$ jhat java.hprof.txt Reading from java.hprof.txt... java.io.IOException: Version string not recognized at byte 18 at com.sun.tools.hat.internal.parser.HprofReader.readVersionHeader(HprofReader.java:390) at com.sun.tools.hat.internal.parser.HprofReader.read(HprofReader.java:175) at com.sun.tools.hat.internal.parser.Reader.readFile(Reader.java:92) at com.sun.tools.hat.Main.main(Main.java:159)
So much for that idea.
Java Mission Control
Not nearly as useful, since you don't get a method-level breakdown of how much time was spent in various places in the code. Lots and lots of superfluous information, though.
Running
You can specify that Java should create an output file for Java Mission Control by using flags:
$ java -XX:+UnlockCommercialFeatures -XX:+FlightRecorder -Dscijava.log.level=error \ -XX:StartFlightRecording=name=MyRecording,duration=999s,filename=//tmp/my_flight_recording.jfr,settings=profile NQueens
When that's finished, you can run Java Mission Control, which has a GUI interface and can be used to open the /tmp/my_flight_recording.jfr
file:
$ jmc
For the recursive backtracking method, this did not reveal anything about where the real time was being spent.
Timing Java Code
See Java/Timing
References
List of Java profiling tools: http://imagej.net/Profiling_Java_Code