Java/Profiling
From charlesreid1
Contents
Methods
There are various methods for profiling Java code. Some are covered below.
JIP: Java Interactive Profiler
The Java interactive profiler is covered here: http://confluence.concord.org/display/CCTR/attachments/15911/1574.pdf
This is a project available on sourceforge: http://jiprof.sourceforge.net/
Instructions
Start by downloading the source code for the project.
Look in that directory for a folder called profile/. Look in that folder for a Jar file called profile.jar. This jar file contains everything you need to profile your program with the JIP.
Store this Jar file somewhere easy to find, as you will be pointing the Java executable to the Jar file when you run the program that you are profiling.
Running Java with JIP
When you run Java, pass the -javaagent flag, and the location of profile.jar:
$ java -javaagent:/Volumes/noospace/Users/charles/Downloads/jip-src-1.2/profile/profile.jar NQueens objc[21690]: Class JavaLaunchHelper is implemented in both /Library/Java/JavaVirtualMachines/jdk1.8.0_102.jdk/Contents/Home/bin/java and /Library/Java/JavaVirtualMachines/jdk1.8.0_102.jdk/Contents/Home/jre/lib/libinstrument.dylib. One of the two will be used. Which one is undefined. profiler: on remote: off port: 15599 thread-depth: -1 thread.compact.threshold.ms: 10 max-method-count: -1 method.compact.threshold.ms: 10 file: profile.txt track.object.alloc: off output: text debug: off profiler-class: com.mentorgen.tools.profile.runtime.Profile output-method-signatures: no clock-resolution: ms output-summary-only: no exclude:null Accept ClassLoader: sun.misc.Launcher$AppClassLoader ClassLoaderFilter.1: null Using the generic class loader filter. Java Interactive Profiler: starting ------------------ Found 14200 solutions Controller -- shuttingdown
This creates the file profile.txt, which provides an excellent breakdown of what amount of time was spent in which function calls, from the highest level of the program down to the lowest level.
Example JIP Output
The JIP will create an output file in profile.txt. Here is example output for a Java N queens problem solution:
+----------------------------------------------------------------------
| File: profile.txt
| Date: 2017.03.18 23:40:36 PM
+----------------------------------------------------------------------
+------------------------------
| Thread depth limit: Unlimited
+------------------------------
+------------------------------
| Thread: 1
+------------------------------
Time Percent
----------------- ---------------
Count Total Net Total Net Location
===== ===== === ===== === =========
1 19491.8 98.1 100.0 0.5 +--NQueens:main ()
1 0.1 0.1 0.0 | +--SolutionSaver:<init> ()
1 4.3 4.3 0.0 | +--Board:<init> ()
1 19389.2 0.6 99.5 | +--NQueens:explore ()
1 0.0 0.0 0.0 | | +--Board:size ()
1 0.3 0.3 0.0 | | +--Board:legalRows ()
14 0.1 0.1 0.0 | | | +--Board:size ()
12 0.0 0.0 0.0 | | +--Board:choose ()
12 19388.1 0.4 99.5 | | +--NQueens:explore ()
12 0.0 0.0 0.0 | | | +--Board:size ()
12 0.4 0.3 0.0 | | | +--Board:legalRows ()
191 0.1 0.1 0.0 | | | | +--Board:size ()
110 0.0 0.0 0.0 | | | +--Board:choose ()
110 19387.2 2.3 99.5 | | | +--NQueens:explore ()
110 0.0 0.0 0.0 | | | | +--Board:size ()
110 1.5 1.2 0.0 | | | | +--Board:legalRows ()
1951 0.4 0.4 0.0 | | | | | +--Board:size ()
756 0.1 0.1 0.0 | | | | +--Board:choose ()
756 19383.0 9.4 99.4 | | | | +--NQueens:explore ()
756 0.1 0.1 0.0 | | | | | +--Board:size ()
756 10.6 8.0 0.1 | | | | | +--Board:legalRows ()
14717 2.6 2.6 0.0 | | | | | | +--Board:size ()
4080 0.7 0.7 0.0 | | | | | +--Board:choose ()
4080 19361.3 35.5 99.3 0.2 | | | | | +--NQueens:explore ()
4080 1.1 1.1 0.0 | | | | | | +--Board:size ()
4080 62.6 48.4 0.3 0.2 | | | | | | +--Board:legalRows ()
86162 14.2 14.2 0.1 | | | | | | | +--Board:size ()
16852 2.8 2.8 0.0 | | | | | | +--Board:choose ()
16852 19256.1 114.7 98.8 0.6 | | | | | | +--NQueens:explore ()
16852 2.7 2.7 0.0 | | | | | | | +--Board:size ()
16852 276.5 209.5 1.4 1.1 | | | | | | | +--Board:legalRows ()
382380 67.0 67.0 0.3 0.3 | | | | | | | | +--Board:size ()
52856 11.7 11.7 0.1 | | | | | | | +--Board:choose ()
52856 18836.2 295.2 96.6 1.5 | | | | | | | +--NQueens:explore ()
52856 9.6 9.6 0.0 | | | | | | | | +--Board:size ()
52856 903.0 682.9 4.6 3.5 | | | | | | | | +--Board:legalRows ()
1278319 220.1 220.1 1.1 1.1 | | | | | | | | | +--Board:size ()
120104 22.8 22.8 0.1 0.1 | | | | | | | | +--Board:choose ()
120104 17577.2 494.4 90.2 2.5 | | | | | | | | +--NQueens:explore ()
120104 20.2 20.2 0.1 0.1 | | | | | | | | | +--Board:size ()
120104 2180.6 1641.2 11.2 8.4 | | | | | | | | | +--Board:legalRows ()
3075183 539.4 539.4 2.8 2.8 | | | | | | | | | | +--Board:size ()
195270 33.8 33.8 0.2 0.2 | | | | | | | | | +--Board:choose ()
195270 14813.5 629.7 76.0 3.2 | | | | | | | | | +--NQueens:explore ()
195270 36.2 36.2 0.2 0.2 | | | | | | | | | | +--Board:size ()
195270 3667.5 2787.1 18.8 14.3 | | | | | | | | | | +--Board:legalRows ()
5259936 880.4 880.4 4.5 4.5 | | | | | | | | | | | +--Board:size ()
222720 41.7 41.7 0.2 0.2 | | | | | | | | | | +--Board:choose ()
222720 10399.6 552.9 53.4 2.8 | | | | | | | | | | +--NQueens:explore ()
222720 36.6 36.6 0.2 0.2 | | | | | | | | | | | +--Board:size ()
222720 4347.6 3282.2 22.3 16.8 | | | | | | | | | | | +--Board:legalRows ()
6278007 1065.4 1065.4 5.5 5.5 | | | | | | | | | | | | +--Board:size ()
160964 92.4 92.4 0.5 0.5 | | | | | | | | | | | +--Board:choose ()
160964 5342.1 331.4 27.4 1.7 | | | | | | | | | | | +--NQueens:explore ()
160964 28.7 28.7 0.1 0.1 | | | | | | | | | | | | +--Board:size ()
160964 3340.2 2531.0 17.1 13.0 | | | | | | | | | | | | +--Board:legalRows ()
4727379 809.2 809.2 4.2 4.2 | | | | | | | | | | | | | +--Board:size ()
68264 12.0 12.0 0.1 | | | | | | | | | | | | +--Board:choose ()
68264 1603.7 106.6 8.2 0.5 | | | | | | | | | | | | +--NQueens:explore ()
68264 11.4 11.4 0.1 | | | | | | | | | | | | | +--Board:size ()
68264 1421.5 1076.9 7.3 5.5 | | | | | | | | | | | | | +--Board:legalRows ()
2082052 344.5 344.5 1.8 1.8 | | | | | | | | | | | | | | +--Board:size ()
14200 2.4 2.4 0.0 | | | | | | | | | | | | | +--Board:choose ()
14200 59.4 29.9 0.3 0.2 | | | | | | | | | | | | | +--NQueens:explore ()
14200 2.5 2.5 0.0 | | | | | | | | | | | | | | +--Board:size ()
14200 23.2 23.2 0.1 0.1 | | | | | | | | | | | | | | +--Board:toString ()
14200 3.7 3.7 0.0 | | | | | | | | | | | | | | +--SolutionSaver:saveSolution ()
14200 2.5 2.5 0.0 | | | | | | | | | | | | | +--Board:unchoose ()
68264 26.1 26.1 0.1 0.1 | | | | | | | | | | | | +--Board:unchoose ()
160964 27.9 27.9 0.1 0.1 | | | | | | | | | | | +--Board:unchoose ()
222720 38.8 38.8 0.2 0.2 | | | | | | | | | | +--Board:unchoose ()
195270 34.7 34.7 0.2 0.2 | | | | | | | | | +--Board:unchoose ()
120104 28.4 28.4 0.1 0.1 | | | | | | | | +--Board:unchoose ()
52856 14.2 14.2 0.1 | | | | | | | +--Board:unchoose ()
16852 3.3 3.3 0.0 | | | | | | +--Board:unchoose ()
4080 0.8 0.8 0.0 | | | | | +--Board:unchoose ()
756 0.2 0.2 0.0 | | | | +--Board:unchoose ()
110 0.1 0.1 0.0 | | | +--Board:unchoose ()
12 0.1 0.1 0.0 | | +--Board:unchoose ()
1 0.1 0.1 0.0 | +--SolutionSaver:nSolutions ()
+--------------------------------------
| Most expensive methods (by net time)
| Frame Count Limit: Unlimited
+--------------------------------------
Net
------------
Count Time Pct Location
===== ==== === ========
222720 3282.2 16.8 Board:legalRows
195270 2787.1 14.3 Board:legalRows
160964 2531.0 13.0 Board:legalRows
120104 1641.2 8.4 Board:legalRows
68264 1076.9 5.5 Board:legalRows
6278007 1065.4 5.5 Board:size
5259936 880.4 4.5 Board:size
4727379 809.2 4.2 Board:size
52856 682.9 3.5 Board:legalRows
195270 629.7 3.2 NQueens:explore
222720 552.9 2.8 NQueens:explore
3075183 539.4 2.8 Board:size
120104 494.4 2.5 NQueens:explore
2082052 344.5 1.8 Board:size
160964 331.4 1.7 NQueens:explore
52856 295.2 1.5 NQueens:explore
1278319 220.1 1.1 Board:size
16852 209.5 1.1 Board:legalRows
16852 114.7 0.6 NQueens:explore
68264 106.6 0.5 NQueens:explore
1 98.1 0.5 NQueens:main
160964 92.4 0.5 Board:choose
382380 67.0 0.3 Board:size
4080 48.4 0.2 Board:legalRows
222720 41.7 0.2 Board:choose
222720 38.8 0.2 Board:unchoose
222720 36.6 0.2 Board:size
195270 36.2 0.2 Board:size
4080 35.5 0.2 NQueens:explore
195270 34.7 0.2 Board:unchoose
195270 33.8 0.2 Board:choose
14200 29.9 0.2 NQueens:explore
160964 28.7 0.1 Board:size
120104 28.4 0.1 Board:unchoose
160964 27.9 0.1 Board:unchoose
68264 26.1 0.1 Board:unchoose
14200 23.2 0.1 Board:toString
120104 22.8 0.1 Board:choose
120104 20.2 0.1 Board:size
86162 14.2 0.1 Board:size
52856 14.2 0.1 Board:unchoose
68264 12.0 0.1 Board:choose
52856 11.7 0.1 Board:choose
68264 11.4 0.1 Board:size
52856 9.6 0.0 Board:size
756 9.4 0.0 NQueens:explore
756 8.0 0.0 Board:legalRows
1 4.3 0.0 Board:<init>
14200 3.7 0.0 SolutionSaver:saveSolution
16852 3.3 0.0 Board:unchoose
16852 2.8 0.0 Board:choose
16852 2.7 0.0 Board:size
14717 2.6 0.0 Board:size
14200 2.5 0.0 Board:size
14200 2.5 0.0 Board:unchoose
14200 2.4 0.0 Board:choose
110 2.3 0.0 NQueens:explore
110 1.2 0.0 Board:legalRows
4080 1.1 0.0 Board:size
4080 0.8 0.0 Board:unchoose
4080 0.7 0.0 Board:choose
1 0.6 0.0 NQueens:explore
12 0.4 0.0 NQueens:explore
1951 0.4 0.0 Board:size
12 0.3 0.0 Board:legalRows
1 0.3 0.0 Board:legalRows
756 0.2 0.0 Board:unchoose
1 0.1 0.0 SolutionSaver:<init>
756 0.1 0.0 Board:choose
756 0.1 0.0 Board:size
110 0.1 0.0 Board:unchoose
191 0.1 0.0 Board:size
1 0.1 0.0 SolutionSaver:nSolutions
12 0.1 0.0 Board:unchoose
14 0.1 0.0 Board:size
110 0.0 0.0 Board:choose
110 0.0 0.0 Board:size
1 0.0 0.0 Board:size
12 0.0 0.0 Board:choose
12 0.0 0.0 Board:size
+--------------------------------------+
| Most expensive methods summarized |
+--------------------------------------+
Net
------------
Count Time Pct Location
===== ==== === ========
841989 12268.9 62.9 Board:legalRows
24042480 4092.6 21.0 Board:size
856189 2603.1 13.4 NQueens:explore
856188 220.5 1.1 Board:choose
856188 177.1 0.9 Board:unchoose
1 98.1 0.5 NQueens:main
14200 23.2 0.1 Board:toString
1 4.3 0.0 Board:<init>
14200 3.7 0.0 SolutionSaver:saveSolution
1 0.1 0.0 SolutionSaver:<init>
1 0.1 0.0 SolutionSaver:nSolutions
Using JIP Output to Improve Code
We can use the output of the JIP profiling tool to make a major improvement to our N queens problem implementation. Take a look at the last table, which contains quite a bit of information about where our program is spending all its time:
+--------------------------------------+
| Most expensive methods summarized |
+--------------------------------------+
Net
------------
Count Time Pct Location
===== ==== === ========
841989 12268.9 62.9 Board:legalRows
24042480 4092.6 21.0 Board:size
856189 2603.1 13.4 NQueens:explore
856188 220.5 1.1 Board:choose
856188 177.1 0.9 Board:unchoose
1 98.1 0.5 NQueens:main
14200 23.2 0.1 Board:toString
1 4.3 0.0 Board:<init>
14200 3.7 0.0 SolutionSaver:saveSolution
1 0.1 0.0 SolutionSaver:<init>
1 0.1 0.0 SolutionSaver:nSolutions
You'll notice that a HUGE amount of time is spent getting the size of the board, and in legalRows, which is a method that calls the size() method many, many times.
It turns out that this seemingly trivial detail - calling size() instead of this.size - is a major bottleneck for the code. It is quite amazing what a big affect this has: check out the same exact problem, but with all calls to size() in the Board class eliminated:
+--------------------------------------+
| Most expensive methods summarized |
+--------------------------------------+
Net
------------
Count Time Pct Location
===== ==== === ========
856189 2722.4 78.9 NQueens:explore
841989 353.2 10.2 Board:legalRows
856188 173.8 5.0 Board:unchoose
856188 159.4 4.6 Board:choose
14200 28.0 0.8 Board:toString
1 11.1 0.3 NQueens:main
14200 4.0 0.1 SolutionSaver:saveSolution
1 0.1 0.0 SolutionSaver:nSolutions
1 0.1 0.0 SolutionSaver:<init>
1 0.1 0.0 Board:<init>
Totally different! Way faster! And this is for the exact same number of queens!
The total time spent in the legalRows class dropped drastically, and now the explore method is the most expensive method called.
Hprof
Hprof is a lower-level profiling tool than the Java Mission Control, which makes it both easier and harder. It gives much more useful information than the JMC method of profiling, although the information is a bit too low level.
Running
You will need to include the following flags when you call Java:
$ java -agentlib:hprof NQueens
Note that hprof has many options, most covered on the official documentation page for hprof: https://docs.oracle.com/javase/7/docs/technotes/samples/hprof.html
Hprof output
I tested hprof on a Java implementation of a recursive backtracking solution to the N queens problem, which made heavy use of integer arrays to keep track of where various queens had been placed and check what squares were valid for placing the next queen.
The result was a file with over 1 million lines, detailing every single function call that occurred and the trace for each. HUUUUGE amount of information.
At the very end of the file (using tail -n 100) was a summary of what was contained in the lines above:
SITES BEGIN (ordered by live bytes) Sat Mar 18 22:29:07 2017
percent live alloc'ed stack class
rank self accum bytes objs bytes objs trace name
1 38.18% 38.18% 14603008 228172 53886464 841976 300465 int[]
2 19.10% 57.28% 7305056 228283 26946784 842087 300271 java.util.LinkedList
3 19.09% 76.37% 7301920 228185 26943648 841989 300463 java.util.LinkedList$ListItr
4 14.50% 90.87% 5545128 231047 20548512 856188 300461 java.util.LinkedList$Node
5 3.56% 94.43% 1363200 14200 1363200 14200 300470 char[]
6 1.70% 96.13% 650544 5073 3197984 24586 300010 char[]
7 0.89% 97.02% 340800 14200 340800 14200 300471 java.util.LinkedList$Node
8 0.89% 97.91% 340800 14200 340800 14200 300469 java.lang.String
9 0.36% 98.27% 136080 2835 681600 14200 300467 char[]
10 0.24% 98.51% 92152 1000 92152 1000 300000 char[]
11 0.18% 98.69% 68040 2835 340800 14200 300466 java.lang.StringBuilder
12 0.14% 98.83% 52536 597 348656 3962 300468 char[]
13 0.07% 98.90% 26312 518 26312 518 300000 java.lang.Object[]
14 0.07% 98.96% 25504 8 25504 8 300000 byte[]
15 0.06% 99.03% 24280 990 24280 990 300000 java.lang.String
16 0.02% 99.04% 6672 97 7184 105 300268 char[]
17 0.01% 99.05% 4640 257 4640 257 300000 java.lang.Integer
18 0.01% 99.07% 4288 118 4288 118 300000 java.util.Hashtable$Entry
SITES END
This means FORTY PERCENT of the code was spent on array operations. That's a lot - but not terribly surprising, since this implementation an N queens solution makes heavy use of arrays.
Examining heap dumps with jhat
You can use Hprof to create a heap dump by passing that option along with the hprof flag:
$ java -agentlib:hprof=heap=dump NQueens
This will create a very large output file in java.hprof.txt.
Now examine it with the jhat utility:
$ jhat java.hprof.txt Reading from java.hprof.txt... java.io.IOException: Version string not recognized at byte 18 at com.sun.tools.hat.internal.parser.HprofReader.readVersionHeader(HprofReader.java:390) at com.sun.tools.hat.internal.parser.HprofReader.read(HprofReader.java:175) at com.sun.tools.hat.internal.parser.Reader.readFile(Reader.java:92) at com.sun.tools.hat.Main.main(Main.java:159)
So much for that idea.
Java Mission Control
Not nearly as useful, since you don't get a method-level breakdown of how much time was spent in various places in the code. Lots and lots of superfluous information, though.
Running
You can specify that Java should create an output file for Java Mission Control by using flags:
$ java -XX:+UnlockCommercialFeatures -XX:+FlightRecorder -Dscijava.log.level=error \ -XX:StartFlightRecording=name=MyRecording,duration=999s,filename=//tmp/my_flight_recording.jfr,settings=profile NQueens
When that's finished, you can run Java Mission Control, which has a GUI interface and can be used to open the /tmp/my_flight_recording.jfr file:
$ jmc
For the recursive backtracking method, this did not reveal anything about where the real time was being spent.
Timing Java Code
See Java/Timing
References
List of Java profiling tools: http://imagej.net/Profiling_Java_Code