From charlesreid1

Summary

The outcome, of implementing a solution to the N Queens Problem using the same recursive depth-first search algorithm in Perl and Java, and measuring the wall time for various values of N:

Perl vs Java.png

Both codes implemented a solution utilizing integer arrays.

Perl version: https://gist.github.com/charlesreid1/4ce97a5f896ff1c89855a5d038d51535

Java version: https://gist.github.com/charlesreid1/7b8d7b9dffb7b3090039849d72c5fff5

Background

Recently I read an (11 year old) article [1] by Steve Yegge entitled "Execution in the Kingdom of Nouns." In it, Steve describes the way that in Java,

Classes are really the only modeling tool Java provides you. So whenever a new idea occurs to you, you have to sculpt it or wrap it or smash at it until it becomes a thing, even if it began life as an action, a process, or any other non-'thing' concept.

This idea really resonated with me. I have been curious about verb-oriented languages like Haskell and OCaml, and have found lambda functions extremely useful in data analysis when using Python (not to mention the wonderful Python feature of being able to pass functions or lambda functions as parameters).

After reading this article, I began thinking about how we might implement programs in Java using verb-oriented thinking. So, I began by thinking through the N-queens problem, a classic recursive backtracking problem.

Perl vs. Java

Steve Yegge's article had another quote that stuck out to me:

I've really come around to what Perl folks were telling me 8 or 9 years ago: 'Dude, not everything is an object.'

The last tool I wrote in Perl was a web scraper called the Swartz Mechanizer [2]. It was an unpleasant and ugly script running to 800 lines; it's main purpose was to navigate a page, click buttons, and download PDF files. It worked, and it was entirely action-based, consisting of functions doing actions and passing small parcels of information around.

This time around I used Perl to implement a recursive backtracking solution to the N-queens problem, which consisted of a recursive function that kept track of queen placement using an array of integers. The Perl solution was a verb-based solution that was principally performing *actions* using built-in Perl types. This is one of the core tenets of a verb-based approach: you reshape your problem to fit your data structures (in contrast to the noun-based approach, which does it the other way around).

The Perl solution was written first, and the Java solution second. The Java solution also implement a recursive backtracking approach, and is modeled closely on the Perl solution. It also uses integer arrays to keep track of queen positions, and uses a static class to emulate the global solution storage variable. It also checks for legal moves in the same way that the Perl program does.

The Perl Solution

In Perl, you have the concept of a global namespace, by declaring variables at the top level of the file. The Perl implementation of the N queens problem utilizes a global array to store solutions, so that when the base case of the recursive backtracking method is reached, meaning a solution has been found, it is added to this global array, which is shared amongst all instances of the recursive function. This is the only "sharing" that needs to happen among instances of the recursive function (and even then, the sharing isn't strictly necessary, since the set of solutions can be passed as a parameter).

See the original Perl N queens problem solution and a script for gathering timing data for different board sizes: https://gist.github.com/charlesreid1/4ce97a5f896ff1c89855a5d038d51535


#!/usr/bin/perl

# Solve the N queens problem
# using recursive backtracking.
#
# Author: Charles Reid
# Date: March 2017
#
# Modified and expanded from http://rosettacode.org/wiki/N-queens_problem#Perl

# Create an array to store solutions
my @solutions;

# Create an array to store where queens have been placed
my @queens;

# Mark the rows already used (useful for lookup)
my @occupied;

# explore() implements a recursive, depth-first backtracking method
sub explore {
	# Parameters:
	#	depth : this is the argument passed by the user

	# First argument passed to the function is $depth
	# (how many queens we've placed on the board),
	# so use shift to pop that out of the parameters
	my ($depth, @diag) = shift;

	# Explore is a recursive method,
	# so we need a base case and a recursive case.
	#
	# The base case is, we've reached a leaf node,
	# placed 8 queens, and had no problems,
	# so we found a solution.
	if ($depth==$board_size) {
		# Here, we store the stringified version of @queens,
		# which are the row numbers of prior queens.
		# This is a global variable that is shared across
		# instances of this recursive function.
		push @solutions, "@queens\n";
		return;
	}

	# Mark the squares that are attackable,
	# so that we can cut down on the search space.
	$#diag = 2 * $board_size;
	for( 0 .. $#queens) {
		$ix1 = $queens[$_] + $depth - $_ ;
		$diag[ $ix1 ] = 1;

		$ix2 = $queens[$_] - $depth + $_ ;
		$diag[ $ix2 ] = 1;
	}

	for my $row (0 .. $board_size-1) {
		# Cut down on the search space:
		# if this square is already occupied
		# or will lead to an invalid solution,
		# don't bother exploring it.
		next if $occupied[$row] || $diag[$row];

		# Make a choice
		push @queens, $row;
		# Mark the square as occupied
		$occupied[$row] = 1;

		# Explore the consequences
		explore($depth+1);

		# Unmake the choice
		pop @queens;

		# Mark the square as unoccupied
		$occupied[$row] = 0;

	}
}

$board_size = 8;

explore(0);

print "total ", scalar(@solutions), " solutions\n";

The Java Solution

See the original Java N queens problem solution and a script for gathering timing data for different board sizes: https://gist.github.com/charlesreid1/4ce97a5f896ff1c89855a5d038d51535


import java.util.LinkedList;
import java.util.Arrays;

public class NQueens {

    public static final int SIZE = 12;

    ///////////////////////////////////////////////////
    // main
    //
    public static void main(String[] args) {
        SolutionSaver s = new SolutionSaver();
        Board b = new Board(SIZE);
        explore(b,0);
		System.out.printf("Found %d solutions\n ", SolutionSaver.nSolutions() );
    }
    //
    ///////////////////////////////////////////////////


    /// Placing queens, column-by-column; place the queen in column col.
    public static void explore(Board b, int col) {

        if(col >= SIZE ) {

            // We have reached the end.
            // No conflicts so far means no conflicts period.

            // Save solution in a static class, no instantiation overhead
            SolutionSaver.saveSolution( b.toString() );

		} else {

            // The legalRows() method is a little bit of magic.
            // It returns an array of valid rows on which to place the col^th queen.
            for(Integer row : b.legalRows(col) ) { // important question: is this called each time in the loop?
                                                   // (do i need to make a copy outside the loop?)
                b.choose(row,col);
                explore(b,col+1);
                b.unchoose(row,col);

            }
        }// done with base/recursive cases
    }

}

class Board {

    int[] queens;                   // Array to store where queens have been placed.
                                    // The queens array has a length of board_size.
                                    // Each element stores an integer between 1 and N.
                                    // That indicates the row/column.

    int[] occupiedrows;             // Array to mark occupied rows.
                                    // This is how we check for horizontal attacks.

    public static int board_size;

    public Board(int size) {
        this.board_size = size;
        this.queens = new int[size];
        this.occupiedrows = new int[size];
    }

    /**
     * Get String representation of queen positions.
     * This prints 8 numbers, corresponding to 8 columns.
     * Each number is an integer, 1..(board_size-1), indicating
     * the row of the queen on that particular column.
     * All queens on first row would be 0 0 0 0 0 0 0 0
     */
    public String toString() {
        return Arrays.toString(this.queens);
    }

    /// Make the choice of putting a queen on row, col
    public void choose(int row, int col) {
        if( col < this.queens.length && row < this.occupiedrows.length ) {
            this.queens[col] = row;
            this.occupiedrows[row] = 1;
        }
    }

    /// Unmake the choice of putting a queen on row, col
    public void unchoose(int row, int col) {
        if( col < this.queens.length && row < this.occupiedrows.length ) {
            this.queens[col] = 0;
            this.occupiedrows[row] = 0;
        }
    }

    /// Get a list of legal rows
    public LinkedList<Integer> legalRows( int col ) {

        LinkedList<Integer> legalList = new LinkedList<Integer>();

        // 1. Mark invalid squares on diagonals of already-placed queens
        //
        // 2. For this column, loop over each row where it's legal to place a queen,
        //    and run backtracking on that choice. Then unmake the choice and keep going.

        // Store invalid rows on other queens' diagonals
        int[] diag = new int[this.board_size];

        // Loop over all of the queens already placed (col-1)
        for(int k = 0; k <= (col-1); k++ ) {

            // We're gonig to place the next queen on col.
            // Find which squares are on diagonal of queen k,
            // and mark them as impossible.

            // Lower right diagonal
            int ix1 = this.queens[k] + col - k;
            if(ix1 >= 0 && ix1 < this.board_size ) {
                diag[ix1] = 1;
            }

            // Upper right diagonal
            int ix2 = this.queens[k] - col + k;
            if(ix2 >= 0 && ix2 < this.board_size ) {
                diag[ix2] = 1;
            }

        }

        // Legal rows are non-diagonal squares and squares on non-occupied rows.
        for (int row = 0; row < this.board_size; row++) {

            // If this row is legal, add it to the list
            boolean legal = diag[row]==0 && this.occupiedrows[row]==0;
            if(legal) {
                legalList.add(row);
            }

        }

        return legalList;
    }
}


class SolutionSaver {
    private static LinkedList<String> solutions;

    public SolutionSaver() {
        SolutionSaver.solutions = new LinkedList<String>();
    }

    public static void saveSolution(String serialized) {
        SolutionSaver.solutions.add(serialized);
    }

    public static void printSolutions() {
        int c = 0;
        for( String sol : solutions ) {
            System.out.printf("Solution %d: %s \n", c, sol);
        }
    }
    public static int nSolutions() {
        return solutions.size();
    }
}

Profiling

Profiling Perl

To profile the Perl code, I used the Devel::NYTProf Perl module. (See Perl/Profiling for details.) This gave a nice interactive report that showed that much of the time Perl was spending was being spent doing computations for valid locations to place queens.

perl -d:NYTProf nqueens.pl

This generates a nytprof.out file that contains a database of profiling information. This information can be extracted to various formats.

CSV format

Here's how to extract it to a CSV file:

$ /usr/local/Cellar/perl/5.24.0_1/bin/nytprofcsv nytprof.out

This will generate a directory called nytprof/ containing a CSV file:

# Profile data generated by Devel::NYTProf::Reader
# Version: v6.04
# More information at http://metacpan.org/release/Devel-NYTProf/
# Format: time,calls,time/call,code
0.086191,166926,0.000001,my ($depth, @attacked) = shift;
0.010183,2680,0.000004,push @solutions, "@queens\n";
0.010521,2680,0.000004,return;
0.196345,164246,0.000001,$#attacked = 2 * $board_size;
0.152597,164246,0.000001,for( 0 .. $#queens) {
0.683294,1.26035e+06,0.000001,$attacked[ $ix2 ] = 1;
1.189223,164246,0.000007,for my $row (0 .. $board_size-1) {
0.272883,166925,0.000002,explore($depth+1);
0.116465,166925,0.000001,$occupied[$row] = 0;
0.000001,1,0.000001,$board_size = 11;
0.000019,1,0.000019,explore(0);
0.000080,1,0.000080,print "Found ", scalar(@solutions), " solutions\n";

One number here in particular is important - the number of times we call the for loop over columns. This gives us a TOTAL count of the number of solutions that were tried by the program. This number, for Perl, is 164,246. We will see that Java explores the exact same number of solutions, so the difference in computational cost between these languages is not coming from a difference in algorithms.

Callgraph format

To output a callgraph from the Perl profiler results, use the nytprofcalls utility in combination with sort:

$ /usr/local/Cellar/perl/5.24.0_1/bin/nytprofcalls nytprof.out | sort
main::CORE:print 49
main::explore 105
main::explore;main::explore 657
main::explore;main::explore;main::explore 4929
main::explore;main::explore;main::explore;main::explore 19102
main::explore;main::explore;main::explore;main::explore;main::explore 76128
main::explore;main::explore;main::explore;main::explore;main::explore;main::explore 245851
main::explore;main::explore;main::explore;main::explore;main::explore;main::explore;main::explore 561232
main::explore;main::explore;main::explore;main::explore;main::explore;main::explore;main::explore;main::explore 928979
main::explore;main::explore;main::explore;main::explore;main::explore;main::explore;main::explore;main::explore;main::explore 1082376
main::explore;main::explore;main::explore;main::explore;main::explore;main::explore;main::explore;main::explore;main::explore;main::explore 830196
main::explore;main::explore;main::explore;main::explore;main::explore;main::explore;main::explore;main::explore;main::explore;main::explore;main::explore 366525
main::explore;main::explore;main::explore;main::explore;main::explore;main::explore;main::explore;main::explore;main::explore;main::explore;main::explore;main::explore 16076

This is a bit cryptic - there are far more calls than there should be. 105 calls to the top level explore method??

Callgrind (Valgrind tool) format

You can also look at the call graph using callgrind, a tool bundled with Valgrind. (Valgrind is a memory and execution analysis tool.) Call the nytprofcg tool:

/usr/local/Cellar/perl/5.24.0_1/bin/nytprofcg nytprof.out

This will result in a file called nytprof.callgrind, which can be analyzed with Valgrind. Now fire up the callgrind_annotate function to print out information contained in the call graph.

For callgrind annotate options, see the Valgrind manual: http://valgrind.org/docs/manual/cl-manual.html

$ callgrind_annotate --inclusive=yes nytprof.callgrind
--------------------------------------------------------------------------------
Profile data file 'nytprof.callgrind'
--------------------------------------------------------------------------------
Profiled target:  (unknown)
Events recorded:  Ticks
Events shown:     Ticks
Event sort order: Ticks
Thresholds:       99
Include dirs:
User annotated:
Auto-annotation:  off

--------------------------------------------------------------------------------
     Ticks
--------------------------------------------------------------------------------
12,396,614  PROGRAM TOTALS (calculated)

--------------------------------------------------------------------------------
    Ticks  file:function
--------------------------------------------------------------------------------
4,132,205  nqueens.pl:main::RUNTIME
4,132,156  /Volumes/noospace/Users/charles/codes/hello-world/perl/nqueens.pl:main::explore
4,132,155  nqueens.pl:main::explore
       49  /Volumes/noospace/Users/charles/codes/hello-world/perl/nqueens.pl:main::CORE:print

Profiling Java

To profile the Java code, I used two tools: the Java Interactive Profiler (JIP), and the HPROF tool provided by Oracle. (See Java/Profiling for details.)

JIP

See Java/Profiling for instructions on how to get JIP. Basically you download it and look for profile.jar.

JIP is run by passing a "javaagent" flag to Java when you execute your program. This will generate a profile.txt file, which gives a summary of how many times each method was called at each recursive level, plus a summary of the most expensive methods. This helps identify where most of the time was spent.

$ java -javaagent:${HOME}/Downloads/jip-src-1.2/profile/profile.jar NQueens

objc[39911]: Class JavaLaunchHelper is implemented in both /Library/Java/JavaVirtualMachines/jdk1.8.0_102.jdk/Contents/Home/bin/java and /Library/Java/JavaVirtualMachines/jdk1.8.0_102.jdk/Contents/Home/jre/lib/libinstrument.dylib. One of the two will be used. Which one is undefined.
profiler: on
remote: off
port: 15599
thread-depth: -1
thread.compact.threshold.ms: 10
max-method-count: -1
method.compact.threshold.ms: 10
file: profile.txt
track.object.alloc: off
output: text
debug: off
profiler-class: com.mentorgen.tools.profile.runtime.Profile
output-method-signatures: no
clock-resolution: ms
output-summary-only: no
exclude:null
Accept ClassLoader: sun.misc.Launcher$AppClassLoader
ClassLoaderFilter.1: null
Using the generic class loader filter.
Java Interactive Profiler: starting
------------------
Found 2680 solutions
 Controller -- shuttingdown

Here's an example of the output:

+----------------------------------------------------------------------
|  File: profile.txt
|  Date: 2017.03.19 13:57:01 PM
+----------------------------------------------------------------------

+------------------------------
| Thread depth limit: Unlimited
+------------------------------
+------------------------------
| Thread: 1
+------------------------------
              Time            Percent
       ----------------- ---------------
 Count    Total      Net   Total     Net  Location
 =====    =====      ===   =====     ===  =========
     1   1033.5     12.6   100.0     1.2  +--NQueens:main	()
     1      0.1      0.1     0.0          | +--SolutionSaver:<init>	()
     1      0.0      0.0     0.0          | +--Board:<init>	()
     1   1020.6      0.5    98.8          | +--NQueens:explore	()
     1      0.1      0.1     0.0          | | +--Board:legalRows	()
    11      0.0      0.0     0.0          | | +--Board:choose	()
    11   1020.0      0.4    98.7          | | +--NQueens:explore	()
    11      0.0      0.0     0.0          | | | +--Board:legalRows	()
    90      0.0      0.0     0.0          | | | +--Board:choose	()
    90   1019.4      2.1    98.6     0.2  | | | +--NQueens:explore	()
    90      0.1      0.1     0.0          | | | | +--Board:legalRows	()
   536      0.1      0.1     0.0          | | | | +--Board:choose	()
   536   1016.9      9.3    98.4     0.9  | | | | +--NQueens:explore	()
   536      0.4      0.4     0.0          | | | | | +--Board:legalRows	()
  2468      1.7      1.7     0.2     0.2  | | | | | +--Board:choose	()
  2468   1005.0     33.7    97.2     3.3  | | | | | +--NQueens:explore	()
  2468      1.6      1.6     0.2     0.2  | | | | | | +--Board:legalRows	()
  8492      1.7      1.7     0.2     0.2  | | | | | | +--Board:choose	()
  8492    966.3     90.3    93.5     8.7  | | | | | | +--NQueens:explore	()
  8492      5.1      5.1     0.5     0.5  | | | | | | | +--Board:legalRows	()
 21362      4.9      4.9     0.5     0.5  | | | | | | | +--Board:choose	()
 21362    860.9    169.8    83.3    16.4  | | | | | | | +--NQueens:explore	()
 21362     12.2     12.2     1.2     1.2  | | | | | | | | +--Board:legalRows	()
 37248     47.8     47.8     4.6     4.6  | | | | | | | | +--Board:choose	()
 37248    623.2    206.5    60.3    20.0  | | | | | | | | +--NQueens:explore	()
 37248     19.1     19.1     1.8     1.8  | | | | | | | | | +--Board:legalRows	()
 44148      9.4      9.4     0.9     0.9  | | | | | | | | | +--Board:choose	()
 44148    374.7    162.7    36.3    15.7  | | | | | | | | | +--NQueens:explore	()
 44148     25.9     25.9     2.5     2.5  | | | | | | | | | | +--Board:legalRows	()
 34774      6.9      6.9     0.7     0.7  | | | | | | | | | | +--Board:choose	()
 34774    171.5     94.7    16.6     9.2  | | | | | | | | | | +--NQueens:explore	()
 34774     16.7     16.7     1.6     1.6  | | | | | | | | | | | +--Board:legalRows	()
 15116      4.4      4.4     0.4     0.4  | | | | | | | | | | | +--Board:choose	()
 15116     52.8     24.4     5.1     2.4  | | | | | | | | | | | +--NQueens:explore	()
 15116      7.0      7.0     0.7     0.7  | | | | | | | | | | | | +--Board:legalRows	()
  2680      0.6      0.6     0.1          | | | | | | | | | | | | +--Board:choose	()
  2680     20.3      7.2     2.0     0.7  | | | | | | | | | | | | +--NQueens:explore	()
  2680     11.0     11.0     1.1     1.1  | | | | | | | | | | | | | +--Board:toString	()
  2680      2.1      2.1     0.2     0.2  | | | | | | | | | | | | | +--SolutionSaver:saveSolution	()
  2680      0.6      0.6     0.1          | | | | | | | | | | | | +--Board:unchoose	()
 15116      3.0      3.0     0.3     0.3  | | | | | | | | | | | +--Board:unchoose	()
 34774      7.7      7.7     0.7     0.7  | | | | | | | | | | +--Board:unchoose	()
 44148     13.5     13.5     1.3     1.3  | | | | | | | | | +--Board:unchoose	()
 37248      7.9      7.9     0.8     0.8  | | | | | | | | +--Board:unchoose	()
 21362      5.2      5.2     0.5     0.5  | | | | | | | +--Board:unchoose	()
  8492      1.7      1.7     0.2     0.2  | | | | | | +--Board:unchoose	()
  2468      0.5      0.5     0.0          | | | | | +--Board:unchoose	()
   536      0.1      0.1     0.0          | | | | +--Board:unchoose	()
    90      0.1      0.1     0.0          | | | +--Board:unchoose	()
    11      0.0      0.0     0.0          | | +--Board:unchoose	()
     1      0.1      0.1     0.0          | +--SolutionSaver:nSolutions	()

+--------------------------------------
| Most expensive methods (by net time)
| Frame Count Limit: Unlimited
+--------------------------------------

               Net
          ------------
 Count     Time    Pct  Location
 =====     ====    ===  ========
 37248    206.5   20.0  NQueens:explore
 21362    169.8   16.4  NQueens:explore
 44148    162.7   15.7  NQueens:explore
 34774     94.7    9.2  NQueens:explore
  8492     90.3    8.7  NQueens:explore
 37248     47.8    4.6  Board:choose
  2468     33.7    3.3  NQueens:explore
 44148     25.9    2.5  Board:legalRows
 15116     24.4    2.4  NQueens:explore
 37248     19.1    1.8  Board:legalRows
 34774     16.7    1.6  Board:legalRows
 44148     13.5    1.3  Board:unchoose
     1     12.6    1.2  NQueens:main
 21362     12.2    1.2  Board:legalRows
  2680     11.0    1.1  Board:toString
 44148      9.4    0.9  Board:choose
   536      9.3    0.9  NQueens:explore
 37248      7.9    0.8  Board:unchoose
 34774      7.7    0.7  Board:unchoose
  2680      7.2    0.7  NQueens:explore
 15116      7.0    0.7  Board:legalRows
 34774      6.9    0.7  Board:choose
 21362      5.2    0.5  Board:unchoose
  8492      5.1    0.5  Board:legalRows
 21362      4.9    0.5  Board:choose
 15116      4.4    0.4  Board:choose
 15116      3.0    0.3  Board:unchoose
    90      2.1    0.2  NQueens:explore
  2680      2.1    0.2  SolutionSaver:saveSolution
  8492      1.7    0.2  Board:choose
  2468      1.7    0.2  Board:choose
  8492      1.7    0.2  Board:unchoose
  2468      1.6    0.2  Board:legalRows
  2680      0.6    0.1  Board:unchoose
  2680      0.6    0.1  Board:choose
     1      0.5    0.0  NQueens:explore
  2468      0.5    0.0  Board:unchoose
    11      0.4    0.0  NQueens:explore
   536      0.4    0.0  Board:legalRows
   536      0.1    0.0  Board:choose
     1      0.1    0.0  SolutionSaver:<init>
   536      0.1    0.0  Board:unchoose
     1      0.1    0.0  SolutionSaver:nSolutions
    90      0.1    0.0  Board:legalRows
    90      0.1    0.0  Board:unchoose
     1      0.1    0.0  Board:legalRows
     1      0.0    0.0  Board:<init>
    11      0.0    0.0  Board:legalRows
    90      0.0    0.0  Board:choose
    11      0.0    0.0  Board:choose
    11      0.0    0.0  Board:unchoose

+--------------------------------------+
| Most expensive methods summarized    |
+--------------------------------------+

               Net
          ------------
 Count     Time    Pct  Location
 =====     ====    ===  ========
166926    801.7   77.6  NQueens:explore
164246     88.0    8.5  Board:legalRows
166925     77.6    7.5  Board:choose
166925     40.2    3.9  Board:unchoose
     1     12.6    1.2  NQueens:main
  2680     11.0    1.1  Board:toString
  2680      2.1    0.2  SolutionSaver:saveSolution
     1      0.1    0.0  SolutionSaver:<init>
     1      0.1    0.0  SolutionSaver:nSolutions
     1      0.0    0.0  Board:<init>

From the tree diagram of method calls, we can see the actual number of branches explored at each depth of the decision tree:

  • At a depth of 11, the explore() method was called 2680 times, precisely the number of solutions to the N queens problem for N = 11. (Reaching a depth of 11 means all 11 queens have been placed.)
  • At a depth of 10, the explore() method was called 15116 times.
  • At a depth of 9, the explore() method was called 34774 times.
  • At a depth of 8, the explore() method was called 44148 times. This is the depth at which the decision tree reaches its maximum width/complication.
  • At a depth of 7, the explore() method was called 37248 times.
  • At a depth of 6, the explore() method was called 21362 times.
  • At a depth of 5, the explore() method was called 8492 times.
  • At a depth of 4, the explore() method was called 2468 times.
  • At a depth of 3, the explore() method was called 536 times.
  • At a depth of 2, the explore() method was called 90 times.
  • At a depth of 1, the explore() method was called 11 times.

If we add up each of these numbers, except for the depth of 11, or if we just look at the number times the legalRows() method is called, or subtract 2,680 from the number of times the explore() method was called, we get 164,246 - precisely the same number of times Perl calls the for loop to loop over each column.


HPROF

This is another profiling tool, but unlike JIP this one comes bundled with Java. Here is the official Oracle documentation for HPROF: https://docs.oracle.com/javase/7/docs/technotes/samples/hprof.html

This can be run with multiple options, and can produce some REALLY big dump files, so try it on small executables/small cases first.

Running without any options can produce some huge files:

$ java -agentlib:hprof NQueens

Specifying the no verbose option can help make the dump smaller:

$ java -agentlib:hprof=verbose=n NQueens

Dumping out the profile based on CPU usage is interesting:

$ java -agentlib:hprof=cpu=samples NQueens

It shows CPU usage information, broken down by methods.

SITES BEGIN (ordered by live bytes) Sun Mar 19 17:07:21 2017
          percent          live          alloc'ed  stack class
 rank   self  accum     bytes objs     bytes  objs trace name
    1 37.89% 37.89%   4162176 65034  10510976 164234 300465 int[]
    2 18.98% 56.87%   2084640 65145   5259008 164344 300271 java.util.LinkedList
    3 18.95% 75.82%   2081504 65047   5255872 164246 300463 java.util.LinkedList$ListItr
    4 14.42% 90.24%   1584096 66004   4006200 166925 300461 java.util.LinkedList$Node
    5  2.15% 92.39%    235840 2680    235840  2680 300470 char[]
    6  0.84% 93.23%     92096  998     92096   998 300000 char[]
    7  0.59% 93.81%     64320 2680     64320  2680 300469 java.lang.String
    8  0.59% 94.40%     64320 2680     64320  2680 300471 java.util.LinkedList$Node
    9  0.44% 94.84%     48752  554    133320  1515 300468 char[]
   10  0.41% 95.25%     45072  939    128640  2680 300467 char[]
   11  0.31% 95.56%     33880  385    127560  1313 300010 char[]
   12  0.24% 95.80%     26312  518     26312   518 300000 java.lang.Object[]
   13  0.23% 96.03%     25504    8     25504     8 300000 byte[]
   14  0.22% 96.25%     24232  988     24232   988 300000 java.lang.String
   15  0.21% 96.46%     22536  939     64320  2680 300466 java.lang.StringBuilder
   16  0.06% 96.52%      6672   97      7184   105 300268 char[]
   17  0.04% 96.56%      4640  257      4640   257 300000 java.lang.Integer
   18  0.04% 96.60%      4288  118      4288   118 300000 java.util.Hashtable$Entry
   19  0.04% 96.64%      4224    8      4224     8 300250 byte[]
   20  0.03% 96.67%      3584   77      3584    77 300000 java.lang.ref.SoftReference
   21  0.03% 96.70%      3496   21      5256    35 300129 char[]
   22  0.03% 96.73%      3352   20      4128    26 300151 char[]
   23  0.03% 96.76%      2912   91      2912    91 300158 java.util.HashMap$Node
   24  0.03% 96.79%      2872    8      2872     8 300251 byte[]
   25  0.02% 96.81%      2704   59      2704    59 300000 java.lang.String[]
   26  0.02% 96.83%      2368   39      2368    39 300000 sun.util.locale.LocaleObjectCache$CacheEntry
   27  0.02% 96.85%      2328   97      2328    97 300273 java.util.LinkedList$Node
   28  0.02% 96.88%      2272   28      2272    28 300000 java.net.URL
   29  0.02% 96.90%      2136   89      2328    97 300269 java.lang.String
   30  0.02% 96.91%      2048    5      2048     5 300000 java.lang.Thread
   31  0.02% 96.93%      2000   50      2000    50 300100 java.lang.ref.Finalizer
   32  0.02% 96.95%      1944   81      1944    81 300275 java.util.LinkedList$Node
   33  0.02% 96.97%      1784   32      1784    32 300000 java.util.LinkedHashMap$Entry
   34  0.02% 96.98%      1760   39      1760    39 300000 java.util.concurrent.ConcurrentHashMap$Node
   35  0.02% 97.00%      1656    4      1656     4 300000 int[]
   36  0.01% 97.01%      1584    2      1584     2 300000 java.lang.Integer[]
   37  0.01% 97.03%      1504    4      1504     4 300001 java.lang.Thread
   38  0.01% 97.04%      1480   10      1480    10 300000 java.lang.Class
   39  0.01% 97.05%      1472    6      1472     6 300000 java.util.Hashtable$Entry[]
   40  0.01% 97.06%      1304   20      1304    20 300000 sun.util.locale.BaseLocale$Key
   41  0.01% 97.08%      1288   32      1288    32 300000 java.io.ExpiringCache$Entry
   42  0.01% 97.09%      1280   16      1280    16 300198 java.util.WeakHashMap$Entry[]
   43  0.01% 97.10%      1280   20      1664    26 300152 java.net.URL
   44  0.01% 97.11%      1280   16      1280    16 300201 java.lang.Object[]
   45  0.01% 97.12%      1224   18      1224    18 300000 java.io.ObjectStreamField
   46  0.01% 97.13%      1152   20      1152    20 300000 sun.util.locale.BaseLocale
   47  0.01% 97.14%      1152   20      1152    20 300000 java.util.Locale
   48  0.01% 97.15%      1136    6      1136     6 300000 java.util.HashMap$Node[]
   49  0.01% 97.16%      1120   14      1280    16 300274 java.util.HashMap$Node[]
SITES END

Interpreting Profiler Output

The fact that both codes are evaluating the same total number of solutions, 164,245 (plus/minus 1), is a reassurance that timing differences between Java and Perl are "real," and not an artifact of different algorithms that explore solutions differently. Beyond that, it's hard to do intercomparisons, due to the very different output of the profilers.

The Perl profiler seems like a better tool for a deep, line-by-line comparison, making it possible to drill into specific statements that are bottlenecks, while Java is better for more sprawling call stacks - figuring out what happened at a HIGH level, identify class or method level bottlenecks.