From charlesreid1

Summary

Background

Recently I read an [(11 year old) article http://steve-yegge.blogspot.com/2006/03/execution-in-kingdom-of-nouns.html] by Steve Yegge entitled "Execution in the Kingdom of Nouns." In it, Steve describes the way that in Java,

Classes are really the only modeling tool Java provides you. So whenever a new idea occurs to you, you have to sculpt it or wrap it or smash at it until it becomes a thing, even if it began life as an action, a process, or any other non-'thing' concept.

This idea really resonated with me. I have been curious about verb-oriented languages like Haskell and OCaml, and have found lambda functions extremely useful in data analysis when using Python (not to mention the wonderful Python feature of being able to pass functions or lambda functions as parameters).

After reading this article, I began thinking about how we might implement programs in Java using verb-oriented thinking. So, I began by thinking through the N-queens problem, a classic recursive backtracking problem.

Perl vs. Java

Steve Yegge's article had another quote that stuck out to me:

I've really come around to what Perl folks were telling me 8 or 9 years ago: 'Dude, not everything is an object.'

The last tool I wrote in Perl was a web scraper called the [Swartz Mechanizer](https://github.com/charlesreid1/swartz-mechanizer). It was an unpleasant and ugly script running to 800 lines; it's main purpose was to navigate a page, click buttons, and download PDF files. It worked, and it was entirely action-based, consisting of functions doing actions and passing small parcels of information around.

This time around I used Perl to implement a recursive backtracking solution to the N-queens problem, which consisted of a recursive function that kept track of queen placement using an array of integers. The Perl solution was a verb-based solution that was principally performing *actions* using built-in Perl types. This is one of the core tenets of a verb-based approach: you reshape your problem to fit your data structures (in contrast to the noun-based approach, which does it the other way around).

The Perl solution was written first, and the Java solution second. The Java solution also implement a recursive backtracking approach, and is modeled closely on the Perl solution. It also uses integer arrays to keep track of queen positions, and uses a static class to emulate the global solution storage variable. It also checks for legal moves in the same way that the Perl program does.

The Perl Solution

In Perl, you have the concept of a global namespace, by declaring variables at the top level of the file. The Perl implementation of the N queens problem utilizes a global array to store solutions, so that when the base case of the recursive backtracking method is reached, meaning a solution has been found, it is added to this global array, which is shared amongst all instances of the recursive function. This is the only "sharing" that needs to happen among instances of the recursive function (and even then, the sharing isn't strictly necessary, since the set of solutions can be passed as a parameter).

See the original Perl N queens problem solution and a script for gathering timing data for different board sizes: https://gist.github.com/charlesreid1/4ce97a5f896ff1c89855a5d038d51535


#!/usr/bin/perl

# Solve the N queens problem
# using recursive backtracking.
#
# Author: Charles Reid
# Date: March 2017
#
# Modified and expanded from http://rosettacode.org/wiki/N-queens_problem#Perl

# Create an array to store solutions
my @solutions;

# Create an array to store where queens have been placed
my @queens;

# Mark the rows already used (useful for lookup)
my @occupied;

# explore() implements a recursive, depth-first backtracking method
sub explore {
	# Parameters:
	#	depth : this is the argument passed by the user

	# First argument passed to the function is $depth
	# (how many queens we've placed on the board),
	# so use shift to pop that out of the parameters
	my ($depth, @diag) = shift;

	# Explore is a recursive method,
	# so we need a base case and a recursive case.
	#
	# The base case is, we've reached a leaf node,
	# placed 8 queens, and had no problems,
	# so we found a solution.
	if ($depth==$board_size) {
		# Here, we store the stringified version of @queens,
		# which are the row numbers of prior queens.
		# This is a global variable that is shared across
		# instances of this recursive function.
		push @solutions, "@queens\n";
		return;
	}

	# Mark the squares that are attackable,
	# so that we can cut down on the search space.
	$#diag = 2 * $board_size;
	for( 0 .. $#queens) {
		$ix1 = $queens[$_] + $depth - $_ ;
		$diag[ $ix1 ] = 1;

		$ix2 = $queens[$_] - $depth + $_ ;
		$diag[ $ix2 ] = 1;
	}

	for my $row (0 .. $board_size-1) {
		# Cut down on the search space:
		# if this square is already occupied
		# or will lead to an invalid solution,
		# don't bother exploring it.
		next if $occupied[$row] || $diag[$row];

		# Make a choice
		push @queens, $row;
		# Mark the square as occupied
		$occupied[$row] = 1;

		# Explore the consequences
		explore($depth+1);

		# Unmake the choice
		pop @queens;

		# Mark the square as unoccupied
		$occupied[$row] = 0;

	}
}

$board_size = 8;

explore(0);

print "total ", scalar(@solutions), " solutions\n";

The Java Solution

See the original Java N queens problem solution and a script for gathering timing data for different board sizes: https://gist.github.com/charlesreid1/4ce97a5f896ff1c89855a5d038d51535


import java.util.LinkedList;
import java.util.Arrays;

public class NQueens {

    public static final int SIZE = 12;

    ///////////////////////////////////////////////////
    // main
    //
    public static void main(String[] args) {
        SolutionSaver s = new SolutionSaver();
        Board b = new Board(SIZE);
        explore(b,0);
		System.out.printf("Found %d solutions\n ", SolutionSaver.nSolutions() );
    }
    //
    ///////////////////////////////////////////////////


    /// Placing queens, column-by-column; place the queen in column col.
    public static void explore(Board b, int col) {

        if(col >= SIZE ) {

            // We have reached the end.
            // No conflicts so far means no conflicts period.

            // Save solution in a static class, no instantiation overhead
            SolutionSaver.saveSolution( b.toString() );

		} else {

            // The legalRows() method is a little bit of magic.
            // It returns an array of valid rows on which to place the col^th queen.
            for(Integer row : b.legalRows(col) ) { // important question: is this called each time in the loop?
                                                   // (do i need to make a copy outside the loop?)
                b.choose(row,col);
                explore(b,col+1);
                b.unchoose(row,col);

            }
        }// done with base/recursive cases
    }

}

class Board {

    int[] queens;                   // Array to store where queens have been placed.
                                    // The queens array has a length of board_size.
                                    // Each element stores an integer between 1 and N.
                                    // That indicates the row/column.

    int[] occupiedrows;             // Array to mark occupied rows.
                                    // This is how we check for horizontal attacks.

    public static int board_size;

    public Board(int size) {
        this.board_size = size;
        this.queens = new int[size];
        this.occupiedrows = new int[size];
    }

    /**
     * Get String representation of queen positions.
     * This prints 8 numbers, corresponding to 8 columns.
     * Each number is an integer, 1..(board_size-1), indicating
     * the row of the queen on that particular column.
     * All queens on first row would be 0 0 0 0 0 0 0 0
     */
    public String toString() {
        return Arrays.toString(this.queens);
    }

    /// Make the choice of putting a queen on row, col
    public void choose(int row, int col) {
        if( col < this.queens.length && row < this.occupiedrows.length ) {
            this.queens[col] = row;
            this.occupiedrows[row] = 1;
        }
    }

    /// Unmake the choice of putting a queen on row, col
    public void unchoose(int row, int col) {
        if( col < this.queens.length && row < this.occupiedrows.length ) {
            this.queens[col] = 0;
            this.occupiedrows[row] = 0;
        }
    }

    /// Get a list of legal rows
    public LinkedList<Integer> legalRows( int col ) {

        LinkedList<Integer> legalList = new LinkedList<Integer>();

        // 1. Mark invalid squares on diagonals of already-placed queens
        //
        // 2. For this column, loop over each row where it's legal to place a queen,
        //    and run backtracking on that choice. Then unmake the choice and keep going.

        // Store invalid rows on other queens' diagonals
        int[] diag = new int[this.board_size];

        // Loop over all of the queens already placed (col-1)
        for(int k = 0; k <= (col-1); k++ ) {

            // We're gonig to place the next queen on col.
            // Find which squares are on diagonal of queen k,
            // and mark them as impossible.

            // Lower right diagonal
            int ix1 = this.queens[k] + col - k;
            if(ix1 >= 0 && ix1 < this.board_size ) {
                diag[ix1] = 1;
            }

            // Upper right diagonal
            int ix2 = this.queens[k] - col + k;
            if(ix2 >= 0 && ix2 < this.board_size ) {
                diag[ix2] = 1;
            }

        }

        // Legal rows are non-diagonal squares and squares on non-occupied rows.
        for (int row = 0; row < this.board_size; row++) {

            // If this row is legal, add it to the list
            boolean legal = diag[row]==0 && this.occupiedrows[row]==0;
            if(legal) {
                legalList.add(row);
            }

        }

        return legalList;
    }
}


class SolutionSaver {
    private static LinkedList<String> solutions;

    public SolutionSaver() {
        SolutionSaver.solutions = new LinkedList<String>();
    }

    public static void saveSolution(String serialized) {
        SolutionSaver.solutions.add(serialized);
    }

    public static void printSolutions() {
        int c = 0;
        for( String sol : solutions ) {
            System.out.printf("Solution %d: %s \n", c, sol);
        }
    }
    public static int nSolutions() {
        return solutions.size();
    }
}

Profiling

Profiling Perl

To profile the Perl code, I used the Devel::NYTProf Perl module. (See Perl/Profiling for details.) This gave a nice interactive report that showed that much of the time Perl was spending was being spent doing computations for valid locations to place queens.

perl -d:NYTProf nqueens.pl

This generates a nytprof.out file that contains a database of profiling information. This information can be extracted to various formats, but here's how to extract it to a CSV file:

$ /usr/local/Cellar/perl/5.24.0_1/bin/nytprofcsv nytprof.out

This will generate a directory called nytprof/ containing a CSV file:

# Profile data generated by Devel::NYTProf::Reader
# Version: v6.04
# More information at http://metacpan.org/release/Devel-NYTProf/
# Format: time,calls,time/call,code
0.086191,166926,0.000001,my ($depth, @attacked) = shift;
0.010183,2680,0.000004,push @solutions, "@queens\n";
0.010521,2680,0.000004,return;
0.196345,164246,0.000001,$#attacked = 2 * $board_size;
0.152597,164246,0.000001,for( 0 .. $#queens) {
0.683294,1.26035e+06,0.000001,$attacked[ $ix2 ] = 1;
1.189223,164246,0.000007,for my $row (0 .. $board_size-1) {
0.272883,166925,0.000002,explore($depth+1);
0.116465,166925,0.000001,$occupied[$row] = 0;
0.000001,1,0.000001,$board_size = 11;
0.000019,1,0.000019,explore(0);
0.000080,1,0.000080,print "Found ", scalar(@solutions), " solutions\n";

One number here in particular is important - the number of times we call the for loop over columns. This gives us a TOTAL count of the number of solutions that were tried by the program. This number, for Perl, is 164,246. We will see that Java explores 164,245 solutions, so the difference in computational cost between these languages is not coming from a drastic difference in operation counts.

Profiling Java

To profile the Java code, I used the Java Interactive Profiler (JIP). (See Java/Profiling for details.) This generates a profile.txt file, which gives a summary of how many times each method was called at each recursive level, plus a summary of the most expensive methods. This helps identify where most of the time was spent.

$ java -javaagent:/Volumes/noospace/Users/charles/Downloads/jip-src-1.2/profile/profile.jar NQueens

objc[39911]: Class JavaLaunchHelper is implemented in both /Library/Java/JavaVirtualMachines/jdk1.8.0_102.jdk/Contents/Home/bin/java and /Library/Java/JavaVirtualMachines/jdk1.8.0_102.jdk/Contents/Home/jre/lib/libinstrument.dylib. One of the two will be used. Which one is undefined.
profiler: on
remote: off
port: 15599
thread-depth: -1
thread.compact.threshold.ms: 10
max-method-count: -1
method.compact.threshold.ms: 10
file: profile.txt
track.object.alloc: off
output: text
debug: off
profiler-class: com.mentorgen.tools.profile.runtime.Profile
output-method-signatures: no
clock-resolution: ms
output-summary-only: no
exclude:null
Accept ClassLoader: sun.misc.Launcher$AppClassLoader
ClassLoaderFilter.1: null
Using the generic class loader filter.
Java Interactive Profiler: starting
------------------
Found 2680 solutions
 Controller -- shuttingdown

Here's an example of the output:

+----------------------------------------------------------------------
|  File: profile.txt
|  Date: 2017.03.19 13:57:01 PM
+----------------------------------------------------------------------

+------------------------------
| Thread depth limit: Unlimited
+------------------------------
+------------------------------
| Thread: 1
+------------------------------
              Time            Percent
       ----------------- ---------------
 Count    Total      Net   Total     Net  Location
 =====    =====      ===   =====     ===  =========
     1   1033.5     12.6   100.0     1.2  +--NQueens:main	()
     1      0.1      0.1     0.0          | +--SolutionSaver:<init>	()
     1      0.0      0.0     0.0          | +--Board:<init>	()
     1   1020.6      0.5    98.8          | +--NQueens:explore	()
     1      0.1      0.1     0.0          | | +--Board:legalRows	()
    11      0.0      0.0     0.0          | | +--Board:choose	()
    11   1020.0      0.4    98.7          | | +--NQueens:explore	()
    11      0.0      0.0     0.0          | | | +--Board:legalRows	()
    90      0.0      0.0     0.0          | | | +--Board:choose	()
    90   1019.4      2.1    98.6     0.2  | | | +--NQueens:explore	()
    90      0.1      0.1     0.0          | | | | +--Board:legalRows	()
   536      0.1      0.1     0.0          | | | | +--Board:choose	()
   536   1016.9      9.3    98.4     0.9  | | | | +--NQueens:explore	()
   536      0.4      0.4     0.0          | | | | | +--Board:legalRows	()
  2468      1.7      1.7     0.2     0.2  | | | | | +--Board:choose	()
  2468   1005.0     33.7    97.2     3.3  | | | | | +--NQueens:explore	()
  2468      1.6      1.6     0.2     0.2  | | | | | | +--Board:legalRows	()
  8492      1.7      1.7     0.2     0.2  | | | | | | +--Board:choose	()
  8492    966.3     90.3    93.5     8.7  | | | | | | +--NQueens:explore	()
  8492      5.1      5.1     0.5     0.5  | | | | | | | +--Board:legalRows	()
 21362      4.9      4.9     0.5     0.5  | | | | | | | +--Board:choose	()
 21362    860.9    169.8    83.3    16.4  | | | | | | | +--NQueens:explore	()
 21362     12.2     12.2     1.2     1.2  | | | | | | | | +--Board:legalRows	()
 37248     47.8     47.8     4.6     4.6  | | | | | | | | +--Board:choose	()
 37248    623.2    206.5    60.3    20.0  | | | | | | | | +--NQueens:explore	()
 37248     19.1     19.1     1.8     1.8  | | | | | | | | | +--Board:legalRows	()
 44148      9.4      9.4     0.9     0.9  | | | | | | | | | +--Board:choose	()
 44148    374.7    162.7    36.3    15.7  | | | | | | | | | +--NQueens:explore	()
 44148     25.9     25.9     2.5     2.5  | | | | | | | | | | +--Board:legalRows	()
 34774      6.9      6.9     0.7     0.7  | | | | | | | | | | +--Board:choose	()
 34774    171.5     94.7    16.6     9.2  | | | | | | | | | | +--NQueens:explore	()
 34774     16.7     16.7     1.6     1.6  | | | | | | | | | | | +--Board:legalRows	()
 15116      4.4      4.4     0.4     0.4  | | | | | | | | | | | +--Board:choose	()
 15116     52.8     24.4     5.1     2.4  | | | | | | | | | | | +--NQueens:explore	()
 15116      7.0      7.0     0.7     0.7  | | | | | | | | | | | | +--Board:legalRows	()
  2680      0.6      0.6     0.1          | | | | | | | | | | | | +--Board:choose	()
  2680     20.3      7.2     2.0     0.7  | | | | | | | | | | | | +--NQueens:explore	()
  2680     11.0     11.0     1.1     1.1  | | | | | | | | | | | | | +--Board:toString	()
  2680      2.1      2.1     0.2     0.2  | | | | | | | | | | | | | +--SolutionSaver:saveSolution	()
  2680      0.6      0.6     0.1          | | | | | | | | | | | | +--Board:unchoose	()
 15116      3.0      3.0     0.3     0.3  | | | | | | | | | | | +--Board:unchoose	()
 34774      7.7      7.7     0.7     0.7  | | | | | | | | | | +--Board:unchoose	()
 44148     13.5     13.5     1.3     1.3  | | | | | | | | | +--Board:unchoose	()
 37248      7.9      7.9     0.8     0.8  | | | | | | | | +--Board:unchoose	()
 21362      5.2      5.2     0.5     0.5  | | | | | | | +--Board:unchoose	()
  8492      1.7      1.7     0.2     0.2  | | | | | | +--Board:unchoose	()
  2468      0.5      0.5     0.0          | | | | | +--Board:unchoose	()
   536      0.1      0.1     0.0          | | | | +--Board:unchoose	()
    90      0.1      0.1     0.0          | | | +--Board:unchoose	()
    11      0.0      0.0     0.0          | | +--Board:unchoose	()
     1      0.1      0.1     0.0          | +--SolutionSaver:nSolutions	()

+--------------------------------------
| Most expensive methods (by net time)
| Frame Count Limit: Unlimited
+--------------------------------------

               Net
          ------------
 Count     Time    Pct  Location
 =====     ====    ===  ========
 37248    206.5   20.0  NQueens:explore
 21362    169.8   16.4  NQueens:explore
 44148    162.7   15.7  NQueens:explore
 34774     94.7    9.2  NQueens:explore
  8492     90.3    8.7  NQueens:explore
 37248     47.8    4.6  Board:choose
  2468     33.7    3.3  NQueens:explore
 44148     25.9    2.5  Board:legalRows
 15116     24.4    2.4  NQueens:explore
 37248     19.1    1.8  Board:legalRows
 34774     16.7    1.6  Board:legalRows
 44148     13.5    1.3  Board:unchoose
     1     12.6    1.2  NQueens:main
 21362     12.2    1.2  Board:legalRows
  2680     11.0    1.1  Board:toString
 44148      9.4    0.9  Board:choose
   536      9.3    0.9  NQueens:explore
 37248      7.9    0.8  Board:unchoose
 34774      7.7    0.7  Board:unchoose
  2680      7.2    0.7  NQueens:explore
 15116      7.0    0.7  Board:legalRows
 34774      6.9    0.7  Board:choose
 21362      5.2    0.5  Board:unchoose
  8492      5.1    0.5  Board:legalRows
 21362      4.9    0.5  Board:choose
 15116      4.4    0.4  Board:choose
 15116      3.0    0.3  Board:unchoose
    90      2.1    0.2  NQueens:explore
  2680      2.1    0.2  SolutionSaver:saveSolution
  8492      1.7    0.2  Board:choose
  2468      1.7    0.2  Board:choose
  8492      1.7    0.2  Board:unchoose
  2468      1.6    0.2  Board:legalRows
  2680      0.6    0.1  Board:unchoose
  2680      0.6    0.1  Board:choose
     1      0.5    0.0  NQueens:explore
  2468      0.5    0.0  Board:unchoose
    11      0.4    0.0  NQueens:explore
   536      0.4    0.0  Board:legalRows
   536      0.1    0.0  Board:choose
     1      0.1    0.0  SolutionSaver:<init>
   536      0.1    0.0  Board:unchoose
     1      0.1    0.0  SolutionSaver:nSolutions
    90      0.1    0.0  Board:legalRows
    90      0.1    0.0  Board:unchoose
     1      0.1    0.0  Board:legalRows
     1      0.0    0.0  Board:<init>
    11      0.0    0.0  Board:legalRows
    90      0.0    0.0  Board:choose
    11      0.0    0.0  Board:choose
    11      0.0    0.0  Board:unchoose

+--------------------------------------+
| Most expensive methods summarized    |
+--------------------------------------+

               Net
          ------------
 Count     Time    Pct  Location
 =====     ====    ===  ========
166926    801.7   77.6  NQueens:explore
164246     88.0    8.5  Board:legalRows
166925     77.6    7.5  Board:choose
166925     40.2    3.9  Board:unchoose
     1     12.6    1.2  NQueens:main
  2680     11.0    1.1  Board:toString
  2680      2.1    0.2  SolutionSaver:saveSolution
     1      0.1    0.0  SolutionSaver:<init>
     1      0.1    0.0  SolutionSaver:nSolutions
     1      0.0    0.0  Board:<init>

From the tree diagram of method calls, we can see the actual number of branches explored at each depth of the decision tree:

  • At a depth of 11, the explore() method was called 2680 times, precisely the number of solutions to the N queens problem for N = 11. (Reaching a depth of 11 means all 11 queens have been placed.)
  • At a depth of 10, the explore() method was called 15116 times.
  • At a depth of 9, the explore() method was called 34774 times.
  • At a depth of 8, the explore() method was called 44148 times. This is the depth at which the decision tree reaches its maximum width/complication.
  • At a depth of 7, the explore() method was called 37248 times.
  • At a depth of 6, the explore() method was called 21362 times.
  • At a depth of 5, the explore() method was called 8492 times.
  • At a depth of 4, the explore() method was called 2468 times.
  • At a depth of 3, the explore() method was called 536 times.
  • At a depth of 2, the explore() method was called 90 times.
  • At a depth of 1, the explore() method was called 11 times.

If we add up each of these numbers, except for the depth of 11, we get 164,245 - precisely one shy of the number of times Perl calls the for loop to loop over each column, which is called 164,246 times.

Interpreting Differences in Profiling

The fact that both codes are evaluating the same total number of solutions, 164,245 (plus/minus 1), is a reassurance that timing differences between Java and Perl are "real," and not an artifact of different algorithms that explore solutions differently.