From charlesreid1

TLDR: Use Devel::NYTProf

How to profile Perl

Before you begin

Installing cpanm

Start by installing cpanm (shown here via Homebrew on macOS) to take care of all your Perl packages:

brew install cpanm

Installing Perl Profiling Tools


Install SizeMe:

$ cpanm Devel::SizeMe

Install the best Perl profiling tool out there, Devel::NYTProf, originally written by Adam Kaplan at The New York Times:

$ cpanm Devel::NYTProf

This also installs NYTProf's dependencies, of which there are several.

Profiling Code using SizeMe

Hat tip to SO user @TimBunce (the author of SizeMe) for pointing out this module for profiling Perl heap memory usage.
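As a rough sketch of what SizeMe can tell you, here's a hypothetical one-liner (assumes Devel::SizeMe is installed; the byte counts will vary with your perl build):

```shell
# Compare the size of the array structure itself (size) with the
# total memory reachable through it (total_size), in bytes.
perl -MDevel::SizeMe=size,total_size -e '
    my @a = (1 .. 10_000);
    printf "size: %d bytes, total_size: %d bytes\n", size(\@a), total_size(\@a);
'
```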

Profiling Code using NYTProf

Normally, we can time code using the time command, which gives a really high-level description of how long the code took to run. Like, "this much user time, this much real wall-clock time." Here's an example of output from the time command:

Running 8 queens problem with Perl...
total 92 solutions

real	0m0.112s
user	0m0.025s
sys	0m0.008s

Running 12 queens problem with Perl...
total 14200 solutions

real	0m11.726s
user	0m11.519s
sys	0m0.032s

We want to get a more detailed description of what's going on with the code to find bottlenecks.

Running with NYTProf

After installing it, run a Perl script with the NYTProf module turned on (script.pl here stands in for your own script):

perl -d:NYTProf script.pl

This results in an incomprehensible nytprof.out file.

Some tools:

  • nytprofcsv - export the profile as a CSV file
  • nytprofcg - convert the profile to callgrind format (viewable in KCachegrind)
  • nytprofhtml - generate an HTML report

NYTProf csv format

Now you can run nytprofcsv to turn that output into a CSV file, one line per line of code, with a summary of where all the time was spent:

/usr/local/Cellar/perl/5.24.0_1/bin/nytprofcsv nytprof.out

This puts the CSV file into a nytprof directory. Here's an example output for the N queens problem, N = 11:

# Profile data generated by Devel::NYTProf::Reader
# Version: v6.04
# More information at
# Format: time,calls,time/call,code
0.086191,166926,0.000001,my ($depth, @attacked) = shift;
0.010183,2680,0.000004,push @solutions, "@queens\n";
0.196345,164246,0.000001,$#attacked = 2 * $board_size;
0.152597,164246,0.000001,for( 0 .. $#queens) {
0.683294,1.26035e+06,0.000001,$attacked[ $ix2 ] = 1;
1.189223,164246,0.000007,for my $row (0 .. $board_size-1) {
0.116465,166925,0.000001,$occupied[$row] = 0;
0.000001,1,0.000001,$board_size = 11;
0.000080,1,0.000080,print "Found ", scalar(@solutions), " solutions\n";

From this we can see the bottlenecks are the for loop over the rows of the board and the population of the @attacked array.

NYTProf html format

For a recursive backtracking function, you don't get a whole lot of information, but the report does tell you the maximum depth of the recursion and the total call time. It also shows very clearly that the biggest bottleneck in the code is populating the array of legal locations to place queens, which makes sense, given that it's a lot of index math and stuffing numbers into arrays.

The profiler output also shows that the loop marking invalid locations along the diagonals of the other queens involves a huge number of calculations - those lines are called 7 million times! Likewise, the check of whether the particular square we're considering is a valid square is called 10 million times!

The total number of solutions for the N queens problem, when N is 11, is 2680.
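To generate the HTML report described above (assumes Devel::NYTProf is installed and nytprof.out already exists in the current directory):

```shell
# Convert nytprof.out into a browsable HTML report under ./nytprof/
nytprofhtml --file nytprof.out --out nytprof

# Then open nytprof/index.html in a browser to drill into per-line timings
```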


Timing Code

See Perl/Timing



Official package info (some documentation, but nothing high-level at all).

Slide deck explaining a bit more about how to use NYTProf.