From charlesreid1

Loading Data

Loading Single CSV File

A simple example of loading and manipulating CSV data in D3 is given here.

The Data

Let's begin with some data. This is data about runs allowed by the Giants and Athletics (http://www.baseball-reference.com/play-index/inning_summary.cgi?year=2014&team_id=OAK&submitter=1).

Team,Year,Inning,#,0,Any,1,2,3,4,≥5,Most,Total,Avg,AvgPer9inn
As,2014,1,86,61,25,17,6,1,1,0,4,36,0.42,3.77
As,2014,2,86,68,18,10,6,2,0,0,3,28,0.33,2.93
As,2014,3,86,64,22,18,1,1,2,0,4,31,0.36,3.24
As,2014,4,86,65,21,13,3,2,2,1,6,39,0.45,4.08
As,2014,5,86,63,23,14,5,4,0,0,3,36,0.42,3.77
As,2014,6,86,63,23,13,6,1,1,2,6,43,0.50,4.50
As,2014,7,86,72,14,8,3,2,1,0,4,24,0.28,2.51
As,2014,8,86,70,16,12,1,1,2,0,4,25,0.29,2.62
As,2014,9,72,58,14,4,6,2,2,0,4,30,0.42,3.79
As,2014,10,15,13,2,1,1,0,0,0,2,3,0.20,1.80
As,2014,11,9,8,1,0,0,0,1,0,4,4,0.46,4.15
As,2014,12,5,4,1,1,0,0,0,0,1,1,0.21,1.93
As,2014,13,2,2,0,0,0,0,0,0,0,0,0.00,0.00
As,2014,14,2,1,1,1,0,0,0,0,1,1,0.60,5.40
As,2014,Total,793,612,181,112,38,16,12,3,6,301,0.38,3.42
Giants,2014,1,87,67,20,8,8,2,2,0,4,38,0.44,3.93
Giants,2014,2,87,68,19,13,5,1,0,0,3,26,0.30,2.69
Giants,2014,3,87,60,27,11,15,0,1,0,4,45,0.52,4.66
Giants,2014,4,87,67,20,8,7,4,1,0,4,38,0.44,3.93
Giants,2014,5,87,61,26,14,5,4,2,1,5,49,0.56,5.07
Giants,2014,6,87,67,20,11,7,2,0,0,3,31,0.36,3.22
Giants,2014,7,86,68,18,13,4,0,0,1,5,26,0.30,2.72
Giants,2014,8,86,67,19,14,2,1,2,0,4,29,0.34,3.03
Giants,2014,9,72,55,17,9,6,1,0,1,5,29,0.41,3.66
Giants,2014,10,8,7,1,1,0,0,0,0,1,1,0.13,1.13
Giants,2014,11,4,2,2,0,1,0,0,1,5,7,1.75,15.75
Giants,2014,12,2,2,0,0,0,0,0,0,0,0,0.00,0.00
Giants,2014,13,1,1,0,0,0,0,0,0,0,0,0.00,0.00
Giants,2014,Total,781,592,189,102,60,15,8,4,5,319,0.41,3.68

Loading Into D3

We can load our data into a D3 document like so:

<html>
  <head>
    <title>Bar Chart</title>
    <script type="text/javascript" src="/d3.v2.js"></script>
  </head>
  <body>
    <script type="text/javascript">

d3.csv("data_as_giants.csv", function(data) {
  // here's where we do stuff with the data
});

    </script>
  </body>
</html>

Grabbing Data Header

Let's say we want to get a Javascript array containing each of our CSV's headers, like this:

["0", "1", "2", "3", "4", "Team", "Year", "Inning", "#", "Any", "≥5", "Most", "Total", "Avg", "AvgPer9inn"] 

We can do this by adding some code right after our d3.csv call:

d3.csv("data_as_giants.csv", function(data) {

  var headerRow = Object.keys(data[0]);
  console.log(headerRow);

});

First, data[0] grabs only the first row. The Object.keys() method extracts the keys of that row, and we assign them to the headerRow variable. Finally, we print the variable to the console. Note that the keys do not come back in the CSV's column order: Javascript lists the integer-like headers ("0" through "4") first, so we see precisely the array we are expecting:

["0", "1", "2", "3", "4", "Team", "Year", "Inning", "#", "Any", "≥5", "Most", "Total", "Avg", "AvgPer9inn"] 
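The ordering can be checked without loading any file. Here is a minimal sketch, assuming a hand-built row object that stands in for data[0] (the variable names are mine; the values are copied from the first row of the CSV above):

```javascript
// Hand-built stand-in for data[0]; d3.csv stores every value as a string.
var headers = ["Team", "Year", "Inning", "#", "0", "Any", "1", "2", "3", "4",
               "≥5", "Most", "Total", "Avg", "AvgPer9inn"];
var values = ["As", "2014", "1", "86", "61", "25", "17", "6", "1", "1",
              "0", "4", "36", "0.42", "3.77"];

var row = {};
headers.forEach(function(h, i) { row[h] = values[i]; });

// Integer-like keys are listed first (ascending), then the remaining
// keys in the order they were added.
var headerRow = Object.keys(row);
console.log(headerRow);
```

So the header order in the console reflects how Javascript lists object keys, not how the columns are laid out in the file.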

Manipulating Data Row-by-Row

To perform some kind of element-by-element transformation of our data, we can use the data.map function, like so:

d3.csv("data_as_giants.csv", function(data) {

  data.map(function(d) {
    // do something here
  } );

});

Here, the map function calls your function once for each row of the data, passing the row in as d. Whatever the function returns for a row becomes the corresponding element of a new array, so a transformation written for a single row is applied to the entire data set.

When combined with the technique below for accessing elements of a row, this can be used to iterate through data, filter it, and perform operations on it.
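As a small illustration of what map hands back, here is a sketch that runs on a few hand-typed rows instead of a loaded file. The sampleRows and totals names are assumptions for illustration; the values are copied from the table above.

```javascript
// Three rows shaped like the objects d3.csv passes to its callback
// (all values are strings, as they would be after parsing a CSV).
var sampleRows = [
  { Team: "As", Inning: "1", Total: "36" },
  { Team: "As", Inning: "2", Total: "28" },
  { Team: "Giants", Inning: "1", Total: "38" }
];

// map builds a new array from whatever the callback returns;
// the original rows are left untouched.
var totals = sampleRows.map(function(d) {
  return +d.Total;
});

console.log(totals);
```

If the callback returns nothing for some row, that position in the new array holds undefined.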

Accessing Data Element-By-Element

From within the map function, you can access elements of your data using the keys:

d3.csv("data_as_giants.csv", function(data) {

  data.map(function(d) {
    if (d['Team']=='As') {
        console.log( d['AvgPer9inn'] );
    }
  });

});

This can be used, not just for parsing data or using it to make new data, but for displaying data as well:

d3.csv("data_as_giants.csv", function(data) {

  data.map(function(d) {
    if (d['Team']=='As') {
        console.log( 'In inning number '+d['Inning']+' the As allowed '+d['Avg']+' average runs, which evens out to '+d['AvgPer9inn']+' runs per nine innings.' );
    }
  });

});

Once you combine these techniques with jQuery's ability to manipulate page elements, instead of using something boring like console.log, you'll have a way of pulling data into an HTML document and creating a data-driven document (the three D's of D3).

Manipulating Multidimensional Data

Maybe I'm interested in preparing some plots and need to generate (x,y) coordinate pairs from various data. I can use the map function on the full data set to reduce the number of variables for convenience.

I'll show an example of how the data above can be reduced to two interesting dimensions: inning number, and average number of runs scored against A's and Giants in that inning, showing when the A's and Giants tend to give up runs.

This information is contained in the data above, which shows histogram data of the number of times the A's and Giants have played a particular inning number (the "#" column) versus the number of runs they have given up in a particular inning (the "Total" column).

The ratio of these two numbers is the average number of runs each team gave up in that inning ("Avg" column).

Plotting the "Inning" column vs the "Avg" column would give us what we're looking for, so let's reduce the data set using the map function.

First, print the data:

d3.csv("data_as_giants.csv", function(data) {

  // Print a keyless (x,y) pair for (Inning,Avg)
  data.map(function(d) {
        console.log( [ +d['Inning'] , +d['Avg'] ] );
    });

});

This shows there are two rows (one for each team) where Inning is the string "Total", which the plus sign converts to NaN. We'll have to filter those rows out, and then return the two values we were getting (Inning and Avg).

d3.csv("data_as_giants.csv", function(data) {

  // Return a keyless (x,y) pair for (Inning,Avg)
  var xy_pairs = data.map(function(d) {
        if(d['Inning']!='Total') {
            return [ +d['Inning'] , +d['Avg'] ];
        }
    });
  // The totals rows returned nothing, leaving undefined entries; drop them
  xy_pairs = xy_pairs.filter(function(d) { return d !== undefined; });

  xy_pairs.map(function(d) { console.log( 'Inning ' + d[0] + ' - Avg ' + d[1]); });

});

Alternatively, we can create a new array that works the same way as the old one, with elements that we can access using the keys "Inning" and "Avg":

  // Return a key-value (x,y) pair for (Inning,Avg)
  var xy_pairs = data.map(function(d) {
        if(d['Inning']!='Total') {
            var result = {
                Inning : +d['Inning'],
                Avg : +d['Avg']
            };
            return result;
        }
    });
  // The totals rows returned nothing, leaving undefined entries; drop them
  xy_pairs = xy_pairs.filter(function(d) { return d !== undefined; });

  xy_pairs.map(function(d) { console.log( 'Inning ' + d['Inning'] + ' - Avg ' + d['Avg']); });

which results in:

Inning 1 - Avg 0.42
Inning 2 - Avg 0.33
Inning 3 - Avg 0.36
Inning 4 - Avg 0.45
Inning 5 - Avg 0.42
Inning 6 - Avg 0.5
Inning 7 - Avg 0.28
Inning 8 - Avg 0.29
Inning 9 - Avg 0.42
Inning 10 - Avg 0.2
Inning 11 - Avg 0.46
Inning 12 - Avg 0.21
Inning 13 - Avg 0
Inning 14 - Avg 0.6
Inning 1 - Avg 0.44
Inning 2 - Avg 0.3
Inning 3 - Avg 0.52
Inning 4 - Avg 0.44
Inning 5 - Avg 0.56
Inning 6 - Avg 0.36
Inning 7 - Avg 0.3
Inning 8 - Avg 0.34
Inning 9 - Avg 0.41
Inning 10 - Avg 0.13
Inning 11 - Avg 1.75
Inning 12 - Avg 0
Inning 13 - Avg 0
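Another option, which I did not use above, is to drop the totals rows with filter before calling map, so that map never produces an empty entry in the first place. A minimal sketch on three hand-typed rows (the inningRows and pairs names are assumptions; the values come from the As rows above):

```javascript
// Stand-in rows: two real innings plus the totals row.
var inningRows = [
  { Team: "As", Inning: "13", Avg: "0.00" },
  { Team: "As", Inning: "14", Avg: "0.60" },
  { Team: "As", Inning: "Total", Avg: "0.38" }
];

// filter keeps only the rows we want; map then converts each
// remaining row into a numeric [Inning, Avg] pair.
var pairs = inningRows
  .filter(function(d) { return d.Inning !== "Total"; })
  .map(function(d) { return [ +d.Inning, +d.Avg ]; });

console.log(pairs);
```

Filtering first keeps the selection step and the conversion step separate, which I find easier to read than returning from inside an if.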

Loading Multiple CSV Files

To load an arbitrary number of CSV files:

var filesArray = ["myrandedata.csv","myrandndata.csv","myrandudata.csv"];

var remaining = filesArray.length;

// mydata is an object keyed by file name
var mydata = {};

// from https://groups.google.com/forum/#!msg/d3-js/3Y9VHkOOdCM/YnmOPopWUxQJ
//
// this will populate the object mydata (number of keys = number of files)
// so that mydata[somefile] = all the data in somefile
filesArray.forEach( function(f) {
    d3.csv(f, function(data) {
        mydata[f] = data;
        // each callback decrements the counter; the last one to finish calls doSomething
        if (!--remaining) doSomething();
    });
});

function doSomething() {
    filesArray.forEach( function(f) {
        console.log( mydata[f] );
    });
}

Things I learned about D3 while modifying Parallel example

  • Role of maps
  • Nesting functions
  • Scope of Javascript
  • How to use for loops instead of functions to avoid scope issues
  • How to load multiple files using a counter to avoid scope/asynchronous issues
  • The whole function notation
  • Loading data as CSV (associative array) or as text (plain multidimensional array)
  • Console
  • Accessing arrays using notation data[0] versus data['x1'] versus data.x1
  • Notation +p[d]


Name of an Object's Type

In case you have absolutely no idea what type an object is, I recommend printing myObjectInstance.constructor.name

(Hat tip to Stack Overflow)
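A quick sketch of what this prints for a few common values (the samples array is just an illustration):

```javascript
// A mix of values whose types we pretend not to know.
var samples = [ [1, 2, 3], { cat: "A" }, "text", 3.14, new Date(0), function() {} ];

// constructor.name gives the name of the function that built each value.
var typeNames = samples.map(function(s) { return s.constructor.name; });

console.log(typeNames);
```

This does not work for null or undefined, which have no constructor property, so check for those first.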

Plus-Sign Notation

Some notation I found confusing, but was able to eventually clear up:

+p[d]

What the plus sign prefix means is, "Convert this string to a number" (hat tip to Stack Overflow again).

So, if you have a column of data that contains an index for each row, and you want to convert that index from a string to a number, you can use the plus sign in conjunction with map:

num_data = string_data.map(function(d) {
  return +d['Index'];
});

This will return an array that contains one element for each row, and that element is the number corresponding to the row index.
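Here is a small sketch of what the plus sign does to a few different strings, including ones that are not numbers (the inputs array is an assumption for illustration):

```javascript
// Strings as they might come out of a CSV file.
var inputs = ["3.77", "14", "Total", ""];

// The unary plus converts each string to a number.
var converted = inputs.map(function(s) { return +s; });

// Non-numeric text becomes NaN; note that an empty string becomes 0.
var totalIsNaN = isNaN(converted[2]);

console.log(converted, totalIsNaN);
```

The NaN case is worth checking for whenever a column mixes numbers with labels, and the empty-string case means a blank cell silently turns into zero.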

Notation for Accessing Arrays

How you access your array depends on how you load your external data. (See section #Loading External Data).

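For a row loaded from a CSV with headers, the following two are equivalent:

data['x1']

data.x1

although sometimes the names get too complicated for the dot form, which only works when the column name is a valid Javascript identifier. Headers such as "#" or "≥5" need the bracket form.

The numeric form

data[1]

is not equivalent to either of them. It only works by position on a plain array (for example, a row loaded without headers). On a key-value row it does not mean "the second column": it looks up a key named "1", which could be any column of your data, or no column at all.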

Javascript Doesn't Respect Array Order

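This is important for accessing rows. If you pass in a CSV with headers, or initialize a key-value object, Javascript does not let you look up values by column position, and it does not list the keys in column order either: integer-like keys such as "0" or "1" are listed first, in ascending order, followed by the remaining keys in the order they were added. (This makes numerical indexing of key-value rows pretty much impossible...)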

c.f. Stack Overflow
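To make the difference between the notations concrete, here is a sketch using a hand-built row with a few of the headers from the baseball data (the statRow name and the choice of columns are assumptions for illustration):

```javascript
// A key-value row with some awkward header names.
var statRow = { Team: "As", "#": "86", "≥5": "0", "1": "17", Avg: "0.42" };

var byBracket = statRow["Avg"];  // bracket notation works for any key
var byDot = statRow.Avg;         // dot notation needs a valid identifier
var fivePlus = statRow["≥5"];    // statRow.≥5 would be a syntax error
var games = statRow["#"];        // same for "#"

// A number in brackets is treated as the key "1", not as "second column".
var keyOne = statRow[1];

console.log(byBracket, byDot, fivePlus, games, keyOne);
```

Here statRow[1] returns the value of the column named "1" (innings with exactly one run allowed), which is the fourth key in this object, not the second.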

Console

Workflow with console...

Loading External Data

There are two ways to load a CSV file with data.

The first way is for a CSV with headers: load it with d3.csv, as in the Loading Single CSV File section above, which gives one associative array per row (and allows you to access data using the keys).

The alternative method is for a CSV with no headers: load it as plain text and parse it into a "plain" multidimensional array:

// To parse csv as an array of arrays:
d3.text("myrandedata.csv", "text/csv", function(text) {
    var rows = d3.csv.parseRows(text);

    // Do stuff with rows here... 

    //console.log(rows.slice(1,5)); 

    var i1 = 6;
    var i2 = 12;
    rows.slice(i1,i2).forEach( function(d) {
        console.log("d1 = " + d[1] );
    });
    
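    // Use map to turn a multidimensional array into a single dimensional array
    // (i.e., to grab a single column). The plus sign converts each string to a
    // number, so that d3.max compares the values numerically.
    var thiscol = rows.slice(i1,i2).map( function(d) { return +d[1]; });
    console.log( d3.max( thiscol ) );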

});
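One thing to watch with rows parsed from text is that every cell is a string, and strings compare alphabetically. The sketch below shows the effect without D3: largest is a hypothetical helper of mine that picks the maximum using the > operator, and textRows is made-up data.

```javascript
// Rows as an array of arrays, every cell still a string.
var textRows = [ ["a", "9"], ["b", "10"], ["c", "2"] ];

// Hypothetical helper: pick the largest value using the > operator.
function largest(values) {
  return values.reduce(function(a, b) { return b > a ? b : a; });
}

var asStrings = textRows.map(function(d) { return d[1]; });
var asNumbers = textRows.map(function(d) { return +d[1]; });

var maxOfStrings = largest(asStrings);  // compared alphabetically
var maxOfNumbers = largest(asNumbers);  // compared numerically

console.log(maxOfStrings, maxOfNumbers);
```

Compared as strings, "9" beats "10" because "9" sorts after "1", so converting the column with the plus sign before looking for a maximum avoids a surprising answer.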

D3 Function Notation

One of the most confusing things about looking at D3 code for the first time is that there are nested functions everywhere.

Functions are a way of processing data inline/on the fly, without breaking up all your code.

Nesting Functions

Simple Nesting Functions Example

var abc=[1,2,3];
var def=[11,12,13];
var ghi=[21,22,23];

var dummy = 1;
abc.forEach( function(x) {
    def.forEach( function(y) {
        ghi.forEach( function(z) {
            dummy = x*y*z;
            console.log( x + "*" + y + "*" + z + " = " + dummy );
        });
    });
});

results in:

1*11*21 = 231 
1*11*22 = 242 
1*11*23 = 253 
1*12*21 = 252 
1*12*22 = 264 
1*12*23 = 276 
1*13*21 = 273 
1*13*22 = 286 
1*13*23 = 299 
2*11*21 = 462 
2*11*22 = 484 
2*11*23 = 506 
2*12*21 = 504 
2*12*22 = 528 
2*12*23 = 552 
2*13*21 = 546 
2*13*22 = 572 
2*13*23 = 598 
3*11*21 = 693 
3*11*22 = 726 
3*11*23 = 759 
3*12*21 = 756 
3*12*22 = 792 
3*12*23 = 828 
3*13*21 = 819 
3*13*22 = 858 
3*13*23 = 897

Complex (Asynchronous) Nesting Functions Example

Too much nesting of functions can lead you into trouble, however, if you're depending on the nested functions to modify global variables.

This happens when the callbacks are asynchronous. The forEach callbacks in the above example run immediately and in order, so there is no problem. But a callback passed to a loader such as d3.csv runs later, once the file has arrived, so code placed after the call can run before the callback has modified the global variable.

Working Around Asynchronous Functions

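There are multiple ways to get around this. One way is to use for loops.

Another way is to keep a counter of the requests still outstanding, and have the last callback to finish trigger a call to a function (as in the Loading Multiple CSV Files section above).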

via https://groups.google.com/forum/#!msg/d3-js/3Y9VHkOOdCM/YnmOPopWUxQJ
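The counter idea can be tried without any files. In the sketch below, fakeLoad is a hypothetical stand-in for d3.csv that only remembers its callback, so the "responses" can be delivered later and out of order. All the names here are assumptions for illustration.

```javascript
// Hypothetical stand-in for d3.csv: it stores the callback instead of
// calling it, so we control when each "file" arrives.
var pendingDeliveries = [];
function fakeLoad(name, callback) {
  pendingDeliveries.push(function() { callback(["rows of " + name]); });
}

var fileNames = ["a.csv", "b.csv", "c.csv"];
var loaded = {};
var stillWaiting = fileNames.length;
var finished = false;

fileNames.forEach(function(f) {
  fakeLoad(f, function(data) {
    loaded[f] = data;
    // The last callback to run brings the counter to zero.
    if (!--stillWaiting) { finished = true; }
  });
});

// All requests have been issued, but nothing has arrived yet.
var finishedBeforeDelivery = finished;

// Deliver the responses in reverse order to mimic unpredictable timing.
pendingDeliveries.reverse().forEach(function(deliver) { deliver(); });

console.log(finishedBeforeDelivery, finished, Object.keys(loaded));
```

The order of arrival does not matter: the final step fires exactly once, after every file has been stored.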

Role of Maps

Array maps are really handy, because they create a mapping of an existing array to a new one. This can be used to transform an array (for example, you could take an array of numbers and transform it into an array of squares of those numbers) or to expand/reduce an array (for example, you could take a two-dimensional array and find the sum of each element over a particular dimension).

Expansion Example: Single Dimension to Multiple Dimension Array

var abc=[1,2,3,4,5];
var def = abc.map( function(d) { return [ +d, +d + 4 ] } );
def.forEach( function(p) {
    console.log(p);
});

Results in:

[1, 5]
[2, 6]
[3, 7]
[4, 8]
[5, 9]

Contraction Example: Multiple Dimension to Single Dimension Array

var abc=[[1,1],[2,4],[3,9],[4,16],[5,25],[6,36]];                                        
def = abc.map( function(d) { return d[1] } );                                            
console.log(def);

Results in:

[1, 4, 9, 16, 25, 36] 

Transformation Example: Values to Squared Values

var abc=[1,2,3,4,5];
def = abc.map( function(d) { return Math.pow(+d,2) } );
console.log(def);

results in:

[1, 4, 9, 16, 25]

Related: d3.zip

A related function, to create arrays of arrays, is d3.zip:

https://github.com/mbostock/d3/wiki/Arrays#wiki-d3_zip
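To show the idea without loading D3, here is a plain Javascript sketch of what a zip does. zipSketch is my own hypothetical helper, not D3's implementation; it pairs up the i-th elements of each input and stops at the shortest input.

```javascript
// Hypothetical helper mimicking the behaviour of a zip function.
function zipSketch() {
  var arrays = Array.prototype.slice.call(arguments);
  if (arrays.length === 0) { return []; }
  // Stop at the length of the shortest input array.
  var n = Math.min.apply(null, arrays.map(function(a) { return a.length; }));
  var out = [];
  for (var i = 0; i < n; i++) {
    out.push(arrays.map(function(a) { return a[i]; }));
  }
  return out;
}

var inningNumbers = [1, 2, 3];
var inningAvgs = [0.42, 0.33, 0.36];
var zipped = zipSketch(inningNumbers, inningAvgs);

console.log(zipped);
```

This is the reverse of the contraction example: two single-column arrays go in, and one array of (x,y) pairs comes out.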

A Return to D3

I've been returning to D3 after a long hiatus.

D3 and maps: D3 Map

D3 and Leaflet: D3 Leaflet Map

Things I Learned from D3 Leaflet Example

This is a compilation of learnings from my D3 Leaflet Map example, which I put together as part of "A Shrubbery" (http://charlesreid1.github.io/a-shrubbery), my Github mapping project.

How to Hand-Code Data (or, Passing JS Variables to D3)

When you import JSON or CSV data from a file, it looks something like this:

d3.csv("data.csv", function(error, data) {
    ....
});

What d3.csv() does is load the file and then hand the parsed data to a callback function.

But what if we are constructing our data "by hand"? What if we don't want to grab data from a file, and don't need a callback function?

In that case, we have to print out the data that d3.csv() passes to the callback, and see how it packages up the data. The pattern is something like this:

var mydat = [{'cat' : 'A', 'dat' : 0.40 },
             {'cat' : 'B', 'dat' : 0.80 },
             {'cat' : 'C', 'dat' : 0.15 },
             {'cat' : 'D', 'dat' : 0.16 },
             {'cat' : 'E', 'dat' : 0.23 }];

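This is an array of dictionaries (Javascript objects), with each dictionary sharing the same keys. This then allows the properties "cat" and "dat" to be accessed by D3 to construct charts and things. One difference: values loaded from a CSV file arrive as strings, whereas the hand-coded values here are already numbers.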

Whereas with a d3.csv() or d3.json() call, we would have the callback function and then do stuff in the callback function:

d3.csv("data.csv", function(error, data) {

    var g = svg.selectAll(".arc")
        .data(pie(data))
        .enter().append("g")
        .attr("class","arc");

...

Now, with our hand-coded data, we just cut to the chase, and pass our data in to the .data() call directly:

var mydat = [{'cat' : 'A', 'dat' : 0.40 },
             {'cat' : 'B', 'dat' : 0.80 },
             {'cat' : 'C', 'dat' : 0.15 },
             {'cat' : 'D', 'dat' : 0.16 },
             {'cat' : 'E', 'dat' : 0.23 }];

var g = svg.selectAll(".arc")
    .data(pie(mydat))
    .enter().append("g")
    .attr("class","arc");

git.charlesreid1.com

D3 organization

Link to D3 organization on git.charlesreid1.com: https://git.charlesreid1.com/d3

D3 Calendar Visualization

Several D3 calendar visualizations.

Notes on wiki here: D3/Calendar

D3 calendar code: https://git.charlesreid1.com/d3/charlesreid1-calendar

Git commits calendar: https://git.charlesreid1.com/d3/charlesreid1-calendar/src/master/git

MediaWiki edits calendar: https://git.charlesreid1.com/d3/charlesreid1-calendar/src/master/wiki

D3 Map Visualizations Project

Combining Leaflet with D3, and using wrappers like C3, to make map visualizations.

Notes on wiki here: Javascript/Maps

D3 maps project repo: https://git.charlesreid1.com/d3/maps
