From charlesreid1

Loading Data

Loading Single CSV File

A simple example of loading and manipulating CSV data in D3 is given here.

The Data

Let's begin with some data. This is data about runs allowed by the Giants and Athletics (http://www.baseball-reference.com/play-index/inning_summary.cgi?year=2014&team_id=OAK&submitter=1).

Team,Year,Inning,#,0,Any,1,2,3,4,≥5,Most,Total,Avg,AvgPer9inn
As,2014,1,86,61,25,17,6,1,1,0,4,36,0.42,3.77
As,2014,2,86,68,18,10,6,2,0,0,3,28,0.33,2.93
As,2014,3,86,64,22,18,1,1,2,0,4,31,0.36,3.24
As,2014,4,86,65,21,13,3,2,2,1,6,39,0.45,4.08
As,2014,5,86,63,23,14,5,4,0,0,3,36,0.42,3.77
As,2014,6,86,63,23,13,6,1,1,2,6,43,0.50,4.50
As,2014,7,86,72,14,8,3,2,1,0,4,24,0.28,2.51
As,2014,8,86,70,16,12,1,1,2,0,4,25,0.29,2.62
As,2014,9,72,58,14,4,6,2,2,0,4,30,0.42,3.79
As,2014,10,15,13,2,1,1,0,0,0,2,3,0.20,1.80
As,2014,11,9,8,1,0,0,0,1,0,4,4,0.46,4.15
As,2014,12,5,4,1,1,0,0,0,0,1,1,0.21,1.93
As,2014,13,2,2,0,0,0,0,0,0,0,0,0.00,0.00
As,2014,14,2,1,1,1,0,0,0,0,1,1,0.60,5.40
As,2014,Total,793,612,181,112,38,16,12,3,6,301,0.38,3.42
Giants,2014,1,87,67,20,8,8,2,2,0,4,38,0.44,3.93
Giants,2014,2,87,68,19,13,5,1,0,0,3,26,0.30,2.69
Giants,2014,3,87,60,27,11,15,0,1,0,4,45,0.52,4.66
Giants,2014,4,87,67,20,8,7,4,1,0,4,38,0.44,3.93
Giants,2014,5,87,61,26,14,5,4,2,1,5,49,0.56,5.07
Giants,2014,6,87,67,20,11,7,2,0,0,3,31,0.36,3.22
Giants,2014,7,86,68,18,13,4,0,0,1,5,26,0.30,2.72
Giants,2014,8,86,67,19,14,2,1,2,0,4,29,0.34,3.03
Giants,2014,9,72,55,17,9,6,1,0,1,5,29,0.41,3.66
Giants,2014,10,8,7,1,1,0,0,0,0,1,1,0.13,1.13
Giants,2014,11,4,2,2,0,1,0,0,1,5,7,1.75,15.75
Giants,2014,12,2,2,0,0,0,0,0,0,0,0,0.00,0.00
Giants,2014,13,1,1,0,0,0,0,0,0,0,0,0.00,0.00
Giants,2014,Total,781,592,189,102,60,15,8,4,5,319,0.41,3.68

Loading Into D3

We can load our data into a D3 document like so:

<html>
  <head>
    <title>Bar Chart</title>
    <script type="text/javascript" src="/d3.v2.js"></script>
  </head>
  <body>
    <script type="text/javascript">

d3.csv("data_as_giants.csv", function(data) {
  // here's where we do stuff with the data
});

    </script>
  </body>
</html>

Grabbing Data Header

Let's say we want to get a Javascript array containing each of our CSV's headers, like this:

["0", "1", "2", "3", "4", "Team", "Year", "Inning", "#", "Any", "≥5", "Most", "Total", "Avg", "AvgPer9inn"] 

We can do this by adding some code right after our d3.csv call:

d3.csv("data_as_giants.csv", function(data) {

  var headerRow = Object.keys(data[0]);
  console.log(headerRow);

});

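First, data[0] grabs only the first row, which D3 has parsed into an object keyed by column name. The Object.keys() method extracts the keys of that row, and we assign them to the headerRow variable. Finally, we print this value to the console, where we see what we are expecting (the integer-like keys "0" through "4" come first because JavaScript lists integer-like property names ahead of the other keys):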

["0", "1", "2", "3", "4", "Team", "Year", "Inning", "#", "Any", "≥5", "Most", "Total", "Avg", "AvgPer9inn"] 
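The ordering in that output (numbers first) can look surprising, since the CSV header line starts with Team. Here is a minimal sketch of why, using a hand-written object in place of the parsed row. The firstRow variable is only for illustration; its values are copied from the first line of the data above.

```javascript
// A hypothetical first row, written in the same column order as the CSV
// header line. d3.csv would build an object like this for data[0].
var firstRow = {
  "Team": "As", "Year": "2014", "Inning": "1", "#": "86",
  "0": "61", "Any": "25", "1": "17", "2": "6", "3": "1", "4": "1",
  "≥5": "0", "Most": "4", "Total": "36", "Avg": "0.42", "AvgPer9inn": "3.77"
};

var headerRow = Object.keys(firstRow);
console.log(headerRow);
// Integer-like keys come first, in ascending order, followed by the
// remaining keys in the order they were added.
```

If you need the headers in file order, one option is to read the first line of the file as text instead of relying on key order.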

Manipulating Data Row-by-Row

To perform some kind of element-by-element transformation of our data, we can use the data.map function, like so:

d3.csv("data_as_giants.csv", function(data) {

  data.map(function(d) {
    // do something here
  } );

});

Here, map calls your function once for each row d of the data and returns a new array holding whatever the function returned for each row. The original data is left untouched unless you modify d yourself.

When combined with the technique below for accessing the fields of a row, this can be used to iterate through the data, filter it, and perform operations on it.
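As a rough sketch of what map gives back, here is a version that runs without D3 or a file. The rows array is a hypothetical stand-in for what d3.csv passes to its callback, with a few values copied from the data above.

```javascript
// Hypothetical rows, shaped like what d3.csv passes to its callback:
// one object per CSV line, with every value still a string.
var rows = [
  { Team: "As",     Inning: "1", Total: "36", AvgPer9inn: "3.77" },
  { Team: "As",     Inning: "2", Total: "28", AvgPer9inn: "2.93" },
  { Team: "Giants", Inning: "1", Total: "38", AvgPer9inn: "3.93" }
];

// map returns a NEW array; each element is whatever the function returns.
var totals = rows.map(function(d) {
  return +d.Total;   // unary plus turns the string "36" into the number 36
});

// filter keeps only the rows for which the function returns true.
var asRows = rows.filter(function(d) {
  return d.Team == "As";
});

console.log(totals);         // [36, 28, 38]
console.log(asRows.length);  // 2
console.log(rows[0].Total);  // 36, still a string (the original rows are unchanged)
```

If you only want side effects such as console.log and do not need the returned array, forEach does the same iteration without building one.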

Accessing Data Element-By-Element

From within the map function, you can access elements of your data using the keys:

d3.csv("data_as_giants.csv", function(data) {

  data.map(function(d) {
    if (d['Team']=='As') {
        console.log( d['AvgPer9inn'] );
    }
  });

});

This can be used, not just for parsing data or using it to make new data, but for displaying data as well:

d3.csv("data_as_giants.csv", function(data) {

  data.map(function(d) {
    if (d['Team']=='As') {
        console.log( 'In inning number '+d['Inning']+' the As allowed '+d['Avg']+' average runs, which evens out to '+d['AvgPer9inn']+' run…' );
    }
  });

});

Once you combine these techniques with jQuery's ability to manipulate page elements, rather than using something boring like console.log, you'll have a way of pulling data into an HTML document and creating a data-driven document (D x 3).

Loading Multiple CSV Files

To load an arbitrary number of CSV files:

var filesArray = ["myrandedata.csv","myrandndata.csv","myrandudata.csv"];

var remaining = filesArray.length;

// mydata must exist before the callbacks below assign into it
var mydata = {};

// from https://groups.google.com/forum/#!msg/d3-js/3Y9VHkOOdCM/YnmOPopWUxQJ
//
// this will populate an object mydata (number of keys = number of files)
// so that mydata[somefile] = all the data in somefile
filesArray.forEach( function(f) {
    d3.csv(f, function(data) {
        mydata[f] = data;
        // count down; once every file has arrived, move on
        if (!--remaining) doSomething();
    });
});

function doSomething() {
    filesArray.forEach( function(f) {                                                    
        console.log( mydata[f] );                                                        
    });                                                                                  
}

Things I learned about D3 while modifying Parallel example

  • Role of maps
  • Nesting functions
  • Scope of Javascript
  • How to use for loops instead of functions to avoid scope issues
  • How to load multiple files using a counter to avoid scope/asynchronous issues
  • The whole function notation
  • Loading data as CSV (associative array) or as text (plain multidimensional array)
  • Console
  • Accessing arrays using notation data[0] versus data['x1'] versus data.x1
  • Notation +p[d]


Name of an Object's Type

In case you have absolutely no idea what type an object is, I recommend using myObjectInstance.constructor.name, which gives the name of the constructor function that created it.

(Hat tip to Stack Overflow: http://stackoverflow.com/questions/332422/how-do-i-get-the-name-of-an-objects-type-in-javascript)
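A quick sketch of the constructor.name trick on a handful of values. The samples array is made up for illustration.

```javascript
// A few values whose type we want to inspect.
var samples = [ [1, 2, 3], { Team: "As" }, "3.77", 3.77, new Date(0) ];

var typeNames = samples.map(function(s) {
  return s.constructor.name;
});

console.log(typeNames);
// ["Array", "Object", "String", "Number", "Date"]
```

Note that null and undefined have no constructor property, so check for those before using this on a value that might be empty.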

Miscellaneous Confusing Notation

Some notation I found confusing, but was able to eventually clear up:

+p[d]

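Here p[d] looks up the property named by d on the object p, and the leading + is the unary plus operator, which converts the result to a number. This matters because values loaded from a CSV file arrive as strings.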
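A small sketch of what the + changes, assuming p is a row whose values are strings (as they are after a CSV load) and d is a column name. Both variables here are made up for illustration.

```javascript
// p is a hypothetical row; d is the name of the column we want.
var p = { Total: "36", Avg: "0.42" };
var d = "Total";

var asString = p[d];    // the string "36"
var asNumber = +p[d];   // the number 36

console.log(typeof asString, typeof asNumber);  // string number
console.log(asString + 1);   // 361 (string concatenation)
console.log(asNumber + 1);   // 37  (arithmetic)

// Text that is not a number becomes NaN rather than raising an error.
var notANumber = +"n/a";
console.log(isNaN(notANumber));  // true
```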

Notation for Accessing Arrays

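Which notation you use depends on how you load the external data (see section #Loading External Data).

Assuming x1 is the header of the column at index 1, the following three reach the same value. The first applies when each row is a plain array (loaded with d3.csv.parseRows); the last two apply when each row is an object keyed by column name (loaded with d3.csv):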

data[1]

data['x1']

data.x1
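To make the difference concrete, here is a minimal sketch with the same CSV line in both shapes. The column names x0 and x1 and the variable names are made up for illustration.

```javascript
// The same hypothetical CSV line, in the two shapes described above.
var rowAsArray  = ["0.42", "3.77"];              // no header: plain array
var rowAsObject = { x0: "0.42", x1: "3.77" };    // with header: keyed object

console.log(rowAsArray[1]);      // 3.77 (by position)
console.log(rowAsObject['x1']);  // 3.77 (by key, bracket notation)
console.log(rowAsObject.x1);     // 3.77 (by key, dot notation)

// Bracket notation is required when the key is held in a variable
// or is not a valid identifier (for example "#" or "≥5").
var key = "x1";
console.log(rowAsObject[key]);   // 3.77
```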


Console

Workflow with console...

Loading External Data

Two ways to load a CSV file with data:

First way is for CSV with headers, load as associative array (which allows you to access data using the keys):
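The call for this first way is the same d3.csv call used in the Loading Into D3 section. To show the shape of what the callback receives without needing D3 or a file, here is a rough sketch using a hypothetical parseWithHeader helper. It is a naive stand-in, not D3's parser, and it ignores quoted fields.

```javascript
// Hypothetical helper: a naive stand-in for the header-aware parse that
// d3.csv performs. It does not handle quoted fields or embedded commas.
function parseWithHeader(text) {
  var lines = text.trim().split("\n");
  var header = lines[0].split(",");
  return lines.slice(1).map(function(line) {
    var cells = line.split(",");
    var row = {};
    header.forEach(function(name, i) {
      row[name] = cells[i];   // values stay strings
    });
    return row;
  });
}

var text = "Team,Inning,Total\nAs,1,36\nGiants,1,38\n";
var data = parseWithHeader(text);

console.log(data[0].Team);      // As
console.log(data[1]['Total']);  // 38
```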

Alternative method is for CSV with no headers, load as plain text into a "plain" multidimensional array:

// To parse csv as an array of arrays:
d3.text("myrandedata.csv", "text/csv", function(text) {
    var rows = d3.csv.parseRows(text);

    // Do stuff with rows here... 

    //console.log(rows.slice(1,5)); 

    var i1 = 6;
    var i2 = 12;
    rows.slice(i1,i2).forEach( function(d) {
        console.log("d1 = " + d[1] );
    });
    
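    // Use map to turn a multidimensional array into a single dimensional array
    // (i.e., to grab a single column); the + converts each string to a number
    // so that d3.max compares numerically rather than alphabetically
    var thiscol = rows.slice(i1,i2).map( function(d) { return +d[1]; });
    console.log( d3.max( thiscol ) );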

});

D3 Function Notation

One of the most confusing things about looking at D3 code for the first time is that there are nested functions everywhere.

Functions are a way of processing data inline, on the fly, without breaking up all your code.

Nesting Functions

Simple Nesting Functions Example

var abc=[1,2,3];
var def=[11,12,13];
var ghi=[21,22,23];

dummy = 1;
abc.forEach( function(x) {
    def.forEach( function(y) {
        ghi.forEach( function(z) {
            dummy = x*y*z;
            console.log( x + "*" + y + "*" + z + " = " + dummy );
        });
    });
});

results in:

1*11*21 = 231 
1*11*22 = 242 
1*11*23 = 253 
1*12*21 = 252 
1*12*22 = 264 
1*12*23 = 276 
1*13*21 = 273 
1*13*22 = 286 
1*13*23 = 299 
2*11*21 = 462 
2*11*22 = 484 
2*11*23 = 506 
2*12*21 = 504 
2*12*22 = 528 
2*12*23 = 552 
2*13*21 = 546 
2*13*22 = 572 
2*13*23 = 598 
3*11*21 = 693 
3*11*22 = 726 
3*11*23 = 759 
3*12*21 = 756 
3*12*22 = 792 
3*12*23 = 828 
3*13*21 = 819 
3*13*22 = 858 
3*13*23 = 897

Complex (Asynchronous) Nesting Functions Example

Too much nesting of functions can lead you into trouble if you're trying to modify global variables, however, as each nested function introduces its own scope.

The forEach callbacks in the example above run synchronously, one after another, so by the time the loops finish every assignment to dummy has happened. Callbacks passed to loaders such as d3.csv are different: they are asynchronous, meaning they run later, once the file has arrived, and the code that follows the d3.csv call runs first. If that code depends on a global variable that the callback is supposed to set, it will see the old value.
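The pitfall can be sketched without D3 or any files. In this minimal illustration, fakeLoad and deliverAll are hypothetical stand-ins for an asynchronous loader and for the moment the file arrives; they are not part of D3.

```javascript
// Hypothetical stand-in for an asynchronous loader such as d3.csv:
// it does not call back right away, it queues the callback for later.
var pending = [];
function fakeLoad(name, callback) {
  pending.push(function() { callback(["contents of " + name]); });
}
// Simulates the moment when the files "arrive".
function deliverAll() {
  while (pending.length) pending.shift()();
}

var loaded = null;   // global the callback is supposed to fill in

fakeLoad("myrandedata.csv", function(data) {
  loaded = data;
});

// This line runs BEFORE the callback, so the global is still empty.
var seenTooEarly = loaded;
console.log(seenTooEarly);   // null

deliverAll();

// Only now has the callback run.
console.log(loaded);         // ["contents of myrandedata.csv"]
```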


Working Around Asynchronous Functions

There are multiple ways to get around this. One way is to use for loops.

Another way is to keep a counter of outstanding loads, and have whichever callback finishes last trigger a call to the function that needs all of the data.

via https://groups.google.com/forum/#!msg/d3-js/3Y9VHkOOdCM/YnmOPopWUxQJ
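Here is a minimal sketch of the counter approach that runs without D3. The fakeCsv and deliverInReverse functions are hypothetical stand-ins: they queue the callbacks and then fire them in reverse order, to mimic files arriving out of order.

```javascript
// Hypothetical stand-in for d3.csv: callbacks are queued and delivered
// later, here in reverse order to mimic files arriving out of order.
var queue = [];
function fakeCsv(name, callback) {
  queue.push(function() { callback("rows of " + name); });
}
function deliverInReverse() {
  while (queue.length) queue.pop()();
}

var filesArray = ["myrandedata.csv", "myrandndata.csv", "myrandudata.csv"];
var remaining = filesArray.length;
var mydata = {};
var finishedWith = null;   // records what doSomething saw

function doSomething() {
  finishedWith = Object.keys(mydata).length;
  console.log("all " + finishedWith + " files loaded");
}

filesArray.forEach(function(f) {
  fakeCsv(f, function(data) {
    mydata[f] = data;
    // Decrement the counter; only the last callback sees it reach zero.
    if (!--remaining) doSomething();
  });
});

console.log(finishedWith);   // null: nothing has arrived yet
deliverInReverse();
console.log(finishedWith);   // 3
```

The order in which the files arrive does not matter: doSomething runs exactly once, after the last callback.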

Role of Maps

Array maps are really handy, because they create a mapping of an existing array to a new one. This can be used to transform an array (for example, you could take an array of numbers and transform it into an array of squares of those numbers) or to expand/reduce an array (for example, you could take a two-dimensional array and find the sum of each element over a particular dimension).

Expansion Example: Single Dimension to Multiple Dimension Array

var abc=[1,2,3,4,5];
def = abc.map( function(d) { return [ +d, +d + 4 ] } ); 
def.forEach( function(p) {
    console.log(p);
});

Results in:

[1, 5]
[2, 6]
[3, 7]
[4, 8]
[5, 9]

Contraction Example: Multiple Dimension to Single Dimension Array

var abc=[[1,1],[2,4],[3,9],[4,16],[5,25],[6,36]];                                        
def = abc.map( function(d) { return d[1] } );                                            
console.log(def);

Results in:

[1, 4, 9, 16, 25, 36] 

Transformation Example: Values to Squared Values

var abc=[1,2,3,4,5];
def = abc.map( function(d) { return Math.pow(+d,2) } );
console.log(def);

results in:

[1, 4, 9, 16, 25]
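The introduction to this section also mentions reducing an array by summing over one dimension. A minimal sketch of that, combining map with the standard reduce method (the grid values are made up):

```javascript
// Made-up two-dimensional array: each inner array is one row.
var grid = [[1, 2, 3], [4, 5, 6], [7, 8, 9]];

// Sum across each row: one number per row.
var rowSums = grid.map(function(row) {
  return row.reduce(function(total, value) { return total + value; }, 0);
});

// Sum down each column: one number per column.
var colSums = grid[0].map(function(_, j) {
  return grid.reduce(function(total, row) { return total + row[j]; }, 0);
});

console.log(rowSums);  // [6, 15, 24]
console.log(colSums);  // [12, 15, 18]
```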

Related: d3.zip

A related function, to create arrays of arrays, is d3.zip:

https://github.com/mbostock/d3/wiki/Arrays#wiki-d3_zip
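To show how zipping relates to map, here is a rough sketch that pairs up two arrays in plain JavaScript. The zipTwo helper is a hypothetical stand-in, limited to two arrays of equal length; it is not D3's implementation.

```javascript
// Hypothetical stand-in for d3.zip, limited to two arrays of equal length.
function zipTwo(a, b) {
  return a.map(function(x, i) { return [x, b[i]]; });
}

var xs = [1, 2, 3];
var squares = xs.map(function(x) { return x * x; });

var pairs = zipTwo(xs, squares);
console.log(pairs);  // [[1, 1], [2, 4], [3, 9]]

// With D3 loaded, d3.zip(xs, squares) should give the same pairs.
```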