overarching goal

2018/Data Project

dashboards

current status: resolved to use mongodb

next step goals:

  • run mongodb query in javascript (query shape sketched after this list)
  • collect more data
  • visualize data for single chart/single time series with D3
  • visualize more data using Grafana
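
The query will eventually be issued from JavaScript (for the D3/Grafana side), but the query shape is the same in any driver. A minimal sketch of a time-series fetch via pymongo, with database, collection, and field names as hypothetical placeholders:

  # Sketch: pull one time series out of MongoDB.
  # Database, collection, and field names are hypothetical placeholders.
  from pymongo import MongoClient

  coll = MongoClient("mongodb://localhost:27017/")["metrics"]["tsp_runs"]

  # (timestamp, walltime) pairs for one problem size, sorted by time
  cursor = coll.find(
      {"n_cities": 12},
      {"_id": 0, "timestamp": 1, "walltime": 1},
  ).sort("timestamp", 1)

  for doc in cursor:
      print(doc["timestamp"], doc["walltime"])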

toy problem:

  • use instrumented traveling salesman problem (or Rubik's Cube code, or some other Project Euler code)
  • proof of concept use of monitoring
  • greenfield deployment of netdata

goal:

  • understand/monitor large complex systems
  • minimize time to set up database, add metrics, visualize, gain insight, repeat

toy problem:

  • instrumented TSP or other

toy problem goal

instrument a code, instrument a node, combine collected data about both

  • pick a computationally intensive problem
  • pick a node platform
  • collect the data - mongodb

instrument a TSP problem code, show a proof of concept of what we want to be able to record/measure/visualize
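
As a rough sketch of what "instrumented" could look like here: wrap the solver, time each run, and push a metrics document into mongodb. The solver call and all database/collection/field names below are hypothetical placeholders:

  # Sketch: record per-run metrics from an instrumented solver in MongoDB.
  # The solver and all database/collection/field names are hypothetical.
  import socket
  import time
  from pymongo import MongoClient

  def solve_tsp(n_cities):
      """Stand-in for the actual TSP solver being instrumented."""
      time.sleep(0.1)
      return {"tour_length": 42.0, "nodes_explored": 1000}

  coll = MongoClient("mongodb://localhost:27017/")["metrics"]["tsp_runs"]

  for n in (8, 10, 12):
      start = time.time()
      result = solve_tsp(n)
      coll.insert_one({
          "timestamp": start,
          "hostname": socket.gethostname(),  # ties code metrics to node metrics
          "n_cities": n,
          "walltime": time.time() - start,
          "tour_length": result["tour_length"],
          "nodes_explored": result["nodes_explored"],
      })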

focus on greenfield deployments - install netdata, collect stats, visualize them

maintenance and monitoring scripts

dotfiles/debian/scripts

rojo wiki script:

  • set up a script to pull the latest wiki edit data
  • use subprocess
  • pull/update the theme
  • pull/update charlesreid1.com
  • run pelican content to rebuild the site
  • copy the output to htdocs (use rsync)
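
A rough sketch of that script, assuming hypothetical repo and htdocs paths:

  # Sketch: pull the latest content, rebuild with pelican, rsync into htdocs.
  # All paths are hypothetical placeholders; pulling the wiki edit data itself
  # would be an additional step not shown here.
  import subprocess

  THEME_DIR = "/path/to/theme"
  SITE_DIR = "/path/to/charlesreid1.com"
  HTDOCS = "/path/to/htdocs/"

  def run(cmd, cwd=None):
      print("+", " ".join(cmd))
      subprocess.run(cmd, cwd=cwd, check=True)

  run(["git", "pull"], cwd=THEME_DIR)        # pull/update theme
  run(["git", "pull"], cwd=SITE_DIR)         # pull/update charlesreid1.com
  run(["pelican", "content"], cwd=SITE_DIR)  # rebuild the pelican site
  run(["rsync", "-avz", "--delete", SITE_DIR + "/output/", HTDOCS])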

jupiter:

  • set up script to regenerate wiki edit data
  • subprocess
  • push updates to charlesreid1.com

rojo git script:

  • generate latest gitea dump
  • best to parse the data from git logs on rojo itself, not from mongodb (sketched after this list)
  • rewrite the gitea binary page so that it pulls an external CSV (e.g., from where it already lives on charlesreid1.com)
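
A sketch of the log-parsing step, with the repo path and CSV columns as hypothetical choices:

  # Sketch: turn git log output on rojo into a CSV the gitea page can pull.
  # Repo path and column choices are hypothetical.
  import csv
  import subprocess

  REPO = "/path/to/some/repo"

  log = subprocess.run(
      ["git", "-C", REPO, "log", "--date=short",
       "--pretty=format:%H|%an|%ad|%s"],
      capture_output=True, text=True, check=True,
  ).stdout

  with open("commits.csv", "w", newline="") as f:
      writer = csv.writer(f)
      writer.writerow(["hash", "author", "date", "subject"])
      for line in log.splitlines():
          writer.writerow(line.split("|", 3))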

d3 viz

10 visualizations:

back burner:

Fix a shrubbery

Also see D3x10

blog posts

bots

charlesreid1 bot:

Bots/Charlesreid1

bot instrumentation:

  • dashboard, monitoring, statistics, status
  • bot dashboard with grafana

Bots/Instrumentation

apollo bots:

  • 14/15/16/17
  • incorporate lunar surface dialogue

apollo references:

Category:Bots

Bots/New Apollo

git

More organizing and creating git repositories:

  • data organization for repos dealing with data collection, monitoring, or storing raw data
  • created repos for raw data - census, maps, git, wiki
  • these are contained in a data-master repo that is cloned directly into charlesreid1.com htdocs dir
  • data is accessible at charlesreid1.com/data

d3 on git.charlesreid1.com:

genealogy

Genealogy photos:

  • Photos cropped/organized by family
    • 2011
    • 2017
    • Rename scheme
    • Notes - A2k11
    • Notes - R2k11
    • Notes - A2k17
    • Notes - K2k17
    • Notes - R2k17
  • Send email to fam with link on Dropbox

Writing:

  • Pauline and Bruce chapters
  • Historical research planning

back burner

networking

monitoring hardware

  • network tap/switch
  • wired router

new router:

  • website with database of embedded dev boards: board-db.org
  • The Banana Pi R2 is designed with built-in switch hardware; it's intended as a sort of Raspberry Pi for home routers. Long term, this would be a good hardware platform.
  • Banana Pi R2 Link: [1]

data streams

data streams:

  • sensor data from a physical sensor (raspberry pi, gpio, radio, sdr)
  • rojo/jupiter log data
  • network log info from bro
  • twitter/news scraping

https://community.rackspace.com/products/f/public-cloud-forum/6800/how-to-set-up-monitoring-stack-using-collectd-graphite-grafana-and-seyren-on-ubuntu-14-04

bokeh viz

bokeh: https://github.com/bokeh/bokeh

interactive dashboards:

  • glue between command-line scripts and visual graphs
  • don't worry about "live" ajax refreshing - just focus on analytics and visualization
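
In that spirit, a minimal bokeh sketch: read a time series (the CSV source here is a hypothetical placeholder) and write a static HTML page, no live refresh:

  # Sketch: static bokeh time-series plot, no live refreshing.
  # The input CSV (timestamp, walltime columns) is a hypothetical placeholder.
  import pandas as pd
  from bokeh.plotting import figure, output_file, save

  df = pd.read_csv("walltimes.csv", parse_dates=["timestamp"])

  output_file("dashboard.html", title="walltimes")
  p = figure(x_axis_type="datetime", title="solver walltime over time")
  p.line(df["timestamp"], df["walltime"], line_width=2)
  save(p)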

complete

data store (db)

the database is the central thread for everything.

get a database solution up and running over the management lan.

minimize friction and time to bring up/explore/check new collection.

Note: minimizing friction mainly just comes down to (a) getting it running, thank you very much docker, and (b) familiarity with syntax. everything else is pretty seamless.
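
In that spirit, bringing up / exploring / checking a new collection is only a few lines of pymongo (the connection string and names below are hypothetical):

  # Sketch: quick look at what's in a MongoDB instance and a collection.
  # Connection string, database, and collection names are hypothetical.
  from pprint import pprint
  from pymongo import MongoClient

  client = MongoClient("mongodb://localhost:27017/")

  print(client.list_database_names())    # what databases exist
  db = client["metrics"]
  print(db.list_collection_names())      # what collections exist

  coll = db["tsp_runs"]
  print(coll.count_documents({}))        # how many documents
  pprint(coll.find_one())                # peek at one document's shape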

completed data streams

completed data streams:

We're not using collectd anymore; we're using Netdata.

Docker/System Stats is a possible alternative to collectd

wiki visualization

visualizations:

  • calendar of edits
  • calendar of character counts of edits

calendar visualization: https://charlesreid1.com/calendar
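
The aggregation behind those calendars boils down to grouping edits by day. A sketch, assuming each edit record carries a timestamp and a character-count delta (the input format is hypothetical):

  # Sketch: daily edit counts and character totals for the calendar viz.
  # The input records (timestamp + chars_changed) are hypothetical.
  from collections import Counter
  from datetime import datetime

  edits = [
      {"timestamp": "2018-01-02T10:15:00", "chars_changed": 340},
      {"timestamp": "2018-01-02T18:40:00", "chars_changed": 120},
      {"timestamp": "2018-01-03T09:05:00", "chars_changed": 75},
  ]

  edits_per_day = Counter()
  chars_per_day = Counter()
  for e in edits:
      day = datetime.fromisoformat(e["timestamp"]).date().isoformat()
      edits_per_day[day] += 1
      chars_per_day[day] += e["chars_changed"]

  print(edits_per_day)   # calendar of edits
  print(chars_per_day)   # calendar of character counts of edits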

flags