2018/Data Project: Difference between revisions
From charlesreid1
<!--
==Stage 2: Finalized Data Collection System==
===Phase 4: Netdata and Mongo===
Netdata provides a backend API that can be called to extract data from Netdata. MongoDB listens for API calls to store data in the database. All we need is software that will poll various Netdata instances using the Netdata API and dump that data into MongoDB. This gives much more fine-grained control over the process, schema, and storage format of the data.
See [[Netdata/MongoDB/API]] for script that calls APIs of Netdata and MongoDB to construct the time series database in MongoDB.
Link to script: https://charlesreid1.com:3000/data/netdata/src/master/netdata_mongo.py
[[Image:NetdataMongodb.png|500px]]
This is a (micro)service design pattern - small, lightweight, standalone daemons act as instruments that continuously read whatever they read, available to be queried but otherwise not saving or doing anything with the data themselves. The data is handled by an application that queries each service it manages to collect data about those services (and coordinate if necessary).
-->
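The Phase 4 polling idea above can be sketched roughly as follows. This is a minimal sketch, not the netdata_mongo.py script linked above: it assumes Netdata's `/api/v1/data` endpoint with `format=json` returns an object with a `labels` list (first entry `time`) and a `data` list of rows, and the function names, host name, and document schema are hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_chart(host, chart, seconds=60, port=19999):
    """Fetch the last `seconds` of one chart from a Netdata instance (network call)."""
    url = (
        f"http://{host}:{port}/api/v1/data"
        f"?chart={chart}&after=-{seconds}&format=json"
    )
    with urlopen(url, timeout=5) as resp:
        return json.load(resp)


def to_documents(host, chart, payload):
    """Turn one Netdata JSON payload into a list of MongoDB-ready documents.

    Assumed payload shape: {"labels": ["time", dim1, ...], "data": [[t, v1, ...], ...]}
    """
    labels = payload["labels"]
    docs = []
    for row in payload["data"]:
        docs.append(
            {
                "host": host,
                "chart": chart,
                "time": row[0],
                # Pair each dimension name with its value, skipping the time column.
                "values": dict(zip(labels[1:], row[1:])),
            }
        )
    return docs
```

The resulting list could then be handed to pymongo, for example `MongoClient()["netdata"]["metrics"].insert_many(docs)`, where the database and collection names are placeholders.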
<!--
==Stage 3: Visualizing Data==
===Phase 5: Grafana===
Use a [[Grafana]] container to create dashboards from the collected data.
Link to Grafana docker files: https://charlesreid1.com:3000/docker/d-grafana
Need to fix grafana user on jupiter.
We're basically after something like this: https://github.com/firehol/netdata/wiki/Netdata,-Prometheus,-and-Grafana-Stack
-->
Latest revision as of 08:18, 3 March 2018
Project Overview
The 2018 data project is an ongoing effort to figure out how to set up "painless" dashboards for monitoring metrics.
Stage 1: Collecting System Data
See 2018/Data Project/Stage 1
Stage 2: Spy
dahak-spy project:
- lightweight server (may want larger disk, okay if non-free)
- running mongodb
- running mongoexpress
- running prometheus
- running grafana
- running netdata
additional components before real world testing:
- netdata on the build node
- netdata python plugin from another process, monitoring.....???
- metrics:
  - is snakemake running (binary yes/no)
  - current stage of snakemake (adjust snakemake file to write into a dotfile)
  - cpu/memory/network/disk io
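The two snakemake metrics above could be collected along these lines. This is a rough sketch under assumptions: the dotfile name `.snakemake_stage` is hypothetical (the snakemake file would have to write the current rule name into it), and the running check relies on the `pgrep` utility being installed on the build node.

```python
import subprocess
from pathlib import Path

# Hypothetical dotfile; the snakemake file would need to write the current stage here.
STAGE_FILE = Path(".snakemake_stage")


def snakemake_running():
    """Return 1 if any process command line mentions snakemake, else 0."""
    # pgrep exits with status 0 when at least one process matches the pattern.
    result = subprocess.run(
        ["pgrep", "-f", "snakemake"], stdout=subprocess.DEVNULL
    )
    return 1 if result.returncode == 0 else 0


def current_stage(stage_file=STAGE_FILE):
    """Return the stage name stored in the dotfile, or 'unknown' if missing or empty."""
    try:
        text = Path(stage_file).read_text().strip()
    except FileNotFoundError:
        return "unknown"
    return text or "unknown"
```

Calling `snakemake_running()` and `current_stage()` on a timer would give the binary yes/no metric and the stage label. Note that `pgrep -f` matches full command lines, so a monitoring script whose own name contains "snakemake" would count as a match.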
netdata python plugin workflow?
- does it need to be installed and netdata restarted, or can it push data into netdata?
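One possible push-style route, offered as an untested assumption rather than an answer to the question above: Netdata includes a statsd collector, so a separate process could send gauges to it over UDP without installing a python plugin. The sketch below assumes that collector is enabled and listening on the default statsd port 8125; the metric name is hypothetical.

```python
import socket


def format_gauge(name, value):
    """Build a statsd gauge line of the form 'name:value|g'."""
    return f"{name}:{value}|g"


def push_gauge(name, value, host="127.0.0.1", port=8125):
    """Send one gauge to a statsd listener over UDP (fire and forget)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(format_gauge(name, value).encode("ascii"), (host, port))
    finally:
        sock.close()
```

For example, `push_gauge("dahak.snakemake.running", 1)` would report that snakemake is up. Because UDP is connectionless, the call does not confirm that Netdata actually received the metric.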
real yeti:
- get a yeti node
- debug the snakemake file one step at a time using already-downloaded files (faster step)
- let the snakemake file run with netdata and friends running
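Flags
Categories: Data Engineering, Data Project, MongoDB, MongoExpress, Graphite, Grafana, Netdata, Collectd, Prometheus, 2018, January 2018, February 2018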