From charlesreid1

Notes from January 2018 data engineering work.

This consists of review (rebooting the data engineering stuff) and coding (identifying relevant data engineering scenarios to work through).

pages

Review page: Google Cloud/Review

Project 1: 2018/January/Data Engineering/Scientific Data Processing

Project 2: 2018/January/Data Engineering/Big Data Text Processing

Project 3: 2018/January/Data Engineering/Cosmos


procedure

Expanding data-engineering-scenarios:

  • Start with ready examples
  • Work toward synthetic experimental data
  • An imaginary factory... lots of widgets... Kubernetes/container engine... orchestrating a process
  • Focus on a particular process or set of processes, and drill into it, use it to provide multiple angles on a single concept
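
A minimal sketch of what the synthetic-data step could look like for the imaginary widget factory. Everything here is an assumption made up for illustration: the `WidgetReading` fields, the station names, the temperature distribution, and the 80-degree QA threshold are all hypothetical and do not come from the data-engineering-scenarios repo.

```python
import random
from dataclasses import dataclass


@dataclass
class WidgetReading:
    """One synthetic sensor reading for a widget (hypothetical schema)."""
    widget_id: int
    station: str
    temperature_c: float
    passed_qa: bool


def generate_readings(n, seed=0):
    """Generate n synthetic readings; seeded so runs are reproducible."""
    rng = random.Random(seed)
    stations = ["press", "paint", "assembly"]  # made-up station names
    readings = []
    for widget_id in range(n):
        temperature = rng.gauss(70.0, 5.0)  # assumed mean and spread
        readings.append(
            WidgetReading(
                widget_id=widget_id,
                station=rng.choice(stations),
                temperature_c=round(temperature, 2),
                passed_qa=temperature < 80.0,  # assumed QA threshold
            )
        )
    return readings


def failure_rate_by_station(readings):
    """Return the fraction of QA failures for each station that appears."""
    totals, failures = {}, {}
    for r in readings:
        totals[r.station] = totals.get(r.station, 0) + 1
        if not r.passed_qa:
            failures[r.station] = failures.get(r.station, 0) + 1
    return {s: failures.get(s, 0) / totals[s] for s in totals}


if __name__ == "__main__":
    data = generate_readings(1000, seed=42)
    print(failure_rate_by_station(data))
```

Keeping the generator seeded means the same "factory" can be replayed through different tools (storage, database, containerized jobs), which fits the idea of drilling into one process from multiple angles.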

Software tools list, with an (abstract) example for each: Google Cloud

  • Storage/database/computation/GPUs vs CPUs/containerization

Software quality assurance: https://git.charlesreid1.com/charlesreid1/scientific-software

  • 10 Best
  • More informal
  • Bullet points - things I've learned
  • Apply style of later points to earlier points
  • GitHub page - 10 things
  • Clear out lorem ipsum (7-10)

links

links to notes

Notes review: GCDEC

links to codelabs

Google Codelabs:

Google Qwiklabs:

Flags