From charlesreid1

* Highlight each step with a data engineering repository
* Individual services offered on the cloud - know the idea behind, e.g., why so many database solutions
* What specific challenges, software, workflows do genomics researchers face/use?
Process:
Case study
* Start by reviewing the logistics company case study
* https://charlesreid1.com/wiki/Google_Cloud/Case_Study
Software tools
* Basic software technologies: storage, databases, distributed computation, GPUs vs CPUs, Docker/containerization
* https://charlesreid1.com/wiki/Google_Cloud
* Google Cloud Genomics
Software Quality Assurance
* GitHub pages/10 things list (time machine)
GCDEC Review:
* 1 - https://charlesreid1.com/wiki/GCDEC/Fundamentals/Notes
* 2 - https://charlesreid1.com/wiki/GCDEC/Unstructured_Data/Notes
* 3a - https://charlesreid1.com/wiki/GCDEC/BigQuery/Notes
* 3b - https://charlesreid1.com/wiki/GCDEC/Dataflow/Notes
* 4a - https://charlesreid1.com/wiki/GCDEC/Building_Tensorflow/Notes
* 4b - https://charlesreid1.com/wiki/GCDEC/Deploying_Tensorflow/Notes
* 4c - https://charlesreid1.com/wiki/GCDEC/Engineering_Tensorflow/Notes
* 5 - https://charlesreid1.com/w/index.php?title=GCDEC/Streaming/Notes&action=edit&redlink=1

Revision as of 01:45, 3 January 2018

Review in preparation for interview:

  • Components of workflow and open source tools for each step
  • Highlight each step with a data engineering repository
  • Individual services offered on the cloud - know the idea behind, e.g., why so many database solutions
  • What specific challenges, software, workflows do genomics researchers face/use?


Process:

Case study

Software tools

Software Quality Assurance

  • GitHub pages/10 things list (time machine)

GCDEC Review: