From charlesreid1

No edit summary
(8 intermediate revisions by the same user not shown)
Line 1:
==Review==
==Review Notes Pages==


Review in preparation for interview:
[[Google Cloud/Scientific Data Processing]] - doing the scientific data processing Qwiklab
* Components of workflow and open source tools for each step
* Highlight each step with a data engineering repository
* Individual services offered on the cloud - know the idea behind, e.g., why so many database solutions
* What specific challenges, software, workflows do genomics researchers face/use?


Review page: [[Google Cloud/Review]]


===Review Process===
==Data Engineering Scenarios Review==


Case study
Project 1: [[2018/January/Data Engineering/Scientific Data Processing]]
* Start by reviewing the logistics company case study
* https://charlesreid1.com/wiki/Google_Cloud/Case_Study


Software tools
Project 2: [[2018/January/Data Engineering/Big Data Text Processing]]
* Basic software technologies: storage, databases, distributed computation, GPUs vs CPUs, Docker/containerization
* https://charlesreid1.com/wiki/Google_Cloud
* Google Cloud Genomics
 
Software Quality Assurance
* GitHub Pages/10 things list (time machine)
 
GCDEC Review:
* 1 - https://charlesreid1.com/wiki/GCDEC/Fundamentals/Notes
* 2 - https://charlesreid1.com/wiki/GCDEC/Unstructured_Data/Notes
* 3a - https://charlesreid1.com/wiki/GCDEC/BigQuery/Notes
* 3b - https://charlesreid1.com/wiki/GCDEC/Dataflow/Notes
* 4a - https://charlesreid1.com/wiki/GCDEC/Building_Tensorflow/Notes
* 4b - https://charlesreid1.com/wiki/GCDEC/Deploying_Tensorflow/Notes
* 4c - https://charlesreid1.com/wiki/GCDEC/Engineering_Tensorflow/Notes
* 5 - https://charlesreid1.com/w/index.php?title=GCDEC/Streaming/Notes&action=edit&redlink=1
 
Google Qwiklabs:
* Google Cloud Platform essentials - https://google.qwiklabs.com/quests/23?locale=en
* Scientific data processing - https://google.qwiklabs.com/quests/28?locale=en
* Data engineering - https://google.qwiklabs.com/quests/25?locale=en


Project 3: [[2018/January/Data Engineering/Cosmos]]


==Flags==


[[Category:Google Cloud]]
[[Category:Data Engineering]]

Latest revision as of 23:25, 11 January 2018

Review Notes Pages

Google Cloud/Scientific Data Processing - doing the scientific data processing Qwiklab

Review page: Google Cloud/Review

Data Engineering Scenarios Review

Project 1: 2018/January/Data Engineering/Scientific Data Processing

Project 2: 2018/January/Data Engineering/Big Data Text Processing

Project 3: 2018/January/Data Engineering/Cosmos

Flags