Docker/Pods/Deep Learning
Notes on a Docker Pod for deep learning.
Setting Up Docker Deep Learning
We are looking for Docker images that can handle a number of different deep learning technologies:
- Python 3
- Jupyter
- Numpy, scipy, matplotlib, pandas
- Scikit Learn/Scikit Image
- Tensorflow
- OpenCV
- Keras
It would also be nice to be ready to use a GPU if it is available...
This may require a single Docker container, or it might require the use of multiple containers. Either way, we'll call it a Docker pod - a collection of related containers.
Docker container from dockerhub
To get various containers set up, we can use a container created by GitHub user @waleedka:
- GitHub: https://github.com/waleedka/modern-deep-learning-docker
- Docker Hub: https://hub.docker.com/r/waleedka/modern-deep-learning/
This GitHub repo provides a Dockerfile that installs pretty much every item we wanted from the list above, plus a few other things (Java).
There is also @floydhub, something of a snake-oil salesman, whose dl-docker repo nonetheless has some interesting materials: https://github.com/floydhub/dl-docker
Setting Up Docker Container
Using CPU Based Platform
If you're just using a CPU, start by installing Docker on your platform of choice: Docker/Installing
Next, if you just want to use the deep learning container without any modifications, run this pair of commands to get the docker container and run it:
$ docker pull waleedka/modern-deep-learning
$ docker run -it -p 8888:8888 -v ~/:/host waleedka/modern-deep-learning
Note that this takes care of adding a persistent volume to the container, located at /host, that maps to the host's home directory. This allows getting data into and out of the container.
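As a quick sanity check of the mount (the file names here are hypothetical, just to illustrate the data flow), you can list the host's home directory from inside the container and copy files in either direction:
root@container:~# ls /host
root@container:~# cp /host/data/train.csv /root/
root@container:~# cp /root/model_weights.h5 /host/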
Using GPU Based Platform
Using a GPU is a little more complicated, since Docker containers have no inherent way of accessing GPU hardware from onboard the container.
Nvidia-docker provides a CUDA image and a docker command line wrapper to allow the GPUs to be accessed by a Docker container when it is launched. To get nvidia-docker, you have to sign up for a free account with Nvidia: https://devblogs.nvidia.com/parallelforall/nvidia-docker-gpu-server-application-deployment-made-easy/
Once you do that and install the nvidia-docker utility, you can use it as a drop-in replacement for the docker command. Here's what running the hello-world container looks like with nvidia-docker:
$ nvidia-docker run --rm hello-world
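For a more meaningful smoke test, you can run nvidia-smi inside the stock nvidia/cuda image; if the wrapper is working, this prints the driver version and a table of the GPUs visible to the container:
$ nvidia-docker run --rm nvidia/cuda nvidia-smi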
Here are the steps that Nvidia suggests for any nvidia-docker project (a command-level sketch follows the list):
1. Set up and explore the development environment inside a container.
2. Build the application in the container.
3. Deploy the container in multiple environments.
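A sketch of that three-step workflow might look like the following (the image name myapp is hypothetical, and step 2 assumes a Dockerfile in the current directory):
# 1. Set up and explore the development environment interactively
$ nvidia-docker run -it nvidia/cuda /bin/bash
# 2. Build the application into an image
$ docker build -t myapp .
# 3. Deploy the same image on any host with the Nvidia driver and nvidia-docker
$ nvidia-docker run --rm myapp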
Once you've done all of that, you can run the container as above (as in the CPU case), but replacing docker with nvidia-docker:
$ nvidia-docker run -it -p 8888:8888 -v ~/:/host waleedka/modern-deep-learning
Customizing Docker Container
Using the Docker image as-is just requires pulling the image from GitHub or Docker Hub.
To customize it, I created a git repository: https://git.charlesreid1.com/docker/d-deep-learning
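As a rough sketch of what such a customization can look like (the packages and paths below are illustrative, not taken from the actual d-deep-learning repo), a derived Dockerfile starts FROM the upstream image and layers changes on top:
# Illustrative only - the real Dockerfile lives in the d-deep-learning repo
FROM waleedka/modern-deep-learning
# add extra Python packages on top of the stock image
RUN pip3 install h5py pillow
# bake pre-prepared notebooks into the image
COPY notebooks/ /root/notebooks/
WORKDIR /root/notebooks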
Data Volumes Strategy
Volumes strategy for deep learning models using Docker:
Training and testing will happen in one go: a single data set is split into training and testing subsets, and both steps happen in the same notebook.
Data from host to container includes:
- Entire data set - split into training and testing for the two different steps.
- Pre-prepared notebooks or scripts
- Input files
Data from container to host includes:
- Exported trained model
- Files with results (e.g., image style transfer or generated text)
- May be running multiple containers with multiple algorithms, architectures, or parameters
- Aggregating output from multiple containers
- For GPU/expensive instances, need to get the data off of the machine as soon as training finishes, rather than leaving the instance running
- Optimal: have one or multiple shared disks with persistent storage (a docker run sketch follows this list)
- Alternative: have a secure file transfer happen when training is finished
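A minimal sketch of the shared-disk approach (the directory names are hypothetical): mount the data set read-only and a separate writable volume for trained models and results, so outputs survive the container and can be pulled off the instance immediately:
$ docker run -it \
    -v ~/data:/data:ro \
    -v ~/results:/results \
    -p 8888:8888 \
    waleedka/modern-deep-learning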
Next step beyond this would be to put machine learning models into production.
Testing Docker Deep Learning
CPU-Based Platform (MacBook Pro)
Stock image
Start out by running Docker.app in the Applications folder. This will run the Docker daemon in the background.
Now get the docker container, run it, and start a notebook.
$ docker pull waleedka/modern-deep-learning
$ docker run -it -p 8888:8888 waleedka/modern-deep-learning
root@a944863bc1e6:~# jupyter notebook
Now on the host machine, we can navigate to localhost:8888 and see a Jupyter notebook server up and running. This exposes the container's file system and any notebooks running in the container. Note that this container runs Python 3 only.
Create a new Python notebook, and try importing a few libraries:
import numpy
import scipy
import sklearn
import theano
import tensorflow
import pandas
import matplotlib
import keras
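If those imports succeed, a slightly stronger check is to print each package's version string (all of these packages expose a __version__ attribute):
import numpy, scipy, sklearn, theano, tensorflow, pandas, matplotlib, keras
for module in (numpy, scipy, sklearn, theano, tensorflow, pandas, matplotlib, keras):
    print(module.__name__, module.__version__)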
Custom image
Or, use the custom image:
$ git clone https://charlesreid1.com:3000/docker/d-deep-learning.git
$ cd d-deep-learning
Build it:
$ docker build -t deep_learning .
Run it:
$ docker run -it -p 8888:8888 deep_learning
GPU-Based Platform
On a GPU-based platform, you can test out the deep learning image as follows.
First, make sure the Nvidia CUDA driver for the GPU card is installed. cuDNN (Nvidia's CUDA library for deep neural networks) is included with the deep learning Docker image provided by waleedka, so you don't need to install it separately.
Next, install Docker, followed by Nvidia-docker.
The deep learning image is run using the same command as above, but with nvidia-docker instead of regular old docker, and with the image's :gpu tag. The extra -p 6006:6006 flag maps TensorBoard's default port.
$ nvidia-docker run -it -p 8888:8888 -p 6006:6006 -v ~/:/host waleedka/modern-deep-learning:gpu
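With port 6006 mapped, you can also launch TensorBoard from inside the container and browse to localhost:6006 on the host (the log directory here is hypothetical, and this assumes the tensorboard binary is included with the image's TensorFlow install):
root@container:~# tensorboard --logdir /host/logs &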