Jupyter/MPI
From charlesreid1
Main Jupyter page: Jupyter
ipyparallel documentation: https://ipyparallel.readthedocs.io/en/latest/
Steps
Install OpenMPI
Start by installing OpenMPI:
brew install open-mpi
Install Necessary Notebook Modules
Install the mpi4py library:
pip3 install mpi4py
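Enable the ipyparallel notebook extension (this assumes the ipyparallel package itself is already installed, for example with pip3 install ipyparallel):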
jupyter serverextension enable --py ipyparallel
Start MPI Cluster
Then start a cluster using ipcluster (without any extra flags, the engines are launched locally rather than through MPI):
ipcluster start --n=4
The output should look like this:
$ ipcluster start --n=4
2017-11-27 05:25:47.988 [IPClusterStart] Starting ipcluster with [daemon=False]
2017-11-27 05:25:47.989 [IPClusterStart] Creating pid file: /Users/charles/.ipython/profile_default/pid/ipcluster.pid
2017-11-27 05:25:47.989 [IPClusterStart] Starting Controller with LocalControllerLauncher
2017-11-27 05:25:48.998 [IPClusterStart] Starting 4 Engines with LocalEngineSetLauncher
2017-11-27 05:26:19.332 [IPClusterStart] Engines appear to have started successfully
Start cluster with MPI (failures)
If you do pass an --engines flag, though, it could be problematic:
$ ipcluster start --engines=MPI --n=4
2017-11-27 05:24:09.772 [IPClusterStart] Starting ipcluster with [daemon=False]
2017-11-27 05:24:09.773 [IPClusterStart] Creating pid file: /Users/charles/.ipython/profile_default/pid/ipcluster.pid
2017-11-27 05:24:09.773 [IPClusterStart] Starting Controller with LocalControllerLauncher
2017-11-27 05:24:10.777 [IPClusterStart] Starting 4 Engines with MPI
2017-11-27 05:24:10.813 [IPClusterStart] ERROR | Engines shutdown early, they probably failed to connect. Check the engine log files for output. If your controller and engines are not on the same machine, you probably have to instruct the controller to listen on an interface other than localhost. You can set this by adding "--ip='*'" to your ControllerLauncher.controller_args. Be sure to read our security docs before instructing your controller to listen on a public interface.
2017-11-27 05:24:10.813 [IPClusterStart] ERROR | IPython cluster: stopping
2017-11-27 05:24:13.816 [IPClusterStart] Removing pid file: /Users/charles/.ipython/profile_default/pid/ipcluster.pid
To solve this problem, you'll need to create an IPython profile, which IPython parallel can then load. You'll also add a setting to the profile's config file to specify that MPI should be used to start the engines of any cluster.
Link to documentation with description: https://ipyparallel.readthedocs.io/en/stable/process.html#using-ipcluster-in-mpiexec-mpirun-mode
ipython profile create --parallel --profile=mpi
then
vim $IPYTHONDIR/profile_mpi/ipcluster_config.py
then add the line
c.IPClusterEngines.engine_launcher_class = 'MPIEngineSetLauncher'
then start ipcluster with that profile, which creates the cluster for IPython parallel to use and launches the engines through MPI (mpiexec/mpirun):
ipcluster start -n 4 --profile=mpi
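The manual edit to ipcluster_config.py described above could also be scripted. The following is only a sketch: the helper names default_config_path and ensure_mpi_launcher are made up for illustration, and it assumes the profile was already created with ipython profile create so that the config file exists. It appends the launcher setting only if an uncommented copy is not already present.

```python
import os
from pathlib import Path

# The setting quoted in the text above.
LAUNCHER_LINE = "c.IPClusterEngines.engine_launcher_class = 'MPIEngineSetLauncher'"


def default_config_path(profile="mpi"):
    """Return $IPYTHONDIR/profile_<profile>/ipcluster_config.py.

    IPYTHONDIR falls back to ~/.ipython when the variable is not set.
    """
    ipython_dir = Path(os.environ.get("IPYTHONDIR", Path.home() / ".ipython"))
    return ipython_dir / f"profile_{profile}" / "ipcluster_config.py"


def ensure_mpi_launcher(config_path):
    """Append LAUNCHER_LINE to an existing config file unless it is already active.

    Returns True if the file was modified, False if the line was already there.
    (Hypothetical helper, not part of ipyparallel.)
    """
    path = Path(config_path)
    text = path.read_text()  # raises FileNotFoundError if the profile was never created
    if any(line.strip() == LAUNCHER_LINE for line in text.splitlines()):
        return False
    if text and not text.endswith("\n"):
        text += "\n"
    path.write_text(text + LAUNCHER_LINE + "\n")
    return True
```

Calling ensure_mpi_launcher(default_config_path()) would then take the place of the vim step; running it a second time leaves the file unchanged.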
If it is still giving you trouble, try dumping debug info:
ipcluster start --engines=MPI --n=4 --log-to-file=True --log-level DEBUG
This is still not working... more info: https://stackoverflow.com/questions/33614100/setting-up-a-distributed-ipython-ipyparallel-mpi-cluster#33671604
I thought I had just forgotten to start a controller, but starting one does not fix anything:
ipcontroller --ip='*' --profile=mpi
ipcluster start -n 4 --engines=MPIEngineSetLauncher
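Finally, the debug output helped track down the problem: I was asking for 4 processes on a 2-processor system, and mpiexec refuses to start when there are not enough slots for the requested number of processes: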
$ ipcluster start -n 4 --engines=MPIEngineSetLauncher --log-level DEBUG
2017-11-27 08:21:58.843 [IPClusterStart] IPYTHONDIR set to: /Users/charles/.ipython
2017-11-27 08:21:58.844 [IPClusterStart] Using existing profile dir: '/Users/charles/.ipython/profile_default'
2017-11-27 08:21:58.845 [IPClusterStart] Searching path ['/Users/charles/codes/ipython-in-depth', '/Users/charles/.ipython/profile_default', '/usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/etc/ipython', '/usr/local/etc/ipython', '/etc/ipython'] for config files
2017-11-27 08:21:58.845 [IPClusterStart] Attempting to load config file: ipython_config.py
2017-11-27 08:21:58.845 [IPClusterStart] Looking for ipython_config in /etc/ipython
2017-11-27 08:21:58.846 [IPClusterStart] Looking for ipython_config in /usr/local/etc/ipython
2017-11-27 08:21:58.846 [IPClusterStart] Looking for ipython_config in /usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/etc/ipython
2017-11-27 08:21:58.846 [IPClusterStart] Looking for ipython_config in /Users/charles/.ipython/profile_default
2017-11-27 08:21:58.846 [IPClusterStart] Looking for ipython_config in /Users/charles/codes/ipython-in-depth
2017-11-27 08:21:58.847 [IPClusterStart] Attempting to load config file: ipcluster_config.py
2017-11-27 08:21:58.847 [IPClusterStart] Looking for ipcontroller_config in /etc/ipython
2017-11-27 08:21:58.847 [IPClusterStart] Looking for ipcontroller_config in /usr/local/etc/ipython
2017-11-27 08:21:58.847 [IPClusterStart] Looking for ipcontroller_config in /usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/etc/ipython
2017-11-27 08:21:58.847 [IPClusterStart] Looking for ipcontroller_config in /Users/charles/.ipython/profile_default
2017-11-27 08:21:58.848 [IPClusterStart] Looking for ipcontroller_config in /Users/charles/codes/ipython-in-depth
2017-11-27 08:21:58.849 [IPClusterStart] Attempting to load config file: ipcluster_config.py
2017-11-27 08:21:58.849 [IPClusterStart] Looking for ipengine_config in /etc/ipython
2017-11-27 08:21:58.849 [IPClusterStart] Looking for ipengine_config in /usr/local/etc/ipython
2017-11-27 08:21:58.849 [IPClusterStart] Looking for ipengine_config in /usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/etc/ipython
2017-11-27 08:21:58.850 [IPClusterStart] Looking for ipengine_config in /Users/charles/.ipython/profile_default
2017-11-27 08:21:58.850 [IPClusterStart] Looking for ipengine_config in /Users/charles/codes/ipython-in-depth
2017-11-27 08:21:58.850 [IPClusterStart] Attempting to load config file: ipcluster_config.py
2017-11-27 08:21:58.851 [IPClusterStart] Looking for ipcluster_config in /etc/ipython
2017-11-27 08:21:58.851 [IPClusterStart] Looking for ipcluster_config in /usr/local/etc/ipython
2017-11-27 08:21:58.851 [IPClusterStart] Looking for ipcluster_config in /usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/etc/ipython
2017-11-27 08:21:58.851 [IPClusterStart] Looking for ipcluster_config in /Users/charles/.ipython/profile_default
2017-11-27 08:21:58.851 [IPClusterStart] Looking for ipcluster_config in /Users/charles/codes/ipython-in-depth
2017-11-27 08:21:58.861 [IPClusterStart] Starting ipcluster with [daemon=False]
2017-11-27 08:21:58.862 [IPClusterStart] Creating pid file: /Users/charles/.ipython/profile_default/pid/ipcluster.pid
2017-11-27 08:21:58.862 [IPClusterStart] Starting Controller with LocalControllerLauncher
2017-11-27 08:21:58.862 [IPClusterStart] Starting LocalControllerLauncher: ['/usr/local/opt/python3/bin/python3.6', '-m', 'ipyparallel.controller', '--profile-dir', '/Users/charles/.ipython/profile_default', '--cluster-id', '', '--log-level=20']
2017-11-27 08:21:58.868 [IPClusterStart] Process '/usr/local/opt/python3/bin/python3.6' started: 54085
2017-11-27 08:21:59.740 [IPClusterStart] b"2017-11-27 08:21:59.740 [IPControllerApp] Using existing profile dir: '/Users/charles/.ipython/profile_default'"
2017-11-27 08:21:59.759 [IPClusterStart] b'2017-11-27 08:21:59.758 [IPControllerApp] Hub listening on tcp://127.0.0.1:64185 for registration.'
2017-11-27 08:21:59.761 [IPClusterStart] b"2017-11-27 08:21:59.761 [IPControllerApp] Hub using DB backend: 'DictDB'"
2017-11-27 08:21:59.868 [IPClusterStart] Starting 4 Engines with MPIEngineSetLauncher
2017-11-27 08:21:59.868 [IPClusterStart] Starting MPIEngineSetLauncher: ['mpiexec', '-n', '4', '/usr/local/opt/python3/bin/python3.6', '-m', 'ipyparallel.engine', '--profile-dir', '/Users/charles/.ipython/profile_default', '--cluster-id', '', '--log-level=20']
2017-11-27 08:21:59.874 [IPClusterStart] Process 'mpiexec' started: 54098
2017-11-27 08:21:59.899 [IPClusterStart] b'--------------------------------------------------------------------------'
2017-11-27 08:21:59.905 [IPClusterStart] b'There are not enough slots available in the system to satisfy the 4 slots'
2017-11-27 08:21:59.906 [IPClusterStart] Process 'mpiexec' stopped: {'exit_code': 1, 'pid': 54098}
2017-11-27 08:21:59.906 [IPClusterStart] ERROR | Engines shutdown early, they probably failed to connect. Check the engine log files for output. If your controller and engines are not on the same machine, you probably have to instruct the controller to listen on an interface other than localhost. You can set this by adding "--ip='*'" to your ControllerLauncher.controller_args. Be sure to read our security docs before instructing your controller to listen on a public interface.
2017-11-27 08:21:59.907 [IPClusterStart] ERROR | IPython cluster: stopping
2017-11-27 08:21:59.909 [IPClusterStart] b'Traceback (most recent call last):'
2017-11-27 08:22:00.027 [IPClusterStart] b' File "/usr/local/Cellar/python3/3.6.3/Frameworks/Python.framework/Versions/3.6/lib/python3.6/runpy.py", line 193, in _run_module_as_main'
2017-11-27 08:22:00.027 [IPClusterStart] Process '/usr/local/opt/python3/bin/python3.6' stopped: {'exit_code': 1, 'pid': 54085}
2017-11-27 08:22:02.910 [IPClusterStart] Removing pid file: /Users/charles/.ipython/profile_default/pid/ipcluster.pid
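Since the failure came from requesting more MPI processes than the machine offers, one way to avoid it is to clamp the engine count before calling ipcluster. The sketch below is an illustration only: safe_engine_count is a hypothetical helper, and it uses os.cpu_count() as a stand-in for the number of slots, which is an assumption (that call counts logical CPUs, and Open MPI may count slots differently, so the true limit can be lower).

```python
import os


def safe_engine_count(requested, available=None):
    """Clamp the requested number of engines to the available slot count.

    `available` defaults to os.cpu_count(), which is only an approximation
    of what mpiexec will accept (assumption, see the note above).
    """
    if available is None:
        available = os.cpu_count() or 1
    return max(1, min(requested, available))


# Asking for 4 engines on a 2-slot machine, as in the log above.
print(safe_engine_count(4, available=2))  # prints 2
```

The result can then be passed as the -n value to ipcluster start.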
Start cluster with MPI (success)
The cluster runs when I change to:
$ ipcluster start --n=2 --engines=MPIEngineSetLauncher --profile=mpi
and when I connect to the cluster using:
from ipyparallel import Client, error
cluster = Client(profile="mpi")
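Once the client connects, it is worth confirming that the engines really share one MPI world rather than each running as an isolated process. The following is a rough sketch under some assumptions: the function names are made up for illustration, mpi4py must be importable on the engines, and check_cluster needs a cluster that is already running under the mpi profile. Engines started without MPI would each report rank 0 of a world of size 1.

```python
def mpi_rank_and_size():
    """Runs on each engine; the import happens there, not on the client."""
    from mpi4py import MPI
    comm = MPI.COMM_WORLD
    return comm.Get_rank(), comm.Get_size()


def ranks_look_right(results):
    """True if every engine reports the same world size, equal to the number
    of engines, and ranks 0..size-1 each appear exactly once."""
    if not results:
        return False
    ranks = sorted(rank for rank, _ in results)
    sizes = {size for _, size in results}
    return sizes == {len(results)} and ranks == list(range(len(results)))


def check_cluster(profile="mpi"):
    """Ask every engine for its (rank, size) pair. Requires a running cluster."""
    import ipyparallel as ipp  # imported lazily so the helpers above work without it
    rc = ipp.Client(profile=profile)
    results = rc[:].apply_sync(mpi_rank_and_size)
    return results, ranks_look_right(results)
```

In a notebook, results, ok = check_cluster() returns the (rank, size) pairs and a flag saying whether they are consistent with a single MPI world.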
Problems
Problems sharing a variable using px magic
Ideally, we want something like this to work:
import ipyparallel as ipp
rc = ipp.Client()
then:
%px foo = 5
print(rc[:]['foo'])
rc[:]['bar'] = lambda : 2 * foo
rc[:].apply_sync(parallel.Reference('bar'))
However, this fails.
Documentation:
- https://ipyparallel.readthedocs.io/en/latest/api/ipyparallel.html#classes
- Suggests answer may be push/pull?
- Gives px example with variable assignment: https://github.com/ipython/ipyparallel/blob/527dfc6c7b7702fb159751588a5d5a11d8dd2c4f/docs/source/magics.rst
# import numpy here and everywhere
In [25]: with rc[:].sync_imports():
   ....:     import numpy
importing numpy on engine(s)
In [27]: %px a = numpy.random.rand(2,2)
Parallel execution on engines: [0, 1, 2, 3]
In [28]: %px numpy.linalg.eigvals(a)
Parallel execution on engines: [0, 1, 2, 3]
Out [0:68]: array([ 0.77120707, -0.19448286])
Out [1:68]: array([ 1.10815921, 0.05110369])
Out [2:68]: array([ 0.74625527, -0.37475081])
Out [3:68]: array([ 0.72931905, 0.07159743])
In [29]: %px print 'hi'
Parallel execution on engine(s): all
[stdout:0] hi
[stdout:1] hi
[stdout:2] hi
[stdout:3] hi
More hints, but nothing solid: https://github.com/ipython/ipyparallel/blob/1cc0f67ba12a4c18be74384800aa906bc89d4dd3/docs/source/direct.rst
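The push/pull route hinted at in the documentation can be sketched as follows. This is not a confirmed fix for the px problem above, only an illustration of the DirectView push and pull calls. The helper share_variable is hypothetical, and it takes the view as an argument so it can be tried against any object that offers the same two methods.

```python
def share_variable(view, name, value):
    """Push name=value into the namespace of every engine in the view,
    then pull it back so the caller can confirm it arrived.

    `view` is expected to behave like an ipyparallel DirectView, e.g. rc[:].
    """
    view.push({name: value}, block=True)
    return view.pull(name, block=True)


# Usage against a running cluster (not executed here):
#   import ipyparallel as ipp
#   rc = ipp.Client(profile="mpi")
#   share_variable(rc[:], "foo", 5)   # should give one value per engine
```

After the push, the variable should be visible to later %px cells, since those execute in the same engine namespaces.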
Original notebook: https://github.com/charlesreid1/ipython-in-depth/blob/master/examples/Parallel%20Computing/Using%20Dill.ipynb
IPython parallel built-in magics (these pages mention the px magic, but give no examples):
- Cell magic: https://ipython.readthedocs.io/en/stable/interactive/magics.html#cell-magics
- All magic: https://ipyparallel.readthedocs.io/en/latest/magics.html
Resources
Notebook illustrating IPython usage of pxlocal: https://nbviewer.jupyter.org/gist/minrk/4470122