Overview#
Architecture#
The IPython.parallel architecture consists of four components:
IPython-Engine#
The IPython engine is an extension of the IPython kernel for Jupyter. The module waits for requests from the network, executes code and returns the results. IPython parallel extends the Jupyter messaging protocol with native Python object serialisation and adds some additional commands. Several engines are started for parallel and distributed computing.
IPython-Hub#
The main job of the hub is to establish and monitor connections to clients and engines.
IPython-Schedulers#
All actions that can be carried out on the engine go through a scheduler. While the engines themselves block when user code is executed, the schedulers hide this from the user to provide a fully asynchronous interface for a number of engines.
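The idea of an asynchronous interface in front of blocking workers can be illustrated with a small sketch. It uses `concurrent.futures` from the Python standard library as a stand-in, not the IPython schedulers themselves; the function `blocking_task` and the pool size are assumptions made only for this example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_task(x):
    """Stand-in for user code that blocks a worker while it runs."""
    time.sleep(0.05)
    return x * x


with ThreadPoolExecutor(max_workers=4) as pool:
    # submit() returns a future immediately, although each worker blocks
    futures = [pool.submit(blocking_task, x) for x in range(4)]
    # result() waits for the individual task to finish
    results = [future.result() for future in futures]

print(results)
```

The caller only handles futures and never waits on a specific worker, which is roughly the role the schedulers play for the engines.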
IPython-Client#
There is a main object, Client, to connect to the cluster. Then there is a corresponding View for each execution model. These Views allow users to interact with a number of engines. The two standard views are:
- ipyparallel.DirectView class for explicit addressing
- ipyparallel.LoadBalancedView class for target-independent scheduling
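A minimal sketch of how the client and the two views might be used together. It assumes that ipyparallel is installed and that a cluster with the default profile is already running; the names `square` and `run_on_cluster` are hypothetical helpers introduced only for this example.

```python
def square(x):
    """Toy task used for illustration."""
    return x * x


def run_on_cluster(values):
    """Map square() over values with both standard views.

    Assumes a running cluster with the default profile.
    """
    import ipyparallel as ipp  # imported here so the sketch loads without it

    rc = ipp.Client()  # connect to the cluster
    try:
        dview = rc[:]  # DirectView addressing all engines explicitly
        lview = rc.load_balanced_view()  # LoadBalancedView, scheduler picks engines
        direct = dview.map_sync(square, values)
        balanced = lview.map_sync(square, values)
    finally:
        rc.close()
    return direct, balanced
```

Once the cluster is up, `run_on_cluster(range(8))` should return the same squared values from both views; only the way the work is distributed across the engines differs.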
Start#
Starting the IPython Hub:
$ pipenv run ipcontroller
[IPControllerApp] Hub listening on tcp://127.0.0.1:53847 for registration.
[IPControllerApp] Hub using DB backend: 'DictDB'
[IPControllerApp] hub::created hub
[IPControllerApp] writing connection info to /Users/veit/.ipython/profile_default/security/ipcontroller-client.json
[IPControllerApp] writing connection info to /Users/veit/.ipython/profile_default/security/ipcontroller-engine.json
[IPControllerApp] task::using Python leastload Task scheduler
…
- DB backend
The database in which the IPython tasks are managed. In addition to the in-memory database DictDB, MongoDB and SQLite are further options.
- ipcontroller-client.json
Connection information for the IPython client
- ipcontroller-engine.json
Connection information for the IPython engine
- Task-Schedulers
The routing scheme used to assign tasks to engines. leastload always assigns tasks to the engine with the fewest open tasks. Alternatively, lru (Least Recently Used), plainrandom, twobin and weighted can be selected; the latter two also require NumPy.
This can be configured in ipcontroller_config.py, for example with c.TaskScheduler.scheme_name = 'leastload', or on the command line with $ pipenv run ipcontroller --scheme=pure
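The difference between the leastload and lru schemes can be sketched with a toy model in plain Python. This is not the scheduler code of IPython parallel; the helpers `pick_leastload` and `pick_lru`, the engine ids and the initial task counts are assumptions made for illustration.

```python
from collections import deque


def pick_leastload(open_tasks):
    """Return the engine id with the fewest open tasks (ties: lowest id)."""
    return min(sorted(open_tasks), key=lambda engine: open_tasks[engine])


def pick_lru(order):
    """Return the least recently used engine and mark it as just used."""
    engine = order.popleft()
    order.append(engine)
    return engine


# Hypothetical cluster state: engine id -> number of open tasks
open_tasks = {0: 2, 1: 0, 2: 1}
leastload_assignments = []
for _ in range(4):
    engine = pick_leastload(open_tasks)
    open_tasks[engine] += 1  # the new task is now open on that engine
    leastload_assignments.append(engine)

# Engines ordered from least to most recently used
order = deque([0, 1, 2])
lru_assignments = [pick_lru(order) for _ in range(4)]

print(leastload_assignments)
print(lru_assignments)
```

In this toy model leastload first fills the idle engine until the task counts even out, while lru cycles through the engines regardless of how many tasks each one still has open.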
Starting the IPython controller and the engines:
$ pipenv run ipcluster start
[IPClusterStart] Starting ipcluster with [daemon=False]
[IPClusterStart] Creating pid file: /Users/veit/.ipython/profile_default/pid/ipcluster.pid
[IPClusterStart] Starting Controller with LocalControllerLauncher
[IPClusterStart] Starting 4 Engines with LocalEngineSetLauncher
- Batch systems
Besides starting ipcontroller and ipengine locally (see Starting the controller and engine on your local machine in ipyparallel:/tutorial/process.md#starting-a-cluster-with-ssh), there are also profiles for MPI, PBS, SGE, LSF, HTCondor, Slurm, SSH and WindowsHPC.
This can be configured in ipcluster_config.py, for example with c.IPClusterEngines.engine_launcher_class = 'SSH', or on the command line with $ pipenv run ipcluster start --engines=MPI
Starting the Jupyter Notebook and loading the IPython-Parallel-Extension:
$ pipenv run jupyter notebook
[I NotebookApp] Loading IPython parallel extension
[I NotebookApp] [jupyter_nbextensions_configurator] enabled 0.4.1
[I NotebookApp] Serving notebooks from local directory: /Users/veit//jupyter-tutorial
[I NotebookApp] The Jupyter Notebook is running at:
[I NotebookApp] http://localhost:8888/?token=4e9acb8993758c2e7f3bda3b1957614c6f3528ee5e3343b3
Finally, the cluster with the default profile can be started in the browser at the URL http://localhost:8888/tree/docs/parallel/ipyparallel#ipyclusters.