Using repo2docker#
Note
Docker must be running in order to run repo2docker. For more information on installing repo2docker, see Installing repo2docker.
repo2docker can build a reproducible computational environment for any repository that follows The Reproducible Execution Environment Specification. repo2docker is called with the URL of a Git repository, a DOI from Zenodo or Figshare, a Handle or DOI from a Dataverse installation, a SWHID of a directory of a revision archived in the Software Heritage Archive, or a path to a local directory.
It then performs these steps:
Inspects the repository for configuration files. These will be used to build the environment needed to run the repository.
Builds a Docker image with an environment specified in these configuration files.
Launches the image to let you explore the repository interactively via Jupyter notebooks, RStudio, or many other interfaces (optional).
Pushes the image to a Docker registry so that it may be accessed remotely (optional).
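The optional steps above correspond to flags listed in the Command line API section. As a minimal sketch, assuming a POSIX shell, the wrapper below builds an image but skips the launch step with --no-run and gives the image a predictable name with --image-name. The function name build_only is hypothetical and not part of repo2docker.

```shell
#!/bin/sh
# Hypothetical helper: build an image for a source without launching it.
#   $1 = source repository (URL, DOI, SWHID, or local path)
#   $2 = name for the resulting image
build_only() {
    jupyter-repo2docker --no-run --image-name "$2" "$1"
}

# Example call (requires Docker to be running):
#   build_only https://github.com/norvig/pytudes pytudes-env
```

Wrapping the call in a function keeps the flags in one place, so the same build can be repeated from a script or a CI job.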
Calling repo2docker#
repo2docker is called with this command:
jupyter-repo2docker <source-repository>
where <source-repository> identifies the source repository you want to build and is one of:
a URL of a Git repository (https://github.com/binder-examples/requirements),
a Zenodo DOI (10.5281/zenodo.1211089),
a SWHID (swh:1:rev:999dd06c7f679a2714dfe5199bdca09522a29649), or
a path to a local directory (a/local/directory).
For example, the following command will build an image of Peter Norvig’s Pytudes repository:
jupyter-repo2docker https://github.com/norvig/pytudes
Building the image may take a few minutes.
Pytudes uses a requirements.txt file to specify its Python environment. Because of this, repo2docker will use pip to install the dependencies listed in this requirements.txt file, and these will be present in the generated Docker image. To learn more about configuration files in repo2docker, visit Configuration Files.
When the image is built, a message will be output to your terminal:
Copy/paste this URL into your browser when you connect for the first time,
to login with a token:
http://0.0.0.0:36511/?token=f94f8fabb92e22f5bfab116c382b4707fc2cade56ad1ace0
Pasting the URL into your browser will open Jupyter Notebook with the dependencies and contents of the source repository in the built image.
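The port and token in this URL change on every run. If you want to use them from a script, the following sketch splits the example URL shown above into its parts using only POSIX parameter expansion. The variable names are illustrative.

```shell
#!/bin/sh
# Example URL copied from the terminal message above.
url='http://0.0.0.0:36511/?token=f94f8fabb92e22f5bfab116c382b4707fc2cade56ad1ace0'

# Everything after "token=" is the login token.
token=${url#*token=}

# The port sits between the last ":" of the host part and the next "/".
hostport=${url#*://}      # strip the scheme
hostport=${hostport%%/*}  # strip the path and query string
port=${hostport##*:}      # keep what follows the last ":"

echo "port:  $port"
echo "token: $token"
```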
Building a specific branch, commit or tag#
To build a particular branch, commit, or tag, use the --ref argument and specify the branch name, tag, or commit hash. For example:
jupyter-repo2docker --ref 9ced85dd9a84859d0767369e58f33912a214a3cf https://github.com/norvig/pytudes
Tip
For reproducible builds, we recommend specifying a commit-hash to deterministically build a fixed version of a repository. Not specifying a commit-hash will result in the latest commit of the repository being built.
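One way to follow this recommendation while still starting from a branch name is to resolve the branch to a commit hash first and pass that hash to --ref. The sketch below assumes git is installed and uses git ls-remote for the lookup. The helper names resolve_branch and build_pinned are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the commit hash a branch currently points to.
#   $1 = Git repository URL, $2 = branch name
resolve_branch() {
    # git ls-remote prints "<hash><TAB><ref>"; keep only the hash.
    git ls-remote "$1" "refs/heads/$2" | cut -f1
}

# Hypothetical helper: build the commit that a branch points to right now.
#   $1 = Git repository URL, $2 = branch name
build_pinned() {
    commit=$(resolve_branch "$1" "$2")
    if [ -z "$commit" ]; then
        echo "branch '$2' not found in $1" >&2
        return 1
    fi
    # Report the hash so the exact build can be reproduced later.
    echo "building $1 at $commit" >&2
    jupyter-repo2docker --ref "$commit" "$1"
}

# Example call (requires Docker to be running):
#   build_pinned https://github.com/norvig/pytudes main
```

Recording the printed hash alongside your results lets you rebuild the same version later, even after the branch has moved on.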
Where to put configuration files#
repo2docker will look for configuration files in:
A folder named binder/ in the root of the repository.
A folder named .binder/ in the root of the repository.
The root directory of the repository.
Having both binder/ and .binder/ folders is not allowed.
If one of these folders exists, only configuration files in that folder are considered; configuration in the root directory will be ignored.
Check the complete list of configuration files supported by repo2docker to see how to configure the build process.
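The lookup rules described in this section can be sketched as a small shell function. This is an illustration of the rules as stated here, not repo2docker's own implementation, and find_config_dir is a hypothetical name.

```shell
#!/bin/sh
# Sketch of the lookup rules described above; not repo2docker's own code.
# Prints the directory whose configuration files would be considered.
#   $1 = path to the root of a checked-out repository
find_config_dir() {
    repo=$1
    if [ -d "$repo/binder" ] && [ -d "$repo/.binder" ]; then
        # Having both folders is not allowed.
        echo "error: both binder/ and .binder/ exist in $repo" >&2
        return 1
    elif [ -d "$repo/binder" ]; then
        echo "$repo/binder"
    elif [ -d "$repo/.binder" ]; then
        echo "$repo/.binder"
    else
        # Neither folder exists, so the root directory is used.
        echo "$repo"
    fi
}

# Example call:
#   find_config_dir a/local/directory
```

Running it against a local checkout before building is a quick way to confirm which folder's configuration files will take effect.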
Note
repo2docker builds an environment with Python 3.7 by default. If you’d like a different version, you can specify this in your configuration files.
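As one possible illustration, a repository that lists its dependencies in requirements.txt can request a Python version through a runtime.txt file. The sketch below assumes the python-<major>.<minor> format for that file; see Configuration Files for the authoritative description and for which versions your release supports. The helper name pin_python is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: request a Python version for a pip-based repository
# by writing runtime.txt (assumed format: python-<major>.<minor>).
#   $1 = directory holding the configuration files
#   $2 = Python version, for example 3.9
pin_python() {
    printf 'python-%s\n' "$2" > "$1/runtime.txt"
}

# Example call:
#   pin_python a/local/directory 3.9
```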
Debugging repo2docker with --debug and --no-build#
To debug the Docker image being built, pass the --debug parameter:
jupyter-repo2docker --debug https://github.com/norvig/pytudes
This will print the generated Dockerfile, build it, and run it.
To see the generated Dockerfile without actually building it, pass --no-build on the command line. This Dockerfile output is intended only for debugging repo2docker; it cannot be used by Docker directly.
jupyter-repo2docker --no-build --debug https://github.com/norvig/pytudes
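The debug output can be long, so it is often easier to read from a file. The following sketch saves it for later inspection. It redirects both output streams because this page does not say which stream the generated Dockerfile is written to. The helper name dump_dockerfile is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: save the debug output, including the generated
# Dockerfile, to a file without building anything.
#   $1 = source repository, $2 = output file
dump_dockerfile() {
    # Capture stdout and stderr together so nothing is lost regardless of
    # which stream the log messages are written to.
    jupyter-repo2docker --no-build --debug "$1" > "$2" 2>&1
}

# Example call:
#   dump_dockerfile https://github.com/norvig/pytudes pytudes-debug.log
```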
Command line API#
jupyter-repo2docker#
Fetch a repository and build a container image
usage: jupyter-repo2docker [-h] [--help-all] [--version] [--config CONFIG] [--json-logs] [--image-name IMAGE_NAME]
[--ref REF] [--debug] [--no-build] [--build] [--build-memory-limit BUILD_MEMORY_LIMIT]
[--no-run] [--run] [--publish PORTS] [--publish-all] [--no-clean] [--clean] [--push]
[--no-push] [--volume VOLUMES] [--user-id USER_ID] [--user-name USER_NAME] [--env ENVIRONMENT]
[--editable] [--target-repo-dir TARGET_REPO_DIR] [--appendix APPENDIX] [--label LABELS]
[--build-arg BUILD_ARGS] [--subdir SUBDIR] [--cache-from CACHE_FROM] [--engine ENGINE]
repo ...
- repo#
Path to the repository that should be built. Can be a local path or a Git URL.
- cmd#
Custom command to run after building container
- -h, --help#
show this help message and exit
- --help-all#
Display all configurable options and exit.
- --version#
Print the repo2docker version and exit.
- --config <config>#
Path to config file for repo2docker
- --json-logs#
Emit JSON logs instead of human readable logs
- --image-name <image_name>#
Name of image to be built. If unspecified will be autogenerated
- --ref <ref>#
Reference to build instead of default reference. For example branch name or commit for a Git repository.
- --debug#
Turn on debug logging
- --no-build#
Do not actually build the image. Useful in conjunction with --debug.
- --build#
Build the image (default)
- --build-memory-limit <build_memory_limit>#
Total memory that can be used by the docker build process
- --no-run#
Do not run container after it has been built
- --run#
Run container after it has been built (default).
- --publish <ports>, -p <ports>#
Specify port mappings for the image. Needs a command to run in the container.
- --publish-all, -P#
Publish all exposed ports to random host ports.
- --no-clean#
Don’t clean up remote checkouts after we are done
- --clean#
Clean up remote checkouts after we are done (default).
- --push#
Push docker image to repository
- --no-push#
Don’t push docker image to repository (default).
- --volume <volumes>, -v <volumes>#
Volumes to mount inside the container, in form src:dest
- --user-id <user_id>#
User ID of the primary user in the image
- --user-name <user_name>#
Username of the primary user in the image
- --env <environment>, -e <environment>#
Environment variables to define at container run time
- --editable, -E#
Use the local repository in edit mode
- --target-repo-dir <target_repo_dir>#
Path inside the image where contents of the repositories are copied to, and where all the build operations (such as postBuild) happen. Defaults to ${HOME} if not set
- --appendix <appendix>#
Appendix of Dockerfile commands to run at the end of the build. Can be used to customize the resulting image after all standard build steps finish.
- --label <labels>#
Extra label to set on the image, in form name=value
- --build-arg <build_args>#
Extra build arg to pass to the build process, in form name=value
- --subdir <subdir>#
Subdirectory of the git repository to examine. Defaults to ‘’.
- --cache-from <cache_from>#
List of images to try and re-use cached image layers from. Docker only tries to re-use image layers from images built locally, not pulled from a registry. We can ask it to explicitly re-use layers from non-locally built images through the ‘cache_from’ parameter.
- --engine <engine>#
Name of the container engine. Defaults to ‘docker’.
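Several of the options above are commonly combined. As a hedged sketch, the wrapper below builds a fixed reference, names the image, records the reference as an image label in the name=value form, and skips the run step. The function name build_tagged and the label key source-ref are hypothetical choices for illustration.

```shell
#!/bin/sh
# Hypothetical helper: build a fixed reference and record it on the image.
#   $1 = source repository
#   $2 = reference to build (branch name or commit)
#   $3 = name for the resulting image
build_tagged() {
    jupyter-repo2docker \
        --ref "$2" \
        --image-name "$3" \
        --label "source-ref=$2" \
        --no-run \
        "$1"
}

# Example call (requires Docker to be running):
#   build_tagged https://github.com/norvig/pytudes \
#       9ced85dd9a84859d0767369e58f33912a214a3cf pytudes-env
```

Storing the reference as a label keeps the provenance with the image itself, which helps when the image is later pushed or shared.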