datalad install - install a dataset from a (remote)
source.
datalad install [-h] [-s SOURCE]
[-d DATASET] [-g] [-D DESCRIPTION] [-r]
[--recursion-limit LEVELS] [--nosave] [--reckless] [-J NJOBS]
[PATH [PATH ...]]
This command creates a local sibling of an existing dataset from a
(remote) location identified via a URL or path. Recursion into potential
subdatasets and download of all referenced data are optionally supported. The
new dataset can be optionally registered in an existing superdataset by
identifying it via the DATASET argument (the new dataset's path needs to be
located within the superdataset for that).
It is recommended to provide a brief description to label the
dataset's nature *and* location, e.g. "Michael's music on black
laptop". This helps humans to identify data locations in distributed
scenarios. By default, an identifier composed of the user name, machine
name, and path is generated.
To obtain only part of a dataset's content, run this command without
the GET-DATA flag (-g, --get-data), then use a `get` operation to obtain
the desired data.
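The two-step workflow described above can be sketched in Python as follows. This is a minimal illustration, not part of datalad itself: the helper names, the example.com URL, and the file paths are hypothetical, and it assumes that `datalad get` accepts one or more paths relative to the dataset directory. With dry_run=True (the default here) the commands are only printed, not executed.

```python
import shlex
import subprocess


def partial_install_commands(source, target, wanted_paths):
    """Build the two commands: install without -g, then get selected paths.

    All names here are illustrative; only the -s option of install
    comes from the option list in this manual page.
    """
    # No -g/--get-data: only the dataset itself is installed, no data content.
    install_cmd = ["datalad", "install", "-s", source, target]
    # Assumption: `datalad get` takes the desired paths as positional arguments.
    get_cmd = ["datalad", "get", *wanted_paths]
    return install_cmd, get_cmd


def run_partial_install(source, target, wanted_paths, dry_run=True):
    install_cmd, get_cmd = partial_install_commands(source, target, wanted_paths)
    if dry_run:
        # Show what would be executed without touching the file system.
        print(shlex.join(install_cmd))
        print(f"(cd {shlex.quote(target)} && {shlex.join(get_cmd)})")
        return
    subprocess.run(install_cmd, check=True)
    # Run `get` from inside the freshly installed dataset.
    subprocess.run(get_cmd, cwd=target, check=True)


if __name__ == "__main__":
    # Hypothetical source URL and file path, for illustration only.
    run_partial_install("https://example.com/ds", "ds", ["sub/a.dat"])
```

Keeping command construction separate from execution makes it possible to inspect the exact command lines before anything is downloaded.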
- NOTE
- Power-user info: This command uses git clone, and git annex init to
prepare the dataset. Registering to a superdataset is performed via a git
submodule add operation in the discovered superdataset.
- PATH
- path/name of the installation target. If no PATH is provided, a destination
path is derived from the source URL, similar to git clone. [Default:
None]
- -h, --help,
--help-np
- show this help message. --help-np forcefully disables the use of a pager
for displaying the help message
- -s SOURCE, --source
SOURCE
- URL or local path of the installation source. Constraints: value must be a
string [Default: None]
- -d DATASET,
--dataset DATASET
- specify the dataset to perform the install operation on. If no dataset is
given, an attempt is made to identify the dataset in a parent directory of
the current working directory and/or the PATH given. Constraints: Value
must be a Dataset or a valid identifier of a Dataset (e.g. a path)
[Default: None]
- -g,
--get-data
- if given, obtain all data content too. [Default: False]
- -D DESCRIPTION,
--description DESCRIPTION
- short description to use for a dataset location. Its primary purpose is to
help humans to identify a dataset copy (e.g., "mike's dataset on lab
server"). Note that when a dataset is published, this information
becomes available on the remote side. Constraints: value must be a string
[Default: None]
- -r,
--recursive
- if set, recurse into potential subdatasets. [Default: False]
- --recursion-limit
LEVELS
- limit recursion into subdatasets to the given number of levels.
Constraints: value must be convertible to type 'int' [Default: None]
- --nosave
- by default all modifications to a dataset are immediately saved. Giving
this option will disable this behavior. [Default: True]
- --reckless
- Set up the dataset to be able to obtain content in the cheapest/fastest
possible way, even if this poses a potential risk to data integrity (e.g.
hardlink files from a local clone of the dataset). Use with care, and
limit to "read-only" use cases. With this flag the installed
dataset will be marked as untrusted. [Default: False]
- -J NJOBS, --jobs
NJOBS
- how many parallel jobs (where possible) to use. Constraints: value must be
convertible to type 'int', or value must be one of ('auto',) [Default:
'auto']
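As an illustration of how the options above combine, the following Python sketch assembles a `datalad install` command line from keyword arguments. The function name and its parameters are hypothetical and exist only for this example; the flags are the ones listed above. It also mirrors the stated constraints: the recursion limit must be convertible to int, and the job count must be convertible to int or be 'auto'. Nothing is executed; the function only returns the argument list.

```python
def build_install_argv(path=None, source=None, dataset=None, get_data=False,
                       description=None, recursive=False, recursion_limit=None,
                       reckless=False, jobs=None):
    """Return an argv list for `datalad install` (hypothetical helper)."""
    argv = ["datalad", "install"]
    if source is not None:
        argv += ["-s", source]
    if dataset is not None:
        # Register the new dataset in this superdataset; PATH must lie within it.
        argv += ["-d", dataset]
    if get_data:
        argv.append("-g")
    if description is not None:
        argv += ["-D", description]
    if recursive:
        argv.append("-r")
    if recursion_limit is not None:
        # Constraint from the option list: convertible to int.
        argv += ["--recursion-limit", str(int(recursion_limit))]
    if reckless:
        argv.append("--reckless")
    if jobs is not None:
        # Constraint from the option list: an int, or the literal 'auto'.
        if jobs != "auto":
            jobs = int(jobs)
        argv += ["-J", str(jobs)]
    if path is not None:
        argv.append(path)
    return argv


if __name__ == "__main__":
    # Hypothetical URL and paths, for illustration only.
    print(build_install_argv(path="super/sub", source="https://example.com/ds",
                             dataset="super", recursive=True, recursion_limit=2))
```

The resulting list can be passed to subprocess.run once the command line looks right.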
datalad is developed by The DataLad Team and Contributors
<team@datalad.org>.