datalad clone(1) | General Commands Manual | datalad clone(1) |
datalad clone - obtain a dataset (copy) from a URL or local directory
datalad clone [-h] [-d DATASET] [-D DESCRIPTION] [--reckless [auto|ephemeral|shared-...]] [--version] SOURCE [PATH] ...
The purpose of this command is to obtain a new clone (copy) of a dataset and place it into a not-yet-existing or empty directory. As such, CLONE provides a strict subset of the functionality offered by `install`. Only a single dataset can be obtained, and immediate recursive installation of subdatasets is not supported. However, once a (super)dataset is installed via CLONE, any content, including subdatasets, can be obtained by a subsequent `get` command.
Primary differences over a direct `git clone` call are 1) the automatic initialization of a dataset annex (pure Git repositories are equally supported); 2) automatic registration of the newly obtained dataset as a subdataset (submodule), if a parent dataset is specified; 3) support for additional resource identifiers (DataLad resource identifiers as used on datasets.datalad.org, and RIA store URLs as used for store.datalad.org, optionally in specific versions as identified by a branch or a tag; see examples); and 4) automatic, configurable generation of alternative access URLs for common cases (such as appending '.git' to the URL in case accessing the base URL failed).
In case the clone is registered as a subdataset, the original URL passed to CLONE is recorded in `.gitmodules` of the parent dataset in addition to the resolved URL used internally for git-clone. This makes it possible to preserve DataLad-specific URLs like ria+ssh://... for subsequent calls to GET if the subdataset is later removed locally.
URL mapping configuration
CLONE supports the transformation of URLs via (multi-part) substitution specifications. A substitution specification is defined as a configuration setting 'datalad.clone.url-substitute.<seriesID>' with a string containing a match and substitution expression, each following Python's regular expression syntax. Both expressions are concatenated to a single string with an arbitrary delimiter character. The delimiter is defined by prefixing the string with the delimiter. Prefix and delimiter are stripped from the expressions (Example: ",^http://(.*)$,https://\1"). This setting can be defined multiple times, using the same '<seriesID>'. Substitutions in a series will be applied incrementally, in order of their definition. The first substitution in such a series must match, otherwise no further substitutions in a series will be considered. However, following the first match, all further substitutions in a series are processed, regardless of whether intermediate expressions match or not. Substitution series themselves have no particular order; each matching series will result in a candidate clone URL. Consequently, the initial match specification in a series should be as precise as possible to prevent inflation of candidate URLs.
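The series semantics described above can be sketched in a few lines of Python. This is only an illustration of the documented rules, not DataLad's actual implementation; the function names `parse_substitution` and `apply_series` and the example URLs are hypothetical.

```python
import re


def parse_substitution(spec):
    """Split a specification into (match expression, substitution expression).

    The first character of the string is the delimiter, e.g. the ','
    in ',^http://(.*)$,https://\\1'.
    """
    delimiter = spec[0]
    parts = spec[1:].split(delimiter, 1)
    if len(parts) != 2:
        raise ValueError(f"missing delimiter in specification: {spec!r}")
    return parts[0], parts[1]


def apply_series(url, specs):
    """Apply a substitution series to a URL (hypothetical helper).

    Returns the candidate URL, or None if the first substitution in the
    series does not match. Later substitutions are applied whether or
    not they match.
    """
    for index, spec in enumerate(specs):
        match_expr, subst_expr = parse_substitution(spec)
        url, count = re.subn(match_expr, subst_expr, url)
        if index == 0 and count == 0:
            # The first expression must match, else the series is skipped.
            return None
    return url


# A two-step series: upgrade to https, then strip a trailing slash.
series = [r",^http://(.*)$,https://\1", r",/$,"]
print(apply_series("http://example.com/ds/", series))
print(apply_series("ftp://example.com/ds", series))
```

Under these assumptions the first call yields a candidate URL with the scheme upgraded and the trailing slash removed, while the second yields `None` because the initial match expression does not apply.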
SEEALSO
handbook:3-001 (http://handbook.datalad.org/symbols)
More information on Remote Indexed Archive (RIA) stores
Install a dataset from GitHub into the current directory::
% datalad clone https://github.com/datalad-datasets/longnow-podcasts.git
Install a dataset into a specific directory::
% datalad clone https://github.com/datalad-datasets/longnow-podcasts.git \
  myfavpodcasts
Install a dataset as a subdataset into the current dataset::
% datalad clone -d . \
  https://github.com/datalad-datasets/longnow-podcasts.git
Install the main superdataset from datasets.datalad.org::
% datalad clone ///
Install a dataset identified by a literal alias from store.datalad.org::
% datalad clone ria+http://store.datalad.org#~hcp-openaccess
Install a dataset in a specific version as identified by a branch or tag name from store.datalad.org::
% datalad clone \
  ria+http://store.datalad.org#76b6ca66-36b1-11ea-a2e6-f0d5bf7b5561@myidentifier
Install a dataset with group-write access permissions::
% datalad clone http://example.com/dataset --reckless shared-group
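The RIA store URLs in the examples above appear to follow the pattern `ria+<store URL>#<dataset ID or ~alias>[@<branch or tag>]`. The following Python sketch takes such a URL apart under that assumption. It is inferred from the examples only and is not DataLad's own parser; the function name `parse_ria_url` and the returned field names are hypothetical.

```python
def parse_ria_url(url):
    """Split a RIA store URL into store, dataset, and optional version.

    Assumed layout (inferred from the examples, not an official grammar):
        ria+<store URL>#<dataset ID or ~alias>[@<branch or tag>]
    """
    prefix = "ria+"
    if not url.startswith(prefix):
        raise ValueError(f"not a RIA URL: {url!r}")
    # Split at '#' first, so an '@' in the store part (user@host) is kept.
    store, sep, fragment = url[len(prefix):].partition("#")
    if not sep or not fragment:
        raise ValueError(f"no dataset given in RIA URL: {url!r}")
    dataset, _, version = fragment.partition("@")
    is_alias = dataset.startswith("~")
    return {
        "store": store,
        "dataset": dataset[1:] if is_alias else dataset,
        "is_alias": is_alias,
        "version": version or None,
    }


print(parse_ria_url("ria+http://store.datalad.org#~hcp-openaccess"))
```

For the alias example, the sketch reports the store `http://store.datalad.org`, the alias `hcp-openaccess`, and no version.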
datalad is developed by The DataLad Team and Contributors <team@datalad.org>.
2023-01-25 | datalad clone 0.18.1 |