Git URL Parser - libvcs.url.git

Detect, parse, and change git URLs using libvcs’s URL parser for git(1). It builds on top of the VCS-friendly URL parser framework.

Detect, parse, and validate git URLs.

libvcs.url.git.DEFAULT_RULES: list[Rule] = [Rule(label=core-git-https, description=Vanilla git pattern, URL ending with optional .git suffix, pattern=re.compile('\n        ^\n    (?P<scheme>\n      (\n        http|https\n      )\n    )\n\n        ://\n        \n    ((?P<user>\\w+)@)?\n    (?P<hostname>([^/:]+))\n    (:(?P<port>\\d{1,5}))?\n    (?P<separator>[, re.VERBOSE), defaults={}), Rule(label=core-git-scp, description=Vanilla scp(1) / ssh(1) type URL, pattern=re.compile("\n        ^(?P<scheme>ssh)?\n        \n    # Optional user, e.g. 'git@'\n    ((?P<user>\\w+)@)?\n    # Server, e.g. 'github.com'.\n    (?P<hostname>([^/:]+))\n    (?P<separator>:)\n    # The server-s, re.VERBOSE), defaults={'username': 'git'})]#

Core regular expressions. These are the URL patterns understood by git(1).

See also: https://git-scm.com/docs/git-clone#URLS
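
As a rough illustration of what a core rule captures, the sketch below matches an scp-style address with the standard-library re module. SCP_SKETCH is a simplified, hypothetical stand-in written for this example; it is not the pattern shipped in DEFAULT_RULES, whose text is truncated in the listing above.

```python
import re

# Simplified, hypothetical stand-in for an scp-style rule (not libvcs's own pattern).
SCP_SKETCH = re.compile(
    r"""
    ^((?P<user>\w+)@)?      # optional user, e.g. 'git@'
    (?P<hostname>[^/:]+)    # server, e.g. 'github.com'
    (?P<separator>:)        # scp-style separator
    (?P<path>.+?)           # repository path (lazy, so the suffix can be split off)
    (?P<suffix>\.git)?$     # optional '.git' suffix
    """,
    re.VERBOSE,
)

match = SCP_SKETCH.match("git@github.com:vcs-python/libvcs.git")
# Named groups become the parsed fields of the URL.
fields = match.groupdict() if match else {}
print(fields)
```

The named groups mirror the field names used by the parser classes below (user, hostname, path, suffix), which is why a rule's match can populate a URL object directly.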

libvcs.url.git.PIP_DEFAULT_RULES: list[Rule] = [Rule(label=pip-url, description=pip-style git URL, pattern=re.compile('\n        \n    (?P<scheme>\n      (\n        git\\+ssh|\n        git\\+https|\n        git\\+http|\n        git\\+file\n      )\n    )\n\n        ://\n        \n    ((?P<user>\\w+)@)?\n    (?P<hostn, re.VERBOSE), defaults={}, is_explicit=True), Rule(label=pip-scp-url, description=pip-style git ssh/scp URL, pattern=re.compile("\n        \n    (?P<scheme>\n      (\n        git\\+ssh|\n        git\\+file\n      )\n    )\n\n        \n    # Optional user, e.g. 'git@'\n    ((?P<user>\\w+)@)?\n    # Server, e.g. 'github.com'.\n , re.VERBOSE), defaults={}, is_explicit=True), Rule(label=pip-file-url, description=pip-style git+file:// URL, pattern=re.compile('\n        (?P<scheme>git\\+file)://\n        (?P<path>[^@]*)\n        \n    (@(?P<rev>.*))\n?\n        ', re.VERBOSE), defaults={}, is_explicit=True)]#

pip-style git URLs.

Examples of pip-style git URLs (via pip.pypa.io):

MyProject @ git+ssh://git.example.com/MyProject
MyProject @ git+file:///home/user/projects/MyProject
MyProject @ git+https://git.example.com/MyProject

Refs (via pip.pypa.io):

MyProject @ git+https://git.example.com/MyProject.git@master
MyProject @ git+https://git.example.com/MyProject.git@v1.0
MyProject @ git+https://git.example.com/MyProject.git@da39a3ee5e6b4b0d3255bfef95601890afd80709
MyProject @ git+https://git.example.com/MyProject.git@refs/pull/123/head
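
To make the anatomy of these requirement strings concrete, here is a minimal standard-library sketch that separates the project name, the URL parts, and the optional revision. The helper split_pip_git_requirement is hypothetical and written only for this illustration; it does not use or reproduce PIP_DEFAULT_RULES.

```python
from urllib.parse import urlsplit


def split_pip_git_requirement(requirement: str) -> dict:
    """Split 'Name @ git+scheme://host/path@rev' into parts (illustrative only)."""
    name, _, url = requirement.partition(" @ ")
    parts = urlsplit(url)
    # Any '@' in the netloc belongs to the user; an '@' in the path starts the revision.
    path, _, rev = parts.path.partition("@")
    return {
        "name": name,
        "scheme": parts.scheme,
        "hostname": parts.hostname,
        "path": path,
        "rev": rev or None,
    }


print(split_pip_git_requirement(
    "MyProject @ git+https://git.example.com/MyProject.git@refs/pull/123/head"
))
```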

libvcs.url.git.NPM_DEFAULT_RULES: list[Rule] = []

NPM-style git URLs.

Git URL pattern (from docs.npmjs.com):

<protocol>://[<user>[:<password>]@]<hostname>[:<port>][:][/]<path>[#<commit-ish> | #semver:<semver>]

Examples of NPM-style git URLs (from docs.npmjs.com):

ssh://git@github.com:npm/cli.git#v1.0.27
git+ssh://git@github.com:npm/cli#semver:^5.0
git+https://isaacs@github.com/npm/cli.git
git://github.com/npm/cli.git#v1.0.27
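
Since NPM_DEFAULT_RULES is empty, no rule is available to demonstrate. As a sketch of the trailing part of the pattern above, the hypothetical helper below separates the #<commit-ish> and #semver:<semver> forms using plain string operations; it is an illustration only, not part of libvcs.

```python
def split_npm_fragment(url: str) -> dict:
    """Split an npm-style git URL into base URL and commit-ish or semver range."""
    base, _, fragment = url.partition("#")
    if fragment.startswith("semver:"):
        # '#semver:<semver>' form: keep only the range after the prefix.
        return {"url": base, "commit_ish": None, "semver": fragment[len("semver:"):]}
    # '#<commit-ish>' form, or no fragment at all.
    return {"url": base, "commit_ish": fragment or None, "semver": None}


print(split_npm_fragment("git+ssh://git@github.com:npm/cli#semver:^5.0"))
print(split_npm_fragment("ssh://git@github.com:npm/cli.git#v1.0.27"))
```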

class libvcs.url.git.GitBaseURL(url, scheme=None, user=None, hostname=None, port=None, path=None, suffix=None, rule=None)

Bases: URLProtocol, SkipDefaultFieldsReprMixin

Git repository location. Parses URLs on initialization.

Examples

>>> GitBaseURL(url='https://github.com/vcs-python/libvcs.git')
GitBaseURL(url=https://github.com/vcs-python/libvcs.git,
        scheme=https,
        hostname=github.com,
        path=vcs-python/libvcs,
        suffix=.git,
        rule=core-git-https)
>>> myrepo = GitBaseURL(url='https://github.com/myproject/myrepo.git')
>>> myrepo.hostname
'github.com'
>>> myrepo.path
'myproject/myrepo'
>>> GitBaseURL(url='git@github.com:vcs-python/libvcs.git')
GitBaseURL(url=git@github.com:vcs-python/libvcs.git,
        user=git,
        hostname=github.com,
        path=vcs-python/libvcs,
        suffix=.git,
        rule=core-git-scp)
Parameters:
  • url (str) –

  • scheme (str | None) –

  • user (str | None) –

  • hostname (str | None) –

  • port (int | None) –

  • path (str | None) –

  • suffix (str | None) –

  • rule (str | None) –

rule

Name of the Rule that matched the URL.

Type:

str

url: str
scheme: Optional[str] = None
user: Optional[str] = None
hostname: Optional[str] = None
port: Optional[int] = None
path: Optional[str] = None
suffix: Optional[str] = None
rule: Optional[str] = None
rule_map = RuleMap(_rule_map={'core-git-https': Rule(label=core-git-https, description=Vanilla git pattern, URL ending with optional .git suffix, pattern=re.compile('\n        ^\n    (?P<scheme>\n      (\n        http|https\n      )\n    )\n\n        ://\n        \n    ((?P<user>\\w+)@)?\n    (?P<hostname>([^/:]+))\n    (:(?P<port>\\d{1,5}))?\n    (?P<separator>[, re.VERBOSE), defaults={}), 'core-git-scp': Rule(label=core-git-scp, description=Vanilla scp(1) / ssh(1) type URL, pattern=re.compile("\n        ^(?P<scheme>ssh)?\n        \n    # Optional user, e.g. 'git@'\n    ((?P<user>\\w+)@)?\n    # Server, e.g. 'github.com'.\n    (?P<hostname>([^/:]+))\n    (?P<separator>:)\n    # The server-s, re.VERBOSE), defaults={'username': 'git'})})#
classmethod is_valid(url, is_explicit=None)

Whether the URL is compatible with this VCS (git).

Return type:

bool

Parameters:
  • url (str) –

  • is_explicit (bool | None) –

Examples

>>> GitBaseURL.is_valid(url='https://github.com/vcs-python/libvcs.git')
True
>>> GitBaseURL.is_valid(url='git@github.com:vcs-python/libvcs.git')
True
>>> GitBaseURL.is_valid(url='notaurl')
False

Unambiguous VCS detection

Sometimes you may want to match a VCS exclusively, without any chance of ambiguity, e.g. to detect outright which VCS is being used.

>>> GitBaseURL.is_valid(
...     url='git@github.com:vcs-python/libvcs.git', is_explicit=True
... )
False

For explicit matching, see the examples under GitPipURL.is_valid() and GitURL.is_valid().
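
The effect of is_explicit can be sketched with a small stand-in for a rule list. SketchRule, sketch_is_valid, and both patterns below are assumptions made up for this illustration; they only model the behavior shown in the examples above and are not libvcs's implementation.

```python
import re
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class SketchRule:
    """Hypothetical stand-in for a rule: a pattern plus an 'explicit' flag."""
    label: str
    pattern: re.Pattern
    is_explicit: bool = False


def sketch_is_valid(url: str, rules: Iterable[SketchRule],
                    is_explicit: Optional[bool] = None) -> bool:
    for rule in rules:
        if rule.pattern.search(url) is None:
            continue
        # Without is_explicit, any matching rule counts; otherwise the flag must agree.
        if is_explicit is None or rule.is_explicit == is_explicit:
            return True
    return False


RULES = [
    # Lax pattern: many non-git addresses would match it too.
    SketchRule("scp-like", re.compile(r"^(\w+@)?[^/:]+:.+")),
    # The 'git+' prefix names the VCS, so a match is unambiguous.
    SketchRule("pip-like", re.compile(r"^git\+(ssh|https?|file)://"), is_explicit=True),
]

print(sketch_is_valid("git@github.com:vcs-python/libvcs.git", RULES))
print(sketch_is_valid("git@github.com:vcs-python/libvcs.git", RULES, is_explicit=True))
```

The scp-style address passes the default check but fails the explicit one, because only a lax, non-explicit rule matches it.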

to_url()

Return a git(1)-compatible URL. Can be used with git clone.

Return type:

str

Examples

>>> git_url = GitBaseURL(url='git@github.com:vcs-python/libvcs.git')
>>> git_url
GitBaseURL(url=git@github.com:vcs-python/libvcs.git,
        user=git,
        hostname=github.com,
        path=vcs-python/libvcs,
        suffix=.git,
        rule=core-git-scp)

Switch the repository from libvcs to vcspull:

>>> git_url.path = 'vcs-python/vcspull'
>>> git_url.to_url()
'git@github.com:vcs-python/vcspull.git'

Switch the host to GitLab:

>>> git_url.hostname = 'gitlab.com'
>>> git_url.to_url()
'git@gitlab.com:vcs-python/vcspull.git'
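
The idea behind this round trip, parse into fields, mutate a field, then reassemble, can be sketched with a plain dataclass. ScpLocation is a hypothetical class for illustration and covers only the scp-style form; it is not how GitBaseURL is implemented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScpLocation:
    """Hypothetical holder for the fields of an scp-style git address."""
    hostname: str
    path: str
    user: Optional[str] = None
    suffix: Optional[str] = None

    def to_url(self) -> str:
        # Reassemble '[user@]hostname:path[suffix]' from the current field values.
        prefix = f"{self.user}@" if self.user else ""
        return f"{prefix}{self.hostname}:{self.path}{self.suffix or ''}"


loc = ScpLocation(user="git", hostname="github.com",
                  path="vcs-python/libvcs", suffix=".git")
loc.path = "vcs-python/vcspull"
loc.hostname = "gitlab.com"
print(loc.to_url())  # git@gitlab.com:vcs-python/vcspull.git
```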
_abc_impl = <_abc._abc_data object>#
_is_protocol = False#
class libvcs.url.git.GitPipURL(url, scheme=None, user=None, hostname=None, port=None, path=None, suffix=None, rule=None, rev=None)

Bases: GitBaseURL, URLProtocol, SkipDefaultFieldsReprMixin

Supports pip git URLs.

Parameters:
  • url (str) –

  • scheme (str | None) –

  • user (str | None) –

  • hostname (str | None) –

  • port (int | None) –

  • path (str | None) –

  • suffix (str | None) –

  • rule (str | None) –

  • rev (str | None) –

rev: Optional[str] = None
rule_map = RuleMap(_rule_map={'pip-url': Rule(label=pip-url, description=pip-style git URL, pattern=re.compile('\n        \n    (?P<scheme>\n      (\n        git\\+ssh|\n        git\\+https|\n        git\\+http|\n        git\\+file\n      )\n    )\n\n        ://\n        \n    ((?P<user>\\w+)@)?\n    (?P<hostn, re.VERBOSE), defaults={}, is_explicit=True), 'pip-scp-url': Rule(label=pip-scp-url, description=pip-style git ssh/scp URL, pattern=re.compile("\n        \n    (?P<scheme>\n      (\n        git\\+ssh|\n        git\\+file\n      )\n    )\n\n        \n    # Optional user, e.g. 'git@'\n    ((?P<user>\\w+)@)?\n    # Server, e.g. 'github.com'.\n , re.VERBOSE), defaults={}, is_explicit=True), 'pip-file-url': Rule(label=pip-file-url, description=pip-style git+file:// URL, pattern=re.compile('\n        (?P<scheme>git\\+file)://\n        (?P<path>[^@]*)\n        \n    (@(?P<rev>.*))\n?\n        ', re.VERBOSE), defaults={}, is_explicit=True)})#
to_url()

Export a pip-compliant URL.

Return type:

str

Examples

>>> git_url = GitPipURL(
...     url='git+ssh://git@bitbucket.example.com:7999/PROJ/repo.git'
... )
>>> git_url
GitPipURL(url=git+ssh://git@bitbucket.example.com:7999/PROJ/repo.git,
        scheme=git+ssh,
        user=git,
        hostname=bitbucket.example.com,
        port=7999,
        path=PROJ/repo,
        suffix=.git,
        rule=pip-url)
>>> git_url.path = 'libvcs/vcspull'
>>> git_url.to_url()
'git+ssh://bitbucket.example.com/libvcs/vcspull.git'

It also accepts revisions, e.g. branch, tag, ref:

>>> git_url = GitPipURL(
...     url='git+https://github.com/vcs-python/libvcs.git@v0.10.0'
... )
>>> git_url
GitPipURL(url=git+https://github.com/vcs-python/libvcs.git@v0.10.0,
        scheme=git+https,
        hostname=github.com,
        path=vcs-python/libvcs,
        suffix=.git,
        rule=pip-url,
        rev=v0.10.0)
>>> git_url.path = 'libvcs/vcspull'
>>> git_url.to_url()
'git+https://github.com/libvcs/vcspull.git@v0.10.0'
classmethod is_valid(url, is_explicit=None)

Whether the URL matches pip's git VCS URL pattern.

Return type:

bool

Parameters:
  • url (str) –

  • is_explicit (bool | None) –

Examples

Does not match plain git(1) URLs; use GitURL.is_valid() for those.

>>> GitPipURL.is_valid(url='https://github.com/vcs-python/libvcs.git')
False
>>> GitPipURL.is_valid(url='git@github.com:vcs-python/libvcs.git')
False

Pip-style URLs:

>>> GitPipURL.is_valid(url='git+https://github.com/vcs-python/libvcs.git')
True
>>> GitPipURL.is_valid(url='git+ssh://git@github.com:vcs-python/libvcs.git')
True
>>> GitPipURL.is_valid(url='notaurl')
False

Explicit VCS detection

Pip-style URLs are prefixed with the VCS name, so their rules can unambiguously identify the type of VCS:

>>> GitPipURL.is_valid(
...     url='git+ssh://git@github.com:vcs-python/libvcs.git', is_explicit=True
... )
True
_abc_impl = <_abc._abc_data object>#
_is_protocol = False#
class libvcs.url.git.GitURL(url, scheme=None, user=None, hostname=None, port=None, path=None, suffix=None, rule=None, rev=None)

Bases: GitPipURL, GitBaseURL, URLProtocol, SkipDefaultFieldsReprMixin

Batteries-included URL parser. Supports git(1) and pip URLs.

Ancestors (MRO): this URL parser inherits methods and attributes from the following parsers: GitPipURL, GitBaseURL.

Parameters:
  • url (str) –

  • scheme (str | None) –

  • user (str | None) –

  • hostname (str | None) –

  • port (int | None) –

  • path (str | None) –

  • suffix (str | None) –

  • rule (str | None) –

  • rev (str | None) –

_abc_impl = <_abc._abc_data object>#
_is_protocol = False#
rule_map = RuleMap(_rule_map={'core-git-https': Rule(label=core-git-https, description=Vanilla git pattern, URL ending with optional .git suffix, pattern=re.compile('\n        ^\n    (?P<scheme>\n      (\n        http|https\n      )\n    )\n\n        ://\n        \n    ((?P<user>\\w+)@)?\n    (?P<hostname>([^/:]+))\n    (:(?P<port>\\d{1,5}))?\n    (?P<separator>[, re.VERBOSE), defaults={}), 'core-git-scp': Rule(label=core-git-scp, description=Vanilla scp(1) / ssh(1) type URL, pattern=re.compile("\n        ^(?P<scheme>ssh)?\n        \n    # Optional user, e.g. 'git@'\n    ((?P<user>\\w+)@)?\n    # Server, e.g. 'github.com'.\n    (?P<hostname>([^/:]+))\n    (?P<separator>:)\n    # The server-s, re.VERBOSE), defaults={'username': 'git'}), 'pip-url': Rule(label=pip-url, description=pip-style git URL, pattern=re.compile('\n        \n    (?P<scheme>\n      (\n        git\\+ssh|\n        git\\+https|\n        git\\+http|\n        git\\+file\n      )\n    )\n\n        ://\n        \n    ((?P<user>\\w+)@)?\n    (?P<hostn, re.VERBOSE), defaults={}, is_explicit=True), 'pip-scp-url': Rule(label=pip-scp-url, description=pip-style git ssh/scp URL, pattern=re.compile("\n        \n    (?P<scheme>\n      (\n        git\\+ssh|\n        git\\+file\n      )\n    )\n\n        \n    # Optional user, e.g. 'git@'\n    ((?P<user>\\w+)@)?\n    # Server, e.g. 'github.com'.\n , re.VERBOSE), defaults={}, is_explicit=True), 'pip-file-url': Rule(label=pip-file-url, description=pip-style git+file:// URL, pattern=re.compile('\n        (?P<scheme>git\\+file)://\n        (?P<path>[^@]*)\n        \n    (@(?P<rev>.*))\n?\n        ', re.VERBOSE), defaults={}, is_explicit=True)})#
classmethod is_valid(url, is_explicit=None)

Whether the URL is compatible with the included git URL rule_map.

Return type:

bool

Parameters:
  • url (str) –

  • is_explicit (bool | None) –

Examples

Matches plain git(1) URLs:

>>> GitURL.is_valid(url='https://github.com/vcs-python/libvcs.git')
True
>>> GitURL.is_valid(url='git@github.com:vcs-python/libvcs.git')
True

Pip-style URLs:

>>> GitURL.is_valid(url='git+https://github.com/vcs-python/libvcs.git')
True
>>> GitURL.is_valid(url='git+ssh://git@github.com:vcs-python/libvcs.git')
True
>>> GitURL.is_valid(url='notaurl')
False

Explicit VCS detection

Pip-style URLs are prefixed with the VCS name, so their rules can unambiguously identify the type of VCS:

>>> GitURL.is_valid(
...     url='git+ssh://git@github.com:vcs-python/libvcs.git', is_explicit=True
... )
True

Below, although the host is GitHub, the URL by itself is not conclusively a git URL, because the scp-style pattern is too lax to be treated as explicit:

>>> GitURL.is_valid(
...     url='git@github.com:vcs-python/libvcs.git', is_explicit=True
... )
False

You could create a GitHub rule that considers github.com hostnames to be exclusively git:

>>> GitHubRule = Rule(
...     # Since github.com exclusively serves git repos, make explicit
...     label='gh-rule',
...     description='Matches github.com scp-style URLs, exact VCS match',
...     pattern=re.compile(
...         rf'''
...         ^(?P<scheme>ssh)?
...         ((?P<user>\w+)@)?
...         (?P<hostname>github\.com):
...         (?P<path>(\w[^:]+))
...         {RE_SUFFIX}?
...         ''',
...         re.VERBOSE,
...     ),
...     is_explicit=True,
...     defaults={
...         'hostname': 'github.com'
...     },
...     weight=100,
... )
>>> GitURL.rule_map.register(GitHubRule)
>>> GitURL.is_valid(
...     url='git@github.com:vcs-python/libvcs.git', is_explicit=True
... )
True
>>> GitURL(url='git@github.com:vcs-python/libvcs.git').rule
'gh-rule'

Clean up by unregistering the rule:

>>> GitURL.rule_map.unregister('gh-rule')
>>> GitURL(url='git@github.com:vcs-python/libvcs.git').rule
'core-git-scp'
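
The register / unregister cycle above can be pictured as a small registry in which heavier rules are tried first. The classes below are a hypothetical sketch of that idea, assuming a higher weight means earlier matching; they do not reproduce RuleMap's actual code.

```python
import re
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class WeightedRule:
    """Hypothetical rule with a weight used to order matching."""
    label: str
    pattern: re.Pattern
    weight: int = 0


class SketchRuleMap:
    def __init__(self) -> None:
        self._rules: Dict[str, WeightedRule] = {}

    def register(self, rule: WeightedRule) -> None:
        self._rules[rule.label] = rule

    def unregister(self, label: str) -> None:
        self._rules.pop(label, None)

    def first_match(self, url: str) -> Optional[str]:
        # Try heavier rules first, so a specific rule can shadow a generic one.
        ordered = sorted(self._rules.values(), key=lambda r: r.weight, reverse=True)
        for rule in ordered:
            if rule.pattern.search(url):
                return rule.label
        return None


rules = SketchRuleMap()
rules.register(WeightedRule("scp-sketch", re.compile(r"^(\w+@)?[^/:]+:")))
rules.register(WeightedRule("gh-sketch", re.compile(r"^(\w+@)?github\.com:"), weight=100))

url = "git@github.com:vcs-python/libvcs.git"
with_gh = rules.first_match(url)
rules.unregister("gh-sketch")
without_gh = rules.first_match(url)
print(with_gh, without_gh)  # gh-sketch scp-sketch
```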
to_url()

Return a git(1)-compatible URL. Can be used with git clone.

Return type:

str

Examples

SSH style URL:

>>> git_url = GitURL(url='git@github.com:vcs-python/libvcs')
>>> git_url.path = 'vcs-python/vcspull'
>>> git_url.to_url()
'git@github.com:vcs-python/vcspull'

HTTPS URL:

>>> git_url = GitURL(url='https://github.com/vcs-python/libvcs.git')
>>> git_url.path = 'vcs-python/vcspull'
>>> git_url.to_url()
'https://github.com/vcs-python/vcspull.git'

Switch the host to GitLab:

>>> git_url.hostname = 'gitlab.com'
>>> git_url.to_url()
'https://gitlab.com/vcs-python/vcspull.git'

Pip-style URL, supported because this class inherits from GitPipURL:

>>> git_url = GitURL(url='git+ssh://git@github.com/vcs-python/libvcs')
>>> git_url.hostname = 'gitlab.com'
>>> git_url.to_url()
'git+ssh://gitlab.com/vcs-python/libvcs'