Git URL Parser - libvcs.url.git
Detect, parse, and change git URLs using libvcs's URL parser for git(1). It builds on top of the VCS-friendly URL parser framework.
Detect, parse, and validate git URLs.
- Detect: GitURL.is_valid()
- Parse: compare to urllib.parse.ParseResult
  - Compatibility focused: GitURL: Will work with git(1) as well as pip(1) style URLs
    - Output git(1) URL: GitURL.to_url()
  - Strict git(1) compatibility: GitBaseURL.
    - Output git(1) URL: GitBaseURL.to_url()
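To make the comparison with urllib.parse.ParseResult concrete, the sketch below uses only the standard library. It shows that urlparse handles an https URL well but does not recognize an scp-style location, which is the kind of input a git-aware parser has to cover. Nothing in it comes from libvcs.

```python
from urllib.parse import urlparse

# A regular https URL splits cleanly into scheme, host, and path.
https = urlparse("https://github.com/vcs-python/libvcs.git")
print(https.scheme, https.hostname, https.path)

# An scp-style location has no "scheme://" prefix, so urlparse keeps the
# whole string as the path and reports no hostname.
scp = urlparse("git@github.com:vcs-python/libvcs.git")
print(repr(scp.scheme), scp.hostname, scp.path)
```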
- libvcs.url.git.DEFAULT_RULES: list[Rule] = [Rule(label=core-git-https, description=Vanilla git pattern, URL ending with optional .git suffix, pattern=re.compile('\n ^\n (?P<scheme>\n (\n http|https\n )\n )\n\n ://\n \n ((?P<user>\\w+)@)?\n (?P<hostname>([^/:]+))\n (:(?P<port>\\d{1,5}))?\n (?P<separator>[, re.VERBOSE), defaults={}), Rule(label=core-git-scp, description=Vanilla scp(1) / ssh(1) type URL, pattern=re.compile("\n ^(?P<scheme>ssh)?\n \n # Optional user, e.g. 'git@'\n ((?P<user>\\w+)@)?\n # Server, e.g. 'github.com'.\n (?P<hostname>([^/:]+))\n (?P<separator>:)\n # The server-s, re.VERBOSE), defaults={'username': 'git'})]#
Core regular expressions. These are patterns understood by git(1).
See also: https://git-scm.com/docs/git-clone#URLS
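The patterns printed in DEFAULT_RULES are truncated in this listing, so the following is only a rough sketch of the idea, not the library's regular expressions. HTTPS_RE, SCP_RE, and parse are hypothetical names, and the patterns are simplified stand-ins written for illustration.

```python
import re

# Simplified stand-ins for an https rule and an scp-style rule.
# These are NOT the patterns libvcs ships; they only illustrate the approach.
HTTPS_RE = re.compile(
    r"""
    ^(?P<scheme>https?)://
    ((?P<user>\w+)@)?
    (?P<hostname>[^/:]+)
    (:(?P<port>\d{1,5}))?
    /(?P<path>.+?)
    (?P<suffix>\.git)?$
    """,
    re.VERBOSE,
)
SCP_RE = re.compile(
    r"""
    ^((?P<user>\w+)@)?
    (?P<hostname>[^/:]+)
    :(?P<path>.+?)
    (?P<suffix>\.git)?$
    """,
    re.VERBOSE,
)


def parse(url):
    """Return (label, fields) for the first pattern that matches, else None."""
    for label, pattern in (("https", HTTPS_RE), ("scp", SCP_RE)):
        match = pattern.match(url)
        if match:
            return label, match.groupdict()
    return None


print(parse("https://github.com/vcs-python/libvcs.git"))
print(parse("git@github.com:vcs-python/libvcs.git"))
print(parse("notaurl"))
```

Trying the patterns in a fixed order and returning the first hit mirrors how a labelled rule ends up attached to a parsed URL in the examples further down.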
- libvcs.url.git.PIP_DEFAULT_RULES: list[Rule] = [Rule(label=pip-url, description=pip-style git URL, pattern=re.compile('\n \n (?P<scheme>\n (\n git\\+ssh|\n git\\+https|\n git\\+http|\n git\\+file\n )\n )\n\n ://\n \n ((?P<user>\\w+)@)?\n (?P<hostn, re.VERBOSE), defaults={}, is_explicit=True), Rule(label=pip-scp-url, description=pip-style git ssh/scp URL, pattern=re.compile("\n \n (?P<scheme>\n (\n git\\+ssh|\n git\\+file\n )\n )\n\n \n # Optional user, e.g. 'git@'\n ((?P<user>\\w+)@)?\n # Server, e.g. 'github.com'.\n , re.VERBOSE), defaults={}, is_explicit=True), Rule(label=pip-file-url, description=pip-style git+file:// URL, pattern=re.compile('\n (?P<scheme>git\\+file)://\n (?P<path>[^@]*)\n \n (@(?P<rev>.*))\n?\n ', re.VERBOSE), defaults={}, is_explicit=True)]#
pip-style git URLs.
Examples of PIP-style git URLs (via pip.pypa.io):
MyProject @ git+ssh://git.example.com/MyProject
MyProject @ git+file:///home/user/projects/MyProject
MyProject @ git+https://git.example.com/MyProject
Refs (via pip.pypa.io):
MyProject @ git+https://git.example.com/MyProject.git@master
MyProject @ git+https://git.example.com/MyProject.git@v1.0
MyProject @ git+https://git.example.com/MyProject.git@da39a3ee5e6b4b0d3255bfef95601890afd80709
MyProject @ git+https://git.example.com/MyProject.git@refs/pull/123/head
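As an illustration of how a trailing ref is separated from the location, the sketch below recompiles the pip-file-url pattern as it is printed in PIP_DEFAULT_RULES above (it is the one pattern shown in full). The helper name split_rev is hypothetical, and the code uses only the standard library rather than libvcs itself.

```python
import re

# Transcribed from the pip-file-url rule printed above; whitespace is
# ignored because of re.VERBOSE.
PIP_FILE_RE = re.compile(
    r"""
    (?P<scheme>git\+file)://
    (?P<path>[^@]*)
    (@(?P<rev>.*))?
    """,
    re.VERBOSE,
)


def split_rev(url):
    """Return (path, rev) for a git+file:// URL, or None if it does not match."""
    match = PIP_FILE_RE.match(url)
    if match is None:
        return None
    return match.group("path"), match.group("rev")


print(split_rev("git+file:///home/user/projects/MyProject"))
print(split_rev("git+file:///home/user/projects/MyProject@v1.0"))
```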
Notes
- libvcs.url.git.NPM_DEFAULT_RULES: list[Rule] = []#
NPM-style git URLs.
Git URL pattern (from docs.npmjs.com):
<protocol>://[<user>[:<password>]@]<hostname>[:<port>][:][/]<path>[#<commit-ish> | #semver:<semver>]
Examples of NPM-style git URLs (from docs.npmjs.com):
ssh://git@github.com:npm/cli.git#v1.0.27
git+ssh://git@github.com:npm/cli#semver:^5.0
git+https://isaacs@github.com/npm/cli.git
git://github.com/npm/cli.git#v1.0.27
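NPM_DEFAULT_RULES is empty in this listing, so the sketch below is not library behavior. It only shows, under that caveat, how the trailing `#<commit-ish>` or `#semver:<semver>` part of the pattern above could be separated from the rest of the URL. The function name split_npm_fragment is hypothetical.

```python
def split_npm_fragment(url):
    """Split an NPM-style git URL into (base, commit_ish, semver).

    At most one of commit_ish and semver is set; both are None when the
    URL carries no '#' fragment.
    """
    base, _, fragment = url.partition("#")
    if not fragment:
        return base, None, None
    if fragment.startswith("semver:"):
        return base, None, fragment[len("semver:"):]
    return base, fragment, None


print(split_npm_fragment("ssh://git@github.com:npm/cli.git#v1.0.27"))
print(split_npm_fragment("git+ssh://git@github.com:npm/cli#semver:^5.0"))
print(split_npm_fragment("git+https://isaacs@github.com/npm/cli.git"))
```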
Notes
- class libvcs.url.git.GitBaseURL(url, scheme=None, user=None, hostname=None, port=None, path=None, suffix=None, rule=None)[source]#
Bases: URLProtocol, SkipDefaultFieldsReprMixin
Git repository location. Parses URLs on initialization.
Examples
>>> GitBaseURL(url='https://github.com/vcs-python/libvcs.git')
GitBaseURL(url=https://github.com/vcs-python/libvcs.git, scheme=https, hostname=github.com, path=vcs-python/libvcs, suffix=.git, rule=core-git-https)
>>> myrepo = GitBaseURL(url='https://github.com/myproject/myrepo.git')
>>> myrepo.hostname
'github.com'
>>> myrepo.path
'myproject/myrepo'
>>> GitBaseURL(url='git@github.com:vcs-python/libvcs.git')
GitBaseURL(url=git@github.com:vcs-python/libvcs.git, user=git, hostname=github.com, path=vcs-python/libvcs, suffix=.git, rule=core-git-scp)
Compatibility checking: GitBaseURL.is_valid()
URLs compatible with git(1): GitBaseURL.to_url()
- Parameters:
- rule_map = RuleMap(_rule_map={'core-git-https': Rule(label=core-git-https, description=Vanilla git pattern, URL ending with optional .git suffix, pattern=re.compile('\n ^\n (?P<scheme>\n (\n http|https\n )\n )\n\n ://\n \n ((?P<user>\\w+)@)?\n (?P<hostname>([^/:]+))\n (:(?P<port>\\d{1,5}))?\n (?P<separator>[, re.VERBOSE), defaults={}), 'core-git-scp': Rule(label=core-git-scp, description=Vanilla scp(1) / ssh(1) type URL, pattern=re.compile("\n ^(?P<scheme>ssh)?\n \n # Optional user, e.g. 'git@'\n ((?P<user>\\w+)@)?\n # Server, e.g. 'github.com'.\n (?P<hostname>([^/:]+))\n (?P<separator>:)\n # The server-s, re.VERBOSE), defaults={'username': 'git'})})#
- classmethod is_valid(url, is_explicit=None)[source]#
Whether URL is compatible with VCS or not.
Examples
>>> GitBaseURL.is_valid(url='https://github.com/vcs-python/libvcs.git')
True
>>> GitBaseURL.is_valid(url='git@github.com:vcs-python/libvcs.git')
True
>>> GitBaseURL.is_valid(url='notaurl')
False
Unambiguous VCS detection
Sometimes you may want to match a VCS exclusively, without any chance of ambiguity, e.g. in order to outright detect the VCS being used.
>>> GitBaseURL.is_valid(
...     url='git@github.com:vcs-python/libvcs.git', is_explicit=True
... )
False
In this case, check the examples for GitPipURL.is_valid() or GitURL.is_valid().
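The effect of is_explicit can be pictured with a small self-contained sketch. It is not the libvcs implementation: ExplicitRule, RULES, and this is_valid function are hypothetical, the patterns are simplified, and treating is_explicit=False the same as None is an assumption, since the text above only covers the True case.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ExplicitRule:
    """Hypothetical stand-in for a rule carrying an is_explicit flag."""

    label: str
    pattern: re.Pattern
    is_explicit: bool = False


RULES = [
    # Lax scp-style pattern: many non-git hosts would match it too.
    ExplicitRule("scp", re.compile(r"^(\w+@)?[^/:]+:.+$")),
    # The git+ prefix names the VCS, so a match is unambiguous.
    ExplicitRule("pip", re.compile(r"^git\+(ssh|https?|file)://.+$"), is_explicit=True),
]


def is_valid(url, is_explicit=None):
    """Return True if any eligible rule matches the URL."""
    for rule in RULES:
        # When explicit detection is requested, skip the lax rules.
        if is_explicit and not rule.is_explicit:
            continue
        if rule.pattern.match(url):
            return True
    return False


print(is_valid("git@github.com:vcs-python/libvcs.git"))
print(is_valid("git@github.com:vcs-python/libvcs.git", is_explicit=True))
print(is_valid("git+ssh://git@github.com:vcs-python/libvcs.git", is_explicit=True))
```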
- to_url()[source]#
Return a git(1)-compatible URL. Can be used with git clone.
- Return type:
Examples
>>> git_url = GitBaseURL(url='git@github.com:vcs-python/libvcs.git')
>>> git_url
GitBaseURL(url=git@github.com:vcs-python/libvcs.git, user=git, hostname=github.com, path=vcs-python/libvcs, suffix=.git, rule=core-git-scp)
Switch repo libvcs -> vcspull:
>>> git_url.path = 'vcs-python/vcspull'
>>> git_url.to_url()
'git@github.com:vcs-python/vcspull.git'
Switch them to gitlab:
>>> git_url.hostname = 'gitlab.com'
>>> git_url.to_url()
'git@gitlab.com:vcs-python/vcspull.git'
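The edit-then-export pattern above works because the URL is stored as separate fields and reassembled on demand. A minimal sketch of that idea, assuming an scp-style layout and using a hypothetical ScpLocation class rather than anything from libvcs:

```python
from dataclasses import dataclass


@dataclass
class ScpLocation:
    """Hypothetical container for the parts of an scp-style git location."""

    user: str
    hostname: str
    path: str
    suffix: str = ""

    def to_url(self):
        # Reassemble user@host:path plus the optional suffix.
        return f"{self.user}@{self.hostname}:{self.path}{self.suffix}"


loc = ScpLocation(user="git", hostname="github.com", path="vcs-python/libvcs", suffix=".git")
loc.path = "vcs-python/vcspull"  # switch repository
loc.hostname = "gitlab.com"      # switch host
print(loc.to_url())
```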
- _abc_impl = <_abc._abc_data object>#
- _is_protocol = False#
- class libvcs.url.git.GitPipURL(url, scheme=None, user=None, hostname=None, port=None, path=None, suffix=None, rule=None, rev=None)[source]#
Bases: GitBaseURL, URLProtocol, SkipDefaultFieldsReprMixin
Supports pip git URLs.
- Parameters:
- rule_map = RuleMap(_rule_map={'pip-url': Rule(label=pip-url, description=pip-style git URL, pattern=re.compile('\n \n (?P<scheme>\n (\n git\\+ssh|\n git\\+https|\n git\\+http|\n git\\+file\n )\n )\n\n ://\n \n ((?P<user>\\w+)@)?\n (?P<hostn, re.VERBOSE), defaults={}, is_explicit=True), 'pip-scp-url': Rule(label=pip-scp-url, description=pip-style git ssh/scp URL, pattern=re.compile("\n \n (?P<scheme>\n (\n git\\+ssh|\n git\\+file\n )\n )\n\n \n # Optional user, e.g. 'git@'\n ((?P<user>\\w+)@)?\n # Server, e.g. 'github.com'.\n , re.VERBOSE), defaults={}, is_explicit=True), 'pip-file-url': Rule(label=pip-file-url, description=pip-style git+file:// URL, pattern=re.compile('\n (?P<scheme>git\\+file)://\n (?P<path>[^@]*)\n \n (@(?P<rev>.*))\n?\n ', re.VERBOSE), defaults={}, is_explicit=True)})#
- to_url()[source]#
Export a pip-compliant URL.
- Return type:
Examples
>>> git_url = GitPipURL(
...     url='git+ssh://git@bitbucket.example.com:7999/PROJ/repo.git'
... )
>>> git_url
GitPipURL(url=git+ssh://git@bitbucket.example.com:7999/PROJ/repo.git, scheme=git+ssh, user=git, hostname=bitbucket.example.com, port=7999, path=PROJ/repo, suffix=.git, rule=pip-url)
>>> git_url.path = 'libvcs/vcspull'
>>> git_url.to_url()
'git+ssh://bitbucket.example.com/libvcs/vcspull.git'
It also accepts revisions, e.g. branch, tag, ref:
>>> git_url = GitPipURL(
...     url='git+https://github.com/vcs-python/libvcs.git@v0.10.0'
... )
>>> git_url
GitPipURL(url=git+https://github.com/vcs-python/libvcs.git@v0.10.0, scheme=git+https, hostname=github.com, path=vcs-python/libvcs, suffix=.git, rule=pip-url, rev=v0.10.0)
>>> git_url.path = 'libvcs/vcspull'
>>> git_url.to_url()
'git+https://github.com/libvcs/vcspull.git@v0.10.0'
- classmethod is_valid(url, is_explicit=None)[source]#
Whether the URL is compatible with pip's git VCS URL pattern or not.
Examples
Will not match normal git(1) URLs, use GitURL.is_valid() for that.
>>> GitPipURL.is_valid(url='https://github.com/vcs-python/libvcs.git')
False
>>> GitPipURL.is_valid(url='git@github.com:vcs-python/libvcs.git')
False
Pip-style URLs:
>>> GitPipURL.is_valid(url='git+https://github.com/vcs-python/libvcs.git')
True
>>> GitPipURL.is_valid(url='git+ssh://git@github.com:vcs-python/libvcs.git')
True
>>> GitPipURL.is_valid(url='notaurl')
False
Explicit VCS detection
Pip-style URLs are prefixed with the VCS name, so the rule_map can unambiguously narrow the type of VCS:
>>> GitPipURL.is_valid(
...     url='git+ssh://git@github.com:vcs-python/libvcs.git', is_explicit=True
... )
True
- _abc_impl = <_abc._abc_data object>#
- _is_protocol = False#
- class libvcs.url.git.GitURL(url, scheme=None, user=None, hostname=None, port=None, path=None, suffix=None, rule=None, rev=None)[source]#
Bases: GitPipURL, GitBaseURL, URLProtocol, SkipDefaultFieldsReprMixin
Batteries included URL Parser. Supports git(1) and pip URLs.
Ancestors (MRO) This URL parser inherits methods and attributes from the following parsers:
- Parameters:
- _abc_impl = <_abc._abc_data object>#
- _is_protocol = False#
- rule_map = RuleMap(_rule_map={'core-git-https': Rule(label=core-git-https, description=Vanilla git pattern, URL ending with optional .git suffix, pattern=re.compile('\n ^\n (?P<scheme>\n (\n http|https\n )\n )\n\n ://\n \n ((?P<user>\\w+)@)?\n (?P<hostname>([^/:]+))\n (:(?P<port>\\d{1,5}))?\n (?P<separator>[, re.VERBOSE), defaults={}), 'core-git-scp': Rule(label=core-git-scp, description=Vanilla scp(1) / ssh(1) type URL, pattern=re.compile("\n ^(?P<scheme>ssh)?\n \n # Optional user, e.g. 'git@'\n ((?P<user>\\w+)@)?\n # Server, e.g. 'github.com'.\n (?P<hostname>([^/:]+))\n (?P<separator>:)\n # The server-s, re.VERBOSE), defaults={'username': 'git'}), 'pip-url': Rule(label=pip-url, description=pip-style git URL, pattern=re.compile('\n \n (?P<scheme>\n (\n git\\+ssh|\n git\\+https|\n git\\+http|\n git\\+file\n )\n )\n\n ://\n \n ((?P<user>\\w+)@)?\n (?P<hostn, re.VERBOSE), defaults={}, is_explicit=True), 'pip-scp-url': Rule(label=pip-scp-url, description=pip-style git ssh/scp URL, pattern=re.compile("\n \n (?P<scheme>\n (\n git\\+ssh|\n git\\+file\n )\n )\n\n \n # Optional user, e.g. 'git@'\n ((?P<user>\\w+)@)?\n # Server, e.g. 'github.com'.\n , re.VERBOSE), defaults={}, is_explicit=True), 'pip-file-url': Rule(label=pip-file-url, description=pip-style git+file:// URL, pattern=re.compile('\n (?P<scheme>git\\+file)://\n (?P<path>[^@]*)\n \n (@(?P<rev>.*))\n?\n ', re.VERBOSE), defaults={}, is_explicit=True)})#
- classmethod is_valid(url, is_explicit=None)[source]#
Whether the URL is compatible with the included Git URL rule_map or not.
Examples
Will match normal git(1) URLs:
>>> GitURL.is_valid(url='https://github.com/vcs-python/libvcs.git')
True
>>> GitURL.is_valid(url='git@github.com:vcs-python/libvcs.git')
True
Pip-style URLs:
>>> GitURL.is_valid(url='git+https://github.com/vcs-python/libvcs.git')
True
>>> GitURL.is_valid(url='git+ssh://git@github.com:vcs-python/libvcs.git')
True
>>> GitURL.is_valid(url='notaurl')
False
Explicit VCS detection
Pip-style URLs are prefixed with the VCS name, so the rule_map can unambiguously narrow the type of VCS:
>>> GitURL.is_valid(
...     url='git+ssh://git@github.com:vcs-python/libvcs.git', is_explicit=True
... )
True
Below, while it's GitHub, that doesn't necessarily mean that the URL itself is conclusively a git URL (e.g. the pattern is too lax):
>>> GitURL.is_valid(
...     url='git@github.com:vcs-python/libvcs.git', is_explicit=True
... )
False
You could create a GitHub rule that considers github.com hostnames to be exclusively git:
>>> GitHubRule = Rule(
...     # Since github.com exclusively serves git repos, make explicit
...     label='gh-rule',
...     description='Matches github.com https URLs, exact VCS match',
...     pattern=re.compile(
...         rf'''
...         ^(?P<scheme>ssh)?
...         ((?P<user>\w+)@)?
...         (?P<hostname>(github.com)+):
...         (?P<path>(\w[^:]+))
...         {RE_SUFFIX}?
...         ''',
...         re.VERBOSE,
...     ),
...     is_explicit=True,
...     defaults={
...         'hostname': 'github.com'
...     },
...     weight=100,
... )
>>> GitURL.rule_map.register(GitHubRule)
>>> GitURL.is_valid(
...     url='git@github.com:vcs-python/libvcs.git', is_explicit=True
... )
True
>>> GitURL(url='git@github.com:vcs-python/libvcs.git').rule
'gh-rule'
This is just us cleaning up:
>>> GitURL.rule_map.unregister('gh-rule')
>>> GitURL(url='git@github.com:vcs-python/libvcs.git').rule
'core-git-scp'
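The register and unregister steps above can be pictured with a small stand-in for a rule registry. This is a sketch only: WeightedRule, WeightedRuleMap, and first_match are hypothetical names, and trying higher-weight rules first is an assumption that happens to agree with 'gh-rule' (weight=100) winning in the example, not a documented guarantee.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class WeightedRule:
    """Hypothetical rule with a label, a pattern, and a priority weight."""

    label: str
    pattern: re.Pattern
    weight: int = 0


class WeightedRuleMap:
    """Hypothetical registry keyed by rule label."""

    def __init__(self):
        self._rules = {}

    def register(self, rule):
        self._rules[rule.label] = rule

    def unregister(self, label):
        self._rules.pop(label, None)

    def first_match(self, url):
        # Assumption: heavier rules are tried first; ties keep insertion order.
        ordered = sorted(self._rules.values(), key=lambda r: r.weight, reverse=True)
        for rule in ordered:
            if rule.pattern.match(url):
                return rule.label
        return None


rules = WeightedRuleMap()
rules.register(WeightedRule("core-git-scp", re.compile(r"^(\w+@)?[^/:]+:.+$")))
rules.register(WeightedRule("gh-rule", re.compile(r"^(\w+@)?github\.com:.+$"), weight=100))

url = "git@github.com:vcs-python/libvcs.git"
with_gh = rules.first_match(url)

rules.unregister("gh-rule")
without_gh = rules.first_match(url)
print(with_gh, without_gh)
```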
- to_url()[source]#
Return a git(1)-compatible URL. Can be used with git clone.
- Return type:
Examples
SSH style URL:
>>> git_url = GitURL(url='git@github.com:vcs-python/libvcs')
>>> git_url.path = 'vcs-python/vcspull'
>>> git_url.to_url()
'git@github.com:vcs-python/vcspull'
HTTPS URL:
>>> git_url = GitURL(url='https://github.com/vcs-python/libvcs.git')
>>> git_url.path = 'vcs-python/vcspull'
>>> git_url.to_url()
'https://github.com/vcs-python/vcspull.git'
Switch them to gitlab:
>>> git_url.hostname = 'gitlab.com'
>>> git_url.to_url()
'https://gitlab.com/vcs-python/vcspull.git'
Pip style URL, thanks to this class implementing GitPipURL:
>>> git_url = GitURL(url='git+ssh://git@github.com/vcs-python/libvcs')
>>> git_url.hostname = 'gitlab.com'
>>> git_url.to_url()
'git+ssh://gitlab.com/vcs-python/libvcs'
See also