download_file
- astropy.utils.data.download_file(remote_url, cache=False, show_progress=True, timeout=None, sources=None, pkgname='astropy', http_headers=None, ssl_context=None, allow_insecure=False)
Downloads a URL and optionally caches the result.
It returns the name of a file containing the URL's contents. If cache=True and the file is present in the cache, it just returns the filename; if the file had to be downloaded, it is added to the cache. If cache="update", the file is always downloaded and added to the cache.
The cache is effectively a dictionary mapping URLs to files; by default the file contains the contents of the URL that is its key, but in practice these can be obtained from a mirror (using sources) or imported from the local filesystem (using import_file_to_cache or import_download_cache). Regardless, each file is regarded as representing the contents of a particular URL, and this URL should be used to look them up or otherwise manipulate them.
The files in the cache directory are named according to a cryptographic hash of their URLs (currently MD5, for which deliberate collisions can be constructed). The modification times on these files normally indicate when they were last downloaded from the Internet.
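The two cache modes described above can be sketched with a small wrapper. This is only an illustration: fetch and its refresh and downloader arguments are hypothetical names, not part of astropy, and the downloader argument exists solely so the wrapper can be tried with a stand-in function instead of a real network call.

```python
def fetch(url, refresh=False, downloader=None):
    """Return a local filename for `url` (hypothetical helper, not part of astropy)."""
    if downloader is None:
        # Imported lazily so the sketch can be exercised with a stand-in downloader.
        from astropy.utils.data import download_file as downloader
    # cache=True reuses an existing cache entry; cache="update" always re-downloads
    # and stores the fresh copy in the cache.
    cache = "update" if refresh else True
    return downloader(url, cache=cache)
```

Calling fetch(url) a second time would normally return the cached filename without touching the network, while fetch(url, refresh=True) asks for a fresh copy.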
- Parameters:
- remote_url : str
The URL of the file to download.
- cache : bool or "update", optional
Whether to cache the contents of remote URLs. If "update", always download the remote URL in case there is a new version and store the result in the cache.
- show_progress : bool, optional
Whether to display a progress bar during the download (default is True). Regardless of this setting, the progress bar is only displayed when outputting to a terminal.
- timeout : float, optional
Timeout for remote requests in seconds (default is the configurable astropy.utils.data.Conf.remote_timeout).
- sources : list of str, optional
If provided, a list of URLs to try to obtain the file from. The result will be stored under the original URL. The original URL will not be tried unless it is in this list; this is to prevent long waits for a primary server that is known to be inaccessible at the moment. If an empty list is passed, then download_file will not attempt to connect to the Internet; that is, if the file is not in the cache, a KeyError will be raised.
- pkgname : str, optional
The package name to use to locate the download cache. For example, with pkgname='astropy' the default cache location is ~/.astropy/cache.
- http_headers : dict or None
HTTP request headers to pass into urlopen if needed. (These headers are ignored if the protocol for the remote_url/sources entry is not a remote HTTP URL.) In the default case (None), the headers are User-Agent: some_value and Accept: */*, where some_value is set by astropy.utils.data.conf.default_http_user_agent.
- ssl_context : dict, optional
Keyword arguments to pass to ssl.create_default_context when downloading from HTTPS or TLS+FTP sources. This can be used to provide alternative paths to root CA certificates. Additionally, if the key 'certfile' and optionally 'keyfile' and 'password' are included, they are passed to ssl.SSLContext.load_cert_chain. This can be used for performing SSL/TLS client certificate authentication for servers that require it.
- allow_insecure : bool, optional
Allow downloading files over a TLS/SSL connection even when the server certificate verification failed. When set to True, the potentially insecure download is allowed to proceed, but an AstropyWarning is issued. If you are frequently getting certificate verification warnings, consider installing or upgrading the certifi package, which provides frequently updated certificates for common root CAs (i.e., a set similar to those used by web browsers). If installed, Astropy will use it automatically.
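As a rough sketch of how sources, timeout, and http_headers might be combined, the wrapper below lists the primary URL ahead of its mirrors, since the original URL is only tried when it appears in sources. The function name fetch_from_mirrors, its arguments, and the injectable downloader are assumptions made for illustration; any URLs passed to it are the caller's own.

```python
def fetch_from_mirrors(url, mirrors, timeout=10.0, user_agent=None, downloader=None):
    """Try `url` first, then each mirror; hypothetical helper, not part of astropy."""
    if downloader is None:
        # Imported lazily so the sketch can be exercised with a stand-in downloader.
        from astropy.utils.data import download_file as downloader
    # The original URL is only contacted if it is itself listed in `sources`.
    sources = [url, *mirrors]
    kwargs = {
        "cache": True,
        "sources": sources,
        "timeout": timeout,
        "show_progress": False,
    }
    if user_agent is not None:
        # The default headers are documented only for http_headers=None, so both
        # User-Agent and Accept are given explicitly here.
        kwargs["http_headers"] = {"User-Agent": user_agent, "Accept": "*/*"}
    # Whichever source succeeds, the result is stored under the original URL.
    return downloader(url, **kwargs)
```

Leaving the primary URL out of the list is the documented way to skip a server that is known to be unreachable.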
- Returns:
- local_path : str
The local path that the file was downloaded to.
- Raises:
urllib.error.URLError
Whenever there’s a problem getting the remote file.
KeyError
When a file was requested from the cache but is missing and no sources were provided to obtain it from the Internet.
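The KeyError case suggests a simple cache-only lookup: pass an empty sources list so no network access is attempted, and treat KeyError as "not cached". The helper name cached_path_or_none and its downloader argument are hypothetical and serve only to illustrate the pattern.

```python
def cached_path_or_none(url, downloader=None):
    """Return the cached filename for `url`, or None if it is not in the cache.

    Hypothetical helper, not part of astropy.
    """
    if downloader is None:
        # Imported lazily so the sketch can be exercised with a stand-in downloader.
        from astropy.utils.data import download_file as downloader
    try:
        # An empty `sources` list means no connection to the Internet is attempted.
        return downloader(url, cache=True, sources=[])
    except KeyError:
        # The file was requested from the cache but is missing.
        return None
```

A caller that does allow network access would instead need to be prepared for urllib.error.URLError.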
Notes
Because this function returns a filename, another process could run clear_download_cache before you actually open the file, leaving you with a filename that no longer points to a usable file.
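One way to live with this race, sketched under the assumption that a cleared cache shows up as FileNotFoundError when the file is opened, is to open the file immediately and request it again if it has vanished. The helper read_cached and its retries and downloader arguments are hypothetical, not part of astropy.

```python
def read_cached(url, retries=1, downloader=None):
    """Return the bytes behind `url`, retrying if the cached file disappears.

    Hypothetical helper, not part of astropy.
    """
    if downloader is None:
        # Imported lazily so the sketch can be exercised with a stand-in downloader.
        from astropy.utils.data import download_file as downloader
    for attempt in range(retries + 1):
        path = downloader(url, cache=True)
        try:
            # Open right away to keep the window for a concurrent cache clear small.
            with open(path, "rb") as fh:
                return fh.read()
        except FileNotFoundError:
            # The file vanished between download and open; ask for it again
            # unless the retry budget is used up.
            if attempt == retries:
                raise
```

This narrows the window but does not remove it; the retry simply asks for the file again.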