GALLERY-DL.CONF(5) | gallery-dl Manual | GALLERY-DL.CONF(5) |
gallery-dl.conf - gallery-dl configuration file
gallery-dl will search for configuration files in the following places every time it is started, unless --ignore-config is specified:
/etc/gallery-dl.conf $HOME/.config/gallery-dl/config.json $HOME/.gallery-dl.conf
It is also possible to specify additional configuration files with the -c/--config command-line option or to add further option values with -o/--option as <key>=<value> pairs.
Configuration files are JSON-based and therefore don't allow any ordinary comments, but, since unused keys are simply ignored, it is possible to utilize those as makeshift comments by setting their values to arbitrary strings.
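For example, an unused key can carry a note (the key name "#" below is arbitrary; any name no option uses will do):

```json
{
    "extractor": {
        "#": "any unused key works as a makeshift comment",
        "base-directory": "/tmp/"
    }
}
```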
"{manga}_c{chapter}_{page:>03}.{extension}"
{
    "extension == 'mp4'": "{id}_video.{extension}",
    "'nature' in title" : "{id}_{title}.{extension}",
    ""                  : "{id}_default.{extension}"
}
If this is an object, it must contain Python expressions mapping to the filename format strings to use. These expressions are evaluated in the order they are specified on Python 3.6+ and in an undetermined order on Python 3.4 and 3.5.
The available replacement keys depend on the extractor used. A list of keys for a specific one can be acquired by calling *gallery-dl* with the -K/--list-keywords command-line option. For example:
$ gallery-dl -K http://seiga.nicovideo.jp/seiga/im5977527
Keywords for directory names:
-----------------------------
category
  seiga
subcategory
  image

Keywords for filenames:
-----------------------
category
  seiga
extension
  None
image-id
  5977527
subcategory
  image
Note: Even if the value of the extension key is missing or None, it will be filled in later when the file download is starting. This key is therefore always available to provide a valid filename extension.
["{category}", "{manga}", "c{chapter} - {title}"]
{
    "'nature' in content": ["Nature Pictures"],
    "retweet_id != 0"    : ["{category}", "{user[name]}", "Retweets"],
    ""                   : ["{category}", "{user[name]}"]
}
If this is an object, it must contain Python expressions mapping to the list of format strings to use.
Each individual string in such a list represents a single path segment, which will be joined together and appended to the base-directory to form the complete target directory path.
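As a sketch of how those segments combine (the metadata values and base directory here are made up for illustration):

```python
import posixpath  # used so the joined path is deterministic across platforms

# Hypothetical metadata as a manga extractor might provide it
metadata = {"category": "mangahere", "manga": "Example Manga",
            "chapter": 5, "title": "Intro"}

base_directory = "/tmp/"
segments = ["{category}", "{manga}", "c{chapter} - {title}"]

# Each format string becomes one path segment; the segments are
# joined together and appended to the base directory
path = posixpath.join(base_directory,
                      *(s.format_map(metadata) for s in segments))
print(path)  # /tmp/mangahere/Example Manga/c5 - Intro
```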
If this is a string, add a parent's metadata to its children's under a field named after said string. For example, with "parent-metadata": "_p_":
{ "id": "child-id", "_p_": {"id": "parent-id"} }
Special values:
* "auto": Use characters from
"unix" or "windows" depending on the
local operating system
* "unix": "/"
* "windows": "\\\\|/<>:\"?*"
* "ascii": "^0-9A-Za-z_."
Note: In a string with 2 or more characters, []^-\ need to be escaped with backslashes, e.g. "\\[\\]"
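One way to see how such a character set behaves is to test it as a regular-expression character class in Python (the replacement character and sample string below are purely illustrative):

```python
import re

# []^-\ escaped with backslashes, as the note requires
path_restrict = "\\[\\]^-"

# Characters in the set get replaced; everything else stays
sanitized = re.sub("[" + path_restrict + "]", "_", "a[b]^c-d")
print(sanitized)  # a_b__c_d
```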
Special values:
* "auto": Use characters from
"unix" or "windows" depending on the
local operating system
* "unix": ""
* "windows": ". "
{
    "jpeg": "jpg",
    "jpe" : "jpg",
    "jfif": "jpg",
    "jif" : "jpg",
    "jfi" : "jpg"
}
* true: Skip downloads
* false: Overwrite already existing files
* "abort": Stop the current extractor run
* "abort:N": Skip downloads and stop the current
extractor run after N consecutive skips
* "terminate": Stop the current extractor
run, including parent extractors
* "terminate:N": Skip downloads and stop the current
extractor run, including parent extractors, after N consecutive
skips
* "exit": Exit the program altogether
* "exit:N": Skip downloads and exit the program after
N consecutive skips
* "enumerate": Add an enumeration index to the beginning of the filename extension (file.1.ext, file.2.ext, etc.)
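For instance, a sketch of a configuration that skips already existing files but gives up on the current extractor run after three consecutive skips:

```json
{
    "extractor": {
        "skip": "abort:3"
    }
}
```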
Specifying a username and password is required for
* nijie
and optional for
* aibooru (*)
* aryion
* atfbooru (*)
* danbooru (*)
* e621 (*)
* e926 (*)
* exhentai
* idolcomplex
* imgbb
* inkbunny
* kemonoparty
* mangadex
* mangoxo
* pillowfort
* sankaku
* seisoparty
* subscribestar
* tapas
* tsumino
* twitter
* zerochan
These values can also be specified via the -u/--username and -p/--password command-line options or by using a .netrc file. (see Authentication_)
(*) The password value for these sites should be the API key found in your user profile, not the actual account password.
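A minimal sketch with placeholder credentials (for the sites marked (*), the password value is the API key from your user profile):

```json
{
    "extractor": {
        "nijie": {
            "username": "your-email@example.org",
            "password": "your-password"
        },
        "danbooru": {
            "username": "your-username",
            "password": "your-api-key"
        }
    }
}
```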
* The Path to a Mozilla/Netscape format cookies.txt file
"~/.local/share/cookies-instagram-com.txt"
* An object specifying cookies as name-value pairs
{
    "cookie-name": "cookie-value",
    "sessionid"  : "14313336321%3AsabDFvuASDnlpb%3A31",
    "isAdult"    : "1"
}
* A list with up to 4 entries specifying a browser profile.
* The first entry is the browser name
* The optional second entry is a profile name or an absolute path to a
profile directory
* The optional third entry is the keyring to retrieve passwords for
decrypting cookies from
* The optional fourth entry is a (Firefox) container name
("none" for only cookies with no container)
["firefox"]
["firefox", null, null, "Personal"]
["chromium", "Private", "kwallet"]
"http://10.10.1.10:3128"
{
    "http" : "http://10.10.1.10:3128",
    "https": "http://10.10.1.10:1080",
    "http://10.20.1.128": "http://10.10.1.10:5323"
}
* If this is a string, it is the proxy URL for all
outgoing requests.
* If this is an object, it is a scheme-to-proxy mapping to specify
different proxy URLs for each scheme. It is also possible to set a proxy
for a specific host by using scheme://host as key. See
Requests' proxy documentation for more details.
Note: If a proxy URL does not include a scheme, http:// is assumed.
Can be either a simple string with just the local IP address or a list with IP and explicit port number as elements.
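Both forms, with placeholder address and port:

```json
"192.168.178.20"
```

```json
["192.168.178.20", 6214]
```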
Setting this value to "browser" will try to automatically detect and use the User-Agent used by the system's default browser.
Note: This option has no effect on pixiv, e621, and mangadex extractors, as these need specific values to function correctly.
Optionally, the operating system used in the User-Agent header can be specified after a : (windows, linux, or macos).
Note: requests and urllib3 only support HTTP/1.1, while a real browser would use HTTP/2.
{
    "User-Agent"     : "<extractor.*.user-agent>",
    "Accept"         : "*/*",
    "Accept-Language": "en-US,en;q=0.5",
    "Accept-Encoding": "gzip, deflate"
}
To disable sending a header, set its value to null.
["ECDHE-ECDSA-AES128-GCM-SHA256", "ECDHE-RSA-AES128-GCM-SHA256", "ECDHE-ECDSA-CHACHA20-POLY1305", "ECDHE-RSA-CHACHA20-POLY1305"]
For example, setting this option to "gdl_file_url" will cause a new metadata field with name gdl_file_url to appear, which contains the current file's download URL. This can then be used in filenames, with a metadata post processor, etc.
For example, setting this option to "gdl_path" would make it possible to access the current file's filename as "{gdl_path.filename}".
For example, setting this option to "gdl_http" would make it possible to access the current file's Last-Modified header as "{gdl_http[Last-Modified]}" and its parsed form as "{gdl_http[date]}".
The content of the object is as follows:
{ "version" : "string", "is_executable" : "bool", "current_git_head": "string or null" }
Each identifier can be
* A category or basecategory name ("imgur",
"mastodon")
* A (base)category-subcategory pair, where both names are separated by a
  colon ("gfycat:user"). Both names can be a * or left
  empty, matching all possible names ("*:image",
  ":user").
Note: Any blacklist setting will automatically include "oauth", "recursive", and "test".
The resulting archive file is not a plain text file but an SQLite3 database, as lookup operations are significantly faster and memory requirements significantly lower once the number of stored IDs gets reasonably large.
Note: Archive files that do not already exist get generated automatically.
Note: Archive paths support regular format string replacements, but be aware that using external inputs for building local paths may pose a security risk.
See <https://www.sqlite.org/pragma.html> for available PRAGMA statements and further details.
[
    {
        "name": "zip",
        "compression": "store"
    },
    {
        "name": "exec",
        "command": ["/home/foobar/script", "{category}", "{image_id}"]
    }
]
Unlike other options, a postprocessors setting at a deeper level does not override any postprocessors setting at a lower level. Instead, all post processors from all applicable postprocessors settings get combined into a single list.
For example
* an mtime post processor at
extractor.postprocessors,
* a zip post processor at extractor.pixiv.postprocessors,
* and using --exec
will run all three post processors - mtime, zip, exec - for each downloaded pixiv file.
{ "archive": null, "keep-files": true }
2xx codes (success responses) and 3xx codes (redirection messages) will never be retried and always count as success, regardless of this option.
5xx codes (server error responses) will always be retried, regardless of this option.
This value gets internally used as the timeout parameter for the requests.request() method.
If this is a string, it must be the path to a CA bundle to use instead of the default certificates.
This value gets internally used as the verify parameter for the requests.request() method.
Setting this to false won't download any files, but all other functions (postprocessors, download archive, etc.) will be executed as normal.
These can be specified as
* index: 3 (file number 3)
* range: 2-4 (files 2, 3, and 4)
* slice: 3:8:2 (files 3, 5, and 7)
Arguments for range and slice notation are optional
and will default to begin (1) or end (sys.maxsize) if
omitted. For example 5-, 5:, and 5:: all mean
"Start at file number 5".
Note: The index of the first file is 1.
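The notation above can be sketched in Python; this is only an illustration of how the three forms behave, not gallery-dl's actual parser:

```python
import sys

def parse_range(spec, maxsize=sys.maxsize):
    """Interpret an index, range, or slice string (illustrative only)."""
    if ":" in spec:                       # slice notation a:b:c
        parts = (spec.split(":") + ["", ""])[:3]
        start = int(parts[0]) if parts[0] else 1
        stop = int(parts[1]) if parts[1] else maxsize
        step = int(parts[2]) if parts[2] else 1
        return range(start, stop, step)
    if "-" in spec:                       # range notation a-b (inclusive)
        lo, _, hi = spec.partition("-")
        start = int(lo) if lo else 1
        stop = (int(hi) + 1) if hi else maxsize
        return range(start, stop)
    n = int(spec)                         # single index
    return range(n, n + 1)

print(list(parse_range("3:8:2")))  # [3, 5, 7]
print(list(parse_range("2-4")))    # [2, 3, 4]
```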
A file only gets downloaded when *all* of the given expressions evaluate to True.
Available values are the filename-specific ones listed by -K or -j.
See strptime for a list of formatting directives.
Note: Despite its name, this option does **not** control how {date} metadata fields are formatted. To use a different formatting for those values other than the default %Y-%m-%d %H:%M:%S, put strptime formatting directives after a colon :, for example {date:%Y%m%d}.
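The effect of such a directive can be reproduced with Python's own datetime formatting (the sample value is made up):

```python
from datetime import datetime

# A sample value as a {date} metadata field might hold it
date = datetime(2023, 4, 5, 6, 7, 8)

print(format(date, "%Y-%m-%d %H:%M:%S"))  # 2023-04-05 06:07:08 (the default)
print(format(date, "%Y%m%d"))             # 20230405 (as {date:%Y%m%d})
```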
* true: Start on users' main gallery pages and
recursively descend into subfolders
* false: Get posts from "Latest Updates" pages
This value must be divisible by 16 and gets rounded down otherwise. The maximum possible value appears to be 1920.
Setting this option to "auto" uses the same domain as a given input URL.
* true: Original ZIP archives
* false: Converted video files
It is possible to specify a custom list of metadata includes. See available_includes for possible field names. aibooru also supports ai_metadata.
Note: This requires 1 additional HTTP request per 200-post batch.
Note: Changing this setting is normally not necessary. When the value is greater than the per-page limit, gallery-dl will stop after the first batch. The value cannot be less than 1.
Setting an explicit filter ID overrides any default filters and can be used to access 18+ content without an API key.
See Filters for details.
Note: Enabling this option also enables deviantart.metadata_.
* true: Use a flat directory structure.
* false: Collect a list of all gallery-folders or
favorites-collections and transfer any further work to other extractors
(folder or collection), which will then create individual
subdirectories for each of them.
Note: Going through all gallery folders will not be able to fetch deviations which aren't in any folder.
Note: Gathering this information requires a lot of API calls. Use with caution.
Possible values are "gallery", "scraps", "journal", "favorite", "status".
It is possible to use "all" instead of listing all values separately.
* "html": HTML with (roughly) the same layout
as on DeviantArt.
* "text": Plain text with image references and HTML tags
removed.
* "none": Don't download textual content.
This option simply sets the mature_content parameter for API calls to either "true" or "false" and does not do any other form of content filtering.
Setting this option to "images" only downloads original files if they are images and falls back to preview versions for everything else (archives, etc.).
* "api": Trust the API and stop when
has_more is false.
* "manual": Disregard has_more and only stop when
a batch of results is empty.
Disable this option to *force* using a private token for all requests when a refresh token is provided.
Using a refresh-token allows you to access private or otherwise not publicly available deviations.
Note: The refresh-token becomes invalid after 3 months or whenever your cache file is deleted or cleared.
Note: This requires 0-2 additional HTTP requests per post.
Note: Changing this setting is normally not necessary. When the value is greater than the per-page limit, gallery-dl will stop after the first batch. The value cannot be less than 1.
Adds archiver_key, posted, and torrents. Makes date and filesize more precise.
* "hitomi": Download the corresponding gallery from hitomi.la
* true: Extract embed URLs and download them if
supported (videos are not downloaded).
* "ytdl": Like true, but let youtube-dl
handle video extraction and download for YouTube, Vimeo and SoundCloud
embeds.
* false: Ignore embeds.
* If this is an integer, it specifies the maximum image
dimension (width and height) in pixels.
* If this is a string, it should be one of Flickr's format
specifiers ("Original", "Large", ...
or "o", "k", "h",
"l", ...) to use as an upper limit.
* "text": Plain text with HTML tags removed
* "html": Raw HTML content
Possible values are "gallery", "scraps", "favorite".
It is possible to use "all" instead of listing all values separately.
* "auto": Automatically differentiate between
"old" and "new"
* "old": Expect the *old* site layout
* "new": Expect the *new* site layout
If a selected format is not available, the next one in the list will be tried until an available format is found.
If the format is given as string, it will be extended with ["mp4", "webm", "mobile", "gif"]. Use a list with one element to restrict it to only one possible format.
If not set, a temporary guest token will be used.
An out-of-date value will result in 401 Unauthorized errors.
Setting this value to null will do an extra HTTP request to fetch the current value used by gofile.
Possible values are "pictures", "scraps", "stories", "favorite".
It is possible to use "all" instead of listing all values separately.
Available formats are "webp" and "avif".
"original" will try to download the original jpg or png versions, but is most likely going to fail with 403 Forbidden errors.
* true: Follow Imgur's advice and choose MP4 if the
prefer_video flag in an image's metadata is set.
* false: Always choose GIF.
* "always": Always choose MP4.
(See API#Search for details)
* "rest": REST API - higher-resolution media
* "graphql": GraphQL API - lower-resolution media
Possible values are "posts", "reels", "tagged", "stories", "highlights", "avatar".
It is possible to use "all" instead of listing all values separately.
Note: This requires 1 additional HTTP request per post.
* true: Download duplicates
* false: Ignore duplicates
Available types are artist and post.
Available types are file, attachments, and inline.
Use "all" to download all available formats, or a (comma-separated) list to select multiple formats.
If the selected format is not available, the first in the list gets chosen (usually mp3).
Setting this option to "auto" uses the same domain as a given input URL.
Use true to download animated images as gifs and false to download as mp4 videos.
(See /manga/{id}/feed and /user/follows/manga/feed)
Note: gallery-dl comes with built-in tokens for mastodon.social, pawoo and baraag. For other instances, you need to obtain an access-token in order to use usernames in place of numerical user IDs.
If the selected format is not available, the next smaller one gets chosen.
Possible values are "art", "audio", "games", "movies".
It is possible to use "all" instead of listing all values separately.
Possible values are "illustration", "doujin", "favorite", "nuita".
It is possible to use "all" instead of listing all values separately.
* true: Download videos
* "ytdl": Download videos using youtube-dl
* false: Skip video Tweets
* true: Use Python's webbrowser.open() method to
automatically open the URL in the user's default browser.
* false: Ask the user to copy & paste a URL from the
  terminal.
Note: All redirects will go to port 6414, regardless of the port specified here. You'll have to manually adjust the port number in your browser's address bar when using a different port than the default.
Note: This requires 1 additional HTTP request per post.
Available types are postfile, images, image_large, attachments, and content.
Setting this option to "auto" uses the same domain as a given input URL.
Possible values are "artworks", "avatar", "background", "favorite".
It is possible to use "all" instead of listing all values separately.
Note: This requires 1 additional API call per bookmarked post.
* "japanese": List of Japanese tags
* "translated": List of translated tags
* "original": Unmodified list with both Japanese and translated
tags
These animations come as a .zip file containing all animation frames in JPEG format.
Use an ugoira post processor to convert them to watchable videos. (Example__)
Use true to download animated images as gifs and false to download as mp4 videos.
* "stop": Stop the current extractor run.
* "wait": Ask the user to solve the CAPTCHA and wait.
"auto" uses the quality parameter of the input URL or "hq" if not present.
Reddit's internal default and maximum values for this parameter appear to be 200 and 500 respectively.
The value 0 ignores all comments and significantly reduces the time required when scanning a subreddit.
Note: This requires 1 additional API call for every 100 extra comments.
Special values:
* 0: Recursion is disabled
* -1: Infinite recursion (don't do this)
Using a refresh-token allows you to access private or otherwise not publicly available subreddits, given that your account is authorized to do so, but requests to the reddit API are going to be rate limited at 600 requests every 10 minutes/600 seconds.
* true: Download videos and use youtube-dl to
handle HLS and DASH manifests
* "ytdl": Download videos and let youtube-dl
handle all of video extraction and download
* "dash": Extract DASH manifest URLs and use
youtube-dl to download and merge them. (*)
* false: Ignore videos
(*) This saves 1 HTTP request per video and might potentially be able to download otherwise deleted videos, but it will not always get the best video quality available.
If a selected format is not available, the next one in the list will be tried until an available format is found.
If the format is given as string, it will be extended with ["hd", "sd", "gif"]. Use a list with one element to restrict it to only one possible format.
To generate a token, visit /user/USERNAME/list-tokens and click Create Token.
Allows skipping over posts without having to waste API calls.
For each photo with "maximum" resolution (width equal to 2048 or height equal to 3072) or each inline image, use an extra HTTP request to find the URL to its full-resolution version.
* "abort": Raise an error and stop extraction
* "wait": Wait until rate limit reset
Possible types are text, quote, link, answer, video, audio, photo, chat.
It is possible to use "all" instead of listing all types separately.
Setting an explicit filter ID overrides any default filters and can be used to access 18+ content without an API key.
See Filters for details.
* false: Ignore cards
* true: Download image content from supported cards
* "ytdl": Additionally download video content from
unsupported cards using youtube-dl
Possible values are
* card names
* card domains
* <card name>:<card domain>
* "auto": Always auto-generate a token.
* "cookies": Use token given by the ct0 cookie if
present.
Going through a timeline with this option enabled is essentially the same as running gallery-dl https://twitter.com/i/web/status/<TweetID> with enabled conversations option for each Tweet in said timeline.
Note: This requires at least 1 additional API call per initial Tweet. Age-restricted replies cannot be expanded when using the syndication API.
Known available sizes are 4096x4096, orig, large, medium, and small.
* false: Skip age-restricted Tweets.
* true: Download using Twitter's syndication API.
* "extended": Try to fetch Tweet metadata using the
normal API in addition to the syndication API. This requires additional
HTTP requests in some cases (e.g. when retweets are enabled).
Note: This does not apply to search results (including timeline strategies). To retrieve such content from search results, you must log in and disable "Hide sensitive content" in your search settings <https://twitter.com/settings/search>.
If this option is enabled, gallery-dl will try to fetch a quoted (original) Tweet when it sees the Tweet which quotes it.
If this value is "self", only consider replies where reply and original Tweet are from the same user.
Note: Twitter will automatically expand conversations if you use the /with_replies timeline while logged in. For example, media from Tweets which the user replied to will also be downloaded.
It is possible to exclude unwanted Tweets using image-filter <extractor.*.image-filter_>.
If this value is "original", metadata for these files will be taken from the original Tweets, not the Retweets.
* "tweets": /tweets timeline + search
* "media": /media timeline + search
* "with_replies": /with_replies timeline + search
* "auto": "tweets" or
"media", depending on retweets and
text-tweets settings
This only has an effect with a metadata (or exec) post processor with "event": "post" and appropriate filename.
Special values:
* "timeline":
https://twitter.com/i/user/{rest_id}
* "tweets":
https://twitter.com/id:{rest_id}/tweets
* "media":
https://twitter.com/id:{rest_id}/media
Note: To allow gallery-dl to follow custom URL formats, set the blacklist for twitter to a non-default value, e.g. an empty string "".
* true: Download videos
* "ytdl": Download videos using youtube-dl
* false: Skip video Tweets
Available formats are "raw", "full", "regular", "small", and "thumb".
See https://wallhaven.cc/help/api for more information.
Possible values are "uploads", "collections".
It is possible to use "all" instead of listing all values separately.
Note: This requires 1 additional HTTP request per post.
Note: This requires 1 additional HTTP request per submission.
Possible values are "home", "feed", "videos", "newvideo", "article", "album".
It is possible to use "all" instead of listing all values separately.
If this value is "original", metadata for these files will be taken from the original posts, not the retweeted posts.
Set this option to "force" for the same effect as youtube-dl's --force-generic-extractor.
Note: Set quiet and no_warnings in extractor.ytdl.raw-options to true to suppress all output.
Setting this to null will try to import "yt_dlp" followed by "youtube_dl" as fallback.
{ "quiet": true, "writesubtitles": true, "merge_output_format": "mkv" }
All available options can be found in youtube-dl's docstrings <https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/YoutubeDL.py#L138-L318>.
Note: This requires 1-2 additional HTTP requests per post.
Note: This requires 1 additional HTTP request per post.
Note: This requires 1 additional HTTP request per post.
* true: Start with the latest chapter
* false: Start with the first chapter
Possible values are valid integer or floating-point numbers optionally followed by one of k, m, g, t, or p. These suffixes are case-insensitive.
* true: Write downloaded data into .part files
and rename them upon download completion. This mode additionally
supports resuming incomplete downloads.
* false: Do not use .part files and write data directly into
the actual output files.
Missing directories will be created as needed. If this value is null, .part files are going to be stored alongside the actual output files.
Set this option to null to disable this indicator.
Possible values are valid integer or floating-point numbers optionally followed by one of k, m, g, t, or p. These suffixes are case-insensitive.
Disable the use of a proxy for file downloads by explicitly setting this option to null.
For example, this will change the filename extension ({extension}) of a file called example.png from png to jpg when said file contains JPEG/JFIF data.
If the value is true, consume the response body. This avoids closing the connection and therefore improves connection reuse.
If the value is false, immediately close the connection without reading the response. This can be useful if the server is known to send large bodies for error responses.
Possible values are integer numbers optionally followed by one of k, m, g, t, or p. These suffixes are case-insensitive.
Codes 200, 206, and 416 (when resuming a partial download) will never be retried and always count as success, regardless of this option.
5xx codes (server error responses) will always be retried, regardless of this option.
Fail a download when a file does not pass instead of downloading a potentially broken file.
Note: Set quiet and no_warnings in downloader.ytdl.raw-options to true to suppress all output.
Setting this to null will first try to import "yt_dlp" and use "youtube_dl" as fallback.
Special values:
* null: generate filenames with
extractor.*.filename
* "default": use youtube-dl's default, currently
"%(title)s-%(id)s.%(ext)s"
Note: An output template other than null might cause unexpected results in combination with other options (e.g. "skip": "enumerate")
{ "quiet": true, "writesubtitles": true, "merge_output_format": "mkv" }
All available options can be found in youtube-dl's docstrings <https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/YoutubeDL.py#L138-L318>.
* "null": No output
* "pipe": Suitable for piping to other processes or files
* "terminal": Suitable for the standard Windows console
* "color": Suitable for terminals that understand ANSI
escape codes and colors
* "auto": "terminal" on Windows with
output.ansi disabled, "color" otherwise.
It is possible to use custom output format strings by setting this option to an object and specifying start, success, skip, progress, and progress-total.
For example, the following will replicate the same output as mode: color:
{
    "start"  : "{}",
    "success": "\r\u001b[1;32m{}\u001b[0m\n",
    "skip"   : "\u001b[2m{}\u001b[0m\n",
    "progress"      : "\r{0:>7}B {1:>7}B/s ",
    "progress-total": "\r{3:>3}% {0:>7}B {1:>7}B/s "
}
start, success, and skip are used to output the current filename, where {} or {0} is replaced with said filename. If a given format string contains printable characters other than that, their number needs to be specified as [<number>, <format string>] to get the correct results for output.shorten. For example
"start" : [12, "Downloading {}"]
progress and progress-total are used when displaying the download progress indicator: progress when the total number of bytes to download is unknown, progress-total otherwise.
For these format strings
* {0} is number of bytes downloaded
* {1} is number of downloaded bytes per second
* {2} is total number of bytes
* {3} is percent of bytes downloaded to total bytes
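A quick way to see how these replacement fields line up (the sample values are made up; gallery-dl formats the byte counts itself before substitution):

```python
# The default progress-total format string
fmt = "\r{3:>3}% {0:>7}B {1:>7}B/s "

# {0}: bytes downloaded, {1}: bytes/s, {2}: total bytes, {3}: percent
line = fmt.format("1.2M", "340k", "4.0M", 30)
print(repr(line))  # '\r 30%    1.2MB    340kB/s '
```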
"utf-8"
{ "encoding": "utf-8", "errors": "replace", "line_buffering": true }
Possible options are
* encoding
* errors
* newline
* line_buffering
* write_through
When this option is specified as a simple string, it is interpreted as {"encoding": "<string-value>", "errors": "replace"}
Note: errors always defaults to "replace"
Set this option to "eaw" to also work with east-asian characters with a display width greater than 1.
* true: Show the default progress indicator
("[{current}/{total}] {url}")
* false: Do not show any progress indicator
* Any string: Show the progress indicator using this as a custom
format string. Possible replacement keys are current,
total and url.
If this is a simple string, it specifies the format string for logging messages.
The default format string here is "{message}".
{
    "Pictures": ["jpg", "jpeg", "png", "gif", "bmp", "svg", "webp"],
    "Video"   : ["flv", "ogv", "avi", "mp4", "mpg", "mpeg", "3gp", "mkv", "webm", "vob", "wmv"],
    "Music"   : ["mp3", "aac", "flac", "ogg", "wma", "m4a", "wav"],
    "Archives": ["zip", "rar", "7z", "tar", "gz", "bz2"]
}
Files with an extension not listed will be ignored and stored in their default location.
* "replace": Replace/Overwrite the old version with the new one
* "enumerate": Add an enumeration index to the filename of the new version like skip = "enumerate"
* "abort:N": Stop the current extractor run after N consecutive files compared as equal.
* "terminate:N": Stop the current extractor run, including parent extractors, after N consecutive files compared as equal.
* "exit:N": Exit the program after N consecutive files compared as equal.
archive-format, archive-prefix, and archive-pragma options, akin to extractor.*.archive-format, extractor.*.archive-prefix, and extractor.*.archive-pragma, are supported as well.
* If this is a string, it will be executed using the system's shell, e.g. /bin/sh. Any {} will be replaced with the full path of a file or target directory, depending on exec.event
* If this is a list, the first element specifies the program name and any further elements its arguments. Each element of this list is treated as a format string using the files' metadata as well as {_path}, {_directory}, and {_filename}.
See metadata.event for a list of available events.
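Putting the list form and an event together, a sketch of an exec post processor entry (the script path is a placeholder):

```json
{
    "name": "exec",
    "event": "after",
    "command": ["/path/to/script", "{category}", "{_path}"]
}
```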
* "json": write metadata using
json.dump()
* "jsonl": write metadata in JSON Lines
<https://jsonlines.org/> format
* "tags": write tags separated by newlines
* "custom": write the result of applying
metadata.content-format to a file's metadata dictionary
* "modify": add or modify metadata entries
* "delete": remove metadata entries
Using "-" as filename will write all output to stdout.
If this option is set, metadata.extension and metadata.extension-format will be ignored.
Note: metadata.extension is ignored if this option is set.
The available events are:
init
  After post processor initialization and before the first file download
finalize
  On extractor shutdown, e.g. after all files were downloaded
prepare
  Before a file download
file
  When completing a file download, but before it gets moved to its target location
after
  After a file got moved to its target location
skip
  When skipping a file download
post
  When starting to download all files of a post, e.g. a Tweet on Twitter or a post on Patreon
post-after
  After downloading all files of a post
["blocked", "watching", "status[creator][name]"]
{
    "blocked"         : "***",
    "watching"        : "\fE 'yes' if watching else 'no'",
    "status[username]": "{status[creator][name]!l}"
}
Note: Only applies for "mode": "custom".
See the ensure_ascii argument of json.dump() for further details.
Note: Only applies for "mode": "json" and "jsonl".
See the indent argument of json.dump() for further details.
Note: Only applies for "mode": "json".
See the separators argument of json.dump() for further details.
Note: Only applies for "mode": "json" and "jsonl".
See the sort_keys argument of json.dump() for further details.
Note: Only applies for "mode": "json" and "jsonl".
For example, use "a" to append to a file's content or "w" to truncate it.
See the mode argument of open() for further details.
See the encoding argument of open() for further details.
archive-format, archive-prefix, and archive-pragma options, akin to extractor.*.archive-format, extractor.*.archive-prefix, and extractor.*.archive-pragma, are supported as well.
Enabling this option will only have an effect *if* there is actual mtime metadata available, that is
* after a file download ("event":
"file" (default), "event":
"after")
* when running *after* an mtime post processor for the same
  event
For example, a metadata post processor for "event": "post" will *not* be able to set its file's modification time unless an mtime post processor with "event": "post" runs *before* it.
This value must either be a UNIX timestamp or a datetime object.
Note: This option gets ignored if mtime.value is set.
The resulting value must either be a UNIX timestamp or a datetime object.
* "concat" (inaccurate frame timecodes for non-uniform frame delays)
* "image2" (accurate timecodes, requires nanosecond file timestamps, i.e. no Windows or macOS)
* "mkvmerge" (accurate timecodes, only WebM or MKV, requires mkvmerge)
"auto" will select mkvmerge if available and fall back to concat otherwise.
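A sketch of an ugoira post processor selecting one of these demuxers (assuming the option is named ffmpeg-demuxer):

```json
{
    "name"          : "ugoira",
    "extension"     : "webm",
    "ffmpeg-demuxer": "auto"
}
```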
* "auto": Automatically assign a fitting frame rate based on delays between frames.
* any other string: Use this value as argument for -r.
* null or an empty string: Don't set an explicit frame rate.
When libx264 or libx265 is used, this option automatically adds ["-vf", "crop=iw-mod(iw\\,2):ih-mod(ih\\,2)"] to the list of FFmpeg command-line arguments, reducing an odd width or height by 1 pixel to make both even.
Note: Relative paths are relative to the current download directory.
* "safe": Update the central directory file header each time a file is stored in a ZIP archive.
This greatly reduces the chance a ZIP archive gets corrupted in case the Python interpreter gets shut down unexpectedly (power outage, SIGKILL) but is also a lot slower.
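A sketch of a ZIP post processor using this behavior (assuming it is selected via a mode option):

```json
{
    "name"     : "zip",
    "mode"     : "safe",
    "extension": "zip"
}
```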
Any file in a specified directory with a .py filename extension gets imported and searched for potential extractors, i.e. classes with a pattern attribute.
Note: null references internal extractors defined in extractor/__init__.py or by extractor.modules.
Set this option to null or an invalid path to disable this cache.
For example, setting this option to "#" would allow a replacement operation to be written as Rold#new# instead of the default Rold/new/.
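With the default "/" separator, replacing spaces with underscores in a format string looks like this; with the separator set to "#", the same operation would be written as "{title:R #_#}.{extension}":

```json
"filename": "{title:R /_/}.{extension}"
```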
* If given as a string, it is parsed according to date-format.
* If given as an integer, it is interpreted as a UTC timestamp.
* If given as a single float, it will be used as that exact value.
* If given as a list of two floating-point numbers a and b, a value N will be chosen uniformly at random such that a <= N <= b (see random.uniform()).
* If given as a string, it can either represent a single float value ("2.85") or a range ("1.5-3.0").
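For example, as extractor options (the values are illustrative):

```json
{
    "sleep"        : 2.85,
    "sleep-request": [1.5, 3.0]
}
```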
Simple tilde expansion and environment variable expansion are supported.
In Windows environments, backslashes ("\") can, in addition to forward slashes ("/"), be used as path separators. Because backslashes are JSON's escape character, they themselves have to be escaped. The path C:\path\to\file.ext has therefore to be written as "C:\\path\\to\\file.ext" if you want to use backslashes.
{ "format" : "{asctime} {name}: {message}", "format-date": "%H:%M:%S", "path" : "~/log.txt", "encoding" : "ascii" }
{ "level" : "debug", "format": { "debug" : "debug: {message}", "info" : "[{name}] {message}", "warning": "Warning: {message}", "error" : "ERROR: {message}" } }
format
    General format string for logging messages or a dictionary with format strings for each loglevel.
    In addition to the default LogRecord attributes, it is also possible to access the current extractor, job, path, and keywords objects and their attributes, for example "{extractor.url}", "{path.filename}", "{keywords.title}"
    Default: "[{name}][{levelname}] {message}"
format-date
    Format string for {asctime} fields in logging messages (see strftime() directives)
    Default: "%Y-%m-%d %H:%M:%S"
level
    Minimum logging message level (one of "debug", "info", "warning", "error", "exception")
    Default: "info"
path
    Path to the output file
mode
    Mode in which the file is opened; use "w" to truncate or "a" to append (see open())
    Default: "w"
encoding
    File encoding
    Default: "utf-8"
Note: path, mode, and encoding are only applied when configuring logging output to a file.
{ "name": "mtime" }
{ "name" : "zip", "compression": "store", "extension" : "cbz", "filter" : "extension not in ('zip', 'rar')", "whitelist" : ["mangadex", "exhentai", "nhentai"] }
It is possible to set a "filter" expression similar to image-filter to only run a post-processor conditionally.
It is also possible to set a "whitelist" or "blacklist" to only enable or disable a post-processor for the specified extractor categories.
The available post-processor types are:

classify
    Categorize files by filename extension
compare
    Compare versions of the same file and replace/enumerate them on mismatch (requires downloader.*.part = true and extractor.*.skip = false)
exec
    Execute external commands
metadata
    Write metadata to separate files
mtime
    Set file modification time according to its metadata
ugoira
    Convert Pixiv Ugoira to WebM using FFmpeg
zip
    Store files in a ZIP archive
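For example, a postprocessors list combining two of these types, reusing the values from the examples above:

```json
"postprocessors": [
    {"name": "mtime"},
    {
        "name"       : "zip",
        "compression": "store",
        "extension"  : "cbz"
    }
]
```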
https://github.com/mikf/gallery-dl/issues
Mike Fährmann <mike_faehrmann@web.de>
and https://github.com/mikf/gallery-dl/graphs/contributors
2023-04-30 | 1.25.3 |