URLWATCH-JOBS(5) | urlwatch 2.25 Documentation | URLWATCH-JOBS(5) |
urlwatch-jobs - Job types and configuration for urlwatch
urlwatch --edit
Jobs are the kind of things that urlwatch(1) can monitor.
The list of jobs to run is contained in the configuration file urls.yaml, which can be edited with the command urlwatch --edit. Jobs are separated from each other by a line containing only ---. The command urlwatch --list prints the name of each job along with its index number (1, 2, 3, ...), which is assigned automatically according to the job's position in the configuration file.
While optional, it is recommended that each job starts with a name entry:
name: "This is a human-readable name/label of the job"
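As a minimal sketch of the layout described above, a urls.yaml holding two jobs could look like the following. Both jobs reuse examples from this page. The comments only illustrate the position-based index numbers that urlwatch --list would report; they are not required syntax.

```yaml
# Job 1: first job in the file
name: "urlwatch homepage"
url: "https://thp.io/2008/urlwatch/"
---
# Job 2: second job in the file, separated by a line containing only ---
name: "What is in my Home Directory?"
command: "ls -al ~"
```

Because index numbers follow the order in the file, inserting a new job above an existing one shifts the index of every job after it.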
The following job types are available:
The "URL" job type is the main job type. It retrieves a document from a web server:
name: "urlwatch homepage"
url: "https://thp.io/2008/urlwatch/"
Required keys:
url: the URL of the document to watch for changes
Job-specific optional keys:
(Note: url implies kind: url)
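The note above only says that the kind is implied. Assuming the kind key may also be written out explicitly, the "urlwatch homepage" job would be equivalent to this sketch:

```yaml
name: "urlwatch homepage"
kind: url   # assumption: stated explicitly here; normally implied by the url key
url: "https://thp.io/2008/urlwatch/"
```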
The "Browser" job type is a resource-intensive variant of "URL" for web pages that require JavaScript to render the content to be monitored.
The optional pyppeteer package must be installed to run "Browser" jobs (see dependencies).
At the moment, the Chromium version used by pyppeteer only supports macOS (x86_64), Windows (both x86 and x64) and Linux (x86_64). See this issue <https://github.com/pyppeteer/pyppeteer/issues/155> in the Pyppeteer issue tracker for progress on getting ARM devices supported (e.g. Raspberry Pi).
Because pyppeteer downloads a special version of Chromium (about 100 MiB), the first execution of a browser job can take some time (and bandwidth). To avoid this delay, run pyppeteer-install beforehand to pre-download Chromium.
name: "A page with JavaScript"
navigate: "https://example.org/"
Required keys:
navigate: the URL to open in the browser
Job-specific optional keys:
As this job uses Pyppeteer <https://github.com/pyppeteer/pyppeteer> to render the page in a headless Chromium instance, it requires massively more resources than a "URL" job. Use it only on pages where url does not give the right results.
Hint: in many cases you do not need a "Browser" job at all. If the site calls an API during page loading and that API's output contains the information you are after, you can monitor the API directly with the much faster "URL" job type.
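As a sketch of the hint above, suppose the page at https://example.org/ loads its data from a JSON endpoint. The endpoint URL below is purely hypothetical; the real one has to be found by inspecting the requests the page makes, for example with the network tab of a browser's developer tools.

```yaml
name: "A page with JavaScript (monitored via its API)"
# Hypothetical endpoint: replace with the URL the site actually requests
url: "https://example.org/api/data.json"
```

This avoids starting a headless Chromium instance for every check.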
(Note: navigate implies kind: browser)
The "Shell" job type lets you watch the output of arbitrary shell commands. This is useful for, e.g., monitoring an FTP upload folder or the output of scripts that query external devices (RPi GPIO).
name: "What is in my Home Directory?"
command: "ls -al ~"
Required keys:
command: the shell command to execute
Job-specific optional keys:
(Note: command implies kind: shell)
By default, urlwatch captures stderr for error reporting when a shell job fails (non-zero exit code), but ignores anything written to stderr when the job exits with exit code 0.
This behavior can be customized using the stderr key:
For example, this job definition will make the job appear as failed, even though the script exits with exit code 0:
command: |
  echo "Normal standard output."
  echo "Something goes to stderr, which makes this job fail." 1>&2
  exit 0
stderr: fail
On the other hand, if you want to diff both stdout and stderr of the job, use this:
command: |
  echo "An important line on stdout."
  echo "Another important line on stderr." 1>&2
stderr: stdout
The main configuration file has a job_defaults key that can be used to configure keys for all jobs at once.
See urlwatch-config(5) for how to configure job defaults.
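The authoritative layout of job_defaults is described in urlwatch-config(5). As a rough sketch, assuming job_defaults is divided into sub-keys per job kind and that a shell sub-key exists, applying stderr: stdout to every shell job might look like this in the main configuration file:

```yaml
job_defaults:
  shell:              # assumption: defaults grouped per job kind
    stderr: stdout    # diff stdout and stderr for all shell jobs
```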
See urlwatch-cookbook(7) for example job configurations.
$XDG_CONFIG_HOME/urlwatch/urls.yaml
2022 Thomas Perl
March 15, 2022 | urlwatch 2.25 |