PUBLIC-INBOX-EXTINDEX(1) public-inbox user manual PUBLIC-INBOX-EXTINDEX(1)

public-inbox-extindex - create and update external search indices

public-inbox-extindex [OPTIONS] EXTINDEX_DIR INBOX_DIR...

public-inbox-extindex [OPTIONS] [EXTINDEX_DIR] --all

public-inbox-extindex creates and updates an external search and overview database used by the read-only public-inbox PSGI (HTTP), NNTP, and IMAP interfaces. This requires either the Search::Xapian XS bindings OR the Xapian SWIG bindings, along with DBD::SQLite and DBI Perl modules.

These switches behave as they do for public-inbox-index(1).

--all
Index all "publicinbox" entries in "PI_CONFIG".

"publicinbox" entries indexed by "public-inbox-extindex" can have full Xapian searching abilities with the per-"publicinbox" "indexlevel" set to "basic" and their respective Xapian ("xap15" or "xapian15") directories removed. For multiple public-inboxes where cross-posting is common, this allows significant space savings on Xapian indices.

Perform garbage collection instead of indexing. Use this if inboxes are removed from the extindex, or if messages are purged or removed from some inboxes.

--reindex
Forces a re-index of all messages in the extindex. This can be used for in-place upgrades and bugfixes while read-only server processes are using the index. Keep in mind this roughly doubles the size of the already-large Xapian database.

The extindex locks will be released roughly every 10s to allow public-inbox-mda(1) and public-inbox-watch(1) processes to write to the extindex.

Used with "--reindex", it will only look for new and stale entries and not touch already-indexed messages.
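
The invocations described above can be sketched as follows. This is only an illustration built from the synopsis: the paths are placeholders, and the "run" helper is a hypothetical dry-run wrapper (not part of public-inbox) that prints each command unless RUN=1 is set, so the sequence can be reviewed before a potentially expensive "--reindex" is started.

```shell
# Placeholder paths; substitute your own directories.
EXTINDEX_DIR=/path/to/extindex_dir

# Hypothetical dry-run helper: print the command unless RUN=1 is set.
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Index two specific inboxes into one extindex.
run public-inbox-extindex "$EXTINDEX_DIR" /path/to/inbox1 /path/to/inbox2

# Index every "publicinbox" entry found in PI_CONFIG.
run public-inbox-extindex "$EXTINDEX_DIR" --all

# Force an in-place re-index of all messages in the extindex.
run public-inbox-extindex --reindex "$EXTINDEX_DIR" --all
```

Running the block with RUN=1 in the environment executes the commands instead of printing them.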

public-inbox-extindex-format(5)

public-inbox-extindex does not currently write to the public-inbox-config(5) file; configuration may be entered manually. The extindex name of "all" is a special case which corresponds to indexing "--all" inboxes. An example for "--all" is as follows:

        [extindex "all"]
                topdir = /path/to/extindex_dir
                url = all
                coderepo = foo
                coderepo = bar

See public-inbox-config(5) for more details.
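
As an alternative to editing the file by hand, the same section can be written with git-config(1), since the public-inbox config file uses the git-config format. The sketch below writes to a scratch file (the "cfg" variable is an assumption for illustration); point it at your real config file once the result looks right.

```shell
# Scratch file for illustration; replace with your real config path.
cfg=$(mktemp)

git config --file "$cfg" extindex.all.topdir /path/to/extindex_dir
git config --file "$cfg" extindex.all.url all
# "coderepo" appears twice in the example, so append each value with --add.
git config --file "$cfg" --add extindex.all.coderepo foo
git config --file "$cfg" --add extindex.all.coderepo bar

# Show the resulting [extindex "all"] entries.
git config --file "$cfg" --get-regexp '^extindex\.all\.'
```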

PI_CONFIG
Used to override the default "~/.public-inbox/config" value.

XAPIAN_FLUSH_THRESHOLD
The number of documents to update before committing changes to disk. This environment variable is handled directly by Xapian; refer to the Xapian API documentation for more details.

Setting "XAPIAN_FLUSH_THRESHOLD" or "publicinbox.indexBatchSize" for a large "--reindex" may cause public-inbox-mda(1), public-inbox-learn(1) and public-inbox-watch(1) tasks to wait for long and unpredictable periods of time during "--reindex".

Default: none, uses "publicinbox.indexBatchSize"
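
A minimal sketch of passing both environment variables for a single run is shown below. The "extindex_all" function is a hypothetical helper, not part of public-inbox, and the argument values are placeholders. Given the caveat above, a wrapper like this is better suited to an initial indexing pass than to a large "--reindex".

```shell
# Hypothetical helper, not part of public-inbox.
# usage: extindex_all CONFIG_FILE EXTINDEX_DIR FLUSH_THRESHOLD
extindex_all() {
    # env(1) sets both variables for this one invocation only,
    # leaving the calling shell's environment untouched.
    env PI_CONFIG="$1" XAPIAN_FLUSH_THRESHOLD="$3" \
        public-inbox-extindex "$2" --all
}
```

For example, "extindex_all /path/to/alt-config /path/to/extindex_dir 5000" would index all inboxes listed in the alternate config file, committing to disk every 5000 documents.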

Occasionally, public-inbox will update its schema version and require a full re-index by running this command.

Feedback welcome via plain-text mail to <mailto:meta@public-inbox.org>

The mail archives are hosted at <https://public-inbox.org/meta/> and <http://4uok3hntl7oi7b4uf4rtfwefqeexfzil2w6kgk2jn5z2f764irre7byd.onion/meta/>

Copyright all contributors <mailto:meta@public-inbox.org>

License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>

Search::Xapian, DBD::SQLite

1993-10-02 public-inbox.git