INNFEED(8) | InterNetNews Documentation | INNFEED(8) |
innfeed, imapfeed - Multi-host, multi-connection, streaming NNTP feeder
innfeed [-ChmMvxyz] [-a spool-dir] [-b directory] [-c config-file] [-d log-level] [-e bytes] [-l logfile] [-o bytes] [-p pid-file] [-s command] [-S status-file] [file]
innfeed implements the NNTP protocol for transferring news between computers. It handles the standard IHAVE protocol as well as the CHECK/TAKETHIS streaming extension. innfeed can feed any number of remote hosts at once and will open multiple connections to each host if configured to do so. The only limitations are the process limits for open file descriptors and memory.
As an alternative to using NNTP, INN can also feed articles to an IMAP server. This is done with an executable called imapfeed, which is identical to innfeed except for the delivery process. imapfeed uses two types of connections: an LMTP connection to deliver regular messages and an IMAP connection to handle control messages.
innfeed has three modes of operation: channel, funnel-file and batch.
Channel mode is used when no filename is given on the command line, the input-file keyword is not given in the config file, and the -x option is not given. In channel mode, innfeed runs with stdin connected via a pipe to innd. Whenever innd closes this pipe (and it has several reasons during normal processing to do so), innfeed will exit. It first will try to finish sending all articles it was in the middle of transmitting, before issuing a QUIT command. This means innfeed may take a while to exit depending on how slow your peers are. It never (well, almost never) just drops the connection. The recommended way to restart innfeed when run in channel mode is therefore to tell innd to close the pipe and spawn a new innfeed process. This can be done with "ctlinnd flush feed" where feed is the name of the innfeed channel feed in the newsfeeds file.
Funnel-file mode is used when a filename is given as an argument or the input-file keyword is given in the config file. In funnel-file mode, it reads the specified file for the same formatted information as innd would give in channel mode. It is expected that innd is continually writing to this file, so when innfeed reaches the end of the file, it will check periodically for new information. To prevent the funnel file from growing without bounds, you will need to periodically move the file to the side (or simply remove it) and have innd flush the file. Then, after the file is flushed by innd, you can send innfeed a SIGALRM, and it too will close the file and open the new file created by innd. Something like:
    innfeed -p <pathrun in inn.conf>/innfeed.pid my-funnel-file &
    while true; do
        sleep 43200
        rm -f my-funnel-file
        ctlinnd flush funnel-file-site
        kill -ALRM `cat <pathrun>/innfeed.pid`
    done
Batch mode is used when the -x flag is used. In batch mode, innfeed will ignore stdin, and will simply process any backlog created by a previously running innfeed. This mode is not normally needed as innfeed will take care of backlog processing.
innfeed expects a couple of things to be able to run correctly: a directory where it can store backlog files and a configuration file to describe which peers it should handle.
The configuration file is described in innfeed.conf(5). The -c option can be used to specify a different file. For each peer (say, "foo"), innfeed manages up to 4 files in the backlog directory:
You should never alter the foo.input or foo.output files of a running innfeed. The format of these last three files is one of the following:
    /path/to/article <message-id>
    @token@ <message-id>
This is the same as the first two fields of the lines innd feeds to innfeed, and the same as the first two fields of the lines of the batch file innd will write if innfeed is unavailable for some reason. When innfeed processes its own batch files, it ignores everything after the first two whitespace separated fields, so moving the innd-created batch file to the appropriate spot will work, even though the lines have extra fields.
The first field can also be a storage API token. The two types of lines can be intermingled; innfeed will use the storage manager if appropriate, and otherwise treat the first field as a filename to read directly.
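As a rough illustration of the line format described above, the following shell sketch keeps only the first two whitespace-separated fields of each line and labels the first field as either a storage API token or a file name. The function name backlog_fields and the rule that a token is any field that starts and ends with "@" are assumptions made for this example; this is not innfeed's own parsing code.

```shell
#!/bin/sh
# Hypothetical helper: read backlog-style lines on stdin and print
# "<kind> <first-field> <message-id>", ignoring any extra fields.
# Assumption: a storage API token begins and ends with "@".
backlog_fields() {
    awk 'NF >= 2 {
        kind = ($1 ~ /^@.*@$/) ? "token" : "file"
        print kind, $1, $2
    }'
}

# Example input: a file-style line with extra fields (as in an
# innd-created batch file) and a token-style line.
printf '%s\n' \
    '/path/to/article <id1@example.com> peer1 peer2' \
    '@token@ <id2@example.com>' |
    backlog_fields
```

Lines with fewer than two fields are skipped silently in this sketch; a real tool would probably want to report them.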
innfeed writes its current status to the file innfeed.status (or the file given by the -S option). This file contains details on the process as a whole, and on each peer this instance of innfeed is managing.
If innfeed is told to send an article to a host it is not managing, then the article information will be put into a file matching the pattern innfeed-dropped.*, with part of the file name matching the pid of the innfeed process that is writing to it. innfeed will not process this file except to write to it. If nothing is written to the file, then it will be removed if innfeed exits normally. Otherwise, the file remains, and procbatch can be invoked to process it afterwards.
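To check whether any dropped-article files have been left behind, something like the following sketch can be used. The function name list_dropped is hypothetical, and the directory to search is passed in by the caller because the text above does not say where these files are written. The sketch only lists non-empty files; it does not invoke procbatch.

```shell
#!/bin/sh
# Hypothetical helper: list non-empty innfeed-dropped.* files found
# under the directory given as $1. Which directory innfeed writes
# these files to is an assumption left to the caller.
list_dropped() {
    find "$1" -type f -name 'innfeed-dropped.*' -size +0 | sort
}
```

Each file it prints can then be handed to procbatch as described in procbatch(8).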
Upon receipt of a SIGALRM, innfeed will close the funnel file specified on the command line, and will reopen it (see funnel file description above).
innfeed will catch SIGINT and will write a large debugging snapshot of the state of the running system.
innfeed will catch SIGHUP and will reload both the config and the log files. See innfeed.conf(5) for more details.
innfeed will catch SIGCHLD and will close and reopen all backlog files.
innfeed will catch SIGTERM and will do an orderly shutdown.
Upon receipt of a SIGUSR1, innfeed will increment the debugging level by one; receipt of a SIGUSR2 will decrement it by one. The debugging level starts at zero unless the -d option is used; at level zero, no debugging information is emitted. A larger value for the level means more debugging information. Numbers up to 5 are currently useful.
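The signals described above are normally sent with kill(1), using the pid file written by the -p option. The wrapper below is a minimal sketch of that: the function name innfeed_signal, its action names and the DRY_RUN variable are all invented for this example, while the signal meanings are the ones listed above. By default it only prints the command it would run.

```shell
#!/bin/sh
# Hypothetical wrapper around kill(1). The action names are invented
# for this example; the signal meanings are those described above.
innfeed_signal() {
    action=$1 pidfile=$2
    case $action in
        reopen-funnel)  sig=ALRM ;;  # close and reopen the funnel file
        snapshot)       sig=INT  ;;  # write a debugging snapshot
        reload)         sig=HUP  ;;  # reload config and log files
        reopen-backlog) sig=CHLD ;;  # close and reopen backlog files
        shutdown)       sig=TERM ;;  # orderly shutdown
        debug-up)       sig=USR1 ;;  # raise debugging level by one
        debug-down)     sig=USR2 ;;  # lower debugging level by one
        *) echo "unknown action: $action" >&2; return 2 ;;
    esac
    pid=$(cat "$pidfile") || return 1
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "kill -$sig $pid"       # default: only show the command
    else
        kill -"$sig" "$pid"
    fi
}
```

For example, "innfeed_signal debug-up <pathrun>/innfeed.pid" prints the kill command that would raise the debugging level; setting DRY_RUN=0 in the environment makes the function actually send the signal.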
There are 3 different categories of syslog entries for statistics: host, connection and global.
The host statistics are generated for a given peer at regular intervals after the first connection is made (or, if the remote is unreachable, after spooling starts). The host statistics give totals over all connections that have been active during the given time frame. For example (broken here to fit the page, with "vixie" being the peer):
May 23 12:49:08 news innfeed[16015]: vixie checkpoint seconds 1381 offered 2744 accepted 1286 refused 1021 rejected 437 missing 0 accsize 8506220 rejsize 142129 spooled 990 on_close 0 unspooled 240 deferred 10/15.3 requeued 25 queue 42.1/100:14,35,13,4,24,10
The meanings of these fields are:
In the deferred field, the second number is the average (mean) size of deferred articles during the previous logging interval.
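Because each statistics line is a sequence of field names followed by their values, individual fields are easy to pull out with awk. The sketch below uses the example host line shown above; the function name stat_field is hypothetical, and the acceptance percentage it prints is just one possible derived figure, not something innfeed itself reports.

```shell
#!/bin/sh
# Hypothetical helper: print the value that follows a named field in
# an innfeed statistics line read from stdin.
stat_field() {
    awk -v name="$1" '{
        for (i = 1; i < NF; i++)
            if ($i == name) { print $(i + 1); exit }
    }'
}

line='May 23 12:49:08 news innfeed[16015]: vixie checkpoint seconds 1381 offered 2744 accepted 1286 refused 1021 rejected 437 missing 0 accsize 8506220 rejsize 142129 spooled 990 on_close 0 unspooled 240 deferred 10/15.3 requeued 25 queue 42.1/100:14,35,13,4,24,10'

offered=$(printf '%s\n' "$line" | stat_field offered)
accepted=$(printf '%s\n' "$line" | stat_field accepted)

# Share of offered articles that the peer accepted, as a percentage.
# Nothing is printed when no articles were offered.
awk -v o="$offered" -v a="$accepted" \
    'BEGIN { if (o > 0) printf "%.1f%%\n", 100 * a / o }'
```

The helper matches on the field name alone, so it would be confused by a peer whose name is the same as a field name; a more careful version would start scanning after the "checkpoint" keyword.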
If the -z option is used (see below), then when the peer stats are generated, each connection will log its stats too. For example, for connection number zero (from a set of five):
May 23 12:49:08 news innfeed[16015]: vixie:0 checkpoint seconds 1381 offered 596 accepted 274 refused 225 rejected 97 accsize 773623 rejsize 86591
If you only open a maximum of one connection to a remote, then there will be a close correlation between connection numbers and host numbers, but in general you cannot tie the two sets of numbers together in any easy or very meaningful way. When a connection closes, it will always log its stats.
If all connections for a host get closed together, then the host logs its stats as "final" and resets its counters. If the feed is so busy that there is always at least one connection open and running, then after some amount of time (set via the config file), the host stats are logged as final and reset. This makes it easier for other programs to generate higher-level statistics from the log files.
There is one log entry that is emitted for a host just after its last connection closes and innfeed is preparing to exit. This entry contains counts over the entire life of the process. The "seconds" field is from the first time a connection was successfully built, or the first time spooling started. If a host has been completely idle, it will have no such log entry.
May 23 12:49:08 news innfeed[16015]: decwrl global seconds 1381 offered 34 accepted 22 refused 3 rejected 7 missing 0 accsize 81277 rejsize 12738 spooled 0 unspooled 0
The final log entry is emitted immediately before exiting. It contains a summary of the statistics over the entire life of the process.
Feb 13 14:43:41 news innfeed[22344]: ME global seconds 15742 offered 273441 accepted 45750 refused 222008 rejected 3334 missing 217 accsize 93647166 rejsize 7421839 spooled 10 unspooled 0
innfeed takes the following options.
Note that running innfeed with -y and no peer defined in innfeed.conf causes innfeed to drop the first article.
When using the -x option, the config file entry's initial-connections field will be the total number of connections created and used, no matter how big the batch file is, and no matter how large a value the max-connections field specifies. Thus a value of 0 for initial-connections means nothing will happen in -x mode.
innfeed does not automatically grab the file out of pathoutgoing. This needs to be prepared for it by external means.
Probably too many other bugs to count.
An alternative to innfeed can be innduct, maintained by Ian Jackson and available at <http://www.chiark.greenend.org.uk/ucgi/~ian/git-manpage/innduct.git/innduct.8>. It is intended to solve a design issue in the way innfeed works. As a matter of fact, the program feed protocol spoken between innd and innfeed is lossy: if innfeed dies unexpectedly, articles which innd has written to the pipe to innfeed will be skipped. innd has no way of telling which articles those are, keeps no useful records, and makes no attempt to resend these articles.
Written by James Brister <brister@vix.com> for InterNetNews. Converted to POD by Julien Elie.
Earlier versions of innfeed (up to 0.10.1) were shipped separately; innfeed is now part of INN and shares the same version number.
ctlinnd(8), inn.conf(5), innfeed.conf(5), innd(8), procbatch(8).
2023-09-06 | INN 2.7.1 |