archmbox(1)
archmbox - a simple email archiver
archmbox [ -h | --version ]
archmbox MODE [ OPTIONS ] -d date mailbox [ mailbox ... ]
archmbox MODE [ OPTIONS ] -o days mailbox [ mailbox ... ]
Archmbox is a simple email archiver written in Perl; it parses one or more mailboxes, selects some or all messages and then performs specific actions on the selected messages.
Four different MODES are available:
Message selection is based on a date criterion; either an absolute date or an offset in days can be specified.
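The two forms of the date criterion can be pictured with a minimal Python sketch. This is not archmbox's own code (archmbox is written in Perl); the function names are hypothetical, and the strict "older than" comparison is an assumption made for illustration.

```python
from datetime import datetime, timedelta

def cutoff_from_date(date_text):
    """Absolute date in YYYY-MM-DD form, as used with -d in the examples."""
    return datetime.strptime(date_text, "%Y-%m-%d")

def cutoff_from_offset(days, now):
    """Offset in days, as used with -o in the examples, counted back from `now`."""
    return now - timedelta(days=days)

def is_selected(received, cutoff):
    # Assumption: a message is selected when it is strictly older than the cutoff.
    return received < cutoff

now = datetime(2002, 1, 16, 12, 0, 0)
cutoff = cutoff_from_offset(15, now)
old_message = datetime(2001, 12, 24, 9, 30, 0)
new_message = datetime(2002, 1, 10, 9, 30, 0)
```

With these values, old_message falls before the cutoff and would be selected, while new_message would not.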
It is also possible to refine the selection using Perl regular expressions on the header fields of the message. Remember to quote (escape with a backslash) the so-called metacharacters, which are reserved for use in Perl's regex notation, whenever you want them to match literally. The metacharacters are
{}[]()^$.|*+?\
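To see why quoting matters, here is a small Python sketch. Python's re module is used only as a stand-in: it treats the characters listed above the same way Perl does in this example, but it is not what archmbox runs.

```python
import re

subject = "[SNORT] portscan detected"

# Unquoted, the brackets form a character class: the pattern means
# "one of the letters S, N, O, R, T at the start of the text".
unquoted = re.compile(r"^[SNORT]")

# Quoted with backslashes, the brackets are matched literally.
quoted = re.compile(r"^\[SNORT\]")

# re.escape() adds the backslashes automatically.
escaped = re.escape("[SNORT]")

matches_unquoted_tag = bool(unquoted.search(subject))       # subject starts with "["
matches_quoted_tag = bool(quoted.search(subject))
matches_unrelated = bool(unquoted.search("Sales report"))   # starts with "S"
```

The unquoted pattern misses the intended subject and matches an unrelated one, which is the kind of mistake the quoting rule prevents.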
All archived messages are stored in a new mailbox with the same name as the original one plus the extension .archived (this is the default, but it can be changed); the archive mailbox can also be saved in gz or bz2 compressed format.
Please note that the archive mailbox format is always mbox, regardless of the original mailbox format. Moreover, mailboxes must be specified using the full path.
Messages are appended to the archive mailbox to allow multiple executions of the script against the same mailbox.
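The effect of appending can be illustrated with Python's standard mailbox module. This is only a sketch of the idea, not archmbox's implementation; the helper names and the sample messages are hypothetical.

```python
import mailbox
import os
import tempfile
from email.message import EmailMessage

def append_to_archive(archive_path, messages):
    """Append messages to an mbox archive, creating the file if needed."""
    archive = mailbox.mbox(archive_path)  # create=True is the default
    try:
        for message in messages:
            archive.add(message)
        archive.flush()
        return len(archive)  # total number of messages now in the archive
    finally:
        archive.close()

def make_message(subject):
    message = EmailMessage()
    message["From"] = "sender@example.com"
    message["Subject"] = subject
    message.set_content("body\n")
    return message

workdir = tempfile.mkdtemp()
archive_path = os.path.join(workdir, "personal-stuff.archived")

# Two separate runs against the same archive: the second run adds to the first.
first_run = append_to_archive(archive_path, [make_message("one")])
second_run = append_to_archive(archive_path, [make_message("two"), make_message("three")])
```

Because the second run adds to the existing file instead of overwriting it, messages archived by earlier runs are preserved.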
Archmbox is written entirely in Perl, but it relies on some shell helpers (fuser, rm, gzip/gunzip, etc.) to do its job.
The paths of the helpers (both required and optional) are probed at installation time. If a required helper is missing, the installation is aborted. If an optional helper is missing, the feature that depends on it will be unavailable, but the script will be installed anyway.
All other relevant configuration options can be specified at installation time or at run time using the command line switches.
A complete example:
archmbox -a -b -c -e 01 -f -d 2002-01-01 -p ~/mail-archive ~/Mail/personal-stuff
This will archive all messages older than (that is, received before) January 1st, 2002 from the personal-stuff mailbox in the Mail directory. Archived messages are saved in a mailbox called Mail-personal-stuff.01.gz in the ~/mail-archive directory. After execution, you'll find a mailbox called personal-stuff.backup in ~/Mail.
Complex examples, using perl regular expressions:
archmbox -a -o 1 --keep-flagged --keep-unread \
-x From='(nagios|arpwatch|logcheck)@host\.net' \
-x Subject='^(Security Events|Syslog Summary|\[SNORT\])' \
~/Mail/inbox
This will archive all unflagged, read messages older than 1 day from the mailbox ~/Mail/inbox whose sender address matches nagios@host.net, arpwatch@host.net or logcheck@host.net, or whose subject field starts with 'Security Events', 'Syslog Summary' or '[SNORT]'. Messages will be saved in inbox.archived in the directory from which archmbox was started.
archmbox --archive --offset 1 --keep-flagged --keep-unread \
--Regexp From='@(host1|host2)\.example\.com' \
--regexp Subject='^(Security Events|Syslog Summary|\[SNORT\])' \
--archive-path ~/Mail/local-network.archive \
--archive-name system-msgs \
--extension 'none' \
~/Mail/inbox
This will archive all unflagged, read messages older than 1 day where the sender address matches @host1.example.com or @host2.example.com and whose subject field starts with either 'Security Events' or 'Syslog Summary' or '[SNORT]' from the mailbox ~/Mail/inbox. Messages will be archived to the mbox system-msgs in the directory ~/Mail/local-network.archive.
Some simpler examples:
archmbox -a -o 15 ~/Mail/personal-stuff
This will archive all messages older than 15 days in personal-stuff.archived (uncompressed mailbox).
archmbox -a -r -o 15 ~/Mail/personal-stuff
The same as above, but only messages newer than 15 days will be archived.
archmbox -k -o 15 ~/Mail/personal-stuff
This will delete all messages older than 15 days from ~/Mail/personal-stuff.
archmbox -a -o 15 ~/Mail/* -c
This will archive all messages older than 15 days in every mailbox found in ~/Mail. All the archive mailboxes will be compressed.
archmbox -l -r -c /tmp/mbox -o 20
List all messages in /tmp/mbox which are newer than 20 days. Option -c is meaningless in list mode and is therefore ignored.
archmbox -l -r -c /tmp/mbox -o 20 -a --bzip2
Same as above, but archiving is forced (-a) and bzip2 is used for compression.
archmbox -a -x Subject='archmbox' -o 7 ~/mbox
Select for archiving all messages older than 7 days whose subject field satisfies the regexp match Subject =~ /archmbox/ (the header name Subject is case sensitive; the pattern archmbox is matched case insensitively).
archmbox -l -x Subject='archmbox' -x From='fritz' -o 7 ~/mbox
List all messages older than 7 days whose subject field contains archmbox or whose sender is fritz (matches are case insensitive).
archmbox -l -x Subject='archmbox' -X From='fritz' -o 7 ~/mbox
List all messages older than 7 days whose subject field contains archmbox and whose sender is fritz (matches are case insensitive).
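The difference between the "or" and "and" forms in the last two examples can be sketched in Python. This is an illustration only: the function and parameter names are hypothetical, Python's re module stands in for Perl's regex engine, and the rule used when both kinds of criteria are combined (any "or" criterion may match, every "and" criterion must match) is an assumption that merely agrees with the examples in this page.

```python
import re

def matches(headers, any_of=(), all_of=()):
    """headers: dict mapping header name to value.
    any_of / all_of: sequences of (header name, pattern) pairs."""
    def hit(name, pattern):
        # Header names are looked up exactly; patterns ignore case.
        value = headers.get(name, "")
        return re.search(pattern, value, re.IGNORECASE) is not None

    # Assumption: "or" pairs need at least one match, "and" pairs must all match.
    or_ok = any(hit(name, pattern) for name, pattern in any_of) if any_of else True
    and_ok = all(hit(name, pattern) for name, pattern in all_of)
    return or_ok and and_ok

message = {"From": "Fritz <fritz@example.com>", "Subject": "weekly report"}

# Like: -x Subject='archmbox' -x From='fritz'
either = matches(message, any_of=[("Subject", "archmbox"), ("From", "fritz")])

# Like: -x Subject='archmbox' -X From='fritz'
both = matches(message, any_of=[("Subject", "archmbox")], all_of=[("From", "fritz")])
```

The sample message is sent by fritz but does not mention archmbox in its subject, so it satisfies the first selection and not the second.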
archmbox -a -o 5 -R /tmp/mbox ~/Mail
archmbox will archive all messages older than five days in /tmp/mbox. It then parses all mailboxes stored in ~/Mail (recursion is active, and ~/Mail is a directory). If ~/Mail contains one or more directories, those directories are explored as well.
archmbox -a -o -1 ~/Mail/my_mbx_mailbox --format mbx
archmbox archives all messages stored in my_mbx_mailbox and puts them into my_mbx_mailbox.archived. The source mailbox is an mbx mailbox (--format mbx is used). The archive mailbox will be an mbox mailbox.
When the script has to decide whether a message should be selected from the mailbox, it looks at the From line generated by the mail server (this is the first line of the message) and ignores the date specified by the sender's mail client. This is useful to avoid removing messages sent from misconfigured mail clients. This behavior can be changed by forcing the use of the "Date:" header (option -D).
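The two dates can differ widely, as the following Python sketch shows. It is not taken from archmbox; the sample message, the function names and the assumed layout of the From line are all illustrative.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime

# A message delivered in 2002 by a client whose clock was never set.
raw_message = (
    "From sender@example.com Tue Jan  1 12:00:00 2002\n"
    "Date: Mon, 5 Jan 1970 08:00:00 +0000\n"
    "Subject: clock not set\n"
    "\n"
    "body\n"
)

def envelope_date(raw):
    """Date from the 'From ' line written when the message was delivered."""
    first_line = raw.split("\n", 1)[0]
    # Assumed layout: "From <address> <weekday> <month> <day> <time> <year>"
    _marker, _address, stamp = first_line.split(" ", 2)
    return datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")

def header_date(raw):
    """Date from the 'Date:' header written by the sender's mail client."""
    for line in raw.split("\n"):
        if line == "":
            break  # blank line: end of the headers
        if line.startswith("Date:"):
            return parsedate_to_datetime(line[len("Date:"):].strip())
    return None

delivered = envelope_date(raw_message)
claimed = header_date(raw_message)
```

Selecting on the delivery date treats this message as received in 2002, whereas selecting on the "Date:" header would treat it as more than thirty years older.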
Not all options are meaningful in all modes; for example, compression is meaningless in list or kill mode. If you specify an option that is useless for a particular mode, archmbox simply ignores it.
Archmbox uses a working directory to store temporary mailboxes. A default value for that directory is hard coded in the script, but it can be changed during the configuration/installation process (see INSTALL for details). Your mailboxes may be too big for the partition holding this temporary directory, or you may want to archive too many mailboxes at the same time. In other words, you may run out of space. Use the -t option to specify a suitable working directory at run time.
If you notice differences in the size of a mailbox, keep in mind that your mailbox may contain a special message (512 bytes in size) with internal information related to the mailbox. This message is meaningless to you, but archmbox recognizes it and reports it. That message is left untouched in your source mailbox.
A few words about locking. There has been a discussion about how archmbox handles file locking. The answer is simple: no mailbox is ever locked. The reason behind this behavior is that I want archmbox to be as unobtrusive as possible, so other kinds of checks are performed to ensure that no data is lost (mailbox has changed/mailbox is in use by another program). I will surely add some locking mechanism in the future.
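This page does not describe how the "mailbox has changed" check is performed. One generic way to detect a change without locking is to compare the file's size and modification time before and after the work, as in the Python sketch below. This is an assumption about the general technique, not a description of archmbox's code, and the function names are hypothetical.

```python
import os
import tempfile

def snapshot(path):
    """Record the size and modification time of a mailbox file."""
    info = os.stat(path)
    return (info.st_size, info.st_mtime_ns)

def unchanged_since(path, before):
    """True if the file still has the size and mtime recorded in `before`."""
    return snapshot(path) == before

# Stand-in for a mailbox file.
handle, path = tempfile.mkstemp()
os.close(handle)

before = snapshot(path)
still_same = unchanged_since(path, before)

# Simulate a new delivery while the mailbox is being processed.
with open(path, "a") as mailbox_file:
    mailbox_file.write("From sender@example.com Tue Jan  1 12:00:00 2002\n\nbody\n")

changed = not unchanged_since(path, before)
os.remove(path)
```

A program using such a check would abandon or retry its work when a change is detected, rather than writing back a mailbox that no longer reflects what is on disk.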
You don't need to execute archmbox as root; just make sure you have write permissions for the directories you use.
Archmbox can be downloaded from:
http://adc-archmbox.sourceforge.net
Archmbox is distributed under the terms of the GPL
Copyright (C) 2001-2005
Alessandro Dotti Contra <adotti@users.sourceforge.net>
Parts of the code were contributed by:
Alex Aminoff, Brian Medley, Buck Holsinger, Davor Ocelic, Fabrice Noilhan, Jayanth Varma, Juergen Edner, Laurent Cheylus, Nicolas Ecarnot, Paco Regodon, Scott Thompson, Juergen Desher.
The FreeBSD port is maintained by Talal Al-Dik.
The OpenDarwin port is maintained by Markus Weissman.
The Debian package is maintained by Alberto Furia <straluna@email.it>.
Please report bugs to <adotti@users.sourceforge.net>