MAILDIRSYNC(1) | User Contributed Perl Documentation | MAILDIRSYNC(1) |
maildirsync - Online synchronizer for Maildir-format mailboxes
maildirsync.pl [ --recursive ] [ --backup path ] [ --backup-tree ] \
    [ --bzip2=bzip2 ] [ --gzip=gzip ] [ --maildirsync=maildirsync ] \
    [ --rsh=ssh ] [ --verbose ] [ --alg=md5 ] [ --delete-before ] \
    [ --exclude=^/Folder1 ] [ --exclude=^/Fold.*er2 ] \
    [ --exclude-source=^/Folder3 ] [ --exclude-target=^/Folder4 ] \
    [ --rename="s/SourceFolder/TargetFolder/" ] \
    [ --destination=win|lin ] \
    [ -r ] [ -b path ] [ -B ] [ -R ssh ] [ -v ] [ -a ] [ -d ] \
    source dest state_file.bz2
A simple two-way synchronization:
maildirsync.pl -rvv -a md5 desktop:Maildir Maildir lib/sync_desk_note.bz2
maildirsync.pl -rvv -a md5 Maildir desktop:Maildir lib/sync_note_desk.bz2
maildirsync is a utility for online Maildir synchronization. It is designed to be used on live Maildir folders, to be fail-safe, and to use minimal bandwidth.
If you call the program once, it propagates the changes from the source side to the target side. Two-way synchronization requires two state files and two calls to the program.
This propagation is basically two different operations from the source side:
In the first phase, the source side reads the state file (which stores the state of the last synchronization) and compares it to the current state. It collects the changes and sends them to the target side.
The target side checks every file, which is marked new in the source file, and decides if:
After it decides, it sends back the requests for new files.
Then the source side will send the files to the target side, which stores them into the Maildir structure.
After the send operation is completed, both sides agree on the commit. Then the source side saves the new state into the state file.
Note that if the state is not saved, or the program exits before saving it, the operation can be restarted without data loss or inconsistency, because all operations can be redone without errors.
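The compare step of the first phase can be pictured with a small sketch. It is written in Python rather than the program's own Perl, and the state representation (a plain dictionary from message path to an opaque per-file token) as well as the names diff_state, old_state and current are assumptions made for illustration; they are not maildirsync's actual state-file format.

```python
def diff_state(old_state, current):
    """Compare the saved state with the current one.

    Both arguments map a message path to an opaque token (hypothetical,
    for example a size or a checksum). Returns the paths that are new,
    deleted, or changed since the state was saved.
    """
    new = sorted(p for p in current if p not in old_state)
    deleted = sorted(p for p in old_state if p not in current)
    changed = sorted(
        p for p in current if p in old_state and current[p] != old_state[p]
    )
    return {"new": new, "deleted": deleted, "changed": changed}


# Hypothetical states: one message removed, one modified, one added.
old_state = {"/cur/a": 1, "/cur/b": 2}
current = {"/cur/b": 3, "/new/c": 4}
changes = diff_state(old_state, current)
```

Because the saved state is only replaced after the commit, running the same comparison again after an interrupted run yields the same set of changes, which is what makes a restart safe.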
maildirsync can be used in remote mode, so it can synchronize Maildir folders between computers. To use it this way, put the host name before either the source or the target, like:
maildirsync.pl ... desktop:Maildir Maildir lib/maildirsync.bz2
or
maildirsync.pl ... Maildir desktop:Maildir lib/maildirsync.bz2
In remote mode, the target side must also have maildirsync installed. (See the --maildirsync command-line argument.)
The state file must be on the same system as the source, so the state file is looked up on the "desktop" computer in the first example, and on the local computer in the second example.
At least one of the source and the destination must be local, so you cannot sync Maildirs between two different remote hosts.
Some command-line switches have two forms: a short form and a long form. In the short form, the switches can be grouped, like: -vvvr. Short options with parameters can also be grouped, but the parameters must then be given as the following command-line arguments, like:
maildirsync.pl -rvvvbR Maildir/Trash/cur ssh ...
It is the same as:
maildirsync.pl --recursive --verbose --verbose --verbose --backup \
    Maildir/Trash/cur --rsh ssh ...
Long options can use '=' for assigning the parameter, or they can use the syntax above.
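The grouping rule can be sketched as follows. This is an illustrative Python sketch, not the program's own option parser: the table SHORT_TO_LONG covers only the switches used in the example above, and the function name expand_short_options is hypothetical.

```python
# Hypothetical table covering only the switches used in the example above.
SHORT_TO_LONG = {"r": "--recursive", "v": "--verbose", "b": "--backup", "R": "--rsh"}
TAKES_PARAMETER = {"b", "R"}


def expand_short_options(argv):
    """Expand grouped short options into their long forms."""
    args = list(argv)
    expanded = []
    while args:
        arg = args.pop(0)
        if arg.startswith("-") and not arg.startswith("--") and len(arg) > 1:
            for letter in arg[1:]:
                expanded.append(SHORT_TO_LONG[letter])
                if letter in TAKES_PARAMETER:
                    # The parameter is the next unused command-line argument.
                    expanded.append(args.pop(0))
        else:
            expanded.append(arg)
    return expanded


result = expand_short_options(["-rvvvbR", "Maildir/Trash/cur", "ssh"])
```

Here result holds the long form shown above: --recursive, three times --verbose, --backup with Maildir/Trash/cur, and --rsh with ssh.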
Let's see what switches we have:
For more information about the algorithms, read the chapter about that.
--exclude=^/Trash
The exclude patterns are applied on both the source and the target side, so if you exclude a very large directory, you will notice a speedup on both sides.
Note that this regexp is matched against every directory that is read from the filesystem, and every directory that is found in the state file. So if you provide the exclude pattern as ^/Trash$, it will skip the Trash directory when traversing the directory structure, but it WON'T skip files from the Trash/cur directory when reading from the state file! So be careful if you use an exclude pattern with an existing state file!
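The difference between the two patterns can be checked with a short sketch. It uses Python's re module, which behaves like Perl for these simple patterns; the helper name is_excluded and the exact form of the paths (with a leading slash, as in the examples above) are assumptions made for illustration.

```python
import re


def is_excluded(path, pattern):
    """Return True if the exclude pattern matches the path."""
    return re.search(pattern, path) is not None


# The directory itself matches while the filesystem is traversed...
dir_skipped = is_excluded("/Trash", r"^/Trash$")
# ...but a subdirectory recorded in the state file does not.
state_entry_skipped = is_excluded("/Trash/cur", r"^/Trash$")
# Without the trailing "$", both paths match.
both = [is_excluded(p, r"^/Trash") for p in ("/Trash", "/Trash/cur")]
```

Note that the looser pattern ^/Trash would also match a sibling folder such as /Trashcan; a pattern like ^/Trash(/|$) avoids that.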
Use this option with care: when you sync in the other direction, you have to provide exactly the inverse of the name transformation. For example:
In one side, you can use (A to B):
--rename="s{^/Saved/}{/ToBeSaved/}" \
--exclude-target="^/Saved/" \
In the other side, you can use (B to A):
--exclude-source="^/Saved/" \
--rename="s{^/ToBeSaved/}{/Saved/}" \
In this case, the Saved folder on the A side is synchronized with the ToBeSaved folder on the B side, and the Saved folder on the B side is excluded from the synchronization. This scenario is useful when you don't want to store your emails on the server, but you still want to use the "Saved" folder on the server. The emails are downloaded from the server (A side) to your laptop (B side), where you can move them to the Saved folder on B with a script. When you then resynchronize, the saved files disappear from the server as well.
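To see why the two substitutions have to be exact inverses, here is a small sketch that applies them to a path and back. It uses Python's re.sub as a stand-in for the Perl s{...}{...} expressions; the function names and the message path are hypothetical.

```python
import re


def rename_a_to_b(path):
    # Python equivalent of --rename="s{^/Saved/}{/ToBeSaved/}"
    return re.sub(r"^/Saved/", "/ToBeSaved/", path)


def rename_b_to_a(path):
    # Python equivalent of --rename="s{^/ToBeSaved/}{/Saved/}"
    return re.sub(r"^/ToBeSaved/", "/Saved/", path)


original = "/Saved/cur/msg1"  # hypothetical message path on the A side
on_b = rename_a_to_b(original)
back_on_a = rename_b_to_a(on_b)
```

After the round trip, back_on_a equals original. If the second expression were not the inverse of the first, the same message would appear under a different name on each run.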
Currently the program has two algorithms, which can be used for synchronization.
With this algorithm, you can track flag changes and the deletion of a message, and these changes can be propagated to the other side as well. It also handles a message being copied from the "new" directory to the "cur" directory without retransmitting the file over the network.
This algorithm is recommended if you want simple and fairly fast operation, and if your internet connection is not too slow.
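As background for flag tracking: in the Maildir format, a file name in "cur" consists of a unique part followed by ":2," and the flag letters, so a flag change is only a rename. The sketch below assumes that messages are matched by the unique part of the name; the text above does not spell out how maildirsync matches them, and the function name and file names are hypothetical.

```python
def split_maildir_name(filename):
    """Split a Maildir file name into its unique part and its flags."""
    unique, sep, info = filename.partition(":2,")
    return unique, (info if sep else "")


# Hypothetical file names: the same message before and after it was read.
before = split_maildir_name("1033990000.1234.desktop")
after = split_maildir_name("1033990000.1234.desktop:2,S")
same_message = before[0] == after[0]
```

Since the unique part is unchanged, the two names can be recognized as the same message, and only the new flags need to be propagated.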
By using this algorithm, you can track the copies and moves of a message, so you don't need to retransmit large files if you move them to a new folder.
If you copy or move a message from one folder to another, the header is sometimes changed by the mail-reader program. This is why we cannot simply calculate the MD5 sum of the whole message. A new message in a folder will have a new identifier also, so it doesn't violate the law that a message-id is unique.
When you copy or move a message from your INBOX to your "Save" folder (for example), the new message is analyzed on the source side: its header size and MD5 sum are calculated. From the MD5 hash, the source side can tell the target side which messages have the same hash value, so the target side can copy the body from one of those messages. If the target side successfully copies the body from one of the provided messages, only the header needs to be transmitted across the network. If the target side does not find any of those messages, it requests the body as well.
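The idea of hashing a message independently of its header can be sketched like this. The sketch assumes that the MD5 sum is taken over the body only, and that the header ends at the first empty line; the function name and the sample messages are hypothetical, and the real program may compute these values differently.

```python
import hashlib


def header_size_and_body_md5(raw):
    """Return (header size in bytes, hex MD5 of the body) for a raw message."""
    # The header ends after the first empty line; accept both line endings.
    ends = [raw.find(s) + len(s) for s in (b"\r\n\r\n", b"\n\n") if s in raw]
    header_size = min(ends) if ends else len(raw)
    return header_size, hashlib.md5(raw[header_size:]).hexdigest()


# Hypothetical copies of one message; the mail reader changed the header.
inbox_copy = b"Subject: hi\nX-Status: new\n\nBody text\n"
saved_copy = b"Subject: hi\nX-Status: read\nX-Folder: Save\n\nBody text\n"

inbox_info = header_size_and_body_md5(inbox_copy)
saved_info = header_size_and_body_md5(saved_copy)
whole_differs = (
    hashlib.md5(inbox_copy).hexdigest() != hashlib.md5(saved_copy).hexdigest()
)
```

The two copies have different header sizes and different whole-message hashes, but the same body hash, so a target that already holds one copy would only need the new header.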
Online operation means that the software can be used on a live mailbox. It assumes that the Maildir folder can change while the program is working, so it tries to be as fail-safe as possible.
Every new file is opened in the "tmp" directory, and moved to the target place only when the file is fully downloaded.
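The write-to-tmp-then-move pattern can be sketched as below. This is a minimal Python illustration of the general technique, not the program's own code; the function name deliver, the file name, and the throwaway directory are assumptions.

```python
import os
import tempfile


def deliver(maildir, subdir, name, data):
    """Write data under maildir/tmp, then move it to maildir/subdir/name."""
    tmp_path = os.path.join(maildir, "tmp", name)
    final_path = os.path.join(maildir, subdir, name)
    with open(tmp_path, "wb") as handle:
        handle.write(data)
        handle.flush()
        os.fsync(handle.fileno())
    # A rename within one filesystem is atomic on POSIX systems, so a
    # mail reader sees either no file or the complete file.
    os.rename(tmp_path, final_path)
    return final_path


# Demonstration in a throwaway directory with the three Maildir subdirectories.
maildir = tempfile.mkdtemp()
for sub in ("tmp", "new", "cur"):
    os.mkdir(os.path.join(maildir, sub))
message = b"Subject: hi\n\nBody\n"
path = deliver(maildir, "new", "1033990000.1234.desktop", message)
tmp_left = os.listdir(os.path.join(maildir, "tmp"))
with open(path, "rb") as handle:
    stored = handle.read()
```

If the transfer is interrupted, only a partial file in "tmp" is left behind, and the mailbox itself never contains a truncated message.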
This mode of operation was the first priority, because this feature is missing from most synchronizer software, including my "drsync" utility.
I am using this program to synchronize my Mailboxes. I have 9700 emails in my mailbox and the state file (bzipped) is 283K.
The first two-way synchronization between a P166 server and a PIII/1200 notebook over a cable network, starting from an already synchronized directory, took about 10 minutes. This time is used for md5 calculation and message-id propagation.
The next two-way run took about 40 seconds.
These figures were measured on a Debian GNU/Linux testing/unstable operating system (08 Oct 2002).
They cover only the overhead of the software, not the real transfer. If you get a very big email, it needs to be transferred over the network at least once. But once you have it on both sides, it does not require any more transfer if you save it to different folders.
I am currently happy with this feature set, but if I have time, I will implement these features in the software. If you have the time and willingness, I also accept patches:
Copyright (c) 2000-2010 Szabo, Balazs (dLux)
All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
dLux (Szabo, Balazs) <dlux@dlux.hu>
2021-09-28 | perl v5.32.1 |