mirror of
https://github.com/imapsync/imapsync.git
synced 2025-06-10 14:44:32 +02:00
144 lines
5.5 KiB
Text
144 lines
5.5 KiB
Text
#!/bin/cat
|
|
$Id: FAQ.Duplicates.txt,v 1.10 2016/04/17 19:06:39 gilles Exp gilles $
|
|
|
|
This documentation is also at http://imapsync.lamiral.info/#doc
|
|
|
|
======================================================================
|
|
Imapsync tips about duplicated messages issues.
|
|
======================================================================
|
|
|
|
=======================================================================
|
|
Q. How can I remove duplicates in an lone account?
|
|
|
|
R. Just run imapsync on the same account with option --delete2duplicates,
|
|
ie, with host1 == host2 and user1 == user2
|
|
|
|
=======================================================================
|
|
Q: Multiple copies, duplicates, when I run imapsync twice ore more.
|
|
|
|
R0.
|
|
Normally and by default, imapsync doesn't generate duplicates.
|
|
So if it does generate duplicates it means a problem occurs
|
|
with message identification. It happens sometimes with IMAP
|
|
servers changing the "Message-Id" header line or one or more
|
|
of the "Received:" header lines in the header part of messages.
|
|
Imapsync uses "Message-Id" header line and "Received:" header
|
|
lines to identify messages on both sides.
|
|
|
|
R1.
|
|
A first solution is to use option --useuid.
|
|
With option --useuid, imapsync doesn't use header lines
|
|
to identify and compare messages in folders.
|
|
Instead of some headers, --useuid tell imapsync to use
|
|
the imap UIDs given by imap servers on bith sides.
|
|
To avoid duplicates on next runs, imapsync uses a local cache
|
|
where it keeps UIDs already transfered.
|
|
|
|
imapsync ... --useuid
|
|
|
|
There is an issue when --useuid is not used the first time.
|
|
A big issue with --useuid is that it doesn't generate duplicates if
|
|
used from the first time but it does generate duplicates after a previous
|
|
run without --useuid (because then uses a different method to identify
|
|
the messages).
|
|
|
|
A solution? Two solutions.
|
|
|
|
The easiest is --delete2
|
|
if you are permitted to use it. Option --delete2 removes messages on host2
|
|
that are not on host1 so with --delete2 you go for resyncing all
|
|
messages again but all previously transferred messages are deleted,
|
|
and also messages previously there without imapsync.
|
|
So --useuid --delete2 is easy to remove duplicates but not for
|
|
all context.
|
|
|
|
A second solution, better if R2 works (see R2 below), is to build
|
|
the cache before using --useuid
|
|
|
|
First sync:
|
|
|
|
imapsync ... --useheader "Message-Id" --addheader --usecache
|
|
|
|
Next syncs:
|
|
|
|
imapsync ... --useuid
|
|
imapsync ... --useuid
|
|
...
|
|
|
|
R2.
|
|
Best way if you can follow it.
|
|
Multiple copies of the emails on the destination server. Some IMAP
|
|
servers (Domino for example) change some headers for each message
|
|
transferred. All messages are transferred again and again each time you
|
|
run imapsync. This is bad of course. The explanation is that imapsync
|
|
considers messages are not the same on each side, default headers used
|
|
to identify the messages have changed.
|
|
|
|
You can look at the headers found by imapsync by using the --debug
|
|
option (and search for the message on both part), Header lines from
|
|
the source server begin with a "FH:" prefix, Header lines from the
|
|
destination server begin with a "TH:" prefix. Since --debug is very
|
|
verbose I suggest to isolate a email in a specific folder in case you
|
|
want to forward me the output.
|
|
|
|
A way to avoid this problem is by using option --useheader with
|
|
a different set than the default ones used by imapsync.
|
|
|
|
The default set is equivalent to:
|
|
|
|
imapsync ... --useheader "Message-Id" --useheader "Received"
|
|
|
|
The problem now is that what can be used instead of Message-Id
|
|
and Received lines? Often standalone Message-Id works:
|
|
|
|
imapsync ... --useheader "Message-Id"
|
|
|
|
Once imapsync does not generate duplicates, the previous duplicates
|
|
can be deleted with option --delete2duplicates
|
|
|
|
imapsync ... --useheader "Message-Id" --delete2duplicates
|
|
|
|
Another good way toward a solution is to isolate two or three messages
|
|
in a BUG folder and send me the --debug output by email at
|
|
gilles.lamiral@laposte.net
|
|
|
|
imapsync ... --debug --folder BUG
|
|
|
|
I will take a close look at the log and modify imapsync to fix
|
|
this faulty duplicate behavior.
|
|
|
|
Remark. (Trick found by Tomasz Kaczmarski)
|
|
|
|
Option --useheader "Message-Id" asks the server to send only header
|
|
lines beginning with "Message-Id". Some (buggy) servers send the whole
|
|
header (all lines) instead of the "Message-Id" line. In that case, a
|
|
trick to keep the --useheader filtering behaviour is to use
|
|
--skipheader with a negative lookahead pattern:
|
|
|
|
imapsync ... --skipheader "^(?!Message-Id)"
|
|
|
|
Read it as "skip every header except Message-Id".
|
|
|
|
=======================================================================
|
|
Q. imapsync calculates 479 messages in a folder but only transfers 400
|
|
messages. What's happen?
|
|
|
|
R1. Unless --useuid is used, imapsync considers a header part
|
|
of a message to identify a message on both sides.
|
|
By default the header part used is lines "Message-Id:" "Message-ID:"
|
|
and "Received:" or specific lines depending on --useheader
|
|
--skipheader. Whole header can be set by --useheader ALL
|
|
|
|
Consequences:
|
|
|
|
1) Duplicate messages on host1 (identical header) are not transferred.
|
|
|
|
The result is that you can have more messages on host1 than on host2.
|
|
|
|
R2. With option --useuid imapsync doesn't use headers to identify
|
|
messages on both sides but it uses their imap uid identifier.
|
|
In that case duplicates on host1 are also transferred on host2.
|
|
|
|
=======================================================================
|
|
|
|
|