Patrick Useldinger wrote:
Brent Frère wrote:
Sorry about that. Just my style. Not sarcastic at all. I even think we
are going in an interesting and constructive direction (big kiss :-) ).
I even need this for a customer project (the one that is described on
page 20 of the February LuxBox issue).
Good, you can give me your feedback then, I'll try to incorporate as
much as possible.
When a filesystem is periodically _replicated_ (not actually
_synchronised_, despite what the RSYNC name might suggest) onto a
previous version of itself, the efficiency of RSYNC is fantastic:
speed-ups of up to 2000 times compared to a plain full copy. Even if
parts of a file have been changed, even if some data has been added at
the end, even if some data has been added _at the beginning_ of the
file, RSYNC reuses the data already available on the destination side to
save unnecessary bandwidth during replication (a kind of 'refresh' of
the files instead of plain copying). But RSYNC has a gap: if a file is
renamed, its entire content is retransmitted to the destination at the
next RSYNC replication, instead of the file simply being renamed there,
because RSYNC does not seem able to detect that it is actually the same
file under a different filename.
Even worse is the case of folders: if you rename a folder, all its files
and subfolders have to be fully copied during the next replication,
because RSYNC will not notice the change. It treats the entire tree
structure as new, and deletes the old one on the target. What I wish for
is to be able to detect this condition, by detecting the files and
folders that disappeared since the last replication and those that
appeared since the last replication. Then, instead of deleting the files
and folders no longer present on the source side, I would try to move or
rename them to the appropriate location before running the genuine RSYNC
protocol, saving a large amount of bandwidth in this case.
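To make the detection step concrete, here is a minimal sketch in Python.
It assumes that a listing of relative paths can be produced on each side
(for example, the listing saved at the last replication and a fresh one
taken on the source); the function names list_tree and diff_listings are
hypothetical and not part of RSYNC.

```python
import os


def list_tree(root):
    """Return the set of relative paths of all files and folders under root."""
    paths = set()
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            paths.add(os.path.relpath(full, root))
    return paths


def diff_listings(previous, current):
    """Compare two path listings (hypothetical helper, not an RSYNC feature).

    previous: paths present at the last replication (what the target holds).
    current:  paths present on the source now.
    Returns (disappeared, appeared) as two sets of relative paths.
    """
    disappeared = previous - current
    appeared = current - previous
    return disappeared, appeared
```

Only the path names are compared here, so nothing but the two listings
has to cross the network. Paths in the "disappeared" set are the rename
candidates to examine before anything is deleted on the target.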
I hope it is clear. The point here is that I don't want to have to
transmit the entire file content over the network to perform the actual
comparison: the idea is to save bandwidth. So, I can consider files as
identical if they share the same md5sum. I have no reason to do better:
RSYNC does the job very well. If a file that appeared on the source has
the same checksum as a file on the target that disappeared from the
source, I'll just rename it to the new candidate file name. If a folder,
newly appeared on the source, contains lots of such files (having the
same checksum as files belonging to a folder on the target side that
disappeared from the source), I'll rename the folder...
You see the point?
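The checksum matching described above could be sketched roughly as
follows, in Python. This is only an illustration under assumptions: each
side computes its own MD5 sums locally and only the (path, checksum)
pairs are exchanged; relative paths use "/" as separator; the names
md5_of_file, plan_file_renames and guess_folder_renames are hypothetical;
and "lots of such files" is simplified to a majority vote per old folder.

```python
import hashlib
import posixpath
from collections import Counter, defaultdict


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def plan_file_renames(disappeared, appeared):
    """Pair disappeared target files with appeared source files by checksum.

    Both arguments map a relative path to its MD5 hex digest.
    Returns a list of (old_path, new_path) pairs; each old file is used once.
    """
    by_checksum = defaultdict(list)
    for old_path in sorted(disappeared):
        by_checksum[disappeared[old_path]].append(old_path)
    renames = []
    for new_path in sorted(appeared):
        candidates = by_checksum.get(appeared[new_path])
        if candidates:
            renames.append((candidates.pop(0), new_path))
    return renames


def guess_folder_renames(file_renames):
    """For each old folder, pick the new folder most of its files moved to."""
    votes = Counter()
    for old_path, new_path in file_renames:
        old_dir = posixpath.dirname(old_path)
        new_dir = posixpath.dirname(new_path)
        if old_dir and new_dir and old_dir != new_dir:
            votes[(old_dir, new_dir)] += 1
    best = {}
    for (old_dir, new_dir), _count in votes.most_common():
        best.setdefault(old_dir, new_dir)  # highest vote count comes first
    return best
```

The resulting renames would be applied on the target before the normal
RSYNC run, which then only has to refresh whatever still differs. Note
that files with identical content (all empty files, for instance) share
a checksum, so their pairing is arbitrary; that is harmless here since
RSYNC corrects any remaining difference afterwards.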
--
Brent Frère
Private e-mail: Brent(a)BFrere.net
Postal address: 5, rue de Mamer
L-8280 Kehlen
Grand-Duchy of Luxembourg
European Union
Mobile: +352-021/29.05.98
Fax: +352-26.30.05.96
Home: +352-307.341
URL: http://BFrere.net
This e-mail signature can be checked if you have the CaCERT certificate installed.
Check http://www.CaCERT.org for details.