Patrick Useldinger wrote:
Brent Frère wrote:
Sorry about that. Just my style. Not sarcastic at all. I even think we are heading in an interesting and constructive direction (big kiss :-) ). I actually need this for a customer project (the one described on page 20 of the February LuxBox issue).
Good, you can give me your feedback then, I'll try to incorporate as much as possible.
When a filesystem is periodically replicated (not actually synchronised, despite what the name RSYNC might suggest) over a previous copy of itself, the efficiency of RSYNC is fantastic: speed-ups of up to 2000 times compared with a naive full copy. Even if parts of a file have been changed, even if data has been appended at the end, even if data has been inserted at the beginning of the file, RSYNC reuses the data already available on the destination side to save unnecessary bandwidth during replication (a kind of 'refresh' of the files instead of a plain copy). But RSYNC has one shortcoming: if a file is renamed, its entire content is retransmitted to the destination at the next replication instead of the file simply being renamed there, because RSYNC apparently cannot detect that it is the same file under a different name.

Even worse is the case of folders: if you rename a folder, all its files and subfolders have to be fully copied during the next replication, because RSYNC does not notice the rename. It treats the entire tree as new and deletes the old one on the target. What I would like is to detect this condition by finding the files and folders that disappeared since the last replication and the files and folders that appeared since the last replication. Then, instead of deleting the files and folders no longer present on the source side, I would try to move or rename them to the appropriate location on the target before running the genuine RSYNC protocol, saving a large amount of bandwidth in this case.
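To make the detection step concrete, here is a minimal Python sketch. It assumes each side can produce a plain list of relative paths; the function name diff_listings and the sample paths are made up for illustration and are not part of RSYNC.

```python
def diff_listings(previous, current):
    """Compare two listings of relative paths.

    previous: paths present at the last replication (what the target holds).
    current:  paths present on the source now.
    Returns (disappeared, appeared), both sorted.
    """
    previous, current = set(previous), set(current)
    disappeared = sorted(previous - current)  # rename/move candidates on the target
    appeared = sorted(current - previous)     # possible new names for them
    return disappeared, appeared


# Hypothetical example: the folder "docs" was renamed to "papers".
previous = ["docs/a.txt", "docs/b.txt", "notes.txt"]
current = ["papers/a.txt", "papers/b.txt", "notes.txt"]
disappeared, appeared = diff_listings(previous, current)
```

Only the path lists cross the network here, never the file contents. Deciding which disappeared path corresponds to which appeared path still needs some notion of file identity.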

I hope this is clear. The point is that I don't want to transmit entire file contents over the network just to perform the comparison: the whole idea is to save bandwidth. So I can consider two files identical if they share the same md5sum. I have no reason to do better, since RSYNC does the rest of the job very well. If a file that appeared on the source has the same checksum as a file on the target that disappeared from the source, I'll just rename the target file to the new candidate file name. If a folder that newly appeared on the source contains lots of such files (having the same checksums as files belonging to a folder on the target side that disappeared from the source), I'll rename the folder...
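A rough Python sketch of that matching, under stated assumptions: each side has already computed a manifest mapping relative path to MD5 hex digest, and only these manifests are exchanged. The names plan_renames and folder_votes, the manifest layout, and the sample data are hypothetical; the folder heuristic (counting files that moved between the same pair of folders under an unchanged file name) is just one possible reading of "contains lots of such files".

```python
import hashlib
import posixpath
from collections import Counter, defaultdict


def md5_of(data: bytes) -> str:
    """MD5 hex digest of a byte string (stands in for running md5sum on a file)."""
    return hashlib.md5(data).hexdigest()


def plan_renames(source, target):
    """Pair files that vanished from the source with files that appeared there.

    source, target: dict mapping relative path -> MD5 hex digest.
    Returns a list of (old_path_on_target, new_path) renames.
    """
    # Target files whose path no longer exists on the source, grouped by checksum.
    by_sum = defaultdict(list)
    for path in sorted(target):
        if path not in source:
            by_sum[target[path]].append(path)

    moves = []
    for path in sorted(source):
        candidates = by_sum.get(source[path])
        if path not in target and candidates:
            # Same checksum: treat as the same file under a new name.
            moves.append((candidates.pop(0), path))
    return moves


def folder_votes(moves):
    """Count, per (old_folder, new_folder) pair, the files that kept their base name."""
    return Counter(
        (posixpath.dirname(old), posixpath.dirname(new))
        for old, new in moves
        if posixpath.basename(old) == posixpath.basename(new)
    )


# Hypothetical manifests: "docs" was renamed to "papers", "new.txt" is really new.
target = {
    "docs/a.txt": md5_of(b"alpha"),
    "docs/b.txt": md5_of(b"beta"),
    "notes.txt": md5_of(b"notes"),
}
source = {
    "papers/a.txt": md5_of(b"alpha"),
    "papers/b.txt": md5_of(b"beta"),
    "notes.txt": md5_of(b"notes"),
    "new.txt": md5_of(b"fresh"),
}
moves = plan_renames(source, target)
votes = folder_votes(moves)
```

A folder pair with many votes is a candidate for a single folder rename on the target; anything left unmatched (like new.txt here) is simply left for the normal RSYNC run to transfer. Identical files with the same checksum are paired arbitrarily in this sketch, which is harmless for bandwidth purposes.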

Do you see the point?
-- 
Brent Frère

Private e-mail:  Brent@BFrere.net

Postal address: 5, rue de Mamer
                L-8280 Kehlen
                Grand-Duchy of Luxembourg
                European Union

Mobile: +352-021/29.05.98
Fax:    +352-26.30.05.96
Home:   +352-307.341
URL:    http://BFrere.net 

This e-mail signature can be checked if you have the CaCERT certificate installed.
Check http://www.CaCERT.org for details.