Patrick Useldinger wrote:
> Brent Frère wrote:
>> Sorry. This looks interesting to me, but I don't fully understand what
>> you are trying to do.
> Find duplicate files (i.e. same length and content, not necessarily
> same name) on a single file-system.
I have no time to write the script properly, but it is indeed interesting,
especially when you do an rsync and the source side has a folder that
has been renamed. It would be great to have the renamed folder
detected and renamed on the destination side instead of re-copying all
the files...
Here is the idea.
Do a find. For each file, compute an md5sum. Sort the result. Detect the
sets of files having matching md5sums. Do a binary compare of each pair
of such files. If they match, you have found a duplicate!
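Roughly speaking (GNU uniq assumed; -w32 compares only the checksum
column and -D keeps every line whose checksum is repeated):
# find . -type f -exec md5sum {} \; | sort > md5sum.lst
# uniq -w32 -D md5sum.lst > md5sum.dup
# while read -r sum file; do
[ "$sum" = "$prev" ] && cmp -s "$prevfile" "$file" && echo "$prevfile = $file"
prev=$sum; prevfile=$file
done < md5sum.dup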
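
For what it is worth, the same idea (find, md5sum, sort, then cmp on the
candidates) can be wrapped in a small function. This is only a sketch under
assumptions: find_dups is a name I made up, it relies on GNU md5sum and
GNU uniq (-w32, -D), and it expects ordinary file names (no newlines,
backslashes, or leading/trailing blanks).

```shell
# Sketch: print pairs of byte-identical files found below a directory.
# find_dups is a hypothetical helper name, not an existing tool.
find_dups() {
    dir=${1:-.}
    prev_sum=''
    prev_file=''
    # One "checksum  path" line per file, grouped by checksum.
    find "$dir" -type f -exec md5sum {} + |
        LC_ALL=C sort |
        uniq -w32 -D |
        while read -r sum file; do
            # Same checksum as the previous line: confirm byte by byte,
            # since equal checksums alone do not prove equal content.
            if [ "$sum" = "$prev_sum" ] && cmp -s "$prev_file" "$file"; then
                printf '%s = %s\n' "$prev_file" "$file"
            fi
            prev_sum=$sum
            prev_file=$file
        done
}
```

Called as "find_dups /some/dir", it prints one line per confirmed pair of
neighbouring files in the sorted list. Only neighbours are compared, so three
identical files a, b, c show up as "a = b" and "b = c".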
--
Brent Frère
Private e-mail: Brent(a)BFrere.net
Postal address: 5, rue de Mamer
L-8280 Kehlen
Grand-Duchy of Luxembourg
European Union
Mobile: +352-021/29.05.98
Fax: +352-26.30.05.96
Home: +352-307.341
URL: http://BFrere.net
If you have a problem with my digital signature, please install the appropriate authority
certificate by browsing
https://www.cacert.org/certs/root.crt.