Brent Frère wrote:
Great idea.
Why do you always have to be so sarcastic? And quick to shoot? Try to be
a little bit more constructive...
If you had read my mail, you'd know that my intention was to compare
only those files that have the same length. If you want a more
algorithmic description, here's what I have in mind:
- build a list of all the files which have the same length and which are
  larger than 1KB
- for each group of files of the same length
  - read the next block (1KB) of each file, starting with the first
  - compare the blocks in memory with one another
  - throw out those files whose block differs from all the others
  - repeat until no file is left in the pool or the end of the files is
    reached
  - print the files which are left in the pool
So in the _worst_ case, that is if all files are equal, I read each one
entirely. That is the _best_ case in your approach.
Unless I am missing something, of course.
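
For what it's worth, the steps above could be sketched in Python roughly
as follows. This is only an illustration under a few assumptions, not
code taken from any existing tool: find_duplicates and BLOCK_SIZE are
made-up names, the input is assumed to be a flat list of paths to
regular files, and every file of a length group is kept open at once
(which could hit the open-file limit for very large groups). It also
splits the pool into sub-pools of files that agree on the current block,
a refinement the list does not spell out, so that two unrelated pairs of
duplicates of the same length are not mixed up.

```python
import os
from collections import defaultdict

BLOCK_SIZE = 1024  # hypothetical name; 1KB blocks as in the description


def find_duplicates(paths, block_size=BLOCK_SIZE):
    """Return lists of paths whose contents are byte-for-byte identical."""
    # Group files by length, keeping only those larger than one block.
    by_size = defaultdict(list)
    for path in paths:
        size = os.path.getsize(path)
        if size > block_size:
            by_size[size].append(path)

    results = []
    for size, group in by_size.items():
        if len(group) < 2:
            continue  # a file with a unique length cannot have a duplicate
        handles = {p: open(p, "rb") for p in group}
        try:
            # Each pool holds files that matched on every block read so far.
            pools = [group]
            remaining = size
            while pools and remaining > 0:
                next_pools = []
                for pool in pools:
                    # Read the next block of each file and bucket by content.
                    buckets = defaultdict(list)
                    for p in pool:
                        buckets[handles[p].read(block_size)].append(p)
                    # Throw out files whose block differs from all the others.
                    next_pools.extend(
                        b for b in buckets.values() if len(b) > 1
                    )
                pools = next_pools
                remaining -= block_size
            results.extend(pools)
        finally:
            for handle in handles.values():
                handle.close()
    return results
```

Reading stops for a group as soon as its pool is empty, so a file is
only read to the end when at least one other file matches it all the
way.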
the two involved files will be actually compared. You don't wish to flag
as identical files the ones that are just sharing the same md5sum and
file length, I guess? Doing so would lead to a M$t-like system:
something that works properly sometimes, and has strange behaviour in
some unpredictable, unidentified circumstances, and even sometimes a
non-causal behaviour. Make your choice.
Stop this crap, please.
-pu