Channel: Planet Python

Dan Stromberg: fdupes and equivs3e


I recently saw an announcement of fdupes on Linux Today.

Upon investigating it a bit, I noticed that it uses almost exactly the same algorithm as my equivs3e program.

Both are intended to find duplicate files in a filesystem, quickly.

The main difference seems to be that fdupes is written in C, and equivs3e is written in Python. Also, fdupes accepts a directory in argv (like tar), while equivs3e expects to have "find /directory -type f -print0" piped into it (like cpio).
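For readers who haven't seen this kind of tool, here is a rough sketch of one common way to find duplicates from a NUL-separated list of filenames like the one "find -print0" produces. It is only an illustration under my own assumptions, not the actual code of equivs3e or fdupes: it groups files by size first, then hashes only the files that share a size. The function names (split_nul_paths, file_digest, find_duplicates, main) and the choice of SHA-256 are made up for this example.

```python
import hashlib
import os
import sys
from collections import defaultdict


def split_nul_paths(data):
    """Split the bytes produced by `find ... -print0` into individual paths."""
    return [p for p in data.split(b"\0") if p]


def file_digest(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files are never read into memory whole."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(paths):
    """Return lists of paths whose sizes and content hashes both match."""
    by_size = defaultdict(list)
    for path in paths:
        by_size[os.path.getsize(path)].append(path)

    groups = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue  # a file with a unique size cannot have a duplicate
        by_hash = defaultdict(list)
        for path in same_size:
            by_hash[file_digest(path)].append(path)
        groups.extend(g for g in by_hash.values() if len(g) > 1)
    return groups


def main():
    """Read NUL-separated paths from stdin and print one duplicate group per line."""
    paths = split_nul_paths(sys.stdin.buffer.read())
    for group in find_duplicates(paths):
        print(b" ".join(group).decode(errors="replace"))
```

Calling main() at the bottom of a script would let it sit at the end of a pipeline the same way equivs3e does. The size check is cheap and rules out most files before any file contents are read, which is presumably part of why tools in this family can be quick.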

However, a quick performance comparison showed that fdupes is quite a bit faster on large collections of small files, while equivs3e is quite a bit faster on collections of large files. I really don't know why the Python code sometimes outperforms the C code, given that they're so similar internally.

I've added a "related work" section on my equivs3e page that compares equivs3e and fdupes.

Anyway, I hope people find one or both of these programs useful.

