Thanks to http://tips4linux.com/, I've found out how to track down duplicate files on my GNU/Linux system. I've modified the proposed solution to suit my needs. In short, the command retrieves the size of each file and compares sizes to find candidates; when two or more files share the same size, an MD5 hash is computed to confirm that the files are exactly identical.
First, we set a SEARCH variable containing the path under which we wish to search for duplicate files:
root@host:~# SEARCH=/data
root@host:~# find $SEARCH -not -empty -type f -printf "%s\n" | sort -rn | uniq -d | xargs -I{} -n1 find $SEARCH -type f -size {}c -print0 | xargs -0 md5sum | sort | uniq -w32 --all-repeated=separate
The pipeline breaks down as follows:

find $SEARCH -not -empty -type f -printf "%s\n"
  Prints the size in bytes of every non-empty regular file under $SEARCH, one per line.

sort -rn
  Sorts those sizes numerically, largest first.

uniq -d
  Keeps only the sizes that occur more than once, i.e. the possible duplicates.

xargs -I{} -n1 find $SEARCH -type f -size {}c -print0
  For each duplicated size, lists every file of exactly that many bytes, NUL-separated so that file names containing spaces are handled safely.

xargs -0 md5sum
  Computes the MD5 hash of each candidate file.

sort | uniq -w32 --all-repeated=separate
  Sorts the output by hash and prints only the groups of lines whose first 32 characters (the MD5 digest) are identical, with a blank line between each group of duplicates.
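To make the one-liner reusable, the same size-then-hash pipeline can be wrapped in a small shell function. This is a sketch: the name find_dupes is chosen for illustration, and the search path is taken as an argument instead of the SEARCH variable. It assumes GNU find and coreutils, as the article does.

```shell
# find_dupes: print groups of duplicate files under a directory,
# using the article's size-then-hash approach.
find_dupes() {
    dir=${1:?usage: find_dupes <directory>}
    # 1. list sizes of non-empty files; 2. keep duplicated sizes;
    # 3. re-find files of those sizes; 4. hash them; 5. group by digest.
    find "$dir" -not -empty -type f -printf '%s\n' \
        | sort -rn \
        | uniq -d \
        | xargs -I{} -n1 find "$dir" -type f -size {}c -print0 \
        | xargs -0 md5sum \
        | sort \
        | uniq -w32 --all-repeated=separate
}
```

Quoting "$dir" and using -print0 / -0 keeps the function safe for paths containing spaces, which the bare one-liner only partly handles.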