Tuesday, October 10, 2006

MD5 Shell Scripts Find Duplicate Files

These shell scripts wrap the md5sum program, found by default on most Linux systems, to prepare a report of the unique and duplicate files in a given directory.
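As a rough illustration of the idea (this is not the wmd5.sh source), the sketch below checksums every regular file directly inside a directory and sorts the result so that files with identical content end up on adjacent lines. The function name list_md5_sorted is hypothetical, and the sketch assumes a find that supports -maxdepth, as GNU find does.

```shell
#!/bin/sh
# Hypothetical sketch, not the original wmd5.sh.
# list_md5_sorted DIR
#   Prints one "checksum  path" line per regular file directly inside DIR.
#   Sorting by checksum places files with identical content next to each other.
list_md5_sorted() {
    dir=$1
    # -maxdepth 1 keeps the scan to the given directory only (assumption:
    # the original scripts may or may not descend into subdirectories).
    find "$dir" -maxdepth 1 -type f -exec md5sum {} + | sort
}
```

Calling list_md5_sorted /some/directory prints the sorted listing; any run of lines that share the same leading checksum is a group of duplicate files.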

The lengthy source code can be shortened by removing the DupUniRpt function, which merely prepares an easy-to-read report showing both the filenames and the number of duplicate and unique files.

If you remove the DupUniRpt function call and the function definition, remember to add the line

cat $FM5L

in the if-else statement, at the place where the DupUniRpt function call was. $FM5L is a semi-raw report file that groups duplicate and unique files together.

wDupUniRpt.sh contains the same source code as the DupUniRpt function and can be used to prepare that easy-to-read report from the semi-raw report file. This script was created merely to make debugging easier.
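To give a feel for what such a report step might look like, here is a minimal sketch that counts duplicate and unique entries in a file of "checksum  path" lines. It is not the wDupUniRpt.sh source: the function name summarize_md5, the input layout, and the output wording are all assumptions, and the sketch counts every file whose checksum occurs more than once as a duplicate.

```shell
#!/bin/sh
# Hypothetical sketch, not the original wDupUniRpt.sh.
# summarize_md5 FILE
#   FILE is assumed to hold "checksum  path" lines, one per file.
#   Assumption: a file is "duplicate" if its checksum appears more than once.
summarize_md5() {
    awk '
        { count[$1]++ }                 # tally occurrences of each checksum
        END {
            dup = 0; uni = 0
            for (sum in count) {
                if (count[sum] > 1) dup += count[sum]
                else uni++
            }
            printf "duplicate files: %d\nunique files: %d\n", dup, uni
        }
    ' "$1"
}
```

For an input with two lines sharing one checksum and a third line with a different checksum, this prints "duplicate files: 2" followed by "unique files: 1".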

Both shell scripts have been tested successfully in as many scenarios as possible. The source code could probably be enhanced further for efficiency or to fix bugs, if any. Any suggestions or comments are greatly appreciated.

Related information:

  • wmd5.sh to report both filename and number of duplicate and unique files in a given directory
  • wDupUniRpt.sh to report redundant and unique records in an ASCII file of sorted records
  • MD5 checksum used to find redundant files
