Wednesday, September 13, 2006

Find And Remove Duplicate Files

As hard disks grow larger and larger, more and more files get stored on them. Over time, many duplicate files can end up scattered around the file system.

With the bulk of these redundant files lying here and there, they clutter the file system, degrade system performance, waste valuable disk space, and make file backups inefficient. Finding and deleting them takes time and can be a really tedious job once the low disk space alarm goes off!

An MD5 checksum is a good candidate for finding duplicate and redundant files! It can also be used to identify precisely which files have been updated since the last backup, by comparing the MD5 checksums of the files at the source and the target of the backup.
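Both ideas can be sketched in a few lines of Python. This is only a minimal illustration under some assumptions, not a finished tool: the function names (file_md5, find_duplicates, changed_since_backup) are made up for this example, and it treats two files with the same MD5 checksum as identical. A matching checksum is strong evidence rather than proof, so a byte-for-byte comparison before deleting anything is a sensible extra step.

```python
import hashlib
import os
import sys
from collections import defaultdict


def file_md5(path, chunk_size=65536):
    """Return the MD5 hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Group regular files under root by MD5 digest; keep only groups of 2 or more."""
    by_digest = defaultdict(list)
    for folder, _subfolders, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            # Skip symbolic links so a link is not reported as a copy of its target.
            if os.path.isfile(path) and not os.path.islink(path):
                by_digest[file_md5(path)].append(path)
    return {d: sorted(paths) for d, paths in by_digest.items() if len(paths) > 1}


def changed_since_backup(source_root, backup_root):
    """Return relative paths under source_root that are missing or different in backup_root."""
    changed = []
    for folder, _subfolders, names in os.walk(source_root):
        for name in names:
            source_path = os.path.join(folder, name)
            relative = os.path.relpath(source_path, source_root)
            backup_path = os.path.join(backup_root, relative)
            if not os.path.isfile(backup_path):
                changed.append(relative)  # new file, not in the backup yet
            elif file_md5(source_path) != file_md5(backup_path):
                changed.append(relative)  # content differs from the backup copy
    return sorted(changed)


if __name__ == "__main__":
    # Scan the directory given on the command line, or the current one by default.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for checksum, paths in find_duplicates(target).items():
        print(checksum)
        for path in paths:
            print("   ", path)
```

Saved under a name of your choice and run with a directory as its argument, the script prints each checksum that occurs more than once, followed by the files that share it. Nothing is deleted; deciding which copy to keep is left to the reader.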

MD5, short for Message-Digest algorithm 5, is a widely used cryptographic hash function that produces a 128-bit hash value. It has been employed in a wide variety of security applications, and is commonly used to check the integrity of data streams, TCP/IP packets, files, etc.
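To make the 128-bit value and the integrity check concrete, here is a small Python sketch using the standard hashlib module. The helper names md5_hex and is_intact are invented for this illustration; the idea is simply that data is considered intact when its freshly computed checksum matches the one recorded earlier.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 checksum of data as 32 hexadecimal characters (128 bits)."""
    return hashlib.md5(data).hexdigest()


def is_intact(data: bytes, expected_hex: str) -> bool:
    """Check data against a previously recorded MD5 checksum (case-insensitive)."""
    return md5_hex(data) == expected_hex.strip().lower()


# digest_size is given in bytes: 16 bytes * 8 = 128 bits.
print(hashlib.md5().digest_size * 8)

# The well-known test value for the three-byte message "abc".
print(md5_hex(b"abc"))  # 900150983cd24fb0d6963f7d28e17f72

# Changing a single byte gives a completely different checksum.
print(is_intact(b"abd", "900150983cd24fb0d6963f7d28e17f72"))  # False
```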
Related information:

  • MD5Sums is a tiny Windows command-line freeware tool that automatically generates MD5 checksums for all files in a directory, excluding subdirectories. Technically, a Windows script such as a VBScript could be written to programmatically find duplicate files in the file system by calling this tool via the Run method of the WshShell object.
  • The unofficial MD5 homepage, for finding implementations in various programming languages.
  • MD5 shell scripts to find unique and redundant files in a given directory.
