diff options
author | Anas Nashif <anas.nashif@intel.com> | 2012-11-07 09:30:22 -0800 |
---|---|---|
committer | Anas Nashif <anas.nashif@intel.com> | 2012-11-07 09:30:22 -0800 |
commit | 689298ab0fd25c5441b219c82438c478355f1bb4 (patch) | |
tree | 578aac75af7578378861bfa30e110b6d71123b8d | |
download | fdupes-689298ab0fd25c5441b219c82438c478355f1bb4.tar.gz fdupes-689298ab0fd25c5441b219c82438c478355f1bb4.tar.bz2 fdupes-689298ab0fd25c5441b219c82438c478355f1bb4.zip |
Imported Upstream version 1.40upstream/1.40
-rw-r--r-- | CHANGES | 67 | ||||
-rw-r--r-- | CONTRIBUTORS | 12 | ||||
-rw-r--r-- | INSTALL | 25 | ||||
-rw-r--r-- | Makefile | 53 | ||||
-rw-r--r-- | README | 76 | ||||
-rw-r--r-- | TODO | 5 | ||||
-rw-r--r-- | fdupes.1 | 105 | ||||
-rw-r--r-- | fdupes.c | 905 | ||||
-rw-r--r-- | md5/README | 4 | ||||
-rw-r--r-- | md5/md5.c | 392 | ||||
-rw-r--r-- | md5/md5.h | 98 | ||||
-rw-r--r-- | testdir/nine_upsidedown | 1 | ||||
-rw-r--r-- | testdir/recursed_a/five | 1 | ||||
-rw-r--r-- | testdir/recursed_a/one | 1 | ||||
-rw-r--r-- | testdir/recursed_a/two | 1 | ||||
-rw-r--r-- | testdir/recursed_b/four | 1 | ||||
-rw-r--r-- | testdir/recursed_b/one | 1 | ||||
-rw-r--r-- | testdir/recursed_b/three | 1 | ||||
-rw-r--r-- | testdir/recursed_b/two_plus_one | 1 | ||||
l--------- | testdir/symlink_dir | 1 | ||||
l--------- | testdir/symlink_two | 1 | ||||
-rw-r--r-- | testdir/twice_one | 1 | ||||
-rw-r--r-- | testdir/two | 1 | ||||
-rw-r--r-- | testdir/with spaces a | 1 | ||||
-rw-r--r-- | testdir/with spaces b | 1 | ||||
-rw-r--r-- | testdir/zero_a | 0 | ||||
-rw-r--r-- | testdir/zero_b | 0 |
27 files changed, 1756 insertions, 0 deletions
@@ -0,0 +1,67 @@ +The following list, organized by fdupes version, documents changes +to fdupes. Every item on the list includes, inside square brackets, +a list of indentifiers referring to the people who contributed +that particular item. When more than one person is listed the person +who contributed the patch or idea appears first, followed by +those who've otherwise worked on that item. For a list of +contributors names and identifiers please see the CONTRIBUTORS file. + + +Changes from 1.31 to 1.40 + +- Added option to omit the first file in each group + of matches. [LM, AL] +- Added escaping of filenames containing spaces when + sameline option is specified. [AL] +- Changed version indicator format from "fdupes version X.Y" + to the simpler "fdupes X.Y". [AL] +- Changed ordering of options appearing in the help + text (--help), manpage, and README file. [AL] + +Changes from 1.30 to 1.31 + +- Added interactive option to preserve all files during + delete procedure (something similar was already in + place, but now it's official). [AL] +- Updated delete procedure prompt format. [AL] +- Cosmetic code changes. [AL] + +Changes from 1.20 to 1.30 + +- Added size option to display size of duplicates. [LB, AL] +- Added missing typecast for proper compilation under g++. [LB] +- Better handling of errors occurring during retrieval + of a file's signature. [KK, AL] +- No longer displays an error message when specified + directories contain no files. [AL] +- Added red-black tree structure (experimental compile-time + option, disabled by default). [AL] + +Changes from 1.12 to 1.20 + +- Fixed bug where program would crash when files being + scanned were named pipes or sockets. [FD] +- Fix against security risk resulting from the use of a + temporary file to store md5sum output. [FD, AL] +- Using an external md5sum program is now optional. Started + using L. Peter Deutsh's MD5 library instead. [FD, AL] +- Added hardlinks option to distinguish between hard links + and actual duplicate files. [FD, AL] +- Added noempty option to exclude zero-length files + from consideration [AL] + +Changes from 1.11 to 1.12 + +- Improved handling of extremely long input on preserve + prompt (delete option). [SSD, AL] + +Changes from 1.1 to 1.11 + +- Started checking file sizes before signatures + for better performance. [AB, AL] +- Added fdupes manpage. [AB, AL] + +Changes from 1.0 to 1.1 + +- Added delete option for semi-automatic deletion + of duplicate files. [AL] diff --git a/CONTRIBUTORS b/CONTRIBUTORS new file mode 100644 index 0000000..14b2458 --- /dev/null +++ b/CONTRIBUTORS @@ -0,0 +1,12 @@ +The following people have contributed in some way to the development +of fdupes. Please see the CHANGES file for detailed information +on their contributions. Names are listed in alphabetical order. + + [AB] Adrian Bridgett <adrian.bridgett@iname.com> + [AL] Adrian Lopez <adrian2@caribe.net> + [FD] Frank DENIS aka Jedi/Sector One aka DJ Chrysalis <j@4u.net> + [KK] Kresimir Kukulj <madmax@pc-hrvoje.srce.hr> + [LB] Laurent Bonnaud <Laurent.Bonnaud@iut2.upmf-grenoble.fr> + [LM] Luca Montecchiani <m.luca@iname.com> +[SSD] Steven S. Dick <ssd@nevets.oau.org> + @@ -0,0 +1,25 @@ +Installing fdupes +-------------------------------------------------------------------- +To install the program, issue the following commands: + +make fdupes +su root +make install + +This will install the program in /usr/local/bin. You may change this +to a different location by editing the Makefile. Please refer to +the Makefile for an explanation of compile-time options. If you're +having trouble compiling, please take a look at the Makefile. + +UPGRADING NOTE: When upgrading from a version prior to 1.2, it should +be noted that the default installation directory for the fdupes man +page has changed from "/usr/man" to "/usr/local/man". If installing +to the default location you should delete the old man page before +proceeding. This file would be named "/usr/man/man1/fdupes.1". + +A test directory is included so that you may familiarise yourself +with the way fdupes operates. You may test the program before +installing it by issuing a command such as "./fdupes testdir" +or "./fdupes -r testdir", just to name a couple of examples. Refer +to the documentation for information on valid options. + diff --git a/Makefile b/Makefile new file mode 100644 index 0000000..090ad1d --- /dev/null +++ b/Makefile @@ -0,0 +1,53 @@ +# +# INSTALLDIR indicates directory where program is to be installed. +# Suggested values are "/usr/local/bin" or "/usr/bin". +# +INSTALLDIR = /usr/local/bin + +# +# MANPAGEDIR indicates directory where the fdupes man page is to be +# installed. Suggested values are "/usr/local/man" or "/usr/man". +# +MANPAGEDIR = /usr/local/man + +# +# VERSION determines the program's version number. +# +VERSION = "1.40" + +# +# To use the md5sum program for calculating signatures (instead of the +# built in MD5 message digest routines) uncomment the following +# line (try this if you're having trouble with built in code). +# +#EXTERNAL_MD5 = -DEXTERNAL_MD5=\"md5sum\" + +# +# This version of fdupes can use a red-black tree structure to +# store file information. This is disabled by default, as it +# hasn't been optimized or verified correct. If you wish to +# enable this untested option, uncomment the following line. +# +#EXPERIMENTAL_RBTREE = -DEXPERIMENTAL_RBTREE + +##################################################################### +# no need to modify anything beyond this point # +##################################################################### + +fdupes: fdupes.c md5/md5.c + gcc fdupes.c md5/md5.c -Wall -o fdupes -DVERSION=\"$(VERSION)\" $(EXTERNAL_MD5) $(EXPERIMENTAL_RBTREE) + +install: fdupes + cp fdupes $(INSTALLDIR) + cp fdupes.1 $(MANPAGEDIR)/man1 + +tarball: clean + tar --directory=.. -c -z -v -f ../fdupes-$(VERSION).tar.gz fdupes-$(VERSION) + +clean: + rm -f *.o + rm -f fdupes + rm -f *~ + +love: + @echo You\'re not my type. Go find a human partner. @@ -0,0 +1,76 @@ +Introduction +-------------------------------------------------------------------- +FDUPES is a program for identifying duplicate files residing +within specified directories. + + +Usage +-------------------------------------------------------------------- +Usage: fdupes [options] DIRECTORY... + + -r --recurse include files residing in subdirectories + -s --symlinks follow symlinks + -H --hardlinks normally, when two or more files point to the same + disk area they are treated as non-duplicates; this + option will change this behavior + -n --noempty exclude zero-length files from consideration + -f --omitfirst omit the first file in each set of matches + -1 --sameline list each set of matches on a single line + -S --size show size of duplicate files + -q --quiet hide progress indicator + -d --delete prompt user for files to preserve and delete all + others; important: under particular circumstances, + data may be lost when using this option together + with -s or --symlinks, or when specifying a + particular directory more than once; refer to the + fdupes documentation for additional information + -v --version display fdupes version + -h --help display this help message + +Unless -1 or --sameline is specified, duplicate files are listed +together in groups, each file displayed on a separate line. The +groups are then separated from each other by blank lines. + +When -1 or --sameline is specified, spaces and backslash characters (\) +appearing in a filename are preceded by a backslash character. For +instance, "with spaces" becomes "with\ spaces". + +When using -d or --delete, care should be taken to insure against +accidental data loss. While no information will be immediately +lost, using this option together with -s or --symlink can lead +to confusing information being presented to the user when prompted +for files to preserve. Specifically, a user could accidentally +preserve a symlink while deleting the file it points to. A similar +problem arises when specifying a particular directory more than +once. All files within that directory will be listed as their own +duplicates, leading to data loss should a user preserve a file +without its "duplicate" (the file itself!). + + +Contact Information for Adrian Lopez +-------------------------------------------------------------------- +email: adrian2@caribe.net + + +Legal Information +-------------------------------------------------------------------- +FDUPES Copyright (c) 1999 Adrian Lopez + +Permission is hereby granted, free of charge, to any person +obtaining a copy of this software and associated documentation files +(the "Software"), to deal in the Software without restriction, +including without limitation the rights to use, copy, modify, merge, +publish, distribute, sublicense, and/or sell copies of the Software, +and to permit persons to whom the Software is furnished to do so, +subject to the following conditions: + +The above copyright notice and this permission notice shall be +included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS +OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. +IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY +CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, +TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. @@ -0,0 +1,5 @@ +- Add option to highlight or identify symlinked files. +- Verify correctness of red-black tree algorithm. Optimize. +- Fix problem where MD5 collisions will result in one of the + files not being registered (causing it to be ignored). + diff --git a/fdupes.1 b/fdupes.1 new file mode 100644 index 0000000..f4b4ca3 --- /dev/null +++ b/fdupes.1 @@ -0,0 +1,105 @@ +.TH FDUPES 1 +.\" NAME should be all caps, SECTION should be 1-8, maybe w/ subsection +.\" other parms are allowed: see man(7), man(1) +.SH NAME +fdupes \- finds duplicate files in a given set of directories +.SH SYNOPSIS +.B fdupes +[ +.I options +] +.I DIRECTORY +\|.\|.\|. + +.SH "DESCRIPTION" +Searches the given path for duplicate files. Such files are found by +comparing file sizes and MD5 signatures, followed by a +byte-by-byte comparison. + +.SH OPTIONS +.TP +.B -r --recurse +include files residing in subdirectories +.TP +.B -s --symlinks +follow symlinked directories +.TP +.B -H --hardlinks +normally, when two or more files point to the same disk area they are +treated as non-duplicates; this option will change this behavior +.TP +.B -n --noempty +exclude zero-length files from consideration +.TP +.B -f --omitfirst +omit the first file in each set of matches +.TP +.B -1 --sameline +list each set of matches on a single line +.TP +.B -S --size +show size of duplicate files +.TP +.B -q --quiet +hide progress indicator +.TP +.B -d --delete +prompt user for files to preserve, deleting all others (see +.B CAVEATS +below) +.TP +.B -v --version +display fdupes version +.TP +.B -h --help +displays help +.SH "SEE ALSO" +.\" Always quote multiple words for .SH +.BR md5sum (1) +.SH NOTES +Unless +.B -1 +or +.B --sameline +is specified, duplicate files are listed +together in groups, each file displayed on a separate line. The +groups are then separated from each other by blank lines. + +When +.B -1 +or +.B --sameline +is specified, spaces and backslash characters (\fB\e\fP) appearing +in a filename are preceded by a backslash character. +.SH CAVEATS +If fdupes returns with an error message such as +.B fdupes: error invoking md5sum +it means the program has been compiled to use an external +program to calculate MD5 signatures (otherwise, fdupes uses +interal routines for this purpose), and an error has occurred +while attempting to execute it. If this is the case, the +specified program should be properly installed prior +to running fdupes. + +When using +.B \-d +or +.BR \-\-delete , +care should be taken to insure against +accidental data loss. + +When used together with options +.B \-s +or +.BR \-\-symlink , +a user could accidentally +preserve a symlink while deleting the file it points to. + +Furthermore, when specifying a particular directory more than +once, all files within that directory will be listed as their +own duplicates, leading to data loss should a user preserve a +file without its "duplicate" (the file itself!). + +.SH AUTHOR +Adrian Lopez <adrian2@caribe.net> + diff --git a/fdupes.c b/fdupes.c new file mode 100644 index 0000000..b52f723 --- /dev/null +++ b/fdupes.c @@ -0,0 +1,905 @@ +/* FDUPES Copyright (c) 1999 Adrian Lopez + + Permission is hereby granted, free of charge, to any person + obtaining a copy of this software and associated documentation files + (the "Software"), to deal in the Software without restriction, + including without limitation the rights to use, copy, modify, merge, + publish, distribute, sublicense, and/or sell copies of the Software, + and to permit persons to whom the Software is furnished to do so, + subject to the following conditions: + + The above copyright notice and this permission notice shall be + included in all copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS + OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. + IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY + CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, + TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE + SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */ + +#include <stdio.h> +#include <stdarg.h> +#include <string.h> +#include <sys/stat.h> +#include <dirent.h> +#include <unistd.h> +#include <stdlib.h> +#include <getopt.h> +#include <string.h> +#include <errno.h> + +#ifndef EXTERNAL_MD5 +#include "md5/md5.h" +#endif + +#define ISFLAG(a,b) ((a & b) == b) +#define SETFLAG(a,b) (a |= b) + +#define F_RECURSE 0x001 +#define F_HIDEPROGRESS 0x002 +#define F_DSAMELINE 0x004 +#define F_FOLLOWLINKS 0x008 +#define F_DELETEFILES 0x010 +#define F_EXCLUDEEMPTY 0x020 +#define F_CONSIDERHARDLINKS 0x040 +#define F_SHOWSIZE 0x080 +#define F_OMITFIRST 0x100 + +char *program_name; + +unsigned long flags = 0; + +#define CHUNK_SIZE 8192 + +#define INPUT_SIZE 256 + +typedef struct _file { + char *d_name; + off_t size; + char *crcsignature; + ino_t inode; + int hasdupes; /* true only if file is first on duplicate chain */ + struct _file *duplicates; + struct _file *next; +} file_t; + +typedef struct _filetree { + file_t *file; +#ifdef EXPERIMENTAL_RBTREE + unsigned char color; + struct _filetree *parent; +#endif + struct _filetree *left; + struct _filetree *right; +} filetree_t; + +#ifdef EXPERIMENTAL_RBTREE +#define COLOR_RED 0 +#define COLOR_BLACK 1 +#endif + +void errormsg(char *message, ...) +{ + va_list ap; + + va_start(ap, message); + + fprintf(stderr, "\r%40s\r%s: ", "", program_name); + vfprintf(stderr, message, ap); +} + +void escapefilename(char *escape_list, char **filename_ptr) +{ + int x; + int tx; + char *tmp; + char *filename; + + filename = *filename_ptr; + + tmp = (char*) malloc(strlen(filename) * 2 + 1); + if (tmp == NULL) { + errormsg("out of memory!\n"); + exit(1); + } + + for (x = 0, tx = 0; x < strlen(filename); x++) { + if (strchr(escape_list, filename[x]) != NULL) tmp[tx++] = '\\'; + tmp[tx++] = filename[x]; + } + + tmp[tx] = '\0'; + + if (x != tx) { + *filename_ptr = realloc(*filename_ptr, strlen(tmp) + 1); + if (*filename_ptr == NULL) { + errormsg("out of memory!\n"); + exit(1); + } + strcpy(*filename_ptr, tmp); + } +} + +off_t filesize(char *filename) { + struct stat s; + + if (stat(filename, &s) != 0) return -1; + + return s.st_size; +} + +ino_t getinode(char *filename) { + struct stat s; + + if (stat(filename, &s) != 0) return 0; + + return s.st_ino; +} + +int grokdir(char *dir, file_t **filelistp) +{ + DIR *cd; + file_t *newfile; + struct dirent *dirinfo; + int lastchar; + int filecount = 0; + struct stat info; + struct stat linfo; + static int progress = 0; + static char indicator[] = "-\\|/"; + + cd = opendir(dir); + + if (!cd) { + errormsg("could not chdir to %s\n", dir); + return 0; + } + + while ((dirinfo = readdir(cd)) != NULL) { + if (strcmp(dirinfo->d_name, ".") && strcmp(dirinfo->d_name, "..")) { + if (!ISFLAG(flags, F_HIDEPROGRESS)) { + fprintf(stderr, "\rBuilding file list %c ", indicator[progress]); + progress = (progress + 1) % 4; + } + + newfile = (file_t*) malloc(sizeof(file_t)); + + if (!newfile) { + errormsg("out of memory!\n"); + closedir(cd); + exit(1); + } else newfile->next = *filelistp; + + newfile->inode = 0; + newfile->crcsignature = NULL; + newfile->duplicates = NULL; + newfile->hasdupes = 0; + + newfile->d_name = (char*)malloc(strlen(dir)+strlen(dirinfo->d_name)+2); + + if (!newfile->d_name) { + errormsg("out of memory!\n"); + free(newfile); + closedir(cd); + exit(1); + } + + strcpy(newfile->d_name, dir); + lastchar = strlen(dir) - 1; + if (lastchar >= 0 && dir[lastchar] != '/') + strcat(newfile->d_name, "/"); + strcat(newfile->d_name, dirinfo->d_name); + + if (filesize(newfile->d_name) == 0 && ISFLAG(flags, F_EXCLUDEEMPTY)) { + free(newfile->d_name); + free(newfile); + continue; + } + + if (stat(newfile->d_name, &info) == -1) { + free(newfile->d_name); + free(newfile); + continue; + } + + if (lstat(newfile->d_name, &linfo) == -1) { + free(newfile->d_name); + free(newfile); + continue; + } + + if (S_ISDIR(info.st_mode)) { + if (ISFLAG(flags, F_RECURSE) && (ISFLAG(flags, F_FOLLOWLINKS) || !S_ISLNK(linfo.st_mode))) + filecount += grokdir(newfile->d_name, filelistp); + free(newfile->d_name); + free(newfile); + } else { + if (S_ISREG(linfo.st_mode) || (S_ISLNK(linfo.st_mode) && ISFLAG(flags, F_FOLLOWLINKS))) { + *filelistp = newfile; + filecount++; + } else { + free(newfile->d_name); + free(newfile); + } + } + } + } + + closedir(cd); + + return filecount; +} + +#ifndef EXTERNAL_MD5 + +/* If EXTERNAL_MD5 is not defined, use L. Peter Deutsch's MD5 library. + */ +char *getcrcsignature(char *filename) +{ + int x; + off_t fsize; + off_t toread; + md5_state_t state; + md5_byte_t digest[16]; + static md5_byte_t chunk[CHUNK_SIZE]; + static char signature[16*2 + 1]; + char *sigp; + FILE *file; + + md5_init(&state); + + fsize = filesize(filename); + + file = fopen(filename, "rb"); + if (file == NULL) { + errormsg("error opening file %s\n", filename); + return NULL; + } + + while (fsize > 0) { + toread = (fsize % CHUNK_SIZE) ? (fsize % CHUNK_SIZE) : CHUNK_SIZE; + if (fread(chunk, toread, 1, file) != 1) { + errormsg("error reading from file %s\n", filename); + return NULL; + } + md5_append(&state, chunk, toread); + fsize -= toread; + } + + md5_finish(&state, digest); + + sigp = signature; + + for (x = 0; x < 16; x++) { + sprintf(sigp, "%02x", digest[x]); + sigp = strchr(sigp, '\0'); + } + + fclose(file); + + return signature; +} + +#endif /* [#ifndef EXTERNAL_MD5] */ + +#ifdef EXTERNAL_MD5 + +/* If EXTERNAL_MD5 is defined, use md5sum program to calculate signatures. + */ +char *getcrcsignature(char *filename) +{ + static char signature[256]; + char *command; + char *separator; + FILE *result; + + command = (char*) malloc(strlen(filename)+strlen(EXTERNAL_MD5)+2); + if (command == NULL) { + errormsg("out of memory\n"); + exit(1); + } + + sprintf(command, "%s %s", EXTERNAL_MD5, filename); + + result = popen(command, "r"); + if (result == NULL) { + errormsg("error invoking %s\n", EXTERNAL_MD5); + exit(1); + } + + free(command); + + if (fgets(signature, 256, result) == NULL) { + errormsg("error generating signature for %s\n", filename); + return NULL; + } + separator = strchr(signature, ' '); + if (separator) *separator = '\0'; + + pclose(result); + + return signature; +} + +#endif /* [#ifdef EXTERNAL_MD5] */ + +void purgetree(filetree_t *checktree) +{ + if (checktree->left != NULL) purgetree(checktree->left); + + if (checktree->right != NULL) purgetree(checktree->right); + + free(checktree); +} + +#ifdef EXPERIMENTAL_RBTREE +/* Use a red-black tree structure to store file information. + */ + +void rotate_left(filetree_t **root, filetree_t *node) +{ + filetree_t *subject; + + subject = node->right; + node->right = subject->left; + + if (subject->left != NULL) subject->left->parent = node; + subject->parent = node->parent; + + if (node->parent == NULL) { + *root = subject; + } else { + if (node == node->parent->left) + node->parent->left = subject; + else + node->parent->right = subject; + } + + subject->left = node; + node->parent = subject; +} + +void rotate_right(filetree_t **root, filetree_t *node) +{ + filetree_t *subject; + + subject = node->left; + node->left = subject->right; + + if (subject->right != NULL) subject->right->parent = node; + subject->parent = node->parent; + + if (node->parent == NULL) { + *root = subject; + } else { + if (node == node->parent->left) + node->parent->left = subject; + else + node->parent->right = subject; + } + + subject->right = node; + node->parent = subject; +} + +#define TREE_LEFT -1 +#define TREE_RIGHT 1 +#define TREE_ROOT 0 + +void registerfile(filetree_t **root, filetree_t *parent, int loc, file_t *file) +{ + filetree_t *node; + filetree_t *uncle; + + file->size = filesize(file->d_name); + file->inode = getinode(file->d_name); + + node = (filetree_t*) malloc(sizeof(filetree_t)); + if (node == NULL) { + errormsg("out of memory!\n"); + exit(1); + } + + node->file = file; + node->left = NULL; + node->right = NULL; + node->parent = parent; + node->color = COLOR_RED; + + if (loc == TREE_ROOT) + *root = node; + else if (loc == TREE_LEFT) + parent->left = node; + else + parent->right = node; + + while (node != *root && node->parent->color == COLOR_RED) { + if (node->parent->parent == NULL) return; + + if (node->parent == node->parent->parent->left) { + uncle = node->parent->parent->right; + if (uncle == NULL) return; + + if (uncle->color == COLOR_RED) { + node->parent->color = COLOR_BLACK; + uncle->color = COLOR_BLACK; + node->parent->parent->color = COLOR_RED; + node = node->parent->parent; + } else { + if (node == node->parent->right) { + node = node->parent; + rotate_left(root, node); + } + node->parent->color = COLOR_BLACK; + node->parent->parent->color = COLOR_RED; + rotate_right(root, node->parent->parent); + } + } else { + uncle = node->parent->parent->left; + if (uncle == NULL) return; + + if (uncle->color == COLOR_RED) { + node->parent->color = COLOR_BLACK; + uncle->color = COLOR_BLACK; + node->parent->parent->color = COLOR_RED; + node = node->parent->parent; + } else { + if (node == node->parent->right) { + node = node->parent; + rotate_left(root, node); + } + node->parent->color = COLOR_BLACK; + node->parent->parent->color = COLOR_RED; + rotate_right(root, node->parent->parent); + } + } + } + + (*root)->color = COLOR_BLACK; +} + +#endif /* [#ifdef EXPERIMENTAL_RBTREE] */ + +#ifndef EXPERIMENTAL_RBTREE + +int registerfile(filetree_t **branch, file_t *file) +{ + file->size = filesize(file->d_name); + file->inode = getinode(file->d_name); + + *branch = (filetree_t*) malloc(sizeof(filetree_t)); + if (*branch == NULL) { + errormsg("out of memory!\n"); + exit(1); + } + + (*branch)->file = file; + (*branch)->left = NULL; + (*branch)->right = NULL; + + return 1; +} + +#endif /* [#ifndef EXPERIMENTAL_RBTREE] */ + +file_t *checkmatch(filetree_t **root, filetree_t *checktree, file_t *file) +{ + int cmpresult; + char *crcsignature; + off_t fsize; + + /* If inodes are equal one of the files is a hard link, which + is usually not accidental. We don't want to flag them as + duplicates, unless the user specifies otherwise. */ + + if (!ISFLAG(flags, F_CONSIDERHARDLINKS) && getinode(file->d_name) == + checktree->file->inode) return NULL; + + fsize = filesize(file->d_name); + + if (fsize < checktree->file->size) + cmpresult = -1; + else + if (fsize > checktree->file->size) cmpresult = 1; + else { + if (checktree->file->crcsignature == NULL) { + crcsignature = getcrcsignature(checktree->file->d_name); + if (crcsignature == NULL) return NULL; + + checktree->file->crcsignature = (char*) malloc(strlen(crcsignature)+1); + if (checktree->file->crcsignature == NULL) { + errormsg("out of memory\n"); + exit(1); + } + strcpy(checktree->file->crcsignature, crcsignature); + } + + if (file->crcsignature == NULL) { + crcsignature = getcrcsignature(file->d_name); + if (crcsignature == NULL) return NULL; + + file->crcsignature = (char*) malloc(strlen(crcsignature)+1); + if (file->crcsignature == NULL) { + errormsg("out of memory\n"); + exit(1); + } + strcpy(file->crcsignature, crcsignature); + } + + cmpresult = strcmp(file->crcsignature, checktree->file->crcsignature); + } + + if (cmpresult < 0) { + if (checktree->left != NULL) { + return checkmatch(root, checktree->left, file); + } else { +#ifndef EXPERIMENTAL_RBTREE + registerfile(&(checktree->left), file); +#else + registerfile(root, checktree, TREE_LEFT, file); +#endif + return NULL; + } + } else if (cmpresult > 0) { + if (checktree->right != NULL) { + return checkmatch(root, checktree->right, file); + } else { +#ifndef EXPERIMENTAL_RBTREE + registerfile(&(checktree->right), file); +#else + registerfile(root, checktree, TREE_RIGHT, file); +#endif + return NULL; + } + } else return checktree->file; +} + +/* Do a bit-for-bit comparison in case two different files produce the + same signature. Unlikely, but better safe than sorry. */ + +int confirmmatch(FILE *file1, FILE *file2) +{ + unsigned char c1; + unsigned char c2; + size_t r1; + size_t r2; + + fseek(file1, 0, SEEK_SET); + fseek(file2, 0, SEEK_SET); + + do { + r1 = fread(&c1, sizeof(c1), 1, file1); + r2 = fread(&c2, sizeof(c2), 1, file2); + + if (c1 != c2) return 0; /* file contents are different */ + } while (r1 && r2); + + if (r1 != r2) return 0; /* file lengths are different */ + + return 1; +} + +void printmatches(file_t *files) +{ + file_t *tmpfile; + + while (files != NULL) { + if (files->hasdupes) { + if (!ISFLAG(flags, F_OMITFIRST)) { + if (ISFLAG(flags, F_SHOWSIZE)) printf("%ld byte%seach:\n", files->size, + (files->size != 1) ? "s " : " "); + if (ISFLAG(flags, F_DSAMELINE)) escapefilename("\\ ", &files->d_name); + printf("%s%c", files->d_name, ISFLAG(flags, F_DSAMELINE)?' ':'\n'); + } + tmpfile = files->duplicates; + while (tmpfile != NULL) { + if (ISFLAG(flags, F_DSAMELINE)) escapefilename("\\ ", &tmpfile->d_name); + printf("%s%c", tmpfile->d_name, ISFLAG(flags, F_DSAMELINE)?' ':'\n'); + tmpfile = tmpfile->duplicates; + } + printf("\n"); + + } + + files = files->next; + } +} + +void autodelete(file_t *files) +{ + int counter; + int groups = 0; + int curgroup = 0; + file_t *tmpfile; + file_t *curfile; + file_t **dupelist; + int *preserve; + char *preservestr; + char *token; + char *tstr; + int number; + int sum; + int max = 0; + int x; + int i; + + curfile = files; + + while (curfile) { + if (curfile->hasdupes) { + counter = 1; + groups++; + + tmpfile = curfile->duplicates; + while (tmpfile) { + counter++; + tmpfile = tmpfile->duplicates; + } + + if (counter > max) max = counter; + } + + curfile = curfile->next; + } + + max++; + + dupelist = (file_t**) malloc(sizeof(file_t*) * max); + preserve = (int*) malloc(sizeof(int) * max); + preservestr = (char*) malloc(INPUT_SIZE); + + if (!dupelist || !preserve || !preservestr) { + errormsg("out of memory\n"); + exit(1); + } + + while (files) { + if (files->hasdupes) { + curgroup++; + counter = 1; + dupelist[counter] = files; + + printf("[%d] %s\n", counter, files->d_name); + + tmpfile = files->duplicates; + + while (tmpfile) { + dupelist[++counter] = tmpfile; + printf("[%d] %s\n", counter, tmpfile->d_name); + tmpfile = tmpfile->duplicates; + } + + printf("\n"); + + do { + printf("Set %d of %d, preserve files [1 - %d, all]", + curgroup, groups, counter); + if (ISFLAG(flags, F_SHOWSIZE)) printf(" (%ld byte%seach)", files->size, + (files->size != 1) ? "s " : " "); + printf(": "); + fflush(stdout); + + fgets(preservestr, INPUT_SIZE, stdin); + + i = strlen(preservestr) - 1; + + while (preservestr[i]!='\n'){ /* tail of buffer must be a newline */ + tstr = (char*) + realloc(preservestr, strlen(preservestr) + 1 + INPUT_SIZE); + if (!tstr) { /* couldn't allocate memory, treat as fatal */ + errormsg("out of memory!\n"); + exit(1); + } + + preservestr = tstr; + if (!fgets(preservestr + i + 1, INPUT_SIZE, stdin)) + break; /* stop if fgets fails -- possible EOF? */ + i = strlen(preservestr)-1; + } + + for (x = 1; x <= counter; x++) preserve[x] = 0; + + token = strtok(preservestr, " ,\n"); + + while (token != NULL) { + if (strcasecmp(token, "all") == 0) + for (x = 0; x <= counter; x++) preserve[x] = 1; + + number = 0; + sscanf(token, "%d", &number); + if (number > 0 && number <= counter) preserve[number] = 1; + + token = strtok(NULL, " ,\n"); + } + + for (sum = 0, x = 1; x <= counter; x++) sum += preserve[x]; + } while (sum < 1); /* make sure we've preserved at least one file */ + + printf("\n"); + + for (x = 1; x <= counter; x++) { + if (preserve[x]) + printf(" [+] %s\n", dupelist[x]->d_name); + else { + printf(" [-] %s\n", dupelist[x]->d_name); + remove(dupelist[x]->d_name); + } + } + printf("\n"); + } + + files = files->next; + } + + free(dupelist); + free(preserve); + free(preservestr); +} + +void help_text() +{ + printf("Usage: fdupes [options] DIRECTORY...\n\n"); + + printf(" -r --recurse \tinclude files residing in subdirectories\n"); + printf(" -s --symlinks \tfollow symlinks\n"); + printf(" -H --hardlinks \tnormally, when two or more files point to the same\n"); + printf(" \tdisk area they are treated as non-duplicates; this\n"); + printf(" \toption will change this behavior\n"); + printf(" -n --noempty \texclude zero-length files from consideration\n"); + printf(" -f --omitfirst \tomit the first file in each set of matches\n"); + printf(" -1 --sameline \tlist each set of matches on a single line\n"); + printf(" -S --size \tshow size of duplicate files\n"); + printf(" -q --quiet \thide progress indicator\n"); + printf(" -d --delete \tprompt user for files to preserve and delete all\n"); + printf(" \tothers; important: under particular circumstances,\n"); + printf(" \tdata may be lost when using this option together\n"); + printf(" \twith -s or --symlinks, or when specifying a\n"); + printf(" \tparticular directory more than once; refer to the\n"); + printf(" \tfdupes documentation for additional information\n"); + printf(" -v --version \tdisplay fdupes version\n"); + printf(" -h --help \tdisplay this help message\n\n"); +} + +int main(int argc, char **argv) { + int x; + int opt; + FILE *file1; + FILE *file2; + file_t *files = NULL; + file_t *curfile; + file_t *match = NULL; + filetree_t *checktree = NULL; + int filecount = 0; + int progress = 0; + + static struct option long_options[] = + { + { "omitfirst", 0, 0, 'f' }, + { "recurse", 0, 0, 'r' }, + { "quiet", 0, 0, 'q' }, + { "sameline", 0, 0, '1' }, + { "size", 0, 0, 'S' }, + { "symlinks", 0, 0, 's' }, + { "hardlinks", 0, 0, 'H' }, + { "noempty", 0, 0, 'n' }, + { "delete", 0, 0, 'd' }, + { "version", 0, 0, 'v' }, + { "help", 0, 0, 'h' }, + { 0, 0, 0, 0 } + }; + + program_name = argv[0]; + + while ((opt = getopt_long(argc, argv, "frq1SsHndvh", long_options, NULL)) != EOF) { + switch (opt) { + case 'f': + SETFLAG(flags, F_OMITFIRST); + break; + case 'r': + SETFLAG(flags, F_RECURSE); + break; + case 'q': + SETFLAG(flags, F_HIDEPROGRESS); + break; + case '1': + SETFLAG(flags, F_DSAMELINE); + break; + case 'S': + SETFLAG(flags, F_SHOWSIZE); + break; + case 's': + SETFLAG(flags, F_FOLLOWLINKS); + break; + case 'H': + SETFLAG(flags, F_CONSIDERHARDLINKS); + break; + case 'n': + SETFLAG(flags, F_EXCLUDEEMPTY); + break; + case 'd': + SETFLAG(flags, F_DELETEFILES); + break; + case 'v': + printf("fdupes %s\n", VERSION); + exit(0); + case 'h': + help_text(); + exit(1); + default: + fprintf(stderr, "Try `fdupes --help' for more information\n"); + exit(1); + } + } + + if (optind >= argc) { + errormsg("no directories specified\n"); + exit(1); + } + + for (x = optind; x < argc; x++) filecount += grokdir(argv[x], &files); + + if (!files) exit(0); + + curfile = files; + + while (curfile) { + if (!checktree) +#ifndef EXPERIMENTAL_RBTREE + registerfile(&checktree, curfile); +#else + registerfile(&checktree, NULL, TREE_ROOT, curfile); +#endif + else + match = checkmatch(&checktree, checktree, curfile); + + if (match != NULL) { + file1 = fopen(curfile->d_name, "rb"); + if (!file1) { + curfile = curfile->next; + continue; + } + + file2 = fopen(match->d_name, "rb"); + if (!file2) { + fclose(file1); + curfile = curfile->next; + continue; + } + + if (confirmmatch(file1, file2)) { + match->hasdupes = 1; + curfile->duplicates = match->duplicates; + match->duplicates = curfile; + } + + fclose(file1); + fclose(file2); + } + + curfile = curfile->next; + + if (!ISFLAG(flags, F_HIDEPROGRESS)) { + fprintf(stderr, "\rProgress [%d/%d] %d%% ", progress, filecount, + (int)((float) progress / (float) filecount * 100.0)); + progress++; + } + } + + if (!ISFLAG(flags, F_HIDEPROGRESS)) fprintf(stderr, "\r%40s\r", " "); + + if (ISFLAG(flags, F_DELETEFILES)) autodelete(files); + else printmatches(files); + + while (files) { + curfile = files->next; + free(files->d_name); + free(files); + files = curfile; + } + + purgetree(checktree); + + return 0; +} diff --git a/md5/README b/md5/README new file mode 100644 index 0000000..84a7f14 --- /dev/null +++ b/md5/README @@ -0,0 +1,4 @@ +The MD5 library code residing within this directory was written +by L. Peter Deutsch. Although distributed here with fdupes, the +license for his MD5 library is different from the fdupes license. +Please look md5.c or md5.h for licensing information. diff --git a/md5/md5.c b/md5/md5.c new file mode 100644 index 0000000..d6f78b8 --- /dev/null +++ b/md5/md5.c @@ -0,0 +1,392 @@ +/* + Copyright (C) 1999 Aladdin Enterprises. All rights reserved. + + This software is provided 'as-is', without any express or implied + warranty. In no event will the authors be held liable for any damages + arising from the use of this software. + + Permission is granted to anyone to use this software for any purpose, + including commercial applications, and to alter it and redistribute it + freely, subject to the following restrictions: + + 1. The origin of this software must not be misrepresented; you must not + claim that you wrote the original software. If you use this software + in a product, an acknowledgment in the product documentation would be + appreciated but is not required. + 2. Altered source versions must be plainly marked as such, and must not be + misrepresented as being the original software. + 3. This notice may not be removed or altered from any source distribution. + + L. Peter Deutsch + ghost@aladdin.com + + */ +/*$Id: md5.c $ */ +/* + Independent implementation of MD5 (RFC 1321). + + This code implements the MD5 Algorithm defined in RFC 1321. + It is derived directly from the text of the RFC and not from the + reference implementation. + + The original and principal author of md5.c is L. Peter Deutsch + <ghost@aladdin.com>. Other authors are noted in the change history + that follows (in reverse chronological order): + + 1999-11-04 lpd Edited comments slightly for automatic TOC extraction. + 1999-10-18 lpd Fixed typo in header comment (ansi2knr rather than md5). + 1999-05-03 lpd Original version. + */ + +#include "md5.h" + +#ifdef TEST +/* + * Compile with -DTEST to create a self-contained executable test program. + * The test program should print out the same values as given in section + * A.5 of RFC 1321, reproduced below. + */ +#include <string.h> +main() +{ + static const char *const test[7] = { + "", /*d41d8cd98f00b204e9800998ecf8427e*/ + "a", /*0cc175b9c0f1b6a831c399e269772661*/ + "abc", /*900150983cd24fb0d6963f7d28e17f72*/ + "message digest", /*f96b697d7cb7938d525a2f31aaf161d0*/ + "abcdefghijklmnopqrstuvwxyz", /*c3fcd3d76192e4007dfb496cca67e13b*/ + "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789", + /*d174ab98d277d9f5a5611c2c9f419d9f*/ + "12345678901234567890123456789012345678901234567890123456789012345678901234567890" /*57edf4a22be3c955ac49da2e2107b67a*/ + }; + int i; + + for (i = 0; i < 7; ++i) { + md5_state_t state; + md5_byte_t digest[16]; + int di; + + md5_init(&state); + md5_append(&state, (const md5_byte_t *)test[i], strlen(test[i])); + md5_finish(&state, digest); + printf("MD5 (\"%s\") = ", test[i]); + for (di = 0; di < 16; ++di) + printf("%02x", digest[di]); + printf("\n"); + } + return 0; +} +#endif /* TEST */ + + +/* + * For reference, here is the program that computed the T values. + */ +#if 0 +#include <math.h> +main() +{ + int i; + for (i = 1; i <= 64; ++i) { + unsigned long v = (unsigned long)(4294967296.0 * fabs(sin((double)i))); + printf("#define T%d 0x%08lx\n", i, v); + } + return 0; +} +#endif +/* + * End of T computation program. + */ +#define T1 0xd76aa478 +#define T2 0xe8c7b756 +#define T3 0x242070db +#define T4 0xc1bdceee +#define T5 0xf57c0faf +#define T6 0x4787c62a +#define T7 0xa8304613 +#define T8 0xfd469501 +#define T9 0x698098d8 +#define T10 0x8b44f7af +#define T11 0xffff5bb1 +#define T12 0x895cd7be +#define T13 0x6b901122 +#define T14 0xfd987193 +#define T15 0xa679438e +#define T16 0x49b40821 +#define T17 0xf61e2562 +#define T18 0xc040b340 +#define T19 0x265e5a51 +#define T20 0xe9b6c7aa +#define T21 0xd62f105d +#define T22 0x02441453 +#define T23 0xd8a1e681 +#define T24 0xe7d3fbc8 +#define T25 0x21e1cde6 +#define T26 0xc33707d6 +#define T27 0xf4d50d87 +#define T28 0x455a14ed +#define T29 0xa9e3e905 +#define T30 0xfcefa3f8 +#define T31 0x676f02d9 +#define T32 0x8d2a4c8a +#define T33 0xfffa3942 +#define T34 0x8771f681 +#define T35 0x6d9d6122 +#define T36 0xfde5380c +#define T37 0xa4beea44 +#define T38 0x4bdecfa9 +#define T39 0xf6bb4b60 +#define T40 0xbebfbc70 +#define T41 0x289b7ec6 +#define T42 0xeaa127fa +#define T43 0xd4ef3085 +#define T44 0x04881d05 +#define T45 0xd9d4d039 +#define T46 0xe6db99e5 +#define T47 0x1fa27cf8 +#define T48 0xc4ac5665 +#define T49 0xf4292244 +#define T50 0x432aff97 +#define T51 0xab9423a7 +#define T52 0xfc93a039 +#define T53 0x655b59c3 +#define T54 0x8f0ccc92 +#define T55 0xffeff47d +#define T56 0x85845dd1 +#define T57 0x6fa87e4f +#define T58 0xfe2ce6e0 +#define T59 0xa3014314 +#define T60 0x4e0811a1 +#define T61 0xf7537e82 +#define T62 0xbd3af235 +#define T63 0x2ad7d2bb +#define T64 0xeb86d391 + +static void +md5_process(md5_state_t *pms, const md5_byte_t *data /*[64]*/) +{ + md5_word_t + a = pms->abcd[0], b = pms->abcd[1], + c = pms->abcd[2], d = pms->abcd[3]; + md5_word_t t; + +#ifndef ARCH_IS_BIG_ENDIAN +# define ARCH_IS_BIG_ENDIAN 1 /* slower, default implementation */ +#endif +#if ARCH_IS_BIG_ENDIAN + + /* + * On big-endian machines, we must arrange the bytes in the right + * order. (This also works on machines of unknown byte order.) + */ + md5_word_t X[16]; + const md5_byte_t *xp = data; + int i; + + for (i = 0; i < 16; ++i, xp += 4) + X[i] = xp[0] + (xp[1] << 8) + (xp[2] << 16) + (xp[3] << 24); + +#else /* !ARCH_IS_BIG_ENDIAN */ + + /* + * On little-endian machines, we can process properly aligned data + * without copying it. + */ + md5_word_t xbuf[16]; + const md5_word_t *X; + + if (!((data - (const md5_byte_t *)0) & 3)) { + /* data are properly aligned */ + X = (const md5_word_t *)data; + } else { + /* not aligned */ + memcpy(xbuf, data, 64); + X = xbuf; + } +#endif + +#define ROTATE_LEFT(x, n) (((x) << (n)) | ((x) >> (32 - (n)))) + + /* Round 1. */ + /* Let [abcd k s i] denote the operation + a = b + ((a + F(b,c,d) + X[k] + T[i]) <<< s). */ +#define F(x, y, z) (((x) & (y)) | (~(x) & (z))) +#define SET(a, b, c, d, k, s, Ti)\ + t = a + F(b,c,d) + X[k] + Ti;\ + a = ROTATE_LEFT(t, s) + b + /* Do the following 16 operations. */ + SET(a, b, c, d, 0, 7, T1); + SET(d, a, b, c, 1, 12, T2); + SET(c, d, a, b, 2, 17, T3); + SET(b, c, d, a, 3, 22, T4); + SET(a, b, c, d, 4, 7, T5); + SET(d, a, b, c, 5, 12, T6); + SET(c, d, a, b, 6, 17, T7); + SET(b, c, d, a, 7, 22, T8); + SET(a, b, c, d, 8, 7, T9); + SET(d, a, b, c, 9, 12, T10); + SET(c, d, a, b, 10, 17, T11); + SET(b, c, d, a, 11, 22, T12); + SET(a, b, c, d, 12, 7, T13); + SET(d, a, b, c, 13, 12, T14); + SET(c, d, a, b, 14, 17, T15); + SET(b, c, d, a, 15, 22, T16); +#undef SET + + /* Round 2. */ + /* Let [abcd k s i] denote the operation + a = b + ((a + G(b,c,d) + X[k] + T[i]) <<< s). */ +#define G(x, y, z) (((x) & (z)) | ((y) & ~(z))) +#define SET(a, b, c, d, k, s, Ti)\ + t = a + G(b,c,d) + X[k] + Ti;\ + a = ROTATE_LEFT(t, s) + b + /* Do the following 16 operations. */ + SET(a, b, c, d, 1, 5, T17); + SET(d, a, b, c, 6, 9, T18); + SET(c, d, a, b, 11, 14, T19); + SET(b, c, d, a, 0, 20, T20); + SET(a, b, c, d, 5, 5, T21); + SET(d, a, b, c, 10, 9, T22); + SET(c, d, a, b, 15, 14, T23); + SET(b, c, d, a, 4, 20, T24); + SET(a, b, c, d, 9, 5, T25); + SET(d, a, b, c, 14, 9, T26); + SET(c, d, a, b, 3, 14, T27); + SET(b, c, d, a, 8, 20, T28); + SET(a, b, c, d, 13, 5, T29); + SET(d, a, b, c, 2, 9, T30); + SET(c, d, a, b, 7, 14, T31); + SET(b, c, d, a, 12, 20, T32); +#undef SET + + /* Round 3. */ + /* Let [abcd k s t] denote the operation + a = b + ((a + H(b,c,d) + X[k] + T[i]) <<< s). */ +#define H(x, y, z) ((x) ^ (y) ^ (z)) +#define SET(a, b, c, d, k, s, Ti)\ + t = a + H(b,c,d) + X[k] + Ti;\ + a = ROTATE_LEFT(t, s) + b + /* Do the following 16 operations. */ + SET(a, b, c, d, 5, 4, T33); + SET(d, a, b, c, 8, 11, T34); + SET(c, d, a, b, 11, 16, T35); + SET(b, c, d, a, 14, 23, T36); + SET(a, b, c, d, 1, 4, T37); + SET(d, a, b, c, 4, 11, T38); + SET(c, d, a, b, 7, 16, T39); + SET(b, c, d, a, 10, 23, T40); + SET(a, b, c, d, 13, 4, T41); + SET(d, a, b, c, 0, 11, T42); + SET(c, d, a, b, 3, 16, T43); + SET(b, c, d, a, 6, 23, T44); + SET(a, b, c, d, 9, 4, T45); + SET(d, a, b, c, 12, 11, T46); + SET(c, d, a, b, 15, 16, T47); + SET(b, c, d, a, 2, 23, T48); +#undef SET + + /* Round 4. */ + /* Let [abcd k s t] denote the operation + a = b + ((a + I(b,c,d) + X[k] + T[i]) <<< s). */ +#define I(x, y, z) ((y) ^ ((x) | ~(z))) +#define SET(a, b, c, d, k, s, Ti)\ + t = a + I(b,c,d) + X[k] + Ti;\ + a = ROTATE_LEFT(t, s) + b + /* Do the following 16 operations. */ + SET(a, b, c, d, 0, 6, T49); + SET(d, a, b, c, 7, 10, T50); + SET(c, d, a, b, 14, 15, T51); + SET(b, c, d, a, 5, 21, T52); + SET(a, b, c, d, 12, 6, T53); + SET(d, a, b, c, 3, 10, T54); + SET(c, d, a, b, 10, 15, T55); + SET(b, c, d, a, 1, 21, T56); + SET(a, b, c, d, 8, 6, T57); + SET(d, a, b, c, 15, 10, T58); + SET(c, d, a, b, 6, 15, T59); + SET(b, c, d, a, 13, 21, T60); + SET(a, b, c, d, 4, 6, T61); + SET(d, a, b, c, 11, 10, T62); + SET(c, d, a, b, 2, 15, T63); + SET(b, c, d, a, 9, 21, T64); +#undef SET + + /* Then perform the following additions. (That is increment each + of the four registers by the value it had before this block + was started.) */ + pms->abcd[0] += a; + pms->abcd[1] += b; + pms->abcd[2] += c; + pms->abcd[3] += d; +} + +void +md5_init(md5_state_t *pms) +{ + pms->count[0] = pms->count[1] = 0; + pms->abcd[0] = 0x67452301; + pms->abcd[1] = 0xefcdab89; + pms->abcd[2] = 0x98badcfe; + pms->abcd[3] = 0x10325476; +} + +void +md5_append(md5_state_t *pms, const md5_byte_t *data, int nbytes) +{ + const md5_byte_t *p = data; + int left = nbytes; + int offset = (pms->count[0] >> 3) & 63; + md5_word_t nbits = (md5_word_t)(nbytes << 3); + + if (nbytes <= 0) + return; + + /* Update the message length. */ + pms->count[1] += nbytes >> 29; + pms->count[0] += nbits; + if (pms->count[0] < nbits) + pms->count[1]++; + + /* Process an initial partial block. */ + if (offset) { + int copy = (offset + nbytes > 64 ? 64 - offset : nbytes); + + memcpy(pms->buf + offset, p, copy); + if (offset + copy < 64) + return; + p += copy; + left -= copy; + md5_process(pms, pms->buf); + } + + /* Process full blocks. */ + for (; left >= 64; p += 64, left -= 64) + md5_process(pms, p); + + /* Process a final partial block. */ + if (left) + memcpy(pms->buf, p, left); +} + +void +md5_finish(md5_state_t *pms, md5_byte_t digest[16]) +{ + static const md5_byte_t pad[64] = { + 0x80, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 + }; + md5_byte_t data[8]; + int i; + + /* Save the length before padding. */ + for (i = 0; i < 8; ++i) + data[i] = (md5_byte_t)(pms->count[i >> 2] >> ((i & 3) << 3)); + /* Pad to 56 bytes mod 64. */ + md5_append(pms, pad, ((55 - (pms->count[0] >> 3)) & 63) + 1); + /* Append the length. */ + md5_append(pms, data, 8); + for (i = 0; i < 16; ++i) + digest[i] = (md5_byte_t)(pms->abcd[i >> 2] >> ((i & 3) << 3)); +} diff --git a/md5/md5.h b/md5/md5.h new file mode 100644 index 0000000..6614ef4 --- /dev/null +++ b/md5/md5.h @@ -0,0 +1,98 @@ +/* + Copyright (C) 1999 Aladdin Enterprises. All rights reserved. + + This software is provided 'as-is', without any express or implied + warranty. In no event will the authors be held liable for any damages + arising from the use of this software. + + Permission is granted to anyone to use this software for any purpose, + including commercial applications, and to alter it and redistribute it + freely, subject to the following restrictions: + + 1. The origin of this software must not be misrepresented; you must not + claim that you wrote the original software. If you use this software + in a product, an acknowledgment in the product documentation would be + appreciated but is not required. + 2. Altered source versions must be plainly marked as such, and must not be + misrepresented as being the original software. + 3. This notice may not be removed or altered from any source distribution. + + L. Peter Deutsch + ghost@aladdin.com + + */ +/*$Id: md5.h $ */ +/* + Independent implementation of MD5 (RFC 1321). + + This code implements the MD5 Algorithm defined in RFC 1321. + It is derived directly from the text of the RFC and not from the + reference implementation. + + The original and principal author of md5.h is L. Peter Deutsch + <ghost@aladdin.com>. Other authors are noted in the change history + that follows (in reverse chronological order): + + 1999-11-04 lpd Edited comments slightly for automatic TOC extraction. + 1999-10-18 lpd Fixed typo in header comment (ansi2knr rather than md5); + added conditionalization for C++ compilation from Martin + Purschke <purschke@bnl.gov>. + 1999-05-03 lpd Original version. + */ + +#ifndef md5_INCLUDED +# define md5_INCLUDED + +/* + * This code has some adaptations for the Ghostscript environment, but it + * will compile and run correctly in any environment with 8-bit chars and + * 32-bit ints. Specifically, it assumes that if the following are + * defined, they have the same meaning as in Ghostscript: P1, P2, P3, + * ARCH_IS_BIG_ENDIAN. + */ + +typedef unsigned char md5_byte_t; /* 8-bit byte */ +typedef unsigned int md5_word_t; /* 32-bit word */ + +/* Define the state of the MD5 Algorithm. */ +typedef struct md5_state_s { + md5_word_t count[2]; /* message length in bits, lsw first */ + md5_word_t abcd[4]; /* digest buffer */ + md5_byte_t buf[64]; /* accumulate block */ +} md5_state_t; + +#ifdef __cplusplus +extern "C" +{ +#endif + +/* Initialize the algorithm. */ +#ifdef P1 +void md5_init(P1(md5_state_t *pms)); +#else +void md5_init(md5_state_t *pms); +#endif + +/* Append a string to the message. */ +#ifdef P3 +void md5_append(P3(md5_state_t *pms, const md5_byte_t *data, int nbytes)); +#else +void md5_append(md5_state_t *pms, const md5_byte_t *data, int nbytes); +#endif + +/* Finish the message and return the digest. */ +#ifdef P2 +void md5_finish(P2(md5_state_t *pms, md5_byte_t digest[16])); +#else +void md5_finish(md5_state_t *pms, md5_byte_t digest[16]); +#endif + +#ifdef __cplusplus +} /* end extern "C" */ +#endif + +#endif /* md5_INCLUDED */ + + + + diff --git a/testdir/nine_upsidedown b/testdir/nine_upsidedown new file mode 100644 index 0000000..ffe2fce --- /dev/null +++ b/testdir/nine_upsidedown @@ -0,0 +1 @@ +six diff --git a/testdir/recursed_a/five b/testdir/recursed_a/five new file mode 100644 index 0000000..54f9d6d --- /dev/null +++ b/testdir/recursed_a/five @@ -0,0 +1 @@ +five diff --git a/testdir/recursed_a/one b/testdir/recursed_a/one new file mode 100644 index 0000000..5626abf --- /dev/null +++ b/testdir/recursed_a/one @@ -0,0 +1 @@ +one diff --git a/testdir/recursed_a/two b/testdir/recursed_a/two new file mode 100644 index 0000000..f719efd --- /dev/null +++ b/testdir/recursed_a/two @@ -0,0 +1 @@ +two diff --git a/testdir/recursed_b/four b/testdir/recursed_b/four new file mode 100644 index 0000000..8510665 --- /dev/null +++ b/testdir/recursed_b/four @@ -0,0 +1 @@ +four diff --git a/testdir/recursed_b/one b/testdir/recursed_b/one new file mode 100644 index 0000000..5626abf --- /dev/null +++ b/testdir/recursed_b/one @@ -0,0 +1 @@ +one diff --git a/testdir/recursed_b/three b/testdir/recursed_b/three new file mode 100644 index 0000000..2bdf67a --- /dev/null +++ b/testdir/recursed_b/three @@ -0,0 +1 @@ +three diff --git a/testdir/recursed_b/two_plus_one b/testdir/recursed_b/two_plus_one new file mode 100644 index 0000000..2bdf67a --- /dev/null +++ b/testdir/recursed_b/two_plus_one @@ -0,0 +1 @@ +three diff --git a/testdir/symlink_dir b/testdir/symlink_dir new file mode 120000 index 0000000..895d79b --- /dev/null +++ b/testdir/symlink_dir @@ -0,0 +1 @@ +recursed_a
\ No newline at end of file diff --git a/testdir/symlink_two b/testdir/symlink_two new file mode 120000 index 0000000..64c5e58 --- /dev/null +++ b/testdir/symlink_two @@ -0,0 +1 @@ +two
\ No newline at end of file diff --git a/testdir/twice_one b/testdir/twice_one new file mode 100644 index 0000000..f719efd --- /dev/null +++ b/testdir/twice_one @@ -0,0 +1 @@ +two diff --git a/testdir/two b/testdir/two new file mode 100644 index 0000000..f719efd --- /dev/null +++ b/testdir/two @@ -0,0 +1 @@ +two diff --git a/testdir/with spaces a b/testdir/with spaces a new file mode 100644 index 0000000..9502d6c --- /dev/null +++ b/testdir/with spaces a @@ -0,0 +1 @@ +with spaces diff --git a/testdir/with spaces b b/testdir/with spaces b new file mode 100644 index 0000000..9502d6c --- /dev/null +++ b/testdir/with spaces b @@ -0,0 +1 @@ +with spaces diff --git a/testdir/zero_a b/testdir/zero_a new file mode 100644 index 0000000..e69de29 --- /dev/null +++ b/testdir/zero_a diff --git a/testdir/zero_b b/testdir/zero_b new file mode 100644 index 0000000..e69de29 --- /dev/null +++ b/testdir/zero_b |