Comparing files across directories


jon rouse
02-13-2004, 10:24 AM
I had to back up a couple of machines in a hurry, and have got myself in a bit of a mess. I probably have the same few dozen files in several places on the disk.

Is there a bit of software that will throw up all instances of a file name anywhere on my hard disk and allow me to compare file sizes and dates? Even better if it let me examine the file content of each.

There used to be a Norton Utility on DOS that did this. Is there anything similar for Linux?

I could use file manager and search, but there are an awful lot of files.

If I am able to see that files are the same size and date, I could happily delete the duplicates.

ph34r
02-13-2004, 11:27 AM
# Read one path per line so names containing spaces are not split
find / -iname 'filename' | while IFS= read -r i
do
    ls -lh "$i" >> file_log
done

Then look through file_log to see the size, date, and time of each copy.
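
If there are too many names to search for one at a time, a variation on the same idea is to let find list everything once and keep only the names that occur more than once. This is only a sketch: it assumes GNU find (for -printf) and file names without tabs or newlines, and list_dupe_names is just a name made up for the example.

```shell
# list_dupe_names DIR
# Print "name<TAB>size<TAB>mtime<TAB>path" for every regular file under DIR
# whose base name occurs more than once. Assumes GNU find (-printf).
list_dupe_names() {
    find "$1" -type f -printf '%f\t%s\t%TY-%Tm-%Td %TH:%TM\t%p\n' |
        LC_ALL=C sort |
        awk -F '\t' '
            # Remember every line and count how often each base name appears
            { count[$1]++; line[NR] = $0; name[NR] = $1 }
            # Print, in sorted order, only the lines whose name is repeated
            END {
                for (i = 1; i <= NR; i++)
                    if (count[name[i]] > 1) print line[i]
            }'
}
```

Something like list_dupe_names /home > file_log would then put all copies of each name next to each other, with size in bytes and modification time, so matching sizes and dates are easy to spot.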

cwjolly
02-13-2004, 01:03 PM
Cut and paste this small Tcl script, which I use to compare two directories. It does the comparison with checksums, so file timestamps are not a factor. It lists files that are in one directory but not the other and, for files that exist in both, which are unchanged and which have changed. It recurses into both directories.

Make sure the path to tclsh is correct and that cksum and find are on your PATH. Oops, I see you can attach files, so the content below is also in the attached file. Save it as diffdir.tcl, then run chmod 755 diffdir.tcl to make it executable.

#!/usr/bin/tclsh

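# Returns the CRC that cksum reports for filename, or -1 if cksum cannot be run
proc checksum { filename } {
    set retval -1
    # Build the pipeline as a list so file names with spaces reach cksum intact
    if { [ catch { open "|[list cksum $filename]" "r" } fd ] != 0 } {
        puts stderr "Unable to cksum $filename: $fd "
    } else {
        gets $fd line
        catch { close $fd }
        # cksum prints "CRC size name"; keep only the CRC
        set retval [lindex [split $line " " ] 0 ]
    }
    return $retval
}
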
# Fills filearray with path -> checksum for every file under dir.
# Returns 0 on failure, 1 on success.
proc getFiles { dir filearray } {
    upvar $filearray f
    set retval 1
    set curdir [pwd]
    cd $dir
    if { [catch { open "|find ./ -type f " "r" } g ] != 0 } {
        puts stderr "getFiles: find had error : $g "
        set retval 0
    } else {
        set buff [ read $g ]
        catch { close $g }
        foreach filename [split $buff "\n" ] {
            if { [string length $filename ] == 0 } { continue }
            set f($filename) [ checksum $filename ]
        }
    }
    cd $curdir
    return $retval
}

proc Usage {} {
    global argv0
    puts stdout " Format: $argv0 dir1 dir2"
    puts stdout " Synopsis: Recursively finds the differences between the files in two directories."
    puts stdout " "
    puts stdout " First character of line    Meaning"
    puts stdout " -----------------------    -------"
    puts stdout "            <               file only in directory 1"
    puts stdout "            >               file only in directory 2"
    puts stdout "            =               file unchanged in both directories"
    puts stdout "            !               file is different between directory 1 and 2"
    puts stdout " "
    puts stdout " "
}


if { $argc != 2 } {
    Usage
    exit 127
}

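set dir1 [ lindex $argv 0 ]
set dir2 [ lindex $argv 1 ]

# Start with empty lists so the loops below still work when a category has no files
set notInDir2 {}
set inDir2 {}
set inDir2Changed {}

getFiles $dir1 farray
getFiles $dir2 farray2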

# sort the files of directory 1 into "also in directory 2" and "only in directory 1"
foreach fname [ array names farray ] {
    if { ![ info exists farray2($fname) ] } {
        lappend notInDir2 $fname
    } else {
        lappend inDir2 $fname
    }
}
# of the files present in both, pick out those whose checksums differ
foreach fname $inDir2 {
    if { $farray($fname) != $farray2($fname) } {
        lappend inDir2Changed $fname
    }
}
# output
foreach fname $inDir2Changed {
    puts stdout "! $fname"
}
foreach fname $inDir2 {
    if { [lsearch $inDir2Changed $fname ] != -1 } { continue }
    puts stdout "= $fname"
}
foreach fname $notInDir2 {
    puts stdout "< $fname"
}
foreach fname [ array names farray2 ] {
    if { ![ info exists farray($fname) ] } {
        puts stdout "> $fname "
    }
}
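
The script above compares files that have the same relative path in two directories. For copies scattered under different names or paths within one tree, the same cksum idea can be turned around: checksum everything and report the files whose checksum and size match. A rough shell sketch, assuming find and cksum are on your PATH and file names contain no newlines; find_dupe_content is a made-up name for the example:

```shell
# find_dupe_content DIR
# Print the cksum line ("CRC size path") of every regular file under DIR
# that shares both CRC and size with at least one other file.
find_dupe_content() {
    find "$1" -type f -exec cksum {} + |
        sort -k1,1n -k2,2n |
        awk '
            { key = $1 " " $2 }
            # Same CRC and size as the previous line: part of a duplicate group
            key == prev { if (!shown) print prevline; print; shown = 1 }
            key != prev { shown = 0 }
            { prev = key; prevline = $0 }'
}
```

A matching CRC and size is a strong hint rather than proof, so it is worth confirming a pair with cmp before deleting anything. Note that all empty files will be reported as duplicates of each other.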

jon rouse
02-23-2004, 05:55 AM
Thanks, I'll give that a whirl.