Duplicate files (Julia)

Started by Tomaaz, Jul 24, 2023, 10:05 PM

Tomaaz

This is a very simple script I've written in Julia. It walks a directory recursively, computes the MD5 checksum of every file, and prints each group of files with identical checksums in the terminal.

using MD5
function main(location)
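    # Maps each MD5 hex digest to the paths of all files with that checksum
    sums_names = Dict{String, Vector{String}}()
    nof = 0  # number of files analyzed
    dup = 0  # duplicates found
    ori = 0  # distinct (original) files
    # Iterate walkdir directly; a variable named `all` would shadow Base.all
    for (root, dirs, files) in walkdir(location)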
        for name in files
            nof += 1
            path = joinpath(root, name)
            suma = open(md5, path)
            suma_hex = bytes2hex(suma)
            if haskey(sums_names, suma_hex)
                # push! mutates the vector in place, so no reassignment is needed
                push!(sums_names[suma_hex], path)
                dup += 1
            else
                sums_names[suma_hex] = [path]
                ori += 1
            end
            print("\033c")
            println(nof, " files analyzed...\n")
            println("Number of originals: ", ori)
            println("Number of duplicates: ", dup, "\n")
        end
    end
    # Print every group of files that share a checksum
    for paths in values(sums_names)
        if length(paths) > 1
            println("Original and duplicates:")
            for x in paths
                println(x)
            end
            println()
        end
    end
    println("Number of files: ", nof)
    println("Number of originals: ", ori)
    println("Number of duplicates: ", dup)
end

main("/home/tom/Documents/Coding/Julia/")
   


Tomaaz

And here is a Ruby version:

require 'digest'
sumy = {}
dup, ori, nof = 0, 0, 0
Dir.glob("/home/tom/Pictures/**/*").filter {|plik| File.file?(plik)}.each do |x|
    nof += 1
    suma = Digest::MD5.file(x).hexdigest  # hash each file only once
    if sumy[suma]
        sumy[suma] << x
        dup += 1
    else
        sumy[suma] = [x]
        ori += 1
    end
    system('clear')
    print  nof, " files analyzed...\n\n"
    print "Number of originals: ", ori, "\n"
    print "Number of duplicates: ", dup, "\n"
end
sumy.each do |x, y|
    if y.length > 1 then
        puts "\nOriginal and duplicates:\n"
        y.each {|z| puts z}
    end
end

print "Number of files: ", nof, "\n"
print "Number of originals: ", ori, "\n"
print "Number of duplicates: ", dup, "\n"

To be honest, I like this one more. If you run Linux, there is a chance that Ruby is already installed, so you can use this script to find duplicate files straight away. Just change the directory in the fourth line (the Dir.glob call) and it will recursively check all files in it.
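
If you would rather not edit the path inside the script, the same idea can be wrapped in a small function. This is only a sketch, not the script above: the name find_duplicates is made up for illustration, and it uses group_by instead of the hand-written counters, so it skips the live progress display and only returns the groups of files that share a checksum.

```ruby
require 'digest'

# Hypothetical helper (not part of the original script): returns a hash that
# maps each MD5 hex digest to the list of files sharing it, keeping only
# digests that occur more than once.
def find_duplicates(dir)
  Dir.glob(File.join(dir, '**', '*'))
     .select { |path| File.file?(path) }                     # skip directories
     .group_by { |path| Digest::MD5.file(path).hexdigest }   # digest => [paths]
     .select { |_digest, paths| paths.length > 1 }           # keep duplicates only
end
```

Something like find_duplicates(ARGV.fetch(0, '.')).each_value { |paths| puts "\nOriginal and duplicates:", paths } would then take the directory from the command line and fall back to the current directory when no argument is given.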