jdupes: file de-duplication tool
This simple guide to jdupes is split into traffic light sections 🚦 based on the danger of the commands.
If you want some file de-duplication test data, there’s a simple repo at https://github.com/alexhunsley/file-duplication-test-data.
Please test any file de-duplication tool to a high level of personal confidence before unleashing it on your real data. Before using the commands I give here, please check them against what man jdupes says.
Never use power tools like jdupes without:
- having an up-to-date backup of the data you are working on
- afterwards verifying that the changes you made are what you expected!
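One way to do the "verify afterwards" step is to record a file listing before and after the run and compare the two. The following is only a sketch using standard tools; the function name snapshot and the demo paths are my own placeholders, not anything jdupes provides, and the rm line stands in for whatever de-duplication command you actually run.

```shell
#!/usr/bin/env bash
# Sketch: record what exists before and after a de-duplication run.
# 'snapshot' and the demo paths are illustrative names, not part of jdupes.
set -eu

# Print a sorted list of every regular file under a directory, relative to it.
snapshot() {
  (cd "$1" && find . -type f | LC_ALL=C sort)
}

# Demo on throwaway data so nothing real is touched.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/target/a" "$work/target/b"
printf 'same\n' > "$work/target/a/one.txt"
printf 'same\n' > "$work/target/b/copy-of-one.txt"

snapshot "$work/target" > "$work/before.txt"

# Stand-in for the real de-duplication step (e.g. a jdupes run).
rm "$work/target/b/copy-of-one.txt"

snapshot "$work/target" > "$work/after.txt"

# Lines that appear only in before.txt are the files that were removed.
removed=$(LC_ALL=C comm -23 "$work/before.txt" "$work/after.txt")
echo "Removed files:"
echo "$removed"
```

Reading the list of removed files against what you expected is a cheap check, but it does not replace the backup.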
Green light
No files are deleted.
Recursively searches all given dirs for duplicates, outputting match groups, their file sizes, and a summary:
> jdupes -rOMS dir1/ dir2/ dir3/
I’ve supplied three search dirs above, but you can supply any number, including just one.
The -O flag guarantees the dupes in each group are listed in the same order as the dirs given. If you add sorting options to this command, you might want to remove the -O, as it overrides those.
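If you would rather build your own tiny dataset than clone a repo, something like the following could work as a sandbox for the green-light command. It is a sketch under assumptions: the dir and file names are made up for the demo, and the jdupes call only runs if jdupes is on your PATH.

```shell
#!/usr/bin/env bash
# Sketch: build a tiny throwaway dataset and run the read-only report on it.
# All dir and file names here are made up for the demo.
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/dir1" "$work/dir2/nested"

printf 'duplicate content\n' > "$work/dir1/a.txt"
printf 'duplicate content\n' > "$work/dir2/nested/a-copy.txt"  # dupe across dirs
printf 'unique content\n'    > "$work/dir2/unique.txt"

# Sanity check with a standard tool: cmp -s exits 0 only for identical files.
if cmp -s "$work/dir1/a.txt" "$work/dir2/nested/a-copy.txt"; then
  echo "test data contains a known duplicate pair"
fi

# Read-only report; only attempted if jdupes is installed.
if command -v jdupes >/dev/null 2>&1; then
  jdupes -rOMS "$work/dir1/" "$work/dir2/" \
    || echo "jdupes exited with status $?"
else
  echo "jdupes not found; skipping report"
fi

# Count the files afterwards: a green-light run should leave all of them.
file_count=$(find "$work" -type f | wc -l | tr -d ' ')
echo "files still present: $file_count"
```

The file count at the end is the point of the exercise: a green-light run should leave it unchanged.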
Amber light
Recursively searches all given dirs for duplicates, outputting match groups, their file sizes, and a summary.
For each dupe group in turn, you are asked which file to keep; the rest are deleted as soon as you answer.
# this command interactively allows you to delete files one-by-one
> jdupes -rS -d dir1/ dir2/
Red light
Files can be deleted without any further user input.
# DANGER! This can delete files without any further input!
> jdupes -rOS -d -N dir1/ dir2/ dir3/
This recursively searches all given dirs for duplicates and deletes them without prompting. -N keeps the first file in each dupe group, and -O lists each group in the order the dirs were given, so a copy in the first given dir is preserved where possible.
The -N flag is the dangerous one here: it’s ‘delete without interactive prompt’.
Note that the above command can find and delete dupes that exist within just one of the input dirs (as well as dupes across input dirs). If you never want to detect dupes that are inside the same input dir, add the -I flag for ‘isolate’:
# DANGER!
> jdupes -rOIS -d -N dir1/ dir2/ dir3/
This is useful if you have one ‘reference’ dir that is the source of truth and which you don’t want to delete anything from; just specify that reference dir first.
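Before running a red-light command for real, it may be worth previewing it by running the same flags without -d and -N, which turns it back into a listing-only command. The sketch below does that on throwaway data; the dir names reference and incoming are placeholders I made up, and the preview is skipped if jdupes is not installed.

```shell
#!/usr/bin/env bash
# Sketch: preview a red-light run by dropping -d and -N from the same command.
# 'reference' and 'incoming' are made-up dir names for the demo.
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/reference" "$work/incoming"

printf 'photo bytes\n' > "$work/reference/photo.jpg"
# A dupe inside the reference dir: inspect the preview to see how -I treats it.
printf 'photo bytes\n' > "$work/reference/photo-again.jpg"
# A dupe across dirs.
printf 'photo bytes\n' > "$work/incoming/photo.jpg"

if command -v jdupes >/dev/null 2>&1; then
  # Same flags as the destructive command, minus -d -N, so nothing is deleted.
  # The reference dir is given first so that -O lists its copies first.
  jdupes -rOIS "$work/reference/" "$work/incoming/" \
    || echo "jdupes exited with status $?"
else
  echo "jdupes not found; skipping preview"
fi

# The preview must not have removed anything.
remaining=$(find "$work" -type f | wc -l | tr -d ' ')
echo "files still present: $remaining"
```

Since -N keeps the first file in each group, the first path listed per group in the preview is the one that would survive the real run.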