Finding duplicate files

From Newroco Tech Docs

This page uses rmlint, which was not available from a package repository when the 16.04 instructions were written. Home page


How to install on Ubuntu 16.04 (xenial)

$ sudo apt-get install git scons python3-sphinx python3-nose gettext build-essential
# Optional dependencies for more features:
$ sudo apt-get install libelf-dev libglib2.0-dev libblkid-dev libjson-glib-1.0 libjson-glib-dev
# Optional dependencies for the GUI:
$ sudo apt-get install python3-gi gir1.2-rsvg gir1.2-gtk-3.0 python-cairo gir1.2-polkit-1.0 gir1.2-gtksource-3.0

For Compilation

$ # Omit -b develop to build from the stable master branch
$ git clone -b develop https://github.com/sahib/rmlint.git
$ cd rmlint/
$ scons config       # Show which features scons would compile
$ scons DEBUG=1      # Optional: build locally without installing
# Install (and build if necessary). For releases you can omit DEBUG=1
$ sudo scons DEBUG=1 --prefix=/usr install

Install on Ubuntu 18.04

$ sudo apt-get install rmlint
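
The two install paths above can be chosen from the release number. The sketch below is illustrative only: install_method is a hypothetical helper, and it assumes that every release from 18.04 onward ships an rmlint package, which these notes only confirm for 18.04.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "apt" or "source" for an Ubuntu release number.
# Assumption: releases >= 18.04 have an rmlint package, older ones do not.
install_method() {
    local version="$1"
    # sort -V orders version strings; if 18.04 comes first (or ties),
    # the given release is 18.04 or newer.
    if [ "$(printf '%s\n' "18.04" "$version" | sort -V | head -n1)" = "18.04" ]; then
        echo "apt"
    else
        echo "source"
    fi
}
```

On a target machine the release number can be read from /etc/os-release, for example: . /etc/os-release && install_method "$VERSION_ID"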

Examples

Search a directory for duplicate files. The command is meant for a large volume, so the options keep the output on stdout short.

  • --progress --> shows a progress bar instead of printing the whole report on stdout
  • --algorithm=paranoid --> compares files byte by byte instead of relying on a hash
  • --types="minimal" --> uses the default lint types minus empty files and empty directories, so the report is mainly duplicate files
rmlint --progress --algorithm=paranoid --types="minimal" /mnt/ald-vol1/
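
On a large volume it can help to fail early when the target path is wrong or rmlint is missing. The wrapper below is a minimal sketch, not part of rmlint: scan_duplicates is a hypothetical name, and the return codes 2 and 127 are arbitrary choices made for this example. The rmlint options are the ones from the command above.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the scan command shown above.
scan_duplicates() {
    local target="$1"
    # Refuse to start if the target is not an existing directory.
    if [ ! -d "$target" ]; then
        echo "not a directory: $target" >&2
        return 2
    fi
    # Refuse to start if rmlint is not on the PATH.
    if ! command -v rmlint >/dev/null 2>&1; then
        echo "rmlint is not installed" >&2
        return 127
    fi
    rmlint --progress --algorithm=paranoid --types="minimal" "$target"
}
```

Usage: scan_duplicates /mnt/ald-vol1/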

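Example 2

Save the report to a file and mail it as an attachment:

rmlint > output.txt
echo "<message header>" | mail -s "<message> `hostname`" -A output.txt <mail_exmp>

Schedule the scan from cron (system crontab format, with a user field). Cron runs a job when either the day-of-month or the day-of-week field matches, so this entry starts at 20:00 on every Friday and on every day from the 24th to the 31st:

m  h   D   M W
0 20 24-31 * 5   root    (cd /path/from/where/to/execute/ && rmlint --progress --algorithm=paranoid --types="minimal" /path/to/file/to/scan/ -o sh)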
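
Because cron treats restricted day-of-month and day-of-week fields as either/or, a schedule such as "only the last Friday of the month" has to be checked inside the job. Whether that is the intent of the entry above is an assumption; the notes do not say. The sketch below uses a hypothetical helper, is_last_friday, and relies on GNU date (the -d option), as shipped with Ubuntu.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the given date (default: today)
# is the last Friday of its month. Requires GNU date for the -d option.
is_last_friday() {
    local day="${1:-$(date +%F)}"
    # %u prints the ISO weekday number; 5 is Friday.
    [ "$(date -d "$day" +%u)" -eq 5 ] || return 1
    # If the same weekday one week later is in another month,
    # this Friday is the last one of the current month.
    [ "$(date -d "$day" +%m)" != "$(date -d "$day + 7 days" +%m)" ]
}
```

With this check at the top of the scan script (is_last_friday || exit 0), the job can be scheduled for every Friday with the fields 0 20 * * 5 and will skip all but the last one.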


rmlint official documentation

Old write-up

https://github.com/sahib/rmlint

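Basic commands for getting and using rmlint

git clone -b develop https://github.com/sahib/rmlint.git
cd rmlint/
killall rmlint        # stop any rmlint process that is still running
rmlint -g             # scan the current directory with a progress bar
less rmlint.sh        # review the generated removal script
less rmlint.json      # review the machine-readable report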
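
For a quick number instead of paging through the report, the JSON file can be searched for duplicate entries. This is a rough sketch under an assumption that should be checked against a real report: that each duplicate entry carries a "type": "duplicate_file" field on its own line, and that the original copy in each group is tagged the same way. count_duplicates is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count report lines tagged as duplicate files.
# Assumption: entries look like  "type": "duplicate_file"  in rmlint.json.
count_duplicates() {
    local report="${1:-rmlint.json}"
    if [ ! -r "$report" ]; then
        echo "cannot read $report" >&2
        return 1
    fi
    # grep -c counts matching lines; it exits non-zero when the count is 0,
    # so "|| true" keeps a clean report from looking like a failure.
    grep -c '"type"[[:space:]]*:[[:space:]]*"duplicate_file"' "$report" || true
}
```

Usage: count_duplicates rmlint.json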

Source: https://rmlint.readthedocs.io/en/latest/rmlint.1.html