Note

✨🚀 Join the difPy for Desktop beta tester program now and be among the first to test the new difPy desktop app! We are now accepting beta tester sign-ups and will soon start our first tester access wave.

difPy for Desktop

difPy for Desktop brings difPy’s image deduplication capabilities to your desktop as an intuitive, easy-to-use app.

Unlike most deduplication software, difPy does not compare images based on their hashes. Instead, it compares them based on their image tensors (i.e. the actual image content). This allows difPy to search not only for exact duplicates but also for similar images, which is useful when duplicate images have different file extensions or when images are cropped versions of one another.
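
To illustrate the idea of comparing image content rather than file hashes, here is a minimal sketch. It assumes each image has already been decoded and resized into a flat list of pixel values of equal length. The mean squared error metric, the function names, and the `similar_threshold` value are assumptions chosen for illustration and are not taken from the difPy app.

```python
from typing import Sequence


def mse(a: Sequence[int], b: Sequence[int]) -> float:
    """Mean squared error between two equally sized pixel sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("both images need the same, non-zero number of values")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def classify(a: Sequence[int], b: Sequence[int],
             similar_threshold: float = 50.0) -> str:
    """Label a pair of images (threshold is a hypothetical value)."""
    score = mse(a, b)
    if score == 0.0:
        return "duplicate"   # pixel content is identical
    if score <= similar_threshold:
        return "similar"     # small differences in content
    return "different"
```

Because the comparison works on pixel values, two files with identical content but different extensions would score 0, whereas their file hashes would generally differ.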

Installation

➡️ Download the difPy v1.0-beta app for Windows (currently available for beta testers)

➡️ Download the difPy v1.0-beta app for macOS (coming soon)

Basic Usage

To start a new search, open the difPy for Desktop app and click the “New Search” button on the main menu. The search process is divided into two steps: (1) import folders and (2) configure search.

Import Folders

You can import one or more folders at once by clicking the “Browse” button. Alternatively, you can paste folder paths, separated by a semicolon (“;”), directly into the text box.
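
As a rough illustration of the semicolon-separated format, the sketch below splits such a string into individual folder paths. How the app itself parses the text box is not documented here, so the whitespace trimming, the skipping of empty entries, and the removal of repeated paths are assumptions, and `parse_folder_paths` is a hypothetical helper name.

```python
def parse_folder_paths(text: str) -> list[str]:
    """Split a semicolon-separated string into unique folder paths."""
    paths: list[str] = []
    for part in text.split(";"):
        part = part.strip()               # tolerate spaces around ";"
        if part and part not in paths:    # skip empty and repeated entries
            paths.append(part)
    return paths
```

For example, `parse_folder_paths("C:/Photos; D:/Backup")` yields the two paths `C:/Photos` and `D:/Backup`.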

The following import modes are supported:

  • Recursive: defines whether difPy should search through the subfolder(s) of the imported folder. If selected, difPy will search for matches in all subfolders.

  • In-folder: defines whether difPy should treat each imported folder as separate and only search for matches within each folder. If selected, difPy will not search for matches across folders. Can only be selected if at least 2 folders have been imported.
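
The effect of the two import modes on which images get compared can be sketched as follows. This is an illustration under stated assumptions, not the app's implementation: the list of image suffixes is made up, and the sketch treats each imported folder (including its subfolders when recursive is on) as one group in in-folder mode.

```python
from itertools import combinations
from pathlib import Path

# Assumed set of image suffixes, for illustration only.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}


def collect_images(folder, recursive):
    """List image files in a folder, optionally including subfolders."""
    pattern = "**/*" if recursive else "*"
    return sorted(
        p for p in Path(folder).glob(pattern)
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def candidate_pairs(folders, recursive=True, in_folder=False):
    """Yield the image pairs that would be compared under each mode."""
    groups = [collect_images(folder, recursive) for folder in folders]
    if in_folder:
        # Each imported folder is its own search space.
        for group in groups:
            yield from combinations(group, 2)
    else:
        # All imported folders are merged into one search space.
        merged = sorted({p for group in groups for p in group})
        yield from combinations(merged, 2)
```

With in-folder selected, no pair ever spans two imported folders, which is why the option only makes sense once at least two folders are imported.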

Additionally, you can configure the following advanced import settings:

Pixel size: defines the width and height (in pixels) to which images are resized before the search (default value is 50). The higher the pixel size, the more precise but the slower the search. It is recommended to keep the default value unless you know what you are doing. If you would like to improve the precision of the search (e.g. when searching for matches among images that contain text), increase this value in steps of 50.
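
The precision and speed trade-off can be made concrete with a small sketch. It assumes a grayscale image stored as a list of rows and uses nearest-neighbour sampling, which is an assumption: the resizing method the app actually uses is not stated here. The helper names and the three-channel default are likewise hypothetical.

```python
def downsample(pixels, px_size):
    """Resize a 2D grid of pixel values to px_size x px_size
    using nearest-neighbour sampling (assumed method)."""
    height, width = len(pixels), len(pixels[0])
    return [
        [pixels[row * height // px_size][col * width // px_size]
         for col in range(px_size)]
        for row in range(px_size)
    ]


def values_per_image(px_size, channels=3):
    """Number of values that have to be compared for one image."""
    return px_size * px_size * channels
```

Under these assumptions, doubling the pixel size from 50 to 100 quadruples the values per image (7,500 versus 30,000), which is why higher settings retain more detail but make each comparison slower.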

Search Results

When difPy has completed the search, the search results are displayed, including the number of duplicate and/or similar matches found.

You can then:

  • View/manage the search results in the difPy Image Viewer (see Image Viewer).

  • View the search logs for more information about the search process.

Image Viewer

The difPy Image Viewer allows you to view the duplicate/similar images and easily manage them. It lets you go through each group of matches and see the resolution of each image, so that you know which ones are safe to move or delete.

For each image, you have the option to open it, move it to a new location, or delete it.

If you want to move or delete all lower resolution matches at once, you can use the “Bulk Actions…” dropdown menu and select the bulk action you would like to take.
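
The logic behind such a bulk action can be sketched as follows: within each group of matches, keep the image with the most pixels and treat the others as candidates to move or delete. This is a hypothetical sketch. The `Match` type, the use of width × height as the ranking, and the rule that the first image wins a tie are assumptions, not the app's documented behaviour.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    path: str
    width: int
    height: int


def split_by_resolution(group):
    """Return (image to keep, images to move or delete) for one group.

    The image with the largest pixel count is kept; on a tie, the first
    one in the group is kept (assumed tie-breaking rule).
    """
    best = max(group, key=lambda m: m.width * m.height)
    others = [m for m in group if m is not best]
    return best, others
```

A caller would apply this to every group of matches and then move or delete the returned `others`.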

Advanced Settings

From the difPy settings on the main menu, you can access advanced search settings.

Warning

It is not recommended to change these settings unless you know what you are doing. See Adjusting ‘processes’ and ‘chunksize’.

Processes: defines the maximum number of worker processes (i.e. parallel tasks) to use when multiprocessing. The more processes, the faster the search, but the more processing power (CPU) the app will use. See processes (int) for more information.

Chunksize: defines the number of image sets that should be compared at once per process. The higher the chunksize, the faster the search, but the more memory (RAM) the app will use. See chunksize (int) for more information.

The processes and chunksize settings become relevant when difPy receives more than 5,000 images to process. With large datasets, it can make sense to adjust these parameters. For example, to lower the overall CPU overhead, you could lower processes; to decrease memory usage, you could decrease chunksize. The higher both parameters are, the faster the search, but the more resources the app will use. See Adjusting ‘processes’ and ‘chunksize’ for more information.
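
How two such parameters typically interact can be sketched with Python's standard `multiprocessing.Pool`. This is a generic illustration and not the app's implementation: `compare_pair` is a placeholder comparison, and `chunked` and `find_matches` are hypothetical names. Here `processes` caps the number of workers and `chunksize` sets how many comparisons are handed to a worker at once.

```python
from itertools import combinations, islice
from multiprocessing import Pool


def compare_pair(pair):
    """Placeholder comparison: a 'match' here is simple equality."""
    a, b = pair
    return a, b, a == b


def chunked(iterable, chunksize):
    """Show how a stream of work is cut into batches of `chunksize`."""
    it = iter(iterable)
    while chunk := list(islice(it, chunksize)):
        yield chunk


def find_matches(items, processes=2, chunksize=100):
    """Compare all pairs using `processes` workers, `chunksize` pairs per task."""
    pairs = combinations(items, 2)
    with Pool(processes=processes) as pool:
        results = pool.imap_unordered(compare_pair, pairs, chunksize=chunksize)
        return [(a, b) for a, b, is_match in results if is_match]
```

Larger batches reduce the overhead of passing work between processes but keep more items in memory per worker, which mirrors the trade-off described above. When running such code as a script on Windows, the call to `find_matches` needs to sit under an `if __name__ == "__main__":` guard.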

Limitations

  • Using the difPy desktop app with large datasets can lead to slow processing times. For better performance with large datasets (> 10k images), it is recommended to use the difPy Python package instead.

  • The desktop app is currently only available to beta testers on Windows.

  • The desktop app is currently in beta and may contain bugs. If you encounter any issues, please report them. See Give Feedback / Report Bug.

Give Feedback / Report Bug

🐞 Did you encounter an issue with the difPy desktop app? Report it here.

🗨️ Do you have feedback about the difPy desktop app? Share your feedback here.