So, I’m selfhosting immich, the issue is we tend to take a lot of pictures of the same scene/thing to later pick the best, and well, we can have 5~10 photos which are basically duplicates but not quite.
Some duplicate finding programs put those images at 95% or more similarity.

I’m wondering if there’s any way, probably at file system level, for the same images to be compressed together.
Maybe deduplication?
Have any of you guys handled a similar situation?

  • WIPocket@lemmy.world
    link
    fedilink
    English
    arrow-up
    4
    ·
    2 months ago

    Note that Git doesnt store deltas. It will reuse unchanged files, but stores a (compressed) version of every file that has existed in the whole history, under its SHA1 hash.

    • cizra@lemm.ee
      link
      fedilink
      English
      arrow-up
      2
      ·
      2 months ago

      Indeed! Interesting! I made an experiment now with a non-compressible file (strings < /dev/urandom | head -n something) and it shows you’re right. 2nd commit, where I added a tiny line to that file, increased repo size by almost the size of the whole file.

      Thanks for this bit.