You might want strip metadata before doing a comparison, using exiftool. Even though exiftool was originally written for EXIF metadata on JPGs, these days, it supports a lot of metadata standards, including PDF. This command will do it assuming you set filename=`basename your.pdf .pdf`:
That won't help you with small differences in the contents, but might help with small differences in metadata. Running `md5sum` on the stripped PDF should give more reliable dedupe results.
I was recently working on a similar problem for JPG, RAW, and MP4 files (photo/video backup) so it is fresh in my mind.