AccessData deduplicates email and attachment records at the family level, where a family is an email together with its attachments. As a result, the same PDF attached to two different emails is not marked as a duplicate unless the emails themselves are also duplicates.
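
The family rule can be illustrated with a minimal Python sketch. It is not AccessData's actual algorithm: the Email class, the family_key function, and the choice to combine the body hash with the attachment hashes are all hypothetical, chosen only to show why a shared attachment alone does not make two records duplicates.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List


@dataclass
class Email:
    """Hypothetical stand-in for an email record and its attachments."""
    body: bytes
    attachments: List[bytes] = field(default_factory=list)


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a hex string."""
    return hashlib.md5(data).hexdigest()


def family_key(email: Email) -> str:
    """Assumed family key: hash of the body hash plus each attachment hash."""
    parts = [md5_hex(email.body)] + [md5_hex(a) for a in email.attachments]
    return md5_hex("|".join(parts).encode("ascii"))


def flag_duplicate_families(emails: List[Email]) -> List[bool]:
    """Mark an email as a duplicate only if its whole family was seen before."""
    seen = set()
    flags = []
    for email in emails:
        key = family_key(email)
        flags.append(key in seen)
        seen.add(key)
    return flags


pdf = b"%PDF-1.4 sample attachment"
emails = [
    Email(b"Hi Bob, see attached.", [pdf]),
    Email(b"Hi Alice, see attached.", [pdf]),  # same PDF, different email
    Email(b"Hi Bob, see attached.", [pdf]),    # exact copy of the first email
]
print(flag_duplicate_families(emails))  # [False, False, True]
```

In this sketch the second email carries the same PDF as the first but is not flagged, because its family as a whole differs. Only the third email, an exact copy of the first, is flagged.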

Files that are not part of an email family are deduplicated by hashing the document. Hashing generates an MD5 hash, and two documents are treated as duplicates only when their hashes match. A difference of even a single character results in a different MD5 hash.
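
The effect of a single-character change can be seen in a short Python sketch that uses the standard hashlib module. The md5_of_bytes helper and the sample strings are illustrative assumptions and are not part of any AccessData tooling.

```python
import hashlib


def md5_of_bytes(data: bytes) -> str:
    """Return the MD5 digest of data as a 32-character hex string."""
    return hashlib.md5(data).hexdigest()


original = b"Quarterly report, final version."
copy = b"Quarterly report, final version."
edited = b"Quarterly report, final version!"  # last character changed

# Identical content yields identical hashes, so the copy is a duplicate.
print(md5_of_bytes(original) == md5_of_bytes(copy))    # True

# One changed character yields a different hash, so it is not a duplicate.
print(md5_of_bytes(original) == md5_of_bytes(edited))  # False
```

Comparing the hex digests is enough to decide whether two documents match, so the full contents never need to be compared side by side.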