I'm a big fan of Hyperconvergence, as you might know. One of the key capabilities of hyperconverged products is the ability to deduplicate storage. In this article, one of the pioneers in deduplication, Atlantis Computing, explains what's important about data deduplication.
Understanding the available data reduction technologies, with their strengths and weaknesses, is a good place to start. But it is at least as important to understand your own data set. Hoping for a high level of data reduction on an already-compressed file format, for example, is just that: a hope. On the other hand, it is a widespread misconception that introducing data reduction technology will necessarily, or even in most cases, hurt application performance.

Atlantis Computing has written a comprehensive white paper on the key data reduction technologies. You can find the paper here.

Here is an excerpt from the white paper, listing the top ten questions you should ask yourself when choosing the right data reduction technology:

Is my data set de-duplicable?

Is my data set compressible? If so, is compression already turned on at the application layer?
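You don't have to guess at the answers to these first two questions. As a rough illustration (not Atlantis Computing's method, just a quick sketch using Python's standard library), you can estimate compressibility by running a sample of your data through zlib, and estimate deduplicability by hashing fixed-size blocks and counting how many are unique; real products typically use more sophisticated variable-size chunking:

```python
import hashlib
import os
import zlib

def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size.
    A ratio near (or above) 1.0 means the data is effectively
    incompressible, e.g. an already-compressed file format."""
    if not data:
        return 1.0
    return len(zlib.compress(data)) / len(data)

def dedupe_ratio(data: bytes, chunk_size: int = 4096) -> float:
    """Unique chunks divided by total chunks.
    The lower the ratio, the more the data set will deduplicate."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    if not chunks:
        return 1.0
    unique = {hashlib.sha256(c).digest() for c in chunks}
    return len(unique) / len(chunks)

# Highly repetitive data: compresses and deduplicates well.
repetitive = b"A" * 4096 * 100
# Pseudo-random data: behaves like an already-compressed file.
random_like = os.urandom(4096 * 100)

print(compression_ratio(repetitive))   # well below 1.0
print(dedupe_ratio(repetitive))        # 0.01: only one unique chunk
print(compression_ratio(random_like))  # near or above 1.0
print(dedupe_ratio(random_like))       # 1.0: nothing to deduplicate
```

Running a sample like this against your own data (VM images, file shares, databases) gives you a first, rough sense of whether deduplication, compression, both, or neither will pay off.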