How To Hunt For Drugs In Dark Chemical Matter

Pharma productivity has remained a daunting challenge, notably because R&D departments have been shuffled and reorganized over the last few years in attempts to revitalize the process. One trend, among multiple parallel tracks, has come into fashion. With millions of compounds in inventory, why not re-examine the ones that have consistently shown no activity in bioassays (and are presumed safe if they are also inactive in toxicological assays) rather than invent new ones? Enter the concept of exploring Dark Chemical Matter (DCM).

The term DCM was coined by Wassermann and colleagues at Novartis and then refined with input from researchers at Merck, after Wassermann switched camps (Refs. 1,2). Jasial and Bajorath recently revisited the topic as well (Ref. 3). In each publication, the approach has been to apply some form of molecular fingerprint matching to search through chemical libraries. The objective of the exercise is to identify sets of otherwise consistently inactive DCM compounds (DCMs), e.g. negative in 50-100 assays, whose structural features coincide with those of closely related active compounds in biological or phenotypic domains for which the DCMs were never screened. In other words, DCM "Inactives" stand a chance to be "Brightened" or repurposed into "Actives".

Quoting Wassermann (Ref. 1), DCMs therefore "have the potential to be potent hits with little or no target promiscuity and thus could present an opportunity for identifying new leads." Indeed, a solid and unexpected new antifungal chemotype emerged from this work "with strong activity against the pathogen Cryptococcus neoformans but little activity at targets relevant to human safety." A less bullish but still optimistic view concluded her later work (Ref. 2): "dark compounds that become active in a screen of an on-going drug discovery project or through targeted profiling efforts may be considered as tool compounds for the respective biological process or protein target."

In the latest salvo on prospecting through DCM, the Bajorath group simply laid their cards on the table for all to see and avoided value judgments: "Analog series containing DCM and known bioactive compounds were generated on a large scale, making it possible to derive target hypotheses for more than 8000 extensively assayed DCM molecules" (Ref. 3 and accompanying open access deposition).

However, the DCM metaphor may be incomplete as currently described. What Wassermann and Bajorath are actually looking at in the chemical landscape should more appropriately be termed “Dull” rather than “Dark” chemical matter. “Dark Matter” cannot be seen; it lies at and beyond the edge of the statistical confidence limits of the structures that we can readily examine.

If one analyzes the so-called DCM data sets for 3D molecular shape diversity and sp3 complexity using moments of inertia (Ref. 4), for example, "Actives" aka "Bright" and "Inactives" aka "Dull" compounds demonstrably occupy the same chemical space. This appears to be the conclusion drawn through intuition in Derek Lowe's analysis (Ref. 5): "None of these structures, I have to say, look odd at all; I don’t think any medicinal chemist would look at them and say “You know what, you could screen that stuff through a hundred assays and never see a damn thing”. Quite the opposite – they look fine. Clustering them in chemical space didn’t show any obvious “dark nebulae” – all the clusters with DCM compounds in them also have active compounds in them (and sometimes these are very similar structures indeed)." Moreover, Siramshetty and Preissner, writing for Drug Discovery Today in July of 2017, concur with Lowe's earlier assessment (Ref. 6) based on an independent analysis.

What is the take-home lesson here? In contrast to what may just turn out to be "Dull Chemical Matter", real DCM should lie beyond the outer limits of the overlap between known actives and consistent inactives. I argue that the place to look for novel leads should fall at or past the margins where any given set of "Dull" and "Bright" chemical spaces coincide. This is achievable using the now time-tested toolkits for shape analysis that project 3D molecular complexity onto a 2D triangular rod-sphere-disk continuum (Ref. 4).
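To make the rod-sphere-disk projection concrete, here is a minimal sketch, assuming the shape analysis in question is the familiar normalized principal-moments-of-inertia (NPR) plot. Each structure is reduced to two ratios, NPR1 = I1/I3 and NPR2 = I2/I3, where I1 <= I2 <= I3 are its principal moments of inertia; the corners of the triangle are then rod (0, 1), disk (0.5, 0.5) and sphere (1, 1). The function names are hypothetical, the input is assumed to be a single 3D conformer given as coordinates plus atomic masses, and the closed-form eigenvalue step is used only to keep the example dependency-free. A cheminformatics toolkit such as RDKit, which ships NPR descriptors, would be the practical choice.

```python
import math

# Corners of the triangular shape plot, expressed as (NPR1, NPR2).
SHAPE_CORNERS = {"rod": (0.0, 1.0), "disk": (0.5, 0.5), "sphere": (1.0, 1.0)}


def principal_moments(coords, masses):
    """Return the principal moments of inertia, sorted ascending (I1 <= I2 <= I3)."""
    total = sum(masses)
    com = [sum(m * c[k] for m, c in zip(masses, coords)) / total for k in range(3)]

    # Inertia tensor about the centre of mass (symmetric 3x3).
    ixx = iyy = izz = ixy = ixz = iyz = 0.0
    for m, c in zip(masses, coords):
        x, y, z = c[0] - com[0], c[1] - com[1], c[2] - com[2]
        ixx += m * (y * y + z * z)
        iyy += m * (x * x + z * z)
        izz += m * (x * x + y * y)
        ixy -= m * x * y
        ixz -= m * x * z
        iyz -= m * y * z

    # Closed-form (trigonometric) eigenvalues of a symmetric 3x3 matrix.
    # It loses a few digits of precision when two moments nearly coincide.
    p1 = ixy * ixy + ixz * ixz + iyz * iyz
    if p1 == 0.0:
        return sorted([ixx, iyy, izz])  # tensor is already diagonal
    q = (ixx + iyy + izz) / 3.0
    p2 = (ixx - q) ** 2 + (iyy - q) ** 2 + (izz - q) ** 2 + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    a, b, c = (ixx - q) / p, (iyy - q) / p, (izz - q) / p
    d, e, f = ixy / p, ixz / p, iyz / p
    det_b = a * (b * c - f * f) - d * (d * c - f * e) + e * (d * f - b * e)
    r = max(-1.0, min(1.0, det_b / 2.0))  # clamp against rounding error
    phi = math.acos(r) / 3.0
    e_max = q + 2.0 * p * math.cos(phi)
    e_min = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    e_mid = 3.0 * q - e_max - e_min
    return sorted([e_min, e_mid, e_max])


def npr(coords, masses):
    """Return (NPR1, NPR2) = (I1/I3, I2/I3) for one conformer."""
    i1, i2, i3 = principal_moments(coords, masses)
    if i3 <= 0.0:
        raise ValueError("need at least two distinct atom positions")
    return i1 / i3, i2 / i3


def nearest_corner(point):
    """Name the triangle corner (rod, disk or sphere) closest to an NPR point."""
    return min(SHAPE_CORNERS, key=lambda name: math.dist(point, SHAPE_CORNERS[name]))
```

For instance, four unit masses at the corners of a planar square, (1, 0, 0), (-1, 0, 0), (0, 1, 0) and (0, -1, 0), give NPR = (0.5, 0.5), which is the disk corner. Real molecules land somewhere inside the triangle, and it is the distribution of those points for "Dull" versus "Bright" sets that the argument above turns on.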

As suggested in the graphic above, it may even become a general prescription for picking places to start in screening libraries of any sort: a) map them against reference data sets, and b) pick compounds sufficiently distanced from the central domain of any data overlap. As a case in point, this approach of ensuring 3D complexity and natural-product-like fragments, otherwise uncharted in either Dull or Dark chemical space, appears to have resonated at Novartis, site of the original DCM coinage, and presumably at Merck via a key author who relocated there along with Wassermann (Ref. 7).
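The two-step prescription can be sketched in a few lines, under loudly stated assumptions: every compound has already been mapped to a 2D shape coordinate (for example a pair of normalized moment-of-inertia ratios), the "central domain" is approximated by the centroid of the pooled reference set, and "sufficiently distanced" means farther from that centroid than a chosen quantile of the reference compounds themselves. The function names, the Euclidean metric and the 0.95 default are illustrative choices of mine, not anything taken from the cited work.

```python
import math


def centroid(points):
    """Mean position of a list of 2D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def quantile(values, q):
    """Nearest-rank quantile of a non-empty list, with 0 < q <= 1."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def pick_outliers(reference, candidates, q=0.95):
    """Return names of candidates lying outside the reference set's central domain.

    reference  -- list of 2D points for pooled known actives and inactives
    candidates -- dict mapping compound name to its 2D point
    q          -- assumed cutoff: the fraction of reference compounds
                  treated as the central domain
    """
    centre = centroid(reference)
    cutoff = quantile([math.dist(p, centre) for p in reference], q)
    return [name for name, p in candidates.items() if math.dist(p, centre) > cutoff]
```

A call such as pick_outliers(reference_points, {"cmpd_A": (0.26, 0.80), "cmpd_B": (0.90, 0.95)}) would keep only the compounds that sit past the margin of the reference cloud. A distance from a single centroid is a crude stand-in for the edge of an overlap region; a density estimate or a convex hull would follow the shape of the data more faithfully.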