Thalidomide-targeted degradation

Thalidomide and its analogs improve the survival of patients with multiple myeloma and other blood cancers. Previous work showed that the drugs bind to the E3 ubiquitin ligase Cereblon, which then targets for degradation two specific zinc finger (ZF) transcription factors with a role in cancer development. Sievers et al. found that more ZF proteins than anticipated are destabilized by thalidomide analogs. A proof-of-concept experiment revealed that chemical modifications of thalidomide can lead to selective degradation of specific ZF proteins. The detailed information provided by structural, biochemical, and computational analyses could guide the development of drugs that target ZF transcription factors implicated in human disease.

Structured Abstract

INTRODUCTION

Thalidomide, lenalidomide, and pomalidomide are clinically approved therapies for the treatment of multiple myeloma and other hematologic malignancies. These drugs induce rapid ubiquitination and proteasomal degradation of two transcription factors, Ikaros (IKZF1) and Aiolos (IKZF3), by recruiting them to the CRL4CRBN E3 ubiquitin ligase through a Cys2-His2 (C2H2) zinc finger (ZF) domain that is present in both proteins and required for their destruction.

RATIONALE

Transcription factors have been challenging drug targets because they lack discrete catalytic domains amenable to small-molecule inhibition. Thalidomide analog–induced degradation of IKZF1 and IKZF3 through a C2H2 ZF domain raised the possibility that the >800 C2H2 ZF–containing proteins encoded by the human genome, many of which are putative transcription factors, could be similarly destabilized. We therefore set out to (i) define the human ZF “degrome” in the context of thalidomide, lenalidomide, and pomalidomide; (ii) characterize the ZF-drug-CRBN interaction structurally and functionally; and (iii) determine whether different thalidomide analogs degrade distinct ZFs.

RESULTS

Using a reporter of substrate degradation, we screened 6572 C2H2 ZFs for degradation in the presence of thalidomide, lenalidomide, and pomalidomide, identifying 11 ZF degrons, motifs that are capable of mediating drug-dependent degradation, of which six were found to mediate degradation of their full-length protein. Surprisingly, the 11 ZF degrons lack an identifiable consensus sequence. Saturation mutagenesis of the IKZF1/3 ZF degron and crystal structures of two ZF degrons bound to pomalidomide-engaged CRBN demonstrate that the drug-CRBN interface accommodates ZF degrons with diverse amino acid sequences. Computational docking in combination with in vitro binding assays revealed that a large number of ZFs are capable of weakly binding the drug-CRBN interface, indicating that this interface may be more permissive than suggested by the 11 ZF degrons identified in the degradation screen. To test this hypothesis, we screened the ZF library against two thalidomide analogs with chemical alterations at the ZF-drug-CRBN interface. The two thalidomide analogs induced degradation of distinct sets of C2H2 ZF degrons, including ZFs that bind the CRBN-pomalidomide complex weakly in vitro but are not degraded by pomalidomide in cells.

CONCLUSION

We found that thalidomide analogs mediate CRL4CRBN-dependent degradation of a larger number of C2H2 ZF proteins than previously anticipated. ZFs compatible with the drug-CRBN interface show little sequence conservation apart from residues that stabilize the tertiary ZF fold. In addition to the complex ZF-CRBN side chain interactions, direct contacts between thalidomide analogs and varying ZF residues provide another layer of specificity. Thalidomide analogs with altered chemical scaffolds thus allow selective degradation of distinct ZF targets. Our results provide a structural and functional basis for the chemical modulation of CRL4CRBN to degrade C2H2 ZF transcription factors. Degradation of C2H2 ZF–containing proteins through derivatized thalidomide analogs may be a general paradigm for therapeutically targeting C2H2 ZF transcription factors, a class of proteins previously perceived to be “undruggable.”

We created a cellular library in which each cell expresses one of 6572 individual C2H2 ZF domains from the human proteome fused to enhanced green fluorescent protein (eGFP). Cells expressing a C2H2 ZF reporter that is susceptible to thalidomide analog–induced degradation lose their eGFP signal, allowing their identification through a combination of fluorescence-activated cell sorting (FACS) and high-throughput sequencing. Structural and functional studies revealed how the drug-CRBN complex accommodates ZFs with diverse amino acid sequences. On the basis of these results, we tested thalidomide analogs with chemical modifications at the drug-ZF interface and found that these derivatives can degrade different sets of C2H2 ZFs. Our results suggest that chemical modulation of CRL4CRBN may be a more generalizable paradigm to inactivate C2H2 ZF–containing proteins, the largest group of putative transcription factors in the human genome. IRES, internal ribosomal entry site; DMSO, dimethyl sulfoxide.

Abstract

The small molecules thalidomide, lenalidomide, and pomalidomide induce the ubiquitination and proteasomal degradation of the transcription factors Ikaros (IKZF1) and Aiolos (IKZF3) by recruiting a Cys2-His2 (C2H2) zinc finger domain to Cereblon (CRBN), the substrate receptor of the CRL4CRBN E3 ubiquitin ligase. We screened the human C2H2 zinc finger proteome for degradation in the presence of thalidomide analogs, identifying 11 zinc finger degrons. Structural and functional characterization of the C2H2 zinc finger degrons demonstrates how diverse zinc finger domains bind the permissive drug-CRBN interface. Computational zinc finger docking and biochemical analysis predict that more than 150 zinc fingers bind the drug-CRBN complex in vitro, and we show that selective zinc finger degradation can be achieved through compound modifications. Our results provide a rationale for therapeutically targeting transcription factors that were previously considered undruggable.

Thalidomide and its derivatives, lenalidomide and pomalidomide, are effective therapies for the hematologic malignancies multiple myeloma, del(5q) myelodysplastic syndrome, and mantle cell lymphoma (1). Thalidomide analogs bind Cereblon (CRBN), the substrate receptor of the CUL4-RBX1-DDB1-CRBN (CRL4CRBN) E3 ubiquitin ligase, and alter its substrate selectivity to recruit, ubiquitinate, and degrade seemingly unrelated proteins, including Ikaros (IKZF1), Aiolos (IKZF3), casein kinase 1 α (CK1α), and GSPT1 (2–7). Degradation of these targets in part explains the therapeutic effects of thalidomide analogs. IKZF1 and IKZF3 are lymphocyte lineage transcription factors (8, 9) that are essential for the survival of the malignant plasma cells in multiple myeloma. CK1α is essential for the survival of hematopoietic stem cells, and heterozygous deletion of its gene in del(5q) myelodysplastic syndrome provides a therapeutic window for eliminating the malignant stem cell clone (10).

IKZF1 and IKZF3 belong to the family of Cys2-His2 (C2H2) zinc finger (ZF) proteins (11). These ZF proteins share a conserved C2H2 ZF fold composed of a β-hairpin and an α-helix held together by pairs of zinc-coordinating cysteine and histidine residues. Because amino acids in the α-helical portion of some ZF domains recognize DNA base pairs in a sequence-specific manner (12), the approximately 800 C2H2 ZF proteins are predicted to comprise the largest group of transcription factors in the human genome (13). Transcription factors have remained challenging drug targets because of the absence of druggable active sites (14). Thalidomide analogs, however, induce degradation of IKZF1 and IKZF3, raising the possibility that other C2H2 ZF–containing transcription factors are similarly destabilized.

E3 ubiquitin ligases recognize their substrates through degrons, short stretches of primary sequence that are necessary and sufficient for the interaction with substrate receptors of ubiquitin ligases (15). Previous work has implicated the second C2H2 ZF domain in IKZF1 and IKZF3 (5, 7) and ZF4 of ZFP91 as the drug-inducible degrons (5, 7, 16). Unexpectedly, the known thalidomide analog targets (IKZF1/IKZF3, ZFP91, CK1α, and GSPT1) do not share obvious primary sequence similarity, with the exception of a glycine residue located in a β-hairpin, which diverges from the canonical destruction motif paradigm. Although the majority of C2H2 ZFs are structurally similar to the ZF degrons of IKZF1/IKZF3 and ZFP91, with 4661 out of 6572 carrying a glycine residue at an equivalent position, proteome-wide mass spectrometry demonstrated selective degradation of endogenous IKZF1/IKZF3 and ZFP91 in multiple cell lines (3, 4, 16). To identify determinants of drug-induced substrate specificity, we set out to characterize the human ZF “degrome” amenable to degradation in the presence of CRL4CRBN and thalidomide analogs and examined whether compound modifications change ZF selectivity.

To characterize the minimal C2H2 ZF degron of IKZF1/IKZF3, we first generated different IKZF1 deletion constructs and measured their affinity for the drug-CRBN complex in vitro using time-resolved fluorescence resonance energy transfer (TR-FRET) (Fig. 1A and fig. S1, A and B). IKZF1 ZF2 (amino acid residues 141 to 174) was the shortest construct that showed binding to CRBN-pomalidomide, with an inhibition constant (Ki) of 2314 ± 81 nM. Higher affinity binding to CRBN-pomalidomide in vitro was observed with an IKZF1 construct spanning ZF2 and ZF3 [amino acid residues 141 to 243 (Δ197 to 238); Ki 165 ± 37 nM]. Replacing ZF3 with ZF1 in the context of the ZF2-ZF3 construct (ZF2-ZF1) decreased the binding affinity by a factor of 6 (Ki 1027 ± 302 nM). IKZF1 binding to CRBN-pomalidomide is thus driven by ZF2 in vitro, with minor but specific contributions from the C-terminal ZF3.
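The fold differences quoted above follow directly from the reported Ki values. A minimal Python sketch of that arithmetic is given here; it uses only the central Ki estimates from the text and ignores the reported uncertainties, and the dictionary keys and function name are illustrative rather than taken from the study.

```python
# Central Ki estimates (nM) for IKZF1 constructs binding CRBN-pomalidomide,
# as reported in the text. Uncertainties are ignored in this sketch.
KI_NM = {
    "ZF2": 2314.0,
    "ZF2-ZF3": 165.0,
    "ZF2-ZF1": 1027.0,
}


def fold_weaker(construct, reference="ZF2-ZF3"):
    """Return how many times higher the Ki of `construct` is than the reference Ki."""
    return KI_NM[construct] / KI_NM[reference]


if __name__ == "__main__":
    for name in ("ZF2-ZF1", "ZF2"):
        print(f"{name}: {fold_weaker(name):.1f}-fold weaker than ZF2-ZF3")
```

On these numbers, ZF2-ZF1 binds roughly sixfold more weakly than ZF2-ZF3, and ZF2 alone roughly 14-fold more weakly.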

To identify the minimal construct required for IKZF1/IKZF3 degradation in cells, we created a lentiviral degradation reporter vector that enabled us to compare the fluorescence of degron-tagged enhanced green fluorescent protein (eGFP) to untagged mCherry using flow cytometry (Fig. 1B) (17). We transduced WT and CRBN−/− human embryonic kidney cells (HEK293T) with the degradation reporter that contains IKZF3 deletion constructs and treated the cells with thalidomide, lenalidomide, or pomalidomide. Deletion of ZF2 (amino acid residues 146 to 168) in full-length IKZF3 rendered the reporter resistant to drug treatment, whereas deletion of the other ZFs had little or no effect (fig. S1C). Accordingly, IKZF3 ZF2 (amino acid residues 146 to 168), which is identical to IKZF1 ZF2 (amino acid residues 145 to 167) (fig. S1D), was sufficient to confer CRBN-dependent degradation of the reporter (Fig. 1C). Together, these results establish IKZF1/IKZF3 ZF2 as the minimal unit required for thalidomide analog-induced CRBN binding in vitro and for CRL4CRBN-dependent degradation in cells.
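One way to summarize such a two-color reporter readout is to normalize eGFP to the untagged mCherry control and compare drug-treated cells with DMSO-treated cells. The sketch below assumes that simple ratio-of-ratios normalization; the exact quantification used in the study is not described in this section, and the function names and fluorescence values are hypothetical.

```python
def reporter_ratio(egfp, mcherry):
    """eGFP signal normalized to the untagged mCherry internal control."""
    if mcherry <= 0:
        raise ValueError("mCherry signal must be positive")
    return egfp / mcherry


def fraction_degraded(drug, dmso):
    """Fractional loss of normalized eGFP in drug-treated versus DMSO-treated cells.

    Each argument is an (eGFP, mCherry) pair of summary fluorescence values.
    Assumption: a plain ratio of ratios, with no background subtraction.
    """
    return 1.0 - reporter_ratio(*drug) / reporter_ratio(*dmso)


# Hypothetical fluorescence values, for illustration only.
example = fraction_degraded(drug=(300.0, 500.0), dmso=(1000.0, 500.0))
```

A reporter that is unaffected by the drug, as expected in CRBN−/− cells, would give a value near zero under this definition.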

Having demonstrated that a single ZF is sufficient to induce degradation of the eGFP/mCherry reporter, we sought to screen the entire human C2H2 ZF proteome for degradation in the presence of thalidomide analogs. To analyze the human ZF degrome independent of cell-type specific expression levels (fig. S1, E and F) and accessibility of ZFs in the context of full-length proteins engaged in macromolecular assemblies, we synthesized cDNAs for 6572 distinct C2H2 ZFs from the human proteome that match the PROSITE (18) ZF motif x(2)-C-x(2,4)-C-x(3)-[LIVMFYWC]-x(7)-H-x(3,5)-H and inserted the cDNAs into the degradation reporter vector. HEK293T cells were transduced with this C2H2 ZF library and then treated with dimethyl sulfoxide (DMSO), thalidomide, lenalidomide, or pomalidomide. Fluorescence-activated cell sorting (FACS) was used to isolate eGFP+/mCherry+ cells, and the relative number of read counts of each ZF was quantified by means of next-generation sequencing (Fig. 1D). A ZF was considered degraded if read counts were significantly underrepresented in drug-treated eGFP+/mCherry+ cells relative to DMSO-treated control cells.
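The PROSITE motif quoted above can be read as a regular expression. The following Python sketch translates it and applies it to a toy sequence; the converter handles only the element types that occur in this motif, and the helper names and the alanine-padded test sequences are illustrative assumptions, not sequences from the library.

```python
import re

PROSITE_C2H2 = "x(2)-C-x(2,4)-C-x(3)-[LIVMFYWC]-x(7)-H-x(3,5)-H"

# One PROSITE element: a residue, x, or a bracketed set, with optional (n) or (n,m).
ELEMENT = re.compile(r"([A-Zx]|\[[A-Z]+\])(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Translate a simple PROSITE pattern into a Python regular expression."""
    parts = []
    for element in pattern.split("-"):
        match = ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported PROSITE element: {element!r}")
        residue, low, high = match.groups()
        regex = "." if residue == "x" else residue
        if low is not None:
            regex += f"{{{low},{high}}}" if high is not None else f"{{{low}}}"
        parts.append(regex)
    return "".join(parts)


def toy_finger(cys_gap, his_gap):
    """Build an alanine-padded toy sequence with the given Cys-Cys and His-His gaps."""
    return "AA" + "C" + "A" * cys_gap + "C" + "AAA" + "F" + "A" * 7 + "H" + "A" * his_gap + "H"


C2H2_REGEX = re.compile(prosite_to_regex(PROSITE_C2H2))

if __name__ == "__main__":
    print(C2H2_REGEX.pattern)
    print(bool(C2H2_REGEX.fullmatch(toy_finger(2, 3))))  # gaps within the motif
    print(bool(C2H2_REGEX.fullmatch(toy_finger(5, 3))))  # Cys-Cys gap too long
```

Using `fullmatch` treats each candidate as a complete ZF domain; scanning full-length proteins for embedded domains would instead call for `finditer`.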

Of the 6572 ZFs, 5611 had sufficient representation in the sequencing data to be assayed (>200 read counts). At a false discovery rate (FDR) of <0.01, pomalidomide depleted 11 C2H2 ZFs, each from different proteins, including IKZF1/IKZF3 ZF2 (Fig. 1E). Lenalidomide and thalidomide targeted a variable subset of these 11 ZFs. When cloned into the degradation reporter vector and tested in isolation, the 11 ZFs exhibited degradation in the presence of all three compounds (Fig. 2A and fig. S2A).
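The hit-calling logic described above (a coverage filter followed by an FDR threshold) can be sketched in Python as follows. This section does not name the statistical test or the multiple-testing correction, so the Benjamini-Hochberg adjustment, the choice to apply the read-count filter to control reads, and all names and example values below are assumptions for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    qvalues = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is monotone in rank.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        qvalues[idx] = running_min
    return qvalues


def call_depleted(records, min_reads=200, fdr=0.01):
    """Return names of ZFs passing the coverage filter and the FDR threshold.

    `records` is a list of (name, control_reads, pvalue) tuples, where the
    p-value comes from some upstream depletion test (not specified here).
    """
    covered = [r for r in records if r[1] > min_reads]
    qvalues = benjamini_hochberg([r[2] for r in covered])
    return [r[0] for r, q in zip(covered, qvalues) if q < fdr]


# Hypothetical records, for illustration only.
EXAMPLE = [
    ("ZF_a", 500, 1e-6),
    ("ZF_b", 150, 1e-9),  # strong signal but insufficient coverage
    ("ZF_c", 900, 0.2),
    ("ZF_d", 300, 0.004),
]
```

Applying the coverage filter before the correction keeps poorly represented ZFs from inflating the number of hypotheses tested.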

We next used the degradation reporter to determine whether the 11 ZFs destabilized their respective full-length protein and found that 6 of the 11 full-length proteins were degraded in the presence of the drug (IKZF1/IKZF3, ZNF692, ZFP91, ZNF276, ZNF653, and ZNF827) (Fig. 2B and fig. S2B). All six ZFs that mediated degradation of their full-length protein carry an additional ZF C-terminal to the one identified in the library screen, whereas five of six ZFs that did not mediate degradation of their corresponding full-length protein do not possess a proximal C-terminal ZF. These results are consistent with our findings that although IKZF1/IKZF3 ZF2 constitutes the minimal ZF degron, an IKZF1 construct spanning ZF2 and ZF3 conferred higher affinity binding to CRBN-pomalidomide in vitro than ZF2 alone (Fig. 1A). We confirmed degradation of endogenous ZNF692, ZFP91, ZNF276, ZNF653, and ZNF827 in the presence of pomalidomide using Western blotting (Fig. 2, C to E). Additionally, we demonstrated binding of hemagglutinin (HA)–tagged, full-length ZFP91 and ZNF692 to endogenous CRBN in the presence of all three compounds by means of immunoprecipitation (fig. S2C).

A screen of 6572 C2H2 ZFs identified 11 C2H2 ZFs degraded in the presence of thalidomide, lenalidomide, or pomalidomide, six of which mediate degradation of their respective full-length protein. Four of these full-length proteins were previously unknown thalidomide analog targets (fig. S2D).

Identification of amino acid loci critical to degradation

Sequence alignment of the C2H2 ZF hits (Fig. 3A) highlighted shared residues that are part of the PROSITE C2H2 ZF search motif (IKZF3 C148, C151, H164, H168, and F155), residues that stabilize the tertiary fold of the ZF domain (IKZF3 L161), and residues that frequently appear within C2H2 ZF domains (IKZF3 G152). The PROSITE ZF search motif covered sequences in which the β-hairpin zinc-coordinating cysteines are separated by two, three, or four residues [C-x(2,4)-C], but only C-x(2)-C ZFs were found destabilized in the screen (Fig. 3A). Insertion of a glycine residue between the β-hairpin residues N148 and Q149 of the IKZF1 ZF2-ZF3 construct (IKZF3 N149 and Q150) (fig. S1D) compromised binding in vitro (fig. S3A). This suggests that the ZF degrome is restricted to an inter-cysteine spacing of two residues. Besides the inter-cysteine spacing, residues that stabilize the tertiary ZF fold, or residues that frequently appear in C2H2 ZF domains, no discernible sequence consensus could be detected among the 11 ZFs (Fig. 3A). The identified ZF degrons thus revealed structural features common to C2H2 ZF domains that alone cannot explain selective degradation of only 11 ZFs, whereas 4661 structurally similar ZFs were present in the library.

To test whether the divergent, nonstructural residues contribute to a functional ZF degron, we synthesized a mutagenesis library of IKZF3 spanning amino acid residues 130 to 189 so that at each of the 60 loci, all 19 possible amino acid substitutions were represented. The pooled library was inserted into the degradation reporter vector and transduced into HEK293T cells. We treated the cells with DMSO, thalidomide, lenalidomide, or pomalidomide and used FACS to isolate drug-treated eGFP–/mCherry+ cells or DMSO-treated eGFP+/mCherry+ control cells. Next-generation sequencing was used to quantify the relative number of read counts for each substitution (fig. S3B).
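The size of such a saturation mutagenesis library follows from the design: 60 positions with 19 substitutions each gives 1140 single-residue variants. A minimal Python sketch of the enumeration is shown below; the function name and the placeholder sequence are illustrative, and the placeholder is not the IKZF3 sequence.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def single_substitutions(sequence, first_residue_number):
    """Yield (label, mutant_sequence) for every single amino acid substitution.

    Labels follow the usual wild-type/position/replacement form, e.g. "Q147E".
    """
    for offset, wild_type in enumerate(sequence):
        for replacement in AMINO_ACIDS:
            if replacement == wild_type:
                continue  # skip the wild-type residue itself
            label = f"{wild_type}{first_residue_number + offset}{replacement}"
            mutant = sequence[:offset] + replacement + sequence[offset + 1:]
            yield label, mutant


# Placeholder 60-residue sequence (NOT the IKZF3 sequence), numbered from 130.
PLACEHOLDER = AMINO_ACIDS * 3
VARIANTS = list(single_substitutions(PLACEHOLDER, 130))
```

With a real 60-residue window numbered 130 to 189, the same call would produce the 1140 labeled variants that the pooled library is designed to represent.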

The screen highlighted nine loci in IKZF3 ZF2 whose amino acid identities were critical for degradation (Fig. 3, B and C, and fig. S3, C and D). These loci again included residues that maintain the tertiary ZF fold (IKZF3 C148, C151, F155, L161, H164, and H168) and the β-hairpin glycine residue (IKZF3 G152). In addition, the screen highlighted two nonstructural residues (IKZF3 Q147 and A153) that varied among the other ZF degrons (Fig. 3A) but were important for degradation of the IKZF3 ZF2 reporter (Fig. 3, B and C). Mutation of these two residues impaired degradation in validation experiments, whereas N149, a residue that was not highlighted in the mutagenesis screen, tolerated mutation (Fig. 3D). These data show that in addition to the residues maintaining the ZF fold, nonstructural amino acids (IKZF3 Q147, G152, and A153) contribute to degron specificity.

Structures of CRBN bound to pomalidomide and two different ZF degrons

To examine how the varying, nonstructural residues within ZF degrons contribute to selective CRBN binding, we determined crystal structures of DDB1ΔBPB-CRBN bound to pomalidomide and IKZF1 ZF2 (amino acid residues 141 to 174) or ZNF692 ZF4 (amino acid residues 416 to 442) (Fig. 4, A and B). Unlike previous CRBN structures, CRBN bound IKZF1 ZF2 in an “open” conformation, with its N- and C-terminal domains separated and stabilized by crystal contacts (Fig. 4A and fig. S4A). As supported by the ZNF692 structure, release of the C-terminal domain is not required for ZF binding (Fig. 4B).

Attempts to crystallize the IKZF1 ZF2-ZF3 construct resulted in poorly diffracting crystals. To dissect the contribution of the C-terminal ZF to binding affinity, we used an unbiased docking approach that placed ZF3 on the neighboring N-terminal domain of “closed” CRBN (fig. S4B). This model positions the C terminus of ZF2 adjacent to the N terminus of ZF3, consistent with the five-amino-acid linker between the two ZFs. The model highlights three consecutive arginine residues of ZF3 (IKZF1 R183 to R185) at the CRBN-ZF2 interface that are not present in the ZF2-ZF1 construct (Fig. 1A and fig. S4C). Mutation of R185 to alanine, or replacement of all three arginines with the corresponding amino acids in ZF1 (IGP), reduced the affinity of the ZF2-ZF3 construct to a level similar to that of ZF2-ZF1 (Fig. 1A) or ZF2 alone, as predicted by our model (fig. S4D). These results show that IKZF1 ZF2 is the primary determinant for drug binding and identify features of ZF3 that contribute to high-affinity binding to the drug-CRBN complex (Fig. 1A and fig. S4B).

The crystal structures of CRBN-pomalidomide bound to IKZF1 ZF2 and ZNF692 ZF4 reveal the detailed side chain interactions between CRBN, pomalidomide, and the ZFs. Both IKZF1 ZF2 and ZNF692 ZF4 bind a complementary groove on the CRBN C-terminal domain (Fig. 4C), interact with the compound through their β-hairpin loops, and match the overall backbone conformation of previous models of the ZF-drug-CRBN complex (fig. S4E). In each structure, the β-hairpin glycine of the ZF packs against the phthalimide ring of pomalidomide similar to the β-hairpins of CK1α (6) and GSPT1 (5), affirming the structural importance of a glycine residue at this position (Fig. 4D and fig. S4F). Because the CRBN groove is narrow (Fig. 4C), ZF-CRBN interactions at both sides of the groove together determine binding. IKZF1 ZF2 N148 and Q149 (IKZF3 N149 and Q150) (fig. S1D) intercalate through their amphipathic side chains with CRBN residues H353 and Y355 (Fig. 4D), explaining CRBN compatibility with diverse amino acids at these ZF positions. IKZF1 residues A152 and L166 (IKZF3 A153 and L167) face CRBN residues V388, I371, and A395, located on the opposite side of the groove (Fig. 4, C and D). Although ZNF692 ZF4 and IKZF1 ZF2 show similar binding modes (Fig. 4C), the position of ZNF692 ZF4 is offset with respect to IKZF1 ZF2 (Fig. 4E and fig. S4, G and H), without significant differences in binding affinity (fig. S4I). Thus, the two structures demonstrate how CRBN accommodates ZF degrons with varying side chain properties at the ZF-CRBN interface (IKZF3 N149, E, N, K, R, V, P; Q150, I, V, R, L, A; A153, Y, F, L, R; and L167, V, I, K, N) (Fig. 3A).

IKZF1 Q146 forms the only side-chain interaction with the compound (Fig. 5A), and mutation of the equivalent residue in IKZF3 (Q147) stabilized the reporter in the saturation mutagenesis experiment (Fig. 3, B and D). The IKZF1 Q146 side chain (IKZF3 Q147) packs against the phthalimide group of pomalidomide and forms a water-mediated hydrogen bond with the C4 amino group of the compound (Fig. 5, A and B, and fig. S5A). Thalidomide lacks the C4 amino group (Fig. 5B), and mutation of IKZF1 Q146 (IKZF3 Q147) to isoleucine, which removes the ability to form a hydrogen bond with the drug, equalizes the binding affinity of IKZF1 ZF2-ZF3 to CRBN across the three compounds accordingly (Fig. 5C and fig. S5, B and C). Equivalent mutations in IKZF3 ZF2 (Q147I/A/H) stabilized the degradation reporter in cells (Fig. 5D and fig. S5D), explaining preferential binding of IKZF1/IKZF3 to CRBN engaged with pomalidomide or lenalidomide over thalidomide in vitro (6) and the contribution of this residue to compound selectivity in vivo (fig. S5E). Seven of the 11 ZF degrons carry an equivalent glutamine residue at this position (Fig. 3A).
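Because residues are quoted in both IKZF1 and IKZF3 numbering, it can help to make the conversion explicit. The Python sketch below assumes only what the text states, namely that IKZF1 ZF2 (residues 145 to 167) is identical to IKZF3 ZF2 (residues 146 to 168), so positions within ZF2 differ by one; the function name is illustrative, and the offset is not claimed to hold outside this domain.

```python
IKZF1_ZF2_POSITIONS = range(145, 168)  # IKZF1 residues 145 to 167 inclusive


def ikzf1_to_ikzf3(position):
    """Map an IKZF1 ZF2 residue number to the equivalent IKZF3 residue number."""
    if position not in IKZF1_ZF2_POSITIONS:
        raise ValueError("offset is only defined here for IKZF1 ZF2 (145 to 167)")
    return position + 1


# Interface residues named in the text, in IKZF1 numbering.
INTERFACE_IKZF1 = {"Q": 146, "N": 148, "Q'": 149, "A": 152, "L": 166}
INTERFACE_IKZF3 = {name: ikzf1_to_ikzf3(pos) for name, pos in INTERFACE_IKZF1.items()}
```

For example, IKZF1 Q146 corresponds to IKZF3 Q147, the residue mutated in the degradation reporter experiments.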

ZF degrons show epistatic properties during CRL4CRBN engagement

The crystal structures demonstrate that ZFs with diverse amino acid sequences are compatible with the drug-CRBN interface. To test whether the variable residues are interchangeable between ZF degrons, we swapped the β-hairpin and α-helix of IKZF3 ZF2 with that of ZFP91 ZF4. Replacing the IKZF3 ZF2 β-hairpin with the ZFP91 ZF4 β-hairpin resulted in a construct that was degraded more efficiently than either of the parent molecules; however, replacing the IKZF3 α-helix with the ZFP91 α-helix resulted in a ZF domain that was resistant to degradation (fig. S5F). Furthermore, introducing a single-residue mutation at the drug-ZF interface of IKZF3 (Fig. 5A and fig. S1D), Q147 to a glutamic acid residue, stabilized the IKZF3 reporter (IKZF3 ZF2 Q147E) (Fig. 3D), whereas an equivalent glutamic acid allows robust reporter degradation in the sequence context of the ZF degron from E4F1 (Fig. 3A and fig. S2A). Despite the low degree of sequence similarity across the 11 ZF degrons (Fig. 3A), only distinct amino acid combinations at the drug-CRBN interface result in degradation. ZF degrons are thus defined by the sequence context of their amino acid side chains that contact the drug-CRBN interface (Fig. 4D), which suggests an epistatic relationship of these variable ZF residues during CRL4CRBN engagement and explains the absence of a primary sequence consensus.

In silico identification of ZFs that interact with the drug-CRBN interface

Only 11 of the 4661 structurally similar ZFs were degraded in the library screen, yet the ZF degron appears to be complex. To create a semiquantitative model of the observed degradation pattern with respect to our crystal structures, we used computational docking to account for multiple, potentially epistatic ZF interactions at the drug-CRBN interface. The Rosetta software package (19) was used to computationally dock all human C2H2 ZFs into the complementary binding groove of CRBN. The Rosetta interface energy scores and the structural similarity to our ZF-pomalidomide-CRBN crystal structures were calculated to assess the docking trials (Fig. 6A and fig. S6, A and B). Using the ZNF692 template structure, we accurately predicted all 11 ZFs identified in the library screen (Fig. 6A), found 40 ZFs with lower interface energy scores than the ZNF692 ZF4 template, and identified 108 ZFs that score better than the lowest ranking ZF found degraded in the library screen. Overall, different docking trials revealed an overlapping set of ~50 to 150 ZFs with interface scores similar to or better than those of ZFs found degraded in the library screen (Fig. 6A and fig. S6, A and B), suggesting that an unexpectedly large number of ZFs are capable of binding the drug-CRBN interface.
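The nomination step described above, counting ZFs that score better than the lowest-ranking ZF degraded in the screen, can be sketched in Python as follows. This is not the Rosetta protocol itself; it only illustrates the thresholding on interface energy scores (lower is better), and the function name and all scores are hypothetical.

```python
def nominate_by_interface_score(scores, screen_hits):
    """Return non-hit ZFs whose interface score beats the worst-scoring screen hit.

    `scores` maps ZF name to an interface energy score (lower is better).
    `screen_hits` is the set of ZF names found degraded in the library screen.
    Results are sorted from best (lowest) to worst score.
    """
    hit_scores = [scores[name] for name in screen_hits if name in scores]
    if not hit_scores:
        raise ValueError("no screen hit has a docking score")
    cutoff = max(hit_scores)  # score of the lowest-ranking degraded ZF
    candidates = [
        name for name, score in scores.items()
        if name not in screen_hits and score < cutoff
    ]
    return sorted(candidates, key=scores.get)


# Hypothetical scores, for illustration only.
EXAMPLE_SCORES = {"hit_1": -30.0, "hit_2": -22.0, "zf_x": -35.0, "zf_y": -25.0, "zf_z": -10.0}
EXAMPLE_HITS = {"hit_1", "hit_2"}
NOMINATED = nominate_by_interface_score(EXAMPLE_SCORES, EXAMPLE_HITS)
```

Swapping the cutoff for the score of a template structure would give the stricter count (ZFs scoring better than the template) mentioned in the text.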

On the basis of biological relevance, 21 of these ZFs were selected and tested for CRBN engagement in TR-FRET experiments, either as individual ZFs (fig. S6C) or fused to their respective C-terminal ZF (equivalent to the IKZF1 ZF2-ZF3 construct) (Fig. 6B). Sixteen ZFs bound CRBN-pomalidomide in vitro [BCL6, BCL6B, EGR1, EGR4, GZF1, HIC1, HIC2, SALL1, SALL3, SALL4, OSR1, OSR2, WIZ (ZF6 and ZF7), ZBTB7A, and ZBTB7B], with binding affinities similar to (WIZ ZF7) or ~5- to 20-fold lower than those of the respective IKZF1 ZF2-ZF3 or ZF2 constructs (Fig. 6B and fig. S6, C, D, and E). Analogously, IKZF1 ZF1, a ZF that was not predicted to bind CRBN in silico (Fig. 6A), did not bind CRBN in vitro (fig. S6C). Thus, of the 33 ZFs individually tested in this study, 28 are recruited to pomalidomide-engaged CRBN (fig. S6F). Our in silico ZF docking approach is thus capable of predicting ZFs that interact with CRBN-pomalidomide with greater than 80% accuracy.
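The accuracy figure quoted above can be checked from the two counts given in the text. A short Python sketch of the arithmetic, with illustrative variable names:

```python
zfs_tested = 33     # ZFs individually tested in the study, as stated in the text
zfs_recruited = 28  # of those, ZFs recruited to pomalidomide-engaged CRBN

fraction_recruited = zfs_recruited / zfs_tested

if __name__ == "__main__":
    print(f"{fraction_recruited:.1%}")
```

The ratio is about 85%, consistent with the stated "greater than 80%".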

Those ZFs that bound CRBN-pomalidomide in the biochemical assay were subsequently tested for degradation in cells by using the degradation reporter. Of the 16 ZFs, WIZ ZF7, the best in vitro binder in the set, showed partial degradation in the presence of pomalidomide (fig. S6E and fig. S7, A and B). Examination of published proteome-wide mass spectrometry data confirmed degradation of endogenous, full-length WIZ in WSU-DLCL2 and TMD8 cells (both human diffuse large B cell lymphoma lines) in the presence of lenalidomide and the thalidomide analog CC-122 (fig. S7C) (20). Because the majority of these ZFs are recruited to pomalidomide-engaged CRBN in vitro, but not destabilized by the same compound in cells, our results suggest that small changes in ZF-binding affinity disproportionately influence ZF degradation in cells.

Different thalidomide analogs target distinct sets of ZFs for degradation

We next examined whether thalidomide derivatives with chemical alterations at the drug-ZF interface induce degradation of the computationally predicted and biochemically validated ZF hits that are not destabilized by pomalidomide in cells. We therefore treated cells that express these ZFs in the eGFP/mCherry degradation reporter with the previously reported thalidomide derivatives CC-122 (20), CC-220 (21), and CC-885 (5) (fig. S7, A and B). The modified thalidomide analogs induced mild but significant degradation for some of these ZFs, including two ZFs destabilized by CC-122 (WIZ ZF6 and BCL6 ZF3), two destabilized by CC-220 (BCL6B ZF2 and HIC2 ZF5), and eight destabilized by CC-885 (BCL6 ZF3, BCL6B ZF2, OSR1 ZF1, ZBTB7B ZF2, SALL3 ZF4, SALL4 ZF2, EGR1 ZF3, and SALL1 ZF4). Although the effects are small, our findings suggest that different thalidomide analogs target different ZFs for degradation, including ZFs identified by computational docking, rendering them suitable candidates for drug development.

To comprehensively determine whether the chemically distinct thalidomide analogs induce degradation of different sets of ZFs, we repeated the C2H2 ZF library screen in the presence of pomalidomide, CC-122, and CC-220 (Fig. 6C and fig. S8, A and B). The 11 previously identified ZF hits again scored in the screen with pomalidomide (fig. S8B). Consistent with our earlier finding that CC-122 and CC-220 were capable of degrading ZFs not affected by pomalidomide, all three thalidomide analogs exhibited distinct ZF degradation patterns across the ZF library (Fig. 6C). IKZF2/4 ZF2 was selectively degraded by CC-220 (Fig. 6C). IKZF2/4 ZF2 differs from IKZF1/3 ZF2 by a single amino acid substitution at the drug-ZF interface, IKZF3 Q147H (Fig. 5A and fig. S6F). This amino acid change decreases binding to CRBN-pomalidomide in vitro by a factor of 2 to 3 (fig. S5B) and stabilizes the respective ZF reporter in cells (Fig. 5D). In validation experiments, the IKZF2/4 ZF2 reporter was degraded by more than 40% in the presence of CC-220 (fig. S8, C and D). These results demonstrate that thalidomide analogs with chemical alterations at the drug-ZF interface promote degradation of distinct sets of ZFs and are capable of converting ZFs that bind CRBN-pomalidomide weakly in vitro into degraded ZFs in cells.

Discussion

We used a combination of functional and structural approaches to define the molecular basis of C2H2 ZF recruitment to the drug-engaged CRL4CRBN ubiquitin ligase. Thalidomide, lenalidomide, and pomalidomide mediate CRL4CRBN-dependent degradation of a larger number of proteins than previously appreciated through a C2H2 ZF degron (3, 4, 16). We identified 15 individual ZFs and seven full-length ZF-containing proteins that are degraded by thalidomide derivatives in functional or computational screens. Crystal structures, in vitro binding, and cellular degradation assays illustrate that 28 ZFs (including IKZF2/4 ZF2) with diverse amino acid sequences bind the same drug-CRBN interface (fig. S6E).

The majority of E3 ligase degrons are characterized by conserved primary sequence motifs that lie in unstructured regions of the protein substrate (15). By contrast, the more than 28 C2H2 ZFs accommodated by the drug-CRBN interface show surprisingly little sequence conservation, apart from residues that stabilize the tertiary ZF fold. ZF degrons bind a complementary groove on the CRBN surface that fits the overall shape of the C2H2 ZF domain, bringing different amino acids of different ZF degrons in contact with the same drug-CRBN interface. Substitution of a ZF residue at one site of the CRBN interface can be compensated by amino acid changes at another site of the ZF, explaining the observed epistatic properties of the ZF degrons. Recruitment and degradation are thus influenced by the complementarity of the substrate to the drug-CRBN binding groove and the sequence context of the contacting ZF residues, making the overall “shape” of the ZF the important binding determinant rather than its primary amino acid sequence. In addition to the complex ZF-CRBN side chain interaction, direct contacts between thalidomide analogs and the ZF provide additional layers of specificity (fig. S5D). Unbiased functional and computational approaches are thus required to identify the full complement of ZF degrons capable of drug-dependent binding and degradation.

The discovery that multiple C2H2 ZFs are drug targets raises the possibility that a large number of ZFs could be subject to thalidomide analog–based degradation. Supporting this notion, 28 of the ~50 to 150 additional C2H2 ZFs nominated by means of in silico docking were shown to bind the drug-CRBN complex, demonstrating that the CRBN surface is more permissive than suggested by the number of targets identified in the ZF library screens. Small differences in affinity between the ZF and the drug-CRBN complex in vitro translate into large differences in degradation in cells, which are likely due to competition for CRBN occupancy by multiple ZFs as well as other substrates that use the same binding surface (Fig. 6, C and D, and figs. S5 and S7). If ZF degradation depends on the ZF occupancy of CRL4CRBN, it is possible that high local concentration of the ZF or CRL4CRBN, or low concentrations of competing ZFs, could compensate for low binding affinity, leading to degradation under such conditions. Given the CRL4CRBN protein architecture and its ~100-Å ubiquitination radius (6, 7, 22), we expect degradation of full-length ZF proteins to be less dependent on lysine accessibility and instead primarily determined by the protein synthesis rate and the affinity and binding kinetics of the internal ZF degron to CRBN, which can be modulated by the drug.

We observed that thalidomide analogs with chemical modifications at the drug-ZF interface are capable of converting ZFs with weak affinity for the CRBN-pomalidomide complex into degraded targets. IKZF2/4 ZF2 is not degraded by pomalidomide, lenalidomide, or CC-122 but is efficiently degraded by CC-220. The crystal structures suggest that ZF side chains, particularly those proximal to the drug at IKZF1 positions 146, 148, and 153 (IKZF3 147, 149, and 154), interact with thalidomide analogs and that chemical alterations can modulate ZF specificity (fig. S5, D and E). ZFs that bind weakly but are not degraded in response to one compound could therefore serve as starting points for the development of new thalidomide analogs that selectively and efficiently degrade such ZFs. Several of the ZFs identified as degraded in functional experiments, or as in vitro binders that could be targeted with further drug development, have been implicated in human disease. For example, BCL6 is an oncoprotein in lymphomas, ZFP91 is implicated in NF-κB signaling (23), and ZNF827 is reported to be an essential scaffolding protein for alternative lengthening of telomeres (24). HIC1, HIC2, GZF1, OSR1, OSR2, and SALL4 have all been implicated in development and could contribute to developmental abnormalities in fetuses exposed to thalidomide (25–30).

The approximately 800 C2H2 ZF–containing proteins are the largest class of putative transcription factors in the human proteome (13). Our results suggest that degradation of C2H2 ZF–containing proteins through different thalidomide derivatives may be a generalizable paradigm for targeting “undruggable” transcription factors and provide a structural and functional starting point from which to explore the extent to which CRBN-binding small molecules may be used to target ZF proteins for therapeutic intervention.

Materials and methods summary

Binding assays

In vitro CRBN binding was measured using a TR-FRET assay, and cellular CRBN binding was assayed via CRBN immunoprecipitation.

Protein degradation assays

HEK293T cells expressing candidate proteins in an eGFP/mCherry protein degradation reporter vector were treated with DMSO or drug and the eGFP:mCherry ratio was quantified by flow cytometry. Western blotting was used to assay degradation of endogenous protein targets.
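As a rough illustration of the reporter readout, the sketch below summarizes per-cell fluorescence as an eGFP:mCherry ratio and expresses the drug-treated value relative to DMSO. The summary above states only that the ratio was quantified; the use of a median, the exclusion of cells without mCherry signal, the DMSO normalization, and the function names are assumptions made for this example.

```python
from statistics import median


def reporter_ratio(egfp, mcherry):
    """Median per-cell eGFP:mCherry ratio (median is an assumed summary statistic)."""
    if len(egfp) != len(mcherry):
        raise ValueError("eGFP and mCherry must have one value per cell")
    # Skip cells without mCherry signal to avoid division by zero.
    ratios = [g / r for g, r in zip(egfp, mcherry) if r > 0]
    if not ratios:
        raise ValueError("no cells with positive mCherry signal")
    return median(ratios)


def relative_reporter_level(drug_ratio, dmso_ratio):
    """Drug-treated ratio as a fraction of the DMSO control ratio."""
    return drug_ratio / dmso_ratio


# Hypothetical intensities for three cells per condition.
dmso = reporter_ratio([100.0, 120.0, 90.0], [50.0, 60.0, 45.0])
drug = reporter_ratio([25.0, 30.0, 20.0], [50.0, 60.0, 40.0])
print(relative_reporter_level(drug, dmso))
```

Because mCherry is expressed from the same reporter vector, dividing by it corrects for cell-to-cell differences in expression, so a value below 1 reflects loss of the eGFP-tagged candidate rather than lower reporter expression.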

C2H2 ZF library screen

HEK293T cells expressing a library of 6572 C2H2 zinc fingers in the protein degradation reporter vector were treated with DMSO or drug, eGFP+ and eGFP− cell populations were isolated by FACS, and the relative frequency of individual ZFs was quantified with next-generation sequencing.
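One simple way to turn the sequencing counts from such a sort into a per-ZF score is a log2 ratio of read frequencies in the eGFP− versus eGFP+ populations. The sketch below is illustrative only; the scoring and statistics actually used for the screen are not specified in this summary, and the pseudocount, function name, and ZF identifiers are assumptions.

```python
import math


def degradation_scores(neg_counts, pos_counts, pseudocount=1.0):
    """log2(frequency in eGFP- / frequency in eGFP+) for each ZF.

    Positive scores mean a ZF is enriched among cells that lost eGFP.
    The pseudocount (an assumed choice) keeps ZFs with zero reads finite.
    """
    zfs = sorted(set(neg_counts) | set(pos_counts))
    neg_total = sum(neg_counts.get(z, 0) + pseudocount for z in zfs)
    pos_total = sum(pos_counts.get(z, 0) + pseudocount for z in zfs)
    scores = {}
    for z in zfs:
        neg_freq = (neg_counts.get(z, 0) + pseudocount) / neg_total
        pos_freq = (pos_counts.get(z, 0) + pseudocount) / pos_total
        scores[z] = math.log2(neg_freq / pos_freq)
    return scores


# Hypothetical read counts for two library members after drug treatment.
scores = degradation_scores(
    neg_counts={"ZF_A": 99, "ZF_B": 9},
    pos_counts={"ZF_A": 9, "ZF_B": 99},
)
for name, score in scores.items():
    print(name, round(score, 2))
```

In practice, the same score computed for the DMSO-treated sample would serve as the baseline, so that drug-dependent degradation can be separated from ZFs that are intrinsically unstable in the reporter.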

Saturation mutagenesis

HEK293T cells expressing an IKZF3 saturation mutagenesis library spanning amino acid residues 130 to 189 were exposed to DMSO or drug, eGFP+ and eGFP− cell populations were isolated using FACS, and the relative frequency of each mutation was quantified with next-generation sequencing.

X-ray structure determination

IKZF1 ZF2 and ZNF692 ZF4 were crystallized in the presence of DDB1ΔBPB-CRBNΔN40 and pomalidomide. The structures were determined by molecular replacement using PHASER, with iterative cycles of model building in COOT followed by refinement in phenix.refine or autoBUSTER.

Docking simulations

ZF docking was carried out with the RosettaDock pipeline between the CRBNCTD (residues 320 to 425) bound to pomalidomide and 4645 ZF models matching the pattern [x(6)-C-x(2)-C-G-x(2)-[LIVMFYWC]-x(8)-H-x(3)-H-x(2)]. The docking runs were evaluated with the beta_nov16 scoring function.
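The ZF pattern quoted above uses PROSITE-style notation, in which x(n) denotes n arbitrary residues and brackets list the residues allowed at one position. As a hedged sketch of how candidate ZFs could be pulled from a protein sequence, the snippet below converts that pattern to a regular expression and scans for 29-residue windows. The helper names and the test sequence are hypothetical, and this is not the selection code used for the docking set.

```python
import re

# Pattern from the docking methods, written without whitespace.
ZF_PATTERN = "x(6)-C-x(2)-C-G-x(2)-[LIVMFYWC]-x(8)-H-x(3)-H-x(2)"


def prosite_to_regex(pattern):
    """Convert the subset of PROSITE syntax used above into a regex string."""
    parts = []
    for element in pattern.split("-"):
        repeat = re.fullmatch(r"x\((\d+)\)", element)
        if repeat:
            # x(n): n residues of any type (uppercase one-letter codes).
            parts.append("[A-Z]{%s}" % repeat.group(1))
        elif re.fullmatch(r"[A-Z]|\[[A-Z]+\]", element):
            # A fixed residue or a bracketed set maps directly to regex.
            parts.append(element)
        else:
            raise ValueError("unsupported pattern element: %r" % element)
    return "".join(parts)


def find_zf_candidates(sequence):
    """Return (start index, matched window) pairs, including overlapping hits."""
    regex = re.compile("(?=(" + prosite_to_regex(ZF_PATTERN) + "))")
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence)]


# Synthetic sequence (not a real protein) built to satisfy the pattern.
toy_zf = "AAAAAACAACGAALAAAAAAAAHAAAHAA"
print(find_zf_candidates("MM" + toy_zf + "KK"))
```

The lookahead wrapper reports overlapping windows, which matters for proteins with tandem ZF arrays, where the flanking x(6) and x(2) residues of neighboring fingers can overlap.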

Acknowledgments: We thank D. Neuberg and R. Redd for statistical analysis of the saturation mutagenesis screen, C. Zhu for helpful discussions on how to clone the saturation mutagenesis library, D. Haldar for biological characterization of the novel targets, J. Chen for assistance with analysis of the C2H2 ZF library screen, P. Rogers for help with FACS, and S. S. Roy Burman for advice with Rosetta docking. Funding: B.L.E. received funding from the NIH (R01HL082945 and P01CA108631), the Howard Hughes Medical Institute, the Edward P. Evans Foundation, and the Leukemia and Lymphoma Society. N.H.T. received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (grant agreement 666068). Q.L.S. was supported by award T32GM007753 from the National Institute of General Medical Sciences. G.P. was supported by the Human Frontier Science Program (HFSP Long-Term Fellowship LT000210/2014) and the European Molecular Biology Organization (EMBO Advanced Fellowship aALTF 761-2016). M.S. received funding from the European Union’s Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement 702642. B.J.L. was supported by the National Health and Medical Research Council (Early Career Fellowships Grant APP1124979). Author contributions: N.H.T. and B.L.E. conceived of the concept for the identification of previously unidentified C2H2-ZF–containing targets of thalidomide analogs and advised those involved in the project. Q.L.S. generated the degradation reporter vector; identified the minimal degron required for in vivo degradation; conducted the C2H2 ZF library screen with thalidomide, lenalidomide, and pomalidomide; validated the ZF degrons; and executed the saturation mutagenesis of IKZF3 with advice and experimental and analytic assistance from T.M. G.P. purified proteins, performed TR-FRET experiments, and crystallized the IKZF1 and ZNF692 ZF degrons bound to CRBN and pomalidomide, with help from W.A. G.P. and R.D.B. carried out the structural analysis. R.D.B. performed Rosetta docking and interpreted results with input from G.P. A.R. and B.J.L. validated new targets and A.R. tested new thalidomide analogs against IKZF3 Q147 mutants. M.S. conducted and analyzed the C2H2 ZF library screen with pomalidomide, CC-122, and CC-220. N.H.T., B.L.E., G.P., R.D.B., and Q.L.S. wrote the manuscript. Competing interests: B.L.E., Q.L.S., and T.S.M. are inventors on a patent application (U.S. 15/759,168, EP 168451557.7), submitted by the Broad Institute, Harvard University, and Brigham and Women’s Hospital that covers methods of identifying drug-modulated polypeptide targets for degradation. B.L.E. has received research funding from Celgene. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institute of General Medical Sciences or the National Institutes of Health. Data and materials availability: All data are publicly available. The insect cell expression plasmids are available from the Friedrich Miescher Institute for Biomedical Research under a materials transfer agreement. Structural coordinates have been deposited in the Protein Data Bank under the accession nos. 6H0F (bound to IKZF1ZF2) and 6H0G (bound to ZNF692ZF4).