Thyroglobulin (Tg) is a large glycoprotein specific to the thyroid gland and is the precursor of the iodinated thyroid hormones thyroxine (T4) and triiodothyronine (T3). The N-terminal section of Tg contains 10 repeats of a domain of about 65 amino acids, known as the Tg type-1 (Thyr-1) repeat [(PUBMED:3595599), (PUBMED:8797845)]. This domain has also been found as a single or repeated sequence in the HLA class II associated invariant chain [(PUBMED:3038530)]; human pancreatic carcinoma marker proteins GA733-1 and GA733-2 [(PUBMED:2333300)]; nidogen (entactin), a sulphated glycoprotein that is widely distributed in basement membranes and tightly associated with laminin; insulin-like growth factor binding proteins (IGFBP) [(PUBMED:1709161)]; saxiphilin, a transferrin-like protein from Rana catesbeiana (Bull frog) that binds specifically to the neurotoxin saxitoxin [(PUBMED:8146142)]; chum salmon egg cysteine proteinase inhibitor; and equistatin, a thiol-protease inhibitor from Actinia equina (sea anemone) [(PUBMED:9153250)]. The existence of Thyr-1 domains in such a wide variety of proteins raises questions about their activity and function, and their interactions with neighbouring domains. The Thyr-1 and related domains belong to MEROPS proteinase inhibitor family I31, clan IX.

Equistatin from A. equina is composed of three Thyr-1 domains. Like other proteins that contain Thyr-1 domains (the thyropins), it binds reversibly and tightly to cysteine proteases (peptidase family C1). In equistatin, inhibition of papain is a function of domain-1. Unusually, domain-2 inhibits cathepsin D, an aspartic protease (peptidase family A1), and has no activity against papain. Domain-3 does not inhibit either papain or cathepsin D, and its function and target peptidase have yet to be determined [(PUBMED:9153250), (PUBMED:12650938)].

MC3T3-E1 murine osteoblasts produce insulin-like growth factor (IGF)-binding protein-4 (IGFBP-4)-degrading proteinase activity, which is inhibited by IGFBP-3 and a highly basic, C-terminal domain of IGFBP-3. Of the other five IGFBPs, IGFBP-5 and -6 share the highest degree of homology with this domain of IGFBP-3; therefore, we investigated whether these two IGFBPs inhibit IGFBP-4 degradation. Both IGFBP-5 and IGFBP-6 inhibit the degradation of 125I-IGFBP-4 by MC3T3-E1-conditioned media, and their inhibitory effects are variably reversed by IGFs. Synthetic peptides containing highly basic, C-terminal regions of IGFBP-5 and IGFBP-6 inhibit 125I-IGFBP-4 degradation, as does a homologous IGFBP-3 peptide, yet each peptide displays a different IC50, with the IGFBP-5 peptide being the most potent and the IGFBP-6 peptide being the least potent. In contrast, a homologous, yet neutral, IGFBP-4 peptide does not inhibit 125I-IGFBP-4 proteolysis, confirming the role of basic residues in the inhibitory process. The IGFBP-3, -5, and -6 peptides, each of which contains the heparin-binding consensus sequence XBBBXXBX, bind heparin, yet the IGFBP-3 and -5 peptides bind heparin with the highest affinities, whereas the IGFBP-6 peptide binds heparin with approximately 10-fold less affinity. Consistent with these regions being involved in proteinase inhibition, heparin completely reverses their inhibitory effects on 125I-IGFBP-4 proteolysis. Together, these data demonstrate that IGFBP-3, -5, and -6 can function as IGF-reversible inhibitors of IGFBP-4 proteolysis, likely through homologous, highly basic, heparin-binding domains contained within the conserved thyroglobulin type-1 motif present in the C-termini of these IGFBPs.
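The heparin-binding consensus XBBBXXBX mentioned above can be searched for with a simple pattern match. The following is a minimal sketch, not the method used in the cited study. It assumes that B stands for a basic residue (taken here as Lys or Arg only) and X for any other residue; the abstract does not spell out these definitions. The function name and the demo sequence are hypothetical.

```python
import re

# Assumption: B = basic residue (K or R); X = any non-basic residue.
# The source gives only the pattern XBBBXXBX, not these definitions.
BASIC = "KR"
X = f"[^{BASIC}]"
B = f"[{BASIC}]"

# A lookahead is used so that overlapping matches are also reported.
HEPARIN_MOTIF = re.compile(f"(?=({X}{B}{B}{B}{X}{X}{B}{X}))")


def find_heparin_motifs(seq: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched 8-mer) for every XBBBXXBX match."""
    return [(m.start(), m.group(1)) for m in HEPARIN_MOTIF.finditer(seq.upper())]


# Made-up sequence for illustration only; not a real IGFBP peptide.
demo = "GSAKRKGFKAGG"
print(find_heparin_motifs(demo))
```

Whether histidine should count as basic is a judgment call; adding "H" to `BASIC` would widen the search.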

Equistatin, a new inhibitor of cysteine proteinases from Actinia equina, is structurally related to thyroglobulin type-1 domain.

J Biol Chem. 1997; 272: 13899-903

It is well known that the activities of the lysosomal cysteine proteinases are tightly regulated by their endogenous inhibitors, cystatins. Here we report a new inhibitor of cysteine proteinases isolated from sea anemone Actinia equina. The inhibitor, equistatin, is an acidic protein with pI 4.7 and molecular weight of 14,129. It binds tightly and rapidly to cathepsin L (ka = 5.7 x 10^7 M^-1 s^-1, Ki = 0.051 nM) and papain (ka = 1.2 x 10^7 M^-1 s^-1, Ki = 0.57 nM). The lower affinity for cathepsin B (Ki = 1.4 nM) was shown to be due mainly to a lower second order association rate constant (ka = 0.04 x 10^6 M^-1 s^-1). The inhibitor is composed of 128 amino acids forming two repeated domains with 48% identity. Neither of the domains shows any sequence homology to cystatins, but they do show a significant homology to thyroglobulin type-1 domains. A highly conserved consensus sequence motif of Cys-Trp-Cys-Val together with conserved Cys, Pro, and Gly residues is present in major histocompatibility complex class II-associated p41 invariant chain, nidogen, insulin-like growth factor proteins, saxiphilin domain a, pancreatic carcinoma marker proteins (GA733), and chum salmon egg cysteine proteinase inhibitor. In each of the domains of the equistatin, the three residues are similarly conserved, and the sequences Val-Trp-Cys-Val and Cys-Trp-Cys-Val are present in domains a and b, respectively. We suggest that equistatin belongs to a new superfamily of protein inhibitors of cysteine proteinases named thyroglobulin type-1 domain inhibitors. This superfamily currently includes equistatin, major histocompatibility complex class II-associated p41 invariant chain fragment, and chum salmon egg cysteine proteinase inhibitor.
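The reported association rate constants and inhibition constants can be combined to estimate a dissociation rate constant for each enzyme. The sketch below assumes simple one-step reversible binding, for which Ki = kd / ka. That model is an assumption made here for illustration and is not stated in the abstract, and the function and variable names are hypothetical.

```python
# Values as reported in the abstract: ka in M^-1 s^-1, Ki in nM.
REPORTED = {
    "cathepsin L": (5.7e7, 0.051),
    "papain": (1.2e7, 0.57),
    "cathepsin B": (0.04e6, 1.4),
}


def dissociation_rate(ka: float, ki_nM: float) -> float:
    """Estimate kd (s^-1) as Ki * ka, assuming one-step reversible binding."""
    return ki_nM * 1e-9 * ka


for enzyme, (ka, ki) in REPORTED.items():
    print(f"{enzyme}: kd ~ {dissociation_rate(ka, ki):.2e} s^-1")
```

Under this assumption the cathepsin B complex has the smallest estimated kd of the three, which is in line with the abstract attributing its weaker affinity mainly to slower association.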

Thyroglobulin (Tg) proteolytic steps are central phenomena in Tg processing and thyroid hormone release in thyrocytes. Based on recent literature data, we propose that the type-1 repetitive units present in the Tg sequence could act as binders and reversible inhibitors of the proteases implicated in Tg processing. The pH-dependent interactions of proteases with the repeats could permit (i) protection from degradation of low iodinated Tg to be recycled; (ii) restriction of early proteolytic attacks to N- and C-terminal hormone formation sites; (iii) increase of the half-time of acidic proteases necessary for the final, extensive degradation steps of Tg.

Characterization of the type-1 repeat from thyroglobulin, a cysteine-rich module found in proteins from different families.

Eur J Biochem. 1996; 240: 125-33

The amino acid sequence of human thyroglobulin is known to enclose cysteine-rich repetitive regions. In this study, we report the existence of an eleventh type-1 repeat within the human thyroglobulin sequence, and we characterize the thyroglobulin type-1 repeat as a protein module. The 11 thyroglobulin type-1 repeats possessed the same number of cysteine residues (six in type A, four in the two type B repeats), a fairly constant number of residues between cysteines and a conserved sequence pattern. By scanning protein sequence databases, 29 proteins belonging to six different families were found to enclose at least one, and up to three, thyroglobulin type-1 repeats in their sequence. Although the repeat was present in numerous proteins possessing binding properties, an examination of the information available in the literature showed that a direct role of the repeat in protein-protein interaction has rarely been assessed. A distance analysis of the sequences indicated that all repeats segregate into four clusters of phylogenetically close sequences. A consensus sequence of type-1 repeats was derived from sequence similarity analysis; it comprised a central core of conserved residues including two highly conserved motifs, QC and CWCV. The type-1 repeat from thyroglobulin was found to differ from several previously described cysteine-rich modules, in particular from the epidermal-growth-factor-like module with which it has sometimes been confused. Therefore, our results provide a complete characterization of the repeats which will help in the detection of these repeats in newly characterized proteins, a necessary step for understanding the structural/biological role of this module.
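The features named in this abstract (cysteine count, and the QC and CWCV motifs) lend themselves to a quick first-pass check of a candidate repeat. The sketch below is only a rough illustration and is not the detection procedure of the cited study: it labels a sequence as type A or type B purely from its cysteine count, which is a simplification assumed here, and the function name and demo sequence are hypothetical.

```python
def describe_repeat(seq: str) -> dict:
    """Summarise cysteine count and presence of the QC and CWCV motifs."""
    seq = seq.upper()
    n_cys = seq.count("C")
    # Simplifying assumption: six cysteines -> type A, four -> type B.
    kind = {6: "type A", 4: "type B"}.get(n_cys, "unclassified")
    return {
        "cysteines": n_cys,
        "kind": kind,
        "has_QC": "QC" in seq,
        "has_CWCV": "CWCV" in seq,
    }


# Made-up sequence for illustration only; not a real type-1 repeat.
demo = "AAQCPDGSYCWCVDKKGCEELCPAGC"
print(describe_repeat(demo))
```

A real detection step would also need the spacing between cysteines and the wider consensus, neither of which is given in enough detail here to encode.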

Primary structure of bovine thyroglobulin deduced from the sequence of its 8,431-base complementary DNA.

Nature. 1985; 316: 647-51

In mammals, an adequate supply of thyroid hormones is essential for normal growth and neurological development. The biosynthesis of thyroid hormones involves an iodinated precursor protein, thyroglobulin, which may be considered an extreme example of a pro-hormone. Thyroglobulin is a dimeric glycoprotein of relative molecular mass (Mr) 660,000 (660K), which is secreted by the thyrocyte and stored in the lumen of the thyroid follicle. The hormonogenic reaction is extracellular, and involves iodination of tyrosyl residues of thyroglobulin and the intramolecular coupling of a subset of these into thyroxine (T4) and triiodothyronine (T3), which remain part of the polypeptide chain. Secretion of hormones results from the endocytosis of thyroglobulin followed by its complete hydrolysis in lysosomes. Considering that the maximum yield of hormones is approximately 6-8 per 660K protein, the whole process is apparently wasteful. However, the efficiency of thyroglobulin as a thyroid hormone precursor is extremely high when the supply of iodine is short; in such conditions, almost all the iodine incorporated is found in iodothyronine. Hence it is suggested that the thyroglobulin structure has evolved to allow for the preferential and efficient iodination and coupling of the hormonogenic tyrosines. Here we report the complete primary structure of bovine thyroglobulin, derived from the sequence of its 8,431-base-pair complementary DNA. The 2,769-amino-acid sequence is characterized by a pattern of imperfect repeats derived from three cysteine-rich motifs. Four hormonogenic tyrosines have been precisely localized near the amino and carboxyl ends of the protein.

Disease (disease genes where sequence variants are found in this domain)

This information is based on mapping of the SMART genomic protein database to KEGG orthologous groups. Percentages are relative to the number of proteins with a TY domain that could be assigned to a KEGG orthologous group, not to all proteins containing a TY domain. Please note that proteins can be included in multiple pathways, i.e. the numbers above will not always add up to 100%.