Abstract

Serpins are a broadly distributed family of protease inhibitors that use a conformational
change to inhibit target enzymes. They are central in controlling many important proteolytic
cascades, including the mammalian coagulation pathways. Serpins are conformationally
labile and many of the disease-linked mutations of serpins result in misfolding or
in the formation of pathogenic, inactive polymers.

Review

Serpins (serine protease inhibitors, classified as inhibitor family I4) are the largest
and most broadly distributed superfamily of protease inhibitors [1,2]. Serpin-like genes have been identified in animals, poxviruses, plants, bacteria
and archaea, and over 1,500 members of this family have been identified to date. Analysis
of the available genomic data reveals that all multicellular eukaryotes have serpins:
humans, Drosophila, Arabidopsis thaliana and Caenorhabditis elegans have 36, 13, 29, and about 9 serpin-like genes, respectively [1,3]. In contrast, serpins in prokaryotes are sporadically distributed and most serpin-containing
prokaryotes have only a single serpin gene [4]. The majority of serpins inhibit serine proteases, but serpins that inhibit caspases
[5] and papain-like cysteine proteases [6,7] have also been identified. Rarely, serpins perform a non-inhibitory function; for
example, several human serpins function as hormone transporters [8] and certain serpins function as molecular chaperones [9] or tumor suppressors [10]. A phylogenetic study of the superfamily divided the eukaryotic serpins into 16 'clades'
(termed A-P) [1]. The proteins are named SERPINXy, where X is the clade and y is the number within
that clade; many serpins also have alternative names from before this classification
was proposed.
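
The SERPINXy naming scheme described above is regular enough to be parsed mechanically. The following is a minimal sketch, assuming names are written exactly as "SERPIN", a single clade letter from A to P, and a number; the function and type names (parse_serpin_name, SerpinName) are illustrative and do not come from any existing library.

```python
import re
from typing import NamedTuple


class SerpinName(NamedTuple):
    """A systematic serpin name split into clade and member number."""
    clade: str
    number: int


# The text describes 16 eukaryotic clades, termed A-P.
_SERPIN_PATTERN = re.compile(r"SERPIN([A-P])(\d+)")


def parse_serpin_name(name: str) -> SerpinName:
    """Parse a SERPINXy-style name, e.g. 'SERPINA1' -> clade 'A', number 1."""
    match = _SERPIN_PATTERN.fullmatch(name.strip().upper())
    if match is None:
        raise ValueError(f"not a SERPINXy-style name: {name!r}")
    return SerpinName(clade=match.group(1), number=int(match.group(2)))
```

For example, parse_serpin_name("SERPINC1") yields clade "C" and number 1. Alternative names such as "antithrombin" are deliberately rejected, because mapping them to systematic names requires a lookup table rather than a pattern.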

Serpins are relatively large molecules (about 330-500 amino acids) in comparison with
protease inhibitors such as basic pancreatic trypsin inhibitor (BPTI, which is about
60 amino acids) [11]. Over 70 serpin structures have been determined, and these data, along with a large
amount of biochemical and biophysical information, reveal that inhibitory serpins
are 'suicide' or 'single use' inhibitors that use a unique and extensive conformational
change to inhibit proteases [12]. This conformational mobility renders serpins heat-labile and vulnerable to mutations
that promote misfolding, spontaneous conformational change, formation of inactive
serpin polymers and serpin deficiency [13]. In humans, several conformational diseases or 'serpinopathies' linked to serpin
polymerization have been identified, including emphysema (SERPINA1 (antitrypsin) deficiency)
[14], thrombosis (SERPINC1 (antithrombin) deficiency) [15] and angio-edema (SERPING1 (C1 esterase inhibitor) deficiency) [16]. Accumulation of serpin polymers in the endoplasmic reticulum of serpin-secreting
cells can also result in disease, most notably cirrhosis (SERPINA1 polymerization)
[14] and familial dementia (SERPINI1 (neuroserpin) polymerization) [17]. Other serpin-related diseases are caused by null mutations or (rarely) point mutations
that alter inhibitory specificity or inhibitory function [18]. Here, we summarize the evolution, structure and mechanism of serpin function and
dysfunction.

Broad organization of the serpin superfamily

Serpins appear to be ubiquitous in multicellular higher eukaryotes and in the poxviridae
pathogens of mammals. In humans, the two largest clades of the 36 serpins that have
been identified are the extracellular 'clade A' molecules (thirteen members found
on chromosomes 1, 14 and X) and the intracellular 'clade B' serpins (thirteen members
on chromosomes 18 and 6) [3].

Recent bioinformatic and structural studies have also identified inhibitory serpins
in the genomes of certain primitive unicellular eukaryotes (such as Entamoeba histolytica [19]) as well as prokaryotes [4,20]. No fungal serpin has been identified to date, and the majority of prokaryotes do
not contain clearly identifiable serpin-like genes. Phylogenetic analyses have found
no evidence for horizontal transfer [1,21], and it is instead suggested that serpins are ancient proteins and that most prokaryotes
have lost the requirement for serpin-like activity [4].

Functional diversity of serpins

Inhibitory serpins have been shown to function in processes as diverse as DNA binding
and chromatin condensation in chicken erythrocytes [22,23], dorsal-ventral axis formation and immunoregulation in Drosophila and other insects [24,25], embryo development in nematodes [26], and control of apoptosis [5].

In humans, the majority (27 out of 36) of serpins are inhibitory (Table 1). Clade A serpins include inflammatory response molecules such as SERPINA1 (antitrypsin)
and SERPINA3 (antichymotrypsin) as well as the non-inhibitory hormone-transport molecules
SERPINA6 (corticosteroid-binding globulin) and SERPINA7 (thyroxine-binding globulin).
Clade B includes inhibitory molecules that function to prevent inappropriate activity
of cytotoxic apoptotic proteases (SERPINB6, also called PI6, and SERPINB9, also called
PI9) and inhibit papain-like enzymes (SERPINB3, squamous cell carcinoma antigen-1)
as well as the non-inhibitory molecule SERPINB5 (maspin). SERPINB5 does not undergo
the characteristic serpin-like conformational change and functions to prevent metastasis
in breast cancer and other cancers through an incompletely characterized mechanism
[10,27]. The roles of several other well characterized human serpins are also summarized
in Table 1.

Numerous important branches of the serpin superfamily remain to be functionally characterized.
For example, although plants have a large number of serpin genes, the function of
plant serpins remains obscure. Studies in vitro clearly show that plant serpins can function as protease inhibitors [28], but plants lack close relatives of chymotrypsin-like proteases, which would be the
obvious targets for these serpins. Thus, it has been suggested that plant serpins
may be involved in inhibiting proteases in plant pathogens; for example, they may
be targeting digestive proteases in insects [29]. One study convincingly demonstrated a close inverse correlation between the upregulation
of Cucurbita maxima (squash) phloem serpin-1 (CmPS) and aphid survival [30]. Feeding experiments in vitro showed, however, that purified CmPS did not affect insect survival [30]. Together, these data suggest that rather than directly interacting with the pathogen,
plant serpins, like their insect counterparts, may have a role in the complex pathways
involved in upregulating the host immune response.

Similarly, the role of serpins in prokaryotes remains to be understood; again, these
molecules are capable of inhibitory activity in vitro [20], but their targets in vivo and their function remain to be characterized. Interestingly, several inhibitory
prokaryote serpins are found in extremophiles that live at elevated temperatures (for
example, Pyrobaculum aerophilum, which lives at 100°C); these serpins use novel strategies to function as inhibitors
at elevated temperatures while resisting inappropriate conformational change [4,20,31].

Structural biology of the serpins and the mechanism of protease inhibition

Serpins are made up of three β sheets (A, B and C) and 8-9 α helices (termed hA-hI).
Figure 1a shows the native structure of the archetypal serpin SERPINA1 [32]. The region responsible for interaction with target proteases, the reactive center
loop (RCL), forms an extended, exposed conformation above the body of the serpin scaffold.
The remarkable conformational change characteristic of inhibitory serpins is depicted
in Figure 1d; the structure of SERPINA1 with its RCL cleaved [33] shows that, following proteolysis, the amino-terminal portion of the RCL inserts
into the center of β-sheet A to form an additional (fourth) strand (s4A). This conformational
transition is termed the 'stressed (S) to relaxed (R) transition', as the cleavage
of native inhibitory serpins results in a dramatic increase in thermal stability.
Native serpins are therefore trapped in an intermediate, metastable state, rather
than their most stable conformation, and thus represent a rare exception to Anfinsen's
conjecture, which predicts that a protein sequence will fold to a single structure
that represents the lowest free-energy state [34].

Figure 1. The structure and mechanism of inhibitory serpins. (a) The structure of native SERPINA1 (Protein Data Bank (PDB) code 1QLP) [32]. The A sheet
is in red, the B sheet in green and the C sheet in yellow; helices (hA-hI) are in
blue. The reactive center loop (RCL) is at the top of the molecule, in magenta. The
position of the breach and the shutter are labeled and the path of RCL insertion indicated
(magenta dashed line). Both of these regions contain several highly conserved residues,
many of which are mutated in various serpinopathies. (b) The Michaelis or docking complex between SERPINA1 and inactive trypsin (PDB code 1OPH)
[36], with the protease (multicolors) docked onto the RCL (magenta). Upon docking
with an active protease (b), two possible pathways are apparent. (c) The final serpin enzyme complex (PDB code 1EZX [12]). The serpin has undergone the
S-to-R transition, and the protease hangs distorted at the base of the molecule. (d) The structure of cleaved SERPINA1 is shown (PDB code 7API [93]) with the RCL (magenta)
forming the fourth strand of β-sheet A. The result of serpin substrate-like behavior
can be seen where the protease has escaped the conformational trap, leaving active
protease and inactive, cleaved serpin. Certain serpin mutations, particularly non-conservative
substitutions within the hinge region of the RCL, result in substrate-like, rather
than inhibitory, behavior [94].

Serpins use the S-to-R transition to inhibit target proteases. Figure 1b shows the structure of an initial docking complex between a serpin and a protease
(SERPINA1 and trypsin [35,36]) and Figure 1c shows the final serpin-enzyme complex [12]. These structural studies [12,35,36], combined with extensive biochemical data, revealed that RCL cleavage and subsequent
insertion is crucial for effective protease inhibition. In the final serpin-protease
complex, the protease remains covalently linked to the serpin, the enzyme being trapped
at the acyl-intermediate stage of the catalytic cycle. Structural comparisons show
that the protease in the final complex is severely distorted in comparison with the
native conformation, and that much of the enzyme is disordered [12]. In addition, a fluorescence study demonstrated that the protease was partially unfolded
in the final complex [37]. These conformational changes lead to distortion at the active site, which prevents
efficient hydrolysis of the acyl intermediate and the subsequent release of the protease.
These data are consistent with the observation that buried or cryptic cleavage sites
within trypsin become exposed following complex formation with a serpin [38]. It is possible that cleavage of such cryptic sites within the protease occurs in vivo and thus results in permanent enzyme inactivation. The absolute requirement for RCL
cleavage, however, means that serpins are irreversible 'suicide' inhibitors.
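
The competition between the two outcomes that follow RCL cleavage (trapping of the protease versus its escape, leaving cleaved serpin) can be illustrated with a simple partition calculation. The sketch below assumes a simplified branched scheme in which the acyl intermediate either proceeds to the trapped complex with rate constant k_inhibit or is hydrolyzed with rate constant k_substrate. Both parameter names and all numerical values are hypothetical, and the quantity computed (serpin molecules consumed per protease molecule trapped, often called the stoichiometry of inhibition) is a standard kinetic simplification rather than a result taken from this review.

```python
def fraction_trapped(k_inhibit: float, k_substrate: float) -> float:
    """Fraction of acyl intermediates that end as trapped serpin-protease complex.

    k_inhibit and k_substrate are hypothetical first-order rate constants for
    the inhibitory and substrate-like branches, in the same units.
    """
    if k_inhibit <= 0 or k_substrate < 0:
        raise ValueError("k_inhibit must be > 0 and k_substrate must be >= 0")
    return k_inhibit / (k_inhibit + k_substrate)


def stoichiometry_of_inhibition(k_inhibit: float, k_substrate: float) -> float:
    """Serpin molecules consumed per protease molecule trapped.

    Equals 1 + k_substrate / k_inhibit, the reciprocal of fraction_trapped.
    """
    return 1.0 / fraction_trapped(k_inhibit, k_substrate)
```

Under these assumptions, a serpin with no substrate-like branch (k_substrate = 0) consumes exactly one molecule per protease trapped, whereas a variant whose substrate branch is as fast as its inhibitory branch consumes two.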

A major advantage of the serpin fold over small protease inhibitors such as BPTI is
that the inhibitory activity of serpins can be exquisitely controlled by specific
cofactors. For example, human SERPINC1 (antithrombin) is a relatively poor inhibitor
of the proteases thrombin and factor Xa until it is activated by the cofactor heparin
[39]. Structural studies of SERPINC1 highlight the molecular basis for heparin function.
Figure 2a shows the structure of native SERPINC1. Here, we use the convention of Schechter
and Berger, in which residues on the amino-terminal side of the cleavage site (P1/P1')
are termed P2, P3, and so on, and those carboxy-terminal are termed P2', P3', and
so on; corresponding subsites in the enzyme are termed S1, S2, and so on [40]. The RCL is partially inserted into the top of β-sheet A; the residue (P1-Arg)
responsible for docking into the primary specificity pocket (S1) of the protease is
relatively inaccessible to docking with thrombin, as it is pointing towards and forming
interactions with the body of the serpin [41,42]. Figure 2b illustrates the ternary complex between SERPINC1, thrombin and heparin [43]. Upon interaction with a specific heparin pentasaccharide sequence present in high-affinity
heparin, SERPINC1 undergoes a substantial conformational rearrangement whereby the
RCL is expelled from β-sheet A and the P1 residue flips to an exposed protease-accessible
conformation [44-46]. In addition to loop expulsion and P1 exposure, long-chain heparin can bind both
enzyme and inhibitor and thus provides an additional acceleration of the inhibitory
interaction. Several other serpins, including SERPIND1 (heparin cofactor II), also
use cofactor binding and conformational change to achieve exquisite inhibitory control
[47].
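
The Schechter and Berger convention used in this section can be made concrete with a small labeling routine. This is a minimal sketch: the function name, the 0-based indexing of the P1 residue and the example hexapeptide are all assumptions made for illustration, and the peptide is not the RCL sequence of any real serpin.

```python
from typing import List, Tuple


def schechter_berger_labels(sequence: str, p1_index: int) -> List[Tuple[str, str]]:
    """Label each residue relative to the scissile bond.

    p1_index is the 0-based position of the P1 residue; the cleaved bond
    lies between sequence[p1_index] (P1) and sequence[p1_index + 1] (P1').
    """
    if not 0 <= p1_index < len(sequence) - 1:
        raise ValueError("P1 must be followed by at least one residue (P1')")
    labels = []
    for i, residue in enumerate(sequence):
        if i <= p1_index:
            # Amino-terminal side: P1, P2, P3, ... counting away from the bond.
            labels.append((f"P{p1_index - i + 1}", residue))
        else:
            # Carboxy-terminal side: P1', P2', ... counting away from the bond.
            labels.append((f"P{i - p1_index}'", residue))
    return labels


# Hypothetical hexapeptide with arginine at P1 (index 3).
example = schechter_berger_labels("GAVRSL", 3)
```

Here the residues are labeled P4, P3, P2, P1, P1' and P2' in order, so the arginine at index 3 is the P1 residue that would dock into the S1 pocket of the protease.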

Figure 2. Modulation of serpin conformation by cofactors. (a) The structure of native SERPINC1 (PDB code 2ANT) [95]. The partial insertion of the
RCL (two residues) into the top of β-sheet A is circled, and the position of the P1
residue is shown (magenta spheres). (b) The structure of the ternary complex between SERPINC1, inactive thrombin (the Ser195Ala
mutant) and a synthetic long-chain heparin construct (PDB code 1TB6) [43]. A specific
high-affinity pentasaccharide (green) on the heparin interacts with the heparin-binding
site on SERPINC1 (on and around helix hD) and promotes expulsion of the RCL (blue
arrow) and rearrangement of the P1 residue (magenta spheres).

Structural studies on prokaryote and viral serpins have revealed several interesting
variations of the serpin scaffold. Viral proteins are often 'stripped down' to a minimal
scaffold in order to minimize the size of the viral genome. Consistent with this requirement,
the structure of the viral serpin crmA, one of the smallest members of the serpin
superfamily [48,49], shows that it lacks helix hD. More recently, the structure of the prokaryote serpin
thermopin from Thermobifida fusca revealed the absence of helix hH [20,31]. These studies also showed that thermopin contains a 4 amino-acid insertion at the
carboxyl terminus that forms extensive interactions with conserved residues at the
top of β-sheet A (called the 'breach'; see later); biophysical data suggest that this
region is important for proper and efficient folding of this unusual serpin.

The major conformational change that occurs within both the protease and the serpin
as a result of serpin-enzyme complex formation provides an elegant mechanism for cells
to specifically detect and clear inactivated serpin-protease complexes. Several studies
have shown that the low density lipoprotein-related protein (LRP) specifically binds
to and promotes internalization of the final complexes SERPINC1-thrombin, SERPIND1-thrombin
and SERPINA1-trypsin. In contrast, native or cleaved serpins alone are not internalized
[50]. Additionally, recent studies on SERPINI1 show that both SERPINI1-tissue plasminogen
activator complexes and native SERPINI1 are internalized in an LRP-dependent manner.
However, while SERPINI1-tissue plasminogen activator complexes can bind directly to
LRP, native SERPINI1 requires the presence of an (as yet unidentified) cofactor [51]. The structural basis for interaction of LRP with serpin-enzyme complexes and the
subsequent intracellular signaling response remain to be fully understood. It is clear,
however, that native serpins and serpin-enzyme complexes can induce powerful responses
such as cell migration in an LRP-dependent manner [52].

The metastability of serpins and their ability to undergo controlled conformational
change also render these molecules susceptible to spontaneous conformational rearrangements.
Most notably, the serpin SERPINE1 (plasminogen activator inhibitor-1) uses spontaneous
conformational change to control inhibitory activity [53]. Structural and biochemical studies show that, in the absence of the cofactor vitronectin,
native SERPINE1 (Figure 3a) rapidly converts to a latent inactive state (Figure 3b). The transition to latency is accompanied by insertion of the RCL into β-sheet A,
where it cannot interact with the target protease. Interestingly, the structure of
SERPINE1 in complex with the somatomedin B domain of vitronectin [54] shows that the cofactor-binding site on SERPINE1 is located in a similar region to
the heparin-binding site of SERPINC1 (on and around helices hD and hE; Figure 3c). Whereas heparin promotes conformational change in SERPINC1, however, vitronectin
prevents conformational change in SERPINE1. Several other serpins, including SERPINC1,
have been shown to spontaneously undergo the transition to the latent state, and it
is suggested that this may be an important control mechanism [55].

Figure 3. Spontaneous conformational change in serpins. (a) Structure of native SERPINE1 (PDB code 1B3K) [96]. The RCL is in magenta and strand
s1C of β-sheet C is in yellow. (b) The structure of latent SERPINE1 (PDB code 1DVN) [53,97], which can form by spontaneous
conversion from the native protein. The RCL (magenta) is inserted into β-sheet A.
In order to enable full insertion of the RCL, s1C of β-sheet C (pale yellow) has peeled
off. In addition, conformational change in the strands s3C and s4C (pale green) is
indicated. (c) Structure of SERPINE1 (blue) in complex with the somatomedin B domain (green) of vitronectin
(PDB code 1OC0) [54]. The interaction with vitronectin locks SERPINE1 in the native,
active conformation.

Although the transition to latency could be an important control mechanism in at least
one serpin, an alternative spontaneous conformational change, serpin polymerization,
results in deficiency and disease (or serpinopathy) [14,56]. Serpin polymerization is postulated to occur via a domain-swapping event whereby
the RCL of one molecule docks into β-sheet A of another to form an inactive long-chain
serpin polymer (Figure 4a, b) [14,57-59]. Several important human serpin variants result in polymerization, the best studied
and most common of which is the Z allele (Glu342Lys) of SERPINA1 [14]. Here, failure to properly control the activity of neutrophil elastase (the inhibitory
target of SERPINA1) in the lung during the inflammatory response results in the destruction
of lung tissue, leading to emphysema. Furthermore, in individuals homozygous for the
Z-variant, the accumulation of serpin aggregates or polymers in the endoplasmic reticulum
of antitrypsin-producing cells, the hepatocytes, can eventually result in cell death
and liver cirrhosis [14]. Similarly, mutation of SERPINI1 results in the formation of neural inclusion bodies
and in the disease 'familial encephalopathy with neuroserpin inclusion bodies' (FENIB)
[17,60,61].

Figure 4. Structure of serpin polymers and other inactive conformers. (a) Schematic diagram of domain swapping in serpins; the RCL of one molecule (magenta
loop) is docked into β-sheet A (black lines) of the next (only four strands of β-sheet
A are shown). (b) Structure of a cleaved serpin polymer (PDB code 1D5S) [57], showing the promiscuous
nature of the RCL. Cleavage at the P5/P6 position has resulted in RCL (magenta) insertion
into β-sheet A; the 'gap' at the bottom of β-sheet A is filled with the P5-P1 portion
(pale pink) from an RCL from another molecule. (c) The structure of an alternative conformation of SERPINA3, δ-SERPINA3 (PDB code 1QMN)
[62]. Four residues of the RCL (magenta) are inserted into the top of β-sheet A. The
F-helix (green) has partially unwound and filled the bottom half of β-sheet A. (d) Serpins can accept a peptide with the sequence of the RCL (pale pink) into β-sheet
A (PDB code 1BR8) [98].

In addition to promoting polymerization, several serpin mutations have been identified
that promote formation of a disease-linked latent state. Notably, a mutation in SERPINC1,
the wibble variant (Thr85Met), results in formation of large amounts of circulating
latent SERPINC1 (about 10% of total SERPINC1) [55]. An alternative 'half-way house' conformation of SERPINA3, termed δ, has also been
identified (Figure 4c) [62]. The structure of δ-SERPINA3 also highlights the extraordinary flexibility of the
serpin scaffold: in this conformation the RCL is partially inserted into β-sheet A
and helix hF has partially unwound and inserted into the base of β-sheet A, completing
the β-sheet hydrogen bonding (Figure 4c). Finally, the promiscuity of β-sheet A is highlighted by the ability of this region
to readily accept short peptides: several structural and biochemical studies have
demonstrated that peptides can bind to β-sheet A and induce the S-to-R transition
(Figure 4d).

Valuable insights into the mechanism of serpin function have been gleaned from the
structural location of variants that promote serpin instability [18,63]. The majority of serpinopathy-linked mutations (including antitrypsin Siiyama [64] and Mmalton [65], antithrombin wibble [55] and δ-SERPINA3 [62]) cluster in the center of the serpin molecule, underneath β-sheet A, in a region
termed the shutter (marked on Figure 1a). Interestingly, Glu342, the position mutated in the Z allele of SERPINA1, is located
at the breach, which is just above the shutter at the top of β-sheet A. This portion
of the molecule is the point of initial RCL insertion. It is suggested that destabilization
of β-sheet A in either the shutter or the breach is sufficient to favor the transition
to a polymeric or latent state over maintenance of the monomeric metastable native
state [14]. Interestingly, analysis of conserved residues in the serpin superfamily also reveals
a striking distribution of highly conserved residues stretching down the center of
β-sheet A from the breach to the base of the molecule [1].

Unsurprisingly, given the important proteolytic processes they control, simple deficiencies
such as those caused by null mutations of a large number of human serpins are linked
to disease (some of these are summarized in Table 1). Interestingly, however, several (rare) mutations have been identified that do not
promote instability but instead interfere with the ability of the serpin to interact
correctly with proteases. These include the Enschede variant of SERPINF2 [66], in which insertion of an additional alanine in the RCL results in predominantly
substrate-like (rather than inhibitory) behavior upon interaction with a protease.
Mutations that alter serpin specificity can also have a devastating effect. For example,
the Pittsburgh variant of SERPINA1 (antitrypsin) is an effective thrombin inhibitor
as a result of mutation of the P1 methionine to an arginine [67]. The carrier of this variant died of a fatal bleeding disorder in childhood.

Our knowledge of the functional biochemistry and cell biology of serpins has been
shaped by extensive contributions from structural biology and genomics. The structure
of six different serpin conformations, together with analysis of numerous different
dysfunctional serpin variants, has allowed the characterization of a unique conformational
mechanism of protease inhibition. These data highlight the intrinsic advantages as
well as the dangers of structural complexity in protease inhibitors. On the one hand,
conformational mobility provides an inherently controllable mechanism of inhibition.
On the other, uncontrolled serpin conformational change may result in misfolding and
the development of specific serpinopathies. Serpins thus join a growing number of
structurally distinct molecules that can misfold and cause important degenerative
diseases, such as prions, polyglutamine regions of various proteins and the amyloid
proteins that form inclusions in Alzheimer's disease. While the mechanism of serpin
function is now structurally well characterized, the precise roles and biological targets
of many serpins remain to be understood.

Acknowledgements

Qingwei Zhang is a recipient of a Monash Graduate Scholarship. James Whisstock is
an NHMRC Senior Research Fellow and Monash University Logan Fellow. We thank the NHMRC
and the ARC for support.