Patent US5654147 - Method of hybridization using oligonucleotide probes

An improved method of hybridization with oligonucleotide probes using tetramethylammonium chloride is provided. The method is useful for screening mixtures of DNA sequences, including libraries of high DNA sequence complexity, with a single oligonucleotide probe or a pool of probes representing all possible codon choices for a short amino acid sequence.

Claims(5)

We claim:

1. An improved method of hybridization using an oligonucleotide probe to detect a DNA sequence complementary to the probe in a mixture of DNA sequences comprising:

(a) mixing an oligonucleotide probe with a mixture of DNA sequences under conditions to allow hybridization of the probe to a complementary DNA sequence in the mixture;

(b) washing the hybridized mixture of step (a) with an approximately 3M tetramethylammonium chloride wash solution at a temperature that dissociates non-complementary sequences from the probe; and

2. The method of claim 1, wherein the hybridized complementary DNA sequence of step (c) is an exact match of the probe.

3. The method of claim 1, wherein the hybridized complementary DNA sequence of step (c) differs from the probe by 1 or 2 base pairs.

4. The method of claim 1, wherein the oligonucleotide probe is a pool of probes.

5. The method of claim 4 wherein the pool of probes comprises probes from about 14 to about 20 nucleotides long specifying all possible degeneracy combinations for each codon choice.
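The pool of probes recited in claims 4 and 5 can be illustrated with a minimal sketch that enumerates every DNA sequence encoding a short amino acid sequence. The codon table is the standard genetic code. The function name `probe_pool` and the choice of the first five residues of the peptide AWAYFSDVDLEK (FIG. 16) are assumptions made for illustration only and are not stated to be the probe design actually used.

```python
from itertools import product

# Standard genetic code, grouped by amino acid (one-letter codes, DNA codons).
CODONS = {
    "A": ["GCT", "GCC", "GCA", "GCG"],
    "C": ["TGT", "TGC"],
    "D": ["GAT", "GAC"],
    "E": ["GAA", "GAG"],
    "F": ["TTT", "TTC"],
    "G": ["GGT", "GGC", "GGA", "GGG"],
    "H": ["CAT", "CAC"],
    "I": ["ATT", "ATC", "ATA"],
    "K": ["AAA", "AAG"],
    "L": ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"],
    "M": ["ATG"],
    "N": ["AAT", "AAC"],
    "P": ["CCT", "CCC", "CCA", "CCG"],
    "Q": ["CAA", "CAG"],
    "R": ["CGT", "CGC", "CGA", "CGG", "AGA", "AGG"],
    "S": ["TCT", "TCC", "TCA", "TCG", "AGT", "AGC"],
    "T": ["ACT", "ACC", "ACA", "ACG"],
    "V": ["GTT", "GTC", "GTA", "GTG"],
    "W": ["TGG"],
    "Y": ["TAT", "TAC"],
}


def probe_pool(peptide):
    """Return every DNA sequence (coding strand) that encodes the peptide."""
    choices = [CODONS[aa] for aa in peptide]
    return ["".join(codons) for codons in product(*choices)]


# Hypothetical example: the first five residues of AWAYFSDVDLEK.
pool = probe_pool("AWAYF")
print(len(pool), "probes, each", len(pool[0]), "nucleotides long")
```

The pool size is the product of the codon degeneracies of the residues, so peptides rich in tryptophan and methionine (one codon each) give the smallest pools for a given probe length.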

Description

This is a continuation of application Ser. No. 07/829,867 filed on 3 Feb. 1992, now U.S. Pat. No. 5,618,789 which is a divisional of Ser. No. 07/570,096 filed on 20 Aug. 1990, now U.S. Pat. No. 5,618,788, which is a continuation of Ser. No. 07/083,758 filed on 7 Aug. 1987, now U.S. Pat. No. 4,965,199, which is a continuation of Ser. No. 06/602,312 filed 20 Apr. 1984, now abandoned, which applications are incorporated herein by reference and to which applications priority is claimed under 35 USC §120.

FIELD OF THE INVENTION

The present invention relates to human factor VIII, to novel forms and compositions thereof and particularly to means and methods for the preparation of functional species of human factor VIII, particularly via recombinant DNA technology.

The present invention is based in part on the discovery of the DNA sequence and deduced amino acid sequence of human factor VIII as well as associated portions of the factor VIII molecule found in our hands to be functional bioactive moieties. This discovery was enabled by the production of factor VIII in various forms via the application of recombinant DNA technology, thus, in turn enabling the production of sufficient quality and quantity of materials with which to conduct biological testing and prove biological functionality. Having determined such, it is possible to tailor-make functional species of factor VIII via genetic manipulation and in vitro processing, arriving efficiently at hitherto unobtainable commercially practical amounts of active factor VIII products. This invention is directed to these associated embodiments in all respects.

The publications and other materials hereof used to illuminate the background of the invention, and in particular cases, to provide additional details concerning its practice are incorporated herein by reference and listed at the end of the specification in the form of a bibliography.

BACKGROUND OF THE INVENTION

The maintenance of an intact vascular system requires the interaction of a variety of cells and proteins. Upon injury to the vascular bed, a series of reactions is initiated in order to prevent fluid loss. The initial response is the activation of platelets, which adhere to the wound and undergo a series of reactions. These reactions include the attraction of other platelets to the site, the release of a number of organic compounds and proteins, and the formation of a thrombogenic surface for the activation of the blood coagulation cascade. Through this combined series of reactions, a platelet plug is formed sealing the wound. The platelet plug is stabilized by the formation of fibrin threads around the plug preventing unwanted fluid loss. The platelet plug and fibrin matrix are subsequently slowly dissolved as the wound is repaired. For a general review, see (1).

A critical factor in the arrest of bleeding is the activation of the coagulation cascade in order to stabilize the initial platelet plug. This system consists of over a dozen interacting proteins present in plasma as well as released and/or activated cellular proteins (2, 3). Each step in the cascade involves the activation of a specific inactive (zymogen) form of a protease to the catalytically active form. By international agreement (4), each protein of the cascade has been assigned a Roman numeral designation. The zymogen form of each is represented by the Roman numeral, while the activated form is represented by the Roman numeral followed by a subscript "a". The activated form of the protease at each step of the cascade catalytically activates the protease involved in the subsequent step in the cascade. In this manner a small initial stimulus resulting in the activation of a protein at the beginning of the cascade is catalytically amplified at each step such that the final outcome is the formation of a burst of thrombin, with the resulting thrombin catalyzed conversion of the soluble protein fibrinogen into its insoluble form, fibrin. Fibrin has the property of self-aggregating into threads or fibers which function to stabilize the platelet plug such that the plug is not easily dislodged.

FIG. 1 summarizes the current understanding of the interactions of the proteins involved in blood coagulation. The lack or deficiency of any of the proteins involved in the cascade would result in a blockage of the propagation of the initial stimulus for the production of fibrin. In the middle of the cascade represented in FIG. 1 is a step wherein factor IXa initiates the conversion of factor X to the activated form, factor Xa. Factor VIII (also synonymously referred to as factor VIIIC) is currently believed to function at this step, in the presence of phospholipid and calcium ions, as a cofactor; that is, it has no known function in itself, but is required to enhance the activity of factor IXa. This step in the cascade is critical since the two most common hemophilia disorders have been determined to be caused by the decreased functioning of either factor VIII (hemophilia A or classic hemophilia) or factor IX (hemophilia B). Approximately 80 percent of hemophilia disorders are due to a deficiency of factor VIII. The clinical manifestations in both types of disorders are the same: a lack of sufficient fibrin formation required for platelet plug stabilization, resulting in a plug which is easily dislodged with subsequent rebleeding at the site. The relatively high frequency of factor VIII and factor IX deficiency when compared with the other factors in the coagulation cascade is due to their genetic linkage to the X-chromosome. A single defective allele of the gene for factor VIII or factor IX results in hemophilia in males, who have only one copy of the X chromosome. The other coagulation factors are autosomally linked and generally require the presence of two defective alleles to cause a blood coagulation disorder, a much less common event. Thus, hemophilia A and B are by far the most common hereditary blood clotting disorders and they occur nearly exclusively in males.
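The effect of X-linkage on frequency can be made concrete with a short sketch. It assumes Hardy-Weinberg proportions and a recessive defective allele of frequency q, neither of which is stated in the text, and the value of q used below is purely hypothetical.

```python
def affected_fraction(q, x_linked):
    """Expected affected fraction by sex for a recessive allele of frequency q.

    Assumes Hardy-Weinberg proportions (an assumption for illustration).
    Males carry one X chromosome, so one defective X-linked allele suffices;
    every other case requires two defective alleles.
    """
    males = q if x_linked else q * q
    females = q * q
    return {"male": males, "female": females}


q = 1e-4  # hypothetical allele frequency, chosen only for illustration
x_linked = affected_fraction(q, x_linked=True)
autosomal = affected_fraction(q, x_linked=False)
print("X-linked:", x_linked)
print("autosomal:", autosomal)
print("male excess, X-linked over autosomal:", x_linked["male"] / autosomal["male"])
```

Under these assumptions the excess in males is 1/q, which is why a rare X-linked defect is far more frequent than an equally rare autosomal one and is seen almost only in males.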

Several decades ago the mean age of death of hemophiliacs was 20 years or younger. Between the early 1950's and the late 1960's, research into the factor VIII disorder led to the treatment of hemophilia A initially with whole plasma and, later, with concentrates of factor VIII. The only source for human factor VIII has been human plasma. One factor contributing to the expense is the cost associated with obtaining large amounts of usable plasma. Commercial firms must establish donation centers, reimburse donors, and maintain the plasma in a frozen state immediately after donation and through the shipment to the processing plant. The plasma samples are pooled into lots of over 1000 donors and processed. Due to the instability of the factor VIII activity, large losses are associated with the few simple purification procedures utilized to produce the concentrates (resulting in approximately a 15 percent recovery of activity). The resulting pharmaceutical products are highly impure, with a specific activity of 0.5 to 2 factor VIII units per milligram of protein (one unit of factor VIII activity is by definition the activity present in one milliliter of plasma). The estimated purity of factor VIII concentrate is approximately 0.04 percent factor VIII protein by weight. This high impurity level is associated with a variety of serious complications including precipitated protein, hepatitis, and possibly the agent responsible for Acquired Immune Deficiency Syndrome. These disadvantages of the factor VIII concentrates are due to the instability of the plasma derived factor VIII, to its low level of purity, and to its derivation from a pool of multiple donors. This means that should one individual out of the thousand donors have, for example, hepatitis, the whole lot would be tainted with the virus. Donors are screened for hepatitis B, but the concentrates are known to contain both hepatitis A and hepatitis non-A non-B. Attempts to produce a product of higher purity result in unacceptably large losses in activity, thereby increasing the cost.
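The specific activity and purity figures quoted above can be combined in a short worked calculation. The sketch below takes the stated values at face value and assumes that all of the measured activity resides in the factor VIII protein fraction; the function name is hypothetical.

```python
def implied_pure_specific_activity(units_per_mg_total, purity_fraction):
    """Units per mg of factor VIII protein implied by a concentrate's
    specific activity (units per mg total protein) and its purity by weight."""
    return units_per_mg_total / purity_fraction


purity = 0.04 / 100  # 0.04 percent factor VIII protein by weight
low = implied_pure_specific_activity(0.5, purity)   # lower stated activity
high = implied_pure_specific_activity(2.0, purity)  # upper stated activity
print(f"implied specific activity of pure factor VIII: {low:.0f} to {high:.0f} units/mg")

# One unit is the activity in one milliliter of plasma, so the same figures
# imply the mass of factor VIII protein per milliliter of plasma.
micrograms_per_ml_high = 1000.0 / low
micrograms_per_ml_low = 1000.0 / high
print(f"implied factor VIII in plasma: {micrograms_per_ml_low:.1f} to "
      f"{micrograms_per_ml_high:.1f} micrograms/ml")
```

On these figures the pure protein would have a specific activity on the order of 1250 to 5000 units per milligram, which corresponds to well under a microgram of factor VIII per milliliter of plasma and is consistent with the trace amounts referred to in the text.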

The history of purification of factor VIII illustrates the difficulty in working with this protein. This difficulty is due in large part to the instability and trace amounts of factor VIII contained in whole blood. In the early 1970's, a protein was characterized which was then believed to be factor VIII (5, 6, 7). This protein was determined to be an aggregate of a subunit glycoprotein, the subunit demonstrating a molecular weight of approximately 240,000 daltons as determined by SDS gel electrophoresis. This subunit aggregated into a heterogeneous population of higher molecular weight species ranging from between one million and twenty million daltons. The protein was present in hemophiliac plasma, but missing in plasma of patients with von Willebrand's disease, an autosomally transmitted genetic disorder characterized by a prolonged bleeding time and low levels of factor VIII (8). The theory then proposed was that this high molecular weight protein, termed von Willebrand factor (vWF) or factor VIII related antigen (FVIIIRAg), was responsible for the coagulation defect in both diseases, with the protein being absent in von Willebrand's disease and somehow non-functional in classic hemophilia disease states (9). However, it was later observed that under certain conditions, notably high salt concentrations, the factor VIII activity could be separated from this protein believed responsible for the activity of factor VIII (10-20). Under these conditions, the factor VIII coagulant activity exhibited a molecular weight of 100,000 to 300,000. Since this time, great effort has concentrated on identifying and characterizing the protein(s) responsible for the coagulant activity of factor VIII. However, the availability of but trace amounts of the protein in whole blood coupled with its instability have hampered such studies.

Efforts to isolate factor VIII protein(s) from natural sources, both human and animal, in varying states of purity, have been reported (21-27, 79). Because of the above mentioned problems, the possibility exists for the mistaken identification and subsequent cloning and expression of a contaminating protein in a factor VIII preparation rather than the factor VIII protein intended. That this possibility is real is emphasized by the previously mentioned mistaken identification of von Willebrand protein as being the factor VIII coagulant protein. Confusion over the identification of factor VIII-like activity is also a distinct possibility. Either factor Xa or thrombin would cause a shortening of the clotting time of various plasmas, including factor VIII deficient plasma, thereby appearing to exhibit factor VIII-like activity unless the proper controls were performed. Certain cells are also known to produce activities which can function in a manner very similar to that expected of factor VIII (28, 29, 30). The latter reference (30) proves that this factor VIII-like activity is in fact a protein termed tissue factor. The same or similar material has also been purified from human placenta (31). This protein functions, in association with the plasma protein factor VII, at the same step as factor VIII and factor IXa, resulting in the activation of factor X to factor Xa.

The burden of proof for expression of a recombinant factor VIII would therefore rest on the proof of functional expression of what is unquestionably a factor VIII activity. Even were prior workers to show that they obtained a full or partial clone encoding all or a portion of factor VIII, the technical problems in the expression of a recombinant protein which is four times larger than any other recombinant protein expressed to date could well have proven insurmountable to workers of ordinary skill.

SUMMARY OF THE INVENTION

The potential artifacts and problems described above combine to suggest the need for close scrutiny of any claims of successful cloning and expression of human factor VIII. The success of the present invention is evidenced by:

3) Identification of a genomic DNA corresponding to the factor VIII cDNA of the invention as being located in the X-chromosome, where factor VIII gene is known to be encoded.

4) Expression of a functional protein which exhibits:

a) Correction of factor VIII deficient plasma.

b) Activation of factor X to factor Xa in the presence of factor IXa, calcium and phospholipid.

c) Inactivation of the activity observed in a) and b) by antibodies specific for factor VIII.

d) Binding of the activity to an immobilized monoclonal antibody column specific for factor VIII.

e) Activation of the factor VIII activity by thrombin.

f) Binding of the activity to and subsequent elution from immobilized von Willebrand factor.

Thus, the present invention is based upon the successful use of recombinant DNA technology to produce functional human factor VIII, and in amounts sufficient to prove identification and functionality and to initiate and conduct animal and clinical testing as prerequisites to market approval. The product, human factor VIII, is suitable for use, in all of its functional forms, in the prophylactic or therapeutic treatment of human beings diagnosed to be deficient in factor VIII coagulant activity. Accordingly, the present invention, in one important aspect, is directed to methods of diagnosing and treating classic hemophilia (or hemophilia A) in human subjects using factor VIII and to suitable pharmaceutical compositions therefor.

The present invention further comprises essentially pure, functional human factor VIII. The product produced herein by genetically engineered appropriate host systems provides human factor VIII in therapeutically useful quantities and purities. In addition, the factor VIII hereof is free of the contaminants with which it is ordinarily associated in its non-recombinant cellular environment.

The present invention is also directed to DNA isolates as well as to DNA expression vehicles containing gene sequences encoding human factor VIII in expressible form, to transformant host cell cultures thereof, capable of producing functional human factor VIII. In still further aspects, the present invention is directed to various processes useful for preparing said DNA isolates, DNA expression vehicles, host cell cultures, and specific embodiments thereof. Still further, this invention is directed to the preparation of fermentation cultures of said cell cultures.

Further, the present invention provides novel polypeptides comprising moiety(ies) corresponding to functional segments of human factor VIII. These novel polypeptides may represent the bioactive and/or antigenic determinant segments of native factor VIII. For example, such polypeptides are useful for treating hemophiliacs per se, and particularly those who have developed neutralizing antibodies to factor VIII. In the latter instance, treatment of such patients with polypeptides bearing the requisite antigenic determinant(s) could effectively bind such antibodies, thereby increasing the efficiency of treatment with polypeptides bearing the bioactive portions of human factor VIII.

The factor VIII DNA isolates produced according to the present invention, encoding functional moiety(ies) of human factor VIII, find use in gene therapy, restoring factor VIII activity in deficient subjects by incorporation of such DNA, for example, via hematopoietic stem cells.

Particularly Preferred Embodiment

Human factor VIII is produced in functional form in a particularly suitable host cell system. This system comprises baby hamster kidney cells (BHK-21 (C-13), ATCC No. CCL 10) which have been transfected with an expression vector comprising DNA encoding human factor VIII, including 3'- and 5'- untranslated DNA thereof and joined at the 3'- untranslated region with 3'- untranslated terminator DNA sequence, e.g., from hepatitis B surface antigen gene. Expression of the gene is driven by transcriptional and translational control elements contributed by the adenovirus major late promoter together with its 5' spliced leader as well as elements derived from the SV40 replication origin region including transcriptional enhancer and promoter sequences. In addition, the expression vector may also contain a DHFR gene driven by an SV40 early promoter which confers gene amplification ability, and a selectable marker gene, e.g., neomycin resistance (which may be provided via cotransfection with a separate vector bearing neomycin resistance potential).

DESCRIPTION OF THE DRAWINGS

FIG. 1. Diagrammatic representation of the coagulation cascade (2).

FIG. 2. Melting of DNA in TMACl and 6× SSC. A: For each point ten duplicate aliquots of λ DNA were first bound to nitrocellulose filters. These filters were then hybridized without formamide at 37° C. as described in Methods. Pairs of spots were then washed in 6× SSC, 0.1 percent SDS (□) or 3.0M TMACl, 50 mM Tris HCl, pH 8.0, 0.1 percent SDS, 2 mM EDTA (○) in 2° C. increments from 38° to 56° C. The melting temperature is the point where 50 percent of the hybridization intensity remained. B: A melting experiment as in panel A was performed by binding aliquots of pBR322 DNA to nitrocellulose filters. Probe fragments of various lengths were generated by digestion of pBR322 with MspI, end-labeling of the fragments with 32P, and isolation on polyacrylamide gels. The probe fragments from 18 to 75 b were hybridized without formamide at 37° C. and those from 46 to 1374 b in 40 percent formamide at 37° C. as described in Methods. The filters were washed in 3.0M tetramethylammonium chloride (TMACl), 50 mM Tris HCl, pH 8.0, 0.1 percent SDS, 2 mM EDTA in 3° C. increments to determine the melting temperature. (○) melting temperature determined for pBR322 MspI probe fragments, (Δ) melting temperatures in 3.0M TMACl from panel A for 11-17 b probes.
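The melting temperature definition used in FIG. 2 (the wash temperature at which 50 percent of the hybridization intensity remains) can be sketched as a linear interpolation between the two wash temperatures that bracket the 50 percent point. The interpolation scheme, the function name and the intensity values below are assumptions for illustration; the intensities are invented and are not data from the figure.

```python
def melting_temperature(temps_c, fraction_remaining, threshold=0.5):
    """Interpolate the temperature at which the remaining signal falls to threshold.

    temps_c must be in increasing order; fraction_remaining holds the
    hybridization intensity at each wash temperature relative to the unwashed spot.
    """
    points = list(zip(temps_c, fraction_remaining))
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        if f0 >= threshold > f1:
            # Straight-line interpolation between the two bracketing washes.
            return t0 + (f0 - threshold) * (t1 - t0) / (f0 - f1)
    raise ValueError("signal never crosses the threshold in the measured range")


# Washes in 2 degree increments from 38 to 56 degrees C, as in panel A.
temps = list(range(38, 58, 2))
# Invented intensities, for illustration only.
signal = [1.00, 0.98, 0.95, 0.90, 0.80, 0.60, 0.40, 0.20, 0.08, 0.02]
tm = melting_temperature(temps, signal)
print(f"estimated melting temperature: {tm:.1f} C")
```

With washes spaced 2° C. apart, the estimate cannot be resolved much more finely than the increment itself, so the interpolated value is best read as a position within the bracketing interval.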

FIG. 4. Map of the Human Factor VIII Gene. The top line shows the positions and relative lengths of the 26 protein coding regions (Exons A to Z) in the Factor VIII gene. The direction of transcription is from left to right. The second line shows the scale of the map in kilobase pairs (kb). The locations of the recognition sites for the 10 restriction enzymes that were used to map the Factor VIII gene are given in the next series of lines. The open boxes represent the extent of human genomic DNA contained in each of the λ phage (λ114, λ120, λ222, λ482, λ599 and λ605) and cosmid (p541, p542, p543, p612, p613, p624) clones. The bottom line shows the locations of probes used in the genomic screens and referred to in the text: 1) 0.9 kb EcoRI/BamHI fragment from p543; 2) 2.4 kb EcoRI/BamHI fragment from λ222; 3) 1.0 kb NdeI/BamHI triplet of fragments from λ120; 4) oligonucleotide probe 8.3; 5) 2.5 kb StuI/EcoRI fragment from λ114; 6) 1.1 kb EcoRI/BamHI fragment from λ482; 7) 1.1 kb BamHI/EcoRI fragment from p542. Southern blot analysis of 46,XY and 49,XXXXY genomic DNA revealed no discernible differences in the organization of the Factor VIII gene.

FIG. 5. Cosmid vector pGcos4. The 403 b annealed HincII fragment of λcI857S7 (Bethesda Research Lab.) containing the cos site was cloned in pBR322 from AvaI to PvuII to generate the plasmid pGcos1. Separately, the 1624 b PvuII to NaeI fragment of pFR400 (49n), containing an SV40 origin and promoter, a mutant dihydrofolate reductase gene, and hepatitis B surface antigen termination sequences, was cloned into the pBR322 AhaIII site to generate the plasmid mp33dhfr. A three-part ligation and cloning was then performed with the 1497 b SphI to NdeI fragment of pGcos1, the 3163 b NdeI to EcoRV fragment of mp33dhfr, and the 376 b EcoRV to SphI fragment of pKT19 to generate the cosmid vector pGcos3. pKT19 is a derivative of pBR322 in which the BamHI site in the tetracycline resistance gene has been mutated by nitroguanosine treatment. pGcos4 was generated by cloning the synthetic 20mer, 5'AATTCGATCGGATCCGATCG, in the EcoRI site of pGcos3.

FIG. 6. Map of pESVDA. The 342 b PvuII-HindIII fragment of SV40 virus spanning the SV40 origin of replication and modified to be bounded by EcoRI sites (73), the polyadenylation site of hepatitis B virus (HBV) surface antigen (49n), contained on a 580 bp BamHI-BglII fragment, and the pBR322 derivative pML (75) have been previously described. Between the EcoRI site following the SV40 early promoter and the BamHI site of HBV was inserted the PvuII-HindIII fragment (map coordinates 16.63-17.06 of Adenovirus 2) containing the donor splice site of the first late leader (position 16.65) immediately followed by the 840 bp HindIII-SacI fragment of Adenovirus 2 (position 7.67-9.97) (49j), containing the E1b acceptor splice site at map position 9.83. Between the donor and acceptor sites lie unique BglII and HindIII sites for inserting genomic DNA fragments.

FIG. 7A-C. Analysis of RNA transcripts from pESVDA vectors. Confluent 10 cm dishes of COS-7 cells (77) were transfected with 2 μg plasmid DNA using the modified DEAE-dextran method (84) as described (73). RNA was prepared 4 days post-transfection from cytoplasmic extracts (49n) and electrophoresed in denaturing formaldehyde-agarose gels. After transfer to nitrocellulose, filters were hybridized with the appropriate 32P-labelled DNA as described in Methods. Filters were washed in 2× SSC, 0.2 percent SDS at 42° C. and exposed to Kodak XR5 film. The positions of the 28S and 18S ribosomal RNAs are indicated by arrows in each panel.

The 9.4 kb BamHI fragment of λ114 containing exon A (see FIG. 4) was cloned into the BglII site of pESVDA (FIG. 6). Plasmid pESVDA111.6 contains the fragment inserted in the orientation such that the SV40 early promoter would transcribe the genomic fragment in the proper (i.e., sense) direction. pESVDA111.7 contains the 9.4 kb BamHI fragment in the opposite orientation. Plasmid pESVDA.S127 contains the 12.7 kb SacI fragment of λ114 inserted (by blunt end ligation) into the BglII site of pESVDA in the same orientation as pESVDA111.6.

FIG. 8. Sequence of pESVDA.S127 cDNA clone S36. The DNA sequence of the human DNA insert is shown for the cDNA clone S36 obtained from the exon expression plasmid pESVDA.S127 (see infra for details). Vertical lines mark exon boundaries as determined by analysis of genomic and cDNA clones of factor VIII, and exons are lettered as in FIG. 4. Selected restriction endonuclease sites are indicated.

FIG. 9. cDNA cloning. Factor VIII mRNA is depicted on the third line with the open bar representing the mature protein coding region; the hatched area the signal peptide coding region, and adjacent lines the untranslated regions of the message. The 5' end of the mRNA is at the left. Above this line is shown the extent of the exon B region of the genomic clone λ222, and below the mRNA line are represented the six cDNA clones from which were assembled the full length factor VIII clone (see text for details). cDNA synthesis primers 1, 3, 4 and oligo(dT) are shown with arrows depicting the direction of synthesis for which they primed. Selected restriction endonuclease sites and a size scale in kilobases are included.

FIG. 10A-C. Sequence of Human Factor VIII Gene. The complete nucleotide sequence of the composite Factor VIII cDNA clone is shown with nucleotides numbered at the left of each line. Number one represents the A of the translation initiation codon ATG. Negative numbers refer to 5' untranslated sequence. (mRNA mapping experiments suggest that Factor VIII mRNA extends approximately 60 nucleotides farther 5' than position -109 shown here.) The predicted protein sequence is shown above the DNA. Numbers above the amino acids are S1-19 for the predicted signal peptide, and 1-2332 for the predicted mature protein. "Op" denotes the opal translation stop codon TGA. The 3' polyadenylation signal AATAAA is underlined and eight residues of the poly(A) tail (found in clone λc10.3) are shown. The sequence homologous to the synthetic oligonucleotide probe 8.3 has also been underlined (nucleotides 5557-5592). Selected restriction endonuclease cleavage sites are shown above the appropriate sequence. Nucleotides 2671-3217 represent sequence derived from genomic clones while the remainder represents cDNA sequence.

The complete DNA sequence of the protein coding region of the human factor VIII gene was also determined from the genomic clones we have described. Only two nucleotides differed from the sequence shown in this figure derived from cDNA clones (except for nucleotides 2671-3217). Nucleotide 3780 (underlined) is G in the genomic clone, changing the amino acid codon 1241 from asp to glu. Nucleotide 8728 (underlined) in the 3' untranslated region is A in the genomic clone.
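The numbering convention of FIG. 10 (nucleotide 1 is the A of the ATG, residues S1-19 form the signal peptide, and the mature protein is numbered from 1) can be checked with a short sketch. The function names are hypothetical; the only inputs taken from the text are the 19-residue signal peptide and the positions quoted above.

```python
SIGNAL_PEPTIDE_LENGTH = 19  # residues S1-S19 in FIG. 10


def codon_index(nucleotide):
    """1-based codon number counted from the ATG, for a coding nucleotide >= 1."""
    if nucleotide < 1:
        raise ValueError("position lies in the 5' untranslated region")
    return (nucleotide - 1) // 3 + 1


def position_in_codon(nucleotide):
    """Return 1, 2 or 3: the position of the nucleotide within its codon."""
    return (nucleotide - 1) % 3 + 1


def residue_label(nucleotide):
    """Amino acid label in the FIG. 10 convention (S1-S19, then 1 onward)."""
    codon = codon_index(nucleotide)
    if codon <= SIGNAL_PEPTIDE_LENGTH:
        return f"S{codon}"
    return str(codon - SIGNAL_PEPTIDE_LENGTH)


# Nucleotide 3780 is the difference noted between the cDNA and genomic clones.
print(residue_label(3780), position_in_codon(3780))
```

Nucleotide 3780 falls in the third position of mature-protein codon 1241, in agreement with the text. Asp codons (GAT, GAC) and glu codons (GAA, GAG) differ only at the third base, so a G at this position is consistent with the stated change from asp to glu.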

FIG. 11. Assembly of full length recombinant factor VIII plasmid. See the text section 8a for details of the assembly of the plasmid pSVEFVIII containing the full length of human factor VIII cDNA. The numbering of positions differs from those in the text and FIG. 10 by 72 bp.

FIG. 12. Assembly of the factor VIII expression plasmid. See the text section 8b for details of the assembly of the plasmid pAML3p.8c1 which directs the expression of functional human factor VIII in BHK cells.

FIG. 13. Western Blot analysis of factor VIII using fusion protein antisera. Human factor VIII was separated on a 5-10 percent polyacrylamide gradient SDS gel according to the procedure of (81). One lane of factor VIII was stained with silver (80). The remaining lanes of factor VIII were electrophoretically transferred to nitrocellulose for Western Blot analysis. Radiolabeled standards were applied into lanes adjacent to factor VIII in order to estimate the molecular weight of the observed bands. As indicated, the nitrocellulose strips were incubated with the appropriate antisera, washed, and probed with 125I protein A. The nitrocellulose sheets were subjected to autoradiography.

FIG. 15. Elution profile for high pressure liquid chromatography (HPLC) of factor VIII on a Toyo Soda TSK 4000 SW column. The column was equilibrated and developed at room temperature with 0.1 percent SDS in 0.1M sodium phosphate, pH 7.0.

FIG. 16. Elution profile for reverse phase HPLC separation of factor VIII tryptic peptides. The separation was performed on a Synchropak RP-P C-18 column (0.46 cm×25 cm, 10 microns) using a gradient elution of acetonitrile (1 percent to 70 percent in 200 minutes) in 0.1 percent trifluoroacetic acid. The arrow indicates the peak containing the peptide with the sequence AWAYFSDVDLEK.

FIG. 17. Thrombin activation of purified factor VIII activity. The cell supernatant was chromatographed on the C8 monoclonal resin, and dialyzed to remove elution buffer. Thrombin (25 ng) was added at time 0. Aliquots were diluted 1:3 at the indicated times and assayed for coagulant activity. Units per ml were calculated from a standard curve of normal human plasma.

DETAILED DESCRIPTION

A. Definitions

As used herein, "human factor VIII" denotes a functional protein capable, in vivo or in vitro, of correcting human factor VIII deficiencies, characterized, for example, by hemophilia A. The protein and associated activities are also referred to as factor VIIIC (FVIIIC) and factor VIII coagulant antigen (FVIIICAg)(31a). Such factor VIII is produced by recombinant cell culture systems in active form(s) corresponding to factor VIII activity native to human plasma. (One "unit" of human factor VIII activity has been defined as that activity present in one milliliter of normal human plasma.) The factor VIII protein produced herein is defined by means of determined DNA gene and amino acid sequencing, by physical characteristics and by biological activity.

Factor VIII has multiple degradation or processed forms in the natural state. These are proteolytically derived from a precursor, one chain protein, as demonstrated herein. The present invention provides such single chain protein and also provides for the production per se or via in vitro processing of a parent molecule of these various degradation products, and administration of these various degradation products, which have been shown also to be active. Such products contain functionally active portion(s) corresponding to native material.

Allelic variations likely exist. These variations may be demonstrated by one or more amino acid differences in the overall sequence or by deletions, substitutions, insertions or inversions of one or more amino acids in the overall sequence. In addition, the location of and degree of glycosylation may depend on the nature of the host cellular environment. Also, the potential exists, in the use of recombinant DNA technology, for the preparation of various human factor VIII derivatives, variously modified by resultant single or multiple amino acid deletions, substitutions, insertions or inversions, for example, by means of site directed mutagenesis of the underlying DNA. In addition, fragments of human factor VIII, whether produced in vivo or in vitro, may possess requisite useful activity, as discussed above. All such allelic variations, glycosylated versions, modifications and fragments resulting in derivatives of factor VIII are included within the scope of this invention so long as they contain the functional segment of human factor VIII and the essential, characteristic human factor VIII functional activity remains unaffected in kind. Such functional variants or modified derivatives are termed "human factor VIII derivatives" herein. Those derivatives of factor VIII possessing the requisite functional activity can readily be identified by straightforward in vitro tests described herein. From the disclosure of the sequence of the human factor VIII DNA herein and the amino acid sequence of human factor VIII, the fragments that can be derived via restriction enzyme cutting of the DNA or proteolytic or other degradation of human factor VIII protein will be apparent to those skilled in the art.

Thus, human factor VIII in functional form, i.e., "functional human factor VIII", is capable of catalyzing the conversion of factor X to Xa in the presence of factor IXa, calcium, and phospholipid, as well as correcting the coagulation defect in plasma derived from hemophilia A affected individuals, and is further classified as "functional human factor VIII" based on immunological properties demonstrating identity or substantial identity with human plasma factor VIII.

"Essentially pure form" when used to describe the state of "human factor VIII" produced by the invention means substantially free of protein or other materials ordinarily associated with factor VIII when isolated from non-recombinant sources, i.e. from its "native" plasma containing environment.

"DHFR protein" refers to a protein which is capable of exhibiting the activity associated with dihydrofolate reductase (DHFR) and which, therefore, is required to be produced by cells which are capable of survival on medium deficient in hypoxanthine, glycine, and thymidine (-HGT medium). In general, cells lacking DHFR protein are incapable of growing on this medium, and cells which contain DHFR protein are successful in doing so.

"Expression vector" includes vectors which are capable of expressing DNA sequences contained therein, where such sequences are operably linked to other sequences capable of effecting their expression. These expression vectors replicate in the host cell, either by means of an intact operable origin of replication or by functional integration into the cell chromosome. Again, "expression vector" is given a functional definition, and any DNA sequence which is capable of effecting expression of a specified DNA code disposed therein is included in this term as it is applied to the specified sequence. In general, expression vectors of utility in recombinant DNA techniques are often in the form of "plasmids" which refer to circular double stranded DNA loops. However, the invention is intended to include such other forms of expression vectors which serve equivalent functions.

"DNA isolate" means the DNA sequence comprising the sequence encoding human factor VIII, either itself or as incorporated into a cloning vector.

"Recombinant host cell" refers to cell/cells which have been transformed with vectors constructed using recombinant DNA techniques. As defined herein, factor VIII or functional segments thereof are produced in the amounts achieved by virtue of this transformation, rather than in such lesser amounts, and degrees of purities, as might be produced by an untransformed, natural host source. Factor VIII produced by such "recombinant host cells" can be referred to as "recombinant human factor VIII".

Size units for DNA and RNA are often abbreviated as follows: b=base or base pair; kb=kilo (one thousand) base or kilobase pair. For proteins we abbreviate: D=Dalton; kD=kiloDalton. Temperatures are always given in degrees Celsius.

B. Host Cell Cultures and Vectors

Useful recombinant human factor VIII may be produced, according to the present invention, in a variety of recombinant host cells. A particularly preferred system is described herein.

In general, prokaryotes are preferred for cloning of DNA sequences in constructing the vectors useful in the invention. For example, E. coli K12 strain 294 (ATCC No. 31446) is particularly useful. Other microbial strains which may be used include E. coli strains such as E. coli B, and E. coli X1776 (ATCC No. 31537), and E. coli c600 and c600hfl, E. coli W3110 (F-, λ-, prototrophic, ATCC No. 27325), bacilli such as Bacillus subtilis, and other enterobacteriaceae such as Salmonella typhimurium or Serratia marcescens, and various Pseudomonas species. These examples are, of course, intended to be illustrative rather than limiting.

In general, plasmid vectors containing replicon and control sequences which are derived from species compatible with the host cell are used in connection with these hosts. The vector ordinarily carries a replication site, as well as marking sequences which are capable of providing phenotypic selection in transformed cells. For example, E. coli is typically transformed using pBR322, a plasmid derived from an E. coli species (32). pBR322 contains genes for ampicillin and tetracycline resistance and thus provides easy means for identifying and selecting transformed cells. The pBR322 plasmid, or other microbial plasmid, must also contain, or be modified to contain, promoters which can be used by the microbial organism for expression of its own proteins. Those promoters most commonly used in recombinant DNA construction include the β-lactamase (penicillinase) and lactose promoter systems (33-35) and a tryptophan (trp) promoter system (36, 37). While these are the most commonly used, other microbial promoters have been discovered and utilized, and details concerning their nucleotide sequences have been published, enabling a skilled worker to ligate them functionally with plasmid vectors (38).

In addition to prokaryotes, eukaryotic microbes, such as yeast cultures, may also be used. Saccharomyces cerevisiae, or common baker's yeast, is the most commonly used among eukaryotic microorganisms, although a number of other strains are commonly available. For expression in Saccharomyces, the plasmid YRp7, for example, (39-41) is commonly used. This plasmid already contains the trp1 gene which provides a selection marker for a mutant strain of yeast lacking the ability to grow in tryptophan, for example, ATCC No. 44076 or PEP4-1 (42). The presence of the trp1 lesion as a characteristic of the yeast host cell genome then provides an effective environment for detecting transformation by growth in the absence of tryptophan.

Suitable promoting sequences in yeast vectors include the promoters for 3-phosphoglycerate kinase (43) or other glycolytic enzymes (44, 45), such as enolase, glyceraldehyde-3-phosphate dehydrogenase, hexokinase, pyruvate decarboxylase, phosphofructokinase, glucose-6-phosphate isomerase, 3-phosphoglycerate mutase, pyruvate kinase, triosephosphate isomerase, phosphoglucose isomerase, and glucokinase. In constructing suitable expression plasmids, the termination sequences associated with these genes are also ligated into the expression vector 3' of the sequence desired to be expressed to provide polyadenylation of the mRNA and termination. Other promoters, which have the additional advantage of transcription controlled by growth conditions, are the promoter regions for alcohol dehydrogenase 2, isocytochrome C, acid phosphatase, degradative enzymes associated with nitrogen metabolism, and the aforementioned glyceraldehyde-3-phosphate dehydrogenase, and enzymes responsible for maltose and galactose utilization. Any plasmid vector containing yeast-compatible promoter, origin of replication and termination sequences is suitable.

Use of cultures of cells derived from multicellular organisms as cell hosts is preferred, particularly for expression of underlying DNA to produce the functional human factor VIII hereof, and reference is particularly had to the preferred embodiment hereof. In principle, vertebrate cells are of particular interest, such as VERO and HeLa cells, Chinese hamster ovary (CHO) cell lines, and WI38, BHK, COS-7 and MDCK cell lines. Expression vectors for such cells ordinarily include (if necessary) (an) origin(s) of replication, a promoter located in front of the gene to be expressed, along with any necessary ribosome binding sites, RNA splice sites, polyadenylation site, and transcriptional terminator sequences.

For use in mammalian cells, the control functions on the expression vectors may be provided by viral material. For example, commonly used promoters are derived from polyoma, Simian Virus 40 (SV40) and most particularly Adenovirus 2. The early and late promoters of SV40 virus are useful as is the major late promoter of adenovirus as described above. Further, it is also possible, and often desirable, to utilize promoter or control sequences normally associated with the desired gene sequence, provided such control sequences are compatible with the host cell systems.

An origin of replication may be provided either by construction of the vector to include an exogenous origin, such as may be derived from adenovirus or other viral (e.g. Polyoma, SV40, VSV, BPV, etc.) source, or may be provided by the host cell chromosomal replication mechanism, if the vector is integrated into the host cell chromosome.

In selecting a preferred host cell for transfection by the vectors of the invention which comprise DNA sequences encoding both factor VIII and DHFR protein, it is appropriate to select the host according to the type of DHFR protein employed. If wild type DHFR protein is employed, it is preferable to select a host cell which is deficient in DHFR, thus permitting the use of the DHFR coding sequence as a marker for successful transfection in selective medium which lacks hypoxanthine, glycine, and thymidine.

On the other hand, if DHFR protein with low binding affinity for MTX is used as the controlling sequence, it is not necessary to use DHFR resistant cells. Because the mutant DHFR is resistant to methotrexate, MTX-containing media can be used as a means of selection provided that the host cells themselves are methotrexate sensitive. Most eukaryotic cells which are capable of absorbing MTX appear to be methotrexate sensitive.

Alternatively, a wild type DHFR gene may be employed as an amplification marker in a host cell which is not deficient in DHFR provided that a second drug selectable marker is employed, such as neomycin resistance.

Examples which are set forth hereinbelow describe use of BHK cells as host cells and expression vectors which include the adenovirus major late promoter.

C. General Methods

If cells without formidable cell wall barriers are used as host cells, transfection is carried out by the calcium phosphate precipitation method (46). However, other methods for introducing DNA into cells such as by nuclear injection or by protoplast fusion may also be used.

If prokaryotic cells or cells which contain substantial cell wall constructions are used, the preferred method of transfection is calcium treatment using calcium chloride (47).

Construction of suitable vectors containing the desired coding and control sequences employs standard ligation techniques. Isolated plasmids or DNA fragments are cleaved, tailored, and religated in the form desired to form the plasmids required.

Cleavage is performed by treating with restriction enzyme (or enzymes) in suitable buffer. In general, about 1 μg plasmid or DNA fragments are used with about 1 unit of enzyme in about 20 μl of buffer solution for 1 hour. (Appropriate buffers and substrate amounts for particular restriction enzymes are specified by the manufacturer. Likewise, standard conditions for use of T4 ligase, T4 polynucleotide kinase and bacterial alkaline phosphatase are provided by the manufacturer.) After incubations, protein is removed by extraction with phenol and chloroform, and the nucleic acid is recovered from the aqueous fraction by precipitation with ethanol. Standard laboratory procedures are available (48).

each four deoxynucleoside triphosphates and 8 units DNA polymerase Klenow fragment at 24° C. for 30 minutes. The reaction was terminated by phenol and chloroform extraction and ethanol precipitation, or

Synthetic DNA fragments were prepared by known phosphotriester (47a) or phosphoramidite (47b) procedures. DNA is subject to electrophoresis in agarose or polyacrylamide slab gels by standard procedures (48) and fragments were purified from gels by electroelution (48a). DNA "Southern" blot hybridization followed the (49a) procedure.

For the λ/4X library, five 50 μg aliquots of the 49,XXXXY DNA were digested in a 1 ml volume with Sau3AI concentrations of 3.12, 1.56, 0.782, 0.39, and 0.195 U/ml for 1 hr at 37° C. Test digestion and gel analysis had shown that under these conditions at 0.782 U/ml Sau3AI, the weight average size of the DNA was about 30 kb; thus these digests generate a number average distribution centered at 15 kb. DNA from the 5 digests was pooled, phenol and chloroform extracted, ethanol precipitated and electrophoresed on a 6 g/l low-gelling temperature horizontal agarose gel (48) (Seaplaque agarose, FMC Corporation), in two 5.6×0.6×0.15 cm slots. The 12-18 kb region of the gel was cut out and the DNA purified by melting the gel slice as described in (48).

Charon 30 arms were prepared by digesting 50 μg of the vector with BamHI and isolating the annealed 31.9 kb arm fragment from a 6 g/l low-gelling temperature agarose gel as described above. For construction of the λ/4X library, the optimal concentration of Charon 30 BamHI arms and 12-18 kb Sau3A partial 49,XXXXY DNA was determined as described (48). The ligated DNA was packaged with an in vitro extract, "packagene" (Promega Biotec, Inc., Madison, Wis.). In a typical reaction about 1.3 μg of Charon 30 BamHI arms were ligated to 0.187 μg of 12-18 kb Sau3A insert DNA in a 10 μl volume. Packaging and plating of the DNA gave about 1.3×10^6 phage plaques. To generate the λ/4X library, 1.7×10^6 phage were plated at 17000 phage per 150 cm plate. These plates were grown overnight, scraped into 10 mM Tris HCl, pH 7.5, 0.1M NaCl, 10 mM MgCl2, 0.5 g/l gelatin, and centrifuged briefly, to amplify the phage. Generally, a suitable number (0.5-2×10^6) of these phage were plated out and screened (48). In some cases the ligated and in vitro packaged phage were screened directly without amplification.
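The quantities in the typical ligation and plating described above can be checked with a short calculation. The following is a minimal sketch only: the 15 kb insert length is an assumed midpoint of the 12-18 kb fraction, and the function name is hypothetical.

```python
# Back-of-envelope check of the library numbers quoted in the text.
# ASSUMPTION: 15 kb is used as a representative insert length for the
# 12-18 kb fraction; the source gives only the size range.

def relative_moles(mass_ug: float, size_kb: float) -> float:
    """Relative molar amount of a linear DNA fragment (mass divided by length)."""
    return mass_ug / size_kb

arms = relative_moles(1.3, 31.9)       # Charon 30 BamHI arms, 31.9 kb
insert = relative_moles(0.187, 15.0)   # Sau3A partial-digest insert, assumed 15 kb
arm_to_insert_ratio = arms / insert

# 1.7x10^6 phage plated at 17000 phage per plate
plates_needed = 1_700_000 // 17_000

print(f"arm:insert molar ratio is roughly {arm_to_insert_ratio:.1f}:1")
print(f"plates needed: {plates_needed}")
```

Under the assumed insert length, the arms are in roughly three-fold molar excess over insert, and the stated plating density corresponds to 100 plates.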

For the isolation of λ482, a clone containing a 22 kb BclI fragment of the Factor VIII genome, the BamHI arm fragments of the vector λ1059 (49) were isolated by gel electrophoresis. Separately, 100 μg of DNA from the 49,XXXXY cell line was digested with BclI and the 20-24 kb fraction isolated by gel electrophoresis. About 0.8 μg of λ1059 arms fragments and 5 percent of the isolated BclI DNA were ligated in a volume of 10 μl (48) to generate 712,000 plaques. Four hundred thousand of those were screened in duplicate with a 2.2 kb StuI/EcoRI probe of λ114.

The cosmid/4X library was generated from the 49,XXXXY DNA used to generate the λ/4X library, except that great care was used in the DNA isolation to avoid shearing or other breakage. The DNA was partially cleaved with five concentrations of Sau3AI and the pooled DNA sized on a 100 to 400 g/l sucrose gradient (49). The fractions containing 35-45 kb DNA were pooled, dialyzed, and ethanol precipitated. Arm fragments of the cosmid vector pGcos4 were prepared following the principles described elsewhere (50). In brief, two separate, equal aliquots of pGcos4 were cut with SstI (an isoschizomer of SacI) or SalI and then treated with bacterial alkaline phosphatase. These aliquots were then phenol and chloroform extracted, pooled, ethanol precipitated and cut with BamHI. From this digest two arm fragments of 4394 and 4002 b were isolated from a low-gelling temperature agarose gel. These arm fragments were then ligated to the isolated, 40 kb Sau3AI partial digest DNA. In a typical reaction, 0.7 μg of pGcos4 arm fragments were ligated to 1 μg of 40 kb human 4X DNA in a volume of 10 μl (48). This reaction was then packaged in vitro and used to infect E. coli HB101, a recA- strain (48). This reaction generated about 120,000 colonies when plated on tetracycline containing plates. About 150,000 cosmids were screened on 20 150-mm plates in duplicate as described, with overnight amplification on chloramphenicol-containing plates (48).

Double-stranded cDNA was prepared as previously described (36, 67) employing either oligo(dT)12-18 or synthetic deoxyoligonucleotide 16-mers as primers for first-strand synthesis by reverse transcriptase. Following isolation by polyacrylamide gels, cDNA of the appropriate size (usually 600 bp or greater) was either C-tailed with terminal transferase, annealed together with G-tailed PstI-digested pBR322 and transformed into E. coli strain DH1 (76), or ligated with a 100-fold molar excess of synthetic DNA EcoRI adaptors, reisolated on a polyacrylamide gel, inserted by ligation in EcoRI-digested λGT10, packaged into phage particles and propagated on E. coli strain C600hfl (68). As a modification of existing procedures an adaptor consisting of a complementary synthetic DNA 18-mer and 22-mer (5'-CCTTGACCGTAAGACATG and 5'-AATTCATGTCTTACGGTCAAGG) was phosphorylated at the blunt terminus but not at the EcoRI cohesive terminus to permit efficient ligation of the adaptor to double-stranded cDNA in the absence of extensive self-ligation at the EcoRI site. This effectively substituted for the more laborious procedure of ligating self-complementary EcoRI linkers to EcoRI methylase-treated double-stranded cDNA, and subsequently removing excess linker oligomers from the cDNA termini by EcoRI digestion. To improve the efficiency of obtaining cDNA clones >3500 bp extending from the poly(A) to the nearest existing 3' factor VIII probe sequences made available by genomic cloning (i.e., exon A), second-strand cDNA synthesis was specifically primed by including in the reaction a synthetic DNA 16-mer corresponding to a sequence within exon B on the mRNA sense strand.
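The design of the 18-mer/22-mer adaptor can be illustrated by checking that the two oligonucleotides anneal to give one blunt terminus and one 5'-AATT single-stranded extension, the cohesive end left by EcoRI. This is a minimal sketch; the function and variable names are hypothetical, and only the two sequences come from the text.

```python
# Check that the 18-mer pairs with the 3' portion of the 22-mer, leaving a
# 4-base 5' extension on the 22-mer (the EcoRI cohesive terminus).

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

oligo_18 = "CCTTGACCGTAAGACATG"       # 5' to 3', from the text
oligo_22 = "AATTCATGTCTTACGGTCAAGG"   # 5' to 3', from the text

duplex_region = reverse_complement(oligo_18)
anneals = oligo_22.endswith(duplex_region)
overhang = oligo_22[: len(oligo_22) - len(oligo_18)]

print("18-mer pairs with 3' end of 22-mer:", anneals)
print("5' single-stranded extension:", overhang)
```

The unpaired extension matches an EcoRI cohesive end, which is consistent with the adaptor being ligated to cDNA through its blunt terminus and to EcoRI-digested λGT10 through the other.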

D. Adenovirus Subcloning

Adenovirus 2 DNA was purchased from Bethesda Research Laboratories (BRL). The viral DNA was cleaved with HindIII and electrophoresed through a 5 percent polyacrylamide gel (TBE buffer). The region of the gel containing the HindIII B fragment (49j) was excised and the DNA electroeluted from the gel. After phenol-chloroform extraction, the DNA was concentrated by ethanol precipitation and cloned into HindIII-cleaved pUC13 (49k) to generate the plasmid pAdHindB. This HindIII subclone was digested with HindIII and SalI, and a fragment was isolated spanning adenoviral coordinates 17.1-25.9 (49j). This fragment was cloned into HindIII, SalI cleaved pUC13 to generate the plasmid pUCHS. From pAdHindB the SalI to XhoI fragment, coordinates 25.9-26.5, was isolated and cloned into pUCHS at the unique SalI site to create pUCHSX. This plasmid reconstructs the adenoviral sequences from position 17.1 within the first late leader intervening sequence to the XhoI site at position 26.5 within the third late leader exon.

The adenovirus major late promoter was cloned by excising the HindIII C, D, and E fragments (which comigrate) from the acrylamide gel, cloning them into pUC13 at the HindIII site, and screening for recombinants containing the HindIII C fragment by restriction analysis. This subclone was digested with SacI, which cleaved at position 15.4, 5' of the major late promoter (49j) as well as within the polylinker of pUC13. The DNA was recircularized to form pMLP2, containing the SacI to HindIII fragment (positions 15.4-17.1) cloned in the SacI and HindIII sites of pUC13.

E. Construction of Neomycin Resistance Vector

The neomycin resistance marker contained within E. coli transposon 5 was isolated from a Tn5 containing plasmid (49l). The sequence of the neomycin resistance gene has been previously published (49m). The neo fragment was digested with BglII, which cleaves at a point 36 bp 5' of the translational initiation codon of the neomycin phosphotransferase gene, and treated with exonuclease Bal31. The phosphotransferase gene was excised with BamHI, which cleaves the DNA 342 bp following the translational termination codon, and inserted into pBR322 between a filled-in HindIII site and the BamHI site. One clone, pNeoBal6, had the translational initiation codon situated 3 bp 3' of the filled in HindIII site (TCATCGATAAGCTCGCATG . . . ). This plasmid was digested with ClaI and BamHI, whereupon the 1145 bp fragment spanning the phosphotransferase gene was isolated and inserted into the mammalian expression vector pCVSVEHBS (see infra.). The resultant plasmid, pSVENeoBal6, situates the neomycin phosphotransferase gene 3' of the SV40 early promoter and 5' of the polyadenylation site of the HBV surface antigen gene (49n). When introduced into mammalian tissue culture cells, this plasmid is capable of expressing the phosphotransferase gene and conferring resistance to the aminoglycoside G418 (49o).

F. Transfection of Tissue Culture Cells

The BHK-21 cells (ATCC) are vertebrate cells grown in tissue culture. These cells, as is known in the art, can be maintained as permanent cell lines prepared by successive serial transfers from isolated normal cells. These cell lines are maintained either on a solid support in liquid medium, or by growth in suspensions containing support nutrients.

The cells are transfected with 5 μg of desired vector (4 μg pAML3P.8c1 and 1 μg pSVENeoBal6) as prepared above using the method of (49p).

The method ensures the interaction of a collection of plasmids with a particular host cell, thereby increasing the probability that if one plasmid is absorbed by a cell, additional plasmids would be absorbed as well (49q). Accordingly, it is practicable to introduce both the primary and secondary coding sequences using separate vectors for each, as well as by using a single vector containing both sequences.

G. Growth of Transfected Cells and Expression of Peptides

The BHK cells which were subjected to transfection as set forth above were first grown for two days in non-selective medium, then the cells were transferred into medium containing G418 (400 μg/ml), thus selecting for cells which are able to express the plasmid phosphotransferase. After 7-10 days in the presence of the G418, colonies became visible to the naked eye. Trypsinization of the several hundred colonies and replating allowed the rapid growth of a confluent 10 cm dish of G418 resistant cells.

This cell population consists of cells representing a variety of initial integrants. In order to obtain cells which possessed the greatest number of copies of the FVIII expression plasmid, the cells were next incubated with an inhibitor of the DHFR protein.

H. Treatment with Methotrexate

The G418 resistant cells are inhibited by methotrexate (MTX), a specific inhibitor of DHFR, at concentrations greater than 50 nM. Consistent with previous studies on the effects of MTX on tissue culture cells, cells resistant to MTX by virtue of expression of the multiple copies of the DHFR gene contained within the FVIII expression vector are selected for, and a concomitant increase in expression of the FVIII encoding sequences can be observed. By stepwise increasing the amount of MTX, amplification of the plasmid pAML3P.8c1 is effected, thus increasing the copy number. The upper limit of the amplification is dependent upon many factors; however, cells resistant to millimolar concentrations of MTX possessing hundreds or thousands of copies of the DHFR expression (and thus the FVIII expression) plasmid may be selected in this manner.

For Factor VIII expression, G418-resistant BHK cells which arose after transfection with pAML3P.8c1 and pSVENeoBal6 were incubated with media containing 100 nM and 250 nM MTX as described (49r). After 7-10 days, cells resistant to 250 nM MTX were assayed for Factor VIII expression by activity, radioimmunoassay and mRNA Northern analysis.

I. Factor VIII antibodies

A variety of polyclonal and monoclonal antibodies to Factor VIII were used throughout this work. CC is a polyclonal antibody derived from the plasma of a severely affected hemophiliac (49s). C8 is a neutralizing monoclonal antibody which binds to the 210 kD portion of Factor VIII (49t). C10 is a monoclonal antibody with properties similar to C8 and was isolated essentially as described by (49t). A commercial neutralizing monoclonal antibody which binds the 80 kD portion of Factor VIII was obtained from Synbiotic Corp., San Diego Calif., Product No. 10004. C7F7 is a neutralizing monoclonal antibody that binds to the 80 kD portion of Factor VIII. C7F7 was induced and purified as follows: Six-week-old female BALB/c mice were multiply inoculated with approximately 10 μg of purified Factor VIII and splenocytes fused with X63-Ag8.653 mouse myeloma cells (49u) three days after the final inoculation. The hybridization procedure and isolation of hybrid cells by cloning methods followed previously described protocols (49r). Specific antibody producing clones were detected by solid phase RIA procedures (49w). Positive clones were subsequently assayed for coagulation prolongation capacity by APTT assay described above. Monoclonal C7F7 was expanded by growth in syngeneic animals; antibody was purified from ascites fluids by protein A-Sepharose CL-4B chromatography (49x).

J. Radioimmune Assays for Factor VIII

Two radioimmune assays (RIA) were developed to assay Factor VIII produced from BHK and other cell lines. Both are two stage assays in which the CC antibody bound to a solid support is used to bind Factor VIII (49t). This immune complex is then detected with I125 labeled C10 antibody (210 kD specific) or I125 labeled C7F7 antibody (80 kD specific).

Briefly, the two-stage RIAs are performed as follows: the 96 wells of a microtiter dish are coated overnight with 100 μl of 50 mM NaHCO3 buffer, pH 9.6 containing 2.5 mg/l of CC antibody which has been purified by protein A-sepharose chromatography (49x). The wells are washed three times with 200 μl of PBS containing 0.05 percent Tween 20 and blocked with 200 μl of PBS containing 0.1 percent gelatin and 0.01 percent merthiolate for 1 to 2 hours. The wells are washed as before and 100 μl of sample added and incubated overnight. The wells are washed and 100 μl of I125 labeled (82) C10 or C7F7 antibody (1000 cpm/μl) added and incubated 6 to 8 hours. The wells are washed again and counted. The standard curve is derived from samples of normal plasma diluted 1:10 to 1:320.

K. Factor VIII Monoclonal Antibody Column

A human factor VIII monoclonal antibody column was prepared by incubation of 1.0 mg of C8 antibody (in 0.1M NaHCO3, pH8.5) with 1.0 ml of Affi-Gel 10 (Bio-Rad Laboratories, Richmond, Calif.) for four hours at 4° C. Greater than 95 percent of the antibody was coupled to the gel, as determined by the Bio-Rad Protein Assay (Bio-Rad Laboratories). The gel was washed with 50 volumes of water and 10 volumes of 0.05M imidazole, pH6.9, containing 0.15M NaCl.

E. coli containing the plasmids constructed for fusion protein expression were grown in M-9 media at 37° C. Fusion protein expression was induced by the addition of indole acrylic acid at a final concentration of 50 μg/mL for time periods of 2.5 to 4 hours. The cells were harvested by centrifugation and frozen until use.

The cell pellets for fusion 3 were suspended in 100 mL of 20 mM sodium phosphate, pH 7.2, containing 10 μg/mL lysozyme and 1 μg/mL each of RNase and DNase. The suspension was stirred for 30 minutes at room temperature to thoroughly disperse the cell pellet. The suspension was then sonicated for four minutes (pulsed at 60 percent power). The solution was centrifuged at 8000 rpm in a Sorvall RC-2B centrifuge in a GSA rotor. The pellet was resuspended in 100 mL of 0.02M sodium phosphate, pH 7.2. The suspension was layered over 300 mL of 60 percent glycerol. The sample was centrifuged at 4000 rpm for 20 minutes in an RC-3B centrifuge. Two layers resulted in the glycerol. Both pellet and the bottom glycerol layer showed a single protein band of the expected molecular weight of 25,000 daltons when analyzed on SDS polyacrylamide gels. The pellet was dissolved in 0.02M sodium phosphate buffer containing 0.1 percent SDS. The resuspended pellet and the lower glycerol layer were dialyzed against 0.02M ammonium bicarbonate, pH 8.0, to remove glycerol. The solution was lyophilized and redissolved in 0.01M sodium phosphate buffer containing 0.1 percent SDS, and frozen until use.

The cell pellets for fusion proteins 1 and 4 were suspended in 0.05M Tris, pH 7.2, containing 0.3M sodium chloride and 5 mM EDTA. Lysozyme was added to a concentration of 10 μg/mL. Samples were incubated for 5 minutes at room temperature. NP-40 was added to 0.2 percent and the suspension incubated in ice for 30 minutes. Sodium chloride was added to yield a final concentration of 3M and DNase added (1 μg/mL). The suspension was incubated 5 minutes at room temperature. The sample was centrifuged and the supernatant discarded. The pellet was resuspended in a small volume of water and recentrifuged. The cell pellets were dissolved in solutions containing 0.1 percent to 1 percent SDS and purified by either preparative SDS polyacrylamide gel electrophoresis followed by electroelution of the fusion protein band, or by HPLC on a TSK 3000 column equilibrated with 0.1M sodium phosphate containing 0.1 percent SDS.

Rabbit antisera were produced by injecting New Zealand white rabbits with a sample of fusion protein suspended in Freund's complete adjuvant (first injection) followed by boosts at two week intervals using the sample suspended in Freund's incomplete adjuvant. After six weeks, sera were obtained and analyzed by Western Blot analysis for reactivity with human plasma derived factor VIII proteins.

N. Assays for Detection of Expression of Factor VIII Activity

Correction of Hemophilia A plasma

Theory--Factor VIII activity is defined as that activity which will correct the coagulation defect of factor VIII deficient plasma. One unit of factor VIII activity has been defined as that activity present in one milliliter of normal human plasma. The assay is based on observing the time required for formation of a visible fibrin clot in plasma derived from a patient diagnosed as suffering from hemophilia A (classic hemophilia). In this assay, the shorter the time required for clot formation, the greater the factor VIII activity in the sample being tested. This type of assay is referred to as activated partial thromboplastin time (APTT). Commercial reagents are available for such determinations (for example, General Diagnostics Platelin Plus Activator; product number 35503).

Procedure--All coagulation assays were conducted in 10×75 mm borosilicate glass test tubes. Siliconization was performed using SurfaSil (product of Pierce Chemical Company, Rockford, Ill.) which had been diluted 1 to 10 with petroleum ether. The test tubes were filled with this solution, incubated 15 seconds, and the solution removed. The tubes were washed three times with tap water and three times with distilled water.

Platelin Plus Activator (General Diagnostics, Morris Plains, N.J.) was dissolved in 2.5 ml of distilled water according to the directions on the packet. To prepare the sample for coagulation assays, the Platelin Plus Activator solution was incubated at 37° C. for 10 minutes and stored on ice until use. To a siliconized test tube was added 50 microliters of Platelin Plus Activator and 50 microliters of factor VIII deficient plasma (George King Biomedical Inc, Overland Park, Kans.). This solution was incubated at 37° C. for a total of nine minutes. Just prior to the end of the nine minute incubation of the above solution, the sample to be tested was diluted into 0.05M Tris-HCl, pH 7.3, containing 0.02 percent bovine serum albumin. To the plasma/activator suspension was added 50 microliters of the diluted sample, and, at exactly nine minutes into the incubation of the suspension, the coagulation cascade was initiated by the addition of 50 microliters of calcium chloride (0.033M). The reaction mixture was quickly mixed and, with gentle agitation of the test tube, the time required for the formation of a visible fibrin clot was monitored. A standard curve of factor VIII activity can be obtained by diluting normal plasma (George King Biomedical, Inc., Overland Park, Kans.) 1:10, 1:20, 1:50, 1:100, and 1:200. The clotting time is plotted versus plasma dilution on semilog graph paper. This can then be used to convert a clotting time into units of factor VIII activity.
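The semilog standard curve described above can be sketched numerically as a straight-line fit of clotting time against the logarithm of factor VIII activity, which is then inverted to read activity from a clotting time. The clotting times below are hypothetical values chosen for illustration and are not data from this work; the function names are likewise hypothetical. The dilutions are those named in the text, with undiluted normal plasma taken as 1 unit per milliliter.

```python
import math

# Dilutions of normal plasma named in the text (1:10 ... 1:200).
dilutions = [10, 20, 50, 100, 200]
# HYPOTHETICAL clotting times in seconds, for illustration only.
clot_times = [60.0, 66.0, 74.0, 80.0, 86.0]

# Activity in units/ml, taking undiluted normal plasma as 1 unit/ml.
log_units = [math.log10(1.0 / d) for d in dilutions]

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

slope, intercept = fit_line(log_units, clot_times)

def units_from_clot_time(seconds: float) -> float:
    """Invert the fitted line to estimate factor VIII activity (units/ml)."""
    return 10 ** ((seconds - intercept) / slope)

print(f"slope = {slope:.2f} s per log10 unit, intercept = {intercept:.2f} s")
print(f"activity at 70 s = {units_from_clot_time(70.0):.3f} units/ml")
```

The fitted slope is negative, reflecting that a shorter clotting time corresponds to greater factor VIII activity. The activity read from the curve applies to the diluted sample and would be multiplied by the sample dilution factor to give the activity of the undiluted material.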

O. Chromogenic Peptide Determination

Theory--Factor VIII functions in the activation of factor X to factor Xa in the presence of factor IXa, phospholipid, and calcium ions. A highly specific assay has been designed wherein factor IXa, factor X, phospholipid, and calcium ions are supplied. The generation of factor Xa in this assay is therefore dependent upon the addition of a source of factor VIII activity. The more factor VIII added to the assay, the more factor Xa is generated. After allowing the generation of factor Xa, a chromogenic peptide substrate is added to the reaction mixture. This peptide is specifically cleaved by factor Xa, is not affected by factor X, and is only slowly cleaved by other proteases. Cleavage of the peptide substrate releases a para-nitro-anilide group which has absorbance at 405 nm, while the uncleaved peptide substrate has little or no absorbance at this wavelength. The generation of absorbance due to cleavage of the chromogenic substrate is dependent upon the amount of factor Xa in the test mixture after the incubation period, the amount of which is in turn dependent upon the amount of functional factor VIII in the test sample added to the reaction mixture. This assay is extremely specific for factor VIII activity and should be less subject to potential false positives when compared to the factor VIII deficient plasma assay.

Procedure--Coatest factor VIII was purchased from Helena Laboratories, Beaumont, Tex. (Cat. No. 5293). The basic procedure used was essentially that provided by the manufacturer for the "End Point Method" for samples containing less than 5 percent factor VIII. Where indicated, the times of incubation were prolonged in order to make the assay more sensitive. For certain assays the volumes of reagents recommended by the manufacturer were altered. This change in the protocol does not interfere with the overall results of the assay.

The chromogenic substrate (S-2222+I-2581) for factor Xa was dissolved in 10 milliliters of water, resulting in a substrate concentration of 2.7 millimoles per liter. This substrate solution was aliquoted and stored frozen at -20° C. The FIXa +FX reagent contained the factor IXa and factor X and was dissolved in 10 milliliters of water. The solution was aliquoted and stored frozen at -70° C. until use. Also supplied with the kit were the following solutions: 0.025 molar calcium chloride; phospholipid (porcine brain); and Buffer Stock Solution (diluted one part of Stock Solution to nine parts of water for the assay, resulting in a final concentration of 0.05M Tris-HCl, pH 7.3, containing 0.02 percent bovine albumin). These solutions were stored at 4° C. until use.

The phospholipid+FIXa +FX reagent is prepared by mixing one volume of phospholipid with five volumes of FIXa +FX reagent.

The absorbance of the sample at 405 nm was determined against the reagent blank in a spectrophotometer within 30 minutes.

The absorbance at 405 nm was related to factor VIII units by calibrating the assay using a standard normal human plasma (George King Biomedical, Overland Park, Kans.).

Example of Preferred Embodiment

1. General Strategy for Obtaining the Factor VIII Gene

The most common process of obtaining a recombinant DNA gene product is to screen libraries of cDNA clones obtained from mRNA of the appropriate tissue or cell type. Several factors led us also to use an alternative method, the screening of genomic DNA for the factor VIII gene. First, the site of synthesis of factor VIII was unknown. Although the liver is frequently considered the most likely source of synthesis, the evidence is ambiguous. Synthesis in liver and possibly spleen has been suggested by organ perfusion and transplantation studies (56). However, factor VIII activity is often increased in patients with severe liver failure (56a). Recent conflicting studies employing monoclonal antibody binding to cells detect highest levels of the protein in either liver sinusoidal endothelial (51), hepatocyte (52) or lymph node cells (followed in amount by lung, liver and spleen; (53)). In contrast, the factor VIII related antigen (von Willebrand Factor) is almost certainly synthesized by endothelial cells (54). Not only is the tissue source uncertain, but the quantity of factor VIII in plasma is also extremely low. The circulating concentration of about 100-200 ng/ml (55) is about 1/2,000,000 the molar concentration of serum albumin, for example. Thus, it was not clear that cDNA libraries made from RNA of a given tissue would yield factor VIII clones.
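The order of magnitude of this molar ratio can be checked with a short calculation. The sketch below is illustrative: the albumin concentration (about 40 mg/ml) and molecular weight (about 66,500) are typical textbook values assumed here, not figures from this work, and the factor VIII molecular weight of about 300,000 is the estimate given later in connection with the mRNA size.

```python
# Values taken from the text
FVIII_NG_PER_ML = (100.0, 200.0)   # circulating factor VIII concentration range
FVIII_MW = 300_000.0               # ~300 kD, estimate used later in the text

# ASSUMED typical values for human serum albumin (not from the text)
ALBUMIN_MG_PER_ML = 40.0
ALBUMIN_MW = 66_500.0


def molarity(grams_per_liter, molecular_weight):
    """Molar concentration from a mass concentration in g/l."""
    return grams_per_liter / molecular_weight


# 1 mg/ml equals 1 g/l; 1 ng/ml equals 1e-6 g/l
albumin_molar = molarity(ALBUMIN_MG_PER_ML, ALBUMIN_MW)
ratios = [albumin_molar / molarity(ng * 1e-6, FVIII_MW) for ng in FVIII_NG_PER_ML]

for ng, ratio in zip(FVIII_NG_PER_ML, ratios):
    print(f"{ng:.0f} ng/ml factor VIII: albumin is ~{ratio:,.0f}-fold more abundant (molar)")
```

Under these assumptions albumin exceeds factor VIII by roughly one to two million-fold on a molar basis, the same order as the figure quoted above.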

Based on these considerations, it was decided to first screen recombinant libraries of the human genome in bacteriophage lambda (henceforth referred to as genomic libraries). Although genomic libraries should contain the factor VIII gene, the likely presence of introns might present obstacles to the ultimate expression of the recombinant protein. The general strategy was to:

1. Identify a genomic clone corresponding to a sequenced portion of the human factor VIII protein.

2. Conduct a "genomic walk" to obtain overlapping genomic clones that would include the entire mRNA coding region.

3. Use fragments of the genomic clones to identify by hybridization to RNA blots tissue or cell sources of factor VIII mRNA and then proceed to obtain cDNA clones from such cells.

4. In parallel with no. 3, express portions of genomic clones in SV40 recombinant "exon expression" plasmids. RNA transcribed from these plasmids after transfection of tissue culture (cos) cells should be spliced in vivo and would be an alternative source of cDNA clones suitable for recombinant factor VIII protein expression.

The actual progress of this endeavor involved simultaneous interplay of information derived from cDNA clones, genomic clones of several types, and SV40 recombinant "exon expression" clones, which, of necessity, are described separately below.

2. Genomic Library Screening Procedures

The factor VIII gene is known to reside on the human X chromosome (56). To increase the proportion of positive clones, genomic libraries were constructed from DNA obtained from an individual with four X chromosomes. (The lymphoblast cell line is karyotyped 49,XXXXY; libraries constructed from this DNA are referred to herein as "4X libraries"). 49,XXXXY DNA was partially digested with Sau3AI and appropriate size fractions were ligated into λ phage or cosmid vectors. Details of the construction of these λ/4X and cosmid/4X libraries are given below. The expected frequency of the factor VIII gene in the λ/4X library is about one in 110,000 clones and in the cosmid library about one in 40,000.
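These expected frequencies can be approximated by dividing the total DNA content of a 49,XXXXY cell by the insert length times the number of X chromosomes. The following sketch rests on assumptions that are not stated in the text: rounded chromosome sizes, an average λ insert of about 15 kb (inferred from the later statement that a 45 kb cosmid insert is about three-fold larger) and an average cosmid insert of about 40 kb.

```python
# ASSUMED approximate sizes in base pairs (rounded, not from the text)
AUTOSOMES_HAPLOID_BP = 2.88e9
X_BP = 1.56e8
Y_BP = 5.7e7


def clones_per_positive(insert_bp, x_copies):
    """Expected number of clones per clone that covers a given X-linked point.

    Total DNA per cell = two autosome sets + x_copies X chromosomes + one Y.
    A random insert covers a given X-linked point with probability
    (insert_bp * x_copies) / total DNA.
    """
    total_bp = 2 * AUTOSOMES_HAPLOID_BP + x_copies * X_BP + Y_BP
    return total_bp / (insert_bp * x_copies)


lambda_4x = clones_per_positive(15_000, x_copies=4)   # assumed 15 kb insert
cosmid_4x = clones_per_positive(40_000, x_copies=4)   # assumed 40 kb insert
lambda_1x = clones_per_positive(15_000, x_copies=1)   # normal male, for comparison

print(f"lambda/4X: about one in {lambda_4x:,.0f}")
print(f"cosmid/4X: about one in {cosmid_4x:,.0f}")
print(f"enrichment over a 1X library: {lambda_1x / lambda_4x:.1f}-fold")
```

With these assumed sizes the estimates fall close to the frequencies of one in 110,000 and one in 40,000 given above, and the 4X DNA gives somewhat less than a four-fold enrichment because the extra X chromosomes also add to the total DNA.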

These libraries were screened for the factor VIII gene with synthetic oligonucleotide probes based on portions of the factor VIII protein sequence. These oligonucleotide probes fall into two types, a single sequence of 30 to 100 nucleotides based on codon choice usage analysis (long probes) and a pool of probes 14-20 nucleotides long specifying all possible degeneracy combinations for each codon choice (short probes).

The main advantage of long probes is that they can be synthesized based on any 10-30 amino acid sequence of the protein. No special regions of low codon redundancy need be found. Another advantage is that since an exact match with the gene sequence is not necessary (only stretches of complementarity of 10-14 nucleotides are required), interruption of complementarity due to presence of an intron, or caused by gene polymorphism or protein sequencing error, would not necessarily prevent usable hybridization. The disadvantage of long probes is that only one codon is selected for each amino acid. We have based our choice of codons on a table of mammalian codon frequency (57), and when this gave no clear preference, on the codon usage of the Factor IX gene (58). Since the expected sequence match of the long probes is unknown, the hybridization stringency must be determined empirically for each probe. This was performed by hybridization to genomic DNA blots and washes at various stringencies.

The advantage of short probes is that every codon possible is synthesized as a pool of oligonucleotides. Thus if the amino acid sequence is correct, a short probe should always hybridize to the gene of interest. The main limitation is the complexity of the pool of sequences that can be synthesized. Operationally a pool of 32 different sequences might be considered as a maximum pool size given the signal to noise limitations of hybridization to genomic libraries. This means that only protein sequences in regions of low codon redundancy can be used. A typical probe would be a pool of 16 17-mers specifying all possible sequences over a 6 amino acid fragment of protein sequence.
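The pool size for a candidate region can be computed directly from the standard genetic code. The sketch below is illustrative; the function names are hypothetical, and the peptide DEWKYM is an invented example chosen to give a pool of 16. A 17-mer covers five complete codons and the first two bases of a sixth, so the last amino acid contributes only the number of distinct two-base codon prefixes.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {}
for (b1, b2, b3), aa in zip(product(BASES, repeat=3), AMINO):
    CODONS.setdefault(aa, []).append(b1 + b2 + b3)


def pool_size(peptide, oligo_len=17):
    """Number of distinct oligonucleotides of oligo_len encoding the peptide."""
    if 3 * len(peptide) < oligo_len:
        raise ValueError("peptide too short for the requested oligonucleotide")
    size = 1
    remaining = oligo_len
    for aa in peptide:
        if remaining <= 0:
            break
        take = min(3, remaining)  # the last codon may be only partly covered
        size *= len({codon[:take] for codon in CODONS[aa]})
        remaining -= take
    return size


def best_window(peptide, n_aa=6, oligo_len=17):
    """Return (pool size, window) for the least degenerate window of n_aa residues."""
    windows = [peptide[i:i + n_aa] for i in range(len(peptide) - n_aa + 1)]
    return min((pool_size(w, oligo_len), w) for w in windows)


print(pool_size("DEWKYM"))            # hypothetical peptide giving a pool of 16
print(best_window("AWAYFSDVDLEK"))    # tryptic peptide described later in the text
```

For the tryptic peptide AWAYFSDVDLEK the least degenerate six-residue window, WAYFSD, still requires 96 sequences, which exceeds the practical maximum of 32 mentioned above and is consistent with the use of a long probe for that peptide.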

As with long probes, the hybridization stringency used for short probes has been determined empirically. This is because under ordinarily used hybridization conditions (6× SSC), the stability of the hybrids depends on two factors--the length and the G-C content; stringent conditions for the low G-C content probes are not at all stringent for the high G-C content ones. A typical pool of 16 17-mers might have a range of 41 to 65 percent G-C and these probes will melt in 6× SSC over a 10° C. temperature range (from 48°-58° C.). Since the correct sequence within the pool of 16 is not known in advance, one uses a hybridization stringency just below 48° C. to allow hybridization of the lowest G-C content sequence. However, when screening a large number of clones, this will give many false positives of shorter length and higher G-C content. Since the change in melting temperature is 1° to 2° C. per base pair match, probe sequences as short as 12 or 13 of the 17 will also bind if they have a high G-C content. At random in the human genome a pooled probe of 16 17-mers will hybridize with 1200 times as many 13 base sequences as 17 base sequences.
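The dependence on G-C content in 6× SSC can be illustrated with the widely used empirical rule for short oligonucleotides in high salt, 2° C. per A-T pair plus 4° C. per G-C pair. This rule is introduced here for illustration and is not the method used to obtain the values above; the two 17-mers are hypothetical sequences chosen to have 41 and 65 percent G-C. The second function shows one simple count, again an assumption about how such a figure might arise, that gives a value of the same order as the factor of 1200 quoted above: a 17-mer contains five 13 base windows, and each 13 base sequence occurs 4^4 times more often at random than a 17 base sequence.

```python
def rule_of_thumb_tm(seq):
    """Approximate melting temperature (deg C) of a short oligo in high salt.

    Empirical 2 + 4 rule: 2 deg per A/T and 4 deg per G/C (an approximation).
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def random_match_ratio(full_len=17, sub_len=13):
    """Ratio of expected random matches for sub_len windows versus the full probe."""
    windows = full_len - sub_len + 1           # sub-sequences within the probe
    per_window = 4 ** (full_len - sub_len)     # each shorter match is this much more frequent
    return windows * per_window


LOW_GC_17MER = "ATGCATGAGCATTAGCA"    # hypothetical, 7 of 17 G-C (41 percent)
HIGH_GC_17MER = "GCGCATGCCGAATCGCA"   # hypothetical, 11 of 17 G-C (65 percent)

print(rule_of_thumb_tm(LOW_GC_17MER), rule_of_thumb_tm(HIGH_GC_17MER))
print(random_match_ratio())
```

The rule gives 48° and 56° C. for the two hypothetical 17-mers, close to the 48°-58° C. range stated above, and the count gives 1280.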

A hybridization technique was developed for short probes which equalizes the stability of G-C and A-T base pairs and greatly enhances the utility of using short probes to screen libraries of high DNA sequence complexity.

In FIG. 2A is plotted the melting temperature of 4 short probes under ordinary (6× SSC) and 3.0M TMACl wash conditions. In 3.0M TMACl the probes melt as a nearly linear function of length, while in 6× SSC, the melting is greatly influenced by the G-C content. The high melting temperature in 6× SSC of the 13-mer that is 65 percent G-C clearly demonstrates this conclusion. FIG. 2B shows the melting temperature in 3.0M TMACl as a function of length for 11 to thousands of bases. This figure allows the rapid selection of hybridization conditions for a probe with an exact match of any length desired.

The TMACl hybridization procedure has great utility whenever an exact sequence match of some known length is desired. Examples of this technique include: 1. Screening of a human genomic library with a pool of 16 17-mers. We have used a 3.0M TMACl wash at 50° C., which allows hybridization of only 17, 16, and a few 15 base sequences. The large number of high G-C content probes of lower homology are thus excluded. 2. If a short probe screen yields too many positives to sequence easily, the most likely candidates can be found by a TMACl melting procedure. Replicas of the positives are hybridized and washed at 2° C. intervals (for 17-mers (which melt at 54° C.) 46°, 48°, 50°, 52°, 54°, and 56° C. would be used). The positives that melt at the highest temperature will match the probe most closely. With a standard of known sequence the homology can be predicted ±1 base or better for a 17-mer. 3. Similarly, if a long probe screen yields too many positives, pooled short probes based on the same protein sequence can be synthesized. Since one member of this pool would contain a perfect match, TMACl melting experiments could refine the choice of best candidate positives. 4. In site directed mutagenesis, an oligonucleotide typically 20 nucleotides long with 1 or more changes in the center is synthesized. The TMACl wash procedure can easily distinguish the parental and mutant derivatives even for a 1 base mismatch in the middle of a 20-mer. This is because the desired mutation matches the probe exactly. The wash conditions can simply be determined from FIG. 2B. 5. Selection of one particular gene out of a family of closely related genes. A melting experiment similar to that described above has been used to select one particular gene out of a collection of 100 very similar sequences.

3. First Isolation of the Factor VIII Genomic Clone

Factor VIII enriched preparations were prepared from human cryoprecipitate by polyelectrolyte chromatography and immunoadsorption as previously described (79). This material was dialyzed into 0.1 percent sodium dodecyl sulfate (SDS) and 1 percent ammonium bicarbonate, lyophilized, and stored at -20° C. until use.

Due to contamination of the factor VIII preparations by other plasma proteins, further fractionation was required in order to purify the factor VIII as well as separate the various polypeptide chains believed to arise from the factor VIII. This was accomplished by chromatography of the protein on Toyo Soda TSK 4000 SW columns using high pressure liquid chromatography in the presence of SDS. Such chromatography separates the proteins by molecular size.

The lyophilized protein was reconstituted in distilled water and made 1 percent SDS and 0.1M sodium phosphate, pH 7.5. The TSK column (0.78×50 cm; Alltech, Deerfield, Ill.) was equilibrated at room temperature with 0.1 percent SDS in 0.1M sodium phosphate, pH 7.0. Samples of approximately 0.15 to 0.25 mL were injected and the column was developed isocratically at a flow rate of 0.5 mL per minute. The absorbance was monitored at 280 nm and fractions of 0.2 mL were collected. A representative elution profile is shown in FIG. 15. Aliquots were analyzed by sodium dodecyl sulfate gel electrophoresis on gradient gels of 5 percent to 10 percent polyacrylamide and analyzed by silver staining (80). The material which eluted after 25 minutes corresponded to a doublet of proteins at 80,000 and 78,000 D. The fractions containing these proteins were pooled, as indicated by the bar in FIG. 15, from three separate preparative TSK runs, and stored at -20° C. until use.

The purified 80,000 dalton protein from the TSK fractionation (0.8 nmoles) was dialyzed overnight against 8M urea, 0.36M Tris-HCl, pH 8.6, and 3.3 mM ethylenediamine-tetraacetic acid under a nitrogen atmosphere. Disulfide bonds were reduced by the inclusion of 10 mM dithiothreitol in the above dialysis buffer. The final volume was 1.5 ml. The cysteines were alkylated with 15 microliters of 5M iodoacetic acid (dissolved in 1M NaOH). The reaction was allowed to proceed for 35 minutes at room temperature in the dark, and the alkylation reaction was quenched by the addition of dithiothreitol to a final concentration of 100 mM. The protein solution was dialyzed against 8M urea in 0.1M ammonium bicarbonate for four hours. The dialysis solution was changed to gradually dilute the urea concentration (8M, 4M, 2M, 1M, and finally 0.5M urea) over a period of 24 hours. Tryptic digestion was performed on the reduced, alkylated 80,000 dalton protein by the addition of TPCK-treated trypsin (Sigma Chem. Co.) at a weight ratio of 1 part trypsin to 30 parts factor VIII protein. The digestion was allowed to continue for 12 hours at 37° C. The reaction mixture was frozen until use. HPLC separation of the tryptic peptides was performed on a high resolution Synchropak RP-P C-18 column (0.46×25 cm, 10 microns) at room temperature with a Spectra-Physics 8000 chromatograph. Samples of approximately 0.8 mL were injected and the column was developed with a gradient of acetonitrile (1 percent to 70 percent in 200 minutes) in 0.1 percent trifluoroacetic acid. The absorbance was monitored at 210 nm and 280 nm (FIG. 16). Each peak was collected and stored at 4° C. until subjected to sequence analysis in a Beckman spinning cup sequencer with on-line PTH amino acid identification. The arrow in FIG. 16, eluting at approximately 23 percent acetonitrile, indicates the peak containing the peptide with the sequence AWAYFSDVDLEK. 
This sequence was used to generate the oligonucleotide probe 8.3 for human genomic library screening.

Long and short probes were synthesized based on the considerations just discussed. The second long probe used was based on the sequence of a 12 amino acid factor VIII tryptic fragment, AWAYFSDVDLEK. The DNA sequence chosen to synthesize for this probe was 5'-CTTTTCCAGGTCAACGTCGGAGAAATAAGCCCAAGC. This probe (called 8.3) was first tested in genomic blot hybridizations. FIG. 3A shows genomic Southern blots of normal male (1X) and 49,XXXXY (4X) DNA hybridized with labeled 8.3 probe and washed at various stringencies. Even at the highest stringency (1× SSC, 46° C.) a single band of 3.8 kb (EcoRI) and 9.4 kb (BamHI) was observed. The intensity of this band had a ratio of about 1:4 in the 1X and 4X lanes as would be expected for the X-linked factor VIII gene. Control experiments had demonstrated that a known X-linked gene probe (Factor IX) gave the expected 1:4 hybridization ratio, while an autosomal gene (albumin) gave a 1:1 ratio.
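The relationship between the 8.3 probe and the peptide can be checked by taking the reverse complement of the probe and translating it with the standard genetic code. The sketch below is an illustration of that check; the helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    b1 + b2 + b3: aa for (b1, b2, b3), aa in zip(product(BASES, repeat=3), AMINO)
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate a DNA sequence in frame 1, ignoring any trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TO_AA[seq[i:i + 3]] for i in range(0, usable, 3))


PROBE_8_3 = "CTTTTCCAGGTCAACGTCGGAGAAATAAGCCCAAGC"   # from the text, 5' to 3'
coding_strand = reverse_complement(PROBE_8_3)
print(coding_strand)
print(translate(coding_strand))
```

The 36 base probe is thus the complement of one chosen coding sequence for the twelve residue peptide AWAYFSDVDLEK, one codon per amino acid.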

Based on these genomic blot results, the 8.3 probe was used to screen the λ/4X library. 500,000 phage were grown on fifty 150 mm plates and duplicate nitrocellulose filters were hybridized with 32 P-labeled 8.3 probe at a wash stringency of 1× SSC, 37° C. (FIG. 3). Upon retesting, 15 strongly hybridizing and 15 more weakly hybridizing clones were obtained. DNA was prepared from these isolated plaques, cleaved with restriction endonucleases, and blot hybridized with probe 8.3. Many of the strongly hybridizing clones yielded a hybridizing EcoRI fragment of 3.8 kb, the same size detected in the genomic blot. In addition, all strongly hybridizing clones displayed an identical 262 base pair Sau3AI fragment upon hybridization with the 8.3 probe. Sau3AI fragments were cloned into the single-stranded phage vector M13mp8 (86), screened by hybridization, and sequenced by the dideoxy procedure. The DNA sequence of the 262 bp fragment showed considerable homology with the 8.3 probe. The homology included regions of continuous matches of 14 and 10 bp with an overall homology of 83 percent. The first ten residues of the peptide fragment agreed with that deduced from the DNA sequence of the recombinant clones and they were preceded by a lysine codon as expected for the product of a tryptic digest. The final two predicted residues did not match the DNA sequence. However, the DNA at this juncture contained a good consensus RNA splice donor sequence (60, 61) followed shortly by stop codons in all three possible reading frames. This suggested the presence of an intron beginning at this position. (This suggestion was confirmed with cDNA clones described below.) An open reading frame extended almost 400 b 5' of the region of homology. In this region several consensus splice acceptor sequences were identified. Inspection of the DNA-predicted protein sequence for this region revealed matches with protein sequence of several additional tryptic peptide fragments of factor VIII. 
This demonstrated that an exon of a genomic clone for human factor VIII had been obtained.

4. Extension of Genomic Clones: λ Library Genome Walking

Initially 8 independent factor VIII genomic clones were obtained from the λ/4X library. These contained overlapping segments of the human genome spanning about 28 kb. From the estimated size of the factor VIII protein, it was assumed that the complete gene would encompass 100-200 kb, depending on the length of introns. Hence the collection of overlapping clones was expanded by "genome walking".

The first step in this process was the mapping of restriction endonuclease cleavage sites in the existing genomic clones (FIG. 4). DNA from the clones was digested with restriction enzymes singly or in combinations, and characterized by gel electrophoresis (followed by Southern blot hybridization in some cases). DNA fragments generated by EcoRI and BamHI digestion were subcloned into pUC plasmid vectors (59) for convenience. Restriction mapping, DNA sequence analysis, and blot hybridizations with the 8.3 probe determined the gene orientation.

Next, single copy fragments near the ends of the 28 kb region were identified as "walk" probes. Digests of cloned DNA were blot hybridized with total 32 P-labeled human DNA. With this technique only fragments containing sequences repeated more than about 50 times in the genome will hybridize (87, 88). Non-hybridizing candidate walk probe fragments were retested for repeated sequences by hybridization to 50,000 phage from the λ/4X library.

In the 5' direction, a triplet of 1 kb probe fragments was isolated from λ120 DNA digested with NdeI and BamHI (see FIG. 4). One million λ/4X bacteriophage were screened with this probe. A resulting clone, λ222, was shown to extend about 13 kb 5' of λ120 (see FIG. 4).

In the 3' direction, a 2.5 kb StuI/EcoRI restriction fragment of λ114 was identified as a single copy walk probe. Exhaustive screening of the λ/4X, and subsequently other λ/human genomic libraries, failed to yield extending clones. Under-representation of genomic regions in λ libraries has been observed before (62). It was decided to specifically enrich genomic DNA for the desired sequences and construct from it a limited bacteriophage library.

Southern blot hybridization of human genomic DNA with the 2.5 kb StuI/EcoRI probe showed a 22 kb hybridizing BclI restriction fragment. Restriction mapping showed that cloning and recovery of this fragment would result in a large 3' extension of genomic clones. Human 49,XXXXY DNA was digested with BclI, and a size fraction of about 22 kb was purified by gel electrophoresis. This DNA was ligated into the BamHI site of the bacteriophage vector λ1059 and a library was prepared. (The previously used vector, Charon 30, could not accommodate such a large insert.) Six hybridizing clones were obtained from 400,000 phage screened from this enriched library. The desired clone, designated λ482, extended 17 kb further 3' than our original set of overlapping genomic clones (FIG. 4).

5. Genome Walking: Cosmid Clones

A new genomic library was constructed with cosmid vectors. Cosmids (63), a plasmid and bacteriophage hybrid, can accommodate approximately 45 kb of insert, about a three-fold increase over the average insert size of the λ/4X DNA library. A newly constructed cosmid vector, pGcos4, has the following desirable attributes: 1. A derivative of the tetracycline resistance gene of pBR322 was used that did not contain a BamHI site. This allowed a BamHI site to be put elsewhere in the plasmid and to be used as the cloning site. Tetracycline resistance is somewhat easier to work with than the more commonly used ampicillin resistance due to the greater stability of the drug. 2. The 403 b HincII fragment of λ containing the cos site was substituted for the 641 b AvaI/PvuII fragment of pBR322 so that the copy number of the plasmid would be increased and to remove pBR322 sequences which interfere with the transformation of eukaryotic cells (75). 3. A mutant dihydrofolate reductase gene with an SV40 origin of replication and promoter was included in the pGcos4 vector. In this way any fragments cloned in this vector could then be propagated in a wide range of eukaryotic cells. It was expected this might prove useful in expressing large fragments of genomic DNA with their natural promoters. 4. For the cloning site, a synthetic 20-mer with the restriction sites EcoRI, PvuI, BamHI, PvuI, and EcoRI was cloned into the EcoRI site from pBR322. The unique BamHI site is used to clone 35-45 kb Sau3AI fragments of genomic DNA. The flanking EcoRI sites can be used for subcloning the EcoRI fragments of the insert. The PvuI sites can be used to cut out the entire insert in most cases. PvuI sites are exceedingly rare in eukaryotic DNA and are expected to occur only once every 134,000 b based on dinucleotide frequencies of human DNA.

FIG. 5 gives the scheme for constructing the cosmid vector, pGcos4. 35-45 kb Sau3AI fragments of 49,XXXXY DNA were cloned in this vector. About 150,000 recombinants were screened in duplicate with a 5' 2.4 kb EcoRI/BamHI fragment of λ222 and a 3' 1 kb EcoRI/BamHI fragment of λ482 which were single copy probes identified near the ends of the existing genomic region. Four positive cosmid clones were isolated and mapped. FIG. 4 includes cosmids p541, p542 and p543. The cosmid clones from this screen extended the factor VIII genomic region to a total of 114 kbp. Subsequent probing with cDNA clones identified numerous exons in the existing set of overlapping genomic clones, but indicated that the genomic walk was not yet complete. Additional steps were taken in either direction.

A 3' walk probe was prepared from a 1.1 kb BamHI/EcoRI fragment of p542 (FIG. 4). This probe detected the overlapping cosmid clone p613 extending about 35 kb farther 3'. At a later time, the full Factor VIII message sequence was obtained by cDNA cloning (see below). When a 1.9 kb EcoRI cDNA fragment containing the 3'-terminal portion of the cDNA was hybridized to Southern blots of human genomic and cosmid cloned DNA, it identified a single 4.9 kb EcoRI band and 5.7, 3.2 and 0.2 kb BamHI bands in both noncloned (genomic) and p613 DNA. This implied that the 3' end of the gene had now been reached, as we later confirmed by DNA sequence analysis.

A 5' walk probe was prepared from a 0.9 kb EcoRI/BamHI fragment of p543. It detected an overlapping cosmid clone p612, which slightly extended the overlapping region. The 5'-most genomic clones were finally obtained by screening cosmid/4X and λ/4X libraries with cDNA derived probes. As shown in FIG. 4, λ599, λ605 and p624 complete the set of recombinant clones spanning the Factor VIII gene. (These clones overlap and contain all of the DNA of this region of the human genome with the exception of an 8.4 kb gap between p624 and λ599 consisting solely of intron DNA.) Together, the gene spans 200 kb of the human X chromosome. This is by far the largest gene yet reported. Roughly 95 percent of the gene consists of introns which must be properly processed to produce template mRNA for the synthesis of Factor VIII protein.

The isolation of the factor VIII gene region in λ and cosmid recombinant clones is not sufficient to produce a useful product, the factor VIII protein. Several approaches were followed to identify and characterize the protein coding (exon) portions of the gene in order to ultimately construct a recombinant expression plasmid capable of directing the synthesis of active factor VIII protein in transfected microorganisms or tissue culture cells. Two strategies failed to yield substantially useful results: further screening of genomic clones with new oligonucleotide probes based on protein sequencing, and the use of selected fragments of genomic clones as probes to RNA blot hybridizations. However, coding regions for the factor VIII protein were isolated with the use of SV40 "exon expression" vectors, and, ultimately, by cDNA cloning.

6. SV40 exon expression vectors

It is highly unlikely that a genomic region of several hundred kb could be completely characterized by DNA sequence analysis or directly used to synthesize useful amounts of factor VIII protein. Roughly 95 percent of the human factor VIII gene comprises introns (intervening sequences) which must be removed artificially or by eukaryotic RNA splicing machinery before the protein could be expressed. A procedure was created to remove introns from incompletely characterized restriction fragments of genomic clones using what we call SV40 expression vectors. The general concept entails inserting fragments of genomic DNA into plasmids containing an SV40 promoter and producing significant amounts of recombinant RNA which would be processed in the transfected monkey cos cells. The resulting spliced RNA can be analyzed directly or provide material for cDNA cloning. In theory at least, this technique could be used to assemble an entire spliced version of the factor VIII gene.

Our first exon expression constructions used existing SV40 cDNA vectors that expressed the hepatitis surface antigen gene (73). However, the genomic factor VIII fragments cloned into these vectors gave no observable factor VIII RNA when analyzed by blot hybridization. It was surmised that the difficulty might be that in the course of these constructions the exon regions of the cDNA vectors had been joined to intron regions of the factor VIII gene. To circumvent these difficulties, the exon expression vector pESVDA was constructed as shown in FIG. 6. This vector contains the SV40 early promoter, the Adenovirus II major late first splice donor site, intron sequences into which the genomic factor VIII fragments could be cloned, followed by the Adenovirus II E1b splice acceptor site and the hepatitis B surface antigen 3' untranslated and polyadenylation sequences (49j).

Initially the 9.4 kb BamHI fragment and the 12.7 kb SstI fragment of λ114 were cloned in the intron region of pESVDA (see FIG. 6). Northern blot analysis of the RNA synthesized by these two constructions after transfection of cos cells is shown in FIG. 7. With the 9.4 kb BamHI construction, a hybridizing RNA band of about 1.8 kb is found with probes for exon A, and hepatitis 3' untranslated sequence. To examine the RNA for any new factor VIII exons, a 2.0 kbp StuI/BamHI fragment of λ114, 3' of exon A, was hybridized in a parallel lane. This probe also showed an RNA band of 1.8 kb demonstrating the presence of additional new factor VIII exons in this region. Each of these three probes also hybridized to an RNA band from a construction containing the 12.7 kb SstI genomic fragment. This RNA band was about 2.1 kb. This observation suggested that an additional 200-300 bp of exon sequences were contained in this construction 3' of the BamHI site bordering the 9.4 kb BamHI fragment.

Control experiments showed that this system is capable of correctly splicing known exon regions. A 3.2 kb genomic HindIII fragment of murine dhfr spanning exons III and IV was cloned in pESVDA. An RNA band of 1 kb was found with a murine dhfr probe. This is the size expected if the exons are spliced correctly. Constructions with the 9.4 kb BamHI factor VIII or 3.2 kb dhfr genomic fragments in the opposite orientation, gave no observable RNA bands with any of the probes (FIG. 7).

A cDNA copy of the RNA from the 12.7 kb SstI construction was cloned in pBR322 and screened. One nearly full length (1700 bp) cDNA clone (S36) was found. The sequence of the 950 bp SstI fragment containing all of the factor VIII insert and a portion of the pESVDA vector on either side is presented in FIG. 8. The sequence begins and ends with the Adenovirus splice donor and acceptor sequences as expected. In between there are 888 bp of factor VIII sequence including exon A. The 154 bp preceding and the 568 bp following exon A contain several factor VIII 80K tryptic fragments, confirming that these are newly identified exons. Sequences of the genomic region corresponding to these exons showed that the 154 bp 5' of exon A are contained in one exon, C, and that the region 3' of exon A is composed of 3 exons, D, E, and I of 229, 183 and 156 bp respectively. Each of these exons is bounded by a reasonable splice donor and acceptor site (60, 61).

Subsequent comparison of the S36 exon expression cDNA with the factor VIII cell line cDNA clones showed that all the spliced factor VIII sequence in S36 is from factor VIII exons. This included as expected exons C, A, D, E, and I. However, 47 bp of exon A were missing at the C, A junction and exons F, G, and H had been skipped entirely. The reading frame shifts resulting from such aberrant RNA processing showed that it could not correspond exactly to the factor VIII sequence. At the C, A junction a good consensus splice site was utilized rather than the authentic one. The different splicing of the S36 clone compared with the authentic factor VIII transcript may be because only a portion of the RNA primary transcript was expressed in the cos cell construction. Alternatively, cell type or species variability may account for this difference.

7. cDNA Cloning

a. Identification of a cell line producing Factor VIII mRNA

To identify a source of RNA for the isolation of factor VIII cDNA clones, polyadenylated RNA was isolated from numerous human cell lines and tissues and screened by Northern blot hybridization with the 189 bp StuI/HincII fragment from the exon A region of λ120. Poly(A)+ RNA from the CH-2 human T-cell hybridoma exhibited a hybridizing RNA species. The size of the hybridizing RNA was estimated to be about 10 kb. This is the size mRNA expected to code for a protein of about 300 kD. By comparison with control DNA dot-blot hybridizations (66), the amount of this RNA was determined to be 0.0001-0.001 percent of the total cellular poly(A)+ RNA in the CH-2 cell line. This result indicated that isolation of factor VIII cDNA sequences from this source would require further enrichment of specific sequences or otherwise entail the screening of extremely large numbers of cDNA clones.
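The number of clones implied by such a low abundance can be estimated with the standard library-size relation N = ln(1 - P) / ln(1 - f), where f is the fraction of clones carrying the sequence and P is the desired probability of recovering at least one. This relation is supplied here for illustration and is not part of the procedure described; it also assumes, as a simplification, that clone frequency in an unenriched library equals the mRNA abundance.

```python
import math


def clones_needed(fraction, probability=0.99):
    """Clones to screen for at least one positive with the given probability.

    Uses N = ln(1 - P) / ln(1 - f); log1p keeps precision for very small f.
    """
    return math.ceil(math.log(1.0 - probability) / math.log1p(-fraction))


# Abundance range from the text: 0.0001 to 0.001 percent of poly(A)+ RNA
high_abundance = clones_needed(0.001 / 100)    # f = 1e-5
low_abundance = clones_needed(0.0001 / 100)    # f = 1e-6

print(f"f = 1e-5: screen about {high_abundance:,} clones")
print(f"f = 1e-6: screen about {low_abundance:,} clones")
```

Under these assumptions roughly 0.5 to 5 million unenriched clones would have to be screened for a 99 percent chance of finding a single factor VIII clone.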

b. Specifically Primed cDNA Clones

The DNA sequence analysis of Factor VIII genomic clones allowed the synthesis of 16 base synthetic oligonucleotides to specifically prime first strand synthesis of cDNA. Normally, oligo(dT) is used to prime cDNA synthesis at the poly(A) tails of mRNA. Specific priming has two advantages over oligo(dT). First, it serves to enrich the cDNA clone population for factor VIII. Second, it positions the cDNA clones in regions of the gene for which we possessed hybridization probes. This is especially important in cloning such a large gene. As cDNA clones are rarely longer than 1000-2000 base pairs, oligo(dT) primed clones would usually be undetectable with a probe prepared from most regions of the factor VIII gene. The strategy employed was to use DNA fragments and sequence information from the initial exon A region to obtain specifically primed cDNA clones. We proceeded by obtaining a set of overlapping cDNA clones in the 5' direction based upon the characterization of the earlier generation of cDNA clones. In order to derive the more 3' region of cDNA, we employed cDNA and genomic clone fragments from 3' exons to detect oligo(dT) primed cDNA clones. Several types of cDNA cloning procedures were used in the course of this endeavor and will be described below.

The initial specific cDNA primer, 5'-CAGGTCAACATCAGAG ("primer 1"; see FIG. 9) was synthesized as the reverse complement of the 16 3'-terminal residues of the exon A sequence. C-tailed cDNA was synthesized from 5 μg of CH-2 cell poly(A)+ RNA with primer 1, and annealed into G-tailed pBR322 as described generally in (67). Approximately 100,000 resulting E. coli transformants were plated on 100 150 mm dishes and screened by hybridization (48) with the 189 bp StuI/HincII fragment from the exon A region of the genomic clone λ120 (FIG. 4). One bona fide hybridizing clone ("p1.11") was recovered (see FIG. 9). DNA sequence analysis of p1.11 demonstrated identity with our factor VIII genomic clones. The 447 bp cDNA insert in p1.11 contained the first 104 b of genomic exon A (second strand synthesis apparently did not extend back to the primer) and continued further into what we would later show to be exons B and C. The 5' point of divergence with exon A sequence was bordered by a typical RNA splice acceptor site (61).
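Because primer 1 is described as the reverse complement of the 16 3'-terminal residues of exon A, the message-sense sequence it anneals to can be recovered by reverse-complementing the primer. A minimal sketch follows; the helper name is hypothetical, and the derived sequence is only what follows from the primer as printed above.

```python
# Translation table for the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

primer_1 = "CAGGTCAACATCAGAG"          # 5' -> 3', as given in the text

# Message-sense sequence to which primer 1 would anneal (5' -> 3').
exon_a_3prime_end = reverse_complement(primer_1)
print(exon_a_3prime_end)
```

Applying the function twice returns the original primer, which is a quick check that the complement table is symmetric.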

Although the feasibility of obtaining factor VIII cDNA clones from the CH-2 cell line had now been demonstrated, further refinements were made. Efforts of several types were made to further enrich CH-2 RNA for factor VIII message. A successful strategy was to combine specifically primed first strand cDNA synthesis with hybrid selection of the resulting single stranded cDNA. Primer 1 was used with 200 μg of poly(A)+ CH-2 RNA to synthesize single stranded cDNA. Instead of using DNA polymerase to immediately convert this to double stranded DNA, the single stranded DNA was hybridized to 2 μg of 189 bp StuI/HincII genomic fragment DNA which had been immobilized on activated ABM cellulose paper (Schleicher and Schuell "Transa-Bind"; see (48)). Although RNA is usually subject to hybrid selection, the procedure was applied after cDNA synthesis in order to avoid additional manipulation of the rare, large and relatively labile factor VIII RNA molecules. After elution, the material was converted to double stranded cDNA, size selected, and 0.5 ng of recovered DNA was C-tailed and cloned into pBR322 as before. Approximately 12,000 recombinant clones were obtained and screened by hybridization with a 364 bp Sau3A/StuI fragment derived from the previous cDNA clone p1.11. The probe fragment was deliberately chosen not to overlap with the DNA used for hybrid selection. This avoided the identification of spurious recombinants containing some of the StuI/HincII DNA fragment, which is invariably released from the DBM cellulose. 29 hybridizing colonies were obtained. This represents a roughly 250-fold enrichment of desired clones over the previous procedure.
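The enrichment figure can be reproduced from the clone counts reported for the two screens. The sketch below compares hit rates; it takes the earlier screen to be the single positive among approximately 100,000 transformants, and the function and variable names are illustrative.

```python
def hit_rate(positives, clones_screened):
    """Fraction of screened clones that hybridized."""
    return positives / clones_screened

# Counts as reported in the text.
first_screen = hit_rate(1, 100_000)      # specifically primed, no hybrid selection
second_screen = hit_rate(29, 12_000)     # specifically primed plus hybrid selection

enrichment = second_screen / first_screen
print(f"enrichment: about {enrichment:.0f}-fold")
```

The ratio comes to a little over 240, in line with the "roughly 250-fold" stated above.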

Each of the 29 new recombinants was characterized by restriction mapping and the two longest (p3.12 and p3.48; FIG. 9) were sequenced. These cDNA clones extended about 1500 bp farther 5' than p1.11. Concurrent mapping and sequence analysis of cDNA and genomic clones revealed the presence of an unusually large exon (exon B, FIG. 4) which encompassed p3.12 and p3.48. Based on this observation, DNA sequence analysis of the genomic clone λ222 was extended to define the extent of this exon. Exon B region contained an open reading frame of about 3 kb. 16 mer primers 2 and 3 were synthesized to match sequence within this large exon in the hope of obtaining a considerable extension in cDNA cloning.

At this point, it was demonstrated that a bacteriophage based cDNA cloning system could be employed, enabling production and screening of vast numbers of cDNA clones without prior enrichment by hybrid selection. λGT10 (68) is a phage λ derivative with a single EcoRI restriction site in its repressor gene. If double stranded cDNA fragments are flanked by EcoRI sites they can be ligated into this unique site. Insertion of foreign DNA into this site renders the phage repressor minus, forming a clear plaque. λGT10 without insert forms turbid plaques which are thus distinguishable from recombinants. In addition to the great transformation efficiency inherent in phage packaging, λ cDNA plaques are more convenient to screen at high density than are bacterial colonies.

Double stranded cDNA was prepared as before using primer 3, 5'-AACTCTGTTGCTGCAG (located about 550 bp downstream from the postulated 5' end of exon B). EcoRI "adaptors" were ligated to the blunt ended cDNA. The adaptors consisted of a complementary synthetic 18 mer and 22 mer of sequence 5'-CCTTGACCGTAAGACATG and 5'-AATTCATGTCTTACGGTCAAGG. The 5' end of the 18 mer was phosphorylated, while the 5' end of the 22 mer retained the 5'-OH with which it was synthesized. Thus, when annealed and ligated with the cDNA, the adaptors form overhanging EcoRI sites which cannot self-ligate. This allows one to avoid EcoRI methylation of cDNA and subsequent EcoRI digestion which follows linker ligation in other published procedures (83). After gel isolation to size select the cDNA and remove unreacted adaptors, an equimolar amount of this cDNA was ligated into EcoRI cut λGT10, packaged and plated on E. coli c600hfl. About 3,000,000 clones from 1 μg of poly(A)+ RNA were plated on 50 150 mm petri dishes and hybridization screened with a 300 bp HinfI fragment from the 5' end of exon B. 46 duplicate positives were identified and analyzed by EcoRI digestion. Several cDNA inserts appeared to extend about 2500 bp 5' of primer 3. These long clones were analyzed by DNA sequencing. The sequences of the 5' ends of λ13.2 and λ13.27 are shown in FIG. 10. They possessed several features which indicated that we had reached the 5' end of the coding region for factor VIII. The initial 109 bp contained stop codons in all possible reading frames. Then appeared an ATG triplet followed by an open reading frame for the rest of the 2724 bp of the cDNA insert in λ13.2. Translation of the sequence following the initiator ATG gives a 19 amino acid sequence typical of a secreted protein "leader" or "pre" sequence (69). Its salient features are two charged residues bordering a 10 amino acid hydrophobic core. Following this putative leader sequence is a region corresponding to amino terminal residues obtained from protein sequence analysis of 210 kD and 95 kD thrombin digest species of factor VIII.
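The design of the EcoRI adaptors described in this paragraph can be checked directly from the two oligonucleotide sequences: the 22 mer should consist of a four-base 5' extension followed by the exact reverse complement of the 18 mer. The following sketch performs that check; the helper and variable names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

adaptor_18 = "CCTTGACCGTAAGACATG"          # 5'-phosphorylated strand
adaptor_22 = "AATTCATGTCTTACGGTCAAGG"      # strand retaining its 5'-OH

# Split the 22 mer into its unpaired 5' extension and the base-paired part.
overhang_length = len(adaptor_22) - len(adaptor_18)
overhang = adaptor_22[:overhang_length]
paired_region = adaptor_22[overhang_length:]

strands_pair = paired_region == reverse_complement(adaptor_18)
print("5' overhang:", overhang)
print("strands fully complementary:", strands_pair)
```

The single-stranded extension is AATT, the cohesive end produced by EcoRI, and the remaining 18 bases pair with the 18 mer to give a blunt end for ligation to the cDNA.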

c. Oligo(dT) primed cDNA clones

Several thousand more 3' bases of factor VIII mRNA remained to be converted into cDNA. The choice was to prime reverse transcription with oligo(dT) and search for cDNA clones containing the 3' poly(A) tails of mRNA. However, in an effort to enrich the clones and to increase the efficiency of second strand DNA synthesis, established procedures were modified by employing a specific primer for second strand cDNA synthesis. The 16-mer primer 4, 5'-TATTGCTGCAGTGGAG, was synthesized to represent message sense sequence at a PstI site about 400 bp upstream of the 3' end of exon A (FIG. 9). mRNA was reverse transcribed with oligo(dT) priming, primer 4 was added with DNA polymerase for second strand synthesis, and EcoRI adapted cDNA then ligated into λGT10 as before. 3,000,000 plaques were screened with a 419 bp PstI/HincII fragment contained on p3.12, lying downstream from primer 4. DNA was prepared from the four clones recovered. These were digested, mapped, and blot hybridized with further downstream genomic fragments which had just been identified as exons using SV40 exon expression plasmids described above. Three of the four recombinants hybridized. The longest, λ10.44, was approximately 1,800 base pairs. The DNA sequence of λ10.44 showed that indeed second strand synthesis began at primer 4. It contained all exon sequences found in the SV40 exon expression clone S36 and more. However, the open reading frame of λ10.44 continued to the end of the cDNA. Neither a 3' untranslated region nor a poly(A) tail was found. Presumably second strand synthesis had not gone to completion.

To find clones containing the complete 3' end, we rescreened the same filters with labeled DNA from λ10.44. 24 additional clones were recovered and mapped, and the two longest (λ10.3 and λ10.9.2) were sequenced. They contained essentially identical sequences which overlapped λ10.44 and added about 1900 more 3' base pairs. 51 base pairs beyond the end of the λ10.44 terminus, the DNA sequence showed a TGA translation stop codon followed by an apparent 3' untranslated region of 1805 base pairs. Diagnostic features of this region are stop codons dispersed in all three reading frames and a poly(A) signal sequence, AATAAA (89), followed 15 bases downstream with a poly(A) stretch at the end of the cDNA (clone λ10.3 contains 8 A's followed by the EcoRI adapter at this point, while λ10.9.2 contains over 100 A's at its 3' end).

d. Complete cDNA Sequence

The complete sequence of overlapping clones is presented in FIG. 10. It consists of a continuous open reading frame coding for 2351 amino acids. Assuming a putative terminal signal peptide of 19 amino acids, the "mature" protein would therefore have 2332 amino acids. The calculated molecular weight for this protein is about 267,000 daltons. Taking into account possible glycosylation, this approximates the molecular weight of native protein as determined by SDS polyacrylamide gel electrophoresis.
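The figures in this paragraph are related by simple arithmetic, sketched below. The names are illustrative, and the mean residue mass is computed on the assumption (made here, not stated in the text) that the 267,000 dalton figure refers to the mature 2332-residue chain.

```python
# Values stated in the text.
PRECURSOR_RESIDUES = 2351
SIGNAL_PEPTIDE_RESIDUES = 19
CALCULATED_MASS_DALTONS = 267_000

# Residues remaining after removal of the putative signal peptide.
mature_residues = PRECURSOR_RESIDUES - SIGNAL_PEPTIDE_RESIDUES

# Nucleotides needed to encode the open reading frame (stop codon excluded).
coding_nucleotides = PRECURSOR_RESIDUES * 3

# Assumption: the calculated mass applies to the mature chain.
mean_residue_mass = CALCULATED_MASS_DALTONS / mature_residues

print("mature residues:", mature_residues)
print("coding nucleotides:", coding_nucleotides)
print(f"mean residue mass: {mean_residue_mass:.1f} daltons")
```

The open reading frame therefore spans about 7 kb of coding sequence, and the implied mean residue mass is in the range typical for proteins.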

In order to express recombinant Factor VIII, the full 7 kb protein coding region was assembled from several separate cDNA and genomic clones. We describe below and in FIG. 11 the construction of three intermediate plasmids containing the 5', middle, and 3' regions of the gene. The intermediates are combined in an expression plasmid following an SV40 early promoter. This plasmid in turn serves as the starting point for various constructions with modified terminal sequences and different promoters and selectable markers for transformation of a number of mammalian cell types.

The 5' coding region was assembled in a pBR322 derivative in such a way as to place a ClaI restriction site before the ATG start codon of the Factor VIII signal sequence. Since no other ClaI site is found in the gene, it becomes a convenient site for refinements of the expression plasmid. The convenient ClaI and SacI containing plasmid pT24-10 (67a) was cleaved with HindIII, filled in with DNA polymerase, and cut with SacI. A 77 b AluI/SacI fragment was recovered from the 5' region of the Factor VIII cDNA clone λ13.2 and ligated into this vector to produce the intermediate called pF8Cla-Sac. (The AluI site is located in the 5' untranslated region of Factor VIII and the SacI site 10 b beyond the initiator ATG at nucleotide position 10 in FIG. 10; the nucleotide position of all restriction sites to follow will be numbered as in FIG. 10 beginning with the A of the initiator codon ATG.) An 85 b ClaI/SacI fragment containing 11 bp of adaptor sequence (the adaptor sequence 5' ATCGATAAGCT is entirely derived from pBR322) was isolated from pF8Cla-Sac and ligated along with an 1801 b SacI/KpnI (nucleotide 1811) fragment from λ13.2 into a ClaI/KpnI vector prepared from a pBR322 subclone containing a HindIII fragment (nuc. 1019-2277) of Factor VIII. This intermediate, called pF8Cla-Kpn, contained the initial 2277 coding nucleotides of Factor VIII preceded by 65 5' untranslated base pairs and the 11 base pair ClaI adaptor sequence. pF8Cla-Kpn was opened with KpnI and SphI (in the pBR322 portion) to serve as the vector fragment in a ligation with a 466 b KpnI/HindIII fragment derived from an EcoRI subclone of λ13.2 and a 1654 b HindIII/SphI (nuc. 4003) fragment derived from the exon B containing subclone p222.8. This produced pF8Cla-Sph containing the first 3931 b of Factor VIII coding sequence.
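The stepwise assembly of the 5' coding region can be followed by adding the stated fragment lengths to the SacI position. This is only a bookkeeping sketch with invented names, and it assumes that each fragment begins exactly where the previous one ends in the FIG. 10 numbering.

```python
SACI_POSITION = 10   # SacI site, numbered from the A of the initiator ATG

# Fragment lengths in bases, in the order they were joined (from the text).
fragments = [
    ("SacI/KpnI", 1801),
    ("KpnI/HindIII", 466),
    ("HindIII/SphI", 1654),
]

position = SACI_POSITION
endpoints = []
for name, length in fragments:
    position += length            # 3' boundary of this fragment
    endpoints.append(position)
    print(f"{name}: ends at nucleotide {position}")
```

The running total passes through the KpnI site at 1811 and the HindIII boundary at 2277, and ends at 3931, matching the length of coding sequence stated for the final intermediate.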

The middle part of the coding region was derived from a three-piece ligation combining fragments of three pBR322/cDNA clones or subclones. p3.48 was opened with BamHI (nuc. 4743) and SalI (in pBR322 tet region) to serve as vector. Into these sites were ligated a 778 b BamHI/NdeI (nuc. 5520) fragment from p3.12 and a 2106 b NdeI/SalI (in pBR322) fragment from the subclone pλ10.44R1.9. Proper ligation resulted in a tetracycline resistant plasmid pF8Sca-RI.

The most 3' portion of Factor VIII cDNA was cloned directly into an SV40 expression vector. The plasmid pCVSVEHBV contains an SV40 early promoter followed by a polylinker and the gene for the Hepatitis B surface antigen.

[pCVSVEHBV, also referred to as pCVSVEHBS, is a slight variant of p342E (73). In particular, pCVSVEHBV was obtained as follows: The 540 bp HindIII-HindIII fragment encompassing the SV40 origin of replication (74) was ligated into plasmid pML (75) between the EcoRI site and the HindIII site. The plasmid EcoRI site and SV40 HindIII site were made blunt by the addition of Klenow DNA polymerase I in the presence of the 4 dNTPs prior to digestion with HindIII. The resulting plasmid, pESV, was digested with HindIII and BamHI and the 2900 b vector fragment isolated. To this fragment was ligated a HindIII-BglII fragment of 2025 b from HBV modified to contain a polylinker (DNA fragment containing multiple restriction sites) at the EcoRI site. The HBV fragment encompasses the surface antigen gene and is derived by EcoRI-BglII digestion of cloned HBV DNA (74). The double stranded linker DNA fragment (5'dAAGCTTATCGATTCTAGAATTC3' . . . ) was digested with HindIII and EcoRI and added to the HBV fragment, converting the EcoRI-BglII fragment to a HindIII-BglII fragment. Although this could be done as a 3 part ligation consisting of linker, HBV fragment, and vector, it is more convenient and was so performed to first add the HindIII-EcoRI linker to the cloned HBV DNA and then excise the HindIII-BglII fragment by codigestion of the plasmid with those enzymes. The resulting plasmid, pCVSVEHBV, contains a bacterial origin of replication from the pBR322 derived pML, an ampicillin resistance marker, also from pML, an SV40 fragment oriented such that the early promoter will direct the transcription of the inserted HBV fragment, and the surface antigen gene from HBV. The HBV fragment also provides a polyadenylation signal for the production of polyadenylated mRNAs such as are normally formed in the cytoplasm of mammalian cells.]
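The restriction sites carried by the linker can be located by scanning the printed strand for the standard recognition sequences of the enzymes involved. The sketch below is illustrative; the dictionary and variable names are invented here, and only the one strand shown in the text is examined.

```python
# Standard recognition sequences (5' -> 3') for the relevant enzymes.
RECOGNITION_SITES = {
    "HindIII": "AAGCTT",
    "ClaI": "ATCGAT",
    "XbaI": "TCTAGA",
    "EcoRI": "GAATTC",
}

linker = "AAGCTTATCGATTCTAGAATTC"   # top strand as printed in the text

# 0-based start of each site in the linker (-1 would mean the site is absent).
site_positions = {name: linker.find(seq) for name, seq in RECOGNITION_SITES.items()}

for name, start in sorted(site_positions.items(), key=lambda item: item[1]):
    print(f"{name}: starts at base {start + 1}")
```

All four sites are present in the order HindIII, ClaI, XbaI, EcoRI, with the ClaI site directly 5' to the XbaI site and the XbaI and EcoRI sites overlapping by two bases.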

The plasmid pCVSVEHBV contained a useful ClaI site immediately 5' to an XbaI site in the polylinker. This plasmid was opened with XbaI and BamHI (in the Hepatitis Ag 3' untranslated region) and the ends were filled in with DNA polymerase. This removed the Hepatitis surface antigen coding region but retained its 3' polyadenylation signal region, as well as the SV40 promoter. Into this vector was ligated a 1883 b EcoRI fragment (with filled in ends) from the cDNA clone λ10.3. This contained the final 77 coding base pairs of Factor VIII, the 1805 b 3' untranslated region, 8 adenosine residues, and the filled in EcoRI adaptor. By virtue of joining the filled in restriction sites, the EcoRI end was recreated at the 5' end (from filled in XbaI joined to filled in EcoRI) but destroyed at the 3' end (filled in EcoRI joined to filled in BamHI). This plasmid was called pCVSVE/10.3.

The complete factor VIII cDNA region was joined in a three-piece ligation. pCVSVE/10.3 was opened with ClaI and EcoRI and served as vector for the insertion of the 3870 b ClaI/ScaI fragment from pF8Cla-Sca and the 3182 b ScaI/EcoRI fragment from pF8Sca-RI. This expression plasmid was called pSVEFVIII.

b. Construction for Expression of Factor VIII in Tissue Culture Cells

A variant vector based on pSVEFVIII, containing the adenovirus major late promoter, tripartite leader sequence, and a shortened Factor VIII 3'-untranslated region, produced active factor VIII when stably transfected into BHK cells.

FIG. 12 shows the construction of pAML3P.8c1, the expression plasmid that produces active factor VIII. To make this construction, first the SstII site in pFD11 (49r) and the ClaI site in pEHED22 (49y) were removed with Klenow DNA polymerase I. These sites are in the 3' and 5' untranslated regions of the DHFR gene on these plasmids. Then a three-part ligation of fragments containing the deleted sites and the hepatitis B surface antigen gene from pCVSVEHBS (supra) was performed to generate the vector pCVSVEHED22ΔCS which has only one ClaI and one SstII site. The plasmid pSVEFVIII containing the assembled factor VIII gene (FIG. 11) was cleaved with ClaI and HpaI to excise the entire coding region and about 380 b of the 3' untranslated region. This was inserted into the ClaI, SstII deletion vector at its unique ClaI and HpaI sites, replacing the surface antigen gene to give the expression plasmid pSVE.8c1D.

Separately, the adenovirus major late promoter with its tripartite 5' leader was assembled from two subclones of portions of the adenovirus genome along with a DHFR expression plasmid, pEHED22 (49y). Construction of the two adenovirus subclones, pUCHSX and pMLP2, is described in the methods. pMLP2 contains the SstI to HindIII fragment from adenovirus coordinates 15.4 to 17.1 cloned in the SstI to HindIII site of pUC13 (59). pUCHSX contains the HindIII to XhoI fragment coordinates 17.1 to 26.5 cloned in the HindIII to SalI site of pUC13. When assembled at the HindIII site, these two adenovirus fragments contain the major late promoter of adenovirus, all of the first two exons and introns, and part of the third exon up to the XhoI site in the 5' untranslated region.

A three-part ligation assembled the adenovirus promoter in front of the DHFR gene in the plasmid pAML3P.D22. This put a ClaI site shortly following the former XhoI site in the third exon of the adenovirus tripartite 5' leader. Finally, the SV40 early promoter of the factor VIII expression plasmid, pSVE.8c1D, was removed with ClaI and SalI and replaced with the adenovirus promoter to generate the final expression plasmid, pAML3P.8c1. (See FIG. 12) This plasmid contains the adenovirus tripartite leader spliced in the third exon to the 5' untranslated region of factor VIII. This is followed by the full length Factor VIII structural gene including its signal sequence. The 3' untranslated region of the factor VIII gene is spliced at the HpaI site to the 3' untranslated region of Hepatitis B surface antigen gene. This is followed by the DHFR gene which has an SV40 early promoter and a Hepatitis 3' untranslated region conferring a functional polyadenylation signal.

The factor VIII expression plasmid, pAML3P.8c1, was cotransfected into BHK cells with the neomycin resistance vector pSVEneoBal6 (ATCC No. CRL8544, deposited 20 Apr. 1984). These cells were first selected with G418 followed by a selection with methotrexate.

Initial characterization of the Factor VIII RNA produced by the BHK cell line was performed by Northern analysis of poly(A)+ cytoplasmic RNA by hybridization to a 32P-labeled Factor VIII DNA probe. This analysis shows a band approximately 9 kb in length. Based on hybridization intensities, this band is about 100 to 200 fold enriched when compared to the 9 kb band found in the CH-2 cell line.

9. Identification of Recombinant Factor VIII

a. Radioimmune assay

Radioimmune assays were performed as described in the Methods on supernatants and lysed cells from the BHK Factor VIII producing cell line. Table 1 shows that the supernatants (which contain factor VIII activity) (see 96) also contain approximately equal amounts of the 210 kD (C10) and the 80 kD (C7F7) portions of Factor VIII as judged by these RIAs. Factor VIII can also be detected in the cell lysates by both RIAs. Control cell lines not expressing factor VIII produced RIA values of less than 0.001 units per ml.

125I cpm bound were converted to units/ml with a standard curve based on dilutions of normal plasma. All values are significantly above background. Limits of detection were 0.005 U/ml for the C10 and 0.01 U/ml for the C7F7 assays.

b. Chromogenic Assay on BHK Cell Media

As is shown in Table 2, media from these cells generated an absorbance at 405 nm when tested in the Coatest assay. As described above, this assay is specific for factor VIII activity in the activation of factor X. Addition of monoclonal antibodies specific for factor VIII decreased the amount of factor Xa generated, as evidenced by the decrease in absorbance from 0.155 for the media to 0.03 for the media plus antibodies (after subtracting out the blank value). Therefore, the cells are producing an activity which functions in an assay specific for factor VIII activity and this activity is neutralized by antibodies specific for factor VIII.

Incubation of the media in the reaction mixture without the addition of the factor IXa, factor X, and phospholipid did not result in an increase in the absorbance at 405 nm above the blank value. The observed activity is therefore not due to the presence of a nonspecific protease cleaving the substrate, and is in addition neutralized by antibodies specific for factor VIII.

Serum-containing media with factor VIII activity was chromatographed on the C8 monoclonal antibody (ATCC No. 40115, deposited 20 Apr. 1984) column as described (supra). The eluted fractions were diluted 1:100 and assayed for activity. To 50 μl of the diluted peak fraction were added various monoclonal antibodies known to be neutralizing for plasma factor VIII activity. The results shown in Table 3 demonstrate that the factor VIII activity eluted from the column (now much more concentrated than the media) was also neutralized by these factor VIII antibodies.

The activity detected in the cell media was purified and concentrated by passage over a C8 monoclonal resin (supra). The peak fraction was dialyzed against 0.05M imidazole, pH 6.9, containing 0.15M NaCl, 0.02M glycine ethyl ester, 0.01M CaCl2, and 10 percent glycerol in order to remove the elution buffer. The activity peak fraction was assayed by coagulation analysis in factor VIII deficient plasma (Table 4). A fibrin clot was observed at 84 seconds. With no addition, the hemophilia plasma formed a clot in 104.0 seconds. Therefore, the eluted fraction corrected the coagulation defect in hemophilia plasma. Normal human plasma was diluted and assayed in the same manner. A standard curve prepared from this plasma indicated that the eluted fraction had approximately 0.01 units per milliliter of factor VIII coagulant activity.

Activation of coagulant activity by thrombin is a well established property of factor VIII. The eluted fraction from the monoclonal column was analyzed for this property. After dialysis of the sample to remove the elution buffer (supra), 100 μl of the eluate was diluted with 100 μl of 0.05M imidazole, pH 7.6, containing 0.15M NaCl, 0.02M glycine ethyl ester, 0.01M CaCl2 and 10 percent glycerol. This dilution was performed to dilute further any remaining elution buffer (which might interfere with thrombin functioning) as well as to increase the pH of the reaction mixture. Thrombin (25 ng) was added to the solution and the reaction was performed at room temperature. Aliquots of 25 μl were removed at various time points, diluted 1:3, and assayed for coagulation activity. The results are shown in FIG. 17. The factor VIII activity increased with time, and subsequently decreased, as expected for a factor VIII activity. The amount of thrombin added did not clot factor VIII deficient plasma in times observed for these assays, and the observed time dependent increase and subsequent decrease in observed coagulation time proved that the activity being monitored was in fact due to thrombin activation of factor VIII. The observed approximately 20-fold activation by thrombin is in agreement with that observed for plasma factor VIII.

Factor VIII is known to circulate in plasma in a reversible complex with von Willebrand Factor (vWF) (10-20). A useful form of recombinant factor VIII should therefore also possess this capacity for forming such a complex in order to confirm identity as factor VIII. In addition, the ability to form such a complex would prove the ability of a recombinant factor VIII to form the natural, circulating form of the activity as the factor VIII/vWF complex upon infusion into hemophiliacs. In order to test the ability of recombinant factor VIII to interact with vWF, vWF was purified and immobilized on a resin as follows:

Human von Willebrand factor was prepared by chromatography of human factor VIII concentrates (purchased from, e.g., Cutter Laboratories) on a Sepharose CL4B resin equilibrated with 0.05M Tris, pH 7.3, containing 0.15M NaCl. The von Willebrand factor elutes at the void volume of the column. This region was pooled, concentrated by precipitation with ammonium sulfate at 40 percent of saturation and re-chromatographed on the column in the presence of the above buffer containing 0.25M CaCl2 in order to separate the factor VIII coagulant activity from the von Willebrand factor. The void volume fractions were again pooled, concentrated using ammonium sulfate, and dialyzed against 0.1M sodium bicarbonate. The resulting preparation was covalently attached to cyanogen bromide activated Sepharose (purchased from Pharmacia) as recommended by the manufacturer. The column was washed with 0.02M Tris, pH 7.3, containing 0.05M NaCl and 0.25M CaCl2 in order to remove unbound proteins. The recombinant factor VIII was prepared in serum free media and applied to a 1.0 ml column of the vWF resin at room temperature.

The column was washed to remove unbound protein and eluted with 0.02M Tris, pH 7.3, containing 0.05M NaCl and 0.25M CaCl2. Fractions of 1.0 ml were collected, diluted 1:10 and assayed. The results are shown in Table 5. The factor VIII activity is absorbed from the media onto the column. The activity can subsequently be eluted from the column using high salt (Table 5), as expected for the human factor VIII. Therefore, the factor VIII produced by the BHK cells has the property of specific interaction with the von Willebrand factor protein.

The purpose of this set of experiments was to prove immunological identity of the protein encoded by the clone with the polypeptides in plasma. This was accomplished by expressing portions of the gene as fusion proteins in E. coli. All or part of the coding sequences of the cloned gene can be expressed in forms designed to provide material suitable for raising antibodies. These antibodies, specific for desired regions of the cloned protein, can be of use in analysis and purification of proteins. A series of E. coli/factor VIII "fusion proteins" were prepared for this purpose. Fragments of factor VIII clones were ligated into the BglII site of the plasmid pNCV (70) in such a way as to join factor VIII coding sequences, in proper reading frame, to the first 12 amino acids of the fused E. coli trp LE protein (48, 70, 71). Substantial amounts of recombinant protein product are usually produced from this strong trp promoter system.

pfus1 was constructed by isolating a 189 bp StuI/HincII fragment of factor VIII (coding for amino acids 1799-1860) and ligating this into the SmaI site of pUC13 (49K). This intermediate plasmid was digested with BamHI and EcoRI and the 200 bp fragment inserted into pNCV (70) from which the 526 bp BglII to EcoRI fragment had been removed. This plasmid, pfus1, produces under trp promoter control a 10 kD fusion protein consisting of 16 trpLE and linker coded amino acids, followed by 61 residues of factor VIII and a final 9 linker coded and trpE carboxy terminal residues.

pfus3 was constructed by removing a 290 bp AvaII fragment of factor VIII (amino acids 1000-1096), filling in the overhanging nucleotides using Klenow fragment of DNA polymerase, and ligating this now blunt-ended DNA fragment into pNCV which had been cut with BglII and similarly filled in. This plasmid, with the filled in fragment in the proper orientation (as determined by restriction digests and DNA sequence analysis), directs the synthesis of an approximately 40 kD fusion protein containing 97 amino acids of factor VIII embedded within the 192 amino acid trpLE protein.

pfus4 was made by cutting a factor VIII subclone, λ222.8, with BanI, digesting back the overhang with nuclease S1, followed by PstI digestion and isolation of the resulting 525 bp blunt/PstI fragment (amino acids 710-885). This was ligated into pNCV, which had been digested with BglII, treated with S1, digested with PstI, and the vector fragment isolated. pfus4 directs the synthesis of a 22 kD fusion protein containing 175 amino acids of factor VIII following the initial 12 amino acids of trpLE.

The fusion proteins were purified and injected into rabbits in order to generate antibodies as described supra. These antibodies were tested for binding to plasma derived factor VIII by Western Blot analysis.

The results of such a Western transfer are shown in FIG. 13. Each of the fusion proteins reacts with the plasma factor VIII. Fusion 1 was generated from the region of the gene encoding an 80,000 dalton polypeptide. It can be seen that fusion 1 antisera react only with the 80,000 dalton band, and do not react with the proteins of higher molecular weight. Fusion 3 and 4 antisera show cross reactivity with the proteins of greater than 80,000 daltons, and do not react with the 80,000 dalton band. The monoclonal antibody C8 is an activity neutralizing monoclonal directed against factor VIII and is known to react with the 210,000 dalton protein. FIG. 14 demonstrates that fusion 4 protein will react with this monoclonal antibody, thereby demonstrating that the amino acid sequence recognized by C8 is encoded by fusion 4 polypeptide. This further supports the identity of fusion 4 protein containing protein sequences encoding the 210,000 dalton protein. The above studies conclusively prove that the gene encodes the amino acid sequence for both the 210,000 and 80,000 dalton proteins.

11. Pharmaceutical Compositions

The compounds of the present invention can be formulated according to known methods to prepare pharmaceutically useful compositions, whereby the human factor VIII product hereof is combined in admixture with a pharmaceutically acceptable carrier vehicle. Suitable vehicles and their formulation, inclusive of other human proteins, e.g. human serum albumin, are described for example in Remington's Pharmaceutical Sciences by E. W. Martin, which is hereby incorporated by reference. Such compositions will contain an effective amount of the protein hereof together with a suitable amount of vehicle in order to prepare pharmaceutically acceptable compositions suitable for effective administration to the host. For example, the human factor VIII hereof may be parenterally administered to subjects suffering, e.g., from hemophilia A.

The average current dosage for the treatment of a hemophiliac varies with the severity of the bleeding episode. The average doses administered intravenously are in the range of: 40 units per kilogram for pre-operative indications, 15 to 20 units per kilogram for minor hemorrhaging, and 20 to 40 units per kilogram administered over an 8-hour period for a maintenance dose.