What are the pros and cons of shRNA-mediated knockdown versus CRISPR- or TALEN-mediated knockout?

Either shRNA-mediated knockdown or nuclease-mediated knockout (e.g. CRISPR or TALEN) can be a valuable experimental approach to study the loss-of-function effects of a gene of interest in cell culture. In order to decide which method is optimal for your specific application, there are a few things you should consider.

Mechanisms

Knockdown vectors

Knockdown vectors express short hairpin RNAs (shRNAs) that repress the function of target mRNAs within the cell by inducing their cleavage and repressing their translation. Therefore, shRNA knockdown vectors do not cause any DNA-level sequence change in the gene of interest.

CRISPR and TALEN both function by directing nucleases to cut specific target sites in the genome. These cuts are then imperfectly repaired by the cellular machinery, resulting in permanent mutations, such as small insertions or deletions, at the sites of repair. A subset of these mutations will result in loss of function of the gene of interest due to frame-shifts, premature stop codons, etc. If two closely positioned cut sites in the genome (i.e. within several kb) are targeted simultaneously, this can also result in the deletion of the intervening region.
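As a rough illustration of why only a subset of repair outcomes disrupts the gene, the sketch below classifies a small indel by whether it shifts the reading frame. The function name and the simplification (only the net length change inside a coding exon is considered) are assumptions made for illustration; an in-frame indel can still impair function, and a frameshift close to the 3' end of the coding sequence may not.

```python
def classify_indel(ref_len: int, alt_len: int) -> str:
    """Classify an indel inside a coding exon by its effect on the reading frame.

    ref_len: length (bp) of the original sequence at the repair site
    alt_len: length (bp) of the sequence after repair
    Hypothetical helper: it looks only at the net length change.
    """
    delta = alt_len - ref_len
    if delta == 0:
        return "no length change"
    kind = "insertion" if delta > 0 else "deletion"
    # A net change that is not a multiple of 3 shifts the reading frame.
    frame = "in-frame" if delta % 3 == 0 else "frameshift"
    return f"{frame} {kind}"


if __name__ == "__main__":
    for ref_len, alt_len in [(10, 9), (10, 7), (10, 12), (10, 13)]:
        print(ref_len, alt_len, classify_indel(ref_len, alt_len))
```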

shRNA-mediated knockdown will never completely repress the expression of the target gene. Even for the most effective shRNAs, some residual expression of the target gene will remain. In contrast, in a fraction of treated cells, CRISPR and TALEN can generate permanent mutations which may result in complete loss of gene function.

Consistency and uniformity

shRNA vectors generally provide high cell-to-cell uniformity within the pool of treated cells and very consistent results between experiments. In contrast, CRISPR and TALEN produce results that are highly non-uniform from cell to cell due to the stochastic nature of the mutations introduced. To fully knock out the gene of interest in a cell, all copies of the gene in the cell must be knocked out. Given that normal cells have two copies of any gene (except for X- or Y-linked genes) while cancer cells can have more than two copies, such full knockout cells may represent a very small fraction of all the treated cells. For this reason, nuclease-mediated knockout experiments require the screening of clones by sequencing to identify the subset in which all copies of the gene of interest have been knocked out.
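To see why full knockout cells can be rare, consider a simplified model in which each copy of the gene independently acquires a loss-of-function mutation with the same probability. Both the independence assumption and the example probability below are illustrative assumptions, not measured values.

```python
def full_knockout_fraction(p_allele: float, copies: int) -> float:
    """Expected fraction of treated cells in which every copy is knocked out.

    p_allele: assumed probability that a single gene copy acquires a
              loss-of-function mutation (hypothetical input)
    copies:   number of copies of the gene in the cell
    Assumes copies are mutated independently of one another.
    """
    if not 0.0 <= p_allele <= 1.0:
        raise ValueError("p_allele must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return p_allele ** copies


if __name__ == "__main__":
    # Same per-copy probability, increasing copy number.
    for copies in (1, 2, 3, 4):
        print(copies, f"{full_knockout_fraction(0.3, copies):.2%}")
```

Under these assumptions the expected fraction drops geometrically with copy number, which is consistent with the need to screen individual clones by sequencing.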

Off-target effects

Off-target effects have been reported for both shRNA-mediated knockdown and nuclease-mediated knockout. The off-target phenotype(s) can be estimated by using multiple different shRNAs to target the same gene. If a gene knocked down by multiple different shRNAs results in consistent phenotype(s), then it argues against the phenotype(s) being caused by off-target effects. For CRISPR- or TALEN-mediated knockout, multiple clones containing loss-of-function mutations should be analyzed in order to account for any phenotype(s) that may be due to off-target mutations. Additionally, bioinformatically identified off-target sites could be sequenced in the clones to see if they have been mutated.

Both CRISPR and TALEN systems have been harnessed to edit genomes of cultured cells and model organisms. Both systems can be used to knock out genes, or to knock in point mutations or insertions, but these two systems are different in several ways and have their own pros and cons.

Mechanisms

CRISPR

The CRISPR system uses a site-specific guide RNA (gRNA) to direct the Cas9 nuclease to its target site in the genome to create DNA cleavage. The target sequence is typically ~20 bp long, and sites containing a few mismatches may still be recognized and cleaved.

The TALEN system employs a pair of chimeric proteins, each composed of a TAL effector DNA-binding domain (recognizing a specific sequence) fused to a FokI nuclease domain. The pair of proteins is designed to bind to a pair of target sites in the genome, each ~18 bp long and flanking a 14-20 bp spacer. Upon binding to DNA, the FokI nuclease domains on the pair of proteins are able to dimerize, which in turn leads to DNA cleavage within the spacer region between the two target sites.

Efficiency

Both CRISPR- and TALEN-mediated genome editing show good efficiency, but the efficiency varies considerably depending on application, species and cell type. In general, CRISPR can be delivered into cells and induce DNA cleavage more efficiently than TALEN.

Off-target effects

A CRISPR gRNA targets a ~20 bp sequence, whereas a TALEN pair binds to a total of ~36 bp of target sequence. In addition, the Cas9/gRNA complex has a higher tolerance for sequence mismatches (up to 5 bp mismatches) than TALEN does. Therefore, TALEN-mediated cleavage has better specificity than CRISPR, and off-target cleavage in the genome by TALEN is unlikely. In contrast, off-target effects have been reported for CRISPR in cell lines, though analyses of CRISPR knockout mice suggest lower off-target frequency in vivo. Recent developments of the CRISPR system have significantly enhanced its specificity. By using a Cas9 nickase (a Cas9 mutant that contains only one catalytic nuclease domain, e.g. Cas9_D10A and Cas9_H840A) with dual gRNAs, two single-strand DNA nicks are generated in close proximity within the target region, resulting in a double-nick DSB (double-strand break) within the target region that could be repaired. In this design, the off-target effects are minimized since the dual gRNAs expand the target sequence to ~40 bp long.
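The mismatch tolerance described above is the reason candidate off-target sites are usually searched for by allowing a limited number of mismatches. The following is a minimal sketch of that idea, not a real off-target prediction tool: the function names are hypothetical, it scans the forward strand only, and it ignores the PAM requirement and the position of each mismatch, all of which matter in practice.

```python
def count_mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a.upper(), b.upper()))


def candidate_off_targets(guide: str, genome: str, max_mismatches: int = 5):
    """Return (position, site, mismatches) for forward-strand windows that
    differ from the guide at no more than max_mismatches positions.

    Simplified illustration: no PAM check and no reverse-strand scan.
    """
    guide = guide.upper()
    genome = genome.upper()
    hits = []
    for pos in range(len(genome) - len(guide) + 1):
        site = genome[pos:pos + len(guide)]
        n = count_mismatches(guide, site)
        if n <= max_mismatches:
            hits.append((pos, site, n))
    return hits


if __name__ == "__main__":
    guide = "GACGTTACCGGATCAAGCTT"  # made-up 20-nt guide sequence
    genome = "CCC" + "AACGTTACCGGATCAAGCTC" + "TGGAAA"  # made-up sequence
    for hit in candidate_off_targets(guide, genome, max_mismatches=2):
        print(hit)
```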

Target site requirements

TALEN can be generated to specifically target nearly any sequence in the genome. In contrast, target site selection for CRISPR is limited by the requirement for a PAM sequence (typically NGG) located at the immediate 3’ end of the gRNA target sequence. This is no barrier to knocking out genes because cleavage anywhere in the gene is potentially effective, but may present difficulties in generating site-specific mutations or insertions that require cleavage at a specific position of the gene. To precisely edit a specific genomic site using CRISPR, a homologous recombination donor vector or long oligo containing the desired edit sequence flanked by the immediate upstream and downstream homology arms of the target site can be delivered to the cells together with gRNA(s) and Cas9, in order to guide HDR (homology directed repair)-mediated DNA repair at the target site.
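The PAM constraint can be made concrete with a short search for candidate Cas9 target sites. The sketch below lists every 20-nt sequence that is immediately followed by an NGG PAM on either strand. The function names are hypothetical, and the code does not score guides or check specificity; it only shows how the PAM requirement restricts where a cut can be placed.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_ngg_sites(seq: str, protospacer_len: int = 20):
    """Return (strand, start, protospacer) for each site followed by NGG.

    For the "-" strand, start is the offset within the reverse complement
    of seq, not a coordinate on the original sequence.
    """
    seq = seq.upper()
    # Lookahead so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{protospacer_len}}})[ACGT]GG)")
    sites = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(s):
            sites.append((strand, match.start(), match.group(1)))
    return sites


if __name__ == "__main__":
    example = "TTGACGTTACCGGATCAAGCTTAGGCATCGATTACG"  # made-up sequence
    for site in find_ngg_sites(example):
        print(site)
```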

In terms of simplicity, CRISPR outcompetes TALEN in several ways. First, for vector construction, the CRISPR system only requires construction of a short gRNA because targeting of the Cas9/gRNA complex relies on simple RNA/DNA hybridization, while the TALEN system requires re-engineering of the TAL DNA-binding domain, which is unique for each protein-DNA interaction. Therefore, gRNAs are cheaper and easier to design and construct than TALENs, which always require two vectors per target site. However, TALEN recognition modules, such as the one built into the VectorBuilder system, have greatly reduced the work required to generate TALEN vectors. Secondly, for some applications, such as injecting mouse embryos, Cas9 protein and gRNA can be more efficiently delivered via direct injection, but TALEN cannot. Thirdly, CRISPR is extremely versatile in genetic screening experiments since a CRISPR screening library expressing many thousands of different gRNAs can be easily constructed in a high-throughput manner.

Should I use single gRNA or dual gRNA for CRISPR-mediated knockout?

For CRISPR-mediated genome editing, the Cas9 nuclease is directed by a site-specific guide RNA (gRNA) to its target site in the genome, where it cleaves the DNA. In most cases, to generate a simple gene knockout, a single gRNA can be used together with Cas9 to generate a double-strand break (DSB), which is then imperfectly repaired by non-homologous end joining (NHEJ), resulting in permanent mutations, such as small insertions or deletions, at the site of repair. A subset of these mutations will result in loss of function of the gene of interest due to frame-shifts, premature stop codons, etc.

Dual gRNAs can be used if Cas9_D10A nickase is being used to target the two opposite strands of a single target site. In this approach, the nickase enzyme will generate single strand cuts on both strands, one guided by each of the two gRNAs, resulting in DSBs at the target site. Generally, this method reduces off-target effects of CRISPR/Cas9 expression because targeting by both gRNAs is necessary for DSBs to be generated.

Dual gRNAs can also be used when Cas9_D10A nickase and an exogenous donor DNA template are being used to introduce specific base-changes (e.g. knockins) into a gene of interest. In this approach, the two opposite strands would be targeted by the two gRNAs at two sites flanking the desired mutation site, and homology-directed repair (HDR) pathways make use of the exogenous donor template to repair the excised sequence.

Common viral vectors used in biomedical research include lentivirus, Moloney murine leukemia virus (MMLV), adenovirus, and adeno-associated virus (AAV), each with its advantages and disadvantages. Many factors affect the decision on what type of viral vector to use in your experiment. The key considerations include: Does the virus have the tropism for the target cells (namely, can it efficiently infect target cells)? Are the cells dividing or non-dividing? Do you want transient transduction or stable integration into the host genome? What transduction efficiency is needed? Do you need to use a customized promoter to drive the gene of interest? Will your vector be used in cell culture or in vivo? Will an immune response to the virus affect your experiment? The table below lists these considerations when choosing commonly used viruses:

| | Lentivirus | MMLV | Adenovirus | AAV |
|---|---|---|---|---|
| Tropism | Broad | Broad | Ineffective for some cells | Depending on viral serotype |
| Can infect non-dividing cells? | Yes | No | Yes | Yes |
| Stable integration or transient | Stable integration | Stable integration | Transient, episomal | Transient, episomal |
| Maximum titer | High | Moderate | Very High | High |
| Promoter customization | Yes | No | Yes | Yes |
| Primary use | Cell culture and in vivo | Cell culture and in vivo | In vivo | In vivo |
| Immune response in vivo | Low | Low | High | Very low |

Lentivirus

Lentivirus is a type of retrovirus. Upon infecting cells, the RNA genome of the virus is reverse transcribed and then permanently integrated into the host genome, thus allowing long-term stable expression of genes carried on the viral vector. Lentivirus is the most commonly used viral system for gene delivery, as it is a highly efficient vehicle for introducing genes permanently into mammalian cells. This system has broad tropism (i.e. it can infect a wide range of cell types) for both dividing and non-dividing cells, with relatively low cellular toxicity or immune response. Live lentivirus can be produced at high titer (>10^8 TU/ml), and transduction efficiency for cultured cells can approach 100% under optimal conditions. While lentivirus is primarily used for in vitro transduction of cultured cells, it can also be used to transduce cells in live animals.

MMLV is a type of retrovirus just like lentivirus. It also has broad tropism and stably integrates into the host genome, allowing stable gene expression. However, MMLV does not efficiently infect non-dividing cells, and can produce a more significant cellular immune response than lentivirus. Additionally, the viral titer of MMLV and similar retroviruses is usually only about one tenth that of lentivirus. Another major shortcoming of the MMLV vector is that it does not allow customization of the promoter used to drive the gene of interest (GOI). Rather, the expression of the GOI is driven by the viral 5’ LTR. In contrast, lentivirus, adenovirus and AAV all allow customized promoters to be used to drive GOI expression. For these reasons, the use of MMLV has declined and generally been replaced by lentivirus. However, because of historical precedent and for certain technical reasons, MMLV is still used in some applications such as the derivation of iPS cells.

When adenovirus infects cells, its genome does not integrate into the host genome, but rather remains in an episomal state within infected cells. Expression of genes carried by the adenoviral vector is usually transient, particularly in rapidly dividing cells, which will lose the viral episome over time. Many cell types (both dividing and non-dividing) can be transduced with adenovirus, but certain cell types lack the appropriate cell surface receptor and therefore cannot be efficiently transduced. Cellular and in vivo immune response due to adenoviral infection can be significant, and may interfere with certain experiments. Adenovirus can be produced at very high titer (>10^10 TU/ml), which allows for very efficient transduction of susceptible target cells. This system is primarily used for in vivo gene delivery, such as gene therapy and vaccination.

AAV is another non-integrating, episomal virus commonly used to produce transient gene expression. Unlike adenovirus, AAV has very low immunogenicity and is almost entirely nonpathogenic in vivo. A practical advantage is that AAV can in most cases be handled in biosafety level 1 (BSL1) facilities. AAV can infect both dividing and non-dividing cells. When VectorBuilder’s AAV vectors are packaged into virus, different serotypes can be conferred to the virus by using different capsid proteins for the packaging. Viruses of different serotypes have different tissue tropisms, so care should be taken to select the proper serotype when attempting to infect a particular tissue type. The relatively high titer of most AAV preparations makes it an efficient gene delivery system. AAV is the ideal viral vector system for many animal studies.

We recommend that you use VectorBuilder's virus packaging services, which utilize a wide range of proprietary technologies to provide you with high-quality, high-titer viruses at lower cost and faster turnaround than what you can do on your own.

Can I transfect my viral vector directly into cells, or do I need to make virus first?

Direct transfection of cells with the viral vector (rather than using live virus) may facilitate expression of your gene of interest (GOI), but there are a number of complications (see below). We therefore recommend that viral vectors be used for production of live virus, and not for direct transfection of cells.

Virus transduction can usually deliver DNA into target cells more efficiently than plasmid transfection. When using a retrovirus such as lentivirus or MMLV, the viral genome can integrate into the host cell genome so that genes carried on the virus can be stably expressed. By contrast, transfected vector plasmids generally give only transient expression in the cells since they do not integrate into the host genome. For retroviral vectors, compared with virus transduction, which results in a low copy number in the host genome, direct transfection of plasmids can often result in a very high copy number in cells, which leads to very high expression levels of the genes carried on the vector. However, this can be very non-uniform (some cells can contain many copies while others carry very few or none).

Our lentiviral vector plasmids contain a strong RSV promoter within the 5’ LTR, which is used to drive transcription of the viral RNA genome during virus packaging. After transduction of cells with the packaged lentivirus, the promoter activity of the 5’ LTR is inactivated, so it will not affect the expression of the user’s GOI present between the two LTRs in the viral vector. However, if the viral vector is used to directly transfect cells, the 5’ LTR promoter will remain active. This can have a number of effects, including activating, distorting, or even inhibiting expression of downstream gene(s) within the lentiviral vector.

Additionally, due to the presence of components necessary for virus production, viral vectors tend to be significantly larger than regular plasmids containing the same expression cassette. As plasmid size increases, the efficiency of both DNA preparation and plasmid transfection decreases, which may result in very low efficiency for many viral vectors when they are used in direct transfection. This is a particular concern for adenoviral vectors, which are more than 30 kb.


After harvesting viral particles, if the viral vector carries a fluorescent reporter gene, we usually first check the quality of virus by transducing the virus into some common cell lines (e.g. 293T or 293A) to observe the expression of fluorescent protein. Different methods are then used to quantify the titer of virus depending on viral type. Occasionally, if there is a major discrepancy between fluorescence observation and quantitative measurement, we will perform re-measurement or additional validation to ensure that viruses manufactured by VectorBuilder are of high quality.

Lentivirus

To measure lentivirus titer, we transduce 293T cells with lentivirus diluted from the stock. Then, we use a qPCR-based approach to quantify the average number of integration events of the proviral genome (using the copy number of WPRE as a proxy) per host genome (using the copy number of BMP2 as a proxy) to estimate titer in the original viral stock. This approach measures the functional titer which precisely reflects the number of viral particles with infectious capability. There are other methods for measuring lentivirus titer. One is to measure the physical titer by performing RT-qPCR directly on the viral genomic RNA extracted from the virus. But this method can grossly over-estimate titer (typically by ~10 fold and occasionally up to 100 fold), because it measures any viral genome regardless of whether it is from a live infectious viral particle or a dead particle. This problem is exacerbated by the fact that lentivirus is not stable and can quickly lose infectivity if not frozen at -80°C or if subjected to repeated freeze-thaw cycles. Another method is to measure the number of transduced cells based on the expression of either a fluorescent or drug-selection marker carried on the viral vector. This method could severely under-estimate titer because a fluorescent or drug-selection marker may fail to be expressed at detectable levels in some host cells due to silencing or some other reason, and also because one cell may be infected by multiple viral particles.
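The arithmetic behind a qPCR-based functional titer can be sketched as follows. This is an illustrative calculation under stated assumptions, not VectorBuilder's actual protocol: the function and parameter names are hypothetical, and the default of two reference-gene copies per cell assumes a diploid reference locus, which may not hold for 293T cells.

```python
def lentivirus_functional_titer(
    wpre_copies: float,
    ref_copies: float,
    cells_at_transduction: float,
    virus_volume_ml: float,
    dilution_factor: float = 1.0,
    ref_copies_per_cell: float = 2.0,
) -> float:
    """Estimate functional titer (TU/ml) from qPCR copy numbers.

    wpre_copies:           proviral (WPRE) copies measured in the gDNA sample
    ref_copies:            reference gene (e.g. BMP2) copies in the same sample
    cells_at_transduction: number of cells present when virus was added
    virus_volume_ml:       volume of diluted virus added to the cells
    dilution_factor:       fold dilution of the stock (100 means 1:100)
    ref_copies_per_cell:   assumed reference-gene copies per cell
    """
    # Average number of integrated proviral genomes per cell.
    integrations_per_cell = wpre_copies / ref_copies * ref_copies_per_cell
    # Total integration events in the well, taken as transducing units.
    transducing_units = integrations_per_cell * cells_at_transduction
    # Scale back to the undiluted stock.
    return transducing_units * dilution_factor / virus_volume_ml


if __name__ == "__main__":
    titer = lentivirus_functional_titer(
        wpre_copies=5e4,
        ref_copies=2e5,
        cells_at_transduction=1e5,
        virus_volume_ml=0.001,
        dilution_factor=100,
    )
    print(f"{titer:.2e} TU/ml")
```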

Adenovirus

For adenovirus, we also measure the functional titer. After transducing serially-diluted adenovirus into 293A cells, we use an immunocytochemistry-based approach to count the number of cells being successfully transduced via the detection of adenovirus-specific hexon protein, and each immunostained cell is considered as one infectious unit. Cells are infected at very low multiplicity of infection (MOI) to ensure that most transduced cells are each infected by a single viral particle. This assay shows good correlation with conventional plaque assay. For ultra-purified adenovirus, we directly measure the optical density (using OD260) of the viral particles to estimate titer, because there is a tight correlation between the optical density of ultra-purified adenovirus and functional titer. Adenovirus has very good stability. In our preparation, the viral particles are essentially all alive and can remain functional at room temperature for many days.
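A count-based titer of this kind is typically converted to infectious units per ml by scaling the number of stained cells up to the whole well and correcting for the dilution and inoculum volume. The sketch below shows that conversion; the function name, the number of microscope fields per well, and the example counts are all assumptions for illustration rather than details of the assay described above.

```python
def adenovirus_ifu_per_ml(
    positive_cells_per_field,
    fields_per_well: float,
    inoculum_ml: float,
    dilution: float,
) -> float:
    """Estimate infectious units per ml from hexon-positive cell counts.

    positive_cells_per_field: counts from several microscope fields
    fields_per_well:          fields needed to cover the well (depends on
                              the objective and plate format; assumed input)
    inoculum_ml:              volume of diluted virus added to the well
    dilution:                 fraction of stock in the inoculum (e.g. 1e-5)
    """
    mean_positive = sum(positive_cells_per_field) / len(positive_cells_per_field)
    # Each stained cell is counted as one infectious unit.
    infectious_units_in_well = mean_positive * fields_per_well
    return infectious_units_in_well / (inoculum_ml * dilution)


if __name__ == "__main__":
    ifu = adenovirus_ifu_per_ml([10, 12, 8], fields_per_well=500,
                                inoculum_ml=0.1, dilution=1e-5)
    print(f"{ifu:.2e} IFU/ml")
```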

Adeno-associated virus (AAV)

We measure the physical titer of AAV by directly extracting viral genome from lysed viral particles, and then using qPCR to accurately quantify the copy number of viral genome (using the copy number of ITR region as a proxy) in the stock. AAV particles are very stable. In our AAV preparation, viral particles are essentially all alive and can remain functional at room temperature for many days. As such, the physical titer, though not measured in a way involving the transduction of cells, is very close to the functional titer.

Based on our experience, if a customer buys viral vectors from VectorBuilder but chooses to package virus on their own rather than using our virus packaging service, there is a greater than 50% chance that the resulting titer is significantly below the optimal level. This could lead to a great deal of unhappiness. In many cases, customers who chose to do their own virus packaging blamed the low titer on our vectors, only to find out after they switched to our packaging service that the vectors could produce good titer in our hands.

VectorBuilder’s virus packaging service employs extensively optimized protocols and many proprietary reagents and techniques to ensure high titer, high purity and consistency even in the case of difficult vectors. When customers do their own packaging, they typically use standard protocols that have not been optimized. This, coupled with a lack of experience, can lead to low titer. We often see customers make basic mistakes such as using the wrong packaging cells or helper plasmids. For example, some customers mistakenly used the 2nd generation lentiviral packaging system to package our vectors (all VectorBuilder lentiviral vectors are 3rd generation).

Technical competence of personnel aside, some viral vectors are inherently hard to package due to various reasons. One of the most common causes is that the viral vector carries a gene that is toxic to packaging cells such that the packaging cells are killed or very unhealthy. For some of these cases, VectorBuilder has developed proprietary technologies to overcome the toxic gene effect so the virus can still be made. Another common cause is that the size of the insert fragment is too big for the viral packaging machinery to handle. We recommend that you limit the insert size to 6.4 kb for lentiviral vectors, 8.3 kb for adenoviral vectors, 4.7 kb for AAV vectors, and 5.5 kb for MMLV vectors. An insert fragment exceeding the above listed cargo limit may result in compromised viral production. Also, based on our experience, if the viral vector contains very high GC sequence (> 70% across a few hundred bases), it can reduce packaging efficiency.
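The two sequence-level checks mentioned above (insert size and GC content) are easy to automate before ordering or packaging a vector. The sketch below uses the recommended limits quoted in the text; the function names and the 300 bp window size (one reading of "a few hundred bases") are assumptions.

```python
# Recommended insert size limits from the text, in bp.
CARGO_LIMIT_BP = {
    "lentivirus": 6400,
    "adenovirus": 8300,
    "aav": 4700,
    "mmlv": 5500,
}


def exceeds_cargo_limit(insert_bp: int, vector_type: str) -> bool:
    """True if the insert is larger than the recommended limit."""
    return insert_bp > CARGO_LIMIT_BP[vector_type.lower()]


def high_gc_windows(seq: str, window: int = 300, threshold: float = 0.70):
    """Return (start, gc_fraction) for each window whose GC content
    exceeds the threshold. The window size is an assumed value."""
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - window + 1):
        chunk = seq[start:start + window]
        gc = (chunk.count("G") + chunk.count("C")) / window
        if gc > threshold:
            hits.append((start, gc))
    return hits


if __name__ == "__main__":
    print(exceeds_cargo_limit(5000, "AAV"))         # larger than 4.7 kb
    print(exceeds_cargo_limit(5000, "lentivirus"))  # within 6.4 kb
    print(len(high_gc_windows("GC" * 200 + "AT" * 200)))
```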

Properly packaged high-titer virus could also lose titer over time. This is especially true for lentivirus, which is unstable and can quickly lose infectivity if not frozen at -80°C or if subjected to repeated freeze-thaw cycles. Adenovirus and AAV are much more stable but may also lose infectivity if left unfrozen for a very long period of time or otherwise mishandled.

Sometimes, the experimental circumstances in which VectorBuilder’s virus preparation is used may give a customer the false impression that the viral titer is much lower than what is indicated. This can happen if the customer is using the virus on a cell type that is much harder to transduce than what we used to measure titer. It also happens frequently that the customer estimates titer from the number of transduced cells that express either a fluorescent or drug-selection marker carried on the viral vector. This approach could severely under-estimate titer because a fluorescent or drug-selection marker may fail to be expressed at detectable levels in some host cells due to silencing or some other effect, and also because one cell may be infected by multiple viral particles. For these reasons, it is important to first perform small-scale tests of a virus preparation on the cells of interest at several multiplicities of infection (MOIs) to find an optimal MOI before conducting large-scale transduction experiments.
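When planning such a small-scale MOI test, two quick calculations are useful: the volume of virus needed for a given MOI, and the fraction of cells expected to be transduced at that MOI. The sketch below assumes that infection events follow a Poisson distribution, which is a common idealization and not something stated in the text; real cells can deviate from it substantially. Function names are hypothetical.

```python
import math


def virus_volume_ul(moi: float, n_cells: float, titer_tu_per_ml: float) -> float:
    """Volume of virus stock (in microliters) needed to reach a target MOI."""
    return moi * n_cells / titer_tu_per_ml * 1000.0


def expected_fraction_transduced(moi: float) -> float:
    """Fraction of cells receiving at least one particle (Poisson model)."""
    return 1.0 - math.exp(-moi)


def expected_fraction_multiply_infected(moi: float) -> float:
    """Fraction of cells receiving two or more particles (Poisson model)."""
    return 1.0 - math.exp(-moi) - moi * math.exp(-moi)


if __name__ == "__main__":
    for moi in (0.5, 1, 2, 5, 10):
        vol = virus_volume_ul(moi, n_cells=1e5, titer_tu_per_ml=1e8)
        print(
            f"MOI {moi}: {vol:.2f} ul, "
            f"{expected_fraction_transduced(moi):.1%} transduced, "
            f"{expected_fraction_multiply_infected(moi):.1%} multiply infected"
        )
```

Under this model, counting marker-positive cells undercounts viral particles as soon as a noticeable fraction of cells is multiply infected, which is one reason marker-based estimates fall below the functional titer.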

We offer virus packaging services for lentivirus, adenovirus, AAV and MMLV. We strongly suggest that you choose our packaging service because our proprietary packaging reagents, packaging cells and protocols are optimized for our viral vectors. As a result, we can generally produce higher titer, better purity, and more robust live virus than our customers can on their own. You will also find that our prices are highly competitive and likely lower than the cost of making virus on your own.

Fluorescent proteins, luciferases, and LacZ are all recombinant protein-based reporters that can be used for localization or imaging studies. However, these systems differ in several important ways that determine their suitability for different experimental designs.

Fluorescent proteins and luciferases are both types of proteins that produce light, which is then detectable with a camera or similar device. Fluorescent proteins function by absorbing light of one color (excitation), and then emitting lower-energy light of a different color (emission). In contrast, luciferases (and other bioluminescent enzymes) generate light by catalyzing a chemical reaction in which a substrate (i.e. luciferin) is oxidized, emitting photons as a reaction product.

Fluorescent proteins are much brighter than luciferase, which benefits many types of experiments, but in tissue samples or live animals, background, autofluorescence, and light scattering can make fluorescent proteins problematic. Since luciferase activity requires luciferin, this substrate must be added to culture media or injected into live animals prior to analysis. In many cases this can limit reproducibility or quantitation of experiments, when compared to fluorescent proteins.

Unlike the above two types of markers, LacZ does not emit light. Instead, the LacZ gene product, β-galactosidase, can catalyze the conversion of X-gal into an opaque blue compound similar to indigo. Therefore, in order to visualize LacZ, samples must first be stained with X-gal solution, and this staining procedure is not compatible with live samples. Although highly sensitive, LacZ/X-gal staining is not quantitative, and does not provide the high resolution possible with fluorescent or bioluminescent proteins.

In general, we recommend LacZ for gene expression studies in whole-mount embryos or tissue sections. For live animal studies, either fluorescence or bioluminescence may be a good choice, depending on a number of factors, such as whether imaging deep inside tissues is necessary and whether marker brightness may be an issue.

Which fluorescent protein should I use?

Many fluorescent proteins (FPs) have been developed over the years, and which FP to use in an experiment depends on many factors.

Single-color experiments

For single-color experiments, green FPs are the most common choice. EGFP is the most popular green FP and is a good choice for many single-color studies. However, other green FPs, such as TurboGFP (a.k.a. maxGFP), may be better choices for certain applications. For example, TurboGFP has many more advanced features compared to EGFP, such as brighter green fluorescence, faster maturation, and high pH- and photo-stability, making it ideal for experiments that would benefit from early signal detection and high sensitivity. If a red FP is preferred, mCherry is a great choice for most experiments due to its monomeric structure, good fluorescent properties and low toxicity. It is particularly suitable for protein tagging or when cells are sensitive to toxicity or protein aggregation that may occur in other red FPs. The brighter dTomato works well in situations where dimerization and some potential aggregation of the FP is acceptable. Another good red FP is DsRed_Express2, which is a minimally cytotoxic version of red FP.

Multicolor experiments

For multicolor experiments that simultaneously employ multiple fluorophores (including FPs and other dyes such as DAPI), researchers must carefully consider the spectral properties of the fluorophores to ensure that they are spectrally distinguishable based on the excitation and/or emission filters on the available microscopes, flow cytometers, or other hardware used in fluorescence detection. Basically, the detection hardware should be able to read out the fluorescence signal from each of the multiple fluorophores used in the experiment without interference from other fluorophores. This can be done by using an excitation filter (or laser) to produce excitation light of the proper wavelength that only excites the fluorophore of interest. It can also be done by using an emission filter to only allow the emitted fluorescence light from the fluorophore of interest to enter the detector. For two colors, a standard green FP such as EGFP plus a standard red FP such as mCherry should work well on virtually all fluorescence microscopes and flow cytometers. For three colors, a blue, green and red FP combination (e.g. TagBFP + EGFP + mCherry) or a cyan, yellow and red combination (e.g. CyPet + YPet + mCherry) can work well, as these combinations are easily separable on most fluorescence microscopes or flow cytometers.

Fluorescence resonance energy transfer (FRET)

FPs are widely used in many fluorescence resonance energy transfer (FRET)-based applications. FRET is a physical process during which an excited donor chromophore molecule transfers energy to an acceptor chromophore through nonradiative dipole-dipole coupling. Experimentally, FRET leads to a reduction of the donor’s fluorescence and an increase of the acceptor’s emission. There are a few prerequisites for FRET: the donor and acceptor must be in close proximity (10-100 Å); the excitation spectrum of the acceptor must overlap with the emission spectrum of the donor; and the donor’s and acceptor’s transition dipole orientations must be approximately parallel. FRET is distance-dependent, and its efficiency falls off steeply, with the sixth power of the distance between donor and acceptor. It is therefore very sensitive to small changes in distance, making it a powerful tool in many applications, such as studying DNA or protein structure, and investigating molecular interactions. The cyan-yellow donor/acceptor pair, CyPet-YPet, is commonly used in FRET-based approaches due to its good dynamic range.
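The steep distance dependence can be made concrete with the standard Förster relation, E = 1 / (1 + (r/R0)^6), where R0 (the Förster distance) is the separation at which transfer efficiency is 50%. R0 is not given in the text and depends on the specific donor/acceptor pair; the value used below is only a placeholder for illustration.

```python
def fret_efficiency(r_angstrom: float, r0_angstrom: float) -> float:
    """FRET efficiency from the Forster relation E = 1 / (1 + (r/R0)^6).

    r_angstrom:  donor-acceptor distance
    r0_angstrom: Forster distance of the pair (placeholder value in the
                 example below; look up the value for your FP pair)
    """
    if r_angstrom < 0 or r0_angstrom <= 0:
        raise ValueError("distances must be positive")
    return 1.0 / (1.0 + (r_angstrom / r0_angstrom) ** 6)


if __name__ == "__main__":
    r0 = 50.0  # placeholder Forster distance in angstroms
    for r in (25, 40, 50, 60, 75, 100):
        print(f"r = {r:>3} A  E = {fret_efficiency(r, r0):.3f}")
```

With this relation, moving from half of R0 to twice R0 takes the efficiency from close to 1 to close to 0, which is why FRET is sensitive to small changes in distance.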

Our recommendations:

Single Color: EGFP, TurboGFP, mCherry, dTomato, or DsRed_Express2 (These FPs can be used with DAPI.)

Two-color: EGFP + mCherry or TurboGFP + mCherry (These pairs can be used with DAPI.)

Three-color: TagBFP + EGFP + mCherry or CyPet + YPet + mCherry (The second combination can be used with DAPI when proper filters are used.)

Cells typically need only low to moderate expression of a drug resistance gene to acquire resistance to the drug. For fluorescent proteins (FPs), in contrast, high levels of expression are required to obtain a bright fluorescence signal, whereas low to moderate expression may produce only a weak signal or a signal below the detection threshold of the hardware. In addition, certain FPs (e.g. EBFP) are much dimmer than common FPs (e.g. EGFP), and are especially difficult to detect. In fact, some researchers have had to resort to immunofluorescence with anti-FP antibodies in order to visualize their FPs of interest in cell culture or in vivo. Below are some of the most frequent causes of poor fluorescence:

The FP gene is driven by a weak promoter

Some promoters are inherently weak (e.g. UBC), and can result in low expression of the FP. Other promoters (e.g. CMV) may work well in cell culture but can sometimes get silenced in vivo. When expressing FPs, you should always try to use a strong ubiquitous promoter (e.g. EF1A or CAG) if you can. If you have to use a tissue-specific promoter, you should also choose a strong one when possible. If you have to use a weak promoter or a promoter that is poorly characterized, you should be prepared for the possibility of inadequate FP signal. When this happens, it does not mean that your FP is not expressed. It may just be that it is not highly expressed. You can always check FP expression using more sensitive assays such as RT-PCR or immunofluorescence.

For retroviral vector systems (e.g. lentivirus or MMLV), internal polyadenylation signals cannot be present between the LTRs, as this would inhibit virus packaging. Instead, a single polyadenylation signal is present in the 3’LTR. As a result, transcription from the upstream promoter often continues past the end of the upstream ORF, through the downstream promoters and ORFs. This often leads to partial inhibition of expression of the downstream ORF(s). When the downstream ORF is an FP, then its expression can be much reduced. You can improve your FP gene expression as follows:

Consider using a different vector system, such as regular plasmid, adenovirus or AAV.

Consider expressing the FP gene in an upstream expression cassette as part of a polycistron. You can use a 2A linker to connect the FP gene to your gene of interest. However, this may affect the biological function of the other ORF(s) in the polycistron, may still result in low fluorescence (see below), and will increase the cost and time of vector construction.

When the FP gene is expressed in a polycistron, there are a number of complications as described below, and sometimes the expression of the FP gene is dependent on the upstream or downstream ORF partners.

The FP gene is downstream of an IRES.

In a polycistronic transcript containing one or more IRES elements, the ORF downstream of the IRES is expressed at much lower levels (typically 10-20%) as compared to the upstream ORF in the polycistron. If an FP is expressed downstream of an IRES, this reduction in expression may well lead to poor fluorescence. If you have to express an FP in a downstream position of a polycistron, you can consider using a 2A self-cleaving linker (e.g. P2A or T2A) instead of IRES. By using 2A, the downstream ORF of a polycistronic transcript is usually expressed at a level that is comparable to (or only moderately lower than) the upstream ORF. However, 2A also has its pitfalls that could compromise the function of the gene of interest (GOI). When multiple 2As are present, the downstream ORFs also tend to be expressed at lower levels. Additionally, 2A self-cleavage is not 100% efficient, and the efficiency can be strongly influenced by the sequence context of the upstream and downstream ORF. As such, a significant fraction of the translation product from the polycistron could be fusion protein that has failed to self-cleave, which could be a concern in some applications. Furthermore, the cleavage of the 2A linker leaves behind an extra short peptide on the C terminus of the upstream protein and an extra proline on the N terminus of the downstream protein. This is not an issue in most applications, but under some circumstances, it can compromise protein function.

Fusion proteins almost always show weaker fluorescence than their unfused counterparts (e.g. EGFP/Neo fusion protein has weaker fluorescence than EGFP alone). Also, untested fusion proteins have the potential to be unstable, misfolded, or nonfunctional within the cell. If you suspect that this is the reason for weak fluorescence, you can try the following:

Consider using a stronger promoter (e.g. EF1A) to drive expression of the fusion gene.

Some FPs are inherently dimmer than other variants in the same color family. For example, in the blue color family, the brightness of EBFP is only about one third that of TagBFP. We suggest that you choose a brighter FP variant based on our guide on fluorescent reporters.

VectorBuilder offers a variety of drug-selection markers on its vectors, such as puromycin (Puro), neomycin (Neo), hygromycin B (Hygro) and blasticidin (Bsd). In general, we have found that puromycin kills non-resistant cells faster and more consistently than other antibiotics. For this reason, we recommend Puro for most cell types. However, some cell types may have some degree of natural resistance to certain antibiotics even without the resistance gene, or may remain sensitive to certain antibiotics even with the resistance gene. For these cells, different drug-selection markers need to be tested to find the optimal ones.

Recommended concentration and selection duration of commonly used drugs

Antibiotic              Cell line    Recommended concentration    Recommended duration
Puromycin               293T         1-2 ug/ml                    3-5 days
Geneticin (G418) (a)    HT1080       500-1000 ug/ml               7-11 days
Blasticidin             293T         5-15 ug/ml                   7-11 days
Hygromycin B            293T         100-200 ug/ml                5-7 days

Note:

a. Geneticin (G418) is used to select for cells expressing the neomycin resistance gene.

Drug-selection genes typically require only low to moderate expression to be effective, but in some cases they may be insufficient to confer full drug resistance to cells. Below are some of the most frequent causes for the lack of drug resistance:

The dosage of the drug is too high for your cell line

The response to different drugs varies from cell line to cell line. Before applying a new drug to your cell line, we recommend that you always generate a kill curve (also called a “dose-response curve”) and use the lowest drug concentration that effectively kills the non-transfected or non-transduced cells. Using an unnecessarily high drug dosage can kill cells that express the drug resistance gene, and can compromise the health of the resistant cells that survive. Some cells are extremely sensitive to certain drugs, in which case you will have to consider using another drug-selection marker.
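The rule of picking the lowest effective concentration from a kill curve can be sketched as follows. This is only an illustration: the function name, the `max_survival` threshold and the example data are hypothetical, and real kill curves should be read together with cell morphology and the duration of selection.

```python
def lowest_effective_dose(kill_curve, max_survival=0.0):
    """Return the lowest tested concentration that kills unmodified cells.

    kill_curve   -- dict mapping drug concentration (ug/ml) to the surviving
                    fraction (0-1) of non-transfected / non-transduced cells
    max_survival -- highest surviving fraction still counted as "killed"
    Returns None if no tested concentration is effective.
    """
    effective = [dose for dose, survival in kill_curve.items()
                 if survival <= max_survival]
    return min(effective) if effective else None


# Hypothetical kill-curve data, for illustration only.
example_curve = {0.25: 0.80, 0.5: 0.35, 1.0: 0.02, 2.0: 0.0, 4.0: 0.0}
print(lowest_effective_dose(example_curve))
```

Choosing the minimum of the effective doses, rather than the highest tested dose, reflects the point above that unnecessarily high concentrations can harm resistant cells as well.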

Some promoters are inherently weak (e.g. UBC), and can result in low expression of the drug-selection marker. You can consider using a stronger promoter (e.g. EF1A) to drive expression of the drug-selection marker.

For retroviral vector systems (e.g. lentivirus or MMLV), internal polyadenylation signals cannot be present between the LTRs, as this would inhibit virus packaging. Instead, a single polyadenylation signal is present in the 3’LTR. As a result, transcription from the upstream promoter often continues past the end of the upstream ORF, through the downstream promoters and ORFs. This often leads to partial inhibition of expression of the downstream ORF(s). When the downstream ORF is a drug-selection marker, then its expression can be much reduced. You can improve your marker expression as follows:

Consider using a different vector system, such as regular plasmid, adenovirus or adeno-associated virus.

Consider expressing the drug-selection marker in an upstream expression cassette as part of a polycistron. You can use a 2A linker to connect the drug-selection marker to your gene of interest. However, this may affect the biological function of the other ORF(s) in the polycistron, may still result in weak drug resistance, and will increase the cost and time of vector construction.

In a polycistronic transcript containing one or more IRES elements, the gene downstream of the IRES is expressed at much lower levels (typically 10-20%) as compared to the first gene in the polycistron. If a drug-selection marker is expressed downstream of an IRES, this reduction in expression may well lead to poor drug resistance. If you have to express a drug-selection marker in a downstream position of a polycistron, you can consider using a 2A linker (e.g. P2A or T2A) instead of IRES. By using 2A, the downstream ORF of a polycistronic transcript is expressed at a level that is comparable to (or only moderately lower than) the upstream ORF. However, 2A also has its pitfalls that could compromise the function of the gene of interest. Most notably, the cleavage of the 2A linker leaves behind an extra short peptide on the C terminus of the upstream protein and an extra proline on the N terminus of the downstream protein. This is not an issue in most applications, but under some circumstances, it can compromise protein function.

Should I use IRES or 2A (and which 2A) in my polycistronic expression cassette?

When designing a gene expression vector to co-express multiple ORFs under the control of a single promoter, you can choose to place multiple ORFs behind the promoter, separated by linkers such as the internal ribosome entry site (IRES) or the 2A family peptides. For either type of linker, multiple proteins will be produced from a single mRNA transcript.

Mechanisms

The most commonly used IRES element, and also the one used by VectorBuilder, is derived from the encephalomyocarditis virus (EMCV). It functions by acting as an additional ribosome recruitment site, allowing translation initiation to occur at an internal region of the mRNA in addition to the primary translation initiation site.

The 2A peptides are short (~18-25 aa) peptides derived from viruses. They are often called “self-cleaving” peptides because they allow multiple proteins to be produced from the same transcript. Strictly speaking, 2A peptides do not “self-cleave”: they function by making the ribosome skip formation of the peptide bond between the glycine and the proline at the C-terminal end of the 2A element, which separates the end of the 2A sequence from the downstream peptide. As a result, the upstream protein will have a few extra 2A residues added to its C terminus while the downstream protein will have an extra proline added to its N terminus. There are four commonly used 2A peptides, P2A, T2A, E2A and F2A, that are derived from four different viruses.

Pros and cons of the two types of linkers

The key advantage of IRES over 2A is that it does not affect the protein sequence of either the upstream or the downstream ORF, which is not the case for 2A.

The main disadvantage of IRES is that the ORF downstream of the IRES is expressed at much lower levels (typically 10-20%) as compared to the upstream ORF in the polycistron. This is why when people express fluorescent proteins after IRES, they often have trouble getting detectable fluorescence signal, especially for in vivo applications where there is a limited number of transgenes per cell. IRES elements can also be problematic due to their size (>500 bp), which increases the difficulty of vector construction and virus packaging.

The key advantage of 2A over IRES is that the downstream ORF can be expressed at a level comparable to (or just moderately less than) the upstream ORF.

The disadvantage of 2A is the possibility of undesired biological effects of the additional peptide residues left behind on either the upstream or the downstream ORF. Additionally, 2A self-cleavage is not 100% efficient, and the efficiency can be strongly influenced by the sequence context of the upstream and downstream ORF. As such, a significant fraction of the translation product from the polycistron could be fusion protein that has failed to self-cleave, which could be a concern in some applications.

If your experiments do not require high-level expression of the second ORF of a bicistron (e.g. a drug-selection marker), then IRES should be sufficient. However, if there are more than two ORFs in the polycistron, or if predictable or equal amounts of the co-expressed proteins are important, we suggest using 2A peptides.

Comparison of different 2A peptides

Of the four commonly used 2A peptides (P2A, T2A, E2A and F2A), P2A has generally been shown to have the highest cleavage efficiency (close to 100% in some cases). T2A comes next, followed by E2A and F2A. The cleavage efficiency of F2A is only about 50%. We generally recommend using P2A or T2A in polycistrons.

The number of plasmid copies per bacterial cell is determined by the origin of replication on the plasmid. Some origins have an inherently low copy number. Check the copy number of the origin of replication on your plasmid. For low-copy plasmids, increase the volume of E. coli culture used for the plasmid DNA prep in order to obtain a satisfactory DNA yield.

Please check the binding capacity of your plasmid prep column and whether your plasmid is a high- or low-copy plasmid. For mini preps, we recommend that you harvest 1-5 ml of overnight bacterial culture. For maxi preps, we recommend using 100-150 ml of overnight bacterial culture for a high-copy plasmid and 300-500 ml for a low-copy plasmid. Typically, for a high-copy plasmid, ~5 ug of plasmid DNA can be extracted from every 1 ml of culture in a mini prep and ~500 ug can be obtained from 150 ml of culture; for a low-copy plasmid (e.g. pET), 1.5-2.5 ug of plasmid DNA can be harvested from every 1 ml of culture in a mini prep and 150-200 ug can be obtained from 150 ml of culture.
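As a rough planning aid, the per-ml mini prep yields quoted above can be turned into an estimate of how much culture to grow. This is a sketch under simple assumptions: yield is treated as scaling linearly with volume, column binding capacity is ignored, and the names `YIELD_UG_PER_ML` and `culture_volume_ml` are made up for this example. The maxi prep figures above imply a lower per-ml yield at larger scale, so treat the result as a lower bound.

```python
import math

# Approximate mini prep yields from the text (ug of DNA per ml of culture).
YIELD_UG_PER_ML = {
    "high": 5.0,  # ~5 ug/ml for high-copy plasmids
    "low": 1.5,   # conservative end of the 1.5-2.5 ug/ml range for low-copy
}


def culture_volume_ml(target_ug, copy_number="high"):
    """Estimate the culture volume (ml, rounded up) needed for target_ug of DNA.

    copy_number must be "high" or "low". Assumes linear scaling of yield
    with culture volume, which is only an approximation.
    """
    if target_ug <= 0:
        raise ValueError("target_ug must be positive")
    per_ml = YIELD_UG_PER_ML[copy_number]
    return math.ceil(target_ug / per_ml)


print(culture_volume_ml(10, "high"), "ml for 10 ug of a high-copy plasmid")
print(culture_volume_ml(10, "low"), "ml for 10 ug of a low-copy plasmid")
```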

Only a fraction of bacteria in the liquid culture contain plasmids

Some antibiotics, ampicillin in particular, degrade quickly in liquid culture. As a result, bacteria that do not contain plasmids can grow to make up a significant fraction of the culture, causing poor yield from the plasmid prep. To avoid this, please prepare ampicillin-containing growth medium fresh before use and make sure that enough ampicillin is supplied. Also, when culturing ampicillin-resistant bacteria, do not let the liquid culture sit at saturation for too long before harvesting. Besides insufficient antibiotics in the culture, extracting plasmid DNA from a very old culture can also result in low yield, since many bacterial cells are dead and the plasmid DNA they contain is degraded. Therefore, try to extract plasmid DNA from a fresh culture. If the plasmid prep cannot be performed immediately, you can spin down the bacteria and store the pellet in a -80°C freezer for later use.

The liquid culture is directly inoculated from E. coli stab culture

Direct inoculation of a liquid culture from the E. coli liquid stock or stab culture you have received from VectorBuilder can very occasionally result in low yield. We recommend streaking the stock onto an LB agar plate containing the appropriate antibiotic first, and then inoculating a liquid culture with a fresh colony growing from that plate. Detailed user instructions can be found by going to menu item Learning Center > Documentation > Stab Culture.

You have not carefully followed the manual of the plasmid prep kit

If you use a plasmid prep kit, please carefully read the manual before use. Improper operations can often lead to poor performance of the kit.

The plasmid may not be in an appropriate host strain for induction

Most vectors from VectorBuilder are shipped as E. coli stock in the cloning host strain Stbl3 (this information is also indicated on the vector report). Stbl3 is the preferred cloning host due to its ability to maintain the stability of the plasmid, but it may not be suitable for recombinant protein expression. For example, for pET, IPTG-induced recombinant protein expression requires T7 RNA polymerase to be expressed in the host strain, and Stbl3 does not express it. As such, bacterial expression vectors typically need to be transferred into an appropriate host strain such as BL21(DE3) for proper induction.

The protein expressed on the vector is “problematic”

Sometimes when the protein being expressed is insoluble, misfolded, improperly cleaved, or toxic to the bacterial host, there may appear to be poor induction of the recombinant protein. In this case, you will need to optimize your induction system (see below), or express your gene of interest in a more tolerant host strain or from an alternative expression vector.

Your induction system is not optimized

Depending on the gene to be expressed, the expression vector and the host strain, you may need to consider optimizing the following things in your induction system: the OD600 at induction (usually between 0.6 and 0.8); the concentration of the inducing agent (e.g. IPTG or L-arabinose), which should be neither too low nor too high; the duration of induction, which should be neither too short nor too long; and the induction temperature, which needs to be optimized especially when you are dealing with “problematic” proteins (see above), and for which different temperatures can be tested (e.g. 16°C, 25°C, 30°C, 37°C, etc.). In rare cases, for unclear reasons, different clones of the same expression vector may show different induction behavior, so you may need to pick a number of single colonies to test individually and select the one that has the best induction performance.

VectorBuilder applies rules similar to those used by The RNAi Consortium (TRC) to design and score shRNAs. For each given RefSeq transcript, we search for all possible 21mers, which are considered as candidate target sites. Candidates are excluded if they contain features thought to reduce knockdown efficiency/specificity or cloneability, including a run of ≥4 of the same base, a run of ≥7 G or C, GC content <25% or >60%, and AA at the 5’ end. Knockdown scores are penalized for candidates that contain an internal stem-loop, high GC content toward the 3’ end, known miRNA seed sequences, or off-target matches to other genes. For genes with alternative transcripts, target sites that exist in all transcripts are given higher scores.
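The exclusion rules listed above can be sketched as a simple filter over 21-mers. This is an illustrative reading of those four rules only, not VectorBuilder's actual implementation; the function names are hypothetical, and the scoring penalties (stem-loops, 3’ GC content, miRNA seeds, off-target matches) are not modelled.

```python
import re


def passes_hard_filters(site):
    """Return True if a 21-nt DNA target site survives the exclusion rules."""
    site = site.upper()
    if len(site) != 21 or set(site) - set("ACGT"):
        raise ValueError("expected a 21-nt sequence of A/C/G/T")
    if re.search(r"(.)\1{3}", site):   # run of >=4 of the same base
        return False
    if re.search(r"[GC]{7}", site):    # run of >=7 G or C
        return False
    gc = sum(base in "GC" for base in site) / len(site)
    if gc < 0.25 or gc > 0.60:         # GC content <25% or >60%
        return False
    if site.startswith("AA"):          # AA at the 5' end
        return False
    return True


def candidate_sites(transcript):
    """Yield (0-based position, 21-mer) for each window passing the filters."""
    transcript = transcript.upper()
    for i in range(len(transcript) - 20):
        window = transcript[i:i + 21]
        # Skip windows containing ambiguous bases such as N.
        if set(window) <= set("ACGT") and passes_hard_filters(window):
            yield i, window
```

For example, `list(candidate_sites(seq))` returns every surviving window of a transcript sequence `seq`, which could then be passed to a separate scoring step.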

All scores are ≥0, with mean at ~5, standard deviation at ~5, and 95% of scores ≤15. An shRNA with a knockdown score of about 15 is considered to have the best knockdown performance and cloneability, while an shRNA with a knockdown score of 0 has the worst knockdown performance or is hard to clone.

Please note that knockdown scores are only a rough guide. Actual knockdown efficiency could depart significantly from what the scores predict. Target sites with low scores may still work well. Also, please note that targeting the 3’ UTR can be as effective as targeting the coding region.

Why isn’t my shRNA knocking down my gene of interest?

Not all shRNAs will work

Based on our experience and feedback from our customers, we know that when 3 or 4 shRNAs are tested for any arbitrary gene, typically 2 or 3 produce reasonable to good knockdown. However, when using shRNAs, it is important to recognize that not all shRNAs will work. Typically, ~50-70% of shRNAs have a noticeable knockdown effect, and ~20-30% of them have strong knockdown. If you try a few shRNAs targeting a specific gene, it is possible that, by chance, none will produce satisfactory knockdown. When this happens, the best approach is to try more shRNAs, especially ones that have literature validation. Many researchers also use a “cocktail” of shRNAs (i.e. a mixture of different shRNAs) targeting the same gene, which can sometimes improve knockdown efficiency.
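To see how often a small set of shRNAs might fail by chance alone, the success rates quoted above can be plugged into a simple calculation. This sketch assumes each shRNA succeeds independently with the same probability, which is a simplification; the function name is hypothetical.

```python
def prob_no_hit(n_shrnas, p_success):
    """Probability that none of n independent shRNAs gives knockdown."""
    if not 0.0 <= p_success <= 1.0:
        raise ValueError("p_success must be between 0 and 1")
    if n_shrnas < 0:
        raise ValueError("n_shrnas must be non-negative")
    return (1.0 - p_success) ** n_shrnas


# Success rates taken from the ranges in the text above.
for label, p in (("noticeable, 50%", 0.5), ("noticeable, 70%", 0.7),
                 ("strong, 20%", 0.2), ("strong, 30%", 0.3)):
    for n in (3, 4):
        print(f"{label}: n = {n}, P(no hit) = {prob_no_hit(n, p):.3f}")
```

Under these assumptions, testing three shRNAs leaves a chance of roughly one in eight of seeing no noticeable knockdown at the 50% rate, and a substantially higher chance of seeing no strong knockdown, which is consistent with the advice to try more shRNAs when the first few fail.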

The assay for validating the knockdown of your gene is not performed properly

The most common and sensitive assay to evaluate shRNA knockdown efficiency is RT-qPCR. Sometimes, you may need to try several pairs of primers, and then choose the most specific and efficient pair to use. In general, the RT-qPCR primers should span an exon-exon junction if possible to avoid amplifying genomic DNA. When using a new pair of primers, we recommend that you run the PCR product on an agarose gel to verify the band, or even validate the PCR product by sequencing. You should always include a minus-RT control in RT-qPCR to better estimate the level of genomic DNA contamination. You can use the NCBI primer design tool (https://www.ncbi.nlm.nih.gov/tools/primer-blast/) to help you examine the quality of your primers in silico.
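One common way to turn RT-qPCR Ct values into a knockdown estimate is the 2^-ΔΔCt method, sketched below. This method is not described in the text above and is offered only as an assumption about how the data might be analyzed; it presumes roughly 100% amplification efficiency for both primer pairs and a stable reference gene. Function and parameter names are hypothetical.

```python
def remaining_expression(ct_target_kd, ct_ref_kd, ct_target_ctrl, ct_ref_ctrl):
    """Relative target expression (knockdown vs control) by 2**-ddCt.

    ct_target_* -- Ct of the gene of interest
    ct_ref_*    -- Ct of the reference (housekeeping) gene
    *_kd        -- shRNA-treated sample; *_ctrl -- control sample
    """
    d_ct_kd = ct_target_kd - ct_ref_kd        # normalize knockdown sample
    d_ct_ctrl = ct_target_ctrl - ct_ref_ctrl  # normalize control sample
    return 2.0 ** -(d_ct_kd - d_ct_ctrl)


def knockdown_percent(ct_target_kd, ct_ref_kd, ct_target_ctrl, ct_ref_ctrl):
    """Percent reduction of target expression relative to the control."""
    remaining = remaining_expression(ct_target_kd, ct_ref_kd,
                                     ct_target_ctrl, ct_ref_ctrl)
    return 100.0 * (1.0 - remaining)


# Hypothetical Ct values: the target Ct rises by 2 cycles after knockdown
# while the reference gene is unchanged.
print(knockdown_percent(26.0, 18.0, 24.0, 18.0))
```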

Knockdown efficiency can also be assessed by Western blot. However, Western blot is notoriously prone to false positive bands from non-specific antibody binding, which could mistakenly lead to the interpretation that there is no knockdown. Care must therefore be taken to make sure that the antibody used is indeed specific to the gene of interest.

The shRNA might only target a subset of transcript isoforms of your gene

When designing shRNA, we generally recommend those that can target as many transcript isoforms of the gene as possible, unless you are only interested in knocking down a particular isoform. VectorBuilder has created shRNA databases that contain optimized shRNAs for common species. If you design shRNA vectors on VectorBuilder, when you insert the shRNA component into the vector, you will have the option to search the target gene in our database. Then, you will see the detailed information of all the available shRNAs we designed for you, including a link to UCSC Genome Browser to view these shRNAs in the context of genomic sequence and all the transcript isoforms.

How is gRNA specificity score calculated?

VectorBuilder follows the algorithm developed in Feng Zhang’s lab to calculate specificity scores for gRNAs. Briefly, for a given gRNA intended to target an N(20)NGG sequence in a species, we search for all potential off-target sites in the genome of that species that have ≤3 mismatches with the target sequence. For each potential off-target site identified this way, a single off-target score is calculated. Scores for all the off-target sites are then used in aggregate to calculate the final specificity score of the gRNA, which is between 0 and 100, with higher values indicating greater targeting specificity.
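The first step of this procedure, finding N(20)NGG sites with at most 3 mismatches, can be sketched as below. This is a deliberately simplified illustration with hypothetical function names: it scans only the forward strand of a sequence held in memory, and it does not compute the per-site or aggregate scores, whose formulas are not given here.

```python
def count_mismatches(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def find_candidate_sites(guide, genome, max_mismatches=3):
    """Return (position, protospacer, n_mismatches) for each forward-strand
    N(20)NGG site whose 20-nt protospacer is within max_mismatches of guide.
    """
    guide = guide.upper()
    genome = genome.upper()
    if len(guide) != 20:
        raise ValueError("guide must be 20 nt long")
    hits = []
    for i in range(len(genome) - 22):
        site = genome[i:i + 23]        # 20-nt protospacer + 3-nt PAM
        if site[21:] != "GG":          # PAM must match NGG
            continue
        n = count_mismatches(guide, site[:20])
        if n <= max_mismatches:
            hits.append((i, site[:20], n))
    return hits
```

A real search would also scan the reverse strand and use an indexed genome rather than a linear scan; both are omitted here to keep the sketch short.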

Please note that specificity scores are only a rough guide. Actual targeting efficiency and specificity could depart from what the scores predict. gRNAs with low scores may still work well.

How do I use the piggyBac system in vitro and in vivo?

Our piggyBac system contains two vectors, both engineered as E. coli plasmids. One vector, referred to as the “helper plasmid”, encodes the transposase. The other vector, referred to as the “transposon plasmid”, contains two inverted terminal repeats (ITRs) bracketing the region to be transposed. The gene (or other DNA fragment) to be delivered into host cells is cloned into this region between the ITRs.

When both the helper and transposon plasmids are co-transfected into target cells, the transposase produced from the helper recognizes the two ITRs on the transposon, and inserts the flanked region (including the two ITRs) into the host genome. Insertion typically occurs at host chromosomal sites that contain the TTAA sequence, which is duplicated on the two flanks of the integrated fragment.
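The TTAA target preference and the resulting duplication can be illustrated with a toy string model. This is only a sketch with hypothetical function names; it does not model the ITRs, transposase kinetics, or any preference among TTAA sites. Because TTAA is its own reverse complement, scanning one strand finds every site.

```python
def ttaa_sites(sequence):
    """Return the 0-based start positions of every TTAA in sequence."""
    sequence = sequence.upper()
    return [i for i in range(len(sequence) - 3)
            if sequence[i:i + 4] == "TTAA"]


def integrate(host, site, cargo):
    """Insert cargo at the TTAA starting at `site`, duplicating the TTAA
    so that one copy flanks each side of the integrated fragment."""
    if host[site:site + 4].upper() != "TTAA":
        raise ValueError("integration site must be a TTAA")
    return host[:site + 4] + cargo + host[site:]


host = "GGTTAACCTTAATTAA"
print(ttaa_sites(host))
print(integrate("GGTTAACC", 2, "NNNN"))
```

In the second call, the single TTAA in the host ends up on both flanks of the inserted `NNNN` placeholder, mirroring the target-site duplication described above.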

In cell culture systems, using a piggyBac vector system is relatively straightforward, with serial transfection or co-transfection of the transposon and helper plasmids. Alternatively, electroporation may be used instead of transfection.

In live animals, however, plasmid delivery is more problematic, with low transfection and electroporation efficiencies making co-delivery of two plasmids inefficient. Some researchers have made use of transgenic animal lines expressing either the transposon or the transposase alone. Crossing these two lines, or transfecting a single plasmid into one of these transgenic lines, can induce transposition of the transposon in live animals (e.g. Ding et al. Cell. 122:473-83 and Horn et al. Genetics. 163:647-61).