CRISPR-Mediated Integration of Large Gene Cassettes Using AAV Donor Vectors
Correspondence
[email protected] (R.O.B.),
[email protected] (M.H.P.)
In Brief
Integration of transgenes into specific
sites of the genome of primary cells using
CRISPR/Cas9 and AAV donor vectors is
currently hampered by the limited
packaging capacity of AAV. Bak and
Porteus now report a method for efficient
integration of large transgenes that
exceed the capacity of a single AAV.
Highlights
• Two AAV donors can be designed to undergo sequential
homologous recombination (HR)
Resource
Cell Reports 20, 750–756, July 18, 2017 © 2017 The Author(s).
This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Figure 1. Sequential Homologous Recombination of Two AAV6 Donors with a Split Gene
Schematic overview of a two-step HR platform, in which a gene is split between two homologous
recombination (HR) donors (donors A and B), which undergo sequential HR. Donor A carries an
sgRNA target site (red box) immediately after “part A” of the transgene. This allows HR of
donor B using the same sgRNA, which seamlessly fuses “part B” of the transgene to “part A.”
Stuffer DNA (white box) after the sgRNA target site is used as a homology arm for donor B to
avoid re-using the right homology arm from donor A.

the two plasmid donors with the split GFP. We observed an average of 0.02% GFP+ cells when
only the two plasmid donors were delivered, while 0.45% of the cells stably expressed GFP
when the CRISPR components were co-electroporated (Figure S1D). We next tested the system
with the two donors delivered as AAV6 vectors immediately following electroporation. In
mock-electroporated cells receiving both AAV6 donors, transient and low expression of GFP
were observed at day 4 after electroporation and transduction, which was lost by day 16
(Figures 2C and S2A). In contrast, electroporation of the CRISPR components and transduction
with both AAV6 donors gave rise to a stable population of GFPhigh-expressing cells observed
at both day 4 and day 16 in about 40% of the cells (Figures 2C and S2A), indicative of
chromosomal expression of the GFP expression cassette, as previously observed (Dever et al.,
2016). As expected, the GFP+ population was highly enriched for BFP/mCherry double-positive
cells, confirming that integration of both donors is required for reconstitution of the GFP
expression cassette (Figure S2B). To confirm that HR did occur through a sequential process
that relied on the incorporated sgRNA target site in donor A, we made a new variant of donor
A with a mutation in the protospacer adjacent motif (PAM) site of the sgRNA. Using this
PAM-mutated donor A with donor B as before, we observed stable GFP expression in an average
of only 2.4% of cells (compared to 38.4% if the sgRNA site was preserved), confirming that
the majority of donor B integration events relied on CRISPR activity at the sgRNA target
site in donor A (Figure S2C).

We next tested this split GFP system in activated primary human T cells and CD34+ HSPCs with
the CRISPR system delivered by electroporation of precomplexed Cas9 ribonucleoprotein (RNP)
and the two AAV donors delivered immediately after. In cells electroporated with Cas9 RNP
but not receiving AAV6 donors, >90% indel (insertion or deletion) rates were measured,
confirming high activity of the Cas9 RNP system in both cell types (Figures S3A–S3C). In
mock-electroporated cells receiving only the AAV6 donor pair, the frequency of GFP+ cells
was less than 1.0% in both cell types on day 4 after transduction. In contrast,
electroporation with Cas9 RNP and transduction with both AAV6 donors gave rise to 8.5% and
9.5% GFP+ T cells and HSPCs, respectively (Figures 3A and S3D). In comparison, with a single
AAV6 donor vector encoding GFP, average targeting rates were 46% and 19% (Figures S3E and
S3F). Evaluation of cell death and apoptosis in T cells showed little impact of the
treatment on viability of the cells (Figure S4G). We next assessed whether the sequential HR
process was able to target early progenitor cells in the HSPC population capable of forming
colonies in methylcellulose. Sorted GFP+ cells formed erythroid, myeloid, and mixed colonies
at ratios comparable to those of mock-electroporated cells, but an overall lower colony
formation frequency indicated a lower frequency of progenitor cells in the GFP+ population
than in the mock-electroporated population (Figures 3B, upper panel, and S3H). However, the
extent of this decrease was donor dependent. In-Out PCRs, in which one primer is located in
the targeted genomic locus outside the region of the homology arm and the other primer is
located in the donor DNA, were used to confirm on-target integration (Figure S3I). On
genomic DNA derived from GFP+ colonies, we confirmed on-target integration of both donors A
and B in all colonies analyzed (41 colonies total), and sequencing confirmed seamless HR
(Figures 3B, lower panel, and S3I).

Only a small fraction of the CD34+ HSPCs are stem cells that are capable of long-term
repopulation. To examine whether the two-step HR process occurred in long-term repopulating
stem cells, we transplanted sorted GFP+ HSPCs into three irradiated immunodeficient
non-obese diabetic (NOD) scid gamma (NSG) mice. 16 weeks after transplant, all mice showed
human chimerism in the bone marrow, with an average of 91% GFP+ cells in the human
population (Figure 3C). Collectively, these data show that a split rAAV donor system can
efficiently undergo sequential HR stimulated by the CRISPR system in the K562 cell line, in
primary human T cells, and in HSPCs.

The epidermal growth factor receptor gene (EGFR), with an open reading frame of 3.6 kb, can
modulate cell migration and proliferation and has been shown to play roles in HSPC expansion
and G-CSF-induced HSPC mobilization (Takahashi et al., 1998; Ryan et al., 2010). We next
applied the methodology to try and integrate EGFR into the CCR5 locus. With the EF1α
promoter, the woodchuck hepatitis virus posttranscriptional regulatory element (WPRE), the
bovine growth hormone (BGH) polyadenylation signal, and two 400-bp homology arms, such
targeting vector would be 6.5 kb, greatly exceeding the packaging capacity of AAV. The EGFR
gene was split between two donors as before, but to avoid introducing stuffer DNA as in the
split GFP system, the WPRE and BGH poly(A) were introduced along EGFR part A after the sgRNA
target site, so that part of WPRE could serve as a homology arm for donor B (Figure 4A).
Using this split AAV6 donor pair in T cells and HSPCs, donor-only controls yielded less than
1.0% EGFR+ cells in both cell types, but with Cas9 RNP electroporation, we detected an
average of 9.8% and 9.1% EGFR+ cells in the two cell types, respectively, with similar rates
of EGFR+ cells in the CD4 and CD8 sub-populations (Figures 4B and S4A). Quantification of
indel rates showed that alleles that had not undergone HR mainly
Figure 2. Sequential HR Targeting a GFP Gene Split between Two AAV Donor Vectors to the CCR5 Locus in K562 Cells
(A) Overview of the donor design for splitting GFP between two AAV donors. The endogenous CCR5 target site is shown with the PAM in red and the 20-nt target
site in purple. The Cas9 cut site is between nucleotides 17 and 18 of the target sequence. Donor A is designed with 2 × 400-bp homology arms (LHA and RHA) that
are split at the CCR5 cut site. The homology arms flank a PGK-BFP expression cassette, part A of the GFP expression cassette (SFFV-GFP (A)), a sgRNA target
site for the same CCR5 sgRNA, and stuffer DNA (to serve as homology arm for donor B to avoid having to re-use the 400-bp CCR5 right homology arm). After HR
of donor A, donor B is designed to seamlessly integrate the rest of GFP using the sgRNA target site present in donor A. Donor B has an LHA homologous to GFP
(begins at amino acid 57 of GFP) and an RHA consisting of part of the sgRNA target site and the stuffer DNA, and it carries an EF1α-mCherry expression cassette.
Neither donor expresses GFP on its own (Figures S1A and S1B). SV40 pA, simian virus 40 polyadenylation signal; TK pA, thymidine kinase polyadenylation signal;
SSC, side scatter.
(B) GFP is split at a PAM site for the CCR5 sgRNA. Codons are depicted above the nucleotides. Donor A carries LHA and RHA, which are split directly at the Cas9
cut site in CCR5 as depicted in (A). Donor A carries a truncated GFP sequence that ends after the PAM site identified in the GFP gene. Directly after the PAM, the
20-nt target site for the same CCR5 sgRNA is introduced. Note that the last codon (Pro) of the truncated GFP sequence is maintained with the fusion to the sgRNA
target sequence. Thus, the LHA of donor B ends right after this proline codon. The right homology arm begins immediately after the Cas9 cut site. The two
homology arms flank the remaining part of GFP and an mCherry expression cassette—see (A)—that, upon seamless HR of donor B, will reconstitute a functional
GFP open reading frame.
(C) K562 cells were mock-electroporated or electroporated with Cas9 mRNA and CCR5 synthetic sgRNAs (CRISPR) followed by transduction with the split GFP
AAV6 donor pair. GFP expression was measured either by total percent GFP+ cells after 16 days or percent GFPhigh cells 8 days after transduction (see
Figure S2A). Left panel: representative FACS plots. Right panel: frequencies of cells stably expressing GFP; n = 7. Error bars represent SD.
See also Figures S1 and S2.
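The target-site geometry described in (A) can be made concrete with a short sketch. It assumes an SpCas9-style site (a 20-nt protospacer followed by an NGG PAM, cut between nucleotides 17 and 18, as stated in the legend) and takes the 400 bp on either side of the cut as the left and right homology arms. The protospacer is assumed to be the first 20 nt of the CCR5 sgRNA listed in the Experimental Procedures, written as DNA. The reference sequence, the `TGG` PAM, and the function names are hypothetical placeholders, not the real CCR5 locus or the authors' design pipeline, and only the forward strand is searched.

```python
import random

PROTOSPACER = "GCAGCATAGTGAGCCCAGAA"  # assumed: first 20 nt of the CCR5 sgRNA, as DNA
CUT_OFFSET = 17    # Cas9 cuts between nucleotides 17 and 18 of the target
ARM_LENGTH = 400   # homology arm length used for the donors


def find_cut_site(reference, protospacer=PROTOSPACER):
    """Return the 0-based index of the first base 3' of the Cas9 cut."""
    start = reference.find(protospacer)
    if start == -1:
        raise ValueError("protospacer not found on the forward strand")
    pam_start = start + len(protospacer)
    pam = reference[pam_start:pam_start + 3]
    if len(pam) != 3 or pam[1:] != "GG":
        raise ValueError(f"no NGG PAM after the protospacer (found {pam!r})")
    return start + CUT_OFFSET


def homology_arms(reference, cut, arm_length=ARM_LENGTH):
    """Return (LHA, RHA): the arm_length bases on each side of the cut."""
    if cut < arm_length or cut + arm_length > len(reference):
        raise ValueError("reference too short for the requested arm length")
    return reference[cut - arm_length:cut], reference[cut:cut + arm_length]


def random_dna(rng, n):
    """Synthetic filler sequence; stands in for real genomic flanks."""
    return "".join(rng.choice("ACGT") for _ in range(n))


rng = random.Random(0)
# Placeholder locus: 500 nt flank + target + hypothetical PAM + 500 nt flank.
toy_locus = random_dna(rng, 500) + PROTOSPACER + "TGG" + random_dna(rng, 500)

cut = find_cut_site(toy_locus)
lha, rha = homology_arms(toy_locus, cut)
print(f"cut index: {cut}, LHA: {len(lha)} bp, RHA: {len(rha)} bp")
```

Because both arms meet exactly at the cut, the LHA ends with the first 17 nt of the target and the RHA begins with the last 3 nt plus the PAM. A real design would also need to handle targets on the reverse strand.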
harbored indels (Figure S4B). Minimal toxicity was observed in T cells, while modest
toxicity was observed in CD34+ HSPCs, which was mainly caused by the high MOI used for AAV6
transduction (Figures S4C and S4D). Colony-forming unit assays on the EGFR+ HSPC population
showed formation of erythroid, myeloid, and mixed colonies at a comparable ratio to
mock-electroporated cells and a small but statistically non-significant decrease in overall
formation frequency (Figure S4E). Finally, In-Out PCRs on colony-derived genomic DNA
confirmed on-target integration in all analyzed colonies (20 colonies total), and sequencing
confirmed seamless integration by HR (Figure 4C).

DISCUSSION

Our findings establish that efficient iterative HR after simultaneous delivery of the genome
editing components can occur in human cells, which may enable complex genome engineering
through intracellular genomic DNA assembly. Importantly, the system is not only highly
efficient in human cancer cell lines but also very efficient in primary human blood cells,
including primary T cells and CD34+ HSPCs. A key aspect of the system is that it does not
involve having to serially transfect and transduce cells but, instead, can be performed in a
single step in which the intracellular HR machinery naturally iterates the process. This is
particularly important when working with stem cells like CD34+ HSPCs that do not tolerate
repeated genetic manipulations well and differentiate during extended culturing. While other
viral vectors with larger carrying capacity have been used to deliver HR templates, e.g.,
gutless adenoviral vectors and integration-defective lentiviral vectors (IDLVs) (Knipping et
al., 2017; Hoban et al., 2016; Holkers et al., 2014; Zhang et al., 2014a, 2014b; Genovese et
al., 2014), AAV is currently the vector platform of choice for gene editing in primary T
cells and HSPCs, since it supports high rates of HR (Sather et al., 2015). However, in other
cell types, different viral vectors may be superior in donor template delivery. Nonetheless,
since rates of HR decrease with increasing insert size, a sequential two-step HR approach
may prove to be equivalent to, or even more efficient than, a single-step integration of a
large insert (Perez et al., 2005; Kung et al., 2013). Of note, the principle of sequential
HR should be applicable to other viral vector systems as well.
Existing methods for expression of long transgenes split between two AAV vectors rely either
on an approach where two overlapping vectors after transduction recombine or anneal before
second-strand synthesis to produce the full-length large expression cassette or on an
approach where two vectors are designed with splice donor and acceptor in each vector so
that, upon intermolecular head-to-tail concatemerization, the full-length mRNA transcript is
produced. Both these approaches rely on interaction between the two donor vectors and the
production of a full-length episomal expression vector. Our approach differs mechanistically
from these, as it relies on two sequential events of HR between the donor vectors and the
genome, thus reestablishing the full-length expression cassette upon integration into the
genome. Interestingly, in the K562 cell line, we do observe episomal reconstitution of the
GFP expression cassette (Figure S2A). We hypothesize that this episomal expression could be
generated by annealing of the left homology arm of donor B to the complementary sequence in
donor A, which would prime upstream second-strand synthesis and regenerate the GFP cassette
(see Figure 1). Mutating the PAM of the sgRNA target site in donor A, we did observe low
frequencies of GFP reconstitution above background, and we cannot rule out that episomal DNA
forms may be generated that serve as donor template for a single-step targeted integration
of the full expression cassette. However, we find it more likely that these events are due
to on-target integration of donor B through nuclease-independent HR (Barzel et al., 2015).
The sequential HR platform uses the same sgRNA for both HR events, which simplifies the
design and avoids the use of different sgRNAs, which would presumably double the required
Cas9 RNP dose and potentially lead to higher rates of off-target activity and
translocations. One rate-limiting step of the procedure is that the sgRNA target site may be
mutated by non-homologous end joining (NHEJ). When this happens after the first HR event, it
can prevent the second HR step from occurring, thereby leaving a truncated but functional
expression cassette. To avoid production of a truncated protein, donor A may be designed so
that the truncated mRNA transcript does not contain a stop codon in any reading frame
downstream of the sgRNA target site, so that the transcripts undergo nonstop decay (van Hoof
et al., 2002; Frischmeyer et al., 2002), and it may be designed with microRNA (miRNA)
binding sites downstream of the sgRNA target site for rapid RNAi-mediated degradation of the
transcripts (Brown et al., 2006). Alternatively, a reporter gene may be included for
selection of cells that have undergone both HR steps, or the order of the HR steps may be
reversed so the promoter is integrated at the second step.

In conclusion, we demonstrate that the HR machinery in primary human blood cells is robust
enough to facilitate sequential HR for integration of gene expression cassettes that exceed
the packaging capacity of AAV. This is desirable for therapeutic genome editing that
involves integration of large transgenes, in settings where a multi-cistronic cassette is
introduced or in the setting where two or more full transgene cassettes
(promoter-transgene-poly(A) signal) need to be integrated. Each of these
Figure 4. Sequential HR of Two AAV6 Donors with a Split EGFR Gene in Human T Cells and CD34+ Hematopoietic Stem and Progenitor Cells
(A) Schematic overview of a two-step HR platform integrating an EGFR expression cassette into the CCR5 gene. Donor A carries all elements of the expression
cassette, but only “part A” of the EGFR coding sequence followed by the same sgRNA target site (red box) used for HR of Donor A. “Part B” is introduced by HR
using this sgRNA target site and is fused seamlessly with “part A,” thereby constituting a full EGFR open reading frame.
(B) Primary human T cells and CD34+ HSPCs were mock-electroporated or electroporated with Cas9 protein precomplexed with CCR5 sgRNA (CRISPR) followed by
transduction with the split EGFR AAV6 donor pair. Left panel: representative FACS plots showing EGFR expression 4 days post-transduction. Right panel: frequencies
of EGFR+ cells measured 4 days post-transduction; n = 14 (T cells, all from different buffy coat donors), and n = 9 (HSPCs, all from different cord-blood donors).
(C) HSPCs were treated as in (B), and at day 4 post-transduction, EGFR+ cells were single-cell sorted into 96-well plates containing methylcellulose, and In-Out
PCR was performed on genomic DNA from progenitor-derived clones 14 days after seeding. Representative gel image shows targeted integration of donors A
and B, confirmed by 5′ end and 3′ end PCR, respectively, in 6 out of 20 total colonies. Input control is PCR amplification of part of the HBB gene.
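The design rule raised in the Discussion, that the transcript region downstream of the sgRNA target site in donor A should contain no stop codon in any reading frame so that truncated transcripts undergo nonstop decay, lends itself to a simple computational check. The following is a minimal sketch under stated assumptions: the two candidate sequences are made-up placeholders rather than sequences from the donors, the helper names are hypothetical, and only the three forward frames of the sense strand are scanned using the standard stop codons.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code, DNA alphabet


def stop_codons_by_frame(seq):
    """Map each forward reading frame (0, 1, 2) to the start indices of its stop codons."""
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA input
    hits = {0: [], 1: [], 2: []}
    for frame in range(3):
        for i in range(frame, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                hits[frame].append(i)
    return hits


def is_stop_free(seq):
    """True if no forward reading frame contains a stop codon."""
    return not any(stop_codons_by_frame(seq).values())


# Hypothetical candidate sequences for the region after the sgRNA target site.
candidate_ok = "GCCGCCACCGGCGGCAGCGCC"   # contains no T, so no stop codon is possible
candidate_bad = "GCCTGACCGGCGTAAGC"      # TGA and TAA both fall in frame 0

for name, seq in [("candidate_ok", candidate_ok), ("candidate_bad", candidate_bad)]:
    print(name, "stop-free:", is_stop_free(seq), stop_codons_by_frame(seq))
```

Reporting the hits per frame, rather than a single yes/no, shows which codons would need to be recoded if a candidate fails the check.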
examples has features that will enable specific therapeutic and research applications in the
future.

EXPERIMENTAL PROCEDURES

AAV Vector Production
The backbone for all AAV vector plasmids was the pAAV-MCS plasmid (Agilent Technologies),
which contains inverted terminal repeats (ITRs) from AAV serotype 2. All homology arms used
were 400 bp each. Plasmids were produced using standard molecular cloning techniques.
Plasmid pDGM6 (a kind gift from David Russell, University of Washington) was used in the
virus production, which contains the AAV6 cap genes, AAV2 rep genes, and adenovirus helper
genes. AAV6 vectors were produced by iodixanol gradient purification as described in Dever
et al. (2016). Vectors were titered using qPCR to measure the number of vector genomes as
described (Aurnhammer et al., 2012).

Cell Isolation and Culture
CD34+ HSPCs from cord blood were acquired from donors under informed consent via the Binns
Program for Cord Blood Research at Stanford University. CD34+ cells were purified using the
CD34+ Microbead Kit Ultrapure (Miltenyi Biotec) according to the manufacturer’s protocol.
All CD34+ HSPCs were used fresh without freezing and cultured at 37°C, 5% CO2, and 5% O2 in
StemSpan SFEM II (STEMCELL Technologies) supplemented with stem cell factor (SCF)
(100 ng/mL), thrombopoietin (TPO) (100 ng/mL), Flt3 ligand (100 ng/mL), interleukin (IL)-6
(100 ng/mL), StemRegenin 1 (0.75 µM), and UM171 (35 nM). Primary human CD3+ T cells were
isolated from buffy coats obtained from the Stanford University School of Medicine Blood
Center using a human Pan T Cell Isolation Kit (Miltenyi Biotec) according to the
manufacturer’s instructions. CD3+ cells were cultured at 37°C, 5% CO2, and 20% O2 in X-VIVO
15 (Lonza) supplemented with 5% human serum (Sigma-Aldrich), 100 IU/mL human recombinant
IL-2 (PeproTech), and 10 ng/mL human recombinant IL-7 (BD Biosciences). Before
electroporation, T cells were activated for 3 days with immobilized anti-CD3 antibodies
(clone: OKT3, Tonbo Biosciences) and soluble anti-CD28 antibodies (clone: CD28.2, Tonbo
Biosciences). K562 cells were purchased from the ATCC and cultured at 37°C, 5% CO2, and 20%
O2 in RPMI 1640 (HyClone) supplemented with 10% bovine growth serum, 100 µg/mL streptomycin,
100 U/mL penicillin, and 2 mM L-glutamine.

Electroporation and Transduction of Cells
The CCR5 synthetic sgRNAs used were purchased HPLC (high-performance liquid chromatography)
purified from TriLink Biotechnologies and contained chemically modified nucleotides
(2′-O-methyl 3′-phosphorothioate) at the three terminal positions at both the 5′ and 3′
ends. The sequence of the CCR5 sgRNA with the modified nucleotides underlined is as follows:
5′-GCAGCAUAGUGAGCCCAGAAGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUU-3′.
Cas9 mRNA containing 5-methylcytidines and pseudouridines was purchased from TriLink
Biotechnologies. Cas9 protein was purchased from IDT. Cas9 protein and sgRNAs were complexed
by incubation at a molar ratio of 1:2.5 at 25°C for 10 min immediately prior to
electroporation. CD34+ HSPCs were electroporated 2 days after isolation, and T cells were
electroporated 3 days after stimulation. All electroporations were performed on the Lonza
Nucleofector 2b (program U-014 for HSPCs and T cells; program T-016 for K562 cells). For
CD34+ HSPCs and T cells, either the buffer from the Human T Cell Nucleofection Kit
(VPA-1002, Lonza) or the 1M electroporation buffer described in Chicaybam et al. (2013) was
used. The following conditions were used for electroporation: 5–10 × 10^6 cells/mL,
300 µg/mL Cas9 protein complexed with sgRNA at a 1:2.5 molar ratio. For K562
electroporations, an electroporation buffer containing 100 mM KH2PO4, 15 mM NaHCO3, 12 mM
MgCl2 × 6H2O, 8 mM ATP, and 2 mM glucose (pH 7.4) was used with 50 µg/mL Cas9 mRNA and
50 µg/mL sgRNA. For experiments with plasmid donors, 2.5 µg of each plasmid was used. For
experiments using AAV6 donors, directly following electroporation, cells were incubated for
15 min at 37°C, after which AAV6 was added at 20% of the final culture volume (MOI was
typically 2–5 × 10^5 vector genomes (vg) per cell per AAV donor, unless otherwise stated).

Flow Cytometry
Expression of fluorescent proteins or cell-surface markers was analyzed by flow cytometry on
a FACSAria II SORP (BD Biosciences) or a CytoFLEX Flow Cytometer (Beckman Coulter). The
following antibodies were used: anti-EGFR (phycoerythrin [PE] or allophycocyanin [APC],
clone: AY13; BioLegend), anti-CCR5 (APC, clone: 2D7/CCR5; BD Biosciences), anti-CD3 (BV605,
clone: UCHT1; BioLegend), anti-CD4 (PE-Cy7, clone: RPA-T4; Tonbo Biosciences), anti-CD8
(VF450, clone: RPA-T8; Tonbo Biosciences). The blue or violet LIVE/DEAD Fixable Dead Cell
Stain Kit (Life Technologies) or the Ghost Dye Red 780 (Tonbo Biosciences) was used to
discriminate live and dead cells according to the manufacturer’s instructions. For
discrimination of apoptotic cells, PE- or APC-labeled annexin V (BioLegend) was used
following the manufacturer’s instructions.

Methylcellulose Colony-Forming Unit Assay and PCR Detection of Integration
The colony-forming unit (CFU) assay was performed by fluorescence-activated cell sorting
(FACS) of single cells into 96-well plates containing MethoCult Optimum or MethoCult
Enriched (STEMCELL Technologies) 4 days