Wellseq
Kun Yin, Meijuan Zhao, Yiling Xu, Shanqing Huang, Dianyi Liang, He Dong, Ye Guo,
Li Lin, Zhi Zhu, Chaoyong Yang
State Key Laboratory of Physical Chemistry of Solid Surfaces,
The MOE Key Laboratory of Spectrochemical Analysis & Instrumentation
Key Laboratory for Chemical Biology of Fujian Province
Collaborative Innovation Center of Chemistry for Energy Materials, Department of
Chemical Biology, College of Chemistry and Chemical Engineering
Xiamen University
Xiamen 361005, P. R. China
Email: [email protected], [email protected]

Zhong Zheng, Jia Song, Junhua Zheng, Chaoyong Yang
Institute of Molecular Medicine
State Key Laboratory of Oncogenes and Related Genes
Renji Hospital, School of Medicine
Shanghai Jiao Tong University
Shanghai 200120, China
Email: [email protected]

Huiming Zhang, Chaoyong Yang
Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian
Province (IKKEM)
Xiamen 361005, P. R. China

Abstract
High-throughput single-cell RNA sequencing (scRNA-seq) is recognized as a
powerful technology for disentangling the heterogeneity of cellular states. However,
the Poisson-dependent cell capture and low sensitivity of scRNA-seq methods pose
challenges for throughput and for samples with low RNA content. Herein, to address
these challenges, we developed Well-Paired-Seq2 (WPS2) based on size-exclusion
and locally quasi-static hydrodynamic principles to realize high efficiency of cell
utilization, single cell/bead pairing, and cell-free RNA removal. WPS2 exploits
the molecular crowding effect, tailing activity enhancement in reverse transcription, and
a homogeneous enzymatic reaction in the initial bead-based amplification to detect a
median of 3116 genes and 8447 transcripts per cell. With an average of ~20,000 reads
per cell, WPS2 detected 1420 more genes and 4864 more transcripts than our previous
Well-Paired-Seq. Using WPS2, we overcame the Poisson limit for the capture of both
cells and beads and accurately characterized transcriptomes of low RNA-content
single cells and nuclei with high sensitivity. WPS2 was further applied to
comprehensively profile transcriptomes from frozen clinical samples. We found that
clear cell renal cell carcinoma (ccRCC) has a complex microenvironment, and that
chromophobe renal cell carcinoma (chRCC) exhibits abundant copy number
variations (CNVs). In addition, metanephric adenoma (MA) was characterized at
the single-cell level for the first time and some potentially specific markers were revealed.
With the advantages of high sensitivity, high throughput, and high fidelity, we
anticipate that WPS2 will be broadly applicable in basic and clinical research.

Introduction
Complex cellular systems contain diverse types of cells, and each type can switch
among different biological states.1 To understand how complex multicellular systems
work, it is essential to conduct research on the functionalities and responses of each
cell type. Over the past decade, single-cell RNA sequencing (scRNA-seq) has become
a powerful tool for uncovering transcriptional profiles and defining cell identities in
biological samples. Tang et al. first reported the scRNA-seq method in 2009.2 Since
then, many scRNA-seq technologies, such as Smart-seq (3), Smart-seq2 (4),
Smart-seq3 (5), CEL-Seq (6), CEL-Seq2 (7), Quartz-Seq (8), and Quartz-Seq2 (9),
have been developed to improve performance, including the yield of the cDNA
library, the coverage of the full-length transcriptome, and the sensitivity of gene
detection. Although these methods have achieved superior performance, the
single-cell libraries still need to be constructed individually, which limits analysis to
tens to hundreds of single-cell transcriptomes at a time. The low throughput and high
cost of these scRNA-seq methods make it difficult to comprehensively dissect the
broadly varied cell types and states of complex cellular systems.10
Throughput is an important feature for the deconvolution of cell type and state in
complex multicellular systems.11 Recently, several high-throughput bead-based
methods have been reported.10, 12, 13 The throughput has been successfully expanded
from a handful of cells to thousands of cells per assay. This increase in cell throughput
has promoted efforts to profile the atlas of whole organs or entire organisms.13, 14
However, Poisson-dependent single-cell isolation leads to many unused reactors,
which limits the throughput of scRNA-seq due to reagent costs and the physical
constraints of devices.11, 15, 16 Recently, an integrated dielectrophoresis (DEP)-trapping
nanowell-transfer (dTNT-seq) platform was reported for high-throughput scRNA-seq,
overcoming Poisson-dependent single cell/bead isolation.16 However, dTNT-seq
requires DEP assistance and precise control of the flow rate for single-cell isolation,
leading to complicated chip fabrication and long single-cell isolation times. This may
result in non-ideal cell viability during single-cell isolation and limit the scalability
of throughput.
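To make the Poisson argument concrete, the following minimal Python sketch computes the fractions of empty, singly occupied, and multiply occupied reactors when cells (or beads) are loaded at random with a mean occupancy `lam`. The function names and the assumption that cells and beads load independently are illustrative choices, not part of the original work.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability that a reactor receives exactly k objects at mean occupancy lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def loading_stats(lam: float) -> dict:
    """Fractions of empty, singly occupied and multiply occupied reactors."""
    empty = poisson_pmf(0, lam)
    single = poisson_pmf(1, lam)
    return {"empty": empty, "single": single, "multiple": 1.0 - empty - single}


def paired_fraction(lam_cell: float, lam_bead: float) -> float:
    """Fraction of reactors with exactly one cell AND exactly one bead.

    Assumes (for illustration only) that cells and beads load independently.
    """
    return poisson_pmf(1, lam_cell) * poisson_pmf(1, lam_bead)


if __name__ == "__main__":
    for lam in (0.1, 1.0):
        print(lam, loading_stats(lam))
    print("paired at lam = 1 for both:", paired_fraction(1.0, 1.0))
```

At a dilute loading of `lam = 0.1`, which keeps multiplets near 0.5%, only about 9% of reactors hold a single cell and about 90% stay empty. Even at `lam = 1`, where the single-occupancy fraction peaks at about 37%, random co-loading of cells and beads pairs only about 13.5% of reactors.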
To address these limitations, we developed a size-exclusion and locally quasi-static
hydrodynamic microwell-based Well-Paired-Seq (WPS) platform,15 which shows
excellent efficiency of single-cell isolation within a short time and without peripheral
specialized instrumentation. Compared to other well-based methods, WPS
achieved ~80% single-cell/bead pairing, which greatly enhanced the isolation
density of single cells and the throughput. Using WPS, we have successfully realized
~100,000 single-cell analyses per flow channel. Moreover, the presence of cell-free
RNA and aggregated cells resulting from tissue digestion usually poses a significant
challenge in scRNA-seq. WPS has proven highly effective in
removing these undesirable interfering substances, resulting in significantly reduced
background noise and a lower risk of generating inaccurate biological findings.17
However, WPS still suffered from low sensitivity of gene detection. This inefficiency
limited the ability to characterize essential but typically sparsely
expressed genes, such as transcription factors, signaling molecules, and affinity
receptors, and to reveal distinct cell states, particularly for low-RNA-content units
such as immune cells and nuclei.
In addition, the current requirement of harsh enzymatic dissociation for preparing
single-cell suspensions from fresh tissues poses a significant obstacle for handling
clinical materials, frozen samples, and tissues that cannot be readily dissociated.18
Furthermore, enzymatic treatment of single cells often causes problems,
such as enzyme-induced loss of RNA complexity, skewed proportions of
dissociated cell types, and induced transcriptional stress responses.19 As
a complementary technology to scRNA-seq, single-nucleus RNA sequencing
(snRNA-seq) can handle complex tissues that cannot be dissociated, thus providing
access to archived samples.18-20 To date, the development of a compatible method for
single-cell and single-nucleus transcriptome profiling with high sensitivity,
throughput, and fidelity remains challenging.
Herein, we developed an optimized method, Well-Paired-Seq2 (WPS2), to
overcome the limitation of low sensitivity, while being fully compatible with both
single-cell and single-nucleus RNA-seq. By adopting the molecular crowding effect,
tailing activity enhancement in reverse transcription, and a homogeneous enzymatic
reaction in the initial bead-based amplification, the sensitivity of WPS2 is greatly
enhanced. After optimization, we were able to detect 3116 genes and 8447 transcripts
in NIH 3T3 cells at an average of ~20,000 reads per cell, an improvement of
1420 genes and 4864 transcripts compared to our previous WPS. To further validate
the performance of WPS2, we applied it to low-RNA-content samples, such as
immune cells and nuclei. Using mouse spleen tissue, we detected an average of 1345
genes and 3180 transcripts at an average of ~23,000 reads per cell, which is
680 more genes and 1969 more transcripts than WPS. Using nuclei from NIH
3T3 cells and frozen kidney tissue, we validated the compatibility of WPS2 with
snRNA-seq. The successful application to snRNA-seq enables access to massive
collections of archived samples.
Finally, we applied our method to analyze renal tumor samples (clear cell renal cell
carcinoma (ccRCC), chromophobe renal cell carcinoma (chRCC), and metanephric
adenoma (MA)), and revealed a comprehensive profile of multiple pathologic
transcriptomes. We found that ccRCC cells have a complex microenvironment and that
their highly expressed genes are associated with immune response, hypoxia, angiogenesis,
etc., while chRCC has more copy number variations (CNVs) in its genome, indicating
genomic instability. We also characterized MA, a rare benign tumor that
accounts for ~0.2% of adult renal epithelial neoplasms,21 at the single-cell level for
the first time. MA is often misdiagnosed due to a lack of specificity in clinical
presentation and imaging features. Here, we identified 10 candidate specific markers
of MA. We expect that WPS2, a high-sensitivity, high-throughput, and high-fidelity
platform compatible with scRNA-seq and snRNA-seq, will have broad application in cell
biology, precision medicine, and reproductive biology.
Results
Workflow of Well-Paired-Seq2
To enable high-throughput and highly sensitive sequencing of single cells or single
nuclei, we developed WPS2, which was systematically optimized based on our
previously reported size-exclusion and locally quasi-static hydrodynamic
microwell-based single-cell RNA sequencing platform (WPS).15 The workflow of
WPS2 is depicted in Figure 1 and includes the following major steps: (1) Cells
and barcoded beads are successively trapped in the cell-capture-wells and
bead-capture-wells to realize single-cell/bead pairing; before loading barcoded beads,
cell-free RNA can be removed with washing buffer. Then, cells are lysed by the
surfactant molecules that dissolve from the settled surfactant aggregates in the sealing oil,
and the released mRNA molecules are captured by the barcoded beads carrying oligo(dT)
sequences. (2) The captured RNA molecules on the barcoded beads are reverse
transcribed using Maxima RTase with the addition of GTP and PEG. After reverse
transcription, the single-strand cDNAs on the barcoded beads are used as templates
for second-strand cDNA synthesis. (3) The double-strand cDNAs are amplified by
PCR. (4) The cDNA products are used for library preparation using Tn5 transposase.
After sequencing, the transcriptome of single cells is inferred from the digital
expression matrices and used for downstream analysis.

Figure 1. Workflow of Well-Paired-Seq2: (1) Pairing cells and barcoded beads in the size-exclusion
and locally quasi-static hydrodynamic dual wells; (2) Reverse transcription of the captured mRNA
molecules on the barcoded beads, followed by second strand synthesis; (3) Denaturing the
double-strand cDNA molecules and amplification by PCR; (4) Construction of the library of the
amplified cDNA products for sequencing.

However, this process often results in amplification bias, where low-copy or
difficult-to-amplify cDNA molecules are undetectable, limiting the sensitivity of gene
detection.24 This is an especially critical problem in bead-based scRNA-seq methods.
In bead-based scRNA-seq, the first-strand cDNA molecules are
covalently attached to the barcoded beads after reverse transcription. Hence, the initial
amplification is performed at the solid-liquid interface (a heterogeneous reaction), which
would impede the efficiency of second-strand synthesis (Figure 3A (i)). Due to
the inefficiency of the reaction at the solid-liquid interface, we assumed that only a part
of the second strands are synthesized in the initial cycles. In subsequent
cycles, however, the synthesized second strands would be released in the denaturation step, and
the bias would be greatly amplified because of the different efficiencies of strand
synthesis at the solid-liquid interface and in the liquid phase.
To overcome the amplification bias, we designed a second-strand synthesis method
(Figure 3A(ii)). Before PCR, a long time (60 min) is allowed for second-strand
synthesis to ensure that as many second-strand cDNA molecules as possible are
synthesized. Therefore, in the first cycle of amplification, the second-strand cDNA
molecules are simultaneously released into the liquid during denaturation and amplified
in the liquid phase, avoiding the variance in amplification efficiency between the
solid-liquid interface and the liquid phase.
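The size of this bias can be illustrated with a deliberately simple toy model, which is not taken from the paper: assume that a template whose second strand first reaches the liquid phase at cycle `first_cycle` is amplified exponentially from that cycle on, and ignore any further on-bead synthesis. The 14 total cycles correspond to the PCR program given in Methods (4 + 10 cycles); the per-cycle efficiency and the function names are hypothetical.

```python
def final_copies(first_cycle: int, total_cycles: int = 14, efficiency: float = 1.0) -> float:
    """Copies derived from one template in a toy exponential model.

    first_cycle: 1-based PCR cycle in which the second strand first takes part
                 in liquid-phase amplification (assumption of the toy model).
    efficiency:  per-cycle liquid-phase efficiency, 1.0 meaning perfect doubling.
    """
    if not 1 <= first_cycle <= total_cycles:
        raise ValueError("first_cycle must lie between 1 and total_cycles")
    liquid_cycles = total_cycles - first_cycle + 1
    return (1.0 + efficiency) ** liquid_cycles


def representation_bias(late_cycle: int, total_cycles: int = 14, efficiency: float = 1.0) -> float:
    """Fold over-representation of a template released at cycle 1 versus one
    released only at late_cycle."""
    return final_copies(1, total_cycles, efficiency) / final_copies(late_cycle, total_cycles, efficiency)


if __name__ == "__main__":
    for cycle in (1, 2, 4):
        print(cycle, representation_bias(cycle))
```

Under these assumptions, a molecule whose second strand is completed three cycles late ends up 8-fold under-represented, whereas completing all second strands before PCR starts, as in the dedicated 60 min step, puts every template at `first_cycle = 1`.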
To assess the sensitivity after adding the second-strand synthesis step, we compared
the gene detection of the following three methods: WPS, WPS with RGP buffer
(WPS+), and WPS with the RGP buffer and the second-strand synthesis (WPS2).
Figure 3B shows that the cDNA yields with WPS2 are significantly higher than the
yields with WPS (2-fold) and WPS+ (1.4-fold), suggesting a high and uniform
efficiency of amplification using WPS2. As expected, the numbers of genes and
transcripts detected by WPS2 are significantly higher than those of WPS and WPS+ (Figure 3C,
D). At an average of 20,000 reads per cell, a median of 3112 genes and 8485
transcripts were detected among NIH 3T3 cells in WPS2. Compared to WPS, WPS2
identified 1416 more genes and 4902 more transcripts (Figure S4A, B). Zooming in on
four genes (two housekeeping genes and two high variation genes in NIH 3T3), we
found that WPS2 detected more molecules per cell than WPS and WPS+ (Figure 3E,
F, Figure S4C, D). Together, these results demonstrated that WPS2 significantly
improved the sensitivity after systematic optimization of the workflow, especially the
steps of RT and second-strand synthesis.

Figure 3. Performance validation of WPS2. A) Diagram of amplification in WPS and WPS2. B) Box
plot showing the improvement of cDNA yields relative to WPS. C, D) Median gene and UMI
variations along with different mean reads per cell. E, F) Violin plots showing the expression level of
given genes (housekeeping genes: Gapdh; high variation genes: Slc25a3) detected by WPS, WPS+, and
WPS2.

Figure 5. WPS2 for high-throughput single-nucleus RNA-seq. A) Scatter plot showing the gene
numbers versus read numbers of each individual NIH 3T3 cell. B) Scatter plot showing Pearson
correlation between NIH 3T3 nuclei (x axis) and cells (y axis) by WPS2. Log (TP10k + 1) corresponds
to log-transformed UMIs per 10k. C) Percent reads mapped to the exons, introns, and intergenic
regions on mouse genome for cells (green bars) from mouse fresh kidney and nuclei (red bars) from
mouse frozen kidney. D) Visualization by UMAP plot of clustering of 13 cell-type expression profiles
from mouse frozen kidney. TAL (thick ascending limb of Henle's loop cell), PT_S1 (proximal tubule
segment 1), PT_S2 (proximal tubule segment 2), PT_S3 (proximal tubule segment 3), DCT (distal
convoluted tubule cell), CNT (connecting tubule cell), ENDO (endothelial cell), MC (mesangial cell),
IC (intercalated cell), DTL (descending thin limb of Henle's loop cell), PC (principal cell), PODO
(podocyte), MACRO (macrophage cell). E) Gene expression heatmap showing the top differentially
expressed genes for each cell cluster in mouse single-cell data. F, G) Violin plots showing the detection
of genes and distribution of the number of transcripts in each cluster.
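The Log (TP10k + 1) transform and the Pearson correlation used in panel B can be sketched as follows. This is a minimal illustration in plain Python, assuming a natural logarithm and per-profile UMI totals; the function names are hypothetical and the sketch is not the authors' analysis code.

```python
import math


def log_tp10k(umi_counts):
    """Scale one expression profile to transcripts per 10,000 UMIs, then take log(x + 1).

    The natural logarithm is an assumption; the caption does not state the base.
    """
    total = sum(umi_counts)
    if total == 0:
        return [0.0 for _ in umi_counts]
    return [math.log1p(count / total * 1e4) for count in umi_counts]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical per-gene UMI totals for nuclei and cells (same gene order).
    nuclei = [10, 0, 5, 85]
    cells = [30, 2, 8, 160]
    print(pearson(log_tp10k(nuclei), log_tp10k(cells)))
```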

Figure 7. Cell-cell interactions in the ccRCC single-cell transcriptome sample. A) Visualization by UMAP
plot of clustering of 4645 single-cell expression profiles from the ccRCC sample: ccRCC, clear cell
renal carcinoma cell; ENDO, endothelial cell; CAF, cancer-associated fibroblast; M2, macrophage
cell 2; DC, dendritic cell; M1, macrophage cell 1; Pro T cell, proliferative T cell. B) Histogram and dot
plot showing the proportion of different cells in the ccRCC sample and gene expression patterns of
cell-type-specific marker genes. C) CellChat showing significant ligand-receptor pairs among the 9 cell
types. D, E) Outgoing and incoming communication patterns of target cells. F) Signaling pathway role
heatmap showing indicated signaling pathway network in each cell type.
Discussion
scRNA-seq has attracted extensive attention for revealing heterogeneous cellular
states. However, the throughput and sensitivity of scRNA-seq limit the ability to
characterize low-RNA-content cells/nuclei in multicellular systems. Many efforts
have been devoted to increasing efficiency in single-cell capture and gene detection.
Using WPS, we have successfully realized ~100,000 single-cell analyses per flow
channel. However, WPS still suffered from low sensitivity of gene detection.
Well-Paired-Seq2 demonstrated high sensitivity in gene detection, benefiting from
high efficiency in reverse transcription through the molecular crowding effect and tailing
activity enhancement, and from uniform amplification through a homogeneous enzymatic reaction.
These strategies could be readily extended to other scRNA-seq platforms. WPS2
identified a median of 3116 genes and 8447 transcripts at an average of 20,000 reads per
cell, which is 1420 more genes and 4864 more transcripts compared to WPS. By
adopting the new workflow of WPS2, we successfully characterized low
RNA-content units, such as immune cells and nuclei, with high sensitivity and
throughput.
In addition, the high cost and complicated peripheral equipment of current
scRNA-seq techniques have hindered their accessibility. Well-Paired-Seq2 is a
user-friendly platform. Compared to the high-cost and complicated 10x Genomics
platform, the whole workflow of Well-Paired-Seq can be accomplished with a chip and
a pipette and costs only <US$0.05 per cell for library preparation, so it can be
established quickly and inexpensively in a standard biology lab. Moreover, it is
crucial to note that the presence of cell-free RNA is a significant challenge in
snRNA-seq, because a large number of cytoplasmic RNAs are released into solution
during nuclei extraction. Based on the locally quasi-static hydrodynamic principle,
these undesirable interfering substances can be removed highly effectively
without cell loss, resulting in reduced background noise and a lower risk of
generating inaccurate biological findings.
Next, we applied WPS2 to characterize clinical samples. We dissected the
heterogeneous CNVs of three subtypes and found that chRCC has more copy number
variations (CNVs) in its genome, indicating genomic instability. ccRCC
has a complex microenvironment, and its highly expressed genes are
associated with immune response, hypoxia, angiogenesis, etc. Notably, we profiled
metanephric adenoma for the first time, and some potential specific markers were
revealed to facilitate accurate diagnosis of MA. In summary, the use of WPS2 enables
profiling of multiple pathologic transcriptome maps of clinical samples, which can
contribute to the discovery of novel biomarkers and therapeutic targets for other cancers. We
believe that our high-sensitivity, high-throughput, and high-fidelity WPS2 platform, which is
compatible with scRNA-seq and snRNA-seq, has great potential in cell biology, precision
medicine, and reproductive biology.
Methods
Cell preparation
Mouse NIH 3T3 cells used in optimization experiments were obtained from the cell
bank of the Chinese Academy of Sciences and were cultured in Dulbecco’s Modified
Eagle Medium (DMEM, Thermo Fisher) supplemented with 10% fetal bovine serum
(FBS, Thermo Fisher) and 1% penicillin-streptomycin (Thermo Fisher) at 37 °C with 5%
CO2. Cells were harvested by trypsinization and resuspended in cold Dulbecco’s
Phosphate-Buffered Saline (DPBS, Corning).
The spleen tissue was obtained from C57BL/6J mice (JiangSu GemPharmatech Co.,
Ltd). The tissue was minced into 2-mm pieces on ice and rinsed with 1× DPBS, then
treated with tissue dissociation mix (containing 1 mg/mL collagenase II and 5 mM
CaCl2) for 15 min at 37 °C under rotation. The cell suspension was filtered through a
40-µm cell strainer and then centrifuged at 1200 rpm for 3 min at 4 °C. After the
supernatant was removed, the cells were resuspended in red blood cell lysis buffer
and incubated on ice for 3 min. The cell suspension was centrifuged, rinsed, and
resuspended in Hanks. Cell count and viability were measured with an automated cell
counter (Thermo Fisher).
The collection of kidney tumor tissues was approved by the Research Ethics
Committee of Renji Hospital (KY2021-215-B), and informed consent was obtained
from all patients. The single-cell suspensions of the clinical tumor tissues mentioned
above were prepared with the Tumor Dissociation Kit (Miltenyi Biotec).
Nuclei preparation
NIH 3T3 cells were lysed with cold ATAC-Resuspension Buffer (RSB; 10 mM
Tris-HCl, 10 mM NaCl and 3 mM MgCl2) containing 0.01% Digitonin, 0.1% NP40,
and 0.1% Tween-20 (RSB-DNT), as described for Omni-ATAC.37 Tissue samples
dissected from 6-8-week-old C57BL/6J mice or clinical samples were minced into 2-mm
pieces, homogenized with a pre-chilled Dounce tissue grinder (Sigma, Cat #D8938) in
2 mL pre-cooled RSB-DNT (15 times with pestle A and 10 times with pestle B), and
incubated on ice for 5 min with another 2 mL RSB-DNT. Nuclei were centrifuged at
500 g for 3 min at 4 °C. After centrifugation, the nuclei were washed twice and
resuspended in RSB, then filtered through a 40-μm cell strainer, and diluted to a
final concentration of 500 nuclei per mL for subsequent experiments.
Barcoded bead preparation
Commercial barcoded beads were obtained from ChemGenes Company
(Wilmington, Massachusetts, USA; cat. Macosko-2011-10 (V+)), as described for
Drop-seq. The barcoded beads were washed twice with 30 mL of 100% ethanol and
30 mL of TE/TW (10 mM Tris pH 8.0, 1 mM EDTA, 0.01% Tween). Then the
barcoded beads were resuspended in 10 mL of TE/TW and stored at 4 °C until use.
Before experiments, barcoded beads were resuspended in 20× TE (pH 8.0) and 50
mM DTT solution for subsequent capture in the chip.
Well-Paired-Seq2 operation
For optimization experiments, the operation on the WPS2 chip was similar to that
described for WPS. Compared with WPS, WPS2 was designed with smaller
cell-capture-wells to match the size of the nucleus. First, 70 µL of the prepared nuclear
suspension was introduced into the chip, and the nuclei were captured in the
cell-capture-wells by gravitational sedimentation. The uncaptured nuclei were
resuspended with a pipette and carried through multiple sedimentation captures. The
remaining uncaptured nuclei in the chip were removed with 1× DPBS, followed by
loading of the barcoded beads. The surfactant aggregates (sodium lauroyl sarcosine,
Solarbio) were ultrasonically dispersed evenly in mineral oil at 5% (w/v). Then, the
mineral oil containing the surfactant aggregates was injected into the chip and the
dual wells were sealed. The surfactant aggregates settled and dissolved in the solution
of the wells to complete cell lysis. After incubation at room temperature for 10 min,
the barcoded beads were recovered and transferred to a 1.5 mL RNase-free tube,
washed three times with 1 mL of 6× SSC and once with 1× RT buffer.
Reverse transcription and exonuclease I treatment
The barcoded beads were resuspended in 20 µL reverse transcription reaction
solution, containing 1× RT buffer, 1 mM dNTP (TransGen Biotech, cat#AD101-12), 1
U/µL RNase Inhibitor (TransGen Biotech, cat#AI101-02), 1 mM GTP (Thermo
Scientific, cat#R0461), 5% PEG8000 (PERFEMIKER, cat# PMA020380), 2.5 μM
Template Switch Oligo (AAGCAGTGGTATCAACGCAGAGTGAATrGrGrG,
Sangon), and 10 U/µL Maxima H Minus Reverse Transcriptase (Thermo Scientific,
cat#EP0751), and incubated at 42 °C for 90 min, followed by washing the beads once
with 200 µL of 1× TE-SDS, once with 200 µL of 1× TE-TW, and once with 200 µL of
10 mM Tris (pH 8.0). The beads were then resuspended in 20 µL exonuclease I mix,
containing 1× Exonuclease I Buffer and 1 U/µL Exonuclease I (NEB, cat # B0293S),
and incubated at 37 °C for 45 min. After Exo I treatment, the beads were washed once
with 200 µL of 1× TE-SDS, once with 200 µL of 1× TE-TW, and once with 200 µL of
RNase-free water.
Second-strand synthesis on barcoded beads
Following Exonuclease I treatment, the beads were resuspended in 200 µL of 0.1 M
NaOH and incubated for 5 min at room temperature with rotation to denature the
mRNA-cDNA hybrid product. After that, 200 µL of 10 mM Tris-HCl (pH 7.5) was
added to neutralize the solution; the beads were then washed twice with 200 µL of
TE-TW buffer and once with 200 µL of 10 mM Tris-HCl (pH 8.0). The second-strand
synthesis reaction was performed on the beads by incubating them in 40 µL of the reaction
mixture (1× RT buffer, 12% PEG-8000, 1 mM dNTPs, 1 μM second strand synthesis
primer (AAGCAGTGGTATCAACGCAGAGTGAATG, Sangon) and 0.125 U/μL
Klenow exo- (BioLabs)) at 37 °C for 1 h with rotation. The reaction was stopped by
washing the beads once with TE-SDS buffer, twice with TE-TW buffer, and once with
RNase-free water.
cDNA amplification
The beads were resuspended in 50 µL PCR mix including 1× HiFi HotStart
Readymix (Kapa Biosystems, cat #KK2602) and 0.8 μM ISPCR oligo
(AAGCAGTGGTATCAACGCAGAGT, Sangon). The PCR program was as follows:
95 °C for 3 min; four cycles of 98 °C for 20 s, 65 °C for 45 s, and 72 °C for 3 min; ten
cycles of 98 °C for 20 s, 67 °C for 20 s, and 72 °C for 3 min; 72 °C for 5 min. The
PCR product was then purified with 0.6× VAHTS DNA Clean Beads (Vazyme,
cat#N411-02) according to the manufacturer’s instructions and eluted in 10 µL of H2O.
The concentration of the purified PCR product was measured with a Qubit
fluorometer (Thermo Fisher Scientific).
Well-Paired-Seq library preparation
The library was constructed with the TruePrep DNA Library Prep Kit V2 for Illumina
(Vazyme Biotech) according to the manufacturer’s instructions and was amplified with
the Nextera_N70x primer (Sangon) and the custom primer P5_TSO_Hybrid
(AATGATACGGCGACCACCGAGATCTACACGCCTGTCCGCGGAAGCAGTGGTATCAACGCAGAGT*A*C,
Sangon). Then 0.6× VAHTS DNA Clean Beads were
used to purify the library. After elution in 10 µL of H2O, the concentration of the
purified PCR product was measured with a Qubit fluorometer. The average size of
sequenced libraries was between 450 and 750 bp.
Statistical analysis
Reads alignment and data preprocessing
Paired-end reads were produced from WPS2 sequencing libraries: read 1
contained a cell barcode at bases 1-12 and a UMI at bases 13-20; read 2 contained the
sequence of the transcript. GRCh38 and mm10 were used to align the human and
mouse data, respectively. Data preprocessing and read alignment were implemented
using zUMIs.38 To correct for differences in sequencing depth between
conditions, seqtk v1.0 (https://github.com/lh3/seqtk) was used to downsample the
sequencing data, so that each condition could be analyzed at nearly the same
average number of reads per cell. For samples sequenced with sc/snRNA-seq, each
gene’s transcripts were counted by including exon and intron reads together.
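The read 1 layout and the depth matching described above can be sketched in a few lines of Python. The snippet is only an illustration of the stated coordinates (barcode at bases 1-12, UMI at bases 13-20, 1-based); the function names and the way the downsampling fraction is derived are assumptions, and the actual preprocessing was done with zUMIs and seqtk.

```python
def split_read1(seq: str):
    """Split read 1 into (cell_barcode, umi).

    Bases 1-12 (1-based) are the cell barcode and bases 13-20 are the UMI,
    which corresponds to the 0-based slices [0:12] and [12:20].
    """
    if len(seq) < 20:
        raise ValueError("read 1 must contain at least 20 bases")
    return seq[0:12], seq[12:20]


def downsample_fraction(total_reads: int, n_cells: int, target_reads_per_cell: int) -> float:
    """Fraction of reads to keep so that the mean depth matches the target.

    Returns 1.0 when the library is already at or below the target depth.
    """
    if total_reads <= 0 or n_cells <= 0:
        raise ValueError("total_reads and n_cells must be positive")
    return min(1.0, target_reads_per_cell * n_cells / total_reads)


if __name__ == "__main__":
    barcode, umi = split_read1("ACGTACGTACGTTTTTGGGGAAAA")
    print(barcode, umi)
    # Hypothetical library: 1,000,000 reads over 25 cells, target 20,000 reads per cell.
    print(downsample_fraction(1_000_000, 25, 20_000))
```

A fraction computed this way could then be passed to a read subsampling tool, with the same random seed used for read 1 and read 2 so that the pairs stay matched.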
Cell type analysis
Based on digital gene expression, Seurat version 4.0.1 (39) was run on R 4.1.3 to
perform cell clustering and marker gene calling. The spleen cells and kidney nuclei
were preprocessed and filtered on the basis of a minimal expression threshold of 300
and 500 genes, respectively, and genes being expressed in at least three cells or nuclei. As a high
proportion of transcripts mapping to MT-genes indicates low cell quality, we removed
cells with more than 30% MT-transcripts. Data normalization and scaling followed
the suggested default settings, and the 2000 most variable genes were selected using the
FindVariableFeatures function with the vst method. Next, we performed PCA based
on the scaled data and reduced the dimensionality of the data based on the ElbowPlot. To
cluster the cells, the FindClusters function in Seurat was used, the clusters were visualized
by projecting cells in a two-dimensional space using Uniform Manifold
Approximation and Projection (UMAP), and cell types were identified by cell
markers. Similarly, the cell type analysis for the clinical sample data was implemented
using cells with at least 500 detected genes and genes expressed in at least three cells.
PCA was performed based on scaled data after the top 2000 variable genes were
selected. Using the identified 20 PCs as input, cells were clustered into eight groups
with resolution = 1.2, and the cell types were identified by cell markers.
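The quality-control thresholds above can be expressed as a short, self-contained Python sketch. It is not the Seurat code used in the study: the data layout (one gene-to-UMI dictionary per cell), the `mt-` gene-name prefix, the order of the two filtering steps, and the function name are all assumptions made for illustration.

```python
def filter_cells_and_genes(cells, min_genes, min_cells_per_gene=3,
                           max_mt_fraction=0.30, mt_prefix="mt-"):
    """Apply the gene-count, mitochondrial-fraction and gene-prevalence filters.

    cells: dict mapping cell_id -> dict mapping gene name -> UMI count.
    Cells are filtered first, then genes (the order is an assumption).
    """
    kept = {}
    for cell_id, counts in cells.items():
        n_genes = sum(1 for c in counts.values() if c > 0)
        total = sum(counts.values())
        mt = sum(c for g, c in counts.items() if g.lower().startswith(mt_prefix.lower()))
        mt_fraction = mt / total if total else 0.0
        if n_genes >= min_genes and mt_fraction <= max_mt_fraction:
            kept[cell_id] = counts

    # Count in how many retained cells each gene is detected.
    cells_per_gene = {}
    for counts in kept.values():
        for gene, c in counts.items():
            if c > 0:
                cells_per_gene[gene] = cells_per_gene.get(gene, 0) + 1
    good_genes = {g for g, n in cells_per_gene.items() if n >= min_cells_per_gene}

    return {cell_id: {g: c for g, c in counts.items() if g in good_genes}
            for cell_id, counts in kept.items()}


if __name__ == "__main__":
    toy = {
        "c1": {"Gapdh": 5, "Actb": 3, "mt-Co1": 1},
        "c2": {"Gapdh": 2, "Actb": 1},
        "c3": {"Gapdh": 1, "mt-Co1": 9},   # 90% mitochondrial, removed
        "c4": {"Gapdh": 4},                # too few genes, removed
    }
    print(filter_cells_and_genes(toy, min_genes=2, min_cells_per_gene=2))
```

With the thresholds from the text, the call would use `min_genes=300` for spleen cells and `min_genes=500` for kidney nuclei and clinical samples, keeping the defaults of three cells per gene and 30% mitochondrial transcripts.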
DEGs and GO term enrichment
Differential expression analysis based on the Wilcoxon rank sum test was then
performed to confirm that the three tumor types were distinct. Genes with a fold
change of transcripts >2 and an adjusted P value < 0.05 were recognized as
differentially expressed genes. Furthermore, the type-specific pathways were revealed
by Gene Ontology (GO) enrichment analysis for biological processes completed with
the clusterProfiler R package.40
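The selection rule stated above (fold change > 2 and adjusted P < 0.05) amounts to a simple filter over the test results. The sketch below assumes that fold changes and adjusted P values have already been computed elsewhere; the tuple layout, the function name, and the placeholder gene names are hypothetical.

```python
def select_degs(results, min_fold_change=2.0, max_adj_p=0.05):
    """Return genes passing both thresholds.

    results: iterable of (gene, fold_change, adjusted_p) tuples, where
    fold_change is on a linear scale (2.0 corresponds to log2 fold change 1).
    Both comparisons are strict, matching '>2' and '< 0.05' in the text.
    """
    return [gene for gene, fold_change, adj_p in results
            if fold_change > min_fold_change and adj_p < max_adj_p]


if __name__ == "__main__":
    toy = [
        ("geneA", 3.0, 0.01),   # passes both thresholds
        ("geneB", 2.0, 0.001),  # fold change not strictly above 2
        ("geneC", 4.0, 0.20),   # not significant after adjustment
    ]
    print(select_degs(toy))
```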
Copy number variation analysis
The InferCNV package (https://github.com/broadinstitute/inferCNV) was used to
detect the CNVs in 10704 malignant cells. 1623 non-malignant cells were used as
baselines to estimate the CNVs of malignant cells. Genes expressed in more than 20
cells were sorted based on their loci on each chromosome. The relative expression
values were centered to 1, using 1.5 standard deviations from the residual-normalized
expression values as the ceiling. A sliding window of 101 genes was used to smooth
the relative expression on each chromosome to remove the effect of gene-specific
expression. The CNV scores were calculated by normalizing the CNVs of each cell
from -1 to 1 and then calculating the sum of squares of the normalized values.
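The smoothing and scoring steps can be sketched as follows. This is an illustrative Python approximation, not the InferCNV implementation: the windows are simply truncated at chromosome ends, and the bounds `lo` and `hi` used for the linear rescaling to [-1, 1] must be supplied by the caller, because the text does not specify how they are chosen. All function names are hypothetical.

```python
def moving_average(values, window=101):
    """Centered moving average along genes ordered by chromosomal position.

    Windows are truncated at both ends of the chromosome (an assumption).
    """
    half = window // 2
    n = len(values)
    smoothed = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def rescale(values, lo, hi):
    """Map values linearly from [lo, hi] to [-1, 1]."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]


def cnv_score(values, lo, hi):
    """Sum of squares of the rescaled CNV values of one cell."""
    return sum(v * v for v in rescale(values, lo, hi))


if __name__ == "__main__":
    print(moving_average([1, 2, 3, 4, 5], window=3))
    # Relative expression centered at 1: 0.5 is a loss, 1.5 a gain, 1.0 neutral.
    print(cnv_score([0.5, 1.0, 1.5], lo=0.5, hi=1.5))
```

With bounds placed symmetrically around the neutral value of 1, a cell without copy number changes scores close to 0, and the score grows with the number and amplitude of gained or lost regions.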
Cell-cell interactions based on CellChat (33)
The CellChat-based cell-cell interaction analysis began with processing the scRNA-seq data
using standard quality control, normalization, and dimensionality reduction
techniques. Then we used curated databases of known cell-cell signaling interactions
to identify potential ligand-receptor interactions between cell types, which were then
combined with expression data to create a communication matrix. This matrix was
then used to cluster cell types into functional groups based on their signaling
interactions, and the resulting communication network was visualized with default
parameters.

Acknowledgements
We thank the National Key R&D Program of China (2021YFA0909400,
2019YFA0905800), the National Natural Science Foundation of China (21974113,
21974112, 21904085, and 21927806), and the Fundamental Research Funds for the
Central Universities (20720210001, 20720220005) for their financial support.

Conflict of Interest
The authors declare no conflict of interest.

Data Availability Statement
The data that support the findings of this study are available from the corresponding
author upon reasonable request.

References
(1) Grün, D.; van Oudenaarden, A. Design and analysis of single-cell sequencing experiments.
Cell 2015, 163 (4), 799-810.
(2) Tang, F.; Barbacioru, C.; Wang, Y.; Nordman, E.; Lee, C.; Xu, N.; Wang, X.; Bodeau, J.;
Tuch, B. B.; Siddiqui, A. mRNA-Seq whole-transcriptome analysis of a single cell. Nat.
Methods 2009, 6 (5), 377-382.
(3) Ramsköld, D.; Luo, S.; Wang, Y.-C.; Li, R.; Deng, Q.; Faridani, O. R.; Daniels, G. A.;
Khrebtukova, I.; Loring, J. F.; Laurent, L. C. Full-length mRNA-Seq from single-cell levels
of RNA and individual circulating tumor cells. Nat. Biotechnol. 2012, 30 (8), 777-782.
(4) Picelli, S.; Björklund, Å. K.; Faridani, O. R.; Sagasser, S.; Winberg, G.; Sandberg, R.
Smart-seq2 for sensitive full-length transcriptome profiling in single cells. Nat. Methods 2013,
10 (11), 1096-1098.
(5) Hagemann-Jensen, M.; Ziegenhain, C.; Chen, P.; Ramsköld, D.; Hendriks, G.-J.; Larsson,
A. J.; Faridani, O. R.; Sandberg, R. Single-cell RNA counting at allele and isoform resolution
using Smart-seq3. Nat. Biotechnol. 2020, 38 (6), 708-714.
(6) Hashimshony, T.; Wagner, F.; Sher, N.; Yanai, I. CEL-Seq: single-cell RNA-Seq by
multiplexed linear amplification. Cell Rep. 2012, 2 (3), 666-673.
(7) Hashimshony, T.; Senderovich, N.; Avital, G.; Klochendler, A.; De Leeuw, Y.; Anavy, L.;
Gennert, D.; Li, S.; Livak, K. J.; Rozenblatt-Rosen, O. CEL-Seq2: sensitive
highly-multiplexed single-cell RNA-Seq. Genome Biology 2016, 17, 1-7.
(8) Sasagawa, Y.; Nikaido, I.; Hayashi, T.; Danno, H.; Uno, K. D.; Imai, T.; Ueda, H. R.
687 Quartz-Seq: a highly reproducible and sensitive single-cell RNA sequencing method, reveals
688 non-genetic gene-expression heterogeneity. Genome biology 2013, 14 (4), 1-17.
689 (9) Sasagawa, Y.; Danno, H.; Takada, H.; Ebisawa, M.; Tanaka, K.; Hayashi, T.; Kurisaki, A.;
690 Nikaido, I. Quartz-Seq2: a high-throughput single-cell RNA-sequencing method that
691 effectively uses limited sequence reads. Genome biology 2018, 19, 1-24.
692 (10) Macosko, E. Z.; Basu, A.; Satija, R.; Nemesh, J.; Shekhar, K.; Goldman, M.; Tirosh, I.;
693 Bialas, A. R.; Kamitaki, N.; Martersteck, E. M. Highly parallel genome-wide expression
694 profiling of individual cells using nanoliter droplets. Cell 2015, 161 (5), 1202-1214.
695 (11) McGinnis, C. S.; Patterson, D. M.; Winkler, J.; Conrad, D. N.; Hein, M. Y.; Srivastava, V.;
696 Hu, J. L.; Murrow, L. M.; Weissman, J. S.; Werb, Z. MULTI-seq: sample multiplexing for
697 single-cell RNA sequencing using lipid-tagged indices. Nat. Methods 2019, 16 (7), 619-626.
698 (12) Gierahn, T. M.; Wadsworth, M. H.; Hughes, T. K.; Bryson, B. D.; Butler, A.; Satija, R.;
699 Fortune, S.; Love, J. C.; Shalek, A. K. Seq-Well: portable, low-cost RNA sequencing of single
700 cells at high throughput. Nat. Methods 2017, 14 (4), 395-398. Datlinger, P.; Rendeiro, A. F.;
701 Boenke, T.; Senekowitsch, M.; Krausgruber, T.; Barreca, D.; Bock, C. Ultra-high-throughput
702 single-cell RNA sequencing and perturbation screening with combinatorial fluidic indexing.
703 Nat. Methods 2021, 18 (6), 635-642.
704 (13) Han, L.; Wei, X.; Liu, C.; Volpe, G.; Zhuang, Z.; Zou, X.; Wang, Z.; Pan, T.; Yuan, Y.;
705 Zhang, X. Cell transcriptomic atlas of the non-human primate Macaca fascicularis. Nature
706 2022, 604 (7907), 723-731.
707 (14) The Tabula Sapiens Consortium; Jones, R. C.; Karkanias, J.; Krasnow, M. A.; Pisco, A. O.; Quake, S.
708 R.; Salzman, J.; Yosef, N.; Bulthaup, B.; Brown, P. The Tabula Sapiens: A multiple-organ,
709 single-cell transcriptomic atlas of humans. Science 2022, 376 (6594), eabl4896. Suo, C.;
710 Dann, E.; Goh, I.; Jardine, L.; Kleshchevnikov, V.; Park, J.-E.; Botting, R. A.; Stephenson, E.;
711 Engelbert, J.; Tuong, Z. K. Mapping the developing human immune system across organs.
712 Science 2022, 376 (6597), eabo0510.
713 (15) Yin, K.; Zhao, M.; Lin, L.; Chen, Y.; Huang, S.; Zhu, C.; Liang, X.; Lin, F.; Wei, H.;
714 Zeng, H. Well-Paired-Seq: A Size-Exclusion and Locally Quasi-Static Hydrodynamic
715 Microwell Chip for Single-Cell RNA-Seq. Small Methods 2022, 6 (7), 2200341.
716 (16) Bai, Z.; Deng, Y.; Kim, D.; Chen, Z.; Xiao, Y.; Fan, R. An integrated
717 dielectrophoresis-trapping and nanowell transfer approach to enable double-sub-poisson
718 single-cell RNA sequencing. ACS nano 2020, 14 (6), 7412-7424.
719 (17) Young, M. D.; Behjati, S. SoupX removes ambient RNA contamination from
720 droplet-based single-cell RNA sequencing data. Gigascience 2020, 9 (12), giaa151.
721 (18) Habib, N.; Avraham-Davidi, I.; Basu, A.; Burks, T.; Shekhar, K.; Hofree, M.; Choudhury,
722 S. R.; Aguet, F.; Gelfand, E.; Ardlie, K. Massively parallel single-nucleus RNA-seq with
723 DroNc-seq. Nat. Methods 2017, 14 (10), 955-958.
724 (19) Slyper, M.; Porter, C. B.; Ashenberg, O.; Waldman, J.; Drokhlyansky, E.; Wakiro, I.;
725 Smillie, C.; Smith-Rosario, G.; Wu, J.; Dionne, D. A single-cell and single-nucleus RNA-Seq
726 toolbox for fresh and frozen human tumors. Nat. Med. 2020, 26 (5), 792-802.
727 (20) Ding, J.; Adiconis, X.; Simmons, S. K.; Kowalczyk, M. S.; Hession, C. C.; Marjanovic,
728 N. D.; Hughes, T. K.; Wadsworth, M. H.; Burks, T.; Nguyen, L. T. Systematic comparison of
729 single-cell and single-nucleus RNA-sequencing methods. Nat. Biotechnol. 2020, 38 (6),
730 737-746.
731 (21) Davis Jr, C. J.; Barton, J. H.; Sesterhenn, I. A.; Mostofi, F. Metanephric adenoma.
732 Clinicopathological study of fifty patients. The American journal of surgical pathology 1995,
733 19 (10), 1101-1114. Spaner, S. J.; Yu, Y.; Cook, A. J.; Boag, G. Pediatric metanephric
734 adenoma: case report and review of the literature. International urology and nephrology 2014,
735 46, 677-680.
736 (22) Ohtsubo, Y.; Nagata, Y.; Tsuda, M. Compounds that enhance the tailing activity of
737 Moloney murine leukemia virus reverse transcriptase. Scientific reports 2017, 7 (1), 6520.
738 (23) Bagnoli, J. W.; Ziegenhain, C.; Janjic, A.; Wange, L. E.; Vieth, B.; Parekh, S.; Geuder, J.;
739 Hellmann, I.; Enard, W. Sensitive and powerful single-cell RNA sequencing using
740 mcSCRB-seq. Nature communications 2018, 9 (1), 2937.
741 (24) Hrdlickova, R.; Toloue, M.; Tian, B. RNA-Seq methods for transcriptome analysis.
742 Wiley Interdisciplinary Reviews: RNA 2017, 8 (1), e1364.
743 (25) Griffiths, J. A.; Scialdone, A.; Marioni, J. C. Using single-cell genomics to understand
744 developmental processes and cell fate decisions. Mol. Syst. Biol. 2018, 14 (4), e8046.
745 (26) Mazutis, L.; Gilbert, J.; Ung, W. L.; Weitz, D. A.; Griffiths, A. D.; Heyman, J. A.
746 Single-cell analysis and sorting using droplet-based microfluidics. Nat. Protoc. 2013, 8 (5),
747 870-891.
748 (27) Su, C.; Lv, Y.; Lu, W.; Yu, Z.; Ye, Y.; Guo, B.; Liu, D.; Yan, H.; Li, T.; Zhang, Q.
749 Single-cell RNA sequencing in multiple pathologic types of renal cell carcinoma revealed
750 novel potential tumor-specific markers. Frontiers in Oncology 2021, 11, 719564.
751 (28) Schreibing, F.; Kramann, R. Mapping the human kidney using single-cell genomics.
752 Nature Reviews Nephrology 2022, 18 (6), 347-360.
753 (29) Clark, D. J.; Dhanasekaran, S. M.; Petralia, F.; Pan, J.; Song, X.; Hu, Y.; da Veiga
754 Leprevost, F.; Reva, B.; Lih, T.-S. M.; Chang, H.-Y. Integrated proteogenomic
755 characterization of clear cell renal cell carcinoma. Cell 2019, 179 (4), 964-983. e931.
756 (30) Ohe, C.; Kuroda, N.; Takasu, K.; Senzaki, H.; Shikata, N.; Yamaguchi, T.; Miyasaka, C.;
757 Nakano, Y.; Sakaida, N.; Uemura, Y. Utility of immunohistochemical analysis of KAI1,
758 epithelial-specific antigen, and epithelial-related antigen for distinction of chromophobe renal
759 cell carcinoma, an eosinophilic variant from renal oncocytoma. Medical molecular
760 morphology 2012, 45, 98-104.
761 (31) Chen, C. V.; Croom, N. A.; Simko, J. P.; Stohr, B. A.; Chan, E. Differential
762 immunohistochemical and molecular profiling of conventional and aggressive components of
763 chromophobe renal cell carcinoma: pitfalls for diagnosis. Human Pathology 2022, 119, 85-93.
764 (32) Shuch, B.; Amin, A.; Armstrong, A. J.; Eble, J. N.; Ficarra, V.; Lopez-Beltran, A.;
765 Martignoni, G.; Rini, B. I.; Kutikov, A. Understanding pathologic variants of renal cell
766 carcinoma: distilling therapeutic opportunities from biologic complexity. Eur. Urol. 2015, 67
767 (1), 85-97.
768 (33) Jin, S.; Guerrero-Juarez, C. F.; Zhang, L.; Chang, I.; Ramos, R.; Kuan, C.-H.; Myung, P.;
769 Plikus, M. V.; Nie, Q. Inference and analysis of cell-cell communication using CellChat.
770 Nature communications 2021, 12 (1), 1088.
771 (34) Rio, D. D.; Caprara, V.; Masi, I.; Spadaro, F.; Giannitelli, S.; Rainer, A.; Bagnato, A.;
772 Rosanò, L. Tumor-derived endothelin-1 recruits and activates fibroblasts to support tumor
773 aggressiveness. Cancer Res. 2022, 82 (12_Supplement), 6137-6137. Katarkar, A.; Bottoni, G.;
774 Clocchiatti, A.; Goruppi, S.; Bordignon, P.; Lazzaroni, F.; Gregnanin, I.; Ostano, P.; Neel, V.;
775 Dotto, G. P. NOTCH1 gene amplification promotes expansion of Cancer Associated
776 Fibroblast populations in human skin. Nature communications 2020, 11 (1), 5126.
777 (35) Noe, J. T.; Mitchell, R. A. MIF-dependent control of tumor immunity. Frontiers in
778 immunology 2020, 11, 609948. Hu, H.; Ma, T.; Liu, N.; Hong, H.; Yu, L.; Lyu, D.; Meng, X.;
779 Wang, B.; Jiang, X. Immunotherapy checkpoints in ovarian cancer vasculogenic mimicry:
780 Tumor immune microenvironments, and drugs. Int. Immunopharmacol. 2022, 111, 109116.
781 (36) Clark, E. A.; Giltiay, N. V. CD22: a regulator of innate and adaptive B cell responses and
782 autoimmunity. Frontiers in immunology 2018, 9, 2235. Engeroff, P.; Vogel, M. The role of
783 CD23 in the regulation of allergic responses. Allergy 2021, 76 (7), 1981-1989.
784 (37) Corces, M. R.; Trevino, A. E.; Hamilton, E. G.; Greenside, P. G.; Sinnott-Armstrong, N.
785 A.; Vesuna, S.; Satpathy, A. T.; Rubin, A. J.; Montine, K. S.; Wu, B. An improved ATAC-seq
786 protocol reduces background and enables interrogation of frozen tissues. Nat. Methods 2017,
787 14 (10), 959-962.
788 (38) Parekh, S.; Ziegenhain, C.; Vieth, B.; Enard, W.; Hellmann, I. zUMIs-A fast and flexible
789 pipeline to process RNA sequencing data with UMIs. Gigascience 2018.
790 (39) Stuart, T.; Butler, A.; Hoffman, P.; Hafemeister, C.; Papalexi, E.; Mauck, W. M.; Hao, Y.;
791 Stoeckius, M.; Smibert, P.; Satija, R. Comprehensive integration of single-cell data. Cell 2019,
792 177 (7), 1888-1902. e1821.
793 (40) Yu, G.; Wang, L.-G.; Han, Y.; He, Q.-Y. clusterProfiler: an R package for comparing
794 biological themes among gene clusters. OMICS: J. Integrative Biol. 2012, 16 (5), 284-287.
795