GEO Oct 2024
[Figure: y-axis from 0 to 8,000,000; x-axis years 2000 to 2024]
[Figure: Evolution of assay type in GEO (y-axis: percent, up to 100%; x-axis: year)]
https://www.ncbi.nlm.nih.gov/geo/
GEO Analysis tools
• Genome Browser Tracks
• GEO2R
Genome Browser Tracks
• Loadable into NCBI’s Genome Data Viewer
• 9,618 series with tracks
• 40,202 samples with tracks
• Tracks are mostly from ENCODE samples
How to find tracks on GEO records
• Use "track"[Filter] in search bar
• Use links from GEO’s home page
• https://www.ncbi.nlm.nih.gov/geo/encode/
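The same filter search can also be scripted. The sketch below only builds a query URL for NCBI's E-utilities ESearch endpoint against the `gds` (GEO DataSets) database; it assumes the `"track"[Filter]` term behaves there as it does in the search bar, and `geo_search_url` is a hypothetical helper name, not part of any NCBI library.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint; "gds" is the Entrez database behind GEO DataSets.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def geo_search_url(term: str, retmax: int = 20) -> str:
    """Build an ESearch URL for GEO (hypothetical helper for illustration)."""
    params = {"db": "gds", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{ESEARCH}?{urlencode(params)}"


# Same filter as typed in the GEO search bar, combined with a keyword.
url = geo_search_url('"track"[Filter] AND SETDB1')
print(url)
```

Fetching the URL (for example with `urllib.request.urlopen`) should return matching record IDs; the request itself is left out so the sketch runs offline.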
Tracks on series vs samples
• Click on
• Button on series (GSE) page loads tracks for all samples in the series
• Button on sample (GSM) page loads tracks only for that sample
ENCODE ChIP-seq data for SETDB1
What if you want to use the UCSC Genome Browser for data stored in GEO?
• Use the “ftp” link to copy the file URL
• Paste the URL into the UCSC Genome Browser
Viewing a bigWig file from GEO in the UCSC Genome Browser
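Instead of pasting the bare URL, the copied link can be wrapped in a UCSC custom-track line, which lets you set the track name. This is a minimal sketch: `bigwig_track_line` is a hypothetical helper, and the URL is a placeholder rather than a real GEO file.

```python
def bigwig_track_line(url: str, name: str, description: str = "") -> str:
    """Return a UCSC custom-track line pointing at a remote bigWig file."""
    if not url.lower().endswith((".bw", ".bigwig")):
        raise ValueError("expected a bigWig file (.bw or .bigWig)")
    desc = description or name
    return f'track type=bigWig name="{name}" description="{desc}" bigDataUrl={url}'


# Placeholder URL: substitute the one copied from the "ftp" link on the GEO record.
example_url = (
    "https://ftp.ncbi.nlm.nih.gov/geo/samples/GSMnnn/GSM000000/suppl/"
    "GSM000000_example.bw"
)
line = bigwig_track_line(example_url, "GSM000000 signal")
print(line)
```

The resulting line goes into the custom-track text box of the UCSC Genome Browser in place of the bare URL.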
GEO Analysis tools
• Genome Browser Tracks
• GEO2R
• Change options
• Explore visualizations
• Sample relationship with UMAP
• Volcano plot
• Venn Diagram
• P-value and expression value distribution
• P-value adjustment method for multiple testing
• Choose thresholds for P-value and log2 fold-change for plots
• Choose contrasts (which set of samples to display in plots)
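To make the P-value adjustment option concrete, here is a minimal sketch of the Benjamini-Hochberg procedure, one common false discovery rate method. Whether it matches a given GEO2R run depends on the adjustment method selected in the options; the input P-values are made up for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Illustrative values only, not taken from any GEO study.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
print(adj)
```

Each adjusted value is the raw P-value scaled by (number of tests / rank), capped so that a smaller raw P-value never ends up with a larger adjusted one.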
Looking up your favorite gene
• Go to ‘Profile graph’ tab
• Enter your gene of interest
Accessing the R code
• Go to ‘R script’ tab
• Copy entire script!
• Re-run the analysis on your own
Visualize and explore results in GEO2R
Visualization plots
• 7 plots provided for every study
• 3 plots (with green outline) are interactive
• Subsets of data are downloadable from interactive plots
• Added in 2020
Volcano Plot
• Changed options
• Padj < 0.01
• Log2 fold-change threshold set to 1
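The thresholds above translate into a simple rule for colouring points on a volcano plot. The sketch below assumes a gene counts as significant when its adjusted P-value is below 0.01 and its absolute log2 fold-change is at least 1. Whether GEO2R treats the boundary as inclusive is an assumption, and `classify_gene` is a hypothetical name.

```python
def classify_gene(log2fc: float, padj: float,
                  padj_cutoff: float = 0.01, lfc_cutoff: float = 1.0) -> str:
    """Label a gene as 'up', 'down', or 'ns' (not significant)."""
    if padj < padj_cutoff and log2fc >= lfc_cutoff:
        return "up"
    if padj < padj_cutoff and log2fc <= -lfc_cutoff:
        return "down"
    return "ns"


# Made-up (log2 fold-change, adjusted P-value) pairs for illustration.
examples = [(1.5, 0.001), (-2.0, 0.005), (3.0, 0.05), (0.5, 0.001)]
labels = [classify_gene(lfc, p) for lfc, p in examples]
print(labels)
```

A gene with a large fold-change but a weak adjusted P-value, or the reverse, falls in the "ns" group.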
Venn diagram
GEO2R Help
• https://www.ncbi.nlm.nih.gov/geo/info/geo2r.html (Documentation)
• Tutorial video
• Updated in 2023
• https://www.youtube.com/watch?v=9RyWjzSnaE0&t=17s
What’s new at GEO?
RNA-seq data in GEO
[Figure: RNA-seq studies released by GEO each year (x-axis: year)]
https://www.ncbi.nlm.nih.gov/geo/
Making RNA-seq data more FAIR
FAIR Data Principles
https://www.nlm.nih.gov/oet/ed/cde/tutorial/02-300.html
• NCBI has produced an RNA-seq analysis pipeline
• Consistently computed RNA-seq counts for millions of samples
NCBI RNA-seq count pipeline
• Newly released bulk RNA-seq runs enter pipeline
• Remove runs with < 50% alignment
• Produce gene-level counts with featureCounts
• Deploy via Cloud and GEO
• Available for use!
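The quality filter in the pipeline can be illustrated with a short sketch. It assumes the alignment rate is the fraction of a run's reads that align and that runs below 50% are dropped. The `Run` class, its field names, and the run labels are hypothetical and do not reflect NCBI's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Run:
    accession: str      # placeholder label, not a real SRA accession
    total_reads: int
    aligned_reads: int

    @property
    def alignment_rate(self) -> float:
        """Fraction of reads that aligned; 0.0 for an empty run."""
        return self.aligned_reads / self.total_reads if self.total_reads else 0.0


def passes_alignment_filter(run: Run, min_rate: float = 0.5) -> bool:
    """Keep runs whose alignment rate is at least min_rate."""
    return run.alignment_rate >= min_rate


runs = [
    Run("RUN_A", total_reads=1_000_000, aligned_reads=920_000),
    Run("RUN_B", total_reads=1_000_000, aligned_reads=310_000),
    Run("RUN_C", total_reads=0, aligned_reads=0),
]
kept = [r.accession for r in runs if passes_alignment_filter(r)]
print(kept)
```

Only runs that pass this filter would go on to gene-level counting.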
NCBI RNA-seq analysis pipeline goals
Human COVID-19
available in GEO
Search
• "rnaseq counts"[Filter]
• 27,731 studies available with NCBI-provided RNA-seq counts
NCBI RNA-seq counts downloadable from GEO
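Once a counts file is downloaded, it can be read with the standard library. The sketch below assumes a tab-delimited matrix with a gene identifier in the first column and one column per sample; check that layout against the actual file. The toy data and sample labels are made up.

```python
import csv
import io


def read_counts(tsv_text: str):
    """Parse a tab-delimited counts matrix into (samples, {gene: {sample: count}})."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader)
    samples = header[1:]          # first column holds the gene identifier
    counts = {}
    for row in reader:
        if not row:
            continue              # skip blank lines
        counts[row[0]] = dict(zip(samples, (int(v) for v in row[1:])))
    return samples, counts


# Toy matrix for illustration; real files have one column per GSM sample.
toy = "GeneID\tGSM_A\tGSM_B\n1\t10\t0\n2\t5\t7\n"
samples, counts = read_counts(toy)

# Library size per sample: the sum of counts over all genes.
library_sizes = {s: sum(g[s] for g in counts.values()) for s in samples}
print(library_sizes)
```

For a gzip-compressed download, `gzip.open(path, "rt").read()` would supply the text to parse.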
My email:
[email protected]
Acknowledgements
Pierre Ledoux, PhD Alexandra Soboleva Rodney Brister, PhD Ilene Mizrachi, PhD
Hyeseung Lee, PhD Maxim Tomashevsky Ryan Connor, PhD Valerie Schneider, PhD
Kimberly Marshall Naigong Zhang, PhD Ravinder P. Eskandary, PhD Kim Pruitt, PhD
This work was supported by the National Center for Biotechnology Information (NCBI) at the National Library of
Medicine and NIH’s Science and Technology Research Infrastructure for Discovery, Experimentation, and Sustainability
(STRIDES) initiative.