Cancer Databases

Uploaded by

Alexis Torres

Cancer Databases and Retrieval

~ Dilraj Kaur, PhD

IIITD
Outline
● Introduction to Biological Databases
● List of Cancer databases
● The Cancer Genome Atlas (TCGA)
● GDC portal
● Data accessibility
● Dataset retrieval (TCGA-Assembler)
● Q&A
What do we need to know?
• What is a database, and what are the features of an ideal database?
• What are the relationships and differences between primary and
derived sequence databases?
• What are databases used for?
• Why is data integration useful?
What are Databases?
• Structured collection of information.
• Consists of basic units called records or entries.
• Each record consists of fields, which hold pre-defined data
related to the record.
• For example, a protein database would have protein entries as
records and protein properties as fields (e.g., protein name,
length, amino-acid sequence).
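The record-and-field structure described above can be sketched in R, where each row of a data frame is a record and each column is a field. This is a minimal illustration only: the protein names and sequences are invented, and a real database would sit in a database management system rather than in memory.

```r
# Hypothetical protein database: each row is a record, each column a field.
# Names and sequences are made up for illustration only.
proteins <- data.frame(
  name     = c("ProteinA", "ProteinB", "ProteinC"),
  sequence = c("MKTAYIAK", "MGSSHHHHHH", "MALWMR"),
  stringsAsFactors = FALSE
)

# A field derived from another field of the same record.
proteins$length <- nchar(proteins$sequence)

# Retrieve one record by the value of a field.
record <- proteins[proteins$name == "ProteinB", ]
print(record)
```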
Key Features of an ideal DB
• Comprehensive, but easy to search.
• A simple, easy to understand structure.
• Cross-referenced.
• Minimum redundancy.
• Easy retrieval of data.
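Two of these features, minimum redundancy and cross-referencing, can be illustrated with a small R sketch. The tables, identifiers, and values below are hypothetical; the point is only that duplicate records are removed and that a shared key lets one table point into another.

```r
# Hypothetical gene table with one redundant (duplicated) entry.
genes <- data.frame(
  gene_id = c("G1", "G2", "G2", "G3"),
  symbol  = c("geneA", "geneB", "geneB", "geneC"),
  stringsAsFactors = FALSE
)

# Hypothetical protein table that cross-references genes by gene_id.
proteins <- data.frame(
  protein_id = c("P1", "P2", "P3"),
  gene_id    = c("G1", "G2", "G3"),
  length     = c(120L, 340L, 95L),
  stringsAsFactors = FALSE
)

# Minimum redundancy: keep each record only once.
genes_unique <- unique(genes)

# Cross-referencing: follow the shared key from one table to the other.
linked <- merge(genes_unique, proteins, by = "gene_id")
print(linked)
```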
Types of Molecular DB
• Primary Databases
  o Original submissions by experimentalists
  o Content controlled by the submitter
  o Examples: GenBank, Trace, SRA, SNP, GEO
• Derivative Databases
  o Derived from primary data
  o Content controlled by a third party
  o Examples: NCBI Protein, RefSeq, TPA, RefSNP, GEO datasets,
    UniGene, HomoloGene, Structure, Conserved Domain
Sequence Databases at NCBI
• Primary
  o GenBank: NCBI's primary sequence database
  o Trace Archive: reads from capillary sequencers
  o Sequence Read Archive: next-generation sequencing data
• Derivative
  o GenPept (GenBank translations)
  o Protein (UniProt/Swiss-Prot, PDB)
  o NCBI Reference Sequences (RefSeq)
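To make the primary/derivative relationship concrete, here is a toy R sketch of how a derived protein record can be computed from a primary nucleotide record, in the spirit of GenPept being translations of GenBank entries. It is an illustration under stated assumptions, not how NCBI builds GenPept: the sequence is invented, the codon table is deliberately partial, and the helper `translate` is hypothetical.

```r
# Partial codon table (standard genetic code), enough for this toy example.
codon_table <- c(ATG = "M", GCC = "A", AAA = "K",
                 TTT = "F", GGC = "G", TGA = "*")

# Hypothetical helper: translate a coding DNA sequence up to the first stop.
translate <- function(dna) {
  starts <- seq(1, nchar(dna) - 2, by = 3)
  codons <- substring(dna, starts, starts + 2)
  aa <- codon_table[codons]
  stop_pos <- match("*", aa)
  if (!is.na(stop_pos)) aa <- aa[seq_len(stop_pos - 1)]
  paste(aa, collapse = "")
}

# "Primary" record: an invented nucleotide submission.
primary <- list(id = "NUC0001", sequence = "ATGGCCAAATTTGGCTGA")

# "Derivative" record: computed from the primary record and linked back to it.
derived <- list(id = "PROT0001", source = primary$id,
                sequence = translate(primary$sequence))
print(derived)
```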
List of Databases for Oncogenomic Research
TCGA (The Cancer Genome Atlas)
o TCGA comprises the following data types:
o BiospecimenClinicalData, CNAData, MethylationData,
o miRNASeqData, RNASeqData, RPPAData,
o SomaticMutationData
Levels of data in TCGA
• Level 1: raw data, controlled access.
• Level 2: processed data, controlled access.
• Level 3: segmented or interpreted data, open access.
• Level 4: region-of-interest data, open access.
• The TCGA data portal provided level 1 to 3 data, whereas
Firehose provides only levels 3 and 4.
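The level descriptions above can be captured as a small lookup table in R, which makes it easy to ask which open-access levels a given source offers. This is a sketch built only from the statements on this slide; the helper name `open_levels_from` and the source labels are hypothetical.

```r
# Lookup table built from the level descriptions on this slide.
tcga_levels <- data.frame(
  level       = 1:4,
  description = c("raw", "processed",
                  "segmented or interpreted", "region of interest"),
  access      = c("controlled", "controlled", "open", "open"),
  stringsAsFactors = FALSE
)

# Levels served by each source, as stated on this slide.
sources <- list(tcga_data_portal = 1:3, firehose = 3:4)

# Hypothetical helper: open-access levels available from a given source.
open_levels_from <- function(source_name) {
  available <- tcga_levels[tcga_levels$level %in% sources[[source_name]], ]
  available$level[available$access == "open"]
}

print(open_levels_from("firehose"))
print(open_levels_from("tcga_data_portal"))
```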
Cancer Dataset Retrieval
FIREHOSE-GDAC
https://gdac.broadinstitute.org
Download the desired file.
Here we take the mRNA-seq data as an example.
TCGA-ASSEMBLER
Go to the link; the folder contains all the supporting files.
Click the indicated link to download.
COMMANDS

rm(list = ls())  # Clear the workspace

source("Module_A.R")  # Load Module A functions (data download)
source("Module_B.R")  # Load Module B functions (data processing)

# Download normalized gene expression data for the LIHC cohort
RNASeqRawData <- DownloadRNASeqData(saveFolderName = "./output path",
                                    cancerType = "LIHC",
                                    assayPlatform = "gene.normalized_RNAseq")

# Process the first downloaded file
GeneExpData <- ProcessRNASeqData(inputFilePath = RNASeqRawData[1],
                                 outputFileName = "LIHC__illuminahiseq_rnaseqv2__GeneExp",
                                 outputFileFolder = "./output path",
                                 dataType = "geneExp",
                                 verType = "RNASeqV2")
UCEC-Clinical data
UCEC-RNAseq File
THANK YOU
