Assignment (DB)

The document provides an overview of the Ensembl bioinformatics platform, detailing its genomic resources, including genome annotation, comparative genomics, and genetic variation data. It focuses on the human genome assembly GRCh38 and explores specific genes TP53 and BRCA1, discussing their structures, variants, and associated diseases. Additionally, it highlights insights gained from gene analysis and includes SQL queries for retrieving gene information and variants.

Uploaded by

saliharanihere2

National University of Sciences & Technology (NUST)
School of Interdisciplinary Engineering & Sciences (SINES)
BS Bioinformatics (2nd Semester)
Course: DATABASE SYSTEMS
Assignment 2
Exploring the Human Genome Database on Ensembl

• Amina Mohsin (506545)
• Areeba Amjad (503173)
• Hafsa Jabeen (517093)
• Saliha Rani (516052)
• Shahliza Saeed (517110)

1. Overview of Ensembl and Its Genomic Resources:
Ensembl is a bioinformatics platform and genome browser that provides high-quality, comprehensive genomic data for a wide range of species.

• Genome Annotation
  ◦ Provides gene, transcript, and protein annotations.
  ◦ Combines computational prediction with experimental evidence to build accurate gene models.
• Comparative Genomics
  ◦ Allows cross-species comparisons.
  ◦ Shows which genes are similar between species.
  ◦ Helps in understanding evolution.
  ◦ Compares large sections of DNA.
• Genetic Variation Data
  ◦ Catalogs DNA differences such as SNPs (single-nucleotide polymorphisms, i.e. single-letter changes).
  ◦ Shows whether mutations are linked to diseases.
  ◦ Offers tools to predict the effect of mutations.
• Tools for Analysis
  ◦ BLAST/BLAT for sequence alignment.
  ◦ VEP (Variant Effect Predictor) for functional impact assessment.
  ◦ BioMart for data mining and export.
  ◦ APIs that let programmers retrieve Ensembl data automatically.
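
As an illustration of the programmatic access listed under Tools for Analysis, the sketch below builds a request URL for the gene-symbol lookup endpoint of the Ensembl REST API. The helper name `lookup_url` is hypothetical, and the network call is left commented out so that the example runs offline.

```python
from urllib.parse import quote

# Public Ensembl REST server.
REST_SERVER = "https://rest.ensembl.org"


def lookup_url(symbol: str, species: str = "homo_sapiens") -> str:
    """Build the URL for a gene-symbol lookup (hypothetical helper name)."""
    return (
        f"{REST_SERVER}/lookup/symbol/{quote(species)}/{quote(symbol)}"
        "?content-type=application/json"
    )


if __name__ == "__main__":
    for gene in ("TP53", "BRCA1"):
        print(lookup_url(gene))
    # To fetch the record itself (requires network access):
    # import json, urllib.request
    # with urllib.request.urlopen(lookup_url("TP53")) as response:
    #     print(json.load(response)["id"])
```

Keeping URL construction separate from the request makes the lookup easy to check without contacting the server.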
2. Human Genome Assembly:
The human genome assembly GRCh38 (Genome Reference Consortium Human Build 38) is the current reference genome for Homo sapiens in Ensembl.

Structure and Features:

• Total genome size:
  ◦ Approximately 3.1 billion base pairs (bp).
• Number of chromosomes:
  ◦ 24 distinct chromosomes: 22 autosomes (chromosomes 1–22) and 2 sex chromosomes (X and Y).
• Mitochondrial DNA:
  ◦ A small circular mitochondrial genome (mtDNA).
• Alternative Loci:
  ◦ Extra sequences for highly variable regions.
  ◦ Represent genetic diversity in the human population.
• Unplaced & Unlocalized Scaffolds:
  ◦ Sequences not fully mapped to chromosomes: unlocalized scaffolds are assigned to a chromosome but not to a position on it, while unplaced scaffolds are not assigned to any chromosome.
  ◦ Help capture regions that are still unresolved.
• Improved Genome Accuracy:
  ◦ Fixes sequencing errors and closes gaps present in GRCh37.
  ◦ Better representation of repetitive regions and duplications.
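
The composition of the assembly described above can be written down as a small lookup. This is only a sketch: it assumes the sequence names Ensembl uses for human ("1" to "22", "X", "Y", and "MT"), and the function name `classify_sequence` and the catch-all label are hypothetical.

```python
# Sequence names as Ensembl labels them for human (assumption stated in the
# lead-in): autosomes "1".."22", sex chromosomes "X" and "Y", and "MT".
AUTOSOMES = [str(n) for n in range(1, 23)]
SEX_CHROMOSOMES = ["X", "Y"]
MITOCHONDRIAL = "MT"


def classify_sequence(name: str) -> str:
    """Assign a top-level sequence name to one of the categories listed above."""
    if name in AUTOSOMES:
        return "autosome"
    if name in SEX_CHROMOSOMES:
        return "sex chromosome"
    if name == MITOCHONDRIAL:
        return "mitochondrial"
    # Hypothetical catch-all: anything else is treated as a scaffold or
    # an alternative locus.
    return "scaffold or alternative locus"


if __name__ == "__main__":
    # 22 autosomes + 2 sex chromosomes = 24 distinct chromosomes
    print(len(AUTOSOMES) + len(SEX_CHROMOSOMES))
    print(classify_sequence("17"))
```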
3. Demonstration of Gene-Specific Exploration: TP53 and BRCA1

TP53:
• Gene Summary:
  ◦ Located on chromosome 17 (17p13.1); encodes the tumor suppressor protein p53, which regulates the cell cycle, DNA repair, and apoptosis.
• Gene Structure:
  ◦ 11 exons and 10 introns.
  ◦ Encodes a protein with a DNA-binding domain (critical for target gene activation) and a tetramerization domain (essential for functional oligomerization).
• Variants & Diseases:
  ◦ Mutations (e.g., R175H) are linked to Li-Fraumeni syndrome and to sporadic cancers (breast, lung, colorectal).
• Protein Domains:
  ◦ DNA-binding domain: mutations here impair tumor suppression.
  ◦ Tetramerization domain: required for p53's active tetrameric form.
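
The variant name R175H used above is shorthand for a protein-level substitution: arginine (R) at position 175 is replaced by histidine (H). The following sketch parses such a code and expands it to three-letter amino acid names. The function names are hypothetical, and only simple single-residue substitutions are handled.

```python
import re

# Standard one-letter to three-letter amino acid codes.
AA3 = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def parse_substitution(code: str):
    """Split a code such as 'R175H' into (reference, position, replacement)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
    if match is None or match.group(1) not in AA3 or match.group(3) not in AA3:
        raise ValueError(f"not a simple amino acid substitution: {code!r}")
    return AA3[match.group(1)], int(match.group(2)), AA3[match.group(3)]


def to_three_letter(code: str) -> str:
    """Render the substitution in three-letter protein notation."""
    ref, position, alt = parse_substitution(code)
    return f"p.{ref}{position}{alt}"


if __name__ == "__main__":
    print(parse_substitution("R175H"))
    print(to_three_letter("R175H"))
```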
BRCA1:
• Gene Summary:
  ◦ Located on chromosome 17 (17q21.31); involved in homologous recombination repair of double-strand DNA breaks.
• Gene Structure:
  ◦ 24 exons.
  ◦ Encodes a protein with a RING domain (ubiquitin ligase activity) and BRCT domains (protein-protein interactions in DNA repair).
• Variants & Diseases:
  ◦ Loss-of-function mutations (e.g., frameshift or nonsense) increase the risk of hereditary breast and ovarian cancer.
• Protein Domains:
  ◦ RING domain: mediates E3 ubiquitin ligase activity.
  ◦ BRCT domains: bind phosphorylated proteins in the DNA damage response.
4. Insights Gained from Gene Analysis
• Gene Structure
  ◦ TP53 (11 exons) is compact, while BRCA1 (24 exons) has a larger, more complex structure.
  ◦ Alternative splicing of TP53 generates isoforms with distinct roles in apoptosis.
• Protein Domains & Function
  ◦ TP53: DNA-binding domain mutations impair tumor suppression.
  ◦ BRCA1: the RING and BRCT domains are essential for DNA repair.
• Disease & Therapeutic Implications
  ◦ TP53: germline mutations such as R175H cause Li-Fraumeni syndrome, while somatic mutations drive many cancers.
  ◦ BRCA1: truncating mutations increase breast and ovarian cancer risk; tumors carrying them can be targeted with PARP inhibitors.
• Comparative Genomics
  ◦ TP53 is highly conserved across species, highlighting its critical role in genome integrity.

5. Executing SQL Queries


Retrieve Basic Gene Information for TP53 and BRCA1
Query:
Output:
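
A query of this kind could look like the following sketch. It assumes the table and column names of the Ensembl core schema (`gene`, `xref`, `seq_region`) and runs against an in-memory SQLite stand-in so that it is self-contained. The stable IDs are the Ensembl gene IDs for TP53 and BRCA1, but the start and end coordinates are placeholders, not real positions, and the function name `run_gene_info_query` is hypothetical.

```python
import sqlite3

# Simplified stand-in for three Ensembl core tables. Assumption: only the
# columns used below are modelled; the real schema has many more.
SCHEMA = """
CREATE TABLE seq_region (seq_region_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE xref (xref_id INTEGER PRIMARY KEY, display_label TEXT);
CREATE TABLE gene (
    gene_id INTEGER PRIMARY KEY,
    stable_id TEXT,
    biotype TEXT,
    seq_region_id INTEGER,
    seq_region_start INTEGER,
    seq_region_end INTEGER,
    seq_region_strand INTEGER,
    display_xref_id INTEGER
);
"""

# Gene symbol, stable ID, biotype, chromosome, coordinates, and strand.
GENE_INFO_SQL = """
SELECT x.display_label AS symbol,
       g.stable_id,
       g.biotype,
       sr.name AS chromosome,
       g.seq_region_start,
       g.seq_region_end,
       g.seq_region_strand
FROM gene AS g
JOIN xref AS x ON g.display_xref_id = x.xref_id
JOIN seq_region AS sr ON g.seq_region_id = sr.seq_region_id
WHERE x.display_label IN ('TP53', 'BRCA1')
ORDER BY x.display_label
"""


def run_gene_info_query():
    """Load placeholder rows and run the query (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO seq_region VALUES (1, '17')")
    conn.executemany("INSERT INTO xref VALUES (?, ?)", [(1, "TP53"), (2, "BRCA1")])
    # Start/end values (100-200, 300-400) are PLACEHOLDERS, not real coordinates.
    conn.executemany(
        "INSERT INTO gene VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        [
            (1, "ENSG00000141510", "protein_coding", 1, 100, 200, -1, 1),
            (2, "ENSG00000012048", "protein_coding", 1, 300, 400, -1, 2),
        ],
    )
    rows = conn.execute(GENE_INFO_SQL).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in run_gene_info_query():
        print(row)
```

The SELECT statement uses only standard joins, so it should carry over to a MySQL copy of the core database with little or no change.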

Retrieve Variants of TP53:


Query:

Output:
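
One possible way to retrieve variants for a gene is to select every variant whose position falls inside the gene's coordinates. The sketch below assumes the `variation_feature` and `seq_region` tables of the Ensembl variation schema, again mirrored in an in-memory SQLite database. The variant names, alleles, and positions are placeholders rather than real TP53 variants, and the function name `variants_in_region` is hypothetical.

```python
import sqlite3

# Simplified stand-in for two Ensembl variation tables. Assumption: only
# the columns used below are modelled.
SCHEMA = """
CREATE TABLE seq_region (seq_region_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE variation_feature (
    variation_feature_id INTEGER PRIMARY KEY,
    seq_region_id INTEGER,
    seq_region_start INTEGER,
    seq_region_end INTEGER,
    variation_name TEXT,
    allele_string TEXT,
    consequence_types TEXT
);
"""

# Variants lying entirely inside a region on one chromosome.
VARIANTS_IN_REGION_SQL = """
SELECT vf.variation_name,
       vf.allele_string,
       vf.consequence_types,
       vf.seq_region_start
FROM variation_feature AS vf
JOIN seq_region AS sr ON vf.seq_region_id = sr.seq_region_id
WHERE sr.name = :chrom
  AND vf.seq_region_start >= :gene_start
  AND vf.seq_region_end <= :gene_end
ORDER BY vf.seq_region_start
"""


def variants_in_region(chrom, gene_start, gene_end):
    """Run the region query on placeholder rows (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO seq_region VALUES (1, '17')")
    # All rows below are PLACEHOLDERS, not real variants.
    conn.executemany(
        "INSERT INTO variation_feature VALUES (?, ?, ?, ?, ?, ?, ?)",
        [
            (1, 1, 120, 120, "var_placeholder_1", "G/A", "missense_variant"),
            (2, 1, 180, 180, "var_placeholder_2", "C/T", "synonymous_variant"),
            (3, 1, 250, 250, "var_placeholder_3", "A/G", "intron_variant"),
        ],
    )
    params = {"chrom": chrom, "gene_start": gene_start, "gene_end": gene_end}
    rows = conn.execute(VARIANTS_IN_REGION_SQL, params).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    # 100-200 stands in for the gene's start and end coordinates.
    for row in variants_in_region("17", 100, 200):
        print(row)
```

In practice the start and end values would come from the gene's row in the core database rather than being typed in by hand.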

List All Transcripts of TP53


Query:
Output:

List All Transcripts of BRCA1


Query:

Output:
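
The transcript listings for TP53 and BRCA1 differ only in the gene symbol, so a single parameterized query can serve both. The sketch below assumes the `transcript`, `gene`, and `xref` tables of the Ensembl core schema, reduced to the columns it needs and loaded into an in-memory SQLite database. ENST00000269305 and ENST00000357654 are the canonical Ensembl transcripts of TP53 and BRCA1; the remaining row is a labeled placeholder, and the function name `list_transcripts` is hypothetical.

```python
import sqlite3

# Simplified stand-in for three Ensembl core tables. Assumption: only the
# columns used below are modelled.
SCHEMA = """
CREATE TABLE xref (xref_id INTEGER PRIMARY KEY, display_label TEXT);
CREATE TABLE gene (
    gene_id INTEGER PRIMARY KEY,
    stable_id TEXT,
    display_xref_id INTEGER
);
CREATE TABLE transcript (
    transcript_id INTEGER PRIMARY KEY,
    gene_id INTEGER,
    stable_id TEXT,
    biotype TEXT
);
"""

# All transcripts of the gene with the given symbol.
TRANSCRIPTS_SQL = """
SELECT t.stable_id, t.biotype
FROM transcript AS t
JOIN gene AS g ON t.gene_id = g.gene_id
JOIN xref AS x ON g.display_xref_id = x.xref_id
WHERE x.display_label = ?
ORDER BY t.stable_id
"""


def list_transcripts(symbol):
    """Return (stable_id, biotype) pairs for one gene (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO xref VALUES (?, ?)", [(1, "TP53"), (2, "BRCA1")])
    conn.executemany(
        "INSERT INTO gene VALUES (?, ?, ?)",
        [(1, "ENSG00000141510", 1), (2, "ENSG00000012048", 2)],
    )
    conn.executemany(
        "INSERT INTO transcript VALUES (?, ?, ?, ?)",
        [
            (1, 1, "ENST00000269305", "protein_coding"),
            # PLACEHOLDER row, not a real transcript ID.
            (2, 1, "ENST_PLACEHOLDER_TP53", "retained_intron"),
            (3, 2, "ENST00000357654", "protein_coding"),
        ],
    )
    rows = conn.execute(TRANSCRIPTS_SQL, (symbol,)).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for gene in ("TP53", "BRCA1"):
        print(gene, list_transcripts(gene))
```

Passing the symbol as a bound parameter avoids maintaining two nearly identical queries and keeps the gene name out of the SQL text.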
