Genomics
Functional Genomics
Comparative Genomics
Group 2:
Francis Kyomuhendo 2024/HD17/2403U
Gilbert Gumisiriza 2019/HD17/23142U
Synopsis
Introduction
The Central Dogma
Structural Genomics
Functional Genomics
Comparative Genomics
Applications and Case Studies
Review of Trends and Future Directions
Summary and Conclusions
5/6/2025 Kyomuhendo Francis & Gumisiriza Gilbert, MBS7228
Introduction
Genomics studies whole genomes
Structural Genomics: determines the 3D structures of proteins encoded by genomes
Functional Genomics: assigns functions to genes using high-throughput assays
Comparative Genomics: compares genomes across species to infer evolution
The Central Dogma
Information is stored in DNA and copied during replication (genomics & epigenomics)
Some proteins mediate biochemical functions (metabolomics)
Structural Genomics & Physical Mapping of Genomes
Structural genomics:
• Sequence assembly
• Annotation
Physical mapping:
• Large fragments
• Genomic libraries
Genomics: Study of the genome (all of an organism's genes), including interactions of those genes with each other and with the organism’s environment
Genomic library: Collection of DNA fragments that together represent the entire genome of an organism
Structural genomics: Aims to determine the 3-D structures of proteins and then investigate their functions
Physical mapping: Determines the physical locations of, and distances between, DNA sequences on a chromosome, expressed in base pairs
Genome assembly: Combines short DNA sequence fragments (reads) to reconstruct the original sequence
Genome annotation: Finds and labels the locations of genes and other biological features on nucleotide sequences
Source: microbenotes.com
Genome mapping
Genome mapping: Identification of the locations of genes, mutations, or traits on a chromosome
Cytologic maps: Describe visible banding patterns on stained chromosomes that serve as markers
Source: microbenotes.com
RNA-Seq
Source: https://www.veteringroup.us/articles/figures/IJVSR-8-208-g002.gif
Sanger sequencing: Uses fluorescently labeled dideoxynucleotides to terminate DNA chains at varying lengths
The resulting fragments are separated by electrophoresis, and the sequence is determined by reading the gel banding pattern
Source: https://www.veteringroup.us
Shotgun sequencing: DNA clones are randomly sequenced from both ends, generating a large number of DNA fragments that are assembled by a computer program to create a complete genome sequence
Source: gigabyte31-cover.jpg
Short fragments need to be joined together by merging their overlapping regions
First, base calling determines the nucleotide at every position in each read
A quality score is assigned to each base call and represents the confidence (accuracy) of that call
Then the individual sequence reads are assembled into contiguous sequences, known as contigs
This involves identifying overlaps between fragments, determining their relative order, and deriving a consensus sequence for each contig
Programs for processing raw sequencing data and assembling contigs include Phred, Phrap, TIGR Assembler, and ARACHNE
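The link between a base-call quality score and its error probability can be illustrated with the standard Phred relation Q = -10 log10(P). The sketch below is only an illustration: the function names are hypothetical, and the ASCII offset of 33 is an assumption (the common FASTQ "Phred+33" convention), not something stated in these slides.

```python
import math


def phred_quality(error_probability: float) -> float:
    """Return the Phred quality score Q = -10 * log10(P) for error probability P."""
    if not 0.0 < error_probability <= 1.0:
        raise ValueError("error probability must be in (0, 1]")
    return -10.0 * math.log10(error_probability)


def error_probability(quality: float) -> float:
    """Invert the Phred relation: P = 10 ** (-Q / 10)."""
    return 10.0 ** (-quality / 10.0)


def decode_fastq_qualities(quality_string: str, offset: int = 33) -> list[int]:
    """Decode an ASCII-encoded quality string (assumed Phred+33) into integer scores."""
    return [ord(character) - offset for character in quality_string]


# Q30 corresponds to a 1-in-1000 chance that the base call is wrong.
print(round(phred_quality(0.001)))
print(decode_fastq_qualities("I5!"))
```

Higher scores mean more confident calls, so an assembler can give low-quality bases less weight when it derives a consensus.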
Genome Assembly
Involves reconstruction of the genome sequence from short or long sequencing reads
De novo assembly
Source: BFnmeth1935_Figb_HTML.jpg
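As a rough illustration of how overlaps between reads can be turned into contigs, here is a toy greedy overlap-merge sketch. It is not the algorithm used by Phrap, ARACHNE, or any modern assembler; it assumes error-free reads from a single strand (so the consensus is simply the merged string), and all names and example reads are made up for illustration.

```python
def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    max_length = min(len(left), len(right))
    for size in range(max_length, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(reads: list[str], min_overlap: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicate reads, keep order
    while True:
        best_size, best_i, best_j = 0, None, None
        for i, left in enumerate(contigs):
            for j, right in enumerate(contigs):
                if i == j:
                    continue
                size = overlap_length(left, right, min_overlap)
                if size > best_size:
                    best_size, best_i, best_j = size, i, j
        if best_size == 0:
            return contigs  # no remaining overlaps: these are the final contigs
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


reads = ["ATGGCGT", "GCGTACC", "TACCTGA"]
print(greedy_assemble(reads))
```

Real assemblers must also handle sequencing errors, reverse-complement reads, and repeats, which is why overlap-layout-consensus and de Bruijn graph methods are used in practice.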
Quality assessment ensures the genome assembly is accurate and complete, using tools like:
QUAST (checks scaffold statistics)
BUSCO (assesses genome completeness)
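One widely reported contiguity statistic is N50: the length L such that contigs of length L or longer contain at least half of the assembled bases. The sketch below is a minimal, assumed illustration of that definition with a hypothetical function name and made-up lengths; it is not QUAST's code.

```python
def n50(lengths: list[int]) -> int:
    """Return the N50 of a list of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("at least one length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


contig_lengths = [80, 70, 50, 40, 30, 20]
print(n50(contig_lengths))
```

A larger N50 suggests a more contiguous assembly, but it says nothing about correctness or completeness, which is why it is paired with checks such as BUSCO.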
Gap filling and error correction close missing regions and fix sequencing errors, using tools like:
Pilon (Illumina)
Medaka (Nanopore)
Structural annotation finds coding genes, non-coding RNAs, promoters, etc., using tools like:
Prokka
GeneMark
AUGUSTUS
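To give a feel for what locating a coding region means, here is a deliberately simple open reading frame (ORF) scan. It is only a sketch under strong assumptions (forward strand only, ATG start, no introns, no statistical gene model) and is not how Prokka, GeneMark, or AUGUSTUS work; the function name and example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence: str, min_codons: int = 3) -> list[tuple[int, int, int]]:
    """Return (start, end, frame) for forward-strand ORFs.

    `start` is the index of the ATG, `end` is one past the stop codon, and
    `min_codons` counts the codons before the stop (including the ATG).
    """
    seq = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos  # first in-frame start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3, frame))
                start = None  # look for the next ORF in this frame
    return orfs


example = "CCATGAAATTTGGGTAACC"
for start, end, frame in find_orfs(example):
    print(frame, start, end, example[start:end])
```

Production gene finders add trained models for codon usage, splice sites, and promoters, and they scan both strands.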
Functional annotation assigns functions to identified genes based on known databases like:
InterProScan (protein domains)
KEGG (metabolic pathways)
Source: whole-genome-assembly.jpg
Reverse Genetics: Systematic gene knock-downs and/or knock-outs to phenotype gene functions
Functional assays: Reporter constructs, epigenomic profiling, and comparative expression analyses
Source: EMBL-EBI
Computational Assignment of Gene Function
Genome sequencing can identify genes but does not reveal their functions
Computationally generated, tentative identification is based on homology with genes of known function.
A common way to infer gene function is to examine the encoded protein (e.g. a BLASTp search)
Many predicted “genes” are classified as orphans (no detectable homologs), and some of these may not be real genes
PCR can be used to produce a gene knockout that results in a permanent DNA change
Hybridisation: Labelled cDNA (made from RNA) from test and reference samples co-hybridises to the array; fluorescent readouts quantify relative transcript abundance
Source: EMBL-EBI
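Relative transcript abundance in a two-channel hybridisation is commonly summarised as a log2 ratio of test to reference signal. The sketch below assumes already background-corrected intensities; the pseudocount of 1, the function name, and the gene names and values are illustrative assumptions, not data from these slides.

```python
import math


def log2_ratios(test: dict[str, float], reference: dict[str, float],
                pseudocount: float = 1.0) -> dict[str, float]:
    """Return log2(test / reference) per gene, using a pseudocount to avoid log(0)."""
    return {
        gene: math.log2((test[gene] + pseudocount) / (reference[gene] + pseudocount))
        for gene in test
        if gene in reference
    }


# Hypothetical fluorescence intensities for three genes.
test_channel = {"geneA": 799.0, "geneB": 99.0, "geneC": 49.0}
reference_channel = {"geneA": 99.0, "geneB": 99.0, "geneC": 199.0}

for gene, ratio in log2_ratios(test_channel, reference_channel).items():
    print(gene, round(ratio, 2))
```

Positive values indicate higher abundance in the test sample, negative values lower abundance, and values near zero no change.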
Plant Microarrays: Arabidopsis whole-genome arrays (e.g. Agilent 44K) and custom slides are widely used to study growth stages, pathogen response, etc.
Limitations: Require prior sequence knowledge; limited dynamic range and detection sensitivity
Complemented by RNA-Seq (NGS) for transcript discovery and splicing analysis
Valuable for large-scale screens
Sequencing,
Reverse Genetics: Systematic knockout lines (e.g. Arabidopsis T-DNA insertion collections covering the genome)
Enables analysis of gene function by disrupting known loci
CRISPR-Cas9
Example: Arabidopsis SALK T-DNA lines provide mutants for >80% of annotated genes;
phenotypic databases (TAIR) catalog mutant effects.
Gene Silencing: Artificial hairpin constructs or transgenes produce dsRNA for target genes. Endogenous Dicer/RISC machinery degrades the target transcript, effectively “knocking down” gene expression.
Virus-Induced Gene Silencing: VIGS utilises engineered plant viruses carrying a fragment of a host gene. Infection spreads the fragment, triggering RNAi against the host gene. Widely used in N. benthamiana for rapid gene function testing.
Epigenomics: Genome-wide profiling of chromatin state (ChIP-seq for histone marks, DNA methylation
maps) to link epigenetic regulation with gene expression patterns.
Example: Correlating unknown enzymes with known metabolites reveals pathway enzymes
Systems Biology: Combining metabolite QTLs (mQTL) and transcriptomics aids quantitative trait gene mapping and metabolic engineering.
Omics Integration & Pathway Elucidation
Integrated Analysis: Simultaneous profiling (transcriptome + proteome + metabolome) enables reconstruction
of regulatory and metabolic networks.
Network Modelling: Graph-based and ML methods predict gene function by linking multi-omics data.
GWAS can incorporate metabolite levels to pinpoint genes affecting metabolism.
Source: EMBL-EBI
Functional associations:
• Metabolic pathways
• Transcription regulation
• Signaling pathways
• Protein complexes
• Cellular processes
Comparative genomics
Comparing genomes:
• Gene fusions
• Gene neighborhood conservation
• Gene presence/absence
Comparing genomics data:
• Horizontal comparative genomics
• Vertical comparative genomics
Stress Responses: Genome-wide profiling of responses to drought, cold, and pathogens reveals defense genes. RNAi or CRISPR knockout confirms function (e.g. Arabidopsis R gene networks).
Pathway Engineering: Functional genomics pinpoints enzymes for metabolic engineering (e.g. vitamin C biosynthesis). Overexpression or silencing is validated by metabolomics.
Reference: Saito et al. – transcript/metabolite co-occurrence as a tool for gene discovery in A. thaliana.
Biotechnology
Marker-Assisted Breeding: Comparative genomics identifies conserved trait loci (yield, disease resistance). SNP markers from model studies accelerate breeding in related crops.
Transgenic Crops: Genes from model species (pest resistance, stress tolerance) are engineered into crops. Structural genomics aids enzyme optimization (herbicide target modification).
Synthetic Biology: Design of synthetic regulatory circuits in plants (synthetic promoters controlled by transcription factors identified via functional genomics).
Comparative Models: Use plant models (Arabidopsis, N. benthamiana) to study basic biology (e.g. virus-host interactions) and translate to crops (e.g. brassica).
Mutant Screens: Forward genetic screens (e.g. EMS-induced mutants) uncovered key regulators (e.g. phytochrome signaling), which were mapped via genome sequencing.
Metabolite Profiling: Mutant collections (e.g. 50 mutants in metabolism) were profiled by metabolomics to build a metabolic network atlas.
Rapid assays (weeks vs. months for transgenics) allow functional testing of plant genes.
VIGS Example: Silencing of NbPDS (phytoene desaturase) causes photobleaching – used as a visual
control for VIGS efficiency. Many studies use TRV-VIGS in N. benthamiana to
identify disease resistance genes or metabolic enzymes.
Transient Expression: Agroinfiltration also used for protein localization, protein–protein interactions (BiFC, co-IP)
and overexpression, complementing stable transformation.
Untargeted Profiling: Example – drought vs. control in Arabidopsis; LC-MS reveals accumulation of osmoprotectants (proline, sugars). Transcriptomics of the same samples pinpoints biosynthetic genes.
Coexpression with Metabolites: Saito et al. emphasize that correlating transcript and metabolite data
in Arabidopsis is powerful for discovering pathway genes. Flavonoid and
glucosinolate pathways were elucidated this way.
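The transcript-metabolite correlation idea can be sketched as ranking candidate genes by how well their expression tracks a metabolite across the same samples. This is a minimal illustration using Pearson correlation; the gene names and values are invented, and real analyses would add normalisation, multiple-testing control, and many more samples.

```python
import math


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / math.sqrt(var_x * var_y)


def rank_candidates(metabolite: list[float],
                    transcripts: dict[str, list[float]]) -> list[tuple[str, float]]:
    """Rank genes by correlation of their expression with the metabolite level."""
    scores = {gene: pearson(levels, metabolite) for gene, levels in transcripts.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical metabolite levels and transcript levels across five samples.
metabolite_levels = [1.0, 2.0, 3.0, 4.0, 5.0]
transcript_levels = {
    "gene1": [2.0, 4.0, 6.0, 8.0, 10.0],
    "gene2": [5.0, 4.0, 3.0, 2.0, 1.0],
    "gene3": [1.0, 3.0, 2.0, 5.0, 4.0],
}
for gene, r in rank_candidates(metabolite_levels, transcript_levels):
    print(gene, round(r, 2))
```

Genes at the top of the ranking are candidates for the pathway, to be confirmed experimentally (for example with knockout lines or silencing).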
Arabidopsis knowledge informed tomato and brassica research (homologs for disease resistance, oil biosynthesis).
Comparative maps help clone crop genes based on colinearity.
Synteny Examples:
Despite >100 MYA of divergence, Arabidopsis and tomato genomes share syntenic blocks, enabling prediction of orthologous genes. Similarity with soybean (100 MYA) is also observed.
Pan-Genome (Arabidopsis):
A recent study of 69 A. thaliana accessions reveals that 60% of ~33,000 gene families are core, while 40% are dispensable. This genomic diversity helps explain adaptation and is a resource for finding novel genes in wild accessions.
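The core/dispensable split can be sketched from a presence/absence table: a gene family is core if it occurs in every accession and dispensable otherwise. The example below is a minimal sketch under that assumed definition (published studies may use softer thresholds); accession and family names are placeholders.

```python
def classify_gene_families(presence: dict[str, set[str]]) -> tuple[set[str], set[str]]:
    """Split gene families into (core, dispensable) across accessions.

    `presence` maps each accession name to the set of gene families found in it.
    """
    family_sets = list(presence.values())
    pan_genome = set().union(*family_sets)
    core = set.intersection(*family_sets) if family_sets else set()
    return core, pan_genome - core


# Hypothetical presence/absence data for three accessions.
accessions = {
    "acc1": {"famA", "famB", "famC"},
    "acc2": {"famA", "famB"},
    "acc3": {"famA", "famB", "famD"},
}
core, dispensable = classify_gene_families(accessions)
total = len(core) + len(dispensable)
print(sorted(core), sorted(dispensable))
print(f"core fraction: {len(core) / total:.0%}")
```

As more accessions are added, the core set can only shrink or stay the same while the pan-genome can only grow, which is why the number of genomes sampled matters when comparing such percentages.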
Cryo-EM Advances:
Single-particle cryo-EM yields structures of large complexes and membrane proteins at
atomic resolution.
Integrative structural methods (cryo-EM, crystallography, NMR) tackle systems previously
intractable.
CRISPR Screens:
Genome-wide CRISPR libraries for knockouts and CRISPRi in plants will allow pooled genetic
screens, linking genes to phenotypes at scale.
Machine Learning:
AI will integrate multi-omics data (genomics, transcriptomics, phenomics) to predict gene
functions and genotype–phenotype relationships (phenotypic prediction models for breeding).
Synthetic Genomics:
De novo synthesis of plant chromosomes or gene circuits for metabolic engineering (custom
pathways for bio-factories).
Future Directions: Comparative Genomics
Pangenome Sequencing:
Large-scale sequencing of crop accessions (like the 69 Arabidopsis accessions) and wild
relatives will capture structural variation and dispensable genes.
Breeding will leverage this natural diversity.
Evolutionary Insight:
Comparative epigenomics across species; study of polyploidy (e.g. Brassica species, wheat)
to understand gene retention and expression dominance.
Genome Editing:
Use knowledge from comparative analyses to engineer novel traits (e.g. introduce stress-resistance alleles from one species into another via genome editing).
Ongoing advances (AI, single-cell, pan-genomics) promise even deeper insights into
plant genomes and their applications in biotechnology.
2. O'Malley, R. C., Barragan, C. C., & Ecker, J. R. (2015). A user's guide to the Arabidopsis T-DNA insertion mutant collections. Methods in molecular
biology (Clifton, N.J.), 1284, 323–342. https://doi.org/10.1007/978-1-4939-2444-8_16
3. Todd, A. T., Liu, E., Polvi, S. L., Pammett, R. T., & Page, J. E. (2010). A functional genomics screen identifies diverse transcription factors that regulate alkaloid biosynthesis in Nicotiana benthamiana. The Plant journal : for cell and molecular biology, 62(4), 589–600. https://doi.org/10.1111/j.1365-313X.2010.04186.x
4. Liu, S., Li, K., Dai, X. et al. A telomere-to-telomere genome assembly coupled with multi-omic data provides insights into the evolution of hexaploid
bread wheat. Nat Genet 57, 1008–1020 (2025). https://doi.org/10.1038/s41588-025-02137-x
5. Chaudhary, D., Jeena, A. S., Rohit, Gaur, S., Raj, R., Mishra, S., Kajal, Gupta, O. P., & Meena, M. R. (2024). Advances in RNA Interference for Plant
Functional Genomics: Unveiling Traits, Mechanisms, and Future Directions. Applied biochemistry and biotechnology, 196(9), 5681–5710.
https://doi.org/10.1007/s12010-023-04850-x
6. Barreda, L., Boutet, S., De Vos, D. et al. Specialized metabolome and transcriptome atlas of developing Arabidopsis thaliana seed under warm
temperatures. Sci Data 12, 306 (2025). https://doi.org/10.1038/s41597-025-04563-2
7. Watanabe, M., & Tohge, T. (2023). Species-specific 'specialized' genomic region provides the new insights into the functional genomics
characterizing metabolic polymorphisms in plants. Current opinion in plant biology, 75, 102427. https://doi.org/10.1016/j.pbi.2023.102427
8. Eckardt N. A. (2001). Everything in its place. Conservation of gene order among distantly related plant species. The Plant cell, 13(4), 723–725.
https://doi.org/10.1105/tpc.13.4.723