Module1 Understanding Bioinformatics

Here are the steps and key information to analyze your assigned gene:
1. Go to the NCBI Gene database (https://www.ncbi.nlm.nih.gov/gene/) and search for your gene symbol
2. Review the gene name, symbol, description, location on chromosome, and links to related records
3. Check the "Cancer" section under "Gene-disease associations"
4. Search PubMed for recent articles on the role of that gene in cancer
5. Note any associated genes or other disease links
6. Explore the RefSeq records for transcript and protein details
7. Use the HomoloGene, OrthoDB or Ensembl

July, 2021

INTRODUCTION TO BIOINFORMATICS

Instructor: Dang Thi Minh Nguyet


French National Research Institute for Sustainable Development (IRD) – University of Montpellier
COURSE DESCRIPTION

• Objectives:
• Address the major aspects of bioinformatics
• Provide hands-on experience in bioinformatics work
• Provide a foundation for learners to continue exploring the bioinformatics domain
• Course format
• 7 sessions, every Saturday from 20:00 to 21:30, starting July 18th
• Platform: https://gather.town/app/2bYlSKRX70dtHGhj/bayclassroom1 | Password: 11Bay2020
• Interaction: https://padlet.com/dangminhnguyet09/71zy9941mx0zcbkt
• Modules:
• Module 1: Understanding bioinformatics
• Module 2: Genetic testing
• Module 3: Introduction to bioinformatics algorithms
• Module 4: Introduction to biostatistics
• Module 5: Workflow in NGS data analysis
WARM UP: TELL EVERYONE ABOUT YOURSELF

• Your studies
• Your background
• Your future project
• What you already know about bioinformatics

• Anything else?
MODULE 1: UNDERSTAND BIOINFORMATICS

• Fundamentals of bioinformatics

• Human genome project

• NCBI database
BIOINFORMATICS?!?

DATA → BIOINFORMATICS TOOLS → ANSWER

Scientists need to find the right tool that gives them the answer

This is a misconception!!!
BIOINFORMATICS IN REALITY

[Diagram: DATA 0 is processed by BIOINFORMATICS TOOL 1, TOOL 2, and TOOL 3, which produce DATA 1, DATA 2, and DATA 3; ANSWERs branch off from the data at several points.]

Tools create new data. Data may have some answers.


DATA IS THE KEY TO UNLOCK SCIENTIFIC QUESTIONS

• Bioinformatics is very different from typical Life Science
• It is an information science: you need to decode information locked away within data
COMPLEXITY VS DECISION MAKING

• Bioinformatics analyses require a large number of relatively simple decisions
• Most of which need to be correct! That’s what makes it difficult!
• There are no absolute rules, only guidelines provided. You must learn to improvise and adapt.
• Explore behaviors
• Expand the scope of the study
• Try new solutions, push the boundaries
BIOINFORMATICS TODAY
HOW IS BIOINFORMATICS PRACTICED?

Command line tools
• Tools are chained together to form a pipeline
• Data “flows” from one command to another

R programming environment
• High-level statistical programming environment
• Best suited for later stages of analyses

Tools with graphical user interfaces
• Web-based interfaces to command line tools
• Large selection of commercial software
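
To make the pipeline idea concrete, here is a minimal sketch that imitates it with Python generators rather than real command line tools. Each stage consumes the output of the previous one, the way data "flows" between chained commands. The function names, the toy FASTA records, and the 0.5 GC threshold are all made up for illustration.

```python
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Stage 1: turn FASTA text lines into (name, sequence) records."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if name is not None:
        yield name, "".join(chunks)


def gc_content(records: Iterable[Tuple[str, str]]) -> Iterator[Tuple[str, float]]:
    """Stage 2: replace each sequence with its GC fraction."""
    for name, seq in records:
        gc = (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0
        yield name, gc


def keep_gc_rich(scored: Iterable[Tuple[str, float]],
                 threshold: float = 0.5) -> Iterator[Tuple[str, float]]:
    """Stage 3: keep only records at or above the GC threshold."""
    for name, gc in scored:
        if gc >= threshold:
            yield name, gc


# Hypothetical toy input, standing in for a real FASTA file.
TOY_FASTA = """>seq1
ATGCGC
GG
>seq2
ATATAT
>seq3
GGCC
""".splitlines()


def run_pipeline(lines: Iterable[str], threshold: float = 0.5) -> List[Tuple[str, float]]:
    """Chain the three stages; data flows from one stage to the next."""
    return list(keep_gc_rich(gc_content(read_fasta(lines)), threshold))


if __name__ == "__main__":
    for name, gc in run_pipeline(TOY_FASTA):
        print(f"{name}\t{gc:.2f}")
```

Any stage can be swapped for a better one without touching the others, which is the same property that makes chained command line tools convenient.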
TOOLS VS DATA REVISITED

• Modern instruments produce vast amounts of data
• Impossible to interpret them without various tools
• Bioinformatics skill means understanding how to extract information from data
• Tools change all the time – we can learn more from the same data

Identify highly expressed genes
• Approach 1: Run the bowtie aligner, then run the cuffdiff software
• Approach 2: Create a spliced alignment file, then quantify the abundances by intersecting the alignments with the genomic intervals, then apply a statistical test to select differentially expressed entries

Tools change. Concepts don’t.


Over time tools implement the same concepts better and better.
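
The quantification step of Approach 2 can be sketched in a few lines, independent of any particular tool. The example below counts how many alignments overlap each gene interval. The gene names, coordinates, and alignments are invented toy data, coordinates are assumed to be 0-based and half-open, and the statistical test that would follow is left out.

```python
from typing import Dict, List, Tuple

# (chromosome, start, end), assumed 0-based and half-open: [start, end)
Interval = Tuple[str, int, int]

# Hypothetical gene annotation (toy values, not real coordinates).
GENES: Dict[str, Interval] = {
    "geneA": ("chr1", 100, 200),
    "geneB": ("chr1", 500, 800),
    "geneC": ("chr2", 50, 150),
}

# Hypothetical alignments, as they might be read from an alignment file.
ALIGNMENTS: List[Interval] = [
    ("chr1", 90, 140),   # overlaps geneA
    ("chr1", 150, 200),  # overlaps geneA
    ("chr1", 200, 250),  # starts where geneA ends: no overlap
    ("chr1", 700, 750),  # overlaps geneB
    ("chr2", 10, 40),    # overlaps nothing
    ("chr2", 140, 190),  # overlaps geneC
]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals on the same chromosome share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def count_per_gene(alignments: List[Interval],
                   genes: Dict[str, Interval]) -> Dict[str, int]:
    """Count the alignments that intersect each gene interval."""
    counts = {name: 0 for name in genes}
    for aln in alignments:
        for name, region in genes.items():
            if overlaps(aln, region):
                counts[name] += 1
    return counts


if __name__ == "__main__":
    for gene, n in count_per_gene(ALIGNMENTS, GENES).items():
        print(gene, n)
```

The nested loop is fine for toy data; real tools implement the same concept with indexed interval structures so that it scales to millions of alignments.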
MODULE 1: UNDERSTAND BIOINFORMATICS

• Fundamentals of bioinformatics

• Human genome project

• NCBI database
INTRODUCTION

https://www.youtube.com/watch?v=-hryHoTIHak
INTRODUCTION

• Human genome: more than 3 billion ‘letters’ (A, C, T and G)
• First phase: from 1990 to 2003
• Twenty research institutes from six different countries: China, France, Germany, Japan, UK and USA
• Sequencing the human genome:
• Identify important genes and regulatory regions
• Better understand their role in disease
• Investigate our origins using variations in the DNA sequence
SAMPLE COLLECTION
HOW WAS THE HUMAN GENOME SEQUENCED?
WHO HAS ACCESS TO THE HUMAN GENOME DATA?

• Everyone
• Provide free and open access to the data for everyone in the scientific community and the public domain
• Deposited in freely available, online public databases
• Genome browsers: www.ensembl.org
• Access to the genomes of more than 50 species
MODULE 1: UNDERSTAND BIOINFORMATICS

• Fundamentals of bioinformatics

• Human genome project

• NCBI database
SYSTEMS BIOLOGY – FROM ONE GENE TO A SYSTEM VIEW

A gene

A couple of genes

Many genes

How these genes work together cooperatively
INTERCONNECTION BETWEEN BIOLOGICAL DATABASES
NATIONAL CENTER FOR BIOTECHNOLOGY INFORMATION

• Created by Public Law 100-607 in 1988 as part of the National Library of Medicine at NIH to:
• Create automated systems for knowledge about molecular biology, biochemistry, and genetics
• Perform research into advanced methods of analyzing and interpreting molecular biology data
• Enable biotechnology researchers and medical care personnel to use the systems and methods developed
• The NCBI advances science and health by providing access to biomedical and genomic information
• Builders and providers of GenBank, Entrez, BLAST, PubMed, dbGaP, SRA, dbSNP, PubChem and much, much more…
• Center for basic research and training in computational biology
• URL: https://www.ncbi.nlm.nih.gov/
GENBANK SEQUENCES & NCBI WEB USERS
MAIN DATABASES OF NCBI

• Archival or Primary Data:
• Text: PubMed
• DNA sequence: GenBank/EMBL/DDBJ
• Protein sequence/structures: PDB (RCSB)
• Curated or Processed Data:
• Sequence: RefSeq
• Protein sequences and structures: MMDB
• Organisms maps: Entrez Genomes
• Genes: LocusLink (loci), HomoloGene (orthologs), OMIM (disease)
• Specialized Databases:
• Organism: Maps in Entrez Genomes
• Function: Sequences in UniVec, UniGene
• Sequencing Methods: dbEST, dbGSS, dbSTS, HTG
ENTREZ INCREASES DISCOVERY SPACE
PRACTICAL SESSION

Figure out how the genes assigned to each of you are implicated in cancers
• What sections are provided by NCBI Gene?
• Gene symbol, full name, reviewed by RefSeq
• Summary of its functions
• Location on the human genome (based on GRCh38)
• How this gene is related to cancer:
• Get the one open-access reference that is, in your opinion, most relevant to cancers. List the article title, authors, institutions, publication year, journal name
• Other genes associated with cancer
• Association with other diseases
• Transcript and protein sequences
• Find tools that help you obtain its orthologous genes in other species (mouse, fruit fly…), and list the results and key indicators
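
The same lookups can also be scripted instead of clicked through. The sketch below builds search URLs for the NCBI E-utilities `esearch.fcgi` endpoint using only the Python standard library. The helper names are hypothetical, KRAS is just one of the assigned genes used as an example, and the field tags (`[sym]`, `[orgn]`, `[Title/Abstract]`) reflect my understanding of NCBI search syntax, so check them against the NCBI help pages before relying on them.

```python
import json
from typing import List
from urllib.parse import urlencode
from urllib.request import urlopen

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(db: str, term: str, retmax: int = 5) -> str:
    """Build an ESearch URL that asks for a JSON list of matching record IDs."""
    params = {"db": db, "term": term, "retmode": "json", "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def gene_query(symbol: str, organism: str = "Homo sapiens") -> str:
    """Query for a gene symbol in one organism (field tags are assumptions)."""
    return f"{symbol}[sym] AND {organism}[orgn]"


def pubmed_cancer_query(symbol: str) -> str:
    """Query for articles mentioning the gene and cancer (assumed field tag)."""
    return f"{symbol}[Title/Abstract] AND cancer[Title/Abstract]"


def search_ids(url: str) -> List[str]:
    """Fetch an ESearch URL and return the matching IDs (needs network access)."""
    with urlopen(url, timeout=30) as response:
        payload = json.load(response)
    return payload["esearchresult"]["idlist"]


if __name__ == "__main__":
    gene_url = build_esearch_url("gene", gene_query("KRAS"))
    pubmed_url = build_esearch_url("pubmed", pubmed_cancer_query("KRAS"))
    print(gene_url)
    print(pubmed_url)
    # Uncomment to actually query NCBI:
    # print(search_ids(gene_url))
```

The script only returns record IDs; reading the Gene page itself is still the best way to find the summary, location, and disease associations the exercise asks for.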
PRACTICAL SESSION

Group | Students | Gene
1 | Lưu Ngọc Tuyền, Bàng Quỳnh Anh | MNS1
2 | Trần Tuấn Phát, Nguyễn Thị Hải Hà | CEP192
3 | Đỗ Ngọc Thuý Anh, Nguyễn Phan Long | KRAS
4 | Võ Ngọc Trâm Anh, Le Thị Huyền Trang | PLK1
5 | Vũ Thị Thuỳ Trang, Phạm Hoàng Hải | DCTN1
6 | Trần Khánh Quỳnh, Võ Thị Kim Hoa | PTCH1
7 | Đào Thị Lan Anh, Nguyễn Ái Như, Lê Đức Việt | AKAP9
