ESTs: Gene Discovery Made Easier

ESTs are small pieces of DNA sequence generated by sequencing expressed genes that can help identify unknown genes and map their positions. Researchers isolate mRNA from cells, convert it to stable cDNA, and then sequence short segments from the ends of the cDNA to create ESTs. ESTs serve as landmarks for gene mapping and discovery by greatly reducing the time needed to locate genes. Using observable biological clues, scientists can search for disease-related ESTs and examine patient DNA for mutations in candidate genes identified through EST analysis.

Uploaded by

Nihal


ESTs: GENE DISCOVERY MADE EASIER

Investigators are working diligently to sequence and assemble the genomes of various organisms,
including the mouse and human, for a number of important reasons. Although important goals of any
sequencing project may be to obtain a genomic sequence and identify a complete set of genes, the
ultimate goal is to gain an understanding of when, where, and how a gene is turned on, a process
commonly referred to as gene expression. Once we begin to understand where and how a gene is
expressed under normal circumstances, we can then study what happens in an altered state, such as in
disease. To accomplish the latter goal, however, researchers must identify and study the protein, or
proteins, coded for by a gene.

As one can imagine, finding a gene that codes for a protein, or proteins, is not easy. Traditionally,
scientists would start their search by defining a biological problem and developing a strategy for
researching the problem. Oftentimes, a search of the scientific literature provided various clues about
how to proceed. For example, other laboratories may have published data that established a link
between a particular protein and a disease of interest. Researchers would then work to isolate that
protein, determine its function, and locate the gene that coded for the protein. Alternatively, scientists
could conduct what is referred to as linkage studies to determine the chromosomal location of a
particular gene. Once the chromosomal location was determined, scientists would use biochemical
methods to isolate the gene and its corresponding protein. Either way, these methods took a great deal
of time—years in some cases—and yielded the location and description of only a small percentage of
the genes found in the human genome.

Now, however, the time required to locate and fully describe a gene is rapidly
decreasing, thanks to the development of, and access to, a technology used to
generate what are called Expressed Sequence Tags, or ESTs. ESTs provide
researchers with a quick and inexpensive route for discovering new genes, for
obtaining data on gene expression and regulation, and for constructing genome
maps. Today, researchers using ESTs to study the human genome find themselves
riding the crest of a wave of scientific discovery the likes of which has never been
seen before.

An Expressed Sequence Tag is a tiny portion of an entire gene that can be used to help
identify unknown genes and to map their positions within a genome.

What Are ESTs and How Are They Made?

ESTs are small pieces of DNA sequence (usually 200 to 500 nucleotides long) that are generated by
sequencing either one or both ends of an expressed gene. The idea is to sequence bits of DNA that
represent genes expressed in certain cells, tissues, or organs from different organisms and use these
"tags" to fish a gene out of a portion of chromosomal DNA by matching base pairs. The challenge
associated with identifying genes from genomic sequences varies among organisms and is dependent
upon genome size as well as the presence or absence of introns, the intervening DNA sequences
interrupting the protein coding sequence of a gene.
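
The idea of using a tag to "fish" a gene out of chromosomal DNA by matching base pairs can be illustrated with a minimal sketch. The function names and the toy sequences below are hypothetical, and the sketch assumes exact matching on either strand. It ignores introns and sequencing errors, which real alignment tools must tolerate.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    pairing = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairing[base] for base in reversed(seq.upper()))


def locate_tag(genomic: str, tag: str) -> list[tuple[int, str]]:
    """Find every exact match of a tag on either strand of a genomic sequence.

    Returns (start position, strand) pairs, with positions given on the
    forward strand. This is a toy exact-match search, not a real aligner.
    """
    genomic = genomic.upper()
    hits = []
    for strand, probe in (("+", tag.upper()), ("-", reverse_complement(tag))):
        start = genomic.find(probe)
        while start != -1:
            hits.append((start, strand))
            start = genomic.find(probe, start + 1)
    return hits


# Hypothetical toy data: a short "chromosomal" fragment and a 10-base tag.
genomic = "TTGACCGTAGGCTAACGTTAGC"
tag = "GTAGGCTAAC"
print(locate_tag(genomic, tag))
```

Because a tag derived from mRNA may span an exon boundary, an exact search like this one would miss such tags in intron-containing genomic DNA, which is part of why the challenge varies among organisms.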

Separating the Wheat from the Chaff: Using mRNA to Generate cDNA

Gene identification is very difficult in humans, because most of our genome is composed of non-coding
DNA, including introns and the regions between genes, interspersed with relatively few DNA coding
sequences. Genes are expressed as proteins through a complex process composed of two main steps.
Each gene (DNA) must first be converted, or transcribed, into messenger RNA (mRNA), RNA that serves
as a template for protein synthesis. The resulting mRNA then guides the synthesis of a protein through
a process called translation. Interestingly, mRNAs in a cell do not contain sequences from the regions
between genes, nor from the non-coding introns that are present within many genes. Therefore,
isolating mRNA is key to finding expressed genes in the vast expanse of the human genome.

Figure 1. An overview of the process of protein synthesis.

Protein synthesis is the process whereby the information in DNA directs the production of proteins.
The process is divided into two parts: transcription and translation. During transcription, one strand
of a DNA double helix is used as a template by RNA polymerase to synthesize an mRNA. During this
step, mRNA passes through various phases, including one called splicing, where the non-coding
sequences are eliminated. In the next step, translation, the mRNA guides the synthesis of the protein
by adding amino acids, one by one, as dictated by the DNA and represented by the mRNA.
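
The three stages described for Figure 1 can be traced on a toy sequence. The following is a minimal sketch under stated assumptions: the template strand, the intron coordinates, and the five-entry codon table are all hypothetical values chosen for illustration, and the intron position is supplied by hand rather than recognized from splice signals as it would be in a cell.

```python
# Partial codon table for illustration only; the full genetic code has 64 codons.
CODON_TABLE = {"AUG": "M", "GCU": "A", "AAA": "K", "UGG": "W", "UAA": "*"}


def transcribe(template_3_to_5: str) -> str:
    """Build the mRNA (5'->3') complementary to a template strand written 3'->5'."""
    pairing = {"A": "U", "T": "A", "G": "C", "C": "G"}
    return "".join(pairing[base] for base in template_3_to_5.upper())


def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Remove introns given as half-open (start, end) intervals."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])
        position = end
    exons.append(pre_mrna[position:])
    return "".join(exons)


def translate(mrna: str) -> str:
    """Read codons from the first base until a stop codon ('*') is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical toy gene: two exons separated by a 9-base intron at positions 6-14.
template = "TACCGACATTCAGTCTTTACCATT"
pre_mrna = transcribe(template)
mature_mrna = splice(pre_mrna, [(6, 15)])
print(pre_mrna, mature_mrna, translate(mature_mrna))
```

The mature mRNA is shorter than the gene it came from, which is the property that makes mRNA, and the cDNA made from it, a compact record of expressed sequence.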

The problem, however, is that mRNA is very unstable outside of a cell; therefore, scientists use special
enzymes to convert it to complementary DNA (cDNA). cDNA is a much more stable compound
and, importantly, because it was generated from an mRNA in which the introns have been removed,
cDNA represents only expressed DNA sequence.

cDNA is a form of DNA prepared in the laboratory using an enzyme called
reverse transcriptase. cDNA production is the reverse of the usual process of
transcription in cells because the procedure uses mRNA as a template rather
than DNA. Unlike genomic DNA, cDNA contains only expressed DNA
sequences, or exons.
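
In sequence terms, reverse transcription amounts to building a DNA strand complementary to the mRNA. The sketch below is a simplified illustration with hypothetical function names and a toy mRNA. It models only base pairing, not the enzymology of reverse transcriptase or the priming step.

```python
def reverse_transcribe(mrna: str) -> str:
    """Return the first-strand cDNA (written 5'->3') complementary to an mRNA."""
    pairing = {"A": "T", "U": "A", "G": "C", "C": "G"}
    return "".join(pairing[base] for base in reversed(mrna.upper()))


def second_strand(first_strand: str) -> str:
    """Return the DNA strand complementary to the first-strand cDNA (5'->3')."""
    pairing = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairing[base] for base in reversed(first_strand.upper()))


# Hypothetical spliced mRNA used only for illustration.
mrna = "AUGGCUAAAUGGUAA"
first = reverse_transcribe(mrna)
print(first)
print(second_strand(first))
```

The second strand carries the same sequence as the mRNA with T in place of U, so the double-stranded cDNA preserves the intron-free message in a stable DNA form.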

From cDNAs to ESTs

Once cDNA representing an expressed gene has been isolated, scientists can
then sequence a few hundred nucleotides from either end of the molecule to
create two different kinds of ESTs. Sequencing only the beginning portion of
the cDNA produces what is called a 5' EST. A 5' EST is obtained from the
portion of a transcript that usually codes for a protein. These regions tend to
be conserved across species and do not change much within a gene family.
Sequencing the ending portion of the cDNA molecule produces what is called
a 3' EST. Because these ESTs are generated from the 3' end of a transcript, they are likely to fall
within non-coding, or untranslated regions (UTRs), and therefore tend to exhibit less cross-species
conservation than do coding sequences.

A "gene family" is a group of closely related genes that produces similar
protein products.
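
As a rough illustration of how the two kinds of tags relate to a cDNA, the sketch below takes a fixed-length read from each end of a sequence. The function name, the default read length, and the toy cDNA are hypothetical. Both reads are reported on the same strand for simplicity, whereas a real 3' read is usually obtained from the opposite strand.

```python
def make_ests(cdna: str, read_length: int = 300) -> tuple[str, str]:
    """Return (5' EST, 3' EST) as the first and last read_length bases of a cDNA.

    If the cDNA is shorter than read_length, both tags cover the whole molecule.
    """
    five_prime = cdna[:read_length]
    three_prime = cdna[-read_length:]
    return five_prime, three_prime


# Hypothetical 1,000-base cDNA: the runs of A and G mark the two ends.
cdna = "A" * 100 + "C" * 800 + "G" * 100
five_prime_est, three_prime_est = make_ests(cdna, read_length=100)
print(len(five_prime_est), len(three_prime_est))
```

With a read length in the 200 to 500 nucleotide range mentioned earlier, the two tags from a long cDNA do not overlap, which is why a 5' EST and a 3' EST from the same gene can look unrelated until they are linked to a common clone.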

A UTR is that part of a gene that is not translated into protein.

Figure 2. An overview of how ESTs are generated.

ESTs are generated by sequencing cDNA, which itself is synthesized from the mRNA molecules in a
cell. The mRNAs in a cell are copies of the genes that are being expressed. mRNA does not contain
sequences from the regions between genes, nor from the non-coding introns that are present within
many genes.

ESTs: Tools for Gene Mapping and Discovery

ESTs as Genome Landmarks

Just as a person driving a car may need a map to find a destination, scientists searching for genes also
need genome maps to help them to navigate through the billions of nucleotides that make up the
human genome. For a map to make navigational sense, it must include reliable landmarks or
"markers". Currently, the most powerful mapping technique, and one that has been used to generate
many genome maps, relies on Sequence Tagged Site (STS) mapping. An STS is a short DNA
sequence that is easily recognizable and occurs only once in a genome (or chromosome). The 3' ESTs
serve as a common source of STSs because of their likelihood of being unique to a particular species
and provide the additional feature of pointing directly to an expressed gene.
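
The defining property of an STS, that it occurs only once in a genome, can be checked directly on a toy sequence. The sketch below is illustrative only: the function names and sequences are hypothetical, and it counts exact matches on both strands, whereas real STS validation is typically done experimentally as well as computationally.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    pairing = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairing[base] for base in reversed(seq.upper()))


def count_occurrences(genome: str, probe: str) -> int:
    """Count exact, possibly overlapping, matches of probe in genome."""
    count = 0
    start = genome.find(probe)
    while start != -1:
        count += 1
        start = genome.find(probe, start + 1)
    return count


def is_unique_site(genome: str, candidate: str) -> bool:
    """Return True if the candidate matches exactly once across both strands."""
    genome = genome.upper()
    candidate = candidate.upper()
    total = count_occurrences(genome, candidate)
    flipped = reverse_complement(candidate)
    if flipped != candidate:  # avoid counting a palindromic site twice
        total += count_occurrences(genome, flipped)
    return total == 1


# Hypothetical toy genome in which one candidate is unique and one is repeated.
genome = "AAGCTTGGCCATTGACCGTAAGCTTGG"
print(is_unique_site(genome, "CCATTGACC"))
print(is_unique_site(genome, "AAGCTTGG"))
```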

ESTs as Gene Discovery Resources

Because ESTs represent a copy of just the interesting part of a genome, that which
is expressed, they have proven themselves again and again as powerful tools in
the hunt for genes involved in hereditary diseases. ESTs also have a number of
practical advantages in that their sequences can be generated rapidly and
inexpensively, only one sequencing experiment is needed per cDNA
generated, and they do not have to be checked for sequencing errors because
mistakes do not prevent identification of the gene from which the EST was
derived.

ESTs are powerful tools in the hunt for known genes because they greatly reduce
the time required to locate a gene.

Using ESTs, scientists have rapidly isolated some of the genes involved in Alzheimer's
disease and colon cancer.

To find a disease gene using this approach, scientists first use observable biological clues to identify
ESTs that may correspond to disease gene candidates. Scientists then examine the DNA of disease
patients for mutations in one or more of these candidate genes to confirm gene identity. Using this
method, scientists have already isolated genes involved in Alzheimer's disease, colon cancer, and
many other diseases. It is easy to see why ESTs will pave the way to new horizons in genetic research.

ESTs and NCBI

Because of their utility, speed with which they may be generated, and the low
cost associated with this technology, many individual scientists as well as
large genome sequencing centers have been generating hundreds of
thousands of ESTs for public use. Once an EST was generated, scientists were
submitting their tags to GenBank, the NIH sequence database operated by
NCBI. With the rapid submission of so many ESTs, it became difficult to
identify a sequence that had already been deposited in the database. It was
becoming increasingly apparent to NCBI investigators that if ESTs were to
be easily accessed and useful as gene discovery tools, they needed to be
organized in a searchable database that also provided access to other genome
data. Therefore, in 1992, scientists at NCBI developed a new database
designed to serve as a collection point for ESTs. Once an EST that was
submitted to GenBank had been screened and annotated, it was then deposited in this new database,
called dbEST.

For ESTs to be easily accessed and useful as gene discovery tools, they must be
organized in a searchable database that also provides access to genome data.

dbEST: A Descriptive Catalog of ESTs

Scientists at NCBI created dbEST to organize, store, and provide access to the
great mass of public EST data that has already accumulated and that continues
to grow daily. Using dbEST, a scientist can access not only data on human
ESTs but information on ESTs from over 300 other organisms as well.
Whenever possible, NCBI scientists annotate the EST record with any known
information. For example, if an EST matches a DNA sequence that codes for
a known gene with a known function, that gene's name and function are placed
on the EST record. Annotating EST records allows public scientists to use
dbEST as an avenue for gene discovery. By using a database search tool, such
as NCBI’s BLAST, any interested party can conduct sequence similarity searches against dbEST.

Scientists at NCBI annotate EST records with text information regarding DNA
and mRNA homologies.

UniGene: A Non-Redundant Set of Gene-oriented Clusters

Because a gene can be expressed as mRNA many, many times, ESTs ultimately derived from this
mRNA may be redundant. That is, there may be many identical, or similar, copies of the same EST.
Such redundancy and overlap means that when someone searches dbEST for a particular EST, they
may retrieve a long list of tags, many of which may represent the same gene. Searching through all
of these identical ESTs can be very time consuming. To resolve the redundancy and overlap problem,
NCBI investigators developed the UniGene database. UniGene automatically partitions GenBank
sequences into a non-redundant set of gene-oriented clusters.
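
To make the idea of gene-oriented clusters concrete, the sketch below groups toy ESTs that share at least one short subsequence (k-mer). This is not UniGene's actual procedure, which is not described here. The function names, the choice of k, and the sequences are all assumptions made for illustration, and the sketch ignores reverse-complement matches and sequencing errors.

```python
def kmers(seq: str, k: int) -> set[str]:
    """Return the set of all length-k subsequences of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def cluster_ests(ests: dict[str, str], k: int = 8) -> list[list[str]]:
    """Group ESTs that share at least one k-mer, directly or through a chain.

    Uses a small union-find structure; returns sorted lists of EST names.
    """
    parent = {name: name for name in ests}

    def find(name: str) -> str:
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path halving
            name = parent[name]
        return name

    first_seen: dict[str, str] = {}
    for name, seq in ests.items():
        for kmer in kmers(seq.upper(), k):
            if kmer in first_seen:
                parent[find(name)] = find(first_seen[kmer])
            else:
                first_seen[kmer] = name

    clusters: dict[str, list[str]] = {}
    for name in ests:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical toy ESTs: est1 and est2 overlap, est3 is unrelated.
ests = {
    "est1": "ATGGCTAAATGGTAACCG",
    "est2": "AAATGGTAACCGTTAGGC",
    "est3": "TTTTCCCCGGGGAAAATT",
}
print(cluster_ests(ests))
```

A search that returns one representative per cluster, rather than every member, is the kind of saving that a non-redundant set is meant to provide.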

Although it is widely recognized that the generation of ESTs constitutes an efficient strategy to
identify genes, it is important to acknowledge that despite its advantages, there are several limitations
associated with the EST approach. One is that it is very difficult to isolate mRNA from some tissues
and cell types. This results in a paucity of data on certain genes that may only be found in these tissues
or cell types.

Second is that important gene regulatory sequences may be found within an intron. Because ESTs are
small segments of cDNA, generated from an mRNA in which the introns have been removed, much
valuable information may be lost by focusing only on cDNA sequencing. Despite these limitations,
ESTs continue to be invaluable in characterizing the human genome, as well as the genomes of other
organisms. They have enabled the mapping of many genes to chromosomal sites and have also
assisted in the discovery of many new genes.
