Lecture 1.1.4 Gene

The document outlines the course objectives and outcomes for an Advanced Molecular Genetics class, emphasizing the understanding of genetics principles and cellular mechanisms. It discusses the evolution of the gene concept, gene duplication, and the existence of introns, along with examples of gene families and pseudogenes. Additionally, it provides insights into the human genome, including gene numbers and sizes, highlighting the complexity of genetic structures.

Uploaded by

aditya sahu
Copyright
© © All Rights Reserved
UNIVERSITY INSTITUTE OF BIOTECHNOLOGY

Master of Sciences (Biotechnology)


Course Name: Advanced Molecular Genetics
Course Code: 22BTT-602
Master Course Coordinator
Dr. Manisha Phour
Assistant Professor

Lecture 1.1.4 & 1.1.5: DISCOVER . LEARN . EMPOWER


COURSE OBJECTIVE

1. Students will learn Molecular Genetics at an advanced level.

2. Students will gain a thorough knowledge of the molecular mechanisms that give rise to complex cellular processes, such as those involving DNA, RNA and proteins.

COURSE OUTCOME
CO1. Students will be able to understand the principles of genetics

CO2. Students will be able to gain in-depth knowledge of how the cellular machinery works

CO3. Students will be able to contrast important genetic features that distinguish
prokaryotes from eukaryotes

CO4. Students will be able to apply molecular knowledge to understand the nuclear processes of the cell

Gene
•The concept of the gene (the word "gene" was first used by Wilhelm Johannsen in 1909) has evolved and become more complex since it was first proposed.

•The idea that genes are responsible for the synthesis of proteins was first proposed in 1902 by a British physician, Sir Archibald Garrod, who realized that alkaptonuria was an inherited metabolic condition in humans and hypothesized that it was due to the absence of an enzyme (a catalytic protein) required for the breakdown of homogentisic acid.

•Later, George W. Beadle and Edward L. Tatum, who studied Neurospora metabolism, demonstrated that most genes correspond to a region of the genome that directs the synthesis of a single enzyme. This led to the one gene-one enzyme hypothesis.
Numbers and size of genes
•The total gene number is known for several organisms. The number of genes in bacterial genomes is roughly proportional to genome size.

•M. genitalium, a bacterium with one of the smallest known genomes, has ~470 genes. The number of genes in a eukaryote varies from about 6000 to 40,000, but does not correlate with the genome size or the complexity of the organism.

•Because some genes are present in more than one copy or are related to one another, the number of different types of genes is less than the total number of genes.
Table: Genome size and number of protein-coding genes (approximate values)

Species            Genes      Genome size
M. genitalium      ~470       0.58 Mb
H. influenzae      1727       1.8 Mb
E. coli            4288       4.6 Mb
S. cerevisiae      6275       12 Mb
D. melanogaster    ~12000     123 Mb
C. elegans         ~20000     100 Mb
Human              ~30000     3300 Mb
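
The contrast between bacteria and eukaryotes in the table can be made concrete by computing gene density (genes per megabase). The following is a minimal sketch that uses only the table's approximate values; the names `GENOMES` and `gene_density` are illustrative and not part of the lecture.

```python
# Values copied from the table above: (species, approx. gene count, genome size in Mb)
GENOMES = [
    ("M. genitalium", 470, 0.58),
    ("H. influenzae", 1727, 1.8),
    ("E. coli", 4288, 4.6),
    ("S. cerevisiae", 6275, 12),
    ("D. melanogaster", 12000, 123),
    ("C. elegans", 20000, 100),
    ("Human", 30000, 3300),
]


def gene_density(genes, size_mb):
    """Return the number of genes per megabase of genome."""
    return genes / size_mb


densities = {name: gene_density(genes, size) for name, genes, size in GENOMES}

for name, density in densities.items():
    print(f"{name:16s} {density:8.1f} genes/Mb")
```

With these figures the three bacteria fall in a narrow band of several hundred genes per Mb, whereas the eukaryotic values spread over almost two orders of magnitude, which is the sense in which eukaryotic gene number does not track genome size.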

Introns

•An intron is any nucleotide sequence within a gene that is represented in the primary transcript of the gene, but is not present in the final processed form.
•The term intron refers to both the DNA sequence within a gene and the corresponding sequence in primary transcripts. Introns were first discovered in 1977, independently by Phillip Sharp and Richard Roberts.
•Introns are present in most genes of higher eukaryotes, although they are not universal.
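
The definition above (present in the primary transcript, absent from the processed form) can be illustrated with a toy splicing function. This is only a sketch: the sequence and exon coordinates are invented for illustration, and real splicing depends on splice-site recognition that is not modelled here.

```python
def splice(primary_transcript, exons):
    """Join exon segments, given as (start, end) half-open index pairs."""
    return "".join(primary_transcript[start:end] for start, end in exons)


# Hypothetical toy transcript: uppercase = exon, lowercase = intron.
# Both toy introns start with "gu" and end with "ag" (the GU-AG pattern).
pre_mrna = "AUGGCC" + "guaaguuuuag" + "UUUGGA" + "guccccag" + "UAA"
exon_coords = [(0, 6), (17, 23), (31, 34)]

mature = splice(pre_mrna, exon_coords)
print(mature)
```

The introns are part of `pre_mrna` (the primary transcript) but contribute nothing to `mature`, which contains only the joined exons.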
Introns

•Large genes consist of a long string of alternating exons and introns, with most of the gene's length made up of introns.

•The overall length of a gene is determined largely by its introns. Introns range in size from about 50 nucleotides to >100,000 nucleotides.

•Exons are usually short, typically on the order of 150 nucleotides.

Table: Gene size, number of exons and average intron size for selected human genes

Gene product             Size of gene (kb)   Number of exons   Average size of intron (bp)
Insulin                  1.4                 3                 480
β-Globin                 1.6                 3                 490
Serum albumin            18                  14                1100
CFTR (cystic fibrosis)   250                 27                9100
Titin                    283                 363               466
Dystrophin               2400                79                30770
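
The table's figures allow a rough estimate of how much of each gene is intron. The sketch below assumes that a gene with n exons has n - 1 introns and uses the rounded table values, so the results are approximate (for dystrophin the estimate comes out marginally above 100% because of rounding). The names `GENES` and `intron_fraction` are illustrative.

```python
# (gene product, gene size in kb, number of exons, average intron size in bp)
GENES = [
    ("Insulin", 1.4, 3, 480),
    ("Beta-globin", 1.6, 3, 490),
    ("Serum albumin", 18, 14, 1100),
    ("CFTR", 250, 27, 9100),
    ("Titin", 283, 363, 466),
    ("Dystrophin", 2400, 79, 30770),
]


def intron_fraction(gene_kb, n_exons, avg_intron_bp):
    """Estimate the fraction of gene length made up of introns.

    Assumes n_exons - 1 introns, each of the average size.
    """
    n_introns = n_exons - 1
    return n_introns * avg_intron_bp / (gene_kb * 1000)


fractions = {name: intron_fraction(kb, exons, intron) for name, kb, exons, intron in GENES}

for name, fraction in fractions.items():
    print(f"{name:14s} ~{fraction:.0%} intron")
```

Even the small insulin and β-globin genes come out as more than half intron under this estimate, and the large CFTR and dystrophin genes are almost entirely intron.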

Table: The most common types of introns

Intron type         Where found
GU-AG introns       Eukaryotic nuclear pre-mRNA
AU-AC introns       Eukaryotic nuclear pre-mRNA
Group I             Eukaryotic nuclear pre-rRNA, organelle RNAs, some prokaryotic RNAs
Group II            Organelle RNAs, some prokaryotic RNAs
Pre-tRNA introns    Eukaryotic nuclear pre-tRNA
Archaeal introns    Various RNAs

Evidence for the existence of introns in eukaryotic genes
•Early evidence for the existence of introns in eukaryotic genes was provided by the R-loop technique, in which a base-paired complex between mRNA and DNA molecules is visualized in the electron microscope.
•When a purified ovalbumin mRNA preparation is annealed, in a suitable solvent, to a dsDNA molecule containing the gene that encodes the mRNA, the RNA can displace a DNA strand wherever the two sequences match and form regions of RNA-DNA helix.
•Regions of DNA where no match to the mRNA sequence is possible are clearly visible as large loops of dsDNA. Each of these loops represents an intron in the gene sequence.
Figure: The R-loop technique, in which a base-paired complex between mRNA and DNA molecules is visualized in the electron microscope. When this single-stranded mRNA preparation is annealed in a suitable solvent to a cloned double-stranded DNA molecule containing the gene that encodes the mRNA, the RNA can displace a DNA strand wherever the two sequences match and form regions of RNA-DNA helix. Regions of DNA where no match to the mRNA sequence is possible are clearly visible as large loops of double-stranded DNA. Each of these loops represents an intron in the gene sequence.
Acquisition of new genes
•The acquisition of new genes is a primary driving force of evolution in all organisms. There are different ways in which a genome can acquire new genes:

By duplication and divergence of existing genes
•Duplications of a single gene or a group of genes in the genome may result from several mechanisms:
• Unequal crossing-over between non-sister chromatids of homologous chromosomes
• Unequal sister chromatid exchange
• Replication slippage
•Duplications involve not only protein-coding genes but also noncoding RNA genes. For example, a novel class of retroduplicates includes snoRNAs, a class of RNA genes involved in ribosomal RNA processing.
By acquiring genes from other species
•New genes that provide the basis for new functions do not arise only from the duplication and divergence of existing genes. In the course of evolution, extra genes have been imported into genomes from outside sources by mechanisms other than vertical gene transfer.

•The term lateral gene transfer refers to cases in which a gene does not have a vertical origin (i.e., direct inheritance from parent to offspring) but instead comes from an unrelated genome.

•This sort of transfer occurs between bacteria, and has also taken place between the genomes of the cellular organelles (mitochondria and chloroplasts) and the nuclear genome.
Gene fusion and fission
•Existing genes can fuse (i.e., two or more genes can become part of the same transcript) or undergo fission (i.e., a single transcript can break into two or more separate transcripts), thereby forming new genes.

De novo gene origination
•New genes can originate de novo from noncoding regions of DNA. Indeed, several novel genes derived from noncoding DNA have been described in Drosophila. These recently originated Drosophila genes, which are likely to be protein-coding, have no homologues in any other species.

Fate of duplicated genes
Duplicated genes may have two fates:

1. The initial result of gene duplication is two identical genes. Selective pressure ensures that one of these genes retains its original nucleotide sequence, or something very similar to it, so that it can continue to provide the protein function that was originally supplied by the single gene copy before the duplication took place. The second copy is probably not subject to the same selective pressures and so can accumulate mutations at random. Evidence shows that the majority of new genes that arise by duplication acquire deleterious mutations that inactivate them, so that they become pseudogenes. Inactivation of the duplicated gene functionally restores the one-gene state that preceded the duplication.
Fate of duplicated genes
Duplicated genes may have two fates:

2. Occasionally, the mutations that accumulate within a gene copy do not lead to inactivation of the gene, but instead result in a new gene function that is useful to the organism. Repeated rounds of this process of duplication and divergence, over evolutionary time, have enabled one gene to give rise to a whole family of genes with related functions within a single genome. The globin gene family provides a particularly good example of how DNA duplication generates new proteins. All globin genes (myoglobin, leghemoglobin, and the α- and β-globins) are descended by duplication and divergence from an ancestral gene.
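
The duplication-and-divergence process can be sketched as a toy simulation: one copy is held constant (standing in for the copy kept intact by selection) while the other accumulates random point substitutions. Everything here is an illustrative assumption, including the sequence length, the number of mutations and the helper names; no real selection or mutation rates are modelled.

```python
import random

BASES = "ACGT"


def mutate(seq, n_mutations, rng):
    """Return a copy of seq after n_mutations random point substitutions."""
    bases = list(seq)
    for _ in range(n_mutations):
        pos = rng.randrange(len(bases))
        # Always substitute a base that differs from the current one.
        bases[pos] = rng.choice([b for b in BASES if b != bases[pos]])
    return "".join(bases)


def identity(a, b):
    """Fraction of positions at which two equal-length sequences match."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


rng = random.Random(0)  # fixed seed so the run is repeatable
ancestor = "".join(rng.choice(BASES) for _ in range(300))

copy_a = ancestor                   # conserved copy (stand-in for selection)
copy_b = mutate(ancestor, 30, rng)  # unconstrained copy accumulates changes

print(f"copy_a vs ancestor: {identity(copy_a, ancestor):.1%}")
print(f"copy_b vs ancestor: {identity(copy_b, ancestor):.1%}")
```

Whether the diverged copy ends up as a pseudogene or as a gene with a new function depends on which mutations occur, which this sketch does not attempt to evaluate.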

Figure: Gene duplication and divergence

Gene families

Most protein-coding genes are represented only once in the haploid genome and are thus termed solitary genes. Duplicated genes constitute gene families. A gene family can be defined as a group of genes of identical or similar sequence that code for identical or related proteins. The members of a family may remain clustered or be dispersed around the genome. A gene family may be simple or complex:

Simple multigene families
•Families in which all the members have identical or nearly identical sequences are simple (or classical) multigene families. The rRNA genes are examples of simple multigene families.
•Within a tandem array of rRNA genes, each copy is exactly, or almost exactly, like all the others. Each cell contains multiple copies of the rRNA genes that code for ribosomal RNAs.
•Even the E. coli genome contains seven copies of its rRNA genes. Human cells contain about 200 rRNA gene copies per haploid genome, arranged in clusters on five chromosomes (13, 14, 15, 21 and 22). The multiple copies of conserved rRNA genes on a given chromosome are tandemly repeated. The repeated rRNA genes are needed to meet the great cellular demand for their transcripts.
Complex multigene families
•The individual members of a complex multigene family, although similar in sequence, are sufficiently different for the gene products to have distinctive properties.

•One of the best examples of this type of multigene family is the mammalian globin genes. The globins are the blood proteins that combine to make hemoglobin, each molecule of hemoglobin being made up of two α-type and two β-type globins.

•In humans, the α-type globins are coded by a small multigene family on chromosome 16 and the β-type globins by a second family on chromosome 11.
Complex multigene families
•The α (alpha)-globin subfamily in humans contains three genes: the zeta (ζ) gene (expressed only in the early embryonic stage) and two copies of the alpha (α) gene (expressed during the fetal and adult stages).
•In humans, the β (beta)-globin gene cluster is longer than the α-globin gene cluster and contains five genes.
•Of the five genes, three are expressed prior to birth. The epsilon (ε) gene is expressed only during embryogenesis, while the two nearly identical gamma (γ) genes are expressed only during fetal development.
•The remaining genes, delta (δ) and beta (β), are expressed following birth.
Complex multigene families
•There are several duplicated globin DNA sequences in the α- and β-globin gene clusters that are not functional genes. They are examples of pseudogenes.
•These have close homology to the functional genes, but have been disabled by mutations that prevent their expression. Two non-functional pseudogenes are present in the α-cluster and one non-functional pseudogene (ψ) is present in the β-cluster.

Figure: Organization of the alpha and beta globin gene subfamilies

Homologous gene
•Homologous genes are ones that share a common evolutionary ancestor. A pair of homologous genes shows sequence similarity, but the two genes usually do not have identical nucleotide sequences.
•This divergence occurs because the two genes undergo different random changes by mutation.
•Homologous genes fall into two categories: paralogous and orthologous. Homologous genes present in the same organism, which arose by gene duplication, are said to be paralogous, whereas homologous genes in different organisms that arose through species divergence are orthologous.
•For example, the myoglobin and β-globin genes of humans are paralogs: they originated by duplication of an ancestral gene. In contrast, the myoglobin genes of humans and chimpanzees are orthologs.
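
The paralog/ortholog distinction can be written as a small decision rule. This is a simplified sketch that assumes the two inputs are already known to be homologous and that a shared gene name across species reflects descent through speciation; the `(species, gene)` tuples and the function name are illustrative only.

```python
def classify_homologs(a, b):
    """Classify two homologous genes given as (species, gene_name) tuples."""
    same_species = a[0] == b[0]
    same_gene = a[1] == b[1]
    if same_species and same_gene:
        return "same gene"
    if same_species:
        return "paralogs"   # related by duplication within one genome
    if same_gene:
        return "orthologs"  # related by species divergence
    # Different gene in a different species: not one of the two slide cases.
    return "homologs (duplication and speciation both involved)"


human_mb = ("human", "myoglobin")
human_hbb = ("human", "beta-globin")
chimp_mb = ("chimpanzee", "myoglobin")

print(classify_homologs(human_mb, human_hbb))  # the slide's paralog example
print(classify_homologs(human_mb, chimp_mb))   # the slide's ortholog example
```

In practice the relationship is inferred from gene trees rather than from names, so this rule only restates the definitions on the slide.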
Figure: The types of homologs: (A) orthologs and (B) paralogs
Pseudogenes
Pseudogenes are functionless gene variants that remain in the genome as relics of past evolutionary events, and they indicate the changing nature of the genome. The abundance of pseudogenes in a given genome usually depends on the rates of gene duplication and loss.

There are two main types of pseudogene:

1. Conventional (non-processed) pseudogene
A conventional pseudogene is a gene that has become inactive (non-functional) because of the accumulation of mutations. The globin pseudogenes are among the best-known examples of conventional pseudogenes. The human globin gene clusters contain five pseudogenes.
Pseudogenes
2. Processed pseudogene
A processed pseudogene results from the integration into the genome of a reverse-transcribed copy of an mRNA, and is thought to originate through retrotransposition. Processed pseudogenes lack introns and a promoter region, but often contain a polyadenylation signal and are flanked by direct repeats. Errors in reverse transcription and the lack of appropriate regulatory elements often lead to the degeneration of processed copies of genes. Mammals appear to have a high number of processed pseudogenes.

Human nuclear genome
•The human haploid genome content is ~3.2 × 10^9 bp. The euchromatin comprises the majority of the genome, ~2.9 × 10^9 bp.

•The human genome has about 30,000 genes. The total human gene number is much lower than had been expected (most previous estimates had been ~100,000).

•The average human gene is 27 kb long. Genes occupy ~25% of the human genome, but protein-coding sequences make up only a small part of the genome (~1.5%).
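
The ~25% figure follows from the other numbers on this slide, as the short calculation below shows. It is a back-of-the-envelope sketch that assumes genes do not overlap and takes the slide's rounded values at face value; the variable names are illustrative.

```python
GENOME_BP = 3.2e9        # haploid genome size stated above (~3.2 x 10^9 bp)
GENE_COUNT = 30_000      # approximate gene number stated above
AVG_GENE_BP = 27_000     # average gene length stated above (27 kb)
CODING_FRACTION = 0.015  # ~1.5% of the genome is protein-coding sequence

# Total length covered by genes, assuming no overlap between genes.
genic_bp = GENE_COUNT * AVG_GENE_BP
genic_fraction = genic_bp / GENOME_BP

# Protein-coding sequence in total and, on average, per gene.
coding_bp = CODING_FRACTION * GENOME_BP
coding_per_gene = coding_bp / GENE_COUNT

print(f"Genes cover about {genic_fraction:.0%} of the genome")
print(f"Average coding sequence per gene: about {coding_per_gene:.0f} bp")
```

On these figures an average 27 kb gene carries only about 1.6 kb of protein-coding sequence, consistent with introns making up most of a gene's length.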

Human nuclear genome
Salient features of the human genome
1. The human genome contains 3164.7 million nucleotide bases.
2. The average gene consists of 3000 base pairs, but sizes vary greatly (the gene that codes for the protein dystrophin consists of 2.4 million base pairs).
3. The total number of genes is estimated to be about 30,000.
4. Chromosome sizes range from about 55 Mb to 250 Mb.
5. Chromosome 1 has the maximum number of genes (2968), and the Y chromosome has the fewest (231).
6. About 98.5% of the genome is non-coding (only about 1.5% is protein-coding sequence).
7. Repetitive sequences make up a very large portion of the human genome (highly repetitive sequences comprise less than 10% and transposable elements about 50%).
ASSESSMENT MODEL

Continuous Internal Assessment (CAE): 40 Marks
• Assignment (10 Marks)
• Quiz (4 Marks)
• Mid-Semester Test (20 Marks)
• Quiz (4 Marks)
• Attendance (2 Marks)

Semester End Examination (SEE): 60 Marks
