Central Dogma Transcription Raza PDF


Transcription

BASIC INFORMATION

 Chromosome: A structure found inside the nucleus of a cell. A chromosome is made up of proteins and DNA
organized into genes. Each human somatic cell normally contains 23 pairs of chromosomes.

 Codon: A sequence of three consecutive nucleotides in a DNA or RNA molecule that codes for a specific amino
acid. Certain codons signal the start or end of translation. These are called start or stop (or termination)
codons.

The codons are of two types:


A. Sense Codons: Codons that code for amino acids are called sense codons. There are 61 sense
codons in the genetic code, which code for 20 amino acids.
B. Signal Codons: Codons that code for signals during protein synthesis are known as signal codons.
There are four signal codons: AUG, UAA, UAG and UGA. AUG is counted in both groups because it is also
a sense codon (methionine).

Signal codons are of two types:


a. Start Codons: The codon which starts the translation process is known as start codon. It is also known
as initiation codon because it initiates the synthesis of polypeptide chain. Example of this codon is AUG.
This codon also codes for the amino acid methionine. In eukaryotes, the starting amino acid is
methionine, while in prokaryotes it is N-formyl methionine.
b. Stop Codons: Codons that provide the signal for termination of the polypeptide chain are known as stop
codons. They are also known as termination codons because they signal the termination and release of
the polypeptide chain. The stop codons are UAA, UAG and UGA. Since stop codons do not code for any
amino acid, they are also called nonsense codons.

 Anticodon: An anticodon is a trinucleotide sequence located at one end of a transfer RNA (tRNA) molecule,
which is complementary to a corresponding codon in a messenger RNA (mRNA) sequence.

 Gene expression: Gene expression is the process by which the information encoded in a gene is used to
either make RNA molecules that code for proteins or to make non-coding RNA molecules that serve other
functions.

 Genetic code: The genetic code is the set of rules by which information encoded in genetic material (DNA
or RNA sequences) is translated into proteins (amino acid sequences) by living cells.

Mehadi Hasan Evan & Raisul Islam Raza 10th Batch Dept. of Pharmacy, PUST.
Characteristics of the Genetic Code
1. The genetic code is universal: Nearly all known living organisms, prokaryotes and eukaryotes alike, use
the same genetic code.
2. The genetic code is unambiguous: Each codon codes for just one amino acid.
3. The genetic code is specific: There are specific tRNAs for the 20 amino acids, and each tRNA recognizes
specific codons.
4. The code is degenerate: Degeneracy means that more than one codon can code for a single amino acid.
Of the 61 sense codons, AUG (methionine) and UGG (tryptophan) each code for an amino acid that has no
other codon. The remaining 18 amino acids are coded by the other 59 codons. E.g., arginine has 6 different codons.
5. The code is a triplet: The coding units (codons) for amino acids are three-letter words, giving 4^3 = 64
codons. 64 codons are more than adequate to specify the 20 proteinogenic amino acids.
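
The counts quoted in points 4 and 5 can be checked with a short sketch. The compact amino-acid string below is the standard code written in UCAG order with one-letter amino-acid symbols ('*' for stop); the variable and function names are illustrative and not part of the notes.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map every triplet (UUU, UUC, UUA, ...) to its amino acid.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def degeneracy(table):
    """Return how many sense codons encode each amino acid."""
    return Counter(aa for aa in table.values() if aa != "*")


counts = degeneracy(CODON_TABLE)
print(len(CODON_TABLE))                        # 64 codons in total
print(sum(counts.values()))                    # 61 sense codons
print(counts["R"], counts["M"], counts["W"])   # 6 1 1 (Arg, Met, Trp)
```

Removing methionine and tryptophan from the tally leaves 59 codons shared among the other 18 amino acids, matching the figures above.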

 Q. Why is the genetic code universal?


The same codons are assigned to the same amino acids and to the same START and STOP signals in the vast
majority of genes in animals, plants, and microorganisms. All known living systems use nucleic acids and the
same three-base codons to direct the synthesis of proteins from amino acids. The mRNA codon UUU, for
example, codes for phenylalanine in all cells of all organisms. Hence, the genetic code is universal.

 Exceptions in Genetic Code


The genetic code is called universal because the same codons are assigned to the same amino acids and to the same
START and STOP signals in the majority of genes in animals, plants and microorganisms.
However, some differences have been discovered between the universal genetic code and two mitochondrial
genetic codes, as shown in the following table:

Codon | Mammalian mitochondrial code | Yeast mitochondrial code | Universal code
AUA | Methionine | Methionine | Isoleucine
AGA | Stop | Arginine | Arginine
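
A minimal sketch of how such exceptions change a translation is given below. It models only the two codons listed in the table; real mitochondrial codes differ from the universal code at further positions that are not modeled here. The function name `translate` and the override dictionaries are hypothetical names chosen for illustration.

```python
from itertools import product

BASES = "UCAG"
# Universal (standard) code in UCAG order; '*' marks a stop codon.
STANDARD = dict(zip(
    ("".join(c) for c in product(BASES, repeat=3)),
    "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
))

# Assumption: only the differences shown in the table are applied.
MAMMALIAN_MITO_OVERRIDES = {"AUA": "M", "AGA": "*"}
YEAST_MITO_OVERRIDES = {"AUA": "M"}


def translate(mrna, overrides=None):
    """Translate an mRNA from its first base, stopping at the first stop codon."""
    table = {**STANDARD, **(overrides or {})}
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = table[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


message = "AUGAUAAGAUUU"
print(translate(message))                            # MIRF
print(translate(message, MAMMALIAN_MITO_OVERRIDES))  # MM
print(translate(message, YEAST_MITO_OVERRIDES))      # MMRF
```

The same message yields three different peptides, because AUA and AGA are read differently under each code.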

 Genetic Code is Redundant but Not Ambiguous.


Redundancy- In most cases several codons code for the same amino acid. Of the 61 sense codons, AUG and
UGG each code for an amino acid that has no other codon, while the remaining 18 amino acids are coded by
the other 59 codons. E.g., arginine has 6 different codons. This multiple system of coding is known as a
redundant code system.

Such a system protects the organism against many harmful mutations: if one base of a codon is mutated, the
new codon may still code for the same amino acid, and there will be no alteration in the polypeptide chain.

Not Ambiguous- The genetic code has 64 codons. Of these, 61 codons code for 20 different amino acids.
However, no codon codes for more than one amino acid. In other words, each codon codes for only one
amino acid. This clearly indicates that the genetic code is non-ambiguous. In an ambiguous code, one codon
would code for more than one amino acid. E.g., CCU, CCA, CCG and CCC all code for proline, but none of
these codons codes for any of the other 19 amino acids.

 Q. Why Is the Genetic Code a Triplet, Not a Singlet or a Doublet?


Ans: The genetic code is a triplet because singlet and doublet codons are not adequate to code for 20 amino
acids.
✓ Case 1- If the genetic code were a singlet code, it could code for only 4 amino acids, because there are only
4 bases (4^1 = 4). That cannot cover 20 amino acids.
✓ Case 2- If the genetic code were a doublet code, an amino acid would be coded by 2 nitrogenous bases on
mRNA in a specific sequence, giving 16 codons (4^2 = 16). That is still not sufficient to code for 20 amino
acids, so the genetic code cannot be of 2 letters.
✓ Case 3- If the genetic code is a triplet code, an amino acid is coded by 3 nitrogenous bases, giving 64
codons (4^3 = 64). There are only 20 amino acids, so codons are in excess, and some amino acids can be
coded by more than one triplet.
✓ Case 4- If the genetic code were a 4-letter code, an amino acid would be coded by 4 nitrogenous bases,
giving 256 codons (4^4 = 256). That is far more than the 20 amino acids require, so the code need not be
4 letters long.

So, the best possibility is for the codon to have 3 nitrogenous bases. George Gamow postulated in 1954 that
each codon is a triplet encoding one of the 20 amino acids used in the proteins of living cells.
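
The four cases reduce to comparing 4^n with 20. The following sketch tabulates them and finds the shortest codon length that is sufficient; the function names are illustrative only.

```python
BASES = 4          # A, U, G, C
AMINO_ACIDS = 20   # amino acids that must be specified


def codon_count(length):
    """Number of distinct codons of the given length: 4 ** length."""
    return BASES ** length


def shortest_sufficient_length(needed=AMINO_ACIDS):
    """Smallest codon length whose codon count covers all amino acids."""
    length = 1
    while codon_count(length) < needed:
        length += 1
    return length


for n in range(1, 5):
    enough = codon_count(n) >= AMINO_ACIDS
    print(f"{n}-letter code: {codon_count(n)} codons, sufficient: {enough}")

print(shortest_sufficient_length())  # 3
```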

 Open reading frame: A portion of mRNA between a start codon and a termination codon that can
potentially be translated into a protein.
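
As an illustration of this definition, the sketch below scans an mRNA in all three reading frames and reports each stretch that runs from an AUG to the next in-frame stop codon. It is a simplified model (no minimum length, forward strand only), and `find_orfs` is a hypothetical helper name.

```python
START_CODON = "AUG"
STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_orfs(mrna):
    """Return (start, end, sequence) for each AUG...stop stretch in the three frames.

    Indices are 0-based; `end` is exclusive and includes the stop codon.
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(mrna) - 2, 3):
            codon = mrna[i:i + 3]
            if start is None and codon == START_CODON:
                start = i                      # open a reading frame
            elif start is not None and codon in STOP_CODONS:
                orfs.append((start, i + 3, mrna[start:i + 3]))
                start = None                   # close it at the stop codon
    return orfs


print(find_orfs("GGAUGAAACCCUAAGG"))  # [(2, 14, 'AUGAAACCCUAA')]
```

A start codon with no in-frame stop codon after it is not reported, since the definition requires both ends.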

WOBBLE HYPOTHESIS
Definition: The wobble hypothesis proposes that normal (Watson-Crick) base pairing occurs between the
nitrogenous bases in positions 1 and 2 of the codon and the corresponding bases (3 and 2) of the anticodon,
while base 1 of the anticodon can form non-Watson-Crick base pairs with the third position of the codon.
The hypothesis is applicable to most (not all) tRNAs.

Description: The hypothesis was proposed by Francis Crick in 1966 to explain the observed degeneracy in the
third position of a codon. Except for tryptophan and methionine, more than one codon directs the synthesis of
each amino acid. There are 61 codons that specify amino acids, so one might expect 61 tRNAs, each with a
different anticodon. But the total number of tRNAs is less than 61. This is explained by the anticodons of some
tRNAs reading more than one codon.

In addition, the identity of the third base of the codon often seems to be unimportant. For example, CGU, CGC,
CGA and CGG all code for arginine. It appears that CG specifies arginine and the third letter is not important.
Conventionally, codons are written from the 5′ end to the 3′ end.
Therefore, the first and second bases specify the amino acid in some cases. According to the wobble hypothesis,
only the first and second bases of the triplet codon on the 5′ → 3′ mRNA pair strictly with the bases of the
anticodon of the tRNA, i.e., A with U, or G with C.

The pairing of the third base varies according to the base at this position; for example, G may pair with U. The
conventional pairing (A = U, G = C) is known as Watson-Crick pairing, and the second, non-standard pairing is
called wobble pairing.

Usually, one anticodon recognizes and reads a single codon, but sometimes it may recognize more than one codon.
This phenomenon is called wobbling. It generally occurs between the 3′ nucleotide of the codon and the 5′
nucleotide of the anticodon. For example, according to wobble base pairing, the anticodon GAG can recognize two
codons, CUC and CUU, and both stand for leucine. Wobbling is associated with the degeneracy at the 3rd base of
the codon. This 3rd base is called the wobble base.
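
The GAG example can be reproduced with a small sketch. The wobble table used here (anticodon G reads C or U, U reads A or G, I reads U, C or A, C reads G, A reads U) follows the pairing rules commonly attributed to Crick; the notes above state only the G with U case explicitly, so the full table is an added assumption, as are the names used in the code.

```python
# Strict Watson-Crick pairs used at codon positions 1 and 2.
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Assumed wobble rules: 5' anticodon base -> codon 3' bases it can read.
WOBBLE = {"C": "G", "A": "U", "U": "AG", "G": "CU", "I": "UCA"}


def codons_read_by(anticodon):
    """List the codons (5'->3') readable by an anticodon written 5'->3'."""
    wobble_base, middle, last = anticodon
    first = WATSON_CRICK[last]      # codon position 1 pairs with anticodon position 3
    second = WATSON_CRICK[middle]   # codon position 2 pairs with anticodon position 2
    return sorted(first + second + third for third in WOBBLE[wobble_base])


print(codons_read_by("GAG"))  # ['CUC', 'CUU']
print(codons_read_by("CAU"))  # ['AUG']
print(codons_read_by("IGC"))  # ['GCA', 'GCC', 'GCU']
```

Because codon and anticodon pair antiparallel, the 5′ base of the anticodon is the one that meets the 3′ (wobble) base of the codon.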

Relationship of Wobble Hypothesis


✓ Base pairing between the bases in positions 1 and 2 of the codon and the corresponding bases (3 and 2)
of the anticodon is stronger than that between the first base of the anticodon and the third base of the codon.
✓ The first base of the anticodon determines how many codons the tRNA can read.
✓ At least 32 tRNAs are required to read the 61 codons. Among the 32 tRNAs, 1 tRNA reads the start codon and
the other 31 tRNAs read the remaining 60 codons.
✓ The first and second bases specify the amino acid. If one of the first two bases is changed, the resulting
codon will usually specify another amino acid.

Advantages of Wobble Hypothesis


✓ Cells have a limited number of tRNAs, and wobble allows for broad specificity.
✓ tRNA can dissociate more readily from the mRNA template.
✓ Allows faster protein synthesis.

 Genome: The genome is the entire set of DNA instructions found in a cell. In humans, the genome consists of
23 pairs of chromosomes. A genome contains all the information needed for an individual to develop and
function.
 Genomic DNA: The DNA which is found in the organism’s genome and is passed on to offspring as
information necessary for survival is called genomic DNA. The phrase is used to distinguish it from other
types of DNA, such as that found within plasmids.

 Complementary DNA: Complementary DNA (cDNA) is synthetic DNA that is reverse transcribed from
mRNA through the action of the enzyme reverse transcriptase.

 Gene: Gene is the basic unit of heredity passed from parent to child. Genes are made up of sequences of DNA.
Some genes act as instructions to make molecules called proteins. However, many genes do not code for
proteins.

There are 2 types of gene:


A. Structural or Transcribed Gene: Structural genes are segments of DNA that act as templates for the
synthesis of mRNA, tRNA and rRNA. Their length is about 1000 bp in prokaryotes and about 10,000 bp
in eukaryotes.
B. Regulatory or Controlling Genes: Regulatory genes are short DNA sequences, usually about 15-30 bp,
that control the expression of the structural genes. The regulatory genes include those for miRNA and
siRNA. Regulatory genes can be categorized as follows:
1. Promoter region
2. Initiator region
3. Attenuator region &
4. Terminator region

There are 2 types of regulatory gene:


a. Cis-regulatory genes (CRGs): Cis-regulatory genes are regions of non-coding DNA which regulate the
transcription of neighboring genes. CRGs are stretches of DNA, usually 100–1000 base pairs in length.
CRGs typically regulate gene transcription by binding transcription factors.
b. Trans-regulatory genes (TRGs): TRGs are DNA sequences encoding upstream regulators (i.e., trans-acting
factors), which may modify or regulate the expression of distant genes. Trans-acting factors interact with
cis-regulatory genes to regulate gene expression.

What are the Similarities between Structural and Regulatory Genes?


Both Structural and Regulatory Genes code for proteins or RNA.
Both Structural and Regulatory Genes are made up of nucleotides.
Both Structural and Regulatory Genes are important in living organisms.

What is the Difference between Structural and Regulatory Genes?

Features | Structural Genes | Regulatory Genes
Description | Structural genes are those genes that code for all the proteins in a genome except regulatory proteins. | Regulatory genes are those genes that code for proteins or factors that control the expression of structural genes.
Location | In prokaryotes, structural genes are present in a sequence called an operon. In eukaryotes, structural genes are found in the exon regions. | Regulatory genes are usually found a bit far from the structural genes, say, 500 base pairs apart, and are mostly found in the intron regions.
Genes encoded | mRNA, rRNA, and tRNA | miRNA and siRNA
Function | They encode all the proteins required for structural and functional uses. | They encode factors / proteins that control the expression of structural genes.
Structure | Structural genes are complex structures. | Regulatory genes are simpler structures.
Example | Structural genes of the lac operon such as lac A, lac Y and lac Z. | Regulatory genes such as lac I and the gene for CAP.

 Transcription factors: Transcription factors are proteins that help turn specific genes "on" or "off" by
binding to nearby DNA. Transcription factors that are activators boost a gene’s transcription, and those that
are repressors decrease it.

 What are Introns?


Introns are intervening sequences between two exons, found in eukaryotes. They do not directly code for
proteins and are removed by splicing before the mRNA is translated into protein. Introns are the non-coding
parts of the gene and are not highly conserved. It is essential to remove introns to prevent the formation
of incorrect proteins.

 What are Exons?


Exons are the coding sequences that code for the amino acid sequence of the protein. After transcription, the
exons are retained in the mature mRNA following post-transcriptional modification. They are highly conserved
sequences, i.e., they do not change frequently with time.

 Difference between Introns and Exons

Introns | Exons
Found in eukaryotes only | Found in both prokaryotes and eukaryotes
Non-coding areas of the DNA | Coding areas of the DNA
The sequence of the introns changes frequently over time; in other words, they are less conserved | Exons are highly conserved
DNA bases found in between exons | DNA bases that are translated to proteins
Introns are removed in the nucleus before the mRNA moves to the cytoplasm | Mature mRNA contains exons and moves to the cytoplasm from the nucleus
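
The relationship between introns, exons and the mature mRNA can be sketched as a string operation. The example below assumes exon boundaries are already known and given as half-open (start, end) index pairs; that convention, the toy sequence and the function names are all illustrative assumptions, not a model of the spliceosome.

```python
def splice(pre_mrna, exons):
    """Join the exon segments of a pre-mRNA in order, discarding everything else."""
    return "".join(pre_mrna[start:end] for start, end in sorted(exons))


def introns(pre_mrna, exons):
    """Return the intervening sequences that lie between consecutive exons."""
    ordered = sorted(exons)
    return [
        pre_mrna[prev_end:next_start]
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
    ]


# Toy transcript: exon 1, one intron, exon 2 (made-up sequence).
PRE_MRNA = "AUGGCC" + "GUAAGUUUUCAG" + "UUUUAA"
EXONS = [(0, 6), (18, 24)]

print(splice(PRE_MRNA, EXONS))   # AUGGCCUUUUAA
print(introns(PRE_MRNA, EXONS))  # ['GUAAGUUUUCAG']
```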
REGULATORY PROTEINS
Regulatory proteins are those proteins which affect the expression of structural genes by binding to the controlling
site near the structural genes. These proteins either activate or repress transcription. E.g.
 Transcriptional factors
 Activators
 Co-factors
 Repressor
 Basal transcription factor.

 PROMOTER
The promoter is the DNA region where transcription initiation takes place. In prokaryotes, the sequence of a
promoter is recognized by the sigma (σ) factor of the RNA polymerase. In eukaryotes, it is recognized by specific
transcription factors.
Promoters are the specific initiation sites upstream from the transcription start site in the DNA strand. Promoters
are characterized by the following features:
✓ They are essential for transcription of a structural gene.
✓ They contain conserved sequences.
✓ Generally, more than one promoter element is available.
✓ Usually, a promoter element is a 6 bp region.
✓ Promoter elements are located within about 40 bp upstream of the start site.
✓ Different promoter sequences cause considerable variation in the rates at which different genes are
transcribed.
✓ RNA polymerase or regulatory proteins bind to the promoters and initiate transcription.
✓ Promoters are not themselves transcribed into RNA.

PROKARYOTIC PROMOTERS:
During transcription in prokaryotes, RNA polymerase binds directly to the promoter sites. Although different
promoters are recognized by different σ-factors, these all interact with the same RNA polymerase core enzyme.
The most common σ-factor in E. coli is σ70. The promoter lies upstream of the transcription start site, which is
assigned position +1. Accordingly, promoter sequences are assigned negative numbers reflecting their distance
upstream from the start of transcription.

Two 6 bp sequences at around positions -10 and -35 have been shown to be particularly important for promoter
function in E. coli. Besides these two, other promoter elements are also found in prokaryotes, but they are less
important.

The promoter element located at around the -10 position with respect to the transcription start site is the most
conserved sequence and consists of 6 bp. It is sometimes referred to as the Pribnow box, having been first
recognized by Pribnow in 1975. It has the consensus sequence TATAAT. The -10 sequence appears to be the
sequence at which DNA unwinding is initiated by the polymerase. This element is also known as the controlling
sequence. The sequence between -10 and +1 is critical.

The promoter element centered at around the -35 position is known as the recognition sequence. It has a
conserved hexamer sequence of TTGACA, of which the first 3 positions are the most conserved. It is separated by
16-18 bp from the -10 box. The sequence of the intervening region is not important. The -35 sequence enhances
recognition and interaction with the σ factor.
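
The layout described above (a TTGACA-like hexamer, a 16-18 bp spacer, then a TATAAT-like hexamer) can be turned into a simple scan. This is a rough sketch only: the mismatch tolerance is an arbitrary illustrative parameter, and real promoter prediction uses weighted scoring rather than exact consensus matching.

```python
MINUS_35 = "TTGACA"   # recognition sequence consensus
MINUS_10 = "TATAAT"   # Pribnow box consensus


def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_promoters(dna, max_mismatches=1, spacers=range(16, 19)):
    """Return (pos_35, pos_10, spacer) for each -35/-10 pair with a 16-18 bp gap.

    Positions are 0-based indices of the first base of each hexamer.
    max_mismatches is a hypothetical tolerance, not a value from the notes.
    """
    hits = []
    for i in range(len(dna) - 5):
        if mismatches(dna[i:i + 6], MINUS_35) > max_mismatches:
            continue
        for spacer in spacers:
            j = i + 6 + spacer
            box10 = dna[j:j + 6]
            if len(box10) == 6 and mismatches(box10, MINUS_10) <= max_mismatches:
                hits.append((i, j, spacer))
    return hits


example = "GG" + "TTGACA" + "A" * 17 + "TATAAT" + "GCGCGCA"
print(find_promoters(example))  # [(2, 25, 17)]
```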
EUKARYOTIC PROMOTERS
The most common promoter element in eukaryotic protein-coding genes is the TATA box, located at -35 to -20. Its
consensus sequence, TATAAA, is quite similar to the -10 region recognized by σ70 in prokaryotes. Another promoter
element is called the initiator (Inr). It has the consensus sequence PyPyA+1N(T/A)PyPy, where Py denotes a
pyrimidine (C or T), N = any base (A, T, C or G), and (T/A) means T or A. The base A at the third position is
located at +1 (the transcriptional start site).

TATA box and initiator are the core promoter elements. There are other elements often located within 200 bp of
the transcriptional start site, such as CAAT box and GC box which may be referred to as promoter-proximal
elements.
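
The Inr consensus translates directly into a regular expression, with Py written as [CT], N as [ACGT] and (T/A) as [TA]. The sketch below returns the position of the +1 adenine for every match; the function name and the test sequence are made up for illustration.

```python
import re

# PyPyA(+1)N(T/A)PyPy as a regex; the lookahead lets overlapping matches be found.
INR = re.compile(r"(?=([CT][CT]A[ACGT][TA][CT][CT]))")


def find_initiators(dna):
    """Return the 0-based index of the +1 adenine for each Inr match."""
    return [match.start() + 2 for match in INR.finditer(dna)]


print(find_initiators("GGGCCATTCCGGG"))  # [5]
```

A match says only that the sequence fits the consensus; whether it functions as an initiator depends on the surrounding promoter context.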

Table: Eukaryotic promoter elements.

Promoter | Position | Transcription Factor | Consensus Sequence
Initiator | +1 | TBP (TATA box binding protein) | PyPyA+1N(T/A)PyPy
TATA box | -35 to -20 | TBP | TATAAA
CAAT box | -70 to -200 | CBF (CAAT box binding protein), C/EBP (CAAT/enhancer binding protein), NF1 | CCAAT
GC box | -70 to -200 | SP1 (Specificity Protein 1) | GGGCGG

 Classification of Eukaryotic Promoters


There are 3 classes of eukaryotic promoters: I, II and III, which are used by RNA Pol I, II, and III respectively.

Promoter | Used By | Product | Consists of
Promoter-I | RNA pol-I | rRNA | GC-rich region and core promoter element
Promoter-II | RNA pol-II | mRNA | TATA box, upstream promoter and enhancers
Promoter-III | RNA pol-III | tRNA, 5S rRNA, snRNA | Resembles class-II promoters
[Figure: promoter sequence]

 Terminator:
The termination of transcription, namely the dissociation of the transcription complex and the ending of RNA
synthesis, occurs at a specific DNA sequence known as the terminator. A hairpin or stem-loop structure marks the
terminator sequence. Some terminators require accessory factors, such as rho factor or Nus-A, for termination.

 Activator proteins
Activator proteins bind to genes at sites known as enhancers. Activators help to determine which genes will be
switched on, and they speed the rate of transcription from the basal level to a high level by causing the DNA to
loop back on itself, which brings about rapid interaction between the enzyme, transcription factors and promoters.
E.g.,
✓ Cif is an activator protein that binds with the CAAT box (GCCAATCT), which is present about 75 bp upstream
from the polymerase binding site.
✓ SP-1 binds with the GC box (GGGCGG).
✓ SPi-1 binds with the PU box (GAGGAA).
✓ Oct-1 binds with the AT-CAT box (ATTTCAT).

 Co-activator:
Co-activators are adaptor molecules that integrate signals from activators, and perhaps repressors, and relay the
results to the basal transcription factors. They help the enhancer-bound activators interact with the pre-initiation
complex and bring about the overall conformation needed for DNA unwinding and strand separation.

 Repressor:
Transcriptional repressors are proteins that bind to specific sites on DNA and prevent transcription of nearby
genes. A repressor protein has one DNA-binding site and one or more repression sites, and its binding site is
usually located 100 to 200 bp upstream. Repressors bind to selected sets of genes at sites known as silencers.
They interfere with the functioning of activators and thus slow transcription.

 Response Elements
Response elements are short sequences of DNA within a gene promoter or enhancer region that act as recognition
sites for specific transcription factors. Binding of these transcription factors to the response elements helps to
initiate transcription. In prokaryotic transcription, the comparable recognition role is played by the σ factor
(most commonly σ70), which recognizes the promoter and detaches after the initiation of transcription.

RNA POLYMERASE:
RNA polymerase is the enzyme which catalyzes the transcription reaction. There are some differences between
the prokaryotic and eukaryotic polymerases, but their action is almost the same.

 Prokaryotic RNA polymerase:


In most prokaryotes, a single RNA polymerase species transcribes all types of RNA, and it is generally simpler
than the eukaryotic enzymes. The E. coli RNA polymerase consists of at least five kinds of subunit. These are:
1. α subunit (MW 40,000 Daltons)
2. β subunit (MW 150,000 Daltons)
3. β' subunit (MW 160,000 Daltons)
4. ω subunit (unknown)
5. σ subunit (MW 70,000 Daltons)
The combination of the α, β, β' and ω subunits is called the core enzyme. When the σ subunit combines with the
core enzyme, the complex is called the holoenzyme, which is necessary for correct initiation of transcription.
After initiation, the σ subunit (or factor) dissociates from the rest of the complex, leaving the core enzyme,
which continues transcription.

A. α subunit: Two identical α subunits are present in the core RNA polymerase enzyme. The subunit is required
for core enzyme assembly, but there is no clear evidence that it participates in initiating transcription. It may
take part in promoter recognition: when the σ factor binds to the core enzyme, the enzyme's affinity for promoter
sites increases and its affinity for other sites on the DNA strand decreases.
B. β subunit: One β subunit is present in the core enzyme. It is located in the catalytic center of the RNA
polymerase and catalyzes the initiation and elongation steps of transcription. The important antibiotic
rifampicin is a potent inhibitor of RNA polymerase; it binds to the β subunit and blocks the initiation of
transcription.
C. β' subunit: One β' subunit is present in the core enzyme. It binds two Zn2+ ions and takes part in the
catalytic function of the polymerase. The β' subunit binds to the template DNA and continues the elongation
process.
D. Sigma (σ) factor: The most common sigma (σ) factor in E. coli is σ70. Binding of the sigma (σ) factor converts
the core RNA polymerase to the holoenzyme. The sigma (σ) factor has a critical role in promoter recognition, but
it is not required for elongation.

Mechanism of action of sigma (σ) factor:


The sigma (σ) factor contributes to promoter recognition by decreasing the affinity of the core enzyme for
non-specific DNA sites by a factor of 10^4 and increasing its affinity for the specific promoter binding sites.

Many prokaryotes have multiple sigma (σ) factors, which are involved in the recognition of specific classes of
promoter sequences. The sigma (σ) factor is released from the RNA polymerase when the RNA transcript reaches
8-9 nucleotides in length. The core enzyme then moves along the DNA, synthesizing the growing RNA strand. The
released sigma (σ) factor then binds to another core enzyme complex and reinitiates transcription.

The amount of sigma (σ) factor present is only about 30% of the amount of core enzyme. Therefore, only about
one-third of the polymerase complexes can exist as holoenzyme at any one time.

Eukaryotic RNA Polymerase


There are three types of eukaryotic RNA polymerase, which are responsible for transcription of different types of
eukaryotic genes. The different types were identified by chromatographic purification of the enzyme, in which
they elute at different salt concentrations. The three eukaryotic RNA polymerases are
1. RNA polymerase I,
2. RNA polymerase II,
3. RNA polymerase III.
1. RNA polymerase I: RNA polymerase I is located in the nucleolus and is responsible for the synthesis of the
precursors of most rRNAs, i.e., 5.8S, 28S and 18S. In other words, RNA polymerase I is the enzyme that
transcribes the rRNA genes.
2. RNA polymerase II: RNA polymerase II is located in the nucleoplasm and is responsible for transcription of all
protein-coding genes and some small nuclear RNA genes. In particular, it transcribes the pre-mRNA.
3. RNA polymerase III: RNA polymerase III is located in the nucleoplasm and transcribes the genes for tRNA,
5S rRNA, snRNA, SRP RNA (signal recognition particle RNA), cytosolic RNA and other small RNAs.

Polymerase | Binding site | Function
RNA pol I | Varies | Mainly synthesizes rRNA.
RNA pol II | Generally, TATA box | Mainly synthesizes mRNA.
RNA pol III | Frequently TATA box | Mainly synthesizes tRNA.

BASIC PRINCIPLE OF TRANSCRIPTION:


Transcription is the enzymatic synthesis of RNA on a DNA template. This is the first stage in the overall process
of gene expression and ultimately leads to synthesis of the protein encoded by a gene.

This process is catalyzed by the enzyme RNA polymerase, which requires a dsDNA template as well as the precursor
ribonucleotides ATP, CTP, GTP and UTP. Synthesis always occurs in a fixed direction, from 5' to 3'. A number of
enzymes stimulate the local unwinding of DNA, and this allows the RNA polymerase to initiate transcription of one
of the DNA strands. Within a gene, only one of the DNA strands is transcribed into mRNA. This strand is called the
anticoding strand, antisense strand or template strand. The DNA strand that is not transcribed is called the coding
strand or sense strand.

The synthesized RNA is a copy of the sense strand (with U in place of T) and is complementary to the antisense
strand. The basic process of mRNA synthesis, i.e., transcription, is similar to DNA replication. During
transcription, complementary ribonucleotides available in the cell are lined up on the template strand one by one.
After reaching a certain length, the mRNA separates from the RNA-DNA hybrid. A transcription bubble is formed
during transcription and disappears once the process is complete.
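
The relationship between the two DNA strands and the transcript can be illustrated with a short sketch. The template is written 3'→5' so that the RNA and the coding strand come out 5'→3'; the example sequence and function names are made up for illustration.

```python
# Base-pairing rules for building RNA or DNA against a DNA template.
RNA_PARTNER = {"A": "U", "T": "A", "G": "C", "C": "G"}
DNA_PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5):
    """Build the RNA (5'->3') complementary to a template strand given 3'->5'."""
    return "".join(RNA_PARTNER[base] for base in template_3_to_5)


def coding_strand(template_3_to_5):
    """Return the coding (sense) DNA strand, 5'->3', for the same template."""
    return "".join(DNA_PARTNER[base] for base in template_3_to_5)


template = "TACGGCATT"            # antisense strand, written 3'->5'
rna = transcribe(template)
sense = coding_strand(template)

print(rna)    # AUGCCGUAA
print(sense)  # ATGCCGTAA
print(rna == sense.replace("T", "U"))  # True
```

The final comparison shows the point made above: the RNA matches the sense strand except that U replaces T.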

Mechanism of Transcription in E. coli / Transcription of E.Coli:


Transcription can be described in E. coli by the following steps:
1. Initiation,
2. Elongation,
3. Termination.
1. Initiation:
Initiation of transcription involves the binding of RNA polymerase to dsDNA. RNA polymerases are usually
multi-subunit enzymes. They bind to dsDNA at promoter sites. There are two promoter elements: one is the
TATA box or Pribnow box, located 10 bp upstream, and the other is the recognition sequence, located 35 bp
upstream from the transcription start site.

Correct initiation of transcription is obviously important. The -35 and TATA boxes are the signals that position
the polymerase at the correct place to start transcription of a gene. The core enzyme of RNA polymerase has an
affinity for any stretch of DNA, to which it attaches at random, but it cannot recognize the correct initiation site.
When the σ factor binds with the core enzyme, the resultant holoenzyme loses much of its affinity for random
DNA but binds tightly to the -35 and TATA boxes, and initiation of transcription starts. The resulting structure
is called a closed promoter complex. The enzyme then unwinds the bases near the -10 region to form an open
promoter complex. A bubble of separated DNA strands is formed, making the template strand bases available for
pairing with the bases of incoming nucleotides. The enzyme now synthesizes the first few phosphodiester bonds
from nucleotides, and initiation is thus achieved.

2. Elongation:
Shortly after transcription is initiated, the σ factor dissociates from RNA polymerase. The RNA is always
synthesized in a 5' to 3' direction, with nucleotides acting as substrates for the enzyme. The equation below
represents the addition of each ribonucleotide and shows how energy is supplied for the reaction.
$$\mathrm{NTP} + (\mathrm{NMP})_n \xrightarrow[\mathrm{Mg^{2+}},\ \text{RNA polymerase}]{\text{DNA}} (\mathrm{NMP})_{n+1} + \mathrm{PP_i}$$

The polymerase unwinds a stretch of DNA about 17 bp in length, forming a transcription bubble that
progresses along the DNA. The DNA has to unwind ahead of the polymerase and rewind behind it. The newly
formed RNA forms an RNA-DNA double helix about 12 bp long.

3. Termination:
RNA polymerase also recognizes signals for chain termination, which involves the release of the nascent RNA
(pre-RNA) and the enzyme from the template and the re-formation of the dsDNA. This occurs at a specific DNA
sequence known as the terminator. There are two major mechanisms for termination in E. coli.
a. First mechanism: Rho independent
The terminator sequences contain GC-rich self-complementary regions which can form a stem-loop or hairpin
in the RNA. This is followed by a terminal run of uracils that corresponds to the adenine residues on the DNA
template.

The strong G–C bonds in the hairpin and the weak A–U bonds between DNA and RNA facilitate detachment of
the mRNA and termination of transcription. It occurs directly and needs no additional factors.

b. Second mechanism: Rho dependent


If the poly-uracil tail (poly-adenine run on the template) is absent, then two regulatory proteins, Rho factor
and Nus-A, can terminate transcription.

The Rho factor attaches to the newly transcribed mRNA and moves along it behind the RNA polymerase. Rho binds
to a specific site on the RNA termed rut. The rut sites are located just upstream from sequences at which RNA
polymerase tends to pause, probably because of a difficult-to-separate G-C rich section of the DNA. Rho uses the
hydrolysis of ATP to ADP and inorganic phosphate to drive the termination reaction.

[RNA polymerase stops transcription about 100 nucleotides away from the rut site; the sequence that halts the
RNA polymerase is called the Rho-sensitive pause site. Once the Rho protein binds the rut sequence in the RNA,
it uses the energy of ATP to translocate to the RNA-DNA hybrid and unwinds that region.]

The Nus-A protein acts directly at the RNA polymerase complex, dissociates the enzyme from the DNA strands and
terminates transcription.
Transcription can be inhibited by different inhibitors or antibiotics. Their targets and modes of action are given
below:

Inhibitors | Target Enzyme | Inhibitory action
Rifampicin | Bacterial holoenzyme | Binds to the β subunit to prevent initiation.
Streptolydigin | Bacterial core enzyme | Binds to the β subunit to prevent elongation.
Actinomycin-D | Eukaryotic RNA polymerase I | Binds to the DNA to prevent the elongation process.
α-Amanitin | Eukaryotic RNA polymerase II | Binds to the RNA polymerase II enzyme.

Difference between prokaryotic RNA polymerase and eukaryotic RNA polymerase enzyme:

Prokaryotic RNA polymerase                      Eukaryotic RNA polymerase
Only one type of RNA polymerase is present.     Three different types of RNA polymerase are present.
It contains 5 subunits.                         It contains 12 or more subunits.
It synthesizes all three types of RNA           RNA pol. I → rRNA
(mRNA, tRNA, and rRNA).                         RNA pol. II → mRNA
                                                RNA pol. III → tRNA
Molecular weight is about 480,000 daltons.      Molecular weight is about 500,000 daltons.
The holoenzyme contains a σ (sigma) factor.     It has no σ factor.

PROCESSING OF mRNA:
Processing of mRNA is the post-transcriptional modification of the nascent pre-mRNA synthesized by
RNA polymerase II, which takes place in the nucleus of eukaryotes before a functional mRNA is produced.

From the time nascent transcripts first emerge from RNA polymerase II until they become mature mRNA,
the RNA molecules are associated with proteins in heterogeneous ribonucleoprotein particles (hnRNPs),
which contain heterogeneous nuclear RNA (hnRNA), a term referring to pre-mRNA. The hnRNP proteins may
prevent the formation of short secondary structures that depend on base pairing of complementary
regions, thereby making pre-mRNAs accessible for interaction with other macromolecules.

The pre-mRNA then undergoes processing in three steps:

1. 5' capping,
2. 3' cleavage and polyadenylation,
3. RNA splicing.

1. 5' CAPPING
After nascent RNA molecules produced by RNA polymerase II reach a length of 25-30 nucleotides,
7-methylguanosine is added to their 5' end. This initial step in RNA processing is catalyzed by a dimeric
capping enzyme that associates with the phosphorylated carboxyl-terminal domain (CTD) of RNA polymerase II.

One subunit of the capping enzyme removes the γ phosphate from the 5' end of the nascent RNA emerging from
the surface of RNA polymerase II. The other subunit transfers the GMP moiety from GTP to the 5' diphosphate
of the nascent transcript, creating the guanosine 5'-5' triphosphate structure.

S-adenosylmethionine is the source of the methyl group for the two methylation steps catalyzed by
methyltransferases. These enzymes transfer methyl groups from S-adenosylmethionine to the N7 position of
the guanine and to the 2' oxygen of the ribose at the 5' end of the nascent RNA.


2. 3' CLEAVAGE AND POLYADENYLATION:
Nearly all mRNAs contain the sequence AAUAAA 10-35 nucleotides upstream from the poly A tail. This sequence
and a second signal within 50 nucleotides downstream from the cleavage site are required for efficient
cleavage and polyadenylation of most pre-mRNAs. The downstream poly A signal is not a specific sequence but
rather a G-U rich or uracil-rich region.

Cleavage and polyadenylation specificity factor (CPSF), composed of four different polypeptides, binds
to the upstream AAUAAA poly A signal. Then cleavage stimulatory factor (CStF) interacts with the
downstream G-U or uracil-rich sequence and with bound CPSF, forming a loop in the RNA. Binding of
cleavage factor I (CF I) and cleavage factor II (CF II) helps to stabilize the complex. Finally, poly A
polymerase (PAP) binds to the complex and stimulates cleavage at the poly A site. The cleavage factors
are released, as is the downstream RNA cleavage product, which is rapidly degraded.

Bound PAP then adds about 12 A residues, at a slow rate, to the 3' hydroxyl group generated by the
cleavage reaction. Binding of poly A binding protein II (PAB II) to the initial short poly A tail
accelerates the rate of addition by PAP. After 200-250 adenine residues have been added, PAB II
signals PAP to stop polyadenylation.

Figure: Polyadenylation at the 3' end. The major signal for the 3'
cleavage is the sequence AAUAAA. Cleavage occurs at 10-35
nucleotides downstream from the specific sequence. A second signal
is located about 50 nucleotides downstream from the cleavage site.
This signal is a GU-rich or U-rich region.
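
The two-signal rule for the cleavage site can be written as a small check. This is a rough sketch under stated assumptions: the function names are hypothetical, the distance is measured from the 3' end of AAUAAA to the cleavage site, and "GU-rich" is approximated as any 8-nucleotide stretch that is at least 80% G or U. Neither that window size nor that threshold comes from these notes.

```python
def gu_fraction(seq):
    """Fraction of bases in seq that are G or U (0.0 for an empty string)."""
    return sum(base in "GU" for base in seq) / len(seq) if seq else 0.0


def check_polya_signals(pre_mrna, cleavage_site, stretch=8, min_gu=0.8):
    """Check both poly A signals around a proposed cleavage site.

    Returns (has_upstream_signal, has_downstream_signal):
      - upstream: an AAUAAA whose 3' end lies 10-35 nt before the site
      - downstream: a GU/U-rich stretch within 50 nt after the site
    `stretch` and `min_gu` are illustrative assumptions.
    """
    signal = "AAUAAA"
    has_upstream = False
    start = pre_mrna.find(signal)
    while start != -1:
        gap = cleavage_site - (start + len(signal))
        if 10 <= gap <= 35:
            has_upstream = True
            break
        start = pre_mrna.find(signal, start + 1)

    downstream = pre_mrna[cleavage_site:cleavage_site + 50]
    has_downstream = any(
        gu_fraction(downstream[i:i + stretch]) >= min_gu
        for i in range(len(downstream) - stretch + 1)
    )
    return has_upstream, has_downstream


# Made-up pre-mRNA: AAUAAA, a 22-nt spacer ending in CA, then a GU-rich element.
pre = "C" * 5 + "AAUAAA" + "C" * 20 + "CA" + "C" * 10 + "GUGUUGUGUU" + "C" * 5
print(check_polya_signals(pre, 33))   # site just after the CA dinucleotide
```

The sequence is invented for the demonstration. A real predictor would also score signal variants and the strength of the downstream element, which this check ignores.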

3. RNA SPLICING:
During the final step in the formation of a mature functional mRNA, the introns are removed and the exons
are spliced together. Splicing of exons in the nascent RNA usually begins before transcription of the gene
is complete. The 5' cap and 3' poly A tail of mRNA precursors are retained in the mature cytoplasmic mRNA.

Generally, 30-40 nucleotides at each end of an intron are necessary for splicing to occur at normal
rates. The most conserved nucleotides are the 5' GU and 3' AG found at the ends of most introns. A
branch-point adenine and a pyrimidine-rich region are also found in introns.

snRNAs, each associated with six to ten proteins in small nuclear ribonucleoprotein particles (snRNPs),
assist in the splicing reaction. Five uracil-rich snRNAs (U1, U2, U4, U5 and U6) participate in RNA
splicing. The five splicing snRNAs are thought to assemble sequentially on the pre-mRNA, forming a
large ribonucleoprotein complex called the spliceosome. First, the short consensus sequence at the
5' end of the intron is complementary to a sequence in U1 and binds to it by base pairing. Similarly,
U2 base pairs with the region around the branch point A and interacts with U1 to form a loop
structure. Then the U4, U5 and U6 complex associates with the previously formed complex to yield a
spliceosome.

After formation of the spliceosome, extensive rearrangement occurs in the pairing of snRNAs and pre-mRNA.
The rearranged spliceosome then catalyzes the two trans-esterification reactions that result in RNA splicing.
After the second trans-esterification reaction, the ligated exons are released from the spliceosome, while the
intron, which has a branched lariat structure, remains associated with the snRNPs. The final intron-snRNP
complex is unstable and dissociates. The individual snRNPs released participate in a new cycle of splicing.
The excised intron is rapidly degraded by hydrolysis and the action of other enzymes.
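
The net result of splicing (introns removed, exons joined) can be illustrated in a few lines. This is a simplified sketch, not a model of the spliceosome: the function name and the coordinate convention (zero-based, end-exclusive intron positions supplied by the caller) are assumptions made here, and only the conserved 5' GU and 3' AG ends are checked.

```python
def splice(pre_mrna, introns):
    """Remove introns from a pre-mRNA and join the exons.

    `introns` is a list of (start, end) positions, zero-based and
    end-exclusive (a hypothetical convention for this sketch).
    Each intron must begin with GU and end with AG, the most
    conserved intron nucleotides; otherwise ValueError is raised.
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron at {start}-{end} lacks GU...AG ends")
        exons.append(pre_mrna[position:start])  # exon before this intron
        position = end
    exons.append(pre_mrna[position:])           # final exon
    return "".join(exons)


# Made-up transcript: exon 1, one intron (GU ... branch A ... Py tract ... AG), exon 2.
exon1, intron, exon2 = "AUGGCC", "GUAAGUCCUAACUUUUCCCAG", "GAAUAA"
pre_mrna = exon1 + intron + exon2
print(splice(pre_mrna, [(len(exon1), len(exon1) + len(intron))]))
```

The sequences are invented for the demonstration. The sketch does not locate splice sites itself, since recognizing them depends on the snRNP interactions described above rather than on GU and AG alone.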

SPLICING PATHWAYS:
A. Assembly
B. Rearrangement
C. Catalysis

Mechanism:
1. U1 recognizes the 5' splice site.
2. One subunit of U2AF binds to the Py (pyrimidine) tract and the other to the 3' splice site.
3. U2 binds to the branch site, and the A complex is formed.
4. The base pairing between U2 and the branch site is such that the branch site A is extruded.
This A residue is available to react with the 5' splice site.

Formation of the active site exposes the 5' splice site of the pre-mRNA and the branch site,
allowing the branch-site A residue to attack the 5' splice site and accomplish the first
trans-esterification reaction.

RNA splicing is carried out by a large complex called the spliceosome. The spliceosome comprises
about 150 proteins and 5 snRNAs. Many functions of the spliceosome are carried out by its RNA
components.

Three roles of snRNAs in splicing:
1. They recognize the 5' splice site and the branch site.
2. They bring those sites together as required.
3. They catalyze (or help catalyze) the RNA cleavage and joining reactions.