Gene Expression

Gene expression analysis involves studying how genes are transcribed to produce functional products like proteins or non-coding RNAs, impacting phenotypes. Various techniques such as Northern blotting, DNA microarrays, and RT-PCR are utilized for this analysis, each with specific protocols and applications. RT-PCR, in particular, is highlighted for its sensitivity and real-time quantification capabilities, making it a preferred method for validating gene expression changes.

Gene Expression Analysis

Gene expression is the process by which information from a gene is used to synthesize a
functional gene product, either a protein or a non-coding RNA, which ultimately affects the
phenotype.
These products are often proteins, but in non-protein-coding genes such as transfer RNA
(tRNA) and small nuclear RNA (snRNA), the product is a functional non-coding RNA.
Gene expression analysis is most simply described as the study of the way genes are
transcribed to synthesize functional gene products — functional RNA or protein products.
Techniques used in gene expression analysis include:
• Northern blotting
• DNA microarrays
• RT-PCR
• Expression of reporter genes
• Gel shift assays
• Chromatin Immunoprecipitation (ChIP)
• Transcriptome sequencing by next-generation sequencing
• Western blotting
• 2D-Gel Electrophoresis
• Mass Spectrometry
Northern Blotting
Northern blot analysis reveals information about RNA identity, size, and abundance, allowing
a deeper understanding of gene expression levels.
It was developed by James Alwine, David Kemp and George Stark in 1977.
Principle of Northern Blot
• The principle of the northern blot is the same as that of other blotting techniques:
biomolecules are separated on a gel and then transferred to a membrane.
• The RNA samples are separated on gels according to their size by gel electrophoresis.
Since RNAs are single-stranded, they can form secondary structures by
intramolecular base pairing. The electrophoretic separation of the RNA segments is
thus performed under denaturing conditions.
• The separated RNA fragments are then transferred to a nylon membrane.
Nitrocellulose membrane is not used as RNA doesn’t bind effectively to the
membrane.
• The transferred segments are immobilized onto the membrane by fixing agents. The
RNA fragments on the membrane are detected by the addition of a labeled probe
complementary to the RNA sequences present on the membrane.
• The hybridization forms the basis of the detection of RNA as the specificity of
hybridization between the probe, and the RNA allows the accurate identification of
the segments.
• Northern blot utilizes size-dependent separation of RNA segments and thus can be
used to determine the sizes of the transcripts.
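
The size determination mentioned in the last point is commonly done by comparing the migration distance of a band with that of size markers run on the same gel. The following is a minimal sketch, assuming that log10(size) falls roughly linearly with migration distance; the ladder values and function names are hypothetical and only illustrate the idea.

```python
import math

def fit_semilog(ladder):
    """Least-squares fit of log10(size) = slope * distance + intercept.

    ladder is a list of (migration distance in mm, size in nucleotides).
    """
    xs = [distance for distance, _ in ladder]
    ys = [math.log10(size) for _, size in ladder]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def estimate_size(distance_mm, slope, intercept):
    """Estimate transcript size (nt) from how far its band migrated."""
    return 10 ** (slope * distance_mm + intercept)

# Hypothetical RNA ladder, not taken from any real gel
ladder = [(10.0, 6000), (20.0, 3000), (30.0, 1500), (40.0, 750)]
slope, intercept = fit_semilog(ladder)
print(round(estimate_size(25.0, slope, intercept)))
```

Real gels deviate from this straight-line relationship at the extremes of the size range, so the estimate is only reasonable for bands that fall between the markers.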
Northern blot protocol Phases:
1. Extraction of RNA:
 There are many RNA extraction kits commercially available, but they all involve cell
lysis, inhibition of RNases, removal of proteins and other contaminants, and
recovery of RNA.

2. Isolation of mRNA:
 Oligo dT cellulose chromatography can be used to isolate only mRNA with a poly A
tail. Addition of the poly A tail is the final step of mRNA production in the nucleus. The
tail enables nuclear export, translation, and stability of mRNA. In Oligo dT
cellulose chromatography, oligos complementary to the poly A tail are covalently
attached to a resin column. When the sample is applied to the column the mRNA
with the poly A tail will hybridize to the oligo probe and be retained on the column.
Then, the elution buffer is applied to disrupt hybridization and recover the mRNA.

3. Gel electrophoresis to separate mRNA by size:


 Agarose gels containing formaldehyde were traditionally used to denature RNA. The
formaldehyde reacts with the imine and amine groups on the nucleic acids, which
disrupts the hydrogen bonding between bases and disrupts the secondary structure
of the RNA. It is important to disrupt the secondary structure because the RNA
must be extended to allow proper binding of probe for identification.

4. Transfer of RNA to blotting membrane:


 The transfer is necessary because the probes can’t enter into the gel matrix. Therefore,
the RNA must be transferred to a membrane where they can be accessed by the
probes
 Transfer is accomplished via a capillary (overnight) or vacuum (15-60
minutes) blotting system.
 The blotting membrane is positively charged to attract the negatively charged RNA.
Nylon is a commonly used membrane.

5. Immobilization of RNA to the blotting membrane:


 The RNA is covalently attached to the membrane by the application of UV light or heat.
6. Application of Probe:
 Probes have a minimum of 25 bases that are complementary to the mRNA sequence of
interest.
 Excess probe is washed off.

7. Probe visualization:
 Radioactive isotopes were traditionally used, but have been replaced in favor of safer
detection methods.
 Chemiluminescence is commonly used in the modern northern blot protocol
(Glyoxal, an alternative to formaldehyde, is now available in many commercial northern blot
kits. Formaldehyde is a suspected human carcinogen, and has been shown to cause squamous
cell carcinoma in the nasal passages and trachea in rats exposed to inhalation. Formaldehyde
denatures large RNA molecules by forming single point adducts with the amine or imine
molecules on base pairs and disrupting normal hydrogen bonding. These linear adducts are
unstable and will decompose if formaldehyde is not present in the electrophoresis gel. This
means that gels must be poured and run in a fume hood in order to reduce the human
exposure risk. Glyoxal is a diformyl molecule that contains two carbonyl groups, which
means that it can react with neighbouring amines and imines at the same time on a base pair
and form a more stable cyclic adduct with two points of contact. The stability of glyoxal
denaturing is attractive because it eliminates the need to pour a denaturing gel. The glyoxal
can be applied to the RNA prior to gel electrophoresis, and the gel does not need to contain
glyoxal or formaldehyde.)
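
Step 6 of the protocol calls for a probe of at least 25 bases that is complementary to the mRNA of interest. As a rough illustration, the sketch below derives a DNA probe sequence as the reverse complement of a chosen mRNA region. The target sequence and the function name are hypothetical, and real probe design also considers factors such as melting temperature and cross-hybridization that are not modelled here.

```python
# Base pairing between an RNA target and a DNA probe
COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}

def antisense_dna_probe(mrna_region, min_length=25):
    """Return the DNA probe (5'->3') complementary to an mRNA region (5'->3')."""
    region = mrna_region.upper()
    if len(region) < min_length:
        raise ValueError("probe should cover at least %d bases" % min_length)
    # Reverse the region, then complement each base
    return "".join(COMPLEMENT[base] for base in reversed(region))

target = "AUGGCUAGCUUAGGCUAACGGAUCCGAU"  # hypothetical 28-nt mRNA region
probe = antisense_dna_probe(target)
print(probe)
```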
DNA Microarrays
A DNA microarray is a chip carrying an array of DNA segments that is used to simultaneously
detect and identify many short RNA or DNA fragments by hybridization. It is also known as a
DNA chip or oligonucleotide array. Because DNA microarrays work by hybridization, they can
be used to monitor gene expression by hybridizing the array to messenger RNA. Microarrays
are fairly expensive and analysis of the data is highly labour intensive, despite computerized
analysis. If only one or a few genes are the objects of interest, other methods such as
Northern hybridization to detect mRNA or using a reporter gene to measure the level of
transcription are more appropriate. For total transcriptome analysis, the solid support (i.e., the
“chip”) has DNA sequences complementary to all possible mRNA molecules that a cell
might express. The DNA is robotically printed onto a nylon membrane or a glass slide.
Current technology can print about 100,000 spots of DNA per cm², with glass slides capable
of carrying higher densities than nylon membranes.
Next, mRNA is extracted from cells and labelled, either with a radioactive isotope or
more often with a fluorescent dye. Next, the labelled mRNA is placed on the DNA array in
conditions that favour binding of complementary sequences. After binding to the chip, the
intensity of the label in each spot correlates to the amount of that particular mRNA. Most
gene expression studies compare two different conditions, one “control” set or untreated
cells, and one “experimental” set where the cells are exposed to a different environment. Both
mRNA samples can be hybridized to the chip at the same time if two different fluorescent
dyes (e.g., Cy3, which is green, and Cy5, which is red) are used, one for each mRNA set. If the
“control” mRNA is labelled red and the “experimental” mRNA green, red spots will show genes
expressed under “control” conditions and green spots will show genes expressed under
“experimental” conditions. When the same mRNA is expressed in both conditions, that spot
will fluoresce yellow.
Determining the intensity of green, red, and yellow for each spot is accomplished by
computer analysis, which determines the mean of the pixels or median value for the pixels,
and normalizes these to a set of internal controls. Rather than simply presenting a table of
numbers, the computerized analysis is often presented as a “heat map” grid. The gene
sequences for the control set of data are listed on the x-axis and the experimental genes are
listed along the y-axis. Each square of the grid is coloured, where red indicates an increase in
expression and blue represents a decrease in expression over the control. Shading of either
red or blue from light to dark indicates relative increases or decreases of gene expression.
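
The per-spot analysis described above can be sketched as follows: each spot is summarized by the median of its pixels in each channel, the channels are rescaled using internal control spots, and the result is expressed as a log2 ratio that a heat map can colour. This is a simplified sketch with made-up pixel values and hypothetical function names; real analysis software also handles background subtraction and more elaborate normalization. Whether a positive ratio means up- or downregulation depends on which dye labelled which sample.

```python
import math
from statistics import median

def spot_intensity(pixels):
    """Summarize one spot in one channel by the median of its pixel values."""
    return median(pixels)

def channel_scale(control_spots):
    """Factor that rescales Cy5 so both channels agree on the internal controls."""
    cy3_total = sum(spot_intensity(s["cy3"]) for s in control_spots)
    cy5_total = sum(spot_intensity(s["cy5"]) for s in control_spots)
    return cy3_total / cy5_total

def log2_ratio(spot, scale):
    """Normalized log2(Cy5 / Cy3) for one spot; 0 means equal signal."""
    cy3 = spot_intensity(spot["cy3"])
    cy5 = spot_intensity(spot["cy5"]) * scale
    return math.log2(cy5 / cy3)

# Hypothetical pixel values for one internal control spot and one gene spot
controls = [{"cy3": [100, 102, 98], "cy5": [200, 204, 196]}]
gene_spot = {"cy3": [100, 110, 90], "cy5": [800, 790, 810]}

scale = channel_scale(controls)
ratio = log2_ratio(gene_spot, scale)
print(scale, ratio)
```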
Four major steps in performing a typical microarray experiment are:
1. Sample Preparation and Labelling
• Isolate total RNA containing mRNA that ideally represents a quantitative copy of
genes expressed at the time of sample collection.
• Prepare cDNA from the mRNA using a reverse transcriptase enzyme.
• A short primer is required to initiate cDNA synthesis.
• Each cDNA (Sample and Control) is labelled with fluorescent cyanine dyes (i.e. Cy3
and Cy5).
2. Array Hybridization
 The mixed labelled cDNA is competitively hybridized against denatured PCR product
or cDNA molecules spotted on a glass slide.
3. Purification
4. Image Acquisition and Data Analysis
• The slide is dried and scanned to determine how much labelled cDNA (probe) is bound to
each target spot.
• Spots with hybridized probe emit fluorescence when scanned.
• Microarray software often uses green spots on the microarray to represent
upregulated genes, red spots to represent downregulated genes, and yellow spots to
represent genes present in equal abundance. Which colour corresponds to upregulation
depends on which dye was used to label which sample.
Reverse Transcription Real-Time PCR
RT-PCR (reverse transcription-polymerase chain reaction) is the most sensitive
technique for mRNA detection and quantitation currently available. Compared to the two
other commonly used techniques for quantifying mRNA levels, Northern blot analysis and
RNase protection assay, RT-PCR can be used to quantify mRNA levels from much smaller
samples. In fact, this technique is sensitive enough to enable quantitation of RNA from a
single cell.
Over the last several years, the development of novel chemistries and instrumentation
platforms enabling detection of PCR products on a real-time basis has led to widespread
adoption of real-time RT-PCR as the method of choice for quantitating changes in gene
expression. Furthermore, real-time RT-PCR has become the preferred method for validating
results obtained from array analyses and other techniques that evaluate gene expression
changes on a global scale.

To truly appreciate the benefits of real-time PCR, a review of PCR fundamentals is
necessary. At the start of a PCR reaction, reagents are in excess, template and product are at
low enough concentrations that product renaturation does not compete with primer binding,
and amplification proceeds at a constant, exponential rate. The point at which the reaction
rate ceases to be exponential and enters a linear phase of amplification is extremely variable,
even among replicate samples, but it appears to be primarily due to product renaturation
competing with primer binding (since adding more reagents or enzyme has little effect). At
some later cycle the amplification rate drops to near zero (plateaus), and little more product is
made.
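
The exponential phase described above can be written as a simple model in which the number of copies is multiplied by a constant factor in every cycle. The sketch below assumes that model only; the `efficiency` parameter (1.0 meaning perfect doubling) and the function name are illustrative, and the linear and plateau phases are deliberately not modelled.

```python
def copies_after(initial_copies, cycles, efficiency=1.0):
    """Copies present after `cycles` cycles of exponential-phase amplification.

    efficiency = 1.0 corresponds to perfect doubling in every cycle.
    """
    return initial_copies * (1.0 + efficiency) ** cycles

# Starting from a hypothetical 100 template copies
for n in (10, 20, 30):
    print(n, copies_after(100, n), copies_after(100, n, efficiency=0.9))
```

Comparing the two columns shows why small differences in efficiency matter: they compound with every cycle.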

For the sake of accuracy and precision, it is necessary to collect quantitative data at a
point in which every sample is in the exponential phase of amplification (since it is only in
this phase that amplification is extremely reproducible). Analysis of reactions during
exponential phase at a given cycle number should theoretically provide several orders of
magnitude of dynamic range. Rare targets will probably be below the limit of detection, while
abundant targets will be past the exponential phase. In practice, a dynamic range of 2-3 logs
can be quantitated during end-point relative RT-PCR. In order to extend this range, replicate
reactions may be performed for a greater or lesser number of cycles, so that all of the samples
can be analyzed in the exponential phase.

Real-time PCR automates this otherwise laborious process by quantitating reaction
products for each sample in every cycle. The result is a broad 10^7-fold dynamic
range, with no user intervention or replicates required. Data analysis, including standard
curve generation and copy number calculation, is performed automatically. With increasing
numbers of labs and core facilities acquiring the instrumentation required for real-time
analysis, this technique is becoming the dominant RT-PCR-based quantitation technique.
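
Because fluorescence is recorded in every cycle, each sample can be summarized by the cycle at which its signal first crosses a fixed threshold, the threshold cycle (Ct) used by the quantitation methods discussed later. The sketch below is one simple way to compute it, assuming a user-chosen threshold and linear interpolation between neighbouring cycles; instrument software typically uses more refined baseline and threshold handling, and the readings here are made up.

```python
def threshold_cycle(fluorescence, threshold):
    """Fractional cycle at which the signal first reaches the threshold.

    fluorescence[i] is the reading taken after cycle i + 1.
    Returns None if the threshold is never reached.
    """
    if not fluorescence:
        return None
    if fluorescence[0] >= threshold:
        return 1.0
    for i in range(1, len(fluorescence)):
        prev, curr = fluorescence[i - 1], fluorescence[i]
        if prev < threshold <= curr:
            # prev was read after cycle i, curr after cycle i + 1
            return i + (threshold - prev) / (curr - prev)
    return None

readings = [1.0, 2.0, 4.0, 8.0, 16.0]  # hypothetical, doubling every cycle
print(threshold_cycle(readings, 6.0))
```

A sample with more starting template crosses the threshold earlier and therefore has a lower Ct.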

Real-Time PCR Chemistries

Currently four different chemistries, TaqMan® (Applied Biosystems, Foster City, CA, USA),
Molecular Beacons, Scorpions® and SYBR® Green (Molecular Probes), are available for
real-time PCR. All of these chemistries allow detection of PCR products via the generation of
a fluorescent signal. TaqMan probes, Molecular Beacons and Scorpions depend on Förster
Resonance Energy Transfer (FRET) to generate the fluorescence signal via the coupling of a
fluorogenic dye molecule and a quencher moiety to the same or different oligonucleotide
substrates. SYBR Green is a fluorogenic dye that exhibits little fluorescence when in
solution, but emits a strong fluorescent signal upon binding to double-stranded DNA.

TaqMan Probes

TaqMan probes depend on the 5'- nuclease activity of the DNA polymerase used for
PCR to hydrolyze an oligonucleotide that is hybridized to the target amplicon. TaqMan
probes are oligonucleotides that have a fluorescent reporter dye attached to the 5' end and a
quencher moiety coupled to the 3' end. These probes are designed to hybridize to an internal
region of a PCR product. In the unhybridized state, the proximity of the fluor and the quench
molecules prevents the detection of fluorescent signal from the probe. During PCR, when the
polymerase replicates a template on which a TaqMan probe is bound, the 5'- nuclease activity
of the polymerase cleaves the probe. This decouples the fluorescent and quenching dyes and
FRET no longer occurs. Thus, fluorescence increases in each cycle, proportional to the
amount of probe cleavage. However, TaqMan probes can be expensive to synthesize,
with a separate probe needed for each mRNA target being analyzed.

Molecular Beacons

Like TaqMan probes, Molecular Beacons also use FRET to detect and quantitate the
synthesized PCR product via a fluor coupled to the 5' end and a quench attached to the 3' end
of an oligonucleotide substrate. Unlike TaqMan probes, Molecular Beacons are designed to
remain intact during the amplification reaction, and must rebind to target in every cycle for
signal measurement. Molecular Beacons form a stem-loop structure when free in solution.
Thus, the close proximity of the fluor and quench molecules prevents the probe from
fluorescing. When a Molecular Beacon hybridizes to a target, the fluorescent dye and
quencher are separated, FRET does not occur, and the fluorescent dye emits light upon
irradiation.

Molecular Beacons, like TaqMan probes, can be used for multiplex assays by using
spectrally separated fluor/quench moieties on each probe. As with TaqMan probes, Molecular
Beacons can be expensive to synthesize, with a separate probe required for each target.

Scorpions

With Scorpion probes, sequence-specific priming and PCR product detection are
achieved using a single oligonucleotide. The Scorpion probe maintains a stem-loop
configuration in the unhybridized state. The fluorophore is attached to the 5' end and is
quenched by a moiety coupled to the 3' end. The 3' portion of the stem also contains sequence
that is complementary to the extension product of the primer. This sequence is linked to the 5'
end of a specific primer via a non-amplifiable monomer. After extension of the Scorpion
primer, the specific probe sequence is able to bind to its complement within the extended
amplicon thus opening up the hairpin loop. This prevents the fluorescence from being
quenched and a signal is observed.

SYBR Green

SYBR Green provides the simplest and most economical format for detecting and
quantitating PCR products in real-time reactions. SYBR Green binds double-stranded DNA,
and upon excitation emits light. Thus, as a PCR product accumulates, fluorescence increases.
The advantages of SYBR Green are that it is inexpensive, easy to use, and sensitive. The
disadvantage is that SYBR Green will bind to any double-stranded DNA in the reaction,
including primer-dimers and other non-specific reaction products, which results in an
overestimation of the target concentration. For single PCR product reactions with well
designed primers, SYBR Green can work extremely well, with spurious non-specific
background only showing up in very late cycles.

SYBR Green is the most economical choice for real-time PCR product detection. Since
the dye binds to double-stranded DNA, there is no need to design a probe for any particular
target being analyzed. However, detection by SYBR Green requires extensive optimization.
Since the dye cannot distinguish between specific and non-specific product accumulated
during PCR, follow up assays are needed to validate results.

Quantitation of Results

Two strategies are commonly employed to quantify the results obtained by real-time
RT-PCR: the standard curve method and the comparative threshold cycle (Ct) method. These
are discussed briefly below.

Standard Curve Method

In this method, a standard curve is first constructed from an RNA of known
concentration. This curve is then used as a reference standard for extrapolating quantitative
information for mRNA targets of unknown concentrations. Though RNA standards can be
used, their stability can be a source of variability in the final analyses. In addition, using RNA
standards would involve the construction of cDNA plasmids that have to be in vitro
transcribed into the RNA standards and accurately quantitated, a time-consuming process.
However, the use of absolutely quantitated RNA standards will help generate absolute copy
number data.
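
As an illustration of how such a curve is used, the sketch below fits a straight line to Ct against log10 of the known starting quantity for a dilution series and then reads an unknown sample off that line. It assumes that the measured value for each standard is its threshold cycle (Ct); the standards, Ct values and function names are hypothetical. The slope also gives an estimate of amplification efficiency through the widely used relation E = 10^(-1/slope) - 1.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fit_standard_curve(standards):
    """standards: (known quantity, measured Ct) pairs from a dilution series."""
    xs = [math.log10(quantity) for quantity, _ in standards]
    ys = [ct for _, ct in standards]
    return fit_line(xs, ys)

def quantity_from_ct(ct, slope, intercept):
    """Read the starting quantity of an unknown sample off the curve."""
    return 10 ** ((ct - intercept) / slope)

def amplification_efficiency(slope):
    """1.0 corresponds to perfect doubling per cycle."""
    return 10 ** (-1.0 / slope) - 1.0

# Hypothetical ten-fold dilution series: (copies, Ct)
standards = [(1e3, 30.00), (1e4, 26.68), (1e5, 23.36), (1e6, 20.04)]
slope, intercept = fit_standard_curve(standards)
unknown = quantity_from_ct(25.02, slope, intercept)
print(slope, intercept, unknown, amplification_efficiency(slope))
```

Unknowns should fall within the range covered by the standards, since the line is not guaranteed to hold outside it.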

In addition to RNA, other nucleic acid samples can be used to construct the standard
curve, including purified plasmid dsDNA, in vitro generated ssDNA or any cDNA sample
expressing the target gene. Spectrophotometric measurements at 260 nm can be used to
assess the concentration of these DNAs, which can then be converted to a copy number value
based on the molecular weight of the sample used. cDNA plasmids are the preferred
standards for standard curve quantitation. However, since cDNA plasmids will not control for
variations in the efficiency of the reverse transcription step, this method will only yield
information on relative changes in mRNA expression. This, and variation introduced due to
variable RNA inputs, can be corrected by normalization to a housekeeping gene.
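
The conversion from an absorbance reading to a copy number mentioned above can be sketched as follows. The constants are common approximations rather than values from this document: an A260 of 1.0 is taken as about 50 µg/mL of double-stranded DNA, and the average mass of one base pair as about 660 g/mol. The plasmid length and function names are hypothetical.

```python
AVOGADRO = 6.022e23               # molecules per mole
DSDNA_UG_PER_ML_PER_A260 = 50.0   # approximation for double-stranded DNA
AVG_BP_MASS = 660.0               # g/mol per base pair, approximate average

def dsdna_conc_ng_per_ul(a260, dilution_factor=1.0):
    """Concentration from absorbance at 260 nm (ug/mL equals ng/uL)."""
    return a260 * DSDNA_UG_PER_ML_PER_A260 * dilution_factor

def copies_per_ul(conc_ng_per_ul, length_bp):
    """Convert a mass concentration to molecules per microlitre."""
    grams_per_ul = conc_ng_per_ul * 1e-9
    molar_mass = length_bp * AVG_BP_MASS  # g/mol for the whole molecule
    return grams_per_ul / molar_mass * AVOGADRO

conc = dsdna_conc_ng_per_ul(0.2)          # hypothetical reading
copies = copies_per_ul(conc, 5000)        # hypothetical 5,000 bp plasmid
print(conc, copies)
```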

Comparative Ct Method

Another quantitation approach is termed the comparative Ct method. This involves
comparing the Ct values of the samples of interest with a control or calibrator such as a
non-treated sample or RNA from normal tissue. The Ct values of both the calibrator and the
samples of interest are normalized to an appropriate endogenous housekeeping gene.

The comparative Ct method is also known as the 2^(-ΔΔCt) method, where

ΔΔCt = ΔCt,sample - ΔCt,reference

Here, ΔCt,sample is the Ct value for any sample normalized to the endogenous
housekeeping gene (that is, the Ct of the target minus the Ct of the housekeeping gene)
and ΔCt,reference is the Ct value for the calibrator also normalized to the endogenous
housekeeping gene.

For the ΔΔCt calculation to be valid, the amplification efficiencies of the target and
the endogenous reference must be approximately equal. This can be established by looking at
how ΔCt varies with template dilution. If the slope of the plot of cDNA dilution versus ΔCt is
close to zero, it implies that the efficiencies of the target and housekeeping genes are very
similar. If a housekeeping gene cannot be found whose amplification efficiency is similar to
the target, then the standard curve method is preferred.
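
The calculation can be sketched in a few lines. The Ct values and function names below are hypothetical; the code follows the definitions given above and assumes near-perfect doubling per cycle, which is what the base of 2 implies. The second helper estimates the slope of ΔCt against log10 of the template dilution, which should be close to zero when the two efficiencies match.

```python
import math

def delta_ct(ct_target, ct_housekeeping):
    """Ct of the target normalized to the housekeeping gene."""
    return ct_target - ct_housekeeping

def fold_change(sample, calibrator):
    """Relative expression by 2^(-ΔΔCt).

    sample and calibrator are (Ct target, Ct housekeeping) pairs.
    """
    dd_ct = delta_ct(*sample) - delta_ct(*calibrator)
    return 2.0 ** (-dd_ct)

def delta_ct_slope(dilutions, delta_cts):
    """Least-squares slope of ΔCt against log10(dilution)."""
    xs = [math.log10(d) for d in dilutions]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(delta_cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, delta_cts))
    return sxy / sxx

# Hypothetical Ct values: treated sample versus untreated calibrator
treated = (22.0, 18.0)      # ΔCt = 4
untreated = (25.0, 18.0)    # ΔCt = 7
print(fold_change(treated, untreated))

# Hypothetical validation series: ΔCt measured at three template dilutions
print(delta_ct_slope([1.0, 0.1, 0.01], [4.0, 4.1, 3.9]))
```

A ΔΔCt of -3 corresponds to an eight-fold increase relative to the calibrator, because each cycle of difference represents one doubling.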

One-Step Versus Two-Step RT-PCR

When quantifying mRNA, RT-PCR can be performed as either a one-step reaction,
where the entire reaction from cDNA synthesis to PCR amplification is performed in a single
tube, or as a two-step reaction, where reverse transcription and PCR amplification occur in
separate tubes. There are several pros and cons associated with each method. One-step
real-time PCR is thought to minimize experimental variation because both enzymatic
reactions occur in a single tube. However, this method uses an RNA starting template, which
is prone to rapid degradation if not handled properly. Therefore, a one-step reaction may not
be suitable in situations where the same sample is assayed on several occasions over a period
of time. One-step protocols are also reportedly less sensitive than two-step protocols.

Two-step RT-PCR separates the reverse transcription reaction from the real-time PCR
assay, allowing several different real-time PCR assays on dilutions of a single cDNA.
Because the process of reverse transcription is notorious for its highly variable reaction
efficiency, using dilutions from the same cDNA template ensures that reactions from
subsequent assays have the same amount of template as those assayed earlier. Data from two-
step real-time PCR is quite reproducible with Pearson correlation coefficients ranging from
0.974 to 0.988. A two-step protocol may be preferred when using a DNA binding dye (such
as SYBR Green I) because it is easier to eliminate primer-dimers through the manipulation of
melting temperatures (Tms). However, two-step protocols allow for increased opportunities of
DNA contamination in real-time PCR.
Reporter Genes for Monitoring Gene Expression
Genes that are used in genetic analysis because their products are easy to detect are
known as reporter genes. They are often used to report on gene expression, although they
may also be used for other purposes, such as detecting the location of a protein or the
presence of a particular segment of DNA.
Easily Assayable Enzymes as Reporters
One of the first reporter genes for monitoring gene expression was the lacZ gene
encoding β-galactosidase. This enzyme normally splits lactose, a compound sugar found in
milk, into the simpler sugars glucose and galactose. However, β-galactosidase will also split a
wide range of galactose compounds (i.e., galactosides) both natural and artificial. The two
most commonly used artificial galactosides are ONPG and X-gal. ONPG (o-nitrophenyl
galactoside) is split into o-nitrophenol and galactose. The o-nitrophenol is yellow and soluble,
so it is easy to measure quantitatively. X-gal (5-bromo-4-chloro-3-indolyl β-D-galactoside) is
split into galactose plus the precursor to an indigo type dye. Oxygen in the air converts the
precursor to an insoluble blue dye that precipitates out at the location where the lacZ gene is
expressed.
Antibiotic Resistance As a Reporter Gene: Antibiotic resistance genes are included on
plasmids in order to determine whether the plasmids are present in a cell. When bacteria are
transformed with plasmid DNA those that get a plasmid that carries an antibiotic resistance
gene will survive when treated with the antibiotic, whereas those cells that fail to get a
plasmid will be killed.
Substrates Used by β-Galactosidase: The enzyme β-galactosidase normally cleaves lactose
into two monosaccharides, glucose and galactose. β-galactosidase also cleaves two artificial
substrates, ONPG and X-gal, releasing a group that forms a visible dye. ONPG releases a
bright yellow substance called o-nitrophenol, whereas X-gal releases an unstable group that
reacts with oxygen to form a blue indigo dye.
Another reporter gene is the phoA gene that encodes alkaline phosphatase. This
enzyme cleaves phosphate groups from a broad range of substrates. Like β-galactosidase,
alkaline phosphatase will use a variety of artificial substrates:
(1) O-Nitrophenyl phosphate is split, releasing yellow o-nitrophenol.
(2) X-phos (5-bromo-4-chloro-3-indolyl phosphate) consists of an indigo dye precursor
joined to phosphate. After the enzyme splits this, exposure to air converts the dye precursor
to a blue dye, as in the case of X-gal.
(3) 4-Methylumbelliferyl phosphate releases a fluorescent compound when the phosphate is
removed.
Substrates Used by Alkaline Phosphatase: Alkaline phosphatase removes phosphate groups
from various substrates. When the phosphate group is removed from o-nitrophenyl
phosphate, a yellow dye is released. When the phosphate is removed from X-phos, further
reaction with oxygen produces an insoluble blue dye as for X-gal. Additionally, alkaline
phosphatase releases a fluorescent molecule when the phosphate is removed from
4-methylumbelliferyl phosphate.
Light Emission by Luciferase As a Reporter System
A more sophisticated reporter gene encodes luciferase. This enzyme emits light when
provided with a substrate known as luciferin. Luciferase is found naturally in assorted
luminous creatures from bacteria to deep-sea squid. The lux genes from bacteria and the luc
genes from fireflies produce different brands of luciferase, but both work well as reporter
genes. The luciferins used by the different types of luciferase are chemically different.
Bacterial luciferase uses the reduced form of the co-factor FMN (flavin mononucleotide) as
its luciferin. Oxygen and a long chain aldehyde (R-CHO) are also needed. Both the reduced
FMN and the aldehyde are oxidized:
R-CHO + FMNH2 + O2 → R-COOH + FMN + H2O + hν
Different groups of eukaryotes make several chemically distinct luciferins that are used solely
for light emission. Firefly luciferase requires ATP as well as oxygen and firefly luciferin:
luciferin + O2 + ATP → oxidized luciferin + CO2 + H2O + AMP + diphosphate + hν
If DNA carrying a gene for luciferase is incorporated into a target cell, it will emit light only
when the appropriate luciferin is added. Although high-level expression of luciferase can be
seen with the naked eye, usually the amount of light is small and must be detected with a
sensitive electronic apparatus such as a luminometer or a scintillation counter.
Green Fluorescent Protein As Reporter
Green fluorescent protein (GFP) is not an enzyme, and it does not need a nonprotein
co-factor to fluoresce. GFP is a stable and nontoxic protein from the jellyfish Aequorea
victoria that can be visualized by its inherent green fluorescence. GFP can be directly
observed in living tissue without the need for adding any reagents. Nearly 2000 years ago,
the Roman author Pliny noted that the slime from certain jellyfish would generate enough
light when rubbed on his walking stick to help guide his steps in the dark. GFP emits green
light after illumination with long-wave UV. It can be used to follow gene expression or to
localize proteins inside the cell. To reveal where a protein is localized within the cell:
• The first step is to fuse the GFP gene in frame with all or part of the structural gene
that encodes the protein of interest.
• The fused construct is then expressed in a host cell.
• The cells are excited with long wavelength UV light and visualized under the
microscope.
• If the protein is normally located in the membrane, the cell membrane will fluoresce
green in the microscope.
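
The requirement that the GFP gene be fused in frame can be illustrated with a small check on the sequences being joined. This is only a sketch under simplifying assumptions: the target fragment is taken to start at its first codon, its own stop codon is dropped so that translation continues into the reporter, and the sequences shown are short hypothetical placeholders rather than real gene sequences.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def fuse_in_frame(target_cds, reporter_cds):
    """Join a target coding sequence to a reporter ORF, keeping the reading frame."""
    target = target_cds.upper()
    reporter = reporter_cds.upper()
    if len(target) % 3 != 0:
        raise ValueError("target fragment length must be a multiple of 3")
    # Drop the target's own stop codon so translation reads into the reporter
    if target[-3:] in STOP_CODONS:
        target = target[:-3]
    codons = [target[i:i + 3] for i in range(0, len(target), 3)]
    if any(codon in STOP_CODONS for codon in codons):
        raise ValueError("internal stop codon in target fragment")
    return target + reporter

target = "ATGGCTAAATAA"      # hypothetical target gene: ATG GCT AAA TAA
reporter = "ATGAGTAAAGGA"    # hypothetical placeholder for the start of a GFP ORF
fusion = fuse_in_frame(target, reporter)
print(fusion)
```

Because the target fragment contributes a whole number of codons, the reporter's codons are read in their original frame in the fusion.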

Gene Fusions
Reporter genes can be used to track the physical location of a segment of DNA or to
monitor gene expression. In particular, reporter genes are often incorporated into gene
fusions where they are used to follow the level of expression of the target gene. Many
genes have products that are complicated or tedious to assay by direct measurement or
may even be unknown. To avoid this, the original gene product is replaced by fusing its
regulatory region to the structural region of a reporter gene. To create this fusion, the
target gene is cut between its regulatory region and coding region. The same is done with
the reporter gene. Then the regulatory region of the gene under investigation is joined to
the coding region of the reporter gene. This hybrid structure is a gene fusion. The
regulatory sequences control the expression of the reporter gene in the same manner that
the original gene is controlled. Once the fusion gene is present in the organism, then the
researcher can alter the environment, treat the organism with different substances, or even
simply determine reporter gene expression at different stages of development. This
approach, especially using lacZ and β-galactosidase, is widely used in determining what
regulatory DNA sequences are important for gene expression.
Many bacteria, such as E. coli, already possess a wild-type copy of reporter genes
such as lacZ or phoA. In these cases, the wild-type version of the gene must be
deleted from the chromosome before the gene fusions are used. Strains of E. coli deleted
for the lac operon or for phoA are readily available. In eukaryotes, gene fusions use
different reporter genes. For example, yeast reporter genes include CUP1, a gene that
enables yeast to grow on copper-containing media, URA3, a gene that kills yeast when
growing on 5-fluoroorotic acid, and ADE1 and ADE2, two genes required for adenine
synthesis. ADE1 and ADE2 mutants produce a red pigment when grown on regular media and
are easily visualized.

Deletion Analysis of the Upstream Region


The upstream regulatory region of a gene often contains several sites where regulatory
proteins such as transcription factors bind as well as the promoter region where RNA
polymerase binds. These regulatory sites enhance or suppress the expression of the gene
under a variety of conditions. To determine the function of the regulatory elements, it is often
helpful to construct a series of altered upstream regions in which presumed binding sites have
been eliminated. The simplest way to do this is to remove successive segments from the 5'
end of the upstream region. Originally, restriction enzymes were used to create the deletions.
However, finding convenient restriction sites was always a problem. PCR offers a much
better alternative because of its specificity. A variety of PCR primers can be designed to
amplify different areas within the upstream region of the gene of interest. These engineered
upstream regions are then tested for possible alterations in gene expression and regulation by
creating a gene fusion with a reporter gene. They are then examined by assaying the
expression of the reporter gene. For example, suppose we have an upstream region whose
sequence reveals a binding motif for Crp, the E. coli cAMP receptor protein. If this region is
removed in the deletion analysis, the effect of losing the regulatory motif is assayed by
monitoring the expression of lacZ. Without the Crp-binding site in this example, the reporter
gene expression is about half of normal, indicating that Crp must enhance gene expression.
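The reasoning used to read a deletion series can be sketched in Python. This is a minimal illustration, not a protocol: the construct names, coordinates, activities, and the 25% threshold are all hypothetical values chosen for the example. The only idea taken from the text is that reporter expression falls when an activating site is lost and rises when a repressing site is lost.

```python
# Hypothetical 5' deletion series. Each construct keeps the upstream region
# from `start` (relative to the transcription start; negative = upstream)
# down to the promoter, fused to a lacZ reporter.
constructs = [
    # (name, 5' end of retained region, reporter activity in arbitrary units)
    ("full_length", -300, 100.0),
    ("del_1", -200, 98.0),
    ("del_2", -100, 50.0),  # e.g. a Crp-binding motif lies between -200 and -100
    ("del_3", -50, 49.0),
]

def interpret_deletions(series, threshold=0.25):
    """Compare each construct with the next-longer one and classify the
    segment that was removed between them."""
    findings = []
    for (_, prev_start, prev_act), (name, start, act) in zip(series, series[1:]):
        change = (act - prev_act) / prev_act
        if change <= -threshold:
            effect = "activating"   # expression fell when the segment was lost
        elif change >= threshold:
            effect = "repressing"   # expression rose when the segment was lost
        else:
            effect = "no clear effect"
        findings.append((name, (prev_start, start), effect))
    return findings

for name, segment, effect in interpret_deletions(constructs):
    print(f"{name}: removed {segment[0]} to {segment[1]} -> {effect}")
```

With these made-up numbers, only the segment removed in del_2 is flagged as activating, which mirrors the Crp example where expression drops to about half.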
Locating Protein-Binding Sites in the Upstream Region
Deletion analysis of these upstream sites determines how they affect gene expression,
but does not show if a protein actually binds to the site. Consequently, even after a presumed
binding site has been found, the binding of the regulatory protein must be confirmed
experimentally.
The electrophoretic mobility shift, bandshift, or gel retardation assay tests whether a
suspected protein binds to DNA from the upstream region. First, the DNA carrying the gene
and its upstream region is labeled with digoxigenin or radioactivity and then cut with a
convenient restriction enzyme to get a series of fragments. After cutting, the DNA is
separated into two tubes. To the experimental sample, the suspected DNA-binding protein is
added. The other sample, or control sample, is not mixed with any proteins. Both samples are
then run side by side on a nondenaturing agarose gel. If the protein binds to one of the DNA
fragments, the complex will be larger and run slower than the original DNA (i.e., that
fragment will be retarded).
Gel retardation reveals which segment of DNA binds a protein. To locate the binding
site more precisely, a footprint analysis is performed. In footprinting, the fragment of DNA
that binds the protein is labeled at one end with radioactivity or fluorescence. As before, the
sample of DNA is split into two portions and the protein is mixed with one batch. Both
portions of the DNA are then treated with a small amount of a reagent that breaks DNA
strands. Deoxyribonuclease I (DNase I) is often used because it is relatively nonspecific and
cuts DNA between any two nucleotides. Other chemical reagents that attack DNA may also
be used. In either case, the DNA is attacked and degraded except in the region covered, and
thus protected, by the protein. Although there are many other fragments of DNA in the
sample after DNase I treatment, the labeled fragment is the only one that is visible in the gel.
Only a small amount of DNase is used, just enough to cut each molecule of DNA once on
average, in a random position. Consequently, the sample of protected DNA will have certain
fragments missing. In contrast, cutting a sample of unprotected DNA will give rise to a series
of fragments of all possible lengths, varying by a single base pair. When the two samples are
run on a gel side by side, a region without any fragments appears as a “footprint”. In practice,
the footprint is run side by side with a sequencing reaction, which allows matching the
footprint with the DNA sequence.
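The pattern of missing bands can be sketched with a small model. This is a simplified illustration under stated assumptions: each DNA molecule is cut exactly once, a cut after position i (counted from the labeled end) gives a labeled fragment of length i, and the protected coordinates are hypothetical.

```python
def footprint_lanes(fragment_length, protected):
    """Return the labeled-fragment lengths seen in each lane.

    fragment_length: length in bp of the end-labeled DNA fragment.
    protected: (first, last) 1-based positions, counted from the labeled end,
               of the bases covered by the protein (hypothetical values).
    """
    first, last = protected
    # Unprotected DNA: a cut can fall after any position, so every length appears.
    control = list(range(1, fragment_length))
    # Protein-bound DNA: cuts inside the covered region are blocked.
    bound = [i for i in control if not (first <= i <= last)]
    missing = [i for i in control if i not in bound]
    return control, bound, missing

control, bound, missing = footprint_lanes(30, protected=(12, 18))
print("footprint spans fragment lengths", missing[0], "to", missing[-1])
```

The gap in the `bound` lane relative to the `control` lane is the footprint; reading it against a sequencing ladder gives the protected sequence.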
DNA-Protein Complexes Can Be Isolated by Chromatin Immunoprecipitation
Chromatin Immunoprecipitation (ChIP)- The Principle
Covalent crosslinks join any proteins that are attached to each other, and also any
proteins attached to DNA. These crosslinked segments are then sheared or cut into smaller
fragments, and then the transcription factor of interest is isolated from the remaining cellular
components with immunoprecipitation.
ChIA-PET Procedure
• Inside the nucleus, DNA-protein interactions are 3D and involve DNA loops.
• After crosslinking, different regions of a chromosome are often associated with a
single protein complex.
• After ChIP, each of these DNA sequences can be determined with paired end-tag
sequencing. First, the immunoprecipitated DNA:protein complex is divided into two
samples and each DNA end is connected to a different linker DNA.
• Then, the two samples are recombined and mixed with very dilute ligase.
• The linkers ligate preferentially within the same complex, but occasionally tags
from different complexes are ligated together.
• The ligated tags have a restriction enzyme site for MmeI, which recognizes its
sequence in the tag, but cuts 20 nucleotides away in the DNA sequence.
• These small pieces of DNA are then sequenced using paired end-sequencing
technology.
Primer Extension Reveals Start of Transcription
First, mRNA is isolated from cells that are expressing the gene of interest. A primer
specific to the gene of interest is added and anneals to the mRNA. Reverse transcriptase
makes a complementary DNA strand from the primer to the 5’ end of the mRNA (i.e., the
start of transcription). The exact transcription start site is determined by comparing the
size of the primer extension DNA strand to a sequencing ladder of the same region of
DNA.
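The arithmetic behind this comparison is simple and can be sketched as follows. The coordinate system and the numbers are hypothetical: positions are counted along the coding strand of the cloned region, and the product length is assumed to have been read from the sequencing ladder.

```python
def transcription_start(primer_5prime_pos, extension_length):
    """Position of the transcription start site on the coding strand.

    primer_5prime_pos: coordinate of the base that pairs with the primer's
                       5' end (the downstream-most base of the product).
    extension_length:  length in nucleotides of the primer extension product.
    """
    # The product covers `extension_length` bases ending at primer_5prime_pos.
    return primer_5prime_pos - extension_length + 1

# Hypothetical numbers: the primer's 5' end pairs with base 250 and the
# extension product is 87 nt long.
print(transcription_start(250, 87))  # 164
```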
Locating Start of Transcription by S1 Nuclease
• S1 nuclease mapping: Method using S1 nuclease to locate the 5’ end or 3’ end of a
transcript.
• S1 nuclease: Endonuclease from Aspergillus oryzae that cleaves single-stranded RNA
or DNA but does not cut double-stranded nucleic acids.
• The first step in mapping the transcriptional start site by S1 nuclease treatment is to
clone the upstream region of the gene into an M13 vector.
• Next, single-stranded M13 DNA is prepared using labeled nucleotide precursors for
use as a probe.
• The labeled single-stranded DNA is mixed with the total cellular mRNA.
• The mRNA with sequence complementary to the DNA will hybridize with the DNA.
• S1 nuclease is added to the mixture to digest all the single-stranded RNA and DNA.
• All that is left is the DNA:RNA hybrid, which is isolated from the degraded
nucleotides by precipitation.
• The DNA portion of the hybrid is isolated by alkali treatment and the length
determined by comparing the fragment size to the entire gene.
Transcriptome Analysis
Unlike the genome, the transcriptome varies as different genes are expressed under
different conditions. Transcriptome analysis attempts to measure the levels of all
transcribed RNAs simultaneously. The number and types of RNA are captured at one
moment in time and provide a snapshot of what is being expressed in the cell.
The workflow requires RNA purification first, followed by sequencing, and then finally
the data analysis.
Removing Unwanted rRNA From an RNA Sample
• Although most rRNAs are not polyadenylated, a fraction of the transcripts do have
poly(A) tails.
• These can contaminate RNA for transcriptome analysis, and therefore, need to be
removed.
• One method uses biotinylated single-stranded probes that have complementary
sequences to rRNA.
• These hybridize to the rRNA in the sample and are removed by binding to
avidin-coated beads followed by centrifugation.

RNA-Seq
• The entire transcriptome can be identified by sequencing a cDNA library in its
entirety. Next generation sequencing makes this process possible, resulting in the
identification of each and every RNA that was expressed.
Overall Scheme of Single Cell RNA Sequencing
• Cells from a complex mixture are separated and isolated. Single cells are mixed with
barcoded primers and the cells are lysed in their micro environment. Reverse
transcription yields cDNA molecules containing the cell specific, unique barcode. The
samples from hundreds or thousands of cells are pooled and the barcoded cDNAs are
amplified then sequenced.

Microbeads Showing Barcoded Primers for scRNA-Seq


• A. Millions of primers on each microbead have the same unique barcode and each
microbead contains a different barcode sequence. Additionally, each primer may have
a UMI to assist in transcript quantification.
• B. One primer attached to the microbead showing the orientation of the barcode, UMI
if present, and the poly(dT) primer. Polyadenylated RNA binds to the primer. Reverse
transcription begins at the 3’ end of poly(dT) primer to produce cDNA using the RNA
as a template. Amplification by PCR fills in the second strand complementary to the
UMI and barcode.
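How the barcode and UMI are used downstream can be sketched in Python. This is a minimal illustration, not a production pipeline: the barcode and UMI lengths, the read layout, and the gene labels are assumptions made for the example, and the alignment step that assigns each cDNA read to a gene is not shown.

```python
from collections import defaultdict

BARCODE_LEN = 12  # assumed cell-barcode length
UMI_LEN = 8       # assumed UMI length

def count_transcripts(reads):
    """reads: iterable of (primer_read, gene) pairs, where primer_read starts
    with the cell barcode followed by the UMI, and gene is the gene the cDNA
    read was assigned to by an upstream alignment step (not shown).
    Returns {cell_barcode: {gene: number of distinct UMIs}}."""
    umis = defaultdict(lambda: defaultdict(set))
    for primer_read, gene in reads:
        barcode = primer_read[:BARCODE_LEN]
        umi = primer_read[BARCODE_LEN:BARCODE_LEN + UMI_LEN]
        umis[barcode][gene].add(umi)  # PCR duplicates share a UMI and collapse
    return {bc: {g: len(u) for g, u in genes.items()} for bc, genes in umis.items()}

reads = [
    ("AAAACCCCGGGG" + "ACGTACGT", "geneA"),
    ("AAAACCCCGGGG" + "ACGTACGT", "geneA"),  # PCR duplicate of the read above
    ("AAAACCCCGGGG" + "TTTTGGGG", "geneA"),
    ("TTTTGGGGCCCC" + "ACGTACGT", "geneB"),
]
counts = count_transcripts(reads)
print(counts)
```

Counting distinct UMIs rather than raw reads is what lets the UMI assist in transcript quantification: amplification copies of one original molecule are counted once.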
Principle of Drop-Seq
• Microbeads containing unique barcoded (red, blue, and green) primers are prepared
separately. Isolated cells and microbeads are mixed together in aqueous droplets suspended
in oil. Nanoliter droplets co-encapsulate a single cell with a single microbead. Each
cell is lysed and the transcripts bind the primers on the microbeads. Reverse
transcription produces cDNA molecules. The oil droplets are broken and the cDNA
from all cells is pooled, amplified, and sequenced. The barcodes map each sequence
to a single cell.
Serial analysis of gene expression (SAGE)
• Method to monitor level of multiple mRNA molecules by sequencing a DNA
concatemer that contains many serially-linked sequence tags derived from the
mRNAs.
SAGE—The Principle
• To analyze the total mRNA expressed in a cell, small sequences from each mRNA are
converted to complementary DNA and linked together into one long concatemer,
which is sequenced. Each of the segments represents a single mRNA; therefore, the
number of repeats of each segment correlates with the level of expression of the
corresponding gene in the cell.
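The counting step can be sketched in Python. This is a deliberately simplified model: it assumes the tags in the sequenced concatemer are separated by the anchoring-enzyme site (CATG is used here as a hypothetical choice), and it ignores ditag structure and strand orientation. The tag sequences are made up.

```python
from collections import Counter

def count_tags(concatemer, anchor_site="CATG"):
    """Split a concatemer at each anchoring-enzyme site and count the tags.
    Simplified: each piece between sites is treated as one tag, so ditag
    structure and strand orientation are ignored."""
    tags = [t for t in concatemer.split(anchor_site) if t]
    return Counter(tags)

# Hypothetical concatemer built from two made-up tags.
concatemer = "CATG".join(
    ["", "AAGGTTCCAA", "GGCCTTAAGG", "AAGGTTCCAA", "AAGGTTCCAA", ""]
)
for tag, n in count_tags(concatemer).most_common():
    print(tag, n)
```

The tag that appears three times would be read as coming from an mRNA roughly three times as abundant as the tag seen once.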
SAGE—The Procedure
The first step in making long concatemers of expressed sequences involves isolating
the total cellular mRNA and making the corresponding cDNA. The total mRNA is bound
via its poly(A) tail to an oligo(dT) primer linked to biotin. It is then converted to cDNA
using reverse transcriptase. The cDNAs are then truncated to short, tagged sequences.
First, the cDNAs are cleaved with a restriction enzyme known as the anchoring enzyme.
This generates a pool of shortened cDNA averaging 256 bp long, with some longer and
others shorter. These are isolated using streptavidin, which binds to the biotin tag on the
poly(A) tail end of the cDNA. This mixture is divided into two samples and each is
ligated to a different linker. This linker has two features: (1) its overhang matches the
overhang generated previously by the anchoring enzyme, and (2) it has a recognition site
for a type IIS restriction enzyme (known as the tagging enzyme). Each sample is cut with
the tagging enzyme. This enzyme recognizes the sequence in the linker, but actually
makes a blunt end cut downstream in the cDNA sequence. This generates two pools of
small cDNA sequence tags with different linkers. Finally, the sequence tags are joined
into one long sequence. First, fragments are linked by blunt-end ligation. Then PCR
primers complementary to the linkers are used to amplify only those ligated molecules
that have linker A and linker B flanking two different sequence tags. The PCR products
are digested with the anchoring enzyme to remove the linkers and generate sticky ends.
These are ligated and the resulting fragment is cloned and sequenced.
Western Blot

Western blotting, introduced by Towbin et al. in 1979, is a commonly used method
for protein analysis. It can be used for qualitative and semi-quantitative protein analysis. A
western blot has three elements: separation of proteins by
size, transfer of the proteins to a solid support, and marking of the proteins with primary and
secondary antibodies for visualization.

The Principle
Western blot is performed by using polyacrylamide gel electrophoresis. SDS-PAGE allows
protein samples to be separated and then transferred to a solid support, such as a nitrocellulose
(NC) or polyvinylidene difluoride (PVDF) membrane. The solid support adsorbs the protein and
keeps its biological activity unchanged. The membrane carrying the transferred proteins is called a
blot and is treated with a protein solution to block the hydrophobic binding sites on the
membrane. The membrane is then treated with an antibody (the primary antibody) against the
target protein. Only the protein to be studied binds specifically to the primary antibody to form
an antigen-antibody complex. After unbound primary antibody is washed away, primary
antibody remains only at the position of the target protein. The washed
membrane is then treated with a labeled secondary antibody. The
labeled secondary antibody binds to the primary antibody, forming an antibody complex
that indicates the location of the primary antibody and therefore the location of the protein being
studied.

The Procedure:
There are six steps involved in western blot: sample preparation, gel
electrophoresis, protein transfer, blocking, antibody incubation, and protein detection and
visualization.
1. Sample preparation.
Proteins can be extracted from different samples, such as tissues or cells. Since tissue
samples display a higher degree of structure, the tissues are first broken down by
mechanical means, such as a homogenizer or sonication. Protease and phosphatase
inhibitors are commonly added, and the sample is kept cold, to prevent digestion of the sample.
After protein extraction, it is important to measure the protein concentration so that
the mass of protein loaded into each well can be controlled. A spectrophotometer is often used
to measure protein concentration.
2. Gel electrophoresis.

The most commonly used gels are polyacrylamide gels (PAG), run in buffers containing
sodium dodecyl sulfate (SDS). Western blot uses two types of polyacrylamide gel: a stacking gel
that concentrates all proteins into one band and a separating gel that separates
proteins according to their molecular weight. Smaller proteins migrate faster in SDS-PAGE
when a voltage is applied. PAGE can separate proteins ranging from 5 to 2,000 kDa
according to the pore size, which is controlled by the concentration of polyacrylamide.
Typically, separating gels are made at 5%, 8%, 10%, 12%, or 15%. The
percentage of the separating gel should be chosen according to the size of the target
protein: the smaller the protein, the higher the gel percentage that should be
used.

3. Protein transfer.

After separation by gel electrophoresis, the proteins are moved from within the gel
onto a solid support membrane to make them accessible to antibody detection. The
main method for transferring proteins is called electroblotting, which uses an electric field
oriented perpendicular to the surface of the gel to pull proteins out of the gel and into
the membrane. It can be done under semi-dry or wet conditions; wet conditions are usually
more reliable because the gel is less likely to dry out. As shown in the left figure, the membrane is
placed between the gel surface and the filter paper. The transfer sandwich is assembled as follows:
a fiber pad (sponge), filter papers, the gel, a membrane, filter papers, a fiber pad (sponge).

4. Blocking.
Blocking is an important step in the western blot to prevent antibodies from binding to
the membrane non-specifically. The most commonly used blockers are BSA and non-fat
dry milk. When the membrane is placed in the dilute protein solution, the proteins
attach to all places on the membrane where the target proteins have not attached. In this way,
the “noise” in the final western blot is reduced, giving clearer results.

5. Antibody incubation.

After blocking, the membrane is incubated with the primary antibody, which binds to the
target protein. The choice of primary antibody depends on the antigen to
be detected. Washing the membrane with buffer solution removes unbound antibody and
minimizes background. After rinsing, the
membrane is exposed to an enzyme-conjugated secondary antibody. During
secondary antibody incubation, the labeled secondary antibody binds to the
primary antibody that has reacted with the target protein. The appropriate secondary antibody
is chosen based on the species of the primary antibody.

6. Protein detection and visualization.

A substrate reacts with the enzyme that is bound to the secondary antibody to generate
a colored product. This reveals the density and location of the target protein.
The size is approximated by comparing the protein bands to the marker.
Several detection systems are available for protein visualization, such as colorimetric
detection, chemiluminescent detection, radioactive detection, and fluorescent detection. The
enhanced chemiluminescence (ECL) system is the most common detection method.
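Comparing a band to the marker can be made quantitative. The sketch below assumes the commonly used approximation that log10 of molecular weight is linear in migration distance over the resolving range of the gel. The ladder sizes and distances are hypothetical, and the function name is illustrative.

```python
import math

def estimate_size(marker, band_distance):
    """Estimate a band's molecular weight (kDa) from marker bands.

    marker: list of (size_kDa, migration_distance) pairs for the ladder.
    Fits log10(size) = a + b * distance by least squares, then evaluates the
    fit at the unknown band's migration distance.
    """
    xs = [d for _, d in marker]
    ys = [math.log10(s) for s, _ in marker]
    n = len(marker)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    a = mean_y - b * mean_x
    return 10 ** (a + b * band_distance)

# Hypothetical ladder: sizes in kDa and distances in mm from the top of the gel.
ladder = [(100, 10.0), (50, 25.0), (25, 40.0), (12.5, 55.0)]
print(round(estimate_size(ladder, 32.5), 1))
```

The estimate is only as good as the linear approximation, so bands far outside the range of the marker should be treated with caution.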
2-D Electrophoresis
2-D electrophoresis is a powerful and widely used method for the analysis of complex
protein mixtures extracted from cells, tissues, or other biological samples.
• Separation of the proteins by isoelectric point is called isoelectric focusing (IEF).
• A pH gradient is established in a gel and an electric potential is applied across
the gel, making one end more positive than the other.
• At all pH values other than their isoelectric point, proteins will be charged.
• If they are positively charged, they will be pulled towards the negative end of the gel
and if they are negatively charged they will be pulled to the positive end of the gel.
• The proteins applied in the first dimension will move along the gel and will
accumulate at their isoelectric point; that is, the point at which the overall charge on
the protein is 0 (a neutral charge).
• If a protein should diffuse away from its pI, it immediately gains charge and migrates
back. This is the focusing effect which allows proteins to be separated on the basis of
very small charge differences.
• The resolution is determined by the slope of the pH gradient and the electric field
strength, so IEF is performed at high voltages (typically in excess of 1000
V).
• When the proteins have reached their final positions in the pH gradient, there is very
little ionic movement in the system, resulting in a very low final current.
• To separate the proteins by mass, the gel is treated with sodium dodecyl sulfate (SDS)
along with other reagents (as in 1-D SDS-PAGE).
• This denatures the proteins (that is, it unfolds them into long, straight molecules) and
binds a number of SDS molecules roughly proportional to the protein's length.
• Because a protein's length (when unfolded) is roughly proportional to its mass, and
the SDS molecules are negatively charged, all of the proteins
will have approximately the same mass-to-charge ratio as each other.
• In addition, proteins will not migrate when they have no charge (a result of the
isoelectric focusing step); the coating of the protein in SDS (negatively
charged) therefore allows migration of the proteins in the second dimension.
• In the second dimension, an electric potential is again applied, but at a 90 degree
angle from the first field.
• The proteins will be attracted to the more positive side of the gel (because SDS is
negatively charged) proportionally to their mass-to-charge ratio.
• The gel therefore acts like a molecular sieve when the current is applied, separating
the proteins on the basis of their molecular weight with larger proteins being retained
higher in the gel and smaller proteins being able to pass through the sieve and reach
lower regions of the gel.
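The focusing rule in the first dimension can be written as a short function. This is a minimal sketch of the charge logic described above, with a hypothetical protein; it does not model the actual magnitude of the charge.

```python
def ief_migration(protein_pi, local_ph):
    """Direction a protein moves during isoelectric focusing."""
    if local_ph < protein_pi:
        # Below its pI the protein carries a net positive charge.
        return "net positive: moves toward the negative end"
    if local_ph > protein_pi:
        # Above its pI the protein carries a net negative charge.
        return "net negative: moves toward the positive end"
    return "no net charge: focused at its pI"

# Hypothetical protein with pI 6.5 sampled at three positions in the gradient.
for ph in (4.0, 6.5, 9.0):
    print(ph, ief_migration(6.5, ph))
```

Whichever side of its pI the protein starts on, the rule pushes it back toward the pI, which is the focusing effect.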
Mass Spectrometry:
Mass spectrometry is an analytical technique that determines the relative masses
of molecular ions and fragments. Gas-phase molecules are ionized and their
mass-to-charge ratio is determined. Since lighter ions will travel faster and be detected first
when an electric field is applied, the relative mass can be accurately measured and the
composition of the molecule can then be identified. In addition, the sequence of component
amino acids can also be identified using the same procedure.

In analyzing proteins using mass spectrometry, the proteins are first broken down into
their component peptides. Trypsin is the protease most researchers use to digest
proteins, for several reasons. First, it cleaves proteins into component peptides
with an average size of 700 to 1500 Daltons (the ideal size for mass spec). Second, it
specifically cleaves the protein at the carboxyl side of arginine and lysine residues. The
C-termini of these peptides are charged and are therefore easily detectable by mass
spectrometry. Third, trypsin is highly active and can tolerate a number of additives. And
lastly, trypsin can be modified by the methylation of lysines to prevent self-digestion at these
sites.
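The cleavage rule can be sketched as an in silico digest. This is a simplified illustration: it cuts after every lysine (K) and arginine (R) and does not model missed cleavages or the reduced cutting seen when the next residue is proline. The protein sequence is hypothetical.

```python
def tryptic_digest(protein):
    """Cleave on the carboxyl side of every lysine (K) and arginine (R).
    Simplified: missed cleavages and the proline effect are not modeled."""
    peptides, current = [], ""
    for residue in protein:
        current += residue
        if residue in "KR":
            peptides.append(current)
            current = ""
    if current:
        peptides.append(current)  # C-terminal peptide that lacks K/R
    return peptides

# Hypothetical sequence in one-letter amino acid code.
print(tryptic_digest("MAGKTLLRSEQVKDW"))  # ['MAGK', 'TLLR', 'SEQVK', 'DW']
```

Every peptide except the last ends in K or R, which is why tryptic peptides carry a charged C-terminal residue.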
After the proteins have been digested into peptides, they are separated on a reverse-phase
column with an acetonitrile gradient. Note: Acetic acid should be used with the solvents
since trifluoroacetic acid usually interferes with the ionization process.
The column eluate, which contains peptides and solvent, is then ionized.
After the solvent evaporates, the charged
peptides move into the mass spectrometer. The chromatography used to introduce
molecules into the mass spectrometer in this way is high performance liquid chromatography, or
HPLC.
After this step, the masses of the peptides and of the peptide fragments produced through
collision-induced dissociation (CID) or collisionally-activated dissociation (CAD) can be
measured using tandem mass spectrometry (MS/MS).
Tandem mass spectrometry, also known as MS/MS or MS2, is a technique in instrumental
analysis where two or more mass analyzers are coupled together using an additional reaction
step to increase their abilities to analyse chemical samples.
The molecules of a given sample are ionized and the first spectrometer (designated MS1)
separates these ions by their mass-to-charge ratio (often given as m/z or m/Q). Ions of a
particular m/z-ratio coming from MS1 are selected and then made to split into
smaller fragment ions, e.g. by collision-induced dissociation, ion-molecule reaction,
or photodissociation. These fragments are then introduced into the second mass spectrometer
(MS2), which in turn separates the fragments by their m/z-ratio and detects them.
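The mass-to-charge ratio that each analyzer separates on can be computed directly. The sketch below assumes positive-mode ionization in which each charge comes from one added proton; the peptide mass is a hypothetical value.

```python
PROTON_MASS = 1.007276  # mass of a proton in daltons

def mz(neutral_mass, charge):
    """m/z of a peptide of the given neutral mass carrying `charge` protons."""
    return (neutral_mass + charge * PROTON_MASS) / charge

# Hypothetical peptide with a neutral mass of 1000.0 Da.
for z in (1, 2, 3):
    print(z, round(mz(1000.0, z), 4))
```

The same peptide appears at a different m/z for each charge state, which is why MS1 selects ions by m/z rather than by mass alone.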
