Noncoding DNA

Noncoding DNA makes up a large fraction of eukaryotic genomes and can be categorized into different types. In bacteria, noncoding DNA is typically around 12% of the genome and includes regulatory sequences. In humans, noncoding DNA is between 98-99% of the genome. This noncoding DNA includes noncoding genes, repetitive sequences, and regulatory elements. The amount of noncoding DNA varies greatly between species and contributes to differences in genome size.

Uploaded by

Nikki SStark
Copyright
© All Rights Reserved

Noncoding DNA (ncDNA) sequences are components of an organism's DNA that do not
encode protein sequences. Some non-coding DNA is transcribed into functional
non-coding RNA molecules (e.g. transfer RNA, microRNA, piRNA, ribosomal RNA,
and regulatory RNAs). Other functional regions of the non-coding DNA fraction
include regulatory sequences that control gene expression; scaffold attachment regions;
origins of DNA replication; centromeres; and telomeres. Some non-coding regions
appear to be mostly nonfunctional, such as introns, pseudogenes, intergenic DNA, and
fragments of transposons and viruses.

FRACTIONS OF NON-CODING DNA

In bacteria, the coding regions typically take up 88% of the genome. The remaining
12% consists largely of non-coding genes and regulatory sequences, which means that
almost all of the bacterial genome has a function. The amount of coding DNA in
eukaryotes is usually a much smaller fraction of the genome because eukaryotic
genomes contain large amounts of repetitive DNA not found in prokaryotes. The human
genome contains somewhere between 1% and 2% coding DNA. (The exact number isn't
known because there are disputes over the number of functional coding exons and over
the total size of the human genome.) This means that 98-99% of the human genome
consists of non-coding DNA and this includes many functional elements such as non-
coding genes and regulatory sequences.

Genome size in eukaryotes can vary over a wide range, even between closely related
species. This puzzling observation was originally known as the C-value Paradox
where "C" refers to the haploid genome size. The paradox was resolved with the
discovery that most of the differences were due to the expansion and contraction of
repetitive DNA and not the number of genes. Some researchers speculated that this
repetitive DNA was mostly junk DNA.

This led to the observation that the number of genes does not seem to correlate with
perceived notions of complexity because the number of genes seems to be relatively
constant, an issue that is called the G-value Paradox. For example, the genome of the
unicellular Polychaos dubium (formerly known as Amoeba dubia) has been reported to
contain more than 200 times the amount of DNA in humans (i.e. more than 600 billion
base pairs vs a bit more than 3 billion in humans). The pufferfish Takifugu rubripes
genome is only about one eighth the size of the human genome, yet seems to have a
comparable number of genes. Genes take up about 30% of the pufferfish genome and
the coding DNA is about 10% (so non-coding DNA is about 90%). The reduced size of the
pufferfish genome is due to a reduction in the length of introns and less repetitive DNA.
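
The figures quoted above can be checked with some quick arithmetic. The sketch below is only a back-of-the-envelope illustration: it assumes a rounded human genome size of 3.0 billion base pairs (the text says "a bit more than 3 billion"), and the variable and function names are made up for this example.

```python
# Rough genome arithmetic using the rounded percentages quoted in the text.
# Assumption: the human genome is taken as exactly 3.0e9 bp for simplicity.
HUMAN_GENOME_BP = 3.0e9


def coding_bp(genome_bp, coding_fraction):
    """Approximate number of coding base pairs in a genome."""
    return genome_bp * coding_fraction


# Human: between 1% and 2% of the genome is coding DNA.
human_low = coding_bp(HUMAN_GENOME_BP, 0.01)
human_high = coding_bp(HUMAN_GENOME_BP, 0.02)

# Pufferfish: genome about one eighth of the human genome, about 10% coding.
pufferfish_genome_bp = HUMAN_GENOME_BP / 8
pufferfish_coding = coding_bp(pufferfish_genome_bp, 0.10)

# Polychaos dubium: reported at more than 600 billion bp.
amoeba_ratio = 600e9 / HUMAN_GENOME_BP

print(f"Human coding DNA: {human_low / 1e6:.1f} to {human_high / 1e6:.1f} Mb")
print(f"Pufferfish genome: {pufferfish_genome_bp / 1e6:.1f} Mb")
print(f"Pufferfish coding DNA: {pufferfish_coding / 1e6:.1f} Mb")
print(f"Amoeba / human genome size ratio: {amoeba_ratio:.0f}")
```

Under these rounded inputs the pufferfish carries roughly as much coding DNA as a human, even though its genome is much smaller, which fits the statement that the two have a comparable number of genes.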

Types of non-coding DNA sequences

Noncoding genes

There are two types of genes: protein coding genes and noncoding genes. Noncoding
genes are an important part of non-coding DNA and they include genes for transfer
RNA and ribosomal RNA. These genes were discovered in the 1960s. Prokaryotic
genomes contain genes for a number of other noncoding RNAs but noncoding RNA
genes are much more common in eukaryotes.

Typical classes of noncoding genes in eukaryotes include genes for small nuclear RNAs
(snRNAs), small nucleolar RNAs (snoRNAs), microRNAs (miRNAs), short interfering
RNAs (siRNAs), PIWI-interacting RNAs (piRNAs), and long noncoding RNAs
(lncRNAs). In addition, there are a number of unique RNA genes that produce catalytic
RNAs.

Noncoding genes account for only a few percent of prokaryotic genomes but they can
represent a vastly higher fraction in eukaryotic genomes. In humans, the noncoding
genes take up at least 6% of the genome, largely because there are hundreds of copies
of ribosomal RNA genes. Protein-coding genes occupy about 38% of the genome, a
fraction that is much higher than the coding region because genes contain large introns.

The total number of noncoding genes in the human genome is controversial. Some
scientists think that there are only about 5,000 noncoding genes while others believe
that there may be more than 100,000.

Other types of non-coding DNA sequences include promoters and regulatory elements,
introns, untranslated regions, origins of replication, centromeres, telomeres, scaffold
attachment regions, pseudogenes, and repeat sequences, transposons and viral elements.

DNA may be categorised as:


Unique DNA sequences (60% of total)

Eukaryotic genomes contain large amounts of repetitive DNA sequences that are
present in many copies (thousands, in some cases). By contrast, coding regions of
genes (which are typically present in a single copy per haploid genome) are referred
to as unique-sequence DNA.

• Present in single or low copy numbers.
• Includes coding sequence for structural genes (up to 1400 bp in size), which account
for 3% of the genome.
• The remainder is intronic sequence or spacer DNA.

Moderately repetitive DNA sequences (30% of total)

These include short (150 to 300 bp) sequences or long ones (5 kbp), amounting to about 40%
and 1-2% of the total genome, respectively. They are dispersed throughout the euchromatin,
with 10^3 to 10^5 copies per haploid genome. These sequences are involved in the regulation of
gene expression. In some cases, long dispersed repeats of 300 to 600 bp show homology with
retroviruses.

a) microsatellites / minisatellites (VNTRs, DNA 'fingerprints')
b) dispersed repetitive DNA, mainly transposable elements (LINEs / SINEs)

Highly repetitive DNA (10% of total)

Highly repetitive DNA consists of short stretches of DNA that are repeated many times
in tandem (one after the other). The repeat segments are usually between 2 bp and 10
bp but longer ones are known. They are present at between 10^5 and 10^7 copies per
genome. Highly repetitive DNA is rare in prokaryotes but common in eukaryotes,
especially those with large genomes. It is sometimes called satellite DNA.

Most of the highly repetitive DNA is found in centromeres and telomeres
and most of it is functional, although some might be redundant. The other significant
fraction resides in short tandem repeats (STRs; also called microsatellites) consisting of
short stretches of a simple repeat such as ATC. There are about 350,000 STRs in the
human genome and they are scattered throughout the genome with an average length of
about 25 repeats.

Variations in the number of STR repeats can cause genetic diseases when they lie within
a gene, but most of these regions appear to be non-functional junk DNA where the
number of repeats can vary considerably from individual to individual. This is why
these length differences are used extensively in DNA fingerprinting.
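
To make the idea of counting STR repeats concrete, here is a minimal sketch that scans a DNA string for perfect tandem repeats using a regular expression. The function name `find_strs`, its parameters, and the example sequence are all hypothetical and chosen only for illustration; real STR genotyping tools also handle imperfect repeats and sequencing errors, which this sketch does not.

```python
import re


def find_strs(sequence, min_unit=2, max_unit=6, min_repeats=5):
    """Find perfect tandem repeats in a DNA string.

    Returns a list of (start_index, repeat_unit, repeat_count) tuples.
    Illustrative only: imperfect repeats are not detected.
    """
    sequence = sequence.upper()
    # Capture a short unit (the lazy quantifier prefers the shortest unit),
    # then require it to repeat at least (min_repeats - 1) more times.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(sequence):
        unit = match.group(1)
        count = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, count))
    return hits


# Made-up example: six copies of ATC flanked by unrelated bases.
example = "GGT" + "ATC" * 6 + "GGACA"
print(find_strs(example))
```

Running the same function on the corresponding region from two individuals and comparing the repeat counts is, in spirit, how length differences at an STR locus distinguish one person from another.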

Satellite DNA

These are represented by monomer sequences, usually less than 2000 bp long, tandemly
reiterated up to 10^5 copies per haploid genome in animals and located in the
pericentromeric and/or telomeric heterochromatic regions. Satellite DNA constitutes
from 1 to 65% of the total DNA of numerous organisms, including that of animals,
plants, and prokaryotes. The term "satellite" in the genetic sense was first coined by the
Russian cytologist Sergius Navashin, in 1912, initially in Russian ("sputnik") and Latin
(satelles), and was later translated to "satellite". The more familiar usage of "satellite"
relates to a small band of DNA with a density different (usually lower, because of a
high AT-content) from the bulk of the genomic DNA, which is separated from the main
band following CsCl centrifugation. Nucleotide changes and copy number variations
fuel the process of their evolution within and across species. Satellite fractions, though
not conserved evolutionarily, are unique to a species and usually show similarity
amongst related groups of animals.

[Figure: Schematic diagram showing biological categories of the different repetitive
sequences.]

Cot value

Cot value and Cot Curve analysis

Cot analysis is a technique for measuring the complexity (size) of DNA or a genome. The
technique was developed by Roy Britten and colleagues in the 1960s. It is based on
the principle of DNA renaturation kinetics.

The renaturation of DNA can be analysed by the Cot Curve.

Principle:

The rate of renaturation is directly proportional to the number of times the sequences
are present in the genome. Given enough time, all DNA that is denatured will reassociate
or reanneal in a given DNA sample. The more repetitive a sequence is, the less time it
takes to renature.

Procedure: The DNA is denatured by heating and then allowed to reanneal by cooling.
The renaturation of DNA is assessed spectrophotometrically. Large DNA molecules take
longer to reanneal.

What is cot value?

Renaturation depends on the following factors: DNA concentration, reassociation
temperature, cation concentration and viscosity.

Cot = DNA concentration (moles/L) × renaturation time in seconds × buffer factor (that
accounts for the effects of cations on the speed of renaturation).

In the name Cot, Co = concentration of DNA and t = time taken for renaturation.

A low Cot value indicates a larger number of repetitive sequences.

A high Cot value indicates a larger number of unique sequences, or fewer repetitive
sequences.
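
The link between repetitiveness and Cot value can be illustrated with a simple model. The sketch below assumes ideal second-order reassociation kinetics, in which the fraction of DNA still single-stranded is 1 / (1 + k × Cot). This model, the function names, and the two rate constants are illustrative assumptions rather than values taken from the text above.

```python
def fraction_single_stranded(cot, k):
    """Fraction of DNA still single-stranded at a given Cot value.

    Assumes ideal second-order kinetics with rate constant k
    (an illustrative assumption, not a measured value).
    """
    return 1.0 / (1.0 + k * cot)


def cot_half(k):
    """Cot value at which half of the DNA has reannealed (1 / k)."""
    return 1.0 / k


# Hypothetical rate constants: repetitive DNA reanneals faster (larger k).
k_repetitive = 100.0
k_unique = 0.01

print("Cot 1/2, repetitive:", cot_half(k_repetitive))
print("Cot 1/2, unique:", cot_half(k_unique))

# Fraction still single-stranded at a few Cot values for each component.
for cot in (0.01, 1.0, 100.0):
    rep = fraction_single_stranded(cot, k_repetitive)
    uni = fraction_single_stranded(cot, k_unique)
    print(f"Cot={cot}: repetitive={rep:.3f}, unique={uni:.3f}")
```

In this model the faster-reannealing (repetitive) component reaches half renaturation at a much lower Cot value than the slower (unique) component, which matches the rule of thumb stated above.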

For example:
Bacteria - 99.7% single copy
Mouse - 60% single copy + 25% middle repetitive + 10% highly repetitive

How to calculate cot value?

Cot = DNA concentration (moles/L) × renaturation time in seconds × buffer factor (that
accounts for the effects of cations on the speed of renaturation).

Nucleotide concentration = 0.050 M

Renaturation time = 344 sec

Buffer factor, 0.5 M SPB = 5.820

Cot value = 0.050 × 344 × 5.820 = 100.104, i.e. approximately 100
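
The calculation above can be reproduced in a few lines. This is a minimal sketch; the function name `cot_value` and its parameter names are chosen for this example only.

```python
def cot_value(nucleotide_molar, seconds, buffer_factor=1.0):
    """Cot = DNA concentration (mol/L) x renaturation time (s) x buffer factor."""
    return nucleotide_molar * seconds * buffer_factor


# Values from the worked example: 0.050 M nucleotides, 344 s, 0.5 M SPB buffer.
value = cot_value(0.050, 344, 5.820)
print(round(value, 3))  # 100.104
```

The exact product is 100.104, which rounds to the Cot value of about 100 used in the worked example.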
