General Information


Human Genome

The human genome is the genome of Homo sapiens, which is stored on 23 chromosome pairs.
Twenty-two of these are autosomal chromosome pairs, while the remaining pair is
sex-determining. The haploid human genome comprises just over 3 billion DNA base pairs
and contains ca. 23,000 protein-coding genes, far fewer than had been
expected before its sequencing. In fact, only about 1.5% of the genome codes for proteins, while
the rest consists of non-coding RNA genes, regulatory sequences, introns, and (controversially
named) "junk" DNA.
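
To put these figures in proportion, the short calculation below works out roughly how many base pairs the coding fraction represents. It is only a back-of-the-envelope sketch: the genome size is rounded to exactly 3 billion base pairs, and the variable names are chosen here for illustration.

```python
# Rounded figures taken from the paragraph above (assumed exact for this sketch).
GENOME_BP = 3_000_000_000        # haploid genome size, "just over 3 billion"
CODING_PER_MILLE = 15            # about 1.5% of the genome codes for proteins
PROTEIN_CODING_GENES = 23_000    # ca. 23,000 protein-coding genes

# Integer arithmetic avoids floating-point rounding in the percentage step.
coding_bp = GENOME_BP * CODING_PER_MILLE // 1000

# Average amount of coding sequence per gene, in base pairs.
avg_coding_bp_per_gene = coding_bp / PROTEIN_CODING_GENES

print(f"Coding sequence: about {coding_bp:,} bp")
print(f"Average coding sequence per gene: about {avg_coding_bp_per_gene:,.0f} bp")
```

Under these rounded inputs the coding portion comes to about 45 million base pairs, or a little under 2,000 base pairs of coding sequence per gene on average.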

The Human Genome Project (HGP) produced a reference sequence of the euchromatic human
genome, which is used worldwide in biomedical sciences.

Genes

Surprisingly, the number of human genes seems to be less than a factor of two greater than that
of many much simpler organisms, such as the roundworm and the fruit fly. However, human cells
make extensive use of alternative splicing to produce several different proteins from a single gene,
and the human proteome is thought to be much larger than those of the aforementioned
organisms. In addition, most human genes have multiple exons, and human introns are frequently
much longer than the flanking exons.

Human genes are distributed unevenly across the chromosomes. Each chromosome contains
various gene-rich and gene-poor regions, which seem to be correlated with chromosome bands
and GC-content. The significance of these nonrandom patterns of gene density is not well
understood. In addition to protein coding genes, the human genome contains thousands of RNA
genes, including tRNA, ribosomal RNA, microRNA, and other non-coding RNA genes.
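
GC-content, mentioned above as a correlate of gene density, is simply the fraction of bases that are G or C. A minimal sketch of computing it over fixed-size windows follows; the function names, the window size, and the toy sequence are illustrative assumptions, not part of any particular genomics toolkit.

```python
def gc_content(seq):
    """Return the fraction of bases in seq that are G or C (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq, window, step=None):
    """Return (start, gc_fraction) for each full window along seq.

    By default the windows do not overlap (step == window).
    """
    step = step or window
    return [
        (start, gc_content(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Toy sequence: a GC-rich stretch followed by an AT-rich stretch.
for start, gc in windowed_gc("GGCCATAT", window=4):
    print(start, gc)
```

Real analyses use windows of thousands of bases or more; the four-base window here only keeps the example readable.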

Regulatory Sequences

The human genome has many different regulatory sequences which are crucial to controlling gene
expression. These are typically short sequences that appear near or within genes. A systematic
understanding of these regulatory sequences and how they together act as a gene regulatory
network is only beginning to emerge from computational, high-throughput expression and
comparative genomics studies. Some types of non-coding DNA are genetic "switches" that do not
encode proteins, but do regulate when and where genes are expressed.

Identification of regulatory sequences relies in part on evolutionary conservation. The evolutionary
split between the human and mouse lineages, for example, occurred 70–90 million years ago, so
non-coding sequences that computer comparisons show to be conserved between the two genomes
are likely to be important in functions such as gene regulation.
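
The idea of scanning for conserved stretches can be sketched as a sliding-window identity score. This is a simplified illustration that assumes the two sequences have already been aligned to equal length (with "-" marking gaps); the function names, window size, and threshold are hypothetical choices, and real comparative genomics pipelines use far more elaborate alignment and scoring methods.

```python
def percent_identity(a, b):
    """Fraction of aligned positions where a and b carry the same base (gaps never match)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return matches / len(a)


def conserved_windows(seq_a, seq_b, window=10, threshold=0.8):
    """Return (start, identity) for every window whose identity meets the threshold."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    hits = []
    for start in range(0, len(seq_a) - window + 1):
        identity = percent_identity(seq_a[start:start + window],
                                    seq_b[start:start + window])
        if identity >= threshold:
            hits.append((start, identity))
    return hits


# Toy aligned sequences: the first five positions agree, the rest do not.
human = "AAAAACCCCC"
mouse = "AAAAAGGGGG"
print(conserved_windows(human, mouse, window=5, threshold=1.0))
```

Windows that stay highly similar despite tens of millions of years of divergence are candidates for functional elements, which is the reasoning the paragraph above describes.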

Another comparative genomic approach to locating regulatory sequences in humans is sequencing
the genome of the puffer fish. These vertebrates have essentially the same genes and regulatory
sequences as humans, but with only one-eighth the "junk" DNA. The compact DNA
sequence of the puffer fish makes it much easier to locate the regulatory sequences.

Other DNA

Protein-coding sequences (specifically, coding exons) comprise less than 1.5% of the human
genome. Aside from genes and known regulatory sequences, the human genome contains vast
regions of DNA the function of which, if any, remains unknown. These regions in fact comprise the
vast majority, by some estimates 97%, of the human genome size. Much of this is composed of:

Repeat elements.

Tandem repeat: Tandem repeats occur in DNA when a pattern of two or more nucleotides is
repeated and the repetitions are directly adjacent to each other. An example would be:
A-T-T-C-G-A-T-T-C-G-A-T-T-C-G
in which the sequence A-T-T-C-G is repeated three times.
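
The definition above can be turned into a small search routine. The sketch below is a naive scan, written for clarity rather than speed; the function name and its parameters are illustrative assumptions, and dedicated tandem-repeat tools use more sophisticated algorithms that also tolerate imperfect copies.

```python
def find_tandem_repeats(seq, min_unit=2, min_copies=2):
    """Return (start, unit, copies) for exact tandem repeats, scanning left to right.

    At each position the repeat covering the most bases is kept; ties go to the
    shorter unit. Scanning resumes after the end of each reported repeat.
    """
    results = []
    n = len(seq)
    i = 0
    while i < n:
        best = None
        # A unit can be at most (remaining length / min_copies) bases long.
        for unit_len in range(min_unit, (n - i) // min_copies + 1):
            unit = seq[i:i + unit_len]
            copies = 1
            # Count how many times the unit repeats directly after itself.
            while seq[i + copies * unit_len:i + (copies + 1) * unit_len] == unit:
                copies += 1
            if copies >= min_copies:
                if best is None or copies * unit_len > best[2] * len(best[1]):
                    best = (i, unit, copies)
        if best:
            results.append(best)
            i += best[2] * len(best[1])
        else:
            i += 1
    return results


# The example from the text, written without the hyphens.
print(find_tandem_repeats("ATTCGATTCGATTCG"))
```

For the example sequence this reports a single repeat starting at position 0 with unit ATTCG copied three times.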

Interspersed repetitive DNA is found in all eukaryotic genomes. Certain classes of these
sequences propagate themselves by RNA mediated transposition, and they have been called
retrotransposons.

Transposons

The major difference between class II transposons and retrotransposons is that the former
transpose without an RNA intermediate. Class II transposons usually move by a
mechanism analogous to cut and paste, rather than copy and paste, using the transposase
enzyme. Different types of transposase work in different ways. Some can bind to any part of the
DNA molecule, and the target site can therefore be anywhere, while others bind to specific
sequences. Transposase makes a staggered cut at the target site producing sticky ends, cuts out
the transposon and ligates it into the target site.
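
The contrast between "cut and paste" and "copy and paste" can be illustrated with plain string operations. This is a deliberately simplified toy model: it ignores the staggered cut, the sticky ends, and the ligation step described above, and the function names and index conventions are assumptions made for this sketch.

```python
def cut_and_paste(genome, start, end, target):
    """Excise genome[start:end] and reinsert it at index `target`.

    `target` is an index into the genome as it stands after excision.
    The total length is unchanged.
    """
    element = genome[start:end]
    excised = genome[:start] + genome[end:]
    return excised[:target] + element + excised[target:]


def copy_and_paste(genome, start, end, target):
    """Insert a copy of genome[start:end] at index `target`, leaving the original in place.

    The genome grows by the length of the element.
    """
    element = genome[start:end]
    return genome[:target] + element + genome[target:]


# Lower-case letters mark the mobile element in this toy genome.
genome = "AAAtttCCCGGG"
print(cut_and_paste(genome, 3, 6, 6))   # element moves; length stays 12
print(copy_and_paste(genome, 3, 6, 9))  # element is duplicated; length becomes 15
```

The cut-and-paste result keeps a single copy of the element at its new site, whereas the copy-and-paste result, which stands in for replicative transposition such as that of retrotransposons, leaves the original copy behind and so enlarges the genome.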

Junk DNA

There is also a large amount of sequence that does not fall under any known
classification. Much of this sequence may be an evolutionary artifact that serves no present-day
purpose, and these regions are sometimes collectively referred to as "junk" DNA. There are,
however, a variety of emerging indications that many sequences within are likely to function in
ways that are not fully understood. Recent experiments using microarrays have revealed that a
substantial fraction of non-genic DNA is in fact transcribed into RNA, which leads to the possibility
that the resulting transcripts may have some unknown function. Also, the evolutionary
conservation across the mammalian genomes of much more sequence than can be explained by
protein-coding regions indicates that many, and perhaps most, functional elements in the genome
remain unknown. The investigation of the vast quantity of sequence information in the human
genome whose function remains unknown is currently a major avenue of scientific inquiry.

Author: Ms. Sujata Roy Saha


Research Scholar
Molecular Modeling & Drug Design
