Human Genome Project


Jai academy

Session: 2023-2024

Biology project

Submitted to: Mr. Farhan

Submitted by: Tanishka Puri
12th B
50

CERTIFICATE
THIS IS TO CERTIFY THAT TANISHKA PURI OF CLASS 12TH B
HAS COMPLETED THE PROJECT DURING THE SESSION 2023-2024
TOWARDS PARTIAL FULFILLMENT OF CREDITS FOR
BIOLOGY OF CBSE AND SUBMITTED IT SATISFACTORILY AS
COMPILED IN THE FOLLOWING PAGES UNDER THE SUPERVISION
OF MR. FARHAN.

TEACHER’S SIGNATURE

EXAMINER’S SIGNATURE

PRINCIPAL’S SIGNATURE

ACKNOWLEDGEMENT
I AM VERY GLAD TO HAVE THE OPPORTUNITY TO
MAKE THIS PROJECT AND EXPRESS MY PROFOUND
GRATITUDE AND DEEP REGARDS TO MY GUIDE MR.
FARHAN FOR HIS EXEMPLARY GUIDANCE,
MONITORING AND CONSTANT ENCOURAGEMENT
THROUGHOUT THIS PROJECT. I WOULD LIKE TO
THANK THE PEOPLE WHO HELPED ME DIRECTLY OR
INDIRECTLY TO COMPLETE THIS PROJECT. I WOULD
ALSO LIKE TO EXTEND MY GRATITUDE TO MR. SANJAY
SINGH, PRINCIPAL OF JAI ACADEMY, FOR HIS
VALUABLE ENCOURAGEMENT AND APPROVAL OF THE
PROJECT WORK. LASTLY, I THANK MY PARENTS AND
FRIENDS FOR THEIR CONSTANT ENCOURAGEMENT.

INDEX
1. CERTIFICATE
2. ACKNOWLEDGEMENT
3. INTRODUCTION TO HUMAN GENOME PROJECT
4. WHAT MAKES UP A GENE
5. WHY TO STUDY OUR GENOME
6. HUMAN GENOME PROJECT
7. APPLICATIONS AND PROPOSED BENEFITS
8. TECHNIQUES AND ANALYSIS
9. FINDINGS AND ACCOMPLISHMENTS
10. GENOME DONORS
11. DEVELOPMENTS
12. ETHICAL, LEGAL AND SOCIAL ISSUES
13. BIBLIOGRAPHY

Introduction to human genome project

The Human Genome Project (HGP) was an international scientific research project with the
goal of determining the base pairs that make up human DNA, and of identifying, mapping
and sequencing all of the genes of the human genome from both a physical and a functional
standpoint. It started in 1990 and was completed in 2003. It remains the world's largest
collaborative biological project. Planning for the project started after it was adopted in 1984
by the US government, and it officially launched in 1990. It was declared complete on April
14, 2003, and included about 92% of the genome. The "complete genome" level was achieved in
May 2021, with only 0.3% of the bases still covered by potential issues. The final gapless
assembly was finished in January 2022.
Funding came from the United States government through the National Institutes of Health
(NIH) as well as numerous other groups from around the world. A parallel project was
conducted outside the government by the Celera Corporation, or Celera Genomics, which
was formally launched in 1998. Most of the government-sponsored sequencing was
performed in twenty universities and research centres in the United States, the United
Kingdom, Japan, France, Germany, and China working in the International Human Genome
Sequencing Consortium (IHGSC).

History

The Human Genome Project was a 13-year-long publicly funded project initiated in 1990 with the
objective of determining the DNA sequence of the entire euchromatic human genome within
15 years.
In 1990, the two major funding agencies, DOE and the National Institutes of Health,
developed a memorandum of understanding in order to coordinate plans and set the clock for
the initiation of the Project to 1990.[23] At that time, David J. Galas was Director of the
renamed "Office of Biological and Environmental Research" in the U.S. Department of
Energy's Office of Science and James Watson headed the NIH Genome Program. In 1993,
Aristides Patrinos succeeded Galas and Francis Collins succeeded Watson, assuming the role
of overall Project Head as Director of the NIH National Center for Human Genome Research
(which would later become the National Human Genome Research Institute). A working
draft of the genome was announced in 2000 and the papers describing it were published in
February 2001. A more complete draft was published in 2003, and genome "finishing" work
continued for more than a decade after that.

The $3 billion project was formally founded in 1990 by the US Department of Energy and
the National Institutes of Health, and was expected to take 15 years.[24] In addition to the
United States, the international consortium comprised geneticists in the United Kingdom,
France, Australia, China, and myriad other spontaneous relationships.[25] The project ended
up costing less than expected, at about $2.7 billion (equivalent to about $5 billion in 2021).
[7][26][27]

Two technologies enabled the project: gene mapping and DNA sequencing. The gene
mapping technique of restriction fragment length polymorphism (RFLP) arose from the
search for the location of the breast cancer gene by Mark Skolnick of the University of Utah,
[28] which began in 1974.[29] Having seen a linkage marker for the gene, Skolnick, in collaboration with
David Botstein, Ray White and Ron Davis, conceived of a way to construct a genetic
linkage map of the human genome.
This enabled scientists to launch the larger human genome effort.
What makes up a gene?


Our DNA (deoxyribonucleic acid) is found in the nucleus of every cell in our body (apart
from red blood cells, which don’t have a nucleus). DNA is a long molecule, made up of lots of smaller
units. To make a DNA molecule you need:
• nitrogenous bases: there are four of these, adenine (A), thymine (T), cytosine (C) and
guanine (G)
• five-carbon sugar molecules (deoxyribose)
• phosphate molecules
If you take one of the four nitrogenous bases, and put it together with a sugar molecule and a
phosphate molecule, you get a nucleotide. The sugar and phosphate molecules connect
the nucleotides together to form a single strand of DNA. Two of these strands then wind
around each other, making the twisted ladder shape of the DNA double helix. The
bases pair up to make the rungs of the ladder, and the sugar and phosphate molecules make the
sides. The bases pair up in specific combinations: A always pairs with T, and C
always pairs with G, to make base pairs. Put three billion of these base pairs together in the
right order, and you have a complete set of human DNA, the human genome. This amounts
to a DNA molecule about a metre long. It is the order in which the base pairs are arranged
(their sequence) in our DNA that provides the blueprint for all living things and makes us
what we are.
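
The pairing rule and the "about a metre" figure above can be checked with a short Python sketch. This is only an illustration: the example sequence is made up, the function names are our own, and the spacing of 0.34 nanometres between neighbouring base pairs is an assumed textbook value that does not appear in this project.

```python
# Pairing rules from the text: A pairs with T, and C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the partner base for every base of `strand`, position by position."""
    return "".join(PAIRS[base] for base in strand.upper())


def length_in_metres(base_pairs: int, rise_nm: float = 0.34) -> float:
    """Approximate length of a double helix, assuming 0.34 nm per base pair."""
    return base_pairs * rise_nm * 1e-9


example = "ATGCCGTA"  # made-up sequence, for illustration only
print(complement(example))                        # TACGGCAT
print(round(length_in_metres(3_000_000_000), 2))  # 1.02
```

With three billion base pairs the estimate comes out at roughly one metre, which agrees with the figure quoted above.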

The sequence of base pairs in a fish’s DNA is different from that in a monkey’s. The
base pair sequence of all people is nearly identical, and that is what makes us all human.
However, there are small differences in the order of the three billion base pairs in everyone’s
DNA that cause the variations we see in hair colour, eye colour, nose shape etc. No two
people have exactly the same DNA sequence (except for identical twins, because they came
from a single egg that split into two, forming two copies of the same DNA). We get our DNA
from our parents. The DNA of the human genome is broken up into 23 pairs of chromosomes
(46 in total). We receive 23 from our mother and 23 from our father. Egg and sperm cells
have only one copy of each chromosome so that when they come together to form a baby, the
baby has the normal 2 copies. Three billion is a lot of base pairs, and together they contain an
enormous amount of information.
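
To give a rough feel for how much information that is, the sketch below counts how much computer storage three billion bases would need. It assumes the simplest possible encoding (each of the four bases stored in 2 bits, one strand only, no compression); this encoding and the function name are our own assumptions, not something taken from the Human Genome Project.

```python
import math

BASES = 4                           # A, T, C, G
BITS_PER_BASE = math.log2(BASES)    # 2.0 bits are enough to tell 4 bases apart


def genome_megabytes(base_pairs: int) -> float:
    """Storage needed for one strand at 2 bits per base, in megabytes (10**6 bytes)."""
    return base_pairs * BITS_PER_BASE / 8 / 1_000_000


print(genome_megabytes(3_000_000_000))  # 750.0
```

So, under these assumptions, one copy of the sequence would fit in about 750 megabytes.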

Why to study our genome

Working out the sequence of the base pairs in all our genes enables us to understand the code
that makes us who we are. This knowledge can then give us clues on how we develop as
embryos, why humans have more brainpower than other animals and plants, and what
happens in the body to cause cancer. But establishing the sequence of three billion base pairs
is a BIG task. The great and ambitious research program that sought to do this was called the
Human Genome Project.

The idea of the Human Genome Project was born in the 1970s, when scientists learned how
to ‘clone’ small bits of DNA, around the size of a gene. To clone DNA, scientists cut out a
fragment of human DNA from the long strand and then incorporate it into the genome of a
bacterium, or a bacterial virus. The fragment is then replicated within the bacterial cell many
times, and every time the bacterial cell divides, the new cells also contain the introduced
DNA fragment.

Francis Collins, former director of the National Human Genome Research Institute, led the
Human Genome Project.

A cell in the human body is invisible to the naked eye, so a microscope is essential to view
it. Human DNA, which is about 2 m long, is packed so tightly that it fits into the cell
nucleus, so imagine how difficult it is to view a single DNA fragment.

Bacterial cells reproduce prolifically, and so this process ends up making millions of cells
that all contain the introduced DNA fragment, enough that researchers can study it in detail
and figure out the sequence of the base pairs.
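
The "millions of cells" above follow from simple doubling. The sketch below assumes an idealised culture in which every cell divides in two at each round and every daughter cell keeps the introduced fragment; real cultures are less tidy, and the function names are our own.

```python
def copies_after(divisions: int, start: int = 1) -> int:
    """Number of cells carrying the fragment after a number of rounds of division."""
    return start * 2 ** divisions


def divisions_needed(target: int) -> int:
    """Smallest number of rounds of division that gives at least `target` cells."""
    rounds = 0
    while copies_after(rounds) < target:
        rounds += 1
    return rounds


print(copies_after(20))             # 1048576
print(divisions_needed(1_000_000))  # 20
```

Under these assumptions a single bacterium needs only about 20 rounds of division to produce over a million copies of the fragment.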

With time, researchers have been able to study an ever greater number of different DNA
fragments, that is, different genes. It became clear that certain variant DNA sequences were
associated with particular conditions: diseases such as cystic fibrosis or breast cancer, or
normal, non-harmful variants like red hair.

There was initially a lot of opposition to the Human Genome Project, even from some
scientists. Considering only around 1.5 per cent of our genome is actual genes that code for
proteins, it was thought that much of the $3 billion cost to sequence the entire human
genome would be wasted on the ‘junk’ DNA that scientists thought didn’t get used.

The important role the ‘junk’ DNA plays in gene regulation wasn’t yet appreciated.
Research groups in many countries, including Australia, began to sequence different genes,
providing the beginnings of a total human gene map.
In 1989, the Human Genome Organization (HUGO) was founded by leading scientists to
coordinate the massive international effort involved in collecting sequence data to unravel
the secrets of our genes.

The human genome project


The Human Genome Project aimed to map the entire genome, including the position of every
human gene along the DNA strand, and then to determine the sequence of each gene’s base
pairs. At the time, sequencing even a small gene could take months, so this was seen as a
stupendous and very costly undertaking. Fortunately, biotechnology was advancing rapidly,
and by the time the project was finishing it was possible to sequence the DNA of a gene in a
few hours. Even so, the project took ten years to complete; the first draft of the human
genome was announced in June 2000.

In February 2001, the publicly funded Human Genome Project and the private company
Celera both announced that they had mapped virtually all of the human genome, and had
begun the task of working out the functions of the many new genes that were identified.
Scientists were surprised to find that humans only have around 25,000 genes, not many more
than the roundworm Caenorhabditis elegans, and fewer than a tiny water crustacean called
Daphnia, which has around 30,000. However, genome sequencing was making it clear that
an organism's complexity is not necessarily related to its number of genes.
Also, while we might have a surprisingly small number of genes, they are often expressed in
multiple and complex ways. Numerous genes have as many as a dozen different functions
and may be translated into several different versions active in different tissues. We also have
a lot of extra DNA that doesn’t make up specific genes. So even though the pufferfish
Tetraodon nigroviridis has more genes than we do (nearly 28,000), the size of its entire
genome is actually only around one tenth of ours, as it has much less non-coding DNA.
In April 2003, the 50th anniversary of the publication of the structure of DNA, the complete
final map of the Human Genome was announced.
The DNA from a large number of donors, women and men from different nations and of
different races, contributed to this ‘typical’ Human Genome Sequence.

Applications and proposed benefits

The sequencing of the human genome holds benefits for many fields, from molecular
medicine to human evolution. The Human Genome Project, through its sequencing of the
DNA, can help researchers understand diseases and supports applications including: genotyping of specific viruses to
direct appropriate treatment; identification of mutations linked to different forms of cancer;
the design of medication and more accurate prediction of its effects; advancement in
forensic applied sciences; biofuels and other energy applications; agriculture, animal
husbandry and bioprocessing; risk assessment; and bioarcheology, anthropology and evolution.
Another proposed benefit is the commercial development of genomics research related to
DNA-based products, a multibillion-dollar industry.

The sequence of the DNA is stored in databases available to anyone on the Internet. The U.S.
National Center for Biotechnology Information (and sister organizations in Europe and
Japan) house the gene sequence in a database known as GenBank, along with sequences of
known and hypothetical genes and proteins. Other organizations, such as the UCSC Genome
Browser at the University of California, Santa Cruz,[52] and Ensembl[53] present additional
data and annotation and powerful tools for visualizing and searching it. Computer programs
have been developed to analyze the data because the data itself is difficult to interpret
without such programs. Generally speaking, advances in genome sequencing technology
have followed Moore's Law, a concept from computer science which states that integrated
circuits can increase in complexity at an exponential rate.[54] This means that the speeds at
which whole genomes can be sequenced can increase at a similar rate, as was seen during the
development of the Human Genome Project.
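
What "an exponential rate" means can be shown with a small calculation. The sketch below assumes, purely for illustration, that capacity doubles every two years; this doubling period and the function name are our own assumptions and are not a measured figure for sequencing technology.

```python
def growth_factor(years: float, doubling_period_years: float = 2.0) -> float:
    """How many times larger a quantity becomes if it doubles every fixed period."""
    return 2 ** (years / doubling_period_years)


for years in (2, 4, 13):
    print(years, "years ->", round(growth_factor(years), 1), "times")
```

The point of the example is that equal steps in time multiply the capacity by the same factor, so the growth over 13 years is far larger than 13 single-year steps added together.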

Techniques and analysis

The process of identifying the boundaries between genes and other features in a raw DNA
sequence is called genome annotation and is in the domain of bioinformatics. While expert
biologists make the best annotators, their work proceeds slowly, and computer programs are
increasingly used to meet the high-throughput demands of genome sequencing projects.
Beginning in 2008, a new technology known as RNA-seq was introduced that allowed
scientists to directly sequence the messenger RNA in cells.

This replaced previous methods of annotation, which relied on the inherent properties of the
DNA sequence, with direct measurement, which was much more accurate. Today, annotation
of the human genome and other genomes relies primarily on deep sequencing of the
transcripts in every human tissue using RNA-seq. These experiments have revealed that over
90% of genes contain at least one and usually several alternative splice variants, in which the
exons are combined in different ways to produce 2 or more gene products from the same
locus.[55]
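
How one gene can give more than one product is easier to see with a toy example. The sketch below uses a made-up gene with four exons and assumes that the first and last exon are always kept while each exon in between may be kept or skipped. Real cells tightly regulate splicing and do not produce every possible combination; the exon names and the function are hypothetical.

```python
from itertools import combinations


def splice_variants(exons):
    """List transcripts that keep the first and last exon plus any subset
    of the internal exons, always in their original order."""
    first, *internal, last = exons
    variants = []
    for size in range(len(internal) + 1):
        for chosen in combinations(internal, size):
            variants.append([first, *chosen, last])
    return variants


gene = ["E1", "E2", "E3", "E4"]  # hypothetical exons
for variant in splice_variants(gene):
    print("-".join(variant))
```

For this four-exon gene the sketch lists four different transcripts (E1-E4, E1-E2-E4, E1-E3-E4 and E1-E2-E3-E4), all read from the same locus.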

The genome published by the HGP does not represent the sequence of every individual's
genome. It is the combined mosaic of a small number of anonymous donors, of African,
European and east Asian ancestry. The HGP genome is a scaffold for future work in
identifying differences among individuals. Subsequent projects sequenced
the genomes of multiple distinct ethnic groups, though as of 2019 there is still only one
"reference genome".[56]

Key findings and accomplishments

Key findings of the draft (2001) and complete (2004) genome sequences include:

There are approximately 22,300[57] protein-coding genes in human beings, the same range
as in other mammals.

The human genome has significantly more segmental duplications (nearly identical, repeated
sections of DNA) than had been previously suspected.[58][59][60]

At the time when the draft sequence was published, fewer than 7% of protein families
appeared to be vertebrate specific.[61]

The human genome has approximately 3.1 billion base pairs.[62] The Human Genome
Project was started in 1990 with the goal of sequencing and identifying all base pairs in the
human genetic instruction set, finding the genetic roots of disease and then developing
treatments. It is considered a megaproject.

The genome was broken into smaller pieces, approximately 150,000 base pairs in length.[63]
These pieces were then ligated into a type of vector known as "bacterial artificial
chromosomes", or BACs, which are derived from bacterial chromosomes that have been
genetically engineered. The vectors containing the genes can be inserted into bacteria, where
they are copied by the bacterial DNA replication machinery. Each of these pieces was then
sequenced separately as a small "shotgun" project and then assembled. The larger 150,000-base-pair
pieces were then put together to reconstruct the chromosomes. This is known as the "hierarchical shotgun"
approach, because the genome is first broken into relatively large chunks, which are then
mapped to chromosomes before being selected for sequencing.[64][65]
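
The two levels of the hierarchical shotgun approach can be imitated on a tiny scale. The sketch below is a toy model, not the software used by the project: the 40-base "genome" is made up, the 20-base chunks stand in for 150,000-base BAC inserts, the reads are cut at fixed steps instead of randomly, and the chunks do not overlap each other. The simple greedy joining only works here because the made-up sequence has no misleading repeats.

```python
import random


def overlap(a, b, min_len=3):
    """Length of the longest end of `a` that matches the start of `b`."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads):
    """Repeatedly join the two pieces that share the largest overlap."""
    pieces = list(reads)
    while len(pieces) > 1:
        best = (0, None, None)
        for i, a in enumerate(pieces):
            for j, b in enumerate(pieces):
                if i != j and overlap(a, b) > best[0]:
                    best = (overlap(a, b), i, j)
        k, i, j = best
        if k == 0:
            break  # no overlaps left, so the pieces stay separate
        merged = pieces[i] + pieces[j][k:]
        pieces = [p for n, p in enumerate(pieces) if n not in (i, j)] + [merged]
    return pieces


def shotgun(chunk, read_len=8, step=4):
    """Cut a chunk into overlapping reads and shuffle them so their order is lost."""
    reads = [chunk[s:s + read_len] for s in range(0, len(chunk) - read_len + 1, step)]
    random.Random(0).shuffle(reads)
    return reads


genome = "ATGGCGTACGTTAGCCTAGGATCCGATTACAGGCTTAACG"  # made-up sequence
chunk_len = 20                                       # stands in for a BAC insert

# Level 1: break the genome into large chunks whose order on the chromosome is known.
chunks = [genome[s:s + chunk_len] for s in range(0, len(genome), chunk_len)]
# Level 2: shotgun each chunk into short reads and assemble the chunk on its own.
assembled_chunks = [greedy_assemble(shotgun(chunk))[0] for chunk in chunks]
# Finally, join the assembled chunks in their mapped order.
rebuilt = "".join(assembled_chunks)
print(rebuilt == genome)
```

Keeping the map order of the large chunks means that each small assembly problem only has to deal with reads from its own chunk, which is the main idea behind the hierarchical approach described above.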

Genome donors

In the International Human Genome Sequencing Consortium (IHGSC) public-sector HGP,
researchers collected blood (female) or sperm (male) samples from a large number of donors.
Only a few of many collected samples were processed as DNA resources.
Thus the donor identities were protected so neither donors nor scientists could know whose
DNA was sequenced. DNA clones taken from many different libraries were used in the
overall project, with most of those libraries being created by Pieter J. de Jong. Much of the
sequence (>70%) of the reference genome produced by the public HGP came from a single
anonymous male donor from Buffalo, New York, (code name RP11; the "RP" refers to
Roswell Park Comprehensive Cancer Center).

HGP scientists used white blood cells from the blood of two male and two female donors
(randomly selected from 20 of each) – each donor yielding a separate DNA library. One of
these libraries (RP11) was used considerably more than others, because of quality
considerations.

One minor technical issue is that male samples contain just over half as much DNA from the
sex chromosomes (one X chromosome and one Y chromosome) compared to female samples
(which contain two X chromosomes).
The other 22 chromosomes (the autosomes) are the same for both sexes.
Although the main sequencing phase of the HGP has been completed, studies of DNA
variation continued in the International HapMap Project, whose goal was to identify patterns
of single-nucleotide polymorphism (SNP) groups (called haplotypes, or "haps"). The DNA
samples for the HapMap came from a total of 270 individuals: Yoruba people in Ibadan,
Nigeria; Japanese people in Tokyo; Han Chinese in Beijing; and the French Centre d'Etude
du Polymorphisme Humain (CEPH) resource, which consisted of residents of the United
States having ancestry from Western and Northern Europe.

In the Celera Genomics private-sector project, DNA from five different individuals was
used for sequencing. The lead scientist of Celera Genomics at that time, Craig Venter, later
acknowledged (in a public letter to the journal Science) that his DNA was one of 21 samples
in the pool, five of which were selected for use.

Developments

With the sequence in hand, the next step was to identify the genetic variants that increase the
risk for common diseases like cancer and diabetes.[23][63]

It is anticipated that detailed knowledge of the human genome will provide new avenues for
advances in medicine and biotechnology. Clear practical results of the project emerged even
before the work was finished. For example, a number of companies, such as Myriad
Genetics, started offering easy ways to administer genetic tests that can show predisposition
to a variety of illnesses, including breast cancer, hemostasis disorders, cystic fibrosis, liver
diseases and many others.
Also, the etiologies for cancers, Alzheimer's disease and other areas of clinical interest are
considered likely to benefit from genome information and possibly may lead in the long term
to significant advances in their management.[77][78]

There are also many tangible benefits for biologists. For example, a researcher investigating
a certain form of cancer may have narrowed down their search to a particular gene. By
visiting the human genome database on the World Wide Web, this researcher can examine
what other scientists have written about this gene, including (potentially) the three-
dimensional structure of its product, its functions, its evolutionary relationships to other
human genes, or to genes in mice, yeast, or fruit flies, possible detrimental mutations,
interactions with other genes, body tissues in which this gene is activated, and diseases
associated with this gene or other datatypes.

Further, a deeper understanding of the disease processes at the level of molecular biology
may determine new therapeutic procedures.

Given the established importance of DNA in molecular biology and its central role in
determining the fundamental operation of cellular processes, it is likely that expanded
knowledge in this area will facilitate medical advances in numerous areas of clinical interest
that may not have been possible without it.[79]

The analysis of similarities between DNA sequences from different organisms is also
opening new avenues in the study of evolution.

In many cases, evolutionary questions can now be framed in terms of molecular biology;
indeed, many major evolutionary milestones (the emergence of the ribosome and organelles,
the development of embryos with body plans, the vertebrate immune system) can be related
to the molecular level.
Many questions about the similarities and differences between humans and their closest
relatives (the primates, and indeed the other mammals) are expected to be illuminated by the
data in this project.[77][80]

The project inspired and paved the way for genomic work in other fields, such as agriculture.
For example, by studying the genetic composition of Triticum aestivum, the world's most
commonly used bread wheat, great insight has been gained into the ways that domestication
has impacted the evolution of the plant.

Researchers are investigating which loci are most susceptible to manipulation, and how this plays
out in evolutionary terms.
Genetic sequencing has allowed these questions to be addressed for the first time, as specific
loci can be compared in wild and domesticated strains of the plant.
This will allow for advances in genetic modification in the future, which could yield
healthier and disease-resistant wheat crops, among other things.

Ethical, legal and social issues

At the onset of the Human Genome Project, several ethical, legal, and social concerns were
raised in regard to how increased knowledge of the human genome could be used to
discriminate against people. One of the main concerns of most individuals was the fear that
both employers and health insurance companies would refuse to hire individuals or refuse to
provide insurance to people because of a health concern indicated by someone's genes.[82]

In 1996, the United States passed the Health Insurance Portability and Accountability Act
(HIPAA), which protects against the unauthorized and non-consensual release of individually
identifiable health information to any entity not actively engaged in the provision of
healthcare services to a patient.[83]

Along with identifying all of the approximately 20,000–25,000 genes in the human genome
(estimated at between 80,000 and 140,000 at the start of the project), the Human Genome
Project also sought to address the ethical, legal, and social issues that were created by the
onset of the project.

For that, the Ethical, Legal, and Social Implications (ELSI) program was founded in 1990.
Five percent of the annual budget was allocated to address the ELSI arising from the project.[24][85]
This budget started at approximately $1.57 million in the year 1990, but increased
to approximately $18 million in the year 2014.

Whilst the project may offer significant benefits to medicine and scientific research, some
authors have emphasized the need to address the potential social consequences of mapping
the human genome. Historian of science Hans-Jörg Rheinberger wrote that "the prospect of
'molecularizing' diseases and their possible cure will have a profound impact on what
patients expect from medical help, and on a new generation of doctors' perception of illness."

Bibliography

1. Google
2. Wikipedia
3. Google Images
4. NCERT Biology textbook, Class 12
