Biological Computing
Project Report on
“BIOINFORMATICS”
At
as a partial fulfilment
(B. C. A.)
2021-2022
Guided By: PARTH PATEL
Submitted By: ADARSH ASHOK
GIDC Rajju Shroff ROFEL Institute of Management Studies (BBA Program)
and ROFEL, Shri G.M Bilakhia College of Applied Sciences (BCA), VAPI
An ISO Certified 9001-2015 & 29990-2010
CERTIFICATE
This is to certify that the seminar work entitled “Bioinformatics” is bona fide work
done by ADARSH ASHOK
Guided by Principal
Acknowledgement
We take this opportunity to thank all those who have contributed their support in preparing this
seminar. Firstly, we would like to express our deep sense of gratitude towards ROFEL BBA
and BCA College, Vapi, for providing us this opportunity to apply our skills in making
presentations on a topic in our sixth semester course curriculum.
I would not miss the opportunity to thank my guide, Parth Patel, who has always provided
continuous guidance and support and has been a stepping stone in completing this seminar.
I am also very thankful to the Principal of our college, Dr. P.H.Ved, for their keen interest and
for providing all facilities for our project work.
Last but not least, we are also grateful to our parents and friends, whose continuous support has
always boosted our morale while working on this seminar.
Sr.No Topic
1 What is Bioinformatics
2 Computational biology
3 What is DNA?
7 Benefits
8 Applications
9 Importance
10 Future
11 Webliography
12 Bibliography
What is Bioinformatics?
Bioinformatics combines molecular biology, statistics and information technology.
It involves storing and manipulating biological data, such as DNA or protein sequences, in
databases.
Bioinformatics includes biological studies that use computer programming as part of their
methodology, as well as specific analysis "pipelines" that are repeatedly used, particularly in the
field of genomics. Common uses of bioinformatics include the identification of
candidate genes and single nucleotide polymorphisms (SNPs).
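To make the idea of a single nucleotide polymorphism concrete, the sketch below compares two short sequences position by position and reports where a single base differs. The function name `find_snps` and both sequences are made up for illustration, and the sketch assumes the sequences are already aligned and of equal length. Real SNP identification also involves alignment, sequencing quality and population frequencies, none of which is modelled here.

```python
def find_snps(reference, sample):
    """Return (position, reference_base, sample_base) for every mismatch.

    Assumes the two sequences are already aligned and equally long.
    Positions are 1-based.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (position, ref_base, alt_base)
        for position, (ref_base, alt_base) in enumerate(zip(reference, sample), start=1)
        if ref_base != alt_base
    ]


# Hypothetical sequences, invented for this example.
reference = "ATGCCGTA"
sample = "ATGACGTT"
snps = find_snps(reference, sample)
print(snps)
```

Each tuple gives the position of a differing base together with the base seen in the reference and in the sample.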
History
Historically, the term bioinformatics did not mean what it means today. Paulien
Hogeweg and Ben Hesper coined it in 1970 to refer to the study of information processes in
biotic systems. This definition placed bioinformatics as a field parallel to biochemistry.
Computational biology
What is DNA?
DNA is the chemical name for the molecule that carries genetic instructions in all living
things. The DNA molecule consists of two strands that wind around one another to form a
shape known as a double helix. Each strand has a backbone made of alternating sugar
(deoxyribose) and phosphate groups.
Attached to each sugar is one of four chemical bases, which are:
1. ADENINE
2. THYMINE
3. CYTOSINE
4. GUANINE
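As a small illustration of how a computer handles the four bases, the sketch below stores one strand as a string of the letters A, T, C and G, counts each base, and derives the partner strand. It relies on the standard base-pairing rule (adenine pairs with thymine, cytosine pairs with guanine), which is not stated above. The helper names and the example strand are assumptions made for this sketch.

```python
from collections import Counter

# Standard base pairing: A with T, C with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def base_counts(sequence):
    """Count how often each of the four bases occurs in a strand."""
    sequence = sequence.upper()
    unknown = set(sequence) - set(COMPLEMENT)
    if unknown:
        raise ValueError(f"not a DNA base: {sorted(unknown)}")
    return Counter(sequence)


def complementary_strand(sequence):
    """Return the pairing base for each position, in the same left-to-right order."""
    return "".join(COMPLEMENT[base] for base in sequence.upper())


strand = "ATGCGA"  # hypothetical example strand
counts = base_counts(strand)
partner = complementary_strand(strand)
print(dict(counts), partner)
```

In the double helix the partner strand runs in the opposite direction, so tools that report it in reading order also reverse the result.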
What are DNA computers?
DNA (deoxyribonucleic acid) molecules, the material our genes are made of, have the potential
to perform calculations many times faster than the world's most powerful human-built
computers. DNA might one day be integrated into a computer chip to create a so-called biochip
that will push computers even faster. DNA molecules have already been harnessed to solve
complex mathematical problems.
A DNA computer could be capable of storing billions of times more data than a personal
computer. Scientists are using genetic material to create nano-computers that might one day
take the place of silicon-based computers.
DNA computers can't be found at your local electronics store yet, and the technology is still in
development. In 1994, Leonard Adleman introduced the idea of using DNA to solve complex
mathematical problems. Adleman, a computer scientist at the University of Southern California,
came to the conclusion that DNA had computational potential after reading the book "Molecular
Biology of the Gene," written by James Watson, who co-discovered the structure of DNA in 1953.
In fact, DNA is very similar to a computer hard drive in how it stores permanent information
about your genes.
Protein synthesis is the process in which cells make proteins. It occurs in two stages:
transcription and translation.
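The two stages can be sketched in a few lines of code. The example below is a simplification under stated assumptions: it takes the coding strand of a gene, so transcription reduces to replacing T with U, and it translates with the standard genetic code until the first stop codon. The function names and the input sequence are invented for illustration. Real cells also involve promoters, splicing and ribosomes, which are not modelled.

```python
from itertools import product

# Standard genetic code, listed with bases in the order U, C, A, G
# ("*" marks a stop codon).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand):
    """Stage 1: DNA coding strand to mRNA (T becomes U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Stage 2: read mRNA three bases at a time until a stop codon."""
    protein = []
    for start in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[start:start + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGCCTAA")  # hypothetical gene fragment
protein = translate(mrna)
print(mrna, protein)
```

Here the letters in `protein` are one-letter amino acid codes, so "MA" stands for methionine followed by alanine.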
Internet and Bioinformatics
The Internet (or internet) is the global system of interconnected computer networks that uses
the Internet protocol suite (TCP/IP) to communicate between networks and devices. It is
a network of networks that consists of private, public, academic, business, and government
networks of local to global scope, linked by a broad array of electronic, wireless, and optical
networking technologies. The Internet carries a vast range of information resources and services,
such as the inter-linked hypertext documents and applications of the World Wide
Web (WWW), electronic mail, telephony, and file sharing.
The origins of the Internet date back to the development of packet switching and research
commissioned by the United States Department of Defense in the 1960s to enable time-
sharing of computers. The primary precursor network, the ARPANET, initially served as a
backbone for interconnection of regional academic and military networks in the 1970s. The
funding of the National Science Foundation Network as a new backbone in the 1980s, as well as
private funding for other commercial extensions, led to worldwide participation in the
development of new networking technologies, and the merger of many networks. The linking of
commercial networks and enterprises by the early 1990s marked the beginning of the transition
to the modern Internet, and generated a sustained exponential growth as generations of
institutional, personal, and mobile computers were connected to the network. Although the
Internet was widely used by academia in the 1980s, commercialization incorporated its services
and technologies into virtually every aspect of modern life.
Most traditional communication media, including telephony, radio, television, paper mail and
newspapers are reshaped, redefined, or even bypassed by the Internet, giving birth to new
services such as email, Internet telephony, Internet television, online music, digital newspapers,
and video streaming websites. Newspaper, book, and other print publishing are adapting
to website technology, or are reshaped into blogging, web feeds and online news aggregators.
The Internet has enabled and accelerated new forms of personal interactions through instant
messaging, Internet forums, and social networking services. Online shopping has grown
exponentially for major retailers, small businesses, and entrepreneurs, as it enables firms to
extend their "brick and mortar" presence to serve a larger market or even sell goods and services
entirely online. Business-to-business and financial services on the Internet affect supply
chains across entire industries.
Protein–protein interaction networks
Protein-protein interaction networks (PINs) represent the physical relationship among proteins
present in a cell, where proteins are nodes, and their interactions are undirected edges. Due to
their undirected nature, it is difficult to identify all the proteins involved in an
interaction. Protein–protein interactions (PPIs) are essential to the cellular processes and also the
most intensely analyzed networks in biology. PPIs could be discovered by various experimental
techniques, among which the yeast two-hybrid system is a commonly used technique for the
study of binary interactions. Recently, high-throughput studies using mass spectrometry have
identified large sets of protein interactions.
Many international efforts have resulted in databases that catalog experimentally determined
protein-protein interactions. Some of them are the Human Protein Reference Database, Database
of Interacting Proteins, the Molecular Interaction Database (MINT), IntAct, and BioGRID. At
the same time, multiple computational approaches were proposed to predict
interactions. STRING is one such database, where both computationally predicted and
experimentally validated protein-protein interactions are gathered for public usage.
Recent studies have indicated the conservation of molecular networks through deep evolutionary
time. Moreover, it has been discovered that proteins with high degrees of connectedness are
more likely to be essential for survival than proteins with lesser degrees. This observation
suggests that the overall composition of the network (not simply interactions between protein
pairs) is vital for an organism's overall functioning.
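The network view described above can be sketched as an undirected graph in which each protein maps to the set of its interaction partners. The protein names P1 to P5 and the interaction list are hypothetical, chosen only to show how the degree of connectedness is computed. This is not data from any of the databases named above.

```python
from collections import defaultdict

# Hypothetical binary interactions, e.g. as a two-hybrid screen might report them.
interactions = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P4", "P5")]


def build_network(pairs):
    """Store each undirected edge in both directions."""
    network = defaultdict(set)
    for protein_a, protein_b in pairs:
        network[protein_a].add(protein_b)
        network[protein_b].add(protein_a)
    return network


network = build_network(interactions)

# Degree = number of distinct interaction partners of a protein.
degrees = {protein: len(partners) for protein, partners in network.items()}
hub = max(degrees, key=degrees.get)
print(degrees, hub)
```

The protein with the highest degree (`hub`) is the kind of highly connected node that the text describes as more likely to be essential.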
Gene regulatory networks (DNA–protein interaction networks)
The genome encodes thousands of genes whose products (mRNAs, proteins) are crucial to the
various processes of life, such as cell differentiation, cell survival, and metabolism. Genes
produce such products through a process called transcription, which is regulated by a class of
proteins called transcription factors. For instance, the human genome encodes almost 1,500
DNA-binding transcription factors that regulate the expression of more than 20,000 human
genes. The complete set of gene products and the interactions among them constitutes gene
regulatory networks (GRN). GRNs regulate the levels of gene products within the cell and, in
turn, the cellular processes.
GRNs are represented with genes and transcriptional factors as nodes and the relationship
between them as edges. These edges are directional, representing the regulatory relationship
between the two ends of the edge. For example, the directed edge from gene A to gene B
indicates that A regulates the expression of B. Thus, these directional edges can not only
represent the promotion of gene regulation but also its inhibition.
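A directed, signed graph is one simple way to hold such a network in a program. In the sketch below each edge is keyed by (regulator, target) and carries +1 for promotion or -1 for inhibition. The genes A, B and C and their relationships are hypothetical, and the sign encoding is an assumption made for this example.

```python
# Directed edges: (regulator, target) -> +1 (promotes) or -1 (inhibits).
regulation = {
    ("A", "B"): +1,  # A promotes the expression of B
    ("A", "C"): -1,  # A inhibits the expression of C
    ("B", "C"): +1,  # B promotes the expression of C
}


def targets_of(regulator):
    """Genes that the given regulator acts on, with the sign of the effect."""
    return {target: sign for (source, target), sign in regulation.items() if source == regulator}


def regulators_of(gene):
    """Regulators that act on the given gene, with the sign of the effect."""
    return {source: sign for (source, target), sign in regulation.items() if target == gene}


print(targets_of("A"))
print(regulators_of("C"))
```

Because the edges are directed, `targets_of("C")` is empty even though C is regulated by both A and B.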
Biological databases
Biological databases are libraries of biological sciences, collected from scientific experiments,
published literature, high-throughput experiment technology, and computational analysis. They
contain information from research areas,
including genomics, proteomics, metabolomics, microarray gene expression, and phylogenetics.
Information contained in biological databases includes gene function, structure, localization
(both cellular and chromosomal), clinical effects of mutations as well as similarities of biological
sequences and structures.
Biological databases can be classified by the kind of data they collect. Broadly, there
are molecular databases (for sequences, molecules, etc.), functional databases (for physiology,
enzyme activities, phenotypes, ecology, etc.), taxonomic databases (for species and other
taxonomic ranks), images and other media, and specimens (for museum collections, etc.).
Databases are important tools in assisting scientists to analyze and explain a host of biological
phenomena from the structure of biomolecules and their interaction, to the whole metabolism of
organisms and to understanding the evolution of species. This knowledge helps in the
fight against diseases, in the development of medications, in predicting certain genetic
diseases and in discovering basic relationships among species in the history of life.
The computational side of bioinformatics is used to study biological problems such as
metabolic disorders and genetic disorders.
[Figure: computer science resources for bioinformatics. Labels: operating systems (Windows
2000/XP, Linux, Unix); software and tools development; database development (relational
database management systems); software and tools applications (bioinformatics software
applications); Internet resources for bioinformatics.]
Human genome project (HGP)
The Human Genome Project (HGP) was an international scientific research project with the
goal of determining the base pairs that make up human DNA, and of identifying, mapping and
sequencing all of the genes of the human genome from both a physical and a functional
standpoint. It remains the world's largest collaborative biological project.
Planning started after the idea was picked up in 1984 by the US government. The project formally
launched in 1990 and was declared essentially complete on April 14, 2003, although that release
covered only about 85% of the genome. The "complete genome" level was achieved in May 2021, with
only 0.3% of bases still affected by potential issues. The missing Y chromosome was added in
January 2022.
Funding came from the American government through the National Institutes of Health (NIH) as
well as numerous other groups from around the world. A parallel project was conducted outside
the government by the Celera Corporation, or Celera Genomics, which was formally launched in
1998. Most of the government-sponsored sequencing was performed in twenty universities and
research centres in the United States, the United Kingdom, Japan, France, Germany, and China.
The Human Genome Project originally aimed to map the nucleotides contained in a
human haploid reference genome (more than three billion). The "genome" of any given
individual is unique; mapping the "human genome" involved sequencing a small number of
individuals and then assembling to get a complete sequence for each chromosome.
Therefore, the finished human genome is a mosaic, not representing any one individual. The
utility of the project comes from the fact that the vast majority of the human genome is the same
in all humans.
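To give a sense of scale, the arithmetic below estimates how much storage a haploid reference genome needs. It assumes exactly three billion bases (rounded down from "more than three billion") and a two-bit encoding, since four bases can be distinguished with two bits each. Both figures are simplifying assumptions, and real file formats store additional information.

```python
# Rough storage estimate for one haploid reference genome.
BASES_IN_GENOME = 3_000_000_000  # assumption: rounded from "more than three billion"
BITS_PER_BASE = 2                # four possible bases fit in 2 bits (2**2 == 4)

size_bits = BASES_IN_GENOME * BITS_PER_BASE
size_bytes = size_bits // 8
size_megabytes = size_bytes / 1_000_000

print(f"{size_megabytes:.0f} MB")
```

Under these assumptions the sequence itself fits in well under a gigabyte.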
BENEFITS
The sequencing of the human genome holds benefits for many fields, from molecular
medicine to human evolution. The Human Genome Project, through its sequencing of the DNA,
can help researchers understand diseases including: genotyping of specific viruses to direct
appropriate treatment; identification of mutations linked to different forms of cancer; the design
of medication and more accurate prediction of their effects; advancement in forensic applied
sciences; biofuels and other energy applications; agriculture, animal
husbandry, bioprocessing; risk assessment; bioarcheology, anthropology and evolution. Another
proposed benefit is the commercial development of genomics research related to DNA-based
products, a multibillion-dollar industry.
The sequence of the DNA is stored in databases available to anyone on the Internet. The
U.S. National Center for Biotechnology Information (and sister organizations in Europe and
Japan) house the gene sequence in a database known as GenBank, along with sequences of
known and hypothetical genes and proteins. Other organizations, such as the UCSC Genome
Browser at the University of California, Santa Cruz, and Ensembl present additional data and
annotation and powerful tools for visualizing and searching it. Computer programs have been
developed to analyze the data because the data itself is difficult to interpret without such
programs. Generally speaking, advances in genome sequencing technology have
followed Moore's Law, a concept from computer science which states that integrated circuits can
increase in complexity at an exponential rate. This means that the speeds at which whole
genomes can be sequenced can increase at a similar rate, as was seen during the development of
the Human Genome Project.
APPLICATIONS OF BIOINFORMATICS
Bioinformatics has not only become essential for basic genomic and molecular biology
research, but is having a major impact on many areas of biotechnology and biomedical
sciences. The main uses of bioinformatics include:
• Bioinformatics plays a vital role in the areas of structural genomics, functional genomics,
and nutritional genomics.
• It covers emerging scientific research and the exploration of proteomes from the overall
level of intracellular protein composition (protein profiles), protein structure, protein-
protein interaction, and unique activity patterns (e.g. post-translational modifications).
• Bioinformatics is used for transcriptome analysis where mRNA expression levels can be
determined.
Molecular medicine:
A branch of medicine that develops ways to diagnose and treat disease by understanding the
way genes, proteins, and other cellular molecules work.
Molecular medicine is a broad field, where physical, chemical, biological, bioinformatics and
medical techniques are used to describe molecular structures and mechanisms, identify
fundamental molecular and genetic errors of disease, and to develop molecular interventions to
correct them. The molecular medicine perspective emphasizes cellular and molecular phenomena
and interventions rather than the previous conceptual and observational focus on patients and
their organs.
Personalized medicine
It is an emerging practice of medicine that uses an individual's genetic profile to guide decisions
made in regard to the prevention, diagnosis, and treatment of disease.
Personalized medicine, also referred to as precision medicine, is a medical model that separates
people into different groups, with medical decisions, practices, interventions and/or products
being tailored to the individual patient based on their predicted response or risk of disease.
The terms personalized medicine, precision medicine, stratified medicine and P4 medicine are used interchangeably to
describe this concept though some authors and organizations use these expressions separately to
indicate particular nuances.
While the tailoring of treatment to patients dates back at least to the time of Hippocrates, the
term has risen in usage in recent years given the growth of new diagnostic and informatics
approaches that provide understanding of the molecular basis of disease, particularly genomics.
This provides a clear evidence base on which to stratify (group) related patients.
Gene therapy
It is the alteration of genes inside the body's cells in an effort to treat or stop disease.
Gene therapy is a medical field which focuses on the genetic modification of cells to produce a
therapeutic effect or the treatment of disease by repairing or reconstructing defective genetic
material. The first attempt at modifying human DNA was performed in 1980, by Martin Cline,
but the first successful nuclear gene transfer in humans, approved by the National Institutes of
Health, was performed in May 1989. The first therapeutic use of gene transfer as well as the first
direct insertion of human DNA into the nuclear genome was performed by French Anderson in a
trial starting in September 1990. It is thought to be able to cure many genetic disorders or treat
them over time.
Gene doping
Athletes may adopt gene therapy technologies to improve their performance. Gene doping is
not known to occur, but multiple gene therapies may have such effects. Kayser et al. argue that
gene doping could level the playing field if all athletes receive equal access. Critics claim that
any therapeutic intervention for non-therapeutic/enhancement purposes compromises the ethical
foundations of medicine and sports.
Waste cleanup
Bacteria and other microbes are helpful in cleaning up waste. The bacterium Deinococcus
radiodurans is listed in the Guinness Book of World Records as the world's toughest known bacterium.
Biomedical waste or hospital waste is any kind of waste containing infectious (or potentially
infectious) materials. It may also include waste associated with the generation of biomedical
waste that visually appears to be of medical or laboratory origin (e.g. packaging, unused
bandages, infusion kits, etc.), as well as research laboratory waste containing biomolecules or
organisms that are mainly restricted from environmental release. As detailed below,
discarded sharps are considered biomedical waste whether they are contaminated or not, due to
the possibility of being contaminated with blood and their propensity to cause injury when not
properly contained and disposed of. Biomedical waste is a type of biowaste.
Biomedical waste may be solid or liquid. Examples of infectious waste include discarded blood,
sharps, unwanted microbiological cultures and stocks, identifiable body parts (including those as
a result of amputation), other human or animal tissue, used bandages and dressings, discarded
gloves, other medical supplies that may have been in contact with blood and body fluids, and
laboratory waste that exhibits the characteristics described above. Waste sharps include
potentially contaminated used (and unused discarded) needles, scalpels, lancets and other devices
capable of penetrating skin.
Biotechnology
Biotechnology is the use of biology to solve problems and make useful products. It includes
techniques such as cell and tissue culture, and it integrates natural science with the use of
organisms, cells, parts thereof, and molecular analogues for products and services.
Biotechnology is based on the basic biological sciences (e.g., molecular
biology, biochemistry, cell biology, embryology, genetics, microbiology) and conversely
provides methods to support and perform basic research in biology.
Biotechnology also covers laboratory research and development that uses bioinformatics for the
exploration, extraction, exploitation, and production of material from any living organism or
source of biomass by means of biochemical engineering. High value-added products can be
planned (reproduced by biosynthesis, for example), forecasted, formulated, developed,
manufactured, and marketed for the purpose of sustainable operations (to recover the large
initial investment in R&D) and to gain durable patent rights (exclusive rights for sales, which
first require national and international approval based on the results of animal and human
experiments, especially in the pharmaceutical branch of biotechnology, to prevent any undetected
side effects or safety concerns from using the products). The utilization of biological
processes, organisms or systems to produce products that are anticipated to improve human lives
is termed biotechnology.
Forensic analysis
Forensic analysis is the detailed process of detecting, investigating, and documenting the reason,
course, and consequences of a security incident or a violation of state and organization laws.
Forensic Data Analysis (FDA) is a branch of Digital forensics. It examines structured data with
regard to incidents of financial crime. The aim is to discover and analyze patterns
of fraudulent activities. Data from application systems or from their underlying databases is
referred to as structured data.
Unstructured data in contrast is taken from communication and office applications or from
mobile devices. This data has no overarching structure and analysis thereof means applying
keywords or mapping communication patterns. Analysis of unstructured data is usually referred
to as Computer forensics.
The analysis of large volumes of data is typically performed in a separate database system run by
the analysis team. Live systems are usually not dimensioned to run extensive individual analysis
without affecting the regular users. On the other hand, it is methodically preferable to analyze
data copies on separate systems and protect the analysis teams against the accusation of altering
original data.
Due to the nature of the data, the analysis focuses more often on the content of data than on the
database it is contained in. If the database itself is of interest then Database forensics are applied.
Importance of bioinformatics
Scientists mapped the genomes of various organisms in the laboratory, but their greatest
challenge was how to store and compile that huge volume of biological data. Computers had to be
used in research because there was no other way to store the data. Bioinformatics gave
researchers the means to store the data in databases on computers. The sequencing of DNA and its
observation, analysis and interpretation all depend on computers and could not be done manually.
If the data remains in raw form and is not compiled, even professional researchers cannot use it.
Bioinformatics has enabled scientists to build tools that extract this information from the
databases for research purposes. Bioinformatics tools can be used for three purposes.
The 2020 coronavirus pandemic showed that rapid data analysis and interpretation are much
more powerful in helping to control the spread when the data is shared quickly and openly.
But it’s not all about producing new data, when so much already exists. Analyzing data is
hugely important.
Sharing the results of this requires “showing your working”: the data you used, the
methods you employed, the software you used (with versions and parameters). This all
takes time and effort, and bioinformaticians can help.
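One lightweight way to "show your working" is to save a small record next to each result. The sketch below is only an illustration under stated assumptions: the field names, the method label and the parameter are all hypothetical, and the checksum simply identifies which input data was used. It relies only on the Python standard library.

```python
import hashlib
import json
import platform


def provenance_record(data, method, parameters):
    """Describe what was analysed, how, and with which software version."""
    return {
        "data_sha256": hashlib.sha256(data).hexdigest(),  # fingerprint of the input data
        "method": method,
        "parameters": parameters,
        "python_version": platform.python_version(),
    }


# Hypothetical input and settings, invented for this example.
record = provenance_record(b"ATGCCGTA\n", "mismatch scan", {"min_length": 8})
print(json.dumps(record, indent=2))
```

Sharing such a record with the results lets another researcher check that they are using the same data, method and settings.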
The field of computer science called bioinformatics is used to analyze whole-genome
sequencing data.
This involves algorithm, pipeline and software development, and analysis, transfer and
storage/database development of genomics data.
Future of bioinformatics
Apart from analysis of genome sequence data, bioinformatics is now being used for a vast array
of other important tasks, including analysis of gene variation and expression, analysis and
prediction of gene and protein structure and function, prediction and detection of gene regulation
networks, and simulation environments.
Bioinformatics is essential for the analysis of data in modern biology and medicine.
It provides central, globally accessible databases that enable scientists to submit,
search and analyze information.
This global collaboration is likely to grow substantially over the next decade, so learning
bioinformatics now should help open the way to international collaboration.
Conclusion
As the field of systems biology advances alongside personalized medicine, scientists will be
able to apply tools and innovations to understand disease mechanisms that extend from the
molecular, cellular, tissue, and organ levels to the individual and, ultimately, the
population level.
Bioinformatics discoveries can be converted into advances that are adopted by the healthcare
system and the biomedical industry in the form of diagnostic kits, analysis programs, and so
forth.
With the help of AI, we can now classify different types of bioinformatics data for disease
identification, which also helps to find the genetic cause of a particular disease.
Webliography
www.bioinformatics.in
www.bioinformatics.wikipidea
Bibliography
Introduction to bioinformatics
Bioinformatics applications and principles
Bioinformatics for dummies
Bioinformatics for beginners
Thank you