Bioinformatics Combines Computer Programming
Bioinformatics combines computer programming, big data, and biology to help scientists understand
and identify patterns in biological data. It is particularly useful in studying genomes and DNA sequencing,
as it allows scientists to organize large amounts of data.
What is a database?
A database is an organized collection of data. Let us look at the kinds of databases used in bioinformatics.
Biological databases can be broadly classified as sequence and structure databases. Structure databases
are for protein structures, while sequence databases are for nucleic acid and protein sequences.
Data science is the study of data to extract meaningful insights for business. It is a multidisciplinary
approach that combines principles and practices from the fields of mathematics, statistics, artificial
intelligence, and computer engineering to analyze large amounts of data. This analysis helps data
scientists to ask and answer questions like what happened, why it happened, what will happen, and
what can be done with the results.
Now Naveen will explain what data science can do.
With data science, we can explore hidden patterns in data using descriptive statistics and data
visualization. From there, we can apply statistics to draw inferences from the given data. Finally, we can
apply machine learning to make data-driven predictions on new data.
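To make the first step concrete, here is a minimal sketch of descriptive statistics using only Python's standard library. The numbers in `values` are made up purely for illustration and do not come from any real data set.

```python
import statistics

# Hypothetical measurements, invented purely for illustration.
values = [2.0, 4.0, 4.0, 5.0, 10.0]

# Descriptive statistics summarize the data before any modeling is done.
summary = {
    "count": len(values),
    "mean": statistics.mean(values),
    "median": statistics.median(values),
    "stdev": statistics.stdev(values),  # sample standard deviation
    "min": min(values),
    "max": max(values),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

A summary like this is usually the starting point; charts and statistical inference build on it.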
Now Akshaya will explain the skill sets needed to be a data scientist.
First of all, every company looks for soft skills such as communication, creativity, critical
thinking, writing, and much more. Then you need programming knowledge: you need to
know languages like Python, C/C++, Java, SQL, and Julia. Next, you need an understanding of machine
learning tasks such as classification and regression. After that, you need a grounding in statistics, data preprocessing,
software engineering, data visualization, and mathematics.
Over to Hemchand.
So now we have an idea of what data science is. With this knowledge, we can apply it to drug
discovery.
With data science, we can relate the chemical structure of compounds to their biological activity and build
models that predict the biological activity of query compounds. These models can be used to repurpose existing
FDA-approved drugs for new therapeutic uses, and they can also be applied to developing personalized medicine.
A quantitative structure-activity relationship (QSAR) model is a mathematical model that seeks to discern the
relationship between the chemical structure of compounds and their bioactivity.
To create a data set, we need variables: the input variables x and the output variable y. With these we can
build a table, which makes the data easier to analyze.
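As a rough sketch of what such a table could look like in code, the example below stores each row with two input variables and one output variable, then separates them into x and y. The column names `x1`, `x2`, and `y` and all the numbers are hypothetical, chosen only for illustration.

```python
# Hypothetical rows: each has two input variables (x1, x2) and one output (y).
rows = [
    {"x1": 1.0, "x2": 0.5, "y": 3.1},
    {"x1": 2.0, "x2": 1.5, "y": 5.9},
    {"x1": 3.0, "x2": 2.5, "y": 9.2},
]

feature_names = ["x1", "x2"]

# Separate the table into the input matrix X and the output column y.
X = [[row[name] for name in feature_names] for row in rows]
y = [row["y"] for row in rows]

# Print the data set as a simple tab-separated table.
header = feature_names + ["y"]
print("\t".join(header))
for row in rows:
    print("\t".join(str(row[name]) for name in header))
```

Keeping x and y in separate structures is a common convention, because most modeling steps take the inputs and the output as two separate arguments.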
Exploratory data analysis (EDA) gives a preliminary understanding of the data set. It has three major approaches.
Finally, we all know about data visualization, which represents the data set in different kinds of charts.
And if you are interested in these topics, you can read up on data splitting and cross-validation when
you are free.
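For anyone who wants a head start on those two topics, here is a small sketch in plain Python. The helper names `train_test_split` and `k_fold_indices` are our own for this example (they are not taken from any library here), and the data is just the numbers 0 to 7.

```python
import random

def train_test_split(items, test_fraction=0.25, seed=0):
    """Shuffle a copy of the items and hold out a fraction of them for testing."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def k_fold_indices(n_items, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_items))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size

data = list(range(8))  # stand-in for eight rows of a data set
train, test = train_test_split(data)
folds = list(k_fold_indices(len(data), k=4))

print("train:", train, "test:", test)
for train_idx, val_idx in folds:
    print("fold train:", train_idx, "validation:", val_idx)
```

Data splitting holds some rows back so the model is checked on data it has not seen. Cross-validation repeats that idea k times, so every row is used for validation exactly once.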
Over to Naveen.
So, there are three types of learning. We will deal with them in order.
Supervised learning is a machine learning task that establishes the mathematical relationship
between the input variable x and the output variable y.
Such x, y pairs constitute the labeled data that is used for model building in an effort to learn how to
predict the outcome from the input.
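One of the simplest examples of this is fitting a straight line to labeled x, y pairs by ordinary least squares. The sketch below is only an illustration: the function name `fit_line` is our own, and the data is made up so that it lies exactly on the line y = 2x + 1.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept to labeled pairs by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical labeled data: each x comes with its known output y.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)

def predict(x):
    """Predict the outcome for a new input using the learned relationship."""
    return slope * x + intercept

print("slope:", slope, "intercept:", intercept)
print("prediction for x = 5:", predict(5.0))
```

The model learns the relationship from the labeled pairs and then uses it to predict y for an x it has not seen before.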
Unsupervised learning is a machine learning task that makes use of only x variables. Such variables are
unlabeled data that the learning algorithm uses in modeling the inherent structure of data.
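A common example of this is clustering. Below is a minimal sketch of k-means on one-dimensional data, using only x values and no labels. The function name `k_means_1d`, the points, and the starting centers are all assumptions made for this illustration.

```python
def k_means_1d(points, centers, iterations=10):
    """Group 1-D points around centers by repeated assign-and-update steps."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        # An empty cluster keeps its previous center.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters

# Hypothetical unlabeled data with two visible groups.
points = [1.0, 1.2, 0.8, 8.0, 8.3, 7.7]
centers, clusters = k_means_1d(points, centers=[0.0, 10.0])

print("centers:", centers)
print("clusters:", clusters)
```

Nobody told the algorithm which group each point belongs to; it found the structure from the x values alone.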
Reinforcement learning is a machine learning task that decides on the next course of action and it does
this by learning through trial and error in an effort to maximize the reward.
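To give a feel for trial and error, here is a small sketch of an epsilon-greedy agent choosing between two actions with different chances of reward. This is just one simple reinforcement learning setup, picked for illustration. The function name `run_bandit`, the reward probabilities, and the parameter values are all assumptions.

```python
import random

def run_bandit(reward_probabilities, steps=2000, epsilon=0.1, seed=0):
    """Learn which action pays best by trial and error (epsilon-greedy)."""
    rng = random.Random(seed)
    n_actions = len(reward_probabilities)
    counts = [0] * n_actions        # how often each action was tried
    estimates = [0.0] * n_actions   # running average reward per action
    total_reward = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action.
            action = rng.randrange(n_actions)
        else:
            # Exploit: take the action with the best estimate so far.
            action = max(range(n_actions), key=lambda a: estimates[a])
        # The environment returns a reward of 1 or 0.
        reward = 1 if rng.random() < reward_probabilities[action] else 0
        counts[action] += 1
        # Incremental update of the average reward for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
        total_reward += reward
    return estimates, counts, total_reward

# Hypothetical environment: action 0 pays 20% of the time, action 1 pays 80%.
estimates, counts, total_reward = run_bandit([0.2, 0.8])

print("estimated rewards:", estimates)
print("times each action was chosen:", counts)
print("total reward:", total_reward)
```

The agent is never told which action is better. It decides on each next action from the rewards it has collected so far, while still exploring now and then.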
Over to Hemchand.
Here are some preferred ways to approach data science.
No-code means that you do not have to write any code; you work with tools or applications based on a
graphical user interface.
Low-code means using automated machine learning software, which needs only a little code to clean
the data and work with the data sets.
Finally, the other way is to write code in programming languages to create the database and carry out every
task yourself.