What Is Data Science? A Beginner’s Guide To Data Science

ZaranTech
As the world entered the era of big data, the need to store that data grew with
it. Until about 2010, storage was the main challenge and concern for
enterprises, and the focus was on building frameworks and solutions to store
data. Now that Hadoop and other frameworks have largely solved the storage
problem, the focus has shifted to processing this data. Data Science is the
secret sauce here. Many of the ideas you see in Hollywood sci-fi movies could
become reality through Data Science. Data Science is the future of Artificial
Intelligence. It is therefore important to understand what Data Science is and
how it can add value to your business.

In this blog, I will cover the following topics:

The need for Data Science.
What is Data Science?
How is it different from Business Intelligence (BI) and Data Analysis?
The lifecycle of Data Science, illustrated with a use case.

By the end of this blog, you will understand what Data Science is and its role
in extracting meaningful insights from the large, complex sets of data all
around us.

Let’s Understand Why We Need Data Science

Traditionally, the data we had was mostly structured and small in size, so it
could be analyzed with simple BI tools. Today, by contrast, most data is
unstructured or semi-structured. The data trends in the image below show that
by 2020, more than 80% of data will be unstructured.

This data is generated from many sources, such as financial logs, text files,
multimedia forms, sensors, and instruments. Simple BI tools cannot process
this volume and variety of data. This is why we need more complex and advanced
analytical tools and algorithms to process and analyze it and to draw
meaningful insights from it.

This is not the only reason why Data Science has become so popular. Let’s dig
deeper and see how Data Science is being used in various domains.

What if you could understand the precise requirements of your customers from
existing data such as their past browsing history, purchase history, age and
income? No doubt you had all this data earlier too, but with today’s volume
and variety of data you can train models more effectively and recommend
products to your customers with more precision. That matters because it brings
more business to your organization.

Let’s take a different scenario to understand the role of Data Science in
decision making. What if your car had the intelligence to drive you home?
Self-driving cars collect live data from sensors, including radars, cameras
and lasers, to create a map of their surroundings. Based on this data, the car
uses advanced machine learning algorithms to decide when to speed up, when to
slow down, when to overtake and where to turn.

Let’s see how Data Science can be used in predictive analytics, taking weather
forecasting as an example. Data from ships, aircraft, radars and satellites
can be collected and analyzed to build models. These models not only forecast
the weather but also help predict natural calamities, so that appropriate
measures can be taken beforehand to save many precious lives.
Let’s have a look at the infographic below to see all the domains where Data
Science is making an impact.

Now that you understand the need for Data Science, let’s understand what Data
Science is.


What is Data Science?

The term Data Science is increasingly common, but what exactly does it mean?
What skills do you need to become a Data Scientist? What is the difference
between BI and Data Science? How are decisions and predictions made in Data
Science? These are some of the questions answered below.

First, let’s see what Data Science is. Data Science is a blend of various
tools, algorithms, and machine learning principles with the goal of
discovering hidden patterns in raw data. How is this different from what
statisticians have been doing for years?

The answer lies in the difference between explaining and predicting.

As the image above shows, a Data Analyst usually explains what is going on by
processing the history of the data. A Data Scientist, on the other hand, not
only does exploratory analysis to discover insights from the data, but also
uses advanced machine learning algorithms to predict the occurrence of a
particular event in the future. A Data Scientist will look at the data from
many angles, sometimes angles not known earlier.

So, Data Science is primarily used to make decisions and predictions, using
predictive causal analytics, prescriptive analytics (predictive plus decision
science) and machine learning.

Predictive causal analytics: If you want a model that can predict the
likelihood of a particular event in the future, you need to apply predictive
causal analytics. Say you are lending money on credit; then the probability of
customers making future credit payments on time is a matter of concern for
you. Here, you can build a model that performs predictive analytics on the
customer’s payment history to predict whether future payments will be on time.

Prescriptive analytics: If you want a model that has the intelligence to take
its own decisions and the ability to modify them with dynamic parameters, you
need prescriptive analytics. This relatively new field is all about providing
advice. In other words, it not only predicts but also suggests a range of
prescribed actions and associated outcomes. The best example of this is
Google’s self-driving car, which I discussed earlier. The data gathered by
vehicles can be used to train self-driving cars. You can run algorithms on
this data to bring intelligence to it. This enables your car to take decisions
such as when to turn, which path to take, and when to slow down or speed up.

Machine learning for making predictions: If you have the transactional data of
a finance company and need to build a model to determine the future trend,
then machine learning algorithms are the best bet. This falls under the
paradigm of supervised learning. It is called supervised because you already
have labeled data on which you can train your machines. For example, a fraud
detection model can be trained using a historical record of fraudulent
purchases.
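
Both the credit-payment and the fraud-detection examples above are supervised problems: past records carry a known outcome, and the model learns from them. Here is a minimal sketch of that idea in plain Python, assuming a single made-up feature (the fraction of a customer's past payments that were late) and a one-feature logistic regression. The data, the feature and the names `HISTORY`, `train` and `prob_on_time` are all assumptions for illustration, not something taken from a real lender.

```python
import math

# Hypothetical training data (an assumption, not from any real lender):
# (fraction of past payments that were late, 1 if the next payment was on time else 0)
HISTORY = [(0.0, 1), (0.1, 1), (0.2, 1), (0.7, 0), (0.8, 0), (0.9, 0)]


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train(rows, learning_rate=0.5, epochs=2000):
    """Fit a one-feature logistic regression with batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(rows)
    for _ in range(epochs):
        # Gradients of the average log-loss with respect to w and b.
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in rows) / n
        grad_b = sum((sigmoid(w * x + b) - y) for x, y in rows) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def prob_on_time(model, late_fraction):
    """Estimated probability that the next payment arrives on time."""
    w, b = model
    return sigmoid(w * late_fraction + b)


model = train(HISTORY)
reliable = prob_on_time(model, 0.05)  # customer who was rarely late
risky = prob_on_time(model, 0.85)     # customer who was usually late
```

With this toy data the model gives the rarely-late customer a probability above 0.5 and the usually-late customer a probability below 0.5. A real model would use many more features and records, and would be checked on data it has not seen.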

Machine learning for pattern discovery: If you don’t have the parameters on
which to base predictions, you need to find the hidden patterns within the
dataset to be able to make meaningful predictions. This is unsupervised
learning, as you don’t have any predefined labels for grouping. The most
common technique used for pattern discovery is clustering. Let’s say you work
for a telephone company and need to establish a network by putting towers in a
region. You can use clustering to find tower locations that ensure all users
receive optimum signal strength.
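
To make the tower example concrete, here is a small sketch of k-means clustering in plain Python. It assumes that subscribers are given as made-up (x, y) coordinates and that a cluster centre is a reasonable stand-in for a tower site; real network planning would also account for terrain, capacity and signal propagation. The function name `kmeans` and the sample coordinates are illustrative assumptions.

```python
import math


def kmeans(points, k, iterations=100):
    """Plain k-means (Lloyd's algorithm) on 2-D points."""
    centroids = list(points[:k])  # naive deterministic start: the first k points
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        updated = []
        for i, cluster in enumerate(clusters):
            if cluster:
                xs, ys = zip(*cluster)
                updated.append((sum(xs) / len(xs), sum(ys) / len(ys)))
            else:
                updated.append(centroids[i])  # keep an empty cluster's centroid
        if updated == centroids:
            break  # converged: no centroid moved
        centroids = updated
    return centroids


# Hypothetical subscriber coordinates (in km) forming two neighbourhoods.
subscribers = [(1, 1), (1, 2), (2, 1), (2, 2), (9, 9), (9, 10), (10, 9), (10, 10)]
towers = kmeans(subscribers, k=2)
```

No labels are supplied anywhere: the two groups emerge from the coordinates alone, which is what makes this unsupervised. Starting from the first k points keeps the sketch deterministic; practical implementations usually pick the starting centroids more carefully.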
Let’s see how the proportions of the approaches described above differ between
Data Analysis and Data Science. As you can see in the image below, Data
Analysis includes descriptive analytics and, to a certain extent, prediction.
Data Science, on the other hand, is more about predictive causal analytics and
machine learning.

You have probably heard of Business Intelligence (BI) too. Data Science is
often confused with BI. I will state some concise and clear contrasts between
the two to help you get a better understanding. Let’s have a look.

Business Intelligence (BI) vs. Data Science

BI analyzes past data to find hindsight and insight that describe business
trends. BI enables you to take data from external and internal sources,
prepare it, run queries on it and create dashboards to answer questions such
as quarterly revenue analysis or business problems. BI can evaluate the impact
of certain events in the near future.

Data Science is a more forward-looking, exploratory approach that focuses on
analyzing past or current data and predicting future outcomes with the aim of
making informed decisions. It answers open-ended questions about “what” events
occur and “how” they occur.

Let’s have a look at some contrasting features.

Features      | Business Intelligence (BI)                      | Data Science
Data Sources  | Structured (usually SQL, often Data Warehouse)  | Both structured and unstructured (logs, cloud data, SQL, NoSQL, text)
Approach      | Statistics and Visualization                    | Statistics, Machine Learning, Graph Analysis, Natural Language Processing (NLP)
Focus         | Past and Present                                | Present and Future
Tools         | Pentaho, Microsoft BI, QlikView, R              | RapidMiner, BigML, Weka, R

That was all about what Data Science is. Now let’s understand the lifecycle of
Data Science.

A common mistake in Data Science projects is rushing into data collection and
analysis without understanding the requirements or even framing the business
problem properly. It is therefore very important to follow all the phases of
the Data Science lifecycle to ensure the project runs smoothly.


Lifecycle of Data Science

Here is a brief overview of the main phases of the Data Science Lifecycle:


Phase 1 (Discovery): Before you begin the project, it is important to
understand the various specifications, requirements, priorities and required
budget. You must be able to ask the right questions. Here, you assess whether
you have the required resources in terms of people, technology, time and data
to support the project. In this phase, you also need to frame the business
problem and formulate initial hypotheses (IH) to test.

Phase 2 (Data preparation): In this phase, you need an analytical sandbox in
which you can perform analytics for the entire duration of the project. You
need to explore, preprocess and condition data prior to modeling. You will
also perform ETLT (extract, transform, load and transform) to get data into
the sandbox. Let’s have a look at the Statistical Analysis flow below.

You can use R for data cleaning, transformation, and visualization. This will
help you to spot the outliers and establish a relationship between the
variables. Once you have cleaned and prepared the data, it’s time to do
exploratory analytics on it. Let’s see how you can achieve that.

Phase 3 (Model planning): Here, you will determine the methods and techniques
to draw the relationships between variables. These relationships will set the
base for the algorithms that you will implement in the next phase. You will
apply Exploratory Data Analysis (EDA) using various statistical formulas and
visualization tools.
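
One of the simplest statistical formulas for checking a relationship between two variables is the Pearson correlation coefficient. The sketch below computes it in plain Python on made-up values. The function name `pearson` and the sample lists are assumptions for illustration, and Pearson correlation only measures linear relationships, so it is a starting point rather than a complete EDA.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values for three variables: y rises with x, z falls with x.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
z = [10, 8, 6, 4, 2]

r_xy = pearson(x, y)  # close to +1: strong positive linear relationship
r_xz = pearson(x, z)  # close to -1: strong negative linear relationship
```

A value near zero would suggest no linear relationship, although the variables could still be related in a non-linear way, which is where plots help.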

Let’s have a look at various model planning tools.

1. R has a complete set of modeling capabilities and provides a good
environment for building interpretive models.
2. SQL Analysis Services can perform in-database analytics using common
data mining functions and basic predictive models.
3. SAS/ACCESS can be used to access data from Hadoop and is used for
creating repeatable and reusable model flow diagrams.

Although many tools are available in the market, R is the most commonly used.

Now that you have insight into the nature of your data and have decided which
algorithms to use, the next stage is to apply them and build a model.

Phase 4 (Model building): In this phase, you will develop datasets for
training and testing purposes. You will consider whether your existing tools
will suffice for running the models or whether you need a more robust
environment (such as fast, parallel processing). You will analyze various
learning techniques such as classification, association and clustering to
build the model.
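
Developing datasets for training and testing usually starts with a random split of the prepared records. Here is a minimal sketch in plain Python, assuming a 75/25 split and a fixed seed so the result is repeatable. The function name `train_test_split`, the fraction and the seed are illustrative choices, not values prescribed by the lifecycle.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=42):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    shuffled = list(rows)                  # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is repeatable
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


records = list(range(20))  # stand-in for 20 prepared records
train_rows, test_rows = train_test_split(records)
```

The model is fitted only on `train_rows`, and `test_rows` is held back to check how well it generalizes.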

You can achieve model building through the following tools.

Phase 5 (Operationalize): In this phase, you deliver final reports, briefings,
code and technical documents. In addition, a pilot project is sometimes
implemented in a real-time production environment. This gives you a clear
picture of the performance and other related constraints on a small scale
before full deployment.

Phase 6 (Communicate results): Now it is important to evaluate whether you
have achieved the goal you planned in the first phase. So, in the last phase,
you identify all the key findings, communicate them to the stakeholders and
determine whether the results of the project are a success or a failure based
on the criteria developed in Phase 1.

Now, I will take a case study to explain the various phases described above.

Case Study: Diabetes Prevention

What if we could predict the occurrence of diabetes and take appropriate
measures beforehand to prevent it? In this use case, we will predict the
occurrence of diabetes making use of the entire lifecycle that we discussed
earlier. Let’s go through the various steps.

Step 1:

First, we will collect the data based on the medical history of the patient as
discussed in Phase 1. You can refer to the sample data below.

As you can see, we have the various attributes as mentioned below.

Attributes:

1. npreg – Number of times pregnant
2. glucose – Plasma glucose concentration
3. bp – Blood pressure
4. skin – Triceps skinfold thickness
5. bmi – Body mass index
6. ped – Diabetes pedigree function
7. age – Age
8. income – Income

Step 2:

Now that we have the data, we need to clean and prepare it for analysis. This
data has a lot of inconsistencies, such as missing values, blank columns,
implausible values and incorrect data formats, which need to be cleaned. Here,
we have organized the data into a single table under different attributes,
making it look more structured. Let’s have a look at the sample data below.

This data has the following inconsistencies:

1. In the column npreg, “one” is written in words, whereas it should be in
numeric form, like 1.
2. In the column bp, one of the values is 6600, which is impossible (at least
for humans), as bp cannot reach such a huge value.
3. The income column is blank and has no bearing on predicting diabetes. It is
therefore redundant and should be removed from the table.

So, we will clean and preprocess this data by removing the outliers, filling
in the null values and normalizing the data types. If you remember, this is
our second phase, data preparation. Finally, we get the clean data shown
below, which can be used for analysis.
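
The three fixes described in this step can be sketched in a few lines of plain Python. The sketch assumes each patient row is a dictionary, uses made-up sample rows, and picks an arbitrary plausibility range for bp. The names `WORD_TO_NUMBER`, `BP_PLAUSIBLE` and `clean_record` are illustrative assumptions, and filling in null values is left out to keep the example short.

```python
WORD_TO_NUMBER = {"zero": 0, "one": 1, "two": 2, "three": 3}  # illustrative subset
BP_PLAUSIBLE = (20, 300)  # assumed plausibility bounds, not taken from the dataset


def clean_record(raw):
    """Return a cleaned copy of one patient record, or None if it must be dropped."""
    # Drop the blank income column.
    record = {key: value for key, value in raw.items() if key != "income"}
    # Normalize npreg to an integer, whether it arrived as "one", "3" or 3.
    npreg = record["npreg"]
    if isinstance(npreg, str):
        text = npreg.strip().lower()
        record["npreg"] = WORD_TO_NUMBER[text] if text in WORD_TO_NUMBER else int(text)
    # Treat impossible blood-pressure readings such as 6600 as outliers.
    low, high = BP_PLAUSIBLE
    if not low <= record["bp"] <= high:
        return None
    return record


# Hypothetical raw rows showing the three problems described above.
raw_rows = [
    {"npreg": "one", "glucose": 85, "bp": 66, "income": None},
    {"npreg": 6, "glucose": 148, "bp": 6600, "income": None},
    {"npreg": "3", "glucose": 120, "bp": 70, "income": None},
]
cleaned = [r for r in (clean_record(row) for row in raw_rows) if r is not None]
```

Here the row with bp 6600 is dropped entirely; depending on the project, you might instead replace the bad value with a typical one and keep the rest of the row.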

Step 3:

Now let’s do some analysis as discussed earlier in Phase 3.

First, we will load the data into the analytical sandbox and apply various
statistical functions to it. For example, R offers functions like describe
(provided by add-on packages such as Hmisc), which gives us the number of
missing values and unique values. We can also use the summary function, which
gives us statistical information such as the mean, median and range (min and
max values). Then, we use visualization techniques such as histograms, line
graphs and box plots to get a fair idea of the distribution of the data.
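
The same kind of summary can be sketched in plain Python with the standard statistics module. This is only an illustration of the idea, not a replacement for the R functions mentioned above; the function name `summarize` and the sample glucose values are assumptions.

```python
import statistics


def summarize(values):
    """Summary statistics for one column; None marks a missing value."""
    present = [v for v in values if v is not None]
    return {
        "missing": len(values) - len(present),
        "unique": len(set(present)),
        "mean": statistics.mean(present),
        "median": statistics.median(present),
        "min": min(present),
        "max": max(present),
    }


glucose = [85, 148, None, 89, 137, 85]  # hypothetical column values
report = summarize(glucose)
```

Running a summary like this on every column quickly shows which attributes still have gaps or suspicious extremes before any model is built.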

Step 4:

Now, based on the insights derived from the previous step, the best fit for
this kind of problem is a decision tree. Let’s see why.

Since we already have the major attributes for analysis, such as npreg and
bmi, we will use a supervised learning technique to build the model. We have
chosen a decision tree in particular because it takes all attributes into
consideration in one go, both those that have a linear relationship and those
that have a non-linear relationship. In our case, there is a linear
relationship between npreg and age, and a non-linear relationship between
npreg and ped. Decision tree models are also very robust, as we can use
different combinations of attributes to build various trees and then implement
the one that performs best. Let’s have a look at our decision tree.

Here, the most important parameter is the level of glucose, so it is our root
node. Now, the current node and its value determine the next important
parameter to be taken. It goes on until we get the result in terms of pos or neg.
Pos means the tendency of having diabetes is positive and neg means the
tendency of having diabetes is negative.
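
One common way a decision tree picks its root node is to try every attribute and threshold and keep the split that leaves the purest groups, measured here with Gini impurity. The sketch below shows that selection step in plain Python on made-up patient rows. The criterion actually used for the tree above is not stated, so treat the Gini measure, the function names `gini` and `best_split`, and the sample values as assumptions.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    if not labels:
        return 0.0
    total = len(labels)
    return 1.0 - sum((labels.count(c) / total) ** 2 for c in set(labels))


def best_split(rows, features, target="label"):
    """Return (feature, threshold, weighted Gini) of the purest binary split."""
    best = None
    for feature in features:
        values = sorted({row[feature] for row in rows})
        # Candidate thresholds are midpoints between neighbouring values.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [r[target] for r in rows if r[feature] <= threshold]
            right = [r[target] for r in rows if r[feature] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[2]:
                best = (feature, threshold, score)
    return best


# Hypothetical patient rows, for illustration only.
patients = [
    {"glucose": 85, "bmi": 26.6, "label": "neg"},
    {"glucose": 89, "bmi": 28.1, "label": "neg"},
    {"glucose": 116, "bmi": 25.6, "label": "neg"},
    {"glucose": 148, "bmi": 33.6, "label": "pos"},
    {"glucose": 183, "bmi": 23.3, "label": "pos"},
    {"glucose": 137, "bmi": 43.1, "label": "pos"},
]
root = best_split(patients, ["glucose", "bmi"])
```

On these toy rows a glucose threshold separates pos from neg cleanly while no bmi threshold does, so glucose is chosen for the root. A full tree repeats the same search inside each branch until the groups are pure or too small to split.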

Step 5:

In this phase, we will run a small pilot project to check if our results are
appropriate. We will also look for performance constraints if any. If the results
are not accurate, then we need to replan and rebuild the model.
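
A simple way to decide whether the pilot results are accurate enough is to compare the model's predictions with what actually happened. The sketch below does this in plain Python, assuming pos/neg outcomes and an 80% accuracy target. The target, the function names `accuracy` and `needs_rebuild`, and the sample outcomes are illustrative assumptions.

```python
def accuracy(predicted, actual):
    """Fraction of predictions that match the actual outcomes."""
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


def needs_rebuild(predicted, actual, required_accuracy=0.8):
    """True when the pilot falls short of the (assumed) accuracy target."""
    return accuracy(predicted, actual) < required_accuracy


# Hypothetical pilot outcomes: one of the five predictions is wrong.
pilot_predicted = ["pos", "neg", "neg", "pos", "neg"]
pilot_actual = ["pos", "neg", "pos", "pos", "neg"]

pilot_accuracy = accuracy(pilot_predicted, pilot_actual)
rebuild = needs_rebuild(pilot_predicted, pilot_actual)
```

For a medical use case, accuracy alone can hide important mistakes, so in practice you would also look at how many real positives the model misses.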

Step 6:

Once we have executed the project successfully, we will share the output for
full deployment.

Being a Data Scientist is easier said than done. So, let’s see what you need
to be a Data Scientist. A Data Scientist requires skills from three major
areas, as shown below.

As you can see in the image above, you need to acquire various hard skills and
soft skills. You need to be good at statistics and mathematics to analyze and
visualize data. Needless to say, Machine Learning forms the heart of Data
Science, so you need to be good at it. You also need a solid understanding of
the domain you are working in to understand the business problems clearly.
Your task does not end here. You should be capable of implementing various
algorithms, which requires good coding skills. Finally, once you have reached
certain key decisions, it is important to communicate them to the
stakeholders. So, good communication will definitely add brownie points to
your skills.

In the end, it won’t be wrong to say that the future belongs to Data
Scientists. It is predicted that by the end of 2018 there will be a need for
around one million Data Scientists. More and more data will provide
opportunities to drive key business decisions. It is soon going to change the
way we look at the world deluged with data around us. Therefore, a Data
Scientist should be highly skilled and motivated to solve the most complex
problems.

I hope you enjoyed reading my blog and now understand what Data Science is.
Check out our Data Science certification training, which comes with
instructor-led live training and real-life project experience.


ZaranTech
ZaranTech is a US-based global IT training and consulting company that
provides focused individual and corporate e-learning programs. Our senior
trainers have more than eight years of experience in the fast-paced world of
Information Technology.
