R Programming Basics
Before you learn about Machine Learning, Data Science or Business Analytics,
you should understand why Data Science is needed. Every day, organizations
try to solve different business problems in order to grow.
Traditional Approach:
Data Driven:
Every company collects data, stores it and processes it. It then analyzes the data
from multiple sources using a range of statistical techniques and, once the analysis
is complete, presents the results in dashboards and reports.
This is the path from data to insight. It works like a roadmap that leads from data
collection to insight. In this process the analysis step is the most vital, which is why
we ultimately focus on the analytical part.
Around 2008-2010 there was a huge change in data. The volume of data grew sharply,
with tweets, Facebook posts, YouTube videos and so on. To store this data we needed
cheap commodity hardware, and the answer was Hadoop, where all of that data can
be stored. Then came MapReduce. Using the MapReduce technique we can process
that data in parallel, far faster than on a single machine.
Looking at the trend over the last ten years, the cost of storing data has steadily
decreased.
The third stage is machine learning. Using machine learning techniques
we can understand data and extract insight from it.
For example, Decision Trees and Neural Networks are two machine learning
algorithms that were invented decades ago, around the 1970s or earlier. At that time,
running a decision tree on 1 GB of data might have taken half an hour, whereas today,
with Hadoop, it can take a tiny fraction of that time. These are the three reasons
why we are using it.
1. Descriptive Analytics
2. Diagnostic Analytics
3. Predictive Analytics
4. Prescriptive Analytics
Moving along the graph from Descriptive to Prescriptive, the complexity increases,
as do the business value and the skill required.
Descriptive Analytics:
If you want to know what happened, for example how a particular business event
unfolded, that is Descriptive Analytics.
Because it requires minimal to no coding, Descriptive Analytics is the easiest technique
in data analytics.
It is the analysis of past (historical) data to understand trends and evaluate metrics over time.
Many sophisticated tools can handle Descriptive Analytics,
such as Tableau, QlikView, MicroStrategy and Google Analytics.
EXAMPLE:
1. Analyzing the past 6 months of sales data to identify the top 10 selling products
2. Analyzing customer comments on Twitter and counting positive and negative comments
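Both examples come down to counting over historical records. A minimal Python sketch follows, assuming made-up sales records and comments that have already been labelled as positive or negative. The data and the function name top_products are illustrative assumptions, not part of any particular tool.

```python
from collections import Counter

# Hypothetical sales records as (product, units sold); not real data.
sales = [
    ("pen", 120), ("notebook", 80), ("pen", 60),
    ("bag", 30), ("notebook", 45), ("lamp", 10),
]


def top_products(records, n=10):
    """Sum the units sold per product and return the n best sellers."""
    totals = Counter()
    for product, units in records:
        totals[product] += units
    return totals.most_common(n)


# Hypothetical comments, assumed to be labelled by sentiment already.
comments = ["positive", "negative", "positive", "positive"]
sentiment_counts = Counter(comments)

print(top_products(sales, n=3))
print(sentiment_counts)
```

Nothing is predicted here: the code only summarizes what has already happened, which is what makes it descriptive.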
Diagnostic Analytics:
If you go one level deeper and ask why something happened in the past, that is
called Diagnostic Analytics.
Predictive Analytics:
EXAMPLE:
1. Predicting future sales based on past historical data
2. Predicting whether a customer will buy a product or not
3. Predicting whether a customer will leave your organization or not
Prescriptive Analytics:
The final one is Prescriptive Analytics. It tells you what would happen and also
what you should do about it, just like a doctor's prescription.
Prescriptive Analytics specifies the best course of action for a business activity in the form of
the output of a prescriptive model.
It uses an optimization algorithm to create the final output, or prescription.
Prescriptive Analytics also requires sophisticated tools and techniques.
EXAMPLE:
Using these four processes, we can automate the whole decision process. But how?
We deal with lots of data using some technique, and based on that we
take a decision.
In Descriptive Analytics, a human (the marketing manager, or you) has to
take the decision based on the output of the descriptive model.
In Diagnostic Analytics the human input may be a bit less, but you still
have to take the decision based on the output.
In Predictive Analytics you take the decision based on the data, or rather on the
predictive output.
With a prescriptive model you do not need to work anything out: the model
provides everything, and you simply take the decision it recommends.
That is all for this introduction to Data Science with R. This brings an end to this
post. I encourage you to re-read the post to understand it completely if you haven't,
and THANK YOU.
Telecom:
We will try to understand whether a customer will churn or not, that is, whether a
customer in the telecom industry will leave the organization. Viewed by
Business Function, this is a Customer Analytics use case; viewed
by Technique, it is Predictive Modeling.
Similarly, each use case can be looked at by Industry, by Business Function
and by Technique. Here we give some analytics examples one by one.
BFSI:
Credit Scoring: It is based on the customer's purchasing behavior. Based on the
credit score, the bank can rate a customer and decide whether that
customer deserves a loan or credit card.
Fraud Detection: The bank can detect fraud by monitoring money
transactions and automatically send the customer a warning alert
that there may be fraud, so that the customer can act immediately.
Manufacturing:
Manufacturing has predictive maintenance. Traditionally, when a
machine fails you then have to fix it. With the machine learning
approach we predict where the next failure is going to happen, at what
time, and which machine is going to fail.
By Business Function
Customer Analytics:
Customer Segmentation: The model tells you who your premium
customers are, which customers want to leave your organization and which customers are
fine. Based on these insights you can design different types of marketing offers.
Churn Prediction: The model tells you which customers might leave your organization.
Propensity to Buy: The model tells you which products a customer
might be interested in, so that you can send them a mail or message or
proactively call them about buying.
Sales Analytics:
Sales Forecasting: We can predict your future sales. For example, based on the last 6 years
of sales data we can predict sales for the next 3 or 4 quarters, which is very
beneficial for planning.
Inventory Planning Analysis, Store Analytics and Up-sell or Cross-sell: All of these help
your growth, and predictive analysis makes you much more strategic in your
future planning.
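As a rough illustration of sales forecasting, the sketch below fits a straight-line trend to past quarterly sales by least squares and extends it into the future. The figures and the function names fit_trend and forecast are assumptions made for this example. Real forecasting models usually also account for seasonality and other factors.

```python
def fit_trend(values):
    """Least-squares line through (0, v0), (1, v1), ...; returns (slope, intercept)."""
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(values, periods):
    """Extend the fitted trend line for the given number of future periods."""
    slope, intercept = fit_trend(values)
    n = len(values)
    return [intercept + slope * (n + k) for k in range(periods)]


# Hypothetical quarterly sales; not real data.
quarterly_sales = [100, 110, 120, 130]
print(forecast(quarterly_sales, 3))
```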
Price Analytics:
Markdown Optimization Model: This model tells you when to give a discount and
how much discount to give.
Dynamic Pricing: It tells you the right price to set so that your profit is
maximized.
Discount Analytics and What-if Scenario Modelling: These also come under Price Analytics
and help you decide on discounts.
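The idea behind Dynamic Pricing can be sketched as a small optimization: try candidate prices and keep the one with the highest profit. The demand curve, the unit cost and the function names below are assumptions for illustration only. In practice the demand curve would be estimated from historical data.

```python
def demand(price):
    """Assumed linear demand curve: higher price, fewer units sold."""
    return max(0, 100 - 2 * price)


def profit(price, unit_cost=10):
    """Profit = margin per unit times units sold at that price."""
    return (price - unit_cost) * demand(price)


def best_price(candidates):
    """Return the candidate price with the highest profit."""
    return max(candidates, key=profit)


best = best_price(range(10, 51))
print(best, profit(best))
```

Under these assumed numbers the search settles on a price of 30. This is also a small instance of the prescriptive idea: the output is a recommended action, not just a prediction.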
Marketing Analytics: has many different kinds of use cases, perhaps more than you
would expect. The same techniques are used here across different use cases and
different industries.
Market Basket Analysis
Marketing Mix Modelling
Personalized Customer Offer
Social Media Analytics
These help you in the marketing process.
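To give a feel for Market Basket Analysis, the sketch below counts how often pairs of products are bought together. The baskets are made-up data and pair_counts is a hypothetical helper. Real market basket models go further and compute measures such as support and confidence.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets; not real data.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]


def pair_counts(baskets):
    """Count how many baskets contain each pair of products."""
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts


top_pair, times = pair_counts(baskets).most_common(1)[0]
print(top_pair, times)
```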
By Process
If you run a manufacturing company, your supply chain manager
is always thinking about forecasted demand. A machine learning model can help
here to forecast demand, and can also predict the price
of raw materials. Similarly, MRP (Materials Requirements
Planning) decides when you should increase your planning and by how much.
Inventory Optimization tells you how much inventory you
should keep so that it is neither too low nor too high. Transport
Cost Optimization helps you choose the right route. Finally, using all these
predictive models you can increase your revenue or decrease your cost.
By Technique
1. Descriptive
2. Diagnostic
3. Predictive
4. Prescriptive
With this framework, whatever industry you belong to, you can think through all
these kinds of use cases. Maybe you work in descriptive analytics,
creating charts in Excel day to day, without thinking about how you could
predict something or do prescriptive analytics. This Machine Learning course
gives you more: you do not need any other data, and using the
same data you can create Predictive Analytics use cases.
To give a couple of examples, take campaign conversion rate: when
you run a campaign or promotion and want to see how it is performing, you
create an Excel chart. That is descriptive, not predictive.
With predictive analytics, we can anticipate where the next network failure will
happen. We can also understand which customers are most likely to churn, so that you can
immediately call them and offer a special discount or promotion.
1. Data To Insight
2. What is Machine Learning?
3. How Does Machine Learning Work?
4. What is Big Data?
What is Data Science?
Data Science is simply the set of techniques by which we achieve Data
Driven Decisions. Whenever we take a decision based on the data rather than only on
our own knowledge, that is a Data Driven Decision, and
the science that makes this possible is Data Science.
After collecting, storing and processing the data, you have to analyze it. Here there
are two cases, described below. We may use any machine learning algorithm: sometimes
Forecasting, sometimes Predictive Modeling, Optimization, Simulation or
Artificial Intelligence. Whichever we use, we ultimately come up with a system.
You create rules and an algorithm, and you pass in new data. Suppose you have
written if-else statements and are trying to create a system that can
detect spam. You look at each incoming mail and
try to figure out whether it is spam or not. If it is spam, you put it into the
spam folder.
Traditional Programming:
If you tried this the Traditional Programming way, you would
create different rules by writing if-else statements, while loops
and so on. When new data arrives, which here is a new mail,
your program gives you an output: whether it is spam or not. But
this is a Rule Based approach. The rule accepts or
rejects each mail, but it does not adapt to the situation.
Suppose you have created a rule that any mail containing more than ten images
is spam. Unfortunately, you then receive a good mail from your friend
that contains more than 10 images but is not actually spam.
Your program marks it as spam, and even if you tell it that this is not spam,
it does not learn. Compare this with what happens in Gmail:
you train the model by saying that this is not spam, and the next time
you receive more than 10 images from your friend it does not mark the mail as
spam, because the Gmail spam system is ML based, not rule
based.
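The rule-based approach described above can be sketched in a few lines of Python. The mail representation (a dictionary with sender and images fields) and the function name are assumptions made for this illustration.

```python
def is_spam_rule_based(mail):
    """Fixed rule from the text: more than ten images means spam."""
    return mail["images"] > 10


# A legitimate mail from a friend that happens to contain 12 images.
friend_mail = {"sender": "friend@example.com", "images": 12}

# The rule flags the mail, and there is no way to teach it otherwise:
# the result stays the same however often the user says "not spam".
print(is_spam_rule_based(friend_mail))
```

The only way to change this behavior is for a programmer to edit the rule by hand.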
Machine Learning:
What is the difference between Rule Based and ML Based?
An ML Based system adapts to new situations and improves with each new piece of
data. A Rule Based system, on the other hand, accepts or rejects based on its rules
and does not adapt to the situation.
We have historical data, which here is simply historical email data:
the messages from the previous 6 months to 1 year. Whatever
mail came in is the historical data, and it has an output as well: whether each
mail was spam or not. Whenever you marked a mail as not spam, it was
placed in your inbox.
Now the data is used to train a Machine Learning algorithm. You pick an algorithm,
such as SVM; many algorithms are available. You pass in the
data and the output, and the machine learns by itself and creates its own
rules. An algorithm is simply a set of rules to be followed in calculations or
other problem-solving operations by a computer. The system might now say that if there
are more than 10 images the mail is normally spam, but if there are more
than ten images and the mail comes from your friend, it is not spam. This is possible
because you passed the historical data and output to the machine, and the system can
learn using the machine learning algorithm. So the next time such a mail comes from
your friend, it won't be marked as spam.
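To contrast with a fixed rule, here is a deliberately tiny sketch of a filter that changes its behavior based on labelled mail. It is a toy stand-in, not SVM or any real algorithm: it only remembers how often each sender was marked spam or not spam, and falls back to the image rule for senders it has not seen. The class and field names are assumptions for illustration.

```python
from collections import defaultdict


class LearnedSpamFilter:
    """Toy filter that adapts to user feedback per sender."""

    def __init__(self, image_limit=10):
        self.image_limit = image_limit
        # sender -> [times marked spam, times marked not spam]
        self.sender_counts = defaultdict(lambda: [0, 0])

    def learn(self, mail, is_spam):
        """Record one labelled mail (historical data plus its output)."""
        self.sender_counts[mail["sender"]][0 if is_spam else 1] += 1

    def predict(self, mail):
        """Use what was learned about the sender, else the image rule."""
        sender = mail["sender"]
        if sender in self.sender_counts:
            spam, not_spam = self.sender_counts[sender]
            if spam != not_spam:
                return spam > not_spam
        return mail["images"] > self.image_limit


spam_filter = LearnedSpamFilter()
friend_mail = {"sender": "friend@example.com", "images": 12}

before = spam_filter.predict(friend_mail)      # only the image rule applies
spam_filter.learn(friend_mail, is_spam=False)  # the user says "not spam"
after = spam_filter.predict(friend_mail)       # the filter has adapted
print(before, after)
```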
Here we do not write the rules ourselves; rather, we pass in an algorithm and the
system learns by itself. All the emails from your last 6 months are the historical data,
which is divided into two parts, Train (Training Set) and Test (Evaluation Set). For
example, five months of data form the training set and one month of data the evaluation
set. Then we try different algorithms, such as Linear Regression, Logistic Regression and
SVM, evaluate each model and, based on the output, choose the one with
the highest accuracy. That is how machine learning works.
Tomorrow you will have new data, which also becomes part of the historical data
and in the same way enters the training data set. So the model's training data
changes every day, even every second, and it predicts based on that. The
model is tested on unseen data (the Evaluation Set) and a model score is
calculated for each of the algorithms. The best algorithm is then chosen.
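The train, evaluate and choose workflow can be sketched as follows. The data is made up, and the two candidate models (a majority-label baseline and a learned image-count cutoff) are simple stand-ins chosen to keep the example self-contained. They are not the Logistic Regression or SVM models named above.

```python
def accuracy(model, rows):
    """Share of rows where the model's prediction matches the label."""
    correct = sum(1 for images, label in rows if model(images) == label)
    return correct / len(rows)


def fit_majority(train):
    """Baseline: always predict the most common training label."""
    spam_share = sum(label for _, label in train) / len(train)
    majority = spam_share >= 0.5
    return lambda images: majority


def fit_threshold(train):
    """Pick the image-count cutoff with the best training accuracy."""
    candidates = sorted({images for images, _ in train})
    best = max(candidates, key=lambda t: accuracy(lambda x: x > t, train))
    return lambda images: images > best


# Hypothetical history: month -> list of (image_count, is_spam).
history = {
    1: [(1, False), (15, True), (2, False)],
    2: [(0, False), (12, True), (3, False)],
    3: [(20, True), (1, False), (4, False)],
    4: [(2, False), (11, True), (0, False)],
    5: [(14, True), (3, False), (1, False)],
    6: [(13, True), (2, False), (16, True), (0, False)],
}

# Five months for training, the sixth month held back for evaluation.
train = [row for month in range(1, 6) for row in history[month]]
test = history[6]

models = {"majority": fit_majority(train), "threshold": fit_threshold(train)}
scores = {name: accuracy(model, test) for name, model in models.items()}
best_name = max(scores, key=scores.get)
print(scores, best_name)
```

The important point is that each model is scored on the month it never saw during training, and the model with the best score on that unseen data is the one kept.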
What is Big Data and the Connection between Big Data and Data Science
Velocity:
Data arrives with great Velocity, for example from GPS, Twitter
or Facebook. Every second there are huge numbers of tweets and
Facebook posts, and you have to store all of them. That is a big data
challenge, which is why you use a system such as Hadoop. Velocity deals with the
speed at which data flows in from sources such as business processes, application
logs, networks, social media sites, sensors, mobile devices etc.
Variety:
Similarly, Variety refers to the many types of data, which differ in nature.
Earlier, data was stored in a very structured format, but today much of the data
is unstructured. Videos and images, for example, are
unstructured data. The data can also be inconsistent
at times. Since Big Data is so varied in nature, understanding it and
getting insight out of it is very difficult. That is why we need
Decision Science, and why we need Machine Learning algorithms that learn by themselves:
it is not easy to extract the insight from the
data manually.