
000 Into Machine Learning

Uploaded by

tuanminh140309
Copyright
© © All Rights Reserved

Azure Machine Learning

What is Machine Learning ?

Using known data, develop a model to predict unknown data.


Microsoft & Machine Learning

• 1997: Hotmail launches. Which email is junk?
• 2008: Bing maps launches. What’s the best way to home?
• 2009: Bing search launches. Which searches are most relevant?
• 2010: Kinect launches. What does that motion “mean”?
• 2014: Skype Translator launches. What is that person saying?
• 2015: Azure Machine Learning GA. What will happen next?
Why Machine Learning?
Machine learning enables nearly every value proposition of web search.
• What is the probability of a click on each ad?
• What language?
• What is the intent?
• Which ads to show, and in what order?
• Are any of these pages malicious?
• Misspelled?
• What pages should we index?
• Which links are most likely to get clicked?
• What ad pricing will optimize revenue?
Image Analysis
Accent Color: Which border color is the best?
Accent Color: Analyze Image
Accent Color: Windows 10 Store
Text Analytics: User reviews (Positive | Negative)
Data Science
• Data Science is far too complex
• Cost of accessing/using efficient ML algorithms is high
• Comprehensive knowledge required on different tools/platforms to develop a complete ML project
• Difficult to put the developed solution into a scalable production stage
• Need a simpler/scalable method: Azure Machine Learning Service
Microsoft Azure Machine Learning

Make machine learning accessible to every enterprise, data scientist, developer, information worker, consumer, and device anywhere in the world.
From data to decisions and actions
Data -> Decision -> Action, in order of increasing value:
• Static reports: What happened?
• Interactive dashboards: Why did it happen?
• Predictions: What will happen?
• Recommendations: What should I do?
The decision step moves from a manual process, through decision support, to decision automation.
Transform data into intelligent action
DATA -> INTELLIGENCE -> ACTION
• Data sources: Business apps, Custom apps, Sensors and devices
• Information Management: Azure Data Factory, Azure Data Catalog, Azure Event Hub
• Big Data Stores: Azure Data Lake, Azure SQL Data Warehouse
• Machine Learning and Analytics: Azure Machine Learning, Azure HDInsight (Hadoop), Azure Stream Analytics
• Dashboards and Visualizations: Power BI
• Personal Digital Assistant: Cortana
• Perceptual Intelligence: Face, vision, Speech, text
• Business Scenarios: Recommendations, customer churn, forecasting, etc.
• Action: People, Automated Systems

Microsoft Azure Machine Learning
• Web based UI accessible from different browsers

• Share|collaborate to any other ML workspace

• Drag&Drop visual design|development

• Wide range of ML Algorithms catalog

• Extend with OSS R|Python scripts

• Share|Document with IPython|Jupyter

• Deploy|Publish|Scale rapidly (APIs)


Microsoft Azure Machine Learning
• ML Algorithms: Best of MS
• ML Studio: Data Scientist
• ML Operationalization: IT Professional
• ML APIs Marketplace: ISVs & Developers
Azure Machine Learning Ecosystem
• Provision Workspace: Get Azure Subscription, Create Workspace
• Build ML Model: Get/Prepare Data, Build/Edit Experiment, Create/Update Model, Evaluate Model Results
• Deploy as Web Service: Publish Web Service
• Publish an App: Azure Data Marketplace
API examples
• Green Score, Wealth Score, Giving Score
• Frequently Bought Together API
• Recommendations API
• Anomaly Detection API
• Lexicon Based Sentiment Analysis
• Forecasting: Exponential Smoothing
• Forecasting: ETS+STL
• Forecasting: AutoRegressive Integrated Moving Average (ARIMA)
• Binary Classifier API
• Cluster Model API
• Survival Analysis API
• Multivariate Linear Regression API
• Normal Distribution Quantile Calculator
• Binomial Distribution Quantile Calculator
• And more on datamarket.azure.com
Azure Machine Learning Service
Data -> Predictive model -> Operational web API in minutes
• Data: Blobs and Tables, Hadoop (HDInsight), Relational DB (Azure SQL DB)
• ML STUDIO: Integrated development environment for Machine Learning
• API: Model is now a web service that is callable by clients
• Monetize the API through our marketplace
What is Machine Learning ?

Using known data, develop a model to predict unknown data.

Known Data: Big enough archive, previous observations, past data
Unknown Data: Missing, unseen, not existing, future data
Model: Known data + Algorithms (ML algorithms)
EXAMPLE: Weather forecast sample
Known data:
…     …     …     …     …
1990  50°F  30°F  68°F  95°F
2000  48°F  29°F  70°F  98°F
2010  49°F  27°F  67°F  96°F
Unknown data (to be predicted by the model):
2020  ?     ?     ?     ?
Model (Regression)
Predict 2020 Summer
1990  50°F  30°F  68°F  95°F
2000  48°F  29°F  70°F  98°F
2010  49°F  27°F  67°F  96°F
(Chart: temperature axis from -26°F to 90°F)
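
The regression idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method Azure Machine Learning uses: it fits an ordinary least-squares line through the three known values in the last column of the table (assumed here to be the summer readings, since the slide asks for "2020 Summer") and extrapolates to 2020. The name fit_line is made up for this sketch.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Known data: years and the last column of the table
# (assumption: that column holds the summer temperatures).
years = [1990, 2000, 2010]
summer_f = [95.0, 98.0, 96.0]

slope, intercept = fit_line(years, summer_f)

# Unknown data: extrapolate the fitted line to 2020.
prediction_2020 = slope * 2020 + intercept
print(f"Predicted 2020 summer temperature: {prediction_2020:.1f}°F")
```

With only three observations the fitted trend is very rough; a real forecast would need far more known data.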



EXAMPLE
Model (Decision Tree)
(Figure: a decision tree that tests Age < 30, Income > $50K and Days Played > 728, and labels each leaf as "Xbox-One Customer" or "Not Xbox-One Customer".)
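
A decision tree amounts to nested if/else tests on feature values. The Python sketch below uses the three tests named on the slide (Age < 30, Income > $50K, Days Played > 728). Which branch leads to which leaf cannot be recovered from the slide, so the arrangement of branches here is an assumption made purely for illustration, as is the function name.

```python
def classify_customer(age, income, days_played):
    """Hypothetical tree: the branch layout is assumed, not taken from the slide."""
    if age < 30:
        # Younger customers: decide on income alone (assumed).
        if income > 50_000:
            return "Xbox-One Customer"
        return "Not Xbox-One Customer"
    # Customers aged 30 or more: income first, then play history (assumed).
    if income > 50_000:
        return "Xbox-One Customer"
    if days_played > 728:
        return "Xbox-One Customer"
    return "Not Xbox-One Customer"


print(classify_customer(age=25, income=60_000, days_played=10))
print(classify_customer(age=45, income=30_000, days_played=100))
```

In practice the tests and thresholds are learned from known data rather than written by hand.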
EXAMPLE
Model (Classification)
Classify a news article as (politics, sports, technology, health, …)

Politics Sports Tech Health



Known data (Training data)
Documents -> Labels: Tech, Health, Politics, Politics, Sports

Documents consist of unstructured text. Machine learning typically assumes a more structured format of examples.

Process the raw data



Known data (Training data)
Process each data instance to represent it as a feature vector
Documents -> Feature -> Labels: Tech, Health, Politics, Politics, Sports
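
One common way to turn unstructured text into a feature vector is a bag-of-words count. The slides do not say which representation is used, so the following Python sketch is only an assumed example; the vocabulary, the sample sentence and the function name are all invented for illustration.

```python
from collections import Counter


def bag_of_words(text, vocabulary):
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]


# Invented vocabulary: one position in the feature vector per word.
vocabulary = ["election", "match", "software", "vaccine"]

document = "The match ended late and the second match was cancelled"
features = bag_of_words(document, vocabulary)
print(features)
```

Each document becomes a fixed-length list of numbers, which is the structured format a learning algorithm can work with.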



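Feature vector

i.e.
Features: Age, Height/Weight, Blood Pressure, Heart Rate
Known data: {40, (180, 82), (11, 7), 70, …} : Healthy

The values inside the braces are the features (the feature vector) and "Healthy" is the label. Features and label together form one data instance.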
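
In code, a data instance is a feature vector paired with a label. A minimal Python sketch using the values from the slide follows; the class and field names are illustrative assumptions, and the slide does not state the unit of each number.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PatientInstance:
    """One data instance: features plus the known label (names are assumed)."""
    age: int
    height_weight: Tuple[int, int]
    blood_pressure: Tuple[int, int]
    heart_rate: int
    label: str

    def feature_vector(self) -> List[int]:
        """Flatten the features into a plain numeric list (label excluded)."""
        return [self.age, *self.height_weight, *self.blood_pressure, self.heart_rate]


instance = PatientInstance(40, (180, 82), (11, 7), 70, "Healthy")
print(instance.feature_vector(), "->", instance.label)
```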
Developing a Model
Training data (Documents, Labels: Tech, Health, Politics, Politics, Sports) -> Feature Vectors -> Base Model
Train the Model: Adjust Parameters



Model’s Performance
• Split: known data with true labels is divided into training data (80%) and test data (20%)
• Train the Model on the training data
• Detach the true labels from the test data
• Test the trained model with the features to get predicted labels
• Compare the true labels with the predictions
• Model’s Performance: difference between “True Labels” and “Predicted Labels”
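
The split, train and compare steps can be sketched as follows. Everything in this Python example is an assumption for illustration: the toy data, the 80/20 split by position, and the deliberately simple "model" (a nearest-centroid classifier on one numeric feature), which stands in for whatever algorithm is actually trained.

```python
def train(examples):
    """Learn one centroid (mean feature value) per label."""
    sums, counts = {}, {}
    for x, label in examples:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(model, x):
    """Return the label whose centroid is closest to x."""
    return min(model, key=lambda label: abs(model[label] - x))


# Toy known data: (feature, true label). Values are invented.
known = [(1.0, "A"), (1.2, "A"), (0.8, "A"), (1.1, "A"),
         (5.0, "B"), (5.2, "B"), (4.8, "B"), (5.1, "B"),
         (0.9, "A"), (4.9, "B")]

# Split: 80% training data, 20% test data.
split = int(len(known) * 0.8)
train_data, test_data = known[:split], known[split:]

model = train(train_data)

# Detach the true labels, predict from features only, then compare.
true_labels = [label for _, label in test_data]
predicted = [predict(model, x) for x, _ in test_data]
accuracy = sum(t == p for t, p in zip(true_labels, predicted)) / len(test_data)
print(f"Accuracy on test data: {accuracy:.0%}")
```

Real data is usually shuffled before splitting so that both parts contain every label.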
Steps to Build a Machine Learning Solution
1 Problem Framing
2 Get/Prepare Data
3 Develop Model
  3.1 Analysis/Metric definition
  3.2 Feature Engineering
  3.3 Model Training
  3.4 Parameter Tuning
  3.5 Evaluation
4 Deploy Model
5 Evaluate / Track Performance
Example use cases
• Finance and risk: Revenue forecasting, Portfolio optimization, Investment modelling, Fraud detection, Risk management
• Sales and marketing: Sales forecasting, Demand forecasting, Sales lead scoring, Marketing mix optimization
• Customer and channel: User segmentation, Personalized offers, Product recommendation
• Operations and workforce: Agent allocation, Warehouse efficiency, Smart buildings, Predictive maintenance, Supply chain optimization
Machine Learning Algorithms

• ML Algorithm defines how your model will react

• Which Algorithm to use? Depends on:


• Data Quality
• Data Size
• What you want to predict
• Time constraint
• Computation power
• Memory limits
Machine Learning Algorithms

You can develop solutions by using


• Custom algorithms written in R | Python
• Ready to use ML services from data market
• Existing algorithms
Machine Learning Algorithms
Two major categories of algorithms
• Supervised
• Unsupervised

The most commonly used machine learning algorithms are supervised (they require labels).

• Supervised learning examples
  • This customer will like coffee
  • This network traffic indicates a denial of service attack
• Unsupervised learning examples
  • These customers are similar
  • This network traffic is unusual
Common Classes of Algorithms (Supervised|Unsupervised)

Classification, Clustering, Regression, Anomaly Detection
Why do you need to know these algorithms?
• If you want to answer a YES|NO question, it is classification

• If you want to predict a numerical value, it is regression

• If you want to group data into similar observations, it is clustering

• If you want to recommend an item, it is a recommender system

• If you want to find anomalies in a group, it is anomaly detection

and many other ML algorithms for specific problems


Classification
Scenarios: Classification
• Which customers are more likely to buy, stay, leave (churn analysis)
• Which transactions|actions are fraudulent
• Which quotes are more likely to become orders
• Recognition of patterns: speech, speaker, image, movement, etc.

Algorithms: Boosted Decision Tree, Decision Forest, Decision Jungle, Logistic Regression, SVM, ANN, etc.
Clustering
Scenarios: Clustering
• Customer segmentation: divide a customer base into groups of individuals that are similar in specific ways relevant to marketing, such as age, gender, interests, spending habits, etc.
• Market segmentation
• Quantization of all sorts, such as data compression, color reduction, etc.
• Pattern recognition

Algorithms: K-means
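
K-means alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its assigned points. The plain-Python sketch below shows that loop for a customer segmentation case. The customer data (age, monthly spend), the choice of starting centroids and the fixed iteration count are assumptions for illustration, not how any particular product implements it.

```python
def squared_distance(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def kmeans(points, centroids, iterations=10):
    """Basic K-means on 2-D points with caller-supplied starting centroids."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c))
            if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Invented customers: (age, monthly spend). No feature scaling is applied,
# so the larger spend values dominate the distance in this toy example.
customers = [(25, 200), (27, 220), (24, 180), (55, 900), (60, 950), (58, 870)]
centroids, clusters = kmeans(customers, centroids=[customers[0], customers[3]])
print(clusters)
```

No labels are needed here, which is what makes clustering an unsupervised technique.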
Regression
Scenarios: Regression
• Stock price prediction
• Sales forecasts
• Premiums on insurance based on different factors
• Quality control: number of complaints over time based on product specs, utilization, etc.
• Workforce prediction
• Workload prediction

Algorithms: Bayesian Linear, Linear Regression, Ordinal Regression, ANN, Boosted Decision Tree, Decision Forest
Regression versus Classification

Does your customer want to predict|estimate a number (regression) or apply a label|categorize (classification)?

• Regression problems
  • Estimate household power consumption
  • Estimate customer’s income
• Classification problems
  • Power station will|will not meet demand
  • Customer will respond to advertising
Binary versus Multiclass Classification

Does your customer want a yes|no answer?

• Binary examples
  • click prediction
  • yes|no
  • over|under
  • win|loss
• Multiclass examples
  • kind of tree
  • kind of network attack
  • type of heart disease
