
Week 1: ML Intro; Linear Models

MBusA Machine Learning 2022

Copyright: University of Melbourne


Who?
• Lecturer
* James Bailey ([email protected])

2
Why Machine Learning

3
Motivation
• “We are drowning in information,
but we are starved for knowledge”
- John Naisbitt, Megatrends

• Data = raw information


• Knowledge = patterns or models behind the data

4
Solution: Machine Learning
• Hypothesis: pre-existing data repositories contain a lot of
potentially valuable knowledge

• Mission of learning: find it

• One definition of ML:


(semi-)automatic extraction of valid, novel, useful and
comprehensible knowledge – in the form of rules, regularities,
patterns, constraints or models – from arbitrary sets of data

5
Applications: Widespread
• Online ad selection and placement
• Risk management in finance, insurance, security

• High-frequency trading
• Medical diagnosis
• Mining and natural resources
• Malware analysis

• Drug discovery
• Search engines
• Education
• Sport

• …
6
Draws on Many Disciplines
• Artificial Intelligence
• Statistics
• Continuous optimisation
• Databases
• Information Retrieval
• Communications/information theory
• Signal Processing
• Computer Science Theory
• Philosophy
• Psychology and neurobiology
• Linguistics

7
Data Science / BusA Landscape

• Computing
  - Data wrangling
  - Machine learning
  - Data mining
  - Databases
  - Distributed computing
  - AI

• Statistics
  - Robust models and methods
  - Sampling
  - Hypothesis testing
  - …

• Domain expertise
  - Health
  - Business
  - Social sciences
  - …

9
AI, Machine Learning, Big Data
[Figure: overlapping fields. Artificial Intelligence ("intelligent machines and software"): planning, reasoning, decision making, logic. Statistics / ML. Big Data / data processing.]
10
“AI is the new electricity” – Andrew Ng

Data-driven, intelligent systems

[Screenshots: Amazon’s “Customers Who Bought This Item Also Bought” book recommendations; Netflix’s “Top Picks for Kids” page.]
11
Jobs
Numerous companies
across all industries hire
ML experts:

Data Scientist
Analytics Expert
Business Analyst
Statistician
Software Engineer
Researcher

12
Companies Employing our Students in ML Roles

Telstra, Citibank, Danske Bank, Deutsche Bank, NAB, ANZ,


Veda, Tencent, LexisNexis Risk Solutions, GE Capital, Deloitte,
PwC, Accenture, IBM Research, IBM, Sportsbet,
OpenBet, CrowdsourceHire, Hugo, Flipkart, Rome2rio,
Breadtrip, SAP, Salesforce, Hitachi, Oracle, Google, Apple,
Microsoft, Amazon, Groupon, Nokia, CSIRO, MongoDB, DST
Group, Data61, Evernote, Teradata, Kepler Analytics, Business
Predictions, Thales, Tata, LinkedIn, Ford, Huawei, KPMG,
northraine, Woolworths, jet.com, Microsoft Research, SAS,
Peter MacCallum Cancer Centre, Commonwealth Bank,
Computershare, Blackmagic Design, Baker IDI, AIG, ….

13
Discussion
Share an example of how machine
learning can help in either your
workplace or in your daily life.
About this Subject

15
Teaching Staff
• Lecturer (James)

• Tutors
* Curtis (Hanxun) Huang
  [email protected]
  PhD candidate, School of Computing and Information Systems
* Edmund Lau
  [email protected]
  PhD candidate, School of Mathematics and Statistics
* Yuning Zhou
  [email protected]
  PhD candidate, School of Computing and Information Systems

16
Getting Help
• Machine learning subject on Canvas is operational
* Please check for announcements, lecture and workshop
materials, discussion forums, ….

• Ask questions during and after lecture


• Post questions to Canvas (before emailing if possible)
• Ask questions to your tutor during afternoon session
• Consultation by appointment – send an email
* If emailing, please start subject line with “BUSA90501”

17
Timetable
• 9am-10:20am Part A
* 9:00-9:15 of Part A reserved for quizzes (Weeks 3-7)

• 10:20-10:40 Break

• 10:40-12:00 Part B

• 2:00-5:00 Workshop.

18
Relation to Other Subjects
• Machine learning (aka “Statistical Learning II”) versus
* Statistical Learning
* Predictive Analytics
* Text and Web Analytics

19
Versus “Statistical Learning”
• Complementary, with some overlap on regression

• “Statistical Learning” more emphasis on


* Statistical flavour; Frequentist stats (MLE)
* (Generalised) linear models
* Statistical validation techniques (e.g. tests on model coefficients)

• This subject
* More computer science (CS) in flavour
* More: Regularisation, nonlinear & computational perspectives
* Covers variety of learning tasks beyond regression

20
Versus “Predictive Analytics”
• Complementary, again with some overlap, mostly in early material

• “Predictive Analytics” more emphasis on


* Econometrics flavour
* Time series forecasting, and model selection

• This subject
* Less time series; more CS-style approaches to prediction
* Drills further into the algorithms and their implementations;
scratches the surface of the theory

21
Versus “Text and Web Analytics”
• Also complementary, with some overlap

• “Text and Web analytics” emphasises


* Web data, info. retrieval, natural language processing
* Using machine learning algorithms
• Naïve Bayes, logistic regression, neural networks, clustering

• This subject “Machine Learning”


* Variety of approaches to prediction (not only for text)
* More time to go into the how and why

22
Assumed Knowledge
• Programming
* Familiarity with computer programming
* Load data, perform simple manipulations, call ML libraries,
inspect & plot results

• Maths
* Comfort with formal notation (“mathematical maturity”)
* Familiarity with probability (e.g. Bayes rule, multivariate
distributions)
* Exposure to optimisation (and some calculus, linear algebra)

• Masters level subject

23
Textbooks (Optional)
• No set textbook for the subject. You can find the required information in lecture
notes, supplemented with easily found information on the Web. Some independent
research is reasonable for a masters-level subject. However, there are a number of
good books that can be used as references:

• Hastie, Tibshirani, and Friedman (2009), The Elements of Statistical Learning: Data
Mining, Inference and Prediction
* This is a seminal book on machine learning that covers major machine learning
tools in depth with great rigour.

• Bishop (2007), Pattern Recognition and Machine Learning


* The content of this book overlaps with the previous one, but the author
often explains the concepts in a different way that is sometimes more
accessible. Having a second perspective can also be beneficial.

26
Textbooks (Optional)
• Russell and Norvig (2002), Artificial Intelligence: A
Modern Approach
* A very broad (but less deep) overview of the whole field of
artificial intelligence, including machine learning
• Data mining resources are also useful
* Data Mining, Fourth Edition: Practical Machine Learning
Tools and Techniques (Morgan Kaufmann Series in Data
Management Systems), 4th edition, Witten, Frank, Hall and
Pal.
* Introduction to Data Mining, 1st edition, Tan, Steinbach and
Kumar.

27
Materials

• Lectures and workshop content will be posted to


Canvas.
• Where possible these will be posted several days in
advance.
• Any updates will be flagged in Canvas

28
Software – Python Stack
• We will be using Python 3 as the primary language in
workshops
• Get a copy of Python for your machine
* The Anaconda distribution is particularly convenient
• https://www.continuum.io/downloads
* Jupyter used extensively in workshops (and industry!)
* See Software Guide published to Canvas

• You are welcome to use other languages (e.g. R) as well, but not
instead of Python.

29
Assessment
• 25% individual short in-lecture quizzes (Weeks 3 to 8)
* 5 quizzes (each 13 min, plus 2 min reading time)
* Worth 5% each

• 25% syndicate assignment with report


(Released ~Week 3, due in ~Week 6)
* Hands-on machine learning experience

• 50% final individual exam, ~3 hours (Week 9)

30
Syndicate assignment
We are planning to use a property dataset from
ANZ (the CoreLogic dataset)

You will be requested to sign a confidentiality
agreement, indicating that you will keep the dataset
confidential, not distribute it, etc.

Signatures for this confidentiality agreement will be
collected via a Canvas assignment (you will upload
a signed personal confidentiality deed)
Subject Plan (Preliminary)
• Week 1:
* Introduction, performance evaluation, ML approaches, linear models

• Week 2:
* Feature selection and decision trees. Ensemble methods, bagging and
boosting

• Week 3:
* Regularization, support vector machines

• Week 4:
* Neural networks and optimisation
32
Subject Plan (cont.)
• Week 5:
* Boosting: gradient tree boosting (XGBoost) and AdaBoost

• Week 6:
* Unsupervised learning: Clustering
• Week 7:
* Unsupervised learning: Network analysis, community detection
and semi-supervised learning

• Week 8:
* Revision

33
Machine Learning – A Dizzying Array
• We will be looking at a range of machine learning
techniques
* Regression, naïve Bayes, decision trees, random forests,
gradient tree boosting, neural networks, clustering,
community detection

• It can seem like a bag of tricks, without strong


connection between techniques …..

34
Machine Learning – Common Themes

1. Supervised (today) versus unsupervised (weeks 6-7)

2. Types of approaches to ML (today)

3. How varying loss functions lead to different learners

4. Use of regularisation to control model complexity, to
avoid overfitting

5. Input to training the model (matrix versus graph versus
text)

6. Single versus (ensemble of) multiple models: (weighted)
averages vs compositions

7. Important role of optimization in learning

35
ML Setup

Focus on evaluation first

36
Terminology
• Input to a machine learning system can consist of
* Instance (aka object): measurements about individual
entities/objects, e.g. a loan application
* Attribute (aka feature): a component of the instances,
e.g. the applicant’s salary, number of dependents, etc.
* (Class) label: an outcome that is categorical, numeric, etc.,
e.g. forfeit vs. paid off
* Example: an instance coupled with its label,
e.g. <(100k, 3), “forfeit”>
* Model: discovered relationship between attributes and/or label

37
Terminology
Height Weight Age Gender
1.8 80 22 Male
1.53 82 23 Male
1.6 62 18 Female

• The 4 columns (height, weight, age, gender) are features or


attributes
• The data items (3 rows) are called instances or objects or
samples
• Height, Weight and Age are continuous features
• Gender is a categorical or discrete feature

38
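As a small illustration, a table like the one above is commonly held as a pandas DataFrame, one row per instance and one column per feature. A minimal sketch, assuming pandas is installed (it ships with the Anaconda distribution used in workshops):

```python
# The table above as a pandas DataFrame (sketch; assumes pandas is installed).
import pandas as pd

df = pd.DataFrame({
    "Height": [1.80, 1.53, 1.60],          # continuous feature
    "Weight": [80, 82, 62],                # continuous feature
    "Age":    [22, 23, 18],                # continuous feature
    "Gender": ["Male", "Male", "Female"],  # categorical feature
})
print(df.shape)  # (3, 4): 3 instances, 4 features
```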
Supervised vs Unsupervised Learning
• Supervised learning
  - Data: labelled
  - Model used for: predicting labels on (typically) new instances –
    encompasses classification, regression, ordinal regression/ranking, etc.

• Unsupervised learning
  - Data: unlabelled
  - Model used for: clustering related instances; understanding attribute
    relationships; visualisation; etc.

39
Architecture of a Supervised Learner

[Diagram: examples (instances + labels) form the train data fed to the Learner, which outputs a Model (more soon); the instances of the test data are fed to the Model, and its predicted labels are compared against the true labels in Evaluation.]
40
Evaluation (Supervised Learners)
• How you measure quality depends on your problem!
• Typical process
* Pick an evaluation metric comparing label vs prediction
* Procure an independent, labelled test set
* “Average” the evaluation metric over the test set

• Example evaluation metrics


* Accuracy, Precision-Recall, Root Mean Squared Error

41
Training and Testing: If Sufficient Data
• Divide data into:
* Training set (e.g. 2/3)
* Test set (e.g. 1/3)

• Learn model (e.g. logistic


regression) using the training
set
• Evaluate performance of
model on the test set

Workshop will
cover cross
validation
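A minimal sketch of this train/test protocol, assuming scikit-learn and substituting one of its built-in datasets for illustration:

```python
# Train/test split sketch (assumes scikit-learn; the dataset is a stand-in).
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split

X, y = load_breast_cancer(return_X_y=True)

# Divide the data: ~2/3 training, ~1/3 test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=1/3, random_state=0)

# Learn a model (e.g. logistic regression) on the training set only
model = LogisticRegression(max_iter=5000).fit(X_train, y_train)

# Evaluate performance of the model on the held-out test set
print("test accuracy:", model.score(X_test, y_test))

# Cross validation (see workshop) averages over several such splits
print("5-fold CV accuracy:", cross_val_score(model, X, y, cv=5).mean())
```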
Why Evaluate on “Independent” Data?

43
Why Evaluate on “Independent” Data?

44
Why Evaluate on “Independent” Data?

45
Metrics for Performance Evaluation
• Can be summarised in a confusion matrix (contingency table)
  - Actual class: {yes, no, yes, yes, …}
  - Predicted class: {no, yes, yes, no, …}

• For binary classification:

                     PREDICTED CLASS
                     Class=Yes   Class=No
ACTUAL   Class=Yes   a (TP)      b (FN)
CLASS    Class=No    c (FP)      d (TN)

a: TP (true positive), b: FN (false negative),
c: FP (false positive), d: TN (true negative)
Metrics for Performance Evaluation
                     PREDICTED CLASS
                     Class=Yes   Class=No
ACTUAL   Class=Yes   a (TP)      b (FN)
CLASS    Class=No    c (FP)      d (TN)

Accuracy = (a + d) / (a + b + c + d) = (TP + TN) / (TP + TN + FP + FN)

* Actual class: {yes, no, yes, yes, no, yes, no, no}
* Predicted:    {no, yes, yes, no, yes, no, no, yes}

                     PREDICTED CLASS
                     Class=Yes    Class=No
ACTUAL   Class=Yes   a = 1 (TP)   b = 3 (FN)
CLASS    Class=No    c = 3 (FP)   d = 1 (TN)
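A quick plain-Python check of the worked example above (the two label sequences are taken from the slide):

```python
# Confusion-matrix counts and accuracy for the example above (plain Python).
actual    = ["yes", "no", "yes", "yes", "no", "yes", "no", "no"]
predicted = ["no", "yes", "yes", "no", "yes", "no", "no", "yes"]

pairs = list(zip(actual, predicted))
tp = sum(a == "yes" and p == "yes" for a, p in pairs)  # a = 1
fn = sum(a == "yes" and p == "no"  for a, p in pairs)  # b = 3
fp = sum(a == "no"  and p == "yes" for a, p in pairs)  # c = 3
tn = sum(a == "no"  and p == "no"  for a, p in pairs)  # d = 1

print((tp + tn) / (tp + tn + fp + fn))  # accuracy = 2/8 = 0.25
```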
Limitations of Accuracy
• Consider a 2-class problem
* Number of Class 0 examples = 9990
* Number of Class 1 examples = 10

• If model predicts everything to be class 0,


accuracy is 9990/10000 = 99.9 %

Question: Is accuracy useful here? Why?


Exercise
Suppose you are designing a system that
predicts presence of brain cancer from MRI
data. What is more important, precision or
recall?

Would your answer change if instead the


system was predicting COVID-19 infection,
based on audio of coughing?
More Metrics
Suppose there are P positive instances and N negative instances

• True positive rate (aka sensitivity, recall): TP/P

• True negative rate (aka specificity): TN/N

• False positive rate: FP/N=1-specificity

• Precision: TP/(TP+FP)

• F-measure (F1-score): 2TP/(2TP+FP+FN)

  = 2 / (1/recall + 1/precision)
Example

                     PREDICTED CLASS
                     Class=Yes   Class=No
ACTUAL   Class=Yes   TP = 60     FN = 40
CLASS    Class=No    FP = 140    TN = 9760

What is accuracy? What is precision?
What is recall? What is F1-score?
(See the sketch below.)

Accuracy = (60 + 9760)/10000 = 0.982
Precision = 60/200 = 6/20 = 0.3
Recall = 60/100 = 0.6
F-measure = 2/(20/6 + 10/6) = 2/(30/6) = 0.4
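A minimal sketch collecting the metrics above into one function and checking it against this example (plain Python; the function name is illustrative):

```python
# Metrics from confusion-matrix counts, checked against the worked example.
def metrics(tp, fn, fp, tn):
    p, n = tp + fn, fp + tn          # P positive and N negative instances
    return {
        "accuracy":    (tp + tn) / (p + n),
        "recall":      tp / p,                 # true positive rate / sensitivity
        "specificity": tn / n,                 # true negative rate
        "fpr":         fp / n,                 # = 1 - specificity
        "precision":   tp / (tp + fp),
        "f1":          2 * tp / (2 * tp + fp + fn),
    }

print(metrics(tp=60, fn=40, fp=140, tn=9760))
# accuracy = 0.982, precision = 0.3, recall = 0.6, f1 = 0.4
```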
ROC Curves
• Many classification algorithms output not only a
classification for each test instance but also some
“rating” of classification accuracy:
• naive Bayes, logistic regression, ... support vector machines,
neural networks
• Often in machine learning tasks, we can afford the
luxury of “skimming off” a subset of the instances with
higher classification plausibility
• Also, we are often more interested in how reliably we
can predict a small subset of positive instances than
the vast majority of negative instances
• Is this a good classifier?
Receiver Operating Characteristic (ROC) Curves

• Reflects trade-off between


• TPR= TP/(TP+FN)
• FPR=FP/(FP+TN)

• Many models output “score” of classification confidence:


* naive Bayes, logistic regression, neural networks, etc.
• The ROC curve is formed by thresholding this score
• Convenient graphic tool for:
* visualising the ability of a classifier to classify positive instances;
* visually comparing classifiers over a given test set;
* arriving at a single-figure classifier evaluation metric (cf. F-score)
* selecting an operating threshold based on resulting trade-off
ROC Curve Example

[Figure: an example ROC curve. A classifier that is perfect on all test points sits at the top-left corner; predicting all test points as positive gives the top-right corner; predicting all test points as negative gives the bottom-left corner; predicting by a random coin traces the diagonal.]
Generating ROC Curves
Area Under the ROC Curve (AUC)
• Scalar “figure of merit” for a given classifier based on the
ROC curve by calculating the area under the curve (AUC):
* AUC = 1: perfect classifier
* AUC = 0.5: random baseline classifier

• Advantages
* Can compare classifiers inc. relative to a baseline
* Unbiased estimate of: probability that a randomly chosen
positive instance will be ranked higher than a randomly chosen
negative instance. Why is this an advantage in practice?
Generating ROC Curves Example

If score > 0.95 then predict + else predict - (final column)
If score >= 0.95 then predict + else predict - (2nd last column)
If score >= 0.93 then predict + else predict - (3rd last column)
…
If score >= 0.85 then predict + else predict - (2nd column)
Generating ROC Curves Example

Area under curve=1 - (1/3)*(1/2)=5/6
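A minimal sketch of building a ROC curve by thresholding scores, assuming scikit-learn; the labels and scores below are hypothetical, chosen so that the AUC matches the 5/6 above:

```python
# ROC curve by thresholding scores (assumes scikit-learn; data is hypothetical).
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

y_true  = np.array([1, 0, 1, 0, 0])                 # 2 positives, 3 negatives
y_score = np.array([0.95, 0.93, 0.87, 0.85, 0.50])  # classifier confidence scores

fpr, tpr, thresholds = roc_curve(y_true, y_score)   # one (FPR, TPR) point per threshold
print(list(zip(fpr, tpr)))
print(roc_auc_score(y_true, y_score))               # 0.8333... = 5/6
```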


Approaches to Learning
With example learners on linear classifiers

61
Major Frameworks in Statistical ML

62
Major Types of Supervised Models
• Given instance x, wish to predict response y
• Recall conditional probability: Pr(y|x) = Pr(x,y) / Pr(x)
• Example: wish to distinguish between Swedish and Russian speech

• Discriminative models
* Model only Pr(y|x)
* E.g. logistic regression (also linear regression, SVMs, …)
* Identify characteristics that differentiate the languages; use their
presence/absence to compute Pr(Swedish|speech) and Pr(Russian|speech)

• Generative models
* Model the full joint Pr(x,y) = Pr(y|x) Pr(x)
* E.g. naïve Bayes
* Learn to speak Russian and Swedish, then classify the
speech with your knowledge of each language
Linear Models
Discriminative approaches, mostly as a refresher

You have seen some of these methods already in the “Statistical
Learning” module.

64
Bayes Rule

Pr(y|x) = Pr(x|y) Pr(y) / Pr(x)

Bayes rule in action
Naïve Bayes (NB) Classifiers

Simplifying assumption: attributes are conditionally independent
given the class, Pr(x1, …, xm | y) = Pr(x1|y) × … × Pr(xm|y)

The final NB formulation: predict the class y maximising
Pr(y) × Pr(x1|y) × … × Pr(xm|y)

Estimating the probabilities (1)
Estimating the probabilities (2)
Naïve Bayes in Action
Marginals

Q: Why not sum to one?

76
Naïve Bayes: Summary
• A simple linear classifier with a generative model
• Frequentist: Its probabilistic model is fit by MLE
• Bayesian? Bayes rule, but not necessarily Bayesian!
• Naïve? It models strong independence assumptions
• Easy to implement, fast, scalable; good baseline
• Can handle continuous features (use Gaussians)
• Can handle missing data (just ignore; v simple!)
• Scores not always great; Feature correlations ignored

77
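A minimal sketch of naïve Bayes in practice, assuming scikit-learn; GaussianNB handles continuous features with per-class Gaussians fit by MLE, as noted above (the iris dataset is a stand-in):

```python
# Gaussian naive Bayes sketch (assumes scikit-learn; dataset is a stand-in).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

nb = GaussianNB().fit(X_train, y_train)  # class priors + per-feature Gaussians, by MLE
print(nb.score(X_test, y_test))          # accuracy on held-out data
print(nb.predict_proba(X_test[:3]))      # per-class posteriors Pr(y|x) via Bayes rule
```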
Linear Regression

78
Example: Predict Humidity from Temperature

79
Method of Least Squares

Question: decision theoretic as


written, how equivalent to frequentist?

80
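Least squares fits the line that minimises the sum of squared errors, Σi (yi − (a + b·xi))². A minimal sketch with NumPy; the temperature/humidity values are made up for illustration:

```python
# Least-squares fit sketch (NumPy; the data values are made up).
import numpy as np

temp     = np.array([14.0, 18.0, 22.0, 26.0, 30.0])  # x: temperature
humidity = np.array([80.0, 74.0, 66.0, 57.0, 50.0])  # y: humidity

X = np.column_stack([np.ones_like(temp), temp])      # design matrix with intercept column
coef, *_ = np.linalg.lstsq(X, humidity, rcond=None)  # minimises sum of squared errors
print(coef)                                          # [intercept a, slope b]
```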
Regression for classification

• Any regression technique can be used for classification


• Training: perform a regression for each class, setting the output
to 1 for training instances that belong to class, and 0 for those
that don’t
• Prediction: predict class corresponding to model with largest
output value (membership value)
• For linear regression this method is also known as multi-response
linear regression
• Problem: membership values are not in the [0,1] range,
so they cannot be considered proper probability
estimates
• In practice, they are often simply clipped into the [0,1]
range and normalized to sum to 1
81
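A minimal sketch of multi-response linear regression as described above, assuming scikit-learn: one 0/1 regression target per class, then predict the class with the largest output (the iris dataset is a stand-in):

```python
# Multi-response linear regression for classification (assumes scikit-learn).
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LinearRegression

X, y = load_iris(return_X_y=True)
Y = np.eye(3)[y]                     # one 0/1 target column per class

reg = LinearRegression().fit(X, Y)   # effectively one regression per column
scores = reg.predict(X)              # "membership values": not proper probabilities
pred = scores.argmax(axis=1)         # predict the class with the largest output
print((pred == y).mean())            # training accuracy
```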
Linear models: logistic regression
• Can we do better than using linear regression for classification?
• Yes, we can, by applying logistic regression
• Logistic regression builds a linear model for a transformed target
variable
• Assume we have two classes
• Logistic regression replaces the target Pr(y = 1|x)

by this target: log( Pr(y = 1|x) / (1 − Pr(y = 1|x)) )

• This logit transformation maps [0,1] to (−∞, +∞), i.e., the new target
values are no longer restricted to the [0,1] interval

82
Logistic Regression Model
Logistic function

[Figure: the logistic (sigmoid) function, mapping real-valued inputs (“reals”, horizontal axis) to probabilities in (0, 1) (vertical axis).]
83
Logistic Regression Model
[Figure: logistic regression of T2D (type-2 diabetes) on BMI; the fitted probability curve crosses 0.5 at the decision threshold, predicting “no” to its left and “yes” to its right.]

Note: here we do not use sum of squared errors for fitting.
84
Logistic Regression: Linearity, Training

85
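A minimal sketch of fitting a logistic regression classifier, assuming scikit-learn; the BMI/T2D numbers are made up to mirror the figure above:

```python
# Logistic regression sketch (assumes scikit-learn; BMI/T2D data is made up).
import numpy as np
from sklearn.linear_model import LogisticRegression

bmi = np.array([[19.0], [21.0], [24.0], [27.0], [30.0], [33.0]])
t2d = np.array([0, 0, 0, 1, 1, 1])        # 1 = has type-2 diabetes

clf = LogisticRegression().fit(bmi, t2d)  # fit by maximum likelihood, not squared error
print(clf.predict_proba([[26.0]]))        # [Pr(no), Pr(yes)] from the logistic function
print(clf.predict([[26.0]]))              # predict "yes" iff Pr(yes) > 0.5
```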
Decision boundary example
http://www.kdnuggets.com/2016/08/role-activation-function-neural-network.html
Decision boundary example
http://www.kdnuggets.com/2016/08/role-activation-function-neural-network.html
Exercise
How can you use linear regression (or logistic
regression) to model non-linear functions on your
data?
Summary
• Subject intro and logistics
• Performance evaluation metrics
* Accuracy, AUC, and a veritable zoo

• Approaches to ML
* Frequentist vs Bayesian vs Decision Theoretic
* Supervised models: Generative vs Discriminative

• Linear approaches
* Naïve Bayes
* Linear regression
* Logistic regression

89
