CS215 LectureSlidesSet2 IntroductionToMachineLearning AI

This document provides an introduction and overview of machine learning and social issues related to its use. It outlines the instructor of the course, Jacob Levman, and their credentials. It then gives a brief overview of machine learning methods, including supervised learning techniques like artificial neural networks and K-nearest neighbors, as well as unsupervised learning, data visualization, and validation. Applications and implications of machine learning are also discussed at a high level.


Introduction to Machine Learning
Social Issues in the
Information Age
Jacob Levman, PhD
Associate Professor
Department of Computer Science
St. Francis Xavier University

Winter, 2024
Instructor
• Dr. Jacob Levman
Associate Professor, Department of Computer Science, St. Francis Xavier University

Visiting Faculty, Massachusetts General Hospital, Harvard Medical School

Research Affiliate, Nova Scotia Health Authority

• Office: Physical Sciences Building 1020


• 902-867-2221
[email protected]
• Term: Winter 2024
• Course resources: Moodle (to be set up)
• Office Hours:
• Tuesdays 1:30 pm to 2:20 pm
• Wednesdays 2:30 pm to 3:20 pm, and 3:30 pm to 4:20 pm
• Fridays 2:30 to 3:20 pm
Machine Learning
Why Machine Learning?
Why Now?
Machine Learning Methods
Overview
• Supervised learning
• Artificial neural networks
• K Nearest Neighbour
• Etc.
• Unsupervised learning
• K-means clustering
• Hierarchical clustering
• Etc.
• Data Visualization
• An adjunct to statistical validation
• Validation
• Evaluation criteria (Overall Accuracy, ROC analyses, sensitivity, specificity, PPV,
NPV)
• Independent datasets
• Within dataset validation
Types of Learning: Supervised
• Most common class of machine learning
• Also called Classification (assigning samples to
defined classes)
[Diagram: examples of interest and examples not of interest train a machine learning model; for a new sample, the result is a label: sample is of interest or NOT.]

• But are machine learning techniques necessarily adaptive?
• No! Not necessarily!
• Some techniques (unsupervised learning especially) don't improve as they operate at all, but we still call them machine learning
Basic Learning Paradigm

[Diagram: examples of interest and examples not of interest train a machine learning model; a new sample is classified as of interest or NOT.]
Feedback critical for future tech

[Diagram: the same learning pipeline, with the result fed back into the training examples.]
MACHINE LEARNING BEHAVIOUR CHANGES WITH TRAINING DATA!
NEEDS RE-VALIDATION
Except on USS Voyager?

[Diagram: the same learning pipeline.]
HOW TO HANDLE THIS PROBLEM?
IN CURRENT PRACTICE WE REMOVE THE ‘LIVE’ FEEDBACK
VALIDATION OCCURS ONCE
AI behaviour doesn’t change while in operation
Updates require extensive re-validation, new version release

[Diagram: the learning pipeline with the live feedback loop removed.]
Implications of removing live
feedback

• Currently the only acceptable option for


medical diagnostics
• Without live feedback we lower the risk of
AI conquering us all!
• Limits on tech can produce suboptimal
performance
• Improving performance requires extensive
re-validation (time consuming, costly, ….)
• Machines that retrain ‘on-the-fly’ are
inherently dangerous; their safety is
difficult to ensure
Discussion Break

• AI Risks to our collective safety and security


Machine Learning’s Future

• How long until we trust a holographic doctor?


• A very long time!
Adaptive Learning – example
without ethical limitations:
• A major challenge of supervised learning: many classifiers are LINEAR!
• Advanced techniques allow nonlinear solutions as well
• Future subjects
In practice things get complicated
quickly
• Not just 2 input feature measurements
• Maybe input is an image 300x600 or larger
• Maybe input is a time series of variable length data
(written words, audio clips, stock price history)
• These can be smooshed or clamped/trimmed to fit in a
spreadsheet (as for your project), but modern
methods for these challenging data types retain aspects
of the input configuration in the learning machine
• How challenging do things get? Image analysis is a
great example…….
Supervised Learning Example

[A sequence of image slides stepping through the example.]
Types of Learning: Supervised
• Most common class of machine learning
• Also called Classification (assigning samples to
defined classes)
[Diagram: examples of interest and examples not of interest train a machine learning model; a new sample is classified as of interest or NOT.]
Data-driven approach
• Collect a dataset with example measurements (could be an
image for example) and labels
• Use machine learning to train a classifier
• Evaluate classifier on withheld set of test images
• Simple example of what API code will look like:
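The slide's code image is not reproduced here, but as a rough sketch of what such API code typically looks like (scikit-learn and the synthetic dataset are my assumptions, not necessarily the course's actual tooling):

```python
# A hypothetical sketch (assumed library: scikit-learn; synthetic data)
# of the collect -> train -> evaluate workflow described above.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Collect a dataset: example measurements plus labels
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

# Withhold a test set for later evaluation
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Use machine learning to train a classifier
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)

# Evaluate the classifier on the withheld test samples
accuracy = clf.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```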
We package all that up with fair statistical
comparisons, feature selection, comparison
between many prominent ML technologies
and have a spreadsheet as the input to the
program! (no programming required)

• So how do you evaluate how well the machine worked?


Background – Validation

• TP: True Positives https://www.medcalc.org/manual/roc-curves.php

• Samples of interest correctly labelled by ML


• TN: True Negatives
• Samples not of interest correctly labelled by ML
• FP: False Positives
• Samples not of interest incorrectly labelled by ML
• FN: False Negatives
• Samples of interest incorrectly labelled by ML
Background – Validation
• Sensitivity (also called recall): https://www.medcalc.org/manual/roc-curves.php

• The proportion of samples/patients of interest correctly classified


• TP / (TP + FN)
• Specificity:
• The proportion of samples/patients not of interest correctly classified
• TN / (TN + FP)
• Positive Predictive Value (PPV also called precision):
• The proportion of samples/patients predicted to be of interest that actually are of interest
• TP / (TP + FP)
• Negative Predictive Value (NPV):
• The proportion of samples/patients predicted to be not of interest that are actually not of interest
• TN / (TN + FN)
• Overall Accuracy (OA):
• The proportion of samples/patients correctly classified
• (TP + TN) / (TP+ TN + FP + FN)
• Test Error:
• Defined in various ways, generally simple – summarizing deviation from ground truth: Proportion of Errors
relative to all cases.
• (FP+FN)/(TP+TN+FP+FN)
• Sum of Squares Error:
• Squaring each deviation forces positive values for all differences
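The definitions above translate directly into code; as an illustrative sketch (my own, not course material):

```python
def validation_metrics(tp, tn, fp, fn):
    """Compute the standard validation metrics from the four
    confusion counts (TP, TN, FP, FN) defined above."""
    total = tp + tn + fp + fn
    return {
        "sensitivity": tp / (tp + fn),   # also called recall
        "specificity": tn / (tn + fp),
        "ppv":         tp / (tp + fp),   # also called precision
        "npv":         tn / (tn + fn),
        "overall_accuracy": (tp + tn) / total,
        "test_error":  (fp + fn) / total,
    }

m = validation_metrics(tp=40, tn=45, fp=5, fn=10)
print(m)
```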
Background – Validation
• Receiver Operating Characteristic Curve
Analysis
• Vary threshold/criterion

https://www.medcalc.org/manual/roc-curves.php

› Area under the ROC curve: AUC, a robust metric for separation between
two groups, assessing Dx potential w/o knowing operating point in
advance
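As an illustrative sketch (my own code, not course material): the area under the ROC curve equals the probability that a randomly chosen sample of interest scores higher than a randomly chosen sample not of interest, which gives a simple way to compute it without explicitly sweeping thresholds:

```python
def roc_auc(scores_pos, scores_neg):
    """Area under the ROC curve, computed as the probability that a
    randomly chosen positive sample scores higher than a randomly
    chosen negative one (ties count half). This is equivalent to
    sweeping the decision threshold across all operating points."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Perfectly separated groups give AUC = 1.0; full overlap gives 0.5
print(roc_auc([0.9, 0.8, 0.7], [0.3, 0.2, 0.1]))  # -> 1.0
```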

› We have some basics about how to evaluate ML models, so let’s jump in


and learn our first basic technique……
K Nearest Neighbour

KNN is a simple algorithm that stores


all available cases and classifies
new cases based on a similarity
measure
K-NN
[Scatter plot: Loan $ ($0 to $250,000) vs. Age (15 to 65), with Default and Non-Default samples marked.]
K-NN

[The same Loan $ vs. Age scatter plot of Default and Non-Default samples.]

• All distance measurements are sorted from smallest to


largest
• Analyze the first (smallest) K (user defined parameter)
distance measurements
• Voting system, who wins?
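The steps above can be sketched in Python (an illustrative implementation, not course-supplied code):

```python
import math
from collections import Counter

def knn_classify(train, new_sample, k):
    """Classify new_sample by majority vote among its k nearest
    training samples. `train` is a list of (features, label) pairs."""
    # Measure the distance from the new sample to every training sample
    distances = []
    for features, label in train:
        d = math.dist(features, new_sample)   # Euclidean distance
        distances.append((d, label))
    # Sort all distance measurements from smallest to largest
    distances.sort(key=lambda pair: pair[0])
    # Analyze only the first (smallest) k distances: a voting system
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

train = [((1, 1), "default"), ((2, 1), "default"),
         ((8, 9), "non-default"), ((9, 8), "non-default")]
print(knn_classify(train, (1.5, 1.2), k=3))  # -> "default"
```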
K Nearest Neighbour
K Nearest Neighbour – How to
choose K?
K Nearest Neighbour

• Strengths
• Simple and intuitive
• Effective (in a basic way!)
• Flexible decision boundary
• Weaknesses
• Easily misled by noise
• Easily misled by irrelevant features
• Must choose a distance function (Euclidean is often too simplistic)
• Vulnerable to high dimensionality problems
• Computation costs can be high
• Many irrelevant distances to distant training samples are computed
though unused
• How to handle unbalanced distributions (more of one group than the
other)?

› So now that we understand the KNN basics, how do we practically
implement it in Python?
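One common practical route (scikit-learn is an assumption here; the course's own tooling may differ) is `KNeighborsClassifier`, with a bundled dataset standing in for real course data:

```python
# Illustrative KNN in Python via scikit-learn (an assumed dependency);
# the bundled breast cancer dataset stands in for real course data.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

knn = KNeighborsClassifier(n_neighbors=5)  # K is the user-defined parameter
knn.fit(X_train, y_train)
accuracy = knn.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```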
K Nearest Neighbour – Case
Study

• 10 measurements per sample


• Thus 10 dimensions!
• Measurements from histopathology
• i.e., analyzing cells under a microscope
K Nearest Neighbour – Case
Study
• Results:

› Older results (2000)


› Many more modern techniques available
now
› Likely to be outperformed in a more recent
repeat analysis
K Nearest Neighbour – Case
Study

• Results:

Machine learning: ECML-98, 1998 - Springer


› In 1998, SVM was new
– Will cover in detail later

› K-NN performed best of ‘traditional’


techniques
K Nearest Neighbour – Another
Case Study
My background

• Computer Engineering
• Electrical and Computer Engineering
• Medical Biophysics (Physics Stream)
• Imaging Research Postdoc
• Biomedical Engineering Postdoc
• Neuroscience Postdoc
Bringing Together Disparate Fields
Research Outline
• Physics Research
• Computational Neuroscience Research
• Machine Learning Research
• Neuroscience Research

Image from Forbes.com Image from Scientific


American
Research fMRI

Active and Passive fMRI for Presurgical Mapping of Motor and Language Cortex
By Bradley Goodyear, Einat Liebenthal and Victoria Mosher
DOI: 10.5772/58269
Research - fMRI
Research – Diffusion MRI
• Diffusion MRI based on Doppler effect
• Inherently less signal when based on phase shift
• Diffusion measurements acquired in many directions
• Reliability can be a challenge
• Particularly for making pretty tractograms

Image from Boston Children’s Hospital,


Harvard Medical School
Image from John Radcliffe Hospital,
University of Oxford
Scalar Diffusion MRI

I Fragata, et al., Early Prediction of Delayed Ischemia …, Stroke 2017, 48(8): 2091-2097.
Research - Diffusion MRI
Computational Neuroscience
Research
Machine Learning Applications:
PICUs
• Predict cardiac arrest before it happens
• Predict renal failure before it happens
• Predict any actionable circumstances in the clinic
Machine Learning Research
Modelling the Radiologist
• Reliably predict when machines can equal or outperform the radiologist
• Report triage statistics, like statistical percentiles governing where a given sample/patient falls
relative to training (i.e. machine can report, 100% of patients this extreme have autism, 94% of
patients presenting like this turn out to have multiple sclerosis, 100% of all patients with this kind of
an image profile are healthy or would be called normal by a radiologist)
• Eventually machines will be handling more and more of their workload
Machine Learning Research
Video Processing
Machine Learning Research
COVID-19 Detection from Lung CT
Machine Learning Research
Brain MRI

• Image from consultqd.clevelandclinic.org ("Making the Most of Brain MRI: Machine Learning Integrated with Image Post-Processing")
General Purpose Machine Learning
• Test bias setting alters the curvature of the decision function to match local data
distribution

Levman et al., Journal of Digital Imaging, 27:145-151, 2014


General Purpose Machine Learning
Tissue outcome prediction – is tissue at risk of death?
• Combining physiological measurements

Levman et al., ISMRM 2014, Milan, selected for oral presentation


Ensembles of Learners

• Image from a Google Images search for "ensemble learning"
Neuroscience Research

• FreeSurfer measurements
• Talk about how future work will entail looking into fMRI and
diffusion difference abnormalities in a variety of medical
conditions
Acknowledgements

• Dr. Emi Takahashi (PhD, Neuroscience)


• Funding: National Institutes of Health (NIH)

• Faculty in the Institute of Biomedical Engineering: Dr. Stephen Payne


• Funding: Wellcome Trust

• Dr. Anne Martel (medical physics)


• Funding: Canadian Breast Cancer Foundation, CIHR, CBCRA, OGS
• Technical & Clinical Researchers

Current Funding: Canada Research Chair program (NSERC), CFI


Machine Learning Methods
Overview
• Supervised learning
• Artificial neural networks
• Support vector machines
• Linear discriminant analysis
• Etc.
• Unsupervised learning
• K-means clustering
• Hierarchical clustering
• Etc.
• Dimensionality reduction
• Principal components analysis
• Independent components analysis
• Etc.
• Data Visualization
• An adjunct to statistical validation
• Validation
• Evaluation criteria (Overall Accuracy, ROC analyses, sensitivity, specificity, PPV, NPV)
• Independent datasets
• Within dataset validation
Lecture Plan

• Introduction and Background


• Technique focused approach:
• Present a major technique, including key
mathematics/algorithms
• Present examples of its use in the real world
(scientific literature and/or industry)
• Demonstrate its use to the class as much as possible
• Live demo
• Real world data
• Will generally start with easier techniques and work up
to harder ones
Machine Learning Methods
Overview
• Supervised learning
• Artificial neural networks
• Support vector machines
• Linear discriminant analysis
• Etc.
• Unsupervised learning
• K-means clustering
• Hierarchical clustering
• Etc.
• Dimensionality reduction
• Principal components analysis
• Independent components analysis
• Etc.
• Data Visualization
• An adjunct to statistical validation
• Validation
• Evaluation criteria (Overall Accuracy, ROC analyses, sensitivity, specificity, PPV, NPV)
• Independent datasets
• Within dataset validation
Intro to Unsupervised Learning

• No labels, no ground truth!


• No examples provided to the algorithm
• Algorithm typically groups samples into classes of which it knows
NOTHING! (except for the representative examples it places therein)
• Algorithm attempts to find patterns in the data w/o a priori info
• Medical Images: finding regions-of-interest
• Big medical data: find natural groupings within a condition (subtypes of
ADHD)
• Usually more challenging to evaluate performance compared with SL
Intro to Unsupervised Learning

• Generally we want to minimize the within-class


distance between samples while simultaneously
maximizing the between-class distance
• Can also be evaluated as per congruency with
known classes not provided to the algorithm
• Caveat: if you know classes, SL will probably
outperform by benefitting from this information
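As a minimal sketch of one such unsupervised method, k-means clustering (my own illustrative code; it alternates between assigning each point to its nearest centre and recomputing centres as cluster means):

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Minimal k-means on 2-D points: assign each point to its nearest
    centre, then recompute each centre as the mean of its members."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for x, y in points:
            # nearest-centre assignment (squared Euclidean distance)
            i = min(range(k), key=lambda j: (x - centres[j][0]) ** 2
                                          + (y - centres[j][1]) ** 2)
            clusters[i].append((x, y))
        for j, members in enumerate(clusters):
            if members:  # recompute centre as the cluster mean
                centres[j] = (sum(p[0] for p in members) / len(members),
                              sum(p[1] for p in members) / len(members))
    return centres, clusters

# Two well-separated, unlabelled groups of points
pts = [(0, 0), (0, 1), (1, 0), (9, 9), (9, 10), (10, 9)]
centres, clusters = kmeans(pts, k=2)
print(sorted(centres))
```

Note the algorithm is told nothing about the groups; it discovers them from the data alone, which is exactly the within-class/between-class distance trade-off described above.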
Example: a cholera outbreak in London

Many years ago (in 1854), during a cholera outbreak in London, the
physician John Snow plotted the location of cases on a map. Properly
visualized, the data indicated that cases clustered around certain
intersections, where there were polluted wells, not only exposing the
cause of cholera but also indicating what to do about the problem.

[Sketch: case locations marked with X's on a map, clustering in certain areas.]
• A technique demanded by many real world
tasks
– Bank/Internet Security: fraud/spam pattern discovery
– Biology: taxonomy of living things such as kingdom, phylum, class, order, family,
genus and species
– City-planning: Identifying groups of houses according to their house type, value, and
geographical location
– Climate change: understanding earth’s climate, finding atmospheric and oceanic
weather change patterns
– Finance: stock clustering analysis to uncover correlation underlying shares
– Image Compression/segmentation: coherent pixels grouped
– Information retrieval/organisation: Google search, topic-based news
– Land use: Identification of areas of similar land use in an earth observation database
– Marketing: Help marketers discover distinct groups in their customer bases, and
then use this knowledge to develop targeted marketing programs
– Social network mining: special interest group automatic discovery
• Imaging: Unsupervised learning is often called
image segmentation

https://www.mathworks.com/discovery/image-segmentation.html
Data courtesy of Boston
Children’s Hospital, Harvard
Medical School
Intro to Validation

• Ideal validation:
• Assessment on many independently acquired datasets
• Challenges: independent dataset often not available to researcher
• Alternative: Publish on a single dataset, validation by other researchers comparing
their work to your publication
• How to have confidence in self assessed ML performance on a single dataset?
• Validation!
• K-fold validation
• Randomized trials
• Leave one out
• Efron’s bootstrap
• Metrics to assess performance
• AUC
• OA
• Sensitivity
• Specificity
• PPV
• NPV
Background – Validation in Supervised
Learning: Leave-one-out validation
Background – Validation in Supervised
Learning: Leave-one-out validation

• Or average the OA or other evaluative metric
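A sketch of leave-one-out validation (illustrative only; the 1-nearest-neighbour classifier is just a stand-in for whatever model is being validated):

```python
import math

def leave_one_out_accuracy(samples):
    """Leave-one-out validation with a 1-nearest-neighbour classifier:
    each sample in turn is withheld, the model 'trains' on the rest,
    and the withheld sample's predicted label is checked."""
    correct = 0
    for i, (features, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]   # leave sample i out
        nearest = min(train, key=lambda s: math.dist(s[0], features))
        if nearest[1] == label:
            correct += 1
    return correct / len(samples)

data = [((0, 0), "A"), ((0, 1), "A"), ((1, 0), "A"),
        ((5, 5), "B"), ((5, 6), "B"), ((6, 5), "B")]
print(leave_one_out_accuracy(data))  # -> 1.0
```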


Background – Validation in Supervised
Learning: K – fold cross validation

• Group-wise equivalent of Leave-one-out (LOO)


• Divide dataset into K random, non-overlapping
groups
• Perform LOO on a group wise basis
• Train on all but the current group
• Test on the current group
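The grouping logic can be sketched as follows (my own illustrative code):

```python
import random

def k_fold_splits(n_samples, k, seed=0):
    """Divide sample indices into k random, non-overlapping groups,
    then yield (train_indices, test_indices) pairs: test on each
    group in turn, train on all the others."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i
                 for idx in fold]
        yield train, test

for train, test in k_fold_splits(n_samples=10, k=5):
    print(sorted(test))
```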
Background – Validation in
Supervised Learning
• Randomized Trials / Bootstrapping
• Randomly select X% of samples for training
• Remaining (100-X)% of samples are for testing
• Evaluate performance
• Repeat many many times
• Summarize performance with statistics (mean, SD, CI
etc.)
• Alternative variations available
• Efron’s 0.632+ bootstrap, which allows repeat samples within
the training set only
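A sketch of the randomized-trials procedure (illustrative; a 1-nearest-neighbour classifier stands in for the model, and this is the plain random-split variant, not Efron's bootstrap):

```python
import math
import random
import statistics

def randomized_trials(samples, train_fraction=0.7, n_trials=50, seed=0):
    """Repeatedly pick a random train/test split, score a 1-nearest-
    neighbour classifier on the held-out part, and summarize the
    accuracies with a mean and standard deviation."""
    rng = random.Random(seed)
    accuracies = []
    for _ in range(n_trials):
        shuffled = samples[:]
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * train_fraction)
        train, test = shuffled[:cut], shuffled[cut:]
        correct = sum(
            1 for feats, label in test
            if min(train, key=lambda s: math.dist(s[0], feats))[1] == label)
        accuracies.append(correct / len(test))
    return statistics.mean(accuracies), statistics.stdev(accuracies)

# Two cleanly separated classes, so every trial classifies perfectly
data = [((i, i % 3), "A") for i in range(10)] + \
       [((i + 20, i % 3), "B") for i in range(10)]
mean_acc, sd_acc = randomized_trials(data)
print(f"accuracy = {mean_acc:.2f} +/- {sd_acc:.2f}")
```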
Background – Validation: 3 Way
Splits
• So far we’ve discussed dividing data into training and testing
sets
• Additional validation approaches include 3 main groupings:
training, testing and validation datasets
• Training and testing proceed as before
• Once complete and a final model selected, it is
evaluated for performance on the validation set (whose data
was NEVER used during training or model selection)
• Note sometimes what is referred to as the testing and
validation sets are reversed
Machine Learning Methods
Overview
• Supervised learning
• KNN
• Artificial neural networks
• Support vector machines
• Linear discriminant analysis
• Etc.
• Unsupervised learning
• K-means clustering
• Hierarchical clustering
• Etc.
• Dimensionality reduction
• Principal components analysis
• Independent components analysis
• Etc.
• Data Visualization
• An adjunct to statistical validation
• Validation
• Evaluation criteria (Overall Accuracy, ROC analyses, sensitivity, specificity, PPV, NPV)
• Independent datasets
• Within dataset validation

• Having covered a nice intro/overview and the KNN basics,
let’s look at some more advanced KNN approaches
K Nearest Neighbour
K Nearest Neighbour –
Regression Adaptation

• Example on whiteboard
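Since the worked example lives on the whiteboard, here is an illustrative sketch (my own) of the regression adaptation: the voting step is replaced by averaging the K nearest neighbours' target values:

```python
import math

def knn_regress(train, new_sample, k):
    """Predict a continuous value for new_sample as the mean of the
    target values of its k nearest training samples (voting is
    replaced by averaging)."""
    by_distance = sorted(train, key=lambda s: math.dist(s[0], new_sample))
    nearest = by_distance[:k]
    return sum(target for _, target in nearest) / k

# Target is roughly y = 10 * x for these illustrative points
train = [((1.0,), 10.0), ((2.0,), 20.0), ((3.0,), 30.0), ((10.0,), 100.0)]
print(knn_regress(train, (2.1,), k=3))  # -> 20.0
```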
Supervised Learning Example
Revisited

[A sequence of image slides revisiting the earlier example.]

• Surely we can do better than KNN?


References for These Course
Lecture Slides
• (Cal Tech) Machine Learning & Data Mining
• http://www.yisongyue.com/courses/cs155/2017_winter/
• Lior Rokach, Ben Gurion University of the Negev (https://www.slideshare.net/liorrokach/introduction-to-machine-learning-13809045/1)
• CS4811 AI lecture notes: (http://pages.mtu.edu/~nilufer/classes/cs4811/2009-spring/)
• Ke Chen COMP24111 (https://studentnet.cs.manchester.ac.uk/ugt/COMP24111/)
• Gwen Englebienne (http://gwenn.dk/mlpr/)
• http://www.robots.ox.ac.uk/~az/lectures/ml/lect1.pdf
• http://cs231n.stanford.edu/slides/2016/
• Saed Sayad chem-eng.utoronto.ca/~datamining/Presentations/KNN.ppt
• http://classes.engr.oregonstate.edu/eecs/spring2012/cs534/notes/knn.pdf
• http://dataaspirant.com/2016/12/23/k-nearest-neighbor-classifier-intro/
• David Sontag http://cs.nyu.edu/~dsontag/courses/ml12/slides
