
ML Lecture 1 Introduction and Policies

This document provides an overview of a machine learning course taught by Faizad Ullah. It discusses the instructor's background and areas of specialization. The course will cover theoretical machine learning concepts with an emphasis on hands-on learning of algorithms. Grading will be based on assignments, quizzes, a midterm, final exam, and optional project. Students will complete programming assignments using Python libraries. Policies address attendance, submissions, plagiarism, and classroom conduct. Required textbooks and contact information are also included.

Uploaded by

Faizad Ullah

CSCS 460 – Machine Learning
Faizad Ullah

1
About Me
 Faizad Ullah
 Ph.D. Student at LUMS
 Specialization
 Natural Language Processing (NLP)
 Machine Learning
 Data Science
 Contributions
 Medical Image Analysis
 Graph Analysis
 Text Analytics of Low-Resource Languages

2
Course Description
 Theoretical (3 Credit Hrs Course)
 Emphasis on hands-on work and intuition for ML algorithms

 Can primarily be divided into the following modules


 Course Introduction
 Feature Space and Feature Importance
 Label Space
 Labeled Data Sources
 Classifiers and Decision Boundaries
 Boosting and Bagging
 Transformers (Tentative)

3
Grading
 Point distribution
Class Attendance and Participation     5%
Quizzes (5-8)                         20%
Assignments (3-5)                     20%
Midterm                               20%
Final Term                            25%
Project (Or Additional Assignments)   10%

4
Programming Tasks
 *3-5 Assignments
 Programming Assignments

 *One Project
 Programming Environment
 Python (PyTorch, TensorFlow, Colab)

*Vivas will be conducted for assignments and the project


5
Policies
 Quizzes
 Some quizzes will be announced; about 50% of quizzes will be unannounced
 Announcements will be made during class and on the slides, so check the slides regularly if you miss lectures. No announcements will be sent via email
 Sharing
 Copying is not allowed for assignments. Discussions are encouraged; however, you must submit your own work.
 Violators will be reported to the Disciplinary Committee or face a marks-reduction penalty

 Plagiarism
 Do NOT pass off someone else’s work as your own!
 Write in your own words and cite the reference if you use someone else’s material.

6
Policies (2)
 Submission Policy
 Submissions are due on the day and at the time specified
 Late submissions incur a deduction of 50% of the obtained marks per day (i.e., a submission that is 2 days late gets zero credit).
 Attendance Policy
 You are advised to attend all lectures.
 It’s the students’ responsibility to recover any information or announcements posted during a
lecture from which they were absent.
 Classroom behavior
 Maintain classroom sanctity by remaining quiet and attentive
 Asking questions is encouraged.
 You are not allowed to use a Laptop/mobile phone, etc., during class.
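
The late-submission rule in the Submission Policy above can be read as a simple linear deduction. The following is only a sketch of that reading, assuming whole days are counted and the deduction is taken from the obtained marks; the function name and its parameters are illustrative and not part of the course policy.

```python
def late_adjusted_marks(obtained: float, days_late: int, deduction_per_day: float = 0.5) -> float:
    """Marks left after the late penalty (assumed linear: 50% of obtained marks per day late)."""
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    # The remaining fraction can never drop below zero, however late the submission is.
    remaining_fraction = max(0.0, 1.0 - deduction_per_day * days_late)
    return obtained * remaining_fraction


print(late_adjusted_marks(80, 0))  # submitted on time
print(late_adjusted_marks(80, 1))  # one day late
print(late_adjusted_marks(80, 2))  # two days late
```

Under this reading, one day late keeps half of the obtained marks and two or more days late keeps none, which matches the "2 days late gets zero credit" example.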

7
Policies (3)
 Retakes
 No retakes for quizzes, assignments, exams, or projects
 In case of a medical emergency or other unavoidable circumstances, inform the instructor beforehand and seek formal approval. You need to share medical reports for the departmental record.
 Do not wait for the final exam to seek approval for retakes

8
Course Material
 All course material (i.e., books, class handouts, reading assignments) will be shared on Moodle
 Text Book
 Machine Learning: A Probabilistic Perspective, Kevin P. Murphy, MIT Press, 2012 – Murphy
 The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Trevor Hastie, Robert Tibshirani, and Jerome Friedman, Springer Science & Business Media, 2009 – ESLII

 Reference Book
 Machine Learning, Tom Mitchell, McGraw Hill, 1997 – TM

9
Contact
 How to contact me?
 E-mail: Will share soon
 Office:
 Office Hours: Mentioned on office door

10
Most Important

Don’t be afraid of giving wrong answers!


Let’s start our ML journey…

12
We Imagine Machine Learning as…

13
ML is all around us…

14
Robots We Imagine vs. Actual Robots

15
Robots Invasion We Imagine

16
Actual Invasion We Imagine

17
Why Discussing All This in ML Course?
 A broader understanding of Machine Learning

 The focus of this course is the concepts, mathematics, and implementation of machine learning algorithms.
 But you should also know:
 Why we needed ML
 What comes after we have learned ML
 How ML algorithms are deployed in real-world applications

18
What is
Machine Learning?
How does it work?

19
Machines as Mechanical Helpers

20
Machines as Intellectual Helpers

21
Machines as Intellectual Helpers

Cat

Is this a cat or a dog?

No

Should I hire this person?


22
A Classifier

Cat

{cat, dog}

happy
{happy, sad, angry,
surprised,
neutral}
23
A Classifier

empty

{empty, full}

Hospital
{Vocabulary of the
language}

24
A Classifier

Best Move
{Return the
best move}

Elon Musk

Names of People
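
Each "A Classifier" example above pairs an input with a fixed set of possible labels, and the classifier returns exactly one label from that set. A minimal sketch of that idea, assuming some earlier step has already given every label a score (the scores below are made up for illustration, and `classify` is a hypothetical helper):

```python
from typing import Dict, Sequence


def classify(scores: Dict[str, float], label_set: Sequence[str]) -> str:
    """Return the label from label_set that has the highest score."""
    missing = [label for label in label_set if label not in scores]
    if missing:
        raise ValueError(f"no score for labels: {missing}")
    return max(label_set, key=lambda label: scores[label])


animal_labels = ["cat", "dog"]
emotion_labels = ["happy", "sad", "angry", "surprised", "neutral"]

# Made-up scores standing in for whatever a trained model would produce.
animal_scores = {"cat": 0.9, "dog": 0.1}
emotion_scores = {"happy": 0.6, "sad": 0.1, "angry": 0.05, "surprised": 0.05, "neutral": 0.2}

print(classify(animal_scores, animal_labels))
print(classify(emotion_scores, emotion_labels))
```

The label set can be tiny ({cat, dog}) or very large (the vocabulary of a language); the shape of the function stays the same.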

25
A Classifier

How do we train a classifier?

26
How to train your intern?
 How would you train a new intern to conduct job interviews?

 Option 1: Teach all the complicated rules


 Grades are important
 University is important
 Great grades + Good university = All good!
 Bad grades + Unknown university = Not so good
 Bad grades + Good university = ?
 Good grades + Unknown university = ?
 Still there would be exceptions
 Very hard to instill intuitive and experiential knowledge
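
Option 1 can be sketched as a hand-written function. This is only an illustration of the rules listed above, with hypothetical boolean inputs; it returns `None` for the combinations the rules leave open.

```python
from typing import Optional


def rule_based_decision(good_grades: bool, good_university: bool) -> Optional[str]:
    """Hand-written rules from Option 1; None marks the cases the rules do not cover."""
    if good_grades and good_university:
        return "all good"
    if not good_grades and not good_university:
        return "not so good"
    return None  # mixed cases: the rules give no answer


for grades in (True, False):
    for university in (True, False):
        print(grades, university, rule_based_decision(grades, university))
```

Two of the four combinations already fall through, and every exception would need yet another hand-written rule.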

27
How to train your intern?
 How would you train a new intern to conduct job interviews?

 Option 2: Make them sit and watch as an expert conducts interviews.


 Learning by experience
 Eventually, patterns start emerging
 Let the intern get the intuition on their own!

 More experience = Better learning


 More exposure (balanced cases) = Better learning

 Caveat!
 What if the expert has systematic flaws of judgement, a.k.a. biases?
◦ Conduct sessions with many experts
◦ What if they all share biases and stereotypes?
◦ Initially, your intern can only be as good as the expert
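
Option 2 and its caveat can be sketched as learning from recorded decisions. The history below is invented, and majority voting per case is just one simple stand-in for "watching the expert"; it is not presented as the course's method.

```python
from collections import Counter, defaultdict

# Hypothetical past decisions of one expert: (grades, university) -> decision.
expert_history = [
    (("good", "good"), "hire"),
    (("good", "good"), "hire"),
    (("good", "unknown"), "reject"),  # this expert always rejects unknown universities
    (("good", "unknown"), "reject"),
    (("bad", "good"), "hire"),
    (("bad", "good"), "reject"),
    (("bad", "good"), "hire"),
    (("bad", "unknown"), "reject"),
]


def learn_from_expert(history):
    """For every case seen, remember the decision the expert made most often."""
    votes = defaultdict(Counter)
    for case, decision in history:
        votes[case][decision] += 1
    return {case: counts.most_common(1)[0][0] for case, counts in votes.items()}


intern = learn_from_expert(expert_history)
print(intern[("bad", "good")])      # an undecided case from Option 1 now has an answer
print(intern[("good", "unknown")])  # the expert's bias is copied as well
```

The learner settles the mixed cases without any explicit rule, but it also reproduces the expert's systematic rejection of unknown universities.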

28
How to train your machine?
 Allowing machines to learn on their own, using the prior decisions of experts, is known as Machine Learning!

Supervised: The outcome is provided along with the data.

Unsupervised: The outcome is NOT provided along with the data.

Example: Classifying emails as Spam and Not_Spam
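
The difference shows up directly in the shape of the data. A minimal sketch with invented emails; the word-overlap learner is an assumed stand-in chosen for brevity, not a method taken from the slides.

```python
from collections import Counter

# Supervised: every example comes with its outcome (label).
labeled_emails = [
    ("win a free prize now", "Spam"),
    ("free money click now", "Spam"),
    ("meeting agenda for monday", "Not_Spam"),
    ("lecture notes for monday class", "Not_Spam"),
]

# Unsupervised: the same kind of data, but no outcome is provided.
unlabeled_emails = [text for text, _ in labeled_emails]


def train_word_counts(examples):
    """Count how often each word appears under each label."""
    counts = {}
    for text, label in examples:
        counts.setdefault(label, Counter()).update(text.split())
    return counts


def predict(counts, text):
    """Pick the label whose training words overlap most with the email."""
    words = text.split()
    return max(counts, key=lambda label: sum(counts[label][w] for w in words))


model = train_word_counts(labeled_emails)
print(predict(model, "free prize for you"))
print(predict(model, "notes for the monday meeting"))
```

Only `labeled_emails` can train this classifier; with `unlabeled_emails` alone there is no outcome to learn from.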

29
Artificial Intelligence VS ML
 Colloquially both terms are used interchangeably

 However, traditionally there is a difference.


 AI: The goal of AI was to make a machine more like a human
 Give the machine a lot of world knowledge
 A logical decision-making framework

 ML: The ML framework seeks to make a better machine – not necessarily emulating a human
 Based on Statistics and Optimization – not logic!
 Learn from labelled data
 More data – more consistent decisions
 More balanced data – more confident decisions

30
Life is not governed by certainty…
 Certainty in the real world is a rare luxury
 The probability of something being exactly 0 or 1 is very rare!

 Uncertainty is the basis of ML and is quantified using probability and statistics
 An event happens with some probability p and does not happen with probability 1 − p!
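
One simple way to put a number on such uncertainty is the relative frequency of an event in past observations. A minimal sketch with an invented ten-day record; the helper name is hypothetical.

```python
def estimate_probability(outcomes):
    """Relative frequency of True in a list of observed outcomes."""
    if not outcomes:
        raise ValueError("need at least one observation")
    return sum(outcomes) / len(outcomes)


# Hypothetical record of 10 days: did it rain?
rained = [True, False, False, True, False, False, False, True, False, False]

p_rain = estimate_probability(rained)
p_no_rain = 1 - p_rain  # the event either happens or it does not
print(p_rain, p_no_rain)
```

Neither estimate is 0 or 1 here: rain is neither certain nor impossible, and the two probabilities always add up to 1.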

31
Traditional Computer Science
 Tasks like:
 Play an audio/video file
 Display a text file on screen
 Perform a mathematical operation on two numbers
 Sort an array of numbers using Insertion Sort
 Search for a string in a text file
 …

Data + Program → Output

32
Problems that Traditional CS Can’t Handle

Tumor? Y/N Price? What was said? Summarize text

Data + Program? → Output

33
Machine Learning
Regression
Classification
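
The two problem types differ in the kind of output: a category (Tumor? Y/N) for classification, a number (Price?) for regression. A minimal sketch using a nearest-neighbour lookup as an assumed stand-in learner; all feature values and targets below are invented.

```python
def nearest_neighbor_predict(train, x):
    """Return the target of the training example whose feature is closest to x."""
    closest_feature, closest_target = min(train, key=lambda pair: abs(pair[0] - x))
    return closest_target


# Classification: the target is a category. Feature = hypothetical lesion size in mm.
tumor_data = [(2.0, "N"), (3.0, "N"), (9.0, "Y"), (11.0, "Y")]

# Regression: the target is a number. Feature = hypothetical house area in square metres.
price_data = [(50.0, 100_000.0), (80.0, 160_000.0), (120.0, 240_000.0)]

print(nearest_neighbor_predict(tumor_data, 10.5))  # a label from {"Y", "N"}
print(nearest_neighbor_predict(price_data, 85.0))  # a real-valued price
```

The same procedure serves both problems; only the type of the target changes.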

34
Traditional CS: Data + Program → Output

Machine Learning: Data + Output → Program
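
The contrast can be sketched with a temperature conversion. In the traditional setting a person writes the formula; in the learning setting the formula is recovered from example inputs and outputs. Least-squares line fitting is used here as an assumed, simple learner, and the readings are generated for illustration.

```python
def fahrenheit_by_hand(celsius):
    """Traditional CS: a human writes the program; data goes in, output comes out."""
    return celsius * 9 / 5 + 32


def fit_line(xs, ys):
    """Machine learning: data and outputs go in, the program (slope, intercept) comes out."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


celsius_readings = [-10.0, 0.0, 10.0, 20.0, 30.0]
# These stand in for measured outputs; in practice they would come from observation.
fahrenheit_readings = [fahrenheit_by_hand(c) for c in celsius_readings]

slope, intercept = fit_line(celsius_readings, fahrenheit_readings)


def learned_program(celsius):
    """The program produced by learning, usable like the hand-written one."""
    return slope * celsius + intercept


print(round(slope, 3), round(intercept, 3), round(learned_program(100.0), 3))
```

The learned slope and intercept recover the hand-written formula without anyone spelling it out.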

35
Machine Learning Pipeline

36
What is Machine Learning?
 Formally:
 A computer program A is said to learn from experience E with respect to some class of tasks T and
performance measure P if its performance at tasks in T, as measured by P, improves with experience E.
(Tom Mitchell, 1997)

 Informally:
 Algorithms that improve on some task with experience.

To train a classifier, we need labelled data (called a dataset)
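
Mitchell's definition can be made concrete with a toy setup: task T is labelling a number as "small" or "large", experience E is a growing list of labelled examples, and performance P is accuracy on a fixed test set. Everything below (data, threshold learner, the cut-off at 50) is an assumption made for illustration.

```python
def train_threshold(examples):
    """Experience E -> a threshold halfway between the largest 'small' and smallest 'large' seen."""
    largest_small = max(x for x, label in examples if label == "small")
    smallest_large = min(x for x, label in examples if label == "large")
    return (largest_small + smallest_large) / 2


def accuracy(threshold, test_set):
    """Performance measure P: fraction of test examples classified correctly."""
    correct = sum(("large" if x >= threshold else "small") == label for x, label in test_set)
    return correct / len(test_set)


# Task T: label a number as "small" or "large" (hypothetical ground truth: large means x >= 50).
experience = [(5, "small"), (60, "large"), (20, "small"), (90, "large"), (45, "small"), (52, "large")]
test_set = [(x, "large" if x >= 50 else "small") for x in range(0, 100, 5)]

# Train on the first 2, 4, and 6 examples and measure P each time.
accuracies = [accuracy(train_threshold(experience[:n]), test_set) for n in (2, 4, 6)]
print(accuracies)
```

On this toy data the accuracy rises as more labelled examples are seen, which is what "improves with experience E" means; real datasets are noisier and the improvement is not guaranteed at every step.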

37
Data – Big, Big,… data!
 How do we obtain these massive datasets to train our Machine Learning models?
 From real interactions, e.g., call centers
 Expert annotators, e.g., hired teams of annotators
 Crowdsourcing

reCAPTCHA tagging

38
Task-Label Relationship
 Labels are dictated by the task to be performed.
 Example: Speech Technologies
What was said?                  Speech Recognition
Who said it?                    Speaker Recognition
Was it John Doe?                Speaker Verification
Did it mention “hey Google”?    Keyword Detection
What’s the language?            Language Identification

Is the language native for the speaker?


What is their height?
What is the age of the speaker?
What is emotional state?
What was the sentiment?
Is the voice fake?
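
The same recordings yield different datasets depending on the question asked. A minimal sketch with invented annotations; the file names, speakers, and the `label_for_task` mapping are hypothetical.

```python
# Hypothetical annotated recordings; every field is made up for illustration.
recordings = [
    {"audio": "clip_001.wav", "transcript": "hey google play music",
     "speaker": "John Doe", "language": "English"},
    {"audio": "clip_002.wav", "transcript": "aaj mausam kaisa hai",
     "speaker": "Jane Roe", "language": "Urdu"},
]

# The task decides which annotation becomes the label.
label_for_task = {
    "speech_recognition": lambda r: r["transcript"],
    "speaker_recognition": lambda r: r["speaker"],
    "speaker_verification": lambda r: r["speaker"] == "John Doe",
    "keyword_detection": lambda r: "hey google" in r["transcript"],
    "language_identification": lambda r: r["language"],
}


def build_dataset(task):
    """Pair each recording with the label that the given task asks for."""
    get_label = label_for_task[task]
    return [(r["audio"], get_label(r)) for r in recordings]


for task in label_for_task:
    print(task, build_dataset(task))
```

The inputs never change; only the label column does, which is why the task has to be fixed before any labelling starts.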
39
Task-Label Relationship
 Example: Text Technologies

Who wrote it?


Summary of what was written?
Was it plagiarized?
What was the intent?
What language is this?
Is the language native for the speaker?
What is the author’s literacy level?
What is the topic of this document?
What is the author’s emotional state?
What was the sentiment?
Can we fake this writing style?
40
Challenges of ML - Explainability
 A classifier can potentially learn to classify on the basis of features not desirable for humans
 All dogs are wearing a collar in the training data while no cat is wearing one – ML just learns to separate based on the collar
 All horse images have a copyright notice – ML just learns to recognize horses based on the copyright notice

 Explainable ML: The results should be understandable by humans


 As opposed to a black-box system
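
The collar example can be reproduced with a toy learner that picks whichever single feature best separates the training data. The features, the data, and the one-feature learner are all assumptions made for illustration.

```python
def best_single_feature(examples, feature_names):
    """Pick the one binary feature that best separates 'dog' from 'cat' on the training data."""
    def training_accuracy(name):
        hits = sum((label == "dog") == features[name] for features, label in examples)
        return hits / len(examples)
    return max(feature_names, key=training_accuracy)


feature_names = ["wears_collar", "barks"]

# Hypothetical training set: every dog wears a collar, no cat does; 'barks' was recorded noisily.
train = [
    ({"wears_collar": True, "barks": True}, "dog"),
    ({"wears_collar": True, "barks": False}, "dog"),
    ({"wears_collar": True, "barks": True}, "dog"),
    ({"wears_collar": False, "barks": False}, "cat"),
    ({"wears_collar": False, "barks": False}, "cat"),
]

chosen = best_single_feature(train, feature_names)


def predict(features):
    """Classify using only the feature chosen during training."""
    return "dog" if features[chosen] else "cat"


cat_with_collar = {"wears_collar": True, "barks": False}
print(chosen, predict(cat_with_collar))
```

The collar separates the training data perfectly, so the learner relies on it and mislabels a cat that happens to wear one. Being able to inspect `chosen` is a very small instance of an explainable model.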

41
Challenges of ML – Fairness
 AI tends to reflect the biases of society
 Human taggers who mark a recording as misinformation based on accent or gender
 Court decisions in a country that make a rich person’s acquittal more likely
 Automated standardized testing in the US could yield unfavorable results for certain demographic groups
 AI plays a decisive role in hiring, with up to 72% of resumes in the US never being viewed by a human (Automation Bias)
 Decisions on immigration, bank loans, credit history checks, criminal profiling

42
ML in Low-resource settings
 Problems where large datasets and tools are not available
 Natural Language Processing and Speech
 Pakistan has 71 languages
 We barely have speech recognition capabilities for Urdu!

43
The Offline Ones
 3.6 billion people worldwide are offline
 That is 46.6% of the world population
 13.4% of the develop world, 53% of the developing world, and 80.9% of the Least Developed Countries
are offline*
 Offline Populations
 Too poor to afford internet-enabled devices
 Too remote to access the internet
 Too low-literate to navigate the mostly text-driven internet

 285 million visually impaired individuals

*International Telecommunication Union (ITU), https://itu.foleon.com/itu/measuring-digital-development/offline-population/ (accessed: Feb 2023)

44
References
 Murphy Chapter 1
 Alpaydin Chapter 1
 TM Chapter 1

 Lectures of Andrew Ng, Dr. Ali Raza, and “Machine Learning for Intelligent Systems (CS4780/CS5780)”, Kilian Weinberger.

 This disclaimer should serve as adequate citation.

45
