CS221 - Artificial Intelligence - Machine Learning - 1 Overview

Machine learning: overview

• In this module, I will provide an overview of the topics we plan to cover under machine learning.
Course plan

[Figure: course plan. From low-level to high-level intelligence: reflex-based models; state-based models (search problems, Markov decision processes, adversarial games); variable-based models (constraint satisfaction problems, Markov networks, Bayesian networks); and logic. Machine learning spans all of these.]
• Recall that machine learning is the process of turning data into a model. With that model, you can then perform inference to make predictions.
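• As a minimal sketch of this two-step process (hypothetical code, not from the course), the learner below turns a dataset into a trivially simple model, which is then used to make a prediction:

# Illustrative sketch only: "learning" turns data into a model,
# and "inference" uses the model to make predictions.
def learn(examples):
    # A trivially simple learner: the model just memorizes the mean label.
    ys = [y for _, y in examples]
    mean = sum(ys) / len(ys)
    return lambda x: mean  # the learned predictor

data = [(1.0, 2.0), (2.0, 3.0), (3.0, 4.0)]
f = learn(data)  # learning: data -> model
print(f(5.0))    # inference: model(input) -> prediction, here 3.0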
Course plan

[Figure: the same course plan, repeated; the notes below turn to the reflex-based models.]
• While machine learning can be applied to any type of model, we will focus our attention on reflex-based models, which include models such
as linear classifiers and neural networks.
• In reflex-based models, inference (prediction) involves a fixed set of fast, feedforward operations.
Reflex-based models

input x → predictor f → output y
• Abstractly, a reflex-based model (which we will call a predictor f ) takes some input x and produces some output y.
• (In statistics, y is known as the response, and when x is a real vector, its components are known as covariates or sometimes predictors, which is an unfortunate naming clash.)
• The input can usually be arbitrary (an image or sentence), but the form of the output y is generally restricted, and what it is determines the
type of prediction task.
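• For concreteness, here is a sketch of a reflex-based predictor f (my own illustration, with made-up weights): a linear classifier whose prediction is a single fixed feedforward computation, a dot product followed by taking the sign:

# Sketch of a reflex-based predictor f: x -> y.
# Inference is a fixed feedforward computation (no search or planning).
w = [0.5, -1.0, 2.0]  # weights, assumed already learned

def f(x):
    score = sum(w_i * x_i for w_i, x_i in zip(w, x))
    return +1 if score >= 0 else -1

print(f([1.0, 0.0, 1.0]))  # => +1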
Binary classification

x → classifier f → y ∈ {+1, −1} (label)

Fraud detection: credit card transaction → fraud or no fraud
Toxic comments: online comment → toxic or not toxic
Higgs boson: measurements of event → decay event or background

Extension: multiclass classification: y ∈ {1, ..., K}
• One common prediction task is binary classification, where the output y is binary, typically expressed as positive (+1) or negative (−1).
• In the context of classification tasks, f is called a classifier and y is called a label (sometimes class, category, or tag).
• Here are some practical applications.
• One application is fraud detection: given information about a credit card transaction, predict whether it is a fraudulent transaction or not, so that the transaction can be blocked.
• Another application is moderating online discussion forums: given an online comment, predict whether it is toxic (and therefore should get
flagged or taken down) or not.
• A final application comes from physics: After the discovery of the Higgs boson, scientists were interested in how it decays. The Large Hadron
Collider at CERN smashes protons against each other and then detects the ensuing events. The goal is to predict whether each event is a
Higgs boson decaying (into two tau particles) or just background noise.
• Each of these applications has an associated Kaggle dataset where you can find more details.
• As an aside, multiclass classification is a generalization of binary classification where the output y could be one of K possible values. For
example, in digit classification, K = 10.
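• To make the output spaces concrete, here is an illustrative sketch (not the course's code): a binary classifier outputs a label in {+1, −1}, while a multiclass classifier picks one of K classes, here by taking an argmax over per-class scores:

# Sketch: binary vs. multiclass classification outputs.
def binary_classify(score):
    return +1 if score >= 0 else -1  # y in {+1, -1}

def multiclass_classify(scores):
    # y in {1, ..., K}: return the (1-based) index of the highest-scoring class
    best = max(range(len(scores)), key=lambda k: scores[k])
    return best + 1

print(binary_classify(-0.3))                 # => -1
print(multiclass_classify([0.1, 2.0, 0.5]))  # => 2 (here K = 3)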
Regression

x → predictor f → y ∈ R (response)

Poverty mapping: satellite image → asset wealth index
Housing: information about house → price
Arrival times: destination, weather, time → time of arrival
• The second major type of prediction task we’ll cover is regression. Here, the output y is a real number (often called the response or target).
• One application is poverty mapping: given a satellite image, predict the average asset wealth index of the homes in that area. This is used
to measure poverty across the world and determine which areas are in greatest need of aid.
• Another application: given information about a house (e.g., location, number of bedrooms), predict its price.
• A third application is to predict the arrival time of some service, which could be package deliveries, flights, or rideshares.
• The key distinction between classification and regression is that classification has discrete outputs (e.g., "yes" or "no" for binary classification), whereas regression has continuous outputs.
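• As a small illustration of that distinction (hypothetical features and weights), a regression predictor outputs a real number rather than a discrete label:

# Sketch: a linear regression predictor for house prices
# (weights are made up and assumed already learned).
w = [1200.0, 15000.0]  # per square meter, per bedroom
b = 50000.0            # bias term

def f(x):              # x = [square_meters, num_bedrooms]
    return b + sum(w_i * x_i for w_i, x_i in zip(w, x))

print(f([120.0, 3.0]))  # => 239000.0, a real-valued price (y in R)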
Structured prediction

x → predictor f → y is a complex object

Machine translation: English sentence → Japanese sentence
Dialogue: conversational history → next utterance
Image captioning: image → sentence describing image
Image segmentation: image → segmentation
• The final type of prediction task we will consider is structured prediction, which is a bit of a catch-all.
• In structured prediction, the output y is a complex object, which could be a sentence or an image. So the space of possible outputs is huge.
• One application is machine translation: given an input sentence in one language, predict its translation into another language.
• Dialogue can be cast as structured prediction: given the past conversational history between a user and an agent (in the case of virtual
assistants), predict the next utterance (what the agent should say).
• In image captioning, say for visual assistive technologies: given an image, predict a sentence describing what is in that image.
• In image segmentation, which is needed to localize objects for autonomous driving: given an image of a scene, predict the segmentation of
that image into regions corresponding to objects in the world.
• Generating an image or a sentence can seem daunting, but there's a secret here. A structured prediction task can often be broken up into a sequence of multiclass classification tasks. For example, to predict an entire sentence, predict one word at a time, going left to right (see the sketch after these notes). This is a very powerful reduction!
• Aside: one challenge with this approach is that the errors might cascade: if you start making errors, then you might go off the rails and start
making even more errors.
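• Here is a minimal sketch of that reduction (illustrative only; next_word is a stand-in for a learned multiclass classifier over the vocabulary). Note how each decision feeds into the next, which is also where cascading errors come from:

# Sketch: structured prediction reduced to repeated multiclass classification.
def next_word(prefix):
    # Stand-in for a learned classifier: maps a prefix to the next word.
    table = {(): "the", ("the",): "cat", ("the", "cat"): "sat"}
    return table.get(tuple(prefix), "<eos>")

def generate_sentence():
    words = []
    while True:
        word = next_word(words)  # one multiclass decision per position
        if word == "<eos>":
            return words
        words.append(word)       # earlier choices condition later ones

print(generate_sentence())  # => ['the', 'cat', 'sat']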
Roadmap

Tasks: linear regression, linear classification, K-means
Models: non-linear features, feature templates, neural networks, differentiable programming
Algorithms: stochastic gradient descent, backpropagation
Considerations: group DRO, generalization, best practices
• Here are the rest of the modules under the machine learning unit.
• We will start by talking about regression and binary classification, the two most fundamental tasks in machine learning. Specifically, we study
the simplest setting: linear regression and linear classification, where we have linear models trained by gradient descent.
• Next, we will introduce stochastic gradient descent, and show that it can be much faster than vanilla gradient descent.
• We then take a careful look at the errors of a model and discuss group DRO, a technique that helps ensure that errors do not fall unevenly on different groups of the population.
• Then we will push the limits of linear models by showing how you can define non-linear features, which effectively gives us non-linear
predictors using the machinery of linear models! Feature templates provide us with a framework for organizing the set of features.
• Then we introduce neural networks, which also provide non-linear predictors, but allow these non-linearities to be learned automatically from
data. We follow up immediately with backpropagation, an algorithm that allows us to automatically compute gradients needed for training
without having to take gradients manually.
• We then briefly discuss the extension of neural networks to differentiable programming, which allows us to easily build up many of the existing state-of-the-art deep learning models in NLP and computer vision like Lego blocks.
• So far we have focused on supervised learning. We take a brief detour and discuss K-means, which is a simple unsupervised learning algorithm
for clustering data points.
• We end on a more reflective note: generalization is about answering the question: when does a model trained on a set of training examples actually generalize to new test inputs? This is where model complexity comes up. Finally, we discuss best practices for doing machine learning in practice.
