
DSCI 552: Machine Learning for Data Science (Fall 2023)

Units: 4
Instructor: Mohammad Reza Rajati, PhD
PHE 412
[email protected] – Include DSCI 552 in subject.
Office Hours: Right after the lecture, by appointment

Webpage: Personal Homepage at Intelligent Decision Analysis

TA(s): Will be introduced on Piazza.

Lecture 1: Monday, Wednesday, 12:00 pm – 1:50 pm, SGM 123 & Online

Lecture 2: Monday, Wednesday, 3:30 pm – 5:20 pm, MHP 101 & Online

Webpages: Piazza Class Page for discussions, announcements, and course materials
and USC DEN Class Page for exams and grades
and GitHub for code submission

– All HWs, handouts, solutions will be posted in PDF format


– Student has the responsibility to stay current with webpage material

Prerequisite: Prior courses in multivariate calculus, linear algebra, probability, and statistics.
– This course is a prerequisite to DSCI 558.

Other Requirements: Computer programming skills.


Using Python is mandatory.
Students must know Python or must be willing to learn it.

Tentative Grading:

Assignments                 45%
Midterm 1                   20%
Midterm 2                   25%
Final Project               10%
Participation on Piazza*     5%

Letter Grade Distribution:

Score            Grade   Score            Grade
≥ 93.00          A       73.00 - 76.99    C
90.00 - 92.99    A-      70.00 - 72.99    C-
87.00 - 89.99    B+      67.00 - 69.99    D+
83.00 - 86.99    B       63.00 - 66.99    D
80.00 - 82.99    B-      60.00 - 62.99    D-
77.00 - 79.99    C+      ≤ 59.99          F
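
As a rough illustration of how the tentative grading weights and the letter grade table combine, here is a minimal Python sketch. The function and variable names are hypothetical. It assumes every component is scored on a 0-100 scale, that Piazza extra credit is added as points on top of the weighted total, and that a score falling between two printed bands (e.g., 92.995) maps to the lower band. Per the grading policies, the table only guarantees a minimum grade, so the actual letter may be higher.

```python
# Weights from the "Tentative Grading" table, as fractions of the final score.
WEIGHTS = {
    "assignments": 0.45,
    "midterm1": 0.20,
    "midterm2": 0.25,
    "project": 0.10,
}

# Lower bounds from the "Letter Grade Distribution" table, highest first.
CUTOFFS = [
    (93.0, "A"), (90.0, "A-"), (87.0, "B+"), (83.0, "B"), (80.0, "B-"),
    (77.0, "C+"), (73.0, "C"), (70.0, "C-"), (67.0, "D+"), (63.0, "D"),
    (60.0, "D-"),
]


def final_score(components, piazza_extra=0.0):
    """Weighted total on a 0-100 scale plus up to 5 points of extra credit.

    Assumption: extra credit is simply added to the weighted total.
    """
    if not 0.0 <= piazza_extra <= 5.0:
        raise ValueError("Piazza extra credit is capped at 5 points")
    weighted = sum(WEIGHTS[name] * components[name] for name in WEIGHTS)
    return weighted + piazza_extra


def minimum_letter_grade(score):
    """Return the letter grade guaranteed by the table for a final score."""
    for cutoff, letter in CUTOFFS:
        if score >= cutoff:
            return letter
    return "F"


if __name__ == "__main__":
    example = {"assignments": 90, "midterm1": 80, "midterm2": 85, "project": 90}
    total = final_score(example)
    print(round(total, 2), minimum_letter_grade(total))
```

The cutoffs are kept as a list ordered from highest to lowest so that the first matching lower bound determines the grade.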
2 DSCI 552 Syllabus – August 17, 2023

Disclaimer: Although the instructor does not expect this syllabus to drastically change, he
reserves every right to change this syllabus any time in the semester.

Note on e-mail vs. Piazza: If you have a question about the material or logistics of the class
and wish to ask it electronically, please post it on the Piazza page (not e-mail). Often, if one
student has a question/comment, others also have a similar question/comment. Use private Piazza
posts with the professor, TA, and graders only for issues that are specific to you individually (e.g., a
scheduling issue or grade issue). Minimize the use of e-mail to the course staff and only use it when
absolutely necessary.

Catalogue Description: Practical applications of machine learning techniques to real-world
problems. Uses in data mining and recommendation systems and for building adaptive user
interfaces.

Course Description: This is a foundational course with the primary application to data analytics,
but is intended to be accessible both to students from technical backgrounds such as computer
science, computer engineering, electrical engineering, or mathematics; and to students from less
technical backgrounds such as business administration, communication, accounting, various medical
specializations including preventative medicine and personalized medicine, genomics, and management
information systems. A basic understanding of engineering and/or technology principles
is needed, as well as basic programming skills and sufficient mathematical background in probability,
statistics, and linear algebra.

Course Objectives: Upon successful completion of this course a student will

• Broadly understand major algorithms used in machine learning.

• Understand supervised and unsupervised learning techniques.

• Understand regression methods.

• Understand resampling methods, including cross-validation and bootstrap.

• Understand decision trees, dimensionality reduction, regularization, clustering, and kernel
methods.

• Understand hidden Markov models and graphical models.

• Understand feedforward and recurrent neural networks and deep learning.

Exam Dates:

• Midterm 1 (in-person): Friday, October 20, 10:00 AM - 11:50 AM (May be changed to a
different hour on the same day)

• Midterm 2 (in-person): Friday, December 1, 10:00 AM - 11:50 AM (May be changed to a
different hour on the same day)

• Final Project Due: Monday, December 11, 4:00 PM. Grace period: the project can be
submitted until 11:59 PM of the same day with 30% penalty. Any change in the project after
the deadline is considered late submission. One second late is late. The project is graded
based on when it was submitted, not when it was finished. Homework late days cannot be
used for the project.

Textbooks:

• Required Textbook:

1. Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani, An Introduction
to Statistical Learning with Applications in R, Springer, 2021. (ISLR)
Available at https://web.stanford.edu/~hastie/ISLRv2_website.pdf

• Recommended Textbooks:

1. Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani, An Introduction
to Statistical Learning with Applications in Python, Springer, 2023.
Available at https://hastie.su.domains/ISLP/ISLP_website.pdf
2. Applied Predictive Modeling, 1st Edition
Authors: Max Kuhn and Kjell Johnson; Springer; 2016. ISBN-13: 978-1-4614-6848-6
3. Machine Learning: A Concise Introduction, 1st Edition
Author: Steven W. Knox; Wiley; 2018. ISBN-13: 978-1-119-43919-6
4. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd Edition
Authors: Trevor Hastie, Robert Tibshirani, and Jerome Friedman; Springer; 2008.
(ESL) ISBN-13: 978-0387848570
5. Machine Learning: An Algorithmic Perspective, 2nd Edition
Author: Stephen Marsland; CRC Press; 2014. ISBN-13: 978-1-4614-7137-0
6. Deep Learning, 1st Edition
Authors: Ian Goodfellow, Yoshua Bengio, and Aaron Courville; MIT Press; 2016. (DL)
ISBN-13: 978-0262035613
7. Neural Networks and Learning Machines, 3rd Edition
Author: Simon Haykin; Pearson; 2008. ISBN-13: 978-0131471399
8. Neural Networks and Deep Learning: A Textbook, 1st Edition
Author: Charu Aggarwal; Springer; 2018. ISBN-13: 978-3319944623
9. Introduction to Machine Learning, 2nd Edition
Author: Ethem Alpaydin; MIT Press; 2010. (AL) ISBN-13: 978-8120350786
10. Machine Learning, 1st Edition
Author: Tom M. Mitchell; McGraw-Hill Education; 1997. ISBN-13: 978-0070428072

Grading Policies:

• The letter grade distribution table guarantees the minimum grade each student will receive
based on their final score. When appropriate, relative performance measures will be used to
assign the final grade, at the discretion of the instructor.

– Final grades are non-negotiable and are assigned at the discretion of the instructor. If
you cannot accept this condition, you should not enroll in this course.

• Your lowest homework grade and half of your second lowest homework grade will be dropped
from the final grade. For example, if you received 90, 85, 10, 95, 65, 80, 100, 100, your
homework score will be
(0.5×65 + 80 + 85 + 90 + 95 + 100 + 100) / 6.5 = 89.62
instead of
(10 + 65 + 80 + 85 + 90 + 95 + 100 + 100) / 8 = 78.13.
This policy makes up for missing assignments because of heavy workload, sickness,
etc. Remember that if you miss an assignment because of heavy workload in other courses
and then miss another one because of sickness, only the second assignment’s grade will be
completely dropped from your score. Be aware of this when you decide not to submit an
assignment, because later you may become sick.

• Homework 0 will not be graded.

• *Participation on Piazza earns up to 5% extra credit, which is granted on a competitive basis
at the discretion of the instructor.

• Homework Policy

– Homework is assigned on an approximately biweekly basis. Homework due dates are
mentioned in the course outline, so mark your calendars. A three-day grace period can
be used for each homework with a 10% penalty per day. Any change in homework after
the deadline makes it a late submission. Absolutely no late homework will be accepted
after the grace period; an assignment submitted after it receives a zero grade.
– Late Days: No late homework will be accepted after the three-day grace period. One
second after the deadline is considered late. However, students are allowed to use six late
days for homework for any reason (including sickness, family emergencies, overwhelming
workload, exams, etc.) without incurring the 10% penalty. Beyond that, no individual
extension will be granted to anyone for any reason whatsoever.
Example: A student can submit six assignments, one day late each, without any penalty.
Or three assignments, two days late each, without penalty, or two assignments three days
late each. A student cannot use four late days for one assignment, and two late days for
another assignment. An assignment submitted four days late will receive a zero grade,
although its grade will be dropped as the lowest homework grade, according to the above
grading policies.
– Use your six late days strategically and only if you absolutely need them. Always
remember that later in the semester, you might become sick or have heavy workload in
other courses and might need to use your late days.
– Assignments are project-style; therefore, we do not provide solutions to the assignments.
This is a firm rule.
– Poor internet connection, failing to upload properly, or similar issues are NOT acceptable
reasons for late submissions. If you want to make sure that you do not have such
problems, submit homework eight hours earlier than the deadline. Please do not ask the
instructor to make individual exceptions.
– Homework is graded based on when it was submitted, not when it was finished.
– Homework solutions and simulation results should be typed or scanned using scanners
or mobile scanner applications like CamScan and uploaded as PDF (photos taken by cell-phone
cameras and files in formats other than PDF will NOT be accepted). Programs and simulation
results have to be uploaded on GitHub as well.
– Students are encouraged to discuss homework problems with one another, but each
student must do their own work and submit individual solutions written/coded in
their own hand. Copying the solutions or submitting identical homework sets is written
evidence of cheating. The penalty ranges from an F on the homework or exam, to an F
in the course, to recommended expulsion. One important (but not exclusive) instance
of cheating is having access to other students’ solutions. Claims of “being inspired”
by other students’ code, or using it as “sample code”, are not acceptable. Asking
your peers questions and exchanging tips about coding are highly encouraged and
should not be confused with outright cheating.
– Posting the homework assignments and their solutions to online forums or sharing them
with other students is strictly prohibited and infringes the copyright of the instructor.
Instances will be reported to USC officials as academic dishonesty for disciplinary action.

• Exam Policy

– Make-up Exams: No make-up exams will be given. If you cannot make the above
dates due to a class schedule conflict or personal matter, you must drop the class. In
the case of a required business trip or a medical emergency, a signed letter from your
manager or physician has to be submitted. This letter must include the contact of your
physician or manager.
– An excused absence from the first midterm, supported by documents, can be made up by
using the second midterm’s grade in lieu of the first midterm’s. An excused absence from
the second midterm results in an IN (incomplete) grade.
– Exams will be closed book and closed notes. Calculators are allowed, but computers, cell
phones, and any other devices that have internet capability are not allowed, except for
writing the solutions or being proctored. One letter-size cheat sheet
(back and front) is allowed for Midterm 1. Two letter-size cheat sheets (back and front)
are allowed for Midterm 2.
– All exams are cumulative, with an emphasis on material presented since the last exam.
– For several reasons, including unauthorized circulation of previous exams, we DO NOT
provide exam solutions. This is a firm rule.
– For several reasons, including the difficult logistics of dealing with a large class, we
may not be able to hold a regrading session for the exams. Please make sure that you
understand this rule when you take this course.

• Project

– The final project is more like a slightly extended Homework that will be assigned after
Midterm 2 as the final summative experience.
– The project topic and steps will be provided to students, similar to homework assign-
ments.
– Projects must be finished individually.
– A short grace period of a few hours after the project deadline will be given to students
with a 30% penalty. Submissions after the grace period will be graded zero. One second late is late.
– The project is graded based on when it was submitted, not when it was finished.
– Homework late days cannot be used for the project under any circumstances.

• Attendance:

– Students are required to attend all the lectures and discussion sessions and actively
participate in class discussions. Use of cellphones and laptops is prohibited in the classroom.
If you need your electronic devices to take notes, you should discuss this with the instructor
at the beginning of the semester.
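
The homework arithmetic described under Grading Policies (the dropped grades, the three-day grace period, and the penalty-free late days) can be sketched in Python as follows. This is only an illustration under stated assumptions, not the instructor's grading code. The function names are hypothetical, lateness is counted in whole days (any fraction of a day counts as a full day, since one second late is late), and the 10% per-day penalty is assumed to scale the earned grade. The sketch does not track the semester-wide budget of six late days.

```python
GRACE_DAYS = 3           # syllabus: three-day grace period per homework
PENALTY_PER_DAY = 0.10   # syllabus: 10% penalty per late day


def late_factor(days_late, late_days_applied=0):
    """Multiplier applied to a homework grade for a late submission.

    days_late: whole days past the deadline (0 means on time).
    late_days_applied: penalty-free late days spent on this homework.
    Assumption: the penalty multiplies the earned grade.
    """
    if days_late < 0 or late_days_applied < 0:
        raise ValueError("day counts must be non-negative")
    if late_days_applied > days_late:
        raise ValueError("cannot apply more late days than days late")
    if days_late > GRACE_DAYS:
        # Past the grace period the homework receives a zero,
        # regardless of how many late days remain.
        return 0.0
    penalized_days = days_late - late_days_applied
    return 1.0 - PENALTY_PER_DAY * penalized_days


def homework_score(grades):
    """Average after dropping the lowest grade and half of the second lowest."""
    if len(grades) < 2:
        raise ValueError("need at least two homework grades")
    ordered = sorted(grades)
    second_lowest = ordered[1]
    kept = ordered[2:]
    # The lowest grade gets weight 0, the second lowest weight 0.5,
    # so the total weight is len(grades) - 1.5.
    return (0.5 * second_lowest + sum(kept)) / (len(grades) - 1.5)


if __name__ == "__main__":
    # The syllabus example: eight homework grades.
    print(round(homework_score([90, 85, 10, 95, 65, 80, 100, 100]), 2))
    # Two days late with both days covered by late days: no penalty.
    print(late_factor(2, late_days_applied=2))
```

With the eight grades from the example above, `homework_score` reproduces the 89.62 figure given in the grading policy.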

Important Notes:

• Textbooks are secondary to the lecture notes and homework assignments.

• Handouts and course material will be distributed.

• Please use your USC email to register on Piazza and to contact the instructor and TAs.

Monday Wednesday
Aug 21st 1 23rd 2
Introduction to Statistical Learning Introduction to Statistical Learning
(ISLR Chs.1,2, ESL Chs.1,2) (ISLR Chs.1,2, ESL Chs.1,2)
Motivation: Big Data Regression, Classification
Supervised vs. Unsupervised Learning The Regression Function
Nearest Neighbors
28th 3 30th 4
Introduction to Statistical Learning Linear Regression (ISLR Ch.3, ESL Ch. 3)
(ISLR Chs.1,2, ESL Chs.1,2) Estimating Coefficients
Model Assessment Estimating the Accuracy of Coefficients
The Bias-Variance Trade-off
No Free Lunch Theorem
Sep 4th 6th 5
Labor Day Linear Regression (ISLR Ch.3, ESL Ch. 3)
Variable Selection and Hypothesis Testing
Multiple Regression
Analysis of Variance and the F Test
11th 6 13th 7
Linear Regression (ISLR Ch.3, ESL Ch. 3) Classification (ISLR Ch. 4, ESL Ch. 4)
Stepwise Variable Selection Multi-class and Multi-label Classification
Qualitative Variables Logistic Regression
Class Imbalance
Hypothesis Testing and Variable Selection
18th 8 20th 9
Classification (ISLR Ch. 4, ESL Ch. 4) Classification (ISLR Ch. 4, ESL Ch. 4)
Subsampling and Upsampling Measures for Evaluating Classifiers
SMOTE Quadratic Discriminant Analysis*
Multinomial Regression Comparison with K-Nearest Neighbors
Bayesian Linear Discriminant Analysis The Naïve Bayes’ Classifier
Text Classification
Feature Creation for Text Data
Handling Missing Data
25th 10 27th 11
Resampling Methods (ISLR Ch. 5, ESL Linear Model Selection and
Ch. 7) Regularization (ISLR Ch.6, ESL Ch. 3)
Model Assessment Subset Selection
Validation Set Approach AIC, BIC, and Adjusted R²
Cross-Validation Shrinkage Methods
The Bias-Variance Trade-off for Ridge Regression
Cross-Validation
The Bootstrap
Bootstrap Confidence Intervals

Monday Wednesday
Oct 2nd 12 4th 13
Linear Model Selection and Tree-based Methods (ISLR Ch. 8, ESL
Regularization (ISLR Ch.6, ESL Ch. 3) Chs. 9, 10)
The LASSO Regression and Classification Trees
Elastic Net Cost Complexity Pruning
Dimension Reduction Methods*
9th 14 11th 15
Tree-based Methods (ISLR Ch. 8, ESL Support Vector Machines (ISLR Ch. 9,
Chs. 9, 10, 16) ESL Ch. 12)
Bagging, Random Forests, and Boosting Maximal Margin Classifier
Support Vector Classifiers
16th 16 18th 17
Support Vector Machines (ISLR Ch. 9, Unsupervised Learning (ISLR Ch. 12,
ESL Ch. 12) ESL Ch. 14)
The Kernel Trick K-Means Clustering
Support Vector Machines Hierarchical Clustering
L1 Regularized SVMs
Multi-class and Multilabel Classification
The Vapnik-Chervonenkis Dimension*
Support Vector Regression
23rd 18 25th 19
Unsupervised Learning (ISLR Ch. 12, Unsupervised Learning (ISLR Ch. 12,
ESL Ch. 14) ESL Ch. 14)
Practical Issues in Clustering Principal Component Analysis
Anomaly Detection*
Association Rules*
Mixture Models and Soft K-Means*
30th 20 Nov 1st 21
Active and Semi-Supervised Learning Neural Networks and Deep Learning
Semi-Supervised Learning (ISLR Ch. 10, ESL Ch. 11, DL Ch. 6)
Self-Training The Perceptron
Co-Training Feedforward Neural Networks
Yarowsky Algorithm Backpropagation and Gradient Descent
Refinements Overfitting
Active vs. Passive Learning
Stream-Based vs. Pool-Based Active Learning
Query Selection Strategies

Monday Wednesday
6th 22 8th 23
Neural Networks and Deep Learning Neural Networks and Deep Learning
(DL Chs. 6, 7) (ISLR Ch. 12, DL Chs. 9, 10)
Autoencoders and Deep Feedforward Neural Convolutional Neural Networks
Networks Sequence Modeling
Regularization Recurrent Neural Networks
Early Stopping and Dropout
Adversarial Training*
13th 24 15th 25
Neural Networks and Deep Learning Hidden Markov Models (AL Ch. 15)
(ISLR Ch. 12, DL Ch. 10) Principles
Sequence-to-Sequence Modeling* The Viterbi Algorithm
Long Short Term Memory (LSTM) Neural
Networks
20th 26 22nd
Reinforcement Learning* Thanksgiving Break
Definitions
Task-Reward-Policy Formulation
Total Discounted Future Reward
Optimal Policy
Value Function
Q-Function
The Bellman Equation
Q-Learning
Exploration-Exploitation
Temporal Difference Learning
Extensions to Stochastic Environments and
Rewards
Deep Reinforcement Learning
27th 27 29th 28
Fuzzy Systems* Fuzzy Systems*
Fuzzy Sets Inference from Fuzzy Rules
Set Operations Fuzzification and Defuzzification
T-norms, T-conorms, and Fuzzy complements Learning Fuzzy Rules from Examples
Cylindrical Extensions and Fuzzy Relations The Wang-Mendel Algorithm
Fuzzy If-Then Rules as Association Rules Fuzzy C-Means Clustering

Notes:

• Items marked by * will be covered only if time permits.



Homework Due Dates & Exams

Friday       Week   Due / Exam
Aug 25th     1      -
Sep 1st      2      -
Sep 8th      3      Homework 0 Due (not graded)
Sep 15th     4      Homework 1 Due
Sep 22nd     5      Homework 2 Due
Sep 29th     6      -
Oct 6th      7      Homework 3 Due
Oct 13th     8      Homework 4 Due (Moved to Monday Oct. 16)
Oct 20th     9      [Midterm 1]
Oct 27th     10     Homework 5 Due
Nov 3rd      11     Homework 6 Due
Nov 10th     12     Homework 7 Due (Moved to Monday Nov. 13)
Nov 17th     13     -
Nov 24th     14     Homework 8 Due (Moved to Monday Nov. 28)
Dec 1st      15     [Midterm 2]

Statement on Academic Conduct and Support Systems


Academic Conduct:
Plagiarism – presenting someone else’s ideas as your own, either verbatim or recast in your own
words – is a serious academic offense with serious consequences. Please familiarize yourself with the
discussion of plagiarism in SCampus in Part B, Section 11, “Behavior Violating University Stan-
dards” policy.usc.edu/scampus-part-b. Other forms of academic dishonesty are equally unaccept-
able. See additional information in SCampus and university policies on Research and Scholarship
Misconduct.

Students and Disability Accommodations:


USC welcomes students with disabilities into all of the University’s educational programs. The
Office of Student Accessibility Services (OSAS) is responsible for the determination of appropriate
accommodations for students who encounter disability-related barriers. Once a student has com-
pleted the OSAS process (registration, initial appointment, and submitted documentation) and
accommodations are determined to be reasonable and appropriate, a Letter of Accommodation
(LOA) will be available to generate for each course. The LOA must be given to each course in-
structor by the student and followed up with a discussion. This should be done as early in the
semester as possible as accommodations are not retroactive. More information can be found at
osas.usc.edu. You may contact OSAS at (213) 740-0776 or via email at [email protected].

Support Systems:
Counseling and Mental Health - (213) 740-9355 – 24/7 on call
studenthealth.usc.edu/counseling
Free and confidential mental health treatment for students, including short-term psychotherapy,
group counseling, stress fitness workshops, and crisis intervention.

National Suicide Prevention Lifeline - 1 (800) 273-8255 – 24/7 on call


suicidepreventionlifeline.org
Free and confidential emotional support to people in suicidal crisis or emotional distress 24 hours
a day, 7 days a week.

Relationship and Sexual Violence Prevention Services (RSVP) - (213) 740-9355(WELL), press
“0” after hours – 24/7 on call
studenthealth.usc.edu/sexual-assault
Free and confidential therapy services, workshops, and training for situations related to gender-
based harm.

Office for Equity, Equal Opportunity, and Title IX (EEO-TIX) - (213) 740-5086
eeotix.usc.edu
Information about how to get help or help someone affected by harassment or discrimination, rights
of protected classes, reporting options, and additional resources for students, faculty, staff, visitors,
and applicants.

Reporting Incidents of Bias or Harassment - (213) 740-5086 or (213) 821-8298



usc-advocate.symplicity.com/care_report
Avenue to report incidents of bias, hate crimes, and microaggressions to the Office for Equity, Equal
Opportunity, and Title IX for appropriate investigation, supportive measures, and response.

The Office of Student Accessibility Services (OSAS) - (213) 740-0776


osas.usc.edu
OSAS ensures equal access for students with disabilities through providing academic accommoda-
tions and auxiliary aids in accordance with federal laws and university policy.

USC Campus Support and Intervention - (213) 821-4710


campussupport.usc.edu
Assists students and families in resolving complex personal, financial, and academic issues adversely
affecting their success as a student.

Diversity, Equity and Inclusion - (213) 740-2101


diversity.usc.edu
Information on events, programs and training, the Provost’s Diversity and Inclusion Council, Diver-
sity Liaisons for each academic school, chronology, participation, and various resources for students.

USC Emergency - UPC: (213) 740-4321, HSC: (323) 442-1000 – 24/7 on call
dps.usc.edu, emergency.usc.edu
Emergency assistance and avenue to report a crime. Latest updates regarding safety, including
ways in which instruction will be continued if an officially declared emergency makes travel to
campus infeasible.

USC Department of Public Safety - UPC: (213) 740-6000, HSC: (323) 442-120 – 24/7 on call
dps.usc.edu
Non-emergency assistance or information.

Office of the Ombuds - (213) 821-9556 (UPC) / (323) 442-0382 (HSC)


ombuds.usc.edu
A safe and confidential place to share your USC-related issues with a University Ombuds who will
work with you to explore options or paths to manage your concern.

Occupational Therapy Faculty Practice - (323) 442-3340 or [email protected]


chan.usc.edu/otfp
Confidential Lifestyle Redesign services for USC students to support health promoting habits and
routines that enhance quality of life and academic performance.
