Dsci552 Syllabus
Units: 4
Instructor: Mohammad Reza Rajati, PhD
PHE 412
[email protected] – Include DSCI 552 in subject.
Office Hours: Right after the lecture, by appointment
Webpages: Piazza Class Page for discussions, announcements, and course materials
and USC DEN Class Page for exams and grades
and GitHub for code submission
Prerequisite: Prior courses in multivariate calculus, linear algebra, probability, and statistics.
– This course is a prerequisite to DSCI 558.
Disclaimer: Although the instructor does not expect this syllabus to change drastically, he
reserves every right to change it at any time during the semester.
Note on e-mail vs. Piazza: If you have a question about the material or logistics of the class
and wish to ask it electronically, please post it on the Piazza page (not e-mail). Often, if one
student has a question/comment, others also have a similar question/comment. Use private Piazza
posts with the professor, TA, and graders only for issues that are specific to you individually (e.g., a
scheduling issue or grade issue). Minimize the use of e-mail to the course staff and only use it when
absolutely necessary.
Course Description: This is a foundational course with the primary application to data analytics,
but is intended to be accessible both to students from technical backgrounds such as computer
science, computer engineering, electrical engineering, or mathematics; and to students from less
technical backgrounds such as business administration, communication, accounting, various medical
specializations including preventative medicine and personalized medicine, genomics, and management
information systems. A basic understanding of engineering and/or technology principles
is needed, as well as basic programming skills and sufficient mathematical background in probability,
statistics, and linear algebra.
Exam Dates:
• Midterm 1 (in-person): Friday, October 20, 10:00 AM-11:50 AM. (May be changed to a
different hour on the same day)
• Final Project Due: Monday, December 11, 4:00 PM. Grace period: the project can be
submitted until 11:59 PM of the same day with 30% penalty. Any change in the project after
the deadline is considered late submission. One second late is late. The project is graded
based on when it was submitted, not when it was finished. Homework late days cannot be
used for the project.
Textbooks:
• Required Textbook:
1. Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani, An Introduction
to Statistical Learning with Applications in R, Springer, 2021. (ISLR)
Available at https://web.stanford.edu/~hastie/ISLRv2_website.pdf
• Recommended Textbooks:
1. Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani, An Introduction
to Statistical Learning with Applications in Python, Springer, 2023.
Available at https://hastie.su.domains/ISLP/ISLP_website.pdf
2. Applied Predictive Modeling, 1st Edition
Authors: Max Kuhn and Kjell Johnson; Springer; 2016. ISBN-13: 978-1-4614-6848-6
3. Machine Learning: A Concise Introduction, 1st Edition
Author: Steven W. Knox; Wiley; 2018. ISBN-13: 978-1-119-43919-6
4. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd Edition
Authors: Trevor Hastie, Robert Tibshirani, and Jerome Friedman; Springer; 2008.
(ESL) ISBN-13: 978-0387848570
5. Machine Learning: An Algorithmic Perspective, 2nd Edition
Author: Stephen Marsland; CRC Press; 2014. ISBN-13: 978-1-4614-7137-0
6. Deep Learning, 1st Edition
Authors: Ian Goodfellow, Yoshua Bengio, and Aaron Courville; MIT Press; 2016. (DL)
ISBN-13: 978-0262035613
7. Neural Networks and Learning Machines, 3rd Edition
Author: Simon Haykin; Pearson; 2008. ISBN-13: 978-0131471399
8. Neural Networks and Deep Learning: A Textbook, 1st Edition
Author: Charu Aggarwal; Springer; 2018. ISBN-13: 978-3319944623
9. Introduction to Machine Learning, 2nd Edition
Author: Ethem Alpaydin; MIT Press; 2010. (AL) ISBN-13: 978-8120350786
10. Machine Learning, 1st Edition
Author: Tom M. Mitchell; McGraw-Hill Education; 1997. ISBN-13: 978-0070428072
4 DSCI 552 Syllabus – August 17, 2023
Grading Policies:
• The letter grade distribution table guarantees the minimum grade each student will receive
based on their final score. When appropriate, relative performance measures will be used to
assign the final grade, at the discretion of the instructor.
– Final grades are non-negotiable and are assigned at the discretion of the instructor. If
you cannot accept this condition, you should not enroll in this course.
• Your lowest homework grade and half of your second lowest homework grade will be dropped
from the final grade. For example, if you received 90, 85, 10, 95, 65, 80, 100, 100, your homework
score will be (0.5×65 + 80 + 85 + 90 + 95 + 100 + 100) / 6.5 = 89.62 instead of
(10 + 65 + 80 + 85 + 90 + 95 + 100 + 100) / 8 = 78.13.
This policy makes up for missing assignments because of heavy workload, sickness,
etc. Remember that if you miss an assignment because of heavy workload in other courses
and then miss another one because of sickness, only the second assignment’s grade will be
completely dropped from your score. Be aware of this when you decide not to submit an
assignment, because later you may become sick.
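The drop rule above can be checked with a short calculation. The following is a minimal sketch in Python, not an official grading script: the function names homework_score and plain_average are hypothetical, and it assumes every homework carries equal weight, as the worked example in the policy implies.

```python
def homework_score(grades):
    """Average after dropping the lowest grade and half of the second lowest.

    Assumption: all homework assignments are weighted equally.
    """
    if len(grades) < 2:
        raise ValueError("need at least two homework grades")
    ordered = sorted(grades)
    second_lowest = ordered[1]   # counted at half weight
    full_weight = ordered[2:]    # counted at full weight; ordered[0] is dropped
    total = 0.5 * second_lowest + sum(full_weight)
    weight = 0.5 + len(full_weight)
    return total / weight


def plain_average(grades):
    """Ordinary mean with nothing dropped, for comparison."""
    return sum(grades) / len(grades)


grades = [90, 85, 10, 95, 65, 80, 100, 100]
print(round(homework_score(grades), 2))  # 89.62
print(plain_average(grades))             # 78.125
```

With the grades from the example, the lowest grade (10) is dropped entirely and the second lowest (65) counts at half weight, so the denominator is 6.5 rather than 8.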
• Homework Policy
problems, submit homework eight hours earlier than the deadline. Please do not ask the
instructor to make individual exceptions.
– Homework is graded based on when it was submitted, not when it was finished.
– Homework solutions and simulation results should be typed or scanned using scanners
or mobile scanner applications like CamScan and uploaded (photos taken by cell-phone
cameras and in formats other than pdf will NOT be accepted). Programs and simulation
results have to be uploaded on GitHub as well.
– Students are encouraged to discuss homework problems with one another, but each
student must do their own work and submit individual solutions written/coded in
their own hand. Copying the solutions or submitting identical homework sets is written
evidence of cheating. The penalty ranges from an F on the homework or exam, to an F
in the course, to recommended expulsion. One important (but not exclusive) instance
of cheating is having access to other students’ solutions. Claims of “being inspired”
by other students’ code, or using it as “sample code”, are not acceptable. Asking
your peers questions and exchanging tips about coding are highly encouraged and
should not be confused with outright cheating.
– Posting the homework assignments and their solutions to online forums or sharing them
with other students is strictly prohibited and infringes the copyright of the instructor.
Instances will be reported to USC officials as academic dishonesty for disciplinary action.
• Exam Policy
– Make-up Exams: No make-up exams will be given. If you cannot make the above
dates due to a class schedule conflict or personal matter, you must drop the class. In
the case of a required business trip or a medical emergency, a signed letter from your
manager or physician has to be submitted. This letter must include the contact of your
physician or manager.
– An excused absence supported by documents in the first midterm can be made up by
using the second midterm’s grade in lieu of the first midterm. An excused absence in
the second midterm results in an IN (incomplete) grade.
– Exams will be closed book and closed notes. Calculators are allowed, but computers,
cellphones, and any other devices that have internet capability are not allowed, except for
writing the solutions or being proctored. One letter-size cheat sheet
(back and front) is allowed for Midterm 1. Two letter-size cheat sheets (back and front)
are allowed for Midterm 2.
– All exams are cumulative, with an emphasis on material presented since the last exam.
– For several reasons, including unauthorized circulation of previous exams, we DO NOT
provide exam solutions. This is a firm rule.
– For several reasons, including the difficult logistics of dealing with a large class, we
may not be able to hold a regrading session for the exams. Please make sure that you
understand this rule when you take this course.
• Project
– The final project is more like a slightly extended Homework that will be assigned after
Midterm 2 as the final summative experience.
– The project topic and steps will be provided to students, similar to homework assignments.
– Projects must be finished individually.
– A short grace period of a few hours after the project deadline will be given to students
with a 30% penalty. Submissions after the grace period will be graded zero. One second late is late.
– The project is graded based on when it was submitted, not when it was finished.
– Homework late days cannot be used for the project under any circumstances.
• Attendance:
– Students are required to attend all the lectures and discussion sessions and actively
participate in class discussions. Use of cellphones and laptops is prohibited in the classroom.
If you need your electronic devices to take notes, you should discuss this with the instructor
at the beginning of the semester.
Important Notes:
• Please use your USC email to register on Piazza and to contact the instructor and TAs.
Monday Wednesday
Aug 21st 1 23rd 2
Introduction to Statistical Learning Introduction to Statistical Learning
(ISLR Chs.1,2, ESL Chs.1,2) (ISLR Chs.1,2, ESL Chs.1,2)
Motivation: Big Data Regression, Classification
Supervised vs. Unsupervised Learning The Regression Function
Nearest Neighbors
28th 3 30th 4
Introduction to Statistical Learning Linear Regression (ISLR Ch.3, ESL Ch. 3)
(ISLR Chs.1,2, ESL Chs.1,2) Estimating Coefficients
Model Assessment Estimating the Accuracy of Coefficients
The Bias-Variance Trade-off
No Free Lunch Theorem
Sep 4th 6th 5
Labor Day Linear Regression (ISLR Ch.3, ESL Ch. 3)
Variable Selection and Hypothesis Testing
Multiple Regression
Analysis of Variance and the F Test
11th 6 13th 7
Linear Regression (ISLR Ch.3, ESL Ch. 3) Classification (ISLR Ch. 4, ESL Ch. 4)
Stepwise Variable Selection Multi-class and Multi-label Classification
Qualitative Variables Logistic Regression
Class Imbalance
Hypothesis Testing and Variable Selection
18th 8 20th 9
Classification (ISLR Ch. 4, ESL Ch. 4) Classification (ISLR Ch. 4, ESL Ch. 4)
Subsampling and Upsampling Measures for Evaluating Classifiers
SMOTE Quadratic Discriminant Analysis*
Multinomial Regression Comparison with K-Nearest Neighbors
Bayesian Linear Discriminant Analysis The Naïve Bayes’ Classifier
Text Classification
Feature Creation for Text Data
Handling Missing Data
25th 10 27th 11
Resampling Methods (ISLR Ch. 5, ESL Linear Model Selection and
Ch. 7) Regularization (ISLR Ch.6, ESL Ch. 3)
Model Assessment Subset Selection
Validation Set Approach AIC, BIC, and Adjusted R^2
Cross-Validation Shrinkage Methods
The Bias-Variance Trade-off for Ridge Regression
Cross-Validation
The Bootstrap
Bootstrap Confidence Intervals
Oct 2nd 12 4th 13
Linear Model Selection and Tree-based Methods (ISLR Ch. 8, ESL
Regularization (ISLR Ch.6, ESL Ch. 3) Chs. 9, 10)
The LASSO Regression and Classification Trees
Elastic Net Cost Complexity Pruning
Dimension Reduction Methods*
9th 14 11th 15
Tree-based Methods (ISLR Ch. 8, ESL Support Vector Machines (ISLR Ch. 9,
Chs. 9, 10, 16) ESL Ch. 12)
Bagging, Random Forests, and Boosting Maximal Margin Classifier
Support Vector Classifiers
16th 16 18th 17
Support Vector Machines (ISLR Ch. 9, Unsupervised Learning (ISLR Ch. 12,
ESL Ch. 12) ESL Ch. 14)
The Kernel Trick K-Means Clustering
Support Vector Machines Hierarchical Clustering
L1 Regularized SVMs
Multi-class and Multilabel Classification
The Vapnik-Chervonenkis Dimension*
Support Vector Regression
23rd 18 25th 19
Unsupervised Learning (ISLR Ch. 12, Unsupervised Learning (ISLR Ch. 12,
ESL Ch. 14) ESL Ch. 14)
Practical Issues in Clustering Principal Component Analysis
Anomaly Detection*
Association Rules*
Mixture Models and Soft K-Means*
30th 20 Nov 1st 21
Active and Semi-Supervised Learning Neural Networks and Deep Learning
Semi-Supervised Learning (ISLR Ch. 10, ESL Ch. 11, DL Ch. 6)
Self-Training The Perceptron
Co-Training Feedforward Neural Networks
Yarowsky Algorithm Backpropagation and Gradient Descent
Refinements Overfitting
Active vs. Passive Learning
Stream-Based vs. Pool-Based Active Learning
Query Selection Strategies
6th 22 8th 23
Neural Networks and Deep Learning Neural Networks and Deep Learning
(DL Chs. 6, 7) (ISLR Ch. 12, DL Chs. 9, 10)
Autoencoders and Deep Feedforward Neural Convolutional Neural Networks
Networks Sequence Modeling
Regularization Recurrent Neural Networks
Early Stopping and Dropout
Adversarial Training*
13th 24 15th 25
Neural Networks and Deep Learning Hidden Markov Models (AL Ch. 15)
(ISLR Ch. 12, DL Ch. 10) Principles
Sequence-to-Sequence Modeling* The Viterbi Algorithm
Long Short Term Memory (LSTM) Neural
Networks
20th 26 22nd
Reinforcement Learning* Thanksgiving Break
Definitions
Task-Reward-Policy Formulation
Total Discounted Future Reward
Optimal Policy
Value Function
Q-Function
The Bellman Equation
Q-Learning
Exploration-Exploitation
Temporal Difference Learning
Extensions to Stochastic Environments and
Rewards
Deep Reinforcement Learning
27th 27 29th 28
Fuzzy Systems* Fuzzy Systems*
Fuzzy Sets Inference from Fuzzy Rules
Set Operations Fuzzification and Defuzzification
T-norms, T-conorms, and Fuzzy complements Learning Fuzzy Rules from Examples
Cylindrical Extensions and Fuzzy Relations The Wang-Mendel Algorithm
Fuzzy If-Then Rules as Association Rules Fuzzy C-Means Clustering
Notes:
Friday
Aug 25th 1
-
Sep 1st 2
-
8th 3
Homework 0 Due (not graded)
15th 4
Homework 1 Due
22nd 5
Homework 2 Due
29th 6
-
Oct 6th 7
Homework 3 Due
13th 8
Homework 4 Due (Moved to Monday Oct. 16)
20th 9
[Midterm 1]
27th 10
Homework 5 Due
Nov 3rd 11
Homework 6 Due
10th 12
Homework 7 Due (Moved to Monday Nov. 13)
17th 13
-
24th 14
Homework 8 Due (Moved to Monday Nov. 28)
Dec 1st 15
[Midterm 2]
Support Systems:
Counseling and Mental Health - (213) 740-9355 – 24/7 on call
studenthealth.usc.edu/counseling
Free and confidential mental health treatment for students, including short-term psychotherapy,
group counseling, stress fitness workshops, and crisis intervention.
Relationship and Sexual Violence Prevention Services (RSVP) - (213) 740-9355(WELL), press
“0” after hours – 24/7 on call
studenthealth.usc.edu/sexual-assault
Free and confidential therapy services, workshops, and training for situations related to gender-
based harm.
Office for Equity, Equal Opportunity, and Title IX (EEO-TIX) - (213) 740-5086
eeotix.usc.edu
Information about how to get help or help someone affected by harassment or discrimination, rights
of protected classes, reporting options, and additional resources for students, faculty, staff, visitors,
and applicants.
usc-advocate.symplicity.com/care_report
Avenue to report incidents of bias, hate crimes, and microaggressions to the Office for Equity, Equal
Opportunity, and Title IX for appropriate investigation, supportive measures, and response.
USC Emergency - UPC: (213) 740-4321, HSC: (323) 442-1000 – 24/7 on call
dps.usc.edu, emergency.usc.edu
Emergency assistance and avenue to report a crime. Latest updates regarding safety, including
ways in which instruction will be continued if an officially declared emergency makes travel to
campus infeasible.
USC Department of Public Safety - UPC: (213) 740-6000, HSC: (323) 442-120 – 24/7 on call
dps.usc.edu
Non-emergency assistance or information.