STAT 650 - Foundations of Data Science Syllabus

Course Syllabus

STAT650 - STATISTICAL FOUNDATION OF DATA SCIENCE - Spring 2024


Course Information
Course Number: STAT 650
Course Title: STATISTICAL FOUNDATION OF DATA SCIENCE
Section: 700
Time:
Location: Online/CANVAS
Credit Hours: 3

Instructor Details

Instructor: YOONSUNG JUNG


Office: BLOCKER 245A
Phone: N/A
E-Mail: [email protected] (When you send me an email, use subject-line “STAT650”)
Office Hours: By email appointment, requested the day before and scheduled in the order received;
other times and in-person meetings can be arranged upon request.
Weekly Q&A: Thursday 7:00 pm ~ 8:00 pm CST online via ZOOM
(1. Responses are based on the questions submitted to INBOX with Subject:
STAT 650-700 Q&A (by Wednesday). 2. Additional responses will follow.)

Teaching Assistant:
Office: Blocker xxxx
Phone: N/A
E-Mail: [email protected] (When you send me an email, use subject-line “STAT650”)
Office Hours: xxx-xxx-xxxx

Course Description

This course is designed for graduate students in statistics, applied mathematics, computer science,
and related fields who are interested in statistical computing. The course will provide a fundamental
introduction to both probability and statistics with emphasis on applications in data science. We will
go through topics including basic probability concepts such as sample space, conditional probability,
random variables, as well as statistical inference. To emphasize applications towards data science, we
will discuss data analysis examples using Python. Hence, you need to be comfortable with programming
in Python. In the weekly topics section, a range of data applications are provided and underlined.

Course Prerequisites

● Familiarity with computing in Python.


● Graduate Classification or approval of instructor


Course Learning Outcomes

This course will provide both theoretical and practical knowledge about:

• the importance of statistics in data science


• fundamentals in statistics and data science
• use of statistical concepts in data science
• data analysis with Python

Textbook and/or Resource Materials

There is no specific textbook for the course, and the instructor will provide course materials. However,
the following books may provide additional information.
● Introduction to Machine Learning with Python by Andreas Müller and Sarah Guido.
● Python Data Analytics: Data Analysis and Science Using Pandas, matplotlib, and the Python
Programming Language by Fabio Nelli.
● Python Data Science Handbook by Jake VanderPlas.
● Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython by Wes McKinney.
● More resources may be posted in the Canvas modules.

Homework Policy

Homework assignments will be available on Canvas.

Each assignment's due date is posted with the assignment on Canvas. Your homework
solutions must be your own work, not from outside sources, consistent with the university rules on
academic integrity. I expect you to follow this policy scrupulously. Your chances for a good
performance on the exams will be higher if you follow this policy.

You may use:


● Your textbook and notes from class or from a related class that you took or are taking.
● References listed on the syllabus or discussion with the instructor or grader.
● Voluntary, mutual and cooperative discussion with other students currently taking the
class. There will be an online discussion board.
You may NOT use:
● Solutions manuals (printed or electronic) and copies of pages from solutions manuals.
● Solutions, notes, homework, etc., from previous classes.
● Solutions, notes, homework, etc., from students who took this class previously.
● Copying from students in this class, including expecting them to reveal their solutions in
“discussion”.

Grading Policy

Your final grade score will be based on a 100% scale as follows:


● Weekly Q&A
o You are strongly encouraged to attend all Q&A sessions. For the university
attendance policy, refer to Student Rule 7 (https://student-rules.tamu.edu/rule07/).
o It is NOT mandatory.
o If no one logs in during the first 20 minutes, the Q&A will end.
● Homework (50%)
o Homework will be assigned regularly and will include coding, computing, and summary
tasks.
o At least five assignments will be provided. Each homework carries equal weight.
o You are allowed to collaborate with other students on homework problems.
o However, verbatim copying of homework is absolutely forbidden and constitutes a
violation of the Aggie Honor Code. Therefore, each student must produce his or her own
homework to be turned in and graded.
● NO Mid-term/Project
● Final Exam/Project (35%)
o Students will work on a provided project task. You can discuss with other students.
o The exam/project will include both calculation and coding tasks.
o Submit your files (.ipynb notebook, Excel data as .xls or .xlsx, and .html) on Canvas.
o Details on exams will be announced on Canvas when the date is approaching.
● Final Project Presentation (15%)
o The project presentation should be 10~15 minutes, recorded on video as a seminar or
conference presentation. The emphasis will be on communicating effectively rather than
proving your statistical intellectual prowess.
o Submit your files (video as .mp4; presentation as .ppt or .pdf).
● Exam/Project Due Dates:
o Mid-term/Project: NA
o Final Exam/Project: TBA, 2024
● Grading Scale: Standard letter grading scale. (However, these grading cutoffs may be adjusted
downward at my discretion)

Course Grade Points Needed


A 90-100
B 80-89
C 70-79
D 60-69
F <60
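
The weights and cutoffs above can be combined into a quick estimate of a final grade. The following is a minimal sketch in Python, not an official calculator: the function and variable names are hypothetical, it assumes every component is scored on a 0-100 scale, and it treats each letter cutoff as "greater than or equal to" the lower bound in the table (the syllabus does not say how fractional scores such as 89.5 are handled, and the instructor may adjust cutoffs downward).

```python
# Weights taken from the Grading Policy; the names here are illustrative only.
WEIGHTS = {"homework": 0.50, "final_project": 0.35, "presentation": 0.15}

# Lower bounds from the "Course Grade / Points Needed" table (assumed inclusive).
CUTOFFS = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]


def final_score(homework_scores, final_project, presentation):
    """Return the weighted final score; each homework carries equal weight."""
    if not homework_scores:
        raise ValueError("at least one homework score is required")
    homework_avg = sum(homework_scores) / len(homework_scores)
    return (
        WEIGHTS["homework"] * homework_avg
        + WEIGHTS["final_project"] * final_project
        + WEIGHTS["presentation"] * presentation
    )


def letter_grade(score):
    """Map a 0-100 score to a letter using the unadjusted cutoffs."""
    for cutoff, letter in CUTOFFS:
        if score >= cutoff:
            return letter
    return "F"


score = final_score([90, 80, 100, 70, 85], final_project=88, presentation=92)
print(round(score, 1), letter_grade(score))
```

With the example scores, the homework average is 85, so the weighted total is 0.50 × 85 + 0.35 × 88 + 0.15 × 92 = 87.1, which falls in the B range.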


Exam Policy

● Your exam solutions must be your own work, consistent with the university rules on
academic integrity.
● There will be additional details listed in Canvas before the exams. You will be expected to
follow those details thoroughly.
● You may use a calculator, but it must not be able to phone, text, or access the Web, except
for downloading the exam and uploading solutions (if asked).
● If an exam is held in class, copies of a practice exam will be available for you to review in the
Exams module folder on Canvas.

Late Work Policy

● No late submission will be accepted without a valid excuse.


● If you have any issues, you should contact the instructor via email directly.
● Work submitted by a student as makeup work for an excused absence is not considered late
work and is exempted from the late work policy (Student Rule 7).
● If you fail to submit a homework assignment by the due date because of a university excused
absence or due to illness or circumstances beyond your control, notify me in writing or by email
(before, if feasible, otherwise within two working days after you return). If your absence is
approved, I will notify you on how you may make up the missed assignment.
● If you miss a homework assignment or an exam and your reason for missing the assignment or
exam is not accepted, then you will receive a score of 60 for the assignment or exam.
● A temporary grade of I (Incomplete) at the end of a semester indicates that the student has
completed the course except for a major quiz, or major exam/project. The instructor shall give
this grade only when the deficiency is due to an authorized absence or other cause beyond the
control of the student.

Course Communication Policy

● Canvas will be the primary source of information relevant to the course (e.g.,
announcements, lecture materials, assignments, changes to office hours). You must check
the Canvas - Module site at least twice weekly and whenever the instructor posts a new
announcement.
● If you have a question that arises during the class, please ask in class! It helps the instructor
regulate the pace of the course and address issues along the way. Time permitting, the
instructor tries to address the question immediately.
● Please follow these steps when seeking assistance with the course outside of class.
1. If you have questions related to class, send a message through INBOX on the left of CANVAS
for assistance. Please include ‘STAT 650’ at the beginning of the subject line. For questions
by email, please allow at least one business day for a response.
2. If the question may be helpful for others in the class, you can post your question on the
discussion board on Canvas. The TA and/or instructor will respond to your post in Canvas as
soon as possible. So, others with the same questions can also see and answer them.


3. If the question may be helpful for others in the class, but you need help communicating your
question in writing, drop in during the weekly Q&A session so you can verbally pose your
question and get answers.
4. If you need one-on-one guidance, drop in during office hours so the instructor or TA can
work with you individually.

Course Schedule

The following topics will be covered. (The order could be changed on CANVAS-Module)
● Basic Statistics
o Definition of Statistics, Population, Sample
o Types of Variables
o Types of Data
● Basic Probability
o Definition of Factorial, Combination, Permutation
o Venn Diagram: Union, Intersection, Independence
o Bayes' Theorem
● Probability Distribution
o Discrete random variable distribution:
o Continuous random variable distribution:
o Example of Python function for discrete/continuous random variable distributions
● Introduction to Python 1
o Install Python for Windows and macOS
o Install Anaconda for Windows and macOS
o Run and Review Jupyter Notebook
● Introduction to Python 2
o Basics of Python: structure, functions, class, common packages
o Review Exploratory Data Analysis (EDA) by Python
o Review Common Visualization by Python
● Exploratory Data Analysis (EDA) by Python
o Importing the relevant libraries
o How to load data
o Review Data – Missing values, dropping columns.
o Analyzing data – Univariate, Bivariate, Multivariate
● Statistical Inference I: Point & Interval Estimate
● Statistical Inference II: Hypothesis Test
o How to define Hypothesis
o Hypothesis Test when sigma is known.
o Hypothesis Test when sigma is unknown.
● Statistical Inference III: Two Population Case
o Hypothesis Test when sigma is known.
o Hypothesis Test when sigma is unknown.
● Statistical Inference IV: More Than Two Populations
o What is ANOVA?
o How to test Hypothesis?
● Regression

o Simple Linear Regression


o Multiple Linear Regression
o Polynomial Regression
o Logistic Regression
o LASSO, Ridge, Elastic Net Regression
o Quantile Regression
o Poisson Regression
o Negative Binomial Regression
o Zero Inflated and Hurdle Regression
o Cox Regression
o Partial Least Squares Regression
o Principal Component Regression
● Machine Learning and Classification

Course Calendar:

Class Duration: January 16, 2024 – April 30, 2024


● January 16: First Day
● January 18: First Weekly Q&A
● January 22: Last day to add/drop courses
● March 11-15: Spring Break - No Class/ No Q&A
● March (TBA): Mid-term Exam/Project
● March 29: Reading Day - No Classes
● April 16: Last day for all students to drop courses with no penalty (Q-drop)
● April 30: Last class day
● May (TBA): Final Examinations/Project

Optional Course Information Items

Consider adding the following additional information items to the course syllabus when appropriate.
Delete any information and/or subheadings if not needed, including this note.

Technology Support – Provide appropriate technical support information to inform students who to
contact if they encounter technical difficulties (e.g., direct technical questions to the course teaching
assistant; contact the vendor; etc.). Technical support information should include information such as
who to contact, how to contact that resource, hours of availability, etc.

Texas A&M at Qatar


Texas A&M University at Qatar students can also direct their technical questions to
[email protected]

Learning Resources – Provide information regarding available learning resources such as


supplemental instruction or tutoring when appropriate (e.g., information about the University Writing
Center for a W/C designated course or related LinkedIn Learning modules appropriate for the course
topic). The Study Hub website lists many on-campus learning resources to support students in achieving
academic excellence.
Texas A&M at Galveston
On-campus learning resources to support students in achieving excellence are available through The
Commons (tamug.edu/commons).

Texas A&M at Qatar


Texas A&M University at Qatar students should contact the Center for Teaching and Learning at
[email protected] for questions related to learning support, peer tutoring, supplemental instruction,
writing support, etc.

University Policies
This section outlines the university level policies that must be included in each course syllabus. The TAMU
Faculty Senate established the wording of these policies.

NOTE: Faculty members should not change the written statements. A faculty member may add separate
paragraphs if additional information is needed.

Attendance Policy

The university views class attendance and participation as an individual student responsibility. Students
are expected to attend class and to complete all assignments.

Please refer to Student Rule 7 in its entirety for information about excused absences, including
definitions, and related documentation and timelines.

Makeup Work Policy

Students will be excused from attending class on the day of a graded activity or when attendance
contributes to a student’s grade, for the reasons stated in Student Rule 7, or other reason deemed
appropriate by the instructor.

Please refer to Student Rule 7 in its entirety for information about makeup work, including definitions,
and related documentation and timelines.

“Absences related to Title IX of the Education Amendments of 1972 may necessitate a period of more
than 30 days for make-up work, and the timeframe for make-up work should be agreed upon by the
student and instructor” (Student Rule 7, Section 7.4.1).

“The instructor is under no obligation to provide an opportunity for the student to make up work missed
because of an unexcused absence” (Student Rule 7, Section 7.4.2).

Students who request an excused absence are expected to uphold the Aggie Honor Code and Student
Conduct Code. (See Student Rule 24.)


Academic Integrity Statement and Policy

“An Aggie does not lie, cheat or steal, or tolerate those who do.”

“Texas A&M University students are responsible for authenticating all work submitted to an instructor. If
asked, students must be able to produce proof that the item submitted is indeed the work of that
student. Students must keep appropriate records at all times. The inability to authenticate one’s work,
should the instructor request it, may be sufficient grounds to initiate an academic misconduct case”
(Section 20.1.2.3, Student Rule 20).

Texas A&M at College Station


You can learn more about the Aggie Honor System Office Rules and Procedures, academic integrity, and
your rights and responsibilities at aggiehonor.tamu.edu.

Texas A&M at Galveston


You can learn more about the Honor Council Rules and Procedures as well as your rights and
responsibilities at tamug.edu/HonorSystem.

Texas A&M at Qatar


You can learn more about academic integrity and your rights and responsibilities at Texas A&M
University at Qatar by visiting the Aggie Honor System website.

Americans with Disabilities Act (ADA) Policy

Texas A&M University is committed to providing equitable access to learning opportunities for all
students. If you experience barriers to your education due to a disability or think you may have a
disability, please contact the Disability Resources office on your campus (resources listed below).
Disabilities may include, but are not limited to attentional, learning, mental health, sensory, physical, or
chronic health conditions. All students are encouraged to discuss their disability related needs with
Disability Resources and their instructors as soon as possible.

Texas A&M at College Station


Disability Resources is located in the Student Services Building or at (979) 845-1637 or visit
disability.tamu.edu.

Texas A&M at Galveston


Disability Resources is located in the Student Services Building or at (409) 740-4587 or visit
tamug.edu/counsel/Disabilities.

Texas A&M at Qatar


Disability Services is located in the Engineering Building, room 318C or at +974.4423.0316 or visit
https://www.qatar.tamu.edu/students/student-affairs/disability-services.


Title IX and Statement on Limits to Confidentiality

Texas A&M University is committed to fostering a learning environment that is safe and productive for
all. University policies and federal and state laws prohibit gender-based discrimination and sexual
harassment, including sexual assault, sexual exploitation, domestic violence, dating violence, and
stalking.

With the exception of some medical and mental health providers, all university employees (including full
and part-time faculty, staff, paid graduate assistants, student workers, etc.) are Mandatory Reporters
and must report to the Title IX Office if the employee experiences, observes, or becomes aware of an
incident that meets the following conditions (see University Rule 08.01.01.M1):

● The incident is reasonably believed to be discrimination or harassment.


● The incident is alleged to have been committed by or against a person who, at the time of the
incident, was (1) a student enrolled at the University or (2) an employee of the University.

Mandatory Reporters must file a report regardless of how the information comes to their attention –
including but not limited to face-to-face conversations, a written class assignment or paper, class
discussion, email, text, or social media post. Although Mandatory Reporters must file a report, in most
instances, a person who is subjected to the alleged conduct will be able to control how the report is
handled, including whether or not to pursue a formal investigation. The University’s goal is to make sure
you are aware of the range of options available to you and to ensure access to the resources you need.

Texas A&M at College Station


Students wishing to discuss concerns in a confidential setting are encouraged to make an appointment
with Counseling and Psychological Services (CAPS).

Students can learn more about filing a report, accessing supportive resources, and navigating the Title IX
investigation and resolution process on the University’s Title IX webpage.

Texas A&M at Galveston


Students wishing to discuss concerns in a confidential setting are encouraged to make an appointment
with the Counseling Office in the Seibel Student Center, or call (409)740-4587. For additional
information, visit tamug.edu/counsel.

Students can learn more about filing a report, accessing supportive resources, and navigating the Title IX
investigation and resolution process on the Galveston Campus’ Title IX webpage.

Texas A&M at Qatar


Texas A&M University at Qatar students wishing to discuss concerns in a confidential setting are
encouraged to visit the Health and Wellness website for more information.

Students can learn more about filing a report, accessing supportive resources, and navigating the Title IX
investigation and resolution process on the University’s Title IX webpage.


Statement on Mental Health and Wellness

Texas A&M University recognizes that mental health and wellness are critical factors that influence a
student’s academic success and overall wellbeing. Students are encouraged to engage in healthy
self-care by utilizing available resources and services on your campus.

Texas A&M College Station


Students who need someone to talk to can contact Counseling & Psychological Services (CAPS) or call the
TAMU Helpline (979-845-2700) from 4:00 p.m. to 8:00 a.m. weekdays and 24 hours on weekends.
24-hour emergency help is also available through the 988 Suicide & Crisis Lifeline (988) or
at 988lifeline.org.

Texas A&M at Galveston


Students who need someone to talk to can call (409) 740-4736 from 8:00 a.m. to 5:00 p.m. weekdays or
visit tamug.edu/counsel for more information. For 24-hour emergency assistance during nights and
weekends, contact the TAMUG Police Dept at (409) 740-4545. 24-hour emergency help is also available
through the 988 Suicide & Crisis Lifeline (988) or at 988lifeline.org.

Texas A&M at Qatar


Texas A&M University at Qatar students wishing to discuss concerns in a confidential setting are
encouraged to visit the Health and Wellness website for more information.

Campus-Specific Policies
Texas A&M at Galveston
Classroom Access and Inclusion Statement
Texas A&M University is committed to engaged student participation in all of its programs and courses
and provides an accessible academic environment for all students. This means that our classrooms, our
virtual spaces, our practices and our interactions are as inclusive as possible and we work to provide a
welcoming instructional climate and equal learning opportunities for everyone. If you have an
instructional need, please notify me as soon as possible.
The Aggie Core values of respect, excellence, leadership, loyalty, integrity and selfless service in addition
to civility, and the ability to listen and to observe others are the foundation of a welcoming instructional
climate. Active, thoughtful and respectful participation in all aspects of the course supports a more
inclusive classroom environment as well as our mutual responsibilities to the campus community.

Statement on the Family Educational Rights and Privacy Act (FERPA)
FERPA is a federal law designed to protect the privacy of educational records by limiting access to
these records, to establish the right of students to inspect and review their educational records and to
provide guidelines for the correction of inaccurate and misleading data through informal and formal
hearings. Currently enrolled students wishing to withhold any or all directory information items may
do so by going to howdy.tamu.edu and clicking on the "Directory Hold Information" link in the Student
Records channel on the MyRecord tab. The complete FERPA Notice to Students and the student
records policy is available on the Office of the Registrar webpage.
Items that can never be identified as public information are a student’s social security number,
citizenship, gender, grades, GPR or class schedule. All efforts will be made in this class to protect your
privacy and to ensure confidential treatment of information associated with or generated by your
participation in the class.
Directory items include name, UIN, local address, permanent address, email address, local telephone
number, permanent telephone number, dates of attendance, program of study (college, major,
campus), classification, previous institutions attended, degrees, honors, and awards received,
participation in officially recognized activities and sports, medical residence location and medical
residence specialization.

College and Department Policies


College and departmental units may establish their own policies and minimum syllabus requirements. As
long as these policies and requirements do not contradict the university level requirements, colleges and
departments can add them in this section. Please remove this section if not needed.

Rubric: Stat 650-700 – Final Project Report


Criteria (Points)

Formatting (5 pts)
• Legible and professional, looks like it could be published.

Readability (5 pts)
• Ordering and transitions are clear. Not hard to read.

1. Title (5 points)
• Concise and informative
2. Authors and Affiliation (Required)
• Clearly stated
Sections 1 and 2: 5 pts

3. Abstract/Summary (10 pts)
• Summarizes motivation, methodology, findings, and implications.

4. Introduction (15 pts)
• Background/Motivation (5 points)
- Clearly explains the background and motivation.
• Literature Review (5 points)
- Reviews and synthesizes existing research related to your project.
• Research Goals and Hypothesis (5 points)
- Clearly outlines the goals and hypothesis for data analysis.

5. Data Handling (10 pts)
• Data source: where/how to get the data.
• Data description.
• Review data: check missing values and outliers.
• Transform the data set, if necessary, for data analysis.
• Fix typos and formatting issues.

6. Methodology (15 pts)
• Explain the Model (5 points):
- Select possible models for data analysis with cleaned data.
- Justify the analysis chosen.
- Explain theory if it adds to the explanation.
- Define the alpha/confidence level.
• Model Validation (5 points):
- Residuals, confusion matrix, prediction accuracy, standard errors, or maybe
effect graphs.
• Search for more complex effects
- Interactions, transformations, curvature, etc.

7. Results (20 pts)
• How Independent Variables Affect the Response (5 points)
- Graphs of effects, interactions/transformations explained. Accuracy of the
results. Variables removed from the study.
• Graphs (5 points)
- Labelled adequately.
- Easy to understand.
- Explanation of what is seen in the graph that helps explain the results.
• Output Explained (5 points)
- Everything in the report is referenced, discussed, and explained in
connection to how it answers the question/hypothesis.
• Answer the Question (5 points)
- Answer the goal and hypotheses: what the client wanted to get from your
paper.
- Explain the results and summarize the findings so they are easy to understand.

8. Conclusion (10 pts)
• Summarizes the findings of the report.
• How this will benefit people.
• Where future research could go.

9. References
• Properly lists key references.
10. Appendix
• Includes derivations, program code, additional graphs, and tables as needed.
Sections 9 and 10: 5 pts


Rubric: Stat 650-700 – Final Project Presentation


Criteria (Points) and Description

Professional Style (20)
• Appropriate background and dress.
• Good lighting.
• Polished presentation.
• Good flow and cadence.

Visuals (20)
• Presentation is interesting and engaging by Tables and Figures.
• Visuals help explain without being distracting.

Answer the Question (20)
• The goal of the client is resolved.
• Explanation is easy to understand and complete.

Summarize Bottom Line (20)
• Concise breakdown of the points that would be most important to the client.
• Summarize answers for Goal, objective, or hypothesis.

Time (20)
• Around 10 ~ 15 minutes.
• Rehearsed enough to keep it under 15 minutes.
