STAT 650 - Foundations of Data Science Syllabus
Instructor Details
Teaching Assistant:
Office: Blocker xxxx
Phone: N/A
E-Mail: [email protected] (When you send me an email, use subject-line “STAT650”)
Office Hours: xxx-xxx-xxxx
Course Description
This course is designed for graduate students in statistics, applied mathematics, computer science,
and related fields who are interested in statistical computing. The course will provide a fundamental
introduction to both probability and statistics with emphasis on applications in data science. We will
go through topics including basic probability concepts such as sample space, conditional probability,
random variables, as well as statistical inference. To emphasize applications towards data science, we
will discuss data analysis examples using Python. Hence, you need to be comfortable with programming
in Python. In the weekly topics section, a range of data applications are provided and underlined.
Course Prerequisites
Course Syllabus
This course will provide both theoretical and practical knowledge about,
There is no specific textbook for the course, and the instructor will provide course materials. However,
the following books may provide additional information.
● Introduction to Machine Learning with Python by Andreas Müller and Sarah Guido.
● Python Data Analytics: Data Analysis and Science using Pandas, matplotlib, and the Python
Programming Language by Fabio Nelli.
● Python Data Science Handbook by Jake VanderPlas.
● Python for Data Analysis: Data Wrangling with Pandas, NumPy and IPython by Wes McKinney.
● More resources may be available in the Canvas modules.
Homework Policy
Grading Policy
● Weekly Q&A
o You are strongly encouraged to attend all Q&A sessions. For the university
attendance policy, refer to Student Rule 7 (https://student-rules.tamu.edu/rule07/).
o Attendance is NOT mandatory.
o If no one logs in during the first 20 minutes, the Q&A will end.
● Homework (50%)
o Homework will be assigned regularly and will include coding, computing, and summary
tasks.
o At least five assignments will be provided. Each homework carries equal weight.
o You are allowed to collaborate with other students on homework problems.
o However, verbatim copying of homework is absolutely forbidden and constitutes a
violation of the Aggie Honor Code. Therefore, each student must produce his or her own
homework to be turned in and graded.
● NO Mid-term/Project
● Final Exam/Project (35%)
o Students will work on a provided project task. You may discuss it with other students.
o The exam/project will include both calculation and coding tasks.
o Submit your files (notebook as .ipynb, Excel data as .xls or .xlsx, and an .html export) on Canvas.
o Details on exams will be announced on Canvas as the date approaches.
● Final Project Presentation (15%)
o The project presentation should be 10 to 15 minutes long and recorded on video in the
style of a seminar or conference presentation. The emphasis will be on communicating
effectively rather than proving your statistical intellectual prowess.
o Submit your files (video as .mp4, slides as .ppt or .pdf).
● Exam/Project Due Dates:
o Mid-term/Project: NA
o Final Exam/Project: TBA, 2024
● Grading Scale: Standard letter grading scale. (However, these grading cutoffs may be adjusted
downward at my discretion)
Exam Policy
● Your exam solutions must be your own work, consistent with the university rules on
academic integrity.
● There will be additional details listed in Canvas before the exams. You will be expected to
follow those details thoroughly.
● You may use a calculator, but it must not be able to make calls, send texts, or access the
web, except for downloading the exam and uploading solutions (if asked).
● If an exam is held in class, copies of a practice exam will be available for you to review
under the Module for Exams folder on Canvas.
● Canvas will be the primary source of information relevant to the course (e.g.,
announcements, lecture materials, assignments, changes to office hours). You must check
the Canvas - Module site at least twice weekly and whenever the instructor posts a new
announcement.
● If you have a question that arises during the class, please ask in class! It helps the instructor
regulate the pace of the course and address issues along the way. Time permitting, the
instructor tries to address the question immediately.
● Please follow these steps when seeking assistance with the course outside of class.
1. If you have questions related to class, send a message through the Inbox in the left-hand
menu of Canvas. Please include ‘STAT 650’ at the beginning of the subject line. For questions
by email, please allow at least one business day for a response.
2. If the question may be helpful for others in the class, you can post your question on the
discussion board on Canvas. The TA and/or instructor will respond to your post in Canvas as
soon as possible, so that others with the same question can also see the answers.
3. If the question may be helpful for others in the class, but you need help communicating your
question in writing, drop in during the weekly Q&A session so you can verbally pose your
question and get answers.
4. If you need one-on-one guidance, drop in during office hours so the instructor or TA can
work with you individually.
Course Schedule
The following topics will be covered. (The order may change; check the Canvas modules.)
● Basic Statistics
o Definition of Statistics, Population, Sample
o Types of Variables
o Types of Data
● Basic Probability
o Definition of Factorial, Combination, Permutation
o Venn Diagram: Union, Intersection, Independence
o Bayes' Theorem
● Probability Distribution
o Discrete random variable distribution:
o Continuous random variable distribution:
o Example of Python function for discrete/continuous random variable distributions
● Introduction to Python 1
o Install Python for Windows and macOS
o Install Anaconda for Windows and macOS
o Run and Review Jupyter Notebook
● Introduction to Python 2
o Basics of Python: structure, functions, class, common packages
o Review Exploratory Data Analysis (EDA) with Python
o Review Common Visualizations with Python
● Exploratory Data Analysis (EDA) by Python
o Importing the relevant libraries
o How to load data
o Review Data – Missing values, dropping columns.
o Analyzing data – Univariate, Bivariate, Multivariate
● Statistical Inference I: Point & Interval Estimate
● Statistical Inference II: Hypothesis Test
o How to define hypotheses
o Hypothesis Test when sigma is known.
o Hypothesis Test when sigma is unknown.
● Statistical Inference III: Two Population Case
o Hypothesis Test when sigma is known.
o Hypothesis Test when sigma is unknown.
● Statistical Inference IV: More Than Two Populations
o What is ANOVA?
o How to test the hypotheses?
● Regression
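As an illustration of the kind of Python used for the probability and inference topics listed above, here is a minimal sketch that relies only on the standard library. The function names (bayes_posterior, z_test_known_sigma, z_interval) and all numbers are hypothetical and are not taken from the course materials; actual assignments may use other packages.

```python
from math import comb, factorial, perm, sqrt
from statistics import NormalDist, mean

# Counting rules from the Basic Probability topic.
n_orderings = factorial(5)   # 5! = 120 ways to order 5 items
n_subsets = comb(10, 3)      # 120 ways to choose 3 of 10 (order ignored)
n_arrangements = perm(10, 3) # 720 ways to arrange 3 of 10 (order matters)


def bayes_posterior(prior, sensitivity, false_positive_rate):
    """Return P(A | B) by Bayes' theorem.

    prior               = P(A)
    sensitivity         = P(B | A)
    false_positive_rate = P(B | not A)
    """
    evidence = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / evidence


def z_test_known_sigma(sample, mu0, sigma):
    """Two-sided one-sample z-test when the population sigma is known.

    Returns the z statistic and the two-sided p-value for H0: mu = mu0.
    """
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def z_interval(sample, sigma, confidence=0.95):
    """Confidence interval for the mean when sigma is known."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z_crit * sigma / sqrt(len(sample))
    center = mean(sample)
    return center - half_width, center + half_width


# Hypothetical data, for illustration only.
sample = [11, 12, 13, 12]
z, p = z_test_known_sigma(sample, mu0=10, sigma=2)
low, high = z_interval(sample, sigma=2)
posterior = bayes_posterior(prior=0.01, sensitivity=0.95, false_positive_rate=0.05)
```

When sigma is unknown, the same structure applies with the sample standard deviation and a t distribution in place of the normal; the standard library has no t distribution, so that case would need an additional package.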
Course Calendar:
topic). The Study Hub website lists many on-campus learning resources to support students in achieving
academic excellence.
Texas A&M at Galveston
On-campus learning resources to support students in achieving excellence are available through The
Commons (tamug.edu/commons).
University Policies
This section outlines the university level policies that must be included in each course syllabus. The TAMU
Faculty Senate established the wording of these policies.
Attendance Policy
The university views class attendance and participation as an individual student responsibility. Students
are expected to attend class and to complete all assignments.
Please refer to Student Rule 7 in its entirety for information about excused absences, including
definitions, and related documentation and timelines.
Students will be excused from attending class on the day of a graded activity or when attendance
contributes to a student’s grade, for the reasons stated in Student Rule 7, or other reason deemed
appropriate by the instructor.
Please refer to Student Rule 7 in its entirety for information about makeup work, including definitions,
and related documentation and timelines.
“Absences related to Title IX of the Education Amendments of 1972 may necessitate a period of more
than 30 days for make-up work, and the timeframe for make-up work should be agreed upon by the
student and instructor” (Student Rule 7, Section 7.4.1).
“The instructor is under no obligation to provide an opportunity for the student to make up work missed
because of an unexcused absence” (Student Rule 7, Section 7.4.2).
Students who request an excused absence are expected to uphold the Aggie Honor Code and Student
Conduct Code. (See Student Rule 24.)
“An Aggie does not lie, cheat or steal, or tolerate those who do.”
“Texas A&M University students are responsible for authenticating all work submitted to an instructor. If
asked, students must be able to produce proof that the item submitted is indeed the work of that
student. Students must keep appropriate records at all times. The inability to authenticate one’s work,
should the instructor request it, may be sufficient grounds to initiate an academic misconduct case”
(Section 20.1.2.3, Student Rule 20).
Texas A&M University is committed to providing equitable access to learning opportunities for all
students. If you experience barriers to your education due to a disability or think you may have a
disability, please contact the Disability Resources office on your campus (resources listed below).
Disabilities may include, but are not limited to attentional, learning, mental health, sensory, physical, or
chronic health conditions. All students are encouraged to discuss their disability related needs with
Disability Resources and their instructors as soon as possible.
Texas A&M University is committed to fostering a learning environment that is safe and productive for
all. University policies and federal and state laws prohibit gender-based discrimination and sexual
harassment, including sexual assault, sexual exploitation, domestic violence, dating violence, and
stalking.
With the exception of some medical and mental health providers, all university employees (including full
and part-time faculty, staff, paid graduate assistants, student workers, etc.) are Mandatory Reporters
and must report to the Title IX Office if the employee experiences, observes, or becomes aware of an
incident that meets the following conditions (see University Rule 08.01.01.M1):
Mandatory Reporters must file a report regardless of how the information comes to their attention –
including but not limited to face-to-face conversations, a written class assignment or paper, class
discussion, email, text, or social media post. Although Mandatory Reporters must file a report, in most
instances, a person who is subjected to the alleged conduct will be able to control how the report is
handled, including whether or not to pursue a formal investigation. The University’s goal is to make sure
you are aware of the range of options available to you and to ensure access to the resources you need.
Students can learn more about filing a report, accessing supportive resources, and navigating the Title IX
investigation and resolution process on the University’s Title IX webpage.
Students can learn more about filing a report, accessing supportive resources, and navigating the Title IX
investigation and resolution process on the Galveston Campus’ Title IX webpage.
Texas A&M University recognizes that mental health and wellness are critical factors that influence a
student’s academic success and overall wellbeing. Students are encouraged to engage in healthy
self-care by utilizing available resources and services on your campus.
Campus-Specific Policies
Texas A&M at Galveston
Classroom Access and Inclusion Statement
Texas A&M University is committed to engaged student participation in all of its programs and courses
and provides an accessible academic environment for all students. This means that our classrooms, our
virtual spaces, our practices and our interactions are as inclusive as possible and we work to provide a
welcoming instructional climate and equal learning opportunities for everyone. If you have an
instructional need, please notify me as soon as possible.
The Aggie Core values of respect, excellence, leadership, loyalty, integrity and selfless service in addition
to civility, and the ability to listen and to observe others are the foundation of a welcoming instructional
climate. Active, thoughtful and respectful participation in all aspects of the course supports a more
inclusive classroom environment as well as our mutual responsibilities to the campus community.
Statement on the Family Educational Rights and Privacy Act (FERPA)
FERPA is a federal law designed to protect the privacy of educational records by limiting access to
these records, to establish the right of students to inspect and review their educational records and to
provide guidelines for the correction of inaccurate and misleading data through informal and formal
hearings. Currently enrolled students wishing to withhold any or all directory information items may
do so by going to howdy.tamu.edu and clicking on the "Directory Hold Information" link in the Student
Records channel on the MyRecord tab. The complete FERPA Notice to Students and the student
records policy is available on the Office of the Registrar webpage.
Items that can never be identified as public information are a student’s social security number,
citizenship, gender, grades, GPR or class schedule. All efforts will be made in this class to protect your
privacy and to ensure confidential treatment of information associated with or generated by your
participation in the class.
Directory items include name, UIN, local address, permanent address, email address, local telephone
number, permanent telephone number, dates of attendance, program of study (college, major,
campus), classification, previous institutions attended, degrees, honors and awards received,
participation in officially recognized activities and sports, medical residence location and medical
residence specialization.
Criteria (Points)
Formatting (5 pts)
• Legible and professional, looks like it could be published.
Readability (5 pts)
• Ordering and transitions are clear. Not hard to read.
1. Title (5 pts)
• Concise and informative
2. Authors and Affiliation (Required)
• Clearly stated
3. Abstract/Summary (10 pts)
• Summarizes motivation, methodology, findings, and implications.
4. Introduction (15 pts)
• Background/Motivation (5 points)
- Clearly explains the background and motivation.
• Literature Review (5 points)
5. Data Handling (10 pts)
• Data source: where/how to get the data.
• Data description
• Review data: Checking missing values and outliers.
• Transform the data set, if necessary, for data analysis.
• Fix typos and formatting issues.
6. Methodology (15 pts)
• Explain the Model (5 points):
- Select possible models for data analysis with cleaned data.
- Justify the analysis chosen.
- Explain theory if it adds to the explanation.
- Define the alpha/confidence level.
• Model Validation (5 points):
- Residuals, confusion matrix, prediction accuracy, standard errors, or maybe
effect graphs.
• Searched for more complex effects
- Interactions, transformations, curvature, etc.
7. Results (20 pts)
• How Independent variables affect the response (5 points)
- Graphs of effects, interactions/transformations explained. Accuracy of the
results. Variables removed from the study.
• Graphs (5 points)
- Labelled adequately.
- Easy to understand.
- Explanation of what is seen in the graph helps explain the results.
• Output Explained (5 points)
- Everything in the report is referenced, discussed, and explained in
connection to how it answers the question/hypothesis.
• Answer the Question (5 points)
- Address the goal and hypotheses: what the client wanted to get from your
paper.
- Explain the results and summarize the findings clearly.
8. Conclusion (10 pts)
• Summarized the findings of the report.
• How this will benefit people.
• Where future research could go.
9. References (5 pts)
• Properly lists key references.
10. Appendix
• Includes derivations, program code, additional graphs, and tables as needed.