Data Analysis Plan
Prerequisites
Install Python (via Anaconda or python.org) or use Google Colab (free,
cloud-based).
Dedicate 5–10 hours/week: 2–3 hours for learning, 2–3 hours for practice, 1–2
hours for the mini-project.
Topics:
Resources:
Book: “Automate the Boring Stuff with Python” (free online, Chapters
1–4).
Practice:
Write a program to calculate the average of 5 numbers entered by the
user.
Create a list of 10 items (e.g., groceries) and print every second item
using a loop.
Task: Write a program that takes 5 test scores as input, stores them in a
list, and calculates the average score. Print a message based on the
average (e.g., “Pass” if ≥70, “Fail” if <70).
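A minimal sketch of this task, assuming every score is entered as a number that float() can parse. The function names (average, grade, read_scores, main) are illustrative and not part of the plan; the threshold of 70 comes from the task description.

```python
def average(scores):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(scores) / len(scores)


def grade(avg, threshold=70):
    """Return "Pass" if avg is at or above the threshold, otherwise "Fail"."""
    return "Pass" if avg >= threshold else "Fail"


def read_scores(count=5):
    """Prompt the user for `count` scores and return them as a list of floats."""
    scores = []
    for i in range(count):
        scores.append(float(input(f"Enter score {i + 1}: ")))
    return scores


def main():
    """Read 5 scores, then print the average and the Pass/Fail message."""
    scores = read_scores()
    avg = average(scores)
    print(f"Average: {avg:.1f} -> {grade(avg)}")
```

Calling main() runs the interactive version. Keeping average and grade as separate functions makes them easy to check with known values before any input handling is involved.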
Topics:
Resources:
Practice:
Read a CSV file (e.g., sample dataset from Kaggle) and print its
contents.
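One possible way to do this with only the standard library is sketched below. It assumes the file is UTF-8 encoded and comma-separated; the helper names read_csv_rows and print_csv are made up for this example.

```python
import csv


def read_csv_rows(path):
    """Return every row of the CSV file at `path` as a list of string fields."""
    # newline="" lets the csv module handle line endings itself.
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.reader(f))


def print_csv(path):
    """Print each row of the CSV file with fields separated by ' | '."""
    for row in read_csv_rows(path):
        print(" | ".join(row))
```

For example, print_csv("data.csv") prints the file row by row, where data.csv stands for whichever dataset you downloaded.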
Time: 6 hours (2h learning, 3h practice, 1h project).
Topics:
Resources:
Practice:
Create a NumPy array of 10 numbers and calculate its mean and sum.
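A short sketch of this exercise, assuming NumPy is installed (it ships with Anaconda and Google Colab). The ten values are arbitrary; any ten numbers work.

```python
import numpy as np

# Ten sample values: the integers 1 through 10.
values = np.arange(1, 11)

total = values.sum()   # sum of all elements
mean = values.mean()   # arithmetic mean of all elements

print(f"sum = {total}, mean = {mean}")  # prints: sum = 55, mean = 5.5
```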
Topics:
Resources:
Practice:
Load a CSV dataset and print the first 5 rows using head().
Filter rows where a column meets a condition (e.g., age > 18).
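The two exercises above might look like the following sketch. The data is made up and read from an in-memory string so the example runs without a file; with a real dataset, pass the file path to pd.read_csv instead.

```python
import io

import pandas as pd

# Made-up stand-in for a CSV file on disk.
csv_text = """name,age
Ana,17
Ben,22
Cara,18
Dev,35
Eli,15
Fay,41
"""

df = pd.read_csv(io.StringIO(csv_text))

# head() returns the first 5 rows by default.
print(df.head())

# A boolean mask keeps only the rows where the condition holds.
adults = df[df["age"] > 18]
print(adults)
```

Note that age > 18 is a strict comparison, so the row with age 18 is excluded; use >= if it should be kept.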
Topics:
Plotting with Pandas DataFrames.
Resources:
Practice:
Task: Use a sample sales dataset (e.g., from Kaggle). Create a bar chart
of total sales by product and a line plot of sales over time.
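A rough sketch of this task with pandas and Matplotlib follows. The sales records and the column names (date, product, amount) are hypothetical; a Kaggle dataset will have its own columns and would be loaded with pd.read_csv.

```python
import matplotlib

matplotlib.use("Agg")  # render to a file; omit this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sales records standing in for a real dataset.
sales = pd.DataFrame({
    "date": pd.to_datetime([
        "2024-01-05", "2024-01-05",
        "2024-02-10", "2024-02-10",
        "2024-03-15", "2024-03-15",
    ]),
    "product": ["A", "B", "A", "B", "A", "B"],
    "amount": [100, 150, 120, 130, 90, 170],
})

# Aggregate once per chart: totals per product and totals per date.
by_product = sales.groupby("product")["amount"].sum()
by_date = sales.groupby("date")["amount"].sum()

fig, (ax_bar, ax_line) = plt.subplots(1, 2, figsize=(10, 4))
by_product.plot(kind="bar", ax=ax_bar, title="Total sales by product")
by_date.plot(kind="line", ax=ax_line, marker="o", title="Sales over time")
ax_bar.set_ylabel("Sales")
ax_line.set_ylabel("Sales")
fig.tight_layout()
fig.savefig("sales_charts.png")
```

Grouping before plotting keeps the aggregation step visible and lets you inspect by_product and by_date as ordinary Series before drawing anything.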
Topics:
Using SciPy for statistical calculations.
Resources:
Practice:
Perform a t-test on two groups (e.g., male vs. female grades) using
SciPy.
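As an illustration, the sketch below runs an independent two-sample t-test with scipy.stats.ttest_ind. The grades are invented, and the choice of Welch's variant (equal_var=False) and the 5% significance level are assumptions for this example, not requirements of the exercise.

```python
from scipy import stats

# Made-up grades for two groups; replace with columns from a real dataset.
group_a = [78, 85, 92, 70, 88, 81, 76, 90]
group_b = [72, 80, 85, 68, 79, 74, 83, 77]

# equal_var=False selects Welch's t-test, which does not assume
# that both groups have the same variance.
result = stats.ttest_ind(group_a, group_b, equal_var=False)

print(f"t = {result.statistic:.3f}, p = {result.pvalue:.3f}")
if result.pvalue < 0.05:
    print("The difference in means is significant at the 5% level.")
else:
    print("No significant difference in means at the 5% level.")
```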
Topics:
Resources:
Practice:
Topics:
Resources:
Practice:
Post-Learning Projects
These 5 projects of increasing difficulty will help you apply and expand your data
analysis skills. Each reinforces core concepts and introduces new challenges.
Challenge: Add a feature to predict future sales using simple linear regression
(learn SciPy’s linregress).
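One way to approach this challenge is sketched below with scipy.stats.linregress. The monthly sales figures are invented, and the sketch assumes sales grow roughly linearly with the month number, which real data may not do.

```python
from scipy.stats import linregress

# Invented monthly sales; month numbers are the predictor.
months = [1, 2, 3, 4, 5, 6]
sales = [105, 118, 131, 139, 152, 161]

# Fit sales = intercept + slope * month by least squares.
fit = linregress(months, sales)

# Extrapolate the fitted line one month past the data.
next_month = 7
forecast = fit.intercept + fit.slope * next_month

print(f"slope = {fit.slope:.2f}, intercept = {fit.intercept:.2f}")
print(f"r squared = {fit.rvalue ** 2:.3f}")
print(f"forecast for month {next_month}: {forecast:.1f}")
```

The r squared value indicates how well a straight line fits the history; a low value suggests the forecast should not be trusted.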
Description: Use a stock price dataset (e.g., from Yahoo Finance via
yfinance). Calculate moving averages, volatility, and correlations between
stocks. Visualize trends and create a report comparing stock performance.
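The calculations named in this description can be sketched with pandas as follows. To stay runnable offline, the example uses synthetic prices for two hypothetical tickers (AAA and BBB) rather than data downloaded with yfinance; the 20-day window and the use of daily percentage returns for volatility are common choices, assumed here.

```python
import numpy as np
import pandas as pd

# Synthetic closing prices for two hypothetical tickers on business days.
rng = np.random.default_rng(seed=0)
dates = pd.date_range("2024-01-01", periods=60, freq="B")
prices = pd.DataFrame(
    {
        "AAA": 100 + rng.normal(0, 1, 60).cumsum(),
        "BBB": 50 + rng.normal(0, 0.5, 60).cumsum(),
    },
    index=dates,
)

# 20-day moving average; the first 19 rows are NaN until the window fills.
moving_avg = prices.rolling(window=20).mean()

# Daily percentage returns; the first row has no previous day, so drop it.
returns = prices.pct_change().dropna()

# Volatility as the standard deviation of daily returns, per ticker.
volatility = returns.std()

# Pairwise correlation between the tickers' daily returns.
correlation = returns.corr()

print(moving_avg.tail())
print(volatility)
print(correlation)
```

Correlations are computed on returns rather than raw prices because two prices that merely trend in the same direction can appear strongly correlated even when their day-to-day moves are unrelated.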
Use Kaggle: Download free datasets and explore notebooks for inspiration.
Debug Independently: Use Stack Overflow or Python documentation to solve
errors.
Ask for Feedback: Share your code on r/learnpython or with peers for
improvement.
Resources
Paid (Optional): Coursera’s Python for Data Science (audit for free), Create
& Learn’s Python for AI ($50–$100).