Project 3 - Online Course Platform Analysis
Dataset Description
The dataset comes from an online course platform and tracks user enrollments,
course details, and user progress. It covers users, courses, categories, and
enrollment activity, and is provided in CSV format as the following four
tables:
1. Users:
o user_id (INT, Primary Key)
o first_name (VARCHAR)
o last_name (VARCHAR)
o email (VARCHAR)
o country (VARCHAR)
o signup_date (DATE)
2. Courses:
o course_id (INT, Primary Key)
o course_name (VARCHAR)
o category (VARCHAR, e.g., Programming, Data Science, Business, Design)
o duration_hours (DECIMAL)
o price (DECIMAL)
3. Categories:
o category_id (INT, Primary Key)
o category_name (VARCHAR, matches category in Courses)
o description (VARCHAR)
4. Enrollments:
o enrollment_id (INT, Primary Key)
o user_id (INT, Foreign Key)
o course_id (INT, Foreign Key)
o enrollment_date (DATE)
o progress_percentage (DECIMAL, 0 to 100)
o completed (BOOLEAN, 1 for completed, 0 for not completed)
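The four tables above can be sketched as DDL. The example below is one possible layout, not a required solution: it uses Python's built-in sqlite3 module so it runs without a database server, whereas the brief does not name a specific database engine. The VARCHAR lengths, the DECIMAL precisions, the UNIQUE constraint on category_name, the foreign key from Courses.category to Categories.category_name, and the CHECK constraints are all assumptions inferred from the column notes. The helper name create_schema is hypothetical.

```python
import sqlite3

# DDL mirroring the column lists above. SQLite is an assumption made for
# portability; adapt types and syntax for MySQL, PostgreSQL, or SQL Server.
SCHEMA = """
CREATE TABLE Users (
    user_id     INTEGER PRIMARY KEY,
    first_name  VARCHAR(50),
    last_name   VARCHAR(50),
    email       VARCHAR(100),
    country     VARCHAR(50),
    signup_date DATE
);
CREATE TABLE Categories (
    category_id   INTEGER PRIMARY KEY,
    category_name VARCHAR(50) UNIQUE,   -- assumed unique so Courses can reference it
    description   VARCHAR(255)
);
CREATE TABLE Courses (
    course_id      INTEGER PRIMARY KEY,
    course_name    VARCHAR(100),
    category       VARCHAR(50) REFERENCES Categories(category_name),  -- assumed link
    duration_hours DECIMAL(5, 2),
    price          DECIMAL(8, 2)
);
CREATE TABLE Enrollments (
    enrollment_id       INTEGER PRIMARY KEY,
    user_id             INTEGER REFERENCES Users(user_id),
    course_id           INTEGER REFERENCES Courses(course_id),
    enrollment_date     DATE,
    progress_percentage DECIMAL(5, 2)
        CHECK (progress_percentage BETWEEN 0 AND 100),
    completed           BOOLEAN CHECK (completed IN (0, 1))
);
"""


def create_schema(conn):
    """Create the four tables on an open sqlite3 connection."""
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
```

Calling create_schema(sqlite3.connect("CoursePlatformDB.db")) would create the tables in a local file. Categories is created before Courses, and Users and Courses before Enrollments, so each referenced table already exists when a foreign key points at it.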
Tasks:
1. Database and Table Creation:
• Create a database named CoursePlatformDB.
• Create the four tables (Users, Courses, Categories, Enrollments) with
appropriate data types, primary keys, and foreign key constraints.
2. Data Loading:
• Load the provided CSV data into the respective tables. (CSV files will be
generated by the Python script below.)
3. SQL Queries:
Write queries to answer the following questions:
Report Deliverables:
• SQL code files with comments
• Power BI file (.pbix)
• A 1-page summary answering all 10 insight questions using both SQL and visual
evidence
• Screenshots of dashboards and key visuals in the report
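For the data loading task, a minimal sketch of reading one CSV file into one table is shown here. It assumes a SQLite connection, that each CSV has a header row whose names match the table's columns, and that the tables already exist. The file names in LOAD_ORDER and the helper name load_csv are hypothetical, since the brief does not state what the generated files are called.

```python
import csv
import sqlite3

# Parent tables come first so foreign keys resolve during loading.
# These file names are hypothetical; use whatever the CSV files are called.
LOAD_ORDER = [
    ("Users", "users.csv"),
    ("Categories", "categories.csv"),
    ("Courses", "courses.csv"),
    ("Enrollments", "enrollments.csv"),
]


def load_csv(conn, table, path):
    """Insert every data row of a headered CSV into `table`; return the row count."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)  # first row holds the column names
        columns = ", ".join(header)
        placeholders = ", ".join("?" for _ in header)
        # Table and column names are interpolated, so pass trusted values only;
        # the row values themselves go through bound parameters.
        sql = f"INSERT INTO {table} ({columns}) VALUES ({placeholders})"
        rows = list(reader)
    with conn:  # commits on success, rolls back if any insert fails
        conn.executemany(sql, rows)
    return len(rows)
```

Looping over LOAD_ORDER and calling load_csv(conn, table, path) for each pair would load all four files. Other database engines usually offer a bulk-load command or import wizard that may be a better fit for large files.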