Final Coursework - 24.2 Ad Cert Python

The coursework requires students to conduct a data analysis and visualization project using Python, focusing on a real-world dataset. Groups of 8-10 members will define a problem, select a dataset, and perform data cleaning, exploratory analysis, and visualization, culminating in a presentation. Deliverables include a Jupyter Notebook, a report, presentation slides, and the dataset file.

Uploaded by

sahandumidu4

Final Coursework: Data Analysis & Visualization Project Using Python

Objective:

The goal of this coursework is for students to apply the Python programming concepts they
have learned throughout the semester. Students will work on a real-world dataset, perform
data manipulation, analysis, and visualization, and present their findings in a final
presentation.

Number of Members in a Group: 8-10 members (maximum 10)

Project Overview:

Each group will choose a problem or opportunity it has identified. (The problem can be
anything, such as improving business performance, increasing profit, reducing cost, using
resources effectively, or increasing market visibility.)

The dataset can be obtained from sources such as Kaggle, the UCI Machine Learning Repository,
or open data portals. Each group must conduct an end-to-end data analysis project, focusing on
data cleaning, manipulation, exploratory data analysis (EDA), and visualization.

Each team must submit a Jupyter Notebook along with the report and deliver a 10-minute
presentation at the end of the semester.

Tasks & Deliverables:

1. Problem Definition (10 Marks)

• Select a real-world problem (e.g., Healthcare, Finance, Sports, Social Media, E-Commerce).
• Explain why this problem is suited to being addressed with Python data analysis and
visualization techniques in this coursework.

2. Dataset Selection (10 Marks)

• Select a dataset relevant to the real-world problem identified in the previous step.
• Justify why this dataset was chosen and what insights it could provide.
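
A quick profile of a candidate dataset can support the justification above. The following is a minimal sketch, not a required method: the inline CSV text, the column names, and the `profile` helper are all hypothetical stand-ins for a file the group would download itself.

```python
import io

import pandas as pd

# Hypothetical stand-in for a downloaded file. In a real project this would be
# something like pd.read_csv("your_dataset.csv").
csv_text = """order_id,region,units,unit_price
1,North,3,9.99
2,South,,14.50
3,North,5,9.99
4,East,2,
"""
df = pd.read_csv(io.StringIO(csv_text))


def profile(frame: pd.DataFrame) -> pd.DataFrame:
    """Summarise each column: dtype, missing count, and number of distinct values."""
    return pd.DataFrame({
        "dtype": frame.dtypes.astype(str),
        "missing": frame.isna().sum(),
        "unique": frame.nunique(),
    })


summary = profile(df)
print(df.shape)   # (rows, columns)
print(summary)
```

A table like `summary` shows at a glance how much cleaning the dataset will need and which columns carry enough variety to analyse.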

3. Data Cleaning & Preprocessing (20 Marks)

• Handle missing values (fill, drop, or impute).
• Remove duplicates and handle incorrect data types.
• Normalize, encode, or scale data where necessary.
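
The cleaning steps listed above could look roughly like this in pandas. This is a sketch under stated assumptions: the `raw` DataFrame and its columns are invented for illustration, and median imputation and min-max scaling are just one reasonable choice each, not the required techniques.

```python
import numpy as np
import pandas as pd

# Hypothetical raw data with typical problems: a missing value in two columns,
# a duplicated row, and numbers stored as strings.
raw = pd.DataFrame({
    "customer": ["A", "B", "B", "C", "D"],
    "age": ["34", "41", "41", None, "29"],
    "segment": ["retail", "corporate", "corporate", "retail", "retail"],
    "spend": [120.0, 300.0, 300.0, 80.0, np.nan],
})

# Remove exact duplicate rows.
clean = raw.drop_duplicates().copy()

# Fix data types: age arrives as text, so convert it to a number.
clean["age"] = pd.to_numeric(clean["age"], errors="coerce")

# Impute missing numeric values with the column median (an assumed choice;
# dropping rows or filling with the mean are alternatives).
for col in ["age", "spend"]:
    clean[col] = clean[col].fillna(clean[col].median())

# Min-max scale spend into the range [0, 1].
spend_min = clean["spend"].min()
spend_range = clean["spend"].max() - spend_min
clean["spend_scaled"] = (clean["spend"] - spend_min) / spend_range

# One-hot encode the categorical column.
clean = pd.get_dummies(clean, columns=["segment"], dtype=int)

print(clean)
```

Whichever strategy a group picks, the report should state why, since the choice of imputation and scaling affects every later result.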

4. Exploratory Data Analysis (EDA) (15 Marks)

• Use descriptive statistics to summarize key insights.
• Identify correlations and patterns using groupby, value_counts, pivot_table, etc.
• Detect and visualize outliers.
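
The EDA calls named above can be sketched as follows. The `sales` DataFrame is hypothetical, and the 1.5 × IQR rule used for outlier detection is one common convention assumed here, not the only valid one.

```python
import pandas as pd

# Hypothetical sales records; one row has an unusually large order.
sales = pd.DataFrame({
    "region": ["North", "North", "South", "South", "East", "East", "East", "North"],
    "channel": ["web", "store", "web", "web", "store", "web", "store", "web"],
    "units": [10, 12, 9, 11, 13, 10, 95, 12],
    "revenue": [100.0, 125.0, 88.0, 110.0, 131.0, 99.0, 940.0, 118.0],
})

# Descriptive statistics for the numeric columns.
print(sales.describe())

# Patterns by category.
mean_by_region = sales.groupby("region")["revenue"].mean()
channel_counts = sales["channel"].value_counts()
pivot = sales.pivot_table(index="region", columns="channel",
                          values="units", aggfunc="sum")

# Pairwise correlation between the numeric columns.
corr = sales[["units", "revenue"]].corr()

# Outlier detection with the 1.5 * IQR rule (an assumed convention).
q1, q3 = sales["units"].quantile([0.25, 0.75])
iqr = q3 - q1
is_outlier = (sales["units"] < q1 - 1.5 * iqr) | (sales["units"] > q3 + 1.5 * iqr)
outliers = sales[is_outlier]

print(mean_by_region, channel_counts, pivot, corr, outliers, sep="\n\n")
```

A boxplot of the same column is the usual way to visualize the rows that this rule flags.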

5. Data Visualization (20 Marks)

• Use Matplotlib & Seaborn to create effective charts:
o Histograms, Boxplots, Bar Charts, Scatter Plots, Heatmaps.
• Explain insights derived from each visualization.

6. Advanced Analysis (15 Marks)

• Use grouping and aggregation for deeper analysis.
• Implement time series analysis or trend identification (if applicable).
• Perform basic machine learning predictions (Optional for advanced students).
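
The grouping, time series, and trend steps above might be approached as in the sketch below. The `daily` data are synthetic, and the straight-line least-squares fit from `np.polyfit` is used here as a simple stand-in for a basic prediction model; a group could swap in a Scikit-Learn model instead.

```python
import numpy as np
import pandas as pd

# Hypothetical daily sales for two stores with an upward trend plus noise.
rng = np.random.default_rng(42)
n_days = 120
daily = pd.DataFrame({
    "date": pd.date_range("2024-01-01", periods=n_days, freq="D"),
    "store": np.where(np.arange(n_days) % 2 == 0, "S1", "S2"),
    "sales": 100 + 0.5 * np.arange(n_days) + rng.normal(0, 5, size=n_days),
})

# Grouping and aggregation: several statistics per store in one call.
per_store = daily.groupby("store")["sales"].agg(["count", "mean", "max"])

# Time series analysis: weekly totals and a 7-day rolling mean.
ts = daily.set_index("date")["sales"]
weekly = ts.resample("W").sum()
rolling = ts.rolling(window=7).mean()

# Trend identification: least-squares straight line through the daily values.
days = np.arange(len(ts))
slope, intercept = np.polyfit(days, ts.to_numpy(), deg=1)

# Naive one-step-ahead prediction from the fitted line.
forecast_next = slope * len(ts) + intercept

print(per_store)
print(weekly.head())
print(f"Estimated trend: {slope:.2f} units per day")
print(f"Predicted sales for the next day: {forecast_next:.1f}")
```

The rolling mean smooths day-to-day noise so the trend is easier to see on a line chart, while the fitted slope puts a number on it.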

7. Final Presentation (10 Marks)

• Each team will present their findings in a 10-minute PowerPoint presentation.
• The presentation should include:
o Problem Statement & Dataset Introduction.
o Key Findings from Data Analysis.
o Visualizations & Insights.
o Conclusion & Future Recommendations.

Submission Requirements:

1. Jupyter Notebook (with code, markdown explanations, and visualizations).
2. Report (problem statement, explanations of the code, and discussion of the outputs).
3. Presentation Slides (PowerPoint or Google Slides).
4. Dataset File (CSV or JSON format).

Tools & Libraries to Use:

✅ Pandas (Data Manipulation)
✅ NumPy (Numerical Computations)
✅ Matplotlib & Seaborn (Data Visualization)
✅ Scikit-Learn (Basic ML, Optional)
