
Group Assignment 01

The group assignment focuses on data analysis using Python, requiring students to collect, clean, manipulate, analyze, and visualize a dataset with tools like Pandas, Matplotlib, and Seaborn. The project includes tasks such as dataset selection, data cleaning, manipulation, aggregation, merging, visualization, and a final report. Assessment emphasizes hands-on learning, teamwork, and the application of advanced data analytics concepts.

Uploaded by sahandumidu4

Group Assignment – Python for Data Analysis (100 Marks)


Group Size: 2 Members | Duration: 03 hours

Objective:
Each group will collect, clean, manipulate, analyze, and visualize a dataset using Pandas,
Matplotlib, and Seaborn. This project will require students to apply data manipulation,
filtering, sorting, group by operations, data merging, and visualization techniques.

Assignment Tasks & Breakdown (100 Marks)


1. Dataset Selection & Project Proposal (10 Marks)

• Select a dataset from Kaggle or any open-source repository.
• Submit a brief project proposal (max 2 pages) covering:
o Dataset description & source (2 Marks)
o Research questions (minimum 3) (3 Marks)
o Tools & libraries to be used (2 Marks)
o Expected insights & challenges (3 Marks)

2. Data Cleaning & Preprocessing (20 Marks)

• Handle missing values using appropriate strategies (drop, fill, interpolate).
• Remove duplicates and standardize column formats.
• Convert data types as required.
• Evaluation Criteria:
o Missing value handling (5 Marks)
o Duplicate removal & column formatting (5 Marks)
o Data type conversions & restructuring (5 Marks)
o Code readability & documentation (5 Marks)
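
The cleaning steps listed above could be sketched as follows. This is a minimal illustration, not a model answer: the DataFrame, its column names, and the choice of strategy for each column are all assumptions made for the example, and a real dataset will need its own decisions.

```python
import numpy as np
import pandas as pd

# Hypothetical raw data; column names and values are illustrative only.
raw = pd.DataFrame({
    "Order ID": [1, 2, 2, 3, 4],
    "Order Date": ["2024-01-05", "2024-01-06", "2024-01-06", "2024-01-08", None],
    "Region ": ["north", "South", "South", None, "east"],
    "Units": ["10", "4", "4", "7", "3"],
    "Unit Price": [2.5, np.nan, np.nan, 4.0, 5.5],
})

# Standardize column names: strip whitespace, lower-case, use underscores.
clean = raw.copy()
clean.columns = clean.columns.str.strip().str.lower().str.replace(" ", "_")

# Remove exact duplicate rows (missing values are treated as equal here).
clean = clean.drop_duplicates().reset_index(drop=True)

# Handle missing values with a different strategy per column.
clean = clean.dropna(subset=["order_date"]).reset_index(drop=True)  # drop
clean["region"] = clean["region"].fillna("unknown")                 # fill
clean["unit_price"] = clean["unit_price"].interpolate()             # interpolate

# Standardize text formatting and convert data types.
clean["region"] = clean["region"].str.strip().str.title()
clean["order_date"] = pd.to_datetime(clean["order_date"])
clean["units"] = clean["units"].astype(int)

print(clean)
print(clean.dtypes)
```

Linear interpolation only makes sense when the row order is meaningful (for example, a time series); for unordered data, filling with a median or dropping the rows may be more defensible.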

3. Data Manipulation, Filtering & Sorting (15 Marks)

• Apply sorting on at least two numerical columns.
• Use logical & conditional filtering to extract relevant data.
• Perform at least three transformations (e.g., creating new columns).
• Evaluation Criteria:
o Correct sorting & filtering implementation (8 Marks)
o Proper transformations & feature creation (5 Marks)
o Code optimization & clarity (2 Marks)
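
As a rough sketch of the sorting, filtering, and transformation tasks above, assuming a small hypothetical sales table (the column names, values, and thresholds are invented for illustration):

```python
import pandas as pd

# Hypothetical cleaned data; names and values are illustrative only.
sales = pd.DataFrame({
    "region": ["North", "South", "East", "North", "South"],
    "units": [10, 4, 7, 2, 9],
    "unit_price": [2.5, 3.0, 4.0, 6.0, 3.0],
    "order_date": pd.to_datetime(
        ["2024-01-05", "2024-01-06", "2024-02-08", "2024-02-11", "2024-03-02"]
    ),
})

# Three transformations, each creating a new column.
sales["revenue"] = sales["units"] * sales["unit_price"]
sales["order_month"] = sales["order_date"].dt.month
sales["is_bulk"] = sales["units"] >= 7  # threshold of 7 is an arbitrary choice

# Sort on two numerical columns: units descending, then unit_price ascending.
ranked = sales.sort_values(["units", "unit_price"], ascending=[False, True])

# Logical filtering: combine conditions with & (and) or | (or),
# wrapping each comparison in parentheses.
north_or_bulk = sales[(sales["region"] == "North") | sales["is_bulk"]]

# Conditional filtering with query(), an equivalent string-based form.
high_value_south = sales.query("region == 'South' and revenue > 20")

print(ranked)
print(north_or_bulk)
print(high_value_south)
```

Vectorized column arithmetic such as `units * unit_price` is generally preferable to row-by-row loops or `apply`, which is one way to address the code optimization criterion.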

4. Group By & Aggregation (15 Marks)

• Perform grouping based on relevant categorical columns.
• Apply aggregation functions (sum, mean, count, max, min) on grouped data.
• Evaluation Criteria:
o Correct implementation of group by operations (8 Marks)
o Effective use of aggregation functions (5 Marks)
o Code efficiency & clarity (2 Marks)
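
One possible way to approach the grouping and aggregation tasks above is shown here. The data and column names are hypothetical, and named aggregation is only one of several styles pandas supports.

```python
import pandas as pd

# Hypothetical data; names and values are illustrative only.
sales = pd.DataFrame({
    "region": ["North", "South", "East", "North", "South"],
    "category": ["A", "A", "B", "B", "A"],
    "units": [10, 4, 7, 2, 9],
    "revenue": [25.0, 12.0, 28.0, 12.0, 27.0],
})

# A single aggregation on one grouped column.
units_by_region = sales.groupby("region")["units"].sum()

# Several aggregations at once, using named aggregation so that the
# output columns get readable names.
summary = sales.groupby("region").agg(
    total_revenue=("revenue", "sum"),
    mean_revenue=("revenue", "mean"),
    orders=("revenue", "count"),
    max_units=("units", "max"),
    min_units=("units", "min"),
)

# Grouping on two categorical columns yields a MultiIndex;
# reset_index() turns it back into ordinary columns.
by_region_category = (
    sales.groupby(["region", "category"])["revenue"].sum().reset_index()
)

print(units_by_region)
print(summary)
print(by_region_category)
```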

5. Data Merging, Joining & Concatenation (15 Marks)

• Merge two related datasets using appropriate join types (inner, outer, left, right).
• Concatenate data from multiple sources (rows or columns).
• Evaluation Criteria:
o Correct merge & join operations (8 Marks)
o Proper use of concatenation methods (5 Marks)
o Code structure & documentation (2 Marks)
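
The four join types and the two concatenation directions named above can be compared on a toy example. The two tables and their keys are hypothetical, chosen so that each side has one unmatched row.

```python
import pandas as pd

# Hypothetical related tables; customer 105 has no customer record and
# customer 104 has no orders, so the join types give different results.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "customer_id": [101, 102, 103, 105],
    "revenue": [25.0, 12.0, 28.0, 12.0],
})
customers = pd.DataFrame({
    "customer_id": [101, 102, 103, 104],
    "segment": ["Retail", "Wholesale", "Retail", "Online"],
})

inner = orders.merge(customers, on="customer_id", how="inner")  # matches only
left = orders.merge(customers, on="customer_id", how="left")    # all orders
right = orders.merge(customers, on="customer_id", how="right")  # all customers
# indicator=True adds a _merge column showing where each row came from.
outer = orders.merge(customers, on="customer_id", how="outer", indicator=True)

# Row-wise concatenation: stack tables that share the same columns.
more_orders = pd.DataFrame({
    "order_id": [5, 6],
    "customer_id": [101, 104],
    "revenue": [9.0, 14.0],
})
all_orders = pd.concat([orders, more_orders], ignore_index=True)

# Column-wise concatenation: rows are aligned on the index, not on a key.
discounts = pd.DataFrame({"discount": [0.0, 0.1, 0.0, 0.05]})
orders_with_discount = pd.concat([orders, discounts], axis=1)

print(len(inner), len(left), len(right), len(outer))
print(all_orders)
print(orders_with_discount)
```

Checking row counts before and after a merge is a quick way to catch unintended row loss or duplication caused by unmatched or repeated keys.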

6. Data Visualization (20 Marks)

• Generate at least 4 visualizations using Matplotlib & Seaborn.
• Use bar charts, histograms, scatter plots, and heatmaps to identify patterns.
• Write a brief analysis on the insights gained.
• Evaluation Criteria:
o Variety & relevance of visualizations (10 Marks)
o Clarity & accuracy of insights (5 Marks)
o Code implementation & visualization aesthetics (5 Marks)

7. Final Report & Presentation (20 Marks)

• Submit a structured report covering:
o Introduction & research questions.
o Data description & cleaning steps.
o Key findings from analysis & visualizations.
o Summary of challenges & insights.
• Evaluation Criteria:
o Report clarity & completeness

Submission Requirements:
• Python script (.py) or Jupyter Notebook (.ipynb).
• Cleaned dataset (.csv or .json).
• Project report (.pdf).

Assessment Goals:
✔ Encourages hands-on learning in data analysis.
✔ Strengthens Python skills in real-world data processing.
✔ Develops teamwork, problem-solving & presentation skills.
✔ Prepares students for advanced data analytics concepts.
