
Ca2 - Lpu

The document outlines the requirements for a project divided into multiple units involving data analysis and visualization. Students will be grouped and assigned an industry dataset to work with. The project requirements include exploring Pandas and NumPy, creating visualizations with Matplotlib and Seaborn, and using Plotly structures like Figure, Data and Layout.

Uploaded by

puneet.pahadia

CA2 - Project work Total 50 marks

General Instructions
• Students will be divided into 10 groups of 3 students each
• Each Group will be allocated a Group Number
• Each group number will be allocated one industry as mentioned below
• Data can be downloaded from Kaggle datasets, Central Michigan University datasets,
or any other relevant website, but each dataset must have at least 500 rows.
• Each CA will be accompanied by 4-5 different viva questions for each student
in a group
• Attach a snapshot of your terminal and upload the PDF file for each phase to UMS
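
The 500-row minimum above can be checked before any analysis starts. The following is a minimal sketch, assuming the data has already been loaded into a Pandas DataFrame; the helper name `meets_row_requirement` and the sample data are hypothetical and used only for illustration.

```python
import pandas as pd

MIN_ROWS = 500  # minimum dataset size stated in the general instructions


def meets_row_requirement(df: pd.DataFrame, min_rows: int = MIN_ROWS) -> bool:
    """Return True when the DataFrame has at least `min_rows` rows."""
    return len(df) >= min_rows


# Hypothetical stand-in for a downloaded dataset; a real project would
# load its own file, for example with pd.read_csv.
sample = pd.DataFrame({"value": range(600)})
print(meets_row_requirement(sample))
```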
Industries:
1. Finance and Banking/Credit card datasets
2. Healthcare/WHO datasets (AFP/polio data)
3. Marketing and Advertising
4. Manufacturing and Supply Chain/Cars data set from Central Michigan University
5. Telecommunications/ Telecom Data
6. Energy and Utilities/Population datasets
7. Transportation and Logistics/WHO data related to Covid
8. Government and Public Sector:
9. Sports data set/FIFA data set
10. Automobile /Car sales data

Unit – 4 Total 20 marks


Part 1: Pandas
1. List at least three real-world scenarios where Pandas can be used for data analysis.
Explain the specific use cases in each scenario.
2. Describe the primary data structures in Pandas, namely Series and DataFrame.
Explain the differences and use cases for each.
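
As a starting point for question 2, the difference between the two structures might be sketched as follows. The product names and numbers are made up for illustration; this is one possible demonstration, not a model answer.

```python
import pandas as pd

# A Series is a one-dimensional labeled array: one set of values plus an index.
prices = pd.Series(
    [120.0, 85.5, 42.0], index=["laptop", "phone", "mouse"], name="price"
)

# A DataFrame is a two-dimensional table: several columns sharing one index.
inventory = pd.DataFrame(
    {"price": [120.0, 85.5, 42.0], "units": [3, 10, 25]},
    index=["laptop", "phone", "mouse"],
)

# Selecting a single column of a DataFrame gives back a Series.
units = inventory["units"]

# Arithmetic between columns is aligned row by row on the shared index.
inventory["stock_value"] = inventory["price"] * inventory["units"]
print(inventory)
```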
Part 2: NumPy
1. Write a brief description of what NumPy is and why it is important for scientific
computing and data analysis in Python.
2. Explain the significance of NumPy in terms of performance and efficiency when
working with large datasets and numerical computations.
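
For question 2, one way to observe the performance point is to time the same computation as a plain Python loop and as a single vectorized NumPy call. This is a rough sketch: the array size is an arbitrary choice, and the timings depend on the machine, so no particular speedup is assumed.

```python
import time

import numpy as np

n = 200_000  # arbitrary size chosen for illustration
values = np.arange(n, dtype=np.float64)

# Plain Python loop: one interpreted multiply-and-add per element.
start = time.perf_counter()
loop_total = 0.0
for v in values.tolist():
    loop_total += v * v
loop_seconds = time.perf_counter() - start

# Vectorized NumPy: the whole sum of squares runs in compiled code.
start = time.perf_counter()
vector_total = float(np.dot(values, values))
vector_seconds = time.perf_counter() - start

print(f"loop:   {loop_seconds:.4f} s")
print(f"vector: {vector_seconds:.4f} s")
```

Both approaches compute the same sum of squares; only the elapsed times differ.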

Unit – 5 Total 15 marks


Data Visualization:
1. Create a Matplotlib bar plot showing the sales of products in a store for a given
month. Label the axes, add a title, and customize the appearance (e.g., color, width).
2. Provide at least three examples of data visualization scenarios where Seaborn is the
preferred library over Matplotlib. Describe the type of plots or charts involved and
why Seaborn is a better choice.
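
A minimal sketch of the bar plot in question 1 is shown here. The product names, sales figures, colors, and output filename are all assumptions made for illustration; groups would substitute values from their own dataset.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt

# Made-up monthly sales figures, for illustration only.
products = ["Pens", "Notebooks", "Markers", "Folders"]
units_sold = [150, 90, 60, 120]

fig, ax = plt.subplots(figsize=(6, 4))

# color, width, and edgecolor customize the appearance of the bars.
bars = ax.bar(products, units_sold, color="steelblue", width=0.6, edgecolor="black")

ax.set_xlabel("Product")
ax.set_ylabel("Units sold")
ax.set_title("Product sales for one month (illustrative data)")

fig.tight_layout()
fig.savefig("sales_bar.png")  # hypothetical output filename
```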

Unit – 6 Total 15 marks


1. Describe the three key structures in Plotly: Figure, Data, and Layout. Explain the
purpose of each structure in creating visualizations.
2. Load a sales dataset with a 'Sales' column and create a Plotly line chart to visualize
the total sales trend. Include axis labels, a title, and customize the appearance.

Rubrics

Unit 1 (10 marks)
• Excellent (9-10): Demonstrates a deep understanding of principles.
• Good (6-8): Designs effectively, though there might be minor inefficiencies.
• Satisfactory (3-5): Unclear, or fails to effectively address the problem's requirements.
• Unsatisfactory (0-2): Unrelated to the problem.

Unit 2 (10 marks)
• Excellent (9-10): The chosen data structures are optimal for the problem, leading to
efficient algorithms and minimal memory usage.
• Good (6-8): Data structures are well-suited for the problem, but there might be more
efficient choices available.
• Satisfactory (3-5): Data structures are reasonable for the problem, although some
choices could be improved.
• Unsatisfactory (0-2): Data structures are selected without careful consideration of the
problem's requirements.

Unit 3 (10 marks)
• Excellent (9-10): The logic is correct, addressing the problem accurately and providing
expected results for various scenarios. Charts are demonstrated properly.
• Good (6-8): The logic is mostly correct, but there might be minor issues in handling
edge cases or complex scenarios. Only a chart with no explanation.
• Satisfactory (3-5): The logic works for common cases, but there are noticeable errors
or omissions in certain scenarios.
• Unsatisfactory (0-2): The logic contains significant errors that lead to incorrect
results in various scenarios.
