DWM Lab 11 (Open Ended Lab)

The automobile manufacturer needs to identify competitors for a new vehicle prototype. They want to cluster existing vehicles based on similarities and determine which cluster most resembles the prototypes. This will help identify primary competitors. Students should build a clustering model using a provided vehicle dataset and Python code to group vehicles and determine the closest competitors for the new prototypes. Their work will be evaluated on data preparation, analysis, modeling, model selection/building, and evaluation.

Uploaded by

Shahzeb Raheel
Copyright
© All Rights Reserved

Open Ended Lab – Data Warehousing and Mining

Objective: To test students' data mining and analytical skills by having them solve a problem
using the knowledge they have gained in their previous labs
Time Required: 3 hrs
Programming Language: Python
Software Required: Anaconda
______________________________________________________________________________

Task:
An automobile manufacturer wants to identify the closest competitors to its newly
developed vehicle prototypes before launching the new model. To achieve this, it needs to
group existing vehicles on the market based on similarities, determine which group is the most
similar to the prototypes, and use this information to identify the primary competitors for the
new model.
The objective is to use clustering techniques to identify clusters of vehicles that share
distinct characteristics. This analysis will provide an overview of the current vehicle market
and help manufacturers decide which new models to develop based on the identified
clusters.
You can download the dataset from the link given below:
https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/ML0101ENv3/labs/cars_clus.csv
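A minimal sketch of the loading and cleaning step follows. It reads a tiny inline stand-in rather than the real file so that it runs offline. The column names, the sample rows, and the "$null$" placeholder for missing values are assumptions made for illustration, and load_and_clean is a hypothetical helper; check the actual file before relying on any of them.

```python
import io

import pandas as pd

# Tiny inline stand-in for cars_clus.csv so the sketch runs without a download.
# Column names, rows, and the "$null$" marker are assumptions, not the real data.
SAMPLE_CSV = """manufact,model,price,engine_s,horsepow,mpg
MakerA,Alpha,21.5,1.8,140,28
MakerA,Beta,28.4,3.2,225,25
MakerB,Gamma,23.99,1.8,150,27
MakerC,Delta,$null$,2.5,170,26
"""


def load_and_clean(source, feature_cols):
    """Read a CSV, coerce the feature columns to numbers, drop incomplete rows."""
    df = pd.read_csv(source)
    # Any non-numeric placeholder becomes NaN instead of raising an error.
    df[feature_cols] = df[feature_cols].apply(pd.to_numeric, errors="coerce")
    # Keep only rows where every selected feature is present.
    return df.dropna(subset=feature_cols).reset_index(drop=True)


features = ["price", "engine_s", "horsepow", "mpg"]
cars = load_and_clean(io.StringIO(SAMPLE_CSV), features)
print(cars)
```

To use the real dataset, pass the downloaded file path (or the URL) as source instead of the in-memory sample. Dropping incomplete rows is only one option; imputing missing values is an equally defensible choice that should be justified in the write-up.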

Build your own pipeline and justify it. Also show the implementation and results of your solution
through code.
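One possible shape for the grouping and matching steps described in the task is sketched below, assuming k-means from scikit-learn as the clustering technique. The handout does not prescribe an algorithm, so this is one choice among several (hierarchical clustering or DBSCAN would also fit). The feature rows and the prototype are invented for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical feature rows: [price, horsepower, mpg]. Values are invented.
vehicles = np.array([
    [14.0, 110, 33], [15.5, 120, 31], [13.2, 105, 34],  # economy-like
    [42.0, 300, 19], [45.5, 320, 18], [39.9, 290, 20],  # performance-like
])
prototype = np.array([[41.0, 310, 19]])  # hypothetical new model

# Scale features so that no single unit (e.g. horsepower) dominates distances.
scaler = StandardScaler().fit(vehicles)
X = scaler.transform(vehicles)

# Group the existing vehicles; k=2 is an assumption for this toy data.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Assign the prototype to its nearest cluster centre, using the same scaling.
proto_cluster = kmeans.predict(scaler.transform(prototype))[0]

# Vehicles sharing that cluster are the candidate competitors.
competitors = np.where(kmeans.labels_ == proto_cluster)[0]
print("Prototype cluster:", proto_cluster)
print("Competitor row indices:", competitors.tolist())
```

The prototype is transformed with the scaler fitted on the existing vehicles, not refitted, so that it is measured on the same scale as the clusters it is compared against.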

Rubrics for Evaluation


Scale: Poor (0), Weak (1), Good (2), Excellent (3)

Data Preparation (CLO-2, C-3)
Poor (0): The student did not perform any data cleaning, or the cleaning process is completely incorrect or inadequate.
Weak (1): The student attempted to clean the data, but the result is mostly incorrect or inadequate.
Good (2): The student successfully cleaned the data but with some errors or omissions, or some parts are incomplete or unclear.
Excellent (3): The student successfully cleaned the data, demonstrating a good understanding of data cleaning techniques and best practices.

Data Analysis (CLO-2, C-3)
Poor (0): No attempt made to analyze the data, or inappropriate statistical techniques were used.
Weak (1): Inappropriate statistical techniques are used, or the understanding of the statistical analyses performed is incomplete.
Good (2): The understanding of the statistical analyses performed is mostly clear and accurate.
Excellent (3): Appropriate statistical techniques are used to analyze the data. The understanding of the statistical analyses performed is clear and accurate.

Data Modeling (CLO-2, C-3)
Poor (0): The student did not demonstrate any understanding of data modeling concepts.
Weak (1): The student demonstrated a poor understanding of data modeling concepts.
Good (2): The student demonstrated a good understanding of data modeling concepts.
Excellent (3): The student demonstrated an excellent understanding of data modeling concepts.

Model Selection and Building (CLO-2, C-3)
Poor (0): The student does not build any data mining models, or the building process is completely incorrect or inadequate.
Weak (1): The student attempts to build data mining models but the result is mostly incorrect or inadequate.
Good (2): The student successfully built data mining models but with some errors or omissions, or some parts are incomplete or unclear.
Excellent (3): The student successfully built accurate, robust, and interpretable data mining models, demonstrating a good understanding of model building techniques and best practices.

Model Evaluation (CLO-2, C-3)
Poor (0): The student does not use any evaluation metrics, or the selected metrics are completely incorrect or inadequate.
Weak (1): The student attempted to use evaluation metrics such as Accuracy, F1 etc.
Good (2): The student successfully used evaluation metrics but with some errors or omissions, or some parts are incomplete.
Excellent (3): The student successfully used appropriate evaluation metrics, demonstrating a good understanding of evaluation metrics and their interpretation.
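As an illustration of the Model Evaluation criterion: metrics such as Accuracy and F1 need ground-truth labels, which a clustering task usually lacks, so an internal measure such as the silhouette score is a common substitute. The sketch below is one way to compare candidate cluster counts, assuming k-means and synthetic data. The rubric does not name the silhouette score, so its use here is an assumption.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic stand-in data: three well-separated blobs in two dimensions.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(30, 2)) for c in centers])

# Score each candidate k; the silhouette lies in [-1, 1], higher is better.
scores = {}
for k in range(2, 6):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
for k, score in scores.items():
    print(f"k={k}: silhouette={score:.3f}")
print("Best k by silhouette:", best_k)
```

A high silhouette score only says the clusters are compact and well separated; whether they are meaningful market segments still needs interpretation against the vehicle features.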
