
4.AIML - To Extract Features From Given Data Set and Establish Training Data

This document describes a practical machine learning exercise to extract and select features from a mobile phone sales dataset to predict price. The dataset contains 21 variables describing phone specifications and the goal is to determine which factors most affect price. The program loads the dataset, selects the top 13 important features using a chi-squared test, and displays that pixel width, RAM, and pixel height have the highest scores, indicating they are best for predicting price.

Uploaded by

RAHUL DARANDALE

Dr. VITHALRAO VIKHE PATIL COLLEGE OF ENGINEERING, AHMEDNAGAR


Department of Mechanical Engineering
TE Mechanical Subject-302049: Artificial Intelligence & Machine Learning

Practical No. 04

Aim: To extract features from given data set and establish training data.

Objectives:

1. To learn how to prepare a dataset.
2. To understand the steps involved in loading a dataset.
3. To learn how to execute a Python program.
4. To show the top 10 best features using the SelectKBest class.

Package Used: - Python 2 / Python 3

Problem Definition:-
The dataset given below is mobile phone sales data with details of each mobile phone
with respect to its brand. The problem is to decide the price of a mobile phone by
considering the factors that affect it, such as battery, RAM, camera, etc. To solve this
problem, sales data for mobile phones of various companies has been collected as follows.

Input data
1. Dataset given in form of .csv file (comma separated values)

Description of variables in the above file


Sr.No Abbreviation Details
1. battery_power: Total energy a battery can store in one time measured in mAh
2. blue: Has Bluetooth or not
3. clock_speed: the speed at which microprocessor executes instructions
4. dual_sim: Has dual sim support or not
5. fc: Front Camera megapixels
6. four_g: Has 4G or not
7. int_memory: Internal Memory in Gigabytes
8. m_dep: Mobile Depth in cm
9. mobile_wt: Weight of mobile phone
10. n_cores: Number of cores of the processor
11. pc: Primary Camera megapixels
12. px_height: Pixel Resolution Height
13. px_width: Pixel Resolution Width
14. ram: Random Access Memory in MegaBytes
15. sc_h: Screen Height of mobile in cm
16. sc_w: Screen Width of mobile in cm
17. talk_time: the longest time that a single battery charge will last when you are
18. three_g: Has 3G or not
19. touch_screen: Has touch screen or not
20. wifi: Has wifi or not
21. price_range: 0(low cost), 1(medium cost), 2(high cost) and 3(very high cost).
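
The input step can be sketched as follows. This is a minimal illustration, not the practical's own program: the three sample rows, the reduced set of four columns and the helper name load_features_and_target are hypothetical stand-ins for the real .csv file. It shows how the 20 specification columns can be separated from the price_range target by column name.

```python
import io
import pandas as pd

# Hypothetical three-row sample standing in for the real .csv file;
# only four of the 21 columns are shown to keep the sketch short.
SAMPLE_CSV = """battery_power,ram,px_width,price_range
842,2549,756,1
1021,2631,1988,2
563,2603,1716,2
"""

def load_features_and_target(source, target="price_range"):
    """Read a CSV and split it into a feature table X and a target column y."""
    data = pd.read_csv(source)
    X = data.drop(columns=[target])  # every column except the target
    y = data[target]                 # the price class to be predicted
    return X, y

X, y = load_features_and_target(io.StringIO(SAMPLE_CSV))
print(X.shape, list(X.columns))
print(y.tolist())
```

Selecting the target by name rather than by position keeps the split correct even if the file carries an extra leading column such as a serial number.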


Program:-
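import pandas as pd
import fsspec
import numpy as np
from sklearn.feature_selection import SelectKBest
from sklearn.feature_selection import chi2

# Load the dataset (adjust the path to where the .csv file is stored)
data = pd.read_csv("E://test1.csv")

# Positions 1 to 19 are taken as features; position 0 is skipped,
# so check that it is not a feature column in your file
X = data.iloc[:, 1:20]
y = data.iloc[:, -1]  # last column: price_range (target)
print(X, y)

# chi2 requires non-negative feature values. k only decides which columns
# transform() would keep; scores_ is computed for every column of X
bestfeatures = SelectKBest(score_func=chi2, k=13)
fit = bestfeatures.fit(X, y)
print(fit)

dfscores = pd.DataFrame(fit.scores_)
dfcolumns = pd.DataFrame(X.columns)
# concat two dataframes for better visualization
featureScores = pd.concat([dfcolumns, dfscores], axis=1)
featureScores.columns = ['Specs', 'Score']  # naming the dataframe columns
print(featureScores.nlargest(10, 'Score'))  # print 10 best features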

Output:-

SelectKBest(k=13, score_func=<function chi2 at 0x00000069DB063670>)

          Specs       Score
12     px_width  852.914979
13          ram  562.837207
11    px_height   46.347162
8     mobile_wt   42.328627
4            fc   15.793117
10           pc   11.148155
6    int_memory    1.372252
2   clock_speed    1.052762
15         sc_w    0.809077
16    talk_time    0.760553
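
To see where such scores come from, the chi-squared statistic can be recomputed by hand. The sketch below is an illustration on synthetic data (the two features, their value ranges and the helper name chi2_scores are assumptions, not part of the practical). For each feature it compares the observed per-class sums with the sums expected if the feature were independent of the class, and then checks the result against sklearn's chi2.

```python
import numpy as np
from sklearn.feature_selection import chi2

def chi2_scores(X, y):
    """Chi-squared score per feature from observed vs expected class-wise sums."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    # Observed: total of each feature within each class
    observed = np.array([X[y == c].sum(axis=0) for c in classes])
    # Expected: each feature total shared out in proportion to class frequency
    class_prob = np.array([(y == c).mean() for c in classes])
    expected = np.outer(class_prob, X.sum(axis=0))
    return ((observed - expected) ** 2 / expected).sum(axis=0)

# Synthetic data: four price classes, one feature that grows with the
# class and one that is unrelated to it (both hypothetical)
rng = np.random.default_rng(0)
y = rng.integers(0, 4, size=200)
informative = 500 + 800 * y + rng.integers(0, 200, size=200)
noise = rng.integers(1, 20, size=200)
X = np.column_stack([informative, noise])

manual = chi2_scores(X, y)
reference, _ = chi2(X, y)  # sklearn returns (scores, p-values)
print(manual)
print(reference)
```

Because the statistic is built from sums of raw feature values, it grows with the magnitude of a feature, so columns measured in large units tend to receive larger scores than binary columns.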

Conclusion:-

• From the above output, these 10 features have the highest chi-squared scores and are
therefore the most relevant when deciding the price range of mobile phones.
• From the above output it is concluded that pixel resolution width (px_width) and RAM
affect the price range the most, since their scores are far higher than those of the other features.
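
The final step named in the aim, establishing training data from the selected features, could look like the sketch below. It is an assumption-laden illustration: the data frame is synthetic, only four of the columns are imitated, and k=2, test_size=0.25 and the random seeds are arbitrary choices rather than values from the practical. The selector is fitted on the training portion only, and the same columns are then kept in both portions.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the phone dataset (hypothetical values)
rng = np.random.default_rng(42)
n = 200
price_range = rng.integers(0, 4, size=n)
data = pd.DataFrame({
    "ram": 500 + 900 * price_range + rng.integers(0, 300, size=n),
    "px_width": 600 + 300 * price_range + rng.integers(0, 400, size=n),
    "wifi": rng.integers(0, 2, size=n),
    "n_cores": rng.integers(1, 9, size=n),
    "price_range": price_range,
})
X = data.drop(columns=["price_range"])
y = data["price_range"]

# Hold out 25% of the rows for testing, keeping class proportions similar
X_train_all, X_test_all, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Score features on the training rows only, then keep the best k columns
selector = SelectKBest(score_func=chi2, k=2).fit(X_train_all, y_train)
selected = X.columns[selector.get_support()]
X_train = X_train_all[selected]
X_test = X_test_all[selected]

print(list(selected))
print(X_train.shape, X_test.shape)
```

X_train and y_train can then be passed to any classifier, with X_test and y_test kept aside to check its predictions.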
