Unit 1 Python Pandas
The document discusses data handling using Python's Pandas library, covering topics such as missing values, statistical measures (median, mode, standard deviation), and data aggregation. It includes examples of creating and manipulating dataframes, utilizing functions like groupby, and performing operations such as filling NaN values and calculating averages. Additionally, it touches on importing data from CSV files and MySQL databases.
PYTHON PANDAS - I
Chapter 3 : Data Handling using Pandas
What are missing values ? What are the strategies to handle them ?
Ans. Refer to Solved Problems 13 and 16.
Define the following terms : Median, Standard Deviation and Variance.
Ans. Median. Median is the mid value in an ordered data set.
Standard Deviation. It is a measure of dispersion of observations within a dataset relative to their mean. It is the square root of the variance and is denoted by sigma (σ).
Variance. Variance is the numerical value that describes the variability of the observations from the mean. Variance measures how far individuals in the group are spread out, in the set of data, from the mean.
What do you understand by the term MODE ? Name the function which is used to calculate it.
Ans. Mode is the number which occurs most often in a data set. The mode( ) function is used to calculate it.
Write the purpose of Data Aggregation.
Ans. Data Aggregation is a process of producing summary statistics from a dataset using statistical aggregation functions.
4. Explain the concept of GROUPBY with the help of an example.
Ans. The groupby( ) allows to create field wise groups of values in a dataframe as per a specific aggregate function.
To view example of groupby( ), please refer to example 50 given in chapter 2.
5. Assuming the given table : Product. Write the python code for the following :
Item    Company     Rupees    USD
TV      LG          12000     700
TV      VIDEOCON    10000     650
TV      LG          15000     800
AC      SONY        14000     750
(a) To create the data frame for the above table.
(b) To add the new rows in the data frame.
(c) To display the sum of all products.
Ans.
(a) >>> d1 = {'Item' : ['TV', 'TV', 'TV', 'AC'],
              'Company' : ['LG', 'VIDEOCON', 'LG', 'SONY'],
              'Rupees' : [12000, 10000, 15000, 14000], 'USD' : [700, 650, 800, 750]}
>>> df = pd.DataFrame(d1)
>>> df
  Item   Company  Rupees  USD
0   TV        LG   12000  700
1   TV  VIDEOCON   10000  650
2   TV        LG   15000  800
3   AC      SONY   14000  750
(b) >>> df.loc[4, :] = ['AC', 'DAIKIN', 15000, 800]
>>> df
  Item   Company   Rupees    USD
0   TV        LG  12000.0  700.0
1   TV  VIDEOCON  10000.0  650.0
2   TV        LG  15000.0  800.0
3   AC      SONY  14000.0  750.0
4   AC    DAIKIN  15000.0  800.0
(c) >>> df.sum()
Item                   TVTVTVACAC
Company    LGVIDEOCONLGSONYDAIKIN
Rupees                      66000
USD                          3700
dtype: object
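The three steps of this answer can also be tried as one small script. This is only a sketch: it adds the new row with `pd.concat` (a swapped-in alternative to the `df.loc[4, :]` assignment shown above) and restricts the sum to the two numeric columns so that the text columns are not concatenated. The name `new_row` is introduced here purely for illustration.

```python
import pandas as pd

# Data copied from the Product table in the question.
d1 = {'Item': ['TV', 'TV', 'TV', 'AC'],
      'Company': ['LG', 'VIDEOCON', 'LG', 'SONY'],
      'Rupees': [12000, 10000, 15000, 14000],
      'USD': [700, 650, 800, 750]}
df = pd.DataFrame(d1)

# (b) Add one more row; new_row is a hypothetical helper name.
new_row = pd.DataFrame([{'Item': 'AC', 'Company': 'DAIKIN',
                         'Rupees': 15000, 'USD': 800}])
df = pd.concat([df, new_row], ignore_index=True)

# (c) Sum only the numeric columns.
totals = df[['Rupees', 'USD']].sum()
print(df)
print(totals)
```

Selecting the numeric columns first avoids the string concatenation (`TVTVTVACAC`) that a plain `df.sum()` produces for the Item and Company columns.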
14. Write the python statement for the following question on the basis of given dataset :
       Name   Degree  Score
0    Aparna      MBA   90.0
1    Pankaj      BCA    NaN
2       Ram  M.Tech.   80.0
3    Ramesh      MBA   98.0
4    Naveen      NaN   97.0
5  Krishnav      BCA   78.0
6    Bhawna      MBA   89.0
(a) To create the above DataFrame.
(b) To print the Degree and maximum marks in each stream.
(c) To fill the NaN with 76.
(d) To set the index to Name.
(e) To display the degree wise average marks.
(f) To count the number of students in MBA.
Ans.
(a) >>> DF2 = pd.DataFrame(D2)
>>> DF2
       Name   Degree  Score
0    Aparna      MBA   90.0
1    Pankaj      BCA    NaN
2       Ram  M.Tech.   80.0
3    Ramesh      MBA   98.0
4    Naveen     None   97.0
5  Krishnav      BCA   78.0
6    Bhawna      MBA   89.0
(b) >>> DF2.groupby(['Degree', 'Name'])['Score'].max()
Degree   Name
BCA      Krishnav    78.0
         Pankaj      76.0
M.Tech.  Ram         80.0
MBA      Aparna      90.0
         Bhawna      89.0
         Ramesh      98.0
Name: Score, dtype: float64
(This output shows Pankaj with 76.0 because it was taken after the NaN was filled as in part (c).)
(c) >>> DF2['Score'].fillna(76, inplace = True)
>>> DF2
       Name   Degree  Score
0    Aparna      MBA   90.0
1    Pankaj      BCA   76.0
2       Ram  M.Tech.   80.0
3    Ramesh      MBA   98.0
4    Naveen     None   97.0
5  Krishnav      BCA   78.0
6    Bhawna      MBA   89.0
(d) >>> DF2.set_index('Name')
           Degree  Score
Name
Aparna        MBA   90.0
Pankaj        BCA   76.0
Ram       M.Tech.   80.0
Ramesh        MBA   98.0
Naveen       None   97.0
Krishnav      BCA   78.0
Bhawna        MBA   89.0
(e) Only Degree wise
>>> DF2.groupby('Degree')['Score'].mean()
Degree
BCA        77.000000
M.Tech.    80.000000
MBA        92.333333
Name: Score, dtype: float64
Degree and Name wise
>>> DF2.groupby(['Degree', 'Name'])['Score'].mean()
Degree   Name
BCA      Krishnav    78.0
         Pankaj      76.0
M.Tech.  Ram         80.0
MBA      Aparna      90.0
         Bhawna      89.0
         Ramesh      98.0
Name: Score, dtype: float64
(f) >>> DF2.groupby('Degree')['Score'].count()['MBA']
3
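The answer above uses a dictionary D2 that is not printed. The sketch below rebuilds it from the displayed table (so the dictionary itself is an assumption) and runs the fill, group and count steps together with the median, mode, standard deviation and variance defined earlier in the chapter. It assigns the filled column back instead of calling `fillna(..., inplace = True)` on a column selection, since recent pandas versions warn that the in-place form may not update the dataframe.

```python
import numpy as np
import pandas as pd

# Assumed contents of D2, rebuilt from the table printed in the question.
D2 = {'Name': ['Aparna', 'Pankaj', 'Ram', 'Ramesh', 'Naveen', 'Krishnav', 'Bhawna'],
      'Degree': ['MBA', 'BCA', 'M.Tech.', 'MBA', None, 'BCA', 'MBA'],
      'Score': [90.0, np.nan, 80.0, 98.0, 97.0, 78.0, 89.0]}
DF2 = pd.DataFrame(D2)

# (c) Fill the missing score and store the result back in the column.
DF2['Score'] = DF2['Score'].fillna(76)

# (b), (e), (f) Group on Degree; rows whose Degree is missing are left out.
max_by_degree = DF2.groupby('Degree')['Score'].max()
mean_by_degree = DF2.groupby('Degree')['Score'].mean()
mba_count = DF2.groupby('Degree')['Score'].count()['MBA']

# Statistical measures on the Score column.
median_score = DF2['Score'].median()   # middle value of the ordered scores
mode_score = DF2['Score'].mode()       # most frequent value(s)
std_score = DF2['Score'].std()         # square root of the variance
var_score = DF2['Score'].var()

print(mean_by_degree)
print(median_score, std_score, var_score)
```

Because every score occurs exactly once, `mode()` returns all seven values here; with repeated scores it would return only the most frequent ones.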
9. The Student table ... (..., Project). Write a program to load ... in a dataframe. Display the dataframe.
Solution
import pandas as pd
import mysql.connector as sqltor
mycon = sqltor.connect(host = ..., database = ..., user = ..., passwd = ...)
if mycon.is_connected() :
    qry = "select * from student where grade in ( ... )"
    mdf = pd.read_sql(qry, mycon)
    print("student details with grades ...")
    print(mdf)
else :
    print("MySQL Connection problem")
Output
student details with grades ...
   Rollno    Name  Marks Grade Section Project
0     101  Ruhani   76.8     A       A     ...
1     103  Simran   81.2     A       B     ...
...
10. Dataframe saleDf stores about 50 rows in it. Write a program to store its rows ... in a table namely random on MySQL database namely 'world'. Append the rows if the table already exists.
Solution
import pandas as pd
from sqlalchemy import create_engine
import pymysql
engine = create_engine('mysql+pymysql://root:MyPass@localhost/world')
mycon = engine.connect()
# statements to create or load saleDf
saleDf.iloc[10:15, :].to_sql('random', mycon, index = False, if_exists = 'append')
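Both solutions need a running MySQL server, so they are hard to try out directly. As a rough sketch of the same `to_sql( )` and `read_sql( )` calls, the script below uses an in-memory SQLite database as a stand-in for MySQL. The contents of saleDf, the column names `item_id` and `amount`, and the filter in the query are all made up for illustration.

```python
import sqlite3
import pandas as pd

# Stand-in for the MySQL connection: an in-memory SQLite database.
con = sqlite3.connect(':memory:')

# Hypothetical saleDf with 50 rows (the real dataframe is not shown).
saleDf = pd.DataFrame({'item_id': list(range(50)),
                       'amount': [i * 10 for i in range(50)]})

# The first call creates the table; the second call appends to it
# because if_exists='append' is given.
saleDf.iloc[10:15, :].to_sql('random', con, index=False, if_exists='append')
saleDf.iloc[15:20, :].to_sql('random', con, index=False, if_exists='append')

# Read selected rows back into a dataframe, as in the Student example.
mdf = pd.read_sql('SELECT * FROM "random" WHERE amount >= 150', con)
total_rows = pd.read_sql('SELECT COUNT(*) AS n FROM "random"', con)['n'].iloc[0]
print(mdf)
con.close()
```

With `if_exists='replace'` the second call would have dropped the first five rows, which is why 'append' is the value that matches the question.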
NCERT Chapter 2 : Data Handling using Pandas - I
(Please note that we have given only those parts of question here which
pertain to this chapter, e.g., (k), (l) only as given below)
11. Use the DataFrame Sales to do the following :
            2014    2015    2016   2017
Madhu      100.5   12000   20000  50000
Kusum      150.8   18000   50000  60000
Kinshuk    200.9   22000   70000  70000
Ankit    30000.0   30000  100000  80000
Shruti   40000.0   45000  125000  90000
IMPORTING/EXPORTING DATA BETWEEN CSV FILES/MYSQL AND PANDAS
(k) Write the values of DataFrame Sales to a comma separated file SalesFigures.csv on the disk. Do not write the row labels and column labels.
(l) Read the data in the file SalesFigures.csv into a DataFrame SalesRetrieved and display it. Now update the row labels and column labels of SalesRetrieved to be the same as that of Sales.
Ans.
(k) >>> Sales.to_csv("d:\\SalesFigures.csv", header = False, index = False)
(l) >>> SalesRetrieved = pd.read_csv("d:\\SalesFigures.csv",
                                      names = [2014, 2015, 2016, 2017, 2018])
>>> SalesRetrieved
      2014   2015    2016   2017    2018
0    100.5  12000   20000  50000  160000
1    150.8  18000   50000  60000  110000
2    200.9  22000   70000  70000  500000
3  30000.0  30000  100000  80000  340000
4  40000.0  45000  125000  90000  900000
5    196.2  37800   52000  78438   38852
>>> SalesRetrieved = SalesRetrieved.rename(index = {0 : 'Madhu', 1 : 'Kusum',
                         2 : 'Kinshuk', 3 : 'Ankit', 4 : 'Shruti', 5 : 'Sumeet'})
>>> SalesRetrieved
            2014   2015    2016   2017    2018
Madhu      100.5  12000   20000  50000  160000
Kusum      150.8  18000   50000  60000  110000
Kinshuk    200.9  22000   70000  70000  500000
Ankit    30000.0  30000  100000  80000  340000
Shruti   40000.0  45000  125000  90000  900000
Sumeet     196.2  37800   52000  78438   38852
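The write-then-read round trip can be checked without touching the d: drive. The sketch below is an illustration only: it writes to an in-memory text buffer (`io.StringIO`) instead of a file on disk, and it uses just the 2014 to 2017 columns and the first five rows of Sales.

```python
import io
import pandas as pd

# A cut-down Sales dataframe (2014 to 2017 columns, five rows).
Sales = pd.DataFrame(
    {2014: [100.5, 150.8, 200.9, 30000.0, 40000.0],
     2015: [12000, 18000, 22000, 30000, 45000],
     2016: [20000, 50000, 70000, 100000, 125000],
     2017: [50000, 60000, 70000, 80000, 90000]},
    index=['Madhu', 'Kusum', 'Kinshuk', 'Ankit', 'Shruti'])

# (k) Write the values only: no column labels, no row labels.
buffer = io.StringIO()          # stands in for d:\SalesFigures.csv
Sales.to_csv(buffer, header=False, index=False)
buffer.seek(0)

# (l) Read the values back, supplying the column labels through names=...
SalesRetrieved = pd.read_csv(buffer, names=list(Sales.columns))
# ...and restore the row labels with rename(), as in the answer.
SalesRetrieved = SalesRetrieved.rename(index=dict(enumerate(Sales.index)))
print(SalesRetrieved)
```

Since the file carries no labels at all, both the column labels (via `names`) and the row labels (via `rename`) have to be supplied again after reading.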
GLOSSARY
CSV           Acronym for Comma Separated Values.
CSV Format    A text file format storing data values separated by commas (or any other character as separator).
TYPE A : SHORT ANSWER QUESTIONS/CONCEPTUAL QUESTIONS
Case Study Based Questions : Unit I : DATA HANDLING : PANDAS AND PYPLOT
1. Mr. Ankit is working in an organisation as data analyst. He uses Python Pandas and Matplotlib for the same. He got a dataset of the passengers for the year 2010 to 2012 for January, March and December. His manager wants certain information from him, but he is facing some problems. Help him by answering few questions given below :
   Year Month  Passengers
0  2010   Jan          25
1  2010   Mar          50
2  2012   Jan          35
3  2010   Dec          55
4  2012   Dec          65
Code to create the above data frame :
import pandas as ____            #Statement 1
data = {"Year" : [2010, 2010, 2012, 2010, 2012],
        "Month" : ["Jan", "Mar", "Jan", "Dec", "Dec"],
        "Passengers" : [25, 50, 35, 55, 65]}
df = pd.____(data)               #Statement 2
print(df)
(i) Choose the right code from the following for statement 1.
(a) pd    (b) df    (c) data    (d) p
(ii) Choose the right code from the following for the statement 2.
(a) Dataframe    (b) DataFrame    (c) Series    (d) Dictionary
(iii) Choose the correct statement/method for the required output :
(5, 3)
(a) df.index    (b) df.shape( )    (c) df.shape    (d) df.size
(iv) He wants to print the details of "January" month along with the number of passengers. Identify the correct statement :
(a) df.loc[['Month', 'Passengers']][df['Month'] == 'Jan']
(b) df[['Month', 'Passengers']][df['Month'] == 'Jan']
(c) df.iloc[['Month', 'Passengers']][df['Month'] == 'Jan']
(d) df(['Month', 'Passengers'])[df['Month'] == 'Jan']
(v) Mr. Ankit wants to change the index of the Data Frame and the output for the same is given below. Identify the correct statement to change the index.
           Year Month  Passengers
Air India  2010   Jan          25
Indigo     2010   Mar          50
Spicejet   2012   Jan          35
Jet        2010   Dec          55
Emirates   2012   Dec          65
(a) df.index[ ] = ["Air India", "Indigo", "Spicejet", "Jet", "Emirates"]
(b) df.index["Air India", "Indigo", "Spicejet", "Jet", "Emirates"]
(c) df.index = ["Air India", "Indigo", "Spicejet", "Jet", "Emirates"]
(d) df.index( ) = ["Air India", "Indigo", "Spicejet", "Jet", "Emirates"]
Ans.
(i) (a) pd    (ii) (b) DataFrame    (iii) (c) df.shape
(iv) (b) df[['Month', 'Passengers']][df['Month'] == 'Jan']
(v) (c) df.index = ["Air India", "Indigo", "Spicejet", "Jet", "Emirates"]
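The answers to this case study can be verified with a short script. It is a sketch only; for the January rows it uses a single `df.loc[...]` call, which is an equivalent alternative to the chained `df[[...]][...]` form given in the answer key, and the name `jan` is introduced just for this example.

```python
import pandas as pd

data = {"Year": [2010, 2010, 2012, 2010, 2012],
        "Month": ["Jan", "Mar", "Jan", "Dec", "Dec"],
        "Passengers": [25, 50, 35, 55, 65]}
df = pd.DataFrame(data)

# shape is an attribute, so it is written without parentheses.
print(df.shape)

# Month and Passengers for the January rows only.
jan = df.loc[df['Month'] == 'Jan', ['Month', 'Passengers']]
print(jan)

# Assigning a list to df.index replaces the default 0..4 row labels.
df.index = ["Air India", "Indigo", "Spicejet", "Jet", "Emirates"]
print(df)
```

Writing `df.shape( )` fails because a tuple cannot be called like a function, which is why option (b) of part (iii) is wrong.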
2. Answer the following based on the series given below.
import pandas as pd
list1 = [1, 2, 3, 4, 5, 6, 7, 8]
list2 = ['swimming', 'tt', 'skating', 'kho kho', 'bb', 'chess',
         'football', 'cricket']
school = pd.Series(list1, index = list2)
school.name = ("little")
print(school*2)             # statement 1
print(school.tail(3))       # statement 2
print(school["tt"])         # statement 3
print(school[2:4])          # statement 4
(i) Choose the correct name of the series object given above.
(a) list1    (b) list2    (c) school    (d) little
(ii) Choose the correct output for the statement :
print(school.tail(3))       # statement 2
(a) swimming    1
    tt          2
    skating     3
(b) chess       6
    football    7
    cricket     8
(c) ...
(d) kho kho     4
    bb          5
    chess       6
    football    7
    cricket     8
(iii) Choose the correct output for the statement :
print(school["tt"])         # statement 3
(a) 2    (b) 3    (c) tt 2    (d) true
(iv) Identify the correct output for the statement :
print(school[2:4])          # statement 4
(a) skating     3
    kho kho     4
(b) tt          2
    skating     3
    kho kho     4
(c) skating     3
    kho kho     4
    bb          5
(d) skating     3
    kho kho     4
    bb          5
    chess       6
    football    7
    cricket     8
(v) The correct output of the statement :
print(school*2)             # statement 1 will be :
(a) swimming     3
    tt           4
    skating      5
    kho kho      6
    bb           7
    chess        8
    football     9
    cricket     10
(b) swimming     2
    tt           4
    skating      6
    kho kho      8
    bb          10
    chess       12
    football    14
    cricket     16
(c) swimming    False
    tt          False
    skating     True
    kho kho     True
    bb          True
    chess       True
    football    True
    cricket     True
(d) swimming     1
    tt           4
    skating      9
    kho kho     16
    bb          25
    chess       36
    football    49
    cricket     64
Ans. (i) (d) little ; (ii) (b) ; (iii) (a) 2 ; (iv) (a) ; (v) (b)
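The four statements can be run on their own to confirm the answers. In this sketch the slice is written as `school.iloc[2:4]`, which makes the position-based meaning of `school[2:4]` explicit. The names `doubled`, `last_three`, `tt_value` and `middle` are introduced only for this example.

```python
import pandas as pd

list1 = [1, 2, 3, 4, 5, 6, 7, 8]
list2 = ['swimming', 'tt', 'skating', 'kho kho', 'bb', 'chess',
         'football', 'cricket']
school = pd.Series(list1, index=list2)
school.name = "little"

doubled = school * 2           # statement 1: every value is multiplied by 2
last_three = school.tail(3)    # statement 2: the last three items
tt_value = school["tt"]        # statement 3: lookup by index label
middle = school.iloc[2:4]      # statement 4: positions 2 and 3 (4 is excluded)

print(doubled)
print(last_three)
print(tt_value)
print(middle)
```

Multiplying a Series by a number acts on each value separately and leaves the index labels unchanged.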
3. Sanyukta is the event incharge in a school. One of her students gave her a suggestion to use Python Pandas and Matplotlib for analysing and visualising the data, respectively. She has created a Data frame "SportsDay" to keep track of the number of First, Second and Third prizes won by different houses in various events.
Write Python commands to do the following :
(i) Display the house names where the number of Second Prizes are in the range of 12 to 20.
(a) df['House'][(df['Second'] >= 12) and (df['Second'] <= 20)]
(b) df[House][(df['Second'] >= 12) & (df['Second'] <= 20)]
(c) df['House'][(df['Second'] >= 12) & (df['Second'] <= 20)]
(d) df[(df['Second'] >= 12) & (df['Second'] <= 20)]
(ii) Display all the records in the reverse order.
(a) print(df[::1])    (b) print(df.iloc[::-1])
(c) print(df[-1:] + df[:-1])    (d) print(df.reverse( ))
(iii) Display the bottom 3 records.
(a) df.last(3)    (b) df.bottom(3)    (c) df.next(3)    (d) df.tail(3)
(iv) Choose the correct output for the given statements :
x = df.columns[:1]
print(x)
(a) 0    (b) House    (c) First    (d) Error
(v) Which command will give the output 24 :
(a) print(df.size)    (b) print(df.shape)    (c) print(df.index)    (d) print(df.axes)
Ans. (i) (c) df['House'][(df['Second'] >= 12) & (df['Second'] <= 20)]
(ii) (b) print(df.iloc[::-1]) ; (iii) (d) df.tail(3) ; (iv) (b) House ; (v) (a) print(df.size)
4. Consider the dataframe, namely Sdf, given below and answer questions (i) to (v).
   StudentID  Homework  Midterm  Project  Final
0       4560       100       97      100     95
1       5540        85       90       88     90
2       6889        92       85       88     87
3       6817        65       85       87    ...
(i) Display only the first three columns of the dataframe.
(a) ...    (b) ...    (c) ...    (d) ...
(ii) Which of the following commands will give the output as shown below ?
   StudentID  Homework
1       5540        85
2       6889        92
3       6817        65
The options are Sdf.iloc[1:, :2], Sdf.iloc[0:, 2], Sdf.loc[1:, :2] and Sdf.loc[0:, 2].
(iii) The Principal wants to know the details of students who have scored at least 90 percentage in the Final Marks. Which of the following set of commands will yield the desired result ?
(a) print(Sdf['Final' >= 90])
(b) print(Sdf[Sdf['Final'] >= 90])
(c) tmp = Sdf[Sdf['Final'] >= 90]
    print(tmp)
(d) print(Sdf['Final'] >= 90)
(iv) The programmer wants to calculate how many students' details are stored in the dataframe. Which of the following commands will yield the number of records in each column of the dataframe ?
(a) Sdf.count(axis = 1)    (b) pandas.count(Sdf)
(c) Sdf.count( )    (d) pandas.count(Sdf, 0)
(v) What will the following statement yield ?
Sdf.Midterm > 90
(a) 0    97    True
    1    90    False
    2    85    False
    3    85    False
(b) 0    True
    1    False
    2    False
    3    False
(c) 0    97
    1    NaN
    2    NaN
    3    NaN
(d) Error
Ans.
(i) ... ; (ii) Sdf.iloc[1:, :2]
(iii) (b) print(Sdf[Sdf['Final'] >= 90])
      (c) tmp = Sdf[Sdf['Final'] >= 90]
          print(tmp)
(iv) (c) Sdf.count( ) ; (v) (b)
5. Given a dataframe below namely wdf as :
   minTemp  maxTemp  Rainfall  Evaporation
   (table values illegible in the scan)
(a) Write command to compute sum of every column of the dataframe.
(b) Write command to compute mean of column Rainfall.
(c) Write command to compute sum of every row of the dataframe.
(d) Write command to compute mean of all the columns.
(e) Write command to compute average maxTemp, Rainfall for first ... rows only.
Ans.
(a) wdf.sum( )
(b) wdf['Rainfall'].mean( )
(c) wdf.sum(axis = 1)
(d) wdf.loc[:, :].mean( )
(e) wdf.loc[:11, 'maxTemp' : 'Rainfall'].mean( )
6. Given the following dataframe ndf as shown below :
      Column1  Column2  Column3    Res
T1  62.893165    100.0    60.00   True
T2  94.734483    100.0    59.22   True
T3  49.090140    100.0    46.04  False
T4  38.487265     85.4    58.62  False
Answer questions from (i) to (v).
(i) Which of the following commands will yield the following output ?
      Column1  Column2  Column3    Res
T2  94.734483    100.0    59.22   True
T3  49.090140    100.0    46.04  False
T4  38.487265     85.4    58.62  False
(a) print(ndf.iloc['T2':, :])    (b) print(ndf.at['T2', :])
(c) print(ndf.loc['T2':, :])    (d) print(ndf.iat['T2', :])
(ii) Predict the output produced by the following statement :
print(ndf.at['T3', 'Res'], ndf.at['T1', 'Column3'])
(a) False 60.0
(b) T3  49.090140  100.0  46.04  False
    T1  62.893165  100.0  60.00
(c) T3 False T1 60.00
(d)       Column1  Column2  Column3
    T1  62.893165    100.0    60.00
    T2  94.734483    100.0    59.22
    T3  49.090140    100.0    46.04
(iii) Which command will delete the rows T2 and T3 ?
(a) ndf.drop('T2', 'T3')    (b) ndf.drop(['T2', 'T3'])
(c) ndf.drop(['T2', 'T3'], axis = 1)    (d) ndf.drop( ... )
(iv) Which command will yield the maximum value from the column Column3 ?
(a) Pandas.max(ndf['Column3'])    (b) ...
(c) ndf[ndf['Column3'].max( )]    (d) ndf['Column3'].max( )
(v) Which command will delete columns 'Column2' and 'Res' ?
(a) ndf.del(['Column2', 'Res'])    (b) ndf.del('Column2', 'Res')
(c) ndf.drop(['Column2', 'Res'])    (d) ndf.drop(['Column2', 'Res'], axis = 1)
Ans.
(i) (c) print(ndf.loc['T2':, :])    (ii) (a) False 60.0
(iii) (b) ndf.drop(['T2', 'T3'])    (iv) (d) ndf['Column3'].max( )
(v) (d) ndf.drop(['Column2', 'Res'], axis = 1)
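The answers for the dataframe ndf can be confirmed with the sketch below. The Res value for row T1 is hard to read in the scan and is taken as True here, which is an assumption; the variable names on the left-hand side are introduced only for this example.

```python
import pandas as pd

# ndf as printed in the question (Res for T1 assumed to be True).
ndf = pd.DataFrame({'Column1': [62.893165, 94.734483, 49.090140, 38.487265],
                    'Column2': [100.0, 100.0, 100.0, 85.4],
                    'Column3': [60.00, 59.22, 46.04, 58.62],
                    'Res': [True, True, False, False]},
                   index=['T1', 'T2', 'T3', 'T4'])

from_t2 = ndf.loc['T2':, :]                       # label slice: T2 to the end
pair = (ndf.at['T3', 'Res'], ndf.at['T1', 'Column3'])   # two single cells
rows_dropped = ndf.drop(['T2', 'T3'])             # axis=0 is the default: rows
cols_dropped = ndf.drop(['Column2', 'Res'], axis=1)      # axis=1: columns
col3_max = ndf['Column3'].max()

print(from_t2)
print(pair[0], pair[1])
print(col3_max)
```

None of the `drop( )` calls changes ndf itself; each returns a new dataframe, so the result has to be stored if it is needed later.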
7. Given a dataframe mdf as shown below. Answer questions (a) to (e).
   (table values illegible in the scan)
(a) Write code to create a new dataframe n1 that stores the values of dataframe mdf multiplied by 3.
(b) Write code to add a column 'C4' in the dataframe n1, which stores the difference of column 'C3' and column 'C2' from the dataframe n1.
(c) Write code to drop the column 'C4' from the dataframe n1. The dataframe should be modified, in a single statement.
(d) Write code to drop the index 2 from the dataframe n1. The dataframe should be modified.
(e) Write code to display the sum of rows with indexes 2 onwards from the dataframe mdf.
Ans.
(a) n1 = mdf * 3
(b) n1['C4'] = n1['C3'] - n1['C2']
(c) n1.pop('C4')
(d) n1 = n1.drop(2)
(e) mdf.loc[2:].sum( )
8. A dataframe data stores fruit data in columns Color, Count and Price, with fruit names (Apple, Apple, Pear, ..., Lime) as row labels. Answer questions from (i) to (v).
(i) Which of the below given commands will yield the following output ?
        Price
Apple     120
Apple     ...
Pear      125
...       ...
Lime      ...
(a) data.iloc[:4, 2:]    (b) data.iloc[:, 2:]
(c) data.loc[:4, 2:]    (d) data.loc[:, 2:]
(ii) ... the label 'Apple'. Extract ... (Choose the correct statement)
(iii) ... (Choose the correct statement)
(iv) ... having price more than 120. (Choose the correct statement)
(v) ... the maximum price. (Choose the correct statement)
The options for (v) include data['Price'].max( ), data[data['Price'].max( ) == True] and data[data['Price'] == data['Price'].max( )].
Ans.
(i) (b) data.iloc[:, 2:]    (ii) data.loc['Apple']
(iii) data.iloc[0:3, 2]    (iv) data[data['Price'] > 120]
(v) data[data['Price'] == data['Price'].max( )]
Randeep Kaur maintains the records of all students of her class. She wants to perform some operations on the data.
import pandas as pd
t = {'Rollno' : [101, 102, 103, 104, 105, 106, 107],
     'Name' : ['Shubrato', 'Krishna', 'Pranshu', 'Gurpreet', 'Arpit',
               'Sanidhya', 'Aurobindo'],
     'Age' : [15, 14, 14, 15, 16, 15, 16],
     'Marks' : [77.9, 70.4, 60.9, 80.3, 86.5, 67.7, 85.0],
     'Grade' : ['11B', '11A', '11B', '11C', '11E', '11A', '11C']}
df = pd.DataFrame(t, index = [10, 20, 30, 40, 50, 60, 70])
print(df)
Output of the above code :
    Rollno       Name  Age  Marks Grade
10     101   Shubrato   15   77.9   11B
20     102    Krishna   14   70.4   11A
30     103    Pranshu   14   60.9   11B
40     104   Gurpreet   15   80.3   11C
50     105      Arpit   16   86.5   11E
60     106   Sanidhya   15   67.7   11A
70     107  Aurobindo   16   85.0   11C
Based on the given information, answer questions from (i) to (vi).
(i) Select the correct statement for the below output :
Rollno        102
Name      Krishna
Age            14
Marks        70.4
Grade         11A
Name: 20, dtype: object
The options (partly illegible) include print(df.loc[20]) and print(df.iloc[20]).
(ii) The teacher wants to know the marks secured by the second last student only. Which statement would help her to get the correct answer ?
(a) print(df. ... 'Marks'])    (b) ...
(c) print(df. ... , 'Marks'])    (d) print(df ... )
(iii) Which of the following statement(s) will add a new column 'fee' at second position with values [3200, 3400, 4500, 3100, 3200, 4000, 3700] in DataFrame df ?
(a) df.insert(loc = 2, column = 'fee', value = [3200, 3400, 4500, 3100, 3200, 4000, 3700])
(b) df.add(2, column = 'fee', [3200, 3400, 4500, 3100, 3200, 4000, 3700])
(c) df.append(loc = 2, 'fee' = [3200, 3400, 4500, 3100, 3200, 4000, 3700])
(d) df.insert(loc = 2, 'fee', [3200, 3400, 4500, 3100, 3200, 4000, 3700])
(iv) Which of the following commands is used to delete the column 'Grade' in the DataFrame df ?
(a) df.drop('Grade', axis = 1, inplace = True)
(b) df.drop('Grade', axis = 0, inplace = True)
(c) df.drop['Grade', axis = 1, inplace = True]
(d) df.delete('Grade', axis = 1, inplace = True)
(v) Which of the following commands would rename the column 'Marks' to 'Halfyearly' in the DataFrame df ?
(a) df.rename(['Marks', 'Halfyearly'], inplace = True)
(b) df.rename({'Marks', 'Halfyearly'}, inplace = True)
(c) df.rename(columns = {'Marks' : 'Halfyearly'}, inplace = True)
(d) df.rename(['Marks' : 'Halfyearly'], inplace = True)
(vi) Which of the following commands will display the Names and Marks of all students getting more than 80 marks ?
(a) print(df.loc['Marks' ... )
(b) print(df. ... )
(c) print(df.loc[df['Marks'] < 80, ['Name', 'Marks']])
(d) print(df.loc[df['Marks'] > 80, ['Name', 'Marks']])
Ans. (i) print(df.loc[20]) ; (ii) ... ; (iii) (a) ; (iv) (a) ; (v) (c) ; (vi) (d)
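The operations asked about in parts (i) to (vi) can be run in sequence on the class dataframe. This is a sketch: the Grade values are partly illegible in the scan and are filled in here as an assumption, and `df['Marks'].iloc[-2]` is just one possible way of reading the second last student's marks.

```python
import pandas as pd

t = {'Rollno': [101, 102, 103, 104, 105, 106, 107],
     'Name': ['Shubrato', 'Krishna', 'Pranshu', 'Gurpreet', 'Arpit',
              'Sanidhya', 'Aurobindo'],
     'Age': [15, 14, 14, 15, 16, 15, 16],
     'Marks': [77.9, 70.4, 60.9, 80.3, 86.5, 67.7, 85.0],
     # Grade values are assumed where the scan is unclear.
     'Grade': ['11B', '11A', '11B', '11C', '11E', '11A', '11C']}
df = pd.DataFrame(t, index=[10, 20, 30, 40, 50, 60, 70])

row_20 = df.loc[20]                        # (i) row with index label 20
second_last_marks = df['Marks'].iloc[-2]   # (ii) second last student

# (iii) insert 'fee' as the third column (position 2, counting from 0)
df.insert(loc=2, column='fee',
          value=[3200, 3400, 4500, 3100, 3200, 4000, 3700])
df.drop('Grade', axis=1, inplace=True)                      # (iv)
df.rename(columns={'Marks': 'Halfyearly'}, inplace=True)    # (v)

# (vi) after the rename, the marks live in the 'Halfyearly' column
toppers = df.loc[df['Halfyearly'] > 80, ['Name', 'Halfyearly']]
print(toppers)
```

Note that `loc` looks rows up by label (10, 20, ...) while `iloc` counts positions from 0, so `df.iloc[20]` would fail on this seven-row dataframe.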
The following data frame dataframe1 has been created to keep track of the data Rollno, Name, Marks1 and Marks2 for various students of a class, where row indexes are taken as the default values :
   Rollno            Name  Marks1  Marks2
0       1  Swapnil Sharma     ...     ...
1       2       Raj Batra     ...     ...
2       3    Bhoomi Singh     ...     ...
3       4       Jay Gupta     ...     ...
(i) Which among the following options will give 90, 95 as output ?
(a) print(max(dataframe1['Marks1', 'Marks2']))
(b) print((dataframe1.Marks1.max( ), (dataframe1.Marks2.max( ))))
(c) print(max(dataframe1['Marks1']))
(d) print(max(dataframe1['Marks2']))
(ii) She needs to know the marks scored by Rollno 2. Help her to identify the correct set of statement(s) from the given options :
(a) print(dataframe1[dataframe1['Rollno'] == 2])
(b) print(dataframe1['Rollno'] == 2)
(c) print(dataframe1[dataframe1.Rollno = 2])
(d) print(dataframe1[dataframe1['Rollno']])
(iii) Which of the following statements will delete the 3rd column ?
(A) del dataframe1['Marks1']    (B) dataframe1.pop('Marks1')
(C) drop dataframe1['Marks1']    (D) pop dataframe1['Marks1']
Choose the correct option :
(a) both (A) and (B)    (b) only (B)
(c) (A), (B) and (C)    (d) (A), (B) and (D)
(iv) Which of the following commands will display the total number of elements in the dataframe ?
(a) print(dataframe1.shape)    (b) print(dataframe1.num)
(c) print(dataframe1.size)    (d) print(dataframe1.elements)
(v) Now she wants to add a new column Marks3 with relevant data. Help her choose the command to perform this task.
(a) dataframe1.column = [45, 52, 90, 95]
(b) dataframe1['Marks3'] = [45, 52, 90, 95]
(c) dataframe1.loc['Marks3'] = [45, 52, 90, 95]
(d) Both (b) and (c) are correct
(CBSE)
Ans. (i) (b) ; (ii) (a) ; (iii) (a) ; (iv) (c) ; (v) (b)
... has created the following dataframe "Climate" to record the data about climatic conditions ...
UCI has maintained a collection of open datasets, available to the public for experimentation and research purposes. 'auto-mpg' is one such open dataset. It contains data related to fuel consumption by automobiles in a city. Consumption is measured in miles per gallon (mpg), hence the name of the dataset is auto-mpg. The data has 398 rows (also known as items or instances or objects) and nine columns (also known as attributes).
The attributes are : mpg, cylinders, displacement, horsepower, weight, acceleration, model year, origin, car name. Three attributes, cylinders, model year and origin, have categorical values ; car name is a string with a unique value for every row, while the remaining five attributes have numeric values.
The data has been downloaded from the UCI data repository available at http://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/.
Following are the exercises to analyse the data :
(i) Load auto-mpg.data into a DataFrame autodf.
(ii) Give description of the generated DataFrame autodf.
(iii) Display the first 10 rows of the DataFrame autodf.
(iv) Find the attributes which have missing values. Handle the missing values using following two ways :
     (a) Replace the missing values by a value before that.
     (b) Remove the rows having missing values from the original dataset.
(v) Print the details of the car which gave the maximum mileage.
(vi) Find the average displacement of the car given the number of cylinders.
(vii) What is the average number of cylinders in a car ?
(viii) Determine the no. of cars with weight greater than the average weight.
Ans.
(i) >>> autodf = pd.read_csv("c:\\pywork\\auto-mpg.data", sep = "\t", header = None)
>>> autodf
        0  1      2      3       4     5   6  7                          8
0    18.0  8  307.0  130.0  3504.0  12.0  70  1  chevrolet chevelle malibu
1    15.0  8  350.0  165.0  3693.0  11.5  70  1          buick skylark 320
2    18.0  8  318.0  150.0  3436.0  11.0  70  1         plymouth satellite
3    16.0  8  304.0  150.0  3433.0  12.0  70  1              amc rebel sst
4    17.0  8  302.0  140.0  3449.0  10.5  70  1                ford torino
..    ...  ..   ...    ...     ...   ...  .. ..                        ...
393  27.0  4  140.0   86.0  2790.0  15.6  82  1            ford mustang gl
394  44.0  4   97.0   52.0  2130.0  24.6  82  2                  vw pickup
395  32.0  4  135.0   84.0  2295.0  11.6  82  1              dodge rampage
396  28.0  4  120.0   79.0  2625.0  18.6  82  1                ford ranger
397  31.0  4  119.0   82.0  2720.0  19.4  82  1                 chevy s-10
[398 rows x 9 columns]
(Please note that, for this, we had to replace the spaces between columns so as to convert the data into a CSV compatible form before loading the auto-mpg.data file. We also noticed some extra ?s, which were to be removed.)
(ii) >>> autodf.describe()
(iii) >>> autodf.head(10)
(iv) (a) We can issue a statement for each individual column as follows, e.g., to fill the missing values of column 1 with values of column 0 :
>>> autodf[1].fillna(autodf[0])
Similarly,
>>> autodf[2].fillna(autodf[1])
>>> autodf[3].fillna(autodf[2])
(b) >>> autodf.dropna()
(v) Mileage is column with index 0 (mpg)
>>> autodf[autodf[0] == autodf[0].max()]
        0  1     2     3       4     5   6  7          8
322  46.6  4  86.0  65.0  2110.0  17.9  80  3  mazda glc
(vi) Displacement is column with index 2
>>> autodf[2].mean()
193.42587939698493
(vii) Cylinders is column with index 1
>>> autodf[1].mean()
5.454773869346734
(viii) Weight is column with index 4
>>> autodf[autodf[4] > autodf[4].mean()]
        0  1      2      3  ...     5   6  7                                  8
0    18.0  8  307.0  130.0  ...  12.0  70  1          chevrolet chevelle malibu
1    15.0  8  350.0  165.0  ...  11.5  70  1                  buick skylark 320
2    18.0  8  318.0  150.0  ...  11.0  70  1                 plymouth satellite
3    16.0  8  304.0  150.0  ...  12.0  70  1                      amc rebel sst
4    17.0  8  302.0  140.0  ...  10.5  70  1                        ford torino
..    ...  ..   ...    ...  ...   ...  .. ..                                ...
364  26.6  8  350.0  105.0  ...  19.0  81  1              oldsmobile cutlass ls
365  20.2  6  200.0   88.0  ...  17.1  81  1                     ford granada l
366  17.6  6  225.0   85.0  ...  16.6  81  1             chrysler lebaron salon
387  38.0  6  262.0   85.0  ...  17.0  82  1  oldsmobile cutlass ciera (diesel)
[... rows x 9 columns]
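The same kinds of operations can be tried on a tiny hand-made sample, without downloading the dataset. The sketch below is illustrative only: it uses four rows with named columns instead of the numeric labels 0 to 8, and the missing horsepower value in the second row is planted deliberately (it is not missing in the real data). `ffill( )` is shown as one reading of "replace the missing values by a value before that", taking the value from the previous row rather than from the previous column.

```python
import numpy as np
import pandas as pd

cols = ['mpg', 'cylinders', 'displacement', 'horsepower', 'weight',
        'acceleration', 'model_year', 'origin', 'car_name']
# Four sample rows; the NaN is inserted on purpose for the demonstration.
rows = [
    [18.0, 8, 307.0, 130.0, 3504.0, 12.0, 70, 1, 'chevrolet chevelle malibu'],
    [15.0, 8, 350.0, np.nan, 3693.0, 11.5, 70, 1, 'buick skylark 320'],
    [46.6, 4, 86.0, 65.0, 2110.0, 17.9, 80, 3, 'mazda glc'],
    [28.0, 4, 120.0, 79.0, 2625.0, 18.6, 82, 1, 'ford ranger'],
]
autodf = pd.DataFrame(rows, columns=cols)

missing_per_column = autodf.isna().sum()   # which attributes have gaps
filled = autodf.ffill()                    # copy the value from the row above
dropped = autodf.dropna()                  # or discard incomplete rows

best = autodf.loc[autodf['mpg'].idxmax()]  # car with the maximum mileage
disp_by_cyl = autodf.groupby('cylinders')['displacement'].mean()
heavy_count = (autodf['weight'] > autodf['weight'].mean()).sum()

print(missing_per_column)
print(best['car_name'], heavy_count)
print(disp_by_cyl)
```

Grouping by cylinders gives one average displacement per cylinder count, which is another way to read exercise (vi) than the single overall mean computed in the answer above.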