>>> narr2 = np.array([[11.5, 21.2, 33.8], [40, 50, 60], [212.3, 301.5, 405.2]])
>>> dtf2 = pd.DataFrame(narr2, columns = ['First', 'Second', 'Third'], index = …)
>>> dtf2
   First  Second  Third
    11.5    21.2   33.8
    40.0    50.0   60.0
   212.3   301.5  405.2
Chapter 1 : PYTHON PANDAS - I

In the above example, the ndarrays that are passed to DataFrame have the same number of elements in each of the rows. If, however, the rows of the ndarrays differ in length, i.e., if the number of elements in each row differs, then Python will create just a single column in the dataframe object and the type of the column will be considered as object, as shown below.
>>> narr3 = np.array([[101.5, 201.2], [400, 50, 600, 700], [212.3, 301.5, 405.2]], dtype = object)
>>> narr3
array([list([101.5, 201.2]), list([400, 50, 600, 700]),
       list([212.3, 301.5, 405.2])], dtype = object)
>>> dtf3 = pd.DataFrame(narr3)
>>> dtf3
                       0
0         [101.5, 201.2]
1    [400, 50, 600, 700]
2  [212.3, 301.5, 405.2]
(Recent NumPy versions require dtype = object to be given explicitly when the rows differ in length.)
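The single-column behaviour described above can be checked with a small sketch. The names rows, ragged and frame are illustrative only, and dtype=object is passed on the assumption that a recent NumPy is installed.

```python
import numpy as np
import pandas as pd

# Rows of unequal length (2, 4 and 3 elements).
rows = [[101.5, 201.2], [400, 50, 600, 700], [212.3, 301.5, 405.2]]

# dtype=object keeps each row as one Python list inside a 1D array.
ragged = np.array(rows, dtype=object)

# The 1D object array becomes a dataframe with a single column.
frame = pd.DataFrame(ragged)

print(frame.shape)      # (number of rows, number of columns)
print(frame[0].dtype)   # dtype of the only column
```

Each cell of the single column holds a whole list, which is why the column cannot have a numeric dtype.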
EXAMPLE  What will be the output of the following code ?
import pandas as pd
import numpy as np
arr1 = np.array([[11, 12], [13, 14], [15, 16]], np.int32)
dtf2 = pd.DataFrame(arr1)
print(dtf2)
SOLUTION
    0   1
0  11  12
1  13  14
2  15  16
EXAMPLE  Write a program to create a DataFrame from a 2D array as shown below :
    101    113    124
    130    140    200
    115    216    217
SOLUTION
import pandas as pd
import numpy as np
arr2 = np.array([[101, 113, 124], [130, 140, 200], [115, 216, 217]])
dtf3 = pd.DataFrame(arr2)
print(dtf3)
Output
     0    1    2
0  101  113  124
1  130  140  200
2  115  216  217
INFORMATICS PRACTICES
4. Creating a DataFrame Object from a 2D Dictionary with Values as Series Objects
In a 2D dictionary, you can also have values as Series objects. You can pass such a dictionary as argument to DataFrame() to create a dataframe from multiple Series objects.
wees nose ae th
ors mnt
cena ge ae aoweng Ses HS
1 eo eS tab
ere ema ten Paseg ST EOE
>>> dtf1 = pd.DataFrame(school)
Now the DataFrame dtf1, created above, will be like :
>>> dtf1
   amount  people
0  166000      20
1  246000      36
2  563000      44
EXAMPLE  Consider two series objects staff and salaries that store the number of people in various office branches and the salaries distributed in these branches, respectively.
Write a program to create another Series object that stores the average salary per branch and then create a DataFrame object from these Series objects.
SOLUTION
import pandas as pd
import numpy as np
staff = pd.Series([20, 36, 44])
salaries = pd.Series([279000, 396800, 563000])
# it will create avg series object
avg = salaries / staff
org = {'people' : staff, 'Amount' : salaries, 'Average' : avg}
dtf5 = pd.DataFrame(org)
print(dtf5)
Output
   people  Amount       Average
0      20  279000  13950.000000
1      36  396800  11022.222222
2      44  563000  12795.454545
5. Creating a DataFrame Object from another DataFrame Object
You can pass an existing DataFrame object to DataFrame() and it will create another dataframe object having similar data. Consider the code shown below that passes another dataframe object to DataFrame().
Given a DataFrame object dtf1, you can create an identical dataframe by passing its name (dtf1) to DataFrame() :
>>> dfnew = pd.DataFrame(dtf1)
You can display the new DataFrame object to confirm that it is identical to dtf1.
When you create a dataframe using another dataframe, you should add an additional argument to the DataFrame() method as copy = True, which will ensure that the newly created dataframe is a true copy of the passed dataframe and not just a new object that shares its data with the original. The method given above creates only such a linked dataframe. To create a dataframe as a separate, unlinked copy (true copy) of the passed dataframe, set the additional argument copy = True. Refer to the information box following the section on modifying a single cell, which explains this.
Please note, there are some other methods too for creating dataframe objects but covering those will be beyond the scope of the book.
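As a minimal sketch of the copy = True behaviour described above (the dataframe contents and the names original and independent are made up for illustration):

```python
import pandas as pd

original = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

# copy=True asks for an independent copy of the data.
independent = pd.DataFrame(original, copy=True)

# Change one cell of the original dataframe.
original.loc[0, "A"] = 99

print(original.loc[0, "A"])      # the changed value
print(independent.loc[0, "A"])   # the value the copy kept
```

Because the copy owns its data, later changes to the original do not reach it.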
NOTE
DataFrames can also be created from text/CSV files, which we shall cover in Chapter 4 (data transfer between dataframes and files/MySQL).

Displaying a DataFrame
Displaying a dataframe is the same as the way you display other variables and objects, i.e., on the console prompt, either type its name or give the print() command with the dataframe object as argument.

When you create a DataFrame object, all information related to it (such as its size, its datatype etc.) is available through its attributes. You can use these attributes in the following format to get information about the dataframe object :
<DataFrame object>.<attribute name>
Some common attributes of DataFrame objects are listed in the table below. The code examples for the usage of these attributes follow Table 1.6.
Table 1.6 Common attributes of DataFrame objects
Attribute | Description
index     | The index (row labels) of the DataFrame.
columns   | The column labels of the DataFrame.
axes      | Return a list representing both the axes (axis 0, i.e., index and axis 1, i.e., columns) of the DataFrame.
dtypes    | Return the dtypes of data in the DataFrame.
size      | Return an int representing the number of elements in this object.
shape     | Return a tuple representing the dimensionality of the DataFrame.
values    | Return a NumPy representation of the DataFrame.
empty     | Indicator whether DataFrame is empty.
ndim      | Return an int representing the number of axes/array dimensions.
T         | Transpose index and columns.

We are using the following DataFrame (dfn) to display various attributes, counting, transpose etc.
>>> dfn
     Marketing  Sales
age         25     24
name      Neha  Rohit
sex     Female   Male
(a) Retrieving Various Properties of a DataFrame Object
To view the value of an attribute, just use the attribute name with the dataframe's name as depicted in the following examples :
>>> dfn.index
Index(['age', 'name', 'sex'], dtype = 'object')
>>> dfn.columns
Index(['Marketing', 'Sales'], dtype = 'object')
>>> dfn.axes
[Index(['age', 'name', 'sex'], dtype = 'object'), Index(['Marketing', 'Sales'], dtype = 'object')]
>>> dfn.dtypes
Marketing    object
Sales        object
dtype: object
>>> dfn.size
6
>>> dfn.shape
(3, 2)
>>> dfn.ndim
2
>>> dfn.empty
False
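The attribute values shown above can be reproduced with a short sketch. It rebuilds dfn from the values displayed in this section; storing the ages as strings is an assumption based on the displayed output.

```python
import pandas as pd

# dfn rebuilt from the values displayed above.
dfn = pd.DataFrame(
    {"Marketing": ["25", "Neha", "Female"], "Sales": ["24", "Rohit", "Male"]},
    index=["age", "name", "sex"],
)

print(dfn.size)    # total number of elements
print(dfn.shape)   # (rows, columns)
print(dfn.ndim)    # number of axes
print(dfn.empty)   # whether the dataframe has no elements
print(len(dfn))    # number of rows
```

Note that size is always the product of the two numbers in shape.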
(b) Getting Number of Rows in a DataFrame
The len(<DF object>) will return the number of rows in a dataframe, e.g.,
>>> len(dfn)
3
(c) Getting Count of non-NA Values in DataFrame
Like Series, you can use count() with a dataframe too to get the count of non-NaN or non-NA values, but count() with a dataframe is a little elaborate :
(i) If you do not pass any argument or pass 0 (default is 0 only), then it returns the count of non-NA values for each column, e.g.,
>>> dfn.count()
Marketing    3
Sales        3
dtype: int64
>>> dfn.count(axis = 'index')
Marketing    3
Sales        3
dtype: int64
You may also pass argument axis = 'index' to get the same result as above.
(ii) If you pass argument as 1, then it returns the count of non-NA values for each row, e.g.,
>>> dfn.count(1)
age     2
name    2
sex     2
dtype: int64
>>> dfn.count(axis = 'columns')
age     2
name    2
sex     2
dtype: int64
You may also pass argument axis = 'columns' to get the same result as above.
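Since dfn has no missing values, the effect of count() is easier to see on data that does. The sketch below uses a made-up dataframe scores with a few NaN entries.

```python
import numpy as np
import pandas as pd

# Hypothetical marks with some missing (NaN) entries.
scores = pd.DataFrame(
    {"maths": [90, np.nan, 75], "physics": [np.nan, np.nan, 60]},
    index=["r1", "r2", "r3"],
)

per_column = scores.count()               # same as axis=0 or axis='index'
per_row = scores.count(axis="columns")    # same as axis=1

print(per_column)
print(per_row)
```

Only the non-NaN cells are counted, so a row made up entirely of NaN values gets a count of 0.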
(d) Transposing a DataFrame
You can transpose a dataframe by swapping its indexes and columns by using attribute T as <DF object>.T :
>>> dfn.T
          age   name     sex
Marketing  25   Neha  Female
Sales      24  Rohit    Male
Compare the transposed dataframe (dfn.T) with the original dfn :
>>> dfn
     Marketing  Sales
age         25     24
name      Neha  Rohit
sex     Female   Male
EXAMPLE  Write a program to create a DataFrame to store weight, age and names of 3 people. Print the DataFrame and its transpose.
SOLUTION
import pandas as pd
# creating the DataFrame
df = pd.DataFrame({'Weight' : [42, 75, 66],
                   'Name' : ['Arnav', 'Charles', 'Guru'],
                   'Age' : [15, 22, 35]})
print('Original DataFrame')
print(df)
print('Transpose:')
print(df.T)
Output
Original DataFrame
   Weight     Name  Age
0      42    Arnav   15
1      75  Charles   22
2      66     Guru   35
Transpose:
            0        1     2
Weight     42       75    66
Name    Arnav  Charles  Guru
Age        15       22    35
You can also use shape[0] to see the number of rows and shape[1] for getting the number of columns, i.e.,
df.shape[0]
df.shape[1]
(e) NumPy Representation of DataFrame
You can represent the values of a dataframe object in the NumPy way using the values attribute :
>>> dfn.values
array([['25', '24'],
       ['Neha', 'Rohit'],
       ['Female', 'Male']], dtype = object)
1.11 DataFrame vs. Series and 2D NumPy Arrays
Now that you have a clear idea of the DataFrame and how it works, let us discuss how a DataFrame object is similar to or different from Series objects and 2D NumPy arrays.

1.11.1 DataFrame vs. Series
A DataFrame as such is a 2D data structure while a Series is a 1D data structure. A Series is value-mutable only while a dataframe is value-mutable as well as size-mutable.

Table 1.7 DataFrame vs. Series
S.No. | Series Object | DataFrame Object
1. | It is a 1D data structure. | It is a 2D data structure.
2. | It is value-mutable only. | It is value-mutable as well as size-mutable.
3. | It stores homogeneous elements (same datatype). | It can store heterogeneous elements (different datatypes).
4. | Its single column does not have a heading. | Each column has a heading.

Individual columns of a dataframe can be considered equivalent to Series objects : a dataframe column is similar to a Series object, but with the little difference that it also has size-mutability, unlike Series objects.

1.11.2 DataFrame vs. 2D Ndarray
A DataFrame, like a 2D ndarray, is a two-dimensional data structure. However, it is different from 2D ndarrays. Let us see how.

Table 1.8 DataFrame vs. 2D Ndarrays
S.No. | DataFrame Object | 2D Ndarrays
1. | It is a 2D data structure. | It is also a 2D data structure.
2. | It can store heterogeneous elements (different datatypes). | It is a table of homogeneous elements (usually numbers), all of the same type.
3. | It can have indexes as well as labels for rows and columns. | It is indexed by a tuple of non-negative integers for both dimensions.
4. | It consumes more memory than an equivalent ndarray. | It consumes lesser memory compared to an equivalent dataframe.
5. | DataFrames are expandable as you can add new elements. | NumPy arrays are not expandable. If you add a new element, then a new array will be created. Even with the append() function a new array is created.
1.12 Selecting or Accessing Data
From a DataFrame object, you can extract or select desired rows and columns as per your requirement. Let us see how.
For all the examples in this section, in the coming lines, we are using the following DataFrame dtf5 :
         Population  Hospitals  Schools
Delhi      10927986        189     7916
Mumbai     12691836        208     8508
Kolkata     4631392        149     7226
Chennai     4328063        157     7617
1.12.1 Selecting/Accessing a Column
Selecting a column is easy, just use the following syntax :
<DataFrame object>[<column name>]     (using square brackets)
Or <DataFrame object>.<column name>   (using dot notation)
Now, consider the following examples accessing columns Population and Schools from dataframe dtf5.
>>> dtf5['Population']
Delhi      10927986
Mumbai     12691836
Kolkata     4631392
Chennai     4328063
Name: Population, dtype: int64
>>> dtf5['Schools']
Delhi      7916
Mumbai     8508
Kolkata    7226
Chennai    7617
Name: Schools, dtype: int64
In the dot notation, make sure not to put any quotation marks around the column name.
For example,
>>> dtf5.Population
Delhi      10927986
Mumbai     12691836
Kolkata     4631392
Chennai     4328063
Name: Population, dtype: int64
1.12.2 Selecting/Accessing Multiple Columns
To select multiple columns, you can give a list having multiple column names inside the square brackets with the dataframe object, i.e., as follows :
<DataFrame object>[[<column name>, <column name>, <column name>, ...]]
For example,
>>> dtf5[['Schools', 'Hospitals']]
         Schools  Hospitals
Delhi       7916        189
Mumbai      8508        208
Kolkata     7226        149
Chennai     7617        157
Columns appear in the order of column names given in the list inside square brackets (see below). Compare it with the above result too.
>>> dtf5[['Hospitals', 'Schools']]
         Hospitals  Schools
Delhi          189     7916
Mumbai         208     8508
Kolkata        149     7226
Chennai        157     7617
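The difference between selecting one column and selecting a list of columns can be sketched as follows. The dataframe is rebuilt from the dtf5 values shown in this section, and the names one_column and two_columns are illustrative.

```python
import pandas as pd

dtf5 = pd.DataFrame(
    {"Population": [10927986, 12691836, 4631392, 4328063],
     "Hospitals": [189, 208, 149, 157],
     "Schools": [7916, 8508, 7226, 7617]},
    index=["Delhi", "Mumbai", "Kolkata", "Chennai"],
)

one_column = dtf5["Schools"]                   # a single name gives a Series
two_columns = dtf5[["Schools", "Hospitals"]]   # a list of names gives a DataFrame

print(type(one_column).__name__)
print(list(two_columns.columns))   # columns follow the order of the list
```

A list with just one name, such as dtf5[["Schools"]], still returns a DataFrame rather than a Series.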
EXAMPLE  Given a DataFrame namely aid that stores the aid by NGOs for different states :
        Toys  Books  Uniform  Shoes
Andhra  7916   6189      610   8810
Odisha  8508   8208      508   6798
M.P.    7226   6149      611   9611
U.P.    7617   6157      457   6457
Write a program to display the aid for
(i) Books and Uniform only  (ii) Shoes only
SOLUTION
import pandas as pd
# DataFrame aid created or loaded
print("Aid for books and uniform:")
print(aid[['Books', 'Uniform']])
print("Aid for shoes:")
print(aid.Shoes)
Output
Aid for books and uniform:
        Books  Uniform
Andhra   6189      610
Odisha   8208      508
M.P.     6149      611
U.P.     6157      457
Aid for shoes:
Andhra    8810
Odisha    6798
M.P.      9611
U.P.      6457
Name: Shoes, dtype: int64

1.12.3 Selecting/Accessing a Subset from a DataFrame using Row/Column Names
To access row(s) and/or a combination of rows and columns, you can use the following syntax to select/access a subset from a dataframe object :
<DataFrameObject>.loc[<startrow> : <endrow>, <startcolumn> : <endcolumn>]
The above syntax is a general syntax through which you can access single/multiple rows/columns. Let us see some examples.
• To access a row, just give the row name/label as this : <DF object>.loc[<row label>, :]. Make sure not to miss the COLON AFTER COMMA.
>>> dtf5.loc['Delhi', :]
Population    10927986
Hospitals          189
Schools           7916
Name: Delhi, dtype: int64
>>> dtf5.loc['Chennai', :]
Population    4328063
Hospitals         157
Schools          7617
Name: Chennai, dtype: int64
• To access multiple rows, use : <DF object>.loc[<start row> : <end row>, :]. Make sure not to miss the COLON AFTER COMMA.
>>> dtf5.loc['Mumbai' : 'Kolkata', :]
         Population  Hospitals  Schools
Mumbai     12691836        208     8508
Kolkata     4631392        149     7226
Please note that when you specify <start row> : <end row>, Python will return all rows falling between the start row and the end row, along with the start row and the end row (see below).
>>> dtf5.loc['Mumbai' : 'Chennai', :]
         Population  Hospitals  Schools
Mumbai     12691836        208     8508
Kolkata     4631392        149     7226
Chennai     4328063        157     7617
• To access selective columns, use : <DF object>.loc[:, <start column> : <end column>]. Make sure not to miss the COLON BEFORE COMMA. Like rows, all columns falling between the start and end columns will also be listed.
>>> dtf5.loc[:, 'Population' : 'Schools']
         Population  Hospitals  Schools
Delhi      10927986        189     7916
Mumbai     12691836        208     8508
Kolkata     4631392        149     7226
Chennai     4328063        157     7617
>>> dtf5.loc[:, 'Population' : 'Hospitals']
         Population  Hospitals
Delhi      10927986        189
Mumbai     12691836        208
Kolkata     4631392        149
Chennai     4328063        157
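A brief sketch of the label-based slicing rules above, using the dtf5 values from this section (the names row_slice and column_slice are illustrative):

```python
import pandas as pd

dtf5 = pd.DataFrame(
    {"Population": [10927986, 12691836, 4631392, 4328063],
     "Hospitals": [189, 208, 149, 157],
     "Schools": [7916, 8508, 7226, 7617]},
    index=["Delhi", "Mumbai", "Kolkata", "Chennai"],
)

# Label slices with loc include BOTH the start and the end label.
row_slice = dtf5.loc["Mumbai":"Chennai", :]
column_slice = dtf5.loc[:, "Population":"Hospitals"]

print(list(row_slice.index))
print(list(column_slice.columns))
```

The rows and columns come back in the order in which they are stored in the dataframe, not in alphabetical order.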
• To access a range of columns from a range of rows, use :
<DF object>.loc[<startrow> : <endrow>, <startcolumn> : <endcolumn>]
>>> dtf5.loc['Delhi' : 'Mumbai', 'Population' : 'Hospitals']
        Population  Hospitals
Delhi     10927986        189
Mumbai    12691836        208

EXAMPLE  Given a DataFrame namely aid that stores the aid by NGOs for different states :
        Toys  Books  Uniform  Shoes
Andhra  7916   6189      610   8810
Odisha  8508   8208      508   6798
M.P.    7226   6149      611   9611
U.P.    7617   6157      457   6457
Write a program to display the aid for states 'Andhra' and 'Odisha' for Books and Uniform only.
SOLUTION
import pandas as pd
# DataFrame aid created or loaded
print(aid.loc['Andhra' : 'Odisha', 'Books' : 'Uniform'])
Output
        Books  Uniform
Andhra   6189      610
Odisha   8208      508

You may also specify distinct row ids and column names as lists with loc, e.g.,
aid.loc[['Andhra', 'U.P.'], ['Toys', 'Shoes']]
will list data only from rows with row id 'Andhra' and 'U.P.' and from columns 'Toys' and 'Shoes' only. You will see such a statement as part of the next example.

EXAMPLE  Consider a dataframe df (sample dataframe) having the columns Item Type, Sales Channel, Order Date, Order ID, Total Revenue, Total Cost and Total Profit. Write statements to do the following :
(i) Display rows 2 to 4 (both inclusive).
(ii) From rows 2 to 4 (both inclusive), display columns 'Item Type' and 'Total Profit'.
(iii) From rows 2 to 4 (both inclusive), display the first four columns.
(iv) Display columns 'Sales Channel' and 'Order ID' of rows 4 and 5 only.
SOLUTION
(i) >>> df[2:5]
(ii) >>> df.loc[2:4, ['Item Type', 'Total Profit']]
(iii) >>> df.iloc[2:5, 0:4]
(iv) >>> df.loc[[4, 5], ['Sales Channel', 'Order ID']]

With iloc, rows and columns are given by their numeric index/position :
<DF object>.iloc[<start row index> : <end row index>, <start col index> : <end column index>]
Recall that when you use iloc, then <start index> : <end index> given for rows and columns work like slices, and the end index is excluded.
>>> dtf5.iloc[0:2, 1:2]
        Hospitals
Delhi         189
Mumbai        208
>>> dtf5.iloc[0:2, 1:3]
        Hospitals  Schools
Delhi         189     7916
Mumbai        208     8508
NOTE
With loc, both start label and end label are included when given as start : end, but with iloc, like slices, the end index/position is excluded when given as start : end.
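The contrast between loc (end label included) and iloc (end position excluded) can be sketched on the dtf5 values used in this section. The names by_position and by_label are illustrative.

```python
import pandas as pd

dtf5 = pd.DataFrame(
    {"Population": [10927986, 12691836, 4631392, 4328063],
     "Hospitals": [189, 208, 149, 157],
     "Schools": [7916, 8508, 7226, 7617]},
    index=["Delhi", "Mumbai", "Kolkata", "Chennai"],
)

# iloc: positions 0 and 1 for rows, 1 and 2 for columns (end excluded).
by_position = dtf5.iloc[0:2, 1:3]

# loc: the same block, named by labels (end labels included).
by_label = dtf5.loc["Delhi":"Mumbai", "Hospitals":"Schools"]

print(by_position)
print(by_position.equals(by_label))
```

Both statements pick the same 2 x 2 block, which is why the end position in iloc is one more than the position of the last row or column wanted.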
1.12.5 Selecting/Accessing Individual Value
To select/access an individual data value from a dataframe, you can use any of the following methods :
(i) Either give the name of the row or its numeric index in square brackets with <DF object>.<column>, i.e., as this :
<DF object>.<column>[<row name or row numeric index>]
Consider the following examples :
>>> dtf5.Population['Delhi']
10927986
>>> dtf5.Population[1]
12691836
(Newer pandas versions deprecate or drop this positional fallback when the rows are labelled, so dtf5.Population.iloc[1] is the dependable form for access by position.)
(ii) You can use at or iat attributes with the DF object as shown below :
Use | To | Syntax
at  | Access a single value for a row/column label pair | <DF object>.at[<row label>, <col label>]
iat | Access a single value for a row/column pair by integer position | <DF object>.iat[<row index no.>, <col index no.>]
Consider the examples given below :
>>> dtf5.at['Chennai', 'Schools']
7617
>>> dtf5.iat[3, 2]
7617
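As a small sketch of at and iat on the dtf5 values from this section (the variable names are illustrative, and the new value 190 is made up):

```python
import pandas as pd

dtf5 = pd.DataFrame(
    {"Population": [10927986, 12691836, 4631392, 4328063],
     "Hospitals": [189, 208, 149, 157],
     "Schools": [7916, 8508, 7226, 7617]},
    index=["Delhi", "Mumbai", "Kolkata", "Chennai"],
)

by_label = dtf5.at["Chennai", "Schools"]   # row label, column label
by_position = dtf5.iat[3, 2]               # row position, column position

# at can also be used to change a single cell.
dtf5.at["Delhi", "Hospitals"] = 190

print(by_label, by_position, dtf5.at["Delhi", "Hospitals"])
```

Both lookups reach the same cell because Chennai is the row at position 3 and Schools is the column at position 2.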
NOTE
The at and iat attributes are used to access single values (scalar values) at a specific location in a dataframe; while at uses row and column labels, iat uses index positions to access the value.

1.12.6 Selecting DataFrame Rows/Columns based on Boolean Conditions
Sometimes you need to select rows/columns from a dataframe based on a condition, just the way you filtered entries in series objects in section 1.6.6. When you compare a dataframe with a value, then Pandas will execute that comparison condition for each element of the dataframe and give you True or False accordingly for each element, e.g.,
oo fea >> det > 0
a «
True True Tre
‘ea, conpared wach
tteant
iates
au
You can apply a condition to individual columns or a range of values too, as shown below :
>>> dtf5
   People  Amount       Average
0      20  279000  13950.000000
1      36  396800  11022.222222
2      44  563000  12795.454545
3      34  496800  14611.764706
4      49  763000  15571.428571
>>> dtf5['Average'] > 14000
0    False
1    False
2    False
3     True
4     True
Name: Average, dtype: bool
Here the given condition is applied to the Average column of the dataframe dtf5 and is checked for each value in the column.
But applying the condition as we have done above gives you just the result as True or False. To get the rows for which the condition is True, give the condition inside square brackets next to the dataframe's name, i.e., <DF object>[<condition>]. For example, the statement dtf5[dtf5['Average'] > 14000] applies the condition shown above inside the square brackets and returns only rows 3 and 4 of dtf5, the rows for which the condition is True.
Internally Pandas checks the condition for each row and returns True or False. These truth values (True/False) act as an index for the rows and the rows with True index are returned, similar to Boolean indexing covered later. The Boolean selection discussed above works similar to Boolean indexing but is different from it. Their difference will become clear to you when you go through the topic Boolean indexing later in the chapter.
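The two steps described above, building the True/False values and then using them to pick rows, can be sketched as follows. The dataframe is rebuilt from the values shown in this section; the names mask and selected are illustrative.

```python
import pandas as pd

dtf5 = pd.DataFrame(
    {"People": [20, 36, 44, 34, 49],
     "Amount": [279000, 396800, 563000, 496800, 763000]}
)
dtf5["Average"] = dtf5["Amount"] / dtf5["People"]

# Step 1: the comparison yields one True/False value per row.
mask = dtf5["Average"] > 14000

# Step 2: the truth values select the rows marked True.
selected = dtf5[mask]

print(mask.tolist())
print(selected)
```

The selected rows keep their original row labels, so the result here still carries the labels 3 and 4.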
Please remember that we have discussed simple conditions for extracting a dataframe's subsets; advanced conditions are beyond the scope of the book. However, Appendix B can be referred to if you want to read more about it.
NOTE
Boolean selection : a condition given in square brackets next to the dataframe's name yields the values that match the condition.
In the same way, you can extract subsets from a Series object too, based on a condition. Consider the following examples.
EXAMPLE  From a series Ser1 of areas of states in km², find out the areas that are more than 50000 km².
SOLUTION
import pandas as pd
Snr, 90, 5, 782, UST, 788, 257, TE,
1637632, 25723, 2367, 11789, 345, 256517])
print(Ser1[Ser1 > 50000])
EXAMPLE  Given a Series object s5. Write a program to store the squares of the Series values in object s6. Display s6's values which are > 25.
SOLUTION
import pandas as pd
# Series object s5 created or loaded
print("Series object s5 :")
print(s5)
s6 = s5 ** 2        # s6 created
print("Values in s6 > 25 :")
print(s6[s6 > 25])
1.13 Adding/Modifying Rows'/Columns' Values in DataFrames
You can assign or modify data in a dataframe in the same way as you do with other objects. All you need to do is to specify the row name and/or column name along with the dataframe's name. The process of adding and modifying rows'/columns' values is similar, as you will see in the following sub-sections.
1.13.1 Adding/Modifying a Column
Columns in a dataframe can be referred to in multiple ways. Assigning a value to a column :
• will modify it, if the column already exists
• will add a new column, if it does not exist already
To change or add a column, use syntax :
<DF object>[<column name>] = <new value>
Or <DF object>.<column name> = <new value>
The dot form can only modify a column that already exists; to add a new column, use the square-bracket form.
If the given column name does not exist in the dataframe, then a new column with this name is added.
>>> dtf5['Density'] = 1219
>>> dtf5
         Population  Hospitals  Schools  Density
Delhi      10927986        189     7916     1219
Mumbai     12691836        208     8508     1219
Kolkata     4631392        149     7226     1219
Chennai     4328063        157     7617     1219
Although the above method adds a column, the catch here is that all the rows of this new column have the same given value. If you want to add a proper new column that has different values for all its rows, then you can assign the data values for each row of the column in the form of a list, i.e., as shown below :
>>> dtf5['Density'] = [1500, 1219, 1630, 1050, 1100]
>>> dtf5
          Population  Hospitals  Schools  Density
Delhi     10927986.0      189.0   7916.0     1500
Mumbai    12691836.0      208.0   8508.0     1219
Kolkata    4631392.0      149.0   7226.0     1630
Chennai    4328063.0      157.0   7617.0     1050
Banglore   5678097.0     1200.0   1200.0     1100
NOTE
If you assign a single value to a column (using at or loc) and the column name does not exist, then a new column is created that has the same value for all its rows.
In the same way, you can modify an existing column by assigning a new list of values to it. That is, for an existing column, it will change the data values and for a non-existing column, it will add a new column. There are some other ways of adding a column to a dataframe. These are :
<DF object>.at[:, <column name>] = <values for column>
Or <DF object>.loc[:, <column name>] = <values for column>
Or <DF object> = <DF object>.assign(<column name> = <values for column>)
For example, given below are some statements that will add a new column if the mentioned column name does not exist in a dataframe :
>>> dtf5.at[:, 'Density'] = [1500, 1219, 1630, 1050, 1100]
>>> dtf5.loc[:, 'Density'] = [1500, 1219, 1630, 1050, 1100]
>>> dtf5 = dtf5.assign(Density = [1500, 1219, 1630, 1050, 1100])
Note that at is meant for single-value access; recent pandas versions may reject a slice (:) given to at, so loc or assign() is the safer choice for a whole column.
NOTE
When you assign something to a column of a dataframe, then for an existing column, it will change the data values and for a non-existing column, it will add a new column.
You just need to make sure in the above examples that the sequence which contains values for the new column has as many values as the number of rows in the dataframe, otherwise Python will raise an error (ValueError).
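The ways of adding a column discussed above can be sketched together. The dataframe uses the four-row dtf5 values from this section; the names base, same_value, with_density and mismatch_rejected are illustrative.

```python
import pandas as pd

base = pd.DataFrame(
    {"Population": [10927986, 12691836, 4631392, 4328063],
     "Hospitals": [189, 208, 149, 157]},
    index=["Delhi", "Mumbai", "Kolkata", "Chennai"],
)

# A single value is repeated for every row of the new column.
same_value = base.copy()
same_value["Density"] = 1219

# assign() returns a NEW dataframe; base itself stays unchanged.
with_density = base.assign(Density=[1500, 1219, 1630, 1050])

# A sequence of the wrong length (3 values for 4 rows) is rejected.
try:
    base.copy()["Density"] = [1500, 1219, 1630]
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True

print(with_density)
print(mismatch_rejected)
```

Because assign() leaves the original untouched, its result has to be stored back (as in dtf5 = dtf5.assign(...)) if the change should persist.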
1.13.2 Adding/Modifying a Row
Like columns, you can change or add rows to a DataFrame using at or loc attributes as explained below.
To change or add a row, use syntax :
<DF object>.at[<row name>, :] = <new value>
Or <DF object>.loc[<row name>, :] = <new value>
Likewise, if there is no row with such a row label, then Python adds a new row with this row label and assigns the given values to all its columns.
>>> dtf5.at['Banglore', :] = 1200
>>> dtf5
          Population  Hospitals  Schools  Density
Delhi     10927986.0      189.0   7916.0   1219.0
Mumbai    12691836.0      208.0   8508.0   1219.0
Kolkata    4631392.0      149.0   7226.0   1219.0
Chennai    4328063.0      157.0   7617.0   1219.0
Banglore      1200.0     1200.0   1200.0   1200.0
As you can see in the above output, if a row label mentioned with the at or loc attributes does not exist in the DataFrame, Python will create a new row for it and this is how a new row is added to a DataFrame. But there is a catch : if you specify only a single value, then all the values in the newly added row will have the same value, as in the above output.
You can add a new row by specifying individual values for each column. For this, specify all the values of the new row to be added as a sequence such as a list etc.
>>> dtf5.at['Banglore', :] = [10002980, 171, 7311, 1200]
>>> dtf5
          Population  Hospitals  Schools  Density
Delhi     10927986.0      189.0   7916.0   1219.0
Mumbai    12691836.0      208.0   8508.0   1219.0
Kolkata    4631392.0      149.0   7226.0   1219.0
Chennai    4328063.0      157.0   7617.0   1219.0
Banglore  10002980.0      171.0   7311.0   1200.0
Note that at is meant for single-value access; recent pandas versions may reject a slice (:) given to at, so loc is the safer choice for a whole row.
While adding a row this way, make sure that the sequence containing values for different columns has values for all the columns, otherwise Python will raise ValueError (see below).
If you try to add a row having 3 values to the above DataFrame dtf5 having 4 columns, Python will give you an error :
>>> dtf5.loc['Mohali', :] = [452980, 72, 281]
ValueError: cannot copy sequence with size 3 to array axis with dimension 4
(The exact wording of the error message varies across pandas versions.)
NOTE
You can use at or loc attributes of DataFrame to add/modify a row, column or individual value.
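Adding a row with loc, and the error for a sequence of the wrong length, can be sketched as below. The dataframe follows the dtf5 values used in this section; the flag name mismatch_rejected is illustrative.

```python
import pandas as pd

dtf5 = pd.DataFrame(
    {"Population": [10927986, 12691836, 4631392, 4328063],
     "Hospitals": [189, 208, 149, 157],
     "Schools": [7916, 8508, 7226, 7617],
     "Density": [1219, 1219, 1219, 1219]},
    index=["Delhi", "Mumbai", "Kolkata", "Chennai"],
)

# A new row label with one value per column adds a proper row.
dtf5.loc["Banglore", :] = [10002980, 171, 7311, 1200]

# Three values for four columns cannot be placed in a row.
attempt = dtf5.copy()
try:
    attempt.loc["Mohali", :] = [452980, 72, 281]
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True

print(dtf5)
print(mismatch_rejected)
```

The failing assignment is tried on a copy so that dtf5 itself is left exactly as the successful statement produced it.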
EXAMPLE  Consider the following dataframe saleDf :
       Target  Sales
zoneA   56000  58000
zoneB   70000  68000
zoneC   75000  78000
zoneD   60000  61000
Write a program to add a column namely Orders having values 6000, 6700, 6200 and 6000 respectively for the zones A, B, C and D. The program should also add a new row for a new zone ZoneE. Add some dummy values in this row.
SOLUTION
import pandas as pd
# saleDf created or loaded here
saleDf['Orders'] = [6000, 6700, 6200, 6000]
saleDf.loc['zoneE', :] = [50000, 45000, 5000]
print(saleDf)
Output
        Target    Sales  Orders
zoneA  56000.0  58000.0  6000.0
zoneB  70000.0  68000.0  6700.0
zoneC  75000.0  78000.0  6200.0
zoneD  60000.0  61000.0  6000.0
zoneE  50000.0  45000.0  5000.0
1.13.3 Modifying a Single Cell
To change or modify a single data value, use syntax :
<DF object>.<column name>[<row name/label>] = <new value>
>>> dtf5.Population['Banglore'] = 5678097
>>> dtf5
          Population  Hospitals  Schools  Density
Delhi     10927986.0      189.0   7916.0   1219.0
Mumbai    12691836.0      208.0   8508.0   1219.0
Kolkata    4631392.0      149.0   7226.0   1219.0
Chennai    4328063.0      157.0   7617.0   1219.0
Banglore   5678097.0     1200.0   1200.0   1200.0
See, this time only this one value got modified.
In recent pandas versions this chained form may not update the dataframe; dtf5.at['Banglore', 'Population'] = 5678097 is the dependable way to change a single cell by its labels.
You may use iat[] to modify values using row and column position. (Refer to Multiple Choice Question 22 given in the objective type questions at the end of this chapter.)
When you create a dataframe using an existing dataframe, as shown below, the new dataframe is linked to the original one. Suppose row 1 of a dataframe df1 holds the values 55, 67 and 58 :
>>> df2 = pd.DataFrame(df1)
Now if you change the value of one cell of the dataframe df1 (original dataframe) as :
>>> df1.iloc[1, 2] = 85
and then check the new dataframe df2, you will find that the value of the given cell in df2 has also changed : row 1 of df2 now reads 55, 67 and 85.
Why did it happen ? Wasn't df2 supposed to be a different dataframe, unaffected by the changes of df1 ?
The answer to all your questions is in the syntax of the pandas.DataFrame() method. So far you have used the syntax of the pandas.DataFrame() method as :
<dataFrameObject> = pandas.DataFrame(<a 2D data structure>, \
        [columns = <column sequence>], [index = <index sequence>])
But there is an additional optional argument, copy, which is by default set to False, i.e.,
<dataFrameObject> = pandas.DataFrame(<a 2D data structure>, \
        [columns = <column sequence>], [index = <index sequence>], [copy = False])
With copy as False (its default value), the new DataFrame is not created as a separate copy and is thus linked to the original dataframe (it shares the data of the original dataframe). Hence all changes made to the original dataframe are reflected in it and this was the reason that df2 also showed the changes above. (In pandas versions that use Copy-on-Write, which is the default from pandas 3.0, df2 behaves as a copy and no longer reflects such changes.)
In order to create a dataframe as a completely different, unlinked dataframe, you need to set the copy argument as True so that the new dataframe is created as a true copy (deep copy), i.e., as :
>>> df3 = pd.DataFrame(df1, copy = True)
>>> df1.iloc[1, 2] = 78
DataFrame df2, created with copy as False, shows this change too : row 1 of df2 now reads 55, 67 and 78.
DataFrame df3, created with copy as True, is not linked to the original dataframe; being a true copy, it does not reflect any changes of df1 : row 1 of df3 still reads 55, 67 and 85.
1.14 Deleting/Renaming Columns/Rows
Python Pandas provides two ways to delete rows and columns : the del statement and the drop() function. Pandas also provides the rename() function to rename rows and columns. In the following lines, we shall talk about how rows/columns can be deleted or renamed.
1.14.1 Deleting Rows/Columns in a DataFrame
To delete a column, you use the del statement as this :
del <DF object>[<column name>]
For example,
>>> del dtf5['Density']
>>> dtf5
          Population  Hospitals  Schools
Delhi     10927986.0      189.0   7916.0
Mumbai    12691836.0      208.0   8508.0
Kolkata    4631392.0      149.0   7226.0
Chennai    4328063.0      157.0   7617.0
Banglore   5678097.0     1200.0   1200.0
To delete rows from a dataframe, you can use <DF>.drop(index or sequence of indexes), e.g.,
df.drop(range(2, 13, 2))
df.drop([2, 4, 6, 8, 10, 12])
Both these commands will delete the rows with indexes 2, 4, 6, 8, 10, 12 from dataframe df. The argument to drop() should be either an index or a sequence containing indexes. Note that drop() returns a new dataframe; the original dataframe remains unchanged unless you assign the result back to it.
You can also give axis = 1 along with indexes/labels; then drop() will drop the columns, i.e., the following command will drop the mentioned columns from dataframe df :
df.drop(['Total Cost', 'Order ID'], axis = 1)
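A short sketch of drop() for rows and columns follows. The dataframe df and its column names value and extra are made up for illustration.

```python
import pandas as pd

# Hypothetical dataframe with row labels 0 to 13 and two columns.
df = pd.DataFrame({"value": range(14), "extra": range(100, 114)})

# range(2, 13, 2) produces 2, 4, 6, 8, 10, 12.
by_range = df.drop(range(2, 13, 2))
by_list = df.drop([2, 4, 6, 8, 10, 12])

# axis=1 makes drop() remove columns instead of rows.
without_extra = df.drop(["extra"], axis=1)

print(list(by_range.index))
print(list(without_extra.columns))
print(len(df))   # df itself still has all its rows
```

Since drop() hands back a new dataframe each time, df keeps all 14 rows and both columns.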
EXAMPLE  From the dtf5 used above, create another DataFrame and it must not contain the column 'Population' and the row Banglore.
SOLUTION
import pandas as pd
# DataFrame dtf5 created or loaded
dtf6 = pd.DataFrame(dtf5, copy = True)
del dtf6['Population']
dtf6 = dtf6.drop(['Banglore'])
print(dtf6)
Output
         Hospitals  Schools
Delhi        189.0   7916.0
Mumbai       208.0   8508.0
Kolkata      149.0   7226.0
Chennai      157.0   7617.0
1.14.2 Renaming Rows/Columns
To change the name of any row/column individually, you can use the rename() function of DataFrame as per the syntax below :
<DF>.rename(index = {<names dictionary>}, columns = {<names dictionary>}, inplace = False)
where
• index argument is for index names (row labels). If you want to rename rows only, then specify only the index argument.
• columns argument is for the column names. If you want to rename columns only, then specify only the columns argument.
• For both index and columns arguments, specify the names-change dictionary containing the original names and the new names in a form like {old name : new name}.
• Specify the inplace argument as True if you want to rename the rows/columns in the same dataframe; if you skip this argument, then a new dataframe is created with the new index/column names and the original remains unchanged.
Let us now practically see how rename() works.
Consider the dataframe topDf shown below :
       Rollno   Name  Marks
Sec A     115  Pavni   97.5
Sec B     236  Rishi   98.0
Sec C     307  Preet   98.5
Sec D     422  Paula   98.0
To change the row labels as 'A', 'B', 'C', 'D', you can write :
>>> topDf.rename(index = {'Sec A' : 'A', 'Sec B' : 'B', 'Sec C' : 'C', 'Sec D' : 'D'})
   Rollno   Name  Marks
A     115  Pavni   97.5
B     236  Rishi   98.0
C     307  Preet   98.5
D     422  Paula   98.0
The above statement will show the changed indexes but when you display the dataframe topDf after executing the above statement, it will show you the original dataframe (see below) because by default rename() creates a new dataframe with changed names.
>>> topDf
       Rollno   Name  Marks
Sec A     115  Pavni   97.5
Sec B     236  Rishi   98.0
Sec C     307  Preet   98.5
Sec D     422  Paula   98.0
To make the changes in the original dataframe, specify the inplace argument as True :
>>> topDf.rename(index = {'Sec A' : 'A', 'Sec B' : 'B', 'Sec C' : 'C', 'Sec D' : 'D'}, inplace = True)
>>> topDf
   Rollno   Name  Marks
A     115  Pavni   97.5
B     236  Rishi   98.0
C     307  Preet   98.5
D     422  Paula   98.0
With inplace as True, rename() changes the indexes of the original dataframe topDf.
You can use rename() to change column names too, e.g., the following statement will change the column names in the topDf dataframe. (Consider the original topDf shown earlier in the beginning of this section.) You can choose to change selective columns' names or all the columns' names. Only the names given in the name-change dictionary will get changed. For example,
>>> topDf.rename(columns = {'Rollno' : 'RNo'})
       RNo   Name  Marks
Sec A  115  Pavni   97.5
Sec B  236  Rishi   98.0
Sec C  307  Preet   98.5
Sec D  422  Paula   98.0
You can combine any of these arguments together : index, columns and inplace arguments.
Following examples are illustrating the same.
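Before the worked examples, here is a minimal sketch of how the index, columns and inplace arguments behave. The dataframe follows the topDf values shown in this section; the new names 'Score' and 'RNo' and the variable names renamed and result are illustrative.

```python
import pandas as pd

topDf = pd.DataFrame(
    {"Rollno": [115, 236, 307, 422],
     "Name": ["Pavni", "Rishi", "Preet", "Paula"],
     "Marks": [97.5, 98.0, 98.5, 98.0]},
    index=["Sec A", "Sec B", "Sec C", "Sec D"],
)

# Without inplace, rename() returns a new dataframe; topDf is untouched.
renamed = topDf.rename(index={"Sec A": "A", "Sec B": "B"},
                       columns={"Marks": "Score"})

# With inplace=True, topDf itself is changed and rename() returns None.
result = topDf.rename(columns={"Rollno": "RNo"}, inplace=True)

print(renamed)
print(topDf)
```

Names missing from the dictionaries, such as 'Sec C' here, are simply left as they are.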
EXAMPLE  Consider the saleDf shown below.
       Target  Sales
zoneA   56000  58000
zoneB   70000  68000
zoneC   75000  78000
zoneD   60000  61000
Write a program to rename indexes of 'zoneC' and 'zoneD' as 'Central' and 'Dakshin' respectively and the column names 'Target' and 'Sales' as 'Targeted' and 'Achieved' respectively.
SOLUTION
import pandas as pd
# saleDf created or loaded here
print(saleDf.rename(index = {'zoneC' : 'Central', 'zoneD' : 'Dakshin'}, \
                    columns = {'Target' : 'Targeted', 'Sales' : 'Achieved'}))
Output
         Targeted  Achieved
zoneA       56000     58000
zoneB       70000     68000
Central     75000     78000
Dakshin     60000     61000

EXAMPLE  Would the previous program change the original dataframe saleDf ? Make changes in the previous program so that saleDf has the changed indexes and columns' names.
SOLUTION
The previous program will not change the original dataframe because rename() was used without the inplace = True argument. The program to make the changes in place would be :
import pandas as pd
# saleDf created or loaded here
saleDf.rename(index = {'zoneC' : 'Central', 'zoneD' : 'Dakshin'}, \
              columns = {'Target' : 'Targeted', 'Sales' : 'Achieved'}, inplace = True)
print(saleDf)
BOOLEAN INDEXING
Till now you have learnt to create and access dataframes in a variety of ways. You have also learnt
to change indexes and column names, rename them etc. Let us now talk about an interesting feature
of DataFrames - Boolean Indexing. Boolean indexing, as the name suggests, means
having Boolean values [(True or False) or (1 or 0)] as indexes of a dataframe. You might be
wondering - Why? What is the need for having indexes as True or False?
Well, your question is genuine and so is the answer.
Actually, in some situations, we may need to divide our
data in two subsets - True or False, e.g., your school has
decided to launch online classes for you. But some days of
the week are designated for it. So, a dataframe related to
this information might look like:
         Day         No. of Classes
True     Monday      6
False    Tuesday     0
True     Wednesday   3
False    Thursday    0
True     Friday      8
The Boolean indexes divide the dataframe in two groups -
True rows and False rows. This is a useful division in situations
where you want to find out things like - On which days are the online
classes held? Or which ones are offline classes days?

BOOLEAN INDEXING
Boolean indexing means having Boolean values [(True or False) or (1 or 0)] as indexes of a
dataframe. Boolean indexes can either be in True/False form or in (1 or 0) form.

Check Point
1. What is the use of Python Pandas library?
2. Name the Pandas object that can store a one-dimensional array-like object and can have
   numeric or labelled indexes.
3. Can you have duplicate indexes in a Series object?
4. What do these attributes of Series signify? (a) size (b) itemsize (c) nbytes
5. If S1 is a Series object then how will len(S1) and S1.count() behave?
6. What are NaNs? How do you store them in a data structure?
7. True/False. Series objects always have indexes 0 to n-1.
8. What is the use of del statement?
9. What does drop() function do?
10. What is the role of inplace argument in rename() function?
1.15.1 Creating DataFrames with Boolean Indexes
Boolean indexes are given to a dataframe as the values True and False. While creating a dataframe
with Boolean indexes, make sure that True and False are not enclosed in quotes (i.e., like 'True'
or 'False'), otherwise it will give you an error (KeyError) while accessing data with Boolean
indexes using .loc, because 'True' and 'False' are string values, not Boolean values.
Solved problem 32 depicts the same problem.
Let us first create the above shown dataframe, containing online classes' information, through the
code given below.
import pandas as pd
Days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']
Classes = [6, 0, 3, 0, 8]
dc = {'Days' : Days, 'No. of Classes' : Classes}
# Boolean indexes provided; the Boolean values are not strings (not enclosed in quotes)
clasDf = pd.DataFrame(dc, index = [True, False, True, False, True])

>>> clasDf
            Days  No. of Classes
True      Monday               6
False    Tuesday               0
True   Wednesday               3
False   Thursday               0
True      Friday               8
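The warning about quotes can be checked with a short sketch. Here strDf is a hypothetical name for a dataframe whose indexes were wrongly given as the strings 'True' and 'False'; looking up the Boolean value True with .loc then fails, while the string label still works.

```python
import pandas as pd

dc = {"Days": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"],
      "No. of Classes": [6, 0, 3, 0, 8]}

# Mistake on purpose: the indexes are strings, not Boolean values.
strDf = pd.DataFrame(dc, index=["True", "False", "True", "False", "True"])

try:
    strDf.loc[True]                # Boolean label looked up in a string index
    outcome = "rows returned"
except KeyError:
    outcome = "KeyError"

print(outcome)

# The string label itself is still a valid key.
matched = strDf.loc["True"]
print(matched)
```

This is only an illustration of the pitfall; the correct approach remains to write True and False without quotes.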
You can also provide Boolean indexing to dataframes as 1s and 0s. Let us create another
dataframe (clasDf1), similar to the above shown dataframe (clasDf), having indexes as 1s and 0s,
through the code given below.
import pandas as pd
Days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']
Classes = [6, 0, 3, 0, 8]
dc = {'Days' : Days, 'No. of Classes' : Classes}
# Indexes provided as 1s and 0s this time
clasDf1 = pd.DataFrame(dc, index = [1, 0, 1, 0, 1])

>>> clasDf1
        Days  No. of Classes
1     Monday               6
0    Tuesday               0
1  Wednesday               3
0   Thursday               0
1     Friday               8
1.15.2 Accessing Rows from DataFrames with Boolean Indexes
Boolean indexing is very useful for filtering records, i.e., for finding or extracting data from
dataframes with Boolean indexes. You can filter records using the .loc attribute as listed below:
    <DF>.loc[True]     to display all records with True index
    <DF>.loc[False]    to display all records with False index
    <DF>.loc[1]        to display all records with index as 1
    <DF>.loc[0]        to display all records with index as 0
For example,
>>> clasDf.loc[True]
           Days  No. of Classes
True     Monday               6
True  Wednesday               3
True     Friday               8
>>> clasDf.loc[False]
           Days  No. of Classes
False   Tuesday               0
False  Thursday               0
>>> clasDf1.loc[0]
       Days  No. of Classes
0   Tuesday               0
0  Thursday               0
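The filtering described in this section can be put together as one runnable sketch. It recreates both dataframes with the class counts used in the listings (0 classes on Tuesday and Thursday) and stores each filtered result in a variable; the variable names online, offline, online1 and offline1 are chosen here only for illustration.

```python
import pandas as pd

dc = {"Days": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"],
      "No. of Classes": [6, 0, 3, 0, 8]}

clasDf = pd.DataFrame(dc, index=[True, False, True, False, True])   # True/False form
clasDf1 = pd.DataFrame(dc, index=[1, 0, 1, 0, 1])                   # 1/0 form

online = clasDf.loc[True]       # all rows whose index is True
offline = clasDf.loc[False]     # all rows whose index is False
online1 = clasDf1.loc[1]        # all rows whose index is 1
offline1 = clasDf1.loc[0]       # all rows whose index is 0

print(online)
print(offline1)
```

In both forms .loc treats the value as a label, and because the label repeats, every row carrying it is returned.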
With this, we have come to the end of this chapter. Let us quickly revise what we have learnt
so far.
LET US REVISE
“ry fom O10 Lava.
tg daa
da
dtype, shape, nbytes, ndim, size, itemsize, hasnans, empty
1 Aslic objets rete from Series objec ut
Sg @ sma of .T
• When you assign something to a column of a dataframe, then for an existing column, it will change the data values and for a
  non-existing column, it will add a new column.
• A column can be deleted using the del command.
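The last two revision points can be seen in a minimal sketch. The dataframe and the column name 'Mode' below are made up for illustration only.

```python
import pandas as pd

df = pd.DataFrame({"Days": ["Monday", "Tuesday"],
                   "No. of Classes": [6, 0]})

# Existing column: assignment changes its data values.
df["No. of Classes"] = [5, 1]

# Non-existing column: assignment adds a new column ('Mode' is hypothetical).
df["Mode"] = ["online", "offline"]
columns_after_add = list(df.columns)

# del removes a column from the dataframe.
del df["Mode"]
columns_after_del = list(df.columns)

print(columns_after_add)
print(columns_after_del)
```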
Opiective Type Questions
Multiple Choice Questions
1. To create an empty Series object, you can use
(a) pd.Series(empty)    (b) pd.Series(np.NaN)
(c) pd.Series()         (d) all of these
2. To specify datatype int16 for a Series object, you can write
(a) pd.Series(data = array, dtype = int16)    (b) pd.Series(data = array, dtype = numpy.int16)
(c) pd.Series(data = array, dtype = pandas.int16)
(d) all of the above
3. To get the number of dimensions of a Series object, ____ attribute is displayed.
(a) index    (b) size    (c) itemsize    (d) ndim
4. To get the size of the datatype of the items in a Series object, you can display ____ attribute.
(a) index    (b) size    (c) itemsize    (d) ndim
5. To get the number of elements in a Series object, ____ attribute may be used.
(a) index    (b) size    (c) itemsize    (d) ndim
6. To get the number of bytes of the Series data, ____ attribute is displayed.
(a) hasnans    (b) nbytes    (c) ndim    (d) dtype
7. To check if the Series object contains NaN values, ____ attribute is displayed.
(a) hasnans    (b) nbytes    (c) ndim    (d) dtype
8. To display the third element of a Series object S, you will write
ws) wsei 881 say
&