Python Pandas - I
    >>> narr2 = np.array([[11.5, 21.2, 33.8], [40, 50, 60], [212.3, 301.5, 405.2]])
    >>> dtf4 = pd.DataFrame(narr2, columns = ['First', 'Second', 'Third'], index = <index sequence>)

By giving an index sequence, you can specify your own row labels. Apart from those row labels, the dataframe holds:

    First  Second  Third
     11.5    21.2   33.8
     40.0    50.0   60.0
    212.3   301.5  405.2

In the above examples, the ndarrays passed to DataFrame() have the same number of elements in each of the rows. If, however, the rows of the ndarray differ in length, i.e., if the number of elements in each row differs, then Python will create just a single column in the dataframe object and the type of the column will be considered as object. (Recent NumPy versions require dtype = object to be stated explicitly for such ragged input.)

    >>> narr3 = np.array([[101.5, 201.2], [400, 50, 600, 700], [212.3, 301.5, 405.2]], dtype = object)
    >>> narr3
    array([list([101.5, 201.2]), list([400, 50, 600, 700]), list([212.3, 301.5, 405.2])], dtype = object)
    >>> dtf6 = pd.DataFrame(narr3)
    >>> dtf6
                           0
    0         [101.5, 201.2]
    1    [400, 50, 600, 700]
    2  [212.3, 301.5, 405.2]

EXAMPLE  What will be the output of the following code?

    import pandas as pd
    import numpy as np
    arr1 = np.array([[11, 12], [13, 14], [15, 16]], np.int32)
    dtf2 = pd.DataFrame(arr1)
    print(dtf2)

SOLUTION

        0   1
    0  11  12
    1  13  14
    2  15  16

EXAMPLE  Write a program to create a DataFrame from a 2D array as shown below:

    101  113  124
    130  140  200
    115  216  217

SOLUTION

    import pandas as pd
    import numpy as np
    arr2 = np.array([[101, 113, 124], [130, 140, 200], [115, 216, 217]])
    dtf3 = pd.DataFrame(arr2)
    print(dtf3)

Output

         0    1    2
    0  101  113  124
    1  130  140  200
    2  115  216  217

4. Creating a DataFrame Object from a 2D Dictionary with Values as Series Objects

In a 2D dictionary, you can also pass multiple Series objects as the values. For a dictionary school whose values are Series objects, you can write:

    >>> dtf = pd.DataFrame(school)

Now the DataFrame dtf, created above, will be like:

       Amount  people
    0  166000      20
    1  246000      36
    2  563000      44

EXAMPLE  Consider two Series objects staff and salaries that store the number of people in various office branches and the salaries distributed in these branches, respectively. Write a program to create another Series object that stores the average salary per branch and then create a DataFrame object from these Series objects.

SOLUTION

    import pandas as pd
    import numpy as np
    staff = pd.Series([20, 36, 44])
    salaries = pd.Series([279000, 396800, 563000])
    # it will create the avg Series object
    avg = salaries / staff
    org = {'people' : staff, 'Amount' : salaries, 'Average' : avg}
    dtf5 = pd.DataFrame(org)
    print(dtf5)

Output

       Amount       Average  people
    0  279000  13950.000000      20
    1  396800  11022.222222      36
    2  563000  12795.454545      44

(The output above lists the columns alphabetically, as older pandas versions did. Recent versions keep the order of the dictionary keys: people, Amount, Average.)

5. Creating a DataFrame Object from another DataFrame Object

You can pass an existing DataFrame object to DataFrame() and it will create another dataframe object having similar data. Given a DataFrame object dtf1, you can create an identical dataframe by passing its name (dtf1) to DataFrame():

    >>> dfnew = pd.DataFrame(dtf1)

You can display the new DataFrame to confirm that it is identical to dtf1.

NOTE: When you create a dataframe using another dataframe, you should add an additional argument to the DataFrame() method as copy = True, which will ensure that the newly created dataframe is a true copy of the passed dataframe and not a new label only.
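The effect of copy = True described in the note above can be sketched as follows. The frame contents and the name dfcopy are made up for illustration and are not from the text. Only the independent-copy case is shown, because whether a dataframe created without copy = True stays linked to the original depends on the pandas version in use.

```python
import pandas as pd

# Illustrative data; these values are an assumption, not taken from the text.
dtf1 = pd.DataFrame([[1, 2, 3], [4, 5, 6]])

# copy=True asks DataFrame() for an independent copy of the passed data.
dfcopy = pd.DataFrame(dtf1, copy=True)

# Change one cell of the original dataframe.
dtf1.loc[0, 0] = 100

print(dtf1.loc[0, 0])    # the original shows the new value
print(dfcopy.loc[0, 0])  # the true copy still shows the old value
```

After the assignment, dtf1 holds 100 in its first cell while dfcopy still holds 1.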
Passing a dataframe to DataFrame() without copy = True creates only a new label referring to the same dataframe. To create a dataframe as an unlinked new copy (true copy) of the passed dataframe, set the additional argument copy = True. Refer to the information box in section 1.13.3 (Modifying a Single Cell) regarding this.

Please note, there are some other methods too for creating dataframe objects, but covering those is beyond the scope of the book.

NOTE: DataFrames can also be created from text/CSV files, which we shall cover in Chapter 4 (Data Transfer between DataFrames and Files/MySQL).

Displaying a DataFrame

Displaying a dataframe is the same as the way you display other variables and objects, i.e., on the console prompt, either type its name or give the print() command with the dataframe object as its argument.

When you create a DataFrame object, all information related to it (such as its size, its datatype etc.) is available through its attributes. You can use these attributes in the following format to get information about the dataframe object:

    <DataFrame object>.<attribute name>

Some common attributes of DataFrame objects are listed in the table below. The code examples for the usage of these attributes follow Table 1.6.

Table 1.6  Common attributes of DataFrame objects

    Attribute   Description
    index       The index (row labels) of the DataFrame.
    columns     The column labels of the DataFrame.
    axes        Returns a list representing both the axes (axis 0, i.e., index, and axis 1, i.e., columns) of the DataFrame.
    dtypes      Returns the dtypes of data in the DataFrame.
    size        Returns an int representing the number of elements in this object.
    shape       Returns a tuple representing the dimensionality of the DataFrame.
    values      Returns a Numpy representation of the DataFrame.
    empty       Indicator whether the DataFrame is empty.
    ndim        Returns an int representing the number of axes/array dimensions.
    T           Transposes index and columns.

We are using the following dataframe (dfn) to display various attributes, counting, transpose etc.:
    >>> dfn
         Marketing  Sales
    age         25     24
    name      Neha  Rohit
    sex     Female   Male

(a) Retrieving Various Properties of a DataFrame Object

To view the value of an attribute, just use the attribute name with the dataframe's name as depicted in the following examples:

    >>> dfn.index
    Index(['age', 'name', 'sex'], dtype = 'object')
    >>> dfn.columns
    Index(['Marketing', 'Sales'], dtype = 'object')
    >>> dfn.axes
    [Index(['age', 'name', 'sex'], dtype = 'object'), Index(['Marketing', 'Sales'], dtype = 'object')]
    >>> dfn.dtypes
    Marketing    object
    Sales        object
    dtype: object
    >>> dfn.size
    6
    >>> dfn.shape
    (3, 2)
    >>> dfn.ndim
    2
    >>> dfn.empty
    False

(b) Getting Number of Rows in a DataFrame

The len(<DF object>) will return the number of rows in a dataframe, e.g.,

    >>> len(dfn)
    3

(c) Getting Count of non-NA Values in DataFrame

Like Series, you can use count() with a dataframe too to get the count of non-NaN or non-NA values, but count() with a dataframe is a little more elaborate:

(i) If you do not pass any argument or pass 0 (the default is 0 only), then it returns the count of non-NA values for each column, e.g.,

    >>> dfn.count()
    Marketing    3
    Sales        3
    dtype: int64
    >>> dfn.count(axis = 'index')
    Marketing    3
    Sales        3
    dtype: int64

You may also pass the argument axis = 'index' to get the same result as above.

(ii) If you pass the argument as 1, then it returns the count of non-NA values for each row, e.g.,

    >>> dfn.count(1)
    age     2
    name    2
    sex     2
    dtype: int64
    >>> dfn.count(axis = 'columns')
    age     2
    name    2
    sex     2
    dtype: int64

You may also pass the argument axis = 'columns' to get the same result as above.
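The attributes and the two count() directions above can be tried together in one short sketch. dfn is rebuilt here from the values shown in the text; the second frame, dfn2, with one missing value, is an assumption added only to show that count() skips missing entries.

```python
import pandas as pd

# dfn rebuilt from the values shown in the text (all stored as strings).
dfn = pd.DataFrame(
    {"Marketing": ["25", "Neha", "Female"], "Sales": ["24", "Rohit", "Male"]},
    index=["age", "name", "sex"],
)

print(dfn.shape)  # (rows, columns)
print(dfn.size)   # total number of elements
print(dfn.ndim)   # number of axes
print(len(dfn))   # number of rows

# count() per column (axis 0 / 'index') and per row (axis 1 / 'columns').
per_column = dfn.count(axis="index")
per_row = dfn.count(axis="columns")
print(per_column)
print(per_row)

# Hypothetical variation: blank out one cell and count again.
dfn2 = dfn.copy()
dfn2.loc["age", "Sales"] = None
print(dfn2.count())  # the Sales column now has one value fewer
```

Counting per column answers "how complete is each field", while counting per row answers "how complete is each record".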
(d) Transposing a DataFrame

You can transpose a dataframe by swapping its indexes and columns by using the attribute T as:

    >>> dfn.T
              age   name     sex
    Marketing  25   Neha  Female
    Sales      24  Rohit    Male

Compare the transposed dfn.T with the original dfn:

    >>> dfn
         Marketing  Sales
    age         25     24
    name      Neha  Rohit
    sex     Female   Male

EXAMPLE  Write a program to create a DataFrame to store the weight, age and names of 3 people. Print the DataFrame and its transpose.

SOLUTION

    import pandas as pd
    # creating the DataFrame
    df = pd.DataFrame({'Weight' : [42, 75, 66], 'Name' : ['Arnav', 'Charles', 'Guru'], 'Age' : [15, 22, 35]})
    print('Original Dataframe')
    print(df)
    print('Transpose:')
    print(df.T)

Output

    Original Dataframe
       Age     Name  Weight
    0   15    Arnav      42
    1   22  Charles      75
    2   35     Guru      66
    Transpose:
                0        1     2
    Age        15       22    35
    Name    Arnav  Charles  Guru
    Weight     42       75    66

(Recent pandas versions keep the order of the dictionary keys, so the columns appear as Weight, Name, Age rather than alphabetically.)

NOTE: You can also use shape[0] to see the number of rows and shape[1] for getting the number of columns, i.e., df.shape[0] and df.shape[1].

(e) Numpy Representation of DataFrame

You can represent the values of a dataframe object in the numpy way using the values attribute:

    >>> dfn.values
    array([['25', '24'], ['Neha', 'Rohit'], ['Female', 'Male']], dtype = object)

1.11 Dataframe vs. Series and 2D Numpy Array

Now that you have a clear idea of how a Dataframe object is created and how it works, let us discuss how it differs from Series objects and 2D NumPy arrays.

1.11.1 Dataframe vs. Series

A Dataframe as such is a 2D data structure and a Series is a 1D data structure. A Series is value-mutable only, while a dataframe is value-mutable as well as size-mutable.

Table 1.7  Dataframe vs. Series

    S.No.  Dataframe Object                                              Series Object
    1.     It is a 2D data structure.                                    It is a 1D data structure.
    2.     It is value-mutable as well as size-mutable.                  It is value-mutable only.
    3.     It can store heterogeneous elements (different datatypes).    It stores homogeneous elements (same datatype).
    4.     Each column has a heading.                                    Its single column does not have a heading.

An individual column of a dataframe can be considered equivalent to a Series object. It is similar to a Series object but with a little difference: it also has size-mutability, unlike Series objects.

1.11.2 Dataframe vs. 2D Ndarray

A Dataframe, like a 2D ndarray, is a two-dimensional data structure. However, it is different from 2D ndarrays. Let's see how.

Table 1.8  Dataframe vs. 2D Ndarrays

    S.No.  Dataframe Object                                              2D Ndarrays
    1.     It is a 2D data structure.                                    It is also a 2D data structure.
    2.     It can store heterogeneous elements (different datatypes).    It is a table of homogeneous elements (usually numbers), all of the same type.
    3.     It can have indexes as well as labels for rows and columns.   It is indexed by a tuple of non-negative integers for both rows and columns.
    4.     It consumes more memory than an equivalent ndarray.           It consumes lesser memory compared to an equivalent dataframe.
    5.     Dataframes are expandable, as you can add new elements.       Numpy arrays are not expandable. If you add a new element, then a new array will be created. Even with the append() function, a new array is created.

1.12 Selecting or Accessing Data

From a DataFrame object, you can extract or select desired rows and columns as per your requirement. Let us see how. For all the examples in this section, in the coming lines, we are using the following DataFrame dtf5:

             Population  Hospitals  Schools
    Delhi      10927986        189     7916
    Mumbai     12691836        208     8508
    Kolkata     4631392        149     7226
    Chennai     4328063        157     7617

1.12.1 Selecting/Accessing a Column

Selecting a column is easy, just use the following syntax:

    <DataFrame object>[<column name>]        (using square brackets)
    or
    <DataFrame object>.<column name>         (using dot notation)

Now, consider the following examples accessing the columns Population and Schools from dataframe dtf5:
    >>> dtf5['Population']
    Delhi      10927986
    Mumbai     12691836
    Kolkata     4631392
    Chennai     4328063
    Name: Population, dtype: int64
    >>> dtf5['Schools']
    Delhi      7916
    Mumbai     8508
    Kolkata    7226
    Chennai    7617
    Name: Schools, dtype: int64

In the dot notation, make sure not to put any quotation marks around the column name. For example,

    >>> dtf5.Population
    Delhi      10927986
    Mumbai     12691836
    Kolkata     4631392
    Chennai     4328063
    Name: Population, dtype: int64

1.12.2 Selecting/Accessing Multiple Columns

To select multiple columns, you can give a list having multiple column names inside the square brackets with the dataframe object, i.e., as follows:

    <DataFrame object>[[<column name>, <column name>, <column name>, ...]]

For example,

    >>> dtf5[['Schools', 'Hospitals']]
             Schools  Hospitals
    Delhi       7916        189
    Mumbai      8508        208
    Kolkata     7226        149
    Chennai     7617        157

Columns appear in the order of the column names given in the list inside the square brackets. Compare the above result with the one below:

    >>> dtf5[['Hospitals', 'Schools']]
             Hospitals  Schools
    Delhi          189     7916
    Mumbai         208     8508
    Kolkata        149     7226
    Chennai        157     7617

EXAMPLE  Given a DataFrame namely aid that stores the aid by NGOs for different states:

            Toys  Books  Uniform  Shoes
    Andhra  7916   6189      610   8810
    Odisha  8508   8208      508   6798
    M.P.    7226   6149      611   9611
    U.P.    7617   6157      457   6457

Write a program to display the aid for (i) Books and Uniform only (ii) Shoes only.

SOLUTION

    import pandas as pd
    # DataFrame aid created or loaded
    print("Aid for books and uniform:")
    print(aid[['Books', 'Uniform']])
    print("Aid for shoes:")
    print(aid.Shoes)

Output

    Aid for books and uniform:
            Books  Uniform
    Andhra   6189      610
    Odisha   8208      508
    M.P.     6149      611
    U.P.     6157      457
    Aid for shoes:
    Andhra    8810
    Odisha    6798
    M.P.      9611
    U.P.      6457
    Name: Shoes, dtype: int64

1.12.3 Selecting/Accessing a Subset from a DataFrame using Row/Column Names

To access row(s) and/or a combination of rows and columns, you can use the following syntax to select/access a subset from a dataframe object:

    <DataFrame object>.loc[<start row> : <end row>, <start column> : <end column>]

The above syntax is a general syntax through which you can access single/multiple rows/columns.
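The column-selection forms and the general loc syntax above can be checked on dtf5 with a short sketch. dtf5 is rebuilt from the values shown in the text; the variable names pop_a, pop_b, two_cols and subset are illustrative only.

```python
import pandas as pd

# dtf5 rebuilt from the table shown in section 1.12.
dtf5 = pd.DataFrame(
    {
        "Population": [10927986, 12691836, 4631392, 4328063],
        "Hospitals": [189, 208, 149, 157],
        "Schools": [7916, 8508, 7226, 7617],
    },
    index=["Delhi", "Mumbai", "Kolkata", "Chennai"],
)

# One column: square brackets and dot notation return the same Series.
pop_a = dtf5["Population"]
pop_b = dtf5.Population

# Several columns: pass a list; the result follows the order of the list.
two_cols = dtf5[["Schools", "Hospitals"]]

# loc slices by label and includes both the start and the end label.
subset = dtf5.loc["Mumbai":"Kolkata", "Population":"Hospitals"]
print(subset)
```

The label slice returns the rows Mumbai and Kolkata with the columns Population and Hospitals, which is the inclusive behaviour that separates loc from ordinary Python slices.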
Let us see some examples.

• To access a row, just give the row name/label as this: <DF object>.loc[<row label>, :]. Make sure not to miss the COLON AFTER COMMA.

    >>> dtf5.loc['Delhi', :]
    Population    10927986
    Hospitals          189
    Schools           7916
    Name: Delhi, dtype: int64
    >>> dtf5.loc['Chennai', :]
    Population    4328063
    Hospitals         157
    Schools          7617
    Name: Chennai, dtype: int64

• To access multiple rows, use: <DF object>.loc[<start row> : <end row>, :]. Make sure not to miss the COLON AFTER COMMA.

    >>> dtf5.loc['Mumbai' : 'Kolkata', :]
             Population  Hospitals  Schools
    Mumbai     12691836        208     8508
    Kolkata     4631392        149     7226

Please note that when you specify <start row> : <end row>, Python will return all rows falling between the start row and the end row, along with the start row and the end row (see below).

    >>> dtf5.loc['Mumbai' : 'Chennai', :]
             Population  Hospitals  Schools
    Mumbai     12691836        208     8508
    Kolkata     4631392        149     7226
    Chennai     4328063        157     7617

• To access selective columns, use: <DF object>.loc[:, <start column> : <end column>]. Make sure not to miss the COLON BEFORE COMMA. Like rows, all columns falling between the start and end columns will also be listed.

    >>> dtf5.loc[:, 'Population' : 'Schools']
             Population  Hospitals  Schools
    Delhi      10927986        189     7916
    Mumbai     12691836        208     8508
    Kolkata     4631392        149     7226
    Chennai     4328063        157     7617
    >>> dtf5.loc[:, 'Population' : 'Hospitals']
             Population  Hospitals
    Delhi      10927986        189
    Mumbai     12691836        208
    Kolkata     4631392        149
    Chennai     4328063        157

• To access a range of columns from a range of rows, use: <DF object>.loc[<start row> : <end row>, <start column> : <end column>].

    >>> dtf5.loc['Delhi' : 'Mumbai', 'Population' : 'Hospitals']
            Population  Hospitals
    Delhi     10927986        189
    Mumbai    12691836        208

EXAMPLE  Given a dataframe namely aid that stores the aid by NGOs for different states:

            Toys  Books  Uniform  Shoes
    Andhra  7916   6189      610   8810
    Odisha  8508   8208      508   6798
    M.P.    7226   6149      611   9611
    U.P.    7617   6157      457   6457

Write a program to display the aid for states 'Andhra' and 'Odisha' for Books and Uniform only.

SOLUTION

    import pandas as pd
    # DataFrame aid created or loaded
    print(aid.loc['Andhra' : 'Odisha', 'Books' : 'Uniform'])

Output

            Books  Uniform
    Andhra   6189      610
    Odisha   8208      508

You may also specify distinct row ids and column names as lists with loc, e.g.,

    aid.loc[['Andhra', 'U.P.'], ['Toys', 'Shoes']]

will list data only from the rows with row ids 'Andhra' and 'U.P.' and from the columns 'Toys' and 'Shoes' only. You will see such a statement as part of the next example.

EXAMPLE  Consider a sample dataframe df of sales records with the default integer row index and the columns Item Type, Sales Channel, Order Date, Order ID, Total Revenue, Total Cost and Total Profit. Write statements to do the following:
(i) Display rows 2 to 4 (both inclusive).
(ii) From rows 2 to 4 (both inclusive), display the columns 'Item Type' and 'Total Profit'.
(iii) From rows 2 to 4 (both inclusive), display the first four columns.
(iv) Display 'Sales Channel' and 'Order ID' of rows 4 and 5 only.

SOLUTION

    (i)   >>> df[2:5]
    (ii)  >>> df.loc[2:4, ['Item Type', 'Total Profit']]
    (iii) >>> df.iloc[2:5, 0:4]
    (iv)  >>> df.loc[[4, 5], ['Sales Channel', 'Order ID']]

To select a subset by numeric row and column positions rather than by labels, use iloc:

    <DF object>.iloc[<start row index> : <end row index>, <start column index> : <end column index>]

Recall that when you use iloc, then <end index> is excluded when given as start : end.

    >>> dtf5.iloc[0:2, 1:2]
            Hospitals
    Delhi         189
    Mumbai        208
    >>> dtf5.iloc[0:2, 1:3]
            Hospitals  Schools
    Delhi         189     7916
    Mumbai        208     8508

NOTE: With loc, both the start label and the end label are included when given as start : end, but with iloc, like slices, the end index/position is excluded.

1.12.5 Selecting/Accessing Individual Value

To select/access an individual data value from a dataframe, you can use any of the following methods:

(i) Either give the name of the row or its
numeric index in square brackets after <DF object>.<column name>, i.e., as this:

    <DF object>.<column name>[<row name or row numeric index>]

Consider the following examples:

    >>> dtf5.Population['Delhi']
    10927986
    >>> dtf5.Population[1]
    12691836

(Recent pandas versions deprecate the use of an integer position with a labelled index in this form; dtf5.Population.iloc[1] is the position-based equivalent.)

(ii) You can use the at or iat attributes with the DF object as shown below:

    Use                                                   To
    <DF object>.at[<row label>, <col label>]              Access a single value for a row/column label pair by labels.
    <DF object>.iat[<row index no.>, <col index no.>]     Access a single value for a row/column pair by integer position.

Consider the examples given below:

    >>> dtf5.at['Chennai', 'Schools']
    7617
    >>> dtf5.iat[3, 2]
    7617

NOTE: The at and iat attributes are used to access single values (scalar values) at a specific location in a dataframe. While at uses row and column labels, iat uses integer positions to access the value.

1.12.6 Selecting Dataframe Rows/Columns based on Boolean Conditions

Sometimes you need to select rows/columns from a dataframe based on a condition, just the way you filtered entries in Series objects in section 1.6.6. When you compare a dataframe with a value, Pandas executes that comparison for each element of the dataframe and gives you True or False accordingly for each element, e.g.,

    >>> dtf1 > 0

returns a dataframe of the same shape in which each element is True or False.

You can apply a condition to individual columns or a range of values too, as shown below:

    >>> dtf5
       People  Amount       Average
    0      20  279000  13950.000000
    1      36  396800  11022.222222
    2      44  563000  12795.454545
    3      34  496800  14611.764706
    4      49  763000  15571.428571
    >>> dtf5['Average'] > 14000
    0    False
    1    False
    2    False
    3     True
    4     True
    Name: Average, dtype: bool

It applied the given condition to each value of the column Average of the dataframe dtf5 and returned True or False for each value in the column.

But applying the condition like this has given you just the results True or False. To extract the subset of the dataframe that satisfies the condition, put the condition inside square brackets next to the dataframe name. Now consider the example below, which applies the condition shown above but this time returns the subset of the dataframe that satisfies it:

    >>> dtf5[dtf5['Average'] > 14000]
       People  Amount       Average
    3      34  496800  14611.764706
    4      49  763000  15571.428571

Internally, Pandas checks the condition for each row and returns True or False. These truth values (True/False) act as an index for the rows, and the rows with a True index are returned, similar to Boolean indexing covered later. The Boolean selection discussed above works similar to Boolean indexing but is different from it. Their difference will become clear to you when you go through the topic Boolean indexing later in the chapter.

NOTE: Giving a condition in square brackets next to the dataframe name yields the values that match the condition.

NOTE: Please remember that we have discussed only simple conditions for extracting a dataframe's subsets here. Advanced conditions are beyond the scope of the book; however, Appendix B can be referred to if you want to read more about them.

In the same way, you can extract subsets from a Series object too, based on a condition. Consider the following examples.

EXAMPLE  From a Series Ser1 of areas of states in sq. km, find out the areas that are more than 50000 sq. km.

SOLUTION

    import pandas as pd
    Ser1 = pd.Series([..., 637632, 25723, 2367, 11789, 345, 256517])
    print(Ser1[Ser1 > 50000])

EXAMPLE  Given a Series object s5. Write a program to store the squares of the Series values in object s6. Display s6's values which are > 25.
SOLUTION

    import pandas as pd
    # Series object s5 created or loaded
    print("Series object s5 :")
    print(s5)
    s6 = s5 ** 2        # s6 created
    print("Values in s6 > 25 :")
    print(s6[s6 > 25])

1.13 Adding/Modifying Rows'/Columns' Values in DataFrames

You can assign or modify data in a dataframe in the same way as you do with other objects. All you need to do is to specify the row name and/or column name along with the dataframe's name. The process of adding and modifying rows'/columns' values is similar, as you will see in the following sub-sections.

1.13.1 Adding/Modifying a Column

Columns in a dataframe can be referred to in multiple ways. Assigning a value to a column:
• will modify it, if the column already exists;
• will add a new column, if it does not exist already.

To change or add a column, use the syntax:

    <DF object>[<column name>] = <new value>
    or
    <DF object>.<column name> = <new value>

(The dot form only modifies a column that already exists; to add a new column, use the square-bracket form.)

If the given column name does not exist in the dataframe, then a new column with this name is added:

    >>> dtf5['Density'] = 1219
    >>> dtf5
             Population  Hospitals  Schools  Density
    Delhi      10927986        189     7916     1219
    Mumbai     12691836        208     8508     1219
    Kolkata     4631392        149     7226     1219
    Chennai     4328063        157     7617     1219

Since there was no column named Density, Python added it. Although the above method adds a column, the catch here is that all the rows of this new column have the same given value.

NOTE: If you assign something to a column using at or loc and the column name does not exist, then a new column is created that has the same value for all its rows.

If you want to add a proper new column that has different values for all its rows, then you can assign the data values for each row of the column in the form of a list, i.e., as shown below. (The dataframe dtf5 shown here already contains the fifth row, Bangalore, which is added in section 1.13.2; hence the list has five values.)

    >>> dtf5['Density'] = [1500, 1219, 1630, 1050, 1100]
    >>> dtf5
               Population  Hospitals  Schools  Density
    Delhi      10927986.0      189.0   7916.0   1500.0
    Mumbai     12691836.0      208.0   8508.0   1219.0
    Kolkata     4631392.0      149.0   7226.0   1630.0
    Chennai     4328063.0      157.0   7617.0   1050.0
    Bangalore   5678097.0     1200.0   1200.0   1100.0

This time Python has assigned a different value to each row of the column. In the same way, you can modify an existing column by assigning a new list of values to it.
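The two ways of filling a column described above, a single value versus a list, can be compared in a short sketch. dtf5 is rebuilt here with its original four rows, so the list has four values; they are the first four values of the list in the text. The flag length_error is a hypothetical name used only to record what happens when the list length does not match the number of rows.

```python
import pandas as pd

# dtf5 rebuilt with the four rows shown in section 1.12.
dtf5 = pd.DataFrame(
    {
        "Population": [10927986, 12691836, 4631392, 4328063],
        "Hospitals": [189, 208, 149, 157],
        "Schools": [7916, 8508, 7226, 7617],
    },
    index=["Delhi", "Mumbai", "Kolkata", "Chennai"],
)

# A single value is repeated for every row of the new column.
dtf5["Density"] = 1219
print(dtf5["Density"].tolist())

# A list gives each row its own value; its length must equal the row count.
dtf5["Density"] = [1500, 1219, 1630, 1050]
print(dtf5["Density"].tolist())

# A list of the wrong length is rejected and the column stays as it was.
try:
    dtf5["Density"] = [1500, 1219]
    length_error = False
except ValueError:
    length_error = True
print(length_error)
```

The second assignment also shows that the same statement modifies the column once it exists.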
That is, for an existing column, it will change the data values and for a non-existing column, it will add a new column.

There are some other ways of adding a column to a dataframe. These are:

    <DF object>.at[:, <column name>] = <values for column>
    Or
    <DF object>.loc[:, <column name>] = <values for column>
    Or
    <DF object> = <DF object>.assign(<column name> = <values for column>)

For example, given below are some statements that will add a new column if the mentioned column name does not exist in a dataframe:

    >>> dtf5.at[:, 'Density'] = [1500, 1219, 1630, 1050, 1100]
    >>> dtf5.loc[:, 'Density'] = [1500, 1219, 1630, 1050, 1100]
    >>> dtf5 = dtf5.assign(Density = [1500, 1219, 1630, 1050, 1100])

Note that at is documented for accessing a single row/column label pair; recent pandas versions may raise an error when it is given a whole row or column, so loc is the safer choice for such assignments.

NOTE: When you assign something to a column of a dataframe, then for an existing column, it will change the data values and for a non-existing column, it will add a new column.

You just need to make sure in the above examples that the sequence which contains the values for the new column has as many values as there are rows in the dataframe, otherwise Python will give an error (ValueError).

1.13.2 Adding/Modifying a Row

Like columns, you can change or add rows to a DataFrame using the at or loc attributes as explained below. To change or add a row, use the syntax:

    <DF object>.at[<row name>, :] = <new value>
    Or
    <DF object>.loc[<row name>, :] = <new value>

Likewise, if there is no row with such a row label, then Python adds a new row with this row label and assigns the given value to all its columns:

    >>> dtf5.at['Bangalore', :] = 1200
    >>> dtf5
               Population  Hospitals  Schools  Density
    Delhi      10927986.0      189.0   7916.0   1219.0
    Mumbai     12691836.0      208.0   8508.0   1219.0
    Kolkata     4631392.0      149.0   7226.0   1219.0
    Chennai     4328063.0      157.0   7617.0   1219.0
    Bangalore      1200.0     1200.0   1200.0   1200.0

As you can see in the above output, if a row label mentioned with the at or loc attributes does not exist in the DataFrame, Python will create a new row for it, and this is how a new row is added to a DataFrame. But there is a catch: if you specify only a single value, then all the values in the newly added row will be the same, as in the above output.

You can add a new row by specifying individual values for each column.
For this, specify all the values of the new row to be added as a sequence such as a list etc.

    >>> dtf5.at['Bangalore', :] = [10002980, 171, 7311, 1200]
    >>> dtf5
               Population  Hospitals  Schools  Density
    Delhi      10927986.0      189.0   7916.0   1219.0
    Mumbai     12691836.0      208.0   8508.0   1219.0
    Kolkata     4631392.0      149.0   7226.0   1219.0
    Chennai     4328063.0      157.0   7617.0   1219.0
    Bangalore  10002980.0      171.0   7311.0   1200.0

While adding a row this way, make sure that the sequence containing the values for different columns has values for all the columns, otherwise Python will raise ValueError (see below). If you try to add a row having 3 values to the above DataFrame dtf5 having 4 columns, Python will give you an error:

    >>> dtf5.loc['Mohali', :] = [452980, 72, 281]
    ValueError: cannot copy sequence with size 3 to array axis with dimension 4

This statement tries to insert a row having 3 values into a dataframe having four columns, and thus the error.

NOTE: You can use the at or loc attributes of a DataFrame to add/modify a row, a column or an individual value.

EXAMPLE  Consider the following dataframe saleDf:

           Target  Sales
    zoneA   56000  58000
    zoneB   70000  68000
    zoneC   75000  78000
    zoneD   60000  61000

Write a program to add a column namely Orders having values 6000, 6700, 6200 and 6000 respectively for the zones A, B, C and D. The program should also add a new row for a new zone zoneE. Add some dummy values in this row.

SOLUTION

    import pandas as pd
    # saleDf created or loaded here
    saleDf['Orders'] = [6000, 6700, 6200, 6000]
    saleDf.loc['zoneE', :] = [50000, 45000, 5000]
    print(saleDf)

Output

            Target    Sales  Orders
    zoneA  56000.0  58000.0  6000.0
    zoneB  70000.0  68000.0  6700.0
    zoneC  75000.0  78000.0  6200.0
    zoneD  60000.0  61000.0  6000.0
    zoneE  50000.0  45000.0  5000.0

1.13.3 Modifying a Single Cell

You may use iat[] to modify values using row and column positions. (Refer to Multiple Choice Question 22 given in the Objective Type Questions at the end of this chapter.)

To change or modify a single data value, use the syntax:
    <DF object>.<column name>[<row name/label>] = <new/modified value>

    >>> dtf5.Population['Bangalore'] = 5678097
    >>> dtf5
               Population  Hospitals  Schools  Density
    Delhi      10927986.0      189.0   7916.0   1219.0
    Mumbai     12691836.0      208.0   8508.0   1219.0
    Kolkata     4631392.0      149.0   7226.0   1219.0
    Chennai     4328063.0      157.0   7617.0   1219.0
    Bangalore   5678097.0     1200.0   1200.0   1200.0

See, this time only this one value got modified. (Recent pandas versions warn about this chained form of assignment and may not update the dataframe; dtf5.at['Bangalore', 'Population'] = 5678097 is the more reliable way to write it.)

When you create a dataframe using an existing dataframe as shown below:

    >>> df2 = pd.DataFrame(df1)

df2 displays the same data as df1, e.g., the row labelled 1 of both holds the values 55, 67 and 58. Now if you change the value of one cell of the dataframe df1 (the original dataframe) as:

    >>> df1.loc[1, 2] = 85

and then check the new dataframe df2, you will find that the value of the given cell in dataframe df2 has also been changed: its row 1 now reads 55, 67, 85.

Why did it happen? Wasn't df2 supposed to be a different dataframe, unaffected by the changes in df1?

The answer to all your questions is in the syntax of the pandas.DataFrame() method. So far you have read the syntax of the pandas.DataFrame() method as:

    <dataFrameObject> = pandas.DataFrame(<a 2D datastructure>, \
                        [columns = <column sequence>], [index = <index sequence>])

But there is an additional optional argument, copy, which is by default set to False, i.e.,

    <dataFrameObject> = pandas.DataFrame(<a 2D datastructure>, \
                        [columns = <column sequence>], [index = <index sequence>], [copy = False])

With copy as False (its default value), the new DataFrame is not created as a separate copy and is thus linked to the original dataframe (only a new label is created, referring to the original dataframe). Hence all changes made to the original dataframe are reflected in it, and this was the reason that df2 also showed the changes above. (In recent pandas versions the default is copy = None, which behaves like copy = False when a DataFrame is passed; when pandas' Copy-on-Write mode is enabled, changes are no longer propagated between the two dataframes.)

In order to create a dataframe as a completely different, unlinked dataframe, you need to set the copy argument as True so that the new dataframe is created as a true copy (deep copy), i.e., as
    >>> df3 = pd.DataFrame(df1, copy = True)

The argument copy = True ensures that the dataframe df3 is created as a true copy, and no changes in the original dataframe are reflected in it:

    >>> df1.loc[1, 2] = 78

After this statement, row 1 of df1 reads 55, 67, 78, while row 1 of df3 still reads 55, 67, 85.

Dataframe df2, created with copy as False, is linked to the original dataframe and thus shows all the changes made to the original dataframe. Dataframe df3, created with copy as True, is not linked to the original dataframe and hence does not reflect any changes of df1.

1.14 Deleting/Renaming Columns/Rows

Python Pandas provides two ways to delete rows and columns: the del statement and the drop() function. Pandas also provides the rename() function to rename rows and columns. Let us now talk about how you can delete or rename columns and rows.

1.14.1 Deleting Rows/Columns in a DataFrame

To delete a column, you use the del statement as this:

    del <DF object>[<column name>]

For example,

    >>> del dtf5['Density']
    >>> dtf5
               Population  Hospitals  Schools
    Delhi      10927986.0      189.0   7916.0
    Mumbai     12691836.0      208.0   8508.0
    Kolkata     4631392.0      149.0   7226.0
    Chennai     4328063.0      157.0   7617.0
    Bangalore   5678097.0     1200.0   1200.0

To delete rows from a dataframe, you can use <DF object>.drop(<index or sequence of indexes>). The argument to drop() should be either an index or a sequence containing indexes, e.g.,

    df.drop(range(2, 13, 2))        # rows with indexes 2, 4, 6, 8, 10, 12
    df.drop([2, 4, 6, 8, 12])       # rows with indexes 2, 4, 6, 8, 12

Note that drop() returns a new dataframe with the rows removed; assign the result back if you want to keep it.

You can also give axis = 1 along with the indexes/labels; then drop() will drop columns, i.e., the following command will drop the mentioned columns from dataframe df:

    df.drop(['Total Cost', 'Order ID'], axis = 1)

EXAMPLE  From the dtf5 used above, create another DataFrame that must not contain the column 'Population' and the row Bangalore.

SOLUTION

    import pandas as pd
    # DataFrame dtf5 created or loaded
    dtf6 = pd.DataFrame(dtf5, copy = True)    # true copy, so that dtf5 stays unchanged
    del dtf6['Population']
    dtf6 = dtf6.drop(['Bangalore'])
    print(dtf6)

Output

             Hospitals  Schools
    Delhi        189.0   7916.0
    Mumbai       208.0   8508.0
    Kolkata      149.0   7226.0
    Chennai      157.0   7617.0

1.14.2 Renaming Rows/Columns

You can use the rename() function
to change the name of any row/column individually, as per the syntax below:

    <DF object>.rename(index = {<names dictionary>}, columns = {<names dictionary>}, inplace = False)

where
• the index argument is for index names (row labels). If you want to rename rows only, specify only the index argument.
• the columns argument is for the column names. If you want to rename columns only, specify only the columns argument.
• for both the index and columns arguments, specify the names-change dictionary containing the original names and the new names in a form like {old name : new name}.
• specify the inplace argument as True if you want to rename the rows/columns in the same dataframe. If you skip this argument, then a new dataframe is created with the new index/column names and the original remains unchanged.

Let us now practically see how rename() works. Consider the dataframe topDf shown below:

           Rollno   Name  Marks
    Sec A     115  Pavni   97.5
    Sec B     236  Rishi   98.0
    Sec C     307  Preet   98.5
    Sec D     422  Paula   98.0

To change the row labels to 'A', 'B', 'C', 'D', you can write:

    >>> topDf.rename(index = {'Sec A' : 'A', 'Sec B' : 'B', 'Sec C' : 'C', 'Sec D' : 'D'})
       Rollno   Name  Marks
    A     115  Pavni   97.5
    B     236  Rishi   98.0
    C     307  Preet   98.5
    D     422  Paula   98.0

The above statement shows the changed indexes, but when you display the dataframe topDf after executing it, you will see the original dataframe (see below), because by default rename() creates a new dataframe with the changed names.

    >>> topDf
           Rollno   Name  Marks
    Sec A     115  Pavni   97.5
    Sec B     236  Rishi   98.0
    Sec C     307  Preet   98.5
    Sec D     422  Paula   98.0

To make the changes in the original dataframe, add the argument inplace = True:

    >>> topDf.rename(index = {'Sec A' : 'A', 'Sec B' : 'B', 'Sec C' : 'C', 'Sec D' : 'D'}, inplace = True)
    >>> topDf
       Rollno   Name  Marks
    A     115  Pavni   97.5
    B     236  Rishi   98.0
    C     307  Preet   98.5
    D     422  Paula   98.0

With inplace as True, rename() has changed the indexes of the original topDf.

You can use rename() to change column names too, e.g., the following statement will change a column name in the topDf dataframe. (Consider the original topDf shown at the beginning of this section.) You can choose to change selective columns' names or all the columns' names. Only the names given in the names-change dictionary will get changed. For example,

    >>> topDf.rename(columns = {'Rollno' : 'RNo'})
           RNo   Name  Marks
    Sec A  115  Pavni   97.5
    Sec B  236  Rishi   98.0
    Sec C  307  Preet   98.5
    Sec D  422  Paula   98.0

You can combine any of these arguments together: the index, columns and inplace arguments. The following examples illustrate the same.

EXAMPLE  Consider the saleDf shown below:
           Target  Sales
    zoneA   56000  58000
    zoneB   70000  68000
    zoneC   75000  78000
    zoneD   60000  61000

Write a program to rename the indexes 'zoneC' and 'zoneD' as 'Central' and 'Dakshin' respectively, and the column names 'Target' and 'Sales' as 'Targeted' and 'Achieved' respectively.

SOLUTION

    import pandas as pd
    # saleDf created or loaded here
    print(saleDf.rename(index = {'zoneC' : 'Central', 'zoneD' : 'Dakshin'}, \
                        columns = {'Target' : 'Targeted', 'Sales' : 'Achieved'}))

Output

             Targeted  Achieved
    zoneA       56000     58000
    zoneB       70000     68000
    Central     75000     78000
    Dakshin     60000     61000

EXAMPLE  The previous program only displayed the changed indexes and column names; saleDf itself was not changed. Make changes in the previous program so that saleDf is changed and has the changed indexes and columns' names.

SOLUTION  The program is the same as the previous one, with the inplace = True argument added to rename():

    import pandas as pd
    # saleDf created or loaded here
    saleDf.rename(index = {'zoneC' : 'Central', 'zoneD' : 'Dakshin'}, \
                  columns = {'Target' : 'Targeted', 'Sales' : 'Achieved'}, inplace = True)
    print(saleDf)

Check Point
1. What is the use of the Python Pandas library?
2. Name the Pandas object that can store a one-dimensional array-like object and can have numeric or labelled indexes.
3. Can you have duplicate indexes in a Series object?
4. What do these attributes of Series signify? (i) size (ii) itemsize (iii) nbytes
5. If S1 is a Series object, then how will len(S1) and S1.count() behave?
6. What are NaNs? How do you store them in a data structure?
7. True/False. Series objects always have indexes 0 to n - 1.
8. What is the use of the del statement?
9. What does the drop() function do?
10. What is the role of the inplace argument in the rename() function?

1.15 Boolean Indexing

Till now you have learnt to create and use DataFrames in a variety of ways. You have also learnt how to change indexes and column names, rename them etc. Let us now talk about an interesting feature of DataFrames: Boolean indexing.

Boolean indexing, as the name suggests, means having Boolean values [(True or False) or (1 or 0) sometimes] as indexes of a dataframe. You might be wondering: Why?
What is the need for having indexes as True or False? Well, your question is genuine and so is the answer.

Actually, in some situations, we may need to divide our data in two subsets, True or False, e.g., your school has decided to launch online classes for you, but only some days of the week are designated for it. So, a dataframe related to this information might look like:

                 Day  No. of Classes
    True      Monday               6
    False    Tuesday               0
    True   Wednesday               3
    False   Thursday               0
    True      Friday               8

The Boolean indexes divide the dataframe in two groups: True rows and False rows. This is a useful division in situations where you want to find out things like: On which days are the online classes held? Or which ones are offline-classes days?

BOOLEAN INDEXING
Boolean indexing means having Boolean values [(True or False) or (1 or 0)] as indexes of a dataframe. Boolean indexes can either be in True/False form or in 1/0 form.

1.15.1 Creating DataFrames with Boolean Indexes

While creating a dataframe with Boolean indexes True and False, make sure that True and False are not enclosed in quotes (i.e., like 'True' or 'False'), else it will give you an error (KeyError) while accessing data with Boolean indexes using .loc, because 'True' and 'False' are string values, not Boolean values. Solved problem 32 depicts the same problem.

Let us first create the above shown dataframe containing online classes' information, through the code given below.

    import pandas as pd
    Days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']
    Classes = [6, 0, 3, 0, 8]
    dc = {'Days' : Days, 'No. of Classes' : Classes}
    # Boolean indexes provided; Boolean values are not strings (i.e., not enclosed in quotes)
    clasDf = pd.DataFrame(dc, index = [True, False, True, False, True])

    >>> clasDf
                Days  No. of Classes
    True      Monday               6
    False    Tuesday               0
    True   Wednesday               3
    False   Thursday               0
    True      Friday               8

You can also provide Boolean indexing to dataframes as 1s and 0s.
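The caution about quoting True and False can be tried out with a short sketch. It rebuilds clasDf as described above and then builds a second, deliberately faulty dataframe whose labels are the strings 'True' and 'False'. The names quotedDf, online and raised_key_error are illustrative only and do not come from the text, and the exact wording of the error message may differ between pandas versions.

```python
import pandas as pd

days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']
classes = [6, 0, 3, 0, 8]
dc = {'Days': days, 'No. of Classes': classes}

# Boolean indexes: True and False written without quotes
clasDf = pd.DataFrame(dc, index=[True, False, True, False, True])
online = clasDf.loc[True]        # all rows labelled True

# Hypothetical faulty version: the labels are strings, not Boolean values
quotedDf = pd.DataFrame(dc, index=['True', 'False', 'True', 'False', 'True'])
try:
    quotedDf.loc[True]           # Boolean label, but the index holds strings
    raised_key_error = False
except KeyError:
    raised_key_error = True

print(online)
print("KeyError with quoted labels:", raised_key_error)
```

In this sketch the first lookup returns the three online-class days, while the second fails because no row label is the Boolean value True.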
Let us create another dataframe (clasDf1), similar to the above shown dataframe (clasDf) but having indexes as 1s and 0s, through the code given below.

    import pandas as pd
    Days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']
    Classes = [6, 0, 3, 0, 8]
    dc = {'Days' : Days, 'No. of Classes' : Classes}
    # indexes provided as 1s and 0s this time
    clasDf1 = pd.DataFrame(dc, index = [1, 0, 1, 0, 1])

    >>> clasDf1
            Days  No. of Classes
    1     Monday               6
    0    Tuesday               0
    1  Wednesday               3
    0   Thursday               0
    1     Friday               8

1.15.2 Accessing Rows from DataFrames with Boolean Indexes

Boolean indexing is very useful for filtering records from dataframes, i.e., for finding or extracting the rows that carry a given index value. You can filter dataframes with Boolean indexes as listed below:

    <DF>.loc[True]     to display all records with True index
    <DF>.loc[False]    to display all records with False index
    <DF>.loc[1]        to display all records with index as 1
    <DF>.loc[0]        to display all records with index as 0

For example,

    >>> clasDf.loc[True]
               Days  No. of Classes
    True     Monday               6
    True  Wednesday               3
    True     Friday               8

    >>> clasDf.loc[False]
               Days  No. of Classes
    False   Tuesday               0
    False  Thursday               0

    >>> clasDf1.loc[0]
           Days  No. of Classes
    0   Tuesday               0
    0  Thursday               0

With this we have come to the end of this chapter. Let us quickly revise what we have learnt so far.

LET US REVISE
- Some common attributes of Series objects are: dtype, shape, nbytes, ndim, size, itemsize, hasnans, empty.
- A slice is created from a Series object using the slicing syntax.
- When you assign something to a column of a dataframe, then for an existing column it will change the data values, and for a non-existing column it will add a new column.
- A column can be deleted using the del command.

Objective Type Questions

Multiple Choice Questions

1. To create an empty Series object, you can use
   (a) pd.Series(empty) (b) pd.Series(np.NaN) (c) pd.Series() (d) all of these
2. To specify datatype int16 for a Series object, you can write
   (a) pd.Series(data = array, dtype = int16) (b) pd.Series(data = array, dtype = numpy.int16) (c) pd.Series(data = array, dtype = pandas.int16) (d) all of the above
3. To get the number of dimensions of a Series object, ____ attribute is displayed.
   (a) index (b) size (c) itemsize (d) ndim
4. To get the size of the datatype of the items in a Series object, you can display ____ attribute.
   (a) index (b) size (c) itemsize (d) ndim
5. To get the number of elements in a Series object, ____ attribute may be used.
   (a) index (b) size (c) itemsize (d) ndim
6. To get the number of bytes of the Series data, ____ attribute is displayed.
   (a) hasnans (b) nbytes (c) ndim (d) dtype
7. To check if the Series object contains NaN values, ____ attribute is displayed.
   (a) hasnans (b) nbytes (c) ndim (d) dtype
8. To display the third element of a Series object S, you will write