Python Pandas I - Boolean Indexing

Uploaded by

Yash Chaurasia
Unit 1: Data Handling using Pandas and Data Visualization
Visit to website: https://www.learnpython4cbse.com
Chapter-4 Data Handling Using Pandas-I

To access a DataFrame through a Boolean index, we first create a DataFrame whose index labels are Boolean values, that is, True or False.

For Example:

    # Created by: Amjad Khan (06.06.2020)
    # Create DataFrame with Boolean Index
    import pandas as pd

    srec = {'sid': [101, 102, 103, 104, 105, 106, 107, 108, 109, 110],
            'sname': ['Amit', 'Sumit', 'Aman', 'Rama', 'Neeta',
                      'Amjad', 'Ram', 'Ilma', 'Raja', 'Pawan'],
            'smarks': [98, 67, 85, 56, 38, 98, 67, 28, 56, 81],
            'sgrade': ['A1', 'B2', 'A1', 'C1', 'D', 'A1', 'B2', 'E', 'B2', 'A2'],
            'remark': ['P', 'P', 'P', 'E', 'P', 'P', 'P', 'E', 'P', 'P']}

    # Convert the dictionary into DataFrame
    df = pd.DataFrame(srec)

    # Without Boolean Index display
    print("\n-Without Boolean Index display-\n")
    print(df)

    # With Boolean Index display
    df = pd.DataFrame(srec, index=[True, False, True, False, True,
                                   False, True, False, False, True])
    print("\n-With Boolean Index display-\n")
    print(df)

Output:

    -With Boolean Index display-

           sid  sname  smarks sgrade remark
    True   101   Amit      98     A1      P
    False  102  Sumit      67     B2      P
    True   103   Aman      85     A1      P
    False  104   Rama      56     C1      E
    True   105  Neeta      38      D      P
    False  106  Amjad      98     A1      P
    True   107    Ram      67     B2      P
    False  108   Ilma      28      E      E
    False  109   Raja      56     B2      P
    True   110  Pawan      81     A2      P

This example shows how to access the DataFrame with a Boolean index by using .loc[ ]. The .loc[ ] indexer selects rows by index label, so we simply pass a Boolean label (True or False) to it. Here True is passed, so only the rows labelled True are displayed.

    # .loc[] selects rows by index label,
    # so we pass the label True instead of a row number.
    # accessing using .loc[True]
    print("\n-accessing using .loc[True]-\n")
    print(df.loc[True])

Output:

    -accessing using .loc[True]-

          sid  sname  smarks sgrade remark
    True  101   Amit      98     A1      P
    True  103   Aman      85     A1      P
    True  105  Neeta      38      D      P
    True  107    Ram      67     B2      P
    True  110  Pawan      81     A2      P

The .iloc[ ] indexer selects rows by integer position, so a Boolean label cannot be passed to it; we pass row positions instead. Here two positions (1, 4) are passed. You can also pass a single position, for example: print(df.iloc[2])

    # .iloc[] takes only integer positions,
    # so we pass 1, 4 instead of True.
    # accessing using .iloc[[1, 4]]
    print("\n-accessing using .iloc[[1, 4]]-\n")
    print(df.iloc[[1, 4]])

Output:

    -accessing using .iloc[[1, 4]]-

           sid  sname  smarks sgrade remark
    False  102  Sumit      67     B2      P
    True   105  Neeta      38      D      P

In this tutorial, you'll learn how and when to combine your data in Pandas with:

* .join() for combining data on a key column or an index
* .merge() for combining data on common columns or indices
* .concat() for combining DataFrames across rows or columns

To join DataFrames we use the pd.concat() function, which concatenates DataFrames and returns a new DataFrame.

    pd.concat(objs, axis=0, join='outer', ignore_index=False,
              keys=None, levels=None, names=None,
              verify_integrity=False, copy=True)

Older pandas releases also accepted a join_axes argument; it was removed in pandas 1.0.

Important parameters of the concat( ) method:

i.   objs: a sequence of pandas objects to concatenate.
ii.  axis: default is 0, i.e. row-wise concatenation. If axis=1, column-wise concatenation is performed.
iii. keys: assigns keys to create a multi-index. It is useful for marking the source objects in the output.
iv.  ignore_index: if True, the indexes of the source objects are ignored and 0, 1, 2, ..., n-1 are used in the output.
v.   join: optional parameter that defines how to handle the indexes on the other axis. The valid values are 'inner' and 'outer'.
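The parameters listed above can be tried on two very small frames. The following is a minimal sketch, not part of the original handout: the frames `left` and `right` and their values are made up for illustration, and only the `pd.concat` arguments described above are used.

```python
import pandas as pd

# Hypothetical sample frames (illustrative values only)
left = pd.DataFrame({"sroll": ["S0", "S1"], "marks": [78, 90]})
right = pd.DataFrame({"sroll": ["S2", "S3"], "sage": [16, 17]})

# ignore_index=True discards the source indexes and renumbers the rows
stacked = pd.concat([left, right], axis=0, ignore_index=True)
print(stacked)

# join="inner" keeps only the columns that every input frame has
common = pd.concat([left, right], axis=0, join="inner", ignore_index=True)
print(common)

# keys=... adds an outer index level that records the source frame
keyed = pd.concat([left, right], keys=["X", "Y"])
print(keyed.loc["X"])
```

With the default join='outer', columns missing from one frame are filled with NaN; with join='inner' they are dropped, which is why `common` keeps only the shared `sroll` column.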
    # importing pandas module
    import pandas as pd

    # Define a dictionary containing student data
    rec1 = {'sroll': ['S0', 'S1', 'S2', 'S3'],
            'sname': ['Kamal', 'Princ', 'Gagan', 'Amit'],
            'sage': [15, 14, 16, 17],
            'marks': [78, 90, 76, 56]}

    # Define a dictionary containing student data
    rec2 = {'sroll': ['S4', 'S4', 'S6', 'S7'],
            'sname': ['Kimmi', 'Pranav', 'Pankaj', 'Sumit'],
            'sage': [13, 15, 17, 18],
            'marks': [78, 90, 76, 56]}

    dataf1 = pd.DataFrame(rec1)
    dataf2 = pd.DataFrame(rec2)

    print(dataf1, "\n\n", dataf2)

Output:

      sroll  sname  sage  marks
    0    S0  Kamal    15     78
    1    S1  Princ    14     90
    2    S2  Gagan    16     76
    3    S3   Amit    17     56

      sroll   sname  sage  marks
    0    S4   Kimmi    13     78
    1    S4  Pranav    15     90
    2    S6  Pankaj    17     76
    3    S7   Sumit    18     56

Example 1:

    # Code 1: Now we apply the .concat() function in order to
    # concatenate two DataFrames along the rows
    print("\n-Concat dataf1 and dataf2 using .concat() along")
    print("row wise-\n")
    frames = [dataf1, dataf2]
    res1 = pd.concat(frames)
    # OR
    # res1 = pd.concat([dataf1, dataf2])
    print(res1)

Output:

    -Concat dataf1 and dataf2 using .concat() along
    row wise-

      sroll   sname  sage  marks
    0    S0   Kamal    15     78
    1    S1   Princ    14     90
    2    S2   Gagan    16     76
    3    S3    Amit    17     56
    0    S4   Kimmi    13     78
    1    S4  Pranav    15     90
    2    S6  Pankaj    17     76
    3    S7   Sumit    18     56

Example 2: Pandas also lets you label the DataFrames after the concatenation with a key, so that you know which rows came from which DataFrame. You do this by passing the additional argument keys, a list of label names, one per DataFrame. Here the same concatenation is performed with the keys X and Y for the DataFrames dataf1 and dataf2 respectively.
    print("\n-Label the DataFrames, after the concatenation-\n")
    res2 = pd.concat(frames, keys=['X', 'Y'])
    print(res2)

Output:

    -Label the DataFrames, after the concatenation-

        sroll   sname  sage  marks
    X 0    S0   Kamal    15     78
      1    S1   Princ    14     90
      2    S2   Gagan    16     76
      3    S3    Amit    17     56
    Y 0    S4   Kimmi    13     78
      1    S4  Pranav    15     90
      2    S6  Pankaj    17     76
      3    S7   Sumit    18     56

Example 3:

    # Code 3: concatenate along the columns
    print("\n-concat dataf1 and dataf2 using .concat() along column wise-\n")
    res2 = pd.concat([dataf1, dataf2], axis=1)
    print(res2)

Output:

    -concat dataf1 and dataf2 using .concat() along column wise-

      sroll  sname  sage  marks sroll   sname  sage  marks
    0    S0  Kamal    15     78    S4   Kimmi    13     78
    1    S1  Princ    14     90    S4  Pranav    15     90
    2    S2  Gagan    16     76    S6  Pankaj    17     76
    3    S3   Amit    17     56    S7   Sumit    18     56

The left four columns hold the data of DataFrame dataf1 after concatenation; the right four columns hold the data of DataFrame dataf2.

A merge combines all the columns from the two tables or DataFrames, matching rows on their common columns.

[Figure: Venn diagrams of the four join types: left join, right join, inner join, outer join]

The pandas merge() function is used to merge two DataFrame objects with a database-style join operation. The joining is performed on columns or indexes.

Merging DataFrames using the how argument:

* The how argument of merge() specifies which keys are to be included in the resulting table.
* If a key combination does not appear in either the left or the right table, the corresponding values in the joined table will be NA.
* Here is a summary of the how options and their SQL equivalent names:

    how      SQL equivalent      Description
    left     LEFT OUTER JOIN     Use keys from left frame only
    right    RIGHT OUTER JOIN    Use keys from right frame only
    outer    FULL OUTER JOIN     Use union of keys from both frames
    inner    INNER JOIN          Use intersection of keys from both frames
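The four how options can be compared side by side on one small pair of frames. This is a rough sketch under stated assumptions: the frames, the column name `key` and the values are hypothetical, and `indicator=True` (a standard `pd.merge` argument not used in the handout) is added only to show which table each row came from.

```python
import pandas as pd

# Hypothetical frames sharing the keys S0 and S2 only
left = pd.DataFrame({"key": ["S0", "S1", "S2"],
                     "sname": ["Kamal", "Princ", "Gagan"]})
right = pd.DataFrame({"key": ["S0", "S2", "S3"],
                      "sclass": ["X", "XI", "XII"]})

counts = {}
for how in ["left", "right", "inner", "outer"]:
    # indicator=True adds a _merge column: left_only, right_only or both
    merged = pd.merge(left, right, how=how, on="key", indicator=True)
    counts[how] = len(merged)
    print(f"\nhow={how!r}: {len(merged)} rows")
    print(merged)
```

Rows marked left_only or right_only in the `_merge` column are exactly the ones whose missing side is filled with NaN.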
Example 1: Simple merging of DataFrames with one unique key combination. Suppose we have two DataFrames in which SRoll is a unique key:

    dataf1: left                  dataf2: right
      SRoll  sname  sage            SRoll     Sadd sclass
    0    S0  Kamal    15          0    S0      Tkd      X
    1    S1  Princ    14          1    S1  Khanpur     IX
    2    S2  Gagan    16          2    S2  Kalkaji     XI
    3    S3   Amit    17          3    S3    Gk-II    XII

    # using the .merge() function
    # we are using .merge() with one unique key combination
    print("\n-we are using .merge() with one unique")
    print("key combination-\n")
    datares = pd.merge(dataf1, dataf2, on='SRoll')
    print(datares)

Output:

    -we are using .merge() with one unique
    key combination-

      SRoll  sname  sage     Sadd sclass
    0    S0  Kamal    15      Tkd      X
    1    S1  Princ    14  Khanpur     IX
    2    S2  Gagan    16  Kalkaji     XI
    3    S3   Amit    17    Gk-II    XII

Example 2: Merging DataFrames using multiple join keys (SRoll1 and SRoll2). The same two input DataFrames are used in Examples 3 to 6.

    dataf1: left                          dataf2: right
      SRoll1 SRoll2  sname  sage            SRoll1 SRoll2     Sadd sclass
    0     S0     S0  Kamal    15          0     S0     S0      Tkd      X
    1     S1     S1  Princ    14          1     S1     S0  Khanpur     IX
    2     S2     S0  Gagan    16          2     S2     S0  Kalkaji     XI
    3     S3     S1   Amit    17          3     S3     S0    Gk-II    XII

    # using the .merge() function
    # we are using .merge() with multiple key combination
    print("\n-we are using .merge() with multiple key")
    print("combination-\n")
    datares = pd.merge(dataf1, dataf2, on=['SRoll1', 'SRoll2'])
    print(datares)

Output:

    -we are using .merge() with multiple key
    combination-

      SRoll1 SRoll2  sname  sage     Sadd sclass
    0     S0     S0  Kamal    15      Tkd      X
    1     S2     S0  Gagan    16  Kalkaji     XI

Example 3: Now we set how='left' in order to use keys from the left frame only.

    # Code 3: using keys from the left frame
    print("\n-using keys from left frame-\n")
    res = pd.merge(dataf1, dataf2, how='left', on=['SRoll1', 'SRoll2'])
    print(res)

Output (left join):

    -using keys from left frame-

      SRoll1 SRoll2  sname  sage     Sadd sclass
    0     S0     S0  Kamal    15      Tkd      X
    1     S1     S1  Princ    14      NaN    NaN
    2     S2     S0  Gagan    16  Kalkaji     XI
    3     S3     S1   Amit    17      NaN    NaN

Example 4: Now we set how='right' in order to use keys from the right frame only.

    # Code 4: using keys from the right frame
    print("\n-using keys from right frame-\n")
    res = pd.merge(dataf1, dataf2, how='right', on=['SRoll1', 'SRoll2'])
    print(res)

Output (right join):

    -using keys from right frame-

      SRoll1 SRoll2  sname  sage     Sadd sclass
    0     S0     S0  Kamal  15.0      Tkd      X
    1     S2     S0  Gagan  16.0  Kalkaji     XI
    2     S1     S0    NaN   NaN  Khanpur     IX
    3     S3     S0    NaN   NaN    Gk-II    XII

The sage column is shown as a float because it now contains NaN. The order of the rows can differ between pandas versions.

Example 5: Now we set how='outer' in order to get the union of keys from both DataFrames.

    # Code 5: using the union of keys from both frames
    print("\n# getting union of keys\n")
    res = pd.merge(dataf1, dataf2, how='outer', on=['SRoll1', 'SRoll2'])
    print(res)

Output (outer join):

    # getting union of keys

      SRoll1 SRoll2  sname  sage     Sadd sclass
    0     S0     S0  Kamal  15.0      Tkd      X
    1     S1     S1  Princ  14.0      NaN    NaN
    2     S2     S0  Gagan  16.0  Kalkaji     XI
    3     S3     S1   Amit  17.0      NaN    NaN
    4     S1     S0    NaN   NaN  Khanpur     IX
    5     S3     S0    NaN   NaN    Gk-II    XII

As with the right join, the order of the rows can differ between pandas versions.

Example 6: Now we set how='inner' in order to get the intersection of keys from both DataFrames.

    # Code 6: using the intersection of keys from both frames
    print("\n-getting intersection of keys-\n")
    res = pd.merge(dataf1, dataf2, how='inner', on=['SRoll1', 'SRoll2'])
    print(res)

Output (inner join):

    -getting intersection of keys-

      SRoll1 SRoll2  sname  sage     Sadd sclass
    0     S0     S0  Kamal    15      Tkd      X
    1     S2     S0  Gagan    16  Kalkaji     XI

For more reference on join, merge and concat:
https://pandas.pydata.org/pandas-docs/stable/user_guide/merging.html

The CSV (Comma Separated Values) format is one of the most popular file formats used to store and transfer data between different programs. Many database management tools, as well as the popular Excel, offer data import and export in this format.

A CSV file is a plain text file with the .csv extension. A typical file contains comma-separated values, but other separators such as a semicolon or a tab are also allowed. It should be emphasized that only one type of separator can be used in one CSV file. Each line in the file represents one record of data. Optionally, the first line can hold a header that describes the data.
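The points about separators and the optional header line map onto two `pd.read_csv` arguments, `sep` and `header`. The sketch below is illustrative only: the text is read from in-memory strings through `io.StringIO` rather than from real files, and the sample records are assumptions chosen for the demonstration.

```python
import io

import pandas as pd

# Semicolon-separated text with a header line (illustrative data)
semicolon_text = "Name;Phone\nmother;9990004561\nfather;9985672340\n"
contacts = pd.read_csv(io.StringIO(semicolon_text), sep=";")
print(contacts)

# Comma-separated text without a header line: supply the column names
no_header_text = "mother,9990004561\nfather,9985672340\n"
plain = pd.read_csv(io.StringIO(no_header_text),
                    header=None, names=["Name", "Phone"])
print(plain)
```

If `header=None` were omitted for the second string, the first record would be consumed as the column names.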
Let's look at a simple example of a file called details.csv that stores contacts from a phone:

    Name,Phone
    mother,9990004561
    father,9985672340
    wife,9898234580
    mother-in-law,0920486745

In the above file, there are four contacts, each consisting of a name and a phone number. Note that the first line contains a header to help you interpret the data.

To create a CSV file using Notepad and MS Excel

Steps to create a CSV file using Notepad

1. Start Notepad. Create a table with three records, where each record has two fields. For example, type "mother,9990004561" (without quotation marks) on the first line, "father,9985672340" on the second line and "wife,9898234580" on the third line.
2. Open the "File" menu and select "Save As." In the File Name box, type a file name that ends with the .csv extension. For example, type "contacts.csv."
3. Click the "Save as Type" drop-down list and select "All Files." Click "Save." Test the file by opening it in a spreadsheet.

Steps to create a CSV file using MS Excel

1. Start Microsoft Excel and add data to a new spreadsheet. For example, type "32," "19" and "8" in cells "A1," "A2" and "A3," respectively.
2. Click the "File" tab on the ribbon and then choose "Save As." Click the arrow next to "Save as Type" and choose "CSV (Comma Delimited)" from the drop-down list.
3. Change the file name to one you prefer. Select the location to save the file, then click the "Save" button. Click "OK" to save only the active sheet. Click "Yes" to save the file in CSV format.
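The same kind of contacts file can also be produced from Python instead of Notepad or Excel. The following is a minimal sketch, assuming the standard-library `csv` module and a temporary directory (both choices are illustrative and not part of the original steps); it writes the four contacts shown earlier and reads them back with pandas.

```python
import csv
import os
import tempfile

import pandas as pd

rows = [
    ["Name", "Phone"],
    ["mother", "9990004561"],
    ["father", "9985672340"],
    ["wife", "9898234580"],
    ["mother-in-law", "0920486745"],
]

# Write to a temporary folder so the sketch does not touch your own files
path = os.path.join(tempfile.mkdtemp(), "contacts.csv")
with open(path, "w", newline="") as csv_file:
    csv.writer(csv_file).writerows(rows)

# Read phone numbers as text so that a leading zero is not lost
contacts = pd.read_csv(path, dtype={"Phone": str})
print(contacts)
```

Reading the Phone column as a string is a deliberate choice here: parsed as an integer, a number such as 0920486745 would lose its leading zero.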
Suppose we have a 'StudentDetails.csv' file.

[Figure: contents of StudentDetails.csv, a table of student IDs and names (Amit, Rama, Neeta, Amjad, Raja, Pawan, Sudhir)]

Method 1

Using the read_csv() function from the pandas package, you can import tabular data from a CSV file into a pandas DataFrame by passing the file name as a parameter, e.g. pd.read_csv("filename.csv"). Remember that you gave pandas an alias (pd), so you will use pd to call pandas functions. Be sure to update the path so that it points to the CSV file on your own machine.

    # Created By: Amjad Khan
    # Importing/Exporting Data between CSV files and Data Frames.
    import pandas as pd   # importing pandas module
    import csv            # import the module csv

    # Method 1: Importing a CSV file (using the read_csv() method)
    # Import the csv file and make a data frame
    df1 = pd.read_csv(r"D:/Amjad_Pandas_Programs/StudentDetails.csv")
    print(df1.head(5))

Method 2

    # Created By: Amjad Khan
    # Importing/Exporting Data between CSV files and Data Frames.
    import pandas as pd   # importing pandas module
    import csv            # import the module csv

    # Method 2: Importing a CSV file (using csv.reader())
    # open the csv file
    with open(r"D:/Amjad_Pandas_Programs/BoolStudentDetails.csv", newline="") as csv_file:
        # read the csv file
        csv_reader = csv.reader(csv_file, delimiter=',')
        # load the rows into pandas while the file is still open
        df2 = pd.DataFrame(list(csv_reader))

    print(df2.head())

    # iterating over the values of every column
    for col in df2.columns:
        for val in df2[col]:
            print(val)

Exporting/Saving a pandas DataFrame as a CSV

    # Created By: Amjad Khan
    # Importing/Exporting Data between CSV files and Data Frames.
    import pandas as pd   # importing pandas module

    # Method 1: Save the CSV to the working directory.
    # list of name, degree, score
    nme = ["aparna", "pankaj", "sumit", "Geeku"]
    deg = ["XI", "X", "XII", "X"]
    scr = [90, 40, 80, 98]

    # dictionary of lists
    data = {'name': nme, 'degree': deg, 'score': scr}
    df3 = pd.DataFrame(data)

    # saving the dataframe
    df3.to_csv('file1.csv')

    # Created By: Amjad Khan
    # Importing/Exporting Data between CSV files and Data Frames.
    import pandas as pd   # importing pandas module

    # Method 2: Saving the CSV without headers and index.
    # list of name, degree, score
    nme = ["aparna", "pankaj", "sumit", "Geeku"]
    deg = ["XI", "X", "XII", "X"]
    scr = [90, 40, 80, 98]

    # dictionary of lists
    data = {'name': nme, 'degree': deg, 'score': scr}
    df4 = pd.DataFrame(data)

    # saving the dataframe
    df4.to_csv('file2.csv', header=False, index=False)

    # Created By: Amjad Khan
    # Importing/Exporting Data between CSV files and Data Frames.
    import pandas as pd   # importing pandas module

    # Method 3: Save the CSV file to a specified location.
    # list of name, degree, score
    nme = ["aparna", "pankaj", "sumit", "Geeku"]
    deg = ["XI", "X", "XII", "X"]
    scr = [90, 40, 80, 98]

    # dictionary of lists
    data = {'name': nme, 'degree': deg, 'score': scr}
    df5 = pd.DataFrame(data)

    # saving the dataframe
    df5.to_csv(r'C:\Users\Admin\Desktop\file3.csv', index=False)
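To see what the index and header options of to_csv() actually change, it helps to look at the generated text directly. The sketch below is an illustration under assumptions: it uses a shortened two-row version of the data and relies on the fact that to_csv() returns the CSV text as a string when no file path is given, so nothing is written to disk.

```python
import io

import pandas as pd

df = pd.DataFrame({"name": ["aparna", "pankaj"],
                   "degree": ["XI", "X"],
                   "score": [90, 40]})

# With no path, to_csv() returns the CSV text instead of writing a file
with_index = df.to_csv()
without_index = df.to_csv(index=False)

print(with_index.splitlines()[0])      # header row, including the index column
print(without_index.splitlines()[0])   # header row without the index column

# Round trip: read the text back into a DataFrame
restored = pd.read_csv(io.StringIO(without_index))
print(restored)

# A file saved with the index needs index_col=0 when it is read back
restored_idx = pd.read_csv(io.StringIO(with_index), index_col=0)
print(restored_idx)
```

Saving with index=False is usually the simpler choice when the index is just the default 0, 1, 2, ... numbering, because the file can then be read back without any extra arguments.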
