Data Handling using Pandas-1

The document provides an overview of Python libraries, specifically focusing on NumPy, Pandas, and Matplotlib, which are essential for data manipulation, analysis, and visualization. It explains the differences between Pandas and NumPy, outlines how to install and import Pandas, and describes its data structures, including Series and DataFrames, along with methods for creating and manipulating them. Additionally, it includes practice exercises related to the use of Pandas for data analysis.

Uploaded by

cr7vk18shourya
Python Pandas-I
Fastrack Revision

> Introduction to Python Libraries: A Python library contains a collection of built-in modules that allow us to perform many actions without writing detailed programs for them.
> NumPy, Pandas and Matplotlib are three well-established Python libraries for scientific and analytical use. These libraries allow us to manipulate, transform and visualise data easily and efficiently.
> NumPy: NumPy stands for 'Numerical Python'. It is a library package that can be used for numerical data analysis and scientific computing.
> Pandas: Pandas stands for 'panel data'. It is a high-level data manipulation tool used for analysing data.
> Matplotlib: The Matplotlib library in Python is used for plotting graphs and visualisation. Using Matplotlib, with just a few lines of code we can generate publication-quality plots, histograms, bar charts, scatter plots, etc.
> Difference between Pandas and NumPy: Following are some of the differences between Pandas and NumPy:
  > A NumPy array requires homogeneous data, while a Pandas DataFrame can have different data types (float, int, string, etc.).
  > Pandas DataFrames (with column names) make it very easy to keep track of data.
  > Pandas is used when data is in tabular format, whereas NumPy is used for numeric array-based data manipulation.
> Installing Pandas: To install Pandas from the command line, we need to type:
  pip install pandas
> Importing Pandas: In order to work with Pandas in Python, we need to import the Pandas library into the Python environment. We can do this either at the shell prompt or in our script file (.py) by writing:
  import pandas as pd
> Pandas Data Structure: A data structure is a particular way of storing and organising data in a computer to suit a specific purpose so that it can be accessed and worked with in appropriate ways. Two commonly used data structures in Pandas are:
  > Series: It is a one-dimensional data structure of Python Pandas.
  > DataFrame: It is a two-dimensional data structure of Python Pandas.
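The contrast between NumPy and Pandas described above can be sketched in a few lines. This is only an illustrative example: the column names and values are made up for the demonstration and do not come from the notes.

```python
import numpy as np
import pandas as pd

# A NumPy array holds a single data type: mixing numbers and text
# converts every element to a string.
arr = np.array([1, 2, "three"])
print(arr.dtype.kind)  # 'U' stands for a Unicode string dtype

# A DataFrame keeps a separate data type for each column
# (illustrative data, not taken from the notes).
df = pd.DataFrame({
    "name": ["Rohit", "Deepak"],
    "marks": [80, 92],
    "ratio": [0.80, 0.92],
})
print(df.dtypes)

# A Series is one-dimensional, a DataFrame is two-dimensional.
s = pd.Series([10, 20, 30])
print(s.ndim, df.ndim)
```

The column labels in df are what make it easy to keep track of the data: each column can be referred to by name, for example df["marks"].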
> Series Data Structure: A series is a one-dimensional array containing a sequence of values of any data type (int, float, list, string, etc.) which by default have numeric data labels starting from zero. The data label associated with a particular value is called its index.
  Example:
  Index | Value        Index | Value
  0     | Rohit        Jan   | 31
  1     | ARKIE        Feb   | 28
  2     | Deepak       Mar   | 31
  3     | Ayush        Apr   | 30
> Creation of Series: A series can be created in many ways using the Pandas library function Series(). Make sure that we have imported the Pandas and NumPy modules with import statements.
> Create Empty Series Object by using just Series() with no Parameter: To create an empty object, i.e., one having no values, we can just use Series() as:
  <Series Object> = pandas.Series()
> Creating Non-empty Series Object: To create a non-empty series, we need to specify arguments for data and indexes as per the following syntax:
  <Series Object> = pd.Series(data, index = idx)
  where idx is a valid NumPy datatype and data is the data part of the series object. It can be one of the following:
> Creation of Series from Scalar Value: A series can be created using scalar values as:
  >>> import pandas as pd
  >>> series1 = pd.Series([10, 20, 30])
  >>> print(series1)
  Output:
  0    10
  1    20
  2    30
  dtype: int64
> Creation of Series from NumPy Arrays: We can create a series from a one-dimensional NumPy array as:
  >>> import numpy as np
  >>> import pandas as pd
  >>> array1 = np.array([11, 22, 33, 44])
  >>> series1 = pd.Series(array1)
  >>> print(series1)
  Output:
  0    11
  1    22
  2    33
  3    44
  dtype: int32
> Creation of Series from Dictionary: We can create a series by specifying indexes and values through a dictionary as:
  >>> dict1 = {'Uttar Pradesh': 'Lucknow', 'Rajasthan': 'Jaipur'}
  >>> print(dict1)
  {'Uttar Pradesh': 'Lucknow', 'Rajasthan': 'Jaipur'}
  >>> series1 = pd.Series(dict1)
  >>> print(series1)
  Output:
  Uttar Pradesh    Lucknow
  Rajasthan        Jaipur
  dtype: object
> Accessing Elements of a Series: There are two common ways of accessing the elements of a series: Indexing and Slicing.
> Indexing: Indexing in series is similar to that for NumPy arrays and is used to access elements in a series. Indexes are of two types: positional index and labelled index. A positional index takes an integer value that corresponds to its position in the series starting from 0, whereas a labelled index takes any user-defined label as index.
  Example:
  >>> seriesNum = pd.Series([11, 22, 33])
  >>> seriesNum[1]
  22
  Here, the value 22 is displayed for the positional index 1.
> Slicing: This is similar to slicing used with NumPy arrays. We can define which part of the series is to be sliced by specifying the start and end parameters [start : end] with the series name. When we use positional indices for slicing, the value at the end index position is excluded, i.e., only (end - start) number of data values of the series are extracted.
  Example:
  >>> seriesCapState = pd.Series(['Dispur', 'Patna', 'Panaji'], index = ['Assam', 'Bihar', 'Goa'])
  >>> seriesCapState[1:2]
  Bihar    Patna
  dtype: object
  Here, only the data value at index 1 is displayed, i.e., the slice excludes the value at index position 2.
> Attributes of Series: We can access certain properties called attributes of a series by using that property with the series name.
  Attribute Name | Purpose
  name           | assigns a name to the series
  index.name     | assigns a name to the index of the series
  values         | prints a list of the values in the series
  size           | prints the number of values in the series object
  empty          | prints True if the series is empty and False otherwise
> Methods of Series:
  Method  | Explanation
  head(n) | Returns the first n members of the series. If the value for n is not passed, then by default n takes 5 and the first five members are displayed.
  count() | Returns the number of non-NaN values in the series.
  tail(n) | Returns the last n members of the series. If the value for n is not passed, then by default n takes 5 and the last five members are displayed.
> Mathematical Operations on Series:
  > Addition: We can use the '+' operator or the add() method of series to perform addition between two series objects.
  > Subtraction: We can use the '-' operator or the sub() method of series to perform subtraction between two series objects.
  > Division: We can use the '/' operator or the div() method of series to perform division between two series objects.
  > Multiplication: We can use the '*' operator or the mul() method of series to perform multiplication between two series objects.
  > Exponential Power: We can use the '**' operator or the pow() method of series to put each element of the passed series as the exponential power of the caller series and return the results.
> DataFrame Data Structure: A DataFrame is a two-dimensional labelled data structure like a table of MySQL. It contains rows and columns and therefore has both a row and a column index. The row index is known as the index and the column index is called the column name.
> Creation of DataFrame: There are a number of ways to create a DataFrame. Some of them are listed in this section.
> Creation of an empty DataFrame: An empty DataFrame can be created as follows:
  >>> import pandas as pd
  >>> dFrameEmt = pd.DataFrame()
  >>> dFrameEmt
  Output:
  Empty DataFrame
  Columns: []
  Index: []
> Creation of DataFrame from NumPy ndarrays: Consider the following three NumPy ndarrays. Let us create a simple DataFrame without any column labels, using a single ndarray:
  >>> import numpy as np
  >>> array1 = np.array([11, 22, 33])
  >>> array2 = np.array([110, 210, 310])
  >>> array3 = np.array([-100, -200, -300, -400])
  >>> dFrame4 = pd.DataFrame(array1)
  >>> dFrame4
  Output:
      0
  0  11
  1  22
  2  33
> Creation of DataFrame from List of Dictionaries: We can create a DataFrame from a list of dictionaries as:
  >>> listDict = [{'a': 11, 'b': 22}, {'a': 5, 'b': 10, 'c': 20}]
  >>> dFrameListDict = pd.DataFrame(listDict)
  >>> dFrameListDict
  Output:
      a   b     c
  0  11  22   NaN
  1   5  10  20.0
> Creation of DataFrame from Dictionary of Lists: DataFrames can also be created from a dictionary of lists.
  >>> dictForest = {'State': ['Kanpur', 'Delhi', 'Udaipur'], 'GArea': [96838, 7583, 44852], 'VDF': [3197, 4.42, 2563]}
  >>> dFrameForest = pd.DataFrame(dictForest)
  >>> dFrameForest
  Output:
       State  GArea      VDF
  0   Kanpur  96838  3197.00
  1    Delhi   7583     4.42
  2  Udaipur  44852  2563.00
> Creation of DataFrame from Dictionary of Series: A dictionary of series can also be used to create a DataFrame.
  >>> ResultSheet = {
  ...     'Rohit': pd.Series([80, 92, 97], index=['English', 'Science', 'Maths']),
  ...     'Ayush': pd.Series([72, 81, 94], index=['English', 'Science', 'Maths']),
  ...     'Priya': pd.Series([84, 86, 78], index=['English', 'Science', 'Maths'])}
  >>> ResultDF = pd.DataFrame(ResultSheet)
  >>> ResultDF
  Output:
           Rohit  Ayush  Priya
  English     80     72     84
  Science     92     81     86
  Maths       97     94     78
> Operations on Rows and Columns in DataFrames:
  > Adding a New Column to a DataFrame: We can easily add a new column to a DataFrame.
  > Adding a New Row to a DataFrame: We can add a new row to a DataFrame using DataFrame.loc[].
  > Deleting Rows or Columns from a DataFrame: We can use the DataFrame.drop() method to delete rows and columns from a DataFrame.
  > Renaming Row Labels of a DataFrame: We can change the labels of rows and columns in a DataFrame using the DataFrame.rename() method.
  > Renaming Column Labels of a DataFrame: To alter the column names of ResultDF, we can again use the rename() method.
> Accessing DataFrames Element through Indexing: Data elements in a DataFrame can be accessed using indexing.
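The row and column operations listed above can be tried out with a short sketch. The marks follow the ResultDF example, while the 'Hindi' row and the replacement labels are hypothetical values chosen only for illustration.

```python
import pandas as pd

result = pd.DataFrame(
    {"Rohit": [80, 92, 97], "Ayush": [72, 81, 94]},
    index=["English", "Science", "Maths"],
)

# Add a new column by assigning a list with one value per row.
result["Priya"] = [84, 86, 78]

# Add a new row by assigning to a row label that does not exist yet.
result.loc["Hindi"] = [90, 85, 88]  # hypothetical marks

# drop() returns a new DataFrame: axis=0 removes rows, axis=1 removes columns.
without_science = result.drop("Science", axis=0)
without_ayush = result.drop("Ayush", axis=1)

# rename() maps old labels to new ones for rows (index) and columns.
renamed = result.rename(index={"Maths": "Mathematics"}, columns={"Rohit": "R"})

print(result)
print(renamed)
```

Note that drop() and rename() leave the original DataFrame unchanged unless their result is assigned back to it.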
There are two ways of indexing DataFrames: Label Based Indexing and Boolean Indexing.
  > Label Based Indexing: There are several methods in Pandas to implement label based indexing. DataFrame.loc[] is an important method that is used for label based indexing with DataFrames.
  > Boolean Indexing: In boolean indexing, we can select the subsets of data based on the actual values in the DataFrame rather than their row/column labels. Thus, we can use conditions on column names to filter data values.
> Accessing DataFrames Element through Slicing: We can use slicing to select a subset of rows and/or columns from a DataFrame. To retrieve a set of rows, slicing can be used with row labels.
> Attributes of DataFrame:
  Attribute Name    | Purpose
  DataFrame.index   | to display row labels
  DataFrame.columns | to display column labels
  DataFrame.dtypes  | to display the data type of each column in the DataFrame
  DataFrame.values  | to display a NumPy ndarray having all the values in the DataFrame, without the axes labels
  DataFrame.shape   | to display a tuple representing the dimensionality of the DataFrame
  DataFrame.size    | to display the number of elements in the DataFrame (rows x columns)
  DataFrame.T       | to transpose the DataFrame, i.e., row indices and column labels of the DataFrame replace each other's position
  DataFrame.head(n) | to display the first n rows in the DataFrame
  DataFrame.tail(n) | to display the last n rows in the DataFrame

Practice Exercise
Multiple Choice Questions

Q1. To create an empty series object, you can use:
    a. pd.Series(empty)  b. pd.Series(np.NaN)  c. pd.Series()  d. All of these
Q2. To specify datatype int16 for a series object, you can write:
    a. pd.Series(data = array, dtype = int16)  b. pd.Series(data = array, dtype = numpy.int16)  c. pd.Series(data = array, dtype = pandas.int16)  d. All of the above
Q3. To get the number of dimensions of a series object, ______ attribute is displayed.
    a. index  b. size  c. itemsize  d. ndim
Q4. To get the size of the datatype of the items in a series
object, you can display the ______ attribute.
    a. index  b. size  c. itemsize  d. ndim
Q5. To get the number of elements in a series object, ______ attribute may be used.
    a. index  b. size  c. itemsize  d. ndim
Q6. To get the number of bytes of the series data, ______ attribute is displayed.
    a. hasnans  b. nbytes  c. ndim  d. dtype
Q7. To check if the series object contains NaN values, ______ attribute is displayed.
    a. hasnans  b. nbytes  c. ndim  d. dtype
Q8. To display the third element of a series object S, you will write:
    a. S[:3]  b. S[2]  c. S[3]  d. S[:2]
Q9. To display the first three elements of a series object S, you may write:
    a. S[:3]  b. S[3]  c. S[3rd]  d. All of these
Q10. To display the last five rows of a series object S, you may write:
    a. head()  b. tail(5)  c. tail()  d. Either b. or c.
Q11. Missing data in a Pandas object is represented through:
    a. Null  b. None  c. Missing  d. NaN
Q12. Given a Pandas series called Sequences, the command which will display the first 4 rows is:
    a. print(Sequences.head(4))  b. print(Sequences.Head(4))  c. print(Sequences.heads(4))  d. print(Sequences.Heads(4))
Q13. If a DataFrame is created using a 2D dictionary, then the indexes/row labels are formed from:
    a. dictionary's values  b. inner dictionary's keys  c. outer dictionary's keys  d. None of these
Q14. If a DataFrame is created using a 2D dictionary, then the column labels are formed from:
    a. dictionary's values  b. inner dictionary's keys  c. outer dictionary's keys  d. None of these
Q15. The axis 0 identifies a DataFrame's:
    a. rows  b. columns  c. values  d. datatype
Q16. The axis 1 identifies a DataFrame's:
    a. rows  b. columns  c. values  d. datatype
Q17. To get the number of elements in a DataFrame, ______ attribute may be used.
    a. size  b. shape  c. values  d. ndim
Q18. To get the NumPy representation of a DataFrame, ______ attribute may be used.
    a. size  b. shape  c. values  d. ndim
Q19. To get a number representing the number of axes in a DataFrame, ______ attribute may be used.
    a. size  b. shape  c. values  d. ndim
Q20.
The name "Pandas" is derived from the term: [CBSE SQP 2021 Term-1]
    a. Panel Data  b. Panel Series  c. Python Document  d. Panel DataFrame
Q21. The command to install Pandas is: [CBSE SQP 2021 Term-1]
    a. install pip pandas  b. install pandas  c. pip pandas  d. pip install pandas
Q22. Python Pandas was developed by: [CBSE SQP 2021 Term-1]
    a. Guido van Rossum  b. Travis Oliphant  c. Wes McKinney  d. Brendan Eich
Q23. Pandas Series is: [CBSE SQP 2021 Term-1]
    a. 2-Dimensional  b. 3-Dimensional  c. 1-Dimensional  d. Multidimensional
Q24. Pandas is a: [CBSE SQP 2021 Term-1]
    a. Package  b. Language  c. Library  d. Software
Q25. We can analyse the data in Pandas with: [CBSE SQP 2021 Term-1]
    a. Series  b. DataFrame  c. Both a. and b.  d. None of these
Q26. Method or function to add a new row in a DataFrame is: [CBSE SQP 2022 Term-1]
    a. .loc()  b. .iloc()  c. join  d. add()
Q27. Which of the following import statements is not correct? [CBSE SQP 2023 Term-1]
    a. import pandas as class12  b. import pandas as 1pd  c. import pandas as pd1  d. import pandas as pd
Q28. While accessing a column from the DataFrame, we can specify the column name. In case the column does not exist, which type of error will it raise? [CBSE SQP 2023 Term-1]
    a. Key Error  b. Syntax Error  c. Name Error  d. Runtime Error
Q29. Function to display the first n rows in the DataFrame: [CBSE SQP 2021 Term-1]
    a. tail(n)  b. head(n)  c. top(n)  d. first(n)
Q30. Pandas DataFrame cannot be created using: [CBSE SQP 2021 Term-1]
    a. Dictionary of tuples  b. Series  c. Dictionary of List  d. List of Dictionaries
Q31. Which function will be used to read data from a CSV file into a Pandas DataFrame? [CBSE SQP 2022 Term-1]
    a. readcsv()  b. to_csv()  c. read_csv()  d. csv_read()
Q32. Which of the following is not an attribute of Pandas DataFrame? [CBSE SQP 2021 Term-1]
    a. length  b. T  c. Size  d. Shape
Q33. What will be the output of the given code?
    import pandas as pd
    s = pd.Series([1, 2, 3, 4, 5], index=['akram', 'brijesh', 'charu', 'deepika', 'era'])
    print(s['charu']) [CBSE SQP 2022 Term-1]
    a. 1  b. 2  c. 3  d. 4
Q34. Assuming the given series, named stud, which command will be used to print 5 as output? [CBSE SQP 2021 Term-1]
    Amit       90
    Ramesh    100
    Mahesh     50
    john       67
    Abdul      89
    Name: Student, dtype: int64
    a. stud.index  b. stud.length  c. stud.values  d. stud.size
Q35. A Social Science teacher wants to use a Pandas series to teach about Indian historical monuments and their states. The series should have the monument names as values and state names as indexes, which are stored in the given lists, as shown in the code. Choose the statement which will create the series: [CBSE SQP 2021 Term-1]
    import pandas as pd
    Monument = ['Qutub Minar', 'Gateway of India', 'Red Fort', 'Taj Mahal']
    State = ['Delhi', 'Maharashtra', 'Delhi', 'Uttar Pradesh']
    a. S=df.Series(Monument,index=State)  b. S=pd.Series(State,Monument)  c. S=pd.Series(Monument,index=State)  d. S=pd.series(Monument,index=State)
Q36. Consider the following series named animal: [CBSE SQP 2021 Term-1]
    L    Lion
    B    Bear
    E    Elephant
    T    Tiger
    W    Wolf
    dtype: object
    Write the output of the command: print(animal…
    a. L Lion; T Tiger; dtype: object
    b. W Wolf; B Bear; dtype: object
    c. B Bear; E Elephant; dtype: object
    d. W Wolf; T Tiger; dtype: object
Q37. What is a correct syntax to return the values of the first row of a Pandas DataFrame? Assume the name of the DataFrame is dfRent. [CBSE SQP 2023 Term-1]
    a. dfRent[0]  b. dfRent.loc[1]  c. dfRent.loc[0]  d. dfRent.iloc[1]
Q38. Difference between loc() and iloc(): [CBSE SQP 2021 Term-1]
    a. Both are label indexed based functions.
    b. Both are integer position-based functions.
    c. loc() is label based function and iloc() integer position based function.
    d. loc() is integer position based function and iloc() index position based function.
Q39. Write the output of the given program: [CBSE SQP 2021 Term-1]
    import pandas as pd
    S1 = pd.Series([5, 6, 7, 8, 10], index=['v', 'w', 'x', 'y', 'z'])
    l = [2, 6, 1, 4, 6]
    S2 = pd.Series(l, index=['z', 'y', 'a', 'w', 'v'])
    print(S1-S2)
    a. a 0; v -1.0; w 2.0; x NaN; y 2.0; z 8.0; dtype: float64
    b. a NaN; v -1.0; w 2.0; x NaN; y 2.0; z 8.0; dtype: float64
    c. v -1.0; w 2.0; y 2.0; z 8.0; dtype: float64
    d. a NaN; v -1.0; w 2.0; x 3.0; y 2.0; z 8.0; dtype: float64
Q40. Which command will be used to delete the 3rd and 5th rows of the DataFrame? Assume the DataFrame name is DF. [CBSE SQP 2021 Term-1]
    a. DF.drop([2,4],axis=0)  b. DF.drop([2,4],axis=1)  c. DF.drop([3,5],axis=1)  d. DF.drop([3,5])
Q41. Write the output of the given command: [CBSE SQP 2021 Term-1]
    import pandas as pd
    s = pd.Series([1, 2, 3, 4, 5, 6], index=['A', 'B', 'C', 'D', 'E', 'F'])
    print(s[s%2==0])
    a. B 0; D 0; F 0; dtype: int64
    b. B 2; D 4; F 6; dtype: int64
    c. A 1; B 2; C 3; D 4; E 5; F 6; dtype: int64
    d. B 1; D 2; F 3; dtype: int64
Q42. Ritika is a new learner of Python Pandas and she is aware of some concepts of Python. She has created some lists, but is unable to create the DataFrame from the same. Help her by identifying the statement which will create the DataFrame.
    Phy = [70, 60, 76, 89]
    Chem = [30, 70, 80, 65]
    a. df=pd.DataFrame({"Name":Name,"Phy":Phy,"Chem":Chem})
    b. d=("Name":Name,"Phy":Phy,"Chem":Chem)  df=pd.DataFrame(d)
    c. df=pd.DataFrame([Name,Phy,Chem],columns=['Name','Phy','Chem','Total'])
    d. df=pd.DataFrame({Name:"Name",Phy:"Phy",Chem:"Chem"})
Q43. Assuming the given structure, which command will give us the given output: [CBSE SQP 2021 Term-1]
    Flight No.: …
    Airline:    Indigo, SpiceJet, Indian Airlines, Lufthansa, AirAsia
    Passenger:  230000, 12000, 240000, 245000, 210000
    Output Required: (3,5)
    a. print(df.shape())  b. print(df.shape)  c. print(df.size)  d. print(df.size())
Q44. Write the output of the given command: df1.loc[:0,'Sal']. Consider the given DataFrame: [CBSE SQP 2021 Term-1]
       EName    Sal     …
    0  Kavita   50000   3000
    1  Sudha    60000   4000
    2  Garima   …       5000
    a. 0 Kavita 50000 3000  b. 50000  c. 3000  d. 60000
Q45.
Consider the following DataFrame named df:
       Name      Age  Marks
    0  Amit      15   90.0
    1  Bhavdeep  16   NaN
    2  Reema     17   87.0
    Write the output of the given command: print(df.Marks/2) [CBSE SQP 2021 Term-1]
    a. 0 45.0; 1 NaN; 2 43.5; Name: Marks, dtype: float64
    b. 0 45.0; 1 NaN; 2 43; Name: Marks, dtype: float64
    c. 0 45; 1 NaN; 2 43.5; Name: Marks, dtype: float64
    d. 0 45.0; 1 0; 2 43.5; Name: Marks, dtype: float64
Q46. Read the statements given below. Identify the right option from the following for attribute and method/function. [CBSE SQP 2021 Term-1]
    Statement 1: Attribute always ends without parenthesis.
    Statement 2: Function/Method cannot work without arguments.
    a. Both statements are correct.
    b. Both statements are incorrect.
    c. Statement 1 is correct, but Statement 2 is incorrect.
    d. Statement 1 is incorrect, but Statement 2 is correct.
Q47. To get the transpose of a DataFrame D1, you can write:
    a. D1.T  b. D1.Transpose  c. D1.Swap  d. All of these
Q48. Which of the following is a two-dimensional labelled data structure of Python? (CBSE 2023)
    a. Relation  b. DataFrame  c. Series  d. Square
Q49. To display the 3rd, 4th and 5th columns from the 6th to 9th rows of a DataFrame DF, you can write:
    a. DF.loc[6:9, 3:5]  b. DF.loc[6:10, 3:6]  c. DF.iloc[6:10, 3:6]  d. DF.iloc[6:9, 3:5]
Q50. To change the 5th column's value at the 3rd row to 35 in DataFrame DF, you can write:
    a. DF[4, 6] = 35  b. DF[3, 5] = 35  c. DF.iat[4, 6] = 35  d. DF.iat[3, 5] = 35
Q51. What will be the output of the following code? (CBSE 2023)
    import pandas as pd
    myser = pd.Series([0, 0, 0])
    print(myser)
    a. 0 0; 0 0; 0 0
    b. 0 1; 0 1; 0 2
    c. 0 0; 1 0; 2 0
    d. 0 0; 1 1; 2 2
Q52. Which of the following commands will show the last 3 rows from a Pandas Series named NP? [CBSE SQP 2023-24]
    a. NP.Tail()  b. NP.tail(3)  c. NP.TAIL(3)  d. All of these
Q53. Which of the following statements is wrong? [CBSE 2021 Term-1]
    a. Can't change the index of the Series.
    b. We can easily convert the list, tuple and dictionary into a Series.
    c. A Series represents a single column in memory.
    d.
We can create empty Series.
Q54. What type of error is returned by the following statement? [CBSE 2021 Term-1]
    import pandas as pa
    pa.Series([1, 2, 3, 4], index = […])
    a. Value error  b. Syntax error  c. Name error  d. Logical error
Q55. Which is an incorrect statement for the Python package NumPy? [CBSE SQP 2021 Term-1]
    a. It is a general-purpose array-processing package.
    b. NumPy arrays are faster and more compact.
    c. It is multidimensional arrays.
    d. It is proprietary software.
Q56. The data of any CSV file can be shown in which of the following software? [CBSE 2021 Term-1]
    a. MS Word  b. Notepad  c. Spreadsheet  d. All of these
Q57. Which Python library is not used for data science? [CBSE 2021 Term-1]
    a. Panda  b. Numpy  c. Matplotlib  d. Tkinter
Q58. Which method is used to delete row(s) from a DataFrame? [CBSE 2022 Term-1]
    a. .drop() method  b. .del() method  c. .remove() method  d. .delete() method
Q59. Consider the following code: [CBSE 2021 Term-1]
    import numpy as np
    import pandas as pd
    L = np.array([10, 20])
    x = pd.Series(______)    # statement 1
    print(x)
    Output of the above code is:
    0    1000
    1    8000
    dtype: int64
    What is the correct statement for the above output in statement 1?
    a. d=L*3  b. data=L**3  c. L*3  d. [10,20]**3
Q60. Which of the following would give the same output as DF/DF1 where DF and DF1 are DataFrames?
    a. DF.div(DF1)  b. DF1.div(DF)  c. Divide(DF,DF1)  d. Div(DF,DF1)
Q61. Which of the following statements is wrong in the context of DataFrame? [CBSE 2022 Term-1]
    a. Two dimensional size is Mutable
    b. Can perform Arithmetic operations on rows and columns
    c. Homogeneous tabular data structure
    d. Create DataFrame from numpy ndarray
Q62. Which attribute is not used with DataFrame? [CBSE 2021 Term-1]
    a. size  b. type  c. empty  d. columns
Q63. When we create a DataFrame from a list of dictionaries, the column labels are formed by the: [CBSE 2022 Term-1]
    a. Union of the keys of the dictionaries
    b. Intersection of the keys of the dictionaries
    c. Union of the values of the dictionaries
    d.
Intersection of the values of the dictionaries
Q64. Identify the correct option to select the first four rows and second to fourth columns from a DataFrame 'Data': [CBSE 2021 Term-1]
    a. display(Data.iloc[1:4, 2:4])  b. display(Data.iloc[1:5, 2:5])  c. print(Data.iloc[0:4, 1:4])  d. print(Data.iloc[1:4, 2:4])
Q65. Which attribute is used with Series to count the total number of NaN values? [CBSE 2022 Term-1]
    a. size  b. len  c. count  d. count total
Q66. Consider the following Series in Python: [CBSE 2022 Term-1]
    data = pd.Series([5, 2, 3, 7], index=['a', 'b', 'c', 'd'])
    Which statement will display all odd values?
    a. print(data%2==0)  b. print(data(data%2!=0))  c. print(data mod 2!=0)  d. print(data[data%2!=0])
Q67. What will be the output of the following code? [CBSE 2021 Term-1]
    import pandas as pd
    import numpy
    s = pd.Series(data=[31, 54, 34, 89, 12, 23], dtype=numpy.int)
    print(s>50)
    a. 0 False; 1 True; 2 False; 3 True; 4 False; 5 False; dtype: bool
    b. 1 54; 3 89; dtype: int64
    c. 1 True; 3 True; dtype: bool
    d. 0 31; 1 54; 2 34; 3 89; 4 12; 5 23; dtype: int64
Q68. Consider the following DataFrame: [CBSE 2021 Term-1]
    import pandas as pd
    s = pd.Series(data=[31, 54, 34, 89, 12, 23])
    df = pd.DataFrame(s)
    Which statement will be used to get the output as 2?
    a. print(df.index)  b. print(df.shape())  c. print(df.ndim)  d. print(df.values)
Q69. Sandhya wants to display the last four rows of the DataFrame df and she has written the following command: df.tail(). But the last 5 rows are being displayed. To rectify this problem, which of the following statements should be written? [CBSE 2021 Term-1]
    a. df.head()  b. df.last(4)  c. df.tail(4)  d. df.rows(4)
Q70. Consider the following series: [CBSE 2021 Term-1]
    ser = pd.Series(['C', 'O', 'M', 'F', 'O', 'R', 'T', 'A', 'B', 'L', 'E'], index=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
    print(ser[4:])
    a b © é. aF ae a F 50 50 so 50 6R eR 6 R BR 77 71 7T aT BA BA aA BA 98 98 | dtyperobject| 9 8 we wot diype:object| 11 ne type: object type: object
Q71. Nowadays, for developing machine learning projects, programmers rely on CSV files rather than databases. Why?
[CBSE 2021 Term-1]
    a. CSV can be used with proprietary software only.
    b. CSV files can be downloaded from open source websites free of cost.
    c. CSV files need not be imported while creating the projects.
    d. CSV is a simple and well formatted mode for data storage.
Q72. DataFrames can be created from: [CBSE 2021 Term-1]
    a. Lists  b. dictionaries  c. series  d. All of these
Q73. Consider the following statements: [CBSE 2021 Term-1]
    Statement 1: .loc() is a label based data selecting method to select a specific row(s) or column(s) which we want to select.
    Statement 2: .iloc() cannot be used with default indices if customised indices are provided.
    a. Statement 1 is True, but Statement 2 is False.
    b. Statement 1 is False, but Statement 2 is True.
    c. Statement 1 and Statement 2 both are False.
    d. Statement 1 and Statement 2 both are True.
Q74. Abhay is a student of class 'XII' and he is aware of some concepts of Python. He has created the DataFrame, but he is getting errors after executing the code. Help him by identifying the correct statement that will create the DataFrame. [CBSE 2021 Term-1]
    Code:
    import pandas as pd
    stuname = ['Muskan', 'Radhika', 'Gopar', 'Pihu']
    term1 = [70, 63, 74, 90]
    term2 = [67, 70, 86, 95]
    a. df=pd.DataFrame({"Name":stuname,"marks1":term1,"marks2":term2})
    b. df=pd.DataFrame([stuname,term1,term2],columns=['stuname','marks1','marks2'])
    c. df=pd.DataFrame({stuname,term1,term2})
    d. df=PD.dataframe([stuname,term1,term2])
Q75. Mr. Raman created a DataFrame from a NumPy array: [CBSE 2022 Term-1]
    arr = np.array([[2, 4, 8], [3, 9, 27], [4, 16, 64]])
    df = pd.DataFrame(arr, index=['one', 'two', 'three'], ______)
    print(df)
    Help him to add customised column labels to the above DataFrame.
    a. columns='no','sq','cube'  b. column=['no','sq','cube']  c. columns=['no','sq','cube']  d. columns=[['no','sq','cube']]
Q76. What will be the output of the following program?
    import pandas as pd
    dic = {'Name': ['Sapna', 'Anmol', 'Rishul', 'Sameep'], 'Agg': [56, 67, 75, 76], 'Age': [16, 18, 16, 19]}
    df = pd.DataFrame(dic, columns=['Name', 'Age'])
    print(df) [CBSE SQP 2021 Term-1]
    a. Name Agg Age; 101 Sapna 56 16; 102 Anmol 67 18; 103 Rishul 75 16; 104 Sameep 76 19
    b. Name Agg Age; 0 Sapna 56 16; 1 Anmol 67 18; 2 Rishul 75 16; 3 Sameep 76 19
    c. Name; 0 Sapna; 1 Anmol; 2 Rishul; 3 Sameep
    d. Name Age; 0 Sapna 16; 1 Anmol 18; 2 Rishul 16; 3 Sameep 19
Q77. Consider the following code: [CBSE 2022 Term-1]
    import pandas as pd
    S1 = pd.Series([23, 24, 35, 56], index=['a', 'b', 'c', 'd'])
    S2 = pd.Series([27, 12, 14, 15], index=['b', 'y', 'c', 'ab'])
    df = pd.DataFrame(S1+S2)
    print(df)
    Output for the above code will be:
    a. 0; a NaN; ab NaN; b 51.0; c 49.0; d NaN; y NaN
    b. 0; a 50; b 36; c 49; d 71
    c. 0; b 50; y 36; c 49; ab 71
    d. 0; a NaN; ab NaN; b NaN; c NaN; d NaN; y NaN
Q78. Sudhanshu has written the following code to create a DataFrame with a boolean index: [CBSE 2021 Term-1]
    import numpy as np
    import pandas as pd
    df = pd.DataFrame(data=[5, 6, 7], index=[true, false, true])
    print(df)
    While executing the code, she is getting an error. Help her to rectify the code:
    a. df=pd.DataFrame([True,False,True],data=[5,6,7])
    b. df=pd.DataFrame(data=[5,6,7],index=[True,False,True])
    c. df=pd.DataFrame([true,false,true],data=[5,6,7])
    d. df=pd.DataFrame(index=[true,false,true],data=[5,6,7])
Q79. Sushila has created a DataFrame with the help of the following code:
    import pandas
    EMP = {'EMPID': ['E01', 'E02', 'E03', 'E04', 'E05'],
           'EMPNAME': ['KISHORI', 'PRIYA', 'DAMODAR', 'REEMA', 'MANOJ'],
           'EMP_SALARY': [67000, 34000, 68000, 90000, 45000],
