Python 20

Uploaded by

Asir Mansur

Assignment of Data Mining

[Basic python, Numpy, Pandas]

Submitted To-
[Link] Islam
Associate Professor

Submitted By-
Susmita Rani Saha (B180305047)
Tanvir Ahammed Hridoy (b180305020)

Department of Computer Science and Engineering,

Jagannath University
String

A string is a sequence of characters. It can contain letters, numbers, symbols
and even spaces.

 Strings are immutable:

Once we create a string, we cannot change the content of that object. If we
want a different value, we must create a new string.
Example:
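A minimal sketch of string immutability; the variable names and values are illustrative and not taken from the original example.

```python
# Strings are immutable: item assignment raises TypeError.
s = "hello"
try:
    s[0] = "H"
    changed_in_place = True
except TypeError:
    changed_in_place = False

# To get a different value we build a new string from the old one.
t = "H" + s[1:]
print(changed_in_place, s, t)
```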

Lists

Python has a data type known as a list. Lists are similar to arrays in other
languages: a list is an ordered collection that lets us store many values in a
single variable.

 Lists are Mutable:

We can change the element at any index of a list.
Example:

 Indexing & negative indexing:

Lists use zero-based indexing from the front and negative indexing from the
end (-1 is the last element). We can access any element using the index
operator [ ].
Example:

 Zeros array:
We can declare a list (used here like an array) that is filled with zeros.
Example:
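A short sketch of list mutability, indexing and a zero-filled list; the list contents are assumed for illustration.

```python
nums = [10, 20, 30, 40]

nums[1] = 25             # lists are mutable: replace the element at index 1
first = nums[0]          # zero-based index from the front
last = nums[-1]          # negative index counts from the end
second_last = nums[-2]

zeros = [0] * 5          # a list filled with five zeros
print(nums, first, last, second_last, zeros)
```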

 Iterate through lists:
We use a loop to iterate over a list. A loop repeats a block of code until the
specified condition is met, so we can use it to access each element of a list.
Example:

 Iterate through list using range method:

We can access the elements of a list between a chosen start and end point. The
range function is used to define these limits.
Syntax: range(start-inclusive, end-exclusive)
Example : We can apply any operation in list using range
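A possible illustration of both loop styles, assuming a hypothetical list of marks: a plain for loop visits every element, while range limits the operation to chosen indices.

```python
marks = [50, 60, 70, 80, 90]

# Visit every element.
seen = []
for m in marks:
    seen.append(m)

# range(1, 4) yields the indices 1, 2, 3 (the end point 4 is exclusive).
for i in range(1, 4):
    marks[i] += 5

print(seen, marks)
```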

 Basic methods:
There are some basic methods that directly modify a list.

 Append() method:
Appends (adds) an item to the end of the list.
Syntax: list.append(item)
Example:

 Insert() method:
Inserts an item at the specified index.
Syntax: list.insert(index, item)
Example:

 Remove() method:
Removes the first occurrence of an item from the list.
Syntax: list.remove(item)

Example 1:

Example 2: we can remove elements using range.

 Extend() method:
Using this method we can append all items of another list to the list.
Syntax: list1.extend(list2)
Example:
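The four methods above can be sketched together as follows; the list contents are assumed for illustration.

```python
items = [1, 2, 3]

items.append(4)        # add to the end        -> [1, 2, 3, 4]
items.insert(1, 9)     # insert 9 at index 1   -> [1, 9, 2, 3, 4]
items.remove(9)        # remove first 9        -> [1, 2, 3, 4]
items.extend([5, 6])   # append another list   -> [1, 2, 3, 4, 5, 6]

print(items)
```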

 Count() method:
This method returns the number of times an element occurs in the list.
Syntax: list.count(element)
Example:

 Sort() method:
To sort the elements of a list we use the sort method. It sorts the items in
ascending order by default.
Syntax: list.sort()
Example:

 Reverse() method:
The reverse method reverses the order of the items in the list.
Syntax: list.reverse()
Example:

 Copy() method:
It copies the elements of a list and returns the copied list.
Syntax: list1=list.copy()
Example:

 Pop() method:
This method removes and returns an element at the given index.
Syntax: n=list.pop(index)
Example:

 Clear() method:
This method removes all items from the list.
Syntax: list.clear()
Example:

 Len() method:
To measure the length of a list we can use the built-in len() function, which
returns the number of items in the list.
Syntax: n=len(list)
Example:

 Slicing method:
Using slicing, we can get a sub-list of a list by accessing a range of elements.
Omitting both bounds, as in list[:], returns all elements.
Syntax: list[start-inclusive : end-exclusive]
Example:
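A compact sketch of count, sort, reverse, copy, pop, len, slicing and clear on one assumed list; the values are illustrative only.

```python
data = [3, 1, 2, 3]

n_threes = data.count(3)   # how many times 3 occurs
data.sort()                # ascending, in place  -> [1, 2, 3, 3]
data.reverse()             # reverse, in place    -> [3, 3, 2, 1]
backup = data.copy()       # independent copy of the list
popped = data.pop(0)       # remove and return the item at index 0
length = len(data)         # number of items left
part = backup[1:3]         # slice: indices 1 and 2 (end is exclusive)
data.clear()               # remove everything

print(n_threes, backup, popped, length, part, data)
```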

 Split() method:
This string method splits a string and returns the pieces as the elements of a
list.
Example:

 Mathematical methods:
There are some built-in functions for mathematical operations on lists.

 Sum() method:
Using this function we can sum up the numbers in the list.
Syntax: n=sum(list)
Example:

 Max() method:
To find the maximum value in a list we can use the max function.
Syntax: n=max(list)
Example:

 Min() method:
To find the minimum value in a list we can use the min function.
Syntax: n=min(list)
Example:
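A small sketch of split together with the built-in sum, max and min functions; the sentence and numbers are assumptions made for the illustration.

```python
# split() with no argument splits a string on whitespace.
words = "data mining with python".split()

values = [4, 8, 15, 16]
total = sum(values)
largest = max(values)
smallest = min(values)

print(words, total, largest, smallest)
```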

Dictionary

A dictionary associates a simple data value called a key (most often a string)
with a value. Values can be of any Python data type.
Syntax: dic = {key1: value1, ……}

 Create an empty dictionary:

We can create an empty dictionary without assigning any key or value.
Example:

 Create a dictionary:
Create a dictionary named grades which contains name as key and grade as
value of dictionary.
Example:

 Update value for existing key:

We can update or change the value of an already existing key.
Example:

 Add a new entry:
We can add a new entry, that is, a key and value pair, to the dictionary.
Example:

 Delete entry from dictionary:

To remove an entry from a dictionary we use the del keyword followed by the
specific key to delete.
Example:

 Iterate through dictionary:

Using loop we can access all elements of the dictionary.
Example:

 Some methods:

 Values() method:
This method returns all values in the dictionary.
Syntax: dic.values()
Example:

 Keys() method:
This method returns all keys in the dictionary.
Syntax: dic.keys()
Example:
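The dictionary operations described in this section could look like the sketch below. The dictionary is named grades as in the text, but the student names and grades are made up for illustration.

```python
empty = {}                               # an empty dictionary

grades = {"Asha": "A", "Ravi": "B"}      # name -> grade
grades["Ravi"] = "A"                     # update the value of an existing key
grades["Mina"] = "C"                     # add a new key/value entry
del grades["Asha"]                       # delete an entry by key

# Iterate through the dictionary.
pairs = []
for name, grade in grades.items():
    pairs.append((name, grade))

keys = list(grades.keys())
values = list(grades.values())
print(pairs, keys, values)
```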

Some Python libraries:
i. Numpy
ii. Pandas
iii. Scipy
iv. Scikit-learn
We will discuss only NumPy and pandas.

Numpy
NumPy is a popular Python library. It is the fundamental package needed for
scientific computation with Python.
It features:
i. Multidimensional arrays,
ii. Fast numerical computation,
iii. High-level math functions.

 Arrays:
Structured lists of numbers.
Two types:
i. Vectors (single dimensional array)
ii. Matrices (Multidimensional array)

 Create a single-dimensional array:

We can create a single-dimensional array using NumPy. We first import numpy and
then create the array with it.
Example:

 Create a multidimensional array:

Like a single-dimensional array, we can also create a multidimensional array to
store matrix values. We can also declare the data type of its values.
Example:

 Basic properties (dimension, shape, data type):
To find the number of dimensions (1D, 2D) of an array we can use the ndim
attribute.
Syntax: array.ndim
To find the shape (rows, columns) of an array we can use the shape attribute.
Syntax: array.shape
To find the data type of the elements of an array we can use the dtype
attribute.
Syntax: array.dtype
Example:
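A minimal sketch of array creation and the three attributes, assuming small hand-written arrays.

```python
import numpy as np

vec = np.array([1, 2, 3])                             # 1D array (vector)
mat = np.array([[1, 2, 3], [4, 5, 6]], dtype=float)   # 2D array (matrix)

print(vec.ndim, mat.ndim)   # number of dimensions
print(mat.shape)            # (rows, columns)
print(mat.dtype)            # data type of the elements
```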

 Array addition:
We can add two arrays with the + operator to create a new array that holds the
element-wise sum of the two arrays.
Example:

 Array multiplication:
We can multiply two arrays as matrices using the dot method and store the result in an array.
Example:
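A sketch contrasting addition, element-wise multiplication and the matrix product computed by dot; the 2x2 matrices are assumed for illustration.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

total = a + b          # element-wise sum
elementwise = a * b    # element-wise product (not a matrix product)
product = a.dot(b)     # matrix product

print(total, elementwise, product, sep="\n")
```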

 Some methods:

 zeros() method:
Using this function we can create an array whose elements are all zeros.
Syntax: np.zeros((row,column), dtype=data type)
Example:

 ones() method:
Using this function we can create an array (1D, 2D) whose elements are all ones.
Syntax: np.ones((row,column), dtype=data type)
Example:

 arange() method:
This function takes a start value, an end value and a step size and creates an
array from them. The start is inclusive, the end is exclusive and the step size
is 1 by default.
Syntax: np.arange(start-inclusive, end-exclusive, step, dtype)
Example:

 concatenate() method:
This function concatenates two arrays.
Syntax: np.concatenate([array1,array2])
Example:

 astype() method:
This method is used for type casting. It returns a copy of the array with the
new data type.
Syntax: array.astype(data type)
Example:
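The functions above can be sketched together as follows, assuming NumPy is imported as np; the shapes and values are illustrative.

```python
import numpy as np

z = np.zeros((2, 3), dtype=int)       # 2x3 array of zeros
o = np.ones((2, 2), dtype=float)      # 2x2 array of ones
r = np.arange(0, 10, 2)               # 0, 2, 4, 6, 8 (end is exclusive)
joined = np.concatenate([r, np.array([10, 12])])
as_float = r.astype(float)            # new array with a different dtype

print(z, o, r, joined, as_float, sep="\n")
```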

 random.rand() method:
This function generates random values from 0 up to, but not including, 1. That
is, the range is [0,1).
Syntax: np.random.rand(value) ,for single dimension
np.random.rand(row,column) ,for multidimension
Example 1:

Example 2:
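A sketch of both call forms, assuming the function meant here is np.random.rand, which takes the dimensions as separate arguments. The values differ on every run, so only the shapes and the range are checked.

```python
import numpy as np

v = np.random.rand(4)       # 1D array with 4 random values in [0, 1)
m = np.random.rand(2, 3)    # 2x3 array of random values in [0, 1)

print(v)
print(m)
```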

 linspace() method:
This function returns evenly spaced sample numbers. Instead of a step size, as
in the arange method, it takes the number of samples.
This function takes –
Start = starting point, inclusive
Stop = stop point, inclusive when endpoint is True
Num = how many sample numbers to generate
Endpoint = whether the stop point is included. It is True by default.
Retstep = if True, also return the step between samples. It is False by default.
Dtype = data type
Syntax: np.linspace(start,stop,num=n,endpoint=True,retstep=False,dtype=type)
Example:
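A short sketch of the endpoint and retstep parameters, using an assumed interval from 0 to 1 with five samples.

```python
import numpy as np

samples = np.linspace(0, 1, num=5)                    # stop is included
no_end = np.linspace(0, 1, num=5, endpoint=False)     # stop is excluded
vals, step = np.linspace(0, 1, num=5, retstep=True)   # also returns the step

print(samples, no_end, step)
```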

Pandas (Part 1)

 Pandas is a Python library for data manipulation and analysis. It provides data
structures and functions for working with structured data, such as tabular or time
series data.

 The two primary data structures in Pandas are:

i. Series and
ii. DataFrame objects.
A Series is a one-dimensional array-like object that can hold any data type, while
a DataFrame is a two-dimensional table-like object that can hold multiple types
of data.

 Pandas provides a wide range of functions for manipulating and analyzing data,
such as filtering, sorting, grouping, merging, pivoting, and aggregating. It also
has built-in support for handling missing data, time series data, and categorical
data.

 Pandas is widely used in data analysis and scientific computing, and is often used
in conjunction with other Python libraries such as NumPy, Matplotlib, and Scikit-
Learn.

Series:
A Series is like a one-dimensional array in other languages. It can store any
data type and it has an index, which is numeric by default.

 Create Series:

 First we have to import the pandas library, then create a series just like the
example.

 We can store any type of value in a series. We can also assign user-defined
labels to the index and use them to access elements of a Series.

 The index can also be of any data type. In this example a string is used as
the data type of the index.

 Creation of Series from NumPy Arrays:

 NumPy is another library used in Python. We can convert a NumPy
array (1D) to a series just like the example below:

 We can set the index values, but the index size must match the NumPy array
size. If the index is not declared, a numeric index is assigned automatically.

 If the index size does not match the array size, an error is raised just like
in the example.

 Creation of Series from Dictionary:

 A Python dictionary has key: value pairs and can be converted into a series.
The dictionary keys are used as the index and the values are used as the values
of the series, just like the example.
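The ways of creating a Series described above can be sketched as follows. All labels and values are made up for illustration; they are not the data from the original screenshots.

```python
import numpy as np
import pandas as pd

s_default = pd.Series([10, 20, 30])                  # numeric index 0, 1, 2
s_labelled = pd.Series(["Delhi", "Dhaka"],
                       index=["India", "Bangladesh"])  # string labels as index
s_from_array = pd.Series(np.array([1, 2, 3]), index=["a", "b", "c"])
s_from_dict = pd.Series({"x": 1, "y": 2})            # keys become the index

# An index whose length differs from the data raises ValueError.
try:
    pd.Series(np.array([1, 2, 3]), index=["a", "b"])
    mismatch_error = False
except ValueError:
    mismatch_error = True

print(s_labelled["India"], list(s_from_dict.index), mismatch_error)
```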

 Accessing Elements of a Series:

 Indexing: Indexing in Series is similar to that for NumPy arrays, and
is used to access elements in a series. Indexes are of two types: positional
index and labelled index.

 positional index: A positional index takes an integer value that corresponds
to a position in the series, starting from 0, just like the example.

 labelled index: A labelled index takes any user-defined label as the
index, just like the example.

 This is another example of a labelled index.

 We can also access several elements of the series at once using positional
indexes. The values at positions 3 and 2 are shown here.

 We can simply access a value by its position, without using an index label.

 We can access the series by its index values.

 The index of a series can be changed by assigning a new index to the
existing series.

 Slicing:
 There is a difference between slicing and indexing. In indexing we can
access only the value at the given index, but in slicing we can access a range.
For example, seriesCapCntry[0:3] accesses the values at positional indexes 0 to
2, because the end position 3 is exclusive.

 If labelled indexes are used for slicing, then the value at the end index label
is also included in the output just like the example.

 We can get the series in reverse order by using a negative step, just like the
example.
seriesName[starting_index : ending_index : step]

 We can use slicing to modify the series. In the example we use
seriesAlpha[1:6]=99, which means the values at positions 1 to 5 are updated
to 99. Updating the values in a series using positional slicing excludes the
value at the end index position.

 We can also use labelled index slicing to update values. In this case the
end index position is inclusive.
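A sketch of positional versus labelled access on a small assumed Series. It uses .iloc for positional access, which is an explicit alternative to the plain bracket form used in the text; the labels and values are illustrative.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40, 50], index=list("abcde"))

by_position = s.iloc[0]               # positional index
by_label = s["c"]                     # labelled index
pos_slice = s.iloc[0:3].tolist()      # positions 0..2, end is exclusive
label_slice = s["a":"d"].tolist()     # labels a..d, end is inclusive
backwards = s.iloc[::-1].tolist()     # negative step reverses the series

s.iloc[1:3] = 99                      # updates positions 1 and 2 only
s["d":"e"] = 0                        # updates labels d and e (inclusive)
print(s.tolist())
```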

 Attributes of Series:
We can access certain properties, called attributes, of
a series by using that property with the series name.

 We can assign a name to the series and a name to the index of the series,
just like the example.

 We can create an empty series and check whether the series is empty or
not. Series.empty is True if the series is empty, and False
otherwise.

 Methods of Series:

 There are some methods available for a Pandas Series that give
flexibility to the user.
 head(n) -> Returns the first n members of the series. If the value for n is
not passed, then by default n takes 5 and the first five members are
displayed.
 count() -> Returns the number of non-NaN values in the series. NaN values
are not counted.
 tail(n) -> Returns the last n members of the series. If the value for n is
not passed, then by default n takes 5 and the last five members are
displayed.
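A brief sketch of head, tail and count on an assumed Series that contains one NaN, which shows that count skips missing values while size does not.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2, np.nan, 4, 5, 6, 7])

first_five = s.head()      # default n = 5
first_two = s.head(2)
last_two = s.tail(2)
non_missing = s.count()    # NaN is not counted
total = s.size             # all entries, including NaN

print(len(first_five), first_two.tolist(), last_two.tolist(), non_missing, total)
```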

 Mathematical Operations on Series:

 Addition of two Series:
We can add two series as seriesA + seriesB. Values are added by matching index
labels; if an index label is missing from one of the series, the result for
that label is NaN.
If we do not want NaN in the result, we can use
seriesA.add(seriesB, fill_value=0), which substitutes 0 wherever a value is
missing in one of the series.

 Subtraction of two Series:

It works the same way as addition, except that it subtracts the values of the
two series. All other properties are the same as for addition.

 Multiplication of two Series:

It works the same way as addition, except that it multiplies the values of the
two series. All other properties are the same as for addition.

 Division of two Series:

It works the same way as addition, except that it divides the values of the two
series. All other properties are the same as for addition.
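A sketch of index-aligned arithmetic, using the seriesA and seriesB names from the text with made-up values. The sub, mul and div methods accept fill_value in the same way as add.

```python
import pandas as pd

seriesA = pd.Series([1, 2, 3], index=["x", "y", "z"])
seriesB = pd.Series([10, 20], index=["x", "y"])

plain = seriesA + seriesB                      # "z" has no partner -> NaN
filled = seriesA.add(seriesB, fill_value=0)    # missing values treated as 0
difference = seriesA.sub(seriesB, fill_value=0)

print(plain, filled, difference, sep="\n")
```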

DataFrame:
We learned about the pandas Series before, but sometimes we need to work
on multiple columns at a time, i.e., we need to process tabular data. Pandas
stores such tabular data using a DataFrame.
A DataFrame is a two-dimensional labelled data structure like a table
in MySQL. It contains rows and columns, and therefore has both a row and
column index. Each column can hold values of a different data type.

 Creation of DataFrame:

 In the following example we create an empty DataFrame.

 Creation of DataFrame from NumPy n-dimension arrays:

 We can convert a NumPy array into a DataFrame by simply passing the array
to DataFrame [ dFrame4 = pd.DataFrame(array1) ] .

 We can create a DataFrame using more than one n-dimension array just
like the example.

 Creation of DataFrame from List of Dictionaries:
We can create a DataFrame from a list of dictionaries just like the
example.

 Creation of DataFrame from Dictionary of Lists:

 DataFrames can also be created from a dictionary of lists. Dictionary
keys become column labels by default in a DataFrame, and each list
becomes the values of its column.
 We can change the sequence of the column labels as in the example.

 Creation of DataFrame from Series:

 We can combine multiple series into a DataFrame. Here three series,
seriesA, seriesB and seriesC, are converted into dFrame8.

 Creation of DataFrame from Dictionary of Series:

 A dictionary of series can also be used to create a DataFrame. In the
example ResultSheet is a dictionary with 5 students as columns and 3
subjects as the index.

 If an individual dictionary element (series) does not contain a value for an
index label, NaN is put at that position.
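The creation routes above can be sketched as follows. The student and subject names echo those mentioned in the text, but every value is assumed for illustration.

```python
import numpy as np
import pandas as pd

empty_df = pd.DataFrame()                                # empty DataFrame
from_array = pd.DataFrame(np.array([[1, 2], [3, 4]]))    # from a NumPy array

# List of dictionaries: a missing key becomes NaN.
from_dicts = pd.DataFrame([{"a": 1, "b": 2}, {"a": 3}])

# Dictionary of lists: keys become column labels; columns= sets their order.
data = {"name": ["A", "B"], "marks": [90, 80]}
from_lists = pd.DataFrame(data, columns=["marks", "name"])

# Dictionary of Series: rows are aligned on the Series index.
from_series = pd.DataFrame({
    "Arnab": pd.Series([90, 91], index=["Maths", "Science"]),
    "Riya": pd.Series([81], index=["Maths"]),
})
print(from_series)
```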

 Operations on rows and columns in DataFrames

 Adding a New Column to a DataFrame:

We can add a new column to the DataFrame as in the example.

 If we assign values to an existing column name, the values of that column are
modified; no new column is created at the end.

 Adding a New Row to a DataFrame:
We can add a new row to a DataFrame using the DataFrame.loc[ ]
indexer. In the example, we add a new row labelled English.

 We can set all the values of the DataFrame to one value with
ResultDF[: ] = Value. In the example we set all values to 0.

 Deleting Rows or Columns from a DataFrame:
We can use the DataFrame.drop() method to delete rows and columns
from a DataFrame. With axis=0 it deletes the specified row; on the other
hand, with axis=1 it deletes the specified column.
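A sketch of adding a column, adding a row with loc and dropping along each axis. The DataFrame is a small stand-in for the result sheet in the text; the column "Mita" and all marks are assumed.

```python
import pandas as pd

result = pd.DataFrame({"Arnab": [90, 91], "Riya": [81, 71]},
                      index=["Maths", "Science"])

result["Mita"] = [70, 75]               # new column at the end
result["Riya"] = [82, 72]               # existing name: values are replaced
result.loc["English"] = [85, 86, 83]    # new row labelled English

no_science = result.drop("Science", axis=0)   # drop a row
no_mita = result.drop("Mita", axis=1)         # drop a column
print(result)
```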

 Renaming Row Labels of a DataFrame:
We can change the labels of rows and columns in a DataFrame using
the DataFrame.rename() method. In the following example Hindi,
Maths, English, Bangla are renamed to sub1, sub2, sub3, sub4. In the axis
parameter we have to pass the value ‘index’ to rename rows.

 We can choose which row labels to change. Row labels that we do not want to
change are simply left out and stay as they are.

 Renaming Column Labels of a DataFrame:
We can alter the column names in a DataFrame using the
DataFrame.rename() method. In the axis parameter we have to pass the value
‘columns’ to rename columns.
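A minimal sketch of rename on both axes, with an assumed two-subject DataFrame. Labels that are not listed in the mapping stay unchanged, and rename returns a new DataFrame.

```python
import pandas as pd

result = pd.DataFrame({"Arnab": [90, 91], "Riya": [81, 71]},
                      index=["Maths", "Science"])

# Rename only one row label; "Science" is left as it is.
rows_renamed = result.rename({"Maths": "sub1"}, axis="index")

# Rename a column label; "student1" is a made-up replacement name.
cols_renamed = result.rename({"Arnab": "student1"}, axis="columns")

print(rows_renamed, cols_renamed, sep="\n")
```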

 Accessing DataFrames Element through Indexing

 Label Based Indexing:

DataFrame.loc[ ] is an important indexer that is used for label based
indexing with DataFrames. In the example DataFrame.loc[‘Science’] will
show the Science result of all the students.

 When a single column label is passed, it returns the column as a Series.
In the example it will show Riya's result as a Series.

 Boolean Indexing:
Boolean means a binary variable that can be either True or False. In the
following example, if a student's result is greater than 90 it will show
True, otherwise False.

 To check in which subjects ‘Arnab’ has scored more than 90, we can
write:

 Accessing DataFrames Element through Slicing:

 We can use slicing to select a subset of rows and/or columns from a
DataFrame. Label-based DataFrame slicing is inclusive of the end values. In the
example it will take the rows from Maths to Science.

 We may use a slice of labels with a slice of column names to access
values of those rows and columns:

 Filtering Rows in DataFrames:

In DataFrames the DataFrame.loc[] indexer can be used for filtering rows with a
list of Boolean values. True (1) means the row is shown; False (0) means the row
is not shown.
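The indexing styles in this section can be sketched on one assumed result table; the student and subject names follow the text, the marks are invented.

```python
import pandas as pd

result = pd.DataFrame({"Arnab": [90, 91, 97], "Riya": [81, 71, 67]},
                      index=["Maths", "Science", "Hindi"])

science = result.loc["Science"]             # one row, returned as a Series
riya = result.loc[:, "Riya"]                # one column, returned as a Series
over_90 = result > 90                       # Boolean DataFrame
arnab_over_90 = result.loc[:, "Arnab"] > 90 # Boolean Series per subject

rows = result.loc["Maths":"Science"]                    # end label included
block = result.loc["Maths":"Science", "Arnab":"Riya"]   # rows and columns
filtered = result.loc[[True, False, True]]              # keep rows marked True

print(science.tolist(), arnab_over_90.tolist(), list(filtered.index))
```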

 Joining, Merging and Concatenation of DataFrames

 Joining:
We can use the DataFrame.append() method to merge two
DataFrames. It appends rows of the second DataFrame at the end of the
first DataFrame. If a column of the second DataFrame is not present
in the first DataFrame, it is added as a new column. Note that
DataFrame.append() was removed in pandas 2.0; pandas.concat() is its
replacement.

 In the example we merge dFrame1 with dFrame2 and display it.

 In the previous example the column labels are not in sorted order. If we want
the column labels of the joined DataFrame sorted, we can set the parameter
sort=True.

 If we don’t want to sort the column labels of the DataFrame, we can set the
parameter sort=False.

 When we do not want to keep the row index labels we can set ignore_index
=True. By default in the append function ignore_index = False.
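A sketch of the same row-wise join written with pd.concat, which accepts the sort and ignore_index parameters described above and also works on pandas versions where DataFrame.append no longer exists. The frame names follow the text; the column labels and values are assumed.

```python
import pandas as pd

dFrame1 = pd.DataFrame({"C3": [1, 2], "C1": [3, 4]}, index=["R1", "R2"])
dFrame2 = pd.DataFrame({"C2": [5], "C1": [6]}, index=["R3"])

joined = pd.concat([dFrame1, dFrame2], sort=False)       # keep column order
sorted_cols = pd.concat([dFrame1, dFrame2], sort=True)   # sort column labels
renumbered = pd.concat([dFrame1, dFrame2], ignore_index=True)  # index 0, 1, 2

print(joined, sorted_cols, renumbered, sep="\n")
```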

 Attributes of DataFrames:

 If we want to transpose the DataFrame we can use [ DataFrame.T ].
This means the row indices and column labels of the DataFrame swap
positions.

 If we want to display the first n rows we can use [ DataFrame.head(n) ].

In the same way, to display the last n rows we can use
[ DataFrame.tail(n) ].

 [ DataFrame.empty ] returns a Boolean value: True if the DataFrame is empty,
otherwise False.

 DataFrame.size displays the total number of values (rows × columns) in the
DataFrame.
 DataFrame.shape displays the number of rows and number of columns.

 DataFrame.values displays all the values in the DataFrame without the axes
labels.
 DataFrame.dtypes displays the data type of each column in the DataFrame.

 Exporting a DataFrame to a CSV file:

 We can use the to_csv() function to save a DataFrame to a text or csv file.
In the following example we save the ForestAreaDF DataFrame to a csv
file.

 Importing a CSV file to a DataFrame:
 We can load the data from the [Link] file into a DataFrame using the
Pandas read_csv() function, as shown below:
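A round-trip sketch of to_csv and read_csv. The file name, the temporary directory and the contents of the DataFrame are assumptions made for the illustration; only the name ForestAreaDF comes from the text.

```python
import os
import tempfile

import pandas as pd

ForestAreaDF = pd.DataFrame({"State": ["A", "B"], "Area": [10.5, 20.0]})

# Write to a temporary location; index=False leaves out the row labels.
path = os.path.join(tempfile.mkdtemp(), "forest_area.csv")
ForestAreaDF.to_csv(path, index=False)

loaded = pd.read_csv(path)    # read the file back into a DataFrame
print(loaded)
```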

Pandas (Part 2)

In the previous part (Part 1) we discussed the two primary pandas data
structures, Series and DataFrame, and basic operations on them such as creating
and accessing data.
In this part, we will discuss more advanced features of DataFrames, like
sorting data, answering analytical questions using data, cleaning data and
applying different useful functions on the data.

 Create dataframe:
To store the result data in a DataFrame we first create a DataFrame from a
dictionary of lists using pandas.

Example:

 Descriptive Statistics:
Descriptive statistics are used to summarize the given data. We will apply
statistical methods to a DataFrame. These are –
i. Max
ii. Min
iii. Count
iv. Sum
v. Mean
vi. Median
vii. Mode
viii. Quartiles
ix. Variance
x. Standard deviation

 Some parameters for statistical methods:

 Numeric_only:
If we want to find the maximum value only for the columns that hold numeric
values, then we have to set numeric_only=True in these methods.
Syntax: df.max(numeric_only=True)
Example:

 Relational operators:
If we want to calculate the max value based on a specific condition, then we can
use a relational operator to select rows and then apply the method.
Syntax: df2=df[df[‘ut’]==2].max(numeric_only=True)
print(df2)
or,
df2=df[df.ut==2]
df2.max(numeric_only=True)
or,
df[‘Maths’].min()

Example 1: Find the max marks of unit test(ut)=2

Example 2: Find the min marks obtained by Susmita in each subject.


 Axis:
To calculate the maximum value row-wise use axis=1; to calculate it column-wise
use axis=0.
Syntax: df.max(axis=1)
Example:
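The parameters above can be sketched on a small result table. The column names ut and Maths and the name Susmita come from the text; the second student, the Science column and all marks are assumptions.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Susmita", "Susmita", "Tanvir", "Tanvir"],
    "ut": [1, 2, 1, 2],
    "Maths": [22, 21, 18, 20],
    "Science": [21, 20, 19, 24],
})

col_max = df.max(numeric_only=True)                  # numeric columns only
ut2_max = df[df["ut"] == 2].max(numeric_only=True)   # rows where ut == 2
susmita_min = df[df["Name"] == "Susmita"][["Maths", "Science"]].min()
row_max = df[["Maths", "Science"]].max(axis=1)       # maximum of each row

print(col_max, ut2_max, susmita_min, row_max, sep="\n")
```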

 Calculate Maximum values:
If we want to calculate the maximum value of each column then we can simply use
the max function.
Syntax: df.max()
Example:

 Calculate Minimum values:
If we want to calculate the minimum value of each column then we can simply use
the min function.
Syntax: df.min()
Example:

 Calculate sum of values:
We can calculate the sum of each column.
Syntax: df.sum()
We can also use parameters like numeric_only and axis, or a relational operator.
Example: Calculate the sum for a specific entity for each subject only.

 Calculate Number of values:
To calculate the total number of non-NaN values in each column or row, use the
count method. The same parameters can be used.
Syntax: df.count()
Example:

 Calculate mean:
If we want to calculate the mean (average) of each column or row then we use the
mean method. The same parameters can be used.
Syntax: df.mean()
Example:

 Calculate median:
If we want to calculate the middle value of each column or row then we use the
median method. The same parameters can be used.
Syntax: df.median()
Example:

 Calculate mode:
If we want to find the value that appears the most times in the data of each
column or row then we use the mode method. The same parameters can be used.
Syntax: df.mode()
Example:

 Calculate quartile:
If we want to calculate a quartile value of each column or row then we use the
quantile method. The same parameters can be used, and this method has a special
parameter, q. If q=.25 it denotes the first quartile.
If q=.75 it denotes the third quartile.
By default q=.5, which denotes the second quartile, that is, the median value.

Syntax: df.quantile()
Example 1: For a single column

Example 2: For multiple column

 Calculate variance:
It is the average of the squared differences from the mean. If we want to
calculate the variance of each column or row then we use the var method. By
default pandas computes the sample variance, dividing by n-1 (ddof=1). The same
parameters can be used.
Syntax: df.var()
Example :

 Calculate standard deviation:

It is the square root of the variance. If we want to calculate the standard
deviation of each column or row then we use the std method. The same parameters
can be used.
Syntax: df.std()
Example :

 Describe() method:
This method displays the descriptive statistical values with a single command.
Syntax: df.describe()
Example:
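A sketch of the statistical methods from this section on a tiny assumed table, chosen so the results are easy to check by hand.

```python
import pandas as pd

marks = pd.DataFrame({"Maths": [10, 20, 20, 30], "Science": [2, 4, 6, 8]})

totals = marks.sum()
counts = marks.count()
means = marks.mean()
medians = marks.median()
maths_mode = marks["Maths"].mode()   # mode returns all most-frequent values
q1 = marks.quantile(q=0.25)          # first quartile
q2 = marks.quantile()                # default q=0.5, the median
variances = marks.var()              # sample variance (divides by n-1)
std_devs = marks.std()               # square root of the variance
summary = marks.describe()           # count, mean, std, min, quartiles, max

print(summary)
```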

 Data Aggregations:
Aggregation means transforming the dataset to produce a single numeric value
per column (or row). It can be applied to one or more columns together. We can
use one or more statistical methods (max, min, sum, count, std, var, mean, mode,
median) together.
Syntax: df.aggregate(‘function name’)
Example 1: Single function using aggregation

Example 2: Multiple aggregation function in a single statement

Example 3: Multiple aggregation function in a single statement with axis
parameter.
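A sketch of aggregation with one function, with several functions and with the axis parameter. It uses agg, which is the short alias of aggregate; the table is assumed.

```python
import pandas as pd

marks = pd.DataFrame({"Maths": [10, 20, 30], "Science": [2, 4, 6]})

single = marks.agg("max")                # one value per column (a Series)
several = marks.agg(["max", "min"])      # one row per function (a DataFrame)
row_totals = marks.agg("sum", axis=1)    # aggregate across each row

print(single, several, row_totals, sep="\n")
```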

 Sorting a dataframe:
Sorting refers to the arrangement of data elements in a specified order, which
can be either ascending or descending. For sorting a DataFrame we can use the
sort_values method.
Syntax: df.sort_values(by=[‘label’],axis=0,ascending=True) (by default)
Example 1: sort by single attribute/column

Example 2: sort by multiple attributes/columns
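A sketch of sorting by one column and by several columns with sort_values; the names and marks are assumed. sort_values returns a new DataFrame and leaves the original unchanged.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["C", "A", "B", "A"], "Marks": [70, 90, 80, 60]})

# Single column, descending.
by_marks = df.sort_values(by="Marks", ascending=False)

# Name ascending, then Marks descending within the same name.
by_both = df.sort_values(by=["Name", "Marks"], ascending=[True, False])

print(by_marks, by_both, sep="\n")
```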

 Group by function:
The groupby function is used to split the data into groups based on some
criteria. This function works based on a split-apply-combine strategy, which is
shown below as a 3-step process:
Step 1: Split the data into groups by creating a groupby object from the original
DataFrame.
Step 2: Apply the required function(size,sum,mean,get_group…).
Step 3: Combine the results to form a new DataFrame.
Syntax: g1=df.groupby(‘column name’)
Df1=[Link]()
Example 1: display the first entry from each group

Example 2: display the size of each group

Example 3: display data of a single group

Example 4: display all groups data

Example 5: grouping with multiple attributes

Example 6: calculate average of each group

Example 7: calculate average of each group with single attribute

Example 8: calculate statistical data of each group with single attribute and multiple
aggregate functions
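The eight examples listed above can be sketched on one assumed table. The data is invented; the function calls are the standard groupby operations named in Step 2.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Susmita", "Susmita", "Tanvir", "Tanvir"],
    "ut": [1, 2, 1, 2],
    "Maths": [22, 21, 18, 20],
    "Science": [21, 20, 19, 24],
})

g1 = df.groupby("Name")                            # split
first_rows = g1.first()                            # first entry of each group
sizes = g1.size()                                  # rows per group
susmita = g1.get_group("Susmita")                  # data of a single group
group_labels = list(g1.groups.keys())              # all group labels
by_two = df.groupby(["Name", "ut"]).size()         # grouping by two columns
averages = g1.mean(numeric_only=True)              # average of each group
maths_avg = g1["Maths"].mean()                     # average of one column
maths_stats = g1["Maths"].agg(["min", "max", "mean"])  # several functions

print(maths_stats)
```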

 Altering the index:
Depending on our requirements, we can select some other column to be the
index or we can add another index column (especially after slicing).
Syntax: df.reset_index(inplace=True)
Example 1: In slicing, altering the index

Example 2: In slicing, drop the original index after creating new index

Example 3: Select another column as index and then reset the index
Set -

Reset-
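A sketch of the three index examples, using the non-inplace form of reset_index so that each result is a new DataFrame; the data is assumed.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["A", "B", "C", "D"], "Marks": [70, 90, 80, 60]})

high = df[df["Marks"] > 75]              # filtered rows keep old labels 1, 2
with_old = high.reset_index()            # old index is kept as a column
clean = high.reset_index(drop=True)      # old index is dropped

by_name = df.set_index("Name")           # use another column as the index
restored = by_name.reset_index()         # back to the default numeric index

print(with_old, clean, by_name, restored, sep="\n")
```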

 Reshaping data:
The way a dataset is arranged into rows and columns is referred to as the shape
of the data. Reshaping data refers to the process of changing the shape of the
dataset to make it suitable for some analysis problems.
For reshaping data, two basic functions are available in Pandas,
i. pivot and
ii. pivot_table.

 Pivot:
The pivot function is used to reshape and create a new DataFrame from the
original one. In the previous section, we had to slice the data corresponding to
a particular attribute and then apply the statistical method to find descriptive
statistical data. Reshaping instead transforms the structure of the data, which
makes it more readable and easier to analyze.

 Pivoting by single column:

Syntax: pivot1=df.pivot(index='attribute',columns='attribute',values='attribute')
pivot1.loc['index_value'].sum()

Example :

 Pivoting by multiple columns:

Syntax:
pivot1=df.pivot(index='attribute',columns='attribute',values=['attribute1','attribute2',….])
pivot1.loc['index_value'].sum()
Example :
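A sketch of pivoting by one value column and by several, followed by the loc[...].sum() pattern from the syntax lines; the table is assumed.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Susmita", "Susmita", "Tanvir", "Tanvir"],
    "ut": [1, 2, 1, 2],
    "Maths": [22, 21, 18, 20],
    "Science": [21, 20, 19, 24],
})

# One row per Name, one column per unit test, cells hold the Maths marks.
pivot1 = df.pivot(index="Name", columns="ut", values="Maths")
susmita_maths_total = pivot1.loc["Susmita"].sum()

# Several value columns give two-level column labels: (subject, ut).
pivot2 = df.pivot(index="Name", columns="ut", values=["Maths", "Science"])
tanvir_total = pivot2.loc["Tanvir"].sum()

print(pivot1, pivot2, sep="\n")
```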

 Pivot table:
Data with duplicate entries can’t be reshaped using the pivot function. That’s
why we may have to use the pivot_table function instead. It works like the pivot
function, but aggregates the values from rows with duplicate entries for the
specified columns.
The default aggregate function is mean.
Syntax:
pd.pivot_table(data,values=None,index=None,columns=None,aggfunc='mean')
The parameter aggfunc can take values such as sum, max, min, len, np.mean and
np.median wherever we have duplicate entries.
To use np.mean or np.median we have to import numpy as np.
Example:
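A sketch showing pivot failing on duplicate entries and pivot_table aggregating them. The item, colour and price data are invented, and the aggregation functions are passed by name as strings, which is one of the accepted forms of aggfunc.

```python
import pandas as pd

df = pd.DataFrame({
    "Item": ["Pen", "Pen", "Ink"],
    "Color": ["Red", "Red", "Blue"],
    "Price": [10, 20, 30],
})

# ("Pen", "Red") occurs twice, so pivot cannot reshape the data.
try:
    df.pivot(index="Item", columns="Color", values="Price")
    pivot_failed = False
except ValueError:
    pivot_failed = True

# pivot_table aggregates the duplicates instead (mean is the default).
mean_table = pd.pivot_table(df, values="Price", index="Item",
                            columns="Color", aggfunc="mean")
sum_table = pd.pivot_table(df, values="Price", index="Item",
                           columns="Color", aggfunc="sum")

print(mean_table, sum_table, sep="\n")
```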

 Handling missing value:
A DataFrame can consist of many rows (objects), where each row
can have values for various columns (attributes). If a value corresponding to a
column is not present, it is considered to be a missing value. A missing value is
denoted by NaN. Missing values create a lot of problems during data analysis and
have to be handled properly. The two most common strategies for handling missing
values explained in this section are:
i. drop the object having missing values,
ii. fill or estimate the missing value

 Checking missing values:

There are some methods for checking missing values. They are-

 Isnull() method:
Pandas provides the function isnull() to check whether any value is missing in
the DataFrame. This function checks every value of every attribute and returns
True where a value is missing, otherwise False.
We can check each individual attribute also.
Syntax: df.isnull()
Example:

 Isnull().any() method:
To check whether a column (attribute) has a missing value anywhere in the
dataset, the any() function is used. It returns True for a column with a missing
value, else False.
We can check each individual attribute also.
Syntax: df.isnull().any()
Example:

 Isnull().sum() method:
To find the number of NaN values corresponding to each attribute, one can use the
sum() function along with the isnull() function.
Syntax: df.isnull().sum()
Example:

 Isnull().sum().sum() method:
To find the total number of NaN in the whole dataset, one can use this method.
Syntax: df.isnull().sum().sum()
Example:
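The four checks above can be sketched on an assumed table with three missing values.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Maths": [20, np.nan, 18],
    "Science": [np.nan, np.nan, 24],
    "Hindi": [15, 16, 17],
})

mask = df.isnull()                   # True wherever a value is missing
has_missing = df.isnull().any()      # per column: any missing value?
maths_missing = df["Maths"].isnull().any()   # check a single attribute
per_column = df.isnull().sum()       # number of NaN in each column
total = df.isnull().sum().sum()      # number of NaN in the whole table

print(mask, has_missing, per_column, total, sep="\n")
```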

 Dropping missing values:
Missing values can be handled by either dropping the entire row having a missing
value or replacing it with an appropriate value. Dropping will remove the entire
row (object) having the missing value(s). The dropna() function can be used to
drop an entire row from the DataFrame.
Syntax: df.dropna()
Example:

 Estimating missing values:

Missing values can be filled by using estimations or approximations, e.g. a
value just before or after the missing value. In some cases, missing values are
replaced by zeros or ones.

 Fillna(num) method:
The fillna(num) function can be used to replace missing values by the value
specified in num.
i. fillna(0) replaces missing value by 0.
ii. fillna(1) replaces missing value by 1.
Syntax: df.fillna(num)
Example:

 fillna(method=’pad’) method:
This method replaces each missing value by the value before it. Recent pandas
versions deprecate the method argument of fillna; df.ffill() does the same job.
Syntax: df.fillna(method='pad')
Example:

 fillna(method=’bfill’) method:
This method replaces each missing value by the value after it. The equivalent
call in recent pandas versions is df.bfill().
Syntax: df.fillna(method='bfill')
Example:
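A sketch of dropping and filling missing values on an assumed table. For forward and backward filling it uses ffill() and bfill(), which give the same result as fillna(method='pad') and fillna(method='bfill') and are also available where the method argument is no longer accepted.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Maths": [20, np.nan, 18],
    "Science": [np.nan, np.nan, 24],
})

dropped = df.dropna()      # keep only rows without any NaN
zeros = df.fillna(0)       # replace NaN by 0
forward = df.ffill()       # copy the previous value downwards
backward = df.bfill()      # copy the next value upwards

print(dropped, zeros, forward, backward, sep="\n")
```

A leading NaN has no earlier value to copy, so the first two Science values stay NaN after the forward fill.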

END

10 pages
Understanding Python Dictionaries
No ratings yet
Understanding Python Dictionaries
8 pages
Final Class XII IP Study Material 2023-24
No ratings yet
Final Class XII IP Study Material 2023-24
20 pages
Array Notes
No ratings yet
Array Notes
11 pages
Python Papper
No ratings yet
Python Papper
43 pages
Python Programming Basics and Data Structures
No ratings yet
Python Programming Basics and Data Structures
15 pages
Land Law Disposal by Way of Alienation
No ratings yet
Land Law Disposal by Way of Alienation
27 pages
UMLPCXPRESSO55S16
No ratings yet
UMLPCXPRESSO55S16
21 pages
Schedule
No ratings yet
Schedule
1 page
Economics Discussion
No ratings yet
Economics Discussion
4 pages
1 PDF
No ratings yet
1 PDF
11 pages
Form 3251B
No ratings yet
Form 3251B
2 pages
Aarav - Shah IA
No ratings yet
Aarav - Shah IA
20 pages
Your PAYG Instalments Have Changed
No ratings yet
Your PAYG Instalments Have Changed
3 pages
SkyTrak 10054 Telehandler Specs
No ratings yet
SkyTrak 10054 Telehandler Specs
2 pages
Dental Equipment and Materials List
No ratings yet
Dental Equipment and Materials List
4 pages
Ethical Issues in Change Management
No ratings yet
Ethical Issues in Change Management
10 pages
Cableguys FilterShaper XL Manual
No ratings yet
Cableguys FilterShaper XL Manual
27 pages
2 - The Periodic Table of Arduino
No ratings yet
2 - The Periodic Table of Arduino
4 pages
Five Mysteries of Capital Explained
No ratings yet
Five Mysteries of Capital Explained
7 pages
Using The TWI Module As I2C Master
No ratings yet
Using The TWI Module As I2C Master
17 pages
ORFS Installation Ubuntu22.04
No ratings yet
ORFS Installation Ubuntu22.04
5 pages
Fico Training
No ratings yet
Fico Training
14 pages
Siemens LOGO Ethernet Connection Guide
No ratings yet
Siemens LOGO Ethernet Connection Guide
5 pages
GH Trans Sales Training Guide
No ratings yet
GH Trans Sales Training Guide
48 pages
Rinl, Vizag
No ratings yet
Rinl, Vizag
43 pages
Understanding COPAR in Community Health
No ratings yet
Understanding COPAR in Community Health
5 pages
The Politics of Belgium Governing A Divided Society 2nd Edition Kris Deschouwer PDF Version
No ratings yet
The Politics of Belgium Governing A Divided Society 2nd Edition Kris Deschouwer PDF Version
54 pages
Linux Driver
0% (1)
Linux Driver
13 pages
Offer Letter
No ratings yet
Offer Letter
2 pages
Mechanical Drawings
No ratings yet
Mechanical Drawings
10 pages
Room Appliance
No ratings yet
Room Appliance
2 pages
Gecko Platform Release Notes 4.2.1.0
No ratings yet
Gecko Platform Release Notes 4.2.1.0
24 pages
Growth Mindset
No ratings yet
Growth Mindset
38 pages
Turnover Activities Flow Leading To RFSU
No ratings yet
Turnover Activities Flow Leading To RFSU
1 page
Nike Brochure
No ratings yet
Nike Brochure
5 pages


