4 Introduction To Python Part 3

This document serves as an introduction to Python programming, focusing on the libraries NumPy and Pandas, which are essential for data science and artificial intelligence. It covers the basics of importing libraries, creating and manipulating arrays with NumPy, and introduces Pandas for handling heterogeneous data through Series and DataFrames. The material is prepared for a course at the American University of Sharjah and is based on the book 'Python for Programmers' by Paul Deitel and Harvey Deitel.

Uploaded by

Yusra Eltilib

Introduction to Python Programming – Part 3

Python Libraries: NumPy and Pandas

Intro to AI and Data Science
NGN 112 – Fall 2024

Ammar Hasan
Department of Electrical Engineering
College of Engineering

American University of Sharjah

Prepared by Dr. Tamer Shanableh, CSE and Dr. Jamal A. Abdalla, CVE
Material mainly based on “Python for Programmers” by Paul Deitel and
Harvey Deitel, Pearson; Illustrated edition, ISBN-10 : 0135224330

Last Updated on: 22nd of August 2024

Table of Contents

Python Libraries

NumPy Library

Pandas

DataFrames
Python Libraries

Python has many libraries. A library is a collection of pre-written code (functions, classes, and constants), so that programmers do not have to reinvent the wheel. A library can be imported into your program, and you can then use all the functions in that library.

You have previously used “import math”, where math is the name of the math library in Python.
Python Libraries

Popular libraries in Python for data science (we will use the highlighted ones in this course):
Python Libraries for Data Processing and Model Deployment
• Pandas
• NumPy
• SciPy
• Sci-Kit Learn
• PyCaret
• Tensorflow
• OpenCV
Python Libraries for Data Mining and Data Scraping
• SQLAlchemy
• Scrapy
• BeautifulSoup
Python Libraries for Data Visualization
• Matplotlib
• Ggplot
• Plotly
• Altair
• Seaborn
Source: https://www.projectpro.io/article/top-5-libraries-for-data-science-in-python/196
Importing Libraries
5

Import the whole library:

import numpy
myarr = numpy.array([1,2,3,4])

OR: Import the whole library with an alias:

import numpy as np
myarr = np.array([1,2,3,4])
Importing a Specific Object

OR: Import a specific part of a library (here the pyplot submodule of matplotlib) with an alias:

import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

# Create a line plot

plt.plot(x, y)
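
The three import styles can be compared side by side. This is a minimal sketch that uses the standard math library (already seen in this course) so that it runs without installing anything; the alias m is only an illustration.

```python
import math                 # whole library: names are used as math.<name>
import math as m            # whole library under a shorter alias
from math import sqrt, pi   # only the specific names that are needed

print(math.sqrt(16))    # full library name
print(m.floor(2.7))     # alias
print(sqrt(pi * pi))    # imported names are used directly, with no prefix
```

Importing specific names keeps the code short, while the prefixed forms make it obvious which library a function comes from.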
7

NumPy Library
The NumPy Library

• NumPy is a popular open-source library in Python for data science and artificial intelligence.
• It is a standard way of working with numeric data in Python.
• It can be used for creating and manipulating N-dimensional arrays:
▪ 1D for lists of numbers
▪ 2D for tables and grayscale images
▪ 3D for images (R, G, B)
▪ 4D for videos (a sequence of 3D images)
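
The number of dimensions can be checked with the ndim and shape attributes. The sketch below builds placeholder arrays of zeros; the sizes (4 x 6 pixels, 10 frames) are arbitrary choices made for illustration.

```python
import numpy as np

vector = np.zeros(5)               # 1D: a list of 5 numbers
gray = np.zeros((4, 6))            # 2D: 4 rows x 6 columns (grayscale image)
rgb = np.zeros((4, 6, 3))          # 3D: height x width x 3 color channels
video = np.zeros((10, 4, 6, 3))    # 4D: 10 frames, each a 3D image

for name, arr in [('vector', vector), ('gray', gray),
                  ('rgb', rgb), ('video', video)]:
    print(name, '| ndim =', arr.ndim, '| shape =', arr.shape)
```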
Creating NumPy Arrays

• Start by importing the NumPy library:

import numpy as np

• Next, create 1D arrays:

# Create a 1D array
numpy_array = np.array([10,20,30])

# Create a 1D array from a list of numbers

data = [10, 20, 30, 40, 50]
numpy_array = np.array(data)
Numpy 2D arrays

• Next, let’s create a 2D array. Think of a 2D array as an “array of arrays”, or a matrix.
import numpy as np

arr_2D = np.array([
[10, 20, 30, 4],
[2, 8, 2, 4],
[30, 12, 67, 44],
[24, 10, 32, 0]
])
print(arr_2D)
print('Shape: ', arr_2D.shape)  # prints the dimensions of the array
Numpy 2D arrays
11

import numpy as np
import matplotlib.pyplot as plt

# Create a smiley
smiley_array = np.array([
[1, 1, 1, 1, 1],
[1, 0, 1, 0, 1],
[1, 1, 1, 1, 1],
[1, 0, 0, 0, 1],
[1, 1, 1, 1, 1],
])

print(smiley_array)
plt.imshow(smiley_array, cmap='binary')
plt.show()  # display the image (needed when running outside a notebook)
Reshaping NumPy Arrays

• You can use the NumPy reshape function to transform a 1D array into a multidimensional array. The elements are filled in row by row.
• Example: we can reshape a 12-element 1D array into a 4x3 2D array.
• The new shape must hold exactly the same number of elements. Reshaping a 12-element 1D array into a 4x4 2D array (16 elements) will not work and raises a ValueError.

import numpy as np
arr = np.array([1,2,3,4,5,6,7,8,9,10,11,12])
print('arr contains: \n', arr)

arr_2D = arr.reshape(4,3)
print('arr_2D contains: \n', arr_2D)
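
The element-count rule can be checked directly. This sketch also uses -1 as one of the dimensions, which asks NumPy to work out that dimension from the array size; the variable names are illustrative.

```python
import numpy as np

arr = np.arange(1, 13)          # the 12 values 1..12

# -1 lets NumPy infer the number of rows: 12 elements / 3 columns = 4 rows
auto = arr.reshape(-1, 3)
print('Inferred shape:', auto.shape)

# A 4x4 array needs 16 elements, so this reshape fails
reshape_failed = False
try:
    arr.reshape(4, 4)
except ValueError as err:
    reshape_failed = True
    print('Reshape failed:', err)
```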
Transposing NumPy Arrays

• You can use the np.transpose function to swap the rows and columns of a 2D array.
• The first row becomes the first column, the second row becomes the second column, and so forth.

import numpy as np
arr = np.array([1,2,3,4,5,6,7,8,9,10,11,12])
print('arr contains: \n', arr)
arr_2D = arr.reshape(4,3)
print('arr_2D contains: \n', arr_2D)
#------------------------------------
arr_2D_transposed = np.transpose(arr_2D)
print('arr_2D_transposed contains: \n',
arr_2D_transposed)
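
A short check of what transposing does to the shape. The sketch also uses the .T attribute, which is a shorthand for np.transpose on a 2D array.

```python
import numpy as np

arr_2D = np.arange(1, 13).reshape(4, 3)   # 4 rows, 3 columns
transposed = arr_2D.T                     # shorthand for np.transpose(arr_2D)

print('Original shape:  ', arr_2D.shape)
print('Transposed shape:', transposed.shape)
print('First column of the original: ', arr_2D[:, 0])
print('First row of the transposed:  ', transposed[0])
```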
NumPy Sorting

# NumPy example: the sort function
# Use np.sort(array, axis), where axis is None, 0, or 1
import numpy as np

arr_2D = np.array([
[10, 20, 30, 4],
[2, 8, 2, 4],
[30, 12, 67, 44],
[24, 10, 32, 0]
])
print(arr_2D)

# Sort the whole array (axis=None flattens the array first)
rst = np.sort(arr_2D, axis=None)
print('Sort the whole array: \n', rst)

# Sort row-wise (axis = 1)
rst = np.sort(arr_2D, axis=1)
print('Row-wise sorting: \n', rst)

# Sort column-wise (axis = 0)
rst = np.sort(arr_2D, axis=0)
print('Column-wise sorting: \n', rst)
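
Two details of sorting are easy to miss: np.sort returns a new array and leaves the original untouched, while the array's own sort method works in place. The sketch below, on a small made-up array, also shows descending order and np.argsort.

```python
import numpy as np

scores = np.array([30, 10, 20])

sorted_copy = np.sort(scores)        # new sorted array; scores is unchanged
descending = np.sort(scores)[::-1]   # reverse the sorted result
order = np.argsort(scores)           # indices that would sort the array

print('Original after np.sort:', scores)
print('Sorted copy:', sorted_copy)
print('Descending: ', descending)
print('Sort order: ', order)

scores.sort()                        # sorts scores itself, returns None
print('Original after scores.sort():', scores)
```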
NumPy Calculation Functions

• We can use the sum, min, max, mean, std, and var functions on NumPy arrays. An example of using sum is shown below.

import numpy as np
grades = np.array([[87, 96, 70], [100, 87, 90], [94, 77, 90], [100, 81, 82]])
print('The grades are: \n', grades)

# The result is named total so that Python's built-in sum() is not overwritten
total = grades.sum(axis=1) # row-wise
print('Summation row-wise:\n', total)

total = grades.sum(axis=0) # col-wise
print('Summation col-wise:\n', total)

total = grades.sum(axis=None) # all
print('Summation of all grades:\n', total)
NumPy Calculation Functions

• An example of using min is shown below.

import numpy as np
grades = np.array([[87, 96, 70], [100, 87, 90], [94, 77, 90], [100, 81, 82]])
print('The grades are: \n', grades)

# The result is named lowest so that Python's built-in min() is not overwritten
lowest = grades.min(axis=1) # row-wise
print('min row-wise:\n', lowest)

lowest = grades.min(axis=0) # col-wise
print('min col-wise:\n', lowest)

lowest = grades.min(axis=None) # all
print('min of all grades:\n', lowest)
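
The remaining calculation functions follow the same axis convention. A minimal sketch on the same grades array:

```python
import numpy as np

grades = np.array([[87, 96, 70], [100, 87, 90], [94, 77, 90], [100, 81, 82]])

col_max = grades.max(axis=0)     # highest grade in each column
col_mean = grades.mean(axis=0)   # average of each column
row_mean = grades.mean(axis=1)   # average of each row
overall_std = grades.std()       # standard deviation of all grades
overall_var = grades.var()       # variance of all grades

print('Max per column: ', col_max)
print('Mean per column:', col_mean)
print('Mean per row:   ', row_mean)
print('Std and var of all grades:', overall_std, overall_var)
```

The variance is the square of the standard deviation, which is a quick sanity check on the last line of output.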
Indexing and Slicing (1/4)
18

 Arrays in NumPy use a zero-indexing scheme. This scheme applies to

rows and columns indexing.
import numpy as np

grades = np.array([[87,96, 70], [100, 87, 90], [94,77,

90],[100,81, 82]])
print('The grades are: \n', grades)

#Select one grade using: grades[row index, col index]

print('grades[0,0] = ', grades[0,0])
print('grades[1,2] = ', grades[1,2])

#Select one row of grades using: grades[row index]

print('grades[3] = ', grades[3])
Indexing and Slicing (2/4)

• Multiple rows can be selected from a NumPy array.
• Select multiple sequential rows using array_name[row index from : row index to]. The slice stops just before the "to" index, so that row is excluded, as shown in the example below.

import numpy as np

grades = np.array([[87,96, 70],[100, 87, 90],[94, 77,

90], [100, 81, 82]])

print('The grades are: \n', grades)

#Select multiple sequential rows of grades using: grades[row index from : row index to]

print('grades[0:2] = \n', grades[0:2]) #up to but not including row 2
Indexing and Slicing (3/4)

• You can select a subset of columns in NumPy arrays:
▪ grades[:, 0] means select all rows, column 0
▪ grades[:, 0:2] means select all rows, columns 0 and 1 (up to but not including 2)

import numpy as np
grades = np.array([[87, 96, 70], [100, 87, 90],
[94, 77, 90], [100, 81, 82]])
print('The grades are: \n', grades)

print('First column | grades[:,0] = \n', grades[:,0])

print('Last 2 columns| grades[:, 1:3] = \n', grades[:,1:3])

Adapted from https://www.w3resource.com/python-exercises/numpy/python-numpy-exercise-104.php
Indexing and Slicing (4/4)

• NumPy arrays, like Python lists, allow negative indices, which count from the end.
• One particularly important case is accessing the last column using the column index -1.

import numpy as np
grades = np.array([[87, 96, 70], [100, 87, 90], [94, 77,
90], [100, 81, 82]])
print('The grades are: \n', grades)

print('First column | grades[:,0] = \n', grades[:,0])

print('Last column | grades[:, -1] = \n', grades[:,-1])
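
Row slices, column slices, steps, and negative indices can be combined. A short sketch on the same grades array, with illustrative variable names:

```python
import numpy as np

grades = np.array([[87, 96, 70], [100, 87, 90], [94, 77, 90], [100, 81, 82]])

every_other_row = grades[::2]     # rows 0 and 2 (step of 2)
last_two_cols = grades[:, -2:]    # all rows, last two columns
block = grades[1:3, 0:2]          # rows 1-2 and columns 0-1

print('Every other row:\n', every_other_row)
print('Last two columns:\n', last_two_cols)
print('Rows 1-2, columns 0-1:\n', block)
```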
22

Pandas - Series & DataFrames

Revisiting countries and fruits example
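
Which representation is better: with headers or without headers?

Heterogeneous data

Heterogeneous means data of different types, e.g. strings and ints.

NumPy arrays are built for homogeneous data: every element of an array shares one data type, so mixed values are converted to a common type.

NumPy arrays have no dedicated marker for missing entries (floating-point arrays can only use NaN as a stand-in).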
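
Both limitations can be observed directly. In this sketch the mixed list is silently converted to strings, and a NaN placeholder turns an ordinary mean into NaN unless a NaN-aware function such as np.nanmean is used. The values are made up for illustration.

```python
import numpy as np

# Mixed types: NumPy converts every element to one common type (text here)
mixed = np.array([1, 2.5, 'three'])
print('dtype of mixed array:', mixed.dtype)
print('first element is now a string:', repr(mixed[0]))

# Missing value: NaN is only a floating-point stand-in
with_gap = np.array([1.0, np.nan, 3.0])
plain_mean = with_gap.mean()          # NaN spreads through the calculation
safe_mean = np.nanmean(with_gap)      # ignores the NaN entry
print('mean:', plain_mean, '| nanmean:', safe_mean)
```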
Pandas Series and DataFrames (1/2)

• NumPy arrays are optimized for homogeneous numeric data.
• However, in machine learning (ML) applications, we need to provide:
▪ Support for heterogeneous types (e.g., numbers and strings).
▪ Support for missing data.
▪ Support for headers and indices (as shown in the next slide).
• Pandas is the commonly used library for dealing with such data.
• It provides support for:
▪ Series: for 1D collections (enhanced 1D array).
▪ DataFrames: for 2D collections (enhanced 2D array).
Pandas Series and DataFrames (2/2)

[Figure: anatomy of a DataFrame. The top row holds the column headers. The first column is called the “index” and holds the index values. The rest of the columns are called “values”.]
Pandas: Python Data Analysis
27
28

Pandas Series
Pandas Series (1/2)

• A Series is an enhanced 1D array.
• It can be indexed using integers (like a NumPy array) or using strings.

import pandas as pd
grades = pd.Series([87, 100, 94])
print('Grades Series:\n',grades)
print('First grade: ',grades[0])

Output (index and value):

0 87
1 100
2 94
First grade: 87
Pandas Series (2/2)

• A Series provides statistical functions like count, mean, min, max, and std.
• For a full numerical summary, you can use the describe function.

import pandas as pd

grades = pd.Series([87, 100, 94])
print('Grades Series:\n', grades)
print('Count: ', grades.count())
print('Mean: ', grades.mean())
print('Min: ', grades.min())
print('Max: ', grades.max())
print('Std: ', grades.std())

# For an overall summary you can use:
print('Description:\n', grades.describe())
Series with a Custom Index

• You can use custom indices with the index argument.

import pandas as pd
grades = pd.Series([87, 100, 94],
index=['First', 'Second', 'final'])
print(grades)

Output:

First 87
Second 100
final 94
Accessing Series Using String Indices
32

 In the previous example, a Series with custom indices can be accessed via
square brackets [ ] containing a custom index value:
import pandas as pd
grades = pd.Series([87, 100, 94], index=['First',
'Second', 'final'])
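print('Grade of first = ',grades['First']) # access by label, or
print('Grade of first = ',grades.iloc[0]) # access by position (grades[0] is deprecated for a string index in recent pandas)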

#--You can also access all values and all indices

print('Series values are: ', grades.values)
print('Series indices are: ', grades.index)

Output:
Grade of first = 87
Grade of first = 87
Series values are: [ 87 100 94]
Series indices are: Index(['First', 'Second', 'final'],
dtype='object')
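
Label-based and position-based access can be made explicit with .loc and .iloc. The sketch below reuses the grades Series; note that a label slice includes its end point, while a position slice does not.

```python
import pandas as pd

grades = pd.Series([87, 100, 94], index=['First', 'Second', 'final'])

by_label = grades.loc['Second']              # look up by index label
by_position = grades.iloc[1]                 # look up by integer position
label_slice = grades.loc['First':'Second']   # includes both end labels
position_slice = grades.iloc[0:1]            # excludes position 1

print('By label:', by_label, '| by position:', by_position)
print('Label slice:\n', label_slice)
print('Position slice:\n', position_slice)
```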
33

Pandas DataFrames
Pandas DataFrames

• DataFrames are enhanced 2D arrays.
• They can have custom indices and headers.
• Each column in a DataFrame is a Series.
Creating DataFrames From Files
35

• Pandas provides a read_csv() function to read data stored in a .csv file into a pandas DataFrame.

• Pandas supports many different file formats, including CSV and Excel:
• myDataFrame = pd.read_csv("myfile.csv")
• myDataFrame = pd.read_excel("myfile.xlsx")

• To save data from DataFrames to files use:

• myDataFrame.to_csv("myOutputFile.csv")
• myDataFrame.to_excel("myOutputFile.xlsx")

• After reading a file, you can display the first 5 rows using myDataFrame.head() and the last 5 rows using myDataFrame.tail()
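
A write-then-read round trip shows these functions working together. This is a sketch with made-up data and a hypothetical file name (scores.csv) written to a temporary folder. Passing index=False to to_csv is a choice made here so that the row index is not saved as an extra column.

```python
import os
import tempfile
import pandas as pd

df = pd.DataFrame({'name': ['a', 'b', 'c'], 'score': [1.5, 2.0, 3.5]})

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'scores.csv')   # hypothetical file name
    df.to_csv(path, index=False)                # save without the row index
    loaded = pd.read_csv(path)                  # read the file back

print(loaded.head(2))   # first 2 rows (head() alone shows the first 5)
print(loaded.tail(1))   # last row (tail() alone shows the last 5)
```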
Creating DataFrames From Files in Colab
36

Click to upload a file

I uploaded this file

df2.to_csv('testFileToWrite.csv') # this will create an output file with .csv extension

Creating DataFrames From Internet Files (1/3)

• We will use the Iris sample data, which contains information on 150 Iris flowers, 50 from each of three Iris species: Setosa, Versicolour, and Virginica.
• Each flower is characterized by four numeric attributes:
1. sepal_length in centimeters
2. sepal_width in centimeters
3. petal_length in centimeters
4. petal_width in centimeters
• Each flower belongs to one type (class), which is the last column in the DataFrame: Setosa, Versicolour, or Virginica.
Data is available online at: https://archive.ics.uci.edu/dataset/53/iris
Iris Flowers Dataset
38
Creating DataFrames From Internet Files (2/3)
39

import pandas as pd

#The argument header=None says that this dataset does not contain a header row, so we will add one next
data = pd.read_csv('http://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data', header=None)

# data = pd.read_csv('iris.data', header=None)  # alternative: read a local copy of the file

#You can then add column headers
data.columns=['sepal_length','sepal_width','petal_length','petal_width','class']

#And display the first 5 rows to make sure that the reading is successful
data.head()
Creating DataFrames From Internet Files (3/3)

The output:
41

DataFrames Indexing
Accessing DataFrame’s Columns and Rows (1/4)

#Access one column using a header’s name
print('petal_length columns:\n',data['petal_length'])

Output:
petal_length columns:
0 1.4
1 1.4
2 1.3
3 1.5
4 1.4
...
145 5.2
146 5.0
147 5.2
148 5.4
149 5.1

#Access one row using the .iloc function
print('\n\nFirst row:')
print(data.iloc[0])

Output:
First row:
sepal_length 5.1
sepal_width 3.5
petal_length 1.4
petal_width 0.2
class Iris-setosa
Accessing DataFrame’s Columns and Rows (2/4)
43

#Access a sequential slice of rows using the .iloc function
print('\n\nFirst 5 rows:')
print(data.iloc[0:5]) # up to but not including 5

First 5 rows:

sepal_length sepal_width petal_length petal_width class

0 5.1 3.5 1.4 0.2 Iris-setosa
1 4.9 3.0 1.4 0.2 Iris-setosa
2 4.7 3.2 1.3 0.2 Iris-setosa
3 4.6 3.1 1.5 0.2 Iris-setosa
4 5.0 3.6 1.4 0.2 Iris-setosa
Accessing DataFrame’s Columns and Rows (3/4)
44

#Access a sequential slice of rows and columns using the .iloc function
print('\n\nFirst 5 rows and first 2 columns:')

#print up to but not including row 5, up to but not including col 2
#.iloc[ rows from:to , cols from:to ]
print(data.iloc[0:5 , 0:2 ])

First 5 rows and first 2 columns:

sepal_length sepal_width
0 5.1 3.5
1 4.9 3.0
2 4.7 3.2
3 4.6 3.1
4 5.0 3.6
Accessing DataFrame’s Columns and Rows (4/4)
45

#Access a sequential slice of rows and a list of specific columns using the .iloc function
print('\n\nFirst 5 rows, first 2 columns and the last column:')

#print up to but not including row 5, and cols 0, 1 and the last column
#.iloc[ rows from:to , [cols indices] ]
print(data.iloc[0:5 , [0,1,-1]])

sepal_length sepal_width class

0 5.1 3.5 Iris-setosa
1 4.9 3.0 Iris-setosa
2 4.7 3.2 Iris-setosa
3 4.6 3.1 Iris-setosa
4 5.0 3.6 Iris-setosa
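
The same selection can also be written with .loc, which uses row labels and column names instead of positions. The sketch below uses a three-row stand-in for the Iris data (values copied from the first rows shown above). One difference to keep in mind: a .loc slice includes its end label, while an .iloc slice excludes its end position.

```python
import pandas as pd

# Small stand-in for the Iris DataFrame
df = pd.DataFrame({
    'sepal_length': [5.1, 4.9, 4.7],
    'sepal_width': [3.5, 3.0, 3.2],
    'class': ['Iris-setosa', 'Iris-setosa', 'Iris-setosa'],
})

by_position = df.iloc[0:2, [0, -1]]                  # rows 0-1 (end excluded)
by_label = df.loc[0:2, ['sepal_length', 'class']]    # rows 0-2 (end included)

print(by_position)
print(by_label)
```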
46

DataFrames Boolean Indexing

DataFrames Boolean Indexing (1/5)

• Pandas provides a powerful selection feature called Boolean indexing.
• That is, you can use a Boolean expression that returns True/False to filter a DataFrame.
• Let us start by extracting the numeric data from our DataFrame:
data_numeric = data.iloc[:, 0:4]
data_numeric.head()
sepal_length sepal_width petal_length petal_width
0 5.1 3.5 1.4 0.2
1 4.9 3.0 1.4 0.2
2 4.7 3.2 1.3 0.2
3 4.6 3.1 1.5 0.2
4 5.0 3.6 1.4 0.2
DataFrames Boolean Indexing (2/5)
48

# from the previous slide
data_numeric = data.iloc[:, 0:4]
#Filter the DataFrame, locate values >= 5.0
rst = data_numeric[data_numeric >= 5.0]
rst

• Pandas checks every element to determine whether its value is greater than or equal to 5.0.
• If True, the element is kept in the new DataFrame (rst in the example above).
• Elements for which the condition is False are represented as NaN (not a number) in the new DataFrame.

Output:
sepal_length sepal_width petal_length petal_width
0 5.1 NaN NaN NaN
1 NaN NaN NaN NaN
2 NaN NaN NaN NaN
3 NaN NaN NaN NaN
4 5.0 NaN NaN NaN
... ... ... ... ...
145 6.7 NaN 5.2 NaN
146 6.3 NaN 5.0 NaN
147 6.5 NaN 5.2 NaN
148 6.2 NaN 5.4 NaN
149 5.9 NaN 5.1 NaN
150 rows × 4 columns
DataFrames Boolean Indexing (3/5)
49

• In Boolean expressions, you can use:

▪ AND, which is the & operator
▪ OR, which is the | operator

• Each condition must be wrapped in its own parentheses.

data_numeric = data.iloc[:, 0:4]

rst = data_numeric[data_numeric >= 5.0]

rst.head()

#Another example: (data_numeric >= 3.0) AND (data_numeric <= 5.0)
rst = data_numeric[(data_numeric >= 3.0) & (data_numeric <= 5.0)]
rst.head()

#Another example: (data_numeric < 3.0) OR (data_numeric > 5.0)
rst = data_numeric[(data_numeric < 3.0) | (data_numeric > 5.0)]
rst.head()
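
The reason for using & and | rather than Python's and / or keywords can be seen on a tiny made-up DataFrame: the operators compare element by element and produce a table of True/False values, whereas the keywords try to reduce a whole DataFrame to a single True or False and fail.

```python
import pandas as pd

df = pd.DataFrame({'a': [1.0, 4.0, 6.0], 'b': [3.5, 2.0, 5.0]})

# Element-wise AND: one True/False per cell
mask = (df >= 3.0) & (df <= 5.0)
print(mask)
print(df[mask])     # cells where the mask is False become NaN

# The keyword 'and' cannot combine two DataFrames of Booleans
and_failed = False
try:
    df[(df >= 3.0) and (df <= 5.0)]
except ValueError as err:
    and_failed = True
    print('Using "and" failed:', err)
```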
DataFrames Boolean Indexing (4/5)
50

• You can use the .loc function with a Boolean expression to filter whole rows according to Boolean criteria.
import pandas as pd

data = pd.read_csv('http://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data', header=None)
# data = pd.read_csv('iris.data', header=None)  # alternative: read a local copy of the file

data.columns=['sepal_length','sepal_width','petal_length','petal_width','class']

#Select rows where sepal_length >= 5.0
rst = data.loc[ data.sepal_length >= 5.0 ]
print('Select rows where sepal_length >= 5.0')
print(rst.head())

#Select rows where sepal_length >= 5.0 AND sepal_width >= 3.5
rst = data.loc[ (data.sepal_length >= 5.0) & (data.sepal_width >= 3.5)]
print('Select rows where sepal_length >= 5.0 & sepal_width >= 3.5')
print(rst.head())
DataFrames Boolean Indexing (5/5)
51

Select row where sepal_length >= 5.0

Output: sepal_length sepal_width petal_length petal_width class
0 5.1 3.5 1.4 0.2 Iris-setosa
4 5.0 3.6 1.4 0.2 Iris-setosa
5 5.4 3.9 1.7 0.4 Iris-setosa
7 5.0 3.4 1.5 0.2 Iris-setosa
10 5.4 3.7 1.5 0.2 Iris-setosa

Select row where sepal_length >= 5.0 & data.sepal_width >= 3.5
sepal_length sepal_width petal_length petal_width class
0 5.1 3.5 1.4 0.2 Iris-setosa
4 5.0 3.6 1.4 0.2 Iris-setosa
5 5.4 3.9 1.7 0.4 Iris-setosa
10 5.4 3.7 1.5 0.2 Iris-setosa
14 5.8 4.0 1.2 0.2 Iris-setosa
Summary of Four Types of Indexing in
DataFrames
52
import pandas as pd
#Retrieve data from web archive and add column headers
data = pd.read_csv('http://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data', header=None)
data.columns=['sepal_length','sepal_width','petal_length','petal_width','class']

# Indexing with column header name

print(data['sepal_width'])

# Indexing using .iloc, which is similar to numpy indexing

print(data.iloc[2,1]) # index a particular element
print(data.iloc[2]) # index a row
print(data.iloc[2:5]) # index a range of rows
print(data.iloc[:,4]) # index a column
print(data.iloc[:,0:3]) # index a range of columns
print(data.iloc[20:22,0:3]) # index a range of rows and columns
print(data.iloc[0:7:2,3]) # index a range of rows with step size
print(data.iloc[2,-1]) # index with -1 for last column
print(data.iloc[[0,39,45],3]) # index some specific row numbers

# boolean indexing (element wise)

data_numeric = data.iloc[:,0:4] #retrieve only numeric data
print(data_numeric[data_numeric >= 5.0]) #all elements that satisfy boolean condition
print(data_numeric[(data_numeric >= 3.0) & (data_numeric <= 5.0)])

# boolean indexing (row wise) using .loc

print(data.loc[data.sepal_length >= 7]) #all rows that satisfy boolean condition
print(data.loc[(data.sepal_length >= 7) & (data.petal_length <=5)])
53

DataFrames Statistics
DataFrames Statistics (1/2)

• Similar to Series, you can use the describe() function to print out statistics.
• In DataFrames, the statistics are calculated by column (for the numeric columns only).

data.describe()

Output:
sepal_length sepal_width petal_length petal_width
count 150.000000 150.000000 150.000000 150.000000
mean 5.843333 3.054000 3.758667 1.198667
std 0.828066 0.433594 1.764420 0.763161
min 4.300000 2.000000 1.000000 0.100000
25% 5.100000 2.800000 1.600000 0.300000
50% 5.800000 3.000000 4.350000 1.300000
75% 6.400000 3.300000 5.100000 1.800000
max 7.900000 4.400000 6.900000 2.500000
DataFrames Statistics (2/2)

• Similar to Series, you can use mean(), min(), max(), std(), and var().
• In DataFrames, the statistics are calculated by column. Pass numeric_only=True to skip non-numeric columns such as class; recent pandas versions raise an error when asked for the mean of a text column.

print('Avg per col:')
print(data.mean(numeric_only=True))
print('Std per col:')
print(data.std(numeric_only=True))
print('Min per col:')
print(data.min(numeric_only=True))
print('Max per col:')
print(data.max(numeric_only=True))

Output:
Avg per col:
sepal_length 5.843333
sepal_width 3.054000
petal_length 3.758667
petal_width 1.198667

Std per col:
sepal_length 0.828066
sepal_width 0.433594
petal_length 1.764420
petal_width 0.763161
…
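
One point worth knowing when comparing these numbers with NumPy results: pandas std() computes the sample standard deviation by default (it divides by n - 1, i.e. ddof=1), while NumPy's std() divides by n (ddof=0). The sketch below uses made-up values chosen so that the NumPy result is exactly 2.

```python
import numpy as np
import pandas as pd

values = [2, 4, 4, 4, 5, 5, 7, 9]
series = pd.Series(values)

numpy_std = np.std(values)           # divides by n      (ddof=0)
pandas_std = series.std()            # divides by n - 1  (ddof=1)
matched_std = series.std(ddof=0)     # ask pandas to use the NumPy convention

print('NumPy std: ', numpy_std)
print('pandas std:', pandas_std)
print('pandas std with ddof=0:', matched_std)
```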
56

Converting Numpy to DataFrames

DataFrames <-> NumPy (1/3)

• There are cases where you need to convert a DataFrame into a NumPy array and vice versa.
• This is needed in machine learning tasks like classification and regression, which you will study next.
• Let us start by converting a DataFrame into a NumPy array using the to_numpy() function.
import pandas as pd
data = pd.read_csv('http://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data', header=None)
data.columns=['sepal_length','sepal_width','petal_length','petal_width','class']

#Convert a DataFrame into a NumPy array
numpy_from_dataFrame = data.to_numpy()
#print(numpy_from_dataFrame)

#OR: Convert the first 4 columns of a DataFrame into a NumPy array
numpy_from_dataFrame = data.iloc[:, 0:4].to_numpy()
#print(numpy_from_dataFrame)
DataFrames <-> NumPy (2/3)

Output of data.to_numpy() (left) and output of data.iloc[:, 0:4].to_numpy() (right):
[[5.1 3.5 1.4 0.2 'Iris-setosa'] [[5.1 3.5 1.4 0.2]

[4.9 3.0 1.4 0.2 'Iris-setosa'] [4.9 3. 1.4 0.2]
[4.7 3.2 1.3 0.2 'Iris-setosa'] [4.7 3.2 1.3 0.2]
[4.6 3.1 1.5 0.2 'Iris-setosa'] [4.6 3.1 1.5 0.2]
[5.0 3.6 1.4 0.2 'Iris-setosa'] [5. 3.6 1.4 0.2]
[5.4 3.9 1.7 0.4 'Iris-setosa'] [5.4 3.9 1.7 0.4]
[4.6 3.4 1.4 0.3 'Iris-setosa'] [4.6 3.4 1.4 0.3]
[5.0 3.4 1.5 0.2 'Iris-setosa'] [5. 3.4 1.5 0.2]
[4.4 2.9 1.4 0.2 'Iris-setosa'] [4.4 2.9 1.4 0.2]
[4.9 3.1 1.5 0.1 'Iris-setosa'] [4.9 3.1 1.5 0.1]
[5.4 3.7 1.5 0.2 'Iris-setosa'] [5.4 3.7 1.5 0.2]
[4.8 3.4 1.6 0.2 'Iris-setosa'] [4.8 3.4 1.6 0.2]
… …
DataFrames <-> NumPy (3/3)

• To convert a NumPy array into a DataFrame we can use pd.DataFrame().
• Notice how you can add column names (the headers) using the argument columns=[…]

# The number of names in columns must match the number of columns in the array,
# so numpy_from_dataFrame here must be the 5-column array from data.to_numpy()
dataFrame_from_numpy = pd.DataFrame(numpy_from_dataFrame,
columns=['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'class'])

dataFrame_from_numpy.head()
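
The conversion has a side effect on data types that is easy to overlook. In this sketch (a two-row stand-in for the Iris data), a DataFrame that mixes numbers and text converts to a NumPy array of generic Python objects, while a numeric-only selection stays a float array. When rebuilding a DataFrame from the mixed array, converting numeric columns back with astype is a safe extra step, since they may not come back as numbers automatically.

```python
import pandas as pd

df = pd.DataFrame({
    'sepal_length': [5.1, 4.9],
    'class': ['Iris-setosa', 'Iris-setosa'],
})

mixed = df.to_numpy()                        # numbers + text -> object array
numeric = df[['sepal_length']].to_numpy()    # numbers only -> float array
print('mixed dtype:  ', mixed.dtype)
print('numeric dtype:', numeric.dtype)

# Rebuild a DataFrame and restore the numeric type explicitly
back = pd.DataFrame(mixed, columns=df.columns)
back['sepal_length'] = back['sepal_length'].astype(float)
print(back.dtypes)
```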
60

Converting Dictionaries to
DataFrames
Other Ways of Creating DataFrames (1/2)

import pandas as pd

df = pd.DataFrame(
{
"Name": ["Braund, Mr. Owen Harris", "Allen, Mr. William Henry", "Bonnell, Miss. Elizabeth"],
"Age": [22, 35, 58],
"Gender": ["male", "male", "female"]
}
)
print(df)
df.describe()

Output:

Name Age Gender
0 Braund, Mr. Owen Harris 22 male
1 Allen, Mr. William Henry 35 male
2 Bonnell, Miss. Elizabeth 58 female

Age
count 3.000000
mean 38.333333
std 18.230012
min 22.000000
25% 28.500000
50% 35.000000
75% 46.500000
max 58.000000

Source: https://pandas.pydata.org/
Other Ways of Creating DataFrames (2/2)

#You can create a DataFrame from an existing dictionary as follows.
#The dictionary’s keys become the column names (headers).
#The values become the element values in the corresponding column.
import pandas as pd
my_dictionary={
"Name": [
"Dr. Sami Batata",
"Prof. Marwa Halawah",
"Mr. Fawzi Kamal"
],
"Age": [29, 40, 60],
"Gender": ["male", "female", "male"]
}
df = pd.DataFrame(my_dictionary)
print(df)

Name Age Gender

0 Dr. Sami Batata 29 male
1 Prof. Marwa Halawah 40 female
2 Mr. Fawzi Kamal 60 male
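
Two related variations, sketched with made-up names and ages: data that arrives as one dictionary per row (a list of records) can be passed to pd.DataFrame in the same way, and the index argument replaces the default 0, 1, 2 row labels with custom ones.

```python
import pandas as pd

# One dictionary per row; the keys still become the column names
records = [
    {'Name': 'Student A', 'Age': 29},
    {'Name': 'Student B', 'Age': 40},
]
df_rows = pd.DataFrame(records)
print(df_rows)

# Custom row labels through the index argument
df_indexed = pd.DataFrame({'Age': [29, 40]}, index=['Student A', 'Student B'])
print(df_indexed)
print('Age of Student B:', df_indexed.loc['Student B', 'Age'])
```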
