PPS - Unit 5 (Imp Topics)

NumPy is a core library for scientific computing in Python, centered around the ndarray object, which holds homogeneous data in multi-dimensional arrays. It provides efficient array creation, slicing, and manipulation, while Pandas offers data structures like Series and DataFrame for data analysis, including handling missing values. Understanding the differences between NumPy and Pandas is crucial for effective data manipulation and analysis in Python.

Uploaded by

mk7023

NumPy ndarray

NumPy (Numerical Python) is a fundamental library for scientific computing in Python.

The heart of NumPy is the ndarray (N-dimensional array) object, which is a fast, flexible
container for large datasets of homogeneous (same type) data [1]. Here are the key
features and explanations for beginners:

• Homogeneous Data: All elements in an ndarray must be of the same data type
(e.g., all integers or all floats) [1].

• Shape: The shape of an ndarray is a tuple indicating the size along each dimension
(e.g., a 2x3 array has shape (2, 3)) [1].

• dtype: This attribute tells you the type of data stored (e.g., int64, float64) [1].

• ndim: This attribute tells you the number of dimensions (axes) the array has [1].
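The attributes listed above can be checked directly on a small array. The following is a minimal sketch; the variable names mixed and grid are illustrative and not part of the original notes.

```python
import numpy as np

# A list mixing ints and a float is stored with one common type (float64),
# because an ndarray is homogeneous.
mixed = np.array([1, 2, 3.5])

# A list of equal-length lists becomes a two-dimensional array.
grid = np.array([[1, 2, 3], [4, 5, 6]])

print(mixed.dtype)  # float64
print(grid.shape)   # (2, 3)
print(grid.ndim)    # 2
```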

Creating and Inspecting an ndarray

import numpy as np # Import the numpy library

data1 = [6, 7.5, 8, 0, 1] # Define a Python list

arr1 = np.array(data1) # Convert the list to a NumPy array (ndarray)
print(arr1) # Output: [6. 7.5 8. 0. 1.]
print(arr1.dtype) # Output: float64 (data type of elements)
print(arr1.shape) # Output: (5,) (1D array with 5 elements)
print(arr1.ndim) # Output: 1 (one-dimensional)

• Comment: Here, np.array() converts a Python list into an ndarray. Because the list mixes
integers and a float, every element is stored as float64. The dtype, shape, and ndim
attributes help you understand the structure of your data [1].

Multidimensional ndarrays

data2 = [[1,2,3,4], [5,6,7,8]] # List of lists for 2D array

arr2 = np.array(data2)
print(arr2)
# Output:
# [[1 2 3 4]
#  [5 6 7 8]]
print(arr2.dtype) # int64 (the default integer type on most platforms)
print(arr2.shape) # (2, 4) -> 2 rows, 4 columns
print(arr2.ndim) # 2 (two-dimensional)
print(arr2.shape[0]) # 2 (number of rows)
print(arr2.shape[1]) # 4 (number of columns)

• Comment: This creates a 2D array (matrix). The shape (2, 4) means 2 rows and 4
columns [1].

Array Creation Functions

NumPy provides functions such as arange, zeros, and ones to create arrays efficiently, and reshape to change the shape of an existing array:

arr = np.arange(10) # Array from 0 to 9

arr2d = arr.reshape(2, 5) # Reshape to 2 rows, 5 columns
zeros = np.zeros((3, 4)) # 3x4 array of zeros
ones = np.ones((2, 2)) # 2x2 array of ones

• Comment: These functions help you quickly generate arrays for computation [1].
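As a small sketch of how these functions behave, the example below checks the resulting shapes and data types. The dtype=int argument and the reshape_failed flag are additions for illustration only.

```python
import numpy as np

arr = np.arange(10)                 # ten integers, 0 through 9
arr2d = arr.reshape(2, 5)           # the same 10 elements as 2 rows x 5 columns
zeros = np.zeros((3, 4))            # the shape is passed as a tuple; dtype defaults to float64
ones = np.ones((2, 2), dtype=int)   # the dtype can be chosen explicitly

print(arr2d.shape)  # (2, 5)
print(zeros.dtype)  # float64

# reshape only works when the total number of elements stays the same.
reshape_failed = False
try:
    arr.reshape(3, 4)               # 10 elements cannot fill 3 x 4 = 12 slots
except ValueError as exc:
    reshape_failed = True
    print("reshape failed:", exc)
```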

Slicing Arrays in NumPy

Slicing is a way to extract sub-parts of an array, similar to slicing lists in Python but
extended to multiple dimensions [1].

Basic Slicing Syntax

• array[start:end] extracts elements from index start to end-1.

• array[start:end:step] adds a step size.

Example: 1D Slicing

arr = np.arange(10) # [0 1 2 3 4 5 6 7 8 9]
print(arr[2:7:2]) # Output: [2 4 6]

• Comment: Slices from index 2 to 6 (since end is exclusive), taking every 2nd
element [1].
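The start, end, and step parts of a slice can each be omitted, and negative values count from the end, just as with Python lists. A brief sketch (the variable names are illustrative):

```python
import numpy as np

arr = np.arange(10)        # [0 1 2 3 4 5 6 7 8 9]

first_three = arr[:3]      # start omitted: begins at index 0
from_seven = arr[7:]       # end omitted: runs to the last element
last_two = arr[-2:]        # negative index counts from the end
every_third = arr[::3]     # only a step: indices 0, 3, 6, 9
reversed_arr = arr[::-1]   # a step of -1 walks the array backwards

print(first_three)   # [0 1 2]
print(from_seven)    # [7 8 9]
print(last_two)      # [8 9]
print(every_third)   # [0 3 6 9]
print(reversed_arr)  # [9 8 7 6 5 4 3 2 1 0]
```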

Example: 2D Slicing

arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

print(arr2d[:, 2]) # Output: [3 6 9] (last column)
print(arr2d[1, 1:]) # Output: [5 6] (last two elements of middle row)

• Comment: A bare : selects everything along that axis: "all rows" in the first position, "all columns" in the second [1].
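Row and column slices can also be combined to cut out a sub-matrix. The sketch below reuses the 3x3 array from the example; the names top_right and middle_col are illustrative.

```python
import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

top_right = arr2d[:2, 1:]    # first two rows, last two columns
middle_col = arr2d[:, 1:2]   # slicing a single column keeps the result 2D
first_row = arr2d[0]         # a single index drops one dimension

print(top_right)
# [[2 3]
#  [5 6]]
print(middle_col.shape)  # (3, 1)
print(first_row)         # [1 2 3]
```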

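Modifying a Slice Changes the Original Array

Basic slicing in NumPy returns a view of the same data, not a copy. Modifying the slice
therefore changes the original array.

arr = np.arange(10)
arr_slice = arr[5:8] # A view onto elements 5, 6, 7 of arr
arr_slice[:] = 100 # Assign 100 to every element of the view
print(arr) # The original array is changed at indices 5, 6, 7

• Comment: This is different from Python lists, where a slice is a new list. Views avoid copying data, which is very efficient for large arrays [1].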
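When an independent copy is wanted instead of a view, the slice can be copied explicitly. The sketch below uses the copy() method and np.shares_memory, neither of which appears in the original notes, to contrast the two cases.

```python
import numpy as np

arr = np.arange(10)

view = arr[5:8]                 # shares memory with arr
independent = arr[5:8].copy()   # owns its own data

independent[:] = -1             # does not touch arr
view[0] = 100                   # writes through to arr[5]

print(arr)                                  # index 5 is now 100, the rest unchanged
print(np.shares_memory(arr, view))          # True
print(np.shares_memory(arr, independent))   # False
```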

Dealing with Rows and Columns in Pandas

Pandas is a powerful library for data manipulation and analysis. Its main data structures
are Series (1D) and DataFrame (2D)[1].

Creating a DataFrame

import pandas as pd

data = {
    'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
    'Age': [27, 24, 22, 32],
    'Address': ['Delhi', 'Kanpur', 'Allahabad', 'Kannauj'],
    'Qualification': ['Msc', 'MA', 'MCA', 'Phd']
}
df = pd.DataFrame(data)
print(df)

• Comment: DataFrames are like tables with rows and columns [1].

Selecting Columns

print(df['Name']) # Selects the 'Name' column as a Series

print(df[['Name', 'Qualification']]) # Selects multiple columns as a DataFrame

• Comment: Single brackets with one column name return a Series; double brackets (a list of column names) return a DataFrame [1].
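The difference between single and double brackets shows up in the type of the result. A minimal sketch using a shortened version of the example DataFrame:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
    'Age': [27, 24, 22, 32],
    'Qualification': ['Msc', 'MA', 'MCA', 'Phd'],
})

one_col = df['Name']                     # Series (1D)
one_col_frame = df[['Name']]             # DataFrame with a single column (2D)
two_cols = df[['Name', 'Qualification']] # DataFrame with two columns

print(type(one_col).__name__)        # Series
print(type(one_col_frame).__name__)  # DataFrame
print(two_cols.shape)                # (4, 2)
```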

Selecting Rows

• By Index:
print(df.loc[0]) # Select row by index label
print(df.iloc[1]) # Select row by integer position

• By Condition:

print(df[df['Age'] > 25]) # Select rows where Age > 25
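With the default index, labels and positions are the same numbers, so loc and iloc look alike. The sketch below assumes a hypothetical string index ('a' to 'd') to make the difference visible, and also combines two conditions.

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'], 'Age': [27, 24, 22, 32]},
    index=['a', 'b', 'c', 'd'],   # hypothetical labels, for illustration only
)

by_label = df.loc['b']        # row whose index label is 'b'
by_position = df.iloc[1]      # second row by position (the same row here)

label_slice = df.loc['a':'c'] # label slices include the end label: 3 rows
pos_slice = df.iloc[0:2]      # position slices exclude the end: 2 rows

# Combine conditions with & (and) or | (or); each condition needs parentheses.
mid_twenties = df[(df['Age'] > 23) & (df['Age'] < 30)]

print(by_label['Name'])               # Princi
print(len(label_slice), len(pos_slice))  # 3 2
print(mid_twenties['Name'].tolist())  # ['Jai', 'Princi']
```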

Adding and Deleting Columns

• Add:

df['Salary'] = [50000, 60000, 70000, 80000] # Add new column (one value per row)

• Delete:

df = df.drop('Salary', axis=1) # Remove column (drop returns a new DataFrame)

Renaming Columns

df = df.rename(columns={'Qualification': 'Degree'})
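Both drop and rename return a new DataFrame by default and leave the original untouched, which is why the examples above assign the result back to df. A short sketch on a two-row frame; the names dropped and renamed are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Jai', 'Princi'], 'Qualification': ['Msc', 'MA']})
df['Salary'] = [50000, 60000]            # assignment adds the column in place

dropped = df.drop(columns='Salary')      # same effect as df.drop('Salary', axis=1)
renamed = df.rename(columns={'Qualification': 'Degree'})

print(list(df.columns))       # ['Name', 'Qualification', 'Salary'] (unchanged)
print(list(dropped.columns))  # ['Name', 'Qualification']
print(list(renamed.columns))  # ['Name', 'Degree', 'Salary']
```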

Working with Missing Data

Real-world datasets often have missing values. Pandas provides tools to handle them[1].

Checking for Missing Values

import numpy as np
import pandas as pd

# Named "scores" rather than "dict" to avoid shadowing Python's built-in dict type
scores = {
    'First Score': [100, 90, np.nan, 95],
    'Second Score': [30, 45, 56, np.nan],
    'Third Score': [np.nan, 40, 80, 98]
}
df = pd.DataFrame(scores)
print(df.isnull()) # True where value is missing
print(df.notnull()) # True where value is present

Filling Missing Values

df_filled = df.fillna(0) # Replace NaN with 0

Dropping Missing Values

df_dropped = df.dropna() # Remove rows with any NaN

• Comment: Handling missing data is crucial for accurate analysis [1].
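Filling with 0 is only one option. As a hedged sketch, the example below counts the missing values per column and fills each gap with that column's mean; choosing the mean is an assumption made for illustration, not a recommendation from the original notes.

```python
import numpy as np
import pandas as pd

scores = pd.DataFrame({
    'First Score': [100, 90, np.nan, 95],
    'Second Score': [30, 45, 56, np.nan],
    'Third Score': [np.nan, 40, 80, 98],
})

missing_per_column = scores.isnull().sum()        # number of NaN values in each column
filled_with_mean = scores.fillna(scores.mean())   # each NaN replaced by its column mean
complete_rows = scores.dropna()                   # keeps only rows without any NaN

print(missing_per_column.tolist())              # [1, 1, 1]
print(filled_with_mean.loc[2, 'First Score'])   # 95.0, the mean of 100, 90, 95
print(complete_rows.index.tolist())             # [1]
```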

Applying Functions to DataFrames

Pandas allows you to apply functions to rows or columns using apply()[1].

Syntax

DataFrame.apply(func, axis=0)

• func: The function to apply.

• axis=0: Apply function to each column (default).

• axis=1: Apply function to each row.

Example

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, 5, 6]
})

# Sum of each column
print(df.apply(np.sum, axis=0)) # A: 6, B: 15

# Sum of each row
print(df.apply(np.sum, axis=1)) # 5, 7, 9

• Comment: You can use built-in functions or custom functions with apply() [1].
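A custom function works the same way as a built-in one. In this sketch the lambdas are illustrative: with axis=0 the function receives one column at a time, and with axis=1 it receives one row at a time.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4, 5, 6],
})

# axis=0: the function is called once per column (a Series of that column's values).
col_range = df.apply(lambda col: col.max() - col.min(), axis=0)

# axis=1: the function is called once per row (a Series indexed by column name).
row_product = df.apply(lambda row: row['A'] * row['B'], axis=1)

print(col_range.tolist())    # [2, 2]
print(row_product.tolist())  # [4, 10, 18]
```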

Comparison between NumPy and Pandas

| Feature | NumPy | Pandas |
|---|---|---|
| Data Structure | ndarray (N-dimensional array) | Series (1D), DataFrame (2D) |
| Data Type | Homogeneous (all elements same type) | Heterogeneous (different types per column) |
| Speed | Very fast for numerical computations | Slightly slower due to more features |
| Indexing | Integer-based, slices | Label-based and integer-based |
| Use-case | Numerical calculations, linear algebra | Data analysis, tabular data, statistics |
| Memory Usage | Lower (no row or column labels) | Higher (stores labels, more metadata) |
| Operations | Vectorized, element-wise | Powerful group-by, merge, pivot, etc. |
| File I/O | Limited (text/binary) | Extensive (CSV, Excel, SQL, JSON, etc.) |

• Summary: Use NumPy for efficient numerical computations and Pandas for data
analysis with labeled, tabular data [1].
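The two libraries are commonly used together, since a DataFrame can be built from an ndarray and converted back. The sketch below is illustrative only; the column names 'x' and 'y' and the to_numpy() round trip are assumptions added here, not part of the original notes.

```python
import numpy as np
import pandas as pd

values = np.array([[1.0, 2.0], [3.0, 4.0]])

# Wrap the array in a DataFrame to gain column labels.
df = pd.DataFrame(values, columns=['x', 'y'])   # hypothetical column names

by_label = df['x'].sum()          # Pandas: select the column by its label
by_position = values[:, 0].sum()  # NumPy: select the column by its position

# Convert back to a plain ndarray when only the numbers are needed.
back = df.to_numpy()

# Columns of different types cannot share one numeric dtype,
# so the converted array falls back to the generic object dtype.
mixed = pd.DataFrame({'name': ['a', 'b'], 'score': [1.5, 2.5]})

print(by_label, by_position)     # 4.0 4.0
print(back.dtype)                # float64
print(mixed.to_numpy().dtype)    # object
```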

Other Python Libraries

Python's ecosystem is rich with libraries for various purposes [1]:

• TensorFlow: Deep learning and large-scale numerical computation, especially for neural networks.

• Matplotlib: Data visualization (plots, graphs, charts).

• Pandas: Data analysis and manipulation (tabular data).

• NumPy: Numerical computing (arrays, matrices).

• SciPy: Scientific and technical computing, built on NumPy.

• Scrapy: Web scraping and data extraction from websites.

• Scikit-learn: Machine learning (classification, regression, clustering).

• PyGame: Game development.

• PyTorch: Deep learning, tensor computations with GPU support.

• PyBrain: Reinforcement learning and neural networks, beginner-friendly.

These libraries make Python powerful for data science, machine learning, visualization,
and more [1].

Summary Table: Key Concepts and Keywords

| Concept | Key Features/Keywords | Example Code/Explanation |
|---|---|---|
| NumPy ndarray | Homogeneous, shape, dtype, ndim, vectorization | np.array([1]) |
| Slicing arrays in NumPy | start:end:step, views, multidimensional slicing | arr[1:5], arr2d[:,2] |
| Rows/Columns in Pandas | DataFrame, Series, loc, iloc, column selection | df['Name'], df.loc |
| Working with Missing Data | isnull(), notnull(), fillna(), dropna(), NaN | df.isnull(), df.fillna(0) |
| Applying Functions | apply(), axis, custom functions, np.sum, np.mean | df.apply(np.sum, axis=0) |
| NumPy vs Pandas | Speed, structure, use-case, memory, operations | See comparison table above |
| Other Python Libraries | TensorFlow, Matplotlib, SciPy, Scikit-learn, PyTorch | Used for ML, plotting, computation, etc. |

In summary, NumPy and Pandas are essential for efficient data manipulation and
analysis in Python. NumPy's ndarray is optimized for numerical operations, while Pandas'
DataFrame and Series provide powerful tools for handling labeled, tabular data, including
missing values and function application. Understanding these basics, along with the
differences between the libraries and their integration with other Python tools, is
foundational for anyone starting in data science or scientific computing [1].

1. UNIT-5.pptx
