Uploaded by

ysourav172

1. Introduction
   o Importance of Data Science
   o Role of Python
   o Overview of NumPy and Pandas
2. Understanding NumPy
   o 2.1 What is NumPy?
   o 2.2 Key Features of NumPy
   o 2.3 Creating Arrays
   o 2.4 Array Operations
   o 2.5 Mathematical Functions
   o 2.6 Broadcasting
   o 2.7 Example Applications
3. Exploring Pandas
   o 3.1 What is Pandas?
   o 3.2 Key Features of Pandas
   o 3.3 Creating DataFrames
   o 3.4 Data Manipulation Techniques
   o 3.5 Handling Missing Data
   o 3.6 Grouping and Aggregating Data
   o 3.7 Example Applications
4. Data Preparation with NumPy and Pandas
   o 4.1 Importance of Data Cleaning
   o 4.2 Using NumPy for Data Preparation
   o 4.3 Using Pandas for Data Cleaning
   o 4.4 Case Study: Real-world Data Cleaning
5. Data Analysis Techniques
   o 5.1 Statistical Analysis with NumPy
   o 5.2 Data Visualization Integration
   o 5.3 Using Pandas for Analysis
   o 5.4 Case Study: Analyzing a Dataset
6. Advanced Features
   o 6.1 Multi-dimensional Arrays in NumPy
   o 6.2 Advanced DataFrame Operations in Pandas
   o 6.3 Time Series Analysis
   o 6.4 Case Study: Time Series Analysis Example
7. Real-world Applications
   o 7.1 Use Cases in Industry
   o 7.2 Comparative Analysis with Other Tools
   o 7.3 Future Trends in Data Science
8. Conclusion
   o Summary of Key Points
   o Importance of Mastering NumPy and Pandas
   o Final Thoughts
9. References

1. Introduction
Importance of Data Science

Data science is an interdisciplinary field that focuses on extracting insights and knowledge
from structured and unstructured data. It combines techniques from statistics, mathematics,
computer science, and domain expertise. The rise of big data has led to an increasing
demand for data scientists who can analyze large datasets to inform decision-making
processes. In industries such as finance, healthcare, retail, and marketing, data science
enables organizations to optimize their operations, predict trends, and enhance customer
experiences.
Role of Python
Python has emerged as one of the most popular programming languages in data science due
to its simplicity and versatility. Its rich ecosystem of libraries and frameworks facilitates tasks
such as data manipulation, analysis, and visualization. Python is preferred for its:
• Readability: Clear syntax makes it easier for beginners to learn and for teams to
collaborate.
• Community support: A vast community means extensive resources, tutorials, and
forums are available.
• Integration capabilities: Python integrates well with other languages and tools,
making it suitable for various applications.
Overview of NumPy and Pandas
NumPy and Pandas are two foundational libraries in Python for data science. NumPy
(Numerical Python) provides support for large, multi-dimensional arrays and matrices, along
with mathematical functions to operate on these arrays. Pandas, on the other hand, offers
data structures and functions specifically designed for data manipulation and analysis,
allowing for efficient handling of structured data.

2. Understanding NumPy
2.1 What is NumPy?
NumPy is a powerful library for numerical computing in Python. It provides support for
multi-dimensional arrays and a collection of mathematical functions to operate on these
arrays. The core data structure in NumPy is the ndarray (N-dimensional array), which allows
for efficient storage and manipulation of numerical data. NumPy serves as the foundation
for many scientific computing tasks in Python.
2.2 Key Features of NumPy
• N-dimensional arrays: NumPy allows the creation of multi-dimensional arrays, which
are essential for complex data manipulation.
• Performance: Operations on NumPy arrays are significantly faster than operations on
traditional Python lists, thanks to optimized C code.
• Comprehensive mathematical functions: NumPy includes functions for linear
algebra, statistical analysis, and more.
• Broadcasting: This feature allows arithmetic operations on arrays of different shapes,
simplifying code and enhancing performance.
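To make the performance point concrete without relying on machine-dependent timings, the following minimal sketch computes a sum of squares twice: once with a Python-level loop and once with a single vectorized NumPy expression. The variable names are illustrative and not part of the original text.

```python
import numpy as np

values = list(range(1, 6))  # plain Python list: [1, 2, 3, 4, 5]

# Pure-Python approach: the interpreter visits each element in turn
loop_result = sum(v * v for v in values)

# NumPy approach: one vectorized expression evaluated in compiled code
arr = np.array(values)
vectorized_result = int(np.sum(arr * arr))

print(loop_result, vectorized_result)  # both are 55
```

Both approaches give the same answer. The vectorized form avoids per-element interpreter overhead, which is where the speed advantage on large arrays usually comes from.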
2.3 Creating Arrays
NumPy provides various methods for creating arrays. Below are some examples:

import numpy as np

# Creating a 1D array from a list
array_1d = np.array([1, 2, 3, 4, 5])
print("1D Array:", array_1d)

# Creating a 2D array from a nested list
array_2d = np.array([[1, 2, 3], [4, 5, 6]])
print("2D Array:\n", array_2d)

# Creating an array of zeros
zeros_array = np.zeros((3, 3))
print("Array of Zeros:\n", zeros_array)

# Creating an array of ones
ones_array = np.ones((2, 3))
print("Array of Ones:\n", ones_array)

# Creating a range of numbers
range_array = np.arange(10)  # Array with values from 0 to 9
print("Range Array:", range_array)
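A few other creation routines are often useful alongside the ones above. The sketch below uses np.linspace, np.full, and the dtype argument; the array names are made up for illustration.

```python
import numpy as np

# Five evenly spaced values from 0 to 1 (both endpoints included)
linspace_array = np.linspace(0, 1, 5)

# A 2x2 array filled with a constant value
full_array = np.full((2, 2), 7)

# Requesting an explicit element type at creation time
float_array = np.array([1, 2, 3], dtype=np.float64)

print("Linspace:", linspace_array)
print("Full array shape:", full_array.shape)
print("Float array dtype:", float_array.dtype)
```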
2.4 Array Operations
NumPy supports various operations such as indexing, slicing, and reshaping:

# Indexing
print("Element at index 1:", array_1d[1]) # Output: 2

# Slicing
print("Sliced Array (from index 1 to 3):", array_1d[1:4]) # Output: [2 3 4]

# Reshaping
reshaped_array = array_2d.reshape((3, 2)) # Changing shape from (2,3) to (3,2)
print("Reshaped Array:\n", reshaped_array)
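The same ideas extend to two dimensions. As a small sketch (the name grid is illustrative), a 2D array can be indexed with a row and column pair, sliced along each axis, and filtered with a boolean mask:

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6]])

# Row 1, column 2 (both indices are zero-based)
element = grid[1, 2]

# Every row, first two columns
first_two_columns = grid[:, :2]

# Boolean mask: keep only the elements greater than 3 (result is 1D)
large_values = grid[grid > 3]

print("Element:", element)
print("First two columns:\n", first_two_columns)
print("Values > 3:", large_values)
```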
2.5 Mathematical Functions
NumPy includes a rich set of mathematical functions. Here are a few examples:

# Mean and standard deviation
mean_value = np.mean(array_1d)
std_value = np.std(array_1d)
print("Mean:", mean_value, "Standard Deviation:", std_value)

# Element-wise operations
squared_array = np.square(array_1d)
print("Squared Array:", squared_array)

# Dot product of two arrays
array_a = np.array([1, 2, 3])
array_b = np.array([4, 5, 6])
dot_product = np.dot(array_a, array_b)
print("Dot Product:", dot_product)
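Most of these reduction functions also accept an axis argument, which matters once the data is two-dimensional. A minimal sketch, with an illustrative matrix:

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])

overall_mean = np.mean(matrix)          # mean of all six values
column_means = np.mean(matrix, axis=0)  # collapse the rows: one value per column
row_means = np.mean(matrix, axis=1)     # collapse the columns: one value per row

print("Overall mean:", overall_mean)
print("Column means:", column_means)
print("Row means:", row_means)
```

Here axis=0 yields [2.5, 3.5, 4.5] and axis=1 yields [2.0, 5.0]: the axis named in the call is the one that disappears from the result.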
2.6 Broadcasting
Broadcasting is a powerful feature that allows NumPy to perform arithmetic operations on
arrays of different shapes. For instance:

# Broadcasting example
array_a = np.array([1, 2, 3])           # shape (3,)
array_b = np.array([[10], [20], [30]])  # shape (3, 1)

# The shapes broadcast to (3, 3): row i of the result is array_a + array_b[i]
result = array_a + array_b
print("Broadcasting Result:\n", result)
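NumPy compares shapes from the trailing dimension backwards. Two dimensions are compatible when they are equal or when one of them is 1. A common practical use is centering each column of a table by subtracting its mean, sketched here with made-up data:

```python
import numpy as np

data = np.array([[1.0, 10.0],
                 [3.0, 20.0],
                 [5.0, 30.0]])   # shape (3, 2)

column_means = data.mean(axis=0)  # shape (2,)

# (3, 2) - (2,): the means are stretched across all three rows
centered = data - column_means

print("Column means:", column_means)
print("Centered data:\n", centered)
```

After centering, every column has a mean of zero, and no explicit loop over rows was needed.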
2.7 Example Applications
NumPy is widely used in various applications, such as:
• Scientific Computing: Simulations, numerical analysis, and scientific research.
• Data Analysis: Preprocessing and transforming data for analysis.
• Machine Learning: Handling large datasets and performing mathematical operations
efficiently.

3. Exploring Pandas
3.1 What is Pandas?
Pandas is an open-source data analysis and manipulation library built on top of NumPy. It
provides two primary data structures: Series (1D) and DataFrame (2D), which are designed
for handling structured data efficiently. Pandas simplifies data manipulation and analysis,
making it a crucial tool for data scientists.

3.2 Key Features of Pandas

• DataFrames: A two-dimensional, size-mutable, potentially heterogeneous tabular
data structure with labeled axes (rows and columns).
• Data manipulation: Functions for cleaning, transforming, and reshaping data.
• Time series support: Built-in support for handling time series data, including
datetime indexing and resampling.
• Integration with other libraries: Works seamlessly with NumPy, Matplotlib, and
other libraries.
3.3 Creating DataFrames
DataFrames can be created from various sources. Below is an example of creating a
DataFrame from a dictionary:

import pandas as pd

# Creating a DataFrame from a dictionary
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [24, 27, 22],
    'City': ['New York', 'Los Angeles', 'Chicago']
}
df = pd.DataFrame(data)
print("DataFrame:\n", df)

You can also create a DataFrame from a CSV file:

# Reading a CSV file into a DataFrame
df_from_csv = pd.read_csv('data.csv')  # Assuming data.csv is a valid file
print("DataFrame from CSV:\n", df_from_csv)
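Because the CSV example depends on a file that may not exist on your machine, here is a self-contained sketch that feeds pd.read_csv an in-memory string through io.StringIO instead. The CSV text is invented for illustration.

```python
import io
import pandas as pd

# A small CSV document held in memory instead of on disk
csv_text = "Name,Age,City\nAlice,24,New York\nBob,27,Los Angeles\n"

# read_csv accepts any file-like object, so StringIO stands in for a file
df_demo = pd.read_csv(io.StringIO(csv_text))

print(df_demo)
print(df_demo.dtypes)  # Age is parsed as an integer column
```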
3.4 Data Manipulation Techniques
Pandas provides numerous functions for manipulating data. Here are some common
techniques:
• Filtering data:

# Filtering rows where Age > 24
filtered_df = df[df['Age'] > 24]
print("Filtered DataFrame:\n", filtered_df)
• Sorting:

# Sorting DataFrame by Age
sorted_df = df.sort_values(by='Age')
print("Sorted DataFrame:\n", sorted_df)
• Adding new columns:

# Adding a new column for job title
df['Job'] = ['Engineer', 'Designer', 'Artist']
print("DataFrame with Job Column:\n", df)
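Filters can be combined. The sketch below rebuilds a small DataFrame (named people here to keep the example self-contained) and selects rows with two conditions while also picking columns through .loc:

```python
import pandas as pd

people = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [24, 27, 22],
    'City': ['New York', 'Los Angeles', 'Chicago'],
})

# Combine conditions with & (and) or | (or); each condition needs parentheses
mask = (people['Age'] > 22) & (people['City'] != 'Los Angeles')

# .loc takes the row mask and a list of column labels in one step
selected = people.loc[mask, ['Name', 'Age']]

print(selected)  # only Alice satisfies both conditions
```

The parentheses are required because & binds more tightly than the comparison operators in Python.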
3.5 Handling Missing Data
Pandas provides robust methods for handling missing data, which is critical for data analysis:

# Creating a DataFrame with missing values
data_with_nan = {
    'Name': ['Alice', 'Bob', None],
    'Age': [24, None, 22],
    'City': ['New York', 'Los Angeles', 'Chicago']
}
df_nan = pd.DataFrame(data_with_nan)

# Checking for missing values
print("Missing Values:\n", df_nan.isnull().sum())

# Filling missing values
df_filled = df_nan.fillna({'Name': 'Unknown', 'Age': df_nan['Age'].mean()})
print("DataFrame with Filled Values:\n", df_filled)
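Filling with a constant or the mean is not the only option. For ordered data, forward filling and linear interpolation are common alternatives. A minimal sketch with invented readings:

```python
import numpy as np
import pandas as pd

readings = pd.Series([10.0, np.nan, 14.0, np.nan, np.nan, 20.0])

# Carry the last known value forward into each gap
forward_filled = readings.ffill()

# Draw a straight line between the known neighbours of each gap
interpolated = readings.interpolate()

print("Forward filled:", forward_filled.tolist())
print("Interpolated:  ", interpolated.tolist())
```

Which strategy is appropriate depends on the data; interpolation assumes the values change smoothly between observations.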
3.6 Grouping and Aggregating Data

Pandas makes it easy to group data and perform aggregations:

# Grouping by City and calculating the average Age
grouped_df = df.groupby('City')['Age'].mean()
print("Grouped DataFrame:\n", grouped_df)

# Aggregating with multiple functions
agg_df = df.groupby('City').agg({'Age': ['mean', 'max'], 'Name': 'count'})
print("Aggregated DataFrame:\n", agg_df)
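The dictionary form of agg produces multi-level column labels. As an alternative sketch, named aggregation gives each result column a flat, readable name. The staff data and the output column names are hypothetical.

```python
import pandas as pd

staff = pd.DataFrame({
    'City': ['New York', 'Chicago', 'New York', 'Chicago'],
    'Age': [24, 30, 28, 40],
})

# Each keyword is an output column: name=(input column, aggregation)
summary = staff.groupby('City').agg(
    mean_age=('Age', 'mean'),
    oldest=('Age', 'max'),
    headcount=('Age', 'count'),
)

print(summary)
```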
3.7 Example Applications
Pandas is widely used in:
• Data Cleaning: Preparing data for analysis by handling missing values and filtering.
• Exploratory Data Analysis (EDA): Analyzing datasets to summarize their main
characteristics.
• Data Visualization: Integrating with libraries like Matplotlib to visualize data
effectively.

4. Data Preparation with NumPy and Pandas

4.1 Importance of Data Cleaning
Data cleaning is a crucial step in the data science process, as real-world data is often messy
and inconsistent. Cleaning data involves identifying and correcting errors or inconsistencies
to improve the quality of the dataset. This step is essential for accurate analysis and
modeling.
4.2 Using NumPy for Data Preparation
NumPy can be employed for data preparation tasks such as transforming and reshaping
data:

# Example: Reshaping an array for analysis
data_array = np.array([[1, 2, 3], [4, 5, 6]])
reshaped_data = data_array.reshape(-1)  # Flattening the array
print("Flattened Data Array:", reshaped_data)
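Transforming data often means rescaling it. The sketch below shows two common rescalings written with plain NumPy arithmetic: min-max scaling to the range [0, 1] and standardization to zero mean and unit standard deviation. The input values are invented.

```python
import numpy as np

raw = np.array([2.0, 4.0, 6.0, 8.0, 10.0])

# Min-max scaling: (x - min) / (max - min)
min_max_scaled = (raw - raw.min()) / (raw.max() - raw.min())

# Standardization (z-score): (x - mean) / standard deviation
standardized = (raw - raw.mean()) / raw.std()

print("Min-max scaled:", min_max_scaled)
print("Standardized:  ", standardized)
```

Both formulas assume the data is not constant; if every value were identical, the denominators would be zero.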
4.3 Using Pandas for Data Cleaning
Pandas excels in data cleaning tasks:

# Dropping rows with missing values
cleaned_df = df_nan.dropna()
print("DataFrame after Dropping Rows with NaN:\n", cleaned_df)

# Replacing specific values
df_replaced = df.replace({'City': {'New York': 'NY'}})
print("DataFrame with Replaced Values:\n", df_replaced)
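Another frequent cleaning task is fixing a numeric column that was read as text. A minimal sketch, using an invented raw_ages column: pd.to_numeric with errors='coerce' converts what it can and marks the rest as missing.

```python
import pandas as pd

raw_ages = pd.Series(['24', '27', 'unknown', ' 31 '])

# Strip stray whitespace, then convert; unparseable entries become NaN
ages = pd.to_numeric(raw_ages.str.strip(), errors='coerce')

print(ages)
print("Unparseable entries:", int(ages.isna().sum()))
```

The resulting NaN values can then be handled with the missing-data techniques from Section 3.5.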
4.4 Case Study: Real-world Data Cleaning
Consider a case study where a company collects customer data with inconsistencies:
1. Data Collection: The dataset includes customer names, ages, and email addresses,
but many entries have missing values or incorrect formats.
2. Data Cleaning Steps:
   o Identify and fill missing values.
   o Normalize formats (e.g., consistent casing for names).
   o Remove duplicates.
3. Pandas Implementation:

# Sample customer data
customer_data = {
    'Name': ['Alice', 'BOB', 'Charlie', None, 'Alice'],
    'Age': [24, None, 22, 29, 24],
    'Email': ['[email protected]', '[email protected]', None, '[email protected]',
              '[email protected]']
}
df_customers = pd.DataFrame(customer_data)

# Cleaning process
df_customers['Name'] = df_customers['Name'].str.title()  # Normalize names
df_customers['Email'] = df_customers['Email'].str.lower()  # Normalize email
# Remove duplicates first so repeated rows do not skew the mean used for filling
df_customers = df_customers.drop_duplicates()
df_customers = df_customers.fillna({'Age': df_customers['Age'].mean()})

print("Cleaned Customer DataFrame:\n", df_customers)

5. Data Analysis Techniques

5.1 Statistical Analysis with NumPy
NumPy provides a range of statistical functions that can be used for data analysis:

# Sample data
data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Basic statistics
mean = np.mean(data)
median = np.median(data)
variance = np.var(data)
standard_deviation = np.std(data)

print("Mean:", mean)
print("Median:", median)
print("Variance:", variance)
print("Standard Deviation:", standard_deviation)
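One detail worth knowing: np.var and np.std compute the population statistic by default, dividing by n. Passing ddof=1 gives the sample statistic, dividing by n - 1. The sketch below shows the difference on the same ten values.

```python
import numpy as np

sample = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

population_var = np.var(sample)      # ddof=0 (default): divide by n
sample_var = np.var(sample, ddof=1)  # ddof=1: divide by n - 1

print("Population variance:", population_var)  # 8.25
print("Sample variance:", sample_var)
```

Pandas takes the opposite default: Series.var() and Series.std() use ddof=1, so the two libraries can report slightly different numbers for the same data unless ddof is set explicitly.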

5.2 Data Visualization Integration

Visualizing data helps in understanding patterns and trends. Pandas integrates well with
Matplotlib for data visualization:

import matplotlib.pyplot as plt

# Sample data
df_plot = pd.DataFrame({'X': range(10), 'Y': np.random.randint(1, 10, size=10)})

# Plotting
plt.figure(figsize=(10, 6))
plt.plot(df_plot['X'], df_plot['Y'], marker='o')
plt.title('Sample Data Plot')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.grid()
plt.show()

5.3 Using Pandas for Analysis

Pandas provides various methods to analyze data:

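# Descriptive statistics
print("Descriptive Statistics:\n", df.describe())

# Correlation matrix (numeric columns only; df also holds text columns,
# which recent pandas versions refuse to correlate)
correlation = df.select_dtypes(include='number').corr()
print("Correlation Matrix:\n", correlation)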
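Since df has only one numeric column, its correlation matrix is not very informative. The following sketch uses a hypothetical measurements table with two numeric columns and one categorical column to show corr alongside value_counts:

```python
import pandas as pd

measurements = pd.DataFrame({
    'hours': [1, 2, 3, 4, 5],
    'score': [52, 54, 56, 58, 60],   # rises linearly with hours
    'group': ['a', 'b', 'a', 'b', 'a'],
})

# Pearson correlation between the numeric columns
correlation = measurements.select_dtypes(include='number').corr()

# Frequency of each category in a text column
group_counts = measurements['group'].value_counts()

print(correlation)
print(group_counts)
```

Because score is an exact linear function of hours, their correlation is 1.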
5.4 Case Study: Analyzing a Dataset
Let’s consider a dataset containing sales data for a retail store:

1. Data Exploration:
   o Load the dataset and examine its structure.
   o Check for missing values and perform cleaning.
2. Analysis:
   o Analyze sales trends over time.
   o Identify top-selling products and customer demographics.
3. Implementation:

# Sample sales data
sales_data = {
    'Date': pd.date_range(start='2023-01-01', periods=10, freq='D'),
    'Sales': [200, 220, 250, 230, 300, 320, 350, 360, 380, 400],
    'Product': ['A', 'B', 'A', 'C', 'B', 'A', 'C', 'B', 'A', 'C']
}
df_sales = pd.DataFrame(sales_data)

# Analyzing sales trends
sales_trend = df_sales.groupby('Date')['Sales'].sum()
print("Sales Trend:\n", sales_trend)

# Plotting sales trend
plt.figure(figsize=(10, 6))
plt.plot(sales_trend.index, sales_trend.values, marker='o')
plt.title('Sales Trend Over Time')
plt.xlabel('Date')
plt.ylabel('Sales Amount')
plt.xticks(rotation=45)
plt.grid()
plt.show()
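The analysis step also mentions identifying top-selling products, which the implementation above does not cover. One possible sketch, reusing the same sample figures in a standalone DataFrame (the names sales, product_totals, and top_product are illustrative):

```python
import pandas as pd

sales = pd.DataFrame({
    'Sales': [200, 220, 250, 230, 300, 320, 350, 360, 380, 400],
    'Product': ['A', 'B', 'A', 'C', 'B', 'A', 'C', 'B', 'A', 'C'],
})

# Total sales per product, largest first
product_totals = (
    sales.groupby('Product')['Sales']
    .sum()
    .sort_values(ascending=False)
)
top_product = product_totals.index[0]

print(product_totals)
print("Top-selling product:", top_product)  # A, with 1150 in total sales
```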

6. Advanced Features
6.1 Multi-dimensional Arrays in NumPy
NumPy supports multi-dimensional arrays, enabling the handling of complex data structures.
For example, a 3D array can represent a collection of images:

# Creating a 3D array
array_3d = np.random.rand(2, 3, 4)  # 2 images of 3x4 pixels
print("3D Array Shape:", array_3d.shape)
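To show how the axes of such an array are used, the sketch below builds a deterministic 3D array (np.arange instead of random values, so the results are reproducible) and reduces it along different axes:

```python
import numpy as np

images = np.arange(24).reshape(2, 3, 4)  # 2 "images" of 3x4 values

first_image = images[0]                    # shape (3, 4)
per_image_mean = images.mean(axis=(1, 2))  # one mean per image
pixel_mean = images.mean(axis=0)           # average across images, shape (3, 4)

print("First image shape:", first_image.shape)
print("Mean of each image:", per_image_mean)
print("Per-pixel mean shape:", pixel_mean.shape)
```

Axis 0 indexes the images, while axes 1 and 2 index the rows and columns inside each image.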
6.2 Advanced DataFrame Operations in Pandas

Pandas offers advanced operations such as pivoting, merging, and joining:

# Creating two DataFrames
df1 = pd.DataFrame({'A': ['foo', 'bar'], 'B': [1, 2]})
df2 = pd.DataFrame({'A': ['foo', 'bar'], 'C': [3, 4]})

# Merging DataFrames
merged_df = pd.merge(df1, df2, on='A')
print("Merged DataFrame:\n", merged_df)
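The text mentions pivoting and joining as well. As a sketch with hypothetical data, pivot_table reshapes long data into a grid, and the how argument of pd.merge controls which unmatched rows are kept:

```python
import pandas as pd

orders = pd.DataFrame({
    'Region': ['East', 'East', 'West', 'West'],
    'Product': ['A', 'B', 'A', 'B'],
    'Units': [10, 15, 7, 12],
})

# Pivoting: one row per Region, one column per Product
pivoted = orders.pivot_table(index='Region', columns='Product',
                             values='Units', aggfunc='sum')

# Left join: keep every row of the left table, even without a match
left = pd.DataFrame({'A': ['foo', 'bar', 'baz'], 'B': [1, 2, 3]})
right = pd.DataFrame({'A': ['foo', 'bar'], 'C': [3, 4]})
left_joined = pd.merge(left, right, on='A', how='left')

print(pivoted)
print(left_joined)  # the 'baz' row has NaN in column C
```

The default how='inner' used in the merge example above would drop the 'baz' row instead.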
6.3 Time Series Analysis

Pandas provides robust support for time series data. Here’s an example of how to handle
time series data:

# Creating a time series DataFrame
date_rng = pd.date_range(start='2023-01-01', end='2023-01-10', freq='D')
ts_df = pd.DataFrame(date_rng, columns=['date'])
ts_df['data'] = np.random.randint(0, 100, size=len(date_rng))
ts_df.set_index('date', inplace=True)

# Resampling the data to daily frequency. The data is already daily, so each
# bin holds a single value and the mean equals that value.
daily_mean = ts_df.resample('D').mean()
print("Daily Mean:\n", daily_mean)
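Resampling becomes more visible when the target frequency is coarser than the data. The following sketch, with invented values, downsamples six daily observations into two-day bins:

```python
import pandas as pd

index = pd.date_range(start='2023-01-01', periods=6, freq='D')
daily = pd.Series([10, 20, 30, 40, 50, 60], index=index)

# Group consecutive days into 2-day bins and average each bin
two_day_mean = daily.resample('2D').mean()

print(two_day_mean)
```

Each output row is labeled with the first day of its bin, and the six daily values collapse into three averages: 15, 35, and 55.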
6.4 Case Study: Time Series Analysis Example
Consider a case study where we analyze stock price data:
1. Data Collection: Obtain historical stock price data from a financial API.
2. Data Cleaning: Handle missing dates and fill gaps.
3. Analysis:
   o Plot stock prices over time.
   o Calculate moving averages to identify trends.
4. Implementation:

# Sample stock price data
stock_data = {
    'Date': pd.date_range(start='2023-01-01', periods=10, freq='D'),
    'Price': [100, 102, 101, 105, 107, 110, 108, 109, 112, 115]
}
df_stock = pd.DataFrame(stock_data)
df_stock.set_index('Date', inplace=True)

# Calculating moving averages
df_stock['MA'] = df_stock['Price'].rolling(window=3).mean()

# Plotting
plt.figure(figsize=(10, 6))
plt.plot(df_stock.index, df_stock['Price'], label='Stock Price', marker='o')
plt.plot(df_stock.index, df_stock['MA'], label='Moving Average', linestyle='--')
plt.title('Stock Price Analysis')
plt.xlabel('Date')
plt.ylabel('Price')
plt.legend()
plt.grid()
plt.show()
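The sample data above has no gaps, so the data cleaning step is not exercised. As one possible sketch of that step, assuming a price series with missing calendar days (the values and names are invented), reindex can insert the missing dates and ffill can carry the last known price forward:

```python
import pandas as pd

prices = pd.Series(
    [100.0, 102.0, 105.0],
    index=pd.to_datetime(['2023-01-01', '2023-01-02', '2023-01-05']),
)

# Build the complete daily calendar between the first and last observation
full_index = pd.date_range(prices.index.min(), prices.index.max(), freq='D')

# Insert the missing dates as NaN, then carry the last known price forward
filled = prices.reindex(full_index).ffill()

# A 3-day moving average; the first two entries have no full window yet
moving_average = filled.rolling(window=3).mean()

print(filled)
print(moving_average)
```

Forward filling is only one choice; interpolation or leaving the gaps as missing may suit other analyses better.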

7. Real-world Applications

7.1 Use Cases in Industry
NumPy and Pandas are widely used across various industries:
• Finance: Risk analysis, portfolio optimization, and algorithmic trading.
• Healthcare: Analyzing patient data, clinical trials, and predicting disease outbreaks.
• Marketing: Customer segmentation, A/B testing, and campaign effectiveness
analysis.
7.2 Comparative Analysis with Other Tools
While NumPy and Pandas are powerful, there are other tools and languages used in data
science:
• R: Known for statistical analysis and visualization.
• SQL: Essential for querying databases and managing structured data.
• Hadoop: Useful for handling large datasets across distributed systems.
7.3 Future Trends in Data Science
The field of data science is rapidly evolving. Some emerging trends include:
• Automated Machine Learning (AutoML): Tools that automate the model selection
and training process.
• Explainable AI: Techniques that provide transparency in machine learning models.
• Real-time Data Processing: The increasing need to process data as it is generated.
8. Conclusion
Summary of Key Points

Python, along with libraries like NumPy and Pandas, plays a crucial role in data science. Their
functionalities allow for efficient data manipulation, analysis, and visualization, making them
essential tools for data scientists.
Importance of Mastering NumPy and Pandas
Proficiency in NumPy and Pandas not only enhances data analysis skills but also provides a
strong foundation for exploring more advanced data science techniques. As data continues
to grow in volume and complexity, these tools will remain vital for extracting meaningful
insights.
Final Thoughts
Mastering NumPy and Pandas opens doors to various opportunities in the field of data
science. With their widespread adoption, becoming skilled in these libraries is a strategic
investment in one’s career.
9. References
1. McKinsey & Company. (2020). The State of AI in 2020.
2. NumPy Documentation. (2023). Retrieved from https://numpy.org/doc/
3. Pandas Documentation. (2023). Retrieved from https://pandas.pydata.org/pandas-docs/stable/
4. VanderPlas, J. (2016). Python Data Science Handbook. O'Reilly Media.
