Python Assignment-2

The document presents a Python assignment involving data manipulation using NumPy and Pandas. It includes code to create a 3x3 NumPy array, extract specific rows and columns, and handle missing values in a Pandas DataFrame. The report concludes with observations on the data processing techniques applied and their relevance to real-world data analysis.

NumPy Assignment
Code Implementation:
import numpy as np

# 1. Create a 3x3 NumPy array of random integers from 10 to 99
#    (np.random.randint excludes the upper bound, 100)
array_3x3 = np.random.randint(10, 100, (3, 3))
print("3x3 Random Integer Array:\n", array_3x3)

# 2. Extract the second row and third column
second_row = array_3x3[1, :]
third_column = array_3x3[:, 2]
print("\nSecond Row:", second_row)
print("Third Column:", third_column)

# 3. Flatten the array into a 1D array (flatten() returns a copy)
array_1D = array_3x3.flatten()
print("\nReshaped 1D Array:", array_1D)

Sample Output:
3x3 Random Integer Array:
[[44 89 15]
 [11 54 52]
 [49 45 63]]

Second Row: [11 54 52]
Third Column: [15 52 63]

Reshaped 1D Array: [44 89 15 11 54 52 49 45 63]
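
The array above changes on every run because np.random.randint draws from an unseeded global state. A minimal sketch of a reproducible variant follows, assuming NumPy's Generator API is acceptable for the assignment. The helper name make_array and the seed value 42 are illustrative and not part of the original code.

```python
import numpy as np

def make_array(seed):
    """Return a 3x3 array of integers in [10, 100) drawn from a seeded generator."""
    rng = np.random.default_rng(seed)
    # Like randint, integers() excludes the upper bound unless endpoint=True is passed.
    return rng.integers(10, 100, size=(3, 3))

first = make_array(42)
second = make_array(42)
print(first)
print("Same seed, same array:", np.array_equal(first, second))
```

Seeding makes the sample output repeatable, which helps when a report has to match the code that produced it.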

Pandas Assignment
Code Implementation:
import pandas as pd
import numpy as np

# Create a DataFrame with employee details
data = {
    "Name": ["Alice", "Bob", "Charlie", "David", "Eve"],
    "Age": [25, 30, 22, 27, np.nan],  # missing value in Age
    "Department": ["HR", "IT", "HR", "IT", "Finance"],
    "Salary": [50000, 60000, 55000, np.nan, 65000],  # missing value in Salary
}

df = pd.DataFrame(data)

# Display the first 3 rows
print("First 3 rows of the DataFrame:\n", df.head(3))

# Filter employees whose salary is greater than ₹50,000
filtered_df = df[df["Salary"] > 50000]
print("\nEmployees with Salary > 50,000:\n", filtered_df)

# Sort the DataFrame by Age in ascending order
sorted_df = df.sort_values(by="Age")
print("\nSorted DataFrame by Age:\n", sorted_df)

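# Handling Missing Data
# Fill missing "Age" values with the mean Age. The result is assigned back
# because fillna(..., inplace=True) on a column selection triggers a warning
# in recent pandas versions and does not update df under copy-on-write.
df["Age"] = df["Age"].fillna(df["Age"].mean())

# Drop rows where "Salary" is missing
df_cleaned = df.dropna(subset=["Salary"])
print("\nDataFrame after handling missing values:\n", df_cleaned)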

# Group employees by "Department" and find the average salary
grouped_df = df_cleaned.groupby("Department")["Salary"].mean()
print("\nAverage Salary by Department:\n", grouped_df)

Sample Output:
First 3 rows of the DataFrame:
      Name   Age  Department   Salary
0    Alice  25.0          HR  50000.0
1      Bob  30.0          IT  60000.0
2  Charlie  22.0          HR  55000.0

Employees with Salary > 50,000:
      Name   Age  Department   Salary
1      Bob  30.0          IT  60000.0
2  Charlie  22.0          HR  55000.0
4      Eve   NaN     Finance  65000.0

Sorted DataFrame by Age:
      Name   Age  Department   Salary
2  Charlie  22.0          HR  55000.0
0    Alice  25.0          HR  50000.0
3    David  27.0          IT      NaN
1      Bob  30.0          IT  60000.0
4      Eve   NaN     Finance  65000.0

DataFrame after handling missing values:
      Name   Age  Department   Salary
0    Alice  25.0          HR  50000.0
1      Bob  30.0          IT  60000.0
2  Charlie  22.0          HR  55000.0
4      Eve  26.0     Finance  65000.0

Average Salary by Department:
Department
Finance    65000.0
HR         52500.0
IT         60000.0
Name: Salary, dtype: float64
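
The averages can be cross-checked, and extended with a per-department count, in a single call. This sketch rebuilds only the two columns it needs from the data defined earlier; the use of agg and the name summary are illustrative additions, not part of the original code. Both mean and count skip NaN, so the figures come out the same whether or not David's row is dropped first.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Department": ["HR", "IT", "HR", "IT", "Finance"],
    "Salary": [50000, 60000, 55000, np.nan, 65000],
})

# "mean" and "count" both ignore NaN, so IT is averaged over one known salary
summary = df.groupby("Department")["Salary"].agg(["mean", "count"])
print(summary)
```

The count column makes it visible that the IT average rests on a single salary, which the mean alone would hide.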

Report Analysis
NumPy Observations:

• A 3×3 matrix of random integers from 10 to 99 was generated (np.random.randint excludes the upper bound, 100).
• The second row and third column were extracted by slicing.
• The 3×3 matrix was reshaped into a 1D array.
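
The slicing and flattening steps differ in whether they copy data, which matters once the results are modified. A small sketch on a fixed array follows; the values 1 to 9 are chosen for illustration and are not the assignment's random output.

```python
import numpy as np

arr = np.arange(1, 10).reshape(3, 3)

second_row = arr[1, :]      # basic slicing returns a view of arr
flat_copy = arr.flatten()   # flatten() always returns a copy
flat_view = arr.ravel()     # ravel() returns a view when the memory layout allows it

flat_copy[0] = 100          # does not touch arr
second_row[0] = -4          # writes through to arr[1, 0]

print(arr[0, 0])                          # 1
print(arr[1, 0])                          # -4
print(np.shares_memory(arr, flat_copy))   # False
print(np.shares_memory(arr, flat_view))   # True
```

Using flatten(), as the assignment does, is the safer choice when the 1D result will be edited independently of the original matrix.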

Pandas Observations:

• The initial DataFrame contained missing values for Age and Salary.
• Employees with Salary > ₹50,000 were filtered.
• Data was sorted by Age in ascending order.
• Missing Age values were replaced with the mean, and rows with missing Salary were
removed.
• The average Salary for each Department was computed, with Finance having the
highest and HR the lowest.
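
Dropping a row also discards that employee's other fields (here, David's age and department). One alternative, shown as a sketch and not as the assignment's method, is to fill a missing Salary with the mean of that employee's department using groupby(...).transform. Whether imputation is appropriate depends on the analysis; the variable name dept_mean is illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David", "Eve"],
    "Age": [25, 30, 22, 27, np.nan],
    "Department": ["HR", "IT", "HR", "IT", "Finance"],
    "Salary": [50000, 60000, 55000, np.nan, 65000],
})

# Fill the missing Age with the overall mean, assigning the result back to the column
df["Age"] = df["Age"].fillna(df["Age"].mean())

# transform("mean") returns a Series aligned to df's rows,
# holding the mean Salary of each row's department
dept_mean = df.groupby("Department")["Salary"].transform("mean")
df["Salary"] = df["Salary"].fillna(dept_mean)

print(df)
```

With this data, David's Salary becomes 60000.0 (the only other IT salary), all five rows are kept, and the department averages are unchanged.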

Conclusion:

This assignment demonstrated essential data manipulation techniques using NumPy and Pandas.
NumPy was used for array creation, slicing, and flattening, while Pandas was used to work with
tabular employee data. Filtering, sorting, filling or dropping missing values, and grouped
aggregation were applied, all of which are routine steps in real-world data analysis.
