Python Assignment-2
NumPy Assignment
Code Implementation:
import numpy as np
# 1. Create a 3x3 NumPy array with random integers from 10 to 99 (randint excludes the upper bound, 100)
array_3x3 = np.random.randint(10, 100, (3, 3))
print("3x3 Random Integer Array:\n", array_3x3)
# 2. Extract the second row and third column
second_row = array_3x3[1, :]
third_column = array_3x3[:, 2]
print("\nSecond Row:", second_row)
print("Third Column:", third_column)
# 3. Reshape the array into a 1D array
array_1D = array_3x3.flatten()
print("\nReshaped 1D Array:", array_1D)
Sample Output:
3x3 Random Integer Array:
[[44 89 15]
 [11 54 52]
 [49 45 63]]
Second Row: [11 54 52]
Third Column: [15 52 63]
Reshaped 1D Array: [44 89 15 11 54 52 49 45 63]
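The sample output above will differ on every run because the array is drawn at random. The following is a minimal sketch, assuming NumPy's newer Generator API is acceptable for this assignment, that makes the run repeatable and also lets 100 itself be drawn. The seed value 42 and the names rng and repeat are illustrative choices, not part of the original code.

```python
import numpy as np

# Seed 42 is an arbitrary, illustrative choice; any fixed seed gives a repeatable run
rng = np.random.default_rng(42)

# endpoint=True makes the upper bound inclusive, so values range from 10 to 100
array_3x3 = rng.integers(10, 100, size=(3, 3), endpoint=True)
print("3x3 Random Integer Array:\n", array_3x3)

# A second generator built from the same seed reproduces the same array
repeat = np.random.default_rng(42).integers(10, 100, size=(3, 3), endpoint=True)
print("Same array on repeat:", np.array_equal(array_3x3, repeat))
```

Removing the seed restores a different array on each run.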
Pandas Assignment
Code Implementation:
import pandas as pd
import numpy as np
# Creating a DataFrame with employee details
data = {
    "Name": ["Alice", "Bob", "Charlie", "David", "Eve"],
    "Age": [25, 30, 22, 27, np.nan],  # Including a missing value in Age
    "Department": ["HR", "IT", "HR", "IT", "Finance"],
    "Salary": [50000, 60000, 55000, np.nan, 65000],  # Including a missing value in Salary
}
df = pd.DataFrame(data)
# Display the first 3 rows
print("First 3 rows of the DataFrame:\n", df.head(3))
# Filtering employees whose salary is greater than ₹50,000
filtered_df = df[df["Salary"] > 50000]
print("\nEmployees with Salary > 50,000:\n", filtered_df)
# Sorting the DataFrame by Age in ascending order
sorted_df = df.sort_values(by="Age")
print("\nSorted DataFrame by Age:\n", sorted_df)
# Handling Missing Data
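# Fill missing "Age" values with the mean Age (mean() skips NaN).
# Assign the result back: fillna(inplace=True) on a selected column raises a
# warning in recent pandas and does not update df under Copy-on-Write.
df["Age"] = df["Age"].fillna(df["Age"].mean())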
# Drop rows where "Salary" is missing
df_cleaned = df.dropna(subset=["Salary"])
print("\nDataFrame after handling missing values:\n", df_cleaned)
# Group employees by "Department" and find the average salary
grouped_df = df_cleaned.groupby("Department")["Salary"].mean()
print("\nAverage Salary by Department:\n", grouped_df)
Sample Output:
First 3 rows of the DataFrame:
      Name   Age Department   Salary
0    Alice  25.0         HR  50000.0
1      Bob  30.0         IT  60000.0
2  Charlie  22.0         HR  55000.0
Employees with Salary > 50,000:
      Name   Age Department   Salary
1      Bob  30.0         IT  60000.0
2  Charlie  22.0         HR  55000.0
4      Eve   NaN    Finance  65000.0
Sorted DataFrame by Age:
      Name   Age Department   Salary
2  Charlie  22.0         HR  55000.0
0    Alice  25.0         HR  50000.0
3    David  27.0         IT      NaN
1      Bob  30.0         IT  60000.0
4      Eve   NaN    Finance  65000.0
DataFrame after handling missing values:
      Name   Age Department   Salary
0    Alice  25.0         HR  50000.0
1      Bob  30.0         IT  60000.0
2  Charlie  22.0         HR  55000.0
4      Eve  26.0    Finance  65000.0
Average Salary by Department:
Department
Finance    65000.0
HR         52500.0
IT         60000.0
Name: Salary, dtype: float64
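Two results in the output above depend on how pandas treats NaN: a comparison against NaN is False, so David drops out of the salary filter, and mean() skips NaN, so the fill value for Age is (25 + 30 + 22 + 27) / 4 = 26.0. The following sketch checks both on the same data. The names missing_before, mask, and mean_age are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David", "Eve"],
    "Age": [25, 30, 22, 27, np.nan],
    "Salary": [50000, 60000, 55000, np.nan, 65000],
})

# Count missing values per column before any cleaning
missing_before = df.isna().sum()
print("Missing values per column:\n", missing_before)

# NaN > 50000 evaluates to False, so David's row is excluded by the filter
mask = df["Salary"] > 50000
print("\nFilter mask:", mask.tolist())

# mean() ignores NaN, so only the four known ages are averaged
mean_age = df["Age"].mean()
df["Age"] = df["Age"].fillna(mean_age)
print("\nMean Age used for filling:", mean_age)
```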
Report Analysis
NumPy Observations:
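• A 3×3 matrix of random integers was generated; because np.random.randint excludes its upper bound, the values range from 10 to 99.
• The second row (index 1) and third column (index 2) were extracted by slicing.
• The 3×3 matrix was flattened into a 1D array of nine elements in row-major order.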
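The flattening step can be done in more than one way, and the choice affects whether the result shares memory with the original array. The following is a small sketch of that difference, using fixed values instead of random ones so the result is predictable. The names a, flat_copy, and flat_view are illustrative.

```python
import numpy as np

# Fixed values 10..18 so the outcome does not depend on a random draw
a = np.arange(10, 19).reshape(3, 3)

flat_copy = a.flatten()    # flatten() always returns a new, independent array
flat_view = a.reshape(-1)  # reshape(-1) returns a view when the data is contiguous, as here

flat_copy[0] = -1  # changes only the copy
flat_view[1] = -2  # writes through to a

print(a[0, 0], a[0, 1])  # 10 -2
print(np.shares_memory(a, flat_copy), np.shares_memory(a, flat_view))  # False True
```

flatten() is the safer choice when the original array must stay unchanged, which is the case in the code above.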
Pandas Observations:
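• The initial DataFrame contained one missing Age (Eve) and one missing Salary (David).
• Filtering on Salary > ₹50,000 kept Bob, Charlie, and Eve; Alice (exactly ₹50,000) and David (missing Salary) were excluded.
• Data was sorted by Age in ascending order, with the row missing an Age placed last.
• The missing Age was replaced with the mean Age (26.0), and the row with the missing Salary (David) was removed.
• The average Salary for each Department was computed, with Finance having the highest (65,000) and HR the lowest (52,500). The Finance and IT averages each rest on a single employee.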
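Because a group mean skips NaN by default, the department averages can also be computed without first dropping David's row. The following sketch, on the same Department and Salary data, reports the mean next to the number of salaries each mean is based on. The name summary is illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Department": ["HR", "IT", "HR", "IT", "Finance"],
    "Salary": [50000, 60000, 55000, np.nan, 65000],
})

# mean skips NaN; count reports how many non-missing salaries each group has
summary = df.groupby("Department")["Salary"].agg(["mean", "count"])
print(summary)

# Departments with the highest and lowest average salary
print("Highest:", summary["mean"].idxmax())
print("Lowest:", summary["mean"].idxmin())
```

The count column makes it visible that the IT and Finance averages come from one salary each, so the ranking should be read with that in mind.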
Conclusion:
This assignment applied basic data manipulation techniques with NumPy and Pandas.
NumPy handled array creation, row and column indexing, and flattening, while Pandas handled tabular data.
Filtering, sorting, missing-value handling, and grouped aggregation were each applied to the employee DataFrame,
covering the routine steps of cleaning and summarizing a dataset before analysis.