Set-C AnsKey CT2
Number
Part – B (4 x 5 = 20 Marks)
Instructions: Answer ANY FOUR questions.
Q. No  Question  Marks  BL  CO  PO  PI Code
11 Explain the difference between reshaping, pivoting, and concatenating 5 2 3 5
datasets using pandas.
Ans:
Reshaping: Changing the structure of data (e.g., melt()
converts wide to long format).
Pivoting: Converting long data into a wide format (e.g.,
pivot() makes a column's values into new columns).
Concatenating: Combining multiple datasets along rows or
columns (e.g., concat()).
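The three operations can be sketched on a small frame as follows. This is a minimal illustration, not part of the original key: the student/subject data and all variable names are hypothetical.

```python
import pandas as pd

# Hypothetical wide-format data: one row per student, one column per subject
wide = pd.DataFrame({
    "student": ["A", "B"],
    "math": [90, 75],
    "physics": [85, 80],
})

# Reshaping: melt() turns the subject columns into rows (wide -> long)
long = wide.melt(id_vars="student", var_name="subject", value_name="score")

# Pivoting: pivot() spreads the 'subject' values back into columns (long -> wide)
pivoted = long.pivot(index="student", columns="subject", values="score")

# Concatenating: concat() stacks two frames with the same columns row-wise
extra = pd.DataFrame({"student": ["C"], "math": [60], "physics": [70]})
combined = pd.concat([wide, extra], ignore_index=True)

print(long.shape)      # (4, 3)
print(pivoted.shape)   # (2, 2)
print(combined.shape)  # (3, 3)
```

Passing axis=1 to concat() would join the frames side by side (column-wise) instead of stacking rows.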
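import numpy as np
import matplotlib.pyplot as plt
# Create data
X = np.linspace(-5, 5, 100)
Y = np.linspace(-5, 5, 100)
X, Y = np.meshgrid(X, Y)
Z = np.sin(np.sqrt(X**2 + Y**2))
# Create a 3D axes and plot the surface
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot_surface(X, Y, Z, cmap='viridis')
# Customize labels
ax.set_xlabel('X axis')
ax.set_ylabel('Y axis')
ax.set_zlabel('Z axis')
# Show plot
plt.show()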
Customization Options:
cmap: Color map for the surface (e.g., 'viridis', 'plasma').
ax.plot_surface(): Accepts further options such as edgecolor (color of
the mesh lines) and alpha (transparency).
15 Use Seaborn to create a pairplot and customize its style using 5 3 5 5
sns.set_style() on iris dataset. What insights can a pairplot provide?
Ans:
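import seaborn as sns
import matplotlib.pyplot as plt
iris = sns.load_dataset('iris')  # Load the iris dataset
sns.set_style('whitegrid')  # Customize the plot style
# Create a pairplot
sns.pairplot(iris, hue='species')
plt.show()
Insights: A pairplot shows pairwise scatter plots of all numeric variables,
with each variable's distribution on the diagonal. This helps reveal
correlations, separation between species, and outliers.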
Part – C (2 x 10 = 20 Marks)
Instructions: Answer ALL questions.
Q. No  Question  Marks  BL  CO  PO  PI Code
16 Describe and compare various techniques used to clean and prepare 10 2 3 5
raw datasets for analysis. Include examples of handling missing
data, standardization, string cleaning, and binning. Give Python
code examples of each.
import pandas as pd
df = pd.DataFrame({'A': [1, 2, None, 4], 'B': [5, None, 7, 8]})
df_cleaned = df.dropna() # Remove rows with any missing values
o Impute missing data:
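Imputation, together with the string cleaning and binning steps the question asks for, can be sketched as below. This is an illustrative sketch only: the 'City' column, the choice of mean imputation, and the bin edges are assumptions, not part of the original key.

```python
import pandas as pd

# The toy frame from the dropna() example, plus a hypothetical text column
df = pd.DataFrame({
    "A": [1, 2, None, 4],
    "B": [5, None, 7, 8],
    "City": [" delhi", "MUMBAI ", "Chennai", " delhi "],
})

# Impute: replace missing numeric values with the column mean
df_imputed = df.copy()
num_cols = ["A", "B"]
df_imputed[num_cols] = df_imputed[num_cols].fillna(df_imputed[num_cols].mean())

# String cleaning: trim whitespace and normalise capitalisation
df_imputed["City"] = df_imputed["City"].str.strip().str.title()

# Binning: cut column A into two labelled intervals (bin edges are arbitrary)
df_imputed["A_bin"] = pd.cut(df_imputed["A"], bins=[0, 2, 4], labels=["low", "high"])

print(df_imputed)
```

Dropping rows discards information, while imputation keeps every row at the cost of introducing estimated values; which is preferable depends on how much data is missing.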
(OR)
import pandas as pd
import numpy as np
# Remove outliers
df = df[(df['Age'] >= lower_bound_age) & (df['Age'] <= upper_bound_age)]
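The key does not show how lower_bound_age and upper_bound_age are obtained. One common choice, assumed here purely for illustration, is the 1.5 × IQR rule; the ages below are hypothetical.

```python
import pandas as pd

# Hypothetical ages; 120 is an obvious outlier
df = pd.DataFrame({"Age": [22, 25, 27, 30, 31, 35, 120]})

# IQR rule (an assumed choice; the source does not show how the bounds were set)
q1 = df["Age"].quantile(0.25)
q3 = df["Age"].quantile(0.75)
iqr = q3 - q1
lower_bound_age = q1 - 1.5 * iqr
upper_bound_age = q3 + 1.5 * iqr

# Keep only rows whose Age falls inside the bounds
df = df[(df["Age"] >= lower_bound_age) & (df["Age"] <= upper_bound_age)]
print(df["Age"].tolist())  # [22, 25, 27, 30, 31, 35]
```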
4. Numeric Scaling (Standardization):
Standardize numeric columns like 'Age' and 'Salary' to
have zero mean and unit variance.
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
df[['Age', 'Salary']] = scaler.fit_transform(df[['Age', 'Salary']])
Final Dataframe:
print(df)
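To see what standardization computes, the same result can be sketched by hand in pandas. The values below are hypothetical; ddof=0 is used because StandardScaler divides by the population standard deviation.

```python
import pandas as pd

# Hypothetical values for the 'Age' and 'Salary' columns
df = pd.DataFrame({
    "Age": [25, 30, 35, 40],
    "Salary": [30000, 40000, 50000, 60000],
})

# z = (x - mean) / std, applied column by column
cols = ["Age", "Salary"]
z = (df[cols] - df[cols].mean()) / df[cols].std(ddof=0)

# After scaling, each column has mean ~0 and (population) std ~1
print(z.mean())
print(z.std(ddof=0))
```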
1. Axis Limits:
Use xlim() and ylim() to set the visible range of each axis.
plt.plot(x, y)
plt.xlim(0, 5) # Set x-axis limit
plt.ylim(0, 20) # Set y-axis limit
plt.show()
2. Adding Labels and Title:
xlabel(), ylabel(), and title() are used to add labels and
titles.
plt.plot(x, y)
plt.xlabel('X-axis Label')
plt.ylabel('Y-axis Label')
plt.title('Plot Title')
plt.show()
3. Legends:
Use legend() to add a legend to the plot. You can label
your plots during plotting and then call legend().
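A minimal legend sketch, with illustrative data that is not from the original key: each plot() call receives a label, and legend() collects those labels.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove to display interactively
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]              # illustrative data, not from the source
y_sq = [v ** 2 for v in x]
y_lin = [2 * v for v in x]

plt.figure()
plt.plot(x, y_sq, label="y = x^2")   # label is the text the legend displays
plt.plot(x, y_lin, label="y = 2x")
legend = plt.legend(loc="upper left")

print([t.get_text() for t in legend.get_texts()])  # ['y = x^2', 'y = 2x']
```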
4. Annotations:
Use annotate() to mark a point of interest with text and an
arrow.
plt.plot(x, y)
plt.annotate('Peak', xy=(2, 4), xytext=(3, 5),
             arrowprops=dict(color='red', arrowstyle="->"))
plt.show()
5. Applying Plot Styles:
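Use plt.style.use() to apply predefined styles such as
'ggplot'. (The 'seaborn' style names were renamed to 'seaborn-v0_8'
in Matplotlib 3.6; plt.style.available lists the valid names.)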
plt.style.use('ggplot')
plt.plot(x, y)
plt.show()
plt.boxplot([1, 2, 3, 4, 5, 6, 7])
plt.show()
5. Scatter Plot:
o Use-case: Displays relationships between two
variables, useful for correlation analysis.
o Example: Visualizing the relationship between
height and weight.
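The height-and-weight example can be sketched as follows. The measurements are made up for illustration, and the correlation coefficient is added only to quantify the relationship the plot shows.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove to display interactively
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical measurements
height_cm = np.array([150, 160, 165, 172, 180, 188])
weight_kg = np.array([50, 56, 61, 68, 77, 85])

plt.figure()
points = plt.scatter(height_cm, weight_kg)
plt.xlabel("Height (cm)")
plt.ylabel("Weight (kg)")
plt.title("Height vs. Weight")

# Pearson correlation summarises the linear trend visible in the scatter
r = np.corrcoef(height_cm, weight_kg)[0, 1]
print(f"Pearson r = {r:.2f}")
```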
(OR)
17 b Apply advanced Seaborn visualizations to explore patterns in a real 10 3 5 5
dataset. Include pair plots, heatmaps, and style settings. Write a
Python program to visualize a 3D surface plot. Explain each
component used in the plot.
Ans:
import seaborn as sns
import matplotlib.pyplot as plt
# Load dataset
iris = sns.load_dataset('iris')
# Set style
sns.set_style("whitegrid")
# Pair plot
sns.pairplot(iris, hue='species')
plt.show()
Explanation:
sns.set_style(): Sets plot background style.
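The question also asks for a heatmap. A minimal sketch is given below, assuming a correlation heatmap is what is wanted; it uses a synthetic stand-in frame with made-up column names so it runs without downloading a dataset. With iris, the same call would take the correlation matrix of its numeric columns.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove to display interactively
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for a numeric dataset; column names are hypothetical
rng = np.random.default_rng(0)
length = rng.normal(5.0, 1.0, 100)
data = pd.DataFrame({
    "length": length,
    "width": 0.5 * length + rng.normal(0.0, 0.2, 100),
    "noise": rng.normal(0.0, 1.0, 100),
})

corr = data.corr()  # pairwise Pearson correlations

sns.set_style("white")
ax = sns.heatmap(corr, annot=True, cmap="coolwarm", vmin=-1, vmax=1)
ax.set_title("Correlation heatmap")
```

Here annot=True prints each coefficient in its cell, and fixing vmin and vmax to -1 and 1 keeps the color scale comparable across datasets.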
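import numpy as np
import matplotlib.pyplot as plt
# Create data (surface z = sin(sqrt(x^2 + y^2)))
X, Y = np.meshgrid(np.linspace(-5, 5, 100), np.linspace(-5, 5, 100))
Z = np.sin(np.sqrt(X**2 + Y**2))
# Create 3D plot
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
# Plot surface
surf = ax.plot_surface(X, Y, Z, cmap='viridis')
# Add labels
ax.set_xlabel('X Axis')
ax.set_ylabel('Y Axis')
ax.set_zlabel('Z Axis')
plt.title('3D Surface Plot')
plt.show()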
Explanation:
projection='3d': Creates an Axes3D object, which enables 3D plotting.
[Figure: CO Coverage bar chart for CO 1, CO 2, and CO 3; y-axis from 0 to 60; bar labels 53 %, 26 %, 21 %]