Aam Unit 4 QB With Answer
3. Decision Trees:
It serves as a comprehensive approach for addressing several issues, such as handling
missing values, detecting outliers, and identifying relevant variables.
It performed effectively during our Data Hackathon as well. Multiple data scientists
employed decision tree algorithms and achieved successful outcomes.
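A minimal sketch of using a decision tree to find relevant variables, assuming scikit-learn and using the iris dataset purely as illustration:

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Illustrative data; any feature matrix X and label vector y would do
X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(random_state=0).fit(X, y)

# feature_importances_ ranks variables by how much they reduce impurity,
# one simple way of finding the relevant variables mentioned above
for name, score in zip(load_iris().feature_names, tree.feature_importances_):
    print(f"{name}: {score:.3f}")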
4. Random Forest:
Random Forest is an ensemble method built from many decision trees.
It is important to note that random forests tend to be biased towards variables with a
larger number of distinct values; that is, they favour numeric variables over binary or
categorical ones.
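Because of this bias, permutation importance is often preferred over the default impurity-based importances. A sketch, again assuming scikit-learn and illustrative data:

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Permutation importance shuffles one feature at a time and measures the
# drop in score, making it less biased toward high-cardinality variables
result = permutation_importance(forest, X, y, n_repeats=10, random_state=0)
for name, score in zip(load_iris().feature_names, result.importances_mean):
    print(f"{name}: {score:.3f}")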
5. Strong Correlation:
Dimensions that are strongly correlated can negatively impact the model's performance.
Furthermore, it is undesirable to have several variables that carry comparable information,
a phenomenon commonly referred to as "multicollinearity".
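A common way to detect this is to scan the absolute correlation matrix for highly correlated pairs. A sketch on made-up data, where the 0.9 threshold is an arbitrary choice:

import numpy as np
import pandas as pd

# Hypothetical data: 'b' is nearly a copy of 'a', so the two are collinear
rng = np.random.default_rng(0)
a = rng.normal(size=100)
df = pd.DataFrame({'a': a,
                   'b': a + rng.normal(scale=0.01, size=100),
                   'c': rng.normal(size=100)})

# Keep only the upper triangle so each pair of variables is checked once
corr = df.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
to_drop = [col for col in upper.columns if (upper[col] > 0.9).any()]
print(to_drop)  # ['b'] -- drop one variable from each correlated pair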
7. Factor Analysis:
There are essentially two approaches to conducting factor analysis: Exploratory Factor
Analysis (EFA) and Confirmatory Factor Analysis (CFA).
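As a rough sketch of the exploratory flavour, scikit-learn's FactorAnalysis can fit a small number of latent factors (iris is used purely as illustration):

from sklearn.datasets import load_iris
from sklearn.decomposition import FactorAnalysis

X, _ = load_iris(return_X_y=True)

# Fit a two-factor model; components_ holds each original variable's
# loading on the latent factors
fa = FactorAnalysis(n_components=2, random_state=0).fit(X)
print(fa.components_)          # factor loadings, shape (2, 4)
print(fa.transform(X).shape)   # samples projected onto the factors: (150, 2)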
Q. Define Correlation:
Correlation refers to the statistical relationship between two or more variables.
The correlation coefficient is a statistical measure that quantifies the direction and
magnitude of the linear relationship between two variables.
A correlation value of 0 indicates the absence of a linear relationship between the two
variables, whereas correlation coefficients of 1 and -1 indicate perfect positive and
negative correlations, respectively.
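These properties are easy to verify numerically, for instance with NumPy's corrcoef on made-up data:

import numpy as np

x = np.arange(10, dtype=float)

# A perfect increasing linear relationship gives +1, a perfect
# decreasing one gives -1, and unrelated data gives a value near 0
print(np.corrcoef(x, 2 * x + 1)[0, 1])   #  1.0
print(np.corrcoef(x, -3 * x + 5)[0, 1])  # -1.0
rng = np.random.default_rng(0)
print(np.corrcoef(x, rng.normal(size=10))[0, 1])  # close to 0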
The principal components in PCA are linear combinations of the original variables that
maximise the amount of variance in the data accounted for. The principal components are
calculated from the correlation matrix of the variables.
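A small sketch of that calculation with NumPy, on made-up data: standardize the variables and eigendecompose their correlation matrix; the eigenvectors are the principal components and the eigenvalues give the variance each one explains.

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X[:, 1] += X[:, 0]  # make two of the variables correlated

# Standardize, then eigendecompose the correlation matrix
Z = (X - X.mean(axis=0)) / X.std(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))

# Sorted by descending eigenvalue, each eigenvector is a principal
# component (a linear combination of the original variables)
order = np.argsort(eigvals)[::-1]
scores = Z @ eigvecs[:, order]         # data in the new coordinates
print(eigvals[order] / eigvals.sum())  # variance explained per component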
The example below applies PCA with scikit-learn to the mushrooms dataset, whose columns
are categorical and must first be encoded:
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.preprocessing import LabelEncoder
from sklearn.decomposition import PCA

# Every column in the mushrooms dataset is categorical, so encode each as integers
m_data = pd.read_csv('mushrooms.csv')
encoder = LabelEncoder()
for col in m_data.columns:
    m_data[col] = encoder.fit_transform(m_data[col])

X_features = m_data.iloc[:, 1:23]  # the 22 feature columns
y_label = m_data.iloc[:, 0]        # the 'class' column

# Visualize the variance explained by each principal component
pca = PCA()
pca.fit_transform(X_features)
pca_variance = pca.explained_variance_
plt.figure(figsize=(8, 6))
plt.bar(range(len(pca_variance)), pca_variance, align='center')
plt.show()

# Keep 17 components and plot two of them against each other
pca2 = PCA(n_components=17)
x_17d = pca2.fit_transform(X_features)
plt.scatter(x_17d[:, 0], x_17d[:, 5], c=y_label)
plt.show()

# Reduce to just the first two principal components
pca3 = PCA(n_components=2)
x_3d = pca3.fit_transform(X_features)
plt.figure(figsize=(8, 6))
plt.scatter(x_3d[:, 0], x_3d[:, 1], c=y_label)
plt.show()