ML LAB - Principal Component Analysis
-RAHUL NABERA M
-15BCE1101
What is PCA?
Principal component analysis (PCA) is a statistical procedure that uses an orthogonal transformation to
convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated
variables called principal components. The number of distinct principal components is equal to the
smaller of the number of original variables or the number of observations minus one. This
transformation is defined in such a way that the first principal component has the largest possible
variance (that is, accounts for as much of the variability in the data as possible), and each succeeding
component in turn has the highest variance possible under the constraint that it is orthogonal to the
preceding components. The resulting vectors are an uncorrelated orthogonal basis set. PCA is sensitive
to the relative scaling of the original variables.
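To make the transformation concrete, here is a small, self-contained NumPy sketch of PCA computed directly from the eigendecomposition of the covariance matrix; the toy data and variable names are illustrative only and are not part of the lab code below.

import numpy as np

# Toy data: 5 observations of 3 possibly correlated variables (illustrative values)
X = np.array([[2.5, 2.4, 0.5],
              [0.5, 0.7, 1.9],
              [2.2, 2.9, 0.4],
              [1.9, 2.2, 0.8],
              [3.1, 3.0, 0.1]])

# Centre each variable (PCA is sensitive to the relative scaling of the variables,
# so in practice the columns are usually standardised as well)
Xc = X - X.mean(axis=0)

# Covariance matrix of the variables
cov = np.cov(Xc, rowvar=False)

# Eigendecomposition: eigenvectors are the orthogonal principal directions,
# eigenvalues are the variances captured along them
eigvals, eigvecs = np.linalg.eigh(cov)

# Sort the components by decreasing explained variance
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Orthogonal transformation: project the centred data onto the principal components
scores = Xc @ eigvecs

print('explained variance ratio:', eigvals / eigvals.sum())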
CODE:
# Importing the libraries and the dataset
import pandas as pd

dataset = pd.read_csv('log.csv', header=None)
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1].values

# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=0)

# Feature Scaling (PCA is sensitive to the scale of the variables)
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

# PCA: project the scaled features onto the first 7 principal components
from sklearn.decomposition import PCA
pca = PCA(n_components=7)
X_train = pca.fit_transform(X_train)
X_test = pca.transform(X_test)
Explained = pca.explained_variance_ratio_

# Fitting a classifier on the reduced features
# (no classifier is defined in the original listing; a logistic regression is assumed here)
from sklearn.linear_model import LogisticRegression
classifier = LogisticRegression(random_state=0)
classifier.fit(X_train, y_train)

print('accuracy train:{:.3f}'.format(classifier.score(X_train, y_train)))
print('accuracy test:{:.3f}'.format(classifier.score(X_test, y_test)))
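The array Explained holds the share of the total variance captured by each of the 7 retained components, which is the usual basis for choosing n_components. A minimal sketch of that check follows; X_train_scaled stands for the scaled training features before the 7-component projection above and does not appear in the original listing.

import numpy as np
from sklearn.decomposition import PCA

# Per-component and cumulative share of variance captured by the 7 retained components
print(Explained)
print(np.cumsum(Explained))

# Alternatively, passing a fraction to n_components makes PCA keep just enough
# components to explain that share of the variance (here ~95%)
pca_95 = PCA(n_components=0.95)
X_train_95 = pca_95.fit_transform(X_train_scaled)
print('components kept:', pca_95.n_components_)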
RESULTS:
Explained variance ratios of the principal components:
ACCURACY:
accuracy train:0.791
accuracy test:0.684