1 - Pca Python Code

The document outlines the application of Principal Component Analysis (PCA) on a dataset containing various variables related to consumer products and demographics. It provides the explained variance ratios for each principal component, along with cumulative variance ratios, and highlights the importance of features in the first six principal components. The results are presented in dataframes for clarity and analysis.

Uploaded by joaogarciacati

PCA

import pandas as pd
from sklearn.decomposition import PCA

# Columns used as PCA inputs.
variables = [
    '_0101_MUEBLES', '_0111_COLCHONES', '_0511_BOLSAS', '_0625_ELECTRICOS',
    '_0628_COMPUTACION', '_0629_TV_Y_VIDEO', '_0632_CELULARES',
    '_0634_LINEA_BLANCA', '_0701_FRAGANCIAS',
    'JOVENES', 'FAMILIAS_C_BEBES_NINOS', 'ADULTOS_S_HIJOS',
    'FAMILIAS_C_ADOLESCENTES_NINOS', 'ADULTOS_MAYORES',
    'TICKET_PROMEDIO', 'LC', 'CLIMA', 'UTILIDAD_ACUM_PCTJ', 'ROTACION_ACUM_PCTJ',
    'VENTA_PROM', 'INVENTARIO_PROM', 'MT2', 'IDH', 'PEA', 'COMPETENCIA'
]

# `consolidado` is assumed to be a pandas DataFrame loaded earlier (not shown here).
consolidado[variables]

# n_components=None keeps every component instead of truncating.
pca = PCA(n_components=None)
pca.fit(consolidado[variables])
explained_variance_ratio = pca.explained_variance_ratio_
print("Explained Variance Ratios (by Principal Components):")
print(explained_variance_ratio)
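The column names suggest inputs measured in quite different units (for example VENTA_PROM, MT2 and IDH), and PCA is sensitive to scale: a column with a much larger variance can dominate the first component. The fit above uses the raw columns. A common alternative is to standardize first, sketched below under stated assumptions. The DataFrame `demo` and its values are invented for illustration and only stand in for `consolidado[variables]`.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in data: two independent columns on very different scales.
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "VENTA_PROM": rng.normal(50_000, 10_000, size=200),
    "IDH": rng.normal(0.75, 0.05, size=200),
})

# PCA on the raw columns: the large-scale column dominates PC1.
raw_ratio = PCA().fit(demo).explained_variance_ratio_

# PCA after rescaling each column to zero mean and unit variance.
scaled = StandardScaler().fit_transform(demo)
scaled_ratio = PCA().fit(scaled).explained_variance_ratio_

print("Raw:   ", raw_ratio)
print("Scaled:", scaled_ratio)
```

Whether standardizing is appropriate depends on the analysis; if the columns of `consolidado` were already scaled upstream, this step would be redundant.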

explained_variance_df = pd.DataFrame({
    'Principal Component': [f'PC{i+1}' for i in range(len(explained_variance_ratio))],
    'Explained Variance Ratio': explained_variance_ratio,
    'Cumulative Variance Ratio': explained_variance_ratio.cumsum()
})

print("Explained Variance Ratio and Cumulative Variance Ratio by Principal Component:")
print(explained_variance_df)
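The cumulative variance ratio is typically used to decide how many components to keep. The sketch below picks the smallest number of components whose cumulative ratio reaches a cutoff. The `ratios` array and the 0.85 `threshold` are assumptions chosen for illustration, not results from the source data.

```python
import numpy as np

# Hypothetical explained-variance ratios, not taken from the source data.
ratios = np.array([0.40, 0.25, 0.15, 0.10, 0.06, 0.04])
cumulative = ratios.cumsum()

threshold = 0.85  # assumed cutoff; choose one that suits the analysis

# Index of the first cumulative ratio >= threshold, converted to a count.
n_keep = int(np.searchsorted(cumulative, threshold)) + 1
print(n_keep)
```

With the real data, `explained_variance_ratio.cumsum()` would take the place of `cumulative`. scikit-learn's `PCA` also accepts a float between 0 and 1 for `n_components` and then selects the number of components itself.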

components = pca.components_
print("Principal Components (each row corresponds to a PC, each column to an original feature):")
print(components)

components_df = pd.DataFrame(
    components,
    columns=variables,
    index=[f'PC{i+1}' for i in range(len(components))]
)

print("\nPrincipal Components with Feature Loadings:")
print(components_df)

# Print the loadings of the first six components, each sorted by signed value
# (largest positive loading first, largest negative last).
for pc in [f'PC{i}' for i in range(1, 7)]:
    importance = components_df.loc[pc].sort_values(ascending=False)
    print(f"\nFeatures sorted by importance in {pc}:")
    print(importance)
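The listings above sort each component's loadings by signed value, so a feature with a strongly negative loading ends up at the bottom even though it contributes as much to the component as a strongly positive one. If "importance" is meant as magnitude, one option is to sort by absolute value while keeping the sign. The sketch below uses a made-up `pc1` Series; its values are illustrative and do not come from the source data.

```python
import pandas as pd

# Hypothetical loadings for one component; the values are illustrative only.
pc1 = pd.Series({"VENTA_PROM": 0.10, "MT2": -0.80, "IDH": 0.55, "PEA": -0.20})

# Order by magnitude but keep the sign for interpretation.
# The `key` argument of sort_values requires pandas 1.1 or later.
pc1_by_magnitude = pc1.sort_values(key=lambda s: s.abs(), ascending=False)
print(pc1_by_magnitude)
```

Applied to the real data, the same call would be made on `components_df.loc['PC1']` and the other rows.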
