Phase 2
Introduction
Objectives
Cleanse the dataset: Data cleaning involves handling missing values and
outliers to ensure data integrity and accuracy.
Develop a recommendation engine: Train and deploy a
recommendation model to deliver personalized product
recommendations.
Dataset Description
System Architecture
Recommendation Engine: Generate personalized product
recommendations for users based on their preferences and
interactions. Deploy the recommendation engine on the e-commerce
platform.
1. Data Description:
Head: Displaying the first few rows of the dataset to get an initial
overview.
CODE
import pandas as pd
import numpy as np
np.random.seed(0)
data = pd.DataFrame({
'product_id': np.random.randint(1, 50, 100),
})
y = data['interaction_type']
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
import matplotlib.pyplot as plt
import seaborn as sns
plt.figure(figsize=(10, 6))
plt.xlabel('Interaction Type')
plt.ylabel('Frequency')
plt.legend()
plt.show()
OUTPUT
2. Data Cleaning:
CODE
import pandas as pd
import numpy as np
data_dict = {
data = pd.DataFrame(data_dict)
data.dropna(inplace=True)
threshold = 3
data
OUTPUT
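The cleaning step above (dropping missing values, then filtering outliers against a threshold of 3) can be sketched as follows. This is a minimal illustration, not the original code: the `raw` frame, its `purchase_amount` column, and the use of a z-score test are assumptions, since the source only shows `dropna` and `threshold = 3`.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: column names and values are illustrative only.
raw = pd.DataFrame({
    "user_id": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
    "purchase_amount": [20.0, 22.0, 19.0, 21.0, np.nan, 20.5,
                        18.5, 21.5, 19.5, 20.0, 22.5, 500.0],
})

# Step 1: drop rows that contain missing values.
cleaned = raw.dropna()

# Step 2: drop rows whose absolute z-score exceeds the threshold
# (assumed here to be the meaning of `threshold = 3`).
threshold = 3
col = cleaned["purchase_amount"]
z_scores = (col - col.mean()) / col.std(ddof=0)
cleaned = cleaned[z_scores.abs() <= threshold]

print(cleaned)
```

In this sample the row with the missing amount and the row with the extreme amount are both removed. Note that a z-score filter needs enough rows to work: with very small samples no point can reach a z-score of 3.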
CODE
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
np.random.seed(0)
n = 1000
data = pd.DataFrame({
})
plt.figure(figsize=(10, 6))
sns.countplot(x='interaction_type', data=data)
plt.xlabel('Interaction Type')
plt.ylabel('Frequency')
plt.show()
OUTPUT
4. Feature Engineering
CODE
import pandas as pd
data = {
}
data = pd.DataFrame(data)
user_profiles = data.groupby('user_id').size().to_frame('num_interactions')
data['timestamp'] = pd.to_datetime(data['timestamp'])
data['hour_of_day'] = data['timestamp'].dt.hour
data['product_category'] = data['product_name'].apply(lambda x: x.split()[0])
data['product_brand'] = data['product_name'].apply(lambda x: x.split()[1])
data['product_popularity'] = data.groupby('product_name')['user_id'].transform('count')
print("Modified Dataset:")
print(data)
print("\nUser Profiles:")
print(user_profiles)
OUTPUT
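Because the input dictionary is not shown above, the following self-contained sketch runs the same feature steps on a few made-up rows. The `events` frame and its values are hypothetical, and the product names are assumed to follow a "category brand ..." pattern, which is what the `split()` calls above rely on.

```python
import pandas as pd

# Hypothetical rows: names and values are illustrative, not from the dataset.
events = pd.DataFrame({
    "user_id": [1, 1, 2, 3],
    "product_name": ["Laptop Acme 15", "Phone Zeta X",
                     "Laptop Acme 15", "Phone Zeta X"],
    "timestamp": ["2024-01-05 09:15:00", "2024-01-05 21:40:00",
                  "2024-01-06 13:05:00", "2024-01-07 09:50:00"],
})

# Per-user profile: number of interactions.
user_profiles = events.groupby("user_id").size().to_frame("num_interactions")

# Time-based feature: hour of the interaction.
events["timestamp"] = pd.to_datetime(events["timestamp"])
events["hour_of_day"] = events["timestamp"].dt.hour

# Product features: first word as category, second word as brand.
name_parts = events["product_name"].str.split()
events["product_category"] = name_parts.str[0]
events["product_brand"] = name_parts.str[1]

# Popularity: how many interactions each product received.
events["product_popularity"] = (
    events.groupby("product_name")["user_id"].transform("count")
)

print(events)
print(user_profiles)
```

Splitting on whitespace is fragile: a single-word product name would have no brand, so real data may need a lookup table instead.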
5. Data Transformation
CODE
y = data['interaction_type']
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
X_train_scaled[:5]
OUTPUT
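The transformation step splits the data and standardizes the features, fitting the scaler on the training split only so that no test-set statistics leak into training. A minimal sketch under assumptions: the feature columns, the label values, and the 80/20 split are hypothetical, since the original split code is not shown.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical features and labels, generated only for illustration.
rng = np.random.default_rng(0)
features = pd.DataFrame({
    "hour_of_day": rng.integers(0, 24, 100),
    "product_popularity": rng.integers(1, 50, 100),
})
labels = rng.choice(["view", "click", "purchase"], 100)

# Assumed 80/20 split; the original ratio is not stated.
X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.2, random_state=0
)

# Fit on the training data, then apply the same scaling to the test data.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(X_train_scaled[:5])
```

After scaling, each training column has mean 0 and unit variance; the test columns will be close to that but not exact, because they reuse the training statistics.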
Model Development
Collaborative Filtering:
CODE
data = {
}
data_df = pd.DataFrame(data)
data_df['interaction_type'] = data_df['interaction_type'].map(interaction_type_map)
algo_cf = KNNBasic()
algo_cf.fit(trainset)
predictions_cf = algo_cf.test(testset)
rmse_cf = accuracy.rmse(predictions_cf)
OUTPUT
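To show the idea behind neighbourhood-based collaborative filtering without depending on the elided setup, here is a small user-based sketch in plain pandas and NumPy. It is a stand-in for illustration, not the `KNNBasic` implementation used above: the `interactions` data, the scores in `interaction_type_map`, and the helper names `cosine_similarity` and `predict_score` are all assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical mapping from interaction type to a numeric score.
interaction_type_map = {"view": 1, "click": 2, "purchase": 3}

# Hypothetical interactions, for illustration only.
interactions = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 3, 3],
    "product_id": [10, 20, 10, 30, 20, 30],
    "interaction_type": ["purchase", "view", "purchase",
                         "click", "view", "purchase"],
})
interactions["rating"] = interactions["interaction_type"].map(interaction_type_map)

# User-item matrix; 0 marks "no interaction".
matrix = interactions.pivot_table(
    index="user_id", columns="product_id", values="rating", fill_value=0
)


def cosine_similarity(a, b):
    """Cosine similarity of two vectors, 0.0 if either is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0


def predict_score(user_id, product_id, k=2):
    """Similarity-weighted mean score from the k most similar users
    who interacted with the product."""
    target = matrix.loc[user_id].to_numpy(dtype=float)
    neighbours = []
    for other_id, row in matrix.iterrows():
        score = float(row.loc[product_id])
        if other_id == user_id or score == 0:
            continue
        sim = cosine_similarity(target, row.to_numpy(dtype=float))
        neighbours.append((sim, score))
    neighbours = sorted(neighbours, reverse=True)[:k]
    total_sim = sum(sim for sim, _ in neighbours)
    if total_sim == 0:
        return 0.0
    return sum(sim * score for sim, score in neighbours) / total_sim


predicted = predict_score(user_id=1, product_id=30)
print(round(predicted, 3))
```

User 1 is far more similar to user 2 than to user 3, so the predicted score for product 30 lands closer to user 2's score of 2 than to user 3's score of 3.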
Recommendation Engine
CODE
user_items = data_df[data_df['user_id'] == user_id]['product_id'].tolist()
all_items = data_df['product_id'].unique().tolist()
return top_n_items
user_id = 12345
top_n_recommendations
OUTPUT
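The visible fragments suggest the usual top-N pattern: collect the items a user has already interacted with, score the remaining items, and return the best few. A minimal sketch of that pattern follows. The function name `recommend_top_n`, the `score_fn` parameter, the sample data, and the popularity-based scorer are hypothetical; in the original, the scores presumably come from the trained collaborative filtering model.

```python
import pandas as pd


def recommend_top_n(data_df, user_id, score_fn, n=3):
    """Return the n highest-scoring products the user has not interacted with."""
    user_items = data_df[data_df["user_id"] == user_id]["product_id"].tolist()
    all_items = data_df["product_id"].unique().tolist()
    seen = set(user_items)
    candidates = [item for item in all_items if item not in seen]
    ranked = sorted(candidates,
                    key=lambda item: score_fn(user_id, item),
                    reverse=True)
    top_n_items = ranked[:n]
    return top_n_items


# Hypothetical interaction log, for illustration only.
data_df = pd.DataFrame({
    "user_id": [12345, 12345, 2, 2, 3, 3, 3],
    "product_id": [1, 2, 3, 4, 3, 4, 5],
})

# Stand-in scorer: overall product popularity (not a personalized model).
popularity = data_df["product_id"].value_counts()


def popularity_score(user_id, product_id):
    return int(popularity.loc[product_id])


user_id = 12345
top_n_recommendations = recommend_top_n(data_df, user_id, popularity_score, n=2)
print(top_n_recommendations)
```

Passing the scorer as a function keeps the ranking logic independent of the model, so a trained predictor can replace `popularity_score` without changing `recommend_top_n`.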
Assumed Scenario
Conclusion
CODE
return top_n_items
user_id = 12345
OUTPUT