Solution LabAssignment
df.head()
[5 rows x 32 columns]
[2]: df = df.drop(['ID','Age','Gender','Education', 'Country','Ethnicity'],axis=1)
df.head()
[5 rows x 26 columns]
[3]: df.columns
[4]: frequency_mapping = {
    'CL0': 0,
    'CL1': 1,
    'CL2': 2,
    'CL3': 3,
    'CL4': 4,
    'CL5': 5,
    'CL6': 6
}
columns = ['Alcohol', 'Amphet', 'Amyl', 'Benzos', 'Caff', 'Cannabis', 'Choc',
           'Coke', 'Crack', 'Ecstasy', 'Heroin', 'Ketamine', 'Legalh', 'LSD', 'Meth',
df
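A minimal sketch of how a label-to-integer mapping such as `frequency_mapping` can be applied to a set of columns. It assumes `Series.map` is used, and the `toy` DataFrame and `drug_columns` list are hypothetical stand-ins, not the notebook's data.

```python
import pandas as pd

# Mapping from the notebook: usage class label -> ordinal integer.
frequency_mapping = {f"CL{i}": i for i in range(7)}

# Toy stand-in for the real dataset (hypothetical values, two columns only).
toy = pd.DataFrame({
    "Alcohol": ["CL5", "CL6", "CL0"],
    "Cannabis": ["CL0", "CL3", "CL2"],
})
drug_columns = ["Alcohol", "Cannabis"]

# Series.map replaces each label with its integer; unknown labels become NaN.
for col in drug_columns:
    toy[col] = toy[col].map(frequency_mapping)

print(toy)
```

Any label missing from the mapping turns into `NaN`, so checking `toy.isna().sum()` after the loop is a cheap way to catch unexpected categories.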
[6]: target_variable = columns[0]
X = df.drop(target_variable,axis=1)
y = df[target_variable]
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)
model.fit(X_train,y_train)
y_predict = model.predict(X_test)
from sklearn.metrics import confusion_matrix
conf = confusion_matrix(y_test,y_predict)
print("Confusion matrix for:", target_variable)
print(conf)
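Because the same split, fit, predict, and confusion-matrix steps recur for every target and model, they can be wrapped in one helper. The sketch below is one possible arrangement: the function name `confusion_for_target`, the synthetic `demo` data, and the choice of `DecisionTreeClassifier` are all assumptions made for illustration, since `model` is not defined in the cells shown here.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def confusion_for_target(df, target_variable, model):
    """Split, fit, predict, and return the confusion matrix for one target column."""
    X = df.drop(target_variable, axis=1)
    y = df[target_variable]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=1
    )
    model.fit(X_train, y_train)
    y_predict = model.predict(X_test)
    return confusion_matrix(y_test, y_predict)


# Synthetic stand-in data: integer-coded columns in the range 0..6.
rng = np.random.default_rng(0)
demo = pd.DataFrame(
    rng.integers(0, 7, size=(200, 4)),
    columns=["Alcohol", "Amphet", "Amyl", "Benzos"],
)

# DecisionTreeClassifier is an assumption; any scikit-learn classifier fits here.
conf = confusion_for_target(demo, "Alcohol", DecisionTreeClassifier(random_state=1))
print("Confusion matrix for:", "Alcohol")
print(conf)
```

Passing the model as an argument keeps the helper independent of any particular classifier, so the same call can be repeated for each target column.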
model.fit(X_train,y_train)
y_predict = model.predict(X_test)
from sklearn.metrics import confusion_matrix
conf = confusion_matrix(y_test,y_predict)
print("Confusion matrix for:", target_variable)
print(conf)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)
model.fit(X_train,y_train)
y_predict = model.predict(X_test)
from sklearn.metrics import confusion_matrix
conf = confusion_matrix(y_test,y_predict)
print("Confusion matrix for:", target_variable)
print(conf)
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
model.fit(X_train,y_train)
y_predict = model.predict(X_test)
from sklearn.metrics import confusion_matrix
conf = confusion_matrix(y_test,y_predict)
print("Confusion matrix for:", target_variable)
print(conf)
[ 6 0 4 5 7 4 10]]
model.fit(X_train,y_train)
y_predict = model.predict(X_test)
from sklearn.metrics import confusion_matrix
conf = confusion_matrix(y_test,y_predict)
print("Confusion matrix for:", target_variable)
print(conf)
model.fit(X_train,y_train)
y_predict = model.predict(X_test)
from sklearn.metrics import confusion_matrix
conf = confusion_matrix(y_test,y_predict)
print("Confusion matrix for:", target_variable)
print(conf)
y = df[target_variable]
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)
model.fit(X_train,y_train)
y_predict = model.predict(X_test)
from sklearn.metrics import confusion_matrix
conf = confusion_matrix(y_test,y_predict)
print("Confusion matrix for:", target_variable)
print(conf)
model.fit(X_train,y_train)
y_predict = model.predict(X_test)
from sklearn.metrics import confusion_matrix
conf = confusion_matrix(y_test,y_predict)
print("Confusion matrix for:", target_variable)
print(conf)
[ 18 5 14 35 6 0 2]
[ 8 0 3 16 3 0 1]
[ 0 0 3 4 5 0 1]
[ 2 0 0 0 0 0 1]]
/Users/hitaarthh/anaconda3/lib/python3.11/site-packages/sklearn/linear_model/_logistic.py:460: ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
n_iter_i = _check_optimize_result(
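The warning suggests two remedies: raising `max_iter` or scaling the features. A hedged sketch combining both is given here, using a `StandardScaler` and `LogisticRegression` pipeline on synthetic data. The value `max_iter=1000` and the data are illustrative assumptions, not settings taken from the notebook.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data with features on very different scales.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 5)) * np.array([1.0, 10.0, 100.0, 1000.0, 0.01])
y = rng.integers(0, 3, size=300)

# Standardize features, then fit logistic regression with a larger iteration budget.
# max_iter=1000 is an illustrative value, not a tuned one.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X, y)

# n_iter_ reports how many iterations the solver actually used.
n_iter = model.named_steps["logisticregression"].n_iter_
print("Iterations used:", n_iter)
```

Putting the scaler inside the pipeline means it is fitted on the training data only, so the same `model.fit(X_train, y_train)` and `model.predict(X_test)` calls work unchanged.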
model.fit(X_train,y_train)
y_predict = model.predict(X_test)
from sklearn.metrics import confusion_matrix
conf = confusion_matrix(y_test,y_predict)
print("Confusion matrix for:", target_variable)
print(conf)
[15]: target_variable = columns[9]
X = df.drop(target_variable,axis=1)
y = df[target_variable]
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)
model.fit(X_train,y_train)
y_predict = model.predict(X_test)
from sklearn.metrics import confusion_matrix
conf = confusion_matrix(y_test,y_predict)
print("Confusion matrix for:", target_variable)
print(conf)
model.fit(X_train,y_train)
y_predict = model.predict(X_test)
from sklearn.metrics import confusion_matrix
conf = confusion_matrix(y_test,y_predict)
print("Confusion matrix for:", target_variable)
print(conf)
[ 16 0 0 1 1 0 0]
[ 14 1 6 1 0 4 0]
[ 7 1 1 6 1 0 1]
[ 6 0 0 3 1 1 0]
[ 2 0 1 1 1 0 1]
[ 0 0 0 4 0 0 1]]
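A confusion matrix like the one printed above can be summarized numerically. In scikit-learn's layout, rows are true labels and columns are predicted labels, so the diagonal holds the correct predictions. The sketch below uses a made-up 3x3 matrix purely for illustration; the counts are not taken from the notebook.

```python
import numpy as np

# Illustrative 3x3 confusion matrix (made-up counts, not taken from the notebook).
conf = np.array([
    [8, 2, 0],
    [3, 5, 2],
    [1, 1, 8],
])

# Overall accuracy: correctly classified samples (diagonal) over all samples.
accuracy = np.trace(conf) / conf.sum()

# Per-class recall: diagonal entry divided by its row total (true-label count).
recall = np.diag(conf) / conf.sum(axis=1)

print("accuracy:", accuracy)
print("recall per class:", recall)
```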
model.fit(X_train,y_train)
y_predict = model.predict(X_test)
from sklearn.metrics import confusion_matrix
conf = confusion_matrix(y_test,y_predict)
print("Confusion matrix for:", target_variable)
print(conf)
model.fit(X_train,y_train)
y_predict = model.predict(X_test)
from sklearn.metrics import confusion_matrix
conf = confusion_matrix(y_test,y_predict)
print("Confusion matrix for:", target_variable)
print(conf)
model.fit(X_train,y_train)
y_predict = model.predict(X_test)
from sklearn.metrics import confusion_matrix
conf = confusion_matrix(y_test,y_predict)
print("Confusion matrix for:", target_variable)
print(conf)
model.fit(X_train,y_train)
y_predict = model.predict(X_test)
from sklearn.metrics import confusion_matrix
conf = confusion_matrix(y_test,y_predict)
print("Confusion matrix for:", target_variable)
print(conf)
model.fit(X_train,y_train)
y_predict = model.predict(X_test)
from sklearn.metrics import confusion_matrix
conf = confusion_matrix(y_test,y_predict)
print("Confusion matrix for:", target_variable)
print(conf)
[ 42 7 4 2 0 0 0]
[ 23 6 20 22 1 0 0]
[ 19 1 13 37 10 0 0]
[ 5 3 3 16 10 1 0]
[ 1 2 3 6 2 0 0]
[ 0 0 1 1 0 0 0]]
model.fit(X_train,y_train)
y_predict = model.predict(X_test)
from sklearn.metrics import confusion_matrix
conf = confusion_matrix(y_test,y_predict)
print("Confusion matrix for:", target_variable)
print(conf)
model.fit(X_train,y_train)
y_predict = model.predict(X_test)
from sklearn.metrics import confusion_matrix
conf = confusion_matrix(y_test,y_predict)
print("Confusion matrix for:", target_variable)
print(conf)
model.fit(X_train,y_train)
y_predict = model.predict(X_test)
from sklearn.metrics import confusion_matrix
conf = confusion_matrix(y_test,y_predict)
print("Confusion matrix for:", target_variable)
print(conf)
0.0.4 Inference:
• Learnt how one-hot encoding can be performed on categorical data in order to
make it suitable for machine learning algorithms.
• Instead of directly using dummy variables or one-hot encoding, I preferred mapping each
class label (CL0 to CL6) to a discrete integer value. This approach works for a small
dataset, but if the number of columns increases drastically, one-hot encoding becomes the
more practical way to keep the code manageable.
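The two encodings discussed above can be compared side by side on a toy column. This is a sketch under assumptions: the `toy` DataFrame is hypothetical, and `pandas.get_dummies` is used as one common way to build dummy (one-hot) columns.

```python
import pandas as pd

# Toy column with class labels (hypothetical values).
toy = pd.DataFrame({"Cannabis": ["CL0", "CL3", "CL2", "CL3"]})

# Option 1: ordinal integer mapping, one column in and one column out.
frequency_mapping = {f"CL{i}": i for i in range(7)}
mapped = toy["Cannabis"].map(frequency_mapping)

# Option 2: one-hot encoding, one indicator column per distinct label.
one_hot = pd.get_dummies(toy["Cannabis"], prefix="Cannabis")

print(mapped.tolist())
print(one_hot.columns.tolist())
```

The integer mapping keeps the natural order of the classes in a single column, while one-hot encoding discards that order and widens the table by one column per label.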