Machine Learning Assignment
Our task is to implement a feed-forward artificial neural network in Python with Keras to solve a binary
classification problem. The dataset, from the UCI Machine Learning Repository, comes from a direct
marketing campaign of a Portuguese bank that was based on phone calls. The classification goal is to
predict whether a client will subscribe to a bank term deposit; the outcome is stored in a binary column
‘y’ with the categorical values ‘yes’ or ‘no’. We therefore process the data, define the model architecture,
compile the model, fit it on training data, predict on test data and finally evaluate the model.
1. Imports
The main imports are TensorFlow, Keras, NumPy, pandas, scikit-learn and, lastly, Matplotlib to help graph
the evaluation metrics.
We need a clear picture of our data, that is, our input and output variables, so that we can process the
data and define our model well. We named the dataframe training_df and used pandas to import and read
it. The result is 14 columns and 7842 rows of data.
Input variables
Age – numeric
Marital status – numeric
Education – numeric
Credit default – categorical
Bank balance – numeric
Has Housing loan – categorical
Has personal Loan – categorical
Contact communication type – numeric
Last Contact duration – numeric
Campaign: no. of contacts performed this campaign – numeric
Pdays: no. of days passed after last contact previous campaign – numeric
Previous: no. of contacts before this campaign – numeric
Poutcome: outcome of previous campaign – numeric
Output variable
We separated our target variable (subscribed) from the rest of the data and converted it to numeric
form (yes -> 1 and no -> 0). Some input variables were categorical, so we converted them to numeric
form as well. We also dropped some input columns that were not related to the outcome, leaving two
sets of data: training_x for inputs and training_y for output.
Printing training_x and training_y shows ten input columns and one output column, all in numeric form.
Once preprocessing was done, we converted the data into NumPy arrays so that it could be fed into our
neural network.
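The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the exact code used: the small DataFrame stands in for training_df, and its column names and values are assumptions made up for the example.

```python
import pandas as pd

# Tiny stand-in for training_df. Column names and values are illustrative
# assumptions, not rows from the real dataset.
training_df = pd.DataFrame({
    "age": [30, 45, 52, 28],
    "default": ["no", "no", "yes", "no"],
    "balance": [1200, -50, 300, 4000],
    "housing": ["yes", "no", "yes", "no"],
    "loan": ["no", "no", "yes", "no"],
    "subscribed": ["no", "yes", "no", "yes"],
})

yes_no = {"yes": 1, "no": 0}

# Separate the target and map yes -> 1, no -> 0.
training_y = training_df["subscribed"].map(yes_no)

# Remove the target from the inputs and encode the categorical columns.
training_x = training_df.drop(columns=["subscribed"])
for column in ["default", "housing", "loan"]:
    training_x[column] = training_x[column].map(yes_no)

# Convert both sets to NumPy arrays so they can be passed to the network.
X = training_x.to_numpy(dtype="float32")
y = training_y.to_numpy(dtype="float32")
print(X.shape, y.shape)
```

Columns judged unrelated to the outcome could be removed in the same drop(columns=[...]) call.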
Define model architecture
Now that our data is in the proper format, we define our Sequential model using a set of rules. Our input
shape equals the number of inputs, which is ten. We decided to have just one hidden layer, which is
appropriate for the amount of data and is standard practice (it is rare to exceed 2 hidden layers).
For our hidden layer we chose 7 nodes based on the following rules:
In our case we chose the first rule. Because this is a binary classification problem, the output layer has
one node, since the outcome is either 1 or 0. Our hidden layer activation function is ‘relu’ and our output
layer activation is ‘sigmoid’, since it gives us a probability between 0 and 1, the two values of our outcome.
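The architecture described above (ten inputs, one hidden layer of 7 ‘relu’ nodes, one ‘sigmoid’ output node) could look roughly like this in Keras. It is a sketch under the assumption that the tensorflow.keras API is used; the variable name model is illustrative.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Sequential model: 10 inputs -> 7 hidden nodes (relu) -> 1 output node (sigmoid).
model = keras.Sequential([
    keras.Input(shape=(10,)),
    layers.Dense(7, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])

# Prints the layers and the number of trainable parameters.
model.summary()
```

With these sizes the network has 10*7 + 7 weights and biases in the hidden layer and 7 + 1 in the output layer.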
Compiling our model
Since this is binary classification, our loss function was binary_crossentropy, our optimizer was
Stochastic Gradient Descent and our metric was accuracy.
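A compile call matching this description might look like the sketch below. The model is re-created so the snippet runs on its own, and the optimizer uses the Keras default learning rate, which is an assumption since the report does not state one.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Same architecture as described earlier, rebuilt so this snippet is self-contained.
model = keras.Sequential([
    keras.Input(shape=(10,)),
    layers.Dense(7, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])

# Binary cross-entropy loss, SGD optimizer (default learning rate, an assumption)
# and accuracy as the reported metric.
model.compile(
    loss="binary_crossentropy",
    optimizer=keras.optimizers.SGD(),
    metrics=["accuracy"],
)
```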
A neural network is used to predict future outcomes, so relying on results from the training data alone
does not guarantee that our model will accurately predict outcomes for new input data. We therefore
split our data into two sets (training and test data), fitted our compiled model on the training data and
used the test data to check whether the model stays consistent and accurate when predicting outcomes for
different input data. We used scikit-learn’s train_test_split to split the data into training data (70%) and
test data (30%).
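A 70/30 split with scikit-learn can be sketched as follows. The random arrays are placeholders for the preprocessed inputs and labels, and random_state and the name Y_test are assumptions added for the example.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 100 rows with 10 numeric inputs and a 0/1 label.
rng = np.random.default_rng(0)
X = rng.random((100, 10)).astype("float32")
y = rng.integers(0, 2, size=100)

# test_size=0.3 keeps 70% of the rows for training and 30% for testing.
# random_state only makes the shuffle repeatable.
X_train, X_test, Y_Train, Y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)
print(X_train.shape, X_test.shape)
```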
Fitting our model
After splitting our data we are ready to fit our model on the training data (X_train and Y_Train).
Importantly, we had to decide the number of epochs and the batch size. As a rule of thumb, the number of
epochs should be three times the number of inputs, but in our case we used trial and error with an
evaluation metric (F1-score) to find the best configuration. Too few epochs resulted in a very low
F1-score, as did a very large batch size, and after many trials a batch size of 64 and 50 epochs gave us
a reasonable F1-score.
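Fitting with the chosen settings (50 epochs, batch size 64) might look like the sketch below. The training arrays here are random placeholders, so the resulting numbers carry no meaning; only the shape of the call is the point.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Placeholder training data standing in for X_train and Y_Train.
rng = np.random.default_rng(0)
X_train = rng.random((256, 10)).astype("float32")
Y_Train = rng.integers(0, 2, size=(256,)).astype("float32")

model = keras.Sequential([
    keras.Input(shape=(10,)),
    layers.Dense(7, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(loss="binary_crossentropy", optimizer="sgd", metrics=["accuracy"])

# fit() returns a History object; history.history is a dict with one list
# per tracked quantity and one entry per epoch.
history = model.fit(X_train, Y_Train, epochs=50, batch_size=64, verbose=0)
print(sorted(history.history.keys()))
```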
Plotting Loss and Accuracy curves using history from training data
The loss and accuracy of each epoch can be read from the history dictionary, and we use Matplotlib to
plot graphs showing loss and accuracy.
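One way to draw these curves is sketched below. The helper name plot_history is hypothetical, and the dictionary passed to it contains placeholder values rather than results from the actual training run.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line to show windows interactively
import matplotlib.pyplot as plt


def plot_history(history_dict):
    """Plot per-epoch loss and accuracy from a Keras-style history dict."""
    fig, (ax_loss, ax_acc) = plt.subplots(1, 2, figsize=(10, 4))

    ax_loss.plot(history_dict["loss"])
    ax_loss.set_title("Training loss")
    ax_loss.set_xlabel("Epoch")
    ax_loss.set_ylabel("Loss")

    ax_acc.plot(history_dict["accuracy"])
    ax_acc.set_title("Training accuracy")
    ax_acc.set_xlabel("Epoch")
    ax_acc.set_ylabel("Accuracy")

    fig.tight_layout()
    return fig


# Placeholder values only, to show the expected structure of the dict.
placeholder_history = {"loss": [0.7, 0.6, 0.5], "accuracy": [0.5, 0.6, 0.7]}
fig = plot_history(placeholder_history)
```

In practice the argument would be history.history from model.fit(), followed by plt.show() or fig.savefig(...).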
The resulting plots:
After running the model.predict() function on the input test data (X_test), our network runs a forward pass
to give us its predicted y-values. We use them to compute four base metrics for classification
evaluation: true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN).
Once we had our four base metrics we put them into a confusion matrix to help us visualize how often our
classes were confused with each other, so that we could improve our hyperparameters.
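The four counts and the confusion matrix can be computed along these lines. This is a sketch: the helper name confusion_counts, the 0.5 decision threshold and the small example arrays are all assumptions for illustration, with y_prob standing in for the output of model.predict().

```python
import numpy as np


def confusion_counts(y_true, y_prob, threshold=0.5):
    """Return (TP, TN, FP, FN) for 0/1 labels and predicted probabilities."""
    y_true = np.asarray(y_true).astype(int).ravel()
    # Probabilities at or above the threshold count as class 1.
    y_pred = (np.asarray(y_prob).ravel() >= threshold).astype(int)

    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    return tp, tn, fp, fn


# Illustrative labels and probabilities, not results from the report.
y_true = [1, 0, 1, 1, 0, 0]
y_prob = [0.9, 0.2, 0.4, 0.7, 0.6, 0.1]

tp, tn, fp, fn = confusion_counts(y_true, y_prob)

# Rows are actual classes (0, 1); columns are predicted classes (0, 1).
matrix = np.array([[tn, fp], [fn, tp]])
print(matrix)
```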
Visualization of confusion matrix:
Derived Metrics
We also used the derived metrics accuracy, specificity, precision and F1-score. The F1-score was over 0.5,
that is, closer to 1 than to 0, which suggests that our model is meaningful in making predictions.
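The derived metrics follow from the four base counts using the standard definitions, sketched below. The helper name derived_metrics and the example counts are illustrative assumptions; recall is included only because the F1-score is defined from precision and recall.

```python
def derived_metrics(tp, tn, fp, fn):
    """Compute accuracy, specificity, precision, recall and F1 from base counts."""

    def safe_div(numerator, denominator):
        # Return 0.0 instead of raising when a denominator is zero.
        return numerator / denominator if denominator else 0.0

    accuracy = safe_div(tp + tn, tp + tn + fp + fn)
    specificity = safe_div(tn, tn + fp)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    f1 = safe_div(2 * precision * recall, precision + recall)

    return {
        "accuracy": accuracy,
        "specificity": specificity,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Illustrative counts only, not the values obtained in the report.
metrics = derived_metrics(tp=2, tn=2, fp=1, fn=1)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```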