Project 2
This is a team project. Each team can have up to three members. You may also work on the
project individually.
In this project you will build neural networks for regression and classification using
Matlab built-in functions or any other publicly available package, such as Keras (Python),
PyTorch (Python), scikit-learn (Python), or Weka (Java).
Task 1: We will continue to work on the Red Wine Quality Dataset (from UCI) provided in
project 1. Continue to treat the dataset as a two-class problem, using the same rule described
in project 1. In project 1, you designed two linear classifiers for this dataset. In this project,
design a Multilayer Perceptron (MLP) classifier using built-in Matlab functions (for example,
patternnet(), but you are not restricted to it) or scikit-learn APIs (for example,
MLPClassifier(), but you are not restricted to it) with each of the following model structures:
1. One hidden layer of 10 hidden units, with the Sigmoid activation function for both the
hidden layer and the output layer.
2. One hidden layer of 50 hidden units, with the Sigmoid activation function for both the
hidden layer and the output layer.
3. One hidden layer of 100 hidden units, with the Sigmoid activation function for both the
hidden layer and the output layer.
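As a rough starting point, the three structures above could be set up in scikit-learn along the following lines. This is only a sketch under stated assumptions: the data here is a synthetic stand-in generated with make_classification() (11 input features, two classes), not the wine dataset, and the 70/30 split, feature standardization, max_iter and random_state values are illustrative choices that the project does not prescribe. In MLPClassifier(), activation="logistic" selects the sigmoid for the hidden layer, and for a two-class problem the output unit is also logistic.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# ASSUMPTION: synthetic stand-in for the two-class wine data (11 features).
# Replace X, y with the real features and the labels from the project 1 rule.
X, y = make_classification(n_samples=600, n_features=11, n_informative=6,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

# Standardize using statistics from the training split only (illustrative choice).
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

results = {}
for n_hidden in (10, 50, 100):
    # One hidden layer of n_hidden sigmoid ("logistic") units.
    clf = MLPClassifier(hidden_layer_sizes=(n_hidden,), activation="logistic",
                        max_iter=2000, random_state=0)
    clf.fit(X_train_s, y_train)
    train_acc = clf.score(X_train_s, y_train)
    test_acc = clf.score(X_test_s, y_test)
    results[n_hidden] = (train_acc, test_acc)
    print(f"{n_hidden:>3} hidden units: train acc = {train_acc:.3f}, "
          f"test acc = {test_acc:.3f}")
```

The results dictionary keeps the training and testing accuracy for each hidden-layer size, which makes it easy to tabulate the three models side by side.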
Task 2: Design a Convolutional Neural Network (CNN) to classify the benchmark handwritten
digit dataset MNIST. If you are using the Matlab, Keras, or PyTorch software packages, there
are built-in APIs to download the dataset (see the following links). There are 60000 training
images and 10000 testing images. Train the following baseline CNN model on the training
dataset and test the trained CNN model on the testing dataset. Report the training and testing
accuracies. The baseline CNN model has the following structure:
Task 4: For the provided Gaussian distributed dataset you used in Homework 2 (the generated
two-class dataset), ignore the class ID (the third column) and apply the K-means clustering
algorithm (kmeans() in Matlab) to cluster the training and testing datasets separately, each
into TWO clusters. Scatter plot the two datasets and use the cluster membership to color each
data point. You can use the scikit-learn API (KMeans()) as well. You can also scatter plot the
data using the true class ID and compare it with the clustering result.
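A minimal sketch of this workflow with scikit-learn and matplotlib is given below. It assumes a hypothetical stand-in dataset: two 2-D Gaussian classes generated on the spot, stored as an array whose third column is the class ID, which mimics the described layout but is not the Homework 2 data. The class means, sample counts, n_init, random_state and the output file name are all illustrative. The same steps would be repeated for the training set and for the testing set.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# ASSUMPTION: hypothetical stand-in for the Homework 2 data, two Gaussian
# classes in 2-D with the class ID stored in the third column.
rng = np.random.default_rng(0)
class0 = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(200, 2))
class1 = rng.normal(loc=[4.0, 4.0], scale=1.0, size=(200, 2))
data = np.vstack([np.column_stack([class0, np.zeros(200)]),
                  np.column_stack([class1, np.ones(200)])])

X = data[:, :2]                   # features only; the class ID is ignored
true_id = data[:, 2].astype(int)  # kept only for the comparison plot

# Cluster into two groups without using the class ID.
km = KMeans(n_clusters=2, n_init=10, random_state=0)
cluster_id = km.fit_predict(X)

# Cluster labels are arbitrary (0/1 may be swapped relative to the class ID),
# so agreement is measured up to a label swap.
agreement = max(np.mean(cluster_id == true_id), np.mean(cluster_id != true_id))
print(f"agreement with true class ID (up to label swap): {agreement:.3f}")

# Side-by-side scatter plots: cluster membership vs. true class ID.
fig, axes = plt.subplots(1, 2, figsize=(10, 4), sharex=True, sharey=True)
axes[0].scatter(X[:, 0], X[:, 1], c=cluster_id, cmap="coolwarm", s=10)
axes[0].set_title("K-means cluster membership")
axes[1].scatter(X[:, 0], X[:, 1], c=true_id, cmap="coolwarm", s=10)
axes[1].set_title("True class ID")
fig.savefig("kmeans_vs_true.png")
```

Because K-means assigns cluster numbers arbitrarily, the colors in the two panels may be swapped even when the clustering matches the true classes well, so compare the shapes of the groups rather than the colors themselves.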
Task 5: Discuss the results you achieved in Tasks 1 to 4. What have you observed, and what
conclusions can you draw?
Reference:
https://nextjournal.com/gkoehler/pytorch-mnist