Graph Theory Report
Table of Contents
1. Introduction
2. Explanation of Key Concepts and Methods
2.1 Components of an MLP
2.2 Architecture of MLP
2.3 Activation Functions
2.4 Training Process
2.5 Regularization Techniques
3. Applications and Real-World Relevance
3.1 Computer Vision
3.2 Natural Language Processing (NLP)
3.3 Financial Forecasting
3.4 Healthcare
3.5 Gaming and Reinforcement Learning
4. Analysis of Insights
4.1 Advantages
4.2 Challenges
4.3 Future Directions
5. Conclusion
6. References
1. Introduction
An artificial neural network (ANN) is a computational model inspired by the way biological neurons process information, and it is well suited to finding patterns in data. A type of ANN known as the Multi-Layer Perceptron (MLP) has an input layer, one or more hidden layers, and an output layer. These layers consist of simple, interconnected building blocks called neurons, which take in the data they are given and pass the transformed result on to the next layer. These workhorse MLPs are employed to address a vast number of difficult tasks in fields such as robotics, computer vision, and natural language processing.
An MLP is called multi-layer because it has one or more hidden layers between the input and output layers. These hidden layers allow the network to learn complex, non-linear relationships that are beyond the reach of a single perceptron, and thus to solve problems that a single perceptron cannot. Its layered design and brain-inspired training make the MLP a basic building block of deep learning architectures.
2. Explanation of Key Concepts and Methods
2.1 Components of an MLP
An MLP is built from the following components:
● Input Layer: Receives the input features or data points. Each feature corresponds to a neuron in the input layer.
● Hidden Layers: Perform computations and transformations using activation functions to
model non-linear relationships. The number of hidden layers and the number of neurons
in each layer significantly impact the model’s capacity to learn complex patterns.
● Output Layer: Provides the final result or classification based on the learned parameters.
For regression tasks, it might output continuous values, while for classification, it often
uses a softmax function.
● Weights and Biases: Adjustable parameters that the network learns during training.
These parameters determine the importance of each input feature in the final prediction.
The output layer, in particular, takes the transformed representation produced by the hidden layer(s) and generates the final output. It is made up of nodes, each of which represents a class or a continuous variable, depending on the problem at hand. The output layer applies an activation function suited to the learning task, for example softmax for classification and a linear (identity) function for regression.
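To make these components concrete, the following is a minimal sketch in Python using NumPy of a forward pass through an MLP with one hidden layer. The layer sizes, randomly initialized weights and biases, and the example input are illustrative only.

    import numpy as np

    rng = np.random.default_rng(0)

    # Illustrative sizes: 4 input features, 8 hidden neurons, 3 output classes.
    W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)   # input layer -> hidden layer
    W2, b2 = rng.normal(size=(8, 3)), np.zeros(3)   # hidden layer -> output layer

    def forward(x):
        h = np.maximum(0.0, x @ W1 + b1)            # hidden layer with ReLU activation
        logits = h @ W2 + b2                        # output layer scores
        probs = np.exp(logits - logits.max())       # softmax turns scores into
        return probs / probs.sum()                  # class probabilities

    x = np.array([0.2, -1.0, 0.5, 3.1])             # one input example with 4 features
    print(forward(x))                               # probabilities summing to 1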
2.3 Activation Functions
Activation functions introduce non-linearity into the network, which is what allows a neural network to model complex problems. Without activation functions, the network would behave like a purely linear model, no matter how many layers it has.
● Sigmoid: Outputs values between 0 and 1. It is often used in binary classification tasks
but suffers from vanishing gradient issues.
● ReLU (Rectified Linear Unit): Outputs zero for negative inputs and the input itself for
positive inputs. ReLU is computationally efficient and widely used in modern
architectures.
● Tanh: Outputs values between -1 and 1. It centers the data, making optimization easier in
some cases compared to the sigmoid function.
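For reference, these three activation functions can be written in a few lines of Python with NumPy; this is a minimal sketch rather than any particular library's implementation.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))   # squashes values into (0, 1)

    def relu(z):
        return np.maximum(0.0, z)         # zero for negative inputs, identity for positive

    def tanh(z):
        return np.tanh(z)                 # squashes values into (-1, 1), zero-centered

    z = np.array([-2.0, 0.0, 2.0])
    print(sigmoid(z), relu(z), tanh(z))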
2.4 Training Process
MLPs are trained using supervised learning techniques, which include:
● Forward Propagation: The input data flows through the network, producing outputs at
each layer. During this phase, the weighted sum of inputs and biases is calculated for
each neuron, and the activation function is applied.
● Loss Function: Measures the difference between the predicted output and the actual
target. Common loss functions include Mean Squared Error (MSE) for regression and
Cross-Entropy Loss for classification.
● Backpropagation: Computes gradients of the loss function with respect to the network’s
weights and biases. It uses the chain rule to propagate errors backward through the
network.
● Optimization: Updates the weights and biases using algorithms like Gradient Descent or
Adam Optimizer. These updates are aimed at minimizing the loss function.
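Putting these steps together, the sketch below shows one possible training loop, assuming the PyTorch library; the layer sizes, learning rate, and randomly generated stand-in data are placeholders rather than values from this report.

    import torch
    from torch import nn

    model = nn.Sequential(
        nn.Linear(20, 64),   # input layer -> hidden layer
        nn.ReLU(),           # non-linear activation
        nn.Linear(64, 3),    # hidden layer -> output layer (3 classes)
    )
    loss_fn = nn.CrossEntropyLoss()                            # loss function for classification
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # optimization algorithm

    X = torch.randn(128, 20)          # 128 stand-in samples with 20 features
    y = torch.randint(0, 3, (128,))   # stand-in integer class labels

    for epoch in range(10):
        logits = model(X)             # forward propagation
        loss = loss_fn(logits, y)     # measure the prediction error
        optimizer.zero_grad()
        loss.backward()               # backpropagation: compute gradients
        optimizer.step()              # update weights and biases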
Input layer
The input layer of an MLP receives the raw input data, typically the features of each example in a dataset. Each neuron in the input layer corresponds to one feature.
Units of the input layer perform no computation of their own; they simply pass the input values on to the neurons of the first hidden layer.
Hidden layers
The hidden layers of an MLP consist of interconnected neurons that perform computations on the
input data.
In a fully connected MLP, every neuron in a hidden layer is connected to every neuron in the previous layer. Each connection carries a coefficient known as a weight, w, which determines how much influence the output of one neuron has on a neuron in the next layer.
Furthermore, each neuron in a hidden layer includes a bias term, b. The bias acts as an additional, constant input to the neuron that shifts its activation threshold. Like the weights, the biases are learned during training.
For each neuron in a hidden layer or the output layer, a weighted sum of its inputs is computed. This involves multiplying each input by its corresponding weight, summing up these products, and adding the bias:
z = w1·x1 + w2·x2 + … + wn·xn + b
where n is the total number of input connections, wi is the weight for the i-th input, xi is the i-th input value, and b is the bias term.
The weighted sum is then passed through an activation function, f. The activation function gives non-linearity to the network, which makes it possible for the network to learn and model non-linear relationships in the data. This function defines the range of the neuron's output and its overall behavior for different input values. The choice of activation function depends on the type of problem being solved and on the characteristics required of the network.
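As a small worked example with made-up numbers, the weighted sum and activation of a single neuron can be computed as follows (here f is chosen to be ReLU).

    import numpy as np

    x = np.array([0.5, -1.2, 3.0])   # inputs x1..xn
    w = np.array([0.8, 0.1, -0.4])   # weights w1..wn
    b = 0.2                          # bias term

    z = np.dot(w, x) + b             # weighted sum: w1*x1 + w2*x2 + ... + wn*xn + b
    a = np.maximum(0.0, z)           # neuron output after the ReLU activation f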
Output layer
The output layer of an MLP is the last layer of the network and provides its final outputs. The number of neurons in the output layer depends on the kind of task, for instance binary classification, multi-class classification, or regression.
Each neuron in the output layer receives the outputs of the last hidden layer and applies an activation function. This activation function is often different from the one used in the hidden layers, and its output gives the final value or prediction.
During training, backpropagation feeds the error back through the network and adjusts the weights of the inputs to each neuron in an attempt to minimize the error on the training data. Through this adjustment of weights, combined with suitable activation functions, the network can fit a wide variety of patterns and relationships in the data and make appropriate predictions on new, unseen samples.
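For a multi-class problem, for example, the output layer typically applies a softmax so that its raw scores can be read as class probabilities. A minimal sketch with made-up scores:

    import numpy as np

    def softmax(z):
        e = np.exp(z - z.max())           # subtract the max for numerical stability
        return e / e.sum()

    logits = np.array([2.0, 0.5, -1.0])   # raw output-layer scores for 3 classes
    probs = softmax(logits)               # probabilities that sum to 1
    predicted_class = int(np.argmax(probs))
    print(probs, predicted_class)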
2.5 Regularization Techniques
Regularization methods are employed to enhance the generalization of MLPs and prevent overfitting. Key techniques include:
● Dropout: Randomly drops a subset of neurons during training, forcing the network to
learn redundant representations and reducing overfitting.
● L2 Regularization: Adds a penalty term proportional to the square of the weights to the
loss function, encouraging smaller weight values.
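A minimal sketch of both techniques, assuming the PyTorch library (the layer sizes and hyperparameters are placeholders): dropout is inserted between layers, and the L2 penalty is supplied through the optimizer's weight_decay parameter.

    import torch
    from torch import nn

    model = nn.Sequential(
        nn.Linear(20, 64),
        nn.ReLU(),
        nn.Dropout(p=0.5),   # randomly zeroes half of the hidden activations during training
        nn.Linear(64, 3),
    )
    # weight_decay adds an L2 penalty on the weights to the objective being minimized
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)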
3. Applications and Real-World Relevance
MLPs have been applied in various domains to solve real-world problems. Some notable applications include:
3.1 Computer Vision
In image classification problems, an MLP is often used as the classifier after convolutional layers have extracted features. MLPs have been employed to recognize handwritten digits, for example on the MNIST database. They are also used within pipelines for object detection and facial recognition.
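As an illustration, assuming scikit-learn is available, a handwritten-digit classifier can be trained in a few lines; the small built-in digits dataset is used here as a stand-in for MNIST.

    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    X, y = load_digits(return_X_y=True)                  # 8x8 digit images as feature vectors
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300, random_state=0)
    clf.fit(X_train, y_train)                            # train the MLP on the training split
    print("test accuracy:", clf.score(X_test, y_test))   # evaluate on held-out digits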
3.2 Natural Language Processing (NLP)
In NLP, MLPs are applied to tasks such as text categorization, binary and multi-class sentiment classification, and spam identification. They can work on top of word embeddings such as Word2Vec or GloVe, which capture relationships between words and provide informative features from text data. In spam detection, for example, an MLP learns to distinguish between normal and spam emails.
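A toy sketch of spam detection, assuming scikit-learn: TF-IDF features are used here as a simpler stand-in for Word2Vec or GloVe embeddings, and the example emails and labels are made up.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.neural_network import MLPClassifier
    from sklearn.pipeline import make_pipeline

    emails = [
        "win a free prize now", "limited offer, claim your reward",
        "meeting agenda for tomorrow", "please review the attached report",
    ]
    labels = [1, 1, 0, 0]   # 1 = spam, 0 = normal

    # Turn each email into TF-IDF features, then classify with a small MLP.
    clf = make_pipeline(
        TfidfVectorizer(),
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0),
    )
    clf.fit(emails, labels)
    print(clf.predict(["claim your free reward now"]))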
3.3 Financial Forecasting
Financial applications in which MLPs are deployed include stock price prediction, credit scoring, and risk assessment. Models trained on past performance provide information that assists in investment decisions and aids in fraud detection.
3.4 Healthcare
The healthcare industry benefits greatly from MLPs for disease prediction and diagnosis. They can identify patterns in patients' characteristics to build models of the risk of diseases such as diabetes, cancer, and cardiovascular disease. Further, MLPs are employed in medical image analysis, for example to identify anomalies such as tumors in X-ray or MRI scans.
3.5 Gaming and Reinforcement Learning
MLPs are frequently used in reinforcement learning to allow an agent to make appropriate decisions in a controlled environment. In gaming, they have been used to train agents that can play strategic games or control characters within game environments.
4. Analysis of Insights
The literature shows that MLPs are accurate and adaptable when applied to non-linear problems. The Universal Approximation Theorem states that an MLP with a single hidden layer and enough neurons can approximate any continuous function on a bounded domain to arbitrary accuracy, which makes these networks a valuable tool in the machine learning toolbox. However, how well these models work in practice depends heavily on rigorous design and training. Here are key insights:
4.1 Advantages
● Flexibility: MLPs can model a wide variety of tasks, from regression to classification.
● Scalability: They can be scaled up by increasing the number of hidden layers and
neurons to handle more complex problems.
● Integration with Other Architectures: MLPs often serve as the final classifier in
complex architectures like convolutional neural networks (CNNs) and recurrent neural
networks (RNNs).
4.2 Challenges
● Overfitting: Without proper regularization, MLPs can memorize training data instead of
generalizing.
● High Computational Cost: Training MLPs requires significant computational resources,
especially for large networks.
● Data Requirements: MLPs perform well when provided with large, high-quality
datasets, but their performance deteriorates with limited or noisy data.
4.3 Future Directions
Improvements in hardware, better optimization methods, and better algorithms continue to improve MLP performance. Productive areas for enhancing MLPs further include research into combining MLPs with other architectures and further work on deeper MLPs.
5. Conclusion
To sum up, the MLP is a robust and versatile kind of artificial neural network that is widely used in machine learning. Its fundamental components include the input layer, hidden layers, output layer, neurons, activation functions, weights, and biases. As supervised models, MLPs learn their input/output mapping through error minimization techniques such as backpropagation with gradient descent. The technique is applied in areas including computer vision, speech recognition, natural language processing, and finance. As an effective approach to solving multifaceted problems, MLPs have a long, successful history and even greater potential in the development of AI.
6. References
1. I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. Cambridge, MA, USA: MIT Press,
2016.
[Online]. Available: https://www.deeplearningbook.org
2. C. M. Bishop, Pattern Recognition and Machine Learning. New York, NY, USA: Springer, 2006.
[Online]. Available: https://link.springer.com/book/10.1007/978-0-387-45528-0
3. D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning representations by back-
propagating errors," Nature, vol. 323, no. 6088, pp. 533–536, Oct. 1986.
[Online]. Available: https://www.nature.com/articles/323533a0
4. Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document
recognition," Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, Nov. 1998.
[Online]. Available: https://ieeexplore.ieee.org/document/726791
5. T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data Mining,
Inference, and Prediction, 2nd ed. New York, NY, USA: Springer, 2009.
[Online]. Available: https://link.springer.com/book/10.1007/978-0-387-84858-7