L10-DL Intro
Learning
Many of the AI systems in current use fall short of the tremendous breadth of
capabilities of the human brain.
An artificial general intelligence (AGI) is a hypothetical type of
intelligent agent which, if realized, could learn to accomplish any
intellectual task that human beings or animals can perform. (Wikipedia)
Generative AI: Deep learning models that generate outputs in the form of
images, video, audio, text, and candidate drug molecules.
Large Language Models (LLMs) such as GPT-4 have been argued to be early,
incomplete forms of AGI (Bubeck et al., 2023):
https://arxiv.org/abs/2303.12712
Predicting the 3D shape of a protein using AlphaFold (Jumper et al., 2021):
https://www.nature.com/articles/s41586-021-03819-2
AI-generated face images: https://generated.photos/
Number of computer cycles needed to train SOTA neural networks (1 petaflop = 10^15 floating-point operations)
Rectified Linear Unit (ReLU)
• ReLU neurons compute a linear weighted sum of their inputs.
• The output is a non-linear function of this total input.
• ReLU is the most widely used activation function (Glorot et al., 2011):
https://proceedings.mlr.press/v15/glorot11a/glorot11a.pdf
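As a minimal sketch of the two bullets above (NumPy; the function and variable names are our own, not from the slides): a single ReLU unit first computes the linear weighted sum of its inputs, then applies the non-linearity ReLU(z) = max(0, z).

```python
import numpy as np

def relu_unit(x, w, b):
    """One ReLU neuron: linear weighted sum of inputs, then max(0, .)."""
    z = np.dot(w, x) + b        # total input: weighted sum plus bias
    return np.maximum(0.0, z)   # rectification: negative totals map to 0

# Example: one neuron with three inputs
x = np.array([0.5, -1.0, 2.0])
w = np.array([0.2, 0.4, -0.1])
b = 0.1
print(relu_unit(x, w, b))       # max(0, 0.1 - 0.4 - 0.2 + 0.1) = 0.0
```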
Softmax function (normalized exponential function)
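The softmax maps a vector of real-valued scores z to a probability distribution: softmax(z)_i = exp(z_i) / Σ_j exp(z_j). A minimal NumPy sketch (the max-subtraction step is the standard numerical-stability trick; the names are ours):

```python
import numpy as np

def softmax(z):
    """Normalized exponential: maps scores to probabilities that sum to 1."""
    z = z - np.max(z)           # shift for numerical stability (output unchanged)
    e = np.exp(z)
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1])
print(softmax(scores))          # approx. [0.659, 0.242, 0.099], sums to 1
```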
The classic experiments of Hubel and Wiesel showed how the visual cortex
processes information hierarchically, extracting increasingly complex
information. They showed that the visual cortex contains a topographic map
of the visual field, where nearby cells process information from nearby
regions of the visual field.
They identified two types of neuron cells: simple cells whose output is
maximized by straight edges having particular orientations within their
receptive field, and complex cells which have larger receptive fields and
combine the outputs of the simple cells. They also discovered that
neighbouring cells have similar and overlapping receptive fields.
This gave rise to the concept of sparse interactions in CNNs, where each
unit focuses on local information rather than the complete global input.
Advantages of CNNs
1. They have sparse connections instead of full connectivity, which
reduces the number of parameters and makes CNNs efficient for
processing high-dimensional data.
2. Weight sharing: the same weights are reused across the entire image,
which reduces memory requirements and contributes to translational
invariance.
3. CNNs use the important concept of subsampling, or pooling, in which
the most prominent activations are propagated to the next layer and the
rest are dropped. This provides a fixed-size output matrix, which
classification layers typically require, and invariance to small
translations and rotations (see the sketch after this list).
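A minimal NumPy sketch of these three ideas (our own illustrative code, not from the slides): sliding a small kernel over the input gives sparse connections and weight sharing, and 2x2 max pooling subsamples the resulting feature map.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution: each output value uses only a local patch
    (sparse connections) and the same kernel everywhere (weight sharing)."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

def max_pool2d(x, size=2):
    """Non-overlapping max pooling: keep the most prominent value per window."""
    h, w = x.shape[0] // size, x.shape[1] // size
    return x[:h*size, :w*size].reshape(h, size, w, size).max(axis=(1, 3))

image = np.random.rand(6, 6)
kernel = np.array([[1., 0., -1.],   # a hand-set vertical-edge detector
                   [1., 0., -1.],
                   [1., 0., -1.]])
feature_map = conv2d(image, kernel)  # shape (4, 4)
pooled = max_pool2d(feature_map)     # shape (2, 2)
print(feature_map.shape, pooled.shape)
```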
Introduction
• Traditional pattern recognition models use hand-crafted features and a
relatively simple trainable classifier.
[Pipeline diagram: input → hand-crafted feature extractor → "simple" trainable classifier → output]
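As a hedged illustration of this classic pipeline (the particular feature and classifier below are stand-ins of our own choosing, not from the slides): a hand-crafted feature extractor feeds a simple trainable classifier.

```python
import numpy as np

def handcrafted_features(image):
    """Hand-crafted feature extractor: mean intensity plus average
    horizontal/vertical gradient magnitudes (a stand-in for edge features)."""
    gy, gx = np.gradient(image.astype(float))
    return np.array([image.mean(), np.abs(gx).mean(), np.abs(gy).mean()])

class NearestCentroid:
    """A 'simple' trainable classifier: assign each input to the nearest class mean."""
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        return self
    def predict(self, X):
        d = np.linalg.norm(X[:, None, :] - self.centroids_[None, :, :], axis=2)
        return self.classes_[d.argmin(axis=1)]

# Pipeline: images -> hand-crafted features -> simple classifier -> output
images = [np.random.rand(8, 8) for _ in range(10)]
labels = np.array([0, 1] * 5)
X = np.stack([handcrafted_features(im) for im in images])
clf = NearestCentroid().fit(X, labels)
print(clf.predict(X[:3]))
```

Deep learning replaces both hand-designed stages with layers whose features are learned end to end, which is the contrast this introduction is setting up.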
[Figure: on the left, the original image; in the middle, the same image with sections removed]