Fundamentals of AI - Visual Map (AIF-C01)

The document discusses the fundamentals of artificial intelligence, focusing on key concepts such as underfitting, overfitting, and model accuracy. It highlights the importance of hyperparameter adjustments, dataset scaling, and regularization techniques to improve model performance. Additionally, it covers various AI applications, including natural language processing and image recognition, while emphasizing the significance of feature engineering and model training.


Fundamentals of Artificial Intelligence

Model Fit

Underfitting
- Inaccurate results on both the training dataset and new data.
- Caused by: not training long enough; a dataset that is not large enough.
- Prevention: longer training duration; scaling the dataset; diversifying the dataset.
- Bias: disparities in the performance of a model across different groups; "error due to incorrect assumptions in the model."

Overfitting
- Performs better on training data than it does on new data.
- Caused by: training for too long; a dataset that is not diverse enough.
- Prevention:
  - Early stopping: stops training before the model overfits to noise.
  - Pruning: removes irrelevant features.
  - Regularization: penalizes complexity to avoid overfitting.
  - Ensembling: combines models for better accuracy.
  - Data augmentation: alters data to improve robustness.
- Variance: the model's sensitivity to fluctuations or noise in the training data; "error due to the model's complexity."

Balanced
- The #1 challenge of AI: a model that generalizes well to new data.

Hyperparameters
User-defined settings that control the training process.
- Learning rate ("How big are the steps?"): controls the step size when updating model weights.
- Batch size ("How many attempts before adjusting? Solo vs. group"): the number of samples processed before updating weights.
- Number of epochs ("How many times do you practice?"): the number of times the model sees the entire dataset.
- Regularization ("Avoiding bad habits"): penalties that prevent the model from becoming too complex.

Scenario -> best hyperparameter adjustment
- Model is underfitting (low accuracy, high bias) -> increase epochs so the model learns more patterns.
- Model is overfitting (memorizing, not generalizing) -> increase regularization to penalize complexity.
- Model training is unstable (loss jumps, doesn't converge) -> reduce the learning rate for smoother optimization.
- Model training is too slow -> increase the batch size or increase the learning rate (carefully).
- Model updates are too noisy -> reduce the batch size for more stable training.
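Early stopping, listed above as an overfitting prevention, can be sketched in a few lines. This is a minimal illustration, not any framework's API; the loss values and the `patience` setting are made up for the example.

```python
def early_stopping(val_losses, patience=3):
    """Return the epoch index at which to stop: when validation loss
    has not improved for `patience` consecutive epochs."""
    best = float("inf")
    since_best = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            since_best = 0
        else:
            since_best += 1
            if since_best >= patience:
                return epoch  # stop here; the model has begun overfitting
    return len(val_losses) - 1  # patience never exhausted

# Validation loss falls, then rises as the model starts to overfit:
losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.56, 0.61]
print(early_stopping(losses, patience=3))  # → 6
```

Note the asymmetry with the troubleshooting table above: epochs fight underfitting, while early stopping caps them to fight overfitting.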
Inference Parameters
User-defined settings that influence the model's output at inference time.
- Temperature: influences the likelihood of the model selecting lower-probability outputs; creativity vs. abstraction. Low = focused, high-probability tokens. High = more random responses -> hallucinations.
- Top K: the number of most-likely candidates that the model considers for the next token; limits the number of probable words. High K = more probable words, more diverse and creative output.
- Top P: the percentage (0-1) of most-likely candidates that the model considers for the next token. High P = a broad range of possible words, possibly more creative and diverse output.
- Response length: defines the minimum or maximum number of tokens in the output.
- Penalties: adjust the model's response by penalizing length, repetition, frequency, or token types.
- Stop sequences: specify character sequences that halt further token generation.

AI / DL Use Cases
- Computer Vision: interprets and understands digital images and videos. Examples: image classification, object detection, image segmentation, fraud detection.
- Natural Language Processing (NLP): interaction between human language and computers. Examples: text classification, sentiment analysis, machine translation, language generation.
- Intelligent Document Processing (IDP): extracts and classifies information (structured) from unstructured data.
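How temperature, Top K, and Top P interact during token sampling can be sketched as follows. This is a toy illustration, not any model provider's implementation; the logits and the fixed seed are made up for the example.

```python
import math
import random

def sample_token(logits, temperature=1.0, top_k=None, top_p=None, seed=0):
    """Pick a token id. Lower temperature sharpens the distribution toward
    high-probability tokens; top_k keeps only the k most-likely candidates;
    top_p keeps the smallest set whose probabilities sum to at least p."""
    # Temperature scaling, then a numerically stable softmax.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [(i, e / total) for i, e in enumerate(exps)]
    probs.sort(key=lambda t: t[1], reverse=True)

    if top_k is not None:            # keep the k most-likely candidates
        probs = probs[:top_k]
    if top_p is not None:            # nucleus: smallest set summing to >= p
        kept, running = [], 0.0
        for tok, p in probs:
            kept.append((tok, p))
            running += p
            if running >= top_p:
                break
        probs = kept

    # Renormalise the surviving candidates and draw one.
    total = sum(p for _, p in probs)
    r = random.Random(seed).random() * total
    for tok, p in probs:
        r -= p
        if r <= 0:
            return tok
    return probs[-1][0]

# With a very low temperature the most likely token dominates:
print(sample_token([2.0, 1.0, 0.1], temperature=0.1))  # → 0
```

Setting `top_k=1` makes the output deterministic (always the argmax), which is why low-randomness settings are recommended when hallucinations are costly.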

Training, Model, and Inference

- Training: algorithms learn a mathematical relationship between outputs and inputs (features).
- Features: columns in a table, or pixels in an image.
- Training data: known data. Labels are specified for supervised learning (unspecified for unsupervised learning).
- Model artifact: the trained parameters, a model definition that describes how to compute inferences, and other metadata.
- Model parameters: learned and adjusted iteratively during the training process to find the best fit (minimize errors).
- Hyperparameters: user-defined to control the training process.
- Inference code: software that implements the model by reading the artifacts.
- Prompt: a specific set of inputs (plus enrichments) towards an expected output.
- Inference parameters: user-defined to influence the model output.
- Inference: making predictions (output) on new, unknown data.

Gen AI is a subset of Deep Learning (DL), which is a subset of Machine Learning (ML), which is a subset of Artificial Intelligence (AI).

Model Training produces Model Artifacts. Sources:
- Open-source pre-trained model
- Training a custom model
- Hybrid model with fine-tuning

Deployment methods (production):
- Managed API: SageMaker AI
- Self-hosted API: EC2

Data Sets
- Training (80%): used iteratively to train the model towards the desired result.
- Validation (10%) [Optional]: hyperparameter tuning; selecting the best model.
- Test (10%): evaluates final performance on unseen data; determines how well the model generalizes.

Feature Engineering Process
Creates new input labels (features) for the model.
- Feature creation: creating new features from existing data.
- Feature transformation and imputation: replacing missing features or features that are not valid.
- Feature extraction: reducing the amount of data to be processed using dimensionality-reduction techniques.
- Feature selection: selecting a subset of the extracted features.

Neural Networks (NN)
- Deep Learning (DL) is built on neural networks.
- Uses GPUs for parallel computations.
- Layers add weights to features.
- Built with ML frameworks.
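The 80/10/10 dataset split described above can be sketched with plain Python. The function name, seed, and proportions are illustrative; real pipelines typically use library utilities for this.

```python
import random

def split_dataset(rows, seed=42):
    """Shuffle, then split into ~80% training, 10% validation, 10% test."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)  # shuffle so the split is unbiased
    n = len(rows)
    n_train = int(n * 0.8)
    n_val = int(n * 0.1)
    train = rows[:n_train]
    val = rows[n_train:n_train + n_val]
    test = rows[n_train + n_val:]      # remainder: evaluated only once
    return train, val, test

train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))  # → 80 10 10
```

Keeping the test split untouched until the very end is what makes it a fair estimate of generalization.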

Taxonomy: Data -> Training (Learning Algorithms) -> Model -> Inferencing

Data Types
- Structured: tables (columns in a table).
- Semi-structured: transactional data, JSON.
- Unstructured: video, image, text.

Learning Types
- Supervised: labeled training data.
  - Classification (output is a category/class): binary or multi-class. Algorithms: logistic regression, KNN, decision trees (if-else structures to predict an outcome). Examples: sentiment analysis, fraud identification.
  - Regression (output is a continuous value; estimates numerical values): linear regression, neural networks.
- Semi-supervised: supervised + unsupervised; used when it is difficult to obtain labels for a percentage of the dataset. Learns from a small labeled subset first, then assigns labels to and estimates the remaining samples (e.g., classifying transactions with a small set of known fraud cases).
- Self-supervised: generates its own pseudo-labels by using pre-text tasks; for use cases where labels are not required.
- Unsupervised: unlabeled data; pattern discovery.
  - Clustering: groups similar data points (K-Means algorithm).
  - Association rule learning: rule-based relationships between inputs in a dataset.
  - Dimensionality reduction: simplifying data while keeping the most relevant features (Principal Component Analysis, PCA).
  - Anomaly detection: identification of rare items, events, or observations in data (probability density).
- Reinforced (RL): "An agent learns to make decisions by performing actions (determined by a policy) in (a particular state of) an environment to maximise cumulative rewards." Continuously improves its model by mining feedback from previous iterations.

Inference Types
- Real-time: lowest latency (ms-s); fast, near-instant; persistent endpoint. Payload <= 6 MB/record; processing time <= 60 s.
- Batch: highest latency (min-hours); batch processing of large datasets. Payload <= 100 MB/mini-batch; processing time <= 1 hr.
- Serverless: low latency (ms-s); on-demand, no infrastructure, pay only while the endpoint is processing requests; for infrequent, intermittent, short-term use with idle periods; tolerates "cold starts." Payload <= 4 MB/record; processing time <= 60 s.
- Asynchronous: mid-high latency (near real-time); higher initial latency. Payload <= 1 GB/record; processing time <= 1 hour.
- Inferencing at the edge: deployed directly on the edge device (e.g., Raspberry Pi); very low latency, offline capability, local inference; runs close to where the data is generated, in places with limited internet connections. Alternative: deployed on a remote server, with the edge device connecting via API; higher latency, must be online for access.

Architectures
- GANs (Generative Adversarial Networks): a Generator creates synthetic data from random input, and a Discriminator evaluates whether data is real or fake. Examples: generating realistic images or artwork (e.g., DeepFake images), identifying fake images; anomaly detection, image synthesis.
- Transformers (self-attention mechanism): process sequential data and capture context and long-range dependencies.
  - BERT (Bidirectional Encoder Representations from Transformers): evaluates a sentence in both directions for understanding (e.g., predicting missing words).
  - GPT (Generative Pretrained Transformer).
- CNN (Convolutional Neural Network): for grid-like data; image recognition and classification; analyzes videos; pattern recognition and spatial relationships between images.
- RNN (Recurrent Neural Network): for sequential data such as time series. LSTM (Long Short-Term Memory) networks extend RNNs to capture dependencies across time.
- FNN (Feedforward Neural Networks): process data in one direction; function approximation in structured data.
- Autoencoders (including VAE, Variational Autoencoders): an Encoder compresses the input into a compact latent representation ("latent space"); a Decoder reconstructs the data from the latent space. Used for anomaly detection and image synthesis.
- Diffusion Models: forward diffusion corrupts data with noise; reverse diffusion denoises the data; non-deterministic. Stable Diffusion uses a reduced-definition latent space.
- LLM (Large Language Model): trained on massive text data (internet, books, etc.); understands and generates human-like text.
- SLM (Small Language Model).
- Multimodal Models: integration of multiple types of data (e.g., text and image) into a unified representation.

Foundation Models (FM) on AWS
- Amazon Bedrock: pre-built FMs only.
- Amazon SageMaker JumpStart: pre-built + pre-trained models.
- Amazon SageMaker AI: custom build + train.
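The K-Means clustering listed under unsupervised learning can be sketched in plain Python: assign each point to its nearest centroid, then move each centroid to the mean of its cluster, and repeat. The 2-D points are toy data for the example.

```python
import math
import random

def dist(a, b):
    """Euclidean distance between two 2-D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def mean(pts):
    """Centroid (coordinate-wise mean) of a list of 2-D points."""
    return (sum(p[0] for p in pts) / len(pts),
            sum(p[1] for p in pts) / len(pts))

def k_means(points, k=2, iters=10, seed=0):
    """Group similar data points into k clusters."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)          # random distinct starting points
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:                        # assignment step
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        centroids = [mean(c) if c else centroids[i]  # update step
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Two obvious groups, near (0, 0) and near (10, 10):
pts = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
centroids, clusters = k_means(pts, k=2)
print(sorted(len(c) for c in clusters))  # → [3, 3]
```

Because K-Means never sees labels, the two recovered groups are discovered purely from distances between the points.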
Tokens and Embeddings
- Token: the basic unit of text; a standardization of the input data.
- Embeddings and vectors: each token is assigned a list of numbers (a vector) that relates it to other tokens. [Amazon Titan Multimodal Embeddings]

Model Evaluation Metrics

Single-threshold metrics (Confusion Matrix)
- Accuracy: how many total predictions (both positives and negatives) are correct. Best when classes are balanced and errors are equally costly.
- Precision: how many predicted positives are actually positive? Best when false positives are costly (e.g., predictive maintenance).
- Recall / True Positive Rate (TPR): how many actual positives are correctly identified as positives? Best when missing positives is costly (e.g., medical diagnoses).
- Specificity / True Negative Rate (TNR): how many actual negatives are correctly identified as negatives? Best when correctly identifying negatives is important (e.g., quality control).
- False Positive Rate (FPR): how many actual negatives are incorrectly identified as positives? Best when false alarms are costly (e.g., fraud detection).
- False Negative Rate (FNR): how many actual positives are incorrectly identified as negatives? Best when missing positives is critical (e.g., security screening).
- F1 Score: how well the model balances precision and recall. Best when both false positives and false negatives matter (e.g., information retrieval).

Regression metrics
- MAE: Mean Absolute Error.
- MAPE: Mean Absolute Percentage Error.
- RMSE: Root Mean Squared Error.
- R-squared: explains the variance in your model; R² close to 1 means predictions are good.
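All of the single-threshold metrics follow directly from the four confusion-matrix counts; a small helper makes the definitions concrete. The counts in the example are invented for illustration.

```python
def classification_metrics(tp, fp, fn, tn):
    """Single-threshold metrics computed from confusion-matrix counts."""
    accuracy    = (tp + tn) / (tp + fp + fn + tn)
    precision   = tp / (tp + fp)   # predicted positives that are right
    recall      = tp / (tp + fn)   # actual positives that are found (TPR)
    specificity = tn / (tn + fp)   # actual negatives that are found (TNR)
    fpr         = fp / (fp + tn)   # false-alarm rate
    fnr         = fn / (fn + tp)   # miss rate
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {"accuracy": accuracy, "precision": precision, "recall": recall,
            "specificity": specificity, "fpr": fpr, "fnr": fnr, "f1": f1}

# 90 true positives, 10 false positives, 30 false negatives, 870 true negatives:
m = classification_metrics(tp=90, fp=10, fn=30, tn=870)
print(round(m["precision"], 2), round(m["recall"], 2))  # → 0.9 0.75
```

Note how accuracy (0.96) looks flattering here while recall (0.75) exposes the 30 missed positives, which is exactly why the "best when" guidance above matters on imbalanced data.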

Multi-threshold metrics (threshold-independent, curve-based)
- AUC-ROC: True Positive Rate (recall) vs. False Positive Rate (FPR); how well the model distinguishes between the positive and negative classes across various thresholds. Best for evaluating performance across thresholds on balanced datasets (e.g., binary classification).
- AUC-PR: Precision vs. Recall (TPR). Best for imbalanced datasets, where the positive class is rare (e.g., fraud detection).

FM Lifecycle
Data Selection -> Pre-training -> Optimization -> Evaluation -> Deployment -> Feedback & Inference
- Pre-training uses unlabeled data: a massive dataset from diverse sources, learned in a self-supervised way (the model creates labels from the input data).

Customizing a Foundation Model (by increasing cost)
- Prompt Engineering ($): does NOT change the weights of the FM. Single-turn messaging does not remember the earlier conversation; multi-turn messaging does.
- RAG, Retrieval-Augmented Generation ($$): does NOT change the weights of the FM. The FM can reference a data source outside of its training data (e.g., a knowledge base). The "augmented" prompt = query + "retrieved" text. An embeddings model converts data (text, images, etc.) into numbers (vectors); a vector database stores these vectors and helps find similar ones efficiently.
- Fine-tuning ($$$): DOES change the weights of the FM; uses labeled data. Includes instruction-based fine-tuning and RLHF (Reinforcement Learning from Human Feedback). Labeling: Amazon SageMaker Ground Truth + Amazon Mechanical Turk.
- Continued Pre-training = Domain Adaptation Fine-tuning ($$$$): further pre-training on unlabeled domain data.
- Transfer Learning: reusing a pre-trained model as the starting point for a new task.

Vector Databases on AWS
- Amazon OpenSearch Service (Serverless): search and analytics database; real-time similarity queries; kNN search capability.
- Amazon DocumentDB: NoSQL database; real-time similarity queries.
- Amazon Aurora: relational database, proprietary to AWS.
- Amazon RDS for PostgreSQL: relational database, open-source.
- Amazon Neptune: graph database.
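The retrieval step of RAG boils down to a nearest-neighbour search over embedding vectors, which can be sketched with cosine similarity. The 3-dimensional "embeddings" below are toy numbers; a real embeddings model produces much longer vectors, and a vector database performs the lookup at scale.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec, documents, top_n=1):
    """Rank stored (text, vector) pairs by similarity to the query vector,
    mimicking the kNN lookup a vector database performs for RAG."""
    ranked = sorted(documents,
                    key=lambda d: cosine_similarity(query_vec, d[1]),
                    reverse=True)
    return [text for text, _ in ranked[:top_n]]

docs = [
    ("refund policy",  [0.9, 0.1, 0.0]),
    ("shipping times", [0.1, 0.9, 0.1]),
    ("warranty terms", [0.2, 0.1, 0.9]),
]
query = [0.85, 0.15, 0.05]  # toy embedding of "how do I get a refund?"
context = retrieve(query, docs)
# The "augmented" prompt = query + retrieved text:
augmented_prompt = f"Context: {context[0]}\n\nQuestion: how do I get a refund?"
print(context)  # → ['refund policy']
```

Because only the prompt changes, this customizes the FM's answers without touching its weights, which is why RAG sits below fine-tuning on the cost scale.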
