Symbolic Machine Learning
M.S. Kaysar, M.Engg
CSE, IUB
Outline
• Man vs. Machine
• Machine Learning
• Application domains
• Algorithms
Man vs. Machine
Strengths
Human:
• Has common sense and a bigger knowledge base, and thus can perceive the environment better than a computer, given appropriate means (especially in visual form)
• Can think (synthesize) new rules 'out of the box'
• Psychologically, a human decision is more trusted than a computer expert system's decision
• Can detect trends, patterns, or anomalies in visualized data
• Good at learning
Computer:
• Speed: fast
• Reliable
• Endurance: does not get tired
• Unbiased
• Consistent
• Can try many more combinations than a human is capable of
Man vs. Machine
Weaknesses
Human:
• Easily tired and bored, and thus can only be utilized for short periods, perhaps as an 'oracle' only
• Cannot micromanage
• Biased and inconsistent
• Can make errors; not a perfect decision maker
• Effectively cannot see anything if the data is presented in an awkward manner
Computer:
• Difficult to synthesize new rules (cannot think 'out of the box')
• Limited knowledge base
• No common sense
Man vs. Machine
Key Difference: Intelligence
• How can we build intelligence into machines or intelligent agents?
• Solution: Artificial Intelligence (AI)
What is Intelligence?
• Intelligence (also called intellect) is an umbrella term
used to describe a property of the mind that
encompasses many related abilities, such as the
capacities to reason, to plan, to solve problems, to
think abstractly, to comprehend ideas, to use
language, & to learn [Wikipedia]
Where is mind?
What is intelligence?
“I could feel – I could smell – a new kind of intelligence across the table.”
– Garry Kasparov
Speech Recognition
Automated call centers
Navigation Systems
Museum Tour-Guide Robots
Minerva, 1998
Rhino, 1997
Mars Rovers (2003-now)
Europa Mission ~ 2018?
Humanoid Robots
Brain-Computer Interfaces
Singing, Dancing, Bride, …
How would it be possible?
Cons:
• Needs a lot of labeled data
• Error prone: usually impossible to get perfect accuracy
Machine Learning Applications
Countless…
• Machine perception
• Computer vision, including object recognition
• Natural language processing
• Syntactic pattern recognition
• Search engines
• Medical diagnosis
• Bioinformatics
• Brain-machine interfaces
• Cheminformatics
• Detecting credit card fraud
• Stock market analysis
• Classifying DNA sequences
• Sequence mining
• Speech and handwriting recognition
• Game playing
• Software engineering
• Adaptive websites
• Robot locomotion
• Computational advertising
• Computational finance
• Structural health monitoring
• Sentiment analysis (opinion mining)
• Affective computing
• Information retrieval
• Recommender systems
• Optimization and metaheuristics
A Few Examples
Learning to Predict Emergency C-Sections
Given: 9,714 patient records, each describing a pregnancy and birth; each record contains 215 features.
Learn to predict: classes of future patients at high risk for emergency Cesarean section.
Learning to Predict Emergency C-Sections
[Figure: one of 18 learned rules]
Credit Risk Analysis
[Figures: example data; rules learned from synthesized data]
Learning to detect objects in images
• Supervised learning
• Bayesian networks
• Hidden Markov models
• Unsupervised clustering
• Reinforcement learning
• ....
Related Disciplines
Machine learning sits at the intersection of several fields:
• Computer science
• Animal learning (cognitive science, psychology, neuroscience)
• Economics & organizational behavior
• Evolution
• Adaptive control theory
• Statistics
The ML niche is growing (why?)
[Figure: testing pipeline: Test Image → Image Features → Learned Model → Prediction]
How does classification work?
• A two-step process:
• Learning step: a classification model is constructed
  – Describes a set of predetermined classes
  – Each tuple/sample is assumed to belong to a predefined class, as determined by the class label attribute
  – The set of tuples used for model construction is the training set
  – The model is represented as classification rules, decision trees, or mathematical formulae
• Classification step: the model is used to predict class labels for given data
  – Used for classifying future or unknown objects
  – Estimate the accuracy of the model:
    • The known label of each test sample is compared with the model's prediction
    • The accuracy rate is the percentage of test-set samples correctly classified by the model
    • The test set must be independent of the training set (otherwise the estimate reflects overfitting)
  – If the accuracy is acceptable, use the model to classify new data (see the sketch below)
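As a rough illustration of the two-step process (not from the slides; it assumes scikit-learn and uses a synthetic dataset and a tree classifier purely as stand-ins):

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Synthetic labeled data standing in for a real training/test corpus.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Keep the test set independent of the training set (otherwise overfitting).
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

model = DecisionTreeClassifier().fit(X_train, y_train)  # learning step
y_pred = model.predict(X_test)                          # classification step

# Accuracy rate: percentage of test-set samples correctly classified.
print(accuracy_score(y_test, y_pred))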
Process (1): Model Construction
[Figure: Training Data fed into Classification Algorithms, producing a Classifier (model)]
Process (2): Using the Model in Prediction
[Figure: the Classifier is evaluated on Testing Data, then applied to Unseen Data, e.g. (Jeff, Professor, 4) → Tenured?]

NAME     RANK            YEARS   TENURED
Tom      Assistant Prof  2       no
Merlisa  Associate Prof  7       no
George   Professor       5       yes
Joseph   Assistant Prof  7       yes
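A hedged sketch of Process (2) on the tenure table above; the rule encoded here (IF rank = 'Professor' OR years > 6 THEN tenured = 'yes') is an illustrative example of a learned classification rule, not necessarily the one produced from the (unshown) training data:

# Apply a learned rule to the test table, then to the unseen tuple (Jeff, Professor, 4).
def tenured(rank, years):
    # Example learned rule: IF rank = 'Professor' OR years > 6 THEN 'yes'.
    return "yes" if rank == "Professor" or years > 6 else "no"

test_set = [("Tom", "Assistant Prof", 2, "no"),
            ("Merlisa", "Associate Prof", 7, "no"),
            ("George", "Professor", 5, "yes"),
            ("Joseph", "Assistant Prof", 7, "yes")]

correct = sum(tenured(r, y) == label for _, r, y, label in test_set)
print(correct / len(test_set))   # 3/4: Merlisa is misclassified
print(tenured("Professor", 4))   # unseen data (Jeff) -> 'yes'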
Classification Problems
• Classify examples into a given set of categories
[Figure: labeled training examples are fed to an ML algorithm, which produces a classification rule; a new example passed through the rule yields a predicted classification]
Decision Trees
Decision Tree Learning
• Decision tree induction is the learning of
decision trees from class-labeled training tuples.
• A decision tree is a flowchart-like tree structure,
where each internal node (nonleaf node)
denotes a test on an attribute, each branch
represents an outcome of the test, and each leaf
node (or terminal node) holds a class label.
• The topmost node in a tree is the root node.
A decision tree for the concept buys_computer, indicating whether an
AllElectronics customer is likely to purchase a computer. Each internal
(nonleaf) node represents a test on an attribute. Each leaf node represents a
class (either buys_computer = yes or buys_computer = no).
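A small sketch of such a flowchart-like tree, assuming scikit-learn; the six-tuple buys_computer-style dataset is hypothetical, only meant to make the structure visible:

from sklearn.tree import DecisionTreeClassifier, export_text

# Encoded attributes: age (0=youth, 1=middle_aged, 2=senior), student (0/1).
X = [[0, 0], [0, 1], [1, 0], [1, 1], [2, 0], [2, 1]]
y = ["no", "yes", "yes", "yes", "no", "yes"]  # buys_computer labels

tree = DecisionTreeClassifier(criterion="entropy").fit(X, y)

# Internal nodes test an attribute, branches are test outcomes,
# and each leaf holds a class label (yes / no).
print(export_text(tree, feature_names=["age", "student"]))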
Example: Good versus Evil
• problem: identify people as good or bad from
their appearance
A Decision Tree Classifier
How to Build Decision Trees
Attribute selection measures:
• Information gain
• Gini index
• Gain ratio
Information Gain
Expected information (entropy) needed to classify a tuple in D:
Info(D) = -\sum_{i=1}^{m} p_i \log_2 p_i
After partitioning D on attribute A into v subsets, the expected information is:
Info_A(D) = \sum_{j=1}^{v} \frac{|D_j|}{|D|} \times Info(D_j)
Information gain is the attribute selection measure introduced by Quinlan [Qui86].
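A minimal Python sketch of Info(D) (an illustrative addition, standard library only):

from math import log2

def info(class_counts):
    # Expected information (entropy) needed to classify a tuple in D,
    # where p_i is the fraction of tuples belonging to class i.
    total = sum(class_counts)
    return -sum((c / total) * log2(c / total) for c in class_counts if c > 0)

print(info([9, 5]))  # ~0.940 bits for the 9-yes / 5-no example below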
How to choose best splitting criterion?
• The class label attribute, buys computer, has two
distinct values (namely, {yes, no});
• two distinct classes (i.e., m = 2).
• class C1 = yes
• class C2 = no
• 9 tuples of class = yes
• 5 tuples of class = no
• A (root) node N is created for the tuples in D.
• To find the splitting criterion for these tuples,
compute the information gain of each attribute.
• Expected information needed to classify a tuple in D:
Info(D) = -\sum_{i=1}^{m} p_i \log_2 p_i = -\frac{9}{14} \log_2 \frac{9}{14} - \frac{5}{14} \log_2 \frac{5}{14} = 0.940 bits
• Next, we need to compute the expected
information requirement for each attribute.
• Start with attribute: age
• age = “youth”: yes = 2 tuples, no = 3 tuples
• age = “middle_aged”: yes = 4 tuples, no = 0 tuples
• age = “senior”: yes = 3 tuples, no = 2 tuples
• The expected information needed to classify a tuple in D if the tuples are partitioned according to age:
Info_age(D) = \sum_{j=1}^{v} \frac{|D_j|}{|D|} \times Info(D_j) = \frac{5}{14} \times 0.971 + \frac{4}{14} \times 0 + \frac{5}{14} \times 0.971 = 0.694 bits
• Hence, the gain in information from such a partitioning would be:
Gain(age) = Info(D) - Info_age(D) = 0.940 - 0.694 = 0.246 bits
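The same numbers can be reproduced with a short sketch (illustrative, standard library only):

from math import log2

def info(counts):
    # Entropy of a class distribution given as raw counts.
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)

partitions = [(2, 3), (4, 0), (3, 2)]  # (yes, no) for youth, middle_aged, senior
D = 14                                 # |D| = 9 yes + 5 no

info_age = sum(sum(p) / D * info(p) for p in partitions)
gain_age = info([9, 5]) - info_age
print(round(info_age, 3), round(gain_age, 3))  # 0.694 and 0.247 (0.246 with the slide's rounding)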
Law of total probability:
P(B) = \sum_{i=1}^{M} P(B \mid A_i) P(A_i)
Bayes’ Theorem: Basics
• Bayes’ Theorem:
P(H \mid X) = \frac{P(X \mid H) P(H)}{P(X)}
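A toy numeric check of the theorem (the numbers are hypothetical):

# Hypothetical prior and likelihoods: P(H) = 0.3, P(X|H) = 0.8, P(X|~H) = 0.2.
p_h, p_x_given_h, p_x_given_not_h = 0.3, 0.8, 0.2

# Denominator P(X) via the law of total probability shown above.
p_x = p_x_given_h * p_h + p_x_given_not_h * (1 - p_h)

print(p_x_given_h * p_h / p_x)  # posterior P(H|X) ~= 0.632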
Naïve Bayes Classifier
• When to use
Moderate or large training set available
Attributes that describe instances are
conditionally independent given the class
• Applications
Diagnosis
Classifying text documents
Naïve Bayes Classifier
• A simplified assumption: attributes are conditionally
independent (i.e., no dependence relation between attributes):
P(X \mid C_i) = \prod_{k=1}^{n} P(x_k \mid C_i) = P(x_1 \mid C_i) \times P(x_2 \mid C_i) \times \cdots \times P(x_n \mid C_i)
• This greatly reduces the computation cost: only the class distributions need to be counted
• If Ak is categorical, P(xk|Ci) is the # of tuples in Ci having value xk
for Ak divided by |Ci, D| (# of tuples of Ci in D)
• If Ak is continuous-valued, P(xk|Ci) is usually computed from a Gaussian distribution with mean μ and standard deviation σ:
g(x, \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}
and P(x_k \mid C_i) = g(x_k, \mu_{C_i}, \sigma_{C_i})
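A sketch of the Gaussian density used for a continuous-valued attribute (illustrative; the mean/std values are hypothetical):

from math import exp, pi, sqrt

def gaussian(x, mu, sigma):
    # Density of N(mu, sigma^2) evaluated at x.
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sqrt(2 * pi) * sigma)

# E.g., if age values in class Ci have mean 38 and std 12,
# then P(age = 35 | Ci) is estimated as:
print(gaussian(35, 38, 12))  # ~0.032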
Predicting a class label using naïve Bayesian classification
• The data tuples are
described by the
attributes:
- age, income, student, &
credit_rating
Class:
C1: buys_computer = ‘yes’
C2: buys_computer = ‘no’
Data to be classified:
X = (age = youth, income = medium, student = yes, credit_rating = fair)
We need to maximize P(X|Ci)P(Ci), for i = 1, 2.
P(buys_computer = “yes”) = 9/14 = 0.643
P(buys_computer = “no”) = 5/14 = 0.357
Compute P(X|Ci) for each class
P(age = youth | buys_computer = yes) = 2/9 = 0.222
P(age = youth | buys_computer = no) = 3/5 = 0.6
P(income = medium | buys_computer = yes) = 4/9 = 0.444
P(income = medium | buys_computer = no) = 2/5 = 0.4
P(student = yes | buys_computer = yes) = 6/9 = 0.667
P(student = yes | buys_computer = no) = 1/5 = 0.2
P(credit_rating = fair | buys_computer = yes) = 6/9 = 0.667
P(credit_rating = fair | buys_computer = no) = 2/5 = 0.4
X = (age = youth , income = medium, student = yes, credit_rating = fair)
P(X|Ci):
P(X | buys_computer = “yes”) = 0.222 x 0.444 x 0.667 x 0.667 = 0.044
P(X | buys_computer = “no”) = 0.6 x 0.4 x 0.2 x 0.4 = 0.019
P(X|Ci)P(Ci):
P(X | buys_computer = “yes”) P(buys_computer = “yes”) = 0.044 x 0.643 = 0.028
P(X | buys_computer = “no”) P(buys_computer = “no”) = 0.019 x 0.357 = 0.007
Therefore, the naïve Bayesian classifier predicts buys_computer = yes for tuple X.
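The arithmetic above can be reproduced directly (an illustrative sketch, standard library only):

# Per-attribute likelihoods for X = (youth, medium, student, fair), from the slides.
likelihoods = {"yes": [2/9, 4/9, 6/9, 6/9],
               "no":  [3/5, 2/5, 1/5, 2/5]}
priors = {"yes": 9/14, "no": 5/14}

for c in ("yes", "no"):
    p_x_given_c = 1.0
    for p in likelihoods[c]:          # conditional-independence product
        p_x_given_c *= p
    print(c, round(p_x_given_c, 3), round(p_x_given_c * priors[c], 3))
# yes: 0.044 and 0.028; no: 0.019 and 0.007 -> predict buys_computer = yes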
SVM (Support Vector Machines)
Two-class problem:
Database D: (X_1, y_1), (X_2, y_2), …
X_i: training tuples
y_i: class label ∈ {+1, -1}
[Figure: several candidate separating hyperplanes; which one is better?]
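A minimal sketch of the answer, assuming scikit-learn: among all separating hyperplanes, a linear SVM chooses the one with the maximum margin, which is determined by the support vectors:

from sklearn.svm import SVC

X = [[1, 1], [2, 1], [1, 2],   # class +1
     [5, 5], [6, 5], [5, 6]]   # class -1
y = [+1, +1, +1, -1, -1, -1]

clf = SVC(kernel="linear").fit(X, y)
print(clf.support_vectors_)    # the tuples that pin down the margin
print(clf.predict([[3, 3]]))   # class for a new tuple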
SVM—Linearly Inseparable
Now what?
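One standard answer (an assumption here, since the slide stops at the question) is the kernel trick: implicitly map the tuples into a higher-dimensional space where they become separable. A sketch on circle-shaped data, assuming scikit-learn:

from sklearn.datasets import make_circles
from sklearn.svm import SVC

X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

linear = SVC(kernel="linear").fit(X, y)
rbf = SVC(kernel="rbf").fit(X, y)
print(linear.score(X, y))  # poor: no separating line exists in 2-D
print(rbf.score(X, y))     # near 1.0 after the implicit RBF mapping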
A Puzzle…