0% found this document useful (0 votes)
49 views

ML 5 Practical Writeup

Rule-based machine learning methods identify and utilize a set of relational rules to represent captured knowledge, instead of a single model. Tree-based models empower predictive models with high accuracy, stability, and interpretability. They can effectively model non-linear relationships. Key tree-based algorithms include decision trees, random forests, and gradient boosting. Overfitting is a challenge that can be addressed through constraints on tree size or pruning. The choice between linear and tree-based models depends on factors like the complexity of relationships in the data and interpretability needs.

Uploaded by

Tanmay Sharma
Copyright
© © All Rights Reserved
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
49 views

ML 5 Practical Writeup

Rule-based machine learning methods identify and utilize a set of relational rules to represent captured knowledge, instead of a single model. Tree-based models empower predictive models with high accuracy, stability, and interpretability. They can effectively model non-linear relationships. Key tree-based algorithms include decision trees, random forests, and gradient boosting. Overfitting is a challenge that can be addressed through constraints on tree size or pruning. The choice between linear and tree-based models depends on factors like the complexity of relationships in the data and interpretability needs.

Uploaded by

Tanmay Sharma
Copyright
© © All Rights Reserved
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 3

ML PRACTICAL

AIM : WRITE STUDY ASSIGNMENT TO BUILD RULE BASED AND TREE BASED
MODEL

THEORY :

Rule Based Model:

Rule-based modeling is a modeling approach that uses a set of rules that indirectly specifies a
mathematical model. The rule-set can either be translated into a model such as Markov chains or
differential equations, or be treated using tools that directly work on the rule-set in place of a
translated model, as the latter is typically much bigger. Rule-based modeling is especially
effective in cases where the rule-set is significantly simpler than the model it implies, meaning
that the model is a repeated manifestation of a limited number of patterns. An important domain
where this is often the case is biochemical models of living organisms. Groups of mutually
corresponding substances are subject to mutually corresponding interactions.

Rule-based machine learning methods have been established for a long time.  There exists
a large body of research describing many different approaches, which are offering various
combinations of interesting properties. Among these properties are, for example:

 good interpretability,
 the ability to incrementally refine rules or sets of rules, and/or
 allowing rules to be expressed in a formalism like Datalog or Horn logic, i.e.
going beyond conjunctions of simple conditions based on attribute-value pairs.

Rule-based machine learning (RBML) is a term in computer science intended to encompass


any machine learning method that identifies, learns, or evolves 'rules' to store, manipulate or
apply. The defining characteristic of a rule-based machine learner is the identification and
utilization of a set of relational rules that collectively represent the knowledge captured by the
system. This is in contrast to other machine learners that commonly identify a singular model
that can be universally applied to any instance in order to make a prediction.
Rule-based machine learning approaches include learning classifier systems, association rule
learning, artificial immune systems, and any other method that relies on a set of rules, each
covering contextual knowledge.
While rule-based machine learning is conceptually a type of rule-based system, it is distinct from
traditional rule-based systems, which are often hand-crafted, and other rule-based decision
makers. This is because rule-based machine learning applies some form of learning algorithm to
automatically identify useful rules, rather than a human needing to apply prior domain
knowledge to manually construct rules and curate a rule set.

Rules: Rules typically take the form of an {IF:THEN} expression, (e.g. {IF 'condition' THEN
'result'}, or as a more specific example, {IF 'red' AND 'octagon' THEN 'stop-sign'}). An
individual rule is not in itself a model, since the rule is only applicable when its condition is
satisfied. Therefore rule-based machine learning methods typically comprise a set of rules,
or knowledge base, that collectively make up the prediction model.

Tree Based Model:

Tree based models empower predictive models with high accuracy, stability and ease of


interpretation. Unlike linear models, they map non-linear relationships quite well. They are
adaptable at solving any kind of problem at hand (classification or regression).

Tree based algorithms are considered to be one of the best and mostly used supervised learning
methods. Tree based algorithms empower predictive models with high accuracy, stability and
ease of interpretation. Unlike linear models, they map non-linear relationships quite well. They
are adaptable at solving any kind of problem at hand (classification or regression).

Methods like decision trees, random forest, gradient boosting are being popularly used in all
kinds of data science problems. Hence, for every analyst (fresher also), it’s important to learn
these algorithms and use them for modeling.

The decision of making strategic splits heavily affects a tree’s accuracy. The decision criteria is
different for classification and regression trees. Decision trees use multiple algorithms to decide
to split a node in two or more sub-nodes. The creation of sub-nodes increases the homogeneity of
resultant sub-nodes. In other words, we can say that purity of the node increases with respect to
the target variable. Decision tree splits the nodes on all available variables and then selects the
split which results in most homogeneous sub-nodes.

Overfitting is one of the key challenges faced while using tree based algorithms. If there is no
limit set of a decision tree, it will give you 100% accuracy on training set because in the worse
case it will end up making 1 leaf for each observation. Thus, preventing overfitting is pivotal
while modeling a decision tree and it can be done in 2 ways:

1. Setting constraints on tree size


2. Tree pruning

SOME KEY FACTORS WHICH WILL HELP YOU TO DECIDE WHICH


ALGORITHM TO USE:

1. If the relationship between dependent & independent variable is well approximated by a


linear model, linear regression will outperform tree based model.
2. If there is a high non-linearity & complex relationship between dependent & independent
variables, a tree model will outperform a classical regression method.
3. If you need to build a model which is easy to explain to people, a decision tree model will
always do better than a linear model. Decision tree models are even simpler to interpret
than linear regression!

You might also like