AITA : Machine Learning

John A. Bullinaria, 2003

1. What is Machine Learning?
2. The Need for Learning
3. Learning in Neural and Evolutionary Systems
4. Problems Facing Expert Systems
5. Learning in Rule Based Systems
6. Rule Induction and Rule Refinement
7. Concept Learning and Version Spaces
8. Learning Decision Trees

What is Machine Learning?

Any study of Machine Learning should begin with a formal definition of what is meant by Learning. A definition due to Simon (1983) is one of the best:

"Learning denotes changes in the system that are adaptive in the sense that they enable the system to do the same task (or tasks drawn from the same population) more effectively the next time."

We can easily extend this definition to our AI systems:

"Machine learning denotes automated changes in an AI system that are adaptive in the sense that they enable the system to do the same task (or tasks drawn from the same population) more effectively the next time."

The details of Machine Learning depend on the underlying knowledge representations, e.g. learning in neural networks will be very different to learning in rule based systems.

Types of Learning
The strategies for learning can be classified according to the amount of inference the system has to perform on its training data. In increasing order we have:

1. Rote learning: the new knowledge is implanted directly with no inference at all, e.g. simple memorisation of past events, or a knowledge engineer's direct programming of rules elicited from a human expert into an expert system.
2. Supervised learning: the system is supplied with a set of training examples consisting of inputs and corresponding outputs, and is required to discover the relation or mapping between them, e.g. as a series of rules, or a neural network.
3. Unsupervised learning: the system is supplied with a set of training examples consisting only of inputs and is required to discover for itself what appropriate outputs should be, e.g. a Kohonen Network or Self Organizing Map.

Early expert systems relied on rote learning, but for modern AI systems we are generally interested in the supervised learning of various levels of rules.

The Need for Learning

We have seen that extracting knowledge from human experts, and converting it into a form useable by the inference engine of an expert system, is an arduous and labour intensive process. As with many other types of AI system, it is much more efficient to give the system enough knowledge to get it started, and then leave it to learn the rest for itself. We may even end up with a system that learns to be better than a human expert. The general learning approach is to generate potential improvements, test them, and discard those which do not work. Naturally, there are many ways we might generate the potential improvements, and many ways we can test their usefulness. At one extreme, there are model driven (top-down) generators of potential improvements, guided by an understanding of how the problem domain works. At the other, there are data driven (bottom-up) generators, guided by patterns in some set of training data. We shall now look in turn at learning in neural, evolutionary, and rule-based systems.

Learning in Neural Network Systems

Recall that neural networks consist of many simple processing units (that perform addition and smooth thresholding) with activation passing between them via weighted connections. Learning proceeds by iteratively updating the connection weights $w_{ij}$ in such a way that the output errors on a set of training data are reduced.

$\mathrm{out}_j = \mathrm{Sigmoid}\left(\sum_i \mathrm{in}_i \, w_{ij}\right)$

The standard procedure is to define an output error measure (such as the sum squared difference between the actual network outputs and the target outputs), and use gradient descent weight updates to reduce that error. The details are complex, but such an approach can learn from noisy training data and generalise well to new inputs.
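As a rough illustration of the gradient descent idea above, the following sketch trains a single sigmoid unit on the logical OR function using a sum squared error measure. The function names, the learning rate, the number of epochs, and the choice of OR as training data are all assumptions made for this example; they are not taken from the lecture.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict(weights, inputs):
    # Append a constant 1.0 so the last weight acts as a bias.
    x = list(inputs) + [1.0]
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)))

def train_unit(data, n_inputs, rate=0.5, epochs=2000):
    """Gradient descent on E = 0.5 * (target - out)^2 for one sigmoid unit."""
    weights = [0.0] * (n_inputs + 1)
    for _ in range(epochs):
        for inputs, target in data:
            x = list(inputs) + [1.0]
            out = predict(weights, inputs)
            # dE/dw_i = -(target - out) * out * (1 - out) * x_i
            delta = (target - out) * out * (1.0 - out)
            weights = [w + rate * delta * xi for w, xi in zip(weights, x)]
    return weights

# Hypothetical training set: the logical OR of two binary inputs.
OR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights = train_unit(OR_DATA, n_inputs=2)
```

Each weight moves a small step in the direction that reduces the output error, which is the essence of the update rule; real networks apply the same idea across many units and layers.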

Learning in Evolutionary Computation Systems

Evolutionary computation systems simulate the evolution by natural selection that is seen in biological systems. Typically one creates a whole population of AI agents defined by some genotypic representation, and measures their individual performance levels (or fitnesses) on the given task or problem. The fittest individuals are chosen from each generation to survive and breed to form the next generation. The simulated breeding process involves cross-over and mutation of genetic material. The resulting genetic changes map into each individual's fitness and so drive the selection process. In this way, good recombinations and mutations will proliferate in the population, and we will end up with generations of individuals which are increasingly good at their given tasks.

Often evolutionary improvements and lifetime learning are combined in the same system, and we end up with an approach that is superior to either on its own. We find the evolution of particularly good learners, and that learned behaviour can be assimilated into the genotype (the Baldwin Effect).
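The select, cross-over, and mutate cycle described above can be sketched as follows. This is a minimal illustration under stated assumptions: genotypes are bit lists, fitness is simply the number of 1 bits (a toy task chosen for this example), and the population size, mutation rate, and all function names are hypothetical.

```python
import random

def fitness(genome):
    # Toy fitness measure (an assumption): count the 1 bits.
    return sum(genome)

def crossover(a, b, rng):
    # Single-point cross-over of two parent genomes.
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]

def mutate(genome, rng, rate=0.02):
    # Flip each bit independently with a small probability.
    return [1 - g if rng.random() < rate else g for g in genome]

def evolve(pop_size=30, length=20, generations=40, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    history = [max(map(fitness, population))]
    for _ in range(generations):
        # The fittest half survives unchanged and breeds the other half.
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
        history.append(max(map(fitness, population)))
    return population, history

population, history = evolve()
```

Because the surviving parents are carried over unchanged, the best fitness recorded in `history` can never decrease from one generation to the next.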

Problems Facing Expert Systems

We can identify four major limitations facing conventional expert systems:

1. Brittleness: Expert systems generally only have access to highly specific domain knowledge, so they cannot fall back on more general knowledge when the need arises, e.g. to deal with missing information, or when information appears inconsistent.
2. Lack of Meta-Knowledge: Expert systems rarely have sophisticated knowledge about their own operation, and hence lack an appreciation of their own limitations.
3. Knowledge Acquisition: Despite an increasing number of automated tools, this remains a major bottleneck in applying expert system technology to new domains.
4. Validation: Measuring the performance of expert systems is difficult because it is not clear how to quantify the use of knowledge. The best we can do is compare their performance against that of human experts.

Progress on the first two points should follow once we have made progress on the third point. We shall now look at how machine learning techniques can help us here.

Types of Learning in Rule Based Systems

The principal kinds of learning appropriate for rule based systems are:

1. The invention of new conditions and/or actions for rules.
2. The invention of new conflict resolution strategies (i.e. meta-rules).
3. The discovery and correction of errors in the existing system.

For learning new rules (including meta-rules) there are two basic approaches:

1. Inductive rule learning methods create new rules about a domain that are not derivable from any previous rules. We take some training data, e.g. examples of an expert performing the given task, and work out corresponding rules that also generalize to new situations.
2. Deductive rule learning enhances the efficiency of a system's performance by deducing new rules from previously known domain rules and facts. Having the new rules should not change the outputs of the system, but should make it perform more efficiently.

Meta-Rules and Meta-Knowledge

In order to learn effectively, we need to be able to reason about the rules, and to have an understanding of the state of the knowledge base. We need to have meta-rules, i.e. rules about the rules, and meta-knowledge, i.e. knowledge about the knowledge.

An example of a meta-rule is:

IF: there are rules which do not mention the current goal in their premise,
AND there are rules which do mention the current goal in their premise,
THEN: the former rules should be used in preference to the latter.

An example of meta-knowledge is:

Knowledge/information that can be frequently used in strong rules is more important than knowledge/information that is rarely used and only appears in weak rules.

Using meta-rules and meta-knowledge involves meta-level inference.

Rule Induction Systems

The simplest rule induction system can be represented by the following flowchart:

1. Create Initial Rule Set
2. Select New Training Instance (SAMPLER)
3. Generate New Rules from Old (GENERATOR)
4. Evaluate Rules on Training Data (PERFORMER)
5. Eliminate Poorly Performing Rules (CRITIC)
6. System Good Enough? If NO, loop back and repeat; if YES, FINISHED.

Rule Refinement Strategies

There are numerous approaches one can take to improve the rules in an existing rule based system. A good rule refinement program should involve:

1. Removing redundancy. More than one rule may deal with essentially the same situation, so unnecessary rules should be removed to increase efficiency.
2. Merging rules. Sometimes a set of rules can be merged into a single, more general, rule that has the same effect. Doing this will improve efficiency.
3. Making rules more specific. If a rule is too general it can make incorrect predictions. Such rules should be made more specific to reduce errors.
4. Making rules more general. If a rule can be made more general without introducing errors, it should be, as it is likely to improve generalization.
5. Selecting the final rules. The process of specialization and generalization may have introduced more redundancy, so step 1 is applied again to remove it.
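The redundancy removal step can be illustrated with a small sketch. It assumes, purely for this example, that a rule is a pair of a condition set and an action, and that a rule is redundant if it is an exact duplicate or if a more general rule (one with a strict subset of its conditions) already produces the same action. The representation and the function name are hypothetical, and a real system would also need to take its conflict resolution strategy into account.

```python
def remove_redundant(rules):
    """Drop duplicates and rules subsumed by a more general rule
    that has the same action."""
    unique = list(dict.fromkeys(rules))  # removes exact duplicates, keeps order
    kept = []
    for conditions, action in unique:
        subsumed = any(
            other_action == action and other_conditions < conditions
            for other_conditions, other_action in unique
        )
        if not subsumed:
            kept.append((conditions, action))
    return kept

# Hypothetical rules; each condition is an (attribute, value) pair.
r1 = (frozenset({("Outlook", "overcast")}), "Go Out")
r2 = (frozenset({("Outlook", "overcast"), ("Windy", "no")}), "Go Out")
r3 = (frozenset({("Outlook", "sunny"), ("Humidity", "high")}), "Stay In")

refined = remove_redundant([r1, r2, r3, r1])
```

Here `r2` is dropped because `r1` already covers every situation it applies to with the same action, and the repeated copy of `r1` is dropped as a duplicate.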

Concept Learning and Classification

The above procedures for generating and testing and refining rules make good sense, but for large (i.e. useful) systems we need to formulate a more systematic procedure. The idea of concept learning and classification is that given a training set of positive and negative instances of some concept (which belongs to some pre-enumerated set of concepts), the task is to generate rules that classify the training set correctly, and that also recognize unseen instances of that concept, i.e. generalize well. To do this we work with a set of patterns that describe the concepts, i.e. patterns which state those properties which are common to all individual instances of each concept. The simplest patterns are nothing more than the descriptors that specify the rule conditions (i.e. the IF parts of the rules) that relate to the given concept. We will clearly need to be able to match efficiently given instances in the training set against the hypothetical/potential descriptions of the concepts.

Version Spaces and Partial Ordering

The idea of a version space is simply a way of representing the space of all concept descriptions (rule conditions) consistent with the training instances seen so far. Efficient representation and update of version spaces can be achieved by defining a partial order over the patterns generated by any concept definition language. We can do this by defining the relation "more specific than or equal to" as follows:

Pattern P1 is more specific than or equal to pattern P2 (written as P1 ≤ P2) if and only if P1 matches a subset of all the instances that P2 matches.

For example, P1 = "car" is a fairly general concept pattern, P2 = "American car" is more specific, and P3 = "yellow American cars with sun-roofs and alloy wheels" is an even more specific pattern. We can write P3 ≤ P2 ≤ P1. Note that we can order P4 = "blue car" with respect to P1 but not P2 and P3, so the ordering is only partial.
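The car example above can be made concrete with a small sketch. It assumes, for illustration only, that a pattern is a set of required properties and that it matches every instance having all of them; under that assumption a pattern with more required properties matches fewer instances, so the ordering reduces to set inclusion. The function name and the property strings are hypothetical.

```python
def more_specific_or_equal(p, q):
    """True if pattern p <= pattern q, i.e. p requires everything q requires."""
    return q <= p  # frozenset subset test

P1 = frozenset({"car"})
P2 = frozenset({"car", "American"})
P3 = frozenset({"car", "American", "yellow", "sun-roof", "alloy wheels"})
P4 = frozenset({"car", "blue"})

# P3 <= P2 <= P1 forms a chain in the ordering.
chain_holds = more_specific_or_equal(P3, P2) and more_specific_or_equal(P2, P1)

# P4 is comparable with P1 but not with P2, so the order is only partial.
p4_vs_p2_comparable = (more_specific_or_equal(P4, P2)
                       or more_specific_or_equal(P2, P4))
```

With these definitions `chain_holds` is true and `p4_vs_p2_comparable` is false, mirroring the partial ordering described in the text.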

Version Spaces Blocks World Example

Consider the following simple example from Winston's blocks world:

P1: STANDING BRICK  SUPPORTS  LYING WEDGE or BRICK

P2: not LYING, any shape  TOUCHES  any orientation, WEDGE or BRICK

Clearly, pattern P1 is more specific than pattern P2, because the constraints imposed by P1 are only satisfied if the weaker constraints imposed by P2 are satisfied. So P1 ≤ P2. Note that, for a program to perform this partial ordering, it would need to understand the relevant concepts and relationships, e.g. that wedges and bricks are different shapes, that supporting implies touching, and so on.

Version Spaces Boundary Sets

Once a system can grasp the relationship of specificity, the version space can be represented in terms of its maximally specific and maximally general patterns. The system can consider the version space as containing:

1. The set S = {Si} of maximally specific patterns.
2. The set G = {Gi} of maximally general patterns.
3. All concept descriptions which occur between these two sets in the partial ordering.

This is called the boundary sets representation for version spaces, which is both:

1. Compact: it is not explicitly storing every concept description in that space.
2. Easy to update: a new space simply corresponds to moving the boundaries.

With this convenient representation we can now apply machine learning techniques to it.

Version Spaces Learning the Boundaries

A machine learning technique known as the candidate elimination algorithm can manipulate the boundaries in an extremely efficient manner. This is best illustrated by thinking of a set of positive and negative training examples in some input space, and looking at where the decision boundaries can go:

[Figure: positive (+) and negative training examples in an input space, with the boundaries S and G drawn around them. S = Most specific boundaries, G = Most general boundaries.]

It is easy to see how the boundaries can be refined as increasing numbers of data points become available, and how to extend the approach to more complex input spaces.
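To show how the boundaries move as examples arrive, here is a minimal sketch of candidate elimination for one simple concept language. It assumes that every hypothesis is a conjunction of attribute values with "?" as a wildcard, so that S can be held as a single pattern, and it does not detect inconsistent training data. The attribute domains, the three training examples, and all names are hypothetical and chosen only for this illustration.

```python
def matches(h, x):
    return all(hv == "?" or hv == xv for hv, xv in zip(h, x))

def more_general_or_equal(h1, h2):
    return all(a == "?" or a == b for a, b in zip(h1, h2))

def candidate_elimination(examples, domains):
    S = None                           # most specific boundary
    G = [tuple("?" for _ in domains)]  # most general boundary
    for x, positive in examples:
        if positive:
            # Drop general patterns that miss x, then generalise S to cover x.
            G = [g for g in G if matches(g, x)]
            if S is None:
                S = tuple(x)
            else:
                S = tuple(s if s == xv else "?" for s, xv in zip(S, x))
        else:
            new_G = []
            for g in G:
                if not matches(g, x):
                    new_G.append(g)
                    continue
                # Minimally specialise g so that it excludes x.
                for i, values in enumerate(domains):
                    if g[i] != "?":
                        continue
                    for v in values:
                        h = g[:i] + (v,) + g[i + 1:]
                        if v != x[i] and (S is None or more_general_or_equal(h, S)):
                            new_G.append(h)
            new_G = list(dict.fromkeys(new_G))
            # Keep only the maximally general patterns.
            G = [g for g in new_G
                 if not any(o != g and more_general_or_equal(o, g) for o in new_G)]
    return S, G

DOMAINS = [("yellow", "blue"), ("American", "Japanese"), ("sun-roof", "plain")]
EXAMPLES = [
    (("yellow", "American", "sun-roof"), True),
    (("blue", "Japanese", "sun-roof"), False),
    (("blue", "American", "plain"), True),
]
S, G = candidate_elimination(EXAMPLES, DOMAINS)
```

On this toy data the two boundaries meet at the single pattern `("?", "American", "?")`, i.e. the version space has converged on the concept "American car".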

Decision Trees
Decision trees are a particularly convenient way of structuring information for classification systems. All the data to be classified enters at the root of the tree, while the leaf nodes represent the classifications. For example:

Outlook
    sunny: Humidity
        high: Stay In
        normal: Go Out
    overcast: Go Out
    rain: Windy
        yes: Stay In
        no: Go Out

Intermediate nodes represent choice points, or tests upon attributes of the data, which serve to further sub-divide the data at that node.

Decision Trees versus Rules

Although decision trees look very different to rule based systems, it is actually easy to convert a decision tree into a set of rules. From the above example we have:

R1: IF: Outlook = overcast THEN: Go Out
R2: IF: Outlook = sunny AND Humidity = normal THEN: Go Out
R3: IF: Outlook = rain AND Windy = no THEN: Go Out
R4: IF: Outlook = sunny AND Humidity = high THEN: Stay In
R5: IF: Outlook = rain AND Windy = yes THEN: Stay In

The advantage of decision trees over rules is that comparatively simple algorithms can derive decision trees (from training data) that are good at generalizing (i.e. classifying unseen instances). Well known algorithms include CLS, ACLS, IND, ID3, and C4.5.
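The conversion from tree to rules can be sketched in a few lines. The nested tuple and dictionary encoding of the example tree, and the function names, are assumptions made for this illustration; each path from the root to a leaf becomes one rule.

```python
# The example tree: an internal node is (attribute, {value: subtree}),
# and a leaf is simply the classification string.
TREE = ("Outlook", {
    "sunny": ("Humidity", {"high": "Stay In", "normal": "Go Out"}),
    "overcast": "Go Out",
    "rain": ("Windy", {"yes": "Stay In", "no": "Go Out"}),
})

def classify(tree, instance):
    # Follow the branch chosen by each attribute test until a leaf is reached.
    while not isinstance(tree, str):
        attribute, branches = tree
        tree = branches[instance[attribute]]
    return tree

def tree_to_rules(tree, conditions=()):
    # Collect the tests along each root-to-leaf path as the rule conditions.
    if isinstance(tree, str):
        return [(conditions, tree)]
    attribute, branches = tree
    rules = []
    for value, subtree in branches.items():
        rules.extend(tree_to_rules(subtree, conditions + ((attribute, value),)))
    return rules

rules = tree_to_rules(TREE)
for conditions, action in rules:
    tests = " AND ".join(f"{a} = {v}" for a, v in conditions)
    print(f"IF: {tests} THEN: {action}")
```

The tree has five leaves, so the sketch yields five rules, matching R1 to R5 up to ordering.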

Decision Tree Algorithms - ID3, C4.5, Etc.

All decision tree algorithms set out to solve basically the same problem: Given a set of training data $D$, and a set of disjoint target classes $\{C_i\}$, the algorithm must use a series of tests $T$ on data attributes with outcomes $\{O_i\}$ to partition $D$ into subsets $\{D_i\}$ such that

$D_i = \{ d \in D : T(d) = O_i \}$

If we repeat this process for an appropriate sequence of tests $T$, we will end up with each resulting data subset $D_i$ corresponding to a single class $C_i$, and we can draw the resultant decision tree, and if required, convert it to a set of rules.

The hard part is to determine the appropriate sequences of tests, and this is where the various decision tree algorithms differ. ID3 uses ideas from information theory and at each stage selects the test that gains the most information (or equivalently, results in the biggest reduction in entropy). C4.5 uses different heuristics which usually work better.

Note that unlike the version space approach to concept learning, these algorithms are not incremental: if we get new data we need to start again.
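The test selection step used by ID3 can be sketched as follows: partition the data by each candidate attribute, measure the entropy of the class labels before and after, and pick the attribute with the largest reduction. The eight training instances below are hypothetical, invented to be consistent with the weather example, and the function names are assumptions. This shows only the choice of a single test, not the full recursive algorithm.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def partition(data, attribute):
    # D_i = { d in D : T(d) = O_i }, where the test T reads one attribute.
    subsets = defaultdict(list)
    for instance, label in data:
        subsets[instance[attribute]].append((instance, label))
    return subsets

def information_gain(data, attribute):
    before = entropy([label for _, label in data])
    after = sum(len(subset) / len(data) * entropy([l for _, l in subset])
                for subset in partition(data, attribute).values())
    return before - after

def row(outlook, humidity, windy, label):
    return ({"Outlook": outlook, "Humidity": humidity, "Windy": windy}, label)

# Hypothetical training data, not taken from the lecture.
DATA = [
    row("sunny", "high", "no", "Stay In"),
    row("sunny", "high", "yes", "Stay In"),
    row("sunny", "normal", "no", "Go Out"),
    row("overcast", "high", "no", "Go Out"),
    row("overcast", "normal", "yes", "Go Out"),
    row("rain", "high", "no", "Go Out"),
    row("rain", "normal", "yes", "Stay In"),
    row("rain", "high", "yes", "Stay In"),
]

gains = {a: information_gain(DATA, a) for a in ("Outlook", "Humidity", "Windy")}
best = max(gains, key=gains.get)
```

On this data `best` is "Outlook", so Outlook would be chosen as the root test; the same selection would then be repeated on each subset.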

Overview and Reading

1. We began by defining some general ideas about machine learning systems.
2. We first looked at learning in neural networks and evolutionary systems.
3. We then considered the need for learning in expert systems, and how we might set up simple rule induction systems and rule refinement strategies.
4. We then considered the version space approach to concept learning.
5. We ended by looking at decision trees, how they can be turned into rule sets, and how they can be generated by algorithms such as ID3 and C4.5.

Reading

1. Jackson: Chapter 20
2. Russell & Norvig: Chapters 18, 19, 20 & 21
3. Callan: Chapters 11 & 12
4. Rich & Knight: Chapter 17
5. Nilsson: Section 17.5