AITA: Machine Learning: © John A. Bullinaria, 2003
1. What is Machine Learning?
2. The Need for Learning
3. Learning in Neural and Evolutionary Systems
4. Problems Facing Expert Systems
5. Learning in Rule Based Systems
6. Rule Induction and Rule Refinement
7. Concept Learning and Version Spaces
8. Learning Decision Trees
Types of Learning
The strategies for learning can be classified according to the amount of inference the system has to perform on its training data. In increasing order we have:

1. Rote learning: the new knowledge is implanted directly, with no inference at all, e.g. simple memorisation of past events, or a knowledge engineer's direct programming of rules elicited from a human expert into an expert system.

2. Supervised learning: the system is supplied with a set of training examples consisting of inputs and corresponding outputs, and is required to discover the relation or mapping between them, e.g. as a series of rules, or a neural network.

3. Unsupervised learning: the system is supplied with a set of training examples consisting only of inputs, and is required to discover for itself what the appropriate outputs should be, e.g. a Kohonen Network or Self Organizing Map.
Early expert systems relied on rote learning, but for modern AI systems we are generally interested in the supervised learning of various levels of rules.
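To make the distinction concrete, here is a small illustrative sketch in Python (the data values are invented): the supervised learner is given input-output pairs and fits a mapping, while the unsupervised learner is given inputs only and must invent its own grouping.

    import numpy as np

    # Supervised: both inputs and target outputs are supplied by a "teacher".
    # Here we recover the mapping y ~ w*x + b from (input, output) pairs.
    X = np.array([0.0, 1.0, 2.0, 3.0])
    y = np.array([1.0, 2.9, 5.1, 7.0])
    w, b = np.polyfit(X, y, 1)          # discovered relation between inputs and outputs

    # Unsupervised: only inputs are supplied; the system invents its own labels.
    # A tiny 2-means clustering of one-dimensional data.
    data = np.array([0.1, 0.2, 0.15, 5.0, 5.2, 4.9])
    centres = np.array([data.min(), data.max()])
    for _ in range(10):
        labels = np.abs(data[:, None] - centres[None, :]).argmin(axis=1)
        centres = np.array([data[labels == k].mean() for k in (0, 1)])

    print(w, b)        # roughly 2.0 and 1.0
    print(centres)     # roughly [0.15, 5.03]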
[Figure: a feed-forward neural network, with inputs ini and connection weights wij]
The standard procedure is to define an output error measure (such as the sum squared difference between the actual network outputs and the target outputs), and use gradient descent weight updates to reduce that error. The details are complex, but such an approach can learn from noisy training data and generalise well to new inputs.
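As a minimal sketch of this procedure (assuming, purely for illustration, a single-layer linear network and invented training data), the gradient descent weight update looks like this:

    import numpy as np

    # Invented data: noisy targets generated from a known linear mapping.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3))
    targets = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=100)

    weights = np.zeros(3)               # the network's adjustable weights
    eta = 0.01                          # learning rate (step size)
    for epoch in range(500):
        outputs = X @ weights           # actual network outputs
        error = targets - outputs
        # Sum squared error E = sum(error^2); its gradient w.r.t. the weights
        # is -2 X^T error, so we step the weights in the opposite direction.
        weights += eta * 2 * X.T @ error / len(X)

    print(weights)                      # close to [1.0, -2.0, 0.5]

Despite the noise added to the targets, the learned weights settle close to the underlying mapping, which is the sense in which such training generalises beyond the particular examples seen.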
For learning new rules (including meta-rules) there are two basic approaches:

1. Inductive rule learning creates new rules about a domain that are not derivable from any previous rules. We take some training data, e.g. examples of an expert performing the given task, and work out corresponding rules that also generalize to new situations (a toy sketch follows below).

2. Deductive rule learning enhances the efficiency of a system's performance by deducing new rules from previously known domain rules and facts. Having the new rules should not change the outputs of the system, but should make it perform more efficiently.
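The sketch below illustrates the inductive case with a toy Find-S-style learner (the attribute names and examples are invented): each positive training example forces the rule to generalise just enough to cover it.

    # Invented attribute vectors (sky, temperature, wind) and the expert's decision.
    examples = [
        (("sunny", "warm", "weak"),   True),
        (("sunny", "warm", "strong"), True),
        (("rainy", "cold", "strong"), False),
    ]

    rule = None                          # start with no rule at all
    for attrs, positive in examples:
        if not positive:
            continue                     # this naive learner ignores negatives
        if rule is None:
            rule = list(attrs)           # first positive: most specific rule
        else:                            # generalise any mismatching attribute to '?'
            rule = [r if r == a else "?" for r, a in zip(rule, attrs)]

    print(rule)   # ['sunny', 'warm', '?'] -- also covers unseen wind values

The induced rule is not derivable from any prior rules: it comes purely from the training examples, and it covers new situations the expert was never observed in.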
[Figure: the rule induction cycle, in which a GENERATOR proposes candidate rules, a PERFORMER applies them to the task, and a CRITIC evaluates the results, repeating until FINISHED]
[Figure: two example patterns, P1 (a specific arrangement of a wedge and a brick) and P2 (a weaker pattern requiring only that one object TOUCHES another)]
Clearly, pattern P1 is more specific than pattern P2, because the constraints imposed by P1 can only be satisfied if the weaker constraints imposed by P2 are also satisfied. So P1 ≤ P2 in the specific-to-general ordering. Note that, for a program to perform this partial ordering, it would need to understand the relevant concepts and relationships, e.g. that wedges and bricks are different shapes, that supporting implies touching, and so on.
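For the simple attribute-vector patterns often used in version spaces, the ordering itself can be tested mechanically; a sketch (assuming '?' as an "any value" wildcard, and ignoring the deeper relational knowledge just mentioned, such as supporting implying touching):

    # Patterns are attribute vectors; '?' is an assumed "any value" wildcard.
    def more_specific_or_equal(p1, p2):
        # P1 <= P2 iff every constraint imposed by P2 is also imposed by P1.
        return all(b == "?" or a == b for a, b in zip(p1, p2))

    P1 = ("wedge", "brick")   # constrains both objects
    P2 = ("?", "brick")       # weaker: first object unconstrained
    print(more_specific_or_equal(P1, P2))   # True: P1 <= P2
    print(more_specific_or_equal(P2, P1))   # False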
This is called the boundary sets representation for version spaces, which is both:

1. Compact: it does not explicitly store every concept description in the space.
2. Easy to update: a new space simply corresponds to moving the boundaries.
With this convenient representation we can now apply machine learning techniques to it.
It is easy to see how the boundaries can be refined as increasing numbers of data points become available, and how to extend the approach to more complex input spaces.
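A sketch of this refinement for an invented one-dimensional example, where concepts are intervals on the real line: the specific boundary S grows to cover positive examples, while the general boundary G shrinks to exclude negative ones.

    # Concepts are intervals [lo, hi] on the real line (an invented example).
    # S: tightest interval around the positive examples seen so far.
    # G: widest interval (treated as open) still excluding all negatives.
    # Inconsistent data (a negative inside S) is not handled in this sketch.
    S = None
    G = [float("-inf"), float("inf")]

    def update(point, positive):
        global S
        if positive:
            S = [point, point] if S is None else [min(S[0], point), max(S[1], point)]
        elif S is None or point < S[0]:
            G[0] = max(G[0], point)      # negative below S: raise the lower bound
        else:
            G[1] = min(G[1], point)      # negative above S: lower the upper bound

    for point, label in [(3.0, True), (5.0, True), (1.0, False), (8.0, False)]:
        update(point, label)
        print("S =", S, " G =", G)       # boundaries converge as data arrives

Every interval lying between S and G remains consistent with the data seen so far; each new example merely moves one of the two boundaries.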
Decision Trees
Decision trees are a particularly convenient way of structuring information for classification systems. All the data to be classified enters at the root of the tree, while the leaf nodes represent the classifications. For example:

    Outlook?
      sunny    -> Humidity?
                    high   -> Stay In
                    normal -> Go Out
      overcast -> Go Out
Intermediate nodes represent choice points, or tests upon attributes of the data, which serve to further sub-divide the data at that node.
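One convenient (though by no means the only) encoding of such a tree in code is as nested dictionaries; a sketch:

    # The example tree encoded as nested dicts: internal nodes map an attribute
    # to its branches, string leaves are classifications.
    tree = {"Outlook": {
        "sunny": {"Humidity": {"high": "Stay In", "normal": "Go Out"}},
        "overcast": "Go Out",
    }}

    def classify(node, instance):
        if isinstance(node, str):                  # reached a leaf
            return node
        attribute, branches = next(iter(node.items()))
        return classify(branches[instance[attribute]], instance)

    print(classify(tree, {"Outlook": "sunny", "Humidity": "high"}))   # Stay In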
The advantage of decision trees over rules is that comparatively simple algorithms can derive decision trees from training data that generalize well (i.e. classify unseen instances correctly). Well-known algorithms include CLS, ACLS, IND, ID3, and C4.5.
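A minimal ID3-style sketch (an illustration of the core idea, not any of the original implementations): at each node, choose the attribute whose test yields the highest information gain, and recurse on the resulting subsets.

    import math
    from collections import Counter

    def entropy(labels):
        counts = Counter(labels)
        total = len(labels)
        return -sum(c / total * math.log2(c / total) for c in counts.values())

    def best_attribute(rows, labels, attributes):
        def gain(a):
            remainder = 0.0
            for value in set(row[a] for row in rows):
                subset = [l for row, l in zip(rows, labels) if row[a] == value]
                remainder += len(subset) / len(labels) * entropy(subset)
            return entropy(labels) - remainder
        return max(attributes, key=gain)

    def id3(rows, labels, attributes):
        if len(set(labels)) == 1:                  # pure node: make a leaf
            return labels[0]
        if not attributes:                         # no tests left: majority vote
            return Counter(labels).most_common(1)[0][0]
        a = best_attribute(rows, labels, attributes)
        node = {a: {}}
        for value in set(row[a] for row in rows):
            idx = [i for i, row in enumerate(rows) if row[a] == value]
            node[a][value] = id3([rows[i] for i in idx],
                                 [labels[i] for i in idx],
                                 [x for x in attributes if x != a])
        return node

    # Invented training data matching the example tree above.
    rows = [{"Outlook": "sunny", "Humidity": "high"},
            {"Outlook": "sunny", "Humidity": "normal"},
            {"Outlook": "overcast", "Humidity": "high"}]
    labels = ["Stay In", "Go Out", "Go Out"]
    print(id3(rows, labels, ["Outlook", "Humidity"]))   # reproduces the tree

The greedy information gain criterion is what keeps the algorithm simple: it never backtracks, yet on typical data it produces compact trees that classify unseen instances well.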
Reading

1. Jackson: Chapter 20
2. Russell & Norvig: Chapters 18, 19, 20 & 21
3. Callan: Chapters 11 & 12
4. Rich & Knight: Chapter 17
5. Nilsson: Section 17.5