Class 1 Intro AI
“Computers that could simulate human intelligence were once a futuristic dream.
Now they are all around us – but not in the way their pioneers expected”
Nello Cristianini, The Road to artificial intelligence: A case of data over theory (2016)
What type of AI is thriving?
• “…act / behave as if they were human” (Turing test)
• “…act to achieve the best expected outcome” (correct inference / rational agent)
Allen Newell and Herbert Simon proposed:
• Logic Theorist (1955)
• General Problem Solver (1957)
• Physical Symbol System and Heuristic Search hypotheses (1976)
Physical Symbol System
Physical Symbol System hypothesis: “A physical symbol system has the necessary and
sufficient means for general intelligent action”
(Newell A. and Simon H., Computer Science as empirical inquiry: Symbols and search, 1976)
Essential ideas:
• Physical → obeys the laws of physics / can be engineered
• Symbol system → a set of entities that designate objects and carry out processes
Search
Heuristic Search Hypothesis: “The solutions to problems are represented as symbol
structures. A physical symbol system exercises its intelligence in problem solving by
search – that is by generating and progressively modifying symbol structures until it
produces a solution structure”
Example:
AX + B = CX + D
AX + B – CX = CX + D – CX
…
X = (D – B)/(A – C)
Newell A. and Simon H., Computer Science as empirical inquiry: Symbols and search (1976)
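To see this symbol-manipulation view in code, here is a minimal sketch (mine, not from the slides) that reproduces the derivation above; the sympy library is simply an assumed tool for rewriting the symbol structure.

# Solving A*X + B = C*X + D by manipulating a symbolic structure.
# sympy is an assumed helper library; the slides do not prescribe any tool.
from sympy import symbols, Eq, solve

A, B, C, D, X = symbols('A B C D X')

equation = Eq(A * X + B, C * X + D)    # the initial symbol structure
solution = solve(equation, X)          # sympy rewrites the structure for us

print(solution)   # one solution, equal to (D - B)/(A - C) as derived above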
Graph and tree search
• Reducing the problem to a graph or tree search
• Elements that must be specified (illustrated in the sketch after this list):
• States (the space of symbol structures)
• Actions (the processes for modifying a symbol structure into another)
• The initial and the goal state (the test)
• The task is to select a series of actions that take us from the initial state to the goal
state
• Note that the search can be uninformed (blind) or informed (heuristics /
information to assess the progress toward the goal)
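A minimal sketch (not from the slides) of uninformed search over such a graph: states, actions, the initial state, and the goal test are the only ingredients; the tiny graph below is a made-up placeholder.

# Breadth-first (blind) search: generate and extend states until the goal test succeeds.
from collections import deque

def breadth_first_search(initial, goal_test, successors):
    frontier = deque([(initial, [])])          # (state, actions taken so far)
    visited = {initial}
    while frontier:
        state, plan = frontier.popleft()
        if goal_test(state):                   # goal state reached
            return plan                        # the series of actions from initial to goal
        for action, next_state in successors(state):
            if next_state not in visited:
                visited.add(next_state)
                frontier.append((next_state, plan + [action]))
    return None                                # no solution found

# Hypothetical instance: states are place names, actions are moves along edges.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
successors = lambda s: [("go to " + t, t) for t in graph[s]]
print(breadth_first_search("A", lambda s: s == "D", successors))   # ['go to B', 'go to D']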
Example: states (figure: a map of Venice, omitted)
Newell A. and Simon H., Computer Science as empirical inquiry: Symbols and search (1976)
Knowledge and search
• John McCarthy defined the high-level programming language LISP
• In 1958 he proposed the idea of the “Advice Taker”
Central idea: to embed knowledge into the system and search for solutions
Connectionism
• McCulloch and Pitts proposed a mathematical model for a neuron (1943)
Warren S. McCulloch, Walter H. Pitts, Frank Rosenblatt
McCulloch and Pitts model
The neuron is modelled as a binary threshold unit:
the unit j fires if the weighted sum of its inputs, Σᵢ wᵢⱼ aᵢ, reaches or exceeds a threshold value μⱼ
Pelillo M. http://www.dsi.unive.it/~pelillo/Didattica/Old%20Stuff/RetiNeurali/
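A minimal sketch of the binary threshold unit just described (the weights and threshold below are arbitrary illustration values, not from the slides):

# McCulloch-Pitts unit: fires (outputs 1) when the weighted input sum reaches the threshold.
def mcculloch_pitts_unit(inputs, weights, threshold):
    weighted_sum = sum(w * a for w, a in zip(weights, inputs))
    return 1 if weighted_sum >= threshold else 0

print(mcculloch_pitts_unit([1, 1], [0.5, 0.5], threshold=1.0))   # 1: sum 1.0 reaches the threshold
print(mcculloch_pitts_unit([1, 0], [0.5, 0.5], threshold=1.0))   # 0: sum 0.5 stays below it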
The problem of classification
Given:
1) Some features (f1, f2, f3, …, fn)
2) Some classes (c1, c2, c3, …, cm)
Task: assign an object, described by its feature values, to one of the classes
Pelillo M. http://www.dsi.unive.it/~pelillo/Didattica/Old%20Stuff/RetiNeurali/
Simple example
Given certain classes or categories:
c1 = “watermelon”
c2 = “apple”
c3 = “orange”
Pelillo M. http://www.dsi.unive.it/~pelillo/Didattica/Old%20Stuff/RetiNeurali/
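As a toy sketch (mine; the features used below are invented, since the slide's feature figure is not reproduced here), a classifier is just a mapping from feature values to one of these classes:

# Toy rule-based classifier: two made-up features (weight in grams, colour) -> a class label.
def classify_fruit(weight_grams, colour):
    if weight_grams > 2000:
        return "watermelon"   # c1
    if colour == "orange":
        return "orange"       # c3
    return "apple"            # c2

print(classify_fruit(5000, "green"))   # watermelon
print(classify_fruit(150, "red"))      # apple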
Geometrical interpretation
Classes = {1,0}
Features = {x1, x2} ∈ [0,+∞[
Pelillo M. http://www.dsi.unive.it/~pelillo/Didattica/Old%20Stuff/RetiNeurali/
Neural networks for classification
A neural network can be used for classification tasks
• Input = feature values
• Output = class labels
Example: 3 features, 2 labels
Pelillo M. http://www.dsi.unive.it/~pelillo/Didattica/Old%20Stuff/RetiNeurali/
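A minimal sketch of the “3 features, 2 labels” case (the weights are arbitrary illustration values; the slides give no numbers): each output unit computes a weighted sum of the three feature values, and the predicted label is the output unit with the larger activation.

# One feedforward layer: 3 inputs -> 2 output units; the winning unit gives the class label.
def classify(features, weights):
    scores = [sum(w * x for w, x in zip(row, features)) for row in weights]
    return scores.index(max(scores))          # index of the most activated output unit

weights = [[0.9, -0.2, 0.1],    # weights into the output unit for label 0
           [0.1, 0.8, -0.3]]    # weights into the output unit for label 1
print(classify([1.0, 0.2, 0.5], weights))     # 0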
Simple perceptron
A network consisting of one layer of McCulloch and Pitts neurons connected in a
feedforward way (no lateral or feedback connections)
• Perceptrons can represent all of the primitive Boolean functions (e.g. AND, OR)
Mitchell T., Machine Learning, 1997
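For instance (a sketch with hand-picked rather than learned weights), the same kind of threshold unit as in the McCulloch-Pitts sketch above represents AND or OR depending only on its weights and threshold:

# A single perceptron unit: the weight/threshold choice selects the Boolean function it computes.
def unit(inputs, weights, threshold):
    return 1 if sum(w * a for w, a in zip(weights, inputs)) >= threshold else 0

AND = lambda x, y: unit([x, y], [1, 1], threshold=2)
OR  = lambda x, y: unit([x, y], [1, 1], threshold=1)

for x in (0, 1):
    for y in (0, 1):
        print(x, y, AND(x, y), OR(x, y))   # reproduces the AND and OR truth tables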
Limitations of Perceptrons
Pelillo M. http://www.dsi.unive.it/~pelillo/Didattica/Old%20Stuff/RetiNeurali/
Linear separability
X  Y  X AND Y        X  Y  X XOR Y
0  0     0           0  0     0
0  1     0           0  1     1
1  0     0           1  0     1
1  1     1           1  1     0

AND is linearly separable; XOR is not.
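No single threshold unit can reproduce the XOR column, because no straight line separates the inputs mapped to 1 from those mapped to 0. A hand-wired two-layer network (a sketch with hand-picked weights, anticipating the multi-layer networks mentioned later) does compute it:

# XOR needs a hidden layer: one hidden unit computes OR, the other NAND, the output ANDs them.
def unit(inputs, weights, threshold):
    return 1 if sum(w * a for w, a in zip(weights, inputs)) >= threshold else 0

def xor_net(x, y):
    h_or   = unit([x, y], [1, 1], threshold=1)        # OR of the inputs
    h_nand = unit([x, y], [-1, -1], threshold=-1)     # NAND of the inputs
    return unit([h_or, h_nand], [1, 1], threshold=2)  # AND of the two hidden units

for x in (0, 1):
    for y in (0, 1):
        print(x, y, xor_net(x, y))   # reproduces the XOR truth table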
Gradient of E: ∇E(w) = [∂E/∂w₀, ∂E/∂w₁, …, ∂E/∂wₙ], the vector of partial derivatives of E with respect to each weight wᵢ
Pelillo M. http://www.dsi.unive.it/~pelillo/Didattica/Old%20Stuff/RetiNeurali/
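As a sketch of how this gradient is used (assuming the usual squared error E(w) = ½ Σ_d (t_d - o_d)² over training examples, as in Mitchell 1997; the data and learning rate below are made up), gradient descent repeatedly moves the weights a small step against ∇E:

# Batch gradient descent for a linear unit o = w . x, minimizing E(w) = 1/2 * sum_d (t_d - o_d)^2.
data = [([1.0, 0.0], 1.0), ([0.0, 1.0], 0.0), ([1.0, 1.0], 1.0)]   # made-up (x, target) pairs
w = [0.0, 0.0]
eta = 0.1                                    # learning rate

for _ in range(100):
    grad = [0.0, 0.0]                        # dE/dw_i = sum_d (o_d - t_d) * x_d[i]
    for x, t in data:
        o = sum(wi * xi for wi, xi in zip(w, x))
        for i in range(len(w)):
            grad[i] += (o - t) * x[i]
    w = [wi - eta * g for wi, g in zip(w, grad)]   # step in the direction of -gradient

print(w)   # after training, w[0] is close to 1 and w[1] close to 0 on this toy data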
Learning machines
• Great achievements and commercialization of AI
• Terry Sejnowski presented NetTalk, an artificial neural network that learns to pronounce English
text the way a baby does (1985)
• IBM's Deep Blue beats the world chess champion, Garry Kasparov (1997)
• Amazon replaces human product recommendation editors with an automated system (2002)
• Google launches Translate (2007)
• IBM's supercomputer Watson beats two human champions at the quiz game Jeopardy! (2011)
LeCun Y., Bengio Y. and Hinton G., Deep Learning, Nature 2015
Deep learning
LeCun Y., Bengio Y. and Hinton G., Deep Learning, Nature 2015
Big data revolution
• In 2009 Google researchers published an influential paper celebrating “the
unreasonable effectiveness of data”:
Halevy A., Norvig P., Pereira F., The Unreasonable Effectiveness of Data, 2009
Insights
Nello Cristianini, The Road to artificial intelligence: A case of data over theory (2016)
Concluding remarks
The dominant AI paradigm places great emphasis on
• Statistical inference (prediction) / machine learning (classification)
  • Recommend products or friends
  • Answer human queries
  • Personalize news
  • Predict risk scores (e.g. loans, credit cards, …)
• Optimal control
  • Maximize an objective function over time
• Data collection and curation
• Integration with human life
  • Data collection and curation as a by-product of online activities and micro-workers
  • AI to support decision-making
References
• Russell S. and Norvig P., Artificial Intelligence: A Modern Approach, 3rd Edition, Pearson, 2010
• Cristianini N., “The Road to artificial intelligence: A case of data over theory”, New Scientist, 2016
• Mitchell T., Machine Learning, McGraw Hill, 1997
• Newell A. and Simon H., “Computer science as empirical inquiry: symbols and search”, Commun. ACM 19, 3 (March 1976), 113-126
• LeCun Y., Bengio Y. and Hinton G., “Deep Learning”, Nature, 521, 436–444, 2015
• Halevy A., Norvig P. and Pereira F., “The Unreasonable Effectiveness of Data”, IEEE Intelligent Systems, 8-12, 2009
• Pelillo M. and Scantamburlo T., “How Mature Is the Field of Machine Learning?”, AI*IA 2013, Lecture Notes in Computer Science, vol. 8249, pp. 121-132, 2013
• Pelillo M., classes on neural networks, http://www.dsi.unive.it/~pelillo/Didattica/Old%20Stuff/RetiNeurali/