Chapter 1
CONTENTS
1.1 A brief history on Artificial Intelligence
1.2 Machine Learning - Learn from data
1.2.1 Supervised Learning
1.2.2 Unsupervised Learning
1.3 Machine Learning Algorithms
1.3.1 Parameters vs. Hyperparameters
1.3.2 Classification vs. Regression
1.3.3 Model-Based vs. Instance-Based Learning
1.3.4 Shallow vs. Deep Learning
1.4 How to use this book?
“We know the past but cannot control it. We control the future but cannot know it.” by Claude Shannon [Shannon, C. E. (1959). Coding theorems for a discrete source with a fidelity criterion. IRE Nat. Conv. Rec., 4:142–163.]
1.1 A brief history on Artificial Intelligence
The main challenge faced by the study of AI is to teach a computer how to solve tasks that appear trivial to humans but cannot be tackled by simply following a pre-assigned algorithm or a routine sequence of logical instructions. For instance, distinguishing a cat from a dog is obvious to a human, yet writing an algorithm that performs this task, taking all relevant aspects into account at once, would be very complicated. The most recent proposal is to handle problems of this kind with machine learning; indeed, with the availability of hand-held electronic devices such as smartphones and smart watches, collecting huge amounts of data on human behavior is far easier nowadays, and such data can be used to train machines to mimic how we solve different problems. Primitive models in statistical learning are mostly composed of only a few layers of complexity, and they therefore lack the ability to pick up the more subtle latent information embedded deeply in the ocean of data. To overcome this bottleneck, aided by the latest advances in computational power and the availability of labeled data, scholars have turned to strengthening the network approach, which leads us to the currently popular topic - Deep Learning.
Figure 1.1.1: Relations among AI, Machine Learning and Deep Learning
In 1951, Marvin Minsky built the Stochastic Neural Analog Reinforcement Calculator (SNARC). The machine, essentially a neural network consisting of 40 neurons, was the first to simulate the transmission of neural signals. In recognition of his contributions, Minsky received the Turing Award, the most prestigious prize in computer science, in 1969.
In 1955, Allen Newell, Herbert Simon, and Cliff Shaw [6] wrote a computer program called the Logic Theorist to mimic the problem-solving skills of humans. This program successfully proved 38 out of 52 theorems from Principia Mathematica by Whitehead and Russell (1927) [12].
As for the formal origin of AI, the first workshop on Artificial Intelligence, held at Dartmouth in the summer of 1956, is commonly regarded as the birth of the field; it was attended by representative scholars in information science and intelligence such as John McCarthy, Marvin Minsky, and Claude Shannon. The workshop covered topics including neural networks, natural language processing, abstraction, and creativity. After this series of talks, scientists and engineers have constantly dreamed of a hypothetical machine that can exhibit behavior at least as skillful and flexible as that of humans, can reason, and can possess the human soul and mind; researchers often refer to this collective vision and research program as General (Strong) Artificial Intelligence.
Fascinated by its seemingly unlimited potential, researchers helped AI flourish. During this time, some contemporaries optimistically foresaw that a machine completely driven by AI would come into being within 20 years. In 1963, MIT initiated the Project on Mathematics and Computation (Project MAC), with Minsky and McCarthy joining later, in which they promoted a series of research topics on image and speech recognition. From 1964 to 1966, Joseph Weizenbaum built the world's first natural language processing computer program; meanwhile, on the other side of the globe, Waseda University in Japan announced the invention of the first bipedal walking robot.
However, the appetite of scientists had yet to be satisfied. Criticism of AI began to rise in the 1970s; indeed, the rapidly growing demand for computational power could not be fulfilled at that time. In addition, the variety and complexity of demanding problems in image and natural language processing created severe hurdles given the technological conditions of the day. Upon reaching this bottleneck, public awareness and grant funding started to decline rapidly in the mid-1970s, and AI development then fell into decay.
With the commercial success of expert systems in the 1980s, the purpose of developing AI started to deviate from its original goal of obtaining general intelligence; instead, interest shifted to developing tailor-made systems that solve practical problems in specific target areas. In 1982, John Hopfield proposed a new network model, later called the Hopfield network, a kind of recurrent artificial neural network [4] that incorporates the mechanism of associative memory (the ability to learn and remember relationships between unrelated items). In 1986, David Rumelhart, Geoffrey Hinton and Ronald Williams jointly published the paper Learning representations by back-propagating errors [7], in which they demonstrated empirically that backward propagation can train a multi-layer neural network so that it learns appropriate internal representations of an arbitrary mapping from input to output.
During this new wave of enthusiasm for AI, Japan's Ministry of International Trade and Industry initiated a project to build a "fifth-generation computer" in 1982 [8]. It aimed to create a machine with supercomputer-like performance through large-scale parallel computation, in order to provide a platform for future developments in AI. However, after spending over 50 billion Japanese yen in 10 years, the project still could not meet its planned targets. In the late 1980s, negative impressions of AI started to grow in industry, as the field failed to live up to the tremendous investments that had been made, and AI once again faded out of the public's mind.
Stepping into the 21st century, rapid globalization and the development of the internet have significantly boosted the volume of available digital information. On the other hand, the computing capability of the Graphics Processing Unit (GPU), which first appeared in the 1990s and gained popularity over the following two decades, has been proliferating; for instance, the calculation speed of an NVIDIA Tesla V100 GPU (https://www.nvidia.com/en-gb/data-center/tesla-v100/) exceeds 10 trillion FLOPS (floating-point operations per second), surpassing the world's fastest supercomputer of 2001. (As an aside, the founders of NVIDIA first thought of "NV" as standing for "Next Version" and then added "invidia", the Latin word for envy; we would also like to express our gratitude once again to NVIDIA for supporting the joint institute with CUHK.)
With the rapid development of effective big-data collection and computing technology, AI has achieved major breakthroughs. The multi-layer convolutional neural network AlexNet, designed by Alex Krizhevsky in collaboration with Ilya Sutskever and his Ph.D. advisor Geoffrey Hinton at the University of Toronto, won the 2012 ImageNet Large Scale Visual Recognition Challenge (ILSVRC), outperforming the first runner-up by a large margin. Henceforth, deep learning based on multi-layer neural networks has been applied to various areas. For instance, with advances in deep reinforcement learning, AlphaGo, recently developed by Google, has defeated several Go world champions [9]. All of these achievements have drawn public attention to the potential of deep learning and have brought about the resurgence of AI.
Figure 1.1.2: The robot "R2-D2" in the Star Wars series.
Figure 1.1.3: T-800 "Model 101" in the movie The Terminator.
1.2 Machine Learning - Learn from data
Machine learning originated from the early stages of artificial intelligence; it evolved gradually and brought new inspiration into different sub-branches such as pattern recognition and computational learning theory. It is an interdisciplinary subject that involves statistics, linear algebra, optimization and numerical analysis. According to variations in purposes and methodologies, machine learning is classified into supervised learning, unsupervised learning and reinforcement learning. Assume that
1. there are $N$ samples in the dataset, and
2. $\mathbf{x}_m$ represents an out-of-sample feature vector, i.e. one not in the training dataset, for $m > N$.
We start the learning procedure by choosing a suitable model. Common supervised learning models include logistic regression, generalized linear models, classification and regression trees, support vector machines (SVM), K-nearest neighbors (KNN), naive Bayes classifiers, and many common deep neural networks. The model is tested by comparing its predicted values against the actual labels, so that the model can be adjusted accordingly. The training process is repeated until sufficient accuracy is obtained. The learning is supervised by the feedback obtained from the values of the actual labels; once the training is finished, new data can be fed into the model for predictions. A minimal sketch of this workflow is given below.
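To make this workflow concrete, here is a minimal sketch assuming a synthetically generated dataset and scikit-learn's LogisticRegression as the chosen model; the data, the model choice, and all parameter values (test size, iteration limit, random seeds) are illustrative assumptions rather than prescriptions of this chapter:

# A minimal sketch of the supervised learning workflow described above,
# using scikit-learn on synthetic data (illustrative assumptions only).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# 1. Collect a labeled dataset {(x_n, y_n)}, n = 1, ..., N.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# 2. Hold out part of the data to test the model against the actual labels.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# 3. Choose a model (here: logistic regression) and fit it to the training data.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# 4. Compare predictions with the actual labels; adjust and retrain until accurate enough.
print("test accuracy:", accuracy_score(y_test, model.predict(X_test)))

# 5. Once training is finished, new data can be fed in for prediction
#    (here we simply reuse one test row as a stand-in for an out-of-sample point).
x_new = X_test[:1]
print("predicted label for a new example:", model.predict(x_new))

Any of the other models listed above (SVM, KNN, naive Bayes, trees) could be swapped in with the same fit-then-predict pattern.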
More formally, a supervised learning problem starts from:
1. Dataset: A collection of labeled examples $\{(\mathbf{x}_n, y_n)\}_{n=1}^{N}$, where each $\mathbf{x}_n$ is a feature vector and $y_n$ is its label.
Example 1.2.1. Spam Detection: Suppose that we have 10,000 email messages, each labeled as "spam" or "not spam". However, these email messages cannot be used directly in a model, since the labels and the passages in the emails are not numbers! Hence, each email message has to be converted into a feature vector. One common way is called bag of words: say the bag (dictionary) contains 20,000 alphabetically sorted words; then
1. the first feature has a value of 1 if the email message contains the word "a", and 0 otherwise;
2. the second feature has a value of 1 if the email message contains the word "aaron", and 0 otherwise;
3. ...
4. the 20,000th feature has a value of 1 if the email message contains the word "zulu", and 0 otherwise.
That is, for the $n$th message,
$$x_n^{(1)} = \begin{cases} 1, & \text{if the } n\text{th message contains ``a''},\\ 0, & \text{otherwise}, \end{cases} \qquad \cdots \qquad x_n^{(20{,}000)} = \begin{cases} 1, & \text{if the } n\text{th message contains ``zulu''},\\ 0, & \text{otherwise}. \end{cases}$$
Similarly, the output labels have to be converted into numbers. For example,
$$y_n = \mathbf{1}\{\text{the } n\text{th message is spam}\} = \begin{cases} 1, & \text{if the } n\text{th message is spam},\\ 0, & \text{otherwise}, \end{cases}$$
where $\mathbf{1}\{\cdot\}$ is the indicator function. This example will be discussed further in connection with support vector machines, random forests, the naive Bayes classifier, and CIBer. A small sketch of the bag-of-words conversion is given below.
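As an illustration, the following sketch, assuming a tiny made-up set of messages and scikit-learn's CountVectorizer, converts a few toy emails into binary bag-of-words feature vectors and 0/1 labels; the messages and the resulting small vocabulary are hypothetical stand-ins for the 10,000 emails and 20,000-word dictionary of the example:

# A small sketch of the bag-of-words conversion in Example 1.2.1,
# using scikit-learn's CountVectorizer with binary indicators.
from sklearn.feature_extraction.text import CountVectorizer

emails = [                                  # toy messages, purely illustrative
    "win a free prize now",
    "meeting rescheduled to Monday",
    "free prize waiting, claim now",
]
labels = ["spam", "not spam", "spam"]

# binary=True gives 1 if a word occurs in a message and 0 otherwise,
# matching the indicator features x_n^{(j)} above.
vectorizer = CountVectorizer(binary=True)
X = vectorizer.fit_transform(emails).toarray()

# Convert the text labels into numbers via the indicator 1{message is spam}.
y = [1 if label == "spam" else 0 for label in labels]

print(sorted(vectorizer.vocabulary_))       # the alphabetically sorted "bag"
print(X)                                    # one binary feature vector per message
print(y)                                    # numeric labels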
The performance of traditional unsupervised learning in feature extraction for complex data structures may not be too appealing; alternatively, deep learning has proven its strong unsupervised learning abilities, especially in the field of computer vision, or when there is some natural ordering, metric, or algebraic structure among the feature variables of the data points; in particular, this is achieved through the convolutional layers in a Convolutional Neural Network (CNN) and the feedback mechanism of backpropagation in the deep NN (latter) part of the CNN. Recently, some research also suggests semi-supervised learning, which falls in between supervised and unsupervised learning: it makes use of unlabeled data together with a small amount of labeled data, striking a balance between learning performance and the cost of obtaining labeled data.
1. Dataset: A collection of unlabeled examples $\{\mathbf{x}_n\}_{n=1}^{N}$.
2. Goal: Produce a model that transforms the feature vector $\mathbf{x}_n$ into a real-valued output $y_n$ or a vector output $\mathbf{y}_n$. For example, in the following cases, the model returns:
(a) Clustering: The identity of the cluster for each feature vector in the dataset, i.e.
$$y_n \in \{1, \cdots, C\},$$
where $C$ is the total number of clusters. K-means clustering is one such method; it partitions the data into $K$ subgroups, where $K$ is a hyperparameter; see Section ??.
(b) Dimension Reduction: A new feature matrix $\mathbf{Y} \in \mathbb{R}^{N_Y \times D_Y}$ that has a smaller dimension than the input feature matrix $\mathbf{X} \in \mathbb{R}^{N \times D}$. Principal component analysis (PCA) reduces the dimension of the input feature matrix by looking for the dominant eigenvalues (and the corresponding eigenvectors) of the covariance matrix of the feature vectors; see Section ??.
(c) Outlier Detection: A real-valued number $y_\ell$ that indicates how $\mathbf{x}_\ell$ differs from a "typical" example in the dataset $\{\mathbf{x}_n\}_{n=1}^{N}$. The Mahalanobis distance [3] $D_n$ for the independent and identically distributed (iid) datum $\mathbf{x}_n$ satisfies, approximately,
$$D_n^2 = (\mathbf{x}_n - \bar{\mathbf{x}}_N)^\top S^{-1} (\mathbf{x}_n - \bar{\mathbf{x}}_N) \sim \chi_D^2, \qquad n = 1, 2, \cdots, N,$$
where $\bar{\mathbf{x}}_N$ and $S$ are respectively the sample mean and the sample covariance matrix of $\mathbf{x}_1, \cdots, \mathbf{x}_N$, and $\chi_D^2$ is the chi-squared distribution with $D$ degrees of freedom; in particular, if $N$ is large enough, these $D_n^2$'s, for $n = 1, \cdots, N$, are also approximately independent of each other. A short sketch illustrating these three tasks is given after this list.
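As a small illustration of the three tasks above, the following sketch runs K-means clustering, PCA, and Mahalanobis-distance outlier detection on a synthetically generated dataset; the data, the choice of K = 3 clusters, the two retained principal components, and the 99% chi-squared cutoff are all illustrative assumptions:

# A minimal sketch of clustering, dimension reduction and outlier detection
# on synthetic data (all parameter choices here are illustrative assumptions).
import numpy as np
from scipy.stats import chi2
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))           # N = 500 unlabeled examples, D = 5 features

# (a) Clustering: assign each x_n a cluster identity y_n in {1, ..., C}, here C = K = 3.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
cluster_ids = kmeans.labels_

# (b) Dimension reduction: PCA keeps the directions of the dominant eigenvalues
#     of the covariance matrix, giving a smaller feature matrix Y.
Y = PCA(n_components=2).fit_transform(X)

# (c) Outlier detection: squared Mahalanobis distance D_n^2, approximately chi^2_D.
x_bar = X.mean(axis=0)                                  # sample mean
S_inv = np.linalg.inv(np.cov(X, rowvar=False))          # inverse sample covariance
diffs = X - x_bar
D2 = np.einsum("ij,jk,ik->i", diffs, S_inv, diffs)      # D_n^2 for every n
threshold = chi2.ppf(0.99, df=X.shape[1])               # 99% quantile of chi^2_D
outliers = np.where(D2 > threshold)[0]

print(cluster_ids[:10], Y.shape, len(outliers))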
A major usage of reinforcement learning (RL) is to mimic human behaviour, and with rapidly developing research and algorithms, RL-trained agents can often outdo humans. For instance, a robot can learn to get up by itself after falling in a simulated environment, not to mention the eye-catching Go match in 2017, in which the computer program AlphaGo developed by Google defeated Ke Jie, the world champion in Go. Other applications include machine translation (MT) and predictive text, etc.
There are three historical moments in the development of reinforcement learning. First, Sutton and Barto (1998) published the text Reinforcement Learning: An Introduction [10]. The book summarizes the development of different algorithms in reinforcement learning up to 1998. By that time, much emphasis was placed on Q-learning with tables. Concurrently, algorithms based on direct policy search had already been proposed; for instance, the algorithm REINFORCE proposed in Williams (1992) [13] directly updates the policy weights by evaluating the policy gradient. The second moment was in 2013, when the Deep Q Network was first suggested for game playing by DeepMind; see Mnih et al. (2013) [5]. The Deep Q Network integrates reinforcement learning and deep neural networks to form deep reinforcement learning. During 1998-2013, various policy-based algorithms were also developed. The third moment, and also the most compelling breakthrough in RL, has to be the development of AlphaGo by Google [9]. The RL-trained computer program earned two consecutive wins in Go matches over the world champions during 2016-2017.
1.4 How to use this book?
Unfortunately, in this book you will not find material about deep thinking in the sense of the book (see Figure 1.4.1) by Professor Kawakami of design studies.
Figure 1.4.1: Kyoto University’s Deep Thinking Method (Japanese) by Hiroshi Kawakami.
"Deep Thinking" by Kawakami is a book on thought that deepens one's thinking ability and cultivates one's skill in analyzing problems and then proposing solution strategies; it is more about scientific methods and training in philosophical argument, which will not be covered in the present book. Instead, we introduce various practically useful mathematical and statistical models behind a wide range of machine learners. In 1997, Garry Kimovich Kasparov, then the reigning world chess champion, lost a match to the IBM supercomputer "Deep Blue" under a limited time constraint. Twenty years later, in 2017, he published a book titled "Deep Thinking: Where Machine Intelligence Ends and Human Creativity Begins" (see Figure 1.4.2).
Figure 1.4.2: Deep Thinking: Where Machine Intelligence Ends and Human Creativity Begins by Garry
Kasparov.
In his book, Kasparov revealed his experience and strategies in playing against Deep Blue. Although there was plenty of criticism of artificial intelligence at that time, Kasparov believed that artificial intelligence could bring humans to another height, and he predicted the future development of artificial intelligence. In a similar spirit, we hope that our book can serve as a bridge for readers to understand the common existing machine and deep learners, and to foster the future development of artificial intelligence.
In addition, we will not introduce any material related to deep diving (see Figure 1.4.3), but we will say more about self-driving/autonomous driving ("deep" driving; see Figure 1.4.4) in due course.
Figure 1.4.3: Deep diving. Photo from http://divemagazine.co.uk/travel/7529-fresh-wrecks.
Figure 1.4.4: Deep driving. Photo by Alex Kendall, https://www.youtube.com/watch?v=CxanE_W46ts.
In particular, Tesla Inc., a U.S.-based company that builds electric cars, uses deep learning to develop an autopilot system. This autopilot system has already been equipped in the Tesla Model 3 (see Figure 1.4.5). However, this autopilot technology can only perform certain functions, including but not limited to accelerating, braking, and steering, and Tesla drivers still need to take control of the car. The U.S. National Highway Traffic Safety Administration defines a Level 5 self-driving car and a Level 2 driver-assistance system as follows:
“An automated driving system (ADS) on the vehicle can do all the driving in all circumstances.
The human occupants are just passengers and need never be involved in driving.” 6
“An advanced driver assistance system (ADAS) on the vehicle can itself actually control both
steering and braking/accelerating simultaneously under some circumstances. The human driver
must continue to pay full attention (monitor the driving environment) at all times and perform
the rest of the driving task.” 6
Therefore, Tesla's current autopilot system only meets the Level 2 requirement. There is still a long journey ahead for Tesla to improve its system; indeed, in March 2016, an incident was reported on Twitter in which Tesla's autopilot system mistakenly recognized the salt lines, laid down in advance of a massive snowstorm, as the normal broken white lane markings on the highway; see Figure 1.4.6.
6 Retrieved from https://www.nhtsa.gov/technology-innovation/automated-vehicles-safety
Figure 1.4.5: White Tesla Model 3. Photo from https://en.wikipedia.org/wiki/Tesla_Model_3.
Figure 1.4.6: Original picture on Twitter: salt lines confuse Tesla's autopilot system. Photo from https://twitter.com/amywebb/status/841292068488118273.
Bibliography
[1] Bachant, J. and Soloway, E. (1989). The engineering of XCON. Communications of the ACM, 32(3):311–319.
[2] Buchanan, B. G. and Feigenbaum, E. A. (1980). The Stanford Heuristic Programming Project: Goals and activities. AI Magazine, 1(1):25–25.
[3] Mahalanobis, P. C. (1936). On the generalised distance in statistics. In Proceedings of the National Institute of Sciences of India, volume 2, pages 49–55.
[4] Hopfield, J. J. (1982). Neural networks and physical systems with emergent collective computational abilities. Proceedings of the National Academy of Sciences, 79(8):2554–2558.
[5] Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., and Riedmiller, M. (2013). Playing Atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602.
[6] Newell, A. and Simon, H. (1956). The logic theory machine – a complex information processing system. IRE Transactions on Information Theory, 2(3):61–79.
[7] Rumelhart, D. E., Hinton, G. E., and Williams, R. J. (1986). Learning representations by back-propagating errors. Nature, 323(6088):533–536.
[8] Shapiro, E. Y. (1983). The fifth generation project – a trip report. Communications of the ACM, 26(9):637–641.
[9] Silver, D., Huang, A., Maddison, C. J., Guez, A., Sifre, L., Van Den Driessche, G., Schrittwieser, J., Antonoglou, I., Panneershelvam, V., Lanctot, M., et al. (2016). Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587):484–489.
[10] Sutton, R. S. and Barto, A. G. (1998). Reinforcement Learning: An Introduction. MIT Press.
[12] Whitehead, A. and Russell, B. (1927). Principia Mathematica. Number 1 in Cambridge Mathematical Library. Cambridge University Press.
[13] Williams, R. J. (1992). Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 8(3):229–256.