
Indira Gandhi National Open University
School of Engineering & Technology

MIO-002
SMART TECHNOLOGIES (HARDWARE AND SOFTWARE)

BLOCK 3
AI AND MACHINE LEARNING
UNIT 7
Basics of AI

UNIT 8
Introduction to Machine Learning

UNIT 9
AI and Machine Learning for Smart Cities
MIO-002: SMART TECHNOLOGIES (HARDWARE AND SOFTWARE)
BLOCK 3: AI AND MACHINE LEARNING
GUIDANCE
Prof. Nageshwar Rao Prof. Satyakam Prof. Ashish Agarwal
Vice-Chancellor, IGNOU PVC, IGNOU Director, SOET, IGNOU

COURSE CURRICULUM DESIGN AND DEVELOPMENT COMMITTEE


External Expert Members
Dr. Charru Malhotra Prof. Sewaram
Associate Professor, IIPA, ITO, New Delhi Head, Transport Planning, SPA, New Delhi
Prof. (Dr.) Rashmi Ashtt. Dr. Chetna Singh
Principal & Director, Hindu College of Design, Assistant Professor, SPA, New Delhi
Architecture & Planning, Sonipat, Haryana
Ms. Meenakshi Tyagi Dr. Neha Goyel Tripathi
AGM, Architect & Planner, Gurugram Assistant Professor, SPA, New Delhi
Mr. Sunil Mr. Samir Chaudhuri
Associate Professor, Hindu College of Design, Retired Head (IT), PGCIL, Gurugram
Architecture & Planning, Sonipat, Haryana
Dr. P.Venkateshwar Rao Dr. Asif Nazar
Associate Professor, NIT, Warangal Ex-Dy.Manager (Tech), MTNL Delhi
Dr. Shridhar INTERNAL FACULTY MEMBERS
Assistant Professor, NIT, Warangal
Dr. K.V.R. Ravi Shankar Prof. Ashish Agarwal Prof. Sanjay Agrawal
Assistant Professor, NIT Warangal
Mr. Pradeep Kumar Prof. K.T.Mannan Dr. Shashank Srivastav
Assistant Professor, KITS, Karimnagar, Telangana
Prof. V.V. Subrahmanyam Prof. N. Venkateshwarlu Dr. Shweta Tripathi
SOCIS, IGNOU
Prof. P.V.K Sashidhar Prof. Rakhi Sharma Dr. Anuj Kumar Purwar
SOEDS, IGNOU
Dr. Munish Kumar Bhardwaj

Programme Coordinator Course Coordinator Block Coordinator


Prof. N. Venkateshwarlu Prof. N. Venkateshwarlu Prof. N. Venkateshwarlu
SOET-IGNOU Prof. Ashish Agarwal

BLOCK PREPARATION TEAM


BLOCK 3: AI AND MACHINE LEARNING
Units written by Block 3 Editor Block Coordination and CRC Management
Units - 7, 8 & 9
Dr. Asif Nazar Prof. N. Venkateshwarlu Prof. N. Venkateshwarlu
Ex-Dy.Manager (Tech), MTNL Delhi Prof. Ashish Agarwal Prof. Ashish Agarwal

October 2022
© Indira Gandhi National Open University (IGNOU), 2022 CRC Prepared in 2021
ISBN: 978-93-5568-577-3
All rights reserved. No part of this work may be reproduced in any form, by mimeograph or any other means, without permission in writing from the Indira Gandhi National Open University (IGNOU).
Further information on the Indira Gandhi National Open University (IGNOU) courses may be obtained from the University's office at Maidan Garhi, New Delhi-110068.
Laser Typesetting by School of Engineering and Technology, IGNOU, New Delhi-110 068. Phone 29532863
Printed and published as digital materials on behalf of the School of Engineering and Technology (SOET), Indira Gandhi National Open University, New Delhi-110068.
UNIT 7 BASICS OF AI
Structure
7.1 Introduction
7.2 What is AI?
7.3 Components of Artificial Intelligence
7.4 Fields of Applications of AI
7.5 Implementation of AI
7.6 The Future of AI
7.7 AI Ethics
7.8 Summary
7.9 Keywords
7.10 Check Your Progress – Possible Answers
7.11 References and Selected Readings

7.1 INTRODUCTION
How can we tell whether a computer has Artificial Intelligence? Alan Turing proposed a test to
answer this question. A computer that satisfies Turing's definition of Artificial
Intelligence should be able to do things that are expected from humans such as
writing an essay, recognizing pictures of celebrities, engaging in conversation,
composing music, solving reasoning tests, and so on. To pass the Turing test,
AI should have capabilities like natural language processing, knowledge
representation, reasoning, and machine learning. The domain of natural
language processing (NLP) is concerned with understanding how human
languages like English can be understood and replicated by computers.
Different NLP models perform a variety of tasks, such as sentiment analysis
(i.e. determining the tone of a sentence), machine translation (like Google
Translate), and speech recognition (like Alexa or Siri). Generative Pre-trained Transformer, or GPT, is a
model for Natural Language Processing. An AI called GPT-3, trained on
millions of online articles and posts, can generate human-like textual passages
based on prompts. AI is a powerful technology and it is progressing rapidly. AI
is capable of performing many tasks. AI can translate between languages. It
can beat the best chess players. It can recognize objects in images and
videos. It has made inroads into stock trading, self-driving cars, and many such
applications. Neural networks and machine learning can creatively generate
texts, music pieces, and even paintings in the style of famous painters. Some of
the prominent fields where applications of AI can be discerned are: climate
science, finance, cybersecurity, and natural language processing. The ultimate
goal of AI is to make machines that think like human beings. This idea is
called Artificial General Intelligence. As opposed to the current AI systems,
which are dedicated to solving specific tasks, a machine with Artificial General
intelligence will be capable of learning and performing several tasks. With the
advancement and progress of AI, questions about the ethics of AI become
more prominent.
Objective
In this unit, you will learn about the basics of AI. After reading this unit, you
will be able to:
 Understand the concept of AI
 Identify components of AI
 Appreciate applications of AI in different fields
 Appreciate AI through the implementation of a small project
 Discuss the future of AI
 Discuss ethics of AI

7.2 WHAT IS AI?


In 1950, Alan Turing, the famous British cryptographer remembered for his
work on decoding the German Enigma machine, proposed a test to ascertain
whether a computer has Artificial Intelligence (AI). The test goes as follows:
an interviewer interviews a human and a computer without knowing their
identities initially. The interviewer then asks a series of questions to both for
five minutes and tries to figure out which one of them is the AI. If the AI is
able to confuse the human interviewer, then it can be said that the computer
has Artificial Intelligence.
Thus, a computer that satisfies Turing’s definition of Artificial Intelligence
should be able to do things that are expected from humans, for example,
writing an essay, recognizing pictures of celebrities, engaging in conversation,
composing music, solving reasoning tests, and so on.
To pass the Turing Test an AI must have the following capabilities:
i. Natural Language Processing: To understand the questions being asked,
the AI must be capable of understanding human language such as English.
ii. Knowledge Representation: To store information
iii. Reasoning: To logically deduce conclusions from information
iv. Machine Learning: To learn from past performance and feedback
The notion of AI is not limited to what is defined by the Turing test; rather, the
Turing test is one of many approaches to describing AI. Today, while no AI
has conclusively passed the Turing test, several advances have been made in
AI that go beyond the scope of the test.
Check Your Progress 1
In this section, you studied “What is AI?”, now answer the questions given in
Check Your Progress-1.
Note: a) Write your answer in about 50 words
b) Check your answer with possible answers given at the end of the unit
(1) Explain the Turing test that ascertains whether a computer has Artificial
Intelligence.
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------

7.3 COMPONENTS OF ARTIFICIAL INTELLIGENCE


7.3.1 Search
Search is a universal problem-solving mechanism in artificial intelligence. In
AI problems, the sequence of steps required for the solution of a problem is not
known a priori, but often must be determined by a systematic trial-and-error
exploration of alternatives.
Begin with a case of a two-player game, like chess, that can be addressed by an
AI search algorithm. This is how a chess player thinks while making a move:
i. Considers the position on the board
ii. Looks at key moves that are possible
iii. Calculates what position each move leads to
iv. Chooses the move that leads to the best position
Now in a game of chess the space of possible moves is huge. In fact, there are
about 69,352,859,712,417 positions that can be reached within 5 moves of play.
Clearly, it is not feasible to look at all the possible combinations of moves that
can be played. In fact, experienced players can use “intuition” to quickly rule
out moves that are not relevant.
From the perspective of computers, the game of chess is a search problem. A
player has to search for the best move (or sequence of moves) from all the
possible choices. As it is evident that a brute force search is impractical even
for modern computers, it is time to look at other search algorithms that are
commonly used by AI practitioners.
Alpha-Beta Pruning
To explain Alpha-Beta pruning, we first introduce the notion of a “Game Tree”.
A game tree is a representation of a sequential game (i.e. a game in which
people take turns like tic-tac-toe or chess). It represents all possible situations
(states) that can occur in the game. Each state is represented by a node in the
graph, nodes that radiate downwards from a node are called daughter nodes
and the node itself is referred to as parent node.
Figure 7.1 shows what a game tree for tic-tac-toe looks like. It starts with
an empty grid, subsequently, alternate levels in the tree represent all possible
moves of each player. Then finally the last nodes (called leaves) would have
the final state which would represent either Player 1 winning or Player 2
winning or a draw.

Figure 7.1: Game Tree Example


(Source: Wikipedia)

If we denote win for Player 1 by +1, win for Player 2 as -1, and a draw as zero,
then the objective for player 1 is to maximize the final score and the objective
of player 2 is to minimize the score.
Figure 7.2 shows an example of a game tree which we will search using alpha-beta
pruning. To begin with, two variables α = −∞ and β = +∞ are initialized for
each node. α represents the best-case scenario for Player 1, while β
represents the best-case scenario for Player 2. The following steps explain
the working of alpha-beta pruning.
Figure 7.2: Alpha-beta Pruning Example

i) The algorithm starts in a depth-first search fashion, that is, it keeps moving
down the graph until the last move is reached. Here the algorithm goes
from Z → P → A.
ii) For node A, since it is Player 1 to play, he will seek to maximize. He
explores the first child of node A and finds α = max(−∞, 3) = 3. Then he
checks the best-case scenario for Player 2, i.e. checks β, which
is still +∞. Since β ≥ α, it checks the other child nodes until it either
beats the best-case score of Player 2 or runs out of nodes to check. In this
case, it checks the next node and finds α = max(3, 4) = 4. Since
all nodes are exhausted, Player 1 can safely claim that it will score at least
4 points from this node, so the value of A is set to v = 4.
iii) After fully exploring possibilities arising from A, the algorithm moves
to the next higher level on the path it followed to A, and reaches node P,
where it is Player 2's turn to make a move. Player 2 explores its child
nodes, skipping A (as A has already been explored). Since its goal is to
minimize, it assigns β = min(+∞, 4) = 4. Then it checks the first new
node to see if it can do better. Thus, the algorithm moves down the graph
and passes the newly found β = 4 to its child B.
iv) Step 2 is repeated for node B: α = max(−∞, 7) = 7. Then check if β ≥ α;
since the condition is false, we stop the search here and assign v = 7.
Thus, we have pruned the remaining branches of node B.
v) The algorithm then goes back to where it left off at the end of step 3,
assigning β = min(4, 7) = 4. Since no more branches are left to be
explored, Player 2 can claim that it can restrict the score to β = 4, so
the value of P is set to v = 4.
These steps are repeated until we get the value for node Z.
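The procedure above can be condensed into a short recursive routine. Below is a minimal Python sketch of minimax with alpha-beta pruning, assuming the game tree is encoded as nested lists whose leaves are final scores; the leaf values are illustrative, with the 9 standing in for the branch of B that gets pruned.

def alphabeta(node, alpha=float("-inf"), beta=float("inf"), maximizing=True):
    # A leaf holds the final score of the position.
    if isinstance(node, (int, float)):
        return node
    if maximizing:                        # Player 1 tries to raise alpha
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)
            if beta <= alpha:             # the opponent already has a better option:
                break                     # prune the remaining children
        return value
    else:                                 # Player 2 tries to lower beta
        value = float("inf")
        for child in node:
            value = min(value, alphabeta(child, alpha, beta, True))
            beta = min(beta, value)
            if beta <= alpha:
                break
        return value

# A subtree like node P in the text: child A = [3, 4], child B = [7, 9].
# Player 2 moves at P, so maximizing=False; the 9 under B is never visited.
print(alphabeta([[3, 4], [7, 9]], maximizing=False))   # -> 4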
Heuristic Search
Often in real-life situations more information is available than the abstract
definition of the problem. This can be illustrated through the graph shown in
Figure 7.3.
Here the nodes represent cities and the edges connecting them represent the
distance between the two cities. The task is to find the shortest path between
cities 1 and 3. The solution to this problem is given by Dijkstra’s algorithm
which explores the graph in a manner such that the shortest route from the
destination to any intermediate state is considered.
However, in practice, the information available may be more than just the road
distance. It can be the aerial distance between different cities given in a
geographical map. This extra information can help in making more informed
decisions when choosing which node to explore. This extra information is
called a heuristic. A heuristic can thus be used to make choices when at a
crossroad instead of randomly choosing one (as in uninformed search) or
choosing one based on the cost to reach that state.

Figure 7.3: City Graph with the Distance between Cities

A* Search
A* search uses both the cost to reach the state (as in Dijkstra's algorithm)
and the heuristic. Thus, the total cost is given by
f(n) = g(n) + h(n)
where g(n) is the total cost of the path from the starting node to the current
node n, and h(n) is the heuristic. Different problems use different heuristic
functions h. For our example, h(n) can simply be chosen as the aerial distance
of the destination from the current node.
A* search progressively chooses the nodes which lead to a path with the
minimum total cost.
A heuristic search does not guarantee an optimal result but generally performs
faster than uninformed searches.
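A minimal Python sketch of A* using a priority queue ordered by f(n); the four-city graph, road distances, and aerial-distance heuristic h below are illustrative stand-ins for Figure 7.3, not values read from it.

import heapq

def a_star(graph, h, start, goal):
    # Each queue entry is (f, g, node, path) with f(n) = g(n) + h(n).
    frontier = [(h[start], 0, start, [start])]
    visited = set()
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, cost in graph[node].items():
            if neighbour not in visited:
                g2 = g + cost   # cost of the path so far
                heapq.heappush(frontier, (g2 + h[neighbour], g2, neighbour, path + [neighbour]))
    return None

# Edge weights are road distances; h maps each city to its aerial distance from city 3.
graph = {1: {2: 4, 4: 2}, 2: {1: 4, 3: 5}, 4: {1: 2, 3: 8}, 3: {2: 5, 4: 8}}
h = {1: 6, 2: 4, 4: 7, 3: 0}
print(a_star(graph, h, 1, 3))   # -> (9, [1, 2, 3])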
Monte Carlo Tree Search
Monte Carlo Tree Search (MCTS) is useful when the tree is very large. It
estimates the probability of winning for a given move by using Monte Carlo
simulation. A Monte Carlo Tree Search example is given in Figure 7.4.
Monte Carlo Tree Search has 4 essential steps:
1. Selection: It is decided which node is to be selected next, based on a
function called the Upper Confidence Bound (UCB):
UCB = v_i + c √(ln N / n_i)
where v_i represents the value of the current node (the value of a node
corresponds to the probability of winning), N is the number of times the
parent node has been visited, n_i is the number of times node i has been
visited, and c is a constant (often √2) that balances the two terms. The
UCB function has two important characteristics.
 Exploitation: You choose to explore the nodes that have higher
chances of winning. Thus, following the exploitation strategy,
the higher the value of v_i, the more likely the node will be
selected.
 Exploration: The term √(ln N / n_i) quantifies the exploration
strategy. Once a node has been explored many times, its visit
count n_i becomes much larger than the logarithm term,
eventually bringing the UCB value down for that node compared
to nodes that have a small n_i value.

Figure 7.4: Monte Carlo Tree Search Example

You stop the selection process when you reach a node that has not yet
been expanded.
2. Expansion: When you reach the last node, you randomly make a move
to add another node to the tree.
3. Simulation: For the given node, you run a classical Monte Carlo
simulation to accumulate statistics for the node. A Monte Carlo
simulation in this case is simply playing multiple games by making
random moves until a result is achieved and recording the statistics for
that node.
4. Backpropagation: The newly collected statistics are used to update the
probability values of nodes upwards to the top.
After a sufficient number of iterations are completed, you simply choose the
move with the highest probability of winning. Monte Carlo Tree Search is
widely used in programs that play games like chess.
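The selection step can be written in a few lines. Below is a small Python sketch of the UCB rule, assuming each node records its wins, its own visit count n_i, and its parent's visit count N; the statistics are invented for illustration.

import math

def ucb(node, c=math.sqrt(2)):
    # Unvisited nodes get an infinite score so each is tried at least once.
    if node["visits"] == 0:
        return float("inf")
    exploit = node["wins"] / node["visits"]                       # v_i
    explore = c * math.sqrt(math.log(node["parent_visits"]) / node["visits"])
    return exploit + explore

def select(children):
    # Pick the child with the highest UCB value.
    return max(children, key=ucb)

children = [
    {"wins": 7, "visits": 10, "parent_visits": 30},   # strong and well explored
    {"wins": 1, "visits": 2,  "parent_visits": 30},   # weak but barely explored
]
print(select(children))   # exploration favours the barely explored node here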
7.3.2 Learning from Data
The domain of learning from data is more commonly known as machine
learning. Machine learning techniques use data to learn to do a specific task.
For example, using a collection of your album photos as input, machine
learning techniques can be used to develop a model that can recognize your
face. In machine learning “Training a model” is often referred to in the same
sense as it is used in the case of “Training a pet”. To be more precise, training
a model means finding parameters for which the model gives optimal results.
Machine learning is classified into 3 types:
i. Supervised Learning
ii. Unsupervised Learning
iii. Reinforcement Learning
Supervised learning uses data that has been labelled or tagged. The goal here is
to predict the labels for new input.
Unsupervised learning utilizes data that is not labelled. The main goal here is
to find patterns in the data. A very common example is clustering, i.e.
grouping similar data together to form different groups of the data. For
example, consider a satellite image of India, an unsupervised learning model
may recognize different features like the forest, desert, mountains, and so on.
Reinforcement learning allows the model to learn by exploring the
environment and receiving rewards or punishments for certain actions. The
model often referred to as the Agent is allowed to make decisions according to
a policy that it can learn such that it maximizes the reward accumulated.
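A short scikit-learn sketch contrasting the first two types on the same toy data: a k-nearest-neighbour classifier learns from the labels (supervised), while k-means finds the two groups without them (unsupervised). The points below are invented for illustration.

from sklearn.neighbors import KNeighborsClassifier
from sklearn.cluster import KMeans

# Toy 2-D points: the first three belong to class 0, the last three to class 1.
X = [[1, 1], [1, 2], [2, 1], [8, 8], [8, 9], [9, 8]]
y = [0, 0, 0, 1, 1, 1]

# Supervised: learn from labelled data, then predict the label of a new point.
clf = KNeighborsClassifier(n_neighbors=3).fit(X, y)
print(clf.predict([[2, 2]]))   # -> [0]

# Unsupervised: find two groups in the same data without using the labels.
km = KMeans(n_clusters=2, n_init=10).fit(X)
print(km.labels_)              # two clusters matching the two groups of points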
Neural Networks
Here neural networks have been explained to illustrate the ideas in machine
learning. Neural networks were designed to replicate the human brain. The
human brain has billions of neurons connected to each other, which convey
messages from one part of the brain to another and make us capable of
listening, speaking, and reasoning. Likewise, a neural network has neurons that
are connected with each other. Each neuron performs computation and
forwards its output to connected neurons.
Perceptron
A perceptron is the single unit of the neural network. The perceptron has some
parameters which it can learn. It takes an input, performs some calculations on
it, and returns the output. The calculation consists of two parts:
i) Linear transformation: wᵀx
ii) Activation function: σ(wᵀx)
A linear transformation is simply the dot product of the input x and the
parameters of the perceptron (w). The activation function introduces
nonlinearity to the output. This activation function makes sure that the neural
network is able to learn even complex relationships.
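A minimal NumPy sketch of a single perceptron with a sigmoid activation; the input x and weights w below are illustrative, and in practice w would be found during training.

import numpy as np

def sigmoid(z):
    # Squashes any real number into (0, 1), introducing nonlinearity.
    return 1.0 / (1.0 + np.exp(-z))

def perceptron(x, w):
    z = np.dot(w, x)        # linear transformation: w^T x
    return sigmoid(z)       # activation: sigma(w^T x)

x = np.array([0.5, -1.2, 3.0])    # input
w = np.array([0.8,  0.1, -0.4])   # learned parameters
print(perceptron(x, w))           # output between 0 and 1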
In a neural network, the perceptrons are arranged in layers as shown in
Figure 7.5.
The output of the neural network can be designed to represent the probabilities
of different output classes. Sometimes, the output layer of the neural network
is removed. This gives us processed data, and this process is called feature
extraction.
Recently, many advancements have been made in this area, which have led to
the development of neural networks specializing in different tasks such as
image processing and language processing.
Figure 7.5: Representation of a Neural Network
(Source: Wikipedia: https://upload.wikimedia.org/wikipedia/commons/e/e4/Artificial_neural_network.svg)

7.3.3 Knowledge Representation and Reasoning


Humans often reason based on their knowledge. Certain statements about the
world are assumed to be true; in mathematics and reasoning, these statements
which are assumed to be true are called axioms. A famous example is Euclid’s
axioms of geometry.
In logic, sentences that are either true or false are called statements. A
statement can be represented by a variable. Table 7.1 shows statements and
their truth values.
Table 7.1: Statement and Truth Value
Statement Truth value
P: All novels are books. TRUE
Q: Christmas Carol is a book. TRUE
R: Paris is the capital of England. FALSE
Statements are usually denoted by upper case alphabets. Each statement has a
truth value, that is, it is either True or False.
Sentences that are interrogatory such as “Is UP India’s largest state?” or which
are ambiguous such as “It might be raining outside.” are not considered as
statements.
In English, you often use words like “and”, “or”, “not”, “therefore” etc. to
convey logical statements. For example:
Christmas carol is a book and Paris is the capital of England. One can say that
the sentence is False as the second part of the sentence is false. This kind of
logic is formally represented with the help of logical operators. Table 7.2
shows logical operators and their English equivalents.
Table 7.2: Logic Operators and their English Equivalent
¬ NOT
∧ AND
∨ OR
→ IF… THEN
↔ IF AND ONLY IF
Using logical operators, you can write the statement
A: Christmas Carol is a book and Paris is the capital of England
as
A = Q ∧ R
Truth Tables
Truth tables represent the truth value of the outcome of a logical operator for
all possible combinations of truth values of the inputs. The following are the
truth tables for ∧ and ∨.

P      Q      P ∧ Q
FALSE  FALSE  FALSE
FALSE  TRUE   FALSE
TRUE   FALSE  FALSE
TRUE   TRUE   TRUE

P      Q      P ∨ Q
FALSE  FALSE  FALSE
FALSE  TRUE   TRUE
TRUE   FALSE  TRUE
TRUE   TRUE   TRUE

A collection of such statements with their truth values is called a knowledge
base. A knowledge base provides the AI facts about the world, and the AI can
use it to make logical deductions.
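These operators map directly onto Python's Boolean operators, so the tables above can be generated in a few lines:

import itertools

# Enumerate every combination of truth values for P and Q.
print("P      Q      P AND Q  P OR Q")
for p, q in itertools.product([False, True], repeat=2):
    print(str(p).ljust(7), str(q).ljust(6), str(p and q).ljust(8), str(p or q))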
7.3.4 Hardware for AI
The development of AI algorithms has also led to developments in the
hardware used to run AI algorithms. On the other hand, the development of
processors with high processing power has allowed models like neural
networks to become feasible. Graphics Processing Units or GPUs were
developed for image processing and rendering. GPUs are structured in a highly
parallelized manner, which means they can run thousands of processes
simultaneously. Training neural networks, for example, can be divided into
several sub-processes which can be run simultaneously on a GPU. Thus, neural
networks train much faster on GPUs than on an all-purpose CPU.
Field Programmable Gate Arrays (FPGA) are integrated circuits that are
widely used to deploy AI as they are reconfigurable and offer flexibility to the
designer. Since no standard circuits are available for AI, the customizable
nature of FPGA is beneficial.
Companies also develop AI Application Specific Integrated Circuits (ASIC)
for implementing specialized techniques. An example of an ASIC is the Tensor
Processing Unit (TPU) developed by Google. TPUs have been designed
specially to carry out matrix multiplications and additions which are used
repeatedly in deep learning.
Check Your Progress 2
In this section, you studied components of Artificial Intelligence, now answer
the questions given in Check Your Progress-2.
Note: a) Write your answer in about 50 words
b) Check your answer with possible answers given at the end of the unit
(1) Discuss A* search.
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
(2) Discuss the knowledge representation and reasoning.
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
(3) Discuss Hardware for AI.
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------

7.4 FIELDS OF APPLICATION OF AI


Artificial Intelligence is widely used in various fields. AI is changing our lives
in many ways. Be it self-driving cars or service robots, these applications show
the tremendous strides AI has taken. Neural networks and machine learning can
creatively generate texts, music pieces, and even paintings in the style of famous
painters. Some of the prominent fields where applications of AI can be
discerned are: climate science, finance, cybersecurity, and natural language
processing. Applications of AI in the domain of speech recognition, natural
language processing, and self-driving cars have been discussed in this section.
7.4.1 Speech Recognition and Natural Language Processing
In 2020 an article titled “A robot wrote this entire article. Are you scared yet,
human?” was published in The Guardian. The article was written by an AI
called GPT-3, which was trained on millions of online articles and posts. GPT-
3, the latest in a series of models, can generate human-like textual passages
based on prompts. GPT stands for Generative Pre-trained Transformer. GPT is
a model for Natural Language Processing.
The domain of natural language processing is concerned with understanding
how human languages like English can be understood and replicated by
computers. Different NLP models perform a variety of tasks, such as:
Sentiment Analysis: Analysing the "tone" of a sentence
Sentiment analysis finds applications in analyzing product reviews. By
analyzing the text, sentiment analysis models can tell whether a review is
positive or negative, or whether the audience found a movie thrilling or
boring.
Machine Translation
The most famous example is Google Translate which is capable of translating
text from one language to another.
Speech Recognition
Speech recognition is used in home assistants such as Alexa or Siri. The task in
speech recognition is to process the audio data to obtain the corresponding text.
Coming back to GPT, it is based on the transformer architecture. The basic
transformer model is shown in Figure 7.6.
The transformer is basically an encoder-decoder architecture. The left side of
the figure shows the encoder. The encoder takes a sentence as input and returns
an embedding (i.e. a vector that is obtained by processing of the input.) The
embedding can be thought of as an essence or meaning of the sentence.
The other half is the decoder, which takes the encoder output and returns the
desired probabilities. For example, if we are using the transformer for machine
translation we can get the output as probabilities for words in the translated
sentence.
The key feature of the transformer architecture is the Multi-Headed Attention
block. The attention block allows the model to capture long-range correlations.
Such correlations are important to capture, as languages have pronouns, verbs,
and adjectives which are usually correlated with nouns that may occur much
later in the sentence.

Figure 7.6: Transformer Architecture


(Source: taken from “Attention is all you need”)
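At the core of the attention block is scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ/√d_k)V, as defined in the "Attention is all you need" paper cited in the references. A minimal NumPy sketch with random toy matrices:

import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))   # numerically stable softmax
    return e / e.sum(axis=-1, keepdims=True)

def attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)   # similarity of each query to each key
    weights = softmax(scores)         # attention paid by each position to the others
    return weights @ V                # weighted mix of the value vectors

# Toy example: 3 token positions with embedding dimension 4.
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(3, 4)) for _ in range(3))
print(attention(Q, K, V).shape)       # -> (3, 4)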

7.4.2 Self-Driving Cars


Self-driving cars use intelligent systems to sense the surroundings, detect and
locate objects and symbols, arrive at a decision to steer, brake, accelerate, and
so on, and finally to control the vehicle by passing instructions to the steering
system.
The objective of a self-driving car is to make driving decisions based on the
input it gets from the sensors. Examples of driving decisions might be:
 When to brake?
 When to steer and how much and in which direction to steer?
 When to apply emergency brakes?
A self-driving car typically uses several sensors to detect the environment.
These sensors include LiDAR, Cameras, RADAR, and so on. These sensors
capture what is termed as “Raw Data” i.e. data that is in a very crude form and
cannot directly be used to make driving decisions.
To make driving decisions you need to be able to detect and identify objects in
the surrounding. For example, you need to be able to identify other cars on the
road, traffic lights, signboards, zebra crossings, markings on road, pedestrians,
etc.
Subsequently, you want to locate these objects in the 3-D space. Note that just
identifying the images in different cameras is not enough, you need to combine
the images to get a 3-D map of the surrounding. This task is not trivial and the
traditional programming approach fails to get the desired results.

Figure 7.7: Object Detection by Self Driving Car


(Source: Wikipedia)

Tesla’s AI for Autonomous vehicle


Tesla uses feed from eight cameras to generate a 3-D map of its environment.
The architecture used is a “HydraNet”. The HydraNet (Figure7.8) has a single
neural network backbone that takes images from the cameras as input and
extracts the features. Then these features are provided to different heads which
are fine-tuned for different tasks such as vehicle detection, marking detection,
and so on.
Tesla chose to use only cameras instead of LiDAR which is more commonly
used by other companies. The rationale behind the choice was that if a human
can drive based only on visual inputs, then so can AI. Thus, they eliminated
the use of LiDAR and subsequently even removed RADAR from the cars thus
making it solely reliant on cameras.
A key point to remember when using camera feeds is that feeds should be
synchronized i.e. there should not be any time lag between inputs from
different cameras.
Figure 7.8: Tesla’s HydraNet Architecture

Check Your Progress 3


In this section, you studied fields of application of AI, now answer the
questions given in Check Your Progress-3.
Note: a) Write your answer in about 50 words
b) Check your answer with possible answers given at the end of the unit
(1) Discuss transformer architecture.
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
(2) Discuss Tesla’s HydraNet architecture.
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------

7.5 IMPLEMENTATION OF AI
Many implementations of AI are released under open source licenses. These
software can be used and modified by anyone in the world. Here you can look
at one such model, GPT-2. You were acquainted with GPT 3 model in the last
section. GPT 2 is the precursor of GPT-3. The project example is used to
generate passages based on prompts.
7.5.1 Implementation through a Project Example: Generating passage
based on prompts
To run GPT-2, you will use Google Colab, as it gives free access to a cloud-based
GPU which can speed up the code execution.
Setting Up Google Colab
1. To use Google Colab, you must have a Google account. If you don’t
already have an account, you can create a Google account and set up an
email address and a password.
2. Go to https://colab.research.google.com/ and sign in with your Google
account. Select the New Notebook option. This will open a blank
notebook (Figure 7.9).

Figure 7.9: Colab Startup Screen

3. The default notebook contains a code cell. You can run blocks of
python code in these cells. The other type of cell is the Text cell, where
you can enter a description of your code or any other information that
you want to.
4. Just to get started, make a Text cell and type "My First Colab
Notebook"; now create a Code cell and type print('hello'). It should
look something like this (Figure 7.10):

Figure 7.10: Colab Interface

5. Now run each cell using Ctrl+Enter. Alternatively, use Runtime->Run


All. The text will get formatted according to Markdown Syntax. All
code output will be displayed below the cell.
You can add multiple code cells to implement different parts of a code. All
variables, functions and classes defined in one cell are accessible to other
cells. The getting started guide gives a more detailed and comprehensive
introduction.
Loading and using GPT 2
1. Open a new Google Colab notebook.
2. Change runtime type to GPU. Navigate to Runtime → Change runtime
type. Then select GPU from the dropdown menu that appears (Figure
7.11).
Figure 7.11: A Snapshot of Selecting GPU

3. Clone the GitHub repository of GPT-2. Like much other open-source
software, GPT-2 is hosted on GitHub. Cloning a repository means that
you make a local copy of the repository. To clone the GPT-2 source
code, execute the following code in a code cell.
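A likely form of the command, assuming the official openai/gpt-2 repository on GitHub:

# Clone the GPT-2 repository into the Colab workspace.
!git clone https://github.com/openai/gpt-2.git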

4. A new folder named gpt-2 is created. Navigate to gpt-2 folder.
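In Colab, changing the working directory is done with the %cd magic:

# Make the cloned repository the notebook's working directory.
%cd gpt-2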

5. Install and import required packages.
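A plausible set of commands for this step, assuming the repository's requirements.txt and its download_model.py script; 124M is the smallest released checkpoint. Note that the repository targets TensorFlow 1.x, so an older runtime may be required.

# Install the Python dependencies listed by the repository.
!pip install -r requirements.txt
# Download the weights of the smallest released GPT-2 model (124M parameters).
!python download_model.py 124M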


6. Now everything is set up and you can start using the GPT-2 model. Run
the interactive sampling script (shown below) to generate text based on
a prompt. The model will then ask for a prompt; enter a prompt of your
choice and hit Enter.
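A likely invocation, assuming the repository's interactive sampling script; the temperature and top_k flags referred to in step 10 below are parameters of this script:

# Prompt GPT-2 interactively; type a prompt and press Enter to sample text.
!python src/interactive_conditional_samples.py --model_name 124M --top_k 40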

Please note that it is OK if you get several warnings before the output. This
happens because, as software engineers update the software, they introduce
better functionalities and add deprecation warnings to older methods.
7. Output
============================== SAMPLE 1
============================== (AI) and human-machine
cooperation to produce a new generation of vehicles that drive autonomously
and safely.
The idea of robot cars that drive themselves has been around for decades, with
some of the first self-driving car prototypes produced in the 1970s. It has only
been in the last few years that the technology has become commercially
feasible, with Google's self-driving car project, Waymo, and Ford's F-650
concept car being among the first to hit the streets.
Although no firm timeline has been set for the commercial introduction of a
self-driving car, some of the early trials have already begun, with the world's
first fully self-driving car, which was driven for the first time in Switzerland,
taking the reins at a race meeting in Singapore last year.
Now, it's the turn of Chinese tech giant Baidu, which hopes to be one of the
first carmakers to use the technology.
The autonomous vehicle will be developed by Baidu as part of its self-driving
car research. The Chinese technology giant has partnered with Carnegie
Mellon University to develop and test the vehicle.
The car, which has been dubbed the "Baidu Drive", will be able to drive from
New York to San Francisco in less than 12 hours and from Beijing to Shanghai
in about 17 hours. Baidu is currently building a fleet of 100 prototype self-
driving cars in China.
"By combining AI, machine learning, and robotics to improve driving safety,
we hope to dramatically decrease the time it takes to travel from the U.S. to
China," Baidu CEO Robin Li said in a release.
The company hopes the self-driving car will become a mainstream technology
used in the transportation market in the near future, and will help it compete
with Chinese rivals such as Tencent and Alibaba. Baidu will use the test results
to help develop further technologies for autonomous driving, according to the
release.
"The new Baidu Drive prototype demonstrates our technology can handle
highly congested urban environments and has a promising safety record in the
city," said Dr. Bin Zhao, director of the Department of Robotics, the Carnegie
Mellon University's Department of Computer Science.
The autonomous vehicle has been built using Baidu's self-driving technology.
Researchers believe the self-driving car will be able to handle difficult driving
conditions, such as road construction, traffic jams, and sudden changes in road
conditions.
"We are very happy to
8. You can experiment with different prompts.
Question: Is artificial intelligence dangerous?, Answer:
This format can be used to ask questions to the model.
9. Even though the model performs well there are caveats. Sometimes it
seems that it is copying the information from random internet sources.
This is because it was trained on a corpus of text taken from the internet.
10. Change the values of the temperature and top_k parameters; these
parameters control how creative or "random" the model can become.
Check Your Progress 4
In this section, you studied the implementation of AI, now answer the
questions given in Check Your Progress-4.
Note: a) Write your answer in about 50 words
b) Check your answer with possible answers given at the end of the unit
(1) Setup Google Colab.
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
(2) Load and use gpt-2.
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
(3) Use gpt-2 model for various prompts.
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------

7.6 THE FUTURE OF AI


7.6.1 Near Term Future
In the near future, the use of AI in several fields such as medicine and research
is expected to mature. Drug discovery and the simulation of complex biological
processes can be enhanced by AI, leading to faster drug development at a
cheaper price.
AI is also an integral part of upcoming technologies such as Web 3.0 and
Industry 4.0. There are several ideas on how this could pan out. The semantic
web is a concept that advocates for the transformation of the internet such that
the data on the web can be read, accessed, and understood by machines. At
present, the web is designed so that it is easy for humans to understand and
navigate; computers can, for now, only extract data from the web, and there is
no in-built mechanism that computers can use to extract meaning from the
information on the web.
Similarly, as we advance into the world of industry 4.0, which is a paradigm
where machines are interconnected via the internet of things, a huge amount of
data would be generated by these machines which can be processed with the
help of machine learning to optimize and tweak production techniques. AI is
expected to get involved at all levels of the supply chain, helping in optimizing,
removing bottlenecks, and improving customer satisfaction.
AI is also expected to exploit recent advances in fields such as Quantum
Computing, which change the way we fundamentally think about computers.
Quantum Computers can obtain exponential speed-ups over their classical
counterparts in certain cases. AI and Machine Learning techniques are being
adapted to be used by these next-generation computers.
7.6.2 Artificial General Intelligence
The ultimate goal of AI is to make machines that think like human beings. This
idea is called Artificial General Intelligence. As opposed to the current AI
systems, which are dedicated to solving specific tasks, a machine with
Artificial General intelligence will be capable of learning and performing
several tasks.
Artificial general intelligence is a far-fetched idea. Achieving this requires an
amalgamation of different domains of AI that are being used right now, such as
Natural Language Processing, Robotics, Computer Vision as well as the
invention of new techniques that understand emotions and dynamics of social
interactions.
Thus, a machine with AGI must be capable of conversing with humans,
playing football, dancing, singing, imitating other humans, and so on.
7.6.3 Humans and AI
In popular media, AI is often portrayed as a job killer. People are anxious that
AI will replace them at work, especially with AI-powered systems such as self-
driving cars or human-less stores such as Amazon GO. While it is quite true
that many jobs would become obsolete, all the domains that involve creativity,
human interaction, or impromptu decisions will be immune from AI for the
foreseeable future.
Another school of thought believes that AI will augment human capabilities,
eliminating hazards and risks associated with workplaces. A glimpse of this
can be seen in health monitoring technologies and smart prosthetics that are
being developed.
Yet another concept is that of the singularity. If we are able to create an AI
that is as capable as humans, it stands to reason that the AI can itself create
updated versions of itself iteratively and reach a point where the AI created by
AI would outsmart humans. This is called a singularity, a point when we
would not have any control over the further development of AI. Singularity
however is only a topic of philosophical discussion as its precondition is the
existence of AGI which itself is a tough goal to accomplish.
Check Your Progress 5
In this section, you studied the future of AI, now answer the questions given in
Check Your Progress-5.
Note: a) Write your answer in about 50 words
b) Check your answer with possible answers given at the end of the unit
(1) What is artificial general intelligence?
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------

7.7 AI ETHICS
AI is a powerful technology and it is progressing rapidly. AI is capable of
performing many tasks. AI can translate between languages. It can beat best of
the chess players. It can recognize objects in images and videos. It has made
inroads into stock trading, self-driving cars, and many such applications. With
the advancement and progress of AI, questions about the ethics of AI become
more prominent. One such ethical issue arises in self-driving cars: when
rushing a patient to a hospital, should a self-driving car give priority to the life
of the patient in the car or to pedestrians on a crowded road?
Another example pertains to autonomous weapon systems. Can an
autonomous weapon system be given a free hand in identifying and attacking a
target? These are dilemmas. There are several ethical issues in using AI.
Some of the ethical issues pertaining to AI have been discussed here.
7.7.1 Data Privacy
Perhaps the most recent episode in the minds of the people is the Cambridge
Analytica scandal, where about 87 million people, mostly in the USA, were
profiled and their profiles were sold to political campaign managers for the
purpose of influencing voters.
The data was leaked from Facebook, which earns its revenue by running ads.
Tech giants like Facebook and Google that offer free services usually use AI to
monitor and analyze the behavior of their users to improve the quality of ads
by making them more targeted. Thus naturally they collect and store petabytes
of data.
Even smaller, new organizations are collecting and analyzing data for
improving their services. For example, smart appliances send data periodically
to servers which helps the manufacturer to gain insights into the performance
of the product and improve it.
Clearly, there is a trade-off between the amount of data collected and the
performance of a product or service. Thus several practices are incorporated to
ensure that the privacy of individual users is not compromised. The data
collected is usually anonymized, that is, all the markers that may be used to
identify a user are removed before storing and processing the data.
k-anonymity
k-anonymity is a standard metric that measures how many users in a database
share a particular attribute. For example, if we have a database of school
students from which obvious identifiers such as name and roll number are
omitted, one can think of identifying a person based on other attributes such as
the percentage of marks obtained, the sports team they are part of, etc. If every
query on such attributes returns at least k matching records (that is, each
individual is indistinguishable from at least k−1 others), then the database is
said to possess k-anonymity. Thus, for a sufficiently large 'k', it is not possible
to ascertain the identity of a person based on their attributes.
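A small pandas sketch of this check, assuming the quasi-identifying attributes are known in advance; the student records below are invented for illustration.

import pandas as pd

def k_anonymity(df, quasi_identifiers):
    # The database is k-anonymous for the largest k such that every
    # combination of quasi-identifier values occurs at least k times.
    return df.groupby(quasi_identifiers).size().min()

students = pd.DataFrame({
    "marks_band":  ["90-100", "90-100", "80-90", "80-90", "80-90"],
    "sports_team": ["cricket", "cricket", "hockey", "hockey", "hockey"],
})
print(k_anonymity(students, ["marks_band", "sports_team"]))   # -> 2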
L-diversity
L-diversity extends the idea of k-anonymity by merging several sensitive
attributes together. For example, if we have a hospital database that has a list
of patients in each disease category, one can run a “background knowledge
attack” i.e. just knowing that a certain number of people in a hospital suffer
from a particular disease can be used for malicious purposes. Thus in the
database instead of listing patients by individual diseases, we can group ‘l’
similar diseases together.
7.7.2 Biases in AI
Many AI models are black boxes; even leading experts do not understand
which factors were taken into account, and to what extent. Many AI models,
especially in machine learning, are trained on historic data which themselves
have biases. For example, if a police database is used to train a model that
determines the bail amount, people from certain neighborhoods might see their
bail set at higher amounts if the historical data is biased against them.
7.7.3 Deepfakes
With advancements in deep learning, a new phenomenon has emerged called
deepfakes. Deepfakes refer to audio, video, or images that are generated by AI,
and never actually existed in the real world. For example, the image shown
below (Figure 7.12) is not a picture of a real woman but an image generated by
AI.

Figure 7.12: Deepfake Image of a Person Generated by AI


(Source: Wikipedia)

AI models can also be trained to generate images or videos that resemble real-life
humans but in entirely different contexts; for example, a video of Barack
Obama calling Trump names was released in 2017.
The most notable architecture that drives these models is the Generative
Adversarial Network, or GAN. GANs consist of two neural network models:
a generator and a discriminator. The generator generates images and the
discriminator tries to classify the generator's images as fake or real. Both the
generator and the discriminator are trained simultaneously, and finally, when
the discriminator's accuracy approaches 50% (i.e. no better than a random
guess), we consider the model to be trained.
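A highly condensed PyTorch sketch of this adversarial loop; the tiny fully-connected networks and the one-dimensional "real" data below are illustrative stand-ins, not a working image model.

import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))                 # generator
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())   # discriminator
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(2000):
    real = torch.randn(64, 1) * 0.5 + 3.0     # "real" samples drawn around 3.0
    fake = G(torch.randn(64, 8))              # generator maps noise to samples

    # Train the discriminator: label real samples 1 and fake samples 0.
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Train the generator: try to make the discriminator label fakes as 1.
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

print(G(torch.randn(5, 8)).detach().flatten())  # samples should drift toward 3.0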
Deepfakes are problematic because they are hard to detect and hence can
convey misinformation very convincingly to large audiences.
Check Your Progress 6
In this section, you studied AI ethics, now answer the questions given in Check
Your Progress-6.
Note: a) Write your answer in about 50 words
b) Check your answer with possible answers given at the end of the unit
(1) What are k-anonymity and L-diversity?
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
(2) Discuss biases in AI.
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
(3) What are deepfakes?
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------

7.8 SUMMARY
This unit discussed the basics of Artificial Intelligence. The concept of AI was
elucidated. Acquaintance with various search algorithms was made. An
overview of learning from data i.e. machine learning was presented. Further,
fields of applications such as natural language processing, and self-driving cars
were discussed. A small project was illustrated to demonstrate the
implementation of AI. Finally, the unit concluded with the ethics of AI.

7.9 KEYWORDS
Application Specific Integrated Circuit: An application specific integrated
circuit (ASIC) is a kind of integrated circuit that is specially built for a specific
application or purpose.
Tensor Processing Unit: Tensor Processing Unit (TPU) is an AI accelerator
application-specific integrated circuit (ASIC) developed by Google
specifically for neural network machine learning.
Depth-First Search: Depth-first search is an algorithm for traversing or
searching tree or graph data structures. The algorithm starts at the root node
and explores as far as possible along each branch before backtracking.
Dijkstra’s Algorithm: Dijkstra’s algorithm allows you to calculate the
shortest path between one node of your choosing and every other node in a
graph.
Field Programmable Gate Array (FPGA): It is a semiconductor IC where a
large majority of the electrical functionality inside the device can be changed;
changed by the design engineer, changed during the PCB assembly process, or
even changed after the equipment has been shipped to customers out in the
‘field’.
Game Tree:A game tree is a representation of a sequential game (i.e. a game
in which people take turns like tic-tac-toe or chess). It represents all possible
situations (states) that can occur in the game. Each state is represented by a
node in the graph, nodes that radiate downwards from a node are called
daughter nodes and the node itself is referred to as parent node.
GPT-3: Generative Pre-trained Transformer 3 (GPT-3) is an autoregressive
language model that uses deep learning to produce human-like text.
7.10 CHECK YOUR PROGRESS – POSSIBLE ANSWERS
Check Your Progress 1 – Possible Answers
1) Explain the Turing test that ascertains whether a computer has Artificial
Intelligence.
In 1950, Alan Turing, the famous British cryptographer remembered for his
work on decoding the German Enigma machine, proposed a test to ascertain
whether a computer has Artificial Intelligence (AI). The test goes as follows:
An interviewer interviews a human and a computer without knowing their
identities initially. The interviewer then asks a series of questions to both for
five minutes and tries to figure out which one of them is the AI. If the AI is
able to confuse the human interviewer, then it can be said that the computer
has Artificial Intelligence.
Thus, a computer that satisfies Turing’s definition of Artificial Intelligence
should be able to do things that are expected from humans, for example,
writing an essay, recognizing pictures of celebrities, engaging in conversation,
composing music, solving reasoning tests, and so on.
Check Your Progress 2 – Possible Answers
1) Discuss A* search.
A* search uses both the cost to reach the state (as in Dijkstra's algorithm)
and the heuristic. Thus, the total cost is given by
f(n) = g(n) + h(n)
where g(n) is the total cost of the path from the starting node to the current
node n, and h(n) is the heuristic. Different problems use different heuristic
functions h. For our example, h(n) can simply be chosen as the aerial distance
of the destination from the current node.
2) Discuss the knowledge representation and reasoning.
Humans often reason based on their knowledge. Certain statements about the
world are assumed to be true; in mathematics and reasoning, these statements
which are assumed to be true are called axioms. A famous example is Euclid’s
axioms of geometry.
In logic, sentences that are either true or false are called statements. We can
represent statements by variables. Statements are usually denoted by upper
case alphabets. Each statement has a truth value, that is, it is either True or
False.
3) Discuss Hardware for AI.
The development of AI algorithms has also led to developments in the
hardware used to run AI algorithms. On the other hand, the development of
processors with high processing power has allowed models like neural
networks to become feasible. Graphics Processing Units or GPUs were
developed for image processing and rendering. GPUs are structured in a highly
parallelized manner, which means they can run thousands of processes
simultaneously. Training neural networks, for example, can be divided into
several sub-processes which can be run simultaneously on a GPU. Thus, neural
networks train much faster on GPUs than an all-purpose CPU.
Field Programmable Gate Arrays (FPGA) are integrated circuits that are
widely used to deploy AI as they are reconfigurable and offer flexibility to the
designer. Since no standard circuits are available for AI, the customizable
nature of FPGA is beneficial.
Check Your Progress 3 – Possible Answers
1) Discuss transformer architecture.
The transformer is basically an encoder-decoder architecture. The left side of
the figure shows the encoder. The encoder takes a sentence as input and returns
an embedding (i.e. a vector that is obtained by processing of the input.) The
embedding can be thought of as an essence or meaning of the sentence.
The other half is the decoder, which takes the encoder output and returns the
desired probabilities. For example, if we are using the transformer for machine
translation we can get the output as probabilities for words in the translated
sentence.
The key feature of the transformer architecture is the Multi-Headed Attention
block. The attention block allows the model to capture long-range correlations.
Such correlations are important to capture, as languages have pronouns, verbs,
and adjectives which are usually correlated with nouns that may occur much
later in the sentence.
2) Discuss Tesla’s HydraNet architecture.
Tesla uses feed from eight cameras to generate a 3-D map of its environment.
The architecture used is a "HydraNet". The HydraNet has a single neural
network backbone that takes images from the cameras as input and extracts the
features. Then these features are provided to different heads which are fine-
tuned for different tasks such as vehicle detection, marking detection and so on.
Check Your Progress 4 – Possible Answers
1) Setup Google Colab

Follow these steps:

1. Create a Google account and set up an email address and a password.


2. Go to https://colab.research.google.com/ and sign in with your Google
account. Select the New Notebook option. This will open a blank
notebook.
3. The default notebook contains a code cell. You can run blocks of
python code in these cells. The other type of cell is the Text cell, where
you can enter a description of your code or any other information that
you want to.
4. Just to get started, Make a text box and type “My First Colab
Notebook”, now create a Code Cell and type print(‘hello’).
5. Now run each cell using Ctrl+Enter. Alternatively, use Runtime->Run
All. The text will get formatted according to Markdown Syntax. All
code output will be displayed below the cell.
2) Load and use gpt-2
Follow the following steps:
1. Open a new Google Colab notebook.
2. Change runtime type to GPU. Navigate to Runtime → Change runtime
type. Then select GPU from the dropdown menu that appears.
3. Clone the GitHub repository of GPT-2. Like much other open-source
software, GPT-2 is hosted on GitHub. Cloning a repository means that
you make a local copy of the repository. To clone the GPT-2 source
code, execute the clone command given in Section 7.5.1 in a code cell.

4. A new folder named gpt-2 is created. Navigate to gpt-2 folder.


5. Install and import required packages.
6. Now everything is set up and you can start using the GPT-2 model. Run
the interactive sampling script to generate text based on a prompt. The
model will then ask for a prompt; enter a prompt of your choice and hit
Enter.

3) Use gpt-2 model for various prompts


Once the GPT-2 model is set up, write different prompts and get answers.
Check Your Progress 5 – Possible Answers
1) What is artificial general intelligence?
The ultimate goal of AI is to make machines that think like human beings. This
idea is called Artificial General Intelligence. As opposed to the current AI
systems, which are dedicated to solving specific tasks, a machine with
Artificial General intelligence will be capable of learning and performing
several tasks.
Check Your Progress 6 – Possible Answers
1) What are k-anonymity and L-diversity?
k-anonymity is a standard metric that measures how many users in a database
share a particular attribute. L-diversity extends the idea of k-anonymity by
merging several sensitive attributes together.
2) Discuss biases in AI.
Many AI models are black boxes, even leading experts don’t understand how
and what factors were taken into account and to what extent. Many AI models,
especially in machine learning models, are trained on historic data which
themselves have biases. For example, if a police database is used to train a
model that determines the bail amount, people from certain neighborhoods
might see their bail set at higher amounts if the historical data is biased against
them.
3) What are deepfakes?
With advancements in deep learning, a new phenomenon has emerged called
deepfakes. Deepfakes refers to audio, video, or images that are generated by AI
and never actually existed in the real world. Deepfakes are problematic
because they are hard to detect and hence can convey misinformation very
convincingly to large audiences.
7.11 REFERENCES AND SELECTED READINGS
Books
1. Ertel, W. (2018). Introduction to artificial intelligence. Springer.
2. Liao, S. M. (Ed.). (2020). Ethics of artificial intelligence. Oxford
University Press.
3. Russell, S., & Norvig, P. (2015). Artificial intelligence: a modern
approach.
Web Articles/Journals
4. Korf, R. E. (1999). Artificial intelligence search algorithms.
5. What is Artificial Intelligence (AI)? (https://www.ibm.com/in-en/cloud/learn/what-is-artificial-intelligence)
6. What makes TPUs fine-tuned for deep learning? (https://cloud.google.com/blog/products/ai-machine-learning/what-makes-tpus-fine-tuned-for-deep-learning)
7. Attention is all you need (https://arxiv.org/pdf/1706.03762.pdf)
8. Stanford Encyclopedia of Philosophy: Artificial Intelligence (https://plato.stanford.edu/entries/artificial-intelligence/#StroVersWeakAI)
9. Better Language Models and Their Implications (https://openai.com/blog/better-language-models/)
UNIT 8 INTRODUCTION TO MACHINE
LEARNING
Structure
8.1 Introduction
8.2 What is Machine Learning?
8.3 Types of Machine Learning
8.4 Machine Learning Algorithms
8.5 Neural Networks and Deep Learning
8.6 Mathematics for Machine Learning
8.7 Software for Machine Learning
8.8 Summary
8.9 Keywords
8.10 Check Your Progress – Possible Answers
8.11 References and Selected Readings

8.1 INTRODUCTION
Credit for defining machine learning goes to Arthur Samuel. IBM's Arthur
Samuel wrote a paper titled "Some Studies in Machine Learning Using the
Game of Checkers" in 1959. The paper investigated the application of machine
learning to the game of checkers. The concept of machine learning introduced
by Samuel showed that machines (computers) can learn without being
explicitly programmed, that is, without the use of direct programming
commands. Here machine learning refers to self-learning by the machine
(computer). How will a machine learn? A machine will learn from historical
data and empirical information. Machine learning, statistical learning and
predictive modelling represent the same concept: statistical modelling is at
the core of machine learning.
Broadly speaking, machines can learn in three ways. These form three
categories of Machine Learning. These are: Supervised Learning,
Unsupervised Learning and Reinforcement Learning. Supervised learning
uses labelled datasets. Labelled data is data that comes with a name or type.
Unsupervised learning involves finding a pattern in data. Thus unsupervised
learning segregates data into clusters or groups. These clusters or groups are
unlabeled. Reinforcement learning works on the principle of reward and
punishment. In other words, Reinforcement learning builds its prediction
model by gaining feedback from random trial and error and leveraging insight
from previous iterations. There are Machine Learning Algorithms
corresponding to these Machine Learning categories. Support Vector Machine
is a supervised machine learning algorithm. K-means clustering is an
unsupervised machine learning algorithm. These traditional models do not
scale in performance as the size of the dataset increases. However, deep
learning methods continue to scale in performance with the increasing size of
the dataset. Machine Learning is an emerging field of computer science having
wide applications in Search engines, Recommendation systems, Spam filters
etc.
Objectives
In this unit, you will learn the fundamentals of Machine Learning with proper
examples and illustrations. After reading this unit, you will be able to:
 Understand the concept of Machine Learning
 Discuss various types of Machine Learning
 Appreciate various Machine Learning Algorithms
 Understand the concept of neural networks and deep learning
 Explore some frameworks for Machine Learning in Python

8.2 WHAT IS MACHINE LEARNING?


Machine learning is a branch of artificial intelligence (AI) and computer
science which focuses on the use of data and algorithms to imitate the way that
humans learn, gradually improving its accuracy. Machine learning algorithms
are a class of computational algorithms where computers are not programmed to
do a task explicitly but rather "learn" how to perform the task. Think of it in
the following fashion: in classical programming, you have a function that for
any given input x would produce y as an output, i.e., y = f(x).
However, a machine learning model tries to figure out the function f given x
and y. This approach is similar to how humans figure out things. They observe
the cause and effect and try to figure out how it happened. In other words,
machine learning tries to figure out the rule that connects the input and output.
This approach works wonders when a computer is tasked with performing non-
trivial tasks which do not have a set defined rule, such as recognizing a human
face, differentiating between different cat species, and talking to humans
(Remember Siri and Alexa?).
Another interesting non-trivial task is playing games such as chess and Go, at
which humans seem to excel. However, computers have a tough time
understanding and evaluating such games. For a long time, computers relied on
brute-force calculations and tabulated data to try to outperform human
players. For example, when Deep Blue defeated the World Chess Champion
Garry Kasparov in 1997, it relied heavily on an opening book, a database of
about 70,000 master-level games, and good old-fashioned AI algorithms like
alpha-beta search.
Cut to the present, Google’s AlphaZero is ruling the chess world. It is an AI
based system that learns chess by literally just playing against itself and
consistently outperforms all Grand Masters and other conventional chess
engines.
Unlike chess, in Go (a game quite popular in East Asia), the number of possible
positions grows at an even faster rate than in chess, making it even harder to use
brute-force techniques, and the role of intuition becomes much more important.
Google's AlphaGo defeated the reigning world champion of Go, again
showing how superior AI and machine learning models can be to
conventional algorithms.
A more business-like application of machine learning is in building
recommendation systems. Websites like Netflix and YouTube rely on
recommendation engines to show relevant results to users from an almost
infinite set of possibilities.
Another widely used application of Machine Learning is spam detection.
Google and other companies classify Emails as Spam or Not-Spam based on
certain patterns found in them. Recently Machine Learning has been applied to
this task and with great success.
Check Your Progress 1
In this section, you studied “What is Machine Learning?”, now answer the
questions given in Check Your Progress-1.
Note: a) Write your answer in about 50 words
b) Check your answer with possible answers given at the end of the unit
(1) Briefly explain the idea of Machine Learning.
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
(2) Mention currently used applications based on Machine Learning.
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------

8.3 TYPES OF MACHINE LEARNING


Typically machine learning is divided into three categories:
i. Supervised Learning
ii. Unsupervised Learning
iii. Reinforcement Learning
8.3.1 Supervised Learning
Supervised learning utilizes a “Labelled Dataset”, i.e., a dataset that contains
the data and its labels. A Labelled Dataset contains features and their
corresponding labels. For example, the MNIST dataset has images of
handwritten digits. Each image has a corresponding integer label indicating the
number corresponding to the image. The images and their corresponding labels
are as shown in Figure 8.1.

Figure 8.1: MNIST Dataset Showing Images and Corresponding Labels

A supervised machine learning algorithm learns the mapping from image to
label. When the model is trained, it can predict the labels even for previously
unseen images. Note that the features need not be images; these can be any set
of attributes or characteristics. Image classification is one of several tasks that
supervised learning can accomplish.
8.3.2 Unsupervised Learning
Unsupervised learning involves finding patterns in data. It utilizes data that is
not labelled and attempts to find patterns by clustering similar objects together.
For protecting customers from fraudulent online activities, a company may
use either supervised learning or unsupervised learning. Unsupervised learning
will be able to detect new patterns of fraudulent practice by clustering the
data.
Since unsupervised learning does not involve labels, large volumes of data that
are hard to label can be processed to get insights.
8.3.3 Reinforcement Learning
Reinforcement Learning models work on the principle of reward and
punishment. In reinforcement learning, an agent is allowed to interact with the
environment and is rewarded if the action is deemed correct, else it is punished
for not behaving expectedly. The goal of reinforcement learning is to
maximize the cumulative reward.
Reinforcement Learning (RL) models differ from Supervised and
Unsupervised Learning in the sense that there is no data from which patterns
have to be recognized or values to be predicted. RL models are very useful in
industrial applications where the model can learn to perform complicated tasks
and optimize the process. Other applications include self-driving cars.
To understand reinforcement learning, one has to understand the following:
Environment: Think of the environment as a board game in which landing on
a particular square gives a reward or penalty. Thus, the environment is a
collection of entities with which an agent can interact.
Agent: An agent is the entity which performs certain actions by interacting
with the environment and has a goal of maximizing its rewards.
Reward: Reward is a numerical incentive given to the agent for performing a
certain task.
Check Your Progress 2
In this section, you studied different types of Machine Learning, now answer
the questions given in Check Your Progress-2.
Note: a) Write your answer in about 50 words
b) Check your answer with possible answers given at the end of the unit
(1) Discuss different types of Machine Learning.
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
(2) How is Reinforcement Learning different from Supervised Learning?
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
(3) What is a Labelled Dataset?
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------

8.4 MACHINE LEARNING ALGORITHMS


8.4.1 K-means Clustering
K-means clustering is an unsupervised learning algorithm, which attempts to
partition the set of given input data into k partitions by minimizing the
variance for each partition. In Figure 8.2, there are three clusters shown in
three different colours. K-means clustering will find three centroids
corresponding to each cluster and try to correctly allocate each data point to
the corresponding centroid by forming three sets from the given data points.

Figure 8.2: Dataset for K-means Clustering

(The colours are for the reader's convenience; the dataset itself carries no prior
information regarding the clusters.)
The simplest way of doing this is using the naive k-means algorithm. Start
with some arbitrary values for the centroids (μ₁, μ₂, ..., μ_k). Then allocate
each point to its nearest centroid. Finally, update the centroid values to the new
values, namely the mean of the data points in each partition, and repeat until
the allocations stop changing.
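A minimal numpy sketch of this naive loop (the data X and the number of
clusters k are placeholders for illustration; scikit-learn's KMeans class
implements an optimized version of the same idea):

    import numpy as np

    def naive_kmeans(X, k, n_iters=100, seed=0):
        rng = np.random.default_rng(seed)
        # Start with arbitrary centroids chosen from the data points
        centroids = X[rng.choice(len(X), size=k, replace=False)]
        for _ in range(n_iters):
            # Allocate each point to its nearest centroid
            dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
            labels = dists.argmin(axis=1)
            # Update each centroid to the mean of the points allocated to it
            # (for simplicity, this sketch assumes no cluster becomes empty)
            new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
            if np.allclose(new_centroids, centroids):
                break  # allocations have converged
            centroids = new_centroids
        return centroids, labels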
8.4.2 Linear and Logistic Regression
Regression is a widely used method to establish a relation between input and
output variables. In linear regression, the relation between the input and output
is known to be linear in nature (while this might seem too restrictive, there are
indeed a lot of problems that can be mapped to this simple case)
i.e.,
y = ax + b
(a, b are unknown but fixed constants)
A formal presentation of the problem at hand can be done in the following
manner. Given a set of ordered pairs (x_1, y_1), (x_2, y_2), ..., (x_n, y_n), the
goal is to find the constants a and b that characterize a function f(x) = ax + b
such that the total error Σ_{i=1..n} L(f(x_i), y_i) (L is a loss function) is
minimized. A commonly used loss function is the mean squared error, which
is defined as
L = Σ_{i=1..n} (f(x_i) − y_i)²

Figure 8.3: Linear Regression Fit for the Plotted Points

The above statement can be understood visually as shown in Figure 8.3.
Here 'n' (input, output) pairs are given, and the task is to find the line which
minimizes the sum of the (squared) distances of each point from the line.
This can be done using the gradient descent algorithm (refer to Section 8.6 for
details). Start with some random values for 'a' and 'b' and then calculate the
loss. Then find the gradient of the loss and update the parameters to reduce the
loss function. Do so until the error is below the acceptable limit or a certain
number of iterations have been done.
Repeat:
a = a − α (∂L/∂a)
b = b − α (∂L/∂b)
until L < ε, where ε is the acceptable error limit and α is the learning rate.
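A minimal numpy sketch of this loop, with made-up data and a fixed learning
rate (illustrative only):

    import numpy as np

    # Toy data generated around the line y = 2x + 1 (illustrative only)
    rng = np.random.default_rng(0)
    x = rng.uniform(0, 10, size=100)
    y = 2 * x + 1 + rng.normal(0, 0.5, size=100)

    a, b = 0.0, 0.0          # initial values for the parameters
    alpha = 0.01             # learning rate
    for _ in range(2000):
        y_pred = a * x + b
        # Gradients of the mean squared error loss w.r.t. a and b
        grad_a = 2 * np.mean((y_pred - y) * x)
        grad_b = 2 * np.mean(y_pred - y)
        a -= alpha * grad_a
        b -= alpha * grad_b

    print(a, b)  # should approach 2 and 1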
However, there is another class of problems that can be approached in a similar
manner: binary classification. Now the output is binary (i.e., y_i ∈ {0, 1})
instead of being a continuous variable. One simple approach is to fit a straight
line to the data, define a threshold value y₀, and classify everything greater
than y₀ as 1 and everything less than y₀ as 0.
Figure 8.4: Sigmoid Function
However, a better idea is to use a "sigmoid" function on top of the linear
model, as shown in Figure 8.4. Mathematically, the prediction made by the
model is
y = 1 / (1 + e^(−z))        (i)
where
z = ax + b
Equation (i) represents the logistic regression model. The output of the sigmoid
function can be interpreted as the probability, or the confidence, that a given
data point has the label 1. There is one last catch before we can fully
implement the logistic regression model: the mean squared error (the one used
in linear regression) cannot be used here, as it leads to a non-convex
optimization problem. So, finally, a log loss (also called cross-entropy loss)
function is used, which is defined as
L = − Σ_{i=1..n} [ y_i log(y'_i) + (1 − y_i) log(1 − y'_i) ]
where y_i is the label of the i-th data point and y'_i is the predicted probability.

Gradient descent is then used to optimize the parameters 'a' and 'b' and
minimize the loss function.
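A similar sketch for logistic regression trained on the log loss (toy data;
with the sigmoid model, the gradient of the log loss conveniently reduces to
(y − t) terms):

    import numpy as np

    rng = np.random.default_rng(1)
    # Toy 1-D data: points above 5 are labelled 1, points below 5 are labelled 0
    x = rng.uniform(0, 10, size=200)
    t = (x > 5).astype(float)

    a, b = 0.0, 0.0
    alpha = 0.1
    for _ in range(5000):
        y = 1 / (1 + np.exp(-(a * x + b)))   # sigmoid of the linear model
        # Gradient of the log loss w.r.t. a and b
        grad_a = np.mean((y - t) * x)
        grad_b = np.mean(y - t)
        a -= alpha * grad_a
        b -= alpha * grad_b

    print(a, b)  # decision boundary near x = -b/a, i.e., about 5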
Bias-variance Trade-off
Bias: A model is said to have a high bias when it underfits the data. That is
when the model does not correctly learn the relationship between the input data
and the output. This might happen because the model is too simple, or because
the model is making assumptions that simply aren't true.
Variance: When the model overfits, it is said to have high variance. In this case,
the model is unable to generalize to the data and performs poorly on the test
data while scoring very well on the training data. Generally, this happens when
the model is too complex, or there is too little data to train on.
8.4.3 Support Vector Machines

Figure 8.5: Optimal Hyperplane Separating Two Classes

Support Vector Machines (SVM) are robust classifiers that are great at
separating data that have a complex decision boundary. An SVM finds a
hyperplane is as shown in Figure 8.5, that separates the two classes such that it
maximizes the distance from the nearest data points (also known as the
“Support Vectors” and hence the name).
In Figure 8.5, there are two classes of points, green and blue. The SVM finds
w and b such that the line (or, more generally, the hyperplane) separating the
two classes maximizes the distance to the points nearest to it.
Mathematically, we are interested in finding the hyperplane
w·x − b = 0
which is mid-way between the hyperplanes containing the support vectors, i.e.,
w·x − b = 1
w·x − b = −1
such that the distance between them,
d = 2 / ‖w‖
is maximized.
The intuition behind using an SVM is that the points closest to the other group
are more important to consider while drawing a boundary than those which are
not. Further, the best boundary separating the two groups is the one that
maximizes the distance to these nearby points. As can be seen intuitively
in Figure 8.6, the red line is clearly the best boundary.
Figure 8.6: Possible Hyperplanes Separating the Two Classes

The real power of SVM is, however, unleashed when the “Kernel Trick” is
used. In this technique, the data is embedded into a higher dimensional space.
By such embedding, the data which is not linearly separable in the original
space becomes linearly separable in the higher dimensional space. The SVM
then goes on to find the optimal hyperplane separating the data.
Embedding the data in a higher-dimensional space basically means that we add
more dimensions to the data that are derived from the original data. For
example, as shown in Figure 8.7, there is no hyperplane separating the two
classes in the original space. However, if we introduce a third variable
z = x² + y²
we get an embedding in 3-D space. When we plot (x_i, y_i, z_i), we can clearly
see that the purple points are higher up compared to the red points; thus, a
horizontal plane can separate the two.

Figure 8.7: Effect of Embedding Data in a Higher Dimensional Space


(Source: Wikipedia)
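A short scikit-learn sketch of this idea: make_circles produces two concentric
rings that no straight line can separate, while the RBF kernel performs an
implicit higher-dimensional embedding:

    from sklearn.datasets import make_circles
    from sklearn.svm import SVC

    # Two concentric rings: not linearly separable in 2-D
    X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

    linear_svm = SVC(kernel="linear").fit(X, y)
    rbf_svm = SVC(kernel="rbf").fit(X, y)

    print("linear kernel accuracy:", linear_svm.score(X, y))  # poor
    print("rbf kernel accuracy:", rbf_svm.score(X, y))        # near 1.0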

Check Your Progress 3


In this section, you studied Machine Learning algorithm, now answer the
questions given in Check Your Progress-3.
Note: a) Write your answer in about 50 words
b) Check your answer with possible answers given at the end of the unit
(1) Briefly describe the iterative K-means clustering algorithm.
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
(2) Describe different loss functions in linear and logistic regression.
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
(3) Describe how embedding data in higher dimensional space is useful for
SVM?
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------

8.5 NEURAL NETWORKS AND DEEP LEARNING


As the size of data scales up, traditional machine learning algorithms stop
performing better. However, in this regime, deep learning models continue to
scale up in performance with the data, making them very relevant in the
current scenario when a humongous volume of data is generated daily (Think
Facebook, YouTube, Satellite Data). In this section, we will look at how neural
networks work. Neural Networks are inspired by the human brain, which has
thousands of neurons working together to give us consciousness. Before that,
we will look at the perceptron which is the building block of the neural
network. Finally, we will explore deep learning, including models that are
designed for image processing (Convolutional Neural Networks) and models
designed for language processing (Recurrent Neural Networks).
8.5.1 The Perceptron

Figure 8.7: Perceptron

A perceptron is a single layer neural network, or in other words, it is the


building block of neural networks. A perceptron takes an input vector,
multiplies it with a parameter vector and then applies an activation function to
get the result. Now that might be a lot to take in at one go, so here look at it
one by one:
Input Vector: It is the input to the perceptron. It may be from the training set
from which the perceptron has to learn or from a test set in which case a
perceptron has to predict. For example, in an image classification problem, the
image pixel values can be the input.
Parameter Vector: The parameter vector is the key to the perceptron’s
learning ability. The parameter vector is optimized (or learned) during the
training of the model.
Activation Function: Finally, an activation function is applied to map the
output to the desired range. For the example of image classification, we might
want to map the output to zero and one for negative and positive samples,
respectively. Some commonly used activation functions are: Sigmoid
Activation Function, ReLU (Rectified Linear Unit), Leaky ReLU, and tanh.
These will be discussed as and when they are used.
A simple implementation of the perceptron model in Python is sketched below
(refer to Section 8.7 for setting up Python):
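A minimal sketch, assuming a sigmoid activation and made-up inputs and
weights:

    import numpy as np

    def sigmoid(z):
        return 1 / (1 + np.exp(-z))

    # Made-up input vector and parameter (weight) vector
    x = np.array([0.5, -1.2, 3.0])
    w = np.array([0.4, 0.3, -0.2])
    b = 0.1                       # bias term

    z = np.dot(w, x) + b          # weighted sum of the inputs
    y = sigmoid(z)                # activation maps the sum into (0, 1)
    print(y)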

Feel free to change the input data and weights to see how the output changes.
Exercise: Generate a plot of the ReLU activation function.
(Hint: the ReLU function is defined as max(0, x).)
Ans.
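One possible answer using matplotlib (a sketch; any plotting approach works):

    import numpy as np
    import matplotlib.pyplot as plt

    x = np.linspace(-5, 5, 200)
    relu = np.maximum(0, x)   # ReLU: max(0, x)

    plt.plot(x, relu)
    plt.title("ReLU Activation Function")
    plt.xlabel("x")
    plt.ylabel("max(0, x)")
    plt.show()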

Figure 8.8: ReLU Activation Function Plot

To summarize mathematically:
Inputs: x = (x_1, x_2, ..., x_n)ᵀ
Parameters: w = (w_1, w_2, ..., w_n)ᵀ
z = wᵀx
y = g(z)
Loss = (1/2)(y' − y)²
Here y' is the actual label, y is the predicted label, and g is the activation
function. The loss is generally taken to be the mean squared error or the
cross-entropy loss.
8.5.2 Neural Network
One perceptron isn’t very powerful on its own. To build a more powerful
classifier, we combine a lot of perceptrons so that they can learn even more
complex decision boundaries; this is called a neural network. A simple Google
search returns the trademark image of a neural network shown in Figure 8.9.
One can instantly recognize that it is just a collection of perceptrons, each
having its own set of inputs, parameters and activation function. The output
from one layer is passed on to the next as input. This layered structure gives
rise to some fairly obvious terminology:

Figure 8.9: Neural Network

Input Layer: The input layer is basically the input data itself, reshaped to a
suitable shape.
Hidden Layer: Hidden layer is where all of the computation and learning
occurs.
Output Layer: It finally converts the output of the hidden layer to the
desired output such as {0,1} in the case of binary classification.
Loss Function:
The goal of training the neural network is to minimize the loss function. For
binary classification problems we can use the same loss function as in the case
of logistic regression, i.e., binary cross entropy. Mean Squared Error is used
for regression problems.
Backpropagation
The neural network works by minimizing the loss function. To minimize the
loss function, the gradient descent algorithm is used, which requires finding
the gradients with the help of the chain rule. Differentiating the equations
developed for the perceptron using the chain rule,
∂L/∂w = (∂L/∂y)(∂y/∂z)(∂z/∂w)
Notice that L is the loss, which is a function of y. In turn, y is a function of z,
which is a function of w. We can compute each derivative term separately:
∂L/∂y = (y − y')
∂y/∂z = g'(z)
If g is ReLU,
∂y/∂z = 1 if z > 0, and 0 if z < 0
And finally,
∂z/∂w = x
Combining,
∂L/∂w = (y − y') g'(z) x
Now, in a neural network, there are several layers of perceptrons. For the
output layer, the above formula can be modified to
∂L/∂w = (y − y') g'(z) y_h
where y_h is the output of the hidden layer and the input of the output layer.
For any other layer, the gradient depends on the next outer layer. In other
words, the gradient of a layer determines the gradient of the layer before it,
hence the name backpropagation:
∂L/∂w_h = (y − y') g'(z) (∂y_h/∂w_h)
where
y_h = g(Σ_j w_{h,j} x_{h,j})
and the subscript j denotes each perceptron in the hidden layer. Thus,
∂L/∂w_h = δ₀ g'(z_h) x_{h,j}
where δ₀ is the term that is carried over from the output layer.
Finally, all parameters are updated:
w_{i,j} = w_{i,j} − α (∂L/∂w_{i,j})
α is called the learning rate; it determines by how much the parameters are
updated. A very small learning rate makes learning slow, while a very high
learning rate may cause the model to skip the minimum loss point and diverge.
8.5.3 Deep Learning
A Neural Network can have any number of neurons (aka perceptrons) in the
hidden layer, and the hidden layer itself can be multilayered. Neural Networks
that have a lot of hidden layers are termed deep neural networks. As the layers
on a neural network increase, its capacity to learn more complex models
increases. Thus, the more data you have, the deeper neural network you need.
However, as the depth of the network increases, the computational resources
required to train the model goes up considerably. Thus it is preferable to keep
the network deep enough so that it is capable of fitting the data, and at the
same time, avoid problems like overfitting and resource wastage. The two
most basic deep learning models are Convolutional Neural Network and
Recurrent Neural Networks used in image processing and Natural Language
Processing, respectively.
Convolutional Neural Networks
Convolutional neural networks (CNN) use “Convolutional layers” along with
Dense Layers (the normal neural network layers we studied in previous
section).
Convolutional Layers: Convolutional layers use filters to scan the image. A
filter is basically an N×N array, where N is significantly smaller than the
dimension of the input image. The filter is placed at the start of the image, and
a dot product of the filter and the overlapping image patch is calculated. For
example, with a 3×3 image patch and a 3×3 filter of ones:

    [0.8  2.4  2.5]   [1  1  1]
A = [2.4  0    4.0] * [1  1  1] = (16.4)
    [1.1  2.4  0.8]   [1  1  1]

This gives us the first output value. The filter is then shifted by a number of
"strides"

Figure 8.10: An Image and a Convolution Filter.

and the dot product is evaluated again. This gives us the output of the
convolutional layer. When multiple filters are used, we can stack the output
back to back, so if we have k kernels each giving an MxM output, we would
have a kxMxM output. Convolutional Layers help the model to learn the
correlation between neighbouring pixels and identify structures and patterns
such as eyes on a face and so on.
Another important type of layer used in CNNs is the Pooling Layer, which
comes in two varieties: MaxPooling layers and AveragePooling layers.
Pooling Layers: Pooling layers work in a manner similar to the convolutional
layers. They scan the image using a small N×N window, but instead of taking
the dot product between the filter and the image, they pick out the maximum
value from the overlapping patch (max pooling) or return the average value
(average pooling).
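A small numpy sketch of one convolutional pass and one max-pooling pass,
assuming a stride of 1 and no padding (illustrative only):

    import numpy as np

    def conv2d(image, kernel):
        # Slide the kernel over the image, taking dot products (stride 1, no padding)
        n = kernel.shape[0]
        h, w = image.shape
        out = np.zeros((h - n + 1, w - n + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                out[i, j] = np.sum(image[i:i+n, j:j+n] * kernel)
        return out

    def maxpool2d(image, n=2):
        # Pick the maximum value in each non-overlapping n x n window
        h, w = image.shape
        return image[:h//n*n, :w//n*n].reshape(h//n, n, w//n, n).max(axis=(1, 3))

    img = np.arange(16, dtype=float).reshape(4, 4)
    print(conv2d(img, np.ones((3, 3))))  # 2x2 feature map
    print(maxpool2d(img))                # 2x2 pooled map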
Recurrent Neural Networks

Figure 8.11: RNN Model

Recurrent neural networks (RNN) are useful in cases where we have sequential
data, such as language models, music and so on. RNNs take a sequential input:
X = (x₀, x₁, ..., x_t). This could be a sentence, for example, "This is a book
about neural networks". Now, of course, to use a model such as an RNN there
is a need to encode this sentence into numeric values. So, each word in the
vocabulary is mapped to a number. For example,
This → x₀
is → x₁
a → x₂
book → x₃
about → x₄
neural networks → x₅

Now, the way an RNN works is by passing the value at each time-step, x_i, to a
neural network, which returns two values, y_i and a_i, as shown in Figure 8.11.
y_i is the output for that time-step, and a_i is passed over to the next time-step,
where it is taken as input along with x_{i+1} to give y_{i+1} and a_{i+1}. Thus,
an RNN utilizes the information of the previous time-step to generate the
output for the current time-step, something that is desired when sequential
information is being processed.
Let's look at the mathematical description of an RNN.
Figure 8.12: Single RNN Unit
a_i = g(W_ax x_i + W_aa a_{i−1} + b)
or, compactly written as
a_i = g(W_a [a_{i−1}, x_i] + b)
where
W_a = [W_aa, W_ax]
and
y_i = h(W_ya a_i + c)
h and g are activation functions; tanh or ReLU is usually chosen for g.
RNNs are not able to learn long-distance connections in the data. For example,
in the sentence "Sachin Tendulkar is one of the greatest cricket players, he has
over 14,000 runs", we expect the RNN to figure out that "he" refers to Sachin
Tendulkar; however, in practice, RNNs do poorly when the gap between two
time-steps increases and are unable to make the connection.
Check Your Progress 4
In this section, you studied neural networks and deep learning, now answer the
questions given in Check Your Progress-4.
Note: a) Write your answer in about 50 words
b) Check your answer with possible answers given at the end of the unit
(1) Define input layer, hidden layer and output layer for neural network.
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
(2) What advantages deep learning has over other algorithms such as Support
Vector Machine?
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
(3) What is the major problem associated with simple RNNs?
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------

8.6 MATHEMATICS FOR MACHINE LEARNING


A basic level of knowledge of linear algebra and vector calculus is needed to
understand machine learning algorithms, where the input data is always given
as a numerical vector.
8.6.1 Introduction to Linear Algebra
Vectors in physics are quantities that have a magnitude and a direction.
However, if you look closely, you'll realize they are just an ordered set of three
numbers, v = (a, b, c). The magnitude of the vector is given by
|v| = √(a² + b² + c²). The magnitude of the vector is also known as the
"norm" of the vector (more precisely, the L2 norm, as we will see). Another
operation defined between two vectors u = (a₁, b₁, c₁) and v = (a₂, b₂, c₂) is the
dot product, u·v = a₁a₂ + b₁b₂ + c₁c₂. Let's represent a vector by a column
matrix of its components.
Matrix
A matrix is a rectangular array of numbers. An m × n matrix has m rows and n
columns. For example,

    [1 2 5]
A = [3 7 9]
    [5 9 3]
    [1 4 8]

is a 4×3 matrix.
A matrix can be used to write and solve a system of linear equations in a
compact form. Consider the following system of equations:
2x₁ + 3x₂ + 4x₃ = 5
7x₁ + 1x₂ + 2x₃ = 5
12x₁ + 7x₂ + 9x₃ = 4
It can be written compactly as
[ 2  3  4][x₁]   [5]
[ 7  1  2][x₂] = [5]
[12  7  9][x₃]   [4]
Thus, any general linear system of equations can be written as Ax = b, where
A is the matrix of coefficients, x is the vector of unknowns, and b is the vector
of constant terms.
An identity matrix, akin to the number 1 in the real number system, is the
multiplicative identity for matrices:
    [1 0 0]
I = [0 1 0]
    [0 0 1]
A matrix B is the inverse of a matrix A if their product yields the identity
matrix:
AB = I,  B = A⁻¹
Transpose of a matrix
The transpose of a matrix is obtained by exchanging the rows and columns of
the matrix. For the matrix A above,

     [1 3 5 1]
Aᵀ = [2 7 9 4]
     [5 9 3 8]

Note that, because of the rules of matrix multiplication, the inverse exists only
for square matrices, i.e., matrices which have an equal number of rows and
columns.
Let's get back to vectors. We mentioned that we can represent vectors as
column matrices. The multiplication of a vector v by a 3×3 matrix M gives
another vector v'. The matrix thus transforms any given vector into some other
vector and hence is also known as a "linear transformation". For each matrix
M, in general, there are some vectors which are transformed into the same
vector scaled by a constant, mathematically Mv = λv. These vectors are called
the eigenvectors of the matrix, and the scaling constant λ is called the
eigenvalue.
Following is an example of eigenvectors and eigenvalues of a matrix.
Consider the following 2×2 matrix:
[2 1]
[1 2]
Notice what happens when the matrix is multiplied by the following vectors:
[2 1][1]     [1]
[1 2][1] = 3 [1]

[2 1][ 1]   [ 1]
[1 2][−1] = [−1]
The same vector is returned, but scaled by a constant factor. These special
vectors are the eigenvectors, and the scaling constants (here 3 and 1) are the
eigenvalues. Verify using examples that no other vectors (apart from scalar
multiples of these) obey this property for the given matrix. Also, note that
eigenvectors have to be non-zero.
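Such computations can be checked with numpy, for example:

    import numpy as np

    M = np.array([[2, 1],
                  [1, 2]])
    eigenvalues, eigenvectors = np.linalg.eig(M)
    print(eigenvalues)   # [3. 1.]
    print(eigenvectors)  # columns are the (unit-norm) eigenvectors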
Now there is nothing special about 3×3 matrices or vectors having just 3
dimensions. In general, we can define a vector to be an ordered set of N real
numbers (or even complex numbers, if you wish) and represent it by a column
matrix of N entries:
v = (v₁, v₂, v₃, ..., v_N)ᵀ
Let’s look at some of the common definitions in linear algebra that we will be
encountering throughout machine learning.
Vector Norm: A vector norm is a function that takes a vector and returns a
non-negative real number as output. A vector norm should fulfil the following
conditions:
‖v‖ > 0 if v ≠ 0, and ‖v‖ = 0 iff v = 0
‖αv‖ = |α| ‖v‖
‖u + v‖ ≤ ‖u‖ + ‖v‖
The most commonly used norm is the Euclidean norm, also known as the L2
norm. It is defined as:
‖v‖₂ = √(Σᵢ vᵢ²)
As an exercise, verify that the Euclidean norm satisfies the above-mentioned
properties.
Inner Product
The inner product of two vectors u and v is given by uᵀv, where uᵀ is the
transpose of u:
uᵀv = Σᵢ uᵢvᵢ
The inner product gives us a sense of how correlated two vectors are. The
inner product is maximum when the two vectors point in the same direction.
When the inner product between two vectors is zero, they are termed
orthogonal vectors.
8.6.2 Multivariable Calculus
Single-variable calculus deals with functions of one variable, f(x). However, in
machine learning, we are often interested in functions that depend on several
variables. For example, to predict if it will rain today, we might want to know
the temperature, pressure, wind speed, humidity and so on. Thus the
probability of it raining today is a function of all of these variables:
P(T, p, w, h)
Naturally, we want to find analogues of derivatives and integrals for such
functions. Also, we would like to see how to find minima and maxima for
these functions. Multivariable Calculus deals with these questions. We will
consider the functions of two variables x and y, as these are easy to understand
visually.
Concept of Gradient
Partial Derivative: Consider a function f(x, y). The partial derivative of f
with respect to x at a given point (x₀, y₀) is given by
(∂f/∂x)|_(x₀,y₀) = lim_{h→0} [f(x₀ + h, y₀) − f(x₀, y₀)] / h

The partial derivative measures the change in the function with respect to one
variable while keeping the other variables fixed. Let’s look at an example.
Example: Find the partial derivatives of the function f(x, y) = x² + y² at
(x, y) = (2, 3).
Solution:
∂f/∂x = 2x; at x = 2, ∂f/∂x = 4
∂f/∂y = 2y; at y = 3, ∂f/∂y = 6
Gradient
The gradient gives the direction (in the domain of the function) in which the
function is "steepest," i.e., the direction of maximum change. The magnitude
of the gradient quantifies the steepness.
∇f(x, y) = (∂f/∂x, ∂f/∂y)ᵀ
Note that the gradient of a function is a vector quantity.
If the gradient of a function is zero at a point, it may correspond to one of
three cases:
 Local Maxima: A point where the function attains a maximum value
locally, that is, the value at the point is greater than at all neighbouring
points.
 Local Minima: A point where the function attains a minimum value
locally, that is, the value at the point is less than at all neighbouring points.
 Saddle Point: A saddle point occurs when the gradient is zero but the
point is neither a maximum nor a minimum; the function appears to have a
minimum when viewed from one direction and a maximum when viewed
from another.
Check Your Progress 5
In this section, you studied Mathematics for Machine Learning, now answer
the questions given in Check Your Progress-5.
Note: a) Write your answer in about 50 words
b) Check your answer with possible answers given at the end of the unit
(1) Find the gradient of the following function:
f(x, y) = e^(x² + y²)
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
(2) Find the eigenvalues and eigenvectors of the following matrix:
[3 0]
[1 1]
(Hint:
Mv = kv
(M − kI)v = 0
|M − kI| = 0)
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
(3) Find the dot product of the following pair of vectors:
u = (2, 4, 6, 1, 8)
v = (1, 0, 3, 0, 1)
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
8.7 SOFTWARE FOR MACHINE LEARNING
Python is the most commonly used programming language for designing
machine learning algorithms. The easiest way of using Python is through
Google Colab notebooks. Colab notebooks can run Python commands
interactively. Google also gives free access to high-end GPUs, which greatly
reduces the time taken to train models, especially deep-learning models.
8.7.1 Setting Up the Environment
Google Colab
To use Google Colab you must first have a Google account. If you don't
already have one, create a Google account and set up an email address and a
password. Go to https://colab.research.google.com/ and sign in with your
Google account. Select the New Notebook option. This will open a blank
notebook.

Figure 8.13: Colab Startup Screen

The default notebook contains a code cell. You can run blocks of python code
in these cells. The other type of cell is the Text Cell, where you can enter a
description of your code or any other information that you want to.
Just to get started, make a Text cell and type "My First Colab Notebook", then
create a Code cell and type print('hello'). It should look something like
Figure 8.14.

Figure 8.14: Colab Interface

Now run each cell using Ctrl+Enter. Alternatively, use Runtime->Run All. The
text will get formatted according to Markdown Syntax. All code output will be
displayed below the cell.
You can add multiple code cells to implement different parts of a code. All
variables, functions and classes defined in one cell are accessible to other cells.
The getting started guide gives a more detailed and comprehensive
introduction.
Setting Up Locally
The easiest way to set up an environment on your own system is by using
Anaconda. Anaconda is a package manager, an environment manager and a
Python/R data science distribution. This basically means that all the common
machine learning packages, such as numpy, Tensorflow etc., are bundled with
it or can be installed through it. Following is a quick guide on how to install
Anaconda:
i. Download the Anaconda installer.
ii. Double-click the installer to launch it.
iii. Complete the setup by agreeing to the licensing terms and selecting
the location for installation.
iv. Select the options recommended by Anaconda in the advanced menu.
v. Click the Install button. If you want to watch the packages Anaconda is
installing, click Show Details.
vi. After a successful installation you will see the "Thanks for installing
Anaconda" dialog box.
Open Anaconda Navigator from the Start Menu. You can install the different
tools used in data science and machine learning from here. It is recommended
that you install Jupyter Notebook for this course. A Jupyter notebook works
almost like a Colab notebook, with only minor differences.
Figure: Anaconda Navigator
8.7.2 Numpy
Although numpy is not a machine learning library, it is extensively used to
handle data and pass inputs to other libraries/frameworks. Numpy helps
handle data that is in the form of an array. For example, an image is an array of
pixels, so you can represent an image using numpy. Operations in numpy are
highly optimized and work far faster than alternatives such as iterating
over loops. To illustrate this, let's sum an array with 10^7 elements.
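A sketch of the comparison (exact timings vary by machine; the roughly 100x
gap is the point):

    import time
    import numpy as np

    arr = np.arange(10**7)

    # Summing with an explicit Python loop
    start = time.time()
    total = 0
    for value in arr:
        total += value
    print("loop:", time.time() - start, "seconds")

    # Summing with numpy's vectorized sum
    start = time.time()
    total = np.sum(arr)
    print("numpy:", time.time() - start, "seconds")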
It takes the loop about 2 seconds to evaluate the sum, whereas numpy takes
only 0.02 seconds for the same task, making numpy 100x faster than the
alternative.
Numpy arrays are often called "ndarrays", which is short for n-dimensional
arrays. Numpy has inbuilt methods that can be utilized to perform almost all
linear algebra tasks. The simplest way to create a numpy array is by passing a
list to the np.array() method. It returns an ndarray having the same values as
the list, but a fixed data type.
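A one-line sketch that produces the output shown below:

    import numpy as np

    np.array([1, 2, 3, 4, 5, 6, 7, 8])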

Output: array([1, 2, 3, 4, 5, 6, 7, 8])


Concatenate function: It concatenates two arrays, that is, simply joins them
back-to-back. We have already seen how np.sum() can be used; similarly,
np.average(), np.std() etc. can be used to find the average, standard deviation
and so on.
We can use the reshape method to reshape an array into a different shape. For
example, in the following sketch the arange method creates an array of shape
(6,), and reshaping it to (2, 3) changes the shape of the array; its transpose
swaps the two axes.
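A sketch producing the outputs shown below:

    import numpy as np

    a = np.arange(6).reshape(2, 3)   # array of shape (6,) reshaped to (2, 3)
    a
    a.T                              # transpose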

Output: array([[0, 1, 2],
               [3, 4, 5]])
And for the transpose:
Output: array([[0, 3],
               [1, 4],
               [2, 5]])
We can also do the reverse operation. Given an ndarray of shape (3, 4), we can
flatten it out by using the flatten method.
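A sketch producing the output shown below:

    import numpy as np

    np.arange(1, 13).reshape(3, 4).flatten()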
Output: array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
8.7.3 Scikit-Learn
Scikit-Learn is a complete machine learning package that has most of the
machine learning techniques implemented in a highly optimized fashion. Let's
look at a few examples from the Scikit-Learn documentation to understand it
better.
First, we need to install Scikit-Learn, which can be done easily using pip:
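A typical install command (prefixed with ! when run inside a notebook cell):

    !pip install scikit-learn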

Let’s look at some of the features of scikit-learn.
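An illustrative sketch (not from the unit) using the built-in iris dataset, which
shows the fit/predict/score design shared by most scikit-learn estimators:

    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    # Load a small built-in dataset and split it for training and testing
    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Fit a classifier and measure its accuracy on unseen data
    clf = SVC(kernel="rbf")
    clf.fit(X_train, y_train)
    print(clf.score(X_test, y_test))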

Clearly, scikit-learn offers a wide range of features; however, most of its
estimators share a similar design and are well documented.
8.7.4 Tensorflow and Keras
Tensorflow is a deep learning library developed by Google. Keras is built on
top of the Tensorflow backend and is a fast way of creating deep learning
models. Let's build a small neural network to classify handwritten digits. For
this classification, we will use the MNIST dataset, which has 60,000 training
images and an additional 10,000 images for testing.
Figure 8.15: Samples from MNIST Dataset
Fortunately, the MNIST dataset is already available with Keras, so no
additional downloads are required. The sketch below shows the typical
workflow for a neural network in Keras.
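A minimal sketch of this workflow, assuming the tensorflow.keras API; the
layer sizes and the dropout rate are illustrative choices, not prescribed by the
unit:

    from tensorflow import keras
    from tensorflow.keras import layers

    # 1. Load the data (bundled with Keras) and normalize pixel values
    (x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
    x_train = x_train.astype("float32") / 255.0
    x_test = x_test.astype("float32") / 255.0

    # 2. One-hot encode the integer labels 0..9 (categorical labels)
    y_train = keras.utils.to_categorical(y_train, 10)
    y_test = keras.utils.to_categorical(y_test, 10)

    # 3. Define the model: flatten, one hidden layer, dropout, softmax output
    model = keras.Sequential([
        keras.Input(shape=(28, 28)),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.2),
        layers.Dense(10, activation="softmax"),
    ])

    # 4. Configure the optimizer, loss and validation metric
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])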
Note that we changed the labels to categorical labels. What this means is that
each number from 0 to 9 is represented by a vector of length 10 with a 1 at the
corresponding position and 0 everywhere else:
0 → (1, 0, 0, 0, 0, 0, 0, 0, 0, 0)ᵀ
1 → (0, 1, 0, 0, 0, 0, 0, 0, 0, 0)ᵀ
...
9 → (0, 0, 0, 0, 0, 0, 0, 0, 0, 1)ᵀ
This is done because we use a SoftMax activation function with the last layer,
which returns output in this format. This is known as one-hot encoding.
The DropOut layer is a special layer. It turns off some neurons during the
training, i.e. model behaves as if they were not there for that epoch.
When the model completes training over the entire dataset once, we say the
model has trained for one epoch. The model is trained several times on the
same data, i.e. through several epochs to improve the quality.
Now, we will train the model, make some predictions and measure the
accuracy of the model.
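Continuing the sketch above: the fit() call trains the model, and printing the
history keys yields the output shown below (the epoch count is an illustrative
choice):

    history = model.fit(x_train, y_train, batch_size=128, epochs=10,
                        validation_split=0.1)
    print(history.history.keys())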

Output: dict_keys(['loss', 'accuracy', 'val_loss', 'val_accuracy'])

Figure 8.16: Model Accuracy at Each Epoch

The fit() method returns a history object which stores the values of different
parameters, such as accuracy, validation accuracy and loss, for each epoch.
The history member inside the history object is a dictionary that stores the
values of these parameters for each epoch.
The compile method configures the model: it takes as input the optimizer we
want to use to minimize the loss function, the loss function itself, and the
metric for validation.
Finally, we test the model on our test data and see how it performs. The
evaluate() method returns the value of the loss and of the chosen metric (in
our case, accuracy). The evaluate function gives the results for the entire
dataset. If we wish to make an individual prediction on a test sample, we can
use the predict() method.
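A short sketch of both calls, continuing the example above:

    loss, accuracy = model.evaluate(x_test, y_test)
    print("test accuracy:", accuracy)

    # predict() returns one probability vector per sample; argmax picks the digit
    probs = model.predict(x_test[:1])
    print("predicted digit:", probs[0].argmax())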
Check Your Progress 6
In this section, you studied Software for Machine Learning; now answer the
questions given in Check Your Progress-6.
Note: a) Write your answer in about 50 words
b) Check your answer with possible answers given at the end of the unit
(1) What will be the output for the following code?
print(np.sum([1,2,3,4]))
print(np.concatenate([[1,2,3,4],[4,3,2,1]]))
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
(2) Identify and describe the layers used in the following Keras model.
model = keras.Sequential(
[
keras.Input(shape=input_shape),
layers.Conv2D(32, kernel_size=(3, 3), activation="relu"),
layers.MaxPooling2D(pool_size=(2, 2)),
layers.Conv2D(64, kernel_size=(3, 3), activation="relu"),
layers.MaxPooling2D(pool_size=(2, 2)),
layers.Flatten(),
layers.Dropout(0.5),
layers.Dense(num_classes, activation="softmax"),
]
)
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
(3) What is one-hot encoding?
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------

8.8 SUMMARY
This unit dealt with the fundamentals of machine learning and discussed how a
machine learns. Machine learning algorithms were explained and linked to the
various machine learning categories. Neural networks and deep learning were
described to convey the power of machine learning. Mathematics and machine
learning software are crucial for appreciating machine learning algorithms;
these have been introduced at appropriate places in this unit and can be used
easily by learners, either online using Google Colab or by installing the
Anaconda distribution on your computer/laptop.

8.9 KEYWORDS
Activation Function: The activation function is a mathematical function that
lets transform the outputs to a desired non-linear format before it is sent to the
next layer. It maps the summation result to a desired range.
Agent: An agent is the entity which performs certain actions by interacting
with the environment and has a goal of maximizing its rewards.
Algorithm: An algorithm is a fixed set of computational rules.
Conventional algorithm: Algorithms based on explicit instructions for how to
perform a task.
Machine Learning algorithm: Algorithms based on an implicit set of rules
derived from data, not on explicit instructions for how to perform a task.
Anaconda: The most popular data science platform for Python
Backpropagation: Backpropagation is the method for performing gradient
descent in artificial neural networks. It allows us to compute the derivative of
a loss function with respect to every free parameter (i.e. weight and bias) in the
network. It does so layer by layer.
Classifier: A classifier is a type of machine learning algorithm used to assign a
class label to a data input.
Convolutional Neural Networks (CNN): CNN is one of the most popular
deep neural network algorithms. It is mostly used in visual recognition tasks.
It takes an image as input and learns features from the different parts of the
image.
Environment: Think of the environment as a board game in which landing on
a particular square gives a reward or penalty. Thus the environment is a
collection of entities with which an agent can interact
Gradient descent: Gradient descent is an optimization algorithm which is
commonly used to train machine learning models and neural networks.
K-means clustering: K-means clustering is an unsupervised learning
algorithm, which attempts to partition the set of given input data into k
partitions by minimizing the variance for each partition.
Keras: Keras is a high-level, open-source neural network library written in
Python.
Kernel: In machine learning, a kernel is the measure of resemblance where a
kernel function defines the distribution of similarity of points around a given
point.
Kernel-trick: The kernel-trick is a method that allows one to use a linear
classifier to solve a non-linear problem.
Labelled Database: Labelled data is data that comes with a name or type. In
other words, a dataset that contains the data as well as its labels (name/type)
MNIST dataset: The MNIST (Modified National Institute of Standards and
Technology) dataset is a large collection of handwritten digits.
Reward: Reward is a numerical incentive given to the agent for performing a
certain task.
Recurrent Neural Networks (RNN): A recurrent neural network (RNN) is a
type of artificial neural network which uses sequential data or time-series data.
These deep learning algorithms are commonly used in language translation,
natural language processing (NLP), speech recognition, and image captioning;
they are incorporated into popular applications such as Siri, voice search, and
Google Translate.
ReLU (Rectified Linear Unit): ReLU is an activation function used in deep
neural networks. The ReLU function is defined as max(0, x).
Scikit-Learn: Scikit-Learn is a complete machine learning package that has
most of the machine learning techniques implemented in a highly optimized
fashion.
Support Vector Machine (SVM): SVM is a machine learning algorithm
popular for regression and classification.
Tanh: Tanh is an activation function in a neural network.

8.10 CHECK YOUR PROGRESS – POSSIBLE ANSWERS
Check Your Progress 1 – Possible Answers


1) Briefly explain the idea of Machine Learning.
The concept of machine learning is that machines (computers) can learn
without being explicitly programmed. Without explicitly programmed means
without the use of direct programming commands. Here machine learning
refers to self-learning by machine (computer).
2) Mention currently used applications based on Machine Learning.
Machine Learning is an emerging field of computer science having wide
applications in Search engines, Recommendation systems, Spam filters etc.
Check Your Progress 2 – Possible Answers
1) Discuss different types of Machine Learning.

There are three categories of Machine Learning. These are: Supervised
Learning, Unsupervised Learning and Reinforcement Learning. Supervised
learning uses labelled datasets. Labelled data is data that comes with a name or
type. Unsupervised learning involves finding a pattern in data. Thus
unsupervised learning segregates data in clusters or groups. These clusters or
groups are unlabeled. Reinforcement learning works on the principle of reward
and punishment.

2) How is Reinforcement Learning different from Supervised Learning?


Reinforcement Learning models differ from Supervised and Unsupervised
Learning in the sense that there is no data from which patterns have to be
recognized or values to be predicted. RL models are very useful in industrial
applications where the model can learn to perform complicated tasks and
optimize the process. Other applications include self-driving cars and so on.
3) What is a Labelled Dataset?
A dataset that has data as well as label is known as labelled data. In other
words, Labelled data is data that comes with a name or type. Some of the
popular labelled datasets are: MNIST dataset.
Check Your Progress 3 – Possible Answers
1) Briefly describe the iterative K-means clustering algorithm.

K-means clustering is an unsupervised learning algorithm, which attempts to
partition the set of given input data into k partitions by minimizing the
variance of each partition. Iteratively: start with arbitrary centroids, allocate
each point to its nearest centroid, update each centroid to the mean of the
points allocated to it, and repeat until the allocations stop changing.

2) Describe different loss functions in linear and logistic regression.
Binary cross-entropy loss (used in logistic regression):
L = − Σ_{i=1..n} [ y_i log(y'_i) + (1 − y_i) log(1 − y'_i) ]
Mean squared error (used in linear regression):
L = Σ_{i=1..n} (y'_i − y_i)²
3) Describe how embedding data in higher dimensional space is useful for
SVM?
Embedding the data into a higher dimensional space makes it linearly
separable. i.e. an SVM can find an optimal hyperplane.
Check Your Progress 4 – Possible Answers
1) Define input layer, hidden layer and output layer for a neural network.
1. Input Layer: The input layer is basically the input data itself, reshaped
to a suitable shape.
2. Hidden Layer: The hidden layer is where all of the computation and
learning occurs.
3. Output Layer: It finally converts the output of the hidden layer to the
desired output, such as {0,1} in the case of binary classification.

2) What advantages deep learning has over other algorithms such as Support
Vector Machine?

As the size of data scales up, traditional machine learning algorithms (like
Support Vector Machines) stop improving in performance. Deep learning
models, however, continue to scale up in performance with the data, making
them very relevant in the current scenario, when a humongous volume of data
is generated on a daily basis (think Facebook, YouTube, satellite data).
3) What is the major problem associated with simple RNNs?

It turns out very simple RNNs are not able to learn long-distance connections
in the data. For example, in the sentence "Sachin Tendulkar is one of the
greatest cricket players, he has over 14,000 runs", we expect the RNN to
figure out that "he" refers to Sachin Tendulkar; however, in practice, RNNs do
poorly when the gap between two time-steps increases and are unable to make
the connection.
Check Your Progress 5 – Possible Answers
1) Find the gradient of the function f(x, y) = e^(x² + y²).
∂f/∂x = 2x e^(x² + y²)
∂f/∂y = 2y e^(x² + y²)
2) Find the eigenvalues and eigenvectors of the matrix [3 0; 1 1].
|M − kI| = 0
|(3 − k)    0   |
|   1    (1 − k)| = 0
(3 − k)(1 − k) = 0
Eigenvalues: k = 1, k = 3.
For k = 3, solving
[3 − 3    0  ][x]
[  1    1 − 3][y] = 0
gives x = 2, y = 1, i.e., the eigenvector (2, 1)ᵀ.
Similarly, solve for k = 1 (which gives the eigenvector (0, 1)ᵀ).

3) Find the dot product of the following pair of vectors:
u = (2, 4, 6, 1, 8)
v = (1, 0, 3, 0, 1)
Ans. u·v = 2×1 + 4×0 + 6×3 + 1×0 + 8×1 = 28
Check Your Progress 6 – Possible Answers
1) What will be the output for the following code?
10
[1 2 3 4 4 3 2 1]

2) Identify and describe the layers used in the following Keras model.
Convolution Layer (Conv2D): Uses a filter to generate an output
corresponding to the dot product of the filter and the overlapping image patch.
MaxPooling Layer (MaxPooling2D): Uses a window to select the maximum
value from the image at each window position.
Flatten Layer: Reshapes the data into a 1-D vector for compatibility with the
dense layer.
Dropout Layer: Turns off some neurons for the epoch.
Dense Layer: The simple, fully connected neural network layer; here it
produces the class probabilities via the softmax activation.
3) What is one-hot encoding?
Note that we changed the labels to categorical labels. What this means is
that each number from 0 to 9 is represented by a 10-dimensional vector with
a 1 at the corresponding position and 0 everywhere else:

0 → (1, 0, 0, 0, 0, 0, 0, 0, 0, 0)
1 → (0, 1, 0, 0, 0, 0, 0, 0, 0, 0)
…
9 → (0, 0, 0, 0, 0, 0, 0, 0, 0, 1)

This is done because we use a SoftMax activation function with the last
layer, which returns output in this format. This is known as one-hot
encoding.
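In Keras this conversion is a one-liner; a minimal sketch (the sample labels are illustrative):

import numpy as np
from tensorflow.keras.utils import to_categorical

labels = np.array([0, 1, 9])
one_hot = to_categorical(labels, num_classes=10)
print(one_hot)
# [[1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
#  [0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
#  [0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]]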

8.11 REFERENCES AND SELECTED READINGS


1. Oliver Theobald, “Machine Learning for Absolute Beginners: A Plain
English Introduction”, Independently published, 2021
2. Saiket Dutt, Subramanian Chandramouli and Amit Kumar Das,
“Machine Learning”, Pearson, 2018
3. Sridhar, M. Vijaylakshmi, “Machine Learning”, Oxford University
Press, 2021

4. https://web.mit.edu/6.034/wwwbob/svm.pdf
5. https://ml-cheatsheet.readthedocs.io/en/latest/backpropagation.html
6. https://users.math.msu.edu/users/gnagy/teaching/11-fall/mth234/L19-
234-th.pdf
7. https://colah.github.io/posts/2015-08-Understanding-LSTMs/
8. https://cs231n.github.io/convolutional-networks/
9. https://keras.io/examples/vision/mnist_convnet/
UNIT 9 AI AND MACHINE LEARNING FOR
SMART CITIES
Structure
9.1 Introduction
9.2 Healthcare
9.3 Education
9.4 Mobility and Transportation
9.5 Energy Sector
9.6 Environment and Economy
9.7 AI and ML Challenges
9.8 Summary
9.9 Keywords
9.10 Check Your Progress – Possible Answers
9.11 References and Selected Readings

9.1 INTRODUCTION
Key components of smart cities are: smart people, smart transportation, smart
living, smart environment, smart economy and smart governance. These
components get manifested in education, mobility and transportation,
healthcare, energy, environment and economy. AI in education can aid in
offering personalized, flexible and inclusive learning. AI can play a significant
role in developing a more efficient intelligent transport system in smart cities.
With the coming of IoT devices, advanced sensors and an increase in data rates,
AI and ML have seen extensive use in the healthcare sector. The very objective
of a smart environment is sustainability. Sustainability signifies the balance
between city and environment. Smart cities should be developed to utilize
natural resources in a sustainable way. Air quality, water quality, waste
management and building management are the attributes of a smart
environment. Smart cities are flooded with real-time data collected from
various sources. Analysis of this vast amount of data in full is nearly
impossible without the use of machine learning tools. AI and ML have
penetrated every aspect of smart cities. When it comes to the diagnosis of
various diseases, Artificial Intelligence (AI) and Machine Learning (ML)
algorithms play a key role. Deep Learning, a subfield of machine learning,
offers models like Convolutional Neural Networks (CNN) that can be applied
in cancer imaging to assist pathologists in detecting and classifying the disease
at earlier stages. The Support Vector Machine (SVM) can help in the diagnosis
of heart disease. The main algorithm used for autonomous vehicles, also called
self-driving cars, is the Convolutional Neural Network (CNN). The long
short-term memory (LSTM) model, a common model in the field of deep
learning, can be effectively applied to power prediction for wind and
photovoltaic generation. K-nearest neighbour, Random Forest and Support
Vector Machine can be used for classification of pollution data to estimate
pollution levels. With these ubiquitous influences of AI and ML on every
component of smart cities, there are also many challenges that need to be
addressed in the coming days.
Objectives:
In this unit, you will learn the applications of AI and Machine Learning for
smart cities. After reading this unit, you will be able to:
 Appreciate applications of AI and ML for healthcare
 Understand ML algorithm in education
 Identify the role of AI and ML in mobility and transportation
 Understand the concept of AI and ML in the energy sector
 Identify various applications of AI and ML in the environment and
economy
 Discuss challenges of AI and ML in key components of smart cities

9.2 HEALTHCARE
With the coming of IoT devices, advanced sensors and an increase in data rates,
AI and ML have seen extensive use in the healthcare sector. AI and ML are
playing a pivotal role in disease diagnosing, cure prediction, and medical
imaging.
9.2.1 Predictive Medicine
Predictive medicine is a branch of medicine that aims to identify patients at
risk of developing a disease, thereby enabling either prevention or early
treatment of that disease. As AI can find meaningful relationships in raw data,
it can support diagnosis, treatment and predictive outcomes in many medical
situations. The application of AI will help medical professionals to incorporate
proactive management of a disease that is likely to develop. Further AI can
help in predictions of disease by identifying risk factors of a patient and thus
earlier healthcare intervention is possible.
Approaches to predicting cardiovascular risk without AI and ML fail to
identify many people who would benefit from preventive treatment, while
others receive unnecessary interventions. ML offers improved accuracy in
predicting cardiovascular risk, and can be used to substantially improve the
accuracy of predicting cancer susceptibility, recurrence and mortality. Deep
Learning technologies like Convolutional Neural Networks (CNN) can be
applied in cancer imaging to assist pathologists in detecting and classifying the
disease at earlier stages, thus improving the patient's chances of survival,
especially for lung, breast and thyroid cancer.
9.2.2 AI in Clinical Trials
Clinical trials are a type of research that studies new tests and treatments and
evaluates their effects on human health outcomes. People volunteer to take part
in clinical trials to test medical interventions including drugs, cells and other
biological products, surgical procedures, radiological procedures, devices,
behavioural treatments and preventive care. Clinical trials take a lot of time
and money, and the success rate is very low. AI can help in eliminating
time-consuming data-monitoring procedures. There are four phases of clinical
trials, and every phase takes a considerable amount of time and money. AI has
the potential to reduce clinical trial cycle duration.
9.2.3 ML in Medical Image Analysis
The objective of medical image analysis is to assist clinicians and radiologists
in the efficient diagnosis and prognosis of diseases. Deep Learning, a subfield
of ML, is used for the automatic extraction of information from medical
images such as magnetic resonance imaging (MRI), X-ray, computed
tomography (CT) and ultrasound. Important tasks under medical image
analysis are detection, classification and segmentation. The identification of
specific abnormalities, like tumours and cancers, in medical images is called
detection. CNNs give high performance in medical image detection and
classification tasks as compared to other conventional techniques, and CNNs
and RNNs are widely used for the segmentation task.
The architecture of ML in medical image analysis is shown in Figure 9.1.
Deep learning automatically extracts imaging features from medical images;
predictive modelling for tasks such as detection, classification and
segmentation is then done based on the extracted imaging information.

[Pipeline: Medical Images → Feature Extraction (using CNN) → Predictive Modelling (ML model) → Prediction]

Figure 9.1: Architecture of ML in Medical Image Analysis
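The pipeline of Figure 9.1 can be sketched as a small Keras model in which convolutional layers perform the feature extraction and a dense head performs the prediction. The input shape (a 128×128 grayscale slice) and the binary output are assumptions made purely for illustration:

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(128, 128, 1)),           # e.g. one grayscale scan slice
    layers.Conv2D(16, 3, activation="relu"),    # feature extraction
    layers.MaxPooling2D(2),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(2),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),        # predictive modelling
    layers.Dense(1, activation="sigmoid"),      # prediction, e.g. tumour / no tumour
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()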

9.2.4 ML Algorithms Used for Diagnosis of Various Diseases


Table 9.1 summarizes different ML techniques that can be used for the
diagnosis of various diseases.
Table 9.1: ML Algorithms for Diagnosis of Various Diseases
Disease                          ML Algorithm
Detection of heart disease       Support Vector Machine (SVM), Naïve Bayes
Detection of breast cancer       Support Vector Machine (SVM), Naïve Bayes, K-nearest neighbour
Diagnosis of thyroid disorder    SVM, Decision Tree
Diabetic disease                 SVM
Source: adapted from various sources

Check Your Progress 1


In this section, you studied AI and ML for healthcare, now answer the
questions given in Check Your Progress-1.
Note: a) Write your answer in about 50 words
b) Check your answer with possible answers given at the end of the unit
(1) Discuss AI in clinical trials.
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
(2) Discuss predictive medicine.
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
(3) Discuss ML algorithms for the diagnosis of various diseases.
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
9.3 EDUCATION
AI has influenced many fields, and education is no exception: AI will shape
both learning and teaching. What is AI in education, and what can AI deliver?
AI in education can aid in offering personalized, flexible and inclusive
learning. AI and ML can be applied in the following areas of education.
 Content Creation
 Intelligent Tutoring Systems
 Virtual Facilitators
 Automatic Grading
 Interaction data for Learning
 Adaptive Learning
 Intermediate Interval Education
 Personalized Learning
9.3.1 Adaptive Learning
Adaptive learning systems use artificial intelligence and machine learning to
personalize the learning experience. Steps involved in adaptive learning
systems using AI have been shown in Figure 9.2. The architecture of adaptive
learning systems is based upon these four tasks.
 Capturing data from the learner
 Using captured data to assess learner’s progress
 Recommending learning activities
 Providing tailored feedback
AI in education uses predominantly three models:
i. Pedagogical model
ii. Domain model
iii. Learner model
Approaches to teaching are represented under pedagogical model. Knowledge
of the subject being learnt falls under domain model. Knowledge of the student
comes under learner model. The adaptive learning systems algorithm takes
decision by referring to these three models.
[Cycle: Data Gathering (of learner's interactions) → Data Analysis (using AI and ML) → Feedback → Adaptive Content (as per the individual learner's needs)]

Figure 9.2: Adaptive Learning Systems Using AI


9.3.2 Intelligent Tutoring Systems
Intelligent Tutoring Systems are computer programs that use Artificial
Intelligence components. The educational software involving AI was originally
called ‘intelligent’ computer-assisted instruction. The critical distinction
between traditional computer-assisted instruction and intelligent tutoring
systems is that the former involves static systems that incorporate the decisions
of expert teachers, whereas intelligent tutoring systems contain the expertise
itself and can use it as a basis for making decisions about instructional
interventions. Intelligent tutoring systems must contain the following three
components.
 Knowledge of the learner (learner model/student model)
 Knowledge of the domain (domain model/expert model)
 Knowledge of teaching strategies (pedagogical model)
The ability to diagnose students' errors and adapt instruction based on the
diagnosis represents a key difference between intelligent and traditional
computer-assisted instruction.
A student learns from an intelligent tutoring system primarily by solving tailor-
made problems that serve as learning experiences for that student. The whole
cycle of intelligent tutoring systems has been delineated here.
 The intelligent tutoring system may start by assessing what the student
already knows (student model).
 The system then must consider what the student needs to know
(domain model/expert model)
 Finally, the system must decide what unit of content (e.g., assessment
task or instructional element) ought to be presented next, and how it
should be presented (pedagogical model).
Based on the above three models, the system generates a problem and then
works out a solution to it. The intelligent tutoring system compares its solution
to the one prepared by the student, and performs a diagnosis based on the
differences between the two. Feedback is offered by the system based on this
diagnosis. After this, the program updates the student model, and the entire
cycle is repeated, beginning with generating a new problem.
Check Your Progress 2
In this section, you studied AI and ML for education, now answer the
questions given in Check Your Progress-2.
Note: a) Write your answer in about 50 words
b) Check your answer with possible answers given at the end of the unit
(1) Discuss the architecture of adaptive learning.
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
(2) What is intelligent tutoring systems?
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
(3) Explain the components of intelligent tutoring systems.
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------

9.4 MOBILITY AND TRANSPORTATION


With the increase in the number of vehicles and other transportation systems,
cities are witnessing traffic congestion and road accidents. Intelligent
transportation systems can address the issues of traffic congestion and road
accidents. AI and ML approaches can help in predicting real-time traffic flow,
and ML algorithms can help in providing real-time reports on traffic accidents,
thus ensuring efficient and safe transportation. AI-based techniques can also
be used in smart parking management.
AI and ML can be applied in the following areas of mobility and transportation.
 Unmanned Aerial Vehicles (Drones)
 Autonomous Vehicles
 Public Transport Planning
 Traffic Information and Control
 Smart Parking System
 Detection of potholes and bumps on the road
 On-demand Transportation
9.4.1 Autonomous Vehicles
Autonomous vehicle technology has the potential to revolutionize the mobility
landscape in the near future. Autonomous vehicles or self-driving cars use AI
and Machine Learning algorithms. The main algorithm used for self-driving
cars is Convolutional Neural Network (CNN). CNN is one of the most popular
deep neural network algorithms. It is mostly used in the visual recognition task.
It takes the image as an input and learns the features from the different parts of
the image. Machine Learning algorithms like K-means clustering and Support
Vector Machine (SVM) are also used in autonomous vehicles.
9.4.2 Unmanned Aerial Vehicles (Drones)
Unmanned Aerial Vehicles or drones can provide many applications for smart
cities. A few applications of unmanned aerial vehicles include traffic
management, pollution monitoring, environmental monitoring and civil
security control. The machine learning algorithm commonly used for these
applications is the convolutional neural network (CNN).
9.4.3 Detection of Potholes and Bumps on Roads
Potholes and bumps on roads can be detected by using smart mobile devices
equipped with GPS and accelerometers. The data of acceleration can then be
used to analyze the road condition for future use. Supervised and unsupervised
machine learning can be used to detect potholes and bumps. One ML algorithm
that can be used for the detection of potholes is K-means clustering, an
unsupervised learning algorithm; a small illustrative sketch follows.
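The sketch below illustrates the K-means idea on synthetic accelerometer data: windows of vertical-acceleration readings are summarized by simple statistics and clustered, and the high-variance cluster is flagged as candidate potholes or bumps. The feature choices and synthetic data are assumptions for illustration only:

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
smooth = rng.normal(0.0, 0.05, size=(80, 50))   # windows from smooth road
rough = rng.normal(0.0, 0.40, size=(20, 50))    # windows containing potholes/bumps
windows = np.vstack([smooth, rough])

# Summarize each window by simple statistics of the acceleration signal
features = np.column_stack([
    windows.mean(axis=1),
    windows.std(axis=1),
    np.ptp(windows, axis=1),    # peak-to-peak amplitude
])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(features)
# The cluster with the larger average standard deviation is the "rough" one
rough_cluster = np.argmax([features[kmeans.labels_ == k, 1].mean() for k in (0, 1)])
print("windows flagged as rough road:", np.sum(kmeans.labels_ == rough_cluster))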
9.4.4 Smart Parking System
A smart parking system is required for smart cities in view of an increased
number of vehicles. The issue of finding parking spaces without spending a lot
of time needs to be addressed.
A smart parking system uses AI technology to detect available parking spaces.
AI algorithm called Genetic Algorithm (GA) can be used for solving
transportation problems such as optimum utilization of space for smart parking
systems.
An ML-based smart parking system analyses the parking lot data to extract the
status of the parking lot. Furthermore, ML and AI technology can predict the
parking lot occupancy status of the upcoming days, weeks, or even months.
ML-based systems can monitor traffic congestion of particular roads and offer
a smart solution to smart parking spaces.
Deep Learning algorithms can be used to predict parking lot occupancy.
Neural Network is used for license plate recognition using real-time video data.
CNN and machine vision are implemented to detect parking lot occupancy
status.
9.4.5 Intelligent Transportation Systems and ML Algorithms
Different ML techniques that can be used in mobility and transportation
applications is as shown in Table 9.2.
Table 9.2: ML Techniques in Transportation Applications
Transportation Application      ML Algorithms
Traffic flow conditions         SVM, K-nearest neighbour, LSTM
Smart parking system            Genetic Algorithm
Autonomous vehicles             CNN, K-nearest neighbour
Accident detection              SVM, K-nearest neighbour
Potholes and bump detection     SVM, Naïve Bayes, K-means clustering
Source: adapted from various sources

Check Your Progress 3


In this section, you studied AI and ML for mobility and transportation, now
answer the questions given in Check Your Progress-3.
Note: a) Write your answer in about 50 words
b) Check your answer with possible answers given at the end of the unit
(1) Discuss AI algorithms used in autonomous vehicles.
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
(2) Discuss unmanned aerial vehicles.
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
(3) Discuss the smart parking system and the role of AI.
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------

9.5 ENERGY SECTOR


AI and ML can make the smart grid capable of making intelligent decisions
and of responding to the intermittent nature of renewable energy sources
(RES), sudden changes in customers' energy demands, and power outages.
Supervised learning helps in forecasting the future energy demand of
customers through their energy consumption patterns obtained from smart
meter data. Reinforcement learning helps in making optimal decisions in
energy markets so that the microgrid can maximize its cumulative reward.
9.5.1 Power Generation Forecast of Renewable Energy
The energy mix of a country is witnessing significant penetration of variable
renewable energy sources like wind and solar PV. The intermittent nature of
availability of these variable renewable energy resources (VRES) may affect
the stability of the grid. The accurate prediction of these VRES is very much
required for the stable, efficient and economical operation of the power system.
The long short-term memory (LSTM) model can be effectively applied to
power prediction for wind and photovoltaic generation. The LSTM network is
a common model in the field of deep learning, an improved model based on
the recurrent neural network (RNN). The characteristic of the LSTM network
is the use of memory modules instead of common hidden nodes to ensure that
the gradient does not vanish or explode after passing through many time steps,
thereby overcoming some difficulties encountered in traditional RNN training.
LSTM is suitable for processing and predicting important events with
relatively long intervals and delays in time series.
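A minimal Keras sketch of such a forecast is given below: an LSTM predicts the next value of a generation series from the previous 24 time steps. The window length, layer size and the synthetic sinusoidal series (standing in for real wind or PV data) are all illustrative assumptions:

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Synthetic "daily" generation profile standing in for real wind/PV data
t = np.arange(2000)
series = np.sin(2 * np.pi * t / 24) + 0.1 * np.random.randn(2000)

window = 24
X = np.array([series[i:i + window] for i in range(len(series) - window)])
y = series[window:]
X = X[..., np.newaxis]    # shape: (samples, timesteps, features)

model = keras.Sequential([
    keras.Input(shape=(window, 1)),
    layers.LSTM(32),      # memory modules in place of plain hidden nodes
    layers.Dense(1),      # next-step power prediction
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=2, batch_size=32, verbose=0)
print("next-step forecast:", model.predict(X[-1:], verbose=0)[0, 0])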
9.5.2 Home Energy Management System
A home energy management system (HEMS) is a combination of hardware
and software program that allows the end-users (prosumers/consumers) to
monitor their energy consumption and generation and to manage the energy
inside a home. The core component of HEMS is the smart controller, which
takes care of logging, monitoring and control. The smart controller captures
real-time electricity usage data from schedulable and non-schedulable
appliances and implements optimal demand management strategies. The smart
meter is an important component in HEMS as it records energy consumption
and generation; it also enables smart billing solutions. Other vital components
in HEMS are renewable energy sources (such as wind and solar PV) and
energy storage systems (such as battery storage). The architecture of HEMS is
shown in Figure 9.3.
Artificial Neural Network (ANN) and optimization algorithms are embedded
in home energy management controllers (smart controllers) to integrate the
battery storage and renewable energy sources with the grid to reduce the
energy cost for the prosumers/smart consumers. ANN is the most commonly
used AI techniques in HEM schedulers. An ANN-based residential thermal
control strategy can be developed for a home to create a more comfortable
thermal environment. A hybrid approach of Genetic Algorithm (GA) and ANN
algorithms can be developed for weekly appliance scheduling to optimize
electricity consumption in a smart home with renewable sources to maintain
energy demand during peak hours.
[Components: Renewable Energy Sources and the Utility Grid supply the home through the Smart Meter; the Smart Controller, together with an Energy Storage System (battery), manages Non-Schedulable Appliances (such as television, microwave) and Schedulable Appliances (such as air conditioning, washing machine)]

Figure 9.3: Architecture of HEMS
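To convey the smart controller's scheduling task, here is a toy sketch that shifts one schedulable appliance to the cheapest contiguous window of a day-ahead tariff. Real HEMS controllers use ANN- or GA-based optimization; this greedy search, and the tariff values in it, are illustrative assumptions only:

import numpy as np

# Assumed day-ahead tariff (price per kWh for hours 0-23)
tariff = np.array([4, 4, 3, 3, 3, 4, 6, 8, 9, 8, 7, 6,
                   6, 6, 7, 8, 9, 10, 9, 8, 7, 6, 5, 4])
duration = 2     # hours the appliance must run contiguously
load_kw = 1.5    # appliance power draw in kW

# Cost of starting the appliance at each feasible hour
costs = [tariff[h:h + duration].sum() * load_kw for h in range(24 - duration + 1)]
best_start = int(np.argmin(costs))
print(f"cheapest start hour: {best_start}:00, cost: {costs[best_start]:.2f}")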

9.5.3 Analysis of Consumer Electricity Consuming Behaviour


The power consumption behaviour of consumers and abnormal power
consumption by users can be analysed using machine learning. The clustering
and identification abilities of machine learning can be used in analysing the
power consumption behaviour of consumers. Based on the power data
measured by a smart meter, AI clustering can be used for the segmentation of
consumers based upon the consumption behaviour characteristics of different
groups.
Check Your Progress 4
In this section, you studied AI and ML for the energy sector, now answer the
questions given in Check Your Progress-4.
Note: a) Write your answer in about 50 words
b) Check your answer with possible answers given at the end of the unit
(1) Enumerate various applications of AI and ML in smart grid.
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
(2) Discuss Home Energy Management System.
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
(3) Discuss the power generation forecast of renewable energy.
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------

9.6 ENVIRONMENT AND ECONOMY


Air quality, green and water spaces, emission monitoring, waste management,
energy efficiency and monitoring of city trees are the hallmark of a smart
environment. Air pollution is threatening many cities. Hence prediction and
monitoring of air quality based upon data collected through IoT devices and
analysing this voluminous data using AI methods are very much required.
Water management and waste management are other significant issues with
regard to smart cities.
A smart economy is characterized by innovation, entrepreneurship, efficiency,
labour market flexibility, and the ability to transform. A smart economy is
flexible and able to compete globally.
AI-driven smart environment and economy have been further explained here.
9.6.1 AI-Driven Smart Economy
Following are the characteristics of a smart economy.
 Entrepreneurial opportunities
 Diverse economic opportunities
 Sustainable economic development
 Driven by innovation
 Automated data management process
The above characteristics of a smart economy embedded with AI have been
delineated here.
Entrepreneurship may be classified into SME entrepreneurship and IDE
entrepreneurship. SME (small and medium enterprise) entrepreneurship is
focused on the local market, whereas IDE (innovation-driven enterprise)
entrepreneurship looks for a global market. IDE entrepreneurship can give
more stimulus to economic growth. This classification is shown in Figure 9.4.

[Diagram: Entrepreneurship → SME entrepreneurship (local market); IDE entrepreneurship (global market)]

Figure 9.4: Classification of Entrepreneurship


AI has made inroads into business activities substantially. The example of the
penetration of AI can be discerned in the e-commerce business. AI is used in
recommendations for online shopping. A recommendation system is a tool that
uses a series of algorithms, data analysis and AI to make recommendations
online. Customer behaviour analysis can help in the growth of sale and a better
understanding of customers. A simple model, shown in Figure 9.5, explains
the recommendation system. Customer data is taken as input; this data is
analysed and modelled. Data analysis and modelling include data cleaning and
pre-processing, segmentation and the recommendation system. Based on this
modelling, recommendations are made.
[Pipeline: Input (customer data) → Data Analysis & Modelling (segmentation, recommendation system) → Recommendations]

Figure 9.5: Recommendation System Model
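The model of Figure 9.5 can be illustrated with a toy collaborative-filtering sketch: a customer-product rating matrix goes in, similar customers are found with cosine similarity, and unseen products liked by the most similar customer come out. The rating matrix is an assumption made up for illustration:

import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Rows: customers, columns: products (0 = not purchased/rated)
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 0, 0],
    [1, 0, 5, 4],
])

sim = cosine_similarity(ratings)            # customer-to-customer similarity
target = 1                                  # recommend for customer 1
neighbour = np.argsort(sim[target])[-2]     # most similar other customer
recommend = np.where((ratings[target] == 0) & (ratings[neighbour] > 0))[0]
print("recommend product indices:", recommend)   # products the neighbour liked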

Uber, a cab aggregator, uses AI in estimating prices. For its dynamic pricing
system, Uber uses ML extensively: it generates forecasts using ML models
that draw on external conditions like weather, traffic, holidays and time, along
with historical data. When very little data is available, Uber uses LSTM
networks and deep learning models to arrive at an estimated price.
Generally, entrepreneurs use intuition, imagination and creativity in their
enterprises and businesses. But with the penetration of AI into enterprises and
businesses, logic, rules and formal thinking are replacing decisions that were
earlier based on gut feeling. AI is influencing every dimension of business.
9.6.2 AI-Driven Smart Environment
Following are the characteristics of a smart environment
 Disaster Prevention
 Water Quality
 Air Pollution Prevention
 Building Management
 Waste Management
The above characteristics of a smart environment embedded with AI have been
delineated here.
Air pollution is one of the major health and environmental concerns. Air
pollution can lead to acid rain, smog and global warming, so it is essential to
monitor and control it. Smart air pollution monitoring consists of wireless
sensor nodes, a server and a database to store the monitored data. Traditional
methods are too cumbersome to process such huge data.
data are converted into meaningful information by using data mining
approaches. Data mining refers to the mining or discovery of new information
based on patterns and rules from vast amounts of data. Classification,
clustering, and association are some of the data mining techniques used to
analyse and extract meaningful information from complex data. Clustering is
the process of organizing data objects into a set of disjoint classes called
clusters. Cluster analysis is one of the primary data analysis tools in data
mining. The K-means clustering algorithm is a partitioning clustering method
that separates data into K groups. K-means clustering algorithm can be used
for analysing the air pollution data.
Further, K-nearest neighbour, Random Forest and Support Vector Machine can
be used for the classification of pollution data to estimate the pollution level,
as in the sketch below.
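A hedged sketch of such a classifier is given below using scikit-learn; the two features (standing in for, say, PM2.5 and NO2 readings) and the synthetic data are assumptions, and any labelled air-quality dataset could be slotted in instead:

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)
low = rng.normal([30, 20], 8, size=(100, 2))     # low-pollution readings
high = rng.normal([120, 80], 15, size=(100, 2))  # high-pollution readings
X = np.vstack([low, high])
y = np.array([0] * 100 + [1] * 100)              # 0 = low, 1 = high pollution

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
for clf in (KNeighborsClassifier(), RandomForestClassifier(random_state=0), SVC()):
    clf.fit(X_train, y_train)
    print(type(clf).__name__, "accuracy:", clf.score(X_test, y_test))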
IoT devices can collect real-time data for various events. This collected data
when analyzed using AI methods can help in the prediction and monitoring of
disastrous events. AI can be used to predict natural disasters. Earthquakes,
hurricanes, floods and volcanic eruptions can be predicted using AI.
Check Your Progress 5
In this section, you studied AI and ML for the smart environment and smart
economy, now answer the questions given in Check Your Progress-5.
Note: a) Write your answer in about 50 words
b) Check your answer with possible answers given at the end of the unit
(1) What are the characteristics of an AI-driven smart economy?
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
(2) What are the characteristics of an AI-driven smart environment?
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
(3) Discuss the classification of entrepreneurship.
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------

9.7 AI AND ML CHALLENGES


Applications of AI and ML are based on large volumes of sensitive data. Bias
and noise in the data are one concern; data security and privacy are another
challenge. It is the quality of the data that decides the accuracy of an ML
model and thus the quality of its predictions. AI and ML challenges in key
components of smart cities are discussed here.
9.7.1 AI and ML Challenges in Healthcare
AI has revolutionized medical practices. Still, there are many challenges that
need to be addressed. AI and ML techniques require a very large amount of
high-quality data. Data from various hospitals may have bias and noise,
making it difficult to generalize results obtained from such biased data.
The model for predictive clinical care is shown in Figure 9.6, and every stage
of this model is vulnerable. Training AI and ML models requires a huge
dataset, and though all precautions are taken while collecting data, there may
be many sources of vulnerability. Vulnerabilities at the data collection level
may arise due to instrumental noise or untrained staff. Data annotation is
related to the labelling of data: labelling by experienced physicians and
radiologists will be accurate, but sometimes the task may be done by trainee
staff, leading to improper annotation. Vulnerabilities in model training include
improper or incomplete training; analysis and diagnosis based on incomplete
or improper training will lead to misinterpretation of results, i.e. false positives
and false negatives.
Other major challenges of AI and ML in healthcare have been listed here.
 Availability of good quality data
 Data standardization
 Regulatory and policy challenges
 Security and Privacy Challenges

[Pipeline: Data Collection → Data Annotation → Feature Extraction → Model Training → Analysis and Diagnosis]

Figure 9.6: Model for Predictive Clinical Care

9.7.2 Challenges of AI in Transportation

When AI and ML are applied to transportation systems, these technologies
offer safety, a better lifestyle and faster transportation. Alongside these
benefits, there are certain challenges in implementing AI and ML in
transportation systems.
One of the challenges is the collection of the large volume of transportation
data needed to achieve good accuracy with ML algorithms.
There has been tremendous progress in autonomous vehicles based on AI
technologies. But safety is still a concern. Minimization of the error rate in
autonomous vehicles is a challenge that needs to be addressed. Efforts are
being made to minimize the error rate in an autonomous vehicle.
Another challenge is the security and privacy issues of data collected through
various sensors and actuators of people who are using smart transportation.
9.7.3 Challenges of AI in Smart Grid
Following are the major challenges of AI in the smart grid.
 Integration of renewable energy is one of the main challenges of AI in
the smart grid. The penetration of variable renewable energy like solar
and wind is increasing in the smart grid. This variability and
unpredictability of renewable energy is a challenge in grid balancing.
 Big data storage and analysis for AI applications
 Data privacy and security
 Standardization of big data in smart grid
 Data samples for different AI applications are not rich enough.
 Shifting AI-based technology from merely identifying problems and
faults to practical applications in power systems
Check Your Progress 6
In this section, you studied AI and ML challenges for smart cities, now answer
the questions given in Check Your Progress-6.
Note: a) Write your answer in about 50 words
b) Check your answer with possible answers given at the end of the unit
(1) Discuss AI and ML challenges in healthcare.
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
(2) Discuss the challenges of AI in mobility and smart transportation.
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
(3) What are the challenges of AI and ML in smart grid?
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------

9.8 SUMMARY
This unit discussed key components of smart cities. Applications of AI and
ML in smart cities in the light of these key components were discussed. It was
shown that AI and ML have been embedded in every aspect of smart cities. In
smart cities, millions of devices generate data, and the analysis of these data in
full is nearly impossible without the use of AI and ML. Be it the diagnosis of
disease, be it personalized learning, be it the integration of renewable energy,
be it intelligent transportation systems, be it entrepreneurship, be it air quality
monitoring, AI and ML are there to support and influence.

9.9 KEYWORDS
Convolutional Neural Networks (CNN): CNN is one of the most popular
deep neural network algorithms. It is mostly used in the visual recognition task.
It takes an image as an input and learns the features from the different parts of
the image.
Decision Tree: Decision Tree is a Supervised learning technique that can be
used for both classification and Regression problems.
Genetic Algorithms (GA): Genetic algorithms are stochastic search
algorithms that act on a population of possible solutions. They are loosely
based on the mechanics of population genetics and selection. The potential
solutions are encoded as ‘genes’ — strings of characters from some alphabet.
K-means clustering: K-means clustering is an unsupervised learning
algorithm, which attempts to partition the set of given input data into k
partitions by minimizing the variance for each partition.
K-nearest neighbour: The K-nearest neighbours algorithm is a supervised
learning classifier, which uses proximity to make classifications or predictions
about the grouping of an individual data point. While it can be used for either
regression or classification problems, it is typically used as a classification
algorithm, working off the assumption that similar points can be found near
one another.
LSTM neural network: The LSTM (long short-term memory) network is a
common model in the field of deep learning, an improved model based on the
recurrent neural network (RNN). The characteristic of the LSTM network is
the use of memory modules instead of common hidden nodes to ensure that
the gradient does not vanish or explode after passing through many time steps,
thereby overcoming some difficulties encountered in traditional RNN training.
LSTM is suitable for processing and predicting important events with
relatively long intervals and delays in time series.
Naïve Bayes: The Naive Bayes classification algorithm is a probabilistic
classifier. It is based on probability models that incorporate strong
independence assumptions.
Random Forest: This method uses multiple decision trees to basically classify
and regress large amounts of data, where each tree generates a value for a
given subset of random variables.
RNN: Recurrent neural networks (RNN) are useful in cases where we have
sequential data, such as language models, music and so on. RNNs take a
sequential input $x = (x_0, x_1, \ldots, x_T)$.
Support Vector Machine (SVM): SVM is a machine learning algorithm
popular for regression and classification.

9.10 CHECK YOUR PROGRESS – POSSIBLE ANSWERS

Check Your Progress 1 – Possible Answers


1) Discuss AI in clinical trials.
Clinical trials are a type of research that studies new tests and treatments and
evaluates their effects on human health outcomes. People volunteer to take part
in clinical trials to test medical interventions including drugs, cells and other
biological products, surgical procedures, radiological procedures, devices,
behavioural treatments and preventive care. Clinical trials take a lot of time
and money, and the success rate is very low. AI can help in eliminating
time-consuming data-monitoring procedures. There are four phases of clinical
trials, and every phase takes a considerable amount of time and money. AI has
the potential to reduce clinical trial cycle duration.
2) Discuss predictive medicine.
Predictive medicine is a branch of medicine that aims to identify patients at
risk of developing a disease, thereby enabling either prevention or early
treatment of that disease. As AI can find meaningful relationships in raw data,
it can support diagnosis, treatment and predictive outcomes in many medical
situations. The application of AI will help medical professionals to incorporate
proactive management of a disease that is likely to develop. Further AI can
help in predictions of disease by identifying risk factors of a patient and thus
earlier healthcare intervention is possible.
3) Discuss ML algorithms for the diagnosis of various diseases.
Diseases and their corresponding ML algorithms are given below.
Disease                          ML Algorithm
Detection of heart disease       Support Vector Machine (SVM), Naïve Bayes
Detection of breast cancer       Support Vector Machine (SVM), Naïve Bayes, K-nearest neighbour
Diagnosis of thyroid disorder    SVM, Decision Tree
Diabetic disease                 SVM
Check Your Progress 2 – Possible Answers
1) Discuss the architecture of adaptive learning.
Adaptive learning systems use artificial intelligence and machine learning to
personalize the learning experience. The steps involved in adaptive learning
systems using AI are shown in Figure 9.2. The architecture of adaptive
learning systems is based upon these four tasks.
 Capturing data from the learner
 Using captured data to assess learner’s progress
 Recommending learning activities
 Providing tailored feedback

2) What is intelligent tutoring systems?


Intelligent Tutoring Systems are computer programs that use Artificial
Intelligence components. The educational software involving AI was originally
called ‘intelligent’ computer-assisted instruction. The critical distinction
between traditional computer-assisted instruction and intelligent tutoring
systems is that the former involves static systems that incorporate the decisions
of expert teachers, whereas intelligent tutoring systems contain the expertise
itself and can use it as a basis for making decisions about instructional
interventions.
3) Explain the components of intelligent tutoring systems.
Intelligent tutoring systems must contain the following three components.
 Knowledge of the learner (learner model/student model)
 Knowledge of the domain (domain model/expert model)
 Knowledge of teaching strategies (pedagogical model)
The ability to diagnose students' errors and adapt instruction based on the
diagnosis represents a key difference between intelligent and traditional
computer-assisted instruction.
A student learns from an intelligent tutoring system primarily by solving tailor-
made problems that serve as learning experiences for that student. The whole
cycle of intelligent tutoring systems has been delineated here.
 The intelligent tutoring system may start by assessing what the student
already knows (student model).
 The system then must consider what the student needs to know
(domain model/expert model)
 Finally, the system must decide what unit of content (e.g., assessment
task or instructional element) ought to be presented next, and how it
should be presented (pedagogical model).
Based on the above three models, the system generates a problem and then
works out a solution to it. The intelligent tutoring system compares its solution
to the one prepared by the student, and performs a diagnosis based on the
differences between the two. Feedback is offered by the system based on this
diagnosis. After this, the program updates the student model, and the entire
cycle is repeated, beginning with generating a new problem.
Check Your Progress 3 – Possible Answers
1) Discuss AI algorithms used in autonomous vehicles.
Autonomous vehicles or self-driving cars use AI and Machine Learning
algorithms. The main algorithm used for self-driving cars is Convolutional
Neural Network (CNN). CNN is one of the most popular deep neural network
algorithms. It is mostly used in the visual recognition task. It takes the image
as an input and learns the features from the different parts of the image.
Machine Learning algorithms like K-means clustering and Support Vector
Machine (SVM) are also used in autonomous vehicles.
2) Discuss unmanned aerial vehicles.
Unmanned Aerial Vehicles or drones can provide many applications for smart
cities. A few applications of unmanned aerial vehicles include traffic
management, pollution monitoring, environmental monitoring, civil security
control. The machine learning algorithm commonly used for these
applications is the convolutional neural network (CNN).
3) Discuss the smart parking system and the role of AI.
A smart parking system is required for smart cities in view of an increased
number of vehicles. The issue of finding parking spaces without spending a lot
of time needs to be addressed.
A smart parking system uses AI technology to detect available parking spaces.
AI algorithm called Genetic Algorithm (GA) can be used for solving
transportation problems such as optimum utilization of space for smart parking
systems.
An ML-based smart parking system analyses the parking lot data to extract the
status of the parking lot. Furthermore, ML and AI technology can predict the
parking lot occupancy status of the upcoming days, weeks, or even months.
ML-based systems can monitor traffic congestion of particular roads and offer
a smart solution to smart parking spaces.
Deep Learning algorithms can be used to predict parking lot occupancy.
Neural Network is used for license plate recognition using real-time video data.
CNN and machine vision are implemented to detect parking lot occupancy
status.
Check Your Progress 4 – Possible Answers
1) Enumerate various applications of AI and ML in smart grid.
Following are the applications of AI and ML in smart grid.
 Home Energy Management System (HEMS)
 Power generation forecast of renewable energy
 Power consumption behaviour of consumers
 Smart energy meter
 Energy trading
2) Discuss Home Energy Management System (HEMS).
A home energy management system (HEMS) is a combination of hardware
and software program that allows the end-users (prosumers/consumers) to
monitor their energy consumption and generation and to manage the energy
inside a home. The core component of HEMS is the smart controller, which
takes care of logging, monitoring and control. The smart controller captures
real-time electricity usage data from schedulable and non-schedulable
appliances and implements optimal demand management strategies. The smart
meter is an important component in HEMS as it records energy consumption
and generation; it also enables smart billing solutions. Other vital components
in HEMS are renewable energy sources (such as wind and solar PV) and
energy storage systems (such as battery storage).
Artificial Neural Network (ANN) and optimization algorithms are embedded
in home energy management controllers (smart controllers) to integrate the
battery storage and renewable energy sources with the grid to reduce the
energy cost for the prosumers/smart consumers.
3) Discuss the power generation forecast of renewable energy.
The energy mix of a country is witnessing significant penetration of variable
renewable energy sources like wind and solar PV. The intermittent nature of
availability of these variable renewable energy resources (VRES) may affect
the stability of the grid. The accurate prediction of these VRES is very much
required for the stable, efficient and economical operation of the power system.
The long short-term memory (LSTM) model can be effectively applied to
power prediction for wind and photovoltaic generation.
Check Your Progress 5 – Possible Answers
1) What are the characteristics of an AI-driven smart economy?
Following are the characteristics of a smart economy.
 Entrepreneurial opportunities
 Diverse economic opportunities
 Sustainable economic development
 Driven by innovation
 Automated data management process
2) What are the characteristics of an AI-driven smart environment?
Following are the characteristics of a smart environment
 Disaster Prevention
 Water Quality
 Air Pollution Prevention
 Building Management
 Waste Management
3) Discuss the classification of entrepreneurship.
Entrepreneurship may be classified into SME entrepreneurship and IDE
entrepreneurship. SME (small and medium enterprise) entrepreneurship is
focused on the local market, whereas IDE (innovation-driven enterprise)
entrepreneurship looks for a global market. IDE entrepreneurship can give
more stimulus to economic growth.
Check Your Progress 6 – Possible Answers
1) Discuss AI and ML challenges in healthcare.
Following are the major challenges of AI and ML in healthcare.
 Availability of good quality data
 Data standardization
 Regulatory and policy challenges
 Security and Privacy Challenges
2) Discuss the challenges of AI in mobility and smart transportation.
When AI and ML are applied to transportation systems, these technologies
offer safety, a better lifestyle and faster transportation. Alongside these
benefits, there are certain challenges in implementing AI and ML in
transportation systems.
One of the challenges is the collection of the large volume of transportation
data needed to achieve good accuracy with ML algorithms.
Minimization of the error rate in autonomous vehicles is a challenge that needs
to be addressed.
Another challenge is the security and privacy issues of data collected through
various sensors and actuators of people who are using smart transportation.
3) What are the challenges of AI and ML in the smart grid?
Following are the major challenges of AI in the smart grid.
 Integration of renewable energy is one of the main challenges of AI in
the smart grid. The penetration of variable renewable energy like solar
and wind is increasing in the smart grid. This variability and
unpredictability of renewable energy is a challenge in grid balancing.
 Big data storage and analysis for AI applications
 Data privacy and security
 Standardization of big data in smart grid

9.11 REFERENCES AND SELECTED READINGS


1. Gangwani, D., & Gangwani, P. (2021). Applications of machine
learning and artificial intelligence in intelligent transportation system:
A review. Applications of Artificial Intelligence and Machine
Learning, 203-216.
2. Kirimtat, A., Krejcar, O., Kertesz, A., & Tasgetiren, M. F. (2020).
Future trends and current state of smart city concepts: A survey. IEEE
access, 8, 86448-86467.
3. K. Shailaja, B. Seetharamulu and M. A. Jabbar, "Machine Learning in
Healthcare: A Review," 2018 Second International Conference on
Electronics, Communication and Aerospace Technology (ICECA),
2018, pp. 910-914, doi: 10.1109/ICECA.2018.8474918.
4. Nikitas, A., Michalakopoulou, K., Njoya, E. T., & Karampatzakis, D.
(2020). Artificial intelligence, transport and the smart city: Definitions
and dimensions of a new mobility era. Sustainability, 12(7), 2789.
5. Qayyum, A., Qadir, J., Bilal, M., & Al-Fuqaha, A. (2020). Secure and
robust machine learning for healthcare: A survey. IEEE Reviews in
Biomedical Engineering, 14, 156-180.
6. Shute, V. J., & Zapata-Rivera, D. (2010). Educational measurement
and intelligent systems. In International Encyclopedia of Education.
Oxford, UK: Elsevier.
7. Zafar, U., Bayhan, S., & Sanfilippo, A. (2020). Home energy
management system concepts, configurations, and technologies for
the smart grid. IEEE access, 8, 119271-119286
MIO-002
SMART TECHNOLOGIES (HARDWARE AND SOFTWARE)

BLOCK 1 : IOT, IIOT and IOE


Unit 1 : Internet of Things (IoT) and Its Applications
Unit 2 : Industrial Internet of Things (IIoT) and IOE
Unit 3 : Smart Grid Technologies for Smart Cities

BLOCK-2 : BLOCKCHAIN TECHNOLOGIES


Unit 4 : Basics of Blockchain Technology
Unit 5 : Applications of Blockchain Technology
Unit 6 : Blockchain Technology for Smart Cities

BLOCK -3 : AI AND MACHINE LEARNING


Unit 7 : Basics of AI
Unit 8 : Basics of Machine Language / Introduction to Machine
Language
Unit 9 : AI and Machine Learning for Smart Cities

BLOCK -4 : DIGITAL PLATFORM, WEARABLE


TECHNOLOGIES
Unit 10 : Digital India Concepts in Smart Cities
Unit 11 : Data Science, Big Data Analytics
Unit 12 : Concept of SCADA, GIS and MIS

SOET/IGNOU 2022 (Digital Print)

ISBN- 978-93-5568-577-3
