IBM AI Level (1-3)
CLASS XI
______________________________________________
INDEX
LEVEL 1: AI INFORMED (AI FOUNDATIONS) TEACHER INSTRUCTION MANUAL
Summary:
The world of tomorrow will be very different from the one we can predict and see today. Artificial
Intelligence is the driving force that will shape future generations. Self-driving cars, widespread
automation and robotic gadgets will become an integral part of the day-to-day life of the human race,
and trade, work, professions and employment will see a massive transformation. Fast adaptability is
crucial for the forthcoming cohort, as they will be widely affected by this change. We, the mentors,
shoulder the responsibility of equipping them to handle the tools of the future with care and
intellectual pride.
We are confident that these children will empower themselves for the future to come and will
understand the key concepts underlying this new technology: AI.
What is AI? This unit lays the foundations of AI by discussing its history and setting the ground for the
forthcoming units.
Objective:
1. Understand the definition of Artificial Intelligence and Machine Learning
2. Evaluate the impact of AI on society
3. Unfold the AI terminology: Machine Learning (ML), Deep Learning (DL), Supervised
Learning, Unsupervised Learning etc.
4. Understand the strengths and limitations of AI and ML
5. Identify the difference between AI on one side and Machine Learning (ML) and Deep
Learning (DL) on the other
Learning Outcome:
1. To get introduced to the basics of AI and its allied technologies
2. To understand the impact of AI on society
Pre-requisites: Reasonable fluency in English language and basic computer skills
Key Concepts: Artificial Intelligence (AI), Machine Learning (ML) and Deep Learning (DL)
AI is a technique that enables a machine to perform cognitive functions such as perceiving, learning
and reasoning that are otherwise performed by humans.
“The science and engineering of making intelligent machines, especially intelligent computer
programs, is Artificial Intelligence.” – John McCarthy (Father of AI)
The yardstick for achieving true AI still seems decades away. Computers execute certain tasks far
better than humans, e.g. sorting, computing, memorizing, indexing and finding patterns, while
identifying emotions, recognising faces and holding conversations remain unbeatable human skills.
This is where AI will play a crucial role in enabling machines to approach human capabilities.
Activity
Let’s get imaginative and create an intelligent motorbike. It is the year 2030; add features to create a
machine that races against time.
_________________________________________________________________________________
2. History of AI
In the 1950s
Modern-day AI gained impetus in the 1950s, when Alan Turing introduced the “Turing Test” for the
assessment of machine intelligence.
In 1955
John McCarthy, known as the founder of Artificial Intelligence, coined the term ‘Artificial
Intelligence’. McCarthy, along with Alan Turing, Allen Newell, Herbert A. Simon and Marvin Minsky,
made the greatest contributions to present-day machine intelligence. Turing suggested that if humans
use accessible information, as well as reason, to solve problems and make decisions, then why can’t it
be done with the help of machines?
In the 1970s
The 70s saw an upsurge of the computer era. Machines became much quicker and more affordable,
and stored more information. They showed a remarkable capacity for abstract thinking and
self-recognition, and accomplished natural language processing.
In the 1980s
These were the years that saw a flow of funds for research and algorithmic tools. Learning techniques
were enhanced, and computers improved with a deeper user experience.
In the 2000s
After many unsuccessful attempts, the technology was successfully established by the 2000s, and the
milestones that needed to be accomplished were finally realised. AI managed to thrive despite a lack
of government funding and public appreciation.
3. Machine Learning
Example 1:
2, 4, 8, 16, 32, ?
I am sure you would have guessed the correct answer: 64. But how did you arrive at 64? This
calculation took place inside your brain cells, and the technique you used to decipher this puzzle has
actually helped you decode Machine Learning (ML).
That’s exactly the kind of behaviour we are trying to teach machines. ‘Learning from experience’
is what we want machines to acquire.
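The guessing step above can be sketched in a few lines of Python. The program is never told the doubling rule; it infers the pattern (here, a constant ratio between terms) from the earlier terms and applies it once more:

```python
# A minimal sketch of "learning from experience": infer the pattern
# from the data itself, then use it to predict the next term.

def predict_next(sequence):
    """Guess the next term by learning the ratio between consecutive terms."""
    ratios = [b / a for a, b in zip(sequence, sequence[1:])]
    ratio = ratios[0]
    # Only predict if the same ratio holds across the whole sequence.
    assert all(abs(r - ratio) < 1e-9 for r in ratios), "no constant ratio found"
    return sequence[-1] * ratio

print(predict_next([2, 4, 8, 16, 32]))  # 64.0
```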
Example 2:
Let us take another example, from cricket. Assume you are the batsman facing a bowler. By looking at
the bowler’s body movement and action, you predict the delivery and move either left or right to hit
the ball. But if the bowler throws a straight ball, what will you do? Apart from the bowler’s body
movement, you also try to find patterns in the bowler’s bowling habits, for example that after 2
consecutive balls to the left side, he/she throws a straight ball, and you prepare yourself to face the
next ball accordingly. So what you are doing is learning from past experience in order to perform
better in the future.
When a computer does this, it is called Machine Learning: you let the computer learn from its past
experience/data.
Example 3:
I am Mr. XYZ and I want to buy a house, so I try to calculate how much I need to save monthly for it. I
did my research and got to know that a new house would cost me anything between Rs. 30 Lakh and
Rs. 100 Lakh, a 5-year-old house would cost between Rs. 20 Lakh and Rs. 50 Lakh, a house in Delhi
would cost me ...... and buying a house in Mumbai would be ...... and so on.
Now my brain starts working and suddenly I am able to make out a pattern:
So, the price of the house depends on its age, location, built up area, facilities, depreciation
(which means that price could drop by Rs. 2 Lakh every year, but it would not go below Rs. 20
Lakh.)
In machine learning terms, Mr. XYZ has stumbled upon regression: he predicted a value (price)
based on the available historical data. People do it all the time, whether estimating a reasonable
price for a used phone or car, or figuring out how many cakes to buy for a birthday party: at 200
grams per person, how many kilograms for a party of 50 persons?
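This kind of price prediction can be sketched as a simple straight-line fit. The numbers below are invented for illustration (a price dropping by about Rs. 2 Lakh per year of age, as described above); a real model would use many more factors than age alone.

```python
# A minimal regression sketch: predict house price (Rs. Lakh) from age
# (years) by fitting a least-squares line to historical data.

def fit_line(xs, ys):
    """Return slope and intercept of the least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical history: price drops by about Rs. 2 Lakh per year of age.
ages   = [0, 1, 2, 3, 4, 5]
prices = [30, 28, 26, 24, 22, 20]

slope, intercept = fit_line(ages, prices)
predicted = slope * 2.5 + intercept      # price of a 2.5-year-old house
print(round(predicted, 1))               # 25.0
```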
Let's get back to the pricing of the house. The problem is that construction dates differ, dozens of
options are available, locations are multiple, seasonal demand spikes, and there is an array of many
more hidden factors.
Humans may not be able to keep all that data in mind while calculating the price of a prospective
house, so we need machines to do the mathematics for us. Let’s go the computational way: provide
the machine some data and ask it to find all the hidden patterns related to the price, and it works! The
most exciting thing is that a machine copes with this task much better than a real person carefully
analysing all the dependencies in his/her mind. This heralds the birth of machine learning!
2. YouTube recommends videos of a certain genre to you, and the recommended videos match your
viewing choices to a great extent.
3. Flipkart or Amazon recommend products of your choice. How do they come to know your buying
preferences? Did you shop together?
4. When you upload photos to Facebook, the service automatically highlights faces and suggests which
friends to tag. How does it instantly identify your friends in the photos? You might be thinking that
Facebook is a magician. Isn’t it?
If you haven’t realized it yet, it is time for you to know that machine learning is behind all the
surprises sprung by Google, Amazon and Flipkart. You too can create this magic by learning a little
mathematics and a computer programming language.
I am sure, by now you have some insight into ML. So, what is ML?
“Machine Learning is a discipline that deals with programming the systems so as to make them
automatically learn and improve with experience. Here, learning implies understanding the input data
and taking informed decisions based on the supplied data”. In simple words, Machine Learning is a
subset of AI which predicts results based on incoming data.
The utilities of ML are numerous. To detect spam emails, forecast stock prices or project class
attendance, one can achieve results by means of earlier collected spam messages, previous price
history records, or 5 or more years of attendance data for a class; ML will predict the results based
on the previous data available to it.
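The spam-detection idea above can be sketched with a toy word-count classifier. The messages are invented for illustration; a real filter would use far more data and a probabilistic model such as naive Bayes:

```python
# A minimal sketch of spam detection from previously collected messages:
# count how often each word appears in known spam vs. known good mail,
# then score a new message by which side its words favour.
from collections import Counter

spam = ["win cash prize now", "cheap loan offer now"]
ham  = ["meeting moved to monday", "homework due tomorrow"]

spam_counts = Counter(word for msg in spam for word in msg.split())
ham_counts  = Counter(word for msg in ham for word in msg.split())

def classify(message):
    """Label a message by comparing word overlap with past spam vs. ham."""
    spam_score = sum(spam_counts[w] for w in message.split())
    ham_score  = sum(ham_counts[w] for w in message.split())
    return "spam" if spam_score > ham_score else "not spam"

print(classify("claim your cash prize"))   # spam
print(classify("monday homework update"))  # not spam
```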
Activity
Based on the understanding you have developed till now, how do you think Machine Learning could
help with some of the problems currently faced by your school? Fill the problems in the blank circles
given below:
Conventional programming and ML coding are both computer programs, but their approach and
objective are different. Like your school dress and your casual dress: both are clothes made from
threads, but their purpose is different.
If you need to develop a website for your school, you will take the conventional programming approach.
But if you want to develop an application to forecast the attendance percentage of your school for a
particular month (based on historical attendance data), you will use the ML approach.
Conventional programming refers to any manually created program which uses input data, runs on a
computer and produces the output. What does that mean? Let us understand it through the
illustration below:
A programmer accepts the input and gives instructions (through code in a computer language) to the
computer to produce an output.
Take a look at an example: the steps to convert the Celsius scale to the Fahrenheit scale.
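The conventional program referred to here is not reproduced in the text; a minimal Python version might look like this, with the programmer supplying the conversion rule explicitly:

```python
# Conventional programming: the programmer writes the rule (the formula)
# explicitly, and the computer merely applies it to the input.

def celsius_to_fahrenheit(celsius):
    """Convert a Celsius temperature using the known formula F = C * 1.8 + 32."""
    return celsius * 1.8 + 32

print(celsius_to_fahrenheit(100))  # 212.0
print(celsius_to_fahrenheit(0))    # 32.0
```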
On the contrary, in Machine Learning (ML), the input data and the output data are fed to an algorithm
(a machine learning algorithm) to create a program. Unlike conventional programming, machine
learning is an automated process where the programmer feeds the computer ‘The Input + The Output’
and the computer generates the algorithm for how ‘The Output’ was achieved.
For example, if the same Python program above were written using the Machine Learning approach,
the steps would look like this:
Step 1: Feed in many Celsius values (i.e. -40, -10, 0, 8, 15, 22, 38)
Step 2: Feed in the corresponding Fahrenheit values (i.e. -40, 14, 32, 46, 59, 72, 100)
Step 3: Pass these 2 sets of values to a Machine Learning (ML) algorithm
Step 4: Now ask the ML program to predict (convert) any other Celsius value to Fahrenheit, and the
program will tell you the answer.
For example, ask the computer to predict (convert) 200 Celsius to Fahrenheit, and you will get the
answer as approximately 392.
Notice that in the ML approach, the conversion formula (F = C * 1.8 + 32) is nowhere mentioned. The
code was provided with the input data (Celsius) and the corresponding output data (Fahrenheit), and
the model (ML code) automatically learned the relationship between Celsius and Fahrenheit.
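The four steps above can be sketched with a least-squares line fit standing in for the ML algorithm. The program sees only the input-output pairs and recovers the rule itself:

```python
# The ML approach: the program is shown inputs and outputs only, and
# learns the Celsius-to-Fahrenheit relationship from the data.

def fit_line(xs, ys):
    """Least-squares slope and intercept for the points (xs, ys)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
            sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx

celsius    = [-40, -10, 0, 8, 15, 22, 38]       # Step 1: inputs
fahrenheit = [-40, 14, 32, 46, 59, 72, 100]     # Step 2: outputs (rounded)

slope, intercept = fit_line(celsius, fahrenheit)  # Step 3: learn the pattern
prediction = slope * 200 + intercept              # Step 4: predict 200 Celsius

print(round(slope, 2))       # close to 1.8
print(round(prediction))     # 392
```

The learned slope and intercept come out close to 1.8 and 32 rather than exactly, because the Fahrenheit training values are rounded; the prediction for 200 Celsius still rounds to 392.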
There is a lot of debate regarding the difference between Machine Learning and Artificial Intelligence.
The truth is that Machine Learning and Artificial Intelligence are not two essentially different things,
as is often understood; Machine Learning is a tool for achieving Artificial Intelligence.
AI is a technology to create intelligent machines that can recognize human speech, can see (vision),
assimilate knowledge, strategize and solve problems as humans do. Broadly, AI entails all those
technologies or fields that aim to create intelligent machines.
Machine learning gives machines the ability to learn, forecast and progress on their own without
being explicitly programmed. In a nutshell, ML is about learning and nothing else. An ML system
primarily starts in a ‘slow state’ (like a child) and gradually improves by learning from examples to
become ‘superior’ (like an adult).
Imagine you have to make a robot that can see, talk, walk, sense and learn. What technology will you
apply? To achieve the task of making such a robot, one has to apply numerous technologies, but for
the learning part, you will apply machine learning.
4. Data
Modern day scholars have coined the phrase ‘Data is the new oil’. If everyone is talking so highly about
data, then it must be something precious! But what is this data?
Activity
Let us create a students’ dataset for your class (the one given below is a sample, you can create one of
your own)
A 76 Male 92 N
B 82 Male 88 Y
C 57 Male 65 N
D 97 Female 97 N
E 56 Male 62 Y
F 76 Female 85 N
G 51 Male 56 Y
Activity
Open the URL https://data.gov.in/node/6721404 in your web browser. It should open the following
page
The page you opened, has a link Reference URL: https://myspeed.trai.gov.in/ - Click on this link.
Now that we have engaged in two activities related to data, let us try and define Data.
Data can be defined as a representation of facts or instructions about some entity (students, school,
sports, business, animals etc.) that can be processed or communicated by human or machines. Data is
a collection of facts, such as numbers, words, pictures, audio clips, videos, maps, measurements,
observations or even just descriptions of things.
Data may be represented with the help of characters such as alphabets (A-Z, a-z), digits (0-9) or
special characters (+, -, /, *, <, >, = etc.)
Activity
Now that you have created a dataset of your own, it is the time to categorise the data. Data can be
sorted into one of the two categories stated below:
Structured Data
Unstructured Data
‘Structured data’ is most often categorized as quantitative data, and it is the type of data most of us
work with every day. Structured data has predefined data types and a format, so it fits well into the
columns/fields of a database or spreadsheet. It is highly organised and easily analysed.
In the above activity, name, age, address etc. are examples of ‘structured data’: the data is organised
into accurately defined fields. Data that can be stored in relational databases or spreadsheets (like
Excel) is the best example of structured data.
However, for the field ‘Type of Facebook posts’, do you have any predefined data type? In fact, your
Facebook post can carry anything: text, picture, video, audio etc. You can’t have one fixed data type
for such data, and that’s why you call it ‘unstructured data’, where neither the size is fixed nor the
datatype predefined.
‘Unstructured data’ is most often categorized as qualitative data, and it cannot be processed and
analysed using conventional relational database (RDBMS) methods.
Examples of unstructured data include text, video, audio, mobile activity, social media activity,
satellite imagery, surveillance imagery, and the list goes on. Unstructured data is difficult to
deconstruct because it has no pre-defined model, meaning it cannot be organized in relational
databases. Instead, non-relational (NoSQL) databases are the best fit for managing unstructured data.
“Machine learning is the science of getting computers to act without being explicitly programmed.”
– Stanford University
“Machine learning algorithms can figure out how to perform important tasks by generalizing from
examples.” – University of Washington
Of late, machine learning has achieved a great deal of popularity, but the first attempt to develop a
machine that imitated the behaviour of a living being was made in the 1930s by Thomas Ross. Machine
Learning (ML) is a term used today to describe an application of AI which equips the system with the
ability to learn and improve from experience using the data that is accessible to it.
For more, please refer to section 3.
Machine learning is often divided into three categories – Supervised, Unsupervised and
Reinforcement learning.
As the name specifies, supervised learning occurs in the presence of a supervisor or teacher. We train
the machine with labelled data (i.e. data already tagged with the correct answer), comparable to the
learning which takes place in the presence of a supervisor or a teacher. A supervised learning
algorithm learns from labelled training data and then becomes ready to predict the outcomes for
unseen data.
Example 1
Remember the time when you used to go to school? The time when you first learnt what an apple looked
like? The teacher probably showed a picture of an apple and told you what it was, right? And you could
identify the particular fruit ever since then.
Step 1: You provide the system with data that contains photos of apples and let it know that these are
apples. This is called labelled data.
Step 2: The model learns from the labelled data and the next time you ask it to identify an apple, it can
do it easily.
Example 2
For instance, suppose you are given a basket full of different kinds of fruits. Now the first step is to train
the machine to identify all the different fruits one by one in the following manner:
If the shape of the object is round with depression at the top and
its color being Red, then it will be labelled – Apple.
If shape of object resembles a long-curved cylinder with tapering
ends and its colour being Green or Yellow, then it will be labelled
– Banana.
Now suppose after training, you bring a banana and ask the machine to
identify it, the machine will classify the fruit on the basis of its shape and
colour and would confirm the fruit to be BANANA and place it in the Banana category.
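The fruit-sorting rules above can be sketched as a tiny supervised classifier. Here a 1-nearest-neighbour model learns from labelled examples; the numeric ‘roundness’ and ‘redness’ features are invented for illustration:

```python
# A minimal supervised-learning sketch: a 1-nearest-neighbour classifier
# trained on labelled fruit examples described by two made-up features.

def nearest_label(training, point):
    """Return the label of the training example closest to `point`."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    features, label = min(training, key=lambda ex: dist(ex[0], point))
    return label

# Labelled data: ((roundness, redness), label)
training = [
    ((0.90, 0.80), "apple"),   # round and red
    ((0.95, 0.90), "apple"),
    ((0.20, 0.10), "banana"),  # long and yellow/green
    ((0.10, 0.20), "banana"),
]

print(nearest_label(training, (0.85, 0.70)))  # apple
print(nearest_label(training, (0.15, 0.10)))  # banana
```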
Activity 1
Suppose you have a data set containing images of different bikes and cars. Now you need to train the
machine on how to classify all the different images. How will you create your labelled data?
[Hint – If there are 2 wheels and 1 headlight on the front it will be labelled as a ‘Bike’]
Often, perfectly labelled data sets are hard to find. In such situations, the data used to train the
machine is neither labelled nor classified. Unsupervised learning is an ML technique where we don’t
supply labelled data; instead we allow the machine learning model (algorithm) to discover the
patterns on its own. The task of the machine is to group unsorted information according to
resemblances, patterns and differences without any prior training on the data.
In this kind of learning, the machine must find a hidden structure in the unlabelled data without
guidance or supervision.
Example 1
If somebody gives you a basket full of different fruits and asks you to separate them, you will probably
do it based on their colour, shape and size, right?
Unsupervised learning works in the same way. As you can see in the image:
Step 1: You provide the system with data that contains photos of different kinds of fruits and ask it to
segregate them. Remember, in the case of unsupervised learning you don’t need to provide labelled
data.
Step 2: The system will look for patterns in the data, such as shape, colour and size, and group the
fruits based on those attributes.
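Grouping without labels can be sketched with a tiny k-means clustering on a single made-up feature (fruit size); a real system would use many features such as shape and colour:

```python
# A minimal unsupervised-learning sketch: group fruits by size alone,
# with no labels, using a tiny k-means (k = 2). The sizes are invented.

def kmeans_1d(values, k=2, iters=10):
    """Cluster 1-D values into k groups; returns a list of cluster lists."""
    centres = [min(values), max(values)]          # deterministic start (k = 2)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centres[i]))
            clusters[nearest].append(v)
        # Move each centre to the mean of its cluster (keep it if empty).
        centres = [sum(c) / len(c) if c else centres[i]
                   for i, c in enumerate(clusters)]
    return clusters

sizes = [4.0, 4.2, 3.9, 9.8, 10.1, 9.5]   # small fruits vs. large fruits
small, large = kmeans_1d(sizes)
print(small)  # [4.0, 4.2, 3.9]
print(large)  # [9.8, 10.1, 9.5]
```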
Example 2
For instance, suppose the machine is given an image containing both dogs and cats, which it has not
seen before. Logically, the machine has no idea about the physical characteristics of dogs and cats, so
it cannot name the animals. But it can surely categorize them according to their similarities, patterns
and differences: the picture can easily be split into two groups, one containing all the dogs and the
other containing all the cats. Here the machine learned nothing beforehand; no training data or
examples were provided.
Let us take another example: a friend invites you to his party, where you meet a stranger. You will
classify this person using unsupervised learning (without prior knowledge), and this classification can
be on the basis of gender, age group, dressing style, educational qualification or whichever way you
prefer.
Activity 1
Let's suppose you have never seen a cricket match before and by chance watch a video on the
internet. Can you classify the players on the basis of different criteria?
Hint: [Players wearing similar outfits belong to the same team; players perform different types of
actions: batting, bowling, fielding and wicket keeping.]
In reinforcement learning, the machine is not given examples of correct input-output pairs; instead, a
method is provided for the machine to measure its performance, in the form of a reward.
Reinforcement learning methods resemble how humans and animals learn: the machine carries out
numerous activities and gets rewarded whenever it does something well.
Example 1
The goal of the robot is to get the reward (diamond) and to avoid the
hurdles (fire). The robot learns by trying all the possible paths and then
chooses the path which reaches the reward while encountering the
least hurdles. Each correct step will bring the robot closer to the
diamond while accumulating some points and each wrong step will
push the robot away from the diamond and will take away some of the
accumulated points. The reward (diamond) will be assigned to the robot when it reaches the final stage of the
game.
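The robot-and-diamond idea above can be sketched with Q-learning on a tiny corridor world. The layout, reward values and learning rate below are invented for illustration:

```python
# A minimal reinforcement-learning sketch: Q-learning on a 5-cell corridor.
# The robot starts at cell 0; the diamond (reward +10) is at cell 4.
# Each step costs -1, nudging the robot toward the shortest path.
import random

random.seed(0)
N_CELLS, GOAL = 5, 4
ACTIONS = [-1, +1]                       # move left or right
Q = {(s, a): 0.0 for s in range(N_CELLS) for a in ACTIONS}

def step(state, action):
    nxt = min(max(state + action, 0), N_CELLS - 1)
    reward = 10 if nxt == GOAL else -1
    return nxt, reward

for _ in range(500):                     # training episodes
    state = 0
    while state != GOAL:
        action = random.choice(ACTIONS)  # explore by trying random moves
        nxt, reward = step(state, action)
        best_next = max(Q[(nxt, a)] for a in ACTIONS)
        # Q-learning update: nudge the estimate toward reward + future value
        Q[(state, action)] += 0.5 * (reward + 0.9 * best_next - Q[(state, action)])
        state = nxt

# The greedy policy now heads straight for the diamond.
state, path = 0, [0]
while state != GOAL:
    action = max(ACTIONS, key=lambda a: Q[(state, a)])
    state, _ = step(state, action)
    path.append(state)
print(path)  # [0, 1, 2, 3, 4]
```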
Example 2
Imagine a small kid is given access to a laptop at home (environment). In simple terms, the kid (agent)
will first observe and try to understand the laptop environment (state). Then the curious kid will take
certain actions, like hitting some random buttons (action), and observe how the laptop responds
(next state).
When the non-responding laptop screen goes dull, the kid dislikes it (receiving a negative reward) and
probably won’t repeat the actions that led to such a result (updating the policy), and vice versa. The
kid will repeat the process until he/she finds a button which turns the laptop screen bright (reward),
and will be happy maximizing the total reward.
Activity 1
Question 1: Can you find two real-world applications of Supervised Learning?
Question 2: Can you write down two real-world applications of Unsupervised Learning?
Question 3: What kind of learning algorithm do you think works behind a computer chess engine?
Deep Learning is inspired from human brain and the neurons in the human brain. Therefore, in order to
understand Deep Learning, we will first need to know about ‘neurons’.
A small child learns to distinguish between a school bus and a regular transit bus. How?
How do we easily differentiate between our pet dog and a street dog?
The answer is that we have a vast biological neural network connecting the neurons of our nervous
system. Our brain is a very complex network comprising about 10 billion neurons, each connected to
10 thousand other neurons.
So, before we try to understand Deep Learning, let us understand Neural Network (Artificial Neural
Network i.e. ANN). In short, Deep Learning consists of artificial neural networks designed on similar
networks present in the human brain. The idea of ANN in Deep Learning is based on the belief that
human brain works by making the right connections, and this pattern can be imitated using silicon and
wires in place of living neurons.
Output Node: This is the final stage where the computations conclude, and data is made
available to the output layer from where it gets transferred back into the real-world
environment.
Example
A school has to select students for their upcoming sports meet. The school principal forms a group of
three teachers (a selection jury) and entrusts them with the responsibility of selection of students based
on the following criteria:
The school has a history of a fair selection procedure, and therefore only talented and bonafide
students are able to secure a place in the sports team. In order to continue the same standard and
selection procedure, the principal decides to share with the jury the cases of about 50 students from
previous selections to study. The principal feels this will give the jury an opportunity to practice, which
will eventually help them make a fair selection.
(A reminder: this whole exercise is being performed on the previous batch of students, and its purpose
is to sharpen the decision-making accuracy of the jury for the upcoming selection.)
Every jury member is given a maximum of 10 points (weight) on which they rate a student. They
need to distribute the 10 points across the four criteria of marks, gender, age and emotional
stability.
The cut-off average required for a student to qualify is fixed at ‘6’. So, a student needs to have
an average score of ≥ 6 to reserve his/her spot in the sports team.
After the jury gives their verdict on a particular student (using the above four criteria), the
principal will reveal whether their verdict of "Selected" or "Not Selected" matches the original
selection outcome.
Once the ground rules have been set, the jury enters a room to deliberate on the candidates and starts
the decision-making process. Here is a peek into the jury's conversation:
Teacher 1: For me ‘Grade X Marks’ is most important and I am assigning this criterion the most weight
and other criteria are not important. Accordingly, I’m giving a score of ‘7 points’ to Student#1.
Teacher 2: I think differently…’Marks’ are important, however I am also considering ‘Gender’ and ‘Age’
and I’m assigning each of the three criteria equal weight. So I’m scoring Student # 1 ‘2 points for Marks’,
‘2 points for Gender’ and ‘2 points for Age’.
Teacher 3: For me only ‘Gender’ and ‘Emotional Stability’ count and I’m assigning equal weightage to
both these criteria. Accordingly, I will score Student # 1 with ‘5 points for Gender’ and ‘5 points for
Emotional Stability’.
Based on the above deliberation, let us take a look at how the jury members have scored Student # 1:
Criteria              Teacher 1   Teacher 2   Teacher 3
Grade X Marks             7           2           0
Gender                    0           2           5
Age                       0           2           0
Emotional Stability       0           0           5
Total                     7           6          10
As per the selection rule (cut-off ≥ 6), Student # 1 should have ideally qualified, but the principal reveals that this
student actually did not make the team as per the original decision.
Let us take a look at Student # 2 now… Please see below the jury discussion and deliberation for this candidate
and how they now begin to adjust their scoring based on learning from Student # 1.
Teacher 1: It seems I'm attaching too much weight to just ‘Marks’, so I'm leaning towards giving some
weightage to ‘Age’ as well.
Teacher 2: I feel I’m assigning too much weight to ‘Gender’, I’m going to consider splitting it between
‘Gender’ and ‘Emotional Stability’.
Teacher 3: I now feel in addition to ‘Gender’ and ‘Emotional Stability’ some weightage needs to be given
to ‘Age’ as well.
Based on the above deliberations, let us take a look at the score table for Student # 2:
Criteria              Teacher 1   Teacher 2   Teacher 3
Grade X Marks             3           2           0
Gender                    0           1           1
Age                       3           2           3
Emotional Stability       0           1           1
Total                     6           6           5
As per the selection rule (cut-off ≥ 6), Student # 2 will not qualify, and the principal reveals that indeed
this student did not make the team in the original decision either.
In this fashion the jury proceeds to evaluate student after student, and in doing so a pattern emerges
of the right ‘weightage’ for each criterion (per jury member) that yields the highest number of correct
predictions. This whole process of learning and developing accuracy is essentially how an Artificial
Neural Network (ANN) works.
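The jury can be sketched as a tiny network: each teacher is a hidden node holding one weight per criterion, and the output node averages their scores against the cut-off of 6. The weights below are the Student #1 scores from the deliberation above; treating each criterion value as a full score of 1.0 is an assumption for illustration:

```python
# A sketch of the jury as a tiny neural network: each teacher (hidden
# node) holds a weight per criterion; the output node averages the
# teachers' scores and applies the cut-off of 6.

CRITERIA = ["marks", "gender", "age", "emotional_stability"]

def teacher_score(weights, student):
    """One hidden node: weighted sum of the student's criterion values."""
    return sum(weights[c] * student[c] for c in CRITERIA)

def jury_decision(jury, student, cutoff=6.0):
    """Output node: average the teachers' scores, compare to the cut-off."""
    avg = sum(teacher_score(w, student) for w in jury) / len(jury)
    return "Selected" if avg >= cutoff else "Not Selected"

# Teacher weights for Student #1 (each row distributes the 10 points).
jury = [
    {"marks": 7, "gender": 0, "age": 0, "emotional_stability": 0},  # Teacher 1
    {"marks": 2, "gender": 2, "age": 2, "emotional_stability": 0},  # Teacher 2
    {"marks": 0, "gender": 5, "age": 0, "emotional_stability": 5},  # Teacher 3
]

# A hypothetical student who scores full value on every criterion.
student = {c: 1.0 for c in CRITERIA}
print(jury_decision(jury, student))  # Selected
```

The teachers' scores come out as 7, 6 and 10, averaging above the cut-off, which matches the jury's verdict for Student #1; training the network corresponds to the teachers adjusting their weights after each reveal.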
The decision/prediction is what we call the ‘Output Layer’ in a neural network. In this case, ‘Selected’ and
‘Not Selected’ is the output layer. It should be noted that it can either be a continuous outcome
(regression, as in a number like 3.14 or 42) or categorical outcome (true/false, yes/no, selected/not
selected etc.)
The jurors (the group of teachers) form the ‘Hidden Layer’. It is called ‘hidden’ because no one besides
them knows how much weightage they are attaching to each criterion (or input). To the input and
output neurons, the hidden layer is a ‘black box’ that simply listens and jointly decides an outcome.
Deep learning is a branch of machine learning which is completely based on artificial neural networks;
as the neural network mimics the human brain, deep learning is also a kind of imitation of the human
brain. It is important to know that in deep learning, we do not need to explicitly program everything.
Let us now understand the difference between Machine Learning and Deep Learning:
MACHINE LEARNING                                    DEEP LEARNING
Divides the task into sub-tasks, solves them        Solves the problem end to end.
individually and finally combines the results.
Takes less time to train.                           Takes longer time to train.
Automated Driving: Automotive researchers are using deep learning to automatically detect objects
such as stop signs and traffic lights. In addition, deep learning is used to detect pedestrians, reducing
the incidence of accidents.
Aerospace and Defence: Identifying objects from satellites and locating safe and unsafe zones for
troops is another area where Deep Learning is playing a major role.
Medical Research: Deep Learning is used by cancer researchers to automatically detect cancer
cells.
Industrial Automation: Deep learning is helping to improve worker safety around heavy
machinery by automatically detecting when people or objects are within an unsafe distance from
the machines.
picture, notices the unique features, and then matches the same with the people in your friends
list.
5. Email spam and malware filtering - Incoming emails are screened against known spam patterns.
Mail filtering manages received mail, detecting and removing messages holding malicious code such
as viruses, Trojans or other malware.
6. Product recommendations - You often receive emails from similar merchandizers after you have
shopped online for a product. The recommended products are either similar to your purchase or
match your taste, which definitely refines the shopping experience. Did you know that it is Machine
Learning working its magic in the background?
7. Online Fraud Detection - Machine learning is lending its potential to make cyberspace a secure
place by tracking monetary frauds online. For example, PayPal uses ML for protection against money
laundering. Even with the advancements we have made in ML over the years, there are instances
where a Grade 2 student has been able to beat a computer by solving a problem faster.
1. Any problem or question that requires social context will take longer for a machine to solve
2. Particularly with respect to text analytics, there are two main challenges. First is “Ambiguity” -
the same word can mean many things. Second is “Variability” - the same thing can be said in
many different ways.
3. Machine learning can’t solve ethical problems. If a self-driving car kills someone on the road,
whose fault is it?
7. Jobs in AI
Can you guess the jobs depicted in the pictures below:
Picture – 1 Picture -2
Picture – 3 Picture -4
The jobs depicted in the pictures above were common professions not long ago, maybe 20-30 years back.
Many jobs that existed a few decades ago are redundant in today’s age. Similarly, there are jobs which
were unheard of 30 years ago but are very popular now.
Activity
1. Can you prepare a list of 10 jobs which existed in the 80s but are no longer relevant now?
2. Can you prepare a list of 5 jobs which did not exist in the 80s but are popular now?
3. Can you imagine 5 jobs / professions that do not exist now but may be popular in 2035?
The World Economic Forum predicts that by 2022 AI and ML will displace 75 million jobs worldwide but
generate 133 million new ones. A Gartner report claims that in 2020 Artificial Intelligence will
create 2.3 million jobs while eliminating 1.8 million.
Job losses due to Artificial Intelligence are a largely baseless fear: AI will NOT take over the
employment market – as simple as that. It will merely introduce a paradigm shift, similar to the one
that followed the Industrial Revolution. Consequently, while many professions will become obsolete and
disappear, some occupations will become much more popular, with new ones emerging along the way. It’s
important to keep two things in mind:
1. Acquiring basic tech-related skills is not something you will live to regret
2. Understanding what is happening in the field of AI may help you gain a significant career
advantage, either by investing time and money into learning a new skill or by leveraging your
existing knowledge to solve relevant AI-related problems.
Jobs which will grow with the help of AI
1) Creative Jobs
Artists, doctors and scientists are just a few of the professionals whose work can be labelled creative.
This category of jobs is only going to be refined and advanced by the use of AI.
The number of such professionals required may not increase, but AI will make certain parts of these jobs
less complex for humans, so it will become easier in the future to learn the skill in less time and
flourish.
2) Management Jobs
Management jobs cannot be handed over to artificial managers; human managers will have to manage the
artificial ones. Managing is a very complex task that involves a deep understanding of people and
communication. There are already a few smart tools which help managers become more effective at their
job. So, if you’re interested in this kind of job, you can learn to use them and gain some advantage in
the field.
3) Tech Jobs
Programmers, data scientists and others who work on the creation and maintenance of AI systems hold the
jobs of the future, and they will be very important if humanity is to take the next large step in its
evolution. These jobs too will undergo certain changes: a few of the tech jobs in demand today
may become less common, while others may become more vital.
2. Data detective
6. AI tutor
I will leave it to you to take this up as a project and define the roles and responsibilities of these
job profiles.
Activity
(This activity has been designed by MIT AI Ethics Education Curriculum. “An Ethics of Artificial Intelligence Curriculum for
Middle School Students was created by Blakeley H. Payne with support from the MIT Media Lab Personal Robots Group,
directed by Cynthia Breazeal.”)
Learning Objectives:
2. Know that artificial intelligence is a specific type of algorithm and has three specific parts:
dataset, learning algorithm, and prediction.
3. Recognize AI systems in everyday life and be able to reason about the prediction an AI system
makes and the potential datasets the AI system uses.
1. Print out all of the materials below these two paragraphs, with each bingo card on a separate
paper and the list of Tasks, Data Sets & Predictions on a third.
2. Pass the bingo cards around to the separate teams and keep the list of tasks/datasets/predictions
for yourself (it will serve as both the answer key and the bingo calls)
3. Along with every data set and prediction, you will see the task that it corresponds to on the Bingo
grids. Read out the data set and prediction pairs at random (but not the task itself!) and have the
students fill in the tile they think it belongs to.
4. The first of the two teams to correctly fill out five tiles in a row, diagonal, or column wins.
T: Click on an Instagram ad
D: the Instagram accounts people follow and what they buy
P: what you might buy based on who you follow
I hope you have noticed that I did not ask why we should learn how to manufacture bicycles or how to
build and program a mobile phone or calculator. Anyone who can’t operate a mobile phone or doesn’t
know how to use a calculator is probably a misfit in the society we live in today. The day is not
far away when AI tools and applications will start replacing calculators, bicycles, mobile phones and
many other gadgets that we use in our day-to-day lives.
Artificial Intelligence technologies are widely used by people today, even if they don’t realise it or
are unaware of it. In the modern world we are surrounded by a variety of AI tools that
make many aspects of our lives easier. Have you ever thought about how the music we listen to, the
products we buy from Amazon or Flipkart, and the news and information we receive are all made
available to us with the help of AI? AI is helping to compose poems, write stories, assist doctors
performing complex surgeries, and even prescribe medicines.
You must have watched a number of Sci-Fi movies, but encountering something like “The Terminator”
come to life is not going to happen anytime soon.
Activity 1:
Let us do a small exercise on how AI is perceived by our society.
Do you have any idea what people think about AI? Do an online image search for the term “AI”.
The results of your search will give you an idea of how AI is perceived.
[Note - If you are using Google search, choose "Images" instead of “All” search]
Based on the outcome of your web search, create a short story on “People’s Perception of AI”
Activity 2:
In this activity, let us explore what is being written on the web about AI.
Do you think the articles on the net are anywhere close to reality, or are they just complete science
fantasies?
Now based on your experience during the online search about ‘AI’, prepare your summary report that
answers the following questions:
1. What was the name of the article, name of the author and web link where the article was
published?
2. What was the main idea of the article? Mention in not more than 5 sentences
1.1 Chatbots
If you remember from Unit-1, AI tries to mimic human cognitive functions like vision, speech, reasoning
etc. A Chatbot is one application of AI that simulates conversation with humans through
voice commands, text chats or both. We can define a Chatbot as an AI application that can
imitate a real conversation with a user in their natural language. Chatbots enable communication via
text or audio on websites, messaging applications, mobile apps or telephones.
Some interesting application areas for Chatbots are customer service, e-commerce, sales and marketing,
schools and universities etc.
Let us take up some real-life examples. HDFC Bank’s EVA (Electronic Virtual Assistant) is making
banking services simple and is available 24x7 to HDFC Bank’s customers.
Please visit the link to get a first-hand experience of banking related conversations with EVA :
https://v1.hdfcbank.com/htdocs/common/eva/index.html
You can also visit https://watson-assistant-demo.ng.bluemix.net/ for a completely new experience on
the IBM Web based text Chatbot.
Open the link in your web browser and you will land on the demo page of IBM’s banking virtual
assistant. In this demo, you will engage with a banking virtual assistant capable of
simulating a few scenarios, such as making a credit card payment, booking an appointment with a
banker or choosing a credit card. Watson understands your entries and responds accordingly.
The other online Chatbot platform, where you can get basic hands-on experience using your Gmail
account, is Google Dialogflow. You can visit https://dialogflow.com/ to experience the
Chatbot.
We all know the threat the Corona virus poses to humanity. Most of us want to assess our own risk level
at this time, but just for a risk assessment it is not advisable to go out to see a doctor or visit a
hospital. Amidst the current paranoia surrounding COVID-19, Apollo Hospitals (and many other hospitals
and medical companies) released a Chatbot to assess one’s risk level, another example of leveraging AI
in a time of need.
URL: https://covid.apollo247.com/
Types of Chatbots
Chatbots can broadly be divided into two types:
1. Rule based Chatbot
2. Machine Learning (or AI) based Chatbot
A rule-based Chatbot is used for simple conversations and cannot handle complex ones.
For example, let us create the rule below and train the Chatbot:
bot.train ( [
    'How are you?',
    'I am good.',
    'Thank you',
])
After training, if you ask the Chatbot, “How are you?”, you will get the response, “I am good.” But if
you ask ‘What’s going on?’ or ‘What’s up?’, the rule-based Chatbot won’t be able to answer, since it has
been trained to handle only a certain set of questions.
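The limitation described above can be sketched in a few lines of Python. This is a hypothetical minimal bot built on a plain lookup table (not the actual library used in the example above): any question outside the table gets a fallback reply.

```python
# A minimal rule-based chatbot: responses come from a fixed lookup table.
# Any question outside the table gets a fallback answer.
rules = {
    "how are you?": "I am good.",
    "thank you": "You're welcome.",
}

def rule_based_reply(message):
    # Normalise the input, then look for an exact rule match.
    return rules.get(message.strip().lower(), "Sorry, I don't understand.")

print(rule_based_reply("How are you?"))  # -> I am good.
print(rule_based_reply("What's up?"))    # -> Sorry, I don't understand.
```

Because matching is exact, any rephrasing of the question falls straight through to the fallback, which is precisely the weakness the text describes.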
Quick Question?
Question: Does a rule-based bot’s ability to answer questions depend on how good its rules are and how
extensive its database is?
Answer: We can have a rule-based bot that understands the value that we supply. However, the
limitation is that it won’t understand the intent and context of the user’s conversation with it.
For example: if you are booking a flight to Paris, you might say “Book a flight to Paris”, someone else
might say “I need a flight to Paris”, while someone from another part of the world may use his/her native
language. An AI-based or NLP-based bot identifies the language, context and intent and then reacts
accordingly. A rule-based bot only understands a pre-defined set of options.
2. Machine Learning (or AI) based Chatbot
Such Chatbots are advanced forms of chatter-bots capable of holding complex conversations in real
time. They process questions (using neural network layers) before responding to them. AI-based
Chatbots also learn from previous experience and reinforcement learning, so they keep evolving.
AI Chatbots are developed and programmed to meet user requests by furnishing suitable and relevant
responses. The challenge, nevertheless, lies in matching each request to the most intelligent and
closest response that would satisfy the user.
Had the rule-based Chatbot discussed earlier been an AI Chatbot and you had asked ‘What’s going on?’
or ‘What’s up?’ instead of ‘How are you?’, you would still have got a suitable response.
The IBM Watson Chatbot mentioned earlier - https://watson-assistant-mo.ng.bluemix.net/ - is in fact
an AI Chatbot.
source: https://dzone.com/articles/how-to-make-a-chatbot-with-artificial-intelligence
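By contrast, the intent matching that an AI/NLP Chatbot performs can be illustrated with a toy sketch. The word-overlap scoring and the intent names below are invented for illustration; real systems use trained language models rather than simple overlap counts.

```python
# Toy intent matcher: score each intent by how many words it shares
# with the user's input, then reply for the best-scoring intent.
intents = {
    "greeting": {"how", "are", "you", "what's", "up", "going", "on"},
    "book_flight": {"book", "need", "flight", "to"},
}
responses = {
    "greeting": "I am good.",
    "book_flight": "Which date would you like to fly?",
}

def detect_intent(message):
    words = set(message.lower().replace("?", "").split())
    # Pick the intent sharing the most words with the input.
    return max(intents, key=lambda name: len(words & intents[name]))

print(responses[detect_intent("What's up?")])                # -> I am good.
print(responses[detect_intent("I need a flight to Paris")])  # -> Which date would you like to fly?
```

Notice that ‘What’s up?’ and ‘How are you?’ both land on the same intent even though their wording differs, which a pure rule table cannot do.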
Source: https://www.techsophy.com/chatbots-need-natural-language-processing/
Natural language means the language of humans. It refers to the different forms in which humans
communicate with each other – verbal, written, or non-verbal expressions (sentiments such as sad,
happy, etc.). The technology that enables machines (software) to understand and process the
natural language of humans is called natural language processing (NLP). In other words, it is defined
as the branch of Artificial Intelligence that deals with the interaction between computers and humans
using natural language. The main objective of NLP is to read, interpret, comprehend, and coherently
make sense of human language in a way that creates value for all. Therefore, we can safely conclude
that NLP essentially comprises natural language understanding (human to machine) and natural language
generation (machine to human).
NLP is thus a sub-area of Artificial Intelligence dealing with the capability of software to process
and analyse human language, both spoken and written.
Natural language processing has found applications in various fields, listed as follows.
1. Text Recognition (in an image or a video)
You might have seen or heard about cameras that read vehicles’ number plates.
The camera captures an image of the number plate, the image is passed to the neural network layers of
the application, and the application extracts the vehicle’s number from the image. Correct extraction
of the data, however, also depends on the quality of the image.
Quick Question?
Question: Where and how is NLP used?
Answer: NLP (natural language processing) can be used in scenarios where static or predefined
answers, options and questions may not work. In fact, if you want to understand the intent and
context of the user, it is advisable to use NLP.
Let’s take the example of a pizza-ordering bot. When it comes to pre-listed pizza-topping options, you
can consider using a rule-based bot, whereas if you want to understand the intent of the user, where
one person says “I am hungry” and another says “I am starving”, it makes more sense to use NLP, in
which case the bot can understand the emotion of the user and what he/she is trying to convey.
Activity:
Do you think the application of NLP, can completely remove the language barriers? Can you write the
algorithm / steps to show how NLP will perform the language translation?
_________________________________________________________________________________
2. Summarization by NLP
NLP can not only read and understand paragraphs or a full article but can also summarize the article
into a shorter narrative without changing its meaning; it can create an abstract of the entire article.
Summarization takes place in two ways – one in which key phrases are extracted from the document and
combined to form a summary (extraction-based summarization), and the other in which the content of the
source document is condensed and rephrased (abstraction-based summarization).
https://techinsight.com.vn/language/en/text-summarization-in-machine-learning/
https://blog.floydhub.com/gentle-introduction-to-text-summarization-in-machine-learning/
Example 1
To understand the summarization process, let us take an example from real life: our judiciary. Lawyers
and judges have to read through large volumes of documents just to develop an understanding of a single
case. NLP can be leveraged here by assigning it the task of reading the case files and creating a short
abstract of each one. Judges and lawyers then read the summarized files (prepared by the NLP AI),
saving precious time that can be used to expedite the resolution of pending cases.
Example 2
Let us take one more example:
Source text: Joseph and Mary rode on a horse drawn carriage to attend the annual fair in London. In
the city, Mary bought a new dress for herself.
Extractive summary: Joseph and Mary attend annual fair in London. Mary bought new dress.
Have you noticed that in this case the extractive summary has been formed by joining key phrases
picked directly from the source text?
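A toy extraction-based summarizer can be sketched as scoring each sentence by the frequency of its words and keeping the top-scoring sentence. This is only an illustration of the idea; real summarizers use much richer features.

```python
from collections import Counter

def extractive_summary(text, n=1):
    # Split into crude sentences and count word frequencies over the whole text.
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    freq = Counter(text.lower().split())
    # Score each sentence by the total frequency of its words; keep the top n.
    scored = sorted(sentences,
                    key=lambda s: sum(freq[w] for w in s.lower().split()),
                    reverse=True)
    return ". ".join(scored[:n]) + "."

text = ("Joseph and Mary rode on a horse drawn carriage to attend the annual fair in London. "
        "In the city, Mary bought a new dress for herself.")
print(extractive_summary(text))
```

On this input the first sentence scores higher and is kept as the one-sentence summary; unlike the hand-made summary above, a frequency count cannot drop words inside a sentence, which is why real extractive systems work at the phrase level.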
3. Information Extraction
Information extraction is the technique of finding specific information in a document, or of searching
for the document itself. It automatically extracts structured information, such as entities,
relationships between entities, and attributes describing entities, from unstructured sources.
Example
For example, a school’s Principal writes an email to all the teachers in his school -
“I have decided to organize a teachers’ meet tomorrow. You all are requested to please assemble in my
office at 2.30 pm. I will share the agenda just before the meeting.”
NLP can extract the meaningful information for teachers:
What: Meeting called by Principal
When: Tomorrow at 2.30 pm
Where: Principal’s office
Agenda: Will be shared before the meeting
Another very common example is Search Engines like Google retrieving results using Information
Extraction.
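The extraction of What/When/Where from the Principal's email above can be mimicked with a few hand-written patterns. The regular expressions below are illustrative stand-ins for the trained models a real information-extraction system would use.

```python
import re

email = ("I have decided to organize a teachers' meet tomorrow. You all are requested "
         "to please assemble in my office at 2.30 pm. I will share the agenda just "
         "before the meeting.")

# Tiny hand-written patterns standing in for a learned extractor.
when_day  = re.search(r"\b(today|tomorrow)\b", email, re.I)
when_time = re.search(r"\b\d{1,2}\.\d{2}\s*(?:am|pm)\b", email, re.I)
where     = re.search(r"\bin my office\b", email, re.I)

print("When :", when_day.group(), "at", when_time.group())   # When : tomorrow at 2.30 pm
print("Where:", "Principal's office" if where else "unknown")
```

Hand-written patterns break as soon as the wording changes, which is why production systems learn extractors from labelled data instead.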
4. Speech processing
The ability of a computer to hear human speech and analyse and understand the content is called speech
processing. When we talk to our devices like Alexa or Siri, they recognize what we are saying to them.
For example:
You: Alexa, what is the date today?
Alexa: It is 18 March 2020.
What happens when we speak to our device? The device’s microphone hears our audio and plots a graph of
the sound frequencies. Just as light has a standard frequency for each colour, so does sound: every
sound (phoneme) has a unique frequency graph. This is how NLP recognizes each sound and composes an
individual’s words and sentences.
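The idea that every sound has a characteristic frequency can be made concrete by estimating the frequency of a synthetic tone from its samples. Counting zero crossings, as below, is a crude stand-in for the spectral analysis real speech systems perform; the tone and sample rate are invented for illustration.

```python
import math

def estimate_frequency(samples, sample_rate):
    # Count sign changes; each full cycle of a sine wave crosses zero twice.
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a * b < 0)
    duration = len(samples) / sample_rate
    return crossings / (2 * duration)

# One second of a 440 Hz tone sampled at 8000 Hz.
rate = 8000
tone = [math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]
print(round(estimate_frequency(tone, rate)))  # close to 440
```

A speech system does the same kind of analysis far more finely, separating the mixture of frequencies in each slice of audio before mapping them to phonemes.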
The table below shows the step-wise process of how speech recognition works:
Step 3: An inbuilt algorithm converts amplitude to frequency and then plots a graph.
Step 4: The plotted graph is converted to equivalent text, which is further analysed by NLP.
Activity
Have you noticed that virtual assistants like Alexa and Google Assistant work very well when they are
connected to the internet, and cannot function when they are offline?
Can you do a web search to find out the reason for the same? Please summarize your
findings/understanding below:
2. CoreNLP (https://stanfordnlp.github.io/CoreNLP/)
3. OpenNLP (https://opennlp.apache.org/) – an open-source NLP toolkit for data analysis and sentiment analysis
4. spaCy (https://spacy.io/) – for data extraction, data analysis, sentiment analysis and text summarization
Object detection
Optical Character Recognition
Fingerprint Recognition
Source: https://towardsdatascience.com/everything-you-ever-wanted-to-know-about-computer-vision-heres-a-look-why-it-s-so-
awesome-e8a58dfb641e & https://nordicapis.com/8-biometrics-apis-at-your-fingertips/
How many of you have seen video surveillance cameras installed in schools, shopping malls or other
public places? Do you know what it does? What is the purpose behind installing these cameras? I am
sure your answer would be – safety and security.
Quick Question?
Q1: List three places/locations (like the library, playground etc.) in your school where surveillance
cameras have been installed.
Q2: In your opinion, who does the actual surveillance – the camera or the person sitting behind the
device? Justify your answer.
Q3: What features would you like to add to a surveillance camera to make it a truly smart surveillance
camera?
From the above exercise you must have understood that although a camera, or any similar device, can
capture a picture of a particular moment, it can’t really analyse or make sense of it. The device (a
camera in this case) has only the limited capacity to capture an image of objects; it is not able to
recognize them. Taking pictures is not the same as seeing, recognising or understanding.
Watch Video 1 & 2 for which links have been given below and answer the related questions:
Video 1: https://www.youtube.com/watch?v=GN7RKRFtZiQ
Can you explain what you saw in this video?
Video 2: https://www.youtube.com/watch?v=1_B44HO_PAI
After watching this video, do you still believe that a device (camera in this case) can only take pictures
of objects but can’t recognize them?
Before we dive deep into the interesting world of computer vision, let us perform an online activity:
Activity:
Open the URL in your web browser – https://studio.code.org/s/oceans/stage/1/puzzle/1
‘AI for Oceans’ by Code.org is designed to quickly introduce students to machine learning, a branch
of artificial intelligence based on the idea that machines can recognise patterns, make sense of
data and make decisions with very little human involvement. Over the years, machine learning and
computer vision have come to work very closely together, and machine learning has greatly improved
the effectiveness of computer vision. Students will explore how training data is used to enable a
machine to see and classify objects.
Activity Plan
There are 8 levels, and you will be allowed not more than an hour to complete all eight. It is not
mandatory for students to complete all eight levels to understand the concept of computer vision,
although it is recommended that they do.
Activity Outcome
Gaining a basic overview of AI and computer vision.
Activity details
Level 1:
The first level introduces a branch of AI called Machine Learning (ML) and gives examples of ML in our
daily life, such as email filters, voice recognition, auto-complete text and computer vision.
Level 2 – 4:
Students can proceed through the first four levels on their own or with a partner. In order to program
the AI, use the buttons to label each image as either "Fish" or "Not Fish". Each image and its label
become part of the data used to train the Computer Vision model. Once the AI model is properly trained,
it will classify an image as ‘Fish’ or ‘Not Fish’. Fish will be allowed to live in the ocean, but ‘Not
Fish’ objects will be grabbed and stored in a box, to be taken out of the ocean or river later.
At the end of Level 4, we should explain to students how the AI identifies objects using the training
data; in this case the labelled fish images are our training data. This ability of AI to see and
identify an object is called ‘Computer Vision’. Just as humans can see an object and recognise it, an
AI application (using a camera) can also see and recognise an object.
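The labelling loop in the activity can be mimicked in code: each labelled example becomes a (features, label) pair, and a toy nearest-neighbour rule then classifies new objects. The numeric features below are entirely made up for illustration; the real activity learns from image pixels.

```python
# Hypothetical features for each labelled example: (fin_score, straight_edge_score),
# each scored 0-10 by the person doing the labelling.
training_data = [
    ((9, 1), "Fish"), ((8, 2), "Fish"), ((7, 0), "Fish"),
    ((1, 9), "Not Fish"), ((0, 8), "Not Fish"), ((2, 7), "Not Fish"),
]

def classify(features):
    # Nearest neighbour: copy the label of the closest training example.
    def dist(example):
        (f, e), _ = example
        return (f - features[0]) ** 2 + (e - features[1]) ** 2
    return min(training_data, key=dist)[1]

print(classify((8, 1)))  # -> Fish
print(classify((1, 8)))  # -> Not Fish
```

The more labelled examples we supply, the better the boundary between ‘Fish’ and ‘Not Fish’ becomes, which is exactly why the activity asks students to label many images.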
Advanced learners may go ahead with level 5 and beyond.
Quick Question?
Based on the activity you just did, answer the following questions:
Q1: Can we say ‘computer vision’ acts as the eye of a machine / robot?
Q2: In the above activity, why did we have to supply many examples of fish before the AI model actually
started recognizing the fish?
Q3: Can a regular camera see things?
Q4: Let me test your imagination: you have been tasked with designing and developing a prototype of a
robot to clean your nearby rivers and ponds. Can you use your imagination and write 5 lines about the
features of this ‘future’ robot?
With the supporting materials provided in the section above, you must have developed a reasonably
good understanding of Computer Vision. Let us reinforce the learning by quickly going over it again in
brief.
Computer vision is a sub-set of Artificial Intelligence which enables the machines (robot/ any other
device with camera) to see and understand the digital images – photographs or videos. Computer Vision
has made it possible to make good use of the critical capabilities of AI by giving machines the power of
vision. Computer vision enables machines/ robots to inspect objects and accomplish certain tasks
making them useful for both homes and offices.
(Source: https://www.bitbrain.com/sites/default/files/styles/blog_1200x500/public/robot-maquina-inteligencia-artificial-con-emociones-
sentimientos.png?itok=S_50rjPm)
Looking at the picture, the human eye can easily tell that a train engine has crashed through the
station wall (https://en.wikipedia.org/wiki/Montparnasse_derailment). Do you think the computer also
views the image the same way as humans? No, the computer sees an image as a 2-dimensional matrix of
numbers (or a 3-dimensional matrix in the case of a colour image).
The above image is a grayscale image, which means each value in the 2D matrix represents the
brightness of a pixel. The numbers in the matrix range from 0 to 255, where 0 represents black,
255 represents white, and the values in between are shades of grey.
For example, the above image has been represented by a 9x9 matrix: 9 pixels horizontally and 9 pixels
vertically, a total of 81 pixels (this is a very low pixel count for an image captured by a modern
camera; treat this just as an example).
In a grayscale image, each value represents the brightness or darkness of a pixel, which means the
grayscale image is composed of only one channel. A colour image, however, is a mix of the three primary
colours (Red, Green and Blue), so a colour image has three channels.
Since colour images have three channels, computers see a colour image as a 3-dimensional matrix. If we
had to represent the above locomotive image in colour, the 3D matrix would be 9x9x3. Each pixel in this
colour image would have three numbers (each ranging from 0 to 255) associated with it, representing the
intensity of red, green and blue in that particular pixel.
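This matrix view of an image can be made concrete with tiny made-up pixel values (not the actual locomotive image):

```python
# A 3x3 grayscale image: one 2D matrix, values from 0 (black) to 255 (white).
gray = [
    [  0, 128, 255],
    [ 64, 200,  32],
    [255,   0, 180],
]
print(gray[1][2])  # brightness of the pixel in row 1, column 2 -> 32

# The same idea in colour: each pixel holds three values (R, G, B).
colour = [[[255, 0, 0], [0, 255, 0], [0, 0, 255]]]  # one row: red, green, blue pixels
r, g, b = colour[0][2]
print(r, g, b)  # -> 0 0 255 (a pure blue pixel)
```

Everything a computer vision model does starts from grids of numbers like these; the model never ‘sees’ a train, only the values.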
(Source : http://cs231n.stanford.edu/slides/2017/cs231n_2017_lecture11.pdf)
1. Semantic Segmentation
Semantic segmentation is closely related to image classification. Image classification is a process in
Computer Vision where an image is assigned a class depending on its visual content. Basically, a set of
classes (objects to identify in images) is defined, and a model is trained to recognize them with the
help of labelled example photos. In simple terms, the model takes an image as input and outputs a class
(cat, dog etc.), or a set of class probabilities of which one has the highest chance of being correct.
For humans this ability comes naturally and effortlessly, but for machines it is a fairly complicated
process.
For example, the cat image shown below has a size of 248x400x3 pixels (297,600 numbers).
(Source : http://cs231n.github.io/assets/classify.png)
The image classification model takes this image as input and reports 4 possibilities, and we can see
that it assigns the highest probability to ‘cat’. The classification model can be refined and trained
further to produce the single label ‘cat’ as output.
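The ‘probability of classes’ idea can be sketched with the softmax function, which turns a model's raw scores into probabilities that sum to 1. The labels and scores below are invented for illustration:

```python
import math

def softmax(scores):
    # Exponentiate (shifted by the max for numerical stability)
    # and normalise so the outputs sum to 1.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

labels = ["cat", "dog", "hat", "mug"]
scores = [4.0, 1.5, 0.5, 0.2]   # hypothetical raw outputs of a classifier
probs = softmax(scores)
best = labels[probs.index(max(probs))]
print(best)  # -> cat (the class with the highest probability)
```

Reporting the whole probability list, rather than just the winner, is what lets us see how confident the model is in ‘cat’ versus the alternatives.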
Source: https://medium.com/analytics-vidhya/image-classification-vs-object-detection-vs-image-segmentation-f36db85fe81
3. Object Detection
When human beings see a video or an image, they immediately identify the objects present in it. This
intelligence can be duplicated using a computer. If there are multiple objects in the image, the
algorithm will identify all of them and localise each one (put a bounding box around it). You will
therefore have multiple bounding boxes and labels in the image.
Source: https://pjreddie.com/darknet/yolov1/
4. Instance segmentation
Instance segmentation is the Computer Vision technique that identifies and distinctly outlines each
object of interest appearing in an image. The process creates a pixel-wise mask for each object in the
image and gives us a far more granular understanding of the object(s) in it. As you can see in the
image below, even objects belonging to the same class are shown in different colours.
Source: https://towardsdatascience.com/detection-and-segmentation-through-convnets-47aa42de27ea
Activity
Given a grayscale image, one simple way to find edges is to look at two neighbouring pixels and take the
difference between their values. If it’s big, this means the colours are very different, so it’s an edge.
The grids below are filled with numbers that represent a grayscale image. See if you can detect edges
the way a computer would do it.
Try it yourself!
Grid 1
If the values of two neighbouring squares on the grid differ by more than 50, draw a thick line between
them.
Grid 2
If the values of two neighbouring squares on the grid differ by more than 40, draw a thick line between
them.
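The neighbour-difference rule used in the grids can also be written out in code. The small grid and the threshold of 50 below are made up, playing the same role as the grids and thresholds above:

```python
def edge_pairs(grid, threshold):
    # Return coordinates (row, col) where a cell and its right-hand
    # neighbour differ by more than the threshold: an 'edge' between them.
    edges = []
    for r, row in enumerate(grid):
        for c in range(len(row) - 1):
            if abs(row[c] - row[c + 1]) > threshold:
                edges.append((r, c))
    return edges

grid = [
    [10, 10, 200],
    [12, 11, 210],
]
print(edge_pairs(grid, 50))  # -> [(0, 1), (1, 1)]
```

A full edge detector would also compare vertical neighbours, but the idea is identical: a big jump in brightness between adjacent pixels marks an edge.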
Story -1: Fani, a rare summer cyclone in the Bay of Bengal, hit eastern India on May 03,2019. It is one
of the strongest cyclones to have hit India in the last 20 years, according to the Indian government’s
meteorological department. Storm surges and powerful winds reaching 125 mph blew off roofs,
damaged power lines, and uprooted countless trees.
But the worst-affected state, Odisha, has been successful in keeping the loss of life and numbers of
affected people to a minimum. This is the result of a very effective strategy of disaster preparation and
effective weather forecasting.
(Source: https://qz.com/india/1618717/indias-handling-of-cyclone-fani-has-a-lesson-for-the-us/ )
Story-2: KOLKATA | NEW DELHI: Laxman Vishwanath Wadale, a 40-year-old farmer from Maharashtra's
Jalna district, spent nearly Rs 25,000 on fertilisers and seeds for his 60-acre plot after the Indian
Meteorological Department (IMD) said in June that it stands by its earlier prediction of normal monsoon.
Today, like lakhs of farmers, Wadale helplessly stares at parched fields and is furious with the weather
office that got it wrong - once again. So far, rainfall has been 22% below normal if you include the
torrential rains in the northeast while Punjab and Haryana are being baked in one of the driest summers
ever with rainfall 42% below normal.
( Source : https://economictimes.indiatimes.com/news/economy/agriculture/indian-meteorological-departments-high-failure-rate-
prompts-states-to-set-up-their-own-systems/articleshow/15133072.cms)
Can you imagine how weather forecasting impacts people’s lives? Accurate weather forecasting allows
farmers to better plan for harvesting. It allows airlines to fly their passengers with safety. Electricity
departments can make decisions about their capacity needs during summer and winters. As we saw in
story-1, it allows governments to better prepare their responses to natural disasters that impact the
lives of millions.
Project 1: Make a report on the tools used for weather forecasting. Your report should not be more than
a page long.
Weather forecasting involves gathering satellite data, identifying patterns in the observations made, and
then computing the results to arrive at accurate weather predictions. Scientists are now using AI for
weather forecasting to obtain refined and accurate results, fast!
In the current model of weather forecasting, scientists gather satellite data (temperature, wind,
humidity, etc.) and compare and analyse it against a mathematical model based on past weather patterns
and the geography of the region in question. This is done in real time to prevent disasters.
Being primarily human dependent (the mathematical model cannot be adjusted in real time), this
approach faces many challenges in forecasting.
On the other hand, Artificial Intelligence (AI) uses computer-generated mathematical programs and
computer vision technology to identify patterns and make relevant weather predictions. This has
resulted in scientists preferring AI for weather forecasting. One of the key advantages of the AI based
model is that it adjusts itself with the dynamics of atmospheric changes.
The image below will help you to understand how companies are leveraging Computer vision in weather
prediction.
Leading IT companies have been doing their intensive research by leveraging technologies like AI, IoT,
and Big Data:
1. IBM Global High-resolution Atmospheric Forecasting System (IBM GRAF) is a high-precision global
weather model that updates hourly to provide a clearer picture of weather activity around the globe
(https://www.ibm.com/weather)
2. Panasonic has been working on its weather forecasting model for years. The company makes
TAMDAR, a speciality weather sensor installed on commercial airplanes.
In the commodity market, production is normally local but consumption is global. A price forecast is
therefore beneficial for farmers, policymakers and industries. Commodities such as agricultural products
are more exposed to weather, demand and price risks than other products, and because of these
vulnerabilities small farmers often resort to distress sales. Farmers suffer both when crops fail and when
there is a bumper harvest. A reliable price forecasting tool would allow producers to make informed
decisions to manage their price risk.
There are reliable forecasting techniques (mathematical and statistical models) currently in use, but they
all work at the macro level, i.e. the national or state level. Switching from classical (mathematical)
methods to a machine learning model would bring multiple benefits; for example, machine learning can
predict at the micro level, i.e. for an individual market (mandi).
An AI-based commodity forecasting system can produce transformative results:
1. Its accuracy is much higher than that of classical forecasting models.
2. AI can work on a broad range of data, which lets it reveal new insights.
3. An AI model is not rigid like a classical model, so its forecasts are always based on the most recent
inputs.
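The micro-level forecasting idea can be sketched with a toy model. The monthly mandi prices below are made up for illustration, and a simple linear trend stands in for a real machine learning model:

```python
# A toy sketch of price forecasting for a single hypothetical mandi (market).
# The prices are invented; a real system would learn from far richer data.
import numpy as np

months = np.arange(1, 7)                                  # months 1..6
prices = np.array([20.0, 21.5, 23.0, 24.5, 26.0, 27.5])   # hypothetical INR/kg

# Fit a simple linear trend (a stand-in for a trained model).
slope, intercept = np.polyfit(months, prices, 1)

# Forecast the price for month 7 from the fitted trend.
forecast_month7 = slope * 7 + intercept
print(round(forecast_month7, 2))  # → 29.0
```

Because the toy prices rise by a steady 1.5 per month, the fitted trend extrapolates cleanly; real market data would be noisier and need a more capable model.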
Wikipedia defines the self-driving car as: “A self-driving car, also known as an autonomous vehicle (AV),
driverless car, robot car, or robotic car, is a vehicle that is capable of sensing its environment and moving
safely with little or no human input.”
Self-driving cars combine a variety of sensors to perceive their surroundings, such
as radar, lidar, sonar, GPS, odometry and inertial measurement units.
_________________________________________________________________________________
Question 2: In the age of self-driving cars, do we need zebra crossings? After all, self-driving cars can
potentially make it safe to cross a road anywhere.
_________________________________________________________________________________
Having attempted the activities, let us now understand how self-driving cars work.
Self-driving cars or autonomous cars, work on a combination of technologies. Here is a brief introduction
to them:
1. Computer Vision: Computer vision allows the car to see/sense its surroundings. It uses:
‘Camera’ – captures pictures of the surroundings, which are then passed to a deep learning model
for processing. This helps the car know when a light is red, where there is a zebra crossing, etc.
‘Radar’ – a detection system to find out how far away or close other vehicles on the road are.
‘Lidar’ – a sensing method that measures the distance to a target by emitting laser rays. The lidar
is usually placed in a spinning unit on top of the car so that it can spin around very fast, scanning
the environment all around. Here you can see a Lidar placed on top of the Google car.
2. Deep Learning: This is the brain of the car, which takes driving decisions based on the information
gathered through various sources such as computer vision.
3. Robotics: The self-driving car has a brain and vision, but its brain still needs to connect with the
other parts of the car to control and navigate effectively. Robotics helps transmit the driving
decisions (made by deep learning) to the steering, brakes, throttle, etc.
4. Navigation: Using GPS, stored maps, etc., the car navigates busy roads and hurdles to reach
its destination.
Artificial Intelligence is autonomous – it can make independent decisions without human input,
interference or intervention, and works silently in the background without the user’s knowledge.
These systems do not depend on explicit human programming; instead they learn on their own
from data.
AI has the capacity to predict and adapt – its ability to understand data patterns is used for
future predictions and decision-making.
AI is continuously learning – it learns from data patterns.
AI is reactive – it perceives a problem and acts on that perception.
AI is futuristic – this cutting-edge technology is expected to be used in many more fields in the future.
There are many applications and tools, backed by AI, which have a direct impact on our daily life.
So, it is important for us to understand, in a broad sense, what kinds of systems can be developed
using AI.
Activity
You have to design a recommendation system for your school library. What kind of information or data
would you like to collect to help train your recommendation system? Mention two such data points.
[Hint: the type of books (science fiction, thriller, etc.) students borrow, their comments, etc.]
(Source:https://cdn0.tnwcdn.com/wp-content/blogs.dir/1/files/2017/11/Screen-Shot-2017-11-24-at-13.17.49-796x399.png)
Activity
Does this picture remind you of something? Can you please share your thoughts about this image in the
context of AI?
--------------------------------------------------------------------------------------------------------------------------------
The dream of making machines that think and act like humans is not new. We have long tried to create
intelligent machines to ease our work. These machines perform at greater speed, have
higher operational ability and accuracy, and are more capable of undertaking highly
tedious and monotonous jobs than humans.
Humans do not always depend on pre-fed data, as AI does. Human memory, its computing power,
and the human body as an entity may seem insignificant compared to a machine’s hardware and
software infrastructure. But the depth and layers present in our brains are far more complex and
sophisticated, and machines still cannot match them, at least in the near future.
The AI machines we see these days are not true AI. These machines are very good at performing
specific types of jobs.
Think of Jarvis in “Iron Man” and you’ll get a sneak peek of Artificial General Intelligence (AGI), which
is nothing but human-like AI. We still have a long way to go in attaining human-like AI, though.
Source: https://www.researchgate.net/figure/AI-and-human-non-automated-decision-Source-Koeszegi-2019-referring-to-
Agrawal_fig2_333720228
3. Cognitive computing improves human decision making. 4. Cognitive computing tries to mimic the
human brain.
Examples of cognitive computing software: IBM Watson, DeepMind, Microsoft Cognitive Services, etc.
In summary, Cognitive Computing can be defined as a technology platform that is built on AI and signal
processing, to mimic the functioning of a human brain (speech, vision, reasoning etc.) and help humans
in decision making.
Now that we have understood what Cognitive Computing is, let us explore the need for the same.
Enormous amounts of unstructured data and information (Facebook pages, Twitter posts, WhatsApp
data, sensor data, traffic data, traffic signal data, medical reports, and so on) are available to us in this
digital age. We need an advanced technology to make sense of this data (traditional computing cannot
process such volumes) and help humans take better decisions. Because Cognitive Computing generates
new knowledge using existing knowledge, it is viewed as a platform holding the potential of future
computing.
4. AI and Society
AI is surely changing the world, but at the same time there is a lot of hype and there are many
misconceptions about it. For citizens, businesses and governments to take full advantage of AI, it is
imperative that we have a realistic view of it.
AI will impact almost every walk of society: health, security, culture, education, jobs and businesses.
As with any change, AI has both positive and negative influences on society, and which prevails depends
on how we leverage it.
1. Healthcare
IBM Watson (an AI tool by IBM) can predict the development of a particular form of cancer up to 12
months before its onset with almost 90% accuracy.
(https://www.beckershospitalreview.com/artificial-intelligence/ibm-ai-predicts-breast-cancer-up-to-a-
year-in-advance-using-health-records-mammograms.html)
There are many such developments happening in the field of medical science. To control the outbreak
of the coronavirus, China leaned on Artificial Intelligence (AI) and data science to track cases and fight
the pandemic. Our healthcare sector is moving towards a future where robots and AI tools will work
alongside doctors.
Scientists and researchers are working hard to find opportunities to apply AI technology in almost all
sectors, such as transportation, education and agriculture, but healthcare has been the focal point for
AI. Can you find two reasons why the impact of AI is greatest in the healthcare sector?
2. Transportation
Transportation is a field where artificial intelligence along with machine learning has given us major
innovations.
Autonomous vehicles like cars, trucks etc. use advanced AI capabilities that offer features like lane-
changing systems, automated vehicle guidance, automated braking, use of sensors and cameras for
collision avoidance, and analysing information in real time, thus saving human lives by reducing road
accidents.
3. Disaster Prediction
AI is considered one of the best tools for the prediction of natural occurrences. There is an AI model that
can almost perfectly predict the weather for the next couple of days, which was unimaginable before
the advent of AI.
4. Agriculture
Farming is a sector that faces multiple challenges, such as unpredictable weather, limited availability of
natural resources, and growing populations. With the help of AI, farmers can now analyse a variety of
factors in real time, such as weather conditions, temperature, water usage and soil conditions, collected
from their farms. Real-time data analytics helps farmers maximize their crop yields and, in turn, their
profits too.
Having discussed the advantages of AI, it is important to also understand how AI can negatively affect
our society. With all of AI’s benefits come some significant disadvantages as well, but that is natural
for any technology!
Listed below are some of the challenges posed by AI:
1. Integrity of AI
AI systems learn by analysing huge volumes of data. What are the consequences of using training data
that is biased in favour of a particular class or section of customers or users?
In 2016, the professional networking site LinkedIn was discovered to have a gender bias in its system.
When a search was made for the female name ‘Andrea’, the platform would show
recommendations/results for male users named ‘Andrew’ and its variations. However, the site did not
show similar recommendations for male names: a search for the name ‘Andrew’ did not prompt users
to ask if they meant ‘Andrea’. The company said this was due to a gender bias in their training data,
which they later fixed.
2. Technological Unemployment
Due to heavy automation (with the advent of AI and robotics), some sets of people will lose their jobs,
which will be taken over by intelligent machines. There will be significant changes in the workforce
and the market: some high-skilled jobs will be created, while some roles and jobs will become obsolete.
3. Data Monopoly
Data is the fuel of AI: the more data you have, the more intelligent a machine you can develop.
Technology giants are investing heavily in AI and data acquisition projects. This gives them an unfair
advantage over their smaller competitors.
4. Privacy
In this digitally connected world, privacy is becoming next to impossible. Numerous consumer products,
from smart home appliances to computer applications, have features that make them vulnerable to
data exploitation by AI. AI can be used to identify, track and monitor individuals across multiple devices,
whether they are at work, at home, or in a public location. To complicate things further, AI does not
forget anything. Once AI knows you, it knows you forever!
Route: X → Cross the road → Final Destination
In neural network terminology, the activity of ‘crossing the road’ is termed a neuron. So, input ‘X’ goes
into a single neuron, ‘Crossing the road’, to produce the output/goal, which in this case is the
‘Final Destination’. In this example the start and the end are connected by a straight line. This is an
example of a simple neural network.
The life of a delivery man is not so simple and straightforward in reality. He starts from a particular
location and goes to different locations across the city. For instance, the delivery man could choose
multiple paths to fulfil his/her deliveries, as shown in the image below:
However, some routes (or “paths”) are better than the rest. Let us assume that all paths take the same
time, but some are really bumpy while others are smooth. Maybe the path chosen in ‘Option 3’ is
bumpy, so the delivery man has to burn a lot more fuel on the way (a loss)! Whereas ‘Option 2’ is
perfectly smooth, so the delivery man does not lose anything!
In this case, the delivery man would definitely prefer ‘Option 2’ (Address 1 – Destination 2 – Final
Destination), as he/she will not want to burn extra fuel! So the goal of Deep Learning here is to assign
‘weights’ to each path from address to destination so that the delivery man finds the optimal path.
How does this work?
As we discussed, the goal of the deep learning exercise is to assign weights to each path from start to
finish (start – address – destination – Final Destination). To do this for the example discussed above,
each time a delivery man goes from start to finish, the fuel consumption for every path is computed.
Based on this parameter (in reality the number of parameters can go up to 100 or more), a cost for each
path is calculated, and this cost is called the ‘Loss Function’ in deep learning.
As we saw above in our example, ‘Option 3’ (Address 3 – Destination 3 – Final Destination) lost a lot of
fuel, so that path has a large loss function. ‘Option 2’ (Address 1 – Destination 2 – Final Destination)
cost the delivery man the least fuel and therefore has a small loss function, making it the most efficient
route!
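The loss-function idea can be sketched in a few lines of Python. The fuel figures below are invented for illustration; the best path is simply the one with the smallest loss:

```python
# A minimal sketch of the delivery-man example: treat fuel burnt per trip as
# the "loss" of each candidate path and pick the path with the smallest loss.
# The litre values are made up for illustration.
fuel_burnt = {
    "Option 1": 7.0,   # hypothetical litres per trip
    "Option 2": 4.0,   # the smooth path
    "Option 3": 9.5,   # the bumpy path
}

# Choosing the path that minimises the loss.
best_path = min(fuel_burnt, key=fuel_burnt.get)
print(best_path)  # → Option 2
```

A real deep learning system does the same thing at scale: it adjusts weights so that the computed loss over many examples becomes as small as possible.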
The above picture is a representation of a neural network of the most efficient path to be taken by the
deliveryman to reach the goal i.e. to go from starting point ‘X’ to ‘Final Destination’ using the best
possible route (which is most fuel efficient). This is a very small neural network consisting of 3 neurons
and 2 layers (address and destination).
In reality, neural networks are not as simple as the example discussed above. They may look like the
one shown below!
(Source - http://www.rsipvision.com/exploring-deep-learning/)
The term ‘deep’ in Deep Learning refers to the many layers you will find in a neural network. This
closely relates to how our brains work. The neural network shown above is a network of 3 hidden layers
(hidden layer 1, hidden layer 2 and hidden layer 3), with each layer having 9 neurons.
Let us have a quick quiz now!
Question 1: The size of an image is represented as 600 x 300 x 3. What would be the type of the
image?
a) Jpeg Image
b) Grayscale image
c) Colour Image
d) Large image
Question 2: Which of the following have people traditionally done better than computers?
a) Recognizing relative importance
b) Detecting Emotions
c) Resolving Ambiguity
d) All of the above
Question 3: You have been asked to design an AI application which will prepare the ‘minutes of the
meeting’. Which part of AI will you use to develop your solution -
a) Computer Vision
b) Python Programming
c) Chatbot
d) Natural Language Processing (NLP)
1. In this exercise, you will learn about the three components of an artificial intelligence (AI) system
2. You will then learn about the role of training data in an AI system
1. Go to: https://teachablemachine.withgoogle.com/
5. Click on ‘Upload’ in Class 1/Dogs and select ‘Choose images from your files or, drag and drop here’ and
then upload all the dog images from the dataset from here
8. Test your model with a sample image downloaded from the web
Activity 2
Click on https://teachablemachine.withgoogle.com/v1/
1. Identify the three parts of an AI system in the teachable machine – Input, Learning, Output
2. Follow the tutorial
Hit refresh. This time click “skip the tutorial.” Train the same classifier with your face and hands. What
happens when:
2.What happens when you increase the number of images in your dataset? Make sure both classes
have at least ten images.
3. If you’ve mainly been training with one hand up, try using the other hand. What happens when
your test dataset is different from your training dataset?
Image Datasets
Three different datasets include:
Initial Training Dataset: These are the images students should use to “teach” their machine learning
model which image is a cat and which image is a dog. Note that there are many more cats, and that the
cats are more diverse in appearance than the dogs. This means that the classifier will classify cats more
accurately than dogs.

Test Dataset: These are the images that students should use to test their classifier after training.
Students should show these images to their model and record whether the classifier predicts that the
image is of a dog or a cat. Note: students should not use these images to teach their classifier. If an
image is used to train a classifier, the machine will have already recorded the corresponding label for
that image; showing this image to the machine during the testing phase will not measure how well the
model generalizes.

Recurating Dataset: This is a large assortment of images students can use to make their training dataset
of cats and dogs larger and more diverse.
The test dataset should be used twice, once for testing students’ initial classifier and again for testing their
recurated dataset.
Learning Outcomes:
1. By the end of this unit, students are expected to have foundation level
understanding of Linear Algebra, Statistics, various kinds of graphs to visualize
data and set theory.
2. Students should be in a position to relate real world problems with these
mathematical concepts.
3. Students should be curious enough to explore deeper concepts of the
application aspects of mathematics.
Pre-requisites: Knowledge of Grade X Mathematics
Key Concepts: Matrices, Statistics, Set theory, Data representations
1. Introduction to Matrices
We all know that computers understand only numbers (binary, hexadecimal, etc.). How, then, do you
think devices (computers, mobile phones, digital cameras, etc.) store images?
Let us capture the image of a pet dog using the mobile camera
But for your mobile, the above image is like a grid as written below:
The grid above is a matrix – which is what we are going to learn about now!
Matrices (and linear algebra more broadly) are often called the mathematics of data. Linear algebra is
arguably a pillar of the study of Artificial Intelligence, and this topic is therefore advised as a
prerequisite to getting started with the study of Artificial Intelligence.
1.1 Matrix
When we represent a set of numbers in the form of m horizontal lines (called rows) and n
vertical lines (called columns), the arrangement is called an m x n (read “m by n”) matrix.
If A= | 1 2 3|
|4 5 6|
|7 8 9|
The top row is row 1. The leftmost column is column 1. This matrix is a 3x3 matrix because it
has three rows and three columns. In describing matrices, the format is:
rows X columns
Each number that makes up a matrix is called an element of the matrix. The elements in a matrix
have specific locations.
The upper left corner of the matrix is [row 1 x column 1]. In the above matrix the element at row
1 column 1 is the value 1. The element at [row 2 x column 3] is the value 6.
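If you want to experiment, the same matrix can be held as a nested list in Python. Note that Python indexes from 0, so “row 1, column 1” in the text becomes index [0][0] in code:

```python
# The 3x3 matrix A from the text, stored as a nested Python list.
# Python uses 0-based indexing: row 1, column 1 is A[0][0].
A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

print(A[0][0])  # row 1, column 1 → 1
print(A[1][2])  # row 2, column 3 → 6
```

You can use the same indexing to check your answers to the quick questions that follow.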
Quick Question
Question 1: What is the location of value 8?
Question 2: What is the value at location row 3 x column 2?
Activity 1
Mohan purchased 3 Math books, 2 Physics books and 3 Chemistry books. Sohan purchased 8
Math books, 7 Physics books and 4 Chemistry books.
________________________________________________________________________
Activity 2
What do you see when you look at the above image? A colorful pattern - easy guess!
Can you think of how to represent it so that a computer can also understand or process it?
I know you are too young to solve this; all I want to see is your approach.
If you thought that matrix representation is the solution, you got it right!
NOTE: You were able to identify the pattern because the human brain has gone through millions of
years of evolution. We have somehow trained our brains to perform this task automatically, but making
a computer do the same task is not easy. Before we work on identifying attributes in an image, let us
understand: how does a machine store an image?
You know that computers are designed to process only numbers. So how can an image like the one
above, with multiple attributes such as color, height and width, be stored in a computer? This is
achieved by storing the pixel intensities in a construct called a matrix. This matrix can then be
processed to identify colors and other attributes.
So any operation which you want to perform on this image would likely use matrices at the back
end.
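As a tiny illustration, here is a hypothetical 3 x 3 grayscale “image” stored as a matrix of pixel intensities (0 = black, 255 = white); a colour image would use three such matrices, one each for the red, green and blue channels:

```python
# A made-up 3x3 grayscale image: each entry is a pixel intensity (0-255).
image = [
    [  0, 128, 255],
    [ 64, 192,  32],
    [255,   0, 128],
]

# Any image operation works on this matrix; for example, inverting the image
# (making dark pixels light and light pixels dark):
inverted = [[255 - pixel for pixel in row] for row in image]
print(inverted[0])  # → [255, 127, 0]
```

Real photographs are simply much larger versions of the same idea, with millions of such entries.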
1. Row Matrix: A matrix having a single row.
A = [ 1 3 -5 ]
2. Column Matrix: A matrix having a single column.
A = |  1 |
    |  3 |
    | -5 |
3. Square Matrix: A matrix in which number of rows are equal to number of columns.
A= | 1 2 3|
|4 5 6|
|7 8 9|
4. Diagonal Matrix: A matrix with all elements zero except its leading diagonal.
A= | 2 0 0|
| 0 3 0|
| 0 0 4|
5. Scalar Matrix: A matrix in which all the diagonal elements are equal and all other
elements are zero.
A= |5 0 0|
|0 5 0|
| 0 0 5|
If all the diagonal elements are unity (1) and all other elements are zero, the matrix is called a
unit (identity) matrix.
A= |1 0 0|
|0 1 0|
|0 0 1|
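For readers who want to try these out, the special matrices above can be built with NumPy (assuming it is installed):

```python
# Building the diagonal, scalar and unit (identity) matrices from the text.
import numpy as np

diagonal = np.diag([2, 3, 4])   # diagonal matrix: 2, 3, 4 on the diagonal
scalar   = 5 * np.eye(3)        # scalar matrix: the same value (5) on the diagonal
unit     = np.eye(3)            # unit (identity) matrix

print(unit.astype(int).tolist())  # → [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```

`np.diag` and `np.eye` are convenient shortcuts; you could equally write the nested lists out by hand.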
6. Transpose: The transpose of a matrix A, written AT, is obtained by interchanging its rows and
columns. For example, if
A = | 1 2 |
    | 3 4 |
    | 5 6 |
then
AT = | 1 3 5 |
     | 2 4 6 |
Inverse
For matrices there is no such thing as division: you can add, subtract or multiply matrices, but you
cannot divide them. There is, however, a related concept called “inversion”.
Matrix inversion is a process that finds another matrix which, when multiplied with the original matrix,
results in an identity matrix. Given a matrix A, find a matrix B such that
AB = BA = In
Calculating the inverse of a matrix is slightly complicated, so let us use an inverse matrix calculator
- https://matrix.reshish.com/inverCalculation.php
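If you prefer code to the online calculator, NumPy can compute inverses too. The 2 x 2 matrix below is just an example chosen to be invertible:

```python
# Verifying the defining property AB = BA = I with NumPy, on a small
# invertible example matrix (chosen here for illustration; det = 4, not 0).
import numpy as np

A = np.array([[2.0, 4.0],
              [3.0, 8.0]])
B = np.linalg.inv(A)  # the inverse of A

print(np.allclose(A @ B, np.eye(2)))  # → True
print(np.allclose(B @ A, np.eye(2)))  # → True
```

`np.allclose` is used instead of exact equality because floating-point arithmetic introduces tiny rounding errors.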
2. Determinant
Every square matrix can be assigned a number known as its determinant.
If A = [aij] is a square matrix of order n, then the determinant of A is denoted by det A or |A|.
To find the value of a determinant we can expand it along any row or column.
Example 1: If A = | 2 4 |
                  | 3 8 |
then |A| = 2 x 8 – 4 x 3
         = 16 – 12
         = 4
Example 2: A = | 6  1  1 |
               | 4 -2  5 |
               | 2  8  7 |
Expanding along the first row:
|A| = 6(-2 x 7 - 5 x 8) - 1(4 x 7 - 5 x 2) + 1(4 x 8 - (-2) x 2)
    = 6(-54) - 18 + 36
    = -306
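A quick way to check a determinant computed by hand is NumPy (assuming it is available):

```python
# Checking the 3x3 determinant from the text with NumPy.
import numpy as np

A = np.array([[6,  1, 1],
              [4, -2, 5],
              [2,  8, 7]])

# np.linalg.det returns a float; round it for a clean integer answer.
print(round(np.linalg.det(A)))  # → -306
```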
There are two more matrix operations, trace and rank, which students are advised to explore
themselves.
1.3. Vector and Vector Arithmetic
Vectors are the foundation of linear algebra. Vectors are used throughout the field of machine
learning in the description of algorithms and processes, such as the target variable (y) when
training an algorithm.
We begin by defining a vector: a set of n numbers, which we shall write in the form
    | x1 |
    | x2 |
x = | x3 |
    | ...|
    | xn |
This object is called a column vector. Vectors are often represented using a lowercase character
such as “v”; for example, v = (v1, v2, v3), where v1, v2, v3 are scalar values, often real values.
For instance, in the popular machine learning example of housing price prediction, we might
have features (table columns) including a house's year of construction, number of bedrooms,
area (m^2), and size of garage (car capacity). Two houses would then give the input vectors
x1 = [ 1988 4 200 2 ]
x2 = [ 2001 3 220 1 ]
1.3.1. Vector Arithmetic
1. Vector Addition
Vectors of equal length can be added to create a new vector:
x = y + z
The new vector has the same length as the other two, and each element is the sum of the elements
at the same indices:
x = (y1 + z1, y2 + z2, y3 + z3)
2. Vector Subtraction
A vector can be subtracted from another vector of equal length to create a new third vector:
x = y - z
As with addition, the new vector has the same length as the parent vectors, and each element
of the new vector is calculated as the subtraction of the elements at the same indices:
x = (y1 - z1, y2 - z2, y3 - z3)
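The element-wise addition and subtraction rules above can be tried directly in NumPy, on example vectors:

```python
# Element-wise vector addition and subtraction on two example vectors.
import numpy as np

y = np.array([1, 2, 3])
z = np.array([4, 5, 6])

print((y + z).tolist())  # → [5, 7, 9]
print((y - z).tolist())  # → [-3, -3, -3]
```

NumPy applies the operation index by index, exactly as in the formulas above.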
3. Vector Multiplication
If we perform a scalar multiplication, there is only one type of operation: multiply the scalar by a
scalar and obtain a scalar result,
a x b = c
Vectors are a different story: there are two different kinds of multiplication, one in which the
result of the product is a scalar, and another in which the result is a vector. (There is a third kind
that gives a tensor result, but that is out of scope for now.)
To begin, let’s represent vectors as column vectors. We’ll define the vectors A and B as the
column vectors
    | Ax |         | Bx |
A = | Ay |     B = | By |
    | Az |         | Bz |
We’ll now see how the two types of vector multiplication are defined in terms of these column
vectors and the rules of matrix arithmetic.
Physical quantities are of two types:
Scalar: has only magnitude, no direction.
Vector: has both magnitude and direction.
The first type of vector multiplication is called the dot product, written A.B. The vector dot
product, the multiplication of one vector by another, gives a scalar result.
[Where do we use it in AI? This operation is used in machine learning to calculate weighted sums.
Please refer to “weight” in Unit 2: Deep Learning.]
If i = unit vector along the direction of x -axis
j = unit vector along the direction of y -axis
k = unit vector along the direction of z -axis
Vector Dot Product
If there are 2 vectors, vector a = a1i + a2j + a3k
And vector b = b1i + b2j + b3k
Their dot product a.b = a1b1 + a2b2 + a3b3
Practice Sum -1: Calculate the dot product of c = (−4, −9) and d = (−1,2).
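You can check your dot-product arithmetic with NumPy. The pair below is a different example, so the practice sum is left for you:

```python
# The dot-product formula a.b = a1*b1 + a2*b2 + a3*b3 on an example pair.
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, -5, 6])

print(int(np.dot(a, b)))  # → 12  (1*4 + 2*(-5) + 3*6)
```

Reuse the same two lines with c and d to verify your answer to Practice Sum 1.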
1.4. Matrix and Matrix Arithmetic
Matrices are a foundational element of linear algebra. Matrices are used in machine learning to
process the input data variables when training a model.
Matrix Addition: If A and B are two matrices of order m x n (m rows and n columns), then their sum
A + B is a matrix of order m x n, obtained by adding the corresponding elements of A and B.
If A = | 12  1 |   and   B = |  8  9 |
       |  3 -5 |             | -1  4 |
then A + B = | 20 10 |
             |  2 -1 |
Scalar Multiplication: Let A = [aij] be an m x n matrix and K be any number, called a scalar. The
matrix obtained by multiplying every element of A by K is denoted KA.
If A = | 12  1 |
       |  3 -5 |   and K = 2,
then KA = | 24   2 |
          |  6 -10 |
Two matrices of the same size can also be multiplied element by element; this is often called
element-wise matrix multiplication.
Matrix Multiplication: Two matrices A and B can be multiplied (for the product AB) if the number
of columns in A (the pre-multiplier) is the same as the number of rows in B (the post-multiplier).
If A = [aij]mxn and B = [bij]nxp, the product AB is of order m x p.
A = | 2 -3  4 |   and   B = |  2  5 |
    | 3  6 -1 |             | -1  0 |
                            |  4 -2 |
A is a 2 x 3 matrix while B is a 3 x 2 matrix; the number of rows of B equals the number of columns
of A, so they meet the condition for matrix multiplication.
AB = | 4+3+16   10+0-8 |  =  | 23  2 |
     | 6-6-4    15+0+2 |     | -4 17 |
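The worked product above can be verified with NumPy:

```python
# Verifying the 2x3 by 3x2 matrix product from the text.
import numpy as np

A = np.array([[2, -3,  4],
              [3,  6, -1]])
B = np.array([[ 2,  5],
              [-1,  0],
              [ 4, -2]])

print((A @ B).tolist())  # → [[23, 2], [-4, 17]]
```

The `@` operator performs matrix multiplication: each entry is the dot product of a row of A with a column of B.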
Activity 1:
Three people denoted by P1, P2, P3 intend to buy some rolls, buns, cakes and bread. Each of
them needs these commodities in different amounts and can buy them in two shops S1, S2.
Which shop is the best for each person P1, P2, P3 to pay as little as possible? The individual
prices and desired quantities of the commodities are given in the following tables:
Let us solve this the matrix way. Let P be the 3 x 4 quantity (demand) matrix from the first table,
and let Q be the 4 x 2 price matrix:
Q = | 1.50   1    |
    | 2      2.50 |
    | 5      4.50 |
    | 16     17   |   (the price matrix)
Then
R = PQ = | 50     49    |
         | 58.50  61    |
         | 43.50  43.50 |
The element r11 expresses the amount spent by the person P1 in the shop S1, and r12 the amount
spent in the shop S2; likewise for persons P2 and P3. Hence, it is optimal for the person P1 to buy
in the shop S2, for the person P2 in S1, and the person P3 will pay the same price in S1 as in S2.
Activity 2
Share Market Portfolios
A has INR 1000 worth stock of Apple, INR 1000 worth of Google and INR 1000 worth of Microsoft.
B has INR 500 of Apple, INR 2000 of Google and INR 500 of Microsoft.
Suppose a news broke and Apple jumps 20%, Google drops 5%, and Microsoft stays the same.
What is the updated portfolio of A and B and net profit /loss from the event?
The original stock price matrix looks like:
  apple         | 1 0 0 |
  google        | 0 1 0 |
  Microsoft     | 0 0 1 |
  profit (+/-)  | 0 0 0 |
89
After the news broke, the updated stock price matrix is:
| apple        |   | 1.2    0     0 |
| google       |   | 0      0.95  0 |
| Microsoft    |   | 0      0     1 |
| profit (+/-) |   | +0.20  -0.05 0 |
Now let’s feed in the portfolios for A (INR 1000, 1000, 1000) and B (INR 500, 2000, 500). We
can crunch the numbers by hand.
Results (columns are A and B):
| apple        | 1200    600 |
| google       |  950   1900 |
| Microsoft    | 1000    500 |
| profit (+/-) |  150      0 |
(For A, a gain of 200 on Apple and a loss of 50 on Google give a net profit of 150; for B, a
gain of 100 and a loss of 100 cancel out.)
The key is understanding why we set up the matrix like this, not blindly crunching numbers.
This is the algorithm that runs behind any electronic spreadsheet (e.g. MS Excel) when you do
what-if analysis.
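The same numbers can be crunched in plain Python. A sketch of the computation; the `apply_update` helper is an illustrative name, not from the text:

```python
# Rows: updated multipliers for apple, google and microsoft, plus the profit row.
update = [
    [1.20, 0.00, 0.00],   # apple jumps 20%
    [0.00, 0.95, 0.00],   # google drops 5%
    [0.00, 0.00, 1.00],   # microsoft stays the same
    [0.20, -0.05, 0.00],  # net profit/loss contribution
]

portfolio_A = [1000, 1000, 1000]  # INR held in apple, google, microsoft
portfolio_B = [500, 2000, 500]

def apply_update(update, holdings):
    """Multiply the 4x3 update matrix by a 3x1 holdings vector."""
    return [sum(row[i] * holdings[i] for i in range(len(holdings)))
            for row in update]

# A: apple 1200, google 950, microsoft 1000, net profit 150
print(apply_update(update, portfolio_A))
# B: apple 600, google 1900, microsoft 500, net profit 0
print(apply_update(update, portfolio_B))
```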
2. Set Theory: Introduction to Data Table Joins
Many a time, you might have heard your teachers say that mathematics is the foundation of
Computer Science, and you must have started thinking: how? Did this question ever cross your
mind?
In this module, we will try to explain the confluence of set theory, which is a branch of
mathematics, and relational databases (RDBMS), which are a part of computer science. A lot of
things are going to come together today, because we are going to learn how set theory
principles help in data retrieval from databases, which in turn is used by AI models for their
training. The important topics which we are going to cover in this strand are as below:
2.1. Context setting – Set theory and Relational Algebra
2.2. Set Operations
2.3. Data Tables Join (SQL Joins)
2.4. Practice Questions
2.1 Context Setting: Set Theory and Relational Algebra
Before we get into the actual relation between sets and databases, we first need to
understand what these terms refer to.
A Set is an unordered collection of objects, known as the elements or members of the set. That
an element 'a' belongs to a set A is written 'a ∈ A', while 'a ∉ A' denotes that a is not an
element of the set A. So a set is a mathematical concept, and the way we relate sets to other
sets is called set theory.
Set of even numbers: {..., -4, -2, 0, 2, 4, ...}
Set of odd numbers: {..., -3, -1, 1, 3, ...}
Set of prime numbers: {2, 3, 5, 7, 11, 13, 17, ...}
Set of names of grade X students: {‘A’, ‘X’, ‘B’, ‘H’, ..............}
We use databases (like Oracle, MS SQL Server, MySQL etc.) to store digital data. A database is
made up of several components, of which the table is the most important: the database stores
its data in tables. Without tables, the DBMS would not have much significance.
For example, student database and its 2 tables
Please look at the records in the 'Activities Table'. Does this information make any sense on
its own? No. But if you combine the information from the 2 tables, the 'Students Table' and
the 'Activities Table', you get meaningful information.
For example, student John Smith participated in swimming, and he must have paid $17.
The data in the tables of a database are of limited value unless the data from different tables
are combined and manipulated to generate useful information. And from here, the role of
relational algebra begins.
Relational algebra is a set of algebraic operators and rules that manipulate relational tables
to yield desired information. Relational algebra takes relations (tables) as its operands and
returns relations (tables) as its results. Relational algebra consists of eight operators:
SELECT, PROJECT, JOIN, INTERSECT, UNION, DIFFERENCE, PRODUCT and DIVIDE.
( Image Source : https://images.slideplayer.com/42/11342631/slides/slide_4.jpg)
5. UNION returns a table containing all records that appear in either or both of the specified
tables as shown in the diagram.
6. INTERSECTION returns only those rows that appear in both tables, see the diagram above.
7. DIFFERENCE returns all rows in one table that are not found in the other table; that is, it
subtracts one table from the other, as shown in the diagram above.
8. DIVIDE is typically required when you want to find entities that interact with all entities
of a set of entities of a different type.
Say, for example, you want to find a person who has an account in all the banks of a city.
The division operator is used when we have to evaluate queries which contain the keyword
'all'. Division is not supported by SQL directly. However, it can be expressed using other
operations (like CROSS JOIN, EXCEPT and IN).
2.2. Set Operations
When two or more sets are combined to form another set under the mathematical principles of
sets, the process of combining the sets is called a set operation.
To keep the process simple, let us assume two small sets:
A = {2 ,3 ,4} and B = {3,4,5}
Keeping these two sets as our example, let us perform four important set operations:
i) Union of Sets (∪)
The union of the sets A and B is the set whose elements are the distinct elements of set A, of
set B, or of both.
A U B = {2, 3, 4, 5}
ii) Intersection of Sets (∩)
The intersection of set A and set B is the set of elements belonging to both A and B.
A∩B = {3, 4}
iii) Complement of the Sets
The complement of a set A is the set of all elements of the universal set that are not in A.
Taking the universal set here to be A ∪ B = {2, 3, 4, 5}, the complement of A is {5}.
iv) Set Difference
The difference of sets, denoted by 'A – B', is the set containing the elements of set A that
are not in B, i.e. all elements of A except those that are also in B.
A – B = {2}
v) Cartesian Product
Remember the terms used when plotting a graph, the axes (x-axis, y-axis). For example, (2, 3)
depicts that the value on the x-axis is 2 and that on the y-axis is 3, which is not the same as
(3, 2).
The way of representation is fixed: the value of the x-coordinate comes first and then that of
y (an ordered pair). The Cartesian product pairs up elements, say x and y, in this ordered way.
If A and B are two non-empty sets, then the Cartesian product of set A and set B is the set of
all ordered pairs (a, b) such that a ∈ A and b ∈ B, which is denoted as A × B.
A × B = {(2,3), (2,4), (2,5), (3,3), (3,4), (3,5), (4,3), (4,4), (4,5)}
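All four operations, plus the Cartesian product, can be tried directly in Python, whose built-in set type mirrors the mathematics. A small sketch using the sets A and B from above; taking A ∪ B as the universal set is an assumption carried over from the complement example:

```python
from itertools import product

A = {2, 3, 4}
B = {3, 4, 5}
U = A | B                      # universal set assumed to be A ∪ B here

print(A | B)                   # union: {2, 3, 4, 5}
print(A & B)                   # intersection: {3, 4}
print(U - A)                   # complement of A relative to U: {5}
print(A - B)                   # set difference: {2}
print(sorted(product(A, B)))   # Cartesian product: 9 ordered pairs (a, b)
```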
(http://stackoverflow.com/questions/406294/left-join-vs-left-outer-join-in-sql-server)
In a database, information is stored in various tables. In order to retrieve a meaningful
information about an entity, all concerned tables need to be joined.
What do we mean, in fact, by joining tables? Joining tables is essentially a Cartesian product
followed by a selection criterion (did you notice the set theory operations?). The JOIN
operation also allows joining variously related records from different relations (tables).
In an inner join, only those tuples that satisfy the matching criteria are included, while the
rest are excluded. Let's study various types of Inner Joins.
2. LEFT (OUTER) JOIN
Selects records from the first (left-most) table along with matching right table records.
The left outer join keeps all tuples in the left relation. However, if no matching tuple is
found in the right relation, then the attributes of the right relation in the join result are
filled with null values.
3. RIGHT (OUTER) JOIN
Selects records from the second (right-most) table along with matching left table records.
The right outer join keeps all tuples in the right relation. However, if no matching tuple is
found in the left relation, then the attributes of the left relation in the join result are
filled with null values.
4. FULL (OUTER) JOIN
Selects all records that match either left or right table records.
In a full outer join, all tuples from both relations are included in the result, irrespective
of the matching condition.
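These joins can be tried hands-on with Python's built-in sqlite3 module. A hedged sketch with two toy tables; the table and column names below are illustrative, not taken from the Students/Activities example:

```python
import sqlite3

# In-memory database with two small, made-up tables.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE students (roll INTEGER, name TEXT)")
cur.execute("CREATE TABLE activities (roll INTEGER, activity TEXT)")
cur.executemany("INSERT INTO students VALUES (?, ?)",
                [(1, "John"), (2, "Mary"), (3, "Ravi")])
cur.executemany("INSERT INTO activities VALUES (?, ?)",
                [(1, "Swimming"), (2, "Tennis")])

# INNER JOIN: only students with a matching activity row (Ravi is dropped).
inner = cur.execute("""SELECT s.name, a.activity FROM students s
                       JOIN activities a ON s.roll = a.roll""").fetchall()
print(inner)

# LEFT OUTER JOIN: every student; missing activities become NULL (None).
left = cur.execute("""SELECT s.name, a.activity FROM students s
                      LEFT JOIN activities a ON s.roll = a.roll""").fetchall()
print(left)
conn.close()
```

Note that SQLite added RIGHT and FULL OUTER JOIN only in version 3.39, so older installations support just the inner and left forms shown here.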
Question 3: Specify whether the below statements are true or false:
i) A SQL query that calls for a FULL OUTER JOIN is merely returning the union of
two sets.
____________________________________
ii) Finding the LEFT JOIN of two tables is nothing more than finding the set
difference or the relative complement of the two tables.
_______________________________________
Question 4: Think of an entity such as students, employees or sports, and create 3 tables for
any one entity of your choice.
For example
Entity: Students
Students Table (name, roll number, age, class, address)
Marks Table (roll number, subject, marks obtained)
Bus Route table (roll number, bus number, boarding point)
The purpose of this module is not to replace the statistics that you will study as part of
Mathematics in your school, but to introduce you to statistics from the perspective of
Artificial Intelligence and Machine Learning.
3.1. Measure of Central Tendency
Statistics is the science of data: a collection of mathematical techniques that helps to
extract information from data. From the AI perspective, statistics transforms observations
into information that you can understand and share. You will learn more about statistics and
statistical methods in the next level, i.e. Level 2.
Usually, statistics deals with large datasets (the population of a country, country-wise
numbers of people infected with the corona virus, and similar datasets). For understanding and
analysis purposes, we need a data point, be it a number or a set of numbers, which can
represent the whole domain of data, and this data point is called the central tendency.
"Central tendency" is the summary of a data set in a single value that represents the entire
distribution of the data domain (or data set). The one important point I would like to
highlight here is that central tendency does not describe the individual values in the dataset,
but gives a comprehensive summary of the whole data domain.
3.1.1. Mean
In statistics, the mean (more technically the arithmetic mean or sample mean) can be
estimated from a sample of examples drawn from the domain. It is the quotient obtained by
dividing the total of the values of a variable by the total number of observations or items.
If we have n values in a data set, x1, x2, x3, ..., xn, the sample mean is
M = (x1 + x2 + x3 + ... + xn) / n
And if we need to calculate the mean of a grouped data,
M = ∑fx / n
Where M = Mean
∑ = Sum total of the scores
f = Frequency of the distribution
x = Scores
n = Total number of cases
Example 1
The set S = { 5,10,15,20,30},
Mean of set S = (5 + 10 + 15 + 20 + 30) / 5 = 80 / 5 = 16
Example 2
Calculate the mean of the following grouped data
Class     Frequency
2 - 4     3
4 - 6     4
6 - 8     2
8 - 10    1
Solution
Class     Frequency (f)    Mid value (x)    f⋅x
2 - 4     3                3                9
4 - 6     4                5                20
6 - 8     2                7                14
8 - 10    1                9                9
          n = 10                            ∑f⋅x = 52

M = ∑f⋅x / n = 52 / 10 = 5.2
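Both computations can be checked in a few lines of Python using only the standard library; the grouped mean below uses the midpoints and frequencies from the table:

```python
from statistics import mean

# Simple mean (Example 1)
S = [5, 10, 15, 20, 30]
print(mean(S))  # 16

# Grouped mean (Example 2): M = sum(f*x) / n with the table's midpoints
mids  = [3, 5, 7, 9]   # mid values of the classes 2-4, 4-6, 6-8, 8-10
freqs = [3, 4, 2, 1]   # frequencies
grouped_mean = sum(f * x for f, x in zip(freqs, mids)) / sum(freqs)
print(grouped_mean)    # 5.2
```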
When to use the mean?
1. The mean is more stable than the median and the mode, so when the measure of central
tendency with the greatest stability is wanted, the mean is used.
2. When you want to include all the scores of a distribution.
3. When you do not want your result to be affected by sampling fluctuations.
3.1.2. Median
The median is another measure of central tendency. It is the positional value of the variable
which divides the group into two equal parts, one part comprising all values greater than the
median and the other part all values smaller than the median.
The following series shows marks in mathematics of students learning AI:
17 32 35 15 21 41 32 11 10 20 27 28 30
Arranged in ascending order: 10, 11, 15, 17, 20, 21, 27, 28, 30, 32, 32, 35, 41. There are 13
values (an odd count), so the median is the middle (7th) value, i.e. 27.
Example 2
In your class, 6 students scored the following marks in the mathematics unit test; find the
median value: 11, 11, 14, 18, 20, 22
Solution
The marks are already in order: 11, 11, 14, 18, 20, 22
The total count is even, so the median is the average of the two middle numbers:
(14 + 18) / 2 = 16
For grouped data, the formula is: Median = l1 + ((N/2 − c.f.) / f) × i
where l1 = lower limit of the median class, N = total frequency, c.f. = cumulative frequency
of the class preceding the median class, f = frequency of the median class and i = width of
the class interval.
Example: Calculate the median for the following frequency distribution:
Class:              0-10    10-20    20-30    30-40    40-50
Number of workers:  22      38       46       35       20
To find the median (M), construct the cumulative frequency table:
Class     Frequency (f)    Cumulative frequency (c.f.)
0-10      22               22
10-20     38               60
20-30     46               106
30-40     35               141
40-50     20               161
          N = 161

Median class = size of the (N + 1)/2 = (161 + 1)/2 = 81st item, which falls in the class 20-30.

M = 20 + ((161/2 − 60) / 46) × 10
  = 20 + ((80.5 − 60) / 46) × 10
  = 20 + (20.5 / 46) × 10
  = 20 + 4.46
Median = 24.46
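A quick check in Python: statistics.median handles the ungrouped case, and the grouped formula above can be coded directly. A sketch using the wage data from the example:

```python
from statistics import median

# Ungrouped (Example 2): even count, average of the two middle values
print(median([11, 11, 14, 18, 20, 22]))  # 16.0

# Grouped data: Median = l1 + ((N/2 - c.f.) / f) * i
classes = [(0, 22), (10, 38), (20, 46), (30, 35), (40, 20)]  # (lower limit, frequency)
width = 10
N = sum(f for _, f in classes)        # 161
half, running = N / 2, 0              # running cumulative frequency
for l1, f in classes:
    if running + f >= half:           # this is the median class
        grouped_median = l1 + (half - running) / f * width
        break
    running += f
print(round(grouped_median, 2))       # 24.46
```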
3.1.3. Mode
Mode is another important measure of central tendency of a statistical series. It is the value
which occurs most frequently in the data series, and on a bar chart or histogram it corresponds
to the highest bar. You can, therefore, sometimes consider the mode to be the most popular
option. An example of a mode is presented below:
Mode = l + h × (f1 − f0) / (2f1 − f0 − f2)
where l = lower limit of the modal class, h = class width, f1 = frequency of the modal class,
f0 = frequency corresponding to the pre-modal class and f2 = frequency corresponding to the
post-modal class.
Example – 2: Calculate mode for the following data:
Class Interval 10-20 20-30 30-40 40-50 50-60
Frequency 3 10 15 10 2
Answer: As the frequency for class 30-40 is maximum, this class is the modal class. Classes
20-30 and 40-50 are the pre-modal and post-modal classes respectively. The mode is:
Mode = 30 + 10 × [(15 − 10) / (2×15 − 10 − 10)] = 30 + 5 = 35
There are two methods for calculating the mode of a discrete frequency series:
(i) By inspection - same as the example above.
(ii) By grouping: more than one value may command the highest frequency in the series; in such
cases, the grouping method of calculation is used.
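For ungrouped data, Python's statistics.mode returns the most frequent value, and the grouped-mode calculation from Example 2 can be coded directly. A short sketch; the ungrouped data list is made up for illustration:

```python
from statistics import mode

# Ungrouped mode: the most frequent value (hypothetical data)
print(mode([2, 3, 3, 5, 3, 7]))  # 3

# Grouped mode: Mode = l + h * (f1 - f0) / (2*f1 - f0 - f2)
l, h = 30, 10             # lower limit and width of the modal class 30-40
f1, f0, f2 = 15, 10, 10   # modal, pre-modal and post-modal frequencies
grouped_mode = l + h * (f1 - f0) / (2 * f1 - f0 - f2)
print(grouped_mode)       # 35.0
```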
When to use which measure?
Mean: a good measure of central tendency when a data set contains values that are relatively
evenly spread, with no exceptionally high or low values.
Median: a good measure of the central value when the data include exceptionally high or low
values. The median is the most suitable measure of average for data classified on an ordinal
scale.
Mode: used when you need to find the distribution peak, and there may be many peaks. For
example, it is important to print more of the most popular books, because printing different
books in equal numbers would cause a shortage of some books and an oversupply of others.
3.2. Variance and Standard Deviation
Measures of central tendency (mean, median and mode) provide the central value of a data set.
Measures of dispersion (such as quartiles, percentiles and ranges) provide information on the
spread of the data around the centre.
In this section we will look at two such measures of dispersion: variance and standard
deviation.
Let us understand these two using a diagram:
Let us measure the height (at the shoulder) of 5 dogs (in millimetres)
As you can see, their heights are: 600mm, 470mm, 170mm, 430mm and 300mm.
Let us calculate their mean,
Mean = (600 + 470 + 170 + 430 + 300) / 5
= 1970 / 5
= 394 mm
Now let us plot again after taking mean height (The green Line)
Now, let us find the deviation of each dog's height from the mean height.
Calculate the differences from the mean height, square them, and find their average. This
average is the variance.
Variance = [(206)² + (76)² + (−224)² + (36)² + (−94)²] / 5
         = 108520 / 5
         = 21704
And standard deviation is the square root of the variance.
Standard deviation = √21704 = 147.32
The example above should have given you a clear idea about variance and standard deviation.
So, to summarize: the variance is the mean of the squares of the differences between each
number and the overall mean.
To calculate the variance, first calculate the deviation of each data point from the mean and
square each result.
Say there is a data range: 2, 4, 4, 4, 5, 5, 7, 9
Mean = (2 + 4 + 4 + 4 + 5 + 5 + 7 + 9) / 8 = 40 / 8 = 5
Then the sum of the squares of the differences between each number and the mean is
9 + 1 + 1 + 1 + 0 + 0 + 4 + 16 = 32
Variance = 32 / 8 = 4
The standard deviation is the square root of the variance: √4 = 2. It is a measure of the
extent to which data values are spread around the mean.
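Both examples can be verified in Python using the population variance (dividing by n, as the text does):

```python
from statistics import pstdev, pvariance

# Dog heights example
heights = [600, 470, 170, 430, 300]
print(pvariance(heights))         # 21704.0
print(round(pstdev(heights), 2))  # 147.32

# Second example: 2, 4, 4, 4, 5, 5, 7, 9
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(pvariance(data))            # 4.0
print(pstdev(data))               # 2.0
```

Note that statistics also provides variance/stdev (dividing by n − 1) for sample estimates; the text's by-hand calculations correspond to the population versions used here.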
3.3. Activities
Activity 1
______________________________________________________________
_____________________________________________________________
Activity 2
Can you perform a statistical study on "the time students spend on social media"?
Condition 1: You will collect the data outside of your school.
Condition 2: You can work in a group of 5 students.
Condition 3: Your group needs to capture data from a minimum of 10 students.
Once you have the data ready, do your statistical analysis (central tendency, variance and
standard deviation) and present your story.
____________________________________________________________________
___________________________________________________________________
3. Visual representation of data
This module provides an introduction to the purpose, importance and various methods of
representing data using graphs. Statistics is a science of data, so we deal with large volumes
of data in statistics and Artificial Intelligence. Whenever the volume of data increases
rapidly, an efficient and convenient technique for representing data is needed. The human brain
is more comfortable dealing with a complex and large quantity of data when it is represented in
a visual format. That is how the need arises for the graphical representation of data.
The important topics that we are going to cover in this module are:
3.1. Why do we need to represent data graphically?
3.2. What is a Graph?
3.3. Types of Graphs
3.1 Why do we need to represent data graphically?
There could be various reasons for representing data on graphs; a few of them are outlined
below:
The purpose of a graph is to present data that are huge in volume or too complicated to be
described in text or tables.
Graphs not only represent the data but also reveal relations between variables and show
the trends in data sets.
Graphical representation helps us in analysing the data.
3.2. What is a Graph?
A graph is a chart or diagram through which data are represented in the form of lines or curves
drawn on coordinate points, and it shows the relation between variable quantities.
There are some algebraic and coordinate geometry principles which apply in drawing graphs of
any kind.
Graphs have two axes: the vertical one is called the Y-axis and the horizontal one is called
the X-axis. The X and Y axes are perpendicular to each other, and the intersection of these two
axes is called '0', or the origin. On the X-axis, distances to the right of the origin have
positive values (see fig. 7.1) and distances to the left of the origin have negative values. On
the Y-axis, distances above the origin have positive values and distances below the origin have
negative values.
3.3. Types of Graphs
3.3.1 Bar Graphs
As per Wikipedia “A bar chart or bar graph is a chart or graph that presents categorical data with
rectangular bars with heights or lengths proportional to the values that they represent “. It is a
really good way to show relative sizes of different variables.
There are many characteristics of bar graphs that make them useful. Some of these are that:
They make comparisons between different variables very easy to see.
They clearly show trends in data, meaning that they show how one variable is affected
as the other rises or falls.
Given one variable, the value of the other can be easily determined.
Example 1
The percentage of total income spent under various heads by a family is given below.
Heads:              Food    Clothing    House Rent    Health    Education    Miscellaneous
% of total income:  40%     10%         10%           15%       20%          5%
3.3.2 Histogram
A histogram is drawn on a natural scale in which the frequencies of the different classes of
values are represented by vertical rectangles drawn close to each other. The mode, a measure
of central tendency, can be easily determined with the help of this graph. A histogram is easy
to draw and simple to understand, but it has one limitation: we cannot plot more than one data
distribution on the same axis.
Example 1
Below is the waiting time of the customer at the cash counter of a bank branch during peak
hours. You are required to create a histogram based on the below data.
3.3.3. Scatter Plot
A scatter plot is a way to represent data on a graph that is similar to a line graph. A line
graph uses a line on an X-Y axis, while a scatter plot uses dots to represent individual pieces
of data. In statistics, these plots are useful for seeing whether two variables are related to
each other; for example, a scatter chart can suggest a linear relationship (i.e. a straight
line).
There is no line; instead, the dots represent the values of the variables on the graph.
Example 1
Here are the prices of 1460 apartments plotted against their ground living area. This dataset
comes from a Kaggle (https://www.kaggle.com/c/house-prices-advanced-regression-techniques/data)
machine learning competition. You can read more about this example here:
(Source: https://www.data-to-viz.com/story/TwoNum.html)
The scatter plot is the most frequently used data plotting technique in machine learning.
When should we use a scatter plot?
It is used to observe the relationship between two numeric variables. The dots on the plot
denote not only the values of the variables but also patterns in the data taken as a whole.
A scatter plot is a useful tool for studying correlation. Relationships between variables can
be described in many ways: positive or negative, strong or weak, linear or nonlinear.
4. Introduction to Dimensionality of Data
4.1. Data Dimensionality
Dimensionality in statistics refers to how many attributes a dataset has. For example, a sample
students dataset with four attributes (columns) is of 4 dimensions.
Students Dataset
So if you have a dataset with n observations (or rows) and m columns (or features), then your
data is m-dimensional.
One dimension of a dataset can change without forcing a change in another dimension. We can
change the age of a student without changing their class or address, for example.
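In code, a dataset is commonly held as a table of rows and columns, and its dimensionality is simply the number of columns. A tiny sketch; the student records below are hypothetical, made up for illustration:

```python
# Each row is one student: (name, age, class, address) -- hypothetical records
students = [
    ("John", 16, "XI", "Delhi"),
    ("Mary", 15, "XI", "Mumbai"),
    ("Ravi", 14, "X",  "Pune"),
]

n_observations = len(students)       # n rows (observations)
m_dimensions = len(students[0])      # m columns = dimensionality of the dataset
print(n_observations, m_dimensions)  # 3 4
```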
The combination of these three colours (numbers 0 - 255) ultimately decides the colour; hence
we say that colour space is three-dimensional, because there are three "directions" in which a
colour can vary.
4.2. Data Representation on Graph
Before we move further, let us understand the basics of data representation on the graph
(Figure: the coordinate plane divided into four quadrants, Quadrant I through Quadrant IV)
Please look at the above diagram of the graph and try to reason out why
i) (6, 4) is in first quadrant
ii) ( -6, 4) is in second quadrant
iii) ( -6, -4) is in third quadrant
iv) (6, -4) is in the fourth quadrant
4.3. Multi-Dimensional Data and Graph
Use Case 1
Let us assume a dataset of 1 dimension:
Students Dataset
Age
16
15
14
Use Case 2
Now let us take 2-dimensional data:
Students Dataset
Age    Maths Marks
16     91
15     85
14     93
How do we locate a point on a flat surface (such as this page)? We need to know two directions:
left-right, and
up-down.
So any position needs two numbers.
Use Case 3
Let us take 3-Dimensional data
Students Dataset
Age Maths Marks Science Marks
16 91 92
15 85 90
14 93 72
How do we locate a spot in the real world (such as the tip of your nose)? We need to know:
left-right,
up-down, and
forward-backward
that is three numbers, or 3 dimensions!
What kind of situations are these? College admission (one variable) depends on another
variable, i.e. the 12th-grade score. Number of sales (one variable) depends on another
variable, i.e. the product price.
In all these situations there are two variables: one is the input variable (12th score, product
price etc.) and the other is the outcome (college admission, sales, farming etc.).
We know these two variables, input and outcome, are related, but the equation of the relation
is unknown.
Example 1
The general formula of linear equation is:
Ax + By = C
Now let us take an example from real life to understand how a linear equation behaves (the
slope of the graph) with changing data points.
Suppose the cab fare in Mumbai is: fixed amount (x) + INR y per km.
The cab fare is then revised; the new fare is: fixed amount (x) + twice the earlier rate per km
travelled.
Thereby, if the data points change, the slope of the linear equation also changes.
We need to know that a linear equation changes its path only when the condition of a variable
changes.
Example 2
When we collect data, sometimes there are values that are "far away" from the main group of
the data. How does that "far away" value (called an outlier) impact the equation? What do we
do with such values?
Below is the Delhi daily temperature data, recorded for a week:
Temperature recorded (degree C): 1st week of June
Temp. 42 44 47 30 40 43 46
On the 4th of June it rained in Delhi and therefore the temperature dipped.
Mean with the outlier: (42 + 44 + 47 + 30 + 40 + 43 + 46) / 7 = 292 / 7 ≈ 41.7.
Now, let us take the outlier out and calculate the mean again:
(42 + 44 + 47 + 40 + 43 + 46) / 6 = 262 / 6 ≈ 43.7.
Even though the data range is very small, we notice a visible difference in the mean. When we
remove outliers we are changing the data; it is no longer "pure", so we shouldn't just get rid
of outliers without a good reason! And when we do remove them, we should explain what we are
doing and why.
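The effect of the outlier is easy to quantify in Python, using the temperatures from the table above:

```python
from statistics import mean

temps = [42, 44, 47, 30, 40, 43, 46]          # 1st week of June
without_outlier = [t for t in temps if t != 30]

print(round(mean(temps), 2))            # 41.71 -- with the rainy-day outlier
print(round(mean(without_outlier), 2))  # 43.67 -- outlier removed
```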
Use Case 2
Let us consider another simple example:
Your physical education teacher gives you an offer: if you take two rounds of the school
ground, he will give you 4 chocolates.
Based on the above offer, the students come up with the following table:
Number of Rounds (x)    Number of chocolates (y)
2                       4
4                       8
6                       12
Here you find a linear relationship between the two variables, input and outcome: y = 2x.
And congratulations, this is linear regression.
Activity 1
From Use Case 1, can you prepare a hypothetical data set (of any one event) and try to
establish the relationship between the input variable and the outcome?
Solution
Independent variable, x = months
Dependent variable, y = cab price
[Remember that your dependent variable is the one which you are trying to predict (cab price)
and your independent variable (months) is the one which you will supply as input]
Using the historical data, let us plot the scatter graph
Now we can predict the cab price for any coming month, say the 14th month from the point one
arrived in the city, by just replacing the month variable with 14 in the above equation.
Cab price = 4.914 + 14 * 72.752
= 146.82 INR
Seems like we now have an estimate on how much cab price needs to be paid 2 months from now!
Least Square Method
The "least square" method is a form of mathematical regression analysis used to determine the line of
best fit for a set of data, providing a visual demonstration of the relationship between the data points.
Each point of data represents the relationship between a known independent variable and an unknown
dependent variable.
Linear regression is basically a mathematical analysis method which considers the relationship
between all the data points. All these points are based upon two variables, one independent and
one dependent. The dependent variable is plotted on the y-axis and the independent variable on
the x-axis of the regression graph. In essence, the least squares method of regression
minimizes the sum of the squares of the errors made by the fitted equation.
We know the straight line formula
y = mx + c
y (dependent variable) and x (independent variable) are known values, but m and c we need to
calculate.
Steps to calculate m and c:
Step 1: For each (x, y) point, calculate x² and xy.
Step 2: Sum all x, y, x² and xy, which gives us Σx, Σy, Σx² and Σxy.
Step 3: Calculate the slope m:
m = (N Σxy − Σx Σy) / (N Σx² − (Σx)²)
Step 4: Calculate the intercept c:
c = (Σy − m Σx) / N
Example 1:
Let us collect data on how many hours of sunshine vs how many ice creams were sold at the shop
from Monday to Friday:
Hours of sunshine (x)    Ice creams sold (y)
2                        4
3                        5
5                        7
7                        10
9                        15
For each point, calculate x² and xy, and then sum each column:
x    y     x²    xy
2    4     4     8
3    5     9     15
5    7     25    35
7    10    49    70
9    15    81    135

Σx = 26, Σy = 41, Σx² = 168, Σxy = 263, and N = 5.

m = (N Σxy − Σx Σy) / (N Σx² − (Σx)²)
  = (5 × 263 − 26 × 41) / (5 × 168 − 26²)
  = (1315 − 1066) / (840 − 676)
  = 249 / 164
  = 1.5183...
c = (Σy − m Σx) / N
  = (41 − 1.5183 × 26) / 5
  = 0.3049...
Step 5: Assemble the equation of a line:
y = mx + c
y = 1.518 x + 0.305
Let's see how it works:
x    y     predicted y = 1.518x + 0.305    error (y − predicted)
2    4     3.34                            0.66
3    5     4.86                            0.14
5    7     7.89                            −0.89
7    10    10.93                           −0.93
9    15    13.97                           1.03
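The whole procedure (Steps 1-5) can be scripted. A sketch computing m, c and the prediction for 8 hours of sunshine from the data above:

```python
xs = [2, 3, 5, 7, 9]    # hours of sunshine
ys = [4, 5, 7, 10, 15]  # ice creams sold
N = len(xs)

# Steps 1-2: the four sums
Sx, Sy = sum(xs), sum(ys)
Sxx = sum(x * x for x in xs)
Sxy = sum(x * y for x, y in zip(xs, ys))

# Steps 3-4: slope and intercept of the least-squares line y = mx + c
m = (N * Sxy - Sx * Sy) / (N * Sxx - Sx ** 2)
c = (Sy - m * Sx) / N
print(round(m, 3), round(c, 3))   # 1.518 0.305

# Step 5: predict sales for 8 hours of sunshine
print(round(m * 8 + c, 2))        # 12.45
```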
Here are the (x,y) points and the line y = 1.518x + 0.305 on a graph:
Once you hear the weather forecast say "we expect 8 hours of sun tomorrow", you can use the
above equation to estimate that you will sell
y = 1.518 × 8 + 0.305 = 12.45 ice creams
Unit 4: AI Values (Ethical Decision Making)
1. AI: Issues, Concerns and Ethical Considerations
Recent progress in computing, robotics and AI may create a unique opportunity for human
society. The day is not far off when we will entrust the management of the environment,
economy, public security, healthcare or agriculture to artificially intelligent robots and
computer systems. And this is where the discussion of 'AI ethics and values' is born. Countries
all over the world are in a race to evolve their AI skills and technologies. The topic of AI is
very popular, but what are the ethical and practical issues we should consider before embracing
AI?
Question Time!
Let me describe a few scenarios and then pose some associated questions. Do keep in mind that
there are no 'right/wrong' answers for ethical questions. Read each scenario carefully and try
to answer to the best of your understanding.
Q.1: You are a doctor at a well-renowned hospital. You have six ill patients, five of whom are in urgent
need of organ transplant. However, you can't help them as there are no available organs that can be
used to save their lives. The sixth patient, however, will die without a particular medicine. If s/he dies,
you will be able to save the other five patients by using the organs of patient#6, who is an organ donor.
What will you do in this scenario?
(https://listverse.com/2011/04/18/10-more-moral-dilemmas/)
___________________________________________________________________________________
Q.2: An AI music software has composed a song which has become a worldwide hit. Who will
own the rights to this song? The team who developed the AI software or the music company?
____________________________________________________________________________
Q.3: A farmer is headed somewhere sitting on his horse cart. A pedestrian makes some noise
which upsets the horse who injures the pedestrian in reaction. The pedestrian makes a police
complaint. Who do you think is at fault? Who should be penalized?
____________________________________________________________________________
1.1. Issues and Concerns around AI
Activity 1
Let us begin with a YouTube video: Humans Need Not Apply
Watch the video in groups of five. After watching it, each student in the group should write a
short note on their understanding of the video and present the write-up to the teacher.
As Artificial Intelligence evolves, so do the issues and concerns around it. Let us review some of
the issues and concerns around AI here:
Personal Privacy: Human behaviour and activities can now be tracked in ways that were
unimaginable earlier. AI systems need huge amounts of data in order to be trained, and in
many cases that data involves individuals' faces, medical records, financial data, location
information etc.
Job Loss: One of the primary concerns around AI is the future loss of jobs. According to
research by McKinsey, up to 800 million people could lose their jobs to automation
(https://www.theverge.com/2017/11/30/16719092/automation-robots-jobs-global-800-million-forecast).
At the same time, another point to keep in mind is that AI may also create more jobs; after
all, people will be tasked with creating these robots to begin with and then managing them in
the future.
Yes, AI makes mistakes. If humans make a mistake, there are laws that can be enforced; what do
we do in the case of AI? Do we have such laws for AI?
How should we treat AI robots? Should robots be granted human rights or citizenship? If robots
evolve to the point that they are capable of "feeling", does that entitle them to rights
similar to humans or animals? If robots are granted rights, then how do we rank their social
status?
Activity 2
Look at the four pictures below. Can you write a short story based on these four pictures?
1.2. AI and Ethical Concerns
Ethics is defined as the discipline dealing with moral obligations and duties of humans. It is a set
of moral principles which govern the behaviour and actions of individuals or groups.
“The ethics of AI is the part of the ethics of technology specific to robots and other artificially
intelligent beings. It can be divided into roboethics, a concern with the moral behaviour of
humans as they design, construct, use and treat artificially intelligent beings, and machine
ethics, which is concerned with the moral behaviour of artificial moral agents (AMAs). With
regard to artificial general intelligence (AGIs), preliminary work has been conducted on
approaches to integrating AGIs which are full ethical agents with existing legal and social
frameworks “ .
The bigger concerns are:
In this exercise, you will learn to think about the kind of world we make when we build new
technology, and the unintended consequences that can occur when we build that technology.
Instructions
1. Go to: https://talktotransformer.com/
2. Explore the tool for a little bit!
3. Then, answer the following prompts:
Write a brief description of your technology:
_______________________________________________________________
_______________________________________________________________
_______________________________________________________________
If this technology was used for evil, how might that be done?
_______________________________________________________________
If this technology was used to help other people, who might it help?
________________________________________________________________
________________________________________________________________
________________________________________________________________
Question 1: Why do most images that show up when you do an image search for “doctor” depict white
men?
Question 2: Why do most AI tools associate ‘Doctor’ with a man and ‘Nurse’ with a woman?
Question 3: Why do the virtual assistants (Alexa, Siri, Google Assistant etc.) all have female voices?
Question 4: Why do computer vision systems report high error rates when recognizing people of colour?
You can search further on the web and add more to this list. It is not that the developers did
this intentionally. This is what we call AI bias!
“AI bias is a phenomenon that occurs when an algorithm produces results that are systematically
prejudiced towards a certain gender, language, race, level of wealth etc., and therefore produces skewed
output. Algorithms can have built-in biases because they are created by individuals who have conscious
or unconscious preferences that may go undiscovered until the algorithms are used publicly.”
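To make the idea concrete for students, here is a minimal, hypothetical sketch in Python: a naive “classifier” that simply predicts the most common label in its training data. If the training data is skewed, every prediction is systematically skewed too — this is not any real system, only an illustration of how bias in data becomes bias in output.

```python
from collections import Counter

def train_majority_classifier(training_labels):
    """Return a 'model' that always predicts the most common training label."""
    most_common_label, _count = Counter(training_labels).most_common(1)[0]
    return lambda example: most_common_label

# Skewed (invented) training data: 9 of 10 historical examples label a doctor "man".
training_labels = ["man"] * 9 + ["woman"]
predict = train_majority_classifier(training_labels)

# The 'model' now answers "man" for every new input, regardless of reality:
print(predict("photo of a doctor"))   # prints "man"
print(predict("another photo"))       # prints "man"
```

No one wrote “prefer men” anywhere in the code; the prejudice came entirely from the data the model was trained on.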
Activity
Can you prepare a list of instances where bias appears in an AI system because of its algorithm?
3. People
The last issue is with the people who are developing the AI system, i.e. engineers, scientists,
developers etc. They aim to get the most accurate results from the available data, and are often
less focused on the broader context. It is rightly said that ethics and bias are not the problem
of the machine but of the humans behind the machine.
(Source :https://xkcd.com/1838/)
Adoption of AI by companies is increasing, and they see AI as critical to the future of their
business and its sustainability. However, there are concerns regarding the possible misuse of the
technology, which leads to trust and confidence issues.
Currently, AI can automate data entry tasks, take attendance of students in a classroom or
beat Garry Kasparov at chess. However, with more complicated algorithms such as machine
learning or neural networks, it becomes less likely that human beings can understand how the AI
arrived at a conclusion. It is essentially a ‘black box’ to humans. If a system is so complicated that a user
doesn’t understand how it works, how can we trust the decisions it makes?
While there is no easy solution to this problem, the way forward is for governments, industry
and regulatory bodies to join hands in addressing the challenge of ‘AI trust’ by doing the
following:
1. Minimize bias in training data
One of the most famous algorithms in the world right now is Google Search.
Sundar Pichai, the Google CEO, had to describe the algorithm to lawmakers,
explaining that the search algorithm uses over 200 signals, including relevance and
popularity, to determine a page’s rank. A bipartisan bill was recently proposed by
US lawmakers that would require internet giants such as Google, Facebook, Yahoo
and AOL to disclose their search algorithms.
(https://sensecorp.com/ethical-ai/)
3. AI developers should be representative/inclusive of diverse backgrounds – gender, religion,
skin colour, language and so on
4. There should be an international monitoring body that designs and monitors AI ethics
and algorithm policy
Activity
Does this picture tell you something? Can you describe this picture in your own words?
Although Artificial Intelligence has dramatically improved our world in many ways, there are notable
concerns regarding the forthcoming impact of AI on employment and the workforce.
Jobs that are monotonous and repetitive can be easily automated; this can gradually lead to
certain jobs becoming obsolete.
Activities related to customer care operations, document classification, content moderation,
production lines in factories etc. are at risk of being taken over by smart robots and software.
Self-driving cars and trucks will soon be a reality; transportation will see a transformation.
Financial services, insurance and any other sector requiring significant amounts of data
processing and content handling will also be impacted to a certain extent. AI can play a
significant role in eliminating bureaucracy and improving services to citizens.
The healthcare sector and imaging services will also have some degree of impact.
AI is widely expected to create millions more jobs than the ones it will affect. These new
jobs will require higher-order thinking skills.
The advent of the Internet and computers made a few jobs and roles obsolete, but we know
the number of opportunities and new jobs they have created.
ATMs definitely reduced the number of cashier positions in banks, but they have had a
positive impact on the banking business. ATMs lowered the cost associated with
running brick-and-mortar branches, and banks responded by opening
more ATMs – leading to the hiring of more bank personnel.
Think of the Ambassador (https://en.wikipedia.org/wiki/Hindustan_Ambassador) and the
Premier Padmini, or Fiat (https://en.wikipedia.org/wiki/Premier_Padmini).
What happened to these models? Do we see them on the roads now? Nobody pushed them out
of business; rather, their inertia and resistance to change with the changing times and
technology made them run out of business.
The advent of electricity and the mechanical engine changed the world. Each of them bettered our lives,
created jobs, and raised wages. AI will be bigger than electricity, bigger than mechanization,
bigger than anything that has come before it.
We don’t need to fear AI but prepare to reap the benefits of AI!
Let us conclude this unit by engaging in this activity.
Activity
Form groups of 5 students each, and ask the students to prepare a list of
i) 5 jobs or professions that AI will disrupt
ii) 5 job segments that will be immune to AI
iii) 10 new jobs or businesses that will be created by AI
Unit 5: Introduction to Storytelling
Summary: Students get to learn about the significance of storytelling, which has been used
as a medium to pass on knowledge, experience, and information through the ages. It also builds
intercultural understanding and the commonalities thereof. This session will also equip
students with a vital skill: telling their stories with numbers or proof points by blending
the two worlds of hard data and human communication. Data visualisation is now key to
interpreting data and telling an impactful story.
Objectives:
Learning Outcomes:
Purpose: Introduce the importance of storytelling and its effectiveness in passing on knowledge,
values, facts and events from one generation to another.
Say: “This unit intends to create value for storytelling. Although storytelling comes naturally to
everyone, keeping a few things in mind not only enriches the said art but also makes it more
impactful. This unit also dwells on the art of storytelling and how blending stories with numbers/
data can make storytelling forceful.”
After having briefed the students about storytelling, ask them to answer the questions that follow
and gauge their understanding of the subject. Engage the students in a discussion and ask them
about their expectations from this unit.
1. Storytelling: Communication across the ages
Stories have been central to human cognition and have proved to be the most effective way
of communication since time immemorial. There is a biochemical reason why people love
stories: it is the mode of communication our brains biologically prefer. When a good story is told,
the brain comes alive, because storytelling literally has a chemical effect on the brain that
wakes it up in order to absorb, digest and store information. Stories have the power to inspire,
motivate, and change people’s opinions. In short, stories are the best possible way to deliver
complex information (data).
Storytelling is defined as the art of narrating stories to engage an audience. It originated in
ancient times with visual stories, such as cave drawings, and then shifted to oral traditions, in
which stories were passed down from generation to generation by word of mouth. Later, words
formed into narratives that included written, printed and typed stories. Written language, as it
is seen now, was arguably the first technological innovation that gave us as a species the power
to convey a story in a physical format, and thus to visualize, archive and share that data with
community members and future generations. Storytelling encourages people to make use of their
imagination and inventiveness (creativity) to express themselves (verbal skills), which makes it a
lot more than just a recitation of facts and events.
1.1. Learn why storytelling is so powerful and cross-cultural, and what this means for
data storytelling
Stories create engaging experiences that transport the audience to another space and time.
They establish a sense of community belongingness and identity. For these reasons, storytelling
is considered a powerful element that enhances global networking by increasing awareness
of cultural differences and enhancing cross-cultural understanding. Storytelling is an
integral part of indigenous cultures.
Some of the factors that make storytelling powerful are its ability to make information more
compelling, its ability to offer a window onto the past, and finally its ability to
draw lessons and to reimagine the future by effecting necessary changes. Storytelling also
shapes, empowers and connects people by doing away with judgement and criticism, and facilitates
openness towards embracing differences.
A well-told story is an inspirational narrative that is crafted to engage the audience across
boundaries and cultures, as stories have an impact that isn’t possible with data alone. Data can
be persuasive, but stories are much more. They change the way that we interact with data,
transforming it from a dry collection of “facts” to something that can be entertaining, engaging,
thought provoking, and inspiring change.
Each data point holds some information which may be unclear and contextually deficient on its
own. Visualizations of such data are therefore subject to interpretation (and
misinterpretation). However, stories are more likely to drive action than statistics and
numbers are. Therefore, when data is told in the form of a narrative, it reduces ambiguity, connects data
with context, and describes a specific interpretation – communicating the important messages
in the most effective way. The steps involved in telling an effective data story are given below:
Understanding the audience
Choosing the right data and visualisations
Drawing attention to key information
Developing a narrative
Engaging your audience
Activity
A new teacher joined the ABC Higher Secondary School, Ambapalli to teach Science to the
students of Class XI. In his first class itself, he could make out that not everyone understood
what was being taught in class. So, he decided to take a poll to assess the level of students. The
following graph shows the level of interest of the students in the class.
[Pie chart: poll results showing the students’ levels of interest – segments of 40%, 25%, 19%, 11% and 5%]
Depending on the result obtained, he changed his method of teaching. After a month, he
repeated the same poll once again to ascertain if there was any change. The results of poll are
shown in the chart below.
[Pie chart: results of the repeated poll – segments of 38%, 30%, 14%, 12% and 6%]
With the help of the information provided, create a good data story, setting a strong narrative
around the data so that it is easier to understand the pre- and post-intervention data, the existing
problem, the action taken by the teacher, and the resolution of the problem. Distribute A4 sheets
and pens to the students for this activity.
2. The Need for Storytelling
The need for storytelling is gaining importance like never before, as more and more people are
becoming aware of its potential to serve multiple purposes.
Purpose: To familiarize students with the need for storytelling and how it proves
beneficial.
Say: “Now that you have learnt about storytelling and its power, we will introduce you to
the need of storytelling.”
Guide the students to think of the many needs that storytelling satisfies and enter them in the blank
circles in the figure below:
Storytelling
Expected Responses:
Purpose: To provide insight into data storytelling and how it can bring a story to life.
Say: “Now that you have understood what storytelling is and why it is needed, let us learn about
storytelling of a different kind – the art of data storytelling, where data is presented in the form
of a narrative or story.”
ITEM          QUANTITY
A4 sheets     xx
Pens          xx
Data storytelling is a structured approach for communicating insights drawn from data, and
invariably involves a combination of three key elements: data, visuals, and narrative. When the
narrative is accompanied by data, it helps explain to the audience what is happening in the
data and why a particular insight has been generated. When visuals are applied to data, they
can enlighten the audience to insights that they would not perceive without the charts or
graphs.
Finally, when narrative and visuals are merged together, they can engage or even entertain an
audience. When you combine the right visuals and narrative with the right data, you have a data
story that can influence and drive change.
3.1. By the numbers: How to tell a great story with your data?
Presenting the data as a series of disjointed charts and graphs could result in the audience
struggling to understand it – or worse, coming to the wrong conclusions entirely. Thus, the
importance of a narrative comes from the fact that it explains what is going on within the data
set. It offers context and meaning, relevance and clarity. A narrative shows the audience where
to look and what not to miss, and also keeps the audience engaged.
Good stories don’t just emerge from the data itself; they need to be unravelled from data
relationships. Closer scrutiny helps uncover how each data point relates to the others. Some easy
steps that can assist in finding compelling stories in data sets are as follows:
Step 1: Get the data and organise it.
Step 2: Visualize the data.
Step 3: Examine data relationships.
Step 4: Create a simple narrative embedded with conflict.
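As an illustration of Steps 1–3, the sketch below (Python, with invented numbers purely for illustration) organises yearly case counts and then examines one simple relationship – the year-over-year change – around which a narrative with conflict could be built:

```python
# Step 1: get the data and organise it (all numbers are invented for illustration).
cases_by_year = {2015: 160, 2016: 240, 2017: 430, 2018: 310, 2019: 130}

# Step 3: examine data relationships - here, the year-over-year change in cases.
years = sorted(cases_by_year)
changes = {year: cases_by_year[year] - cases_by_year[prev]
           for prev, year in zip(years, years[1:])}

for year, delta in changes.items():
    trend = "rose" if delta > 0 else "fell"
    print(f"In {year}, cases {trend} by {abs(delta)} versus the previous year.")
```

The rises and falls the loop reports are exactly the “conflicts” of Step 4: each one invites a question (why did cases spike? what intervention made them fall?) that the narrative can then answer.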
Activity: Try creating a data story with the information given below and use your imagination to
reason as to why some cases have spiked while others have seen a fall.
[Bar chart: Mosquito-borne diseases in Delhi, 2015–2019 – yearly case counts]
1. It is an effective tool to transmit human experience. Narrative is the way we simplify and
make sense of a complex world. It supplies context, insight, interpretation – all the things
that make data meaningful, more relevant and interesting.
2. No matter how impressive an analysis, or how high-quality the data, it is not going to
compel change unless the people involved understand what is explained through a story.
3. Stories that incorporate data and analytics are more convincing than those based
entirely on anecdotes or personal experience.
4. It helps to standardize communications and spread results.
5. It makes information memorable and easier to retain in the long run.
Data Story elements challenge –
Identify the elements that make a compelling data story and name them
_____________________
______________________
_____________________
Activity:
First, present the statistics as shown below. Ask the students to read them and say whether they
have understood the information presented.
1. 7.6% of men believe mobiles are a distraction as compared to 4.2% of the women.
2. Kids in the car cause 9.8% of the men to be distracted as compared to 26.3% of the
women.
Another way to present the same statistics is the visual shown below.
Ask the students which one tells a better story, and list out why.
(Expected response: the visual tells a better story; it is easier and quicker to
comprehend.)
4. Conflict and Resolution
Conflict is the most exciting and engaging driver in any story. Every story or plot is centred on its
conflict and the ways in which the characters of the story attempt to resolve the problem. Conflict in
a story is a struggle between two or more opposing forces, and it drives the plot forward
towards a resolution.
In business, as in daily life, users or audiences are always trying to resolve a conflict.
Decisions are made only after the conflict is resolved. Every question in data storytelling
is answered by finding evidence that addresses the conflict.
1. Communication
2. Teamwork
3. Problem Solving
4. Stress management
5. Emotional agility
Activity
A school has planned its annual meet for the year. 15 students can participate in a drama, for which 28
students have shown interest. The teacher coordinator decides to leave it to the 28
students to unanimously select the 15 students who will participate in the drama.
5. Storytelling for the Audience
Data storytelling has a few essential elements without which storytelling is impossible. Let us have a look at them:
Let’s do an activity
Create a data story to highlight the changes you see in yourself after the outbreak of COVID-19 and
the lockdown that followed in the country.
LEVEL 2: AI INQUIRED (AI APPLY) TEACHER INSTRUCTION MANUAL
Unit 6
Objectives:
1. To build focus on research, prototyping, and testing products and services so as to find
new ways to improve a product, service or design.
2. To develop the understanding that there is more to design thinking than just hardware
design.
3. To inculcate a design thinking approach that enhances students’ creative confidence.
Learning Outcomes:
1. Underlining the importance of Prototype as a solution to user challenge.
2. Recognizing empathy to be a critical factor in developing creative solutions for the end
users.
3. Applying multiple brainstorming techniques to find innovative solutions.
Pre-requisites: Reasonable fluency in the English language.
Key Concepts: Design Thinking framework, Prototype, Ideate
( https://en.wikipedia.org/wiki/The_Thinker )
“The illiterate of the 21st century will not be those who cannot read and write, but those who cannot learn, unlearn, and
relearn.” – Alvin Toffler, author of Future Shock.
---------------------------------------------------------------------------------------------------------------------------------------------
Activity 2
Have the class form groups of 3 or 4 students each and assign them tasks (let’s say to plan a party).
In round one, get everyone to start each sentence of their conversation with “Yes, BUT…”. After the first
round, ask the participants how the conversation went. How did their discussion to plan the party go?
For round two, get the participants to start their conversation with “Yes, AND….”. After the second round, ask
the group how that round went and compare the two rounds of discussions. The differences between the two
will be striking!
Purpose: Collaboration, along with the distinction between an open and a closed mindset.
Activity 3
Divide the class into groups of 4-5 students each. Pick a random object (e.g. a paperclip, pen or notebook), and
challenge each group to come up with 40 uses for the object. No repeats!
Each group will take turns in coming up with new ideas. Make sure that each group has a volunteer note-taker
to capture the ideas along with the total number of ideas their group comes up with. Allow 4 mins for this
challenge. When time is up, have each group share how many ideas they generated. The group with the most
ideas is declared the winner!
Activity 4
This is an activity that promotes pure imagination. The purpose is to think expansively about an ideal future
for the school, or for yourself; it is an exercise in visioning.
The objective of this activity is to suspend all disbelief and envision a future that is so stellar that it can land you
or your school on the cover of a well-known international magazine. The student must pretend as though this
future has already taken place and has been reported by the mainstream media.
Purpose: Encouraging students to “think big”; planting the seeds for a desirable future.
According to Wikipedia, "Design thinking refers to the cognitive, strategic and practical processes by which
design concepts (proposals for new products, buildings, machines, etc.) are developed.” Design thinking is also
associated with prescriptions for the innovation of products and services within business and social contexts.
Most often, design is used to describe hardware, a machine or a structure, but essentially, it is a process. It is
a set of procedures and principles that employ creative and innovative techniques to solve any complex
technological or social problem. It is a way of thinking and working towards a potential solution to a complex
problem.
Some years ago, an incident occurred where a truck driver tried to pass under a low bridge. But he failed, and
the truck got lodged firmly under the bridge. The driver was unable to continue driving through or reverse out.
The story goes that as the truck became stuck, it caused massive traffic problems, which resulted in emergency
personnel, engineers, firefighters and truck drivers gathering to devise and negotiate various solutions for
dislodging the trapped vehicle.
Emergency workers were debating whether to dismantle parts of the truck or chip away at parts of the bridge.
Each spoke of a solution that fit within his or her respective level of expertise.
A boy walking by and witnessing the intense debate looked at the truck, at the bridge, then looked at the road
and said nonchalantly, "Why not just let the air out of the tires?" to the absolute amazement of all the specialists
and experts trying to unpick the problem.
When the solution was tested, the truck was able to drive free with ease, having suffered only the damage
caused by its initial attempt to pass underneath the bridge. The story symbolizes the struggles we face where
oftentimes the most obvious solutions are the ones hardest to come by because of the self-imposed constraints
we work within.
( Source - https://www.interaction-design.org/literature/article/what-is-design-thinking-and-why-is-it-so-popular )
Now let’s move on to understand the Design Thinking framework. The illustration below has the various
components of the framework.
Empathize
Design thinking begins with empathy. This requires doing away with any preconceived notions and immersing
oneself in the context of the problem for better understanding. In simple words, through empathy, one is able
to put oneself in other people's shoes and connect with how they might be feeling about their problem,
circumstance, or situation.
There is a challenge one needs to solve. How does one approach it? Empathy starts from here. As a designer of
the solution to a challenge, one should always understand the problem from the end-user perspective.
Define
In the Define stage, information collected during Empathize is used to draw insights and is instrumental in stating
the problem that needs to be solved. It's an opportunity for the design thinker to define the challenge or to
write the problem statement in a human-centred manner with a focus on the unmet needs of the users.
Ideate
By now the problem is obvious and it is time to brainstorm ways and methods to solve it. At this stage, numerous
ideas are generated as a part of the problem-solving exercise. In short, ideation is all about idea generation.
During brainstorming, one should not be concerned if the generated ideas are possible, feasible, or even viable.
The only task of the thinkers is to generate as many ideas as possible. It requires “going wide” mentally
in terms of concepts and outcomes. There are many brainstorming tools that can be used during this stage.
By this time, you are already aware of who your target users are and what your problem statement is. Now it’s
time to come up with as many solutions as possible. This phase is all about creativity and imagination; all types of
ideas are encouraged, whether silly or wise – it hardly matters, as long as a solution is imagined.
Ideation is the most invigorating stage of Design Thinking, and consists of a process where any and all ideas are
welcomed, no matter how outrageous they may seem. A lot of planning and preparation goes into this stage to
ensure that the results are varied and innovative. After everyone shares their ideas, specific measures are
applied to evaluate the ideas without being judgmental or critical to narrow the list. It may so happen that the
solution comes from the unlikeliest of ideas. So, at this point focus is on quantity over quality of ideas. The most
feasible ideas are chosen for further exploration. Storyboarding, or making a visual mock-up of an idea, can also
be useful during ideation.
Prototype
The prototype stage involves creating a model designed to solve consumers’ problems, which is tested in the
next stage of the process. Creating a prototype is not a detailed process. It may include developing a simple
drawing, a poster, a group role-play, a homemade gadget, or a 3D-printed product. Prototypes must be quick,
easy and cheap to develop. They are therefore visualised as rudimentary forms of what a final
product is expected to look like. Prototyping is intended to answer questions that get you closer to your final
solution. Prototypes, though quick and simple to make, bring out useful feedback from users, and can be
made with everyday materials.
Test
One of the most important parts of the design thinking process is to test the prototypes with the end users. This
step is often seen going parallel to prototyping. During testing, the designers receive feedback about the
prototype(s), and get another opportunity to interact and empathize with the people they are finding solutions
for. Testing focuses on what can be learned about the user and the problem, as well as the potential solution.
Having understood the different stages, let us see some of the best examples of Design Thinking. You will need
to identify and highlight wherever you feel design thinking has been applied.
Example 3:
Fuel dispensers hanging overhead, unlike what is usually seen at gas filling stations in India
In order to extract and gather relevant facts and information from users/customers, it is recommended to
use this simple and reliable method of questioning: the 5W1H method.
(https://www.workfront.com/blog/project-management-101-the-5-ws-and-1-h-that-should-be-asked-of-every-project)
To collect facts and key information about the problem, ask and answer the five W’s and one H: Who?
What? When? Where? Why? and How?
https://www.sketchbubble.com/en/presentation-5w1h-model.html
For instance, if one’s car is giving inadequate gas mileage, the following questions can be asked:
What has changed – for instance, when were maintenance and repairs last done, or has the gas
station changed?
Where are the new driving routes or distances that the car is covering?
The questions can be changed to make them pertinent to whatever problem or issue needs to be addressed.
The essential W’s and H help to cover all aspects of a problem so that a comprehensive solution can be found.
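As a small hypothetical sketch (in Python), the 5W1H checklist can be kept as a simple reusable structure; here it is filled in for the gas-mileage example above, with the specific questions invented for illustration:

```python
# A reusable 5W1H checklist, filled in for the gas-mileage example.
five_w_one_h = {
    "Who":   "Who drives the car, and has the driver changed?",
    "What":  "What has changed: last maintenance, repairs, gas station?",
    "When":  "When did the mileage start to drop?",
    "Where": "Where are the new driving routes or distances?",
    "Why":   "Why might consumption have risen: load, tyre pressure?",
    "How":   "How is the car driven: speed, idling, short trips?",
}

# Walk the checklist so no aspect of the problem is skipped.
for question_word, question in five_w_one_h.items():
    print(f"{question_word}: {question}")
```

Swapping in a different problem only means rewriting the six answers; the six question words stay the same, which is exactly what makes the method comprehensive.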
Activity 1
Your best friend, who had scored very high marks in the mid-term exams, has surprisingly put up a poor
performance in the final-term exams. You decide to bring him back on track by spending time with him and
trying to extract facts to get to the root of the problem.
Use the 5W1H worksheet given below to record the questions and answers with your friend –
Where is it happening?
When is it happening?
Why is it happening?
Problems are at the centre of what many people do at work every day. Whether you're solving a problem for a
client (internal or external) or discovering new problems to solve - the problems you face can be large or small,
simple or complex.
The problem in the picture below may appear simple to you. Thinking through every aspect from the perspective
of the giraffe, can you solve it for them?
It has often been found that finding or identifying a problem is more important than the solution. For example,
Galileo recognised the problem of needing to know the speed of light, but did not come up with a solution. It
took advances in mathematics and science to solve this measurement problem. Yet to date Galileo still receives
credit for finding the problem.
Question 1: Rohan has been offered a job that he wants, but he doesn’t have a way to reach the office
premises and also doesn’t have enough money to buy a car.
Question-2: Instructors at a large university do not show up for technology training sessions. What do you think
is the problem?
The time frame for the training sessions does not meet the instructors' schedules.
The notifications for the training are sent in bulk mailings to all email accounts.
The Define stage of design thinking (identifying the problem) ensures you fully understand the goal of your design
project. It helps you articulate your design problem, and provides a clear-cut objective to work towards.
Without a well-defined problem statement, it’s hard to know what you’re aiming for. With this in mind, let’s
take a closer look at problem statements and how you can go about defining them.
1.3 Ideate
Ideation is the process of generating ideas and solutions through sessions such as sketching, prototyping,
brainstorming etc. In the ideation stage, design thinkers generate ideas — in the form of questions and solutions
— through creative and curious activities.
https://www.tutorialspoint.com/design_thinking/design_thinking_ideate_stage.htm
Ask the right questions and innovate with a strong focus on your users, their needs, and your insights
about them.
Get obvious solutions out of your heads, and drive your team beyond them.
Ideation Techniques:
Here is an overview of the most essential ideation techniques employed to generate numerous ideas:
Brainstorm
During a Brainstorming session, students leverage the synergy of the group to generate new innovative ideas by
building on others’ ideas. Participants should be able to discuss their ideas freely without fear of criticism. A
large number of ideas are collected so that different options are available for solving the challenge.
Brain dump
Brain dump is very similar to Brainstorm; however, it is done individually. It allows each person to open
their mind and let their thoughts be released and captured on a piece of paper. The participants write down their
ideas on paper or post-it notes and share them later with the larger group.
Brain writing
Brain writing is also very similar to a Brainstorm session and is known as ‘individual brainstorming’. At times,
only the most confident team members share their ideas while the introverts keep theirs to themselves.
Brain writing gives introverted people time to write their ideas down instead of sharing them out loud with
the group. The participants write down their ideas on paper and, after a few minutes, pass their piece
of paper to another participant, who then elaborates on the first person’s ideas, and so forth. In this way, all
participants pass their papers on to someone else and the process continues. After about 15 minutes, the papers
are collected and posted for instant discussion.
Group Activity: Your class has been tasked with a challenge: “How can the classroom be redesigned to better
meet the students’ needs without incurring any cost?”
Form groups of 4 or 5 students each. Apply the design thinking framework i.e. all five phases. Every group is
supposed to submit a detailed report (not more than 10 pages) in a week’s time to the teacher.
https://www.sketchup.com/ (Free 3D digital design tool. Ideal for prototyping and mocking up design
solutions.)
A Focus on Empathy
Empathy is the first step in design thinking because it allows designers to understand, empathize and share
the feelings of the users. Through empathy, we can put ourselves in other people’s shoes and connect with
how they might be feeling about their problem, circumstance, or situation.
A big part of design thinking focuses on the nature of impact that innovative thinking has on individuals. Recall
the students who were featured at the beginning of the module. Empathy was at the centre of their designs.
In preparation for your AI challenge, you are going to engage in an empathy map activity to practice one
way of empathizing in the design process.
To create a “persona” or profile for the user, you can use the empathy
map activity to create a realistic general representation of the user or
users. Personas can include details about a user’s education, lifestyle,
interests, values, goals, needs, thoughts, desires, attitudes, and actions.
Please look at the below links for the activity and the related video
Source- https://www.ibm.com/design/thinking/page/toolkit/activity/empathy-map
Instructions:
Empathy mapping is only as reliable as the data you bring to the table, so make sure you have defensible
data based on real observations (for example, from an interview or contextual inquiry). When you can, invite
users or Sponsor Users to participate.
Draw a grid and label the four essential quadrants of the map: Says, Does, Thinks, and Feels. Sketch your user
or stakeholder in the centre. Give them a name and brief description of who they are and what they do.
https://www.uxbooth.com/articles/empathy-mapping-a-guide-to-getting-inside-a-users-head/
3. Capture observations
Have everyone record what they know about the user or stakeholder. Use one sticky note per observation.
Place the sticky notes with the relevant answers on the appropriate quadrant of the map.
Within each quadrant, look for similar or related items. If desired, move them closer together. As you do,
imagine how these different aspects of your user’s life really affect how they feel. Can you imagine yourself in
their shoes?
Label anything on the map that might be an assumption or a question for later inquiry or validation. Look for
interesting observations or insights. What do you all agree on? What surprised you? What’s missing? Make sure
to validate your observations with other participants involved in the activity.
You’ve been asked to build a mobile app that will help connect students and tutors.
● Persona 1: Neha is a high school student and is focused on maintaining a high percentage to increase her
chances of getting into her first-choice college after Class 12th. She is struggling with her Physics class
and wants to find a tutor. She is looking for someone in her neighbourhood who she can meet with after
school, possibly on Saturday mornings.
● Persona 2: Priya is a college student and an expert in Physics who would like to make a little extra money
by helping students. She hopes to be a teacher one day and thinks being a tutor would help her gain
experience and build her resume. She would like to offer her services to students looking for a Physics tutor.
● Persona 3: Mr. Jaswinder Singh is a high school teacher and has several students struggling with their
Physics assignments. He would like to be able to direct his students to available tutors to help them improve
their grades and catch up with the rest of the class. He also wants to be able to check the progress of his
students to ensure they are taking appropriate steps to improve.
Unit 7
Data Analysis
Objectives:
1. Demonstrate an understanding of data analysis and statistical concepts.
2. Recognise the various types of structured data – string, date, etc.
3. Illustrate an understanding of various statistical concepts like mean, median, mode, etc.
Learning Outcomes:
1. Comprehension and demonstration of data management skills.
2. Students will demonstrate proficiency in applying the knowledge in statistical analysis of
data.
Pre-requisites: No previous knowledge is required, just an interest in methodology and data. All
you need is an Internet connection.
Key Concepts: Data Analysis, Structured Data, Statistical terms and concepts
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
Q 3. Why is data collected? State a few reasons you can think of.
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
________________________________________________________________________________
It is a widely known fact that Artificial Intelligence (AI) is essentially data-driven. AI involves converting large
amounts of raw data into actionable information that carries practical value and is usable. Therefore,
understanding statistical concepts and principles is essential to Artificial Intelligence and Machine
Learning. Statistical methods are required to find answers to the questions that we have about data. Statistics
and artificial intelligence share many commonalities: both disciplines have much to do with planning, combining
evidence, and decision-making. Statistical methods are required to understand the data used
to train a machine learning model and to interpret the results of testing different machine learning models.
The first section of this unit describes the different data types and how they get stored in a database. The second
section of this unit deals with data representation. Data are usually collected in a raw format and thus difficult
to understand. However, no matter how accurate and valid the captured data might be, it would be of no use
unless it is presented effectively. In the third part of the unit, we will get to learn what cases and variables are
and how you can compute measures of central tendency i.e. mean, median, mode, and dispersion i.e. standard
deviation and variance.
Excel files
SQL databases
Online Forms
Each of these has structured rows and columns that can be sorted or manipulated. Structured data is highly
organized and easily understood by machine language. The most attractive feature of the structured database
is that those working within relational databases can easily input, search, and manipulate structured data.
https://www.datamation.com/big-data/structured-data.html
Activity 1
Tick the correct image depicting structured data depending on your understanding of the same:
https://www.curvearro.com/blog/difference-between-structured-data-unstructured-data/
https://www.nbnminds.com/structured-data-vs-unstructured-data/
https://lawtomated.com/structured-data-vs-unstructured-data-what-are-they-and-why-care/
https://www.laserfiche.com/ecmblog/4-ways-to-manage-unstructured-data-with-ecm/
The date data type helps us store a date in a particular format. Say we want to store the date 2
January 2019: first we give the year, 2019, then the month, which would be 01,
and finally, the day, which would be 02.
The time data type helps us store a time in a particular format. Say we want to store the time
8:30:23 a.m.: first we specify the hour, 08, then the minutes, 30, and
finally the seconds, 23. The year data type holds year values such as 1995 or 2011.
https://docs.frevvo.com/d/display/frevvo/Setting+Properties
Activity
Write the date format used for the dates mentioned below. The first one has been solved as an example. Pay
attention to the separators used in each case. You may use MM or mm to denote month, DD or dd for day and
YY or yyyy for year.
a. mm-dd-yyyy - (07-26-1966)
b. ______________ - (07/26/1966)
c. ______________ - (07.26.1966)
d. ______________ - (26-07-1966)
e. ______________ - (26/07/1966)
f. ______________ - (26.07.1966)
g. ______________ - (1966-07-26)
h. ______________ - (1966/07/26)
i. ______________ - (1966.07.26)
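These format patterns map directly onto the date format codes used by most programming languages. A short Python sketch using the activity’s example date (for the teacher’s reference):

```python
from datetime import datetime

d = datetime(1966, 7, 26)  # 26 July 1966, the date used in the activity

# strftime codes: %m = month, %d = day, %Y = 4-digit year
print(d.strftime("%m-%d-%Y"))   # 07-26-1966  (mm-dd-yyyy)
print(d.strftime("%d/%m/%Y"))   # 26/07/1966  (dd/mm/yyyy)
print(d.strftime("%Y.%m.%d"))   # 1966.07.26  (yyyy.mm.dd)

# strptime goes the other way: from formatted text back to a date object
parsed = datetime.strptime("26-07-1966", "%d-%m-%Y")
print(parsed == d)              # True
```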
Examples:
It means the report card can be bundled section-wise, class-wise, or house-wise; so section, class, or house
are the categories here. An easy way to determine whether given data is categorical or numerical is
to try to calculate the average. If you can calculate an average, the data is numerical. If you
cannot calculate an average, the data is categorical.
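The “can you calculate an average?” rule of thumb can be demonstrated with a tiny Python sketch (the sample lists are illustrative only):

```python
def is_numerical(values):
    """Return True if an average can be computed, i.e. the data is numerical."""
    try:
        sum(values) / len(values)
        return True
    except TypeError:
        # Adding categories together is meaningless, so Python refuses
        return False

ages = [14, 15, 16, 14]            # numerical: an average makes sense
houses = ["Red", "Blue", "Green"]  # categorical: no average exists

print(is_numerical(ages))    # True
print(is_numerical(houses))  # False
```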
https://study.com/academy/exam/topic/ppst-math-data-analysis.html
http://www.intellspot.com/categorical-data-examples/
Question 3: Refer to the table on ice cream and answer the following:
How many belong to the group, ‘Adults who like Chocolate ice creams’?
Question 4: Refer to the table on hair colour, and answer the following:
2. Representation of Data
According to Wikipedia, “Statistics is the discipline that concerns the collection, organization, analysis,
interpretation and presentation of data.” It is the science of data that transforms the observations into usable
information. To achieve this task, statisticians summarize a large amount of data in a format that is compact and
produces meaningful information. Without displaying values for each observation (from populations), it is
possible to represent the data in brief while keeping its meaning intact using certain techniques called 'data
representation'. It can also be defined as a technique for presenting large volumes of data in a manner that
enables the user to interpret the important data with minimum effort and time.
2.2. Graphical Technique: Pie Chart, Bar graphs, line graphs, etc.
The visual display of statistical data in the form of points, lines, dots and other geometrical forms is most
common. It would not be possible to discuss the methods of construction of all types of diagrams and maps,
primarily due to time constraints. We will, therefore, describe the most commonly used graphs and the way they
are drawn.
These are:
Line graphs
Bar diagrams
Pie diagram
Scatter Plots
(a) Simplify the data by converting it into round numbers such as the growth rate of the population as shown in
the table for the years 1901 to 2001
(b) Draw an X and Y-axis. Mark the time series variables (years/months) on the X-axis and the data quantity/value
to be plotted (growth of population) in percent on the Y-axis.
(c) Choose an appropriate scale and label it on Y-axis. If the data involves a negative figure then the selected
scale should also show it.
The advantages of using a line graph are that it is useful for making comparisons between different datasets,
and that it makes changes easy to see over both the long and the short term, even when the changes over time are small.
https://www.yourarticlelibrary.com/population/growth-of-population-in-india-1901-to-2001-with-statistics/39653
[Line graph: growth rate of India’s population, 1901–2001. X-axis: Year (1911–2001); Y-axis: Growth Rate (%), roughly −0.5 to 1.5]
Activity 1: Find out the reasons for the sudden change in growth of population between 1911 and 1921 as is
evident from the above graph.
Activity 2: Between the student attendance data and student's score, which one according to you should be
represented using the line graph?
Activity 3: Construct a simple line graph to represent the rainfall data of Tamil Nadu as shown in the table
below
Months Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
Rainfall (cm) 2.3 2.1 3.7 10.6 20.8 35.6 22.8 14.6 13.8 27.5 20.6 7.5
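Before drawing the graph on paper, teachers may like to preview the shape of this series with a rough text chart in Python (a sketch only: the one-symbol-per-2-cm scale is an arbitrary choice, not part of the exercise):

```python
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
rainfall_cm = [2.3, 2.1, 3.7, 10.6, 20.8, 35.6,
               22.8, 14.6, 13.8, 27.5, 20.6, 7.5]

# One '*' for roughly every 2 cm of rainfall stands in for the y-axis scale;
# reading down the rows shows the rise to the June peak and a second rise in October.
for month, cm in zip(months, rainfall_cm):
    print(f"{month} {cm:5.1f} | {'*' * round(cm / 2)}")
```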
(c) Bars may be shaded with colours or patterns to make them distinct and attractive.
The advantages of using a bar graph are many: it is useful for comparing facts, it provides a visual display for
quick comparison of quantities in different categories, and it helps us ascertain relationships easily. Bar
graphs also show big changes over time.
Using the information being depicted in the graph above, answer the questions below:
Pie charts are used for representing compositions, or when trying to compare parts of a whole. They do not
show changes over time. Various applications of pie charts can be found in business, school and at home. For
business, pie charts can be used to show the success or failure of certain products or services. At school, pie
chart applications include showing how much time is allotted to each subject. At home, pie charts can be used
to see the expenses of monthly income on different goods and services.
The advantage of a pie chart is that it is simple, easy to understand, and provides data comparison at a
glance.
Imagine you survey your class to find what kind of books they like the most. You record your findings in the
table for all 40 students in the class.

Step 1: Count the responses for each genre:
Genre: Classic  Fiction  Comedy  Story  Biography
Count:  6        11       8       7      8

Step 2: Add the counts to get the total: 6 + 11 + 8 + 7 + 8 = 40

Step 3: Next, divide each value by the total and multiply by 100 to get the percentage:
(6/40) * 100   (11/40) * 100   (8/40) * 100   (7/40) * 100   (8/40) * 100
[Pie chart of the class’s book preferences: Classic 6, Fiction 11, Comedy 8, Story 7, Biography 8]
Scatter plots are used when there is paired numerical data and when the dependent variable may have multiple
values for each value of the independent variable. The advantage of a scatter plot lies in its ability to portray
trends, clusters, patterns, and relationships.
A student had a hypothesis for a science project. He believed that the more the students studied Math, the
better their Math scores would be. He took a poll in which he asked students the average number of hours that
they studied per week during a given semester. He then found out the overall percentage that they received in
their Math classes. His data is shown in the table below:
Maths Grade (%): 82  81  90  74  77  97  51  78  86  88
The independent variable, or input data, is the study time because the hypothesis is that the Math grade
depends on the study time. That means that the Math grade is the dependent variable, or the output data. The
input data is plotted on the x-axis and the output data is plotted on the y-axis.
Negative Correlation: The two variables move in opposite directions: as one variable
increases, the other variable decreases, and as one decreases, the other increases. If, among the
data points, the x-coordinate and the y-coordinate move in opposite directions (one increases while the other decreases), it is termed
a negative correlation.
E.g. When hours spent sleeping increases hours spent awake decreases, so they are negatively correlated.
No correlation: If no relationship becomes evident between the two variables, then there is no correlation.
E.g. there is no correlation between the amount of tea consumed and the level of intelligence.
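Teachers comfortable with Python can show how a correlation is quantified using Pearson’s correlation coefficient r (+1 for a perfect positive correlation, −1 for a perfect negative one, near 0 for none). The sleep/awake figures below are a made-up illustration of the negative-correlation example in the text:

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient: +1 perfect positive, -1 perfect negative, ~0 none."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Covariance numerator and the two standard-deviation-style denominators
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

hours_asleep = [6, 7, 8, 9, 10]
hours_awake = [18, 17, 16, 15, 14]   # always 24 minus hours asleep

print(round(pearson_r(hours_asleep, hours_awake), 6))  # -1.0: perfectly negative
```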
3. Exploring Data
Exploring data is about "getting to know" the data and its values: whether they are typical or unusual,
centred or spread out, or whether they are extremes. More importantly, during the process of exploration one gets an
opportunity to identify and correct any problems in the data that would affect the conclusions drawn
during analysis. This is the first step in data analysis and involves summarizing the main characteristics of a
dataset, such as its size, accuracy, initial patterns in the data and other attributes.
The number of runs or wickets of a team can be expressed in terms of variables and cases.
Before getting into the proper definition of case or variable let us try to understand its meaning in simple words:
“Variables” are the features of someone or something, and a “case” is that someone or something. So, here cricket
is the case, and the features of cricket, like wickets, runs, wins, etc., are the variables.
Example 1:
Take another example, you want to know the age, height and address of your favourite cricket player.
Example 2:
Let us take one more example where data is collected from a sample of STAT 200 students. Each student's
major, quiz score, and lab assignment score is recorded.
______________________________________________
______________________________________________
Example 3:
A fourth-grade teacher wants to know if students who spend more time studying at home get higher homework
and exam grades.
______________________________________________
______________________________________________
So, given the examples you came across here, you would have understood that a dataset contains information
about a sample. Hence a dataset is said to consist of cases and cases are nothing but a collection of objects. It
must now also be clear that a variable is a characteristic that is measured and whose value can keep changing
during the program. In other words, something that can vary. This is in striking contrast to a constant which is
the same for all cases in a study.
Example:
Let’s say you are collecting blood samples from students in a school for a CBC test, where the following
components would be measured:
Haemoglobin level
Platelets
The students are the cases and all the components of blood are the variables.
Take another example: x = 10 means that x is a variable that stores the value 10 in it.
After x = x + 5, the name of the variable is still x, but its value has changed to 15 due to the addition of the constant 5.
https://slideplayer.com/slide/8137745/
1. Nominal Level
Data at the nominal level is qualitative. Nominal variables are categories such as Mercedes, BMW or Audi,
or the four seasons – winter, spring, summer and autumn. They are not numbers, cannot be used in
calculations, and carry no order or rank. The nominal level of measurement is the simplest or lowest of the
four ways to characterize data. Nominal means "in name only".
Eye colours, yes-or-no responses to a survey, gender, smartphone companies, etc. all deal with the nominal
level of measurement. Even some things with numbers associated with them, such as the number on the back of
a cricketer’s T-shirt, are nominal, since such numbers are used as "names" for individual players on the field and not for
any calculation purpose.
https://slideplayer.com/slide/8059841/
2. Ordinal Level
Ordinal data is made up of groups and categories which follow a strict order. For example, suppose you have been asked to
rate a meal at a restaurant and the options are: unpalatable, unappetizing, just okay, tasty, and delicious.
Although the restaurant has used words, not numbers, to rate its food, it is clear that these preferences are
ordered from negative to positive or low to high; thus the data is qualitative and ordinal. However, the differences
between the data points cannot be measured. Like nominal scale data, ordinal scale data cannot be used in
calculations.
Consider a hotel industry survey where the responses to questions about the hotels are recorded as "excellent," "good,"
"satisfactory," and "unsatisfactory." These responses are ordered, or ranked, from the most desired (excellent)
to the least desired (unsatisfactory). But, as in the previous case, the differences between two pieces of data
cannot be measured.
Another common example of this is the grading system, where letters are used to grade a service or good. You
can order things so that an A is higher than a B, but without any other information there is no way of knowing how
much better an A is than a B.
https://slideplayer.com/slide/6564103/
3. Interval Level
Data that is measured using the interval scale is similar to ordinal level data because it has a definite ordering,
but with interval data the differences between values can also be measured. Interval scale data, however,
has no true starting point, i.e. no meaningful zero value.
Temperature scales like Celsius (°C) and Fahrenheit (°F) are measured using the interval scale. In both
temperature measurements, 40° is equal to 100° minus 60°: differences make sense. But 0 degrees does not,
because in both scales 0 is not the absolute lowest temperature: temperatures like −20°F and −30°C exist and
are colder than 0.
Interval level data can be used in calculations, but ratio comparisons cannot be made. 80°C is not four times as hot
as 20°C (nor is 80°F four times as hot as 20°F). There is no meaning to the ratio of 80 to 20 (or four to one).
4. Ratio Level
Ratio scale data is like interval scale data, but it has a true 0 point and ratios can be calculated. For example, the
scores of four multiple choice statistics final exam questions were recorded as 80, 68, 20 and 92 (out of a
maximum of 100 marks). The grades are computer generated. The data can be put in order from lowest to
highest: 20, 68, 80, 92 or vice versa. The differences between the data have meaning. The score 92 is more than
the score 68 by 24 points. Ratios can be calculated. The smallest score is 0. So, 80 is four times 20. The score of
80 is four times better than the score of 20.
So, we can add, subtract, divide and multiply two ratio level variables. E.g. the weight of a person: it has a real
zero point, i.e. zero weight means that the person has no weight. Also, we can add, subtract, multiply and
divide weights on the ratio scale for comparisons.
Activity
1. Student Health Survey – Fill in the response and mention appropriate Level of Measurement
3. Indicate whether the variable is ordinal or not. If the variable is not ordinal, indicate its variable type.
Opinion about a new law (favour or oppose)
Letter grade in an English class (A, B, C, etc.)
Student rating of teacher on a scale of 1 – 10.
So, the tabular format of representation of the cases and variables being used in your statistical study is known as
a Data Matrix. Each row of a data matrix represents a case and each column represents a variable. A complete
data matrix may contain thousands or lakhs or even more cases.
Each cell contains a single value for a particular variable (or observation).
Imagine you want to create a database of top 3 scorers in each section of every class of your school. The case
you are interested in is individual students (top 3) and variables you want to capture – name, class, section, age,
aggregate %, section rank, and address.
The best way to arrange all this information is to create a data matrix.
Name  Class  Section  Age  Aggregate %  Section Rank  Address
A     X      M        16   92           3             Add1
B     X      M        15   98           1             Add2
C     X      M        16   95           2             Add3
D     IX     N        14   96           1             Add4
E     IX     N        14   95           2             Add5
Z     IV     M        9    97           1             Add10
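In a program, this kind of data matrix is naturally modelled as a list of dictionaries: one dictionary per case (row), one key per variable (column). This sketch uses the first few rows of the table above; the key names are our own choice, not part of the exercise:

```python
# Each row (case) is one student; each key is a variable (column)
data_matrix = [
    {"name": "A", "class": "X", "section": "M", "age": 16, "aggregate": 92, "rank": 3, "address": "Add1"},
    {"name": "B", "class": "X", "section": "M", "age": 15, "aggregate": 98, "rank": 1, "address": "Add2"},
    {"name": "C", "class": "X", "section": "M", "age": 16, "aggregate": 95, "rank": 2, "address": "Add3"},
    {"name": "D", "class": "IX", "section": "N", "age": 14, "aggregate": 96, "rank": 1, "address": "Add4"},
]

# A single cell: the value of one variable for one case
print(data_matrix[1]["aggregate"])  # 98

# A column: the same variable read across all cases
ages = [row["age"] for row in data_matrix]
print(ages)  # [16, 15, 16, 14]
```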
Activity 1
Flipping a coin
Suppose you flip a coin five times and record the outcome (heads or tails) each time
Outcomes
Activity 2
The car data set was prepared by randomly selecting 54 cars and then collecting data on various attributes of
these cars. The first ten observations of the data can be seen in the data matrix below:
Activity 3: Prepare a data matrix to record sales of different types of fruits from a grocery store. Note variables
can be weight, per unit cost, total cost.
A frequency table is constructed by arranging the collected data values in either ascending order of magnitude
or descending order with their corresponding frequencies.
Data value  Tally  Frequency
1           ///    3
2           ///    3
3           //     2
4           //     2
5           //     2
6           /////  5
7           ////   4
8           /////  5
9           //     2
10          //     2
Total              30
Example 1: The following data shows the test marks obtained by a group of students. Draw a frequency table
for the data.
6 7 7 1 3 2 8 6 8 2
4 4 9 10 2 6 3 1 6 6
9 8 7 5 7 10 8 1 5 8
Go through the data; make a stroke in the tally column for each occurrence of the data. The number of strokes
will be the frequency of the data.
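The tally-and-count procedure can be automated in Python with `collections.Counter`, using the test marks from Example 1 (a sketch for the teacher’s reference, not part of the exercise):

```python
from collections import Counter

marks = [6, 7, 7, 1, 3, 2, 8, 6, 8, 2,
         4, 4, 9, 10, 2, 6, 3, 1, 6, 6,
         9, 8, 7, 5, 7, 10, 8, 1, 5, 8]

# Counter does the tallying: it maps each mark to how often it occurs
freq = Counter(marks)
for mark in sorted(freq):
    # Tally strokes drawn with '/' just like the hand-drawn table
    print(f"{mark:>2} {'/' * freq[mark]:<6} {freq[mark]}")

print("Total", sum(freq.values()))  # 30 observations in all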
When the set of data values are spread out, it is difficult to set up a frequency table for every data value as
there will be too many rows in the table. So, we group the data into class intervals (or groups) to help us
organize, interpret and analyse the data.
Example 2: The number of calls from motorists per day for roadside service was recorded for a particular
month in a year. The results were as follows:
We have already studied graphs and various graphical representation methods in the previous
chapters. Here we will learn about the spread of data, i.e. its distribution. The shape of
a distribution is described by its number of peaks and by its possession of symmetry, its tendency to skew, or its
uniformity. (Distributions that are skewed have more points plotted on one side of the graph than on the other.)
The shape of the data distribution represents:
Where the central tendency (i.e. mean) lies in the data spread
1. Number of Peaks
A distribution can have a single peak (unimodal), two peaks (bimodal), or many peaks (multimodal); peaks are also called modes.
http://www.lynnschools.org/classrooms/english/faculty/documents
One of the most common types of unimodal distribution is the normal distribution, or ‘bell curve’, so called
because its shape looks like a bell.
2. Symmetry
A graph is symmetric when a vertical line drawn at its centre forms mirror images, with the left
half of the graph being the mirror image of the right half. The normal distribution and the U-
distribution are examples of symmetric graphs.
https://mathbitsnotebook.com/Algebra1/StatisticsData/STShapes.html
3. Skewness
Unsymmetrical distributions are skewed: most of the values pile up on one side of the mean, which leaves a
long tail either in the negative direction on the number line (left or negative skew) or in the positive direction
(right or positive skew).
Let us further describe these shapes of distribution using histogram, one of the simplest but widely used data
representation methods.
Histograms are a very common method of visualizing data, and that means that understanding how to
interpret histograms is a valuable and important skill in statistics and machine learning.
Case – 1
In this histogram, you can see that the mean is close to 50. The shape of the graph is roughly symmetric and the
values fall between 40 and 64. In some sense, the value 64 looks like an outlier.
____________________________________________
Case – 2
This histogram has two peaks, which suggests that it represents two groups: one group with a mean of 50
and the other with a mean of 65.
Can you think of a real-life example for such a case with two means?
Let us assume, that for the sports meet being held in your school your parents along with your grandparents
were also invited. The age of the parents ranged between 40 – 55 years (in blue colour), and that of grandparents
in the range of 60 – 80 years (in pink colour). During the break, both these cases (parents and grandparents)
bought snacks from the snacks counter. Y – Axis shows the money they spent at the counter in buying the snacks.
Activity 1
Can you think of an event(s) where you have two cases, in your classroom environment? Capture the data and
plot on the histogram to have two peaks. Once done, tell your data story to the class.
Case – 3
Case 3 represents the right-skewed distribution (the direction of the skew indicates which way the longer tail
extends), the long tail extends to the right while most values cluster on the left, as shown in the histogram
above.
Case – 4
In Case 4, which is a left-skewed distribution, the long tail extends to the left while most values cluster on the right.
The mean (or average) is the most popular and well-known measure of central tendency. The mean is equal to
the sum of all the values in the data set divided by the number of values in the data set. Therefore, in this case,
mean is the sum of the total marks you scored in all the subjects divided by the number of subjects.
M = ∑x / n
Where M = Mean
x = Scores (individual values)
n = Number of scores
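In Python the same formula is one line. The marks below are a hypothetical report card, not data from the text:

```python
def mean(scores):
    """M = sum of all values divided by the number of values."""
    return sum(scores) / len(scores)

subject_marks = [72, 85, 90, 68, 80]  # hypothetical marks in five subjects
print(mean(subject_marks))  # (72 + 85 + 90 + 68 + 80) / 5 = 79.0
```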
Activity 1: When you search for a game app on the Play Store, you must have looked at the rating of the app. Can
you figure out how that rating is calculated?
Activity 2: In the example below, if we add all the ratings given by users, the total comes to 45685. Then how come
the rating is 3.5?
3.4.2 Median
Suppose, there are 11 students in your class and one of them is from a very wealthy family.
Student        1    2    3    4    5    6    7    8    9    10   11
Pocket Money   600  580  650  650  550  700  590  640  600  595  20000
Upon calculating the mean or average, it would turn out that the average pocket money received by students
of the class is about Rs. 2378.
However, on cross-checking with the amounts mentioned in the table, this does not appear representative. Is any
of the first ten students getting pocket money even close to INR 2378? The answer is No! So, should we use the
mean to represent the data here? No! The one extreme amount received by the student from a wealthy family
has distorted the mean.
So, what else can we do so that the value represents the majority of students?
The median is the middle score of a data set that has been arranged in order of magnitude. The median is
less affected by outliers and skewed data. The median value here is INR 600, and this value represents the
majority of students far better.
For grouped data, the calculation of the median in a continuous series involves the following steps:
3.4.3 Mode
Mode is another important measure of central tendency of a statistical series. It is the value which occurs most
frequently in the data set. On a bar chart or histogram, the mode corresponds to the highest bar. You can,
therefore, sometimes consider the mode as being the most popular option.
https://contexturesblog.com/archives/2013/06/13/excel-functions-average-median-mode/
https://mathsmadeeasy.co.uk/gcse-maths-revision/bar-graphs-revision/
You need to buy a shoe. You go to the market and ask the shopkeeper 'what is the average shoe size you sell?';
he will give an answer corresponding to the size that he sells the most. That is the mode.
The arithmetic mean and median would give you shoe sizes that may not even exist and are therefore
meaningless here.
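The shopkeeper's reasoning can be sketched in a few lines. The shoe sizes below are invented sales figures, not data from the text:

```python
import statistics

# Hypothetical shoe sizes sold in one day (illustrative data)
sizes = [7, 8, 8, 9, 8, 10, 7, 8, 9, 8]

print(statistics.mode(sizes))   # 8  — the "average" the shopkeeper quotes
print(statistics.mean(sizes))   # 8.2 — a size that is not actually stocked
```

Here the mode (8) is a size the shop really sells, while the mean (8.2) names a shoe that does not exist.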
Now let us understand how you can use these measures of central tendency in Machine Learning.
Machine Learning (or AI) is a data-driven technology. If the input data is wrong, ML will produce wrong results.
When working with data, it is good to have an overall picture of it, and in particular an idea of how the values
in a given data set are distributed. The distribution of a data set can be
Symmetric
Skewed
◦ Negatively skewed
◦ Positively skewed
The skewness of data can be found either by data visualization techniques or by calculating the measures of
central tendency.
http://homepage.stat.uiowa.edu/~rdecook/stat1010/notes/Section_4.2_distribution_shapes.pdf
Positively (right) skewed: Mode < Median < Mean
Negatively (left) skewed: Mean < Median < Mode
https://www.calculators.org/math/mean-median-mode.php
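The mean–median relationship above can be turned into a rough, code-based skew check. This is only a sketch with invented data; for real work you would also inspect a histogram:

```python
import statistics

def skew_direction(data):
    """Rough skew check using the mean-median relationship noted above."""
    mean, median = statistics.mean(data), statistics.median(data)
    if mean > median:
        return "positively skewed"   # long tail pulls the mean to the right
    if mean < median:
        return "negatively skewed"   # long tail pulls the mean to the left
    return "roughly symmetric"

# One large value drags the mean above the median
print(skew_direction([1, 2, 2, 3, 3, 3, 4, 20]))  # positively skewed
```

This heuristic only indicates a direction; it says nothing about how strong the skew is.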
As the formula shows, the z-score is simply the raw score minus the
sample mean, divided by the sample standard deviation.
The value of the z-score tells you how many standard deviations your data point is away from the mean. If a z-
score is equal to 0, it is on the mean.
A positive z-score indicates the raw score is higher than the mean average. For example, if a z-score is equal to
+1, it is 1 standard deviation above the mean.
A negative z-score reveals the raw score is below the mean average. For example, if a z-score is equal to -2, it
is 2 standard deviations below the mean.
Question
There are 50 students in your class, and you scored 70 out of 100 in the SST exam. How well did you perform in
your SST exam?
Answer
Let us re-phrase this question. In fact, you need to find: “What percentage (or number) of students scored
higher than you, and what percentage (or number) of students scored lower than you?”
This is a perfect case for the z-score. To calculate the z-score, you need the mean SST score of your class and
the standard deviation. Suppose the mean is 60 and the standard deviation is 15.
z-score = (x – μ) / σ
= (70 – 60) / 15 = 10 / 15
≈ 0.6667
This means you scored about 0.6667 standard deviations above the mean.
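The calculation above can be wrapped in a small function. The class mean of 60 and standard deviation of 15 are the values assumed in the worked example:

```python
def z_score(x, mu, sigma):
    """Standard score: how many standard deviations x lies from the mean mu."""
    return (x - mu) / sigma

# SST example: score 70, assumed class mean 60, assumed standard deviation 15
z = z_score(70, 60, 15)
print(round(z, 4))  # 0.6667
```

A score equal to the mean gives z = 0, and negative scores fall below the mean, matching the interpretation above.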
Practice Questions
1. Find (a) the mean (b) the median (c) the mode (d) the range of the below data set
5, 6, 2, 4, 7, 8, 3, 5, 6, 6
4, 1, 5, 4, 3, 7, 2, 3, 4, 1
(b) Calculate
(c) A researcher says: "The mode seems to be the best average to represent the data in this survey."
Give ONE reason to support this statement.
3. The mean number of students ill at a school is 3.8 per day, for the first 20 school days of a term. On the 21st
day 8 students are ill. What is the mean after 21 days?
4. If a positively skewed distribution has a median of 50, which of the following statements is true?
5. Which of the following is a possible value for the median of the below distribution?
A) 32
B) 26
C) 17
D) 40
Unit 8
Regression
Learning Outcomes:
1. Students should be able to estimate the correlation coefficient for a given data set
2. Students should be able to estimate the line of best fit for a given data set
3. Students should be able to determine whether a regression model is significant
Pre-requisites:
1. Students must be able to plot points on the Cartesian coordinate system
2. They should have basic understanding of statistics and central tendencies
Key Concepts: Regression, Correlation, Pearson’s r
The value shows how good the correlation is (not how steep the line is), and if it is positive or negative.
The table below is a crosstab that shows, by age, whether somebody has an unlisted phone number.
Each cell of the table shows the number of observations with that combination of values of the two
variables.
We can see, for example, there are 185 people aged 18 to 34 years who do not have an unlisted phone
number.
Column percentages are also shown (these are percentages within the columns, so that each column’s
percentages add up to 100%); for example, 24% of all the people without an unlisted phone number are
aged 18 to 34 years.
The age distribution for people without unlisted numbers is different from that for people with
unlisted numbers. In other words, the crosstab reveals a relationship between the two: people
with unlisted phone numbers are more likely to be younger.
Thus, we can also say that the variables used to create this table are correlated. If there were
no relationship between these two categorical variables, we would say that they were not
correlated.
In this example, the two variables can both be viewed as being ordered. Consequently, we can
potentially describe the patterns as being positive or negative correlations (negative in the table
shown). However, where both variables are not ordered, we can simply refer to the strength of the
correlation without discussing its direction (i.e., whether it is positive or negative).
1.1.2 Scatterplots
A scatter plot (aka scatter chart, scatter graph) uses dots to represent values for two different numeric
variables. The position of each dot on the horizontal and vertical axis indicates values for an individual
data point. Scatter plots are used to observe relationships between variables.
Example
This is a scatter plot showing the amount of sleep needed per day by age.
As seen above, as you grow older, you need less sleep (but still probably more than you’re currently
getting).
Question: Is this a positive or a negative correlation?
Answer: This is a negative correlation. As we move along the x-axis toward the greater numbers,
the points move down, which means the y-values are decreasing, making this a negative correlation.
1.2 Pearson’s r
The Pearson correlation coefficient is used to measure the strength of a linear association between
two variables, where the value r = 1 means a perfect positive correlation and the value r = -1 means a
perfect negative correlation. So, for example, you could use this test to find out whether people's
height and weight are correlated (the taller the people are, the heavier they're likely to be).
Requirements for Pearson's correlation coefficient are as follows:
◦ Scale of measurement should be interval or ratio
How can we determine the strength of association based on the Pearson correlation coefficient?
The stronger the association of the two variables, the closer the Pearson correlation coefficient, r, will
be to either +1 or -1 depending on whether the relationship is positive or negative, respectively.
Achieving a value of +1 or -1 means that all your data points are included on the line of best fit – there
are no data points that show any variation away from this line. Values for r between +1 and -1 (for
example, r = 0.8 or -0.4) indicate that there is variation around the line of best fit. The closer the value
of r to 0 the greater the variation around the line of best fit. Different relationships and their correlation
coefficients are shown in the diagram below:
Remember that these values are guidelines and whether an association is strong or not will also depend on
what you are measuring.
Example 1
In the example below of 6 people with different age and different weight, let us try calculating the value of the
Pearson r.
Solution:
For the Calculation of the Pearson Correlation Coefficient, we will first calculate the following values:
Assumptions
There are four "assumptions" that underpin a Pearson's correlation. If any of these four assumptions are not
met, analysing your data using a Pearson's correlation might not lead to a valid result.
Assumption # 1: The two variables should be measured at the continuous level. Examples of such continuous
variables include height (measured in feet and inches), temperature (measured in °C), salary (measured in
dollars/INR), revision time (measured in hours), intelligence (measured using IQ score), reaction time (measured
in milliseconds), test performance (measured from 0 to 100), sales (measured in number of transactions per
month), and so forth.
Assumption # 2: There needs to be a linear relationship between your two variables. Whilst there are a number
of ways to check whether a Pearson's correlation exists, we suggest creating a scatterplot using Stata, where
you can plot your two variables against each other. You can then visually inspect the scatterplot to check for
linearity. Your scatterplot may look something like one of the following:
Assumption #3: There should be no significant outliers. Outliers are simply single data points within your data
that do not follow the usual pattern (e.g. in a study of 100 students' IQ scores, where the mean score was 108
with only a small variation between students, one student had a score of 156, which is very unusual, and may
even put her in the top 1% of IQ scores globally). The following scatterplots highlight the potential impact of
outliers:
Pearson's r is sensitive to outliers, which can have a great impact on the line of best fit and the Pearson
correlation coefficient, leading to misleading conclusions about your data. Therefore, it is best if there are
no outliers or they are kept to a minimum. Fortunately, you can use Stata to detect possible outliers
using scatterplots.
Assumption # 4: Your variables should be approximately normally distributed. In order to assess the statistical
significance of the Pearson correlation, you need to have bivariate normality, but this assumption is difficult to
assess, so a simpler method is more commonly used.
Let there be two variables x and y. If y depends on x, then the result comes in the form of a simple regression.
Furthermore, we name the variables x and y as:
Also, we can have one more definition for the regression line of y on x: we can call it the line of best fit, as
it results from the method of least squares. This method is the most suitable for estimating the value
of y for a given x, i.e. the value of the dependent variable from the independent variable.
Least Squares Method
∑ei² = ∑(yi – ŷi)² = ∑(yi – a – bxi)²
Here:
ŷi = a + bxi denotes the estimated value of yi for a given value of xi
ei = the difference between the observed and estimated value; this is the error or residual. The
regression line of y on x, along with the estimation errors, is as follows:
On minimizing the least squares expression, we get the following equations, referred to as the Normal Equations:
∑yi = na + b∑xi
∑xiyi = a∑xi + b∑xi²
We get the least squares estimate for a and b by solving the above two equations for both a and b.
b = Cov(x, y) / Sx²
  = (r·Sx·Sy) / Sx²
  = (r·Sy) / Sx
a = ȳ – b·x̄
(y – ȳ)/Sy = r·(x – x̄)/Sx
Sometimes, it might so happen that variable x depends on variable y. In such cases, the line of regression of x
on y is:
x = â + b̂y
Regression Equation
(x – x̄)/Sx = r·(y – ȳ)/Sy
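The Normal Equations above can be solved directly in code. This is a minimal sketch of the least-squares fit y = a + bx; the (x, y) pairs are small made-up data chosen so the fit is exact:

```python
def fit_line(xs, ys):
    """Solve the Normal Equations for the intercept a and slope b of y = a + bx."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # b = (n·∑xy − ∑x·∑y) / (n·∑x² − (∑x)²), then a from ∑y = na + b·∑x
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b

a, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])  # these points lie exactly on y = 1 + 2x
print(a, b)  # 1.0 2.0
```

Because the line passes through (x̄, ȳ), the intercept falls out immediately once the slope is known, exactly as in the formulas above.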
Question
Solution
(i) Both regression lines pass through the common point (x̄, ȳ). Therefore, we substitute x̄ and ȳ for x and y
in the two equations:
7x̄ – 3ȳ = 18
4x̄ – ȳ = 11
Solving these two equations gives x̄ = 3 and ȳ = 1.
(ii) We know that r² is the product of the two regression coefficients:
r² = (7/3) × (1/4) = 7/12
Therefore, r = √(7/12) ≈ 0.76 (taken as positive, since both regression coefficients are positive).
Regression lines are very useful for forecasting procedures. The purpose of the line is to describe the
interrelation of a dependent variable (Y variable) with one or many independent variables (X variable). By using
the equation obtained from the regression line an analyst can forecast future behaviours of the dependent
variable by inputting different values for the independent ones. Regression lines are widely used in the financial
sector and in business in general.
Financial analysts employ linear regressions to forecast stock prices, commodity prices and to perform
valuations for many different securities. On the other hand, companies employ regressions for the purpose of
forecasting sales, inventories and many other variables that are crucial for strategy and planning.
(Y = a + bX + u)
Where Y is the dependent variable, a is the intercept, b is the slope, X is the independent (explanatory)
variable, and u is the error term.
Example: 1
Data was collected on the “depth of dive” and the “duration of dive” of penguins. The following linear model is
a fairly good summary of the data:
Where:
Interpretation of the slope: If the duration of the dive increases by 1 minute, we predict the depth of the
dive will increase by approximately 2.915 yards.
Interpretation of the intercept: If the duration of the dive is 0 minutes, then we predict the depth of the dive
is 0.015 yards.
Comments: The interpretation of the intercept doesn’t make sense in the real world. It isn’t reasonable for
the duration of a dive to be near t = 0, because that’s too short for a dive. If data with x-values near zero
wouldn’t make sense, then usually the interpretation of the intercept won’t seem realistic in the real world.
It is, however, acceptable (even required) to interpret this as a coefficient in the model.
Example: 2
Reinforced concrete buildings have steel frames. One of the main factors affecting the durability of these
buildings is carbonation of the concrete (caused by a chemical reaction that changes the pH of the concrete),
which then corrodes the steel reinforcing the building.
Data is collected on specimens of the core taken from such buildings, where the following are measured:
Interpretation of the slope: If the depth of the carbonation increases by 1 mm, then the model predicts that the
strength of the concrete will decrease by approximately 2.8 MPa.
Interpretation of the intercept: If the depth of the carbonation is 0, then the model predicts that the strength
of the concrete is approximately 24.5 MPa.
Comments: Notice that it isn’t necessary to fully understand the units in which the variables are measured in
order to correctly interpret these coefficients. While it is good to understand data thoroughly, it is also important
to understand the structure of linear models. In this model, notice that the strength decreases as the
carbonation increases, which is shown by the negative slope coefficient. When you interpret a negative slope,
notice that you must say that, as the explanatory variable increases, then the response variable decreases.
Example: 3
When cigarettes are burned, one by-product in the smoke is carbon monoxide. Data is collected to determine
whether the carbon monoxide emission can be predicted by the nicotine level of the cigarette.
It is determined that the relationship is approximately linear when we predict carbon monoxide, C,
from the nicotine level, N
Both variables are measured in milligrams
The formula for the model is C = 3.0 + 10.3N
Interpretation of the slope: If the amount of nicotine goes up by 1 mg, then we predict the amount of carbon
monoxide in the smoke will increase by 10.3 mg.
Interpretation of the intercept: If the amount of nicotine is zero, then we predict that the amount of carbon
monoxide in the smoke will be about 3.0 mg.
Correlation is a statistical technique which tells us how strongly a pair of variables are linearly related and
change together. It does not tell us the why and how behind the relationship; it only says that a relationship exists.
Causation takes a step further than correlation. It says any change in the value of one variable will cause a
change in the value of another variable, which means one variable makes the other happen. It is also referred
to as cause and effect.
Two or more variables are considered to be related, in a statistical context, if their values change together, so
that as the value of one variable increases or decreases, so does the value of the other variable (in the same or
the opposite direction).
For example,
For the two variables "hours worked" and "income earned" there is a relationship between the two
such that the increase in hours worked is associated with an increase in income earned as well.
If we consider the two variables "price" and "purchasing power", as the price of goods increases a
person's ability to buy these goods decreases (assuming a constant income).
Therefore:
Correlation is a statistical measure (expressed as a number) that describes the size and direction of a
relationship between two or more variables.
A correlation between variables, however, does not automatically mean that the change in one variable
is the cause of change in the values of the other variable.
Causation indicates that one event is the result of the occurrence of the other event; i.e. there is a
causal relationship between the two events. This is also referred to as cause and effect.
Theoretically, the difference between the two types of relationships are easy to identify — an action or
occurrence can cause another (e.g. smoking causes an increase in the risk of developing lung cancer), or it
can correlate with another (e.g. smoking is correlated with alcoholism, but it does not cause alcoholism). In
practice, however, it remains difficult to clearly establish cause and effect, compared to establishing correlation.
Example 1
Suppose a study of speeding violations and drivers who use cell phones produced the following fictional data:
Cell Phone User   Speeding violation in the last year   No speeding violation in the last year   Total
Yes               25                                    280                                      305
No                45                                    405                                      450
Total             70                                    685                                      755
The total number of people in the sample is 755. The row totals are 305 and 450. The column totals are 70 and
685. Notice that 305 + 450 = 755 and 70 + 685 = 755.
3. Find P (Person had no violation in the last year AND was a cell phone user)
Number of cell phone users with no violation / Total number in study = 280/755
4. Find P (Person is a cell phone user OR person had no violation in the last year)
= (305 + 685 – 280) / 755 = 710/755
This table shows a random sample of 100 hikers and the areas of hiking they prefer.
Sex The Coastline Near Lakes and Streams On Mountain Peaks Total
Female 18 16 ___ 45
The completed table is:
Sex The Coastline Near Lakes and Streams On Mountain Peaks Total
Female 18 16 11 45
Male 16 25 14 55
Total 34 41 25 100
2. Find the probability that a person is male given that the person prefers hiking near lakes and streams
Hint:
Let M = being male, and let L = prefers hiking near lakes and streams.
1. What word tells you this is conditional?
2. Fill in the blanks and calculate the probability: P(___|___) = ___.
3. Is the sample space for this problem all 100 hikers? If not, what is it?
Answer
1. The word “given” tells you that this is a conditional.
2. P(M|L) = 25/41
3. No, the sample space for this problem is the 41 hikers who prefer lakes and streams.
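The conditional probability above can be checked with a short sketch. The counts come from the completed hiker table; the dictionary keys are just convenient labels:

```python
from fractions import Fraction

# Hiker-preference counts from the completed table above
counts = {
    ("Female", "Coastline"): 18, ("Female", "Lakes"): 16, ("Female", "Peaks"): 11,
    ("Male", "Coastline"): 16,   ("Male", "Lakes"): 25,   ("Male", "Peaks"): 14,
}

# Conditioning on L ("prefers lakes and streams") shrinks the sample space to 41
lakes_total = sum(v for (sex, area), v in counts.items() if area == "Lakes")
p_male_given_lakes = Fraction(counts[("Male", "Lakes")], lakes_total)
print(p_male_given_lakes)  # 25/41
```

Using `Fraction` keeps the probability exact, mirroring the hand calculation P(M|L) = 25/41.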
2. Reading
2.1 Correlation
Correlation is a measure of how closely two variables move together. Pearson’s correlation coefficient is a
common measure of correlation, and it ranges from +1 for two variables that are perfectly in sync with each
other, to 0 when they have no correlation, to -1 when the two variables are moving opposite to each other.
For linear regression, one way of calculating the slope of the regression line uses Pearson’s correlation, so it
is worth understanding what correlation is.
Y = a + bx
The correlation coefficient that indicates the strength of the relationship between two variables can be
found using the following formula:
Where:
rxy – the correlation coefficient of the linear relationship between the variables x and y
In order to calculate the correlation coefficient using the formula above, you must undertake the following
steps:
2. Calculate the means (averages) x̅ for the x-variable and ȳ for the y-variable.
3. For the x-variable, subtract the mean from each value of the x-variable (let’s call this new variable
“a”). Do the same for the y-variable (let’s call this variable “b”).
4. Multiply each a-value by the corresponding b-value and find the sum of these multiplications (the
final value is the numerator in the formula).
6. Find the square root of the value obtained in the previous step (this is the denominator in the
formula).
You can see that the manual calculation of the correlation coefficient is an extremely tedious process,
especially if the data sample is large. However, there are many software tools that can help you save time
when calculating the coefficient. The ‘CORREL’ function of MS Excel, for example, returns the correlation
coefficient of two cell ranges.
Example of Correlation
X is an investor; he invests money in share market. His portfolio primarily tracks the performance of
the S&P 500 (this is a stock market index in USA that measures the performance of top 500 large
companies in the USA).
X wants to add the stock of Apple Inc. Before adding Apple to his portfolio, he wants to assess the
correlation between the stock and the S&P 500 to ensure that adding the stock won’t increase the
systematic risk of his portfolio.
To find the coefficient, X gathers the following prices from the last five years (Step 1)
Using the formula above, X can determine the correlation between the prices of the S&P 500 Index and
Apple Inc.
Next, X calculates the average prices of each security for the given periods (Step 2):
After the calculation of the average prices, we can find the other values. A summary of the
calculations is given in the table below:
The coefficient indicates that the prices of the S&P 500 and Apple Inc. have a high positive correlation. This
means that their respective prices tend to move in the same direction. Therefore, adding Apple to his portfolio
would, in fact, increase the level of systematic risk.
2.2 Regression
With correlation, we determined how much two sets of numbers change together. With regression, we will
use one set of numbers to make a prediction of the value in the other set. Correlation is part of what we
need for regression, but we also need to know how much each set of numbers changes individually (via the
standard deviation), and where we should put the line (i.e. the intercept).
The regression that we are calculating is very similar to correlation. So you might ask, why do we have both
regression and correlation? It turns out that regression and correlation give related but distinct information.
Correlation gives you a measurement that can be interpreted independently of the scale of the two
variables. Correlation is always bounded by ±1. The closer the correlation is to ±1 the closer the two
variables are to a perfectly linear relationship.
The regression slope by itself does not tell you that. The regression slope tells you the expected
change in the dependent variable y when the independent variable x changes one unit. That
information cannot be calculated from the correlation alone.
A fallout of those two points is that correlation is a unit-less value, while the slope of the regression line has
units. If for instance, you owned a large business and were doing an analysis on the amount of revenue in
each region compared to the number of salespeople in that region, you would get a unit-less result with
correlation, and with regression, you would get a result that was the amount of money per person.
Regression Equations
With linear regression, we are trying to solve for the equation of a line, which is shown below.
Y = a + bx
The values that we need to solve for are ‘b’, the slope of the line, and ‘a’, the intercept of the line. The hardest
part of calculating the slope ‘b’ is finding the correlation between x and y, which we have already done. The
only modification that needs to be made to that correlation is multiplying it by the ratio of the standard
deviations of y and x, which we also already calculated when finding the correlation. The equation for the
slope is
b = r·(Sy / Sx)
Once we have the slope, getting the intercept is easy. The standard equations for correlation and standard
deviation go through the averages of x and y (x̄, ȳ), so the equation for the intercept is
a = ȳ – b·x̄
The best way to determine whether this is a simple linear regression problem is to plot Marks vs Hours.
If the plot looks like the one below, it may be inferred that a linear model can be used for this problem.
The data represented in the above plot would be used to find a line such as the following, which represents
a best-fit line. The slope of the best-fit line would be the value of “m”.
The value of m (the slope of the line) can be determined using an objective function, which is a combination of
a loss function and a regularization term. For simple linear regression, the objective function is the Mean
Squared Error (MSE): the mean of the squared distances between the target variable (actual marks) and the
predicted values (marks calculated using the above equation). The best-fit line is obtained by minimizing this
objective function.
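One way to see the minimization idea is to evaluate the squared-error objective for a few candidate slopes and keep the best one. The (hours, marks) pairs below are invented, and the intercept is fixed at 0 purely to keep the sketch short:

```python
# Illustrative data: (hours studied, marks scored), made up for this sketch
data = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8)]

def mse(m):
    """Mean squared error of the line y = m·x against the data."""
    return sum((y - m * x) ** 2 for x, y in data) / len(data)

# Try several candidate slopes and keep the one with the smallest error
candidates = [1.5, 1.8, 2.0, 2.2, 2.5]
best = min(candidates, key=mse)
print(best)  # 2.0 — the data were generated close to y = 2x
```

In practice the minimum is found analytically (the normal equations) or by gradient descent rather than by a grid of guesses, but the objective being minimized is the same.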
A statistics instructor at a university would like to examine the relationship (if any) between the number of
optional homework problems students do during the semester and their final course grade. She randomly
selects 12 students for study and asks them to keep track of the number of these problems completed during
the course of the semester. At the end of the class each student’s total is recorded along with their final grade.
The data is available in the following table (X = problems completed, Y = final grade; the third column is the
product XY):
51 62 3162
58 68 3944
62 66 4092
65 66 4290
68 67 4556
76 72 5472
77 73 5621
78 72 5616
78 78 6084
84 73 6132
85 76 6460
91 75 6825
7) Use the regression equation to predict a student’s final course grade if 75 optional homework
assignments are done.
8) Use the regression equation to compute the number of optional homework assignments that need to be
completed if a student expects a course grade of 85
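A sketch of fitting the line for questions 7 and 8, using the twelve (X, Y) pairs from the table above (the XY column is recomputed here rather than read in):

```python
# X = optional homework problems completed, Y = final course grade
homework = [51, 58, 62, 65, 68, 76, 77, 78, 78, 84, 85, 91]
grades   = [62, 68, 66, 66, 67, 72, 73, 72, 78, 73, 76, 75]

n = len(homework)
sx, sy = sum(homework), sum(grades)
sxy = sum(x * y for x, y in zip(homework, grades))   # reproduces the XY column total
sx2 = sum(x * x for x in homework)

# Slope and intercept from the normal equations
b = (n * sxy - sx * sy) / (n * sx2 - sx ** 2)
a = (sy - b * sx) / n
print(round(a, 2), round(b, 3))   # intercept and slope of grade = a + b·homework
print(round(a + b * 75, 1))       # predicted grade for 75 problems (question 7)
```

Question 8 inverts the same equation: set a + b·x = 85 and solve for x.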
Problem 2
The following data set of the heights and weights of a random sample of 15 male students is acquired. Is
there any apparent relationship between the two variables?
1 5 ft 6 inch 60 kgs
2 5 ft 4 inch 55 kgs
3 5 ft 8 inch 78 kgs
5 5 ft 4 inch 53 kgs
6 5 ft 7 inch 56 kgs
7 5 ft 3 inch 54 kgs
9 5 ft 6 inch 74 kgs
10 5 ft 3 inch 65 kgs
11 5 ft 9 inch 76 kgs
14 5 ft 4 inch 63 kgs
15 5 ft 7 inch 62 kgs
Would you expect the same relationship (if any) to exist between the heights and weights of the opposite
sex?
Problem 3
From the following data of hours worked in a factory (x) and output units (y), determine the regression line
of y on x, the linear correlation coefficient and determine the type of correlation.
Hours (X) 80 79 83 84 78 60 82 85 79 84 80 62
Production (Y) 300 302 315 330 300 250 300 340 315 330 310 240
Problem 4
The height (in cm) and weight (in kg) of 10 basketball players on a team are as below:
Height (X) 186 189 190 192 193 193 198 201 203 205
Weight (Y) 85 85 86 90 87 91 93 103 100 101
Calculate:
Unit 9
Classification
There are many practical business applications for machine learning classification. For example, if
you want to predict whether or not a person will default on a loan, you need to determine if that
person belongs to one of two classes with similar characteristics: the defaulter class or the non-
defaulter class. This classification helps you understand how likely the person is to become a
defaulter, and helps you adjust your risk assessment accordingly.
Objectives:
1. The main goal of this unit is to help students learn and understand classification problems
2. Define Classification and list its algorithms
3. Student should understand classification as a type of supervised learning
Learning Outcomes:
I. Describe the input and output of a classification model
II. Students should be able to differentiate a regression problem from a classification
problem
Pre-requisites: Concept of machine learning and artificial intelligence. Understanding of
supervised and unsupervised learning and Regression Analysis.
1. Classification
Almost every day, we deal with classification problems. Here are a few interesting examples to illustrate the
widespread applications of classification problems.
Case 1:
A credit card company typically receives hundreds of applications for a new credit card. Each application
contains information regarding several attributes, such as annual salary, outstanding debt, age, etc. The
problem is to categorize applications into those with good credit, bad credit, or somewhere in the middle.
Categorizing the applications is nothing but a classification problem.
Case 2:
You may want to own a dog, but which kind of dog? This is the beginning of a classification problem. Dogs can
be classified in a number of different ways. For example, they can be classified by breed (beagles, hounds,
pugs and countless others). They can also be classified by their role in the lives of their masters and the work
they do (a dog might be a family pet, a working dog, a show dog, or a hunting dog). In many cases, dogs are
defined both by their breed and their role. Based on different classification criteria, you eventually decide
which one you want to own.
Case 3:
A common example of classification comes with detecting spam emails. To write a program to filter out
spam emails, a computer programmer can train a machine learning algorithm with a set of spam-like emails
labeled as “spam” and regular emails labeled as “not-spam”. The idea is to make an algorithm that can learn
characteristics of spam emails from this training set so that it can filter out spam emails when it encounters
new emails.
Activity-1
Look at the pictures below and tell me whether the fruit seller knows the art of classification or not. Justify
your answer.
In order to understand ‘Classification’, let us revise the concept of ‘Supervised Learning’, because classification
is a type of supervised learning.
Supervised learning, as the name indicates, involves a supervisor acting as a teacher. Basically, supervised
learning is learning in which we teach or train the machine using data that is well labeled, i.e., data that is
already tagged with the correct answer. After that, the machine is provided with a new set of examples
(data), and the supervised learning algorithm uses what it learnt from the labeled training data (the set of
training examples) to produce a correct outcome for the new data.
For instance, suppose you are given a basket filled with different kinds of fruits. Now the first step is to train
the machine with all different fruits one by one like below:
Now suppose that after training, you present a new fruit (say a
banana) from the basket and ask the machine to identify it. Since the
machine has already learnt from the previous data, it will use that learning
to classify the fruit based on its shape and color, confirm the fruit as a
BANANA and place it in the banana category. Thus the machine learns
from the training data (the basket of fruits) and then applies that
knowledge to test data (the new fruit).
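The fruit-basket idea above can be sketched as a tiny nearest-neighbour classifier: "train" on labeled examples, then label a new example by its closest neighbour. This is only a minimal illustration; the (length, colour-score) feature values below are invented for the sketch, not taken from any real dataset.

```python
# A minimal sketch of supervised learning on the fruit basket:
# classify a new fruit by its nearest labeled neighbour.

def nearest_neighbor(train, new_point):
    """Return the label of the training example closest to new_point."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(train, key=lambda ex: dist(ex[0], new_point))[1]

# Training data: (features, label) pairs the "machine" has already seen.
# Features are hypothetical (length_cm, color_score) values.
training_basket = [
    ((18.0, 0.9), "banana"),   # long and yellow
    ((17.5, 0.8), "banana"),
    ((7.0, 0.1), "apple"),     # short and red
    ((7.5, 0.2), "apple"),
]

# A new fruit from the basket: long and yellow, so it lands in the banana category.
print(nearest_neighbor(training_basket, (19.0, 0.85)))   # banana
```

The machine never saw this exact fruit during training; it classifies it by similarity to what it has already learnt.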
Classification: A classification problem is when the output variable is a category, such as “red” or
“blue”, or “disease” and “no disease”.
Regression: A regression problem is when the output variable is a real value, such as a price in INR, a
weight in kilograms or a temperature in Fahrenheit.
Of these two types of supervised learning problems, the focus of the current unit is classification.
Let’s say, you live in a gated housing society and your society has separate dustbins for different types of
waste: paper waste, plastic waste, food waste and so on. What you are basically doing over here is
classifying the waste into different categories and then labeling each category.
In the below picture, we are assigning the labels ‘paper’, ‘metal’, ‘plastic’, and so on to different types of
waste.
Let’s say you own a shop and you want to figure out whether one of your customers is going to visit your shop
again or not. The answer to that question can only be a ‘Yes’ or a ‘No’; there can’t be a third type of answer to
such a question. In Machine Learning, problems of this kind are known as classification problems.
Classification problems normally have a categorical output like ‘yes’ or ‘no’, ‘1’ or ‘0’, ‘true’ or ‘false’.
Say you want to check if on a particular day, it will rain or not. In this case the answer is dependent on the
weather condition and based on the same, the outcome can either be ‘Yes’ or ‘No’.
Question 2: Look at the two graphs below and suggest which graph represents the classification problem.
Graph 1 Graph 2
Question 3: “Predicting stock price of a company on a particular day” - is it a classification problem? Justify
your answer.
Question 4: “Predicting whether India will lose or win a cricket match “- is it a regression problem? Justify
your answer.
Example 2: Speech Understanding: Given an utterance from a user, identify the specific request made by the
user. A model of this problem would allow a program to understand and attempt to fulfill that request; e.g.,
Siri, Cortana and Google Now have this capability.
Example 3: Face Detection: Given a digital photo album of many hundreds of digital photographs, identify
those photos that include a given person. A model of this decision process would allow a program to organize
photos by person. Some cameras and software like Facebook, Google Photos have this capability.
Activity:
Form a group of 5 students. Each group should think and come up with one use case from the classroom
environment or their home/society, where they would like to apply classification algorithm to solve the
problem.
There are two main types of classification tasks that you may encounter:
i) Binary Classification: classification with only 2 distinct classes, i.e. 2 possible outcomes
ii) Multi-Class Classification: classification with more than two distinct classes.
Typically, binary classification involves one class that is the normal state and another class that is the abnormal
state. For example, “not spam” is the normal state and “spam” is the abnormal state. Another example is
“cancer not detected” is the normal state of a task that involves a medical test and “cancer detected” is the
abnormal state.
The class for the normal state is assigned the class label 0 and the class with the abnormal state is assigned
the class label 1.
Popular algorithms used for binary classification include:
Logistic Regression
k-Nearest Neighbors
Decision Trees
Support Vector Machine
Out of these binary classification algorithms, we are going to study ‘Logistic Regression’.
Given some feature x, logistic regression tries to find out whether some event y happens or not, so y can
either be 0 or 1. If the event happens, y is given the value 1; if the event does not happen, y is given the
value 0. For example, if y represents whether a sports team wins a match, then y will be 1 if the team wins
the match and 0 if it does not.
An example is the logistic curve, on which the values of y can never be less than 0 or greater than 1.
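A minimal sketch of the logistic (sigmoid) function illustrates this bound: whatever real number goes in, the output always stays strictly between 0 and 1.

```python
import math

def sigmoid(z):
    """Logistic function: squashes any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# However extreme the input, the output never leaves (0, 1).
for z in (-10, -1, 0, 1, 10):
    assert 0 < sigmoid(z) < 1

print(sigmoid(0))   # 0.5, exactly halfway between the two classes
```

A classifier built on this curve can therefore interpret the output as a probability and apply a threshold (commonly 0.5) to decide between the two classes.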
Example 1: Spam detection is a binary classification problem where we are given an email and we need to
classify whether or not it is spam. If the email is spam, we label it 1; if it is not spam, we label it 0. In order to
apply Logistic Regression to the spam detection problem, the following features of the email are extracted:
Occurrence of words/phrases like “offer”, “prize”, “free gift”, “lottery”, “you won cash” and more
The resulting feature vector is then used to train a Logistic classifier which emits a score in the range 0 to 1. If
the score is more than 0.5, we label the email as spam. Otherwise, we don’t label it as spam.
Example 2: A Logistic Regression classifier may be used to identify whether a tumor is malignant or benign.
Several medical imaging techniques are used to extract various features of tumors, for instance the size of
the tumor, the affected body area, etc. These features are then fed to a Logistic Regression classifier to
identify whether the tumor is malignant or benign.
The above two problems are solved using the logistic regression algorithm because in both cases there are
only two possible labels, spam/not spam and malignant/benign, i.e. they are binary classification problems.
2.2 True positives, true negatives, false positives and false negatives
In the field of machine learning / artificial intelligence, a matrix (an NxN table) is used to validate how
successful a classification model’s (i.e. classifier’s) predictions are, where N is the number of target classes.
The confusion matrix compares the actual target values with those predicted by the classification model. This
shows how well the classification model is performing and what kinds of errors it is making.
But wait: what are TP, FP, FN and TN here? That is exactly what we have to understand in a confusion matrix.
Let’s go through each term below.
True Positive (TP): the actual value was positive and the classification model also predicted positive; there is no error.
True Negative (TN): the actual value was negative and the classification model also predicted negative; there is no error.
False Positive (FP): the actual value was negative but the model predicted a positive value.
False Negative (FN): the actual value was positive but the model predicted a negative value.
True Positive (TP) - Umpire gives a batsman NOT OUT when he is NOT OUT.
True Negative (TN) - Umpire gives a Batsman OUT when he is OUT.
Question 1:
Assume there are 100 images, 30 of them depict a cat, the rest do not. A machine learning model predicts the
occurrence of a cat in 25 of 30 cat images. It also predicts absence of a cat in 50 of the 70 no cat images.
In this case, what are the true positive, false positive, true negative and false negative?
Confusion Matrix (positive class = cat):

                   Predicted: not-cat   Predicted: cat
Actual: not-cat         TN = 50            FP = 20
Actual: cat             FN = 5             TP = 25
True Positive (TP): images which are cat and are predicted as cat, i.e. 25
True Negative (TN): images which are not-cat and are predicted as not-cat, i.e. 50
False Positive (FP): images which are not-cat but are predicted as cat, i.e. 20
False Negative (FN): images which are cat but are predicted as not-cat, i.e. 5
Precision: TP/(TP+FP)
Recall: TP/(TP+FN)
For the cat example:
Precision: 25/(25+20) ≈ 0.556
Recall: 25/(25+5) ≈ 0.833
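The counts and metrics above can be verified with a short sketch that rebuilds the 100-image cat example and computes precision and recall; the label strings used here are arbitrary placeholders.

```python
def confusion_counts(actual, predicted, positive="cat"):
    """Count TP, FP, TN, FN, treating `positive` as the positive class."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == positive and p == positive)
    fp = sum(1 for a, p in zip(actual, predicted) if a != positive and p == positive)
    tn = sum(1 for a, p in zip(actual, predicted) if a != positive and p != positive)
    fn = sum(1 for a, p in zip(actual, predicted) if a == positive and p != positive)
    return tp, fp, tn, fn

# Rebuild the 100-image example: 30 cat images (25 detected),
# 70 no-cat images (50 correctly rejected, 20 wrongly flagged as cat).
actual    = ["cat"] * 30 + ["no-cat"] * 70
predicted = ["cat"] * 25 + ["no-cat"] * 5 + ["cat"] * 20 + ["no-cat"] * 50

tp, fp, tn, fn = confusion_counts(actual, predicted)
precision = tp / (tp + fp)
recall    = tp / (tp + fn)
print(tp, fp, tn, fn)                          # 25 20 50 5
print(round(precision, 3), round(recall, 3))   # 0.556 0.833
```

Note that precision and recall answer different questions: precision asks "of everything I called cat, how much really was?", while recall asks "of all the real cats, how many did I find?".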
Confusion Matrix Example 1: Do you still remember the shepherd boy story?
“A shepherd boy used to take his herd of sheep across the fields to the lawns near the forest. One day he felt very bored.
He wanted to have fun. So he cried aloud "Wolf, Wolf. The wolf is carrying away a lamb". Farmers working in the fields
came running and asked, "Where is the wolf?". The boy laughed and replied "It was just for fun. Now get going all of
you".
The boy played the trick for quite a number of times in the next few days. After some days, as the boy was perched on a
tree, singing a song, there came a wolf. The boy cried loudly "Wolf, Wolf, the wolf is carrying a lamb away." There was
no one to the rescue. The boy shouted "Help! Wolf! Help!" Still no one came to his help. The villagers thought that the
boy was playing mischief again. The wolf carried a lamb away“
True Positive: the boy cried “Wolf” and the wolf had really come. Outcome: the shepherd is a hero.
False Positive: the boy cried “Wolf” but there was no wolf. Outcome: the villagers are angry at the shepherd for waking them up.
False Negative: the wolf came but no alarm was raised (or believed). Outcome: the wolf ate all the sheep.
True Negative: there was no wolf and no alarm. Outcome: everyone is fine.
A true positive is an outcome where the model correctly predicts the positive class. Similarly, a true negative
is an outcome where the model correctly predicts the negative class.
A false positive is an outcome where the model incorrectly predicts the positive class. And a false negative is
an outcome where the model incorrectly predicts the negative class.
Question 2:
Assume there are 100 images, 30 of them depict a cat, the rest do not. A machine learning model predicts the
occurrence of a cat in 25 of 30 cat images. It also predicts absence of a cat in 50 of the 70 no cat images.
In this case, what are the true positives, false positives, true negatives and false negatives? This time, take
cat as the negative class.
Question 3:
Below is a confusion matrix prepared for a binary classifier to detect email as Spam and Not Spam.
While many of today’s medical tests are accurate and reliable, false positives and false negatives still occur,
and their implications can be severe for patients, families and society.
A false positive prompts patients to take medication or treatment they don’t really need. Perhaps even more
dangerous is the ‘false negative’: a test that says you don’t have a disease or condition that you actually have.
We most often hear about false negatives in the context of home pregnancy tests, which are more prone to
giving false negatives than false positives. However, when it comes to screening for more serious conditions
like HIV or cancer, a false negative can have dire repercussions.
Case 1:
Consider a health prediction case where one wants to diagnose cancer. Imagine that detecting cancer triggers
further analysis (the patient is not immediately treated), whereas if you don't detect cancer, the patient is
sent home without further examination.
This case is thus asymmetric, since you definitely want to avoid sending home a sick patient (a false negative).
You can, however, make the patient wait a little longer by asking him/her to take more tests even if the initial
results are negative for cancer (a false positive). In this situation, you would prefer a false positive over a false
negative.
Case 2:
Imagine a patient taking an HIV test. The impact of a false positive on the patient would at first be
heartbreaking: having to deal with the trauma of facing this news and telling family and friends. On further
examination, however, the doctors would find out that the person in question does not have the virus. Again,
this would not be a particularly pleasant experience, but not having HIV is ultimately a good thing.
On the other hand, a false negative would mean that the patient has HIV but the test shows a negative result.
The implications of this are terrifying: the patient would miss out on crucial treatment and runs the risk of
spreading the virus.
Without much doubt, the false negative here is the bigger problem. Both for the person and for society.
Q 2: The sinking of the Titanic is one of the most infamous shipwrecks in history. On April 15, 1912, during her
maiden voyage, the RMS Titanic sank after colliding with an iceberg. Unfortunately, there weren’t enough
lifeboats for everyone onboard, resulting in the death of 1502 out of 2224 passengers and crew.
While there was some element of luck involved in surviving, it seems some groups of people were more likely
to survive than others. In this challenge, we ask you to build a predictive model that answers the question:
“what sorts of people were more likely to survive?” using passenger data (i.e. name, age, gender,
socio-economic class, etc.). Please refer to: https://www.kaggle.com/c/titanic
Q 3: Why can’t linear regression be used in place of logistic regression for binary classification?
Q 5: What is true positive rate (TPR), true negative rate (TNR), false-positive rate (FPR), and false-negative
rate (FNR)?
Activity 1:
You may have heard a lot about Artificial Neural Networks (ANN), Deep Learning (DL) and Machine Learning
(ML). You must have also heard about the different training algorithms like clustering, classification etc. and
would like to learn more. But when you learn about the technology from a textbook, you may find yourself
overwhelmed by mathematical models and formulae.
To make this easy and interesting, there is an awesome tool to help you grasp the idea of neural networks and
of different training algorithms such as classification and clustering. This tool is called TensorFlow Playground,
a web app written in JavaScript that lets you play with a real neural network running in your browser: you can
click buttons and tweak parameters to see how it works.
First, we will start with understanding some of the terms in the above picture.
I. Data
We have six different data sets: Circle, Exclusive OR (XOR), Gaussian, Spiral, Plane and Multi Gaussian. The first
four are for classification problems and the last two are for regression problems. The small circles are the data
points, which correspond to positive one and negative one. In general, positive values are shown in blue and
negative values in orange.
In the hidden layers, the lines are colored by the weights of the connections between neurons. Blue shows a
positive weight, which means the network is using that output of the neuron as given. An orange line shows
that the network is assigning a negative weight.
In the output layer, the dots are colored orange or blue depending on their original values. The background
color shows what the network is predicting for a particular area. The intensity of the color shows how
confident that prediction is.
II. Features
We have seven features or inputs (X1, X2, their squares, their product and their sines). We can turn different
features on and off to see which features are more important. It is a good example of feature engineering.
III. Epoch
One epoch is one complete pass of the training data through the network; the epoch counter shows how many passes have been completed.
IV. Learning Rate
The learning rate (alpha) is responsible for the speed at which the model learns.
V. Activation Function
We may skip this term for now, but for the purpose of the activity you may choose any one of the four given
activation functions (Tanh, ReLU, Sigmoid and Linear). We will read about this in the next class.
VI. Regularization
Regularization (None, L1 or L2) and the regularization rate help control overfitting; for this activity you may leave them at their default values.
VII. Neural Network
A neural network model is a network of simple elements called neurons, which receive input, change their
internal state (activation) according to that input, and produce output depending on the input and activation.
The simplest neural network, called a shallow neural network, has one input layer, one output layer and at
least one hidden layer; when there are three or more hidden layers, we call it a deep neural network. Each
hidden layer is made of working elements called neurons that take input from the features or from
predecessor neurons and calculate a linear function (z) and an output function (a).
VIII. Problem Type
We have four data sets for classification and two for regression problems. We can select the type of problem
we want to study.
IX. Output
Check the model performance after training the neural network. Observe the Test loss and Training loss of
the model.
Now add the third feature, the product (X1X2), and observe the losses.
This is how you can understand the value of features and how to get good results in a minimum number
of steps.
Set the learning rate to 0.03, and also check how the learning rate plays an important role in training a
neural network.
Since you have already learnt about regression, you may also play with the regression datasets so that you
have a clear idea about regression.
Select 2 hidden layers; set 4 neurons for the first hidden layer and 2 neurons for the second hidden
layer, followed by the output.
Starting from the first layer, the weights are passed on to the first hidden layer, which produces an
output from each neuron; the outputs of the second hidden layer are then mixed with different
weights. Weights are represented by the thickness of the lines.
The final output shows the Train and Test loss of the neural network.
First, you will be learning about the purpose of clustering and how it applies to the real world.
Second, you will get a general overview of clustering such as K-means clustering.
We will also try to understand the implementation of clustering algorithm to solve some real world
problems.
Objectives:
1. The main goal of this unit is to help students learn and understand clustering problems
2. Define Clustering and list its algorithms
3. Understand clustering as a type of unsupervised learning
Learning Outcomes:
1. Describe the input and output of a clustering model
2. Students should be able to differentiate between supervised and unsupervised learning
3. Students also should be able to differentiate classification problems from clustering
problems.
Pre-requisites: Understanding of supervised and unsupervised learning
Key Concepts: Clustering algorithms in machine learning
1. Clustering
Consider that you have a large collection of books that you have to arrange by category on a bookshelf.
Since you haven’t read all the books, you have no idea about their contents. You start by bringing the books
with similar titles together.
For example, you would arrange books like the “Harry Potter” series in one corner and the “Famous Five”
series in another.
Harry Potter Series (Cluster -1) Famous Five series collection (Cluster – 2)
This is your first experience with clustering, where the books are clustered according to the similarity in
their titles. There could be many other criteria for clustering, such as clustering based on author, genre, year
of publication, hardcover vs. paperback, etc.
When I visit a city, I like to walk as much as possible, but I also want to optimize my time to see as many
attractions as possible. Suppose I am planning my next trip to Mumbai for four days. I have researched online
and made a list of 20 places that I would like to visit during this trip. In order to optimize time and cover all
the shortlisted places, I will need to bucket (“cluster”) the places based on their proximity to each other.
Creating these buckets is in fact a method of clustering. In one way or another, we perform the process of
clustering almost every day.
Let us take another example to understand clustering. Imagine X owns a chain of flavored milk parlors. The
parlors sell milk in 2 flavors, Strawberry (S) and Chocolate (C), across 8 outlets. The table below shows the
sales of both strawberry and chocolate flavored milk across the eight outlets.

Outlet     Strawberry (S)   Chocolate (C)
Outlet 1        12                6
Outlet 2        15               16
Outlet 3        18               17
Outlet 4        10                8
Outlet 5         8                7
Outlet 6         9                6
Outlet 7        12                9
Outlet 8        20               18
In order to get a better understanding of the sales data, you can plot it on a graph. Below we have plotted
the sales of both strawberry and chocolate. The eight dots in this graph represent the 8 stores; the Y-axis
indicates strawberry sales and the X-axis indicates chocolate sales.
After analyzing this graph, you will have better insight into the sales data and see a pattern emerging: two
groups of stores that behave slightly differently in terms of their strawberry and chocolate sales. This is
essentially how clustering works.
Marketing: If you run a business, it is crucial that you target the right people. Clustering algorithms
can group together people with similar traits and a similar likelihood to purchase your product/service.
Once you have identified the groups, target your messaging to them to increase the probability of a sale.
Insurance: Identifying groups of motor insurance policy holders with a high average claim cost;
Identifying frauds
City-planning: Identifying groups of houses according to their house type, value and geographical
location
WWW: Document classification; clustering weblog data to discover groups of similar access patterns
Identifying Fake News: Fake news is being created and spread at a rapid rate due to technology
innovations such as social media, and clustering algorithms are being used to identify fake news based
on its content. The algorithm works by taking in the content of an article, examining the words used
and then clustering them. These clusters are what help the algorithm determine which pieces are
genuine and which are fake. Certain words are found more commonly in fake articles, so the more of
these words an article contains, the higher the probability that it is fake news.
1. Prepare the data: Data preparation determines the set of features that will be available to the clustering
algorithm. For the clustering strategy to be effective, the data representation must either select descriptive
features from the input set (feature selection) or generate new features based on the original set (feature
extraction). In this stage, we also normalize, scale and transform the feature data.
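As a small illustration of the normalization mentioned in this step, here is a sketch of min-max scaling, one common way to rescale a feature into the [0, 1] range; the feature values below are invented for the example.

```python
def min_max_scale(values):
    """Rescale a list of numbers so the smallest becomes 0.0 and the largest 1.0."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# A hypothetical feature column, e.g. annual spending in thousands.
print(min_max_scale([20, 40, 60, 100]))   # [0.0, 0.25, 0.5, 1.0]
```

Scaling like this stops a feature with large raw values (such as income) from dominating one with small values (such as number of purchases) when distances are computed.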
2. Create similarity metrics: To calculate the similarity between two examples, you need to combine all the
feature data for those examples into a single numeric value. For instance, consider a shoe data set with only
one feature, “shoe size”. You can quantify how similar two shoes are by calculating the difference between
their sizes: the smaller the numerical difference between sizes, the greater the similarity between the shoes.
Such a handcrafted similarity measure is called a manual similarity measure. The similarity measure is critical
to any clustering technique and must be chosen carefully.
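The shoe-size idea can be sketched as a manual similarity measure. The exact formula and the `max_diff` cap below are illustrative assumptions, not a standard definition; the point is only that identical sizes score 1.0 and very different sizes score near 0.

```python
def shoe_similarity(size_a, size_b, max_diff=10.0):
    """Manual similarity measure: 1.0 for identical sizes, shrinking toward 0
    as the size difference grows. max_diff caps the largest difference counted."""
    return 1.0 - min(abs(size_a - size_b), max_diff) / max_diff

print(shoe_similarity(8, 8))              # 1.0  (identical shoes)
print(round(shoe_similarity(8, 9), 2))    # 0.9  (very similar)
print(round(shoe_similarity(5, 12), 2))   # 0.3  (quite different)
```

With more than one feature (say size and price), the per-feature similarities would have to be combined into one number, which is exactly why feature scaling and the choice of measure matter.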
3. Run the clustering algorithm: In machine learning, you sometimes encounter datasets with millions of
examples. ML algorithms must scale efficiently to such large datasets, but many clustering algorithms do not
scale because they need to compute the similarity between all pairs of points. There are many different
approaches to clustering data; roughly speaking, clustering algorithms can be classified as hierarchical or
partitioning, though more comprehensive taxonomies of clustering techniques exist.
4. Interpret the results: Because clustering is unsupervised, no “truth” is available to verify results. The
absence of truth complicates assessing quality. In this situation, interpretation of results becomes crucial.
2. Types of Clustering
In fact, more than 100 clustering algorithms are known, but only a few are popularly used. Let’s look at them
in detail:
1. Centroid-based clustering organizes the data into non-hierarchical clusters. K-means is the most widely
used centroid-based clustering algorithm. Centroid-based algorithms are efficient but sensitive to initial
conditions and outliers. This course focuses on k-means because it is an efficient, effective and simple
clustering algorithm.
2. Density-based clustering connects areas of high example density into clusters. This allows for arbitrary-
shaped distributions as long as dense areas can be connected. These algorithms have difficulty with data of
varying densities and high dimensions. Further, by design, these algorithms do not assign outliers to clusters.
4. Hierarchical clustering creates a tree of clusters. Hierarchical clustering, not surprisingly, is well suited to
hierarchical data, such as taxonomies (see “Comparison of 61 Sequenced Escherichia coli Genomes” by Oksana
Lukjancenko, Trudy Wassenaar and Dave Ussery for an example). A further advantage is that any number of
clusters can be chosen by cutting the tree at the right level.
Out of several approaches to clustering mentioned above, the most widely used clustering algorithm is -
“centroid-based clustering using k-means”.
In the cluster assignment step, the algorithm goes through each data point and assigns it to one of the cluster
centroids. The assignment of a data point to a particular cluster is determined by how close the data point is
to that centroid.
In the move centroid step, k-means moves each centroid to the average of the points in its cluster. In other
words, the algorithm calculates the average of all the points in a cluster and moves the centroid to that
average location. This process is repeated until every data point has a cluster and the cluster assignments no
longer change. The starting positions of the centroids are chosen randomly.
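The assign-then-move loop can be sketched in plain Python as a minimal illustration, not a production implementation. Here it is applied to the eight-outlet (strawberry, chocolate) sales data from the flavored-milk example earlier in this unit; the two starting centroids are fixed arbitrarily rather than chosen randomly so that the run is repeatable.

```python
import math

def kmeans(points, centroids, iters=10):
    """Plain k-means: assign each point to its nearest centroid, then move each
    centroid to the mean of its assigned points; repeat for a fixed number of passes."""
    for _ in range(iters):
        # Cluster assignment step: each point joins its nearest centroid.
        clusters = {i: [] for i in range(len(centroids))}
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move centroid step: each centroid moves to the mean of its points.
        centroids = [
            tuple(sum(c) / len(c) for c in zip(*pts)) if pts else centroids[i]
            for i, pts in clusters.items()
        ]
    return centroids, clusters

# The eight-outlet sales data: (strawberry, chocolate) pairs.
sales = [(12, 6), (15, 16), (18, 17), (10, 8),
         (8, 7), (9, 6), (12, 9), (20, 18)]

# Two arbitrary starting centroids, i.e. K = 2.
centroids, clusters = kmeans(sales, [(10.0, 10.0), (18.0, 18.0)])
for i, pts in clusters.items():
    # Low-selling outlets gather in one cluster, high-selling outlets in the other.
    print(i, sorted(pts))
```

Running this separates the outlets into the two groups visible on the sales graph: the five lower-selling outlets in one cluster and the three higher-selling outlets in the other.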
Example 1:
Let us see how this algorithm works using the well-known Iris flower data set -
(https://archive.ics.uci.edu/ml/datasets/iris) .
This dataset contains four measurements of three different Iris flowers. The measurements are - Sepal length,
Sepal width, Petal length, and Petal width. The three types of Iris are Setosa, Versicolour, and Virginica as
shown below in the same order.
Let's first plot the values of the petals’ lengths and widths against each other.
With just a quick glance, it is clear that there are at least two groups of flowers shown on the chart. Let’s see
how we can use a K-means algorithm to find clusters in this data.
Iteration 1: First, we create two randomly generated centroids and assign each data point to the cluster of
the closest centroid. In this case, because we are using two centroids, that means we want to create two
clusters i.e. K=2.
Iteration 2: As you can see above, the centroids are not evenly distributed. In the second iteration of the
algorithm, the average values of each of the two clusters are found and become the new centroid values.
Iterations 3-5: We repeat the process until there is no further change in the value of the centroids. K-means
is guaranteed to converge, although possibly only to a local optimum.
Disadvantages of k-means
Choosing k manually
Being dependent on initial values: For a low k, you can mitigate this dependence by running k-means
several times with different initial values and picking the best result. As ‘k’ increases, you need advanced
versions of k-means to pick better initial centroids (called k-means seeding).
Clustering data of varying sizes and density: k-means has trouble clustering data where clusters are of
varying sizes and density. To cluster such data, you need to generalize k-means as described below.
Clustering outliers: Centroids can be dragged by outliers, or outliers might get their own cluster instead
of being ignored. Consider removing or clipping outliers before clustering.
Scaling with the number of dimensions: As the number of dimensions increases, a distance-based similarity
measure converges to a constant value between any given examples. Reduce dimensionality either by
using PCA on the feature data, or by using “spectral clustering” to modify the clustering algorithm.
[Figure 1: Two graphs side by side. The first shows a dataset with somewhat obvious clusters; the second shows an odd grouping of examples after running k-means.]
To cluster naturally imbalanced clusters like the ones shown in Figure 1, you can adapt (generalize) k-means.
In Figure 2, the lines show the cluster boundaries after generalizing k-means as:
Centre plot: Allow different cluster widths, resulting in more intuitive clusters of different sizes.
Right plot: Besides different cluster widths, allow different widths per dimension, resulting in elliptical
instead of spherical clusters, improving the result.
[Figure 2: Two graphs side by side. The first is a spherical cluster example; the second is a non-spherical cluster example.]
3. Why is it Unsupervised?
In clustering, we group some data-points into several clusters. So usually clustering does not look at
target/labels instead it groups the data considering the similarities in the features. Therefore, clustering
employs a similarity function to measure the similarity between two data-points (e.g. k means clustering
measures the Euclidean distance). And feature engineering plays a key role in clustering because the feature
that you provide to the cluster decides the type of groups that you get.
For example, if you use a set of features that characterize the CPU (number of cores, clock speed, etc.) to
cluster laptops, each cluster will contain laptops with similar CPU power. If you add the price of the laptop
as a feature, you may instead get clusters that separate overpriced and economical laptops based on their
price and CPU specs.
The usual approach requires a set of labelled data, or a person to annotate the clusters.
In the third step, we try to assign a label to each cluster by looking at the data points in it. If a certain cluster has 90% overpriced laptops (based on the labelled data or human evaluation), then we label it as an overpriced-laptop cluster. In this way we may end up with multiple overpriced-laptop clusters. When we classify a new laptop, if it falls into one of those overpriced-laptop clusters, we classify it as an overpriced laptop.
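The third step described above amounts to a majority vote over the labelled points in each cluster. A small sketch (the cluster assignments and labels below are invented for illustration):

```python
from collections import Counter

# cluster id -> labels of the labelled data points that fell into it (hypothetical)
cluster_members = {
    0: ['overpriced', 'overpriced', 'economical', 'overpriced'],
    1: ['economical', 'economical', 'overpriced'],
}

# Label each cluster by the majority label among its members
cluster_labels = {
    cid: Counter(labels).most_common(1)[0][0]
    for cid, labels in cluster_members.items()
}
print(cluster_labels)  # {0: 'overpriced', 1: 'economical'}
```

A new point would then inherit the label of whichever cluster it is assigned to.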
This project requires Python and the following Python libraries installed:
NumPy
Pandas
matplotlib
scikit-learn
You will also need to have software installed to run and execute an iPython Notebook
Code
Template code is provided in the customer_segments.ipynb notebook file. Additional supporting code can be found in renders.py. While some code has already been implemented to get you started, you will need to implement additional functionality when requested to successfully complete the project.
Getting Started
In this project, you will analyze a dataset containing data on various customers' annual spending amounts (reported in monetary units) across diverse product categories, in order to uncover internal structure. One goal of this project is
to best describe the variation in the different types of customers that a wholesale distributor interacts with.
Doing so would equip the distributor with insight into how to best structure their delivery service to meet
the needs of each customer.
The dataset for this project can be found on the UCI Machine Learning Repository. For the purposes of this
project, the features 'Channel' and 'Region' will be excluded in the analysis — with focus instead on
the six product categories recorded for customers.
Run the code block below to load the wholesale customers dataset, along with a few of the necessary Python
libraries required for this project. You will know the dataset loaded successfully if the size of the dataset is
reported.
# Import libraries necessary for this project
import numpy as np
import pandas as pd
import renders as rs
from IPython.display import display # Allows the use of display() for DataFrames
try:
    data = pd.read_csv("customers.csv")
    data.drop(['Region', 'Channel'], axis=1, inplace=True)
    print("Wholesale customers dataset has {} samples with {} features each.".format(*data.shape))
except:
    print("Dataset could not be loaded. Is the dataset missing?")
Run the code block below to observe a statistical description of the dataset. Note that the dataset is
composed of six important product categories: 'Fresh', 'Milk', 'Grocery', 'Frozen', 'Detergents_Paper', and
'Delicatessen'. Consider what each category represents in terms of products you could purchase.
# Display a description of the dataset
stats = data.describe()
stats
OUTPUT:
# Construct indices
samples_bar.index = indices + ['mean']
The code is slightly long, so the full code block is not written here. Please refer to the GitHub link shared above.
In the code block below, you will need to implement the following:
- Assign new_data a copy of the data with one feature of your choice removed, using the DataFrame.drop function.
- Use sklearn.model_selection.train_test_split (named sklearn.cross_validation in older scikit-learn versions) to split the dataset into training and testing sets. Use the removed feature as your target label. Set a test_size of 0.25 and set a random_state.
- Import a decision tree regressor, set a random_state, and fit the learner to the training data.
- Report the prediction score on the testing set using the regressor's score function.
In the code block below, you will need to implement the following:
- Assign a copy of the data to log_data after applying logarithm scaling. Use the np.log function for this.
- Assign a copy of the sample data to log_samples after applying logarithm scaling. Again, use np.log.
# TODO: Scale the data using the natural logarithm
log_data = np.log(data)
Detecting outliers in the data is extremely important in the data preprocessing step of any analysis. The
presence of outliers can often skew results which take into consideration these data points.
In the code block below, you will need to implement the following:
- Assign the value of the 25th percentile for the given feature to Q1. Use np.percentile for this.
- Assign the value of the 75th percentile for the given feature to Q3. Again, use np.percentile.
- Assign the calculation of an outlier step (1.5 times the interquartile range) for the given feature to step.
- Optionally remove data points from the dataset by adding indices to the outliers list.
NOTE: If you choose to remove any outliers, ensure that the sample data does not contain any of these
points! Once you have performed this implementation, the dataset will be stored in the variable
good_data.
import itertools

# Select the indices for data points you wish to remove
outliers_lst = []

# For each feature, find the data points with extremely high or low values
for feature in log_data.columns:
    # TODO: Calculate Q1 (25th percentile of the data) for the given feature
    Q1 = np.percentile(log_data.loc[:, feature], 25)
    # TODO: Calculate Q3 (75th percentile of the data) for the given feature
    Q3 = np.percentile(log_data.loc[:, feature], 75)
    # TODO: Use the interquartile range to calculate an outlier step (1.5 times the interquartile range)
    step = 1.5 * (Q3 - Q1)
    # Rows that fall outside [Q1 - step, Q3 + step] are outliers for this feature
    outliers_rows = log_data.loc[~((log_data[feature] >= Q1 - step) & (log_data[feature] <= Q3 + step)), :]
    outliers_lst.append(list(outliers_rows.index))

outliers = list(itertools.chain.from_iterable(outliers_lst))
uniq_outliers = list(set(outliers))

# Remove the outliers, if any were specified
good_data = log_data.drop(log_data.index[outliers]).reset_index(drop=True)

# Original data
print('Original shape of data:\n', data.shape)
# Processed data
print('New shape of data:\n', good_data.shape)
In the code block below, you will need to implement the following:
- Assign the results of fitting PCA in two dimensions with good_data to pca.
- Apply a PCA transformation of good_data using pca.transform, and assign the results to reduced_data.
- Apply a PCA transformation of the sample log-data log_samples using pca.transform, and assign the results to pca_samples.
# TODO: Apply PCA by fitting the good data with only two dimensions
from sklearn.decomposition import PCA

# Instantiate
pca = PCA(n_components=2)
pca.fit(good_data)
# TODO: Transform the good data using the PCA fit above
reduced_data = pca.transform(good_data)
# TODO: Transform the sample log-data using the PCA fit above
pca_samples = pca.transform(log_samples)
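After fitting, it is worth checking how much of the total variance the two components retain; scikit-learn exposes this as explained_variance_ratio_. A self-contained sketch on synthetic data (in the project you would fit good_data instead of the invented matrix below):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
# Synthetic stand-in for good_data: 100 samples, 6 correlated features
# built from only 2 underlying factors plus a little noise
base = rng.normal(size=(100, 2))
X = base @ rng.normal(size=(2, 6)) + 0.05 * rng.normal(size=(100, 6))

pca = PCA(n_components=2).fit(X)
# Fraction of the total variance captured by each of the two components
print(pca.explained_variance_ratio_)
print(pca.explained_variance_ratio_.sum())
```

Here the sum is close to 1.0 by construction; on real data a low sum would warn you that two dimensions discard too much information.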
STEP - 6: CLUSTERING
Depending on the problem, the number of clusters that you expect to be in the data may already be known. When the number of clusters is not known a priori, there is no guarantee that a given number of clusters best segments the data, since it is unclear what structure exists in the data, if any.
In the code block below, you will need to implement your chosen clustering algorithm for each of the following numbers of clusters:
range_n_clusters = [2, 3, 4, 5, 6, 7, 8, 9, 10]
GMM Implementation
from sklearn.mixture import GMM  # named GaussianMixture in scikit-learn >= 0.20
from sklearn.metrics import silhouette_score

# Loop through clusters
for n_clusters in range_n_clusters:
    # TODO: Apply your clustering algorithm of choice to the reduced data
    clusterer = GMM(n_components=n_clusters).fit(reduced_data)
    # TODO: Predict the cluster for each data point
    preds = clusterer.predict(reduced_data)
    # TODO: Predict the cluster for each transformed sample data point
    sample_preds = clusterer.predict(pca_samples)
    # TODO: Calculate the mean silhouette coefficient for the number of clusters chosen
    score = silhouette_score(reduced_data, preds, metric='mahalanobis')
    print("For n_clusters = {}. The average silhouette_score is: {}".format(n_clusters, score))
K-Means Implementation
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Loop through clusters
for n_clusters in range_n_clusters:
    # TODO: Apply your clustering algorithm of choice to the reduced data
    clusterer = KMeans(n_clusters=n_clusters).fit(reduced_data)
    # TODO: Predict the cluster for each data point
    preds = clusterer.predict(reduced_data)
    # TODO: Predict the cluster for each transformed sample data point
    sample_preds = clusterer.predict(pca_samples)
    # TODO: Calculate the mean silhouette coefficient for the number of clusters chosen
    score = silhouette_score(reduced_data, preds, metric='euclidean')
    print("For n_clusters = {}. The average silhouette_score is: {}".format(n_clusters, score))
Cluster Visualization
Once you've chosen the optimal number of clusters for your clustering algorithm using the scoring metric above, you can visualize the results by executing the code block below. The final visualization should correspond to the optimal number of clusters.
# Extra code because we ran a loop on top and this resets to what we want
clusterer = GMM(n_components=2).fit(reduced_data)
preds = clusterer.predict(reduced_data)
centers = clusterer.means_
sample_preds = clusterer.predict(pca_samples)
Unit 10
AI Values
Objectives:
1. Understand and debate on Ethics of AI
2. Understand biases and its types
3. Scope of biases in data and how it impacts AI
Key Concepts: Data, Bias, Data Bias, Types of Bias
1. AI Values
Before we begin this chapter, let us watch a few essential videos (total watch time is about 30-35 minutes).
Activity 1: After watching the video “The Ethical Robot” what are the two ethical questions that strike you?
Write them down.
Activity 2: With the video “How to build a moral robot” as your baseline, write down the moral and ethical values you would like to incorporate in your robot. The video is only a guide; do not let it limit your imagination and creativity.
Activity 3: Form a group of 5 students and watch the video “Humans need not apply” as a group. Please watch the video more than once. At the end, submit a paper as a group on your learnings from the video.
1. IBM (https://www.research.ibm.com/artificial-intelligence/#quicklinks)
2. Google (https://ai.google/social-good/)
3. Assessing cardiovascular risk factors with computer vision
4. Agricultural productivity can be increased through digitization and analysis of images from automated drones and satellites
5. AI can help people with special needs in numerous ways. AI is getting better at doing text-to-voice translation as well as voice-to-text translation, and could thus help visually impaired people, or people with hearing impairments, to use information and communication technologies (ICTs)
6. Pattern recognition can track marine life migration, concentrations of life undersea and fishing activities
to enhance sustainable marine ecosystems and combat illegal fishing
7. With global warming, climate change, and water pollution on the rise, we could be dealing with a harsh
future. Food shortages are not something we want to add to the list. Thankfully, one startup is already
working hard on using AI for good in this regard.
8. Imago AI is an India-based agri-tech startup that aims to use AI to increase crop yields and reduce food waste. The company’s vision is to use technology to feed the world’s growing population by optimizing agricultural methods.
The company combines machine learning and computer vision to automate tedious tasks like measuring crop quality and weighing yields. This won’t just speed up the process; it will also help farmers to identify plants that have diseases. TechCrunch reports that 40% of the world’s crops are lost to disease, so the work from Imago AI could be a major breakthrough for agriculture, especially in poorer countries.
( https://www.springboard.com/blog/ai-for-good/)
UNI Global Union ( http://www.thefutureworldofwork.org/) has identified 10 key principles for Ethical AI
7. Secure a just transition and ensure support for fundamental freedoms and rights
As AI systems develop and augmented realities are formed, workers and work tasks will be displaced. It is
vital that policies are put in place that ensure a just transition to the digital reality, including specific
governmental measures to help displaced workers find new employment.
1. https://www.youtube.com/watch?v=cplucNW70II&ab_channel=TEDxTalks
2. https://www.youtube.com/watch?v=vgUWKXVvO9Q
Question 1: How do you decide if something deserves to be called intelligent? Does it have to pass exams to
earn this certificate? Apply your imagination and creativity to answer this question.
Question 2: A village needs your help to prevent the spread of a nearby forest fire. Design, develop and train
the Agent to identify what causes fires, remove materials that help fires spread, and then bring life back to a
forest destroyed by fire — all with Flowchart / pseudo code.
Be aware that while designing the AI agent, your own biases will most likely make their way into the algorithm.
Stage 2: Include more students in the group and take their perspective on your solution. You will come to know that there were biases in your solution.
Stage 3: Increase your group size by including students from different classes and different age groups. Let
them give their point of view about the solution you’ve arrived at in Stage 2 and you would be surprised to
know that the solution still has a lot of biases.
Example 1
Suppose a CCTV camera were to spot your face in a crowd outside a sports stadium. In the police data center
somewhere in the city/ country, an artificial neural network analyzes images from the CCTV footage frame-
by-frame. A floating cloud in the sky causes a shadow on your face and neural network (by mistake) finds
your face similar to the face of a wanted criminal.
If the police were to call you aside for questioning and tell you they had reason to detain you, how would you defend yourself? Was it your fault that your shadowed face bears a slight resemblance to a person in the police record?
Example 2: This happened in the USA in 2018. An AI system was being used to allocate care to nearly 200 million patients in the US. It was later discovered that the AI system was offering a lower standard of care to black patients. Across the board, black people were assigned lower risk scores than white people. This in turn meant that black patients were less likely to be able to access the necessary standard of care.
The problem stemmed from the fact that the AI algorithm was allocating risk values using the predicted cost
of healthcare. Because black patients were often less able to pay or were perceived as less able to pay for the
higher standard of care, the AI essentially learned that they were not entitled to such a standard of treatment.
Though the system was fixed and improved after the bias was discovered, the big question remains: whose problem was this? The developers of the AI system, or the US patient data (which was accurate to an extent)?
The sources of ‘Bias in AI’ usually are our own cultural, societal or personal biases regarding race, gender,
nationality, age or personal habits.
Did you answer “bananas”? Why didn’t you mention the plastic bag roll? Or the color of the banana? Or the
plastic stand holding the bananas?
Although all answers are technically correct, for some reason we have a bias to prefer one of them. Not all
people would share that bias; what we perceive and how we respond is influenced by our norms, culture
and habits. If you live on a planet where all bananas are blue, you might answer “yellow bananas” here. If
you’ve never seen a banana before, you might say “shelves with yellow stuff on them.”
Question:
Make a list of 10 biases which you observe in your home, classroom or in your society. You don’t need to
get all 10 biases in one go. You can start with one and keep adding as you observe more.
While ML and AI are technologies often dissociated from human thinking, they are always based on
algorithms created by humans. And like anything created by humans, these algorithms are prone to
incorporating the biases of their creators.
Because AI algorithms learn from data, any historical data can quickly create biased AI that bases decisions
on unfair datasets.
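One concrete way to see this is to audit the historical decisions in a dataset per group before training on it. A minimal sketch; the records below are invented for illustration:

```python
# Hypothetical historical decisions (group, approved) that would become training data
records = [
    ('group_a', True), ('group_a', True), ('group_a', False), ('group_a', True),
    ('group_b', False), ('group_b', False), ('group_b', True), ('group_b', False),
]

# Approval rate per group: a large gap is a warning sign that a model
# trained on this history will simply reproduce the disparity
rates = {}
for group in {g for g, _ in records}:
    outcomes = [approved for g, approved in records if g == group]
    rates[group] = sum(outcomes) / len(outcomes)

print(rates)  # group_a: 0.75, group_b: 0.25
```

A gap like this does not prove the model will be unfair, but it tells you the data encodes a historical disparity that needs to be examined before training.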
But there are tangible things we can do to manage bias in AI. Here are some of them:
It’s all about the data: make sure you choose a representative dataset.
Choose data that is diverse and includes different groups, to prevent your model from having trouble identifying unlabeled examples that are outside the norm. Make sure you have properly grouped and managed the data so you aren't forced to face situations similar to Google and its facial recognition system.
Most western countries in the world have a regulation that says that fire retardants must be
added to the foams and fabrics in furniture. New Zealand does not. Do you think New Zealand
should have such a regulation? YES or NO?
Give two reasons:
1. ________________________________________________________________
2. ________________________________________________________________
Activity 2: AI Bingo
Step 1: First take a look at this PPT as a class: Introduction to AI and AI Bingo_AI-Ethics.pptx
Step 2: After reviewing the introductory slides, pass out bingo cards. Bingo cards are available here:
https://www.media.mit.edu/projects/ai-ethics-for-middle-school/overview/
Step 3: Club students into teams of 2. Teams must identify the prediction the AI system is trying to make and
the dataset it might use to make that prediction. The first team to get five squares filled out in a row, diagonal,
or column wins (or, for longer play, the first student to get two rows/diagonals/columns).
Step 4: After playing, have students discuss the squares they have filled out
___________________________________________
CLASS XII
______________________________________________
Unit 1: Capstone Project
A capstone project is a project where students must research a topic independently to find a deep
understanding of the subject matter. It gives an opportunity for the student to integrate all their
knowledge and demonstrate it through a comprehensive project.
So, without further ado, let’s jump straight into some Capstone project ideas that will strengthen your base.
The list is huge, but here are some simple projects which you can consider picking up to develop.
1. Understanding The Problem
Artificial Intelligence is perhaps the most transformative technology available today. At a high level, every
AI project follows the following six steps:
1) Problem understanding
2) Data gathering
3) Feature definition
4) AI model construction
5) Model evaluation
6) Deployment
In this section, I will share the best practices for the first step: “understanding the problem”.
Begin formulating your problem by asking yourself this simple question: is there a pattern? The premise that underlies all Machine Learning disciplines is that there needs to be a pattern. If there is no pattern, then the problem cannot be solved with AI technology. It is fundamental that this question is asked before deciding to embark on an AI development journey.
If it is believed that there is a pattern in the data, then AI development techniques may be employed.
Applied uses of these techniques are typically geared towards answering five types of questions, all of
which may be categorized as being within the umbrella of predictive analysis:
It is important to determine which of these questions you’re asking, and how answering it helps you
solve your problem.
Project 1:
Form a team of 4-5 students and submit a detailed report on the most critical problems in the areas below and how AI can assist in addressing them. The report should include a description of the problem and the proposed way in which AI can solve it.
1. Agriculture in India
3. Healthcare in India
The five stages of Design Thinking are as follows: Empathize, Define, Ideate, Prototype, and Test.
Real computational tasks are complicated. To accomplish them you need to break down the problem into
smaller units before coding.
1. Understand the problem and then restate the problem in your own words
Know what the desired inputs and outputs are
Ask questions for clarification (in class these questions might be to your instructor, but
most of the time they will be asking either yourself or your collaborators)
2. Break the problem down into a few large pieces. Write these down, either on paper or as
comments in a file.
3. Break complicated pieces down into smaller pieces. Keep doing this until all of the pieces are
small.
4. Code one small piece at a time.
1. Think about how to implement it
2. Write the code/query
3. Test it… on its own.
4. Fix problems, if any
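The implement-test-fix loop in step 4 can be sketched as writing one small function at a time and checking it on its own before building the next piece on top of it. The task below (totalling an order) is invented for illustration:

```python
# Small piece 1: total cost of a single line item
def item_total(price, quantity):
    return price * quantity

# Test it on its own before writing the next piece
assert item_total(2.0, 3) == 6.0

# Small piece 2: built on top of the piece we already trust
def order_total(items):
    return sum(item_total(p, q) for p, q in items)

assert order_total([(2.0, 3), (1.5, 2)]) == 9.0
print("all pieces tested")
```

Because each piece was tested in isolation, a failure in the larger piece can only come from the new code, not the parts already verified.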
Data:

length  width  height
1       2       3
2       4       3
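As a tiny example of working on such data one small piece at a time, each row of the table above could be processed independently, e.g. computing a volume per row (the volume column is our own addition for illustration):

```python
# Rows from the table above: (length, width, height)
rows = [(1, 2, 3), (2, 4, 3)]

# Volume of a cuboid: length * width * height, computed row by row
volumes = [l * w * h for l, w, h in rows]
print(volumes)  # [6, 24]
```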
Example 2: Imagine that you want to create your first app. This is a complex problem. How would you
decompose the task of creating an app?
To decompose this task, you would need to know the answer to a series of smaller problems:
This list has broken down the complex problem of creating an app into much simpler problems that can now be worked out. You may also be able to get other people to help you with different individual parts of the app. For example, you may have a friend who can create the graphics, while another can test the app.
Example 3: (For Advanced Learners)
Time series decomposition involves thinking of a series as a combination of level, trend, seasonality, and
noise components. Decomposition provides a useful abstract model for thinking about time series
generally and for better understanding problems during time series analysis and forecasting.
The Airline Passengers dataset describes the total number of airline passengers over a period of time.
The units are a count of the number of airline passengers in thousands. There are 144 monthly
observations from 1949 to 1960.
Download the dataset to your current working directory with the filename “airline-passengers.csv“. The snippet below loads and plots the series:

from pandas import read_csv
from matplotlib import pyplot
series = read_csv('airline-passengers.csv', header=0, index_col=0)
series.plot()
pyplot.show()
Reviewing the line plot, it suggests that there may be a linear trend, but it is hard to be sure from eyeballing it. There is also seasonality, but the amplitude (height) of the cycles appears to be increasing, suggesting that it is multiplicative.
The example below decomposes the airline passenger’s dataset as a multiplicative model.
from pandas import read_csv
from matplotlib import pyplot
from statsmodels.tsa.seasonal import seasonal_decompose
series = read_csv('airline-passengers.csv', header=0, index_col=0)
result = seasonal_decompose(series, model='multiplicative', period=12)
result.plot()
pyplot.show()
Running the example plots the observed, trend, seasonal, and residual time series.
We can see that the trend and seasonality information extracted from the series does seem
reasonable. The residuals are also interesting, showing periods of high variability in the early and later
years of the series.
CLASS XII - LEVEL 3: AI INNOVATE AI TEACHER INSTRUCTION MANUAL
3. Analytic Approach
Those who work in the domain of AI and Machine Learning solve problems and answer questions
through data every day. They build models to predict outcomes or discover underlying patterns, all to
gain insights leading to actions that will improve future outcomes.
Every project, regardless of its size, starts with business understanding, which lays the foundation for
successful resolution of the business problem. The business sponsors needing the analytic solution play
the critical role in this stage by defining the problem, project objectives and solution requirements from
a business perspective. And, believe it or not—even with nine stages still to go—this first stage is the
hardest.
After clearly stating a business problem, the data scientist can define the analytic approach to solving
it. Doing so involves expressing the problem in the context of statistical and machine learning
techniques so that the data scientist can identify techniques suitable for achieving the desired outcome.
Selecting the right analytic approach depends on the question being asked. Once the problem to be
addressed is defined, the appropriate analytic approach for the problem is selected in the context of
the business requirements. This is the second stage of the data science methodology.
If the question is to determine probabilities of an action, then a predictive model might be used. If the question is to show relationships, a descriptive approach may be required. Statistical analysis applies to problems that require counts: if the question requires a yes/no answer, then a classification approach to predicting a response would be suitable.
4. Data Requirement
If the problem that needs to be resolved is "a recipe", so to speak, and data is "an ingredient", then the
data scientist needs to identify:
Prior to undertaking the data collection and data preparation stages of the methodology, it's vital to
define the data requirements for decision-tree classification. This includes identifying the necessary
data content, formats and sources for initial data collection.
In this phase the data requirements are revised and decisions are made as to whether or not the
collection requires more or less data. Once the data ingredients are collected, the data scientist will
have a good understanding of what they will be working with.
Techniques such as descriptive statistics and visualization can be applied to the data set, to assess the
content, quality, and initial insights about the data. Gaps in data will be identified and plans to either fill
or make substitutions will have to be made.
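With pandas, the assessment described above (descriptive statistics plus a check for gaps) can be sketched in a few lines; the tiny DataFrame here is invented for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical collected data with deliberate gaps
df = pd.DataFrame({
    'age': [23, 31, np.nan, 45],
    'income': [40000, 52000, 61000, np.nan],
})

# Descriptive statistics to assess content and quality
print(df.describe())

# Identify gaps that will need to be filled or substituted
print(df.isnull().sum())  # one missing value per column here
```

The counts from isnull().sum() tell you exactly where a fill-or-substitute plan is needed before modelling.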
5. Modeling Approach
Data Modeling focuses on developing models that are either descriptive or predictive.
An example of a descriptive model might examine things like: if a person did this, then they're
likely to prefer that.
A predictive model tries to yield yes/no, or stop/go type outcomes. These models are based on
the analytic approach that was taken, either statistically driven or machine learning driven.
The data scientist will use a training set for predictive modelling. A training set is a set of historical data
in which the outcomes are already known. The training set acts like a gauge to determine if the model
needs to be calibrated. In this stage, the data scientist will play around with different algorithms to
ensure that the variables in play are actually required.
The success of data compilation, preparation and modelling, depends on the understanding of the
problem at hand, and the appropriate analytical approach being taken. The data supports the
answering of the question, and like the quality of the ingredients in cooking, sets the stage for the
outcome.
Constant refinement, adjustments and tweaking are necessary within each step to ensure the outcome
is one that is solid. The framework is geared to do 3 things:
The end goal is to move the data scientist to a point where a data model can be built to answer the
question.
The train-test split procedure can be used for classification or regression problems, and with any supervised learning algorithm.
The procedure involves taking a dataset and dividing it into two subsets. The first subset is used to fit
the model and is referred to as the training dataset. The second subset is not used to train the model;
instead, the input element of the dataset is provided to the model, then predictions are made and
compared to the expected values. This second dataset is referred to as the test dataset.
The objective is to estimate the performance of the machine learning model on new data: data not used
to train the model.
This is how we expect to use the model in practice. Namely, to fit it on available data with known inputs
and outputs, then make predictions on new examples in the future where we do not have the expected
output or target values.
The train-test procedure is appropriate when there is a sufficiently large dataset available.
You must choose a split percentage that meets your project’s objectives with considerations that
include:
Now that we are familiar with the train-test split model evaluation procedure, let’s look at how we can
use this procedure in Python.
As we work with datasets, a machine learning model works in two stages. We usually split the data
around 20%-80% between testing and training stages. Under supervised learning, we split a dataset
into a training data and test data in Python ML.
Pandas
Sklearn
We use pandas to import the dataset and sklearn to perform the splitting. You can import these packages as:

import pandas as pd
from sklearn.model_selection import train_test_split

The following is the process of creating train and test sets in Python ML. So, let’s take a dataset first.
>>> data = pd.read_csv('forestfires.csv')
>>> data.head()
b. Splitting
Let’s split this data into labels and features. Using features (the data we use to make predictions), we predict labels (the values we want to predict).
>>> y = data.temp
>>> x = data.drop('temp', axis=1)
'temp' is the label we want to predict, so we assign it to y; we use the drop() function to keep all other columns in x. Then, we split the data.
>>> x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)
>>> x_train.head()
>>> x_train.shape
(413, 12)
>>> x_test.head()
>>> x_test.shape
(104, 12)
The argument test_size=0.2 means that the test data should be 20% of the dataset and the rest should be training data. From the outputs of the shape attribute, you can see that we have 104 rows in the test data and 413 in the training data.
We will demonstrate how to use the train-test split to evaluate a random forest algorithm on the housing
dataset.
The housing dataset is a standard machine learning dataset composed of 506 rows of data with 13
numerical input variables and a numerical target variable.
The dataset involves predicting the house price given details of the house is in the suburbs of the
American city of Boston.
You will not need to download the dataset; we will download it automatically as part of our worked
examples. The example below downloads and loads the dataset as a Pandas DataFrame and summarizes
the shape of the dataset.
# load and summarize the housing dataset
from pandas import read_csv
# load dataset
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/housing.csv'
dataframe = read_csv(url, header=None)
# summarize shape
print(dataframe.shape)
Running the example confirms the 506 rows of data and 13 input variables and single numeric target
variables (14 in total).
(506, 14)
First, the loaded dataset must be split into input and output components.

...
# split into input and output columns
data = dataframe.values
X, y = data[:, :-1], data[:, -1]
print(X.shape, y.shape)
Next, we can split the dataset so that 67 percent is used to train the model and 33 percent is used to evaluate it. This split was chosen arbitrarily.

...
# split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)
We can then define and fit the model on the training dataset.
...
model = RandomForestRegressor(random_state=1)
model.fit(X_train, y_train)
Then use the fit model to make predictions and evaluate the predictions using the mean absolute error (MAE) performance metric.

...
# make predictions
yhat = model.predict(X_test)
# evaluate predictions
mae = mean_absolute_error(y_test, yhat)
print('MAE: %.3f' % mae)
The complete example is listed below.

# train-test split evaluation of a random forest on the housing dataset
from pandas import read_csv
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
# load dataset
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/housing.csv'
dataframe = read_csv(url, header=None)
data = dataframe.values
# split into input and output columns
X, y = data[:, :-1], data[:, -1]
print(X.shape, y.shape)
# split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)
# define and fit the model
model = RandomForestRegressor(random_state=1)
model.fit(X_train, y_train)
# make predictions
yhat = model.predict(X_test)
# evaluate predictions
mae = mean_absolute_error(y_test, yhat)
print('MAE: %.3f' % mae)
Running the example first loads the dataset and confirms the number of rows in the input and output
elements.
The dataset is split into train and test sets and we can see that there are 339 rows for training and 167
rows for the test set.
Finally, the model is evaluated on the test set and the performance of the model when making predictions on new data is a mean absolute error of about 2.211 (thousands of dollars).

MAE: 2.211
You will face choices about predictive variables to use, what types of models to use, what arguments to
supply those models, etc. We make these choices in a data-driven way by measuring model quality of
various alternatives.
You've already learned to use train_test_split to split the data so you can measure model quality on the
test data. Cross-validation extends this approach to model scoring (or "model validation"). Compared to
train_test_split, cross-validation gives you a more reliable measure of your model's quality, though it
takes longer to run.
Imagine you have a dataset with 5000 rows. The train_test_split function has an argument for test_size
that you can use to decide how many rows go to the training set and how many go to the test set. The
larger the test set, the more reliable your measures of model quality will be. At an extreme, you could
imagine having only 1 row of data in the test set. If you compare alternative models, which one makes
the best predictions on a single data point will be mostly a matter of luck.
You will typically keep about 20% as a test dataset. But even with 1000 rows in the test set, there's some
random chance in determining model scores. A model might do well on one set of 1000 rows, even if it
would be inaccurate on a different 1000 rows. The larger the test set, the less randomness (aka "noise")
there is in our measure of model quality.
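This "noise" can be seen directly. In the illustrative sketch below (synthetic data and a simple linear model, both assumptions of this example), the same modeling process is scored on five different random train/test splits, and the scores differ from split to split:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# synthetic regression data standing in for a real dataset
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
y = X @ np.array([1.0, 2.0, 3.0]) + rng.normal(scale=1.0, size=1000)

scores = []
for seed in range(5):
    # a different random 80/20 split each time
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=seed)
    model = LinearRegression().fit(X_tr, y_tr)
    scores.append(mean_absolute_error(y_te, model.predict(X_te)))

# the spread across splits is the randomness ("noise") in the score
print([round(s, 3) for s in scores])
print('spread:', round(max(scores) - min(scores), 3))
```

The model is the same every time; only the split changes, yet the reported quality moves around.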
In cross-validation, we begin by dividing the data into several pieces, or "folds" (say, 5 folds of 20%
each). We then run an experiment, call it experiment 1, which uses the first fold as a holdout set and
everything else as training data. This gives us a measure of model quality based on a 20% holdout set,
much as we got from using the simple train-test split.
We then run a second experiment, where we hold out data from the second fold (using everything
except the 2nd fold for training the model). This gives us a second estimate of model quality. We repeat
this process, using every fold once as the holdout. Putting this together, 100% of the data is used as a
holdout at some point.
Returning to our example above from train-test split, if we have 5000 rows of data, we end up with a
measure of model quality based on 5000 rows of holdout (even if we don't use all 5000 rows
simultaneously).
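The fold-by-fold procedure described above can be sketched with scikit-learn's KFold (a minimal illustration; the 5000-row array is a synthetic stand-in for the running example):

```python
import numpy as np
from sklearn.model_selection import KFold

# synthetic dataset: 5000 rows, matching the running example
X = np.arange(5000).reshape(-1, 1)

kf = KFold(n_splits=5, shuffle=True, random_state=1)
holdout_total = 0
for i, (train_idx, holdout_idx) in enumerate(kf.split(X), start=1):
    # each experiment holds out one fold (20%) and trains on the rest
    print(f"experiment {i}: train={len(train_idx)} holdout={len(holdout_idx)}")
    holdout_total += len(holdout_idx)

# across the five experiments, every row is used as holdout exactly once
print(holdout_total)  # 5000
```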
Cross-validation gives a more accurate measure of model quality, which is especially important if you
are making a lot of modeling decisions. However, it can take more time to run, because it estimates
models once for each fold. So it is doing more total work.
Given these tradeoffs, when should you use each approach? On small datasets, the extra computational
burden of running cross-validation isn't a big deal. These are also the problems where model quality
scores would be least reliable with train-test split. So, if your dataset is smaller, you should run cross-
validation.
For the same reasons, a simple train-test split is sufficient for larger datasets. It will run faster, and you
may have enough data that there's little need to re-use some of it for holdout.
There's no simple threshold for what constitutes a large vs small dataset. If your model takes a couple
of minutes or less to run, it's probably worth switching to cross-validation. If your model takes much longer
to run, cross-validation may slow down your workflow more than it's worth.
Alternatively, you can run cross-validation and see if the scores for each experiment seem close. If each
experiment gives the same results, train-test split is probably sufficient.
Example
First we read the data
import pandas as pd
data = pd.read_csv('../input/melb_data.csv')
cols_to_use = ['Rooms', 'Distance', 'Landsize', 'BuildingArea', 'YearBuilt']
X = data[cols_to_use]
y = data.Price
Then specify a pipeline of our modeling steps. (It can be very difficult to do cross-validation properly if you
aren't using pipelines.)
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import make_pipeline
# Imputer was removed in newer scikit-learn; SimpleImputer replaces it
from sklearn.impute import SimpleImputer
my_pipeline = make_pipeline(SimpleImputer(), RandomForestRegressor())
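The listing stops before the scoring step. Below is a sketch of how the cross-validation scores discussed next can be obtained with scikit-learn's cross_val_score; synthetic data stands in for melb_data.csv, which may not be available, and SimpleImputer stands in for the deprecated Imputer:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# synthetic stand-in for the Melbourne housing features and target
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ np.array([3.0, -2.0, 1.0, 0.5, 0.0]) + rng.normal(scale=0.1, size=200)

my_pipeline = make_pipeline(SimpleImputer(), RandomForestRegressor(random_state=1))

# 'neg_mean_absolute_error': scikit-learn reports metrics so higher is better
scores = cross_val_score(my_pipeline, X, y, cv=5, scoring='neg_mean_absolute_error')
print(scores)
print('Mean Absolute Error %.2f' % (-1 * scores.mean()))
```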
You may notice that we specified an argument for scoring. This specifies what measure of model quality
to report. The docs for scikit-learn show a list of options.
It is a little surprising that we specify negative mean absolute error in this case. Scikit-learn has a
convention where all metrics are defined so a high number is better. Using negatives here allows them to
be consistent with that convention, though negative MAE is almost unheard of elsewhere.
You typically want a single measure of model quality to compare between models. So we take the
average across experiments.
print('Mean Absolute Error %.2f' % (-1 * scores.mean()))
Conclusion
Using cross-validation gave us much better measures of model quality, with the added benefit of cleaning
up our code (no longer needing to keep track of separate train and test sets). So, it's a good win.
Activity 1: Convert the code for your on-going project over from train-test split to cross-validation. Make
sure to remove all code that divides your dataset into training and testing datasets. Leaving code you don't
need any more would be sloppy.
Activity 2: Add or remove a predictor from your models. Compute the cross-validation score using both sets
of predictors, and see how the scores compare.
Knowing how good a set of predictions is allows you to estimate how good a given machine learning
model of your problem is.
You must estimate the quality of a set of predictions whenever you train a machine learning model.
Performance metrics like classification accuracy and root mean squared error can give you a clear objective
idea of how good a set of predictions is, and in turn how good the model is that generated them.
This is important as it allows you to tell the difference and select among:
Different transforms of the data used to train the same machine learning model.
Different machine learning models trained on the same data.
Different configurations for a machine learning model trained on the same data.
As such, performance metrics are a required building block in implementing machine learning
algorithms from scratch.
All the algorithms in machine learning rely on minimizing or maximizing a function, which we call the
"objective function". The group of functions that are minimized are called "loss functions". A loss
function is a measure of how good a prediction model is at predicting the expected outcome. The most
commonly used method of finding the minimum point of a function is "gradient descent". Think of the
loss function as an undulating mountain; gradient descent is like sliding down the mountain to reach
the bottommost point.
Loss functions can be broadly categorized into 2 types: Classification and Regression Loss.
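The mountain analogy can be made concrete with a tiny sketch (illustrative only, not from the manual): gradient descent on the one-parameter loss L(w) = (w - 3)^2, whose minimum sits at w = 3.

```python
# minimize L(w) = (w - 3)**2 with gradient descent
def gradient(w):
    # dL/dw = 2 * (w - 3)
    return 2 * (w - 3)

w = 0.0              # starting point on the "mountain"
learning_rate = 0.1
for _ in range(100):
    w -= learning_rate * gradient(w)  # step downhill, against the gradient

print(round(w, 4))  # converges toward the minimum at w = 3
```

Each step moves opposite the slope, so the updates shrink as the bottom of the valley is approached.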
Graphically:
As you can see in this scatter graph, the red dots are the actual values and the blue line is the set of
predicted values drawn by our model. The distance between an actual value (red dot) and the predicted
line represents the error, and we can draw such a line from each red dot to the blue line. Squaring those
distances, taking their mean, and finally taking the square root gives us the RMSE of our model.
Example 1 (RMSE)
Let us write a Python code to find out the RMSE value of our model. We will be predicting the brain
weight of the users, using linear regression to train our model. The data set used in this code can be
downloaded from here: headbrain6-
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
"""
here the directory of the code and the headbrain6.csv file is the same;
make sure both files are stored in the same folder or directory
"""
data = pd.read_csv('headbrain6.csv')
data.head()
# head size as input, brain weight as target
x = data.iloc[:, 2:3].values
y = data.iloc[:, 3:4].values
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=1/4, random_state=0)
regressor = LinearRegression()
regressor.fit(x_train, y_train)
y_pred = regressor.predict(x_test)
# visualize the training data
plt.scatter(x_train, y_train, c='red')
plt.show()
# visualize the fitted line against the test data
plt.plot(x_test, y_pred)
plt.scatter(x_test, y_test, c='red')
plt.xlabel('headsize')
plt.ylabel('brain weight')
plt.show()
# compute RSS, MSE and RMSE
rss = ((y_test - y_pred) ** 2).sum()
mse = np.mean((y_test - y_pred) ** 2)
rmse = np.sqrt(mse)
print('RMSE:', rmse)
Output
The RMSE value of our model comes out to be approximately 73, which is not bad. A good model should
have an RMSE value less than 180. If you have a higher RMSE value, you probably need to change your
features or tweak your hyperparameters.
Below is a plot of an MSE function where the true target value is 100 and the predicted values range
from -10,000 to 10,000. The MSE loss (Y-axis) reaches its minimum value at prediction (X-axis) = 100.
The range of MSE values is 0 to ∞.
MSE is sensitive to outliers, and given several examples with the same input feature values, the
optimal prediction will be their mean target value. This should be compared with Mean Absolute Error,
where the optimal prediction is the median. MSE is thus good to use if you believe that your target data,
conditioned on the input, is normally distributed around a mean value, and when it is important to
penalize outliers heavily.
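This mean-vs-median claim can be checked numerically (an illustrative sketch with made-up target values): a brute-force search over constant predictions shows the MSE minimizer lands at the mean, while the MAE minimizer lands at the median.

```python
import numpy as np

targets = np.array([1.0, 2.0, 3.0, 4.0, 100.0])  # one large outlier

# brute-force search over candidate constant predictions on a fine grid
candidates = np.linspace(0.0, 100.0, 10001)
best_mse = min(candidates, key=lambda c: np.mean((targets - c) ** 2))
best_mae = min(candidates, key=lambda c: np.mean(np.abs(targets - c)))

print(best_mse, np.mean(targets))    # near 22.0 (the mean)
print(best_mae, np.median(targets))  # near 3.0 (the median)
```

Note how the outlier drags the MSE-optimal prediction far from the bulk of the data, while the MAE-optimal prediction stays with the median.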
Use MSE when doing regression, believing that your target, conditioned on the input, is normally
distributed, and want large errors to be significantly (quadratically) more penalized than small ones.
Example-1: You want to predict future house prices. The price is a continuous value, and therefore we
want to do regression. MSE can here be used as the loss function.
Example-2: Consider the given data points: (1,1), (2,1), (3,2), (4,2), (5,4)
You can use an online calculator to find the regression equation / line.

x    y    Y' (predicted)
1    1    0.6
2    1    1.29
3    2    1.99
4    2    2.69
5    4    3.4
from sklearn.metrics import mean_squared_error
# Given values
Y_true = [1, 1, 2, 2, 4]  # Y_true = Y (original values)
# Calculated values
Y_pred = [0.6, 1.29, 1.99, 2.69, 3.4]  # Y_pred = Y'
# Compute the mean squared error
mse = mean_squared_error(Y_true, Y_pred)
print(mse)
Output: 0.21606
Title: Model Life Cycle
Approach: Hands-on, Team Discussion, Web search, Case studies
Summary: The machine learning life cycle is the cyclical process that AI or machine learning projects
follow. It defines each step that an engineer or developer should follow. Generally, every AI project
lifecycle encompasses three main stages: project scoping, design or build phase, and deployment
in production. In this unit we will go over each of them and the key steps and factors to consider
when implementing them.
The expectation out of students would be to focus more on the hands-on aspect of AI projects.
Objectives:
1. Students should develop their capstone project using AI project cycle methodologies
2. Students should be comfortable in breaking down their projects into different phases of
AI project cycle
3. Students should be in a position to choose and apply the right AI model to solve the problem
Learning Outcomes:
1. Students will demonstrate the skill of breaking down a problem in smaller sub units
according to AI project life cycle methodologies
2. Students will demonstrate proficiency in choosing and applying the correct AI or ML model
CLASS XII - LEVEL 3: AI INNOVATE AI TEACHER INSTRUCTION MANUAL
Generally, every AI project lifecycle encompasses three main stages: project scoping, design or build
phase, and deployment in production. Let's go over each of them and the key steps and factors to consider
when implementing them.
(Source : https://blog.dataiku.com/ai-projects-lifecycle-key-steps-and-considerations)
The first fundamental step when starting an AI initiative is scoping and selecting the relevant use case(s)
that the AI model will be built to address. This is arguably the most important part of your AI project. Why?
There are a couple of reasons. First, this stage involves the planning and motivational aspects of your
project, and it is important to start strong if you want your artificial intelligence project to be successful.
There's a great phrase that characterizes this stage: garbage in, garbage out. It means that if the data you
collect is no good, you won't be able to build an effective AI algorithm, and your whole project will collapse.
In this phase, it's crucial to precisely define the strategic business objectives and desired outcomes of the
project, align all the different stakeholders' expectations, anticipate the key resources and steps,
and define the success metrics. Selecting the AI or machine learning use cases and being able to evaluate
the return on investment (ROI) is critical to the success of any data project.
Once the relevant projects have been selected and properly scoped, the next step of the machine learning
lifecycle is the Design or Build phase, which can take from a few days to multiple months, depending on
the nature of the project. The Design phase is essentially an iterative process comprising all the steps
relevant to building the AI or machine learning model: data acquisition, exploration, preparation, cleaning,
feature engineering, testing and running a set of models to try to predict behaviours or discover insights
in the data.
Enabling all the different people involved in the AI project to have the appropriate access to data, tools,
and processes in order to collaborate across different stages of the model building is critical to its success.
Another key success factor to consider is model validation: how will you determine, measure, and evaluate
the performance of each iteration with regards to the defined ROI objective?
During this phase, you need to evaluate the various AI development platforms, e.g.:
Open languages: Python is the most popular, with R and Scala also in the mix.
Approaches and techniques: classic ML techniques, from regression all the way to state-of-the-art GANs and RL.
Development tools: DataRobot, H2O, Watson Studio, Azure ML Studio, SageMaker, Anaconda, etc.
Different AI development platforms offer extensive documentation to help development teams.
Depending on your choice of AI platform, visit the appropriate webpages for this documentation
(for example, BigML's documentation pages).
Step 3: Testing
While the fundamental testing concepts are fully applicable in AI development projects, there are
additional considerations too. These are as follows:
Human biases in selecting test data can adversely impact the testing phase; therefore, data
validation is important.
Your testing team should test the AI and ML algorithms keeping model validation, successful
learnability, and algorithm effectiveness in mind.
Regulatory compliance testing and security testing are important since the system might deal with
sensitive data; moreover, the large volume of data makes performance testing crucial.
If you are implementing an AI solution that needs to use data from your other systems,
systems integration testing assumes importance.
Test data should include all relevant subsets of training data, i.e., the data you will use for training
the AI system.
Your team must create test suites that help you validate your ML models.
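As a sketch of such a test suite (hypothetical data and thresholds; a real suite would use a fixed validation set and project-specific acceptance criteria), a validation check can be written as plain assertions:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

def train_model():
    # stand-in training data; a real suite would load a fixed validation set
    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 4))
    y = X @ np.array([1.0, -1.0, 2.0, 0.5]) + rng.normal(scale=0.1, size=500)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    model = LinearRegression().fit(X_tr, y_tr)
    return model, X_te, y_te

def test_model_beats_baseline():
    model, X_te, y_te = train_model()
    mae = mean_absolute_error(y_te, model.predict(X_te))
    baseline = mean_absolute_error(y_te, np.full_like(y_te, y_te.mean()))
    # the trained model should clearly beat a predict-the-mean baseline
    assert mae < baseline

test_model_beats_baseline()
print("model validation checks passed")
```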
Unit 3: Storytelling
Refreshing what we learnt in Level 1, we will be re-visiting some concepts of storytelling.
Why is storytelling so powerful and cross-cultural, and what does this mean for data
storytelling?
Stories create engaging experiences that transport the audience to another space and time. They
establish a sense of community belongingness and identity. For these reasons, storytelling is
considered a powerful element that enhances global networking by increasing the awareness
about the cultural differences and enhancing cross-cultural understanding. Storytelling is an
integral part of indigenous cultures.
Some of the factors that make storytelling powerful are its ability to make information more
compelling, to present a window for taking a peek at the past, and finally to draw lessons and
reimagine the future by effecting necessary changes. Storytelling also shapes, empowers and
connects people by doing away with judgement or criticism, facilitating openness to embracing
differences.
A well-told story is an inspirational narrative crafted to engage the audience across boundaries
and cultures, with an impact that isn't possible with data alone. Data can be persuasive, but
stories are much more: they change the way we interact with data, transforming it from a dry
collection of "facts" into something that can be entertaining, engaging, and thought provoking,
and that can inspire change.
Each data point holds some information which may be unclear and contextually deficient on its
own. Visualizations of such data are therefore subject to interpretation (and misinterpretation).
However, stories are more likely to drive action than statistics and numbers. When data is told
in the form of a narrative, ambiguity is reduced: the narrative connects data with context and
describes a specific interpretation, communicating the important messages in the most effective
ways. The steps involved in telling an effective data story are given below:
Understanding the audience
Choosing the right data and visualisations
Drawing attention to key information
Developing a narrative
Engaging your audience
Activity
A new teacher joined the ABC Higher Secondary School, Ambapalli to teach Science to the
students of Class XI. In his first class itself, he could make out that not everyone understood what
was being taught in class. So, he decided to take a poll to assess the level of students. The following
graph shows the level of interest of the students in the class.
[Pie chart: poll results before the change in teaching method: 40%, 25%, 19%, 11%, 5%]
Depending on the result obtained, he changed his method of teaching. After a month, he repeated
the same poll once again to ascertain if there was any change. The results of poll are shown in the
chart below.
[Pie chart: poll results after one month: 38%, 30%, 14%, 12%, 6%]
With the help of the information provided, create a good data story, setting a strong narrative
around the data and making it easier to understand the pre and post data, the existing problem,
the action taken by the teacher, and the resolution of the problem. Distribute A4 sheets and pens
to the students for this activity.
Purpose: To provide insight into data storytelling and how it can bring a story to life.
Say: “Now that you have understood what storytelling is and why it is needed, let us learn about
storytelling of a different kind: the art of data storytelling, where data is presented in the form of
a narrative or story.”
Session Preparation
Logistics: For a class of ____ students. [Group Activity]
Materials Required
ITEM QUANTITY
A4 sheets Xx
Pens Xx
Data storytelling is a structured approach for communicating insights drawn from data, and
invariably involves a combination of three key elements: data, visuals, and narrative. When the
narrative accompanies data, it helps explain to the audience what is happening in the data
and why a particular insight has been generated. When visuals are applied to data, they
can enlighten the audience to insights that they wouldn't perceive without the charts or
graphs.
Finally, when narrative and visuals are merged together, they can engage or even entertain an
audience. When you combine the right visuals and narrative with the right data, you have a data
story that can influence and drive change.
Presenting the data as a series of disjointed charts and graphs could result in the audience
struggling to understand it, or worse, coming to the wrong conclusions entirely. Thus, the
importance of a narrative comes from the fact that it explains what is going on within the data
set. It offers context and meaning, relevance and clarity. A narrative shows the audience where
to look and what not to miss, and it keeps the audience engaged.
Good stories don’t just emerge from data itself; they need to be unravelled from data
relationships. Closer scrutiny helps uncover how each data point relates to the others. Some easy
steps that can assist in finding compelling stories in data sets are as follows:
Step 1: Get the data and organise it.
Step 2: Visualize the data.
Step 3: Examine data relationships.
Step 4: Create a simple narrative embedded with conflict.
Activity: Try creating a data story with the information given below and use your imagination to
reason as to why some cases have spiked while others have seen a fall.
Storytelling is an effective tool to transmit human experience. Narrative is the way we simplify
and make sense of a complex world; it supplies context, insight, and interpretation, all the
things that make data meaningful, more relevant and interesting.
No matter how impressive an analysis, or how high-quality the data, it is not going to
compel change unless the people involved understand what is explained through a story.
Stories that incorporate data and analytics are more convincing than those based entirely
on anecdotes or personal experience.
Data storytelling helps to standardize communications and spread results.
It makes information memorable and easier to retain in the long run.
Data Story elements challenge –
Identify the elements that make a compelling data story and name them
_____________________
______________________
_____________________
APPENDIX
Additional Resource for Advanced learners
The objective of this additional AI programming resource for Class 12 is to increase student
knowledge and exposure to programming and help them create AI projects.
The Resources are divided into two categories:
Links:
Beginner - https://bit.ly/33spBZq
Advanced - https://bit.ly/3b9US7V
Note: Please use the Google Colab links for easy reference