IBM AI Level (1-3)
CLASS XI
______________________________________________
INDEX
LEVEL 1: AI INFORMED (AI FOUNDATIONS) TEACHER INSTRUCTION MANUAL
Summary:
The world of tomorrow will be very different from the one we can predict and see today. Artificial
Intelligence is the driving force that will shape future generations. Self-driving cars, widespread
automation and robotic gadgets will become an integral part of the day-to-day life of the human race,
and trade, work, professions and employment will see a massive transformation. Fast adaptability is
crucial for the forthcoming cohort, as they will be widely affected by this change. We, the mentors,
shoulder the responsibility of equipping them to handle the tools of the future with care and
intellectual pride.
We are confident that these children will empower themselves for the future to come and will
understand the key concepts underlying this new technology: AI.
What is AI? This unit lays the foundations of AI by discussing its history and setting the ground for the
forthcoming units.
Objective:
1. Understand the definition of Artificial Intelligence and Machine Learning
2. Evaluate the impact of AI on society
3. Unfold the AI terminology: Machine Learning (ML), Deep Learning (DL), Supervised
Learning, Unsupervised Learning etc.
4. Understand the strengths and limitations of AI and ML
5. Identify the difference between AI on one side and Machine Learning (ML) and Deep
Learning (DL) on the other
Learning Outcome:
1. To get introduced to the basics of AI and its allied technologies
2. To understand the impact of AI on society
Pre-requisites: Reasonable fluency in English language and basic computer skills
Key Concepts: Artificial Intelligence (AI), Machine Learning (ML) and Deep Learning (DL)
AI is a technique that enables a machine to perform cognitive functions such as perceiving, learning
and reasoning that are otherwise performed by humans.
“The science and engineering of making intelligent machines, especially intelligent computer
programs, is Artificial Intelligence.” – John McCarthy (Father of AI)
The yardstick for achieving true AI still seems decades away. Computers execute certain tasks far
better than humans, e.g. sorting, computing, memorizing, indexing and finding patterns, while
identifying emotions, recognising faces and holding conversations remain unbeatable human skills.
This is where AI will play a crucial role in enabling machines to approach human capabilities.
Activity
Let’s get imaginative and create an intelligent motorbike. It is the year 2030; add features to create a
machine that races against time.
_________________________________________________________________________________
2. History of AI
In the 1950s
Modern-day AI gained impetus in the 1950s, when Alan Turing introduced the “Turing Test” for the
assessment of machine intelligence.
In 1955
John McCarthy, known as the founder of Artificial Intelligence, coined the term ‘Artificial
Intelligence’. McCarthy, along with Alan Turing, Allen Newell, Herbert A. Simon and Marvin Minsky,
made the greatest contributions to present-day machine intelligence. Turing suggested that if humans
use accessible information, as well as reason, to solve problems and make decisions, then why can’t it
be done with the help of machines?
In the 1970s
The 70s saw an upsurge of the computer era. Machines became much quicker and more affordable,
and stored more information. They showed a remarkable capacity for abstract thinking and
self-recognition, and accomplished natural language processing.
In the 1980s
These were the years that saw a flow of funds for research and algorithmic tools. Learning techniques
were enhanced, and computers improved with a deeper user experience.
In the 2000s
After many unsuccessful attempts, the technology was successfully established by the 2000s, and the
milestones that needed to be accomplished were finally realised. AI managed to thrive despite a lack
of government funding and public appreciation.
3. Machine Learning
Example 1:
2, 4, 8, 16, 32, ?
I am sure you would have guessed the correct answer: 64. But how did you arrive at 64? This
calculation took place inside your brain cells, and the technique you used to decipher this puzzle has
actually helped you decode Machine Learning (ML).
That’s exactly the kind of behaviour we are trying to teach machines. ‘Learning from experience’
is what we want machines to acquire.
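The guessing step above can be sketched in a few lines of Python. The program is never told the doubling rule; it infers the pattern (here, a constant ratio between terms) from the earlier terms and applies it once more:

```python
# A minimal sketch of "learning from experience": infer the pattern
# from the data itself, then use it to predict the next term.

def predict_next(sequence):
    """Guess the next term by learning the ratio between consecutive terms."""
    ratios = [b / a for a, b in zip(sequence, sequence[1:])]
    ratio = ratios[0]
    # Only predict if the same ratio holds across the whole sequence.
    assert all(abs(r - ratio) < 1e-9 for r in ratios), "no constant ratio found"
    return sequence[-1] * ratio

print(predict_next([2, 4, 8, 16, 32]))  # 64.0
```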
Example 2:
Let us take another example, from cricket. Assume you are the batsman facing a bowler. By looking at
the bowler’s body movement and action, you predict the delivery and move either left or right to hit
the ball. But if the bowler throws a straight ball, what will you do? Apart from the bowler’s body
movement, you also try to find patterns in the bowler’s bowling habits, for example that after 2
consecutive balls to the left side, he/she throws a straight ball, and you prepare yourself to face the
next ball accordingly. So what you are doing is learning from past experience in order to perform
better in the future.
When a computer does this, it is called Machine Learning: you let the computer learn from its past
experience/data.
Example 3:
I am Mr. XYZ and I want to buy a house, so I try to calculate how much I need to save monthly for it. I
did my research and got to know that a new house would cost me anything between Rs. 30 Lakh and
Rs. 100 Lakh, a 5-year-old house would cost between Rs. 20 Lakh and Rs. 50 Lakh, a house in Delhi
would cost me ...... and buying a house in Mumbai would be ...... and so on.
Now my brain starts working and suddenly I am able to make out a pattern:
So, the price of the house depends on its age, location, built up area, facilities, depreciation
(which means that price could drop by Rs. 2 Lakh every year, but it would not go below Rs. 20
Lakh.)
In machine learning terms, Mr. XYZ has stumbled upon regression: he predicted a value (price)
based on the available historical data. People do it all the time, whether estimating a reasonable
price for a used phone or car, or figuring out how many cakes to buy for a birthday party: at 200
grams per person, how many kilograms for a party of 50 persons?
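This kind of price prediction can be sketched as a simple straight-line fit. The numbers below are invented for illustration (a price dropping by about Rs. 2 Lakh per year of age, as described above); a real model would use many more factors than age alone.

```python
# A minimal regression sketch: predict house price (Rs. Lakh) from age
# (years) by fitting a least-squares line to historical data.

def fit_line(xs, ys):
    """Return slope and intercept of the least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical history: price drops by about Rs. 2 Lakh per year of age.
ages   = [0, 1, 2, 3, 4, 5]
prices = [30, 28, 26, 24, 22, 20]

slope, intercept = fit_line(ages, prices)
predicted = slope * 2.5 + intercept      # price of a 2.5-year-old house
print(round(predicted, 1))               # 25.0
```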
Let's get back to the pricing of the house. The problem is that construction dates differ, dozens of
options are available, locations are multiple, seasonal demand spikes, and there is an array of many
more hidden factors.
Humans may not be able to keep all that data in mind while calculating the price of a prospective
house, so we need machines to do the mathematics for us. Let’s go the computational way: provide
the machine some data and ask it to find all the hidden patterns related to the price, and it works! The
most exciting thing is that a machine copes with this task much better than a real person carefully
analysing all the dependencies in his/her mind. This heralds the birth of machine learning!
2. YouTube recommends videos of a certain genre to you, and the recommended videos match your
viewing choices to a great extent.
3. Flipkart or Amazon recommend products of your choice. How do they come to know your buying
preferences? Did you shop together?
4. When you upload photos to Facebook, the service automatically highlights faces and suggests which
friends to tag. How does it instantly identify your friends in the photos? You might be thinking that
Facebook is a magician. Isn’t it?
If you haven’t realized it yet, it is time for you to know that machine learning is behind all the
surprises sprung by Google, Amazon and Flipkart. You too can create this magic by learning a little
mathematics and a computer programming language.
I am sure, by now you have some insight into ML. So, what is ML?
“Machine Learning is a discipline that deals with programming the systems so as to make them
automatically learn and improve with experience. Here, learning implies understanding the input data
and taking informed decisions based on the supplied data”. In simple words, Machine Learning is a
subset of AI which predicts results based on incoming data.
The utilities of ML are numerous. To detect spam emails, forecast stock prices or project class
attendance, one can achieve results by means of earlier collected spam messages, previous price
history records, or 5 or more years of attendance data for a class; ML will predict the results based
on the previous data available to it.
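The spam-detection idea above can be sketched with a toy word-count classifier. The messages are invented for illustration; a real filter would use far more data and a probabilistic model such as naive Bayes:

```python
# A minimal sketch of spam detection from previously collected messages:
# count how often each word appears in known spam vs. known good mail,
# then score a new message by which side its words favour.
from collections import Counter

spam = ["win cash prize now", "cheap loan offer now"]
ham  = ["meeting moved to monday", "homework due tomorrow"]

spam_counts = Counter(word for msg in spam for word in msg.split())
ham_counts  = Counter(word for msg in ham for word in msg.split())

def classify(message):
    """Label a message by comparing word overlap with past spam vs. ham."""
    spam_score = sum(spam_counts[w] for w in message.split())
    ham_score  = sum(ham_counts[w] for w in message.split())
    return "spam" if spam_score > ham_score else "not spam"

print(classify("claim your cash prize"))   # spam
print(classify("monday homework update"))  # not spam
```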
Activity
Based on the understanding you have developed till now, how do you think Machine Learning could
help with some of the problems currently faced by your school? Fill the problems in the blank circles
given below:
Conventional programming and ML coding are both computer programs, but their approach and
objective are different. Like your school dress and your casual dress: both are clothes made from
threads, but their purpose is different.
If you need to develop a website for your school, you will take the conventional programming approach.
But if you want to develop an application to forecast the attendance percentage of your school for a
particular month (based on historical attendance data), you will use the ML approach.
Conventional programming refers to any manually created program which uses input data, runs on a
computer and produces the output. What does that mean? Let us understand it through the
illustration below:
A programmer accepts the input and gives instructions (through code in a computer language) to the
computer to produce an output.
Take a look at an example: the steps to convert the Celsius scale to the Fahrenheit scale.
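The conventional program referred to here is not reproduced in the text; a minimal Python version might look like this, with the programmer supplying the conversion rule explicitly:

```python
# Conventional programming: the programmer writes the rule (the formula)
# explicitly, and the computer merely applies it to the input.

def celsius_to_fahrenheit(celsius):
    """Convert a Celsius temperature using the known formula F = C * 1.8 + 32."""
    return celsius * 1.8 + 32

print(celsius_to_fahrenheit(100))  # 212.0
print(celsius_to_fahrenheit(0))    # 32.0
```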
On the contrary, in Machine Learning (ML), the input data and the output data are fed to an algorithm
(a machine learning algorithm) to create a program. Unlike conventional programming, machine
learning is an automated process where the programmer feeds the computer ‘The Input + The Output’
and the computer generates the algorithm for how ‘The Output’ was achieved.
For example, if the same Python program above were written using the Machine Learning approach,
the steps would look like this:
Step 1: Feed in many Celsius values (i.e. -40, -10, 0, 8, 15, 22, 38)
Step 2: Feed in the corresponding Fahrenheit values (i.e. -40, 14, 32, 46, 59, 72, 100)
Step 3: Pass these 2 sets of values to a Machine Learning (ML) algorithm
Step 4: Now ask the ML program to predict (convert) any other Celsius value to Fahrenheit, and the
program will tell you the answer.
For example, ask the computer to predict (convert) 200 Celsius to Fahrenheit, and you will get the
answer as approximately 392.
Notice that in the ML approach, the conversion formula (F = C * 1.8 + 32) is nowhere mentioned. The
code was provided with the input data (Celsius) and the corresponding output data (Fahrenheit), and
the model (ML code) automatically learned the relationship between Celsius and Fahrenheit.
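The four steps above can be sketched with a least-squares line fit standing in for the ML algorithm. The program sees only the input-output pairs and recovers the rule itself:

```python
# The ML approach: the program is shown inputs and outputs only, and
# learns the Celsius-to-Fahrenheit relationship from the data.

def fit_line(xs, ys):
    """Least-squares slope and intercept for the points (xs, ys)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
            sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx

celsius    = [-40, -10, 0, 8, 15, 22, 38]       # Step 1: inputs
fahrenheit = [-40, 14, 32, 46, 59, 72, 100]     # Step 2: outputs (rounded)

slope, intercept = fit_line(celsius, fahrenheit)  # Step 3: learn the pattern
prediction = slope * 200 + intercept              # Step 4: predict 200 Celsius

print(round(slope, 2))       # close to 1.8
print(round(prediction))     # 392
```

The learned slope and intercept come out close to 1.8 and 32 rather than exactly, because the Fahrenheit training values are rounded; the prediction for 200 Celsius still rounds to 392.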
There is a lot of debate regarding the difference between Machine Learning and Artificial Intelligence.
The truth is that Machine Learning and Artificial Intelligence are not two essentially different things,
as is often understood; Machine Learning is a tool for achieving Artificial Intelligence.
AI is a technology to create intelligent machines that can recognize human speech, can see (vision),
assimilate knowledge, strategize and solve problems as humans do. Broadly, AI entails all those
technologies or fields that aim to create intelligent machines.
Machine learning gives machines the ability to learn, forecast and progress on their own without
being explicitly programmed. In a nutshell, ML is about learning and nothing else. An ML system
primarily starts in a ‘slow state’ (like a child) and gradually improves by learning from examples to
become ‘superior’ (like an adult).
Imagine you have to make a robot that can see, talk, walk, sense and learn. What technology will you
apply? To achieve the task of making such a robot, one has to apply numerous technologies, but for
the learning part, you will apply machine learning.
4. Data
Modern day scholars have coined the phrase ‘Data is the new oil’. If everyone is talking so highly about
data, then it must be something precious! But what is this data?
Activity
Let us create a students’ dataset for your class (the one given below is a sample, you can create one of
your own)
A 76 Male 92 N
B 82 Male 88 Y
C 57 Male 65 N
D 97 Female 97 N
E 56 Male 62 Y
F 76 Female 85 N
G 51 Male 56 Y
Activity
Open the URL https://data.gov.in/node/6721404 in your web browser. It should open the following
page
The page you opened, has a link Reference URL: https://myspeed.trai.gov.in/ - Click on this link.
Now that we have engaged in two activities related to data, let us try and define Data.
Data can be defined as a representation of facts or instructions about some entity (students, school,
sports, business, animals etc.) that can be processed or communicated by human or machines. Data is
a collection of facts, such as numbers, words, pictures, audio clips, videos, maps, measurements,
observations or even just descriptions of things.
Data may be represented with the help of characters such as alphabets (A-Z, a-z), digits (0-9) or
special characters (+, -, /, *, <, >, = etc.)
Activity
Now that you have created a dataset of your own, it is the time to categorise the data. Data can be
sorted into one of the two categories stated below:
Structured Data
Unstructured Data
‘Structured data’ is most often categorized as quantitative data, and it is the type of data most of us
work with every day. Structured data has predefined data types and a format, so it fits well into the
columns/fields of a database or spreadsheet. It is highly organised and easily analysed.
In the above activity, name, age, address etc. are examples of ‘structured data’: the data is organised
into accurately defined fields. Data that can be stored in relational databases or spreadsheets (like
Excel) is the best example of structured data.
However, for the field ‘Type of Facebook posts’, do you have any predefined data type? In fact, your
Facebook post can carry anything: text, picture, video, audio etc. You can’t have one fixed data type
for such data, and that’s why you call it ‘unstructured data’, where neither the size is fixed nor the
datatype predefined.
‘Unstructured data’ is most often categorized as qualitative data, and it cannot be processed and
analysed using conventional relational database (RDBMS) methods.
Examples of unstructured data include text, video, audio, mobile activity, social media activity,
satellite imagery, surveillance imagery, and the list goes on. Unstructured data is difficult to
deconstruct because it has no pre-defined model, meaning it cannot be organized in relational
databases. Instead, non-relational (NoSQL) databases are the best fit for managing unstructured data.
“Machine learning is the science of getting computers to act without being explicitly programmed.”
– Stanford University
“Machine learning algorithms can figure out how to perform important tasks by generalizing from
examples.” – University of Washington
Of late, machine learning has achieved a great deal of popularity, but the first attempt to develop a
machine that imitated the behaviour of a living being was made in the 1930s by Thomas Ross. Machine
Learning (ML) is a term used today to describe an application of AI which equips the system with the
ability to learn and improve from experience using the data that is accessible to it.
For more, please refer to section 3.
Machine learning is often divided into three categories – Supervised, Unsupervised and
Reinforcement learning.
As the name specifies, supervised learning occurs in the presence of a supervisor or teacher. We train
the machine with labelled data (i.e. data already tagged with the correct answer), comparable to the
learning which takes place in the presence of a supervisor or a teacher. A supervised learning
algorithm learns from labelled training data and then becomes ready to predict the outcomes for
unseen data.
Example 1
Remember the time when you used to go to school? The time when you first learnt what an apple looked
like? The teacher probably showed a picture of an apple and told you what it was, right? And you could
identify the particular fruit ever since then.
Step 1: You provide the system with data that contains photos of apples and let it know that these are
apples. This is called labelled data.
Step 2: The model learns from the labelled data and the next time you ask it to identify an apple, it can
do it easily.
Example 2
For instance, suppose you are given a basket full of different kinds of fruits. Now the first step is to train
the machine to identify all the different fruits one by one in the following manner:
If the shape of the object is round with depression at the top and
its color being Red, then it will be labelled – Apple.
If shape of object resembles a long-curved cylinder with tapering
ends and its colour being Green or Yellow, then it will be labelled
– Banana.
Now suppose after training, you bring a banana and ask the machine to
identify it, the machine will classify the fruit on the basis of its shape and
colour and would confirm the fruit to be BANANA and place it in the Banana category.
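The fruit-sorting rules above can be sketched as a tiny supervised classifier. Here a 1-nearest-neighbour model learns from labelled examples; the numeric ‘roundness’ and ‘redness’ features are invented for illustration:

```python
# A minimal supervised-learning sketch: a 1-nearest-neighbour classifier
# trained on labelled fruit examples described by two made-up features.

def nearest_label(training, point):
    """Return the label of the training example closest to `point`."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    features, label = min(training, key=lambda ex: dist(ex[0], point))
    return label

# Labelled data: ((roundness, redness), label)
training = [
    ((0.90, 0.80), "apple"),   # round and red
    ((0.95, 0.90), "apple"),
    ((0.20, 0.10), "banana"),  # long and yellow/green
    ((0.10, 0.20), "banana"),
]

print(nearest_label(training, (0.85, 0.70)))  # apple
print(nearest_label(training, (0.15, 0.10)))  # banana
```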
Activity 1
Suppose you have a data set containing images of different bikes and cars. Now you need to train the
machine on how to classify all the different images. How will you create your labelled data?
[Hint – If there are 2 wheels and 1 headlight on the front it will be labelled as a ‘Bike’]
Often, perfectly labelled data sets are hard to find. In such situations, the data used to train the
machine is neither labelled nor classified. Unsupervised learning is an ML technique where we don’t
supply labelled data; instead we allow the machine learning model (algorithm) to discover the
patterns on its own. The task of the machine is to group unsorted information according to
resemblances, patterns and differences without any prior training on the data.
In this kind of learning, the machine must find a hidden structure in the unlabelled data without
guidance or supervision.
Example 1
If somebody gives you a basket full of different fruits and asks you to separate them, you will probably
do it based on their colour, shape and size, right?
Unsupervised learning works in the same way. As you can see in the image:
Step 1: You provide the system with data that contains photos of different kinds of fruits and ask it to
segregate them. Remember, in the case of unsupervised learning you don’t need to provide labelled
data.
Step 2: The system will look for patterns in the data, such as shape, colour and size, and group the
fruits based on those attributes.
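Grouping without labels can be sketched with a tiny k-means clustering on a single made-up feature (fruit size); a real system would use many features such as shape and colour:

```python
# A minimal unsupervised-learning sketch: group fruits by size alone,
# with no labels, using a tiny k-means (k = 2). The sizes are invented.

def kmeans_1d(values, k=2, iters=10):
    """Cluster 1-D values into k groups; returns a list of cluster lists."""
    centres = [min(values), max(values)]          # deterministic start (k = 2)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centres[i]))
            clusters[nearest].append(v)
        # Move each centre to the mean of its cluster (keep it if empty).
        centres = [sum(c) / len(c) if c else centres[i]
                   for i, c in enumerate(clusters)]
    return clusters

sizes = [4.0, 4.2, 3.9, 9.8, 10.1, 9.5]   # small fruits vs. large fruits
small, large = kmeans_1d(sizes)
print(small)  # [4.0, 4.2, 3.9]
print(large)  # [9.8, 10.1, 9.5]
```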
Example 2
For instance, suppose the machine is given an image containing both dogs and cats, which it has not
seen before. Logically, the machine has no idea about the physical characteristics of dogs and cats, so
it cannot name the animals. But it can surely categorize them according to their similarities, patterns
and differences: the picture can easily be split into two groups, one containing all the dogs and the
other containing all the cats. Here the machine learned nothing beforehand; no training data or
examples were provided.
Let us take another example: a friend invites you to his party, where you meet a stranger. You will
classify this person using unsupervised learning (without prior knowledge), and this classification can
be on the basis of gender, age group, dressing style, educational qualification or whichever way you
prefer.
Activity 1
Let's suppose you have never seen a cricket match before and by chance watch a video on the
internet. Can you classify the players on the basis of different criteria?
Hint: [Players wearing similar outfits belong to the same team; players perform different types of
actions: batting, bowling, fielding and wicket keeping.]
In reinforcement learning, the machine is not given examples of correct input-output pairs; instead, a
method is provided for the machine to measure its performance, in the form of a reward.
Reinforcement learning methods resemble how humans and animals learn: the machine carries out
numerous activities and gets rewarded whenever it does something well.
Example 1
The goal of the robot is to get the reward (diamond) and to avoid the
hurdles (fire). The robot learns by trying all the possible paths and then
chooses the path which reaches the reward while encountering the
least hurdles. Each correct step will bring the robot closer to the
diamond while accumulating some points and each wrong step will
push the robot away from the diamond and will take away some of the
accumulated points. The reward (diamond) will be assigned to the robot when it reaches the final stage of the
game.
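The robot-and-diamond idea above can be sketched with Q-learning on a tiny corridor world. The layout, reward values and learning rate below are invented for illustration:

```python
# A minimal reinforcement-learning sketch: Q-learning on a 5-cell corridor.
# The robot starts at cell 0; the diamond (reward +10) is at cell 4.
# Each step costs -1, nudging the robot toward the shortest path.
import random

random.seed(0)
N_CELLS, GOAL = 5, 4
ACTIONS = [-1, +1]                       # move left or right
Q = {(s, a): 0.0 for s in range(N_CELLS) for a in ACTIONS}

def step(state, action):
    nxt = min(max(state + action, 0), N_CELLS - 1)
    reward = 10 if nxt == GOAL else -1
    return nxt, reward

for _ in range(500):                     # training episodes
    state = 0
    while state != GOAL:
        action = random.choice(ACTIONS)  # explore by trying random moves
        nxt, reward = step(state, action)
        best_next = max(Q[(nxt, a)] for a in ACTIONS)
        # Q-learning update: nudge the estimate toward reward + future value
        Q[(state, action)] += 0.5 * (reward + 0.9 * best_next - Q[(state, action)])
        state = nxt

# The greedy policy now heads straight for the diamond.
state, path = 0, [0]
while state != GOAL:
    action = max(ACTIONS, key=lambda a: Q[(state, a)])
    state, _ = step(state, action)
    path.append(state)
print(path)  # [0, 1, 2, 3, 4]
```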
Example 2
Imagine a small kid is given access to a laptop at home (environment). In simple terms, the kid (agent)
will first observe and try to understand the laptop environment (state). Then the curious kid will take
certain actions, like hitting some random buttons (action), and observe how the laptop responds
(next state).
When the non-responding laptop screen goes dull, the kid dislikes it (receiving a negative reward) and
probably won’t repeat the actions that led to such a result (updating the policy), and vice versa. The
kid will repeat the process until he/she finds a button which turns the laptop screen bright (reward),
and will be happy maximizing the total reward.
Activity 1
Question 1: Can you find two real-world applications of Supervised Learning?
Question 2: Can you write down two real-world applications of Unsupervised Learning?
Question 3: What kind of learning algorithm do you think works behind a computer chess engine?
Deep Learning is inspired from human brain and the neurons in the human brain. Therefore, in order to
understand Deep Learning, we will first need to know about ‘neurons’.
A small child learns to distinguish between a school bus and a regular transit bus. How?
How do we easily differentiate between our pet dog and a street dog?
The answer is that we have a vast biological neural network connecting the neurons of our nervous
system. Our brain is a very complex network comprising about 10 billion neurons, each connected to
10 thousand other neurons.
So, before we try to understand Deep Learning, let us understand Neural Network (Artificial Neural
Network i.e. ANN). In short, Deep Learning consists of artificial neural networks designed on similar
networks present in the human brain. The idea of ANN in Deep Learning is based on the belief that
human brain works by making the right connections, and this pattern can be imitated using silicon and
wires in place of living neurons.
Output Node: This is the final stage where the computations conclude, and data is made
available to the output layer from where it gets transferred back into the real-world
environment.
Example
A school has to select students for their upcoming sports meet. The school principal forms a group of
three teachers (a selection jury) and entrusts them with the responsibility of selection of students based
on the following criteria:
The school has a history of a fair selection procedure, and therefore only talented and bonafide
students are able to secure a place in the sports team. In order to continue the same standard and
selection procedure, the principal decides to share with the jury the cases of about 50 students from
previous selections to study. The principal feels this will give the jury an opportunity to practice, which
will eventually help them make a fair selection.
(A reminder: this whole exercise is being performed on the previous batch of students, and its purpose
is to sharpen the decision-making accuracy of the jury for the upcoming selection.)
Every jury member is given a maximum of 10 points (weight) on which they rate a student. They
need to distribute the 10 points across the four criteria of marks, gender, age and emotional
stability.
The cut-off average required for a student to qualify is fixed at ‘6’. So, a student needs to have
an average score of ≥ 6 to reserve his/her spot in the sports team.
After the jury gives their verdict on a particular student (using the above four criteria), the
principal will reveal whether their verdict of "Selected" or "Not Selected" matches the original
selection outcome.
Once the ground rules have been set, the jury enters a room to deliberate on the candidates and starts
the decision-making process. Here is a peek into the jury's conversation:
Teacher 1: For me ‘Grade X Marks’ is most important and I am assigning this criterion the most weight
and other criteria are not important. Accordingly, I’m giving a score of ‘7 points’ to Student#1.
Teacher 2: I think differently…’Marks’ are important, however I am also considering ‘Gender’ and ‘Age’
and I’m assigning each of the three criteria equal weight. So I’m scoring Student # 1 ‘2 points for Marks’,
‘2 points for Gender’ and ‘2 points for Age’.
Teacher 3: For me only ‘Gender’ and ‘Emotional Stability’ count and I’m assigning equal weightage to
both these criteria. Accordingly, I will score Student # 1 with ‘5 points for Gender’ and ‘5 points for
Emotional Stability’.
Based on the above deliberation, let us take a look at how the jury members have scored Student # 1:
Criteria              Teacher 1   Teacher 2   Teacher 3
Grade X Marks             7           2           0
Gender                    0           2           5
Age                       0           2           0
Emotional Stability       0           0           5
Total                     7           6          10
As per the selection rule (cut-off ≥ 6), Student # 1 should have ideally qualified, but the principal reveals that this
student actually did not make the team as per the original decision.
Let us take a look at Student # 2 now… Please see below the jury discussion and deliberation for this candidate
and how they now begin to adjust their scoring based on learning from Student # 1.
Teacher 1: It seems I'm attaching too much weight to just ‘Marks’, so I'm leaning towards giving some
weightage to ‘Age’ as well.
Teacher 2: I feel I’m assigning too much weight to ‘Gender’, I’m going to consider splitting it between
‘Gender’ and ‘Emotional Stability’.
Teacher 3: I now feel in addition to ‘Gender’ and ‘Emotional Stability’ some weightage needs to be given
to ‘Age’ as well.
Based on the above deliberations, let us take a look at the score table for Student # 2:
Criteria              Teacher 1   Teacher 2   Teacher 3
Grade X Marks             3           2           0
Gender                    0           1           1
Age                       3           2           3
Emotional Stability       0           1           1
Total                     6           6           5
As per the selection rule (cut-off ≥ 6), Student # 2 will not qualify, and the principal reveals that indeed
this student did not make the team in the original decision either.
In this fashion the jury proceeds to evaluate student after student, and in doing so a pattern emerges
of the right ‘weightage’ for each criterion (per jury member) that yields the highest number of correct
predictions. This whole process of learning and developing accuracy is essentially how an Artificial
Neural Network (ANN) works.
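The jury can be sketched as a tiny network: each teacher is a hidden node holding one weight per criterion, and the output node averages their scores against the cut-off of 6. The weights below are the Student #1 scores from the deliberation above; treating each criterion value as a full score of 1.0 is an assumption for illustration:

```python
# A sketch of the jury as a tiny neural network: each teacher (hidden
# node) holds a weight per criterion; the output node averages the
# teachers' scores and applies the cut-off of 6.

CRITERIA = ["marks", "gender", "age", "emotional_stability"]

def teacher_score(weights, student):
    """One hidden node: weighted sum of the student's criterion values."""
    return sum(weights[c] * student[c] for c in CRITERIA)

def jury_decision(jury, student, cutoff=6.0):
    """Output node: average the teachers' scores, compare to the cut-off."""
    avg = sum(teacher_score(w, student) for w in jury) / len(jury)
    return "Selected" if avg >= cutoff else "Not Selected"

# Teacher weights for Student #1 (each row distributes the 10 points).
jury = [
    {"marks": 7, "gender": 0, "age": 0, "emotional_stability": 0},  # Teacher 1
    {"marks": 2, "gender": 2, "age": 2, "emotional_stability": 0},  # Teacher 2
    {"marks": 0, "gender": 5, "age": 0, "emotional_stability": 5},  # Teacher 3
]

# A hypothetical student who scores full value on every criterion.
student = {c: 1.0 for c in CRITERIA}
print(jury_decision(jury, student))  # Selected
```

The teachers' scores come out as 7, 6 and 10, averaging above the cut-off, which matches the jury's verdict for Student #1; training the network corresponds to the teachers adjusting their weights after each reveal.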
The decision/prediction is what we call the ‘Output Layer’ in a neural network. In this case, ‘Selected’ and
‘Not Selected’ is the output layer. It should be noted that it can either be a continuous outcome
(regression, as in a number like 3.14 or 42) or categorical outcome (true/false, yes/no, selected/not
selected etc.)
The jurors (the group of teachers) form the ‘Hidden Layer’. It is called ‘hidden’ because no one besides
them knows how much weightage they are attaching to each criterion (or input). To the input and
output neurons, the hidden layer is a ‘black box’ that simply listens and jointly decides an outcome.
Deep learning is a branch of machine learning which is completely based on artificial neural networks;
as the neural network mimics the human brain, deep learning is also a kind of imitation of the human
brain. It is important to know that in deep learning, we do not need to explicitly program everything.
Let us now understand the difference between Machine Learning and Deep Learning:
MACHINE LEARNING                                    DEEP LEARNING
Divides the task into sub-tasks, solves them        Solves the problem end to end.
individually and finally combines the results.
Takes less time to train.                           Takes longer time to train.
Automated Driving: Automotive researchers are using deep learning to automatically detect objects
such as stop signs and traffic lights. In addition, deep learning is used to detect pedestrians, reducing
the incidence of accidents.
Aerospace and Defence: Identifying objects from satellites and locating safe and unsafe zones for
troops is another area where Deep Learning is playing a major role.
Medical Research: Deep Learning is used by cancer researchers to automatically detect cancer
cells.
Industrial Automation: Deep learning is helping to improve worker safety around heavy
machinery by automatically detecting when people or objects are within an unsafe distance from
the machines.
picture, notices the unique features, and then matches the same with the people in your friends
list.
5. Email spam and malware filtering - Incoming emails are screened against known spam patterns.
Mail filtering manages received mail, detecting and removing messages holding malicious code such
as viruses, Trojans or other malware.
6. Product recommendations - You often receive emails from similar merchandizers after you have
shopped online for a product. The recommended products are either similar to your purchase or
match your taste, which definitely refines the shopping experience. Did you know that it is Machine
Learning working its magic in the background?
7. Online Fraud Detection - Machine learning is lending its potential to make cyberspace a secure
place by tracking monetary frauds online. For example, PayPal uses ML for protection against money
laundering. Even with the advancements we have made in ML over the years, there are instances
where a Grade 2 student has been able to beat a computer by solving a problem faster.
1. Any problem or question that requires social context will take longer for a machine to solve
2. Particularly with respect to text analytics, there are two main challenges. First is “Ambiguity” -
the same word can mean many things. Second is “Variability” - the same thing can be said in
many different ways.
3. Machine learning can’t solve ethical problems. If a self-driving car kills someone on the road,
whose fault is it?
7. Jobs in AI
Can you guess the jobs depicted in the pictures below:
Picture – 1 Picture -2
Picture – 3 Picture -4
The jobs depicted in the pictures above were common professions not long ago, maybe 20-30 years back.
Many jobs that existed a few decades ago are redundant in today’s age. Similarly, there are jobs which
were unheard of 30 years ago but are very popular now.
Activity
1. Can you prepare a list of 10 jobs which existed in the 80s but are no longer relevant now?
2. Can you prepare a list of 5 jobs which did not exist in the 80s but are popular now?
3. Can you imagine 5 jobs / professions that do not exist now but may be popular in 2035?
The World Economic Forum predicts that by 2022 AI and ML will displace 75 million jobs worldwide but
generate 133 million new ones. A Gartner report claims that in 2020 Artificial Intelligence will
create 2.3 million jobs while eliminating 1.8 million.
Job losses due to Artificial Intelligence are a largely baseless fear: AI will NOT take over the
employment market – as simple as that. It will merely introduce a paradigm shift, similar to the one
that followed the Industrial Revolution. Consequently, while many professions will become obsolete and
disappear, some occupations will become much more popular, with new ones emerging along the way. It’s
important to keep two things in mind:
1. Acquiring basic tech-related skills is not something you will live to regret
2. Understanding what is happening in the field of AI may help you gain a significant career
advantage, either by investing time and money into learning a new skill or by leveraging your
existing knowledge to solve relevant AI-related problems.
Jobs which will grow with the help of AI
1) Creative Jobs
Artists, doctors and scientists are just a few of the professionals whose work can be labelled creative.
This category of jobs is only going to be refined and advanced by the use of AI.
The number of such professionals required may not increase, but AI will make certain parts of these jobs
less complex for humans, so it will become easier in the future to learn the skill in less time and
flourish.
2) Management Jobs
Management jobs cannot be handed over to artificial managers; human managers will have to manage the
artificial ones. Managing is a very complex task that involves a deep understanding of people and
communication. There are already a few smart tools which help managers become more effective at their
job. So, if you’re interested in this kind of job, you can learn to use them and gain some advantage in
the field.
3) Tech Jobs
Programmers, data scientists and others who work on the creation and maintenance of AI systems hold the
jobs of the future, and they will be very important if humanity is to take the next large step in its
evolution. These jobs too will undergo certain changes: a few of the tech jobs in demand today
may become less common, while others may become more vital.
2. Data detective
6. AI tutor
I will leave it to you to take this up as a project and define the roles and responsibilities of these
job profiles.
Activity
(This activity has been designed by MIT AI Ethics Education Curriculum. “An Ethics of Artificial Intelligence Curriculum for
Middle School Students was created by Blakeley H. Payne with support from the MIT Media Lab Personal Robots Group,
directed by Cynthia Breazeal.”)
Learning Objectives:
2. Know that artificial intelligence is a specific type of algorithm and has three specific parts:
dataset, learning algorithm, and prediction.
3. Recognize AI systems in everyday life and be able to reason about the prediction an AI system
makes and the potential datasets the AI system uses.
1. Print out all of the materials below these two paragraphs, with each bingo card on a separate
paper and the list of Tasks, Data Sets & Predictions on a third.
2. Pass the bingo cards around to the separate teams and keep the list of tasks/datasets/predictions
for yourself (it will serve as both the answer key and the bingo calls)
3. Along with every data set and prediction, you will see the task that it corresponds to on the Bingo
grids. Read out the data set and prediction pairs at random (but not the task itself!) and have the
students fill in the tile they think it belongs to.
4. The first of the two teams to correctly fill out five tiles in a row, diagonal, or column wins.
T: Click on an Instagram ad
D: the Instagram accounts people follow and what they buy
P: what you might buy based on who you follow
I hope you have noticed that I did not ask why we should learn how to manufacture bicycles or how to
build and program a mobile phone or calculator. Anyone who can’t operate a mobile phone or doesn’t
know how to use a calculator is probably a misfit in the society we live in today. The day is not
far away when AI tools and applications will start replacing calculators, bicycles, mobile phones and
many other gadgets that we use in our day-to-day lives.
Artificial Intelligence technologies are widely used by people today, even if they don’t realise it or
are unaware of it. In the modern world we are surrounded by a variety of AI tools that
make many aspects of our lives easier. Have you ever thought about how the music we listen to, the
products we buy from Amazon or Flipkart, and the news and information we receive are all made
available to us with the help of AI? AI is helping to compose poems, write stories, assist doctors
performing complex surgeries, and even prescribe medicines.
You must have watched a number of Sci-Fi movies, but encountering something like “The Terminator”
come to life is not going to happen anytime soon.
Activity 1:
Let us do a small exercise on how AI is perceived by our society.
Do you have any idea what people think about AI? Do an online image search for the term “AI”.
The results of your search will give you an idea of how AI is perceived.
[Note - If you are using Google search, choose "Images" instead of “All” search]
Based on the outcome of your web search, create a short story on “People’s Perception of AI”
Activity 2:
In this activity, let us explore what is being written on the web about AI.
Do you think the articles on the net are anywhere close to reality, or are they just complete science
fantasies?
Now based on your experience during the online search about ‘AI’, prepare your summary report that
answers the following questions:
1. What was the name of the article, name of the author and web link where the article was
published?
2. What was the main idea of the article? Mention in not more than 5 sentences
1.1 Chatbots
If you remember from Unit-1, AI tries to mimic human cognitive functions like vision, speech, reasoning
etc. A Chatbot is one application of AI that simulates conversation with humans through
voice commands, text chats or both. We can define a Chatbot as an AI application that can
imitate a real conversation with a user in their natural language. Chatbots enable communication via
text or audio on websites, messaging applications, mobile apps or telephones.
Some interesting application areas for Chatbots are customer service, e-commerce, sales and marketing,
schools and universities etc.
Let us take up some real-life examples. HDFC Bank’s EVA (Electronic Virtual Assistant) is making
banking services simple and is available 24x7 to HDFC Bank’s customers.
Please visit the link to get a first-hand experience of banking related conversations with EVA :
https://v1.hdfcbank.com/htdocs/common/eva/index.html
You can also visit https://watson-assistant-demo.ng.bluemix.net/ for a completely new experience on
the IBM Web based text Chatbot.
Open the link in your web browser and you will land on the demo page of IBM’s banking virtual
assistant. In this demo, you will engage with a banking virtual assistant capable of
simulating a few scenarios, such as making a credit card payment, booking an appointment with a
banker or choosing a credit card. Watson understands your entries and responds accordingly.
The other online Chatbot platform, where you can get basic hands-on experience using your Gmail
account, is Google Dialogflow. You can visit https://dialogflow.com/ to experience the
Chatbot.
We all know the threat the Corona virus poses to humanity. Most of us want to assess our own risk level
at this time, but just for a risk assessment it is not advisable to go out to see a doctor or visit a
hospital. Amidst the current paranoia surrounding COVID-19, Apollo Hospitals (and many other hospitals
and medical companies) released a Chatbot to assess one’s risk level, another example of leveraging AI
in a time of need.
URL: https://covid.apollo247.com/
Types of Chatbots
Chatbots can broadly be divided into two types:
1. Rule based Chatbot
2. Machine Learning (or AI) based Chatbot
A rule-based Chatbot is used for simple conversations and cannot handle complex ones.
For example, let us create the rule below and train the Chatbot:
bot.train ( [
    'How are you?',
    'I am good.',
    'Thank you',
])
After training, if you ask the Chatbot, “How are you?”, you will get the response, “I am good.” But if
you ask ‘What’s going on?’ or ‘What’s up?’, the rule-based Chatbot won’t be able to answer, since it has
been trained to handle only a certain set of questions.
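The limitation described above can be sketched in a few lines of Python. This is a hypothetical minimal bot built on a plain lookup table (not the actual library used in the example above): any question outside the table gets a fallback reply.

```python
# A minimal rule-based chatbot: responses come from a fixed lookup table.
# Any question outside the table gets a fallback answer.
rules = {
    "how are you?": "I am good.",
    "thank you": "You're welcome.",
}

def rule_based_reply(message):
    # Normalise the input, then look for an exact rule match.
    return rules.get(message.strip().lower(), "Sorry, I don't understand.")

print(rule_based_reply("How are you?"))  # -> I am good.
print(rule_based_reply("What's up?"))    # -> Sorry, I don't understand.
```

Because matching is exact, any rephrasing of the question falls straight through to the fallback, which is precisely the weakness the text describes.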
Quick Question?
Question: Does a rule-based bot’s ability to answer questions depend on how good its rules are and how
extensive its database is?
Answer: We can have a rule-based bot that understands the value that we supply. However, the
limitation is that it won’t understand the intent and context of the user’s conversation with it.
For example: if you are booking a flight to Paris, you might say “Book a flight to Paris”, someone else
might say “I need a flight to Paris”, while someone from another part of the world may use his/her native
language. An AI-based or NLP-based bot identifies the language, context and intent and then reacts
accordingly. A rule-based bot only understands a pre-defined set of options.
2. Machine Learning (or AI) based Chatbot
Such Chatbots are advanced forms of chatter-bots capable of holding complex conversations in real
time. They process questions (using neural network layers) before responding to them. AI-based
Chatbots also learn from previous experience and reinforcement learning, so they keep evolving.
AI Chatbots are developed and programmed to meet user requests by furnishing suitable and relevant
responses. The challenge, nevertheless, lies in matching each request to the most intelligent and
closest response that would satisfy the user.
Had the rule-based Chatbot discussed earlier been an AI Chatbot and you had asked ‘What’s going on?’
or ‘What’s up?’ instead of ‘How are you?’, you would still have got a suitable response.
The IBM Watson Chatbot mentioned earlier - https://watson-assistant-mo.ng.bluemix.net/ - is in fact
an AI Chatbot.
source: https://dzone.com/articles/how-to-make-a-chatbot-with-artificial-intelligence
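By contrast, the intent matching that an AI/NLP Chatbot performs can be illustrated with a toy sketch. The word-overlap scoring and the intent names below are invented for illustration; real systems use trained language models rather than simple overlap counts.

```python
# Toy intent matcher: score each intent by how many words it shares
# with the user's input, then reply for the best-scoring intent.
intents = {
    "greeting": {"how", "are", "you", "what's", "up", "going", "on"},
    "book_flight": {"book", "need", "flight", "to"},
}
responses = {
    "greeting": "I am good.",
    "book_flight": "Which date would you like to fly?",
}

def detect_intent(message):
    words = set(message.lower().replace("?", "").split())
    # Pick the intent sharing the most words with the input.
    return max(intents, key=lambda name: len(words & intents[name]))

print(responses[detect_intent("What's up?")])                # -> I am good.
print(responses[detect_intent("I need a flight to Paris")])  # -> Which date would you like to fly?
```

Notice that ‘What’s up?’ and ‘How are you?’ both land on the same intent even though their wording differs, which a pure rule table cannot do.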
Source: https://www.techsophy.com/chatbots-need-natural-language-processing/
Natural language means the language of humans. It refers to the different forms in which humans
communicate with each other – verbal, written, or non-verbal expressions (sentiments such as sad,
happy, etc.). The technology that enables machines (software) to understand and process the
natural language of humans is called natural language processing (NLP). In other words, it is defined
as the branch of Artificial Intelligence that deals with the interaction between computers and humans
using natural language. The main objective of NLP is to read, interpret, comprehend, and coherently
make sense of human language in a way that creates value for all. Therefore, we can safely conclude
that NLP essentially comprises natural language understanding (human to machine) and natural language
generation (machine to human).
NLP is thus a sub-area of Artificial Intelligence dealing with the capability of software to process
and analyse human language, both spoken and written.
Natural language processing has found applications in various fields, listed as follows.
1. Text Recognition (in an image or a video)
You might have seen or heard about cameras that read vehicles’ number plates.
The camera captures an image of the number plate, the image is passed to the neural network layers of
the application, and the application extracts the vehicle’s number from the image. Correct extraction
of the data, however, also depends on the quality of the image.
Quick Question?
Question: Where and how is NLP used?
Answer: NLP (natural language processing) can be used in scenarios where static or predefined
answers, options and questions may not work. In fact, if you want to understand the intent and
context of the user, it is advisable to use NLP.
Let’s take the example of a pizza-ordering bot. When it comes to pre-listed pizza-topping options, you
can consider using a rule-based bot, whereas if you want to understand the intent of the user, where
one person says “I am hungry” and another says “I am starving”, it makes more sense to use NLP, in
which case the bot can understand the emotion of the user and what he/she is trying to convey.
Activity:
Do you think the application of NLP, can completely remove the language barriers? Can you write the
algorithm / steps to show how NLP will perform the language translation?
_________________________________________________________________________________
2. Summarization by NLP
NLP can not only read and understand paragraphs or a full article but can also summarize the article
into a shorter narrative without changing its meaning; it can create an abstract of the entire article.
Summarization takes place in two ways – one in which key phrases are extracted from the document and
combined to form a summary (extraction-based summarization), and the other in which the content of the
source document is condensed and rephrased (abstraction-based summarization).
https://techinsight.com.vn/language/en/text-summarization-in-machine-learning/
https://blog.floydhub.com/gentle-introduction-to-text-summarization-in-machine-learning/
Example 1
To understand the summarization process, let us take an example from real life: our judiciary. Lawyers
and judges have to read through large volumes of documents just to develop an understanding of a single
case. NLP can be leveraged here by assigning it the task of reading the case files and creating a short
abstract of each one. Judges and lawyers then read the summarized files (prepared by the NLP AI),
saving precious time that can be used to expedite the resolution of pending cases.
Example 2
Let us take one more example:
Source text: Joseph and Mary rode on a horse drawn carriage to attend the annual fair in London. In
the city, Mary bought a new dress for herself.
Extractive summary: Joseph and Mary attend annual fair in London. Mary bought new dress.
Have you noticed that in this case the extractive summary has been formed by joining key phrases
picked directly from the source text?
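A toy extraction-based summarizer can be sketched as scoring each sentence by the frequency of its words and keeping the top-scoring sentence. This is only an illustration of the idea; real summarizers use much richer features.

```python
from collections import Counter

def extractive_summary(text, n=1):
    # Split into crude sentences and count word frequencies over the whole text.
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    freq = Counter(text.lower().split())
    # Score each sentence by the total frequency of its words; keep the top n.
    scored = sorted(sentences,
                    key=lambda s: sum(freq[w] for w in s.lower().split()),
                    reverse=True)
    return ". ".join(scored[:n]) + "."

text = ("Joseph and Mary rode on a horse drawn carriage to attend the annual fair in London. "
        "In the city, Mary bought a new dress for herself.")
print(extractive_summary(text))
```

On this input the first sentence scores higher and is kept as the one-sentence summary; unlike the hand-made summary above, a frequency count cannot drop words inside a sentence, which is why real extractive systems work at the phrase level.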
3. Information Extraction
Information extraction is the technique of finding specific information in a document, or of searching
for the document itself. It automatically extracts structured information, such as entities,
relationships between entities, and attributes describing entities, from unstructured sources.
Example
For example, a school’s Principal writes an email to all the teachers in his school -
“I have decided to organize a teachers’ meet tomorrow. You all are requested to please assemble in my
office at 2.30 pm. I will share the agenda just before the meeting.”
NLP can extract the meaningful information for teachers:
What: Meeting called by Principal
When: Tomorrow at 2.30 pm
Where: Principal’s office
Agenda: Will be shared before the meeting
Another very common example is Search Engines like Google retrieving results using Information
Extraction.
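The extraction of What/When/Where from the Principal's email above can be mimicked with a few hand-written patterns. The regular expressions below are illustrative stand-ins for the trained models a real information-extraction system would use.

```python
import re

email = ("I have decided to organize a teachers' meet tomorrow. You all are requested "
         "to please assemble in my office at 2.30 pm. I will share the agenda just "
         "before the meeting.")

# Tiny hand-written patterns standing in for a learned extractor.
when_day  = re.search(r"\b(today|tomorrow)\b", email, re.I)
when_time = re.search(r"\b\d{1,2}\.\d{2}\s*(?:am|pm)\b", email, re.I)
where     = re.search(r"\bin my office\b", email, re.I)

print("When :", when_day.group(), "at", when_time.group())   # When : tomorrow at 2.30 pm
print("Where:", "Principal's office" if where else "unknown")
```

Hand-written patterns break as soon as the wording changes, which is why production systems learn extractors from labelled data instead.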
4. Speech processing
The ability of a computer to hear human speech and analyse and understand the content is called speech
processing. When we talk to our devices like Alexa or Siri, they recognize what we are saying to them.
For example:
You: Alexa, what is the date today?
Alexa: It is 18 March 2020.
What happens when we speak to our device? The device’s microphone hears our audio and plots a graph of
the sound frequencies. Just as light has a standard frequency for each colour, so does sound: every
sound (phoneme) has a unique frequency graph. This is how NLP recognizes each sound and composes an
individual’s words and sentences.
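The idea that every sound has a characteristic frequency can be made concrete by estimating the frequency of a synthetic tone from its samples. Counting zero crossings, as below, is a crude stand-in for the spectral analysis real speech systems perform; the tone and sample rate are invented for illustration.

```python
import math

def estimate_frequency(samples, sample_rate):
    # Count sign changes; each full cycle of a sine wave crosses zero twice.
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a * b < 0)
    duration = len(samples) / sample_rate
    return crossings / (2 * duration)

# One second of a 440 Hz tone sampled at 8000 Hz.
rate = 8000
tone = [math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]
print(round(estimate_frequency(tone, rate)))  # close to 440
```

A speech system does the same kind of analysis far more finely, separating the mixture of frequencies in each slice of audio before mapping them to phonemes.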
The table below shows the step-wise process of how speech recognition works:
Step 3: An inbuilt algorithm converts amplitude to frequency and then plots a graph.
Step 4: The plotted graph is converted to equivalent text, which is further analysed by NLP.
Activity
Have you noticed that virtual assistants like Alexa and Google Assistant work very well when they are
connected to the internet, and cannot function when they are offline?
Can you do a web search to find out the reason for the same? Please summarize your
findings/understanding below:
2. CoreNLP (https://stanfordnlp.github.io/CoreNLP/)
3. OpenNLP (https://opennlp.apache.org/) – an open-source NLP toolkit for data analysis and sentiment analysis
4. spaCy (https://spacy.io/) – for data extraction, data analysis, sentiment analysis and text summarization
Object detection
Optical Character Recognition
Fingerprint Recognition
Source: https://towardsdatascience.com/everything-you-ever-wanted-to-know-about-computer-vision-heres-a-look-why-it-s-so-
awesome-e8a58dfb641e & https://nordicapis.com/8-biometrics-apis-at-your-fingertips/
How many of you have seen video surveillance cameras installed in schools, shopping malls or other
public places? Do you know what it does? What is the purpose behind installing these cameras? I am
sure your answer would be – safety and security.
Quick Question?
Q1: List three places/locations (like the library, playground etc.) in your school where surveillance
cameras have been installed.
Q2: In your opinion, who does the actual surveillance – the camera or the person sitting behind the
device? Justify your answer.
Q3: What features would you like to add to a surveillance camera to make it a truly smart surveillance
camera?
From the above exercise you must have understood that although a camera, or any similar device, can
capture a picture of a particular moment, it can’t really analyse or make sense of it. The device (a
camera in this case) has only the limited capacity to capture an image of objects; it is not able to
recognize them. Taking pictures is not the same as seeing, recognising or understanding.
Watch Video 1 & 2 for which links have been given below and answer the related questions:
Video 1: https://www.youtube.com/watch?v=GN7RKRFtZiQ
Can you explain what you saw in this video?
Video 2: https://www.youtube.com/watch?v=1_B44HO_PAI
After watching this video, do you still believe that a device (camera in this case) can only take pictures
of objects but can’t recognize them?
Before we dive deep into the interesting world of computer vision, let us perform an online activity:
Activity:
Open the URL in your web browser – https://studio.code.org/s/oceans/stage/1/puzzle/1
‘AI for Oceans’ by Code.org is designed to quickly introduce students to machine learning, a branch
of artificial intelligence based on the idea that machines can recognise patterns, make sense of
data and make decisions with very little human involvement. Over the years, machine learning and
computer vision have come to work very closely together, and machine learning has greatly improved
the effectiveness of computer vision. Students will explore how training data is used to enable a
machine to see and classify objects.
Activity Plan
There are 8 levels, and you will be allowed not more than an hour to complete all eight. It is not
mandatory for students to complete all eight levels to understand the concept of computer vision,
although it is recommended that they do.
Activity Outcome
Gaining a basic overview of AI and computer vision.
Activity details
Level 1:
The first level introduces a branch of AI called Machine Learning (ML) and gives examples of ML in our
daily life, such as email filters, voice recognition, auto-complete text and computer vision.
Level 2 – 4:
Students can proceed through the first four levels on their own or with a partner. In order to program
the AI, use the buttons to label each image as either "Fish" or "Not Fish". Each image and its label
become part of the data used to train the Computer Vision model. Once the AI model is properly trained,
it will classify an image as ‘Fish’ or ‘Not Fish’. Fish will be allowed to live in the ocean, but ‘Not
Fish’ objects will be grabbed and stored in a box, to be taken out of the ocean or river later.
At the end of Level 4, we should explain to students how the AI identifies objects using the training
data; in this case the labelled fish images are our training data. This ability of AI to see and
identify an object is called ‘Computer Vision’. Just as humans can see an object and recognise it, an
AI application (using a camera) can also see and recognise an object.
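The labelling loop in the activity can be mimicked in code: each labelled example becomes a (features, label) pair, and a toy nearest-neighbour rule then classifies new objects. The numeric features below are entirely made up for illustration; the real activity learns from image pixels.

```python
# Hypothetical features for each labelled example: (fin_score, straight_edge_score),
# each scored 0-10 by the person doing the labelling.
training_data = [
    ((9, 1), "Fish"), ((8, 2), "Fish"), ((7, 0), "Fish"),
    ((1, 9), "Not Fish"), ((0, 8), "Not Fish"), ((2, 7), "Not Fish"),
]

def classify(features):
    # Nearest neighbour: copy the label of the closest training example.
    def dist(example):
        (f, e), _ = example
        return (f - features[0]) ** 2 + (e - features[1]) ** 2
    return min(training_data, key=dist)[1]

print(classify((8, 1)))  # -> Fish
print(classify((1, 8)))  # -> Not Fish
```

The more labelled examples we supply, the better the boundary between ‘Fish’ and ‘Not Fish’ becomes, which is exactly why the activity asks students to label many images.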
Advanced learners may go ahead with level 5 and beyond.
Quick Question?
Based on the activity you just did, answer the following questions:
Q1: Can we say ‘computer vision’ acts as the eye of a machine / robot?
Q2: In the above activity, why did we have to supply many examples of fish before the AI model actually
started recognizing the fish?
Q3: Can a regular camera see things?
Q4: Let me test your imagination: you have been tasked with designing and developing a prototype of a
robot to clean your nearby rivers and ponds. Can you use your imagination and write 5 lines about the
features of this ‘future’ robot?
With the supporting materials provided in the section above, you must have developed a reasonably
good understanding of Computer Vision. Let us reinforce the learning by quickly going over it again in
brief.
Computer vision is a sub-set of Artificial Intelligence which enables the machines (robot/ any other
device with camera) to see and understand the digital images – photographs or videos. Computer Vision
has made it possible to make good use of the critical capabilities of AI by giving machines the power of
vision. Computer vision enables machines/ robots to inspect objects and accomplish certain tasks
making them useful for both homes and offices.
(Source: https://www.bitbrain.com/sites/default/files/styles/blog_1200x500/public/robot-maquina-inteligencia-artificial-con-emociones-
sentimientos.png?itok=S_50rjPm)
Looking at the picture, the human eye can easily tell that a train engine has crashed through the
station wall (https://en.wikipedia.org/wiki/Montparnasse_derailment). Do you think the computer also
views the image the same way as humans? No, the computer sees an image as a 2-dimensional matrix of
numbers (or a 3-dimensional matrix in the case of a colour image).
The above image is a grayscale image, which means each value in the 2D matrix represents the
brightness of a pixel. The numbers in the matrix range from 0 to 255, where 0 represents black,
255 represents white, and the values in between are shades of grey.
For example, the above image has been represented by a 9x9 matrix: 9 pixels horizontally and 9 pixels
vertically, a total of 81 pixels (this is a very low pixel count for an image captured by a modern
camera; treat this just as an example).
In a grayscale image, each value represents the brightness or darkness of a pixel, which means the
grayscale image is composed of only one channel. A colour image, however, is a mix of the three primary
colours (Red, Green and Blue), so a colour image has three channels.
Since colour images have three channels, computers see a colour image as a 3-dimensional matrix. If we
had to represent the above locomotive image in colour, the 3D matrix would be 9x9x3. Each pixel in this
colour image would have three numbers (each ranging from 0 to 255) associated with it, representing the
intensity of red, green and blue in that particular pixel.
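This matrix view of an image can be made concrete with tiny made-up pixel values (not the actual locomotive image):

```python
# A 3x3 grayscale image: one 2D matrix, values from 0 (black) to 255 (white).
gray = [
    [  0, 128, 255],
    [ 64, 200,  32],
    [255,   0, 180],
]
print(gray[1][2])  # brightness of the pixel in row 1, column 2 -> 32

# The same idea in colour: each pixel holds three values (R, G, B).
colour = [[[255, 0, 0], [0, 255, 0], [0, 0, 255]]]  # one row: red, green, blue pixels
r, g, b = colour[0][2]
print(r, g, b)  # -> 0 0 255 (a pure blue pixel)
```

Everything a computer vision model does starts from grids of numbers like these; the model never ‘sees’ a train, only the values.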
(Source : http://cs231n.stanford.edu/slides/2017/cs231n_2017_lecture11.pdf)
1. Semantic Segmentation
Semantic segmentation is closely related to image classification. Image classification is a process in
Computer Vision where an image is assigned a class depending on its visual content. Basically, a set of
classes (objects to identify in images) is defined, and a model is trained to recognize them with the
help of labelled example photos. In simple terms, the model takes an image as input and outputs a class
(cat, dog etc.), or a set of class probabilities of which one has the highest chance of being correct.
For humans this ability comes naturally and effortlessly, but for machines it is a fairly complicated
process.
For example, the cat image shown below has a size of 248x400x3 pixels (297,600 numbers).
(Source : http://cs231n.github.io/assets/classify.png)
The image classification model takes this image as input and reports 4 possibilities, and we can see
that it assigns the highest probability to ‘cat’. The classification model can be refined and trained
further to produce the single label ‘cat’ as output.
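The ‘probability of classes’ idea can be sketched with the softmax function, which turns a model's raw scores into probabilities that sum to 1. The labels and scores below are invented for illustration:

```python
import math

def softmax(scores):
    # Exponentiate (shifted by the max for numerical stability)
    # and normalise so the outputs sum to 1.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

labels = ["cat", "dog", "hat", "mug"]
scores = [4.0, 1.5, 0.5, 0.2]   # hypothetical raw outputs of a classifier
probs = softmax(scores)
best = labels[probs.index(max(probs))]
print(best)  # -> cat (the class with the highest probability)
```

Reporting the whole probability list, rather than just the winner, is what lets us see how confident the model is in ‘cat’ versus the alternatives.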
Source: https://medium.com/analytics-vidhya/image-classification-vs-object-detection-vs-image-segmentation-f36db85fe81
3. Object Detection
When human beings see a video or an image, they immediately identify the objects present in it. This
intelligence can be duplicated using a computer. If there are multiple objects in the image, the
algorithm will identify all of them and localise each one (put a bounding box around it). You will
therefore have multiple bounding boxes and labels in the image.
Source: https://pjreddie.com/darknet/yolov1/
4. Instance segmentation
Instance segmentation is the Computer Vision technique that identifies and distinctly outlines each
object of interest appearing in an image. The process creates a pixel-wise mask for each object in the
image and gives us a far more granular understanding of the object(s) in it. As you can see in the
image below, even objects belonging to the same class are shown in different colours.
Source: https://towardsdatascience.com/detection-and-segmentation-through-convnets-47aa42de27ea
Activity
Given a grayscale image, one simple way to find edges is to look at two neighbouring pixels and take the
difference between their values. If it’s big, this means the colours are very different, so it’s an edge.
The grids below are filled with numbers that represent a grayscale image. See if you can detect edges
the way a computer would do it.
Try it yourself!
Grid 1
If the values of two neighbouring squares on the grid differ by more than 50, draw a thick line between
them.
Grid 2
If the values of two neighbouring squares on the grid differ by more than 40, draw a thick line between
them.
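The neighbour-difference rule used in the grids can also be written out in code. The small grid and the threshold of 50 below are made up, playing the same role as the grids and thresholds above:

```python
def edge_pairs(grid, threshold):
    # Return coordinates (row, col) where a cell and its right-hand
    # neighbour differ by more than the threshold: an 'edge' between them.
    edges = []
    for r, row in enumerate(grid):
        for c in range(len(row) - 1):
            if abs(row[c] - row[c + 1]) > threshold:
                edges.append((r, c))
    return edges

grid = [
    [10, 10, 200],
    [12, 11, 210],
]
print(edge_pairs(grid, 50))  # -> [(0, 1), (1, 1)]
```

A full edge detector would also compare vertical neighbours, but the idea is identical: a big jump in brightness between adjacent pixels marks an edge.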
Story -1: Fani, a rare summer cyclone in the Bay of Bengal, hit eastern India on May 03,2019. It is one
of the strongest cyclones to have hit India in the last 20 years, according to the Indian government’s
meteorological department. Storm surges and powerful winds reaching 125 mph blew off roofs,
damaged power lines, and uprooted countless trees.
But the worst-affected state, Odisha, has been successful in keeping the loss of life and numbers of
affected people to a minimum. This is the result of a very effective strategy of disaster preparation and
effective weather forecasting.
(Source: https://qz.com/india/1618717/indias-handling-of-cyclone-fani-has-a-lesson-for-the-us/ )
Story-2: KOLKATA | NEW DELHI: Laxman Vishwanath Wadale, a 40-year-old farmer from Maharashtra's
Jalna district, spent nearly Rs 25,000 on fertilisers and seeds for his 60-acre plot after the Indian
Meteorological Department (IMD) said in June that it stands by its earlier prediction of normal monsoon.
Today, like lakhs of farmers, Wadale helplessly stares at parched fields and is furious with the weather
office that got it wrong - once again. So far, rainfall has been 22% below normal if you include the
torrential rains in the northeast while Punjab and Haryana are being baked in one of the driest summers
ever with rainfall 42% below normal.
( Source : https://economictimes.indiatimes.com/news/economy/agriculture/indian-meteorological-departments-high-failure-rate-
prompts-states-to-set-up-their-own-systems/articleshow/15133072.cms)
Can you imagine how weather forecasting impacts people’s lives? Accurate weather forecasting allows
farmers to better plan for harvesting. It allows airlines to fly their passengers with safety. Electricity
departments can make decisions about their capacity needs during summer and winters. As we saw in
story-1, it allows governments to better prepare their responses to natural disasters that impact the
lives of millions.
Project 1: Make a report on the tools used for weather forecasting. Your report should not be more than
a page long.
Weather forecasting involves gathering satellite data, identifying patterns in the observations made, and
then computing the results to arrive at accurate weather predictions. Scientists are now using AI for
weather forecasting to obtain refined and accurate results, fast!
In the current model of weather forecasting, scientists gather satellite data (temperature, wind,
humidity, etc.) and compare and analyse it against a mathematical model based on past weather patterns
and the geography of the region in question. This is done in real time to prevent disasters.
Being primarily human dependent (the mathematical model cannot be adjusted in real time), this
approach faces many challenges in forecasting.
On the other hand, Artificial Intelligence (AI) uses computer-generated mathematical programs and
computer vision technology to identify patterns and make relevant weather predictions. This has
resulted in scientists preferring AI for weather forecasting. One of the key advantages of the AI based
model is that it adjusts itself with the dynamics of atmospheric changes.
The image below will help you to understand how companies are leveraging Computer vision in weather
prediction.
Leading IT companies have been doing their intensive research by leveraging technologies like AI, IoT,
and Big Data:
1. IBM Global High-resolution Atmospheric Forecasting System (IBM GRAF) is a high-precision global
weather model that updates hourly to provide a clearer picture of weather activity around the globe
(https://www.ibm.com/weather)
2. Panasonic has been working on its weather forecasting model for years. The company makes
TAMDAR, a speciality weather sensor installed on commercial airplanes.
In the commodity market, production is normally local but consumption is global. A price forecast is
therefore beneficial for farmers, policymakers and industries. Commodities such as agricultural products
are more exposed to weather, demand and price risks than other products, and because of these
vulnerabilities small farmers often resort to distress sales. Farmers suffer both when crops fail and when
there is a bumper harvest. A reliable price forecasting tool would allow producers to make informed
decisions to manage their price risk.
There are reliable forecasting techniques (mathematical and statistical models) currently in use, but they
all work at the macro level, i.e. the national or state level. Switching from classical (mathematical)
methods to a machine learning model would bring multiple benefits; for example, machine learning can
predict at the micro level, i.e. for an individual market (mandi).
An AI-based commodity forecasting system can produce transformative results:
1. Its accuracy is much higher than that of classical forecasting models.
2. AI can work on a broad range of data, which lets it reveal new insights.
3. An AI model is not rigid like a classical model, so its forecasts are always based on the most recent
inputs.
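The micro-level forecasting idea can be sketched with a toy model. The monthly mandi prices below are made up for illustration, and a simple linear trend stands in for a real machine learning model:

```python
# A toy sketch of price forecasting for a single hypothetical mandi (market).
# The prices are invented; a real system would learn from far richer data.
import numpy as np

months = np.arange(1, 7)                                  # months 1..6
prices = np.array([20.0, 21.5, 23.0, 24.5, 26.0, 27.5])   # hypothetical INR/kg

# Fit a simple linear trend (a stand-in for a trained model).
slope, intercept = np.polyfit(months, prices, 1)

# Forecast the price for month 7 from the fitted trend.
forecast_month7 = slope * 7 + intercept
print(round(forecast_month7, 2))  # → 29.0
```

Because the toy prices rise by a steady 1.5 per month, the fitted trend extrapolates cleanly; real market data would be noisier and need a more capable model.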
Wikipedia defines the self-driving car as: “A self-driving car, also known as an autonomous vehicle (AV),
driverless car, robot car, or robotic car, is a vehicle that is capable of sensing its environment and moving
safely with little or no human input.”
Self-driving cars combine a variety of sensors to perceive their surroundings, such
as radar, lidar, sonar, GPS, odometry and inertial measurement units.
_________________________________________________________________________________
Question 2: In the age of self-driving cars, do we need zebra crossings? After all, self-driving cars can
potentially make it safe to cross a road anywhere.
_________________________________________________________________________________
Having attempted the activities, let us now understand how self-driving cars work.
Self-driving cars or autonomous cars, work on a combination of technologies. Here is a brief introduction
to them:
1. Computer Vision: Computer vision allows the car to see/sense its surroundings. It uses:
‘Camera’ – captures pictures of the surroundings, which are then passed to a deep learning model
for processing. This helps the car know when a light is red, where there is a zebra crossing, etc.
‘Radar’ – a detection system to find out how far away or close other vehicles on the road are.
‘Lidar’ – a sensing method that measures the distance to a target by emitting laser rays. The lidar
is usually placed in a spinning unit on top of the car so that it can spin around very fast, scanning
the environment all around. Here you can see a Lidar placed on top of the Google car.
2. Deep Learning: This is the brain of the car, which takes driving decisions based on the information
gathered through various sources such as computer vision.
3. Robotics: The self-driving car has a brain and vision, but its brain still needs to connect with the
other parts of the car to control and navigate effectively. Robotics helps transmit the driving
decisions (made by deep learning) to the steering, brakes, throttle, etc.
4. Navigation: Using GPS, stored maps, etc., the car navigates busy roads and hurdles to reach
its destination.
Artificial Intelligence is autonomous – it can make independent decisions without human input,
interference or intervention, and works silently in the background without the user’s knowledge.
These systems do not depend on explicit human programming; instead they learn on their own
from data.
AI has the capacity to predict and adapt – its ability to understand data patterns is used for
future predictions and decision-making.
AI is continuously learning – it learns from data patterns.
AI is reactive – it perceives a problem and acts on that perception.
AI is futuristic – this cutting-edge technology is expected to be used in many more fields in the future.
There are many applications and tools, backed by AI, which have a direct impact on our daily life.
So, it is important for us to understand, in a broad sense, what kinds of systems can be developed
using AI.
Activity
You have to design a recommendation system for your school library. What kind of information or data
would you like to collect to help train your recommendation system? Mention two such data points.
[Hint: the type of books (science fiction, thriller, etc.) students borrow, their comments, etc.]
(Source:https://cdn0.tnwcdn.com/wp-content/blogs.dir/1/files/2017/11/Screen-Shot-2017-11-24-at-13.17.49-796x399.png)
Activity
Does this picture remind you of something? Can you please share your thoughts about this image in the
context of AI?
--------------------------------------------------------------------------------------------------------------------------------
The dream of making machines that think and act like humans is not new. We have long tried to create
intelligent machines to ease our work. These machines perform at greater speed, have
higher operational ability and accuracy, and are more capable of undertaking highly
tedious and monotonous jobs than humans.
Humans do not always depend on pre-fed data, as AI does. Human memory, its computing power,
and the human body as an entity may seem insignificant compared to a machine’s hardware and
software infrastructure. But the depth and layers present in our brains are far more complex and
sophisticated, and machines still cannot match them, at least in the near future.
The AI machines we see these days are not true AI. These machines are very good at performing
specific types of jobs.
Think of Jarvis in “Iron Man” and you’ll get a sneak peek of Artificial General Intelligence (AGI), which
is nothing but human-like AI. We still have a long way to go in attaining human-like AI, though.
Source: https://www.researchgate.net/figure/AI-and-human-non-automated-decision-Source-Koeszegi-2019-referring-to-
Agrawal_fig2_333720228
3. Cognitive computing improves human decision making. 4. Cognitive computing tries to mimic the
human brain.
Examples of cognitive computing software: IBM Watson, DeepMind, Microsoft Cognitive Services, etc.
In summary, Cognitive Computing can be defined as a technology platform that is built on AI and signal
processing, to mimic the functioning of a human brain (speech, vision, reasoning etc.) and help humans
in decision making.
Now that we have understood what Cognitive Computing is, let us explore the need for the same.
Enormous amounts of unstructured data and information (Facebook pages, Twitter posts, WhatsApp
data, sensor data, traffic data, traffic signal data, medical reports, and so on) are available to us in this
digital age. We need an advanced technology to make sense of this data (traditional computing cannot
process such volumes) and help humans take better decisions. Because Cognitive Computing generates
new knowledge using existing knowledge, it is viewed as a platform holding the potential of future
computing.
4. AI and Society
AI is surely changing the world, but at the same time there is a lot of hype and there are many
misconceptions about it. For citizens, businesses and governments to take full advantage of AI, it is
imperative that we have a realistic view of it.
AI will impact almost every walk of society: health, security, culture, education, jobs and businesses.
As with any change, AI has both positive and negative influences on society, and which prevails depends
on how we leverage it.
1. Healthcare
IBM Watson (an AI tool by IBM) can predict the development of a particular form of cancer up to 12
months before its onset with almost 90% accuracy.
(https://www.beckershospitalreview.com/artificial-intelligence/ibm-ai-predicts-breast-cancer-up-to-a-
year-in-advance-using-health-records-mammograms.html)
There are many such developments happening in the field of medical science. To control the outbreak
of the coronavirus, China leaned on Artificial Intelligence (AI) and data science to track cases and fight
the pandemic. Our healthcare sector is moving towards a future where robots and AI tools will work
alongside doctors.
Scientists and researchers are working hard to find opportunities to apply AI technology in almost all
sectors, such as transportation, education and agriculture, but healthcare has been the focal point for
AI. Can you find two reasons why the impact of AI is greatest in the healthcare sector?
2. Transportation
Transportation is a field where artificial intelligence along with machine learning has given us major
innovations.
Autonomous vehicles like cars, trucks etc. use advanced AI capabilities that offer features like lane-
changing systems, automated vehicle guidance, automated braking, use of sensors and cameras for
collision avoidance, and analysing information in real time, thus saving human lives by reducing road
accidents.
3. Disaster Prediction
AI is considered one of the best tools for the prediction of natural occurrences. There is an AI model that
can almost perfectly predict the weather for the next couple of days, which was unimaginable before
the advent of AI.
4. Agriculture
Farming is a sector that faces multiple challenges, such as unpredictable weather, limited availability of
natural resources, and growing populations. With the help of AI, farmers can now analyse a variety of
factors in real time, such as weather conditions, temperature, water usage and soil conditions, collected
from their farms. Real-time data analytics helps farmers maximize their crop yields and, in turn, their
profits too.
Having discussed the advantages of AI, it is important to also understand how AI can negatively affect
our society. With all of AI’s benefits come some significant disadvantages as well, but that is natural
for any technology!
Listed below are some of the challenges posed by AI:
1. Integrity of AI
AI systems learn by analysing huge volumes of data. What are the consequences of using training data
that is biased in favour of a particular class or section of customers or users?
In 2016, the professional networking site LinkedIn was discovered to have a gender bias in its system.
When a search was made for the female name ‘Andrea’, the platform would show
recommendations/results for male users named ‘Andrew’ and its variations. However, the site did not
show similar recommendations for male names: a search for the name ‘Andrew’ did not prompt users
to ask if they meant ‘Andrea’. The company said this was due to a gender bias in their training data,
which they later fixed.
2. Technological Unemployment
Due to heavy automation (with the advent of AI and robotics), some sets of people will lose their jobs,
which will be taken over by intelligent machines. There will be significant changes in the workforce
and the market: some high-skilled jobs will be created, while some roles and jobs will become obsolete.
3. Data Monopoly
Data is the fuel of AI: the more data you have, the more intelligent a machine you can develop.
Technology giants are investing heavily in AI and data acquisition projects. This gives them an unfair
advantage over their smaller competitors.
4. Privacy
In this digitally connected world, privacy is becoming next to impossible. Numerous consumer products,
from smart home appliances to computer applications, have features that make them vulnerable to
data exploitation by AI. AI can be used to identify, track and monitor individuals across multiple devices,
whether they are at work, at home, or in a public location. To complicate things further, AI does not
forget anything. Once AI knows you, it knows you forever!
Route: X → Cross the road → Final Destination
In neural network terminology, the activity of ‘crossing the road’ is termed a neuron. So, input ‘X’ goes
into a single neuron, ‘Crossing the road’, to produce the output/goal, which in this case is the
‘Final Destination’. In this example the start and the end are connected by a straight line. This is an
example of a simple neural network.
The life of a delivery man is not so simple and straightforward in reality. He starts from a particular
location and goes to different locations across the city. For instance, the delivery man could choose
multiple paths to fulfil his/her deliveries, as shown in the image below:
However, some routes (or “paths”) are better than the rest. Let us assume that all paths take the same
time, but some are really bumpy while others are smooth. Maybe the path chosen in ‘Option 3’ is
bumpy, so the delivery man has to burn a lot more fuel on the way (a loss)! Whereas ‘Option 2’ is
perfectly smooth, so the delivery man does not lose anything!
In this case, the delivery man would definitely prefer ‘Option 2’ (Address 1 – Destination 2 – Final
Destination), as he/she will not want to burn extra fuel! So the goal of Deep Learning here is to assign
‘weights’ to each path from address to destination so that the delivery man finds the optimal path.
How does this work?
As we discussed, the goal of the deep learning exercise is to assign weights to each path from start to
finish (start – address – destination – Final Destination). To do this for the example discussed above,
each time a delivery man goes from start to finish, the fuel consumption for every path is computed.
Based on this parameter (in reality the number of parameters can go up to 100 or more), a cost for each
path is calculated, and this cost is called the ‘Loss Function’ in deep learning.
As we saw above in our example, ‘Option 3’ (Address 3 – Destination 3 – Final Destination) lost a lot of
fuel, so that path has a large loss function. ‘Option 2’ (Address 1 – Destination 2 – Final Destination)
cost the delivery man the least fuel and therefore has a small loss function, making it the most efficient
route!
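The loss-function idea can be sketched in a few lines of Python. The fuel figures below are invented for illustration; the best path is simply the one with the smallest loss:

```python
# A minimal sketch of the delivery-man example: treat fuel burnt per trip as
# the "loss" of each candidate path and pick the path with the smallest loss.
# The litre values are made up for illustration.
fuel_burnt = {
    "Option 1": 7.0,   # hypothetical litres per trip
    "Option 2": 4.0,   # the smooth path
    "Option 3": 9.5,   # the bumpy path
}

# Choosing the path that minimises the loss.
best_path = min(fuel_burnt, key=fuel_burnt.get)
print(best_path)  # → Option 2
```

A real deep learning system does the same thing at scale: it adjusts weights so that the computed loss over many examples becomes as small as possible.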
The above picture is a representation of a neural network of the most efficient path to be taken by the
deliveryman to reach the goal i.e. to go from starting point ‘X’ to ‘Final Destination’ using the best
possible route (which is most fuel efficient). This is a very small neural network consisting of 3 neurons
and 2 layers (address and destination).
In reality, neural networks are not as simple as the example discussed above. They may look like the
one shown below!
(Source - http://www.rsipvision.com/exploring-deep-learning/)
The term ‘deep’ in Deep Learning refers to the many layers you will find in a neural network. This
closely relates to how our brains work. The neural network shown above is a network of 3 hidden layers
(hidden layer 1, hidden layer 2 and hidden layer 3), with each layer having 9 neurons.
Let us have a quick quiz now!
Question 1: The size of an image is represented as 600 x 300 x 3. What would be the type of the
image?
a) Jpeg Image
b) Grayscale image
c) Colour Image
d) Large image
Question 2: Which of the following have people traditionally done better than computers?
a) Recognizing relative importance
b) Detecting Emotions
c) Resolving Ambiguity
d) All of the above
Question 3: You have been asked to design an AI application which will prepare the ‘minutes of the
meeting’. Which part of AI will you use to develop your solution -
a) Computer Vision
b) Python Programming
c) Chatbot
d) Natural Language Processing (NLP)
1. In this exercise, you will learn about the three components of an artificial intelligence (AI) system
2. You will then learn about the role of training data in an AI system
1. Go to: https://teachablemachine.withgoogle.com/
5. Click on ‘Upload’ in Class 1/Dogs and select ‘Choose images from your files or, drag and drop here’ and
then upload all the dog images from the dataset from here
8. Test your model with a sample image downloaded from the web
Activity 2
Click on https://teachablemachine.withgoogle.com/v1/
1. Identify the three parts of an AI system in the teachable machine – Input, Learning, Output
2. Follow the tutorial
Hit refresh. This time click “skip the tutorial.” Train the same classifier with your face and hands. What
happens when:
2.What happens when you increase the number of images in your dataset? Make sure both classes
have at least ten images.
3. If you’ve mainly been training with one hand up, try using the other hand. What happens when
your test dataset is different from your training dataset?
Image Datasets
Three different datasets include:
Initial Training Dataset: These are the images students should use to “teach” their machine learning
model which image is a cat and which image is a dog. Note that there are many more cats, and that the
cats are more diverse in appearance than the dogs. This means that the classifier will classify cats more
accurately than dogs.

Test Dataset: These are the images that students should use to test their classifier after training.
Students should show these images to their model and record whether the classifier predicts that the
image is of a dog or a cat. Note: students should not use these images to teach their classifier. If an
image is used to train a classifier, the machine will have already recorded the corresponding label for
that image; showing this image to the machine during the testing phase will not measure how well the
model generalizes.

Recurating Dataset: This is a large assortment of images students can use to make their training dataset
of cats and dogs larger and more diverse.
The test dataset should be used twice, once for testing students’ initial classifier and again for testing their
recurated dataset.
Learning Outcomes:
1. By the end of this unit, students are expected to have foundation level
understanding of Linear Algebra, Statistics, various kinds of graphs to visualize
data and set theory.
2. Students should be in a position to relate real world problems with these
mathematical concepts.
3. Students should be curious enough to explore deeper concepts of the
application aspects of mathematics.
Pre-requisites: Knowledge of Grade X Mathematics
Key Concepts: Matrices, Statistics, Set theory, Data representations
1. Introduction to Matrices
We all know that computers understand only numbers (binary, hexadecimal, etc.). How, then, do you
think devices (computers, mobile phones, digital cameras, etc.) store images?
Let us capture the image of a pet dog using the mobile camera
But for your mobile, the above image is like a grid as written below:
The grid above is a matrix – which is what we are going to learn about now!
Matrices (and linear algebra more broadly) are often called the mathematics of data. Linear algebra is
arguably a pillar of the study of Artificial Intelligence, and this topic is therefore advised as a
prerequisite to getting started with the study of Artificial Intelligence.
1.1 Matrix
When we represent a set of numbers in the form of m horizontal lines (called rows) and n
vertical lines (called columns), the arrangement is called an m x n (read “m by n”) matrix.
If A= | 1 2 3|
|4 5 6|
|7 8 9|
The top row is row 1. The leftmost column is column 1. This matrix is a 3x3 matrix because it
has three rows and three columns. In describing matrices, the format is:
rows X columns
Each number that makes up a matrix is called an element of the matrix. The elements in a matrix
have specific locations.
The upper left corner of the matrix is [row 1 x column 1]. In the above matrix the element at row
1 column 1 is the value 1. The element at [row 2 x column 3] is the value 6.
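If you want to experiment, the same matrix can be held as a nested list in Python. Note that Python indexes from 0, so “row 1, column 1” in the text becomes index [0][0] in code:

```python
# The 3x3 matrix A from the text, stored as a nested Python list.
# Python uses 0-based indexing: row 1, column 1 is A[0][0].
A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

print(A[0][0])  # row 1, column 1 → 1
print(A[1][2])  # row 2, column 3 → 6
```

You can use the same indexing to check your answers to the quick questions that follow.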
Quick Question
Question 1: What is the location of value 8?
Question 2: What is the value at location row 3 x column 2?
Activity 1
Mohan purchased 3 Math books, 2 Physics books and 3 Chemistry books. Sohan purchased 8
Math books, 7 Physics books and 4 Chemistry books.
________________________________________________________________________
Activity 2
What do you see when you look at the above image? A colorful pattern - easy guess!
Can you think of how to represent it so that a computer can also understand or process it?
I know you are too young to solve this; all I want to see is your approach.
If you thought that matrix representation is the solution, you got it right!
NOTE: You were able to identify the pattern because the human brain has gone through millions of
years of evolution. We have somehow trained our brains to perform this task automatically, but making
a computer do the same task is not easy. Before we work on identifying attributes in an image, let us
understand: how does a machine store an image?
You know that computers are designed to process only numbers. So how can an image like the one
above, with multiple attributes such as color, height and width, be stored in a computer? This is
achieved by storing the pixel intensities in a construct called a matrix. This matrix can then be
processed to identify colors and other attributes.
So any operation which you want to perform on this image would likely use matrices at the back
end.
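As a tiny illustration, here is a hypothetical 3 x 3 grayscale “image” stored as a matrix of pixel intensities (0 = black, 255 = white); a colour image would use three such matrices, one each for the red, green and blue channels:

```python
# A made-up 3x3 grayscale image: each entry is a pixel intensity (0-255).
image = [
    [  0, 128, 255],
    [ 64, 192,  32],
    [255,   0, 128],
]

# Any image operation works on this matrix; for example, inverting the image
# (making dark pixels light and light pixels dark):
inverted = [[255 - pixel for pixel in row] for row in image]
print(inverted[0])  # → [255, 127, 0]
```

Real photographs are simply much larger versions of the same idea, with millions of such entries.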
1. Row Matrix: A matrix having a single row.
A = [ 1 3 -5 ]
2. Column Matrix: A matrix having a single column.
A = |  1 |
    |  3 |
    | -5 |
3. Square Matrix: A matrix in which number of rows are equal to number of columns.
A= | 1 2 3|
|4 5 6|
|7 8 9|
4. Diagonal Matrix: A matrix with all elements zero except its leading diagonal.
A= | 2 0 0|
| 0 3 0|
| 0 0 4|
5. Scalar Matrix: A matrix in which all the diagonal elements are equal and all other
elements are zero.
A= |5 0 0|
|0 5 0|
| 0 0 5|
If all the diagonal elements are unity (1) and all other elements are zero, the matrix is called a
unit (identity) matrix.
A= |1 0 0|
|0 1 0|
|0 0 1|
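For readers who want to try these out, the special matrices above can be built with NumPy (assuming it is installed):

```python
# Building the diagonal, scalar and unit (identity) matrices from the text.
import numpy as np

diagonal = np.diag([2, 3, 4])   # diagonal matrix: 2, 3, 4 on the diagonal
scalar   = 5 * np.eye(3)        # scalar matrix: the same value (5) on the diagonal
unit     = np.eye(3)            # unit (identity) matrix

print(unit.astype(int).tolist())  # → [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```

`np.diag` and `np.eye` are convenient shortcuts; you could equally write the nested lists out by hand.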
6. Transpose: The transpose of a matrix A, written AT, is obtained by interchanging its rows and
columns. For example, if
A = | 1 2 |
    | 3 4 |
    | 5 6 |
then
AT = | 1 3 5 |
     | 2 4 6 |
Inverse
For matrices there is no such thing as division: you can add, subtract or multiply matrices, but you
cannot divide them. There is, however, a related concept called “inversion”.
Matrix inversion is a process that finds another matrix which, when multiplied with the original matrix,
results in an identity matrix. Given a matrix A, find a matrix B such that
AB = BA = In
Calculating the inverse of a matrix is slightly complicated, so let us use an inverse matrix calculator
- https://matrix.reshish.com/inverCalculation.php
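If you prefer code to the online calculator, NumPy can compute inverses too. The 2 x 2 matrix below is just an example chosen to be invertible:

```python
# Verifying the defining property AB = BA = I with NumPy, on a small
# invertible example matrix (chosen here for illustration; det = 4, not 0).
import numpy as np

A = np.array([[2.0, 4.0],
              [3.0, 8.0]])
B = np.linalg.inv(A)  # the inverse of A

print(np.allclose(A @ B, np.eye(2)))  # → True
print(np.allclose(B @ A, np.eye(2)))  # → True
```

`np.allclose` is used instead of exact equality because floating-point arithmetic introduces tiny rounding errors.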
2. Determinant
Every square matrix can be assigned a number known as its determinant.
If A = [aij] is a square matrix of order n, then the determinant of A is denoted by det A or |A|.
To find the value of a determinant we can expand it along any row or column.
Example 1: If A = | 2 4 |
                  | 3 8 |
then |A| = 2 x 8 – 4 x 3
         = 16 – 12
         = 4
Example 2: A = | 6  1  1 |
               | 4 -2  5 |
               | 2  8  7 |
Expanding along the first row:
|A| = 6(-2 x 7 - 5 x 8) - 1(4 x 7 - 5 x 2) + 1(4 x 8 - (-2) x 2)
    = 6(-54) - 18 + 36
    = -306
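A quick way to check a determinant computed by hand is NumPy (assuming it is available):

```python
# Checking the 3x3 determinant from the text with NumPy.
import numpy as np

A = np.array([[6,  1, 1],
              [4, -2, 5],
              [2,  8, 7]])

# np.linalg.det returns a float; round it for a clean integer answer.
print(round(np.linalg.det(A)))  # → -306
```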
There are two more matrix operations, trace and rank, which students are advised to explore
themselves.
1.3. Vector and Vector Arithmetic
Vectors are the foundation of linear algebra. Vectors are used throughout the field of machine
learning in the description of algorithms and processes, such as the target variable (y) when
training an algorithm.
We begin by defining a vector: a set of n numbers, which we shall write in the form
    | x1 |
    | x2 |
x = | x3 |
    | ...|
    | xn |
This object is called a column vector. Vectors are often represented using a lowercase character
such as “v”; for example, v = (v1, v2, v3), where v1, v2, v3 are scalar values, often real values.
For instance, in the popular machine learning example of housing price prediction, we might
have features (table columns) including a house's year of construction, number of bedrooms,
area (m^2), and size of garage (car capacity). Two houses would then give the input vectors
x1 = [ 1988 4 200 2 ]
x2 = [ 2001 3 220 1 ]
1.3.1. Vector Arithmetic
1. Vector Addition
Vectors of equal length can be added to create a new vector:
x = y + z
The new vector has the same length as the other two, and each element is the sum of the elements
at the same indices:
x = (y1 + z1, y2 + z2, y3 + z3)
2. Vector Subtraction
A vector can be subtracted from another vector of equal length to create a new third vector:
x = y - z
As with addition, the new vector has the same length as the parent vectors, and each element
of the new vector is calculated as the subtraction of the elements at the same indices:
x = (y1 - z1, y2 - z2, y3 - z3)
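The element-wise addition and subtraction rules above can be tried directly in NumPy, on example vectors:

```python
# Element-wise vector addition and subtraction on two example vectors.
import numpy as np

y = np.array([1, 2, 3])
z = np.array([4, 5, 6])

print((y + z).tolist())  # → [5, 7, 9]
print((y - z).tolist())  # → [-3, -3, -3]
```

NumPy applies the operation index by index, exactly as in the formulas above.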
3. Vector Multiplication
If we perform a scalar multiplication, there is only one type of operation: multiply the scalar by a
scalar and obtain a scalar result,
a x b = c
Vectors are a different story: there are two different kinds of multiplication, one in which the
result of the product is a scalar, and another in which the result is a vector. (There is a third kind
that gives a tensor result, but that is out of scope for now.)
To begin, let’s represent vectors as column vectors. We’ll define the vectors A and B as the
column vectors
    | Ax |         | Bx |
A = | Ay |     B = | By |
    | Az |         | Bz |
We’ll now see how the two types of vector multiplication are defined in terms of these column
vectors and the rules of matrix arithmetic.
Physical quantities are of two types:
Scalar: has only magnitude, no direction.
Vector: has both magnitude and direction.
The first type of vector multiplication is called the dot product, written A.B. The vector dot
product, the multiplication of one vector by another, gives a scalar result.
[Where do we use it in AI? This operation is used in machine learning to calculate weighted sums.
Please refer to “weight” in Unit 2: Deep Learning.]
If i = unit vector along the direction of x -axis
j = unit vector along the direction of y -axis
k = unit vector along the direction of z -axis
Vector Dot Product
If there are 2 vectors, vector a = a1i + a2j + a3k
And vector b = b1i + b2j + b3k
Their dot product a.b = a1b1 + a2b2 + a3b3
Practice Sum -1: Calculate the dot product of c = (−4, −9) and d = (−1,2).
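You can check your dot-product arithmetic with NumPy. The pair below is a different example, so the practice sum is left for you:

```python
# The dot-product formula a.b = a1*b1 + a2*b2 + a3*b3 on an example pair.
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, -5, 6])

print(int(np.dot(a, b)))  # → 12  (1*4 + 2*(-5) + 3*6)
```

Reuse the same two lines with c and d to verify your answer to Practice Sum 1.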
1.4. Matrix and Matrix Arithmetic
Matrices are a foundational element of linear algebra. Matrices are used in machine learning to
process the input data variables when training a model.
Matrix Addition: If A and B are two matrices of order m x n (m rows and n columns), then their sum
A + B is a matrix of order m x n, obtained by adding the corresponding elements of A and B.
If A = | 12  1 |   and   B = |  8  9 |
       |  3 -5 |             | -1  4 |
then A + B = | 20 10 |
             |  2 -1 |
Scalar Multiplication: Let A = [aij] be an m x n matrix and K be any number, called a scalar. The
matrix obtained by multiplying every element of A by K is denoted KA.
If A = | 12  1 |
       |  3 -5 |   and K = 2,
then KA = | 24   2 |
          |  6 -10 |
Two matrices of the same size can also be multiplied element by element; this is often called
element-wise matrix multiplication.
Matrix Multiplication: Two matrices A and B can be multiplied (for the product AB) if the number
of columns in A (the pre-multiplier) is the same as the number of rows in B (the post-multiplier).
If A = [aij]mxn and B = [bij]nxp, the product AB is of order m x p.
A = | 2 -3  4 |   and   B = |  2  5 |
    | 3  6 -1 |             | -1  0 |
                            |  4 -2 |
A is a 2 x 3 matrix while B is a 3 x 2 matrix; the number of rows of B equals the number of columns
of A, so they meet the condition for matrix multiplication.
AB = | 4+3+16   10+0-8 |  =  | 23  2 |
     | 6-6-4    15+0+2 |     | -4 17 |
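The worked product above can be verified with NumPy:

```python
# Verifying the 2x3 by 3x2 matrix product from the text.
import numpy as np

A = np.array([[2, -3,  4],
              [3,  6, -1]])
B = np.array([[ 2,  5],
              [-1,  0],
              [ 4, -2]])

print((A @ B).tolist())  # → [[23, 2], [-4, 17]]
```

The `@` operator performs matrix multiplication: each entry is the dot product of a row of A with a column of B.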
Activity 1:
Three people denoted by P1, P2, P3 intend to buy some rolls, buns, cakes and bread. Each of
them needs these commodities in different amounts and can buy them in two shops S1, S2.
Which shop is the best for each person P1, P2, P3 to pay as little as possible? The individual
prices and desired quantities of the commodities are given in the following tables:
Let us solve this the matrix way. Let P be the 3 x 4 quantity (demand) matrix from the first table,
and let Q be the 4 x 2 price matrix:
Q = | 1.50   1    |
    | 2      2.50 |
    | 5      4.50 |
    | 16     17   |   (the price matrix)
Then
R = PQ = | 50     49    |
         | 58.50  61    |
         | 43.50  43.50 |
The element r11 expresses the amount spent by the person P1 in the shop S1, and r12 the amount
spent in the shop S2; likewise for persons P2 and P3. Hence, it is optimal for the person P1 to buy
in the shop S2, for the person P2 in S1, and the person P3 will pay the same price in S1 as in S2.
Activity 2
Share Market Portfolios
A has INR 1000 worth stock of Apple, INR 1000 worth of Google and INR 1000 worth of Microsoft.
B has INR 500 of Apple, INR 2000 of Google and INR 500 of Microsoft.
Suppose a news broke and Apple jumps 20%, Google drops 5%, and Microsoft stays the same.
What is the updated portfolio of A and B and net profit /loss from the event?
The original stock price matrix looks like:
  apple         | 1 0 0 |
  google        | 0 1 0 |
  Microsoft     | 0 0 1 |
  profit (+/-)  | 0 0 0 |
89
After the news broke, the updated stock price matrix is:
| apple        |   | 1.2    0     0 |
| google       |   | 0      0.95  0 |
| Microsoft    |   | 0      0     1 |
| profit (+/-) |   | +0.20  -0.05 0 |
Now let’s feed in the portfolios for A (INR 1000, 1000, 1000) and B (INR 500, 2000, 500). We
can crunch the numbers by hand.
Results (columns are A and B):
| apple        | 1200    600 |
| google       |  950   1900 |
| Microsoft    | 1000    500 |
| profit (+/-) |  150      0 |
(For A, a gain of 200 on Apple and a loss of 50 on Google give a net profit of 150; for B, a
gain of 100 and a loss of 100 cancel out.)
The key is understanding why we set up the matrix like this, not blindly crunching numbers.
This is the algorithm that runs behind any electronic spreadsheet (e.g. MS Excel) when you do
what-if analysis.
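The same numbers can be crunched in plain Python. A sketch of the computation; the `apply_update` helper is an illustrative name, not from the text:

```python
# Rows: updated multipliers for apple, google and microsoft, plus the profit row.
update = [
    [1.20, 0.00, 0.00],   # apple jumps 20%
    [0.00, 0.95, 0.00],   # google drops 5%
    [0.00, 0.00, 1.00],   # microsoft stays the same
    [0.20, -0.05, 0.00],  # net profit/loss contribution
]

portfolio_A = [1000, 1000, 1000]  # INR held in apple, google, microsoft
portfolio_B = [500, 2000, 500]

def apply_update(update, holdings):
    """Multiply the 4x3 update matrix by a 3x1 holdings vector."""
    return [sum(row[i] * holdings[i] for i in range(len(holdings)))
            for row in update]

# A: apple 1200, google 950, microsoft 1000, net profit 150
print(apply_update(update, portfolio_A))
# B: apple 600, google 1900, microsoft 500, net profit 0
print(apply_update(update, portfolio_B))
```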
2. Set Theory: Introduction to Data Table Joins
Many a time, you might have heard your teachers say that mathematics is the foundation of
Computer Science, and you must have started thinking: how? Did this question ever cross your
mind?
In this module, we will try to explain the confluence of set theory, which is a branch of
mathematics, and relational databases (RDBMS), which are a part of computer science. A lot of
things are going to come together today, because we are going to learn how set theory
principles help in data retrieval from databases, which in turn is used by AI models for their
training. The important topics which we are going to cover in this strand are as below:
2.1. Context setting – Set theory and Relational Algebra
2.2. Set Operations
2.3. Data Tables Join (SQL Joins)
2.4. Practice Questions
2.1 Context Setting: Set Theory and Relational Algebra
Before we get into the actual relation between sets and databases, we first need to
understand what these terms refer to.
A Set is an unordered collection of objects, known as the elements or members of the set. That
an element 'a' belongs to a set A is written 'a ∈ A', while 'a ∉ A' denotes that a is not an
element of the set A. So a set is a mathematical concept, and the way we relate sets to other
sets is called set theory.
Set of even numbers: {..., -4, -2, 0, 2, 4, ...}
Set of odd numbers: {..., -3, -1, 1, 3, ...}
Set of prime numbers: {2, 3, 5, 7, 11, 13, 17, ...}
Set of names of grade X students: {‘A’, ‘X’, ‘B’, ‘H’, ..............}
We use databases (like Oracle, MS SQL Server, MySQL etc.) to store digital data. A database is
made up of several components, of which the table is the most important: the database stores
its data in tables. Without tables, the DBMS would not have much significance.
For example, student database and its 2 tables
Please look at the records in the 'Activities Table'. Does this information make any sense on
its own? No. But if you combine the information from the 2 tables, the 'Students Table' and
the 'Activities Table', you get meaningful information.
For example, student John Smith participated in swimming, and he must have paid $17.
The data in the tables of a database are of limited value unless the data from different tables
are combined and manipulated to generate useful information. And from here, the role of
relational algebra begins.
Relational algebra is a set of algebraic operators and rules that manipulate relational tables
to yield desired information. Relational algebra takes relations (tables) as its operands and
returns relations (tables) as its results. Relational algebra consists of eight operators:
SELECT, PROJECT, JOIN, INTERSECT, UNION, DIFFERENCE, PRODUCT and DIVIDE.
( Image Source : https://images.slideplayer.com/42/11342631/slides/slide_4.jpg)
5. UNION returns a table containing all records that appear in either or both of the specified
tables as shown in the diagram.
6. INTERSECTION returns only those rows that appear in both tables, see the diagram above.
7. DIFFERENCE returns all rows in one table that are not found in the other table; that is, it
subtracts one table from the other, as shown in the diagram above.
8. DIVIDE is typically required when you want to find entities that interact with all entities
of a set of entities of a different type.
Say, for example, you want to find a person who has an account in all the banks of a city.
The division operator is used when we have to evaluate queries which contain the keyword
'all'. Division is not supported by SQL directly. However, it can be expressed using other
operations (like CROSS JOIN, EXCEPT and IN).
2.2. Set Operations
When two or more sets are combined to form another set under the mathematical principles of
sets, the process of combining the sets is called a set operation.
To keep the process simple, let us assume two small sets:
A = {2 ,3 ,4} and B = {3,4,5}
Keeping these two sets as our example, let us perform four important set operations:
i) Union of Sets (∪)
The union of the sets A and B is the set whose elements are the distinct elements of set A, of
set B, or of both.
A U B = {2, 3, 4, 5}
ii) Intersection of Sets (∩)
The intersection of set A and set B is the set of elements belonging to both A and B.
A∩B = {3, 4}
iii) Complement of the Sets
The complement of a set A is the set of all elements of the universal set that are not in A.
Taking the universal set here to be A ∪ B = {2, 3, 4, 5}, the complement of A is {5}.
iv) Set Difference
The difference of sets, denoted by 'A – B', is the set containing the elements of set A that
are not in B, i.e. all elements of A except those that are also in B.
A – B = {2}
v) Cartesian Product
Remember the terms used when plotting a graph, the axes (x-axis, y-axis). For example, (2, 3)
depicts that the value on the x-axis is 2 and that on the y-axis is 3, which is not the same as
(3, 2).
The way of representation is fixed: the value of the x-coordinate comes first and then that of
y (an ordered pair). The Cartesian product pairs up elements, say x and y, in this ordered way.
If A and B are two non-empty sets, then the Cartesian product of set A and set B is the set of
all ordered pairs (a, b) such that a ∈ A and b ∈ B, which is denoted as A × B.
A × B = {(2,3), (2,4), (2,5), (3,3), (3,4), (3,5), (4,3), (4,4), (4,5)}
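All four operations, plus the Cartesian product, can be tried directly in Python, whose built-in set type mirrors the mathematics. A small sketch using the sets A and B from above; taking A ∪ B as the universal set is an assumption carried over from the complement example:

```python
from itertools import product

A = {2, 3, 4}
B = {3, 4, 5}
U = A | B                      # universal set assumed to be A ∪ B here

print(A | B)                   # union: {2, 3, 4, 5}
print(A & B)                   # intersection: {3, 4}
print(U - A)                   # complement of A relative to U: {5}
print(A - B)                   # set difference: {2}
print(sorted(product(A, B)))   # Cartesian product: 9 ordered pairs (a, b)
```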
(http://stackoverflow.com/questions/406294/left-join-vs-left-outer-join-in-sql-server)
In a database, information is stored in various tables. In order to retrieve a meaningful
information about an entity, all concerned tables need to be joined.
What do we mean, in fact, by joining tables? Joining tables is essentially a Cartesian product
followed by a selection criterion (did you notice the set theory operations?). The JOIN
operation also allows joining variously related records from different relations (tables).
In an inner join, only those tuples that satisfy the matching criteria are included, while the
rest are excluded. Let's study various types of Inner Joins.
2. LEFT (OUTER) JOIN
Selects records from the first (left-most) table along with matching right table records.
The left outer join keeps all tuples in the left relation. However, if no matching tuple is
found in the right relation, then the attributes of the right relation in the join result are
filled with null values.
3. RIGHT (OUTER) JOIN
Selects records from the second (right-most) table along with matching left table records.
The right outer join keeps all tuples in the right relation. However, if no matching tuple is
found in the left relation, then the attributes of the left relation in the join result are
filled with null values.
4. FULL (OUTER) JOIN
Selects all records that match either left or right table records.
In a full outer join, all tuples from both relations are included in the result, irrespective
of the matching condition.
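These joins can be tried hands-on with Python's built-in sqlite3 module. A hedged sketch with two toy tables; the table and column names below are illustrative, not taken from the Students/Activities example:

```python
import sqlite3

# In-memory database with two small, made-up tables.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE students (roll INTEGER, name TEXT)")
cur.execute("CREATE TABLE activities (roll INTEGER, activity TEXT)")
cur.executemany("INSERT INTO students VALUES (?, ?)",
                [(1, "John"), (2, "Mary"), (3, "Ravi")])
cur.executemany("INSERT INTO activities VALUES (?, ?)",
                [(1, "Swimming"), (2, "Tennis")])

# INNER JOIN: only students with a matching activity row (Ravi is dropped).
inner = cur.execute("""SELECT s.name, a.activity FROM students s
                       JOIN activities a ON s.roll = a.roll""").fetchall()
print(inner)

# LEFT OUTER JOIN: every student; missing activities become NULL (None).
left = cur.execute("""SELECT s.name, a.activity FROM students s
                      LEFT JOIN activities a ON s.roll = a.roll""").fetchall()
print(left)
conn.close()
```

Note that SQLite added RIGHT and FULL OUTER JOIN only in version 3.39, so older installations support just the inner and left forms shown here.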
Question 3: Specify whether the below statements are true or false:
i) A SQL query that calls for a FULL OUTER JOIN is merely returning the union of
two sets.
____________________________________
ii) Finding the LEFT JOIN of two tables is nothing more than finding the set
difference or the relative complement of the two tables.
_______________________________________
Question 4: Think of an entity such as students, employees or sports, and create 3 tables for
any one entity of your choice.
For example
Entity: Students
Students Table (name, roll number, age, class, address)
Marks Table (roll number, subject, marks obtained)
Bus Route table (roll number, bus number, boarding point)
The purpose of this module is not to replace the statistics that you will study as part of
Mathematics in your school, but to introduce you to statistics from the perspective of
Artificial Intelligence and Machine Learning.
3.1. Measure of Central Tendency
Statistics is the science of data: a collection of mathematical techniques that helps to
extract information from data. From the AI perspective, statistics transforms observations
into information that you can understand and share. You will learn more about statistics and
statistical methods in the next level, i.e. Level 2.
Usually, statistics deals with large datasets (the population of a country, country-wise
numbers of people infected with the corona virus, and similar datasets). For understanding and
analysis purposes, we need a data point, be it a number or a set of numbers, which can
represent the whole domain of data, and this data point is called the central tendency.
"Central tendency" is the summary of a data set in a single value that represents the entire
distribution of the data domain (or data set). The one important point I would like to
highlight here is that central tendency does not describe the individual values in the dataset,
but gives a comprehensive summary of the whole data domain.
3.1.1. Mean
In statistics, the mean (more technically the arithmetic mean or sample mean) can be
estimated from a sample of examples drawn from the domain. It is the quotient obtained by
dividing the total of the values of a variable by the total number of observations or items.
If we have n values in a data set, x1, x2, x3, ..., xn, the sample mean is
M = (x1 + x2 + x3 + ... + xn) / n
And if we need to calculate the mean of a grouped data,
M = ∑fx / n
Where M = Mean
∑ = Sum total of the scores
f = Frequency of the distribution
x = Scores
n = Total number of cases
Example 1
The set S = { 5,10,15,20,30},
Mean of set S = (5 + 10 + 15 + 20 + 30) / 5 = 80 / 5 = 16
Example 2
Calculate the mean of the following grouped data
Class     Frequency
2 - 4     3
4 - 6     4
6 - 8     2
8 - 10    1
Solution
Class     Frequency (f)    Mid value (x)    f⋅x
2 - 4     3                3                9
4 - 6     4                5                20
6 - 8     2                7                14
8 - 10    1                9                9
          n = 10                            ∑f⋅x = 52

M = ∑f⋅x / n = 52 / 10 = 5.2
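Both computations can be checked in a few lines of Python using only the standard library; the grouped mean below uses the midpoints and frequencies from the table:

```python
from statistics import mean

# Simple mean (Example 1)
S = [5, 10, 15, 20, 30]
print(mean(S))  # 16

# Grouped mean (Example 2): M = sum(f*x) / n with the table's midpoints
mids  = [3, 5, 7, 9]   # mid values of the classes 2-4, 4-6, 6-8, 8-10
freqs = [3, 4, 2, 1]   # frequencies
grouped_mean = sum(f * x for f, x in zip(freqs, mids)) / sum(freqs)
print(grouped_mean)    # 5.2
```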
When to use the mean?
1. The mean is more stable than the median and the mode, so when the measure of central
tendency with the greatest stability is wanted, the mean is used.
2. When you want to include all the scores of a distribution.
3. When you do not want your result to be affected by sampling fluctuations.
3.1.2. Median
The median is another measure of central tendency. It is the positional value of the variable
which divides the group into two equal parts, one part comprising all values greater than the
median and the other part all values smaller than the median.
The following series shows marks in mathematics of students learning AI:
17 32 35 15 21 41 32 11 10 20 27 28 30
Arranged in ascending order: 10, 11, 15, 17, 20, 21, 27, 28, 30, 32, 32, 35, 41. There are 13
values (an odd count), so the median is the middle (7th) value, i.e. 27.
Example 2
In your class, 6 students scored the following marks in the mathematics unit test; find the
median value: 11, 11, 14, 18, 20, 22
Solution
The marks are already in order: 11, 11, 14, 18, 20, 22
The total count is even, so the median is the average of the two middle numbers:
(14 + 18) / 2 = 16
For grouped data, the formula is: Median = l1 + ((N/2 − c.f.) / f) × i
where l1 = lower limit of the median class, N = total frequency, c.f. = cumulative frequency
of the class preceding the median class, f = frequency of the median class and i = width of
the class interval.
Example: Calculate the median for the following frequency distribution:
Class:              0-10    10-20    20-30    30-40    40-50
Number of workers:  22      38       46       35       20
To find the median (M), construct the cumulative frequency table:
Class     Frequency (f)    Cumulative frequency (c.f.)
0-10      22               22
10-20     38               60
20-30     46               106
30-40     35               141
40-50     20               161
          N = 161

Median class = size of the (N + 1)/2 = (161 + 1)/2 = 81st item, which falls in the class 20-30.

M = 20 + ((161/2 − 60) / 46) × 10
  = 20 + ((80.5 − 60) / 46) × 10
  = 20 + (20.5 / 46) × 10
  = 20 + 4.46
Median = 24.46
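A quick check in Python: statistics.median handles the ungrouped case, and the grouped formula above can be coded directly. A sketch using the wage data from the example:

```python
from statistics import median

# Ungrouped (Example 2): even count, average of the two middle values
print(median([11, 11, 14, 18, 20, 22]))  # 16.0

# Grouped data: Median = l1 + ((N/2 - c.f.) / f) * i
classes = [(0, 22), (10, 38), (20, 46), (30, 35), (40, 20)]  # (lower limit, frequency)
width = 10
N = sum(f for _, f in classes)        # 161
half, running = N / 2, 0              # running cumulative frequency
for l1, f in classes:
    if running + f >= half:           # this is the median class
        grouped_median = l1 + (half - running) / f * width
        break
    running += f
print(round(grouped_median, 2))       # 24.46
```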
3.1.3. Mode
Mode is another important measure of central tendency of a statistical series. It is the value
which occurs most frequently in the data series, and on a bar chart or histogram it corresponds
to the highest bar. You can, therefore, sometimes consider the mode to be the most popular
option. An example of a mode is presented below:
Mode = l + h × (f1 − f0) / (2f1 − f0 − f2)
where l = lower limit of the modal class, h = class width, f1 = frequency of the modal class,
f0 = frequency corresponding to the pre-modal class and f2 = frequency corresponding to the
post-modal class.
Example – 2: Calculate mode for the following data:
Class Interval 10-20 20-30 30-40 40-50 50-60
Frequency 3 10 15 10 2
Answer: As the frequency for class 30-40 is maximum, this class is the modal class. Classes
20-30 and 40-50 are the pre-modal and post-modal classes respectively. The mode is:
Mode = 30 + 10 × [(15 − 10) / (2×15 − 10 − 10)] = 30 + 5 = 35
There are two methods for calculating the mode of a discrete frequency series:
(i) By inspection - same as the example above.
(ii) By grouping: more than one value may command the highest frequency in the series; in such
cases, the grouping method of calculation is used.
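For ungrouped data, Python's statistics.mode returns the most frequent value, and the grouped-mode calculation from Example 2 can be coded directly. A short sketch; the ungrouped data list is made up for illustration:

```python
from statistics import mode

# Ungrouped mode: the most frequent value (hypothetical data)
print(mode([2, 3, 3, 5, 3, 7]))  # 3

# Grouped mode: Mode = l + h * (f1 - f0) / (2*f1 - f0 - f2)
l, h = 30, 10             # lower limit and width of the modal class 30-40
f1, f0, f2 = 15, 10, 10   # modal, pre-modal and post-modal frequencies
grouped_mode = l + h * (f1 - f0) / (2 * f1 - f0 - f2)
print(grouped_mode)       # 35.0
```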
When to use which measure?
Mean: a good measure of central tendency when a data set contains values that are relatively
evenly spread, with no exceptionally high or low values.
Median: a good measure of the central value when the data include exceptionally high or low
values. The median is the most suitable measure of average for data classified on an ordinal
scale.
Mode: used when you need to find the distribution peak, and there may be many peaks. For
example, it is important to print more of the most popular books, because printing different
books in equal numbers would cause a shortage of some books and an oversupply of others.
3.2. Variance and Standard Deviation
Measures of central tendency (mean, median and mode) provide the central value of a data set.
Measures of dispersion (such as quartiles, percentiles and ranges) provide information on the
spread of the data around the centre.
In this section we will look at two such measures of dispersion: variance and standard
deviation.
Let us understand these two using a diagram:
Let us measure the height (at the shoulder) of 5 dogs (in millimetres)
As you can see, their heights are: 600mm, 470mm, 170mm, 430mm and 300mm.
Let us calculate their mean,
Mean = (600 + 470 + 170 + 430 + 300) / 5
= 1970 / 5
= 394 mm
Now let us plot again after taking mean height (The green Line)
Now, let us find the deviation of each dog's height from the mean height.
Calculate the differences from the mean height, square them, and find their average. This
average is the variance.
Variance = [(206)² + (76)² + (−224)² + (36)² + (−94)²] / 5
         = 108520 / 5
         = 21704
And standard deviation is the square root of the variance.
Standard deviation = √21704 = 147.32
The example above should have given you a clear idea about variance and standard deviation.
So, to summarize: the variance is the mean of the squares of the differences between each
number and the overall mean.
To calculate the variance, first calculate the deviation of each data point from the mean and
square each result.
Say there is a data range: 2, 4, 4, 4, 5, 5, 7, 9
Mean = (2 + 4 + 4 + 4 + 5 + 5 + 7 + 9) / 8 = 40 / 8 = 5
Then the sum of the squares of the differences between each number and the mean is
9 + 1 + 1 + 1 + 0 + 0 + 4 + 16 = 32
Variance = 32 / 8 = 4
The standard deviation is the square root of the variance: √4 = 2. It is a measure of the
extent to which data values are spread around the mean.
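Both examples can be verified in Python using the population variance (dividing by n, as the text does):

```python
from statistics import pstdev, pvariance

# Dog heights example
heights = [600, 470, 170, 430, 300]
print(pvariance(heights))         # 21704.0
print(round(pstdev(heights), 2))  # 147.32

# Second example: 2, 4, 4, 4, 5, 5, 7, 9
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(pvariance(data))            # 4.0
print(pstdev(data))               # 2.0
```

Note that statistics also provides variance/stdev (dividing by n − 1) for sample estimates; the text's by-hand calculations correspond to the population versions used here.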
3.3. Activities
Activity 1
______________________________________________________________
_____________________________________________________________
Activity 2
Can you perform a statistical study on "the time students spend on social media"?
Condition 1: You will collect the data outside of your school.
Condition 2: You can work in a group of 5 students.
Condition 3: Your group needs to capture data from a minimum of 10 students.
Once you have the data ready, do your statistical analysis (central tendency, variance and
standard deviation) and present your story.
____________________________________________________________________
___________________________________________________________________
3. Visual representation of data
This module provides an introduction to the purpose, importance and various methods of
representing data using graphs. Statistics is a science of data, so we deal with large volumes
of data in statistics and Artificial Intelligence. Whenever the volume of data increases
rapidly, an efficient and convenient technique for representing data is needed. The human brain
is more comfortable dealing with a complex and large quantity of data when it is represented in
a visual format. That is how the need arises for the graphical representation of data.
The important topics that we are going to cover in this module are:
3.1. Why do we need to represent data graphically?
3.2. What is a Graph?
3.3. Types of Graphs
3.1 Why do we need to represent data graphically?
There could be various reasons for representing data on graphs; a few of them are outlined
below:
The purpose of a graph is to present data that are huge in volume or too complicated to be
described in text or tables.
Graphs not only represent the data but also reveal relations between variables and show
the trends in data sets.
Graphical representation helps us in analysing the data.
3.2. What is a Graph?
A graph is a chart or diagram through which data are represented in the form of lines or curves
drawn on coordinate points, and it shows the relation between variable quantities.
There are some algebraic and coordinate geometry principles which apply in drawing graphs of
any kind.
Graphs have two axes: the vertical one is called the Y-axis and the horizontal one is called
the X-axis. The X and Y axes are perpendicular to each other, and the intersection of these two
axes is called '0', or the origin. On the X-axis, distances to the right of the origin have
positive values (see fig. 7.1) and distances to the left of the origin have negative values. On
the Y-axis, distances above the origin have positive values and distances below the origin have
negative values.
3.3. Types of Graphs
3.3.1 Bar Graphs
As per Wikipedia “A bar chart or bar graph is a chart or graph that presents categorical data with
rectangular bars with heights or lengths proportional to the values that they represent “. It is a
really good way to show relative sizes of different variables.
There are many characteristics of bar graphs that make them useful. Some of these are that:
They make comparisons between different variables very easy to see.
They clearly show trends in data, meaning that they show how one variable is affected
as the other rises or falls.
Given one variable, the value of the other can be easily determined.
Example 1
The percentage of total income spent under various heads by a family is given below.
Heads:              Food    Clothing    House Rent    Health    Education    Miscellaneous
% of total income:  40%     10%         10%           15%       20%          5%
3.3.2 Histogram
A histogram is drawn on a natural scale in which the frequencies of the different classes of
values are represented by vertical rectangles drawn close to each other. The mode, a measure
of central tendency, can be easily determined with the help of this graph. A histogram is easy
to draw and simple to understand, but it has one limitation: we cannot plot more than one data
distribution on the same axis.
Example 1
Below is the waiting time of the customer at the cash counter of a bank branch during peak
hours. You are required to create a histogram based on the below data.
3.3.3. Scatter Plot
A scatter plot is a way to represent data on a graph that is similar to a line graph. A line
graph uses a line on an X-Y axis, while a scatter plot uses dots to represent individual pieces
of data. In statistics, these plots are useful for seeing whether two variables are related to
each other; for example, a scatter chart can suggest a linear relationship (i.e. a straight
line).
There is no line; instead, the dots represent the values of the variables on the graph.
Example 1
Here are the prices of 1460 apartments plotted against their ground living area. This dataset
comes from a Kaggle (https://www.kaggle.com/c/house-prices-advanced-regression-techniques/data)
machine learning competition. You can read more about this example here:
(Source: https://www.data-to-viz.com/story/TwoNum.html)
The scatter plot is the most frequently used data plotting technique in machine learning.
When should we use a scatter plot?
It is used to observe the relationship between two numeric variables. The dots on the plot
denote not only the values of the variables but also patterns in the data taken as a whole.
A scatter plot is a useful tool for studying correlation. Relationships between variables can
be described in many ways: positive or negative, strong or weak, linear or nonlinear.
4. Introduction to Dimensionality of Data
4.1. Data Dimensionality
Dimensionality in statistics refers to how many attributes a dataset has. For example, a sample
students dataset with four attributes (columns) is of 4 dimensions.
Students Dataset
So if you have a dataset with n observations (or rows) and m columns (or features), then your
data is m-dimensional.
One dimension of a dataset can change without forcing a change in another dimension. We can
change the age of a student without changing their class or address, for example.
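In code, a dataset is commonly held as a table of rows and columns, and its dimensionality is simply the number of columns. A tiny sketch; the student records below are hypothetical, made up for illustration:

```python
# Each row is one student: (name, age, class, address) -- hypothetical records
students = [
    ("John", 16, "XI", "Delhi"),
    ("Mary", 15, "XI", "Mumbai"),
    ("Ravi", 14, "X",  "Pune"),
]

n_observations = len(students)       # n rows (observations)
m_dimensions = len(students[0])      # m columns = dimensionality of the dataset
print(n_observations, m_dimensions)  # 3 4
```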
The combination of these three colours (numbers 0 - 255) ultimately decides the colour; hence
we say that colour space is three-dimensional, because there are three "directions" in which a
colour can vary.
4.2. Data Representation on Graph
Before we move further, let us understand the basics of data representation on the graph
(Figure: the coordinate plane divided into four quadrants, Quadrant I through Quadrant IV)
Please look at the above diagram of the graph and try to reason out why
i) (6, 4) is in first quadrant
ii) ( -6, 4) is in second quadrant
iii) ( -6, -4) is in third quadrant
iv) (6, -4) is in the fourth quadrant
4.3. Multi-Dimensional Data and Graph
Use Case 1
Let us assume a dataset of 1 dimension:
Students Dataset
Age
16
15
14
Use Case 2
Now let us take 2-dimensional data:
Students Dataset
Age    Maths Marks
16     91
15     85
14     93
How do we locate a point on a flat surface (such as this page)? We need to know two directions:
left-right, and
up-down.
So any position needs two numbers.
Use Case 3
Let us take 3-Dimensional data
Students Dataset
Age Maths Marks Science Marks
16 91 92
15 85 90
14 93 72
How do we locate a spot in the real world (such as the tip of your nose)? We need to know:
left-right,
up-down, and
forward-backward
that is three numbers, or 3 dimensions!
What kind of situations are these? College admission (one variable) depends on another
variable, i.e. the 12th-grade score. Number of sales (one variable) depends on another
variable, i.e. the product price.
In all these situations there are two variables: one is the input variable (12th score, product
price etc.) and the other is the outcome (college admission, sales, farming etc.).
We know these two variables, input and outcome, are related, but the equation of the relation
is unknown.
Example 1
The general formula of linear equation is:
Ax + By = C
Now let us take an example from real life to understand how a linear equation behaves (the
slope of the graph) with changing data points.
Suppose the cab fare in Mumbai is: fixed amount (x) + INR y per km.
The cab fare is then revised; the new fare is: fixed amount (x) + twice the earlier rate per km
travelled.
Thereby, if the data points change, the slope of the linear equation also changes.
We need to know that a linear equation changes its path only when the condition of a variable
changes.
Example 2
When we collect data, sometimes there are values that are "far away" from the main group of
the data. How does that "far away" value (called an outlier) impact the equation? What do we
do with such values?
Below is the Delhi daily temperature data, recorded for a week:
Temperature recorded (degree C): 1st week of June
Temp. 42 44 47 30 40 43 46
On the 4th of June it rained in Delhi and therefore the temperature dipped.
Mean with the outlier: (42 + 44 + 47 + 30 + 40 + 43 + 46) / 7 = 292 / 7 ≈ 41.7.
Now, let us take the outlier out and calculate the mean again:
(42 + 44 + 47 + 40 + 43 + 46) / 6 = 262 / 6 ≈ 43.7.
Even though the data range is very small, we notice a visible difference in the mean. When we
remove outliers we are changing the data; it is no longer "pure", so we shouldn't just get rid
of outliers without a good reason! And when we do remove them, we should explain what we are
doing and why.
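The effect of the outlier is easy to quantify in Python, using the temperatures from the table above:

```python
from statistics import mean

temps = [42, 44, 47, 30, 40, 43, 46]          # 1st week of June
without_outlier = [t for t in temps if t != 30]

print(round(mean(temps), 2))            # 41.71 -- with the rainy-day outlier
print(round(mean(without_outlier), 2))  # 43.67 -- outlier removed
```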
Use Case 2
Let us consider another simple example:
Your physical education teacher gives you an offer: if you take two rounds of the school
ground, he will give you 4 chocolates.
Based on the above offer, the students come up with the following table:
Number of Rounds (x)    Number of chocolates (y)
2                       4
4                       8
6                       12
Here you find a linear relationship between the two variables, input and outcome: y = 2x.
And congratulations, this is linear regression.
Activity 1
From Use Case 1, can you prepare a hypothetical data set (of any one event) and try to
establish the relationship between the input variable and the outcome?
Solution
Independent variable, x = months
Dependent variable, y = cab price
[Remember that your dependent variable is the one which you are trying to predict (cab price)
and your independent variable (months) is the one which you will supply as input]
Using the historical data, let us plot the scatter graph
Now we can predict the cab price for any coming month, say the 14th month from the point one
arrived in the city, by just replacing the month variable with 14 in the above equation.
Cab price = 4.914 + 14 * 72.752
= 146.82 INR
Seems like we now have an estimate on how much cab price needs to be paid 2 months from now!
Least Square Method
The "least square" method is a form of mathematical regression analysis used to determine the line of
best fit for a set of data, providing a visual demonstration of the relationship between the data points.
Each point of data represents the relationship between a known independent variable and an unknown
dependent variable.
Linear regression is basically a mathematical analysis method which considers the relationship
between all the data points. All these points are based upon two variables, one independent and
one dependent. The dependent variable is plotted on the y-axis and the independent variable on
the x-axis of the regression graph. In essence, the least squares method of regression
minimizes the sum of the squares of the errors made by the fitted equation.
We know the straight line formula
y = mx + c
y (dependent variable) and x (independent variable) are known values, but m and c we need to
calculate.
Steps to calculate m and c:
Step 1: For each (x, y) point, calculate x² and xy.
Step 2: Sum all x, y, x² and xy, which gives us Σx, Σy, Σx² and Σxy.
Step 3: Calculate the slope m:
m = (N Σxy − Σx Σy) / (N Σx² − (Σx)²)
Step 4: Calculate the intercept c:
c = (Σy − m Σx) / N
Example 1:
Let us collect data on how many hours of sunshine vs how many ice creams were sold at the shop
from Monday to Friday:
Hours of sunshine (x)    Ice creams sold (y)
2                        4
3                        5
5                        7
7                        10
9                        15
For each point, calculate x² and xy, and then sum each column:
x    y     x²    xy
2    4     4     8
3    5     9     15
5    7     25    35
7    10    49    70
9    15    81    135

Σx = 26, Σy = 41, Σx² = 168, Σxy = 263, and N = 5.

m = (N Σxy − Σx Σy) / (N Σx² − (Σx)²)
  = (5 × 263 − 26 × 41) / (5 × 168 − 26²)
  = (1315 − 1066) / (840 − 676)
  = 249 / 164
  = 1.5183...
c = (Σy − m Σx) / N
  = (41 − 1.5183 × 26) / 5
  = 0.3049...
Step 5: Assemble the equation of a line:
y = mx + c
y = 1.518 x + 0.305
Let's see how it works:
x    y     predicted y = 1.518x + 0.305    error (y − predicted)
2    4     3.34                            0.66
3    5     4.86                            0.14
5    7     7.89                            −0.89
7    10    10.93                           −0.93
9    15    13.97                           1.03
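The whole procedure (Steps 1-5) can be scripted. A sketch computing m, c and the prediction for 8 hours of sunshine from the data above:

```python
xs = [2, 3, 5, 7, 9]    # hours of sunshine
ys = [4, 5, 7, 10, 15]  # ice creams sold
N = len(xs)

# Steps 1-2: the four sums
Sx, Sy = sum(xs), sum(ys)
Sxx = sum(x * x for x in xs)
Sxy = sum(x * y for x, y in zip(xs, ys))

# Steps 3-4: slope and intercept of the least-squares line y = mx + c
m = (N * Sxy - Sx * Sy) / (N * Sxx - Sx ** 2)
c = (Sy - m * Sx) / N
print(round(m, 3), round(c, 3))   # 1.518 0.305

# Step 5: predict sales for 8 hours of sunshine
print(round(m * 8 + c, 2))        # 12.45
```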
Here are the (x,y) points and the line y = 1.518x + 0.305 on a graph:
Once you hear the weather forecast say "we expect 8 hours of sun tomorrow", you can use the
above equation to estimate that you will sell
y = 1.518 × 8 + 0.305 = 12.45 ice creams
Unit 4: AI Values (Ethical Decision Making)
1. AI: Issues, Concerns and Ethical Considerations
Recent progress in computing, robotics and AI may create a unique opportunity for human
society. The day is not far off when we will entrust the management of the environment,
economy, public security, healthcare or agriculture to artificially intelligent robots and
computer systems. And this is where the discussion of 'AI ethics and values' is born. Countries
all over the world are in a race to evolve their AI skills and technologies. The topic of AI is
very popular, but what are the ethical and practical issues we should consider before embracing
AI?
Question Time!
Let me describe a few scenarios and then pose some associated questions. Do keep in mind that
there are no 'right/wrong' answers for ethical questions. Read each scenario carefully and try
to answer to the best of your understanding.
Q.1: You are a doctor at a well-renowned hospital. You have six ill patients, five of whom are in urgent
need of organ transplant. However, you can't help them as there are no available organs that can be
used to save their lives. The sixth patient, however, will die without a particular medicine. If s/he dies,
you will be able to save the other five patients by using the organs of patient#6, who is an organ donor.
What will you do in this scenario?
(https://listverse.com/2011/04/18/10-more-moral-dilemmas/)
___________________________________________________________________________________
Q.2: An AI music software has composed a song which has become a worldwide hit. Who will
own the rights to this song? The team who developed the AI software or the music company?
____________________________________________________________________________
Q.3: A farmer is headed somewhere sitting on his horse cart. A pedestrian makes some noise
which upsets the horse who injures the pedestrian in reaction. The pedestrian makes a police
complaint. Who do you think is at fault? Who should be penalized?
____________________________________________________________________________
1.1. Issues and Concerns around AI
Activity 1
Let us begin with a YouTube video: Humans Need Not Apply
Watch the video in groups of five. After watching it, each student in the group should write a
short note on their understanding of the video and present the write-up to the teacher.
As Artificial Intelligence evolves, so do the issues and concerns around it. Let us review some of
the issues and concerns around AI here:
Personal Privacy: Human behaviour and activities can now be tracked in ways that were
unimaginable earlier. AI systems need huge amounts of data in order to be trained, and in
many cases that data involves individuals' faces, medical records, financial data, location
information etc.
Job Loss: One of the primary concerns around AI is the future loss of jobs. According to
research by McKinsey, up to 800 million people could lose their jobs to automation
(https://www.theverge.com/2017/11/30/16719092/automation-robots-jobs-global-800-million-forecast).
At the same time, another point to keep in mind is that AI may also create more jobs; after
all, people will be tasked with creating these robots to begin with and then managing them in
the future.
Yes, AI makes mistakes. If humans make a mistake, there are laws that can be enforced; what do
we do in the case of AI? Do we have such laws for AI?
How should we treat AI robots? Should robots be granted human rights or citizenship? If robots
evolve to the point that they are capable of "feeling", does that entitle them to rights
similar to humans or animals? If robots are granted rights, then how do we rank their social
status?
Activity 2
Look at the four pictures below. Can you write a short story based on these four pictures?
1.2. AI and Ethical Concerns
Ethics is defined as the discipline dealing with moral obligations and duties of humans. It is a set
of moral principles which govern the behaviour and actions of individuals or groups.
“The ethics of AI is the part of the ethics of technology specific to robots and other artificially
intelligent beings. It can be divided into roboethics, a concern with the moral behaviour of
humans as they design, construct, use and treat artificially intelligent beings, and machine
ethics, which is concerned with the moral behaviour of artificial moral agents (AMAs). With
regard to artificial general intelligence (AGIs), preliminary work has been conducted on
approaches to integrating AGIs which are full ethical agents with existing legal and social
frameworks “ .
The bigger concerns are:
In this exercise, you will learn to think about the kind of world we make when we build new
technology, and the unintended consequences that can occur when we build that technology.
Instructions
1. Go to: https://talktotransformer.com/
2. Explore the tool for a little bit!
3. Then, answer the following prompts:
Write a brief description of your technology:
_______________________________________________________________
_______________________________________________________________
_______________________________________________________________
If this technology was used for evil, how might that be done?
_______________________________________________________________
If this technology was used to help other people, who might it help?
________________________________________________________________
________________________________________________________________
________________________________________________________________
Question 1: Why do most images that show up when you do an image search for “doctor” depict white
men?
Question 2: Why do most AI tools associate ‘Doctor’ with a man and ‘Nurse’ with a woman?
Question 3: Why do the virtual assistants (Alexa, Siri, Google Assistant etc.) all have female voices?
Question 4: Why do computer vision systems report high error rates when recognizing people of colour?
You can search further on the web and add more to this list. It is not that the developers did
this intentionally. This is what we call AI bias!
“AI bias is a phenomenon that occurs when an algorithm produces results that are systematically
prejudiced towards a certain gender, language, race, level of wealth etc., and therefore produces skewed
output. Algorithms can have built-in biases because they are created by individuals who have conscious
or unconscious preferences that may go undiscovered until the algorithms are used publicly.”
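To make the idea concrete for students, here is a minimal, hypothetical sketch in Python: a naive “classifier” that simply predicts the most common label in its training data. If the training data is skewed, every prediction is systematically skewed too — this is not any real system, only an illustration of how bias in data becomes bias in output.

```python
from collections import Counter

def train_majority_classifier(training_labels):
    """Return a 'model' that always predicts the most common training label."""
    most_common_label, _count = Counter(training_labels).most_common(1)[0]
    return lambda example: most_common_label

# Skewed (invented) training data: 9 of 10 historical examples label a doctor "man".
training_labels = ["man"] * 9 + ["woman"]
predict = train_majority_classifier(training_labels)

# The 'model' now answers "man" for every new input, regardless of reality:
print(predict("photo of a doctor"))   # prints "man"
print(predict("another photo"))       # prints "man"
```

No one wrote “prefer men” anywhere in the code; the prejudice came entirely from the data the model was trained on.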
Activity
Can you prepare a list of instances where bias appears in an AI system because of its algorithm?
3. People
The last issue is with the people who are developing the AI system, i.e. engineers, scientists,
developers etc. They aim to get the most accurate results from the available data, and are often
less focused on the broader context. It is rightly said that ethics and bias are not the problem
of the machine but of the humans behind the machine.
(Source :https://xkcd.com/1838/)
Adoption of AI by companies is increasing, and they see AI as critical to the future of their
business and its sustainability. However, there are concerns regarding the possible misuse of the
technology, which leads to trust and confidence issues.
Currently, AI can automate data entry tasks, take attendance of students in a classroom or
beat Garry Kasparov at chess. However, with more complicated algorithms such as machine
learning or neural networks, it becomes less likely that human beings can understand how the AI
arrived at a conclusion. It is essentially a ‘black box’ to humans. If a system is so complicated that a user
doesn’t understand how it works, how can we trust the decisions it makes?
While there is no easy solution to this problem, the way forward is for governments, industry
and regulatory bodies to join hands in addressing the challenge of ‘AI trust’ by doing the
following:
1. Minimize bias in training data
One of the most famous algorithms in the world right now is Google Search.
Sundar Pichai, the Google CEO, had to describe the algorithm to lawmakers,
explaining that the search algorithm uses over 200 signals, including relevance and
popularity, to determine a page’s rank. A bipartisan bill was recently proposed by
US lawmakers that would require internet giants such as Google, Facebook, Yahoo
and AOL to disclose their search algorithms.
(https://sensecorp.com/ethical-ai/)
3. AI developers should be representative/inclusive of diverse backgrounds – gender, religion,
skin colour, language and so on
4. There should be an international monitoring body that designs and monitors AI ethics
and algorithm policy
Activity
Does this picture tell you something? Can you describe this picture in your own words?
Although Artificial Intelligence has dramatically improved our world in many ways, there are notable
concerns regarding the forthcoming impact of AI on employment and the workforce.
Jobs that are monotonous and repetitive can be easily automated; this can gradually lead to
certain jobs becoming obsolete.
Activities related to customer care operations, document classification, content moderation,
production lines in factories etc. are at risk of being taken over by smart robots and software.
Self-driving cars and trucks will soon be a reality; transportation will see a transformation.
Financial services, insurance and any other sector requiring significant amounts of data
processing and content handling will also be impacted to a certain extent. AI can play a
significant role in eliminating bureaucracy and improving services to citizens.
The healthcare sector and imaging services will also have some degree of impact.
AI is widely expected to create millions more jobs than the ones it will affect. These new
jobs will require higher-order thinking skills.
The advent of the Internet and computers made a few jobs and roles obsolete, but we know
the number of opportunities and new jobs they have created.
ATMs definitely reduced the number of cashier positions in banks, but they have had a
positive impact on the banking business. ATMs lowered the cost associated with
running brick-and-mortar branches, and banks responded by opening
more ATMs – leading to the hiring of more bank personnel.
Think of the Ambassador (https://en.wikipedia.org/wiki/Hindustan_Ambassador) and the
Premier Padmini, or Fiat (https://en.wikipedia.org/wiki/Premier_Padmini).
What happened to these models? Do we see them on the roads now? Nobody pushed them out
of business; rather, their inertia and resistance to change with the changing times and
technology made them run out of business.
The advent of electricity and the mechanical engine changed the world. Each of them bettered our lives,
created jobs, and raised wages. AI will be bigger than electricity, bigger than mechanization,
bigger than anything that has come before it.
We don’t need to fear AI but prepare to reap the benefits of AI!
Let us conclude this unit by engaging in this activity.
Activity
Form groups of 5 students each, and ask the students to prepare a list of
i) 5 jobs or professions that AI will disrupt
ii) 5 job segments that will be immune to AI
iii) 10 new jobs or businesses that will be created by AI
Unit 5: Introduction to Storytelling
Summary: Students get to learn about the significance of storytelling, which has been used
as a medium to pass on knowledge, experience, and information through the ages. It also builds
intercultural understanding and the commonalities thereof. This session will also equip
students with a vital skill: telling their stories with numbers or proof points by blending
the two worlds of hard data and human communication. Data visualisation is now key to
interpreting data and telling an impactful story.
Objectives:
Learning Outcomes:
Purpose: Introduce the importance of storytelling and its effectiveness in passing on knowledge,
values, facts and events from one generation to another.
Say: “This unit intends to create value for storytelling. Although storytelling comes naturally to
everyone, keeping a few things in mind not only enriches the said art but also makes it more
impactful. This unit also dwells on the art of storytelling and how blending stories with numbers/
data can make storytelling forceful.”
After having briefed the students about storytelling, ask them to answer the questions that follow
and gauge their understanding of the subject. Engage the students in a discussion and ask them
about their expectations from this unit.
1. Storytelling: Communication across the ages
Stories have been central to human cognition and have proved to be the most effective way
of communication since time immemorial. There is a biochemical reason why people love
stories: it is the mode of communication our brains biologically prefer. When a good story is told,
the brain comes alive, because storytelling literally has a chemical effect on the brain that
wakes it up in order to absorb, digest and store information. Stories have the power to inspire,
motivate, and change people’s opinions. In short, stories are the best possible way to deliver
complex information (data).
Storytelling is defined as the art of narrating stories to engage an audience. It originated in
ancient times with visual stories, such as cave drawings, and then shifted to oral traditions, in
which stories were passed down from generation to generation by word of mouth. Later, words
formed into narratives that included written, printed and typed stories. Written language, as it
is seen now, was arguably the first technological innovation that gave us as a species the power
to convey a story in a physical format, and thus to visualize, archive and share that data with
community members and future generations. Storytelling encourages people to make use of their
imagination and inventiveness (creativity) to express themselves (verbal skills), which makes it a
lot more than just a recitation of facts and events.
1.1. Learn why storytelling is so powerful and cross-cultural, and what this means for
data storytelling
Stories create engaging experiences that transport the audience to another space and time.
They establish a sense of community belongingness and identity. For these reasons, storytelling
is considered a powerful element that enhances global networking by increasing awareness
of cultural differences and enhancing cross-cultural understanding. Storytelling is an
integral part of indigenous cultures.
Some of the factors that make storytelling powerful are its ability to make information more
compelling, its ability to offer a window onto the past, and finally its ability to
draw lessons and to reimagine the future by effecting necessary changes. Storytelling also
shapes, empowers and connects people by doing away with judgement and criticism, and facilitates
openness towards embracing differences.
A well-told story is an inspirational narrative that is crafted to engage the audience across
boundaries and cultures, as stories have an impact that isn’t possible with data alone. Data can
be persuasive, but stories are much more. They change the way that we interact with data,
transforming it from a dry collection of “facts” to something that can be entertaining, engaging,
thought provoking, and inspiring change.
Each data point holds some information which may be unclear and contextually deficient on its
own. Visualizations of such data are therefore subject to interpretation (and
misinterpretation). However, stories are more likely to drive action than statistics and
numbers are. Therefore, when data is told in the form of a narrative, it reduces ambiguity, connects data
with context, and describes a specific interpretation – communicating the important messages
in the most effective way. The steps involved in telling an effective data story are given below:
Understanding the audience
Choosing the right data and visualisations
Drawing attention to key information
Developing a narrative
Engaging your audience
Activity
A new teacher joined the ABC Higher Secondary School, Ambapalli to teach Science to the
students of Class XI. In his first class itself, he could make out that not everyone understood
what was being taught in class. So, he decided to take a poll to assess the level of students. The
following graph shows the level of interest of the students in the class.
[Pie chart: poll results showing the students’ levels of interest – segments of 40%, 25%, 19%, 11% and 5%]
Depending on the result obtained, he changed his method of teaching. After a month, he
repeated the same poll once again to ascertain if there was any change. The results of poll are
shown in the chart below.
[Pie chart: results of the repeated poll – segments of 38%, 30%, 14%, 12% and 6%]
With the help of the information provided, create a good data story, setting a strong narrative
around the data so that it is easier to understand the pre- and post-intervention data, the existing
problem, the action taken by the teacher, and the resolution of the problem. Distribute A4 sheets
and pens to the students for this activity.
2. The Need for Storytelling
The need for storytelling is gaining importance like never before, as more and more people are
becoming aware of its potential to serve multiple purposes.
Purpose: To familiarize students with the need for storytelling and how it proves
beneficial.
Say: “Now that you have learnt about storytelling and its power, we will introduce you to
the need of storytelling.”
Guide the students to think of the many needs that storytelling satisfies and enter them in the blank
circles in the figure below:
Storytelling
Expected Responses:
Purpose: To provide insight into data storytelling and how it can bring a story to life.
Say: “Now that you have understood what storytelling is and why it is needed, let us learn about
storytelling of a different kind – the art of data storytelling, where data is presented in the form
of a narrative or story.”
ITEM          QUANTITY
A4 sheets     xx
Pens          xx
Data storytelling is a structured approach for communicating insights drawn from data, and
invariably involves a combination of three key elements: data, visuals, and narrative. When the
narrative is accompanied by data, it helps explain to the audience what is happening in the
data and why a particular insight has been generated. When visuals are applied to data, they
can enlighten the audience to insights that they would not perceive without the charts or
graphs.
Finally, when narrative and visuals are merged together, they can engage or even entertain an
audience. When you combine the right visuals and narrative with the right data, you have a data
story that can influence and drive change.
3.1. By the numbers: How to tell a great story with your data?
Presenting the data as a series of disjointed charts and graphs could result in the audience
struggling to understand it – or worse, coming to the wrong conclusions entirely. Thus, the
importance of a narrative comes from the fact that it explains what is going on within the data
set. It offers context and meaning, relevance and clarity. A narrative shows the audience where
to look and what not to miss, and also keeps the audience engaged.
Good stories don’t just emerge from the data itself; they need to be unravelled from data
relationships. Closer scrutiny helps uncover how each data point relates to the others. Some easy
steps that can assist in finding compelling stories in data sets are as follows:
Step 1: Get the data and organise it.
Step 2: Visualize the data.
Step 3: Examine data relationships.
Step 4: Create a simple narrative embedded with conflict.
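As an illustration of Steps 1–3, the sketch below (Python, with invented numbers purely for illustration) organises yearly case counts and then examines one simple relationship – the year-over-year change – around which a narrative with conflict could be built:

```python
# Step 1: get the data and organise it (all numbers are invented for illustration).
cases_by_year = {2015: 160, 2016: 240, 2017: 430, 2018: 310, 2019: 130}

# Step 3: examine data relationships - here, the year-over-year change in cases.
years = sorted(cases_by_year)
changes = {year: cases_by_year[year] - cases_by_year[prev]
           for prev, year in zip(years, years[1:])}

for year, delta in changes.items():
    trend = "rose" if delta > 0 else "fell"
    print(f"In {year}, cases {trend} by {abs(delta)} versus the previous year.")
```

The rises and falls the loop reports are exactly the “conflicts” of Step 4: each one invites a question (why did cases spike? what intervention made them fall?) that the narrative can then answer.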
Activity: Try creating a data story with the information given below and use your imagination to
reason as to why some cases have spiked while others have seen a fall.
[Bar chart: Mosquito-borne diseases in Delhi, 2015–2019 – yearly case counts]
1. It is an effective tool to transmit human experience. Narrative is the way we simplify and
make sense of a complex world. It supplies context, insight, interpretation – all the things
that make data meaningful, more relevant and interesting.
2. No matter how impressive an analysis, or how high-quality the data, it is not going to
compel change unless the people involved understand what is explained through a story.
3. Stories that incorporate data and analytics are more convincing than those based
entirely on anecdotes or personal experience.
4. It helps to standardize communications and spread results.
5. It makes information memorable and easier to retain in the long run.
Data Story elements challenge –
Identify the elements that make a compelling data story and name them
_____________________
______________________
_____________________
Activity:
First, present the statistics as shown below. Ask the students to read them and say whether they
have understood the information presented.
1. 7.6% of men believe mobiles are a distraction as compared to 4.2% of the women.
2. Kids in the car cause 9.8% of the men to be distracted as compared to 26.3% of the
women.
Another way to present the same statistics is the visual shown below.
Ask the students which one tells a better story, and list out why.
(Expected response: the visual tells a better story; it is easier and quicker to
comprehend.)
4. Conflict and Resolution
Conflict is the most exciting and engaging driver in any story. Every story or plot is centred on its
conflict and the ways in which the characters of the story attempt to resolve the problem. Conflict in
a story is a struggle between two or more opposing forces, and it drives the plot forward
towards a resolution.
In business, as in daily life, users or audiences are always trying to resolve a conflict.
Decisions are made only after the conflict is resolved. Every question in data storytelling
is answered by finding evidence that addresses the conflict.
1. Communication
2. Teamwork
3. Problem Solving
4. Stress management
5. Emotional agility
Activity
A school has planned its annual meet for the year. 15 students can participate in a drama, for which 28
students have shown interest. The teacher coordinator decides to leave it to the 28
students to unanimously select the 15 students who will participate in the drama.
5. Storytelling for the Audience
Data storytelling has a few essential elements without which storytelling is impossible. Let us have a look at them:
Let’s do an activity
Create a data story to highlight the changes you see in yourself after the outbreak of COVID-19 and
the lockdown that followed in the country.
LEVEL 2: AI INQUIRED (AI APPLY) TEACHER INSTRUCTION MANUAL
Unit 6
Objectives:
1. To build focus on research, prototyping, and testing products and services so as to find
new ways to improve a product, service or design.
2. To develop the understanding that there is more to design thinking than just hardware
design.
3. To inculcate a design thinking approach that enhances students’ creative confidence.
Learning Outcomes:
1. Underlining the importance of Prototype as a solution to user challenge.
2. Recognizing empathy to be a critical factor in developing creative solutions for the end
users.
3. Applying multiple brainstorming techniques to find innovative solutions.
Pre-requisites: Reasonable fluency in the English language.
Key Concepts: Design Thinking framework, Prototype, Ideate
( https://en.wikipedia.org/wiki/The_Thinker )
“The illiterate of the 21st century will not be those who cannot read and write, but those who cannot learn, unlearn, and
relearn.” – Alvin Toffler, author of Future Shock.
---------------------------------------------------------------------------------------------------------------------------------------------
Activity 2
Have the class form groups of 3 or 4 students each and assign them tasks (let’s say to plan a party).
In round one, get everyone to start each sentence of their conversation with “Yes, BUT…”. After the first
round, ask the participants how the conversation went. How did their discussion to plan the party go?
For round two, get the participants to start their conversation with “Yes, AND….”. After the second round, ask
the group how that round went and compare the two rounds of discussions. The differences between the two
will be striking!
Purpose: Collaboration, along with the distinction between an open and a closed mindset.
Activity 3
Divide the class into groups of 4-5 students each. Pick a random object (e.g. a paperclip, pen or notebook), and
challenge each group to come up with 40 uses for the object. No repeats!
Each group will take turns in coming up with new ideas. Make sure that each group has a volunteer note-taker
to capture the ideas along with the total number of ideas their group comes up with. Allow 4 mins for this
challenge. When time is up, have each group share how many ideas they generated. The group with the most
ideas is declared the winner!
Activity 4
This is an activity that promotes pure imagination. The purpose is to think expansively about an ideal future
for the school, or for yourself; it is an exercise in visioning.
The objective of this activity is to suspend all disbelief and envision a future that is so stellar that it can land you
or your school on the cover of a well-known international magazine. The student must pretend as though this
future has already taken place and has been reported by the mainstream media.
Purpose: Encouraging students to “think big”; planting the seeds for a desirable future.
According to Wikipedia, "Design thinking refers to the cognitive, strategic and practical processes by which
design concepts (proposals for new products, buildings, machines, etc.) are developed.” Design thinking is also
associated with prescriptions for the innovation of products and services within business and social contexts.
Most often, design is used to describe hardware, a machine or a structure, but essentially, it is a process. It is
a set of procedures and principles that employ creative and innovative techniques to solve any complex
technological or social problem. It is a way of thinking and working towards a potential solution to a complex
problem.
Some years ago, an incident occurred where a truck driver tried to pass under a low bridge. But he failed, and
the truck got lodged firmly under the bridge. The driver was unable to continue driving through or reverse out.
The story goes that as the truck became stuck, it caused massive traffic problems, which resulted in emergency
personnel, engineers, firefighters and truck drivers gathering to devise and negotiate various solutions for
dislodging the trapped vehicle.
Emergency workers were debating whether to dismantle parts of the truck or chip away at parts of the bridge.
Each spoke of a solution that fit within his or her respective level of expertise.
A boy walking by and witnessing the intense debate looked at the truck, at the bridge, then looked at the road
and said nonchalantly, "Why not just let the air out of the tires?" to the absolute amazement of all the specialists
and experts trying to unpick the problem.
When the solution was tested, the truck was able to drive free with ease, having suffered only the damage
caused by its initial attempt to pass underneath the bridge. The story symbolizes the struggles we face where
oftentimes the most obvious solutions are the ones hardest to come by because of the self-imposed constraints
we work within.
( Source - https://www.interaction-design.org/literature/article/what-is-design-thinking-and-why-is-it-so-popular )
Now let’s move on to understand the Design Thinking framework. The illustration below has the various
components of the framework.
Empathize
Design thinking begins with empathy. This requires doing away with any preconceived notions and immersing
oneself in the context of the problem for better understanding. In simple words, through empathy, one is able
to put oneself in other people's shoes and connect with how they might be feeling about their problem,
circumstance, or situation.
There is a challenge one needs to solve. How does one approach it? Empathy starts from here. As a designer of
the solution to a challenge, one should always understand the problem from the end-user perspective.
Define
In the Define stage, information collected during Empathize is used to draw insights and is instrumental in stating
the problem that needs to be solved. It's an opportunity for the design thinker to define the challenge or to
write the problem statement in a human-centred manner with a focus on the unmet needs of the users.
Ideate
By now the problem is obvious and it is time to brainstorm ways and methods to solve it. At this stage, numerous
ideas are generated as a part of the problem-solving exercise. In short, ideation is all about idea generation.
During brainstorming, one should not be concerned if the generated ideas are possible, feasible, or even viable.
The only task of the thinkers is to generate as many ideas as possible. It requires “going wide” mentally
in terms of concepts and outcomes. There are many brainstorming tools that can be used during this stage.
By this time, you are already aware of who your target users are and what your problem statement is. Now it’s
time to come up with as many solutions as possible. This phase is all about creativity and imagination; all types of
ideas are encouraged, whether silly or wise – it hardly matters, as long as a solution is imagined.
Ideation is the most invigorating stage of Design Thinking, and consists of a process where any and all ideas are
welcomed, no matter how outrageous they may seem. A lot of planning and preparation goes into this stage to
ensure that the results are varied and innovative. After everyone shares their ideas, specific measures are
applied to evaluate the ideas without being judgmental or critical to narrow the list. It may so happen that the
solution comes from the unlikeliest of ideas. So, at this point focus is on quantity over quality of ideas. The most
feasible ideas are chosen for further exploration. Storyboarding, or making a visual mock-up of an idea, can also
be useful during ideation.
Prototype
The prototype stage involves creating a model designed to solve consumers’ problems, which is tested in the
next stage of the process. Creating a prototype is not a detailed process. It may include developing a simple
drawing, a poster, a group role-play, a homemade gadget, or a 3D-printed product. Prototypes must be quick,
easy and cheap to develop. They are therefore visualised as rudimentary forms of what a final
product is expected to look like. Prototyping is intended to answer questions that get you closer to your final
solution. Prototypes, though quick and simple to make, bring out useful feedback from users, and can be
made with everyday materials.
Test
One of the most important parts of the design thinking process is to test the prototypes with the end users. This
step is often seen going parallel to prototyping. During testing, the designers receive feedback about the
prototype(s), and get another opportunity to interact and empathize with the people they are finding solutions
for. Testing focuses on what can be learned about the user and the problem, as well as the potential solution.
Having understood the different stages, let us see some of the best examples of Design Thinking. You will need
to identify and highlight wherever you feel design thinking has been applied.
Example 3:
Fuel dispensers hanging overhead, unlike what is usually seen at gas filling stations in India
In order to extract and gather relevant facts and information from users/customers, it is recommended to
use this simple and reliable method of questioning: the 5W1H method.
(https://www.workfront.com/blog/project-management-101-the-5-ws-and-1-h-that-should-be-asked-of-every-project)
To collect facts and key information about the problem, ask and answer the five W’s and one H: Who?
What? When? Where? Why? and How?
https://www.sketchbubble.com/en/presentation-5w1h-model.html
For instance, if one’s car is giving inadequate gas mileage, the following questions can be asked:
What has changed – for instance, when were maintenance and repairs last done, or has the gas
station changed?
Where are the new driving routes or distances that the car is covering?
The questions can be changed to make them pertinent to whatever problem or issue needs to be addressed.
The essential W’s and H help to cover all aspects of a problem so that a comprehensive solution can be found.
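As a small hypothetical sketch (in Python), the 5W1H checklist can be kept as a simple reusable structure; here it is filled in for the gas-mileage example above, with the specific questions invented for illustration:

```python
# A reusable 5W1H checklist, filled in for the gas-mileage example.
five_w_one_h = {
    "Who":   "Who drives the car, and has the driver changed?",
    "What":  "What has changed: last maintenance, repairs, gas station?",
    "When":  "When did the mileage start to drop?",
    "Where": "Where are the new driving routes or distances?",
    "Why":   "Why might consumption have risen: load, tyre pressure?",
    "How":   "How is the car driven: speed, idling, short trips?",
}

# Walk the checklist so no aspect of the problem is skipped.
for question_word, question in five_w_one_h.items():
    print(f"{question_word}: {question}")
```

Swapping in a different problem only means rewriting the six answers; the six question words stay the same, which is exactly what makes the method comprehensive.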
Activity 1
Your best friend, who had scored very high marks in the mid-term exams, has surprisingly put up a poor
performance in the final-term exams. You decide to bring him back on track by spending time with him and
trying to extract facts to get to the root of the problem.
Use the 5W1H worksheet given below to record the questions and answers with your friend –
Where is it happening?
When is it happening?
Why is it happening?
Problems are at the centre of what many people do at work every day. Whether you're solving a problem for a
client (internal or external) or discovering new problems to solve - the problems you face can be large or small,
simple or complex.
The problem in the picture below may appear simple to you. Thinking through every aspect from the perspective
of the giraffe, can you solve it for them?
It has often been found that finding or identifying a problem is more important than the solution. For example,
Galileo recognised the problem of needing to know the speed of light, but did not come up with a solution. It
took advances in mathematics and science to solve this measurement problem. Yet to date Galileo still receives
credit for finding the problem.
Question 1: Rohan has been offered a job that he wants, but he doesn’t have a way to reach the office
premises and also doesn’t have enough money to buy a car.
Question-2: Instructors at a large university do not show up for technology training sessions. What do you think
is the problem?
The time frame for the training sessions does not meet the instructors' schedules.
The notifications for the training are sent in bulk mailings to all email accounts.
The Define stage of design thinking (identifying the problem) ensures you fully understand the goal of your design
project. It helps you articulate your design problem, and provides a clear-cut objective to work towards.
Without a well-defined problem statement, it’s hard to know what you’re aiming for. With this in mind, let’s
take a closer look at problem statements and how you can go about defining them.
1.3 Ideate
Ideation is the process of generating ideas and solutions through sessions such as sketching, prototyping,
brainstorming etc. In the ideation stage, design thinkers generate ideas — in the form of questions and solutions
— through creative and curious activities.
https://www.tutorialspoint.com/design_thinking/design_thinking_ideate_stage.htm
Ask the right questions and innovate with a strong focus on your users, their needs, and your insights
about them.
Get obvious solutions out of your heads, and drive your team beyond them.
Ideation Techniques:
Here is an overview of the most essential ideation techniques employed to generate numerous ideas:
Brainstorm
During a Brainstorming session, students leverage the synergy of the group to generate new innovative ideas by
building on others’ ideas. Participants should be able to discuss their ideas freely without fear of criticism. A
large number of ideas are collected so that different options are available for solving the challenge.
Brain dump
Brain dump is very similar to Brainstorm; however, it is done individually. It allows each person to open
their mind and let their thoughts be released and captured on a piece of paper. The participants write down their
ideas on paper or post-it notes and share them later with the larger group.
Brain writing
Brain writing is also very similar to a Brainstorm session and is known as ‘individual brainstorming’. At times,
only the most confident team members share their ideas while the introverts keep theirs to themselves.
Brain writing gives introverted people time to write their ideas down instead of sharing them out loud with
the group. The participants write down their ideas on paper and, after a few minutes, pass their piece
of paper to another participant, who then elaborates on the first person’s ideas, and so forth. In this way, all
participants pass their papers on to someone else and the process continues. After about 15 minutes, the papers
are collected and posted for instant discussion.
Group Activity: Your class has been tasked with a challenge: “How can the classroom be redesigned to better
meet the students’ needs without incurring any cost?”
Form groups of 4 or 5 students each. Apply the design thinking framework i.e. all five phases. Every group is
supposed to submit a detailed report (not more than 10 pages) in a week’s time to the teacher.
https://www.sketchup.com/ (Free 3D digital design tool. Ideal for prototyping and mocking up design
solutions.)
A Focus on Empathy
Empathy is the first step in design thinking because it allows designers to understand, empathize and share
the feelings of the users. Through empathy, we can put ourselves in other people’s shoes and connect with
how they might be feeling about their problem, circumstance, or situation.
A big part of design thinking focuses on the nature of impact that innovative thinking has on individuals. Recall
the students who were featured at the beginning of the module. Empathy was at the centre of their designs.
In preparation for your AI challenge, you are going to engage in an empathy map activity to practice one
way of empathizing in the design process.
To create a “persona” or profile for the user, you can use the empathy
map activity to create a realistic general representation of the user or
users. Personas can include details about a user’s education, lifestyle,
interests, values, goals, needs, thoughts, desires, attitudes, and actions.
Please look at the below links for the activity and the related video
Source- https://www.ibm.com/design/thinking/page/toolkit/activity/empathy-map
Instructions:
Empathy mapping is only as reliable as the data you bring to the table, so make sure you have defensible
data based on real observations (for example, from an interview or contextual inquiry). When you can, invite
users or Sponsor Users to participate.
Draw a grid and label the four essential quadrants of the map: Says, Does, Thinks, and Feels. Sketch your user
or stakeholder in the centre. Give them a name and brief description of who they are and what they do.
https://www.uxbooth.com/articles/empathy-mapping-a-guide-to-getting-inside-a-users-head/
3. Capture observations
Have everyone record what they know about the user or stakeholder. Use one sticky note per observation.
Place the sticky notes with the relevant answers on the appropriate quadrant of the map.
Within each quadrant, look for similar or related items. If desired, move them closer together. As you do,
imagine how these different aspects of your user’s life really affect how they feel. Can you imagine yourself in
their shoes?
Label anything on the map that might be an assumption or a question for later inquiry or validation. Look for
interesting observations or insights. What do you all agree on? What surprised you? What’s missing? Make sure
to validate your observations with other participants involved in the activity.
You’ve been asked to build a mobile app that will help connect students and tutors.
● Persona 1: Neha is a high school student and is focused on maintaining a high percentage to increase her
chances of getting into her first-choice college after Class 12th. She is struggling with her Physics class
and wants to find a tutor. She is looking for someone in her neighbourhood who she can meet with after
school, possibly on Saturday mornings.
● Persona 2: Priya is a college student and an expert in Physics who would like to make a little extra money
by helping students. She hopes to be a teacher one day and thinks being a tutor would help her gain
experience and build her resume. She would like to offer her services to students looking for a Physics tutor.
● Persona 3: Mr. Jaswinder Singh is a high school teacher and has several students struggling with their
Physics assignments. He would like to be able to direct his students to available tutors to help them improve
their grades and catch up with the rest of the class. He also wants to be able to check the progress of his
students to ensure they are taking appropriate steps to improve.
Unit 7
Data Analysis
Objectives:
1. Demonstrate an understanding of data analysis and statistical concepts.
2. Recognise the various types of structured data – string, date, etc.
3. Illustrate an understanding of various statistical concepts like mean, median, mode, etc.
Learning Outcomes:
1. Comprehension and demonstration of data management skills.
2. Students will demonstrate proficiency in applying the knowledge in statistical analysis of
data.
Pre-requisites: No previous knowledge is required, just an interest in methodology and data. All
you need is an Internet connection.
Key Concepts: Data Analysis, Structured Data, Statistical terms and concepts
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
Q 3. Why is data collected? State a few reasons you can think of.
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
__________________________________________________________________________________________
________________________________________________________________________________
It is a widely known fact that Artificial Intelligence (AI) is essentially data-driven. AI involves converting large
amounts of raw data into actionable information that carries practical value and is usable. Therefore,
understanding statistical concepts and principles is essential to Artificial Intelligence and Machine
Learning. Statistical methods are required to find answers to the questions that we have about data. Statistics
and artificial intelligence share many commonalities: both disciplines have much to do with planning, combining
evidence, and decision-making. Statistical methods are required to understand the data used
to train a machine learning model and to interpret the results of testing different machine learning models.
The first section of this unit describes the different data types and how they get stored in a database. The second
section of this unit deals with data representation. Data are usually collected in a raw format and thus difficult
to understand. However, no matter how accurate and valid the captured data might be, it would be of no use
unless it is presented effectively. In the third part of the unit, we will get to learn what cases and variables are
and how you can compute measures of central tendency i.e. mean, median, mode, and dispersion i.e. standard
deviation and variance.
Excel files
SQL databases
Online Forms
Each of these has structured rows and columns that can be sorted or manipulated. Structured data is highly
organized and easily understood by machine language. The most attractive feature of the structured database
is that those working within relational databases can easily input, search, and manipulate structured data.
https://www.datamation.com/big-data/structured-data.html
Activity 1
Tick the correct image depicting structured data depending on your understanding of the same:
https://www.curvearro.com/blog/difference-between-structured-data-unstructured-data/
https://www.nbnminds.com/structured-data-vs-unstructured-data/
https://lawtomated.com/structured-data-vs-unstructured-data-what-are-they-and-why-care/
https://www.laserfiche.com/ecmblog/4-ways-to-manage-unstructured-data-with-ecm/
The date data type helps us store a date in a particular format. Say we want to store the date 2
January 2019: first we give the year, 2019, then the month, which would be 01,
and finally, the day, which would be 02.
The time data type helps us store a time in a particular format. Say we want to store the time
8:30:23 a.m.: first we specify the hour, 08, then the minutes, 30, and
finally the seconds, 23. The year data type holds year values such as 1995 or 2011.
https://docs.frevvo.com/d/display/frevvo/Setting+Properties
Activity
Write the date format used for the dates mentioned below. The first one has been solved as an example. Pay
attention to the separators used in each case. You may use MM or mm to denote month, DD or dd for day and
YY or yyyy for year.
a. mm-dd-yyyy - (07-26-1966)
b. ______________ - (07/26/1966)
c. ______________ - (07.26.1966)
d. ______________ - (26-07-1966)
e. ______________ - (26/07/1966)
f. ______________ - (26.07.1966)
g. ______________ - (1966-07-26)
h. ______________ - (1966/07/26)
i. ______________ - (1966.07.26)
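These format patterns map directly onto the date format codes used by most programming languages. A short Python sketch using the activity’s example date (for the teacher’s reference):

```python
from datetime import datetime

d = datetime(1966, 7, 26)  # 26 July 1966, the date used in the activity

# strftime codes: %m = month, %d = day, %Y = 4-digit year
print(d.strftime("%m-%d-%Y"))   # 07-26-1966  (mm-dd-yyyy)
print(d.strftime("%d/%m/%Y"))   # 26/07/1966  (dd/mm/yyyy)
print(d.strftime("%Y.%m.%d"))   # 1966.07.26  (yyyy.mm.dd)

# strptime goes the other way: from formatted text back to a date object
parsed = datetime.strptime("26-07-1966", "%d-%m-%Y")
print(parsed == d)              # True
```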
Examples:
It means the report card can be bundled section-wise, class-wise, or house-wise; so section, class, or house
are the categories here. An easy way to determine whether given data is categorical or numerical is
to try to calculate the average. If you can calculate an average, the data is numerical. If you
cannot calculate an average, the data is categorical.
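The “can you calculate an average?” rule of thumb can be demonstrated with a tiny Python sketch (the sample lists are illustrative only):

```python
def is_numerical(values):
    """Return True if an average can be computed, i.e. the data is numerical."""
    try:
        sum(values) / len(values)
        return True
    except TypeError:
        # Adding categories together is meaningless, so Python refuses
        return False

ages = [14, 15, 16, 14]            # numerical: an average makes sense
houses = ["Red", "Blue", "Green"]  # categorical: no average exists

print(is_numerical(ages))    # True
print(is_numerical(houses))  # False
```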
https://study.com/academy/exam/topic/ppst-math-data-analysis.html
http://www.intellspot.com/categorical-data-examples/
Question 3: Refer to the table on ice cream and answer the following:
How many belong to the group, ‘Adults who like Chocolate ice creams’?
Question 4: Refer to the table on hair colour, and answer the following:
2. Representation of Data
According to Wikipedia, “Statistics is the discipline that concerns the collection, organization, analysis,
interpretation and presentation of data.” It is the science of data that transforms the observations into usable
information. To achieve this task, statisticians summarize a large amount of data in a format that is compact and
produces meaningful information. Without displaying values for each observation (from populations), it is
possible to represent the data in brief while keeping its meaning intact using certain techniques called 'data
representation'. It can also be defined as a technique for presenting large volumes of data in a manner that
enables the user to interpret the important data with minimum effort and time.
2.2. Graphical Technique: Pie Chart, Bar graphs, line graphs, etc.
The visual display of statistical data in the form of points, lines, dots and other geometrical forms is most
common. It would not be possible to discuss the methods of construction of all types of diagrams and maps,
primarily due to time constraints. We will, therefore, describe the most commonly used graphs and the way they
are drawn.
These are:
Line graphs
Bar diagrams
Pie diagram
Scatter Plots
(a) Simplify the data by converting it into round numbers such as the growth rate of the population as shown in
the table for the years 1901 to 2001
(b) Draw an X and Y-axis. Mark the time series variables (years/months) on the X-axis and the data quantity/value
to be plotted (growth of population) in percent on the Y-axis.
(c) Choose an appropriate scale and label it on Y-axis. If the data involves a negative figure then the selected
scale should also show it.
The advantages of using a line graph are that it is useful for making comparisons between different datasets,
and that it makes changes easy to see over both the long and the short term, even when the changes over time are small.
https://www.yourarticlelibrary.com/population/growth-of-population-in-india-1901-to-2001-with-statistics/39653
[Line graph: growth rate of India’s population, 1901–2001. X-axis: Year (1911–2001); Y-axis: Growth Rate (%), roughly −0.5 to 1.5]
Activity 1: Find out the reasons for the sudden change in growth of population between 1911 and 1921 as is
evident from the above graph.
Activity 2: Between the student attendance data and student's score, which one according to you should be
represented using the line graph?
Activity 3: Construct a simple line graph to represent the rainfall data of Tamil Nadu as shown in the table
below
Months Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
Rainfall (cm) 2.3 2.1 3.7 10.6 20.8 35.6 22.8 14.6 13.8 27.5 20.6 7.5
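Before drawing the graph on paper, teachers may like to preview the shape of this series with a rough text chart in Python (a sketch only: the one-symbol-per-2-cm scale is an arbitrary choice, not part of the exercise):

```python
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
rainfall_cm = [2.3, 2.1, 3.7, 10.6, 20.8, 35.6,
               22.8, 14.6, 13.8, 27.5, 20.6, 7.5]

# One '*' for roughly every 2 cm of rainfall stands in for the y-axis scale;
# reading down the rows shows the rise to the June peak and a second rise in October.
for month, cm in zip(months, rainfall_cm):
    print(f"{month} {cm:5.1f} | {'*' * round(cm / 2)}")
```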
(c) Bars may be shaded with colours or patterns to make them distinct and attractive.
The advantages of using a bar graph are many: it is useful for comparing facts, it provides a visual display for
quick comparison of quantities in different categories, and it helps us ascertain relationships easily. Bar
graphs also show big changes over time.
Using the information being depicted in the graph above, answer the questions below:
Pie charts are used for representing compositions, or when trying to compare parts of a whole. They do not
show changes over time. Various applications of pie charts can be found in business, school and at home. For
business, pie charts can be used to show the success or failure of certain products or services. At school, pie
chart applications include showing how much time is allotted to each subject. At home, pie charts can be used
to see the expenses of monthly income on different goods and services.
The advantage of a pie chart is that it is simple, easy to understand, and provides data comparison at a
glance.
Imagine you survey your class to find what kind of books they like the most. You record your findings in the
table for all 40 students in the class.

Step 1: Count the responses for each genre:
Genre: Classic  Fiction  Comedy  Story  Biography
Count:  6        11       8       7      8

Step 2: Add the counts to get the total: 6 + 11 + 8 + 7 + 8 = 40

Step 3: Next, divide each value by the total and multiply by 100 to get the percentage:
(6/40) * 100   (11/40) * 100   (8/40) * 100   (7/40) * 100   (8/40) * 100
[Pie chart of the class’s book preferences: Classic 6, Fiction 11, Comedy 8, Story 7, Biography 8]
Scatter plots are used when there is paired numerical data and when the dependent variable may have multiple
values for each value of the independent variable. The advantage of a scatter plot lies in its ability to portray
trends, clusters, patterns, and relationships.
A student had a hypothesis for a science project. He believed that the more the students studied Math, the
better their Math scores would be. He took a poll in which he asked students the average number of hours that
they studied per week during a given semester. He then found out the overall percentage that they received in
their Math classes. His data is shown in the table below:
Maths Grade (%): 82  81  90  74  77  97  51  78  86  88
The independent variable, or input data, is the study time because the hypothesis is that the Math grade
depends on the study time. That means that the Math grade is the dependent variable, or the output data. The
input data is plotted on the x-axis and the output data is plotted on the y-axis.
Negative Correlation: The two variables move in opposite directions: as one variable
increases, the other variable decreases, and as one decreases, the other increases. If, among the
data points, the x-coordinate and the y-coordinate move in opposite directions (one increases while the other decreases), it is termed
a negative correlation.
E.g. When hours spent sleeping increases hours spent awake decreases, so they are negatively correlated.
No correlation: If no relationship becomes evident between the two variables, then there is no correlation.
E.g. there is no correlation between the amount of tea consumed and the level of intelligence.
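Teachers comfortable with Python can show how a correlation is quantified using Pearson’s correlation coefficient r (+1 for a perfect positive correlation, −1 for a perfect negative one, near 0 for none). The sleep/awake figures below are a made-up illustration of the negative-correlation example in the text:

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient: +1 perfect positive, -1 perfect negative, ~0 none."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Covariance numerator and the two standard-deviation-style denominators
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

hours_asleep = [6, 7, 8, 9, 10]
hours_awake = [18, 17, 16, 15, 14]   # always 24 minus hours asleep

print(round(pearson_r(hours_asleep, hours_awake), 6))  # -1.0: perfectly negative
```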
3. Exploring Data
Exploring data is about "getting to know" the data and its values: whether they are typical or unusual,
centred or spread out, or whether they are extremes. More importantly, during the process of exploration one gets an
opportunity to identify and correct any problems in the data that would affect the conclusions drawn
during analysis. This is the first step in data analysis and involves summarizing the main characteristics of a
dataset, such as its size, accuracy, initial patterns in the data and other attributes.
The number of runs or wickets of a team can be expressed in terms of variables and cases.
Before getting into the proper definition of case or variable let us try to understand its meaning in simple words:
“Variables” are the features of someone or something, and a “case” is that someone or something. So, here cricket
is the case, and the features of cricket, like wickets, runs, wins, etc., are the variables.
Example 1:
Take another example, you want to know the age, height and address of your favourite cricket player.
Example 2:
Let us take one more example where data is collected from a sample of STAT 200 students. Each student's
major, quiz score, and lab assignment score is recorded.
______________________________________________
______________________________________________
Example 3:
A fourth-grade teacher wants to know if students who spend more time studying at home get higher homework
and exam grades.
______________________________________________
______________________________________________
So, given the examples you came across here, you would have understood that a dataset contains information
about a sample. Hence a dataset is said to consist of cases and cases are nothing but a collection of objects. It
must now also be clear that a variable is a characteristic that is measured and whose value can keep changing
during the program. In other words, something that can vary. This is in striking contrast to a constant which is
the same for all cases in a study.
Example:
Let’s say you are collecting blood samples from students in a school for a CBC test, where the following
components would be measured:
Haemoglobin level
Platelets
The students are the cases and all the components of blood are the variables.
Take another example: x = 10 means that x is a variable that stores the value 10 in it.
After x = x + 5, the name of the variable is still x, but its value has changed to 15 due to the addition of the constant 5.
https://slideplayer.com/slide/8137745/
1. Nominal Level
Data at the nominal level is qualitative. Nominal variables are categories such as Mercedes, BMW or Audi,
or the four seasons – winter, spring, summer and autumn. They are not numbers, cannot be used in
calculations, and carry no order or rank. The nominal level of measurement is the simplest or lowest of the
four ways to characterize data. Nominal means "in name only".
Eye colours, yes-or-no responses to a survey, gender, smartphone companies, etc. all deal with the nominal
level of measurement. Even some things with numbers associated with them, such as the number on the back of
a cricketer’s T-shirt, are nominal, since such numbers are used as "names" for individual players on the field and not for
any calculation purpose.
https://slideplayer.com/slide/8059841/
2. Ordinal Level
Ordinal data is made up of groups and categories which follow a strict order. For example, suppose you have been asked to
rate a meal at a restaurant and the options are: unpalatable, unappetizing, just okay, tasty, and delicious.
Although the restaurant has used words, not numbers, to rate its food, it is clear that these preferences are
ordered from negative to positive or low to high; thus the data is qualitative and ordinal. However, the differences
between the data points cannot be measured. Like nominal scale data, ordinal scale data cannot be used in
calculations.
Consider a hotel industry survey where the responses to questions about the hotels are recorded as "excellent," "good,"
"satisfactory," and "unsatisfactory." These responses are ordered, or ranked, from the most desired (excellent)
to the least desired (unsatisfactory). But, as in the previous case, the differences between two pieces of data
cannot be measured.
Another common example of this is the grading system, where letters are used to grade a service or good. You
can order things so that an A is higher than a B, but without any other information there is no way of knowing how
much better an A is than a B.
https://slideplayer.com/slide/6564103/
3. Interval Level
Data that is measured using the interval scale is similar to ordinal level data because it has a definite ordering,
but with interval data the differences between values can also be measured. Interval scale data, however,
has no true starting point, i.e. no meaningful zero value.
Temperature scales like Celsius (°C) and Fahrenheit (°F) are measured using the interval scale. In both
temperature measurements, 40° is equal to 100° minus 60°: differences make sense. But 0 degrees does not,
because in both scales 0 is not the absolute lowest temperature: temperatures like −20°F and −30°C exist and
are colder than 0.
Interval level data can be used in calculations, but ratio comparisons cannot be made. 80°C is not four times as hot
as 20°C (nor is 80°F four times as hot as 20°F). There is no meaning to the ratio of 80 to 20 (or four to one).
4. Ratio Level
Ratio scale data is like interval scale data, but it has a true 0 point and ratios can be calculated. For example, the
scores of four multiple choice statistics final exam questions were recorded as 80, 68, 20 and 92 (out of a
maximum of 100 marks). The grades are computer generated. The data can be put in order from lowest to
highest: 20, 68, 80, 92 or vice versa. The differences between the data have meaning. The score 92 is more than
the score 68 by 24 points. Ratios can be calculated. The smallest score is 0. So, 80 is four times 20. The score of
80 is four times better than the score of 20.
So, we can add, subtract, divide and multiply two ratio level variables. E.g. the weight of a person: it has a real
zero point, i.e. zero weight means that the person has no weight. Also, we can add, subtract, multiply and
divide weights on the ratio scale for comparisons.
Activity
1. Student Health Survey – Fill in the response and mention appropriate Level of Measurement
3. Indicate whether the variable is ordinal or not. If the variable is not ordinal, indicate its variable type.
Opinion about a new law (favour or oppose)
Letter grade in an English class (A, B, C, etc.)
Student rating of teacher on a scale of 1 – 10.
So, the tabular format of representation of the cases and variables being used in your statistical study is known as
a Data Matrix. Each row of a data matrix represents a case and each column represents a variable. A complete
data matrix may contain thousands or lakhs or even more cases.
Each cell contains a single value for a particular variable (or observation).
Imagine you want to create a database of top 3 scorers in each section of every class of your school. The case
you are interested in is individual students (top 3) and variables you want to capture – name, class, section, age,
aggregate %, section rank, and address.
The best way to arrange all this information is to create a data matrix.
Name  Class  Section  Age  Aggregate %  Section Rank  Address
A     X      M        16   92           3             Add1
B     X      M        15   98           1             Add2
C     X      M        16   95           2             Add3
D     IX     N        14   96           1             Add4
E     IX     N        14   95           2             Add5
Z     IV     M        9    97           1             Add10
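In a program, this kind of data matrix is naturally modelled as a list of dictionaries: one dictionary per case (row), one key per variable (column). This sketch uses the first few rows of the table above; the key names are our own choice, not part of the exercise:

```python
# Each row (case) is one student; each key is a variable (column)
data_matrix = [
    {"name": "A", "class": "X", "section": "M", "age": 16, "aggregate": 92, "rank": 3, "address": "Add1"},
    {"name": "B", "class": "X", "section": "M", "age": 15, "aggregate": 98, "rank": 1, "address": "Add2"},
    {"name": "C", "class": "X", "section": "M", "age": 16, "aggregate": 95, "rank": 2, "address": "Add3"},
    {"name": "D", "class": "IX", "section": "N", "age": 14, "aggregate": 96, "rank": 1, "address": "Add4"},
]

# A single cell: the value of one variable for one case
print(data_matrix[1]["aggregate"])  # 98

# A column: the same variable read across all cases
ages = [row["age"] for row in data_matrix]
print(ages)  # [16, 15, 16, 14]
```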
Activity 1
Flipping a coin
Suppose you flip a coin five times and record the outcome (heads or tails) each time
Outcomes
Activity 2
The car data set was prepared by randomly selecting 54 cars and then collecting data on various attributes of
these cars. The first ten observations of the data can be seen in the data matrix below:
Activity 3: Prepare a data matrix to record sales of different types of fruits from a grocery store. Note variables
can be weight, per unit cost, total cost.
A frequency table is constructed by arranging the collected data values in either ascending order of magnitude
or descending order with their corresponding frequencies.
Data value  Tally  Frequency
1           ///    3
2           ///    3
3           //     2
4           //     2
5           //     2
6           /////  5
7           ////   4
8           /////  5
9           //     2
10          //     2
Total              30
Example 1: The following data shows the test marks obtained by a group of students. Draw a frequency table
for the data.
6 7 7 1 3 2 8 6 8 2
4 4 9 10 2 6 3 1 6 6
9 8 7 5 7 10 8 1 5 8
Go through the data; make a stroke in the tally column for each occurrence of the data. The number of strokes
will be the frequency of the data.
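The tally-and-count procedure can be automated in Python with `collections.Counter`, using the test marks from Example 1 (a sketch for the teacher’s reference, not part of the exercise):

```python
from collections import Counter

marks = [6, 7, 7, 1, 3, 2, 8, 6, 8, 2,
         4, 4, 9, 10, 2, 6, 3, 1, 6, 6,
         9, 8, 7, 5, 7, 10, 8, 1, 5, 8]

# Counter does the tallying: it maps each mark to how often it occurs
freq = Counter(marks)
for mark in sorted(freq):
    # Tally strokes drawn with '/' just like the hand-drawn table
    print(f"{mark:>2} {'/' * freq[mark]:<6} {freq[mark]}")

print("Total", sum(freq.values()))  # 30 observations in all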
When the set of data values are spread out, it is difficult to set up a frequency table for every data value as
there will be too many rows in the table. So, we group the data into class intervals (or groups) to help us
organize, interpret and analyse the data.
Example 2: The number of calls from motorists per day for roadside service was recorded for a particular
month in a year. The results were as follows:
We have already studied graphs and various graphical representation methods in the previous
chapters. Here we will learn about the spread of data, i.e. its distribution. The shape of
a distribution is described by its number of peaks and by its possession of symmetry, its tendency to skew, or its
uniformity. (Distributions that are skewed have more points plotted on one side of the graph than on the other.)
The shape of the data distribution represents:
Where the central tendency (i.e. mean) lies in the data spread
1. Number of Peaks
A distribution can have a single peak (unimodal), two peaks (bimodal), or many peaks (multimodal); peaks are also called modes.
http://www.lynnschools.org/classrooms/english/faculty/documents
One of the most common types of unimodal distribution is the normal distribution, or ‘bell curve’, so called
because its shape looks like a bell.
2. Symmetry
A graph is symmetric when a vertical line drawn at its centre forms mirror images, with the left
half of the graph being the mirror image of the right half. The normal distribution and the U-
distribution are examples of symmetric graphs.
https://mathbitsnotebook.com/Algebra1/StatisticsData/STShapes.html
3. Skewness
Unsymmetrical distributions are skewed: most of the values pile up on one side of the mean, which leaves a
long tail either in the negative direction on the number line (left or negative skew) or in the positive direction
(right or positive skew).
Let us further describe these shapes of distribution using histogram, one of the simplest but widely used data
representation methods.
Histograms are a very common method of visualizing data, and that means that understanding how to
interpret histograms is a valuable and important skill in statistics and machine learning.
Case – 1
In this histogram, you can see that the mean is close to 50. The shape of the graph is roughly symmetric and the
values fall between 40 and 64. In some sense, the value 64 looks like an outlier.
____________________________________________
Case – 2
This histogram has two peaks, which suggests that it represents two groups: one group with a mean of 50
and the other with a mean of 65.
Can you think of a real-life example for such a case with two means?
Let us assume, that for the sports meet being held in your school your parents along with your grandparents
were also invited. The age of the parents ranged between 40 – 55 years (in blue colour), and that of grandparents
in the range of 60 – 80 years (in pink colour). During the break, both these cases (parents and grandparents)
bought snacks from the snacks counter. Y – Axis shows the money they spent at the counter in buying the snacks.
Activity 1
Can you think of an event(s) where you have two cases, in your classroom environment? Capture the data and
plot on the histogram to have two peaks. Once done, tell your data story to the class.
Case – 3
Case 3 represents the right-skewed distribution (the direction of the skew indicates which way the longer tail
extends), the long tail extends to the right while most values cluster on the left, as shown in the histogram
above.
Case – 4
In Case 4, which is a left-skewed distribution, the long tail extends to the left while most values cluster on the right.
The mean (or average) is the most popular and well-known measure of central tendency. The mean is equal to
the sum of all the values in the data set divided by the number of values in the data set. Therefore, in this case,
mean is the sum of the total marks you scored in all the subjects divided by the number of subjects.
M = ∑x / n
Where M = Mean
x = Scores (individual values)
n = Number of scores
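In Python the same formula is one line. The marks below are a hypothetical report card, not data from the text:

```python
def mean(scores):
    """M = sum of all values divided by the number of values."""
    return sum(scores) / len(scores)

subject_marks = [72, 85, 90, 68, 80]  # hypothetical marks in five subjects
print(mean(subject_marks))  # (72 + 85 + 90 + 68 + 80) / 5 = 79.0
```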
Activity 1: When you search for a game app on the Play Store, you must have looked at the rating of the app. Can
you figure out how that rating is calculated?
Activity 2: In the example below, if we add all the ratings given by users, the total comes to 45685. Then how come
the rating is 3.5?
3.4.2 Median
Suppose, there are 11 students in your class and one of them is from a very wealthy family.
Student        1    2    3    4    5    6    7    8    9    10   11
Pocket Money   600  580  650  650  550  700  590  640  600  595  20000
Upon calculating the mean or average, it would turn out that the average pocket money received by students
of the class is about Rs. 2378.
However, on cross-checking with the amounts mentioned in the table, this does not appear representative. Is any
of the first ten students getting pocket money even close to INR 2378? The answer is No! So, should we use the
mean to represent the data here? No! The one extreme amount received by the student from a wealthy family
has distorted the mean.
So, what else can we do so that the value represents the majority of students?
The median is the middle score of a data set that has been arranged in order of magnitude. The median is
less affected by outliers and skewed data. The median value here is INR 600, and this value represents the
majority of students far better.
For grouped data, the calculation of the median in a continuous series involves the following steps:
3.4.3 Mode
Mode is another important measure of central tendency of a statistical series. It is the value which occurs most
frequently in the data set. On a bar chart or histogram, the mode corresponds to the highest bar. You can,
therefore, sometimes consider the mode as being the most popular option.
https://contexturesblog.com/archives/2013/06/13/excel-functions-average-median-mode/
https://mathsmadeeasy.co.uk/gcse-maths-revision/bar-graphs-revision/
You need to buy a shoe. You go to the market and ask the shopkeeper 'what is the average shoe size you sell?';
he will give an answer corresponding to the size that he sells the most. That is the mode.
The arithmetic mean and median would give you shoe sizes that may not even exist and are therefore
meaningless here.
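The shopkeeper's reasoning can be sketched in a few lines. The shoe sizes below are invented sales figures, not data from the text:

```python
import statistics

# Hypothetical shoe sizes sold in one day (illustrative data)
sizes = [7, 8, 8, 9, 8, 10, 7, 8, 9, 8]

print(statistics.mode(sizes))   # 8  — the "average" the shopkeeper quotes
print(statistics.mean(sizes))   # 8.2 — a size that is not actually stocked
```

Here the mode (8) is a size the shop really sells, while the mean (8.2) names a shoe that does not exist.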
Now let us understand how you can use these measures of central tendency in Machine Learning.
Machine Learning (or AI) is a data-driven technology. If the input data is wrong, ML will produce wrong results.
When working with data, it is good to have an overall picture of it, and in particular an idea of how the values
in a given data set are distributed. The distribution of a data set can be
Symmetric
Skewed
◦ Negatively skewed
◦ Positively skewed
The skewness of data can be found either by data visualization techniques or by calculating the measures of
central tendency.
http://homepage.stat.uiowa.edu/~rdecook/stat1010/notes/Section_4.2_distribution_shapes.pdf
Positively (right) skewed: Mode < Median < Mean
Negatively (left) skewed: Mean < Median < Mode
https://www.calculators.org/math/mean-median-mode.php
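The mean–median relationship above can be turned into a rough, code-based skew check. This is only a sketch with invented data; for real work you would also inspect a histogram:

```python
import statistics

def skew_direction(data):
    """Rough skew check using the mean-median relationship noted above."""
    mean, median = statistics.mean(data), statistics.median(data)
    if mean > median:
        return "positively skewed"   # long tail pulls the mean to the right
    if mean < median:
        return "negatively skewed"   # long tail pulls the mean to the left
    return "roughly symmetric"

# One large value drags the mean above the median
print(skew_direction([1, 2, 2, 3, 3, 3, 4, 20]))  # positively skewed
```

This heuristic only indicates a direction; it says nothing about how strong the skew is.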
As the formula shows, the z-score is simply the raw score minus the
sample mean, divided by the sample standard deviation.
The value of the z-score tells you how many standard deviations your data point is away from the mean. If a z-
score is equal to 0, it is on the mean.
A positive z-score indicates the raw score is higher than the mean average. For example, if a z-score is equal to
+1, it is 1 standard deviation above the mean.
A negative z-score reveals the raw score is below the mean average. For example, if a z-score is equal to -2, it
is 2 standard deviations below the mean.
Question
There are 50 students in your class, and you scored 70 out of 100 in the SST exam. How well did you perform in
your SST exam?
Answer
Let us re-phrase this question. In fact, you need to find: “What percentage (or number) of students scored
higher than you, and what percentage (or number) of students scored lower than you?”
This is a perfect case for the z-score. To calculate the z-score, you need the mean SST score of your class and
the standard deviation. Suppose the mean is 60 and the standard deviation is 15.
z-score = (x – μ) / σ
= (70 – 60) / 15 = 10 / 15
≈ 0.6667
This means you scored about 0.6667 standard deviations above the mean.
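The calculation above can be wrapped in a small function. The class mean of 60 and standard deviation of 15 are the values assumed in the worked example:

```python
def z_score(x, mu, sigma):
    """Standard score: how many standard deviations x lies from the mean mu."""
    return (x - mu) / sigma

# SST example: score 70, assumed class mean 60, assumed standard deviation 15
z = z_score(70, 60, 15)
print(round(z, 4))  # 0.6667
```

A score equal to the mean gives z = 0, and negative scores fall below the mean, matching the interpretation above.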
Practice Questions
1. Find (a) the mean (b) the median (c) the mode (d) the range of the below data set
5, 6, 2, 4, 7, 8, 3, 5, 6, 6
4, 1, 5, 4, 3, 7, 2, 3, 4, 1
(b) Calculate
(c) A researcher says: "The mode seems to be the best average to represent the data in this survey."
Give ONE reason to support this statement.
3. The mean number of students ill at a school is 3.8 per day, for the first 20 school days of a term. On the 21st
day 8 students are ill. What is the mean after 21 days?
4. If a positively skewed distribution has a median of 50, which of the following statements is true?
5. Which of the following is a possible value for the median of the below distribution?
A) 32
B) 26
C) 17
D) 40
Unit 8
Regression
Learning Outcomes:
1. Students should be able to estimate the correlation coefficient for a given data set
2. Students should be able to estimate the line of best fit for a given data set
3. Students should be able to determine whether a regression model is significant
Pre-requisites:
1. Students must be able to plot points on the Cartesian coordinate system
2. They should have basic understanding of statistics and central tendencies
Key Concepts: Regression, Correlation, Pearson’s r
The value shows how good the correlation is (not how steep the line is), and if it is positive or negative.
The table below is a crosstab that shows, by age, whether somebody has an unlisted phone number.
Each cell of the table shows the number of observations with that combination of values of the two
variables.
We can see, for example, there are 185 people aged 18 to 34 years who do not have an unlisted phone
number.
Column percentages are also shown (these are percentages within the columns, so that each column’s
percentages add up to 100%); for example, 24% of all the people without an unlisted phone number are
aged 18 to 34 years.
The age distribution for people without unlisted numbers is different from that for people with
unlisted numbers. In other words, the crosstab reveals a relationship between the two: people
with unlisted phone numbers are more likely to be younger.
Thus, we can also say that the variables used to create this table are correlated. If there were
no relationship between these two categorical variables, we would say that they were not
correlated.
In this example, the two variables can both be viewed as being ordered. Consequently, we can
potentially describe the patterns as being positive or negative correlations (negative in the table
shown). However, where both variables are not ordered, we can simply refer to the strength of the
correlation without discussing its direction (i.e., whether it is positive or negative).
1.1.2 Scatterplots
A scatter plot (aka scatter chart, scatter graph) uses dots to represent values for two different numeric
variables. The position of each dot on the horizontal and vertical axis indicates values for an individual
data point. Scatter plots are used to observe relationships between variables.
Example
This is a scatter plot showing the amount of sleep needed per day by age.
As seen above, as you grow older, you need less sleep (but still probably more than you’re currently
getting).
Question: Is this a positive or a negative correlation?
Answer: This is a negative correlation. As we move along the x-axis toward the greater numbers,
the points move down, which means the y-values are decreasing, making this a negative correlation.
1.2 Pearson’s r
The Pearson correlation coefficient is used to measure the strength of a linear association between
two variables, where the value r = 1 means a perfect positive correlation and the value r = -1 means a
perfect negative correlation. So, for example, you could use this test to find out whether people's
height and weight are correlated (the taller the people are, the heavier they're likely to be).
Requirements for Pearson's correlation coefficient are as follows:
◦ Scale of measurement should be interval or ratio
How can we determine the strength of association based on the Pearson correlation coefficient?
The stronger the association of the two variables, the closer the Pearson correlation coefficient, r, will
be to either +1 or -1 depending on whether the relationship is positive or negative, respectively.
Achieving a value of +1 or -1 means that all your data points are included on the line of best fit – there
are no data points that show any variation away from this line. Values for r between +1 and -1 (for
example, r = 0.8 or -0.4) indicate that there is variation around the line of best fit. The closer the value
of r to 0 the greater the variation around the line of best fit. Different relationships and their correlation
coefficients are shown in the diagram below:
Remember that these values are guidelines and whether an association is strong or not will also depend on
what you are measuring.
Example 1
In the example below of 6 people with different age and different weight, let us try calculating the value of the
Pearson r.
Solution:
For the Calculation of the Pearson Correlation Coefficient, we will first calculate the following values:
Assumptions
There are four "assumptions" that underpin a Pearson's correlation. If any of these four assumptions are not
met, analysing your data using a Pearson's correlation might not lead to a valid result.
Assumption # 1: The two variables should be measured at the continuous level. Examples of such continuous
variables include height (measured in feet and inches), temperature (measured in °C), salary (measured in
dollars/INR), revision time (measured in hours), intelligence (measured using IQ score), reaction time (measured
in milliseconds), test performance (measured from 0 to 100), sales (measured in number of transactions per
month), and so forth.
Assumption # 2: There needs to be a linear relationship between your two variables. Whilst there are a number
of ways to check whether a Pearson's correlation exists, we suggest creating a scatterplot using Stata, where
you can plot your two variables against each other. You can then visually inspect the scatterplot to check for
linearity. Your scatterplot may look something like one of the following:
Assumption #3: There should be no significant outliers. Outliers are simply single data points within your data
that do not follow the usual pattern (e.g. in a study of 100 students' IQ scores, where the mean score was 108
with only a small variation between students, one student had a score of 156, which is very unusual, and may
even put her in the top 1% of IQ scores globally). The following scatterplots highlight the potential impact of
outliers:
Pearson's r is sensitive to outliers, which can have a great impact on the line of best fit and the Pearson
correlation coefficient, leading to misleading conclusions about your data. Therefore, it is best if there are
no outliers or they are kept to a minimum. Fortunately, you can use Stata to detect possible outliers
using scatterplots.
Assumption # 4: Your variables should be approximately normally distributed. In order to assess the statistical
significance of the Pearson correlation, you need to have bivariate normality, but this assumption is difficult to
assess, so a simpler method is more commonly used.
Let there be two variables x and y. If y depends on x, then the result comes in the form of a simple regression.
Furthermore, we name the variables x and y as:
Also, we can have one more definition for the regression line of y on x: we can call it the line of best fit, as
it results from the method of least squares. This method is the most suitable for estimating the value
of y for a given x, i.e. the value of the dependent variable from the independent variable.
Least Squares Method
∑ei² = ∑(yi – ŷi)² = ∑(yi – a – bxi)²
Here:
ŷi = a + bxi denotes the estimated value of yi for a given value of xi
ei = the difference between the observed and estimated value; this is the error or residual. The
regression line of y on x, along with the estimation errors, is as follows:
On minimizing the least squares expression, we get the following equations, referred to as the Normal Equations:
∑yi = na + b∑xi
∑xiyi = a∑xi + b∑xi²
We get the least squares estimate for a and b by solving the above two equations for both a and b.
b = Cov(x, y) / Sx²
  = (r·Sx·Sy) / Sx²
  = (r·Sy) / Sx
a = ȳ – b·x̄
(y – ȳ)/Sy = r·(x – x̄)/Sx
Sometimes, it might so happen that variable x depends on variable y. In such cases, the line of regression of x
on y is:
x = â + b̂y
Regression Equation
(x – x̄)/Sx = r·(y – ȳ)/Sy
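The Normal Equations above can be solved directly in code. This is a minimal sketch of the least-squares fit y = a + bx; the (x, y) pairs are small made-up data chosen so the fit is exact:

```python
def fit_line(xs, ys):
    """Solve the Normal Equations for the intercept a and slope b of y = a + bx."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # b = (n·∑xy − ∑x·∑y) / (n·∑x² − (∑x)²), then a from ∑y = na + b·∑x
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b

a, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])  # these points lie exactly on y = 1 + 2x
print(a, b)  # 1.0 2.0
```

Because the line passes through (x̄, ȳ), the intercept falls out immediately once the slope is known, exactly as in the formulas above.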
Question
Solution
(i) Both regression lines pass through the common point (x̄, ȳ). Therefore, we substitute x̄ and ȳ for x and y
in the two equations:
7x̄ – 3ȳ = 18
4x̄ – ȳ = 11
Solving these two equations gives x̄ = 3 and ȳ = 1.
(ii) We know that r² is the product of the two regression coefficients:
r² = (7/3) × (1/4) = 7/12
Therefore, r = √(7/12) ≈ 0.76 (taken as positive, since both regression coefficients are positive).
Regression lines are very useful for forecasting procedures. The purpose of the line is to describe the
interrelation of a dependent variable (Y variable) with one or many independent variables (X variable). By using
the equation obtained from the regression line an analyst can forecast future behaviours of the dependent
variable by inputting different values for the independent ones. Regression lines are widely used in the financial
sector and in business in general.
Financial analysts employ linear regressions to forecast stock prices, commodity prices and to perform
valuations for many different securities. On the other hand, companies employ regressions for the purpose of
forecasting sales, inventories and many other variables that are crucial for strategy and planning.
(Y = a + bX + u)
Where Y is the dependent variable, a is the intercept, b is the slope, X is the independent (explanatory)
variable, and u is the error term.
Example: 1
Data was collected on the “depth of dive” and the “duration of dive” of penguins. The following linear model is
a fairly good summary of the data:
Where:
Interpretation of the slope: If the duration of the dive increases by 1 minute, we predict the depth of the
dive will increase by approximately 2.915 yards.
Interpretation of the intercept: If the duration of the dive is 0 minutes, then we predict the depth of the dive
is 0.015 yards.
Comments: The interpretation of the intercept doesn’t make sense in the real world. It isn’t reasonable for
the duration of a dive to be near t = 0, because that’s too short for a dive. If data with x-values near zero
wouldn’t make sense, then usually the interpretation of the intercept won’t seem realistic in the real world.
It is, however, acceptable (even required) to interpret this as a coefficient in the model.
Example: 2
Reinforced concrete buildings have steel frames. One of the main factors affecting the durability of these
buildings is carbonation of the concrete (caused by a chemical reaction that changes the pH of the concrete),
which then corrodes the steel reinforcing the building.
Data is collected on specimens of the core taken from such buildings, where the following are measured:
Interpretation of the slope: If the depth of the carbonation increases by 1 mm, then the model predicts that the
strength of the concrete will decrease by approximately 2.8 MPa.
Interpretation of the intercept: If the depth of the carbonation is 0, then the model predicts that the strength
of the concrete is approximately 24.5 MPa.
Comments: Notice that it isn’t necessary to fully understand the units in which the variables are measured in
order to correctly interpret these coefficients. While it is good to understand data thoroughly, it is also important
to understand the structure of linear models. In this model, notice that the strength decreases as the
carbonation increases, which is shown by the negative slope coefficient. When you interpret a negative slope,
notice that you must say that, as the explanatory variable increases, then the response variable decreases.
Example: 3
When cigarettes are burned, one by-product in the smoke is carbon monoxide. Data is collected to determine
whether the carbon monoxide emission can be predicted by the nicotine level of the cigarette.
It is determined that the relationship is approximately linear when we predict carbon monoxide, C,
from the nicotine level, N
Both variables are measured in milligrams
The formula for the model is C = 3.0 + 10.3N
Interpretation of the slope: If the amount of nicotine goes up by 1 mg, then we predict the amount of carbon
monoxide in the smoke will increase by 10.3 mg.
Interpretation of the intercept: If the amount of nicotine is zero, then we predict that the amount of carbon
monoxide in the smoke will be about 3.0 mg.
Correlation is a statistical technique which tells us how strongly a pair of variables are linearly related and
change together. It does not tell us the why and how behind the relationship; it only says that a relationship exists.
Causation takes a step further than correlation. It says any change in the value of one variable will cause a
change in the value of another variable, which means one variable makes the other happen. It is also referred
to as cause and effect.
Two or more variables are considered to be related, in a statistical context, if their values change together, so
that as the value of one variable increases or decreases, so does the value of the other variable (in the same or
the opposite direction).
For example,
For the two variables "hours worked" and "income earned" there is a relationship between the two
such that the increase in hours worked is associated with an increase in income earned as well.
If we consider the two variables "price" and "purchasing power", as the price of goods increases a
person's ability to buy these goods decreases (assuming a constant income).
Therefore:
Correlation is a statistical measure (expressed as a number) that describes the size and direction of a
relationship between two or more variables.
A correlation between variables, however, does not automatically mean that the change in one variable
is the cause of change in the values of the other variable.
Causation indicates that one event is the result of the occurrence of the other event; i.e. there is a
causal relationship between the two events. This is also referred to as cause and effect.
Theoretically, the difference between the two types of relationships are easy to identify — an action or
occurrence can cause another (e.g. smoking causes an increase in the risk of developing lung cancer), or it
can correlate with another (e.g. smoking is correlated with alcoholism, but it does not cause alcoholism). In
practice, however, it remains difficult to clearly establish cause and effect, compared to establishing correlation.
Example 1
Suppose a study of speeding violations and drivers who use cell phones produced the following fictional data:
Cell Phone User   Speeding violation in the last year   No speeding violation in the last year   Total
Yes               25                                    280                                      305
No                45                                    405                                      450
Total             70                                    685                                      755
The total number of people in the sample is 755. The row totals are 305 and 450. The column totals are 70 and
685. Notice that 305 + 450 = 755 and 70 + 685 = 755.
3. Find P (Person had no violation in the last year AND was a cell phone user)
Number of cell phone users with no violation / Total number in study = 280/755
4. Find P (Person is a cell phone user OR person had no violation in the last year)
= (305 + 685 – 280) / 755 = 710/755
This table shows a random sample of 100 hikers and the areas of hiking they prefer.
Sex The Coastline Near Lakes and Streams On Mountain Peaks Total
Female 18 16 ___ 45
The completed table is:
Sex The Coastline Near Lakes and Streams On Mountain Peaks Total
Female 18 16 11 45
Male 16 25 14 55
Total 34 41 25 100
2. Find the probability that a person is male given that the person prefers hiking near lakes and streams
Hint:
Let M = being male, and let L = prefers hiking near lakes and streams.
1. What word tells you this is conditional?
2. Fill in the blanks and calculate the probability: P(___|___) = ___.
3. Is the sample space for this problem all 100 hikers? If not, what is it?
Answer
1. The word “given” tells you that this is a conditional.
2. P(M|L) = 25/41
3. No, the sample space for this problem is the 41 hikers who prefer lakes and streams.
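The conditional probability above can be checked with a short sketch. The counts come from the completed hiker table; the dictionary keys are just convenient labels:

```python
from fractions import Fraction

# Hiker-preference counts from the completed table above
counts = {
    ("Female", "Coastline"): 18, ("Female", "Lakes"): 16, ("Female", "Peaks"): 11,
    ("Male", "Coastline"): 16,   ("Male", "Lakes"): 25,   ("Male", "Peaks"): 14,
}

# Conditioning on L ("prefers lakes and streams") shrinks the sample space to 41
lakes_total = sum(v for (sex, area), v in counts.items() if area == "Lakes")
p_male_given_lakes = Fraction(counts[("Male", "Lakes")], lakes_total)
print(p_male_given_lakes)  # 25/41
```

Using `Fraction` keeps the probability exact, mirroring the hand calculation P(M|L) = 25/41.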
2. Reading
2.1 Correlation
Correlation is a measure of how closely two variables move together. Pearson’s correlation coefficient is a
common measure of correlation, and it ranges from +1 for two variables that are perfectly in sync with each
other, to 0 when they have no correlation, to -1 when the two variables are moving opposite to each other.
For linear regression, one way of calculating the slope of the regression line uses Pearson’s correlation, so it
is worth understanding what correlation is.
Y = a + bx
The correlation coefficient that indicates the strength of the relationship between two variables can be
found using the following formula:
Where:
rxy – the correlation coefficient of the linear relationship between the variables x and y
In order to calculate the correlation coefficient using the formula above, you must undertake the following
steps:
2. Calculate the means (averages) x̅ for the x-variable and ȳ for the y-variable.
3. For the x-variable, subtract the mean from each value of the x-variable (let’s call this new variable
“a”). Do the same for the y-variable (let’s call this variable “b”).
4. Multiply each a-value by the corresponding b-value and find the sum of these multiplications (the
final value is the numerator in the formula).
6. Find the square root of the value obtained in the previous step (this is the denominator in the
formula).
You can see that the manual calculation of the correlation coefficient is an extremely tedious process,
especially if the data sample is large. However, there are many software tools that can help you save time
when calculating the coefficient. The ‘CORREL’ function of MS Excel, for example, returns the correlation
coefficient of two cell ranges.
Example of Correlation
X is an investor; he invests money in share market. His portfolio primarily tracks the performance of
the S&P 500 (this is a stock market index in USA that measures the performance of top 500 large
companies in the USA).
X wants to add the stock of Apple Inc. Before adding Apple to his portfolio, he wants to assess the
correlation between the stock and the S&P 500 to ensure that adding the stock won’t increase the
systematic risk of his portfolio.
To find the coefficient, X gathers the following prices from the last five years (Step 1)
Using the formula above, X can determine the correlation between the prices of the S&P 500 Index and
Apple Inc.
Next, X calculates the average prices of each security for the given periods (Step 2):
After the calculation of the average prices, we can find the other values. A summary of the
calculations is given in the table below:
The coefficient indicates that the prices of the S&P 500 and Apple Inc. have a high positive correlation. This
means that their respective prices tend to move in the same direction. Therefore, adding Apple to his portfolio
would, in fact, increase the level of systematic risk.
2.2 Regression
With correlation, we determined how much two sets of numbers change together. With regression, we will
use one set of numbers to make a prediction of the value in the other set. Correlation is part of what we
need for regression, but we also need to know how much each set of numbers changes individually (via the
standard deviation), and where we should put the line (i.e. the intercept).
The regression that we are calculating is very similar to correlation. So you might ask, why do we have both
regression and correlation? It turns out that regression and correlation give related but distinct information.
Correlation gives you a measurement that can be interpreted independently of the scale of the two
variables. Correlation is always bounded by ±1. The closer the correlation is to ±1 the closer the two
variables are to a perfectly linear relationship.
The regression slope by itself does not tell you that. The regression slope tells you the expected
change in the dependent variable y when the independent variable x changes one unit. That
information cannot be calculated from the correlation alone.
A fallout of those two points is that correlation is a unit-less value, while the slope of the regression line has
units. If for instance, you owned a large business and were doing an analysis on the amount of revenue in
each region compared to the number of salespeople in that region, you would get a unit-less result with
correlation, and with regression, you would get a result that was the amount of money per person.
Regression Equations
With linear regression, we are trying to solve for the equation of a line, which is shown below.
Y = a + bx
The values that we need to solve for are ‘b’, the slope of the line, and ‘a’, the intercept of the line. The hardest
part of calculating the slope ‘b’ is finding the correlation between x and y, which we have already done. The
only modification that needs to be made to that correlation is multiplying it by the ratio of the standard
deviations of y and x, which we also already calculated when finding the correlation. The equation for the
slope is
b = r·(Sy / Sx)
Once we have the slope, getting the intercept is easy. The standard equations for correlation and standard
deviation go through the averages of x and y (x̄, ȳ), so the equation for the intercept is
a = ȳ – b·x̄
The best way to determine whether this is a simple linear regression problem is to plot Marks vs Hours.
If the plot looks like the one below, it may be inferred that a linear model can be used for this problem.
The data represented in the above plot would be used to find a line such as the following, which represents
a best-fit line. The slope of the best-fit line would be the value of “m”.
The value of m (the slope of the line) can be determined using an objective function, which is a combination of
a loss function and a regularization term. For simple linear regression, the objective function is the Mean
Squared Error (MSE): the mean of the squared distances between the target variable (actual marks) and the
predicted values (marks calculated using the above equation). The best-fit line is obtained by minimizing this
objective function.
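One way to see the minimization idea is to evaluate the squared-error objective for a few candidate slopes and keep the best one. The (hours, marks) pairs below are invented, and the intercept is fixed at 0 purely to keep the sketch short:

```python
# Illustrative data: (hours studied, marks scored), made up for this sketch
data = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8)]

def mse(m):
    """Mean squared error of the line y = m·x against the data."""
    return sum((y - m * x) ** 2 for x, y in data) / len(data)

# Try several candidate slopes and keep the one with the smallest error
candidates = [1.5, 1.8, 2.0, 2.2, 2.5]
best = min(candidates, key=mse)
print(best)  # 2.0 — the data were generated close to y = 2x
```

In practice the minimum is found analytically (the normal equations) or by gradient descent rather than by a grid of guesses, but the objective being minimized is the same.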
A statistics instructor at a university would like to examine the relationship (if any) between the number of
optional homework problems students do during the semester and their final course grade. She randomly
selects 12 students for study and asks them to keep track of the number of these problems completed during
the course of the semester. At the end of the class each student’s total is recorded along with their final grade.
The data is available in the following table (X = problems completed, Y = final grade; the third column is the
product XY):
51 62 3162
58 68 3944
62 66 4092
65 66 4290
68 67 4556
76 72 5472
77 73 5621
78 72 5616
78 78 6084
84 73 6132
85 76 6460
91 75 6825
7) Use the regression equation to predict a student’s final course grade if 75 optional homework
assignments are done.
8) Use the regression equation to compute the number of optional homework assignments that need to be
completed if a student expects a course grade of 85
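A sketch of fitting the line for questions 7 and 8, using the twelve (X, Y) pairs from the table above (the XY column is recomputed here rather than read in):

```python
# X = optional homework problems completed, Y = final course grade
homework = [51, 58, 62, 65, 68, 76, 77, 78, 78, 84, 85, 91]
grades   = [62, 68, 66, 66, 67, 72, 73, 72, 78, 73, 76, 75]

n = len(homework)
sx, sy = sum(homework), sum(grades)
sxy = sum(x * y for x, y in zip(homework, grades))   # reproduces the XY column total
sx2 = sum(x * x for x in homework)

# Slope and intercept from the normal equations
b = (n * sxy - sx * sy) / (n * sx2 - sx ** 2)
a = (sy - b * sx) / n
print(round(a, 2), round(b, 3))   # intercept and slope of grade = a + b·homework
print(round(a + b * 75, 1))       # predicted grade for 75 problems (question 7)
```

Question 8 inverts the same equation: set a + b·x = 85 and solve for x.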
Problem 2
The following data set of the heights and weights of a random sample of 15 male students is acquired. Is
there any apparent relationship between the two variables?
1 5 ft 6 inch 60 kgs
2 5 ft 4 inch 55 kgs
3 5 ft 8 inch 78 kgs
5 5 ft 4 inch 53 kgs
6 5 ft 7 inch 56 kgs
7 5 ft 3 inch 54 kgs
9 5 ft 6 inch 74 kgs
10 5 ft 3 inch 65 kgs
11 5 ft 9 inch 76 kgs
14 5 ft 4 inch 63 kgs
15 5 ft 7 inch 62 kgs
Would you expect the same relationship (if any) to exist between the heights and weights of the opposite
sex?
Problem 3
From the following data of hours worked in a factory (x) and output units (y), determine the regression line
of y on x, the linear correlation coefficient and determine the type of correlation.
Hours (X) 80 79 83 84 78 60 82 85 79 84 80 62
Production (Y) 300 302 315 330 300 250 300 340 315 330 310 240
Problem 4
The height (in cm) and weight (in kg) of 10 basketball players on a team are as below:
Height (X) 186 189 190 192 193 193 198 201 203 205
Weight (Y) 85 85 86 90 87 91 93 103 100 101
Calculate:
Unit 9
Classification
There are many practical business applications for machine learning classification. For example, if
you want to predict whether or not a person will default on a loan, you need to determine if that
person belongs to one of two classes with similar characteristics: the defaulter class or the non-
defaulter class. This classification helps you understand how likely the person is to become a
defaulter, and helps you adjust your risk assessment accordingly.
Objectives:
1. The main goal of this unit is to help students learn and understand classification problems
2. Define Classification and list its algorithms
3. Student should understand classification as a type of supervised learning
Learning Outcomes:
I. Describe the input and output of a classification model
II. Students should be able to differentiate a regression problem from a classification
problem
Pre-requisites: Concept of machine learning and artificial intelligence. Understanding of
supervised and unsupervised learning and Regression Analysis.
1. Classification
Almost every day, we deal with classification problems. Here are a few interesting examples to illustrate the
widespread applications of classification problems.
Case 1:
A credit card company typically receives hundreds of applications for a new credit card. Each application
contains information regarding several attributes, such as annual salary, outstanding debt, age, etc. The
problem is to categorize applications into those with good credit, bad credit, or somewhere in the middle.
Categorizing the applications is nothing but a classification problem.
Case 2:
You may want to own a dog, but which kind of dog? This is the beginning of a classification problem. Dogs can
be classified in a number of different ways. For example, they can be classified by breed (beagles, hounds,
pugs and countless others). They can also be classified by their role in the lives of their masters and the work
they do (a dog might be a family pet, a working dog, a show dog, or a hunting dog). In many cases, dogs are
defined both by their breed and their role. Based on different classification criteria, you eventually decide
which one you want to own.
Case 3:
A common example of classification comes with detecting spam emails. To write a program to filter out
spam emails, a computer programmer can train a machine learning algorithm with a set of spam-like emails
labeled as “spam” and regular emails labeled as “not-spam”. The idea is to make an algorithm that can learn
characteristics of spam emails from this training set so that it can filter out spam emails when it encounters
new emails.
Activity-1
Look at the pictures below and tell me whether the fruit seller knows the art of classification or not. Justify
your answer.
In order to understand ‘Classification’, let us revise the concept of ‘Supervised Learning’, because classification
is a type of supervised learning.
Supervised learning, as the name indicates, involves a supervisor acting as a teacher. Basically, supervised
learning is learning in which we teach or train the machine using data that is well labeled, i.e., data that is
already tagged with the correct answer. After that, the machine is provided with a new set of examples
(data), and the supervised learning algorithm uses what it learnt from the labeled training data (the set of
training examples) to produce a correct outcome for the new data.
For instance, suppose you are given a basket filled with different kinds of fruits. Now the first step is to train
the machine with all different fruits one by one like below:
Now suppose that after training, you present a new fruit (say a
banana) from the basket and ask the machine to identify it. Since the
machine has already learnt from the previous data, it will use that learning
to classify the fruit based on its shape and color, confirm the fruit as a
BANANA and place it in the banana category. Thus the machine learns
from the training data (the basket of fruits) and then applies that
knowledge to test data (the new fruit).
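The fruit-basket idea above can be sketched as a tiny nearest-neighbour classifier: "train" on labeled examples, then label a new example by its closest neighbour. This is only a minimal illustration; the (length, colour-score) feature values below are invented for the sketch, not taken from any real dataset.

```python
# A minimal sketch of supervised learning on the fruit basket:
# classify a new fruit by its nearest labeled neighbour.

def nearest_neighbor(train, new_point):
    """Return the label of the training example closest to new_point."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(train, key=lambda ex: dist(ex[0], new_point))[1]

# Training data: (features, label) pairs the "machine" has already seen.
# Features are hypothetical (length_cm, color_score) values.
training_basket = [
    ((18.0, 0.9), "banana"),   # long and yellow
    ((17.5, 0.8), "banana"),
    ((7.0, 0.1), "apple"),     # short and red
    ((7.5, 0.2), "apple"),
]

# A new fruit from the basket: long and yellow, so it lands in the banana category.
print(nearest_neighbor(training_basket, (19.0, 0.85)))   # banana
```

The machine never saw this exact fruit during training; it classifies it by similarity to what it has already learnt.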
Classification: A classification problem is when the output variable is a category, such as “red” or
“blue”, or “disease” and “no disease”.
Regression: A regression problem is when the output variable is a real value, such as a price in INR, a
weight in kilograms or a temperature in Fahrenheit.
Of these two types of supervised learning problems, the focus of the current unit is classification.
Let’s say, you live in a gated housing society and your society has separate dustbins for different types of
waste: paper waste, plastic waste, food waste and so on. What you are basically doing over here is
classifying the waste into different categories and then labeling each category.
In the below picture, we are assigning the labels ‘paper’, ‘metal’, ‘plastic’, and so on to different types of
waste.
Let’s say you own a shop and you want to figure out whether one of your customers is going to visit your shop
again or not. The answer to that question can only be a ‘Yes’ or a ‘No’; there can’t be a third type of answer to
such a question. In Machine Learning, problems of this kind are known as classification problems.
Classification problems normally have a categorical output like ‘yes’ or ‘no’, ‘1’ or ‘0’, ‘true’ or ‘false’.
Say you want to check if on a particular day, it will rain or not. In this case the answer is dependent on the
weather condition and based on the same, the outcome can either be ‘Yes’ or ‘No’.
Question 2: Look at the two graphs below and suggest which graph represents the classification problem.
Graph 1 Graph 2
Question 3: “Predicting stock price of a company on a particular day” - is it a classification problem? Justify
your answer.
Question 4: “Predicting whether India will lose or win a cricket match “- is it a regression problem? Justify
your answer.
Example 2: Speech Understanding: Given an utterance from a user, identify the specific request made by the
user. A model of this problem would allow a program to understand and attempt to fulfill that request; e.g.,
Siri, Cortana and Google Now have this capability.
Example 3: Face Detection: Given a digital photo album of many hundreds of digital photographs, identify
those photos that include a given person. A model of this decision process would allow a program to organize
photos by person. Some cameras and software like Facebook, Google Photos have this capability.
Activity:
Form a group of 5 students. Each group should think and come up with one use case from the classroom
environment or their home/society, where they would like to apply classification algorithm to solve the
problem.
There are two main types of classification tasks that you may encounter:
i) Binary Classification: classification with only 2 distinct classes, i.e. 2 possible outcomes
ii) Multi-Class Classification: classification with more than two distinct classes.
Typically, binary classification involves one class that is the normal state and another class that is the abnormal
state. For example, “not spam” is the normal state and “spam” is the abnormal state. Another example is
“cancer not detected” is the normal state of a task that involves a medical test and “cancer detected” is the
abnormal state.
The class for the normal state is assigned the class label 0 and the class with the abnormal state is assigned
the class label 1.
Popular algorithms used for binary classification include:
Logistic Regression
k-Nearest Neighbors
Decision Trees
Support Vector Machine
Out of these binary classification algorithms, we are going to study ‘Logistic Regression’.
Given some feature x, logistic regression tries to find out whether some event y happens or not, so y can
either be 0 or 1. If the event happens, y is given the value 1; if the event does not happen, y is given the
value 0. For example, if y represents whether a sports team wins a match, then y will be 1 if the team wins
the match and 0 if it does not.
An example is the logistic curve, on which the values of y can never be less than 0 or greater than 1.
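A minimal sketch of the logistic (sigmoid) function illustrates this bound: whatever real number goes in, the output always stays strictly between 0 and 1.

```python
import math

def sigmoid(z):
    """Logistic function: squashes any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# However extreme the input, the output never leaves (0, 1).
for z in (-10, -1, 0, 1, 10):
    assert 0 < sigmoid(z) < 1

print(sigmoid(0))   # 0.5, exactly halfway between the two classes
```

A classifier built on this curve can therefore interpret the output as a probability and apply a threshold (commonly 0.5) to decide between the two classes.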
Example 1: Spam detection is a binary classification problem where we are given an email and we need to
classify whether or not it is spam. If the email is spam, we label it 1; if it is not spam, we label it 0. In order to
apply Logistic Regression to the spam detection problem, the following features of the email are extracted:
Occurrence of words/phrases like “offer”, “prize”, “free gift”, “lottery”, “you won cash” and more
The resulting feature vector is then used to train a Logistic classifier which emits a score in the range 0 to 1. If
the score is more than 0.5, we label the email as spam. Otherwise, we don’t label it as spam.
Example 2: A Logistic Regression classifier may be used to identify whether a tumor is malignant or benign.
Several medical imaging techniques are used to extract various features of tumors, for instance the size of
the tumor, the affected body area, etc. These features are then fed to a Logistic Regression classifier to
identify whether the tumor is malignant or benign.
The above two problems are solved using the logistic regression algorithm because in both cases there are
only two possible labels, spam/not spam and malignant/benign, i.e. they are binary classification problems.
2.2 True positives, true negatives, false positives and false negatives
In the field of machine learning / artificial intelligence, a matrix (an NxN table) is used to validate how
successful a classification model’s (i.e. classifier’s) predictions are, where N is the number of target classes.
The confusion matrix compares the actual target values with those predicted by the classification model. This
shows how well the classification model is performing and what kinds of errors it is making.
But wait: what are TP, FP, FN and TN here? That is exactly what we have to understand in a confusion matrix.
Let’s go through each term below.
True Positive (TP): the actual value was positive and the classification model also predicted positive; there is no error.
True Negative (TN): the actual value was negative and the classification model also predicted negative; there is no error.
False Positive (FP): the actual value was negative but the model predicted a positive value.
False Negative (FN): the actual value was positive but the model predicted a negative value.
True Positive (TP) - Umpire gives a batsman NOT OUT when he is NOT OUT.
True Negative (TN) - Umpire gives a Batsman OUT when he is OUT.
Question 1:
Assume there are 100 images, 30 of them depict a cat, the rest do not. A machine learning model predicts the
occurrence of a cat in 25 of 30 cat images. It also predicts absence of a cat in 50 of the 70 no cat images.
In this case, what are the true positive, false positive, true negative and false negative?
Confusion Matrix (positive class = cat):

                   Predicted: not-cat   Predicted: cat
Actual: not-cat         TN = 50            FP = 20
Actual: cat             FN = 5             TP = 25
True Positive (TP): images which are cat and are predicted as cat, i.e. 25
True Negative (TN): images which are not-cat and are predicted as not-cat, i.e. 50
False Positive (FP): images which are not-cat but are predicted as cat, i.e. 20
False Negative (FN): images which are cat but are predicted as not-cat, i.e. 5
Precision: TP/(TP+FP)
Recall: TP/(TP+FN)
For the cat example:
Precision: 25/(25+20) ≈ 0.556
Recall: 25/(25+5) ≈ 0.833
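The counts and metrics above can be verified with a short sketch that rebuilds the 100-image cat example and computes precision and recall; the label strings used here are arbitrary placeholders.

```python
def confusion_counts(actual, predicted, positive="cat"):
    """Count TP, FP, TN, FN, treating `positive` as the positive class."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == positive and p == positive)
    fp = sum(1 for a, p in zip(actual, predicted) if a != positive and p == positive)
    tn = sum(1 for a, p in zip(actual, predicted) if a != positive and p != positive)
    fn = sum(1 for a, p in zip(actual, predicted) if a == positive and p != positive)
    return tp, fp, tn, fn

# Rebuild the 100-image example: 30 cat images (25 detected),
# 70 no-cat images (50 correctly rejected, 20 wrongly flagged as cat).
actual    = ["cat"] * 30 + ["no-cat"] * 70
predicted = ["cat"] * 25 + ["no-cat"] * 5 + ["cat"] * 20 + ["no-cat"] * 50

tp, fp, tn, fn = confusion_counts(actual, predicted)
precision = tp / (tp + fp)
recall    = tp / (tp + fn)
print(tp, fp, tn, fn)                          # 25 20 50 5
print(round(precision, 3), round(recall, 3))   # 0.556 0.833
```

Note that precision and recall answer different questions: precision asks "of everything I called cat, how much really was?", while recall asks "of all the real cats, how many did I find?".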
Confusion Matrix Example 1: Do you still remember the shepherd boy story?
“A shepherd boy used to take his herd of sheep across the fields to the lawns near the forest. One day he felt very bored.
He wanted to have fun. So he cried aloud "Wolf, Wolf. The wolf is carrying away a lamb". Farmers working in the fields
came running and asked, "Where is the wolf?". The boy laughed and replied "It was just for fun. Now get going all of
you".
The boy played the trick for quite a number of times in the next few days. After some days, as the boy was perched on a
tree, singing a song, there came a wolf. The boy cried loudly "Wolf, Wolf, the wolf is carrying a lamb away." There was
no one to the rescue. The boy shouted "Help! Wolf! Help!" Still no one came to his help. The villagers thought that the
boy was playing mischief again. The wolf carried a lamb away“
True Positive: the boy cried “Wolf” and the wolf had really come. Outcome: the shepherd is a hero.
False Positive: the boy cried “Wolf” but there was no wolf. Outcome: the villagers are angry at the shepherd for waking them up.
False Negative: the wolf came but no alarm was raised (or believed). Outcome: the wolf ate all the sheep.
True Negative: there was no wolf and no alarm. Outcome: everyone is fine.
A true positive is an outcome where the model correctly predicts the positive class. Similarly, a true negative
is an outcome where the model correctly predicts the negative class.
A false positive is an outcome where the model incorrectly predicts the positive class. And a false negative is
an outcome where the model incorrectly predicts the negative class.
Question 2:
Assume there are 100 images, 30 of them depict a cat, the rest do not. A machine learning model predicts the
occurrence of a cat in 25 of 30 cat images. It also predicts absence of a cat in 50 of the 70 no cat images.
In this case, what are the true positives, false positives, true negatives and false negatives? This time, take
cat as the negative class.
Question 3:
Below is a confusion matrix prepared for a binary classifier to detect email as Spam and Not Spam.
While many of today’s medical tests are accurate and reliable, false positives and false negatives still occur,
and their implications can be severe for patients, families and society.
A false positive prompts patients to take medication or treatment they don’t really need. Perhaps even more
dangerous is the ‘false negative’: a test that says you don’t have a disease or condition that you actually have.
We most often hear about false negatives in the context of home pregnancy tests, which are more prone to
giving false negatives than false positives. However, when it comes to screening for more serious conditions
like HIV or cancer, a false negative can have dire repercussions.
Case 1:
Consider a health prediction case where one wants to diagnose cancer. Imagine that detecting cancer triggers
further analysis (the patient is not immediately treated), whereas if you don't detect cancer, the patient is
sent home without further examination.
This case is thus asymmetric, since you definitely want to avoid sending home a sick patient (a false negative).
You can, however, make the patient wait a little longer by asking him/her to take more tests even if the initial
results are negative for cancer (a false positive). In this situation, you would prefer a false positive over a false
negative.
Case 2:
Imagine a patient taking an HIV test. The impact of a false positive on the patient would at first be
heartbreaking: having to deal with the trauma of facing this news and telling family and friends. On further
examination, however, the doctors would find out that the person in question does not have the virus. Again,
this would not be a particularly pleasant experience, but not having HIV is ultimately a good thing.
On the other hand, a false negative would mean that the patient has HIV but the test shows a negative result.
The implications of this are terrifying: the patient would miss out on crucial treatment and runs the risk of
spreading the virus.
Without much doubt, the false negative here is the bigger problem. Both for the person and for society.
Q 2: The sinking of the Titanic is one of the most infamous shipwrecks in history. On April 15, 1912, during her
maiden voyage, the RMS Titanic sank after colliding with an iceberg. Unfortunately, there weren’t enough
lifeboats for everyone onboard, resulting in the death of 1502 out of 2224 passengers and crew.
While there was some element of luck involved in surviving, it seems some groups of people were more likely
to survive than others. In this challenge, we ask you to build a predictive model that answers the question:
“what sorts of people were more likely to survive?” using passenger data (i.e. name, age, gender,
socio-economic class, etc.). Please refer to: https://www.kaggle.com/c/titanic
Q 3: Why can’t linear regression be used in place of logistic regression for binary classification?
Q 5: What is true positive rate (TPR), true negative rate (TNR), false-positive rate (FPR), and false-negative
rate (FNR)?
Activity 1:
You may have heard a lot about Artificial Neural Networks (ANN), Deep Learning (DL) and Machine Learning
(ML). You must have also heard about the different training algorithms like clustering, classification etc. and
would like to learn more. But when you learn about the technology from a textbook, you may find yourself
overwhelmed by mathematical models and formulae.
To make this easy and interesting, there is an awesome tool to help you grasp the idea of neural networks and
of different training algorithms such as classification and clustering. This tool is called TensorFlow Playground,
a web app written in JavaScript that lets you play with a real neural network running in your browser: you can
click buttons and tweak parameters to see how it works.
First, we will start with understanding some of the terms in the above picture.
I. Data
We have six different data sets: Circle, Exclusive OR (XOR), Gaussian, Spiral, Plane and Multi Gaussian. The first
four are for classification problems and the last two are for regression problems. The small circles are the data
points, which correspond to positive one and negative one. In general, positive values are shown in blue and
negative values in orange.
In the hidden layers, the lines are colored by the weights of the connections between neurons. Blue shows a
positive weight, which means the network is using that output of the neuron as given. An orange line shows
that the network is assigning a negative weight.
In the output layer, the dots are colored orange or blue depending on their original values. The background
color shows what the network is predicting for a particular area. The intensity of the color shows how
confident that prediction is.
II. Features
We have seven features or inputs (X1, X2, their squares, their product and their sines). We can turn different
features on and off to see which features are more important. It is a good example of feature engineering.
III. Epoch
One epoch is one complete pass of the training data through the network; the epoch counter shows how many passes have been completed.
IV. Learning Rate
The learning rate (alpha) is responsible for the speed at which the model learns.
V. Activation Function
We may skip this term for now, but for the purpose of the activity you may choose any one of the four given
activation functions (Tanh, ReLU, Sigmoid and Linear). We will read about this in the next class.
VI. Regularization
Regularization (None, L1 or L2) and the regularization rate help control overfitting; for this activity you may leave them at their default values.
VII. Neural Network
A neural network model is a network of simple elements called neurons, which receive input, change their
internal state (activation) according to that input, and produce output depending on the input and activation.
The simplest neural network, called a shallow neural network, has one input layer, one output layer and at
least one hidden layer; when there are three or more hidden layers, we call it a deep neural network. Each
hidden layer is made of working elements called neurons that take input from the features or from
predecessor neurons and calculate a linear function (z) and an output function (a).
VIII. Problem Type
We have four data sets for classification and two for regression problems. We can select the type of problem
we want to study.
IX. Output
Check the model performance after training the neural network. Observe the Test loss and Training loss of
the model.
Now add the third feature, the product (X1X2), and observe the losses.
This is how you can understand the value of features and how to get good results in a minimum number
of steps.
Set the learning rate to 0.03, and also check how the learning rate plays an important role in training a
neural network.
Since you have already learnt about regression, you may also play with the regression datasets so that you
have a clear idea about regression.
Select 2 hidden layers; set 4 neurons for the first hidden layer and 2 neurons for the second hidden
layer, followed by the output.
Starting from the first layer, the weights are passed on to the first hidden layer, which produces an
output from each neuron; the outputs of the second hidden layer are then mixed with different
weights. Weights are represented by the thickness of the lines.
The final output shows the Train and Test loss of the neural network.
First, you will be learning about the purpose of clustering and how it applies to the real world.
Second, you will get a general overview of clustering such as K-means clustering.
We will also try to understand the implementation of clustering algorithm to solve some real world
problems.
Objectives:
1. The main goal of this unit is to help students learn and understand clustering problems
2. Define Clustering and list its algorithms
3. Understand clustering as a type of unsupervised learning
Learning Outcomes:
1. Describe the input and output of a clustering model
2. Students should be able to differentiate between supervised and unsupervised learning
3. Students also should be able to differentiate classification problems from clustering
problems.
Pre-requisites: Understanding of supervised and unsupervised learning
Key Concepts: Clustering algorithms in machine learning
1. Clustering
Consider that you have a large collection of books that you have to arrange by category on a bookshelf.
Since you haven’t read all the books, you have no idea about their contents. You start by bringing the books
with similar titles together.
For example, you would arrange books like the “Harry Potter” series in one corner and the “Famous Five”
series in another.
Harry Potter Series (Cluster -1) Famous Five series collection (Cluster – 2)
This is your first experience with clustering, where the books are clustered according to the similarity in
their titles. There could be many other criteria for clustering, such as clustering based on author, genre, year
of publication, hardcover vs. paperback, etc.
When I visit a city, I like to walk as much as possible, but I also want to optimize my time to see as many
attractions as possible. Suppose I am planning my next trip to Mumbai for four days. I have researched online
and made a list of 20 places that I would like to visit during this trip. In order to optimize time and cover all
the shortlisted places, I will need to bucket (“cluster”) the places based on their proximity to each other.
Creating these buckets is in fact a method of clustering. In one way or another, we perform the process of
clustering almost every day.
Let us take another example to understand clustering. Imagine X owns a chain of flavored milk parlors. The
parlors sell milk in 2 flavors, Strawberry (S) and Chocolate (C), across 8 outlets. The table below shows the
sales of both strawberry and chocolate flavored milk across the eight outlets.

Outlet     Strawberry (S)   Chocolate (C)
Outlet 1        12                6
Outlet 2        15               16
Outlet 3        18               17
Outlet 4        10                8
Outlet 5         8                7
Outlet 6         9                6
Outlet 7        12                9
Outlet 8        20               18
In order to get a better understanding of the sales data, you can plot it on a graph. Below we have plotted
the sales of both strawberry and chocolate. The eight dots in this graph represent the 8 stores; the Y-axis
indicates strawberry sales and the X-axis indicates chocolate sales.
After analyzing this graph, you will have better insight into the sales data and see a pattern emerging: two
groups of stores that behave slightly differently in terms of their strawberry and chocolate sales. This is
essentially how clustering works.
Marketing: If you run a business, it is crucial that you target the right people. Clustering algorithms
can group together people with similar traits and a similar likelihood to purchase your product/service.
Once you have identified the groups, target your messaging to them to increase the probability of a sale.
Insurance: Identifying groups of motor insurance policy holders with a high average claim cost;
Identifying frauds
City-planning: Identifying groups of houses according to their house type, value and geographical
location
WWW: Document classification; clustering weblog data to discover groups of similar access patterns
Identifying Fake News: Fake news is being created and spread at a rapid rate due to technology
innovations such as social media, and clustering algorithms are being used to identify fake news based
on its content. The algorithm works by taking in the content of an article, examining the words used
and then clustering them. These clusters are what help the algorithm determine which pieces are
genuine and which are fake. Certain words are found more commonly in fake articles, so the more of
these words an article contains, the higher the probability that it is fake news.
1. Prepare the data: Data preparation determines the set of features that will be available to the clustering
algorithm. For the clustering strategy to be effective, the data representation must either select descriptive
features from the input set (feature selection) or generate new features based on the original set (feature
extraction). In this stage, we also normalize, scale and transform the feature data.
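As a small illustration of the normalization mentioned in this step, here is a sketch of min-max scaling, one common way to rescale a feature into the [0, 1] range; the feature values below are invented for the example.

```python
def min_max_scale(values):
    """Rescale a list of numbers so the smallest becomes 0.0 and the largest 1.0."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# A hypothetical feature column, e.g. annual spending in thousands.
print(min_max_scale([20, 40, 60, 100]))   # [0.0, 0.25, 0.5, 1.0]
```

Scaling like this stops a feature with large raw values (such as income) from dominating one with small values (such as number of purchases) when distances are computed.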
2. Create similarity metrics: To calculate the similarity between two examples, you need to combine all the
feature data for those examples into a single numeric value. For instance, consider a shoe data set with only
one feature, “shoe size”. You can quantify how similar two shoes are by calculating the difference between
their sizes: the smaller the numerical difference between sizes, the greater the similarity between the shoes.
Such a handcrafted similarity measure is called a manual similarity measure. The similarity measure is critical
to any clustering technique and must be chosen carefully.
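The shoe-size idea can be sketched as a manual similarity measure. The exact formula and the `max_diff` cap below are illustrative assumptions, not a standard definition; the point is only that identical sizes score 1.0 and very different sizes score near 0.

```python
def shoe_similarity(size_a, size_b, max_diff=10.0):
    """Manual similarity measure: 1.0 for identical sizes, shrinking toward 0
    as the size difference grows. max_diff caps the largest difference counted."""
    return 1.0 - min(abs(size_a - size_b), max_diff) / max_diff

print(shoe_similarity(8, 8))              # 1.0  (identical shoes)
print(round(shoe_similarity(8, 9), 2))    # 0.9  (very similar)
print(round(shoe_similarity(5, 12), 2))   # 0.3  (quite different)
```

With more than one feature (say size and price), the per-feature similarities would have to be combined into one number, which is exactly why feature scaling and the choice of measure matter.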
3. Run the clustering algorithm: In machine learning, you sometimes encounter datasets with millions of
examples. ML algorithms must scale efficiently to such large datasets, but many clustering algorithms do not
scale because they need to compute the similarity between all pairs of points. There are many different
approaches to clustering data; roughly speaking, clustering algorithms can be classified as hierarchical or
partitioning, though more comprehensive taxonomies of clustering techniques exist.
4. Interpret the results: Because clustering is unsupervised, no “truth” is available to verify results. The
absence of truth complicates assessing quality. In this situation, interpretation of results becomes crucial.
2. Types of Clustering
In fact, more than 100 clustering algorithms are known, but only a few are popularly used. Let’s look at them
in detail:
1. Centroid-based clustering organizes the data into non-hierarchical clusters. K-means is the most widely
used centroid-based clustering algorithm. Centroid-based algorithms are efficient but sensitive to initial
conditions and outliers. This course focuses on k-means because it is an efficient, effective and simple
clustering algorithm.
2. Density-based clustering connects areas of high example density into clusters. This allows for arbitrary-
shaped distributions as long as dense areas can be connected. These algorithms have difficulty with data of
varying densities and high dimensions. Further, by design, these algorithms do not assign outliers to clusters.
4. Hierarchical clustering creates a tree of clusters. Hierarchical clustering, not surprisingly, is well suited to
hierarchical data, such as taxonomies (see “Comparison of 61 Sequenced Escherichia coli Genomes” by Oksana
Lukjancenko, Trudy Wassenaar and Dave Ussery for an example). A further advantage is that any number of
clusters can be chosen by cutting the tree at the right level.
Out of several approaches to clustering mentioned above, the most widely used clustering algorithm is -
“centroid-based clustering using k-means”.
In the cluster assignment step, the algorithm goes through each data point and assigns it to one of the cluster
centroids. The assignment of a data point to a particular cluster is determined by how close the data point is
to that centroid.
In the move centroid step, k-means moves each centroid to the average of the points in its cluster. In other
words, the algorithm calculates the average of all the points in a cluster and moves the centroid to that
average location. This process is repeated until every data point has a cluster and the cluster assignments no
longer change. The starting positions of the centroids are chosen randomly.
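The assign-then-move loop can be sketched in plain Python as a minimal illustration, not a production implementation. Here it is applied to the eight-outlet (strawberry, chocolate) sales data from the flavored-milk example earlier in this unit; the two starting centroids are fixed arbitrarily rather than chosen randomly so that the run is repeatable.

```python
import math

def kmeans(points, centroids, iters=10):
    """Plain k-means: assign each point to its nearest centroid, then move each
    centroid to the mean of its assigned points; repeat for a fixed number of passes."""
    for _ in range(iters):
        # Cluster assignment step: each point joins its nearest centroid.
        clusters = {i: [] for i in range(len(centroids))}
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move centroid step: each centroid moves to the mean of its points.
        centroids = [
            tuple(sum(c) / len(c) for c in zip(*pts)) if pts else centroids[i]
            for i, pts in clusters.items()
        ]
    return centroids, clusters

# The eight-outlet sales data: (strawberry, chocolate) pairs.
sales = [(12, 6), (15, 16), (18, 17), (10, 8),
         (8, 7), (9, 6), (12, 9), (20, 18)]

# Two arbitrary starting centroids, i.e. K = 2.
centroids, clusters = kmeans(sales, [(10.0, 10.0), (18.0, 18.0)])
for i, pts in clusters.items():
    # Low-selling outlets gather in one cluster, high-selling outlets in the other.
    print(i, sorted(pts))
```

Running this separates the outlets into the two groups visible on the sales graph: the five lower-selling outlets in one cluster and the three higher-selling outlets in the other.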
Example 1:
Let us see how this algorithm works using the well-known Iris flower data set -
(https://archive.ics.uci.edu/ml/datasets/iris) .
This dataset contains four measurements of three different Iris flowers. The measurements are - Sepal length,
Sepal width, Petal length, and Petal width. The three types of Iris are Setosa, Versicolour, and Virginica as
shown below in the same order.
Let's first plot the values of the petals’ lengths and widths against each other.
With just a quick glance, it is clear that there are at least two groups of flowers shown on the chart. Let’s see
how we can use a K-means algorithm to find clusters in this data.
Iteration 1: First, we create two randomly generated centroids and assign each data point to the cluster of
the closest centroid. In this case, because we are using two centroids, that means we want to create two
clusters i.e. K=2.
Iteration 2: As you can see above, the centroids are not evenly distributed. In the second iteration of the
algorithm, the average values of each of the two clusters are found and become the new centroid values.
Iterations 3-5: We repeat the process until there is no further change in the value of the centroids. K-means
is guaranteed to converge, although possibly only to a local optimum.
Disadvantages of k-means
Choosing k manually
Being dependent on initial values: For a low k, you can mitigate this dependence by running k-means
several times with different initial values and picking the best result. As ‘k’ increases, you need advanced
versions of k-means to pick better initial centroids (called k-means seeding).
Clustering data of varying sizes and density: k-means has trouble clustering data where clusters are of
varying sizes and density. To cluster such data, you need to generalize k-means as described below.
Clustering outliers: Centroids can be dragged by outliers, or outliers might get their own cluster instead
of being ignored. Consider removing or clipping outliers before clustering.
Scaling with the number of dimensions: As the number of dimensions increases, a distance-based similarity
measure converges to a constant value between any given examples. Reduce dimensionality either by
using PCA on the feature data, or by using “spectral clustering” to modify the clustering algorithm.
[Figure 1: Two graphs side by side. The first shows a dataset with somewhat obvious clusters; the second shows an odd grouping of examples after running k-means.]
To cluster naturally imbalanced clusters like the ones shown in Figure 1, you can adapt (generalize) k-means.
In Figure 2, the lines show the cluster boundaries after generalizing k-means as:
Centre plot: Allow different cluster widths, resulting in more intuitive clusters of different sizes.
Right plot: Besides different cluster widths, allow different widths per dimension, resulting in elliptical
instead of spherical clusters, improving the result.
[Figure 2: Two graphs side by side. The first is a spherical cluster example; the second is a non-spherical cluster example.]
3. Why is it Unsupervised?
In clustering, we group some data-points into several clusters. So usually clustering does not look at
target/labels instead it groups the data considering the similarities in the features. Therefore, clustering
employs a similarity function to measure the similarity between two data-points (e.g. k means clustering
measures the Euclidean distance). And feature engineering plays a key role in clustering because the feature
that you provide to the cluster decides the type of groups that you get.
For example, if you use a set of features that characterize the CPU (number of cores, clock speed, etc.) to
cluster laptops, each cluster will contain laptops with similar CPU power. If you add the price of the laptop
as a feature, you may instead get clusters that separate overpriced and economical laptops based on their
price and CPU specs.
The usual approach requires a set of labelled data, or a person to annotate the clusters.
In the third step, we try to assign a label to each cluster by looking at the data points in it. If a certain cluster has 90% overpriced laptops (based on the labelled data or human evaluation), then we label it as an overpriced-laptop cluster. In this way we may end up with multiple overpriced-laptop clusters. When we classify a new laptop, if it falls into one of those overpriced-laptop clusters, we classify it as an overpriced laptop.
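The third step described above amounts to a majority vote over the labelled points in each cluster. A small sketch (the cluster assignments and labels below are invented for illustration):

```python
from collections import Counter

# cluster id -> labels of the labelled data points that fell into it (hypothetical)
cluster_members = {
    0: ['overpriced', 'overpriced', 'economical', 'overpriced'],
    1: ['economical', 'economical', 'overpriced'],
}

# Label each cluster by the majority label among its members
cluster_labels = {
    cid: Counter(labels).most_common(1)[0][0]
    for cid, labels in cluster_members.items()
}
print(cluster_labels)  # {0: 'overpriced', 1: 'economical'}
```

A new point would then inherit the label of whichever cluster it is assigned to.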
This project requires Python and the following Python libraries installed:
NumPy
Pandas
matplotlib
scikit-learn
You will also need to have software installed to run and execute an iPython Notebook
Code
Template code is provided in the customer_segments.ipynb notebook file. Additional supporting code can be found in renders.py. While some code has already been implemented to get you started, you will need to implement additional functionality when requested to successfully complete the project.
Getting Started
In this project, you will analyze a dataset containing data on various customers' annual spending amounts (reported in monetary units) across diverse product categories, in order to uncover internal structure. One goal of this project is
to best describe the variation in the different types of customers that a wholesale distributor interacts with.
Doing so would equip the distributor with insight into how to best structure their delivery service to meet
the needs of each customer.
The dataset for this project can be found on the UCI Machine Learning Repository. For the purposes of this
project, the features 'Channel' and 'Region' will be excluded in the analysis — with focus instead on
the six product categories recorded for customers.
Run the code block below to load the wholesale customers dataset, along with a few of the necessary Python
libraries required for this project. You will know the dataset loaded successfully if the size of the dataset is
reported.
# Import libraries necessary for this project
import numpy as np
import pandas as pd
import renders as rs
from IPython.display import display # Allows the use of display() for DataFrames
try:
    data = pd.read_csv("customers.csv")
    data.drop(['Region', 'Channel'], axis=1, inplace=True)
    print("Wholesale customers dataset has {} samples with {} features each.".format(*data.shape))
except:
    print("Dataset could not be loaded. Is the dataset missing?")
Run the code block below to observe a statistical description of the dataset. Note that the dataset is
composed of six important product categories: 'Fresh', 'Milk', 'Grocery', 'Frozen', 'Detergents_Paper', and
'Delicatessen'. Consider what each category represents in terms of products you could purchase.
# Display a description of the dataset
stats = data.describe()
stats
OUTPUT:
# Construct indices
samples_bar.index = indices + ['mean']
The code is slightly long, so the full code block is not written here. Please refer to the GitHub link shared above.
In the code block below, you will need to implement the following:
- Assign new_data a copy of the data with one feature of your choice removed, using the DataFrame.drop function.
- Use sklearn.model_selection.train_test_split (named sklearn.cross_validation in older scikit-learn versions) to split the dataset into training and testing sets. Use the removed feature as your target label. Set a test_size of 0.25 and set a random_state.
- Import a decision tree regressor, set a random_state, and fit the learner to the training data.
- Report the prediction score on the testing set using the regressor's score function.
In the code block below, you will need to implement the following:
- Assign a copy of the data to log_data after applying logarithm scaling. Use the np.log function for this.
- Assign a copy of the sample data to log_samples after applying logarithm scaling. Again, use np.log.
# TODO: Scale the data using the natural logarithm
log_data = np.log(data)
Detecting outliers in the data is extremely important in the data preprocessing step of any analysis. The
presence of outliers can often skew results which take into consideration these data points.
In the code block below, you will need to implement the following:
- Assign the value of the 25th percentile for the given feature to Q1. Use np.percentile for this.
- Assign the value of the 75th percentile for the given feature to Q3. Again, use np.percentile.
- Assign the calculation of an outlier step (1.5 times the interquartile range) for the given feature to step.
- Optionally remove data points from the dataset by adding indices to the outliers list.
NOTE: If you choose to remove any outliers, ensure that the sample data does not contain any of these
points! Once you have performed this implementation, the dataset will be stored in the variable
good_data.
import itertools

# Select the indices for data points you wish to remove
outliers_lst = []

# For each feature, find the data points with extremely high or low values
for feature in log_data.columns:
    # TODO: Calculate Q1 (25th percentile of the data) for the given feature
    Q1 = np.percentile(log_data.loc[:, feature], 25)
    # TODO: Calculate Q3 (75th percentile of the data) for the given feature
    Q3 = np.percentile(log_data.loc[:, feature], 75)
    # TODO: Use the interquartile range to calculate an outlier step (1.5 times the interquartile range)
    step = 1.5 * (Q3 - Q1)
    # Rows that fall outside [Q1 - step, Q3 + step] are outliers for this feature
    outliers_rows = log_data.loc[~((log_data[feature] >= Q1 - step) & (log_data[feature] <= Q3 + step)), :]
    outliers_lst.append(list(outliers_rows.index))

outliers = list(itertools.chain.from_iterable(outliers_lst))
uniq_outliers = list(set(outliers))

# Remove the outliers, if any were specified
good_data = log_data.drop(log_data.index[outliers]).reset_index(drop=True)

# Original data
print('Original shape of data:\n', data.shape)
# Processed data
print('New shape of data:\n', good_data.shape)
In the code block below, you will need to implement the following:
- Assign the results of fitting PCA in two dimensions with good_data to pca.
- Apply a PCA transformation of good_data using pca.transform, and assign the results to reduced_data.
- Apply a PCA transformation of the sample log-data log_samples using pca.transform, and assign the results to pca_samples.
# TODO: Apply PCA by fitting the good data with only two dimensions
from sklearn.decomposition import PCA

# Instantiate
pca = PCA(n_components=2)
pca.fit(good_data)
# TODO: Transform the good data using the PCA fit above
reduced_data = pca.transform(good_data)
# TODO: Transform the sample log-data using the PCA fit above
pca_samples = pca.transform(log_samples)
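After fitting, it is worth checking how much of the total variance the two components retain; scikit-learn exposes this as explained_variance_ratio_. A self-contained sketch on synthetic data (in the project you would fit good_data instead of the invented matrix below):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
# Synthetic stand-in for good_data: 100 samples, 6 correlated features
# built from only 2 underlying factors plus a little noise
base = rng.normal(size=(100, 2))
X = base @ rng.normal(size=(2, 6)) + 0.05 * rng.normal(size=(100, 6))

pca = PCA(n_components=2).fit(X)
# Fraction of the total variance captured by each of the two components
print(pca.explained_variance_ratio_)
print(pca.explained_variance_ratio_.sum())
```

Here the sum is close to 1.0 by construction; on real data a low sum would warn you that two dimensions discard too much information.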
STEP - 6: CLUSTERING
Depending on the problem, the number of clusters that you expect to be in the data may already be known. When the number of clusters is not known a priori, there is no guarantee that a given number of clusters best segments the data, since it is unclear what structure exists in the data, if any.
In the code block below, you will need to implement your chosen clustering algorithm for each of the following numbers of clusters:
range_n_clusters = [2, 3, 4, 5, 6, 7, 8, 9, 10]
GMM Implementation
from sklearn.mixture import GMM  # named GaussianMixture in scikit-learn >= 0.20
from sklearn.metrics import silhouette_score

# Loop through clusters
for n_clusters in range_n_clusters:
    # TODO: Apply your clustering algorithm of choice to the reduced data
    clusterer = GMM(n_components=n_clusters).fit(reduced_data)
    # TODO: Predict the cluster for each data point
    preds = clusterer.predict(reduced_data)
    # TODO: Predict the cluster for each transformed sample data point
    sample_preds = clusterer.predict(pca_samples)
    # TODO: Calculate the mean silhouette coefficient for the number of clusters chosen
    score = silhouette_score(reduced_data, preds, metric='mahalanobis')
    print("For n_clusters = {}. The average silhouette_score is: {}".format(n_clusters, score))
K-Means Implementation
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Loop through clusters
for n_clusters in range_n_clusters:
    # TODO: Apply your clustering algorithm of choice to the reduced data
    clusterer = KMeans(n_clusters=n_clusters).fit(reduced_data)
    # TODO: Predict the cluster for each data point
    preds = clusterer.predict(reduced_data)
    # TODO: Predict the cluster for each transformed sample data point
    sample_preds = clusterer.predict(pca_samples)
    # TODO: Calculate the mean silhouette coefficient for the number of clusters chosen
    score = silhouette_score(reduced_data, preds, metric='euclidean')
    print("For n_clusters = {}. The average silhouette_score is: {}".format(n_clusters, score))
Cluster Visualization
Once you've chosen the optimal number of clusters for your clustering algorithm using the scoring metric above, you can visualize the results by executing the code block below. The final visualization should correspond to the optimal number of clusters.
# Extra code because we ran a loop on top and this resets to what we want
clusterer = GMM(n_components=2).fit(reduced_data)
preds = clusterer.predict(reduced_data)
centers = clusterer.means_
sample_preds = clusterer.predict(pca_samples)
Unit 10
AI Values
Objectives:
1. Understand and debate on Ethics of AI
2. Understand biases and its types
3. Scope of biases in data and how it impacts AI
Key Concepts: Data, Bias, Data Bias, Types of Bias
1. AI Values
Before we begin this chapter, let us watch a few essential videos (total watch time is about 30-35 minutes).
Activity 1: After watching the video “The Ethical Robot” what are the two ethical questions that strike you?
Write them down.
Activity 2: With the video “How to build a moral robot” as your baseline, write down the moral and ethical values you would like to incorporate in your robot. The video is only a guide; do not let it limit your imagination and creativity.
Activity 3: Form a group of 5 students and watch the video “Humans need not apply” as a group. Please watch the video more than once. At the end, submit a paper as a group on your learnings from the video.
1. IBM (https://www.research.ibm.com/artificial-intelligence/#quicklinks)
2. Google (https://ai.google/social-good/)
3. Assessing cardiovascular risk factors with computer vision
4. Agricultural productivity can be increased through digitization and analysis of images from automated drones and satellites
5. AI can help people with special needs in numerous ways. AI is getting better at doing text-to-voice translation as well as voice-to-text translation, and could thus help visually impaired people, or people with hearing impairments, to use information and communication technologies (ICTs)
6. Pattern recognition can track marine life migration, concentrations of life undersea and fishing activities
to enhance sustainable marine ecosystems and combat illegal fishing
7. With global warming, climate change, and water pollution on the rise, we could be dealing with a harsh
future. Food shortages are not something we want to add to the list. Thankfully, one startup is already
working hard on using AI for good in this regard.
8. Imago AI is an India-based agri-tech startup that aims to use AI to increase crop yields and reduce food waste. The company’s vision is to use technology to feed the world’s growing population by optimizing agricultural methods.
The company combines machine learning and computer vision to automate tedious tasks like measuring crop quality and weighing yields. This won’t just speed up the process; it will also help farmers to identify plants that have diseases. TechCrunch reports that 40% of the world’s crops are lost to disease, so the work from Imago AI could be a major breakthrough for agriculture, especially in poorer countries.
( https://www.springboard.com/blog/ai-for-good/)
UNI Global Union ( http://www.thefutureworldofwork.org/) has identified 10 key principles for Ethical AI
7. Secure a just transition and ensure support for fundamental freedoms and rights
As AI systems develop and augmented realities are formed, workers and work tasks will be displaced. It is
vital that policies are put in place that ensure a just transition to the digital reality, including specific
governmental measures to help displaced workers find new employment.
1. https://www.youtube.com/watch?v=cplucNW70II&ab_channel=TEDxTalks
2. https://www.youtube.com/watch?v=vgUWKXVvO9Q
Question 1: How do you decide if something deserves to be called intelligent? Does it have to pass exams to
earn this certificate? Apply your imagination and creativity to answer this question.
Question 2: A village needs your help to prevent the spread of a nearby forest fire. Design, develop and train
the Agent to identify what causes fires, remove materials that help fires spread, and then bring life back to a
forest destroyed by fire — all with Flowchart / pseudo code.
Be aware that while designing the AI agent, your own biases will most likely make their way into the algorithm.
Stage 2: Include more students in the group and take their perspective on your solution. You will come to know that there were biases in your solution.
Stage 3: Increase your group size by including students from different classes and different age groups. Let
them give their point of view about the solution you’ve arrived at in Stage 2 and you would be surprised to
know that the solution still has a lot of biases.
Example 1
Suppose a CCTV camera were to spot your face in a crowd outside a sports stadium. In the police data center
somewhere in the city/ country, an artificial neural network analyzes images from the CCTV footage frame-
by-frame. A floating cloud in the sky causes a shadow on your face and neural network (by mistake) finds
your face similar to the face of a wanted criminal.
If the police were to call you aside for questioning and tell you they had reason to detain you, how would you defend yourself? Was it your fault that your shadowed face bears a slight resemblance to a person in the police record?
Example 2: This happened in the USA in 2018. An AI system was being used to allocate care to nearly 200 million patients in the US. It was later discovered that the AI system was offering a lower standard of care to black patients. Across the board, black people were assigned lower risk scores than white people. This in turn meant that black patients were less likely to be able to access the necessary standard of care.
The problem stemmed from the fact that the AI algorithm was allocating risk values using the predicted cost
of healthcare. Because black patients were often less able to pay or were perceived as less able to pay for the
higher standard of care, the AI essentially learned that they were not entitled to such a standard of treatment.
Though the system was fixed and improved after the bias was discovered, the big question remains: whose problem was this? The developers of the AI system, or the US patient data (which was accurate to an extent)?
The sources of ‘Bias in AI’ usually are our own cultural, societal or personal biases regarding race, gender,
nationality, age or personal habits.
Did you answer “bananas”? Why didn’t you mention the plastic bag roll? Or the color of the banana? Or the
plastic stand holding the bananas?
Although all answers are technically correct, for some reason we have a bias to prefer one of them. Not all
people would share that bias; what we perceive and how we respond is influenced by our norms, culture
and habits. If you live on a planet where all bananas are blue, you might answer “yellow bananas” here. If
you’ve never seen a banana before, you might say “shelves with yellow stuff on them.”
Question:
Make a list of 10 biases which you observe in your home, classroom or in your society. You don’t need to
get all 10 biases in one go. You can start with one and keep adding as you observe more.
While ML and AI are technologies often dissociated from human thinking, they are always based on
algorithms created by humans. And like anything created by humans, these algorithms are prone to
incorporating the biases of their creators.
Because AI algorithms learn from data, any historical data can quickly create biased AI that bases decisions
on unfair datasets.
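One concrete way to see this is to audit the historical decisions in a dataset per group before training on it. A minimal sketch; the records below are invented for illustration:

```python
# Hypothetical historical decisions (group, approved) that would become training data
records = [
    ('group_a', True), ('group_a', True), ('group_a', False), ('group_a', True),
    ('group_b', False), ('group_b', False), ('group_b', True), ('group_b', False),
]

# Approval rate per group: a large gap is a warning sign that a model
# trained on this history will simply reproduce the disparity
rates = {}
for group in {g for g, _ in records}:
    outcomes = [approved for g, approved in records if g == group]
    rates[group] = sum(outcomes) / len(outcomes)

print(rates)  # group_a: 0.75, group_b: 0.25
```

A gap like this does not prove the model will be unfair, but it tells you the data encodes a historical disparity that needs to be examined before training.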
But there are tangible things we can do to manage bias in AI. Here are some of them:
It’s all about the data: make sure you choose a representative dataset.
Choose data that is diverse and includes different groups, to prevent your model from having trouble identifying unlabeled examples that are outside the norm. Make sure you have properly grouped and managed the data so you aren't forced to face situations similar to Google and its facial recognition system.
Most western countries in the world have a regulation that says that fire retardants must be
added to the foams and fabrics in furniture. New Zealand does not. Do you think New Zealand
should have such a regulation? YES or NO?
Give two reasons:
1. ________________________________________________________________
2. ________________________________________________________________
Activity 2: AI Bingo
Step 1: First take a look at this PPT as a class: Introduction to AI and AI Bingo_AI-Ethics.pptx
Step 2: After reviewing the introductory slides, pass out bingo cards. Bingo cards are available here:
https://www.media.mit.edu/projects/ai-ethics-for-middle-school/overview/
Step 3: Club students into teams of 2. Teams must identify the prediction the AI system is trying to make and
the dataset it might use to make that prediction. The first team to get five squares filled out in a row, diagonal,
or column wins (or, for longer play, the first student to get two rows/diagonals/columns).
Step 4: After playing, have students discuss the squares they have filled out
___________________________________________
CLASS XII
______________________________________________
Unit 1: Capstone Project
A capstone project is a project where students must research a topic independently to find a deep
understanding of the subject matter. It gives an opportunity for the student to integrate all their
knowledge and demonstrate it through a comprehensive project.
So, without further ado, let’s jump straight into some Capstone project ideas that will strengthen your base.
The list is huge, but here are some simple projects which you can consider picking up to develop.
1. Understanding The Problem
Artificial Intelligence is perhaps the most transformative technology available today. At a high level, every
AI project follows the following six steps:
1) Problem understanding
2) Data gathering
3) Feature definition
4) AI model construction
5) Model evaluation
6) Deployment
In this section, I will share the best practices for the first step: “understanding the problem”.
Begin formulating your problem by asking yourself this simple question: is there a pattern? The premise that underlies all Machine Learning disciplines is that there needs to be a pattern. If there is no pattern, then the problem cannot be solved with AI technology. It is fundamental that this question is asked before deciding to embark on an AI development journey.
If it is believed that there is a pattern in the data, then AI development techniques may be employed.
Applied uses of these techniques are typically geared towards answering five types of questions, all of
which may be categorized as being within the umbrella of predictive analysis:
It is important to determine which of these questions you’re asking, and how answering it helps you
solve your problem.
Project 1:
Form a team of 4-5 students and submit a detailed report on the most critical problems in the areas below and how AI can assist in addressing them. The report should include a description of the problem and the proposed way in which AI can solve it.
1. Agriculture in India
3. Healthcare in India
The five stages of Design Thinking are as follows: Empathize, Define, Ideate, Prototype, and Test.
Real computational tasks are complicated. To accomplish them you need to break down the problem into
smaller units before coding.
1. Understand the problem and then restate the problem in your own words
Know what the desired inputs and outputs are
Ask questions for clarification (in class these questions might be to your instructor, but
most of the time they will be asking either yourself or your collaborators)
2. Break the problem down into a few large pieces. Write these down, either on paper or as
comments in a file.
3. Break complicated pieces down into smaller pieces. Keep doing this until all of the pieces are
small.
4. Code one small piece at a time.
1. Think about how to implement it
2. Write the code/query
3. Test it… on its own.
4. Fix problems, if any
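The implement-test-fix loop in step 4 can be sketched as writing one small function at a time and checking it on its own before building the next piece on top of it. The task below (totalling an order) is invented for illustration:

```python
# Small piece 1: total cost of a single line item
def item_total(price, quantity):
    return price * quantity

# Test it on its own before writing the next piece
assert item_total(2.0, 3) == 6.0

# Small piece 2: built on top of the piece we already trust
def order_total(items):
    return sum(item_total(p, q) for p, q in items)

assert order_total([(2.0, 3), (1.5, 2)]) == 9.0
print("all pieces tested")
```

Because each piece was tested in isolation, a failure in the larger piece can only come from the new code, not the parts already verified.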
Data:

length  width  height
1       2       3
2       4       3
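As a tiny example of working on such data one small piece at a time, each row of the table above could be processed independently, e.g. computing a volume per row (the volume column is our own addition for illustration):

```python
# Rows from the table above: (length, width, height)
rows = [(1, 2, 3), (2, 4, 3)]

# Volume of a cuboid: length * width * height, computed row by row
volumes = [l * w * h for l, w, h in rows]
print(volumes)  # [6, 24]
```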
Example 2: Imagine that you want to create your first app. This is a complex problem. How would you
decompose the task of creating an app?
To decompose this task, you would need to know the answer to a series of smaller problems:
This list has broken down the complex problem of creating an app into much simpler problems that can now be worked out. You may also be able to get other people to help you with different individual parts of the app. For example, you may have a friend who can create the graphics, while another can test the app.
Example 3: (For Advanced Learners)
Time series decomposition involves thinking of a series as a combination of level, trend, seasonality, and
noise components. Decomposition provides a useful abstract model for thinking about time series
generally and for better understanding problems during time series analysis and forecasting.
The Airline Passengers dataset describes the total number of airline passengers over a period of time.
The units are a count of the number of airline passengers in thousands. There are 144 monthly
observations from 1949 to 1960.
Download the dataset to your current working directory with the filename “airline-passengers.csv“. The snippet below loads and plots the series:

from pandas import read_csv
from matplotlib import pyplot
series = read_csv('airline-passengers.csv', header=0, index_col=0)
series.plot()
pyplot.show()
Reviewing the line plot, it suggests that there may be a linear trend, but it is hard to be sure from eyeballing it. There is also seasonality, but the amplitude (height) of the cycles appears to be increasing, suggesting that it is multiplicative.
The example below decomposes the airline passenger’s dataset as a multiplicative model.
from pandas import read_csv
from matplotlib import pyplot
from statsmodels.tsa.seasonal import seasonal_decompose
series = read_csv('airline-passengers.csv', header=0, index_col=0)
result = seasonal_decompose(series, model='multiplicative', period=12)
result.plot()
pyplot.show()
Running the example plots the observed, trend, seasonal, and residual time series.
We can see that the trend and seasonality information extracted from the series does seem
reasonable. The residuals are also interesting, showing periods of high variability in the early and later
years of the series.
CLASS XII - LEVEL 3: AI INNOVATE AI TEACHER INSTRUCTION MANUAL
3. Analytic Approach
Those who work in the domain of AI and Machine Learning solve problems and answer questions
through data every day. They build models to predict outcomes or discover underlying patterns, all to
gain insights leading to actions that will improve future outcomes.
Every project, regardless of its size, starts with business understanding, which lays the foundation for
successful resolution of the business problem. The business sponsors needing the analytic solution play
the critical role in this stage by defining the problem, project objectives and solution requirements from
a business perspective. And, believe it or not—even with nine stages still to go—this first stage is the
hardest.
After clearly stating a business problem, the data scientist can define the analytic approach to solving
it. Doing so involves expressing the problem in the context of statistical and machine learning
techniques so that the data scientist can identify techniques suitable for achieving the desired outcome.
Selecting the right analytic approach depends on the question being asked. Once the problem to be
addressed is defined, the appropriate analytic approach for the problem is selected in the context of
the business requirements. This is the second stage of the data science methodology.
If the question is to determine probabilities of an action, then a predictive model might be used. If the question is to show relationships, a descriptive approach may be required. Statistical analysis applies to problems that require counts: if the question requires a yes/no answer, then a classification approach to predicting a response would be suitable.
4. Data Requirement
If the problem that needs to be resolved is "a recipe", so to speak, and data is "an ingredient", then the
data scientist needs to identify:
Prior to undertaking the data collection and data preparation stages of the methodology, it's vital to
define the data requirements for decision-tree classification. This includes identifying the necessary
data content, formats and sources for initial data collection.
In this phase the data requirements are revised and decisions are made as to whether or not the
collection requires more or less data. Once the data ingredients are collected, the data scientist will
have a good understanding of what they will be working with.
Techniques such as descriptive statistics and visualization can be applied to the data set, to assess the
content, quality, and initial insights about the data. Gaps in data will be identified and plans to either fill
or make substitutions will have to be made.
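With pandas, the assessment described above (descriptive statistics plus a check for gaps) can be sketched in a few lines; the tiny DataFrame here is invented for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical collected data with deliberate gaps
df = pd.DataFrame({
    'age': [23, 31, np.nan, 45],
    'income': [40000, 52000, 61000, np.nan],
})

# Descriptive statistics to assess content and quality
print(df.describe())

# Identify gaps that will need to be filled or substituted
print(df.isnull().sum())  # one missing value per column here
```

The counts from isnull().sum() tell you exactly where a fill-or-substitute plan is needed before modelling.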
5. Modeling Approach
Data Modeling focuses on developing models that are either descriptive or predictive.
An example of a descriptive model might examine things like: if a person did this, then they're
likely to prefer that.
A predictive model tries to yield yes/no, or stop/go type outcomes. These models are based on
the analytic approach that was taken, either statistically driven or machine learning driven.
The data scientist will use a training set for predictive modelling. A training set is a set of historical data
in which the outcomes are already known. The training set acts like a gauge to determine if the model
needs to be calibrated. In this stage, the data scientist will play around with different algorithms to
ensure that the variables in play are actually required.
The success of data compilation, preparation and modelling, depends on the understanding of the
problem at hand, and the appropriate analytical approach being taken. The data supports the
answering of the question, and like the quality of the ingredients in cooking, sets the stage for the
outcome.
Constant refinement, adjustments and tweaking are necessary within each step to ensure the outcome
is one that is solid. The framework is geared to do 3 things:
The end goal is to move the data scientist to a point where a data model can be built to answer the
question.
The train-test split procedure can be used for classification or regression problems, and with any supervised learning algorithm.
The procedure involves taking a dataset and dividing it into two subsets. The first subset is used to fit
the model and is referred to as the training dataset. The second subset is not used to train the model;
instead, the input element of the dataset is provided to the model, then predictions are made and
compared to the expected values. This second dataset is referred to as the test dataset.
The objective is to estimate the performance of the machine learning model on new data: data not used
to train the model.
This is how we expect to use the model in practice. Namely, to fit it on available data with known inputs
and outputs, then make predictions on new examples in the future where we do not have the expected
output or target values.
The train-test procedure is appropriate when there is a sufficiently large dataset available.
You must choose a split percentage that meets your project’s objectives with considerations that
include:
Now that we are familiar with the train-test split model evaluation procedure, let’s look at how we can
use this procedure in Python.
As we work with datasets, a machine learning model works in two stages. We usually split the data
around 20%-80% between testing and training stages. Under supervised learning, we split a dataset
into a training data and test data in Python ML.
Pandas
Sklearn
We use pandas to import the dataset and sklearn to perform the splitting. You can import these packages as:

import pandas as pd
from sklearn.model_selection import train_test_split

The following is the process of creating train and test sets in Python ML. So, let’s take a dataset first.
>>> data = pd.read_csv('forestfires.csv')
>>> data.head()
b. Splitting
Let’s split this data into labels and features. Using features (the data we use to make predictions), we predict labels (the values we want to predict).
>>> y = data.temp
>>> x = data.drop('temp', axis=1)
'temp' is the label we want to predict, so we assign it to y; we use the drop() function to keep all other columns in x. Then, we split the data.
>>> x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)
>>> x_train.head()
>>> x_train.shape
(413, 12)
>>> x_test.head()
>>> x_test.shape
(104, 12)
The argument test_size=0.2 means that the test data should be 20% of the dataset and the rest should be training data. From the outputs of the shape attribute, you can see that we have 104 rows in the test data and 413 in the training data.
We will demonstrate how to use the train-test split to evaluate a random forest algorithm on the housing
dataset.
The housing dataset is a standard machine learning dataset composed of 506 rows of data with 13
numerical input variables and a numerical target variable.
The dataset involves predicting the house price given details of the house is in the suburbs of the
American city of Boston.
You will not need to download the dataset; we will download it automatically as part of our worked
examples. The example below downloads and loads the dataset as a Pandas DataFrame and summarizes
the shape of the dataset.
# load and summarize the housing dataset
from pandas import read_csv
# load dataset
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/housing.csv'
dataframe = read_csv(url, header=None)
# summarize shape
print(dataframe.shape)
Running the example confirms the 506 rows of data and 13 input variables and single numeric target
variables (14 in total).
(506, 14)
First, the loaded dataset must be split into input and output components.

...
# split into input and output columns
data = dataframe.values
X, y = data[:, :-1], data[:, -1]
print(X.shape, y.shape)
Next, we can split the dataset so that 67 percent is used to train the model and 33 percent is used to evaluate it. This split was chosen arbitrarily.

...
# split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)
We can then define and fit the model on the training dataset.
...
model = RandomForestRegressor(random_state=1)
model.fit(X_train, y_train)
Then use the fit model to make predictions and evaluate the predictions using the mean absolute error (MAE) performance metric.

...
# make predictions
yhat = model.predict(X_test)
# evaluate predictions
mae = mean_absolute_error(y_test, yhat)
print('MAE: %.3f' % mae)
The complete example is listed below.

# train-test split evaluation of a random forest on the housing dataset
from pandas import read_csv
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
# load dataset
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/housing.csv'
dataframe = read_csv(url, header=None)
data = dataframe.values
# split into input and output columns
X, y = data[:, :-1], data[:, -1]
print(X.shape, y.shape)
# split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)
# define and fit the model
model = RandomForestRegressor(random_state=1)
model.fit(X_train, y_train)
# make predictions
yhat = model.predict(X_test)
# evaluate predictions
mae = mean_absolute_error(y_test, yhat)
print('MAE: %.3f' % mae)
Running the example first loads the dataset and confirms the number of rows in the input and output
elements.
The dataset is split into train and test sets and we can see that there are 339 rows for training and 167
rows for the test set.
Finally, the model is evaluated on the test set and the performance of the model when making predictions on new data is a mean absolute error of about 2.211 (thousands of dollars).

MAE: 2.211
You will face choices about predictive variables to use, what types of models to use, what arguments to
supply those models, etc. We make these choices in a data-driven way by measuring model quality of
various alternatives.
You've already learned to use train_test_split to split the data so you can measure model quality on the
test data. Cross-validation extends this approach to model scoring (or "model validation"). Compared to
train_test_split, cross-validation gives you a more reliable measure of your model's quality, though it
takes longer to run.
Imagine you have a dataset with 5000 rows. The train_test_split function has an argument for test_size
that you can use to decide how many rows go to the training set and how many go to the test set. The
larger the test set, the more reliable your measures of model quality will be. At an extreme, you could
imagine having only 1 row of data in the test set. If you compare alternative models, which one makes
the best predictions on a single data point will be mostly a matter of luck.
You will typically keep about 20% as a test dataset. But even with 1000 rows in the test set, there's some
random chance in determining model scores. A model might do well on one set of 1000 rows, even if it
would be inaccurate on a different 1000 rows. The larger the test set, the less randomness (aka "noise")
there is in our measure of model quality.
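This "noise" can be seen directly. In the illustrative sketch below (synthetic data and a simple linear model, both assumptions of this example), the same modeling process is scored on five different random train/test splits, and the scores differ from split to split:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# synthetic regression data standing in for a real dataset
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
y = X @ np.array([1.0, 2.0, 3.0]) + rng.normal(scale=1.0, size=1000)

scores = []
for seed in range(5):
    # a different random 80/20 split each time
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=seed)
    model = LinearRegression().fit(X_tr, y_tr)
    scores.append(mean_absolute_error(y_te, model.predict(X_te)))

# the spread across splits is the randomness ("noise") in the score
print([round(s, 3) for s in scores])
print('spread:', round(max(scores) - min(scores), 3))
```

The model is the same every time; only the split changes, yet the reported quality moves around.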
In cross-validation, we begin by dividing the data into several pieces, or "folds" (say, 5 folds of 20%
each). We then run an experiment, call it experiment 1, which uses the first fold as a holdout set and
everything else as training data. This gives us a measure of model quality based on a 20% holdout set,
much as we got from using the simple train-test split.
We then run a second experiment, where we hold out data from the second fold (using everything
except the 2nd fold for training the model). This gives us a second estimate of model quality. We repeat
this process, using every fold once as the holdout. Putting this together, 100% of the data is used as a
holdout at some point.
Returning to our example above from train-test split, if we have 5000 rows of data, we end up with a
measure of model quality based on 5000 rows of holdout (even if we don't use all 5000 rows
simultaneously).
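The fold-by-fold procedure described above can be sketched with scikit-learn's KFold (a minimal illustration; the 5000-row array is a synthetic stand-in for the running example):

```python
import numpy as np
from sklearn.model_selection import KFold

# synthetic dataset: 5000 rows, matching the running example
X = np.arange(5000).reshape(-1, 1)

kf = KFold(n_splits=5, shuffle=True, random_state=1)
holdout_total = 0
for i, (train_idx, holdout_idx) in enumerate(kf.split(X), start=1):
    # each experiment holds out one fold (20%) and trains on the rest
    print(f"experiment {i}: train={len(train_idx)} holdout={len(holdout_idx)}")
    holdout_total += len(holdout_idx)

# across the five experiments, every row is used as holdout exactly once
print(holdout_total)  # 5000
```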
Cross-validation gives a more accurate measure of model quality, which is especially important if you
are making a lot of modeling decisions. However, it can take more time to run, because it estimates
models once for each fold. So it is doing more total work.
Given these tradeoffs, when should you use each approach? On small datasets, the extra computational
burden of running cross-validation isn't a big deal. These are also the problems where model quality
scores would be least reliable with train-test split. So, if your dataset is smaller, you should run cross-
validation.
For the same reasons, a simple train-test split is sufficient for larger datasets. It will run faster, and you
may have enough data that there's little need to re-use some of it for holdout.
There's no simple threshold for what constitutes a large vs small dataset. If your model takes a couple
of minutes or less to run, it's probably worth switching to cross-validation. If your model takes much longer
to run, cross-validation may slow down your workflow more than it's worth.
Alternatively, you can run cross-validation and see if the scores for each experiment seem close. If each
experiment gives the same results, train-test split is probably sufficient.
Example
First we read the data
import pandas as pd
data = pd.read_csv('../input/melb_data.csv')
cols_to_use = ['Rooms', 'Distance', 'Landsize', 'BuildingArea', 'YearBuilt']
X = data[cols_to_use]
y = data.Price
Then specify a pipeline of our modeling steps. (It can be very difficult to do cross-validation properly if you
aren't using pipelines.)
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import make_pipeline
# Imputer was removed in newer scikit-learn; SimpleImputer replaces it
from sklearn.impute import SimpleImputer
my_pipeline = make_pipeline(SimpleImputer(), RandomForestRegressor())
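The listing stops before the scoring step. Below is a sketch of how the cross-validation scores discussed next can be obtained with scikit-learn's cross_val_score; synthetic data stands in for melb_data.csv, which may not be available, and SimpleImputer stands in for the deprecated Imputer:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# synthetic stand-in for the Melbourne housing features and target
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ np.array([3.0, -2.0, 1.0, 0.5, 0.0]) + rng.normal(scale=0.1, size=200)

my_pipeline = make_pipeline(SimpleImputer(), RandomForestRegressor(random_state=1))

# 'neg_mean_absolute_error': scikit-learn reports metrics so higher is better
scores = cross_val_score(my_pipeline, X, y, cv=5, scoring='neg_mean_absolute_error')
print(scores)
print('Mean Absolute Error %.2f' % (-1 * scores.mean()))
```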
You may notice that we specified an argument for scoring. This specifies what measure of model quality
to report. The docs for scikit-learn show a list of options.
It is a little surprising that we specify negative mean absolute error in this case. Scikit-learn has a
convention where all metrics are defined so a high number is better. Using negatives here allows them to
be consistent with that convention, though negative MAE is almost unheard of elsewhere.
You typically want a single measure of model quality to compare between models. So we take the
average across experiments.
print('Mean Absolute Error %.2f' % (-1 * scores.mean()))
Conclusion
Using cross-validation gave us much better measures of model quality, with the added benefit of cleaning
up our code (no longer needing to keep track of separate train and test sets). So, it's a good win.
Activity 1: Convert the code for your on-going project over from train-test split to cross-validation. Make
sure to remove all code that divides your dataset into training and testing datasets. Leaving code you don't
need any more would be sloppy.
Activity 2: Add or remove a predictor from your models. Compute the cross-validation score using both sets
of predictors, and see how the scores compare.
Knowing how good a set of predictions is allows you to estimate how good a given machine learning
model of your problem is.
You must estimate the quality of a set of predictions whenever you train a machine learning model.
Performance metrics like classification accuracy and root mean squared error can give you a clear objective
idea of how good a set of predictions is, and in turn how good the model is that generated them.
This is important as it allows you to tell the difference and select among:
Different transforms of the data used to train the same machine learning model.
Different machine learning models trained on the same data.
Different configurations for a machine learning model trained on the same data.
As such, performance metrics are a required building block in implementing machine learning
algorithms from scratch.
All the algorithms in machine learning rely on minimizing or maximizing a function, which we call the
"objective function". The group of functions that are minimized are called "loss functions". A loss
function is a measure of how good a prediction model is at predicting the expected outcome. The most
commonly used method of finding the minimum point of a function is "gradient descent". Think of the
loss function as an undulating mountain; gradient descent is like sliding down the mountain to reach
the bottommost point.
Loss functions can be broadly categorized into 2 types: Classification and Regression Loss.
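The mountain analogy can be made concrete with a tiny sketch (illustrative only, not from the manual): gradient descent on the one-parameter loss L(w) = (w - 3)^2, whose minimum sits at w = 3.

```python
# minimize L(w) = (w - 3)**2 with gradient descent
def gradient(w):
    # dL/dw = 2 * (w - 3)
    return 2 * (w - 3)

w = 0.0              # starting point on the "mountain"
learning_rate = 0.1
for _ in range(100):
    w -= learning_rate * gradient(w)  # step downhill, against the gradient

print(round(w, 4))  # converges toward the minimum at w = 3
```

Each step moves opposite the slope, so the updates shrink as the bottom of the valley is approached.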
Graphically:
As you can see in this scatter graph, the red dots are the actual values and the blue line is the set of
predicted values drawn by our model. The distance between an actual value (red dot) and the predicted
line represents the error, and we can draw such a line from each red dot to the blue line. Squaring those
distances, taking their mean, and finally taking the square root gives us the RMSE of our model.
Example 1 (RMSE)
Let us write a Python code to find out the RMSE value of our model. We will be predicting the brain
weight of the users, using linear regression to train our model. The data set used in this code can be
downloaded from here: headbrain6-
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
"""
here the directory of the code and the headbrain6.csv file is the same;
make sure both files are stored in the same folder or directory
"""
data = pd.read_csv('headbrain6.csv')
data.head()
# head size as input, brain weight as target
x = data.iloc[:, 2:3].values
y = data.iloc[:, 3:4].values
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=1/4, random_state=0)
regressor = LinearRegression()
regressor.fit(x_train, y_train)
y_pred = regressor.predict(x_test)
# visualize the training data
plt.scatter(x_train, y_train, c='red')
plt.show()
# visualize the fitted line against the test data
plt.plot(x_test, y_pred)
plt.scatter(x_test, y_test, c='red')
plt.xlabel('headsize')
plt.ylabel('brain weight')
plt.show()
# compute RSS, MSE and RMSE
rss = ((y_test - y_pred) ** 2).sum()
mse = np.mean((y_test - y_pred) ** 2)
rmse = np.sqrt(mse)
print('RMSE:', rmse)
Output
The RMSE value of our model comes out to be approximately 73, which is not bad. A good model should
have an RMSE value less than 180. If you have a higher RMSE value, you probably need to change your
features or tweak your hyperparameters.
Below is a plot of an MSE function where the true target value is 100 and the predicted values range
from -10,000 to 10,000. The MSE loss (Y-axis) reaches its minimum value at prediction (X-axis) = 100.
The range of MSE values is 0 to ∞.
MSE is sensitive to outliers, and given several examples with the same input feature values, the
optimal prediction will be their mean target value. This should be compared with Mean Absolute Error,
where the optimal prediction is the median. MSE is thus good to use if you believe that your target data,
conditioned on the input, is normally distributed around a mean value, and when it is important to
penalize outliers heavily.
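This mean-vs-median claim can be checked numerically (an illustrative sketch with made-up target values): a brute-force search over constant predictions shows the MSE minimizer lands at the mean, while the MAE minimizer lands at the median.

```python
import numpy as np

targets = np.array([1.0, 2.0, 3.0, 4.0, 100.0])  # one large outlier

# brute-force search over candidate constant predictions on a fine grid
candidates = np.linspace(0.0, 100.0, 10001)
best_mse = min(candidates, key=lambda c: np.mean((targets - c) ** 2))
best_mae = min(candidates, key=lambda c: np.mean(np.abs(targets - c)))

print(best_mse, np.mean(targets))    # near 22.0 (the mean)
print(best_mae, np.median(targets))  # near 3.0 (the median)
```

Note how the outlier drags the MSE-optimal prediction far from the bulk of the data, while the MAE-optimal prediction stays with the median.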
Use MSE when doing regression, believing that your target, conditioned on the input, is normally
distributed, and want large errors to be significantly (quadratically) more penalized than small ones.
Example-1: You want to predict future house prices. The price is a continuous value, and therefore we
want to do regression. MSE can here be used as the loss function.
Example-2: Consider the given data points: (1,1), (2,1), (3,2), (4,2), (5,4)
You can use an online calculator to find the regression equation / line.

x    y    Y' (predicted)
1    1    0.6
2    1    1.29
3    2    1.99
4    2    2.69
5    4    3.4
from sklearn.metrics import mean_squared_error
# Given values
Y_true = [1, 1, 2, 2, 4]  # Y_true = Y (original values)
# Calculated values
Y_pred = [0.6, 1.29, 1.99, 2.69, 3.4]  # Y_pred = Y'
# Compute the mean squared error
mse = mean_squared_error(Y_true, Y_pred)
print(mse)
Output: 0.21606
Title: Model Life Cycle
Approach: Hands-on, Team Discussion, Web search, Case studies
Summary: The machine learning life cycle is the cyclical process that AI or machine learning projects
follow. It defines each step that an engineer or developer should follow. Generally, every AI project
lifecycle encompasses three main stages: project scoping, design or build phase, and deployment
in production. In this unit we will go over each of them and the key steps and factors to consider
when implementing them.
The expectation out of students would be to focus more on the hands-on aspect of AI projects.
Objectives:
1. Students should develop their capstone project using AI project cycle methodologies
2. Students should be comfortable in breaking down their projects into different phases of
AI project cycle
3. Students should be in a position to choose and apply the right AI model to solve the problem
Learning Outcomes:
1. Students will demonstrate the skill of breaking down a problem in smaller sub units
according to AI project life cycle methodologies
2. Students will demonstrate proficiency in choosing and applying the correct AI or ML model
CLASS XII - LEVEL 3: AI INNOVATE AI TEACHER INSTRUCTION MANUAL
Generally, every AI project lifecycle encompasses three main stages: project scoping, design or build
phase, and deployment in production. Let's go over each of them and the key steps and factors to consider
when implementing them.
(Source : https://blog.dataiku.com/ai-projects-lifecycle-key-steps-and-considerations)
The first fundamental step when starting an AI initiative is scoping and selecting the relevant use case(s)
that the AI model will be built to address. This is arguably the most important part of your AI project. Why?
There are a couple of reasons. First, this stage involves the planning and motivational aspects of your
project, and it is important to start strong if you want your artificial intelligence project to be successful.
There's a great phrase that characterizes this stage: garbage in, garbage out. It means that if the data you
collect is no good, you won't be able to build an effective AI algorithm, and your whole project will collapse.
In this phase, it's crucial to precisely define the strategic business objectives and desired outcomes of the
project, align all the different stakeholders' expectations, anticipate the key resources and steps,
and define the success metrics. Selecting the AI or machine learning use cases and being able to evaluate
the return on investment (ROI) is critical to the success of any data project.
Once the relevant projects have been selected and properly scoped, the next step of the machine learning
lifecycle is the Design or Build phase, which can take from a few days to multiple months, depending on
the nature of the project. The Design phase is essentially an iterative process comprising all the steps
relevant to building the AI or machine learning model: data acquisition, exploration, preparation, cleaning,
feature engineering, testing and running a set of models to try to predict behaviours or discover insights
in the data.
Enabling all the different people involved in the AI project to have the appropriate access to data, tools,
and processes in order to collaborate across different stages of the model building is critical to its success.
Another key success factor to consider is model validation: how will you determine, measure, and evaluate
the performance of each iteration with regards to the defined ROI objective?
During this phase, you need to evaluate the various AI development platforms, e.g.:
Open languages: Python is the most popular, with R and Scala also in the mix.
Approaches and techniques: classic ML techniques, from regression all the way to state-of-the-art GANs and RL.
Development tools: DataRobot, H2O, Watson Studio, Azure ML Studio, SageMaker, Anaconda, etc.
Different AI development platforms offer extensive documentation to help development teams.
Depending on your choice of AI platform, visit the appropriate webpages for this documentation
(for example, BigML's documentation pages).
Step 3: Testing
While the fundamental testing concepts are fully applicable in AI development projects, there are
additional considerations too. These are as follows:
Human biases in selecting test data can adversely impact the testing phase; therefore, data
validation is important.
Your testing team should test the AI and ML algorithms keeping model validation, successful
learnability, and algorithm effectiveness in mind.
Regulatory compliance testing and security testing are important since the system might deal with
sensitive data; moreover, the large volume of data makes performance testing crucial.
If you are implementing an AI solution that needs to use data from your other systems,
systems integration testing assumes importance.
Test data should include all relevant subsets of training data, i.e., the data you will use for training
the AI system.
Your team must create test suites that help you validate your ML models.
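As a sketch of such a test suite (hypothetical data and thresholds; a real suite would use a fixed validation set and project-specific acceptance criteria), a validation check can be written as plain assertions:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

def train_model():
    # stand-in training data; a real suite would load a fixed validation set
    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 4))
    y = X @ np.array([1.0, -1.0, 2.0, 0.5]) + rng.normal(scale=0.1, size=500)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    model = LinearRegression().fit(X_tr, y_tr)
    return model, X_te, y_te

def test_model_beats_baseline():
    model, X_te, y_te = train_model()
    mae = mean_absolute_error(y_te, model.predict(X_te))
    baseline = mean_absolute_error(y_te, np.full_like(y_te, y_te.mean()))
    # the trained model should clearly beat a predict-the-mean baseline
    assert mae < baseline

test_model_beats_baseline()
print("model validation checks passed")
```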
Unit 3: Storytelling
Refreshing what we learnt in Level 1, we will be re-visiting some concepts of storytelling.
Why is storytelling so powerful and cross-cultural, and what does this mean for data
storytelling?
Stories create engaging experiences that transport the audience to another space and time. They
establish a sense of community belongingness and identity. For these reasons, storytelling is
considered a powerful element that enhances global networking by increasing the awareness
about the cultural differences and enhancing cross-cultural understanding. Storytelling is an
integral part of indigenous cultures.
Some of the factors that make storytelling powerful are its ability to make information more
compelling, to present a window for taking a peek at the past, and finally to draw lessons and
reimagine the future by effecting necessary changes. Storytelling also shapes, empowers and
connects people by doing away with judgement or criticism, facilitating openness to embracing
differences.
A well-told story is an inspirational narrative crafted to engage the audience across boundaries
and cultures, with an impact that isn't possible with data alone. Data can be persuasive, but
stories are much more: they change the way we interact with data, transforming it from a dry
collection of "facts" into something that can be entertaining, engaging, and thought provoking,
and that can inspire change.
Each data point holds some information which may be unclear and contextually deficient on its
own. Visualizations of such data are therefore subject to interpretation (and misinterpretation).
However, stories are more likely to drive action than statistics and numbers. When data is told
in the form of a narrative, ambiguity is reduced: the narrative connects data with context and
describes a specific interpretation, communicating the important messages in the most effective
ways. The steps involved in telling an effective data story are given below:
Understanding the audience
Choosing the right data and visualisations
Drawing attention to key information
Developing a narrative
Engaging your audience
Activity
A new teacher joined the ABC Higher Secondary School, Ambapalli to teach Science to the
students of Class XI. In his first class itself, he could make out that not everyone understood what
was being taught in class. So, he decided to take a poll to assess the level of students. The following
graph shows the level of interest of the students in the class.
[Pie chart: poll results before the change in teaching method: 40%, 25%, 19%, 11%, 5%]
Depending on the result obtained, he changed his method of teaching. After a month, he repeated
the same poll once again to ascertain if there was any change. The results of poll are shown in the
chart below.
[Pie chart: poll results after one month: 38%, 30%, 14%, 12%, 6%]
With the help of the information provided, create a good data story, setting a strong narrative
around the data and making it easier to understand the pre and post data, the existing problem,
the action taken by the teacher, and the resolution of the problem. Distribute A4 sheets and pens
to the students for this activity.
Purpose: To provide insight into data storytelling and how it can bring a story to life.
Say: “Now that you have understood what storytelling is and why it is needed, let us learn about
storytelling of a different kind: the art of data storytelling, where data is presented in the form of
a narrative or story.”
Session Preparation
Logistics: For a class of ____ students. [Group Activity]
Materials Required
ITEM QUANTITY
A4 sheets Xx
Pens Xx
Data storytelling is a structured approach for communicating insights drawn from data, and
invariably involves a combination of three key elements: data, visuals, and narrative. When the
narrative accompanies data, it helps explain to the audience what is happening in the data
and why a particular insight has been generated. When visuals are applied to data, they
can enlighten the audience to insights that they wouldn't perceive without the charts or
graphs.
Finally, when narrative and visuals are merged together, they can engage or even entertain an
audience. When you combine the right visuals and narrative with the right data, you have a data
story that can influence and drive change.
Presenting the data as a series of disjointed charts and graphs could result in the audience
struggling to understand it, or worse, coming to the wrong conclusions entirely. Thus, the
importance of a narrative comes from the fact that it explains what is going on within the data
set. It offers context and meaning, relevance and clarity. A narrative shows the audience where
to look and what not to miss, and it keeps the audience engaged.
Good stories don’t just emerge from data itself; they need to be unravelled from data
relationships. Closer scrutiny helps uncover how each data point relates to the others. Some easy
steps that can assist in finding compelling stories in data sets are as follows:
Step 1: Get the data and organise it.
Step 2: Visualize the data.
Step 3: Examine data relationships.
Step 4: Create a simple narrative embedded with conflict.
Activity: Try creating a data story with the information given below and use your imagination to
reason as to why some cases have spiked while others have seen a fall.
Storytelling is an effective tool to transmit human experience. Narrative is the way we simplify
and make sense of a complex world; it supplies context, insight, and interpretation, all the
things that make data meaningful, more relevant and interesting.
No matter how impressive an analysis, or how high-quality the data, it is not going to
compel change unless the people involved understand what is explained through a story.
Stories that incorporate data and analytics are more convincing than those based entirely
on anecdotes or personal experience.
Data storytelling helps to standardize communications and spread results.
It makes information memorable and easier to retain in the long run.
Data Story elements challenge –
Identify the elements that make a compelling data story and name them
_____________________
______________________
_____________________
APPENDIX
Additional Resource for Advanced learners
The objective of this additional AI programming resource for Class 12 is to increase student
knowledge and exposure to programming and help them create AI projects.
The Resources are divided into two categories:
Links:
Beginner - https://bit.ly/33spBZq
Advanced - https://bit.ly/3b9US7V
Note: Please use the Google Colab links for easy reference