classVIII DS Student Handbook

The document introduces data science concepts for 8th grade students, covering topics like what is data, types of data, careers in data science, data visualization, and applications of data science and artificial intelligence. It provides examples of different types of data like text, images, numbers and sound, and real-world examples of how platforms use data analysis of user video viewing preferences to suggest relevant video content. The purpose is to lay a foundation for data science skills and prepare students to be ready for industry.

Uploaded by

rvarghese
Copyright
© © All Rights Reserved

DATA SCIENCE

GRADE VIII

Version 1.0
Student Handbook
ACKNOWLEDGMENT

Patrons
• Sh. Ramesh Pokhriyal 'Nishank', Minister of Human Resource Development,
Government of India
• Sh. Dhotre Sanjay Shamrao, Minister of State for Human Resource
Development, Government of India
• Ms. Anita Karwal, IAS, Secretary, Department of School Education and Literacy,
Ministry of Human Resource Development, Government of India

Advisory, Editorial and Creative Inputs


• Mr. Manuj Ahuja, IAS, Chairperson, Central Board of Secondary Education

Guidance and Support


• Dr. Biswajit Saha, Director (Skill Education & Training), Central Board of
Secondary Education
• Dr. Joseph Emmanuel, Director (Academics), Central Board of Secondary
Education
• Sh. Navtez Bal, Executive Director, Public Sector, Microsoft Corporation India
Pvt. Ltd.
• Sh. Omjiwan Gupta, Director Education, Microsoft Corporation India Pvt. Ltd
• Dr. Vinnie Jauhari, Director Education Advocacy, Microsoft Corporation India
Pvt. Ltd.
• Ms. Navdeep Kaur Kular, Education Program Manager, Allegis Services India

Value adder, Curator and Co-Ordinator


• Sh. Ravinder Pal Singh, Joint Secretary, Department of Skill Education, Central
Board of Secondary Education
ABOUT THE HANDBOOK
In today’s world, we have a surplus of data, and the demand for learning data science
has never been greater. Students need a solid foundation in data science and
technology to be industry ready.

The objective of this curriculum is to lay the foundation for Data Science:
understanding how data is collected and analyzed, and how it can be used to solve
problems and make decisions. It also covers ethical issues with data, including
data governance, and builds a foundation for AI-based applications of data science.
Therefore, CBSE is introducing ‘Data Science’ as a skill module of 12 hours duration
in class VIII and as a skill subject in classes IX-XII.
CBSE acknowledges the initiative by Microsoft India in developing this data science
handbook for class VIII students. This handbook introduces the concepts of data
science, data visualization and applications of data science in AI. The course covers
the theoretical concepts of data science, followed by practical examples to develop
critical thinking capabilities among students.
The purpose of the book is to enable the future workforce to acquire data science skills
early in their education and build a solid foundation to be industry ready.
Contents

Introduction to Data
1. What is Data?
2. Real-World examples of Data
Introduction to Data Science
1. A brief introduction to Data Science
2. Careers in Data science
3. What does Data Science help us achieve?
Data Visualization
1. Introduction
2. What is data visualization?
3. Examples of data visualization
4. Importance of data visualization
5. Asking the right question
Data Science and AI
1. Introduction
2. Applications of data science
3. Analytics on text data
4. Analytics on image data
5. Overview of AI
References

CHAPTER
Introduction to Data

Studying this chapter should enable you to understand:

• What is data?
• Real world examples of data

1. What is Data?
We often use the term data to refer to computer information. This information
is either transmitted or stored. Data comes in numerous forms. Any kind of
information, whether numbers, text or pictures, is termed data.
Data is gathered and translated for some purpose, usually analysis. However, if
data is not put into context, it is of no help to humans or computers.

Data comes in different types. Some of the common types of data include:

• Text
• Image
• Video
• Numbers
• Spreadsheets
• Sound

Internally in computers, data is stored as a series of bits that have a value of
either one or zero.

Data can be of two types:
1. Qualitative - Qualitative data is descriptive information. For example,
"What a nice day it is".
2. Quantitative - Quantitative data is numerical information. For example, "1",
"3.65", etc.
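The two ideas above, that data is stored as bits and that data is either qualitative or quantitative, can be tried out with a small Python sketch. The function names to_bits and kind_of_data are made up for this illustration, and the sketch simply treats every number as quantitative and everything else as qualitative.

```python
def to_bits(text: str) -> str:
    """Return the UTF-8 bytes of `text` as groups of eight bits."""
    return " ".join(format(byte, "08b") for byte in text.encode("utf-8"))


def kind_of_data(value) -> str:
    """Label a value as quantitative (a number) or qualitative (anything else)."""
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return "quantitative"
    return "qualitative"


print(to_bits("Hi"))                          # the letters H and i as ones and zeros
print(kind_of_data(3.65))                     # a number
print(kind_of_data("What a nice day it is"))  # a description
```

Every letter of the text becomes one group of eight ones and zeros, which is how the computer keeps it in memory.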
Quantitative data can be further divided into two subtypes, "Discrete" and
"Continuous."

Activity 1.1
Think of real-world scenarios of discrete and continuous data.

2. Real-World examples of Data
Now that we have understood what data is and how it is categorized, an obvious
question comes to mind: what is the application of this data in the real world?

Let us look at a scenario from the entertainment industry to understand the
real-world application of data.
Many of you must be used to watching videos on the web. When you start
watching a video, you often get a list of suggested videos to watch after
the current clip is completed. Have you ever noticed how relevant these videos
are to the content that you are watching or the content that you like?
Yes, most of the time, the suggested clips are the ones that you would have
searched for and played yourself if no videos had been suggested.
Isn't it strange how the video platform knows your choices so well? Well, it is all
because of data and data analysis. These platforms have many videos in their
content. Combined with that, the platform analyses the videos that people usually
play after watching a video.
These people's preferences are stored and studied. Later, an algorithm in the
background finds patterns in people's preferences and shows you, in the suggested
videos, the content that the majority of people watched after the current clip.

This is how data analysis is applied in the entertainment industry in real life.
Some of the benefits of data in the entertainment industry are:

• Predicting interests of the audience
• Optimized or on-demand scheduling of media streams in digital media
distribution platforms
• Getting insights from customer reviews
• Effective targeting of the advertisements

Recap

• We are surrounded by data. Every computer and every mobile device
generates an immense amount of data.
• Data comes in different types such as audio, video, text etc.
• Data can be qualitative or quantitative; quantitative data can be
continuous or discrete.
• Discrete data can take only specific, separate values (for example,
whole-number counts).
• Continuous data can take any value within a range.
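The suggested-videos idea described in this chapter can be sketched in a few lines of Python. This is only a simplified illustration, not how any real platform works: the viewing history is made up, and suggest_next is a hypothetical helper that counts which video people most often watched right after the current one.

```python
from collections import Counter

# Hypothetical viewing history: (video watched, video watched next).
watch_pairs = [
    ("cricket highlights", "best catches"),
    ("cricket highlights", "best catches"),
    ("cricket highlights", "cooking dosa"),
    ("cooking dosa", "cooking pasta"),
]


def suggest_next(current_video, pairs, how_many=2):
    """Suggest the videos most often watched right after `current_video`."""
    counts = Counter(nxt for cur, nxt in pairs if cur == current_video)
    return [video for video, _ in counts.most_common(how_many)]


print(suggest_next("cricket highlights", watch_pairs))
```

The more viewing history the platform stores, the more reliable these counts become, which is why the suggestions often feel so relevant.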

Exercises
Objective Type Questions
Please choose the correct option in the questions below.
1. Discrete data can take any value in a range.
a. True
b. False
2. Continuous data cannot take decimal values.
a. True
b. False
3. Information stored in a PDF is not considered data.
a. True
b. False
4. Quantitative data cannot take numerical values.
a. True
b. False
5. Qualitative data is descriptive in nature.
a. True
b. False
6. “What is the weather like?” is what kind of data?
a. Quantitative
b. Qualitative
7. Which of the following is considered data?
a. Speech
b. Video
c. Messages
d. All of the above
8. How is data used in the entertainment industry?
a. Predicting interests
b. Targeting ads
c. Both of the above
9. Number of days in a week is an example of?
a. Discrete Data
b. Continuous Data
10. What are the types of quantitative data?
a. Discrete
b. Continuous
c. Both a and b

Standard Questions
Please answer the questions below in no less than 100 words.
1. Explain what data is, with the help of two real-life examples.
2. How is the data categorized?
3. What is Discrete Data?
4. What is Continuous Data?
5. Give two examples of real-life applications of data.

Higher Order Thinking Skills


Please answer the questions below in no less than 200 words.
1. How is data used by online streaming platforms?
2. Give five examples of discrete data around you.

Applied Project
Data analytics has many applications in our life. Discuss how data analytics is applied
in the airline industry to predict flight delays. A few factors that influence flight delays are:

• Weather conditions (extreme weather)
• Route restrictions/air traffic
• Mechanical delays
• Availability of runways
Using these factors, discuss with your classmates how data analytics can help predict
flight delays.

CHAPTER

Introduction to Data Science

Studying this chapter should enable you to understand:

• What is Data Science?
• Careers in Data Science.
• What questions does Data Science answer?

1. A brief introduction to Data Science
Every day, through our various activities, a tremendous amount of data is
generated. When you buy something from your local grocery store,
somewhere in this world, someone keeps track of what you have purchased and in
which quantity.

Let's say you are withdrawing money from an ATM for monthly expenditure. You
might withdraw various amounts of cash through multiple ATMs at different
locations. The ATMs of these banks generate data to manage your bank
account correctly.

On your social media page, you click the 'Like' button for the singer
you like; on your browser, you surf through various tutorial videos
on a video uploading site. All these activities create data.
This data can be investigated and organized, and through careful analysis,
it will give a clearer picture of what you do and what might be offered to you to
enrich your daily life.

This is what data science is about: extracting meaningful interpretations from
the data.

The insight gained by processing data using data science is meant to help our
decision-making. It has various applications; for example, it helps
various industries to cater to us better, helps authorities to nab criminals,
and helps us learn cricket in a better way.

Activity 2.1
Try to find everyday applications that depend on data science.

2. Careers in Data science
As we understand Data, Data Analysis, and Data Science, one of the
important questions that comes up is: what are the career options that we can
take up in Data Science?
We have learned about the real-life applications of data and data science.
Many of us may have found it interesting and may want to pursue this career to
explore it further.
To help you make the right choice, let us understand which
different careers we can take up in Data Science. Some common job titles for
data scientists include:

1. Data Scientist
2. Business Intelligence Analyst
3. Data Mining Engineer
4. Data Architect
5. Senior Data Scientist

Let us now briefly go through these job titles to get a better understanding:

1. Data Scientist - Data Scientists are data enthusiasts who gather and
analyze large sets of structured and unstructured data. A data scientist's
role combines computer science, statistics, and mathematics. They
analyze, process, and model data and later interpret the results to
create actionable plans for companies and organizations.
Data Scientists are analytical experts who utilize their skills both in
technology and social science to find trends and manage data. They use
their industry knowledge and context-specific understanding to
find solutions to business challenges.

2. Business Intelligence Analyst - Business Intelligence Analysts use
data to assess the market and find the latest business trends in the
industry. This helps to develop a clearer picture of how a company
should shape its strategy.

3. Data Engineer - A Data Engineer examines not only the data of their
own business but also that of third parties. In addition to mining data, a
data engineer creates robust algorithms to help analyze the data
further.

4. Data Architect - Data Architects work closely with users, system
designers, and developers to create a blueprint that data management
systems use to centralize, integrate and maintain the data sources.

5. Senior Data Scientist - Senior Data Scientists anticipate the business's
needs in the future. Although they might not be involved in gathering
data, they play a high-level role in analyzing it. Using their vast
experience, they can design and create new standards for analyzing
data. They can also create ways to use statistical data and develop tools
to further analyze the data.

Activity 2.2
Which career path would be good for you? Discuss.

3. What does Data Science help us achieve?
In simple words, data science helps us answer different types of questions that
help us achieve various objectives. Broadly, these questions can be divided
into five types.

Which class does this belong to - A or B?
The answers to some questions can only come from a definite number of options.
For example:

Q: Will it rain today?
A: Yes/No
Q: Will the weather be hot or cold?
A: Hot/Cold

To make such predictions, we use a family of algorithms called classification
algorithms. When we have only two choices, the mechanism is called binary
classification. If we try to predict between more than two choices, we use
a multiclass classification algorithm.

Is this an outlier?
In some cases, the objective is to find outliers or anomalies in data that is
otherwise mostly consistent.
These so-called anomalies could be a cause of concern, especially in cases
where we need the data to be within a specific range all the time.
An unexpected change in data patterns can often be a sign of something going
wrong or possible fraud.
For example, if an unexpected transaction that does not match your regular
transactions is made with your debit card, there could be a case of
fraud. Banking institutions track these records and alert the customer that an
unexpected transaction has happened, and this helps in protecting the
customer's money.
Some other examples of anomaly detection are:

Q: Is this email normal or spam?
Q: You are checking your car tyre pressure. Is the reading normal?
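The tyre pressure question can be turned into a very small outlier check. The sketch below is one simple way to do it, assuming made-up readings and a made-up rule that a reading is unusual when it lies more than two standard deviations from the average; real systems may use quite different rules.

```python
from statistics import mean, stdev


def find_anomalies(readings, threshold=2.0):
    """Return readings more than `threshold` standard deviations from the mean."""
    centre = mean(readings)
    spread = stdev(readings)
    return [r for r in readings if abs(r - centre) > threshold * spread]


# Hypothetical tyre pressure readings in psi; 12 is the odd one out.
pressures = [32, 33, 31, 32, 34, 33, 12, 32]
print(find_anomalies(pressures))
```

Only the reading of 12 is reported, because all the other readings stay close to the average.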

The algorithms that are used for these types of questions are called anomaly
detection algorithms.

What will probably be the value of this variable?
Machine learning can also help us predict numerical values of continuous
variables. There are scenarios in which we must predict the numerical value of a
variable based on historic data.

Some examples are:
Q: How much rainfall will we receive this year?
A: 100 mm
Q: How many runs will the winning team score?
A: 320

The kind of algorithms that can predict these values are called regression
algorithms.

How is the data grouped?
Sometimes data may be separated into distinct groups based on some
parameters. This approach is called clustering and is a type of unsupervised
machine learning.
For example, consider the data of the heights and weights of three species of
cats.
When we perform clustering, we will get an output that shows us the three
different species of cats in three groups.

What should be done now?
This question arises for autonomous robots or self-driving cars that need to
make decisions based on changes in external factors.
Machine learning helps to solve such problems with the help of reinforcement
learning.
These models are trained by a process of reward every time a correct action is
taken and punishment every time a wrong action is taken.
These systems are automated and can take decisions without human
intervention.

Q: I am a robot vacuum. Should I continue cleaning or go to the charging
station?
A: Continue
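The idea of learning from reward and punishment can be illustrated with the robot vacuum. The sketch below is a heavily simplified illustration and not a full reinforcement learning algorithm: the rewards, the two battery situations and the function names are all assumptions made for this example. The robot tries every action in every situation and moves its score for that action a little closer to the reward it received.

```python
# Hypothetical rewards: +1 for a sensible action, -1 for a poor one.
REWARDS = {
    ("battery ok", "continue"): 1,
    ("battery ok", "charge"): -1,
    ("battery low", "continue"): -1,
    ("battery low", "charge"): 1,
}
ACTIONS = ["continue", "charge"]


def train(rounds=20, learning_rate=0.5):
    """Try every action in every situation and nudge its score towards the reward."""
    scores = {pair: 0.0 for pair in REWARDS}
    for _ in range(rounds):
        for pair, reward in REWARDS.items():
            scores[pair] += learning_rate * (reward - scores[pair])
    return scores


def best_action(scores, situation):
    """Pick the action with the highest learned score for this situation."""
    return max(ACTIONS, key=lambda action: scores[(situation, action)])


scores = train()
print(best_action(scores, "battery ok"))
print(best_action(scores, "battery low"))
```

After training, the robot prefers to continue cleaning when the battery is fine and to go to the charging station when the battery is low, without anyone writing those rules by hand.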

Recap

• Data science is about extracting meaningful interpretations from data.
• There are many careers in Data Science like Data Scientist, Data Engineer and
Data Analyst.
• Data Architect and Senior Data Scientist are two roles for experienced
professionals.
• Classification helps us to predict if a new item belongs to class A or class B.
• Regression helps us to predict the value of a continuous variable.
• Clustering helps us to find patterns in the data.
• Reinforcement learning helps models to take decisions based on external
factors.
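To see what "predicting the value of a continuous variable" looks like in practice, here is a minimal regression sketch. It fits a straight line through made-up temperature readings using the standard least-squares formula and then extends the line by one day; the data and the name fit_line are assumptions for this illustration only.

```python
def fit_line(xs, ys):
    """Least-squares straight line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up temperatures (deg C) for days 1 to 5, rising by exactly 1 each day.
days = [1, 2, 3, 4, 5]
temps = [30, 31, 32, 33, 34]

slope, intercept = fit_line(days, temps)
print(slope * 6 + intercept)  # predicted temperature for day 6
```

A classification algorithm would instead answer with a label such as "hot" or "cold"; regression answers with a number.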

Exercises
Objective Type Questions
Please choose the correct option in the questions below.
1. A school named ABC has recorded the total marks of every student in the class.
This is an example of:
a. Qualitative data
b. Quantitative data
c. Both qualitative and quantitative data
d. None of the above
2. A food delivery app has asked for your feedback on the quality of the food. You
have written two paragraphs to describe the food. This is an example of:
a. Qualitative Data
b. Quantitative Data
c. Both qualitative and quantitative data
d. None of the above
3. You need to predict what the temperature will be next Friday.
Which algorithm will you use?
a) Clustering
b) Regression
c) Anomaly detection
d) Binary classification
4. You need to predict if your car tyre will last for the next 1000 km. Which algorithm
will you use?
a) Clustering
b) Regression
c) Anomaly detection
d) Binary classification
5. You want to build a way to segregate spam emails from good emails. Which
algorithm will you use?
a) Clustering
b) Regression
c) Anomaly detection
d) Binary classification

Standard Questions
Please answer the questions below in no less than 100 words.
1. What are the common career paths for data science?
2. What does a Data Architect do?
3. What are the differences between classification and regression?

Higher Order Thinking Skills (HOTS)


Please answer the questions below in no less than 200 words.
1. Discuss a recent innovation that makes use of reinforcement learning.
2. Write a short note on how data science is helping sports teams.

Applied Project
Emails are a part of daily communication. Sometimes we receive unwanted emails called
spam. There are a few techniques that email providers use to identify spam mails:

• Content-based filtering (analyzing the words and the occurrence and distribution of
words to identify spam mail)
• Header filters (reviewing the email header). Example: Promo!, Offer!
Provide 2 examples each of words/phrases in email content and header which mark an
email as spam. Explain in detail how email providers make use of clustering to mark an
email as spam. Also elaborate how email providers create and update the
words/phrases used to mark an email as spam.

CHAPTER

Data Visualization

Studying this chapter should enable you to understand:

• What is data visualization?
• The importance of visualization
• Collecting relevant data
• Asking the right question
• Predict an answer
• Examples of data visualization

1. Introduction
In the previous chapters, we learned about how data is collected and how we
can interpret the data by asking several types of questions of the data. In this
chapter, we will learn to visualize data and make predictions.

2. What is data visualization?
Data visualization is the representation of data or information in a graph, chart,
or other visual format.
Data visualization provides a way to see and understand trends, outliers, and
patterns in data. Charts and graphs make communicating data findings
easier even if you can identify the patterns without them.
The goal of data visualization is to communicate information clearly and
efficiently to users.

Common types of data visualizations are:

• Charts
• Graphs
• Tables
• Maps
• Histograms

3. Examples of data visualization
Example 1: Using a pie chart that displays the data of the food preferred by
the students.

We have the food item preference of 50 students. Let us now visualize the data
using a pie chart and find the most preferred and the least preferred food
item.

Food item    Number of students
Pizza        25
Pasta        10
Dosa         15

Let us now visualize the data using a pie chart:

[Pie chart: Food preference. Pizza 50%, Pasta 20%, Dosa 30%]

The most preferred food item is pizza and the least preferred food item is
pasta.

Example 2: Using a line chart that displays the number of
students present in the class for one week.

Here is the data:

Date      Number of students present
06-Apr    49
07-Apr    42
08-Apr    37
09-Apr    48
10-Apr    43
11-Apr    36
12-Apr    50

Let's now visualize the data using a line chart:

[Line chart: Number of students present, by date]
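The percentages shown in the food preference pie chart of Example 1 can be worked out directly from the table. The short Python sketch below does this calculation; the name pie_percentages is made up for this illustration.

```python
food_votes = {"Pizza": 25, "Pasta": 10, "Dosa": 15}


def pie_percentages(counts):
    """Convert counts into the percentage share each slice gets in a pie chart."""
    total = sum(counts.values())
    return {name: 100 * count / total for name, count in counts.items()}


shares = pie_percentages(food_votes)
for name, share in shares.items():
    print(f"{name}: {share:.0f}%")

print("Most preferred:", max(food_votes, key=food_votes.get))
print("Least preferred:", min(food_votes, key=food_votes.get))
```

Each slice is simply the item's count divided by the total of 50 students, multiplied by 100.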

We can also visualize the same data using a bar graph:

[Bar graph: Number of students present, by date]

4. Importance of data visualization
To make sure that we get the required outcome from the data, we must collect
the right and relevant data.
It is essential to have correct and good quality data to make an analysis or to
construct algorithms that can have an impact. Without relevant data, your
analyses will not only be irrelevant, but they can also be misleading.
You cannot expect to find perfectly preprocessed raw data that can be used
directly for your needs. Hence, you need to understand how the data was
gathered and what sources it was collected from.
Therefore, it is essential to understand how to collect relevant data for analysis.

Let us understand what steps we need to take to make sure that we collect the
right set of data for analysis.

• Quality of the data - The primary and most vital point to consider while
collecting data is the quality of the data being collected. If we
collect incomplete data, build an unreliable database, and run
analysis on skewed data sets, we are not going to arrive
at the required output. The quality of the data collected should always
be the top priority while assessing the data.

• Completeness of data - We need to make sure that the data being
collected is a complete set. Incomplete sets of data may cause
many discrepancies and wrong output on analysis.

• Format of data - The format of the data that is collected for analysis
should be right. Data should be accessible and readable for analysis.
If the collected data is not in the right format, we should convert it to the
required format for analysis.

5. Asking the right question
Once we have the required data ready with us, the next step is to ask the right
question of the data. It is important to understand that if we don't ask the right
questions, we will never get the right answers. To make sure we perform the
proper analysis of data, we should ask the data the right set of questions.

Below are specific questions that you need to ask of your data set to get the
right answer:

• What do you wish to find?
It is essential to consider what your goal is and what decision-making it
will facilitate. What outcome from the analysis would you consider a
success?
These initial analysis questions are important to guide you through the
process and help focus on valuable insights. You can start by
brainstorming and preparing a draft guideline for specific questions you
want to answer from the data. This will help you to dive deeper into the more
specific insights you want to achieve.

• Which statistical techniques are applicable?
There are several statistical analysis techniques that you can use for
analyzing data. However, in real-life scenarios, three statistical
techniques are mostly used for analysis:

a. Regression Analysis is a process for finding out the relationships
and correlations among the different variables in the data.
Regression analysis helps us to figure out how the value of the
dependent variable varies when one of the independent variables
changes. Thus, regression analysis helps to identify which
independent variables affect the dependent variable.

b. Cohort Analysis - It enables you to easily compare how different
groups, or cohorts, of customers behave over time.
For example, you can create a cohort of customers based on the
date when they made their first purchase. Subsequently, you can
study the spending trends of cohorts from different periods in
time to determine whether the quality of the average acquired
customer is increasing or decreasing over time.

c. Predictive Analysis - Predictive analytics involves the analysis of
historical datasets to predict future possibilities. It can also be
used for generating alternative scenarios and risk assessments.

• Who will be using the final results?
An important aspect of your data analytics is the end-users of
your analysis. Who are they and how will they be using the reports you
create? You must get to know your final users, including:

a. What do they expect to learn from the data?
b. What do they need?
c. How advanced are their technical skills?
d. How much time do they have?

If you know these answers, you can decide on how detailed your data
visualizations should be and what areas of the data your report should
be focused on.

You should keep in mind that technical and non-technical users
have different needs. If the reports are designed for executives and
non-technical staff, you need to know which insights will be useful for them and
what level of data complexity they can handle.

If external parties will be using your reports, the reports you make should
provide them with an easy-to-use and actionable interface. The final
users should be able to read and interpret them independently, with
no support needed.

• Which visualizations should you pick?
Once your data is clean and your statistical analysis is done, you need
to pick your visualizations. You can have impactful and valuable insights
from the data, but if they're presented badly, the end-users won't
be able to understand the insights from them.

It is essential to convince executives and decision-makers that the data
that you have gathered and analyzed is:

a. Correct
b. Important
c. Urgent to act upon

Effective presentation aids in all these areas. There are several kinds
of charts to pick from. You can improve your chances of getting good
feedback by choosing the right data visualization type.

There are several data visualization tools, like Power BI, that can
perform most of the tedious aspects of data cleaning as well. They can not
only help to prepare the data but also interpret the insights. Because they
are so easy to use and let you test data hypotheses without the need for
intensive training, these tools have become an invaluable resource in
today's data management practice.

These tools are very flexible for the end-user and can easily adjust to
your prepared questions for analyzing data. The tools can help to
perform a voluminous analysis.

Recap

• It is important to check the quality of data, completeness of data and format
of data.
• Once we have the required data ready with us, the next step is to ask the right
question of the data.
• There are a number of statistical analysis techniques that you can use for
analyzing data: regression, predictive analytics and cohort analysis.
• Once your data is clean and your statistical analysis is done, you need to pick
your visualizations.
• Picking the right visualisation is important; otherwise the results will not be
interpreted well.
• You must analyse the end users of the visualisations in order to decide how to
present your insights.
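The first recap point, checking the completeness and format of data before analysis, can be sketched in Python as follows. The records below are made up for this illustration (one count is missing and one is stored as text instead of a number), and clean is a hypothetical helper, not part of any tool mentioned in this handbook.

```python
# Hypothetical raw attendance records; one count is missing and one is text.
raw_records = [
    {"date": "06-Apr", "present": 49},
    {"date": "07-Apr", "present": "42"},
    {"date": "08-Apr", "present": None},
]


def clean(records):
    """Split records into usable rows (counts converted to int) and incomplete rows."""
    usable, incomplete = [], []
    for record in records:
        if record["present"] is None:
            incomplete.append(record["date"])
        else:
            usable.append({"date": record["date"], "present": int(record["present"])})
    return usable, incomplete


usable, incomplete = clean(raw_records)
print(usable)
print("Incomplete:", incomplete)
```

Rows with missing values are set aside so that they can be collected again, and text values are converted to numbers so that they can be plotted or added up.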

Exercises
Objective Type Questions
Please choose the correct option in the questions below.

1. Data can be visualized using:


a. Graphs
b. Maps
c. Charts
d. All of the above
2. Which of the following statements is false?
a. Data visualization helps us absorb information quickly.
b. Data visualization reduces insights and leads to slower decisions.
c. Data visualization is a type of visual art.
d. None of the above
3. Which of the following is a use case of data visualization?
a. Healthcare
b. Sales and Marketing
c. Politics/Campaigning
d. All of the above

4. Which format of data is easiest for analysis?
a. Tabular data
b. Text data in a PDF
c. Data in an image
d. Speech data
5. Which visualization is best for representing a relation between two variables?
a. Scatter plot
b. Histogram
c. Pie chart
d. Gantt Chart

Standard Questions
Please answer the questions below in no less than 100 words.
1. What are the steps to make sure that the correct data is collected for analysis?
2. Write a short note on the statistical techniques which can be used for data
analysis.
3. Is it important to assess the end-users for a visualization? Explain in your own
words.

Higher Order Thinking Skills (HOTS)


Please answer the questions below in no less than 200 words.
1. What are the things to consider before deciding on an appropriate visualization
for your data?

2. If you find that the data collected has outliers, what steps can you take to ensure
that your analysis is still accurate?

Applied Project
Each student should write down the marks he/she had received in the examination for
the subjects studied in the previous grade. Use these marks to plot on paper
a. bar graph to display marks of each individual subject.
b. line graph to display marks of each individual subject.
c. pie chart to show percentage contribution of marks of each subject to the total
marks obtained.

CHAPTER

Data Science and AI

Studying this chapter should enable you to understand:
• Applications of data science
• Analytics on text data
• Analytics on image data
• Overview of AI

1. Introduction
In the last chapter, we saw how we can visualize data and make predictions from it.
In this chapter, we will learn about the applications of data science and the basics
of artificial intelligence.

2. Applications of data science
Given the advantages of data science, it is no surprise that many companies have
applied it to help their business grow. If we look around us, data science is
everywhere, and it affects our day-to-day life in numerous ways.

Digital Advertisements - You must have noticed this many times: you search the
internet for something, say a handbag, and close the browser once you are done.
Later, when you open any other application or website, you see
advertisements for handbags from

various brands in your window. Ever wondered how this new application or website
knows that you are looking to buy a handbag? Well, the answer is data science.
Algorithms in data science help track your searches and learn your preferences
from them.

Speech Recognition - Speech recognition is now part of our everyday lives. It is
built into phones, game consoles, and even smartwatches. Have you heard of
Microsoft's Cortana? It uses speech recognition behind the scenes to take inputs
from the user.

Speech recognition can also be found on many devices that can be used to automate
our homes.

Speech recognition has been around for more than a decade. However, it is gaining
popularity now as machine learning is helping organizations make speech recognition
much more accurate.

3. Analytics on text data
Text analytics can be defined as the process of collecting unstructured text from
various sources and analyzing and extracting relevant information from it. It can
also be used to transform that text into structured information that can then be
used in various other ways.

There are several ways to analyze unstructured text. Most of these techniques fall
under these technical areas - Natural Language Processing (NLP), data mining, and
information retrieval.

Typically, we use text analytics technologies for four basic tasks: querying data,
mining data, searching data, and analyzing data to get insights.

For example, if we have a database with customer data, an end-user could query the
database to find out how many customers have started using the company's services
in the last quarter and how many have stopped using the service. They can do so by
just entering a query in plain English instead of a query language like SQL.

Chatbots are also an important area that uses text analytics for both querying and
searching data. Chatbots can use it to query a database and give a reply based on
the question. They can also use search based on text analytics to help retrieve a
document based on what end users are looking for.

4. Analytics on image data
Image recognition can be described as a process by which we can process images to
identify people, patterns, logos, objects, or places.

Many machine learning tools can assist users with recognizing faces and objects in a
picture. These tools can also scan the objects in the picture and attempt to
identify and name them based on a large database of images.

Mobile phones, for example, make use of computer vision technologies in combination
with a camera to achieve image recognition. This advanced
technology has a variety of applications

like accessibility for the visually impaired and interactive advertising.

Facial recognition is also used by many organizations to check the attendance of
workers and by government organizations for identification purposes.

Besides identifying faces and detecting objects in images, AI is also capable of
recognizing special patterns in images and matching them with its database.

Another growing application of image analytics is in searching content based on
images. Some search engines now allow users to upload images and search based on
them.

5. Overview of AI
Artificial Intelligence is defined as the science and engineering of making
intelligent machines. AI is a branch of Computer Science which deals with the
research and design of intelligent systems that can take inputs from their
environment and take actions based on them as a human being would.

In technology, Artificial Intelligence, Machine Learning, and Deep Learning are
widely used. While you may have seen these terms used interchangeably, each carries
its own significance and application.

Machine Learning is a subset of Artificial Intelligence, and Deep Learning is, in
turn, a subset of Machine Learning.

Artificial Intelligence aims at making machines as smart as humans. This main goal
of Artificial Intelligence can be explained using the sub-goals below:

a. Logical Reasoning: AI aims at making computers capable of doing all the
intelligent and sophisticated tasks that we humans can do. For example, solving
problems that require logical reasoning, like switching on the fan because it is
hot, or solving complex mathematical problems.

b. Knowledge Representation: Make computers capable of describing objects. For
example, describing a car that just violated the traffic norms.

c. Planning and Navigation: Make computers capable of traveling from Point X to
Point Y. For example, a self-driving robot.

d. Natural Language Processing: Make computers capable of understanding and
processing a language. For example, a web translator that translates one language
to another.

e. Perception: Make computers capable of interacting with real-world objects
through the senses of touch, sound, smell, and eyesight.

f. Emergent Intelligence: Make computers capable of intelligence that is not
explicitly programmed but is derived from other AI capabilities. The basic vision
for this goal is to enable machines to exhibit emotional intelligence, moral
reasoning, and more.

Recap
• Two important applications of data science are digital ads and speech
recognition.
• Text analytics can be defined as the process of collecting unstructured text from
various sources and analyzing and extracting relevant information from it.
• Chatbots are also an important area that uses text analytics for both querying
and searching data.
• Image recognition can be described as a process by which we can process images to
identify people, patterns, logos, objects, or places.
• Artificial Intelligence is defined as the science and engineering of making
intelligent machines.
• AI has many sub-goals, such as natural language processing and perception.
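
To make the searching task described under "Analytics on text data" more concrete,
here is a minimal Python sketch of how a chatbot might pick a document for a
plain-English question. It is a simplified illustration, not how any real chatbot
is built: the document names, their text, the stop-word list, and the function names
are all assumptions made up for this example. The idea is to keep only the
meaningful words of the question and return the document that shares the most of them.

```python
# Hypothetical mini "document store"; the names and text are made up.
DOCUMENTS = {
    "refund_policy": "How to return a product and get a refund",
    "delivery_times": "Delivery times for orders placed online",
    "account_help": "How to reset the password for your account",
}

# Very common words that say little about what the user wants.
STOP_WORDS = {"a", "an", "the", "to", "for", "and", "how", "do", "i", "my", "your", "is"}


def keywords(text):
    """Lower-case the text, strip punctuation, and drop stop words."""
    words = (word.strip(".,?!").lower() for word in text.split())
    return {word for word in words if word and word not in STOP_WORDS}


def search(question):
    """Return the name of the document sharing the most keywords with the question."""
    wanted = keywords(question)
    best_name, best_score = None, 0
    for name, text in DOCUMENTS.items():
        score = len(wanted & keywords(text))
        if score > best_score:
            best_name, best_score = name, score
    return best_name


print(search("How do I reset my password?"))
```

When no document shares a keyword with the question, `search` returns `None`, which
a chatbot could treat as "I could not find anything about that".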

Exercises
Objective Type Questions
Please choose the correct option in the questions below.

1. Data Science can help with:
a. Speech Recognition
b. Digital Advertising
c. All of the above

2. Which of the following is a goal of Artificial Intelligence?
a. Logical Reasoning
b. Knowledge Representation
c. Planning and Navigation
d. All of the above

3. Which of the following is a use case of data science?
a. Facial recognition
b. Text analytics
c. Sentiment analysis
d. All of the above
4. What does natural language processing help us with?
a. Text analytics
b. Video analytics
c. Image analytics
5. What technologies are used by chatbots?
a. Text analytics
b. Speech recognition
c. Both of the above

Standard Questions
Please answer the questions below in no less than 100 words.

1. How is data science used for speech recognition?
2. Write a use case for analyzing images.
3. What are some of the goals of AI?

Higher Order Thinking Skills (HOTS)
Please answer the questions below in no less than 200 words.
1. How is text data analyzed?
2. What are some of the applications of image recognition?

Applied Project
Understanding the mood of the speaker can be very useful. Certain keywords can be
associated with different sentiments.
Example 1: “The news continues to be gloomy.” If you read this sentence, you will
understand that the speaker is sad.
Example 2: “I was infuriated by his arrogance.” This sentence tells you that the
speaker is angry.

Discuss with your classmates how text analytics can help us identify the sentiment of
the speaker, i.e. whether the speaker is happy, angry, or sad. A sentence may contain
more than one keyword that highlights the sentiment of the speaker. Provide 2
examples of such sentences for each of the sentiments discussed above.
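
As a starting point for the discussion, the keyword idea can be sketched in a few
lines of Python. This is a deliberately simple illustration: the keyword lists and
the function name `detect_sentiments` are assumptions made up for this example, and
real text analytics tools use much larger vocabularies and more advanced techniques.

```python
# Hypothetical keyword lists; a real system would use far larger ones.
SENTIMENT_KEYWORDS = {
    "sad": {"gloomy", "unhappy", "miserable"},
    "angry": {"infuriated", "furious", "annoyed"},
    "happy": {"delighted", "cheerful", "glad"},
}


def detect_sentiments(sentence):
    """Return a dict mapping each sentiment to the keywords found in the sentence."""
    # Split into words, remove surrounding punctuation, and ignore capital letters.
    words = {word.strip(".,!?\"'“”").lower() for word in sentence.split()}
    found = {}
    for sentiment, sentiment_words in SENTIMENT_KEYWORDS.items():
        matches = sorted(words & sentiment_words)
        if matches:
            found[sentiment] = matches
    return found


print(detect_sentiments("The news continues to be gloomy."))
print(detect_sentiments("I was infuriated by his arrogance."))
```

A sentence with several keywords, such as "I was furious and annoyed.", would return
both matches under the same sentiment, which is the kind of scenario the project
asks you to look for.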

