3961502-Class10 Ai Part B Unit3 Unit3 Data Science

Chapter 3 discusses Data Science as a unifying concept that combines statistics, data analysis, and machine learning to analyze real-world phenomena. It highlights various applications of Data Science, including fraud detection, genetics, internet search, targeted advertising, and airline route planning. The chapter also covers data collection methods, types of data formats, and Python packages like NumPy, Pandas, and Matplotlib that facilitate data manipulation and visualization.

Uploaded by

sri19sp

Chapter 3

Data science

INDIAN SCHOOL AL WADI AL KABIR

CLASS 10 ARTIFICIAL INTELLIGENCE
Introduction

• Artificial Intelligence is a technology that depends entirely on data. It is the
data fed into the machine that makes it intelligent.
• Depending on the type of data we have, AI can be classified into three broad
domains:
DATA SCIENCE

• Data Science is a concept that unifies statistics, data analysis, machine
learning and their related methods in order to understand and analyse
actual phenomena with data.
• It employs techniques and theories drawn from many fields within the
context of Mathematics, Statistics, Computer Science, and Information
Science.
• Rock, Paper & Scissors: https://www.afiniti.com/corporate/rock-paper-scissors
Applications of Data Sciences
• Data Science is not a new field. It mainly revolves around analysing data, and
when it comes to AI, that analysis helps make the machine intelligent enough to
perform tasks by itself. There are various applications of Data Science in
today's world. Some of them are:

1. Fraud and Risk Detection*: The earliest applications of data science were in Finance.
Companies were fed up with bad debts and losses every year. However, they had a lot of
data that used to be collected during the initial paperwork while sanctioning loans.
They decided to bring in data scientists to rescue them from these losses.
Over the years, banking companies learned to divide and conquer data via customer
profiling, past expenditures, and other essential variables to analyse the probabilities of
risk and default. It also helped them to push their banking products based on each
customer's purchasing power.
2. Genetics & Genomics*: Data Science applications also enable an advanced level of
treatment personalization through research in genetics and genomics. The goal is to
understand the impact of the DNA on our health and find individual biological
connections between genetics, diseases, and drug response. Data science techniques
allow integration of different kinds of data with genomic data in disease research, which
provides a deeper understanding of genetic issues in reactions to particular drugs and
diseases. As soon as we acquire reliable personal genome data, we will achieve a deeper
understanding of the human DNA. The advanced genetic risk prediction will be a major
step towards more individual care.

3. Internet Search*: When we talk about search engines, we think 'Google'. Right? But
there are many other search engines like Yahoo, Bing, Ask, AOL, and so on. All these
search engines (including Google) use data science algorithms to deliver the best
result for our search query in a fraction of a second. Considering that Google
processes more than 20 petabytes of data every day, had there been no data
science, Google wouldn't have been the 'Google' we know today.
4. Targeted Advertising*: If you thought Search would have been the biggest of all data science
applications, here is a challenger: the entire digital marketing spectrum. From the display
banners on various websites to the digital billboards at airports, almost all of them are
decided by using data science algorithms. This is the reason why digital ads have been able
to get a much higher CTR (Click-Through Rate) than traditional advertisements. They can be
targeted based on a user's past behavior.

5. Website Recommendations*: Aren't we all used to the suggestions about similar products
on Amazon? They not only help us find relevant products from the billions of products
available but also add a lot to the user experience.
Many companies use this kind of recommendation engine to promote their products in
accordance with the user's interest and the relevance of information. Internet giants like
Amazon, Twitter, Google Play, Netflix, LinkedIn, IMDB and many more use this system to improve
the user experience. The recommendations are made based on a user's previous search results.

*CTR (Click-Through Rate) is the number of clicks that your ad receives divided by the number of times your ad is shown.
6. Airline Route Planning*: The airline industry across the world is known to bear heavy losses.
Except for a few airline service providers, companies are struggling to maintain their occupancy
ratio and operating profits. With the steep rise in air-fuel prices and the need to offer heavy
discounts to customers, the situation has got worse. It wasn't long before airline companies
started using Data Science to identify strategic areas of improvement. Now, using Data Science,
airline companies can:
• Decide which class of airplanes to buy
• Decide whether to land directly at the destination or take a halt in between (for example, a
flight can have a direct route from New Delhi to New York, or it can halt in another country)
• Effectively drive customer loyalty programs
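The click-through rate defined in the CTR note above is a simple ratio, and a short sketch may make it concrete. The function name and the click and impression counts below are made up for illustration; they are not taken from any real campaign.

```python
def click_through_rate(clicks: int, impressions: int) -> float:
    """Return CTR as a fraction: clicks divided by times the ad was shown."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return clicks / impressions


# Hypothetical figures, for illustration only.
ctr = click_through_rate(clicks=25, impressions=1000)
print(f"CTR = {ctr:.1%}")  # prints "CTR = 2.5%"
```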
Data Collection
Data collection is nothing new in our lives. It has been part of our society
for ages. Even when people did not have a fair knowledge of calculations, records were still
maintained in some way or other to keep an account of relevant things. Data collection
is an exercise that does not require even a tiny bit of technological knowledge. But when it
comes to analysing the data, it becomes a tedious process for humans, as it is all about
numbers and alpha-numerical data. That is where Data Science comes into the picture. It
not only gives us a clearer idea of the dataset, but also adds value to it by providing
deeper and clearer analyses. And as AI gets incorporated into the process,
predictions and suggestions by the machine become possible as well.

Now that we have gone through an example of a Data Science based project, we have a bit
of clarity regarding the type of data that can be used to develop a Data Science related
project. For data domain-based projects, the data used is mostly in numerical or
alpha-numerical format, and such datasets are curated in the form of tables. Such databases
are very commonly found in any institution for record maintenance and other purposes.
Some examples of datasets which you must already be aware of are:
Sources of Data
• There are various sources from which we can collect any type of data
required, and the data collection process can be categorised in two
ways: Offline and Online.

While accessing data from any of the data sources, the following points should be kept in
mind:
1. Only data which is available for public usage should be taken up.
2. Personal datasets should only be used with the consent of the owner.
3. One should never breach someone's privacy to collect data.
4. Data should only be taken from reliable sources, as data collected from random
sources can be wrong or unusable.
5. Reliable sources of data ensure the authenticity of data, which helps in proper
training of the AI model.
Types of Data
For Data Science, the data is usually collected in the form of tables. These tabular
datasets can be stored in different formats. Some of the commonly used formats
are:
1. CSV: CSV stands for comma separated values. It is a simple file format used to
store tabular data. Each line of the file is a data record, and each record consists of
one or more fields which are separated by commas. Since the values of each record are
separated by commas, these files are known as CSV files.
2. Spreadsheet: A spreadsheet is a piece of paper or a computer program which is
used for accounting and recording data using rows and columns into which
information can be entered. Microsoft Excel is a program which helps in creating
spreadsheets.
3. SQL: SQL stands for Structured Query Language. It is a domain-specific
language used in programming and is designed for managing data held in
different kinds of DBMS (Database Management System). It is particularly
useful in handling structured data.
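To make the CSV and SQL descriptions above concrete, here is a minimal sketch using only Python's standard library (the csv and sqlite3 modules). The student names, the column names and the table name are hypothetical, chosen only for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical CSV content: a header line, then one record per line,
# with the fields of each record separated by commas.
csv_text = """name,grade,marks
Asha,10,88
Ravi,10,92
Meena,9,75
"""

# Each record becomes a dict keyed by the column names from the header.
records = list(csv.DictReader(io.StringIO(csv_text)))

# Load the same records into an in-memory SQL table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, grade INTEGER, marks INTEGER)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [(r["name"], int(r["grade"]), int(r["marks"])) for r in records],
)

# SQL query: grade 10 students, highest marks first.
rows = conn.execute(
    "SELECT name, marks FROM students WHERE grade = 10 ORDER BY marks DESC"
).fetchall()
print(rows)  # prints [('Ravi', 92), ('Asha', 88)]
conn.close()
```

The CSV reader returns every field as text, which is why the sketch converts grade and marks to integers before storing them.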
Data Access
After collecting the data, to be able to use it for programming purposes, we
should know how to access it in Python code. To make our lives
easier, there are various Python packages which help us access
structured data (in tabular form) inside the code. Let us take a look at some of
these packages:
1. NumPy
NumPy, which stands for Numerical Python, is the fundamental package for
mathematical and logical operations on arrays in Python. It is a commonly used
package when it comes to working with numbers. NumPy provides a wide range
of arithmetic operations, giving us an easier approach to working with numbers.
NumPy works with arrays, which are homogeneous collections of data: every
element in an array has the same type.
import numpy
# Create a one-dimensional array of ten integers
A = numpy.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 0])
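As a small sketch of the arithmetic operations mentioned above, the following applies a few of them to the same array. The `np` alias and the variable names are conventions assumed here, not something the chapter prescribes.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 0])

# Arithmetic is applied element by element, without writing a loop.
doubled = a * 2
squared = a ** 2

# Summary operations over the whole array.
total = a.sum()    # 45
average = a.mean() # 4.5
largest = a.max()  # 9

print(doubled)
print(total, average, largest)
```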
2. Pandas
Pandas is a software library written for the Python programming language for
data manipulation and analysis. In particular, it offers data structures and
operations for manipulating numerical tables and time series. The name is
derived from the term "panel data", an econometrics term for data sets that
include observations over multiple time periods for the same individuals.
Pandas is well suited for many different kinds of data:
• Tabular data with heterogeneously-typed columns, as in an SQL table or
Excel spreadsheet
• Ordered and unordered (not necessarily fixed-frequency) time series data
• Arbitrary matrix data (homogeneously typed or heterogeneous) with row
and column labels
• Any other form of observational / statistical data sets. The data need not
be labelled at all to be placed into a Pandas data structure
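A minimal sketch of the first kind of data in the list above, a table with differently typed columns, might look like this. The column names and values are invented for illustration.

```python
import pandas as pd

# A small table with one text column and two integer columns
# (hypothetical student records).
df = pd.DataFrame(
    {
        "name": ["Asha", "Ravi", "Meena"],
        "grade": [10, 10, 9],
        "marks": [88, 92, 75],
    }
)

# Select the rows where marks are above 80.
high_scorers = df[df["marks"] > 80]

# Average marks for each grade.
average_by_grade = df.groupby("grade")["marks"].mean()

print(high_scorers)
print(average_by_grade)
```

Here `df` is a DataFrame, the two-dimensional labelled table structure that Pandas provides; each column keeps its own type.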
3. Matplotlib
Matplotlib is a visualization library in Python for 2D plots of arrays. It is
a multi-platform data visualization library built on NumPy arrays. One of the
greatest benefits of visualization is that it gives us visual access to huge
amounts of data in easily digestible form. Matplotlib comes with a wide
variety of plots. Plots help us to understand trends and patterns, and to
make correlations. They are typically instruments for reasoning about
quantitative information. Some types of graphs that we can make with this
package are listed below:
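As a minimal sketch of how such graphs are drawn, the following builds a line plot and a bar chart from a small NumPy array. The data values are made up, and the non-interactive "Agg" backend is an assumption made so the example runs without a display; in a notebook or desktop session one would normally call plt.show() instead of saving.

```python
import io

import matplotlib

matplotlib.use("Agg")  # assumption: draw off-screen, no display needed
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: the numbers 1 to 5 and their squares.
x = np.arange(1, 6)
y = x ** 2

# One figure with two side-by-side plots.
fig, (ax_line, ax_bar) = plt.subplots(1, 2, figsize=(8, 3))

ax_line.plot(x, y, marker="o")
ax_line.set_title("Line plot")
ax_line.set_xlabel("x")
ax_line.set_ylabel("x squared")

ax_bar.bar(x, y)
ax_bar.set_title("Bar chart")
ax_bar.set_xlabel("x")

fig.tight_layout()

# Save the figure as PNG into an in-memory buffer.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```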
