Unit-4 Data Science

Data Science is a field that utilizes programming, scientific methods, and algorithms to extract meaningful insights from large volumes of structured and unstructured data, primarily for AI applications. The increasing prevalence of unstructured data due to the growth of internet users has heightened the demand for Data Science technologies. Applications of Data Science span various sectors including banking, finance, healthcare, and e-commerce, with roles in risk modeling, fraud detection, medical analysis, and targeted advertising among others.

Uploaded by

araj131207
Copyright
© © All Rights Reserved

Part-B, Unit-4 Data Science

Q: What is Data Science?


Ans: - Data Science is a field that uses programming skills, scientific methods (mathematical and statistical),
processes, algorithms and systems to find meaningful information in huge volumes of structured and unstructured data
for use in AI applications.
Data Science uses data analytics to analyse past data and gather insights from it. It can also discover hidden
patterns in raw data, and it can be used for predictions.

Q: What are the main reasons for using Data Science technology?


Ans: - Earlier, data processing was quite easy because data was limited and structured. Structured data can be analysed
easily and effectively. Nowadays more than 80% of data is unstructured, and traditional methods do not work well on
unstructured data.

In addition, the number of internet users is increasing day by day, which increases the amount of unstructured
data. The unstructured data collected by various organizations through mobile apps, websites and other platforms
can be used to serve the specific requirements of customers and users. This is the main reason for the growing
demand for data science.

Q: What are the applications of Data Science?


Ans: - Some of the most common applications of data science in the modern era are:

S.No  Field                    Application Area

1     Banking                  Customer Services, Risk Modelling, Fraud Detection

2     Finance                  Decision-Making, Risk Analytics, Algorithmic Trading

3     Manufacturing            Maintenance Scheduling, Predicting Problems, Detecting Anomalies, Automated Processes

4     Transport                Optimizing Vehicle Performance, Self-Driving Cars, Airline Route Planning,
                               Vehicle Monitoring Systems

5     Healthcare               Medical Image Analysis, Drug Discovery, Diagnosis Prediction

6     E-Commerce               Consumer Identification, Analysis of Reviews, Recommendations,
                               Digital Advertisements (Targeted Advertisements)

7     Artificial Intelligence  Speech Recognition Systems, Machine Learning Algorithms,
                               Conversational Bots (Amazon's Alexa and Apple's Siri), Image Recognition,
                               Gaming, Virtual Reality

Other applications and fields are:
Internet Search, Recommender Systems (RSs), Weather Predictions in the Agriculture Sector, Education etc.

Q: What is the role of data science in Internet Search?

Ans: - The Internet involves huge volumes of data stored across servers and connected devices. Data science tools are
very useful for searching this huge amount of data because they return precise results very quickly. For example,
search engines like Google use data science tools and algorithms to search for a query across the Internet in a
fraction of a second.

Q: What is Digital Advertising / Targeted Advertising?


Ans: - Digital Advertising (Targeted Advertising) is a form of advertising, including online advertising, that is directed
towards an audience with certain traits, based on the product or person the advertiser is promoting. It makes use of past
data about the needs and choices of the user, and picks the products to advertise and the time to advertise them accordingly.
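
The idea of using past data to pick a product and a time can be illustrated with a minimal Python sketch. The browsing history, the category names and the function pick_ad are all hypothetical; real advertising systems are far more elaborate than a frequency count.

```python
from collections import Counter

# Hypothetical past data for one user: (product category viewed, hour of day).
history = [
    ("shoes", 20), ("books", 9), ("shoes", 21),
    ("shoes", 20), ("phones", 13), ("books", 20),
]

def pick_ad(events):
    """Return the most viewed category and the hour with the most activity."""
    categories = Counter(category for category, _ in events)
    hours = Counter(hour for _, hour in events)
    return categories.most_common(1)[0][0], hours.most_common(1)[0][0]

category, hour = pick_ad(history)
print(f"Advertise {category} at around {hour}:00")
```

Here the user's past choices decide both what is advertised and when, which is the core of the definition above.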

Q: What are the stages of the AI project cycle? Explain them in detail.


Ans: - In the context of data science, the AI project cycle mainly has the following stages:

Q: What is Data Collection? Name some commonly used datasets.


Ans: - Data collection is the process of gathering numeric and alphanumeric data. It is a prerequisite for data
analysis: it gives a clear idea of the dataset and supports deeper and clearer analysis of it. The predictions and
suggestions made by an AI system are possible only because of data collection.
Data collection is mainly used for record maintenance and other purposes. Some commonly used datasets are:

Source          Data held
Banks           Data for loans, accounts, lockers, payrolls, bank visitors etc.
ATM Machines    Data related to daily transactions, visitors' information, money withdrawn etc.
Movie Theatres  Movie details, tickets sold in online and offline modes, purchases of refreshments etc.
School          School data such as students' fee collection, results, teachers' salary database etc.

Q: What are the various kinds of sources of Data Collection?


Ans:
Many sources of data are available today. The two major kinds of sources for data collection are:
1. Online
2. Offline

Online Sources                                  Offline Sources
Open-source web portals run by the Government   Sensors
Reliable private websites such as Kaggle        Surveys
World Organizations' open-source websites       Interviews
WHO websites                                    Observations

Q: What are the Data Formats / Types used in Data Science?

Ans:
For data science models or projects, data is generally collected in the form of tables, in different formats:

1. CSV: It is a common and simple file format for storing data in tabular form. It can be opened with any
spreadsheet software (MS Excel), word-processing software (MS Word) or text editor (Notepad). Every
line contains a record, each record has a number of fields, and these fields are separated by commas.
2. Spreadsheet: A spreadsheet contains rows and columns to represent data in tabular form. Spreadsheets are mostly
used to calculate, manipulate and analyse data and to maintain data records. MS Excel is a well-known and
popular spreadsheet software.
3. SQL: It stands for Structured Query Language. It is used to handle the data stored in a DBMS (Database
Management System). It provides basic commands to create, alter and delete data and to manage transactions for
database management.
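
As an illustration of the CSV and SQL formats described above, here is a minimal Python sketch that uses only the standard library. The sample records, the table name students and the helper functions are hypothetical and chosen only for this example.

```python
import csv
import io
import sqlite3

# Hypothetical CSV text: the first line names the fields,
# and every later line is one record with comma-separated fields.
CSV_TEXT = """name,grade,marks
Asha,10,88
Ravi,10,92
"""

def read_records(text):
    """Parse CSV text into a list of dictionaries, one per record."""
    return list(csv.DictReader(io.StringIO(text)))

def load_into_sql(records):
    """Store the records in an in-memory SQLite table and return the connection."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE students (name TEXT, grade INTEGER, marks INTEGER)")
    conn.executemany(
        "INSERT INTO students (name, grade, marks) VALUES (:name, :grade, :marks)",
        records,
    )
    conn.commit()
    return conn

records = read_records(CSV_TEXT)
conn = load_into_sql(records)

# Use an SQL query to fetch the names of students who scored above 90.
top = [row[0] for row in conn.execute("SELECT name FROM students WHERE marks > 90")]
print(records[0]["name"], top)
```

The same tabular data appears here in two forms: as comma-separated text, and as rows in a database table that can be queried with SQL commands.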

Q: Name some issues that may appear while collecting the data needed for Data Science.
Ans: Some issues are:
i. Erroneous Data- The values in the dataset are not as expected. It is of two types:
a. Incorrect Values
b. Invalid or Null Values
ii. Missing Data- Data is not present at the desired location in a dataset.
iii. Outlier Data- Data that differs drastically from the rest of the data.
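
These three issues can be checked for with a minimal Python sketch. It assumes a hypothetical column of marks where None stands for a missing entry and the valid range is 0 to 100. The 1.5 x IQR rule used for outliers is one common convention, not something these notes prescribe.

```python
import statistics

# Hypothetical marks column; None means the value was never recorded.
marks = [72, 68, None, 75, -5, 70, 71, 12, 69, 73]

def is_valid(value, low=0, high=100):
    """A value is valid if it is present and inside the assumed range."""
    return value is not None and low <= value <= high

def find_missing(values):
    """Positions where no value is present (missing data)."""
    return [i for i, v in enumerate(values) if v is None]

def find_invalid(values):
    """Positions of present values outside the allowed range (erroneous data)."""
    return [i for i, v in enumerate(values) if v is not None and not is_valid(v)]

def find_outliers(values, k=1.5):
    """Positions of valid values lying outside the k * IQR fences (outliers)."""
    valid = [v for v in values if is_valid(v)]
    q1, _, q3 = statistics.quantiles(valid, n=4)
    spread = q3 - q1
    lower, upper = q1 - k * spread, q3 + k * spread
    return [i for i, v in enumerate(values) if is_valid(v) and not lower <= v <= upper]

print("missing at:", find_missing(marks))
print("invalid at:", find_invalid(marks))
print("outliers at:", find_outliers(marks))
```

In this sample, the entry -5 is erroneous because marks cannot be negative, while 12 is a possible mark that still differs drastically from the rest and is therefore flagged as an outlier.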

*********************
