Unit-4 Data Science
In addition, the number of internet users is increasing day by day, which increases the volume of unstructured
data. Organizations collect this unstructured data through mobile apps, websites and other platforms, and can
use it to serve the specific requirements of customers and users. This is the main reason for the growing
demand for data science.
1 Banking: Customer Services, Risk Modelling, Fraud Detection
2 Finance: Decision-Making, Risk Analytics, Algorithmic Trading
3 Manufacturing: Maintenance Scheduling, Predicting Problems, Detecting Anomalies, Automated Processes
6 E-Commerce: Consumer Identification, Analysis of Reviews, Recommendations, Digital Advertisements (Targeted Advertisements)
1. CSV: It stands for Comma-Separated Values. It is a common and simple file format for storing data in tabular
form. It can be opened with any spreadsheet software (MS Excel), word processor (MS Word) or text editor
(Notepad). Every line contains one record, each record has a number of fields, and the fields are separated by commas.
2. Spreadsheet: A spreadsheet uses rows and columns to represent data in tabular form. Spreadsheets are mostly
used to calculate, manipulate and analyse data and to maintain data records. MS Excel is a well-known and
popular spreadsheet software.
3. SQL: It stands for Structured Query Language. It is used to handle the data stored in a DBMS (Database
Management System). It provides basic commands to create, alter and delete data and to manage transactions
in a database.
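The short Python sketch below illustrates how CSV and SQL data can be handled in a program. It is only an illustration: the student records, the table name `students` and the helper functions `load_records` and `build_database` are made up for this example. It uses Python's built-in `csv` and `sqlite3` modules to read comma-separated records and then store and query them with SQL commands.

```python
import csv
import io
import sqlite3

# Hypothetical CSV content: one record per line, fields separated by commas.
CSV_TEXT = """name,city,marks
Asha,Delhi,82
Ravi,Mumbai,74
Meena,Chennai,91
"""


def load_records(csv_text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    for row in rows:
        row["marks"] = int(row["marks"])  # CSV fields are read as text
    return rows


def build_database(records):
    """Store the records in an in-memory SQLite database using SQL commands."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE students (name TEXT, city TEXT, marks INTEGER)")
    conn.executemany(
        "INSERT INTO students (name, city, marks) VALUES (:name, :city, :marks)",
        records,
    )
    conn.commit()  # end the transaction so the inserted rows are saved
    return conn


records = load_records(CSV_TEXT)
conn = build_database(records)

# Query the table: names of students with more than 80 marks, highest first.
top = conn.execute(
    "SELECT name FROM students WHERE marks > 80 ORDER BY marks DESC"
).fetchall()
print(top)
```

Here the CSV text is kept in a string so that the example runs without any file. To read a real file, the same `csv.DictReader` can be given a file opened with `open()`.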
Q: Name some issues that may appear at the time of collecting the data needed for Data Science.
Ans: Some issues are:
i. Erroneous Data- The values in the dataset are not as expected. It is of two types-
a. Incorrect Values b. Invalid or Null Values
ii. Missing Data- Data is not present at the desired location in the dataset.
iii. Outlier Data- Data that differs drastically from the rest of the data.
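The three issues above can be checked with a small Python sketch. The age values, the valid range of 0 to 120, the function name `find_issues` and the outlier rule (a value lying more than 5 times the median absolute deviation away from the median) are all assumptions chosen for this illustration; they are not fixed rules. In this sketch `None` is treated as missing data, and a value outside the valid range is treated as erroneous data.

```python
from statistics import median

# Hypothetical ages collected from a sign-up form (None marks a missing value).
ages = [23, 25, None, 24, -5, 26, 22, 95, 25]


def find_issues(values, low=0, high=120, k=5.0):
    """Return the positions of missing, erroneous and outlier values."""
    # Missing data: nothing is present at that position.
    missing = [i for i, v in enumerate(values) if v is None]

    # Erroneous data: a value is present but lies outside the valid range.
    erroneous = [
        i for i, v in enumerate(values)
        if v is not None and not (low <= v <= high)
    ]

    # Outliers: valid values that are very far from the rest of the data.
    valid = [(i, v) for i, v in enumerate(values)
             if v is not None and low <= v <= high]
    centre = median(v for _, v in valid)
    mad = median(abs(v - centre) for _, v in valid)  # median absolute deviation
    outliers = [i for i, v in valid if mad > 0 and abs(v - centre) > k * mad]

    return {"missing": missing, "erroneous": erroneous, "outliers": outliers}


issues = find_issues(ages)
print(issues)
```

For the sample list, position 2 is missing, position 4 (the age -5) is erroneous, and position 7 (the age 95) is an outlier, because it is valid but differs drastically from the other ages.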
*********************