Chapter-1 DS

Data science is a multidisciplinary field focused on extracting insights from structured and unstructured data using various technologies and algorithms. It has applications across multiple domains, including healthcare, finance, and transportation, and is often compared to business intelligence, machine learning, and artificial intelligence. Data warehousing and data mining are integral components of data science, facilitating the storage, analysis, and discovery of patterns in large datasets.

Uploaded by

trexwarrior92

Chapter-1

What is Data Science? Definition and scope of Data Science, Applications
and domains of Data Science, Comparison with other fields like Business
Intelligence (BI), Artificial Intelligence (AI), Machine Learning (ML), and
Data Warehousing/Data Mining (DW-DM)

Data Science:

Data science is the in-depth study of massive amounts of data. It involves
extracting meaningful insights from raw, structured, and unstructured data that is
processed using the scientific method, different technologies, and algorithms.
It is a multidisciplinary field that uses tools and techniques to manipulate data so
that you can find something new and meaningful.

Applications of Data Science:

o Image recognition and speech recognition:
Data science is currently used for image and speech recognition. When you
upload an image on Facebook, you start getting suggestions to tag your
friends. This automatic tagging suggestion uses an image recognition
algorithm, which is part of data science.
When you speak to a voice assistant such as "Ok Google", Siri, or Cortana,
the device responds to your voice. This is made possible by speech
recognition algorithms.
o Gaming:
In the gaming world, the use of machine learning algorithms is increasing day
by day. EA Sports, Sony, and Nintendo are widely using data science to
enhance the user experience.
o Internet:
When we want to search for something on the internet, we use search
engines such as Google, Yahoo, Bing, Ask, etc. All these search engines use
data science technology to make the search experience better, so that you
get a search result within a fraction of a second.
o Transport:
Transport industries are also using data science technology to create
self-driving cars. Self-driving cars are expected to help reduce the number of
road accidents.
o Healthcare:
In the healthcare sector, data science is providing lots of benefits. Data science
is being used for tumor detection, drug discovery, medical image analysis,
virtual medical bots, etc.
o Recommendation systems:
Many companies, such as Amazon, Netflix, Google Play, etc., are using
data science technology to provide a better user experience with personalized
recommendations. For example, when you search for something on Amazon,
you start getting suggestions for similar products. This is because of data
science technology.
o Risk detection:
Finance industries have always had issues with fraud and the risk of losses,
but with the help of data science, these can be reduced.
Most finance companies are looking for data scientists to help them avoid risk
and losses while increasing customer satisfaction.
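The "similar products" suggestion described under Recommendation systems can be sketched as a count of which products are bought together. This is only a minimal illustration, not how Amazon or any other company actually does it; the function name `similar_products` and the order history are invented for the example.

```python
from collections import Counter

def similar_products(orders, product, top_n=3):
    """Rank other products by how often they appear in the same order as `product`."""
    co_counts = Counter()
    for order in orders:
        items = set(order)
        if product in items:
            co_counts.update(items - {product})
    # Sort by descending count, then alphabetically so ties are deterministic
    ranked = sorted(co_counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:top_n]]

# Hypothetical order history, invented for illustration
orders = [
    ["laptop", "mouse", "laptop bag"],
    ["laptop", "mouse"],
    ["phone", "phone case"],
    ["laptop", "mouse", "keyboard"],
]

suggestions = similar_products(orders, "laptop")
print(suggestions)
```

Real recommendation systems use far richer signals (ratings, browsing history, learned models); co-occurrence counting is just the simplest starting point.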

BI stands for business intelligence, which is also used for the analysis of
business information.

Differences between BI and Data Science:

| Criterion | Business intelligence | Data science |
|---|---|---|
| Data source | Business intelligence deals with structured data, e.g., a data warehouse. | Data science deals with structured and unstructured data, e.g., weblogs, feedback, etc. |
| Method | Analytical (historical data) | Scientific (goes deeper to find the reason behind the data report) |
| Skills | Statistics and visualization are the two skills required for business intelligence. | Statistics, visualization, and machine learning are the required skills for data science. |
| Focus | Business intelligence focuses on both past and present data. | Data science focuses on past data, present data, and also future predictions. |
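The "Focus" row above can be illustrated with a small sketch: a BI-style summary describes what already happened, while a data-science-style step fits a trend and predicts the next value. The sales figures are invented, and a least-squares line is only one possible choice of predictive model.

```python
from statistics import mean

# Hypothetical monthly sales figures, invented for illustration
months = [1, 2, 3, 4, 5, 6]
sales = [100.0, 110.0, 120.0, 130.0, 140.0, 150.0]

# BI-style question: what happened? (summary of historical data)
average_sales = mean(sales)

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return slope, y_bar - slope * x_bar

# Data-science-style question: what is likely to happen next?
slope, intercept = fit_line(months, sales)
forecast_month_7 = slope * 7 + intercept

print(average_sales, forecast_month_7)
```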

Difference between Data Science and Machine Learning:

| Data Science | Machine Learning |
|---|---|
| It deals with understanding and finding hidden patterns or useful insights from the data, which helps to make smarter business decisions. | It is a subfield of data science that enables the machine to learn from past data and experiences automatically. |
| It is used for discovering insights from the data. | It is used for making predictions and classifying the result for new data points. |
| It is a broad term that includes various steps to create a model for a given problem and deploy the model. | It is used in the data modeling step of data science as a complete process. |
| A data scientist needs to have skills to use big data tools like Hadoop, Hive and Pig, statistics, and programming in Python, R, or Scala. | A Machine Learning Engineer needs to have skills such as computer science fundamentals, programming skills in Python or R, statistics and probability concepts, etc. |
| It can work with raw, structured, and unstructured data. | It mostly requires structured data to work on. |
| Data scientists spend lots of time handling the data, cleansing the data, and understanding its patterns. | ML engineers spend a lot of time managing the complexities that occur during the implementation of algorithms and the mathematical concepts behind them. |
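To make "learning from past data and classifying new data points" concrete, here is a minimal sketch of a 1-nearest-neighbor classifier. It is one of the simplest machine learning methods and is chosen here only for illustration; the training data and labels are invented.

```python
import math

def nearest_neighbor_predict(training_data, new_point):
    """Return the label of the training example closest to `new_point`."""
    _, closest_label = min(
        training_data,
        key=lambda example: math.dist(example[0], new_point),
    )
    return closest_label

# Hypothetical labeled past data: (height_cm, weight_kg) -> size label
training_data = [
    ((150.0, 50.0), "small"),
    ((155.0, 55.0), "small"),
    ((180.0, 85.0), "large"),
    ((185.0, 90.0), "large"),
]

print(nearest_neighbor_predict(training_data, (152.0, 52.0)))
print(nearest_neighbor_predict(training_data, (182.0, 88.0)))
```

Note that the input here is structured (fixed numeric features), which matches the table's remark that machine learning mostly works on structured data.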

Difference between Data Science and AI

| Aspect | Data Science | Artificial Intelligence |
|---|---|---|
| Basics | Data Science is a detailed process that mainly involves pre-processing, analysis, visualization and prediction. | AI (short for artificial intelligence) is the implementation of a predictive model to forecast future events and trends. |
| Goals | Identifying the patterns that are concealed in the data is the main objective of data science. | Automation of the process and the granting of autonomy to the data model are the main goals of artificial intelligence. |
| Types of data | Data Science works with a variety of different types of data, including structured, semi-structured, and unstructured data. | AI uses standardized data in the form of vectors and embeddings. |
| Scientific processing | It has a high degree of scientific processing. | It has a lot of high levels of complex processing. |
| Tools used | The tools utilized in Data Science are far more extensive than those used in AI. This is because Data Science entails a number of procedures for analyzing data and developing insights from it. | The tools used in AI are less extensive compared to Data Science. |
| Build | By using the concept of data science, we can build complex models about statistics and facts about data. | By using this we emulate cognition and human understanding to a certain level. |
| Technique used | It uses the technique of data analysis and data analytics. | It uses a lot of machine learning techniques. |
| Use | Data science makes use of graphical representation. | Artificial intelligence makes use of algorithms and network node representation. |
| Knowledge | Its knowledge was established to find hidden patterns and trends in the data. | Its knowledge is all about imparting some autonomy to a data model. |
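The "Types of data" row says AI works on standardized data in the form of vectors. As a rough sketch of what that means, the code below turns short texts into word-count vectors and compares them with cosine similarity. Word counts are a simple stand-in assumed for this example; real embeddings are learned by a model. The vocabulary and sentences are invented.

```python
import math
from collections import Counter

def to_vector(text, vocabulary):
    """Represent `text` as a list of word counts, one entry per vocabulary word."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical vocabulary and texts
vocabulary = ["data", "science", "warehouse", "mining"]
v1 = to_vector("data science data mining", vocabulary)
v2 = to_vector("data warehouse", vocabulary)

print(v1, v2, cosine_similarity(v1, v2))
```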

Data Warehousing
A Data Warehouse (DW) is a relational database that is designed for query and
analysis rather than transaction processing. It includes historical data derived from
transaction data from single and multiple sources.
A Data Warehouse provides integrated, enterprise-wide, historical data and focuses
on providing support for decision-makers for data modeling and analysis.
A Data Warehouse is a group of data specific to the entire organization, not only to
a particular group of users.
It is not used for daily operations and transaction processing but used for making
decisions.
A Data Warehouse can be viewed as a data system with the following attributes:

o It is a database designed for investigative tasks, using data from various
applications.
o It supports a relatively small number of clients with relatively long
interactions.
o It includes current and historical data to provide a historical perspective of
information.
o Its usage is read-intensive.
o It contains a few large tables.

"Data Warehouse is a subject-oriented, integrated, and time-variant store of
information in support of management's decisions."
Characteristics:

Subject-Oriented
A data warehouse focuses on the modeling and analysis of data for decision-makers.
Therefore, data warehouses typically provide a concise and straightforward view
around a particular subject, such as customer, product, or sales, instead of the
organization's ongoing operations as a whole. This is done by excluding data that
are not useful concerning the subject and including all data needed by the users to
understand the subject.
Integrated
A data warehouse integrates various heterogeneous data sources like RDBMS, flat
files, and online transaction records. It requires performing data cleaning and
integration during data warehousing to ensure consistency in naming conventions,
attribute types, etc., among different data sources.
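As a minimal sketch of the integration step just described, the code below maps records from two sources with different naming conventions and attribute types onto one common format. The source systems, field names, and values are all hypothetical.

```python
def from_crm(record):
    # Hypothetical CRM export: uses "cust_id" and stores all values as strings
    return {"customer_id": int(record["cust_id"]), "amount": float(record["amt"])}

def from_billing(record):
    # Hypothetical billing system: uses "CustomerID" and amounts in cents
    return {
        "customer_id": int(record["CustomerID"]),
        "amount": record["amount_cents"] / 100,
    }

crm_rows = [{"cust_id": "7", "amt": "19.99"}]
billing_rows = [{"CustomerID": 7, "amount_cents": 500}]

# After integration, every row uses the same names, types, and units
integrated = [from_crm(r) for r in crm_rows] + [from_billing(r) for r in billing_rows]
print(integrated)
```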
Time-Variant
Historical information is kept in a data warehouse. For example, one can retrieve
data from 3 months, 6 months, or 12 months ago, or even older data, from a data
warehouse. This contrasts with a transaction system, where often only the most
current data is kept.
Non-Volatile
The data warehouse is a physically separate data store whose contents are
transformed from the source operational RDBMS. The operational updates of data do
not occur in the data warehouse, i.e., update, insert, and delete operations are
not performed. It usually requires only two procedures in data accessing: initial
loading of data and access to data. Therefore, the DW does not require transaction
processing, recovery, and concurrency capabilities, which allows for substantial
speedup of data retrieval.
Non-volatile means that once data has entered the warehouse, it should not change.
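The time-variant and non-volatile characteristics can be sketched by contrasting two small in-memory stores: an operational store that overwrites values, and a warehouse-style history that only loads and reads rows. The class names and the price data are invented for this illustration.

```python
from datetime import date

class OperationalStore:
    """Keeps only the current value per key; older values are overwritten."""
    def __init__(self):
        self.current = {}

    def update(self, key, value):
        self.current[key] = value

class WarehouseHistory:
    """Append-only history: rows are loaded and read, never updated or deleted."""
    def __init__(self):
        self._rows = []

    def load(self, snapshot_date, key, value):
        self._rows.append((snapshot_date, key, value))

    def as_of(self, key, when):
        """Return the latest loaded value for `key` on or before `when`."""
        matches = [(d, v) for d, k, v in self._rows if k == key and d <= when]
        if not matches:
            return None
        return max(matches, key=lambda match: match[0])[1]

# Hypothetical product price that changes over time
operational = OperationalStore()
history = WarehouseHistory()
for day, price in [(date(2024, 1, 1), 10), (date(2024, 4, 1), 12)]:
    operational.update("price", price)
    history.load(day, "price", price)

print(operational.current["price"])                 # only the latest value survives
print(history.as_of("price", date(2024, 2, 15)))    # the earlier value is still available
```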
Goals of Data Warehousing

o Support reporting as well as analysis.
o Maintain the organization's historical information.
o Be the foundation for decision making.

Benefits of Data Warehouse

1. Understand business trends and make better forecasting decisions.

2. Data Warehouses are designed to store enormous amounts of data.
3. The structure of data warehouses is more accessible for end-users to navigate,
understand, and query.
4. Queries that would be complex in many normalized databases could be easier
to build and maintain in data warehouses.
5. Data warehousing is an efficient method to manage demand for lots of
information from lots of users.
6. Data warehousing provides the capabilities to analyze a large amount of
historical data.

Difference between database and data warehouse:

| Database | Data Warehouse |
|---|---|
| 1. It is used for Online Transactional Processing (OLTP) but can be used for other objectives such as data warehousing. It records the data from the clients for history. | 1. It is used for Online Analytical Processing (OLAP). It reads the historical information about the customers for business decisions. |
| 2. The tables and joins are complicated since they are normalized for RDBMS. This is done to reduce redundant data and to save storage space. | 2. The tables and joins are simple since they are denormalized. This is done to minimize the response time for analytical queries. |
| 3. Data is dynamic. | 3. Data is largely static. |
| 4. Entity-relationship modeling techniques are used for RDBMS database design. | 4. Data modeling techniques are used for the data warehouse design. |
| 5. Optimized for write operations. | 5. Optimized for read operations. |
| 6. Performance is low for analysis queries. | 6. High performance for analytical queries. |
| 7. The database is the place where data is stored and managed to provide fast and efficient access. | 7. The data warehouse is the place where application data is handled for analysis and reporting objectives. |
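Row 2 of the table can be illustrated with Python's built-in `sqlite3` module. The sketch below builds a small normalized schema and a denormalized copy of the same data, then answers one analytical question against each. The table names, columns, and values are invented; a real warehouse would be far larger and typically use a star schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized (OLTP-style): each region is stored once and referenced by id
    CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'North'), (2, 'South');
    INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 15.0), (3, 2, 20.0);

    -- Denormalized (warehouse-style): the region is repeated on every sales row
    CREATE TABLE sales_flat (order_id INTEGER, region TEXT, amount REAL);
    INSERT INTO sales_flat
        SELECT o.id, c.region, o.amount
        FROM orders o JOIN customers c ON c.id = o.customer_id;
""")

# The same question (total sales per region) needs a join on the normalized tables...
normalized_totals = conn.execute("""
    SELECT c.region, SUM(o.amount)
    FROM orders o JOIN customers c ON c.id = o.customer_id
    GROUP BY c.region ORDER BY c.region
""").fetchall()

# ...but only a single-table scan on the denormalized table
flat_totals = conn.execute("""
    SELECT region, SUM(amount) FROM sales_flat
    GROUP BY region ORDER BY region
""").fetchall()

print(normalized_totals, flat_totals)
```

The denormalized table trades extra storage (repeated region values) for simpler, join-free analytical queries.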

ETL (Extract, Transform, and Load) Process

The mechanism of extracting information from source systems and bringing it into
the data warehouse is commonly called ETL, which stands for Extraction,
Transformation and Loading.
The ETL process is technically challenging and requires active input from various
stakeholders, including developers, analysts, testers, and top executives.
To maintain its value as a tool for decision-makers, the data warehouse system needs
to change as the business changes. ETL is a recurring activity (daily, weekly,
monthly) of a data warehouse system and needs to be agile, automated, and well
documented.

Extraction

o Extraction is the operation of extracting information from a source system for
further use in a data warehouse environment. This is the first stage of the ETL
process.
o The extraction process is often one of the most time-consuming tasks in ETL.
o The source systems might be complicated and poorly documented, and thus
determining which data needs to be extracted can be difficult.
o The data has to be extracted several times in a periodic manner to supply all
the changed data to the warehouse and keep it up-to-date.
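Periodic extraction of only the changed data is often done with a "watermark": the time of the last extraction is remembered, and the next run picks up rows modified after it. The sketch below assumes each source row carries a `modified_at` timestamp, which is an assumption of this example rather than something every source system provides.

```python
from datetime import datetime

def extract_changed(source_rows, last_extracted_at):
    """Return rows modified after the previous extraction, plus the new watermark."""
    changed = [row for row in source_rows if row["modified_at"] > last_extracted_at]
    new_watermark = max(
        (row["modified_at"] for row in changed),
        default=last_extracted_at,
    )
    return changed, new_watermark

# Hypothetical source table with a last-modified timestamp per row
source_rows = [
    {"id": 1, "name": "Asha", "modified_at": datetime(2024, 1, 1)},
    {"id": 2, "name": "Ben", "modified_at": datetime(2024, 1, 5)},
]

changed, watermark = extract_changed(source_rows, datetime(2024, 1, 2))
print([row["id"] for row in changed], watermark)
```

A timestamp watermark does not notice rows that were deleted in the source; handling deletions needs a different mechanism, such as comparing full snapshots.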

Cleansing
The cleansing stage is crucial in a data warehouse system because it is supposed
to improve data quality. The primary data cleansing features found in ETL tools are
rectification and homogenization. They use specific dictionaries to rectify typing
mistakes and to recognize synonyms, as well as rule-based cleansing to enforce
domain-specific rules and define appropriate associations between values.
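A minimal sketch of these two cleansing ideas follows: a dictionary that maps typing mistakes and synonyms to one canonical value, and a rule that rejects values outside an allowed range. The dictionary entries and the age rule are hypothetical examples, not taken from any particular ETL tool.

```python
# Hypothetical correction dictionary: maps typos and synonyms to one canonical value
CITY_DICTIONARY = {
    "bombay": "Mumbai",
    "mumbay": "Mumbai",
    "mumbai": "Mumbai",
    "new dehli": "New Delhi",
    "new delhi": "New Delhi",
}

def cleanse(record):
    """Return a cleaned copy of `record`, or None if it breaks a domain rule."""
    cleaned = dict(record)
    city = record["city"].strip()
    # Rectification and homogenization: unknown cities are left unchanged
    cleaned["city"] = CITY_DICTIONARY.get(city.lower(), city)
    # Rule-based cleansing: an assumed domain rule that age must be 0 to 120
    if not 0 <= cleaned["age"] <= 120:
        return None
    return cleaned

print(cleanse({"city": " Bombay ", "age": 30}))
print(cleanse({"city": "Pune", "age": 200}))
```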
Transformation
Transformation is the core of the reconciliation phase. It converts records from their
operational source format into a particular data warehouse format. If we implement
a three-layer architecture, this phase outputs our reconciled data layer.
Loading
Loading is the process of writing the data into the target database. During the load
step, it is necessary to ensure that the load is performed correctly and with as few
resources as possible.
Loading can be carried out in two ways:

1. Refresh: Data Warehouse data is completely rewritten. This means that older
data is replaced. Refresh is usually used in combination with static extraction
to populate a data warehouse initially.
2. Update: Only those changes applied to source information are added to the
Data Warehouse. An update is typically carried out without deleting or
modifying pre-existing data. This method is used in combination with
incremental extraction to update data warehouses regularly.
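The two loading modes can be sketched with a plain Python list standing in for a warehouse table. The function names and rows are invented for illustration; a real load would write to a database.

```python
def refresh_load(warehouse, extracted_rows):
    """Refresh: discard the existing contents and rewrite the table in full."""
    warehouse.clear()
    warehouse.extend(extracted_rows)

def update_load(warehouse, changed_rows):
    """Update: append only the changes; existing rows are left untouched."""
    warehouse.extend(changed_rows)

# A list stands in for the warehouse table in this sketch
warehouse = []
refresh_load(warehouse, [{"id": 1, "qty": 5}, {"id": 2, "qty": 3}])  # initial population
update_load(warehouse, [{"id": 3, "qty": 7}])                        # later incremental load

print(warehouse)
```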
Data Mining:
Data mining is the process of extracting information from huge sets of data to
identify patterns, trends, and useful data that allow a business to make
data-driven decisions.
In other words, data mining is the process of investigating hidden patterns in
information from various perspectives and categorizing them into useful data. This
data is collected and assembled in particular areas, such as data warehouses, for
efficient analysis by data mining algorithms, which supports decision-making and
other data requirements and ultimately helps cut costs and generate revenue.
Data mining is the act of automatically searching large stores of information to
find trends and patterns that go beyond simple analysis procedures. Data mining
uses complex mathematical algorithms to segment the data and evaluate the
probability of future events. Data mining is also called Knowledge Discovery in
Databases (KDD).
Data mining can be performed on the following types of data:
Relational Database:
A relational database is a collection of multiple data sets formally organized by
tables, records, and columns from which data can be accessed in various ways
without having to reorganize the database tables. Tables convey and share
information, which facilitates data searchability, reporting, and organization.
Data warehouses:
A Data Warehouse is the technology that collects data from various sources
within the organization to provide meaningful business insights. The huge amount
of data comes from multiple places such as Marketing and Finance. The extracted
data is utilized for analytical purposes and helps in decision-making for a business
organization. The data warehouse is designed for the analysis of data rather than
transaction processing.
Data Repositories:
The Data Repository generally refers to a destination for data storage. However,
many IT professionals use the term more specifically to refer to a particular kind of
setup within an IT structure, for example, a group of databases where an organization
has kept various kinds of information.
Object-Relational Database:
A combination of an object-oriented database model and relational database model
is called an object-relational model. It supports Classes, Objects, Inheritance, etc.
Transactional Database:
A transactional database refers to a database management system (DBMS) that has
the ability to undo a database transaction if it is not performed appropriately. Even
though this was a unique capability a long time ago, today most relational
database systems support transactional database activities.
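The ability to undo a transaction can be demonstrated with Python's built-in `sqlite3` module. In this sketch, a transfer between two accounts consists of two updates; if the second one violates a constraint, the first is rolled back as well. The `accounts` table, the names, and the balances are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()

def transfer(conn, sender, receiver, amount):
    """Move `amount` between accounts; undo both updates if either one fails."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, receiver),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, sender),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))

print(transfer(conn, "alice", "bob", 500), balances(conn))  # fails, nothing changes
print(transfer(conn, "alice", "bob", 30), balances(conn))   # succeeds
```

In the failing call, the credit to the receiver has already been applied when the debit breaks the `CHECK` constraint, so the rollback is what keeps the two balances consistent.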

Advantages of Data Mining

o The Data Mining technique enables organizations to obtain knowledge-based
data.
o Data mining enables organizations to make profitable modifications in
operation and production.
o Compared with other statistical data applications, data mining is
cost-efficient.
o Data Mining helps the decision-making process of an organization.
o It facilitates the automated discovery of hidden patterns as well as the
prediction of trends and behaviors.
o It can be introduced into new systems as well as existing platforms.
o It is a quick process that makes it easy for new users to analyze enormous
amounts of data in a short time.

Disadvantages of Data Mining

o There is a possibility that organizations may sell useful customer data
to other organizations for money. As per the report, American Express has
sold credit card purchases of their customers to other organizations.
o Many data mining analytics tools are difficult to operate and require advanced
training to work with.
o Different data mining instruments operate in distinct ways due to the different
algorithms used in their design. Therefore, the selection of the right data
mining tools is a very challenging task.
o Data mining techniques are not always precise, which may lead to severe
consequences in certain conditions.

Data Mining Applications

Data Mining is primarily used by organizations with intense consumer demands, such
as retail, communication, financial, and marketing companies, to determine prices,
consumer preferences, product positioning, and the impact on sales, customer
satisfaction, and corporate profits. Data mining enables a retailer to use
point-of-sale records of customer purchases to develop products and promotions that
help the organization to attract customers.
Data Mining Techniques
Data mining includes the utilization of refined data analysis tools to find previously
unknown, valid patterns and relationships in huge data sets. These tools can
incorporate statistical models, machine learning techniques, and mathematical
algorithms, such as neural networks or decision trees. Thus, data mining incorporates
analysis and prediction.
Drawing on various methods and technologies from the intersection of machine
learning, database management, and statistics, professionals in data mining have
devoted their careers to better understanding how to process and draw conclusions
from huge amounts of data. But what are the methods they use to make it happen?
In recent data mining projects, various major data mining techniques have been
developed and used, including association, classification, clustering, prediction,
sequential patterns, and regression.
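Of the techniques listed, association is easy to sketch. An association rule such as "customers who buy bread also buy butter" is usually judged by its support (how often the items occur together) and its confidence (how often the rule holds when its first part occurs). The point-of-sale baskets below are invented, and this sketch only scores one given rule rather than searching for rules.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) from the transactions.

    Assumes the antecedent occurs at least once; otherwise this divides by zero.
    """
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)

# Hypothetical point-of-sale baskets
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread"},
    {"milk"},
]

print(support(baskets, {"bread", "butter"}))
print(confidence(baskets, {"bread"}, {"butter"}))
```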

Chapter Ends…
