Unit 7
Advanced Topics
Database Performance Tuning:
• Database performance tuning refers to the group of activities DBAs perform to ensure databases operate smoothly and efficiently. It re-optimizes a database system from top to bottom, from software to hardware, to improve overall performance.
• Tuning involves accelerating query response, improving indexing, deploying clusters, and reconfiguring the operating system so that each layer best supports system function and the end-user experience (a brief indexing example follows). MySQL and Oracle are prominent examples of database management systems (DBMS) on which DBAs generally perform database tuning.
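As a minimal sketch of one tuning step, the SQLite snippet below (table and column names are invented for illustration) shows how adding an index changes the query plan from a full table scan to an index search:

```python
import sqlite3

# In-memory database; a hypothetical "orders" table stands in for real data.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")
con.executemany("INSERT INTO orders (customer_id, total) VALUES (?, ?)",
                [(i % 100, i * 1.5) for i in range(10_000)])

# Before indexing: the planner must scan the whole table.
print(con.execute("EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = 42").fetchall())

# Add an index on the filtered column, a typical first tuning step.
con.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

# After indexing: the plan switches to an index search instead of a full scan.
print(con.execute("EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = 42").fetchall())
```

Running it prints a SCAN plan before the index and a SEARCH ... USING INDEX plan after, which is the kind of before/after evidence a DBA looks for when tuning.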
Database Security
• Database security refers to the array of controls, tools, and procedures designed to ensure and safeguard confidentiality, integrity, and availability. This unit concentrates on confidentiality, because it is the component most at risk in data security breaches.
Database security must cover and safeguard the following aspects:
• The database containing the data.
• The database management system (DBMS).
• Any applications associated with the database (a small example follows this list).
• The physical or virtual database servers and the hardware that runs them.
• The computing or network infrastructure used to connect to the database.
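As one concrete application-level control for the third item above, here is a minimal sketch (using Python's built-in sqlite3 module and an invented users table) of parameterized queries, which protect confidentiality against SQL injection:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, salary REAL)")
con.execute("INSERT INTO users (name, salary) VALUES ('alice', 50000)")

user_input = "alice' OR '1'='1"  # a classic injection attempt

# Unsafe: string formatting lets the input rewrite the query and leak every row.
# rows = con.execute(f"SELECT * FROM users WHERE name = '{user_input}'").fetchall()

# Safe: the ? placeholder treats the input strictly as data, never as SQL.
rows = con.execute("SELECT * FROM users WHERE name = ?", (user_input,)).fetchall()
print(rows)  # [] -- the injection string matches no real user
```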
Concept of Parallel and Distributed Databases
1. Parallel Database:
• A parallel DBMS is a DBMS that runs across multiple processors and is designed to execute operations in parallel whenever possible. A parallel DBMS links a number of smaller machines to achieve the same throughput as expected from a single large machine.
Features:
• Multiple CPUs work in parallel.
• It improves performance.
• It divides large tasks into smaller subtasks (see the sketch after this list).
• It completes work very quickly.
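A minimal sketch of this divide-and-conquer idea, using Python's multiprocessing module to aggregate partitions of an invented table in parallel (the data and the four-way partitioning are made up for the example):

```python
from multiprocessing import Pool

def partial_sum(partition):
    """Each worker aggregates its own fragment of the 'table'."""
    return sum(row["total"] for row in partition)

if __name__ == "__main__":
    # A toy table of 1,000 rows, split into 4 partitions (one per CPU).
    table = [{"id": i, "total": i * 2.0} for i in range(1000)]
    partitions = [table[i::4] for i in range(4)]

    with Pool(processes=4) as pool:
        # The large aggregation task is divided into smaller parallel subtasks...
        partials = pool.map(partial_sum, partitions)

    # ...and the partial results are combined, as a parallel DBMS would do.
    print(sum(partials))
```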
2. Distributed Database:
• A distributed database is defined as a logically related collection of shared data that is physically distributed over a computer network across different sites. A distributed DBMS is the software that manages the distributed database and makes the distributed data available to users.
Features:
• It is a collection of logically related shared data.
• The data is split into various fragments, as sketched below.
• Fragments may be replicated.
• The sites are linked by a communication network.
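A toy sketch of fragmentation and replication (the site names, the hashing rule, and the replication scheme are all invented for illustration):

```python
# Horizontal fragmentation: rows are assigned to sites by hashing a key,
# and each fragment is replicated to one backup site.
SITES = ["site_a", "site_b", "site_c"]

def fragment_for(key):
    return hash(key) % len(SITES)

storage = {site: [] for site in SITES}
rows = [{"id": i, "city": c} for i, c in enumerate(["Pune", "Delhi", "Goa", "Agra"])]

for row in rows:
    primary = fragment_for(row["id"])
    replica = (primary + 1) % len(SITES)        # simple replication scheme
    storage[SITES[primary]].append(row)          # primary fragment
    storage[SITES[replica]].append(dict(row))    # replicated copy

for site, fragment in storage.items():
    print(site, fragment)
```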
Concept of Data Warehousing and Data Mining
• Data Warehousing:
• It is a technology that aggregates structured data from one or more sources so that it can be compared and analyzed, rather than used for transaction processing. A data warehouse is designed to support the management decision-making process by providing a platform for data cleaning, data integration, and data consolidation. A data warehouse contains subject-oriented, integrated, time-variant, and non-volatile data. It consolidates data from many sources while ensuring data quality, consistency, and accuracy, and it improves system performance by separating analytics processing from transactional databases. Data flows into a data warehouse from the various operational databases. A data warehouse works by organizing data into a schema that describes the layout and type of data; query tools then analyze the data tables using that schema.
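As one common way of "organizing data into a schema", the hedged sketch below builds a tiny, hypothetical star schema (a fact table referencing dimension tables) in SQLite and runs an analytic query over it; all table and column names are made up:

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Dimension tables describe the "who/what/when".
con.execute("CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT)")
con.execute("CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER)")

# The fact table holds the measures and points at the dimensions.
con.execute("""CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id INTEGER REFERENCES dim_date(date_id),
    amount REAL)""")

con.executemany("INSERT INTO dim_product VALUES (?, ?)", [(1, "pen"), (2, "book")])
con.executemany("INSERT INTO dim_date VALUES (?, ?)", [(1, 2023), (2, 2024)])
con.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                [(1, 1, 10.0), (2, 1, 25.0), (2, 2, 40.0)])

# An analytic query: total sales per product per year.
for row in con.execute("""
    SELECT p.name, d.year, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product p ON p.product_id = f.product_id
    JOIN dim_date d ON d.date_id = f.date_id
    GROUP BY p.name, d.year"""):
    print(row)
```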
Advantages of Data Warehousing:
• The data warehouse's job is to make any form of corporate data easier to understand; the majority of the user's work consists of feeding in the raw data.
• The capacity to update continuously and frequently is the key benefit of this technology. As a result, data warehouses are ideal for organizations and entrepreneurs who want to stay current with their target audience and customers.
• It makes data more accessible to businesses and organizations.
• A data warehouse holds a large volume of historical data that users can use to evaluate different periods and trends in order to make predictions about the future.
Disadvantages of Data Warehousing:
• There is a great risk of accumulating irrelevant and useless data; data loss and erasure are other potential issues.
• Data is gathered from various sources into a data warehouse, so it must be cleansed and transformed before loading, which can be a difficult task.
Data Mining
• It is the process of finding patterns and correlations within large data
sets to identify relationships between data. Data mining tools allow a
business organization to predict customer behavior. Data mining tools
are used to build risk models and detect fraud. Data mining is used in
market analysis and management, fraud detection, corporate
analysis, and risk management.
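As a tiny, self-contained illustration of finding patterns and correlations, the sketch below counts how often pairs of items co-occur in invented shopping baskets, the core idea behind market-basket analysis:

```python
from itertools import combinations
from collections import Counter

# Invented transaction data: each basket is one customer's purchase.
baskets = [
    {"bread", "milk", "butter"},
    {"bread", "milk"},
    {"milk", "eggs"},
    {"bread", "butter"},
    {"bread", "milk", "eggs"},
]

# Count co-occurring pairs across all baskets.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Pairs seen in at least 2 baskets suggest a purchasing pattern.
for pair, count in pair_counts.most_common():
    if count >= 2:
        print(pair, count)
```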
Advantages of Data Mining:
• Data mining aids in a variety of data analysis and sorting procedures. One of the best implementations here is the identification and detection of undesired faults in a system, which permits risks to be eliminated sooner.
• In comparison to other statistical data applications, data mining methods are both cost-effective and efficient.
• Companies can take advantage of this analytical tool by providing appropriate and easily accessible knowledge-based data.
Functions of data mining:
• Forecasting: Using historical data to predict future values or trends.
• Risk and probability: Assessing the likelihood and potential impact of negative events or outcomes.
• Recommendation: Providing suggestions or personalized items based on user preferences or past behavior.
• Grouping: Grouping similar items or data points together based on their characteristics (see the sketch after this list).
• Finding sequences: Analyzing the order of events or actions to identify patterns and predict future events.
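A hand-rolled sketch of the grouping function: a minimal k-means-style loop that clusters invented one-dimensional customer ages into two groups (real data mining tools use far more robust implementations):

```python
# 1-D customer ages (invented) are grouped into 2 clusters.
ages = [18, 21, 22, 45, 48, 50, 19, 47]
centers = [ages[0], ages[3]]  # naive initial centers

for _ in range(10):  # a few refinement iterations are enough here
    clusters = {0: [], 1: []}
    for age in ages:
        # Assign each point to its nearest center.
        nearest = min((0, 1), key=lambda c: abs(age - centers[c]))
        clusters[nearest].append(age)
    # Move each center to the mean of its assigned points.
    centers = [sum(c) / len(c) for c in clusters.values()]

print(clusters)  # e.g. young customers vs. middle-aged customers
print(centers)
```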
Disadvantages of Data Mining:
• Data mining isn't always 100 percent accurate, and if done incorrectly, it can lead to data breaches.
• Organizations must devote a significant amount of resources to training and implementation. Furthermore, data mining tools are built on different algorithms, so different tools work in different ways.
Big Data
• Big Data is a term used to denote a collection of data sets so large and complex that it is very difficult to process using legacy data processing applications.
• In other words, legacy or traditional systems cannot process such a large amount of data in one go. But how do you classify data as problematic and hard to process?
Types of Big Data
Big Data is essentially classified into three types:
• Structured Data
• Unstructured Data
• Semi-structured Data
Structured Data
• Structured data is highly organized and is thus the easiest to work with. Its dimensions are defined by set parameters, and every piece of information is grouped into rows and columns, like a spreadsheet. Structured data covers quantitative values such as age, contact details, addresses, billing, expenses, and debit or credit card numbers.
Semi-structured Data
• Semi-structured data falls somewhere between structured and unstructured data. It is essentially unstructured data that has metadata attached to it. The metadata can be inherent, such as a location, time, email address, or device ID stamp, or it can be a semantic tag attached to the data later.
• Consider the example of an email. The time an email was sent, the
email addresses of the sender and the recipient, the IP address of the
device that the email was sent from, and other relevant information
are linked to the content of the email. While the actual content itself
is not structured, these components enable the data to be grouped in
a structured manner.
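To make the email example concrete, the sketch below (the addresses, IP, and content are made up) shows how the structured metadata wraps the unstructured body, for example as JSON:

```python
import json

# Hypothetical email: structured metadata around an unstructured body.
email = {
    "sent_at": "2024-05-01T09:30:00Z",   # structured: timestamp
    "from": "sender@example.com",        # structured: sender address
    "to": "recipient@example.com",       # structured: recipient address
    "source_ip": "192.0.2.10",           # structured: sending device's IP
    "body": "Hi, attaching the quarterly notes we discussed...",  # unstructured
}

# The metadata fields let emails be grouped, filtered, and queried
# even though the body text itself has no fixed structure.
print(json.dumps(email, indent=2))
```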
Unstructured Data
• Not all data is structured and well-sorted with instructions on how to use it. All such unorganized data is known as unstructured data.
• Much of the data generated by computers, such as free text, images, audio, and logs, is unstructured. The time and effort required to make unstructured data readable can be cumbersome, and datasets need to be interpretable to yield real value; but the process of making that happen can be all the more rewarding.
Characteristics of Big Data
1. Volume: This refers to tremendously large data. The volume of data is rising exponentially: in 2016 the data created was only 8 ZB, and it was expected to rise to about 40 ZB by 2020, which is extremely large.
2. Variety: One reason for this rapid growth of data volume is that data comes from different sources in various formats. We have already discussed how data is categorized into different types.
3. Velocity: The speed of data accumulation also plays a role in determining whether the data is big data or normal data.
4. Value: How is useful meaning extracted from the data? Here our fourth V comes in; it deals with the mechanism for bringing out the correct meaning of data. First the data is mined, i.e., raw data is turned into useful data; then analysis is done on the data that has been cleaned or retrieved from the raw data.
5. Veracity: Data can arrive incomplete or be lost in transit, forcing the process to start again from mining the raw data into valuable data, and there will also be uncertainties and inconsistencies in the data. These are addressed by veracity, which means the trustworthiness and quality of the data.
Application areas of Big Data:
• Education
• Social Media
• Banking
• Government
• E-commerce
NoSQL databases
• NoSQL is a type of database management system (DBMS) that is designed to handle and store large volumes of unstructured and semi-structured data. Unlike traditional relational databases that use tables with pre-defined schemas to store data, NoSQL databases use flexible data models that can adapt to changes in data structures and are capable of scaling horizontally to handle growing amounts of data.
• The term NoSQL originally referred to “non-SQL” or “non-relational”
databases, but the term has since evolved to mean “not only SQL,” as
NoSQL databases have expanded to include a wide range of different
database architectures and data models.
NoSQL databases are generally classified into four
main categories:
• Document databases: These databases store data as semi-structured documents, such as JSON (JavaScript Object Notation) or XML, and can be queried using document-oriented query languages.
• Key-value stores: These databases store data as key-value pairs, and
are optimized for simple and fast read/write operations.
• Column-family stores: These databases store data as column families,
which are sets of columns that are treated as a single entity. They are
optimized for fast and efficient querying of large amounts of data.
• Graph databases: These databases store data as nodes and edges,
and are designed to handle complex relationships between data.
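To make the four categories concrete, here is a hedged sketch (the record and field names are invented) of how the same user record might look in each data model, using plain Python structures:

```python
# One record, four NoSQL data models (all names invented for illustration).

# 1. Document model: a self-contained, semi-structured document.
document = {"_id": "u1", "name": "Asha", "orders": [{"item": "pen", "qty": 2}]}

# 2. Key-value model: an opaque value looked up by key.
key_value = {"user:u1": '{"name": "Asha"}'}

# 3. Column-family model: rows keyed by id, columns grouped into families.
column_family = {"u1": {"profile": {"name": "Asha"}, "stats": {"orders": 1}}}

# 4. Graph model: nodes plus edges that carry the relationships.
nodes = {"u1": {"label": "User", "name": "Asha"},
         "p1": {"label": "Product", "name": "pen"}}
edges = [("u1", "BOUGHT", "p1")]

for model in (document, key_value, column_family, (nodes, edges)):
    print(model)
```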
Key Features of NoSQL :
• Dynamic schema: NoSQL databases do not have a fixed schema and can accommodate changing
data structures without the need for migrations or schema alterations (see the sketch after this list).
• Horizontal scalability: NoSQL databases are designed to scale out by adding more nodes to a
database cluster, making them well-suited for handling large amounts of data and high levels of
traffic.
• Document-based: Some NoSQL databases, such as MongoDB, use a document-based data model,
where data is stored in a semi-structured format such as JSON or BSON.
• Key-value-based: Other NoSQL databases, such as Redis, use a key-value data model, where data
is stored as a collection of key-value pairs.
• Column-based: Some NoSQL databases, such as Cassandra, use a column-based data model,
where data is organized into columns instead of rows.
• Distributed and high availability: NoSQL databases are often designed to be highly available and
to automatically handle node failures and data replication across multiple nodes in a database
cluster.
• Flexibility: NoSQL databases allow developers to store and retrieve data in a flexible and dynamic
manner, with support for multiple data types and changing data structures.
• Performance: NoSQL databases are optimized for high performance and can handle a high
volume of reads and writes, making them suitable for big data and real-time applications.
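As a hedged sketch of the dynamic-schema feature, the snippet below uses PyMongo (assuming a MongoDB server is running on localhost; the database and collection names are invented) to insert two documents with different shapes into the same collection:

```python
from pymongo import MongoClient

# Assumes a local MongoDB instance; db/collection names are invented.
client = MongoClient("mongodb://localhost:27017")
users = client["demo_db"]["users"]

# No schema migration needed: documents in one collection can differ in shape.
users.insert_one({"name": "Asha", "email": "asha@example.com"})
users.insert_one({"name": "Ravi", "phone": "+91-0000000000", "tags": ["vip"]})

# Queries simply match whatever fields a document happens to have.
for doc in users.find({"name": "Ravi"}):
    print(doc)
```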
Advantages of NoSQL:
• High scalability: NoSQL databases use sharding for horizontal scaling. Sharding is the partitioning of data and its placement on multiple machines in such a way that the order of the data is preserved (see the sketch after this list). Vertical scaling means adding more resources to an existing machine, whereas horizontal scaling means adding more machines to handle the data; vertical scaling is not that easy to implement, but horizontal scaling is. Examples of horizontally scaling databases are MongoDB, Cassandra, etc. NoSQL can handle a huge amount of data because of this scalability: as the data grows, NoSQL scales itself to handle that data efficiently.
• Flexibility: NoSQL databases are designed to handle unstructured or semi-structured data, which means that they can accommodate dynamic changes to the data model. This makes NoSQL databases a good fit for applications that need to handle changing data requirements.
• High availability: The auto-replication feature in NoSQL databases makes them highly available, because in case of any failure the data is restored to its previous consistent state from a replica.
• Scalability: NoSQL databases are highly scalable, which means that they can handle large amounts of data and traffic with ease. This makes them a good fit for applications that need to handle large amounts of data or traffic.
• Performance: NoSQL databases are designed to handle large amounts of data and traffic, which means that they can offer improved performance compared to traditional relational databases.
• Cost-effectiveness: NoSQL databases are often more cost-effective than traditional relational databases, as they are typically less complex and do not require expensive hardware or software.
• Agility: Ideal for agile development.
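The sharding idea from the first advantage can be shown in a few lines: a hedged sketch of range-based partitioning (the key ranges and shard names are invented), which keeps the order of the data preserved across shards:

```python
# Range-based sharding sketch: keys are split into ordered ranges so that
# the overall order of the data is preserved across machines (shards).
SHARD_RANGES = [(0, 1000, "shard_0"), (1000, 2000, "shard_1"), (2000, 3000, "shard_2")]

def shard_for(key):
    for low, high, shard in SHARD_RANGES:
        if low <= key < high:
            return shard
    raise ValueError(f"key {key} outside all shard ranges")

shards = {"shard_0": [], "shard_1": [], "shard_2": []}
for user_id in [5, 999, 1500, 2100, 42]:
    shards[shard_for(user_id)].append(user_id)

# Reading shards in order yields the keys grouped into sorted ranges.
for name in ("shard_0", "shard_1", "shard_2"):
    print(name, sorted(shards[name]))
```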
Disadvantages of NoSQL:
• Lack of standardization: There are many different types of NoSQL databases, each with its own unique strengths and weaknesses. This lack of standardization can make it difficult to choose the right database for a specific application.
• Lack of ACID compliance: Most NoSQL databases are not fully ACID-compliant, which means that they do not guarantee the consistency, integrity, and durability of data. This can be a drawback for applications that require strong data-consistency guarantees.
• Narrow focus: NoSQL databases have a very narrow focus, as they are mainly designed for storage and provide relatively little other functionality. Relational databases are a better choice for transaction management than NoSQL.
• Open-source: NoSQL databases are open-source, and there is no reliable standard for NoSQL yet; in other words, two database systems are likely to be unequal.
• Lack of support for complex queries : NoSQL databases are not designed to
handle complex queries, which means that they are not a good fit for applications
that require complex data analysis or reporting.
• Lack of maturity : NoSQL databases are relatively new and lack the
maturity of traditional relational databases. This can make them less
reliable and less secure than traditional databases.
• Management challenge : The purpose of big data tools is to make the
management of a large amount of data as simple as possible. But it is not
so easy. Data management in NoSQL is much more complex than in a
relational database. NoSQL, in particular, has a reputation for being
challenging to install and even more hectic to manage on a daily basis.
• GUI is not available: GUI-mode tools for accessing the database are not readily available in the market.
• Backup: Backup is a great weak point for some NoSQL databases like MongoDB, which has no built-in approach for backing up data in a consistent manner.
• Large document size: Some database systems, like MongoDB and CouchDB, store data in JSON format. This means documents are quite large, which at big-data scale costs network bandwidth and speed, and having descriptive key names actually hurts, since they increase the document size.
When should NoSQL be used:
• When a huge amount of data needs to be stored and retrieved.
• When the relationships between the data you store are not that important.
• When the data changes over time and is not structured.
• When support for constraints and joins is not required at the database level.
• When the data is growing continuously and you need to scale the database regularly to handle it.