Denormalization

Abstract- It is currently the norm that relational database designs should be based on a normalized logical data model. The primary objectives of this design technique are data integrity and database extendibility. The Third Normal Form is regarded by academicians and practitioners alike to be the point at which the database design is most efficient. Unfortunately, even this lower normalization form has a major drawback with regard to query evaluation: information retrievals from the database can result in a large number of joins, which degrades query performance, so theoretical rules sometimes need to be broken for real-world performance gains. Most existing Conceptual Level RDBMS data models provide a set of constructs that only describes "what data is used" and does not capture "how the data is being used". The question of "how data is used" gets embedded in the implementation level details. As a result, every application built on the existing database extracts the same or similar data in different ways. If the functional use of the data is also captured, common query evaluation techniques can be formulated and optimized at the design phase, without affecting the normalized database structure constructed at the Conceptual Design phase. This paper looks at denormalization as an effort to improve the performance of data retrievals made from the database without compromising data integrity. A study on a hierarchical database table shows the performance gain - with respect to response time - of a denormalization technique.

Keywords: denormalization, database design, performance tuning, materialized views, query evaluation

I. INTRODUCTION

Most of the applications existing today have been built, or are still being built, using RDBMS or ORDBMS technologies. The RDBMS is thus not dead, as stated by Arnon Roten-Gal-Oz [Roten_Gal, 2009]. Van Couver, a software engineer with vast experience in databases at Sun Microsystems, emphasizes that RDBMSs are here to stay but do require improvements in scalability and performance bottlenecks [Couver, 2009].

Normalization is the process of putting one fact, and nothing more than one fact, in exactly one appropriate place. Related facts about a single entity are stored together, and every attribute of each entity is non-transitively associated with the Primary Key of that entity. This design technique results in enhanced data integrity and removes the insert, update and delete anomalies that would otherwise have been present in a non-normalized database. Another goal of normalization is to minimize redesign of the database structure. Admittedly, it is impossible to predict every need that your database design will have to fulfill and every issue that is likely to arise, but it is important to mitigate potential problems as much as possible by careful planning. Arguably, normalizing your data is essential to good performance and ease of development, but the question always comes up: "How normalized is normalized enough?" Many books on normalization mention that 3NF is essential, and many times BCNF, and that 4NF and 5NF are really useful and well worth the time required to implement them [Davidson, 2007]. This optimization, however, results in performance degradation in data retrievals from the database, as a large number of joins need to be done to solve queries [Date, 1997] [Inmon, 1987] [Schkolnick and Sorenson, 1980].

"Third normal form seems to be regarded by many as the point where your database will be most efficient ... If your database is overnormalized you run the risk of excessive table joins. So you denormalize and break theoretical rules for real world performance gains." [Sql Forums, 2009]. There is thus a wide gap between the academicians and the database application practitioners which needs to be addressed. Normalization promotes an optimal design from a logical perspective; denormalization is a design level one step up from normalization. With respect to retrieval performance, denormalization is not necessarily a bad decision if implemented following a systematic approach in large scale databases where dozens of relational tables are used.

Denormalization is an effort that seeks to optimize performance while maintaining data integrity. A denormalized database is thus not equivalent to a database that has not been normalized. Instead, you only seek to denormalize a data model that has already been normalized. This distinction is important to understand, because you go from normalized to denormalized, not from nothing to denormalized. The mistake that some software developers make is to directly build a denormalized database considering only the performance aspect. This optimizes only one part of the equation, namely database reads. Denormalization is a design level that is one step up from normalization and should not be treated naively. Framing denormalization against normalization purely in the context of performance
P a g e | 45 Global Journal of Computer Science and Technology
is unserious and can result in major application problems [Thought Clusters, 2009]. We need to understand how and when to use denormalization.

This paper is organized as follows: Section 1 introduces the concept of and current need for denormalization. Section 2 provides a background of the related work in this area from the academic and the practitioners' points of view. Section 3 makes a strong case for denormalization, while Section 4 presents the framework for a systematic denormalization. Section 5 elucidates some denormalization techniques that can be followed during the database design life cycle and shows the performance gain of this technique over a Hierarchical Normalized Relation.

II. BACKGROUND AND RELATED WORK

Relational databases can be roughly categorized into Transaction Processing (OLTP) and Data Warehouse (OLAP) databases. As a general rule, OLTP databases use normalized schemas and ACID transactions to maintain database integrity, as the data needs to be continuously updated when transactions occur. As a general rule, OLAP databases use unnormalized schemas (the "star schema" is the paradigmatic OLAP schema) and are accessed without transactions, because each table row is written exactly once and then never deleted or updated. Often, new data is added to OLAP databases in an overnight batch, with only queries occurring during normal business hours [Lurie M., IBM, 2009] [Microsoft SQL Server guide] [Wiseth, Oracle].

Software developers and practitioners mention that database design principles besides normalization include the building of indices on the data and the denormalization of some tables for performance. Performance tuning methods like indices and clustering data of multiple tables exist, but these methods tend to optimize a subset of queries at the expense of the others. Indices consume extra storage and are effective only when they work on a single attribute or an entire key value. The evaluation plans sometimes skip the secondary indexes that are created by users if these indices are nonclustering [Khaldtiance, 2008].

Materialized Views can also be used as a technique for improving performance [Vincent et al, 97], but these consume vast amounts of storage and their maintenance results in additional runtime overheads. Blind application of Materialized Views can actually result in worse query evaluation plans, so they should be used carefully [Chaudhuri et al, 1995]. View update techniques have been researched, and a relatively new method of updating using additional views has been proposed [Ross et al, 1996].

In the real world, denormalization is sometimes necessary. There have been two major trends in the approach to denormalization. The first approach uses a "non-normalized ERD", where the entities in the ERD are collapsed to decrease the joins. In the second approach, denormalization is done at the physical level by consolidating relations, adding synthetic attributes and creating materialized views to improve performance. The disadvantage of this approach is the overhead required in view consistency maintenance. Denormalization is not necessarily a bad decision if implemented wisely [Mullins, 2009].

Some denormalization techniques have been researched and implemented in many strategic applications to improve query response times. These strategies are followed in the creation of data warehouses and data marts [Shin and Sanders, 2006] [Barquin and Edelstein] and are not directly applicable to an OLTP system. Restructuring a monolithic Web application, composed of Web pages that address queries to a single database, into a group of independent Web services querying each other also requires denormalization for improved performance [Wei Z et al, 2008].

Several researchers have developed lists of normalization and denormalization types, and have subsequently mentioned that denormalization should be carefully deployed according to how the data will be used [Hauns, 1994] [Rodgers, 1989]. The primary methods that have been identified are: combining tables, introducing redundant data, storing derivable data, allowing repeating groups, partitioning tables, creating report tables, and mirroring tables. These "denormalization patterns" have been classified as Collapsing Relations, Partitioning Relations, Adding Redundant Attributes and Adding Derived Attributes [Sanders and Shin, 2001].

III. A CASE FOR DENORMALIZATION

Four main arguments that have guided experienced practitioners in database design are listed here [26].

The Convenience Argument
The presence of calculated values in tables aids the evaluation of ad hoc queries and report generation. Programmers do not need to know anything about the API to do the calculation.

The Stability Argument
As systems evolve, new functionality must be provided to the users while retaining the original. History data may still need to be retained in the database.

The Simple Queries Argument
Queries that involve join jungles are difficult to debug and dangerous to change. Eliminating joins makes queries simpler to write, debug and change.

The Performance Argument
Denormalized databases require fewer joins than normalized relations. Computing joins is expensive and time consuming; fewer joins translate directly to improved performance.

Denormalization of databases, i.e., the systematic creation of a database structure whose goal is performance improvement, is thus needed for today's business processing requirements. This should be an intermediate step in the DataBase Design Life Cycle, integrated between the Logical DataBase Design Phase and the Physical DataBase Design Phase. Retrieval performance needs dictate very quick retrieval capability for
data stored in relational databases, especially since more accesses to databases are being done through the Internet. Users are more concerned with prompt responses than with an optimum database design. To create a Denormalization Schema, the functional usage of the operational data must be analyzed for optimal Information Retrieval.

Some of the benefits of denormalization can be listed:

(a) Performance improvement by:
- Precomputing derived data
- Minimizing joins
- Reducing Foreign Keys
- Reducing indices and saving storage
- Smaller search sets of data for partitioned tables
- Caching the Denormalized structures at the Client for ease of access, thereby reducing query/data shipping cost.

(b) Since the Denormalized structures are primarily designed keeping in mind the functional usage of the application, users can directly access these structures rather than the base tables for report generation. This also reduces bottlenecks at the server.

A framework for denormalization needs to address the following issues:
(i) Identify the stage in the DataBase Design Life Cycle where Denormalization structures need to be created.
(ii) Identify situations and the corresponding candidate base tables that cause performance degradation.
(iii) Provide strategies for boosting query response times.
(iv) Provide a method for performing the cost-benefit analysis.
(v) Identify and strategize security and authorization constraints on the denormalized structures.
Although (iv) and (v) above are important issues in denormalization, they will not be considered in this paper and will be researched later.

IV. A DENORMALIZATION FRAMEWORK

The framework presented in this paper differs from the papers surveyed above in the following respects:
- It does not create denormalized tables with all contributing attributes from the relevant entities, but instead creates a set of Denormalized Structures over a set of Normalized tables. This is an important and pertinent criterion, as these structures can be built over existing applications with no "side effects of denormalization" on the existing data.
- The entire sets of attributes from the contributing entities are not stored in the Denormalized structure. This greatly reduces the storage requirements and redundancies.
- The Insert, Update and Delete operations (IUDs) are not done on the denormalized structures directly and thus do not violate data integrity. The IUDs to data are done on the Base Tables, and the denormalized structures are kept in synch by triggers on the base tables.
- Since the denormalized structures are used for information retrieval, they need to consider the authorization access that users have over the base tables.
- The construction of the "Denormalization View" is not an intermediate step between the Logical and the Physical Design phases, but needs to be consolidated by considering all 3 views of the ANSI/SPARC architectural specifications.

Most existing Conceptual Level RDBMS data models provide a set of constructs that describes the structure of the database [Elmasri and Navathe]. This higher level of conceptual modeling only informs the end user "what data is used" and does not capture "how the data is being used". The question of "how data is used" gets embedded in the implementation level details. As a result, every application built on the existing database extracts the same or similar data in different ways. If the functional use of the data is also captured, common query evaluation techniques can be formulated and optimized at the design phase, without affecting the normalized database structure constructed at the Conceptual Design phase. Business rules are descriptive integrity constraints or functional (derivative or active) rules, and ensure the well functioning of the system. Common models used during the modeling process of information systems do not allow the high level specification of business rules, except for a subset of ICs taken into account by the data model [Amghar and Mezaine, 1997].

The ANSI 3 level architecture stipulates 3 levels: the External Level and the Conceptual Level, which capture data at rest, and the Physical Level, which describes how the data is stored and depends on the DBMS used. External Schemas or subschemas relate to the user views. The Conceptual Schema describes all the types of data that appear in the database and the relationships between data items. Integrity constraints are also specified in the conceptual schema. The Internal Schema provides definitions for stored records, methods of representation, data fields, indexes, and hashing schemes. Although this architecture provides the application development environment with logical and physical data independence, it does not provide an optimal query evaluation platform. The DBA has to balance conflicting user requirements before creating indices and consolidating the Physical schema.

The reason denormalization is at all possible in relational databases is that, courtesy of the relational model, which creates lossless decompositions of the original relation, no information is lost in the process. The Denormalized structure can be reengineered and populated from the existing Normalized database and vice-versa. In a distributed application development environment, the Denormalization Views can be cached on the client, resulting in a major performance boost by saving run time shipping
costs. It would require only the Denormalization View Manager to be installed on the Client.

A High Level Architecture that this framework considers is defined as follows:

The inputs that are required for the construction of the Denormalized schema can be identified as:
- the logical and external views schema design,
- the physical storage and access methods provided by the DBMS,
- the authorization the users have for the manipulation and access of the data within the database,
- the interaction (inter and intra) between the entities,
- the number of entities the queries involve,
- the usage of the data (i.e., the kind of attributes and their frequency of extraction within queries and reports),
- the volume of data being analyzed and extracted in queries (cardinality and degree of relations, number and frequency of tuples, blocking factor of tuples, clustering of data, estimated size of a relation),
- the frequency of occurrence and the priority of the query,
- the time taken by the queries to execute (with and without denormalization).

A system cannot enforce truth, only consistency. Internal Predicates (IPs) are what the data means to the system, and External Predicates (EPs) are what the data means to a user. The EPs result in a criterion for the acceptability of IUD operations on the data, which is an unachievable goal [Date, Kannan, Swamynathan], especially when Materialized Views are created. In the framework presented in this paper, IUDs on the Denormalized Structures are never rejected, as these are automatically propagated to the base relations where the
Domain and Table level ICs are enforced. Once the base relations are updated, the Denormalized Schema Relation triggers are invoked atomically to synchronize the data, ensuring simultaneous consistency of the Base and Denormalized tables. Further, the primary reason for the Denormalization Structures is faster information retrieval and not data manipulation; hence no updates need be made to the Denormalization Schema directly.

Every Normalized Relation requires a Primary Key which satisfies the Key Integrity Constraint. This PK maintains the uniqueness of tuples in the database and is not necessarily the search key value for users. For the RDIRS we define

The Denormalization Schema Design is an input to the Query Optimizer for collapsing access paths, resulting in the IRT, which is then submitted to the Query Evaluation Engine.

Although the metadata tables are queryable at the server, the Denormalized Structure Manager can have its own metadata stored locally (at the node where the DSs are stored):

DS_Metadata_Scheme(DS_Name, DS_Trigger_Name, DS_Procedure_Name, DS_BT1_Name, Creator, DS_BT1_Trigger_Name, DS_BT2_Trigger_Name, DS_BT1_Authorization, DS_BT2_Authorization)
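The trigger-based synchronization between base and denormalized tables described above can be sketched as follows. This is a minimal illustration using SQLite; the table, column and trigger names are hypothetical stand-ins, not the paper's actual DS_Metadata_Scheme entries, and a production system would generate the triggers from that metadata.

```python
import sqlite3

# All IUDs go through the base relation, where the key and referential
# integrity constraints live; triggers propagate each change to the
# denormalized copy so both stay consistent within the same transaction.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (
    ItemNo       TEXT PRIMARY KEY,                 -- key integrity constraint
    ParentItemNo TEXT REFERENCES item(ItemNo),
    ItemName     TEXT
);

-- Denormalized structure: only the attributes the queries need.
CREATE TABLE dn_item (
    ItemNo       TEXT PRIMARY KEY,
    ParentItemNo TEXT,
    ItemName     TEXT
);

-- Triggers on the base table keep the denormalized copy in synch.
CREATE TRIGGER item_ai AFTER INSERT ON item BEGIN
    INSERT INTO dn_item VALUES (NEW.ItemNo, NEW.ParentItemNo, NEW.ItemName);
END;
CREATE TRIGGER item_au AFTER UPDATE ON item BEGIN
    UPDATE dn_item SET ParentItemNo = NEW.ParentItemNo,
                       ItemName     = NEW.ItemName
    WHERE ItemNo = NEW.ItemNo;
END;
CREATE TRIGGER item_ad AFTER DELETE ON item BEGIN
    DELETE FROM dn_item WHERE ItemNo = OLD.ItemNo;
END;
""")

conn.execute("INSERT INTO item VALUES ('100', NULL, 'Assembly')")
conn.execute("INSERT INTO item VALUES ('101', '100', 'SubPart')")
conn.execute("DELETE FROM item WHERE ItemNo = '101'")
rows = conn.execute("SELECT ItemNo FROM dn_item").fetchall()
print(rows)  # the denormalized copy mirrors the base table: [('100',)]
```

Because the denormalized table is written only by the triggers, applications can read it freely, while any direct write path remains the base table, preserving data integrity.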
Figure 3: Partial Hierarchical Item Data (a tree of items rooted at item 100)

The Normalized Relation for the Hierarchical Item Table would be stored as:

ItemNo  ParentItemNo  OtherItemDetails
100     -             ...
101     100           ...
105     100           ...
108     101           ...
200     101           ...
203     101           ...
204     101           ...
109     108           ...
110     108           ...
111     108           ...
112     108           ...
209     204           ...

Retrieving all sub-items of item 100 then requires a query of the form:

Select ItemNo from item where ParentItemNo = '100'
Union
Select ItemNo from item where ParentItemNo in
    (Select ItemNo from item where ParentItemNo = '100')
Union
Select ItemNo from item where ParentItemNo in
    (Select ItemNo from item where ParentItemNo in
        (Select ItemNo from item where ParentItemNo = '100'))

This retrieval query, besides being extremely inefficient, requires one to know the maximum depth of the hierarchy.

The Denormalized Schema for the Item Information in the RDIRS:

DN_Item_Hierarchy (ParentItemNo, ChildItemNo, ItemName, ChildLevel, IsLeaf, Item_URowId)

The ChildLevel ascertains the level in the hierarchy at which the child node sits; IsLeaf specifies whether that node has further child nodes, and makes queries like "Find all items that have no subparts" efficiently solvable.

The (part) extension of the DN_Item_Hierarchy Schema:

ParentItemNo  ChildItemNo  ItemName  ChildLevel  IsLeaf  ItemRowId

With an increased set of tuples, and a greater depth in the hierarchy, the improvement will be substantial.
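A minimal runnable sketch of the denormalized structure at work, using SQLite for illustration: the rows are derived from the Item table above (ItemName and the row id columns are omitted for brevity, and the ChildLevel/IsLeaf values are the ones implied by that data).

```python
import sqlite3

# Populate an abbreviated DN_Item_Hierarchy with the Figure 3 data:
# each row carries its level in the hierarchy and a leaf flag.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE DN_Item_Hierarchy (
    ParentItemNo TEXT, ChildItemNo TEXT,
    ChildLevel INTEGER, IsLeaf INTEGER)""")
conn.executemany(
    "INSERT INTO DN_Item_Hierarchy VALUES (?, ?, ?, ?)",
    [('100', '101', 1, 0), ('100', '105', 1, 1),
     ('101', '108', 2, 0), ('101', '200', 2, 1),
     ('101', '203', 2, 1), ('101', '204', 2, 0),
     ('108', '109', 3, 1), ('108', '110', 3, 1),
     ('108', '111', 3, 1), ('108', '112', 3, 1),
     ('204', '209', 3, 1)])

# "Find all items that have no subparts": a single scan of the IsLeaf
# flag, with no joins, no nested sub-selects, and no prior knowledge
# of the hierarchy's maximum depth.
leaves = sorted(r[0] for r in conn.execute(
    "SELECT ChildItemNo FROM DN_Item_Hierarchy WHERE IsLeaf = 1"))
print(leaves)  # ['105', '109', '110', '111', '112', '200', '203', '209']
```

The same structure answers "all descendants of the root" as a plain scan, whereas the normalized Item table needs one additional nested sub-select per level of depth.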
[4] Chirkova R., Chen Li and Li J., "Answering queries using materialized views with minimum size", VLDB Journal 2006, 15(3), pp. 191-210.
[5] Date C.J., "The Normal is so ... interesting", Database Programming and Design, Nov 1997, pp. 23-25.
[6] Halevy A., "Answering queries using views: A survey", in VLDB, 2001.
[7] Hauns M., "To normalize or denormalize, that is the question", Proceedings of the 19th Int. Conf. for Management and Performance Evaluation of Enterprise Computing Systems, San Diego, CA, 1994, pp. 416-423.
[8] Inmon W.H., "Denormalization for Efficiency", ComputerWorld, Vol. 21, 1987, pp. 19-21.
[9] Ross K., Srivastava D. and Sudarshan S., "Materialized View Maintenance and integrity constraint checking: trading space for time", ACM SIGMOD Conference 1996, pp. 447-458.
[10] Rodgers U., "Denormalization: why, what and how?", Database Programming and Design, 1989 (12), pp. 46-53.
[11] Sanders G. and Shin S.K., "Denormalization Effects on Performance of RDBMS", Proceedings of the 34th International Conference on Systems Sciences, 2001.
[12] Schkolnick M. and Sorenson P., "Denormalization: A performance oriented database design technique", Proceedings of the AICA 1980 Congress, Italy.
[13] Shin S.K. and Sanders G.L., "Denormalization strategies for data retrieval from data warehouses", Decision Support Systems, Vol. 42, No. 1, pp. 267-282, 2006.
[14] Vincent M., Mohania M. and Kambayashi Y., "A Self-Maintainable View maintenance technique for data warehouses", 8th Int. Conf. on Management of Data, Chennai, India.
[15] Wei Z., Dejun J., Pierre G., Chi C.H. and Steen M., "Service-Oriented Data Denormalization for Scalable Web Applications", Proceedings of the 17th International WWW Conference 2008, Beijing, China.
[16] Barquin R. and Edelstein H., "Planning and Designing the Data Warehouse", Prentice Hall.
[17] Date C.J., Kannan A. and Swamynathan S., "An Introduction to Database Systems", 8th Ed., Pearson Education.
[18] Elmasri R. and Navathe S., "Fundamentals of Database Systems", 3rd Ed., Addison-Wesley.
[19] Davidson L., "Ten common design mistakes", software engineers blog, Feb 2007.
[20] Downs K., "The argument for Denormalization", The Database Programmer, Oct 2008.
[21] Khaldtiance S., "Evaluate Index Usage in Databases", SQL Server Magazine, October 2008.
[22] Lurie M., IBM, "Winning Database Configurations".
[23] Mullins C., "Denormalization Guidelines", Platinum Technology Inc., Data Administration Newsletter, Accessed June 2009.
[24] Microsoft, SQL Server 7.0 Resource Guide, "Chapter 12 - Data Warehousing Framework".
[25] Roten-Gal-Oz A., "Cirrus Minor" in "Making IT work", Musings of a Holistic Architect, Accessed June 2009.
[26] Van Couver D., on his blog "Van Couvering is not a verb", Accessed June 2009.
[27] Wiseth K., Editor-in-Chief of Oracle Technology News, in "Find Meaning", Accessed June 2009.
[28] Thought Clusters on software, development and programming, website, March 2009.
[29] Website: http://www.sqlteam.com/Forums/, Accessed July 2009.