SQL & NoSQL Databases

Andreas Meier · Michael Kaufmann

Models, Languages, Consistency Options and Architectures for Big Data Management

Andreas Meier
Department für Informatik
Universität Fribourg
Fribourg, Switzerland

Michael Kaufmann
Departement für Informatik
Hochschule Luzern
Rotkreuz, Switzerland

Translated from German by Anja Kreutel.

ISBN 978-3-658-24548-1
ISBN 978-3-658-24549-8 (eBook)

https://doi.org/10.1007/978-3-658-24549-8

Library of Congress Control Number: 2019935851

Springer Vieweg
© Springer Fachmedien Wiesbaden GmbH, part of Springer Nature 2019
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the
material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage
and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or
hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does
not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective
laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the
editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors
or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in
published maps and institutional affiliations.

This Springer Vieweg imprint is published by the registered company Springer Fachmedien Wiesbaden GmbH,
part of Springer Nature.
The registered company address is: Abraham-Lincoln-Str. 46, 65189 Wiesbaden, Germany

Foreword

The term “database” has long since become part of people’s everyday vocabulary, for
managers and clerks as well as students of most subjects. They use it to describe a logically
organized collection of electronically stored data that can be directly searched and
viewed. However, they are generally more than happy to leave the whys and hows of its
inner workings to the experts.
Users of databases are rarely aware of the immaterial and concrete business values
contained in any individual database. This applies as much to a car importer’s spare parts
inventory as to the IT solution containing all customer depots at a bank or the patient
information system of a hospital. Yet failure of these systems, or even cumulative errors,
can threaten the very existence of the respective company or institution. For that reason,
it is important for a much larger audience than just the “database specialists” to be
well-informed about what is going on. Anyone involved with databases should understand
what these tools are effectively able to do and which conditions must be created
and maintained for them to do so.
Probably the most important aspect concerning databases involves (a) the distinction
between their administration and the data stored in them (user data) and (b) the economic
magnitude of these two areas. Database administration consists of various technical and
administrative factors, from computers, database systems, and additional storage to the
experts setting up and maintaining all these components—the aforementioned database
specialists. It is crucial to keep in mind that the administration is by far the smaller part
of standard database operation, constituting only about a quarter of the total effort.
Most of the work and expenses concerning databases lie in gathering, maintaining,
and utilizing the user data. This includes the labor costs for all employees who enter data
into the database, revise it, retrieve information from the database, or create files using
this information. In the above examples, this means warehouse employees, bank tellers,
or hospital personnel in a wide variety of fields—usually for several years.
In order to be able to properly evaluate the importance of the tasks connected with
data maintenance and utilization on the one hand and database administration on the
other hand, it is vital to understand and internalize this difference in the effort required
for each of them. Database administration starts with the design of the database, which
already touches on many specialized topics such as determining the consistency checks
for data manipulation or regulating data redundancies, which are as undesirable on the
logical level as they are essential on the storage level. The development of database solutions
is always targeted at their later use, so ill-considered decisions in the development
process may have a permanent impact on everyday operations. Finding ideal solutions,
such as the golden mean between too strict and too flexible when determining consistency
conditions, may require some experience. Unduly strict conditions will interfere
with regular operations, while excessively lax rules will entail a need for repeated expensive
data repairs.
To avoid such issues, it is invaluable that anyone concerned with database development
and operation, whether in management or as a database specialist, gain systematic
insight into this field of computer science. The table of contents gives an overview of
the wide variety of topics covered in this book. The title already shows that, in addition
to an in-depth explanation of the field of conventional databases (relational model, SQL),
the book also provides highly educational information about current advancements and
related fields, the keywords being “NoSQL” or “post-relational” and “Big Data.” I am
confident that the newest edition of this book will, once again, be well received by both
students and professionals—its authors are quite familiar with both groups.

Carl August Zehnder

Preface

It is remarkable how stable some concepts are in the field of databases. Information
technology is generally known to be subject to rapid development, bringing forth new
technologies at an unbelievable pace. However, this is only superficially the case. Many
aspects of computer science do not essentially change at all. This includes not only the
basics, such as the functional principles of universal computing machines, processors,
compilers, operating systems, databases and information systems, and distributed systems,
but also computer language technologies such as C, TCP/IP, or HTML, which are
decades old but in many ways provide a stable foundation for the global, earth-spanning
information system known as the World Wide Web. Likewise, the SQL language has
been in use for over four decades and will remain so for the foreseeable future. The theory
of relational database systems was initiated in the 1970s by Codd (relational model
and normal forms), Chen (entity-relationship model), and Chamberlin and Boyce
(SEQUEL). Despite their age, these technologies still have a major impact on the practice of data
management today. In particular, with the Big Data revolution and the widespread use of
data science methods for decision support, relational databases and the use of SQL for
data analysis are actually becoming more important. Even though sophisticated statistics
and machine learning are enhancing the possibilities for knowledge extraction from data,
many if not most data analyses for decision support rely on descriptive statistics using
SQL for grouped aggregation. In that sense, although SQL database technology is quite
mature, it is more relevant today than ever.
Nevertheless, a lot has changed in the area of database systems over the years.
In particular, developments in the Big Data ecosystem have brought new technologies into
the world of databases, to which we pay due attention. The nonrelational database
technologies, which are finding more and more fields of application under the generic
term NoSQL, differ not only superficially from the classical relational databases, but
also in the underlying principles. Relational databases were developed in the twentieth
century with the purpose of enabling tightly organized, operational forms of data management,
which provided stability but limited flexibility. In contrast, the NoSQL database
movement emerged at the beginning of the current century, focusing on horizontal
partitioning and schema flexibility, with the goal of solving the Big Data problems
of volume, variety, and velocity, especially in Web-scale data systems. This has far-reaching
consequences and has led to a new approach in data management, which deviates
significantly from the previous theories on the basic concept of databases: the way
data is modeled, how data is queried and manipulated, how data consistency is handled,
and the system architecture. This is why we compare these two worlds, SQL and NoSQL
databases, from different perspectives in all chapters.
We have also launched a website called sql-nosql.org, where we share teaching and
tutoring materials such as slides, tutorials for SQL and Cypher, case studies, and a workbench
for MySQL and Neo4j, so that language training can be done either with SQL or
with Cypher, the graph-oriented query language of the NoSQL database Neo4j.
At this point, we would like to thank Anja Kreutel for her great effort and success
in translating the eighth edition of the German textbook into English. We also thank
Alexander Denzler and Marcel Wehrle for the development of the workbench for relational
and graph-oriented databases. For the redesign of the graphics, we were able to
enlist Thomas Riediker, and we thank him for his tireless efforts. He has succeeded in giving
the pictures a modern style and an individual touch. For the further development
of the tutorials and case studies, which are available on the website sql-nosql.org, we
thank the computer science students Andreas Waldis, Bettina Willi, Markus Ineichen,
and Simon Studer for their contributions to the tutorial in Cypher and to the case study
Travelblitz with OpenOffice Base and with Neo4j. For their feedback on the manuscript,
we thank Alexander Denzler, Daniel Fasel, Konrad Marfurt, and Thomas Olnhoff, whose
suggestions contributed to the quality of our work. A big thank you
goes to Sybille Thelen, Dorothea Glaunsinger, and Hermann Engesser of Springer, who
have supported us with patience and expertise.

February 2019
Andreas Meier
Michael Kaufmann

Contents

1 Data Management
1.1 Information Systems and Databases
1.2 SQL Databases
1.2.1 Relational Model
1.2.2 Structured Query Language (SQL)
1.2.3 Relational Database Management System
1.3 Big Data
1.4 NoSQL Databases
1.4.1 Graph-based Model
1.4.2 Graph Query Language Cypher
1.4.3 NoSQL Database Management System
1.5 Organization of Data Management
1.6 Further Reading
References
2 Data Modeling
2.1 From Data Analysis to Database
2.2 The Entity-Relationship Model
2.2.1 Entities and Relationships
2.2.2 Association Types
2.2.3 Generalization and Aggregation
2.3 Implementation in the Relational Model
2.3.1 Dependencies and Normal Forms
2.3.2 Mapping Rules for Relational Databases
2.3.3 Structural Integrity Constraints
2.4 Implementation in the Graph Model
2.4.1 Graph Properties
2.4.2 Mapping Rules for Graph Databases
2.4.3 Structural Integrity Constraints
2.5 Enterprise-Wide Data Architecture
2.6 Formula for Database Design
2.7 Further Reading
References
3 Database Languages
3.1 Interacting with Databases
3.2 Relational Algebra
3.2.1 Overview of Operators
3.2.2 Set Operators
3.2.3 Relational Operators
3.3 Relationally Complete Languages
3.3.1 SQL
3.3.2 QBE
3.4 Graph-based Languages
3.4.1 Cypher
3.5 Embedded Languages
3.5.1 Cursor Concept
3.5.2 Stored Procedures and Stored Functions
3.5.3 JDBC
3.5.4 Embedding Graph-based Languages
3.6 Handling NULL Values
3.7 Integrity Constraints
3.8 Data Protection Issues
3.9 Further Reading
References
4 Ensuring Data Consistency
4.1 Multi-User Operation
4.2 Transaction Concept
4.2.1 ACID
4.2.2 Serializability
4.2.3 Pessimistic Methods
4.2.4 Optimistic Methods
4.2.5 Troubleshooting
4.3 Consistency in Massive Distributed Data
4.3.1 BASE and the CAP Theorem
4.3.2 Nuanced Consistency Settings
4.3.3 Vector Clocks for the Serialization of Distributed Events
4.4 Comparing ACID and BASE
4.5 Further Reading
References
5 System Architecture
5.1 Processing of Homogeneous and Heterogeneous Data
5.2 Storage and Access Structures
5.2.1 Indexes and Tree Structures
5.2.2 Hashing Methods
5.2.3 Consistent Hashing
5.2.4 Multidimensional Data Structures
5.3 Translation and Optimization of Relational Queries
5.3.1 Creation of Query Trees
5.3.2 Optimization by Algebraic Transformation
5.3.3 Calculation of Join Operators
5.4 Parallel Processing with MapReduce
5.5 Layered Architecture
5.6 Use of Different Storage Structures
5.7 Further Reading
References
6 Postrelational Databases
6.1 The Limits of SQL—and Beyond
6.2 Federated Databases
6.3 Temporal Databases
6.4 Multidimensional Databases
6.5 Data Warehouse
6.6 Object-Relational Databases
6.7 Knowledge Databases
6.8 Fuzzy Databases
6.9 Further Reading
References
7 NoSQL Databases
7.1 Development of Nonrelational Technologies
7.2 Key-Value Stores
7.3 Column-Family Stores
7.4 Document Stores
7.5 XML Databases
7.6 Graph Databases
7.7 Further Reading
References

Glossary
References
Index

List of Figures

Fig. 1.1 Architecture and components of information systems . . . . . . . . . . . . . . . 2

Fig. 1.2 Table structure for an EMPLOYEE table. . . . . . . . . . . . . . . . . . . . . . . . . 4
Fig. 1.3 EMPLOYEE table with manifestations . . . . . . . . . . . . . . . . . . . . . . . . . . 4
Fig. 1.4 Formulating a query in SQL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Fig. 1.5 The difference between descriptive and procedural languages . . . . . . . . 8
Fig. 1.6 Basic structure of a relational database management system . . . . . . . . . 9
Fig. 1.7 Variety of sources for Big Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Fig. 1.8 Section of a property graph on movies . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Fig. 1.9 Section of a graph database on movies . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Fig. 1.10 Basic structure of a NoSQL database management system . . . . . . . . . . . 17
Fig. 1.11 Three different NoSQL databases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
Fig. 1.12 The four cornerstones of data management . . . . . . . . . . . . . . . . . . . . . . . 19
Fig. 2.1 The three steps necessary for data modeling . . . . . . . . . . . . . . . . . . . . . . 27
Fig. 2.2 EMPLOYEE entity set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
Fig. 2.3 INVOLVED relationship between employees and projects . . . . . . . . . . . 29
Fig. 2.4 Entity-relationship model with association types . . . . . . . . . . . . . . . . . . 30
Fig. 2.5 Overview of the possible cardinalities of relationships . . . . . . . . . . . . . . 32
Fig. 2.6 Generalization, illustrated by EMPLOYEE . . . . . . . . . . . . . . . . . . . . . . . 33
Fig. 2.7 Network-like aggregation, illustrated by
CORPORATION_STRUCTURE. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
Fig. 2.8 Hierarchical aggregation, illustrated by ITEM_LIST . . . . . . . . . . . . . . . 35
Fig. 2.9 Redundant and anomaly-prone table . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
Fig. 2.10 Overview of normal forms and their definitions . . . . . . . . . . . . . . . . . . . 37
Fig. 2.11 Tables in first and second normal forms. . . . . . . . . . . . . . . . . . . . . . . . . . 39
Fig. 2.12 Transitive dependency and the third normal form . . . . . . . . . . . . . . . . . . 41
Fig. 2.13 Table with multivalued dependencies . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
Fig. 2.14 Improper splitting of a PURCHASE table . . . . . . . . . . . . . . . . . . . . . . . . 44
Fig. 2.15 Tables in fifth normal form. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
Fig. 2.16 Mapping entity and relationship sets onto tables . . . . . . . . . . . . . . . . . . . 47

xiii
xiv List of Figures

Fig. 2.17 Mapping rule for complex-complex relationship sets . . . . . . . . . . . . . . . 49

Fig. 2.18 Mapping rule for unique-complex relationship sets. . . . . . . . . . . . . . . . . 50
Fig. 2.19 Mapping rule for unique-unique relationship sets . . . . . . . . . . . . . . . . . . 51
Fig. 2.20 Generalization represented by tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
Fig. 2.21 Network-like corporation structure represented by tables . . . . . . . . . . . . 53
Fig. 2.22 Hierarchical item list represented by tables . . . . . . . . . . . . . . . . . . . . . . . 54
Fig. 2.23 Ensuring referential integrity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
Fig. 2.24 A Eulerian cycle for crossing 13 bridges . . . . . . . . . . . . . . . . . . . . . . . . . 58
Fig. 2.25 Iterative procedure for creating the set Sk(v) . . . . . . . . . . . . . . . . . . . . . . 59
Fig. 2.26 Shortest subway route from stop v0 to stop v7 . . . . . . . . . . . . . . . . . . . . . 61
Fig. 2.27 Construction of a Voronoi cell using half-spaces . . . . . . . . . . . . . . . . . . . 63
Fig. 2.28 Dividing line T between two Voronoi diagrams
VD(M1) and VD(M2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
Fig. 2.29 Sociogram of a middle school class as a graph and as
an adjacency matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
Fig. 2.30 Balanced (B1–B4) and unbalanced (U1–U4) triads. . . . . . . . . . . . . . . . . 68
Fig. 2.31 Mapping entity and relationship sets onto graphs . . . . . . . . . . . . . . . . . . 69
Fig. 2.32 Mapping rule for network-like relationship sets . . . . . . . . . . . . . . . . . . . 70
Fig. 2.33 Mapping rule for hierarchical relationship sets . . . . . . . . . . . . . . . . . . . . 71
Fig. 2.34 Mapping rule for unique-unique relationship sets . . . . . . . . . . . . . . . . . . 72
Fig. 2.35 Generalization as a tree-shaped partial graph . . . . . . . . . . . . . . . . . . . . . 73
Fig. 2.36 Network-like corporation structure represented as a graph . . . . . . . . . . . 74
Fig. 2.37 Hierarchical item list as a tree-shaped partial graph . . . . . . . . . . . . . . . . 75
Fig. 2.38 Abstraction steps of enterprise-wide data architecture . . . . . . . . . . . . . . 77
Fig. 2.39 Data-oriented view of business units . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
Fig. 2.40 From rough to detailed in ten design steps. . . . . . . . . . . . . . . . . . . . . . . . 80
Fig. 3.1 SQL as an example for database language use . . . . . . . . . . . . . . . . . . . . 86
Fig. 3.2 Set union, set intersection, set difference,
and Cartesian product of relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
Fig. 3.3 Projection, selection, join, and division of relations . . . . . . . . . . . . . . . . 88
Fig. 3.4 Union-compatible tables SPORTS_CLUB and PHOTO_CLUB. . . . . . . 89
Fig. 3.5 Set union of the two tables SPORTS_CLUB and PHOTO_CLUB . . . . . 90
Fig. 3.6 COMPETITION relation as an example of Cartesian products. . . . . . . . 91
Fig. 3.7 Sample projection on EMPLOYEE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
Fig. 3.8 Examples of selection operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
Fig. 3.9 Join of two tables with and without a join predicate . . . . . . . . . . . . . . . . 94
Fig. 3.10 Example of a divide operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
Fig. 3.11 Recursive relationship as entity-relationship model
and as graph with node and edge types . . . . . . . . . . . . . . . . . . . . . . . . . . 102
Fig. 3.12 Unexpected results from working with NULL values . . . . . . . . . . . . . . . 112
Fig. 3.13 Truth tables for three-valued logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
List of Figures xv

Fig. 3.14 Definition of declarative integrity constraints . . . . . . . . . . . . . . . . . . . . . 114

Fig. 3.15 Definition of views as part of data protection . . . . . . . . . . . . . . . . . . . . . 117
Fig. 4.1 Conflicting posting transactions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
Fig. 4.2 Analyzing a log using a precedence graph. . . . . . . . . . . . . . . . . . . . . . . . 127
Fig. 4.3 Sample two-phase locking protocol for the transaction TRX_1 . . . . . . . 129
Fig. 4.4 Conflict-free posting transactions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
Fig. 4.5 Serializability condition for TRX_1 not met . . . . . . . . . . . . . . . . . . . . . . 132
Fig. 4.6 Restart of a database system after an error. . . . . . . . . . . . . . . . . . . . . . . . 134
Fig. 4.7 The three possible combinations under the CAP theorem . . . . . . . . . . . . 135
Fig. 4.8 Ensuring consistency in replicated systems . . . . . . . . . . . . . . . . . . . . . . . 136
Fig. 4.9 Vector clocks showing causalities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
Fig. 4.10 Comparing ACID and BASE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
Fig. 5.1 Processing a data stream . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
Fig. 5.2 B-tree with dynamic changes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
Fig. 5.3 Hash function using the division method . . . . . . . . . . . . . . . . . . . . . . . . . 150
Fig. 5.4 Ring with objects assigned to nodes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
Fig. 5.5 Dynamic changes in the computer network . . . . . . . . . . . . . . . . . . . . . . . 152
Fig. 5.6 Dynamic partitioning of a grid index . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
Fig. 5.7 Query tree of a qualified query on two tables . . . . . . . . . . . . . . . . . . . . . 156
Fig. 5.8 Algebraically optimized query tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
Fig. 5.9 Computing a join with nesting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
Fig. 5.10 Going through tables in sorting order . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
Fig. 5.11 Determining the frequencies of search terms with MapReduce . . . . . . . 162
Fig. 5.12 Five-layer model for relational database systems . . . . . . . . . . . . . . . . . . 163
Fig. 5.13 Use of SQL and NoSQL databases in an online store . . . . . . . . . . . . . . . 165
Fig. 6.1 Horizontal fragmentation of the EMPLOYEE and DEPARTMENT tables . . . . . . . . . . . . . . 171
Fig. 6.2 Optimized query tree for a distributed join strategy . . . . . . . . . . . . . . . . 172
Fig. 6.3 EMPLOYEE table with data type DATE . . . . . . . . . . . . . . . . . . . . . . . . . 174
Fig. 6.4 Excerpt from a temporal table TEMP_EMPLOYEE . . . . . . . . . . . . . . . . 175
Fig. 6.5 Data cube with different analysis dimensions . . . . . . . . . . . . . . . . . . . . . 177
Fig. 6.6 Star schema for a multidimensional database . . . . . . . . . . . . . . . . . . . . . 178
Fig. 6.7 Implementation of a star schema using the relational model . . . . . . . . . . 179
Fig. 6.8 Data warehouse in the context of business intelligence processes. . . . . . 182
Fig. 6.9 Query of a structured object with and without implicit join operator . . . . . . . . . . . . . . 183
Fig. 6.10 BOOK_OBJECT table with attributes of the relation type . . . . . . . . . . . 185
Fig. 6.11 Object-relational mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
Fig. 6.12 Comparison of tables and facts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
Fig. 6.13 Analyzing tables and facts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
Fig. 6.14 Derivation of new information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189

Fig. 6.15 Classification matrix with the attributes Revenue and Loyalty . . . . . . . . 192
Fig. 6.16 Fuzzy partitioning of domains with membership functions. . . . . . . . . . . 194
Fig. 7.1 Massively distributed key-value store with sharding and hash-based key distribution . . . . . . . . 204
Fig. 7.2 Storing data in the Bigtable model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206
Fig. 7.3 Example of a document store . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
Fig. 7.4 Illustration of an XML document represented by tables . . . . . . . . . . . . . 211
Fig. 7.5 Schema of a native XML database . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
Fig. 7.6 Example of a graph database with user data of a website . . . . . . . . . . . . 216


Carl August Zehnder

February 2019 Andreas Meier

