NoSQL Data Modeling

This document discusses design principles for document (NoSQL) databases. It opens with example collections for customers, orders, and products that combine embedding (nested objects and arrays) with linking (key references between collections), then contrasts them with a range of normalized relational designs. The central question is how much to embed and how much to link, which depends on expected object size, application needs, and the goal of avoiding unnecessary duplication. Entity-relationship modeling concepts are also mapped to document design patterns, such as nested objects for embedding and key references for linking.


CS122D:

Beyond SQL Data Management


⎯ Lecture #10 ⎯

Mike Carey
UC Irvine

M. Carey, Spring 2023: CS122D


Document DB Design



Example Collections
customers:
  { "custid": "C13",
    "name": "T. Cruise",
    "address": {
      "street": "201 Main St.",
      "city": "St. Louis, MO",
      "zipcode": "63101" },
    "rating": 750
  }, ...

orders:
  { "orderno": 1002,
    "custid": "C13",
    "order_date": "2017-05-01",
    "ship_date": "2017-05-03",
    "items": [
      { "itemno": 460,
        "qty": 95,
        "price": 100.99 },
      { "itemno": 680,
        "qty": 150,
        "price": 8.75 }
    ]
  }, ...

products:
  { "itemno": 460,
    "category": "music",
    "name": "Fender Bender Flight Case",
    "descrip": "Sturdy flight case for Fender Bender guitars",
    "manuf": "Fender Bender",
    "listprice": 109.99
  },
  { "itemno": 680,
    "category": "essentials",
    "name": "Automatic Beer Opener",
    "description": "Robotic beer bottle opener",
    "manuf": "Robo Brew",
    "listprice": 29.95
  }, ...

Slide annotations:
• linking: orders refer to customers by "custid"; order items refer to products by "itemno"
• embedding: the "address" object nested in a customer; the "items" array nested in an order
• aggregate object: a top-level document together with everything nested in it, e.g., an order with its items

Q: Is this a “good” document DB design for NoSQL...?

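As a rough illustration of how linking and embedding differ at read time, the following Python sketch (not part of the original slides) loads abbreviated copies of the documents above into plain dictionaries. The names `customers_by_id`, `products_by_no`, and `describe_order` are hypothetical helpers. Following a link costs a lookup in another collection, while embedded data such as `address` and `items` arrives with its parent document.

```python
# Abbreviated copies of the slide's sample documents.
customers = [
    {"custid": "C13", "name": "T. Cruise",
     "address": {"street": "201 Main St.", "city": "St. Louis, MO",
                 "zipcode": "63101"},
     "rating": 750},
]
orders = [
    {"orderno": 1002, "custid": "C13",
     "order_date": "2017-05-01", "ship_date": "2017-05-03",
     "items": [{"itemno": 460, "qty": 95, "price": 100.99},
               {"itemno": 680, "qty": 150, "price": 8.75}]},
]
products = [
    {"itemno": 460, "name": "Fender Bender Flight Case", "listprice": 109.99},
    {"itemno": 680, "name": "Automatic Beer Opener", "listprice": 29.95},
]

# Linking: a key stored in one document is resolved by a lookup elsewhere.
customers_by_id = {c["custid"]: c for c in customers}
products_by_no = {p["itemno"]: p for p in products}


def describe_order(order):
    """Resolve an order's links and read its embedded items (hypothetical helper)."""
    customer = customers_by_id[order["custid"]]          # follow the custid link
    lines = [(products_by_no[item["itemno"]]["name"], item["qty"])
             for item in order["items"]]                 # embedded array, no lookup
    return {
        "orderno": order["orderno"],
        "customer": customer["name"],
        "city": customer["address"]["city"],             # embedded object
        "lines": lines,
    }
```

Calling `describe_order(orders[0])` touches all three collections, which is one way to see why the choice of aggregate object matters for query cost.
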
Some Possible Relational Designs

A single universal relation:
Universe(custid, name, rating, street, city, zipcode, orderno, order_date, ship_date, itemno, qty, price, category, name, descrip, manuf, listprice)

A conventional normalized design:
Customer(custid, name, addrid, rating)
Address(addrid, street, city, zipcode)
Order(orderno, custid, order_date, ship_date)
Lineitems(orderno, itemno, qty, price)
Products(itemno, category, name, descrip, manuf, listprice)

A fully decomposed design (one non-key attribute per table):
Customer1(custid, name)
...
Customer3(custid, rating)
Address1(addrid, street)
...
Order1(orderno, custid)
...
Lineitems1(orderno, itemno, qty)
...
Products1(itemno, category)
...
Products5(itemno, listprice)

There’s a wide spectrum of options, but what’s “best”?

Relational DB Design Theory
Normal Form    Description
1NF            All attributes must be atomic (a.k.a. scalar)
2NF            Eliminates partial dependencies on PK
3NF            Eliminates transitive dependencies on PK
BCNF           Eliminates “all” remaining redundancy

• Bottom line normalization objectives
    • “The world is flat”
    • “One fact, one place”
• Other considerations
    • Query performance, joins, space, ....

ER-Driven Relational DB Design
ER Concept               Relational Artifact
Entity                   Table with entity’s attributes and PK
Relationship (M:N)       Table with relationship’s attributes and FKs
Relationship (1:N)       Merge relationship table with N-side entity table
Composite attribute      Use flattened column naming convention
Multivalued attribute    Add separate side table with entity’s PK as FK
Inheritance              Delta tables or mashup table
...                      ...

• Same bottom line objectives as before
    • “The world is flat”
    • “One fact, one place”

Document DB Design Principles?
• Revisit our previous design objectives
    • “The world is flat”
    • “One fact, one place”
• Central issue: aggregate object design
    • Often follows from the application
    • Unit of read/write/modify operations
    • Unit of ACID behavior
    • Unit of storage (contiguous bytes)
    • Object size
        • Initial expected size
        • Eventual expected size (!)
    • ...

Example Collections (from slide 2)
The customers, orders, and products collections are the same as on the earlier Example Collections slide: orders link to customers through "custid", order items link to products through "itemno", a customer embeds its "address" object, and an order embeds its "items" array.

Q: Is this a “good” document DB design for NoSQL...?

Some Possible Nesting Designs
• Our previous example (from slide 2)

customer
    address
order
    items
        item
        item
        ...
product

Q: Is this a “good” design?


Some Possible Nesting Designs (II)
• Or we could nest things more heavily...

customer
    address
    orders
        order
            items
                item
                ...
        order
            items
                item
                ...
        ...
product

Caution: Is this a customer’s whole order history...?!

Some Possible Nesting Designs (II)
• Or even more heavily...

ecommerce
    customers
        customer
            address
            orders
                order
                    items
                        item
                        ...
                order
                    items
                        item
                        ...
                ...
    products
        product
        product
        ...


Some Possible Nesting Designs (III)
• Or, we could unnest things entirely...!
  (JSON equivalent of a normalized relational design)

customer
address
order
item
product
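To make the fully unnested option concrete, here is a minimal Python sketch (an illustration, not taken from the slides) that splits one nested order document into a flat order record plus one line-item record per embedded item. The function name `unnest_order` is hypothetical; the resulting shapes mirror the Order and Lineitems relations from the relational designs slide.

```python
def unnest_order(order):
    """Split a nested order into (order_record, [item_records]).

    Each item record carries the parent's orderno, which plays the
    role of a foreign key once the items no longer live inside the order.
    """
    order_record = {k: v for k, v in order.items() if k != "items"}
    item_records = [{"orderno": order["orderno"], **item}
                    for item in order.get("items", [])]
    return order_record, item_records


# Sample order copied from the Example Collections slide.
nested = {
    "orderno": 1002, "custid": "C13",
    "order_date": "2017-05-01", "ship_date": "2017-05-03",
    "items": [{"itemno": 460, "qty": 95, "price": 100.99},
              {"itemno": 680, "qty": 150, "price": 8.75}],
}
flat_order, flat_items = unnest_order(nested)
```

After unnesting, reading an order with its items requires a join on `orderno`, which is the price paid for the flatter design.
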


ER-Driven Document DB Design
ER Concept               Document Artifact
Entity                   Collection with entity’s attributes and PK
Relationship (M:N)       Collection with relationship’s attributes and FKs (linking)*
Relationship (1:N)       Merge relationship collection with N-side entity collection (linking)
Composite attribute      Use nested object (embedding)
Weak entity              Use nested object (embedding)
Multivalued attribute    Use nested array (embedding)
Inheritance              Celebrate diversity: Use entity collection with type flag(s)
...                      ...

*NOTE: Might consider using nested arrays of FK “links” on each side to “cut out the middleman” iff their sizes were small...

• Bottom line objectives
    • Natural “entry points” for queries
    • “One fact, one place”

Cutting Out the M:N Middleman
• The aforementioned nested array approach
    • Based on arrays of keys
    • An appropriate option iff both arrays are small
• For example....

customer
    ...
    vehicles (array of vehicle keys)
vehicle
    ...
    customers (array of customer keys)
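The arrays-of-keys idea can be sketched in a few lines of Python. This is only an illustration: the key values, the `vehicleid` field, and the helpers `link` and `vehicles_of` are assumptions, not names from the slides. Note that each relationship is now recorded on both sides, so every update has to touch two documents to keep them consistent.

```python
# Hypothetical sample data: each side stores an array of the other side's keys.
customers = {
    "C13": {"custid": "C13", "vehicles": ["V1", "V2"]},
    "C25": {"custid": "C25", "vehicles": ["V2"]},
}
vehicles = {
    "V1": {"vehicleid": "V1", "customers": ["C13"]},
    "V2": {"vehicleid": "V2", "customers": ["C13", "C25"]},
}


def link(custid, vehicleid):
    """Record a customer-vehicle relationship on both sides, without duplicates."""
    if vehicleid not in customers[custid]["vehicles"]:
        customers[custid]["vehicles"].append(vehicleid)
    if custid not in vehicles[vehicleid]["customers"]:
        vehicles[vehicleid]["customers"].append(custid)


def vehicles_of(custid):
    """Follow the customer's key array to the vehicle documents."""
    return [vehicles[v] for v in customers[custid]["vehicles"]]
```

With small arrays this avoids a separate relationship collection; with large or fast-growing arrays, both documents keep growing, which is why the slide restricts the approach to the small case.
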


Object Size (I/O) Considerations

Q: Do most order queries also want the line items?

Query (and Indexing) Considerations
Queries:
SELECT * FROM customers WHERE ...
SELECT * FROM orders o, o.items i WHERE ...
SELECT * FROM products WHERE ...
SELECT * FROM customers c, orders o WHERE c.custid = o.custid AND ...

Collections being queried:
customer
    address
order
    items
        item
        item
        ...
product

Indexes:
CREATE INDEX ON customers (name)
CREATE INDEX ON customers (address.city)
CREATE INDEX ON orders (ship_date)
CREATE INDEX ON orders (custid)
CREATE INDEX ON orders (items.itemno)
CREATE INDEX ON products (listprice)
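For readers less familiar with unnesting, the sketch below mimics in Python what the `FROM orders o, o.items i` clause does, and what an index on `items.itemno` has to store. It is an in-memory analogy under stated assumptions, not how any particular database implements it; the second order (1003) is hypothetical data added so the index has more than one entry per key, and the helper names are made up.

```python
from collections import defaultdict

orders = [
    {"orderno": 1002, "custid": "C13",
     "items": [{"itemno": 460, "qty": 95, "price": 100.99},
               {"itemno": 680, "qty": 150, "price": 8.75}]},
    # Hypothetical second order, not from the slides.
    {"orderno": 1003, "custid": "C31",
     "items": [{"itemno": 460, "qty": 2, "price": 105.00}]},
]


def unnest_items(docs):
    """Yield one (order, item) pair per embedded item, like FROM orders o, o.items i."""
    for o in docs:
        for i in o.get("items", []):
            yield o, i


def build_itemno_index(docs):
    """Map each itemno to the order numbers whose items array mentions it."""
    index = defaultdict(list)
    for o, i in unnest_items(docs):
        index[i["itemno"]].append(o["orderno"])
    return index


itemno_index = build_itemno_index(orders)
```

An index over an array field holds one entry per array element rather than one per document, so its size follows the total number of items, not the number of orders.
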


Modeling Inheritance

[ER diagram: order, with IsA subtypes phone_order and web_order]

A single collection can then hold objects of all of these types:
order, phone_order, web_order, web_order, phone_order, ...

Note: An object of a particular subtype is free to contain any/all fields that it needs in the JSON world...!

Possible ways to indicate the subtype(s) of an object include:
• Disjoint subtypes: a “type” field containing a value of “order”, “phone_order”, or “web_order”
• Non-disjoint subtypes: a “types” field containing an array of values, or a set of boolean type indicator fields, such as “is_order”, “is_phone_order”, and “is_web_order”
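The two type-flag conventions described above can be sketched in Python as follows. This is an illustrative assumption rather than the slides' own code: the subtype-specific fields `agent` and `browser` and the helpers `subtypes` and `select_subtype` are hypothetical. The two flag styles are mixed in one list only to show both; a real collection would normally pick one.

```python
orders = [
    {"orderno": 1, "type": "order"},
    {"orderno": 2, "type": "phone_order", "agent": "A7"},        # "agent" is hypothetical
    {"orderno": 3, "type": "web_order", "browser": "Firefox"},   # "browser" is hypothetical
    {"orderno": 4, "types": ["phone_order", "web_order"]},       # non-disjoint style
]


def subtypes(doc):
    """Return the set of subtype labels, accepting either flag style."""
    if "types" in doc:
        return set(doc["types"])
    return {doc["type"]}


def select_subtype(docs, wanted):
    """Return the order numbers of all documents carrying the given subtype."""
    return [d["orderno"] for d in docs if wanted in subtypes(d)]
```

For example, `select_subtype(orders, "web_order")` picks up both the disjoint web order and the order flagged with two subtypes.
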


Anti-Pattern 1: Fields vs. Values
Fields (the date is used as a field name):
...
{
  "storeid": 288,
  "2020-01-01": 1258.77
}
...

Values (the date is stored as a field value):
...
{
  "storeid": 288,
  "day": "2020-01-01",
  "sales": 1258.77
}
...

With the values form, the data can be queried and indexed by day:

SELECT * FROM daily_sales
WHERE day >= "2020-01-01"
  AND day < "2020-02-01"

CREATE INDEX ON daily_sales (day)

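One way to see why the values form is easier to query is to convert between the two in Python. The sketch below is illustrative only: `to_value_docs` and `sales_in_range` are hypothetical helpers, and the "2020-02-01" entry is made-up data added to exercise the range filter. It assumes every key other than `storeid` is an ISO-formatted date, so plain string comparison orders the days correctly.

```python
def to_value_docs(doc, key_field="storeid"):
    """Turn one fields-style document into one values-style document per day."""
    return [{key_field: doc[key_field], "day": day, "sales": amount}
            for day, amount in doc.items() if day != key_field]


def sales_in_range(docs, start, end):
    """Mimic WHERE day >= start AND day < end over values-style documents."""
    return [d for d in docs if start <= d["day"] < end]


fields_style = {
    "storeid": 288,
    "2020-01-01": 1258.77,
    "2020-02-01": 990.10,   # hypothetical extra day, not from the slides
}
value_docs = to_value_docs(fields_style)
january = sales_in_range(value_docs, "2020-01-01", "2020-02-01")
```

In the fields form, the same range filter would have to inspect field names rather than values, which ordinary predicates and indexes on a named field cannot do.
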
Anti-Pattern 2: Unbounded Arrays
[Diagram: customer objects whose nested "orders" arrays keep growing as new orders are added]

customer
    address
    orders
        order
            items
                item
                ...
        order
            items
                item
                ...
        ...

“Those who nest history are bound to regret it...”

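A quick way to get a feel for the problem is to measure how a customer document grows as its embedded order history grows. The Python sketch below uses the length of the JSON encoding as a stand-in for stored size, which is an assumption (real systems use their own storage formats), and all helper names are hypothetical.

```python
import json


def doc_size(doc):
    """Approximate stored size as the byte length of the JSON encoding."""
    return len(json.dumps(doc).encode("utf-8"))


def make_order(orderno, custid=None):
    """Build a small sample order; custid is set only in the linked design."""
    order = {"orderno": orderno, "order_date": "2017-05-01",
             "items": [{"itemno": 460, "qty": 95, "price": 100.99}]}
    if custid is not None:
        order["custid"] = custid
    return order


def nested_customer(n_orders):
    """Anti-pattern: the customer embeds its entire order history."""
    return {"custid": "C13", "name": "T. Cruise",
            "orders": [make_order(1000 + k) for k in range(n_orders)]}


def linked_customer():
    """Alternative: orders live in their own collection and point back via custid."""
    return {"custid": "C13", "name": "T. Cruise"}


sizes = {n: doc_size(nested_customer(n)) for n in (10, 100, 1000)}
```

The nested customer grows roughly linearly with the number of orders, while the linked customer stays the same size no matter how many orders exist, since each order is its own bounded document.
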
Anti-Pattern 3: Heterogeneous Values
...
{ "itemno": 347,
  "name": "Beer Cooler Backpack",
  "colors": "black",
  "price": 29.95 },
{ "itemno": 375,
  "name": "Stratuscaster Guitar",
  "colors": ["sunburst", "black", "cherry"],
  "lprice": 1499.99 },
...

vs.

...
{ "itemno": 347,
  "name": "Beer Cooler Backpack",
  "colors": ["black"],
  "price": 29.95 },
{ "itemno": 375,
  "name": "Stratuscaster Guitar",
  "colors": ["sunburst", "black", "cherry"],
  "lprice": 1499.99 },
...

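If a collection already contains the heterogeneous form, one possible cleanup is to normalize the field on the way in, so that queries can always treat `colors` as an array. The following Python sketch shows the idea; `as_list` and `normalize_colors` are hypothetical helper names, and treating a missing field as an empty list is an assumption.

```python
def as_list(value):
    """Wrap a scalar in a list; pass lists through; map None to an empty list."""
    if value is None:
        return []
    if isinstance(value, list):
        return value
    return [value]


def normalize_colors(product):
    """Return a copy of the product whose "colors" field is always a list."""
    return {**product, "colors": as_list(product.get("colors"))}


backpack = {"itemno": 347, "name": "Beer Cooler Backpack",
            "colors": "black", "price": 29.95}
guitar = {"itemno": 375, "name": "Stratuscaster Guitar",
          "colors": ["sunburst", "black", "cherry"], "lprice": 1499.99}
cleaned = [normalize_colors(p) for p in (backpack, guitar)]
```

After normalization, a test such as `"black" in p["colors"]` works uniformly, whereas on the raw string `"black"` the same expression would silently perform a substring check.
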
Document DB Design Principles!
• Rethink the relational DB design objectives
    • “The world is flat”
    • “One fact, one place”
• Central issue: aggregate object design
    • Follows from the application (E-R)
    • Unit of read/write/modify ops and ACID
    • Unit of storage (contiguous bytes)
    • Object size (both initial and eventual!)
• Consider queryability (and indexability)
• Avoid anti-patterns...!


Questions?
