PostgreSQL Data Base Design Part 1

The document discusses the organization and management of data, focusing on the differences between Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP). It outlines key concepts such as schemas, normalization, data types, and storage solutions like data warehouses and data lakes. The document emphasizes the importance of understanding business requirements and selecting appropriate database models for effective data design.

Uploaded by

sleonsalome

OLTP and OLAP

DATABASE DESIGN

Lis Sulmont
Curriculum Manager
Our motivating question:

How should we organize and manage data?


Schemas: How should my data be logically organized?

Normalization: Should my data have minimal dependency and redundancy?

Views: What joins will be done most often?

Access control: Should all users of the data have the same level of access?

DBMS: How do I pick between all the SQL and NoSQL options?

and more!

It depends on the intended use of the data.

Approaches to processing data
OLTP: Online Transaction Processing
OLAP: Online Analytical Processing

Some concrete examples
OLTP tasks:
Find the price of a book
Update latest customer transaction
Keep track of employee hours

OLAP tasks:
Calculate books with best profit margin
Find most loyal customers
Decide employee of the month
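
The two kinds of task can be sketched side by side. This is only an illustration: the `books` table, its columns, and its values are hypothetical, and Python's built-in `sqlite3` stands in for a PostgreSQL database so the sketch runs anywhere.

```python
import sqlite3

# Hypothetical bookstore table; names and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE books (title TEXT PRIMARY KEY, price REAL, cost REAL)"
)
conn.executemany(
    "INSERT INTO books VALUES (?, ?, ?)",
    [("Dune", 12.0, 6.0), ("Emma", 8.0, 6.0), ("Ulysses", 20.0, 5.0)],
)

# OLTP-style task: a simple lookup that touches one row.
price = conn.execute(
    "SELECT price FROM books WHERE title = ?", ("Dune",)
).fetchone()[0]

# OLTP-style task: a small, frequent update.
conn.execute("UPDATE books SET price = ? WHERE title = ?", (9.0, "Emma"))

# OLAP-style task: scan every row and rank by profit margin.
best_margin = conn.execute(
    """
    SELECT title, (price - cost) / price AS margin
    FROM books
    ORDER BY margin DESC
    LIMIT 1
    """
).fetchone()

print(price)
print(best_margin)
```

The lookup and the update each touch a single row, while the margin query has to read the whole table, which is the contrast the slide draws.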

OLAP vs. OLTP
          OLTP                                     OLAP
Purpose   support daily transactions               report and analyze data
Design    application-oriented                     subject-oriented
Data      up-to-date, operational                  consolidated, historical
Size      snapshot, gigabytes                      archive, terabytes
Queries   simple transactions & frequent updates   complex, aggregate queries & limited updates
Users     thousands                                hundreds

Working together

Takeaways
Step back and figure out business requirements
Difference between OLAP and OLTP

OLAP? OLTP? Or something else?

Let's practice!
Storing data
Structuring data
1. Structured data
Follows a schema
Defined data types & relationships
e.g., SQL, tables in a relational database

2. Unstructured data
Schemaless
Makes up most of data in the world
e.g., photos, chat logs, MP3

3. Semi-structured data
Does not follow larger schema
Self-describing structure
e.g., NoSQL, XML, JSON

# Example of a JSON file
"user": {
    "profile_use_background_image": true,
    "statuses_count": 31,
    "profile_background_color": "C0DEED",
    "followers_count": 3066,
    ...
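
A minimal sketch of what "self-describing" means in practice, using Python's standard `json` module. The record below is a hypothetical, closed-off version of the excerpt: it keeps only the four fields shown and does not guess at the elided ones.

```python
import json

# Hypothetical complete record: only the four fields shown in the slide are
# used; the elided fields ("...") are left out, not guessed.
raw = """
{
  "user": {
    "profile_use_background_image": true,
    "statuses_count": 31,
    "profile_background_color": "C0DEED",
    "followers_count": 3066
  }
}
"""

record = json.loads(raw)

# Field names and value types travel with the data itself,
# so no external schema is needed to interpret the record.
described = {key: type(value).__name__ for key, value in record["user"].items()}
print(described)
```

No table definition was consulted here: the field names and types were recovered from the data alone.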

Structuring data

1 Flower by Sam Oth and Database Diagram by Nick Jenkins via Wikimedia Commons
https://commons.wikimedia.org/wiki/File:Languages_xml.png

Storing data beyond traditional databases
Traditional databases
For storing real-time relational structured data → OLTP

Data warehouses
For analyzing archived structured data → OLAP

Data lakes
For storing data of all structures = flexibility and scalability
For analyzing big data

Data warehouses
Optimized for analytics - OLAP
Organized for reading/aggregating data

Usually read-only

Contains data from multiple sources

Massively Parallel Processing (MPP)

Typically uses a denormalized schema and dimensional modeling

Data marts

Subset of data warehouses


Dedicated to a specific topic

Data lakes
Store all types of data at a lower cost:
e.g., raw, operational databases, IoT device logs, real-time, relational and non-relational

Retains all data and can take up petabytes

Schema-on-read as opposed to schema-on-write

Need to catalog data; otherwise it becomes a data swamp

Run big data analytics using services such as Apache Spark and Hadoop
Useful for deep learning and data discovery because these activities require so much data
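
The difference between schema-on-write and schema-on-read can be sketched as follows. Everything here is an assumption made for illustration: the `readings` table, the sensor records, and the `read_celsius` helper are hypothetical, an in-memory SQLite table stands in for a traditional database, and a plain Python list stands in for the data lake.

```python
import json
import sqlite3

# Schema-on-write: the table definition is enforced when data is stored.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings (device TEXT NOT NULL, celsius REAL NOT NULL)"
)
conn.execute("INSERT INTO readings VALUES (?, ?)", ("sensor-1", 21.5))
try:
    conn.execute("INSERT INTO readings VALUES (?, ?)", ("sensor-2", None))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # the row without a temperature never gets stored

# Schema-on-read: raw records are kept as-is (a list stands in for the lake).
lake = [
    '{"device": "sensor-1", "celsius": 21.5}',
    '{"device": "sensor-2"}',
    '{"device": "sensor-3", "celsius": "19", "firmware": "1.2"}',
]


def read_celsius(raw_line):
    """Apply a schema at read time: return (device, celsius) or None."""
    record = json.loads(raw_line)
    if "celsius" not in record:
        return None
    return record["device"], float(record["celsius"])


parsed = [row for row in map(read_celsius, lake) if row is not None]
print(rejected, parsed)
```

The lake accepted all three records, including the incomplete one and the one with an extra field; structure was imposed only when the data was read. That flexibility is also why uncatalogued lakes become hard to use.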

ETL: Extract, Transform, Load

ELT: Extract, Load, Transform

Let's practice!
Database design
What is database design?
Determines how data is logically stored
How is data going to be read and updated?

Uses database models: high-level specifications for database structure


Most popular: relational model

Some other options: NoSQL models, object-oriented model, network model

Uses schemas: blueprint of the database


Defines tables, fields, relationships, indexes, and views

When inserting data in relational databases, schemas must be respected
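
A minimal sketch of a schema being enforced at insert time. The `authors` and `books` tables and their rows are hypothetical, and SQLite (via Python's `sqlite3`) stands in for PostgreSQL. SQLite is more lenient about column types than PostgreSQL, so the sketch relies on `NOT NULL`, `CHECK`, and foreign-key constraints, which both systems support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE authors (
        author_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL
    );
    CREATE TABLE books (
        book_id   INTEGER PRIMARY KEY,
        title     TEXT NOT NULL,
        price     REAL CHECK (price >= 0),
        author_id INTEGER NOT NULL REFERENCES authors (author_id)
    );
    """
)

conn.execute("INSERT INTO authors VALUES (1, 'Jane Austen')")
conn.execute("INSERT INTO books VALUES (1, 'Emma', 8.0, 1)")  # respects the schema

bad_rows = [
    (2, None, 5.0, 1),           # missing title violates NOT NULL
    (3, "Persuasion", -1.0, 1),  # negative price violates CHECK
    (4, "Unknown", 5.0, 99),     # author 99 does not exist: foreign key violation
]
errors = []
for row in bad_rows:
    try:
        conn.execute("INSERT INTO books VALUES (?, ?, ?, ?)", row)
    except sqlite3.IntegrityError as exc:
        errors.append(str(exc))

stored = conn.execute("SELECT COUNT(*) FROM books").fetchone()[0]
print(stored, errors)
```

Only the row that matches the schema is stored; each of the other three is rejected with an integrity error.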

Data modeling
Process of creating a data model for the data to be stored

1. Conceptual data model: describes entities, relationships, and attributes

Tools: data structure diagrams, e.g., entity-relationship diagrams and UML diagrams
2. Logical data model: defines tables, columns, relationships

Tools: database models and schemas, e.g., relational model and star schema
3. Physical data model: describes physical storage

Tools: partitions, CPUs, indexes, backup systems and tablespaces

1 https://en.wikipedia.org/wiki/Data_model

Conceptual - ER diagram
Entities, relationships, and attributes

Logical - schema
Fastest conversion: entities become the tables

Other database design options

Determining tables

Beyond the relational model
Dimensional modeling
Adaptation of the relational model for data warehouse design

Optimized for OLAP queries: aggregate data, not updating (OLTP)


Built using the star schema

Easy to interpret and extend schema

Elements of dimensional modeling
Fact tables

Decided by business use-case

Holds records of a metric

Changes regularly

Connects to dimensions via foreign keys

Dimension tables
Holds descriptions of attributes
Does not change as often

Organize by:
What is being analyzed?
How often do entities change?
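
A small star schema makes the fact/dimension split concrete. This is a sketch under stated assumptions: the `fact_sales`, `dim_book`, and `dim_store` tables and all their rows are hypothetical, and SQLite (via Python's `sqlite3`) stands in for a data warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables: descriptive attributes that change rarely.
    CREATE TABLE dim_book (
        book_id INTEGER PRIMARY KEY,
        title   TEXT NOT NULL,
        genre   TEXT NOT NULL
    );
    CREATE TABLE dim_store (
        store_id INTEGER PRIMARY KEY,
        city     TEXT NOT NULL
    );
    -- Fact table: one row per sale, holding the metric plus foreign keys.
    CREATE TABLE fact_sales (
        sale_id  INTEGER PRIMARY KEY,
        book_id  INTEGER NOT NULL REFERENCES dim_book (book_id),
        store_id INTEGER NOT NULL REFERENCES dim_store (store_id),
        amount   REAL NOT NULL
    );
    INSERT INTO dim_book VALUES (1, 'Dune', 'Sci-fi'), (2, 'Emma', 'Classic');
    INSERT INTO dim_store VALUES (1, 'Lima'), (2, 'Quito');
    INSERT INTO fact_sales VALUES
        (1, 1, 1, 12.0), (2, 1, 2, 12.0), (3, 2, 1, 8.0), (4, 1, 1, 12.0);
    """
)

# OLAP-style query: aggregate the metric, grouped by a dimension attribute.
sales_by_genre = conn.execute(
    """
    SELECT b.genre, SUM(f.amount) AS total
    FROM fact_sales AS f
    JOIN dim_book AS b ON b.book_id = f.book_id
    GROUP BY b.genre
    ORDER BY total DESC
    """
).fetchall()
print(sales_by_genre)
```

The fact table grows with every sale, while the dimension tables stay small and stable. Analysis questions are answered by joining the fact table to whichever dimension describes the grouping of interest.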

Let's practice!