
Data Engineering SQL Concepts - Mindmap

The document outlines key concepts in data engineering: the characteristics of big data (volume, velocity, and variety) and storage options such as data lakes and cloud storage platforms. It distinguishes structured from unstructured data, covers data warehouses and schema design, contrasts OLAP with OLTP workloads, and contrasts declarative with imperative programming. It also describes data pipelines: extract, transform, load (ETL) and the alternative ELT approach.

Big Data
    Volume
    Velocity
    Variety

Data Lake
    Unstructured Data: Images / Videos / Text / Tables
    Traditional Storage Platform
        Hard disk
        Computer
        Mobile
    Cloud Storage Platforms
        Google Drive
        S3 Buckets

Scalability
    Horizontal: increase the number of computers, so total processing power increases
    Vertical: increase the processing power of your current computer

Traditional Data Warehouse

SQL
    MySQL
    Microsoft SQL Server
    PL/SQL

Type of Activities
    OLAP: Online Analytical Processing, column-major format (MySQL)
    OLTP: Online Transaction Processing, row-major format (MySQL)
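
The row-major versus column-major distinction above can be sketched in a few lines of Python. This is only an illustration with made-up order records (the field names `order_id`, `customer`, and `amount` are assumptions, not from the mindmap); real engines implement these layouts on disk, not as Python lists.

```python
# Three order records (illustrative data only).
rows = [
    {"order_id": 1, "customer": "ann", "amount": 20.0},
    {"order_id": 2, "customer": "bob", "amount": 35.5},
    {"order_id": 3, "customer": "ann", "amount": 10.0},
]

# Row-major (OLTP style): each record is kept together, as in `rows`.
# Column-major (OLAP style): each column is kept together.
columns = {key: [row[key] for row in rows] for key in rows[0]}

# OLTP-style access: fetch one whole record by its key.
order_2 = next(row for row in rows if row["order_id"] == 2)

# OLAP-style access: aggregate one column without touching the others.
total_amount = sum(columns["amount"])

print(order_2)
print(total_amount)
```

Reading a single record is natural in the row layout, while summing one field over all records only needs one list in the column layout.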

Structured Data (RDBMS)
    Ways to define the schema: Data Model (OLAP)
        Star Schema
        Snowflake Schema
        Normalization
        Denormalization
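
As a rough sketch of a star schema, the example below builds one fact table and one dimension table in an in-memory SQLite database. The table and column names (`fact_sales`, `dim_product`, and so on) are hypothetical, and SQLite is only a convenient stand-in for a warehouse engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension table: descriptive attributes, one row per product.
    CREATE TABLE dim_product (
        product_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        category   TEXT NOT NULL
    );
    -- Fact table: one row per sale, pointing at the dimension.
    CREATE TABLE fact_sales (
        sale_id    INTEGER PRIMARY KEY,
        product_id INTEGER NOT NULL REFERENCES dim_product(product_id),
        quantity   INTEGER NOT NULL
    );
    INSERT INTO dim_product VALUES (1, 'pen', 'stationery'), (2, 'mug', 'kitchen');
    INSERT INTO fact_sales VALUES (1, 1, 10), (2, 2, 3), (3, 1, 5);
""")

# Typical analytical query: join the fact table to a dimension and aggregate.
units_by_category = conn.execute("""
    SELECT p.category, SUM(f.quantity)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()
print(units_by_category)
```

Here `category` is stored directly on the dimension (denormalized). A snowflake schema would normalize it further by moving categories into their own table that `dim_product` references.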
Data Engineering
Ways to define Schema
    No need to follow any data model in OLTP (RDBMS)
    You must have a schema design
    ER diagrams: we design the schema

How To Store the Data?

Schema on Write

Data Warehouse
    Cloud Based Data Warehouse
        Snowflake
        Google BigQuery
        Redshift
        Hive: Hadoop-based SQL platform
        Databricks Spark SQL

Types of Programming
    Declarative: you define the output, and SQL takes care of the steps
    Imperative: you define the steps in Python or another programming language to get the output
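
To make the contrast concrete, here is a minimal sketch that computes the same per-customer totals twice: once with an explicit Python loop (imperative) and once with a SQL query (declarative). The `sales` data and table name are invented for illustration, and SQLite stands in for any SQL engine.

```python
import sqlite3

sales = [("ann", 20), ("bob", 35), ("ann", 10)]

# Imperative: spell out each step of the computation.
totals = {}
for customer, amount in sales:
    totals[customer] = totals.get(customer, 0) + amount

# Declarative: describe the result; the SQL engine decides the steps.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", sales)
sql_totals = dict(
    conn.execute("SELECT customer, SUM(amount) FROM sales GROUP BY customer").fetchall()
)

print(totals)
print(sql_totals)
```

Both approaches yield the same mapping; only the SQL version leaves the choice of algorithm to the engine.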

Structured Data + Semi-Structured + Unstructured Data
    NoSQL Database (No Schema Database)
        MongoDB
        Cassandra
        HBase
    Schema on Read
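
The difference between schema on write and schema on read can be sketched as follows. This is an assumption-laden illustration: SQLite plays the role of the relational store, a plain list of JSON strings plays the role of a schemaless document store, and the `users` table and its fields are hypothetical.

```python
import json
import sqlite3

# Schema on write: the table definition rejects bad data at insert time.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO users VALUES (1, 'ann')")
try:
    conn.execute("INSERT INTO users VALUES (2, NULL)")  # violates NOT NULL
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Schema on read: store raw documents as-is and interpret fields when reading.
raw_documents = [
    '{"id": 1, "name": "ann"}',
    '{"id": 2, "nickname": "bobby", "tags": ["new"]}',
]
names = [json.loads(doc).get("name", "<unknown>") for doc in raw_documents]

print(rejected)
print(names)
```

With schema on read, the second document is stored without complaint, and the reader decides how to handle its missing `name` field.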

Data Pipeline (ETL)
    1. [E] We extract the data from the source: Website / Excel / API
    2. [T] We transform the data: Python / Alteryx / Power Query
    3. [L] We load the data: SQL / Warehouse
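
The three steps above can be sketched as a tiny ETL script. Everything here is a stand-in chosen for illustration: the source is an inline CSV string rather than a real website, Excel file, or API, the function names are hypothetical, and an in-memory SQLite database plays the warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source data; in practice this would come from a file or API.
SOURCE_CSV = "name,amount\n ann ,20\nbob,not-a-number\nCara,15\n"

def extract(text):
    """[E] Read raw records from the source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(records):
    """[T] Clean names and drop rows whose amount is not an integer."""
    cleaned = []
    for record in records:
        try:
            amount = int(record["amount"])
        except ValueError:
            continue
        cleaned.append((record["name"].strip().lower(), amount))
    return cleaned

def load(rows, conn):
    """[L] Write the cleaned rows into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS payments (name TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO payments VALUES (?, ?)", rows)

conn = sqlite3.connect(":memory:")
load(transform(extract(SOURCE_CSV)), conn)
loaded = conn.execute("SELECT name, amount FROM payments ORDER BY name").fetchall()
print(loaded)
```

Because the transform runs before the load, only cleaned rows ever reach the target table.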

ELT Pipeline
    1. [E] We extract the data from the source: Website / Excel / API
    2. [L] We load the data: Data Lake / S3 / Hard Disk
    3. [T] We transform the data: Python / Alteryx / Power Query
    4. [L] We load the data: SQL / Warehouse
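
A minimal sketch of the four ELT steps, under loud assumptions: a temporary local directory stands in for the data lake (S3 or a hard disk), the raw events are hard-coded rather than pulled from a real source, and an in-memory SQLite database stands in for the warehouse. All names are hypothetical.

```python
import json
import sqlite3
import tempfile
from pathlib import Path

# 1. [E] Extract: hard-coded stand-in for data pulled from a source.
raw_events = ['{"user": "Ann", "ms": 1200}', '{"user": "bob", "ms": 800}']

with tempfile.TemporaryDirectory() as lake_dir:
    # 2. [L] Load: land the raw data unchanged in the "lake".
    raw_path = Path(lake_dir) / "events.jsonl"
    raw_path.write_text("\n".join(raw_events), encoding="utf-8")

    # 3. [T] Transform: read the raw file back and reshape it.
    rows = []
    for line in raw_path.read_text(encoding="utf-8").splitlines():
        event = json.loads(line)
        rows.append((event["user"].lower(), event["ms"] / 1000))

# 4. [L] Load: write the transformed rows into the warehouse table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_name TEXT, seconds REAL)")
conn.executemany("INSERT INTO events VALUES (?, ?)", rows)
warehouse_rows = conn.execute(
    "SELECT user_name, seconds FROM events ORDER BY user_name"
).fetchall()
print(warehouse_rows)
```

Unlike the ETL flow, the raw data is stored first, so it remains available for other transformations later.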
