Snowflake Architecture

Snowflake uses a multi-cluster, shared-data architecture in which data is stored independently of compute resources. It reorganizes data into an optimized, compressed, columnar format in cloud storage. Queries are executed by virtual warehouses, which are allocated compute resources on demand. The cloud services layer acts as the control plane, coordinating queries between the warehouses and storage and optimizing them across the decoupled layers.

Database Storage:
• When data is loaded into Snowflake, Snowflake reorganizes it into its internal
optimized, compressed, columnar format and stores this optimized data in cloud
storage.
• The data objects stored by Snowflake are neither directly visible nor accessible to
customers. They are accessible only through SQL query operations run using
Snowflake.
Query Processing:
• Query execution is performed by the query processing layer.
• Snowflake processes queries using "virtual warehouses".
• Warehouses are required for queries, as well as all DML operations, including
loading data into tables.
• A warehouse is defined by its size.
• Increasing the size of a warehouse does not always improve data loading
performance. Data loading performance is influenced more by the number of files
being loaded (and the size of each file) than by the size of the warehouse.
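
The point about file count versus warehouse size can be illustrated with a rough model. This is only a sketch under a stated assumption: that a warehouse loads files through some number of parallel slots, one file per slot at a time. The function name `load_waves` and the slot counts are hypothetical and are not Snowflake figures.

```python
import math


def load_waves(file_count: int, parallel_slots: int) -> int:
    """Return how many sequential rounds are needed to load all files
    when each slot handles one file per round (hypothetical model)."""
    if file_count < 0 or parallel_slots < 1:
        raise ValueError("file_count must be >= 0 and parallel_slots >= 1")
    return math.ceil(file_count / parallel_slots)


# Hypothetical slot counts for a small and a large warehouse.
SMALL_SLOTS = 8
LARGE_SLOTS = 64

for files in (4, 8, 512):
    small = load_waves(files, SMALL_SLOTS)
    large = load_waves(files, LARGE_SLOTS)
    print(f"{files} files: small={small} rounds, large={large} rounds")
```

In this model, a load of only a few files takes the same number of rounds on both warehouses, so the larger warehouse helps only when there are enough files to keep its extra slots busy.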

• What is a Multi-cluster Warehouse?
o By default, a virtual warehouse consists of a single cluster of compute
resources available to the warehouse for executing queries. As queries are
submitted to a warehouse, the warehouse allocates resources to each query
and begins executing the queries. If sufficient resources are not available to
execute all the queries submitted to the warehouse, Snowflake queues the
additional queries until the necessary resources become available.
o With multi-cluster warehouses, Snowflake supports allocating, either
statically or dynamically, additional clusters to make a larger pool of
compute resources available.
o A multi-cluster warehouse is defined by specifying the following properties:
▪ Maximum number of clusters, greater than 1 (up to 10).
▪ Minimum number of clusters, equal to or less than the maximum (up
to 10).
▪ If the minimum and maximum number of clusters are the same, the
warehouse runs in Maximized mode.
▪ If the minimum and maximum number of clusters are different, the
warehouse runs in Auto-scale mode.
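
The rules for the minimum and maximum cluster counts can be written down as a small check. The sketch below is illustrative only: the class name `MultiClusterConfig` and its fields are hypothetical, and the limit of 10 is taken from the notes above rather than from any API.

```python
from dataclasses import dataclass

MAX_CLUSTER_LIMIT = 10  # upper bound quoted in the notes above


@dataclass(frozen=True)
class MultiClusterConfig:
    """Hypothetical holder for the two multi-cluster properties."""
    min_clusters: int
    max_clusters: int

    def __post_init__(self) -> None:
        # Maximum must be greater than 1 and no more than the limit.
        if not 2 <= self.max_clusters <= MAX_CLUSTER_LIMIT:
            raise ValueError("max_clusters must be between 2 and 10")
        # Minimum must be at least 1 and equal to or less than the maximum.
        if not 1 <= self.min_clusters <= self.max_clusters:
            raise ValueError("min_clusters must be between 1 and max_clusters")

    @property
    def mode(self) -> str:
        """Maximized when min == max, otherwise Auto-scale."""
        if self.min_clusters == self.max_clusters:
            return "Maximized"
        return "Auto-scale"


print(MultiClusterConfig(min_clusters=3, max_clusters=3).mode)
print(MultiClusterConfig(min_clusters=1, max_clusters=4).mode)
```

A configuration with equal minimum and maximum keeps all clusters running (static allocation), while different values let the number of clusters vary between the two bounds (dynamic allocation).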
Cloud Services:
• The cloud services layer is a collection of services:
o Authentication
o Infrastructure management (builds the bridge between warehouse and database)
o Metadata management
o Query parsing and optimization
o Transaction manager (DML)
o Access control

• There are two traditional architectures: shared-disk and shared-nothing.

Shared Disk Architecture:
• Multiple servers read from and write to a single storage point, so the shared
storage becomes a bottleneck and performance slows down. The architecture is
coupled: every node connects to the same single storage.
Shared Nothing Architecture:
• Each server has its own storage, so the servers do not contend for a single
storage point and performance does not degrade in the same way. However, if the
data volume is high but the number of servers is low, performance slows down
again. To get higher performance, we have to increase both servers and storage.
• The nodes are independent: they are not connected to a common storage.
Multi-cluster Shared Data Architecture:
• Snowflake follows this architecture, and it is decoupled. The query processing
layer and the data storage layer are separate from each other, which is why
warehouse cost and storage cost are billed separately.
• If the layers are not connected, how does the query processing layer pick up data
from the storage layer? This is done by the cloud services layer, which is the
brain of the Snowflake architecture.
• Just as the brain decides what the body does before the body acts, the cloud
services layer decides how a query runs before it is executed.
• The cloud services layer includes the metadata service, which connects it to each
layer. Whenever we submit a query, it goes first to the cloud services layer,
which looks up exactly where the requested data is stored.
• It then optimizes the query.
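
The role of the metadata service described above can be pictured with a toy model. This is a conceptual sketch only, not how Snowflake is implemented: the class `MetadataService`, the function `plan_query`, and the `storage://` paths are all hypothetical names chosen for illustration.

```python
from typing import Dict, List


class MetadataService:
    """Toy stand-in for the metadata service: maps a table name to the
    storage locations that hold its data (hypothetical model)."""

    def __init__(self) -> None:
        self._locations: Dict[str, List[str]] = {}

    def register(self, table: str, files: List[str]) -> None:
        # Record where a table's data lives in the storage layer.
        self._locations[table] = list(files)

    def locate(self, table: str) -> List[str]:
        # Answer "where is this table stored?" for an incoming query.
        if table not in self._locations:
            raise KeyError(f"unknown table: {table}")
        return list(self._locations[table])


def plan_query(table: str, metadata: MetadataService) -> Dict[str, object]:
    """The services layer resolves storage locations first, then hands
    the resulting plan to a warehouse for execution."""
    return {"table": table, "files_to_scan": metadata.locate(table)}


metadata = MetadataService()
metadata.register("orders", ["storage://example/orders/part-0001",
                             "storage://example/orders/part-0002"])
print(plan_query("orders", metadata))
```

The warehouse never needs to know in advance where data lives; it receives the locations from the services layer, which is what allows the compute and storage layers to stay decoupled.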

Snowflake Pricing:
https://www.snowflake.com/pricing/
Query Running Time (sec)   VWH   Credits per Sec   Edition   Edition Price   Total Price   Total Price INR
10.87                      L     0.0022            BC        4               0.095656      7.65248
2.26                       XS    0.0003            BC        4               0.002712      0.21696
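
The totals in the table follow from multiplying running time by credits per second and by the edition price. The sketch below reproduces that arithmetic. Two assumptions are made: the running time is in seconds and the edition price is in USD per credit, and the conversion rate of 80 INR per USD is inferred from the ratio of the two total columns, not stated in the source. The function name `query_cost` is hypothetical.

```python
def query_cost(seconds: float, credits_per_second: float,
               price_per_credit: float) -> float:
    """Cost of one query: running time x credits per second x price per credit."""
    return seconds * credits_per_second * price_per_credit


# Assumption: rate inferred from the table (Total Price INR / Total Price).
INR_PER_USD = 80.0
# Edition price per credit, taken from the table.
PRICE_PER_CREDIT = 4.0

# (warehouse size, running time in seconds, credits per second) from the table.
rows = [("L", 10.87, 0.0022), ("XS", 2.26, 0.0003)]

for size, seconds, credits_per_second in rows:
    total = query_cost(seconds, credits_per_second, PRICE_PER_CREDIT)
    print(f"{size}: total={total:.6f}, total INR={total * INR_PER_USD:.5f}")
```

Because compute is billed per second of warehouse use, the same query costs less on a smaller warehouse only if the longer running time does not outweigh the lower credit rate.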
Note:
• Snowflake does not charge for queries that read only metadata information.
• Snowflake does not charge for DDL statements either.
• Azure does not support the VPN account.
