1/19/24, 10:22 AM Snowflake Learning Path.

Let Your Data Take Centerstage | by DataCouch | Medium


Snowflake Learning Path


DataCouch
10 min read · Apr 26, 2023

Let Your Data Take Centerstage

What is Snowflake?
Snowflake is an integrated, cloud-based data storage, processing, and
analytics service and a pioneer in offering Data-Warehousing-as-a-Service
(DWaaS). Snowflake runs on the three largest cloud providers (Amazon Web
Services, Microsoft Azure, and Google Cloud Platform), so you can store,
process, and analyze your enterprise data on the cloud of your choice.

Some of the key features and advantages of Snowflake are:

Separation of storage and compute

On-the-fly scalable compute

Data sharing

https://datacouch.medium.com/snowflake-learning-path-f404718044ea 1/19

Data cloning

Support for third-party tools

With these and many more features, Snowflake can support your growing
and ever more demanding enterprise data needs. Snowflake is all about the
data: it enables governed access to near-infinite volumes of data, backed
by cutting-edge tools, applications, and services.

Your teams can collaborate from anywhere in the world with Snowflake’s
Data Cloud to learn more about existing customers, identify new
customers, create business opportunities, and reveal new insights
quickly, easily, securely, seamlessly, and at scale.

Snowflake is one of the most widely adopted data warehouse service
providers for building self-managing cloud data warehouses,
data lakes, and other cloud storage. It runs on popular cloud
providers such as Amazon Web Services (AWS), Microsoft
Azure, and Google Cloud Platform.

The platform handles all aspects of authentication, configuration,
resource management, data protection, availability, and optimization. It is
known for a range of distinctive features, including data sharing, Time
Travel, database replication and failover, and built-in zero-copy cloning.
By decoupling storage from compute, it gives its users greater agility.
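
To make the idea of zero-copy cloning more concrete, here is a minimal
Python sketch. It is a toy model, not Snowflake's implementation: it
assumes a table is just a list of references to immutable storage blocks,
and the Table class with its append, clone, and rows methods are
hypothetical names invented for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    """Toy table: a list of references to immutable blocks (tuples of rows)."""
    blocks: list = field(default_factory=list)

    def append(self, rows):
        # New data always lands in a new immutable block.
        self.blocks.append(tuple(rows))

    def clone(self):
        # Copy only the list of references, never the blocks themselves.
        return Table(blocks=list(self.blocks))

    def rows(self):
        return [row for block in self.blocks for row in block]


orders = Table()
orders.append([("o1", 10), ("o2", 25)])

dev_copy = orders.clone()
dev_copy.append([("o3", 7)])  # only the clone sees this new block

# The original block is shared by both tables rather than duplicated.
shared = dev_copy.blocks[0] is orders.blocks[0]
```

In this sketch the clone costs no extra storage until it diverges from the
source, and changes made to the clone never show up in the original table.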

The Snowflake ecosystem consists of:

Unlimited Storage via Data Lake

Query Processing through Virtual Data Warehouse


Snowflake for Data Science

Underlying Cloud Infrastructure Services


Use Cases for Snowflake

Near-infinite Data Storage with Snowflake Data Lake


The Snowflake data lake is a highly scalable repository capable of storing
raw data (structured, semi-structured, and unstructured) until further
processing. It can hold data from heterogeneous sources in different data
formats.

Unlimited and Economical Storage


You can store data in multiple formats, including Avro, Parquet, ORC, CSV,
JSON, XML, and many more. The cost per terabyte (TB) of storage is low, and
there is no fixed limit on how much you can store. The storage layer uses
efficient compression and encryption techniques and applies automatic
micro-partitioning. You can also easily load data from different cloud
providers through real-time streaming and batch uploads.

Leverage SQL
You can run SQL queries over structured data stored in different formats
and structures. You can also work with external data in cloud storage such
as AWS S3 and Azure Data Lake Storage without loading it into Snowflake.
Snowpark lets you streamline the development of data pipelines using SQL
or another language of your choice, without managing additional clusters
or services and without making copies of the data.

Reliable and Performant Query Processing


The elastic engine, with its virtual warehouse capabilities, can simplify
your workloads and the underlying architecture. Because workloads can run
on separate virtual warehouses, issues related to concurrency and resource
contention are greatly reduced. You keep control over how data in the lake
is stored and used, which allows easy collaboration and secure data
sharing.


Snowflake Data Warehouse for Data Processing


Leverage the Snowflake DWH for executing long-running analytical jobs to
yield business insights.

Massively Parallel Processing


The Snowflake DWH uses massively parallel processing (MPP) to handle
multiple query and analysis operations simultaneously across several
processing elements. An MPP database can process large volumes of data
without compromising the execution speed of analytics operations.
Snowflake’s centralized storage is shared by all compute nodes, which run
queries in clusters called virtual warehouses. This combination gives
Snowflake a mix of simplicity and scalability.

Seamless Data Integration


Snowflake supports both ETL and ELT, so you can transform data either
during or after loading it. It works with multiple data integration tools,
including Informatica, Talend, Fivetran, Matillion, and others, which gives
you a broad ecosystem to choose from. Snowpark can be used to build complex
data pipelines, allowing developers to interact with Snowflake without
moving data.
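
The difference between ETL and ELT is mostly a question of where the
transformation runs. The following Python sketch illustrates that under
loud assumptions: it uses the standard-library sqlite3 module as a
stand-in for a warehouse (it does not connect to Snowflake), and the
table names, the sample rows, and the parse_amount helper are all
hypothetical.

```python
import sqlite3

raw_rows = [("alice", " 42 "), ("bob", "17"), ("carol", "n/a")]


def parse_amount(text):
    """Return an int for purely numeric text, otherwise None."""
    text = text.strip()
    return int(text) if text.isdigit() else None


def etl(conn):
    # ETL: transform in application code first, then load the clean rows.
    clean = [(name, parse_amount(amount)) for name, amount in raw_rows]
    conn.execute("CREATE TABLE sales_etl (name TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales_etl VALUES (?, ?)", clean)


def elt(conn):
    # ELT: load the raw text as-is, then transform inside the database.
    conn.execute("CREATE TABLE sales_raw (name TEXT, amount TEXT)")
    conn.executemany("INSERT INTO sales_raw VALUES (?, ?)", raw_rows)
    conn.execute(
        """
        CREATE TABLE sales_elt AS
        SELECT name,
               CASE WHEN TRIM(amount) <> ''
                     AND TRIM(amount) NOT GLOB '*[^0-9]*'
                    THEN CAST(TRIM(amount) AS INTEGER)
               END AS amount
        FROM sales_raw
        """
    )


conn = sqlite3.connect(":memory:")
etl(conn)
elt(conn)
etl_result = conn.execute(
    "SELECT name, amount FROM sales_etl ORDER BY name").fetchall()
elt_result = conn.execute(
    "SELECT name, amount FROM sales_elt ORDER BY name").fetchall()
```

Both paths end with the same cleaned table. With ELT the raw data is also
kept in the warehouse, so the transformation can be rerun or changed later
without going back to the source system.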

Columnar Storage
Snowflake is a complete SQL database built for data-intensive operations,
with data stored in columns rather than in the traditional row-by-row
layout. The columnar format allows faster aggregation and high compression
ratios, which in turn reduce physical storage requirements and storage
cost.
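
A small Python sketch can show why a columnar layout helps with
aggregation and compression. The sample orders and the run_length_encode
helper are hypothetical, and run-length encoding is used here only as an
easy-to-read example of a compression scheme; this is not a description
of how Snowflake compresses data internally.

```python
rows = [
    {"order_id": 1, "region": "EU", "amount": 120},
    {"order_id": 2, "region": "EU", "amount": 80},
    {"order_id": 3, "region": "US", "amount": 200},
    {"order_id": 4, "region": "US", "amount": 50},
]

# Columnar layout: one list per column instead of one dict per row.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# Row layout: every whole row is visited to sum a single field.
total_from_rows = sum(row["amount"] for row in rows)

# Columnar layout: only the "amount" column is read.
total_from_columns = sum(columns["amount"])


def run_length_encode(values):
    """Collapse runs of equal neighbours into [value, count] pairs."""
    encoded = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return encoded


# Values of one column sit next to each other, so repeats compress well.
region_rle = run_length_encode(columns["region"])
```

The aggregate touches one column instead of every field of every row, and
similar values stored side by side are what make high compression ratios
possible.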

Data Partitioning
Snowflake tables are automatically divided into micro-partitions, which
are contiguous units of storage. Each micro-partition holds a group of
rows from the table, and the data inside it is organized in columnar
fashion. A very large table can have hundreds of millions of
micro-partitions, each holding between 50 MB and 500 MB of uncompressed
data (at most about 16 MB once compressed).
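
One practical benefit of micro-partitions is that a query can skip the
ones that cannot contain matching rows. The Python sketch below is a toy
model of that idea, assuming each partition keeps the minimum and maximum
value of one column as metadata. The MicroPartition class, the
build_partitions and scan functions, and the sample data are hypothetical,
and real micro-partitions are far larger than two rows.

```python
from dataclasses import dataclass


@dataclass
class MicroPartition:
    """Toy micro-partition: a few rows plus min/max metadata for one column."""
    rows: list      # list of (order_date, amount) tuples
    min_date: str
    max_date: str


def build_partitions(rows, rows_per_partition):
    partitions = []
    for start in range(0, len(rows), rows_per_partition):
        chunk = rows[start:start + rows_per_partition]
        dates = [order_date for order_date, _ in chunk]
        partitions.append(MicroPartition(chunk, min(dates), max(dates)))
    return partitions


def scan(partitions, date_from, date_to):
    """Return matching rows and how many partitions were actually read."""
    matches, partitions_read = [], 0
    for part in partitions:
        # Prune: skip partitions whose date range cannot overlap the filter.
        if part.max_date < date_from or part.min_date > date_to:
            continue
        partitions_read += 1
        matches.extend(r for r in part.rows if date_from <= r[0] <= date_to)
    return matches, partitions_read


rows = [
    ("2023-01-05", 10), ("2023-01-20", 30),
    ("2023-02-03", 15), ("2023-02-18", 40),
    ("2023-03-09", 25), ("2023-03-27", 5),
]
partitions = build_partitions(rows, rows_per_partition=2)
matches, partitions_read = scan(partitions, "2023-02-01", "2023-02-28")
```

Here the February filter reads one of the three partitions. The dates are
ISO-formatted strings, so plain string comparison orders them correctly.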

Advanced Analytics Support


Snowflake supports running analytics on large volumes of data, including
complex ML algorithms that help developers identify patterns and insights
in the data.

Data sharing feature to use cleansed data from data warehouse


Data Analysts and Data Scientists can easily use cleansed and transformed
data for their ML work by leveraging advanced and secure data sharing. They
can perform Exploratory Data Analysis (EDA) and develop new models at no
additional storage cost.

Snowsight for EDA


Snowsight is Snowflake’s web UI for querying, scripting, and data
visualization. With it you can identify outliers, point out quality
issues, and use the data exploration and model distribution capabilities
with ease. Data preparation and data visualization are backed by a
scalable computing backbone.

Quick and Easy TP Integration


Snowflake partners have developed many third-party (TP) tools that you can
easily integrate with your instance. With easy, secure, and quick TP
integration, you can leverage their advanced capabilities for statistical
and predictive modeling, and use third-party APIs for data modeling, model
production, and testing.

Reliable Cloud Infrastructure



Federated Authentication and SSO


Snowflake’s federated authentication service lets you either use
Snowflake’s own authentication mechanism or continue to use your existing
user credentials from an external identity provider. Security
administrators can create users with passwords maintained in Snowflake,
while federated users can authenticate in two ways: browser-based SSO and
programmatic SSO (as of now, only for Okta).

Infrastructure Management
The cloud services layer is an independent, higher-level, fault-tolerant
collection of services. Its infrastructure manager is responsible for
managing virtual warehouses and coordinating data access and updates
without affecting availability or overall performance.

The layer is also responsible for communicating with client applications
such as the Snowflake web UI and JDBC/ODBC clients for query processing,
and it secures your account, your users, and the data stored in Snowflake.

Metadata management
Metadata defines warehouse objects and acts like a directory or folder: it
keeps objects organized and makes them faster to locate. Snowflake metadata
management supports strong data governance through processes, policies,
workflows, and technology. It can identify, organize, and expose metadata
to data consumers by adding actionable context to the assets in the data
warehouse.

Snowflake metadata management offers the following benefits to users:

Faster access to insights

End-to-end visibility

Improved data quality and trust

Improved operational efficiency

Compliance and regulations

Improved ROI on data

Never Worry about Query Optimization


The Snowflake optimizer parses queries and optimizes them dynamically. This
automatic tuning helps users get results faster by speeding up processing
of structured and semi-structured data alike. As a data user, you rarely
have to tune query performance by hand: you model, load, and query the
data.

The cloud services layer handles all tasks related to query planning and
optimization based on metadata and data profiles. It determines the most
effective distribution of the data and queries across the available compute
nodes. Data users need not worry about indexes, partitions and partition
keys, pre-sharding data, or, most importantly, updating statistics.

Access control
Snowflake’s access control mechanism combines aspects of Discretionary
Access Control (DAC) and Role-based Access Control (RBAC). In DAC, access
to an object is granted by its owner, while in RBAC, access privileges are
assigned to roles, which are in turn assigned to users.

Access control in Snowflake revolves around securable objects, roles,
privileges, and users. This model provides both tight control over data
and flexible access.
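
The way DAC and RBAC fit together can be sketched in a few lines of
Python. This is a simplified toy model rather than Snowflake's actual
behavior: the AccessControl class, its method names, and the example
roles and objects are hypothetical, and the sketch leaves out features
such as role hierarchies.

```python
from collections import defaultdict


class AccessControl:
    """Toy model: privileges go to roles (RBAC); owners decide grants (DAC)."""

    def __init__(self):
        self.owner_role = {}                     # object -> owning role
        self.role_privileges = defaultdict(set)  # role -> {(privilege, object)}
        self.user_roles = defaultdict(set)       # user -> {role}

    def create_object(self, obj, owner_role):
        self.owner_role[obj] = owner_role

    def grant_privilege(self, granting_role, privilege, obj, to_role):
        # DAC: only the role that owns the object may grant access to it.
        if self.owner_role.get(obj) != granting_role:
            raise PermissionError(f"{granting_role} does not own {obj}")
        self.role_privileges[to_role].add((privilege, obj))

    def grant_role(self, role, user):
        # RBAC: users receive privileges only through the roles they hold.
        self.user_roles[user].add(role)

    def can(self, user, privilege, obj):
        # Allowed if any of the user's roles owns the object or was
        # granted this specific privilege on it.
        return any(
            role == self.owner_role.get(obj)
            or (privilege, obj) in self.role_privileges[role]
            for role in self.user_roles[user]
        )


acl = AccessControl()
acl.create_object("sales.orders", owner_role="data_engineer")
acl.grant_privilege("data_engineer", "SELECT", "sales.orders", to_role="analyst")
acl.grant_role("analyst", "alice")

alice_can_read = acl.can("alice", "SELECT", "sales.orders")
alice_can_write = acl.can("alice", "INSERT", "sales.orders")
```

In this sketch alice can read the table because her role was granted
SELECT by the owning role, but she cannot insert into it, and a role that
does not own the table cannot hand out privileges on it.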

Why Snowflake?

Snowflake is a complete and truly cloud-based SaaS offering:

You need not select, configure, and monitor any hardware.

Or worry about installing, configuring, and managing any software.

Maintenance, data backup & recovery, system management & upgrades,
and performance tuning are all handled automatically.

Every component and tool from Snowflake, except the optional CLI
clients, drivers, and connectors, runs on public cloud infrastructure.

Snowflake uses virtual compute instances for compute and a storage
service for persistent storage.

Get highly performant query processing and speed up your analytics with
multiple virtual warehouses, automatic query optimisation, cluster
tuning, and micro-partitions.

The On-demand pricing model allows even the smallest of teams to
leverage Snowflake’s capabilities to scale up economically.

Administration and operational overhead is close to zero, because
auto-scaling, auto-suspend, and zero-copy data sharing let you focus
on your work and not on admin tasks.

The UI/UX of Snowflake and its tools is intuitive and user-friendly with
ANSI SQL being the language of choice.

Snowflake is compatible with multiple BI tools, giving you the power to
query large datasets from them, and it supports multiple programming
languages.

Scalability is built into Snowflake’s architecture, and because it is a
cloud-agnostic solution, your team worries less about system failures or
delays.

Seamless data sharing allows users to collaborate with anyone, even
when they’re not Snowflake users.

The data is secured with encryption at rest and in transit at no
additional cost.

There is no need to learn a new language for data blending, analysis,
transformation, and other data tasks. You only need SQL.

Support for a variety of file formats for both structured and semi-
structured data.

Snowflake Learning Paths


At the core of Snowflake learning is an understanding of Snowflake Cloud
Data Platform concepts and their application in implementing and
migrating to Snowflake. At this stage, the necessary knowledge is applied
to design, develop, and manage secure, scalable Snowflake solutions that
drive business objectives.

Advanced learners and practitioners can study and master concepts like:

Designing an end-to-end data flow and near real-time data streams

Designing and deploying a data architecture to meet business requirements

Selecting appropriate tools for optimization

Sourcing data from Data Lakes, APIs, and on-prem data warehouses

Transforming, replicating, and sharing data across cloud platforms

Designing scalable compute solutions for data engineering (DE) workloads


Implementing Snowflake data science (DS) best practices

Implementing data and feature engineering in Snowflake

Training and using ML models

Using data visualization and implementing model lifecycle management

Managing and administering Snowflake resources like accounts, data
security and governance, database objects, and virtual warehouses

Performing database monitoring and tuning

Enabling data sharing and using Snowflake Data Exchange and Data
Marketplace

Administering disaster recovery, backup, and data replication policies

The Snowflake learning journeys are summed up in the following image
from Snowflake itself.

(Image Source: Snowflake)

Prepare for SnowPro Core Certification



You can find learning material and book your certification exams through
the official SnowPro Certification resources at Snowflake Certifications |
Stand Out in the Data Community.

Learning Tracks from Snowflake


An essential, hands-on, curated learning journey can be found here.

Check out courses and community discussions at Snowflake University.

Learn more about learning tracks and certifications here.

Prepare for Advanced SnowPro Certifications for:

- Architect
- Administrator
- Data Engineer
- Data Scientist

Watch videos on DataCouch’s YouTube Channel


Getting Started with Snowflake


Click any of the following YouTube links to go straight to the videos
that interest you the most. :)

Evolution of Data Warehousing Technologies

Challenges with Traditional DWs and Big Data DWs | Data Warehouse | Big Data

Data Warehouses vs Data Lakes

Why Snowflake?

Getting started with Snowflake

Snowflake Architecture

Snowflake Demo


Data Sharing in Snowflake

Limitations of Snowflake

RedShift vs Snowflake

Understanding Snowflake vs BigQuery

About DataCouch
DataCouch is a leading Enterprise IT Training and Consultancy company
based in India, Canada, and Malaysia. Our team of experts is a trusted
Global Training Partner and has delivered technology training for many
Fortune 500 companies, including Adobe, Apple, BCG, Google, Microsoft,
Neo4j, PayPal, Salesforce, Starbucks, Walmart, and many more.

DataCouch is a preferred training delivery partner of Neo4j, Confluent,
PayPal, Snowflake, and many others. We continuously improve our Global
Training Delivery Model and offer the services of world-class instructors,
course content tailored to your specific needs, and a proprietary virtual
Browser-Based Lab Solution. Our premium Enterprise Training services
have benefitted the engineering teams of companies with unique needs in
niche markets and emerging domains that serve demanding customers.

The specialized Enterprise Training at DataCouch is delivered by experts
with more than 100 years of collective experience. The team is at the
forefront of bleeding-edge technologies in the AI, ML, Data Engineering,
Cloud, DevOps, IoT, Blockchain, and RPA domains.


Our consultants are transforming teams and organizations with pragmatic
solutions and attainable objectives. We have successfully delivered
consultancy services in Data Engineering, Digital Transformation, and
DevOps, helping clients implement solutions for businesses of all sizes.

For official virtual instructor-led Snowflake classes, please reach out to
us at [email protected] or visit our website, DataCouch, for courses
related to Agile, AI, Cloud, DevOps, Data Engineering, ML, RPA, and many
more.
