
Snowflake Access Control Overview

Snowflake access control is about who can perform what operations on which objects.

To start, Snowflake uses two access control frameworks.

1. Role-based access control, or RBAC for short. The first, you'll probably be familiar with: this is the common practice of attaching access privileges (who can select, modify, remove, etcetera) to roles, which are then assigned to users.
2. Discretionary access control (DAC). Unlike a user-based access control model, where privileges are granted directly to a user, Snowflake combines RBAC with discretionary access control. This describes an access control framework in which each object has an owner, typically the role that created the object, who can in turn grant access to that object.
For example, a user assuming role A might execute a
create table statement.
Role A would then be the owner of that table. This means
they have full privileges on that table, including the ability
to grant access or pass ownership.
So in this example, role A could grant role B the privilege to
select from that table object it just created.
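As a rough sketch of that flow (the role, database, schema, and table names here are illustrative, not from the course):

USE ROLE ROLE_A;
-- ROLE_A becomes the owner of the new table
CREATE TABLE MY_DB.MY_SCHEMA.MY_TABLE (ID INT);
-- As the owner, ROLE_A can grant access to another role
GRANT SELECT ON TABLE MY_DB.MY_SCHEMA.MY_TABLE TO ROLE ROLE_B;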
There are four key areas of theory to review with respect to
access control,
 securable objects and the object hierarchy,
 roles and role hierarchy,
 privileges and privilege inheritance,
 users.
Okay, so let's take a look at our object hierarchy again,
the vast majority of objects in Snowflake are securable.
This means you can apply very granular privileges on each
object, determining what a given user can do with that
object.
For example, you can grant a role the ability to use a
warehouse, but not necessarily have the ability to modify
its configuration.
These would be two different privileges it's possible to
apply to a specific object.
The concept of the object hierarchy or object containers, so
accounts containing databases, databases containing
schemas, and so on, it's important to understand in the
context of access control.
This is because you often need access to a parent object.
For example, when you create a table, you need access to
the database and schema you'd like it to reside in.
Every securable object is owned by a single role, which is
typically the role used to create the object, a feature of the
discretionary access control framework.
The owner of an object can be seen when executing the
SHOW <object> command. There's a column in the result
set called owner.
Being an owner of an object gives the role all the privileges
on the object by default, including the ability to grant or
revoke privileges on the object to other roles.
The owner can transfer ownership from one role to another,
and access to a securable object can be shared between
users if a role is shared.
However, it's not just the owner that can grant access to its objects.
Any role with the MANAGE GRANTS global privilege can grant or revoke privileges on any object as if it were the object owner. So through either the owner role or a role which has the MANAGE GRANTS global privilege, access to an object can be granted to a non-owner role.
For example, if you were to create a read-only role for a BI tool, a security admin with the MANAGE GRANTS global privilege could control on a very granular level what it would be able to do, like reading from only one or two views.
More broadly, this could include the ability to create a
warehouse, list tables contained in a schema, or add data
to a table.
And there's one last point to mention, which might seem
obvious, but unless allowed by a grant, access to a
securable object will be denied.

The owning role:

• Has all privileges on the object by default.
• Can grant or revoke privileges on the object to other roles.
• Can transfer ownership to another role.
• Can share control of an object if the owning role is shared.
Access to objects is also defined by privileges granted to
roles:
• Ability to create a Warehouse.
• Ability to list tables contained in a schema
• Ability to add data to a table
Unless allowed by a grant, access to a securable object will
be denied.

Roles
A role is an entity to which privileges on securable objects can be granted or revoked.
Here are some example commands granting privileges to
roles.
In the second command, we're granting SELECT on a table
object to the role, TEST_ROLE.
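Commands along these lines fit the description (the warehouse, database, and table names are illustrative):

GRANT USAGE ON WAREHOUSE ANALYSIS_WH TO ROLE TEST_ROLE;
GRANT SELECT ON TABLE MY_DB.MY_SCHEMA.MY_TABLE TO ROLE TEST_ROLE;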
Roles are then assigned to users, giving them authorization
to perform certain actions.
A user can have multiple roles assigned to them, and
switch between them within a Snowflake session.
A role itself is a securable object. This means roles can be granted to other roles, creating a role hierarchy.

In the scenario we see here, Role 3 is granted to Role 2.
Role 3 becomes Role 2's child role.
Privileges of child roles are inherited by parent roles.
Role 2 can do everything Role 3 can do, and Role 1 can do
everything both Role 2 and 3 can do.
This is what we call a role hierarchy.
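As a minimal sketch of building that hierarchy with GRANT ROLE (the role names are illustrative):

GRANT ROLE ROLE_3 TO ROLE ROLE_2;   -- Role 3 becomes Role 2's child role
GRANT ROLE ROLE_2 TO ROLE ROLE_1;   -- Role 1 inherits everything below it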

 A role is an entity to which privileges on securable


objects can be granted or revoked.
 Roles are assigned to users to give them the
authorization to perform actions.
 A user can have multiple roles and switch between
them within a Snowflake session.
 Roles can be granted to other roles creating a role
hierarchy.
 Privileges of child roles are inherited by parent roles.
As we briefly mentioned, Snowflake accounts ship with six
system-defined roles:
 ORGADMIN,
 ACCOUNTADMIN,
 SECURITYADMIN,
 USERADMIN,
 SYSADMIN,
 PUBLIC,
which themselves form a role hierarchy.

Although these roles cannot be dropped, and their


privileges cannot be revoked, you can add privileges to
these roles.
However, Snowflake generally discourages this, and instead supports creating custom roles. As you go down the hierarchy tree, each role has fewer privileges.
For example, ACCOUNTADMIN can perform all the functions of both SECURITYADMIN and SYSADMIN,
but USERADMIN cannot perform the functions of ORGADMIN.
Let's take a closer look at each role.
The ORGADMIN role, AKA the organization administrator, manages operations at the organization level.
Not only can this role create accounts in the organization, it
can also view all accounts, as well as see which regions are
enabled for the organization, and lastly, the ORGADMIN
role can view usage information across the organization.
ACCOUNTADMIN role, AKA the account administrator.
This is the top level role for an account and has a very
broad set of permissions, so should be granted to only a
limited or controlled number of users. This role
encapsulates the SYSADMIN and SECURITYADMIN system-
defined roles, meaning by role inheritance, all the
privileges granted to SYSADMIN, SECURITYADMIN, and
USERADMIN are also available to the ACCOUNTADMIN role.
This role alone is responsible for configuring parameters at
the account level.
Users with the ACCOUNTADMIN role can view and operate
on all objects in the account, can view and manage
Snowflake billing and credit data, and can stop any running
SQL statements.
SYSADMIN role, AKA the system administrator.
Its main job is to manage objects in an account. It has the
privileges to create warehouses, databases, and many
other objects.
The SECURITYADMIN role, AKA the security administrator
can manage any object grant globally, as well as create,
monitor, and manage users and roles. More specifically,
this role is granted the MANAGE GRANT security privilege,
enabling it to modify any grant, including revoking it.
SECURITYADMIN inherits the privileges of the USERADMIN
role via the system role hierarchy.
The USERADMIN role is used for user and role
management, and is granted the CREATE USER and
CREATE ROLE security privileges, enabling a user with this
role assumed to create users and roles in the account.
PUBLIC role. This is a pseudo role that is automatically
granted to every user and every role in your account.
Because of this, an object which is owned by the PUBLIC role is accessible to everyone in your account.
Custom roles.
Custom roles allow you to create your own role with custom
and fine-grained security privileges. They can be aligned to
groups or teams within an organization, a support team, for
example, a group of analysts, or by environments such as
dev and prod.
Custom roles allow administrators working with the system-defined roles to exercise the security principle of least privilege, only giving users the privileges they need to perform their duties and no more, reducing the risk of human error or malicious activity.
For example, you might not want everyone on your account
to have SYSADMIN privileges, as this would entitle them to
create warehouses, which has the potential to incur
significant cost if not managed properly.
A custom role can be created by any role which has the
CREATE ROLE privilege. This includes the USERADMIN,
SECURITYADMIN, and ACCOUNTADMIN.
It's recommended to create a hierarchy of custom roles
with the top-most custom role assigned to the SYSADMIN
role.
If custom roles are not assigned to the SYSADMIN role,
system administrators will not be able to manage the
objects owned by custom roles.
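A minimal sketch of that recommendation (the custom role name is illustrative):

USE ROLE USERADMIN;
CREATE ROLE ANALYST_ROLE COMMENT = 'Read-only role for the analyst team';
-- Assign the top-most custom role to SYSADMIN so system administrators
-- can manage the objects it will own
GRANT ROLE ANALYST_ROLE TO ROLE SYSADMIN;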

ORGADMIN
• Manages operations at organization level.
• Can create accounts in an organization.
• Can view all accounts in an organization.
• Can view usage information across an organization.
ACCOUNTADMIN
• Top-level and most powerful role for an account.
• Encapsulates SYSADMIN & SECURITYADMIN.
• Responsible for configuring account-level
parameters.
• View and operate on all objects in an account.
• View and manage Snowflake billing and credit data.
• Stop any running SQL statements.
SYSADMIN
• Can create warehouses, databases, schemas and
other objects in an account.
SECURITYADMIN
• Manage grants globally via the MANAGE GRANTS
privilege.
• Create, monitor and manage users and roles
USERADMIN
• User and Role management via CREATE USER and
CREATE ROLE security privileges.
• Can create users and roles in an account.
PUBLIC
• Automatically granted to every user and every role
in an account.
• Can own securable objects; however, objects owned by the PUBLIC role are available to every other user and role in an account.

Custom Roles

 Custom roles allow you to create a role with custom and fine-grained security privileges defined.
 Custom roles allow administrators working with the system-defined roles to exercise the security principle of least privilege.
 If custom roles are not assigned to the SYSADMIN role,
system admins will not be able to manage the objects
owned by the custom role.
 Custom roles can be created by the SECURITYADMIN &
USERADMIN roles as well as by any role to which the
CREATE ROLE privilege has been granted.
 It is recommended to create a hierarchy of custom
roles with the top-most custom role assigned to the
SYSADMIN role.
Privileges

A security privilege defines a level of access to an object.
For each object there is a set of security privileges that can be granted on it.
There are 4 categories of security privileges:
 Global Privileges
 Privileges for account objects
 Privileges for schemas
 Privileges for schema objects
Privileges are managed using the GRANT and REVOKE
commands.

Future grants allow privileges to be defined for objects not yet created.
A security privilege defines what level of access a role has on a securable object.
For example, you can have the MODIFY privilege on a database, which would allow a role to change the settings of a specific database.
And multiple privileges can be used together to control the
granularity of access granted to an object.
For each securable object, we have a set of privileges that
can be granted on it.
Some of the verbs, such as modify, are common across
different objects.
You can modify both a database and a resource monitor, for
example, and some are specific to an object.
There are four categories of security privileges we should
be aware of.
The first are global privileges.
These enable you to perform actions such as create
databases and monitor account level usage.
The MANAGE GRANTS privilege we've discussed is an example of a global privilege.
The next category of privileges is for account objects.
These privileges provide different access levels to account
level objects like databases.
Like in the example above, the ability to modify a database
is a privilege on an account object, so it fits into this
category.
Next along, we have privileges for schemas.
These define what you can do within a schema. The usage
privilege on a schema fits into this category.
And the last category is privileges for schema objects.
These privileges provide different access levels to schema
level objects like tables. Select or truncate are examples
that fit into this category.
Privileges are managed using the GRANT and REVOKE
commands.
Use of these commands is restricted to the role that owns the object, or roles that have the MANAGE GRANTS global privilege.

Here are two example commands: one granting usage on a database to a role, and the other doing the opposite, revoking that privilege from the role.
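Something along these lines (the database and role names are illustrative):

GRANT USAGE ON DATABASE MY_DB TO ROLE TEST_ROLE;
REVOKE USAGE ON DATABASE MY_DB FROM ROLE TEST_ROLE;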
Snowflake also has the concept of future grants.
These allow you to define privileges on objects which don't yet exist.
The code example below shows how you can grant the SELECT privilege on all new tables created in a schema.
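A sketch of such a future grant (the schema and role names are illustrative):

GRANT SELECT ON FUTURE TABLES IN SCHEMA MY_DB.MY_SCHEMA TO ROLE TEST_ROLE;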
This makes managing grants a bit easier because you don't
have to add grants to each individual object as they're
created.
Future grants are not supported with data sharing, data replication, masking policies, or row access policies.

User Authentication
So far, we've discussed authorization, setting up a system
of privileges, roles, and users, which determine levels of
access to Snowflake objects.
Authentication.
User authentication is the process of authenticating with
Snowflake via user provided username and password
credentials.
User authentication is the default method of
authentication.
Users with the USERADMIN role can create additional
Snowflake users, which makes use of the CREATE USER
privilege.
 A password can be any case-sensitive string up to 256
characters.
 Must be at least 8 characters long.
 Must contain at least 1 digit.
 Must contain at least 1 uppercase letter and 1
lowercase letter.

This is about determining if the person or program accessing Snowflake is really who they say they are.
To start with, let's take a look at the humble username and password method of authentication.
It's the process of authenticating with Snowflake via user-provided username and password credentials, either through the UI or a Snowflake client such as SnowSQL or the Python connector.
More than likely, this will be the method you're most
familiar with, as it is the default authentication method for
Snowflake.
As we've covered, users with the USERADMIN system-defined role can create additional Snowflake users, which makes use of the CREATE USER privilege.
An example of the CREATE USER command is shown below. And it's worth bearing in mind, user passwords have some requirements: a password can be any case-sensitive string up to 256 characters, including blank spaces and special characters such as exclamation points and percentage signs.
It must be at least eight characters long, must contain at least one digit, and must contain at least one uppercase letter and one lowercase letter.
In the example below, you can see a password that meets these requirements, and you might have noticed that in a CREATE statement, it's possible to set a weak password for a user that doesn't meet the minimum password requirements.
This feature allows admins the option to use generic
passwords for the user during the creation process.
If this option is chosen, Snowflake strongly recommends
setting the MUST CHANGE PASSWORD property to true, to
require users to change their password on their next login.
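A sketch of a CREATE USER statement reflecting these points (the user name and password are illustrative):

CREATE USER USER1
PASSWORD = 'Str0ngPassw0rd!'      -- meets the minimum password requirements
MUST_CHANGE_PASSWORD = TRUE;      -- force a new password on the next login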
Once created, and the user attempts to log in via the UI,
the user will be prompted to create a new password, which
must conform to the password requirements.
If a default role is not set, when a user starts a session,
they will be automatically assigned the public system-
defined role.
So that's it for user authentication, nice and easy.
Multi-factor Authentication (MFA)
MFA is an additional layer of security, requiring the user to
prove their identity not only with a password but with an
additional piece of information (or factor).
MFA in Snowflake is powered by a service called Duo
Security.
MFA is enabled on a per-user basis & only via the UI.
Snowflake recommends that all users with the ACCOUNTADMIN role be required to use MFA.

Now let's take a look at how we can make our simple username and password authentication a bit more secure.
MFA is a method to apply an additional layer of security to an authentication process.
Requiring you to prove your identity, not only with a
password, but with an additional bit of information.
Generally, a push notification or a code supplied through
email or text.
MFA in Snowflake is powered by a service called Duo
Security.
However, it's completely managed by Snowflake and
doesn't require signup anywhere else.
The user will have to download the DUO Mobile application
on a mobile device to authenticate requests.
You access it via a QR code given during the setup process
on the Snowflake UI.
MFA is enabled on a per-user basis and only via the UI.
A user isn't automatically enrolled in MFA; they must go into the preferences of the UI, where there's a section under general called multi-factor authentication.
At a minimum, Snowflake strongly recommends that all users with the ACCOUNTADMIN role be required to use MFA.
MFA is automatically available at the account level and doesn't require additional management.

Let's quickly run through the login flow for a user who's
enrolled in MFA.
The top box contains actions performed in the Snowflake
UI, and the bottom box contains actions performed on the
Duo Security application.
So as usual, you would enter your Snowflake credentials.
However, once you're enrolled in multi-factor authentication and have the Duo Security app running, there are three ways you can prove your second factor.
The quickest is to approve a DUO push notification that
pops up on your phone.
The second is to click call me from the app. You then follow
the instructions from a phone call, which enables you to
successfully log in.
And lastly, you can click enter a passcode from the DUO
app. You'll enter a passcode generated into Snowflake,
allowing you to log in.
Below are some configurable properties in relation to MFA that a user with the ALTER USER privilege can set.

MINS_TO_BYPASS_MFA
ALTER USER USER1 SET
MINS_TO_BYPASS_MFA=10;

Specifies the number of minutes to temporarily disable MFA for the user so that they can log in.
MINS_TO_BYPASS_MFA allows a user admin to temporarily disable MFA for a user for a configurable number of minutes.
After the duration elapses, MFA is enforced again.
This might be appropriate if a trusted user is unable to
access their device with Duo Security installed.

DISABLE_MFA
ALTER USER USER1 SET
DISABLE_MFA=TRUE;

Disables MFA for the user, effectively cancelling their enrolment. To use MFA again, the user must re-enrol.
Disable MFA effectively cancels a user's enrollment.
To use MFA again, the user would have to re-enroll.

ALLOW_CLIENT_MFA_CACHING
ALTER ACCOUNT SET
ALLOW_CLIENT_MFA_CACHING = TRUE;

MFA token caching reduces the number of prompts that must be acknowledged while connecting and authenticating to Snowflake.
This parameter allows a user to store an MFA token in the client-side cache.
The upshot of this is that a user won't have to respond to
as many prompts when connecting and authenticating with
Snowflake using MFA.
This is particularly useful if you're logging in multiple times
within a short duration.

Federated Authentication (SSO)


 Federated authentication enables users to connect to
Snowflake using secure SSO (single sign-on).
 Snowflake can delegate authentication responsibility to a SAML 2.0-compliant external identity provider (IdP), with native support for Okta and ADFS IdPs.
 An IdP is an independent service responsible for
creating and maintaining user credentials as well as
authenticating users for SSO access to Snowflake.
 In a federated environment Snowflake is referred to as
a Service Provider (SP).

Federated authentication enables users to connect to Snowflake using secure single sign-on.
Single sign-on, or SSO, describes the process of logging in to several independent software systems with a single ID and password.
With federated authentication, Snowflake delegates authentication responsibility to a SAML 2.0-compliant external identity provider, referred to as an IdP.
Snowflake has native support for Okta and ADFS IdPs. To use an IdP other than Okta or ADFS, you must define a custom application for Snowflake in your chosen IdP.
An IdP is an independent service responsible for creating and maintaining user credentials, as well as authenticating users for SSO access to Snowflake.
Essentially, Snowflake will go and check with the third party
to confirm you are who you say you are.
In a federated environment, Snowflake is referred to as a
service provider, so you have your IdP like Okta and your
SP which is Snowflake.
The workflow or the steps involved in logging in, logging
out and having your session timeout are slightly different
with federated authentication enabled in Snowflake.
The behaviour for each workflow is determined by whether
the action is initiated within Snowflake or your IdP.
Let's look at a concrete example of this by running through an SSO login workflow. We could initiate a Snowflake session via the Snowflake UI, where an additional button can be enabled that links you to the IdP login page.
From there, you enter your IdP login credentials, and once successfully logged in, it will load Snowflake with a new session.
Alternatively, you could go straight to the IdP login page, log in there, and use the IdP facilities to start a Snowflake session.
The first would be a Snowflake-initiated login workflow and the second, an IdP-initiated login workflow.
Actually setting up federated authentication is outside the scope of the exam.
It involves managing an external service and many manual steps we don't need to cover in detail; however, there are a couple of account properties on the Snowflake side used during the set-up process we should highlight.

SAML_IDENTITY_PROVIDER
The SAML_IDENTITY_PROVIDER is the account level
property that enables federated authentication.
The property accepts a JSON object with the following fields:
a certificate field that specifies the IdP certificate that verifies communication between the IdP and Snowflake, and an ssoUrl field which stores the IdP endpoint where Snowflake sends SAML requests.
The type option specifies the type of IdP used for federated authentication, so Okta, ADFS, or a custom IdP.

The label option specifies the text on the button which appears on the Snowflake login page linking you to the IdP.
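A sketch of setting this property (the certificate and URL values are placeholders):

ALTER ACCOUNT SET SAML_IDENTITY_PROVIDER = '{
"certificate": "MIICnzCCAYg...",
"ssoUrl": "https://myidp.example.com/app/snowflake/sso/saml",
"type": "OKTA",
"label": "OktaSSO"
}';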
SSO_LOGIN_PAGE
This controls whether the SSO button, which you set the text on, is shown on the main login page.
ALTER ACCOUNT SET SSO_LOGIN_PAGE = TRUE;

This enables the button for Snowflake-initiated SSO for your identity provider (as specified in SAML_IDENTITY_PROVIDER) on the Snowflake main login page.

Key Pair Authentication, OAuth & SCIM


Key pair authentication is a method of authentication which uses a pair of cryptographic keys, one public and one private.
In Snowflake, it's used as an alternative to username and
password authentication, and for connecting via Snowflake
clients, not the UI.
Let's run through the process of setting up key pair
authentication in Snowflake.
Firstly, users will generate a private/public key pair using OpenSSL, with a minimum key length of 2,048 bits.
The public key of the generated pair is assigned to a user with the ALTER USER command shown below; the private key is kept safe locally by the user.
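For example (the user name and the truncated key value are placeholders):

ALTER USER USER1 SET RSA_PUBLIC_KEY = 'MIIBIjANBgkqh...';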
A user would then configure one of the Snowflake connectors to authenticate using the key pair.
For example, with the Python connector, you create a private key object and pass it to the connect function as a parameter; each connector has its own method of configuring key pair authentication.
Snowflake also has key pair rotation for a user.
This allows you to update the active public key.
It's enabled by allowing two active keys, so you can swap
them with no downtime.
The commands below show setting a second key and then unsetting the now out-of-date first key.
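A sketch of that rotation (the user name and key value are placeholders):

ALTER USER USER1 SET RSA_PUBLIC_KEY_2 = 'MIIBIjANBgkqh...';  -- add the new key
ALTER USER USER1 UNSET RSA_PUBLIC_KEY;                       -- retire the old key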
There are two additional features around authentication
and authorization you should be aware of.
However, for the exam, these are not required to be
understood in a great level of detail.
So let's just run through some of the most essential points
around OAuth and SCIM.
Snowflake supports the OAuth 2.0 protocol.
OAuth is an open standard protocol that allows supported
clients authorized access to Snowflake without sharing or
storing user log-in credentials.
This allows something like a Tableau client to connect to
Snowflake without having to maintain usernames and
passwords.
This is known as delegated authorization, because a user
delegates the ability to read data to a client.
Snowflake offers two OAuth pathways, Snowflake OAuth,
and External OAuth.
Next up, we have SCIM.
SCIM stands for System for Cross-domain Identity
Management. So a bit of a mouthful. It's used to help
facilitate management of user identities and groups.
In Snowflake's case, this would be roles.
An IdP like ADFS uses a SCIM client on their end to make a RESTful API request to the Snowflake SCIM server.
Snowflake will check this API request, and if it looks good, it
will perform actions on roles and users.
Specifically, you can use the Snowflake SCIM API to
manage the user lifecycle.
So creating users, deleting users, and updating user
settings.
You can also do things like map active directory groups to
Snowflake roles.

Network Policies
 Network Policies provide the user with the ability to
allow or deny access to their Snowflake account based
on a single IP address or list of addresses.
 Network Policies currently support only IPv4
addresses.
 Network policies use CIDR notation to express an IP
subnet range.
 Network Policies can be applied on the account level
or to individual users.
 If a user is associated to both an account-level and
user-level network policy, the user-level policy takes
precedence.
 Network Policies are composed of an allowed IP range
and optionally a blocked IP range. Blocked IP ranges
are applied first.

SHOW NETWORK POLICIES;

Network policies provide the ability to allow or deny access to Snowflake based on a single IP address or range of IP addresses.
They act to filter incoming traffic to a Snowflake account, preventing people from accessing the Snowflake account URL.
This provides an additional layer of security on top of
authentication.
Network policy objects are composed of an allowed IP list and a blocked IP list; an example CREATE statement for a network policy is shown further below.
If you specify an allow list, all IP addresses outside of that range will be denied access.
Block lists are generally for blocking IP addresses within the range of the allowed IP list.
And network policies only currently support IPv4 addresses
as opposed to the longer IPv6 addresses.
And when filling out the allowed and blocked IP properties,
we use the standard CIDR notation to express an IP subnet
range.
Network policies are first created and then applied to
become active.
They can be applied on the account level or to individual
users.
Applying a network policy to a specific user would restrict
which IP address a user could connect to a Snowflake
account from.
Applying it at the account level would restrict all users
connecting to that account.
And if a user is associated to both an account level and
user-level network policy, the user-level policy takes
precedence.
It's possible to temporarily bypass a network policy for a set number of minutes by configuring the user object property MINS_TO_BYPASS_NETWORK_POLICY, which can be viewed by executing a DESCRIBE USER command.
The one caveat is that you have to contact Snowflake
support to set the value of this property.
The DDL for network policies is syntactically identical for both the user and the account.
You have three parameters: ALLOWED_IP_LIST, BLOCKED_IP_LIST, and COMMENT.
In the example below, we're allowing a range of IP addresses and blocking one from that group.
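A sketch of such a CREATE statement (the policy name and IP ranges are illustrative):

CREATE NETWORK POLICY MYPOLICY
ALLOWED_IP_LIST = ('192.168.1.0/24')
BLOCKED_IP_LIST = ('192.168.1.99')
COMMENT = 'Allow the office subnet, block one address within it';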
A SECURITYADMIN or ACCOUNTADMIN role can create policies, or a custom role with the CREATE NETWORK POLICY privilege.
To list the network policies created regardless of if they're
active or not, use the show command.
First, let's take a look at how you'd apply a network policy
to an account.
ACCOUNT
Only one Network Policy can be associated with an account
at any one time.
ALTER ACCOUNT SET NETWORK_POLICY = MYPOLICY;

SECURITYADMIN or ACCOUNTADMIN system roles can apply policies, or a custom role with the ATTACH POLICY global privilege.

SHOW PARAMETERS LIKE 'NETWORK_POLICY' IN ACCOUNT;

USER
Only one Network Policy can be associated with a user at any one time.
ALTER USER USER1 SET NETWORK_POLICY = MYPOLICY;

SECURITYADMIN or ACCOUNTADMIN system roles can apply policies, or a custom role with the ATTACH POLICY global privilege.
SHOW PARAMETERS LIKE 'NETWORK_POLICY' IN USER USER1;

Although you can define as many network policies as you like, there can only be one network policy associated with an account at any one time.
If another is set, it will replace the old one.
This is the syntax for associating a network policy with an
account.
Any users already logged into Snowflake from one of the
restricted IP addresses will be prevented from executing
queries.
Your current IP address must be included in the allowed IP range and not included in the blocked range; otherwise Snowflake will return an error.
A SECURITYADMIN or ACCOUNTADMIN role can apply policies, or a custom role with the ATTACH POLICY global privilege.
To get the active network policy for an account, run the
show parameters command.
And it's a very similar story for when you want to apply a
network policy to a user.
Only one network policy can be associated with a user at
any given time. And to get the active network policy for a
user, you again run the show parameters command.

Data Encryption
All data in the storage layer, meaning data loaded into Snowflake tables, is encrypted automatically during the loading process using AES-256 strong encryption.
All files stored in internal stages for data loading and unloading are also automatically encrypted using AES-256 strong encryption, and the virtual warehouse and query result caches are encrypted as well.
The main takeaway here is that all data that Snowflake has
control over is automatically encrypted at rest.
Data encryption in Snowflake is entirely transparent and
requires no configuration or management by the user.
And because Snowflake is a remote service generally
connected to over the internet, it's also important to
discuss the data that is traversing the network.
Things like the text of queries when issuing commands on
the UI or the data of input files during data loading.
These also need to be encrypted.
Secure HTTPS is always used when connecting to a Snowflake account URL, whether through a browser on the UI or using a JDBC driver, for example.
Snowflake makes use of the TLS 1.2 protocol to encrypt all
network communications from your client machine to the
Snowflake endpoints.
Using both encryption at rest and encryption in transit at
every stage during the data loading and unloading process
gives us end-to-end encryption.
This reduces the attack surface of Snowflake and allows us to only expose data to authorized users.
Our aim is to get a raw file you have on a client machine into a Snowflake table in a secure way.
There are two main flows when thinking about end-to-end
encryption in Snowflake, and they're determined by which
type of stage you're copying data from.
So we've not yet reviewed stages, but for now, understand that they are areas to temporarily store raw files used in data loading and unloading.
There are two types, one internal
to Snowflake which they manage, and one you as a user
can set up called an external stage.
When uploading a file to an internal stage, for example with a PUT command (more on that in the data loading section), Snowflake transparently encrypts the loaded files.
The encryption computation itself is performed on the
client machine when uploading to an internal stage.
And then using a different key, the data is encrypted when
loaded into long-term table storage.
If you're loading from an external stage, say an S3 bucket,
which is not managed by Snowflake, you can choose
yourself to leave the contents of the external stage
unencrypted.
In which case, when data is loaded into Snowflake tables from the external stage, it'll be encrypted at that point.
However, if you choose to use client side encryption in the
external stage prior to data loading into a table, decryption
information will have to be provided so Snowflake can
decrypt and then re-encrypt when loading into long-term
table storage.
So how does Snowflake manage the keys used to encrypt
all this data?

They use something called a hierarchical key model, with each key higher up in the key hierarchy encrypting the key below it, and the final file key encrypting the user's data, a process called wrapping.
Each account master key corresponds to one customer account in Snowflake, and each table master key corresponds to one database table.
That means that every account and every table is
encrypted with a separate key.
Even each individual data file or micro partition is
encrypted with a separate key.
Key hierarchies are commonly used to segment data, limit
the risk of key exposure, and minimize key material that
needs to be stored in plain text.
The essential thing to understand about the key hierarchy
is that it provides the ability to limit the amount of data a
key protects.
It reduces the attack surface if that key is compromised.
For example, if a data file key were exposed, it would not
give the attacker the ability to decrypt data from other
tables or data from other accounts.
And you might wonder how the root key is created and
protected as this would have the power over the child keys
in the key hierarchy.
Snowflake stores the top-most encryption keys of the key hierarchy in an AWS service called AWS CloudHSM Classic and generates lower-level keys using CloudHSM's random number generation.
It's a hardware-based solution which integrates with Snowflake's security framework.
Another method of constraining the amount of data a key
protects is key rotation.
This is the practice of replacing existing account and table
encryption keys every 30 days with a new key.
The existing key is retired and only used from that point on
to decrypt data.
Here's an example of a table key being rotated every
month and encrypting new files with a new key each
month.
Key rotation is the practice of transparently replacing existing account and table encryption keys every 30 days with a new key.
Here we'd keep table key versions one and two to decrypt
data files that were encrypted earlier in January and
February.
We can also enable a feature called periodic re-keying.
Once a retired key exceeds one year, Snowflake would
automatically create a new encryption key and re-encrypt
all the data previously protected by the retired key using
the new key.
The new key is then used to decrypt the table data going
forward.
This is an Enterprise edition feature and has to be enabled with the ACCOUNTADMIN role by setting the PERIODIC_DATA_REKEYING parameter to true.

Re-keying is an opportunity to apply keys using the latest security standards.
Once a retired key exceeds 1 year, Snowflake automatically
creates a new encryption key and re-encrypts all data
previously protected by the retired key using the new key.
ALTER ACCOUNT SET PERIODIC_DATA_REKEYING = TRUE;

However, there is an additional cost associated with periodic re-keying, as Snowflake customers are charged for the extra Fail-safe storage of data files that were re-keyed.
Tri-Secret Secure.
This feature lets you encrypt data in Snowflake with a composite master key: one part a customer-managed key and one part an account master key managed by Snowflake.
The customer managed master key can be maintained in
the key management service for the cloud provider that
hosts your Snowflake account.
For AWS, you can use KMS or the key management service.
In the example shown here, we're using the AWS Key
Management Service to generate our customer managed
account master key.
For Google Cloud, you can use the Cloud Key Management
Service or Cloud KMS, and for Microsoft Azure, you can use
the Azure Key Vault.
Enabling tri-secret secure allows for a greater level of
control over your data.
It makes it impossible to decrypt any data in Snowflake
without the customer managed key.
And if you experience a data breach, you can withdraw
access to the customer managed key, effectively disabling
access to your data.
The downside of this is that it comes with more
management overhead to protect and ensure the key is
always available.
To enable tri-secret secure, you'd have to contact
Snowflake Support.

Column Level Security

Sensitive data in plain text is loaded into Snowflake, and it is dynamically masked at the time of query for unauthorized users.
Let's take a look at how we can implement column-level security with dynamic data masking.
Dynamic data masking is a security feature in which a
policy is applied to a column in a table or view to mask
data at query runtime.
And just to make things clear from the outset, data is not
stored in a masked format, which is called static masking.
It's applied dynamically at query runtime.

Before we break down a CREATE statement for a policy, let's take a look at how this works at a conceptual level.
A masking policy can be created and applied to a column of
a table or view.
It defines rules determining who has authorization to see
that column's data and in what way they want to mask it.
If an unauthorized user issues a SELECT statement with the
mask column included in the projection of the query, in the
query result, the user will see masked or partially masked
data.
ALTER TABLE IF EXISTS EMP_INFO MODIFY COLUMN USER_EMAIL
SET MASKING POLICY EMAIL_MASK;

In this example, a masking policy has been set on the email field.
This policy dictates that unauthorized users will only be
able to see the domain of the email, an example of partial
masking.
This all happens transparently to the user querying the
table.
They don't need to have knowledge of the masking policy.
So what happens when an authorized user issues the same
query?
The query result will not be masked and will contain plaintext values as expected.
Let's now take a look at an example of how to create and
apply a masking policy.
The aim of this policy is to mask emails from users that
don't have the support custom role currently active.
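A sketch of what that policy could look like (the masked string returned in the ELSE branch is illustrative):

CREATE MASKING POLICY EMAIL_MASK AS (VAL STRING) RETURNS STRING ->
CASE
WHEN CURRENT_ROLE() = 'SUPPORT' THEN VAL   -- authorized role sees plaintext
ELSE '*********'                           -- everyone else sees a masked value
END;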
If we break this query down, we can see we're creating a
masking policy with the identifier EMAIL_MASK.
We then have VAL STRING. VAL is how we reference the value of the column we attach the policy to within the policy body.
This is followed by the input data type.
This determines on which column the masking policy can
be set.
Because we're setting a string, we can only apply this
masking policy to a column of type string.
Next, we have the data type of the masked value to return
from the masking policy when queried.
A masking policy definition must have the same data type
for the input and the output.
The arrow separates the policy signature from the policy
body.
The body is made up of a case statement, which specifies
conditions determining if a masked value is returned and
what masking to apply.
Here we're saying when the currently active role for a user
is support, then return the plaintext value, else return a
completely masked value.
We could also partially mask a value, so this would show the domain of an email, not the identifier before the @ symbol.
We can also use functions to hide the value of a column, such as SHA-2, which is a hashing algorithm. User-defined functions can also be used, and a policy can even return a semi-structured value; a sketch of some of these alternatives follows below.
And lastly, we have the else statement.
So if none of these conditions are true, return a fully
masked value for that column.
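A sketch of a variant policy showing a couple of those alternatives (the policy name, role names, and expressions are illustrative):

CREATE MASKING POLICY EMAIL_MASK_PARTIAL AS (VAL STRING) RETURNS STRING ->
CASE
WHEN CURRENT_ROLE() = 'SUPPORT' THEN VAL
WHEN CURRENT_ROLE() = 'ANALYST' THEN REGEXP_REPLACE(VAL, '.+@', '*****@')  -- partial mask: keep the domain
ELSE SHA2(VAL)                                                             -- hide the value with a hash
END;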
Data masking policies are applied to a column with the
following ALTER TABLE statement.
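This is the same statement shown earlier in this section:

ALTER TABLE IF EXISTS EMP_INFO MODIFY COLUMN USER_EMAIL
SET MASKING POLICY EMAIL_MASK;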
They can be applied to thousands of columns across
databases and schemas if required.
In this example, you might want to apply it to all columns
that hold email addresses.
There are a few bits of additional information around masking policies that might be useful for the exam.
Data masking policies are schema-level objects, like tables
and views.
Creating and applying data masking policies can be done
independently of object owners.
This allows for a separate security focus team to determine
masking policies for the rest of an organization, enabling a
segregation of duties.
Thirdly, masking policies can be nested, existing in tables
and views that reference those tables.
The lowest masking policy will apply first.
A masking policy is applied no matter where the column is
referenced in a SQL statement.
This is quite important to know as it may result in some
unexpected query behavior.
Take the example of joining masked data.
If both columns in the join condition are masked, they'll join based on the masked data, not on the underlying data.
And lastly, a data masking policy can be added to a table
or view either when the object is created or after the object
is created.
Masking Policies
 Data masking policies are schema-level objects, like
tables & views.
 Creating and applying data masking policies can be
done independently of object owners.
 Masking policies can be nested, existing in tables and
views that reference those tables.
 A masking policy is applied no matter where the
column is referenced in a SQL statement.
 A data masking policy can be applied either when the
object is created or after the object is created.
External tokenization.
Masking policies can also make use of external functions to
mask their data. This is called external tokenization.
Tokenized data is loaded into Snowflake, which is
detokenized at query run-time for authorized users via
masking policies that call an external tokenization service
using external functions.

It allows us to ingest scrambled or tokenized data into Snowflake and detokenize it at query runtime.
Let's look at an example to see how it works.
Tokenized data is ingested into a table.
Tokenization is the process of replacing a meaningful or
sensitive value, like date of birth, with a string of random
characters or tokens, which have no meaning if exposed.
The key point to realize here is that unlike dynamically
masking data, this will store the data in a tokenized form in
Snowflake storage.
A masking policy is created and applied with a
detokenization external function configured in its policy
body.
When that column is used in a SQL query, if the condition
of the masking policy is met and the role or user is
authorized to see the data, the masking policy will make a
call to an external service via an external function with the
tokenized data as its payload.
The external function will return detokenized data.
If the user or role does not meet the condition of the
masking policy, a masked value will be returned.
The main benefit of external tokenization is that you can
store the underlying data in a tokenized form so not even
Snowflake or the cloud provider will be able to read its
contents.

Row Level Security

Row Access Policies

Row access policies enable a security team to restrict which rows are returned in a query.

Row-level security allows Snowflake users to restrict which rows are returned in a query based on a condition, like what their current role is.
Like with column masking, we'll walk through a conceptual
level first.
Row level security is provided through the use of row
access policies, which are very syntactically similar to
masking policies.
However, unlike masking policies, the result set will not
have a single column masked, but an entire row will either
be present or omitted from the output based on one or
more conditions.

ALTER TABLE ACCOUNTS ADD ROW ACCESS POLICY RAP_IT ON (ACC_ID);

Adding a masking policy to a column fails if the column is referenced by a row access policy. Row access policies are evaluated before data masking policies.
Row access policies are also applied transparently
when querying a table.
If an unauthorized user, like in this example, does not meet
the conditions defined in the row access policy, their select
query will either have all rows filtered out in the result set
or partially filtered out.
On the other hand, if a user is authorized, a complete result
set will be returned to them.
Here we have a CREATE ROW ACCESS POLICY statement, sketched below.
You can see it's very similar in definition to the masking policy for column security.
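A sketch of what the statement could look like (the authorized role name is illustrative):

CREATE ROW ACCESS POLICY RAP_IT AS (ACC_ID VARCHAR) RETURNS BOOLEAN ->
CASE
WHEN CURRENT_ROLE() = 'ADMIN' THEN TRUE   -- admins see every row
ELSE FALSE                                -- everyone else sees no rows
END;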
First, we provide an identifier, RAP_IT.
Next we specify the column and its data type we'd like to attach this row access policy to. This point initially threw me off.
Why would you need to specify a column if it's filtering
rows?
The best way to think of this is to think of the column as an
anchoring point, and ideally, the column is related to what
this row access policy is trying to achieve.
For example, in this case, we're allowing an admin to see
account IDs of bank accounts, hence why we've put the
account ID column.
We then specify the return type, which is always Boolean,
making it different from masking policies in that the input
data type can be different from the output data type.
And following that, we have the arrow, which splits the
policy definition from the policy body.
The policy body is, again, a case statement with one or
more conditions.
This policy either returns all rows or none based on if the
user's current role equals admin.
Other context functions can also be used here too, like
current user or current region.
A row access policy can be simple, like in the code example
shown here, or more complicated by incorporating a lookup
table, essentially limiting the rows a user can see based on
another column's value.
We'll see this in action in the hands-on as it's a lot easier to
grasp practically.
And much like the masking policy, we issue an ALTER TABLE command to apply the policy to a column.
Let's now explicitly call out some similarities with column
masking policies.
Row access policies are schema-level objects, which means a database and schema must exist in Snowflake before a row access policy can be created and applied.
They both allow for segregation of duties. They also both
follow the same pattern of creating first and then applying.
Row access policies can be nested with the first policy in
the chain applying first.
Both row access policies and column masking policies can
be added to a table or view either when the object is
created or after the object is created.
And there's a couple of points to be aware of if setting both
row and column level policies on the same table or view.
Firstly, you can't apply a masking policy to a column if it's
already referenced by a row access policy.
And secondly, row access policies are evaluated before
data masking policies, and the same column cannot be
specified in both a masking policy signature and a row
access policy signature at the same time.
Okay, that's it for the theory on row access policies.

Secure Views
Secure views are a type of view designed to limit access to
the underlying tables or internal structural details of the
view.
 Secure views are a type of view designed to limit
access to the underlying tables or internal structural
details of a view.
 Both standard and materialized views can be
designated as secure.
 A secure view is created by adding the keyword
SECURE in the view DDL.
 The definition of a secure view is only available to the
object owner.
 Secure views bypass query optimizations which may
inadvertently expose data in the underlying table.
Both standard and materialized views can be designated as
secure.

As shown in the CREATE VIEW DDL sketch below, a secure view is created by adding the keyword SECURE to the view DDL.
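A sketch of such a DDL statement (the view, table, and column names are illustrative):

CREATE SECURE VIEW MY_DB.MY_SCHEMA.PROFITS_V AS
SELECT REGION, SUM(PROFIT) AS TOTAL_PROFIT
FROM MY_DB.MY_SCHEMA.SALES
GROUP BY REGION;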
Once it has been created, the role that owns the view can convert a secure view back to a standard view by issuing the ALTER VIEW ... UNSET SECURE command.
So what is it actually doing differently to secure the view?
It secures a view in two key ways.
Firstly, the underlying tables on which the view is based or
internal structural details for the secure view are not
exposed.
The view definition and details are only visible to
authorized users.
So there's users who are granted the role that owns the
view.
For example, when a SHOW VIEWS command is executed, the definition of a secure view is not included in the output.
This is also true for the SHOW MATERIALIZED VIEWS command, the GET_DDL utility function, the VIEWS Information Schema view, and the VIEWS Account Usage view.
The second way a secure view works is slightly more
complicated. One of the reasons we might create a view is
to restrict the data you can query from the underlying
table.
However, on a standard view, the query optimizer that
constructs the execution plan, slots in some clever
optimizations to ensure the query runs quickly, such as
placing a filter first to reduce the total size of data to
compute, also known as push down.
This could be potentially problematic because it could
introduce a way for a user that doesn't have access to all
the data in the underlying table to determine values they
wouldn't usually have access to.
Secure views are designed to prevent users from possibly
being exposed to data from the underlying table that is
meant to be filtered by the view.
It does this by bypassing some of the view optimizations
performed by the Query Optimizer.
And because secure views don't make use of these query plan optimizations, they can be slower than standard views.
For that reason, Snowflake discourages making views secure if they don't have strict security requirements.

Account Usage and Information Schema


By default, Snowflake provides a shared read-only database called SNOWFLAKE with all accounts.

Account Usage
 Snowflake provides a shared read-only database called SNOWFLAKE, imported using a Share object called ACCOUNT_USAGE.
 It is comprised of 6 schemas, which contain many
views providing fine-grained usage metrics at the
account and object level.
 By default, only users with the ACCOUNTADMIN role
can access the SNOWFLAKE database.
 Account usage views record dropped objects, not just
those that are currently active.
 There is latency between an event and when that
event is recorded in an account usage view.
 Certain account usage views provide historical usage
metrics. The retention period for these views is 1 year.
The SNOWFLAKE database is imported using a secure data
share called ACCOUNT_USAGE.
Its purpose is to share views containing many different
types of fine-grained usage metrics in order to query and
report on account and object usage.
For example, you can use it to programmatically check how
many queries were executed in the last hour or do
something like check how many credits an individual
warehouse used in the last three months.
Essentially, it's a home for long-term historical metadata
about what's going on in your Snowflake account.
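For example, a query along these lines could report credits for a single warehouse over the last three months (the warehouse name is illustrative):

SELECT WAREHOUSE_NAME, SUM(CREDITS_USED) AS TOTAL_CREDITS
FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY
WHERE WAREHOUSE_NAME = 'ANALYSIS_WH'
AND START_TIME >= DATEADD(MONTH, -3, CURRENT_TIMESTAMP())
GROUP BY WAREHOUSE_NAME;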
The SNOWFLAKE database is comprised of six schemas.
The first is perhaps confusingly called ACCOUNT_USAGE,
the same name of the share object.
This is the main schema you'll be using. It contains views
that display object metadata and historical usage metrics
for your account.
For example, it has a view called TABLES, which contains metadata for all the tables created in your account.
It's by interacting with these views that you can govern and
maintain your Snowflake account.

The CORE schema is relatively new and currently only contains the system tags used by data classification.
Additional views and schema objects will be introduced in
future releases.
Next is the READER_ACCOUNT_USAGE schema.
It contains views that display object metadata and usage
metrics for all the reader accounts that have been created
for your main account.
The DATA_SHARING_USAGE schema includes views that
display information about listings published in a data
exchange.
The ORGANIZATION_USAGE schema provides historical
usage data for all accounts in an organization.
And lastly, you'll see a schema called the
INFORMATION_SCHEMA.
This is created by default for every database in an account.

By default, only account administrators, users with the ACCOUNTADMIN role, can access the SNOWFLAKE database and its schemas or perform queries on the views.
However, privileges on the database can be granted to
other roles in your account to allow other users to access
the objects.
And account usage views record dropped objects, not only
those that are currently active.
We can see if an object is dropped when querying a view
with the additional timestamp column in the output named
DELETED, which records when the object was dropped.
We also have a field called internal ID for some objects,
which helps us differentiate between objects that were
dropped and recreated with the same name.
And for each view, there's some latency associated with
querying it.
For most views, the latency is around two hours.
This means it will take about two hours for changes to
appear in the view.
A complete breakdown of timings for each view can be
found in the Snowflake documentation.
And the last point for this slide, most of the account usage
views provide historical usage metrics.
For example, the METERING_HISTORY view gives us
historical metadata about credit consumption.
The retention period for these views is one year.
Throughout the course, we'll make use of the account
usage views to understand billing, warehouse usage, query
history, and much more.
Information schema.
You might have already noticed that each database created
in your account automatically includes a read-only schema
named INFORMATION_SCHEMA.

The Snowflake INFORMATION_SCHEMA is based on the SQL-92 ANSI Information Schema, but with the addition of queryable views and functions that are specific to Snowflake.

Each database created in an account automatically includes a built-in, read-only schema named INFORMATION_SCHEMA based on the SQL-92 ANSI Information Schema.
Each INFORMATION_SCHEMA contains:
• Views displaying metadata for all objects contained in
the database.
• Views displaying metadata for account-level objects
(non-database objects such as roles, warehouses and
databases).
• Table functions displaying metadata for historical and
usage data across an account.

The output of a view or table function depends on the privileges granted to the user's current role.
It includes views containing information for all the objects
contained in the database, like the tables.
It also has views for account-level objects, so objects that
sit outside the scope of the database the information
schema is in, such as roles, warehouses, and other
databases.
This means there are some views which hold metadata
specific to a database and some which hold identical
metadata across each database's information schema.
For example, there's a view called TABLES that will only
have metadata for tables in that specific database,
whereas the view called DATABASES has metadata on
databases across the account.
The information schema also includes table functions for
historical and usage data across your account.
These are functions which can return multiple rows and can
be queried like a table.
An example of an information schema table function is the
COPY_HISTORY table function.
This allows us to retrieve metadata about all the files
loaded into a table.
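A sketch of calling it (the table name is illustrative):

SELECT FILE_NAME, LAST_LOAD_TIME, ROW_COUNT, STATUS
FROM TABLE(INFORMATION_SCHEMA.COPY_HISTORY(
TABLE_NAME => 'MY_TABLE',
START_TIME => DATEADD(HOURS, -1, CURRENT_TIMESTAMP())));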
The output of a view or table function depends on the
privileges granted to the user's current role.
When querying an information schema view or table function, only objects which the current role has been granted access to are returned.
Account usage views and the information schema views
and table functions have a lot of overlapping functionality,
with identical structures and naming conventions.
However, there are some crucial differences between the
two that will help us determine which is the right one to
use when reporting on account and object usage.
The account usage views contain dropped objects, whereas
the information schema does not.
So if you're looking to get an understanding of past objects,
you'd use the account usage views.
However, if you needed the most up-to-date metadata,
you'd use the information schema.
There's no latency associated with the views and table
functions in the information schema, whereas in the
account usage views, this could be up to three hours.
Account usage views generally hold more data, going back
up to a year, whereas the information schema varies from
seven days to six months.
