Dynamic Data Masking
Presentation: Interrupt Me
SQL Server Security Features
• Prior Versions
o TDE
o Backup Encryption
o Auditing
• SQL 2016 Additions
o Row-level security
o Always Encrypted
o Dynamic data masking
• TDE requires Enterprise
What is Data Masking?
• Data masking is also called data obfuscation, sanitization, scrambling, de-identification, and many other terms
• It is the process of taking data and making it unreadable so end users cannot see the actual data values
• Data masking is generally applied to sensitive data that, if leaked, could expose you and your company to lost business, embarrassment, and/or civil or criminal penalties
• Two different approaches to masking data
o DML – replace the actual data values, typically to build a test data set (e.g. Grid Tools)
o DDL – change the definition of a table to mask the data; the stored data values do not change (the Microsoft implementation)
• SQL Server DDM is considered a masking or obfuscation solution, not scrambling or de-identification
o The data on disk is still the real data
• If your end user needs accurate data, you should not mask that data.
Types of Data to Mask
• Personally Identifiable Information (PII) – data that uniquely identifies a person, such as:
o Full name, address, phone, email address, SSN or national health care number, passport number, date of birth, driver’s license number
o Vehicle Identification Number (VIN) – yep, VIN is PII according to NHTSA
SQL Server Dynamic Data Masking
• First introduced in SQL Server 2016
• Available in all editions of SQL Server (Web/Express requires SP1)
• “Helps” prevent unauthorized access to sensitive data (remember: it is only one part of a total data protection plan)
• Obfuscates the data by applying masking rules (functions) to the result set
• Defined strictly through DDL; the underlying data does not change
o Easy to implement, part of the column definition
• Possible DDM use cases
o HIPAA, PCI compliance, GDPR
o Personally identifiable information (PII)
o Credit card, SSN, etc.
Four available masking functions
• Default() – full masking according to the data type of the column
o String data types – masks values with X’s: XXXX, or fewer if the column is shorter than 4 characters (char, nchar, varchar, nvarchar, text, ntext)
o Numeric data types – uses the value zero (bigint, bit, decimal, int, money, numeric, smallint, smallmoney, tinyint, float, real)
o Date and time data types – uses 01.01.1900 00:00:00.0000000 (date, datetime2, datetime, datetimeoffset, smalldatetime, time)
o Binary data types – uses a single byte of ASCII value 0 (binary, varbinary, image)
• Email()
o Exposes the first letter of the email address and replaces the rest with X’s, always ending in the constant suffix .com
• Random()
o Used for numeric data types. Returns a random numeric value within a specified range
• Partial() / Custom String
o Displays only the leading and trailing parts of a string, with a custom padding string in between, based on the arguments passed into the function
o Most commonly used for fixed-length fields like SSN, credit card number, etc.
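The four functions can be sketched in one table definition. This is a minimal illustration, not the presenter's demo script: the table dbo.EmployeeMaskDemo, its columns, the sample row, and the user MaskDemoReader are all hypothetical names chosen for this example.

```sql
-- Hypothetical demo objects; every name here is an assumption for illustration.
DROP TABLE IF EXISTS dbo.EmployeeMaskDemo;

CREATE TABLE dbo.EmployeeMaskDemo
(
    EmployeeID int IDENTITY(1,1) PRIMARY KEY,
    -- default(): full mask chosen by data type (X's for strings, fixed date for dates)
    LastName   varchar(50)  MASKED WITH (FUNCTION = 'default()') NOT NULL,
    BirthDate  date         MASKED WITH (FUNCTION = 'default()') NULL,
    -- email(): first letter kept, constant .com suffix
    Email      varchar(100) MASKED WITH (FUNCTION = 'email()') NULL,
    -- random(low, high): a random number in the given range on each read
    Salary     int          MASKED WITH (FUNCTION = 'random(1000, 9999)') NULL,
    -- partial(prefix, padding, suffix): keep 0 leading and 4 trailing characters
    SSN        char(11)     MASKED WITH (FUNCTION = 'partial(0, "XXX-XX-", 4)') NULL
);

INSERT INTO dbo.EmployeeMaskDemo (LastName, BirthDate, Email, Salary, SSN)
VALUES ('Rivera', '1985-04-12', 'rivera@example.org', 72000, '123-45-6789');

-- A user without UNMASK sees the masked result set.
IF DATABASE_PRINCIPAL_ID('MaskDemoReader') IS NULL
    CREATE USER MaskDemoReader WITHOUT LOGIN;
GRANT SELECT ON dbo.EmployeeMaskDemo TO MaskDemoReader;

EXECUTE AS USER = 'MaskDemoReader';
SELECT LastName, BirthDate, Email, Salary, SSN
FROM dbo.EmployeeMaskDemo;
REVERT;
```

Running the same SELECT as the table owner returns the real values, which is a quick way to confirm that only the result set is masked and the stored data is untouched.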
Permissions / Security Concerns
• To create a table with a mask, you need only the standard CREATE TABLE permission and ALTER permission on the schema
• Adding, replacing, or removing the mask of a column requires the ALTER ANY MASK permission and ALTER permission on the table and schema
• Users with SELECT permission or db_datareader membership see masked data when they query the table
• The UNMASK permission is needed to see the unmasked data (DBOs have this permission by default, but it can be revoked)
• The CONTROL permission on the database includes both ALTER ANY MASK and UNMASK
• Caution – all data touch points must be reviewed to make sure the proper masking rule is or is not
applied. Failure to do so could result in that touch point receiving unmasked data when it is not
supposed to.
o Application interfaces, reports, services, users with direct query access, etc.
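The effect of UNMASK can be checked with a short script like the one below. It is a sketch under assumptions: the table dbo.PermissionMaskDemo, its sample row, and the user MaskDemoReader are hypothetical and exist only for this illustration.

```sql
-- Hypothetical demo objects; names and data are assumptions for illustration.
DROP TABLE IF EXISTS dbo.PermissionMaskDemo;

CREATE TABLE dbo.PermissionMaskDemo
(
    CustomerID int PRIMARY KEY,
    Phone      varchar(12) MASKED WITH (FUNCTION = 'default()') NULL
);

INSERT INTO dbo.PermissionMaskDemo (CustomerID, Phone)
VALUES (1, '555-010-0199');

IF DATABASE_PRINCIPAL_ID('MaskDemoReader') IS NULL
    CREATE USER MaskDemoReader WITHOUT LOGIN;
GRANT SELECT ON dbo.PermissionMaskDemo TO MaskDemoReader;

-- SELECT permission alone: Phone comes back masked.
EXECUTE AS USER = 'MaskDemoReader';
SELECT CustomerID, Phone FROM dbo.PermissionMaskDemo;
REVERT;

-- With UNMASK granted, the same query returns the real value.
GRANT UNMASK TO MaskDemoReader;

EXECUTE AS USER = 'MaskDemoReader';
SELECT CustomerID, Phone FROM dbo.PermissionMaskDemo;
REVERT;

-- Return the user to the masked view.
REVOKE UNMASK FROM MaskDemoReader;
```

Repeating this check for each application login, report account, and service account is one way to work through the touch-point review described above.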
Data Masking FYI’s/Limitations
• Masking does not prevent updates to a column. Users with UPDATE permission can still change the values even though they see only masked data when they query it
• Using SELECT INTO or INSERT INTO or using the import/export wizard to copy data from a masked
column to another table or file results in masked data in the target table or file if that user does not
have the UNMASK permission. Applies to #temp and ##temp tables also.
• Limitations – a masking rule cannot be applied to the following
o Always Encrypted and FILESTREAM columns
o Computed columns. However, if the computed column depends on a column with a mask, the computed column will return masked data
o COLUMN_SET or a sparse column that is part of a column set.
o A column with data masking cannot be a key for a FULLTEXT index
o In some cases, a mask cannot be added to a column that has dependencies (e.g. an index). You must remove the dependency first, add the data masking, and then recreate the dependency
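The copy behavior described above can be reproduced with INSERT INTO ... SELECT. The sketch below uses hypothetical tables (dbo.CopyMaskSource, dbo.CopyMaskTarget), a well-known test card number, and a hypothetical user MaskDemoReader; none of these come from the presentation.

```sql
-- Hypothetical demo objects; names and data are assumptions for illustration.
DROP TABLE IF EXISTS dbo.CopyMaskSource;
DROP TABLE IF EXISTS dbo.CopyMaskTarget;

CREATE TABLE dbo.CopyMaskSource
(
    ID         int PRIMARY KEY,
    CardNumber varchar(19) MASKED WITH (FUNCTION = 'partial(0, "XXXX-XXXX-XXXX-", 4)') NULL
);

-- The target column has no mask of its own.
CREATE TABLE dbo.CopyMaskTarget
(
    ID         int PRIMARY KEY,
    CardNumber varchar(19) NULL
);

INSERT INTO dbo.CopyMaskSource (ID, CardNumber)
VALUES (1, '4111-1111-1111-1111');

IF DATABASE_PRINCIPAL_ID('MaskDemoReader') IS NULL
    CREATE USER MaskDemoReader WITHOUT LOGIN;
GRANT SELECT ON dbo.CopyMaskSource TO MaskDemoReader;
GRANT INSERT ON dbo.CopyMaskTarget TO MaskDemoReader;

-- The copy runs as a user without UNMASK, so the masked strings are what get written.
EXECUTE AS USER = 'MaskDemoReader';
INSERT INTO dbo.CopyMaskTarget (ID, CardNumber)
SELECT ID, CardNumber FROM dbo.CopyMaskSource;
REVERT;

-- Even the table owner now sees XXXX-XXXX-XXXX-1111 in the target,
-- because the masked value is the value that was stored.
SELECT ID, CardNumber FROM dbo.CopyMaskTarget;
```

This matters for ETL jobs and exports: if the account that moves the data lacks UNMASK, the destination silently receives masked values as real data.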
More Gotchas
• Brute-force techniques may be used to determine the real values or ranges, because predicates are evaluated against the unmasked data (example later on)
o SELECT * FROM Employees WHERE Salary BETWEEN 50000 AND 100000
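A small script makes the inference risk concrete. It is only a sketch: the table dbo.InferenceMaskDemo, its salaries, and the user MaskDemoReader are hypothetical values invented for this example.

```sql
-- Hypothetical demo objects; names and data are assumptions for illustration.
DROP TABLE IF EXISTS dbo.InferenceMaskDemo;

CREATE TABLE dbo.InferenceMaskDemo
(
    EmployeeID int PRIMARY KEY,
    Salary     int MASKED WITH (FUNCTION = 'default()') NOT NULL
);

INSERT INTO dbo.InferenceMaskDemo (EmployeeID, Salary)
VALUES (1, 48000), (2, 72500), (3, 130000);

IF DATABASE_PRINCIPAL_ID('MaskDemoReader') IS NULL
    CREATE USER MaskDemoReader WITHOUT LOGIN;
GRANT SELECT ON dbo.InferenceMaskDemo TO MaskDemoReader;

EXECUTE AS USER = 'MaskDemoReader';

-- Salary is displayed as 0, but the WHERE clause filters on the real value,
-- so only employee 2 is returned: the range leaks even though the number is hidden.
SELECT EmployeeID, Salary
FROM dbo.InferenceMaskDemo
WHERE Salary BETWEEN 50000 AND 100000;

-- Narrowing the predicate step by step pins down the exact value.
SELECT EmployeeID, Salary
FROM dbo.InferenceMaskDemo
WHERE Salary = 72500;

REVERT;
```

This is why masking alone is not sufficient for users who can write their own queries.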
Dynamic Data Masking DDL Statements
• CREATE TABLE – use the MASKED WITH clause and specify the function
• ALTER TABLE ALTER COLUMN – use the ADD MASKED WITH clause and specify the function
• Dropping a mask – use the DROP MASKED clause on the ALTER TABLE ALTER COLUMN command
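The statements above might look like this in practice. The table dbo.AlterMaskDemo and its columns are hypothetical, and the mask choices are arbitrary examples rather than recommendations.

```sql
-- Hypothetical demo table; names are assumptions for illustration.
DROP TABLE IF EXISTS dbo.AlterMaskDemo;

-- CREATE TABLE: the mask is part of the column definition.
CREATE TABLE dbo.AlterMaskDemo
(
    ID    int PRIMARY KEY,
    Email varchar(100) NULL,
    Phone varchar(12) MASKED WITH (FUNCTION = 'default()') NULL
);

-- ALTER TABLE ... ADD MASKED: add a mask to an existing, unmasked column.
ALTER TABLE dbo.AlterMaskDemo
    ALTER COLUMN Email ADD MASKED WITH (FUNCTION = 'email()');

-- Change the function on a column that is already masked
-- by restating the column type together with the new mask.
ALTER TABLE dbo.AlterMaskDemo
    ALTER COLUMN Phone varchar(12) MASKED WITH (FUNCTION = 'partial(0, "XXX-XXX-", 4)');

-- ALTER TABLE ... DROP MASKED: remove the mask; the stored data is unchanged.
ALTER TABLE dbo.AlterMaskDemo
    ALTER COLUMN Phone DROP MASKED;
```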
Demo
• Create DDM database and populate an employee table
o Apply DEFAULT(), RANDOM(), EMAIL(), and PARTIAL() functions to various fields using CREATE TABLE and ALTER TABLE commands
o Test it using two different users – reader and DBO user
o Use brute-force techniques to determine real values
o Test various data anomalies (e-mail not being e-mail, data does not conform to expected lengths, etc.)
o Grant UNMASK to the reader user and revoke it from the DBO user
o Test some of the limitations (computed column, sparse data, etc.)
o Test views, stored procedures, etc.
o Move some data around
o Review metadata and how to prevent updates to masking rules
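For the metadata step, the sys.masked_columns catalog view lists every masked column in the current database. The query below is a sketch of that review; the DENY statement shows one possible way to keep a principal from changing masking rules, and the user MaskDemoReader is a hypothetical name.

```sql
-- List every masked column in the current database with its masking function.
SELECT
    OBJECT_SCHEMA_NAME(mc.object_id) AS schema_name,
    t.name                           AS table_name,
    mc.name                          AS column_name,
    mc.masking_function
FROM sys.masked_columns AS mc
JOIN sys.tables AS t
    ON mc.object_id = t.object_id
WHERE mc.is_masked = 1
ORDER BY schema_name, table_name, column_name;

-- Hypothetical user; explicitly block it from adding, changing, or dropping masks.
IF DATABASE_PRINCIPAL_ID('MaskDemoReader') IS NULL
    CREATE USER MaskDemoReader WITHOUT LOGIN;
DENY ALTER ANY MASK TO MaskDemoReader;
```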
Dynamic Data Masking Use Cases
• Can be used in your application to mask sensitive data for the application front end, provided that:
o All data touch points (application, reports, services, etc.) are analyzed to ensure the UNMASK permission is set appropriately for application logins
o Servers are tightly controlled by a dedicated DBA team, and only that team can manage them
o Users do not have direct query access through SSMS or any other unmanaged ad hoc tool
Questions & Discussion
References
• https://docs.microsoft.com/en-us/sql/relational-databases/security/dynamic-data-masking
• http://www.sqlservercentral.com/stairway/139056/
• https://docs.microsoft.com/en-us/sql/relational-databases/security/row-level-security
THANK YOU !