Netwrix Auditor Data Discovery and Classification Quick Start Guide
The information in this publication is furnished for information use only, and does not constitute a
commitment from Netwrix Corporation of any features or functions, as this publication may describe
features or functionality not applicable to the product release or version you are using. Netwrix makes no
representations or warranties about the Software beyond what is provided in the License Agreement.
Netwrix Corporation assumes no responsibility or liability for the accuracy of the information presented,
which is subject to change without notice. If you believe there is an error in this publication, please report
it to us in writing.
Netwrix is a registered trademark of Netwrix Corporation. The Netwrix logo and all other Netwrix product
or service names and slogans are registered trademarks or trademarks of Netwrix Corporation. Microsoft,
Active Directory, Exchange, Exchange Online, Office 365, SharePoint, SQL Server, Windows, and Windows
Server are either registered trademarks or trademarks of Microsoft Corporation in the United States
and/or other countries. All other trademarks and registered trademarks are property of their respective
owners.
Disclaimers
This document may contain information regarding the use and installation of non-Netwrix products.
Please note that this information is provided as a courtesy to assist you. While Netwrix tries to ensure
that this information accurately reflects the information provided by the supplier, please refer to the
materials provided with any non-Netwrix product and contact the supplier for confirmation. Netwrix
Corporation assumes no responsibility or liability for incorrect or incomplete information provided about
non-Netwrix products.
Table of Contents
1. Introduction
3. DDC Collector
5. DDC Provider
6.2. Subscribe to Report
8. Glossary
9. Related Documents
1. Introduction
This guide is intended for first-time users of Netwrix Auditor Data Discovery and Classification. It can be
used for evaluation purposes, so read it sequentially and follow the instructions in the order they are
provided. After reading this guide you will be able to:
NOTE: The DDC Collector and DDC Provider work only in combination with supported Netwrix Auditor
applications, so this guide covers a basic procedure for running the modules and assumes that you
have Netwrix Auditor installed and configured in your environment. For installation scenarios, data
collection options, and detailed information on how Netwrix Auditor works, refer to the
following Quick-Start Guides, depending on your data source:
Netwrix Auditor includes applications for Active Directory, Azure AD, Exchange, Office 365, Windows file
servers, EMC storage devices, NetApp filer appliances, network devices, SharePoint, Oracle Database, SQL
Server, VMware, and Windows Server. Empowered with a RESTful API and user activity video recording, the
platform delivers visibility and control across all of your on-premises or cloud-based IT systems in a unified
way.
Major benefits:
To learn how Netwrix Auditor can help you achieve your specific business objectives, refer to the Netwrix
Auditor Best Practices Guide.
With Netwrix Auditor Data Discovery and Classification, you can identify, classify and secure sensitive data
on Windows file servers, EMC storage devices and NetApp filer appliances.
Major benefits:
The DDC Collector is a data discovery and classification service that runs on a dedicated server. It scans
your various file repositories for supported file content, stores the raw text in the DDC Index and indexes
that content. It classifies the indexed file content by matching it against predefined third-party taxonomies
(rules and patterns for finding, for example, personal data governed by the GDPR or medical records
governed by HIPAA) and any custom taxonomies you create. It stores the resulting document
classifications in the DDC Collector database. You use the DDC Collector console to monitor and control the
DDC Collector service, as well as to select, create, modify and manage taxonomies.
Meanwhile, the DDC Provider service runs on the Netwrix Auditor Server. It reads the classification
results from the DDC Collector database and translates the DDC Collector taxonomy format into the
Netwrix Auditor category format. The resulting list of objects and their categories is periodically
transferred to the Categories database.
Netwrix Auditor merges data from the Categories database and other Netwrix Auditor databases (such as
the file server State-in-Time database) to generate the reports you request.
• Processor: 3 cores
• RAM: 12 GB
If you plan to deploy in larger environments, take the following considerations and restrictions into account:
3. DDC Collector
DDC Collector is a web-based configuration module designed to discover potentially sensitive documents
and directories and classify them according to specific taxonomy clues.
• Hardware Requirements
• Software Requirements
You can deploy DDC Collector on a virtual machine running a Microsoft Windows guest OS on one of the
following virtualization platforms:
• VMware vSphere
• Microsoft Hyper-V
• Nutanix AHV
Note that DDC Collector supports only the Windows OS versions listed in the Software Requirements section.
NOTE: Hardware requirements for SQL Servers listed in the table above apply to each SQL Server instance
in your configuration separately.
To index a large number of objects (8 to 32 million), DDC Collector requires a multi-server
configuration (Distributed Query Server mode). The requirements marked with an asterisk (*) apply
to a single server. To estimate total hardware requirements, multiply the values above by the number
of your nodes. Netwrix recommends deploying an additional DQS server for every 8 million file
objects to be indexed.
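The sizing rule above can be sketched as a small calculation. This is only an illustration: it assumes that "an additional DQS server for every 8 million file objects" means the total node count is the object count divided by 8 million, rounded up, and the function names and the 16 GB figure in the usage note are hypothetical rather than taken from the requirements table.

```python
import math

# From the guide: one DQS server per 8 million file objects, up to 32 million.
OBJECTS_PER_DQS_SERVER = 8_000_000
MAX_SUPPORTED_OBJECTS = 32_000_000


def dqs_servers_needed(total_objects: int) -> int:
    """Estimate the node count (assumption: round up per 8 million objects)."""
    if total_objects <= 0:
        raise ValueError("total_objects must be positive")
    if total_objects > MAX_SUPPORTED_OBJECTS:
        raise ValueError("the guide only covers up to 32 million objects")
    return math.ceil(total_objects / OBJECTS_PER_DQS_SERVER)


def total_requirement(per_server_value: float, total_objects: int) -> float:
    """Multiply a per-server requirement (marked with *) by the node count."""
    return per_server_value * dqs_servers_needed(total_objects)
```

For example, under this reading 20 million objects would call for three nodes, and a hypothetical per-server requirement of 16 GB of RAM would add up to 48 GB across the deployment.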
For XLarge environments, each SQL Server instance must be deployed to a separate physical disk.
• Install the first DDC Collector instance where you are going to enable DQS mode.
Refer to "To enable DQS Mode" for detailed instructions on how to rebalance the load across your system.
The recommendations below apply to a clean installation of DDC Collector in an extra-large environment. If
you upgraded from a previous version, perform steps 6 to 18.
1. Prepare machines for the SQL Server instances where the DDC Collector database and the Netwrix Auditor
databases (including Categories) reside. Consider the following recommendations:
• For better performance, deploy both SQL Server instances on dedicated SSD storage. A
configuration where one of the SQL Server instances runs on HDD is acceptable, but not
recommended. Using HDD for all instances leads to performance loss and slow report generation.
2. Create and configure the DDC Collector database as described in the DDC Collector Database section.
3. Prepare a server for DDC Collector. It must meet the XLarge environment (up to 32 million objects)
hardware requirements and the general Software Requirements.
4. Install and configure DDC Collector as described in the Install DDC Collector section.
5. Add a license.
6. Prepare servers for the other DDC Collector instances, one server per instance. Each
server must meet the XLarge environment (up to 32 million objects) hardware requirements and the general
Software Requirements.
7. Copy the Netwrix_Auditor_DDC_Collector.exe file to the server that will host the next DDC
Collector instance.
9. Proceed with the installation as described in the Install DDC Collector section until you reach the SQL
Database configuration.
10. On the SQL Database step, provide the name of the SQL Server instance that hosts the DDC Collector
database you configured in step 2.
NOTE: Ignore the confirmation dialog about the existing schema in the selected SQL database.
12. Repeat steps 7 to 11 for each successive DDC Collector instance.
14. On the computer where the first DDC Collector instance is installed, open the DDC Collector console.
NOTE: Once DQS mode is enabled, you cannot roll back your configuration. Netwrix strongly
recommends that you take a full backup of your environment first.
17. On the DQS tab, click Add to add the servers you prepared in step 6 one by one.
Option Description
19. When prompted, re-collect data sources to re-distribute the content across all of the configured
servers.
20. You can review system health and services dashboards to check your configuration. See Review
Dashboard for more information.
Component Requirements
Operating system: Windows Server 2012 R2 and above.
Windows Features: Web Server Role (IIS)
• HTTP Errors
• Static Content
• HTTP Redirection
• Anonymous Authentication
Other features
Microsoft IFilters: Microsoft Office 2010 Filter Packs and above, 64-bit edition.
Visual Studio: Visual C++ Redistributable Packages for Visual Studio 2015 and above.
NOTE: For performance purposes, Netwrix strongly recommends installing DDC Collector and SQL
Server on separate machines.
NOTE: The account used to create the DDC Collector database must be granted the dbcreator server-level
role.
1. On the computer where the SQL Server instance with the DDC Collector database resides, navigate to
Start → All Programs → Microsoft SQL Server → SQL Server Management Studio.
4. Select the Files page and set the Initial Size (MB) parameter for the PRIMARY file group to 512 MB.
5. Click Expand next to the PRIMARY file group and set Autogrowth / Maxsize as follows:
Option Description
6. Go to the Options page and make sure that the Recovery model parameter is set to "Simple".
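For readers who prefer a script to the Management Studio dialogs, the settings named in the steps above (512 MB initial size for the PRIMARY file and the Simple recovery model) could be expressed in T-SQL. The sketch below only builds that script as text. It assumes the database is created with default file settings, so that the logical name of the primary data file equals the database name, and it leaves out the Autogrowth / Maxsize values because they are not listed in this excerpt. The helper names are hypothetical.

```python
def quote_identifier(name: str) -> str:
    """Quote a T-SQL identifier with brackets, escaping any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def build_create_database_script(db_name: str = "DDC_Collector_database",
                                 initial_size_mb: int = 512) -> str:
    """Return T-SQL that creates the database and applies the guide's settings.

    Assumption: with a default CREATE DATABASE, the primary data file's
    logical name is the same as the database name.
    """
    ident = quote_identifier(db_name)
    logical_name = db_name.replace("'", "''")  # escape for the N'...' literal
    return "\n".join([
        f"CREATE DATABASE {ident};",
        f"ALTER DATABASE {ident} MODIFY FILE "
        f"(NAME = N'{logical_name}', SIZE = {initial_size_mb}MB);",
        f"ALTER DATABASE {ident} SET RECOVERY SIMPLE;",
    ])


if __name__ == "__main__":
    print(build_create_database_script())
```

Review the generated statements before running them in a query window; the account that runs them still needs the dbcreator server-level role mentioned above.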
1. Run Netwrix_Auditor_DDC_Collector.exe.
2. Review minimum system requirements and then read the License Agreement. Click Next.
3. On the Product Settings step, specify the path where DDC Collector will be installed. For example,
C:\Program Files\Netwrix DDC\.
• Unique name for your DDC Collector instance. For example, Netwrix DDC.
• Directory where Index files reside. For example, C:\Program Files\Netwrix DDC\DDC Index.
5. On the SQL Database step, provide the SQL Server database connection details. Complete the following
fields:
Option Description
Server Name: Provide the name of the SQL Server instance that hosts your DDC Collector
database. For example, "WORKSTATIONSQL\SQLSERVER".
Database Name: Enter the name of the SQL Server database you created for DDC Collector.
Netwrix recommends using the DDC_Collector_database name.
• File System Path: Provide a path to store the DDC Collector Services files. For example,
C:\Program Files\Netwrix DDC Services.
• Provide the user name and password for the service account that runs the product services.
8. On the Pre-Installation Tasks and Checks step, review your configuration and select Install.
9. When the installation completes, open a web browser and navigate to the following URL:
http://hostname/conceptQS, where hostname is the name or IP address of the computer where DDC
Collector is installed.
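If you want to confirm from a script that the console responds after installation, a check along the following lines may help. It is a sketch, not part of the product: it assumes the console is served over plain HTTP at the /conceptQS path given above, the function names are hypothetical, and any response below HTTP 500 (including an authentication prompt) is treated as "the web site is up".

```python
import urllib.error
import urllib.request


def console_url(hostname: str) -> str:
    """Build the DDC Collector console URL for a host name or IP address."""
    return f"http://{hostname}/conceptQS"


def console_is_reachable(hostname: str, timeout: float = 5.0) -> bool:
    """Return True if the console URL answers with a non-5xx response."""
    try:
        with urllib.request.urlopen(console_url(hostname), timeout=timeout) as resp:
            return resp.status < 500
    except urllib.error.HTTPError as err:
        # The server answered, e.g. with 401 when authentication is required.
        return err.code < 500
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout and similar problems.
        return False
```

Calling `console_is_reachable("ddc01")` with your own server name (the name here is a placeholder) returns a simple yes/no answer; opening the URL in a browser remains the actual next step.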
1. In your web browser, navigate to the following URL: http://hostname/conceptQS, where hostname is
the name or IP address of the computer where DDC Collector is installed. The following window
appears:
• Add License
• Add Taxonomy
• Review Dashboard
1. In DDC Collector console, navigate to Config → Settings and expand the System node.
3. In the License details dialog, drag and drop the license file in the License area.
4. When completed, the license is displayed in the list of available licenses and has the Valid status.
Out of the box, you are assigned the "Super User" role in DDC Collector console. If you want to provide
access to several tabs of DDC Collector console to other users, do the following:
1. In DDC Collector console, navigate to Sources → "file share or folder" and unselect the Anonymous
Access checkbox.
NOTE: If the Anonymous Access checkbox is selected for your content source, access to your
sensitive data is unrestricted.
To... Do...
The first user you add is assigned the "Super User" role and
has unrestricted rights in the DDC Collector console.
Consider adding a verified user first.
Restrict access to DDC Collector console components: In the Users tab, select the DDC Collector
components available for this user:
• Sources
• Taxonomies
• Config
• Users
• Reports
3. Financial records and payment cards information covering GLBA and PCI DSS scope.
Each taxonomy contains a set of terms. You can add, edit and remove these terms using configuration
rules (Clues). For evaluation purposes, the following types of clues are sufficient:
• Standard: A single word or multi-word concept. Matched on a fuzzy basis with word stemming
enabled. Use quotes around single words to disable stemming. Use double quotes around phrases to
invoke exact phrase matching.
• To manage taxonomies
4. Click Load.
5. In the Add Termsets dialog, select your taxonomies and click Add Selected.
3. Select the Load XML file to SQL option to import an XML file directly into the DDC Collector console;
large taxonomies will be imported by the background services.
5. Select Upload.
6. In the Add Termsets dialog, select your taxonomies and click Add Selected.
To manage taxonomies
1. In DDC Collector console, navigate to Taxonomies and locate the taxonomy that you want to
manage.
NOTE: If your taxonomy does not have any terms yet, right-click the taxonomy and select Add Child
Term. Specify one or several child terms—one term per line.
2. Expand the taxonomy and locate the desired term on the left pane. Review the following for
additional information:
To... Do...
Review predefined clues: Navigate to the Clues tab and review the available default clues.
Clues are used to describe the language found in documents
that makes them about a particular topic.
Suggest clues:
1. Navigate to the Suggest tab and click Suggest to add new clues.
2. You can suggest a score for the clue and change its type.
Search collected and classified files:
1. Navigate to the Search tab and enter search criteria in the Find field.
Review all files matching the taxonomy:
1. Navigate to the Browse tab and review the list of files matching the selected taxonomy.
Option Description
Folder: Enter the UNC path of the root folder where collection is to start. You
can add Windows directories, NetApp filers or EMC storage
devices to the index.
NOTE: Specify identical UNC paths in both Netwrix Auditor and DDC
Collector. Any actions on data sources configured
differently or locally (e.g., "C:\") are out of scope. Otherwise,
you need to map to the same server location and then restart
the DDC Provider service.
Include sub-folders: Select if you want to process data in sub-folders and set a depth limit.
Allow anonymous access: This option is used to disable security filtering for selected sources. If
unselected, the indexing processes will collect Windows Access Control
Lists (ACLs) for the files and search results will be filtered based upon
the end user's Windows identity.
Netwrix recommends clearing this option. See Secure Your Data for
more information.
Enable duplicate detection: Select to exclude documents that contain the same text content from
the index.
3.
Option Description
Re-Index Period Specifies how often the source should be checked for changes. Netwrix
recommends using default values.
Document Type Specify a value that can be used to restrict queries when utilising the
DDC Collector search index.
4. Select Index Folder to start the indexing process. An information pop-up window appears when
indexing completes successfully.
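The Folder option above asks for a UNC path that is identical in Netwrix Auditor and DDC Collector, and rules out local paths such as C:\. A quick pre-check of the two strings might look like the sketch below. It compares text only (case-insensitively, ignoring trailing separators) and cannot tell whether two different names, such as a host name and an IP address, point to the same server. The function names and the sample paths are hypothetical.

```python
from pathlib import PureWindowsPath


def is_unc_path(path: str) -> bool:
    r"""Return True for paths of the form \\server\share\..., False for C:\ etc."""
    return PureWindowsPath(path).drive.startswith("\\\\")


def same_unc_location(auditor_path: str, collector_path: str) -> bool:
    """Check that both paths are UNC paths and are textually the same location."""
    if not (is_unc_path(auditor_path) and is_unc_path(collector_path)):
        return False
    # PureWindowsPath compares case-insensitively and drops trailing separators.
    return PureWindowsPath(auditor_path) == PureWindowsPath(collector_path)
```

For instance, `same_unc_location(r"\\fs01\Finance", r"\\FS01\finance")` is True, while pairing either of them with a local path such as `C:\Finance` is False.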
The default screen (Dashboard) shows a high-level overview of Netwrix Auditor Data Discovery and
Classification service statistics. You can review all processing stages of every component:
• Collector
• Indexer
• Classifier
For the full list of potential issues, refer to System Health and Troubleshooting.
Out of the box, DDC Collector processes JPEG, PNG, TIFF, and Bitmap images. For the full list of supported
content types, refer to the Supported Content Types section. If you want to enable OCR, configure the product
as follows:
To... Do...
Recognize stand-alone images: Do the following to enable OCR for image files with a specific
extension:
The settings take effect an hour after configuration. If you want to start processing images and
documents earlier, navigate to the Services snap-in and restart the following services:
• conceptIndexer
• ConceptCollector
• conceptClassifier
NOTE: Make sure that DDC Collector is not processing any files; otherwise, restarting the services may
cause the data classification process to fail.
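As an alternative to the Services snap-in, the three services listed above could be restarted from an elevated prompt. The sketch below uses the standard Windows `net stop` and `net start` commands through Python; the service names come from the list above, while the restart order and the helper names are assumptions made for illustration. Heed the note above and run it only while DDC Collector is idle.

```python
import subprocess

# Service names as listed in this guide.
DDC_SERVICES = ("conceptIndexer", "ConceptCollector", "conceptClassifier")


def restart_commands(services=DDC_SERVICES):
    """Build the 'net stop' / 'net start' command pairs for each service."""
    commands = []
    for name in services:
        commands.append(["net", "stop", name])
        commands.append(["net", "start", name])
    return commands


def restart_services(services=DDC_SERVICES, run=subprocess.run):
    """Run the commands in order; raises CalledProcessError if one fails."""
    for cmd in restart_commands(services):
        run(cmd, check=True)
```

Passing a custom `run` callable makes it possible to print or log the commands first instead of executing them.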
• EMC
• NetApp
Option Description
Item: Specify file shares that you want to process with DDC Collector.
NOTE: Specify identical UNC paths for the item in both Netwrix Auditor
and DDC Collector. Any actions on data sources
configured differently or locally (e.g., "C:\") are out of
scope. Otherwise, you need to map to the same server
location and then restart the DDC Provider service.
Additional options: 1. Enable the Collect data for state-in-time reports option
for each item that you want to process.
NOTE: Refer to the Create a New Plan section in Netwrix Auditor Online Help Center for detailed
instructions on how to create a new monitoring plan.
When DDC Collector processes ("crawls") the specified files and folders, it performs read operations under the
dedicated DDC Collector account (described in the Add Content Sources section). Netwrix Auditor, which
monitors your file storage system (Windows File Server, NetApp Filer or EMC Storage), reports these
read operations by default. To avoid this excessive reporting, include the dedicated
DDC Collector account and its read operations in the omitreportlist.txt and omitstorelist.txt files for Netwrix
Auditor. See Exclude Data from File Servers Monitoring Scope for more information.
5. DDC Provider
DDC Provider is the integration module used to deliver classified and indexed documents collected by DDC
Collector to Netwrix Auditor and display them in reports.
NOTE: If you plan to use Microsoft SQL Server 2016, make sure it has SP2 installed.
• A member of the local Administrators group on the computer where Netwrix Auditor Server and
DDC Provider are installed.
• The db_datareader database role must be assigned to the account on the SQL Server instance
where the DDC Collector database resides.
NOTE: Netwrix recommends using different accounts to connect to the SQL Server instances where the DDC
Collector database and the Categories database reside.
1. On the computer where the SQL Server instance with DDC_Collector_database resides, navigate to
Start → All Programs → Microsoft SQL Server → SQL Server Management Studio.
3. In the left pane, expand the Security node. Right-click the Logins node and select New Login from
the pop-up menu.
4. Select User mapping on the left and select the DDC_Collector_database for which you want to
assign the role.
5. In the Database role membership for: DDC_Collector_database list, select the db_datareader
role.
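The login and role assignment described in these steps can also be scripted. The sketch below only generates T-SQL text for review. It assumes a Windows (domain) account, which the steps above do not state explicitly, and the account name in the usage note and the helper names are hypothetical.

```python
def quote_identifier(name: str) -> str:
    """Quote a T-SQL identifier with brackets, escaping any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def build_datareader_grant_script(login: str,
                                  db_name: str = "DDC_Collector_database") -> str:
    """Return T-SQL that creates a login, maps it to the database and
    adds it to the db_datareader role.

    Assumption: 'login' is a Windows account in DOMAIN\\user form.
    """
    login_q = quote_identifier(login)
    db_q = quote_identifier(db_name)
    return "\n".join([
        f"CREATE LOGIN {login_q} FROM WINDOWS;",
        f"USE {db_q};",
        f"CREATE USER {login_q} FOR LOGIN {login_q};",
        f"ALTER ROLE [db_datareader] ADD MEMBER {login_q};",
    ])
```

For example, `build_datareader_grant_script(r"CORP\svc_ddc_provider")` produces four statements that can be pasted into a query window connected to the instance that hosts the DDC Collector database.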
Option Description
Enable DDC Provider: Starts the Netwrix Auditor DDC Provider service and changes its Startup Type
to "Automatic".
SQL Server instance: Provide the name of the SQL Server where the DDC Collector database resides
(e.g., WORKSTATIONSQL\SQLEXPRESS for the SQLEXPRESS instance). See DDC
Collector Database for more information.
Database: Provide the name of the database you created for DDC Collector.
User name: Specify the account to be used to connect to the SQL Server instance.
NOTE: DDC Provider is a part of Netwrix Auditor Data Discovery and Classification. For the
solution to function properly, install and configure DDC Collector as described in the DDC Collector
section.
If you have any issues while using DDC Provider, see System Health and Troubleshooting for more
information.
This section lists steps required to upgrade DDC Provider to the latest version. Review the following for
additional information:
• To perform upgrade
1. Check that the account under which you plan to run the setup has local Administrator rights.
a. Start Microsoft SQL Server Management Studio and connect to SQL Server instance hosting the
database.
b. In Object Explorer, right-click Categories database and select Tasks → Back Up.
• conceptCollector
• conceptIndexer
• conceptClassifier
To perform upgrade
You can upgrade DDC Provider by running the Netwrix Auditor installation package.
NOTE: Re-open Netwrix Auditor if you handled it during DDC Provider installation.
In Netwrix Auditor, navigate to Reports → and select a report you are interested in and click View.
• State-in-time reports: Provide information on the system's state at a specific moment in time. They
are based on daily configuration snapshots and reflect a particular aspect of the audited
environment.
Report Description
Activity reports
Activity Related to Sensitive Files and Folders: This report lists all access attempts to files and folders
that contain certain categories of sensitive data at the moment.
State-in-time reports
Most Accessible Sensitive Files and Folders: This report shows the number of users that effectively have
access to sensitive files or folders, sorted in descending order. Use this report to identify data at high
risk and plan for corrective actions accordingly.
Overexposed Files and Folders: This report shows sensitive files and folders accessible by the specified
users or groups, based on the combination of folder and share permissions. Use this report to identify
data at high risk and plan for corrective actions accordingly.
Sensitive Files and Folders by Owner: This report shows ownership of files and folders that are stored in
the specified file share and contain selected categories of sensitive data. Use this report to determine
the owners of particular sensitive data.
Files and Folders Categories by Object: This report shows files and folders that contain specific
categories of sensitive data. Use this report to see whether a specific file or folder
Sensitive Files Count by Source: This report shows the number of files that contain specific categories of
sensitive data. Use this report to estimate the amount of your sensitive data in each category, plan for
data protection measures and control their implementation.
Sensitive File and Folder Permissions Details: This report shows permissions granted on files and folders
that contain certain categories of sensitive data. Use this report to see who has access to a particular
file or folder, via either group membership or direct assignment. Reveal sensitive content that has
permissions different from the parent folder.
To apply filters
2. Apply filters to the report and click View Report. For example, you can update the report timeframe,
select specific values for Who and Where, apply sorting, etc.
Wildcards are supported. For example, type %admin% in the Who (domain\user) field if you want to view
changes made by users whose name contains "admin" (e.g., enterprise\administrator,
corp\administrator, sqladmin).
Do not use % in the exclusive filters (e.g., Who (Exclude domain\user)). Otherwise, you will receive an empty
report.
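To illustrate how a % wildcard of this kind selects values, the sketch below translates a filter string into a regular expression in which % stands for any run of characters. It is a model of the behaviour described above, not the product's implementation, and the case-insensitive comparison is an assumption.

```python
import re


def wildcard_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a filter such as %admin% into an anchored regular expression."""
    parts = [re.escape(part) for part in pattern.split("%")]
    # Each % becomes '.*'; everything else is matched literally.
    return re.compile("^" + ".*".join(parts) + "$", re.IGNORECASE | re.DOTALL)


def matches(pattern: str, value: str) -> bool:
    """Return True if the whole value matches the wildcard pattern."""
    return wildcard_to_regex(pattern).match(value) is not None
```

With this model, `%admin%` matches `enterprise\administrator` and `sqladmin` but not `corp\jsmith`, and a pattern without % must match the whole value.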
1. On the main Netwrix Auditor page, navigate to Reports. Specify the report that you want to
subscribe to and click Subscribe.
Option Description
General
Send empty subscriptions when no activity occurred: Slide the switch to Yes if you want to receive a
report even if no changes occurred.
Specify delivery options:
• File format: Configure reports to be delivered as doc or xls files.
Other tabs
Expand the Recipients list and click Add Recipient to add more
recipients.
Filters Specify the report filters, which vary depending on the selected
report.
• Troubleshooting Issues
Dashboard Description
System Health Review health statuses of every service. If an issue occurs, you can expand it
and review details and suggested resolution.
Service Viewer Shows real-time activity of all services. Once all work is complete “Idle …” will
be displayed. It is possible to use this to check which sources are currently
being processed, as well as to ensure that the services are currently running.
DDC Collector installation completes with warnings: On the computer where DDC Collector is installed,
navigate to the Services snap-in and restart the following services manually:
Upgrade completes with warnings and errors; DDC Provider configuration completes with warnings: On the
computer where DDC Provider is installed, navigate to the DDC Provider logs (by default, they are stored in
"C:\ProgramData\Netwrix Auditor\Logs\Data Discovery and Classification\Tracing") and open
Netwrix.DDC.Service.log.
8. Glossary
The table below contains basic glossary terms:
Clues: Clues are used to describe the language found in documents that makes them about a particular
topic. (Not reflected in reports.)
9. Related Documents
The table below lists all documents available to support Netwrix Auditor Data Discovery and Classification:
Document Description
Netwrix Auditor Online Help Center: Gathers information about Netwrix Auditor from multiple sources and
stores it in one place, so you can easily search and access any data you need for your business. Read on
for details about the product configuration and administration, its security intelligence features, such
as interactive search and alerts, and Integration API capabilities.
Netwrix Auditor Installation and Configuration Guide: Provides detailed instructions on how to install
Netwrix Auditor, and explains how to configure your environment for auditing.
Netwrix Auditor Administration Guide: Provides step-by-step instructions on how to configure and use the
product.
Netwrix Auditor Intelligence Guide: Provides detailed instructions on how to enable complete visibility
with Netwrix Auditor interactive search, report, and alert functionality.
Netwrix Auditor Integration API Guide: Provides step-by-step instructions on how to leverage Netwrix
Auditor audit data with on-premises and cloud auditing solutions using RESTful API.
Netwrix Auditor Release Notes: Lists the known issues that customers may experience with Netwrix Auditor
9.6, and suggests workarounds for these issues.