Nuix EDiscovery User Guide v4 2 - 4
Version 4.2
Nuix believes the information in this publication is accurate as of its publication date. The information is subject to change without
notice.
THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS.” NUIX MAKES NO REPRESENTATIONS OR
WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION AND SPECIFICALLY
DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Use, copying, and distribution of any Nuix software described in this publication requires an applicable software license.
Introduction
  Key features of Nuix 4.2
    Automatic Classification
    Document Navigator Filter
    Batch Load Details
    Cluster Runs
    Search Macros
    Redaction and Bulk Redactions
    Export to Ringtail load file
    Support for Windows Registry Files
    Support for File Carving, Slack Space and Deleted Space
    Hex Viewer
    History Tab
    Scan for new Child Items
    Improved currency entity extraction
    Added support for:
    Other functionalities:
    Key enhancements for Nuix 4
Architecture
Interface Overview
  Menus
    File Menu
    Edit Menu
    Items Menu
    Go Menu
Litigation support specialists who use Nuix for processing, searching, and exporting clients' data.
These workflows favor speed, scale, and the ability to work with large datasets very quickly.
Corporate and law enforcement investigators who use Nuix to explore and analyze their own
corporate data as part of internal investigations or as precursors to litigation.
Attorneys who are interested in quickly and easily assessing the facts and merits of the case they
have been presented.
The guide is primarily organised into chapters that follow an end-to-end eDiscovery workflow, and includes
task-based instructions for activities such as loading the data, searching, analysing, reviewing, and
exporting. If you need more granular details about a particular option or control in Nuix, refer to the Interface
Overview. The Introduction contains an overview of features and salient information about licensing, product
architecture, and enhancements for Version 4. Supplemental topics about scripting, the API, supported file
types, and troubleshooting are located in the Appendices.
Cluster Runs
Ability to filter items by cluster jobs, remove unwanted cluster jobs, add a cluster to a fast review job (in
addition to adding to Family), perform a cluster job via Nuix API, and export a cluster job along with the case
subset. Refer to Interacting with the Results View: Rows and Filtered Items for further details on clustering.
Search Macros
Point-and-click keyword filtering that enables a repeatable process for identifying relevant items and
screening them for sensitive information before they are exported from Nuix. Investigators can build a profile
around the individuals being investigated, and the profile is portable from case to case. Refer to Filtered
Items for further details.
Hex Viewer
The Hex Viewer in the Preview Pane provides the ability to view the internal structure of files and to search
the raw data at a binary level. Refer to Preview Item Detail Tabs for more information.
History Tab
History tab in the Preview Pane provides the ability to view the processing settings applied to an item and
other user actions taken on the item. Refer to Preview Item Detail Tabs for more information.
Other functionalities:
Read-only cases
Multiple user access
Legal Workstation - Provides processing, investigation, and item-level and legal export functionality.
Legal Workstation is offered as a mid-level option for organizations that have light processing requirements
but still need to create load files.
Restrictions: Does not include Case Evidence Pre-Filter. Does not include Sub-Case Exports.
Enterprise Workstation - Provides processing, investigation, and item-level and legal export functionality.
Legal Workstation is offered as a mid-level option for organizations that have light processing requirements
but still need to create load files.
Restrictions: No restrictions; all features are enabled.
Reviewer - Provides review and analysis functionality. Reviewer licenses are offered in conjunction with the
Nuix Server and Enterprise Workstation licenses to facilitate multi-user, concurrent, collaborative review
within Nuix.
Restrictions: Does not include data ingestion. Does not include item-level, legal, or sub-case exports.
ARX - Provides Analysis, Review and eXport functionality. ARX licenses enable additional power users to
perform full analysis and export operations.
Restrictions: Does not include data ingestion. Standard ARX does not include Legal Export. Enterprise ARX
has no export restrictions.
Nuix Server - Provides a mechanism for securely allowing multiple users to interact with the same case
simultaneously, as well as for distributing multiple licenses from a single dongle.
Restrictions: Only used for case collaboration and license distribution. The Nuix Server has no processing or
export functionality.
Processing Recommendations
Microsoft Office 2007/2010 is strongly recommended for all processing systems. Nuix will attempt to open
Office 95 and Office works files with Office 2007/2010; if Office is not available, Nuix defaults to text
extraction only for these file formats.
Export Requirements
Microsoft Office 2007/2010 - Office 2007/2010 is required to export PST files, to create Ringtail
databases (MS Access MDB), and as part of the PDF rendering process. Office 2010 includes a
64-bit version of Access, which allows you to export to Ringtail and Discovery Radar from the
64-bit version of Nuix. If you do not have the 2010 64-bit version of Access, your Ringtail and
Discovery Radar exports can only be run from the 32-bit version of Nuix. Office 2010 has built-in
PDF capabilities.
Microsoft Office 2007 PDF Plug-in - The Office 2007 PDF plug-ins are used as part of the legal
export to create PDF renderings of native electronic documents. Note: Office 2007 SP2
includes the PDF plug-in by default. If you have Office 2007 installed and are uncertain whether the
correct PDF plug-ins have been installed, open a Word document and save it as a PDF. If the option
is not present, another save option will be displayed to save as an alternative format including PDF.
Ghostscript - Ghostscript is used to convert PDF images to TIFF files.
Review only options
Document Viewer: Depending on their desktop configuration, individual reviewers or analysts
may not have access to all of the software needed to launch items in their native
application. There are several applications available, notably Outside-In from Oracle and Quick
View Plus from AvantStar. These applications can be installed on the reviewer desktop and will
replace the OS file associations to open the majority of file types.
Office 2010 Viewers: Microsoft has made free viewers available for many applications in the Office 2010
suite. These free viewers eliminate the requirement for installing a full copy of Office 2010 on each
reviewer client.
Download:
Excel Viewer
Word Viewer
PowerPoint Viewer
Visio Viewer
Note:
Using the Microsoft Office viewers does not allow for an accurate PDF rendering of the item. Nuix will still
generate a PDF view, but it will simply be a PDF rendering of the extracted text, as opposed to a formatted,
true to life representation.
Menus
Nuix 4 contains a set of standard menus to help you run commands from the user interface. Many of the
commands on these menus are also located closer in context with the tasks with which they are associated,
such as on right-click menus.
The menus are:
File - Commands for managing cases, printing functions, and exiting the application.
Edit - Commands for editing items in the case.
Items - Commands for editing, managing, and finding items in the case.
Go - Commands for navigating through items in a case, and for managing search queries.
Window - Commands for managing the user interface, such as showing and hiding elements.
Reports - Commands for opening new tabs to show reports.
Scripts - Commands for launching and managing scripts.
File Menu
The Nuix 4 File menu contains commands for managing cases, print functions, and exiting the application.
Edit Menu
The Nuix 4 Edit menu contains commands for editing items.
Items Menu
The Nuix 4 Items menu contains commands for editing, managing, and finding items in the case.
Go Menu
The Nuix 4 Go menu contains commands for navigating through items in a case, and for managing search
queries.
GO COMMAND FUNCTION
Next Item Displays the next item in the result set.
Next Batch Displays the first item in the next family of items during a Fast Review job.
Show All Descendants Finds and displays all child items for the selected item(s) in a new Workbench tab.
Show All Top-level Items Finds and displays the highest-level parent item for the selected item(s) in a new Workbench tab.
Show All Families Finds and displays the family items for the selected item(s) in a new Workbench tab.
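The relationship between descendants, top-level items, and families in the Go menu can be pictured with a small tree. The following is a minimal Python sketch and not Nuix code: it assumes each item has at most one parent, treats the root of each tree as the top-level item, and uses hypothetical item names and function names.

```python
from collections import defaultdict

def build_children(parent):
    """Invert a child -> parent mapping into parent -> [children]."""
    children = defaultdict(list)
    for child, par in parent.items():
        children[par].append(child)
    return children

def descendants(item, children):
    """All items below the given item, at any depth, in sorted order."""
    found, stack = [], list(children.get(item, []))
    while stack:
        current = stack.pop()
        found.append(current)
        stack.extend(children.get(current, []))
    return sorted(found)

def top_level(item, parent):
    """Walk up the tree until an item with no parent is reached."""
    while item in parent:
        item = parent[item]
    return item

def family(item, parent):
    """The top-level item followed by everything beneath it."""
    root = top_level(item, parent)
    return [root] + descendants(root, build_children(parent))

# Hypothetical example: an email with an attachment that embeds an image.
parent = {"attachment.docx": "email.msg", "image.png": "attachment.docx"}
print(descendants("email.msg", build_children(parent)))
print(top_level("image.png", parent))
print(family("attachment.docx", parent))
```

In this simplified model, selecting any member of a family leads back to the same top-level item, which is why the family is the same whichever member is selected.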
Window Menu
The Nuix 4 Window menu contains commands for managing the user interface, such as showing and hiding
elements or resetting the window panes within the tabs to their default layouts.
You can configure the default set of tabs shown when you open a case through Global Options (File >
Global Options > Default Tabs). You must close and reopen a case for any changes to take effect.
Reports Menu
The Nuix 4 Reports menu contains commands for opening new tabs to show reports.
New Fast Review Statistics Tab Opens a new Fast Review Statistics tab.
Scripts Menu
The Nuix 4 Scripts menu contains commands for launching and managing scripts. All Nuix 4 license types
support scripting.
Nuix has a Scripts directory for organizing scripts that can be run from this menu. Scripts in this directory
display in the menu, and scripts that you place into sub-folders in the directory are displayed by folder in the
menu. In the image shown, numerous scripts have been collected into logical folders, which display in
the Scripts menu and allow for organized access to the scripts. If no scripts exist, this menu displays only
the last two commands.
Open Scripts Directory Opens the Nuix directory where you place scripts that can be accessed by Nuix 4.
Show Console Opens the Nuix Script Console, which allows you to type or paste scripts and run them, and shows all programmatic output from the script, including informational updates and errors.
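To picture how a folder layout in the Scripts directory maps onto grouped menu entries, the sketch below groups script files by sub-folder. It is a standalone Python illustration, not part of Nuix: the function name `group_scripts_by_folder` and the list of script file extensions are assumptions.

```python
from collections import defaultdict
from pathlib import Path

# Assumed script extensions; adjust to the script types you actually use.
SCRIPT_EXTENSIONS = {".rb", ".py", ".js"}

def group_scripts_by_folder(scripts_dir):
    """Return {folder label: sorted script names} for a scripts directory.

    Scripts placed directly in the directory are grouped under "" (top level);
    scripts in sub-folders are grouped under the sub-folder's relative path.
    """
    root = Path(scripts_dir)
    groups = defaultdict(list)
    for path in root.rglob("*"):
        if path.is_file() and path.suffix.lower() in SCRIPT_EXTENSIONS:
            folder = path.parent.relative_to(root).as_posix()
            # The top level is reported as "."; normalise it to "".
            groups["" if folder == "." else folder].append(path.name)
    return {folder: sorted(names) for folder, names in sorted(groups.items())}
```

Running the function against a copy of your Scripts directory is a quick way to check how scripts are spread across folders before you reorganise them.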
Help Menu
The Nuix 4 Help menu contains commands for viewing online help and the product version, opening system
logs and diagnostic tools, and downloading updates.
System Diagnostics Reports a variety of information about the system on which Nuix 4 is installed, including hardware, software, application dependencies, and system file and license properties. Also used as part of the customer support process.
Open Log Directory Opens the Nuix directory where application log files are written, which help the Nuix Support staff troubleshoot Nuix errors or failures.
Download Updates Opens the secure web page where you can download Nuix and dependent third-party software.
About Nuix License_Type Displays the name and version of your Nuix 4 license type.
Networks Menu
The Nuix 4 Networks menu contains commands for customizing the display of the Networks view. This
menu only displays when you select View by: Network in the Results pane.
The Networks tab displays a dynamically arranged diagram of all communications within the results set. The
Networks diagram can be used to determine communications patterns including frequency of
communications as well as any unusual or one-off communications. The diagram dynamically updates as
you change the filters and search criteria.
Lock/Unlock All Nodes Locks the graphic in place or unlocks it. When you lock the graphic, you can pull the nodes apart manually to highlight specific communication threads.
Node Display Options Sets a variety of display options for the text on the nodes, including truncating the text to less than 15 characters or showing no text, and options for displaying the addresses, such as showing only the personal name, only the address, either the personal name or address based on
Colour Schemes Sets the colour or shades used in the Networks view. A different colour is used when the communications between two people reach a certain value. Choose between Vivid, Classic, or Grayscale.
Tabs
Nuix 4 contains eight tabs that host a variety of workflows and case information. The primary tab is
the Workbench tab, which contains a holistic view of the data within the case and supports most of the
necessary eDiscovery tasks. You can open multiple tabs of the same type as needed to manage your work.
The Processing tab displays when you create the case, after the data is ingested, to show you information
about the results of the processing operation, but no longer displays the next time the case is opened.
You can control which tabs display by default when you open a case by going to File > Global Options >
Default Tabs and selecting the ones you wish to see when you open that case. To maintain a high level of
performance, not all tabs can be shown by default.
The eight tabs are:
Processing - Lists the processing operations with timestamps, as well as file type statistics and an
overall processing job status. This tab is only available immediately after the processing operation
has completed. Use the Statistics and History tabs to review the file types and total processing time
information after the Processing tab is closed.
Workbench - Hosts the primary tasks of excluding, filtering, and searching for data within the
case. You can also analyze data, preview individual items, and tag from this tab. This tab is set to
display by default when you open a case.
Statistics - Displays information about the processed and irregular files by file type, including
number processed, corrupted, and encrypted, as well as a percentage of each file type
encountered.
Word List - Displays a list of every word that appears in the data set or words matching a custom
word list, and a count of the number of items containing that word.
Addresses - Displays a list of every address that appears in the properties of the data within the
case, and a count of the number of items containing that address.
Entities - This tab is displayed only on the Preview pane. Displays a list of every entity that
appears within the data of the case, and a count of the number of items containing that entity.
History - Displays information about how Nuix 4 has been used. All case searches and primary
interactions are logged, with timestamps and the user that performed the action.
Fast Review - Lets you create jobs that can be batched up for review by investigators. For each
job, you can specify tags and words to highlight. You can then associate items to each job, and
those items are presented individually in a linear fashion for tagging.
Progress - Logs the processing events, including the data being ingested and other related
operations, with a time stamp.
Statistics - Displays the types of files processed, with the number corrupted, encrypted, deleted,
and related job percentages.
Job Status - Displays the status of the overall job.
At the bottom of the tab, you can also view the elapsed time since the job began, and a status bar showing
percent complete.
From this tab, you can perform the following tasks:
Pause a job, which temporarily halts the processing job, at which point the Resume button
becomes active. Pausing and then pressing Stop is the same as just pressing Stop.
Resume a job, which continues processing.
Stop a job, which displays a dialogue that provides two options for stopping case processing, Stop
and Abort.
Document Navigator Pane - Displays the evidence in its original hierarchical structure, any
excluded items, the filter mechanisms, and a history of searches performed in the case.
Search Bar - Contains a search text field with a date filter, and a tool for building more complex
queries.
Results Pane - Displays items that match the result of any exclusion, filter or search actions, and
supports reviewing or analysing the result set in seven different views, including a list of items,
thumbnails of images, a word list, item statistics, a map of events over time, and a communication network.
Preview Pane - Displays a full text preview of the selected item along with metadata about the
item, and offers support for viewing similar or related items, adding comments and opening the
message in its native application.
Review and Tag Pane - Provides support for navigating through the result set and tagging
documents through single selections or hot keys.
The Document Navigator is located by default to the left side of the Workbench tab, and can be popped
out of the window frame and/or resized within the Workbench window as necessary. You can also show or
hide the sections within it, or adjust their vertical size to meet your viewing needs. When you narrow the
number of items in the case that you wish to act on, either by de-selecting evidence or filtering on metadata,
the Document Navigator highlights the associated area in yellow to indicate that you have reduced the scope
of the data set.
Evidence - Displays the complete original source structure of the evidence loaded into the case.
You can also filter the data you wish to work with by selecting or clearing (deselecting) the
evidence from this control. It displays the data loaded into the case in its original source folder
hierarchy, allowing you to sort evidence based on item name, browse the evidence by folder, or
filter the set of files to view or analyze.
Excluded Items - Lists the items you have excluded from consideration, organised by name and
displaying their location within the data set. After you exclude items, they are suppressed from the
Results view and Document Navigator. They will still appear as part of the children/attachment tabs
in the Preview pane.
Custodians - Lists Custodians that have been allocated to items prior to ingest or through the
results pane.
Item Sets - Displays Item Sets created including all batches added to a particular Item Set,
displayed as originals and duplicated documents within the set.
Automatic Classifiers - Automatic Classifiers navigator displays the created Automatic Classifiers
that includes the Training, Automatically Classified, and Skipped items.
Production Sets - Displays Production Sets created on export or manually through the results
Evidence Navigator
The Evidence navigator is located within the Document Navigator pane on the Workbench tab. It
displays the data loaded into the case in its original source folder hierarchy, allowing you to sort evidence
based on item name, browse the evidence by folder, or filter the set of files to view or analyze by selecting
only the nodes you need.
Within the title bar, the number of items in the case that are not part of the excluded items, and
then the total number of items in the collection, followed by the percentage of items that have not
been excluded. For example: (18491/18919; 97.74%)
The name you created for the set of evidence as the root folder, with the total number of items in
that set of evidence that have not been excluded. By default, all data is selected.
Child folders that show the source folders that were processed, with the total number of items in
each folder that have not been excluded.
Irregular file icons, if applicable, which Nuix assigns to items upon ingestion if the items meet the
criteria. The irregular file icons are:
Corrupted Containers
Non-searchable PDFs
Text Updated
Bad Extension
Unrecognised
Unsupported Items
Empty
Encrypted
Deleted
Text Stripped
License Restricted
Carved
Decrypted
Fully Recovered
Metadata Recovered
Partially Recovered
Hidden Stream
Poison
The Excluded Item icon, for items that have been added to the excluded items list.
Filter the data you work with by clearing (deselecting) nodes or folders in the tree. When you do,
the navigator turns yellow to indicate that the full set of data is not being used for search and
review tasks.
Expand the nodes in the tree by clicking on the plus sign, and collapse them by clicking on the
minus sign.
At the top of the tree, select Reset to clear any filters and include the entire set of evidence once
more.
At the top of the navigator, show or hide this section by clicking on the double-arrow icon in the
blue title bar.
View the entire tree left to right by using the scroll bar at the bottom.
Within the blue title bar, the total number of excluded items in the case, and then the total number
of items in the collection, followed by the percentage of items that have been excluded. For
example: (428/18919; 2.26%)
The name you created for the exclusion set as the root folder, with the total number of excluded
items defined by that set. By default, all data you have excluded in the case is selected (checked).
Child folders that show the items being excluded in their original source structure, with the total
number of items in each folder that are excluded.
The Excluded Item icon to indicate an item is excluded.
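The two title bar examples above, (18491/18919; 97.74%) and (428/18919; 2.26%), are the same calculation applied to the included and excluded counts. The following Python sketch reproduces the arithmetic; the function name `scope_summary` is hypothetical and the code is only an illustration of the formatting, not how Nuix computes it internally.

```python
def scope_summary(count, total):
    """Format a count against the collection total as '(count/total; percent)'."""
    if total <= 0:
        raise ValueError("total must be positive")
    return f"({count}/{total}; {count / total:.2%})"

total_items = 18919
excluded_items = 428

# Items still in scope, as shown in the Evidence navigator title bar.
print(scope_summary(total_items - excluded_items, total_items))  # (18491/18919; 97.74%)

# Excluded items, as shown in the Excluded Items title bar.
print(scope_summary(excluded_items, total_items))  # (428/18919; 2.26%)
```

The two percentages always add up to 100%, because every item in the collection is either excluded or not.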
To find all items that are part of a specific exclusion set, follow these steps:
1. Clear all of the filters.
2. Uncheck all of the Excluded Items.
3. Search for exclusion:Exclusion.Set.Name
Item Sets
The Item Sets navigator is located within the Document Navigator pane on the Workbench tab. It allows you to
manage deduplication across sets of documents within the same case. By using Item Sets you can
clearly see which documents in your set are original and which documents are considered duplicates of
those original items.
Documents can be deduplicated by considering each item individually or by considering the document family
as a group.
Note: Custodians cannot be re-ranked once they have been used within an Item Set. Should you require
the custodians to be re-ranked, the documents will need to be added to a new Item Set.
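The difference between item-level and family-level deduplication can be sketched in a few lines of Python. This is an illustration only: the use of SHA-256 digests, the rule that the first item seen is the original, and all function and item names are assumptions, not a description of how Nuix identifies duplicates.

```python
import hashlib

def digest_of(content):
    """Digest of a single item's content (SHA-256 is an assumption)."""
    return hashlib.sha256(content).hexdigest()

def family_digest(member_contents):
    """Digest of a whole family, built from its members' digests in order."""
    combined = hashlib.sha256()
    for content in member_contents:
        combined.update(digest_of(content).encode("ascii"))
    return combined.hexdigest()

def deduplicate(keyed_digests):
    """Split (key, digest) pairs into originals and (duplicate, original) pairs.

    The first key seen for a digest is treated as the original.
    """
    first_seen, originals, duplicates = {}, [], []
    for key, digest in keyed_digests:
        if digest in first_seen:
            duplicates.append((key, first_seen[digest]))
        else:
            first_seen[digest] = key
            originals.append(key)
    return originals, duplicates

# Item level: each item is compared on its own content.
items = [("email-1", b"hello"), ("email-2", b"hello"), ("doc-3", b"other")]
print(deduplicate((key, digest_of(content)) for key, content in items))

# Family level: a family is a duplicate only if every member matches.
families = {
    "family-A": [b"message", b"attachment"],
    "family-B": [b"message", b"attachment"],
    "family-C": [b"message", b"different attachment"],
}
print(deduplicate((key, family_digest(members)) for key, members in families.items()))
```

In the family-level run, family-C is kept as an original even though its message matches the others, because its attachment differs.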
Automatic Classifiers
The Automatic Classifiers navigator is located within the Document Navigator pane on the Workbench tab.
Copy Automatic Classifier Creates a new automatic classifier by copying an existing classifier along with its items.
Build Model Builds a model within an automatic classifier from its training data. Refer to Build Model for further details.
View Build Model History Displays the history of models built within the selected automatic classifier.
View Confusion Matrix Displays the confusion matrix for the selected items. Refer to Build Model for further details.
Export Model Exports the predictive model from an automatic classifier to a PMML file. Refer to Export Model for further details.
Cross Check Against Cluster Run Compares the selected automatic classifier training items against a cluster run to find conflicting data.
The root folder lists the Training, Automatically Classified and Skipped folders.
Training
The Training folder is classified in terms of relevance. The right-click menu allows you to build a model,
view the build model history, view the confusion matrix, rescore the items, export the model, and cross-check
against a cluster run to find conflicting training data.
Automatically Classified
The Automatically Classified folder is classified in terms of relevance. The right-click menu allows you to
rescore the selected items.
The right-click menu on the results pane displays the Automatic Classifier menu. Alternatively, the Automatic
Classifier menu can be accessed from Items > Automatic Classifier.
In the Add or Edit an Automatic Classifier dialog box specify the name, description, classifications in
terms of relevance, view model details and automatically classified items. Click OK to save the Automatic
Classifier.
In the Add Training Items dialog box, select the Automatic classifier you wish to add the training items to,
and specify the classification based on its relevance. Click OK to add the training items.
To remove training items, select the items within the results pane you wish to remove and click Automatic
Classifier > Remove Training Items. In the Remove Training Items dialog box, select the Automatic
classifier from which you wish to remove the training Items and click OK.
Build/Export/Import Model
Note: Ensure you add training items to the automatic classifier before building a model.
To build an automatic classifier model, click Automatic Classifier > Build Model from the right click menu on
On completion, the Automatic Classifier Confusion Matrix is displayed, showing the Matrix, Omitted Words,
and Settings tabs.
The Matrix tab displays the item count, confusion matrix and statistics as shown below:
The Settings tab allows you to configure the settings. Click OK to save the changes.
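For readers unfamiliar with the term, a confusion matrix counts how often each actual classification was predicted as each possible classification. The Python sketch below shows the general technique with hypothetical labels and data; the precision and recall figures are common statistics derived from such a matrix and are an assumption about the kind of statistics shown, not a list of what the Matrix tab reports.

```python
from collections import Counter

def confusion_matrix(actual, predicted):
    """Count (actual label, predicted label) pairs."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return Counter(zip(actual, predicted))

def precision_recall(matrix, label):
    """Precision and recall for one label, derived from the matrix counts."""
    correct = matrix[(label, label)]
    predicted_as_label = sum(n for (_, p), n in matrix.items() if p == label)
    actually_label = sum(n for (a, _), n in matrix.items() if a == label)
    precision = correct / predicted_as_label if predicted_as_label else 0.0
    recall = correct / actually_label if actually_label else 0.0
    return precision, recall

# Hypothetical training labels and the model's predictions for the same items.
actual = ["relevant", "relevant", "relevant", "not relevant", "not relevant"]
predicted = ["relevant", "relevant", "not relevant", "not relevant", "relevant"]

matrix = confusion_matrix(actual, predicted)
for (a, p), count in sorted(matrix.items()):
    print(f"actual={a!r} predicted={p!r}: {count}")
print(precision_recall(matrix, "relevant"))
```

Off-diagonal cells, where the actual and predicted labels differ, are the ones worth reviewing when deciding whether the training items need adjusting.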
Export Model
To export a model, click Automatic Classifier > Export Model from the right-click menu on the results pane.
The Export Automatic Classifier Model dialog box is displayed.
Alternatively, right-click the root folder or Training folder of the automatic classifier whose model you
wish to export and select Export Model.
Select the Automatic Classifier from the drop down list, specify the Export to File destination, and
click OK.
Import Model
To import a model, click Automatic Classifier > Import Model from the right click menu on the results pane,
Once the production set is created, a PDF image version of the documents can be pre-generated within Nuix,
depending on the item, using the Populate Stores function. The available settings are the same as those found
in the imaging options. Once PDF versions of the items are created and stored in the Nuix PDF print store, the
Production Set Navigator shows a report of the results of the PDF generation. The items can then be
checked for quality issues and, if required, their images replaced with ones rendered outside Nuix in a
specialised application.
Note that this section does not include any excluded items. Items can appear in multiple filters (folders in the
navigation tree) if they meet the metadata criteria. Each folder displays a count of items included in that filter.
All Items - Items by file type, organized under the parent folder called All Items. This folder
includes all items in the collection, except excluded items.
Email Attachments - Items that are attached to the emails in the collection.
Emails and Loose Files - A combination of the emails and loose files, which are items that
were found in the source folders that were not emails or email attachments.
Irregular Items - Items that Nuix has determined to be irregular, listed by type of irregular file:
Corrupted Containers
Non-searchable PDFs
Text Updated
Bad Extension
Unrecognised
Unsupported Items
Empty
Encrypted
Deleted
Corrupted
Text Stripped
License Restricted
Carved
Decrypted
Fully Recovered
Metadata Recovered
Partially Recovered
Hidden Stream
Poison
Commented - Items to which you have applied comments. You can search through commented
items for a particular word or phrase by using the comment search syntax in the Search bar in
Filter the data you want to work with by selecting the check boxes next to the folders or nodes in
the tree or by double-clicking the filter name. When you do, the navigator turns yellow to indicate
that the full set of data is not being used for search and review tasks. *Note*: Double clicking on
the filter name will show the search syntax used to filter the data in the Search bar.
Expand the nodes in the tree by clicking on the plus sign, and collapse them by clicking on the
minus sign.
At the top of the tree, select Reset to clear any filters and include the entire set of evidence once
more.
At the top of the navigator, show or hide this section by clicking on the double-arrow icon in the
blue title bar.
View the entire tree left to right by using the scroll bar at the bottom.
You can perform equivalent filtering actions by using the Kind search syntax in the Search bar, as
described below.
Documents kind:document Word processor documents such as Microsoft Word documents and rich text format (RTF) files.
Other Documents kind:other-document Other types of documents a user might create.
Multimedia kind:multimedia Audio and video files, and other types of multimedia.
Containers kind:container Data types that resemble directories, such as archives or mailboxes.
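If you build queries outside Nuix and paste them into the Search bar, the Kind syntax above can be kept in a small lookup. The Python sketch below is an illustration under assumptions: the helper name `restrict_to_kind` is hypothetical, and combining a keyword with a Kind term using AND is inferred from the AND operator used in search examples in this guide.

```python
# Search syntax for each Kind filter, taken from the table above.
KIND_SYNTAX = {
    "Documents": "kind:document",
    "Other Documents": "kind:other-document",
    "Multimedia": "kind:multimedia",
    "Containers": "kind:container",
}

def restrict_to_kind(query, filter_name):
    """Append a Kind term to a simple keyword query.

    Assumes the query contains no OR operators, because this sketch adds
    no grouping around the original query.
    """
    term = KIND_SYNTAX[filter_name]
    query = query.strip()
    return term if not query else f"{query} AND {term}"

print(restrict_to_kind("raptor", "Documents"))  # raptor AND kind:document
print(restrict_to_kind("", "Multimedia"))       # kind:multimedia
```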
Nuix uses a regular expression search to find the specific character ranges associated with each language.
See the regular expression section for additional detail on searching for character sets not included in the
drop down list.
For a complete list of the Unicode character ranges, see this Unicode Chart.
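The idea of matching a language's characters by Unicode range can be tried out in Python before you write a search. The sketch below uses Python's `re` module, so the pattern syntax may differ from the regular expression syntax that Nuix accepts. The four ranges are the standard Unicode blocks for those scripts; the dictionary and function names are hypothetical.

```python
import re

# Standard Unicode block ranges for a few writing scripts.
SCRIPT_RANGES = {
    "Greek": "\u0370-\u03FF",
    "Cyrillic": "\u0400-\u04FF",
    "Hebrew": "\u0590-\u05FF",
    "Arabic": "\u0600-\u06FF",
}

# One character-class pattern per script, for example [\u0400-\u04FF].
SCRIPT_PATTERNS = {
    name: re.compile(f"[{char_range}]") for name, char_range in SCRIPT_RANGES.items()
}

def scripts_in(text):
    """Names of the scripts that have at least one character in the text."""
    return [name for name, pattern in SCRIPT_PATTERNS.items() if pattern.search(text)]

print(scripts_in("Привет, world"))  # ['Cyrillic']
print(scripts_in("plain ASCII text"))  # []
```

A text that mixes scripts matches every range it contains, which is why script filters indicate the presence of characters rather than the main language of an item.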
Filters by Languages
The Languages filter in the Filtered Items navigator enables you to identify an item's primary language
based on its textual content rather than just the writing scripts that it contains. The language is identified by a
"majority wins" algorithm, so items containing multiple languages will be categorized by the majority content.
Nuix currently identifies 52 languages.
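The "majority wins" rule described above can be sketched as a simple vote. This Python example is an illustration only: it assumes the text has already been split into segments with a detected language for each, which is not something this guide describes, and the function name is hypothetical.

```python
from collections import Counter

def primary_language(segment_languages):
    """Return the most frequent language label, or None for an empty item.

    If two languages are tied, the one encountered first is returned.
    """
    counts = Counter(segment_languages)
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Hypothetical item: mostly English with some French and German passages.
print(primary_language(["en", "en", "fr", "en", "de"]))  # en
```

An item that is 60% English and 40% French is therefore categorised as English, and a search restricted to French will not return it.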
By default, Search History is hidden to reduce the impact of gathering the previous search data for display.
Nuix desktop can be opened with the Search History enabled by passing the following switch at start-up
through the Application Command Line:
-Dnuix.documentnavigator.removeSearchHistoryPanel=false
Note: Nuix does not save any filter settings you might have applied when you ran a search, so the items
displayed when you rerun a search may vary.
Searches are named using the criteria you used in the query, such as raptor AND
ethical or comment:"research". Click the search query in the tree to rerun it.
Expand the nodes in the tree by clicking on the plus sign, and collapse them by clicking on the
minus sign.
At the top of the navigator, show or hide this section by clicking on the double-arrow icon in the
blue title bar.
View the entire tree left to right by using the scroll bar at the bottom.
Search Bar
The Search bar, located at the top of the Workbench tab, provides you with a tool for performing both
simple and complex searches against the evidence set. Searches will run against items that match any
existing filters and items that are not excluded.
Note that the Date filter searches against the Nuix metadata property called Item Date. The Item Date is
defined as follows:
For emails, it is the Nuix Communications Date, which could be the Map-Client-Submit-Time, Sent
Date, or Date of the email item.
For files, it is the File Modified date, or if not present, the File Created date.
Items that don't have a date of their own are given the item date of their parent.
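The Item Date rules above amount to a short fallback chain. The following Python sketch illustrates that chain under stated assumptions: the Item class and its field names are hypothetical stand-ins, not Nuix objects, and the choice between Map-Client-Submit-Time, Sent Date, and Date is assumed to have been made already and stored in communication_date.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Item:
    """Hypothetical stand-in for a case item; not a Nuix class."""
    is_email: bool = False
    communication_date: Optional[datetime] = None
    file_modified: Optional[datetime] = None
    file_created: Optional[datetime] = None
    parent: Optional["Item"] = None


def item_date(item: Item) -> Optional[datetime]:
    """Resolve the Item Date using the fallback order described above."""
    if item.is_email:
        # Emails use the communication date.
        if item.communication_date is not None:
            return item.communication_date
    else:
        # Files use File Modified, or File Created if that is absent.
        if item.file_modified is not None:
            return item.file_modified
        if item.file_created is not None:
            return item.file_created
    # Items without a date of their own inherit the parent's item date.
    if item.parent is not None:
        return item_date(item.parent)
    return None


email = Item(is_email=True, communication_date=datetime(2011, 5, 2, 9, 30))
attachment = Item(parent=email)  # no dates of its own
print(item_date(attachment))
```

In this sketch the undated attachment reports the date of the email that contains it, which is why a date-filtered search can return child items that carry no date metadata themselves.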
CONTROL DESCRIPTION
Previous and Next buttons Moves backwards and forwards through the searches already performed in the
currently open session of Nuix. Searches performed prior to the current session are not available.
When you use these buttons, Nuix automatically runs the search and the items in
the Results pane update.
Search text field Free text field into which you can type or paste a search query. The Search field can
contain millions of characters.
Date filter The date filter offers four options that you can use in conjunction with the calendar
controls: Between, Not Between, After, and Before. By default, searches are set to the
option No date filter.
Calendar controls Two calendar controls allow you to specify one or two dates in time to use in conjunction
with the Date filter, including year, month, and day. Click the drop-down arrow to select a
date using the visual calendar tool or type in the date you want to use in the field.
Clear button Clears the Search field and all filters, and sets all search criteria back to the default
settings.
Advanced button Shows the Advanced Query Builder tool for building more complex queries without needing
to know specific Nuix or Lucene search syntax.
For more information, see the Search section and further details about searching for items by date.
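The four Date filter options can be read as simple comparisons against the Item Date. The Python sketch below is illustrative only: the function name is hypothetical, and whether Nuix treats the boundary dates as inclusive or exclusive is not stated in this guide, so the choices made here (inclusive for Between, exclusive for After and Before) are assumptions.

```python
from datetime import date


def matches_date_filter(item_date, mode, first=None, second=None):
    """Return True if item_date passes the chosen Date filter option."""
    if mode == "No date filter":
        return True
    if item_date is None:
        return False  # assumption: undated items never match a date filter
    if mode == "Between":
        return first <= item_date <= second        # inclusive bounds assumed
    if mode == "Not Between":
        return not (first <= item_date <= second)
    if mode == "After":
        return item_date > first                   # exclusive bound assumed
    if mode == "Before":
        return item_date < first                   # exclusive bound assumed
    raise ValueError(f"unknown date filter option: {mode}")


print(matches_date_filter(date(2012, 5, 1), "Between",
                          date(2012, 1, 1), date(2012, 12, 31)))
```

Between and Not Between use both calendar controls, while After and Before use only the first.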
CONTROL DESCRIPTION
Search Criterion Filter Lets you type the first letters of the search criterion you are looking for and finds it
in the list box.
Search Criterion List Box Lists the types of criteria you can use to build a search expression. The associated
options for each criterion display in step two, to the right. You can use as many of these
criteria as you wish in your query by adding them to the search expression one at a time.
Keywords: All of these words Allows you to type in terms and phrases to use in the search in the associated
free-text field on the right. The search returns only items that match all of the terms listed.
Keywords: Any of these words Allows you to type in terms and phrases to use in the search in the associated
free-text field on the right. The search returns items that match any of the terms listed.
Keywords: None of these words Allows you to type in terms and phrases to use in the search in the associated
free-text field on the right. The search returns only items that do not include the terms listed.
File size Allows you to specify a minimum and maximum numerical file size to use in the search in
the associated fields on the right. You must enter a value for both fields. File sizes are
measured in bytes and use the Nuix Digest Input Size.
File type Allows you to specify one or more file type(s) to search for in the associated list box on
the right. Type into the filter control to go directly to a particular file type or file extension,
or browse through the list of file types to find and select file extensions to include in your
search expression. The file types you can choose from include application, audio,
filesystem, image, message, server, text, and video. The list only includes the file
extensions registered on the local system. You can select as many file types as you wish.
Tags Allows you to select from the list of tags that exist in the case and match items that have
the selected tag(s) applied to them. You can choose to match items with any, all, or none
of the tags chosen with the drop-down control at the top of the list box.
Comments Allows you to specify a text string in the associated free-text field on the right. The search
returns only those items that include the string in the Nuix Comment field.
Item Sets Allows you to select an item set from your list of item sets.
Production Sets Allows you to select a production set from your list of production sets.
Document ID Allows you to enter text to match items where the text matches a Document ID.
Add to Expression Button Adds the criteria you selected in steps one and two to the search expression, which
displays in the Expression table. You must click this button each time you complete step
two to add the expression to the query.
Expression Table Displays each rule, or expression, as you add it. This collection of rules makes up the
search query. You can choose to match all or any of the rules in the table, via the
drop-down control at the top right of the table.
Edit button After selecting an expression in the table, allows you to edit that rule by loading the
criteria you entered in steps one and two.
Clear All Clears all of the expressions from the Expression table.
You can hide or show the Advanced Query Builder by clicking the Advanced button at any time. The
corresponding search syntax for the expressions you specified in the tool is displayed in the Search bar.
For more information on performing an advanced search, refer to the Performing Advanced Search page.
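To make the relationship between the builder's rules and the resulting query text more concrete, the following Python sketch composes a Lucene-style query from keyword rules. It is an illustration under assumptions, not the syntax Nuix actually writes into the Search bar: the function names are hypothetical, and only the AND operator and quoted phrases appear as examples elsewhere in this guide.

```python
def keyword_clause(mode, words):
    """Build one clause from an All / Any / None of these words rule."""
    # Multi-word phrases are quoted so they are matched as a unit.
    terms = [f'"{w}"' if " " in w else w for w in words]
    if mode == "all":
        return "(" + " AND ".join(terms) + ")"
    if mode == "any":
        return "(" + " OR ".join(terms) + ")"
    if mode == "none":
        return "NOT (" + " OR ".join(terms) + ")"
    raise ValueError(f"unknown keyword mode: {mode}")


def combine(clauses, match="all"):
    """Join the rules in the Expression table: match all or any of them."""
    joiner = " AND " if match == "all" else " OR "
    return joiner.join(clauses)


rules = [
    keyword_clause("all", ["raptor", "ethical"]),
    keyword_clause("none", ["draft"]),
]
print(combine(rules, match="all"))
```

Each call to keyword_clause corresponds to one click of the Add to Expression button, and combine mirrors the drop-down control above the Expression table.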
Results Pane
The Results pane, located within the Workbench tab, displays a list of the items that match your selected
CONTROL DESCRIPTION
View By Sets which view to use to show the items that match the selected criteria (i.e., the items
currently in the result set). Views include: Results, Thumbnails, Word List, Statistics,
Addresses, Event Map, Shingles, Entities and Network (some views only become
visible if previously selected in the pre-processing options).
Hide immaterial items Suppresses items that are not included in a legal export. Immaterial items are those
items that are extracted for forensic completeness, but do not necessarily have intrinsic
value in a legal context.
Deduplicate results Filters the items in the result set by MD5 hash to show only one copy of an item if it has
duplicates. It is also possible to deduplicate at custodian level, which means that
deduplication only removes duplicate items within the same custodian's data.
Selecting this option increases the amount of time it takes to load a view.
Select None to view items without deduplication.
View area Displays the items or data in the format of the view you selected in the View by control.
The columns in the default Results view can be changed by right-clicking on a column
header and choosing from one of the available options.
Add Tags Opens the Add Tags dialogue so that you can apply tags to the selected items. This
button is enabled when you select items in the result set in the Results, Thumbnails,
and Addresses views.
Exclude Items Opens the Exclude Items dialogue so that you can exclude the selected items. This
button is enabled when you select items in the Results, Thumbnails, and Addresses
views.
Export Allows you to select from a variety of export options and opens the corresponding
dialogue. Export options include exporting by view, items, case subset, annotations,
digest list, and legal export to a load file.
You can change the view in the Results pane to display and interact with the data in different ways. The
following topics explain how to interact with the various views using the available controls.
COMMAND DESCRIPTION
Choose Column Profile Lists the metadata profiles that you can use to change the metadata values that display in
the Results table view.
Metadata Name Column: Sort Ascending Sorts the items in the column, starting with items that start with special
characters, followed by items that start with numbers beginning with zero, and lastly items in alphabetical
order beginning with the letter A.
Metadata Name Column: Sort Descending Sorts the items in the column, starting with items in reverse alphabetical
order that start with the letter Z, followed by items that start with numbers beginning with the highest
number first, and lastly items with special characters in reverse order.
Metadata Name Column: Finds and displays all of the unique values in a given column. Each row is a unique record,
and no parsing is performed within any field. The results of the Distinct Values calculation
Metadata Name Column: Compute Column Sum Totals all of the numerical values in a given column. Primarily for
use with metadata whose values range in size, such as Digest Input Size.
Reset Sort Order Resets the column to the Nuix default sort order, which is the order in which the documents
were displayed when the search or filter operation was performed.
To highlight a single row and display the item in the Preview pane, single-click on the row with the
mouse or use the up or down arrows on your keyboard.
To select one or more highlighted items in the list, press the space bar.
To select all visible rows in the Results view, select the check box at the top of the table, or use Ctrl
+ A on the keyboard to select all visible rows in the Results, Word List, Statistics, History, and
Thumbnail views.
To clear all visible rows in the Results view, clear (deselect) the checkbox at the top of the table, or
use Ctrl + Shift + A on the keyboard to clear all visible rows in the Results, Word List, Statistics,
History, and Thumbnail views.
To highlight contiguous rows of items, single-click an item and drag the mouse down or up to select
additional rows, or select the first item, hold the Shift key, and then select the last item to highlight
all rows in between.
To highlight non-contiguous rows, select the first item and press the Shift + Ctrl keys while
selecting additional rows.
A right-click on any row or rows displays a context-sensitive set of commands. Some commands are only
available if the item is selected (that is, the checkbox on that row is selected).
COMMAND DESCRIPTION
Copy Copies the selected rows to the clipboard. Includes just the metadata displayed by the
current metadata profile.
Copy Value Copies the value of the selected cell to the clipboard.
Add Tags Adds tags to selected items, including to items in the associated family and/or duplicates.
Remove Tags Removes a tag from the selected item(s), including from items in the associated family
and/or duplicates.
Add to Review Job Adds the selected items to an existing Fast Review job.
Remove from Review Job Removes the selected items from an existing Fast Review job.
Remove from Review Job Removes the selected items from an existing Fast Review job, including items in the
associated family.
Scan for new Child Items Scans for new child items and processes only the new items found, placing them in the
correct location within the data tree.
Assign Custodian Adds the selected items to a new or existing custodian with options to include associated
family items.
Unassign Custodian Removes the selected items from the selected custodian with options to include associated
family items.
Add to Production Set Adds the selected items to a new or existing Production Set with options for numbering,
deduplication and including associated family items.
Remove from Production Set Removes the selected items from an existing Production Set.
Exclude Items Excludes items from being available for further case activity. This suppresses the items
within the data set, including items in the associated family and/or duplicates.
Populate Stores Allows the regeneration of both the binary natives store and the PDF image store with
options to format the PDF images on generation.
Add to Item Set Adds the selected items to a new or existing Item Set.
Sample Items Chooses a sample from the selected items and displays the sample in a new workbench
tab.
Cluster Runs Provides options to generate the groups of chained near-duplicate clusters that an item
belongs to, and to remove clusters.
Automatic Classifiers Provides options to create an automatic classifier; build, export, and import models; add
training items; automatically classify items.
Bulk Redactions Processes bulk redaction by selecting the word list, creating a new markup set or using an
existing markup set, and specifying the required PDF settings for the redaction.
Show All Descendants * Finds all child items for the selected items.
Show All Families Finds the highest-level ancestors and all child items for the selected items, including the
items themselves with the results.
Show All Near Duplicates * Finds all items considered to be near duplicates of the selected items.
Single-click on an item to highlight the item and have it displayed in the Preview pane.
A right-click on any thumbnail image displays a context-sensitive set of commands.
COMMAND DESCRIPTION
Copy Copies the selected item to the clipboard.
Add to Review Job Adds the selected items to an existing Fast Review job.
Remove from Review Job Removes the selected items from an existing Fast Review job.
Assign Custodian Adds the selected items to a new or existing custodian with options to include associated
family items.
Unassign Custodian Removes the selected items from the selected custodian with options to include associated
family items.
Show All Descendants * Finds all child items for the selected items.
Show All Top-level Items * Finds the highest-level ancestors for the selected items.
Show All Families Finds the highest-level ancestors and all child items for the selected items, including the
items themselves with the results.
To re-sort the rows, toggling between ascending and descending order, single-click on a column
header.
View hits for just ASCII words, numbers in the text, non-ASCII words, words that are an Atypical
Length or all terms found.
Word lists can be created from searching within the content of the documents selected or from just
the properties of the selected documents.
Use the filter to narrow down the returned results in the word list.
To view the items that include a specific word in the list, double-click on the row to create a
new Workbench tab displaying those items in a new result set.
Show the entities found within the data set grouped by entities found.
Filter the results by typing the text to be matched in the entities, such as a particular company or card
type, to quickly narrow the results.
To re-sort the rows, toggling between ascending and descending order, single-click on a column
header.
To view the items associated with one of the file types, double-click on the row to create a
new Workbench tab displaying those items in a new result set.
To export the view, select the Export button. For more information about exporting views,
see Exporting Information from a View.
You can filter the items that display in the Networks diagram by selecting the following options:
Run Layout - Freezes or unfreezes the automatic placement of the nodes in the diagram. When
selected, the diagram is active and works to display the nodes in the most readable layout for
Immaterial items are those items that are extracted for forensic completeness, but do not necessarily have
intrinsic value in a legal context. Additionally, these items are not exported as part of a legal export and are
not included in the total size calculation for audited licenses.
Immaterial items include:
4. Across the entire case, search for -flag:audited. This returns all of the immaterial items.
Starting with version 2.20, the copy that appears in the result set (the "original"), is the earliest item
in the evidence tree as seen in the browser view. This ensures that each time a duplicate is
removed the exact same item is always displayed/exported as part of the result set. Prior to version
2.20, preference was given to items that contained comments or classifications.
SHA-1 and SHA-256 hashes are only calculated for reference purposes. They are not used as part
of the duplicate determination.
For additional detail on how duplicates are removed during the export process, see the Legal Export
option, Export items.
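The deduplication behaviour described above (MD5 as the only key, the earliest item in the evidence tree kept as the "original", and an optional custodian-level scope) can be sketched as follows. This is a minimal Python illustration, not Nuix code: the dictionary keys, including the numeric position used to represent order in the evidence tree, are hypothetical.

```python
def deduplicate(items, per_custodian=False):
    """Keep one item per MD5 hash, preferring the earliest tree position.

    items: dicts with hypothetical 'md5', 'custodian' and 'position' keys.
    """
    seen = set()
    originals = []
    # Visiting items in evidence-tree order means the first copy seen is
    # always the same one, so results are repeatable between runs.
    for item in sorted(items, key=lambda i: i["position"]):
        if per_custodian:
            key = (item["custodian"], item["md5"])
        else:
            key = item["md5"]
        if key not in seen:
            seen.add(key)
            originals.append(item)
    return originals


items = [
    {"name": "b.doc", "md5": "aaa", "custodian": "Smith", "position": 2},
    {"name": "a.doc", "md5": "aaa", "custodian": "Jones", "position": 1},
    {"name": "c.doc", "md5": "bbb", "custodian": "Smith", "position": 3},
]
print([i["name"] for i in deduplicate(items)])
print([i["name"] for i in deduplicate(items, per_custodian=True)])
```

With case-wide deduplication only a.doc and c.doc remain; at custodian level b.doc also survives because its duplicate belongs to a different custodian.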
Preview Pane
The Preview pane, located on the Workbench tab, is comprised of information and tools that allow you to
view the item itself, the metadata associated with the item and additional information to help analyse the
context of the item.
A toolbar at the top of the pane allows you to navigate between items, apply or edit comments, and
view the item natively.
An area with contextual information about the item, such as its source path and similar or related
items.
A set of tabs that present details about the item, such as the item's textual or image content and
associated metadata.
Preview Toolbar
A toolbar at the top of the Preview pane allows you to navigate between items, apply or edit comments, and
view the item natively.
Previous Item - Select the left arrow to preview the previous item in the result set.
Next Item - Select the right arrow icon to preview the next item in the result set.
Item Name - Displays the Subject line of an email or the file name for all other item types.
Comment - Opens the Edit Comment dialogue, allowing you to enter or edit a comment associated
with the item being previewed. You can search for the text entered in a comment field.
Save As... - Saves the current item outside of the case. The default file type is selected
depending on the item.
Launch - Opens the item in its native application if the application is installed on your system.
When you create a case, selecting the option Store binary of data items will decrease the
amount of time it takes to open an item natively.
Path - The complete, hierarchical path that shows all parent items for the item being previewed.
You can view the items within the path by clicking any folder link, which opens a new results set.
Duplicates - Shows items that are Exact and Near duplicates of the item being previewed. Exact
duplicate items are items with the same MD5 Hash value as the item being previewed. Near
duplicate items are items that have a similarity resemblance that is equal to or above the
resemblance threshold set in Global Options.
Similar items - Shows items that are like the item being previewed. The High (90%+ similar),
Medium (70%+ similar), and Low (50%+ similar) categories group like items by looking at the name
of the item, the MD5 Hash value, and all words over six letters long that are the same.
Clusters - Shows the related clusters.
Related items - Shows the items that are a part of the same conversation thread as the item being
previewed. An email thread is a series of emails that have been sent, forwarded, copied, and
received, beginning with the first related communication. You can use the Event Map view to see
who was involved with an email thread over time. Related items is only visible when dealing with
email items.
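The thresholds mentioned for Similar items and Near duplicates can be expressed as a small classification function. The sketch below is in Python and assumes a similarity score between 0.0 and 1.0 has already been computed; how Nuix derives that score is not reproduced here, and the function names are hypothetical.

```python
def similarity_category(score):
    """Map a similarity score (0.0 to 1.0) to the categories listed above."""
    if score >= 0.90:
        return "High"
    if score >= 0.70:
        return "Medium"
    if score >= 0.50:
        return "Low"
    return None  # below 50%: not listed as a similar item


def is_near_duplicate(resemblance, threshold):
    """Near duplicates meet or exceed the threshold set in Global Options."""
    return resemblance >= threshold


print(similarity_category(0.93))
print(similarity_category(0.72))
print(is_near_duplicate(0.85, threshold=0.80))
```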
Each tab in the Preview pane presents horizontal or vertical scroll bars if the content does not fit in the
viewing area.
By default, the Review and Tag pane is located below the Preview pane, and can be popped out of the
window frame and/or resized within the Workbench window as necessary.
A toolbar at the top of the pane allows you to navigate between items and edit the tags for the case
(add, remove, or rename).
A tagging grid that displays the tags that have been assigned to numerical values on the keyboard,
as well as tagging options.
A tree view of all tags in the case, showing any hierarchical relationships (nested tags).
You can adjust the width of the tagging grid or the tag tree area by right-clicking the horizontal dotted line
between the two areas and moving the divider to the left or right.
Note: Tags are only applied to the item that is actively selected and displayed as part of the Review and Tag
pane header. You cannot use the Review and Tag pane to bulk tag multiple selected items.
Previous Item - Select the left arrow to tag the previous item in the result set.
Next Item - Select the right arrow icon to tag the next item in the result set.
Item Name - Displays the Subject line of an email or the file name for all other item types.
Edit Tags - Allows you to add, remove, or rename case tags. You must create and manage all tags
from this dialogue box. A hierarchical structure allows for organizing tags into groups.
Tagging Grid
Below the Review and Tag pane toolbar is a tagging grid where you can assign tags you have created for
the case to numerical keyboard values. This allows you to tag quickly from the keyboard without using the
mouse.
After creating tags for the case, drag and drop a tag from the tag tree to an empty position on the tagging
grid. The grid displays the name of the tag and the numerical value to use to apply that tag to items. For
example, if you drag a tag named Responsive to the top left position on the tagging grid, the numerical
hotkey for that tag becomes seven (7). Once an item is selected in a result set, pressing 7 on your keyboard
tags that item as Responsive. Pressing the number 7 again removes the tag.
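The hotkey behaviour can be modelled in a few lines of Python. The only mapping stated in this guide is that the top left position corresponds to 7; the sketch assumes the rest of the grid follows a standard three-by-three numeric keypad layout, and the function names are hypothetical.

```python
def hotkey_for(row, col):
    """Return the hotkey for a grid cell (row 0 is the top, col 0 the left).

    Assumes a 3x3 grid laid out like a numeric keypad: 7 8 9 / 4 5 6 / 1 2 3.
    """
    if not (0 <= row <= 2 and 0 <= col <= 2):
        raise ValueError("grid position out of range")
    return 7 - 3 * row + col


def press(tags, hotkey, grid):
    """Toggle the tag assigned to a hotkey: apply it, or remove it if present."""
    tag = grid.get(hotkey)
    if tag is None:
        return set(tags)       # no tag assigned to this key
    return set(tags) ^ {tag}   # symmetric difference toggles membership


grid = {hotkey_for(0, 0): "Responsive"}
tags = press(set(), 7, grid)
print(tags)
print(press(tags, 7, grid))
```

The second press returns an empty set, matching the description that pressing the same number again removes the tag.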
The tagging grid provides two options to use while tagging items that allow you to apply tags to all items in
the same family or to duplicate items. For this feature to work, either option must be selected prior to
applying the tag(s) to the item. Tags applied to items prior to selecting these options are not propagated to
family or duplicate items.
Apply same tags to all family items (#) - Applies the tag(s) you select to the current item as well
as all items in the family, including duplicate items. When selected, the Previous and Next arrows
in the toolbar advance by family instead of by individual item.
Apply same tags to all duplicate items (#) - Applies the tag(s) you select to the current item as
well as any duplicate items.
The following operations can be performed from the keyboard to review and tag items in a result set:
To move vertically through a result set, with the Results pane in focus use the Up and Down arrow
keys.
Press the hotkey numbers assigned to your case tags to tag items. You can apply multiple
tags by pressing multiple numbers in succession.
Tag Tree
The tag tree in the Review and Tag pane is a visual representation of the tags in the case, which you can
use to apply tags or populate the tagging grid. If you have more than nine tags to use in the case and all the
hotkeys in the tagging grid have been assigned, you can apply tags by selecting them in the tree.
Drag a tag onto an empty position on the tagging grid to assign it a numerical hotkey.
Select one or more items in the result set and tag them by clicking the blue checkbox for a tag in
the tree. You can select multiple tags in the tree to apply tags to the selected items.
Statistics Tab
The Statistics tab offers an itemised listing of all file types processed in the case and their respective
frequency within the dataset, including a listing of the raw file extensions found and any files classified as
irregular files. The Statistics tab offers a good overview of the items in the case and should be carefully
reviewed after you load data into a new case and subsequently each time you add evidence to a case. Open
a new Statistics tab by going to Reports > New Statistics Tab.
Processed Files - Shows statistics (processed, corrupted, encrypted, and deleted) by file type,
including percentage of that file type within all items processed. The Processed Files section
includes the files marked as irregular files.
Raw File Extensions - Shows how many files with each file extension were found among the raw
ingested files.
Irregular Files - Shows how many of the processed items were marked irregular, and the
percentage of each irregular file type within all items marked as irregular. Files listed as Irregular
are still represented in the Processed Files section; the Irregular Files designation is simply an
additional attribute associated with the item.
Notes:
Nuix does not rely on the item's extension to determine its file type. Nuix checks the contents of the
file to ensure it accurately associates the file type. This eliminates the chance to hide evidence
simply by changing the file extension.
The Statistics tab differs from the View by: Statistics feature in the Results pane. While the
Statistics tab shows information about all case evidence, the latter view only shows information
about the items in a given result set.
File Type - Lists all of the file types encountered during the ingestion process.
Processed - Lists the total number of items processed for the specific file type.
Corrupted - Lists the total number of items that Nuix was unable to process, or found to be
corrupted for a specific file type.
Encrypted - Lists the total number of items that Nuix detected as encrypted.
Deleted - Lists the total number of permanently deleted items found in Microsoft mail container
formats for a specific file type.
Percentage Encountered - Lists the percentage, by item count, of the total dataset consumed by
the specific file type.
Statistics for raw file extensions include:
Raw File Extension - Lists all of the file extensions of the raw evidence encountered during the
ingestion process.
Processed - Lists the total number of items processed for the specific raw file extension.
Percentage Encountered - Lists the percentage, by item count, of the total dataset consumed by
the specific raw file extension.
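Percentage Encountered is calculated by item count rather than by size. A minimal Python sketch of that calculation, using an invented list of file types and a hypothetical function name, looks like this:

```python
from collections import Counter


def percentage_encountered(file_types):
    """Return each type's share of the total item count, as a percentage."""
    counts = Counter(file_types)
    total = sum(counts.values())
    return {ftype: 100.0 * n / total for ftype, n in counts.items()}


# Invented sample: one entry per processed item.
processed = ["pdf", "pdf", "msg", "doc"]
print(percentage_encountered(processed))
```

Because the figure is count-based, a handful of very large files can represent a small percentage even when they account for most of the data volume.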
Types of irregular files include:
Text Stripped - Items where Nuix recognized the file type, but does not have a routine to cleanly
extract all text and metadata in accordance with the file type's API. This results in an item that is
searchable, but the text may be garbled or improperly formatted.
Unrecognised - Items where Nuix did not recognise the header and was therefore unable to assign
Open a result set containing items for a specific file type by double-clicking on any row in the
Statistics tab.
Sort a column in ascending or descending order by single-clicking in the column header. The
default is ascending.
Export the Statistics view by using File > Export > Export View.
Note: The Statistics Tab is for the entire case, and does not honor Excluded Items filters.
Use the drop-down list on the upper left to display words by a word list. The default setting is ASCII
Words, which displays a list of all ASCII-based words in the data set. You can import a text
file containing a custom word list to scope the listing on this tab to only those words that are of interest.
Use the drop-down to search across the content of all items or just the properties of all items.
Type one or more characters into the Filter text box on the upper right to filter the list
that displays to match your entry. This filter is anchored at the beginning of the word, so
"ranteed" will not show "guaranteed". The filter supports numbers, letters and symbols.
Open a result set containing only the items that include a specific word by double-clicking the row
that contains the word.
Notes:
Nuix views a word as any item that is surrounded by white space, so 24014 is considered a word.
From a practical perspective this could be gibberish or it could be a critical zip code.
All words are listed, including all character sets and symbols.
A script is available within the Knowledge Base that can be used to remove all alphanumeric
entries from an exported list.
The Word List tab is for the entire case, and does not honor Excluded Items.
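Two behaviours described for the Word List tab, splitting on white space and filtering from the start of each word, can be illustrated with a short Python sketch. The function names are hypothetical, and treating the filter as case-sensitive is an assumption that this guide does not confirm.

```python
def word_list(text):
    """Split text on white space and return the distinct words, sorted."""
    return sorted(set(text.split()))


def filter_words(words, prefix):
    """Keep only words that begin with the typed characters."""
    # Case-sensitive matching is an assumption made for this sketch.
    return [w for w in words if w.startswith(prefix)]


words = word_list("Delivery guaranteed to 24014 guaranteed")
print(words)
print(filter_words(words, "guar"))
print(filter_words(words, "ranteed"))
```

The last call returns an empty list because the filter matches only at the beginning of a word, while 24014 appears in the word list because any token surrounded by white space counts as a word.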
Addresses Tab
Show the results grouped by domain group or expanded to show all addresses.
Filter the results by type of correspondence sent to only show emails From, To, CC, or BCC.
Find matching results to a particular domain group or user by using the Find function from
the Edit menu.
Open a result set containing only the items that include a specific word by double-clicking the row
that contains the word.
Notes:
Nuix views an address as any item that is removed from the transport headers of email items.
From a practical perspective this could be just a name from an address book or a fully resolved
email address.
The Addresses tab is for the entire case, and does not honor Excluded Items unlike the View by
Address view from the Results pane.
History Tab
The History List tab provides a log of a variety of events and user actions in the case, such as when the
case was opened, searches that were performed, when and who annotated items, and the like. Timestamps
Case Opened - Records the version of the Nuix application that opened the case in the Details of
Event.
Case Closed - Records the version of the Nuix application that opened the case in the Details of
Event.
Load Data - Records that data was loaded in the Details of the Event.
Search - Records the search parameters that were used and the number of results that were
returned.
Annotation - Records that an annotation was applied, including the specific annotation.
Import - Records that a PDF was imported.
Export - Records that an export was performed.
Script Run - Records that a script was run.
For each event, the following information is logged:
Re-run a specific search query by double-clicking on a Search event. A new Workbench tab
displays showing the results of the query against the current data set. This is not a memorialized
result set, so if new evidence is added to a case, the number of results will reflect the new
evidence.
Filter the History results by the type of event performed, by the user that performed the event or by
the date the event was performed.
Sort the columns in ascending or descending order by single-clicking on a column header. The
default order is ascending.
Export the contents of the History tab by using File > Export > Export View.
Available Review Jobs - Displays a list of all existing jobs and their status, along with functions for
creating a job, editing a job, deleting a job, and joining a job.
User and Tag Statistics - Displays statistics for each user (reviewer), including total number of
items tagged and totals by individual tag. Displays the total number of each tag applied to items in
the job, as well as totals by the various combinations of tags used (such as Responsive AND
Privileged).
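The two kinds of totals described for tag statistics, per individual tag and per combination of tags, can be sketched in Python as follows. The input structure (a mapping from item identifier to the set of tags applied) and the function name are hypothetical and serve only to illustrate the counting.

```python
from collections import Counter


def tag_statistics(item_tags):
    """Count items per tag and per combination of tags.

    item_tags: dict mapping an item identifier to the set of tags applied.
    """
    per_tag = Counter()
    per_combination = Counter()
    for tags in item_tags.values():
        per_tag.update(tags)
        if tags:
            # Sort so the same combination always produces the same label.
            per_combination[" AND ".join(sorted(tags))] += 1
    return per_tag, per_combination


job = {
    "item-1": {"Responsive", "Privileged"},
    "item-2": {"Responsive"},
    "item-3": {"Responsive", "Privileged"},
}
per_tag, per_combination = tag_statistics(job)
print(per_tag["Responsive"], per_tag["Privileged"])
print(per_combination["Privileged AND Responsive"])
```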
Once a reviewer joins a review job, Nuix presents each item in the job in succession for tagging via a
new Workbench tab. Nuix always groups items by family in the result set.
New job - Create a new review job, including which tags to use and words to automatically highlight
in the items.
Edit job - Modify properties of the review job, including its name, order of item assignment,
associated tags, and highlighted words.
Delete job - Delete a review job.
Join job - Join a review job to review and tag the items in the job.
The User Statistics tab provides a detailed breakdown of each reviewer's activity, including:
The Review and Tag pane displays the tagging palette for applying tags. In this workflow, only one item or
family can be tagged at a time. You must tag all items in one family before you can advance to the next
family of items. The green Next Family arrow displays in the Review and Tag toolbar after all items in a
family are tagged. The yellow Previous and Next arrows navigate between items within a single family.
Dialog Boxes
Nuix 4 manages a wide array of tasks with dialog boxes. Not all dialog boxes are documented in this section.
Some are covered thoroughly in the topics covering the tasks the dialog boxes support, and some are
undocumented as they are consistent with commonly used Microsoft dialog boxes (such as Save, Open, and
Print).
When you add tags, you can also choose to add them in bulk by selecting one of the following options:
Also include all items in the same family (#) - Applies the tag(s) you select to the current item as
well as all items in the family, including duplicate items.
Also include all duplicate items (#) - Applies the tag(s) you select to the current item as well as
any duplicate items.
You can set up the tags that display in this list from the Edit Tags link in the Review and Tag pane on
the Workbench tab, or you can create them from this dialog box while working with the data.
Investigation settings include the Investigation time zone, which sets the base time zone used for
investigations. You can search and review Event Maps in the desired time zone. Nuix stores all time stamp
data in system time, but displays dates and times according to the time zone set in this field. This ensures all
event maps progress linearly through time, and eliminates the complexity of managing communications from
different time zones.
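The idea of storing time stamps against one reference clock and converting only for display can be sketched as follows. This is a minimal illustration, not Nuix code: it assumes the stored reference is UTC, and the fixed UTC+10 investigation time zone and the `display_time` helper are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical investigation time zone: a fixed UTC+10:00 offset.
INVESTIGATION_TZ = timezone(timedelta(hours=10), name="UTC+10")


def display_time(stored, tz=INVESTIGATION_TZ):
    """Render a stored, timezone-aware time stamp in the investigation time zone."""
    return stored.astimezone(tz).strftime("%Y-%m-%d %H:%M %Z")


# Two messages stored against the same reference clock (assumed to be UTC here).
sent = datetime(2013, 2, 1, 23, 30, tzinfo=timezone.utc)
reply = datetime(2013, 2, 2, 0, 15, tzinfo=timezone.utc)

print(display_time(sent))   # 2013-02-02 09:30 UTC+10
print(display_time(reply))  # 2013-02-02 10:15 UTC+10
```

Because both values are converted with the same offset, the reply still appears after the original message, whichever time zone each sender was in.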
You can also choose to exclude items in bulk by selecting one of the following options:
Also exclude all items in the same family (#) - Excludes the item(s) you select as well as all
items in the family, including duplicate items.
Also exclude all duplicate items (#) - Excludes the item(s) you select as well as any duplicate
items.
From an existing case, you can use one of the following methods to open this dialog box:
From the menu, File > Export > Export Case Subset.
From the Results pane, click the Export button and select Export Case Subset and the load file
format.
Once items are selected, right-click over them and select Export > Export Case Subset.
Note: Nuix only exports the selected items and any of the necessary parent document records to complete
the evidence hierarchy. Nuix does not gather families and export those. To include entire families in the
Case Subset, you will need to ensure Include Family Members is checked or that all family items are found
prior to exporting.
Name - Name of the case subset. The value defaults to the name of the parent case appended
with "-Export #".
Directory - Directory where you want Nuix to export the case. The value defaults to the root folder
of the last export.
Investigator - The actively logged in user.
Description - Description of the case subset used solely for informational purposes.
Number of Indexes - Allows new consolidated indexes to be created for faster searching once
exported. For optimal searching, reduce the number of indexes to the number of workers being
used on that case plus 2.
Note: The case subset adopts all of the parent case's ingestion processing settings. The most important
setting to note is the "Store binary of items" option. If the store binary option was not selected when the case
was processed originally, the case subset will not contain the binary. If it was selected, then the case subset
will contain the binary. This is important if the case subset is to be transported, as the path to the original
source will most likely not be available after transport.
Note: Nuix does not reprocess the source data when it creates a case subset; instead, it only re-indexes the
previously extracted text.
Annotation Settings
The Annotation settings in the Export Case Subset dialog box determine if user-defined comments, tags,
custodians, item sets and production sets from the parent case are to be included in the case subset. By
default, all options are selected.
Item - Shows the progress at an individual item level. Some items are compound items and can
take a considerable amount of time.
Total - Shows the progress of the entire export.
Failed Items - Number of items that failed to export.
Note: Creating case exports from cases without the stored binary is very fast. Cases with stored binary
can take considerably longer because all of the binary data needs to be copied.
Another dialog box, Export Results, follows this one indicating the number of items that were successfully
exported.
Create new list named - Creates a new list and saves it as a binary file in
the Nuix\Digests directory.
Merge with existing list - Appends the current highlighted results to an existing digest list.
Digest lists are stored locally and can be moved or copied to other workstations via the digest
import feature.
You can use one of the following methods to open this dialog box:
From the menu, File > Export > Export Digest List.
From the Results pane, click the Export button and select Export Digest List and the load file
format.
Once items are selected, right-click over them and select Export > Export Digest List.
Note: The Exporting Items feature is available to all export enabled licences.
Exported Files
Export directory
Defines the root path to where you want Nuix to export the data. The Save dialog box defaults to the
previously defined location. Once a valid, empty directory has been selected the destination directory
warning will change to a green tick icon.
Export items
Defines the items to export. Selected items only will export only the items selected for export with no family
items. Top level items only will export the immediate top level item for the selected items for export only.
Export messages as
Sets what format to use when exporting the email items.
The Native, EML, and MSG options will export individual items to the export folder. The MBOX, PST, and
NSF options will export a single email container with all of the items.
Individual file options:
Export PDF
Option will export a PDF copy of each item selected for export.
Export Thumbnails
Option will export Thumbnails of any images selected for export where thumbnails were created when
originally processed.
Notes:
Prior to performing a legal export, ensure that the target file system has sufficient disk space for the
export.
Nuix strongly recommends that all exports be performed to local disk. Nuix does not recommend
exporting to a mapped drive, a UNC share or an externally attached hard drive as these all present
severe performance limitations.
A single email container file is created at the root of the Export_Dir\Files folder called Export.xxx.
This file contains all of the selected emails.
Lotus Notes data is classified as a message/RFC822 and is exported as EML if Native is selected.
If Native is selected, most data from Microsoft Exchange EDB files is exported as EML. If all
Microsoft data is expected to be MSG, select the MSG option.
Path Options
Single directory exports all items to the root of the email container or targeted file system export
folder. This option is selected by default.
Recreate directory structure of original data sets whether Nuix recreates the entire folder structure of the
source evidence when exporting the items. This operation is applied to both email containers (PST, MBOX)
and loose files. Nuix uses the folder structure contained in the Nuix Path name field to recreate the folder
structure.
The following image shows the default result exporting all items to a single directory.
Notes:
This option does not recreate the PST exactly as it existed prior to ingestion. The directory
structure will be relative to the name of the email container that was ingested. This allows email
from multiple email containers to be exported into a single PST/NSF/MBOX file. Additionally, Nuix
limits the default PST size to 10,000 messages per archive as larger sizes are susceptible to
corruption issues. If you wish to increase the number of items stored per PST, you can launch Nuix
with the -Dnuix.export.pst.maximumMessagesPerPst=500000 command line switch.
This option is not supported for NSF files.
Reports
Item Report
Contains all of the metadata and textual content of the item. The report can only be generated in XHTML.
The item report specifically contains:
Case information, including name, time and the date the case was opened, and when the report
was generated.
All the metadata retrieved from the item.
Details of the communications activities about that file, such as who sent the item and to whom it
was sent.
The text of the document itself.
See the sample item report attached below.
Note: Item report hyperlinks will fail if you do not select the appropriate email container export formats
(PST, NSF, MBOX) in the Export messages as field.
Note: If the Retain directory structure of original data option was selected, a folder tree matching that of
the source is created underneath the Files folder. Otherwise all items are exported to the root of the Files
directory. If naming collisions occur, all duplicate names will have their name modified with an incremental
counter.
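The collision handling described in the note above can be illustrated with a short sketch. The exact form of the renamed file is not stated in this guide; the `_1`, `_2` suffix pattern and the `unique_names` helper below are assumptions made purely for illustration.

```python
from pathlib import PurePath


def unique_names(names):
    """Return file names in order, adding an incremental counter to repeats.

    The "<stem>_<n><suffix>" pattern is a hypothetical naming choice.
    """
    used = set()
    result = []
    for name in names:
        path = PurePath(name)
        candidate, counter = name, 0
        # Keep incrementing until the candidate does not clash with a prior name.
        while candidate in used:
            counter += 1
            candidate = f"{path.stem}_{counter}{path.suffix}"
        used.add(candidate)
        result.append(candidate)
    return result


print(unique_names(["report.doc", "report.doc", "notes.txt", "report.doc"]))
# ['report.doc', 'report_1.doc', 'notes.txt', 'report_2.doc']
```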
Concordance
Discovery Radar
DocuMatrix
EDRM XML
IPRO
Relativity
Ringtail
Summation
Export Type - Settings to establish how you want to export the items.
Load File Settings - Settings that are specific to the load file being created.
Numbering and Files - Settings to configure document numbering and file names.
Parallel Processing - Settings for adjusting the number of Nuix worker machines and associated
memory for tuning the performance of the export operations.
For each legal export Nuix creates several standard files:
Summary-Report.txt/xml - The summary report provides a complete report for the legal export.
Top-level-MD5-digests.txt - The Top-level-MD5-digests.txt file contains a list of all the top-level MD5
digests included in the legal export.
You can use any of the following methods to open this dialog box:
From the menu, File > Export > Legal Export to.
From the Results pane, click the Export button and select Legal Export to and the load file
format.
Once items are selected, right-click over them and select Export > Legal Export to > Load file
format.
February 2013 Nuix eDiscovery User Guide v 4.2 PAGE 100 of 390
The following sections describe each setting or option on the Export Type tab.
Export Items
Export items controls what items are exported. Options include:
Selected items only - Nuix exports only the selected items. This will not deduplicate or export the
entire family. So, if an attachment to an email is part of the result set, only the attachment is
exported. The parent email is not exported.
Selected items and descendants - Nuix exports only the selected items and any descendant
items. This option is most frequently used when you manually find the top level items and
specifically tailor the exact contents of the result.
Top-level items only - Nuix identifies the top level item for each item in the result set, and then
exports those items. If duplicate top-level items are present they will be exported as separate
items.
Top level items and descendants - Nuix identifies the top level item for each item in the result
set, and then exports those items plus their descendants. If duplicate top-level items are present
they will be exported as separate families.
Notes:
Top-level options can and will result in the export reporting a different number of items exported
than are selected in the result set. This occurs because Nuix is taking the result set, finding all
of the top-level items, and then optionally deduplicating the exported set. Review the Pre-export
summary report as well as the post-export summary report to assist in reconciling the result counts.
None - Nuix will export all items with no deduplication. The resulting exported set may contain
duplicates.
MD5 - Nuix will deduplicate across all of the top level items before exporting the results. This
ensures that a single copy of each logical, top-level item and its family is exported.
MD5 per custodian - Nuix will deduplicate the top level items within each named custodian before
exporting the results. This ensures that a single copy of each logical, top-level item and its family is
exported within each custodian. The resulting exported set may contain duplicates between
custodians.
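The difference between the three deduplication options can be sketched in a few lines. This is an illustration of the behavior described above, not Nuix's implementation; the mode names, the dictionary fields, and the sample digests are hypothetical.

```python
def deduplicate(top_level_items, mode="none"):
    """Return the top-level items kept under a deduplication mode.

    mode is "none", "md5", or "md5_per_custodian" (hypothetical names for
    the None, MD5, and MD5 per custodian options).
    """
    if mode not in ("none", "md5", "md5_per_custodian"):
        raise ValueError(f"unknown mode: {mode}")
    if mode == "none":
        return list(top_level_items)
    seen, kept = set(), []
    for item in top_level_items:
        # Global mode keys on the digest alone; per-custodian mode keys on
        # the (custodian, digest) pair.
        key = item["md5"] if mode == "md5" else (item["custodian"], item["md5"])
        if key not in seen:
            seen.add(key)
            kept.append(item)
    return kept


items = [
    {"name": "memo.msg", "custodian": "Alice", "md5": "aaa"},
    {"name": "memo copy.msg", "custodian": "Alice", "md5": "aaa"},
    {"name": "memo.msg", "custodian": "Bob", "md5": "aaa"},
    {"name": "plan.msg", "custodian": "Bob", "md5": "bbb"},
]
for mode in ("none", "md5", "md5_per_custodian"):
    print(mode, len(deduplicate(items, mode)))
# none 4
# md5 2
# md5_per_custodian 3
```

In the per-custodian case, Alice and Bob each keep one copy of the same memo, which is why the exported set may still contain duplicates between custodians.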
Sort Order
Sort Order determines the order in which the items are exported. Options include:
Default sort order (fastest) – Exports the selected items based on their internal ID. This is the
same order in which they are returned to the result set. This method is the fastest, because it does
not require any additional sorting. This option is the default setting, and is the recommended option
if you are importing this data into another review tool that has its own sorting capabilities.
Top-level document date (ascending) – Exports the selected items based on the top-level
document’s date, with the oldest document appearing first in the load file.
Top-level document date (descending) – Exports the selected items based on the top-level
document’s date, with the most recent document appearing first in the load file.
Results set order – Exports the selected items in the same order as they appear in the Results
set view.
The Native, EML, and MSG options will export individual items to the export folder. The MBOX, PST, and
NSF options will export a single email container with all of the items.
Individual file options:
A single email container file is created at the root of the Export_Dir\Files folder called Export.xxx.
This file contains all of the selected emails.
Lotus Notes data is classified as a message/RFC822 and is exported as EML if Native is selected.
If Native is selected, most data from Microsoft Exchange EDB files is exported as EML. If all
Microsoft data is expected to be MSG, select the MSG option.
Export Scheme
Export scheme provides control over how the native emails will be exported. Options include:
Leave attachments on emails - Nuix exports the parent email with all of the attachments as a single
file. It also exports each of the attachments as separate files. This allows the entire message to be
viewed as a single entity, while still maintaining the parent-child relationship of the entire family within
the Legal Export numbering scheme.
Separate attachments from emails - Nuix exports the email and all of its attachments as
separate files. This ensures that, when performing a native review, each item can be viewed as
a unique item. The Legal Export numbering scheme maintains the parent-child relationship for the
entire family of documents.
Export directory
Export directory defines the root path to where you want Nuix to export the data. The Save dialog box
defaults to the previously defined location.
Notes:
Prior to performing a legal export, ensure that the target file system has sufficient disk space for the
export.
Nuix strongly recommends that all exports be performed to local disk. Nuix does not recommend
exporting to a mapped drive, a UNC share or an externally attached hard drive as these all present
severe performance limitations.
Regenerate natives
Regenerate natives populates the Nuix binary store with the native file of the selected items during the
export. This option can be used to reload the binary store if it was populated when the case was created, or
it can be used to cache files that will likely be launched in native format during a review (Excels,
PowerPoints, etc.). This option is off by default.
Note: It is generally recommended that Excel files be produced in native format. The nature of an Excel
document does not lend itself to the flat nature of a printed page. If Excel documents must be converted to
images, you should first preview the PDF rendering in the item level view before using this Export option to
ensure it meets your expectations.
Regenerate PDFs
Regenerate PDFs forces all of the PDFs in the Nuix PDF print store to be replaced with new PDFs
generated by Nuix. This option is off by default.
Note: If you have imported custom PDFs into the case, using this option will replace them.
Options include:
Show Header Divider Line - Shows or hides a black rule beneath the header text. This option is
selected by default.
Show Footer Divider Line - Shows or hides a black rule above the footer text. This option is
selected by default.
Name - An identifier for the item located in the upper left corner of the header. By default this is set
to Name, which is the subject of an email or the name of a file.
GUID - An identifier for the item located in the upper right corner of the header. By default this is
set to GUID, which is the Globally Unique Identifier used by Nuix to reference the individual item.
This ID is unique to every item, but not every page.
Produced by - An identifier for the item located in the lower left corner of the footer. By default this
is set to Produced by, which is the name of the user performing the export operation.
Bates Number - An identifier for the item located in the lower right corner of the footer. By default
this is set to Bates Number, the Document ID assigned to the file/page during the PDF/TIFF
process. This number is unique to every page of an imaged document.
Two additional options allow you to add field or custom data to the center of either the header or
the footer section.
Imaging options
Imaging options allows you to set custom rendering options that can be applied to MS Office documents.
The Imaging options link launches the Imaging Options in the Global Options dialog box.
Wrap lines in text files
Wrap lines in text files forces a text string to wrap at a certain number of characters, ensuring that all of the
text for a given document is easily viewable. This option is off by default.
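The effect of wrapping at a fixed number of characters can be sketched with Python's standard `textwrap` module. Nuix's own wrapping rules are not documented here, so the 80-character default and the `wrap_text` helper are assumptions for illustration only.

```python
import textwrap


def wrap_text(text, width=80):
    """Wrap each line of text so that no output line exceeds width characters."""
    wrapped = []
    for line in text.splitlines():
        # textwrap.wrap returns [] for an empty line, so keep blank lines explicitly.
        wrapped.extend(textwrap.wrap(line, width=width) or [""])
    return "\n".join(wrapped)


sample = "x" * 50 + "\n\nshort line"
for line in wrap_text(sample, width=20).splitlines():
    print(len(line), repr(line))
```

Note that `textwrap` also breaks words longer than the width, which matters for text files containing long unbroken strings such as encoded data.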
Load file separation - This option allows the load file only to be broken up on export into more
manageable files for loading into the final review platform. Note: Families of documents may break
across load files with the exception of Ringtail.
Metadata profiles - This option allows you to select a set of custom metadata fields to include in
the legal export load file. The drop-down list on the left contains all of the metadata profiles defined
in Nuix. The default setting is blank (no profile), which means no metadata is exported.
Manage Metadata Profiles link launches the Metadata Profile page in the Global Options dialog
box.
The following sections describe the additional options for the different load file formats.
Concordance
The Concordance load file settings tab has the following additional options -
Load file encoding sets the document encoding that is used when creating the Concordance load file
(*.dat). Older versions of Concordance are not Unicode compliant and only support ASCII characters. The
default setting is ISO-8859-1 (8-bit single-byte coded graphic character sets - Part 1: Latin alphabet). Nuix
is fully Unicode compliant and allows you to export the *.dat with any encoding. The most commonly used
encoding after ISO-8859-1 will be UTF-8. UTF-8 allows Unicode characters to be exported as part of the
load file.
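The practical difference between the encodings mentioned above can be checked with a small sketch. It uses only Python's built-in codecs; the `can_encode` helper and the sample file names are hypothetical, and nothing here reflects how Nuix writes the *.dat file internally.

```python
def can_encode(text, encoding):
    """Return True if every character in text can be written in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Hypothetical metadata values that might appear in a load file.
for value in ("Report.doc", "Résumé.doc", "東京.doc"):
    print(value, {enc: can_encode(value, enc) for enc in ("ascii", "iso-8859-1", "utf-8")})
```

Accented Western European characters fit in ISO-8859-1, while characters outside that set, such as Japanese, require UTF-8 or another Unicode encoding.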
EDRM
The EDRM load file settings tab has the following additional options -
Export to option allows exporting to either of two EDRM legal XML formats. Choose from version 1.01 or
version 1.1.
For more information regarding the EDRM legal XML format, please see the EDRM site.
Relativity
The Relativity load file settings tab has the following additional options -
Server Settings section allows for the details of your Relativity instance to be entered and a connection to
be established.
Select Version allows you to identify the version of Relativity you would like to export to and
determines what remaining server settings are required to enable direct export. Choose from
version 7.4, version 6.6 - 7.3 or Any Version (manual setup).
Relativity URL - this setting is the URL for the instance of Relativity you would like to import into.
For version 7.4, enter the details of your instance without the http prefix, e.g. nuix.kcura.com. For
version 6.6 - 7.3, this is inherited from the Web Service URL that is saved in the Relativity Desktop
Client. Please ensure the Relativity Desktop client is installed on your Nuix machine and the
desired web service is correctly set. For the Any Version option this setting is not required.
End Point Type - This option allows you to select from Nuix the type of endpoint your Relativity
Services exist behind in your IIS instance. Previously this would need to be hand edited at the
Relativity server. This option only exists for version 7.4.
User Name and Password - Enter your Relativity username and password and click Workspaces
to show the workspaces and folders you have access to. The username used must have import
rights into cases. This option is available for version 7.4 and version 6.6 - 7.3.
Workspace ID - This option allows you to enter in your workspace artifact id to directly import into
the top level folder for that case. This option is only available if using Any Version (manual setup).
Other Relativity Settings section allows you to map fields for export and choose how the items are imported
into your Relativity instance.
Edit Mapping - This option allows you to manually map the metadata fields you are
exporting to the fields in your case. Select Create Mapping link to match fields or load an existing
Nuix Relativity Mapping file.
Error Check - This option allows you to check field mapping errors between your metadata profile
and fields matched from within your Relativity workspace. Please see the following section as to
what is error checked with this option. This option is only available with version 7.4 and version 6.6
- 7.3.
KWE Mapping File - This option allows you to use a standard KWE file that has been created by
the Relativity Desktop Client to match the fields you are currently exporting. This option is only
required for Any Version (manual setup). Note: The fields used in your Nuix metadata profile must
be in the same order as previously matched in the KWE file as this is a direct mapping into
Relativity. Should one or two of your fields have changed position in the chosen metadata profile
and the data type matches the field in Relativity this data will import silently without error.
Mode - This option allows you to choose if you would like to Append new records to your Relativity
workspace or simply Overlay data to existing records in your workspace. This option is available for
all versions.
Native Export - This option allows you to choose if Relativity copies the native files to your main
Relativity documents directory or alternatively, copy them to a Relativity accessible directory and
have Relativity point to those files in that location. This option is available for all versions.
Ringtail
The Ringtail load file settings tab has the following additional settings -
Load File Settings
Export to: Creates the full Ringtail database as well as a number of other documents/items. Select the
Ringtail versions from the drop-down list: Ringtail CaseBook 6 or Ringtail Legal 2005. Ringtail Legal 2005 is
the default setting.
Load file separation: Allows you to separate entries at a specific range. Select the metadata profile from the
drop-down menu. To add or edit metadata profiles, navigate to Manage metadata profiles.
Other Settings
Inherit document dates - Applies the email communications date to all descendant (child) items.
This option is selected by default.
Remove commas from number fields on export
Use direct parent for host reference
Use document ID for page label
Include native page counts for TIFF in num_pages
Map Export Extras
Select the Category Field for all Nuix metadata fields listed in the Ringtail (MS Access) database. You can
select the category by clicking on the category field of the metadata you wish to modify and select a category
from the drop-down list. This option is turned off by default.
The Multi-value field separator displays a comma (,) by default. To load the Ringtail mapping file,
click load and specify the path. To save the Ringtail mapping file, click Save and specify the path.
Select Show pre-export summary to preview the summary.
simple sequential numbering for the Concordance, Summation, IPRO, and Discovery Radar load
file formats
a more granular scheme for the Ringtail format with specific box, folder, and page numbering
The following topics describe each setting or option on the Numbering and Files tab.
Numbering
Numbering provides seven basic schemes for numbering documents.
Document ID - Assigns a sequential nine-digit number with an alphanumeric prefix to
each document. This effectively provides for a legal export up to 999,999,999 items. This is the
default setting. If exporting from a production set then the numbering from the production set is
used and this option is grayed out.
Box, Folder, Page - Assigns a nine digit, sequentially assigned number to each document. This
effectively provides for a legal export up to 999,999,999 items.
Folder, Page - This will assign a six digit, sequentially assigned number to each document. This
effectively provides for a legal export up to 999,999 items.
Page - Assigns a 3 digit, sequentially assigned number to each document. This effectively provides
for a legal export up to 999 items.
Prefix, Box, Folder, Page - Same as the "Box, Folder, Page" option only the text string in the
prefix field is included at the beginning of document number.
Prefix, Folder, Page - Same as the "Folder, Page" option only the text string in the prefix field
is included at the beginning of document number.
Prefix, Page - Same as the "Page" option only the text string in the prefix field is included at the
beginning of document number.
Document ID
Document ID allows for an alphanumeric/special character ("_", ".", "-") prefix to be included with a 9 digit
number for each document ID. The default value is DOC-000000001.
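The Document ID format described above (a prefix followed by a zero-padded nine-digit number) can be sketched as follows. The `document_id` helper is hypothetical, and starting the sequence at 1 is an assumption based on the default value DOC-000000001.

```python
import re

# Prefix characters described above: alphanumerics plus "_", "." and "-".
PREFIX_PATTERN = re.compile(r"^[A-Za-z0-9_.\-]*$")


def document_id(sequence, prefix="DOC-"):
    """Format a document ID as the prefix followed by a 9-digit, zero-padded number."""
    if not PREFIX_PATTERN.match(prefix):
        raise ValueError(f"unsupported character in prefix: {prefix!r}")
    if not 1 <= sequence <= 999_999_999:
        raise ValueError("sequence must fit in nine digits")
    return f"{prefix}{sequence:09d}"


print(document_id(1))                    # DOC-000000001
print(document_id(1234, prefix="ABC_"))  # ABC_000001234
```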
The value you choose for the Numbering setting drives which Box, Folder, and Page fields are active. In
this screen shot, the Folder, Page option is selected, which leads to a numbering scheme like 001001,
where the first 001 represents the folder numbering scheme that begins with 1, and the second 001 represents
the page numbering scheme that begins with 1.
These three fields in the Legal Export dialog box allow zero padding up to 7 digits wide. You can set the
page rollover value explicitly, while Box and Folder rollover values are determined by having a 9 for every
digit in the respective numbering (e.g., a field value of ‘0001’ results in a rollover of ‘9999’.)
Options include:
Can exist in multiple folders - If a document family consists of multiple documents or multiple
pages of documents, this option enforces the numbering scheme, and simply spans a single family
or document across a folder boundary.
Must exist in same folder - If a document family consists of multiple documents or multiple pages
of documents, this option forces the entire document family into a single folder. This means that
the number of files/pages per folder can be exceeded.
Delimiter - Ringtail
Delimiter allows you to add a separator between the box, folder, and page numbers. The default setting is
to delimit these values with a period (.). The other option is blank, meaning no delimiter is used.
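The Folder, Page scheme, its rollover behavior, and the delimiter can be sketched together. This is a simplified illustration under stated assumptions, not Nuix's numbering code: the `folder_page_numbers` helper and its parameters are hypothetical, the page counter is assumed to restart at its start value after a rollover, and family handling is ignored.

```python
def folder_page_numbers(count, folder_start="001", page_start="001",
                        page_rollover=999, delimiter="."):
    """Generate `count` Folder, Page numbers as zero-padded strings."""
    folder_width, page_width = len(folder_start), len(page_start)
    # Folder rollover is a 9 for every digit, e.g. a start of "001" gives 999.
    folder_max = 10 ** folder_width - 1
    folder, page = int(folder_start), int(page_start)
    numbers = []
    for _ in range(count):
        if folder > folder_max:
            raise OverflowError("folder numbering exhausted")
        numbers.append(f"{folder:0{folder_width}d}{delimiter}{page:0{page_width}d}")
        if page >= page_rollover:
            # Page rollover reached: move to the next folder (assumed behavior).
            folder, page = folder + 1, int(page_start)
        else:
            page += 1
    return numbers


print(folder_page_numbers(4, page_rollover=3))
# ['001.001', '001.002', '001.003', '002.001']
print(folder_page_numbers(1, delimiter=""))
# ['001001']
```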
File Naming
File Naming displays the various formats of the exports (native, text, TIFF, PDF, Thumbnails and XHTML
Report), the sub-folder path within the export directory where they will be written, and how the items will be
numbered. You can only define the properties for the generated file once for each file type.
Use the Add, Edit, and Remove buttons to manage the contents of this table.
Note: Only use the Per Page Text option with documents that are being rendered to TIFF. Otherwise, it will
force Nuix to create a PDF for each document, then extract the text from each page of the PDF to create a
separate text file. This operation dramatically increases the export time and should only be used when
exporting items to TIFF.
Add/Edit:
Add and Edit open the Generated Files dialog, which allows you to define the different export file types.
Click Add to define a new file type. Click Edit to change an existing one. By default, Nuix provides a
definition for the Native file type.
In the Generated File dialog box, you can define four properties.
Native - The documents are exported as individual items that can be opened in their native
application.
Text - The extracted text of the document is exported. This does not include all of the extracted
metadata.
Per-Page Text - The extracted text of the document is exported as individual pages created from
PDF rendering of the document. This does not include all of the extracted metadata.
PDF - This is a PDF rendering of the document. A single "Searchable PDF" is created for each
item.
TIFF - TIFFs are created from the PDF rendering of the document using Ghostscript. The TIFFs
are single page TIFFs.
Thumbnails - Thumbnail images are exported for any images that had thumbnails extracted when
processed.
XHTML Item Report - An XHTML item report is created for each item exported.
The page naming options are shown below and can include Document ID selected. In most cases you will
want to ensure that the page name is consistent across all export file types. Additionally, the "Full" option will
honor the settings made on the Numbering and Files tab.
The sub-folder path allows the name of the export sub-directory to be defined. This is used when natives,
text, images all need to be stored in separate folders under the root export directory.
Remove:
Remove deletes the file type definition that is currently highlighted in the File Naming table. Be sure to
highlight the file type you wish to delete prior to clicking the button.
Preview
Preview shows an example of the numbering scheme using the document numbering values you have
specified. Use this field to ensure your numbering scheme is correct.
Parallel Processing
In the Legal Export dialog box, the Parallel Processing tab offers settings that allow you to control how the
Nuix workers operate while exporting the data.
Note: These settings only apply to the export operation, and are separate from the parallel processing
settings associated with ingesting data.
Nuix offers the following settings:
Show pre-export summary report
Show pre-export summary report displays a complete list of all items to be exported. This includes top-
level and descendant (child) items and should be used as a guide when determining the overall export size.
By default this option is off.
The Export Summary section at the top of the report provides the total native file counts that will be exported:
Items selected for export - The total number of items highlighted in the result set.
Top-level items found from selected items - The total number of top-level items found. This number
includes duplicates.
Deduplicated top-level items found from selected items - The total number of top level items that
will be exported.
Duplicate top-level items not exported - The number of top-level items that will not be exported
because they are duplicates.
Total items, including child items, discovered for export - The total number of items that will be
exported. This number matches the total number of native files exported.
Click OK to export the items, or click Cancel to cancel the export operation.
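As the descriptions above imply, the top-level total splits into items that will be exported and duplicates that will not. The following is a minimal Python sketch of that arithmetic only. The class, its field names, and the numbers are hypothetical and are not part of any Nuix API.

```python
from dataclasses import dataclass


@dataclass
class ExportSummary:
    """Hypothetical container mirroring the Export Summary labels."""
    items_selected: int           # items highlighted in the result set
    top_level_found: int          # top-level items found, including duplicates
    deduplicated_top_level: int   # top-level items that will be exported
    total_with_children: int      # all items exported, including child items

    @property
    def duplicates_not_exported(self) -> int:
        # Duplicates are the top-level items left over after deduplication.
        return self.top_level_found - self.deduplicated_top_level


# Illustrative numbers only.
summary = ExportSummary(
    items_selected=1200,
    top_level_found=900,
    deduplicated_top_level=750,
    total_with_children=2100,
)
print("Duplicate top-level items not exported:", summary.duplicates_not_exported)
```

If the figures in a real report do not add up this way, treat the report as authoritative; the sketch only restates the relationship described in the list.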
The following options and controls are available:
Language - Sets the scripting language to one of two, either ECMAScript or Ruby. Ruby is the
default setting.
Script - A free-text box into which your script is typed or pasted.
Console - A read-only box that displays the results of the script as well as a status message and
any errors.
Clear - Clears the results in the Console box.
Execute - Runs the script that has been pasted into the Script text box.
Cancel - Cancels a currently running script.
To close the dialogue box, click the Close icon in the upper right corner.
Dependencies – Identifies all required/recommended software. Dependencies not installed are
noted in the Status column. Highlight a row to view details of the dependency in the box below.
Nuix regularly receives support questions about the software Nuix requires to perform its tasks. To
reduce these requests, Nuix now requires you to indicate that you understand these requirements
by selecting the I understand the consequences of lacking this dependency checkbox.
Summary Report – Reports product version and other information about your hardware and
software.
Environment – Details a variety of variables and values about your hardware.
System properties – Details a number of Nuix system file properties and values useful for
troubleshooting problems.
Licence properties – Details properties and values about the software licence on the licence
dongle.
Note:
If an error occurs during the operation of the Nuix 4 software application, go to Help > System Diagnostics,
save the error message, and send the output to our support team at [email protected] along with a
description of what you were doing when the error occurred. Save the error message during the
same session, because once you exit the application, System Diagnostics no longer corresponds to that session.
However, if you accidentally exit the application or encounter a system crash, the log files archived in the log
directory can be sent to the support team for investigation. Go to Help > Open Log Directory to open the log
directory.
Customizing the Interface
Nuix 4 supports customizing the application interface in a couple of different ways to better support your
personal workflow and to promote efficiency in mousing operations:
Resize panes.
Rearrange the location of panes.
Un-dock (pop-out) the panes, distributing them across multiple monitors or floating them outside
the application window on a single monitor.
Hide the panes.
At any time, you can reset all the panes in the Workbench tab to their default locations by selecting Window
> Reset Layout.
To rearrange and resize panes:
1. Select the yellow title bar of a pane and drag it to another location within the tab window.
Nuix displays an outline depicting where the pane can be placed.
2. Release the mouse when the pane is in the desired location.
3. Resize the panes to achieve a particular result.
To undock and replace panes:
1. On the Workbench tab, in the pane you wish to undock, select the undock icon. The pane
pops out of the Nuix window.
2. Select the yellow title bar of the pane and drag it onto another monitor, or to another position on
your current monitor.
3. To replace the pane, drag it back to the desired location within the Workbench tab or click the
same icon in the title bar again to return it to its original location.
To hide and show panes:
1. To hide a pane, select the Window menu and then select the Show Pane Name command for
the pane you wish to hide.
The pane is hidden from the Workbench tab.
2. To show the pane again, select the same command again in the Window menu.
The pane returns.
Keyboard Shortcuts
Nuix 4 provides a variety of keyboard shortcuts to enable greater efficiency for the tasks you perform
frequently. Shortcut keys are keys that you hold down or press to activate a command or trigger an activity.
Letters are not case sensitive. Using the keyboard instead of the mouse might also reduce the risk of
repetitive stress injuries.
COMMAND                                                    SHORTCUT
Print                                                      Ctrl + P
Cut                                                        Ctrl + X
Copy                                                       Ctrl + C
Paste                                                      Ctrl + V
Find                                                       Ctrl + F
Next Batch (Family) - Only active while in Fast Review.    Shift + Right Arrow
Help                                                       Shift + F1
Apply same tags to all family items                        Alt + Shift + F
Install
Nuix provides two basic installer packages:
Nuix 4 - This includes all processing and review licence types, from Enterprise Workstations to Investigator.
whether you need to install the 32-bit, 64-bit, or both versions of Nuix 4
the proper hardware for your processing needs
the proper software for the tasks you perform
the minimum requirements for Nuix to operate
Nuix installs the 32-bit software into C:\Program Files (x86)\Nuix\Nuix 4 and the 64-bit software into
C:\Program Files\Nuix\Nuix4
Nuix creates two desktop icons. The Nuix 4 icon launches the 64-bit application and the Nuix 4 (32-bit)
icon launches the 32-bit application.
If you are limited to using only the 32-bit Lotus Notes client installation, you must use the 32-bit
version of Nuix when processing, launching the natives, or exporting Lotus NSF data. For all other
work, use the 64-bit version.
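The rule for choosing between the two desktop icons can be written out as a small decision function. This Python sketch is illustrative only: the function name, the task labels, and the parameters are hypothetical, and it encodes nothing beyond the Lotus Notes rule stated above.

```python
# Tasks that, per the rule above, need the 32-bit build when only the
# 32-bit Lotus Notes client is installed. Labels are hypothetical.
NSF_SENSITIVE_TASKS = {"process", "launch_native", "export"}


def recommended_build(task: str, involves_nsf: bool, notes_client_32_bit_only: bool) -> str:
    """Return "32-bit" or "64-bit" for the given kind of work."""
    if notes_client_32_bit_only and involves_nsf and task in NSF_SENSITIVE_TASKS:
        return "32-bit"
    # For all other work, the guide recommends the 64-bit version.
    return "64-bit"


print(recommended_build("export", involves_nsf=True, notes_client_32_bit_only=True))
print(recommended_build("search", involves_nsf=True, notes_client_32_bit_only=True))
```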
The following are the minimum system requirements for operating Nuix. See the Hardware Sizing Guidelines for
optimum performance.
Hardware
CPU – Dual Core (2.4 GHz or greater)
RAM – 4 GB
Hard Drive – 2x 7200RPM drives + adequate capacity for source and case data
Video Card – 1280X1020 Screen Resolution (Required for Network Visualizations)
Network – 10/100 Ethernet Controller
Operating System
32-bit: Windows XP, Vista, Server 2003, Server 2008, Windows 7 or later
64-bit: Windows XP, Vista, Server 2003, Server 2008, Windows 7 or later
Cores/CPUs/Processors – for the sake of this discussion, the number of cores equals the number
of CPUs displayed in Task Manager.
Note: Throughput rates vary depending on the type of data processed. Our average ingestion rate is based
on a mixed collection of 50% PSTs and 50% loose business documents. Processing all EDB files or all text
files results in lower or higher throughput.
The following is a listing of some sample configurations:
DESCRIPTION   # OF CORES   RAM (GB)   PHYSICAL DISKS*   OPERATING SYSTEM
2x Core       2            8          2                 Windows 64-bit OS (Windows Vista, 7 or Server 2008)
* 7200 RPM disks are a minimum; Nuix realizes improved performance with 10K or 15K RPM drives.
Processing Recommendations
Microsoft Office 2007/2010 is strongly recommended for all processing systems. Nuix will attempt
to open Office 95 and Office works files with Office 2007/2010; otherwise, Nuix defaults to text
extraction only for these file formats.
Export Requirements
Microsoft Office 2007/2010 - Office 2007/2010 is required to export PST files, create Ringtail
databases (MS Access MDB) and as part of our PDF rendering process. Office 2010 includes a
64-bit version of Access, which allows you to export to Ringtail and Discovery Radar from the
64-bit version of Nuix. If you do not have the 2010 64-bit version of Access, your Ringtail and
Discovery Radar exports can only be run from the 32-bit version of Nuix. Office 2010 has built-in
PDF capabilities.
Microsoft Visio 2007 or above is required for the PDF rendering process.
Microsoft Office 2007 PDF Plug-in - The Office 2007 PDF plug-ins are used as part of the legal
export to create PDF renderings of native electronic documents.
NOTE: Office 2007 SP2 now includes the PDF plug-in by default. If you have Office 2007 installed
and are uncertain whether the correct PDF plug-ins have been installed, open a Word document
and save it as a PDF. If the option is not present, another save option will be displayed to save as
an alternative form including PDF.
Ghostscript - Ghostscript is used to convert PDF images to TIFF files.
Installing Nuix 4
Once you have configured your hardware and installed any prerequisite software, you can install the Nuix 4
application.
To install Nuix 4:
1. Download and open the Nuix installer package.
The Setup Wizard displays.
2. On the Welcome screen, select Next.
3. Specify where you want to install Nuix. You can click Browse to navigate to a location on your
system.
This location should be local to your machine. If you are installing both the 32-bit and the 64-bit
versions of Nuix, review details for that scenario.
4. Click Next to continue.
5. On the Ready to Install screen, click Install.
A screen displays indicating that the application is being installed. Optionally, you can
click Cancel to cancel the installation.
6. When the install is complete, the final screen displays. Click Finish to complete the installation.
You can now open Nuix. The first time that Nuix opens, the System Diagnostics window displays.
The Dependencies tab shows whether the prerequisites are installed. Review this list carefully to ensure
that all of the expected prerequisites have been installed. For any dependencies that are not found, you must
confirm that you "Understand the consequences of lacking this dependency", for each missing item.
Common issues include:
Lotus Notes or Microsoft Access are shown as "Not Found". If the prerequisites have been
installed, this is usually seen when running the 64-bit version of Nuix. For details, review the
information about 32-bit and 64-bit coexistence.
Office 2007 is installed, but not the PDF extensions.
For additional detail on installing the dependencies, review the software requirements.
Configure
Nuix 4 requires that you configure your environment to support certain tasks, and also offers you a variety of
options within the product itself that will help you maximize its value.
Configuring Nuix 4 includes:
Setting global options - Global options apply to all cases accessed from this user profile. They are
not global in the sense that they apply to all users.
Setting case options - Case options apply to the case that is currently open.
Configuring your environment includes:
View Options
The View Options section includes the following options.
Launch Options lets you set the default application Nuix uses to open email messages. Regardless of the
source format of the email message, Nuix opens the message in the application you specify here.
EML - Standard message format (RFC822). On most Windows systems this setting defaults to
using Outlook Express. If you have not configured Outlook Express, you will be prompted to
configure it. You can close the configuration screens and Outlook Express will still display
messages.
MSG - Microsoft Outlook
NSF - Lotus Notes
Viewer Limits lets you manage how Nuix presents large datasets when viewing items
by Results or Network in the Results pane. You can set the maximum number of items in the list or view
to make review and analysis tasks more manageable.
You can set the following maximum values:
Result table row limit - Sets the maximum number of items that display in the Results list. The
default value is set to 1,000,000 items. If you are working with very large datasets you might need
to increase this number to see all of the items. If there are more items to list than the maximum
viewing limit you select, a status message is provided at the bottom of the Results list in the form of
"Displaying X items, truncated from Y". The minimum value you can set is 10,000.
Network node limits - Sets the maximum number of nodes that display in the Networks view by
default. The default value is set to 500 items. After the graph displays, you can adjust it to show
more or fewer nodes. Increase or decrease this value based on the speed of your system, as
needed. The minimum value you can set is 15.
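The effect of the Result table row limit can be sketched as follows. This is a minimal Python illustration, not Nuix code: the function name is hypothetical, and the wording of the message shown when nothing is truncated is an assumption. Only the "Displaying X items, truncated from Y" form, the default, and the minimum come from the description above.

```python
DEFAULT_ROW_LIMIT = 1_000_000  # default stated above
MIN_ROW_LIMIT = 10_000         # minimum stated above


def results_status(total_items: int, row_limit: int = DEFAULT_ROW_LIMIT) -> str:
    """Build the status line shown under a (hypothetical) Results list."""
    if row_limit < MIN_ROW_LIMIT:
        raise ValueError(f"row limit must be at least {MIN_ROW_LIMIT}")
    shown = min(total_items, row_limit)
    if shown < total_items:
        return f"Displaying {shown} items, truncated from {total_items}"
    # Assumed wording for the case where every item fits under the limit.
    return f"Displaying {shown} items"


print(results_status(2_500_000))
print(results_status(2_500_000, row_limit=3_000_000))
```

Raising the limit past the size of the result set removes the truncation, which is why very large datasets may need a higher value.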
Document Navigator lets you enable or disable the display and updating of count facets in the
different panels of the Document Navigator. Facet updating can be disabled for the following
panels:
Evidence Panel
Excluded Items Panel
Custodian Panel
Production Set Panel
History Panel
In addition, each section of the Filter panel has the additional options of being always enabled, enabled
when that section is expanded, or disabled.
Highlighting allows individual words within a phrase to be highlighted separately when they are hit by a
search term. If unchecked, the whole phrase is highlighted as one.
Search Options
Search Options lets you manage how Nuix will search across the data by including or excluding what data
can be searched across.
The following options are available:
Search Content - allows any searches performed to find results in the Content or text only of the
indexed documents.
Search Properties - allows any searches performed to find results in the Properties section only of
the indexed documents.
Search Names - allows any searches performed to find results in the Names field only of the
indexed documents.
Search Path Names - allows any searches performed to find results in the Path Names field only
of the indexed documents.
Search Evidence Metadata - allows any searches performed to find results in the Evidence
Metadata or user defined data section only of the indexed documents.
Resemblance threshold sets the level of similarity required for documents to be found as near
duplicates of each other. The resemblance value ranges between 0 and 1, with 1 representing very similar
documents. The default value is set at 0.5.
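To give a feel for what a threshold between 0 and 1 means, the following Python sketch scores two texts by the Jaccard similarity of their word shingles and compares the score with the threshold. This is an assumption made for illustration: this guide does not say how Nuix computes resemblance, and the function names, the shingle size, and the use of "greater than or equal to" are all hypothetical.

```python
from typing import Set, Tuple


def shingles(text: str, size: int = 3) -> Set[Tuple[str, ...]]:
    """Return the set of overlapping word sequences of the given size."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def resemblance(a: str, b: str, size: int = 3) -> float:
    """Jaccard similarity of the two shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 1.0  # two empty texts are treated as identical
    return len(sa & sb) / len(sa | sb)


def is_near_duplicate(a: str, b: str, threshold: float = 0.5) -> bool:
    # Assumed comparison: a score at or above the threshold counts.
    return resemblance(a, b) >= threshold


doc_a = "the quick brown fox jumps"
doc_b = "the quick brown fox sleeps"
print(resemblance(doc_a, doc_b), is_near_duplicate(doc_a, doc_b))
```

Under this kind of measure, raising the threshold toward 1 returns fewer, more similar documents, and lowering it returns more, looser matches.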
Default Tabs
Default Tabs lets you set which tabs you want to view by default in Nuix 4 when opening a new case.
Workbench - This tab hosts the primary tasks of excluding, filtering, and searching for data within
the case. You can also analyze data, preview individual items, and tag from this tab. This tab is set
to display by default when you open a case.
Statistics Tab - This tab displays information about the processed and irregular files by file type,
including number processed, corrupted, and encrypted, as well as a percentage of each file type
encountered.
Fast Review - This tab lets you create jobs that can be batched up for review by investigators. For
each job, you can specify tags and words to highlight. You can then associate items to each job,
and those items are presented in a linear fashion for tagging.
Imaging Options
Imaging options provide a means to alter the way Nuix renders MS Office documents to PDF and TIFF.
Microsoft Excel - allows for the customization of options such as showing or hiding grid lines,
headings, hidden columns, hidden rows, hidden worksheets, notes or comments. It also allows for
the customization of page size and orientation, zoom and limiting the number of pages printed per
worksheet.
Microsoft PowerPoint - provides a means to select how PowerPoint documents are printed.
Choose from the following per-page options:
Single Slides
Two, three, six or nine slides per page
Outline
Notes
Microsoft Word - choose to show or not show mark up comments when rendering Word
documents if they are present.
Metadata Profiles
Metadata Profiles provide a means to manage the presentation and export of metadata in Nuix. Nuix has
three types of metadata:
Nuix Defined - Metadata properties defined by the Nuix application, such as GUID, MD5 Digest,
Name, etc. These properties are specifically extracted or created for internal purposes.
User Defined - Custom metadata properties you can create when you load a case, which are
applied to all items in the evidence set.
Item Properties - Nuix takes an opportunistic approach to metadata extraction. Essentially, Nuix
just enumerates all of the metadata properties that we encounter for each item, and inserts the
key/value pairs into the Lucene full text index. We are not mapping or building any type of
relationship behind the scenes. So for each item we target the non-binary metadata, and put it into
our full text index. These values are grouped collectively as properties, so that you can search on a
single metadata property (properties:"key:value") or against all properties (properties:value).
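The two search forms described for Item Properties differ only in whether a key is supplied. The following Python helper is a sketch that builds the query text in either form. The function name and the example key and value are hypothetical, and it does not escape quotes or other special characters in the input.

```python
from typing import Optional


def properties_query(value: str, key: Optional[str] = None) -> str:
    """Build a properties search string in one of the two forms shown above."""
    if key is None:
        # Search against all properties.
        return f"properties:{value}"
    # Search a single metadata property; the key:value pair is quoted.
    return f'properties:"{key}:{value}"'


print(properties_query("Smith"))
print(properties_query("Smith", key="Author"))
```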
You can create metadata profiles for specific item types (email/files), specific purposes (exception handling),
or the specific load file formats required by your clients. The Default Metadata Profile is the only profile
provided with the application. You can view a collection of sample metadata profiles in the Nuix Knowledge
Base.
Nuix makes use of metadata profiles in several places so that you can customise what metadata information
to view:
Change the profile associated with a view by right-clicking on a column header and selecting Choose
Column Profile > profile name.
Add a Profile
To add a profile, select Global Options > Metadata Profiles, and click Add.
The Create Metadata Profile dialog displays, allowing you to add an unlimited number of metadata values.
Types of values can include Nuix-derived metadata, User-defined Evidence Metadata, Properties, and
Derived Metadata fields. You can order the values by using Move Up and Move Down.
The following image shows a sample "Email Profile" that includes a variety of different metadata types
combined into a single profile.
Edit a Profile
To edit an existing profile, select it in the list of metadata profiles and click Edit. The Edit Metadata
Profile dialog displays, allowing you to manage the metadata for that profile.
Remove a Profile
To remove an existing profile, select it and click Remove. Once a profile is applied to a view, that set of
metadata will be assigned to the columns in that view even if you delete the profile.
Nuix metadata is grouped into three categories:
Nuix Defined - Metadata properties defined by the Nuix application, such as GUID, MD5 Digest,
Name, etc. These properties are specifically extracted or created for internal purposes.
User Defined - Custom metadata properties you can create when you load a case, which are
applied to all items in the evidence set.
Item Properties - Nuix takes an opportunistic approach to metadata extraction. Essentially, Nuix
just enumerates all of the metadata properties that we encounter for each item, and inserts the
key/value pairs into the Lucene full text index. We are not mapping or building any type of
relationship behind the scenes. So for each item we target the non-binary metadata, and put it into
our full text index. These values are grouped collectively as properties, so that you can search on a
single metadata property (properties:"key:value") or against all properties (properties:value).
Items in each list are presented in alphabetical order. You can also type text into the Filter field to find
names that match the text you enter.
NUIX-DEFINED METADATA
Nuix-defined Metadata includes:
Audited Size - Audited Size is the size of the item as it exists on disk. The Audited Size is
calculated only for cases created while running with an Audited licence type. Note: For emails,
this is the size of the email itself, without any attachments. The Audited Size differs from the
Digest Input Size in that for emails the Audited Size represents the size of all properties, not
just those that are used in the creation of the digest.
Automatic Classifications (Selected) - Classifications for automatically classified items, for the
Automatic Classifier(s) currently selected in the Document Navigator.
Automatic Classifier Confidence - The confidence (or probability of correctness) for automatically
classified items.
Automatic Classifier Confidence (Selected) - The confidence (or probability of correctness) for
automatically classified items, for the Automatic Classifier(s) currently selected in the Document
Navigator.
Automatic Classifier Gain Confidence - The confidence (or probability of correctness) for
automatically classified items as would be shown in a gain chart. If the predicted classification is
the positive classification, this is identical to Automatic Classifier Confidence; otherwise it
is 1.0 - Automatic Classifier Confidence.
Automatic Classifier Gain Confidence (Selected) - The confidence (or probability of correctness) for
automatically classified items as would be shown in a gain chart, for the Automatic Classifier(s)
currently selected in the Document Navigator. If the predicted classification is the positive
classification, this is identical to Automatic Classifier Confidence (Selected); otherwise it
is 1.0 - Automatic Classifier Confidence (Selected).
Bad Extension - Whether the file appears to have an irregular extension. Plain text and items
without a file size property are excluded.
Bcc - Bcc are the blind carbon copy addresses extracted from an email. The contents of
the Bcc field are searched by the bcc: communications search field.
Binary Stored - Binary Stored indicates whether the binary is stored in the database for this item.
Carved - Indicates that the item was carved out of slack-space or from unidentified item data.
Cc - Cc are the carbon copy addresses extracted from an email. The contents of the Cc
field are searched by the cc: communications search field.
Chained Near-Duplicate Count - Chained Near-Duplicate Count is the number of chained near-duplicate
items. It does not include the item itself.
Chained Near-Duplicate Custodian Set - Chained Near-Duplicate Custodian Set is the set of custodian
names associated with this item and its chained near-duplicate items.
Chained Near-Duplicate GUIDs - Chained Near-Duplicate GUIDs is a list of GUIDs of chained
near-duplicate items. This does not include the item itself.
Chained Near-Duplicate Paths - Chained Near-Duplicate Paths is a list of paths to chained
near-duplicate items. This does not include the item itself.
Child Count - Child count gives a total count of all child items of a given item, including immaterial
child items.
Child Material Count - Child material count gives a total count of all material child items of a given
item and excludes all immaterial child items.
Child Names - Child names are the names of all the child items for a given document. This can be
used when building an export profile that needs to show the names of all embedded
documents or attachments. Note: Use caution when including the Child Names
property in your default metadata profile, as it can increase the amount of time
required to render the result set or display items.
Cluster IDs - Cluster IDs are a list of chained near-duplicate clusters an item belongs to. Clusters
are denoted by their run label followed by an integer cluster label, e.g.,
myClusterRun-123. Labels are sorted according to cluster run, oldest first.
Cluster IDs (Selected) - The list of chained near-duplicate clusters an item belongs to in the set of
clusters currently selected in the Filtered Items pane of the Document Navigator.
Cluster Pivot Resemblances - Cluster Pivot Resemblances are a list of resemblance values, one for
each of the clusters an item belongs to. Values are the resemblance values between the item
and each cluster's pivot item. The list is sorted according to cluster run, oldest first.
Cluster Pivot Resemblances (Selected) - A list of resemblance values, one for each of the clusters an
item belongs to in the set of clusters currently selected in the Filtered Items pane of the Document
Navigator.
Cluster Pivots - Cluster Pivots are a list of boolean values, one for each of the clusters an item
belongs to. A value of true indicates an item is the pivot member of the cluster it is
contained in. A value of false means it is not a pivot item. The list is sorted
according to cluster run, oldest first.
Cluster Pivots (Selected) - A list of boolean values, one for each of the clusters an item belongs to
in the set of clusters currently selected in the Filtered Items pane of the Document Navigator.
Custodian - Custodian stores the custodian name assigned to items. This can be set
when ingesting evidence or assigned manually later.
Deleted - Deleted signifies that the item was found in a Microsoft mail store while extracting
permanently deleted items. For additional information on processing deleted items,
see Deleted items.
Deleted File Metadata Recovered - Indicates that the item is a deleted file that has had its metadata
recovered from unallocated space. Either all of the file was overwritten with other data or the file
record couldn't be linked back to its data.
Digest Input Size - Digest Input Size is the number of bytes from the file used to generate
the various digests. This will be the file size for loose files and a rough
approximation of the size of an email.
Document IDs - Document IDs lists all the document IDs used within production sets for a given
item.
Document IDs (Selected) - Document IDs (Selected) are the document IDs that have been assigned to the
item in the currently selected production sets.
Duplicate Count - Duplicate count gives the total count for all duplicate items of a given item.
Duplicate Custodian Set - Duplicate custodian set lists all the custodians that have duplicates of a
given item. Note: Use caution when including the Duplicate Custodian Set property in
your default metadata profile, as it can increase the amount of time required to
render the result set or display items.
Duplicate GUIDs - Duplicate GUIDs lists the GUIDs of all duplicate items. Note: Use caution when
including the Duplicate GUIDs property in your default metadata profile, as it can
increase the amount of time required to render the result set or display items.
Duplicate Paths - Duplicate Paths lists the paths for all duplicate items. Note: Use caution when
including the Duplicate Paths property in your default metadata profile, as it can
increase the amount of time required to render the result set or display items.
Entity: Company - Entity Company is the company named entities identified in the item's text.
Entity: Country - Entity Country is the country named entities identified in the item's text.
Entity: Credit Card Number - Entity Credit Card Number is the credit card number named entities
identified in the item's text.
Entity: Email - Entity Email is the email named entities identified in the item's text.
Entity: IP Address - Entity IP Address is the IP address named entities identified in the item's text.
Entity: Money - Entity Money is the money named entities identified in the item's text.
Entity: Personal ID - Entity Personal ID is the personal ID named entities identified in the item's text.
Entity: URL - Entity URL is the URL named entities identified in the item's text.
Family Inline - Indicates the item and its family members are present in the one evidence
database. This flag was not present in v3.6 cases and earlier.
File Extension (Corrected) - File Extension (Corrected) is the extension based on the header
signature. The "Corrected" version of the extension is what will be appended to the file name when
performing a native file export.
File Extension (Original) - File Extension (Original) is the extension listed on the source file.
File Type - File Type is the type of the document based on Nuix's header analysis.
From - From is the sender address extracted from an email. The contents of the From field
are searched by the from: communications search field.
Fully Recovered Deleted File - Indicates that the item is a deleted file that has been fully recovered
from unallocated space.
GUID - GUID is the Globally Unique Identifier assigned to the item during ingestion.
Hidden Stream - Indicates that the item is a hidden stream associated with another item. Examples
of this include NTFS alternate data streams and HFS+ resource forks.
Identification Disabled - Indicates that file-type identification was not enabled when this item was
processed.
Inlined - Indicates that the item is displayed as part of an outer item, e.g., an image in an RTF
document. Legal export, for example, excludes such items.
Item Category - Item Category is defined as Email, Attachment, Electronic File (loose file from file
system), or Electronic Directory.
Item Date - For emails, the Nuix Communications Date (Map-Client-Submit-Time, Sent Date,
Date). For files, it is the File Modified date or, if not present, the File Created date.
Item ID - Item ID is a human-friendly number assigned to each item during ingestion. This is
a sequentially assigned ID that can be used to uniquely reference an item without
having to reference the GUID. Note: The item ID is there for convenience and
should not be relied upon as the sole reference of a document, as it is updated
when simple cases are aggregated into compound cases. For example, as part of a
simple case, each item is assigned a numerical item-id (12345). When that simple
case is combined into a compound case, the item-id is prefixed with the relative
position of the simple case within the compound case. If the simple case
containing item 12345 was the second simple case added to the compound case,
the new item-id would be 1-12345. "1-" represents the location of the simple case
within the compound case and 12345 represents the item within the original simple
case. The Nuix GUID is the only absolute reference for an item.
Item Sets As Duplicate - Item Sets As Duplicate is the item sets that this item is a member of as a
duplicate.
Item Sets As Original - Item Sets As Original is the item sets that this item is a member of as an
original.
Kind - Kind lists the type of document a given item is, based on the kind of data it contains.
Loose File - Loose files are the files which you would generally see in a file browser when
browsing a directory. However, the files inside disk image and logical image
formats are treated as the loose files instead of the outer image file.
Material Child Names - Material Child Names are a list of names of material child items.
MD5 Digest (Latest) - MD5 Digest (Latest) are the latest digests (hashes) for the item. The presence
depends on the settings specified at load time, as well as the size of the data item,
and whether the item has been reloaded/replaced.
MD5 Digest (Original) - MD5 Digest (Original) are the original digests (hashes) for the item. The
presence depends on the settings specified at load time, as well as the size of the data item,
and whether the item has been reloaded/replaced.
Mime-type - Mime-type lists the mime-type of a given item based on the extracted header
information.
Name - Name is the Nuix-assigned document name. For files the name is the file name and
for emails the name is the subject.
Near-Duplicate Count - Near-Duplicate Count is the number of near-duplicate items. This does not
include the item itself.
Near-Duplicate Custodian Set - Near-Duplicate Custodian Set is the set of custodian names associated
with this item and its near-duplicate items.
Near-Duplicate GUIDs - Near-Duplicate GUIDs are a list of GUIDs of near-duplicate items. This does not
include the item itself.
Near-Duplicate Paths - Near-Duplicate Paths are the list of paths to near-duplicate items. This does
not include the item itself.
February 2013 Nuix eDiscovery User Guide v 4.2 PAGE 141 of 390
Not Physical File Items not marked as being physical files.
Parent GUID Parent GUID is the GUID of the item's parent. The combination of the GUID and
the Parent GUID allows Nuix to maintain the entire document's ancestry.
Partially Processed Indicates the item's children were only partially processed. Some children were
explicitly skipped at the direction of the user.
Partially Recovered Deleted File Indicates that the item is a deleted file that has been partially recovered from
unallocated space. Some areas of the file were overwritten with other data.
Path Name Path Name is the complete path to the source evidence.
Physical File Physical files correspond to the highest items in the data tree which have binary data,
and typically correspond to those files which were used as input evidence to the
case.
Poisoned Indicates the item caused a critical error during processing on several attempts.
Printed Image Generation Method Shows the method by which the PDF in the print store was populated.
Printed Image Page Count Number of pages in a PDF that is stored in the print store (the location where Nuix stores
the PDF as part of an export operation).
Position Position lists the numerical position in the evidence tree for a given item and is a
useful field for sorting evidence.
Production Sets Production Sets lists the names of all production sets in which a given item has been
included.
Production Sets (Selected) Lists the production sets, among those currently selected, to which the item has
been assigned.
Selected Document IDs Selected Document IDs lists all the document IDs for a given item when production
sets are selected within the Production Set panel. If no production sets are selected,
there will be no values in this field.
Selected Production Sets Selected Production Sets lists the names of all production sets in which a given item has been
included when production sets are selected within the Production Set panel. If no
production sets are selected, there will be no values in this field.
SHA-1 Digest (Latest) Latest SHA-1 digest (hash) for the item. Whether it is present depends on the settings
specified at load time, as well as the size of the data item, and whether the item has
been reloaded/replaced.
SHA-1 Digest (Original) Original SHA-1 digest (hash) for the item. Whether it is present depends on the
settings specified at load time, as well as the size of the data item, and whether the
item has been reloaded/replaced.
SHA-256 Digest (Latest) Latest SHA-256 digest (hash) for the item. Whether it is present depends on the settings
specified at load time, as well as the size of the data item, and whether the item has
been reloaded/replaced.
SHA-256 Digest (Original) Original SHA-256 digest (hash) for the item. Whether it is present depends on the
settings specified at load time, as well as the size of the data item, and whether the
item has been reloaded/replaced.
Skin Tone Skin Tone is the confidence score for skin tone images, ranging from 0.0 (low
confidence) to 1.0 (high confidence).
Slack Space Region Indicates that the item represents a region of recovered slack-space.
Suppressed Immaterial Children Indicates that the item contains an immaterial item that has not been exposed as a
separate data item. This is only present when the "Hide Immaterial Items"
processing option is enabled.
Tags Classifications that you create and apply to items, such as Responsive.
Thread Count Thread Count is the number of items in the same discussion thread. This does not
include the item itself.
Thread GUIDs Thread GUIDs lists all of the item GUIDs for each of the emails determined by Nuix
to be a part of the thread. Note: Use caution when including the Thread GUIDs
property in your default metadata profile, as it can increase the amount of time
required to render the result set or display items.
Thread Paths Thread Paths lists all of the item paths for each of the emails determined by Nuix to
be a part of the thread.
Top-level Indicates the item is considered a top-level item, since all of its ancestor items are
containers. Examples include loose files that are not containers, Office documents inside a zip
container, and emails inside mailboxes.
Top-Level GUID Top-Level GUID is the item's top-level GUID. By storing the item's top-level GUID
as a property of the child item, the time to find all top-level items is significantly
reduced.
Top-Level Item Date Top-Level Item Date is the date of the top-level item for a given item. Because the
top-level date is stored as a property of the child item, sorting on this field ensures
items are sorted in family date order.
Top-Level Path Name Top-Level Path Name is the path name of the top-level item for a given item.
Training Classifications Classifications for items provided as training data for automatic classifiers.
Training Classifications (Selected) Classifications for items provided as training data for the automatic classifier(s)
currently selected in the Document Navigator.
Unallocated Space Indicates that the item represents a region of recovered unallocated space in the
file system.
Unaudited Indicates the item has explicitly been marked as not audited. Items processed in
Nuix 3.0 will not have this flag set for unaudited items.
MAPI PROPERTIES
For additional detail related to metadata properties extracted from MAPI messages, see Translating Nuix
extracted MAPI properties to MAPI canonical names.
Overview
Most data stored inside the Microsoft Outlook and Exchange files (MSG, TNEF, PST and EDB) is composed
of property/value pairs. These property pairs are known as MAPI properties. They are also stored with a data
type to help work with the stored value. Below is a partial list of the possible types:
TYPE DESCRIPTION
I2 16-bit integer
I4 32-bit integer
I8 64-bit integer
PR_ENTRYID
Nuix does not record the PR_ENTRYID. The MAPI property PR_ENTRYID ("Entry ID") is not actually stored
for a PST or EDB file. Instead it is generated when required and includes details about the current MAPI
provider. For this reason Nuix doesn't currently index PR_ENTRYID for PST and EDB items.
Instead, Nuix exposes the PR_RECORD_KEY and PR_SEARCH_KEY properties, which can often be
used in place of PR_ENTRYID. The following page has more information on what types of MAPI
objects will have these properties:
Unknown Properties
Unknown properties will sometimes appear on MAPI messages. They take the following form:
Mapi-97-2002-String8-16376: Data
Where "String8" is the property type and "16376" is the property ID as a decimal number; this example
corresponds to MAPI property 0x3ff8.
Microsoft maintains a list of published property IDs
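The decimal-to-hexadecimal conversion described above can be sketched in a few lines of Python. This is only an illustration: the function name parse_unknown_mapi_property is hypothetical, and the sketch assumes that the last two hyphen-separated segments of the name are always the property type and the decimal property ID, as in the example shown.

```python
def parse_unknown_mapi_property(name: str) -> dict:
    """Split a name such as 'Mapi-97-2002-String8-16376' into its parts.

    Assumption: the final two hyphen-separated segments are the property
    type and the decimal property ID, as in the example in this guide.
    """
    parts = name.split("-")
    prop_type = parts[-2]
    decimal_id = int(parts[-1])
    return {
        "type": prop_type,
        "decimal_id": decimal_id,
        # Format the ID the way MAPI documentation usually writes it.
        "hex_id": f"0x{decimal_id:04x}",
    }


info = parse_unknown_mapi_property("Mapi-97-2002-String8-16376")
print(info["type"], info["hex_id"])  # String8 0x3ff8
```

You can then look up the hexadecimal ID in the published list of property IDs.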
Adding Derived Metadata
You can use derived metadata to create custom views of one or multiple pieces of metadata. This is useful
when metadata needs to be normalised across a diverse set of item types.
Note: Metadata profiles are not searchable. If you create a new Derived Metadata field, you cannot search
its contents using the properties:value search syntax. Metadata profiles are populated at the point in
time that they are being used, and are not stored in the index.
Field 1 - Mapi-Smtp-Message-Id
Field 2 - Message-id
Field 3 - Message-Id
Field 4 - Message-ID
Field 5 - Notes-Universal-ID
The MessageID (User-derived) metadata will start with Field 1 and go down the list looking for an available
piece of metadata. The "First non-blank value" will be populated into the derived MessageID field. In this
example, if the data set contained a mixture of Microsoft Outlook, MBOX, or Lotus Notes emails, the
Message-ID field will always be populated with the appropriate message ID.
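The "First non-blank value" behaviour described above can be pictured with the following Python sketch. It is not Nuix code: the function first_non_blank and the dictionary of item properties are hypothetical stand-ins, and the sketch assumes that a value consisting only of whitespace counts as blank.

```python
from typing import Optional

# Field order from the example above; the first populated field wins.
MESSAGE_ID_FIELDS = [
    "Mapi-Smtp-Message-Id",
    "Message-id",
    "Message-Id",
    "Message-ID",
    "Notes-Universal-ID",
]


def first_non_blank(properties: dict, field_names: list) -> Optional[str]:
    """Return the first value that is present and not blank, else None."""
    for name in field_names:
        value = properties.get(name)
        if value is not None and str(value).strip():
            return str(value)
    return None


# A Lotus Notes item has none of the SMTP-style fields populated.
notes_item = {"Mapi-Smtp-Message-Id": "  ", "Notes-Universal-ID": "ABC123"}
print(first_non_blank(notes_item, MESSAGE_ID_FIELDS))  # ABC123
```

Because the lookup stops at the first populated field, the order of the fields in the list determines which source takes precedence.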
DERIVED METADATA OPTIONS
Node Types:
On the Edit Derived Metadata dialog, the Use custom date format option lets you convert various date
fields into different formats. This is useful when a specific load file format only supports a specific time/date
format (e.g., Concordance MM/DD/YYYY).
Click Add Field to build the required formats. If any special characters (/) are required between each date
segment, you must insert them into the text box. A preview helps you see if the date is correctly formatted.
Note: These formatting options only function on true date fields. If the date in the field looks like a date, but
is actually stored as a simple string of text by the native application, Nuix cannot apply the custom date
format. Values for PDF-Creation-Date often exhibit this behavior.
Date part options include:
CREATING DERIVED METADATA FIELDS
Use the following example to build a custom, or derived, metadata field for the Last Modified Date of an item.
This field shows the most recent date, if multiple fields exist.
To create a derived metadata field:
1. Explore the available metadata to determine which fields are relevant.
Not all items use the same metadata. For example, the "Last Modified" time on a file does not exist
for a MAPI message. It is therefore necessary to understand the available metadata. This can be
done either through the Metadata Profile builder, by filtering on modif, or by searching the data set
for something like properties:modif*, then sorting by File type and exploring the highlighted results.
2. Once you determine the list of targeted metadata for the different item types, select the Add
Derived Metadata button.
The Edit Metadata Profile dialog displays.
3. In the Name field, type Last Modified Date.
4. Select First non-blank value and right-click to select Replace with > Highest Value.
5. Select Highest value and right-click to select Add child expression > Metadata Value.
6. Change the drop down list box from Nuix-defined Metadata to Properties.
7. In Filter, type modif.
8. From the list, select File Modified and Mapi-Last-Modification-Time and click OK.
Use Ctrl and Ctrl+Shift to single select or multi-select values.
9. Select Use custom date format to define a standard date format.
10. Select the ellipses to display the Edit Date Format dialog.
11. Set the desired date format by clicking Add Field and choosing the format from the menu.
This example shows the MM/dd/yyyy format, which means Month of year (padded to 2 digits) /
Day of month (padded to 2 digits) / Year (4-digit).
Note: Previous custom date formats are listed in the drop-down list.
12. Select OK.
The Last Modified Date (User-derived) metadata field is added to the metadata profile.
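As a rough illustration of what this derived field computes, the following Python sketch picks the highest (most recent) of the two properties and renders it as MM/dd/yyyy. The function last_modified_date and the dictionary of properties are hypothetical; Nuix evaluates the expression internally. The sketch also mirrors the earlier note that custom date formats apply only to true date values, so values stored as text are ignored.

```python
from datetime import datetime
from typing import Optional

# The two properties selected in the procedure above.
DATE_FIELDS = ["File Modified", "Mapi-Last-Modification-Time"]


def last_modified_date(properties: dict) -> Optional[str]:
    """Return the most recent true date value formatted as MM/dd/yyyy."""
    dates = [
        properties[name]
        for name in DATE_FIELDS
        # Strings that merely look like dates are skipped.
        if isinstance(properties.get(name), datetime)
    ]
    if not dates:
        return None
    # %m/%d/%Y is the strftime equivalent of MM/dd/yyyy.
    return max(dates).strftime("%m/%d/%Y")


item = {
    "File Modified": datetime(2012, 3, 5, 10, 0),
    "Mapi-Last-Modification-Time": datetime(2013, 2, 1, 9, 30),
}
print(last_modified_date(item))  # 02/01/2013
```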
Reusing profiles across all Nuix machines, for example displaying metadata for Fast Review jobs
Facilitate consistent views for specific file types (emails, files, internet caches, etc...)
Align with specific client metadata requirements (Legal Export / Summary Reports)
To provide case-specific documentation, when included as a client deliverable to demonstrate
process
Metadata profiles are stored in the following directories:
Windows Vista/7: %AppData%\Nuix\Metadata Profiles
Windows 2000/XP: %UserProfile%\Application Data\Nuix\Metadata Profiles
Each metadata profile is stored as an *.xml file. These files are portable and can be used on any system
running Nuix.
Nuix offers a collection of sample metadata profiles that you can download.
Digest Lists
Nuix allows you to import digest lists from third party sources as well as directly create them from within
Nuix. A digest list is a list of MD5 digests (hashes) for a collection of files.
You can use digest lists to assist with the following operations:
To eliminate system files or other application files that have known signatures and little or no value
to the investigation. This process is often called "De-NISTing".
To eliminate previously produced content. This is done by importing the top-level digest list
report included as part of the legal export.
To eliminate or suppress inappropriate content. If inappropriate content is detected, you can
import/generate a hash list of known inappropriate content, and pass that along as part of the export
process to allow this content to be suppressed downstream.
Nuix supports standard digest lists, including NSRL, iLook, Hashkeeper, as well as plain text.
The plain text format includes a single digest per line. When creating plain text hashes, ensure that there is
no trailing punctuation or whitespace.
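Before importing a plain text digest list, you may want to check it for stray characters. The following Python sketch flags any line that is not exactly 32 hexadecimal characters. The function name find_bad_digest_lines is hypothetical, and the sketch accepts both upper and lower case hexadecimal digits; this guide does not state whether Nuix is case sensitive here.

```python
import re

# An MD5 digest written as text is exactly 32 hexadecimal characters.
MD5_LINE = re.compile(r"[0-9a-fA-F]{32}")


def find_bad_digest_lines(text: str) -> list:
    """Return the 1-based numbers of lines that are not a bare MD5 digest."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if not MD5_LINE.fullmatch(line)
    ]


sample = (
    "d41d8cd98f00b204e9800998ecf8427e\n"
    "d41d8cd98f00b204e9800998ecf8427e \n"   # trailing space
    "d41d8cd98f00b204e9800998ecf8427e,\n"   # trailing punctuation
)
print(find_bad_digest_lines(sample))  # [2, 3]
```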
Importing Digest Lists
You can import digest lists, and remove (delete) them.
To import a digest list, select Global Options > Digest Lists, and then click Add. The Add Digest
List dialog displays.
To remove a digest list, select a specific digest in the list and click Remove.
1. Download the NSRL hash lists from http://www.nsrl.nist.gov/Downloads.htm#isos. Download
all four discs (Disc1-4).
2. Fully extract the contents of each *.iso image. The NSRLFile.txt is the ultimate target.
The NSRLFile.txt for each Disc needs to be streamlined and loaded.
3. To streamline the files, use the following command:
# The cut delimiter is '"' (single quote, double quote, single quote)
cat NSRLFile.txt | cut -d '"' -f 4 | sort | uniq > NSRLFile.sorted
To move to the folder that contains the files, type the following sequence.
a. Move to the root directory:
cd /
b. Show all available folders:
ls
c. Move into the root directory for all the mapped drives:
cd cygdrive
d. Move into the specific folder (for instance, type "d" in place of "drive letter"):
cd drive letter/folder name
4. Once all four of the NSRLFile.txt files have been sorted, combine all of the NSRLFile.sorted files
into a single file. This will allow them to be used as a single Digest List filter.
Alternatively from Cygwin you can do the following:
This will produce a single merged, sorted, and deduplicated hash list.
5. To load the digest lists into Nuix, select File > Global Options > Digest List > Add.
Performing these steps significantly decreases the time it takes to load the digest lists as well as search with
them.
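If Cygwin is not available, the same extract, merge, sort, and deduplicate result can be approximated in Python. This is a sketch under assumptions: the function names are hypothetical, the MD5 is taken from the fourth field when a line is split on the double quote character (the same field the cut command selects), and the files are read as Latin-1 so that unusual bytes in file names do not stop the run. Unlike the cut command, the sketch drops anything that is not 32 hexadecimal characters, which also removes the header row.

```python
import re
from pathlib import Path

MD5_VALUE = re.compile(r"[0-9a-fA-F]{32}")


def extract_md5(line: str):
    """Mimic cut -d '"' -f 4: return the fourth quote-delimited field."""
    fields = line.split('"')
    return fields[3] if len(fields) > 3 else None


def merge_nsrl_digests(paths, output_path) -> int:
    """Write one sorted, deduplicated digest per line; return the count."""
    digests = set()
    for path in paths:
        with open(path, encoding="latin-1") as handle:
            for line in handle:
                md5 = extract_md5(line)
                if md5 and MD5_VALUE.fullmatch(md5):
                    digests.add(md5)
    Path(output_path).write_text(
        "".join(digest + "\n" for digest in sorted(digests)),
        encoding="ascii",
    )
    return len(digests)


# Example call (hypothetical paths):
# merge_nsrl_digests(["Disc1/NSRLFile.txt", "Disc2/NSRLFile.txt"], "NSRL.merged")
```

Holding every digest in a set keeps the code short but uses a large amount of memory for the full NSRL set; the sort and uniq commands are likely to cope better with very large files.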
Email:
Since not all email types actually have a binary stream and two copies of the same message can have
completely different header information, we compute an email's MD5 digest by taking the following data
encoded using UTF-8 as input:
1. Subject header
2. From header
3. To header
4. Cc header
5. Email body text tokenised so whitespace and irrelevant characters are removed.
6. Binary streams of all attachments.
For address headers the personal part is discarded and only the address part is used. The email body is
tokenised to ignore white-space differences, which can be a factor when comparing HTML and plain text
messages.
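The idea of hashing selected email fields rather than a binary stream can be sketched in Python as follows. This sketch will not reproduce Nuix digests: the exact separators, the order in which addresses are joined, and the tokenisation rules are not described in this guide, so the choices below (addresses joined with commas, all whitespace removed from the body) are assumptions made purely for illustration, and email_digest is a hypothetical name.

```python
import hashlib
from email.utils import getaddresses


def address_parts(header_value: str) -> str:
    """Discard the personal part and keep only the address part."""
    pairs = getaddresses([header_value])
    return ",".join(address for _name, address in pairs if address)


def email_digest(subject, from_, to, cc, body, attachments=()) -> str:
    """Illustrative MD5 over subject, address headers, body, attachments."""
    md5 = hashlib.md5()
    md5.update(subject.encode("utf-8"))
    for header in (from_, to, cc):
        md5.update(address_parts(header).encode("utf-8"))
    # Assumed tokenisation: drop all whitespace so layout differences
    # between HTML and plain text bodies do not change the digest.
    md5.update("".join(body.split()).encode("utf-8"))
    for data in attachments:
        md5.update(data)  # attachment binary streams as bytes
    return md5.hexdigest()


a = email_digest("Hi", "Joe Smith <joe@example.com>", "amy@example.com", "", "Hello   world\n")
b = email_digest("Hi", "joe@example.com", "Amy <amy@example.com>", "", "Hello world")
print(a == b)  # True
```

Two copies of the same message with different display names or whitespace produce the same digest in this sketch, which is the effect the description above aims for.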
Shingle Lists
With Nuix 4, you can create a Shingle List from a set of key documents that you can use as a filter against
the dataset or import into other cases to use against other datasets. You can select one or more shingle lists
in the Filtered Items pane to return a list of items that are similar within the resemblance threshold that
you have set in the View Options section.
Note
Only shingle lists that have been created in Nuix's proprietary .shlist format can be imported and
used to find further similar documents.
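To give a feel for what a resemblance threshold means, the following Python sketch implements the general word-shingling technique with Jaccard resemblance. It is not the Nuix implementation: this guide does not describe the shingle size, any hashing, or the contents of the .shlist format, so the three-word shingles and the function names here are assumptions.

```python
def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word sequences of the given size."""
    words = text.lower().split()
    if len(words) < size:
        # Short texts yield a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def resemblance(a: str, b: str, size: int = 3) -> float:
    """Jaccard resemblance: shared shingles divided by all shingles."""
    set_a, set_b = shingles(a, size), shingles(b, size)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


score = resemblance("the quick brown fox jumps", "the quick brown fox sleeps")
print(score)  # 0.5
```

With a threshold of 0.5 these two texts would count as similar; raising the threshold toward 1.0 requires the texts to share a larger proportion of their shingles.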
Word Lists
With Nuix Desktop, you can import a .txt file containing a list of keywords that you can use as a filter against
the dataset. You can select one or more word lists in the Filtered Items pane to produce a list of items that
include the words you have compiled.
Each new word in the text file must be placed on a separate line. There is no limit to the number of words
that you can include in the word list, but the greater the number of words in a list, the greater the number of
matching documents you will receive in the Results list. Select File > Global Options > Word Lists to view,
add, or remove word lists.
Notes:
Multiple words on a single row are treated as an exact phrase (for example, Dog Cat Mouse is treated
like a search for "dog cat mouse"). Quotes are unnecessary, and will be stripped.
Boolean or other searches are not supported within a word list, so "(classification OR maxim)" is
not valid. To perform a series of Boolean or complex searches against a Nuix dataset, the scripting
interface provides you with a means of automatically executing queries, and applying
classifications to the result set. If complex queries or reporting is required, see the scripting section
for additional detail.
Nuix stores the word list as a *.words file in the following directories:
Note: The text file must be encoded in the UTF-8 character set, which is particularly important for
non-Latin-based languages.
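If you build word lists from another source, a small script can enforce the rules above: one term or phrase per line, no quotes, and UTF-8 encoding. The following Python sketch is one way to do this; the function names and the output file name are hypothetical.

```python
from pathlib import Path


def normalise_term(line: str) -> str:
    """Remove double quotes and collapse runs of whitespace."""
    return " ".join(line.replace('"', "").split())


def write_word_list(terms, path) -> list:
    """Write one cleaned term or phrase per line, encoded as UTF-8."""
    cleaned = [term for term in (normalise_term(t) for t in terms) if term]
    Path(path).write_text(
        "".join(term + "\n" for term in cleaned),
        encoding="utf-8",
    )
    return cleaned


# Example call (hypothetical file name):
# write_word_list(['"dog cat mouse"', "  contrato ", "", "契約"], "keywords.txt")
```

Blank entries are dropped, and a phrase such as "dog cat mouse" stays on a single line so that it is treated as one exact phrase.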
Memory
Memory lets you configure the amount of RAM made available to Nuix 4. The amount of RAM allocated to
Nuix 4 can be adjusted up or down to suit the current use case. In general, 4GB of
RAM should be sufficient for most operations. However, if you are working with very large datasets and
performing operations such as finding top-level items or deduplicating large collections, it is not uncommon
to set Memory to 30GB or more.
The 30GB is not reserved when the application is launched; instead, it is set as the maximum threshold of the
Java virtual machine used by Nuix 4. If this value is set disproportionately high, it is important to reset it
to a lower value prior to loading or exporting data. You must balance the memory that could be used by
Nuix 4 against the memory needed by the Nuix single workers used for processing and export operations. For
additional information on allocating application memory, see Allocating memory (RAM) for better performance.
Notes:
The maximum memory that can be allocated on a 32-bit OS is 1300 MB. If you are unable to set
the value higher, confirm that you are using a 64-bit OS.
This option eliminates the need for using the command line switch.
You must close and reopen Nuix 4 for this setting to take effect.
Allocating Memory (RAM) for Better Performance
For both processing and export purposes, Nuix requires an absolute minimum of 2 gigabytes of RAM per
core (that is, per instance of nuix_single_worker.exe running on the system as defined by the Nuix
licence). By default, the software uses a maximum of 1 gigabyte of memory for the 32-bit version, and 1.8
gigabytes for the 64-bit version. However, larger ratios of RAM, for instance 4GB or 8GB of RAM per core,
will dramatically improve performance and prevent Java's Out of Memory errors when processing complex
datasets or exporting large numbers of items.
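A quick way to sanity-check a planned allocation is to add up the memory the workers and the application could use and compare the total with the RAM installed. The following Python sketch does this arithmetic. The function memory_plan is hypothetical, and the 2GB reserved for the operating system is an assumption for illustration, not a Nuix recommendation.

```python
def memory_plan(total_ram_gb, workers, gb_per_worker, app_gb, os_reserve_gb=2.0):
    """Compare the planned allocation with the RAM available.

    os_reserve_gb is an assumed allowance for the operating system.
    """
    needed = workers * gb_per_worker + app_gb + os_reserve_gb
    return {
        "needed_gb": needed,
        "fits": needed <= total_ram_gb,
        # The guide gives 2GB per worker as the absolute minimum.
        "meets_minimum": gb_per_worker >= 2,
    }


# Four workers at 4GB each plus 8GB for the application on a 32GB server.
print(memory_plan(32, workers=4, gb_per_worker=4, app_gb=8))
```

In this example the plan needs 26GB (4 x 4 + 8 + 2), which fits within 32GB; the same plan on a 16GB server would not.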
Examples
Here are examples for working with processing and exporting operations.
Processing:
While processing a 15GB PST file with the default settings on a 64-bit machine, Nuix encountered
an Out of Memory error while trying to process the Inbox folder. Inspection of the PST showed
that the Inbox folder contained 120,000 items and 100+ folders.
The nuix_single_worker.exe process was simply running out of memory while trying to enumerate
that folder. Nuix was restarted with the appropriate parameters to allocate 4GB of RAM
per nuix_single_worker.exe instance and the file processed without issue.
Exporting:
While attempting to export 2 million items with the default settings on a 64-bit machine, Nuix
encountered an Out of Memory error while trying to find all of the top-level items.
The nuix_desktop.exe process was simply running out of memory while trying to build that list. Nuix was
restarted with the appropriate parameters to allocate 8GB of RAM to
the nuix_desktop.exe process and the export proceeded without issue.
gigabytes, although this can vary depending on other software present on the system. Virus scanners and
similar software can map their DLLs for all processes on the system at certain specific virtual addresses,
which can prevent Java allocating more than it should.
Nuix 4 - To allocate additional RAM to Nuix 4, see Nuix 4 System Options.
Nuix Import Workers - To allocate additional RAM to the Nuix workers during ingestion processing,
see Parallel Processing Settings for loading data.
Nuix Export Workers - To allocate additional RAM to the Nuix workers during export operations,
see Parallel Processing for exporting data.
1. In the Run box or at the command line, type gpedit.msc to open the Local Group Policy
Editor.
2. Navigate to Computer Configuration > Administrative Templates > Windows Components >
Terminal Services > Terminal Server > Printer Redirection or Computer Configuration >
Administrative Templates > Windows Components > Remote Desktop Services > Remote Desktop
Session Host > Printer Redirection.
3. Set Do not allow client printer redirection to Enabled.
This policy setting allows you to specify whether to prevent the mapping of client printers in
Terminal Services sessions.
You can use this policy setting to prevent users from redirecting print jobs from the remote
computer to a printer attached to their local (client) computer. By default, Terminal Services allows
this client printer mapping.
If you enable this policy setting, users cannot redirect print jobs from the remote computer to a
local client printer in Terminal Services sessions.
If you disable this policy setting, users can redirect print jobs with client printer mapping.
If you do not configure this policy setting, client printer mapping is not specified at the Group Policy
level. However, an administrator can still disable client printer mapping by using the Terminal
Services Configuration tool.
Running two dual-core licences on different machines is not as fast as a single quad-core licence.
Running in a distributed fashion increases the overall network traffic and requires the completed
indexes to be copied back to the primary server.
Running in a distributed environment requires that you have a license for the master server as well
as all of the worker servers. These can be licensed using a separate dongle plugged into each
machine, or using the Nuix Server as a shared licensing server.
2. Once the processing job completes, the indexes from the remote worker are copied into these
placeholder directories, and are renamed Complete-99087….
2. Once the data is processed, the directories are renamed Complete, and copied to the Master.
Note: There is no garbage collection on these directories, so you will need to clean them out
manually after the case is finalized.
Configuring the Master and Worker Machines
To employ distributed processing with Nuix, you need to properly configure both the master machine and
the worker machines when you create a new case in Nuix.
When running this for the first time, we recommend that Run local workers is disabled. This
allows you to ensure that the worker machines are connecting correctly.
The only reason to disable the Run local workers option in a production environment is when
your case server is used only for case access and you have a pool of processing machines that
is shared among a collection of case servers. In this case, it might be advantageous to use
only the processing resources from the pool.
3. Select the desired evidence, ensuring that the evidence definition references a universally
accessible pool or source evidence.
4. Start the processing job.
Note: If you are doing this for the first time and have deselected Run local workers, the
processing window will appear and sit idle.
1. Open the Nuix Worker by going to Start > Programs > Nuix > Nuix 4. Select the appropriate 32 or
64-bit version.
2. In the Nuix Worker dialog, define the master machine where the case has been configured.
a. Master Hostname: The DNS name or the IP address of the master machine.
b. Directory: The local working directory.
c. Number of workers: The number of nuix_single_worker.exe instances to be run on the worker
machine.
d. Memory per-worker (MB): The amount of RAM to allocate to each of the
individual nuix_single_worker.exe processes.
3. Click Start. The Nuix worker begins processing, and updates are posted to the Nuix Processing
window on the master.
Create a shared drive for holding the case directory.
A common disk resource must be available for the case directory. The case directory is typically a
storage pool local to the Master processing server, presented as a share to the other servers
running in the team. When declaring the UNC path, enter the entire UNC path in the File
Name field, or browse to it using the Look in control.
Create Local Working Folders
Each Nuix worker machine requires its own temporary working directory. Nuix uses this directory to create its
local set of indexes. Once the indexing process is complete, these temp/local indexes are copied from the
worker servers to the case directory on the master machine.
Search
This section explains how to search for evidence in Nuix, including:
Using the Search bar at the top of the Workbench tab to search using simple keyword queries
and/or dates.
Using the predefined Filtered Items categories within the search query to refine the evidence
based on metadata type.
1. Type directly into the Search text field or cut and paste a predefined query into the field.
The Search field can hold an unlimited number of characters, so queries can be as long as
necessary. You can use Boolean operators such as AND, OR, and NOT between search terms,
and quotes around phrases. For more information on the supported search syntax, see Search
Query Syntax.
2. If needed, use the date filter to search Between, After, or Before certain dates, or use the Not
between option to exclude a specific date range. The date filter searches on the Nuix Item Date.
For emails, this is the Nuix Communications Date, which is the Mapi-Client-Submit-Time, Sent
Date, or Date metadata property. For files, it is the File Modified date or, if that is not present, the File
Created date. If the item doesn't have any of these date fields, the item date of the parent item is
used. The left date control searches from 00:00:00 (HH:MM:SS) and the right date control
searches until 23:59:59 on the selected date.
3. Click the Search button or press the Enter key to run the search.
View and reuse prior search strings using the Backwards and Forwards arrow buttons in
front of the Search field.
Clear the search keywords and date filter by clicking the Clear button. When you clear a search,
any selected nodes in the Filtered Items pane are cleared as well.
Build a more complex search query by clicking Advanced.
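The item date rules used by the date filter can be pictured with the following Python sketch. It is only a model: items are represented as plain dictionaries, the function names are hypothetical, and the sketch assumes that the email properties are tried in the order in which they are listed in this guide.

```python
from datetime import date, datetime, time

# Property names as given in this guide; the precedence order is assumed.
EMAIL_DATE_FIELDS = ["Mapi-Client-Submit-Time", "Sent Date", "Date"]
FILE_DATE_FIELDS = ["File Modified", "File Created"]


def item_date(item: dict):
    """Return the item's date, falling back to the parent item's date."""
    fields = EMAIL_DATE_FIELDS if item.get("kind") == "email" else FILE_DATE_FIELDS
    for name in fields:
        value = item.get("properties", {}).get(name)
        if value is not None:
            return value
    parent = item.get("parent")
    return item_date(parent) if parent else None


def between(item: dict, start: date, end: date) -> bool:
    """Between filter: from 00:00:00 on start until 23:59:59 on end."""
    value = item_date(item)
    if value is None:
        return False
    lower = datetime.combine(start, time.min)
    upper = datetime.combine(end, time(23, 59, 59))
    return lower <= value <= upper


email = {"kind": "email", "properties": {"Sent Date": datetime(2013, 2, 1, 23, 59, 59)}}
attachment = {"kind": "file", "properties": {}, "parent": email}
print(between(attachment, date(2013, 2, 1), date(2013, 2, 1)))  # True
```

The attachment has no date fields of its own, so it inherits the date of its parent email and is returned by a search for that day.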
including only the evidence you want to search by clearing the nodes in the tree you do not wish to
search, in the Evidence pane.
including only the types of items you want to search for by selecting the appropriate metadata
filters in the Filtered Items pane.
When you then use the Search bar, evidence and filtered items that are unselected in the Document
Navigator will be excluded from the search.
To create a search query using the Advanced Search tool, click the Advanced button in the search bar on the
Workbench tab.
The following screen is displayed:
1. Select a criterion (type of metadata) for which you want to search.
The available types are keywords, file size, file type, tags, comments, custodians, item sets,
production sets, document ID, and filters.
2. Enter the values for the criterion selected.
For example, if you select File size, specify the minimum and maximum range in byte sizes to
match against.
3. Click Add to Expression.
This adds the search syntax, called rules, to the query and displays it in the Expression table.
4. Repeat steps 1-3 as needed until your query contains all the criteria you need for your search.
5. Select whether to match all of the rules or any of the rules.
6. Click Search to run the search.
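Conceptually, matching all of the rules corresponds to joining them with AND, and matching any of the rules corresponds to joining them with OR. The following Python sketch shows that idea. It is an assumption about the shape of the resulting expression, not a description of the syntax the Advanced Search tool actually generates, and combine_rules is a hypothetical name.

```python
def combine_rules(rules, match_all: bool = True) -> str:
    """Join rules with AND (match all) or OR (match any).

    Each rule is wrapped in parentheses so that its own operators
    cannot change the meaning of the combined expression.
    """
    operator = " AND " if match_all else " OR "
    return operator.join(f"({rule})" for rule in rules)


rules = ["name:joe", "content:report"]
print(combine_rules(rules))                   # (name:joe) AND (content:report)
print(combine_rules(rules, match_all=False))  # (name:joe) OR (content:report)
```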
Other actions that you can perform in the Advanced Search tool include:
Click Edit from the Selected criterion to edit a piece of syntax from the expression table.
Click Remove from the Selected criterion to remove a piece of syntax from the expression table.
Click Clear All from the Selected criterion to clear the entire search from the expression table.
Click the Advanced button in the search bar to close the Advanced Search tool. The search criteria
specified in the fields are saved.
Just like queries that you type into the Search bar, you can save the search queries built in this tool. For
detailed information on the available options, see Advanced Query Builder.
Note: When you save a search query, Nuix saves it in the following location %AppData%\Nuix\Saved
Searches key, in case you need to use a common set of search queries across multiple machines.
Nuix also saves all search queries that you perform within a case in the Search History pane of the
Document Navigator. The Search History lists all searches performed, categorized by how long ago
the searches were performed. This list serves as an audit trail of the searches run within the case, and
also allows you to find and rerun a search that you have not saved.
Simple Queries
The simplest Nuix search query is a single word. When you enter a single word into the Search field it
locates all occurrences of the word, found in the properties, the name, the path and/or text content of items.
The search terms are not case sensitive; the queries "joe", "Joe" or "JOE" will return identical results.
Example:
Note: Nuix by default searches the path name of the item. For example, if the files you are looking for are
located in \Evidence 1\Email\Joe's Email\Important stuff, a default query for Joe finds all
items from "Joe's Email" and below. To exclude the path name from the search, search in each
field: name:joe OR content:joe OR properties:joe.
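If you prepare many such queries, for example for pasting into the Search field, a small helper can build the field-restricted form for you. This Python sketch simply assembles the query string shown in the note above; the function name is hypothetical and the sketch does not escape or quote the search term.

```python
def query_excluding_path(term: str, fields=("name", "content", "properties")) -> str:
    """Search the listed fields explicitly so the path name is left out."""
    return " OR ".join(f"{field}:{term}" for field in fields)


print(query_excluding_path("joe"))  # name:joe OR content:joe OR properties:joe
```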
Wildcard Queries
You can use wildcards to search for multiple words that share some of the same characters. You can use
more than one wildcard in a search term.
Mixing Wildcards
You can use both the single and multiple character wildcards in a single query.
Example:
Fuzzy Queries
Nuix supports fuzzy searches based on the Levenshtein distance or "Edit distance" algorithm. The
Levenshtein distance between two strings is defined as the minimum number of edits needed to transform
one string into the other, with the allowable edit operations being insertion, deletion, or substitution of a
single character.
To find words that are spelled similarly to a search term, add the tilde (~) symbol at the end of the
search term.
You can add an optional parameter after the tilde to specify the required similarity. The value can be
between 0.0 and 1.0, where higher values require a more similar match (using 1.0 is the same as not using a
fuzzy search) and lower values allow more letters to be different.
The default value in the absence of this parameter is 0.5.
Examples:
cold~ Matches the words "cold", "clod", "mold", "bold", "coil", "mould", ...
cold~0.75 Matches the words "cold", "mold", "bold", ... but not "clod", "mould", ...
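The edit distance described above can be computed with a short dynamic-programming routine. The following Python sketch is illustrative only: it is not Nuix code, and it does not reproduce how Nuix converts the similarity parameter into an allowed distance, which this guide does not specify. It simply reports the Levenshtein distance between "cold" and each word from the examples.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions or substitutions."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


for word in ["cold", "mold", "bold", "clod", "coil", "mould"]:
    print(word, levenshtein("cold", word))
```

In this sketch "mold" and "bold" are one edit away from "cold", while "clod", "coil" and "mould" are two edits away, which is in line with the stricter cold~0.75 example keeping only the closer words.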
Exact Queries
Punctuation is normally removed from a query. To search for text that includes punctuation, use an exact
query: add single quotes (') at the start and the end of the sequence of characters you want to find.
Unicode quotation marks U+2018 (‘) and U+2019 (’) are permitted in addition to the ASCII single quotes.
Note: Exact queries can only be used if support is enabled at indexing time.
'123-456-0000'~2 Matches items containing the text, "123-456-0000", with up to two unrelated characters
mixed in.
Logical (or Boolean) Operators
You can use Boolean operators in your queries to help refine your search tasks. Nuix supports the following
Boolean operators: AND, OR, NOT. While you can chain together any number of logical ANDs (or any
number of logical ORs) without ambiguity, combining different operators can lead to ambiguity. In
such cases, you can use parentheses to clarify the order of operations. The operations within the
innermost pair of parentheses are performed first, followed by the next pair out, and so on, until all
operations within parentheses are complete. Then any operations outside the parentheses are performed.
Review the following sections for details on how to use the logical operators.
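The effect of parentheses on mixed operators can be illustrated outside Nuix with ordinary Python sets. This is a minimal sketch, assuming each search term is modelled as the set of item IDs that contain it, so that AND, OR and NOT correspond to set intersection, union and difference. The item IDs are hypothetical, and the sketch does not describe how Nuix evaluates queries internally.

```python
# Hypothetical item IDs containing each term (illustrative data only).
joe = {1, 2, 3}
bloggs = {2, 3, 4}
smith = {3, 5}

# Joe AND (Bloggs OR Smith): items with Joe that also contain Bloggs or Smith.
grouped_right = joe & (bloggs | smith)

# (Joe AND Bloggs) OR Smith: items with both Joe and Bloggs, plus every Smith item.
grouped_left = (joe & bloggs) | smith

# Joe NOT Smith: items with Joe that do not contain Smith.
joe_not_smith = joe - smith

print(sorted(grouped_right))  # [2, 3]
print(sorted(grouped_left))   # [2, 3, 5]
print(sorted(joe_not_smith))  # [1, 2]
```

The two groupings return different items from the same three terms, which is why mixed expressions should be parenthesized explicitly.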
AND Operator
A search combining two or more search terms using the AND operator matches only those items that include
all of the individual terms.
The AND operator is case sensitive and must be written in uppercase. If you search using "and" instead, you
will get items that contain the word "and".
You can combine the AND operator with other types of search syntax. For example, you can use AND in
between terms that use a wildcard and a fuzzy search.
If you use two single terms in the query, by default Nuix combines the terms using the AND operator.
Another syntax for the AND operator is to add the plus (+) symbol to additional terms you want to include in
the search; therefore insider AND trading AND options is the same as insider +trading +options.
Examples:
Joe AND Bloggs Matches items that contain both "Joe" and "Bloggs".
Joe Bloggs Matches the same items as the previous query, because AND is the default operator.
Joe +Bloggs Matches the same items as the previous two queries (alternative syntax).
J* AND Bloggs Matches items that contain both text starting with "J" and the full word "Bloggs".
Joe~ AND Bloggs Matches items that match the fuzzy search results for "Joe" and the full word "Bloggs".
OR Operator
A search combining two or more search terms using the OR operator matches items that include at least one
of the terms.
The OR operator behaves much like the AND operator with respect to mixing with other queries.
Example:
NOT Operator
A search combining two or more search terms using the NOT operator matches those items that include the
first term, but do not include the second term.
The NOT operator behaves much like the AND operator with respect to mixing with other queries.
Another syntax for the NOT operator is to add the minus (-) symbol to additional terms you want to exclude
from the search.
Examples:
Field Queries
A single search term, such as "Joe", will be matched on text that is contained in the text contents and the
properties of data items. It is possible to restrict the search to specific properties of the data item by the use
of "fields" in the search query.
To restrict a search to a specific field, prefix the search term with the field name followed by the ":" symbol.
For example, the search term "name:Wow" will locate the items whose name contains the term "Wow", but it
will not locate items which simply contain the term "Wow" in the text content.
name:( Embedded AND 1 ) Matches items which contain both "Embedded" and "1" in their names, such as
"Embedded Item 1" or "Embedded Image 1".
name:( picture* OR itext* ) Concise syntax which matches items which contain words beginning with "picture"
or "itext" in their names, such as "Some more pictures", "Picture 1" or "itext6".
Phrase Queries
To search for a sequence of words in a specific order (a phrase), add double quote marks (") at the start and
the end of the phrase.
Example:
Punctuation is removed from the search string automatically, and treated as whitespace. If a punctuation
mark is converted to whitespace, the entire term is automatically converted to a phrase.
Example:
joe.bloggs@nuix.com Matches items that contain "joe bloggs nuix com", in that order.
"joe.bloggs@nuix.com" Matches items that contain "joe bloggs nuix com", in that order. This is the same
result as without quotes.
To search for words within a certain distance of each other, use the tilde (~) symbol at the end of the query
along with a numerical value. This is referred to as the "slop" of a phrase query.
Example:
"Joe* Blog*"~2 Matches items containing "Joe Bloggs", "Joe's Blog" or other combinations that match the
provided wildcards, with up to two unrelated words in between them.
Note: The behaviour of phrase queries with slop applied is not immediately obvious. The number input as
the slop value is applied relative to the term being searched for, whereas some users expect it to be applied
relative to the previous term.
Take the phrase, "The quick brown fox jumps over the lazy dog." Suppose we search for "fox quick"~2. Nuix
will first find "fox", and then look for "quick" in the position immediately after "fox", allowing it to
fall up to 2 words either side of that position.
Visually this can be represented as follows:
The numbers below the words indicate the slop value required to match each term, from -2 up to 2.
Therefore, the following queries should (and do) result in a match (this assumes that stop words are not in
use):
"fox jumps"~2
"fox over"~2
"fox the"~2
"fox brown"~2
The following queries do not result in a match:
"fox quick"~2
"fox lazy"~2
Additionally, "fox fox"~2 does not return a match as phrase queries can only match each term once for each
position in the phrase.
Please see the Java Today - Query Parser Rules and search for "slop" for additional details.
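The slop behaviour described above can be modelled in a few lines of Python. This is a simplified sketch of the two-term case as this guide describes it, not the actual search engine logic: it assumes the second term is expected in the position directly after the first term, may fall up to the slop value either side of that position, and may not reuse the position of the first term. The function name slop_match is hypothetical.

```python
def slop_match(text, first, second, slop):
    """Two-term phrase match with slop, following the model described in this guide."""
    words = [w.strip('.,"').lower() for w in text.split()]
    first, second = first.lower(), second.lower()
    for i, word in enumerate(words):
        if word != first:
            continue
        expected = i + 1  # position directly after the first term
        for j, other in enumerate(words):
            # The second term must be a different position, within slop of the expected one.
            if other == second and j != i and abs(j - expected) <= slop:
                return True
    return False


sentence = "The quick brown fox jumps over the lazy dog."
for second in ["jumps", "over", "the", "brown", "quick", "lazy", "fox"]:
    print(f'"fox {second}"~2', slop_match(sentence, "fox", second, 2))
```

Under these assumptions the sketch reports a match for "jumps", "over", "the" and "brown", and no match for "quick", "lazy" or a repeated "fox", which agrees with the query lists above.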
SYNTAX RESULT
\d A digit (0-9).
\D A non-digit.
[] One of the characters within the brackets.
. Any character.
Examples:
/eat|ate|apple|orange/~2 Matches all items which contain either eat or ate and then apple or orange, with up
to two unrelated terms separating them.
/gr[eao]y/ Matches all items that contain either grey, gray or groy.
/gr[^eao]y/ Matches all items that contain at least one word starting with gr followed by a character that is
not e, a or o, followed by y. This query would match griy and gr3y.
/.oe.* not/ An example of a phrase query. Matches all items that have a word starting with any letter followed
by oe, optionally followed by any other characters then the word not. This query would match "does not",
"joe not" and "ioexception not".
/\d{4}-.*/ Matches all hyphenated terms starting with 4 digits. This query would match 0404-, 8823-4524 and
8823-4524-6754-2345.
/0\d{1,3}/ Matches all items that start with 0 followed by 1 to 3 digits. This query would match 02, 0404, 00
and 080.
/0\d{1,3} \d{3,4} \d{3,4}/ OR /0\d{1,3} \d{6,8}/ Matches all items that may contain local phone number
patterns. The first part of this query would match 02 2328 1929, 043 232 192 and 0404 0233 2333. The second
part would match 02 23281929, 043 23221923 and 0404 023323. There are different conventions for how phone
numbers are grouped, so you will probably need to adjust this query for different cases.
/[\u0400-\u052f]*/ Matches all unicode Cyrillic and Cyrillic Supplement family of alphabets. Note: adding the
asterisk (*) will highlight whole words for some languages.
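Single-term patterns like these can be tried out before running them against a case. The sketch below uses Python's re module, which is not the engine Nuix uses, so it is only a rough check: it assumes that \d, [] and {m,n} behave the same way in both, and that a single-term pattern must match a whole term. The helper name matching_terms and the sample terms are hypothetical.

```python
import re


def matching_terms(pattern, terms):
    """Return the terms that the pattern matches in full."""
    compiled = re.compile(pattern)
    return [term for term in terms if compiled.fullmatch(term)]


print(matching_terms(r"gr[eao]y", ["grey", "gray", "groy", "griy"]))
print(matching_terms(r"gr[^eao]y", ["grey", "griy", "gr3y"]))
print(matching_terms(r"0\d{1,3}", ["02", "0404", "00", "080", "12", "01234"]))
```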
Range Queries
To search for terms within an upper and lower bound, use square brackets or curly braces. Square brackets
mean that the term on the corresponding side is matched (i.e. the range is inclusive of that bound), whereas
curly braces mean that the term on the corresponding side is not matched (i.e. the range is exclusive of that
bound.)
If you wish to omit either bound, the wildcard character ("*") can be used in place of either bound. Omitting
both is not possible (normal wildcard queries should be used instead.)
The keyword "TO" (or "to") can optionally be inserted between the upper and lower bounds, to make the
query more readable.
When using a range query on date fields, the dates should be entered in yyyyMMdd syntax.
This relatively complex query type should become clearer by reading the following examples.
Example:
{Joe TO Johnathan} Matches items which contain "John" somewhere in the properties or the text content, but
does not match "Joe" nor "Johnathan".
[Joe TO Johnathan} Matches items which contain "John" or "Joe" somewhere in the properties or the text
content, but does not match "Johnathan".
comm-date:[20070101 TO 20070131] Matches items which are inside a top-level communication which was sent in
January 2007.
Note: Range searches on words return any items containing terms that fall between the two bounds in
alphabetical order. For example, if the words "Jet", "Joe", "Joseph", "Joey", "John", "Johnathan" and "Jordan"
are alphabetized, you get "Jet", "Joe", "Joey", "John", "Johnathan", "Jordan" and "Joseph". A range search for
[Joe TO Johnathan] returns the terms that fall between those bounds in alphabetical order: "Joe", "Joey",
"John" and "Johnathan". "Jet", "Jordan" and "Joseph" are excluded because they fall outside the range.
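Inclusive and exclusive bounds can be illustrated with plain string comparison. The following Python sketch is an illustration under stated assumptions, not Nuix code: it assumes terms are compared alphabetically without regard to case, and the function name in_range and its parameters are hypothetical. Because yyyyMMdd dates sort in calendar order when compared as text, the same comparison also covers the date example.

```python
def in_range(term, lower, upper, include_lower=True, include_upper=True):
    """True if term falls between lower and upper in case-insensitive alphabetical order."""
    term, lower, upper = term.lower(), lower.lower(), upper.lower()
    above = term >= lower if include_lower else term > lower
    below = term <= upper if include_upper else term < upper
    return above and below


words = ["Jet", "Joe", "Joey", "John", "Johnathan", "Jordan"]

# [Joe TO Johnathan]: both bounds are included.
print([w for w in words if in_range(w, "Joe", "Johnathan")])

# {Joe TO Johnathan}: both bounds are excluded.
print([w for w in words if in_range(w, "Joe", "Johnathan", False, False)])

# [20070101 TO 20070131]: a date in January 2007 falls inside the range.
print(in_range("20070115", "20070101", "20070131"))
```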
Proximity Operators
W/n
A search combining two or more search terms using the W/n operator matches only items in which the terms
occur near each other. The maximum distance which matches is specified as the parameter n. One term must be
matched within that distance of the other.
The operator is case insensitive, so a lowercase "w/n" in queries behaves the same way.
Much like the AND operator, the W/n operator can be used to combine most of the above search queries,
but in particular, NOT queries are not permitted. Complex nested boolean queries are permitted but are
unlikely to give meaningful results.
Examples:
QUERY STRING RESULTS
(John OR Johnny) W/2 Smith Matches items which contain either "John" or "Johnny" a maximum distance of 2
words from "Smith".
(John AND Mary) W/2 Smith Matches items which contain both "John" and "Mary" a maximum distance of 2 words
from "Smith".
PRE/n
The PRE/n operator works similarly to the W/n operator. The difference is that the matches must occur in the
order in which they are specified. The operator is case insensitive, so a lowercase "pre/n" in queries behaves
the same way.
Examples:
(John OR Johnny) PRE/2 Smith Matches items which contain either "John" or "Johnny", with "Smith" occurring
within the 2 words after the first half of the match.
Acme NOT PRE/1 ( Corporation OR Inc ) Matches items which contain "Acme" but without "Corporation" or "Inc" as
the next term.
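The difference between W/n and PRE/n can be sketched with word positions. The Python example below is a simplified model, not Nuix code. It assumes that distance is the difference between the word positions of the two terms, which this guide does not define precisely, so treat the exact counting as an assumption. The function names and the sample sentence are hypothetical.

```python
def positions(words, term):
    """Word positions at which the term occurs."""
    return [i for i, word in enumerate(words) if word == term.lower()]


def within(text, first, second, n, ordered=False):
    """W/n when ordered is False; PRE/n (first term before second) when ordered is True."""
    words = [w.strip('.,"').lower() for w in text.split()]
    for i in positions(words, first):
        for j in positions(words, second):
            distance = j - i
            if ordered and 0 < distance <= n:
                return True
            if not ordered and 0 < abs(distance) <= n:
                return True
    return False


text = "Smith met John yesterday"
print(within(text, "John", "Smith", 2))                # John W/2 Smith
print(within(text, "John", "Smith", 2, ordered=True))  # John PRE/2 Smith
```

In this sentence "Smith" appears before "John", so the W/2 form matches while the PRE/2 form does not.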
Operator Precedence
Query operators group in the following precedence ordering, from highest to lowest:
Examples:
Operator | Notes
1. Slop suffix ('~') | Can only be used with quoted queries, quoted wildcard queries, exact queries or regex queries.
2. Fuzzy suffix ('~') | Can only be used with unquoted queries.
3. Groups ( '(' ... ')' ) | -
4. Field prefix ('field:') | -
5. Alternate logical operators ('+', '-') | -
6. NOT | -
7. W/n, PRE/n, NOT W/n, NOT PRE/n | -
8. AND | -
9. OR | -
Operator Grouping
If AND and OR operators are mixed in a single expression, use parentheses to group the expression to
produce the desired query.
Examples:
Joe AND (Bloggs OR Smith) Matches all items that contain both Joe, and either Bloggs or Smith (so it would
not match "Keith Smith", but it would match "Joe Bloggs".)
Indexed Fields
Nuix provides a variety of different indexed fields to help you search by the metadata associated with an
item, instead of just searching the full text of an item. Review each type of indexed field to understand the full
range of search tasks you can perform.
For example, using a single search term such as "Joe" returns items wherein that word was in the text
contents or properties. However, you can also restrict the search to specific properties of the item by using
the fields Nuix has indexed in your query. To restrict a search to a specific indexed field, prefix the term for
which you are searching with the field name followed by a colon (:). For example, the search
expression name:wow locates the items whose name contains the term "wow", but it will not locate items that
only contain the term "wow" in the text content.
When using fields, note that the field search only works against the word that directly follows the colon. If you
want to search for the phrase "Options to sell" in the subject of an email or in the name of an item, you would
use name:"Options to sell"; otherwise, only items matching the word "options" in the subject or title
are found.
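If you build field queries in a script before pasting them into the Search field, a small helper can apply the quoting rule described above. This Python sketch is an illustration, not part of Nuix: the function name field_query is hypothetical, and it rejects values that contain double quotes because this guide does not describe how to escape them.

```python
def field_query(field, value):
    """Build a field query, quoting the value when it is a multi-word phrase."""
    if '"' in value:
        # Escaping rules are not covered here, so refuse rather than guess.
        raise ValueError("value must not contain double quotes")
    if any(ch.isspace() for ch in value):
        return f'{field}:"{value}"'
    return f"{field}:{value}"


print(field_query("name", "wow"))              # name:wow
print(field_query("name", "Options to sell"))  # name:"Options to sell"
```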
Common Fields
You can use the Nuix Common fields to search for additional attributes of an item that aren't necessarily
part of the content of the item itself. For example, you can search for all items that have extracted text and are
emails by using the query: contains-text:1 AND kind:email.
You can search within common fields by typing the field name followed by a colon (:) and then the term you
are looking for. You can also search against more than one field at a time in a query.
For example, you want to find only documents that have the name "Joe Bloggs" as the author. To do so, in
the Search field type:
kind:document properties:"Author:Joe Bloggs"
content
Searches within the email body or the text portion of a document.
Example:
content:wow
Matches all items that contain the term "wow" in the email body or text portion of a document, essentially the
Text tab of the Preview pane.
name
Searches on the file name of the item, or in the subject of email messages.
Example:
name:"Check this out"
Matches items with the phrase "Check this out" somewhere in the file name, including email items with the
phrase somewhere in the subject.
kind
You can use this field to search for items based on the kind of data they contain. This is similar to using the
mime-type field, but simpler to use.
The supported kinds of items are:
KIND EXPLANATION
email Email messages.
spreadsheet Spreadsheets.
other-document Other types of document a user might create.
Examples:
kind:email Matches all email messages.
-kind:system Excludes all system files.
mime-type
Searches on the MIME type of the item. This field is the more advanced alternative to the kind field, and
allows you to select more specific types of items in your query.
Examples:
mime-type:application/vnd.ms-outlook-note Matches all Outlook email messages.
mime-type:application/vnd.ms-outlook* Matches all Outlook data items.
mime-type:image* Matches all images (some image types, however, may have different MIME types, for instance
Adobe Illustrator does not fall into this category).
properties
Searches the property names and values associated with every item.
Examples:
properties:"Author: Joe Bloggs" Matches data items that contain the value "Joe Bloggs" for the "Author"
property. This actually matches some other things, such as "Author Joe Bloggs" all in the value, since the
colon character and other punctuation are ignored in the query.
evidence-metadata
Searches the custom metadata that the investigator added to the case when the evidence was loaded. In
these examples, "site" is a piece of custom metadata.
Examples:
evidence-metadata:"site: 23 Dickson Street, Canberra" Matches items whose top-level evidence folder contains
the value "23 Dickson Street, Canberra" for the "site" metadata field. This actually matches some other
things, such as "site 23 Dickson Street Canberra" all in the value, since the colon character and other
punctuation are ignored in the query.
has-binary
You can use this field to search for items that either have or do not have binary data. Only a few types of
items lack binary data, such as filesystem directories, mail folders or folders inside compressed zip files.
This field contains either 0 or 1. Use a 1 to find items that contain binary data. Use a 0 to find items that do
not contain binary data.
Examples:
contains-text
You can use this field to search for items that either have or do not have text data. This field only applies to
items that are returned by has-text:1, therefore images, videos, etc., are never matched.
This field contains either 0 or 1. Use a 1 to find items that contain text data. Use a 0 to find items that do not
contain text data.
Examples:
mime-type:application/pdf AND contains-text:0 Matches all pdf documents that do not contain text.
has-text
You can use this field to search for items that either can or cannot contain text data. This type of search does
not imply the document has text, but rather just that the item type could contain text.
This field contains either 0 or 1. Use a 1 to find items that could contain text data. Use a 0 to find items that
cannot contain text data.
Examples:
has-image
You can use this field to search for items that could contain image data.
This field contains either 0 or 1. Use a 1 to find items that could contain images. Use a 0 to find items that
cannot contain images.
Examples:
has-communication
You can use this field to search for items that have communication data. This search matches items that are
communications in their own right, but not the items that are attached to, or associated with, a
communication. To search for attachments, see the communications fields.
This field contains either 0 or 1. Use a 1 to find items that contain communications fields. Use a 0 to find
items that do not contain communications fields.
Examples:
has-communication:0 Matches all items that do not contain communication fields (To, Cc, Bcc, From fields).
has-embedded-data
You can use this field to search for items that could contain embedded data. Using this search matches
items that have the ability to contain embedded data. For instance, it will match all directories even if a
directory contains no files.
This field contains either 0 or 1. Use a 1 to find items that could contain embedded data. Use a 0 to find
items that cannot contain embedded data.
Examples:
QUERY STRING RESULTS
has-embedded-data:1 Matches all items that could contain embedded items. This does not mean that the item
does contain embedded items.
has-embedded-data:0 Matches all items that cannot contain embedded items.
audited-size
Searches for audited items of the same size. Audited items have been marked for auditing and will also be
matched with flag:audited.
Examples:
audited-size:[400 TO 789] Matches all audited data items with a size from 400 to 789.
audited-size:* Matches all data items with an audited size, although flag:audited will run quicker.
-audited-size:* Matches all data items without an audited size, although -flag:audited will run quicker.
date-properties
Searches over the date properties associated with every item of data.
Examples:
date-properties:"last saved":[20021118 TO 20021119] Matches data items that have the date property "last
saved" with the value between 2002-11-18 and 2002-11-19.
digest-input-size
Searches for items of the same size.
Examples:
QUERY STRING RESULTS
digest-input-size:[400 TO 789] Matches all data items with a size from 400 to 789.
digest-input-size:* Matches all data items with a computed size.
-digest-input-size:* Matches all data items without a computed size, which includes directories and evidence
folders.
digests
Searches on the digests of the items. Digests are created from these components.
You can use this field to find data items with contents identical to other items in the same data set, and also
items outside the data set. Due to the nature of digests, queries on this field may (although it is extremely
unlikely) return data items that are not actually identical to the data item you are looking for.
Note: The software will only compute digests on files less than 256MB in size, for the sake of faster
processing.
Digests supported by Nuix have lengths as detailed in the following table. The number of hexadecimal digits
represents how many digits will come after the colon when using this field in a search query.
DIGEST LENGTH IN BITS HEXADECIMAL DIGITS
MD5 128 32
SHA-1 160 40
SHA-256 256 64
Example:
sha-1:354d8b33aa51aed2a7fcb8ad5476a5d5ede8bb2a Matches all items with the SHA-1 digest
"354d8b33aa51aed2a7fcb8ad5476a5d5ede8bb2a".
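To search for a file you hold outside the case, you can compute its digest yourself and paste the result into a query. The Python sketch below uses the standard hashlib module and is illustrative only. It assumes the digest is calculated over the raw bytes of the file; Nuix may build the digest of some item types from other components, so a digest calculated this way is not guaranteed to match. The function name digest_queries and the sample bytes are hypothetical.

```python
import hashlib


def digest_queries(data):
    """Build md5, sha-1 and sha-256 field queries for the given bytes."""
    return {
        "md5": "md5:" + hashlib.md5(data).hexdigest(),
        "sha-1": "sha-1:" + hashlib.sha1(data).hexdigest(),
        "sha-256": "sha-256:" + hashlib.sha256(data).hexdigest(),
    }


for field, query in digest_queries(b"example item content").items():
    hex_digits = len(query.split(":", 1)[1])  # digits after the colon
    print(field, hex_digits, query)
```

The number of hexadecimal digits printed for each field (32, 40 and 64) is the digest length in bits divided by four.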
family
Searches over the name, content, property names and values associated with every item of data, but returns
the associated top-level item for any hits. This is a convenience field which automatically searches over the
family-content, family-name and family-properties fields.
Examples:
QUERY STRING RESULTS
family:Bloggs Matches top-level items which have a family item with Bloggs in any name, content, property name
or property value.
family-content
Searches over item contents but returns the associated top-level item for any hits.
Examples:
family-name
Searches over item names but returns the associated top-level item for any hits.
Examples:
family-properties
Searches over the property names and values associated with every item of data, but returns the associated
top-level item for any hits.
Examples:
family-properties:Bloggs Matches top-level items which have a family item with the name Bloggs in any property
name or property value.
family-properties:"Author: Joe Bloggs" Matches top-level items which have a family item which contain the
value "Joe Bloggs" for the "Author" property. This actually matches some other things, such as "Author Joe
Bloggs" all in the value, since the colon character and other punctuation are ignored in the query.
file-extensions
Searches over the file extensions detected over all items.
Examples:
QUERY STRING RESULTS
file-extension:* Matches data items which were identified as having a file extension when originally
processed.
flag
Searches for items that were flagged by Nuix as being of a particular type. You can use this field to find
items that were identified as a particular type during processing.
Examples:
flag:inline Matches all data items which were marked as being displayed as a part of an outer item. An
example is an image inside an RTF document.
flag:partially_processed Indicates the item's children were only partially processed. Some children were
explicitly skipped at the direction of the user.
flag:identification_disabled Indicates that identification was disabled when the item was processed. Applies
to Nuix 4.0 and above.
flag:poison Matches all data items which caused a critical error during processing on several attempts.
flag:text_stripped Matches all data items whose text was determined via text stripping.
flag:audited Matches all data items which have been marked to be audited when calculating the total size for
audited licences. These items would be exported in a legal export.
flag:top_level Matches all data items which have been marked as being top-level items. Applies to Nuix 3.2
and above.
flag:not_top_level Matches all data items which have been marked as not being top-level items. Applies to
Nuix 3.2 and above.
flag:loose_file Matches all data items which have been marked as loose files, which are the files you would
see, for example, in Windows explorer when browsing a directory. Loose files within a disk image are also
flagged. Applies to Nuix 3.2 and above.
flag:not_loose_file Matches all data items which have been marked as not being loose files. Applies to Nuix
3.2 and above.
flag:physical_file Matches all data items which have been marked as physical files, which are the highest
items in the data tree which contain binary. These will correspond to the "real files" that were loaded into
the case. As a consequence, this can flag disk images and forensic containers unlike the loose_file flag.
Applies to Nuix 4.2 and above.
flag:not_physical_file Matches all data items which have been marked as not being physical files. Applies to
Nuix 4.2 and above.
flag:licence_restricted Matches all data items which have not been processed fully due to a licence
restriction. Applies to Nuix 3.6 and above.
flag:reloaded Matches all data items which have been reloaded from source data or a new binary file. Applies
to Nuix 4.0 and above.
flag:suppressed_immaterial_item Matches all data items which contain an immaterial item that has not been
exposed as a separate data item. This is only present when the "Hide Immaterial Items" processing option is
enabled. Applies to Nuix 4.0 and above.
flag:decrypted Matches all data items which were encrypted but successfully decrypted. Applies to Nuix 4.2
and above.
float-properties
Searches over the fractional number properties associated with every item of data.
Examples:
has-stored
Matches items which have the stored data type specified by the query.
Examples:
integer-properties
Searches over the whole number properties associated with every item of data.
Examples:
integer-properties:"char count":[14000 TO 16000] Matches data items that have the whole number property
"char count" with the value between 14000 and 16000.
item-date
Searches for an item by the date of the item. The date of the item will generally be the same as the
communication date for items which represent a communication and the modified date for other types of
item. Items without a date inherit the date of their parent.
This field is only of practical use in conjunction with a range query.
Note: This field will only be present on cases created with version 3.0 and above. Using it on earlier cases
will return zero search results.
Examples:
item-id
Searches for the short ID that is unique to each item in this case.
Examples:
item-id:1-1234 Matches the data item, contained in a compound case, with the given ID. In this case the item
1234 is part of the first case added to a compound case.
Note: The item-id is there for convenience and should not be relied upon as the sole reference of a
document, as it is updated when simple cases are aggregated into compound cases. For example, as part of
a simple case, each item is assigned a numerical item-id (12345). When that simple case is combined into a
compound case, the item-id is prefixed with the relative position of the simple case within the compound
case. If the simple case containing item 12345 was the second simple case added to the compound case,
the new item-id would be 1-12345. "1-" represents the location of the simple case within the compound case
and 12345 represents the item within the original simple case. The Nuix GUID is the only absolute reference
for an item.
lang
Matches items that contain text in specified languages.
The value is an ISO 639-3 code. The full list of codes can be found here: http://www.sil.org/iso639-3/.
Examples:
lang:(rus OR ukr) Matches all items that contain Russian or Ukrainian text.
md5, sha-1, sha-256
Searches on the digests of the data items. This can be used to find data items with contents identical to
other items in the same data set, and also items outside the data set.
Note:
Due to the nature of digests, queries on this field may (although it's extremely unlikely) return data
items which are not actually identical to the data item you are looking for.
The software will only compute digests on files less than 256MB in size, for the sake of faster
processing.
Digests supported by the software have lengths as detailed in the following table. The number of
hexadecimal digits represents how many digits will come after the colon when using this field in a search
query.
DIGEST LENGTH IN BITS HEXADECIMAL DIGITS
MD5 128 32
SHA-1 160 40
SHA-256 256 64
Examples:
sha-1:354d8b33aa51aed2a7fcb8ad5476a5d5ede8bb2a Matches all items with the SHA-1 digest
"354d8b33aa51aed2a7fcb8ad5476a5d5ede8bb2a".
modifications
Matches items which have been modified since their initial load.
Examples:
modification:reloaded Matches all data items which have been reloaded since the initial load.
named-entities
Matches items that contain the specified named entity.
Examples:
path-kind
Matches items where one of the ancestors of the item contains the specified kind of data. This is similar to
using the path-mime-type field, but simpler to use. This can be used if you know what you're searching for
was inside a certain kind of data.
Examples:
path-mime-type
Matches items where one of the ancestors of the item has the provided MIME type. This can be used if you
know what you're searching for was inside a certain kind of data.
Examples:
path-mime-type:application/vnd.ms-* Matches all items (chiefly images) contained within Microsoft documents.
path-name
Matches items where one of the ancestors of the item has the provided name.
This can be used if you know what you're searching for was inside a certain kind of data.
Examples:
previous-version-docid
The doc ID of the previous version, present on items whose content replaces other items.
Examples:
print-method
Searches for items that are stored as PDF, based on how the PDF was created.
You can use this to identify items that have been printed in a less than ideal fashion, so that custom PDFs
can be substituted for these items.
Examples:
skintone
Searches for items with a skintone score in the specified range. This search only works if skin tone analysis
was selected when you created the case.
You can use this to match all images that have set levels of skin-tone. The skintone filter uses the following
ranges.
LEVEL LOWER RANGE UPPER RANGE
Low 0.00 0.05
Examples:
skintone:[0.20 TO 1.01] Matches all data items with Severe or High skin tone values.
path-mime-type
Searches for items where one of the ancestors of the item has the provided MIME type.
You can use this field if you know what you're searching for was inside a certain type of data.
Examples:
path-mime-type:application/vnd.ms-* Matches all items (chiefly images) contained within Microsoft documents.
encrypted
Searches for items that have been encrypted. You can use this field to return all encrypted office and PST
files.
This field contains either 0 or 1. Use a 1 to find items that are encrypted. Use a 0 to find items that are not
encrypted.
Examples:
characters
Searches for items that contain the specified characters.
You can use this to return all items that contain characters from particular writing systems. The following
types are supported.
TYPE EXPLANATION
arabic Includes Arabic characters.
chinese Includes Chinese characters that are shared with other languages such as Japanese and Korean.
cyrillic Includes many East and South Slavic languages, and almost all languages in the former Soviet Union.
hangul Includes characters from the native alphabet of the Korean language.
non-latin Includes any characters not found in common Latin (English, Spanish, German, etc.) text.
Examples:
deleted
Searches for items that were deleted, and those that were recovered from slackspace. You can use this to
find items flagged as deleted during processing, such as EnCase files that were marked as deleted, or to find
deleted email messages and their attachments from the slackspace of Microsoft email containers.
This field contains either 0 or 1. Use a 1 to find deleted items. Use a 0 to find items that were not deleted.
Examples:
deleted:0 Matches all data items that were not marked as deleted.
Refer to Deleted items for additional details. For information on how to ensure that permanently deleted items
are being processed, review the Extract from mailbox slack-space option from the Evidence Processing
Settings dialog.
exclusion
Matches items that have been excluded by the given exclusion name.
Examples:
exclusion:"irrelevant*" Finds all items that have been excluded with an exclusion name starting with
"irrelevant".
has-exclusion
Matches all items based on whether they do or don't have any exclusion.
Examples:
custodian
Matches items assigned to a custodian.
Examples:
custodian:"High*" Finds all items that have been assigned to a custodian with a reference name starting with
"High".
has-custodian
Matches all items based on whether they are or aren't assigned to any custodian.
Examples:
document-id
Matches all items that have the specified document ID.
Examples:
QUERY STRING RESULTS
document-id:DOC-00001* Finds any items with a document ID that starts with, or matches, "DOC-00001", such as "DOC-000012232".
document-id:[DOC-00001 TO DOC-00003] Finds any items with a document ID that has the prefix "DOC" and is numbered, ignoring padding, from 1 to 3.
document-id:[DOC-1 TO DOC-4} Matches the same items as the query above, as padding is ignored and 4 is explicitly excluded with '}'.
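The range behaviour described in these examples (padding ignored, "]" inclusive, "}" exclusive) can be sketched in Python. This is only an illustration of the described semantics, not how Nuix evaluates the query; the function names and the "PREFIX-NUMBER" ID layout are assumptions.

```python
import re


def parse_doc_id(doc_id):
    """Split an ID such as 'DOC-00001' into ('DOC', 1), ignoring zero padding."""
    match = re.fullmatch(r"(.+?)-(\d+)", doc_id)
    if match is None:
        raise ValueError(f"unexpected document ID: {doc_id!r}")
    return match.group(1), int(match.group(2))


def in_range(doc_id, lower, upper, upper_inclusive=True):
    """Return True if doc_id falls between lower and upper.

    upper_inclusive=True models a closing ']', False models a closing '}'.
    """
    prefix, number = parse_doc_id(doc_id)
    low_prefix, low = parse_doc_id(lower)
    up_prefix, up = parse_doc_id(upper)
    if not (prefix == low_prefix == up_prefix):
        return False
    if upper_inclusive:
        return low <= number <= up
    return low <= number < up


# [DOC-00001 TO DOC-00003] and [DOC-1 TO DOC-4} select the same IDs.
print(in_range("DOC-00003", "DOC-00001", "DOC-00003"))
print(in_range("DOC-00003", "DOC-1", "DOC-4", upper_inclusive=False))
```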
production-set
Matches all items that are assigned to the specified production set.
Examples:
has-production-set
Matches all items based on whether they are or aren't assigned to any production set.
Examples:
production-set-guid
Matches all items that are assigned to the specified production set matching based on the production set
GUID.
Examples:
QUERY STRING RESULTS
production-set-guid:2b09b47c-efc8-11e0-a03f-8d0a4924019b Finds any items that are assigned to the production set that has GUID "2b09b47c-efc8-11e0-a03f-8d0a4924019b".
item-set
Matches all items that are assigned to the specified item set.
Examples:
item-set-originals
Matches all items that are assigned to the specified item set as originals.
Examples:
QUERY STRING RESULTS
item-set-originals:"Item Set 1" Finds any items that are assigned to the set "Item Set 1" that were deemed to be originals.
item-set-duplicates
Matches all items that are assigned to the specified set as duplicates.
Examples:
is-item-original
Matches all items that are deemed originals in any item set.
Examples:
is-item-original:0 Finds all items that were not deemed originals in any item set. This includes
duplicates in all sets and items not in any item set.
has-item-set
Matches all items based on whether they are or are not assigned to any item set.
Examples:
item-set-batch
Matches all items, or only the original or duplicate items, that are assigned to the specified item set in the
named batch. It takes two or three parameters separated by ';': the set name, optionally "originals" for
original items or "duplicates" for duplicate items, and the name of the load.
Examples:
item-set-batch:"Item Set 1;load 1" Finds all items that are assigned to the set "Item Set 1" that were assigned during "load 1" batch.
item-set-batch:"Item Set 1;originals;load 1" Finds all original items that are assigned to the item set "Item Set 1" that were assigned during "load 1" batch.
item-set-batch:"Item Set 1;duplicates;load 1" Finds all duplicate items that are assigned to the set "Item Set 1" that were assigned during "load 1" batch.
item-set-batch:"5bca1b8cfcaa4896915f685d6a873eaf;load 1" Finds all items that are assigned to the set with GUID 5bca1b8cfcaa4896915f685d6a873eaf that were assigned during "load 1" batch.
item-set-batch:"5bca1b8cfcaa4896915f685d6a873eaf;originals;load 1" Finds all original items that are assigned to the item set with GUID 5bca1b8cfcaa4896915f685d6a873eaf that were assigned during "load 1" batch.
item-set-batch:"5bca1b8cfcaa4896915f685d6a873eaf;duplicates;load 1" Finds all duplicate items that are assigned to the set with GUID 5bca1b8cfcaa4896915f685d6a873eaf that were assigned during "load 1" batch.
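When these query strings are assembled by a script rather than typed by hand, a small helper keeps the two- and three-parameter forms consistent. The following Python sketch only builds the text of the query; the function name is hypothetical and nothing here calls Nuix itself.

```python
def item_set_batch_query(item_set, batch, subset=None):
    """Build an item-set-batch query string.

    item_set is the set name or GUID, batch is the load name, and subset is
    None, "originals" or "duplicates".
    """
    if subset not in (None, "originals", "duplicates"):
        raise ValueError("subset must be None, 'originals' or 'duplicates'")
    parts = [item_set] if subset is None else [item_set, subset]
    parts.append(batch)
    return 'item-set-batch:"{}"'.format(";".join(parts))


print(item_set_batch_query("Item Set 1", "load 1"))
print(item_set_batch_query("Item Set 1", "load 1", subset="originals"))
```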
Communications Fields
To, Cc, Bcc, and From are Nuix-derived communication fields. This allows Nuix to normalise a variety of
different types of email (Microsoft, Lotus, SMTP) into a standardised set of fields. The data in these fields
can be exported and directly searched with the communications fields.
The To, Cc, Bcc, and From fields are distinct, and therefore must be searched independently. For
example, to find all of the emails sent to [email protected], you must use the following query:
to:[email protected] OR cc:[email protected] OR bcc:[email protected]
Note: The To, Cc, Bcc, and From fields in Nuix are extracted from the message transport headers, and
therefore do not have a direct item level metadata property correlation.
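Because each recipient field has to be named separately, a query covering all of them is a simple OR over the same address. A minimal Python sketch of composing that string follows; the function name and the example address are hypothetical.

```python
def sent_to_query(address, fields=("to", "cc", "bcc")):
    """Join one clause per recipient field with OR."""
    return " OR ".join(f"{field}:{address}" for field in fields)


# Hypothetical address, used only for illustration.
print(sent_to_query("jane.doe@example.com"))
```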
When indexing email addresses, Nuix ignores all punctuation (e.g. "@", ".", "_"). By ignoring punctuation,
Nuix provides a means of performing exact email address searches as well as partial or domain searches.
For example:
[email protected] is indexed as "jane doe nuix com". This allows all of the following query strings to match
the email address:
to:"jane doe"
cc:doe
bcc:nuix.com
from:[email protected]
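The effect of ignoring punctuation can be illustrated with a short Python sketch. It is not the Nuix indexer: splitting on every non-alphanumeric character, lower-casing, and treating a multi-word query as a consecutive phrase are all assumptions made for the illustration, and the address is hypothetical.

```python
import re


def index_tokens(text):
    """Split text into lower-cased alphanumeric tokens, dropping punctuation."""
    return [token for token in re.split(r"[^0-9A-Za-z]+", text.lower()) if token]


def matches(query, address):
    """Return True if the query tokens appear consecutively in the address tokens."""
    wanted = index_tokens(query)
    indexed = index_tokens(address)
    if not wanted:
        return False
    return any(
        indexed[start:start + len(wanted)] == wanted
        for start in range(len(indexed) - len(wanted) + 1)
    )


address = "jane.doe@example.com"  # hypothetical address
print(index_tokens(address))
print(matches("jane doe", address), matches("example.com", address))
```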
from
Searches for items contained inside a communication sent from a party that matches the pattern. The string
here may be a partial address.
Examples:
QUERY STRING RESULT
from:example Matches all messages sent from [email protected], or [email protected].
Note: The from field is a Nuix-derived metadata field that is populated either from the transport headers or,
if not present there, from a combination of the PR_SENDER_EMAIL_ADDRESS and PR_SENDER_NAME properties.
address
Searches for any address contained inside a communication which matches the pattern. The string here may
be a partial address.
Examples:
comm-date
Searches for an item by the date of the communication which contains it. This field is only of practical use in
conjunction with a range query.
Note: This field will only be present on cases created with version 2.18 and above. Using it on earlier cases
will return zero search results.
Examples:
QUERY STRING RESULTS
comm-date:20070101 Matches items which are inside a top-level communication which was sent on 1 January, 2007.
comm-date:[20070101 TO 20070131] Matches items which are inside a top-level communication which was sent in January 2007.
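Since comm-date is mostly useful in range queries, it can help to generate the first and last day of a period rather than type them. The Python sketch below builds a whole-month range in the YYYYMMDD form used above; the function name is hypothetical.

```python
import calendar
from datetime import date


def comm_date_month_query(year, month):
    """Build a comm-date range query covering one calendar month."""
    last_day = calendar.monthrange(year, month)[1]
    first = date(year, month, 1)
    last = date(year, month, last_day)
    return f"comm-date:[{first:%Y%m%d} TO {last:%Y%m%d}]"


print(comm_date_month_query(2007, 1))
```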
recipient
Searches for any recipient address (to / cc / bcc) contained inside a communication which matches the
pattern. The string here may be a partial address.
Examples:
GUID Fields
Nuix 4 assigns a unique ID to each item that it processes. The GUID (globally unique identifier) is unique
across all cases. The GUID should not be confused with the digest identifier (MD5, SHA-1, SHA-256), as
digests are based on the item content and are designed to show authenticity and to find duplicate content.
Example GUID: debfa9a5-4fdb-47d1-b1ea-0cc105a626fa
Note: Nuix supports wildcards with all of the GUID field searches, so searching
for guid:da056718* matches all items whose GUID starts with this string of characters.
guid
Searches for an item with a specific GUID. A GUID is a 100% unique ID assigned to each item of data that is
processed by Nuix Desktop.
By definition, this will always return either one or zero search results.
Example:
parent-guid
Searches for all of the child items that are embedded one level deep for a specific GUID.
The parent-guid is different from the path-guid in that it searches for only directly embedded items, not all
items. In the Parent GUID screen shot below, the parent-guid search only finds those items that are one
level beneath the parent item.
Example:
path-guid
Searches for all of the child items for a specific GUID. This is useful when you need to search for all items
associated with a piece of Nuix evidence.
The path-guid is different from the parent-guid in that it searches for all embedded items, not just
those one level deep. In the Path GUID screen shot below, the path-guid search finds all items, including
those that are within the Inbox.
Example:
Searching for path-guid:3aad2ab02fb04feea7f45077fc0e75de will display all of the subfolders for the "Top
of Personal Folders" folder. In this example, it matches over 10,000 items.
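The difference between parent-guid and path-guid is the difference between direct children and all descendants. The Python sketch below shows this on a small, entirely hypothetical item tree (the keys stand in for GUIDs). Excluding the starting item from its own results is an assumption made for the sketch.

```python
# Hypothetical item tree: each item maps to its parent (None for the root).
PARENTS = {
    "root": None,
    "inbox": "root",
    "msg-1": "inbox",
    "attachment-1": "msg-1",
    "sent": "root",
}


def parent_guid_matches(guid):
    """Items embedded exactly one level beneath the given item."""
    return sorted(child for child, parent in PARENTS.items() if parent == guid)


def path_guid_matches(guid):
    """Items that have the given item anywhere among their ancestors."""
    matches = []
    for item in PARENTS:
        ancestor = PARENTS[item]
        while ancestor is not None:
            if ancestor == guid:
                matches.append(item)
                break
            ancestor = PARENTS[ancestor]
    return sorted(matches)


print(parent_guid_matches("root"))
print(path_guid_matches("root"))
```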
comm-guid
Searches for items that were contained in a communication with a given GUID. For this to make sense, the
GUID provided needs to be a communication; that is, the query has-communication:1 should show the
item in question.
Example:
batch-load-guid
Searches for items which were contained, loaded or reloaded in a particular batch.
Examples:
Annotation Fields
Annotation fields are those that contain information added by investigators. The two types of annotations for
which you can search are tags and comments.
tag
Searches for a specific tag created by an investigator. You must specify the full name of the tag. If the
tag name has any spaces, such as Not Relevant, the tag name must be enclosed in double quotes. You can
also use a minus sign (-) in front of the tag field to exclude a tag from a search. Tags that exist as part
of a nested structure should be separated by the "|" character.
Examples:
tag:"Not Relevant" Matches items that have been tagged as not relevant.
tag:top* Matches items which have been tagged with a tag beginning with 'top', or
any sub-tags of such tags.
tag:"top|*" Matches items tagged "Attorney Work Product" where the tag is nested
three levels deep.
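The quoting, exclusion and nesting rules for the tag field can be captured in a small helper when queries are generated by a script. This Python sketch only builds the query text; the function name is hypothetical, and quoting any name that contains "|" is an assumption based on the quoted example above.

```python
def tag_query(*levels, exclude=False):
    """Build a tag query from one or more nested tag levels."""
    name = "|".join(levels)
    # Names with spaces must be quoted; nested names are quoted here as well.
    if " " in name or "|" in name:
        name = f'"{name}"'
    prefix = "-" if exclude else ""
    return f"{prefix}tag:{name}"


print(tag_query("Not Relevant"))
print(tag_query("Not Relevant", exclude=True))
print(tag_query("top", "*"))
```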
has-tag
Searches for items that either have or do not have tags.
This field contains either 0 or 1. Use a 1 to find items that are associated with a tag. Use a 0 to find items
that have not been tagged.
Examples:
comment
Searches for text in comments made by investigators. Text entered in this field is automatically treated as a
wildcard.
Example:
QUERY STRING RESULTS
comment:bank Matches items containing the word "bank" in the investigator's comments, but also items containing the words "banking" and "embank".
has-comment
Searches for the existence or absence of any comments made by investigators.
This field contains either 0 or 1. Use a 1 to find items that contain comments. Use a 0 to find items that do
not contain comments.
Examples:
QUERY STRING RESULTS
has-comment:1 Matches items that have an associated comment.
automatic-classifier
Searches for items within the given automatic classifier. The parameters are classifier,
subset and classification. They are separated by semicolons and are all optional. If a parameter is
omitted it is treated as a "match all" filter.
Available options for subset are training, automatically-classified and skipped.
Additionally, auto can be used as an alias for automatically-classified.
Examples:
automatic-classifier:"classifier1;training;relevant" Matches items which have been marked as relevant training items for the classifier "classifier1".
automatic-classifier:"classifier1;automatically-classified" Matches all items which have been automatically classified by the classifier "classifier1" regardless of classification.
automatic-classifier:"classifier1;;irrelevant" Matches all items which have been identified as irrelevant by the classifier "classifier1" regardless of how that happened, i.e. both training items and automatically classified items.
automatic-classifier:";automatically-classified;relevant" Matches all items which have been automatically classified as relevant regardless of the classifier used.
automatic-classifier:";;" Matches all items which form any part of any automatic classifier.
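Because all three parameters are optional and positional, it is easy to misplace a semicolon. The Python sketch below builds the query text from named arguments and reproduces the forms shown in the examples; the function name is hypothetical, and the single-parameter form it can produce (classifier only) is not shown in this guide and is an assumption.

```python
SUBSETS = {"training", "automatically-classified", "auto", "skipped"}


def automatic_classifier_query(classifier="", subset="", classification=""):
    """Build an automatic-classifier query; empty parameters mean "match all"."""
    if subset and subset not in SUBSETS:
        raise ValueError(f"unknown subset: {subset!r}")
    parts = [classifier, subset, classification]
    if any(parts):
        # Trailing empty parameters are dropped, as in the two-parameter example.
        while parts[-1] == "":
            parts.pop()
        value = ";".join(parts)
    else:
        value = ";;"
    return f'automatic-classifier:"{value}"'


print(automatic_classifier_query("classifier1", classification="irrelevant"))
print(automatic_classifier_query(subset="auto", classification="relevant"))
```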
cluster
Searches for clustered items using up to three parameters separated by semi-colons.
Examples:
QUERY STRING RESULTS
"investigation23".
markup-set
Searches for markup set items.
Examples:
markup-set:"general public" Matches items which have markups stored in a markup set with a name that includes spaces, in this case "general public".
List-based Fields
List-based search fields allow you to use imported word, shingle, or digest lists to find items.
See more information about importing and working with Word lists, Shingle lists and Digest lists.
digest-list
Searches for items whose digest matches a digest in the named list, effectively equivalent to using a digest
list filter.
Digest lists are frequently used to eliminate duplicate data from previously processed or reviewed evidence.
This is typically done by creating a digest list for a set of evidence and then using the NOT option to ensure
that only unique content is returned.
Example:
shingle-list
Searches for items having at least one shingle contained in the named list, effectively equivalent to using a
shingle list filter.
The name of the list is considered to be case sensitive.
Examples:
shingle-list:"MyFiles;0.9" Matches items with at least one shingle present in the shingle list named "MyFiles", overriding the default resemblance threshold of 0.5 with 0.9.
word-list
Searches for items containing the words and phrases in the given word list file, effectively equivalent to using
a word list filter.
The name of the list is considered to be case sensitive.
Example:
Analyse
Nuix supports a variety of analysis tasks, from analyzing the file types that were processed in a case to
looking for themes or patterns of communication between key custodians. The following workflows are
typical for analysing cases:
Providing a means for senior investigators and attorneys to sift through the data and develop the
case strategy, followed by creating review jobs that will guide the actual document-level review and
tagging by themselves or others.
Providing case data to Litigation Support Vendors, with lists of search strings, keywords, date
ranges, custodians, and similar criteria, which they can use to analyse the data and batch the items
into review jobs for the client.
After you have selected the evidence to analyze, you can evaluate the items in the Results list in many ways,
such as:
Reviewing words in a set of evidence to find those that match relevant keywords or that might
seem out of context with the rest of the data set.
Viewing images to find inappropriate content, including those with high skin tones to detect
pornography.
Analysing the frequency of various file types in a set of evidence to find spreadsheets, containers,
multimedia files or other content types that might help the investigation.
Analysing the email addresses in a set of evidence to see if custodians are emailing competitors,
their personal email addresses, etc.
Analysing communications over time to see how a specific spreadsheet containing the projected
sales figures moves from employee to employee, changing names, and ultimately is sent to a
competitor.
Analysing patterns of communications to see who is talking to who and how frequently.
For each item of type image, a thumbnail rendering of the document displays. In addition, the total number of
copies of that image based on its MD5 digest is listed.
Review Images
Nuix displays a thumbnail for each item, showing the number of copies in parentheses, if any. When working
with the Thumbnails view, we recommend that you select the Deduplicate results option when conducting a
search. This avoids seeing the same image twice in the view.
1. Search for the set of items you wish to analyze, optionally removing duplicates if you want to tag
images and do not want to see multiples of the same items in the view.
2. Optionally, in the Filtered Items pane, select the Skin Toned Images filters to further narrow the
set.
3. In the Results pane, select View by: Thumbnails.
The images matching your search and filter criteria display.
You can now apply a tag to all copies of the item at once.
Notes:
Nuix views a word as any item that is surrounded by white space so 24014 is considered a word.
From a practical perspective, this could be gibberish or it could be a critical zip code.
All words are listed, including all character sets and symbols.
A script is available within the Knowledge Base that can be used to remove all alphanumeric
entries from the list.
Viewing the results by Word List can be memory intensive on large datasets. You may wish to
increase your memory allotment in File > Global Settings > Memory prior to reviewing the list of
individual words in the results set.
Review the Word List
Nuix itemizes all the words in the results set in the word list.
1. Search for the set of items you wish to analyze, optionally selecting the Hide immaterial
items option in the Results pane, if desired.
2. In the Results pane, select View by: Word List.
3. Scroll through the list, or move directly to a specific keyword by typing it in the Filter field.
This filter is based on an anchor at the beginning of the word, so typing "ranteed" will not show
"guaranteed". The filter supports numbers, letters and symbols.
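The anchored behaviour of the Filter field is equivalent to a prefix match rather than a substring match. A minimal Python sketch of the difference follows; the word list is hypothetical, and case-sensitive matching is an assumption.

```python
def filter_word_list(words, typed):
    """Keep only the words that begin with the typed text."""
    return [word for word in words if word.startswith(typed)]


words = ["guaranteed", "guard", "24014"]  # hypothetical word list
print(filter_word_list(words, "gua"))      # prefix: both "gua..." words are kept
print(filter_word_list(words, "ranteed"))  # not a prefix: nothing is kept
```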
1. Select the set of items you wish to analyze, optionally selecting the Hide immaterial items option
in the Results pane, if desired.
2. In the Filtered Items pane, select the Word List filters you wish to use.
The results set changes to include only those items that include one or more of the words from the
word list(s) chosen.
3. In the Results pane, select View by: Word List. You can review the number of items for each
word, and double-click a row to open a new Workbench tab to preview those items in
the Preview pane.
To review the file statistics:
File Type - Lists all of the file types encountered during the ingestion process.
Processed - Lists the total number of items processed for the specific file type.
Corrupted - Lists the total number of items that Nuix was unable to process, or found to be
corrupted for a specific file type. *
Encrypted - Lists the total number of items that Nuix detected as encrypted.
Deleted - Lists the total number of permanently deleted items found in Microsoft mail container
formats for a specific file type.
Percentage Encountered - Lists the percentage, by item count, of the total dataset consumed
by the specific file type.
Managing Irregular Files
After you load data into Nuix, you must review the irregular files for that specific collection of evidence. This
workflow should be followed after both of these steps:
Creating a New Case - Immediately after processing, Nuix displays the full case statistics. The
lower portion of the screen lists all of the irregular files.
Adding Case Evidence - Immediately after processing, Nuix displays the full case statistics. To see
the evidence-specific irregular files, in the Evidence pane filter by the evidence name, and then in
the Results pane, select View by: Statistics.
You will want to familiarize yourself with the types of irregular files, as well as develop a consistent exception
handling process.
Note that Nuix only presents those types of irregular files that are present in the case, so this list can vary by
case.
Text Stripped
Text Stripped items are items where Nuix recognised the file type, but does not have a routine to cleanly
extract all text and metadata in accordance with the file type's API. This results in an item that is
searchable, but the text may be garbled or not properly formatted.
Note: Nuix only strips out US-ASCII characters (punctuation, 0-9, A-z). Nuix also uses the UTF-16LE encoding (a
Unicode encoding used by Microsoft) to potentially extract more textual data.
Text stripped file types include the following (list is subject to change):
image/vnd.corel-draw
image/vnd.micrografx-designer
image/x-pict
application/vnd.adobe-photoshop
application/vnd.ms-shortcut
application/vnd.lotus-freelance
application/vnd.lotus-wordpro
application/vnd.borland-paradox
image/vnd.autocad-dwg
image/cgm
application/vnd.myob
application/x-js-taro
application/vnd.lotus-123
application/vnd.ms-works-ss
application/vnd.ms-works-wp
application/vnd.corel-slideshow
application/vnd.ms-visio
application/vnd.corel-quattro
application/vnd.corel-wordperfect
application/vnd.stardivision.calc
application/vnd.stardivision.draw
application/vnd.stardivision.impress
application/vnd.stardivision.math
application/vnd.stardivision.writer
application/x-hwp
application/octet-stream
To search for text stripped files, use the following search syntax:
flag:text_stripped
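The general idea behind text stripping, pulling runs of printable US-ASCII out of raw bytes both as single-byte text and as UTF-16LE, can be sketched in Python as follows. This is an illustration of the technique only, not Nuix's implementation; the function name, the sample bytes and the minimum run length of four characters are assumptions.

```python
import re

# One printable US-ASCII byte (space through tilde).
PRINTABLE = rb"[\x20-\x7e]"


def strip_text(data, min_length=4):
    """Return printable runs found in raw bytes: single-byte first, then UTF-16LE."""
    single = re.findall(PRINTABLE + b"{%d,}" % min_length, data)
    # In UTF-16LE, each US-ASCII character is followed by a zero byte.
    wide = re.findall(b"(?:" + PRINTABLE + rb"\x00){%d,}" % min_length, data)
    return [run.decode("ascii") for run in single] + [
        run.decode("utf-16-le") for run in wide
    ]


# Hypothetical binary content mixing both encodings with non-text bytes.
sample = b"\x00\x01Invoice 42\xff\xfe" + "Budget".encode("utf-16-le") + b"\x00\x00\x03"
print(strip_text(sample))
```

Text recovered this way is searchable but loses layout and ordering, which is consistent with the warning above that stripped text may be garbled.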
Unrecognised
Unrecognised items are items where Nuix did not recognise the header and was therefore unable to assign a
MIME type. For items where Nuix is unable to recognise the header, Nuix tags the item
as application/octet-stream and text strips the item. In addition to extracting the ASCII text, Nuix
extracts all recognisable system metadata.
Note: Nuix only strips out US-ASCII characters (punctuation, 0-9, A-z). Nuix also uses the UTF-16LE encoding (a
Unicode encoding used by Microsoft) to potentially extract more textual data.
To search for unrecognised files, use the following search syntax:
mime-type:application/octet-stream
Bad Extension
Bad Extension indicates items whose file type (MIME type) is not consistent with its file extension.
In this example, the Family.jpeg file is not an image, but is actually a Microsoft Word document.
To search for files with improper extensions, use the following search syntax:
flag:irregular_file_extension
Note: Nuix will set a native file's extension to the "File Extension (Corrected)" during an export. Nuix
records the exported item's definitive metadata in the export item summary, per-item XHTML report files, or
load file.
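The Bad Extension check amounts to comparing the MIME type implied by the file name with the MIME type detected from the content. A rough Python sketch of that comparison is shown below, using the standard library's extension table; the function name is hypothetical, the detected type is supplied by the caller, and this is not how Nuix identifies file types.

```python
import mimetypes


def has_irregular_extension(file_name, detected_mime_type):
    """Return True if the extension implies a different MIME type than was detected."""
    expected, _encoding = mimetypes.guess_type(file_name)
    return expected is not None and expected != detected_mime_type


# A Word document that has been renamed with an image extension.
print(has_irregular_extension("Family.jpeg", "application/msword"))
```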
Corrupted
Corrupted items are those that Nuix has been unable to process. Nuix will mark a document corrupt if it is
unable to open the file, when opening the file experiences some type of failure, or is otherwise unable to
process the file.
For items that are listed as Corrupted, the File Type property displays the type of corruption. Additionally, two
pieces of metadata might be recorded: FailureDetail and FailureMessage. By reviewing these items or
optionally building a specific metadata profile that contains these fields, you can gain insight into the nature
of the failures. At times, a reason could be something as simple as a file being locked by an external
process. Holding the mouse over the FailureDetail value displays a hover message with the complete details
for you to review.
properties:FailureDetail
Deleted
Deleted items are those items that Nuix extracted from the slack space of Microsoft email boxes.
Deleted emails are not items from the "Deleted Items" folder, but rather items that have been
"permanently deleted" from within Outlook or Outlook Express. While processing, Nuix attempts to
extract as many fragments as possible, and reconstitute complete messages. However, if only a
portion of the message still exists, Nuix will extract the portion that is available.
To search for deleted items, use the following search syntax:
deleted:1
Deleted items are rarely present in application-created PSTs. They are typically found only in PSTs
created by end users.
Nuix does not find every message that was ever deleted. Through the regular use of a Microsoft
email client, permanently deleted items will be over written. As these messages or attachments are
overwritten, they cease to be recoverable.
Compressing PSTs and OSTs removes deleted items.
Understanding PST Property Blocks and PST Blocks:
When scanning for deleted information, where possible Nuix 4 attempts to reconstitute the complete PST
item. Each PST item has an associated property block that contains all the basic metadata associated with
the item. Metadata with large values, such as the main text of the item, the internet headers (if any),
or attachments, is located via additional file pointers.
With deleted items, often these pointers are no longer valid, so many of the deleted items found are these
"orphaned" property block items, which represent old PST items that can no longer link to their larger
metadata values, but may still contain useful information.
PST blocks are the chunks of data that are no longer referenced. For example, a large PDF attachment will
be broken up into 4kB blocks. When the item associated with the attachment is deleted the blocks of data
are effectively put on a free list, but will often still contain the old data that was resident in the block and may
still contain valuable information.
All extracted metadata properties are included in the text (body) of the document to ensure that this
information can be exported in a usable format.
Encrypted
Encrypted items are those that Nuix has determined to contain encrypted content. Nuix still extracts
metadata, and as much information as possible from an encrypted file, but Nuix is unable to index all of the
content.
To search for encrypted files, use the following search syntax:
encrypted:1
Unsupported Items
Unsupported Items are items for which Nuix was unable to extract any content or text.
To search for unsupported items, use the following search syntax:
( has-embedded-data:0 AND has-text:0 AND has-image:0 AND NOT kind:multimedia )
OR ( mime-type:application/vnd.lotus-notes AND has-embedded-data:0 )
See the Appendix for a listing of Nuix's supported file types.
Non-Searchable PDFs
Non-Searchable PDFs are items that are determined to be a PDF through header recognition but do not
contain indexable text. These items are most frequently image-only PDFs and warrant further investigation,
as the content in these PDFs is not text indexed, and therefore unsearchable by Nuix.
To search for non-searchable PDFs, use the following search syntax:
mime-type:application/pdf AND contains-text:0
See OCR Processing with Nuix for additional details on exporting these items out using a third party tool to
OCR them, and importing them back into Nuix.
Empty
Empty items are items that are zero (0) bytes in size.
To search for empty items, use the following search syntax:
mime-type:application/x-empty
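For a repeatable exception-handling process, the search syntax given for each irregular file type can be kept in one lookup table. The Python sketch below collects the queries from this section and combines several categories with OR; the variable and function names are hypothetical. Corrupted is omitted because this section does not give a complete query for it.

```python
# Search syntax for each irregular file type, as listed in this section.
IRREGULAR_QUERIES = {
    "Text Stripped": "flag:text_stripped",
    "Unrecognised": "mime-type:application/octet-stream",
    "Bad Extension": "flag:irregular_file_extension",
    "Deleted": "deleted:1",
    "Encrypted": "encrypted:1",
    "Unsupported Items": (
        "( has-embedded-data:0 AND has-text:0 AND has-image:0 AND NOT kind:multimedia ) "
        "OR ( mime-type:application/vnd.lotus-notes AND has-embedded-data:0 )"
    ),
    "Non-Searchable PDFs": "mime-type:application/pdf AND contains-text:0",
    "Empty": "mime-type:application/x-empty",
}


def any_irregular_query(categories):
    """Combine the queries for several categories, each wrapped in parentheses."""
    return " OR ".join(f"({IRREGULAR_QUERIES[name]})" for name in categories)


print(any_irregular_query(["Deleted", "Encrypted"]))
```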
With the above filters applied, use various searches to look for files of interest:
1. Search by file size to find files over a certain size. The idea is to look for larger files, in that they are
likely candidates to contain unprocessed child items:
a. digest-input-size:[1000 TO 10000000000000000] – Searches for files that are larger than 1 kB.
2. Filter or search for container files. The idea is to look for any file that is considered a container
within Nuix, as they are likely to contain unprocessed child items:
a. kind:container – Searches for all files that are of kind container.
b. Select the All Items | Containers filter.
3. Eliminate system files from the results. The idea is to eliminate anything that might be a system file
from the result set.
a. NOT kind:system – Searches for all items that are NOT considered system files.
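The three searches above can also be run as a single query. The Python sketch below joins them with AND; combining them this way is an assumption (the steps present them as separate searches), and the function name and parameters are hypothetical. The default bounds are the ones used in step 1.

```python
def large_container_query(min_bytes=1000, max_bytes=10000000000000000):
    """Build one query for large, non-system container files."""
    clauses = [
        f"digest-input-size:[{min_bytes} TO {max_bytes}]",
        "kind:container",
        "NOT kind:system",
    ]
    return " AND ".join(clauses)


print(large_container_query())
```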
Typical workflows to use for review include:
Item Level Review - Systematically work through each irregular file type looking for anomalies. The
most thorough methodology is to create a Fast Review Job for each type of irregular file and apply
a tag acknowledging that this irregular file is accepted.
Group Review - Use the result set view to group and slowly exclude items from the result set by
building queries like -name:picture* AND -name:object
In the Filtered Items pane, we recommend filtering the result set to show only Email, as all attachments and
parent communication data are included in the set. You can select individual addresses, or all items from a
particular domain, and tag or exclude them.
To review web domains and email addresses:
1. Select the evidence for which you wish to view web domains and email addresses.
2. In the Filtered Items pane, filter to show only Email.
3. In the Results pane, select View by: Addresses.
A separate Workbench tab opens to display by addresses, with an item count for each domain
and email address.
4. Optionally, clear any of the communication fields to narrow the list.
5. Optionally, clear the Group by domain option to view a flat list of email addresses, not grouped by
domain.
Any communications in the results list and/or the ancestor emails of the items in the results list.
This means that if the result set hit is an attachment, the communications date of the parent is used
for event mapping. So even if the search is only for *.zip files, the event map will provide value.
Each message is represented by a line from the sender to the recipient. The time and date of the
messages display on the timeline above the map.
Note: Nuix normalises all dates and times to UTC when processed, and then displays them using the time
zone defined in the Case Properties dialogue box.
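The store-in-UTC, display-in-case-time-zone approach described in the note can be illustrated in Python. The timestamp and the ten-hour offset below are hypothetical, and a fixed offset is used for simplicity where a real case time zone would also follow daylight saving rules.

```python
from datetime import datetime, timedelta, timezone


def to_case_time(utc_timestamp, case_offset_hours):
    """Convert a UTC timestamp to a fixed-offset case time zone for display."""
    case_zone = timezone(timedelta(hours=case_offset_hours))
    return utc_timestamp.astimezone(case_zone)


sent = datetime(2013, 2, 1, 23, 30, tzinfo=timezone.utc)
print(to_case_time(sent, 10).isoformat())  # 2013-02-02T09:30:00+10:00
```

Note that the displayed calendar date can differ from the UTC date, which matters when reading the timeline above the map.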
1. In the Filtered Items pane, filter to just Email unless you are also interested in analysing other
types of items (documents, zip files, etc.).
2. Search the evidence using the desired criteria.
3. Review the items in the results set to find a communication you wish to further understand.
4. In the Preview pane, select the Thread link to narrow the result set to just that conversation.
You can also view conversations by Similar Items.
5. In the Results pane, select View by: Event Map.
Displayed is a static diagram showing the communications in the result set over time, with who
sent what to whom, as well as when and how. This view can make it easy to see who is emailing
directly to outside addresses, as well.
6. Display email addresses as you prefer, choosing from one of the following options:
Personal - Displays only the personal portion of each email address. For example, Stephen
Stewart <[email protected]> would only display "Stephen Stewart".
Address - Displays only the address portion of each email address. For example, Stephen
Stewart <[email protected]> would only display "[email protected]".
Personal or Address - Displays either the Personal or Address portion of the email address
depending on its availability.
Formatted Address - Displays the fully formatted email address. For example, Stephen
Stewart <[email protected]> would display "Stephen Stewart
<[email protected]>".
7. Select a node in the diagram to view a specific email from the conversation thread in
the Preview pane.
If desired, you can export the Event Map diagram as an image. See Exporting Information from a View for
more information.
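The four address display options can be illustrated with Python's standard email.utils module. This is a sketch of the idea, not Nuix code: the sample address is hypothetical, and the fallback order used for Personal or Address (personal name first, then address) is an assumption.

```python
from email.utils import parseaddr, formataddr

def display_forms(raw):
    """Return the four display variants for one raw address header value."""
    personal, address = parseaddr(raw)
    return {
        "Personal": personal,
        "Address": address,
        # Assumed behaviour: fall back to the address when no personal name exists.
        "Personal or Address": personal or address,
        "Formatted Address": formataddr((personal, address)),
    }

# Hypothetical address used only for illustration
forms = display_forms("Stephen Stewart <stephen.stewart@example.com>")
```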
You can increase the efficiency of this workflow by clicking the arrow in the far right of the Results pane to
undock the Network view and move it to another monitor, which allows more room for viewing the resulting
items that display when you click on a link in the view. See Interacting with the Network View for information
on how to customize and manipulate this view.
To review patterns of communication:
value to 40 shows only those persons in the result set who have sent at least that many emails (but
no fewer).
5. Filter the view by clearing or selecting the communications fields, such as Direct (To) or Hidden
(Bcc), to show only those types of communications, as needed.
6. Optionally, clear the Run Layout option to halt the dynamic movement of nodes in the view.
When Run Layout is selected, Nuix tries to position the nodes in a readable position. You can still
drag the nodes into different positions, whether this option is selected or not.
7. Review emails sent from person A to person B by double-clicking the link between the two.
A new Workbench tab opens to display the items.
If desired, you can export the Network diagram as an image. See Exporting Information from a View for
more information.
Load Data
With Nuix 4 you can create cases and add evidence to existing cases. During this process, you specify the
files, directories, or mail stores you want to add to the case. You can add up to two million items into a single
case. Nuix then ingests the items and processes them, adding Nuix metadata and indexing them for search,
analysis, review, and export tasks.
Forensic Images - See the list of Nuix's supported forensic image formats.
Groupwise email
Bloomberg data
Using a forensic application, such as Guidance Software EnCase or AccessData Forensic Tool Kit:
1. Locate the files and directories of interest.
2. Export the data from the forensic image.
3. Import into Nuix Desktop via the Add > Add Directories command when creating a case or
adding new evidence.
The advantages of using this method are:
Allows you to extract recovered files found via EnCase or FTK
Bypasses directory and file security
The disadvantages include:
Once you export the files/directories, there is a chance of the files being altered prior to being
ingested into Nuix Desktop
Requires additional disk space for exporting the files
Use an application, such as GetData’s Mount Image Pro:
1. Mount the EnCase (E01), Raw, Smart, ISO or DD image as a virtual drive on your Nuix workstation.
2. Once the image is mounted, add evidence to Nuix Desktop using the Add > Add Directories or
Add Files commands.
store.
Note that further properties of the trusted application can be modified from ConsoleOne, via Tools >
GroupWise System Operations > Trusted Applications. An example property you can change is to
indicate what IP addresses the trusted application is permitted to run from.
You can run GroupWiseTrustedAppInstaller.exe again if a new trusted application key is required. This
will overwrite the existing key, making it obsolete.
Further Configuration
Ensure that you have enabled the IMAP protocol on your GroupWise server. You can edit the IMAP settings
in the startup file for your associated GroupWise post office. If the post office program is running
interactively, you can access this via the Configuration > Edit Startup File option. Any changes require the
post office program to be restarted.
It is also important to set the /imapreadlimit option in the post office startup file. For the Nuix software to
read all the messages from a folder, we recommend specifying the value /imapreadlimit-50 in this file. This
means up to 50,000 messages can be downloaded from a single folder. See the Novell Documentation for
more details.
key you obtained in the previous steps.
5. Once the new case is created, the software will connect to the GroupWise server and download
message data for every user account on that server.
You need to ensure that there are corresponding att.tar.gz files in the same folder and with the same base
name as each daily dump as these files contain all the attachments. If you process the file
“1294414613500.941000.F35834.BI.080304.080402.txt”
in the example above, all the attachments within the att.tar.gz will be pulled into the appropriate emails. The
“BI” in that name seems to be common to both the XML and text email message formats.
The chat XML messages are usually named using the following format - “..u.C..*.xml”.
*Attachments within Bloomberg chat messages are supported as being inserted into the appropriate chat
message from version 3.6.3 onwards, as we haven’t had samples to sufficiently identify attachment XML
elements prior.
The New Case dialogue displays. Here you can describe and set options for the case, as well as set how
you want Nuix to process the data. Include as much detail as necessary to ensure a complete and accurate
chain of custody. You can edit these details later via File > Case Options. The information you specify here
is saved as a part of the case properties, along with the data that you select for processing.
To create a case:
1. Specify a case name.
2. Select the directory where you want Nuix to save the case.
3. Specify the investigator (name or ID) for the case.
4. Briefly describe the case so that it is easily identifiable.
5. For Case type, choose either Simple or Compound.
When you create a simple case, you can add to it any collection of items (emails, documents,
images, etc.), which are then ingested and indexed. A compound case is one that ties together
multiple simple cases that have already been processed; you cannot add individual items to the
collection during this step when you create a compound case. You can also combine multiple
compound cases together as well, which allows you to roll all data related to an investigation into a
single searchable repository.
6. Click OK.
Nuix creates the case and presents the Add Evidence dialog box to start processing data.
The Add Case Evidence dialogue allows you to add, remove and edit the metadata of case evidence
before Nuix retrieves and processes it.
Evidence can be added either as a static folder or loose files, or as a repository of evidence that
can be re-scanned to index new files added.
The data that you add as evidence should be logically organised, such as by custodian or other
relevant factor.
Each piece of evidence can contain multiple files, directories or mail stores.
The evidence names within cases should be unique, in case you ever combine the simple case
into a compound case.
You can also set the processing settings for each load of evidence from the Settings section at the bottom
right corner.
The settings available are:
Data Processing Settings - The Data Processing Settings tab lets you set various options for how
the data will be processed.
Parallel Processing - The Parallel Processing tab lets you set how individual worker machines will
operate in a distributed processing environment.
Audit Filtering - The Audit Filtering tab is only visible for "audited" licence types, and lets you
define a digest list to exclude items from the audit report.
Data Processing Settings
Data Processing Settings allows granular control over how evidence is ingested and reloaded.
Perform Item Identification - Allows items to be recognised with full metadata, or with minimal
metadata if only performing a light scan on loose files.
Traversal - Three options are provided for traversing the documents when ingesting.
Process Loose Files but not their contents option allows a quick directory listing of all the
files presented for ingestion without any further extraction.
Process Loose Files and forensic but not their contents allows forensic images to be
treated like a file directory along with any loose files for ingestion without any further extraction.
Full traversal will extract all items fully
Reuse Evidence Stores - Allows the new evidence to be indexed into any previous evidence
stores, which in turn results in faster searching and exporting.
Calculate Audit Size - Allows the audit size field to be populated with a file size for items
considered material
Store Binary of Data Items - Allows a binary copy of the item to be stored within the case
directory as a fixed copy up to the maximum size set. The default maximum is 256
MB. Note: Selecting this option can reduce indexing speed by 15-20% as well as increase the
amount of storage required for evidence from about 20-50% of the original data set to 220-250%.
Recover deleted files from disk images - Recovers all the deleted files from disk images
Extract end-of-file slack space from disk images - Extracts the end of file slack space from disk
images
Extract from mailbox slack space - Extracts files from mailbox slack space
Carve file system unallocated space - Carves files from the system unallocated space
Carve data from unidentified items - Carves data from unidentified items
Generate thumbnails for image data - Generates a thumbnail image of any image processed
within the dataset
Perform Skintone Analysis - Captures skintone information on any images processed within the
dataset
Digests to Compute - Allows the generation of extra digests, in addition to the default MD5, for file
signature checking up to the maximum file size set. Select from SHA-1 and SHA-256. Note: These
additional digests are not used in the deduplication process.
Email Digest Settings - Select additional fields to add to the default fields used in digest creation
from emails only. Select from Include BCC and Include Item Date
Process Text - Allows the capture of text from the processed evidence items.
Enable Exact Queries - Enables indexing to allow Exact queries to be performed in the case
Enable Near Duplicates - Enables the creation of word shingles to allow for Near Duplicate
detection within the case
Enable named entity recognition - Enables the capture of named entities within the data set for
further analysis
Perform Stemming on English words - Selecting the English language stemming option means
that Nuix stems all words during processing. Nuix does not store both the stemmed and
unstemmed variants of the words in the index. It is therefore very important to understand how
stemming impacts a data set. When this option is set to English, Nuix searches for plurals and
other word variants when you search for a given word. For example, if the search word is "control",
having this option enabled returns documents containing "control", "controlling", "controller",
"controls", etc. When set to None, the search returns only documents containing the word "control".
This option is set to None by default. If desired, review more information on stemming.
Use English Stop Words - By selecting this option the English language stop words will not be
indexed. The English language stop words are a, an, and, are, as, at, be, but, by, for, if, in, into, is,
it, no, not, of, on, or, such, that, the, their, then, there, these, they, this, to, was, will and with.
Note: DTSearch excludes stop words from its index by default. This can result in different search
counts being returned when comparing the results of Nuix and DTSearch based proximity queries.
Create family search fields for top level items - This creates an extra field in the text index that
contains the text of the top level item as well as the text of the descendants of that item. This can
create faster searches and more accurate results when using DT style proximity
searches. Note: This field is hidden in the UI and is only used to facilitate faster searching.
Hide Immaterial Items - Allows immaterial items to be hidden in the Results pane when
processed to avoid clutter in the results set. The extracted text from these hidden immaterial items is
rolled up to the parent item so it remains available for searching.
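To see why stop-word handling can change the results of proximity queries, consider the following sketch. It uses the English stop word list given above, but everything else is an assumption made for illustration: the tokeniser is a plain whitespace split, and the sketch assumes that word positions close up when stop words are removed. It does not describe how Nuix or any other engine actually counts word distance.

```python
ENGLISH_STOP_WORDS = frozenset(
    "a an and are as at be but by for if in into is it no not of on or such "
    "that the their then there these they this to was will with".split()
)

def index_terms(text, use_stop_words):
    """Lower-case and split text, optionally dropping English stop words."""
    tokens = text.lower().split()
    if use_stop_words:
        tokens = [t for t in tokens if t not in ENGLISH_STOP_WORDS]
    return tokens

def word_distance(tokens, first, second):
    """Positions between the first occurrence of each term (assumed metric)."""
    return abs(tokens.index(first) - tokens.index(second))

sentence = "the contract was signed by the director of the company"
with_stops = index_terms(sentence, use_stop_words=False)
without_stops = index_terms(sentence, use_stop_words=True)

distance_with = word_distance(with_stops, "contract", "company")
distance_without = word_distance(without_stops, "contract", "company")
```

Under these assumptions, "contract" and "company" are eight positions apart when stop words are indexed and three apart when they are not, so a query for the two words within five words of each other would match in one index and not the other.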
Run local workers - Selecting this option allows workers to run on the local machine, in addition to
the remote server. It is possible to run both local and remote workers on the same job, but the
success/speed of processing is directly dependent on your hardware. If there are a large number of
remote workers on the job, it is often more efficient to disable local workers so that the master gets
more time to coordinate with the workers. You should discuss the optimal configuration settings
with Nuix [email protected] or your reseller. This option is selected by default, and unless you are
processing in a distributed configuration, the option should always be selected.
Number of workers - Sets the number of nuix_single_worker.exe instances to use during a
processing job. In the majority of cases, you should always set this to the maximum available
based on your licence. However, there are some cases when the number of workers needs to be
reduced and the amount of RAM increased to successfully process a dataset. By default, the value
is set to the maximum allowed by your license.
Memory per-worker (MB) - Sets the amount of RAM that each nuix_single_worker.exe has
available during a processing job. Nuix does not immediately consume the allocated memory, but
rather sets this as the threshold for the Java Virtual Machine. By default, the value is set to 1,000.
Note: The sum of ("Number of Workers" × "Memory per-worker") + "System Options | Application
Memory" should be at least 2GB less than the total available RAM on the system. For additional
information on allocating application memory, see Allocating memory (RAM) for better
performance.
Worker temp directory - Specifies the temporary location used by Nuix during processing.
Nuix will use this directory as cache for any files that it needs to write to disk.
Note: When processing Lotus Notes data, Nuix will create one copy of the active NSF file for each
nuix_single_worker.exe. For example: If you are processing one 10GB NSF file, with a 4-core
license, Nuix creates four copies of the NSF file in the Worker temp directory.
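The memory guideline in the Memory per-worker note can be checked with simple arithmetic. The sketch below is an illustration only: the function names, the sample workstation figures and the choice to treat 2GB as 2,048 MB are all assumptions.

```python
def memory_headroom_mb(workers, memory_per_worker_mb, application_memory_mb, total_ram_mb):
    """Return the RAM left after worker and application allocations, in MB."""
    allocated = workers * memory_per_worker_mb + application_memory_mb
    return total_ram_mb - allocated

def satisfies_guideline(workers, memory_per_worker_mb, application_memory_mb, total_ram_mb):
    """True when at least 2 GB (taken here as 2048 MB) remains unallocated."""
    headroom = memory_headroom_mb(
        workers, memory_per_worker_mb, application_memory_mb, total_ram_mb
    )
    return headroom >= 2048

# Hypothetical workstation: 16 GB RAM, 4 workers at the default 1,000 MB each,
# and an assumed 2,000 MB of application memory.
headroom = memory_headroom_mb(4, 1000, 2000, 16 * 1024)
ok = satisfies_guideline(4, 1000, 2000, 16 * 1024)

# The same settings with 8 workers on an 8 GB machine exceed the guideline.
too_tight = satisfies_guideline(8, 1000, 2000, 8 * 1024)
```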
Filter out items matching the following digest list - Select this option to remove a specific list of
files from the audit report.
Digest list - Specify the digest list to exclude from auditing from the set of digest lists imported into
Nuix.
For additional detail on creating digest lists, see:
Importing digest lists
Creating digest lists from within Nuix
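As background for the digest options in this section (MD5 by default, with SHA-1 and SHA-256 as optional extras), the following sketch computes those digests with Python's hashlib and filters a set of items against a list of known MD5 values. It illustrates the general technique only. The in-memory item structure and the function names are hypothetical, and the sketch says nothing about the format Nuix uses for its digest lists.

```python
import hashlib

def compute_digests(data, algorithms=("md5", "sha1", "sha256")):
    """Return a dict of hex digests for a bytes object."""
    return {name: hashlib.new(name, data).hexdigest() for name in algorithms}

def filter_by_digest_list(items, digest_list):
    """Keep only items whose MD5 is not in the digest list.

    items: mapping of item name -> bytes content (hypothetical structure).
    digest_list: set of lower-case MD5 hex strings.
    """
    return {
        name: content
        for name, content in items.items()
        if compute_digests(content, ("md5",))["md5"] not in digest_list
    }

# Hypothetical items and a one-entry digest list
items = {"known.dll": b"known system file", "memo.txt": b"quarterly results"}
known = {compute_digests(b"known system file")["md5"]}
remaining = filter_by_digest_list(items, known)
```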
From the Add Case Evidence dialogue, select Add to display the Add/Edit Evidence dialogue. When you
select Add, the following options are available:
Add File – Select files from a computer, network or external drive (e.g. PST, EDB, NSF, MBOX
etc…)
Add Split "DD" Files – Select multiple DD image files from a directory to add to the case.
Add Folders – Select a directory that includes all files to be added. This is the suggested way to
import an EnCase, Compressed EnCase or dd image. Nuix does not support segmented dd files,
only whole dd images.
Add Mail Store – Selects an individual mail store via POP or IMAP. Use this method to connect to
Novell GroupWise or for corporate mail servers that support POP and IMAP connections, as well
as loading Gmail, Hotmail and other internet-stored email data.
To collect information from any of these sources the appropriate credentials must be provided to
Nuix:
Mail store type - POP, POP/SSL, IMAP, IMAP/SSL and Groupwise
Server hostname - DNS name or IP address of the targeted mail server
Server port - Will update based on Mail Store type. If a custom port is required, please make
the appropriate change.
Username
Password
Note: Connecting to corporate mail servers can result in exporting large volumes of data, which
can put a heavy strain on the server. Also storing a binary copy of the items harvested from a Mail
Store should be considered as best practice as pointers to items can often change within mail
servers.
Add Load Files – Select a load file from the directory to add to the case.
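The Server port behaviour described under Add Mail Store can be pictured as a lookup of the standard registered port for each protocol, overridden by any custom port. This is a sketch, not the Nuix implementation. The port numbers are the well-known defaults for POP3 and IMAP; GroupWise is omitted because this guide does not state its default port, and the function name is hypothetical.

```python
# Well-known default ports for the listed protocols.
DEFAULT_PORTS = {
    "POP": 110,
    "POP/SSL": 995,
    "IMAP": 143,
    "IMAP/SSL": 993,
}

def server_port(mail_store_type, custom_port=None):
    """Return the custom port if given, else the default for the store type."""
    if custom_port is not None:
        return custom_port
    if mail_store_type not in DEFAULT_PORTS:
        raise ValueError(f"No default port known for {mail_store_type!r}")
    return DEFAULT_PORTS[mail_store_type]

imap_ssl_port = server_port("IMAP/SSL")
custom = server_port("IMAP", custom_port=1143)
```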
Describing the Evidence
In the Add/Edit Evidence dialogue, you will need to describe the set of evidence that you are adding,
including certain metadata properties.
The following fields are available:
Evidence name - Describes the evidence. You should use unique evidence names, as you can
both search for these names and view them in the Document Navigator. If all of the evidence
appears as the default “Evidence 1”, the value of these capabilities is diminished.
Comments - Further information about the evidence you are adding to the case or that your
business policy dictates should be associated with the evidence.
Custodian - Assigns a custodian name to the evidence which has been added.
Source time zone - Nuix stores all date/time values in absolute time or system time. Absolute time
or system time is recorded as the number of ticks since epoch. For each date/time, Nuix calculates
the offset based on the time zone, then stores the system time. The Source Time Zone sets the
default time zone that is used for processing the evidence collection. This provides a means of
controlling the time zone value for those data types that don't explicitly declare their time zone. This
is useful when a group of documents has been collected from one geography/time zone (e.g.,
New York, Eastern Standard Time, or EST), but is being processed in a different
geography/time zone (e.g., London, Greenwich Mean Time, or GMT). This ensures that all dates
without a time zone are processed using the time zone of the collection geography
(EST).
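The effect of the Source time zone setting can be shown with a small sketch. It assumes that a "tick" is one millisecond and that the epoch is the Unix epoch (1 January 1970 UTC); the guide does not state either, so treat both as assumptions. EST is modelled as a fixed UTC-5 offset, and the sample date is hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
EST = timezone(timedelta(hours=-5), "EST")  # fixed offset, no daylight saving
GMT = timezone.utc

def to_epoch_millis(local_value, source_tz):
    """Interpret a zone-less date/time in source_tz; return ms since the epoch."""
    if local_value.tzinfo is None:
        local_value = local_value.replace(tzinfo=source_tz)
    return (local_value - EPOCH) // timedelta(milliseconds=1)

# A document date with no declared time zone, collected in New York
collected = datetime(2013, 2, 1, 9, 0)

as_est = to_epoch_millis(collected, EST)  # source time zone set correctly
as_gmt = to_epoch_millis(collected, GMT)  # source time zone left as London
shift_hours = (as_est - as_gmt) / 3_600_000
```

With the wrong source time zone, the stored value for this item would be five hours out, which is why the setting matters for data types that do not declare their own time zone.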
Note: The only point you can add custom metadata to items is when you create a case. Once Nuix loads the
data, you can only add tags and comments to items.
Below the Custom metadata table, click Add to add metadata one at a time. The Add Metadata dialogue
box displays. Provide a name and a value for each custom metadata field. You can add as many as you like.
These metadata values will be added to every item that is imported as part of this collection of evidence.
Examples include custodian name, client case #, internal job #, etc.
You can also import a CSV file with a list of the desired name and value pairs, by clicking Import.
An example file would look like:
name, value
Custodian, John Smith
id, 000001
To remove custom metadata, select the item(s) in the table and click Remove.
Note: For details on how to search for custom metadata, see evidence-metadata.
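Before importing a custom metadata CSV file, it can be useful to check that it parses into the expected name and value pairs. The sketch below reads text in the layout of the example file using Python's csv module. The function name is hypothetical, and the sketch makes no claim about how strictly Nuix itself parses the file.

```python
import csv
import io

def read_metadata_pairs(csv_text):
    """Parse 'name, value' CSV text into a list of (name, value) tuples."""
    # skipinitialspace drops the space that follows each comma in the example.
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    pairs = []
    for row in reader:
        name = row["name"].strip()
        if not name:
            raise ValueError("Each row needs a non-empty name")
        pairs.append((name, row["value"].strip()))
    return pairs

example = "name, value\nCustodian, John Smith\nid, 000001\n"
pairs = read_metadata_pairs(example)
```

Values are kept as text, so the leading zeros in an identifier such as 000001 are preserved.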
When adding an Evidence Repository, add the root folder that contains the evidence. Each immediate
sub-folder inside this folder will be added as a separate evidence container. If desired, each immediate
sub-folder can also be added as a new custodian on ingestion, with the sub-folder name used as the
custodian name.
Exchange Database Files (*.EDB) - Process only specific custodian mailboxes from within an EDB
or alternatively select only a single custodian's Inbox or Calendar for processing.
Forensic Images (E01, L01, DD) - Process only specific folders from within an image (Documents
and Settings or Users).
NSF files - Selectively process specific views from within a Lotus NSF file, instead of extracting all
documents.
From the Processing tab, select one of the following options to interrupt the processing of case evidence:
Pause - Temporarily halts the processing job, at which point the Resume button becomes active.
Press Resume to continue processing. Pausing and then pressing Stop is the same as just
pressing Stop.
Stop - Displays a dialogue that provides two options for stopping case processing, Stop and Abort.
Note: Pausing is a very temporary state. You cannot pause Nuix, then reboot the machine or close Nuix,
and open it back up and resume processing. If you are looking to exit out of Nuix completely, use
the Stop or Abort option.
From the Stop Processing dialogue, you can select one of the following options to stop processing case
evidence:
Stop - Quits processing and cleans up the case, making the data that has been processed
available for search. In some instances this can take a while. The unprocessed portion of the data
cannot be reprocessed. You must reload it into the case.
Abort - Quits processing and exits the case; when you reopen the case, Nuix will continue
processing evidence from the beginning of all partially processed files. For example, if you are
processing a single large EDB file and you Abort, Nuix will restart at the beginning of the partially
processed file, which in this case is the EDB file. If you are processing a directory of 1GB PSTs,
and Nuix has completed 50, has partially processed 4, and had 46 remaining - Nuix will resume
processing by restarting at the beginning of the 4 partially processed PST files. This leads to some
duplication, but no data is omitted.
Cancel - Cancels the dialogue box and resumes the processing operation.
Reload Data
Evidence can be reloaded into a case, updating the existing record and text for the items by clicking
the Import function from the File menu.
Note: The items to be reloaded do not have to be in the same structure or location as the source data but it
is recommended that replacement files are stored along with the original source evidence as they will be
required for any action that points to those files, e.g. exporting or launching native.
Reload Items from Data Source - This option allows the original source evidence to be reloaded for the
selected files with new Evidence Processing Settings. This option can be useful in cases where only a light
traversal of the evidence was done in the first instance, or an option such as near duplicates was not selected
when the evidence was originally processed.
Click a case from the list of recently opened cases that display in the Nuix 4 window.
Click Open Case from the Nuix 4 window.
From the menu, select File > Open Case.
To edit the properties of a case, open the case and select File > Case Properties. You can edit the name,
investigator, and description of the case, as well as set the time zone associated with the investigative work
on the case. The "Investigation Time Zone" controls the time zone offset used for all date/times presented in
the result set, the Metadata tab of the Preview pane, and legal exports.
To add more case evidence after you have created the case, open the case and select File > Add Case
Evidence. See Adding Case Evidence for more information.
To close a case you have opened, select File > Close Case.
Review and Tag
Nuix 4 supports a variety of workflows for reviewing and tagging evidence in a case.
You can perform ad hoc investigative reviews, searching for inappropriate content or items relevant to a
possible legal action without being constrained to a linear review of each item in order. Or more formally, you
can construct review jobs and work linearly to review each and every item in a case, tagging or commenting
them as needed. In the second workflow, each item is presented in order and you cannot skip items during
the review.
When tagging, you can apply one or more tags to individual items of interest as you review them, or you can
apply tags in bulk to an entire result set in one operation.
For more formal reviews, a typical workflow is as follows:
1. Create review jobs, which can be separated by any logical grouping, such as issue, keyword,
custodian, investigator, etc.
2. Create tags for use with the case, such as SPAM, Relevant, Privileged, Responsive, etc.
3. Preview items, either in the Preview pane, natively in the source application, or in PDF.
4. Apply tags and/or add comments to the items.
Optionally, you can also create subsets of cases for review to support a review that is being performed by
someone that does not have permission to see the entire case. This workflow might be as follows:
1. In the parent (original) case, search for all information that is not to be viewed by the reviewer,
such as Privileged content.
2. Tag that information accordingly.
3. Exclude the information by tag, so that it is culled from the result set.
4. When you have just the items in the result set remaining that need to be reviewed separately from
the rest of the collection, export it to a case subset.
5. Have the reviewers annotate (tag/comment) the items as needed.
6. Import the annotations from the case subset back into the parent case, where the tags and/or
comments are automatically applied to the same items (except for duplicates).
Creating a Review Job
When you need to review each item in a case or result set, one at a time, you should create a review job.
You can assign each review job a specific set of documents, with tags and keyword highlighting specific for
that review job. The text you specify for highlighting within the review job is separate from that specified in
keyword searches from the Workbench tab; only words can be highlighted, as wildcards and other forms of
query syntax are not accepted.
The items added to a review job are grouped and must be reviewed as an entire family of documents. You
can add different tags to each member of the family, but in order to advance to the next batch, the entire
family must be tagged. Review jobs are designed this way to allow for an accelerated review, in that if one
item in the family is responsive or privileged, then with a couple of key strokes the entire family can be tagged
and the reviewer is able to move onto the next family.
Within Nuix 4 you can track the overall status of a review job, as well as the tags that a reviewer has applied
and how many.
c. Click OK to add the tags to the review job.
6. If desired, add keywords to highlight in the items within this review job:
a. To add an individual word or phrase, click Add and specify it in the text field.
b. To paste a list of words from the clipboard or import a list of words from a Nuix word list,
click More.
7. Select OK to create the review job. The new review job displays in the list of available review jobs
on the Fast Review tab.
8. Add items to the review job.
To add items to a review job:
1. On the Workbench tab, select the item(s) from the Results or Thumbnail view.
2. Right-click in the Results pane and select Add to Review Job (or select Edit > Add to Review
Job).
3. In the Select Review Job dialogue, select the review job that you wish to add the items to.
4. Optionally, select to add all items from the same family to the review job.
5. Click OK.
Nuix adds the items to the selected review job, and the progress of the task is shown on the Fast
Review tab.
items were processed by mistake or added to the incorrect review job. When you remove items from a
review job, any tags that might have been added to the items during the review process remain associated
with the items.
To remove items from a review job:
1. On the Workbench tab, select the item(s) in the Results or Thumbnails view that you wish to
remove.
2. Right-click in the Results pane and select Remove from Review Job (or select Edit > Remove
from Review Job).
3. In the Select Review Job dialogue, select the review job from which you wish to remove the
item(s).
4. Optionally, select whether to also remove items from the same family.
5. Click OK.
Nuix removes the item from the review job. The statistics on the Fast Review tab update to reflect the
changes.
To edit a review job:
To join a review job:
A list of review jobs associated with the case and a progress (status) for each.
Statistics about each user (reviewer) working on a selected review job, including how many items
have been reviewed per user, and how many of each tag the user has applied to date.
The tags assigned to the selected review job, and how many of each tag has been assigned to the
review job (across all users).
This information affords you the opportunity to understand how the case, the review jobs, and individual
reviewers are progressing.
1. Open or click the Fast Review tab.
2. Peruse the list of review jobs, the user statistics, and the statistics for tags.
Tag-based workflows that search for specific terms to find items, and then tag sets of items into
groups such as "Privileged" and "Responsive", and then subset just the items tagged Responsive
into separate cases for review.
Exclusion-based workflows that cull the evidence by excluding items such as SPAM or other
irrelevant items first, and then with the remainder of the evidence create subsets of the items for
review.
Performance-based workflows that aggregate multiple simple cases, with numerous databases into
a single database.
The review workflow is as follows:
1. Export a set of items from one case into a new case, known as the case subset.
2. Review those items, which are now seen in isolation from the items in the parent case.
3. When the review is complete, export the annotations (tags and comments) from the child case to a
CSV file.
4. Import those annotations back into the parent case. Note that the history and any other case
metadata from the child case is not brought back over to the parent case.
When you export to a case subset from a result set, the parent items of the selected items in the
result set are also exported. If you export 1000 items, you will have a greater number in the new
case as the additional parent items are included. Nuix deduplicates the parent items for selected
items having the same parent(s).
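The effect on item counts can be sketched in plain Ruby. This is an illustrative model only, not the Nuix API: the SubsetItem struct, the items_in_subset method, and the sample GUIDs are all hypothetical.

```ruby
require 'set'

# Hypothetical stand-in for a case item: a GUID plus the GUIDs of its parents.
SubsetItem = Struct.new(:guid, :parent_guids)

# Returns the GUIDs that would end up in the subset: the selected items
# plus each distinct parent, counted once however many children share it.
def items_in_subset(selected)
  guids = Set.new
  selected.each do |item|
    guids << item.guid
    item.parent_guids.each { |parent| guids << parent }
  end
  guids
end

selected = [
  SubsetItem.new('attach-1', ['email-1']),
  SubsetItem.new('attach-2', ['email-1']), # shares a parent with attach-1
  SubsetItem.new('attach-3', ['email-2'])
]

puts "Selected: #{selected.size}, in subset: #{items_in_subset(selected).size}"
```

In this sketch, three selected attachments produce five items in the subset, because two of the attachments share one parent email.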
Subset cases do not store the binary of the subset items unless the binary was previously stored, so a
subset still requires the source data to be accessible. If you require the subset to be
independent of the original stored data, pre-populate the binary for the items to be subset before
exporting, using the Populate Stores options from the Items menu.
1. From the Results pane, to export the entire result set, click the checkbox in the column header to
select all the items.
Or, you can manually select individual items from a result set to export.
2. Click Export > Export Case Subset.
The Export Case Subset dialogue displays.
3. Complete the dialogue by specifying case options:
The .csv file is formatted as follows. Nuix associates each annotation with the item's GUID so that they can
be mapped back to the same items in the parent case.
1. From the Results pane, to export the entire result set, click the checkbox in the column header to
select all the items.
Or, you can manually select individual items from a result set to export.
2. Click Export > Export Annotations.
The Export CSV Annotation File dialogue displays.
3. Specify the location and name for the annotation file, and click OK.
An Exporting Annotations dialogue remains open while the task is in progress, and an Export
Results dialogue displays to let you know the task is finished and how many items were
successfully exported.
4. In the Export Results dialogue, click OK.
A window opens displaying the folder view of the directory containing the annotations file.
case. Nuix only applies unique new tags; duplicate tags are ignored. Technically, this means that
an item could be tagged with both Responsive and Nonresponsive tags, for example, if one of
those tags was applied to the item in the parent case and another in the child case.
After the items are tagged, the Annotation Complete dialogue displays indicating how many
items were annotated.
5. Click OK.
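The merge behaviour described in this procedure resembles a set union. The following Ruby sketch is a hypothetical model, not Nuix code: merge_tags and the sample tag lists are assumptions made for illustration.

```ruby
require 'set'

# Merges imported tags into an item's existing tags. Tags already present are
# ignored, and nothing is removed, so conflicting tags can coexist.
def merge_tags(parent_tags, imported_tags)
  Set.new(parent_tags) | imported_tags
end

parent_tags = ['Responsive']                  # applied in the parent case
child_tags  = ['Nonresponsive', 'Responsive'] # applied in the child case

puts merge_tags(parent_tags, child_tags).to_a.sort.join(',')
```

Because a union never removes anything, the item in this sketch ends up with both tags. If your workflow treats such tags as mutually exclusive, check for conflicts after the import.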
Creating Tags
In Nuix, a tag is a piece of user-defined metadata that you use to classify an item after you have reviewed it.
Some cases may require that all items have a tag associated with them, which suits a workflow that uses review
jobs to manage the review process, while others may only require that you tag items of relevance.
You can create a set of tags for a case, and you can also define a specific set of tags for use with a review
job (that is, a subset of the total set of tags, or tags that are specific to the review job).
You can organise tags into hierarchical groupings (nest tags), if desired. Some typical tags include:
Responsive, Non-responsive, Privileged, Confidential, SPAM.
Once you have tagged items in Nuix 4, you can:
Filter the result set to show just those items that have a certain tag applied, and then exclude or
export them
Search for items using the tag as part of the search criteria
Include the tags as part of the metadata when exporting items
1. Ensure the Review and Tag pane is showing (Window > Show Review & Tag).
2. In the Review and Tag pane, click the link to configure tags.
The Edit case tags dialogue displays.
3. Right-click in the empty list box and select New Tag.
Nuix creates a tag and highlights the default name.
4. Type the name of the tag you wish to create to overwrite the default name, and press Enter.
5. To nest tags:
From this dialogue, you can also rename and delete tags by selecting a tag and using those commands on
the right-click menu.
Assign Tags to a Review Job
To define a specific set of tags for use in a review job:
From this dialogue, you can also remove any tags from the review job, by selecting them and
clicking Remove.
Reviewing Items
Reviewing items for relevance is a major component of any investigation. After you have culled the data,
excluding any items that are irrelevant, and filtered and searched to find sets of items that need further
investigation, you will want to review individual items.
Nuix provides you with the ability to review items in several ways:
All of these methods are available from the Nuix 4 Preview pane, highlighted in the screenshot below.
Located on the Workbench tab, the Preview pane by default is hosted alongside the Results pane and
the Review and Tag panes, allowing you to move through items and tag them in an efficient manner. Any
keywords that you used in a search query are highlighted in the preview.
To review an item:
1. Select the Workbench tab, which could show the default pane setup, or if performing a Fast
Review, the customized version for reviewing items linearly.
Nuix displays the first item in the Preview pane, unless you have chosen to order them differently
in a review job. The item name is shown at the top, which is the subject of an email or the file name
for all other item types.
2. Select one or more of the following methods to review the content of the item:
Click the Email tab to see the extracted text of the item in the Nuix built-in previewer. This is
not a rich-text viewer.
Click the PDF tab to see a PDF rendering of the item, which is a rich-text view of the item. From
this tab you can import a PDF to replace the one currently in the Nuix Print Store, launch the
PDF in a PDF Viewer, and use zoom and page controls to adjust the PDF rendered in
the Preview pane.
Click the Launch button in the upper right-hand corner of the Preview pane to view the item in
its native application. The application must be installed on your system to view the item natively.
3. Optionally, review items that might have some contextual relationship to this item by:
Viewing the folder structure in which the item existed to gain context from its location, and
clicking any of the links in the Path to view other items from the same place in the data
collection.
Viewing Similar items to the one you are reviewing by clicking on the links
for Duplicates, High, Medium, or Low, which display items that are similar to some degree in
content based on words in common with the selected item.
Viewing Related items by clicking Thread to see all the items in the conversation to which this
item belonged, if any.
Reviewing all child items for a given item by right-clicking on an item and selecting Show All
Descendants in the Results pane.
Finding the top-level item (highest level ancestor) for a given item by right-clicking on an item
and selecting Show All Top-level Items in the Results pane.
When you are finished reviewing the item, add a tag or comment, as needed. Use the
yellow Next and Previous arrows at the top left-hand corner of the Preview pane to cycle through the
items, or, if you are in a review job, add a tag and click the green Next Family arrow (or press the Shift +
right arrow key) to advance to the next family of items.
Assign Tags to the Tag Grid
To use the keyboard number pad for tagging items, assign the tags to the tag grid. The numbers 1-9 are
used; if you have more than 9 tags, place the most frequently used tags on the grid. You can still apply tags
that are not on the tag grid by selecting them from the tag tree.
The tags created for the case display in the tag tree on the right-hand side of the Review and Tag pane. If
this area is empty, click Edit Tags to create tags for the case.
1. From the tag tree on the right, select a tag and drag it onto a number in the tag grid on the left.
You can also select a spot on the grid and right-click to select Assign Tag to this Shortcut,
choosing the tag from the list.
The tag displays in the tag grid.
2. Repeat the process until you have placed the tags where you want on the grid.
3. Optionally, you can:
Move a tag by selecting and dragging it into an empty spot on the grid.
If you place a tag onto a spot that already has a tag, it replaces the original tag, but does not
swap the locations of the tags.
Remove a tag by selecting and dragging it back to the tag tree, or right-clicking and
selecting Remove Tag Shortcut.
The tags now display on the tag grid, each with an assigned number for tagging items from the keyboard.
and not update any items already tagged.
2. Select an item in the results list.
The selected row displays in blue with a gold outline in the Results pane.
3. Either:
From the keyboard, apply a tag by typing its assigned number.
In the Review and Tag pane, select the tag in the tag grid or tag tree with the mouse.
In the Results pane, click Add Tags and select one or more tags from the Add
Tags dialogue.
A tag icon displays next to the checkbox to indicate that item has one or more tags associated with
it. If the metadata profile you have assigned to the Results pane includes the metadata field "Tags",
you can also see the names of the tag(s) applied to the item in the Tags column.
4. To advance to the next or previous family in the result set, click the Next or Previous arrows in the
top left-hand corner of the Review and Tag pane.
This does not advance from item to item in the result set, but to the first item in the next family in
the result set.
This additional piece of metadata can be seen in the Comment column of the Results pane, if you have
added the Comment metadata field to the profile you are using. You can also view the comments in the load
file when exporting items.
You can also search for text within comments, by using the comment search field.
Export Data
Depending on your workflow, once you have processed or investigated the evidence, Nuix offers a variety of
ways to export the evidence based on the features enabled by your licence type.
The current results set with the data from the metadata profile in use
The contents of the Word List tab
The contents of the Statistics tab
The contents of the History tab
The contents of the Addresses tab
The contents of the Entities tab
Note: Nuix exports this information to a UTF-8 encoded CSV file. When importing this file into an application
like Excel, follow these steps:
1. Open Microsoft Excel.
2. Select Data from the menu.
3. Select Get External Data and Import From Text File from the ribbon.
4. Select the Delimited option and UTF-8 from the File Origin drop-down list and select Next.
5. Select the Comma option from the Delimiters group and select Finish.
If you simply double-click the CSV file and allow it to launch in Excel, the fields will not be correctly
parsed.
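The same care applies if you process the exported file with a script instead of Excel: state the encoding explicitly and use a real CSV parser so that quoted commas are handled. The sketch below uses Ruby's standard CSV library. The read_utf8_csv helper, the temporary file, and the column names are assumptions made for illustration, not the actual layout of a Nuix export.

```ruby
require 'csv'
require 'tempfile'

# Reads a UTF-8 CSV file into an array of hashes keyed by column header.
def read_utf8_csv(path)
  CSV.read(path, headers: true, encoding: 'UTF-8').map(&:to_h)
end

# Hypothetical sample standing in for an exported view; the column names are made up.
sample = Tempfile.new(['results', '.csv'])
sample.write("Name,Comment\nR\u00E9sum\u00E9.doc,\"Reviewed, needs follow-up\"\n")
sample.close

rows = read_utf8_csv(sample.path)
puts rows.first['Name']    # accented characters survive
puts rows.first['Comment'] # the quoted comma stays inside one field
```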
You can export the following views to an image file (.svg or .png):
relationships for the items. You can then open an item in the application in which it was created, if the
application exists on your system.
This option exports only the items you have selected in the result set, and not top-level items or
descendants. Therefore, if a search only hits on an attachment, only the attachment will be exported. To
export the parent email as well as all of the attachments, you must ensure you show all top-level items and
include those in the set of items to be exported.
is Native - Export messages in their original format.
4. Click OK to begin the export process.
The Exporting Items dialogue displays indicating the progress of the export operation. The Export
Results dialogue displays when the export is finished and indicates how many items were successfully
exported and a link that runs a query for any items that failed. A window opens to the directory where the
files were exported.
For more information, see Exporting a Subset of Items to a New Case, which is part of the "Creating Subsets
of Cases for Review" workflow in the Review and Tag section.
Exporting Annotations
The Export Annotations option exports all of the comments and classifications from the selected items in a
result set to a CSV file. Nuix uses the GUID (Globally Unique ID) to trace annotations back to the original
item, if you then import the annotations back into a case that holds those same items.
For more information, see Exporting Annotations to a File, which is part of the "Creating Subsets of Cases
for Review" workflow in the Review and Tag section.
Alternatively, if you need to export an actual list of MD5 hashes, you can do so by exporting a view in
conjunction with an appropriate metadata profile (one that includes the MD5 Digest property).
To create a shingle list for use in Nuix:
1. In the Results pane, select the items in the result set with which you want to create a shingle list.
2. Click Export > Export Shingle List.
The Export Shingle List dialogue displays, indicating the number of unique shingles to export.
3. Choose one of the following options:
a. Specify the name of a shingle list to create a new one.
b. Select the name of an existing list, to add the shingles to that list.
4. Click OK.
An Exporting dialogue displays while the shingle list is being created and then closes when the
task is complete.
Concordance
Discovery Radar
DocuMatrix
EDRM XML
IPRO
Relativity
Ringtail
Summation
To export to a legal load file:
1. Select the items in the result set that you want to export.
2. In the Results pane, select Export > Legal Export to and choose the desired legal application.
You can also find the command on the File menu and right-click menu in the Results pane.
3. From the Legal Export dialogue box, on the Export Type tab, specify properties for the type of
export you want to perform.
4. On the Load File Settings tab, specify the additional settings to export to your chosen load file
format.
5. On the Numbering and Files tab, specify how you want to number the items in the legal load file.
6. On the Parallel Processing tab, review the defaults and make any changes if needed to more
efficiently export items.
7. If you would like to view a summary of the items discovered for export, and optionally tag them
prior to export, select the Show pre-export summary option at the bottom of the dialogue.
8. When you have finished setting up the export job, click OK.
The Exporting Items dialogue displays, indicating the progress of the export operation.
The Export Results dialogue displays when Nuix has finished and shows the number of items
exported.
9. In the Export Results dialogue, click OK.
A window opens displaying the folder to which you exported the items.
summary-report.txt/xml - The summary report provides a complete report for the legal export.
top-level-MD5-digests.txt - The top-level-MD5-digests.txt file contains a list of all the top-level MD5
digests included in the legal export.
The summary report.txt and .xml files contain details of the export operation, including:
the exact legal export configuration
detailed breakdowns of any and all files that were exported
timing information for each of the export stages
detailed file type statistics
details of all duplicate top level items not exported
a fully qualified query string that can be used to find all items that failed to export correctly
After exporting items to a legal load file format, you should review the associated summary report to ensure
the content of the export meets expectations.
The load file: loadfile.dat.
An Opticon load file: loadfile.opt. The loadfile.opt is always included in the export, but will be an
empty (zero size) file unless you select to export PDFs or TIFFs.
The summary report detailing information about the production/export run itself: summary-report.txt
and summary-report.xml. The XML is provided to assist in the creation of more user-friendly
reports by combining it with a custom cascading style sheet.
A text file containing the top level MD5 digests: top-level-MD5-digests.txt.
A custom folder for each type of exported data: Native, TIFF, PDF, and Text. These are defined on
the Legal Export dialogue, Numbering and Files tab, in the File Naming section.
The Concordance load file is essentially a delimited file. You can use this format to facilitate the transfer of
information from Nuix to other systems.
By default the Concordance load file is created using ASCII encoding (Concordance only recently started
supporting UTF-8 encoding). To create a Concordance load file with UTF-8 encoding, you need to start Nuix
using a command line switch. See the nuix.export.concordance.loadfile.encoding portion of the Application
Command-line section.
COLUMN NAME | SOURCE | CONCORDANCE DB | DESCRIPTION
DOCID | Export Metadata | Text, 50, Image, Key | Auto-generated during the export process. The format is controlled as part of the Legal Export dialog.
ENDGROUP | Export Specific Metadata | Text, 50 | Ending DOCID for a family of documents.
PAGECOUNT | Export Specific Metadata | Numeric, 5 | Number of pages for an imaged document.
FILENAME | Item Metadata | Paragraph, Indexed | Specific Filename. Maps to the Nuix “Name” field.
FILEEXTENSION | Item Metadata | Paragraph, Indexed | File extension for the specific item.
TITLE | Item Metadata | Paragraph, Indexed | Subject of an email or the name of a file.
User-defined – PATHNAME, GUID, and FILETYPE are included by default. If other user-selected metadata
fields are added, the appropriate changes to the Concordance DB need to be made.
TYPE DELIMITER
Comma (020) – ASCII (decimal)
COLUMN NAME | SOURCE | CONCORDANCE DB | DESCRIPTION
DOCID | Export Metadata | Text, 50, Image, Key | Auto-generated during the export process. The format is controlled as part of the Legal Export dialog.
BEGINBATES | Export Specific Metadata | Text, 50 | Beginning DOCID for a multi-page document. Relevant when creating TIFFs or PDFs.
ENDGROUP | Export Specific Metadata | Text, 50 | Ending DOCID for a family of documents.
PAGECOUNT | Export Specific Metadata | Numeric, 5 | Number of pages for an imaged document.
ITEMPATH | Item Metadata | Paragraph, Indexed | Relative path to the native file.
TEXTPATH | Item Metadata | Paragraph, Indexed | Relative path to the text file.
PDFPATH | Item Metadata | Paragraph, Indexed | Relative path to the PDF file.
TIFFPATH | Item Metadata | Paragraph, Indexed | Relative path to the first TIFF page.
TYPE DELIMITER
Comma (020) – ASCII (decimal)
DII TOKEN | SOURCE | DESCRIPTION
@DOCID | Export Specific Metadata | Auto-generated during the export process. The format is controlled as part of the Legal Export dialog.
@PARENTID | Export Specific Metadata | Used to track and maintain the parent-child relationship of documents.
@FULLTEXT DOC | Standard DII Token | One full-text file exists for each database record.
@L | Standard DII Token | Long name for the item, includes Nuix specific item metadata: GUID, PathName, Name.
@FROM | Nuix Defined Metadata | Nuix Communications FROM field.
@TO | Nuix Defined Metadata | Nuix Communications TO field.
@CC | Nuix Defined Metadata | Nuix Communications CC field.
@BCC | Nuix Defined Metadata | Nuix Communications BCC field.
@SUBJECT | Nuix Defined Metadata | Email subject or Nuix Name.
@DATESENT | Nuix Defined Metadata | Sent Date for email - Nuix Communications Date.
@TIMESENT | Nuix Defined Metadata | Sent Time for email - Nuix Communications Date.
@HEADER / @HEADER-END | Item Properties | Email header content including all extracted metadata.
@EMAIL-BODY / @EMAIL-END | Item Content | Email body content.
@MULTILINE | Additional Metadata | All additional metadata that is referenced from the metadata profile used for the export.
your result set. This happens when responsive items are a part of the same family as excluded items, and
Nuix performs its Top-level item roll-up operation. If you need to export items that contain X but exclude
items containing Y from the export, you must follow a certain set of steps.
Requirement
1. Find all document families that contain the word dog* (responsive) and export the responsive
families with a Concordance load file.
2. Ensure that no document families that contain the word cat* are included in the data set.
Sample Email
Email_1 has two attachments; Attach_1 contains the word “dogs”, and Attach_2 contains the word “cats”.
Explanation
This issue occurs because Email_1 is a top-level item whose family contains both a responsive and an
excluded search term. The search string dog* NOT cat* is working correctly, in that it is only returning
Attach_1. However, when Nuix is set to find Top-Level items (deduplicated) and descendants, it includes
the entire family, including Attach_2.
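The roll-up can be modelled in a few lines of plain Ruby. This is a hypothetical sketch of the behaviour described above, not the Nuix implementation: FamilyItem, search_hits, roll_up, and the simple substring matching are all assumptions.

```ruby
# Hypothetical stand-in for a case item; family names the item's top-level item.
FamilyItem = Struct.new(:name, :family, :text)

CASE_ITEMS = [
  FamilyItem.new('Email_1',  'Email_1', 'see attached'),
  FamilyItem.new('Attach_1', 'Email_1', 'dogs'),
  FamilyItem.new('Attach_2', 'Email_1', 'cats')
]

# Item-level search: responsive term present, excluded term absent.
def search_hits(items)
  items.select { |i| i.text.include?('dog') && !i.text.include?('cat') }
end

# Roll-up: expand each hit to every item in the same family.
def roll_up(hits, all_items)
  families = hits.map(&:family).uniq
  all_items.select { |i| families.include?(i.family) }
end

hits = search_hits(CASE_ITEMS)
puts "Hits: #{hits.map(&:name).join(', ')}"
puts "Exported: #{roll_up(hits, CASE_ITEMS).map(&:name).join(', ')}"
```

The search returns only Attach_1, but the roll-up adds Email_1 and Attach_2 because they belong to the same family.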
Recommended Steps
To ensure that you do not export hits containing excluded terms, follow these steps:
1. Run a search for X (such as cat*).
2. Apply a tag to the hits and their entire family that marks the documents as a match (e.g.,
Hit.Cat.Family).
3. Select the Hit.Cat.Family tag from the Filter Items | Tagged Items tree.
4. Run a query for NOT dog*.
5. You can confirm that dog* doesn't exist in the result set by using the word list view of the result set
and filtering on the word dog.
In the example below, none of the items will match the query, and therefore the entire family has been
excluded based on the existence of the excluded content (cats).
To ensure that you do not export hits containing items that have been added to an Excluded Items set, follow
these steps:
1. When you are adding items to the Exclusion set, make sure that you are adding the entire
family. If you don't want to always exclude the entire family, then add a second excluded items set
that contains the entire family prior to performing an export.
2. Run a search for X (such as cat*).
3. Apply a tag to the hits and their entire family that marks the documents as a match (e.g.,
Hit.Cat.Family).
4. Select the Hit.Cat.Family tag from the Filter Items | Tagged Items tree.
5. Run a query for NOT (dog* OR exclusion:Tag.Applied.in.Step1.Family). You can also just
ensure that the Exclusion set created in Step 1 is active.
6. You can confirm that dog* doesn't exist in the result set by using the word list view of the result set
and filtering on the word dog.
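Both procedures above rest on the same idea: decide at the family level, not the item level, so that one excluded hit removes the whole family. A minimal Ruby sketch of that idea follows. It is an illustrative model rather than Nuix code; Doc, exportable, and the substring matching are hypothetical.

```ruby
# Hypothetical item model: family names the top-level item an item belongs to.
Doc = Struct.new(:name, :family, :text)

# Returns the items whose family has a responsive hit and no excluded hit.
def exportable(items, responsive, excluded)
  by_family = items.group_by(&:family)
  kept = by_family.select do |_family, members|
    members.any? { |m| m.text.include?(responsive) } &&
      members.none? { |m| m.text.include?(excluded) }
  end
  kept.values.flatten
end

docs = [
  Doc.new('Email_1',  'Email_1', 'see attached'),
  Doc.new('Attach_1', 'Email_1', 'dogs'),
  Doc.new('Attach_2', 'Email_1', 'cats'),
  Doc.new('Email_2',  'Email_2', 'walking the dogs')
]

puts exportable(docs, 'dog', 'cat').map(&:name).join(', ')
```

Here the Email_1 family is dropped in full because Attach_2 contains the excluded term, and only Email_2 remains.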
Audit
Nuix 4 offers auditing capabilities for companies that need to monitor reviewer/user activity on a case. It
displays information about how the application is being used, logging information about case events.
Case Opened - Records the version of the Nuix application that opened the case in the Details of
Event.
Case Closed - Records the version of the Nuix application that opened the case in the Details of
Event.
Load Data - Records that data was loaded in the Details of the Event.
Search - Records the search parameters that were used and the number of results that were
returned.
Annotation - Records that an annotation was applied, including the specific annotation.
Import - Records that a PDF was imported.
Export - Records that an Export was performed.
Script Run - Records that a script was run.
For each event, the following information is logged:
Viewing the Audit History for a Case
Nuix logs details about some of the operations that users perform in a case on the History tab.
Export View: "Export of results view (# items) to spreadsheet file ({File Path})"
Export Items: "Export of # items to binary files ({File Path}), # processed, # skipped, #
unprocessed"
Export Annotations: "Export # items to spreadsheet file ({File Path}), # processed, # skipped, #
unprocessed"
Export Digest List: "Export # of items to digest list ({digest list name})"
Legal Export: "Export # of items to {Load File Type} load format ({File Path}), # processed, #
skipped, # unprocessed"
Launch Item: "Export of single item ({GUID}) to external viewer"
Appendices
Scripting
The Nuix scripting API exposes the majority of operations that can be performed within the user interface
(UI), allowing you to automate frequently performed or repetitive tasks ranging from creating and
configuring cases to processing items to custom reporting.
For additional information about building scripts and for sample code beyond what is available in this guide,
see the Scripting section of the Knowledge Base.
Scripts Menu
The Scripts menu is available only with the Enterprise Workstation licence of Nuix 4. This menu allows
you to launch scripts from the application. By default, the only items in the menu are the Open
Scripts Directory and Show Console options.
In addition, Nuix will present the folder structure created at the root of the Nuix Scripts directory in the menu
(select Open Scripts Directory to view the folder path and any existing folders or scripts). In the image
below, you can see that a collection of scripts has been written and classified into eight folders. Scripts
dropped into this directory display in the Scripts menu, regardless of whether they are placed in sub-folders.
The Nuix Scripting Directory is stored in the logged on UserProfile:
Show Console
The Nuix Script Console provides a means of writing and directly executing script code against a Nuix
case as well as displaying any console output from that script. You can also use the console to verify the status
of a running script, as informational messages can be written to the console as well as any errors.
The following options and controls are available:
Language - Sets the scripting language to one of two, either ECMAScript or Ruby. Ruby is the
default setting.
Script - A free-text box into which your script is typed or pasted.
Console - A read-only box that displays the results of the script as well as a status message and
any errors.
Clear - Clears the results in the Console box.
Execute - Runs the script.
Cancel - Cancels a currently running script.
To close the dialogue box, click the Close icon in the upper right corner.
As you can see from the example searches included in the script, the search query syntax used from scripts
is the same as that used by the desktop application.
If you now go back to the program and open the Scripts menu, you will see a My First Script action in there.
This name comes from the first line in the script. For more information on this, see Script Metadata.
If you activate that menu action, it will run your script. The results of running a script (both a record of running
the script itself and records of what it did) can be viewed through the case history (View → New History
Tab). No dialogs or windows come up when running a script. Part of the reason for this is that scripts
can do this internally and may want to show more detailed information than what a simple progress dialog
can provide.
Performing searches
Performs a search, sorts by date and prints the name and communication info for each item.
Ruby code:
# Find all email items and sort them by communication date.
all_email = $utilities.item_sorter.sort_items($current_case.search('kind:email')) do |item|
  item.communication ? item.communication.date : nil
end
# Print the subject, date, and addresses for each email.
all_email.each do |item|
  puts "Subject: #{item.name}"
  puts "Date: #{item.communication.date}"
  puts "From: #{item.communication.from.join(', ')}"
  puts "To: #{item.communication.to.join(', ')}" unless item.communication.to.empty?
  puts "Cc: #{item.communication.cc.join(', ')}" unless item.communication.cc.empty?
  puts "Bcc: #{item.communication.bcc.join(', ')}" unless item.communication.bcc.empty?
  puts ""
end
ECMAScript code:
var allEmail = utilities.itemSorter.sortItems(currentCase.search('kind:email'),
    function(item) {
        // Return the sort key; without "return" the function yields undefined.
        return item.communication ? item.communication.date : null;
    });
Sample output:
Subject: Test Email 1
Date: Wed Jan 01 15:44:13 EST 2003
From: [email protected]
To: [email protected]
Cc: [email protected]
Bcc: [email protected]
traverse($current_case.root_items)
ECMAScript code:
function traverse(items) {
    for (var i = 0; i < items.length; i++) {
        var item = items[i];
        if (item.mimeType == 'application/vnd.nuix-evidence' || item.mimeType == 'filesystem/directory') {
            traverse(item.children.toArray());
        } else {
            println(item.name);
        }
    }
}
traverse(currentCase.rootItems.toArray());
Sample output:
sanity.mbox
Ruby code:
puts item.tags.sort.join(',')
item.add_tag('Important')
puts item.tags.sort.join(',')
item.remove_tag('IP Leak')
puts item.tags.sort.join(',')
item.comment = 'My comment'
puts item.comment
ECMAScript Code:
println(item.tags.toArray().sort().join(','));
item.addTag('Important');
println(item.tags.toArray().sort().join(','));
item.removeTag('IP Leak');
println(item.tags.toArray().sort().join(','));
Sample output:
IP Leak
IP Leak,Important
Important
My comment
Bulk Classification
Performs a search, then adds and removes tags from all items.
Ruby code:
bulk_tagger = $utilities.bulk_tagger
bulk_tagger.add_tag('Important', $current_case.search('kind:email'))
puts $current_case.search('name:"Test Email 12"')[0].tags.sort.join(',')
ECMAScript code:
var bulkTagger = utilities.bulkTagger;
bulkTagger.addTag('Important', currentCase.search('kind:email'));
println(currentCase.search('name:"Test Email 12"').get(0).tags.toArray().sort().join(','));
Sample output:
Important
Not Important
Ruby code:
email_exporter = $utilities.email_exporter
export_dir = 'D:\\Exports\\Sanity'
# Create the export directory if it does not already exist.
Dir.mkdir(export_dir) unless File.exist?(export_dir)
all_email = $utilities.item_sorter.sort_items($current_case.search('kind:email')) do |item|
  item.communication ? item.communication.date : nil
end
# Export each email to an EML file named after its MD5 digest.
all_email.each do |item|
  eml_file_name = "#{item.digests.md5}.eml"
  eml_file = File.join(export_dir, eml_file_name)
  email_exporter.export_item(item, eml_file, :include_attachments => false)
  puts "Exported #{item.name} to #{eml_file_name}"
end
ECMAScript code:
Sample output:
Exported Test Email 1 to 9f68d908cd36635f0ca32087fe298b9f.eml
Exported Test Email 2 to 332cfa45a880ba178af4fe67c2c0cba6.eml
Exported Test Email 3 to c1152b9121897a06b5c9a6d469502702.eml
Exported Test Email 4 to 237b743270c4c354c01189f2b4a192eb.eml
Exported Test Email 5 to 98d621e573c55112e403e5a833c1e61f.eml
Exported Test Email 6 to c4656a979716f6d694368d9b9ebf34e1.eml
Exported Test Email 7 to 0dc301380eb012ffa9fbbdd9daf32fac.eml
Exported Test Email 8 to ff1f619a78aa21014cfb09478b4bceb1.eml
Exported Test Email 9 to 5b5a8afa333ac4667536490e800c4166.eml
Exported Test Email 10 to 78ae6926cd8c672c0880f50138404998.eml
Mailbox export
Performs a search, then exports all items to a single MBOX file.
Ruby code:
mbox_file = 'D:\\Exports\\Output.mbox'
all_email =
$utilities.item_sorter.sort_items($current_case.search('kind:email')) do
|item|
end
var allEmail =
utilities.itemSorter.sortItems(currentCase.search('kind:email'),
function(item) {
});
options.put('format', 'mbox');
Ruby code:
pst_file = 'D:\Exports\Output.pst'
all_email =
utilities.item_sorter.sort_items_by_top_level_item_date(current_case.search('kind:email'))
ECMAScript code:
var allEmail =
utilities.itemSorter.sortItemsByTopLevelItemDate(currentCase.search('kind:email'));
(Generates no output.)
History
Iterates over all the history records and prints some details of each search that was performed.
Ruby code:
ECMAScript code:
0, 0));
var history = currentCase.getHistory(options);
var iterator = history.iterator();
while (iterator.hasNext()) {
var event = iterator.next();
Sample output:
2011-05-23T15:23:17.734+10:00,tester,assets
2011-05-23T15:23:42.331+10:00,tester,credit
2011-05-23T15:24:31.026+10:00,tester,
2011-05-23T15:28:17.227+10:00,tester,review-job:e11f1e57e04d4388936c8e9096a550de,assigned,1
2011-05-23T15:28:48.630+10:00,tester,review-job:e11f1e57e04d4388936c8e9096a550de,assigned,1
items = current_case.search('name:test')
items = utilities.item_utility.find_families(items)
puts "Item count = #{items.size} items"
excluded_items = current_case.search('has-exclusion:1')
excluded_items =
utilities.item_utility.find_items_and_descendants(excluded_items)
puts "Excluded item count = #{excluded_items.size} items"
ECMAScript code:
items = currentCase.search('name:test');
items = utilities.itemUtility.findFamilies(items);
println("Item count = " + items.size() + " items");
excludedItems = currentCase.search('has-exclusion:1');
excludedItems =
utilities.itemUtility.findItemsAndDescendants(excludedItems);
println("Excluded item count = " + excludedItems.size() + " items");
Sample output:
Ruby code:
require 'fileutils'
include FileUtils
TEMPDIR = java.lang.System.getProperty("java.io.tmpdir")
# Ensures that the item has been exported to PDF already. If the resulting
# PDF was not printed, replaces the PDF with one which was custom generated.
def replace_pdf_if_necessary(item)
tmp_bin_file = File.join(TEMPDIR, item.guid + '.bin')
tmp_pdf_file = File.join(TEMPDIR, item.guid + '.pdf')
begin
printed_image_info = item.printed_image_info
if printed_image_info.nil?
# There isn't any info so the system hasn't generated one yet. Have the
# system do this...
begin
$utilities.pdf_print_exporter.export_item(item, tmp_pdf_file)
printed_image_info = item.printed_image_info
rescue
# leaves printed_image_info as nil
end
end
if printed_image_info.nil? or printed_image_info.was_text_converted?
# It couldn't print, or the printed copy was just a direct text
# conversion.
$utilities.binary_exporter.export_item(item, tmp_bin_file)
do_something_to_convert_to_pdf(tmp_bin_file, tmp_pdf_file)
$utilities.pdf_print_importer.import_item(item, tmp_pdf_file)
end
ensure
rm_f(tmp_pdf_file)
rm_f(tmp_bin_file)
end
end
$current_case.search('mime-type:application/vnd.ms-visio').each do |item|
replace_pdf_if_necessary(item)
end
ECMAScript code:
TEMPDIR = java.lang.System.getProperty("java.io.tmpdir");
// Ensures that the item has been exported to PDF already. If the resulting
// PDF was not printed, replaces the PDF with one which was custom generated.
function replacePdfIfNecessary(item) {
var tmpBinFile = new java.io.File(TEMPDIR, item.guid + '.bin');
var tmpPdfFile = new java.io.File(TEMPDIR, item.guid + '.pdf');
try {
var printedImageInfo = item.printedImageInfo;
if (printedImageInfo == null) {
// There isn't any info so the system hasn't generated one yet. Have the
// system do this...
try {
utilities.pdfPrintExporter.exportItem(item, tmpPdfFile);
printedImageInfo = item.printedImageInfo;
} catch (e) {
// leaves printedImageInfo as null
}
}
} finally {
tmpBinFile['delete'](); // 'delete' is a reserved word in ECMAScript
tmpPdfFile['delete']();
}
}
(Generates no output.)
Creates a new case with the specified metadata, loads some data into the case and then
checks the information after the load completes.
Ruby code:
puts 'Processing complete.'
puts the_case.name
puts the_case.description
puts the_case.investigator
puts the_case.root_items[0].name
puts the_case.search('kind:email AND name:18')[0].name
ensure
the_case.close
end
ECMAScript code:
println('Starting processing.');
processor.process();
println('Processing complete.');
println(theCase.name);
println(theCase.description);
println(theCase.investigator);
println(theCase.rootItems.get(0).name);
println(theCase.search('kind:email AND name:18').get(0).name);
} finally {
theCase.close();
}
Sample output:
Starting processing.
Processing complete.
New case
Description of the new case
My name
New folder
Test Email 18
Ruby code:
the_case = $utilities.case_factory.open('D:\\Cases\\Sanity')
begin
item = the_case.search('kind:email AND name:18')[0]
puts item.name
ensure
the_case.close
end
ECMAScript code:
Sample output:
Test Email 18
For Ruby, variables available in the scripting context require a dollar sign at the front, as they are passed in
as global variables.
In addition to this, we support Ruby-style naming for these global variables. For instance, the current case
can be accessed using $current_case.
The documented form can also be used if desired, for instance $currentCase, but the dollar sign is still
required.
As scripting context variables are local variables, they cannot be used from inside methods without doing
some additional work. For instance, the following example will not run:
def tag(items)
utilities.bulk_tagger.tag('Important', items)
end
tag(items)
Instead, the utilities could be passed into the method. So for instance, the following example will run:
tag(utilities, items)
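The difference can be sketched in a self-contained form that runs outside the application. FakeBulkTagger and FakeUtilities are hypothetical stand-ins for the real scripting context objects, and add_tag is borrowed from the Bulk Classification example; the sketch only illustrates Ruby's scoping rule and is not the Nuix API itself.

```ruby
# Hypothetical stand-ins for the scripting context, so the sketch runs
# outside the application. They are not part of the Nuix API.
class FakeBulkTagger
  attr_reader :calls

  def initialize
    @calls = []
  end

  # Records the request instead of tagging anything.
  def add_tag(tag, items)
    @calls << [tag, items]
  end
end

FakeUtilities = Struct.new(:bulk_tagger)

utilities = FakeUtilities.new(FakeBulkTagger.new) # a local variable
items = ['Test Email 1', 'Test Email 2']

# Fails: the local variable 'utilities' is not visible inside the method.
def tag_without_context(items)
  utilities.bulk_tagger.add_tag('Important', items)
end

# Works: the object is passed in explicitly.
def tag(utilities, items)
  utilities.bulk_tagger.add_tag('Important', items)
end

begin
  tag_without_context(items)
rescue NameError => e
  puts "tag_without_context failed: #{e.class}"
end

tag(utilities, items)
puts "Tagged #{utilities.bulk_tagger.calls.size} batch(es)"
```

Passing the object as a parameter keeps the method independent of how the scripting context exposes its variables.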
Prior to version 3.2, a dollar sign was required before these variable names, causing them to be
referenced as global variables. These references still work, even inside methods where the objects
weren't passed in. However, this usage is now deprecated, and will be removed in some future version
(probably version 5.0).
API Methods
Method names in the API documentation are documented using the Java form, for example exportItem().
However, it is also possible to use these in Ruby form, for example export_item().
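As an illustration of the naming correspondence only (not of how the interpreter maps names internally), the following sketch converts Java-form method names to their Ruby form. The helper name ruby_form is hypothetical.

```ruby
# Converts a Java-form (camelCase) method name to Ruby form (snake_case).
# Illustrative only: it shows the naming pattern, not the interpreter's
# actual lookup mechanism.
def ruby_form(java_name)
  java_name.gsub(/([a-z0-9])([A-Z])/, '\1_\2').downcase
end

%w[exportItem findFamilies sortItemsByTopLevelItemDate].each do |name|
  puts "#{name} -> #{ruby_form(name)}"
end
```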
File Encoding
Nuix always reads scripts in UTF-8 encoding.
However, due to the particular way JRuby handles encoding (it tries to emulate MRI, the reference Ruby
interpreter, as closely as it can), if your scripts contain any character not supported by the default
character set used by your computer (for English Windows systems this is typically windows-1252), an error
will occur when the interpreter encounters these characters.
The workaround for this is to put the following line at the top of each script: #encoding: utf-8
Since this is the standard way to declare a script encoding for Ruby, it will have a side-benefit of some text
editors picking up the information and automatically choosing the correct encoding to edit the file.
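A minimal sketch of a script carrying the declaration is shown here. The string contents are arbitrary and only serve to include characters outside plain ASCII.

```ruby
#encoding: utf-8
# The declaration above has to be the first line of the script (or the
# second, if the first line is a shebang).

greeting = 'Grüße aus Zürich'
puts greeting
```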
This will run all scripts using the Ruby 1.8 compatible interpreter. There is currently no way to selectively
change the version used for some scripts and not others.
Installing Gems
Many additional libraries exist for Ruby in the form of gems. Many of these work with the Java
implementation of Ruby we ship; the exception is gems which require native libraries to be compiled.
To install these, open up a Command Prompt at the Nuix 4 install directory and type:
C:\Program Files\Nuix\Nuix Desktop>java -Xmx500M -jar lib\jruby-complete.jar --command gem install gemname --user-install
This installs all gems under a directory called .gem in your user directory, so they will survive if the Nuix
software is reinstalled, and any gems installed for Nuix software will be accessible to any other software which
happens to use JRuby (and vice versa).
External Resources
A full description of the features available in Ruby is outside of the scope for this manual, but further
information can be found at the following locations.
The version of Ruby supported is 1.8.6.
Programming Ruby - The Pragmatic Programmer's Guide (best resource for beginners)
Core API Documentation
Standard Library API Documentation
Ruby-doc.org
However, the present support for ECMAScript is limited in that many useful operations which do work on
arrays, such as sorting, will not work directly on lists.
To get around this, it is possible to convert lists to arrays using toArray(), for example:
var itemsByDate = currentCase.search('query').toArray().sort(function(a, b) {
var aDate = a.properties.get('Date');
var bDate = b.properties.get('Date');
return aDate.compareTo(bDate);
});
Unsupported Features
The default ECMAScript implementation shipped does not support some of the features supported by the
complete Rhino implementation.
It is not possible to extend Java classes in ECMAScript; it is only possible to implement a single
Java interface. This does not affect integration with the Nuix API, but may affect scripters who
wish to integrate with a third-party Java library.
Support for E4X has been removed, so it is not possible to use the XML class, nor is it possible to
use XML literals in the script. Support for E4X is optional in the ECMAScript standard; however, it
is a standard feature of JavaScript, so some scripters may expect it to work.
External Resources
A full description of the features available in ECMAScript is outside of the scope for this manual, but further
information can be found at the following locations.
Note that on many of the pages below, the older naming "JavaScript" is used in place of "ECMAScript". Both
names refer to the same scripting language.
The version of ECMAScript supported is 1.6, but documentation specific to web browsers does not apply.
A re-introduction to JavaScript
Core JavaScript 1.5 Reference
New in JavaScript 1.6
Mozilla Developer Centre
Advanced Scripting
Script Metadata
When the software reads a script file, the first few lines of the script are scanned to read additional metadata
relating to the script.
Any arbitrary metadata can be stored in the script file, but the most important one for the application is "Menu
Title". This decides how the script will be displayed in the menu.
"Menu Title"
Decides how the script will be displayed in the menu.
"Menu Keyboard Shortcut"
Decides the keyboard shortcut which will execute the script.
"Needs Case"
Declares that the script needs an open case to function. The script menu item will be disabled if no case
is open in the application. Valid values are "true" and "false", with "false" as the default.
"Needs Selected Items"
Declares that the script needs selected items to function. The script menu item will be disabled if no
items are selected in the application. Additionally, these scripts will appear in the Results context menu.
Valid values are "true" and "false", with "false" as the default.
NOTE: Script menu items are sorted by the filename, not the name obtained through "Menu Title". This is
useful if you want to put your scripts into the menu in a particular order, while giving them names which
would not normally be in that order. To force a particular ordering, just name the files starting with 001, 002,
003, etc.
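The scanning of leading metadata lines can be sketched as follows. The "# Key: Value" comment form, the sample script text and the helper name read_metadata are all assumptions made for this illustration; check the exact header syntax against the product documentation before relying on it.

```ruby
# Hypothetical script text. The "# Key: Value" comment form is an assumption
# made for this sketch; it is not confirmed by this guide.
script_text = <<'EOS'
# Menu Title: Tag Selected Items
# Needs Case: true
# Needs Selected Items: true
puts 'script body starts here'
EOS

# Reads "# Key: Value" pairs from the leading comment lines of a script,
# stopping at the first line that is not such a comment.
def read_metadata(script_text)
  metadata = {}
  script_text.each_line do |line|
    match = line.match(/\A#\s*([^:]+):\s*(.*?)\s*\z/)
    break unless match
    metadata[match[1].strip] = match[2]
  end
  metadata
end

metadata = read_metadata(script_text)
# "false" is the documented default for "Needs Case".
needs_case = metadata.fetch('Needs Case', 'false') == 'true'
puts metadata['Menu Title']
puts "Needs case: #{needs_case}"
```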
Example of doing this inside Ruby:
Script Logging
Output from scripts run from inside the application is written to the Script Console. Output from scripts run
from outside the application is written to the output of the application itself, and thus will appear in
"stdout.log".
However, in either situation a copy of the output is written to the main "nuix.log". A side-effect of this is that
the user or system administrator can configure where the output of scripts is logged, which may be useful for
auditing.
In the "config" directory inside the application install directory, there is a file named "log4j.properties". This is
a normal text file (actually a Java properties file) which can be edited in any text editor.
Adding the following block of configuration to the end of the file will result in all output from "My First Script"
being written to "D:\my_first_script.log". Notice that the script name contains spaces, each of which must be
prefixed with a backslash (\).
log4j.logger.SCRIPT.My\ First\ Script=INFO,MY_FIRST_SCRIPT
log4j.appender.MY_FIRST_SCRIPT=org.apache.log4j.FileAppender
log4j.appender.MY_FIRST_SCRIPT.File=D:\\my_first_script.log
log4j.appender.MY_FIRST_SCRIPT.Append=true
log4j.appender.MY_FIRST_SCRIPT.layout=org.apache.log4j.PatternLayout
log4j.appender.MY_FIRST_SCRIPT.layout.ConversionPattern=%d %-5p - %m%n
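For several scripts, the same block can be generated rather than typed by hand. This is a minimal sketch: the helper name log4j_block and the appender-naming rule (upper case with underscores) are choices made here to mirror the "My First Script" example, not something the application requires.

```ruby
# Builds the log4j configuration lines for one script, mirroring the
# "My First Script" example. Helper name and appender naming are
# illustrative assumptions.
def log4j_block(script_name, log_path)
  logger_key = script_name.gsub(' ') { '\\ ' }   # prefix each space with a backslash
  appender   = script_name.upcase.gsub(/[^A-Z0-9]+/, '_')
  file_value = log_path.gsub('\\') { '\\\\' }    # double each backslash in the path
  [
    "log4j.logger.SCRIPT.#{logger_key}=INFO,#{appender}",
    "log4j.appender.#{appender}=org.apache.log4j.FileAppender",
    "log4j.appender.#{appender}.File=#{file_value}",
    "log4j.appender.#{appender}.Append=true",
    "log4j.appender.#{appender}.layout=org.apache.log4j.PatternLayout",
    "log4j.appender.#{appender}.layout.ConversionPattern=%d %-5p - %m%n"
  ].join("\n")
end

puts log4j_block('My First Script', 'D:\\my_first_script.log')
```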
Application Command-Line
This section of the Nuix 4 User Guide lists the supported options for the command-line application of Nuix
Desktop and Nuix Worker.
nuix_app.exe [-Dname=value] casefile
nuix_worker.exe [-Dname=value]
-Dname=value
Adding a -D (define) overrides a default system property. The following system properties are of particular
interest.
nuix.licence.preference The name of the licence type to obtain from the server. (e.g.
"Enterprise Workstation").
nuix.loglevel Specifies the logging level, which determines what kind of information
is output to the logs. The allowed levels are DEBUG, INFO,
WARN, ERROR, FATAL. INFO is the default level used by Nuix as
it usually provides sufficient information for support requests.
nuix.processing.ndNumBottomTextTokens For emails, the number of tokens at the end of the item to look for
potential signature and/or disclaimer text. Affects indexing.
nuix.processing.ndSampleFrequency The shingle sampling frequency. If, for example, this value is 8,
discard all hashed shingles whose hexadecimal value is not
exactly divisible by 8. Affects indexing.
nuix.processing.ndTokensPerShingle The number of contiguous text tokens used to form a shingle.
Affects indexing.
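The interplay of the two shingle settings above can be sketched as follows. Whitespace tokenisation and the MD5 hash are assumptions chosen for illustration; the guide does not state which tokeniser or hash function the product uses, and the helper names are hypothetical.

```ruby
require 'digest/md5'

# Forms shingles from runs of contiguous tokens
# (compare nuix.processing.ndTokensPerShingle).
def shingles(tokens, tokens_per_shingle)
  tokens.each_cons(tokens_per_shingle).map { |group| group.join(' ') }
end

# Hashes each shingle and keeps only hashes exactly divisible by the
# sampling frequency (compare nuix.processing.ndSampleFrequency).
# MD5 is an assumption made for this sketch.
def sampled_shingle_hashes(tokens, tokens_per_shingle, sample_frequency)
  shingles(tokens, tokens_per_shingle)
    .map { |shingle| Digest::MD5.hexdigest(shingle).to_i(16) }
    .select { |value| (value % sample_frequency).zero? }
end

tokens = 'the quick brown fox jumps over the lazy dog'.split
all_shingles = shingles(tokens, 3)
kept = sampled_shingle_hashes(tokens, 3, 8)
puts "#{all_shingles.size} shingles formed, #{kept.size} kept after sampling"
```

A higher sampling frequency discards more shingles, which reduces the amount of data kept per item at the cost of coarser comparison.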
-Xparam
Command-line parameters starting with -X will be passed directly to the underlying JVM. A common usage
for this is increasing the amount of memory available to the main Nuix Desktop process.
Example: nuix_desktop.exe -Xmx4g
-nologo
Adding the -nologo flag disables the product name and copyright notice which would otherwise appear when
using the console version of the application.
Some script developers may prefer this if the application will be run many times from a single script or batch
file.
-interactive
Adding the -interactive flag enters an interactive Ruby prompt which can be used for simple testing of
scripting code without the need to run a full script.
-script scriptfile
Adding the -script parameter followed by the full pathname to a script file will run the script without displaying
the main window. This allows batch processing of cases and is thus useful for integration with other
software.
For more information on writing scripts, see Getting Started with Scripting and Examples of Scripting Outside
the Application.
casefile
Specifies the location of the case to automatically open.
Example: "\baseline\Bubble2 - Nuix Cases\Case1\case.fbi2"
Adding the full path-name to a case.fbi2 file will open the specified case immediately after displaying the
main window.
Supported - The file type is fully supported, including the extraction of all metadata and content.
Recognised - The file type is recognised by its header, but is not fully supported. Files of this
type are simply text stripped.
Partially Supported - The file type is recognised and partially supported, meaning that some but not
all data from the file is processed.
Nuix can directly consume some forensic images. Directly consuming forensic images allows Nuix to
process the source data without interference from the operating system or the filesystem security.
Supported Formats:
Encase:
E01, E02, E03, etc... - Nuix supports direct processing of Encase images, both single E01 files and
segmented images.
L01, L02, L03, etc... - Nuix supports direct processing of Encase Logical Volumes.
dd Images:
Note: Nuix does not recover or extract data from an image's deleted, swap, slack or free space.
Adaptive Multi-Rate Wide Band Audio File | audio/amr-wb | Multimedia | *.amr | Recognised
Apple Keynote Presentation File | application/vnd.apple.keynote | Presentations | *.key | Partially Supported (iWork 09 and later, text extracted but not embedded images or files)
Block Device | filesystem/block-device | System Files | *.dat | Supported
COFF Object File | application/coff | System Files | *.obj, *.o, *.exp, *.dll | Recognised
Graphic
Corel Draw Drawing | image/vnd.corel-draw | Drawings | *.cdr, *.cdt, *.drw | Recognised (>= Corel Draw 4 for RIFF formats and .cdt/.drw)
Evernote Thumbnail Container File | application/vnd.evernote-thumbnail-container | Containers | *.thumbnail | Supported
Graphic Database System II Layout File | image/x-graphic-database-system-ii | Drawings | *..gdsii | Recognised (All)
Ichitaro Word Processing File | application/x-js-taro | Documents | *.jtd, *.jtt, *.jtdc, *.jfw, *.jvw, *.jbw, *.juw, *.jaw, *.jtw, *.jsw | Recognised (>= 4 AND <= 2008)
Lotus 1-2-3 Spreadsheet File | application/vnd.lotus-123 | Spreadsheets | *.wk1, *.wk4, *.wks, *.123 | Partially Supported (Text stripped)
Lotus Domino XML Appointment | application/vnd.lotus-domino-xml-appointment-document | Calendar | *.xml | Supported (>= 1 AND <= 8)
Lotus Domino XML Mail | application/vnd.lotus-domino-xml-mail-document | Email | *.xml | Supported (>= 1 AND <= 8)
Lotus Domino XML Other Document | application/vnd.lotus-domino-xml-other-document | Other Documents | *.xml | Supported (>= 1 AND <= 8)
Lotus Domino XML Person Document | application/vnd.lotus-domino-xml-person-document | Contacts | *.xml | Supported (>= 1 AND <= 8)
Lotus Domino XML Task Document | application/vnd.lotus-domino-xml-task-document | Other Documents | *.xml | Supported (>= 1 AND <= 8)
Lotus Notes View | application/vnd.lotus-notes-view | Containers | *.dat | Supported (>= 1 AND <= 8)
Matroska Video File | video/x-matroska | Multimedia | *.mkv | Recognised
Microsoft 2007 Excel Spreadsheet | application/vnd.openxmlformats-officedocument.spreadsheetml.sheet | Spreadsheets | *.xlsx, *.xlsm, *.xltx, *.xlk | Supported (2007)
Microsoft Compressed HTML Help File | application/vnd.ms-htmlhelp | System Files | *.chm, *.chtml | Supported
Microsoft Excel Spreadsheet | application/vnd.ms-excel | Spreadsheets | *.xls, *.xlt, *.xlk, *.nxl, *.nxt, *.et, *.ett | Partially Supported (95 Text stripped)
Microsoft Exchange Server Property Store | application/vnd.ms-exchange-edb | Containers | *.edb | Supported (>= 5.5 AND <= 2010)
Microsoft Exchange Server Streaming Store | application/vnd.ms-exchange-stm | Containers | *.stm | Supported (>= 2000 AND <= 2010)
Microsoft Outlook Activity | application/vnd.ms-outlook-activity | Email | *.msg | Supported
Microsoft Outlook Shortcut | application/vnd.ms-outlook-shortcut | Containers | *.msg | Supported
Microsoft PowerPoint Presentation | application/vnd.ms-powerpoint | Presentations | *.ppt, *.pot, *.pps, *.dps, *.dpt | Partially Supported (95 Text stripped)
Microsoft Project File | application/vnd.ms-project | Other Documents | *.mpp, *.mpt | Partially Supported (95 Text stripped)
Microsoft Visio Drawing | application/vnd.ms-visio | Drawings | *.vsd, *.vst, *.vss | Recognised (2000 <)
Microsoft Windows Enhanced Metafile | image/vnd.ms-emf | Drawings | *.emf | Supported
Microsoft Word Document | application/vnd.ms-word | Documents | *.doc, *.dot, *.wps, *.wpt | Partially Supported (95 Text stripped)
MPEG-4 Video File | application/mp4 | Multimedia | *.mp4, *.m4a, *.mpeg4, *.mpeg, *.m4v, *.f4v | Recognised (All)
OpenType Font | application/vnd.ms-opentype | System Files | *.otf, *.ttf | Recognised
RFC822 Email Message | message/rfc822 | Email | *.eml, *.mht | Supported (All)
Skype Chat Sync File | application/vnd.skype-chat-sync | Other Documents | *.dat | Partially Supported (Not all versions supported)
SQLite Database | application/vnd.sqlite-database | Databases | *.db, *.sqlite, *.db3, *.sqlite3 | Recognised (> Version 3 (from June, 2004))
StarCalc Spreadsheet | application/vnd.stardivision.calc | Spreadsheets | *.sdc | Recognised (All)
Unallocated Space | filesystem/unallocated-space | Containers | *.dat | Supported
Windows 7 Sticky Notes File | application/vnd.ms-stickynote | Containers | *.snt | Recognised
Recognised (Other compression algorithms)
Recognised (Encryption)
COMMON NAME | FILE TYPE | KIND | POSSIBLE EXTENSIONS | SUPPORT LEVEL
Centera Cluster | application/vnd.emc-centera-cluster | Containers | | Supported
Centera Clip Container | application/vnd.emc-centera-eclip-xml | Containers | | Supported
Symantec KVS IIS File | application/vnd.symantec-kvs-iis | System Files | | Partially Supported
Java Archive | application/java-archive | Containers | *.jar, *.war, *.ear, *.sar | Supported (All)
AOL Personal Filing Cabinet File | application/vnd.aol-personal-filing-cabinet | Containers | *.pfc | Recognised
Apple iOS Message Database | application/vnd.apple-ios-message-database | Databases | *.db | Supported
Cellebrite XML Report | application/vnd.cellebrite-xml-report | Other Documents | *.xml | Supported
Evernote Thumbnail Container File | application/vnd.evernote-thumbnail-container | Containers | *.thumbnail | Supported
EnCase EWC Disk Image | application/vnd.guidance-encase | Containers | *.e01 | Partially Supported (<= 7)
Lotus Domino XML Mail | application/vnd.lotus-domino-xml-mail-document | Email | *.xml | Supported (>= 1 AND <= 8)
Mozilla Mork Database | application/vnd.mozilla.mdb-mork | Databases | *.mab, *.msf, *.dat | Supported
Microsoft Entourage Mailbox | application/vnd.ms-entourage | Containers | *.dat | Partially Supported (Microsoft Entourage 2001 - 2008 and Microsoft Outlook For Mac (OLM) 2011 - 2013)
Microsoft Excel Spreadsheet | application/vnd.ms-excel | Spreadsheets | *.xls, *.xlt, *.xlk, *.nxl, *.nxt, *.et, *.ett | Partially Supported (95 Text stripped)
Microsoft Internet Explorer Cache Entry | application/vnd.ms-ie-cache-entry | Containers | *.dat | Supported
OpenType Font | application/vnd.ms-opentype | System Files | *.otf, *.ttf | Recognised
Microsoft Outlook Item | application/vnd.ms-outlook-item | Email | *.msg | Supported
Microsoft Outlook Schedule | application/vnd.ms-outlook-schedule | Calendar | *.msg | Supported
Microsoft Registry Key | application/vnd.ms-registry-key | System Files | *.dat | Recognised (All)
Microsoft Word Art | application/vnd.ms-word-art | Drawings | *.dat | Recognised
OpenDocument Chart | application/vnd.oasis.opendocument.chart | Drawings | *.odc, *.otc | Recognised (ALL)
Microsoft 2007 PowerPoint Presentation | application/vnd.openxmlformats-officedocument.presentationml.presentation | Presentations | *.pptx, *.pptm, *.ppsx, *.ppsm, *.potx | Supported (2007)
Informed Form Template Document | application/vnd.shana.informed.formtemplate | Other Documents | *.itp | Recognised
StarCalc Spreadsheet | application/vnd.stardivision.calc | Spreadsheets | *.sdc | Recognised (All)
Uniform Office Presentation File | application/vnd.uof.presentation | Presentations | *.uop, *.uof | Recognised
UNIX Ar Archive File | application/x-ar | Containers | *.ar, *.deb, *.udeb, *.lib, *.a | Supported
UNIX/Linux ELF Executable | application/x-elf | System Files | *.dat | Recognised
Ichitaro Word Processing File | application/x-js-taro | Documents | *.jtd, *.jtt, *.jtdc, *.jfw, *.jvw, *.jbw, *.juw, *.jaw, *.jtw, *.jsw | Recognised (>= 4 AND <= 2008)
Parchive (Parity Archive) 1.0 File | application/x-par | Containers | *.par | Recognised
Voice Mail Record | application/x-voice-mail-record | Other Documents | *.dat | Supported
Matroska Audio File | audio/x-matroska | Multimedia | *.mka | Recognised
Portable Network Graphic | image/png | Images | *.png | Supported
Corel Draw Drawing | image/vnd.corel-draw | Drawings | *.cdr, *.cdt, *.drw | Recognised (>= Corel Draw 4 for RIFF formats and .cdt/.drw)
Efax Image | image/vnd.j2global-efax | Images | *.efx, *.jsd | Recognised
Graphic Database System II Layout File | image/x-graphic-database-system-ii | Drawings | *..gdsii | Recognised (All)
SharePoint Site | server/sharepoint-site | Containers | *.dat | Supported (2010)
Adobe Flash Video File | video/x-flv | Multimedia | *.flv | Recognised (All known versions)
Supported (>= 97 AND <= 2002)
Supported (>= 97 AND <= 2002)
Supported (>= 1998)
Supported (>= 2000)
Supported (>= 97 AND <= 2002)
Recognised (Other compression algorithms)
Recognised (Encryption)
COMMON NAME | FILE TYPE | KIND | POSSIBLE EXTENSIONS | STATUS
Centera Cluster | application/vnd.emc-centera-cluster | Containers | | Supported
Centera Clip Container | application/vnd.emc-centera-eclip-xml | Containers | | Supported
COMMON NAME | FILE TYPE | KIND | POSSIBLE EXTENSIONS | STATUS
EmailXtender Notes Message | application/vnd.emc-mailxtender-notes-msg | Containers | *.onm, *.emx | Supported
Symantec KVS IIS File | application/vnd.symantec-kvs-iis | System Files | | Partially Supported
COMMON NAME | FILE TYPE | KIND | POSSIBLE EXTENSIONS | STATUS
Lotus Domino XML Appointment | application/vnd.lotus-domino-xml-appointment-document | Calendar | *.xml | Supported (>= 1 AND <= 8)
BY CONTACTS
Microsoft Outlook Contact | application/vnd.ms-outlook-contact | Contacts | *.msg | Supported
BY CONTAINERS
Autonomy Load File | application/vnd.autonomy-load-file | Containers | *.idx | Supported
EnCase LEF2 Logical Volume File | application/vnd.guidance-encase-lef2 | Containers | *.l01 | Partially Supported (<= 7)
Microsoft Entourage Orphaned Item | application/vnd.ms-entourage-orphan | Containers | *.dat | Partially Supported (Unlinked from parent)
Microsoft Outlook Express 4 Mailbox | application/vnd.ms-outlook-express-4 | Containers | *.mbx | Supported
Microsoft Virtual PC / Server VHD Disk File | application/vnd.ms-virtual-harddisk | Containers | *.vhd | Supported (All)
X-Ways File System Image | application/vnd.x-ways.filesystem | Containers | *.ctr | Partially Supported (XWFS1 only, not all metadata extracted)
Apple Disk Image | application/x-apple-diskimage | Containers | *.dmg, *.smi, *.img, *.dsk, *.nib | Recognised
UNIX Ar Archive File | application/x-ar | Containers | *.ar, *.deb, *.udeb, *.lib, *.a | Supported
gzip-Compressed File | application/x-gzip | Containers | *.gz, *.tgz, *.wmz, *.emz | Supported
Directory | filesystem/directory | Containers | *.dat | Supported
BY DATABASES
February 2013 Nuix eDiscovery User Guide v 4.2 PAGE 365 of 390
Quicken Document application/vnd.intuit. Databases *.qdf Recognised
qdf
MYOB Company File application/vnd.myob Databases *.myo, *.prm, *.dat, *.pls Recognised
BY DOCUMENTS
February 2013 Nuix eDiscovery User Guide v 4.2 PAGE 366 of 390
DocBook Document application/docbook+x Documents *.dbk, *.xml Recognised
ml
February 2013 Nuix eDiscovery User Guide v 4.2 PAGE 367 of 390
Microsoft Word application/vnd.ms- Documents *.doc, *.dot, *.wps, *.wpt Partially
Document word Supported (95
Text stripped)
February 2013 Nuix eDiscovery User Guide v 4.2 PAGE 368 of 390
Haansoft Hangul application/x-hwp Documents *.hwp, *.hwt Recognised (>=
Word Processing 3.0)
File
Ichitaro Word application/x-js-taro Documents *.jtd, *.jtt, *.jtdc, *.jfw, *.jvw, Recognised (>=
Processing File *.jbw, *.juw, *.jaw, *.jtw, 4 AND <= 2008)
*.jsw
BY DRAWINGS
February 2013 Nuix eDiscovery User Guide v 4.2 PAGE 369 of 390
Microsoft OrgChart application/vnd.ms- Drawings *.dat Recognised
OLE Object orgchart
February 2013 Nuix eDiscovery User Guide v 4.2 PAGE 370 of 390
AutoCAD DWG image/vnd.autocad- Drawings *.dwg Recognised
Drawing dwg
Corel Draw Drawing image/vnd.corel-draw Drawings *.cdr, *.cdt, *.drw Recognised (>=
Corel Draw 4 for
RIFF formats
and .cdt/.drw)
BY EMAIL
February 2013 Nuix eDiscovery User Guide v 4.2 PAGE 371 of 390
Lotus Notes application/vnd.lotus- Email *.eml Supported (>= 1
Document notes-document AND <=8)
BY IMAGES
February 2013 Nuix eDiscovery User Guide v 4.2 PAGE 372 of 390
Windows Bitmap image/bmp Images *.bmp Supported
Graphic
Wireless Bitmap Graphic image/vnd.wap.wbmp Images *.wbmp Supported
BY MULTIMEDIA
Sun Basic Audio File audio/basic Multimedia *.au Recognised
BY OTHER DOCUMENTS
Microsoft Note-It OLE Object application/vnd.ms-note-it Other Documents *.dat Recognised
OpenDocument Formula application/vnd.oasis.opendocument.formula Other Documents *.odf, *.otf Supported (ALL)
Address Book Contact application/x-contact Other Documents *.dat Supported
BY PRESENTATIONS
Kingsoft Presentation Document application/vnd.haansoft-presentation Presentations *.hpt, *.rbk Recognised
BY SPREADSHEETS
Google Drive Spreadsheet application/vnd.google-drive-spreadsheet Spreadsheets *.dat Recognised
Microsoft Excel Spreadsheet application/vnd.ms-excel Spreadsheets *.xls, *.xlt, *.xlk, *.nxl, *.nxt, *.et, *.ett Partially Supported (95 Text stripped)
Microsoft 2007 Excel Spreadsheet application/vnd.openxmlformats-officedocument.spreadsheetml.sheet Spreadsheets *.xlsx, *.xlsm, *.xltx, *.xlk Supported (2007)
Comma Separated Values text/csv Spreadsheets *.csv Supported (All)
BY SYSTEM FILES
Microsoft Hyperlink Record application/vnd.ms-hyperlink-record System Files *.dat Recognised
Executable Script File application/x-executable-script System Files *.dat Supported
FIFO filesystem/fifo System Files *.dat Supported
NO DATA
BY UNRECOGNISED
Supported (>= 97 AND <= 2002)
Supported (>= 97 AND <= 2002)
Supported (>= 1998)
Supported (>= 2000)
Supported (>= 97 AND <= 2002)
Recognised (Other compression algorithms)
Recognised (Encryption)
COMMON NAME FILE TYPE KIND POSSIBLE EXTENSIONS STATUS
Centera Cluster application/vnd.emc-centera-cluster Containers Supported
Centera Clip Container application/vnd.emc-centera-eclip-xml Containers Supported
Symantec KVS IIS File application/vnd.symantec-kvs-iis System Files Partially Supported
HTML
XHTML
RFC822 / RFC822 headers
Outlook item, journal, note, short cut, sticky note, task
Lotus Notes legacy items and DXL mail items
Outlook appointment, schedule
DXL appointment items
Outlook contact
DXL person items
Rendered Natively via MS Office:
Word
RTF
Word Open XML (.docx) files
PowerPoint
PowerPoint Open XML (.pptx) files
Excel
Excel Open XML and binary (.xlsx; .xlsb) files
Supported image types for rendering:
Bitmap
GIF
JPEG
JP2
PCX
PNG
TIFF
Lotus Notes bitmap
Microsoft icons
EMF
WMF
WBMP
PBM
PGM
PPM
Supported by passing through original file:
PDF
Adobe Illustrator
application/vnd.corel-quattro
application/vnd.lotus-123
application/vnd.ms-works-ss
application/vnd.stardivision.calc
text/csv
application/vnd.ms-access
application/octet-stream
application/vnd.myob
video/*
audio/*
Conditional Image Conversion:
application/vnd.ms-excel
application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
Note: The two Excel file types are conditionally imaged based on the Image Excel Spreadsheets option.
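The conditional rule in the note above can be sketched in a few lines of Python. This is only an illustration and not Nuix code: the function names and the `image_excel_spreadsheets` flag are hypothetical stand-ins for the Image Excel Spreadsheets option, and the legacy Excel type is spelled `application/vnd.ms-excel` as in the file-type table.

```python
# MIME types listed under "Conditional Image Conversion".
CONDITIONAL_EXCEL_TYPES = frozenset({
    "application/vnd.ms-excel",
    "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
})


def is_conditionally_imaged(mime_type: str) -> bool:
    """Return True if imaging of this type depends on the Excel option."""
    return mime_type.strip().lower() in CONDITIONAL_EXCEL_TYPES


def should_image_excel(mime_type: str, image_excel_spreadsheets: bool) -> bool:
    """Apply the conditional rule to one of the two Excel types.

    Raises ValueError for any other type, because this sketch does not
    model how the remaining types are handled.
    """
    if not is_conditionally_imaged(mime_type):
        raise ValueError(f"not a conditionally imaged type: {mime_type}")
    # Hypothetical flag standing in for the Image Excel Spreadsheets option.
    return image_excel_spreadsheets
```

With the flag set to `False`, both Excel types return `False`. Any other MIME type raises an error instead of producing a guess.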
The slip sheet will read "Unprintable document - refer to native file" and include some item-level detail:
Name
GUID
MIME Type
Items that do not contain binary data can be found using the has-binary search syntax.
application/com
application/dll
application/exe
application/java-class
application/vnd.ms-fon
application/vnd.ms-htmlhelp
application/vnd.ms-installer
application/vnd.ms-outlook-property-block
application/vnd.ms-shortcut
application/x-empty
application/x-font-ttf
application/x-nls
filesystem/inaccessible
image/vnd.microsoft.icon
image/vnd.ms-ani
Containers (kind:container):
application/vnd.ms-exchange.edb
application/vnd.guidance-encase
application/vnd.ms-exchange-edb
application/vnd.ms-ie-cache
application/vnd.ms-ie-cache-entry
application/vnd.ms-outlook
application/vnd.ms-outlook-folder
application/vnd.nuix-evidence
application/x-disk-image
application/x-gzip
application/x-zip-compressed
filesystem/directory
filesystem/drive
image/bmp
image/gif
image/jpeg
image/jp2
image/pcx
image/png
image/tiff
image/vnd.lotus-notes-bitmap
image/vnd.microsoft.icon
image/vnd.ms-emf
image/vnd.ms-wmf
image/vnd.wap.wbmp
image/x-portable-bitmap
image/x-portable-graymap
image/x-portable-pixmap
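The search terms mentioned in this appendix (has-binary and kind:container) can be assembled programmatically when many MIME types need to be checked. The Python below is a minimal sketch under stated assumptions: only `kind:container` appears verbatim in this guide, while the `has-binary:0` value form, the `mime-type` field name and the `OR` operator are assumptions to verify against the search syntax documentation. The helper names are hypothetical.

```python
from typing import Iterable


def field_query(field: str, value: str) -> str:
    """Build one field:value term, for example kind:container."""
    return f"{field}:{value}"


def any_of(terms: Iterable[str]) -> str:
    """Join terms with OR inside parentheses (assumed operator syntax)."""
    return "(" + " OR ".join(terms) + ")"


# Search term taken from the "Containers (kind:container)" heading.
containers = field_query("kind", "container")

# Assumption: 0 is the value that selects items without binary data.
no_binary = field_query("has-binary", "0")

# Assumption: mime-type is the field name for matching a MIME type.
system_types = any_of(
    field_query("mime-type", m)
    for m in ("application/exe", "application/dll")
)

print(containers)
print(no_binary)
print(system_types)
```

The resulting strings can be pasted into the search box, or adapted if the actual field names differ.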
Troubleshooting
Refer to the Troubleshooting section of the Knowledge Base for troubleshooting topics.