Domino Full Text Indexing (FTI) Fundamentals
Table of Contents
Author(s)
Introduction
Glossary
Chapter 1 – What is a Full Text Index and why have one?
Chapter 2 – Creating the Full Text Index.
Chapter 3 – Maintaining the Full Text Index.
Chapter 4 – Using the Full Text Index.
Chapter 5 – Things to be aware of with a Full Text Index.
Chapter 6 – Troubleshooting and debug options with a Full Text Index.
Chapter 7 – Common Issues with Full Text Indexing and Search.
Resources
Legal Statements
Disclaimers
Authors
This document was created and drafted by the following Subject Matter Experts:
Robert Steen joined IBM in 1998, and in 2003 he joined the Customer
Support team, where he worked with the Server Core team, assisting
numerous customers from all over the world.
Robert Steen
Introduction
The aims of this article are:
• To explain what a full-text index is.
• To show why you would want to use them.
• To show how to use them.
• To work through some troubleshooting options and solutions to common issues.
Glossary
FTI Abbreviation for full-text index.
.FT folder The full-text index is created in a folder with the same root name and location
as the database that contains it, but it has the “.FT” extension. This folder will
be referred to as the “.FT folder” throughout this document.
Search term The string or piece of text being searched for.
It gives more accurate and faster results than a simple search, and lists the search results in order of
relevance.
It allows for more complicated searches, like searching in specific fields or forms and for Boolean
options like And, Or, Not, etc. More on these options later.
The index itself consists of several files external to the database. It is generated in a subfolder
next to the database, with the database's root filename but the extension ".FT".
Generally, the files within the .FT folder are not accessed directly and should not be edited.
Further information on the files that compose the Full-text index can be found here:
Title: Overview of Full-text index subdirectory and file layout
Doc #: KB0033058
URL: https://support.hcltechsw.com/csm?id=kb_article&sysparm_article=KB0033058
The FTI can be created for server-based databases or, with a Notes client, for local databases that
are stored on a hard drive and not accessed through a server.
If the FTI does not exist, many of the options will not be available. We will be looking at these later.
To create the FTI click Create Index to open the Create Full-Text Index dialog:
This controls whether text in attachments is included in the full-text index. Selecting it will make
the index larger but allows for a more complete search of the text contained in the documents.
Conversion filters will attempt to parse only the text within the attachments that would normally be
displayed by the attachment - so any programming or formatting data stored in the file will not be
included in the index.
If the Conversion filters are not selected, all text within the attachments, including all formatting or
coding, will be added to the index.
By not using the conversion filters the indexing process can be faster as the indexing task does not
need to extract the text from the files. However, the resulting index is often larger, and it can
contain references to text that is not displayed as text from the attachment itself.
By using conversion filters the time and resources needed to create and maintain the index can
increase, but the overall size of the index will be smaller, as less text is added to it and searches will
find only text that should be displayed in the attachment. Resulting searches can also be faster.
The Domino Server and Notes standard clients use Apache Tika Open-Source conversion filters to
extract text for full-text searches of attachments.
Tika runs as a Java™ process when you start the Notes® standard client or Domino®. The process
calls tika-server.jar, which starts the HTTP task and listens for text extraction requests on port 9998
by default.
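When attachment text does not appear in search results, one quick check is whether anything is listening on the Tika port at all. The following is a minimal sketch that only tests for a TCP listener; the helper name port_is_open is hypothetical, and the sketch assumes the default port 9998 on the local machine. It does not verify that the listener is actually Tika.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: nothing usable is listening.
        return False


if __name__ == "__main__":
    # 9998 is the default port named in the text; adjust if it was changed.
    print("Listener on 127.0.0.1:9998:", port_is_open("127.0.0.1", 9998))
```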
By default, all file formats that are supported by Tika are full-text indexed. However, several
attachment types are not included in the FTI by default: .au, .bqy, .cca, .dbd, .dll, .exe, .gif, .gz, .img,
.jar, .jpg, .mov, .mp3, .mpg, .msi, .nsf, .ntf, .p7m, .p7s, .pag, .pdb, .png, .rar, .sys, .tar, .tif, .wav, .wpl, .z,
.zip. More information on attachments will be covered later.
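To check quickly whether a given attachment would be skipped under the default settings, the list above can be turned into a small lookup. This is only an illustrative sketch: the function name excluded_by_default is hypothetical, and the case-insensitive comparison is an assumption rather than documented behavior.

```python
from pathlib import PurePath

# Extensions the text lists as not full-text indexed by default.
DEFAULT_EXCLUDED = {
    ".au", ".bqy", ".cca", ".dbd", ".dll", ".exe", ".gif", ".gz", ".img",
    ".jar", ".jpg", ".mov", ".mp3", ".mpg", ".msi", ".nsf", ".ntf", ".p7m",
    ".p7s", ".pag", ".pdb", ".png", ".rar", ".sys", ".tar", ".tif", ".wav",
    ".wpl", ".z", ".zip",
}


def excluded_by_default(filename: str) -> bool:
    """True if the file's final extension is on the default exclusion list."""
    # Assumption: extensions are compared without regard to case.
    return PurePath(filename).suffix.lower() in DEFAULT_EXCLUDED


if __name__ == "__main__":
    for name in ("report.pdf", "backup.zip", "photo.JPG"):
        print(name, "->", "skipped" if excluded_by_default(name) else "indexed")
```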
This option sets whether encrypted fields are included in the Index - provided the indexer (either the
server ID or the user's ID if indexing a local database) has access to the text in the encrypted fields.
Index sentence and paragraph breaks (word breaks are always indexed)
This sets whether these breaks will be included in the index when separating words.
Enable case-sensitive searches (when searching, use the EXACT CASE keyword)
Enables case-sensitive searching. By default, searches are not case-sensitive.
Note: If case-sensitive searches are required, the index needs to be created with case sensitivity
enabled. This will increase the size of the index.
This selects how the index will be updated when it is hosted on a server; here are the options:
Daily - The index will be updated when the UPDALL task is run on the server overnight.
Scheduled - requires an UPDALL task to be run on the database through a program document on a
schedule.
Hourly - The index will be updated hourly by the UPDATE task.
Immediate - The index will be updated when a change is made.
Note: Immediate could mean up to several minutes depending on how busy the server is.
We will be looking closer at the options to update and maintain the index later.
Creating the FTI for multiple databases using the Administrator Client
We've seen above how the FTI can be generated for a single database. If indexes are required for
multiple databases, this process would become tedious and very time consuming for the
administrator.
To access this:
1. Open the Administrator client.
2. Select the Files tab.
3. Highlight the databases that require an FTI. Multiple databases can be selected using either
the Shift key or the CTRL key and clicking on the individual databases.
4. Once selected, click Tools>Database>Full Text Index option on the right-hand side to open
the Full Text Index dialog:
Figure 2-5. Full Text Index dialog from the Administrator client.
Here the same options for creating the FTIs are available.
Note: In a situation where a database was previously full-text indexed and the index was removed
through the operating system (not through a Notes client), it is possible to recreate the full-text
index on the server with the command:
This is only possible if the full-text index was removed outside of a Notes client.
We will be looking at further options to correctly remove a full-text index later.
Note: Designer or higher rights in the database's ACL are needed to create the full-text index. For
users with lower access, the Create Index options will not be active.
A method is then needed to update the full-text index to incorporate the changes allowing it to be
used successfully again.
As when creating a full-text index, select the database and open the database properties. Select the
Full Text tab (second from the far right):
By clicking the Update Index button, the index will be updated with any changes that were made to
the database.
Another option to update the index using the Notes client is displayed in the Search bar (more on
the Search bar when we go through using a full-text index). When the extra options are accessed in
this bar, there is a button available - Update Index. This will update the index when activated:
For server-based databases the Administrator client can be used to update one or more indexes. As
when creating multiple full-text Indexes:
The "Updall" command on the server console can be used to update both the view indexes in a
database and the full-text indexes.
The command line to only update the full-text index for a database is:
Example:
This can be run on individual databases, all databases in a folder, or all databases on the server.
There are several other options with the Updall command that will be reviewed below.
All these options will update the full-text index, allowing for accurate searches of the current data.
By default, on both client and server, documents with unindexed changes are indexed before
searching. Up to 200 documents can be indexed before searching; additional documents are queued
for immediate indexing.
When creating the full-text index, we saw that there were four options for the update frequency of
the index on a server:
Daily
Scheduled
Hourly
Immediate
Daily and Scheduled options are carried out with the Updall task and Hourly and Immediate are
handled by the UPDATE task.
Daily
By default, the Updall task will run on all databases in a server at 2am. This performs several tasks
including updating the View Index, purging old deletion stubs, and updating the full-text index for
databases that have an index.
ServerTasksAt2=UpdAll
Scheduled
The update of the full-text index of a database can be configured to run at any time with a Program
document:
Enabled/disabled: Enabled
Run at times: <Enter the time that you require the update to run. Leave blank if you wish to use an
interval time>
Repeat interval of: <Enter the interval between updates. If zero (0) then the update will only happen
once>
Days of week: <Select the days that you require the update to run>
Note: In the "Server to run on" field, use the full hierarchical name of the server or use an asterisk
(*) to run the program document on all servers in the domain.
The Update task runs constantly on the server. It is loaded when the server starts, through the
ServerTasks line of the NOTES.INI:
ServerTasks=Update,Replica,Router,AMgr,AdminP,CalConn,Sched,HTTP,LDAP,RnRMgr
The Update task works continuously from a queue called the $UpdateQueue. When changes have
been made in a database, such as deletions, additions and edits, a corresponding request is entered
into the Update queue. Update checks the queue every 5 seconds for any new requests that have
been deposited, plucking the requests from the queue on a first-come, first-served basis.
Although Update checks the queue every 5 seconds, it does not refresh indexes at the same interval.
Instead, it uses what is known as the Update Suppression Time. With Suppression Time, Update
waits for multiple, similar requests to be deposited in the queue and then batches them. In this way,
Update processes all changes to a database at the same time. Since the Indexer is the most CPU-
intensive Domino server task, batching requests reduces the performance impact on a server
significantly. It is only after the Suppression Time has passed that Update forces the update of the
view collection and the full-text indexes as requested. By default, the Suppression Time is 15
minutes; however, this can be overridden with the following NOTES.INI parameter if required:
Update_Suppression_Time=minutes
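The effect of batching can be illustrated with a toy model. The sketch below is not how the Update task is implemented internally; it simply assumes that the first queued request for a database opens a window of the Suppression Time, and that every further request for the same database inside that window is folded into one index update. The function name batch_requests and the sample database names are hypothetical.

```python
SUPPRESSION_SECONDS = 15 * 60  # default Suppression Time from the text


def batch_requests(requests, suppression=SUPPRESSION_SECONDS):
    """Group (timestamp_seconds, database) requests into batches.

    Returns a sorted list of (flush_time, database, request_count).
    """
    open_batches = {}  # database -> [window_start, request_count]
    flushed = []
    for ts, db in sorted(requests):
        batch = open_batches.get(db)
        if batch is not None and ts - batch[0] >= suppression:
            # The window for this database has passed: flush the old batch.
            flushed.append((batch[0] + suppression, db, batch[1]))
            batch = None
        if batch is None:
            open_batches[db] = [ts, 1]
        else:
            batch[1] += 1
    # Flush whatever is still open at the end of the simulation.
    for db, (start, count) in open_batches.items():
        flushed.append((start + suppression, db, count))
    return sorted(flushed)


if __name__ == "__main__":
    queue = [(0, "mail/a.nsf"), (60, "mail/a.nsf"),
             (120, "mail/b.nsf"), (1000, "mail/a.nsf")]
    for flush_time, db, count in batch_requests(queue):
        print(f"t={flush_time}s: update {db} ({count} request(s))")
```

In this model four queued requests result in three index updates, because the two early changes to the same database are handled together.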
After updating view indexes in a database, Update then updates the full-text indexes of all
databases that are set for immediate or hourly updates.
When Update encounters a corrupted view index or full-text index, it rebuilds the view index or full-
text index to try to correct the problem. This means it deletes the view index or full-text index and
rebuilds it. However, some corruption is not detected by the Update task and requires a manual
rebuild.
So, you have created the full-text index - now you want to use it.
With the database open, you may see the search bar at the top of the screen. If not, select it through
the menu option View\Search this view:
Restricted words
Certain words are reserved for use by the Full-Text search engine: TOPIC, AND, NOT, OR, CONTAINS,
NEAR, ACCRUE, EXACTCASE, TERMWEIGHT, PARAGRAPH, FIELD, and SENTENCE.
To search for these words, you need to place the words in “quotes.” If the words are part of a phrase
the whole phrase should be within the quotation marks.
FTI supports Boolean searches using “and,” “or,” and so on. This allows for the search of multiple
words or phrases that might not be together in the document.
Example:
The search term “this and that” will find documents with both the words “this” and “that” but not
those with only one of the words.
The search term “this or that” will find any document containing either the words “this” or “that”
even if the other words in the search are not in the document.
The search term “This text to be found” and “find these words” will find documents with both of
those phrases as listed.
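The logic of these examples can be modeled in a few lines. This is a simplified sketch of AND/OR over words and phrases, not the search engine's real query parser; the helper names tokenize, has_phrase, match_and, and match_or are hypothetical, and word breaking is reduced to runs of letters and digits.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into words (simplified word breaking)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def has_phrase(text, phrase):
    """True if the words of phrase occur consecutively, in order, in text."""
    doc, target = tokenize(text), tokenize(phrase)
    n = len(target)
    return n > 0 and any(doc[i:i + n] == target
                         for i in range(len(doc) - n + 1))


def match_and(text, *phrases):
    """Every word or phrase must be present."""
    return all(has_phrase(text, p) for p in phrases)


def match_or(text, *phrases):
    """At least one word or phrase must be present."""
    return any(has_phrase(text, p) for p in phrases)


if __name__ == "__main__":
    print(match_and("only this", "this", "that"))  # one word is missing
    print(match_or("only this", "this", "that"))   # one word is enough
```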
Further search operators are available; these can be reviewed in the Administrator Help document:
Refining a search query using operators.
Note: To get the best results from the full-text index, it is important to understand that this is a
WORD search engine: the search tries to match the search term against whole words, not against
arbitrary sequences of characters.
Example: Searching for the term “fun” will not find the word “functionality” within a document even
though the letters “fun” are part of the word “functionality.”
To be able to find the word “functionality” with the search term “Fun” a wildcard asterisk needs to
be used. With the search term “fun*” the results will find “functionality.”
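The difference between word matching and substring matching can be reproduced with a short sketch. It assumes simplified word breaking and uses Python's fnmatch module to stand in for the wildcard handling; the function names words and matches are hypothetical and this is not the engine's actual algorithm.

```python
import re
from fnmatch import fnmatchcase


def words(text):
    """Split text into lowercase words (a simplification of real word breaking)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches(term, text):
    """True if any whole word in text matches term; '*' acts as a wildcard."""
    pattern = term.lower()
    return any(fnmatchcase(word, pattern) for word in words(text))


if __name__ == "__main__":
    doc = "This functionality is documented."
    print(matches("fun", doc))   # no whole word equals "fun"
    print(matches("fun*", doc))  # the wildcard matches "functionality"
```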
Show results
By relevance
By Date (last modified)
By Date (first modified)
Sorted like current view
Within all documents
Date
Clicking on the Date button opens the Add Condition dialog, which has several options. We are
focusing on the By date option here. The other options will be discussed individually.
This allows for searches for documents based on their creation or last modified dates.
The options “is on”, “is after”, “is before”, and “is not on” allow the entry of a single date.
Clicking the pull-down in the date field opens a standard monthly calendar.
The options “is in the last”, “is in the next”, “is older than”, and “is after the next” allow the
entry of a number of days for the search parameter.
The options “is between” and “is not between” allow the entry of two dates for the search
parameter.
Author
The search will find documents authored (or not authored) by the specified people. Multiple names
can be entered.
Field
The search will find documents which contain the specified value in the identified field. Multiple
values can be entered by separating the values with commas.
Selecting Field will list all the fields from the database.
Form
The search will find documents which use the selected form or forms. Multiple forms can be
selected.
Multiple words
The search will find documents which contain the specified term. For each term single words or
short phrases can be entered.
The search will find documents with data like the example selected. Single words or phrases can be
entered and only the fields relevant to the search need to be filled out.
This option is available after a search. It allows a further search where only the current result
documents are searched through.
Save Search
As seen above there are many options available for a search, and these can become very complex
and time consuming to enter. To make this more usable, search options can be saved allowing them
to be reused without having to enter all the variables again.
The Save Search option allows the user to save the search to whatever name is desired. The search
can also be made available to other users.
Load search
Clicking this button will display the available saved searches and allows the deletion of any saved
search.
Max results
This configuration limits the maximum number of results being displayed.
Note: Lowering this value will not speed up the search, as all the search results are first found, and
then displayed as configured (by relevance, date, and so on). So, the functionality is that all results
are found, then sorted, and the maximum number of them are then listed in order.
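The order of operations described in this note can be sketched as follows. This is a toy model with made-up sample documents, not the actual search code; it only shows why the limit cannot save any search work when it is applied as the last step.

```python
def search(documents, term, max_results):
    """Toy search: find every match, sort all of them, then cut the list."""
    term = term.lower()
    # 1. Find ALL matching documents (cost does not depend on max_results).
    hits = [doc for doc in documents if term in doc["text"].lower().split()]
    # 2. Sort ALL hits, here by last-modified date, newest first.
    hits.sort(key=lambda doc: doc["modified"], reverse=True)
    # 3. Only now apply the display limit.
    return hits[:max_results], len(hits)


docs = [
    {"text": "Rebuild the index nightly", "modified": "2023-01-03"},
    {"text": "Index maintenance notes", "modified": "2023-01-05"},
    {"text": "Unrelated memo", "modified": "2023-01-04"},
    {"text": "Why the index grows", "modified": "2023-01-01"},
]
shown, total = search(docs, "index", 2)
print(f"showing {len(shown)} of {total} matches")
```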
If run without any options Updall will first update the view indexes in a database and then the full-
text index.
Updall can be used on all databases by not entering a database path and filename, on an entire
folder or on an individual database. It will not accept wildcard characters.
There are several switches available for the Updall command that don't affect the full-text index.
These can be reviewed either in the Administrator Help or by entering the command:
load updall /?
The remaining Updall options are mostly related to View Indexing and not to full-text indexing:
-t "name" Update single view named "viewname" only.
-T+ "name" Special view named "viewname" is a critical view.
-t- "name" Special view named "viewname" is not a critical view.
-c Build unused view indexes.
-v Update existing and already built view indexes only.
-r Rebuild all already built view indexes.
-p Obsolete, on by default.
-a Incremental full-text index update of search site databases (faster).
-b Complete full-text index update of search site databases (slower).
-g Remove build-on-first-use collations when rebuilding views
-nodbmt Do not perform the dbmaintool nightly tasks
The safest option to remove the FTI is to select the database, open the database properties, select
the FTI tab and click on “Delete Index...”. A confirmation will be requested and, once approved,
the server will remove the index.
If the server does not have deletion rights to the location for the FTI then this may fail.
Also, if the Index is being updated at the time of the deletion request the deletion can fail.
Multiple FTIs can be deleted using the Files tab of the Administrator client. Select the databases
whose FTIs are to be deleted and click on Tools\Database\Full Text Index... to open the Full
Text Index dialog box.
From here the option to 'Delete' is available. Once it is selected, click on OK and confirm the request:
If the above options do not remove the full-text index, it is also possible to simply delete the .FT
folder that contains the full-text index. However, this can affect the functionality of the database in
that it will still be configured as having a full-text index, but the functionality will not be available.
Some users have needed to move the full-text index itself and this is possible by simply copying the
.FT folder through the operating system.
The target database must still be configured as full-text indexed, although its own index does not
need to be present. If it has an index, delete that index at the operating system level, then place
the copied .FT folder in the matching location for the database.
The most common need for this is where the database is extremely large and the server is busy,
leading to the resources for updating the FTI not being available or being very slow. In this case the
index can be created on another server and copied.
This is not normally recommended, as the documents in the database the index is created from, even
though it is a replica, may not match the documents in the database being searched at the time of
the search. Also, maintaining the index can be a problem if the index is being built on another server.
5.4) Attachments
Documents can contain attachments with text in them. The full-text indexer can parse text from
these and a search will return the documents the attachments are in.
Text within attached graphics (BMP, JPG, GIF, and so on) cannot be added to the index, so any
text shown in a screenshot or another picture will not be found by a search.
As listed above the option to include attachments in a full-text index can be selected when creating
the index. An additional option is then available for whether to use conversion filters or not.
The Apache Tika conversion filter will attempt to parse only the text within the attachments that
would be displayed by the attachment normally.
By not using the conversion filter the indexing process can be faster, as the indexing task does not
need to extract the text from the files. However, the resulting index is often larger as more text is
added to it, and it can contain text for information that is not displayed from the attachment itself,
possibly leading to incorrect search results.
By using conversion filters the time and resources needed to create and maintain the index can
increase, but the overall size of the index will be smaller as less text is added to it. Therefore,
searches will only find text that should be displayed in the attachment.
By default, all file formats that are supported by Tika are full-text indexed. However, several
attachment types are not included in the FTI by default:
.au, .bqy, .cca, .dbd, .dll, .exe, .gif, .gz, .img, .jar, .jpg, .mov, .mp3,.mpg,
.msi,.nsf, .ntf, .p7m, .p7s,.pag, .pdb, .png, .rar, .sys, .tar, .tif, .wav, .wpl, .z, .zip.
Using NOTES.INI, you can define your own list of attachment types that can be full-text indexed by
the Apache Tika conversion filter on a Domino server or Notes client.
Add the following NOTES.INI parameter to override the default behavior of indexing file formats, so
that no attachments are full-text indexed until you define attachment types using further parameters.
FT_USE_MY_ATTACHMENT_WHITE_LIST=1
Use the following NOTES.INI parameters to define attachment types to be full-text indexed by the
conversion filter.
To configure attachment types to be indexed in all databases, use either or both of the following
NOTES.INI parameters as needed:
FT_INDEX_FILTER_ATTACHMENT_TYPES=*.<format>,*.<format>
Where <format> is a file format. Use a comma between formats.
FT_INDEX_FILTER_ATTACHMENT_TYPES_MAX_MB=<value>
Where <value> is an optional maximum attachment size in MB to limit the size of files that can be
searched. There is no limit if not specified.
Example:
FT_INDEX_FILTER_ATTACHMENT_TYPES=*.pdf,*.zip,*.jar
FT_INDEX_FILTER_ATTACHMENT_TYPES_MAX_MB=5
To configure attachment types to be indexed in a specific database, use either or both of the
following NOTES.INI parameters as needed.
FT_INDEX_FILTER_ATTACHMENT_TYPES_<replicaID>=*.<format>
where <replicaID> is the replica ID of a database and <format> is the file type.
FT_INDEX_FILTER_ATTACHMENT_TYPES_<replicaID>_MAX_MB=<value>
Where <value> is an optional maximum attachment size in MB to limit the size of files that can be
searched.
Example:
FT_INDEX_FILTER_ATTACHMENT_TYPES_652586AE00240ED9=*.txt
FT_INDEX_FILTER_ATTACHMENT_TYPES_652586AE00240ED9_MAX_MB=2
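When these lines have to be prepared for several databases, generating them avoids typing mistakes in the long parameter names. The helper below is a hypothetical sketch that only builds the text of the lines shown above; it does not write to NOTES.INI. Passing more than one format for a specific database is an assumption based on the comma-separated format of the all-databases parameter.

```python
def attachment_type_lines(formats, replica_id=None, max_mb=None):
    """Build NOTES.INI lines for the attachment-type parameters named in the text.

    formats    -- file extensions such as "pdf", ".pdf" or "*.pdf"
    replica_id -- replica ID for a database-specific setting, or None for all databases
    max_mb     -- optional maximum attachment size in MB
    """
    name = "FT_INDEX_FILTER_ATTACHMENT_TYPES"
    if replica_id:
        name += "_" + replica_id
    # Normalize every entry to the "*.ext" form used in the examples.
    value = ",".join("*." + fmt.lstrip("*.") for fmt in formats)
    lines = [f"{name}={value}"]
    if max_mb is not None:
        lines.append(f"{name}_MAX_MB={max_mb}")
    return lines


if __name__ == "__main__":
    for line in attachment_type_lines(["txt"], "652586AE00240ED9", 2):
        print(line)
```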
Further attachment types can be configured to not be included in the FTI using the NOTES.INI
parameter:
FT_INDEX_IGNORE_ATTACHMENT_TYPES=<value>
Note: This parameter is limited to 256 characters, so plan the list ahead accordingly. However, if you
want to add more values you can add the following parameters:
FT_INDEX_IGNORE_ATTACHMENT_TYPES2=<value>
FT_INDEX_IGNORE_ATTACHMENT_TYPES3=<value>
Note: The lists consist only of file extensions and, as an example, the format used to list them is as
follows: *.ext*.
It's always a good practice to put a comma as the last character in the list. It hurts nothing and it
guards against any concatenation issues between the three INI lists.
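Splitting a long ignore list across the three parameters by hand is error-prone, so a small helper can do the packing. The sketch below rests on two assumptions that are not confirmed by the text: that entries are written as "*.ext" separated by commas, and that the 256-character limit applies to the value after the equals sign. The function name ignore_lines is hypothetical.

```python
LIMIT = 256  # assumed to apply to the value, not the whole line
NAMES = [
    "FT_INDEX_IGNORE_ATTACHMENT_TYPES",
    "FT_INDEX_IGNORE_ATTACHMENT_TYPES2",
    "FT_INDEX_IGNORE_ATTACHMENT_TYPES3",
]


def ignore_lines(extensions, limit=LIMIT):
    """Pack '*.ext,' entries into at most three values of `limit` characters each."""
    values = [""]
    for ext in extensions:
        # Each entry ends with a comma, so every list also ends with one.
        entry = "*." + ext.lstrip("*.") + ","
        if len(entry) > limit:
            raise ValueError(f"entry {entry!r} is longer than the limit")
        if len(values[-1]) + len(entry) > limit:
            values.append("")  # current value is full: start the next one
        values[-1] += entry
    if len(values) > len(NAMES):
        raise ValueError("list does not fit into the three parameters")
    return [f"{name}={value}" for name, value in zip(NAMES, values) if value]


if __name__ == "__main__":
    for line in ignore_lines(["tmp", "bak", "log"]):
        print(line)
```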
When the search text is found within an attachment, the file name of the attachment will be
highlighted on opening the link to the resulting document. However, when the attachment is opened
the search term will not be highlighted within the opened file. For more on highlighting, see section
5.8 – Highlighting.
At a database level the indexing of attachments can be enabled and disabled through the database
properties. However, it is also possible to enable or disable the indexing of attachments for the
entire server.
FT_Index_Attachments=<value>
Adding “UPDATE_FULLTEXT_THREAD=1” to the NOTES.INI of the server and restarting the server
splits the full-text indexing into its own thread on the server. This will require further
resources from the server but can result in the full-text indexes being maintained more promptly.
When this is enabled, the new task will be used on the server. The “Show tasks” command will show
this as follows:
Existing full-text indexes will still function normally but will not be updated.
This can be used to prevent a server from using resources maintaining any full-text index except
when required, but the indexes might quickly be out of date compared to the documents in the
databases.
This option can be used when the server disks are low on space or when DISK I/O needs to be
optimized.
It can only be set at a server level for all databases, not for individual databases, and only one
location can be configured for the server.
The full-text indexes will be recreated in the new location matching the folder structure of the
databases in the DATA folder.
Example:
If the FTBasePath is set to “X:\FTI_path” and the databases are in the data folder like this:
DATA\MAIL\User1.NSF
DATA\MAIL\User2.NSF
DATA\MAIL\User3.NSF
DATA\APPLICATION_DBS\APP_DB1.NSF
DATA\APPLICATION_DBS\APP_DB2.NSF
DATA\APPLICATION_DBS\APP_DB3.NSF
then the full-text indexes will be created like this:
X:\FTI_path\MAIL\User1.FT
X:\FTI_path\MAIL\User2.FT
X:\FTI_path\MAIL\User3.FT
X:\FTI_path\APPLICATION_DBS\APP_DB1.FT
X:\FTI_path\APPLICATION_DBS\APP_DB2.FT
X:\FTI_path\APPLICATION_DBS\APP_DB3.FT
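The mapping in this example follows a simple rule: keep the path relative to the data folder, swap the extension for .FT, and prepend the base path. A minimal sketch of that rule, using a hypothetical helper name and taking the database path relative to the data folder as input:

```python
from pathlib import PureWindowsPath


def ft_index_path(ft_base_path, db_relative_path):
    """Return the .FT folder location for a database path relative to the data folder."""
    relative = PureWindowsPath(db_relative_path).with_suffix(".FT")
    return PureWindowsPath(ft_base_path) / relative


if __name__ == "__main__":
    for db in (r"MAIL\User1.NSF", r"APPLICATION_DBS\APP_DB1.NSF"):
        print(db, "->", ft_index_path(r"X:\FTI_path", db))
```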
5.8) Highlighting
When a document from the search results is opened in a Notes client, the matching text in the
document will be highlighted.
As mentioned in the Attachment section, the attachment file name will be highlighted if the string is
found in the attachment.
For example, searching for the term “test” found one document and highlighted the text:
In the following example, the text being searched for is “Text In”, and it was found in an attached
text file called “Attachment Text”:
Here the attachment file name is highlighted when the search term is found in the attachment.
Opening the attachment, even opening the attachment in a Notes client, will not then highlight the
search text in the file.
The highlighter attempts to highlight the words that match the search string. In most cases this is
simple, but if variants and fuzzy matches are enabled it will also attempt to highlight those, which
can cause confusion about which piece of text is highlighted. This can occur when there are many
symbols or other characters in a document, and it can lead to the wrong words being highlighted.
At times, opening a document from the search results is very slow, although the same document
opens without problems outside the search results. This can be caused by the
highlighter parsing through attachments in the document to see if the attachment should be
highlighted.
FT_LIMIT_HIGHLIGHT_FILTER=1
This setting prevents the highlighter from parsing through documents that would not contain the
search text.
Note: Search results are not highlighted when searching using an internet browser.
In this situation the document with the Reader field is still not displayed in the search results.
Although the document is found by the search, it is then filtered out from the display.
By enabling the debug option “DEBUG_FTV_SEARCH=1” (more on this in Chapter 6), we see what the
server has found:
IN FTGSearch
[9164:0081-7EA0] option = 0x400219
[9164:0081-7EA0] Query: (search AND text)
[9164:0081-7EA0] Engine Query: ("search"%STEM * "text"%STEM)
GTR query performed in 6 ms. 2 documents found
[9164:0081-7EA0] 0 documents disualified by deletion
[9164:0081-7EA0] 0 documents disqualified by ACL
[9164:0081-7EA0] 0 documents disqualified by IDTable
[9164:0081-7EA0] 1 documents disqualified by NIF
[9164:0081-7EA0] Results marshalled in 1 ms. 1 documents left
[9164:0081-7EA0] OUT FTGSearch error = 0
[9164:0081-7EA0] FTGSearch: found=1, returned=1, start=0, count=0, limit=0
[9164:0081-7EA0] Total search time 8 ms.
Here the debug output is showing that two documents have been found in the search. It then shows
that one document was disqualified by NIF (Notes Indexing Facility), indicating that the document
was removed from being displayed due to the Reader field.
The final number of search results that are displayed to the user is shown as “Results marshalled in 1
ms. 1 documents left.”
The number of search results displayed in the Notes client status bar will show only the displayed
count.
IN FTGSearch
[9164:0081-351C] option = 0x400219
[9164:0081-351C] Query: (search AND text)
[9164:0081-351C] Engine Query: ("search"%STEM * "text"%STEM)
[9164:0081-351C] GTR query performed in 6 ms. 2 documents found
[9164:0081-351C] 0 documents disualified by deletion
[9164:0081-351C] 0 documents disqualified by ACL
[9164:0081-351C] 0 documents disqualified by IDTable
[9164:0081-351C] 0 documents disqualified by NIF
[9164:0081-351C] Results marshalled in 1 ms. 2 documents left
[9164:0081-351C] OUT FTGSearch error = 0
[9164:0081-351C] FTGSearch: found=2, returned=2, start=0, count=0, limit=0
[9164:0081-351C] Total search time 7 ms.
This shows that the document is not being removed from the results.
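When many searches are logged, it can help to pull the counts out of the debug output automatically. The sketch below parses only the line formats visible in the two samples above; the function name summarize is hypothetical, and other Domino releases may word these lines differently. The pattern deliberately accepts both spellings of "disqualified" that appear in the output.

```python
import re

# Accept both "disqualified" and the "disualified" spelling seen in the output.
DISQUALIFIED = re.compile(r"(\d+) documents dis\w*ualified by (\w+)")
FOUND = re.compile(r"(\d+) documents found")
LEFT = re.compile(r"(\d+) documents left")


def summarize(log_text):
    """Extract found/left counts and per-reason disqualification counts."""
    summary = {"found": None, "left": None, "disqualified": {}}
    for line in log_text.splitlines():
        match = DISQUALIFIED.search(line)
        if match:
            summary["disqualified"][match.group(2)] = int(match.group(1))
            continue
        match = FOUND.search(line)
        if match:
            summary["found"] = int(match.group(1))
        match = LEFT.search(line)
        if match:
            summary["left"] = int(match.group(1))
    return summary


if __name__ == "__main__":
    sample = """GTR query performed in 6 ms. 2 documents found
[9164:0081-7EA0] 0 documents disualified by deletion
[9164:0081-7EA0] 0 documents disqualified by ACL
[9164:0081-7EA0] 0 documents disqualified by IDTable
[9164:0081-7EA0] 1 documents disqualified by NIF
[9164:0081-7EA0] Results marshalled in 1 ms. 1 documents left"""
    print(summarize(sample))
```

A non-zero count for a reason such as NIF or ACL points to the stage at which documents were filtered out of the displayed results.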
The index file can be copied from one server to another, but not replicated.
When creating a new replica of a database it is possible to have the full-text index created on the
target system. This is not a replica of the original full-text index but rather a new full-text index.
Figure 5-5a. Creating a new FTI when creating a new replica of a database.
Figure 5-5b. Creating a new FTI when creating a new replica of a database.
As the full-text index does not replicate, this can lead to differences in results between the same
searches performed on different replicas of the same database. This can be caused by the indexes
not being updated at the same time and not reflecting changes made to the documents, or by
documents themselves not replicating for any reason.
The full-text index is an index created for a single database. It is specific to that database in one
location – either a server or locally on a Notes client.
The Domain Index is an index that can cover multiple databases on one or more servers. It does not
have all the search functions of the full-text index search and will not be maintained either by the
Update or Updall functions. A database can both have its own full-text index and be included in the
Domain Index.
This includes all program documents and can be used to confirm when a scheduled update to a full-
text index should next happen.
Example:
> sh sched
[9164:0008-8740] Scheduled Type Next schedule
[9164:0008-8740] TestServer1/Tst Mail Routing 01/01/2023 13:30:00
[9164:0008-8740] TestServer1/Tst Replication 01/01/2023 13:30:41
[9164:0008-8740] Updall Run Program 01/01/2023 13:18:32
Here the next events are displayed, including a program document that will run Updall.
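The schedule output can also be checked by a script. The following is a minimal Python sketch, assuming the console text has been captured and that each line follows the layout of the example above (thread prefix, description, then a date and time). The function name find_scheduled is hypothetical, and the date is kept as text because its day/month order depends on the server's regional settings.

```python
import re

# Matches lines such as:
#   [9164:0008-8740] Updall Run Program 01/01/2023 13:18:32
# The layout is assumed from the example output above.
SCHED_LINE = re.compile(
    r"^\[[0-9A-Fa-f]+:[0-9A-Fa-f]+-[0-9A-Fa-f]+\]\s+"     # thread prefix
    r"(?P<rest>.+?)\s+"                                    # name and type
    r"(?P<when>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})$"      # next schedule
)

def find_scheduled(console_text, keyword):
    """Return (description, next_run) pairs whose description contains keyword."""
    hits = []
    for line in console_text.splitlines():
        match = SCHED_LINE.match(line.strip())
        if match and keyword.lower() in match.group("rest").lower():
            hits.append((match.group("rest"), match.group("when")))
    return hits

sample = """\
[9164:0008-8740] Scheduled Type Next schedule
[9164:0008-8740] TestServer1/Tst Mail Routing 01/01/2023 13:30:00
[9164:0008-8740] TestServer1/Tst Replication 01/01/2023 13:30:41
[9164:0008-8740] Updall Run Program 01/01/2023 13:18:32
"""

print(find_scheduled(sample, "updall"))
```

The header line is skipped because it does not end in a date and time.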
There are other searches available with Domino databases – these have different functionality and
properties.
Simple Search
This refers to a search through the documents of a database where there is no full-text index.
Searches are still possible when there is no full-text index; however, the search will be slower
and many of the advanced search options will not be available.
When accessing the advanced features for a non-full-text indexed database, this is displayed:
View Search
When looking at a list of documents in a view and selecting the menu option “Edit>Find Next...” the
“Find” dialog is displayed. This allows for searches in the text displayed within the view, in the
columns. It will not find text within the documents, only text displayed by the view to sort the
documents.
The “Find” field allows the selection of which available column from the View to search through.
This method of searching places the current focus on only the first document that meets the criteria;
it does not return a complete collection of documents that match the search term.
A quick option for activating this search is to click a document in the view and start typing the search
term, causing the 'Find' dialog to be activated and the search enabled.
The Domain Index is enabled and configured through Domino Administrator. Which servers and
databases are to be included in the index can be configured.
The search is initiated, and the results viewed through the CATALOG.NSF database.
The Domain Index can be very large and very resource intensive to use and maintain. The selection
of databases and update frequency for the index should be carefully planned.
Further information on Domain Indexing can be found in the Administrator Help document “Domain
Search.”
https://help.hcltechsw.com/domino/12.0.2/admin/conf_domainsearch_c.html
Search in a document
It is possible to search through the displayed text in an open document. This can be started either by
clicking on the menu option “Edit>Find/Replace...” or by using Ctrl+F. This will open the “Find Text in
Document” dialog.
Figure 5-10. The Find Text in Document dialog with full options displayed.
This allows searching for terms within the document. This is not a word search: partial words can
be searched for, but variants and fuzzy matches will not be found.
General Troubleshooting
For many issues with the full-text index the best solution and the first step should be:
1. Delete the current full-text index – preferably through the database properties. Confirm that
the complete index with the .FT folder has been removed.
2. Re-create the full-text index.
This will ensure that the current full-text index is completely up to date with the latest document
edits and that, if the original index was corrupt, the corruption was eliminated.
This will not resolve all full-text indexing issues but is a very good first step for most problems.
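Step 1 can be confirmed with a small script. The following is a minimal Python sketch, assuming the index folder sits next to the database and has the same name with a .ft extension (as in the path ftitest.ft that appears in the example output in this document). The function names are hypothetical, and the demonstration runs against a temporary directory rather than a real data directory.

```python
import tempfile
from pathlib import Path

def ft_folder_for(nsf_path):
    """Return the expected index folder: the database path with a .ft extension."""
    return Path(nsf_path).with_suffix(".ft")

def index_fully_removed(nsf_path):
    """True when no .ft folder remains for the given database."""
    return not ft_folder_for(nsf_path).exists()

# Demonstration using a temporary directory in place of the data directory.
with tempfile.TemporaryDirectory() as data_dir:
    nsf = Path(data_dir) / "ftitest.nsf"
    (Path(data_dir) / "ftitest.ft").mkdir()
    print(index_fully_removed(nsf))   # checked while the .ft folder is still present
    ft_folder_for(nsf).rmdir()
    print(index_fully_removed(nsf))   # checked after the .ft folder is removed
```

If the check still reports a folder after the index was deleted through the database properties, the remaining folder can be investigated before the index is re-created.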
Further troubleshooting guidance follows to help localize where the issue might be. Although this
doesn't cover all possible issues, it is a good guide:
Local or on a Server
Confirm if the database being searched is on a server or being accessed locally through a Notes
client. This information can determine what steps may follow.
a. If Local
Confirm if the Notes client can create full-text indexes for other databases. This will confirm
if the problem is with the client itself.
If the client is unable to create the index, the database should be tested on another Notes
client, as it is possible that the install of the Notes client is corrupt or something else is
preventing the normal FTI functionality. The Notes client might need to be re-installed.
If the client can create fully working full-text indexes for other databases, the issue might be
specific to the database or this index.
The index should be recreated and tested after ensuring it was fully deleted.
Also, the database might be corrupt – running maintenance on the database or replacing it
with a replica might resolve the issue.
Debug options (more on these later) can also be added to the NOTES.INI of the Notes client
and might help to find the source of the issue.
b. If on a server
Confirm if the Domino server can create full-text indexes for other databases. This will
confirm if the problem is with the server itself.
If the server is unable to create the index then the database should be tested on another
server, as it is possible that there is an issue with the server – there might be a corrupt file or
setting preventing the index being created.
If the server can create fully working full-text indexes for other databases, the issue may be
specific to the database or this index.
The index should be recreated and tested after ensuring it was fully deleted.
Also, the database might be corrupt – running maintenance on the database or replacing it
with a replica might resolve the issue.
Debug options (more on these later) can also be enabled on the server and might help to
find the source of the issue.
There are several debug parameters available that when enabled can increase the logging when
creating, updating, or searching with the full-text index.
Note: Apart from DEBUG_THREADID, debug options should be disabled once the required data has
been generated. Leaving them running can impact the server’s performance and resources.
DEBUG_THREADID
Although not specific to full-text indexing, this is a very important debug option.
This displays each thread and task on the server with a separate ID in the CONSOLE.LOG. With this
enabled it is possible to follow each task through the console.log by the leading numbers on each
line. This is very useful when there are multiple tasks being displayed on the console, even multiple
indexing events.
Starting with R8.0, DEBUG_THREADID is enabled by default on all servers; however, some
administrators have disabled it by adding the following setting:
DEBUG_THREADID=0
Before starting any of the logging or debugging options, DEBUG_THREADID=1 should be enabled on
the test server, either by adding "DEBUG_THREADID=1" to the NOTES.INI and restarting the server
or by running the command "set config DEBUG_THREADID=1".
This option does not impact the performance of the server but will slightly increase the size of
CONSOLE.LOG.
If the server cannot be restarted it is possible to enable this DEBUG option using the server
command:
set config DEBUG_THREADID=1
This will enable the debug until it is disabled by a user or until the server is restarted.
Adding “DEBUG_THREADID=1” to the NOTES.INI will ensure it is enabled when the server starts.
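Following one task through a busy console log can be scripted. The following is a minimal Python sketch that groups lines by their leading ID, assuming every line of interest starts with a prefix of the form shown in the examples in this document. The function name group_by_thread is hypothetical, and the ID is treated as an opaque label.

```python
import re
from collections import defaultdict

# Leading ID written when DEBUG_THREADID is enabled, e.g. "[9164:0081-351C]".
# The ID is treated as an opaque label; no meaning is assumed for its parts.
THREAD_PREFIX = re.compile(
    r"^\[([0-9A-Fa-f]+:[0-9A-Fa-f]+-[0-9A-Fa-f]+)\]\s*(.*)$"
)

def group_by_thread(lines):
    """Collect console lines under the thread ID that prefixes them."""
    groups = defaultdict(list)
    for line in lines:
        match = THREAD_PREFIX.match(line.strip())
        if match:
            groups[match.group(1)].append(match.group(2))
    return dict(groups)

# Lines taken from the examples in this document, interleaved as they
# might appear when a search and an index update run at the same time.
console = [
    r'[2F04:007C-3518] Query: (test)',
    r'[2184:0002-18F4] FTGIndex: Get modified NoteID table 0 ms for [C:\HCL\Domino\data\ftitest.ft]',
    r'[2F04:007C-3518] Engine Query: ("test"%STEM)',
    r'[2184:0002-18F4] OUT FTGIndex rc = 0 (No error) - for [C:\HCL\Domino\data\ftitest.ft]',
    r'[2F04:007C-3518] Total search time 17 ms.',
]

for thread_id, messages in group_by_thread(console).items():
    print(thread_id)
    for message in messages:
        print("   ", message)
```

Lines without a prefix are ignored in this sketch, so it is only useful when DEBUG_THREADID is enabled.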
Debug_FTV_Search
Setting this to '1' will send information about searches to the server console. This will include the
actual search string being used and some information on the results of the search.
This can be helpful in situations where a search term is not working, or if the results are not as
expected.
This can be enabled on a running server without having to restart the server with the command:
set config Debug_FTV_Search=1
or it can be added to the NOTES.INI of the server and the server restarted.
Example output:
IN FTGSearch
[2F04:007C-3518] option = 0x400219
[2F04:007C-3518] Query: (test)
[2F04:007C-3518] Engine Query: ("test"%STEM)
[2F04:007C-3518] GTR query performed in 9 ms. 5 documents found
[2F04:007C-3518] 0 documents disualified by deletion
[2F04:007C-3518] 0 documents disqualified by ACL
[2F04:007C-3518] 0 documents disqualified by IDTable
[2F04:007C-3518] 1 documents disqualified by NIF
[2F04:007C-3518] Results marshalled in 5 ms. 4 documents left
[2F04:007C-3518] OUT FTGSearch error = 0
[2F04:007C-3518] FTGSearch: found=4, returned=4, start=0, count=0, limit=0
[2F04:007C-3518] Total search time 17 ms.
The search term will be displayed depending on how it is entered. In the above example the search
term was just “test” (without quotes in the actual search). To show how a more complicated search
term would be displayed:
Complete search term: "Test search term1" and "Test search term2" or "Test search term3" not
"search term4"
IN FTGSearch
[2F04:007C-33B8] option = 0x400219
Query: ("Test search term1" AND "Test search term2" OR "Test search term3" AND
NOT "search term4")
[2F04:007C-33B8] Engine Query: ("Test search term1"%STEM * "Test search
term2"%STEM + "Test search term3"%STEM * ! "search term4"%STEM)
[2F04:007C-33B8] OUT FTGSearch error = F22
[2F04:007C-33B8] FTGSearch: found=0, returned=0, start=0, count=0, limit=0
Here we can see that the search term is split into the separate phrases being searched for and then
the operators are added.
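The counts in the Debug_FTV_Search output can be collected by a script to see where documents drop out of a result set. The following is a minimal Python sketch based only on the line formats shown in the first example above. The function name summarize_search is hypothetical, and the pattern accepts both spellings of "disqualified" that appear in the example output.

```python
import re

# Patterns are derived from the example output above.
DISQUALIFIED = re.compile(r"(\d+) documents dis\w*ualified by (\w+)")
ENGINE_HITS = re.compile(r"GTR query performed in \d+ ms\. (\d+) documents found")
REMAINING = re.compile(r"(\d+) documents left")
ERROR = re.compile(r"OUT FTGSearch error = (\w+)")

def summarize_search(lines):
    """Extract hit counts, disqualification counts, and the error code."""
    summary = {"engine_hits": None, "disqualified": {}, "remaining": None, "error": None}
    for line in lines:
        match = ENGINE_HITS.search(line)
        if match:
            summary["engine_hits"] = int(match.group(1))
        match = DISQUALIFIED.search(line)
        if match:
            summary["disqualified"][match.group(2)] = int(match.group(1))
        match = REMAINING.search(line)
        if match:
            summary["remaining"] = int(match.group(1))
        match = ERROR.search(line)
        if match:
            summary["error"] = match.group(1)
    return summary

sample = [
    "IN FTGSearch",
    "[2F04:007C-3518] option = 0x400219",
    "[2F04:007C-3518] Query: (test)",
    '[2F04:007C-3518] Engine Query: ("test"%STEM)',
    "[2F04:007C-3518] GTR query performed in 9 ms. 5 documents found",
    "[2F04:007C-3518] 0 documents disualified by deletion",
    "[2F04:007C-3518] 0 documents disqualified by ACL",
    "[2F04:007C-3518] 0 documents disqualified by IDTable",
    "[2F04:007C-3518] 1 documents disqualified by NIF",
    "[2F04:007C-3518] Results marshalled in 5 ms. 4 documents left",
    "[2F04:007C-3518] OUT FTGSearch error = 0",
]

print(summarize_search(sample))
```

In the example, five documents are found by the engine and one is disqualified by NIF, which accounts for the four documents left.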
Debug_FTV_Index
This parameter will output extra data when a full-text index is created or updated.
This can be enabled on a running server without having to restart the server with the command:
set config Debug_FTV_Index=1
or it can be added to the NOTES.INI of the server and the server restarted.
Setting this to '0' in the NOTES.INI will disable the debug once the server is restarted.
Debug_FTV_Index=1
Example output:
[2184:0002-18F4] FTGIndex: After call to FTGIndexStart 2 ms for
[C:\HCL\Domino\data\ftitest.ft]
[2184:0002-18F4] FTGIndex: Get modified NoteID table 0 ms for
[C:\HCL\Domino\data\ftitest.ft]
[2184:0002-18F4] FTGIndex: Before calling IDEnumerate 242 ms for
[C:\HCL\Domino\data\ftitest.ft]
[2184:0002-18F4] FTGIndex: Finished: 15 ms. for [C:\HCL\Domino\data\ftitest.ft]
[2184:0002-18F4] 3 documents added, 3 updated, 3 deleted: 1630 text bytes; 144
numeric bytes for [C:\HCL\Domino\data\ftitest.ft]
[2184:0002-18F4] FTGIndex: All Done: 162 ms for [C:\HCL\Domino\data\ftitest.ft]
[2184:0002-18F4] OUT FTGIndex rc = 0 (No error) - for
[C:\HCL\Domino\data\ftitest.ft]
At this level the debug does not have a significant effect on server performance, but leaving it
enabled on the server is not recommended.
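The summary and result lines of the Debug_FTV_Index=1 output can be extracted by a script, for example to track index updates over time. The following is a minimal Python sketch based on the line formats in the example above. The function name index_run_summary is hypothetical, and the pattern allows for the line wrapping seen in the example.

```python
import re

# Patterns are derived from the example output above; \s+ tolerates line wraps.
SUMMARY = re.compile(
    r"(?P<added>\d+) documents added, (?P<updated>\d+) updated, "
    r"(?P<deleted>\d+) deleted: (?P<text_bytes>\d+) text bytes;\s+"
    r"(?P<numeric_bytes>\d+)\s+numeric bytes for \[(?P<index>[^\]]+)\]"
)
RESULT = re.compile(r"OUT FTGIndex rc = (?P<rc>\w+) \((?P<rc_text>[^)]*)\)")

def index_run_summary(console_text):
    """Return the counts and return code of one index run, or None if absent."""
    summary = SUMMARY.search(console_text)
    if summary is None:
        return None
    info = {key: int(summary.group(key))
            for key in ("added", "updated", "deleted", "text_bytes", "numeric_bytes")}
    info["index"] = summary.group("index")
    result = RESULT.search(console_text)
    if result:
        info["rc"] = result.group("rc")
        info["rc_text"] = result.group("rc_text")
    return info

sample = r"""
[2184:0002-18F4] FTGIndex: Finished: 15 ms. for [C:\HCL\Domino\data\ftitest.ft]
[2184:0002-18F4] 3 documents added, 3 updated, 3 deleted: 1630 text bytes; 144
numeric bytes for [C:\HCL\Domino\data\ftitest.ft]
[2184:0002-18F4] FTGIndex: All Done: 162 ms for [C:\HCL\Domino\data\ftitest.ft]
[2184:0002-18F4] OUT FTGIndex rc = 0 (No error) - for
[C:\HCL\Domino\data\ftitest.ft]
"""

print(index_run_summary(sample))
```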
Debug_FTV_Index=15
Setting this parameter will display data on each note in the database being indexed. This can be very
long and will affect server performance, so it is recommended to disable this once any testing is
completed.
Example output:
[2184:0002-18F4] FTGIndex: After call to FTGIndexStart 1 ms for
[C:\HCL\Domino\data\ftitest.ft]
[2184:0002-18F4] FTGIndex: Get modified NoteID table 0 ms for
[C:\HCL\Domino\data\ftitest.ft]
[2184:0002-18F4] FTGIndex: Before calling IDEnumerate 2 ms for
[C:\HCL\Domino\data\ftitest.ft]
[2184:0002-18F4] Start indexing document 0xEEF6 (0 text bytes indexed)
[2184:0002-18F4] This create date is 01/08/2015 11:28:30
[2184:0002-18F4] This revision date is 01/08/2015 11:28:39
[2184:0002-18F4] Close document 0xEEF6 (262 text bytes indexed)
[2184:0002-18F4] Start indexing document 0xF4E6 (262 text bytes indexed)
[2184:0002-18F4] This create date is 01/08/2015 15:13:32
[2184:0002-18F4] This revision date is 01/08/2015 15:13:32
[2184:0002-18F4] Close document 0xF4E6 (524 text bytes indexed)
[2184:0002-18F4] FTGIndex: Finished: 79 ms. for [C:\HCL\Domino\data\ftitest.ft]
[2184:0002-18F4] 2 documents added, 0 updated, 0 deleted: 524 text bytes; 48
numeric bytes for [C:\HCL\Domino\data\ftitest.ft]
[2184:0002-18F4] FTGIndex: All Done: 129 ms for [C:\HCL\Domino\data\ftitest.ft]
[2184:0002-18F4] OUT FTGIndex rc = 0 (No error) - for
[C:\HCL\Domino\data\ftitest.ft]
If the index is being updated, only new or updated documents will be listed in the output. This could
be useful to determine if a document update is being correctly added to the index.
If the index is being built for the first time, each document will be added to the output as it is
indexed.
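The per-document lines produced by Debug_FTV_Index=15 can be paired up by a script to show how much text each document contributed and whether any document was started but never closed. The following is a minimal Python sketch based on the "Start indexing document" and "Close document" lines in the example above. The function name per_document_bytes is hypothetical, and it assumes the byte counts in parentheses are running totals, as the example suggests.

```python
import re

# Patterns are derived from the example output above.
START = re.compile(r"Start indexing document (0x[0-9A-Fa-f]+) \((\d+) text bytes indexed\)")
CLOSE = re.compile(r"Close document (0x[0-9A-Fa-f]+) \((\d+) text bytes indexed\)")

def per_document_bytes(lines):
    """Return (bytes added per closed document, note IDs that never closed)."""
    open_docs = {}   # note ID -> running total when indexing of that note started
    finished = {}    # note ID -> text bytes added while that note was open
    for line in lines:
        started = START.search(line)
        if started:
            open_docs[started.group(1)] = int(started.group(2))
            continue
        closed = CLOSE.search(line)
        if closed and closed.group(1) in open_docs:
            note_id = closed.group(1)
            finished[note_id] = int(closed.group(2)) - open_docs.pop(note_id)
    return finished, sorted(open_docs)

sample = [
    "[2184:0002-18F4] Start indexing document 0xEEF6 (0 text bytes indexed)",
    "[2184:0002-18F4] Close document 0xEEF6 (262 text bytes indexed)",
    "[2184:0002-18F4] Start indexing document 0xF4E6 (262 text bytes indexed)",
    "[2184:0002-18F4] Close document 0xF4E6 (524 text bytes indexed)",
]

print(per_document_bytes(sample))
# Dropping the final line imitates a log that stops part way through a document.
print(per_document_bytes(sample[:-1]))
```

A note ID reported as never closed is a candidate for the problem document when an index build stops unexpectedly.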
Debug_FT_Stream
As with DEBUG_FTV_INDEX this parameter will output extra data to the server console when the
index is being updated or rebuilt.
To enable this parameter, it must be added to the NOTES.INI of the server and the server restarted.
The parameter cannot be enabled using the 'set config' command.
DEBUG_FT_STREAM=1
This debug level will just show each document that needs to be indexed being opened and closed. It
can be used if there is a suspected corruption within the database affecting the full-text index.
Example output:
This output will show only documents that are new or have been updated.
DEBUG_FT_STREAM=3
As with level '1' this will output information for each document being indexed – it will then give
further information regarding each document.
At this level the output displays the NoteID of each document and the fields within the document
being indexed. This can be used to investigate if text in a field is being added to the index.
This output can be verbose and can affect the server performance. It is recommended that this be
disabled once testing is complete.
DEBUG_FT_STREAM=13
This is the most verbose of the debug options listed. Along with listing each document and field
being indexed it will also give information on the amount of data per field being indexed and the text
being added to the index.
Example output:
[33F4:0002-3528] FTGetDocStream: INIT: Opened NoteID F4E6 in DB
C:\HCL\Domino\data\ftitest.nsf
[33F4:0002-3528] FTGetDocStream: GETCHAR returns: UNKNOWN: 261
[33F4:0002-3528] FTGetDocStream: GETCHAR returns: UNKNOWN: 261
[33F4:0002-3528] FTGetDocStream: GETCHAR returns: UNKNOWN: 263
[33F4:0002-3528] FTGetDocStream: Start item 'Form'; Datatype 500; UNK # 244
[33F4:0002-3528] FTGetDocStream: GETCHAR returns: FIELD SEP
[33F4:0002-3528] FTGetDocStream: GETCHAR returns: TEXT: 9 bytes
[33F4:0002-3528] 'MainTopic'
[33F4:0002-3528] FTGetDocStream: Start item 'From'; Datatype 500; UNK # 88
[33F4:0002-3528] FTGetDocStream: GETCHAR returns: FIELD SEP
[33F4:0002-3528] FTGetDocStream: GETCHAR returns: TEXT: 32 bytes
[33F4:0002-3528] 'CN=Test User/OU=Test/O=Test'
[33F4:0002-3528] FTGetDocStream: Start item 'AbbreviateFrom'; Datatype 500; UNK#
90
[33F4:0002-3528] FTGetDocStream: GETCHAR returns: FIELD SEP
[33F4:0002-3528] FTGetDocStream: GETCHAR returns: TEXT: 24 bytes
[33F4:0002-3528] 'Test User/Test/Test'
[33F4:0002-3528] FTGetDocStream: Start item 'AltFrom'; Datatype 500; UNK # 92
[33F4:0002-3528] FTGetDocStream: GETCHAR returns: FIELD SEP
[33F4:0002-3528] FTGetDocStream: GETCHAR returns: TEXT: 32 bytes
[33F4:0002-3528] 'CN=Test User/OU=Test/O=Test'
[33F4:0002-3528] FTGetDocStream: Start item 'AltLang'; Datatype 500; UNK # 77
[33F4:0002-3528] FTGetDocStream: GETCHAR returns: FIELD SEP
[33F4:0002-3528] FTGetDocStream: GETCHAR returns: TEXT: 0 bytes''
[33F4:0002-3528] FTGetDocStream: Start item 'ThreadId'; Datatype 500; UNK # 74
[33F4:0002-3528] FTGetDocStream: GETCHAR returns: FIELD SEP
[33F4:0002-3528] FTGetDocStream: GETCHAR returns: TEXT: 11 bytes
[33F4:0002-3528] 'RSTN-9SKFGD'
[33F4:0002-3528] FTGetDocStream: Start item 'Remote_User'; Datatype 500; UNK #
127
[33F4:0002-3528] FTGetDocStream: GETCHAR returns: FIELD SEP
[33F4:0002-3528] FTGetDocStream: GETCHAR returns: TEXT: 0 bytes''
[33F4:0002-3528] FTGetDocStream: Start item 'MainID'; Datatype 500; UNK # 89
[33F4:0002-3528] FTGetDocStream: GETCHAR returns: FIELD SEP
[33F4:0002-3528] FTGetDocStream: GETCHAR returns: TEXT: 32 bytes
[33F4:0002-3528] 'E16F7C523AF9493880257DC7003F08FE'
From this the text 'test document' is being added to the index.
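The field names and text in the DEBUG_FT_STREAM=13 output can be collected into a simple mapping, which makes it easier to check whether a particular field value reached the indexer. The following is a minimal Python sketch based on the "Start item" and "TEXT" lines in the example above. The function name fields_indexed is hypothetical, and it assumes each console line has been unwrapped so that the quoted text sits on its own line or directly after the byte count.

```python
import re

# Patterns are derived from the example output above.
PREFIX = re.compile(r"^\[[0-9A-Fa-f]+:[0-9A-Fa-f]+-[0-9A-Fa-f]+\]\s*")
ITEM = re.compile(r"Start item '([^']*)'")
TEXT = re.compile(r"GETCHAR returns: TEXT: \d+ bytes(.*)$")

def fields_indexed(lines):
    """Map each item name to the text reported for it."""
    fields = {}
    current = None          # item currently being streamed
    expect_text = False     # True when the quoted text is on the next line
    for raw in lines:
        line = PREFIX.sub("", raw.rstrip())
        item = ITEM.search(line)
        if item:
            current, expect_text = item.group(1), False
            continue
        text = TEXT.search(line)
        if text and current is not None:
            inline = text.group(1).strip()
            if inline:                      # e.g. "TEXT: 0 bytes''"
                fields[current] = inline.strip("'")
                expect_text = False
            else:
                expect_text = True
            continue
        if expect_text and current is not None and line.startswith("'"):
            fields[current] = line.strip("'")
            expect_text = False
    return fields

sample = [
    "[33F4:0002-3528] FTGetDocStream: Start item 'Form'; Datatype 500; UNK # 244",
    "[33F4:0002-3528] FTGetDocStream: GETCHAR returns: FIELD SEP",
    "[33F4:0002-3528] FTGetDocStream: GETCHAR returns: TEXT: 9 bytes",
    "[33F4:0002-3528] 'MainTopic'",
    "[33F4:0002-3528] FTGetDocStream: Start item 'AltLang'; Datatype 500; UNK # 77",
    "[33F4:0002-3528] FTGetDocStream: GETCHAR returns: FIELD SEP",
    "[33F4:0002-3528] FTGetDocStream: GETCHAR returns: TEXT: 0 bytes''",
    "[33F4:0002-3528] FTGetDocStream: Start item 'ThreadId'; Datatype 500; UNK # 74",
    "[33F4:0002-3528] FTGetDocStream: GETCHAR returns: FIELD SEP",
    "[33F4:0002-3528] FTGetDocStream: GETCHAR returns: TEXT: 11 bytes",
    "[33F4:0002-3528] 'RSTN-9SKFGD'",
]

print(fields_indexed(sample))
```

A field that is missing from the mapping, or present with empty text, is a starting point when a search term known to be in a document is not found.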
DEBUG_TIKA
This parameter outputs log messages about the Java process launched by Apache Tika while indexing
attachments.
In the console log, messages of the following type confirm whether the Java process started
successfully or whether there was an issue when Tika launched it.
Example:
[51F0:0002-412C] 01/23/2023 11:32:00 PM DEBUG_TIKA - Launched java process with PID '19796'
and args ' -Djt="C:\Users\YALAVA~1\AppData\Local\Temp\notes567CFD" -Dll="D:\Program
Files\HCL\Domino\Data\IBM_TECHNICAL_SUPPORT\ndtika20976.log" -Dcf="D:\Program
Files\HCL\Domino\log4jTika.xml" -jar "D:\Program Files\HCL\Domino\tika-server.jar" -c "D:\Program
Files\HCL\Domino\dtikacfg.xml"'
In addition to the above messages in the console log, Tika logs (ndtika*.log) are generated in the
IBM_TECHNICAL_SUPPORT directory of the Domino server or Notes client. These record any error and
exception messages encountered by the Tika process while processing attachment files.
Example log:
01/23/2023 11:32:02 PM WARN TikaServerConfig:656 - no system property set for jt, falling back
to -Djava.io.tmpdir=${sys:jt}
01/23/2023 11:32:02 PM WARN TikaServerConfig:656 - no system property set for cf, falling back
to -Dlog4j.configurationFile=${sys:cf}
01/23/2023 11:32:03 PM WARN ContextHandler:1673 - Empty contextPath
01/23/2023 11:32:59 PM WARN PAPBinTable:220 - Paragraph [1612; 1613) has no PAPX. Creating
new one.
01/23/2023 11:33:10 PM WARN AbstractPOIFSExtractor:122 - Ignoring unexpected exception
while parsing summary entry SummaryInformation
java.io.UnsupportedEncodingException: Codepage number may not be 0
at org.apache.poi.util.CodePageUtil.codepageToEncoding(CodePageUtil.java:281) ~[tika-
server.jar:2.4.1]
at org.apache.poi.util.CodePageUtil.codepageToEncoding(CodePageUtil.java:259) ~[tika-
server.jar:2.4.1]
at org.apache.poi.util.CodePageUtil.getStringFromCodePage(CodePageUtil.java:237) ~[tika-
server.jar:2.4.1]
at org.apache.poi.util.CodePageUtil.getStringFromCodePage(CodePageUtil.java:225) ~[tika-
server.jar:2.4.1]
at org.apache.poi.hpsf.CodePageString.getJavaValue(CodePageString.java:83) ~[tika-
server.jar:2.4.1]
at org.apache.poi.hpsf.VariantSupport.read(VariantSupport.java:216) ~[tika-server.jar:2.4.1]
at org.apache.poi.hpsf.Property.<init>(Property.java:182) ~[tika-server.jar:2.4.1]
at org.apache.poi.hpsf.Section.<init>(Section.java:240) ~[tika-server.jar:2.4.1]
at org.apache.poi.hpsf.PropertySet.init(PropertySet.java:493) ~[tika-server.jar:2.4.1]
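A Tika log of this kind can be scanned by a script to list the warnings and the exceptions it contains. The following is a minimal Python sketch based only on the line formats in the example log above. The function name scan_tika_log is hypothetical, wrapped lines are assumed to have been rejoined, and stray leading quote characters are tolerated.

```python
import re
from collections import Counter

# Patterns are derived from the example log above.
ENTRY = re.compile(
    r"(?P<when>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2} [AP]M)\s+"
    r"(?P<level>[A-Z]+)\s+(?P<source>\w+):\d+ - (?P<message>.*)"
)
EXCEPTION = re.compile(r"^\W*(?P<name>(?:\w+\.)+\w*Exception): (?P<detail>.*)")

def scan_tika_log(lines):
    """Return (log entries as dicts, exceptions as (name, detail) pairs)."""
    entries, exceptions = [], []
    for line in lines:
        entry = ENTRY.search(line)
        if entry:
            entries.append(entry.groupdict())
            continue
        exc = EXCEPTION.match(line)
        if exc:
            exceptions.append((exc.group("name"), exc.group("detail")))
    return entries, exceptions

sample = [
    "01/23/2023 11:32:02 PM WARN TikaServerConfig:656 - no system property set for jt, "
    "falling back to -Djava.io.tmpdir=${sys:jt}",
    "01/23/2023 11:32:03 PM WARN ContextHandler:1673 - Empty contextPath",
    "01/23/2023 11:33:10 PM WARN AbstractPOIFSExtractor:122 - Ignoring unexpected "
    "exception while parsing summary entry SummaryInformation",
    "java.io.UnsupportedEncodingException: Codepage number may not be 0",
    " at org.apache.poi.util.CodePageUtil.codepageToEncoding(CodePageUtil.java:281) "
    "~[tika-server.jar:2.4.1]",
]

entries, exceptions = scan_tika_log(sample)
print(Counter(entry["source"] for entry in entries))
print(exceptions)
```

Stack-trace lines beginning with "at" are skipped in this sketch, so only the exception headline is reported.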
Issues discussed:
Can the server or Notes client create full-text indexes for other databases?
If it can, the issue is likely just with the single database.
a) Check that the ID (either the user's ID or the server's ID) has Manager rights defined in the
ACL for the database.
b) Take a backup of the database and then run maintenance on it (Fixup, Updall & Compact),
and monitor if there are any errors displayed that would show corruption in the database.
c) If there is a replica of the database available check to see if it can be full-text indexed; if it
can then replace the problem database from the replica, keeping a backup.
d) Confirm if there are any documents in the database. If there are no documents present, the
full-text index will not be generated.
e) Enable the “DEBUG_FTV_INDEX=15” debug; this might highlight if the index build is stopping
on one problem document that needs to be investigated.
If it can't create FTIs for other databases, the issue may be:
a) “UPDATE_NO_FULLTEXT=1” is enabled – this will prevent the creation of any new full-text
indexes and the update of any existing ones. This should be set to '0' in the NOTES.INI and
the server or Notes client restarted.
b) The Update task was not loaded on the server. This task is required to be loaded in the
ServerTasks line of the NOTES.INI for the Indexer to be available; this should be added to the
NOTES.INI and the server restarted.
c) The server itself could have some damaged files. Reinstalling the server could resolve the
issue - full backups of all critical data should be taken before trying this.
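The first two checks in this list can be made against the text of a NOTES.INI file. The following is a minimal Python sketch, assuming the file consists of simple NAME=value lines and that ServerTasks is a comma-separated list. The function names and the sample settings are hypothetical and are for illustration only.

```python
def parse_ini(text):
    """Read NAME=value lines into a dict keyed by lower-case setting name."""
    settings = {}
    for line in text.splitlines():
        if "=" in line and not line.lstrip().startswith(";"):
            name, _, value = line.partition("=")
            settings[name.strip().lower()] = value.strip()
    return settings

def indexing_blockers(ini_text, is_server=True):
    """List settings from the checks above that would stop full-text indexing."""
    settings = parse_ini(ini_text)
    problems = []
    if settings.get("update_no_fulltext") == "1":
        problems.append("UPDATE_NO_FULLTEXT=1 is set")
    if is_server:
        tasks = [task.strip().lower()
                 for task in settings.get("servertasks", "").split(",")]
        if "update" not in tasks:
            problems.append("Update is not in ServerTasks")
    return problems

# Hypothetical NOTES.INI fragments, not taken from a real server.
broken = "ServerTasks=Replica,Router,AMgr\nUPDATE_NO_FULLTEXT=1\n"
healthy = "ServerTasks=Update,Replica,Router\nUPDATE_NO_FULLTEXT=0\n"

print(indexing_blockers(broken))
print(indexing_blockers(healthy))
```

The ServerTasks check is skipped when is_server is False, because it applies only to a Domino server.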
Rebuilding the index or running the command “load updall -x” will recreate the full-text index for
use.
If the full-text index is not actually wanted, the full-text index needs to be recreated and then
removed correctly. This will set the Database to “Not Indexed.”
Note: When the full-text index is set to update “Immediately” it could still be some time before the
Index is updated. The update requests are all queued together and then the Update task runs
through them on a first-come, first-served basis. This is to prevent the server being overloaded by
index update requests every time a document is created, deleted, or updated.
a) Reviewing the database properties on the Full Text tab, confirm that attachments are
included in the index, either filtered or not:
b) Confirm that the search term is correct and present in the attachment. Try searching for
other complete words from the attachment.
c) Rebuild the full-text index.
d) Confirm that the text in the attachment is text. If words are included in a graphic, the
indexer is not able to parse the text from them. A test for this that works in most, but not all,
cases is that if you can open the attachment, highlight the text, copy it, and then paste it into
NotePad, it should be in the index.
e) Confirm that the attachment is not one of the types automatically skipped over by the
indexer:
.au, .cca, .dbd, .dll, .exe, .gif, .img, .jpg, .mp3, .mpg, .mov, .nsf, .ntf, .p7m, .p7s, .pag, .sys,
.tar, .tif, .wav, .wpl, .zip
FT_INDEX_IGNORE_ATTACHMENT_TYPES=<value>
f) Confirm that the indexing of attachments is not disabled for the server with the setting:
FT_Index_Attachments=0
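Check e) can be automated for a list of attachment names. The following is a minimal Python sketch that tests a file name against the extensions listed above. The function name skipped_by_default is hypothetical, and the sketch reflects only the default list given in this document, not any server-specific override.

```python
from pathlib import PurePath

# Extensions listed above as automatically skipped by the indexer.
SKIPPED_EXTENSIONS = {
    ".au", ".cca", ".dbd", ".dll", ".exe", ".gif", ".img", ".jpg",
    ".mp3", ".mpg", ".mov", ".nsf", ".ntf", ".p7m", ".p7s", ".pag",
    ".sys", ".tar", ".tif", ".wav", ".wpl", ".zip",
}

def skipped_by_default(filename):
    """True when the attachment's final extension is in the skipped list."""
    return PurePath(filename).suffix.lower() in SKIPPED_EXTENSIONS

for name in ("photo.JPG", "report.pdf", "backup.zip"):
    print(name, skipped_by_default(name))
```

Only the final extension is compared, and the comparison ignores case.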
Resources
Refer to the following resources to learn more.
FTI search & Search for "All Mail and Archives" is not working as expected
KB0101788
https://support.hcltechsw.com/csm?id=kb_article&sysparm_article=KB0101788
FTI search is not working as expected when the attachments are created with ANSI format and
contains the Hebrew characters
KB0093805
https://support.hcltechsw.com/csm?id=kb_article&sysparm_article=KB0093805
Full Text Index -FTI option set to Immediate doesn't get updated in a timely way
KB0077177
https://support.hcltechsw.com/csm?id=kb_article&sysparm_article=KB0077177
Domino LDAP performance problems under high volume FTI related searches
KB0032573
https://support.hcltechsw.com/csm?id=kb_article&sysparm_article=KB0032573
How can you disable the full text indexer on a Domino server
KB0039453
https://support.hcltechsw.com/csm?id=kb_article&sysparm_article=KB0039453
Full text index Search Incorrect Results When Searching For Email Address Containing Separator
With Underscore
KB0087347
https://support.hcltechsw.com/csm?id=kb_article&sysparm_article=KB0087347
Tika doesn't appear to be running even though all necessary settings are enabled.
KB0078165
https://support.hcltechsw.com/csm?id=kb_article&sysparm_article=KB0078165
Legal Statements
This edition applies to HCL Domino 9.0.x, 10.0.x, 11.0.x, 12.0.x and to all subsequent releases and
modifications until otherwise indicated in new editions.
When you send information to HCL Technologies Ltd., you grant HCL Technologies Ltd. a
nonexclusive right to use or distribute the information in any way it believes appropriate without
incurring any obligation to you.
©2023 Copyright HCL Technologies Ltd and others. All rights reserved.
Note to U.S. Government Users — Documentation related to restricted rights — Use, duplication or
disclosure is subject to restrictions set forth in GSA ADP Schedule Contract with HCL Technologies
Ltd.
Disclaimers
This report is subject to the HCL Terms of Use (https://www.hcl.com/terms-of-use) and the
following disclaimers:
The information contained in this report is provided for informational purposes only. While efforts
were made to verify the completeness and accuracy of the information contained in this publication,
it is provided AS IS without warranty of any kind, express or implied, including but not limited to the
implied warranties of merchantability, non-infringement, and fitness for a particular purpose. In
addition, this information is based on HCL’s current product plans and strategy, which are subject to
change by HCL without notice. HCL shall not be responsible for any direct, indirect, incidental,
consequential, special, or other damages arising out of the use of, or otherwise related to, this
report or any other materials. Nothing contained in this publication is intended to, nor shall have the
effect of, creating any warranties or representations from HCL or its suppliers or licensors, or altering
the terms and conditions of the applicable license agreement governing the use of HCL software.
References in this report to HCL products, programs, or services do not imply that they will be
available in all countries in which HCL operates. Product release dates and/or capabilities referenced
in this presentation may change at any time at HCL’s sole discretion based on market opportunities
or other factors, and are not intended to be a commitment to future product or feature availability
in any way. The underlying database used to support these reports is refreshed on a weekly basis.
Discrepancies found between reports generated using this web tool and other HCL documentation
sources may or may not be attributed to different publish and refresh cycles for this tool and other
sources. Nothing contained in this report is intended to, nor shall have the effect of, stating or
implying that any activities undertaken by you will result in any specific sales, revenue growth,
savings or other results. You assume sole responsibility for any results you obtain or decisions you
make as a result of this report. Notwithstanding the HCL Terms of Use (https://www.hcl.com/terms-of-use),
users of this site are permitted to copy and save the reports generated from this tool for
such users' own internal business purpose. No other use shall be permitted.