Crawling
Create Content in Documentum
Open Documentum Administrator at http://<server>:<port>/da.
Click the + sign to expand the tree and access the cabinets.
Create a cabinet or access a cabinet that is already created.
To add documents, click File.
Click on Import.
Click on Add Files.
Select the documents that you want to add.
Click Next to edit the Object Definition before adding the document.
You can accept the defaults or make modifications here. For example, if these five documents are to be
included in the vijai cabinet and you later want to crawl only those documents, you can create a
document type such as vijai_document and assign it here. If you make vijai_document inherit from
dm_document, the documents are also included when you crawl all documents of type dm_document. You
can also crawl only documents of type vijai_document.
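The effect of inheriting from dm_document can be sketched as follows. This is only a toy model of the type hierarchy, not a Documentum or Content Integrator API: the SUPERTYPE table, the helper names, and the sample file names are assumptions made for illustration.

```python
# Toy model of the type hierarchy described above; not a Documentum API.
SUPERTYPE = {
    "vijai_document": "dm_document",  # custom type from the example
    "dm_document": "dm_sysobject",    # standard supertype of dm_document
    "dm_sysobject": None,
}

def is_a(type_name, ancestor):
    """Return True if type_name is ancestor or inherits from it."""
    current = type_name
    while current is not None:
        if current == ancestor:
            return True
        current = SUPERTYPE.get(current)
    return False

def select_for_crawl(documents, item_class):
    """Keep the documents whose type is item_class or one of its subtypes."""
    return [name for name, type_name in documents if is_a(type_name, item_class)]

# Hypothetical sample data: (file name, document type)
documents = [
    ("report.doc", "vijai_document"),
    ("memo.doc", "dm_document"),
]
print(select_for_crawl(documents, "dm_document"))     # ['report.doc', 'memo.doc']
print(select_for_crawl(documents, "vijai_document"))  # ['report.doc']
```

Crawling dm_document picks up both documents, while crawling vijai_document narrows the result to the documents that were given the custom type.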
When you get to the last document and have finished making changes, click Finish.
Here is the list of documents we just added.
You can log out of Documentum Administrator.
2. Extract all files from the zip file.
3. Here is the list of files that were extracted. Double-click dfcWinSuiteSetup.exe.
… and be initialized.
Click Next.
Accept the license agreement and click Next.
Accept the default installation location for the DFC Runtime Environment. Make a note of this location
because you will need this information later on.
Optionally choose to install the documentation.
Accept or change the default location for the User directory. Make a note of this location because you
will need this information later on.
Enter the hostname or IP address of the Documentum server and accept or change the port number
after consulting with the administrator of the Documentum server. It may take a few seconds after
clicking Next to refresh and move to the next screen.
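Before entering the host and port in the installer, it can help to confirm that the machine can reach the Documentum server at all. The following is a minimal sketch using only the Python standard library. The function name is hypothetical, and the check only proves that a TCP connection can be opened; it does not verify that a Documentum service is listening on that port.

```python
import socket

def can_reach(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, `can_reach("documentum.example.com", 1489)` would test a placeholder host on port 1489, which is assumed here to be the usual default; use the values that the Documentum administrator gives you.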
Review the summary and click Next.
The DFC Runtime environment installation starts.
You may be asked during the installation whether you want to designate a global registry. If you do
not want to designate one, clear the “Designate the global registry repository to use” checkbox. A
message is displayed, and the screen is then shown with the checkbox cleared and the related options
removed.
Click Finish when the DFC installation finishes.
Setting up the connector: Creating the RMI Bridge
The next step is to create an RMIBridge script for the Documentum connector. You can refer to the
following IBM® Content Integrator documentation; this guide corrects errors in that documentation.
http://pic.dhe.ibm.com/infocenter/ce/v8r6/topic/com.ibm.discovery.ci.conn.doc/cdncr018.htm
To retrieve data from a Documentum repository, you must configure the Documentum connector to
work with the supported version of Documentum Foundation Classes (DFC). Ensure that the clients are
installed on the same system where the Documentum connector runs.
Navigate to the installation folder of Content Integrator and make a copy of the RMIBridge.bat file. We
will use the copy.
Rename the copy so that it is easy to recognize as the RMIBridge for the Documentum connector. In this
example, it is named DCTMRMIBridge.bat.
Open DCTMRMIBridge.bat and review the default values. Note the call to config.bat; you can open that
file to see which values it sets. The documentation says to add the following lines after
VBR_CLASSPATH is set. Because VBR_CLASSPATH is set in config.bat, add the lines to DCTMRMIBridge.bat
after the call to config.bat, for example in the :run section.
set DFC_CONFIG_HOME=C:\\Documentum\\config
set VBR_CLASSPATH=%VBR_CLASSPATH%;%DFC_JAR%;%DFC_CONFIG_HOME%
Verify that these paths are the correct paths for DFC_JAR and DFC_CONFIG_HOME.
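One way to verify the two paths before starting the bridge is a small check like the one below. This is a sketch under assumptions: the function name is hypothetical, the paths are passed in by you, and the test for a dfc.properties file reflects the assumption that DFC reads its configuration from that file in the config directory.

```python
import os

def check_dfc_paths(dfc_jar, dfc_config_home):
    """Return a list of problems; an empty list means both paths look usable."""
    problems = []
    if not os.path.isfile(dfc_jar):
        problems.append(f"DFC_JAR does not point to a file: {dfc_jar}")
    if not os.path.isdir(dfc_config_home):
        problems.append(f"DFC_CONFIG_HOME is not a directory: {dfc_config_home}")
    elif not os.path.isfile(os.path.join(dfc_config_home, "dfc.properties")):
        # Assumption: DFC expects dfc.properties in the config directory.
        problems.append(f"No dfc.properties found in {dfc_config_home}")
    return problems
```

Call it with the same values that you set in DCTMRMIBridge.bat and fix anything it reports before you start the batch file.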
Here is the modified file:
Now start DCTMRMIBridge.bat. You can start it either by double-clicking it or by executing the batch file
from the command line.
Setting up the connector: Creating the Connector
Accept the default name of Documentum Connector 1 for this example.
Review each of the tabs in the Properties editor and note the default values. We will modify these to
create our connector.
Defaults of the Properties Editor
Configuring the Documentum Connector
Ensure that the “Enable the connector” checkbox is selected. Next, click the Repository tab.
Add the Docbase name for the Documentum server. You can obtain this information from the system
administrator of the Documentum server. Next, click the RMI proxy connectors tab to configure the
RMI Bridge.
Click on the + sign to add a URL for the RMI proxy connector, that is, the DCTMRMIBridge.bat file that
we created. If this is your first RMIBridge, you can accept the port number as is. If you configure more
than one RMIBridge, make sure that the batch file has a different port number configured for each
RMIBridge.
Click on Save.
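If you run several RMIBridge scripts, a quick check for port collisions can save troubleshooting time. The sketch below assumes that each bridge is identified by a URL of the form rmi://host:port/name; the bridge names, URLs, and port numbers shown are hypothetical examples, not values taken from this guide.

```python
from collections import defaultdict
from urllib.parse import urlparse

def find_port_conflicts(bridge_urls):
    """Map each (host, port) used by more than one bridge to the bridge names."""
    by_endpoint = defaultdict(list)
    for name, url in bridge_urls.items():
        parsed = urlparse(url)
        by_endpoint[(parsed.hostname, parsed.port)].append(name)
    return {ep: names for ep, names in by_endpoint.items() if len(names) > 1}

# Hypothetical bridges; the first two share a port and therefore conflict.
bridges = {
    "DCTMRMIBridge": "rmi://localhost:1250/RMIBridgeServer",
    "OtherRMIBridge": "rmi://localhost:1250/RMIBridgeServer",
    "ThirdRMIBridge": "rmi://localhost:1251/RMIBridgeServer",
}
print(find_port_conflicts(bridges))
# {('localhost', 1250): ['DCTMRMIBridge', 'OtherRMIBridge']}
```

An empty result means that every bridge has its own host and port combination.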
Testing the Connector
Right-click on the “Documentum Connector 1” connector.
Enter the username and password.
If everything is configured correctly, a message is displayed that the connection to the connector
succeeded.
During the test, you might see exceptions such as the following in the console window:
    at com.documentum.fc.client.security.impl.DfcIdentityPublisher.<init>(DfcIdentityPublisher.java:51)
    at com.documentum.fc.client.security.internal.RegistrationMgr.register(RegistrationMgr.java:34)
    at com.documentum.fc.impl.RuntimeContext.<clinit>(RuntimeContext.java:191)
You can ignore these messages; they appear because the global registry was not configured.
Testing access to the content
Now access the Repository and see the document itself.
Click on Start > IBM Content Integrator > Repository Browser Sample.
Click on Logon.
Look at the status bar below to see the progress of the logon process.
Click on the folder in which you have saved the documents. The documents will be displayed in the
window on the right.
Crawling and searching the content
Confirm that the VBR_HOME and JAVA_HOME environment variables in the iice_install_root/bin/config.sh file
(on AIX or Linux) or iice_install_root\bin\config.bat file (on Microsoft Windows) specify the correct directories.
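On Windows, the check of VBR_HOME and JAVA_HOME can be sketched as a small script that reads the `set NAME=value` lines of config.bat and tests whether the directories exist. The helper names are hypothetical. The sketch handles only plain `set` lines, does not expand references such as %OTHER_VAR%, and does not cover config.sh.

```python
import os
import re

# Matches lines such as: set VBR_HOME=C:\Program Files\IBM\...
SET_LINE = re.compile(r"^\s*set\s+(\w+)=(.*)$", re.IGNORECASE)

def read_bat_variables(text):
    """Collect NAME -> value from 'set NAME=value' lines of a batch file."""
    variables = {}
    for line in text.splitlines():
        match = SET_LINE.match(line)
        if match:
            variables[match.group(1).upper()] = match.group(2).strip().strip('"')
    return variables

def missing_directories(variables, names=("VBR_HOME", "JAVA_HOME")):
    """Return the names that are unset or do not point to an existing directory."""
    return [n for n in names if not os.path.isdir(variables.get(n, ""))]
```

To use it, read the contents of config.bat into a string, pass the string to `read_bat_variables`, and pass the result to `missing_directories`; any name it returns needs to be corrected in config.bat.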
To set up IBM Content Analytics with Enterprise Search to crawl sources by using Content Integrator,
you must run the provided escrvbr.vbs script on Windows.
If you are not configuring with an IBM Content Integrator installation that uses WebSphere Application
Server, select No, as shown in this example.
When setup finishes, stop and restart esadmin sessions to restart IBM Content Analytics with Enterprise
Search.
When security is enabled, the crawler crawls documents that allow read access, not browse access. When a
user uses an application to query a secure collection, only content that the user has authority to read is
returned in the results, not content that the user has authority to browse.
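The difference between read and browse access can be illustrated with a short filter. The numeric permit levels below follow the basic Documentum permission scale as generally documented, not anything stated in this guide, and the filter itself is only a sketch of the rule, not the product's actual security check. The document names and permissions are made up.

```python
from enum import IntEnum

class Permit(IntEnum):
    """Basic permit levels, lowest to highest (assumed Documentum scale)."""
    NONE = 1
    BROWSE = 2
    READ = 3
    RELATE = 4
    VERSION = 5
    WRITE = 6
    DELETE = 7

def visible_results(results, user_permits):
    """Keep only the results for which the user holds at least READ."""
    return [doc for doc in results
            if user_permits.get(doc, Permit.NONE) >= Permit.READ]

# Hypothetical search results and per-document permissions for one user.
results = ["a.doc", "b.doc", "c.doc"]
user_permits = {"a.doc": Permit.READ, "b.doc": Permit.BROWSE, "c.doc": Permit.WRITE}
print(visible_results(results, user_permits))  # ['a.doc', 'c.doc']
```

The document with only browse access is dropped, which matches the behavior described above: browse permission alone is not enough for a document to be returned.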
Creating a Collection
Log in to the IBM Content Analytics with Enterprise Search administration console (ESAdmin).
Click on Create a collection and give it the name Documentum Collection.
For Collection ID, click the radio button next to Custom ID. Assigning your own collection ID makes it
easier to find the logging and configuration folders if you need to troubleshoot.
Creating a Content Integrator Crawler
Once the collection is created, click on the + sign that appears next to Crawlers when you hover your
mouse over it. This opens up the Create a crawler screen. From the list of choices, select Content
Integrator.
Click on Next.
Provide a name and a description for the crawler. In this example, we give the name “Documentum
Crawler”. Click Next to go to the next screen where you can select the repositories to crawl.
Enter * in the Repository name or pattern text box and click on Search for repositories.
Find the Documentum Connector that we created in IBM Content Integrator and click on the play button
to add it to the Repositories to crawl section.
Here is how it should appear after adding the connector. If you don’t explicitly add the connector, you
will get an error when you click Next. Click Next to move to the screen to enter credentials.
Notice there is nothing in the Applied column of the Documentum Connector 1 connector.
Select the radio button to Specify new connection credentials, and enter the user ID and password.
Then click on Apply. Don’t forget to click the Apply button or you will get an error when you click Next or
Finish.
When you click Apply, the Applied column shows an icon with a person and a lock. You will also see a
message at the top of the screen saying that the user ID and password were applied. Now click Next.
On the Select Content Integrator Item Classes to Crawl page, enter * or an item type that you created.
Recall the item type vijai_document that was mentioned when adding the documents to the vijai cabinet.
For example you can select dm_document and click the play button to add it to the list of item classes
to crawl.
You can determine the item type by accessing Documentum and a document, and right-clicking on the
document to access its properties. Here the type is dm_document.
Once you select the item type, click Apply. Then click Next.
Review the crawler configuration and click Finish.
Start the crawler.
Click on the eye icon to monitor the crawler to see how many documents have been crawled so far. For
example, this sample screen shows 265.
Alternatively, if you configure an item type like vijai_document and upload a document with that item
type, you can search for it and select that item type.
In this example, we added one document with the type vijai_document, and the crawler crawled it
successfully.
Once the parser and indexer complete the indexing, you can access the search application, search for
the document by using a query term, and verify that you are able to access the document.
Now you have an end-to-end configured system with the Documentum server, the documents and item
types, the DFC classes, the RMIBridge and Connector, the crawler, the index, and the ability to search.
For more information or questions about this document, contact [email protected] or IBM
Support at 1-800-IBM-SERV.
Troubleshooting:
1. The information center says:
a. Start the administration tool.
b. Right-click Connectors and select Create an RMI proxy connector script file wizard.
c. Answer the remaining prompts.
Correction: You must create a Documentum Connector, right-click on it, and then click on RMI proxy
connector script wizard.
2. During the Test Connection, you may see exceptions in the console window:
    at com.documentum.fc.client.security.impl.DfcIdentityPublisher.<init>(DfcIdentityPublisher.java:51)
    at com.documentum.fc.client.security.internal.RegistrationMgr.register(RegistrationMgr.java:34)
    at com.documentum.fc.impl.RuntimeContext.<clinit>(RuntimeContext.java:191)
You can ignore these messages because the global registry was not configured.