Integrating Search Interface to SAS Content 3.6: Rebuild
Other brand and product names are trademarks of their respective companies.
Table of Contents
Chapter 1 — Introduction
  Architecture
  Data Feeds and Indexing
  Installation and Prerequisites
  Post-Installation Configuration to Enable SSL (Optional)
  Integration with SAS Information Retrieval Studio
  Integration with Apache Lucene
  Apache Lucene Index
  Verify the Installation
  Verify the Integration with SAS Information Retrieval Studio
  Verify the SAS Information Retrieval Studio Server Installation
Chapter 2 — Configuration Options
  Step 1 - Verify the Registration Information in Metadata
  Step 2 - Verify the Configuration of SAS Information Retrieval Studio
  Rebuild the Search Index
  Disable the CSRF Filter
  Add Sites to the CSRF Whitelist
  Whitelisting Methods
Chapter 3 — Logging
  Log Settings and Location
Appendix — Troubleshooting (FAQ)
  1. I see no results when I submit a query string in Visual Analytics Hub. What can I do?
  2. How can I enable the Administration Console for SAS Information Retrieval Studio?
  3. How can I verify that SAS Information Retrieval Studio has been configured correctly?
  4. How can I verify the host and port of SAS Information Retrieval Studio?
  5. Some of the required fields are missing from the IR Studio schema, so its contents are not searchable. How can I add the missing fields?
  6. I have a ‘Connection refused’ message in my log file. What should I do?
  7. I am receiving email messages that report errors. What do they mean?
  8. What configuration changes are required for an external reverse proxy environment?
  9. How can I recover lost data for the Indexing Server after an indexing failure?
  10. Search does not work with SSL. What steps can I take in this situation?
  11. How can I change the frequency of indexing?
  12. What is the frequency of email notifications?
  13. Is IPv6 supported?
  14. How can I enable/disable the daily full feed?
  15. How can I change the default setting for the maximum number of records returned?
Technical Support
Chapter 1 — Introduction
Search Interface to SAS Content is a web application that enables users to search the contents of SAS
deployments. It is a user-interface component of the search facility that is installed on the middle tier
of SAS Solutions. The search facility indexes and searches SAS content that is registered in metadata
and in the Content Server. It returns only the results that are available to the requesting user, based
on user account permissions.
Search Interface to SAS Content relies on a second component, an indexing provider that creates a
content index. Two index providers are supported:
SAS Information Retrieval Studio (enabled by default in IPv4 environments)
Apache Lucene
One of these providers is enabled during product installation.
Once the index has been built, Search Interface to SAS Content queries it in response to search criteria
supplied by SAS Solution users.
A Search Interface to SAS Content plug-in is available in the Configuration Management plug-ins of
the SAS Management Console to enable the administrator to perform configuration tasks.
Architecture
Search Interface to SAS Content is a component of the Platform Content Services within the SAS Web
Infrastructure Platform. The Content Services provide common management of and access to
different data types, often stored in multiple repositories. All of the services in the SAS Platform
Content Services category contain methods that can interact with any content object type. The Search
Service and the Index Data Service work together to provide the search functionality. The following
diagram provides a high-level description of the architecture, focused on security requirements:
Note that the three “tiers” of the architecture can be running on a single machine, or on separate
machines. The Metadata Server is part of the Server Tier, while the Content Server is part of the
Middle Tier.
1. An end-user who has logged into a SAS user interface (such as the SAS Visual Analytics
Hub, client software running on the user’s PC) initiates a search. The client sends a query
request to the SAS Search Service web application, which is configured on a separate server
for the Middle Tier components. A connection is made to the SAS Search Service running in
the application server using port 7980 by default, or an external port.
2. The SAS Search Service makes a connection to the Information Retrieval Studio Query
Server on port 10731, and sends the search request.
3. Information Retrieval Studio searches its index for the query string. The index is stored in
the following directory on the machine where your SAS solution server is running:
<SASConfig>/LevN/Applications/SASInformationRetrievalStudioforSAS/work/index
4. Information Retrieval Studio sends an authorization request to the Search Service
Authorization Provider running in the Middle Tier. The query-server-db file is used to
find the authorization provider. The value of the searchsas.auth.provider.url
parameter determines the REST endpoint to call for authorizing search results. For more
information, see “Step 1 - Verify the Registration Information in Metadata” on page 7.
5. The Authorization Service Provider returns a filtered list of authorized objects.
6. The Information Retrieval Studio Query Server returns search results to the SAS Search
Service.
7. The SAS Search Service returns the results to the user interface (in this example, the Hub).
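When a search returns nothing, a quick first check is whether the ports named in the steps above accept connections at all. The following is a minimal sketch, assuming the default ports 7980 and 10731 are in use; the function names and the service labels are illustrative and are not part of any SAS tool.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable host names.
        return False


# Default ports taken from the steps above; the labels are illustrative.
DEFAULT_PORTS = {
    "SAS Search Service (application server)": 7980,
    "Information Retrieval Studio Query Server": 10731,
}


def check_search_ports(host, ports=DEFAULT_PORTS):
    """Map each service label to True or False depending on reachability."""
    return {label: port_is_open(host, port) for label, port in ports.items()}
```

Calling check_search_ports with the middle-tier host name reports which of the two listeners is reachable. An open port shows only that something is listening, not that the service behind it is healthy.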
The diagram shows a configuration with Information Retrieval Studio as the index server, but this
service could also be provided by Apache Lucene.
During the installation, you are prompted to configure a SAS internal account for the Search
Interface to SAS Content user. The default Search Interface to SAS Content User is
sassearch@saspw, which is required to enable access to content in SAS Information Retrieval
Studio. This default user is one of the internal user accounts that are required by SAS Foundation and
are authenticated on the SAS Metadata Server. For more information about these user accounts, see
the SAS® Visual Analytics Installation and Configuration Guide.
Search Interface to SAS Content provides search results exclusively for the Search Interface to SAS
Content User. The results are then authorized as a separate step for the individual user accounts that
are logged in and performing searches.
SAS Information Retrieval Studio 1.53 requires Python version 2.6 through 2.7.x (not 3.0 or higher) for
deployment. This third-party software is included in the Pre-Installation Checklist that accompanies
your SAS Deployment Plan. It must be installed on the target machine before you can install SAS
Information Retrieval Studio on the SAS Visual Analytics middle tier. You are prompted to specify
the Python installation location during SAS Visual Analytics installation.
For more information about Python requirements and a link where you can download the software,
go to http://support.sas.com/resources/thirdpartysupport/v94/othersw.html.
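The version requirement can be checked before installation begins. The sketch below assumes only the range stated above (2.6 through 2.7.x); the helper names are hypothetical.

```python
def parse_version(text):
    """Turn a string such as '2.7.5' into a tuple of integers: (2, 7, 5)."""
    return tuple(int(part) for part in text.strip().split("."))


def is_supported_python(version):
    """Return True for Python 2.6 through 2.7.x, the range stated above."""
    major, minor = version[0], version[1]
    return major == 2 and minor in (6, 7)
```

For example, is_supported_python(parse_version("2.7.5")) is true, while any 3.x version is rejected.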
On UNIX, the IRStudio.sh script runs in the SAS configuration directory under
/Applications/SASInformationRetrievalStudioforSAS.
You can use the following commands to operate the server on UNIX:
IRStudio.sh start | stop | status | restart
Generated log files are in the /logs subdirectory.
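If the script is driven from an administration script rather than typed by hand, it helps to validate the action first. The following is a sketch only: the configuration path in the usage note is a placeholder, and the function name is hypothetical.

```python
import posixpath

# The four actions accepted by IRStudio.sh, as listed above.
VALID_ACTIONS = ("start", "stop", "status", "restart")


def irstudio_command(level_dir, action):
    """Build the argument list for IRStudio.sh under a LevN directory."""
    if action not in VALID_ACTIONS:
        raise ValueError("unsupported action: %r" % (action,))
    script = posixpath.join(
        level_dir,
        "Applications",
        "SASInformationRetrievalStudioforSAS",
        "IRStudio.sh",
    )
    return [script, action]
```

The returned list can be passed to subprocess.run, for example subprocess.run(irstudio_command("/opt/sas/config/Lev1", "status"), check=True), where /opt/sas/config/Lev1 stands in for your own configuration directory.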
When the SAS middle-tier search facility is first initialized, all available content is fully indexed.
Subsequently, incremental changes to SAS content are regularly indexed. If a failure in generating or
loading the index occurs, an email message is automatically sent to the address that is designated for
administrative messages for your SAS deployment.
Refer to “Chapter 2 — Configuration” for the procedure to enable the index server of SAS
Information Retrieval Studio to receive SAS contents.
Proxy Server
Pipeline Server
Indexing Server
Query Server
These servers are required to enable the Search applications to work. If any of them are missing from
the administration page, you might have a licensing issue. Contact SAS Technical Support if you
suspect a licensing issue.
If a server is not listed as running, start it by clicking the Start button on the top right side of the
server Details page. Then wait for a minute or so to make sure it does not stop again. Troubleshooting
has occasionally turned up a problem with this manual step.
If all of these servers are listed and are running, SAS Information Retrieval Studio has been
successfully installed.
4. Right-click Search Interface to SAS Content 3.6 and select Properties. Then click the
Advanced tab in the Properties dialog box.
5. Verify that the following properties have the correct values. These properties were
configured during the installation and configuration of SAS Information Retrieval Studio:
searchsas.irstudio.server.host: the hostname of the server where SAS Information
Retrieval Studio servers are running.
searchsas.irstudio.server.port: the port number of the server where SAS Information
Retrieval Studio is running. The default (10651) should not be changed.
searchsas.irstudio.query.server.port: the port number of the server where the query
server of SAS Information Retrieval Studio is running.
searchsas.irstudio.proxy.server.port: the port number of the server where the proxy
server of SAS Information Retrieval Studio is listening.
searchsas.auth.provider.url: constructs the URL that is used to enable communications
among SAS components. Search Interface to SAS Content uses this URL to contact the
authorization service provider, which filters the query results that are returned based
on user permissions.
If a load balancer or firewall is running in between client machines and the SAS Web
Server, or if a proxy server is being used to mask URLs, you might need to change the
default value on the External Connection tab in this Properties dialog box. By default,
the Use internal connection information check box is selected. Clear that check box,
and then enter the connection information for the proxy in situations where the proxy
server must route connections.
Note: If you are unsure of the correct values, see the installation documentation for SAS Visual Analytics. For
more information about configuration options in a proxy server or firewall environment, see
“Specifying Connection Properties” in the SAS® 9.4 Intelligence Platform: Middle-Tier
Administration Guide.
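Once the values have been read from the Advanced tab, a simple sanity check can catch empty entries and malformed port numbers. This is a minimal sketch: the property names are the ones listed above, but the validation rules (non-empty values, ports between 1 and 65535) are assumptions of this example and are not checks that SAS performs.

```python
REQUIRED_PROPERTIES = (
    "searchsas.irstudio.server.host",
    "searchsas.irstudio.server.port",
    "searchsas.irstudio.query.server.port",
    "searchsas.irstudio.proxy.server.port",
    "searchsas.auth.provider.url",
)

# Every property whose name ends in ".port" must hold a TCP port number.
PORT_PROPERTIES = tuple(n for n in REQUIRED_PROPERTIES if n.endswith(".port"))


def validate_properties(props):
    """Return a list of problems found in a dict of property values."""
    problems = []
    for name in REQUIRED_PROPERTIES:
        if not str(props.get(name, "")).strip():
            problems.append(name + ": missing or empty")
    for name in PORT_PROPERTIES:
        value = str(props.get(name, "")).strip()
        if value and not (value.isdecimal() and 1 <= int(value) <= 65535):
            problems.append(name + ": not a valid TCP port")
    return problems
```

An empty list means that every property is present and that each port value is a plausible port number. It does not confirm that the values match your deployment.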
Whitelisting Methods
In addition to whitelisting sites, you can also whitelist certain HTTP methods. For example, you can
choose to allow GET requests from any site. However, the result of this configuration is that SAS
applications that use GET requests to invoke certain actions are now susceptible to CSRF attacks.
To whitelist certain methods, set the
sas.web.csrf.referers.skipMethods property in the Advanced properties for the SAS
Application Infrastructure. The value is a comma-delimited list of the HTTP methods (for
example, GET,OPTIONS,TRACE) that are to be skipped.
To enable CSRF checking and enforcement, set the sas.web.csrf.referers.performCheck
property to true in the Advanced properties for SAS Application Infrastructure.
After changing any of the properties that are discussed above, you must restart the SAS Web
Application Server in order for the changes to take effect.
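The interaction between the two properties can be pictured with a small model. This is an illustration of the behavior described above, not the SAS implementation; the function names are hypothetical, and the way the comma-delimited value is parsed (trimmed, case-insensitive) is an assumption.

```python
def parse_skip_methods(value):
    """Split a comma-delimited skipMethods value into a set of method names."""
    return {m.strip().upper() for m in value.split(",") if m.strip()}


def referer_check_required(method, perform_check, skip_methods_value):
    """Model whether a request with this HTTP method gets a referer check.

    perform_check stands for sas.web.csrf.referers.performCheck and
    skip_methods_value for sas.web.csrf.referers.skipMethods.
    """
    if not perform_check:
        return False
    return method.upper() not in parse_skip_methods(skip_methods_value)
```

With the value GET,OPTIONS,TRACE, a POST request is still checked while a GET request is not, which is why whitelisting GET exposes any action that can be invoked through a GET request.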
You also have the option to set CSRF properties on individual web applications in your SAS
installation by using Configuration Manager. After you make any changes in Configuration Manager,
you must restart all of the web application servers in your SAS middle tier in order for the settings to
take effect.
Chapter 3 — Logging
Log Settings and Location
Logging for Search Interface to SAS Content is enabled by default. Also by default, the log4j root level
is set to error, which can be changed to warn or info as required. The configuration file for Search
Interface to SAS Content logs can be found at this location:
SAS-Configuration-Directory/LevN/Web/Common/LogConfig/SASSearchService-log4j.xml
Logs are generated at this location:
SAS-Configuration-Directory/LevN/Web/Logs
For “LevN,” substitute the location of the configuration file for the selected environment.
To change the default logging level from error to warn or info, take the following steps:
1. Use your preferred text editor to change the root level in the following file:
SAS-Configuration-Directory/LevN/Web/Common/LogConfig/SASSearchService-log4j.xml
2. Restart all SAS Web Application Server instances for the changes to take effect.
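The edit in step 1 can also be scripted. The sketch below assumes that the file follows the common log4j 1.x XML layout, in which a root element contains a level (or priority) element with a value attribute. That layout is an assumption of this example, so confirm it against your own file before relying on it.

```python
import xml.etree.ElementTree as ET

# Namespace used by the log4j 1.x XML configuration format.
LOG4J_NS = "http://jakarta.apache.org/log4j/"
ALLOWED_LEVELS = ("error", "warn", "info")


def set_root_level(xml_text, level):
    """Return the configuration text with the root logger level changed."""
    if level not in ALLOWED_LEVELS:
        raise ValueError("unsupported level: %r" % (level,))
    # Keep the log4j: prefix when the document is written back out.
    ET.register_namespace("log4j", LOG4J_NS)
    config = ET.fromstring(xml_text)
    root_logger = config.find("root")
    if root_logger is None:
        raise ValueError("no <root> element found")
    node = root_logger.find("level")
    if node is None:
        # Older configurations use <priority> instead of <level>.
        node = root_logger.find("priority")
    if node is None:
        raise ValueError("no <level> or <priority> element under <root>")
    node.set("value", level)
    return ET.tostring(config, encoding="unicode")
```

ElementTree discards comments and any DOCTYPE declaration when it parses, so treat the returned text as something to review and compare against a backup rather than as a drop-in replacement for the original file.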
2. How can I enable the Administration Console for SAS Information Retrieval
Studio?
The SAS Information Retrieval Studio Administration Console is disabled by default so that the
settings cannot be changed by an anonymous user.
You may need to enable the interface if SAS Information Retrieval Studio is running abnormally. You
can use the administration interface to check server status and start/stop services.
To enable the SAS Information Retrieval Studio Administration Console, take the following steps:
1. On the machine that is hosting SAS Information Retrieval Studio, which is typically the SAS
Application Server machine that hosts the SAS Visual Analytics server tier, make a backup
copy of the following file:
SAS-Configuration-Directory/LevN/Applications/SASInformationRetrievalStudioforSAS/work/information-retrieval-studio.conf
2. Use your preferred text editor to open information-retrieval-studio.conf.
3. Add the following line to the bottom of the file:
enable-web-admin-interface=true
4. Save the file and close it.
5. Restart the SAS Information Retrieval service:
SAS-Configuration-Directory/LevN/Applications/SASInformationRetrievalStudioforSAS/IRStudio.sh restart
6. Access the SAS Information Retrieval Studio Administration Console:
http://host-name:port/
For the host name, substitute the host name of the machine that is hosting SAS Information Retrieval
Studio. The default port is 10651.
To disable the SAS Information Retrieval Studio Administration Console, take the following steps:
1. Use your preferred text editor to open the following file:
SAS-Information-Retrieval-Studio-config-directory/work/information-retrieval-studio.conf
2. Remove the following line:
enable-web-admin-interface=true
3. Save the file and close it.
4. Restart the SAS Information Retrieval service.
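The enable and disable procedures differ only in whether one line is present in the file, so both can be expressed as a single text transformation. This is a sketch: the function name is hypothetical, and it assumes that the setting occupies a line of its own, as in the steps above.

```python
ADMIN_LINE = "enable-web-admin-interface=true"


def set_web_admin(conf_text, enabled):
    """Return conf_text with the admin-interface line added or removed."""
    # Drop any existing copy of the line so that repeated calls are safe.
    lines = [ln for ln in conf_text.splitlines() if ln.strip() != ADMIN_LINE]
    if enabled:
        # The steps above add the line at the bottom of the file.
        lines.append(ADMIN_LINE)
    return "\n".join(lines) + "\n"
```

Read the current file, write the returned text back only after making the backup copy described in step 1, and then restart the service as in the procedures above.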
3. How can I verify that SAS Information Retrieval Studio has been configured
correctly?
First, verify that SAS Information Retrieval Studio is installed correctly and that the search
application has been installed correctly. Use the REST URL for search, which also silently configures
the SAS Information Retrieval Studio for Search. The REST APIs can be accessed by using the context
URL for the Search service: http://host:port/SASSearchService/rest
Next, open the Administration Console of SAS Information Retrieval Studio, and then open the
Indexing Server. Verify the Information Retrieval Studio configuration by checking the following
elements in the Configuration section of the Indexing Server:
title
description
link
sastype
sasmetatype
sasowner
sasid
keywords
promptlabels
These elements must all be present in the Configuration section of the target setup. This ensures that
the configuration of SAS Information Retrieval Studio has been invoked by Search Interface to SAS
Content. Note that the page may include more field names than are shown above.
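If you copy the field names from the Configuration section into a list, the comparison can be automated. This is a minimal sketch; the function name is hypothetical and the field names are the nine listed above.

```python
REQUIRED_FIELDS = (
    "title",
    "description",
    "link",
    "sastype",
    "sasmetatype",
    "sasowner",
    "sasid",
    "keywords",
    "promptlabels",
)


def missing_fields(configured):
    """Return the required fields that are absent from the configured names."""
    present = set(configured)
    return [name for name in REQUIRED_FIELDS if name not in present]
```

Extra names in the configured list are ignored, which matches the note that the page may show more fields than the required ones. An empty result means that all required elements are present.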
4. How can I verify the host and port of SAS Information Retrieval Studio?
The host and port number that SAS Information Retrieval Studio is using can be taken from the
properties of the metadata software component of Search Interface to SAS Content.
Take the following steps:
1. Launch SAS Management Console and log in with administrator credentials.
2. Click the Plug-ins tab.
3. Navigate to SAS Management Console Application Management and select Configuration
Manager.
4. Right-click Search Interface to SAS Content 3.6 and select Properties. Then click the
Advanced tab in the Properties dialog box.
searchsas.irstudio.server.host: The hostname of the server where SAS Information Retrieval Studio
servers are running.
searchsas.irstudio.server.port: The port number of the server where the SAS Information Retrieval
Studio Administration Console is running.
searchsas.irstudio.proxy.server.port: The port number where the feed is pushed in SAS Information
Retrieval Studio.
searchsas.irstudio.query.server.port: The port number where queries to the Index are sent.
5. Some of the required fields are missing from the IR Studio schema, so its
contents are not searchable. How can I add the missing fields?
If fields are missing from the configuration of the Information Retrieval Studio Indexing Server, the
contents of those fields are neither indexed nor available in responses. In this scenario,
reconfigure the Indexing Server to make the missing fields available for searching.
Take the following steps to reconfigure the schema:
1. Stop all SAS Web Application Server instances.
2. Stop the Information Retrieval Studio server.
3. Go to the following location relative to the configuration directory:
LevN\Applications\SASInformationRetrievalStudioforSAS\work
4. Open the file pipeline-server.db and replace the text ‘BIRD’ with ‘BIRD1’.
5. Indexed content is located in the ‘index’ folder inside the current directory of Information
Retrieval Studio (the ‘work’ directory). Manually delete all of the files in this folder.
6. Save the file, and restart the Information Retrieval Studio server.
7. Start the Search Interface to SAS Content Web Application Server.
After these steps are completed, but before content is pushed to the index server, Information
Retrieval Studio is reconfigured with the latest information available through the Index Data Service
schema API.
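Steps 4 and 5 can be scripted once the servers are stopped. The sketch below is illustrative only: it treats pipeline-server.db as a file in which the marker text can be replaced byte for byte, and the guard against applying the change twice is an assumption added for this example. Take a backup of the work directory before running anything like it.

```python
from pathlib import Path


def reset_schema_marker(work_dir):
    """Replace 'BIRD' with 'BIRD1' in pipeline-server.db (step 4).

    Returns False without writing if 'BIRD1' is already present, so that
    a second run does not turn 'BIRD1' into 'BIRD11'.
    """
    db_file = Path(work_dir) / "pipeline-server.db"
    data = db_file.read_bytes()
    if b"BIRD1" in data:
        return False
    db_file.write_bytes(data.replace(b"BIRD", b"BIRD1"))
    return True


def clear_index(work_dir):
    """Delete every file in the 'index' folder (step 5); return the count."""
    removed = 0
    for entry in (Path(work_dir) / "index").iterdir():
        if entry.is_file():
            entry.unlink()
            removed += 1
    return removed
```

Both functions take the path of the work directory named in step 3. Restart the servers in the order given in steps 6 and 7 afterward.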
You received this email message because Search Interface to SAS Content failed to connect
to Information Retrieval Studio. Make sure Information Retrieval Studio is up and
running. (Refer to Questions 2 and 3 above.) Also verify that the correct values for the
Information Retrieval Studio host and port are being used. (Refer to Question 4 above.)
7 b) Error Message: com.sas.svcs.search.client.IndexServiceException:
Exception encountered related to Index Path of Lucene.
You are receiving this email message because Search Interface to SAS Content could not
access the Apache Lucene index directory. Verify that
searchsas.lucene.index.default.dir has a valid directory path. (Refer to
Question 4 above and the image of the SAS Management Console.) You should also verify
that the SAS install user has write access to the specified directory.
7 c) Error Message: com.sas.svcs.search.client.IndexServiceException: An
error occurred while retrieving data from Index Data Service.
If you receive this email message, make sure that the Web Infrastructure Platform Data
Server is running. If it is not running, start it, and also restart all Web application server
instances. If the Web Infrastructure Platform Data Server is already running, check the
SASSearchService log file for the exact cause of the failure.
7 d) Error Message: Search Interface to SAS Content encountered error while
feeding the content to the Indexing Server.
This email message reports that the Search Interface to SAS Content “encountered [a]
problem while retrieving content” for a specified list of data types. The message further
instructs you to check web application server logs. The affected host is listed to help you
parse log content.
The cause of this type of error can vary. It can be caused by an internal fault in the object
type, or by corrupt data handling by the object type owner. When you check the web
application server log, check for exceptions. The log information should indicate which
team is responsible for troubleshooting the problem, based on the object types that are
affected.
By default, the web application server log is written to the following file:
Config/LevN/Web/Logs/SASServerN/SASSearchService.log
You might also need a hot fix to resolve this issue. If your environment included a recent
migration, some data object types can be unknown to the Metadata Server when indexing
runs. The Indexing Service tries to find the lookup support service for those objects, and
when it fails, it throws the exception.
If you believe that this description fits your situation, contact SAS Technical Support.
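Before contacting support, it can help to pull the exception lines out of the log. This is a minimal sketch that matches on the word Exception; the function name is hypothetical, and no particular log line format is assumed.

```python
def exception_lines(log_lines, marker="Exception"):
    """Return (line number, text) pairs for lines that contain the marker."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(log_lines, 1)
        if marker in line
    ]
```

Pass an open file object for SASSearchService.log, or any list of lines. Narrowing the marker to IndexServiceException limits the output to the errors described in this question.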
9. How can I recover lost data for the Indexing Server after an indexing failure?
Search Interface to SAS Content does not lose data after an indexing failure. After the cause of the
indexing failure has been found and corrected, all missing index fields will be populated
automatically. Typical usage of Search Interface to SAS Content does not require you to interact with
the Index Server; all such interactions are performed by the Search Service and are transparent to
you. However, if a failure occurs that affects the search index, you can manually delete and rebuild
the index. Any problem that involves corrupted Information Retrieval Studio indexes requires you to
re-index all SAS data. Take the following steps:
1. Stop all SAS Web Application Server instances.
2. Stop Information Retrieval Studio. You can use the Administration Console to perform this
step. For more information about the Administration Console, see How can I enable the
Administration Console for SAS Information Retrieval Studio? on page 13.
3. Delete the Information Retrieval Studio index. You can use the Administration Console to
perform this step.
4. Start all SAS Web Application Server instances.
5. Start Information Retrieval Studio.
Search Interface to SAS Content will then index all data. The process can take two to three hours.
10. Search does not work with SSL. What steps can I take in this situation?
SAS Information Retrieval Studio requires manual configuration to work in an environment where
Secure Sockets Layer (SSL) encryption is used. Ensure that it has been configured properly. For
details, refer to the section titled “Post-Deployment Step for SAS Information Retrieval Studio” in
SAS Visual Analytics Installation and Configuration Guide, available at
http://support.sas.com/documentation/solutions/va/73/en/vaicg.pdf
15. How can I change the default setting for the maximum number of records
returned?
Your search results are subject to a default setting that limits the number of records that can be
returned from a single query. If your search results include a message indicating that “the matching
record count has exceeded a threshold of N records,” you have experienced this limit.
If you have a very large database and do not want to see this message every time you do a search,
you can change the maximum number of records that are returned from a search.
This limit is imposed in order to ensure optimal search performance. Each object that is returned
from a query must be authorized for the user account that is performing the search. This series of
checks and authorizations can easily become a bottleneck when searching a large number of database
objects.
To change the maximum number of records that are returned per search, take the following steps:
1. Open the SAS Management Console, go to Application Management, and select Configuration
Manager.
2. Right-click the software component with the name “Search Interface to SAS Content X.X”
and click Properties.
3. Click the Advanced tab.
4. Change the value of the searchsas.max.auth.records.count property to customize the number
of records that are returned in a search. The property is set to 20000 by default.
5. Click OK to save the changes.
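The effect of the property can be pictured with a small model. This is an illustration under stated assumptions, not the SAS implementation: it assumes that at most the configured number of matching records is passed on for authorization and that the threshold message appears when the match count is higher. The function name is hypothetical.

```python
# Default value of searchsas.max.auth.records.count, as stated above.
DEFAULT_MAX_AUTH_RECORDS = 20000


def records_to_authorize(matching_count, max_records=DEFAULT_MAX_AUTH_RECORDS):
    """Return (records passed on for authorization, threshold exceeded?)."""
    exceeded = matching_count > max_records
    return min(matching_count, max_records), exceeded
```

Raising the limit removes the message for larger result sets, but every additional record costs one more authorization check, which is the performance trade-off described above.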
Technical Support
If you need assistance with the software, we ask that only SAS support personnel call our Technical
Support Division.
For U.S. and Canadian customers, support is provided from our corporate headquarters in
Cary, North Carolina. You may call (919) 677-8008, Monday through Friday.
Customers outside of the U.S. can obtain local-language technical support through the local
office in their countries. Customers in these locations should contact their local office for
specific support hours. See
http://support.sas.com/techsup/contact/index.html for contact information for
local offices.
Before you call, we recommend exploring the SAS Support Web site at
http://support.sas.com/techsup/
This site offers access to the SAS Knowledge Base, as well as discussion forums, Technical Support
contact options, and other support materials that may answer your questions.