Performance Tuning in OBIEE 11G
Performance Tuning in OBIEE 11G
Performance Tuning in OBIEE 11G
environment.
To maximize Oracle Business Intelligence Enterprise Edition performance, you need to monitor,
analyze, and tune all the Fusion Middleware / BI components. This guide describes the tools that you can
use to monitor performance and the techniques for optimizing the performance of Oracle Business
Intelligence Enterprise Edition components.
In order to have an organized way to analyze the performance of an OBIEE installation, we will take a
three pronged approach so that all the components can be tested, their performance analyzed and
subsequently altering the settings to fine tune performance.
This approach involves the following steps:
Define a User
The first step in defining a good test is to define extremely precisely and clearly what it is we are testing.
When we consider how to model an OBIEE user in a performance test, we have the following challenges:
Navigation paths
In a BI system, once a user has run their first report, which will they run then? Maybe a different report,
but quite likely it will be navigation within the existing report (e.g. drill down, prompt changes, etc.)
Ad hoc
Whilst some reporting will be fixed reports, much business value from BI comes in the analysis and
exploration of data in an ad hoc manner to discover trends etc. to give competitive advantage. These ad
hoc queries cannot, by definition, be known in advance, so the performance test definition must find a
way to address this.
Think about how many types of users present in the system, and how to characterize and group them.
There will be different sets of data they access, and varying degrees of interactivity with which they use
the tool. Some will be using prebuilt reports with fixed filters, other may be doing extensive ad-hoc
analyses, and somewhere in the middle will be pre-built reports but with a fair amount of interactivity
through functionality such as prompts and selection steps.
Define a Report
For each of the User types that we define, we need to precisely define the workload for them. This will be
made up of the execution of one or more dashboards/reports. As well as the distinction to make between
individual reports (Analyses/Answers) and Dashboards, we also need to consider:
Filters
How do we model a report which has six possible filters? We cant just call that report A, because each
filter permutation could cause different SQL to be generated by the BI reporting tool and consequently
vastly different execution plan(s) on the database where the query runs.
Aggregate navigation
One of OBIEEs many strengths is its aggregate navigation capability, enabling the user to drill down into
data whilst in the background the tool switches between pre-defined aggregate tables on the database
to pick the fastest one to use. For performance testing we need to make sure that within a report
each possible aggregate is considered separately.
Any report or dashboard thats using prompts needs to have those prompt values stated in the test
definition. The execution of a query on the database can vary considerably depending on the predicates
supplied, and it is in part from the report/dashboard prompts that these predicates come.
User Workload
Having precisely defined the reports which are going to be run, we can sequence them together to
represent different types of users. Clearly this is going to be a generalization, an estimation, of how a
particular group of users is going to tend to use the reports. Unlike an OLTP transaction, where the steps
are much more predefined and free from possible deviations, users of OBIEE have the flexibility to run
any number of reports, drill within existing reports, and so on.
Some user types may just run one report and exit; they dont have to have a complex sequence. We should
model several different instances of each user group in order or increase the test coverage and realism of
different reports running concurrently. There should be awareness that the more user types and tests to be
done, the longer it will take to build, and the more complex the troubleshooting could be.
User Concurrency
Once we have designed a set of individual user workloads, we can bring these together into a larger test to
assess the performance of the system as a whole under load. The aim of load testing is to assess what
happens to the response time of reports, and hardware utilization, as the number of concurrent users on
the system increases. There are three key things to define in a load test:
Workload of each user
Different types of user
Number of concurrent users
The first two of these are discussed above. When it comes to user concurrency, we need to be very careful
how we derive our figure, because it has a big impact on the scale of the tests that we run and the
discussion of the outcomes.
First, ensure that definition of concurrency that is used is concise. These include:
Concurrent user sessions logged into OBIEE, viewing dashboards, formatting reports.
Concurrently executing queries at the BI Server level, which could be a single user running one
report or one dashboard.
Concurrently executing queries on the database, which could be a single user in OBIEE running a
single report or dashboard requiring more than one SQL request to get its data.
Each of these have their own particular load on the system, but can easily differ by orders of magnitude.
Typically a user concurrency figure will be derived from the number of users. From this, a reckoning is
made on how many will be using the system at one time. So unless its qualified otherwise, we need to
work with the first definition concurrent user sessions logged into OBIEE.
If you have an existing system then it is easy to determine your actual user concurrency through either
Usage Tracking or the OBI DMS metrics.
If your user concurrency figure is based on users logged into OBIEE, not necessarily running reports,
then the metric
Oracle BI PS Sessions: Sessions Logged In should be appropriate.
For the number of queries (logical SQL) concurrently running on the BI Server, use Usage Tracking, or
DMS metric
The number of concurrent queries on the database can either be obtained from the database monitoring
tool, or through the Oracle BI DB Connection Pool DMS metrics.
To estimate user concurrency from Usage Tracking, you can use the following query. It will analyze
Usage Tracking to give an approximate number of concurrent users, but this is based on individual
queries run on the BI Server, so it not the same as users simply logged into OBIEE, nor the same as
dashboard refreshes (since one dashboard could be many queries):
-- Usage Tracking Concurrency usage estimate
-- Per minute granularity
SELECT TO_CHAR(start_ts, 'YYYY-MM-DD HH24:MI') AS start_ts_min,
COUNT(DISTINCT user_name)
AS distinct_users
FROM s_nq_acct
Database, including:
o DDL
o Object statistics
BI Server Repository (RPD)
Dashboard and report definitions (Webcat)
For example, if a change was made to the RPD but in a different Business Model from the one
you are testing then it may not matter. If, however, they have partitioned an underlying fact table,
then this could drastically change your results to the extent you should be discarding your first
results and retesting.
Make sure the data in the tables from which reporting is done is both unchanging and representative of
Production. If you are going live with 10M rows of data then it would be pretty wise to do our utmost to
run the performance test against 10M rows of data. Different types of reports might behave differently,
and this is where judgment comes in. For example, a weekly report that is based on a fact table partitioned
by week might perform roughly the same whether all or just one partition is loaded.
However, the same fact table as the source for a historical query going back months, or a query cutting
across partitions, is going need more data in to be representative. The configuration of the software should
be constant. This means obvious configuration such as BI Server caching, but also things like version
numbers and patch levels of the OBIEE stack. Consider taking a snapshot of all main configuration files
(NQSConfig.INIinstanceconfig.xml, etc) to store alongside your test data.
It is advisable to turn off BI Server caching for the initial tests, and then re-enable it if required as a
properly tested optimisation step.
Measure
Before executing a performance test, we need to work out what data is to be collected and how to
collect it. The data to collect for the test itself includes:
Response times at the lowest grain possible/practical. Response times should be 2dimensional; transaction name, plus time offset from beginning of test.
Number of test users running over time (i.e. offset from the start of your test)
Environment details a diagram of the top-to-bottom stack, software versions, code levels,
and data volumes. Anything that is going to be relevant in assessing the validity of the test
results, or rerunning the test in the future.
System metrics if response times and user numbers are the eye-catching numbers in a test,
system metrics are the oft-missed but vital numbers that give real substance to a test and
make it useful. If response times are bad, we need to know why. If theyre good, we need to
know how good. Both these things come from the system metrics.
Query Metrics depending on the level at which the testing is being done, collecting metrics
for individual query executions can also be vital for aiding performance analysis. Consider
this more of a second round, drill down, layer of metrics rather than one to always collect in a
large test since it can be a large volume of data.
Here is a checklist to work through for executing our test:
Clear down the logs of any component going to be analyzed (eg nqquery.log, sawlog.log)
Record any configuration/environment changes from the original baseline
Record test start time
Restart the OBIEE stack (i.e. WLS, OPMN)
Start OS metric collection
Start OBIEE metric collection (if used)
Run the test.
Analyzing Results
Analyzing the data breaks down into several stages, and is often an iterative process:
1. Analyze the net response time. Is it as fast as it needs to be, at the required level of user
concurrency?
2. If the response time is too slow (too slow being defined by us or the users, in advance of the
test), then diagnose to determine why. This is another phase of analysis, breaking down the net
response time into its constituent parts, analyzing system and OBI metrics for signs of a bottleneck.
The output of this phase will be a hypothesis as to the cause of the performance problem
3. Based on the diagnosis of the issue, apply one change to the system to improve it, that is, resolve
the performance issue. Having made one change (and one change only), the original test should be
repeated and the analysis cycle repeated to determine the impact of the tuning.
Performance vs Capacity
Performance issues can be local to a report, or global to a system implementation and exacerbated by
a particular report or set of reports or both.
If an individual dashboard doesnt perform with a single user running it, there is clearly a
performance problem in the design (of the dashboard, RPD, or physical data model design or
implementation). However, if an individual dashboard runs fine with a single user but performance
gets worse and worse the more users that run it concurrently, this would indicate a capacity problem
in the configuration or physical capacity of the system.
An easy way to find out is this: before launching into multi-user-concurrency tests, test the dashboard
with a single user. Is the response time acceptable? If not, then we have a performance problem. We
have eliminated user concurrency from the equation entirely. If the response time is acceptable, then
move onto the user concurrency tests.
If a user concurrency test is performed and are trying to identify whether the issue is performance or
capacity, then look at what happens to the response time compared to the number of users running. If
the response time is constant throughout then it indicates a performance problem; if it is increasing as
more users are added it shows a capacity (which can include configuration) problem.
Ensure that the RPD has a correct dimensional Star model in the Logical (BMM) layer
Push the dimensional Star schema into the database; dont build against a normalized data
model if performance is key requirement
Create aggregate tables to support user queries
Use Usage Tracking to spot dimension levels that would benefit, or Summary Advisor on
Exalytics
Aggregate Persistence Wizard can do the initial hard work in plumbing the metadata into the
RPD.
The aggregate needs to be maintained.
Need to create associated aggregated dimension tables. These are particularly useful for good
performance of prompts where a distinct list of values at a level in a dimension are returned.
Make sure that aggregates get used when appropriate. Check the SQL that OBIEE is
generating isnt using a less efficient source table.
If users really want a data dump, consider doing that for them outside of the tool, for example
with sql*plus.
To still make use of the metadata model in the RPD, but without causing big problems in
Presentation Services, use an ODBC or JDBC call into the BI Server directly to get the data
dump out. Using this method, we could hook in Excel directly to the BI Server.
The fastest query is one that never runs challenge dashboard & reports designs. Analyze the
users workflow, to see if the reports built can support and make more efficient what the user
does.
Generally, we should avoid building the RPD against database views, as they can hinder
OBIEEs SQL generation with the result of sub-optimal queries. Database views can also
hide inefficient or unnecessary joins and logic. Put the underlying tables into the Physical
layer of the RPD instead and let OBIEE work with them.
Minimize the work being done by the BI Server. When using federation to join data across
databases it is unavoidable but generally it is to be avoided if within the same database.
Wherever possible, all work should be seen to be pushed down to the database.
Check how much data the BI Server pulls back from the database as a percentage of rows
returned to the user (low % is bad).
Monitor the BI Servers temp directory if this is filling up with large files it means that the
BI Server is having to crunch lots of data
How many database queries does one report trigger? (higher is generally less efficient).
Size hardware correctly to support the BI Server and Presentation Server (based on past
experience and/or Oracles documentation)
BI Server Caching
Caching is not the bread and butter of good performance. Caching should not be the sole means of
ensuring good performance. That is not to say caching is bad. Caching is good, when its use is
thought through and evaluated carefully. Caching has an overhead in terms of management, so we
need to manage the cache to make sure that stale data is not passed on to the end users. It might be
fast, but itll be wrong.
Caching can improve performance for several reasons:
It is typically faster to return the results of a query already run and stored in the cache than it
is to re-execute the necessary Physical SQL against the source database(s)
By not executing the Physical SQL on the database(s), we reduce both the load on the
database, and the amount of network traffic, freeing up these resources for non-cached
queries
The cache stores the results of a Logical SQL query sent to the BI Server, not the data that the
database returns. If the BI Server is doing lots of work, for example, federating and
aggregating lots of data across sources, then by caching the result post-processing, all of this
work can be avoided by subsequent cache hits.
Pre-seeding the cache is a good idea, so that users all benefit from the cache, not just those who come
along after the first user has run a report that gets stored in the cache. There are a couple of ways to
pre-seed the cache:
1. A BI Delivers Agent with the Destination set to System Services : Oracle BI Server Cache.
This agent could optionally be set to run straight after your ETL batch has loaded the data.
2. Directly from an ETL tool via ODBC/JDBC using the SASeedQuery statement.
The flip-side of pre-seeding the cache is purging it, and there are two sensible ways to do this :
1. Event Polling Table
2. ODBC/JDBC command to the BI Server, triggered by the completion of a data load (ETL)