Benchmarking Guide
List of Tables
Table 1: Acronyms
Table 2: Terminology
Table 3: Virtual Network Interrupt Coalescing and SplitRx Mode
Table 4: Power Management Profiles
Table 5: Virtualization Performance by Processor
Table 6: DA-MP Test Set Up
Table 7: DA-MP Alarms/Events
Table 8: DA-MP Utilization Metrics
Table 9: DA-MP Connection Metrics
Table 10: SDS DP Message Details
Table 11: SDS DP Alarms/Events
Table 12: SDS DP Utilization Metrics
Table 13: DP SOAM Metrics
Table 14: SS7 MP Message Detail
Table 15: SS7 MP Alarms/Events
Table 16: SS7 MP Metrics
Table 17: VSTP-MP Utilization Metrics
Table 18: PDRA Test Call Model
Table 19: SBR (b) Alarms/Events
Table 20: Session SBR (b) VMs Metrics
Table 21: Binding SBR (b) Server Metrics
Table 22: Stateful and Stateless Counter Measures
Table 23: 10K MPS DA-MP VM Profile
List of Figures
Figure 1: DA-MP Testing Topology
Figure 2: DA-MP Message Sequence
Figure 3: SDS DP Testing Topology
Figure 4: SDS DP Message Sequence
Figure 5: SS7 MP Testing Topology
Figure 6: SS7 MP Message Flow
Figure 7: SBR Testing Topology
Figure 8: PDRA Message Sequence
Figure 9: DSA Testing Topology
Figure 10: SCEF Testing Topology
Figure 11: EIR Testing Topology
Figure 12: IPFE on Ingress Side Only
Figure 13: IPFE on both Ingress and Egress Sides
References
[1] Performance Best Practices for VMware vSphere® 6.0, available at
https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/techpaper/vmware-perfbest-practices-vsphere6-0-white-paper.pdf
The following are available at Oracle.com on the Oracle Help Center (OHC):
[2] DSR Alarms, KPIs, and Measurements
[3] DSR Cloud Deployable Installation Guide
[4] Policy and Charging Application User’s Guide
Acronyms
Table 1: Acronyms
Acronym Description
API Application Programming Interface
ARR Application Route Rule
ART Application Route Table
CM Counter Measure
COTS Commercial Off The Shelf
CPU Central Processing Unit
DA-MP Diameter Agent Message Processor
DB Database
DP Database Processor
DSA Diameter Security Application
DSR Diameter Signaling Router
EIR Equipment Identity Register
ETG Egress Throttle Group
FABR Full Address Based Resolution
FQDN Fully Qualified Domain Name
GB Gigabyte
HDD Hard Disk Drive
Terminology
Table 2: Terminology
1+1 Redundancy: For every 1, an additional 1 is needed to support redundant capacity. The specific redundancy scheme is not implied (for example, active-active or active-standby).
Geo-Diverse: Refers to DSR equipment located at geographically separated sites/datacenters.
Geo-Redundant: A node at a geo-diverse location that can assume the processing load for one or more other DSR signaling nodes.
Ingress Message Rate: A measure of the total Diameter messages per second ingressing the DSR. For this measure, a message is defined as any Diameter message that the DSR reads from a Diameter peer connection, independent of how the message is processed by the DSR.
Flexibility
DSR is flexibly deployed into many different clouds. It is unlikely that any two clouds are exactly the same, and operators need to optimize for different priorities (for example, power consumption may be critical for one operator and WAN latency for another), varying sets of applications, and differing operational requirements. The performance and capacity of the DSR varies in each cloud, and the DSR application can no longer provide a guaranteed level of performance and capacity. However, the operator still needs to:
• Plan their networks: DSRs consume resources, so what impact does the DSR have on their datacenters?
• Deploy DSR with predictable (if not exact) performance and capacity.
• Manage the capacity and performance of the DSR in their datacenters.
Methodology
There is a set of DSR specific tools, methods and documentation to assist in planning, deploying, and
managing the capacity and performance of a cloud deployable DSR. This toolset is expected to be used
in conjunction with information and tools provided by the infrastructure (hardware, cloud manager,
hypervisor) vendors.
• Planning for cloud deployable DSR
• Estimating required resources for a given DSR cloud deployment
• Contact your Oracle Sales Consultant. They have access to the DSR Cloud Dimensioning tool, which estimates DSR cloud resources. This tool takes into account many factors not covered in this benchmarking guide, such as the overhead for optional DSR features and recommended margins for redundancy.
• DSR Cloud Customer Documentation
• Can be found with the DSR customer documentation at Oracle.com on the Oracle Help Center (OHC).
• Look under the topic: “Cloud Planning, Installation, Upgrade, and Disaster Recovery.”
• Deploy DSR with predictable performance and capacity
• It is recommended that the DSR is run through a benchmark on the target cloud infrastructure to determine the likely capacity and performance in that infrastructure. This information can then be used when planning the deployment.
Infrastructure Environment
This section describes the infrastructure that was used for benchmarking. In general, the defaults or recommendations for hypervisor settings are available from the infrastructure vendors (for example, ESXi vendor recommendations and defaults are found in [1]). Whenever possible, the DSR recommendations align with vendor defaults and recommendations. Benchmarking was performed with the settings described in this section. Operators may choose different values; better or worse performance compared to the benchmarks may be observed. When recommendations other than vendor defaults or recommendations are made, additional explanations are included in the applicable section.
There is a subsection included for each infrastructure environment used in benchmarking.
CPU Technology
The CPUs in the servers used for the benchmarking were the Intel Xeon E5-2699 v3. Servers with different processors give different results. In general, there are two issues when mapping the benchmarking data in this document to other CPUs:
1. The per-thread performance of a CPU is the main attribute that determines VM performance, since the number of threads is fixed in the VM sizing as shown in section DSR VM Configurations. A good metric for comparing the per-thread performance of different CPUs is the integer performance measured by SPECint2006 (CINT2006), defined by spec.org. The mapping of SPECint2006 ratios to DSR VM performance ratios is not exact, but it is a good measure for determining whether a different CPU is likely to run the VMs faster or slower than the benchmark results in this document. Conversely, CPU clock speed is a relatively poor indicator of relative CPU performance. Within a given Intel CPU generation (v2, v3, v4, etc.) there are other factors that affect per-thread performance, such as the potential turbo speed of the CPU vs. the cooling solution in a given server. Comparing between Intel CPU generations, there is a generation-over-generation improvement in CPU throughput vs. clock speed, which means that even a newer-generation chip with a slower clock speed may run a DSR VM faster.
2. The processors must have enough cores that a given VM can fit entirely into a NUMA node. Splitting
a VM across NUMA nodes greatly reduces the performance of that VM. The largest VM size (see
section DSR VM Configurations) is 12 vCPUs. Thus, the smallest processor that should be used is a
6 core processor. Using processors with more cores typically makes it easier to “pack” VMs more
efficiently into NUMA nodes, but should not affect individual VM CPU-related performance otherwise
(see the next note though).
3. One caveat about CPUs with very high core counts is that the user must be aware of potential
bottlenecks caused by many VMs contending for shared resources such as network interfaces and
ephemeral storage on the server. These tests were run on relatively large CPUs (18 physical cores
per chip), and no such bottlenecks were encountered while running strictly DSR VMs. In clouds with
VMs from other applications potentially running on the same physical server as DSR VMs, or in future
processor generations with much higher core counts, this potential contention for shared server
resources has to be watched closely.
Recommendation: The selected VM sizes should fit within a single NUMA node (for instance, 6 physical cores with hyper-threading for the VMs that require 12 vCPUs). Check the performance of the target CPU type against the benchmarked CPU using per-thread integer performance metrics, as in the sketch below.
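As a rough illustration of that check, the following sketch scales a benchmarked capacity by the per-thread SPECint ratio between the target CPU and the benchmarked CPU. The capacity figure and the SPECint values are placeholders for illustration only; look up actual SPECint2006 results for the CPUs being compared.

```python
# Sketch: first-order capacity estimate for a different CPU, scaling a
# benchmarked result by the per-thread integer-performance ratio. The
# numbers below are placeholders, not published SPECint2006 results.

BENCHMARKED_MPS = 36_000          # e.g., the DA-MP relay result from this guide
SPECINT_BENCHMARKED_CPU = 1.00    # per-thread score of the Xeon E5-2699 v3, normalized
SPECINT_TARGET_CPU = 1.15         # assumed per-thread score of the target CPU

def estimate_mps(benchmarked_mps: float, target: float, baseline: float) -> float:
    # The SPECint-to-DSR mapping is not exact; treat the result as a likely
    # direction and rough magnitude, not a guarantee.
    return benchmarked_mps * (target / baseline)

print(f"{estimate_mps(BENCHMARKED_MPS, SPECINT_TARGET_CPU, SPECINT_BENCHMARKED_CPU):,.0f} MPS")
```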
Recommendation: When planning for physical server capacity, a packing ratio of 80% is a good guideline (illustrated in the sketch below). Packing ratios greater than 95% might affect the benchmark numbers, since there are not sufficient server resources to handle the overhead of the host OS.
Infrastructure Tuning
The following parameters should be set in the infrastructure to improve DSR VM performance. The instructions for setting them for a given infrastructure are included in the DSR Cloud Installation Guide [3].
• Txqueuelen: Tuned on the compute hosts. The default value of 500 is too small; the recommendation is to set this parameter to 30,000, which increases the network throughput of a VM (see the sketch after this list).
• Ring buffer increase on the physical Ethernet interfaces: The default is too small. The recommendation is to set both the receive and transmit values to 4096.
• Multiqueue: Multiqueue should be enabled on any IPFE VMs to improve performance. It is already enabled by default on ESXi and needs to be set for OpenStack.
• Advanced NUMA settings (ESXi only): The SwapLoadEnabled and SwapLocalityEnabled options should be disabled. This prevents the ESXi scheduler from moving VMs from one NUMA node to another in an attempt to optimize performance. Such moves are not appropriate for VMs that are processing real-time loads, since messages might be delayed during the move.
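The following sketch shows one way the host-side tunings above might be applied on a Linux/KVM compute host. The interface names are examples, and the DSR Cloud Installation Guide [3] remains the authoritative procedure.

```python
# Sketch (Linux/KVM host, run as root): apply the txqueuelen and ring-buffer
# tunings described above using the standard ip and ethtool utilities.
import subprocess

def tune_interface(dev: str) -> None:
    # Raise the transmit queue length from the 500 default to 30,000.
    subprocess.run(["ip", "link", "set", "dev", dev, "txqueuelen", "30000"], check=True)
    # Raise both receive and transmit ring buffers on the physical NIC to 4096.
    subprocess.run(["ethtool", "-G", dev, "rx", "4096", "tx", "4096"], check=True)

for dev in ("eth0", "eth1"):   # example physical interfaces carrying DSR traffic
    tune_interface(dev)
```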
Device Drivers
VirtIO is a virtualizing standard for network and disk device drivers where just the guest’s device driver
“knows” it is running in a virtual environment, and cooperates with the hypervisor. This enables guests to
get high performance network and disk operations, and gives most of the performance benefits of para-
virtualization.
Vhost-net provides improved network performance over Virtio-net by totally bypassing QEMU as a fast
path for interruptions. The vhost-net runs as a kernel thread and interrupts with less overhead providing
near native performance. The advantages of using the vhost-net approach are reduced copy operations,
lower latency, and lower CPU usage.
Recommendation: QCOW2 (DSR does not involve processes that are disk I/O intensive).
Recommendation: Bare container format.
Recommendation: vm.swappiness = 10
Kernel Same Page Merging
Kernel Same-page Merging (KSM), used by the KVM hypervisor, allows KVM guests to share identical memory pages. These shared pages are usually common libraries or other identical, high-use data. KSM allows for greater guest density of identical or similar guest operating systems by avoiding memory duplication. KSM enables the kernel to examine two or more already-running programs and compare their memory. If any memory regions or pages are identical, KSM reduces multiple identical memory pages to a single page, which is then marked copy-on-write. If the contents of the page are modified by a guest virtual machine, a new page is created for that guest.
Recommendation: KSM service set to active and ksmtuned service running on KVM hosts.
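As a quick check that KSM is actually merging pages on a KVM host, the standard /sys/kernel/mm/ksm counters can be read; a short sketch:

```python
# Sketch: confirm KSM is running and report its page-merging counters using
# the standard Linux /sys/kernel/mm/ksm interface (KVM hosts).

def ksm_value(name: str) -> int:
    with open(f"/sys/kernel/mm/ksm/{name}") as f:
        return int(f.read())

print("KSM running:", ksm_value("run") == 1)
print("Shared pages in use:", ksm_value("pages_shared"))
print("Page-table entries pointing at them:", ksm_value("pages_sharing"))
```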
Zone Reclaim Mode
When an operating system allocates memory to a NUMA node, but the NUMA node is full, the operating
system reclaims memory for the local NUMA node rather than immediately allocating the memory to a
remote NUMA node. The performance benefit of allocating memory to the local node outweighs the
performance drawback of reclaiming the memory. However, in some situations reclaiming memory
decreases performance to the extent that the opposite is true. In other words, in these situations,
allocating memory to a remote NUMA node generates better performance than reclaiming memory for the
local node.
A guest operating system causes zone reclaim in the following situations:
• When you configure the guest operating system to use huge pages.
• When you use KSM to share memory pages between guest operating systems.
Configuring huge pages and running KSM are both best practices for KVM environments. Therefore, to
optimize performance in KVM environments, it is recommended to disable zone reclaim.
Recommendation: Disable Zone Reclaim.
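A minimal sketch of applying the two memory recommendations above on a KVM host; the /proc/sys paths are standard Linux, and the values should also be persisted in /etc/sysctl.conf to survive reboots.

```python
# Sketch (run as root on KVM hosts): disable zone reclaim and set the
# recommended vm.swappiness value via the standard /proc/sys interface.

def set_sysctl(name: str, value: str) -> None:
    path = "/proc/sys/" + name.replace(".", "/")
    with open(path, "w") as f:
        f.write(value)

set_sysctl("vm.zone_reclaim_mode", "0")   # allocate remotely rather than reclaim
set_sysctl("vm.swappiness", "10")         # per the recommendation above
```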
Network Settings
Network Adapters
There are a number of networking adapter choices when deploying a virtual machine:
• E1000: This adapter is an emulated version of the Intel 82545EM Gigabit Ethernet Controller.
• VMXNET3: This adapter has less CPU overhead compared to E1000 or E1000E, and is more stable than either. VMXNET3 is the next generation of paravirtualized NIC, designed for performance. This is the vSphere default setting.
• The VMXNET family implements an idealized network interface that passes network traffic between the virtual machine and the physical network interface cards with minimal overhead.
Recommendation: VMXNET3. No observable differences were noticed between E1000 and VMXNET3
for DSR application testing.
Virtual Network Interrupt Coalescing and SplitRx Mode
• Virtual network interrupt coalescing: This option reduces the number of interrupts, thus potentially decreasing CPU utilization. It may, however, increase network latency. By default, this is enabled in ESXi 5.5 and 6.0.
• SplitRxMode: This option uses multiple physical CPUs to process network packets received in a single network queue. By default, this is enabled in ESXi 5.5 and 6.0 for the VMXNET3 adapter type.
Table 3: Virtual Network Interrupt Coalescing and SplitRX Mode
Network Setting                 Default          Interrupt Coalescing Disabled   SplitRxMode Disabled
DSR.CPU (Avg/Max)               ~40.7%/~44.5%    ~42%/~45.5%                     ~38.8%/~40.6%
System.CPU_UtilPct (Avg/Max)    ~44.4%/~53%      ~44.4%/~55.5%                   ~41.8%/~53.3%
Latency                         Observed as the same in DSR application benchmarking across all settings
The data in the table above was collected from a DA-MP, but similar trends are observed on the other DSR virtual server types.
A small but measurable difference was observed between the balanced and high-performance power settings. However, the data did not indicate a large enough deviation to vary from the hardware vendor’s guidelines. DSR benchmark testing was performed with the ESXi power mode set to Balanced Performance.
Recommendation: Refer to host hardware vendor power management guidelines for virtualization.
Recommendation: Default settings. No visible advantage was observed when comparing iterative memory statistics through /proc/meminfo, and no visible advantage could be observed in using large pages.
Benchmark Testing
The way the testing was performed and the benchmark test set-up are the same for each benchmark infrastructure. Each section below describes the common set-up and procedures used to benchmark, and then the specific results are provided for each benchmark infrastructure. In general, the benchmarking results for VMware/ESXi vs. OpenStack/KVM are close enough that only one set of numbers is shown.
DA MP Relay Benchmark
Overview
This benchmarking case illustrates the conditions for overload of a DSR DA-MP. The simulator message rate is increased until the DA-MP overload mechanisms are triggered, causing messages to be discarded.
Figure 1: DA-MP Testing Topology (MME and HSS simulators connected through IPFE-01 and IPFE-02 to DA-MP-01 through DA-MP-X)
DA-MP CPU utilization (dsr.cpu) can be increased from 53% to higher levels by means of configuration changes: setting the DOC/CL1/CL2 discard percentages to 0 and enabling multiqueue on all hosts. With this configuration, note that all discards occur in a single step at CL3 for all incoming and outgoing messages. The default CPU threshold configuration remains the same: DOC at 54% CPU, CL1 at 60% CPU, and CL2 at 66% CPU.
With the above configuration changes, Relay traffic is measured at 36K MPS at 66% dsr.cpu for a DA-MP profile as defined in Appendix A, DSR VM Configurations. Hyper-threading was enabled.
Message Flow
Figure 2 illustrates the message sequence for this benchmark case.
Figure 2: DA-MP Message Sequence (the ULR from the MME simulator is relayed through the DSR to the HSS simulator, and the ULA is returned along the same path)
Indicative Alarms/Events
During benchmark testing, the following alarms/events were observed as the DA-MP crossed into congestion.
Table 7: DA-MP Alarms/Events
Number Severity Server Name Description
DA-MP Utilization
In this section, only the key recommended metrics for planning expansions of the DA-MP are discussed. There are many more measurements available on the DA-MP; these can be found in [2].
The key metrics for managing the DA-MP are:
Table 8: DA-MP Utilization Metrics
Measurement ID: 10204
Name: EvDiameterProcessAvg
Group: MP Performance
Scope: Server Group
Description: Average percent Diameter Process CPU utilization (0-100%) on an MP server.
Recommended Usage, Condition: When running in normal operation with a mate in normal operation, this measurement exceeds 30% of the rated maximum capacity, OR it exceeds 60% of the rated capacity when running without an active mate.
Recommended Usage, Actions: If additional growth in the system is anticipated, consider adding an additional DA-MP. It is possible that the traffic mix is different than originally dimensioned (for example, 40% IPSEC instead of the originally dimensioned 5%). In these cases, re-assess the dimensioning with the actual traffic/application mix and add additional DA-MPs as needed.

Measurement ID: 10205
Name: TmMpCongestion
Group: MP Performance
Scope: Server Group
Description: Total time (in milliseconds) spent in the local MP congestion state.
Recommended Usage, Condition: Any number greater than 0.
Recommended Usage, Actions: After eliminating any configuration issues, anomalous traffic spikes, or major failure conditions, this is a late indication that additional DA-MPs are needed. It is highly desirable that planning for additional DA-MPs happens before this condition occurs.

Measurement ID: 10133
Name: RxMsgSizeAvg
Group: Diameter Performance
Scope: Server Group
Description: The average ingress message size in Diameter payload octets.
Recommended Usage, Condition: Average message size > 2000 bytes.
Recommended Usage, Actions: DA-MP dimensioning assumes a 2K average message size. This information is used to dimension IPFEs and DIH/IDIH. No action is required if there are no alarms associated with the PDU message pool (available memory for messages). If the PDU message pool is exhausting, contact Oracle.

Measurement ID: 31056
Name: RAM_UtilPct_Average
Group: System
Scope: System
Description: The average committed RAM usage as a percentage of the total physical RAM.
Recommended Usage, Condition: The average RAM utilization exceeds 80%.
Recommended Usage, Actions: Contact Oracle.
Suggested Resolution
If the congestion alarms shown in Table 7: DA-MP Alarms/Events are seen, then add additional DA-MPs to avoid CPU congestion. However, if the connection alarm shown in Table 9: DA-MP Connection Metrics is seen, then adding additional connections for that peer helps distribute the load and alleviate the connection alarm.
In general, the growth mechanism for DA-MPs is via horizontal scaling, that is, by adding additional DA-MPs (see the sketch below). The current maximum number of DA-MPs per DSR signaling NE is 32.
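One way to turn the utilization guidance above into a rough DA-MP count, assuming the planning target is to stay under 30% of rated capacity per DA-MP while mated; the rated capacity is a placeholder and should be replaced with the value benchmarked on the target cloud.

```python
# Sketch: rough DA-MP count for a target load, keeping each mated DA-MP under
# the 30%-of-rated-capacity planning threshold discussed above.
import math

RATED_MPS = 36_000        # placeholder per-DA-MP rated capacity
PLANNING_UTIL = 0.30      # stay under this with an active mate
MAX_DA_MPS_PER_NE = 32

def da_mps_needed(target_mps: float) -> int:
    n = math.ceil(target_mps / (RATED_MPS * PLANNING_UTIL))
    if n > MAX_DA_MPS_PER_NE:
        raise ValueError("exceeds the 32 DA-MP per-NE limit; split across NEs")
    return n

print(da_mps_needed(100_000))   # -> 10 DA-MPs for 100k MPS under these assumptions
```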
SDS DP
Figure 3: SDS DP Testing Topology (MME simulators connected through IPFE-01 and IPFE-02 to the DA-MPs, which query the SDS DPs and route toward the HSS simulators)
SDS DB Details
The SDS database was first populated with subscribers. This population simulates real-world scenarios likely encountered in a production environment and ensures the database is of substantial size to be queried against.
• SDS DB Size: 300 Million Routing Entities (150 M MSISDNs/150 M IMSIs)
• AVP Decoded: User-Name for IMSI
A new SDS profile (Large) is introduced to enhance the capacity of the SDS FABR database to 1 billion routing entities. The Large profile is defined in Appendix A, DSR VM Configurations, based on the following 1-billion-entry configuration (reproduced in the sketch after this list):
• 260 million subscribers, each having 2 IMSIs and 1 MSISDN = 780 million routing entities. IMSI = 15 bytes, MSISDN = 11 bytes.
• 220 million IOT records, of 27 bytes each.
• Destinations: 300 entries.
• One destination per routing entity is configured.
• Longest FQDN configured: 32 characters.
• Longest realm configured: 13 characters.
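The routing-entity arithmetic above can be reproduced directly; a small sketch:

```python
# Sketch: the 1-billion-entry arithmetic for the Large SDS profile.
subscribers = 260_000_000
routing_entities = subscribers * (2 + 1)   # 2 IMSIs + 1 MSISDN each -> 780 million
iot_records = 220_000_000                  # 27 bytes each
print(f"{routing_entities + iot_records:,} total entries")   # 1,000,000,000
```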
Message Flow
Figure 4: SDS DP Message Sequence (the ULR (S6a) is received by the DSR, the DP is queried for address resolution, and the ULR is routed to the HSS; the ULA (S6a) is returned)
Measuring DP Utilization
In this section, only the key recommended metrics for managing the performance of the DP are discussed. There are many more measurements available on the DP; these can be found in [2].
There are two key components of the subscriber database within a DSR signaling node: the Database Processors (DPs) and the OAM component, which runs on the System OAM VMs. The key metrics for managing the DPs are:
Table 12: SDS DP Utilization Metrics
Measurement ID: 4170
Name: DpQueriesReceived
Group: DP
Scope: System (per DP)
Description: The total number of queries received per second.
Recommended Usage, Condition: When running in normal operation with a mate in normal operation, this measurement exceeds 30% of the benchmarked maximum capacity, OR it exceeds 60% of the benchmarked capacity when running without an active mate.
Recommended Usage, Actions: The operator should determine whether the traffic requiring subscriber database look-ups is continuing to grow. If so, the additional rate of database lookups should be estimated and additional DPs should be planned for.
While memory is a consideration for the DPs, the SDS provides the centralized provisioning for the entire
DSR network.
The OAM application related to the DPs (DP SOAM) runs at each DSR signaling NE requiring the Full Address Resolution feature. Currently these are fixed-size VMs, with no horizontal or vertical scaling recommended, as no need for scaling these VMs has been observed. The following two metrics should be monitored:
Table 13: DP SOAM Metrics
Suggested Resolution
The growth mechanism for DPs is via horizontal scaling, by adding additional DPs. The current maximum number of DPs per DSR signaling NE is 10. This amount of scaling currently well exceeds the capacity of the DA-MPs driving queries to the DPs.
SS7 MP
Overview
The SS7-MP is not supported since DSR 8.3. The SS7-MP server type is responsible for transforming messages between the SS7 and Diameter protocols. Both Diameter and MAP messages were sent from the simulator to the DSR. The SS7-MP is based on similar technology to the DA-MP benchmarked in previous sections. The licensed capacity of the SS7-MP is currently limited to 12k MPS per SS7-MP, even though its performance is similar to the DA-MP. This section explains the functions of the SS7-MP and notes the SS7-MP-specific events and measurements used to monitor it.
Figure 5: SS7 MP Testing Topology (an MME/SGSN simulator connected through IPFE-01 and IPFE-02 to the DA-MPs and SS7-MPs, with an HLR simulator on the SS7 side)
Message Flow
Figure 6: SS7 MP Message Flow (between the MME/SGSN, DSR, SS7-MP, and HLR: a ULR (S6a) is translated toward the HLR and the translated ULA is returned as a ULA (S6a); a MAP Purge_MS is translated to a PUR (S6a), with the PUA (S6a) returned as a translated Purge_MS response)
Table 14: SS7 MP Message Detail
Detail                Distribution
Diameter originated   50%
MAP originated        50%
Indicative Alarms/Events
Table 15: SS7 MP Alarms/Events
19250 (Minor; DA-SS7MP) SS7 Process CPU Utilization: The SS7 Process, which is responsible for handling all SS7 traffic, is approaching or exceeding its engineered traffic handling capacity.
Suggested Resolution
Add additional SS7 MPs to accommodate additional MAP traffic.
The growth mechanism for SS7 MPs is via horizontal scaling by adding additional SS7 MPs. There can
be up to 8 SS7 MPs in a DSR.
vSTP MP
Overview
The vSTP-MP server type is a virtualized STP that supports M2PA, M3UA, and EIR. It can be deployed either with other DSR functionality as a combined DSR/vSTP, or as a standalone virtualized STP without any DSR functionality.
The vSTP MP requires 8 GB of RAM. Up to 32 vSTP MPs can be configured, with a capacity of 10k MPS per vSTP MP, giving a total of 320k MPS. For the EIR application, the capacity is 2k MPS per vSTP MP, giving a total of 64k MPS.
Suggested Resolution
In general, the growth mechanism for vSTP MPs is via horizontal scaling, that is by adding additional
vSTP MPs. The current maximum number of vSTP MPs per DSR signaling NE is 32.
Topology
Figure 7: SBR Testing Topology (DA-MPs with Session SBR (SBR-s) and Binding SBR (SBR-b) server groups)
Figure 8: PDRA Message Sequence (Gx: CCR-I/CCA-I, four CCR-U/CCA-U exchanges, two RAR/RAA exchanges, CCR-T/CCA-T; Rx: AAR/AAA, two RAR/RAA exchanges, STR/STA, followed by CCR-T/CCA-T)
Indicative Alarms/Events
Table 19: SBR (b) Alarms/Events
19825 (Minor/Major/Critical; DA-MP) Communication Agent Transaction Failure Rate: The number of failed transactions during the sampling period has exceeded configured thresholds.
19826 (Major; DA-MP, SBR(s)) Communication Agent Connection Congested: Communication Agent connection congested.
19846 (Major; DA-MP, SBR(s)) Communication Agent Resource Degraded: Communication Agent resource degraded.
22051 (Critical; SOAM) Peer Unavailable: Unable to access the Diameter peer because all of the Diameter connections are down.
22101 (Major; SOAM) Connection Unavailable: The connection is unavailable for Diameter request/answer exchange with the peer.
22715 (Minor; SBR(s)) SBR Audit Suspended: SBR audit is suspended due to congestion.
22725 (Minor/Major; SBR(s)) SBR Server In Congestion: SBR server operating in congestion.
22732 (Minor/Major; SBR(s)) SBR Process CPU Utilization Threshold Exceeded: SBR process CPU utilization threshold has been exceeded.
Suggested Resolution
If additional binding or MPS capacity is required, then additional server groups may be added to an existing SBR(b) using the SBR reconfiguration feature. There can be up to 8 server groups in the SBR(b).
Topology
The number of concurrent sessions an SBR-u can handle is memory dependent. Benchmarking shows that the SBR-u can hold up to 5 million records, at a record size of ~2 KB, before reaching 70% of system RAM.
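A first-order sketch of this memory bound, using the ~2 KB record size and 70% RAM ceiling observed in benchmarking; the VM RAM size is an example.

```python
# Sketch: estimate SBR-u record capacity from available RAM, using the ~2 KB
# record size and 70%-of-RAM ceiling observed above. RAM size is an example.

RECORD_BYTES = 2 * 1024
RAM_CEILING = 0.70

def max_records(system_ram_gb: float) -> int:
    usable_bytes = system_ram_gb * (1024 ** 3) * RAM_CEILING
    return int(usable_bytes // RECORD_BYTES)

print(f"{max_records(16):,} records")   # ~5.9 million on a 16 GB VM (illustrative)
```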
Topology
Suggested Resolution
The growth mechanism for DA-MPs and OCSG App Servers is via horizontal scaling, that is, by adding additional VMs. The current maximum number of DA-MPs or OCSG App Servers per DSR signaling NE is 32. If additional SCEF context information or MPS capacity is required, then additional server groups may be added to an existing SBR(u) using the SBR reconfiguration feature. There can be up to 64 server groups in the SBR(u).
NOAM
Overview
Specific benchmark data for the DSR NOAM is not provided in this release, as the DSR cloud deployable footprint is modest and system testing of the DSR indicates that NOAM growth is not currently needed.
Indicative Alarms/Events
The DSR Network OAM is potentially a RAM-intensive function. The Network OAM is designed not to exceed the available memory; however, RAM is the most likely resource constraint.
Measurements
Measuring Network OAM Utilization
In this section, only the key recommended metrics for managing the performance of the Network OAM are discussed. There are many more measurements available; these can be found in [2].
The key metrics for managing the Network OAM servers are:
Table 25: Network OAM Metrics
Suggested Resolution
The NOAM can be vertically scaled; however, this action is not anticipated to be necessary given the DSR cloud deployable footprint. Contact Oracle support for additional guidance as needed.
Indicative Alarms/Events
A key metric for managing the System OAM VM is:
Table 26: System OAM Metrics
Suggested Resolution
The DSR SOAM can be vertically scaled; the criterion for vertically scaling to a SOAM large profile (refer to the DSR VM Configurations section) is more than 2000 connections. Horizontal scaling of the DSR SOAM is not supported or indicated in this release. Contact Oracle support for additional guidance as needed.
IPFE
Overview
The IPFE was exercised in both VMware and KVM environments. Table 27 shows the measured capacity of the IPFE. Note that there are three main factors that determine the throughput limits:
• The number of TSAs (one or more) on the IPFE
• Whether there are more than 2,000 connections
• Whether the average message size is less than the MTU size
Under most conditions, the throughput of the IPFE is 2 Gbit/sec. However, under the worst case of all three of the above conditions, the throughput of the IPFE drops to 1.6 Gbit/sec.
When monitoring IPFE capacity, both the guest and host CPU utilization should be monitored. Much of the IPFE work is done at the host kernel level, so the CPU utilization numbers returned at the IPFE application level do not fully reflect all of the IPFE overhead on the system.
Table 27: IPFE Throughput
                              Single TSA on IPFE Pair               Two or more TSAs on IPFE Pair (total on both TSAs)
                              Avg Msg < 1 MTU   Avg Msg >= 1 MTU    Avg Msg < 1 MTU   Avg Msg >= 1 MTU
2,000 connections or less     2 Gbit/sec        2 Gbit/sec          2 Gbit/sec        2 Gbit/sec
More than 2,000 connections   2 Gbit/sec        1.6 Gbit/sec        2 Gbit/sec        2 Gbit/sec
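To relate these bit-rate limits to message rates, divide by the average message size; a small sketch:

```python
# Sketch: convert the Table 27 throughput limits into messages per second
# for a given average Diameter message size.

def ipfe_mps(limit_gbit_per_sec: float, avg_msg_bytes: int) -> float:
    return (limit_gbit_per_sec * 1e9) / (8 * avg_msg_bytes)

print(f"{ipfe_mps(2.0, 2000):,.0f} msg/sec")   # 2 Gbit/sec at 2,000-byte messages -> 125,000
print(f"{ipfe_mps(1.6, 2000):,.0f} msg/sec")   # worst-case 1.6 Gbit/sec -> 100,000
```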
Indicative Alarms/Events
In this section, only the key recommended metrics for managing the performance of the IPFE are discussed. There are many more measurements available on the IPFE; these can be found in [2].
Measurements
The key metrics for managing the IPFE VMs are:
Table 28: IPFE Metrics
Measurement ID: 5203
Name: RxIpfeBytes
Group: IPFE Performance
Scope: Server Group
Description: Bytes received by the IPFE.
Recommended Usage, Condition: (Bytes × 8 bits/byte) / (time interval in seconds) exceeds the benchmarked capacity (Gbps).
Recommended Usage, Actions: If the traffic is expected to grow, consider adding an additional IPFE pair.

Measurement ID: 31052
Name: CPU_UtilPct_Average (IPFE)
Group: System
Scope: System
Description: The average CPU usage from 0 to 100% (100% indicates that all cores are completely busy).
Recommended Usage, Condition: When running in normal operation with a mate in normal operation, this measurement exceeds 30% occupancy, OR it exceeds 60% occupancy when running without an active mate.
Recommended Usage, Actions: Contact Oracle.

Measurement ID: 31056
Name: RAM_UtilPct_Average (IPFE)
Group: System
Scope: System
Description: The average committed RAM usage as a percentage of the total physical RAM.
Recommended Usage, Condition: The average RAM utilization exceeds 80%.
Recommended Usage, Actions: Contact Oracle.
Suggested Resolution
Horizontal scaling by adding IPFEs as indicated, up to two pairs in total per DSR signaling NE.
Suggested Resolution
Contact Oracle support.
DA MP XMI 0.2 MPS See explanation above for how to calculate the
DA MP IMI Dependent signaling network traffic.
w/IWF XSI
vSTP MP
SS7 MP
Table 32 shows some guidelines for mapping the logical OCDSR networks (XMI, IMI, etc.) to interfaces.
There is nothing fixed about these assignments in the application, so they can be assigned as desired if
the customer has other requirements driving interface assignment.
Table 32: Typical OCDSR Network to Device Assignments
VM Name    OAM (XMI)  Local (IMI)  Sig A (XSI1)  Sig B (XSI2)  Sig C (XSI3)  (…)  Sig D (XSI6)  Replication (SBR Rep)  DIH Internal
DSR NOAM   eth0       eth1
DSR SOAM   eth0       eth1
DA-MP      eth0       eth1         eth2          eth3          eth4          (…)  eth17                                eth18
IPFE       eth0       eth1         eth2          eth3          eth4          (…)  eth17
SS7 MP     eth0       eth1         eth2          eth3          eth4          (…)  eth17                                eth18
SBRB       eth0       eth1                                                                      eth2
SBRS       eth0       eth1                                                                      eth2
SBRU       eth0       eth1                                                                      eth2
Traffic Mix
The different classes of applications tested (Relay, Database, Stateful) have significantly different MPS results for a fixed infrastructure configuration (server type, VM size). The capacity for an OCDSR running a mixture of these types can be calculated by using the percentage of traffic of each type. For example, using the values from Table 33, consider an OCDSR with the following traffic mix:
• Relay 40% (18k MPS)
• Database 30% (16k MPS)
• Stateful 30% (13k MPS)
The effective throughput for this traffic mix would be:
(40% × 18k) + (30% × 16k) + (30% × 13k) = 15.9k MPS
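The same weighted-average calculation, as a small sketch:

```python
# Sketch: effective capacity for a traffic mix as the weighted average of the
# per-class benchmarked capacities (values from the example above).

mix = {
    "relay":    (0.40, 18_000),   # (traffic share, benchmarked MPS)
    "database": (0.30, 16_000),
    "stateful": (0.30, 13_000),
}
effective = sum(share * mps for share, mps in mix.values())
print(f"{effective:,.0f} MPS")   # -> 15,900
```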
VM Configuration
The tested VM configurations are given in section DSR VM Configurations and section DSR DA-MP Profiles. It is recommended that these VM configurations are used as tested for the most predictable results. If for some reason the customer desires to change them for a given installation, keep the following guidelines in mind:
• The installation and upgrade procedures for the VMs require a minimum of 70 GB of disk storage. While assigning less storage than this may appear to work during installation, it is likely to cause failures during the upgrade procedures.
• The IWF function requires 24 GB of memory to run. It does not come into service with any smaller memory allocation.
• Adding vCPUs to the configurations may increase performance, but only up to a point, since there may not be enough threads to efficiently take advantage of the extra vCPUs.
• Reducing the vCPU counts should not be done for any VM except the DA-MPs. Configurations that appear to run fine under normal traffic conditions may not have sufficient capacity under recovery conditions or under heavy loads (for instance, running reports on a SOAM while it is doing a backup). The DA-MP vCPU count should not be lower than 4 vCPUs (see section Small DSR VM Configuration for an example small DSR configuration).