SD-WAN Orchestrator Deployment and Monitoring Guide
You can find the most up-to-date technical documentation on the VMware website at:
https://docs.vmware.com/
VMware, Inc.
3401 Hillview Ave.
Palo Alto, CA 94304
www.vmware.com
Copyright © 2022 VMware, Inc. All rights reserved. Copyright and trademark information.
VMware SD-WAN Orchestrator Deployment and Monitoring Guide
VMware SD-WAN Orchestrator
Deployment and Monitoring Guide 1
The VMware SD-WAN™ Orchestrator Deployment and Monitoring Guide includes the following
sections:
n System Properties
The SD-WAN Orchestrator Deployment and Monitoring Guide provides the following information:
n How to tune various system properties (depending on the scale of the deployment)
Prerequisites
This section describes the prerequisites that must be met before installing the SD-WAN
Orchestrator.
Instance Requirements
VMware recommends installing the Orchestrator and Gateway applications as virtual
machines (that is, guest instances) on an existing hypervisor.
The SD-WAN Orchestrator requires the following minimal guest instance specifications:
Note Although we recommend using Intel Xeon processors, similar Intel or AMD processors
having the same or greater CPU frequency are also acceptable.
n 64 GB of memory
n SD-WAN Orchestrator requires 4 SSD based persistent volumes (expandable through LVM if
needed)
n 128GB x 1 - Root
n 1TB x 1 - Store
n 500GB x 1 - Store2
n 1TB x 1 - Store3
n 1 Gbps NIC
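As a rough illustration, the minimum specifications listed above can be encoded and checked before an instance is built. The following Python sketch is not part of the product: the function name check_instance and its input structure are hypothetical, and 1 TB is treated as 1000 GB for simplicity.

```python
# Minimum guest specifications taken from the list above.
MIN_MEMORY_GB = 64
MIN_VOLUMES_GB = {"Root": 128, "Store": 1000, "Store2": 500, "Store3": 1000}
MIN_NIC_GBPS = 1


def check_instance(memory_gb, volumes_gb, nic_gbps):
    """Return a list of shortfalls (empty when every minimum is met)."""
    problems = []
    if memory_gb < MIN_MEMORY_GB:
        problems.append(f"memory: {memory_gb} GB < {MIN_MEMORY_GB} GB")
    for name, minimum in MIN_VOLUMES_GB.items():
        # A missing volume counts as 0 GB.
        actual = volumes_gb.get(name, 0)
        if actual < minimum:
            problems.append(f"volume {name}: {actual} GB < {minimum} GB")
    if nic_gbps < MIN_NIC_GBPS:
        problems.append(f"NIC: {nic_gbps} Gbps < {MIN_NIC_GBPS} Gbps")
    return problems


if __name__ == "__main__":
    planned = {"Root": 128, "Store": 1000, "Store2": 250, "Store3": 1000}
    for line in check_instance(64, planned, 1):
        print(line)
```

An empty result means the planned instance meets the listed minimums; it says nothing about CPU requirements, which are not covered by this sketch.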
External Services
The SD-WAN Orchestrator relies on several external services. Before proceeding with an
installation, ensure that licenses are available for each of the services.
Google Maps
Google Maps is used for displaying Edges and data centers on a map. No account needs to be
created with Google to utilize the functionality. However, Internet access must be available to the
SD-WAN Orchestrator instance in order for the service to be available.
The service limit applies when usage exceeds 25,000 map loads per day for more than 90 consecutive days.
VMware does not anticipate exceeding these limits for nominal use of the SD-WAN Orchestrator.
For more information, see Google Maps.
Twilio
Twilio is used for SMS-based alerting to enterprise customers to notify them of Edge or link
outage events. An account needs to be created and funded at http://www.twilio.com.
The account is provisioned in the SD-WAN Orchestrator through a system property on the
Operator Portal's System Properties page, as described later in the guide. See Twilio for more
information.
MaxMind
MaxMind is a geolocation service. It is used to automatically detect Edge and Gateway locations
and ISP names based on IP address. If this service is deactivated, then geolocation information
will need to be updated manually. The account can be provisioned in the SD-WAN Orchestrator
through the Operator Portal's System Properties page. See MaxMind for more information.
Installation Procedures
This section describes how to install the SD-WAN Orchestrator.
Cloud-init Preparation
This section describes how to use the cloud-init package to handle the early initialization of
instances.
About cloud-init
Cloud-init is a Linux package responsible for handling the early initialization of instances. If
available in the distribution, it allows many common parameters of the instance to be
configured directly after installation. This creates a fully functional instance that is configured based
on a series of inputs.
Cloud-init's behavior can be configured via user-data. User-data can be given by the user at
instance launch time. This is typically done by attaching a secondary disk in ISO format that
cloud-init will look for at first boot time. This disk contains all early configuration data that will be
applied at that time.
The SD-WAN Orchestrator supports cloud-init and all essential configurations can be packaged in
an ISO image.
instance-id: vco01
local-hostname: vco-01
Additionally, you can specify network interface information (if the network is not configured via
DHCP, for example):
instance-id: vco01
local-hostname: vco-01
network-interfaces: |
  auto eth0
  iface eth0 inet static
  address 10.0.1.2
  network 10.0.1.0
  netmask 255.255.255.0
  broadcast 10.0.1.255
  gateway 10.0.1.1
#cloud-config
password: Velocloud123
chpasswd: {expire: False}
ssh_pwauth: True
ssh_authorized_keys:
  - ssh-rsa AAA...SDvz [email protected]
  - ssh-rsa AAB...QTuo [email protected]
vco:
  super_users:
    list: |
      [email protected]:password1
    remove_default_users: True
  system_properties:
    list: |
      mail.smtp.port:34
      mail.smtp.host:smtp.yourdomain.com
      service.maxmind.enable:True
      service.maxmind.license:todo_license
      service.maxmind.userid:todo_user
      service.twilio.phoneNumber:222123123
      network.public.address:222123123
write_files:
  - path: /etc/nginx/velocloud/ssl/server.crt
    permissions: '0644'
    content: "-----BEGIN CERTIFICATE-----\nMI….ow==\n-----END CERTIFICATE-----\n"
  - path: /etc/nginx/velocloud/ssl/server.key
    permissions: '0600'
    content: "-----BEGIN RSA PRIVATE KEY-----\nMII...D/JQ==\n-----END RSA PRIVATE KEY-----\n"
  - path: /etc/nginx/velocloud/ssl/velocloudCA.crt
This user-data file enables the default user, vcadmin, to log in either with a password or with an
SSH key. The use of both methods is possible, but not required. The password login is enabled
by the password and chpasswd lines.
n The password line contains the plain-text password for the vcadmin user.
n The chpasswd line turns off password expiration to prevent the first login from immediately
prompting for a change of password. This is optional.
Note If you set a password, it is recommended that you change it when you first log in because
the password has been stored in a plain text file.
The ssh_pwauth line enables SSH password login. The ssh_authorized_keys line begins a block of one or
more authorized keys. Each public SSH key listed on the ssh-rsa lines will be added to the
vcadmin ~/.ssh/authorized_keys file.
In this example, two keys are listed, and both have been truncated. In a real
file, the entire public key must be listed. Note that the ssh-rsa lines must be preceded by two
spaces, followed by a hyphen, followed by another space.
The super_users section contains a list of VMware Super Operator accounts and corresponding passwords.
The system_properties section allows you to customize Orchestrator System Properties. See System
Properties for details regarding system properties configuration.
The write_files section allows you to replace files on the system. By default, SD-WAN Orchestrator
web services are configured with a self-signed SSL certificate. To provide a different
SSL certificate, replace the server.crt and server.key files in the
/etc/nginx/velocloud/ssl/ folder with user-supplied files, as the above example does.
Note The server.key file must be unencrypted. Otherwise, the service will fail to start without
the key password.
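When many system properties need to be seeded, the system_properties list can be generated rather than typed by hand. The following Python sketch is illustrative only. It assumes that the list is a YAML literal block nested under vco and system_properties with two-space indentation per level, and that each entry has the form name:value, as in the example above. The function names are hypothetical.

```python
def render_system_properties(props, indent=6):
    """Render a dict as 'name:value' lines, indented for a YAML literal block."""
    pad = " " * indent
    lines = []
    for name, value in props.items():
        # A multi-line value would break the one-property-per-line format.
        if "\n" in str(value):
            raise ValueError(f"value for {name} must be a single line")
        lines.append(f"{pad}{name}:{value}")
    return "\n".join(lines)


def render_vco_section(props):
    """Build the 'vco: system_properties: list: |' fragment of a user-data file."""
    header = "vco:\n  system_properties:\n    list: |\n"
    return header + render_system_properties(props) + "\n"


if __name__ == "__main__":
    print(render_vco_section({
        "mail.smtp.port": 34,
        "mail.smtp.host": "smtp.yourdomain.com",
    }))
```

The generated fragment still has to be merged into a complete user-data file that begins with the #cloud-config line.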
Transfer the newly created ISO image to the datastore on the host running VMware.
Install on VMware
VMware vSphere provides a means of deploying and managing virtual machine resources. This
section explains how to run the SD-WAN Orchestrator using the VMware vSphere Client.
Note This procedure assumes familiarity with VMware vSphere and is not written with reference
to any specific version of VMware vSphere.
Field                 Description
OVF template details  Verify that you pointed to the correct OVA template for this installation.
Provisioning          Select the provisioning type. "thin" is recommended for database and binary log volumes.
Network mapping       Select the network for each virtual machine to use.
Important Uncheck Power On After Deployment. Selecting it starts the virtual machine
immediately, but the virtual machine should be started only after the cloud-init ISO has been attached.
4 Click Finish.
Note Depending on your network speed, this deployment can take several minutes or more.
4 Browse to find the ISO image you created earlier (we called ours vco01-cidata.iso), and
then select it. The ISO can be found in the datastore that you uploaded it to, in the folder that
you created.
2 Select the Console tab to watch as the virtual machine boots up.
Note If you configured SD-WAN Orchestrator as described here, you should be able to log
into the virtual machine with the user name vcadmin and password that you defined when
you created the cloud-init ISO.
Install on KVM
This section explains how to run the SD-WAN Orchestrator using libvirt. This deployment was
tested on Ubuntu 18.04 LTS.
Images
For KVM deployment, VMware will provide the SD-WAN Orchestrator in four qcow images.
n ROOTFS
n STORE
n STORE2
n STORE3
Start by copying the images to the KVM server. In addition, you must copy the cloud-init ISO built
as described in the previous section.
XML Sample
Note The sample XML references images in the /images/vco folder. Edit the paths in the XML to match the location of your images.
<os>
<type>hvm</type>
</os>
<features>
<acpi/>
<apic/>
<pae/>
</features>
<cpu mode='custom' match='exact'>
<model fallback='allow'>SandyBridge</model>
<vendor>Intel</vendor>
<feature policy='require' name='vme'/>
<feature policy='require' name='dtes64'/>
<feature policy='require' name='invpcid'/>
<feature policy='require' name='vmx'/>
<feature policy='require' name='erms'/>
<feature policy='require' name='xtpr'/>
<feature policy='require' name='smep'/>
<feature policy='require' name='pbe'/>
<feature policy='require' name='est'/>
<feature policy='require' name='monitor'/>
<feature policy='require' name='smx'/>
<feature policy='require' name='abm'/>
<feature policy='require' name='tm'/>
<feature policy='require' name='acpi'/>
<feature policy='require' name='fma'/>
<feature policy='require' name='osxsave'/>
<feature policy='require' name='ht'/>
<feature policy='require' name='dca'/>
<feature policy='require' name='pdcm'/>
<feature policy='require' name='pdpe1gb'/>
<feature policy='require' name='fsgsbase'/>
<feature policy='require' name='f16c'/>
<feature policy='require' name='ds'/>
<feature policy='require' name='tm2'/>
<feature policy='require' name='avx2'/>
<feature policy='require' name='ss'/>
<feature policy='require' name='bmi1'/>
<feature policy='require' name='bmi2'/>
<feature policy='require' name='pcid'/>
<feature policy='require' name='ds_cpl'/>
<feature policy='require' name='movbe'/>
<feature policy='require' name='rdrand'/>
</cpu>
<clock offset='utc'/>
<on_poweroff>destroy</on_poweroff>
<on_reboot>restart</on_reboot>
<on_crash>restart</on_crash>
<devices>
<emulator>/usr/bin/kvm-spice</emulator>
<disk type='file' device='disk'>
<driver name='qemu' type='qcow2'/>
<source file='/images/vco/rootfs.qcow2'/>
<target dev='hda' bus='ide'/>
<alias name='ide0-0-0'/>
<memballoon model='virtio'>
<alias name='balloon0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
</memballoon>
</devices>
<seclabel type='none' />
<!-- <seclabel type='dynamic' model='apparmor' relabel='yes'/> -->
</domain>
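If the images are stored somewhere other than the folder referenced in the sample, the disk paths in the domain XML have to be edited. The following Python sketch shows one way to do that with the standard library. It is only an illustration: the function name is hypothetical, and the embedded SAMPLE is a shortened, well-formed stand-in whose domain and name elements are assumptions rather than part of the sample above.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for a libvirt domain definition (assumed structure).
SAMPLE = """<domain type='kvm'>
  <name>vco</name>
  <devices>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/images/vco/rootfs.qcow2'/>
      <target dev='hda' bus='ide'/>
    </disk>
  </devices>
</domain>"""


def relocate_disk_images(domain_xml, old_prefix, new_prefix):
    """Return domain XML with disk source paths moved from old_prefix to new_prefix."""
    root = ET.fromstring(domain_xml)
    for source in root.iter("source"):
        path = source.get("file")
        # Only rewrite file-backed sources that live under the old prefix.
        if path and path.startswith(old_prefix):
            source.set("file", new_prefix + path[len(old_prefix):])
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(relocate_disk_images(SAMPLE, "/images/vco/", "/var/lib/libvirt/images/"))
```

ElementTree does not preserve comments or the original quoting style, so the output should be reviewed before it is used to define the virtual machine.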
Create the VM
To create the VM using the standard virsh commands:
Install on AWS
This section describes how to install SD-WAN Orchestrator on AWS.
Installation
1 Launch the EC2 instance in AWS cloud.
Example: http://docs.aws.amazon.com/efs/latest/ug/gs-step-one-create-ec2-resources.html
2 Configure the security group to allow inbound HTTP (TCP/80) as well as HTTPS (TCP/443).
3 After the instance is launched, point the web browser to the Operator login URL:
https://<name>/operator
n Create gateways
1 Log in to the SD-WAN Orchestrator CLI console through SSH. If you configured the SD-WAN
Orchestrator as described here, you should be able to log into the virtual machine with the
user name vcadmin and password that you defined when you created the cloud-init ISO.
Note Do not encrypt the key. It must remain unencrypted on the SD-WAN Orchestrator
system.
Field Description
C country
ST state
L locality (city)
O company
OU department (optional)
4 Send server.csr to a Certificate Authority for signing. You should get back the SSL
certificate (server.crt). Ensure that it is in the PEM format.
5 Install the certificate (which requires root access). SD-WAN Orchestrator SSL certificates are
located in /etc/nginx/velocloud/ssl/.
6 Restart nginx.
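Because the private key must remain unencrypted on the SD-WAN Orchestrator, it can be useful to check the key file before restarting nginx. The following Python sketch is a heuristic, not a product tool: it only looks for the two textual markers that PEM files use for passphrase-protected keys, and the function name is hypothetical.

```python
def is_encrypted_pem_key(pem_text):
    """Return True if a PEM private key appears to be passphrase-protected."""
    # PKCS#8 encrypted keys use a dedicated PEM label.
    if "-----BEGIN ENCRYPTED PRIVATE KEY-----" in pem_text:
        return True
    # Traditional OpenSSL encrypted keys carry this header after the BEGIN line.
    if "Proc-Type: 4,ENCRYPTED" in pem_text:
        return True
    return False


if __name__ == "__main__":
    import sys

    # Usage: python check_key.py /path/to/server.key
    with open(sys.argv[1], "r", encoding="ascii", errors="replace") as handle:
        if is_encrypted_pem_key(handle.read()):
            print("Key is encrypted; decrypt it before installing.")
        else:
            print("No encryption marker found.")
```

A key that passes this check is not guaranteed to be valid; it only lacks the markers of an encrypted PEM key.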
System Properties can be set initially using the cloud-init config file. For more information,
see Cloud-init Preparation. The following properties need to be configured to ensure proper
operation of the service.
System Name
Enter the fully qualified domain name of the SD-WAN Orchestrator in the network.public.address system property.
Google Maps
Google Maps is used for displaying edges and data centers on a map. Maps may fail to display
without a license key. The Orchestrator will continue to function properly, but browser maps will
not be available in this case.
3 Locate the Enable API button. Under Google Maps APIs, enable both the Google
Maps JavaScript API and the Google Maps Geolocation API.
5 Under the Credentials page, click Create Credentials, then select API key. Create an API key.
Twilio
Twilio is a messaging service that allows you to receive SD-WAN Orchestrator alerts via SMS. It is optional.
The account details can be entered into the SD-WAN Orchestrator through the Operator Portal's System
Properties page. The properties are called:
n service.twilio.accountSid
n service.twilio.authToken
MaxMind
MaxMind is a geolocation service. It is used to automatically detect Edge and Gateway locations
and ISP names based on an IP address. If this service is deactivated, then geolocation information
will need to be updated manually. The account details can be entered into the SD-WAN Orchestrator through
the Operator Portal's System Properties page. You can configure:
Email
Email services can be used both for sending the Edge activation messages and for alarms
and notifications. Configuring email is not required, but it is strongly recommended as part
of SD-WAN Orchestrator operations. The following system properties are available to configure the external
email service used by the Orchestrator:
1 Upload the image to the SD-WAN Orchestrator system using any file transfer tool available
in your infrastructure, for example, scp. Copy the image to the following location on the
system: /var/lib/velocloud/software_update/vco_update.tar.
sudo /opt/vc/bin/vco_software_update
Note If you configured the SD-WAN Orchestrator as described here, you should be able to
log into the virtual machine with the user name vcadmin and the password that you defined
when you created the cloud-init configuration files.
For instructions on how to upgrade the SD-WAN Orchestrator with DR deployment, see
Upgrade SD-WAN Orchestrator with DR Deployment.
Example:
Example:
configuration: sectorsize=512
*-disk:2
description: SCSI Disk
physical id: 0.2.0
bus info: scsi@2:0.2.0
logical name: /dev/sdc
serial: fTQFJ2-giAV-WsXL-1Wha-V305-oQkV-qqS3SA
size: 100GiB
capacity: 100GiB
capabilities: lvm2
configuration: sectorsize=512
4 On the hypervisor host, locate the disk attached to the VM using bus information. Example:
SCSI(0:1)
5 Extend the virtual disk. For instructions, see VMware KB article 1004047:
http://kb.vmware.com/kb/1004047
7 Re-scan the block device for the resized physical volume. Example:
Example:
pvresize /dev/sdb
Example:
Example:
df -h /dev/store/data
Example:
root@vco:~# df -h /dev/store/data
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/store-data 379G 1.2G 359G 1% /store
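When the size check is repeated across several volumes, the df output can be parsed instead of read by eye. The following Python sketch is an illustration under simple assumptions: it expects the six-column layout shown in the example above, does not handle mount points that contain spaces, and uses hypothetical function names.

```python
def parse_df_output(text):
    """Parse `df -h` style output into a list of dicts, one per filesystem row."""
    rows = []
    for line in text.strip().splitlines():
        fields = line.split()
        # Data rows have exactly six columns with a percentage in the fifth;
        # this skips the shell prompt line and the header row.
        if len(fields) != 6 or not fields[4].endswith("%"):
            continue
        rows.append({
            "filesystem": fields[0],
            "size": fields[1],
            "used": fields[2],
            "avail": fields[3],
            "use_percent": int(fields[4].rstrip("%")),
            "mounted_on": fields[5],
        })
    return rows


if __name__ == "__main__":
    sample = """root@vco:~# df -h /dev/store/data
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/store-data 379G 1.2G 359G 1% /store
"""
    for row in parse_df_output(sample):
        print(row["mounted_on"], row["size"], f'{row["use_percent"]}%')
```

The size values are kept as the human-readable strings that df prints, so comparing them numerically would need an extra conversion step.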
System Properties
VMware provides System Properties to configure various features and options available in the
Orchestrator portal.
In the Operator portal, navigate to the System Properties page, which lists the available
predefined system properties. See List of System Properties, which lists some of the system
properties that you can modify as an Operator.
2 In the New System Property window, enter a name for the new property and choose the
Data Type from the drop-down list.
3 Enter the Value for the property according to the data type.
5 Click Save.
6 To modify the values of a property, click the link to the property or select the property and
click Actions > Modify System Property.
7 To remove a property, select the property and click Actions > Delete System Property.
You can use the Search field to find a specific system property. See the section titled, "List of
System Properties" in the VMware SD-WAN Orchestrator Deployment and Monitoring Guide,
which lists some of the system properties that you can modify as an Operator.
Note It is recommended to contact VMware Support before making changes to the system
properties.
The following tables describe some of the system properties. As an Operator, you can set the
values for these properties.
edge.certificate.renewal.window This optional system property allows the Operator to define one
or more maintenance windows during which the Edge certificate
renewal is enabled. Certificates scheduled for renewal outside of
the windows will be deferred until the current time falls within one
of the enabled windows.
Enable System Property:
To enable this system property, type "true" for "enabled" in the
first part of the Value text area in the Modify System Property
dialog box. An example of the first part of this system property
when it is enabled is shown below.
Operators can define multiple windows to restrict the days and
hours of the day during which Edge renewals are enabled. Each
window can be defined by a day, or a list of days (separated by
a comma), and a start and end time. Start and end times can be
specified relative to an Edge's local time zone, or relative to UTC.
See image below for an example.
{
"enabled": false,
"windows": [
{
retention.healthstats.days Edge health stats retention period (-1 sets retention to the
maximum time period allowed)
Flow Stats (retention.lowResFlows.months, retention.highResFlows.days)
1 year – 1 hour rollup; 2 weeks – 5 min
1 year – 1 hour rollup; 3 months – 5 min
1 year with rollup
edge.link.disconnected.limit.sec When the Orchestrator does not receive link statistics for
a link for the specified duration, the link is disconnected.
n disallowUsernameCharacters: Password
must not match a configurable portion
of the user's ID. For example, if
disallowUsernameCharacters is set to 5 and a
user with username [email protected]
attempts to configure a new password
that includes ‘usern’ or ‘serna’, or any other
five-character string that matches a section of
the user’s username, that new password
is rejected by the Orchestrator. The
default value of -1 signifies that this feature is
not enabled.
n variationValidationCharacters: New
password must vary from the old password
by a configurable number of characters.
The Orchestrator uses the Levenshtein
distance between two words to determine
the variation between the new and old
password. The Levenshtein distance is
the minimum number of single-character
edits (insertions, deletions, or substitutions)
required to change one word into another.
n If variationValidationCharacters is set to 4,
then the Levenshtein distance between the
new and old password must be 4 or greater.
In other words, the new password must have
4 or more variations from the old password.
For example, if the old password used was
"kitten" and the new password is "sitting",
the Levenshtein distance for these is 3, since
it requires only three edits to change kitten
into sitting:
n kitten → sitten (substitution of "s" for "k")
n sitten → sittin (substitution of "i" for "e")
n sittin → sitting (insertion of "g" at the
end).
Since the new password only varies by 3
characters from the old, “sitting” would be
rejected as a new password to replace “kitten”.
The default value of -1 signifies that this feature
is not enabled.
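The two checks described above can be sketched in Python as follows. This is an illustration of the rules as stated, not the Orchestrator's implementation: the function names are hypothetical, comparisons are assumed to be case-sensitive, and any value below 1 for the username rule (or below 0 for the variation rule) is treated as turning the rule off.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions, or substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def varies_enough(old, new, variation_chars):
    """variationValidationCharacters rule: distance must reach the configured value."""
    if variation_chars < 0:
        return True
    return levenshtein(old, new) >= variation_chars


def shares_username_fragment(username, password, length):
    """disallowUsernameCharacters rule: True if the password contains any
    substring of the username that is `length` characters long."""
    if length < 1:
        return False
    fragments = {username[i:i + length] for i in range(len(username) - length + 1)}
    return any(fragment in password for fragment in fragments)


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
    print(varies_enough("kitten", "sitting", 4))
    print(shares_username_fragment("username", "my-serna-pass", 5))
```

With a setting of 4, the "kitten" to "sitting" change from the example above is rejected because the computed distance is 3.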
expiry:
n enable: Set this to true to enable automatic
expiry of customer user passwords.
n days: Enter the number of days that a
customer password may be used before
forced expiration.
history:
n enable: Set this to true to enable recording
of customer users' previous passwords.
n count: Enter the number of previous
passwords to be saved in the history.
When a customer user tries to change the
password, the system does not allow the
user to enter a password that is already
saved in the history.
[
{
"type": "OPERATOR_USER",
"policies": [
{
"match": {
"type": "ALL"
},
"rules": {
"reservoir": 500,
"reservoirRefreshAmount": 500,
"reservoirRefreshInterval": 5000
}
}
]
},
{
"type": "MSP_USER",
"policies": [
{
"match": {
"type": "ALL"
},
"rules": {
"reservoir": 500,
"reservoirRefreshAmount": 500,
"reservoirRefreshInterval": 5000
}
}
]
},
{
"type": "ENTERPRISE_USER",
"policies": [
{
"match": {
"type": "ALL"
},
"rules": {
"reservoir": 500,
"reservoirRefreshAmount": 500,
"reservoirRefreshInterval": 5000
}
}
]
}
]
For more information on Rate limiting, see Rate Limiting API Requests.
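To make the three rule fields more concrete, the following Python sketch models them as a refilling reservoir. The semantics are an assumption, not something this guide states: reservoir is taken as the initial number of admitted requests, reservoirRefreshAmount as the value the reservoir is reset to, and reservoirRefreshInterval as the reset period in milliseconds. The class is a toy model, not the Orchestrator's code.

```python
class Reservoir:
    """Toy model of a refilling request reservoir (assumed semantics)."""

    def __init__(self, reservoir, refresh_amount, refresh_interval_ms):
        self.remaining = reservoir
        self.refresh_amount = refresh_amount
        self.refresh_interval_ms = refresh_interval_ms
        self.last_refresh_ms = 0

    def allow(self, now_ms):
        """Return True if a request arriving at now_ms is admitted."""
        # Reset the reservoir once at least one full interval has elapsed.
        if now_ms - self.last_refresh_ms >= self.refresh_interval_ms:
            elapsed = (now_ms - self.last_refresh_ms) // self.refresh_interval_ms
            self.last_refresh_ms += elapsed * self.refresh_interval_ms
            self.remaining = self.refresh_amount
        if self.remaining > 0:
            self.remaining -= 1
            return True
        return False


if __name__ == "__main__":
    # Values from the sample policy above.
    limiter = Reservoir(500, 500, 5000)
    admitted = sum(limiter.allow(now_ms=0) for _ in range(600))
    print(admitted)
```

Under these assumptions, the sample policy admits at most 500 requests per user type in each 5-second window.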
session.options.websocket.portal.idle.timeout Sets the total amount of time (in seconds) that the
browser WebSocket connection remains active in an idle state.
By default, the browser WebSocket connection is active
for 300 seconds in an idle state.
log.syslog.lastFetchedCRL.backend Keeps the last updated CRL as a PEM-formatted string for
the syslog service. The value is updated regularly.
log.syslog.lastFetchedCRL.portal Keeps the last updated CRL as a PEM-formatted string for
the syslog service. The value is updated regularly.
log.syslog.lastFetchedCRL.upload Keeps the last updated CRL as a PEM-formatted string for
the syslog service. The value is updated regularly.
vco.enterprise.authentication.twoFactor.mode Defines the mode for the second level authentication for
Enterprise users. Currently, only SMS is supported as the
second level authentication mode.
vco.operator.authentication.twoFactor.mode Defines the mode for the second level authentication for
Operator users. Currently, only SMS is supported as the
second level authentication mode.
[
{
"vendor": "Vendor Name",
"version": "VNF Image Version",
"checksum": "VNF Checksum Value",
"checksumType": "VNF Checksum Type"
}
]
[
{
"vendor": "checkPoint",
"version": "r80.40_no_workaround_46",
"checksum":
"bc9b06376cdbf210cad8202d728f1602b79cfd7d",
"checksumType": "sha-1"
}
]
[
{
"vendor": "fortinet",
"version": "624",
"checksum":
"6d9e2939b8a4a02de499528c745d76bf75f9821f",
"checksumType": "sha-1"
}
]
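Before a checksum is entered into an entry like the ones above, it can be computed from the image file. The following Python sketch uses the standard hashlib module. It is an illustration only: the function names are hypothetical, and only the "sha-1" checksum type shown in the samples is supported.

```python
import hashlib

# Map the checksumType strings seen in the samples to hashlib algorithm names.
ALGORITHMS = {"sha-1": "sha1"}


def file_checksum(path, checksum_type="sha-1"):
    """Compute the hex digest of a file, reading it in chunks."""
    key = checksum_type.lower()
    if key not in ALGORITHMS:
        raise ValueError(f"unsupported checksum type: {checksum_type}")
    digest = hashlib.new(ALGORITHMS[key])
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_entry(path, entry):
    """Compare a file against the checksum fields of one JSON entry."""
    expected = entry["checksum"].lower()
    return file_checksum(path, entry["checksumType"]) == expected


if __name__ == "__main__":
    import sys

    # Usage: python vnf_checksum.py /path/to/vnf-image
    print(file_checksum(sys.argv[1]))
```

Reading in chunks keeps memory use flat for large VNF images.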
n The recovery time objective (RTO), therefore, is dependent on explicit action by the operator
to trigger promotion of the standby.
n The recovery point objective (RPO), however, is essentially zero, regardless of the recovery
time, because all configuration is instantaneously replicated. Monitoring data that would have
been collected during the outage is cached on the edges and gateways pending promotion
of the standby.
Note DR is mandatory. For licensing and pricing, contact the VMware sales team for support.
Active/Standby Pair
In an SD-WAN Orchestrator DR deployment, two identical SD-WAN Orchestrator systems are
configured as an active / standby pair. The operator can view the state of DR readiness
through the web UI on either of the servers. Edges and gateways are aware of both SD-WAN
Orchestrators, and while they receive configuration changes only from the active SD-WAN
Orchestrator, they periodically send DR heartbeats to both systems to report their view of both
servers and to query the DR system status. When the operator triggers a failover, the edges and
gateways are informed of the change in their next DR heartbeat.
DR States
From the view of an operator, and of the edges and gateways, an SD-WAN Orchestrator has one
of four DR states:
DR State Description
Standalone No DR configured.
Zombie DR formerly configured and active but no longer acting as the active or standby.
Run-time Operation
When DR is configured, the standby server runs in a limited mode, blocking all API calls except
those related to the DR status and the DR heartbeats. When the operator invokes a failover, the
standby is promoted to become fully operational as a Standalone server. The server that was
formerly active is automatically transitioned to a Zombie state if it is responsive and visible from
the promoted standby. In the Zombie state, management configuration services are blocked and
any contact from edges and gateways that have not transitioned to the new active SD-WAN
Orchestrator is redirected to the promoted server.
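The failover behavior described above can be summarized as a small state model. The following Python sketch is a simplification for illustration, not product code. The state names Active and Standby are inferred from the surrounding text, and the function name and return shape are hypothetical.

```python
from enum import Enum


class DRState(Enum):
    STANDALONE = "Standalone"
    ACTIVE = "Active"      # inferred from the text
    STANDBY = "Standby"    # inferred from the text
    ZOMBIE = "Zombie"


def promote_standby(standby, former_active, former_active_reachable):
    """Return (promoted_state, former_active_state) after an operator failover."""
    if standby is not DRState.STANDBY:
        raise ValueError("only a Standby server can be promoted")
    promoted = DRState.STANDALONE
    # The former active becomes a Zombie only if the promoted server can reach it.
    if former_active_reachable and former_active is DRState.ACTIVE:
        return promoted, DRState.ZOMBIE
    return promoted, former_active


if __name__ == "__main__":
    print(promote_standby(DRState.STANDBY, DRState.ACTIVE, True))
    print(promote_standby(DRState.STANDBY, DRState.ACTIVE, False))
```

The second call models the case where the former active server is unreachable and therefore keeps its old state until an operator intervenes.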
n The active server is then given the address and credentials of the standby and it enters the
ACTIVE_CONFIGURING state.
When a STANDBY_CONFIG_RQST is made from active to standby, the two servers synchronize
through the state transitions.
The two Orchestrators on which Disaster Recovery (DR) is to be established must have the same
time. Before you initiate SD-WAN Orchestrator replication, check the following NTP
configurations:
n The Gateway time zone must be set to Etc/UTC. Use the following command to view the NTP
time zone.
If the time zone is incorrect, use the following commands to update the time zone.
n The NTP offset must be less than or equal to 15 milliseconds. Use the following command to
view the NTP offset.
If the offset is incorrect, use the following commands to update the NTP offset.
n By default, a list of NTP servers is configured in the /etc/ntpd.conf file. The Orchestrators
on which DR is to be established must have Internet access to reach the default NTP servers
and keep the time in sync on both Orchestrators. Customers can also use a local
NTP server running in their environment to sync time.
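The 15 millisecond offset limit can be checked automatically on both Orchestrators before replication is started. The following Python sketch assumes that the offset is read from peer-table output in the style of ntpq -p, where the selected peer is marked with an asterisk and the ninth column is the offset in milliseconds. The sample output and function names are hypothetical.

```python
MAX_OFFSET_MS = 15.0  # limit stated in the text above


def selected_peer_offset_ms(ntpq_output):
    """Return the offset (ms) of the peer marked '*', or None if none is selected."""
    for line in ntpq_output.splitlines():
        if line.startswith("*"):
            fields = line.split()
            # Columns: remote refid st t when poll reach delay offset jitter
            return float(fields[8])
    return None


def offset_ok(ntpq_output):
    """True when a peer is selected and its absolute offset is within the limit."""
    offset = selected_peer_offset_ms(ntpq_output)
    return offset is not None and abs(offset) <= MAX_OFFSET_MS


if __name__ == "__main__":
    # Hypothetical sample in the peer-table layout.
    sample = (
        "     remote           refid      st t when poll reach   delay   offset  jitter\n"
        "==============================================================================\n"
        "*ntp1.example.co .GPS.            1 u   33   64  377    1.234   -3.210   0.101\n"
        "+ntp2.example.co 10.0.0.1         2 u   12   64  377    2.345   21.500   0.220\n"
    )
    print(selected_peer_offset_ms(sample), offset_ok(sample))
```

Only the selected peer is evaluated, so a large offset on a candidate peer does not fail the check.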
1 Click Replication from the Navigation panel to display the Orchestrator Replication screen.
2 Enable the Standby Orchestrator by selecting the Standby (Replication Role) radio button.
The Orchestrator Success dialog box appears, indicating that the Orchestrator has been
enabled for Standby, and that the Orchestrator will restart in Standby mode.
4 Click OK.
After the Standby Orchestrator has been configured for replication, configure the Active
Orchestrator according to the instructions below.
1 Click Replication from the Navigation panel. The Orchestrator Replication screen appears.
3 Type in the Standby Orchestrator Address and the Standby Orchestrator Uuid.
The Orchestrator Address and Uuid are displayed in the Standby Orchestrator
screen.
4 Type in the username and password for the Orchestrator Superuser to be used for
replication.
The Active Orchestrator screen appears, showing the status of the current state.
When configuration is complete, both Orchestrators (Standby and Active) will be in sync.
You can click the toggle history link to view the status of each state.
Test Failover
The following failover test scenarios are forced failovers, shown for example purposes. You can
perform these actions in the Available Actions area of the Active and Standby screens.
2 Click the Promote Standby button in the Available Actions area on the Standby Orchestrator
screen.
The following dialog box appears, indicating that when you promote your Standby
Orchestrator, administrators will no longer be able to manage the SD-WAN Orchestrator
using the previously Active Orchestrator.
Another message dialog box appears to verify your request to promote the Standby
Orchestrator. This message will appear only if the Standby Orchestrator perceives the Active
Orchestrator to be in good health, meaning the Standby is communicating with the Active
and duplicating data.
A final dialog box appears indicating that the Orchestrator is no longer a Standby and will
restart in Standalone mode.
If the Standby can communicate with the formerly Active Orchestrator, it will instruct that
Orchestrator to enter a Zombie state. In Zombie state, the Orchestrator communicates
with its clients (edges, gateways, UI/API) that it is no longer active, and that they must
communicate with the newly promoted Orchestrator. If the promoted Standby cannot
communicate with the formerly Active Orchestrator, the operator should, if possible, manually
demote the formerly Active Orchestrator.
Note The Orchestrator can be returned to the Standalone mode from the Zombie state after
the time specified in the system property vco.disasterRecovery.zombie.expirySeconds, which
defaults to 1800 seconds.
Recoverable Failures
The following errors are recoverable failures that can occur after SD-WAN Orchestrator DR
reaches an in sync state. If the problem causing these failures is corrected, SD-WAN Orchestrator
DR will automatically return to normal operation.
n FAILURE_SYNCING_FILES
n FAILURE_GET_STANDBY_STATUS
n FAILURE_MYSQL_ACTIVE_STATUS
n FAILURE_MYSQL_STANDBY_STATUS
Unrecoverable Failures
The following failures can occur during configuration of the SD-WAN Orchestrator DR. SD-WAN
Orchestrator DR will not automatically recover from these failures.
n FAILURE_ACTIVE_CONFIGURING
n FAILURE_LAUNCHING_STANDBY
n FAILURE_STANDBY_CONFIGURING
n FAILURE_COPYING_DB
n FAILURE_COPYING_FILES
n FAILURE_SYNC_CONFIGURING
n FAILURE_GET_STANDBY_CONFIG
n FAILURE_STANDBY_CANDIDATE
n FAILURE_STANDBY_UNCONFIG
n FAILURE_STANDBY_PROMOTION
n FAILURE_ACTIVE_DEMOTION
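The two lists above can be turned into a small lookup for monitoring scripts. The following is a minimal sketch, not part of the product: the function name classify_dr_failure is hypothetical, and only the failure codes themselves come from this guide.

```python
# Failure codes copied from the two lists above.
RECOVERABLE = frozenset({
    "FAILURE_SYNCING_FILES",
    "FAILURE_GET_STANDBY_STATUS",
    "FAILURE_MYSQL_ACTIVE_STATUS",
    "FAILURE_MYSQL_STANDBY_STATUS",
})

UNRECOVERABLE = frozenset({
    "FAILURE_ACTIVE_CONFIGURING",
    "FAILURE_LAUNCHING_STANDBY",
    "FAILURE_STANDBY_CONFIGURING",
    "FAILURE_COPYING_DB",
    "FAILURE_COPYING_FILES",
    "FAILURE_SYNC_CONFIGURING",
    "FAILURE_GET_STANDBY_CONFIG",
    "FAILURE_STANDBY_CANDIDATE",
    "FAILURE_STANDBY_UNCONFIG",
    "FAILURE_STANDBY_PROMOTION",
    "FAILURE_ACTIVE_DEMOTION",
})


def classify_dr_failure(code):
    """Return 'recoverable', 'unrecoverable', or 'unknown' for a DR failure code."""
    if code in RECOVERABLE:
        return "recoverable"
    if code in UNRECOVERABLE:
        return "unrecoverable"
    return "unknown"
```

A script could, for example, raise a ticket only when the result is "unrecoverable", because recoverable failures clear once the underlying problem is fixed.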
For SD-WAN Orchestrator Disaster Recovery, see "Set Up DR in the VMware" and "Upgrade the
DR Setup."
Upgrade an Orchestrator
This section describes how to upgrade an Orchestrator.
1 VMware Support will assist you with your upgrade. Collect the following information prior to
contacting Support.
n Provide the current and target Orchestrator versions, for example: current version
(e.g., 2.5.2 GA-20180430), target version (e.g., 3.3.2 p2).
Note For the current version, this information can be found on the top, right corner of
the Orchestrator by clicking the Help link and choosing About.
Note Commands must be run as root (e.g. ‘sudo <command>’ or ‘sudo -i’).
n LVM layout
n Memory Information
n CPU Information
n Kernel Parameters
n ssh configurations
n Copy of /var/log
1 From the SD-WAN Orchestrator, select Orchestrator Upgrade from the navigation panel.
2 In the Upgrade Announcement area, type in your message in the Banner Message text box.
A popup message appears indicating that you have successfully created your announcement,
and that your banner message displays at the top of the SD-WAN Orchestrator.
4 (Optional) You can remove the announcement from the SD-WAN Orchestrator by clicking
the Unannounce Orchestrator Upgrade button. A popup message will appear indicating that
you have successfully unannounced the Orchestrator upgrade. The announcement that was
displayed at the top of the SD-WAN Orchestrator will be removed.
To verify that the status of the upgrade is complete, run the following command to display the
correct version number for all the packages:
When you are logged in as an Operator, the same version number should display at the bottom
right corner of the SD-WAN Orchestrator.
Upgrade VMware SD-WAN Orchestrator from version 3.3.2 or 3.4 to version 4.0
This document provides an overview and best practices on how to upgrade the VMware SD-WAN
Orchestrator from the 3.3.2 or 3.4 release to the 4.0 release. However, please contact
VMware Support to assist you with the 3.3.2 or 3.4 to 4.0 upgrade at
https://kb.vmware.com/s/article/53907
Only 3.3.2 and 3.4 Orchestrators can be upgraded to the 4.0 release. If you are running a 3.3.1
or lower version of the Orchestrator, you must upgrade to at least the 3.3.2 version before
upgrading to the 4.0 version.
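The version rule above can be expressed as a short pre-check. This is a sketch under stated assumptions: version strings are plain dotted numbers (patch suffixes such as "p2" are not handled), any 3.4.x release is treated as "3.4", and the function names are hypothetical.

```python
def parse_version(text):
    """Turn a dotted version such as '3.3.2' or '3.4' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def can_upgrade_to_4_0(current):
    """True if the rule above allows a direct upgrade to 4.0."""
    version = parse_version(current)
    # Assumption: "3.4" covers every 3.4.x maintenance release.
    return version[:3] == (3, 3, 2) or version[:2] == (3, 4)


def needs_intermediate_upgrade(current):
    """True for 3.3.1 or lower, which must first be upgraded to at least 3.3.2."""
    return parse_version(current) < (3, 3, 2)
```

For instance, can_upgrade_to_4_0("3.3.1") is False while needs_intermediate_upgrade("3.3.1") is True, which matches the two-step path described above.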
n Just like other releases, there are schema changes with the 4.0 release. However, these
changes will not impact the upgrade process.
The OS for the SD-WAN Orchestrator virtual appliance and the underlying data stores that
store the configuration and statistics data are being upgraded. The specific upgrades include the
following:
Note The Orchestrator OS, database, and several other dependent components currently in use
have reached their end of life, and will no longer be supported.
n Faster query performance for statistics, longer retention out of the box for flow stats.
Best Practices/Recommendations:
Listed below are some upgrade best practices:
n From the System Properties page in the Orchestrator, make a note of the value of the
edge.heartbeat.spread.factor system property. Then, change the heartbeat spread factor
to a relatively high value for a large Orchestrator (e.g. 20, 40, 60). This will help reduce
the sudden spike of the resource utilization (CPU, IO) on the system. Make sure to verify
that all Gateways and Edges are in a connected state before restoring the previous
edge.heartbeat.spread.factor value from the System Property page in the Orchestrator.
n Leave the demoted SD-WAN Orchestrator up for a few hours before complete shutdown or
decommission.
n Freeze configuration modifications to avoid any additional configuration changes until the
upgrade process is completed.
Upgrade Procedures
Please contact VMware Support to assist you with the 3.3.2 or 3.4 to 4.0 upgrade at
https://kb.vmware.com/s/article/53907
1 Install a new SD-WAN Orchestrator whose version matches the version of the currently
Active SD-WAN Orchestrator.
2 Set the following properties on the Active and Standby SD-WAN Orchestrator, if necessary.
3 Set up the network.public.address property on the Active and Standby to the address
contacted by the Edges (Heartbeats).
Note If the Orchestrator upgrade is from 2.X -> 3.2.X, run dr-standby-schema.sh on the Standby
before starting the upgrade.
1 Prepare for the Upgrade. For instructions, go to Step 1: Prepare for the Orchestrator Upgrade
of the section titled, Upgrade an Orchestrator with DR Deployment.
2 Proceed with the Orchestrator Upgrade. For instructions, go to Step 3: Proceed with the
Orchestrator Upgrade of the section titled, Upgrade an Orchestrator with DR Deployment.
Orchestrator Diagnostics
This section describes Orchestrator Diagnostics.
n Diagnostic Bundles Tab: Request and download a diagnostic bundle. This information can
be found in the VMware SD-WAN Orchestrator Deployment and Monitoring Guide. See the
section titled, "Diagnostic Bundle Tab."
n Database Statistics Tab: Provides a read-only access view of some of the information from
a diagnostic bundle. This information can be found in the VMware SD-WAN Orchestrator
Deployment and Monitoring Guide. See the section titled, "Database Statistics Tab."
Reason for Generation | The specific reason given for generating a diagnostic bundle. Click the Request Diagnostic Bundle button to include a description of the bundle.
Generated | The date and time when the diagnostic bundle request was sent.
Cleanup Date | The default Cleanup Date is three months after the generated date, when the bundle will be automatically deleted. If you need to extend the Cleanup Date period, click the Cleanup Date link located under the Cleanup Date column. For more information, see Updating Cleanup Date.
2 From the Request Diagnostic Bundle tab, click the Request Diagnostic Bundle button.
3 In the Request Diagnostic Bundle dialog, enter the reason for the request in the appropriate
area.
4 Click Submit. The bundle request you created displays in the grid area of the Diagnostic
Bundle screen with an In Progress status.
5 Refresh your screen to check the status of diagnostic bundle request. When the bundle is
ready for download, a Complete status appears.
2 Click the Actions button, and choose Download Diagnostic Bundle. You can also click the
Complete link to download the diagnostics bundle.
1 From the Cleanup Date column, click the Cleanup Date link of your chosen Diagnostic Bundle.
2 From the Update Cleanup Date dialog, click the Calendar icon to change the date.
3 You can also choose to keep the bundle indefinitely by checking the Keep Forever checkbox.
4 Click OK.
The Orchestrator Diagnostics table grid updates to reflect the changes to the Cleanup Date.
If you require additional information, go to the Diagnostic Bundles tab, request a diagnostic
bundle, and download it locally. For more information, see Request Diagnostic Bundle.
Field | Description
Database Table Statistics | Statistical details of all tables in the Orchestrator database.
Database Engine Status | The InnoDB engine status of the MySQL server.
To enable the monitoring stack, run the following command on the orchestrator:
To activate more metrics or deactivate some enabled metrics, edit the Telegraf configuration file
on the Orchestrator as follows:
n sudo vi /etc/telegraf/telegraf.d/system_metrics_input.conf
The SD-WAN Orchestrator uses defense mechanisms that curb API abuse and provide system
stability. API requests that exceed the allowed request limits are blocked and
returned with HTTP 429 (Too Many Requests). A blocked client must go through a cool-down
period before making requests again.
n Leaky bucket limiter – Smooths the burst of requests and only allows a pre-defined number
of requests. This limiter takes care of limiting the number of requests allowed in a given time
window.
n Concurrency limiter – Limits the number of requests that run in parallel. Too many
concurrent requests compete for resources and can result in long-running queries.
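To make the two limiter types concrete, the following is a generic sketch of a leaky bucket and a concurrency limiter. It illustrates the general techniques only and is not the Orchestrator's implementation; the class names, parameters, and the injectable clock are assumptions made for this example.

```python
import threading
import time


class LeakyBucket:
    """Admit a request only if the bucket has room; the bucket drains over time."""

    def __init__(self, capacity, leak_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.leak_per_second = leak_per_second
        self.clock = clock
        self.level = 0.0
        self.last = clock()

    def allow(self):
        now = self.clock()
        # Drain the bucket for the time elapsed since the previous call.
        drained = (now - self.last) * self.leak_per_second
        self.level = max(0.0, self.level - drained)
        self.last = now
        if self.level + 1 > self.capacity:
            return False  # a server would answer HTTP 429 here
        self.level += 1
        return True


class ConcurrencyLimiter:
    """Cap the number of requests that may be in progress at the same time."""

    def __init__(self, max_in_flight):
        self._slots = threading.BoundedSemaphore(max_in_flight)

    def try_acquire(self):
        # Non-blocking: returns False when every slot is already taken.
        return self._slots.acquire(blocking=False)

    def release(self):
        self._slots.release()
```

The leaky bucket bounds the request rate over a time window, while the concurrency limiter bounds how many requests overlap, which is why the two are used together.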
The following are the major reasons that lead to rate limiting of the API requests:
n Requests that result in long-running queries on the Orchestrator and hold system resources
for long periods; such requests are dropped.
Developers that rely on the API can adopt the following measures to improve the stability of their
code when the VCO rate-limiting capability is enabled.
n Handle HTTP 429 response code when requests exceed rate limits.
n The penalty time duration is 5000 ms when the rate limiter reaches the maximum allowed
requests in a given period. If blocked, the clients are expected to have a cool down period of
5000 ms before making requests again. The requests made during the cool down period of
5000 ms will still be rate limited.
n Use shorter time intervals for time series APIs so that requests do not expire because of
long-running queries.
n Prefer batch query methods to those that query individual Customers or Edges whenever
possible.
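The first two recommendations above (handle HTTP 429 and wait out the 5000 ms penalty) can be sketched as a small retry helper. This is an illustrative sketch: send is a hypothetical callable supplied by the client that performs one API request and returns a (status_code, body) pair, and the retry count is an arbitrary choice.

```python
import time

COOL_DOWN_SECONDS = 5.0  # the 5000 ms penalty duration described above


def call_with_cool_down(send, max_attempts=3, sleep=time.sleep):
    """Call send() until it stops returning HTTP 429 or attempts run out.

    send is a hypothetical callable returning (status_code, body).
    max_attempts must be at least 1.
    """
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts:
            # Requests sent during the cool-down are still rate limited,
            # so wait the full penalty before trying again.
            sleep(COOL_DOWN_SECONDS)
    return status, body
```

Passing sleep as a parameter keeps the helper testable without real delays; production code would leave the default in place.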
Note Operator Super users configure Rate limits discretely based on the environment. For any
queries on relevant policies, contact your Operator.
n vco.api.rateLimit.enabled
n vco.api.rateLimit.mode.logOnly
n vco.api.rateLimit.rules.global
n vco.api.rateLimit.rules.enterprise.default
n vco.api.rateLimit.rules.enterpriseProxy.default
For more information on the system properties, see Table 1-14. Rate Limiting APIs.
Overview
Even though the enterprise on-premises model has some unique advantages and features, there
are considerations that the service provider or customer managing the solution must understand.
Some of these considerations are as follows:
n Isolation of the solution: The VMware Cloud Operations team will not have access to apply
hotfixes and upgrades.
n Inadequate or insufficient solution monitoring: This situation may happen due to a lack
of personnel capable of managing the infrastructure, resulting in functional issues, slower
resolution of problems, and customer dissatisfaction.
This approach always requires a significant investment in people and time to manage, operate,
and patch properly. The table below outlines some of the elements that must be considered
when managing a system on-premises.
Backup | No | Yes
Replication | No | Yes
Troubleshooting application/infrastructure issues | No | Yes
DR infrastructure | No | Yes
DR testing | No | Yes
Operational scenarios for Enterprise On-Premises deployments are explained in the two
sections below: Day One Operations and Day Two Operations.
VMware Security Advisories document remediation for security vulnerabilities that are reported
in VMware products. Please subscribe to the link below to receive an alert if an action is required
in an on-prem component.
https://www.vmware.com/security/advisories.html
The data-source contains two sections: meta-data and user-data. Meta-data includes the instance
ID and should not change during the lifetime of the instance, while user-data is a configuration
applied on the first boot (for the instance ID in meta-data).
After the first boot up, it is recommended to deactivate the cloud-init file to speed up the
SD-WAN Orchestrator boot sequence. To deactivate cloud-init, run:
/opt/vc/bin/cloud_init_ctl -d
It is not recommended to "purge" the cloud-init file with the command "apt purge cloud-init" (this
procedure does not cause issues in the VMware SD-WAN Controller). Purging the cloud-init file
also erases some essential SD-WAN Orchestrator tools and scripts (for instance, the upgrade
and backup scripts). In case the "purge" command was used, you can restore the files using the
following commands:
n Install the SD-WAN Orchestrator tool package from the folder:
"sudo dpkg -i vco-tools_3.4.1-R341-20200423-GA-69c0f688bf.deb". The vco-tools package name
may change depending on your release. Please check the correct file name with the command
"ls vco-tools."
NTP Timezone
NTP Offset
When the SD-WAN Orchestrator is initially deployed, the following partitions are created: /,
/store, /store2, and /store3 (version 4.0 and onwards). The partitions are created with default
sizes. Follow the instructions in the section titled "Increasing Storage in the SD-WAN
Orchestrator" for guidance on modifying the default sizes to match the design.
Additional Tasks
The SD-WAN Orchestrator requires further configuration after its implementation via the
following steps:
The configurations in the list above are out of this document's scope and can be found in
the deployment guides in the VMware documentation. Detailed instructions can be found in
the VMware SD-WAN Orchestrator Deployment and Monitoring Guide, section titled, "Install
SD-WAN Orchestrator."
This section provides the available mechanisms to periodically backup the SD-WAN Orchestrator
database to recover from Operator errors or catastrophic failure of both the Active and Standby
Orchestrator.
Remember that the Disaster Recovery (DR) feature is the preferred recovery method.
It provides a Recovery Point Objective of nearly zero, as all configuration on the Active
Orchestrator is instantly replicated. For more details on the Disaster Recovery feature, refer to
the next section.
The script essentially takes a database dump of the configuration data and events, while
excluding some of the large monitoring tables during the database dump process. Once the
script is executed, backup files are created in the local directory path provided as input to the
above script.
The Backup consists of two .gzs files, one containing the database schema definition and the
other one containing the actual data without definition. The administrator should ensure that the
backup directory location has enough disk space for the Backup.
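The disk space check mentioned above can be automated before the backup starts. The sketch below uses Python's standard shutil.disk_usage; the function name and the idea of passing an operator-estimated size in bytes are assumptions made for illustration.

```python
import shutil


def has_room_for_backup(backup_dir, required_bytes):
    """Return True if the filesystem holding backup_dir has enough free space.

    required_bytes is an estimate supplied by the administrator, for example
    the size of a previous backup plus some headroom.
    """
    return shutil.disk_usage(backup_dir).free >= required_bytes
```

A wrapper script could call this first and skip the backup, with a clear error message, when the result is False.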
Best Practices
n Mount a remote location and configure the backup script to write to it. The remote location
should have the same storage as /store if flows are also being backed up.
n Before using the Backup Script, check the Disaster Recovery (DR) replication status from the
SD-WAN Orchestrator replication page. They should be in sync, and no errors should be
present.
n In addition, execute a MySQL query and check the replication lag.
n In the above query, look at the field seconds_behind_master. Ideally, it should be zero,
but under 10 would be sufficient as well.
n For the large SD-WAN Orchestrators, it is recommended to use the Standby for the
Backup script execution. There will be no difference in the Backup that is generated from
both SD-WAN Orchestrators.
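The replication lag check from the best practices above can be scripted once the status output has been captured as text. This sketch assumes the text comes from the MySQL command show slave STATUS\G (the command named in the monitoring thresholds of this guide); the function names are hypothetical and the limit of 10 follows the "under 10" guidance above.

```python
import re


def seconds_behind_master(status_text):
    """Extract Seconds_Behind_Master from replication status text, or None."""
    match = re.search(
        r"^\s*Seconds_Behind_Master:\s*(\S+)\s*$",
        status_text,
        re.MULTILINE | re.IGNORECASE,
    )
    # MySQL reports NULL when replication is not running; treat that as unknown.
    if match is None or not match.group(1).isdigit():
        return None
    return int(match.group(1))


def lag_is_acceptable(status_text, limit=10):
    """True when the lag is known and below the limit (zero is ideal)."""
    lag = seconds_behind_master(status_text)
    return lag is not None and lag < limit
```

Treating a missing or NULL value as "not acceptable" is a deliberately cautious choice: a backup should not proceed when the replication state cannot be confirmed.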
Caveats
n The Script only takes a backup of the configuration; flow stats or events are not included.
The duration of the Backup depends on the scale of the actual customer configuration.
Since the monitoring tables are excluded from the Backup operation, it is expected that the
configuration Backup operation will complete quickly. For a large SD-WAN Orchestrator with
thousands of SD-WAN Edge and lots of historical events, it could take up to an hour, while a
smaller SD-WAN Orchestrator should be completed within a few minutes.
Depending on the size and time it takes to complete the initial backup, the Backup operation
frequency can be determined. The Backup operation should be scheduled to run during
off-peak hours to reduce the impact on SD-WAN Orchestrator resources.
3 What if the root filesystem doesn't have enough space for the backup?
It is recommended that other mounted volumes are used to store the backup. Note, it is not a
best practice to use the root filesystem for the backup.
The script stdout and stderr should be sufficient to determine the success or failure of the
Backup operation. If the script invocation is automated, the exit code can determine the
Backup operation's success or failure.
Currently, VMware requires that the customer work with VMware Support to recover the
configuration data. VMware Support will help to recover the customer's configuration.
Customers should refrain from making any additional configuration changes until the
configuration is restored.
Even though a backup of the configuration should have little impact on performance, there
will be an increase in resource utilization for the MySQL process. It is recommended that the
Backup be run during off-peak hours.
7 Are any configuration changes allowed during the run of the Backup operation?
It is safe to make configuration changes while the Backup operation is running. However,
to ensure up-to-date backups, it is recommended that no configuration operations are done
while the Backup is running.
8 Can the configuration be restored on the original SD-WAN Orchestrator, or does it require a
new SD-WAN Orchestrator?
Yes, the configuration can, and ideally should, be restored on the same SD-WAN
Orchestrator if it is available. This will ensure that the monitoring data is utilized after the
Restore operation is completed. If the original SD-WAN Orchestrator cannot be recovered
and the Standby SD-WAN Orchestrator is down, the configuration can be restored on a new
SD-WAN Orchestrator. In this instance, the monitoring data will be lost.
9 What actions should be taken in case the configuration needs to be restored to a new
SD-WAN Orchestrator?
Please contact VMware Support for the recommended set of actions on the new SD-WAN
Orchestrator as the steps vary depending on the actual deployment.
No, SD-WAN Edges are not required to register on the new SD-WAN Orchestrator, as all
needed information is preserved as part of the Backup.
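As noted in the answers above, an automated invocation can rely on the backup script's exit code. The sketch below shows one way to wrap such a call; the backup command itself is passed in by the caller because its name and arguments are not reproduced here, and treating exit code 0 as success is the usual convention, assumed rather than confirmed for this script.

```python
import subprocess
import sys


def run_backup(command):
    """Run the backup command and report success based on its exit code.

    command is the argument list for the backup script (supplied by the
    caller). Returns (succeeded, stdout, stderr).
    """
    result = subprocess.run(command, capture_output=True, text=True)
    # Assumption: exit code 0 means success, anything else means failure.
    return result.returncode == 0, result.stdout, result.stderr


if __name__ == "__main__":
    # Stand-in command for demonstration only; replace with the real script.
    ok, out, err = run_backup([sys.executable, "-c", "print('backup done')"])
    print("success" if ok else "failure", out.strip(), err.strip())
```

Scheduling a wrapper like this during off-peak hours, and alerting when it reports failure, follows the recommendations given in this section.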
The SD-WAN Orchestrator Disaster Recovery (DR) feature prevents the loss of stored data and
resumes SD-WAN Orchestrator services in the event of system or network failure. SD-WAN
Orchestrator DR involves setting up an Active/Standby SD-WAN Orchestrator pair with data
replication and a manually-triggered failover mechanism.
Note DR is mandatory. For licensing and pricing, contact the VMware SD-WAN Sales team for
support.
States
From the view of an Operator, and of the SD-WAN Edges and SD-WAN Gateways, an SD-WAN
Orchestrator has one of four DR states:
n Zombie (DR formerly configured and Active, but no longer working as the Active or Standby)
Table 1-24. Table 2: Instance Minimal Requirements for On-Prem SD-WAN Orchestrator
Best Practices
n Before promoting a Standby SD-WAN Orchestrator as Active, confirm that the DR replication
Status is in Sync. The previously Active SD-WAN Orchestrator will no longer be able to
manage the inventory and configuration.
n If the Standby can communicate with the formerly Active Orchestrator, it will instruct that
Orchestrator to enter a Zombie state. In the Zombie state, the SD-WAN Orchestrator
communicates with its clients (SD-WAN Edges, SD-WAN Gateways, UI/API) that it is no
longer Active, and they must communicate with the newly promoted SD-WAN Orchestrator.
n If the promoted Standby cannot communicate with the formerly Active Orchestrator, the
Operator should, if possible, manually demote the previously Active.
For Enterprise on-prem deployments, contact the VMware Support team to prepare for the
SD-WAN Orchestrator upgrade as described below:
1 VMware Support will assist with the upgrade. Collect the following information before
contacting VMware Support.
n Provide the current and target SD-WAN Orchestrator versions, for example, the current
version (e.g., 3.4.2) and the target version (e.g., 3.4.3).
Note For the current version, this information can be found on the top, right corner of
the SD-WAN Orchestrator by clicking the Help link and choosing About.
n Output of the following commands from the SD-WAN Orchestrator. Commands must be run as
root (e.g. 'sudo <command>' or 'sudo -i'):
n LVM layout
n pvdisplay -v
n vgdisplay -v
n lvdisplay -v
n df -h
n cat /etc/fstab
n Memory information
n free -m
n cat /proc/meminfo
n ps -ef
n top -b -n 2
n CPU Information
n cat /proc/cpuinfo
n Copy of /var/log
3 ESXi Snapshot guidelines are provided in the next section in case the customer wants a quick
rollback solution after an upgrade.
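The information-gathering commands listed in step 1 can be collected in one pass before contacting VMware Support. The following is a sketch, not a supported tool: the command strings are copied from the list above, while the function name, the output format, and the error handling are assumptions. It must be run as root for the LVM commands to succeed.

```python
import subprocess

# Commands copied from the list above.
SUPPORT_COMMANDS = [
    "pvdisplay -v",
    "vgdisplay -v",
    "lvdisplay -v",
    "df -h",
    "cat /etc/fstab",
    "free -m",
    "cat /proc/meminfo",
    "ps -ef",
    "top -b -n 2",
    "cat /proc/cpuinfo",
]


def collect(commands, run=subprocess.run):
    """Run each command and return {command: combined stdout and stderr}."""
    report = {}
    for command in commands:
        try:
            result = run(command.split(), capture_output=True, text=True)
            report[command] = result.stdout + result.stderr
        except OSError as error:
            # For example, the binary is not installed on this host.
            report[command] = f"could not run: {error}"
    return report
```

Calling collect(SUPPORT_COMMANDS) and writing the returned dictionary to a file gives a single attachment for the support request; the copy of /var/log still has to be gathered separately.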
ESXi Snapshot
The ESXi snapshot capability can be used before the SD-WAN Orchestrator upgrades to provide
a quick rollback to the previous SD-WAN Orchestrator version.
Before reviewing the step-by-step process, check the following best practices and guidelines
about the feature:
n Standby and Active SD-WAN Orchestrator must be powered off before performing or
restoring from the Snapshot to avoid any database inconsistencies.
n All Snapshot-related tasks must be done in the Standby and Active SD-WAN Orchestrator to
avoid any database inconsistencies.
n It is essential to consolidate the Snapshot if the upgrade process was successful. The
snapshot file continues to grow when it is retained for a more extended period. This can
cause the snapshot storage location to run out of space and impact the system performance.
n Deactivate alerting in the SD-WAN Orchestrator while creating snapshots to avoid false
alarms.
n Feature validation was done with ESXi 6.7 and SD-WAN Orchestrator version 3.4.4.
VMware Snapshot best practices can be found in the following KB article:
https://kb.vmware.com/s/article/1025279
1 Deactivate alert, notification, and monitoring System Properties on the Active SD-WAN
Orchestrator. The approximate duration is 10 Minutes.
a In the Operator portal, click System Properties. Change the following System Properties
to false.
n vco.alert.enable
n vco.notification.enable
n vco.monitor.enable
2 Deactivate alert, notification, and monitoring System Property on the Standby SD-WAN
Orchestrator.
n vco.alert.enable
n vco.notification.enable
n vco.monitor.enable
5 Take a Snapshot of the Active SD-WAN Orchestrator. Confirm that the VM is powered off
before performing this step.
6 Take a Snapshot of Standby SD-WAN Orchestrator. Confirm that the VM is powered off
before performing this step.
Use the following instructions if you have a successful upgrade. An increased CPU usage of
about 5 percent is expected while conducting the consolidation process. The approximate
duration is 10 Minutes.
1 After confirming a successful upgrade on the Active and Standby Orchestrators, you can
consolidate the Snapshots starting with the Active SD-WAN Orchestrator.
3 Re-enable alert, notification, and monitoring System Properties on the Active SD-WAN
Orchestrator and the Standby SD-WAN Orchestrator.
In the Operator portal, click System Properties. Change the following system properties to
true.
n vco.alert.enable
n vco.notification.enable
n vco.monitor.enable
4 If Delete All Snapshots does not work with vSphere 6.x/7.x, you can try Consolidate
Snapshots. For more information, see the Consolidate Snapshots section in the vSphere
Product Documentation.
Perform the instructions below if you want to roll back to the previous SD-WAN
Orchestrator version. The approximate duration is 10 minutes.
Select the Snapshot to which you want to restore the VM, then click Revert to (see image below).
5 Re-enable the alert, notification, and monitoring System Properties on the Active SD-WAN
Orchestrator and the Standby SD-WAN Orchestrator. In the Operator portal, click System
Properties. Change the following System Properties to true.
n vco.alert.enable
n vco.notification.enable
n vco.monitor.enable
The software upgrade file contains Gateway and system updates. Do NOT run 'apt-get update
&& apt-get -y upgrade'.
Before proceeding with the VMware SD-WAN Controller upgrade, ensure that the SD-WAN
Orchestrator has already been upgraded to the same or a higher version.
2 Upload the image to the SD-WAN Controller storage (using, for example, the SCP command).
Copy the image to the following location on the system:
/var/lib/velocloud/software_update/vcg_update.tar.
sudo /opt/vc/bin/vcg_software_update
Example:
n A new system disk layout based on LVM to allow more flexibility in volume management
n Improved security hardening based on the Center for Internet Security benchmarks
Due to these changes, the standard upgrade procedure, which uses the upgrade script, does
not work. A special upgrade procedure is required that replaces the 3.3.2 or 3.4 Gateway VM
with the new 4.0 Gateway VM. Refer to the following document: VMware SD-WAN Partner
Gateway Upgrade and Migration 3.3.2 or 3.4 to 4.0
This upgrade procedure requires SD-WAN Orchestrator system property configuration, which
only SD-WAN Orchestrator Operator accounts can run. Please create a support ticket with the
VMware Support team to request the System Property change.
Monitoring
You can monitor the status and usage data of Controllers available in the Operator portal.
3 Click the link to a Gateway. The details of the selected Controller are displayed.
4 Click the Monitor tab to view the usage data of the selected Controller.
The Monitor tab of the selected Controller displays the following details as shown in the image
below.
You can choose a specific period to view the Controller's details for the selected duration at the
top of the page.
The page displays a graphical representation of usage details of the following parameters for the
selected time period, along with the minimum, maximum, and average values.
Usage Description
The following list shows values that should be monitored and their thresholds. The list
is given as a starting point and is not exhaustive. Some deployments may require assessing
additional components such as flows, packet loss, etc.
Handoff Drops | Due to the busy nature of traffic through a Controller, occasional drops are expected. | Consistent drops in specific queues may indicate a capacity problem.
Controller NTP | Check for Time offset | Offset of 5 Seconds | Offset of 10 Seconds
The SD-WAN Orchestrator comes with a built-in system metrics monitoring stack, which can
attach to an external metrics collector and a time-series database. With the monitoring stack, you
can quickly check the health condition and the system load for the SD-WAN Orchestrator.
Before getting started, set up a time-series database and a dashboard/alerting agent. After this
is complete, you can enable Telegraf in the SD-WAN Orchestrator.
n To enable the monitoring stack, run the following command on the Orchestrator:
Telegraf is used as the SD-WAN Orchestrator system metrics collector, which has plenty of
plugins to collect different system metrics. The following metrics are enabled by default.
To activate more metrics or deactivate some enabled metrics, you can edit the Telegraf
configuration file on the SD-WAN Orchestrator with the following command:
sudo vi /etc/telegraf/telegraf.d/system_metrics_input.conf
A time-series database (TSDB), a database optimized for time-series data, can be used to
store the system metrics collected by Telegraf.
The Dashboard and Alerting Agent allows you to query, visualize, alert on, and explore the data
stored in the TSDB. The image is an example of a dashboard, built with Telegraf, a TSDB, and a
dashboard engine, that can be created to monitor the solution.
1 Add an iptables entry that allows external monitoring systems to access the Telegraf port.
The source IP address should be specified for security reasons.
a Example: if the IP address of the external monitoring system is 191.168.0.200, add the
following line to /etc/iptables/rules.v4:
-A INPUT -p tcp -m tcp --source 191.168.0.200 --dport 9273 -m comment --comment "allow telegraf port" -j ACCEPT
b Restart iptables.
2 Add the time-series database details in the telegraf configuration. Create an output
configuration file. Example with prometheus is as follows:
/etc/telegraf/telegraf.d/prometheus_out.conf
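The iptables entry from step 1 can be generated for any monitoring host instead of being typed by hand. This is a small sketch: the rule text and port 9273 are taken from the example in step 1, while the function name and the address validation are additions made for illustration.

```python
import ipaddress


def telegraf_allow_rule(source_ip, port=9273):
    """Build the rules.v4 line that admits one monitoring host to the Telegraf port."""
    # Raises ValueError if source_ip is not a valid IPv4 or IPv6 address.
    ipaddress.ip_address(source_ip)
    return (
        f"-A INPUT -p tcp -m tcp --source {source_ip} --dport {port} "
        '-m comment --comment "allow telegraf port" -j ACCEPT'
    )


if __name__ == "__main__":
    # Same address as the example in step 1.
    print(telegraf_allow_rule("191.168.0.200"))
```

Validating the address before writing the rule avoids adding a malformed line to /etc/iptables/rules.v4, which would otherwise only surface when iptables is restarted.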
The following list shows values that should be monitored and their thresholds. The
list is given as a starting point and is not exhaustive. Some deployments may require
assessing additional components such as database transactions, automatic backups, etc.
SD-WAN Orchestrator Time - Telegraf input plugin: inputs.ntpq (version 4.0 and onwards). | Check for Time offset | Offset of 5 Seconds | Offset of 10 Seconds
SD-WAN Orchestrator Internet (not applicable for MPLS only topologies) | Check for Internet access. | Response time > 5 secs | Response time > 10 secs
SD-WAN Orchestrator Total Cert Count | Check Total. Example mysql queries (listed below). | CRL | When Total Cert count exceeds 5000
    SELECT count(id) FROM VELOCLOUD_EDGE_CERTIFICATE WHERE validFrom <= NOW() AND validTo >= NOW();
    SELECT count(id) FROM VELOCLOUD_GATEWAY_CERTIFICATE WHERE validFrom <= NOW() AND validTo >= NOW();
DR Replication Status | Confirm the Standby SD-WAN Orchestrator is up-to-date. | Review that the DR SD-WAN Orchestrator is no more than 1000 seconds behind the Active SD-WAN Orchestrator. Seconds_Behind_Master: from mysql command: show slave STATUS\G;
DR Replication SD-WAN Edge Gateway delta | Confirm that SD-WAN Edges and SD-WAN Gateways can talk to the DR SD-WAN Orchestrator. Different values between the Active and the Standby SD-WAN Orchestrators can be due to a difference in the timezone in SD-WAN Edges and SD-WAN Gateways. | The same amount of SD-WAN Edges talking with the Active SD-WAN Orchestrator should be able to reach the Standby SD-WAN Orchestrator. This value can be checked on the "replication" tab or via the API.
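As a minimal sketch of how the DR Replication Status check above might be automated, the
following Python parses the text printed by the mysql command show slave STATUS\G and compares
Seconds_Behind_Master with the 1000-second limit. The function names, the sample output, and
the choice to treat a NULL or missing value as unhealthy are illustrative assumptions, not part
of the product.

```python
import re
from typing import Optional

# Limit taken from the list above: the standby should be no more than
# 1000 seconds behind the active Orchestrator.
MAX_LAG_SECONDS = 1000


def parse_seconds_behind_master(status_text: str) -> Optional[int]:
    """Extract Seconds_Behind_Master from the slave status output.

    Returns None when the field is missing or is not a number (e.g. NULL).
    """
    match = re.search(
        r"^\s*Seconds_Behind_Master:\s*(\S+)\s*$", status_text, re.MULTILINE
    )
    if match is None or not match.group(1).isdigit():
        return None
    return int(match.group(1))


def replication_is_healthy(status_text: str, max_lag: int = MAX_LAG_SECONDS) -> bool:
    """Assumption: an unknown (NULL or missing) lag is treated as unhealthy."""
    lag = parse_seconds_behind_master(status_text)
    return lag is not None and lag <= max_lag


# Hypothetical, abbreviated sample of the command output.
SAMPLE_STATUS = """\
*************************** 1. row ***************************
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
        Seconds_Behind_Master: 42
"""

if __name__ == "__main__":
    print("replication healthy:", replication_is_healthy(SAMPLE_STATUS))
```

In practice the status text would come from running the mysql command on the Standby SD-WAN
Orchestrator, and the boolean result could feed whatever alerting system is already in use.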
The VMware SD-WAN Orchestrator powers the management plane in the VMware SD-WAN
solution. It offers a broad range of configuration, monitoring, and troubleshooting functionality to
service providers and enterprises. The main web service with which users interact to exercise this
functionality is called the SD-WAN Orchestrator Portal.
The SD-WAN Orchestrator Portal allows network administrators (or scripts and applications
acting on their behalf) to manage network and device configuration and query the current or
historical network and device state. API clients may interact with the Portal via a JSON-RPC
interface or a REST-like interface. It is possible to invoke all of the methods described in
this document using either interface. There is no Portal functionality for which access is
constrained exclusively to either JSON-RPC clients or REST-like ones.
Both interfaces accept only HTTP POST requests. Both also expect request bodies, when
present, to be JSON-formatted. Consistent with RFC 2616, clients are furthermore expected
to assert this formally using the Content-Type request header, e.g.,
Content-Type: application/json.
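The request shape described above can be sketched with the Python standard library. The host
name, URL path, method name, and parameters below are hypothetical placeholders, and the
JSON-RPC 2.0 envelope is an assumption based on the generic convention; consult the API
reference for the real values. The sketch only builds the request and does not send it.

```python
import json
import urllib.request


def build_portal_request(url: str, payload: dict) -> urllib.request.Request:
    """Build an HTTP POST request with a JSON body and a matching Content-Type."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical JSON-RPC style call: URL, method name, and params are placeholders.
request = build_portal_request(
    "https://orchestrator.example.com/portal/",
    {"jsonrpc": "2.0", "id": 1, "method": "example/getSomething", "params": {}},
)

# To actually send it (requires network access and valid authentication):
# with urllib.request.urlopen(request) as response:
#     result = json.loads(response.read())
```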
More information about the VMware SD-WAN API can be found here:
https://code.vmware.com/apis/1000/velocloud-sdwan-vco-api
n VMware requests that clients limit the number of API calls in flight at any given time to no
more than a handful (i.e., <2-4). If a user feels there is a compelling reason to parallelize
API calls, VMware requests that they contact VMware Support to discuss alternative
solutions.
n We ordinarily don't recommend polling the API for stats data more frequently than every
10 min. New stats data arrives at the SD-WAN Orchestrator every 5 minutes. Due to jitter
in reporting/processing, clients polling every 5 minutes might observe "false-positive"
cases where stats aren't reflected in API calls' results. Users tend to find the best result
using request intervals of 10 minutes or greater in duration.
n For complex software automations, run your scripts and evaluate the CPU/Memory
impact. Then adjust as required.
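One way a client script might follow the guidance above is to cap the number of concurrent
calls with a small worker pool and to space stats polls at least 10 minutes apart. This is a
minimal sketch: the constant names, the helper functions, and the choice of two workers are
assumptions made for illustration, and the callables stand in for real API calls.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_IN_FLIGHT = 2            # keep concurrent API calls to a handful
STATS_POLL_INTERVAL_S = 600  # poll stats no more often than every 10 minutes


def run_bounded(calls, max_in_flight: int = MAX_IN_FLIGHT):
    """Run zero-argument callables with at most max_in_flight executing at once.

    Results are returned in the same order as the input callables.
    """
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        futures = [pool.submit(call) for call in calls]
        return [future.result() for future in futures]


def seconds_until_next_poll(last_poll: float, now: float,
                            interval: float = STATS_POLL_INTERVAL_S) -> float:
    """Return how long to wait before polling stats again (0 if already due)."""
    return max(0.0, last_poll + interval - now)


if __name__ == "__main__":
    # Placeholder callables; a real script would wrap its API requests here.
    print(run_bounded([lambda: "a", lambda: "b", lambda: "c"]))
    print(seconds_until_next_poll(last_poll=0.0, now=100.0))
```

The worker pool bounds parallelism without any extra bookkeeping, since the executor never
runs more than max_workers tasks at a time.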
The VMware SD-WAN Orchestrator Syslog capability can be configured independently for the
following Orchestrator processes: portal, upload, and backend.
n Portal: The Portal process runs as an internal HTTP server downstream from NGINX. The
Portal service handles incoming API requests, either from the SD-WAN Orchestrator web
interface or from an HTTP/SDK client, primarily in a synchronous fashion. These requests
allow authenticated users to configure, monitor, and manage the various services provided
by the SD-WAN Orchestrator.
This log is especially useful for AAA activities because it records all actions taken by
users in the SD-WAN Orchestrator.
Log files: /var/log/portal/velocloud.log (Logs all info, warn, and error logs)
n Upload: The Upload process runs as an internal HTTP server downstream from NGINX. The
Upload service handles incoming requests from SD-WAN Edges and SD-WAN Gateways,
either synchronously or asynchronously. These requests primarily consist of activations,
heartbeats, flow statistics, link statistics, and routing information sent by SD-WAN Edges and
SD-WAN Gateways.
Log files: /var/log/upload/velocloud.log (Logs all info, warn, and error logs)
n Backend: Job runner that primarily runs scheduled or queued jobs. Scheduled jobs consist of
cleanup, rollup, or status update activities. Queued jobs consist of processing link and flow
statistics.
Log files: /var/log/backend/velocloud.log (Logs all info, warn, and error logs)
2 Change the “enable”:false value to true for one or more of the servers. Change the Host IP
and port according to your implementation.
Detailed instructions to increase the Storage in the SD-WAN Orchestrator can be found in the
SD-WAN Orchestrator
n Best Practices:
n Make sure that the same LVM distribution is applied to the Standby SD-WAN
Orchestrator.
n It is not recommended to reduce the size of the volumes once they have been increased.
Use thin provisioning instead.
n In 3.4, when increasing the disk size, the following percentage/value distribution may be
used:
n “/” Volume: This volume is used for the operating system. Production SD-WAN
Orchestrators are usually set to 140 GB and run at 40% to 60% usage.
n The guidelines in the table below should be used in release 4.x and onwards.
The SD-WAN Orchestrator uses a built-in certificate server to manage the overall PKI lifecycle of
all SD-WAN Edges and SD-WAN Controllers. X.509 certificates are issued to the devices in the
network.
Detailed instructions to configure the CA can be found in the official SD-WAN Orchestrator
documentation at https://docs.vmware.com/ under "Install SD-WAN Orchestrator" and "Install an
SSL Certificate."
Certificates issued by the CA are used only for the authentication of the following:
n Management plane TLS 1.2 tunnels between the SD-WAN Orchestrator and the SD-WAN Edge or
SD-WAN Controller.
n Control and Data plane IKEv2/IPsec tunnels between SD-WAN Edges and between an SD-WAN
Edge and an SD-WAN Controller.
On Controllers with PKI enabled, revoked certificates are stored in a Certificate Revocation List
("CRL"). If this list grows too long (generally due to an issue with the SD-WAN Orchestrator's
Certificate Authority), the Controller's performance will be impacted. The CRL should be less than
4,000 entries long.
Support Interaction
Our Customer Support organization provides 24x7x365 world-class technical assistance and
personalized guidance to VMware SD-WAN customers.
This section provides some guidelines for interacting with the VMware Support team.
n Diagnostic Bundles
While investigating an incident, a diagnostic bundle of the SD-WAN Orchestrator and SD-
WAN Controller can be created. The resulting file helps the VMware Support team
analyze the events around an issue.
On occasion assistance from VMware Support representatives for the SD-WAN Orchestrator
and SD-WAN Controllers may be required.
n Remote sessions with Support: The customer would either grant remote control to the
SSH jump server or follow the Support representative's instructions.
n Creating an account for the Support team in the SD-WAN Orchestrator. This helps the
Support team gather logs without customer interaction.
n Through the Bastion Host: SSH permissions and keys can be configured to allow the
Support engineers to access the on-premises SD-WAN Orchestrator and SD-WAN
Controller using a Bastion Host.
When contacting VMware SD-WAN Support for help triaging an issue, include the data
described in the table below.
Required Suggested