Martin Berger - Oracle Priva
Oracle Private Cloud Appliance X9-2
- There is a new kid on the Block!
... 75 Service Requests later...
Martin Berger
Trivadis – Part of Accenture
Agenda
1 Project
2 Machine
3 Launch
4 Patching & Upgrade
5 Management
6 VM Provisioning
7 Various Things
Martin Berger – Data Platforms
martinberger_ch www.martinberger.com
ASG DATA PLATFORMS
WHY? We are the game changer for our client's data platform projects.
HOW? Maximum automation, maximum efficiency, maximum quality!
WHAT? We build innovative data platforms based on our blueprints, assets and tools.
3 key benefits
1 Architecture expertise from hands-on projects
2 Delivery of tailor-made data platforms
3 Integrated Teams - Like a Rowing team, perfect alignment and interaction.
A federal office moves
• Hardware change from Oracle Exadata to Private Cloud Appliance X9-2
• Consolidation of the virtual and bare-metal application platform
• Application server: Oracle WebLogic
• Database: Oracle 19c
• POC in 2022 on a PCA X8-2, including JMeter transaction measurements
2 Machine
Private Cloud Appliance X9-2 Factsheet
• 180 – 1,080 OCPUs - Intel Xeon processors (Ice Lake, 2593.952 MHz)
• 3 – 18 TB Memory
• 100 TB – 8.4 PB combined Block, File and Object Storage
• NFS v3, v4.1, SMB 3.1/2.0
• Flex Shapes or Fixed Shapes
• Oracle Linux, Oracle Solaris, 3rd-party Linux and Microsoft Windows
• Oracle ZFS Storage ZS9-2 dual-controller HA cluster with 2x 24-core 2.1 GHz Intel® Xeon®
• Redundant 100 Gbit network
• KVM Virtualization Layer
• OVN (Open Virtual Networking) for Open vSwitch (OVS) – Software-defined Networking
Front and rear of a newly racked PCA
Components - basic equipment
Data center network connection
Storage types
Enclave concept
• Separation of administration from the user
• Separate web based user interfaces
Service Enclave: https://adminconsole.<pca-name>.<domain>
– Tenancy
– Rack provisioning
– Network settings
– Upgrade and patching
– ASR Phone Home
– Exadata networks
Compute Enclave: https://console.<pca-name>.<domain>
– Network
– Compute
– Storage
– DNS
– IAM
– Governance
Service Enclave User Interface SEUI
Compute Enclave User Interface CEUI
From ZFSSA to the virtual machine
Just like in the Oracle Cloud?
Architecture and terms
• Compute Instance, File System, Storage, Image etc.
Network configuration in the Compute Enclave
• Routing, Security List, Local Peering Gateway etc.
OCI CLI command line tool
• Slow response time
• Certificate used
Prometheus – Grafana - Stack
3 Launch
PCA provision – Installation Checklist
Fault domain status display
• Display utilization of the fault domains:
PCA-ADMIN> getFaultDomainInfo
Command: getFaultDomainInfo
Status: Success
Time: 2023-05-18 00:15:18,204 UTC
Data:
id totalCNs totalMemory freeMemory totalvCPUs freevCPUs notes
-- -------- ----------- ---------- ---------- --------- -----
UNASSIGNED 0 0.0 0.0 0 0
FD1 1 984.0 392.0 120 40
FD2 1 984.0 744.0 120 48
FD3 1 984.0 775.0 120 84
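The utilization per fault domain can be derived from this output. The following is a minimal Python sketch, assuming the table is available as plain text (for example copied from the PCA-ADMIN session). The `FaultDomain` class and the `parse_fault_domains` function are illustrative names, not part of any Oracle tooling.

```python
from dataclasses import dataclass
from typing import List

# Table rows copied from the getFaultDomainInfo output on the slide.
SAMPLE = """\
id totalCNs totalMemory freeMemory totalvCPUs freevCPUs notes
-- -------- ----------- ---------- ---------- --------- -----
UNASSIGNED 0 0.0 0.0 0 0
FD1 1 984.0 392.0 120 40
FD2 1 984.0 744.0 120 48
FD3 1 984.0 775.0 120 84
"""


@dataclass
class FaultDomain:
    name: str
    total_cns: int
    total_memory: float
    free_memory: float
    total_vcpus: int
    free_vcpus: int

    @property
    def memory_used_pct(self) -> float:
        # An empty fault domain (no compute nodes) counts as 0% used.
        if self.total_memory == 0:
            return 0.0
        return 100.0 * (self.total_memory - self.free_memory) / self.total_memory

    @property
    def vcpus_used_pct(self) -> float:
        if self.total_vcpus == 0:
            return 0.0
        return 100.0 * (self.total_vcpus - self.free_vcpus) / self.total_vcpus


def parse_fault_domains(text: str) -> List[FaultDomain]:
    """Parse the whitespace-separated data rows of the table."""
    domains = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the header row and the dashed separator row.
        if len(fields) < 6 or fields[0] in ("id", "--"):
            continue
        domains.append(
            FaultDomain(
                name=fields[0],
                total_cns=int(fields[1]),
                total_memory=float(fields[2]),
                free_memory=float(fields[3]),
                total_vcpus=int(fields[4]),
                free_vcpus=int(fields[5]),
            )
        )
    return domains


for fd in parse_fault_domains(SAMPLE):
    print(f"{fd.name}: memory {fd.memory_used_pct:.1f}% used, "
          f"vCPUs {fd.vcpus_used_pct:.1f}% used")
```

A view like this helps to check, before a migration, whether the remaining fault domains have enough free memory and vCPUs to take over the instances.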
HA or not?
• Scenarios
– 1. Manual migration for Compute Node maintenance or patching
• When the migrateVm command is started, running virtual machines are moved to other
available FDs. There is no virtual machine downtime.
– 2. Compute Node outage < 10min
• The virtual machines are restarted automatically as soon as the Compute Node is available again.
– 3. Compute Node outage > 10min
• According to the documentation: A compute node is considered failing when it has been
disconnected from the data network or has been in powered-off state for more than 10
minutes.
• Polling is done at 5-minute intervals. After 2 attempts, the Compute Node is internally set to
FAIL state and its agents to EVACUATING mode. When the evacuation starts, instances are
flagged in the CEUI with state MOVING and then RUNNING on the new Compute Node.
Evacuation results in downtime for the virtual machines.
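The three scenarios can be summarized as a small decision rule. This is a sketch in Python, assuming the 5-minute polling interval and the 2 attempts described above. The function names and the handling of an outage of exactly 10 minutes are assumptions made for illustration; the appliance's internal logic is not shown on the slides.

```python
POLL_INTERVAL_MIN = 5  # polling interval as described above
FAILED_POLLS = 2       # attempts before the node is set to FAIL


def failure_threshold_min() -> int:
    """Minutes after which a compute node is considered failing."""
    return POLL_INTERVAL_MIN * FAILED_POLLS


def expected_behaviour(outage_min: float, planned: bool = False) -> str:
    """Map an outage to one of the three scenarios (illustrative only)."""
    if planned:
        # Scenario 1: manual migration for maintenance or patching.
        return "live migration, no VM downtime"
    if outage_min > failure_threshold_min():
        # Scenario 3: node set to FAIL, instances are evacuated.
        return "evacuation to another compute node, VM downtime"
    # Scenario 2: short outage, instances restart on the same node.
    # Assumption: exactly 10 minutes is still treated as a short outage.
    return "VMs restart when the compute node is available again"


for minutes in (3, 12):
    print(minutes, "min:", expected_behaviour(minutes))
print("planned:", expected_behaviour(0, planned=True))
```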
4 Patching & Upgrade
Patching and upgrade requirements
• Requires a CSI and ULN registration
• A local ULN mirror with activated channels
• Space for patches on the ULN mirror
• Via GUI or PCA-ADMIN CLI
• The order in the patch and upgrade documents must be observed
Patching and upgrade information
• [PCA 3.x] Private Cloud Appliance Component Upgrade by Release Matrix (Doc ID 2907892.1)
• Not every component is patched in every release.
• The 3.0.2 ISO contains the latest version. Read the manual...
Patching and upgrade infrastructure
root@<vm-pca-02>:~ [PROD]# ls
pca-3.0.2-b819070.iso
Patching – ULN Mirror Bug
• The local ULN mirror on the same PCA is not recognized correctly - an external ULN mirror is
required.
• Service request open.
Migrate before patch
• Move the virtual machines away before patching the Compute Nodes.
-- evacuate
PCA-ADMIN> migrateVm id=c44901a6-3793-43bd-a3f8-3c7feab12a50
-- lock
PCA-ADMIN> provisioningLock id=c44901a6-3793-43bd-a3f8-3c7feab12a50
PCA-ADMIN> maintenanceLock id=c44901a6-3793-43bd-a3f8-3c7feab12a50
-- upgrade
PCA-ADMIN> upgradeCN hostIp=100.96.2.66 imageLocation="http://<ULN-Mirror-Hostname>:10001/yum/iso/pca-3.0.2-b819070.iso" isoChecksum="7a18a6..83e897de024859...
-- unlock
PCA-ADMIN> maintenanceUnlock id=c44901a6-3793-43bd-a3f8-3c7feab12a50
PCA-ADMIN> provisioningUnlock id=c44901a6-3793-43bd-a3f8-3c7feab12a50
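The order of the steps matters: evacuate, lock, upgrade, then unlock in reverse order. The sketch below only assembles the PCA-ADMIN command strings shown above so that they can be reviewed or pasted into a session; it does not talk to the appliance. The function name is hypothetical, and because the upgradeCN line on the slide is truncated, any further parameters it may take are not covered.

```python
from typing import List


def compute_node_patch_commands(
    node_id: str,
    host_ip: str,
    image_location: str,
    iso_checksum: str,
) -> List[str]:
    """Return the PCA-ADMIN commands in the order used on the slide.

    node_id is the id passed to migrateVm and the lock commands
    (assumed here to be the compute node id).
    """
    return [
        # evacuate
        f"migrateVm id={node_id}",
        # lock
        f"provisioningLock id={node_id}",
        f"maintenanceLock id={node_id}",
        # upgrade
        f'upgradeCN hostIp={host_ip} imageLocation="{image_location}" '
        f'isoChecksum="{iso_checksum}"',
        # unlock, in reverse order of locking
        f"maintenanceUnlock id={node_id}",
        f"provisioningUnlock id={node_id}",
    ]


# Placeholder values; replace them with the real id, IP, URL and checksum.
for command in compute_node_patch_commands(
    node_id="<compute-node-id>",
    host_ip="<compute-node-ip>",
    image_location="http://<ULN-Mirror-Hostname>:10001/yum/iso/pca-3.0.2-b819070.iso",
    iso_checksum="<iso-checksum>",
):
    print("PCA-ADMIN>", command)
```

Generating the list per compute node keeps the sequence identical across nodes and makes it harder to forget an unlock step.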
5 Management
PCA-ADMIN
• Management of the service enclave
• e.g. patching, compute node provisioning
OCI CLI
• Online and offline installation
• A certificate is required - must be initially "fetched" from the PCA.
• Not all queries known from the cloud are supported.
• Responds slowly, at least for us...
Terraform
• Also requires the PCA certificate.
• Not 100% compatible.
• Not reliable: missing modules, wrong order when destroying resources, etc.
• Terraform Provider Region Variable:
variable "region" {
type = string
default = "pca01.<my-domain>"
}
6 VM Provisioning
Components for starting an instance
Minimum network configuration
A few minutes later
VM with a second network interface
• As in OCI, the 2nd vNIC must be configured at OS level with a script.
• This also applies to routing, but not to a 2nd IP address on the 2nd vNIC. No nmcli.
– https://docs.oracle.com/en-us/iaas/Content/Resources/Assets/secondary_vnic_all_configure.sh
Console Connection – Port 5000
C:\> plink.exe -ssh -i <path and file for your SSH private key> -P 32222 -L 5000:localhost:5000
mlockcxjr0cabcdhxdfpdb24vdxdrya@<your pca management vip>
vnc@ocid1.instance.AK02134552.PCA01.cpyc0v4abczd647dfhsbs9fslraj4b38agmeoy37gtkdenry2p7zb5zu18f
7 Various
Automation Matters
• SSH key distribution, DNF updates etc.
My Oracle Support Experience - 75 SRs later
• In 95% of all cases, Oracle Support asks for a support bundle to be uploaded.
• Zoom sessions achieve more than SR ping-pong.
• The Oracle crew in the USA seems to be more technically savvy; mind the time zone for Zoom calls!
• By now, people "know" each other on the calls.
• SEV2 takes a long time; SEV1 only with 24/7 availability.
• Many issues are fixed in the internal MySQL database.
• Blockers so far:
– Day0 configuration failed
– VMs no longer bootable after shape customization
– Local peering gateway routes no longer work after an LPG in the same subnet is deleted.
– DNS cannot be deleted.
Five tips to take away
01 Never use 3.0.1.
02 Complete the checklists in full and ask if anything is unclear.
03 Plan network, order DNS entries, firewall, routing etc.
04 Work on bugs consistently & be patient.
05 Reduce project risk and plan sufficient reserves.
“Gring abe u vou seckle” (Bernese German: head down and run flat out)
Merci vöumou (thank you very much)