OVN Kubernetes Presentation, OCTO, October 6th 2020


OpenShift Networking + OVN-Kubernetes
An introduction for beginners [Recording]

Andrew Stoycos, Software Engineer, OCTO
Surya Seetharaman, Software Engineer (OpenShift SDN)

October 6th 2020

Agenda

● OVN-Kubernetes
  ○ What is it, and where does it fit in OpenShift?
  ○ Architecture and components
● What happens when a Pod is created
  ○ How is its IP allocated?
  ○ What points in the OVN-K8s codebase are touched
  ○ What OVN networking entities are created
● The lifecycle of an ICMP packet
  ○ Inter-node pod-to-pod traffic within the cluster

Future talks:
- More packet tracing and deeper dives into the layers below OVN: OVS, Linux datapaths, etc.

Big Picture: Kubernetes Networking


● What are the main requirements?
  1. Every Pod must have its own unique IP (illustrated below)
  2. Any pod should be able to talk to any other pod in the cluster without NAT
  3. Agents on a node (e.g. system daemons, kubelet) can communicate with all pods on that node
● How is the networking model implemented in Kubernetes?
  ○ The core networking requirements are not natively built into Kubernetes
  ○ Network overlay plugins follow the Container Network Interface (CNI) spec
  ○ These include OVN-Kubernetes, Flannel, etc.
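
Requirement 1 is easy to see on a live cluster: every pod gets its own IP from its node's pod subnet. A quick check (a sketch; the pod names and addresses are illustrative, borrowed from the testbed used later in this deck):

# Every pod has a unique, routable IP drawn from its node's subnet.
$ kubectl get pods -o wide
NAME     READY   STATUS    ...   IP           NODE
server   1/1     Running   ...   10.244.2.3   ovn-worker
client   1/1     Running   ...   10.244.0.4   ovn-worker2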

[Diagram: Kubernetes -> CNI plugin -> Pod (eth0)]

https://github.com/containernetworking/cni/blob/master/SPEC.md
https://kubernetes.io/docs/concepts/cluster-administration/networking/

Big Picture: OpenShift Networking

[Diagram: OpenShift 4.6 built on OVN and Kubernetes]

● OpenShift implements its networking framework within the CNO (Cluster Network Operator)
  ○ It currently supports three main network types (CNIs) -> OpenShiftSDN, OVNKubernetes, Kuryr (see the check below)
● Multus is a meta-plugin that allows pods to have multiple network interfaces
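
To check which of these network types a given cluster is running (a sketch; uses the standard OpenShift Network config API):

# Query the cluster-wide network configuration managed by the CNO.
$ oc get network.config/cluster -o jsonpath='{.status.networkType}'
OVNKubernetes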


https://www.openshift.com/blog/demystifying-multus
Big Picture UPDATED 5/1/21

OPENSHIFT KUBERNETES CNIs (with the OpenShift release where each became available):
● OpenShift SDN (DEFAULT) - 4.x
● OVN (open src) - 4.6+
● Tigera Calico - 4.2+
● Cisco ACI - 4.4+
● VMware NCP - 4.4+
● Isovalent Cilium - 4.5+
● VMware Antrea - Q2CY2021
● Juniper Contrail - Q2CY2021
● kuryr-kubernetes (RH-OSP Neutron plugin) - 4.2.2+

3rd-party Kubernetes CNI plug-in certification primarily consists of:
1. Formalizing the partnership
2. Certifying the container(s)
3. Certifying the Operator
4. Successfully passing the same Kubernetes networking conformance tests that OpenShift uses to validate its own SDN

Per-plugin status in the original slide was color-coded as: Fully Supported / Tech Preview / Cert In-Progress.

Product Manager: Marc Curry    Version 2021-04-21



OVN-Kubernetes

● CNI network plugin for OpenShift (GA as of 4.6) and Kubernetes
  ○ https://github.com/ovn-org/ovn-kubernetes
  ○ Started by the OVN/OVS communities
  ○ Uses OVN on OVS as the abstraction to manage network traffic flows on the node
  ○ Uses the Geneve (Generic Network Virtualization Encapsulation) protocol rather than VXLAN (used by OpenShift SDN) to create an overlay network between nodes
  ○ Creates logical network topologies (sampled below)
    ■ Logical switches, routers, ports, ACLs (network policies), load balancers, etc.
  ○ Does not need kube-proxy (unlike OpenShift SDN)
  ○ Allows network management to be abstracted across multiple nodes
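
A quick way to get a feel for these logical entities on a running cluster is to list them from the northbound DB (a sketch; run wherever ovn-nbctl can reach the nbdb, e.g. inside the nbdb or ovnkube-master container):

# List the OVN logical switches, routers and per-switch ACLs.
$ ovn-nbctl ls-list                 # logical switches (one per node, plus join switches)
$ ovn-nbctl lr-list                 # logical routers (e.g. ovn_cluster_router)
$ ovn-nbctl acl-list ovn-worker     # ACLs (network policies) on a node's switch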


OVN-Kubernetes Architecture: Master


● OVN-Kubernetes Master
  ○ OVN-Kubernetes component
  ○ Central process that watches for cluster events (Pods, Namespaces, Services, Endpoints, NetworkPolicy)
  ○ Translates cluster events into OVN logical network elements stored in the northbound DB (nbdb)
  ○ Tracks Kube-API state
  ○ Manages pod subnet allocation to nodes
● OVN-northd
  ○ Native OVN component
  ○ Process that converts the northbound DB network representation into the lower-level logical flows that are stored in the southbound DB (sbdb)

These processes are started by the CNO via a DaemonSet -> ovnkube-master.
They can be seen running on an OCP cluster in the openshift-ovn-kubernetes namespace within the ovnkube-master pod as the following containers (see the example below):
● [northd, nbdb, sbdb, ovnkube-master]

[Diagram: Master Node: ovn-kubernetes master -> OVN northbound DB -> ovn-northd -> OVN southbound DB]
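
To see these containers on a live cluster (a sketch; the app=ovnkube-master label and the generated pod name are assumptions that may differ by release):

# List the master pods and the containers they run.
$ oc -n openshift-ovn-kubernetes get pods -l app=ovnkube-master
$ oc -n openshift-ovn-kubernetes get pod <ovnkube-master-pod> -o jsonpath='{.spec.containers[*].name}'
northd nbdb sbdb ovnkube-master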

https://docs.google.com/presentation/d/1iktpCdAsdaJPJe7Yk36T-BtYwuB04oc3lHdMHH2qBcw/edit#slide=id.g6fa84b026f_2_109

OVN-Kubernetes Architecture: Node


● Open vSwitch
  ○ Pushes the new "edge of the network" down to the hypervisor
  ○ Multilayer software switch used to implement OpenFlow rules
  ○ Datapath used by containers

The OVS process runs via systemd on the host but is managed by the CNO via a DaemonSet for log tailing and other monitoring reasons -> ovs-node.
It can be seen running on an OCP cluster in the openshift-ovn-kubernetes namespace within the ovs-node pods as the following containers (see the example below):
● [ovs-daemons]

[Diagram: Worker Node: kubelet/CRI-O and ovn-controller above the ovn-kubernetes node process, OVS database, OVS bridge, NIC]
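
On a worker you can confirm OVS is running and see the integration bridge that OVN programs (a sketch; run on the node or inside the ovs-daemons container, and the exact bridge list depends on the gateway mode):

# br-int is OVN's integration bridge; pods attach to it via veth ports.
$ ovs-vsctl list-br
br-int
$ ovs-vsctl list-ports br-int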

https://docs.google.com/presentation/d/1iktpCdAsdaJPJe7Yk36T-BtYwuB04oc3lHdMHH2qBcw/edit#slide=id.g6fa84b026f_2_109

OVN-Kubernetes Architecture: Node


● OVN-Controller
  ○ Native OVN component
  ○ Watches the sbdb
  ○ Matches OVS "physical" ports to OVN logical ports
  ○ Reads logical flows from the sbdb, translates them into OpenFlow flows and sends them to the worker node's OVS daemon (see the example below)
● OVN-Kubernetes Node
  ○ Called as a CNI plugin (just an executable) from kubelet/CRI-O
  ○ Digests the IPAM annotation written by ovn-kubernetes master
  ○ Sets up firewall rules and routes for HostPort and Service access from the node
  ○ Creates an OVS port on the bridge, moves it into the pod network namespace, and sets IP details/QoS
  ○ Deletes these entities when pods die

These processes are also started by the CNO via a DaemonSet -> ovnkube-node.
They can be seen running on an OCP cluster in the openshift-ovn-kubernetes namespace within the ovnkube-node pod as the following containers:
● [ovn-controller, ovnkube-node]
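
To watch ovn-controller's output from the OVS side, dump the OpenFlow flows it has installed on the integration bridge (a sketch; expect thousands of flows on a busy node):

# Sample the OpenFlow flows ovn-controller programmed into br-int.
$ ovs-ofctl dump-flows br-int | head -20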

https://docs.google.com/presentation/d/1iktpCdAsdaJPJe7Yk36T-BtYwuB04oc3lHdMHH2qBcw/edit#slide=id.g6fa84b026f_2_109
Local GW Topology Overview

(Updated: 06/25/2021)

[Diagram: local gateway mode topology; image-only slide]
Shared GW Topology Overview

(Updated: 06/25/2021)

[Diagram: shared gateway mode topology; image-only slide]

Pod Creation Workflow

[Diagram: etcd <-> kube-apiserver, with the scheduler watching the apiserver]

Pod Creation Workflow

apiVersion: v1
kind: Pod
metadata:
  name: client
  namespace: default
spec:
  containers:
  - name: client
    image: xxx

[Diagram: the pod spec is submitted to the kube-apiserver and persisted in etcd; the scheduler is watching]

Pod Creation Workflow

apiVersion: v1
kind: Pod
metadata:
  name: client
  namespace: default
spec:
  containers:
  - name: client
    image: xxx

[Diagram: the scheduler, watching the apiserver, notices the new pod: "Aha! New pod"]

Pod Creation Workflow

apiVersion: v1
kind: Pod
metadata:
  name: client
  namespace: default
spec:
  containers:
  - name: client
    image: xxx
  nodeName: node1

[Diagram: the scheduler binds the pod by writing spec.nodeName back through the apiserver]

Pod Creation Workflow

apiVersion: v1
kind: Pod
metadata:
  name: client
  namespace: default
spec:
  containers:
  - name: client
    image: xxx
  nodeName: node1

[Diagram: the kubelet on node1 and ovnkube-master (on the master) both watch the apiserver and now see the scheduled pod]

Pod Creation Workflow

● Kubelet: I now have a pod object. How do I get it running?
● Kubelet: I'll ask CRI-O to create a PodSandbox over CRI using gRPC (RuntimeService.RunPodSandbox)
● PodSandbox: I'm the initial container launched to set up a networking environment that can be used by all the containers in the pod (see the example below).

[Diagram: on node1, the kubelet asks CRI-O for sandbox creation for the client pod]
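
The resulting sandbox is visible through the CRI command-line client (a sketch; assumes crictl is pointed at the CRI-O socket, and the IDs/columns shown are illustrative):

# List pod sandboxes known to CRI-O, filtered to our pod.
$ crictl pods --name client
POD ID          CREATED          STATE   NAME     NAMESPACE   ...
0123456789abc   10 seconds ago   Ready   client   default     ...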

Pod Creation Workflow

● kubelet: I now have a pod object. How do I get it running?
● kubelet: I'll ask CRI-O to create a PodSandbox over CRI using gRPC (RuntimeService.RunPodSandbox)
● PodSandbox: I'm the initial container launched to set up a networking environment that can be used by all the containers in the pod.
● CRI-O: creates the container and sets up networking.
  ○ calls CNI_ADD to execute the CNI binary on the host through the CNI interface (/opt/cni/bin/ovn-k8s-cni-overlay); see the config sketch below
  ○ the CNI binary sends an http request over the ovn-cni-server.sock socket to our CNI plugin (ovnkube-node) to do a cni_add
  ○ the CNI plugin does the necessary work to get networking working on the pod

[Diagram: kubelet -> CRI-O (sandbox creation) -> /opt/cni/bin/ovn-k8s-cni-overlay -> ovn-cni-server.sock -> ovnkube-node]
- https://github.com/ovn-org/ovn-kubernetes/blob/22c61bd81deb347e8ad94ae3191c1f4e1fc2c5b4/go-controller/pkg/cni/cni.go#L140
- https://github.com/ovn-org/ovn-kubernetes/blob/22c61bd81deb347e8ad94ae3191c1f4e1fc2c5b4/go-controller/pkg/cni/cni.go#L68
- https://github.com/ovn-org/ovn-kubernetes/blob/22c61bd81deb347e8ad94ae3191c1f4e1fc2c5b4/go-controller/pkg/cni/cni.go#L165
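
The CNI interface that CRI-O drives here is configured by a network config file naming the plugin binary to execute. A minimal sketch of what that config could look like (the path, file name, and exact fields are assumptions; they vary by version):

# Hypothetical: the CNI config the runtime reads to pick the plugin binary.
$ cat /etc/cni/net.d/10-ovn-kubernetes.conf
{
  "cniVersion": "0.4.0",
  "name": "ovn-kubernetes",
  "type": "ovn-k8s-cni-overlay"
}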

Pod Creation Workflow

ConfigureInterface

● hostIface, contIface, err = setupInterface(netns, pr.SandboxID, pr.IfName, ifInfo)
  ○ SetupVethWithName
  ○ makeVeth(contVethName, hostVethName, mtu) creates a veth pair (see the check below)

[Diagram: on node1, a veth pair now connects the client pod to br-int; the CNI request path is the same as on the previous slide]
- https://github.com/ovn-org/ovn-kubernetes/blob/e766bc19d0338c4d11136c5fa97202569d2702f8/go-controller/pkg/cni/helper_linux.go#L279
- https://github.com/ovn-org/ovn-kubernetes/blob/e766bc19d0338c4d11136c5fa97202569d2702f8/go-controller/pkg/cni/helper_linux.go#L140
- https://github.com/containernetworking/plugins/blob/e78e6aa5b9fd7e3e66f0cb997152c44c2a4e43df/pkg/ip/link_linux.go#L130
- https://github.com/ovn-org/ovn-kubernetes/blob/e766bc19d0338c4d11136c5fa97202569d2702f8/go-controller/pkg/cni/helper_linux.go#L319
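
The result of setupInterface is an ordinary veth pair: one end becomes eth0 inside the pod's network namespace, the other end is attached to br-int on the host. One way to verify this (a sketch; in ovn-kubernetes the host-side veth is named after a truncated sandbox ID):

# Host side: the pod's veth shows up as a port on the integration bridge.
$ ovs-vsctl list-ports br-int
# Pod side: eth0 inside the pod's network namespace.
$ kubectl exec client -- ip addr show eth0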

Pod Creation Workflow

ConfigureInterface

● hostIface, contIface, err = setupInterface(netns, pr.SandboxID, pr.IfName, ifInfo)
  ○ SetupVethWithName
  ○ makeVeth(contVethName, hostVethName, mtu) creates a veth pair
● waitForPodFlows(ifInfo.MAC.String()) waits for the pod's flows to appear in OVS (see the manual equivalent below)
  ○ wait.PollImmediate(200*time.Millisecond, 20*time.Second, func() (bool, error))
  ○ ofctlExec("dump-flows", "br-int", "table=9,dl_src=mac_addr")

[Diagram: the CNI plugin creates the veth pair and waits for the pod's flows to land on br-int]
- https://github.com/ovn-org/ovn-kubernetes/blob/e766bc19d0338c4d11136c5fa97202569d2702f8/go-controller/pkg/cni/helper_linux.go#L375
- https://github.com/openshift/ovn-kubernetes/blob/86b3feb85044261b202c627dda22320506748d8b/go-controller/pkg/cni/ovs.go#L101
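
The poll above is roughly what you would do by hand: keep dumping table 9 until a flow matching the pod's source MAC appears (a sketch; the MAC here is the client pod's from the testbed):

# Has OVS installed a flow for the pod's MAC yet?
$ ovs-ofctl dump-flows br-int 'table=9,dl_src=0a:58:0a:f4:00:04'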

Pod Creation Workflow

[Diagram: on the master, ovnkube-master, watching the apiserver, picks up the scheduled pod]
- https://github.com/ovn-org/ovn-kubernetes/blob/d17a8bcfc68e8893c78dac8fd30a40d21ef22194/go-controller/pkg/ovn/ovn.go#L529
- https://github.com/ovn-org/ovn-kubernetes/blob/d17a8bcfc68e8893c78dac8fd30a40d21ef22194/go-controller/pkg/ovn/ovn.go#L544

Pod Creation Workflow

[Diagram: ovnkube-master creates logical objects for the pod in the nbdb]
- https://github.com/ovn-org/ovn-kubernetes/blob/0c534f8bb21d2950dffcc41d96c255893ccece27/go-controller/pkg/ovn/pods.go#L257

Pod Creation Workflow

addLogicalPort (see the nbctl sketch below)

● logicalSwitch := node1
● portName := default_client
● cmd, err = oc.ovnNBClient.LSPAdd(logicalSwitch, portName)
● podMac, podIfAddrs, err = oc.assignPodAddresses(node1)
● podAnnotation := util.PodAnnotation{IPs: podIfAddrs, MAC: podMac}
● nodeSubnets = oc.lsManager.GetSwitchSubnets(logicalSwitch)
● oc.addRoutesGatewayIP(pod, &podAnnotation, nodeSubnets)
  ○ podAnnotation.Gateways = append(podAnnotation.Gateways, gatewayIP)
● oc.kube.SetAnnotationsOnPod(pod, podAnnotation)

[Diagram: ovnkube-master creates these logical objects in the nbdb]
- https://github.com/ovn-org/ovn-kubernetes/blob/0c534f8bb21d2950dffcc41d96c255893ccece27/go-controller/pkg/ovn/pods.go#L296
- https://github.com/ovn-org/ovn-kubernetes/blob/0c534f8bb21d2950dffcc41d96c255893ccece27/go-controller/pkg/ovn/pods.go#L366
- https://github.com/ovn-org/ovn-kubernetes/blob/0c534f8bb21d2950dffcc41d96c255893ccece27/go-controller/pkg/ovn/pods.go#L515
- https://github.com/ovn-org/ovn-kubernetes/blob/0c534f8bb21d2950dffcc41d96c255893ccece27/go-controller/pkg/ovn/pods.go#L387
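
In nbctl terms, the LSPAdd call is roughly equivalent to the following (a sketch for illustration; ovnkube-master talks to the nbdb through its Go client rather than shelling out):

# Add a logical switch port for the pod on the node's logical switch.
$ ovn-nbctl lsp-add node1 default_client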

Pod Creation Workflow

ovnkube-master sets the IPAM result as annotations on the pod (readable as shown below):

annotations:
  k8s.ovn.org/pod-networks: |
    {
      "default": {
        "ip_addresses": ["192.168.0.5/24"],
        "mac_address": "0a:58:fd:98:00:01",
        "gateway_ips": ["192.168.0.1"],
        # for backward compatibility
        "ip_address": "192.168.0.5/24",
        "gateway_ip": "192.168.0.1"
      }
    }

[Diagram: ovnkube-master creates logical objects in the nbdb and sets annotations on the pod]
- https://github.com/ovn-org/ovn-kubernetes/blob/0c534f8bb21d2950dffcc41d96c255893ccece27/go-controller/pkg/ovn/pods.go#L416
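
You can read this annotation straight off the pod object (note the escaped dots in the jsonpath key):

# Print the OVN pod-networks annotation that ovnkube-node will consume.
$ kubectl get pod client -o jsonpath='{.metadata.annotations.k8s\.ovn\.org/pod-networks}'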

Pod Creation Workflow

addLogicalPort (continued; see the nbctl sketch below)

● cmd, err = oc.ovnNBClient.LSPSetAddress(portName, addresses)
  ○ addresses = make([]string, len(podIfAddrs)+1)
  ○ addresses[0] = podMac.String()
  ○ for idx, podIfAddr := range podIfAddrs {
        addresses[idx+1] = podIfAddr.IP.String()
    }
● podAddDefaultDenyMulticastPolicy(portName)
● oc.addGWRoutesForPod(routingGWs, podIfAddrs, pod.Namespace, pod.Spec.NodeName)
● cmd, err = oc.ovnNBClient.LSPSetPortSecurity(portName, addresses)

[Diagram: ovnkube-master continues creating logical objects in the nbdb]
- https://github.com/ovn-org/ovn-kubernetes/blob/0c534f8bb21d2950dffcc41d96c255893ccece27/go-controller/pkg/ovn/pods.go#L422
- https://github.com/ovn-org/ovn-kubernetes/blob/0c534f8bb21d2950dffcc41d96c255893ccece27/go-controller/pkg/ovn/pods.go#L498
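
The corresponding nbctl sketch for the address and port-security calls (again illustrative, using the MAC/IP from the annotation above):

# Bind the pod's MAC+IP to its logical port, then restrict the port to
# sending/receiving only as that address (port security).
$ ovn-nbctl lsp-set-addresses default_client "0a:58:fd:98:00:01 192.168.0.5"
$ ovn-nbctl lsp-set-port-security default_client "0a:58:fd:98:00:01 192.168.0.5"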

Pod Creation Workflow

ovn-northd, watching the nbdb, picks up the newly created logical objects.

[Diagram: ovnkube-master -> nbdb, with northd watching the nbdb; the pod annotations are as on the previous slide]

Pod Creation Workflow

ovn-northd translates the logical objects in the nbdb into logical flows and writes them to the sbdb.

[Diagram: ovnkube-master -> nbdb -> northd (creates logical flows) -> sbdb]

Pod Creation Workflow

ovn-controller on the node syncs the logical flows from the sbdb into openvswitch.

[Diagram: on node1, ovn-controller syncs with the sbdb and programs openvswitch, while ovnkube-node has created the veth pair and is still waiting for the pod's flows]

Pod Creation Workflow

ovnkube-node assigns the IP details from the pod annotation to the pod interface.

[Diagram: ovnkube-node creates the veth pair, waits, and assigns IP details to the pod interface; ovn-controller keeps syncing with the sbdb]

Pod Creation Workflow

Once the pod interface is configured, ovnkube-node sends its response back over the http socket, the CNI binary returns the result to CRI-O, and the binary exits.

[Diagram: ovnkube-node -> response -> /opt/cni/bin/ovn-k8s-cni-overlay -> CRI-O (binary exits); the pod's veth is now wired into openvswitch]
- https://github.com/ovn-org/ovn-kubernetes/blob/22c61bd81deb347e8ad94ae3191c1f4e1fc2c5b4/go-controller/pkg/cni/cni.go#L197
- https://github.com/ovn-org/ovn-kubernetes/blob/22c61bd81deb347e8ad94ae3191c1f4e1fc2c5b4/go-controller/pkg/cni/cni.go#L162

Testbed Setup - OVN-KIND clusters

● Wondering how to play around with a cluster that uses the ovn-k8s plugin?
  ○ Check out the OVN-KIND setup (see the sketch below)
  ○ Spins up a cluster with two workers and one control-plane node (non-HA) within minutes
    ■ On default, out-of-the-box OpenShift clusters we would have 3 control-plane nodes and 3 worker nodes
  ○ This is what we have used to demonstrate concepts in this presentation

[Diagram: KIND (Kubernetes in Docker) with Node3 = ovn-control-plane, Node1 = ovn-worker (server pod), Node2 = ovn-worker2 (client pod)]
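
A sketch of spinning one up (assumes Docker and kind are installed; the script name, location, and flags live in the ovn-kubernetes repo and may change between versions):

# Clone ovn-kubernetes and bring up a kind cluster wired with OVN.
$ git clone https://github.com/ovn-org/ovn-kubernetes.git
$ cd ovn-kubernetes/contrib
$ ./kind.sh    # creates ovn-control-plane, ovn-worker, ovn-worker2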

Goal - trace the packet flow

[Diagram: an ICMP echo request travels from the client pod on ovn-worker2 (Node2) to the server pod on ovn-worker (Node1) inside the KIND cluster]
OVN logical entities

● Let us look at the OVN topology for our testbed setup.
● The logical components are stored in the ovn-nbdb (showing only the relevant components):

[root@ovn-control-plane ~]# ovn-nbctl show
switch 560002d1-c90c-41c9-afe7-7fffddfafd05 (ovn-worker)
    port default_server
        addresses: ["0a:58:0a:f4:02:03 10.244.2.3"]
    port k8s-ovn-worker
        addresses: ["02:5a:96:a9:18:ec 10.244.2.2"]
    port stor-ovn-worker
        type: router
        addresses: ["0a:58:0a:f4:02:01"]
        router-port: rtos-ovn-worker
switch 6f710093-f1ef-427c-bda4-3cfc51690424 (ovn-worker2)
    port default_client
        addresses: ["0a:58:0a:f4:00:04 10.244.0.4"]
    port k8s-ovn-worker2
        addresses: ["9a:f1:9c:49:89:8e 10.244.0.2"]
    port stor-ovn-worker2
        type: router
        addresses: ["0a:58:0a:f4:00:01"]
        router-port: rtos-ovn-worker2
router 3366b7fe-9996-4e79-b872-0050381473ee (ovn_cluster_router)
    port dtoj-ovn-control-plane
        mac: "0a:58:64:40:00:02"
        networks: ["100.64.0.2/29"]
    port dtoj-ovn-worker
        mac: "0a:58:64:40:01:02"
        networks: ["100.64.1.2/29"]
    port dtoj-ovn-worker2
        mac: "0a:58:64:40:02:02"
        networks: ["100.64.2.2/29"]
    port rtos-ovn-control-plane
        mac: "0a:58:0a:f4:01:01"
        networks: ["10.244.1.1/24"]
    port rtos-ovn-worker
        mac: "0a:58:0a:f4:02:01"
        networks: ["10.244.2.1/24"]
    port rtos-ovn-worker2
        mac: "0a:58:0a:f4:00:01"
        networks: ["10.244.0.1/24"]
    port rtos-node_local_switch
        mac: "0a:58:a9:fe:00:02"
        networks: ["169.254.0.2/20"]
        gateway chassis: [872e351f-5c82-45e4-a055-28952a3d9b3b]
    nat 53825cd4-fdf4-4748-9555-da50f4fcc5e1
        external ip: "169.254.13.190"
        logical ip: "10.244.1.2"
        type: "dnat_and_snat"
    nat 95f64668-f320-4e6d-9070-278a918d0664
        external ip: "169.254.10.48"
        logical ip: "10.244.0.2"
        type: "dnat_and_snat"
    nat cc52698d-08fc-4ebb-9e28-79171d8853a8
        external ip: "169.254.7.235"
        logical ip: "10.244.2.2"
        type: "dnat_and_snat"

OVN Packet Processing

● OVN has a complex methodology for processing packets
● It has two pipelines through which packets progress
  ○ Ingress
  ○ Egress
● A packet in OVN enters the ingress pipeline first and then goes to the egress pipeline
  ○ It gets compared against the logical flow rules defined in the ovn-sbdb Logical_Flow table for the ingress/egress pipeline of that respective datapath
  ○ The matching flow is found and its action gets executed

[root@ovn-control-plane ~]# ovn-sbctl lflow-list

Datapath: "ovn-worker2" (684a68ba-7d4c-4bac-9360-de8b47e52dd7) Pipeline: ingress
  table=0 (ls_in_port_sec_l2), priority=50, match=(inport == "default_client" && eth.src == {0a:58:0a:f4:00:05}), action=(next;)
  table=1 (ls_in_port_sec_ip), priority=90, match=(inport == "default_client" && eth.src == 0a:58:0a:f4:00:05 && ip4.src == {10.244.0.5}), action=(next;)
  table=19 (ls_in_l2_lkup), priority=50, match=(eth.dst == 0a:58:0a:f4:00:01), action=(outport = "stor-ovn-worker2"; output;)

OVN Packet Processing

[Diagram: OVN ingress and egress pipeline stages; image-only slides]

Ovn-Trace

● To trace a packet through OVN there is a utility called ovn-trace
  ○ It simulates packet forwarding scenarios within OVN logical topologies
● A typical ovn-trace command:

ovn-trace --ct new --ovs ovn-worker2 'inport=="default_client" &&
    eth.src==0a:58:0a:f4:00:04 && eth.dst==0a:58:0a:f4:00:01 &&
    tcp && tcp.src==80 && ip4.src==10.244.0.4 && ip4.dst==10.244.2.3 && ip.ttl==64'

● --ovs
  ○ Makes ovn-trace attempt to obtain and display the OpenFlow flows that correspond to each OVN logical flow, i.e. it connects to the ovn-sbdb
● inport
  ○ The port to start the packet trace from
● eth.src and eth.dst
  ○ The source MAC and the MAC of the next layer-two hop
  ○ Here the MACs relate to: default_client -> rtos-ovn-worker2
● ip4.src and ip4.dst
  ○ The IPs of the client and server pods, respectively
V0000000

Tracing a packet from Client to Server

ingress(dp="ovn-worker2", inport="default_client")
--------------------------------------------------
 0. ls_in_port_sec_l2 (ovn-northd.c:4755): inport == "default_client" && eth.src == {0a:58:0a:f4:00:04}, priority 50, uuid bb2379ca
 1. ls_in_port_sec_ip (ovn-northd.c:4411): inport == "default_client" && eth.src == 0a:58:0a:f4:00:04 && ip4.src == {10.244.0.4}, priority 90, uuid bfcab2f0
 3. ls_in_pre_acl (ovn-northd.c:4937): ip, priority 100, uuid 83aab749
 5. ls_in_pre_stateful (ovn-northd.c:5135): reg0[0] == 1, priority 100, uuid 4598be4b

ct_next(ct_state=new|trk)
-------------------------
 6. ls_in_acl (ovn-northd.c:5510): ip && (!ct.est || (ct.est && ct_label.blocked == 1)), priority 1, uuid 0cafa16c
10. ls_in_stateful (ovn-northd.c:5907): reg0[1] == 1, priority 100, uuid a731e7c0
19. ls_in_l2_lkup (ovn-northd.c:7185): eth.dst == 0a:58:0a:f4:00:01, priority 50, uuid debd14ec
    outport = "stor-ovn-worker2";
    output;

egress(dp="ovn-worker2", inport="default_client", outport="stor-ovn-worker2")
-----------------------------------------------------------------------------
 0. ls_out_pre_lb (ovn-northd.c:4875): ip && outport == "stor-ovn-worker2", priority 110, uuid d8a9ab69
 1. ls_out_pre_acl (ovn-northd.c:4875): ip && outport == "stor-ovn-worker2", priority 110, uuid a1a609fd
 3. ls_out_lb (ovn-northd.c:4875): ip && outport == "stor-ovn-worker2", priority 65535, uuid 38b97222
 4. ls_out_acl (ovn-northd.c:5513): ip && (!ct.est || (ct.est && ct_label.blocked == 1)), priority 1, uuid b8ff6b01
 7. ls_out_stateful (ovn-northd.c:5910): reg0[1] == 1, priority 100, uuid ce20c9c6
 9. ls_out_port_sec_l2 (ovn-northd.c:4821): outport == "stor-ovn-worker2", priority 50, uuid c4ab4f94
    output;
    /* output to "stor-ovn-worker2", type "patch" */

[Diagram alongside: the packet leaves the client pod (10.244.0.4) on the ovn-worker2 logical switch, passes conntrack, and exits via stor-ovn-worker2 toward ovn_cluster_router]

Tracing a packet from Client to Server

ingress(dp="ovn_cluster_router", inport="rtos-ovn-worker2")
-----------------------------------------------------------
 0. lr_in_admission (ovn-northd.c:8484): eth.dst == 0a:58:0a:f4:00:01 && inport == "rtos-ovn-worker2", priority 50, uuid cf793317
 1. lr_in_lookup_neighbor (ovn-northd.c:8560): 1, priority 0, uuid 170e4752
 2. lr_in_learn_neighbor (ovn-northd.c:8569): reg9[2] == 1, priority 100, uuid 069d6c13
    actions=resubmit(,11)
    next;
10. lr_in_ip_routing (ovn-northd.c:7971): ip4.dst == 10.244.2.0/24, priority 49, uuid c45ffb9d
11. lr_in_ip_routing_ecmp (ovn-northd.c:10097): reg8[0..15] == 0, priority 150, uuid 2b7dd6c3
12. lr_in_policy (ovn-northd.c:7445): ip4.src == 10.244.0.0/16 && ip4.dst == 10.244.0.0/16, priority 101, uuid f34c3bfa
13. lr_in_arp_resolve (ovn-northd.c:10363): outport == "rtos-ovn-worker" && reg0 == 10.244.2.3, priority 100, uuid 8900c94c
    actions=set_field:0a:58:0a:f4:02:03->eth_dst,resubmit(,22)
    eth.dst = 0a:58:0a:f4:02:03;
    next;
17. lr_in_arp_request (ovn-northd.c:10813): 1, priority 0, uuid 69473ac7
    output;

egress(dp="ovn_cluster_router", inport="rtos-ovn-worker2", outport="rtos-ovn-worker")
-------------------------------------------------------------------------------------
 3. lr_out_delivery (ovn-northd.c:10858): outport == "rtos-ovn-worker", priority 100, uuid d201ea55
    cookie=0xd201ea55, duration=40174.911s, table=43, n_packets=0, n_bytes=0, idle_age=40174, priority=100,reg15=0x5,metadata=0x1 actions=resubmit(,64)
    output;
    /* output to "rtos-ovn-worker", type "patch" */

[Diagram alongside: ovn_cluster_router routes the packet from rtos-ovn-worker2 to rtos-ovn-worker, resolving the server pod's MAC]

Tracing a packet from Client to Server

ingress(dp="ovn-worker", inport="stor-ovn-worker")
--------------------------------------------------
 0. ls_in_port_sec_l2 (ovn-northd.c:4755): inport == "stor-ovn-worker", priority 50, uuid d9311a4b
 3. ls_in_pre_acl (ovn-northd.c:4872): ip && inport == "stor-ovn-worker", priority 110, uuid aedc4e10
 4. ls_in_pre_lb (ovn-northd.c:4872): ip && inport == "stor-ovn-worker", priority 110, uuid db2e45d9
 6. ls_in_acl (ovn-northd.c:5510): ip && (!ct.est || (ct.est && ct_label.blocked == 1)), priority 1, uuid 87e5052a
 9. ls_in_lb (ovn-northd.c:4872): ip && inport == "stor-ovn-worker", priority 65535, uuid c935cb70
10. ls_in_stateful (ovn-northd.c:5907): reg0[1] == 1, priority 100, uuid 8939cf97
19. ls_in_l2_lkup (ovn-northd.c:7185): eth.dst == 0a:58:0a:f4:02:03, priority 50, uuid c0ca6902
    cookie=0xc0ca6902, duration=34895.902s, table=27, n_packets=0, n_bytes=0, idle_age=34895, priority=50,metadata=0x5,dl_dst=0a:58:0a:f4:02:03 actions=set_field:0x3->reg15,resubmit(,32)
    outport = "default_server";
    output;

egress(dp="ovn-worker", inport="stor-ovn-worker", outport="default_server")
---------------------------------------------------------------------------
 0. ls_out_pre_lb (ovn-northd.c:5120): ip, priority 100, uuid 038978db
 1. ls_out_pre_acl (ovn-northd.c:4939): ip, priority 100, uuid 6a480021
 2. ls_out_pre_stateful (ovn-northd.c:5137): reg0[0] == 1, priority 100, uuid a09062d0

ct_next(ct_state=est|trk /* default (use --ct to customize) */)
---------------------------------------------------------------
 8. ls_out_port_sec_ip (ovn-northd.c:4411): outport == "default_server" && eth.dst == 0a:58:0a:f4:02:03 && ip4.dst == {255.255.255.255, 224.0.0.0/4, 10.244.2.3}, priority 90, uuid 155b48f2
    *** no OpenFlow flows
    next;
 9. ls_out_port_sec_l2 (ovn-northd.c:4821): outport == "default_server" && eth.dst == {0a:58:0a:f4:02:03}, priority 50, uuid ee5892b0
    *** no OpenFlow flows
    output;
    /* output to "default_server", type "" */

[Diagram alongside: on the ovn-worker logical switch, the packet enters via stor-ovn-worker, passes conntrack, and is delivered to the server pod (10.244.2.3)]

Thank you

Main References:
● OpenShift and OVN Technical Dive by Dan Williams - link

● We would like to thank Tim (ovn-tracing) and Casey (pod-creation) for their knowledge base.

● If you have any questions, please don't hesitate to reach us on #forum-sdn (@sdn-team) on CoreOS Slack
