
Kubernetes Networking

The kube-proxy watches the Kubernetes API for service and endpoint changes and configures iptables rules on the node to route traffic. When a service IP or its endpoints change, kube-proxy updates the iptables rules so that traffic is load balanced and routed accordingly. This provides load balancing, service discovery, and service abstraction without a dedicated load balancer or reverse proxy.


Kubernetes Networking

Seattle Kubernetes Meetup

CJ Cullen
Software Engineer
@cj_cullen
github.com/cjcullen
Docker Networking
Docker networking

docker start ...

[Figure: the host has a docker0 bridge on 172.16.1.0/24.]

docker run ...

[Figure: each docker run adds a container with its own eth0 and an address on the bridge (172.16.1.1, then 172.16.1.2), attached to docker0 through a veth interface (vethAQ2IT, vethS1LUI).]

[Figure: several hosts, each with its own docker0, reuse the same container addresses (172.16.1.1, 172.16.1.2).]

[Figure: connections between containers on different hosts pass through NAT.]
Host ports

[Figure: containers A (172.16.1.1), B (172.16.1.2) and C (172.16.1.1) reach each other through host ports; the diagram shows ports 3306, 9376, 8000, 80 and 11878, with SNAT on the connections. A second build of the slide stamps the diagram "REJECTED".]
Kubernetes Networking
Kubernetes networking
IPs are routable
• vs docker default private IP
Pods can reach each other without NAT
• even across nodes
No brokering of port numbers
• too complex, why bother?
This is a fundamental requirement
• can be L3 routed
• can be underlayed (cloud)
• can be overlayed (SDN)
Kubernetes networking

[Figure: three nodes with pod subnets 10.1.1.0/24, 10.1.2.0/24 and 10.1.3.0/24, holding pods 10.1.1.1, 10.1.1.2, 10.1.2.1 and 10.1.3.1. Question posed: how does 10.1.1.2 reach 10.1.3.1?]
Kubernetes networking
On GCE/GKE
• GCE Advanced Routes (program the fabric)
• “Everything to 10.1.1.0/24, send to this VM”
Plenty of other ways
• AWS: Route Tables
• Weave
• Calico
• Flannel
• OVS
• OpenContrail
• Cisco Contiv
• Others...
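
The "everything to 10.1.1.0/24, send to this VM" idea can be sketched as a small route lookup. This is only an illustration, not how GCE or any of the listed tools implement it: the subnets come from the slides, while the node names and the `next_hop` helper are made up.

```python
import ipaddress

# Hypothetical route table: pod subnet -> node name. The subnets come from the
# slides; the node names are invented for illustration.
ROUTES = {
    "10.1.1.0/24": "node-a",
    "10.1.2.0/24": "node-b",
    "10.1.3.0/24": "node-c",
}


def next_hop(dst_ip, routes=ROUTES):
    """Return the node whose pod subnet contains dst_ip, or None if no route matches."""
    addr = ipaddress.ip_address(dst_ip)
    best = None
    for cidr, node in routes.items():
        net = ipaddress.ip_network(cidr)
        # Longest-prefix match: prefer the most specific subnet containing addr.
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, node)
    return best[1] if best else None
```

With this table, `next_hop("10.1.3.1")` picks the node that owns 10.1.3.0/24, which is the answer to the "how does 10.1.1.2 reach 10.1.3.1?" question: no NAT, just a route per pod subnet.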
Pods
Pods
Small group of containers & volumes
Tightly coupled
The atom of scheduling & placement
Shared namespace
• share IP address & localhost
• share IPC, etc.
Managed lifecycle
• bound to a node, restart in place
• can die, cannot be reborn with same ID
Example: data puller & web server

[Figure: a Pod with IP 10.1.1.2 holding a File Puller and a Web Server that share a Volume; a Content Manager feeds the puller and Consumers reach the web server.]

[Figure: containers c1 and c2 each run with --net=container:infra and --ipc=container:infra, so they join the namespaces of the pod's infra container, which holds 10.1.1.2.]
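
A rough sketch of how the shared namespace could be set up by hand with Docker, following the `--net=container:infra` and `--ipc=container:infra` flags shown on the slide. The helper name, the container naming scheme and the image names are assumptions for illustration; this is not the kubelet's actual code path.

```python
def pod_docker_commands(pod_name, containers, infra_image="infra-image"):
    """Build docker run argument lists for a pod's infra container and its members.

    containers is a list of (name, image) pairs. infra_image is a placeholder.
    """
    infra = f"{pod_name}-infra"
    # The infra container is started first and owns the network and IPC namespaces.
    cmds = [["docker", "run", "-d", "--name", infra, infra_image]]
    for name, image in containers:
        cmds.append([
            "docker", "run", "-d",
            "--name", f"{pod_name}-{name}",
            f"--net=container:{infra}",  # share the infra container's IP and localhost
            f"--ipc=container:{infra}",  # share the infra container's IPC namespace
            image,
        ])
    return cmds
```

Because every member joins the infra container's namespaces, the data puller and the web server see the same IP address and can talk over localhost.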


Services
Services
A group of pods that work together
• grouped by a selector
Defines access policy
• “load balanced” or “headless”
Gets a stable virtual IP and port
• sometimes called the service portal
• also a DNS name
VIP is managed by kube-proxy
• watches all services
• updates iptables when backends change
Hides complexity - ideal for non-native apps

[Figure: a Client reaches the backend pods through the Virtual IP.]
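
"Grouped by a selector" can be illustrated with a few lines of label matching. This is a minimal sketch that assumes equality-based selectors only; the function name, pod names and labels are hypothetical.

```python
def select_pods(selector, pods):
    """Return the names of pods whose labels contain every key/value pair in selector."""
    return [
        name
        for name, labels in pods.items()
        if all(labels.get(key) == value for key, value in selector.items())
    ]


# Hypothetical pods and labels, for illustration only.
PODS = {
    "web-1": {"app": "foo", "tier": "frontend"},
    "web-2": {"app": "foo", "tier": "frontend"},
    "db-1": {"app": "foo", "tier": "backend"},
}
```

The pods returned for a service's selector are its backends; when that set changes, the addresses behind the virtual IP change while the VIP itself stays stable.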


kube-proxy
kube-proxy (legacy)

[Figure sequence: Node X runs kube-proxy and iptables; kube-proxy talks to the apiserver.]
• kube-proxy watches the apiserver for services & endpoints
• kubectl run ... : the pods get scheduled
• kubectl expose ... : kube-proxy receives an update, "new service!"
• kube-proxy listens on a port for the service
• kube-proxy configures iptables, which now holds a rule for the VIP
• kube-proxy receives an update, "new endpoints!"
• a Client connects to the VIP; iptables passes the connection to kube-proxy, which forwards it to a backend
kube-proxy (legacy)
Userspace proxy isn’t ideal
Burns CPU copying bytes
• “Proxy” is just parallel copy loops.
Loses source IP
• Everything looks like it’s from the node IP.
Userspace TCP listening = higher latency
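
To make "parallel copy loops" concrete, here is a minimal userspace TCP proxy sketch. It is not kube-proxy's code; the function names are made up, and it only shows why every byte crosses userspace twice and why the backend sees the proxy's address rather than the client's.

```python
import socket
import threading


def copy_loop(src, dst):
    """Copy bytes from src to dst until src closes, then signal EOF to dst."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass


def handle(client, backend_addr):
    """Proxy one client connection to backend_addr with two parallel copy loops."""
    # The backend sees this new connection's source address, not the client's.
    backend = socket.create_connection(backend_addr)
    upstream = threading.Thread(target=copy_loop, args=(client, backend), daemon=True)
    upstream.start()
    copy_loop(backend, client)
    upstream.join()
    client.close()
    backend.close()


def serve(listener, backend_addr):
    """Accept connections on listener forever, proxying each one to backend_addr."""
    while True:
        client, _ = listener.accept()
        threading.Thread(target=handle, args=(client, backend_addr), daemon=True).start()
```

Each connection costs two threads and two copies per payload, which is the CPU and latency overhead the slide refers to.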


iptables kube-proxy
iptables kube-proxy

[Figure sequence: Node X runs kube-proxy and iptables; kube-proxy talks to the apiserver.]
• kube-proxy watches the apiserver for services & endpoints
• kubectl run ... : the pods get scheduled
• kubectl expose ... : kube-proxy receives an update, "new service!"
• kube-proxy configures iptables, which now holds a rule for the VIP
• kube-proxy receives an update, "new endpoints!", and configures iptables again
• a Client connects to the VIP; iptables sends the connection straight to a backend, and kube-proxy stays out of the data path
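
The "configure" step can be sketched as generating rules that spread a VIP across its endpoints. The `statistic` match and the `DNAT` target are real iptables features, but the chain names, the port 80 and the helper below are assumptions; the rules kube-proxy actually writes differ in naming and detail.

```python
def service_rules(vip, port, endpoints, chain="SVC-FOO"):
    """Return iptables-style rule strings that spread a VIP across endpoints.

    Chain names are hypothetical; endpoints are "ip:port" strings.
    """
    rules = [f"-A SERVICES -d {vip}/32 -p tcp --dport {port} -j {chain}"]
    n = len(endpoints)
    for i, endpoint in enumerate(endpoints):
        ep_chain = f"{chain}-EP{i}"
        remaining = n - i
        if remaining > 1:
            # Rule i fires with probability 1/remaining, so every endpoint ends
            # up with an overall 1/n share of new connections.
            prob = f" -m statistic --mode random --probability {1 / remaining:.5f}"
        else:
            prob = ""  # the last rule catches everything that fell through
        rules.append(f"-A {chain}{prob} -j {ep_chain}")
        rules.append(f"-A {ep_chain} -p tcp -j DNAT --to-destination {endpoint}")
    return rules
```

For example, `service_rules("10.0.123.45", 80, ["10.1.0.6:80", "10.1.3.1:80", "10.1.6.3:80"])` yields one entry rule plus two rules per endpoint. When the endpoints change, the whole list is regenerated, which matches the "new endpoints!" then "configure" steps above.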
iptables kube-proxy
Mean Latency
contrib/for-tests/netperf-tester --number=1000

[Figure: bar chart of mean latency in microseconds, iptables kube-proxy vs. legacy kube-proxy.]


Services
Services are just an abstraction
• Only requirement: route (and maybe load balance) a virtual IP to a set of backends.
Kube-proxy is an implementation
• Kube-proxy watches apiserver.
• iptables is re-configured on changes.
There could be other ways
• Userspace, iptables, IP Virtual Servers?
DNS
Run SkyDNS as a pod in the cluster
• kube2sky bridges Kubernetes API -> SkyDNS
• Tell kubelets about it (static service IP)
Strictly optional, but practically required
• LOTS of things depend on it
• Probably will become more integrated
Or plug in your own!

Example names:
kubernetes
kubernetes.default
kubernetes.default.svc.cluster.local
foo.my-namespace.svc.cluster.local

[Figure: the kube-dns-qxin pod runs skyDNS, kube2sky and etcd; kube2sky watches the apiserver. The pod is reached at the static service IP 10.0.0.10.]

/etc/resolv.conf
nameserver 10.0.0.10
...
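
Short names such as kubernetes.default resolve because the resolver appends search domains before asking the nameserver. The sketch below lists the names a resolver would try, in order. The slide elides the rest of /etc/resolv.conf, so the search list and the ndots value used here are assumptions based on common Kubernetes defaults, not something the slides state.

```python
def candidate_names(name, namespace, cluster_domain="cluster.local", ndots=5):
    """List the fully qualified names a resolver would try for name, in order.

    The search list and ndots default are assumed, not taken from the slides.
    """
    search = [
        f"{namespace}.svc.{cluster_domain}",
        f"svc.{cluster_domain}",
        cluster_domain,
    ]
    if name.endswith("."):
        return [name]  # already fully qualified, no search list applied
    expanded = [f"{name}.{domain}." for domain in search]
    absolute = f"{name}."
    # With fewer than ndots dots the search list is tried first; otherwise the
    # name is tried as-is first.
    if name.count(".") < ndots:
        return expanded + [absolute]
    return [absolute] + expanded
```

For a client in the default namespace, `candidate_names("foo.my-namespace", "default")` tries foo.my-namespace.default.svc.cluster.local first and foo.my-namespace.svc.cluster.local second; the second is the form shown in the example names above.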
Putting it Together
What happens when I...

$ curl foo.my-namespace

[Figure sequence: the Client pod is 10.1.0.1; the backends are 10.1.0.6, 10.1.3.1 and 10.1.6.3.]
• The client reads /etc/resolv.conf: nameserver 10.0.0.10
• It asks the DNS service at 10.0.0.10 (kube-dns-qxin: skyDNS, kube2sky, etcd) for foo.my-namespace
• DNS answers 10.0.123.45, the service VIP
• The client connects to 10.0.123.45
• iptables on the client's node rewrites the VIP to one of the backends, here 10.1.3.1
• The network routes 10.1.3.0/24 -> Node X, and the packet reaches 10.1.3.1
• The backend answers "Hello World!" to 10.1.0.1
• The network routes 10.1.0.0/24 -> Node Y, and the reply reaches the client
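
One step the diagrams leave implicit: the client connected to 10.0.123.45, yet the reply comes from 10.1.3.1. The node remembers the rewrite and undoes it on the way back. The toy tracker below illustrates that idea only; it ignores ports, picks backends round-robin instead of randomly, and its class and method names are invented.

```python
class ConnTrack:
    """Toy connection tracker: rewrite a VIP on the way out, restore it on the way back."""

    def __init__(self, vip, backends):
        self.vip = vip
        self.backends = backends
        self.flows = {}  # (client, backend) -> vip
        self._next = 0

    def outbound(self, src, dst):
        """Return the (src, dst) pair after DNAT; non-VIP traffic passes through."""
        if dst != self.vip:
            return src, dst
        backend = self.backends[self._next % len(self.backends)]
        self._next += 1
        self.flows[(src, backend)] = self.vip
        return src, backend

    def inbound(self, src, dst):
        """Return the (src, dst) pair of a reply after undoing the DNAT."""
        vip = self.flows.get((dst, src))
        return (vip, dst) if vip else (src, dst)
```

With `ConnTrack("10.0.123.45", ["10.1.3.1", "10.1.0.6", "10.1.6.3"])`, an outbound packet from 10.1.0.1 to the VIP leaves addressed to 10.1.3.1, and the reply from 10.1.3.1 is presented to the client as coming from 10.0.123.45.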
What about external?
External Services
Service IPs are only available inside the cluster
Need to receive traffic from “the outside world”
Builtin: Service “type”
• nodePort: expose on a port on every node
• loadBalancer: provision a cloud load-balancer
DIY load-balancer solutions
• socat (for nodePort remapping)
• haproxy
• nginx
The Bleeding Edge
Ingress (L7)
Services are assumed L3/L4
Lots of apps want HTTP/HTTPS
Ingress maps incoming traffic to backend services
• by HTTP host headers
• by HTTP URL paths
HAProxy and GCE implementations
No SSL yet
Status: BETA in Kubernetes v1.1

[Figure: a Client sends requests through a URL Map, which routes api.company.com/foo, api.company.com/bar and othercompany.com/* to different backend services.]
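
The URL map can be sketched as a lookup on host header and path. The hosts and paths come from the slide; the service names, the `route` helper and the plain prefix matching are assumptions and do not reflect how the HAProxy or GCE implementations match requests.

```python
def route(host, path, url_map):
    """Return the backend service for a request, or None if nothing matches."""
    best = None
    for rule_host, rule_path, service in url_map:
        if rule_host != host:
            continue
        # A trailing "*" is treated as "any path under this prefix".
        prefix = rule_path[:-1] if rule_path.endswith("*") else rule_path
        if path.startswith(prefix) and (best is None or len(prefix) > best[0]):
            best = (len(prefix), service)  # keep the longest matching prefix
    return best[1] if best else None


# Hosts and paths from the slide; service names are hypothetical.
URL_MAP = [
    ("api.company.com", "/foo", "foo-service"),
    ("api.company.com", "/bar", "bar-service"),
    ("othercompany.com", "/*", "other-service"),
]
```

A request for api.company.com/foo/1 lands on the /foo backend, while anything under othercompany.com goes to the catch-all. Plain Services cannot make this distinction because they only see L3/L4.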


Network Plugins
Network Plugins
Introduced in Kubernetes v1.0
• VERY experimental
Uses CNI (CoreOS) in v1.1
• Simple exec interface
• Not using Docker libnetwork
• but can defer to Docker for networking
Cluster admins can customize their installs
• DHCP, MACVLAN, Flannel, custom

[Figure: a net interface fanning out to several Plugins.]
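
"Simple exec interface" means the runtime executes a plugin binary, passes parameters in environment variables such as CNI_COMMAND, CNI_CONTAINERID, CNI_NETNS and CNI_IFNAME, and sends the network configuration as JSON on stdin. The sketch below shows only that calling convention. The returned dictionary is a simplified stand-in and not the CNI result schema, and the `address` config key is hypothetical.

```python
import json


def handle_cni(env, stdin_text):
    """Handle one plugin invocation; return the dict a plugin might print as JSON."""
    command = env["CNI_COMMAND"]
    conf = json.loads(stdin_text)
    if command == "ADD":
        # A real plugin would create CNI_IFNAME inside the namespace at
        # CNI_NETNS here. This sketch only echoes a made-up address.
        return {
            "cniVersion": conf.get("cniVersion", ""),
            "container": env["CNI_CONTAINERID"],
            "interface": env["CNI_IFNAME"],
            "address": conf.get("address", "10.1.1.2/24"),
        }
    if command == "DEL":
        return {}  # nothing to report when tearing down
    raise ValueError(f"unsupported CNI_COMMAND: {command}")
```

A wrapper script would call `handle_cni(os.environ, sys.stdin.read())` and print the result with `json.dumps`, which is all an exec-style plugin needs in order to be swapped for DHCP, MACVLAN, Flannel or a custom implementation.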
Kubernetes is Open
- open community
- open design
- open source
- open to ideas
Networking is Hard
- help guide us!
http://kubernetes.io
https://github.com/kubernetes/kubernetes
slack: kubernetes twitter: @kubernetesio
