3 Virtualisation
Part 2
NFV
VM & Docker
Nicolas ROHART
[email protected]
REFERENCES
• Software Defined Networks: A Comprehensive Approach, Paul Göransson
• SDN: Software Defined Networks, Thomas D. Nadeau and Ken Gray, O'Reilly
• Foundations of Modern Networking: SDN, NFV, QoE, IoT, and Cloud, William Stallings, Addison-Wesley Professional, 2015
• Network Functions Virtualization with a Touch of SDN, Rajendra Chayapathi, Addison-Wesley, 2017
OUTLINE
• Introduction to NFV
• Virtualisation concepts
• Acceleration
• Cloud
• Containers
• Docker
• K8S
INTRODUCTION TO NFV
NETWORK FUNCTIONS
Functional block:
- External interface
- Defined behaviour
NETWORK FUNCTIONS
An NF failure can prevent a huge number of users from accessing any service
≠
the failure of a web server hosting a single service
NETWORK FUNCTIONS
Optimised for one network OS and one hardware platform:
• Application
• OS: proprietary or customised (moving towards standard)
• Hardware: mix of GPPs and ASICs
CARRIER GRADE
Availability of 99.999%, known as the "five nines" standard.
Recovery: less than 50 milliseconds.
HISTORY
(timeline: server computer → personal computer → high-performance mainframe → server farm → virtualised servers)
MULTI-STEP TRANSFORMATION
EVOLUTION OF NETWORK INFRASTRUCTURES
CREATING A FRAMEWORK
Reasons:
• New service = new equipment
• Complicated installation
• Short hardware lifespan
• Shortened return on investment
Objectives:
• Capex & Opex reduction
• Shorter time-to-market
• Flexibility
• Service deployment with low risk
• The NFV PFS must be carrier-grade (carrier class)
ETSI
ETSI
NFVI LAYER
VNF LAYER
OSS LAYER
REFERENCE POINTS
REFERENCE POINTS
ARCHITECTURE CONCEPTS
MANAGEMENT & AUTOMATION
NFV management and orchestration procedures are driven by deployment templates (a minimal sketch follows):
• Resource requirements
• Deployment constraints
• Lifecycle management policies and scripts
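A minimal sketch of such a deployment template as plain data (field names are illustrative, not the exact ETSI VNFD schema):

```python
# Illustrative deployment template covering the three elements above.
template = {
    "resource_requirements": {"vcpus": 4, "memory_gb": 8, "disk_gb": 40},
    "deployment_constraints": {"numa_aware": True, "anti_affinity_group": "vnf-a"},
    "lifecycle_management": {
        "on_instantiate": "scripts/configure.sh",
        "on_scale_out": "scripts/add_worker.sh",
        "on_heal": "scripts/restart_failed_vnfc.sh",
    },
}
```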
VM
(figure: virtual machines running on a host operating system)
NFV INFRASTRUCTURE (NFVI) COMPONENTS
• VM or container
• vSwitch or vRouter
• Hypervisor or Docker Engine
• COTS servers based on GPPs
VNFC
Virtual Machine
VNF TO VNF
MULTI-SITE DISTRIBUTED NFVI
VNFC
NFV-MANO
FLOW DIAGRAM FOR NS INSTANTIATION
Participants: OSS, NFVO, VNFM, VIM, NFVI
1. OSS → NFVO: Instantiate NS Request (NSD identifier)
2. NFVO reads the NSD and processes the NS virtual link descriptions
3. NFVO → VIM: Allocate Network Request (repeated for each virtual link to be created)
4. VIM → NFVI: Create Network Request (repeated for each virtual network); NFVI acks
5. VIM → NFVO: Allocate Network Response
6. NFVO authorizes the request and selects the VIM
7. NFVO → VNFM: Grant VNF lifecycle operation response
FLOW DIAGRAM FOR VNF INSTANTIATION (2/3)
Participants: OSS, NFVO, VNFM, VIM, NFVI
FLOW DIAGRAM FOR VNF INSTANTIATION (3/3)
Participants: OSS, NFVO, VNFM, VIM, NFVI
1. VNFM → VIM: Allocate Compute Request
2. VIM selects a host and creates the VM (repeated for each VNFC instance); NFVI acks
3. VIM → VNFM: Allocate Compute Response
4. VNFM configures the VNFC instances
5. VNFM → NFVO: Ack VNF instantiation
6. NFVO → OSS: Ack VNF instantiation
OPEN SOURCE LANDSCAPE
EXAMPLE
A DATA-DRIVEN SYSTEM
• Deployment templates:
  • Resource requirements
  • Deployment constraints
  • Lifecycle management policies and scripts
• High automation:
  • Time to deploy
  • Time to repair
  • Lower risk of misconfiguration
VNFD
VNFD
VNFD
VNFD
REPOSITORIES
• The VNF Catalog is a repository of all usable VNFDs.
• The Network Services (NS) Catalog stores deployment templates for network services, each described in terms of VNFs and of the virtual links connecting them, for future use.
• The NFV Instances list holds all details about network service instances and related VNF instances.
• NFVI Resources is a repository of the NFVI resources used to establish NFV services.
A data-structure sketch follows.
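A toy sketch of these four repositories as in-memory stores (illustrative only; real MANO stacks persist them in databases):

```python
# Illustrative only: the four NFV-MANO repositories as simple mappings.
vnf_catalog = {}      # VNFD id -> VNF descriptor (usable VNFDs)
ns_catalog = {}       # NSD id -> NS deployment template (VNFs + virtual links)
nfv_instances = {}    # instance id -> NS/VNF instance records
nfvi_resources = {}   # resource id -> compute/storage/network inventory entry
```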
REPOSITORIES
CONGESTION CONTROL
LOAD MANAGEMENT
Load management affects both the performance and the availability of network services.
Two main tools, applicable at both the VNF and NS level (a scaling sketch follows):
- Load balancing
- Dynamic scaling
(figure: a peer NF reaches VNFC instances through a load-balancing VNFC, both within a single VNF instance and across VNF instances #1..#n)
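A hedged sketch of the dynamic-scaling side: a simple threshold rule deciding how many VNFC instances to run (thresholds and names are illustrative):

```python
# Illustrative scale-out/scale-in decision for a VNF's pool of VNFC instances.
def desired_instances(current, avg_load, scale_out_at=0.8, scale_in_at=0.3,
                      min_n=1, max_n=10):
    if avg_load > scale_out_at:
        return min(current + 1, max_n)   # add a VNFC instance behind the LB
    if avg_load < scale_in_at:
        return max(current - 1, min_n)   # release an instance when idle
    return current

print(desired_instances(current=2, avg_load=0.9))  # -> 3
```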
VIRTUALISATION: BENEFITS AND GOALS
• Availability
• Fewer physical servers
• Performance
• Security
• Capex
• Backups
• Disaster recovery (PRA)
• Reduced obsolescence
NETWORK FUNCTION LIFECYCLE
DESIGN
NFV network design considerations:
• Design of network functions: consumption, programmability
• Infrastructure design: host OS, power & space usage, scalability, HW influence
• Management & orchestration: elasticity & scalability, time & location, licensing considerations
DESIGN
UPGRADE
• Adding resources
• Graceful
• Hypervisor / Host OS?
SEAMLESS VNF SOFTWARE MODIFICATION
• Gradual
• Reversible
(figure: Flow 1 and Flow 2 moved between software versions)
MINIMIZE IMPACT OF NFVI SOFTWARE
MODIFICATIONS ON VNFS
VNF LIFECYCLE
Onboarding: integration of the supplier solution into automated processes which perform package validation against requirements as provided by Orange. The package contains or references (e.g. through models or build scripts) all artifacts needed to bring up the supplier solution in the IaaS environment.
Instantiation: installation of the supplier solution on the Orange IaaS so that the solution becomes ready to run and provide the expected functions for the provision of services.
Run: activation of the supplier solution (including start, stop, suspend, resume actions).
Healing: capacity of the supplier solution to dynamically adapt its running environment (as provided by the IaaS) in order to tackle issues crippling performance.
Terminating: deconstruction of the supplier solution.
(figure: VNF LCM cycle: Onboarding → Instantiation → Run → Healing → Terminating)
CPE / TRIPLE PLAY WITH NFV
VIRTUALISATION CONCEPTS
NETWORK OVERLAY
NETWORK OVERLAY
VIRTUALISATION TECHNIQUES
PRIVILEGE RINGS
FULL VIRTUALISATION
Hardware Resources
PARA-VIRTUALISATION
Hardware Resources
HARDWARE ASSISTED VIRTUALISATION
Hardware Resources
COMPARISON
Parameter   | Full virtualization    | Para-virtualization     | Hardware-assisted virtualization
Generation  | 1st                    | 2nd                     | 3rd
Performance | Good                   | Better in certain cases | Fair
Used by     | VMware, Microsoft, KVM | VMware, Xen             | VMware, Xen, Microsoft, Parallels
VIRTUALISATION VS EMULATION
(figure: virtualised guests, each with applications and device drivers on a guest operating system, compared with an emulator providing device drivers to applications)
HYPERVISOR TYPES
HYPERVISOR EXAMPLES
• Type-1: VMware ESXi, Microsoft Hyper-V, Xen
• Type-2: QEMU, VMware Workstation, Oracle VirtualBox
RESOURCE ALLOCATION
• CPU and memory
• I/O
• Disk space: thick or thin provisioning (see the sketch below)
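A small sketch contrasting thin and thick disk provisioning with qemu-img (assumes qemu-img is installed; file names and sizes are illustrative):

```python
import subprocess

# Thin provisioning: the qcow2 file starts small and grows on demand.
subprocess.run(["qemu-img", "create", "-f", "qcow2", "thin.qcow2", "20G"],
               check=True)

# Thick provisioning: all space is reserved up front.
subprocess.run(["qemu-img", "create", "-f", "qcow2",
                "-o", "preallocation=full", "thick.qcow2", "20G"],
               check=True)
```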
API
WEB APPLICATION ARCHITECTURE
WEB APPLICATION ARCHITECTURE
• Multi-Page Application
Multi-page applications are still widespread on the web; organizations typically choose them for very large sites. They reload a page whenever data is loaded from or sent to the server through the user's browser.
• Microservices Architecture
A microservices architecture splits the application into small services, each focused on a particular function, enabling quicker rollouts and greater efficiency. Services can be developed independently and remain flexible. A minimal sketch follows.
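A minimal microservice sketch using Flask (illustrative; any HTTP framework would do):

```python
# One small, single-purpose service exposing its own HTTP API.
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/health")
def health():
    # Each microservice exposes a small, well-defined contract.
    return jsonify(status="ok")

if __name__ == "__main__":
    app.run(port=5000)
```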
WEB APPLICATION INFRASTRUCTURE
SERVERLESS
SERVERLESS ARCHITECTURE
(figure: serverless architecture backed by an external database)
SERVERLESS
• 1. Security Token Service
Serverless users use the API offered by third-party providers to log into the system and use its services. The infrastructure should produce a security token for users before they use the API.
• 2. Web Server
The web server serving client apps should be powerful enough to handle all the static CSS, HTML, and JavaScript required.
• 3. FaaS Solution
A FaaS (Function as a Service) solution is the essential component of a serverless infrastructure. It lets developers create, run, deploy, and maintain apps without managing a server infrastructure. A developer can use almost any tool, OS, or framework through FaaS with only a few clicks. A handler sketch follows this list.
• 4. User Verification
In a standard serverless environment, clients usually register for the service; serverless computing then ensures every end user can register and log into the application conveniently.
• 5. Client Software
The client interface should work on the client side regardless of the state of your server infrastructure.
• 6. Database
Whether or not an app is built and managed on a serverless infrastructure, its data must be stored in a database. A solid database is therefore an essential part of this cloud-based infrastructure.
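A minimal FaaS sketch in the AWS Lambda handler style (the entry-point convention shown is Lambda's; other FaaS platforms differ in details):

```python
import json

def handler(event, context):
    # The platform invokes this function on demand; there is no server to manage.
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```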
SERVERLESS vs TRADITIONAL
ACCELERATION
VNF LATENCY AND PERFORMANCE
(figure: purpose-built network device vs NFV-based network node. In the purpose-built device, packets arriving on a NIC are switched in a dedicated hardware ASIC, with only exceptions handled by the CPU. In the NFV host, packets are handled by the physical NIC, switched in software (soft switch), handled by a virtual NIC (vNIC) through the virtualization layer, and finally processed by the CPUs running VNF 1 and VNF 2.)
THE LIFE OF A PACKET
PERFORMANCE BOTTLENECKS IN NFVI
Per-packet processing time budget:
Line rate  | 64B packets | 1518B packets
10 Gbit/s  | 67.2 ns     | 1230.4 ns
40 Gbit/s  | 16.8 ns     | 307.6 ns
100 Gbit/s | 6.7 ns      | 123.0 ns
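These budgets follow directly from the frame size plus 20 B of Ethernet overhead (8 B preamble + 12 B inter-frame gap); a quick check:

```python
# Per-packet time budget at line rate, in nanoseconds:
# bits on the wire / (Gbit/s) conveniently yields ns.
def budget_ns(packet_bytes, rate_gbps):
    return (packet_bytes + 20) * 8 / rate_gbps

print(budget_ns(64, 10))     # 67.2 ns
print(budget_ns(1518, 10))   # 1230.4 ns
print(budget_ns(64, 100))    # 6.72 ns
```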
NETWORKING SUBSYSTEM: RX PACKET LIFE
(figure: receive path from hardware to userspace)
• Hardware: the NIC receives the packet, DMAs it into memory and raises an IRQ
• Interrupt context: the ISR schedules a SoftIRQ
• Kernel context: the SoftIRQ copies the packet into an sk_buff and runs it through the network stack (L2/L4 processing)
• Userspace: the application finally receives the data
PERFORMANCE BOTTLENECKS IN NFVI
• Number of context switches
• By default, two context switches per received packet: one for the interrupt and one for the system call
SOFTWARE & HARDWARE ACCELERATION
Pure software acceleration techniques:
• CPU isolation and pinning
• Polling for packets
CPU ISOLATION AND PINNING
• Problems
  • A VNF is assigned a number of vCPUs
  • vCPUs are multiplexed on physical CPUs (pCPUs), alongside other processes
  • Possible interference between them
• Solutions (see the sketch below)
  • CPU isolation: remove some CPU cores from Linux scheduler control
  • CPU pinning: establish a fixed mapping between a vCPU and a physical core so that the vCPU always runs on the same physical core
(figure: VM1 and VM2 vCPUs mapped onto pCPUs shared with other processes)
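A minimal pinning sketch using the standard library (Linux only; core ids are illustrative, and isolation itself is configured at boot, e.g. with isolcpus):

```python
import os

# Pin the calling process (e.g. a vCPU thread) to physical cores 2 and 3.
os.sched_setaffinity(0, {2, 3})    # 0 = this process
print(os.sched_getaffinity(0))     # -> {2, 3}
```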
POLLING FOR PACKETS
• Interrupt mode (push model):
  • The NIC driver interrupts the CPU to signal the availability of a new packet
  • High overhead: context switch and cache pollution
• Polling mode (pull model):
  • The CPU periodically checks for pending packets
  • Busy-loop polling: 100% CPU consumption whatever the traffic load
• Hybrid modes (a sketch follows):
  • Mix of interrupt and polling
  • Polling with sleep or CPU frequency scaling depending on the traffic load
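A hedged sketch of a hybrid receive loop (illustrative only; `nic.try_receive` is an assumed non-blocking interface, and real implementations such as NAPI live in the kernel):

```python
import time

def rx_loop(nic, handle, idle_sleep=0.0001, budget=64):
    """Busy-poll up to `budget` packets, then back off briefly when idle
    instead of burning 100% CPU."""
    while True:
        processed = 0
        while processed < budget:
            pkt = nic.try_receive()   # assumed non-blocking, None when empty
            if pkt is None:
                break
            handle(pkt)
            processed += 1
        if processed == 0:
            time.sleep(idle_sleep)    # a real driver would re-enable IRQs here
```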
POLLING FOR PACKETS
NAPI (New API) alternates between interrupt and polling mode.
In polling mode, interrupts are disabled and the network stack polls the device at regular intervals.
Throughput is optimised at the expense of latency.
(figure: without NAPI, one IRQ per packet; with NAPI, a single IRQ followed by polling)
MEMORY MANAGEMENT TECHNIQUES
• Issues:
  • Large performance gap between CPU and memory: the CPU often stalls on memory accesses
  • The cache hierarchy (L1, L2, L3) helps mitigate this gap
  • Allocating/freeing packet buffers incurs a significant overhead
  • Non-uniform memory access (NUMA)
  • Virtual memory management overhead
• Solutions (a pool sketch follows):
  • Zero-copy packet send/receive path: avoid costly memory copies by mapping DMA regions into user-space memory
  • Use pre-allocated memory pools, allocated at VNF startup and never freed
  • Use cache-friendly data structures (e.g. so that a packet header fits into a cache line)
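A sketch of a pre-allocated buffer pool (illustrative; DPDK's mempools follow the same idea in C):

```python
from collections import deque

BUF_SIZE, POOL_SIZE = 2048, 4096
# Allocate all packet buffers once at startup.
pool = deque(bytearray(BUF_SIZE) for _ in range(POOL_SIZE))

def get_buffer():
    return pool.popleft()    # reuse; never allocate on the hot path

def put_buffer(buf):
    pool.append(buf)         # return to the pool instead of freeing
```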
NUMA AWARE NF PLACEMENT
• NUMA is a computer hardware design choice for multiprocessor servers
• Access to some regions of memory will take longer than others
• Memory access time depends on the memory location relative to a processor
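A sketch of a NUMA-aware launch using numactl (assumes numactl is installed; `./my_vnf` is a hypothetical NF binary):

```python
import subprocess

# Bind the NF's CPUs and memory to NUMA node 0, the node hosting the NIC here.
subprocess.run(["numactl", "--cpunodebind=0", "--membind=0", "./my_vnf"],
               check=True)
```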
NUMA AWARE NF PLACEMENT
(figure: two NUMA nodes; a VNF co-located on the same NUMA node as the NIC accesses local memory, while cross-NUMA traffic adds a 10-20 % overhead)
NUMA AWARE NF PLACEMENT
SR-IOV
Two hardware-based virtualization solutions bypass the hypervisor and virtual switch layer:
• PCI-passthrough: a VM has direct and exclusive access to a physical NIC
• SR-IOV (Single Root I/O Virtualization): a PCIe Ethernet device is split into multiple Virtual Functions (VFs) that can be directly assigned to a VM and appear as standard network cards; hardware switching is based on MACs/VLANs (a VF-creation sketch follows)

PCI-passthrough:
  Pros: near-native performance
  Cons: scalability (one physical NIC port dedicated to a VNF); security issues
SR-IOV:
  Pros: high performance (close to native non-virtual device performance) for north-south traffic; consumes much less CPU than a (DPDK-accelerated) software vSwitch
  Cons: scalability issues, especially for container-based NFs; poor manageability (VM live migration, OpenStack security groups and floating IPs unsupported); hardware-specific driver in the VM; performance issues for east-west traffic (chained VNFs) due to PCIe interface throughput limitations
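A sketch of VF creation through the standard Linux sysfs interface (assumes an SR-IOV capable NIC named eth0 and root privileges):

```python
NIC = "eth0"

# Ask the driver to instantiate 4 Virtual Functions on the physical NIC;
# each VF can then be assigned directly to a VM.
with open(f"/sys/class/net/{NIC}/device/sriov_numvfs", "w") as f:
    f.write("4")
```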
RECEIVE-SIDE SCALING (RSS)
(figure: without RSS, a single NIC queue feeds one CPU, which dispatches packets in software to the application worker threads; with RSS, the NIC hashes incoming packets across multiple receive queues, each served by its own CPU and worker thread)
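A sketch of enabling multiple queues with ethtool (assumes a multi-queue NIC named eth0 and root privileges):

```python
import subprocess

# Request 4 combined RX/TX queues so RSS can spread flows over 4 CPUs.
subprocess.run(["ethtool", "-L", "eth0", "combined", "4"], check=True)
# Inspect the resulting queue configuration.
subprocess.run(["ethtool", "-l", "eth0"], check=True)
```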
DATA PLANE DEVELOPMENT KIT
DPDK
• DPDK within the NFVI
  • DPDK-accelerated vSwitch (or vRouter)
    - OVS-DPDK
    - Contrail vRouter-DPDK
• DPDK-enabled VNFs
  • Use DPDK PMDs (poll mode drivers)
• Requirements
  • CPU pinning, NUMA awareness
  • DPDK-compliant NIC

Pros: performance (throughput & latency); can be used both on the host and within VMs (NFVI + VNFs).
Cons: applications must be recoded to use the DPDK API; dedicated CPUs busy-poll (100% load); DPDK-bound devices cannot be used by other applications; no access to the Linux network stack and services.
(a testpmd sketch follows)
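A quick sanity-check sketch launching DPDK's testpmd on two pinned cores (assumes DPDK is installed and hugepages are configured; run as root):

```python
import subprocess

subprocess.run(["dpdk-testpmd",
                "-l", "0-1",            # EAL: run on cores 0-1 (CPU pinning)
                "-n", "4",              # EAL: number of memory channels
                "--",                   # separator: EAL vs application options
                "--forward-mode=io"],   # simple RX->TX forwarding
               check=True)
```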
DPDK & SR-IOV
SERVICE MESH
SERVICE MESH
• Resilient Connectivity: Service to service communication must be possible
across boundaries such as clouds, clusters, and premises. Communication
must be resilient and fault tolerant.
• L7 Traffic Management: Load balancing, rate limiting, and resiliency must be
L7-aware (HTTP, REST, gRPC, WebSocket, …).
• Identity-based Security: Relying on network identifiers to achieve security is
no longer sufficient, both the sending and receiving services must be able to
authenticate each other based on identities instead of a network identifier.
• Observability & Tracing: Observability in the form of tracing and metrics is
critical to understanding, monitoring, and troubleshooting application
stability, performance, and availability.
• Transparency: The functionality must be available to applications in a transparent manner, i.e. without requiring changes to application code.
SIDECAR COST
SERVICE MESH
SERVICE MESH / eBPF
eBPF ACCELERATION
EXTENDED BERKELEY PACKET FILTER
eBPF XDP
• The XDP (eXpress Data Path) hook intercepts packets right off the NIC device driver, before they start moving up into the kernel network stack (a BCC sketch follows)
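A minimal XDP sketch using BCC, counting packets at the driver hook and passing them on (assumes bcc is installed, an interface named eth0, and root privileges):

```python
import ctypes as ct
import time
from bcc import BPF

prog = r"""
BPF_ARRAY(pkt_count, u64, 1);

int xdp_counter(struct xdp_md *ctx) {
    int key = 0;
    u64 *value = pkt_count.lookup(&key);
    if (value)
        __sync_fetch_and_add(value, 1);
    return XDP_PASS;  // let the packet continue into the kernel stack
}
"""

b = BPF(text=prog)
b.attach_xdp("eth0", b.load_func("xdp_counter", BPF.XDP))
try:
    time.sleep(10)
    print("packets seen:", b["pkt_count"][ct.c_int(0)].value)
finally:
    b.remove_xdp("eth0")
```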
CLOUD BASICS
CHARACTERISTICS
• On-demand deployment
• Accessible
• Scalability and elasticity
• Resource pooling
• Resource monitoring
IaaS/PaaS/SaaS
IaaS/PaaS/SaaS
CONTAINERS
A BIT OF HISTORY
• 1999 : Jails (https://www.freebsd.org/doc/handbook/jails.html)
• 2001 : Linux-VServer (http://www.linux-vserver.org)
• 2006 : cgroups (https://www.kernel.org/doc/Documentation/cgroup-
v1/cgroups.txt)
• 2008 : Namespaces (https://lwn.net/Articles/528078/)
• 2008 : Linux Container Project (https://linuxcontainers.org/lxc/)
• 2013 : Docker (https://docker.io/)
• 2015 : OCI (https://www.opencontainers.org/)
• 2017 : Projet Moby (https://mobyproject.org/)
A TURNING POINT
chroot
cgroup
namespace
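A sketch driving two of these primitives from Python (Linux only, run as root; the rootfs path is hypothetical, and cgroup limits would be set via /sys/fs/cgroup, not shown):

```python
import ctypes, os, socket

CLONE_NEWUTS = 0x04000000                  # new UTS (hostname) namespace
libc = ctypes.CDLL("libc.so.6", use_errno=True)

# namespace: give this process its own hostname, invisible to the host
if libc.unshare(CLONE_NEWUTS) != 0:
    err = ctypes.get_errno()
    raise OSError(err, os.strerror(err))
socket.sethostname("in-container")

# chroot: restrict the filesystem view to a prepared root filesystem
os.chroot("/srv/rootfs")                   # hypothetical rootfs path
os.chdir("/")
```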
CHROOT
CGROUP
NAMESPACE
A STORY OF IMAGES
CONTAINER
• Why? A VM embeds a complete OS: constraints on size and flexibility
• Solution: a shared OS and shared storage space
• Advantages: lightweight (~10-100 MB), faster startup, smaller footprint
CONTAINER
• Limitations
  • Best suited to standardised applications
  • Runs mainly on Linux
  • More complex to set up than a hypervisor
  • Requires creating/using container templates
CONTAINER VS VM
VIRTUALIZATION TECHNIQUE
(figure: hypervisor/VM stack compared with container stack)
STRENGTHS
• Ultra-lightweight (no duplication of the whole classic system stack); one container = one process -> microservices
• Very strong isolation
• Very advanced automation, built into the system (tools like Puppet or Ansible are no longer required, though still usable)
• Advanced, native pooling of resources (AUFS, memory cache)
WEAKNESSES
• Time needed to master the technology
• Network and storage management not always suited to existing applications or infrastructures
• Not every application is meant to be dockerised
• The host kernel is shared
• Fine-grained kernel tuning (ulimit/sysctl) is still incomplete
VM vs CONTAINER
DOCKER
DOCKER
DOCKER EDITION
DOCKER INTERACTION
DEFINITIONS
• Image: a set of dependencies (binaries, configuration, code) enriched with metadata (name, labels, creation date).
• Container: an isolated environment in which one or more processes run.
• Volume: persistent storage space.
• Registry: a repository of Docker images.
• Docker client: command-line tool used to interact with the API of a Docker daemon.
• Docker server: the Docker daemon. This daemon listens for API requests and manages Docker objects (images, containers, networks, and volumes). A sketch tying these together follows.
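A sketch connecting these definitions with the Docker SDK for Python (pip install docker; assumes a local Docker daemon is running):

```python
import docker

client = docker.from_env()              # Docker client -> daemon REST API
client.images.pull("alpine:latest")     # image fetched from a registry
output = client.containers.run(         # container: isolated process
    "alpine:latest", ["echo", "hello"],
    volumes={"/tmp/data": {"bind": "/data", "mode": "rw"}},  # bind volume
    remove=True,
)
print(output.decode())                  # -> "hello"
```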
NETWORK
DNS
BRIDGE
VOLUME
VOLUME
DOCKERFILE
DOCKERFILE
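A hedged sketch of a programmatic `docker build` from an in-memory Dockerfile, using the Docker SDK (Dockerfile contents are illustrative):

```python
import io
import docker

# A self-contained Dockerfile (no build context needed).
dockerfile = b"""
FROM alpine:3.20
RUN echo hello > /greeting
CMD ["cat", "/greeting"]
"""

client = docker.from_env()
image, logs = client.images.build(fileobj=io.BytesIO(dockerfile),
                                  tag="demo:latest")
print(image.tags)   # -> ['demo:latest']
```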
ORCHESTRATION
• Compose
• Swarm
• Kubernetes
• OpenShift
DOCKER COMPOSE
DOCKER COMPOSE
DOCKER COMPOSE
KUBERNETES
HIGH LEVEL DESIGN
K8S FEATURES
MASTER COMPONENTS
• API Server: exposes the Kubernetes API and is the front end of the Kubernetes control plane.
• etcd: all cluster data lives in etcd, a simple, distributed, consistent key-value store, used mainly for shared configuration and service discovery.
• Controller-manager: runs the controllers that handle routine tasks in the cluster, e.g. the replication controller, the endpoints controller, etc.
• Scheduler: holds the information about resources available on cluster members; it also watches newly created pods that have no node assigned and selects a node for them to run on. A client sketch follows.
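A sketch querying the control plane with the official Kubernetes Python client (pip install kubernetes; assumes a reachable cluster and a local kubeconfig):

```python
from kubernetes import client, config

config.load_kube_config()            # talks to the API Server
v1 = client.CoreV1Api()

for pod in v1.list_pod_for_all_namespaces().items:
    # spec.node_name was filled in by the Scheduler after assignment
    print(pod.metadata.namespace, pod.metadata.name, pod.spec.node_name)
```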
NODE COMPONENTS
• kubelet: the primary node agent; it watches the pods assigned to its node and acts to keep them healthy and running, e.g. mounting pod volumes, running containers, performing health checks, etc.
• kube-proxy: implements the Kubernetes Service abstraction by maintaining network rules on the host and forwarding connections; kube-proxy acts as a network proxy and load balancer for a service on a single worker node.
• Docker: used to actually run the containers.
SIMPLIFIED VIEW
(figure: L7 logic (Ingress); L3-L4 networking; L3-L7 network management == service mesh)
CONCLUSION