
ABC of VMware NSX-T (2.4)
Agenda
First 3 sections: VMware virtualization with vSphere 6.7 (pre NSX-T)
The rest of the sections are:

NSX-T Components & Architecture
Dashboard
Fabric & Logical Switching
What Is NSX-T Solving?

VMware NSX-T is designed to address application frameworks and
architectures that have heterogeneous endpoints and technology stacks.

In addition to vSphere, these environments include other hypervisors,
containers, bare metal operating systems, and public clouds.

NSX-T allows IT and development teams to choose the technologies best suited for their
particular applications. NSX-T is also designed for management, operations, and
consumption by development organizations in addition to IT.
NSX-T Anywhere Architecture
Characteristics of the NSX-T architecture include:

Policy and Consistency
Networking and Connectivity
Security and Services
Visibility

These attributes enable the heterogeneity, app-alignment, and extensibility
required to support diverse requirements. Additionally, NSX-T supports DPDK
libraries that offer line-rate stateful services.
Programmatic Integration with Various PaaS and CaaS

NCP – NSX-T Container Plug-in
PaaS – Platform as a Service
CaaS – Container as a Service
NCP Architecture
NSX Container Plugin: NCP is a software component in the form of a container
image, typically run as a Kubernetes pod.

Adapter layer: NCP is built in a modular manner so that individual adapters can
be added for a variety of CaaS and PaaS platforms.

NSX Infrastructure layer: Implements the logic that creates topologies, attaches
logical ports, etc.

NSX API Client: Implements a standardized interface to the NSX API.
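To make the adapter idea above concrete, here is a purely conceptual Python sketch (not actual NCP source code; all class and method names are invented for illustration): a platform-specific adapter emits events, while a common NSX infrastructure layer turns them into topology.

```python
from abc import ABC, abstractmethod

class PlatformAdapter(ABC):
    """Conceptual adapter interface: one subclass per CaaS/PaaS platform."""

    @abstractmethod
    def watch_events(self):
        """Yield platform events (namespace/pod/service created or deleted)."""

class KubernetesAdapter(PlatformAdapter):
    def watch_events(self):
        # Real NCP would watch the Kubernetes API server; here we just
        # return a canned event to illustrate the flow.
        yield {"kind": "namespace", "action": "created", "name": "demo-ns"}

class NsxInfraLayer:
    """Stand-in for the NSX infrastructure layer: turns platform events
    into NSX topology (segments, logical ports, ...)."""

    def handle(self, event):
        if event["kind"] == "namespace" and event["action"] == "created":
            print(f"create segment + Tier-1 for namespace {event['name']}")

adapter = KubernetesAdapter()
infra = NsxInfraLayer()
for ev in adapter.watch_events():
    infra.handle(ev)
```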


Multi-Cloud Architecture and NSX-T
Starting with the NSX-T 2.5 release, NSX Cloud supports two modes of
operation: Native Cloud Enforced Mode and NSX Enforced Mode.

When using Native Cloud Enforced Mode, NSX policies are translated to native
cloud constructs such as Security Groups (in AWS) or a combination of Network
Security Groups/Application Security Groups (in Azure).

In NSX Enforced Mode (which was the only mode available in NSX-T 2.4
and prior), the NSX policies are enforced using NSX Tools, which are
deployed in each cloud instance.

Additionally, in NSX-T 2.5, native cloud service endpoints (RDS, ELB,
Azure LB, etc.) are discovered automatically and can be used in NSX-T
DFW policy.
NSX-T Architecture Components
NSX-T works by implementing three separate but integrated planes:
management, control, and data.

The three planes are implemented as sets of processes, modules, and agents
residing on three types of nodes: manager, controller, and transport.
NSX-T Architecture and Components
Management Plane
Serves as a unique entry point for user configuration via the RESTful API
(CMP, automation) or the NSX-T user interface.

Provides ubiquitous connectivity, consistent enforcement of security, and operational
visibility via object management and inventory collection, across multiple
compute domains – up to 16 vCenters, container orchestrators (PKS & OpenShift),
and clouds (AWS and Azure).

Retrieves the desired configuration in addition to system information (e.g.,
statistics).

Responsible for storing the desired configuration in its database. The NSX-T Manager
stores the final configuration requested by the user for the system. This configuration
will be pushed by the NSX-T Manager to the control plane to become a realized
configuration (i.e., a configuration effective in the data plane).
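As a concrete illustration of this single API entry point, the short sketch below reads inventory from the NSX-T Manager over the REST API. The manager address and credentials are placeholders in the style of this lab, and certificate verification is disabled only for brevity.

```python
import requests

NSX_MANAGER = "https://nsxmgr-01a.corp.local"   # placeholder address
AUTH = ("admin", "VMware1!")                   # placeholder credentials

# List transport zones known to the management plane (read-only call).
resp = requests.get(
    f"{NSX_MANAGER}/api/v1/transport-zones",
    auth=AUTH,
    verify=False,  # lab only: self-signed certificate
)
resp.raise_for_status()

for tz in resp.json().get("results", []):
    print(tz["display_name"], tz["transport_type"])
```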
Control Plane
NSX-T splits the control plane into two parts:
● Central Control Plane (CCP) – The CCP is implemented as a
cluster of virtual machines called CCP nodes. The cluster form factor
provides both redundancy and scalability of resources. The CCP is
logically separated from all data plane traffic, meaning any failure in
the control plane does not affect existing data plane operations. User
traffic does not pass through the CCP Cluster.
● Local Control Plane (LCP) – The LCP runs on transport nodes. It is
adjacent to the data plane it controls and is connected to the CCP. The
LCP is responsible for programming the forwarding entries and firewall
rules of the data plane.
NSX Manager Appliance

Instances of the NSX Manager and NSX Controller are bundled in a virtual
machine called the NSX Manager Appliance.

Starting with 2.4, the NSX Manager, NSX Policy Manager, and NSX Controller
elements co-exist in a common VM. Three such NSX appliance
VMs are required for cluster availability.

Because the NSX-T Manager stores all of its information in a database that is
immediately synchronized across the cluster, configuration or read
operations can be performed on any appliance.

Each appliance has a dedicated IP address, and its manager can be accessed
directly or through a load balancer.

Optionally, the three appliances can be configured to maintain a virtual IP
address, which will be serviced by one appliance selected among the three.
Control Plane

Computes all ephemeral runtime state based on configuration from the
management plane, disseminates topology information reported by the
data plane elements, and pushes stateless configuration to forwarding
engines.

The set of objects that the control plane deals with includes VIFs, logical
networks, logical ports, logical routers, IP addresses, and so on.
Data Plane
There are two main types of transport nodes in NSX-T:
 Hypervisor Transport Nodes: Hypervisor transport nodes are
hypervisors prepared and configured for NSX-T. The N-VDS provides
network services to the virtual machines running on those hypervisors.
NSX-T currently supports VMware ESXi™ and KVM hypervisors. The
N-VDS implemented for KVM is based on the Open vSwitch (OVS) and
is platform independent .
 Edge Nodes: VMware NSX-T Edge™ nodes are service appliances
dedicated to running centralized network services that cannot be distributed
to the hypervisors. They can be instantiated as a bare metal appliance or in
virtual machine form factor. They are grouped in one or several clusters,
representing a pool of capacity.
The data plane performs stateless forwarding/transformation of packets based on
tables populated by the control plane, reports topology information to the control
plane, and maintains packet-level statistics.
Summary of what we have studied so far …

Advanced API/UI Object (NSX 2.3)      Declarative API Object (NSX-T 2.4)
Networking:
  Logical Switch                      Segment
  T0 Logical Router                   Tier-0 Gateway
  T1 Logical Router                   Tier-1 Gateway
  Centralized Service Port            Service Interface
Security:
  NSGroup, IP Sets, MAC Sets          Group
  Firewall Section                    Security-Policy
  Edge Firewall                       Gateway Firewall
NSX-T Consumption Model
1) Simplified UI/API (Declarative Interface)
o New interface introduced in NSX-T 2.4 which uses the new declarative API/data
model.
2) Advanced UI/API (Imperative Interface)
o Continues the NSX 2.3 user interface to address the upgrade & Cloud Management
Platform (CMP) use cases.
o The Advanced UI/API will be deprecated over time, as all features/use cases
transition to the Simplified UI/API.

Advanced UI/API:
o CMP integration – plugins will continue to use the imperative Manager APIs for now,
and will be updated to use the declarative APIs in the future.
o Container – flexible (plugin vendor specific).
o OpenStack – flexible.
o Upgrade scenario – existing configuration gets ported to the Advanced UI and is only
available from there (not from the Simplified UI), e.g. DFW settings, session timers,
bridging configuration.

Simplified UI/API:
o All deployments – except for the CMP use case (until plugins are updated with the
declarative API).
o Container – flexible (plugin vendor specific).
o OpenStack – flexible.
o NSX 2.4 onward, new features are only available here: DNS Services/Zones, VPN,
Endpoint Protection (EPP), Network Introspection (E-W Service Insertion), Context
Profile creation (L7 APP or FQDNs), new DFW/Gateway FW layout (different
categories, auto-plumbed rules).
NSX-T Logical Object Naming Changes
With the declarative API/data model, the names of some networking and security logical
objects have changed to build a unified object model. The table below provides the before and
after naming side by side for those NSX-T logical objects. This only changes the name of the
given NSX-T object; conceptually and functionally it is the same as before.

Advanced API/UI Object              Declarative API Object
Networking:
  Logical Switch                    Segment
  T0 Logical Router                 Tier-0 Gateway
  T1 Logical Router                 Tier-1 Gateway
  Centralized Service Port          Service Interface
Security:
  NSGroup, IP Sets, MAC Sets        Group
  Firewall Section                  Security-Policy
NSX-T Declarative API Framework

Some of the main benefits of the declarative API framework are:

Outcome driven: Reduces the number of configuration steps by allowing the
user to describe the desired end-goal (the “what”) and letting the system
figure out “how” to achieve it. It uses user-specified names, not system-
generated IDs.
Order independent: Create/update/delete in any order and always
arrive at the same consistent result.
Prescriptive: Reduces the potential for user error with built-in dependency
checks.
Policy lifecycle management: Simpler, with a single API call. Toggle the
marked-for-delete flag in the JSON request body to manage the lifecycle of
an entire application topology.
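A minimal sketch of this declarative style, assuming a reachable manager and placeholder names: the desired outcome (a Tier-1 gateway plus a segment attached to it) is described in one nested JSON document and sent with a single PATCH to the hierarchical policy API; setting marked_for_delete on the same objects would handle the delete side of the lifecycle.

```python
import requests

NSX = "https://nsxmgr-01a.corp.local"   # placeholder manager address
AUTH = ("admin", "VMware1!")           # placeholder credentials

# Desired end state, expressed as one hierarchical document.
desired_state = {
    "resource_type": "Infra",
    "children": [
        {
            "resource_type": "ChildTier1",
            "Tier1": {
                "resource_type": "Tier1",
                "id": "demo-t1",
                "tier0_path": "/infra/tier-0s/Tier-0-gateway-01",
            },
        },
        {
            "resource_type": "ChildSegment",
            "Segment": {
                "resource_type": "Segment",
                "id": "demo-web",
                "connectivity_path": "/infra/tier-1s/demo-t1",
                "subnets": [{"gateway_address": "172.16.10.1/24"}],
                # "marked_for_delete": True  # would remove this object instead
            },
        },
    ],
}

r = requests.patch(f"{NSX}/policy/api/v1/infra", json=desired_state,
                   auth=AUTH, verify=False)  # lab only
r.raise_for_status()
```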
Let’s Understand the NSX-T Logic/Topology
The top-tier gateway router is called a Tier-0 (T0);
the second-tier gateway router is called a Tier-1 (T1).

Note 1: The south edge of the T0 connects to one or more T1 routing
layer(s) and receives routing information from them. To optimize
resource usage, the T0 layer does not push all routes coming
from the physical network towards the T1, but does provide default
route information.

Note that a two-tiered routing topology is not mandatory. If
there is no need for provider/tenant isolation, a single Tier-0
topology can be implemented. In this scenario, Layer 2
segments are connected directly to the T0 layer, and a Tier-1
router is not configured.
A Gateway router is comprised of up to two components:
 a distributed router (DR), and
optionally one or more service routers (SR)
The DR is kernel based and spans hypervisors, providing local
routing functions to those VMs that are connected to it, and also
exists in any edge nodes the logical router is bound to.

Functionally, the DR is responsible for one-hop distributed
routing between logical switches and/or Gateway routers
connected to this logical router, and functions similarly to the
distributed logical router (DLR) in earlier versions of NSX.
The SR is responsible for delivering services that are not
currently implemented in a distributed fashion, such as stateful
NAT, load balancing, DHCP or VPN services.

Service Routers are deployed on the Edge node cluster that is
selected when the T0/T1 router is initially configured.
A Gateway router in NSX-T always has an associated DR,
regardless of whether it is deployed as a T0 or a T1. It will also
have an associated SR created if either of the following is true:

The Gateway router is a Tier-0 router, even if no stateful
services are configured.

The Gateway router is a Tier-1 router, is linked to a Tier-0
router, and has services configured that do not have a
distributed implementation (such as NAT, LB, DHCP or VPN).
 Note: The NSX-T management plane (MP) automatically creates the structure
connecting the service router to the distributed router. The MP allocates a VNI
and creates a transit segment, then configures a port on both the SR and DR,
connecting them to the transit segment. The MP then automatically allocates
unique IP addresses for both the SR and DR.
NSX Edge Node
NSX Edge Node provides routing services and connectivity to
networks that are external to the NSX-T deployment.

When virtual machine workloads residing on different NSX
segments communicate with one another through a T1, the
distributed router (DR) function is used to route the traffic in a
distributed, optimized fashion.
However, when virtual machine workloads need to communicate with
devices outside of the NSX environment, the service router (SR), which is
hosted on an NSX Edge Node, is used. If stateful services are required -
for example, network address translation - the SR will also perform this
function and must receive the traffic as well, whether the stateful service
is associated with a T0 or a T1 router.

Common deployments of the NSX Edge Node include DMZ and multi-tenant
cloud environments, where the NSX Edge Node creates virtual
boundaries for each tenant through the use of service routers.
Transport Zones

A transport zone controls which hosts a logical switch can reach. It can
span one or more host clusters. Transport zones dictate which hosts
and, therefore, which VMs can participate in the use of a particular
network.

A Transport Zone defines a collection of hosts that can communicate
with each other across a physical network infrastructure. This
communication happens over one or more interfaces defined as
Tunnel End Points (TEPs).
If two transport nodes are in the same transport zone, VMs hosted on
those transport nodes can be attached to the NSX-T logical switch
segments that are also in that transport zone. This attachment makes it
possible for the VMs to communicate with each other, assuming the
VMs otherwise have Layer 2/Layer 3 reachability.

If VMs are attached to switches that are in different transport zones, the
VMs cannot communicate with each other. Transport zones do not
replace Layer 2/Layer 3 reachability requirements, but they place a
limit on reachability.
A node can serve as a transport node if it contains at least one
hostswitch. When creating a host transport node and adding it to a
transport zone, NSX-T installs a hostswitch on the host. The hostswitch
is used for attaching VMs to NSX-T logical switch segments and for
creating NSX-T gateway router uplinks and downlinks.

In previous versions of NSX, a hostswitch could host a single transport
zone; configuring multiple transport zones required multiple
hostswitches on the node. However, as of NSX-T 2.4 it is possible to
configure multiple transport zones using the same hostswitch.
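For reference, a transport zone like the ones in this lab could also be created programmatically. A hedged sketch against the manager API follows; the manager address, credentials, and object names are placeholders.

```python
import requests

NSX = "https://nsxmgr-01a.corp.local"   # placeholder
AUTH = ("admin", "VMware1!")

payload = {
    "display_name": "TZ-Overlay",        # placeholder transport zone name
    "host_switch_name": "N-VDS-1",       # hostswitch the member nodes will use
    "transport_type": "OVERLAY",         # or "VLAN" for VLAN-backed segments
}

r = requests.post(f"{NSX}/api/v1/transport-zones", json=payload,
                  auth=AUTH, verify=False)  # lab only
r.raise_for_status()
print("created transport zone", r.json()["id"])
```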
Components of NSX-T
Management Plane
Provides single API entry point to the system, persists user configuration,
handles user queries, and performs operational tasks on all of the
management, control and data plane nodes in the system. Management plane
is also responsible for querying, modifying and persisting user configuration.

Control Plane
Computes runtime state based on configuration from the management plane.
Control plane disseminates topology information reported by the data plane
elements and pushes stateless configuration to forwarding engines.

Data Plane
Performs stateless forwarding or transformation of packets based on tables
populated by the control plane. Data plane reports topology information to the
control plane and maintains packet level statistics.
NSX Manager
Management function that exists as a component of the NSX Manager Cluster.
In prior versions of NSX, the NSX Manager was a dedicated virtual appliance.
As of NSX-T 2.4, the NSX Manager function and Controller Cluster functions are
consolidated into a single cluster called the NSX Manager Cluster.

NSX Controller Cluster


Deployed as a cluster of highly available virtual appliances that are responsible
for the programmatic deployment of virtual networks across the entire NSX-T
architecture. NSX Manager and NSX Controller services both exist in the NSX
Controller Cluster.

NSX Edge Cluster


Collection of NSX Edge node appliances that are logically grouped for
high-availability monitoring.
External Network
A physical network or VLAN not managed by NSX-T. You can link your logical
network or overlay network to an external network through an NSX Edge. For
example, a physical network in a customer data center or a VLAN in a physical
environment.

Transport Zone
Collection of transport nodes that defines the maximum span of logical
switches. A transport zone represents a set of similarly provisioned
hypervisors, and the logical switches that connect VMs on those hypervisors.

Host Transport Node


Hypervisor node that has been registered with the NSX-T management plane
and has NSX-T modules installed. For a hypervisor host to be part of the
NSX-T overlay, it must be added to the NSX-T fabric.
Edge Transport Node
Edge node that has been registered with the NSX-T management plane. The
Edge Transport Node hosts the NSX Service Routers (SR) that are associated
with Tier-0 and Tier-1 routers, including Uplink connectivity to External
Networks as well as stateful services such as NAT.

NSX Edge Node


Component that provides computational power to deliver IP routing and IP
services functions. Service Routers (SR), used for Uplink connectivity and
stateful services, are provisioned on Edge node appliances.

Profile
Represents a specific configuration that can be associated with an NSX Edge
cluster. For example, the fabric profile might contain the tunneling properties
for dead peer detection.
Gateway Router
NSX-T routing entity that provides distributed East-West routing. A
gateway router also links a Tier-1 router with a Tier-0 router.

Logical Router Port


Logical network port which can attach to either a logical switch
segment port or a physical network uplink port. Logical Router Ports
are also used to connect the LR to SR services such as Network
Address Translation (NAT), Load Balancing, Gateway Firewall, VPN
etc.
Tier-0 (T0) Logical Router

Provider gateway router is also known as Tier-0 gateway router, and


interfaces with the physical network. Tier-0 gateway router is a top-tier
router and can be configured as an active-active or active-standby cluster of
service routers. The gateway router runs BGP and peers with physical
routers via the service router. In active-standby mode, the gateway router
can also provide stateful services.

Tier-1 (T1) Gateway Router

Tier-1 gateway router is the second tier router that connects to one Tier-0
gateway router for northbound connectivity, and one or more overlay
networks for southbound connectivity. Tier-1 gateway router can also be
configured in an active-standby cluster of services when the router is
configured to provide stateful services.
Segment / Logical Switch

 Segments, called logical switches in previous versions of NSX, are API


entities that provide virtual Layer 2 switching for both VM and router
interfaces.

 A segment gives tenant network administrators the logical equivalent of a


physical Layer 2 switch, allowing a group of VMs to communicate on a
common broadcast domain.

 A segment is a logical entity that exists independent of the underlying


infrastructure and spans many hypervisors. It provides network connectivity to
VMs regardless of their physical location, allowing them to migrate between
locations without requiring any reconfiguration.
Logical Switch Port

Logical switch attachment point to establish a connection to a


virtual machine network interface or a logical router interface.
The logical switch port reports applied switching profile, port
state, and link status.
NSX-T Hostswitch or KVM Open vSwitch (OVS)

 Software that runs on the hypervisor and provides physical traffic


forwarding.
 The hostswitch or OVS is invisible to the tenant network administrator and
provides the underlying forwarding service that each logical switch relies on.

 To achieve network virtualization, a network controller must configure the


hypervisor hostswitches with network flow tables that form the logical
broadcast domains the tenant administrators defined when they created and
configured their logical switches.

 Each logical broadcast domain is implemented by tunneling VM-to-VM and


VM-to-logical router traffic, using the tunnel encapsulation protocol Geneve.
Open vSwitch (OVS)

Open source software switch that acts as a hypervisor hostswitch within


XenServer, Xen, KVM and other Linux-based hypervisors. NSX Edge switching
components are based on OVS.

Overlay Logical Network

Logical network implemented using Layer 2-in-Layer 3 tunneling such that the
topology seen by VMs is decoupled from that of the physical network.

Physical Interface (pNIC)


Network interface on a physical server that a hypervisor is installed on.
VM Interface (vNIC)
Network interface on a virtual machine that provides connectivity between
the virtual guest operating system and the standard vSwitch or vSphere
Distributed Switch. The vNIC can be attached to a logical port. You can
identify a vNIC based on its Unique ID (UUID). The vNIC is equivalent to a
network interface card (NIC) on a physical machine.

Virtual Network Identifier (VNI)

The network identifier associated with a given logical switch. As Layer 2
segments are created in NSX, an associated VNI is allocated. This VNI is
used in the encapsulated overlay packet, and facilitates Layer 2 separation.
TEP (Tunnel End Point)

 Tunnel endpoints enable hypervisor hosts to participate in an NSX-T


network overlay.

 The NSX-T overlay deploys a Layer 2 network over an existing physical


network fabric by encapsulating frames inside of packets, and transferring
the encapsulated packets over the underlying transport network.

 The underlying transport network can consist of either Layer 2 or Layer 3


networks. The TEP is the connection point at which encapsulation and
decapsulation takes place.
NSX-T Dashboard
HOL-2026-01
Understand Fabric Nodes
Observe that the hosts in the RegionA01-COMP01 cluster show an NSX
Configuration status of Configured, while the hosts in RegionA01-MGMT are
Not Configured. This means that hosts in the RegionA01-COMP01 cluster can
participate in NSX overlay networking and security, while hosts in RegionA01-
MGMT cannot.

In contrast, the Edge Transport Node in NSX-T contains its own TEP within the
Edge, and no longer requires the hypervisor to perform encapsulation and
decapsulation functions on its behalf. When an encapsulated packet is
destined for an Edge, it is delivered in its encapsulated form directly to the
Edge Node via its TEP address. This allows for greater portability of the Edge
Node, since it no longer has dependencies on underlying kernel services of
the host.
NSX-T Logical Switching
The N-VDS
The primary component involved in the data plane of the transport nodes is
the N-VDS.

The N-VDS forwards traffic between components running on the transport
node (e.g., between virtual machines) or between internal components and
the physical network.

In the latter case, the N-VDS must own one or more physical interfaces
(pNICs) on the transport node.

The N-VDS is mandatory with NSX-T for both overlay and VLAN
backed networking. On ESXi hypervisors, the N-VDS implementation is
derived from the VMware vSphere® Distributed Switch™ (VDS).

With KVM hypervisors, the N-VDS implementation is derived from the
Open vSwitch (OVS).
Segments and Transport Zones

In NSX-T, virtual layer 2 domains are called segments. There are two
kinds of segments:
 VLAN backed segments
 Overlay backed segments
VLAN backed segment

A VLAN backed segment is a layer 2 broadcast domain that is implemented
as a traditional VLAN in the physical infrastructure.

That means that traffic between two VMs on two different hosts but attached
to the same VLAN backed segment will be carried over a VLAN between the
two hosts.

The resulting constraint is that an appropriate VLAN needs to be provisioned
in the physical infrastructure for those two VMs to communicate at layer 2
over a VLAN backed segment.
Overlay backed segment

On the other hand, two VMs on different hosts attached to the
same overlay backed segment will have their layer 2 traffic carried by a
tunnel between their hosts.

This IP tunnel is instantiated and maintained by NSX without the need
for any segment-specific configuration in the physical infrastructure,
thus decoupling NSX virtual networking from the physical
infrastructure.
Uplink vs. pNIC
The N-VDS introduces a clean differentiation between the pNICs of the
host and the uplinks of the N-VDS. The uplinks of the N-VDS are logical
constructs that can be mapped to a single pNIC or to multiple pNICs bundled
into a link aggregation group (LAG).
Teaming Policy
The teaming policy defines how the N-VDS uses its uplinks for redundancy and traffic load
balancing. There are two main options for teaming policy configuration:

● Failover Order – An active uplink is specified along with an optional list of standby
uplinks. Should the active uplink fail, the next available uplink in the standby list takes its
place immediately.

● Load Balanced Source / Load Balanced Source MAC Address – Traffic is distributed
across a specified list of active uplinks.
○ The “Load Balanced Source” flavor makes a 1:1 mapping between a virtual interface and an uplink of
the host. Traffic sent by this interface will leave the host through this uplink only, and traffic destined to
this virtual interface will necessarily enter the host via this uplink.
○ The “Load Balanced Source MAC Address” option goes a little further in terms of
granularity for virtual interfaces that can source traffic from different MAC
addresses: two frames sent by the same virtual interface could be pinned to
different host uplinks based on their source MAC address.
N-VDS Teaming Policies
View Uplink Profile Configuration

Uplink Profiles are assigned to Transport Nodes in the NSX-T environment, and define
the configuration of the physical NICs that will be used.

1. On the left side of the NSX-T user interface, click Profiles


2. Uplink Profiles should be selected by default. If it is not, click to select it
3. Click nsx-default-uplink-hostswitch-profile
◦ NOTE: Click on the name of the Uplink Profile, not the checkbox to its left
Verify Uplink Profile Configuration
Observe the following configuration of the nsx-default-uplink-hostswitch-profile
Uplink Profile:

 Name: nsx-default-uplink-hostswitch-profile
 Description: [blank]
 Transport VLAN: 0
 MTU: Using global MTU
 Teaming Policy: FAILOVER_ORDER
 Active Uplinks: uplink-1
 Standby Uplinks: uplink-2

This profile states that two uplinks will be configured in a failover configuration.
Traffic will normally utilize uplink-1, and will traverse uplink-2 in the event of a
failure of uplink-1.
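The same failover-order profile can be expressed through the API. The sketch below shows the general shape of an uplink host-switch profile body with uplink-1 active and uplink-2 standby; the profile name, manager address, and credentials are placeholders, and the field names should be checked against the API guide for your NSX-T version.

```python
import requests

NSX = "https://nsxmgr-01a.corp.local"   # placeholder
AUTH = ("admin", "VMware1!")

uplink_profile = {
    "resource_type": "UplinkHostSwitchProfile",
    "display_name": "demo-failover-uplink-profile",  # placeholder name
    "transport_vlan": 0,
    "teaming": {
        "policy": "FAILOVER_ORDER",
        "active_list": [{"uplink_name": "uplink-1", "uplink_type": "PNIC"}],
        "standby_list": [{"uplink_name": "uplink-2", "uplink_type": "PNIC"}],
    },
}

r = requests.post(f"{NSX}/api/v1/host-switch-profiles", json=uplink_profile,
                  auth=AUTH, verify=False)  # lab only
r.raise_for_status()
```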
View Overlay Transport Zone Configuration
A Transport Zone defines the scope of where an NSX segment can exist within the
fabric. For example, a dedicated DMZ cluster may contain a DMZ transport zone.
Any segments created in this DMZ transport zone could then only be used by VM
workloads in the DMZ cluster.

There are two types of Transport Zone in NSX-T, Overlay and VLAN:

• Overlay transport zones are used for NSX-T Logical Switch segments. Network
segments created in an Overlay transport zone will utilize TEPs and Geneve
encapsulation, as explored in Module 2: Logical Switching.
• VLAN transport zones are used for traditional VLAN-backed segments. Network
segments created in a VLAN transport zone function similar to a VLAN port group
in vSphere.

1. On the left side of the NSX-T user interface, click Transport Zones
2. Click TZ-Overlay
◦ NOTE: Click on the name of the Transport Zone, not the checkbox to its left
Verify Overlay Transport Zone Configuration
Observe the following configuration of the TZ-Overlay Transport Zone:

 Name: TZ-Overlay
 Description: [blank]
 Traffic Type: Overlay
 N-VDS Name: N-VDS-1
 Host Membership Criteria: Standard
 Uplink Teaming Policy Names: [blank]
 Logical Ports: 8
 Logical Switches: 3

This information is useful for seeing where a given Transport Zone is being used.
Revisiting Host Transport Node Configuration
Now it's time to review how uplink profiles and transport zones are combined to
configure Host Transport Nodes in NSX-T. There are two ways that this can be done:

• Individual: An individual host can be configured by choosing the transport zones


and uplink profiles that should be associated with the NSX fabric on that host.
This configuration is applied on each standalone host in the fabric, and is used for
KVM and non vCenter-managed vSphere hypervisors.
• Transport Node Profile: Transport Node Profiles allow you to define the
configuration for hosts at a vCenter cluster level. This allows you to apply the
configuration to a cluster once, and have all hosts in that cluster automatically
receive the appropriate configuration.

1. On the left side of the NSX-T user interface, click Profiles


2. Click Transport Node Profiles
3. Click the checkbox to the left of ESXi-transport-node-profile to select it
4. Click Edit to review the configuration
Verify Host Transport Node Profile
Observe the following details in the General tab of the Edit Transport Node Profile
dialog:
 Name: ESXi-transport-node-profile
 Description: [blank]
 Transport Zones (Selected): TZ-Overlay (Overlay)

1. Click the N-VDS tab to view the N-VDS configuration


Observe the following details in the N-VDS tab of the Edit Transport Node Profile dialog:

• N-VDS-1
 N-VDS Name: N-VDS-1
 Associated Transport Zones: TZ-Overlay

 NIOC Profile: nsx-default-nioc-hostswitch-profile


 Uplink Profile: nsx-default-uplink-hostswitch-profile
 LLDP Profile: LLDP [Send Packet Disabled]
 IP Assignment: Use IP Pool

 IP Pool: TEP-ESXi-Pool
 Physical NICs: vmnic1 to uplink-1

This profile states that a single Transport Zone, TZ-Overlay, will be associated with hosts
in this profile. Their connectivity to the physical network will use the
nsx-default-uplink-hostswitch-profile. Finally, when a TEP is provisioned on each host, it
will be assigned an IP address from the TEP-ESXi-Pool range of IP addresses.
 1. Click CANCEL to return to the list of Transport Node Profiles
Verify Host Transport Nodes
1. On the left side of the NSX-T user interface, click Nodes
2. Click the dropdown next to Managed by
3. Click vCenter from the list of available options
4. If the list of hosts in the RegionA01-COMP01 (2) cluster are not already visible,
click the arrow to the left of the cluster name to display them

Observe that the RegionA01-COMP01 cluster is configured to use the
ESXi-transport-node-profile Transport Node Profile. All hosts in this cluster will inherit the
configuration that was defined in the profile.
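For orientation, the JSON shape of such a transport node profile roughly mirrors the fields reviewed above (N-VDS name, transport zone, uplink profile, IP pool, pNIC-to-uplink mapping). The sketch below is a hedged approximation; the UUIDs are placeholders and the exact schema should be confirmed against the API reference for your release.

```python
import json

# Hedged sketch only: UUIDs below are placeholders, not values from this lab.
transport_node_profile = {
    "resource_type": "TransportNodeProfile",
    "display_name": "ESXi-transport-node-profile",
    "host_switch_spec": {
        "resource_type": "StandardHostSwitchSpec",
        "host_switches": [
            {
                "host_switch_name": "N-VDS-1",
                "host_switch_profile_ids": [
                    {"key": "UplinkHostSwitchProfile",
                     "value": "<uplink-profile-uuid>"},
                ],
                "ip_assignment_spec": {
                    "resource_type": "StaticIpPoolSpec",
                    "ip_pool_id": "<TEP-ESXi-Pool-uuid>",
                },
                "pnics": [{"device_name": "vmnic1", "uplink_name": "uplink-1"}],
                "transport_zone_endpoints": [
                    {"transport_zone_id": "<TZ-Overlay-uuid>"},
                ],
            }
        ],
    },
}

# A POST of this body to /api/v1/transport-node-profiles would create the profile.
print(json.dumps(transport_node_profile, indent=2))
```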
1.Click the dropdown next to Managed By
2.Click None: Standalone Hosts from the list of available options

A single, standalone KVM host has been provisioned as part of this lab and has been
configured to participate in the NSX fabric.
1.Click the checkbox to the left of kvm-01a.corp.local to select it
2.Click Edit to review the configuration
Observe the following details in the Host Details tab of the Edit Transport Node dialog:
• Name: kvm-01a.corp.local
• Description: [blank]
• IP Addresses: 192.168.110.61
1. Click NEXT to view the Configure NSX settings
Observe the following details in the Configure NSX tab of the Edit Transport Node
dialog:

 N-VDS Name: N-VDS-1


 Associated Transport Zones: TZ-Overlay
 Uplink Profile: nsx-default-uplink-hostswitch-profile
 LLDP Profile: LLDP [Send Packet Disabled]
 IP Assignment: Use IP Pool
 IP Pool: TEP-KVM-Pool
 Physical NICs: eth1 to uplink-1

This profile states that a single Transport Zone, TZ-Overlay, will be associated with
the KVM Transport Node host. Its connectivity to the physical network will use the
nsx-default-uplink-hostswitch-profile. Finally, when a TEP is provisioned on this
host, it will be assigned an IP address from the TEP-KVM-Pool range of IP addresses.

 1. Click CANCEL to return to the list of Transport Nodes


View Edge Transport Node Configuration
Similar to the way a Host Transport Node is configured, an Uplink Profile and one
or more Transport Zones are used to define an Edge Transport Node in NSX-T. Edge
Transport Nodes perform an important function in the NSX fabric. They host the
Service Routers (SRs) that are used by Tier-0 and Tier-1 Gateways to perform
stateful services such as NAT or load balancing. Most importantly, they host the
Tier-0 Service Router that provides route peering between NSX overlay networking
and the physical routed environment.

In this lab, there are four total Edge Transport Nodes, configured in two fault-tolerant
clusters of two nodes each.
1. Click Edge Transport Nodes
2. Click the checkbox to the left of nsx-edge-01 to select it
3. Click Edit to review the configuration
Observe the following details in the General tab of the Edit Edge Transport Node
Profile dialog:
 Name: nsx-edge-01
 Description: [blank]
 Transport Zones (Selected): TZ-Overlay (Overlay), TZ-VLAN (VLAN)

1. Click the N-VDS tab to view the N-VDS configuration


Observe the following details in the N-VDS tab of the Edit Edge Transport Node dialog:

• N-VDS-1
◦N-VDS Name: N-VDS-1
◦Associated Transport Zones: TZ-Overlay
◦Uplink Profile: nsx-edge-single-nic-uplink-profile-large-mtu
◦IP Assignment: Use IP Pool
◦IP Pool: TEP-ESXi-Pool
◦Physical NICs: Uplink-1 to edge-uplink-A

• N-VDS-2
◦ N-VDS Name: N-VDS-2
◦ Associated Transport Zones: TZ-VLAN
◦ Uplink Profile: nsx-edge-single-nic-uplink-profile-large-mtu
◦ IP Assignment: [Disabled] Use DHCP
◦ Physical NICs: Uplink-1 to edge-uplink-B

This profile states that this Edge Node will host two Transport Zones, TZ-Overlay and
TZ-VLAN. One transport zone will be used for route peering with the physical network
(TZ-VLAN), while the other transport zone will be used for overlay network services.
Their connectivity to the physical network will use the
nsx-edge-single-nic-uplink-profile-large-mtu. Finally, when a TEP is provisioned on the
TZ-Overlay transport zone, it will be assigned an IP address from the TEP-ESXi-Pool
range of IP addresses. No TEP will be provisioned on the VLAN transport zone, so the
option is disabled.

 1. Click CANCEL to return to the list of Edge Transport Nodes


View Edge Cluster Configuration
As we reviewed, there are four Edge Transport Nodes defined in the NSX fabric. For
fault tolerance, these edges have been configured in two clusters of two nodes each.
1.Click Edge Clusters
2.Click the checkbox to the left of edge-cluster-01 to select it
3.Click Edit to review the configuration
Observe the following details in the Edit Edge Cluster dialog:
• Name: edge-cluster-01
• Description: [blank]
• Edge Cluster Profile: nsx-default-edge-high-availability-profile
• Transport Nodes (Selected): nsx-edge-01, nsx-edge-02

1. Click CANCEL to return to the list of Edge Clusters


Log into kvm-01a
We will now login to host kvm-01a and verify that the KVM hypervisor is running
the web-03a.corp.local virtual machine. This workload has already been added to
the NSX inventory, and will be used later in this lab.

 1. Click the PuTTY icon in the taskbar. This will launch the PuTTY terminal client

1. Scroll through the list of Saved Sessions until kvm-01a.corp.local is visible


2. Click kvm-01a.corp.local to highlight it
3. Click Load to load the saved session
4. Click Open to launch the SSH session
• If prompted, click Yes to accept the server's host key
• If not automatically logged in, use username vmware and password VMware1!
to log into kvm-01a

1. Enter virsh list to view the virtual machine workloads currently running on this KVM host
and confirm that VM web-03a is running
virsh list
1. Enter ifconfig nsx-vtep0.0 into the command-line on kvm-01a to see that the TEP
interface has been created with an IP address of 192.168.130.61 and an MTU of 1600.

ifconfig nsx-vtep0.0
Uplink Profile Lab
The uplink profile is a template that defines how an N-VDS connects to the physical
network. It specifies:
●The format of the uplinks of an N-VDS
●The default teaming policy applied to those uplinks
●The transport VLAN used for overlay traffic
●The MTU of the uplinks
●The Network IO Control profile
Transport Node Creation with Uplink Profile
Leveraging Different Uplink Profiles
Network I/O Control
Network I/O Control, or NIOC, is the implementation in NSX-T of vSphere’s
Network I/O Control v3.

This feature allows managing traffic contention on the uplinks of an ESXi
hypervisor. NIOC allows the creation of shares, limits, and bandwidth reservations for
the different kinds of ESXi infrastructure traffic.

 Shares: Shares, from 1 to 100, reflect the relative priority of a system traffic type
against the other system traffic types that are active on the same physical adapter.

 Reservation: The minimum bandwidth that must be guaranteed on a single


physical adapter. Reserved bandwidth that is unused becomes available to other
types of system traffic.

 Limit: The maximum bandwidth that a system traffic type can consume on a single
physical adapter.
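To make shares, reservations, and limits concrete, the toy calculation below (plain Python, not an NSX or vSphere API) splits a hypothetical 10 Gbps uplink under full contention: each traffic type gets its reservation plus a shares-proportional slice of the remainder; all numbers are invented for illustration.

```python
# Toy model: split leftover uplink bandwidth by shares after reservations.
UPLINK_GBPS = 10.0  # hypothetical pNIC capacity

traffic_types = {
    # name: (shares, reservation_gbps) -- invented example values
    "management":      (20, 0.5),
    "vmotion":         (50, 1.0),
    "virtual_machine": (100, 2.0),
}

reserved = sum(res for _, res in traffic_types.values())
leftover = UPLINK_GBPS - reserved
total_shares = sum(shares for shares, _ in traffic_types.values())

for name, (shares, reservation) in traffic_types.items():
    # Each type gets its reservation plus a shares-proportional slice of the rest.
    bandwidth = reservation + leftover * shares / total_shares
    print(f"{name:16s} ~{bandwidth:.2f} Gbps under full contention")
```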
The pre-determined types of ESXi infrastructure traffic are:

Management traffic is for host management.
Fault Tolerance (FT) traffic is for sync and recovery.
NFS traffic is related to file transfers in the network file system.
vSAN traffic is generated by the virtual storage area network.
vMotion traffic is for computing resource migration.
vSphere Replication traffic is for replication.
vSphere Data Protection Backup traffic is generated by backup of data.
Virtual Machine traffic is generated by virtual machine workloads.
iSCSI traffic is for Internet Small Computer System Interface storage.
Logical Switching

I. This section on logical switching focuses on overlay backed


segments/logical switches due to their ability to create isolated
logical L2 networks with the same flexibility and agility that exists
with virtual machines.

II. This decoupling of logical switching from the physical network


infrastructure is one of the main benefits of adopting NSX-T.
Overlay Backed Segments
 Each hypervisor is an NSX-T transport node equipped with a tunnel
endpoint (TEP). The TEPs are configured with IP addresses, and the
physical network infrastructure provides IP connectivity - leveraging layer
2 or layer 3 - between them.
 The VMware® NSX-T Controller™ (not pictured) distributes the IP
addresses of the TEPs so they can set up tunnels with their peers.
 The example shows “VM1” sending a frame to “VM5”. In the physical
representation, this frame is transported via an IP point-to-point tunnel
between transport nodes “HV1” to “HV5”.

Note:
The benefit of this NSX-T overlay model is that it allows direct connectivity between
transport nodes irrespective of the specific underlay inter-rack connectivity (i.e., L2 or
L3). Segments can also be created dynamically without any configuration of the physical
network infrastructure.
Flooded Traffic
 The NSX-T segment behaves like a LAN, providing the capability of flooding traffic to all
the devices attached to this segment; this is a cornerstone capability of layer 2.

NSX-T does not differentiate between the different kinds of frames replicated to multiple
destinations. Broadcast, unknown unicast, or multicast traffic will be flooded in a similar
fashion across a segment.

In the overlay model, the replication of a frame to be flooded on a segment is orchestrated
by the different NSX-T components. NSX-T provides two different methods for flooding
traffic .

1) Head-End Replication Mode


2) Two-tier Hierarchical Mode
Head-end Replication Mode

In the head end replication mode, the transport node at the origin of the
frame to be flooded sends a copy to each other transport node that is
connected to this segment.
Note : The default two-tier hierarchical flooding mode is recommended as a
best practice as it typically performs better in terms of physical uplink
bandwidth utilization.
Head-end Replication Mode
Two-tier Hierarchical Mode

 In the two-tier hierarchical mode, transport nodes are grouped according to the subnet of
the IP address of their TEP.

Transport nodes in the same rack typically share the same subnet for their TEP IPs,
though this is not mandatory.

Based on this assumption, Figure shows hypervisor transport nodes classified in three
groups: subnet 10.0.0.0, subnet 20.0.0.0 and subnet 30.0.0.0.
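A small back-of-the-envelope sketch (plain Python) of how many tunneled copies the source transport node itself must send in each flooding mode, using the three TEP subnets from the example; in two-tier mode the source sends one copy per local peer plus one copy per remote group, and a proxy in each remote group replicates locally.

```python
# TEP IPs grouped as in the example: three racks / three TEP subnets.
teps_by_group = {
    "10.0.0.0": ["10.0.0.1", "10.0.0.2", "10.0.0.3"],
    "20.0.0.0": ["20.0.0.1", "20.0.0.2"],
    "30.0.0.0": ["30.0.0.1", "30.0.0.2", "30.0.0.3", "30.0.0.4"],
}

source_group = "10.0.0.0"
all_teps = [t for group in teps_by_group.values() for t in group]

# Head-end replication: the source sends one copy to every other TEP.
head_end_copies = len(all_teps) - 1

# Two-tier hierarchical: one copy per local peer + one per remote group.
local_peers = len(teps_by_group[source_group]) - 1
remote_groups = len(teps_by_group) - 1
two_tier_copies = local_peers + remote_groups

print("head-end copies sent by source:", head_end_copies)   # 8
print("two-tier copies sent by source:", two_tier_copies)   # 4
```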
Two-tier Hierarchical Mode
Tables Maintained by the NSX-T Controller
●Global MAC address to TEP table
●Global ARP table, associating MAC addresses to IP addresses

1. MAC Address to TEP Tables

When the vNIC of a VM is attached to a segment/logical switch, the NSX-T Controller is


notified of the MAC address as well as the TEP by which this MAC address is reachable.
Unlike individual transport nodes that only learn MAC addresses corresponding to received
traffic, the NSX-T Controller has a global view of all MAC addresses of the vNIC declared
in the NSX-T environment.
2. ARP Tables
The NSX-T Controller also maintains an ARP table in order to help implement an ARP
suppression mechanism. The N-VDS snoops DHCP and ARP traffic to learn MAC address
to IP associations. Those associations are then reported to the NSX-T Controller.
Unicast Traffic
The N-VDS maintains a MAC address table for each segment/logical switch it is attached to.
A MAC address can be associated either with the virtual NIC (vNIC) of a locally attached VM
or with a remote TEP, when the MAC address is located on a remote transport node reached
via the tunnel identified by that TEP.
Unicast Traffic between VMs
Overlay Encapsulation
NSX-T uses Generic Network Virtualization Encapsulation (Geneve) for its
overlay model. Geneve is currently an IETF Internet Draft that builds on the
top of VXLAN concepts to provide enhanced flexibility in term of data plane
extensibility.
VXLAN has static header fields, while Geneve offers flexible fields. This flexibility can be used
to adapt to the needs of typical workloads and the overlay fabric. NSX-T tunnels are only set up
between NSX-T transport nodes, so NSX-T only needs efficient support for the Geneve
encapsulation by the NIC hardware; most NIC vendors support the same hardware offload for
Geneve as they would for VXLAN.

Network virtualization is about developing a deployment model that is applicable to a variety
of physical networks and a diversity of compute domains. New networking features are
developed in software and implemented without worrying about support on the physical
infrastructure. The data plane learning section describes how NSX-T relies on metadata
inserted in the tunnel header to identify the source TEP of a frame.
Geneve allows any vendor to add its own metadata in the tunnel header with a
simple Type-Length-Value (TLV) model. NSX-T defines a single TLV, with fields
for:

 Identifying the TEP that sourced a tunnel packet

 A version bit used during the intermediate state of an upgrade

 A bit indicating whether the encapsulated frame is to be traced

 A bit for implementing the two-tier hierarchical flooding mechanism. When a


transport node receives a tunneled frame with this bit set, it knows that it must
perform local replication to its peers

 Two bits identifying the type of the source TEP


Data Plane Learning Using Tunnel Source IP Address

Data plane learning in case of Two-tier Hierarchical Mode

Reference: https://tools.ietf.org/html/draft-gross-geneve-00
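For orientation only, the snippet below packs the fixed 8-byte Geneve base header from the IETF draft referenced above (version/option length, O and C flags, protocol type, 24-bit VNI); the NSX-T specific TLV metadata described earlier is not modeled here.

```python
import struct

def geneve_base_header(vni: int, opt_len_words: int = 0,
                       oam: bool = False, critical: bool = False,
                       protocol: int = 0x6558) -> bytes:
    """Fixed 8-byte Geneve header (variable options not included).
    protocol 0x6558 = Trans-Ether bridging, i.e. an encapsulated Ethernet frame."""
    ver = 0
    byte0 = (ver << 6) | (opt_len_words & 0x3F)          # version + option length
    byte1 = (int(oam) << 7) | (int(critical) << 6)       # O, C flags; rest reserved
    vni_and_reserved = (vni & 0xFFFFFF) << 8             # 24-bit VNI + 8 reserved bits
    return struct.pack("!BBHI", byte0, byte1, protocol, vni_and_reserved)

header = geneve_base_header(vni=69632)
print(header.hex())  # 0000655801100000
```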
Create a new Logical Switch Segment Lab
Go to NSX-T and verify
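The same kind of segment can also be created through the declarative policy API instead of the UI. A hedged sketch follows, with placeholder names for the manager, segment, transport zone path, and gateway address.

```python
import requests

NSX = "https://nsxmgr-01a.corp.local"   # placeholder
AUTH = ("admin", "VMware1!")

segment = {
    "display_name": "LS-demo",           # placeholder segment name
    "transport_zone_path": "/infra/sites/default/enforcement-points/default"
                           "/transport-zones/<TZ-Overlay-uuid>",  # placeholder
    "subnets": [{"gateway_address": "172.16.30.1/24"}],           # placeholder
}

r = requests.put(f"{NSX}/policy/api/v1/infra/segments/LS-demo",
                 json=segment, auth=AUTH, verify=False)  # lab only
r.raise_for_status()
```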
NSX-T Logical Routing
The logical routing capability in the NSX-T platform provides the ability to
interconnect both virtual and physical workloads deployed in different logical L2
networks. NSX-T enables the creation of network elements like segments (Layer 2
broadcast domain) and gateways (routers) in software as logical constructs and embeds
them in the hypervisor layer, abstracted from the underlying physical hardware.
Since these network elements are logical entities, multiple gateways can
be created in an automated and agile fashion .
Note:
In modern data centers, more than 70% of the traffic is East-West.
Please note that the DR is not a VM, and the DR on both hypervisors has the
same IP addresses.
Logical and Physical View of Routing Services
Single Tier Routing

The NSX-T Gateway provides optimized distributed routing as well as centralized routing
and services like NAT, load balancer, DHCP server, etc. A single-tier routing topology
implies that a Gateway is connected to segments southbound, providing E-W routing,
and is also connected to the physical infrastructure to provide N-S connectivity.

This gateway is referred to as Tier-0 Gateway.

Tier-0 Gateway consists of two components:

distributed routing component (DR)


centralized services routing component (SR).
Distributed Router (DR)
A DR is essentially a router with logical interfaces (LIFs) connected to multiple subnets.

It runs as a kernel module and is distributed in hypervisors across all transport nodes, including
Edge nodes.

The traditional data plane functionality of routing and ARP lookups is performed by the logical
interfaces connecting to the different segments.

Each LIF has a vMAC address and an IP address representing the default IP gateway for its logical
L2 segment.

The IP address is unique per LIF and remains the same anywhere the segment/logical switch exists.
The vMAC associated with each LIF remains constant in each hypervisor, allowing the default
gateway and MAC to remain the same during vMotion.
E-W Routing with Workload on the same Hypervisor
Packet Flow between two VMs on same Hypervisor
1. “Web1” (172.16.10.11) sends a packet to “App1” (172.16.20.11). The packet is sent to
the default gateway interface (172.16.10.1) for “Web1” located on the local DR.

2. The DR on “HV1” performs a routing lookup which determines that the destination
subnet 172.16.20.0/24 is a directly connected subnet on “LIF2”. A lookup is performed
in the “LIF2” ARP table to determine the MAC address associated with the IP address
for “App1”. If the ARP entry does not exist, the controller is queried. If there is no
response from controller, an ARP request is flooded to learn the MAC address of
“App1”.
3. Once the MAC address of “App1” is learned, the L2 lookup is performed in the local
MAC table to determine how to reach “App1” and the packet is sent.

4. The return packet from “App1” follows the same process and routing would happen
again on the local DR.
E-W Packet Flow between two Hypervisors

The routing decisions taken by the DR on “HV1” and the DR on “HV2”. When “Web1” sends
traffic to “App2”, routing is done by the DR on “HV1”. The reverse traffic from “App2” to
“Web1” is routed by DR on “HV2”.
Services Router
East-West routing is completely distributed in the hypervisor, with each hypervisor in the
transport zone running a DR in its kernel. However, due to their locality or stateful
nature, some services of NSX-T are not distributed, including:
●Physical infrastructure connectivity
●NAT
●DHCP server
●Load Balancer
●VPN
●Gateway Firewall
●Bridging
●Service Interface
●Metadata Proxy for OpenStack

The appliances where the centralized services or SR instances are hosted are called Edge
nodes. An Edge node is the appliance that provides connectivity to the physical
infrastructure.
Logical Router Components and Interconnection
A Tier-0 Gateway can have following interfaces:
● External Interface – Interface connecting to the physical infrastructure/router. Static routing
and BGP are supported on this interface. This interface was referred to as uplink interface in
previous releases.
● Service Interface – Interface connecting VLAN segments to provide connectivity to VLAN-
backed physical or virtual workloads. A service interface can also be connected to overlay
segments for Tier-1 standalone load balancer use cases, explained in the load balancer section.
● Intra-Tier Transit Link – Internal link between the DR and SR. A transit overlay segment is
auto plumbed between DR and SR and each end gets an IP address assigned in 169.254.0.0/25
subnet by default. This address range is configurable and can be changed if it is used somewhere
else in the network.
● Linked Segments – Interface connecting to an overlay segment. This interface was referred
to as downlink interface in previous releases.
North-South Routing by SR Hosted on Edge Node
Routing Packet Flow
End-to-end Packet Flow – Application “Web1” to External
“Web1” (172.16.10.11) sends a packet to 192.168.100.10. The packet is sent to the “Web1”
default gateway interface (172.16.10.1) located on the local DR.
The packet is received on the local DR. DR doesn’t have a specific connected route for
192.168.100.0/24 prefix. The DR has a default route with the next hop as its corresponding SR,
which is hosted on the Edge node.
The “ESXi” TEP encapsulates the original packet and sends it to the Edge node TEP with a
source IP address of 10.10.10.10 and destination IP address of 30.30.30.30.
The Edge node is also a transport node. It will encapsulate/decapsulate the traffic sent to or
received from compute hypervisors. The Edge node TEP decapsulates the packet, removing
the outer header prior to sending it to the SR.
The SR performs a routing lookup and determines that the route 192.168.100.0/24 is learned
via external interface with a next hop IP address 192.168.240.1.
The packet is sent on the VLAN segment to the physical router and is finally delivered to
192.168.100.10.
End-to-end Packet Flow – External to Application “Web1”
An external device (192.168.100.10) sends a packet to “Web1” (172.16.10.11). The packet
is routed by the physical router and sent to the external interface of Tier-0 Gateway hosted
on Edge node.
A single routing lookup happens on the Tier-0 Gateway SR which determines that
172.16.10.0/24 is a directly connected subnet on “LIF1”. A lookup is performed in the
“LIF1” ARP table to determine the MAC address associated with the IP address for “Web1”.
This destination MAC “MAC1” is learned via the remote TEP (10.10.10.10), which is the
“ESXi” host where “Web1” is located.
The Edge TEP encapsulates the original packet and sends it to the remote TEP with an outer
packet source IP address of 30.30.30.30 and destination IP address of 10.10.10.10. The
destination VNI in this Geneve encapsulated packet is of “Web LS”.
The “ESXi” host decapsulates the packet and removes the outer header upon receiving the
packet. An L2 lookup is performed in the local MAC table associated with “LIF1”.
The packet is delivered to Web1.
Two-Tier Routing

The concept of multi-tenancy is built into the routing model. The top-tier
gateway is referred to as Tier-0 gateway while the bottom-tier gateway is Tier-
1 gateway. This structure gives both provider and tenant administrators
complete control over their services and policies. The provider
administrator controls and configures Tier-0 routing and services, while
the tenant administrators control and configure Tier-1.

Configuring two-tier routing is not mandatory but recommended. The topology can be
single-tiered, as shown in the previous section. There are several
advantages of a multi-tiered design which will be discussed in later parts of
the design guide. The figure presents an NSX-T two-tier routing architecture.
Two Tier Routing and Scope of Provisioning
Route Types on Tier-0 and Tier-1 Gateways
Tier-0 Gateway
○ Connected – Connected routes on Tier-0 include external interface subnets, service
interface subnets and segment subnets connected to Tier-0. 172.16.20.0/24
(Connected segment), 192.168.20.0/24 (Service Interface) and 192.168.240.0/24
(External interface) are connected routes for Tier-0 gateway in Figure .

○ Static – User configured static routes on Tier-0.

○ NAT IP – NAT IP addresses owned by the Tier-0 gateway discovered from NAT rules
configured on Tier-0 Gateway.

○ BGP – Routes learned via a BGP neighbor.

○ IPSec Local IP – Local IPSEC endpoint IP address for establishing VPN sessions.

○ DNS Forwarder IP – Listener IP for DNS queries from clients and also used as
source IP used to forward DNS queries to upstream DNS server.
Tier-1 Gateway

○ Connected – Connected routes on Tier-1 include segment subnets connected to Tier-1 and
service interface subnets configured on Tier-1 gateway. 172.16.10.0/24 (Connected segment)
and 192.168.10.0/24 (Service Interface) are connected routes for Tier-1 gateway in Figure

○ Static– User configured static routes on Tier-1 gateway.

○ NAT IP – NAT IP addresses owned by the Tier-1 gateway discovered from NAT
rules configured on the Tier-1 gateway.

○ LB VIP – IP address of load balancing virtual server.

○ LB SNAT – IP address or a range of IP addresses used for Source NAT by load balancer.

○ IPSec Local IP – Local IPSEC endpoint IP address for establishing VPN sessions.

○ DNS Forwarder IP –Listener IP for DNS queries from clients and also used as source IP
used to forward DNS queries to upstream DNS server.
Route Advertisement on the Tier-1 and Tier-0 Logical Router
Logical Routing Lab

Task:

The lab environment we are building currently includes a Tier-0 Gateway
which connects to outside networks. In this module, we will build a Tier-1
Gateway that will handle routing between a sample three-tiered
application's network segments, and move those segments from the
existing Tier-0 Gateway to the newly created Tier-1 Gateway.
Topology
View Existing Tier-0 Gateway
View Linked Segments
Create New Tier-1 Gateway
Add Tier-1 Gateway
1.In the Tier-1 Gateway Name field, enter Tier-1-gateway-01
2.In the Linked Tier-0 Gateway field, select Tier-0-gateway-01
3.Click SAVE

Continue Configuring Tier-1 Gateway


Configure Route Advertisement
Verify Creation of New Tier-1 Gateway
Verify that the new T1 Gateway Tier-1-gateway-01 has been created. Confirm that it is linked to T0
Gateway Tier-0-gateway-01, has 0 Linked Segments, and has a Status of Up.
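For reference, an equivalent Tier-1 gateway object can also be written directly to the policy API. The sketch below reuses the lab names but assumes the object IDs match the display names; the manager address is a placeholder and the route advertisement types shown are illustrative.

```python
import requests

NSX = "https://nsxmgr-01a.corp.local"   # placeholder
AUTH = ("admin", "VMware1!")

tier1 = {
    "display_name": "Tier-1-gateway-01",
    "tier0_path": "/infra/tier-0s/Tier-0-gateway-01",  # assumes ID == display name
    # Advertise connected segments (and NAT IPs) northbound to the Tier-0.
    "route_advertisement_types": ["TIER1_CONNECTED", "TIER1_NAT"],
}

r = requests.put(f"{NSX}/policy/api/v1/infra/tier-1s/Tier-1-gateway-01",
                 json=tier1, auth=AUTH, verify=False)  # lab only
r.raise_for_status()
```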

Connect Segment LS-web to Tier-1 Gateway


Edit Segment LS-web
1. Select Tier-1-gateway-01 for Uplink & Type
2. Click SAVE

Close Editing Mode


Do the same with LS-app & LS-db
Finally it will look like this
Verification
Ping app-01a and db-01a VMs
Verify N-S Traffic
Fully Distributed Two Tier Routing

 NSX-T provides a fully distributed routing architecture. The


motivation is to provide routing functionality closest to the source.

 NSX-T leverages the same distributed routing architecture discussed


in distributed router section and extends that to multiple tiers. Figure
shows both logical and per transport node views of two Tier-1 gateways
serving two different tenants and a Tier-0 gateway.

 Per transport node view shows that the distributed component (DR)
for Tier-0 and the Tier-1 gateways have been instantiated on two
hypervisors.
Logical Routing Instances
Multi-Tier Distributed Routing with Workloads on the same Hypervisor

The following list provides a detailed packet walk between workloads


residing in different tenants but hosted on the same hypervisor.

1. “VM1” (172.16.10.11) in tenant 1 sends a packet to “VM3”


(172.16.201.11) in tenant 2. The packet is sent to its default gateway
interface located on tenant 1, the local Tier-1 DR.

2. Routing lookup happens on the tenant 1 Tier-1 DR and the packet is


routed to the Tier-0 DR following the default route. This default route
has the RouterLink interface IP address (100.64.224.0/31) as a next hop.
3. Routing lookup happens on the Tier-0 DR. It determines that the
172.16.201.0/24 subnet is learned via the tenant 2 Tier-1 DR
(100.64.224.3/31) and the packet is routed there.

4. Routing lookup happens on the tenant 2 Tier-1 DR. This determines


that the 172.16.201.0/24 subnet is directly connected. L2 lookup is
performed in the local MAC table to determine how to reach “VM3” and
the packet is sent.
Multi-Tier Distributed Routing with Workloads on different Hypervisors
Logical routing end-to-end packet Flow between hypervisor
1. “VM1” (172.16.10.11) in tenant 1 sends a packet to “VM2”
(172.16.200.11) in tenant 2. The packet is sent to its default gateway
interface located on the local Tier-1 DR.

2. Routing lookup happens on the tenant 1 Tier-1 DR and the packet is


routed to the Tier-0 DR. It follows the default route to the Tier-0 DR
with a next hop IP of 100.64.224.0/31.

3. Routing lookup happens on the Tier-0 DR which determines that the


172.16.200.0/24 subnet is learned via the tenant 2 Tier-1 DR
(100.64.224.3/31) and the packet is routed accordingly.
4. Routing lookup happens on the tenant 2 Tier-1 DR which determines
that the 172.16.200.0/24 subnet is a directly connected subnet. A lookup
is performed in ARP table to determine the MAC address associated with
the “VM2” IP address. This destination MAC is learned via the remote
TEP on hypervisor “HV2”.

5. The “HV1” TEP encapsulates the packet and sends it to the “HV2”
TEP.

6. The “HV2” TEP decapsulates the packet and a L2 lookup is performed


in the local MAC table associated to the LIF where “VM2” is connected.

7. The packet is delivered to “VM2”.


Routing Capabilities
NSX-T supports static routing and the dynamic routing protocol BGP
on Tier-0 Gateways for IPv4 and IPv6 workloads.

In addition to static routing and BGP, the Tier-0 gateway also supports a
dynamically created iBGP session between its Service Router (SR)
components.

Tier-1 Gateways support static routes but do not support any dynamic
routing protocols.
Dynamic Routing
BGP is the de facto protocol on the WAN and in most modern data
centers. A typical leaf-spine topology has eBGP running between leaf
switches and spine switches.

Tier-0 gateways support eBGP and iBGP on the external interfaces with
physical routers. BFD can also be enabled per BGP neighbor for faster
failover. BFD timers depend on the Edge node type.

Bare metal Edge supports a minimum of 300ms TX/RX BFD keep alive
timer while the VM form factor Edge supports a minimum of 1000ms
TX/RX BFD keep alive timer.
With NSX-T 2.5 release, the following BGP features are supported:
● Two- and four-byte AS numbers in asplain, asdot, and asdot+ formats.
● eBGP multi-hop support, allowing eBGP peering to be established on
loopback interfaces.
● iBGP
● eBGP multi-hop BFD
● ECMP support with BGP neighbors in same or different AS numbers.
● BGP allowas-in
● BGP route aggregation support with the flexibility of advertising a
summary route only to the BGP peer or advertise the summary route along
with specific routes. A more specific route must be present in the routing
table to advertise a summary route.
● Route redistribution in BGP to advertise Tier-0 and Tier-1 Gateway
internal routes .
● Inbound/outbound route filtering with BGP peer using prefix-lists or
route-maps.
● Influencing BGP path selection by setting Weight, Local preference, AS
Path Prepend, or MED.
● Standard, Extended and Large BGP community support.
● BGP well-known community names (e.g., no-advertise, no-export, no-
export-subconfed) can also be included in the BGP route updates to the
BGP peer.
● BGP communities can be set in a route-map to facilitate matching of
communities at the upstream router.
● Graceful restart (Full and Helper mode) in BGP.
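As an illustration of the BGP peering described above, the following sketch configures an eBGP neighbor on a Tier-0 gateway through the Policy API. The locale-services ID ("default"), the Tier-0 name, and the neighbor address/AS numbers are placeholders, not a reference configuration.

# Sketch: add an eBGP neighbor to a Tier-0 gateway via the Policy API.
# The locale-services ID ("default") and all addresses/AS numbers are assumptions.
import requests

NSX_MGR = "https://nsx-mgr.lab.local"
AUTH = ("admin", "VMware1!VMware1!")
T0 = "Tier-0-gateway-01"

neighbor = {
    "resource_type": "BgpNeighborConfig",
    "display_name": "tor-a",
    "neighbor_address": "192.168.240.1",   # physical router / TOR peer address
    "remote_as_num": "65001",              # eBGP peer AS (string to allow asdot)
}

url = (f"{NSX_MGR}/policy/api/v1/infra/tier-0s/{T0}"
       f"/locale-services/default/bgp/neighbors/tor-a")
resp = requests.patch(url, json=neighbor, auth=AUTH, verify=False)
resp.raise_for_status()
print("BGP neighbor configured:", resp.status_code)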
Services High Availability

Services High Availability

 NSX Edge nodes run in an Edge cluster, hosting centralized services


and providing connectivity to the physical infrastructure.

 Since the services are run on the SR component of a Tier-0 or Tier-1


gateway, the following concept is relevant to SR. This SR service runs on
an Edge node and has two modes of operation – active/active or
active/standby.
Active/Active
This is a high availability mode where SRs hosted on Edge nodes act as
active forwarders. Stateless services such as layer 3 forwarding are IP
based, so it does not matter which Edge node receives and forwards the
traffic.

All the SRs configured in active/active configuration mode are active


forwarders. This high availability mode is only available on Tier-0
gateway.
 Stateful services typically require tracking of connection state (e.g.,
sequence number check, connection state), thus traffic for a given
session needs to go through the same Edge node.

 As of NSX-T 2.5, active/active HA mode does not support stateful


services such as Gateway Firewall or stateful NAT. Stateless services,
including reflexive NAT and stateless firewall, can leverage the
active/active HA model.
Tier-0 gateway configured in Active/Active HA mode
Inter-SR Routing
 To provide redundancy for physical router failure, Tier-0 SRs on
both Edge nodes must establish routing adjacency or exchange
routing information with different physical router or TOR.

 These physical routers may or may not have the same routing
information. For instance, a route 192.168.100.0/24 may only be
available on physical router 1 and not on physical router 2.
 For such asymmetric topologies, users can enable Inter-SR routing.
This feature is only available on Tier-0 gateway configured in
active/active high availability mode.

 Figure shows an asymmetric routing topology with Tier-0 gateway on


Edge node, EN1 and EN2 peering physical router 1 and physical
router 2, both advertising different routes.
 When Inter-SR routing is enabled by the user, an overlay segment is
auto plumbed between SRs (similar to the transit segment auto
plumbed between DR and SR) and each end gets an IP address
assigned in 169.254.0.128/25 subnet by default.

 An IBGP session is automatically created between Tier-0 SRs and


northbound routes (EBGP and static routes) are exchanged on this
IBGP session.
Inter-SR Routing
Active/Standby

 This is a high availability mode where only one SR act as an active


forwarder. This mode is required when stateful services are enabled.

 Services like NAT are in constant state of sync between active and
standby SRs on the Edge nodes. This mode is supported on both
Tier-1 and Tier-0 SRs. Preemptive and Non-Preemptive modes are
available for both Tier-0 and Tier-1 SRs.
 Default mode for gateways configured in active/standby high
availability configuration is non-preemptive. A user needs to select
the preferred member (Edge node) when a gateway is configured in
active/standby preemptive mode.

 When enabled, preemptive behavior allows a SR to resume active


role on preferred edge node as soon as it recovers from a failure.
For Tier-1 Gateways, active/standby SRs have the same IP addresses
northbound. Only the active SR will reply to ARP requests, while the
standby SR's interfaces are set operationally down so that they
automatically drop packets.

For Tier-0 Gateway, active/standby SRs have different IP addresses


northbound and have eBGP sessions established on both links.

 Both of the Tier-0 SRs (active and standby) receive routing updates
from physical routers and advertise routes to the physical routers;
however, the standby Tier-0 SR prepends its local AS three times in the
BGP updates so that traffic from the physical routers prefer the active
Tier-0 SR.
 Southbound IP addresses on active and standby Tier-0 SRs are the
same and the operational state of standby SR southbound interface is
down. Since the operational state of southbound Tier-0 SR interface
is down, the Tier-0 DR does not send any traffic to the standby SR.
Active and Standby Routing Control with eBGP
High Availability (HA)
Edge Node
 Edge nodes are service appliances with pools of capacity, dedicated to
running network and security services that cannot be distributed to the
hypervisors.

 Edge node also provides connectivity to the physical infrastructure.


Previous sections mentioned that centralized services will run on the SR
component of Tier-0 or Tier-1 gateways.
These features include:
 Connectivity to physical infrastructure
 NAT
 DHCP server
 Metadata proxy
 Gateway Firewall
 Load Balancer
 L2 Bridging
 Service Interface
 VPN
 As soon as one of these services is configured or an external interface
is defined on the Tier-0 gateway, a SR is instantiated on the Edge
node.
 The Edge node is also a transport node just like compute nodes in
NSX-T, and similar to compute node it can connect to more than one
transport zones.
 Typically, Edge node is connected to one overlay transport zone and
depending upon the topology, is connected to one or more VLAN
transport zones for N-S connectivity.
There are two transport zones on the Edge:
Overlay Transport Zone

VLAN Transport Zone


Overlay Transport Zone:
 Any traffic that originates from a VM participating in NSX-T domain
may require reachability to external devices or networks.
 This is typically described as external North-South traffic. Traffic
from VMs may also require some centralized service like NAT, load
balancer etc.
 To provide reachability for N-S traffic and to consume centralized
services, overlay traffic is sent from compute transport nodes to Edge
nodes.
 Edge node needs to be configured for overlay transport zone so that it
can decapsulate the overlay traffic received from compute nodes as
well as encapsulate the traffic sent to compute nodes.
VLAN Transport Zone

 Edge nodes connect to physical infrastructure using VLANs.


 Edge node needs to be configured for VLAN transport zone to
provide external or N-S connectivity to the physical infrastructure.
 Depending upon the N-S topology, an edge node can be configured
with one or more VLAN transport zones.
Types of Edge Nodes

 Edge nodes are available in two form factors – VM and bare metal. Both
leverage the data plane development kit (DPDK) for faster packet
processing and high performance.

 Depending on the required functionality, there are deployment-specific VM


form factors. These are detailed in below table.

 When NSX-T Edge is installed as a VM, vCPUs are allocated to the


Linux IP stack and DPDK.
 The number of vCPU assigned to a Linux IP stack or DPDK depends
on the size of the Edge VM.
 A medium Edge VM has two vCPUs for Linux IP stack and two
vCPUs dedicated for DPDK.
 This changes to four vCPUs for Linux IP stack and four vCPUs for
DPDK in a large size Edge VM.
Size | Memory | vCPU | Disk | Specific Usage Guidelines
Small | 4GB | 2 | 200 GB | PoC only; load balancer functionality is not available.
Medium | 8GB | 4 | 200 GB | Suitable for production with centralized services like NAT and Gateway Firewall. Load balancer functionality can be leveraged for PoC.
Large | 32GB | 8 | 200 GB | Suitable for production with centralized services like NAT, Gateway Firewall, load balancer, etc.
Bare metal Edge | 32GB | 8 | 200 GB | Suitable for production with centralized services like NAT, Gateway Firewall, load balancer, etc. Typically deployed where higher performance at low packet size and sub-second N-S convergence is desired.
Multi-TEP support on Edge Node
 Starting with the NSX-T 2.4 release, Edge nodes support a multiple overlay
tunnel (multi-TEP) configuration to load balance overlay traffic for
overlay segments/logical switches.

 Multi-TEP is supported in both Edge VM and bare metal. Figure


shows two TEPs configured on the bare metal Edge.

 Each overlay segment/logical switch is pinned to a specific tunnel end


point IP, TEP IP1 or TEP IP2. Each TEP uses a different uplink, for
instance, TEP IP1 uses Uplink1 that’s mapped to pNIC P1 and TEP IP2
uses Uplink2 that’s mapped to pNIC P2.
 This feature offers a better design choice by load balancing overlay
traffic across both physical pNICs and also simplifies N-VDS design on
the Edge.

 Notice that a single N-VDS is used in this topology that carries both
overlay and external traffic.

 In-band management feature is leveraged for management traffic.


Overlay traffic gets load balanced by using multi-TEP feature on Edge
and external traffic gets load balanced using "Named Teaming policy"
Bare metal Edge -Same N-VDS for overlay and external traffic with Multi-TEP
Following pre-requisites must be met for multi-TEP support:

 TEP configuration must be done on one N-VDS only.


 All TEPs must use same transport VLAN for overlay traffic.
 All TEP IPs must be in same subnet and use same default gateway.

 During a pNIC failure, Edge performs a TEP failover by migrating TEP


IP and its MAC address to another uplink.

 For instance, if pNIC P1 fails, TEP IP1 along with its MAC address will
be migrated to use Uplink2 that’s mapped to pNIC P2. In case of pNIC
P1 failure, pNIC P2 will carry the traffic for both TEP IP1 and TEP IP2.
VM Edge Node
 The NSX-T Edge in VM form factor can be installed using an OVA, OVF,
or ISO file. The NSX-T Edge VM is only supported on ESXi hosts.

 An NSX-T Edge VM has four internal interfaces: eth0, fp-eth0, fp-eth1,


and fp-eth2. Eth0 is reserved for management, while the rest of the
interfaces are assigned to DPDK Fast Path.

 These interfaces are allocated for external connectivity to TOR switches


and for NSX-T overlay tunneling.

 There is complete flexibility in assigning Fast Path interfaces (fp-eth) for


overlay or external connectivity. As an example, fp-eth0 could be
assigned for overlay traffic with fp-eth1, fp-eth2, or both for external
traffic.
 There is a default teaming policy per N-VDS that defines how the N-VDS
balances traffic across its uplinks.

 This default teaming policy can be overridden for VLAN segments only
using “named teaming policy”.

 To develop desired connectivity (e.g., explicit availability and traffic


engineering), more than one N-VDS per Edge node may be required.

 Each N-VDS instance can have a unique teaming policy, allowing for
flexible design choices.
Edge Cluster

 An Edge cluster is a group of Edge transport nodes. It provides scale out,


redundant, and high throughput gateway functionality for logical networks.

 Scale out from the logical networks to the Edge nodes is achieved using
ECMP. NSX-T 2.3 introduced the support for heterogeneous Edge nodes
which facilitates easy migration from Edge node VM to bare metal Edge
node without reconfiguring the logical routers on bare metal Edge nodes.

 There is a flexibility in assigning Tier-0 or Tier-1 gateways to Edge nodes


and clusters. Tier-0 and Tier-1 gateways can be hosted on either same or
different Edge clusters.
Edge Cluster with Tier-0 and Tier 1 Services
 Depending upon the services hosted on the Edge node and their usage,
an Edge cluster could be dedicated simply for running centralized
services (e.g., NAT).
 Figure shows two clusters of Edge nodes. Edge Cluster 1 is dedicated for
Tier-0 gateways only and provides external connectivity to the physical
infrastructure.
 Edge Cluster 2 is responsible for NAT functionality on Tier-1 gateways.
Multiple Edge Clusters with Dedicated Tier-0 and Tier-1 Services
 There can be only one Tier-0 gateway per Edge node; however,
multiple Tier-1 gateways can be hosted on one Edge node.

 A maximum of 10 Edge nodes can be grouped in an Edge cluster.

 A Tier-0 gateway supports a maximum of eight equal cost paths,


thus a maximum of eight Edge nodes are supported for ECMP.

 Edge nodes in an Edge cluster run Bidirectional Forwarding Detection


(BFD) on both tunnel and management networks to detect Edge node
failure.
 The BFD protocol provides fast detection of failure for forwarding
paths or forwarding engines, improving convergence.

 Edge VMs support BFD with minimum BFD timer of one second with
three retries, providing a three second failure detection time.

 Bare metal Edges support BFD with minimum BFD TX/RX timer of
300ms with three retries which implies 900ms failure detection time.
Network Services

Users can enable NAT as a network service on NSX-T. This is a


centralized service which can be enabled on both Tier-0 and Tier-1
gateways.
Supported NAT rule types include:

Source NAT (SNAT)


Destination NAT (DNAT)
Reflexive NAT
 Source NAT (SNAT): Source NAT translates the source IP of the outbound
packets to a known public IP address so that the application can
communicate with the outside world without using its private IP address. It
also keeps track of the reply.

 Destination NAT (DNAT): DNAT allows for access to internal private IP


addresses from the outside world by translating the destination IP address
when inbound communication is initiated. It also takes care of the reply.
For both SNAT and DNAT, users can apply NAT rules based on 5 tuple
match criteria.
 Reflexive NAT: Reflexive NAT rules are stateless ACLs which must
be defined in both directions. These do not keep track of the
connection. Reflexive NAT rules can be used in cases where stateful
NAT cannot be used due to asymmetric paths (e.g., user needs to
enable NAT on active/active ECMP routers).
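The SNAT and DNAT behavior described above can be sketched against the Policy API as follows; the gateway name, rule IDs, and all address values are lab assumptions rather than a reference configuration.

# Sketch: create one SNAT and one DNAT rule on a Tier-1 gateway (Policy API).
# Gateway ID, rule IDs, and IP addresses are illustrative assumptions.
import requests

NSX_MGR = "https://nsx-mgr.lab.local"
AUTH = ("admin", "VMware1!VMware1!")
T1 = "Tier-1-gateway-01"
BASE = f"{NSX_MGR}/policy/api/v1/infra/tier-1s/{T1}/nat/USER/nat-rules"

# SNAT: hide the web subnet behind a single public IP for outbound traffic
snat = {
    "action": "SNAT",
    "source_network": "172.16.10.0/24",
    "translated_network": "80.80.80.1",
}
requests.patch(f"{BASE}/snat-web", json=snat, auth=AUTH, verify=False).raise_for_status()

# DNAT: expose an internal web server on a public IP for inbound traffic
dnat = {
    "action": "DNAT",
    "destination_network": "80.80.80.2",
    "translated_network": "172.16.10.11",
}
requests.patch(f"{BASE}/dnat-web-01", json=dnat, auth=AUTH, verify=False).raise_for_status()
print("NAT rules created")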
Summarizes NAT rules and usage restrictions
Lab :
DHCP Services

 NSX-T provides both DHCP relay and DHCP server functionality.


DHCP relay can be enabled at the gateway level and can act as relay
between non-NSX managed environment and DHCP servers.

 DHCP server functionality can be enabled to service DHCP requests


from VMs connected to NSX-managed segments.

 DHCP server functionality is a stateful service and must be bound to an


Edge cluster or a specific pair of Edge nodes as with NAT functionality.
NSX-T Security

NSX-T also serves as an advanced security platform, providing a rich set of


features to streamline the deployment of security solutions .
 NSX-T distributed firewall (DFW) provides stateful protection of the
workload at the vNIC level. DFW enforcement occurs in the hypervisor
kernel, helping deliver microsegmentation.
 Uniform security policy model for on-premises and cloud deployment,
supporting multihypervisor (i.e., ESXi and KVM) and multi-workload,
with a level of granularity down to VM/containers/bare metal
attributes.
 Agnostic to compute domain - supporting hypervisors managed by
different compute managers while allowing any defined micro-
segmentation policy to be applied across hypervisors spanning multiple
vCenter environments.
 Support for Layer 3, Layer 4, Layer-7 APP-ID, & Identity based firewall
policies.
 NSX-T Gateway firewall serves as a centralized stateful firewall service for N-
S traffic. Gateway firewall is implemented per gateway and supported at both
Tier-0 and Tier-1. Gateway firewall is independent of NSX-T DFW from
policy configuration and enforcement perspective.
 Gateway & Distributed Firewall Service insertion capability to provide
advanced firewall services like IPS/IDS using integration with partner
ecosystem.
 Dynamic grouping of objects into logical constructs called Groups based on
various criteria including tag, virtual machine name, subnet, and segments.
 The scope of policy enforcement can be selective, with application or
workload-level granularity.
 IP discovery mechanism dynamically identifies workload addressing.
 SpoofGuard blocks IP spoofing at vNIC level.
 Switch Security provides storm control and security against
unauthorized traffic.
NSX-T Security Use Cases
 The NSX-T firewall is delivered as part of a distributed platform that offers
ubiquitous enforcement, scalability, line rate performance, multi-
hypervisor support, and API-driven orchestration.

 These fundamental pillars of the NSX-T firewall allow it to address many


different use cases for production deployment.

 One of the leading use cases NSX-T supports is micro-segmentation.


 Micro-segmentation enables an organization to logically divide its
data center into distinct security segments down to the individual
workload level, then define distinct security controls for and deliver
services to each unique segment.
 A central benefit of micro-segmentation is its ability to deny
attackers the opportunity to pivot laterally within the internal
network, even after the perimeter has been breached.
 VMware NSX-T supports micro-segmentation as it allows for a centrally
controlled, operationally distributed firewall to be attached directly to
workloads within an organization’s network.

 The distribution of the firewall for the application of security policy to protect
individual workloads is highly efficient; rules can be applied that are specific to
the requirements of each workload.

 Of additional value is that NSX’s capabilities are not limited to homogeneous


vSphere environments. It supports the heterogeneity of platforms and
infrastructure that is common in organizations today.
 Micro-segmentation provided by NSX-T supports a zero-trust architecture
for IT security. It establishes a security perimeter around each VM or
container workload with a dynamically - defined policy.

 Conventional security models assume that everything on the inside of an


organization's network can be trusted; zero-trust assumes the opposite - trust
nothing and verify everything.

 This addresses the increased sophistication of networks attacks and insider


threats that frequently exploit the conventional perimeter-controlled
approach.

 For each system in an organization's network, trust of the underlying


network is removed. A perimeter is defined per system within the network
to limit the possibility of lateral (i.e., East-West) movement of an attacker.
Example of Micro-segmentation with NSX
NSX-T Data Plane - ESXi vs. KVM Hosts || DFW
NSX-T DFW Architecture and Components

In the NSX-T DFW architecture, the management plane, control plane, and data
plane work together to enable a centralized policy configuration model
with distributed firewalling.

This section will examine the role of each plane and its associated
components, detailing how they interact with each other to provide a
scalable, topology agnostic distributed firewall solution.
NSX-T DFW Architecture and Components
Management Plane

 When a firewall policy rule is configured, the NSX-T management


plane service validates the configuration and locally stores a
persistent copy.

 Then the NSX-T Manager pushes user-published policies to the


control plane service within Manager Cluster which in turn pushes to
the data plane.

 A typical DFW policy configuration consists of one or more sections


with a set of rules using objects like Groups, Segments, and
application level gateway (ALGs).
 For monitoring and troubleshooting, the NSX-T Manager interacts
with a host-based management plane agent (MPA) to retrieve
DFW status along with rule and flow statistics.

 The NSX-T Manager also collects an inventory of all hosted


virtualized workloads on NSX-T transport nodes. This is dynamically
collected and updated from all NSX-T transport nodes.
Control Plane
 From a DFW policy configuration perspective, NSX-T Control plane
will receive policy rules pushed by the NSX-T Management plane.

 If the policy contains objects including segments or Groups, it converts


them into IP addresses using an object-to-IP mapping table.

 This table is maintained by the control plane and updated using an IP


discovery mechanism. Once the policy is converted into a set of rules
based on IP addresses, the CCP pushes the rules to the LCP on all the
NSX-T transport nodes.
 The CCP utilizes a hierarchy system to distribute the load of CCP-to-
LCP communication.

 The responsibility for transport node notification is distributed


across the managers in manager clusters based on internal hashing
mechanism.

 For example, for 30 transport nodes with three managers, each


manager will be responsible for roughly ten transport nodes.
Data Plane
 The NSX-T transport nodes comprise the distributed data plane with DFW enforcement
done at the hypervisor kernel level.

 Each of the transport nodes, at any given time, connects to only one of the CCP managers,
based on mastership for that node.

 On each of the transport nodes, once local control plane (LCP) has received policy
configuration from CCP, it pushes the firewall policy and rules to the data plane filters (in
kernel) for each of the virtual NICs.

 With the “Applied To” field in the rule or section which defines scope of enforcement, the
LCP makes sure only relevant DFW rules are programmed on relevant virtual NICs instead
of every rule everywhere, which would be a suboptimal use of hypervisor resources.
NSX-T Data Plane Implementation - ESXi vs. KVM Hosts

 The DFW is functionally identical in both environments; however, there are


architectural and implementation differences depending on the hypervisor
specifics.

 Management and control plane components are identical for both ESXi and
KVM hosts. For the data plane, they use a different implementation for
packet handling.

 NSX-T uses N-VDS on ESXi hosts, which is derived from vCenter VDS, along
with the VMware Internetworking Service Insertion Platform
(vSIP) kernel module for firewalling.
 For KVM, the N-VDS leverages Open vSwitch (OVS) and its utilities.
ESXi Hosts- Data Plane Components
 NSX-T uses the N-VDS on ESXi hosts for connecting virtual workloads,
managing it with the NSX-T Manager application.

 The NSX-T DFW kernel space implementation for ESXi is the same as the
implementation in NSX for vSphere – it uses the VMware Internetworking
Service Insertion Platform (vSIP) kernel module and kernel IO chain filters.

 NSX-T does not require vCenter to be present.


 Figure next provides details on the data plane components for the ESX host.
NSX-T DFW Data Plane Components on an ESXi Host
KVM Hosts- Data Plane Components
 NSX-T uses OVS and its utilities on KVM to provide DFW functionality, thus the LCP agent
implementation differs from an ESXi host.

 For KVM, there is an additional component called the NSX agent in addition to LCP, with
both running as user space agents. When LCP receives DFW policy from the CCP, it sends
it to NSX-agent.

 NSX-agent will process and convert policy messages received to a format appropriate for
the OVS data path. Then NSX agent programs the policy rules onto the OVS data path
using OpenFlow messages.

 For stateful DFW rules, NSX-T uses the Linux conntrack utilities to keep track of the state
of permitted flow connections allowed by a stateful firewall rule. For DFW policy rule
logging, NSX-T uses the ovs-fwd module.
NSX-T DFW Data Plane Components on KVM
NSX-T DFW Policy Lookup and Packet Flow
 In the data path, the DFW maintains two tables: a rule table and a
connection tracker table. The LCP populates the rule table with the
configured policy rules, while the connection tracker table is updated
dynamically to cache flows permitted by rule table.

 NSX-T DFW can allow for a policy to be stateful or stateless with section-
level granularity in the DFW rule table.

 The connection tracker table is populated only for stateful policy rules; it
contains no information on stateless policies. This applies to both ESXi
and KVM environments.
NSX-T DFW rules are enforced as follows:

 Rules are processed in top-to-bottom order.

 Each packet is checked against the top rule in the rule table before
moving down the subsequent rules in the table.

 The first rule in the table that matches the traffic parameters is
enforced. The search is then terminated, so no subsequent rules will
be examined or enforced.

 Because of this behavior, it is always recommended to put the most


granular policies at the top of the rule table.
 This will ensure more specific policies are enforced first. The DFW
default policy rule, located at the bottom of the rule table, is a catchall
rule; packets not matching any other rule will be enforced by the
default rule - which is set to “allow” by default.

 This ensures that VM-toVM communication is not broken during


staging or migration phases.

 It is a best practice to then change this default rule to a “drop” action


and enforce access control through a whitelisting model (i.e., only
traffic defined in the firewall policy is allowed onto the network).
Figure diagrams the policy rule lookup and packet flow.
NSX-T DFW Policy Lookup
In the example shown above,

 The WEB VM initiates a session to the APP VM by sending a TCP SYN packet.

 The TCP SYN packet hits the DFW on the vNIC and a flow table lookup is
done first to check for a state match against an existing flow. Since this is
the first packet of a new session, the lookup results in a flow-table miss.

 On a flow table miss, the DFW does a rule table lookup in top-down
order looking for a 5-tuple match.

 The flow matches FW rule 2, whose action is Allow, so the packet is sent out
to the destination.
 In addition, the flow table is updated with the new flow state for the
permitted flow as "Flow 2".
 Subsequent packets in this TCP session are checked against this entry in
the flow table for a state match. Once the session terminates, the flow
information is removed from the flow table.
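The lookup order just described (flow table first, then a top-to-bottom first-match search of the rule table, then caching of permitted flows) can be modeled in a few lines of plain Python. The sketch below is purely illustrative of the algorithm and is not NSX-T data path code; all rule values are made up.

# Illustrative model of the DFW lookup order: flow (connection tracker) table
# first, then top-to-bottom first match in the rule table, then caching of
# the permitted flow.  A teaching sketch, not NSX-T code.
import ipaddress

RULES = [  # ordered rule table: (name, src_net, dst_net, dport, action)
    ("rule-1", "any", "172.16.10.0/24", 443, "ALLOW"),
    ("rule-2", "172.16.10.0/24", "172.16.20.0/24", 8443, "ALLOW"),
    ("default", "any", "any", "any", "ALLOW"),   # default catch-all rule
]
FLOW_TABLE = {}  # (src, dst, dport) -> action, only for permitted stateful flows


def net_match(rule_net, ip):
    return rule_net == "any" or ipaddress.ip_address(ip) in ipaddress.ip_network(rule_net)


def port_match(rule_port, port):
    return rule_port == "any" or rule_port == port


def dfw_lookup(flow):
    """flow = (src_ip, dst_ip, dst_port); returns the enforced action."""
    if flow in FLOW_TABLE:                        # flow table hit: no rule lookup
        return FLOW_TABLE[flow]
    for name, src, dst, dport, action in RULES:   # first match wins, search stops
        if net_match(src, flow[0]) and net_match(dst, flow[1]) and port_match(dport, flow[2]):
            if action == "ALLOW":
                FLOW_TABLE[flow] = action         # cache the permitted flow
            return action


print(dfw_lookup(("172.16.10.11", "172.16.20.11", 8443)))  # first packet: rule table lookup
print(dfw_lookup(("172.16.10.11", "172.16.20.11", 8443)))  # follow-up packet: flow table hit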
NSX-T Security Policy - Plan, Design and Implement

Planning, designing, and implementing NSX-T security policy is a three-


step process:
1. Policy Methodology – Decide on the policy approach - application-
centric, infrastructure centric, network-centric
2. Policy Rule Model – Select grouping and management strategy for
policy rules by the NSX-T DFW policy categories and sections.
3. Policy Consumption – Implement the policy rules using the abstraction
through grouping construct and options provided by NSX-T.
Micro-segmentation Methodologies
Application
 In an application-centric approach, grouping is based on the application
type (e.g., VMs tagged as “Web-Servers”), application environment (e.g.,
all resources tagged as “Production-Zone”) and application security
posture.

 An advantage of this approach is the security posture of the application is


not tied to network constructs or infrastructure. Security policies can
move with the application irrespective of network or infrastructure
boundaries, allowing security teams to focus on the policy rather than the
architecture.
 Policies can be templated and reused across instances of the same types of
applications and workloads while following the application lifecycle; they
will be applied when the application is deployed and is destroyed when the
application is decommissioned.

 An application-based policy approach will significantly aid in moving


towards a self-service IT model.

 An application-centric model does not provide significant benefits in an


environment that is static, lacks mobility, and has infrastructure functions
that are properly demarcated.
Infrastructure

 Infrastructure-centric grouping is based on infrastructure


components such as segments or segment ports, identifying
where application VMs are connected.

 Security teams must work closely with the network


administrators to understand logical and physical boundaries.

 If there are no physical or logical boundaries in the


environment, then an infrastructure-centric approach is not
suitable.
Network
 Network-centric is the traditional approach of grouping based on L2 or L3
elements. Grouping can be done based on MAC addresses, IP addresses, or
a combination of both.

 NSX-T supports this approach of grouping objects. A security team needs
to be aware of the networking infrastructure to deploy network-based policies.

 There is a high probability of security rule sprawl as grouping based on


dynamic attributes is not used. This method of grouping works well for
migrating existing rules from an existing firewall.

 A network-centric approach is not recommended in dynamic environments


where there is a rapid rate of infrastructure change or VM
addition/deletion.
Security Policy - Consumption Model
NSX-T security policy is consumed via the firewall rule table, using either the NSX-T
Manager GUI or the REST API framework. When defining security policy rules for the
firewall table, it is recommended to follow these high-level steps:

 VM Inventory Collection – Identify and organize a list of all hosted virtualized


workloads on NSX-T transport nodes. This is dynamically collected and saved by
NSX-T Manager as the nodes – ESXi or KVM – are added as NSX-T transport
nodes.
 Tag Workload – Use VM inventory collection to organize VMs with one or more
tags. Each designation consists of scope and tag association of the workload to
an application, environment, or tenant. For example, a VM tag could be “Scope =
Prod, Tag = web” or “Scope=tenant-1, Tag = app-1”.
 Group Workloads – Use the NSX-T logical grouping construct with
dynamic or static membership criteria based on VM name, tags, segment,
segment port, IP’s, or other attributes.

 Define Security Policy – Using the firewall rule table, define the security
policy. Have categories and policies to separate and identify emergency,
infrastructure, environment, and application-specific policy rules based on
the rule model.
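The tagging step above can also be scripted. The sketch below assumes the NSX-T Manager API endpoint /api/v1/fabric/virtual-machines with its update_tags action and a display_name filter; the VM name, scope, and tag values are placeholders.

# Sketch: tag a workload from the NSX-T inventory so it can be picked up by
# tag-based Groups.  The endpoint, filter parameter, VM name, scope, and tag
# are assumptions for illustration.
import requests

NSX_MGR = "https://nsx-mgr.lab.local"
AUTH = ("admin", "VMware1!VMware1!")

# 1. Find the VM's external_id in the inventory collected by NSX-T Manager
vms = requests.get(
    f"{NSX_MGR}/api/v1/fabric/virtual-machines",
    params={"display_name": "web-01a"},
    auth=AUTH, verify=False,
).json()["results"]
external_id = vms[0]["external_id"]

# 2. Apply scope/tag pairs to the workload (e.g. Scope=Prod, Tag=web)
requests.post(
    f"{NSX_MGR}/api/v1/fabric/virtual-machines",
    params={"action": "update_tags"},
    json={"external_id": external_id,
          "tags": [{"scope": "Prod", "tag": "web"}]},
    auth=AUTH, verify=False,
).raise_for_status()
print("VM tagged")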
NSX-T Groups & DFW Rule
 NSX-T provides collection of referenceable objects represented in a construct called
Groups. The selection of a specific policy methodology approach – application,
infrastructure, or network – will help dictate how grouping construct is used.

 Groups allow abstraction of workload grouping from the underlying infrastructure


topology. This allows a security policy to be written for either a workload or zone
(e.g., PCI zone, DMZ, or production environment).

 A Group is a logical construct that allows grouping into a common container of static
(e.g., IPSet/NSX objects) and dynamic (e.g., VM names/VM tags) elements. This is a
generic construct which can be leveraged across a variety of NSX-T features where
applicable.
Static criteria provide the capability to manually include particular objects
in the Group. For dynamic inclusion, Boolean logic can be used to
combine various membership criteria.

A Group constructs a logical grouping of VMs based on static and


dynamic criteria.

Table shows one type of grouping criteria based on NSX-T Objects.


NSX-T Objects used for Groups

IP Address – Grouping of IP addresses and subnets.
Segment – All VMs/vNICs connected to this segment/logical switch will be selected.
Group (nested) – A sub-group of a collection of referenceable objects; all VMs/vNICs defined within the nested Group will be selected.
Segment Port – This particular vNIC instance will be selected.
MAC Address – The selected MAC sets container will be used. MAC sets contain a list of individual MAC addresses.
AD Groups – Grouping based on Active Directory groups for the Identity Firewall (VDI/RDSH) use case.
VM Properties used for Groups
The use of Groups gives more flexibility as an environment changes
over time. This approach has three major advantages:

 Rules stay more constant for a given policy model, even as the data center
environment changes. The addition or deletion of workloads will affect
group membership alone, not the rules.

 Publishing a change of group membership to the underlying hosts is more


efficient than publishing a rule change. It is faster to send down to all the
affected hosts and cheaper in terms of memory and CPU utilization.

 As NSX-T adds more grouping object criteria, the group criteria can be
edited to better reflect the data center environment.
Using Nesting of Groups

In the example shown in Figure , three Groups have been defined with different
inclusion criteria to demonstrate the flexibility and the power of grouping
construct.
 Using dynamic inclusion criteria, all VMs with name starting by “WEB” are
included in Group named “SG-WEB”.

 Using dynamic inclusion criteria, all VMs containing the name “APP” and
having a tag “Scope=PCI” are included in Group named “SG-PCI-APP”.

 Using static inclusion criteria, all VMs that are connected to a segment “SEG-
DB” are included in Group named “SG-DB”.
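Such Groups can also be defined through the Policy API using membership expressions. The sketch below shows the "SG-PCI-APP" case (VM name contains "APP" AND tag value "PCI") as the author understands the Group expression schema; the domain ID "default", the tag value format, and all criteria values are assumptions.

# Sketch: Policy API Group with a dynamic membership expression combining a
# VM-name condition and a tag condition with an AND conjunction (SG-PCI-APP).
# Domain ID, group ID, and criteria values are assumptions for this example.
import requests

NSX_MGR = "https://nsx-mgr.lab.local"
AUTH = ("admin", "VMware1!VMware1!")

group = {
    "resource_type": "Group",
    "display_name": "SG-PCI-APP",
    "expression": [
        {"resource_type": "Condition", "member_type": "VirtualMachine",
         "key": "Name", "operator": "CONTAINS", "value": "APP"},
        {"resource_type": "ConjunctionOperator", "conjunction_operator": "AND"},
        {"resource_type": "Condition", "member_type": "VirtualMachine",
         "key": "Tag", "operator": "EQUALS", "value": "PCI"},  # a "scope|tag" value can also be used
    ],
}

resp = requests.patch(
    f"{NSX_MGR}/policy/api/v1/infra/domains/default/groups/SG-PCI-APP",
    json=group, auth=AUTH, verify=False,
)
resp.raise_for_status()
print("Group SG-PCI-APP created/updated")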
Group and Nested Group Example
Define Policy using DFW Rule Table
 The NSX-T DFW rule table starts with a default rule to allow (blacklist) any
traffic. An administrator can add multiple policies on top of default rule
under different categories based on the specific policy model.

 NSX-T distributed firewall table layout consists of Categories like Ethernet,


Emergency, Infrastructure, Environment, and Application to help user to
organize security policies. Each category can have one or more policy/section
with one or more firewall rules.

 In the data path, the packet lookup will be performed from top to bottom
order, starting with policies from category Ethernet, Emergency,
Infrastructure, Environment and Application.
 Any packet not matching an explicit rule will be enforced by the last rule in
the table (i.e., default rule). This final rule is set to the “allow” action by
default, but it can be changed to “block” (whitelist) if desired.

 The NSX-T DFW enables policy to be stateful or stateless with policy-level


granularity. By default, NSX-T DFW is a stateful firewall; this is a
requirement for most deployments.

 In some scenarios where an application has less network activity, the


stateless section may be appropriate to avoid connection reset due to
inactive timeout of the DFW stateful connection table.
Meaning is described below

Rule Name: User field; supports up to 30 characters.

Source and Destination: Source and destination fields of the packet. This
will be a GROUP which could be static or dynamic groups as mentioned
under Group section.

Service: Predefined services, predefined services groups, or raw protocols


can be selected. When selecting raw protocols like TCP or UDP, it is possible
to define individual port numbers or a range.
There are four options for the services field:

Pre-defined Service – A pre-defined Service from the list of available


objects.

Add Custom Services – Define custom services by clicking on the


“Create New Service” option. Custom services can be created based on
L4 Port Set, application level gateways (ALGs), IP protocols, and other
criteria. This is done using the “service type” option in the
configuration menu. When selecting an L4 port set with TCP or UDP, it
is possible to define individual destination ports or a range of
destination ports.
When selecting ALG, select supported protocols for ALG from the
list. ALGs are only supported in stateful mode; if the section is
marked as stateless, the ALGs will not be implemented. Additionally,
some ALGs may be supported only on ESXi hosts, not KVM.

Custom Services Group – Define a custom Services group, selecting


from single or multiple services. Workflow is similar to adding
Custom services, except you would be adding multiple service
entries.
Profiles: Used to select and define Layer 7 Application ID and FQDN
whitelisting profiles for Layer 7 based security rules.

Applied To: Defines the scope of rule publishing. The policy rule can
be published to all workloads (default value) or restricted to a specific
Group. When a Group is used in Applied To, it must be based on
non-IP members such as VM objects, Segments, etc.

Action: Define enforcement method for this policy rule; available


options are listed in Table
Firewall Rule Table – “Action” Values

Action – Description
Drop – Silently block the traffic.
Allow – Allow the traffic.
Reject – The reject action sends back to the initiator:
•RST packets for TCP connections.
•ICMP unreachable with the network administratively prohibited code for UDP, ICMP, and other IP connections.
Advanced Settings: Following settings are under advanced settings
options:

Log: Enable or disable packet logging. When enabled, each DFW enabled host will send
DFW packet logs in a syslog file called “dfwpktlog.log” to the configured syslog server. This
information can be used to build alerting and reporting based on the information within the
logs, such as dropped or allowed packets.

Direction: This field matches the direction of the packet, default both In-Out. It can be set
to match packet exiting the VM, entering the VM, or both directions.

Tag: You can tag the rule; this will be sent as part of DFW packet log when traffic hits this
rule.

Notes: This field can be used for any free-flowing string and is useful to store comments.

Stats: Provides packets/bytes/sessions statistics associated with that rule entry. Polled every
5 minutes.
Examples of Policy Rules for 3-Tier Application

Figure shows a standard 3-Tier application topology used to define NSX-T


DFW policy. Three web servers are connected to "SEG Web", two
application servers are connected to "SEG App", and two DB servers are
connected to "SEG DB".

A distributed Gateway is used to interconnect the three tiers by providing


inter-tier routing. NSX-T DFW has been enabled, so each VM has a
dedicated instance of DFW attached to its vNIC/segment port.
3-Tier Application Network Topology
Example 1: Static IP addresses/subnets Group in security policy rule.

This example shows use of the network methodology to define


policy rule. Groups in this example are identified in Table while the
firewall policy configuration is shown in Table.
Firewall Rule Table - Example 1
Using Segment object Group in Security Policy rule

This example uses the infrastructure methodology to define policy rule.


Groups in this example are identified in Table while the firewall policy
configuration is shown in Table .
Firewall Rule Table - Example 2
This policy rule table is easier to read for all teams in the organization,
ranging from security auditors to architects to operations. Any new VM connected to
any segment will be automatically secured with the corresponding security posture.

For instance, a newly installed web server will be seamlessly protected by the first
policy rule with no human intervention, while VM disconnected from a segment will
no longer have a security policy applied to it. This type of construct fully leverages the
dynamic nature of NSX-T.

It will monitor VM connectivity at any given point in time, and if a VM is no longer


connected to a particular segment, any associated security policies are removed.
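A policy table like the examples above could also be pushed in a single call as a Policy API SecurityPolicy. The sketch below is a hedged illustration: the group IDs (SG-WEB, SG-APP, SG-DB), the predefined service paths, the category, and the closing deny rule are all assumptions to be adapted to the actual environment.

# Sketch: a 3-tier DFW policy (web -> app -> db) expressed as a single Policy
# API SecurityPolicy.  Group IDs, service paths, and the default-deny rule are
# assumptions for illustration.
import requests

NSX_MGR = "https://nsx-mgr.lab.local"
AUTH = ("admin", "VMware1!VMware1!")
G = "/infra/domains/default/groups"

policy = {
    "resource_type": "SecurityPolicy",
    "display_name": "3-tier-app",
    "category": "Application",
    "rules": [
        {"display_name": "any-to-web", "action": "ALLOW", "sequence_number": 10,
         "source_groups": ["ANY"], "destination_groups": [f"{G}/SG-WEB"],
         "services": ["/infra/services/HTTPS"], "scope": ["ANY"]},
        {"display_name": "web-to-app", "action": "ALLOW", "sequence_number": 20,
         "source_groups": [f"{G}/SG-WEB"], "destination_groups": [f"{G}/SG-APP"],
         "services": ["/infra/services/HTTP"], "scope": ["ANY"]},
        {"display_name": "app-to-db", "action": "ALLOW", "sequence_number": 30,
         "source_groups": [f"{G}/SG-APP"], "destination_groups": [f"{G}/SG-DB"],
         "services": ["/infra/services/MySQL"], "scope": ["ANY"]},
        {"display_name": "deny-rest", "action": "DROP", "sequence_number": 40,
         "source_groups": ["ANY"], "destination_groups": ["ANY"],
         "services": ["ANY"], "scope": ["ANY"]},
    ],
}

resp = requests.patch(
    f"{NSX_MGR}/policy/api/v1/infra/domains/default/security-policies/3-tier-app",
    json=policy, auth=AUTH, verify=False,
)
resp.raise_for_status()
print("Security policy published")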
Additional Security Features

● SpoofGuard - Provides protection against spoofing with MAC+IP+VLAN


bindings. This can be enforced at a per logical port level. The SpoofGuard feature
requires static or dynamic bindings (e.g., DHCP/ARP snooping) of IP+MAC for
enforcement.

● Segment Security - Provides stateless L2 and L3 security to protect segment


integrity by filtering out malicious attacks (e.g., denial of service using
broadcast/multicast storms) and unauthorized traffic entering segment from
VMs. This is accomplished by attaching the segment security profile to a segment
for enforcement. The segment security profile has options to allow/block bridge
protocol data unit (BPDU), DHCP server/client traffic, non-IP traffic. It allows for
rate limiting of broadcast and multicast traffic, both transmitted and received.
NSX-T Security Enforcement – Agnostic to Network Isolation

The consumption of security policies requires no changes from policy


planning, design, and implementation perspective. This applies to all of the
deployment options mentioned below.

However, the following initial provisioning steps are required to enforce NSX-T
security policies:
a) Prepare the compute hosts for NSX-T.
b) Create VLAN or overlay segments on NSX-T based on the network isolation in use.
c) Move the relevant workloads to the relevant VLAN or overlay segments/networks
on the compute hosts for policy enforcement.
NSX-T Distributed Firewall for VLAN Backed workloads

This is a very common use case for customers who look at NSX-T as a
platform only for the micro-segmentation security use case, without
changing the existing VLAN-backed network isolation. It is an ideal use
case for brownfield deployments where the customer wants to enhance the
security posture of existing applications without changing the network
design.
NSX-T DFW Logical topology – VLAN Backed Workloads
NSX-T DFW Physical Topology – VLAN Backed Workloads
NSX-T Distributed Firewall for Mix of VLAN and Overlay backed
workloads

This use case mainly applies to customers who want to apply NSX-T
micro-segmentation policies to all of their workloads and are looking at
adopting NSX-T network virtualization (overlay) for their application
networking needs in phases. This scenario may arise when a customer
starts to either deploy new applications with network virtualization or
migrate existing applications in phases from VLAN-backed to overlay-backed
networking to gain the advantages of NSX-T network virtualization.
NSX-T DFW Logical Topology – Mix of VLAN & Overlay Backed Workloads
NSX-T DFW Physical Topology – Mix of VLAN & Overlay Backed
Workloads
Summary
The NSX-T platform enforces micro-segmentation policies irrespective of
the network isolation in use (VLAN, overlay, or a mix) without having to
change policy planning, design, or implementation. A user can define an
NSX-T micro-segmentation policy once for an application, and it will
continue to work as the application is migrated from VLAN-based networking
to NSX-T overlay-backed networking.
Gateway Firewall

The NSX-T Gateway firewall provides essential perimeter firewall protection which
can be used in addition to a physical perimeter firewall. Gateway firewall service is
part of the NSX-T Edge node for both bare metal and VM form factors. The Gateway
firewall is useful in developing PCI zones, multi-tenant environments, or DevOps
style connectivity without forcing the inter-tenant or inter-zone traffic onto the
physical network. The Gateway firewall data path uses DPDK framework supported
on Edge to provide better throughput.
Optionally, Gateway Firewall service insertion capability can be
leveraged with the partner ecosystem to provide advanced security
services like IPS/IDS and more. This enhances the security posture by
providing next-generation firewall (NGFW) services on top of native
firewall capability NSX-T provides. This is applicable for the design
where security compliance requirements mandate zone or group of
workloads need to be secured using NGFW, for example, DMZ or PCI
zones or Multi-Tenant environments.
Consumption

NSX-T Gateway firewall is instantiated per gateway and supported at


both Tier-0 and Tier-1. Gateway firewall works independent of NSX-T
DFW from a policy configuration and enforcement perspective. A user
can consume the Gateway firewall using either the GUI or REST API
framework provided by NSX-T Manager. The Gateway firewall
configuration is similar to DFW firewall policy; it is defined as a set of
individual rules within a section. Like DFW, the Gateway firewall rules
can use logical objects, tagging and grouping constructs (e.g., Groups) to
build policies.
Similarly, regarding L4 services in a rule, it is valid to use predefined
Services, custom Services, predefined service groups, custom service
groups, or TCP/UDP protocols with the ports. NSX-T Gateway firewall
also supports multiple Application Level Gateways (ALGs). The user can
select an ALG and supported protocols by using the other setting for type
of service. Gateway FW supports only FTP and TFTP as part of ALG.
ALGs are only supported in stateful mode; if the section is marked as
stateless, the ALGs will not be implemented.
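Gateway firewall policies follow the same grouping and rule model as the DFW but are published under the gateway-policies tree and bound to a specific gateway through the rule scope. The sketch below is illustrative only; the Tier-1 gateway name, the group and service paths, and the category shown are assumptions.

# Sketch: a simple Gateway Firewall policy applied on a Tier-1 gateway via the
# Policy API gateway-policies tree.  IDs, paths, and category are assumptions.
import requests

NSX_MGR = "https://nsx-mgr.lab.local"
AUTH = ("admin", "VMware1!VMware1!")

gw_policy = {
    "resource_type": "GatewayPolicy",
    "display_name": "tenant-1-perimeter",
    "category": "LocalGatewayRules",
    "rules": [
        {"display_name": "allow-web-inbound", "action": "ALLOW",
         "sequence_number": 10,
         "source_groups": ["ANY"],
         "destination_groups": ["/infra/domains/default/groups/SG-WEB"],
         "services": ["/infra/services/HTTPS"],
         # Enforce this rule on the Tier-1 gateway (SR) rather than on vNICs
         "scope": ["/infra/tier-1s/Tier-1-gateway-01"]},
        {"display_name": "default-deny", "action": "DROP",
         "sequence_number": 20,
         "source_groups": ["ANY"], "destination_groups": ["ANY"],
         "services": ["ANY"],
         "scope": ["/infra/tier-1s/Tier-1-gateway-01"]},
    ],
}

resp = requests.patch(
    f"{NSX_MGR}/policy/api/v1/infra/domains/default/gateway-policies/tenant-1-perimeter",
    json=gw_policy, auth=AUTH, verify=False,
)
resp.raise_for_status()
print("Gateway firewall policy published")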
Implementation

Gateway firewall is an optional centralized firewall implemented on


NSX-T Tier-0 gateway uplinks and Tier-1 gateway links. This is
implemented on a Tier-0/1 SR component which is hosted on NSX-T
Edge. Tier-0 Gateway firewall supports stateful firewalling only with
active/standby HA mode. It can also be enabled in an active/active
mode, though it will be only working in stateless mode. Gateway
firewall uses a similar model as DFW for defining policy, and NSX-T
grouping construct can be used as well. Gateway firewall policy rules
are organized using one or more policy sections in the firewall table for
each Tier-0 and Tier-1 Gateway.
Deployment Scenarios

This section provides two examples for possible deployment and data
path implementation.
Gateway FW as Perimeter FW at Virtual and Physical Boundary The
Tier-0 Gateway firewall is used as perimeter firewall between physical
and virtual domains. This is mainly used for N-S traffic from the
virtualized environment to physical world. In this case, the Tier-0 SR
component which resides on the Edge node enforces the firewall policy
before traffic enters or leaves the NSX-T virtual environment. The E-W
traffic continues to leverage the distributed routing and firewalling
capability which NSX-T natively provides in the hypervisor.
Tier-0 Gateway Firewall – Virtual-to-Physical Boundary
Gateway FW as Inter-tenant FW

The Tier-1 Gateway firewall is used as an inter-tenant firewall within an NSX-T
virtual domain. It is used to define policies between different tenants that
reside within an NSX-T environment. This firewall is enforced for the traffic
leaving the Tier-1 gateway and uses the Tier-1 SR component, which resides on
the Edge node, to enforce the firewall policy before sending traffic to the Tier-0
Gateway for further processing. The intra-tenant traffic continues to leverage
the distributed routing and firewalling capabilities native to NSX-T.
Tier-1 Gateway Firewall - Inter-tenant
Gateway FW with NGFW Service Insertion – As perimeter or Inter Tenant Service

This deployment scenario extends the Gateway Firewall scenarios depicted above
with additional capability to insert the NGFW on top of native firewall capability
NSX-T Gateway Firewall provides.

This is applicable for the design where security compliance requirements mandate
zone or group of workloads need to be secured using NGFW, for example, DMZ or
PCI zones or Multi-Tenant environments.

The service insertion can be enabled per Gateway for both Tier-0 and Tier-1
Gateway depending on the scenario.
As a best practice Gateway firewall policy can be leveraged as the first
level of defense to allow traffic based on L3/L4 policy. And leverage
partner service as the second level defense by defining policy on
Gateway firewall to redirect the traffic which needs to be inspected by
NGFW. This will optimize the NGFW performance and throughput.
The following diagram provides the logical representation of overall
deployment scenario. Please refer to NSX-T interoperability matrix to
check certified partners for the given use case.
Gateway Firewall – Service Insertion
Endpoint Protection with NSX-T

NSX-T provides the Endpoint Protection platform to allow 3rd party


partners to run agentless Anti-Virus/Anti-Malware (AV/AM)
capabilities for virtualized workloads on ESXi. Traditional AV/AM
services require agents be run inside the guest operating system of a
virtual workload. These agents can consume small amounts of
resources for each workload on an ESXi host. In the case of Horizon,
VDI desktop hosts typically attempt to achieve high consolidation ratios
on the ESXi host, providing 10s to 100s of desktops per ESXi host.
 With each AV/AM agent inside the virtualized workload consuming a
small amount of virtual CPU and memory, the resource costs can be
noticeable and possibly reduce the overall number of virtual desktops
an ESXi host can accommodate, thus increasing the size and cost of the
overall VDI deployment. The Guest Introspection platform allows the
AV/AM partner to remove their agent from the virtual workload and
provide the same services using a Service Virtual Machine (SVM) that is
installed on each host. These SVMs consume much less virtual CPU
and memory overall than running agents on every workload on the
ESXi host.
The Endpoint Protection platform for NSX-T follows a simple three-step
process:
Registration
Registration of the VMware Partner console with NSX-T and vCenter.
Deployment

Create a Service Deployment of the VMware Partner SVM and deploy it
to the ESXi clusters. The SVMs require a management network with
which to talk to the Partner Management Console. This can be handled
by an IP Pool in NSX-T or by DHCP from the network. Management networks
must be on a VSS or VDS switch.
Consumption
Consumption of the Endpoint Protection platform consists of creating a
Service Profile, which references the Service Deployment, and then creating an
Endpoint Protection Policy with an Endpoint Rule that specifies which
Service Profile should be applied to which NSX-T Group of virtual machines.
NSX-T Load Balancer
 A load-balancer defines a virtual service, or virtual server, identified by a virtual
IP address (VIP) and a UDP/TCP port.

 This virtual server offers an external representation of an application while


decoupling it from its physical implementation: traffic received by the load
balancer can be distributed to other network-attached devices that will perform
the service as if it was handled by the virtual server itself.

This model is popular as it provides benefits for application scale-out and


high-availability:
 Application scale-out:
The following diagram represents traffic sent by users to the VIP of a virtual
server, running on a load balancer. This traffic is distributed across the members
of a pre-defined pool of capacity.
Load Balancing offers application scale-out

The server pool can include an arbitrary mix of physical servers, VMs or containers that
together, allow scaling out the application.
Application high-availability

Application high-availability: The load balancer is also tracking the health of


the servers and can transparently remove a failing server from the pool,
redistributing the traffic it was handling to the other members:

Modern applications are often built around advanced load balancing capabilities,
which go far beyond the initial benefits of scale and availability. In the example
below, the load balancer selects different target servers based on the URL of the
requests received at the VIP:
Load Balancing offers advanced application load balancing

Thanks to its native capabilities, modern applications can be deployed in NSX-T


without requiring any third party physical or virtual load balancer. The next
sections in this part describe the architecture of the NSX-T load balancer and its
deployment modes.
NSX-T Load Balancing Architecture

In order to make its adoption straightforward, the different constructs


associated to the NSX-T load balancer have been kept similar to those of a
physical load balancer. The following diagram show a logical view of those
components.
NSX-T Load Balancing main components
Load Balancer

The NSX-T load balancer is running on a Tier-1 gateway. The arrows in the above
diagram represent a dependency: the two load balancers LB1 and LB2 are
respectively attached to the Tier-1 gateways 1 and 2.

Load balancers can only be attached to Tier-1 gateways (not Tier-0 gateways), and
one Tier-1 gateway can only have one load balancer attached to it.
Virtual Server
 On a load balancer, the user can define one or more virtual servers (the
maximum number depends on the load balancer form factor – see the NSX-T
Administrator Guide for load balancer scale information).

 As mentioned earlier, a virtual server is defined by a VIP and a TCP/UDP


port number, for example IP: 20.20.20.20 TCP port 80. The diagram
represents four virtual servers VS1, VS2, VS5 and VS6.

 A virtual server can have basic or advanced load balancing options such as
forward specific client requests to specific pools, or redirect them to external
sites, or even block them.
Pool
 A pool is a construct grouping servers hosting the same application.
Grouping can be configured using server IP addresses or for more
flexibility using Groups.

 NSX-T provides advanced load balancing rules that allow a virtual


server to forward traffic to multiple pools.

 In the above diagram for example, virtual server VS2 could load
balance image requests to Pool2, while directing other requests to
Pool3.
Monitor
 A monitor defines how the load balancer tests application availability.
Those tests can range from basic ICMP requests to matching patterns
in complex HTTPS queries.

 The health of the individual pool members is then validated according


to a simple check (server replied), or more advanced ones, like
checking whether a web page response contains a specific string.

 Monitors are specified by pools: a single pool can use only 1 monitor,
but the same monitor can be used by different Pools.
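To tie these constructs together, the sketch below creates a monitor, a pool, an LB service attached to a Tier-1 gateway, and a virtual server through the Policy API. Object IDs, the VIP, pool member addresses, and the default application profile path are all assumptions for illustration.

# Sketch: NSX-T load balancer objects (monitor, pool, LB service, virtual
# server) created via the Policy API.  All IDs, IPs, and the application
# profile path are assumptions.
import requests

NSX_MGR = "https://nsx-mgr.lab.local"
AUTH = ("admin", "VMware1!VMware1!")
P = f"{NSX_MGR}/policy/api/v1/infra"


def patch(path, body):
    r = requests.patch(f"{P}{path}", json=body, auth=AUTH, verify=False)
    r.raise_for_status()


# Monitor: basic HTTP health check used by the pool
patch("/lb-monitor-profiles/web-http-monitor",
      {"resource_type": "LBHttpMonitorProfile", "monitor_port": 80})

# Pool: the web servers behind the virtual server
patch("/lb-pools/web-pool",
      {"resource_type": "LBPool",
       "members": [{"ip_address": "172.16.10.11", "port": "80"},
                   {"ip_address": "172.16.10.12", "port": "80"}],
       "active_monitor_paths": ["/infra/lb-monitor-profiles/web-http-monitor"]})

# LB service: the load balancer instance, attached to a Tier-1 gateway
patch("/lb-services/lb-01",
      {"resource_type": "LBService", "size": "SMALL",
       "connectivity_path": "/infra/tier-1s/Tier-1-gateway-01"})

# Virtual server: VIP 20.20.20.20:80 forwarding to the pool
patch("/lb-virtual-servers/web-vip",
      {"resource_type": "LBVirtualServer",
       "ip_address": "20.20.20.20", "ports": ["80"],
       "pool_path": "/infra/lb-pools/web-pool",
       "lb_service_path": "/infra/lb-services/lb-01",
       "application_profile_path": "/infra/lb-app-profiles/default-http-lb-app-profile"})

print("Load balancer configured")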
Lab
NSX-T Load Balancing deployment modes

NSX-T load balancer is flexible and can be installed in either traditional in-
line or one-arm topologies. This section goes over each of those options and
examine their traffic patterns.

In-line load balancing: In in-line load balancing mode, the clients and the
pool servers are on different sides of the load balancer. In the design below,
the clients are on the Tier-1 uplink side, and the servers are on the Tier-1
downlink side:
