LP 1005
Scott Lorditch
This paper presents recommended topologies and configurations for the layer 2 portion of a
network topology which includes Lenovo® switches with CNOS. It includes discussion of
connectivity to upstream routed environments and downstream servers.
This document is intended for network architects and engineers who work for end-user
organizations or for Lenovo and its business partners. The document will be most useful for
those who have experience working on network design and implementation and are familiar
with Ethernet and its associated protocols.
At Lenovo Press, we bring together experts to produce technical publications around topics of
importance to you, providing information and best practices for using Lenovo products and
solutions to solve IT challenges.
See a list of our most recent publications at the Lenovo Press web site:
http://lenovopress.com
Do you have the latest version? We update our papers from time to time, so check
whether you have the latest version of this document by clicking the Check for Updates
button on the front page of the PDF. Pressing this button will take you to a web page that
will tell you if you are reading the latest version of the document and give you a link to the
latest if needed. While you’re there, you can also sign up to get notified via email whenever
we make an update.
Contents
Introduction
Virtual Link Aggregation Group considerations
Layer 2 failover
IGMP Snooping considerations
Link aggregation
Spanning Tree Protocol
Storm Control considerations
BootP and DHCP relay
Change history
Author
Notices
Trademarks
2 Lenovo Networking Best Practices for CNOS: Layer 2 Design and Configuration
Introduction
This paper describes the best practices for designing the topology for the Layer 2 portion of a
network.
Layer 2 is the switching layer of a network. In this paper we cover networks built on Ethernet,
where forwarding decisions are based upon Ethernet MAC addresses. Key aspects of
Layer 2 include redundant and parallel links (aggregations), and dealing with topologies
which include loops which can result in broadcast storms (spanning tree and related
techniques).
The vast majority of the practices and techniques in this paper are readily applicable to
networks which contain other vendors’ products in addition to those from Lenovo.
The paper is part of a series of papers that cover the design of Layer 1 (physical), Layer 2
(switching) and Layer 3 (routing) aspects of a network. The other two papers are available
from these web pages:
Layer 1 paper: Lenovo Networking Best Practices for CNOS: Network Design and
Topologies, https://lenovopress.com/lp1068
Layer 3 paper: Lenovo Networking Best Practices for CNOS: Layer 3 Technology,
https://lenovopress.com/lp1087
Introduction to vLAG
vLAG is a feature on Lenovo Networking switches that allows a pair of switches to act as a
single endpoint for aggregation and is similar in function to Cisco Virtual PortChannel (vPC),
Juniper MC-LAG, and others. There is no IEEE or IETF standard that defines this function. It
provides improved high availability compared to a single switch that is acting as an endpoint
and can enhance performance by splitting loads across the aggregated links and switches.
For more information about configuring the vLAG feature, see the Application Guide for your
product.
Figure 1 Example of a simple Link Aggregation (diagram: Traditional LAG, with Device 1 and Device 2 connected by a single LAG)
vLAG is similar to Cisco vPC or Juniper MC-LAG. Lenovo Networking vLAG is a form of
multi-chassis (multi-switch) aggregation. vLAG is not a form of aggregation in its own right;
instead, it is an enhancement to current aggregation standards. vLAG attempts to
overcome a shortcoming of standards-based aggregations.
By current standards definitions, an aggregation can connect two devices only (see
“Traditional LAG” in Figure 1). In traditional LAG, if one device on either end of the
aggregation fails, the entire path is gone. By using vLAG, you can take a pair of switches
and make them logically act as a single switch for aggregation purposes (see the split
LAG example in Figure 2). This example splits the aggregation on the end that is running
vLAG (or both sides if the other side is also running some form of split LAG); therefore, a
single switch failure in that pair does not take down the entire path.
Figure 2 Example of cross switch aggregation with vLAG (diagram: Split LAG, also known as vLAG, vPC, etc, with Device 1a and Device 1b aggregated to Device 2a and Device 2b)
It is helpful to first understand how packet flow works without local preference. Note that
local preference is the default behavior of vLAG on CNOS software.
Without local preference, a packet on one switch in the vLAG pair that needs to get
upstream uses normal aggregation hashing. It might use the other switch in the pair to get
to the upstream network, which adds an unnecessary hop across the ISL before the packet
heads upstream (the ISL connects the physical pair to help form the virtual pair). An
example of this extra hop is shown in Figure 3.
Figure 3 Without local preference, packets might take the ISL, even if local links are up (diagram: VM1, 10.10.10.1, on an ESXi server whose vSwitch1 uses ESX route based on originating virtual port ID; uplinks vmnic0 and vmnic1 connect to Switch-1 and Switch-2, which are joined by an ISL and share a vLAG aggregation upstream)
With local preference, if a packet on a switch in the pair needs to get to the upstream
network and that switch has any links in the common cross-chassis aggregation toward
the destination that are up, the switch always prefers the local links to send upstream
and does not add that extra hop. An example of this operation is shown in Figure 4 on
page 6.
To offer local preference, both vLAG peers should have the same MAC entries in
their mac-address tables at all times. This is done by an FDB sync mechanism that
works as follows:
– The MAC entries locally learned on one switch over a vLAG port-channel are
synchronized on its peer over the port-channel from the corresponding instance. In this
case, FDB synchronization only works when the vLAG instance is formed.
– The MAC entries locally learned on one switch over a non-vLAG port are synchronized
on its peer over the ISL port-channel. In this case, FDB synchronization happens as
long as there is one vLAG instance formed.
FDB synchronization also applies to MAC entries that are aged or purged. In addition,
when a user manually clears the mac-address table on one of the vLAG switches, only the
locally learned addresses are deleted; the ones learned by synchronization remain in the
FDB table. For example, if the user clears the mac-address table on the vLAG Primary
switch, all addresses locally learned on the Primary are deleted, along with the
corresponding entries that the Secondary learned from the Primary by synchronization.
The MAC entries locally learned by the Secondary (and hence learned by the Primary by
synchronization) remain on both switches.
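The synchronization and clear behavior described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: the class `VlagSwitch`, its methods, and the "local"/"synced" labels are hypothetical names chosen for illustration, not CNOS internals, and the model ignores aging, VLANs, and the 32K entry limit.

```python
from dataclasses import dataclass, field


@dataclass
class VlagSwitch:
    """Toy model of one vLAG peer's FDB (hypothetical, not CNOS code)."""
    name: str
    # MAC -> "local" (learned on this switch) or "synced" (learned from the peer)
    fdb: dict = field(default_factory=dict)
    peer: "VlagSwitch | None" = None

    def learn(self, mac: str) -> None:
        """Learn a MAC locally and synchronize it to the peer."""
        self.fdb[mac] = "local"
        if self.peer is not None:
            self.peer.fdb[mac] = "synced"

    def clear_mac_table(self) -> None:
        """Delete locally learned entries; the peer drops its synced copies."""
        local_macs = [m for m, origin in self.fdb.items() if origin == "local"]
        for mac in local_macs:
            del self.fdb[mac]
            if self.peer is not None and self.peer.fdb.get(mac) == "synced":
                del self.peer.fdb[mac]


def make_pair():
    """Build a Primary/Secondary pair that points at each other."""
    primary, secondary = VlagSwitch("primary"), VlagSwitch("secondary")
    primary.peer, secondary.peer = secondary, primary
    return primary, secondary
```

For example, if the Primary learns one address, the Secondary learns another, and the table is then cleared on the Primary, only the address learned by the Secondary survives, and it survives on both switches.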
When the ISL comes up, each vLAG switch installs its peer’s MAC address as a static
mac-address entry in the FDB for the ISL port-channel and each VLAN it is a member of,
in order to avoid unnecessary flooding of traffic towards the peer switch.
In the current implementation, the vLAG feature can synchronize a maximum of 32K MAC
entries.
vLAG does not synchronize static MAC addresses; static addresses that are manually
configured on one side should therefore be configured accordingly on the other peer.
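Figure 4 With local preference, local links take precedence over the hash (diagram: same topology as Figure 3, with VM1, 10.10.10.1, on an ESXi server whose vSwitch1 uses ESX route based on originating virtual port ID, uplinks vmnic0 and vmnic1 to Switch-1 and Switch-2, an ISL between the switches, and a vLAG aggregation upstream)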
In addition to reducing hops, local preference also reduces the overall load on the ISL
between the pair of vLAGed switches, which might permit having a smaller ISL than if the
local preference is not used. Local preference is built into vLAG and cannot be disabled.
All versions of vLAG support local preference, but not all vendors support it in their
implementations of cross-chassis aggregation technologies.
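The difference between hashing with and without local preference can be sketched as follows. This is an illustrative model only: the function `pick_egress`, its parameters, and the use of CRC32 as the hash are assumptions made for the example; real switches hash on packet header fields using their own algorithms.

```python
import zlib


def pick_egress(flow_key, local_up, peer_up, local_preference=True):
    """Return (switch, link) chosen for an upstream-bound flow, or None.

    local_up / peer_up list the vLAG member links that are up on this
    switch and on its peer. All names here are hypothetical.
    """
    if local_preference and local_up:
        # Local preference: only local member links are candidates,
        # so the flow never crosses the ISL while a local link is up.
        candidates = [("local", link) for link in local_up]
    else:
        # Plain aggregation hashing over every member link: the flow
        # may hash onto a peer link and take the extra hop over the ISL.
        candidates = [("local", link) for link in local_up]
        candidates += [("peer", link) for link in peer_up]
    if not candidates:
        return None
    # CRC32 is a stand-in for the switch's real hashing function.
    index = zlib.crc32(flow_key.encode()) % len(candidates)
    return candidates[index]
```

With local preference, the ISL is used for upstream traffic only when the switch has no local member link that is up, which matches the fallback branch in the sketch.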
Switch-independent mode teaming and switch-dependent mode teaming (which is also
known as bonding in some operating systems) are methods of combining NICs in the
server to increase performance and high availability. Some of these teaming modes can
affect how the switches upstream of the server must be configured and how traffic flows
through a vLAGed environment. Switch-independent mode teaming does not require
configuration on the switch (except for allowing the specified VLANs). Switch-dependent
mode teaming requires a form of special configuration on the switch (in the form of
aggregation) to interoperate with the teaming mode on the server.
For more information about teaming modes and their interaction and operation, see An
Introduction to NIC Teaming with Lenovo Networking Switches, which is available from:
https://lenovopress.com/redp5245
The remaining parts of this section describe the examples that are shown in Figure 5 on
page 7 with switch-independent mode teaming, and Figure 6 on page 9 with
switch-dependent mode teaming to help show how packets can flow in an environment that
features vLAG.
Packet flow with switch-independent mode teaming
The example that is shown in Figure 5 is described in this section, via step-by-step numbered
packet flows.
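Figure 5 Possible packet flows with a host that uses a non-aggregation form of teaming (diagram: VM1, 10.10.10.1, on vSwitch1 with teaming that does not use aggregation, ESX route based on originating virtual port ID; uplinks vmnic0 and vmnic1 connect to Access Sw-1 and Access Sw-2, which are joined by an ISL and connect upstream through a vLAG or vPC aggregation; numbered callouts mark the packet flow steps)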
send the packet directly to the remote host that is using aggregation and it comes in on
eth0 on the remote host.
4. Remote host responds. As far as the remote host is concerned, either interface (eth0
and eth1, part of a bond mode 4 [LACP aggregation] interface named bond0) is as usable
as the other, and it uses its own hash to decide the return path. Assume that it picked eth1
for the response, so the return packet is sent to Aggregation Switch #2. If this switch also
has local preference enabled, it can choose to send that response out of its link to
Access Switch #1 or #2. It will use its own hash to decide.
5. If Aggregation Switch #2 forwards the packet to Access Switch #1, the same interface
(vmnic0) back into the host will be used. Aggregation Switch #2 can also forward the
packet to Access Switch #2.
6. VM1 has no active interface on Access Switch #2, and VM1’s MAC is known only on the
server-facing port on Switch #1, not on Switch #2. If the VLAN that is used is carried
across the ISL (which it should be), the MAC table for Switch #2 also has the MAC for
VM1 point to the ISL, and in this case, the return packet must cross the ISL to get back to
VM1.
7. The same situation might occur if the remote host decided to respond on eth0 instead and
sent it toward Aggregation Switch #1.
Figure 5 on page 7 shows how the possible outbound packets might flow north (items 1 - 3),
with this switch-independent mode teaming design, and the possible return paths the packets
might take are shown in items 4 - 9. (It is assumed that all links are up.) All possible paths
also are shown, but it is assumed that only one of these paths is taken.
Consider the following points when switch independent mode teaming with vLAG is used:
There are designs within which normal production traffic can use the ISL (not only
low-level vLAG traffic and failover traffic). Figure 5 on page 7 shows this ISL path usage
on the return traffic.
You must carry on the ISL all VLANs that go down to the hosts and uplinks in a common
aggregation. If you do not carry all VLANs on the ISL, a packet that gets back to
Switch #2 is at best discarded; otherwise it is flooded to all ports, and it is unlikely to
reach its intended destination.
The Aggregation switches that are shown in Figure 5 on page 7 do not know that there are
potentially two separate switches below them. Instead, the two switches below appear to
be a single switch and any port is as good as another to forward traffic from the
Aggregation switches to the access switches. Therefore, the network administrators must
ensure that the proper VLAN paths exist when vLAG and similar technologies are used.
During normal operation (all links are in an up state), only returning inbound traffic to the
local host might need to use the ISL in this design (depending on what path the ToR used).
Outbound traffic from the local host in this design always uses the local uplinks to get out
(based on local preference) and does not cross the ISL.
Although this path is not shown in this example, the ISL might be used in other normal
circumstances. For example, two VMs on different ESX hosts in this same Flex
System™ chassis might hash to different I/O modules for outbound traffic (specific to the
use of a switch-independent mode of teaming, such as the VMware route that is based on
the originating virtual port ID). In that case, any communication between those two VMs
goes over the ISL links.
Packet flow with switch-dependent mode teaming
The next example shows the hop-by-hop packet flow between a host that is running
switch-dependent mode teaming in the Flex System chassis and the aggregated host at the
top of the design. The numbered steps correspond to the numbers that are shown in Figure 6.
Figure 6 Possible packet flows with host that uses an aggregation form of teaming (diagram: remote host 10.10.10.2 with Eth0 and Eth1 in Bond0, mode 4, connected by a LAG to the aggregation switches; a vLAG or vPC aggregation between the aggregation layer and Access Sw-1 and Access Sw-2, which are joined by an ISL; a vLAG aggregation down to vmnic0 and vmnic1 on vSwitch1, ESX route based on IP hash; numbered callouts mark the packet flow steps)
4. The remote host responds. As in the previous example, for the remote host, both
interfaces are usable and the remote host uses its own hash to decide the return NIC (as
with all aggregation paths, it picks one path for a specific conversation).
5. The return packet from the remote host might be sent via either of the two Aggregation
switches; the NIC teaming configuration on the remote host will choose which one is used.
The selected Aggregation switch will then forward the packet to one of the two Access
switches, depending on its hashing configuration.
6. If the response packet is forwarded via Access switch #1, that switch will then forward it to
the originating server (ESXi server at the bottom of Figure 6 on page 9) and the server will
receive it on port vmnic0, the same port from which the original request was sent. If the
response is forwarded via Access switch #2, it will ultimately arrive at the server on port
vmnic1. Because the ESXi server is configured to load balance both sent and received
traffic, this is an acceptable route and will result in the successful delivery of the packet.
7. Figure 6 on page 9 shows how the possible outbound packets might flow north (items 1 -
3) with this switch-dependent and aggregation mode of host teaming. The possible return
paths the packets might take are shown in items 4 - 6. This scenario assumes that all links
are up. It also shows all possible paths but assumes that only one of these paths is taken.
The example as shown in Figure 6 on page 9 also shows the following important items when
switch dependent modes of teaming are used:
Assuming that all local hosts are in aggregation, the ISLs in this design are used for
low-level vLAG traffic and fault recovery traffic only (that is, one of the access switches
loses all its uplinks); however, normal traffic does not need to use the ISL.
You still must carry all VLANs on the ISL that also go down to the hosts to account for
failover scenarios.
This design is potentially more efficient for return traffic than the design in Figure 5 on
page 7 because it can reduce hops for returning packets and reduce the potential load on
the ISL.
Although not shown in this example, any time different VMs on different ESX hosts inside
this same Flex System chassis communicate with each other, that traffic never needs to
cross the ISL (because local preference keeps traffic that is local to a switch in the pair) in
normal conditions. This configuration is another improvement of network utilization over
the switch-independent mode of teaming.
Although this example shows that switch-dependent teaming operation is more efficient
than switch independent teaming operation, this more efficient design is not always the
best choice. Some examples where it might not be a better choice include the following
circumstances:
– For local hosts that are not using any form of teaming, using a switch-independent
mode of teaming, or are single-homed into one of the pairs of switches only, you
cannot use this type of aggregated connection option to the server.
– For environments that do not support vLAG, stacking, or some other form of
multi-chassis aggregation to the embedded hosts in the Flex System chassis, you
cannot use this design (not all forms of stacking support local preference, but stacking
provides a multi-chassis aggregation, which is beyond the scope of this section).
– For environments that require LACP to the host and where some form of
communications must take place to the local host before the operating system fully
starts so LACP can start (for example, a fresh installation or bare-metal Preboot
Execution Environment [PXE] boot), you cannot use this design (the access switches
do not pass traffic on a vLAGed LACP port if LACP is not formed).
This issue also is a potential issue with the use of static aggregations to hosts that
need PXE boot because before the operating system is loaded, the switch thinks it is
one common static aggregation to the host. However, the host (before the operating
system load) does not know the switch is in a static aggregation and is expecting any
response to come back on the single interface on which it is sending out.
The reason connecting vLAGed pairs must use a unique Tier ID is that this Tier ID is used to
generate a common shared MAC so that the other side of the vLAG pair thinks it is
aggregating with a single device. If two pairs of vLAGed switches (four switches total) are
connected with the same Tier ID, they attempt to use the same MAC, which does not work.
This Tier ID generated MAC is derived from a base address in a Lenovo reserved range of
08:17:f4:c3:dd:00 - 08:17:f4:c3:de:ff.
The last two bytes are determined by the vLAG Tier ID setting, as shown in Figure 7.
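Two checks follow directly from this description and can be sketched in a few lines: whether a MAC address falls inside the reserved range quoted above, and whether any Tier ID is reused among vLAG pairs that connect to each other. The function names and the input dictionary are hypothetical; the exact mapping from Tier ID to MAC address is deliberately not modeled here.

```python
# Reserved range quoted in the text above.
RANGE_START = 0x0817F4C3DD00  # 08:17:f4:c3:dd:00
RANGE_END = 0x0817F4C3DEFF    # 08:17:f4:c3:de:ff


def mac_to_int(mac: str) -> int:
    """Convert a colon-separated MAC address to an integer."""
    return int(mac.replace(":", ""), 16)


def in_vlag_tier_range(mac: str) -> bool:
    """True if the MAC lies in the range used for Tier ID generated MACs."""
    return RANGE_START <= mac_to_int(mac) <= RANGE_END


def duplicate_tier_ids(pairs: dict) -> set:
    """Return Tier IDs used by more than one vLAG pair.

    `pairs` maps a vLAG pair name to its Tier ID and is assumed to list
    only pairs that are connected to each other.
    """
    seen, dupes = set(), set()
    for tier_id in pairs.values():
        if tier_id in seen:
            dupes.add(tier_id)
        else:
            seen.add(tier_id)
    return dupes
```

A non-empty result from `duplicate_tier_ids` indicates connected pairs that would attempt to present the same shared MAC.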
To show the MAC address that is generated by the Tier ID, use the show vlag information
command, as shown in Figure 8.
Figure 8 Output of “show vlag information” command with generated MAC address
Mis-Match Information:
               Local                       Peer
-------------+---------------------------+-----------------------
Match Result : Match                       Match
Tier ID      : 1                           1
System Type  : G8272                       G8272
OS Version   : 10.8.x.x                    10.8.x.x
Role Information:
               Local                       Peer
-------------+---------------------------+-----------------------
Admin Role   : Primary                     Secondary
Oper Role    : Primary                     Secondary
Priority     : 100                         200
System MAC   : a8:97:dc:dd:ed:01           a8:97:dc:f7:d5:01
This MAC address is not used by Virtual Router Redundancy Protocol (VRRP) or
communications to end hosts (it is used only to form cross-switch aggregations and to
present a common MAC between the vLAGed pair to the other side of the cross-switch
aggregation).
If it is using non-vLAG LACP aggregation, the local aggregation uses a MAC from the base
system MACs that are available on a switch (as displayed by the show sys-info command) to
form the aggregation.
Importance of a proper health check network with vLAG
An optional, but strongly recommended, configuration for vLAG is a health check network.
Without a health check network configured, the ISL is the only way a vLAGed pair knows that
the other switch in the pair is up and in what state. If the ISL goes down without a separate
health check network configured, the switches do not know whether the other switch is down
or only the ISL path is down. In this state, both switches go active forwarding without knowing
what the other switch is doing, and this situation can cause issues with connectivity for
attaching devices.
That is why, when a user enables vLAG globally without a health check configured, the user
is alerted to the importance of using a health check connection between the vLAG peers, as
shown in Figure 9.
Figure 9 Error message from “vlag enable” command when ISL is not operational
DUT1(config)#vlag enable
2018-06-08T13:30:45+02:00 DUT1(cnos:default) %VLAG-4-ENA_WITHOUT_HC_CFG: vLAG health check should be configured to avoid black holing packets when ISL fails
To prevent this situation, an optional health check network can be configured. Owing to the
critical function the health check configuration provides, it should be considered mandatory
for any production environment.
Figure 10 IPv6 addressing displayed and configured for vLAG Health Check
These IP interfaces can be shared for a health check and switch management, but a
dedicated health check IP interface is preferred.
Do not configure the IP interfaces for a health check to use the ISL links to communicate
with each other. In this case, if the ISL goes down, so does the health check network and
the split brain vLAGed pair can still occur. A separate health check network is designed to
prevent this situation.
The IP interfaces that are used for this purpose cannot connect to each other over an
aggregation with vLAG. Because the split aggregation of a vLAG pair is considered a
single logical interface, any health check packet that goes out the vLAGed aggregation
cannot come back to the other switch with the same vLAG aggregation to complete the
health check path.
If the ISL is up, the health check network status has no effect on operation. The health
check network plays a part only if the ISL goes down. The following states of the
combination of the ISL and health check network status are possible:
– Health check is up and the ISL is up. The vLAG packet passing is 100% operational.
– Health check is down and the ISL is up. The vLAG packet passing is 100% operational,
but there is no protection from an ISL failure if the ISL goes down.
– Health check is down and the ISL is down. Split brain operation occurs and both
switches are forwarding. This failure of two components is considered a double fault,
which often is not considered when redundant designs are built.
– Health check is up and the ISL is down. This situation is why the health check network
is important. Consider the following points:
• In this case, the primary vLAG member keeps all links forwarding and the
secondary vLAG member error-disables all aggregations that are configured to use
vLAG.
• This configuration is critical to ensure a stable environment when an ISL failure
occurs.
• When the ISL goes back up, the switches automatically return to normal vLAG
operation after a brief time to ensure stability.
Health check is also used in conjunction with the vLAG auto-recovery functionality. At
boot, if the ISL does not come up or fails, the vLAG switch determines the status of the
peer switch through the health check. If the peer is operational, the switch continues to
keep its vLAG ports in the err-disabled state. If the vLAG peer is not detected as
functional, the switch assumes the role of the primary and then transitions all of its vLAG
ports to the up state.
Other health check related parameters, shown in Figure 11, can be configured based on
the customer environment, but it is recommended to use the defaults: keepalive
messages are exchanged every 5 seconds, and the vLAG peer is considered down if the
local switch does not receive a reply to 3 consecutive hello messages, with a 30-second
retry interval.
DUT1(config)#vlag hlthchk?
  keepalive-attempts   Configure VLAG health check number of keepalive attempts
  keepalive-interval   Configure VLAG health check keepalive interval
  peer-ip              Configure VLAG health check peer-ip
  retry-interval       Configure VLAG health check connection retry interval
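The four combinations of ISL and health check status described in this section can be summarized as a small decision function. This is a sketch for reasoning about the states, not switch code; the function name `vlag_behavior` and the dictionary keys are hypothetical.

```python
def vlag_behavior(isl_up: bool, health_check_up: bool) -> dict:
    """Summarize vLAG pair behavior for one ISL / health check combination.

    The outcomes follow the four cases listed in the text; the field
    names are made up for this illustration.
    """
    if isl_up:
        # Health check status has no effect while the ISL is up, but
        # without it there is no protection against a later ISL failure.
        return {
            "primary_forwards": True,
            "secondary_forwards": True,
            "split_brain": False,
            "protected_from_isl_failure": health_check_up,
        }
    if health_check_up:
        # ISL down but the peer is known to be alive: the secondary
        # error-disables its vLAG aggregations, the primary keeps forwarding.
        return {
            "primary_forwards": True,
            "secondary_forwards": False,
            "split_brain": False,
            "protected_from_isl_failure": True,
        }
    # Double fault: neither switch can see the other, both keep forwarding.
    return {
        "primary_forwards": True,
        "secondary_forwards": True,
        "split_brain": True,
        "protected_from_isl_failure": False,
    }
```

Only the combination of ISL down and health check up changes forwarding on the secondary, which is why the health check network matters.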
ISL considerations
The ISL is a special and important component for a vLAG environment and has the following
unique requirements:
The ISL must be some form of aggregation (LACP or static). In general, LACP is
preferred, owing to the nature of LACP to protect from misconfiguration or miscabling.
However, static aggregations work for this purpose.
The ISL should always be a minimum of two physical links.
Single link aggregation can be configured and used, but it is not advised because it
becomes a single point of failure for the vLAG.
It is also possible to take a QSFP/QSFP28 port, set it to 4x10Gb/4x25G/2x50G mode, set
an aggregation across that, and use a single QSFP+/QSFP28-to-QSFP+/QSFP28 cable
to carry that aggregation. This configuration is also not advised because, although
logically it is four different 10/25/50Gb ports, it is using a single physical cable and that
single cable becomes a single point of failure.
Although it is possible to use 1 Gb ports for the ISL and that setting is okay for switches
that are primarily 1 Gb, use 10 Gb or higher bandwidth links for this ISL aggregation to
ensure proper performance if the switch is primarily a 10/25/40/50/100 Gb switch.
Sizing of the ISL path is not a simple topic because it depends on the number and speed
of uplinks and downlinks, the host bandwidth requirements, and how the hosts are
configured for teaming (as described in “Understanding packet flow in a vLAG
environment” on page 3).
One method for sizing the ISL is to make the ISL aggregation equal to the uplink
bandwidth that one switch of the vLAGed pair contributes to the vLAG aggregation. For
example, if a pair of vLAG switches each has 4 x 10 Gb uplinks that create an 8 x 10 Gb
vLAG aggregation that is headed upstream, create a 4 x 10 Gb ISL (50% of the total uplink
capacity). The logic is that if one side loses all uplinks, it has an equal size path over to the
partner switch of the vLAGed pair through the ISL. Consider the following points:
– This 50% might be considered excessive because you can saturate the uplinks of the
other switch, but it depends on how much bandwidth is normally in use.
– The numbers that are provided in this section are limited by the total number of
available uplinks for use for vLAGed aggregation uplinks and the ISL aggregation;
therefore, it might not be practical for every environment.
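The sizing method above is simple arithmetic and can be written as a short helper. This is a sketch of that one rule of thumb only, with hypothetical function and parameter names; it also applies the earlier recommendation that the ISL have at least two physical links, and it does not account for host teaming or actual traffic levels.

```python
import math


def isl_links_needed(uplinks_per_switch: int, uplink_gbps: float, isl_link_gbps: float) -> int:
    """Number of ISL links that match one switch's vLAG uplink bandwidth."""
    per_switch_gbps = uplinks_per_switch * uplink_gbps
    links = math.ceil(per_switch_gbps / isl_link_gbps)
    # The ISL should always be a minimum of two physical links.
    return max(2, links)


def isl_share_of_total(uplinks_per_switch: int, uplink_gbps: float, isl_gbps: float) -> float:
    """ISL bandwidth as a fraction of the pair's total vLAG uplink bandwidth."""
    total_gbps = 2 * uplinks_per_switch * uplink_gbps
    return isl_gbps / total_gbps
```

For the example in the text, `isl_links_needed(4, 10, 10)` gives 4 links, and a 40 Gb ISL is half of the 80 Gb total uplink capacity.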
The ISL is always forwarding, even if the root bridge is not one of the vLAG switches; in
that case, the user may see two root ports in the spanning tree information on the switch.
Note that suspend-individual prevents a member of an LACP port channel from forwarding
traffic when the LACP protocol fails and that port is suspended when it is placed in individual
mode.
A sample configuration fragment and the error message generated if the port-channel is not
properly configured are shown in Figure 12.
Once at least one vLAG instance goes to the Formed state, MAC learning on the ISL
port-channel is automatically disabled regardless of the user configuration. From then on,
all MAC entries that appear in the mac-address table as learned over the ISL port-channel
are added by vLAG FDB synchronization. The user is alerted to this fact through the syslog
output shown in Figure 13.
The ISL is also used at boot in conjunction with the vLAG start-up delay functionality.
When a vLAG switch reboots, all vLAG ports are kept in the err-disabled state during the
auto-recovery interval. The startup delay timer is started only when the ISL is Active; when
this timer expires, the vLAG ports are brought up.
In the current design, the frames received over ISL port-channel will never be forwarded to
vLAG port-channel but only to non-vLAG ports. The reason for doing this is to avoid packet
duplication.
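The forwarding restriction just described can be expressed as a filter on candidate egress ports. The sketch below models the rule exactly as stated and ignores failure handling; the function name, the port names, and the three port kinds ("isl", "vlag", "plain") are hypothetical labels for illustration.

```python
def egress_ports(ingress: str, ports: dict) -> list:
    """Candidate egress ports for a flooded frame.

    `ports` maps a port name to its kind: "isl", "vlag", or "plain".
    Frames that arrive over the ISL are never sent to vLAG port-channels,
    because the peer has already delivered them there.
    """
    from_isl = ports[ingress] == "isl"
    allowed = []
    for name, kind in ports.items():
        if name == ingress:
            continue  # never send a frame back out its ingress port
        if from_isl and kind == "vlag":
            continue  # avoid duplicating the frame on the vLAG
        allowed.append(name)
    return allowed
```

A frame received on a non-ISL port is still eligible for both the ISL and the vLAG port-channels; only ISL-received frames are restricted.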
If spanning tree is required, PVRST (default) and MSTP can be used on the vLAG pairs.
When you are looking at spanning tree (for example, by using the show span command)
on a vLAGed pair of switches and the other side of the vLAG aggregation is towards the
root, the root bridge appears as being on the uplinks of the vLAG primary switch of the
pair; however, the secondary switch appears as both on the uplinks and toward the ISL of
the other vLAGed switch. This display is not an indication of an issue. Instead, it is an
artifact of how vLAG works and it does not affect operation or forwarding and blocking.
Check the Application Guide for the switch model and code version in use to see whether
there are any specific limitations with vLAG for that release and model.
When you are using vLAG in a tiered design and the other side is another pair of vLAG
switches, vPC, or some form of cross chassis aggregation, an optimal design is to connect
at least one link between all four switches in the pair. This configuration is shown in
Figure 2 on page 4, where each of the lower switches has a connection to each of both of
the upper switches. The alternative is a design where the four links do not cross connect
as shown. Both designs work, but the crossed design is more robust during switch failure
events.
When you are upgrading a vLAGed pair of switches, consider the following guidelines:
– Upgrade both vLAG peers with the new OS image. Shut down all vLAG ports on the
Primary switch so that all traffic passes through the Secondary switch, and then reload
the Primary switch of the vLAG pair first. During that reload, the Secondary switch
that is running the older code becomes the primary. After the reloaded switch is fully
operational and forwarding, shut down all vLAG ports on the old Secondary switch so
that all traffic passes through the reloaded switch, and then reload the current primary
(the old Secondary). After this last switch is fully operational and forwarding, use the
show vlag information command to determine the vLAG election and the state of the
vLAG instances.
– When you are upgrading vLAGed pairs, it is important to have both switches running
the same version of code. Do not upgrade one switch in the pair and leave the
switches running for an extended time on different versions of code.
– During the upgrade process described above, a mismatch exists while one switch
runs the new OS image and the other runs the old one. This does not affect vLAG
functionality and all instances are formed, but a syslog message appears every
10 seconds to notify the user about the mismatch.
Some additional notes about vLAG:
The peer switches negotiate so that one switch is primary and one is secondary. This is
done by comparing the switches’ system MAC addresses; the switch with the lower MAC
wins. In the event of an ISL failure where both switches are still operational (not a switch
failure), the vLAG interfaces on the secondary switch are disabled. This will only work
properly if there is a health check in place, as discussed in “Importance of a proper health
check network with vLAG” on page 13.
The possible vLAG state values for a vLAG instance are:
– Formed – physical connections from both peer switches are up and carrying traffic.
– Local Up – physical connections from this switch are up but the corresponding
members of the instance on the peer switch are down. Traffic can transit the remote
switch over the ISL and still reach its destination.
– Remote Up – physical connections from this switch are down but the corresponding
members of the instance on the peer switch are up. Traffic can transit this switch via
the ISL and still reach its destination.
– Down – the instance is not operating and is not able to carry any traffic.
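The election rule and the instance states described above can be modeled in a few lines of Python. This is a minimal sketch, not switch code: the function names are hypothetical, and the election models only the MAC comparison described in the text (it ignores any configured priority).

```python
def elect_primary(mac_a: str, mac_b: str) -> str:
    """Return the MAC of the switch that becomes primary (the lower MAC wins)."""
    def as_int(mac: str) -> int:
        return int(mac.replace(":", ""), 16)
    return mac_a if as_int(mac_a) < as_int(mac_b) else mac_b


def instance_state(local_up: bool, peer_up: bool) -> str:
    """Map the link status on this switch and on its peer to a vLAG instance state."""
    if local_up and peer_up:
        return "Formed"
    if local_up:
        return "Local Up"
    if peer_up:
        return "Remote Up"
    return "Down"


print(elect_primary("a4:8c:db:ba:79:01", "a4:8c:db:ba:76:01"))
print(instance_state(local_up=False, peer_up=True))
```

The state names are the ones listed above; everything else in the sketch is illustrative.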
vLAG LACP misconfiguration/mis-cabling
In a vLAG topology, LACP misconfiguration or incorrect cabling can cause vLAG to form two
port-channels, which may lead to a network loop. The current vLAG implementation detects
this misconfiguration or cabling error and takes the following actions, shown in Figure 14:
Return a level 4 syslog message alerting the user about the error situation
Move the vLAG LACP port-channel members to suspended state
Figure 14 Diagnostic messages from mis-cabling that results in two distinct port channels
Once the invalid configuration/error in the topology is corrected, the ports that were
suspended due to vLAG misconfiguration will be re-enabled.
Note that the misconfiguration check is performed only on the vLAG secondary switch, so
the log message appears only there and not on the vLAG primary switch.
For static port-channels configured in a vLAG instance, the firmware cannot detect the
misconfiguration since there are no control packets exchanged between the switches. This is
one reason why LACP is preferred over static aggregations.
Each configuration parameter is classified with its associated priority level. Different actions
will be applied when an incompatibility is detected based upon the parameter’s priority.
Whenever a high priority parameter is detected as being inconsistent across vLAG peers, a
syslog message with a critical severity level is logged and the following actions are taken:
If the inconsistent parameter is global or related to the ISL, the Link Aggregation Groups
(LAGs) of all the vLAG instances on the vLAG secondary switch are placed in the
err-disabled state; if the inconsistent parameter is related to a vLAG instance, the LAG
associated with that instance is placed in the err-disabled state. After the inconsistency
is fixed, the affected LAG or LAGs are brought back up and all vLAG instances recover to a
formed state.
If the detected inconsistency refers to a low priority parameter, by default the switch will
only record a syslog message with a warning severity level for that inconsistency. To allow
the switch to perform the same actions as when dealing with high priority parameters,
strict consistency checking must be manually enabled.
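The decision logic described above can be summarized as a small Python function. This is a sketch under stated assumptions: the function name, the scope labels ("global", "isl", "instance") and the returned strings are hypothetical and only restate the rules in the text.

```python
def consistency_action(priority: str, scope: str, strict: bool = False) -> str:
    """Return the action taken when a parameter is inconsistent across vLAG peers.

    priority: "high" or "low" (the parameter's priority level)
    scope:    "global", "isl" or "instance" (what the parameter relates to)
    strict:   True if strict consistency checking has been enabled manually
    """
    if priority not in ("high", "low"):
        raise ValueError(f"unknown priority: {priority}")
    if scope not in ("global", "isl", "instance"):
        raise ValueError(f"unknown scope: {scope}")

    # Low priority parameters only produce a warning unless strict checking is on.
    if priority == "low" and not strict:
        return "log warning only"

    # High priority (or low priority with strict checking): err-disable LAGs.
    if scope in ("global", "isl"):
        return "err-disable the LAGs of all vLAG instances on the secondary switch"
    return "err-disable the LAG of the affected vLAG instance"


print(consistency_action("low", "instance"))
print(consistency_action("high", "isl"))
```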
18 Lenovo Networking Best Practices for CNOS: Layer 2 Design and Configuration
vLAG configuration consistency checking is enabled by default and can be disabled
manually. It must be enabled on both vLAG peer switches to work properly.
The monitored parameters, their priorities and the vLAG peer’s config consistency-check
results can be observed using the commands shown in Figure 15.
Mis-Match Information:
                    Local                       Peer
-------------+---------------------------+-----------------------
Match Result : Match                       Match
Tier ID      : 511                         511
System Type  : NE2572                      NE2572
OS Version   : 10.8.x.x                    10.8.x.x
Role Information:
                    Local                       Peer
-------------+---------------------------+-----------------------
Admin Role   : Secondary                   Primary
Oper Role    : Secondary                   Primary
Priority     : 200                         100
System MAC   : a4:8c:db:ba:76:01           a4:8c:db:ba:79:01
VLAG instance 1 : enabled
Instance Information
PCH        ifindex     State          Previous State    Cons Res
----------+-----------+--------------+-----------------+--------
1000       101000      Formed         Remote UP         pass
00 00 00 00
stp mode              high  pass  rapid-pvst            rapid-pvst
stp path cost         high  pass  long                  long
mst region name       high  pass  b8 10 61 e7 10 8c aa  b8 10 61 e7 10 8c aa
                                  de 12 61 64 77 ef a9  de 12 61 64 77 ef a9
                                  eb 3b                 eb 3b
mst region version    high  pass  0                     0
mst inst mapping      high  pass  ac 36 17 7f 50 28 3c  ac 36 17 7f 50 28 3c
                                  d4 b8 38 21 db ab 26  d4 b8 38 21 db ab 26
                                  de 62                 de 62
mst max-age           low   pass  20                    20
mst max-hops          low   pass  20                    20
mst hello time        low   pass  2                     2
mst forward time      low   pass  15                    15
access vlan           high  pass  1                     1
native vlan           high  pass  10                    10
allowed vlan          high  pass  ec df f3 65 df 4f 3d  ec df f3 65 df 4f 3d
                                  72 46 9d a2 3c aa 37  72 46 9d a2 3c aa 37
                                  b6 e6                 b6 e6
tag native            high  pass  none                  none
agg type              high  pass  lacp                  lacp
suspend-individual    low   pass  enable                enable
dot1q tunnel          high  pass  disable               disable
egress tagged vlans   high  pass  6e 6f 6e 65 00 00 00  6e 6f 6e 65 00 00 00
                                  00 00 00 00 00 00 00  00 00 00 00 00 00 00
                                  00 00                 00 00
private-vlan          high  pass  56 d7 9a ef 02 b4 75  56 d7 9a ef 02 b4 75
                                  74 1e f1 0f be 77 08  74 1e f1 0f be 77 08
                                  c7 fe                 c7 fe
stp port mode         high  pass  enable                enable
stp port Path cost    high  pass  auto                  auto
stp port Type         high  pass  disable               disable
stp bpdu filter       low   pass  disable               disable
stp BPDU guard        low   pass  disable               disable
stp port loop guard   low   pass  disable               disable
stp port root guard   low   pass  disable               disable
stp link type         low   pass  auto                  auto
stp port priority     low   pass  128                   128
mst port cost         high  pass  c7 19 f2 95 45 6b 84  c7 19 f2 95 45 6b 84
                                  1d 55 fc 6c cd 7e 81  1d 55 fc 6c cd 7e 81
                                  1e 8d                 1e 8d
mst port priority     low   pass  c7 19 f2 95 45 6b 84  c7 19 f2 95 45 6b 84
                                  1d 55 fc 6c cd 7e 81  1d 55 fc 6c cd 7e 81
                                  1e 8d                 1e 8d
As expected, the vLAG configuration consistency check works only when the ISL is active.
When the ISL is down, the two vLAG peers cannot synchronize the information, which is also
reflected in the output of the consistency check as shown in Figure 16.
mst region version    high  N/A   0                     N/A
mst inst mapping      high  N/A   -                     N/A
mst max-age           low   N/A   20                    N/A
mst max-hops          low   N/A   20                    N/A
mst hello time        low   N/A   2                     N/A
mst forward time      low   N/A   15                    N/A
vLAG Peer Gateway
There are multiple OS implementations that allow the servers to use the source MAC address
of a received packet as the destination MAC for the response packet instead of a gateway
MAC address. In a vLAG environment, this will result in the server using the physical MAC
address of the vLAG switch instead of the gateway MAC. It is desirable to use the gateway
MAC address, which usually is the VRRP MAC shared between both vLAG peer switches.
The vLAG Peer Gateway feature allows a vLAG switch to forward to its downstream vLAG
ports all received packets whose destination MAC (DMAC) address equals the MAC address of
the peer vLAG switch, instead of sending them to the other switch via the ISL. This allows
a vLAG switch to act as a gateway for traffic addressed to its vLAG peer by enabling local
forwarding of such traffic.
By default, the vLAG Peer Gateway is disabled. For it to function properly, it must be enabled
on both vLAG peers. The command that controls this function is [no] vlag peer-gateway.
Routing protocols can be running on each vLAG switch, but there is no mechanism to make
them work as one logical protocol peer. The L3 forwarding tables are not synchronized
between vLAG peers; therefore, the peers will likely have different L3 routes.
However, to enable the use of vLAG switches as a gateway, VRRP active-active mode is
supported. Both primary and secondary switches can forward the L3 traffic in the
given virtual router instance, independent of their VRRP state (master or backup virtual
router).
Static and dynamic routing protocols can run on vLAG switches but not on those VLANs
which are members of the L2 vLAG domain. Those VLANs should be enabled on the links
that connect to the L3 domain.
vLAG switches can work in two modes in a two-tier vLAG topology known as the 4xvLAG+VRRP
configuration:
vLAG VRRP Active (Full Active-Active HAA) mode – in this mode, all vLAG switches are in
IP active state. This means that all four switches install the Layer 3 routing entry
regardless of their virtual router role or whether their vLAG peer is the master or backup
virtual router. By default, vLAG VRRP works in this mode.
vLAG VRRP Passive (Half Active-Active HAA) mode – in this mode, a vLAG VRRP backup
switch checks whether its vLAG peer is the VRRP master. If neither the vLAG primary nor
the secondary is the VRRP master, the VRRP backup does not install L3 routing entries
on this switch. Only the VRRP master and its vLAG peer (a VRRP backup) are in IP active
state. If the VRRP master goes down, the newly elected VRRP master and its vLAG
peer are in IP active state.
This design is useful in an environment where there are two distinct facilities, one of which
is for backup or disaster recovery.
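The difference between the two modes can be expressed as a single predicate. The following Python sketch is illustrative only; the function name and the mode labels "active" and "passive" are hypothetical shorthand for the two modes described above.

```python
def is_ip_active(mode: str, is_vrrp_master: bool, peer_is_vrrp_master: bool) -> bool:
    """Return True if a vLAG switch installs L3 routing entries (IP active state)."""
    if mode == "active":
        # Full active-active: every switch is IP active regardless of VRRP role.
        return True
    if mode == "passive":
        # Half active-active: only the VRRP master and its vLAG peer are IP active.
        return is_vrrp_master or peer_is_vrrp_master
    raise ValueError(f"unknown mode: {mode}")


# A backup switch whose vLAG peer is also a backup, in each mode:
print(is_ip_active("active", False, False))
print(is_ip_active("passive", False, False))
```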
Layer 2 failover
Layer 2 failover on Lenovo switches works with NIC teaming on the servers to prevent a black
hole when all uplinks out of a switch go down. This feature is important in embedded
environments but also applies to stand-alone Lenovo switches. When the uplinks out of an
embedded switch go down, the server-facing ports normally are still up. In this situation,
teaming on the servers does not know that the path via the uplinks went down and that it
needs to fail over. When failover is enabled on the switch, the switch also shuts down the
server-facing ports when the monitored uplinks go down, which alerts NIC teaming that this
switch no longer has a path out and triggers it to fail over.
After the monitored uplinks are restored, failover automatically re-enables the server facing
ports, which informs NIC teaming that this path is available again. Failover is triggered not
only on the monitored uplinks being in a down state but also if the monitored uplinks are all in
a spanning-tree blocked state or disabled by LACP due to a mismatch. In other words, if the
monitored links are not able to carry traffic then failover would be triggered.
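The trigger condition can be modeled as follows. This is a minimal sketch, assuming hypothetical state labels for the monitored uplinks; it only restates the rule that server-facing ports stay up while at least one monitored uplink can carry traffic.

```python
# Hypothetical labels for the state of a monitored uplink.
FORWARDING = "forwarding"
DOWN = "down"
STP_BLOCKED = "stp-blocked"
LACP_SUSPENDED = "lacp-suspended"


def server_ports_enabled(uplink_states: list) -> bool:
    """Return True if the server-facing ports should be up.

    Failover is triggered (ports shut down) when none of the monitored
    uplinks is able to carry traffic.
    """
    return any(state == FORWARDING for state in uplink_states)


print(server_ports_enabled([DOWN, STP_BLOCKED]))
print(server_ports_enabled([DOWN, FORWARDING]))
```

In this model, an empty list of monitored uplinks also returns False; whether that matches the switch behavior is not stated in this paper.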
IGMP Snooping considerations
Consider the following points when you are implementing IGMP snooping on Lenovo
switches:
All Lenovo switches support IGMP V1, V2, and V3.
Some other features might limit the use of IGMP snooping. It is important to check the
Application Guide for the product and feature to ensure IGMP use is not restricted. For
example, consider the following points:
The maximum number of multicast group entries is currently 8191 when switches run
CNOS. If an environment includes multicast routers, they can be used to perform
keep-alive queries and keep multicast groups from expiring. If you are working in an
environment where only local multicast is being used (no mrouters), the IGMP snooping
querier feature of Lenovo switches can be enabled to provide this keep-alive query service
that is normally provided by an mrouter.
Link aggregation
This section provides information about link aggregation. Lenovo switches support static
(PortChannel) and dynamic (LACP) link aggregation modes. The key difference between
these modes is that static aggregation is unconditional and always in effect on ports where it
is configured. Dynamic aggregation uses an interactive protocol between both devices, which
helps protect against cabling errors and other errors that can cause unwanted effects.
Static PortChannel
This section describes Static PortChannel, which has the following form:
Ports in the same channel must have the same attributes in the following areas or the
channel does not form successfully:
VLAN membership, including native VLAN
Spanning Tree options
Bandwidth (ports with different bandwidths cannot be channeled together)
LACP channels are configured primarily on lists or ranges of ports, where the LACP key and
state are configured, and involve the following commands:
interface ethernet 1/<x>
channel-group <y> mode active|passive
The channel group number will be used as the local LACP key number and must be the same
on all ports to be put in a common aggregation. Mode active often is the preferred mode
when enabling LACP.
The purpose of these options is to allow the choice of an option that best provides even
allocation of traffic across the available aggregated links. The reason that aggregated links
might not deliver the same performance as a single link is that the traffic can be unevenly
allocated across the available links, which saturates some of them and leaves others nearly
idle.
Lenovo switches allow the hashing of traffic on the physical ports that are members of a port
channel using several different metrics. The command to accomplish this is:
port-channel load-balance Ethernet XXXX
where XXXX is one of the options shown in Figure 17 on page 27. These options enable
hashing based on fields at Layers 2-4 as well as on fields that apply only to FCoE traffic.
This command is entered in global configuration mode and applies to all port-channels; there
is not currently an option to use different hashing for different port channels.
#port-channel load-balance eth ?
destination-ip          Load distribution on the destination IP address
destination-mac         Load distribution on the destination MAC address
destination-port        Load distribution on the destination TCP/UDP port
fcoe-sid-did            Load distribution on the FCoE sid and did
fcoe-sid-did-oxid       Load distribution on the FCoE sid, did and oxid
fcoe-sid-did-oxid-rxid  Load distribution on the FCoE sid, did, oxid, and rxid
source-dest-ip          Load distribution on the source and destination IP address
source-dest-mac         Load distribution on the source and destination MAC address
source-dest-port        Load distribution on the source and destination TCP/UDP port
source-interface        Load distribution on the source ethernet interface
source-ip               Load distribution on the source IP address
source-mac              Load distribution on the source MAC address
source-port             Load distribution on the source TCP/UDP port
Changing the parameters on a switch influences only the packets that are egressing from that
switch. The device at the other end, which might be from a different vendor, needs a similar
configuration to ensure that traffic coming into a Lenovo switch is well-balanced across a link
aggregation group.
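To illustrate why the choice of hash fields matters, the following Python sketch distributes a set of flows across four links. The XOR-fold hash is a toy function assumed for illustration only; the hashing algorithm that the switch hardware actually uses is not described in this paper. Only the option names destination-ip and source-dest-ip come from the CLI; the function names and addresses are hypothetical.

```python
import ipaddress
from collections import Counter


def toy_hash(fields: bytes, n_links: int) -> int:
    """Toy hash: XOR all bytes together, then pick a link by modulo."""
    folded = 0
    for byte in fields:
        folded ^= byte
    return folded % n_links


def distribute(flows, key: str, n_links: int) -> dict:
    """Count how many flows land on each link for a given hash key."""
    counts = Counter()
    for src, dst in flows:
        src_bytes = ipaddress.ip_address(src).packed
        dst_bytes = ipaddress.ip_address(dst).packed
        if key == "destination-ip":
            fields = dst_bytes
        elif key == "source-dest-ip":
            fields = src_bytes + dst_bytes
        else:
            raise ValueError(f"key not modeled in this sketch: {key}")
        counts[toy_hash(fields, n_links)] += 1
    return dict(counts)


# Eight clients (10.0.0.1 - 10.0.0.8) sending to one server (10.0.1.1).
flows = [(f"10.0.0.{k}", "10.0.1.1") for k in range(1, 9)]
print(distribute(flows, "destination-ip", 4))
print(distribute(flows, "source-dest-ip", 4))
```

With a single destination, hashing on the destination IP address alone places every flow on the same link, while hashing on source and destination spreads the flows evenly in this toy model.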
LACP timers
The LACP timeout option can be set to long (30 seconds, which is the default setting), or
short (1 second). Most vendors also default to long, but a few (for example, Juniper) default to
short. For proper operation, both sides must agree on the same LACP timers. Based on
several factors, it is best to use long timers whenever possible. The use of short timers on
both sides can affect switch stability, and result in false positives for LACP failure, which leads
to less than stable operation. This parameter is set once per interface.
Suspend-individual
This option on a port-channel command configures an LACP PortChannel so that if an
individual port does not receive LACPDU packets from its counterpart on the other device,
the port goes into a suspended state and does not pass any traffic. In general, this result
is desirable and is the default option; if it is disabled (no suspend-individual), after a
period the port functions as though it were an individual port and not part of any aggregation.
The primary case when this option is not helpful is on server-facing ports where the server
uses a form of network boot (PXE, BOOTP, and so on). Because the code that provides the
LACP protocol is typically not available before boot, the port does not pass any traffic and the
network boot fails.
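The effect of the option can be modeled as a three-way outcome. This Python sketch is illustrative; the function name and the returned labels are hypothetical, and the timing ("after a period") is not modeled.

```python
def port_behavior(lacpdus_received: bool, suspend_individual: bool = True) -> str:
    """Return how a port configured for LACP behaves.

    suspend_individual defaults to True, matching the default described above.
    """
    if lacpdus_received:
        return "aggregated"   # normal member of the LACP PortChannel
    if suspend_individual:
        return "suspended"    # passes no traffic, so a network boot would fail
    return "individual"       # acts as a stand-alone port, so a network boot can proceed


# A server in its pre-boot (PXE) phase sends no LACPDUs:
print(port_behavior(lacpdus_received=False))
print(port_behavior(lacpdus_received=False, suspend_individual=False))
```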
LACP configuration
The use of a systematic scheme for numbering LACP keys can be useful for debugging.
The LACP key is derived from the port-channel number as specified in the channel-group
command which is shown in Figure 18 below.
Because LACP keys have only local significance, the device at the other end of the links can
use a key appropriate for the ports that are used on its side, although it might facilitate
troubleshooting for both sides to use the same number.
Although passive mode is also supported, there is no advantage in using it. If both sides of a
link aggregation use active mode, the side that starts the protocol first takes the active role. If
both sides use the passive option, the LACP protocol does not complete and the links do not
pass traffic.
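The mode combinations can be checked with a short Python sketch. The function name is hypothetical; the rule it encodes is the one stated above: negotiation completes unless both sides are passive.

```python
def lacp_negotiates(mode_a: str, mode_b: str) -> bool:
    """Return True if LACP negotiation can complete for the two configured modes."""
    for mode in (mode_a, mode_b):
        if mode not in ("active", "passive"):
            raise ValueError(f"unknown LACP mode: {mode}")
    # At least one side must actively start the protocol.
    return "active" in (mode_a, mode_b)


for pair in [("active", "active"), ("active", "passive"), ("passive", "passive")]:
    print(pair, lacp_negotiates(*pair))
```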
Although STP is one of the most misunderstood and problematic configuration elements in
networking, successful implementation can be achieved if you are careful and follow a few
simple rules that are described in this section. STP is an active protocol, which means that
it transmits on the wire, and missteps in the configuration can result in inefficient
network paths and potential network outages. Therefore, STP configuration changes should
be conducted during scheduled network downtimes.
STP fundamentals
STP was originally defined in the IEEE 802.1D standard, which defined a single instance of
STP. This standard describes how multiple Layer 2 bridges (typically switches) can interact to
define a spanning tree blocking any redundant paths between network nodes.
When you configure STP, you must consider the hierarchy of the spanning tree. The tree has
a root bridge with a subtree below it. Then, each subtree has a designated root bridge that
points back to the root bridge. The subtrees fan out all the way down to the edge devices.
Figure 20 Example network with three levels: A1 and A2 at the top, B1 - B4 in the middle, and C1 - C8 at the edge
As shown in Figure 20, multiple loops are in the network and, if nothing is done to block the
loops, a single broadcast packet (such as an ARP) can rapidly overrun the network. This
issue occurs because each broadcast packet replicates two or more times while it is being
flooded by each switch. STP blocks the redundant links to prevent the network loops through
a negotiation. This negotiation uses a packet called a Bridge Protocol Data Unit (BPDU) that
is transmitted by the switches.
To configure STP, you must first determine which switch should be the root bridge, which
often is the core of the network. In the example, this switch is A1 or A2. Then, you must
determine each level of the tree down to the edge (in this example, C1-C8).
After you determine the hierarchy of the tree, you must determine the bridge priority to use at
each level that is used for STP to determine the root of the tree. Bridge priorities have a value
of 0 - 61,440 incremented in intervals of 4096 where a switch with the lowest bridge priority is
selected as the root bridge. If the bridge priority is equal, the switch with the lower MAC
address is the root.
If you want a switch to be the root bridge, set the bridge priority to 0 but exercise caution
when this setting is used in a network. By default, most switches set their default bridge
priority to 32768, which includes the RackSwitch™ TOR switches.
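The priority rules and the root election can be sketched in Python as follows. The function names and the MAC addresses are hypothetical; the rules are the ones stated above (priorities are multiples of 4096 in the range 0 - 61440, the lowest priority wins, and the lower MAC address breaks a tie).

```python
def validate_priority(priority: int) -> int:
    """Accept only bridge priorities of 0 - 61440 in steps of 4096."""
    if not 0 <= priority <= 61440 or priority % 4096 != 0:
        raise ValueError(f"invalid bridge priority: {priority}")
    return priority


def elect_root(bridges: dict) -> str:
    """Return the name of the root bridge.

    bridges maps a switch name to a (priority, mac) tuple.
    """
    def bridge_id(name: str):
        priority, mac = bridges[name]
        return (validate_priority(priority), int(mac.replace(":", ""), 16))
    return min(bridges, key=bridge_id)


bridges = {
    "A1": (45056, "00:00:00:00:00:0a"),
    "A2": (49152, "00:00:00:00:00:01"),
    "B1": (53248, "00:00:00:00:00:02"),
}
print(elect_root(bridges))  # A1: lowest priority wins even though its MAC is higher
```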
STP then uses the bridge priority to determine, for each loop, the designated bridge that
provides the preferred path to the tree's root bridge. The switch that is furthest from the
root bridge blocks ports by placing them in the discarding state to eliminate the network
loops. Figure 21 shows an example network with bridge priorities.
Figure 21 Example network with bridge priorities: A1 45056 and A2 49152; B1 and B3 53248, B2 and B4 57344; C1 - C8 61440; blocked paths are marked with X
Figure 21 displays the example network with some example bridge priorities configured,
assuming that the subtree is inserted at the edge of the network. Devices C1 - C8 are edge
devices; therefore, their bridge priority should be set to 61440. Each of the next layers
should have a primary and a secondary bridge configured, so each pair should be set to
adjacent bridge priorities. B1 - B4 are configured with bridge priorities 53248 and 57344,
and switches A1 and A2 use the bridge priorities of 45056 and 49152. With these values, the
outcome of the root election on the target network is deterministic.
If the preceding bridge priorities are used, each level of the tree determines the designated
bridge for each level (indicated by the gray box in Figure 21), which has the preferred path to
the root bridge. The non-preferred paths are blocked (designated by X in Figure 21) on the
switch that is furthest from the root bridge. The alternative paths are marked by dashed lines
in Figure 21 and all loops are blocked.
Although there are many other elements to STP, the bridge priority is the most important
one to understand because it is the primary element that is used to form the
hierarchy of the spanning tree. When the link state of any port changes, communication
between devices in the spanning tree might be interrupted while STP renegotiates the best
path to the root bridge, which is called STP convergence. This concept is important because
if you reboot a device on the LAN, network traffic can be interrupted.
The STP standard evolved as the Layer 2 protocols evolved to include support for protecting
against loops in multiple VLANs. One such protocol, developed by Cisco, is called Per VLAN
Spanning Tree (PVST) and Per VLAN Spanning Tree Plus (PVST+) where a spanning tree
instance is created for each VLAN. Each STP instance uses a BPDU that is tagged with the
VLAN ID to negotiate the STP protocol. PVST/PVST+ STP instance 1 uses an untagged
BPDU, which is compatible with the IEEE standard STP. STP instance 1 is the only instance
that can control more than one VLAN because all of the other instances use the unique VLAN
ID to tag the BPDU.
Rapid STP (RSTP), defined by IEEE 802.1w, evolved from STP to significantly reduce the
time it takes for a network to converge, from 40 to 50 seconds down to 6 seconds or
less. RSTP is backward compatible with STP and is incorporated into the IEEE
802.1D-2004 standard. Cisco extended PVST/PVST+ to include RSTP extensions with Rapid
Per VLAN Spanning Tree (RPVST) or Per VLAN Rapid Spanning Tree (PVRST).
RSTP is no longer supported on Lenovo switches, but equivalent capability can be obtained
via PVRST or MSTP when only one VLAN is in use.
Finally, the IEEE community added Multiple Spanning Tree Protocol (MSTP), which was
originally defined in the IEEE 802.1s standard and later merged into IEEE 802.1Q-2005.
MSTP is similar to PVRST in that it contains multiple STP instances. However, unlike PVRST,
MSTP allows multiple VLANs to be defined in all of the STP instances. MST also limits the
number of STP instances to 64 with IDs 0 - 64 and maps these into a single BPDU to provide
less processing overhead than PVRST, which uses a BPDU for each VLAN.
Another STP mode to consider is to disable STP. The way switches function with STP
disabled is not defined in IEEE 802.1D-2004, so care should be taken to understand how the
device works in this mode. The following section describes how this mode works in the
Lenovo switches.
Each of these modes is different and it is important to select the proper STP mode (type)
when integrating into a network. Network instability and outages occur if the same STP mode
is not used on every network device.
In this example, GROUP is the STP group and PRIORITY is the STP priority (0 - 61440). The
priority is required to be a multiple of 4096, as described in “STP fundamentals” on page 29.
For example, if 61000 is selected, an error message will be generated.
Before MSTP mode can be selected, the name and revision number must be configured with
the same value as all other devices in the network. The name is a character string of up to 32
case-sensitive characters and the revision is a value of 0 - 65535 where the default value is 1.
Most vendors set the default revision number as 1, except for Cisco.
The MSTP configuration can be validated by using the command show spanning-tree mst
configuration.
For MSTP to properly balance the trees, the VLANs must be mapped to the same instances
throughout the network. The assignment is done by using the following command:
spanning-tree mst configuration
instance INSTANCE vlan VLANS
exit
In this example, INSTANCE is the MSTP instance 0 - 64 and VLANS is the range of VLANS to be
assigned. The VLAN mappings can be validated by comparing the MSTP Digest, which is the
common calculated value defined by the MSTP standard and used by all devices in the tree.
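The reason a digest comparison validates the mappings can be shown with a simplified Python model. In the MSTP standard, the digest is an HMAC-MD5 signature computed over the VLAN-to-instance mapping table with a fixed key. The sketch below makes two loudly stated assumptions: it uses an all-zero placeholder key instead of the key that the standard defines, and it uses a simplified table layout. Its output therefore does not match the digest shown by a real switch; it only demonstrates that identical mappings yield identical 16-byte digests.

```python
import hashlib
import hmac
import struct

# ASSUMPTION: placeholder key. The standard specifies its own fixed 16-byte key.
PLACEHOLDER_KEY = bytes(16)


def mapping_digest(vlan_to_instance: dict) -> bytes:
    """Return a 16-byte digest of a VLAN-to-instance mapping (simplified model).

    VLANs that are not listed are treated as mapped to instance 0.
    """
    table = bytearray()
    for vid in range(4096):
        table += struct.pack("!H", vlan_to_instance.get(vid, 0))
    return hmac.new(PLACEHOLDER_KEY, bytes(table), hashlib.md5).digest()


switch_a = {10: 1, 20: 2}
switch_b = {10: 1, 20: 2}
switch_c = {10: 1, 20: 3}   # VLAN 20 mapped to a different instance

print(mapping_digest(switch_a) == mapping_digest(switch_b))  # same mapping
print(mapping_digest(switch_a) == mapping_digest(switch_c))  # different mapping
```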
The MSTP instance bridge priorities function the same way as in all of the STP standards and
can be set by using the following command:
spanning-tree mst INSTANCE priority PRIORITY
Globally disabling STP disables all local STP processing on the switch and any BPDU that is
received is forwarded as is any other L2 packet, which allows the upstream switches to
process all STP control.
Loop Guard
Loop Guard adds protection against network loops that are caused by improperly functioning
remote devices with conditions, such as a unidirectional link failure. This feature monitors the
BPDUs that are normally received on STP enabled ports and places the port into a
loop-inconsistent blocking state if BPDUs are no longer received. After BPDUs are received
again, the port is placed back into a normal error-free STP state.
Edge ports
The edge feature enables a port to immediately go into forwarding state. This feature should
be enabled on all server-facing ports. If this feature is not enabled, a server that is rebooted
can cause a network convergence event that disrupts network traffic and results in lost data.
To configure a port as an edge port, use the spanning-tree port type edge configuration
command at the interface level.
Another reason for configuring edge mode on client- and server-facing ports is that the ports
are immediately placed into forwarding so that traffic is passed. This fact is important for
devices that use DHCP for IP configuration or use BOOTP or similar network booting
protocols, because otherwise, ports may time out before processing is completed.
BPDU guard
BPDU guard error-disables a port and generates a log message if a BPDU is received on it.
This feature is useful to protect a port against unexpected switches that are plugged in or
against other improper configurations that can result in a loop. It is configured with the
following command:
spanning-tree bpduguard enable [or disable]
Root guard
Root guard protection is used to prevent the STP root bridge for any tree from being learned
on a port. If a root bridge is learned, the port is put into an inconsistent state. To enable root
guard, use the following command:
spanning-tree guard root
To prevent excessive amounts of this type of traffic from causing issues in an environment,
each can be independently throttled to limit the number of packets per second.
Storm control commands are run on the interface by using the following format:
storm-control <broadcast | multicast | unicast> level <percent of port bandwidth>
Configuring Storm Control can be as much art as science. Every environment can have
different levels of these types of traffic that are considered normal. Arbitrarily throttling these
types of traffic can cause more issues than it might help. The following examples describe
different approaches for selecting appropriate values for Storm Control settings:
Important: Do not be too aggressive when you are setting Storm Control values because
the switch can discard normal wanted packets of these types (for example, ARPs are a
broadcast and you might inadvertently affect the normal ARP process if the broadcast
threshold is set too low).
Monitor the amount of these types of traffic by using the show int eth <1/x> counters
storm-control CLI command to determine the number of packets being discarded.
Use the show int eth <1/x> command (it displays a report for the utilization for a
30-second period), monitor the broadcast and multicast for some period, and then take the
average number and add one or two zeros. For example, if the broadcast averaged 20 or
30 per second, set the value for 2000 or 3000.
A more scientific (but potentially time-consuming) method is to use a monitoring tool that
gathers data over a period and use the averages as reported by this tool to select
appropriate values.
The simplest and least time-consuming way is to pick a number that is fairly safe or high
for most environments and set this value; for example, 10% of the total port bandwidth
is usually, but not guaranteed to be, safe.
Important: After Storm Control values are set, it is important to monitor the policy
discards on the port (by using the show int eth <1/x> counters storm-control
command). If there are excessive policy discards in normal conditions (which indicates
that storm control is being triggered), consider tuning the values higher until policy
discards are minimal to none in normal operating conditions.
When storm control is applied, it controls only packets that are coming into a port, not
packets that are going out of a port.
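The "average, then add one or two zeros" rule of thumb described above can be written as a small helper. This is a sketch only; the function name and the sample values are hypothetical, and the result still needs to be validated by monitoring policy discards as described.

```python
def suggest_threshold(samples_pps: list, zeros: int = 2) -> int:
    """Suggest a storm control value from observed packets-per-second samples.

    Averages the samples and appends one or two zeros (multiplies by 10 or 100).
    """
    if not samples_pps:
        raise ValueError("at least one sample is required")
    if zeros not in (1, 2):
        raise ValueError("zeros must be 1 or 2")
    average = sum(samples_pps) / len(samples_pps)
    return round(average) * 10 ** zeros


# Broadcast rate observed over several 30-second reports (hypothetical values).
print(suggest_threshold([20, 30, 25]))        # 2500
print(suggest_threshold([20, 30, 25], 1))     # 250
```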
When a switch receives a BOOTP and DHCP request from a client that is seeking an IPv4
address, the switch acts as a proxy for the client. The request is forwarded as a UDP unicast
MAC layer message to the BOOTP and DHCP servers that are configured on the port where
the request was received. The server(s) responds to the switch with a Unicast reply that
contains the IPv4 default gateway and the IPv4 address for the client. The switch forwards
this reply back to the client. Additional parameters such as hostname, DNS server address,
log server address, and others can also be provided to the client.
In the examples below, dhcp relay is configured on VLAN interfaces; it can also be configured
in the same way on individual physical ports (interface Ethernet <1/x>).
Figure 22 shows a BootP and DHCP request process to a single switch.
Figure 22 BootP and DHCP request process through a single switch: DHCP client on VLAN 10 (192.168.10.x, assigned by DHCP), switch with IP addresses 192.168.10.1 and 192.168.20.1, and DHCP server on VLAN 20 (192.168.20.152)
Figure 23 shows how to correctly configure BootP and DHCP relay on a G8264 switch.
If used in parallel with vLAG, VRRP performs Layer 3 routing on the master and backup
switches. In this scenario, VRRP with BootP and DHCP relay uses the VRRP address to
communicate with the client and server. Figure 24 shows a BootP and DHCP request
process for a switch that is running VRRP with vLAG.
Figure 24 BootP and DHCP packet walk for a switch that is running VRRP with vLAG (DHCP client on VLAN 10, Layer 3 switches running VRRP, and DHCP server on VLAN 20)
Figure 25 shows how to correctly configure BootP and DHCP relay with VRRP.
interface vlan 10
 ip address 192.168.10.2/24
 exit
!
interface vlan 20
 ip address 192.168.20.2/24
 exit
!
! enable globally
ip dhcp relay
!
! configure interfaces
interface vlan 10
 ip dhcp relay address 192.168.20.152
!
interface vlan 20
 ! same address as above
 ip dhcp relay address 192.168.20.152
!
interface vlan 10
 vrrp 10
  address 192.168.10.1
  priority 101
!
interface vlan 20
 vrrp 20
  address 192.168.20.1
  priority 101
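Because VRRP with vLAG involves a master and a backup switch, the peer needs a matching configuration. The fragment below is a sketch of what that could look like, using only the commands shown in the listing above. The interface addresses ending in .3 and the priority value of 100 are assumptions chosen for illustration (any priority lower than 101 leaves the first switch as the preferred master); the VRRP addresses and the relay address are the same as above.

```
! vLAG peer (backup) switch; the .3 interface addresses are assumed
interface vlan 10
 ip address 192.168.10.3/24
 exit
!
interface vlan 20
 ip address 192.168.20.3/24
 exit
!
ip dhcp relay
!
interface vlan 10
 ip dhcp relay address 192.168.20.152
!
interface vlan 20
 ip dhcp relay address 192.168.20.152
!
interface vlan 10
 vrrp 10
  address 192.168.10.1
  ! assumed value, lower than the 101 used on the first switch
  priority 100
!
interface vlan 20
 vrrp 20
  address 192.168.20.1
  priority 100
```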
Change history
May 22, 2019
Minor corrections
Author
Scott Lorditch is a Consulting System Engineer for Lenovo. He performs network
architecture assessments and develops designs and proposals for solutions that involve
Lenovo Networking products. He also developed several training and lab sessions for
technical and sales personnel. Scott joined IBM as part of the acquisition of Blade Network
Technologies® and joined Lenovo as part of the System x® acquisition from IBM. Scott spent
almost 20 years working on networking in various industries, as a senior network architect, a
product manager for managed hosting services, and manager of electronic securities transfer
projects. Scott holds a BS degree in Operations Research with a specialization in computer
science from Cornell University.
This paper is based on a chapter in the Lenovo Press book, Lenovo Networking Best
Practices for Configuration and Installation. Thanks to the authors:
Scott Irwin
Scott Lorditch
Ted McDaniel
William Nelson
Matt Slavin
Megan Gilge
Notices
Lenovo may not offer the products, services, or features discussed in this document in all countries. Consult
your local Lenovo representative for information on the products and services currently available in your area.
Any reference to a Lenovo product, program, or service is not intended to state or imply that only that Lenovo
product, program, or service may be used. Any functionally equivalent product, program, or service that does
not infringe any Lenovo intellectual property right may be used instead. However, it is the user's responsibility
to evaluate and verify the operation of any other product, program, or service.
Lenovo may have patents or pending patent applications covering subject matter described in this document.
The furnishing of this document does not give you any license to these patents. You can send license
inquiries, in writing, to:
LENOVO PROVIDES THIS PUBLICATION “AS IS” WITHOUT WARRANTY OF ANY KIND, EITHER
EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some
jurisdictions do not allow disclaimer of express or implied warranties in certain transactions, therefore, this
statement may not apply to you.
This information could include technical inaccuracies or typographical errors. Changes are periodically made
to the information herein; these changes will be incorporated in new editions of the publication. Lenovo may
make improvements and/or changes in the product(s) and/or the program(s) described in this publication at
any time without notice.
The products described in this document are not intended for use in implantation or other life support
applications where malfunction may result in injury or death to persons. The information contained in this
document does not affect or change Lenovo product specifications or warranties. Nothing in this document
shall operate as an express or implied license or indemnity under the intellectual property rights of Lenovo or
third parties. All information contained in this document was obtained in specific environments and is
presented as an illustration. The result obtained in other operating environments may vary.
Lenovo may use or distribute any of the information you supply in any way it believes appropriate without
incurring any obligation to you.
Any references in this publication to non-Lenovo Web sites are provided for convenience only and do not in
any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the
materials for this Lenovo product, and use of those Web sites is at your own risk.
Any performance data contained herein was determined in a controlled environment. Therefore, the result
obtained in other operating environments may vary significantly. Some measurements may have been made
on development-level systems and there is no guarantee that these measurements will be the same on
generally available systems. Furthermore, some measurements may have been estimated through
extrapolation. Actual results may vary. Users of this document should verify the applicable data for their
specific environment.
Send us your comments via the Rate & Provide Feedback form found at
http://lenovopress.com/lp1005
Trademarks
Lenovo, the Lenovo logo, and For Those Who Do are trademarks or registered trademarks of Lenovo in the
United States, other countries, or both. These and other Lenovo trademarked terms are marked on their first
occurrence in this information with the appropriate symbol (® or ™), indicating US registered or common law
trademarks owned by Lenovo at the time this information was published. Such trademarks may also be
registered or common law trademarks in other countries. A current list of Lenovo trademarks is available on
the Web at http://www.lenovo.com/legal/copytrade.html.
The following terms are trademarks of Lenovo in the United States, other countries, or both:
Blade Network Technologies® Lenovo® Lenovo(logo)®
Flex System™ RackSwitch™ System x®
Other company, product, or service names may be trademarks or service marks of others.