MLAG-ConfigGuide-4 7 3
Multi-Chassis Link Aggregation Group (MLAG) is a layer 2 feature that logically represents two
physical devices as one. In EOS version 4.7.3, MLAG gains Stateful Switchover (SSO). Failover
between switches acting as MLAG peers now takes less than 500 ms. The use of the same
system-id in STP and LACP helps minimize protocol churn at layer 2 when one of the peers goes
down and provides for fast failover. In theory, if a switch within an MLAG pair goes down, the
only packets lost should be those on the wire to the rebooting switch. SSO also minimizes the
impact of restarting a switch participating as an MLAG peer.
This document contains the following sections:
1. Key changes in the MLAG behavior in 4.7.3 - Please read if you have MLAG deployments
running EOS images older than 4.7.3.
2. Upgrading an MLAG pair of switches to EOS-4.7.3
3. Configuring MLAG in 4.7.3 - Please read if you are configuring MLAG for the first time on a
pair of switches running 4.7.3
4. Configuring active/active router redundancy
Key changes in the MLAG behavior in 4.7.3
• The primary-priority values are only used when the MLAG pair negotiates for the first time.
• The command 'show mlag' does not list the MLAG interfaces. The MLAG interfaces can now
be viewed using the command 'show mlag interfaces'.
• Each MLAG domain is assigned an MLAG system ID that does not match either peer's
bridge MAC address. Both peers use this ID in STP and LACP PDUs.
[Figure: MLAG Switch-1 and MLAG Switch-2]
The use of the same system-id in STP and LACP helps minimize protocol churn at layer 2 when
one of the peers goes down.
• These configurations must now be applied to both peers, if they aren't already:
• VLAN (including ID and name)
• switchport configuration for MLAG interfaces
• Spanning-tree global configuration
• Spanning-tree configuration for MLAG interfaces (i.e. port-channels configured with an
MLAG ID)
• The MLAG states of primary and secondary apply only to STP and are no longer reflected in
the output of 'show mlag'. The state can be seen in the output of 'show mlag detail'.
• Seamless failover is only possible when the Stp agent is "restartable". You can check this
by running 'show spanning-tree bridge detail' on either switch and looking for the line "Stp
agent is [not] restartable". Whenever a significant event occurs that requires the protocol to
run one of its state machines, Stp will not be restartable for about 30 seconds.
• If switch-1 and switch-2 are MLAG peers and switch-1 reboots, all ports on switch-1 other than
those in the peer-link port-channel are held in the 'errdisabled' state for 5 minutes. This allows
the topology state to stabilize before traffic is forwarded. The 5-minute value is the default
value of the 'reload delay', which is configurable under the MLAG configuration mode. The
recommended minimum value is 60 seconds for ToR switches and 600 seconds for the
modular platform, to ensure that at least the forwarding hardware is initialized with the topology
state. Holding the ports in the 'errdisabled' state also allows multicast group membership and
dynamic routes to be re-learned before traffic is allowed in.
• For seamless failover to work, the peer switch must be directly, physically connected. An
MLAG peer enters the failover state only when it receives a link-down event on the peer-link.
• The MLAG peers can enter a "split brain" scenario, for example if all of the peer-link cables
are removed. There is currently no way for one MLAG peer to detect the liveness of the other
in this scenario, so the two peers independently stay active until they are able to renegotiate.
At the time of negotiation, both go to the inactive state until the MLAG association is
successfully established.
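The reload delay guidance above can be turned into a small pre-change check. The Python sketch below is illustrative only: the function name and the platform labels "tor" and "modular" are assumptions made for this example, and only the numbers (300-second default, 60-second and 600-second recommended minimums) come from this guide.

```python
# Recommended minimum reload delay values, in seconds, as stated in this guide.
# The platform labels are hypothetical names chosen for this sketch.
RECOMMENDED_MIN_RELOAD_DELAY = {"tor": 60, "modular": 600}

# The default reload delay named in this guide (5 minutes).
DEFAULT_RELOAD_DELAY = 300


def reload_delay_ok(platform, reload_delay=DEFAULT_RELOAD_DELAY):
    """Return True if reload_delay meets the recommended minimum for platform."""
    return reload_delay >= RECOMMENDED_MIN_RELOAD_DELAY[platform]
```

With these numbers, the default satisfies the ToR recommendation but falls short of the one for the modular platform, so a modular chassis would need the value raised explicitly.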
Upgrading an MLAG pair of switches to EOS-4.7.3
Hitless MLAG does not include support for ISSU. MLAG ISSU, which is scheduled for a future
release, will support upgrading an MLAG pair of switches without any disruption to
downstream devices.
The following steps can be followed to upgrade the switches one at a time with minimal network
churn at layer 2:
- If switch A and switch B are the MLAG peers running EOS versions before 4.7.3, disable the
physical interfaces on switch B (secondary) that are configured in an MLAG, using the 'shut'
command from interface configuration mode.
switch-B#config t
switch-B(config)#mlag
switch-B(config-mlag)#no domain-id
- Reload switch B with the boot-config set to the new image.
- Wait for switch B to stabilize. Enable the physical interfaces on switch B that are configured in
an MLAG, using the 'no shut' command on the required set of interfaces.
- Disable the physical interfaces on switch A that are configured in an MLAG, using the 'shut'
command on the required set of interfaces.
- Save the configuration on switch A so that the interfaces remain shut down after the reload,
then reload switch A with the boot-config set to the new image.
- Once switch A has completed the boot process and is accessible, enable the interfaces in the
MLAG using the 'no shut' command.
- Once the MLAG association is established, the MLAG interfaces will change to the 'connected'
state.
Configuring MLAG on EOS-4.7.3 and later releases
Note: Both MLAG switches must run the same version of EOS. Running different versions may
result in a failure to form an association with the MLAG peer.
1. On both switches, ensure that the control plane ACL configuration is compatible with MLAG.
If a custom access list is configured, it must also contain these two rules.
2. Create a port-channel for the peer link. Assuming interfaces Eth1 and Eth2 connect the two
peers, configure the following on both switches:
Switch1# config t
Switch1(conf)#interface eth1-2
Switch1(config-if-Et1-2)# channel-group 10 mode active
Switch1(config-if-Et1-2)# interface port-channel 10
Switch1(config-if-Po10)# switchport mode trunk
3. On both switches, create a VLAN with any unused vlan-id for the peer-to-peer
communication
Switch1(conf)#vlan 4094
Switch1(config-vlan-4094)# trunk group mlagpeer
Switch1(config-vlan-4094)# interface port-channel 10
Switch1(config-if-Po10)# switchport trunk group mlagpeer
Switch1(config-if-Po10)# exit
Switch1(conf)#no spanning-tree vlan 4094
The trunk group name used in the 'trunk group' command can be any alphanumeric string. The
string 'mlagpeer' is not a keyword and is used in this example to add relevance to the purpose
of VLAN 4094.
Assigning Vlan4094 and Port-Channel10 to trunk group 'mlagpeer' prevents Vlan4094 from
being carried on any trunk other than Po10. This allows you to safely disable spanning tree on
Vlan4094 (ensuring that the MLAG peers can communicate) without creating a loop through the
(other) trunk links.
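The effect of the trunk group can be pictured with a small model. The Python sketch below is a simplified illustration, not EOS behavior in full: it assumes that a VLAN assigned to one or more trunk groups is carried only on trunks that belong to one of those groups, that a VLAN with no trunk group may be carried on every trunk, and it ignores allowed-VLAN lists. The function name and the ports Po3 and Et5 are hypothetical.

```python
def trunks_carrying(vlan_groups, port_groups):
    """Return the trunk ports that carry a VLAN under the simplified rule.

    vlan_groups: set of trunk group names assigned to the VLAN.
    port_groups: dict mapping each trunk port to its set of trunk group names.
    """
    if not vlan_groups:
        # The VLAN is not restricted to any trunk group.
        return sorted(port_groups)
    # Keep only the ports that share at least one trunk group with the VLAN.
    return sorted(port for port, groups in port_groups.items()
                  if groups & vlan_groups)


# Po10 is the peer link from this guide; Po3 and Et5 stand in for other trunks.
ports = {"Po10": {"mlagpeer"}, "Po3": set(), "Et5": set()}
vlan4094_trunks = trunks_carrying({"mlagpeer"}, ports)
```

In this model VLAN 4094 is confined to Po10, which is why disabling spanning tree on it cannot create a loop through the other trunks.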
On Switch1:
On Switch2:
Test IP connectivity between the two switches by pinging one peer from the other.
Switch1(config)#mlag
Switch1(config-mlag)#local-interface vlan 4094
Switch1(config-mlag)#peer-address 10.0.0.2
Switch1(config-mlag)#peer-link port-channel 10
Switch1(config-mlag)#domain-id mlag1
Switch2(config)#mlag
Switch2(config-mlag)#local-interface vlan 4094
Switch2(config-mlag)#peer-address 10.0.0.1
Switch2(config-mlag)#peer-link port-channel 10
Switch2(config-mlag)#domain-id mlag1
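The two configurations above are mirror images of each other. As a minimal sketch, assuming you want to generate both from one template, the Python below renders the MLAG lines for each peer using only the commands shown in this guide; the function name and its default arguments are choices made for this example.

```python
def mlag_config(peer_address, domain_id="mlag1", vlan=4094, peer_link=10):
    """Render the MLAG configuration lines for one peer."""
    return [
        "mlag",
        f"local-interface vlan {vlan}",
        f"peer-address {peer_address}",
        f"peer-link port-channel {peer_link}",
        f"domain-id {domain_id}",
    ]


# Each switch points at the other switch's address on VLAN 4094.
switch1 = mlag_config("10.0.0.2")
switch2 = mlag_config("10.0.0.1")

# Collect the lines that differ between the two peers.
differing = [(a, b) for a, b in zip(switch1, switch2) if a != b]
```

Only the peer-address line differs between the two lists, which makes accidental mismatches in domain-id or peer-link easy to rule out.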
The MLAG peer relationship forms once the peer-link is up, the domain-ids match, and a
bidirectional TCP connection is established between the MLAG peers.
The MLAG association dissolves and both switches revert to their independent state if any one
of the following occurs:
• If the TCP connection is broken
• If the peer-link goes down
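The conditions for forming and dissolving the association can be summarized as a single predicate. The Python sketch below is a toy model of the rules stated above, not of the MLAG protocol itself; the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PeerView:
    """What one switch knows about the peering (hypothetical field names)."""
    domain_id: str
    peer_link_up: bool
    tcp_established: bool


def association_holds(a, b):
    """True while the peer-link is up, the domain-ids match, and TCP is up both ways."""
    return (a.peer_link_up and b.peer_link_up
            and a.domain_id == b.domain_id
            and a.tcp_established and b.tcp_established)
```

If any input turns false, for example the peer-link going down, the predicate turns false, which corresponds to both switches reverting to their independent state.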
6. Wait for the peers to form an MLAG association and enter the active state. The output of the
command 'show mlag' shows the MLAG configuration and the status. Once the MLAG
association is established, both switches will be in the 'active' state.
On Switch1:
switch-1#show mlag
MLAG Configuration:
domain-id : mlag1
local-interface : Vlan4094
peer-address : 10.0.0.2
peer-link : Port-Channel10
MLAG Status:
state : Active
peer-link status : Up
local-int status : Up
system-id : 02:1c:73:00:13:19
MLAG Ports:
Disabled : 0
Configured : 0
Inactive : 0
Active-partial : 0
Active-full : 0
On Switch2:
switch-2#show mlag
MLAG Configuration:
domain-id : mlag1
local-interface : Vlan4094
peer-address : 10.0.0.1
peer-link : Port-Channel10
MLAG Status:
state : Active
peer-link status : Up
local-int status : Up
system-id : 02:1c:73:00:13:19
MLAG Ports:
Disabled : 0
Configured : 0
Inactive : 0
Active-partial : 0
Active-full : 0
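When checking many pairs, the comparison of the two outputs can be scripted. The Python sketch below assumes the 'key : value' layout of the sample output above and that the text has already been collected from each switch by some other means; the function names are hypothetical, and the real output may contain fields or spacing not covered here.

```python
def parse_show_mlag(text):
    """Collect 'key : value' lines from 'show mlag' output into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first ' : ' so MAC addresses in the value stay intact.
        key, sep, value = line.partition(" : ")
        if sep:
            fields[key.strip().lower()] = value.strip()
    return fields


def peers_agree(out1, out2):
    """True if both peers are Active and report the same domain-id and system-id."""
    a, b = parse_show_mlag(out1), parse_show_mlag(out2)
    shared = ("domain-id", "system-id")
    return (a.get("state") == "Active" and b.get("state") == "Active"
            and all(k in a and a[k] == b.get(k) for k in shared))


# Sample text copied from the Switch1 output in this guide.
SWITCH1 = """\
MLAG Configuration:
domain-id : mlag1
local-interface : Vlan4094
peer-address : 10.0.0.2
peer-link : Port-Channel10
MLAG Status:
state : Active
peer-link status : Up
local-int status : Up
system-id : 02:1c:73:00:13:19
"""
# Switch2 differs only in the peer-address it points at.
SWITCH2 = SWITCH1.replace("10.0.0.2", "10.0.0.1")
```

Calling peers_agree(SWITCH1, SWITCH2) on the sample text returns True, because both peers are Active and share the domain-id and the MLAG system-id.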
7. Configure an MLAG
In this example, a simple two-port MLAG is used. One of the ports from Switch3 is connected to
Switch1 and the other port is connected to Switch2. The two interfaces on Switch3 or the Host
can be configured as a regular port-channel using LACP.
If Eth3 on Switch1 and Switch2 is used in the MLAG, configure the following on both
switches:
Switch1(conf)#interface eth3
Switch1(config-if-Et3)# channel-group 3 mode active
Switch1(config-if-Et3)# interface port-channel 3
Switch1(config-if-Po3)# mlag 3
This puts Eth3 into Port-Channel3 on both switches and connects the two Port-Channel3
interfaces into MLAG 3. The MLAG peer switches associate the port channels using the MLAG
identification number. LACP should be used on all MLAG interfaces.
The output of spanning tree on both the MLAG peers will list the local and the peer interfaces:
switch1#show spanning-tree
VL3908
Spanning tree enabled protocol rapid-pvst
Root ID Priority 36676
Address 001c.7301.021e
This bridge is the root
Interface Role State Cost Prio.Nbr Type
------------- ---------- ---------- --------- -------- --------------------
Et17 designated forwarding 2000 128.217 P2p
Et18 designated forwarding 2000 128.218 P2p
PEt17 designated forwarding 2000 128.17 P2p
PEt18 designated forwarding 2000 128.18 P2p
The local interfaces and peer interfaces are listed in the output of 'show port-channel
all-ports' when viewed on either of the MLAG peers.
NOTE: Peer interfaces do not appear in the running-config or startup-config of either MLAG
peer.
It is highly recommended to configure VLANs identically on both switches. If a VLAN is not
configured on both of the MLAG peers, the peer missing the VLAN configuration will fail to
forward traffic for that VLAN.
switch1#show vlan
VLAN Name Status Ports
----- ---------------------- --------- -------------------------------
Peer interfaces are displayed in the output of 'show vlan' on both switches.
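Because mismatched VLAN configuration silently drops traffic on one peer, it can help to compare the two VLAN tables before enabling MLAG interfaces. The Python sketch below assumes each peer's VLANs have already been gathered into a dict of VLAN ID to name; the function name, the result keys, and the sample VLAN names are hypothetical.

```python
def vlan_mismatches(peer1, peer2):
    """Compare two {vlan_id: name} dicts and report the differences."""
    ids1, ids2 = set(peer1), set(peer2)
    return {
        # VLANs present on peer1 only, so peer2 would not forward them.
        "missing_on_peer2": sorted(ids1 - ids2),
        # VLANs present on peer2 only.
        "missing_on_peer1": sorted(ids2 - ids1),
        # VLANs present on both peers but with different names.
        "name_differs": sorted(v for v in ids1 & ids2 if peer1[v] != peer2[v]),
    }


# Hypothetical VLAN tables for the two peers.
switch1_vlans = {10: "servers", 20: "storage", 4094: "mlagpeer-vlan"}
switch2_vlans = {10: "servers", 20: "backup", 30: "dmz"}
report = vlan_mismatches(switch1_vlans, switch2_vlans)
```

An empty list under every key means the VLAN IDs and names match on both peers.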
Configuring Active-Active Router Redundancy
If you want IP unicast routing redundancy for the MLAG domain, you can configure
VRRP or VARP. Both protocols use a virtual router IP address that is defined as the next hop
for the local nodes.
In VRRP, based on priority configuration or router election, the two routers take on either the
master or the backup state. At any given time, only one of the routers is routing; when the
master becomes unavailable, routing fails over to the backup router. VRRP is standards based,
but less desirable because some of the traffic routed to and from remote subnets will traverse
the MLAG peer link.
For active/active unicast IP routing in MLAG configurations, Arista recommends VARP (Virtual
ARP). As a primary benefit, VARP does not require the traffic to traverse the peer-link to the
master router as VRRP would. VARP also provides rapid failover in the event of a link or switch
failure while enabling the sharing of IP forwarding load between both switches.
VARP requires configuring the same virtual-router IP address on the appropriate VLAN
interfaces of both MLAG peers, as well as a globally unique virtual-router MAC address (set with
the global 'ip virtual-router mac-address' command). VARP functions by having both switches
respond to ARP requests and GARP for a configured IP address with the "virtual-router" MAC
address. This address is a receive-only MAC address, and no packet is ever sent with this
address as its source. If 'ip routing' is enabled, received packets are routed as follows: when
the DMAC of a packet destined to a remote network matches the configured "virtual-router"
MAC address, each MLAG peer locally forwards the traffic to its next-hop destination. Note that
there is no route table synchronization via the MLAG protocol. Each MLAG peer must have the
same routes available, either via static configuration or learned via a dynamic routing protocol
such as OSPF or BGP.
Switch1:
Switch1#config t
Switch1(config)#interface vlan 10
Switch1(config-if-Vl10)#ip address 10.10.10.2/24
Switch1(config-if-Vl10)#ip virtual-router address 10.10.10.1
Switch1(config-if-Vl10)#interface vlan 20
Switch1(config-if-Vl20)#ip address 10.10.20.2/24
Switch1(config-if-Vl20)#ip virtual-router address 10.10.20.1
Switch1(config-if-Vl20)#exit
Switch2:
Switch2#config t
Switch2(config)#interface vlan 10
Switch2(config-if-Vl10)#ip address 10.10.10.3/24
Switch2(config-if-Vl10)#ip virtual-router address 10.10.10.1
Switch2(config-if-Vl10)#interface vlan 20
Switch2(config-if-Vl20)#ip address 10.10.20.3/24
Switch2(config-if-Vl20)#ip virtual-router address 10.10.20.1
Switch2(config-if-Vl20)#exit
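The VARP addressing in the example follows a pattern that can be checked mechanically. The Python sketch below uses the standard ipaddress module and assumes each peer's SVIs are described as a dict of VLAN ID to (interface address, virtual-router address). The function name and data layout are hypothetical, and the subnet and uniqueness checks are sanity rules inferred from the example above, not requirements quoted from EOS documentation.

```python
import ipaddress


def varp_issues(peer1, peer2):
    """Return a list of problems found in two {vlan: (ip/prefix, virtual_ip)} dicts."""
    issues = []
    for vlan in sorted(set(peer1) | set(peer2)):
        if vlan not in peer1 or vlan not in peer2:
            issues.append(f"vlan {vlan}: configured on only one peer")
            continue
        if1 = ipaddress.ip_interface(peer1[vlan][0])
        if2 = ipaddress.ip_interface(peer2[vlan][0])
        v1 = ipaddress.ip_address(peer1[vlan][1])
        v2 = ipaddress.ip_address(peer2[vlan][1])
        if v1 != v2:
            # Both peers must answer ARP for the same virtual-router address.
            issues.append(f"vlan {vlan}: virtual-router addresses differ")
        if if1.ip == if2.ip:
            issues.append(f"vlan {vlan}: peers share an interface address")
        if v1 not in if1.network or v2 not in if2.network:
            issues.append(f"vlan {vlan}: virtual address outside the interface subnet")
    return issues


# Values taken from the Switch1 and Switch2 configuration in this guide.
switch1 = {10: ("10.10.10.2/24", "10.10.10.1"), 20: ("10.10.20.2/24", "10.10.20.1")}
switch2 = {10: ("10.10.10.3/24", "10.10.10.1"), 20: ("10.10.20.3/24", "10.10.20.1")}
issues = varp_issues(switch1, switch2)
```

For the addresses used in this guide the list of issues is empty.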