14-740: Networks: Lecture 24 Spring 2018 Kesden

The document describes the venerable 3-tier data center topology and how it scales up. It then introduces Clos networks, leaf-spine networks, and fat-tree networks as improved topologies that provide full bisection bandwidth. The Portland solution is presented as using a fat-tree network with commodity switches and offloading services to software on servers. It assigns hierarchical MAC addresses to enable location awareness and uses a fabric manager for coordination.

14-740: Networks

Lecture 24 • Spring 2018 • Kesden


DC Topology: Venerable 3-Tier
• Since, beyond a certain point, we can’t make switches wider and/or faster, we need to “fan out”, most commonly with a tree topology
• The venerable 3-tier network is a straightforward example:
[Figure: three-tier tree topology with Core, Aggregation, and Leaf layers]
DC Topology: Venerable 3-Tier
• Can add a redundant core for increased throughput and resilience

[Figure: three-tier tree topology with a redundant core above the Aggregation and Leaf layers]
DC Topology: Venerable 3-Tier
• Scales nicely, but …
• Higher layers get over-subscribed, since everything passes through them
• Over-subscription increases with scale
• Request-to-stream and host-to-host cases generate bottlenecks
[Figure: throughput demanded per level: 1x switch throughput at the leaf, Wx at aggregation, W²x at the core]
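To make the over-subscription concrete, here is a minimal sketch in Python of the scaling shown in the figure, assuming every level fans out by the same factor W and every switch has identical 1x capacity (the function name is just for illustration):

```python
# Worst-case throughput demanded at each level of a 3-tier tree when each
# level fans out by a factor of W. The W and W^2 factors mirror the figure.

def demanded_throughput(w: int, leaf_throughput: float = 1.0) -> dict:
    return {
        "leaf": leaf_throughput,             # 1x: one leaf switch's traffic
        "aggregation": w * leaf_throughput,  # Wx: aggregates W leaf switches
        "core": w * w * leaf_throughput,     # W^2 x: aggregates W aggregation switches
    }

# With identical 1x switches everywhere, a fan-out of W = 8 leaves the core
# over-subscribed by up to 64x in the worst case.
print(demanded_throughput(w=8))  # {'leaf': 1.0, 'aggregation': 8.0, 'core': 64.0}
```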
Clos Networks
• Allocating an input port, and its associated throughput, allocates a path the whole way through
• NxN connectivity from switches with less than NxN connectivity
• Basically a way to make a large NxN switch
• Still an expensive expansion, and not likely to need all of the throughput capacity simultaneously
[Figure: 3-stage Clos network. Image by Piggly, Public Domain, https://commons.wikimedia.org/w/index.php?curid=61536102]
Leaf and Spine
• Type of Clos network
• Essentially a folded Clos, but still N-to-N connections
• Derived from the old phone company architecture, invented in the 1950s
• All paths are the same length from edge to edge
• Great for switch vendors
• Need to pick a path, since any middle (spine) switch can be chosen
• Very redundant
• Can implement at layer-2 or layer-3
• More soon
Fat-Tree Networks
• More throughput at higher levels, making throughput more even across levels
• Not easy to do, since buying more powerful switches is harder
• To the extent it is possible, it costs more per unit of capacity
• Not possible beyond a modest point
• This is somewhat necessarily the case: if bigger switches were more readily available and economical, they’d be used at the bottom, and we’d be back where we started.
Fat-Trees With Skinny Switches: Goals
• Use all commodity switches
• Full throughput from host-to-host
• Compatible with usual TCP/IP stack
• Better energy efficiency per unit throughput from many smaller switches than from fewer bigger switches
[Figure: Fat Tree (K=4). Note the replacement of the aggregation-layer switches with 2 layers of K/2 K-port switches per pod; (K/2)² core switches; (K/2)² servers per pod; K-port switches support K³/4 servers.]


Fat Tree Details
• K-ary fat tree: three layers (core, aggregation, edge)
• Each pod consists of (K/2)² servers and 2 layers of K/2 K-port switches
• Each edge switch connects (K/2) servers to (K/2) aggregation switches
• Each aggregation switch connects (K/2) edge and (K/2) core switches
• (K/2)² core switches, each ultimately connecting to K pods
• Providing (K/2)² different roots, not 1. The trick is to pick different ones
• K-port switches support K³/4 servers/hosts:
• (K/2 hosts/edge switch * K/2 edge switches per pod * K pods)
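The counts above follow directly from K; a minimal sketch of that arithmetic (the function name is illustrative):

```python
# Component counts for a K-ary fat tree, following the bullets above.
# K must be even, since each switch splits its K ports half down, half up.

def fat_tree_sizes(k: int) -> dict:
    assert k % 2 == 0, "K must be even"
    return {
        "pods": k,
        "edge_switches_per_pod": k // 2,
        "aggregation_switches_per_pod": k // 2,
        "core_switches": (k // 2) ** 2,
        "servers_per_pod": (k // 2) ** 2,
        "servers_total": (k // 2) * (k // 2) * k,  # = K^3 / 4
    }

print(fat_tree_sizes(4))   # 4 core switches, 16 servers in total
print(fat_tree_sizes(48))  # 576 core switches, 27648 servers in total
```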
Using Multiple Paths
• Must pick different paths (“path diversity”) or will have a hotspot
• Unless each session keeps to a single path, reordering will be a problem and will need to be resolved with buffering higher up (see the hashing sketch after this list)
• Static paths may not respond to actual, dynamic workloads
• Can be done at different levels.
• Higher levels, e.g. transport, are more flexible, but likely more effort and slower
• Lower levels are likely less adaptive, but simpler and faster.
• Ability to weight or remove paths can aid fault tolerance
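One common low-level way to get path diversity without reordering is to hash each flow’s 5-tuple, so every packet of a flow follows the same path while different flows spread across the available paths; a minimal sketch (plain hashing for illustration, not any specific switch’s or PortLand’s mechanism):

```python
import hashlib

# Hash-based path selection: all packets of a flow map to the same path
# (no intra-flow reordering), while different flows spread across paths.

def pick_path(src_ip, dst_ip, src_port, dst_port, proto, num_paths):
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % num_paths

k = 4
num_paths = (k // 2) ** 2   # (K/2)^2 shortest paths in a K-ary fat tree
print(pick_path("10.0.1.2", "10.2.0.3", 40000, 80, 6, num_paths))  # stable for this flow
print(pick_path("10.0.1.2", "10.2.0.3", 40001, 80, 6, num_paths))  # likely a different path
```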
Portland Solution
• Use commodity switches and offload services into software on commodity servers
• Start with a fat tree for a topology without hot spots
• Use layer-2 to avoid routing, forwarding, and related complexity
• Separate host identifier from host location
• IP addresses identify the host, but not its location; they are just an ID
• Use “Pseudo MAC addresses” to identify location at layer-2
PortLand Addresses
• Normally MAC addresses are arbitrary – no clue about location
• IP normally is hierarchical, but here we are using it only as a host identifier
• If MAC addresses are not tied to location, switch tables grow linearly with the growth of the network, i.e. O(n)
• PortLand uses hierarchical MAC addresses, called “Pseudo MAC” or PMAC addresses, to encode location
• <pod:position:port:vmid>
• <16,8,8,16> bits
PortLand PMAC Addresses
[Figure: fat tree (K=4) with pods numbered 0–3; edge switches labeled by position (0, 1) within each pod]
PMAC: <pod.position.port.vmid>, 48 bits: <16-bits.8-bits.8-bits.16-bits>


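Since a PMAC is just 48 bits laid out as <pod.position.port.vmid> with <16.8.8.16>-bit fields, packing and unpacking are plain bit arithmetic; a minimal sketch (helper names are illustrative):

```python
# Pack and unpack a PortLand PMAC: <pod.position.port.vmid>, <16.8.8.16> bits.

def make_pmac(pod: int, position: int, port: int, vmid: int) -> int:
    assert pod < 2**16 and position < 2**8 and port < 2**8 and vmid < 2**16
    return (pod << 32) | (position << 24) | (port << 16) | vmid

def split_pmac(pmac: int):
    return ((pmac >> 32) & 0xFFFF,  # pod
            (pmac >> 24) & 0xFF,    # position
            (pmac >> 16) & 0xFF,    # port
            pmac & 0xFFFF)          # vmid

def pmac_str(pmac: int) -> str:
    # Render the 48-bit value as a conventional colon-separated MAC address.
    return ":".join(f"{(pmac >> s) & 0xFF:02x}" for s in range(40, -1, -8))

pmac = make_pmac(pod=2, position=1, port=0, vmid=1)
print(pmac_str(pmac))    # 00:02:01:00:00:01
print(split_pmac(pmac))  # (2, 1, 0, 1)
```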
VM Migration
• Flat address space.
• IP address unchanged after migration, so higher levels don’t see a state change
• After migration the IP<->PMAC mapping changes, as the PMAC is location dependent
• The VM sends a gratuitous ARP with the new mapping
• The Fabric Manager receives the ARP and sends an invalidation to the old switch
• The old switch sets a flow table entry trapping to software, causing an ARP with the new mapping to be sent in response to any stray packets
• Forwarding the stray packet is optional, since a retransmission (if the transport is reliable) will fix delivery
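A hedged sketch of that invalidation sequence (the class and method names are hypothetical, not PortLand’s actual implementation):

```python
# Sketch of the PMAC invalidation flow after a VM migrates.

class EdgeSwitch:
    def __init__(self):
        self.stale = {}  # old PMAC -> (IP, new PMAC)

    def invalidate(self, old_pmac, ip, new_pmac):
        # Flow-table entry trapping the stale PMAC to software.
        self.stale[old_pmac] = (ip, new_pmac)

    def on_packet(self, dst_pmac, sender):
        if dst_pmac in self.stale:
            ip, new_pmac = self.stale[dst_pmac]
            sender.receive_arp(ip, new_pmac)  # redirect the stray sender
            # Forwarding the stray packet itself is optional (see above).

class FabricManager:
    def __init__(self):
        self.ip_to_pmac = {}  # IP -> (PMAC, edge switch)

    def on_gratuitous_arp(self, ip, new_pmac, new_edge_switch):
        old = self.ip_to_pmac.get(ip)
        self.ip_to_pmac[ip] = (new_pmac, new_edge_switch)
        if old is not None:
            old_pmac, old_switch = old
            old_switch.invalidate(old_pmac, ip, new_pmac)
```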
Location Discovery: Configuring Switch IDs
• Humans = not the right answer
• Discovery = the right answer
• Send messages to neighbors – get tree level
• Hosts don’t reply, so edge switches only hear back from above
• Aggregation switches hear back from both levels
• Core switches hear back only from aggregation switches
• Contact Fabric Manager with tree level to get ID
• Fabric Manager is service running on commodity host
• Assigns ID
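A minimal sketch of the level-inference idea from the bullets above (not the full PortLand location discovery protocol; names and inputs are illustrative):

```python
# Classify a switch's tree level from which neighbors answered its discovery
# messages. Levels propagate over successive rounds, so early rounds may
# return None until neighbors have announced their own levels.

def infer_level(ports_with_replies, total_ports, neighbor_levels):
    if len(ports_with_replies) < total_ports:
        # Hosts never reply, so silence on some ports marks an edge switch.
        return "edge"
    if "edge" in neighbor_levels:
        # Hears back from edge switches below (and core above): aggregation.
        return "aggregation"
    if neighbor_levels and neighbor_levels <= {"aggregation"}:
        # Hears back only from aggregation switches: core.
        return "core"
    return None  # not enough information yet; wait for another round

print(infer_level({2, 3}, 4, set()))                   # edge
print(infer_level({0, 1, 2, 3}, 4, {"edge", None}))    # aggregation
print(infer_level({0, 1, 2, 3}, 4, {"aggregation"}))   # core
```

Once a switch knows its level, it contacts the Fabric Manager to be assigned an ID, as the bullets above describe.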
Name Resolution: MAC ↔ PMAC ↔ IP
• End hosts continue to use Actual MAC (AMAC) addresses
• Switches convert PMAC<->AMAC for the host
• The edge switch is responsible for creating the PMAC:AMAC mapping and telling the Fabric Manager
• Software on a commodity server; can be replicated, etc. Simplicity is a virtue.
• Mappings are timed out of the Fabric Manager’s cache if not used
• ARPs resolve to PMACs
• First ask the Fabric Manager, which keeps a cache; then, if needed, broadcast (see the sketch after this list)
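A minimal sketch of that lookup order at an edge switch (the interfaces here are hypothetical):

```python
# ARP resolution per the bullets above: ask the Fabric Manager's cache first;
# only on a miss fall back to a network-wide broadcast.

def resolve_arp(target_ip, fabric_manager, broadcast_arp):
    pmac = fabric_manager.lookup(target_ip)  # central cache, usually a hit
    if pmac is not None:
        return pmac
    return broadcast_arp(target_ip)          # rare, expensive fallback
```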
No loops, No Spanning Trees
• Forwarding only goes up the tree and then back down; a packet heading down is never sent back up
• Cycles are therefore not possible
Failure
• Keep-alives, like the link discovery messages
• Miss a keep-alive? Tattle to the Fabric Manager
• The Fabric Manager tells affected switches, which adjust their own tables
• O(N) vs O(N²) for traditional routing algorithms (the Fabric Manager tells every switch vs. every switch tells every switch)
Looking Back
• Connectivity – Hosts can talk! No possibility of loops
• Efficiency – Much less memory needed in switches, O(N) fault handling
• Self configuring – Discovery protocol + ARP
• Robust – Failure handling coordinated by FM
• VMs and Migration – Each has own IP address, each has own MAC address
• Commodity hardware – Nothing magic.
Flow Classification
• Type of “diffusion optimization”
• Mitigate local congestion
• Assign traffic to ports based upon flow, not host.
• One host can have many flows, thus many assigned routings
• Fairly distribute flows
• (K/2)² shortest paths available – but that doesn’t help if all flows pick the same one, e.g. the same root of the multi-rooted tree
• Periodically reassign output port to free up corresponding input port
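A minimal sketch of the periodic reassignment idea (the data structure and the rebalancing rule are illustrative, not the exact heuristic from the fat-tree paper):

```python
# Periodically move one flow from the busiest output port to the least-busy
# one. `flows` maps flow-id -> (assigned_port, bytes_sent_recently).

def rebalance(flows, num_ports):
    load = [0] * num_ports
    for port, nbytes in flows.values():
        load[port] += nbytes
    busiest = max(range(num_ports), key=lambda p: load[p])
    lightest = min(range(num_ports), key=lambda p: load[p])
    if busiest == lightest:
        return flows
    # Move the smallest flow off the busiest port (a simple, conservative choice).
    victim = min((f for f, (p, _) in flows.items() if p == busiest),
                 key=lambda f: flows[f][1], default=None)
    if victim is not None:
        flows[victim] = (lightest, flows[victim][1])
    return flows

flows = {"a": (0, 900), "b": (0, 400), "c": (1, 100)}
print(rebalance(flows, num_ports=2))  # flow "b" moves to port 1
```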
Flow Scheduling
• Also a “Diffusion Optimization”
• Detect and deconflict large, long-lived flows
• Thresholds for throughput and longevity
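A minimal sketch of threshold-based detection (the specific threshold values are invented for illustration):

```python
import time

# A flow counts as "large and long-lived" once it exceeds both a byte-count
# threshold and an age threshold; these numbers are placeholders.
BYTES_THRESHOLD = 10 * 1024 * 1024  # 10 MB of payload
AGE_THRESHOLD_S = 1.0               # alive for at least 1 second

def is_large_long_lived(flow_bytes, first_seen, now=None):
    now = time.time() if now is None else now
    return flow_bytes >= BYTES_THRESHOLD and (now - first_seen) >= AGE_THRESHOLD_S

print(is_large_long_lived(64 * 1024 * 1024, first_seen=0.0, now=5.0))  # True
print(is_large_long_lived(2 * 1024, first_seen=0.0, now=5.0))          # False
```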
Fat-Tree Solution: “Special” IP Addressing
• “10.0.0.0/8” private addresses
• Pod-level uses “10.pod.switch.1“
• pod,switch < K
• Core-level uses "10.K.j.i“
• K is the same K as elsewhere, the number of ports/switch
• View cores as logical square. i, j denote position in square.
• Hosts use “10.pod.switch.ID" addresses
• 2 <= ID <= (K/2)+1
• ID = 1 is the pod-level switch; a larger ID would mean too many hosts
• 8-bits implies K < 256
• Will pre-bake the paths to ensure diversity, while maintaining ordering
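A minimal sketch that enumerates those addresses for a given K, assuming the ranges above (host IDs run from 2 up to K/2 + 1 so that .1 stays reserved for the edge switch):

```python
# Enumerate the "special" fat-tree addressing scheme described above.

def fat_tree_addresses(k: int) -> dict:
    assert k % 2 == 0 and k < 256
    pod_switches = [f"10.{pod}.{switch}.1"
                    for pod in range(k) for switch in range(k)]
    core_switches = [f"10.{k}.{j}.{i}"                  # cores as a logical square
                     for j in range(1, k // 2 + 1)
                     for i in range(1, k // 2 + 1)]
    hosts = [f"10.{pod}.{switch}.{host_id}"             # hosts hang off edge switches
             for pod in range(k)
             for switch in range(k // 2)
             for host_id in range(2, k // 2 + 2)]
    return {"pod_switches": pod_switches,
            "core_switches": core_switches,
            "hosts": hosts}

addrs = fat_tree_addresses(4)
print(len(addrs["core_switches"]))  # (K/2)^2 = 4
print(len(addrs["hosts"]))          # K^3/4 = 16
print(addrs["hosts"][:2])           # ['10.0.0.2', '10.0.0.3']
```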
