Data Center Network Topologies:
VL2 (Virtual Layer 2)
Hakim Weatherspoon
Assistant Professor, Dept of Computer Science
CS 5413: High Performance Systems and Networking
September 26, 2014
Slides used and adapted judiciously from COS-561, Advanced Computer Networks At Princeton University
Goals for Today
- VL2: a scalable and flexible data center network
- A. Greenberg, J. R. Hamilton, N. Jain, S. Kandula, C. Kim, P. Lahiri, D. A. Maltz, P. Patel, and S. Sengupta. ACM Computer Communication Review (CCR), August 2009, pages 51-62.
Conventional DCN Problems
- Static network assignment
- Fragmentation of resources
- Poor server-to-server connectivity
- Traffic of different services affects each other
- Poor reliability and utilization
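[Figure: conventional hierarchical topology with two CR nodes at the top, four AR nodes below them, and pairs of S nodes below those. Link labels beginning "1:" are truncated in the source. One group of servers is annotated "I want more" and another "I have spare ones, but…".]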
Objectives:
- Uniform high capacity:
- Maximum rate of server-to-server traffic flow should be limited only by the capacity of the network cards
- Assigning servers to services should be independent of network topology
- Performance isolation:
- Traffic of one service should not be affected by traffic of other services
- Layer-2 semantics:
- Easily assign any server to any service
- Configure server with whatever IP address the service expects
- VM keeps the same IP address even after migration
Measurements and Implications of DCN
- Traffic matrix analysis:
- Traffic patterns are hard to summarize
- Traffic patterns are unstable
- Failure characteristics:
- Duration of networking equipment failures: 95% < 1 min, 98% < 1 hr, 99.6% < 1 day, 0.09% > 10 days
- No obvious way to eliminate all failures from the top of the hierarchy
Virtual Layer 2 Switch (VL2)
- Design principles:
- Randomizing to cope with volatility:
- Using Valiant Load Balancing (VLB) to spread traffic across multiple intermediate nodes, independent of destination
- Building on proven networking technology:
- Using IP routing and forwarding technologies available in commodity switches
- Separating names from locators:
- Using directory system to maintain the mapping between names and locations
- Embracing end systems:
- A VL2 agent at each server
VL2 Goals and Solutions
Objective | Approach | Solution
1. Layer-2 semantics | Employ flat addressing | Name-location separation & resolution service
2. Uniform high capacity between servers | Guarantee bandwidth for hose-model traffic | Flow-based random traffic indirection (Valiant LB)
3. Performance isolation | Enforce hose model using existing mechanisms only | TCP
"Hose": each node has ingress/egress bandwidth constraints
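The hose-model constraint can be made concrete with a small check. The sketch below assumes a traffic matrix given as a nested list of rates and per-node capacity lists; the function name and the sample numbers are illustrative and do not come from the slides.

```python
from typing import Sequence


def satisfies_hose_model(
    traffic: Sequence[Sequence[float]],
    egress_cap: Sequence[float],
    ingress_cap: Sequence[float],
) -> bool:
    """Return True if traffic[i][j] (rate from node i to node j) keeps
    every node within its egress and ingress limits."""
    n = len(traffic)
    for i in range(n):
        sent = sum(traffic[i][j] for j in range(n) if j != i)
        received = sum(traffic[j][i] for j in range(n) if j != i)
        if sent > egress_cap[i] or received > ingress_cap[i]:
            return False
    return True


caps = [1.0, 1.0, 1.0]  # assumed NIC capacities, arbitrary units
ok_tm = [[0, 0.5, 0.5], [0.3, 0, 0.4], [0.2, 0.5, 0]]
bad_tm = [[0, 0.9, 0.0], [0.0, 0, 0.0], [0.0, 0.8, 0]]  # node 1 receives 1.7
```

Any matrix that passes this check is one the network is expected to carry; the check says nothing about which pairs of nodes communicate, only about per-node totals.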
Name/Location Separation
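Cope with host churn with very little overhead
- Servers use flat names
- Switches run link-state routing and maintain only switch-level topology
[Figure: hosts x, y, and z sit under ToR 1 to ToR 4. Senders consult the directory service ("Lookup & Response") and packets carry a ToR address in front of the host name, e.g. "ToR 3 | y | payload" and "ToR 4 | z | payload". The directory table lists x at ToR 2, y at ToR 3, z at ToR 4; a second copy of the table shows z at ToR 3, with a matching packet "ToR 3 | z | payload".]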
- Allows the use of low-cost switches
- Protects network and hosts from host-state churn
- Obviates host and switch reconfiguration
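A minimal sketch of name/location separation, assuming a dictionary-backed directory and a simple packet record. The `Directory`, `Packet`, and `send` names are hypothetical; the host and ToR labels follow the figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    tor: str        # locator: the ToR switch the network routes toward
    dst: str        # flat name of the destination server
    payload: bytes


class Directory:
    """Toy name-to-locator table standing in for the directory service."""

    def __init__(self) -> None:
        self._loc = {}  # flat name -> ToR label

    def update(self, name: str, tor: str) -> None:
        self._loc[name] = tor

    def lookup(self, name: str) -> str:
        return self._loc[name]


def send(directory: Directory, dst: str, payload: bytes) -> Packet:
    # Resolve the flat name, then wrap the packet so that switches
    # only need to route on the ToR locator.
    return Packet(tor=directory.lookup(dst), dst=dst, payload=payload)


d = Directory()
for name, tor in [("x", "ToR 2"), ("y", "ToR 3"), ("z", "ToR 4")]:
    d.update(name, tor)

before = send(d, "z", b"payload")
d.update("z", "ToR 3")  # z moves; only the directory entry changes
after = send(d, "z", b"payload")
```

The destination name stays "z" in both packets. Only the locator changes, which is why neither the host nor the switches need reconfiguration in this sketch.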
Valiant Load Balancing: Indirection
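Cope with arbitrary traffic matrices (TMs) with very little overhead
[Figure: ToR switches T1 to T6 connect upward to intermediate switches that all carry the label I_ANY. Hosts x, y, and z appear alongside packets labeled "T3 | y | payload" and "T5 | z | payload". The legend distinguishes links used for up paths from links used for down paths.]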
[ ECMP + IP Anycast ]
- Harness huge bisection bandwidth
- Obviate esoteric traffic engineering or optimization
- Ensure robustness to failures
- Work with switch mechanisms available today
- Must spread traffic
- Must ensure destination independence
- Spreading is done with Equal Cost Multi Path (ECMP) forwarding
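Flow-based spreading can be sketched as a hash over a flow's five-tuple. This is an illustration under assumptions, not the VL2 implementation: real ECMP hashing happens in switch hardware, and the `Flow` type, `pick_intermediate` function, and switch labels below are hypothetical.

```python
import hashlib
from typing import NamedTuple, Sequence


class Flow(NamedTuple):
    src: str
    dst: str
    src_port: int
    dst_port: int
    proto: str = "tcp"


def pick_intermediate(flow: Flow, intermediates: Sequence[str]) -> str:
    """Hash the flow's five-tuple onto one intermediate switch.

    Every packet of a flow maps to the same switch, while different
    flows spread across all of them.
    """
    key = "|".join(map(str, flow)).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(intermediates)
    return intermediates[index]


switches = ["I1", "I2", "I3"]  # assumed labels for the intermediate switches
flows = [Flow("x", "y", 40000 + i, 80) for i in range(300)]
counts = {s: 0 for s in switches}
for f in flows:
    counts[pick_intermediate(f, switches)] += 1
```

Hashing per flow rather than per packet keeps a TCP connection on one path, which avoids reordering. The choice depends only on the hash, not on where the destination sits in the topology.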
VL2 Directory System
[Figure: agents on the servers talk to a tier of directory servers (DS), which are backed by replicated state machine (RSM) servers.]
- "Lookup": 1. Lookup, 2. Reply
- "Update": 1. Update, 2. Set, 3. Replicate, 4. Ack, 5. Ack, (6. Disseminate)
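The lookup and update paths can be sketched with two toy classes. This assumes in-memory dictionaries in place of real replication and networking; the class and method names are hypothetical, and the step numbers in the comments follow the figure labels.

```python
class RSMCluster:
    """Stand-in for the replicated state machine servers."""

    def __init__(self, replicas: int = 3) -> None:
        self.replicas = [{} for _ in range(replicas)]

    def set(self, name: str, tor: str) -> bool:
        # 2. Set and 3. Replicate: apply the write on every replica.
        for replica in self.replicas:
            replica[name] = tor
        return True  # 4. Ack back to the directory server

    def get(self, name: str):
        return self.replicas[0].get(name)


class DirectoryServer:
    """Read-optimized cache in front of the RSM tier."""

    def __init__(self, rsm: RSMCluster) -> None:
        self.rsm = rsm
        self.cache = {}

    def lookup(self, name: str):
        # 1. Lookup / 2. Reply: answer from the local cache when possible.
        if name in self.cache:
            return self.cache[name]
        return self.rsm.get(name)

    def update(self, name: str, tor: str, peers=()) -> bool:
        acked = self.rsm.set(name, tor)  # steps 2 to 4
        if acked:
            self.cache[name] = tor
            for peer in peers:  # (6. Disseminate), optional
                peer.cache[name] = tor
        return acked  # 5. Ack to the agent


rsm = RSMCluster()
ds1, ds2 = DirectoryServer(rsm), DirectoryServer(rsm)
ds1.update("z", "ToR 4", peers=[ds2])
ds1.update("z", "ToR 3", peers=[ds2])
```

Splitting the system this way lets the many lookups hit cheap caches while the rarer updates pay for replication.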
Evaluation
- Fairness:
- 75 nodes
- Real data center workload
- Plot Jain’s fairness index for traffic sent to the intermediate switches
[Figure: fairness index (y-axis, tick labels "0.94" and "1." as extracted) versus time in seconds (x-axis, 0 to 500), with one curve per switch: Aggr1, Aggr2, Aggr.]
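For reference, Jain's fairness index over n loads x_1..x_n is (sum of x_i)^2 / (n * sum of x_i^2). It equals 1 when all loads are equal and 1/n when a single node carries everything. A small sketch, with a hypothetical function name and made-up sample loads:

```python
from typing import Sequence


def jain_fairness(loads: Sequence[float]) -> float:
    """Jain's index: (sum x)^2 / (n * sum x^2), ranging from 1/n to 1."""
    n = len(loads)
    total = sum(loads)
    squares = sum(x * x for x in loads)
    if n == 0 or squares == 0:
        raise ValueError("need at least one non-zero load")
    return (total * total) / (n * squares)


even = jain_fairness([10.0, 10.0, 10.0])    # all switches equally loaded
skewed = jain_fairness([30.0, 0.0, 0.0])    # one switch carries everything
```

An index that stays close to 1 over time, as on the plot's y-axis, indicates that traffic is split almost evenly across the switches being compared.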
Evaluation
- Performance isolation:
- Two types of services:
- Service one: 18 servers each run a single TCP transfer continuously
- Service two (first experiment): 19 servers start an 8 GB transfer over TCP every 2 seconds
- Service two (second experiment): 19 servers burst short TCP connections
Perspective
- Studied traffic in a production data center and found the patterns unstable and hard to summarize
- Designed, built, and deployed every component of VL2 in an 80-server testbed
- Applied VLB to randomly spread traffic flows across multiple intermediate nodes
- Used flat addressing to separate server names from their locations
Critique
- Extra servers are needed to support the VL2 directory system:
- Adds equipment cost
- Hard to implement in data centers with tens of thousands of servers
- All links and switches are active all the time, which is not power efficient
- No evaluation of real-time performance