Monday, November 7, 2011

Scalable data center networking: Portland, VL2, and c-Through


Commodity Ethernet switches offer very useful semantics to end hosts. An end host can be added or removed without any manual configuration (plug-and-play). Even if a host moves to a different physical port, communication with it is not disrupted. Most commercial L2 switches also provide full bisection bandwidth between their ports, which is particularly important for MapReduce-like workloads.

However, these good characteristics do not scale beyond a single switch. Each switch has limited port density (typically 24-48 ports) and a small SRAM that caps the size of its MAC learning table. At data center scale, these limits cause many problems: operational cost of manual configuration, an obsession with locality due to limited bisection bandwidth, and so on. VL2 and Portland address the problem with the same goal: the data center network should look like a single (virtual) layer-2 domain.
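To make the table-size limit concrete, here is a minimal sketch of a learning switch whose MAC table holds only a fixed number of entries. The class name, the LRU eviction policy, and the tiny table size are my own assumptions for illustration; real switches differ in how they age out and evict entries.

```python
from collections import OrderedDict


class LearningSwitch:
    """Toy L2 learning switch with a bounded MAC table (hypothetical model)."""

    def __init__(self, num_ports, table_size):
        self.num_ports = num_ports
        self.table_size = table_size
        self.table = OrderedDict()  # MAC address -> port, oldest entry first

    def receive(self, src_mac, dst_mac, in_port):
        """Learn the source address, then return the list of output ports."""
        # Plug-and-play: the switch learns where a host is from its traffic,
        # and relearns automatically if the host shows up on another port.
        self.table[src_mac] = in_port
        self.table.move_to_end(src_mac)
        if len(self.table) > self.table_size:
            self.table.popitem(last=False)  # assumed policy: evict the oldest

        out_port = self.table.get(dst_mac)
        if out_port is None:
            # Unknown destination: flood to every port except the ingress one.
            return [p for p in range(self.num_ports) if p != in_port]
        if out_port == in_port:
            return []  # destination is on the same segment, nothing to forward
        return [out_port]


sw = LearningSwitch(num_ports=4, table_size=2)
print(sw.receive("a", "b", 0))  # "b" unknown: flood
print(sw.receive("b", "a", 1))  # "a" known: forward to port 0
print(sw.receive("c", "a", 2))  # learning "c" evicts "a": flood again
```

With more active hosts than table entries, known destinations keep getting evicted and the switch falls back to flooding, which is one reason the single-switch behavior does not carry over to thousands of hosts.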

Portland and VL2 have slightly different scopes and approaches. For example, VL2 works with modified end hosts and relies on some layer-3 features of commodity switches, such as ECMP and IP-in-IP tunneling, while Portland relies only on simple layer-2 features but assumes some modifications to the switches (MAC rewriting). Both have some notion of a centralized service that maintains location-mapping state.
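The idea of a centralized location-mapping service can be sketched as a small directory that maps a host's fixed identity to its current location. The sketch below borrows Portland's notion of a location-encoding pseudo MAC; the 16/8/8/16-bit pod/position/port/vmid layout is how I remember it from the paper, so treat the field widths as an assumption. `LocationDirectory`, `register`, and `resolve` are hypothetical names, not the API of either system.

```python
def encode_pmac(pod, position, port, vmid):
    """Pack a location into a 48-bit pseudo MAC (assumed 16/8/8/16-bit layout)."""
    fields = (("pod", pod, 16), ("position", position, 8),
              ("port", port, 8), ("vmid", vmid, 16))
    for name, value, bits in fields:
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name}={value} does not fit in {bits} bits")
    raw = (pod << 32) | (position << 24) | (port << 16) | vmid
    # Render the 48-bit value as six colon-separated bytes, high byte first.
    return ":".join(f"{(raw >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))


class LocationDirectory:
    """Toy centralized mapping from a host's IP to its current location."""

    def __init__(self):
        self._by_ip = {}  # IP -> (actual MAC, pseudo MAC)

    def register(self, ip, actual_mac, pmac):
        # Called when a host appears or moves; the newest location wins.
        self._by_ip[ip] = (actual_mac, pmac)

    def resolve(self, ip):
        # Return the location-encoding address, or None if the host is unknown.
        entry = self._by_ip.get(ip)
        return entry[1] if entry else None


directory = LocationDirectory()
directory.register("10.0.0.5", "aa:bb:cc:dd:ee:ff", encode_pmac(1, 2, 3, 4))
print(directory.resolve("10.0.0.5"))

# The host migrates to another pod: only the directory entry changes,
# while the host keeps its IP and its actual MAC.
directory.register("10.0.0.5", "aa:bb:cc:dd:ee:ff", encode_pmac(7, 0, 1, 4))
print(directory.resolve("10.0.0.5"))
```

The point of the design is that hosts keep a flat, location-independent identity, while forwarding inside the fabric uses the hierarchical address that the directory hands out.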

c-Through takes an interesting approach to achieve larger bandwidth between servers. Ethernet switches are suitable for low-latency, bursty, and uniform traffic. Optical switches, on the other hand, provide much more bandwidth for stable traffic. Their bandwidth is theoretically unlimited, as they do not interpret the data itself but just forward the optical signal between ports. The disadvantage of optical switches is their long circuit switching time (on the order of milliseconds), which is needed to mechanically adjust the mirrors. They also need an external scheduler to manage the switching fabric.
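Why the switching time favors stable traffic is easy to see with a bit of arithmetic. The sketch below assumes that a circuit carries nothing while it is being reconfigured and runs at full rate afterwards; the function names and the link rates and times in the example are made-up numbers for illustration, not measurements from the paper.

```python
def effective_rate_gbps(optical_gbps, reconfig_ms, hold_ms):
    """Average rate of a circuit that is set up once and held for hold_ms."""
    # No data flows during reconfiguration, so that time is pure overhead.
    return optical_gbps * hold_ms / (reconfig_ms + hold_ms)


def break_even_hold_ms(optical_gbps, electrical_gbps, reconfig_ms):
    """Hold time beyond which the circuit beats the electrical path."""
    if optical_gbps <= electrical_gbps:
        raise ValueError("the circuit must be faster than the electrical path")
    # Solve optical * hold / (reconfig + hold) = electrical for hold.
    return reconfig_ms * electrical_gbps / (optical_gbps - electrical_gbps)


# Hypothetical numbers: 10 Gbps circuit, 1 Gbps electrical path, 10 ms switching.
print(effective_rate_gbps(10, reconfig_ms=10, hold_ms=90))
print(effective_rate_gbps(10, reconfig_ms=10, hold_ms=1))
print(break_even_hold_ms(10, 1, reconfig_ms=10))
```

Under these assumed numbers, a circuit held for 90 ms averages 9 Gbps, while one that is torn down after 1 ms averages under 1 Gbps. A traffic pattern that changes faster than the circuits can follow gets little benefit from the optical path.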

At SIGCOMM 2010, c-Through and Helios independently proposed augmenting datacenter networks with optical switches. While c-Through has end hosts decide whether to offload traffic to the optical network, in Helios the decision is up to the switches. Interestingly, this difference (where to make modifications) is very similar to the relationship between Portland and VL2.
