Link aggregation
Link aggregation, standardized as IEEE 802.1AX-2008, is a computer networking term that describes using multiple network cables or ports in parallel to increase the link speed beyond the limit of any single cable or port, and to increase redundancy for higher availability.
Most implementations now conform to what used to be clause 43 of the IEEE 802.3-2005 Ethernet standard, usually still referred to by its working group name of "IEEE 802.3ad". The link aggregation definition has since been moved to the standalone IEEE 802.1AX standard.
Other terms for link aggregation include "Ethernet trunk", "NIC teaming", "port channel", "port teaming", "port trunking", "link bundling", "EtherChannel", "Multi-Link Trunking (MLT)", "NIC bonding", and "Network Fault Tolerance (NFT)".
Description
Link aggregation is designed to overcome two problems with Ethernet connections: bandwidth limitations and lack of redundancy.
The first issue is that bandwidth requirements do not scale linearly. Ethernet bandwidths have historically increased by an order of magnitude each generation (10 Mbit/s, 100 Mbit/s, 1,000 Mbit/s, 10,000 Mbit/s). If one started to bump into bandwidth ceilings, the only option was to move to the next generation, which could be cost-prohibitive. An alternative solution, introduced by many network manufacturers in the early 1990s, is to combine two physical Ethernet links into one logical link via channel bonding. Most of these solutions required manual configuration and identical equipment on both sides of the aggregation.[1]
The second problem is that there are three single points of failure in a typical port-cable-port connection. In either the usual computer-to-switch or a switch-to-switch configuration, the cable itself or either of the ports it is plugged into can fail. Multiple physical connections can be made, but many of the higher-level protocols were not designed to fail over completely seamlessly.
IEEE Link Aggregation
Standardization process
By the mid-1990s, most network switch manufacturers had included aggregation capability as a proprietary extension to increase bandwidth between their switches. However, each manufacturer developed its own method, which led to compatibility problems. The IEEE 802.3 working group formed a study group to create an interoperable link-layer standard at its November 1997 meeting.[1] The group quickly agreed to include an automatic configuration feature which would add redundancy as well. This became known as the "Link Aggregation Control Protocol".
Initial release 802.3ad in 2000
Most gigabit channel bonding is now based on the IEEE link aggregation standard, formerly clause 43 of the IEEE 802.3 standard, added in March 2000 by the IEEE 802.3ad task force.[2] Nearly every network equipment manufacturer quickly adopted this joint standard over its proprietary standards.
Move to 802.1 layer in 2008
It had been noted that certain 802.1 layers (such as 802.1X security) were positioned in the protocol stack above Link Aggregation, which was defined as an 802.3 sublayer.[3] This discrepancy was resolved with the formal transfer of the protocol to the 802.1 group with the publication of IEEE 802.1AX-2008 on 3 November 2008.
Link Aggregation Control Protocol
The Link Aggregation Control Protocol (LACP) is included in the IEEE specification as a method to control the bundling of several physical ports together to form a single logical channel. LACP allows a network device to negotiate an automatic bundling of links by sending LACP packets to the peer (a directly connected device that also implements LACP).
Advantages over static configuration
- Failover when a link fails and there is, for example, a media converter between the devices, which means that the peer will not see the link go down. With static link aggregation the peer would continue sending traffic down the link, causing it to be lost.
- The device can confirm that the other end is configured for link aggregation. With static link aggregation, a cabling or configuration mistake could go undetected and cause undesirable network behavior.[4]
Practical notes
LACP works by sending frames (LACPDUs) down all links that have the protocol enabled. If a device on the other end of a link also has LACP enabled, it will independently send frames along the same links, enabling the two units to detect multiple links between themselves and then combine them into a single logical link. LACP can be configured in one of two modes: active or passive. In active mode a port always sends frames along the configured links; in passive mode it acts as "speak when spoken to", and can therefore be used as a way of controlling accidental loops (as long as the other device is in active mode).[5]
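
The active/passive distinction can be pictured with a small sketch. The Python snippet below is only a toy model of the "speak when spoken to" behaviour described above, not the real LACP state machine (actual LACPDUs carry system IDs, keys and timers); it simply shows why at least one end must be in active mode for the exchange to start.

# Toy model of LACP active/passive modes (not the real protocol):
# an active port transmits on its own; a passive port only replies
# after it has heard from the peer. Two passive ports never aggregate.
class Port:
    def __init__(self, name, active):
        self.name = name
        self.active = active          # True = LACP active mode
        self.heard_from_peer = False  # set when a peer "LACPDU" arrives

    def should_transmit(self):
        # Active ports always send; passive ports speak when spoken to.
        return self.active or self.heard_from_peer

    def receive(self):
        self.heard_from_peer = True

def negotiate(a, b, rounds=3):
    """Return True if both ends exchange frames within a few rounds."""
    for _ in range(rounds):
        if a.should_transmit():
            b.receive()
        if b.should_transmit():
            a.receive()
    return a.heard_from_peer and b.heard_from_peer

print(negotiate(Port("sw1", active=True), Port("sw2", active=False)))   # True
print(negotiate(Port("sw1", active=False), Port("sw2", active=False)))  # False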
Usage
Network backbone
Link aggregation is an inexpensive way to set up a high-speed backbone network that transfers much more data than any single port or device could deliver. Although various vendors used proprietary techniques in the past, the preference today is to use the IEEE standard, which can be set up either statically or by using the Link Aggregation Control Protocol (LACP). This allows several devices to communicate simultaneously at their full single-port speed while not allowing any one device to monopolize all available backbone capacity.
This has limitations: originally, link aggregation was developed to provide redundancy rather than bandwidth benefits. The actual benefits vary based on the load-balancing method used on each device (different balancing algorithms can be configured on each end, and doing so is actually encouraged to avoid path polarization).
The most common way to balance the traffic is to use L3 hashes. These hashes are calculated when the first connection is established and then kept in the devices' memory for future use. This effectively limits client bandwidth within an aggregate to the maximum bandwidth of a single member link per session, which is the main reason why a 50/50 load split is almost never reached in real-life implementations; 70/30 is more typical. More advanced distribution-layer switches can employ an L4 hash, which brings the balance closer to 50/50.
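
As an illustration of the difference, the sketch below (in Python, using an arbitrary CRC32 hash rather than any vendor's actual algorithm, and a hypothetical two-link aggregate with made-up addresses) shows how an L3 hash pins every session between the same pair of hosts to one member link, while an L4 hash mixes in the transport ports and so can spread separate sessions between those hosts across links.

# Illustrative per-flow hash distribution (not any vendor's real algorithm).
import zlib

LINKS = 2  # hypothetical number of member links in the aggregate

def l3_link(src_ip, dst_ip):
    # L3 hash: only the IP pair, so all sessions between two hosts share a link.
    key = f"{src_ip}-{dst_ip}".encode()
    return zlib.crc32(key) % LINKS

def l4_link(src_ip, dst_ip, src_port, dst_port):
    # L4 hash: adds the TCP/UDP ports, so different sessions can spread out.
    key = f"{src_ip}-{dst_ip}-{src_port}-{dst_port}".encode()
    return zlib.crc32(key) % LINKS

# Two sessions between the same pair of hosts:
print(l3_link("10.0.0.1", "10.0.0.2"))             # same link for both sessions
print(l3_link("10.0.0.1", "10.0.0.2"))
print(l4_link("10.0.0.1", "10.0.0.2", 40001, 80))  # may land on different links
print(l4_link("10.0.0.1", "10.0.0.2", 40002, 80))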
Link aggregation also allows the network's backbone speed to grow incrementally as demand on the network increases, without having to replace everything and buy new hardware.
For most backbone installations it is common to install more cabling or fiber optic pairs than are initially necessary, even if there is no immediate need for the additional cabling. This is done because labor costs are higher than the cost of the cable, and running extra cable reduces future labor costs if networking needs change. Link aggregation can allow the use of these extra cables to increase backbone speeds for little or no extra cost if ports are available.
Efficiency of equipment
Aggregation becomes inefficient beyond a certain bandwidth, depending on the total number of ports on the switch equipment. A 24-port gigabit switch with two 8-gigabit trunks uses sixteen of its available ports just for the two interswitch connections, leaving only eight of its 1-gigabit ports for other devices. The same configuration on a 48-port gigabit switch leaves 32 1-gigabit ports available, and so it is much more efficient (assuming, of course, that those ports are actually needed at the switch location).
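
The arithmetic behind this example is straightforward; a short Python sketch using the hypothetical port counts above:

# Hypothetical port budget: ports left for end devices once two interswitch
# trunks of eight 1-gigabit links each have been provisioned.
def ports_left(total_ports, trunks=2, links_per_trunk=8):
    return total_ports - trunks * links_per_trunk

print(ports_left(24))  # 8 ports left on a 24-port switch
print(ports_left(48))  # 32 ports left on a 48-port switch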
When 40-50% of the switch ports are being utilized for backbone trunking, upgrading to a switch with either more ports or a higher base-operating speed may be a better option than simply adding more switches, especially if the old switch can be re-used elsewhere on a less performance-critical part of the network.
Use on network interface cards
Network interface cards (NICs) can also sometimes be trunked together to form network links beyond the speed of any one single NIC. For example, this allows a central file server to establish a 2-gigabit connection using two 1-gigabit NICs trunked together.
Note that Microsoft Windows does not natively support link aggregation (at least up to Windows Server 2003);[6] however, some manufacturers provide software for aggregation on their multiport NICs at the device-driver layer. Intel, for example, has released a package for Linux called Advanced Networking Services (ANS) to bond Intel Fast Ethernet and Gigabit cards.[7] Nvidia also supports "teaming" with its Nvidia System Tools.
In Linux, FreeBSD, NetBSD, OpenBSD, Mac OS X Server, OpenSolaris, Citrix XenServer, VMware ESX Server, and commercial Unix distributions such as AIX, Ethernet bonding (trunking) is implemented on a higher level, and can hence deal with NICs from different manufacturers or drivers, as long as the NIC is supported by the kernel.
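
As a minimal sketch of inspecting such kernel-level bonding, the Python snippet below assumes a Linux host with the bonding driver loaded and a bond named bond0 (both assumptions); it reads the status file the driver exposes under /proc/net/bonding/ and prints the bonding mode and each slave interface's MII status. Exact field names can vary between kernel versions.

# Print a summary of a Linux bond from /proc/net/bonding/<bond>.
from pathlib import Path

def bond_summary(bond="bond0"):
    text = Path(f"/proc/net/bonding/{bond}").read_text()
    slave = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Bonding Mode:"):
            print(line)                      # e.g. IEEE 802.3ad dynamic link aggregation
        elif line.startswith("Slave Interface:"):
            slave = line.split(":", 1)[1].strip()
        elif line.startswith("MII Status:") and slave:
            print(f"{slave}: {line.split(':', 1)[1].strip()}")  # up / down per slave
            slave = None

bond_summary()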
Limitations
Order of frames
A limitation of link aggregation is the need to avoid reordering Ethernet frames. That goal is approximated by sending all frames associated with a particular session across the same link.[8] Depending on the traffic, this may not provide even distribution across the links in the trunk.
Single switch
A limitation of link aggregation is that all physical ports in the link aggregation group must reside on the same logical switch, which in most scenarios leaves a single point of failure: if the physical switch to which both links are connected goes offline, the whole aggregate fails.
However, this can be overcome by using vendor-specific extensions which aggregate multiple physical switches into one logical switch. As of 2009, the IEEE had not committed resources to standardizing this feature.
Same media
The ports and media used in a trunk should, as a rule of thumb, all be of the same type, such as all copper ports (CAT-5E/CAT-6), all multi-mode fiber ports (SX), or all single-mode fiber ports (LX). More importantly, however, all links within a trunk should run at the same speed.
Many of today's switches are PHY-independent, meaning that an SFP slot can accept copper, SX, LX, ZX, XD or CWDM transceivers. Maintaining the same PHY is a good rule of thumb, but maintaining the same speed on all links matters more. For example, SX fiber could be used for one link and LX (a longer, diverse path) for the second; both links still run at 1 Gbit/s, and the slightly longer transit time of one path is of no concern.
However, if two copper links, or one copper and one fiber link, are used, both links must negotiate to the same speed, be it 10, 100 or 1,000 Mbit/s. If one link negotiates to 1,000 Mbit/s and the other to 100 Mbit/s, all session traffic forced onto the 100 Mbit/s link will be very slow, or the link will become so congested that it is virtually unusable. The worst case is when the 1,000 Mbit/s link fails and all traffic is then placed on the 100 Mbit/s link.
References
- ^ a b http://grouper.ieee.org/groups/802/3/trunk_study/tutorial/index.html
- ^ IEEE 802.3ad Link Aggregation Task Force
- ^ http://www.ieee802.org/3/maint/public/maint_open_1106.pdf
- ^ Link aggregation on Dell servers
- ^ IEEE 802.3ad Link Aggregation Task Force
- ^ LACP (802.3ad) on Windows 2003
- ^ Intel Advanced Networking Services
- ^ http://grouper.ieee.org/groups/802/3/hssg/public/apr07/frazier_01_0407.pdf
External links
- Download IEEE 802.3ad Section 3 (refer to clause 43 for link aggregation)
- IEEE 802.3ad Link Aggregation Task Force
- Link Aggregation with 802.3ad ITworld.com
- Linux Link Aggregation and High Availability with Bonding
- Linux Kernel Documentation :: networking : bonding.txt
- Linux HOWTO documentation for Red Hat-like OSes