Content delivery network

From Wikipedia, the free encyclopedia

Jump to: navigation, search

A content delivery network or content distribution network (CDN) is a system of computers networked together across the Internet that cooperate transparently to deliver content to end users, most often for the purpose of improving performance, scalability, and cost efficiency.

Contents

[edit] CDN benefits

The capacity sum of strategically placed servers can be higher than the network backbone capacity. This can result in an impressive increase in the number of concurrent users. For instance, when there is a 10 Gbit/s network backbone and 100 Gbit/s central server capacity, only 10 Gbit/s can be delivered. But when 10 servers are moved to 10 edge locations, total capacity can be 10*10 Gbit/s.

Strategically placed edge servers decrease the load on interconnects, public peers, private peers and backbones, freeing up capacity and lowering delivery costs. It uses the same principle as above. Instead of loading all traffic on a backbone or peer link, a CDN can offload these by redirecting traffic to edge servers.

CDNs primarily deliver content over TCP connections. TCP throughput over a network is impacted by both latency and packet loss. In order to reduce both of these parameters, CDNs traditionally place servers as close to the edge networks that users are on as possible. Theoretically the closer the content the faster the delivery. End users will likely experience less jitter, fewer network peaks and surges, and improved stream quality - especially in remote areas. The increased reliability allows a CDN operator to deliver HD quality content with high QoS, low costs and low network load.

Modern CDNs can dynamically distribute assets to strategically placed redundant core, fallback and edge servers. Modern CDNs can have automatic server availability sensing with instant user redirection. A CDN can offer 100% availability, even with large power, network or hardware outages.

Modern CDN technologies give more control of asset delivery and network load. They can optimize capacity per customer, provide views of realtime load and statistics, reveal which assets are popular, show active regions and report exact viewing details to the customers.

[edit] ASP versus on-net

Most CDNs are operated as an ASP (Application service provider) on the Internet, although an increasing number of internet network owners are building their own CDN to improve on-net content delivery and to generate revenues from content customers. Some develop internal CDN software, others use commercially available software.

However, as CDN prices have dramatically decreased, it is hard to economically justify building an internal CDN.

[edit] Technology

Traditional CDNs focus on web acceleration. That is still their main revenue service. New CDNs have integrated media delivery services so they are optimized for live video streaming, high definition video and large asset delivery.

CDN nodes are deployed in multiple locations, often over multiple backbones. These nodes cooperate with each other to satisfy requests for content by end users, transparently moving content to optimize the delivery process. Optimization can take the form of reducing bandwidth costs, improving end-user performance, or increasing global availability of content.

The number of nodes and servers making up CDN varies, depending on the architecture, some reaching thousands of nodes with tens of thousands of servers on many remote PoPs. Others build a global network and have a small number of geographical PoPs.

Requests for content are algorithmically directed to nodes that are optimal in some way. When optimizing for performance, locations that are best for serving content to the user may be chosen. This may be measured by choosing locations that are the fewest hops or fewest number of network seconds away from the requesting client, so as to optimize delivery across local networks. When optimizing for cost, locations that are least expensive may be chosen instead.

In a optimal scenario, these two goals tend to align, as servers that are close to the end user at the edge of the network may have an advantage in serving costs, perhaps because they are located within the same network as the end user. Traditional CDNs tend to compete based on the size and scale of their Edge Network deployments and generally follow a strategy of pushing the Edge Network closer to end users. The Edge Network is grown outward from the origin/s by further purchasing co-locations facilities, bandwidth and servers.

[edit] Content networking techniques

The Internet was designed according to the end-to-end principle[1]. This principle keeps the core network relatively simple and moves the intelligence as much as possible to the network end-points: the hosts and clients. As a result the core network is specialized, simplified, and optimized to only forward data packets.

Content Delivery Networks augment the end-to-end transport network by distributing on it a variety of intelligent applications employing techniques designed to optimize content delivery. The resulting tightly integrated overlay uses web caching, server-load balancing, request routing, and content services.[2]. These techniques are briefly described below.

Because closer is better, web caches store popular content closer to the user. These shared network appliances reduce bandwidth requirements, reduce server load, and improve the client response times for content stored in the cache.

Server-load balancing uses one or more layer 4–7 switches, also known as a web switch, content switch, or multilayer switch to share traffic among a number of servers or web caches. Here the switch is assigned a single virtual IP address. Traffic arriving at the switch is then directed to one of the real web servers attached to the switch. This has the advantages of balancing load, increasing total capacity, improving scalability, and providing increased reliability by redistributing the load of a failed web server and providing server health checks.

A content cluster or service node can be formed using a layer 4–7 switch to balance load across a number of servers or a number of web caches within the network.

Request routing directs client requests to the content source best able to serve the request. This may involve directing a client request to the service node that is closest to the client, or to the one with the most capacity. A variety of algorithms are used to route the request. These include Global Server Load Balancing, DNS-based request routing, Dynamic metafile generation, HTML rewriting[3], and anycasting[4]. Proximity—choosing the closest service node—is estimated using a variety of techniques including reactive probing, proactive probing, and connection monitoring.

Simple CDNs require manual asset copying. Earlier CDNs used active web caches and global hardware load balancers. Modern CDNs use cheap and simple edge servers and intelligent central CDN management technologies that distribute assets dynamically.

[edit] Content service protocols

Several protocols suites are designed to provide access to a wide variety of content services distributed throughout a content network. The Internet Content Adaptation Protocol (ICAP) was developed in the late 1990s[5][6] to provide an open standard for connecting application servers. A more recently defined and robust solution is provided by the Open Pluggable Edge Services (OPES) protocol[7]. This architecture defines OPES service applications that can reside on the OPES processor itself or be executed remotely on a Callout Server. Edge Side Includes or ESI is a small markup language for edge level dynamic web content assembly. It is fairly common for websites to have generated content. It could be because of changing content like catalogs or forums, or because of personalization. This creates a problem for caching systems. To overcome this problem a group of companies created ESI.

[edit] P2P CDNs

Although peer-to-peer (P2P) is not traditional CDN technology, it is increasingly used to deliver content to end users. P2P claims low cost and efficient distribution. Even though P2P actually generates more traffic than traditional client-server CDNs (because a peer also uploads data instead of just downloading it) it's welcomed by parties running content delivery/distribution services. The real strength of P2P shows when one has to distribute highly attractive data, like the latest episode of a soap opera or some sort of software patch/update in short period of time. Ironically, the more people who download the (same) data, the more efficient P2P is, thus slashing the cost of the peering fees that a CDN provider has to pay due to inter-peer delivery (in comparison to the same amount of data distributed using traditional techniques).

On the other hand, the “long tail” type material does not benefit much from P2P delivery schema, since, to gain advantage over traditional distribution models, a P2P-enabled CDN must force storing (caching) data on peers--something that is usually not desired by users and which is rarely enabled.

Contrary to popular belief P2P is not limited to low-bandwidth audio-video signal distribution. There is no technical boundary, built-in inefficiency, or flaw-by-design in peer-to-peer technology to prevent distribution of full HD audio+video signal at, for example, 8 Mbit/s. It's just environmental factors, like low (upload) bandwidth or inadequate computing power in CE devices, that prevent HD material being publicly available in P2P CDNs. (Low bandwidth problems also apply to traditional CDN, though.)

There have been several trials done by the Telewizja Polska SA, Poland's national broadcaster, involving live 800 Kbit/s and 1.5 Mbit/s Windows Media simultaneous streams delivered over the public Internet, proving that the limiting factor is the upload capacity of peers, which in DSL environment varies from 1/4 to even 1/16 of the download bandwidth.

The biggest event that involved P2P CDN was CNN's coverage of Barack Obama's inauguration. According to CNN's press release, more than 21.3 million people watched the live feed using Octoshape Infinite Edge technology with more than 1.34 million concurrent streams being delivered at the peak time. This is also currently the largest live simultaneous event on the Internet to date.

There are some concerns about lack of QoS control over P2P distribution, but these are being addressed by the P2P-Next consortium.

[edit] See also

[edit] Related technologies

[edit] Commercial CDNs

[edit] Commercial CDNs Using P2P for Delivery

[edit] Commercial CDN solutions

  • PacketWarp by Vector Data
  • ACNS by Cisco
  • ProxySG by Blue Coat
  • CDD by HP
  • Aspera and Isilon
  • VDO-X by Jet Stream

[edit] Mobile CDNs

[edit] Academic CDNs

[edit] Source(s)

Parts of this page are taken from blog.streamingmedia.com and from VDO-X.

[edit] Notes

  1. ^ Saltzer, J. H., Reed, D. P., Clark, D. D.: “End-to-End Arguments in System Design,” ACM Transactions on Communications, 2(4), 1984
  2. ^ Hofmann, Markus; Leland R. Beaumont (2005). Content Networking: Architecture, Protocols, and Practice. Morgan Kaufmann Publisher. ISBN 1-55860-834-6. 
  3. ^ RFC 3568 Barbir, A., Cain, B., Nair, R., Spatscheck, O.: "Known Content Network (CN) Request-Routing Mechanisms," July 2003
  4. ^ RFC 1546 Partridge, C., Mendez, T., Milliken, W.: "Host Anycasting Services," November 1993.
  5. ^ RFC 3507 Elson, J., Cerpa, A.: "Internet Content Adaptation Protocol (ICAP)," April 2003.
  6. ^ ICAP Forum
  7. ^ RFC 3835 Barbir, A., Penno, R., Chen, R., Hofmann, M., and Orman, H.: "An Architecture for Open Pluggable Edge Services (OPES)," August 2004.

[edit] External links

Personal tools