CS  G254/U645                                                                           Lecturer: Ravi Sundaram

Network Security                                                                         October 5, 2007

Lecture 8: Content Delivery Networks

Content Delivery Networks (CDNs)

  • Critical to the Web infrastructure.
  • Today nearly every major website uses a content delivery (CD) service.
  • Helps minimize the problems that are inherent in the Internet.
    • Network Issues:
      • Latency – Browser takes a long time to load
      • Packet Loss – Browser hangs and a refresh is needed
      • Jitter – Streams are jerky because of deviation in arrival time of packets
    • Server Issues:
      • Server load – Browser connects, but doesn’t fully load the page
      • Broken or missing content

CDN implementation

1.      Place CDN servers at the Network Edge and cache the content of websites at these servers.

2.      Place CNAME entries in the appropriate DNS servers (e.g. www.xyz.edgesuite.net)

o       When a node requests the IP address for xyz.com, DNS resolves this hostname to the IP address of an optimal server of the CD service (e.g. Akamai) that xyz.com has contracted.

o       The CD server (Akamai) assembles the page, contacting the origin (xyz.com) as needed.

3.      CD servers use Overlay Routing to determine the best path

Given a large enough CDN with strategically placed nodes, the CDN administrators can take extensive, continuous measurements between nodes and use them to direct traffic between source and destination. As a simple example, consider a CDN moving traffic from A to B. In addition to the direct (BGP-dictated) route between A and B, it can also route through either of two intermediate nodes, C1 and C2. The CDN continuously measures the quality of all three routes and uses the best one to move the traffic.
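The A-to-B example above can be sketched in a few lines of Python. This is only an illustration: the link measurements, the use of round-trip time as the sole quality metric, and the names route_cost and best_route are assumptions made for the example, not details from the lecture.

```python
def route_cost(path, link_rtt_ms):
    """Sum the measured RTT (ms) of each hop along a path such as ("A", "C1", "B")."""
    return sum(link_rtt_ms[(a, b)] for a, b in zip(path, path[1:]))


def best_route(paths, link_rtt_ms):
    """Return the candidate path with the lowest total measured RTT."""
    return min(paths, key=lambda p: route_cost(p, link_rtt_ms))


# Hypothetical measurements; a real CDN would refresh these continuously.
link_rtt_ms = {
    ("A", "B"): 120.0,   # direct, BGP-dictated route
    ("A", "C1"): 40.0,
    ("C1", "B"): 45.0,
    ("A", "C2"): 70.0,
    ("C2", "B"): 80.0,
}

candidate_paths = [("A", "B"), ("A", "C1", "B"), ("A", "C2", "B")]
chosen = best_route(candidate_paths, link_rtt_ms)
print(chosen, route_cost(chosen, link_rtt_ms))
```

With these made-up numbers the detour through C1 beats the direct route. Re-running best_route after each measurement round lets the choice follow changing network conditions.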

Mapping of CDN nodes

  • Create a topology map that the CDN nodes can use to compute proximity.
  • Computing proximity is difficult on the Internet because it is so dynamic.
  • The mapping problem is addressed using a two-pronged approach taking into account:
    • Topology:
      • Relatively static
      • Changes in BGP time
      • Order of hours, if not days

      • Identify 500,000 unique nameservers. These are further reduced to 90,000 topology-discovery proxy points, chosen as set covers built from histograms.

    • Congestion:
      • Dynamic
      • Changes in Round Trip Time (RTT)
      • Order of milliseconds

      • Importance-based sampling: starting from the scaled-down topology map with fewer proxy points, use the CDF of end-user load to further reduce the 90,000 clusters to 7,000, which account for 95% of client load.

-  Mapping problem solved. Maps converge every 10 seconds, per measurement cycle.
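The two reduction steps above can be illustrated with a small Python sketch. The notes do not say how the set covers or the load cutoff are actually computed, so this assumes a standard greedy set-cover heuristic and a simple cumulative-load threshold. All names (greedy_set_cover, clusters_covering_load) and all data are hypothetical.

```python
def greedy_set_cover(universe, candidates):
    """Pick proxy points until every nameserver is represented.

    candidates maps a proxy point to the set of nameservers it can stand in for.
    Greedy heuristic (an assumption): always take the proxy covering the most
    still-uncovered nameservers.
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # sorted() makes tie-breaking deterministic.
        best = max(sorted(candidates), key=lambda p: len(candidates[p] & uncovered))
        gained = candidates[best] & uncovered
        if not gained:
            break  # remaining nameservers cannot be covered by any candidate
        chosen.append(best)
        uncovered -= gained
    return chosen


def clusters_covering_load(load_by_cluster, fraction=0.95):
    """Return the heaviest clusters whose cumulative load reaches `fraction` of the total."""
    total = sum(load_by_cluster.values())
    picked, running = [], 0.0
    ranked = sorted(load_by_cluster.items(), key=lambda kv: kv[1], reverse=True)
    for cluster, load in ranked:
        if running >= fraction * total:
            break
        picked.append(cluster)
        running += load
    return picked


# Toy data standing in for the 500,000 nameservers and 90,000 clusters.
nameservers = {"ns1", "ns2", "ns3", "ns4", "ns5", "ns6"}
candidates = {
    "p1": {"ns1", "ns2", "ns3"},
    "p2": {"ns3", "ns4"},
    "p3": {"ns4", "ns5", "ns6"},
    "p4": {"ns1", "ns6"},
}
proxies = greedy_set_cover(nameservers, candidates)

load = {"c1": 60, "c2": 36, "c3": 3, "c4": 1}
heavy = clusters_covering_load(load)
print(proxies, heavy)
```

In this toy case two proxy points cover all six nameservers, and two of the four clusters already carry at least 95% of the load, which mirrors the kind of reduction the notes describe at much larger scale.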