Cache Provisioning and Management for Content Delivery

Monday, 11/06/2017 10:00am to 12:00pm
Computer Science Building, Room 140
Ph.D. Dissertation Proposal Defense

"Cache Provisioning and Management for Content Delivery"

The exponential increase in Internet traffic, coupled with the scarcity of available resources, makes content delivery a challenging problem. Content delivery networks (CDNs) are integral to delivering the vast variety of content requested by users around the world with low latency and high reliability. CDNs deploy servers close to users, cache content on them, and serve trillions of user requests every day for a diverse set of content such as web pages, videos, software downloads, and images. In this dissertation, we propose algorithms to provision cache servers and manage the content they host to achieve different performance objectives, such as minimizing the energy consumption of the network and maximizing end-user performance.

Cache management is the process of deciding how content is cached in the servers of a CDN. We propose cache management algorithms that make the servers energy-efficient using disk shutdown. We find that disk shutdown provides a good energy-performance tradeoff and is suitable for CDN servers. We also propose TTL-based caching algorithms that provably achieve performance targets specified by a CDN operator in the presence of non-stationary and bursty traffic. Using production traces from Akamai, we show that the proposed algorithms converge to the target hit rate and target cache size with low error.
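
To make the idea of steering a TTL-based cache toward a target hit rate concrete, the following is a minimal sketch and not necessarily the algorithm proposed in the dissertation. It assumes a single shared TTL that is nudged up after each miss and down after each hit (a stochastic-approximation update), logical time that advances by one unit per request, and a synthetic Zipf-like workload rather than production traces. The class name AdaptiveTTLCache, the function simulate, and all parameter values are hypothetical, and the sketch targets only the hit rate, not the cache size.

```python
import random


class AdaptiveTTLCache:
    """Illustrative TTL cache whose TTL adapts toward a target hit rate."""

    def __init__(self, target_hit_rate, initial_ttl=1.0, step=1.0, max_ttl=1e6):
        self.target = target_hit_rate  # operator-specified hit rate in [0, 1]
        self.ttl = initial_ttl         # current TTL, in logical time units
        self.step = step               # adaptation step size (assumed constant)
        self.max_ttl = max_ttl         # safety cap on the TTL
        self.expiry = {}               # key -> time at which the entry expires

    def request(self, key, now):
        """Process one request at time `now`; return True on a cache hit."""
        hit = self.expiry.get(key, float("-inf")) > now
        # Raise the TTL after a miss, lower it after a hit, so that the
        # long-run hit rate drifts toward the target.
        error = self.target - (1.0 if hit else 0.0)
        self.ttl = min(self.max_ttl, max(0.0, self.ttl + self.step * error))
        # The TTL is reset on every access, hit or miss.
        self.expiry[key] = now + self.ttl
        return hit


def simulate(target=0.6, n_requests=200_000, n_objects=1000, seed=0):
    """Return the hit rate over the second half of a synthetic request stream."""
    rng = random.Random(seed)
    # Zipf-like popularity: the object of rank r has weight 1 / r.
    weights = [1.0 / (rank + 1) for rank in range(n_objects)]
    keys = rng.choices(range(n_objects), weights=weights, k=n_requests)
    cache = AdaptiveTTLCache(target)
    warmup = n_requests // 2
    hits = 0
    for t, key in enumerate(keys):
        hit = cache.request(key, now=float(t))
        if t >= warmup and hit:
            hits += 1
    return hits / (n_requests - warmup)


if __name__ == "__main__":
    print(f"measured hit rate: {simulate(target=0.6):.3f}")
```

In this sketch, a constant step size is chosen so that the TTL can keep tracking the target if the request pattern changes; a decaying step size would instead favor convergence under stationary traffic.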

Cache provisioning is the process of determining the set of content domains hosted on the edge servers. It is complicated by the presence of thousands of content domains with widely varying performance characteristics. To address this challenge, we propose footprint descriptors, which are a succinct representation of the content requested by users. Footprint descriptors effectively capture the popularity characteristics and caching performance of different content classes. We also propose a footprint descriptor calculus that can be used to decide how content should be mixed or partitioned to provision caches efficiently. Finally, we propose optimization models that can be used to automatically provision caches to jointly minimize the cache miss traffic from the network and the end-user latency. Such optimization models could guide the development of better cache provisioning and management schemes to meet the traffic demands of the future.

Advisor: Ramesh Sitaraman