

His current research interests focus on stochastic modeling, distributed computing systems and statistical learning.



NSF #1717060 - CSR: NeTS: Small: Theoretical Foundations for Cache Networks: Performance Models, Algorithms, and Applications


Caching systems are already widely deployed but need to scale efficiently to support emerging big data applications. A fundamental question is whether the cache space should be pooled to serve multiple request flows jointly or divided to serve them separately. There is no simple yes-or-no answer; it depends on four critical factors: the popularity distributions, the request rates, the data item sizes, and the data overlap across flows. Engineering issues complicate the problem further, but there are still guidelines for improving miss ratios.

I. We show that, for large cache spaces, it is beneficial to serve multiple flows jointly if their data item sizes and popularity distributions are similar and their arrival rates do not differ significantly. Resource pooling can adaptively achieve the optimal resource allocation for multiple competing flows when their data item sizes are in the same range. Current practice nevertheless separates request flows into different cache spaces by application and domain. One possible explanation is that data item sizes (e.g., text versus image objects) and request rates differ substantially across those flows.
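The pooled-versus-separated question can be explored with a small simulation. The sketch below is an illustration only, not the project's model or code: it assumes two flows with identical Zipf popularity, unit-size items, and no overlapping data, and every name and parameter (`LRUCache`, `simulate`, `alpha`, `rate_share`) is hypothetical.

```python
import random
from collections import OrderedDict
from itertools import accumulate


class LRUCache:
    """Fixed-capacity LRU cache for unit-size items; counts requests and misses."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()
        self.requests = 0
        self.misses = 0

    def request(self, key):
        """Serve one request and return True on a hit, False on a miss."""
        self.requests += 1
        if key in self.items:
            self.items.move_to_end(key)  # mark as most recently used
            return True
        self.misses += 1
        self.items[key] = None
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used item
        return False


def zipf_cum_weights(catalog, alpha):
    """Cumulative (unnormalized) Zipf weights for ranks 1..catalog."""
    return list(accumulate(1.0 / rank ** alpha for rank in range(1, catalog + 1)))


def simulate(num_requests=50000, catalog=2000, alpha=0.8,
             total_cache=400, rate_share=0.5, seed=1):
    """Return (pooled_miss_ratio, separated_miss_ratio) for two flows.

    Assumptions: both flows share the same Zipf exponent, items have unit
    size, and the flows request disjoint data (keys are tagged by flow).
    """
    rng = random.Random(seed)
    cum = zipf_cum_weights(catalog, alpha)
    items = rng.choices(range(catalog), cum_weights=cum, k=num_requests)

    pooled = LRUCache(total_cache)
    half = total_cache // 2
    separated = [LRUCache(half), LRUCache(total_cache - half)]

    for item in items:
        flow = 0 if rng.random() < rate_share else 1  # rate_share sets flow 0's rate
        key = (flow, item)
        pooled.request(key)
        separated[flow].request(key)

    pooled_ratio = pooled.misses / pooled.requests
    separated_ratio = sum(c.misses for c in separated) / num_requests
    return pooled_ratio, separated_ratio


if __name__ == "__main__":
    print(simulate())
```

Varying `rate_share` away from 0.5, or giving the flows different exponents, is one way to probe how the comparison shifts with the factors listed above. Modeling different item sizes would require a size-aware cache, which this sketch omits.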

II. Many caching systems rely on consistent hashing to group a large number of servers into a cooperative cluster. We show that the individual cache spaces on these servers can effectively be viewed as pooled together into a single virtual LRU cache space, parametrized by an equivalent cache size.
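As a rough illustration of the setting in item II, the sketch below routes requests through a consistent-hash ring to per-server LRU caches and reports the cluster's aggregate miss ratio. It does not compute the equivalent cache size from the analysis. The ring design (MD5 hashing, 100 virtual nodes per server) and all names (`HashRing`, `cluster_miss_ratio`) are assumptions made for the example.

```python
import bisect
import hashlib
from collections import OrderedDict


def stable_hash(value):
    """Map a string to a large integer that is stable across runs."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Consistent-hash ring with virtual nodes (hypothetical helper)."""

    def __init__(self, servers, vnodes=100):
        points = sorted(
            (stable_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes)
        )
        self._hashes = [h for h, _ in points]
        self._owners = [s for _, s in points]

    def server_for(self, key):
        # First ring point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_right(self._hashes, stable_hash(key)) % len(self._hashes)
        return self._owners[idx]


def lru_request(cache, capacity, key):
    """Serve one request from an OrderedDict used as an LRU; True on a hit."""
    if key in cache:
        cache.move_to_end(key)
        return True
    cache[key] = None
    if len(cache) > capacity:
        cache.popitem(last=False)  # evict the least recently used key
    return False


def cluster_miss_ratio(servers, per_server_capacity, keys):
    """Route each request via the ring; return the aggregate miss ratio."""
    ring = HashRing(servers)
    caches = {server: OrderedDict() for server in servers}
    misses = 0
    for key in keys:
        server = ring.server_for(key)
        if not lru_request(caches[server], per_server_capacity, key):
            misses += 1
    return misses / len(keys)


if __name__ == "__main__":
    trace = [f"item-{i % 300}" for i in range(3000)]
    print(cluster_miss_ratio(["s1", "s2", "s3"], 50, trace))
```

One way to use this is to compare the cluster's miss ratio against a single LRU cache of varying size on the same trace. The size at which the two curves match gives an empirical stand-in for the equivalent cache size.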

III. We are developing and implementing new algorithms to optimize the formation of caching networks (e.g., the organization of Memcached pools and the routing/hashing decisions for data requests).