Practical Bounds on Optimal Caching with Variable Object Sizes

Berger, Daniel S.; Beckmann, Nathan; Harchol-Balter, Mor

doi:10.1145/3224427

arXiv:1711.03709 (cs)

[Submitted on 10 Nov 2017 (v1), last revised 5 Jul 2018 (this version, v4)]

Abstract: Many recent caching systems aim to improve miss ratios, but there is no good sense among practitioners of how much further miss ratios can be improved. In other words, should the systems community continue working on this problem? Currently, there is no principled answer to this question. In practice, object sizes often vary by several orders of magnitude, and in that setting computing the optimal miss ratio (OPT) is known to be NP-hard. The few known results on caching with variable object sizes provide very weak bounds and are impractical to compute on traces of realistic length.
We propose a new method to compute upper and lower bounds on OPT. Our key insight is to represent caching as a min-cost flow problem, hence we call our method the flow-based offline optimal (FOO). We prove that, under simple independence assumptions, FOO's bounds become tight as the number of objects goes to infinity. Indeed, FOO's error over 10M requests of production CDN and storage traces is negligible: at most 0.3%. FOO thus reveals, for the first time, the limits of caching with variable object sizes. While FOO is very accurate, it is computationally impractical on traces with hundreds of millions of requests. We therefore extend FOO to obtain more efficient bounds on OPT, which we call practical flow-based offline optimal (PFOO). We evaluate PFOO on several full production traces and use it to compare OPT to prior online policies. This analysis shows that current caching systems are in fact still far from optimal, suffering 11-43% more cache misses than OPT, whereas the best prior offline bounds suggest that there is essentially no room for improvement.
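
To make the gap between an online policy and OPT concrete, the following minimal sketch compares LRU with an exhaustive offline optimum on a six-request toy trace. It is not FOO or PFOO and does not use min-cost flow. It enumerates subsets of reuse intervals, which takes exponential time and is only feasible for tiny traces. The interval model (an object occupies its size in the cache from one request until just before its next request, and any object may bypass the cache), the function names, and the trace are assumptions made for illustration.

```python
from collections import OrderedDict
from itertools import combinations


def reuse_intervals(trace):
    """Return (start, end, size) for each pair of consecutive requests to one object."""
    last_seen = {}
    intervals = []
    for t, (obj, size) in enumerate(trace):
        if obj in last_seen:
            intervals.append((last_seen[obj], t, size))
        last_seen[obj] = t
    return intervals


def fits(chosen, n_requests, capacity):
    """Check that the objects held across the chosen intervals never exceed capacity."""
    for t in range(n_requests):
        used = sum(size for start, end, size in chosen if start <= t < end)
        if used > capacity:
            return False
    return True


def brute_force_opt_misses(trace, capacity):
    """Exhaustive offline optimum (assumed interval model); exponential time."""
    intervals = reuse_intervals(trace)
    # Each cached interval turns its closing request into a hit,
    # so try the largest hit counts first.
    for hits in range(len(intervals), -1, -1):
        for chosen in combinations(intervals, hits):
            if fits(chosen, len(trace), capacity):
                return len(trace) - hits
    return len(trace)


def lru_misses(trace, capacity):
    """Size-aware LRU: evict least recently used objects until the new one fits."""
    cache = OrderedDict()  # object -> size, least recently used first
    used = 0
    misses = 0
    for obj, size in trace:
        if obj in cache:
            cache.move_to_end(obj)
            continue
        misses += 1
        if size > capacity:
            continue  # object can never fit; serve it without caching
        while used + size > capacity:
            _, evicted_size = cache.popitem(last=False)
            used -= evicted_size
        cache[obj] = size
        used += size
    return misses


# Hypothetical trace of (object, size in bytes) with a 3-byte cache.
trace = [("a", 3), ("b", 1), ("c", 1), ("a", 3), ("b", 1), ("c", 1)]
print(brute_force_opt_misses(trace, 3))  # 4
print(lru_misses(trace, 3))              # 6
```

On this trace LRU misses every request, because the large object "a" keeps evicting the two small objects, whereas the offline optimum keeps "b" and "c" and never admits "a". Exhaustive search like this is exactly what becomes infeasible on realistic traces, which is the motivation the abstract gives for computing bounds on OPT instead.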

Subjects:	Performance (cs.PF)
Cite as:	arXiv:1711.03709 [cs.PF]
	(or arXiv:1711.03709v4 [cs.PF] for this version)
	https://doi.org/10.48550/arXiv.1711.03709
Journal reference:	Proceedings of the ACM on Measurement and Analysis of Computing Systems, Article 32, Volume 2, Issue 2, June 2018
Related DOI:	https://doi.org/10.1145/3224427

Submission history

From: Daniel S. Berger
[v1] Fri, 10 Nov 2017 06:27:08 UTC (426 KB)
[v2] Tue, 14 Nov 2017 18:07:40 UTC (434 KB)
[v3] Thu, 22 Mar 2018 18:14:21 UTC (1,915 KB)
[v4] Thu, 5 Jul 2018 20:10:45 UTC (548 KB)
