Saif Shaikh

Serving ML at scale without burning the budget

Jul 12, 2024

Draft notes on caching, batching, and autoscaling.