About the role

We're looking for experienced engineers to build Anthropic's cache layer as a managed service from the ground up. The Caching team is part of the Databases organization and owns the systems that keep Anthropic's hottest paths fast and correct: a managed Redis fleet, client libraries used across the company, and CDC-driven cache invalidation that solves one of distributed systems' hardest problems the right way.

This is a foundational role on a small team with outsized leverage. Every millisecond you take off a hot path is multiplied across every Claude request. You'll set the technical direction for caching at Anthropic, from the data plane to the developer experience, and you'll work closely with product and research teams to make caching something engineers get for free rather than something they have to think about.

Key responsibilities

Drive the technical direction for caching infrastructure used across Product and Research
Design, build, and operate a managed Redis fleet that scales to support millions of users across Claude's product ecosystem
Build client libraries and developer-facing abstractions that make correct caching the default for Anthropic engineers
Design and operate CDC-driven cache invalidation that keeps cached data consistent with source-of-truth databases
Architect caching solutions that operate across GCP, AWS, first-party deployments, and other environments
Optimize latency, hit rates, reliability, and cost efficiency on Anthropic's hottest paths
Build observability and tooling that makes cache behavior easy to understand and debug
Partner with product and research teams to understand access patterns and build infrastructure that accelerates their work
Make build-vs-buy decisions for caching technologies

Minimum qualifications

Significant experience as a software engineer building and operating production distributed systems
Deep knowledge of caching architectures, including invalidation strategies, consistency tradeoffs, and failure modes
Experience operating Redis, Memcached, or similar in-memory data stores in production
Proficiency in at least one systems programming language (e.g., Go, Rust, Java, C++) or Python at scale
Track record of leading large, complex infrastructure projects as an engineer or tech lead
Ability to balance moving quickly with the reliability needs of production systems
Strong technical leadership and cross-functional collaboration skills

Preferred qualifications

10+ years building and scaling distributed infrastructure, with 3+ years leading large-scale projects or teams
Experience building managed infrastructure platforms or internal services consumed by many engineering teams
Experience with change data capture (Debezium or similar) or streaming data infrastructure
Experience operating Redis Cluster, Valkey, ElastiCache, Memorystore, or similar managed offerings at scale
Experience designing client libraries or SDKs for internal infrastructure
Experience scaling infrastructure through periods of rapid growth at high-growth companies
Experience with multi-cloud or hybrid cloud deployments
Contributions to caching systems, database internals, or related open source projects

Note: Prior AI/ML infrastructure experience is not required. We value deep infrastructure expertise from any domain.

Staff+ Software Engineer, Caching
