
Staff+ Software Engineer, Caching
Compensation
$320,000-$485,000
Description
About the role
We're looking for experienced engineers to build Anthropic's cache layer as a managed service from the ground up. The Caching team is part of the Databases organization and owns the systems that keep Anthropic's hottest paths fast and correct: a managed Redis fleet, client libraries used across the company, and CDC-driven cache invalidation that solves one of distributed systems' hardest problems the right way.
This is a foundational role on a small team with outsized leverage. Every millisecond you take off a hot path is multiplied across every Claude request. You'll set the technical direction for caching at Anthropic, from the data plane to the developer experience, and you'll work closely with product and research teams to make caching something engineers get for free rather than something they have to think about.
Key responsibilities
- Drive the technical direction for caching infrastructure used across Product and Research
- Design, build, and operate a managed Redis fleet that scales to support millions of users across Claude's product ecosystem
- Build client libraries and developer-facing abstractions that make correct caching the default for Anthropic engineers
- Design and operate CDC-driven cache invalidation that keeps cached data consistent with source-of-truth databases
- Architect caching solutions that operate across GCP, AWS, first-party deployments, and other environments
- Optimize latency, hit rates, reliability, and cost efficiency on Anthropic's hottest paths
- Build observability and tooling that makes cache behavior easy to understand and debug
- Partner with product and research teams to understand access patterns and build infrastructure that accelerates their work
- Make build-vs-buy decisions for caching technologies
Minimum qualifications
- Significant experience as a software engineer building and operating production distributed systems
- Deep knowledge of caching architectures, including invalidation strategies, consistency tradeoffs, and failure modes
- Experience operating Redis, Memcached, or similar in-memory data stores in production
- Proficiency in at least one systems programming language (e.g., Go, Rust, Java, C++) or Python at scale
- Track record of leading large, complex infrastructure projects as an engineer or tech lead
- Ability to balance moving quickly with the reliability needs of production systems
- Strong technical leadership and cross-functional collaboration skills
Preferred qualifications
- 10+ years building and scaling distributed infrastructure, with 3+ years leading large-scale projects or teams
- Experience building managed infrastructure platforms or internal services consumed by many engineering teams
- Experience with change data capture (Debezium or similar) or streaming data infrastructure
- Experience operating Redis Cluster, Valkey, ElastiCache, Memorystore, or similar managed offerings at scale
- Experience designing client libraries or SDKs for internal infrastructure
- Experience scaling infrastructure through periods of rapid growth at high-growth companies
- Experience with multi-cloud or hybrid cloud deployments
- Contributions to caching systems, database internals, or related open source projects
Note: Prior AI/ML infrastructure experience is not required. We value deep infrastructure expertise from any domain.
- Posted: Jun 25, 2026
- Last seen: Jun 25, 2026
- First seen: Jun 25, 2026
- Status: active