What you'll do
Scaling retrieval systems is a challenging engineering problem. Chroma's distributed data plane must support the workloads of AI applications served at scale, which means scaling vector search, document storage, metadata filtering, and more, across billions of records in millions of collections. Success will mean inventing many of these components from scratch.
Above all, Chroma's users need to be able to trust that their data is safe, available, and can be accessed efficiently. You will be responsible for ensuring that Chroma's distributed systems are reliable, scalable, and performant.
In this role, you will work on the distributed systems that make up the open source Chroma data plane and power Chroma's cloud service. You will work closely with the Database Systems team to build our scalable open source data plane, with the Cloud Platform team to build the core of Chroma cloud, and with the Product Engineering team to ensure product features are developed in a scalable way.
Who you are
- You have experience building correct, performant, and reliable distributed systems at scale.
- You are familiar with best practices in deploying and operating data platforms.
- You have substantial production experience in Rust, C++, Zig, or another systems programming language.
Bonus
- You are passionate about testing through formal (e.g. TLA+), lightweight formal, and probabilistic methods.