Dropbox runs hundreds of services, written in different languages, which exchange millions of requests per second. At the core of our Service Oriented Architecture is Courier, our gRPC-based Remote Procedure Call (RPC) framework. While developing Courier, we learned a lot about extending gRPC, optimizing performance for scale, and providing a bridge from our legacy RPC system.
Courier is not Dropbox’s first RPC framework. Even before we started to break our Python monolith into services in earnest, we needed a solid foundation for inter-service communication, especially since the choice of RPC framework has profound reliability implications.
Previously, Dropbox experimented with multiple RPC frameworks. We started with a custom protocol for manual serialization and de-serialization. Some services, like our Scribe-based log pipeline, used Apache Thrift.
But our main RPC framework (legacy RPC) was an HTTP/1.1-based protocol with protobuf-encoded messages. For our new framework, there were several choices. We could evolve the legacy RPC framework to incorporate Swagger (now OpenAPI), or we could create a new standard.
We also considered building on top of either Thrift or gRPC. We settled on gRPC primarily because it allowed us to bring forward our existing protobufs.
For our use cases, multiplexing HTTP/2 transport and bi-directional streaming were also attractive.