scalability

Observability at Scale: Building Uber’s Alerting Ecosystem

Uber’s software architectures consists of thousands of microservices that empower teams to iterate quickly and support our company’s global growth. These microservices support a variety of solutions, such as mobile applications, internal and infrastructure services, and products along with complex configurations that affect these products at city and sub-city levels. To maintain our growth and architecture, Uber’s Observability team built a robust, scalable metrics and alerting pipeline responsible for detecting, mitigating, and notifying engineers of issues with their services as soon as they occur.
Read more

Optimal Shard Placement in a Petabyte Scale Elasticsearch Cluster

The number of shards on each node, and tries to balance the number of shards per node evenly across the clusterThe high and low disk watermarks. Elasticsearch considers the available disk space on a node before deciding whether to allocate new shards to that node or to actively relocate shards away from that node. A nodes that has reached the low watermark (i.e 80% disk used) is not allowed receive any more shards.
Read more

GraphQL: A success story for PayPal Checkout

At PayPal, we recently introduced GraphQL to our technology stack. At PayPal, GraphQL has been a complete game changer to the way we think about data, fetch data and build applications. This blog post takes a close look at PayPal Checkout and explains our journey from REST to Batch REST to GraphQL and lessons learned along the way. PayPal’s Checkout products spread across many web and mobile apps, supporting millions of users across ~200 countries and has hundreds of experiments running at any time.
Read more

Cross shard transactions at 10 million requests per second

Dropbox stores petabytes of metadata to support user-facing features and to power our production infrastructure. The primary system we use to store this metadata is named Edgestore and is described in a previous blog post, (Re)Introducing Edgestore. In simple terms, Edgestore is a service and abstraction over thousands of MySQL nodes that provides users with strongly consistent, transactional reads and writes at low latency. Edgestore hides details of physical sharding from the application layer to allow developers to scale out their metadata storage needs without thinking about complexities of data placement and distribution.
Read more

Scaling Time Series Data Storage—Part II

In January 2016 Netflix expanded worldwide, opening service to 130 additional countries and supporting 20 total languages. Later in 2016 the TV experience evolved to include video previews during the browsing experience. More members, more languages, and more video playbacks stretched the times series data storage architecture from part 1 close to its breaking point. In part 2 here, we will explore the limitations of that architecture and describe how we’re re-architecting for this next phase in our evolution.
Read more