casestudy

Running Envoy as an Edge Proxy at eBay: Replacing Hardware Load Balancers with a Software Solution

At the inaugural EnvoyCon in Seattle, USA, the eBay engineering team talked about running the Envoy Proxy at the edge as a replacement for hardware-based load balancers. A key takeaway was that a “programmable edge” provides many advantages but also several challenges. Source: infoq.com

The Many Faces of Envoy Proxy: Edge Gateway, Service Mesh, and Hybrid Networking Bridge

At the inaugural EnvoyCon in Seattle, USA, engineers from Pinterest, Yelp and Groupon presented their current use cases for the Envoy Proxy. The overarching message was that the Envoy Proxy appears to be moving closer to fulfilling its vision of providing the “universal [proxy] data plane API” for modern networking, including edge gateways, service meshes and hybrid networking bridges. Source: infoq.com

The Biggest IT Failures of 2018

This year proved once again that IT-related failures “are universally unprejudiced: they happen in every country; to large companies and small; in commercial, nonprofit, and governmental organizations; and without regard to status or reputation.” Below is a review that just scratches the surface of the sundry failures, glitches, and other IT hiccups that made the news in 2018. This year saw a slight reduction in the number of flight cancellations and delays due to computer-related problems as compared with the past three years, especially in the United States.

Observability at Scale: Building Uber’s Alerting Ecosystem

Uber’s software architecture consists of thousands of microservices that empower teams to iterate quickly and support our company’s global growth. These microservices support a variety of solutions, such as mobile applications, internal and infrastructure services, and products along with complex configurations that affect these products at city and sub-city levels. To maintain our growth and architecture, Uber’s Observability team built a robust, scalable metrics and alerting pipeline responsible for detecting, mitigating, and notifying engineers of issues with their services as soon as they occur.

Stack Overflow: How We Do Monitoring

What is monitoring? As far as I can tell, it means different things to different people. But we more or less agree on the concept. I think. Maybe. Let’s find out! Source: nickcraver.com

How Uber Beacon Helps Improve Safety for Riders and Drivers

Globally, there are approximately 1.3 million collision-related fatalities on the road every year. Crash fatalities are still the leading cause of death for people between 15 and 29 years old, impacting families, communities, and cities. Governments around the world are working to reduce the risks, committing more resources towards improving road safety. At Uber, we want to do our part by committing the power of our technology to help make travel safer for everyone.

Cape Technical Deep Dive

In this post, we’ll take a deep dive into the design of the Cape framework. First, we’ll discuss Cape’s architecture. Then we’ll look at the core scheduling component of the system. Throughout, we’ll focus the discussion on a few key design decisions. Before we begin, let’s touch on a few of our principles for developing and maintaining Cape. These principles were proposed based on learnings from the development of other systems at Dropbox, especially from Cape’s predecessor Livefill.

Kubernetes in production

I’ve provisioned Kubernetes clusters on bare metal before and have some examples here on how it can be done with CoreOS (warning: the content is rather old now and not maintained). In the beginning, a bunch of tools and methods were considered. For the network CNI, kube-router was used, as I became one of the maintainers for it some time ago after writing most of the metrics for it.

Bye bye Mongo, Hello Postgres

In April the Guardian switched off the Mongo DB cluster used to store our content after completing a migration to PostgreSQL on Amazon RDS. This post covers why and how. At the Guardian, the majority of content – including articles, live blogs, galleries and video content – is produced in our in-house CMS tool, Composer. This, until recently, was backed by a Mongo DB database running on AWS. This database is essentially the “source of truth” for all Guardian content that has been published online – approximately 2.

Implementing the Netflix Media Database

In the previous blog posts in this series, we introduced the Netflix Media DataBase (NMDB) and its salient “Media Document” data model. In this post we will provide details of the NMDB system architecture beginning with the system requirements—these will serve as the necessary motivation for the architectural choices we made. A fundamental requirement for any lasting data system is that it should scale along with the growth of the business applications it wishes to serve.