Loading Android data with coroutines

Many moons ago, I was working at the New York Times and created a library called Store, which was “a Java library for effortless, reactive data loading.” We built Store using RxJava and patterns adopted from Guava’s Cache implementation. Today’s app users expect data updates to flow in and out of the UI without having […]

Personalizing Spotify Home with Machine Learning

Machine learning is at the heart of everything we do at Spotify. Especially on Spotify Home, where it enables us to personalize the user experience and provide billions of fans the opportunity to enjoy and be inspired by the artists on our platform. This is what makes Spotify unique. Across our engineering community, we are […]

Istio as an Example of When Not to Do Microservices

I’ve been pretty invested in helping organizations with their cloud-native journeys for the last five years. Modernizing and improving a team’s (and eventually an organization’s) velocity to deliver software-based technology is heavily influenced by its people, process, and eventual technology decisions. A microservices approach may be appropriate when the culmination of an application’s architecture has […]

Database Migration To Amazon Aurora

In this blog post, we’ll show you how we migrated a critical Postgres database with 18 TB of data from Amazon RDS (Relational Database Service) to Amazon Aurora, with minimal downtime. To do so, we’ll discuss our experience at Codacy. We chose Amazon’s Aurora database as a solution for a few key reasons, including: 1) automatic storage growth […]

Injecting the flu vaccine into a tumor gets the immune system to attack it

A number of years back, there was a great deal of excitement about using viruses to target cancer. A number of viruses explode the cells that they’ve infected in order to spread to new ones. Engineering those viruses so that they could only grow in cancer cells would seem to provide a way of selectively […]

How does a Prometheus Histogram work?

We looked previously at the counter, gauge, and summary; how does the Prometheus histogram work? The histogram has several similarities to the summary. A histogram is a combination of various counters. Like summary metrics, histogram metrics are used to track the size of events, usually how long they take, via […]

Building a Service Mesh with Envoy

A service mesh is the communication layer in a microservice setup. All requests to and from each of the services go through the mesh. Also known as an infrastructure layer in a microservices setup, the service mesh makes communication between services reliable and secure. Each service has its own proxy service (sidecars) and all the proxy services […]

Monitoring blocked and passthrough external service traffic

What are BlackHole and Passthrough clusters? Understanding, controlling, and securing your external service access is one of the key benefits that you get from a service mesh like Istio. From a security and operations point of view, it is critical to monitor what external service traffic is getting blocked, as it might surface possible misconfigurations […]

Scaling a Mature Data Pipeline—Managing Overhead

Before delving into our specifics, I want to take a moment to discuss the technical stack backing our pipeline. Our platform uses a mixture of Spark and Hive jobs. Our core pipeline is primarily implemented in Scala. However, we leverage Spark SQL in certain contexts. We leverage YARN for job scheduling and resource management, and […]

Peloton – Uber’s Webscale Unified Scheduler on Mesos & Kubernetes

Mayank Bansal and Min Cai present Peloton, a Unified Resource Scheduler for collocating heterogeneous workloads in shared Mesos clusters. Its goal is to manage compute resources more efficiently while providing hierarchical max-min fairness guarantees for different teams. Peloton schedules large-scale batch jobs with millions of tasks and supports distributed TensorFlow jobs with thousands of GPUs. […]