Anatomy Of A Power Outage: Explaining the August Outage Affecting 5% of Britain

Without warning on an early August evening a significant proportion of the electricity grid in the UK went dark. It was still daylight so the disruption caused was not as large as it might have been, but it does highlight how we take a stable power grid for granted. The story is a fascinating one of a 76-second chain of unexpected shutdown events in which individual systems reacted according to their programming, resulted in a partial grid load shedding — what we might refer to as a shutdown.
Read more

How Lyft Creates Hyper-Accurate Maps from Open-Source Maps and Real-Time Data

At Lyft, our novel driver localization algorithm detects map errors to create a hyper-accurate map from OpenStreetMap (OSM) and real-time data. We have fixed thousands of map errors in OSM in bustling urban areas. Later in the post, we share a sample of the detected map errors in Minneapolis with the OSM Community to improve the quality of the map. Why are maps important for Lyft? Lyft’s mission to build the world’s best transportation relies on its inherent geospatial capabilities.
Read more

Introducing KiloGram, a New Technique for AI Detection of Malware

A team of researchers recently presented their paper on KiloGram, a new algorithm for managing large n-grams in files, to improve machine-learning detection of malware. The new algorithm is 60x faster than previous methods and can handle n-grams for n=1024 or higher. The large values of n have additional application for interpretable malware analysis and signature generation. Source: infoq.com

Why our team cancelled our move to microservices

Recently our development team had a small break in our feature delivery schedule. Technical leadership decided that this time would be best spent splitting our monolithic architecture into microservices. After a month of investigation and preparation, we cancelled the move, instead deciding to stick with our monolith. For us, microservices were not only going to not help us; they were going to hurt our development process. Microservices had been sold to us as the ideal architectural for perhaps a year now.
Read more

Announcing etcd 3.4

In particular, etcd experienced performance issues with a large number of concurrent read transactions even when there is no write (e.g. “read-only range request … took too long to execute”). Previously, the storage backend commit operation on pending writes blocks incoming read transactions, even when there was no pending write. Now, the commit does not block reads which improve long-running read transaction performance. We further made backend read transactions fully concurrent.
Read more

Down The Rabbit Hole of Performance Monitoring

Hi, I’m Tony, and I’m an engineer on League. This article is a followup to my performance series, where I talk about optimisation and profiling. This will be a high level overview of how we monitor game performance in League of Legends, how we detect when a performance degradation has slipped through QA and escaped into the wild, and how we track global trends in frame times over many patches and millions of players.
Read more

Why Spinnaker matters to CI/CD

It takes many tools to deliver an artifact into production. Tools for building and testing, tools for creating a deployable artifact like a container image, tools for authentication and authorization, tools for maintaining infrastructure, and more. Seamlessly integrating these tools into a workflow can be transformative for an engineering culture, but doing it yourself can be a tall order. As organizations mature, both the number of tools and the number of people managing them tend to grow, often leading to confusing complexity and fragmentation.
Read more

What Bitbucket learned from migrating its unit testing tool

More than four years ago, Stash Bitbucket transitioned from running Javascript unit tests by Java to Karma. Today, I can announce that we switched our test runner to a modern tool called Jest! How did we end up here? Before Jest, our testing framework was a combination of different tools and helpers wired up together: Each of them serves a different purpose and is mandatory to write unit tests. The problem with that setup is that you need to maintain the custom integrations and this can cost you time and a lot of headaches.
Read more

The Linux kernel: Top 5 innovations

The word innovation gets bandied about in the tech industry almost as much as revolution, so it can be difficult to differentiate hyperbole from something that’s actually exciting. The Linux kernel has been called innovative, but then again it’s also been called the biggest hack in modern computing, a monolith in a micro world. Setting aside marketing and modeling, Linux is arguably the most popular kernel of the open source world, and it’s introduced some real game-changers over its nearly 30-year life span.
Read more

The lifecycle of Linux kernel testing

In Continuous integration testing for the Linux kernel,I wrote about the Continuous Kernel Integration (CKI) project and its mission to change how kernel developers and maintainers work. This article is a deep dive into some of the more technical aspects of the project and how all the pieces fit together. Every exciting feature, improvement, and bug in the kernel starts with a change proposed by a developer. These changes appear on myriad mailing lists for different kernel repositories.
Read more