The Beginner’s Guide to the CNCF Landscape

The cloud native landscape can be complicated and confusing. Its myriad open source projects are supported by the constant contributions of a vibrant and expansive community. The Cloud Native Computing Foundation (CNCF) has a landscape map that shows the full extent of cloud native solutions, many of which are under its umbrella. In 2014 Google open sourced Kubernetes, a container orchestrator based on an internal project called Borg. Not having a place to land the project, Google partnered with the Linux Foundation to create the CNCF, which would encourage the development of, and collaboration on, Kubernetes and other cloud native solutions.

Tensorflow 2.0: models migration and new design

TensorFlow 2.0 will be a major milestone for the most popular machine learning framework: lots of changes are coming, all with the aim of making ML accessible to everyone. These changes, however, require existing users to completely relearn how to use the framework. This article describes all the (known) differences between the 1.x and 2.x versions, focusing on the change of mindset required and highlighting the pros and cons of the new implementation.

Horizon: An open-source reinforcement learning platform

Horizon is the first open source end-to-end platform that uses applied reinforcement learning (RL) to optimize systems in large-scale production environments. The workflows and algorithms included in this release were built on open frameworks — PyTorch 1.0, Caffe2, and Spark — making Horizon accessible to anyone using RL at scale. We’ve put Horizon to work internally over the past year in a wide range of applications, including helping to personalize M suggestions, delivering more meaningful notifications, and optimizing streaming video quality.

An in-depth look at 100% Zero Downtime deployments with Terraform

At Checkly, we run our browser checks on AWS EC2 instances managed by Terraform. When shipping a new version, we don’t want to interrupt our service, so we need zero downtime deployments. HashiCorp has its own write-up on zero downtime upgrades, but it only introduces the Terraform configuration, without much of the context, workflow or other details needed to actually make this work in real life™. This is the full lowdown of how we do it in production for ~1.

A Linkerd 2.0 Deep Dive

It’s always exciting to see what folks in the Cloud Native community think the first time they encounter Linkerd. In this video, Lachlan Evenson, a Principal Program Manager at Microsoft Azure Container Services and CNCF Ambassador, takes a super extensive tour of Linkerd 2.0. His under-the-hood perambulations include: running health checks to ensure that Linkerd has installed correctly, tracking resource consumption and CRD counts (hint: 0) in the install process, and extensive looks at Linkerd’s UNIX-style CLI, including ‘stat’, ‘tap’ and ‘top’ commands to track service behaviors like inbound and outbound request performance.

Peloton: Uber’s Unified Resource Scheduler for Diverse Cluster Workloads

Cluster management, a common software infrastructure among technology companies, aggregates compute resources from a collection of physical hosts into a shared resource pool, amplifying compute power and allowing for the flexible use of data center hardware. At Uber, cluster management provides an abstraction layer for various workloads. With the increasing scale of our business, the efficient use of cluster resources becomes very important. However, our compute stack was underutilized due to several dedicated clusters for batch, stateless, and stateful use cases.

Kubernetes deployment strategies

In Kubernetes there are a few different ways to release an application, and choosing the right strategy is necessary to keep your infrastructure reliable during an application update. Let’s take a look at each strategy and see what type of application it fits best. Source: container-solutions.com

October 21 GitHub post-incident analysis

Last week, GitHub experienced an incident that resulted in degraded service for 24 hours and 11 minutes. While portions of our platform were not affected by this incident, multiple internal systems were affected, which resulted in our displaying information that was out of date and inconsistent. Ultimately, no user data was lost; however, manual reconciliation for a few seconds of database writes is still in progress. For the majority of the incident, GitHub was also unable to serve webhook events or build and publish GitHub Pages sites.

New Theory of Intelligence May Disrupt AI and Neuroscience

Recent advances in artificial intelligence, namely in deep learning, have borrowed concepts from the human brain. The architecture of most deep learning models is based on layers of processing: an artificial neural network inspired by the neurons of the biological brain. Yet neuroscientists do not agree on exactly what intelligence is or how it is formed in the human brain; it’s a phenomenon that remains unexplained. Technologist, scientist, and co-founder of Numenta, Jeff Hawkins, presented an innovative framework for understanding how the human neocortex operates, called “The Thousand Brains Theory of Intelligence,” at the Human Brain Project Summit in Maastricht, the Netherlands, in October 2018.

Why React’s new Hooks API is a game changer

I have been developing with React since its early days, and during that time there have been many attempts by both influencers and the core team to improve the API and the patterns developers use to create software. One of the biggest challenges we have had is how to share behaviour neatly between components to enable reuse or even just separation of concerns. Every single solution proposed up until this point has had some problems associated with it.