Real Time Facial Expression Recognition

Computer-animated agents and robots bring a new dimension to human-computer interaction, which makes it vital to understand how computers can affect our social lives in day-to-day activities. Face-to-face communication is a real-time process operating at a time scale on the order of milliseconds. The level of uncertainty at this time scale is considerable, making it necessary for humans and machines to rely on sensory-rich perceptual primitives rather than slow symbolic inference processes.

How we spent two weeks hunting an NFS bug in the Linux kernel

On Sep. 14, the GitLab support team escalated a critical problem encountered by one of our customers: GitLab would run fine for a while, but after some time users encountered errors. When attempting to clone certain repositories via Git, users would see an opaque Stale file error message. The error message persisted for a long time, blocking employees from being able to work, unless a system administrator intervened manually by running ls in the directory itself.

Meet TiDB: An open source NewSQL database

TiDB is an open source NewSQL database released under the Apache 2.0 License. Because it speaks the MySQL protocol, your existing applications will be able to connect to it using any MySQL connector, and most SQL functionality remains identical (joins, subqueries, transactions, etc.). Step under the covers, however, and there are differences. If your architecture is based on MySQL with Read Replicas, you’ll see things work a little bit differently with TiDB.

CloudFormation Drift Detection

AWS CloudFormation supports you in your efforts to implement Infrastructure as Code (IaC). You can use a template to define the desired AWS resource configuration, and then use it to launch a CloudFormation stack. The stack contains the set of resources defined in the template, configured as specified. When you need to make a change to the configuration, you update the template and use a CloudFormation Change Set to apply the change.

Druid @ Airbnb Data Platform

Airbnb serves millions of guests and hosts in our community. Every second, their activities on Airbnb.com, such as searching, booking, and messaging, generate a huge amount of data we anonymize and use to improve the community’s experience on our platform. The Data Platform Team at Airbnb strives to leverage this data to improve our customers’ experiences and optimize Airbnb’s business. Our mission is to provide infrastructure to collect, organize, and process this deluge of data (all in privacy-safe ways), and empower various organizations across Airbnb to derive necessary analytics and make data-informed decisions from it.

Setting up the Kubernetes AWS Cloud Provider

The AWS cloud provider for Kubernetes enables a couple of key integration points for Kubernetes running on AWS; namely, dynamic provisioning of Elastic Block Store (EBS) volumes, and dynamic provisioning/configuration of Elastic Load Balancers (ELBs) for exposing Kubernetes Service objects. Unfortunately, the documentation surrounding how to set up the AWS cloud provider with Kubernetes is woefully inadequate. This article is an attempt to help address that shortcoming.
Source: heptio.com

Debugging Node Services in Kubernetes With Linkerd 2.0

Node is one of the most popular languages for microservices. With the rise of Kubernetes, increasingly, Node developers are being asked to deploy their services to a Kubernetes cluster. But what’s required to safely deploy and run Node services on Kubernetes? In this post, we focus on one specific, but vital, component: how do I understand what’s happening with my Node service on Kubernetes, and how do I debug it when things go wrong?

Accurate Online Speaker Diarization with Supervised Learning

Speaker diarization, the process of partitioning an audio stream with multiple people into homogeneous segments associated with each individual, is an important part of speech recognition systems. By solving the problem of “who spoke when”, speaker diarization has applications in many important scenarios, such as understanding medical conversations, video captioning and more. However, training these systems with supervised learning methods is challenging — unlike standard supervised classification tasks, a robust diarization model requires the ability to associate new individuals with distinct speech segments that weren’t involved in training.

A Google Brain engineer’s guide to entering AI

Note that this guide was written in November 2018 to complement an in-depth conversation on the 80,000 Hours Podcast with Catherine Olsson and Daniel Ziegler on how to transition from computer science and software engineering in general into ML engineering, with a focus on alignment and safety. If you like this guide, we’d strongly encourage you to check out the podcast episode where we discuss some of the instructions here, and other relevant advice.

Introspected REST: An Alternative to REST and GraphQL

In this manifesto, we will give a specific definition of what REST is, according to Roy Fielding, and see that the majority of APIs and API specs (JSONAPI, HAL, etc.) fail to follow this model. We will see what problems a RESTful API brings and why API designers have consistently avoided it, instead coming up with half-way solutions or retreating to alternative models like RPC-over-HTTP or, lately, GraphQL. Then, we will propose a new model, Introspected REST, that solves the issues that REST creates and allows the design of progressively evolvable APIs in a much simpler way than conventional REST.