News

Running Kubernetes in the Federal Government

Tackling security compliance is a long and challenging process for agencies, systems integrators, and vendors trying to launch new information systems in the federal government. Each new information system must go through the Risk Management Framework (RMF), created by the National Institute of Standards and Technology (NIST), to obtain an Authority to Operate (ATO). The process is tedious and can take more than a year.
Read more

Monitoring Kubernetes, part 1: the challenges + data sources

Our industry has long relied on microservice-based architectures to deliver software faster and more safely. The advent and ubiquity of microservices naturally paved the way for container technology, empowering us to rethink how we build and deploy our applications. Docker exploded onto the scene in 2013, and for companies focused on modernizing their infrastructure and migrating to the cloud, a tool like Docker is critical to shipping applications quickly and at scale.
Read more

Creating a Zoo of Atari-Playing Agents to Catalyze the Understanding of Deep Reinforcement Learning

Some of the most exciting recent advances in AI have come from the field of deep reinforcement learning (deep RL), where deep neural networks learn to perform complicated tasks from reward signals. RL operates much like teaching a dog a new trick: treats are offered to reinforce improved behavior. Deep RL agents have now exceeded human performance on benchmarks such as classic Atari 2600 video games, the board game Go, and modern computer games like Dota 2.
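The dog-training analogy maps directly onto RL's core update rule: nudge value estimates toward the observed reward plus discounted future value. Below is a minimal tabular Q-learning sketch on a toy corridor environment (a hand-rolled illustration of the underlying idea, not the deep networks used for the Atari agents; all names here are hypothetical):

```python
import random

# Toy corridor: states 0..4; reward only for reaching the right end.
# A hypothetical stand-in for an environment like an Atari game.
N_STATES = 5
ACTIONS = (-1, +1)                      # step left / step right
alpha, gamma, epsilon = 0.5, 0.9, 0.1   # learning rate, discount, exploration
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

for episode in range(200):
    s = 0
    while s != N_STATES - 1:
        # Epsilon-greedy: usually exploit the best-known action, sometimes explore.
        if random.random() < epsilon:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s_next = min(max(s + a, 0), N_STATES - 1)
        reward = 1.0 if s_next == N_STATES - 1 else 0.0   # the "treat"
        # Q-learning update: move the estimate toward reward + discounted future value.
        best_next = max(Q[(s_next, act)] for act in ACTIONS)
        Q[(s, a)] += alpha * (reward + gamma * best_next - Q[(s, a)])
        s = s_next

# The learned greedy policy: best action for each non-terminal state.
print([max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(N_STATES - 1)])
```

Deep RL replaces the lookup table with a neural network that generalizes across states, which is what makes Atari-scale observation spaces tractable.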
Read more

MySQL High Availability Framework Explained – Part II

In Part I, we introduced a High Availability (HA) framework for MySQL hosting and discussed its various components and their functionality. Now, in Part II, we will discuss the details of MySQL semisynchronous replication and the related configuration settings that help us ensure redundancy and consistency of the data in our HA setup. Make sure to check back for Part III, where we will review the various failure scenarios that could arise and how the framework responds to and recovers from them.
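As a preview of the kind of settings the post walks through, here is a minimal sketch assuming the standard semisynchronous replication plugins that ship with MySQL 5.7; the exact values (timeout, ACK count) depend on your redundancy requirements:

```sql
-- Master: load the plugin and block commits until at least one replica ACKs.
INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
SET GLOBAL rpl_semi_sync_master_enabled = 1;
SET GLOBAL rpl_semi_sync_master_timeout = 10000;           -- ms before falling back to async
SET GLOBAL rpl_semi_sync_master_wait_for_slave_count = 1;  -- replicas that must ACK

-- Each replica: enable the slave-side plugin.
INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
SET GLOBAL rpl_semi_sync_slave_enabled = 1;
```

The timeout is the key durability/availability trade-off: if no replica acknowledges within it, the master silently degrades to asynchronous replication.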
Read more

Enterprise grade CI/CD with GitOps

Implementing Continuous Delivery[1] at enterprise scale is a major challenge. Because every company must keep improving how it delivers software, individual teams need room to learn and refine their own delivery pipelines. This is especially true in the Cloud Native world, where many best practices are still emerging. However, the flexibility to experiment has to be balanced against security and compliance requirements. In this post, I will explore how we successfully employed the GitOps architecture pattern to strike that balance at a large enterprise customer of Container Solutions.
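To make the pattern concrete, here is a hypothetical example of what GitOps looks like in practice: the desired cluster state lives as declarative manifests in a Git repository, and an agent in the cluster continuously reconciles toward it. The paths and image names below are illustrative, not from the post:

```yaml
# config-repo: apps/payments/deployment.yaml (hypothetical layout)
# A merged pull request *is* the deployment; the cluster pulls this state from Git.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payments
spec:
  replicas: 3
  selector:
    matchLabels:
      app: payments
  template:
    metadata:
      labels:
        app: payments
    spec:
      containers:
        - name: payments
          image: registry.example.com/payments:1.4.2  # version bumped via PR review
```

Security and compliance largely fall out of the workflow: Git history is the audit trail, and branch protection plus code review gate every production change.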
Read more

Courier: Dropbox migration to gRPC

Dropbox runs hundreds of services, written in different languages, which exchange millions of requests per second. At the core of our Service Oriented Architecture is Courier, our gRPC-based Remote Procedure Call (RPC) framework. While developing Courier, we learned a lot about extending gRPC, optimizing performance for scale, and providing a bridge from our legacy RPC system. Courier is not Dropbox’s first RPC framework. Even before we started to break our Python monolith into services in earnest, we needed a solid foundation for inter-service communication.
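For readers new to gRPC, the contract-first workflow at the heart of a framework like Courier starts with a Protocol Buffers service definition, from which typed client and server stubs are generated for each language. A minimal illustrative example (not Courier's actual interface):

```protobuf
// greeter.proto: a toy service definition, not part of Courier itself.
syntax = "proto3";

package demo;

// protoc generates client and server stubs from this in every supported language.
service Greeter {
  rpc SayHello (HelloRequest) returns (HelloReply);
}

message HelloRequest {
  string name = 1;
}

message HelloReply {
  string greeting = 1;
}
```

Cross-cutting concerns of the kind the post describes (authentication, tracing, deadlines) are typically layered on via gRPC interceptors rather than baked into each service by hand.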
Read more

New whitepaper: Achieving Operational Resilience in the Financial Sector and Beyond

AWS has released a new whitepaper, Amazon Web Services’ Approach to Operational Resilience in the Financial Sector and Beyond, in which we discuss how AWS and customers build for resiliency on the AWS cloud. We’re constantly amazed at the applications our customers build using AWS services — including what our financial services customers have built, from credit risk simulations to mobile banking applications. Depending on their internal and regulatory requirements, financial services companies may need to meet specific resiliency objectives and withstand low-probability events that could otherwise disrupt their businesses.
Read more

Western Digital HDD Simulation at Cloud Scale – 2.5 Million HPC Tasks, 40K EC2 Spot Instances

Earlier this month, my colleague Bala Thekkedath published a story about Extreme Scale HPC describing how AWS customer Western Digital built a cloud-scale HPC cluster on AWS and used it to simulate crucial elements of upcoming head designs for its next-generation hard disk drives (HDDs). The simulation described in the story encompassed a little over 2.5 million tasks and ran to completion in just 8 hours on a million-vCPU Amazon EC2 cluster.
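A quick back-of-envelope from the figures in the story (averages only, since actual instance sizes and task runtimes varied):

```python
tasks, vcpus, instances, hours = 2_500_000, 1_000_000, 40_000, 8

print(vcpus / instances)       # ~25 vCPUs per Spot instance, on average
print(vcpus * hours / tasks)   # ~3.2 vCPU-hours of compute per simulation task
print(tasks / (hours * 3600))  # ~87 tasks completing per second, sustained
```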
Read more

AI year in review

At Facebook, we think that artificial intelligence that learns in new, more efficient ways – much like humans do – can play an important role in bringing people together. That core belief drives our AI strategy: we focus our investments on long-term research into systems that learn from real-world data, encourage our engineers to share cutting-edge tools and platforms with the wider AI community, and ultimately demonstrate new ways to use the technology to benefit the world.
Read more

Rate Limiting at the Edge

I’m sure many of you have heard of the “Death Star Security” model: harden the perimeter and pay little attention to the inner core. While this is generally considered bad form in the current cloud native landscape, there are still many things that need to be implemented at the edge to provide both operational and business-logic support. One of these things is rate limiting. Modern applications and APIs can experience bursts of traffic over short periods, for both good and bad reasons, and these bursts need to be managed well if your business model relies on paying customers’ requests completing successfully.
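The usual algorithm behind this is a token bucket, which permits short bursts while enforcing a sustained rate. A minimal single-process sketch (edge proxies apply the same idea per client, often against a shared store such as Redis; this is an illustration, not any particular product's API):

```python
import time

class TokenBucket:
    """Classic token-bucket rate limiter: tokens refill at a steady rate,
    and each request spends one, so bursts up to `capacity` are allowed."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate            # tokens refilled per second
        self.capacity = capacity    # maximum burst size
        self.tokens = capacity
        self.updated = time.monotonic()

    def allow(self, cost: float = 1.0) -> bool:
        now = time.monotonic()
        # Refill in proportion to elapsed time, capped at the bucket size.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

# Allow a sustained 100 req/s per client, with bursts of up to 200.
limiter = TokenBucket(rate=100, capacity=200)
if not limiter.allow():
    pass  # reject the request, e.g. with HTTP 429 Too Many Requests
```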
Read more