news
Tackling security compliance is a long and challenging process for agencies, systems integrators, and vendors trying to launch new information systems in the federal government. Each new information system must go through the Risk Management Framework (RMF) created by the National Institute of Standards and Technology (NIST) in order to obtain authority to operate (ATO). This process is often long and tedious and can last for over a year.
Read more
Our industry has long been relying on microservice-based architecture to deliver software faster and safer. The advent and ubiquity of microservices naturally paved the way for container technology, empowering us to rethink how we build and deploy our applications. Docker exploded onto the scene in 2013, and, for companies focusing on modernizing their infrastructure and cloud migration, a tool like Docker is critical to shipping applications quickly, at scale.
Read more
Some of the most exciting advances in AI recently have come from the field of deep reinforcement learning (deep RL), where deep neural networks learn to perform complicated tasks from reward signals. RL operates similarly to how you might teach a dog to perform a new trick: treats are offered to reinforce improved behavior. Recently, deep RL agents have exceeded human performance in benchmarks like classic video games (such as Atari 2600 games), the board game Go, and modern computer games like DOTA 2.
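To make the treat analogy concrete, here is a minimal sketch of learning from reward signals. It is a tabular epsilon-greedy bandit, not deep RL: there is no neural network, and the function name `train_bandit`, the fixed reward values, and the hyperparameters are all hypothetical choices for illustration.

```python
import random


def train_bandit(rewards, steps=2000, epsilon=0.1, seed=0):
    """Estimate the value of each action purely from the rewards received.

    `rewards[a]` is the (hypothetical, fixed) treat given for action `a`.
    """
    rng = random.Random(seed)
    estimates = [0.0] * len(rewards)  # current guess of each action's value
    counts = [0] * len(rewards)       # how often each action was tried
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: occasionally try a random action.
            action = rng.randrange(len(rewards))
        else:
            # Exploit: pick the action that currently looks best.
            action = max(range(len(rewards)), key=lambda a: estimates[a])
        reward = rewards[action]
        counts[action] += 1
        # Incremental sample average: nudge the estimate toward the reward.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates


# Three "tricks"; only the second one earns a full treat.
estimates = train_bandit([0.0, 1.0, 0.2])
best_trick = max(range(len(estimates)), key=lambda a: estimates[a])
```

Deep RL replaces the table of estimates with a neural network, but the loop of acting, receiving a reward, and adjusting toward it has the same shape.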
Read more
In Part I, we introduced a High Availability (HA) framework for MySQL hosting and discussed various components and their functionality. Now in Part II, we will discuss the details of MySQL semisynchronous replication and the related configuration settings that help us ensure redundancy and consistency of the data in our HA setup. Make sure to check back in for Part III where we will review various failure scenarios that could arise and the way the framework responds and recovers from these conditions.
Read more
Implementing Continuous Delivery[1] at enterprise scale is a major challenge. As every company has to innovate their software delivery methods, we need to allow individual teams to learn and improve their own delivery pipeline. This is especially true in the Cloud Native world, where many best practices are still emerging.
However, giving teams flexibility to experiment needs to be balanced with security and compliance requirements. In this post, I will explore how we successfully employed the GitOps architecture pattern to find a good balance between flexibility and security at a large enterprise customer of Container Solutions.
Read more
Dropbox runs hundreds of services, written in different languages, which exchange millions of requests per second. At the core of our Service Oriented Architecture is Courier, our gRPC-based Remote Procedure Call (RPC) framework. While developing Courier, we learned a lot about extending gRPC, optimizing performance for scale, and providing a bridge from our legacy RPC system.
Courier is not Dropbox’s first RPC framework. Even before we started to break our Python monolith into services in earnest, we needed a solid foundation for inter-service communication.
Read more
AWS has released a new whitepaper, Amazon Web Services’ Approach to Operational Resilience in the Financial Sector and Beyond, in which we discuss how AWS and customers build for resiliency on the AWS cloud. We’re constantly amazed at the applications our customers build using AWS services — including what our financial services customers have built, from credit risk simulations to mobile banking applications. Depending on their internal and regulatory requirements, financial services companies may need to meet specific resiliency objectives and withstand low-probability events that could otherwise disrupt their businesses.
Read more
Earlier this month my colleague Bala Thekkedath published a story about Extreme Scale HPC and talked about how AWS customer Western Digital built a cloud-scale HPC cluster on AWS and used it to simulate crucial elements of upcoming head designs for their next-generation hard disk drives (HDD). The simulation described in the story encompassed a little over 2.5 million tasks, and ran to completion in just 8 hours on a million-vCPU Amazon EC2 cluster.
Read more
At Facebook, we think that artificial intelligence that learns in new, more efficient ways – much like humans do – can play an important role in bringing people together. That core belief helps drive our AI strategy, focusing our investments in long-term research related to systems that learn using real-world data, inspiring our engineers to share cutting-edge tools and platforms with the wider AI community, and ultimately demonstrating new ways to use the technology to benefit the world.
Read more
I’m sure many of you have heard of the “Death Star Security” model (the hardening of the perimeter, without much attention paid to the inner core), and while this is generally considered bad form in the current cloud native landscape, there are still many things that do need to be implemented at the edge in order to provide both operational and business logic support. One of these things is rate limiting. Modern applications and APIs can experience a burst of traffic over a short time period, for both good and bad reasons, but this needs to be managed well if your business model relies upon the successful completion of requests by paying customers.
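One common way to absorb short bursts while capping the sustained request rate is a token bucket. The sketch below is illustrative only and is not taken from the article: the class name `TokenBucket`, its parameters, and the injectable `clock` are assumptions made for this example.

```python
import time


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)          # tokens added per second
        self.capacity = float(capacity)  # maximum burst size
        self.clock = clock               # injectable so the limiter can be tested
        self.tokens = float(capacity)    # start with a full bucket
        self.last = clock()

    def allow(self, cost=1.0):
        """Return True and spend `cost` tokens if the request may proceed."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Refill for the time that has passed, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Usage: one request per second sustained, with bursts of up to five.
limiter = TokenBucket(rate=1, capacity=5)
accepted = limiter.allow()
```

An edge proxy would typically keep one bucket per client or API key and reject requests when `allow()` returns False, commonly with an HTTP 429 response.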
Read more