News
On July 19th, 2019, Capital One got the red flag that every modern company hopes to avoid – their data had been breached. Over 106 million people were affected.
140,000 Social Security numbers. 80,000 bank account numbers. 1,000,000 Social Insurance Numbers.
Pretty messy, right? Unfortunately, the 19th wasn’t when the breach occurred.
It turns out that Paige Thompson, aka Erratic, had done the deed between March 22nd and March 23rd 2019.
Read more
Envoy Proxy in 2019: Security, Caching, Wasm, HTTP/3, and more
Since its release in September 2016, Envoy Proxy has gained enormous traction in the market. Envoy was a classic case of the right product at the right time: Envoy had the right set of features and performance to address this need. Some of these features included a runtime API for configuration & management, dynamic configuration, gRPC & HTTP/2 support, automatic retries, traffic shadowing, and robust observability systems.
Read more
YuniKorn: a universal resource scheduler
We are super excited today to announce the open-sourcing of one of the exciting new projects we’ve been working on behind the scenes at the intersection of big-data and computation platforms – YuniKorn! YuniKorn is a new standalone universal resource scheduler responsible for allocating and managing resources for big-data workloads, including batch jobs and long-running services. YuniKorn is a light-weight, universal resource scheduler for container orchestrator systems.
It was created to share resources efficiently and at a fine granularity across varied workloads, both in large-scale, multi-tenant environments and in dynamically provisioned cloud-native environments. YuniKorn brings a unified, cross-platform scheduling experience for mixed workloads consisting of stateless batch workloads and stateful services, with support for, but not limited to, YARN and Kubernetes. YuniKorn [‘ju:nikɔ:n] is a made-up word, “Y” for YARN, “K” for K8s, “Uni” for Unified, and its pronunciation is the same as “Unicorn”.
Read more
How a Production Outage Was Caused Using Kubernetes Pod Priorities
On Friday, July 19, Grafana Cloud experienced a ~30min outage in our Hosted Prometheus service. To our customers who were affected by the incident, I apologize. It's our job to provide you with the monitoring tools you need, and when they are not available we make your life harder.
We take this outage very seriously. This blog post explains what happened, how we responded to it, and what we're doing to ensure it doesn't happen again. The Grafana Cloud Hosted Prometheus service is based on Cortex, a CNCF project to build a horizontally scalable, highly available, multi-tenant Prometheus service.
Read more
The journey to support nanosecond timestamps in Elasticsearch
The ability to store dates in nanosecond resolution required a significant refactoring within the Elasticsearch code base. Read this blog post for the why and how on our journey to be able to store dates in nanosecond resolution from Elasticsearch 7.0 onwards. Elasticsearch supports a date mapping type that parses the string representation of a date in a variety of configurable formats, converts this date into milliseconds since the epoch and then stores it as a long value in Lucene.
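The resolution difference described above can be illustrated with a small sketch. This is not Elasticsearch's code (which is written in Java); the function name `parse_to_epoch` and the accepted timestamp shape are assumptions made for the example. It converts a UTC timestamp string into both milliseconds and nanoseconds since the epoch, each held as a single integer that fits in a signed 64-bit long.

```python
from datetime import datetime, timezone

INT64_MAX = 2**63 - 1  # largest value a signed 64-bit long can hold


def parse_to_epoch(text):
    """Return (epoch_millis, epoch_nanos) for a UTC timestamp string.

    Hypothetical helper for illustration only. Accepts strings such as
    '2019-07-19T12:34:56.123456789Z' with 0 to 9 fractional digits.
    """
    body = text.rstrip("Z")
    if "." in body:
        seconds_part, fraction = body.split(".")
    else:
        seconds_part, fraction = body, ""

    # datetime only keeps microseconds, so the fraction is handled by hand:
    # pad it to exactly nine digits and read it as a nanosecond count.
    nanos_fraction = int(fraction.ljust(9, "0")[:9])

    whole = datetime.strptime(seconds_part, "%Y-%m-%dT%H:%M:%S")
    epoch_seconds = int(whole.replace(tzinfo=timezone.utc).timestamp())

    epoch_nanos = epoch_seconds * 1_000_000_000 + nanos_fraction
    if epoch_nanos > INT64_MAX:
        raise OverflowError("timestamp does not fit in a signed 64-bit long")

    # Millisecond resolution discards the last six fractional digits.
    epoch_millis = epoch_nanos // 1_000_000
    return epoch_millis, epoch_nanos


if __name__ == "__main__":
    a = parse_to_epoch("2019-07-19T12:34:56.123456789Z")
    b = parse_to_epoch("2019-07-19T12:34:56.123999999Z")
    print(a[0] == b[0])  # same millisecond value
    print(a[1] == b[1])  # different nanosecond values
```

Two events that fall inside the same millisecond collapse to one stored value at millisecond resolution but stay distinct and correctly ordered at nanosecond resolution. The trade-off is range: a signed 64-bit count of nanoseconds since 1970 runs out in the year 2262.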
Read more
Key Conjurer: Our Policy of Least Privilege
Hi, my name is Reza Nikoopour and I’m a security engineer on the Security team at Riot. My team is responsible for securing Riot infrastructure wherever we’re deployed – whether that means internal or external data centers or clouds. We provide cloud security guidance to the rest of Riot, and we’re responsible for Key Conjurer, our open source AWS API programmatic access solution.
Read more
Visualizing Istio external traffic with Kiali
Suppose that you have an application using several third-party services to store files, send messages, write tweets, etc. It is useful to know how much traffic is leaving your mesh for these services. For example, you might want to know how many requests are directed to Twitter or how much data is being sent to Dropbox. It is also useful to know whether these requests succeed or fail.
Read more
A 30-Second Earthquake Warning Gives a Menlo Park Fire Station a Chance to Protect Itself
SkyAlert’s technology has been evolving for years in Mexico. Over time, the company extended what was initially a pager-based emergency alert system to a comprehensive earthquake alert system that includes a proprietary network of seismic sensors. Recently, Cantu’s efforts got the attention of researchers and investors in the U.S.—and an opportunity to give SkyAlert’s IoT technology far broader reach.
Read more
VPC Traffic Mirroring – Capture & Inspect Network Traffic
Running a complex network is not an easy job. In addition to simply keeping it up and running, you need to keep an ever-watchful eye out for unusual traffic patterns or content that could signify a network intrusion, a compromised instance, or some other anomaly. Today we are launching VPC Traffic Mirroring.
This is a new feature that you can use with your existing Virtual Private Clouds (VPCs) to capture and inspect network traffic at scale. This will allow you to: Detect Network & Security Anomalies – You can extract traffic of interest from any workload in a VPC and route it to the detection tools of your choice. You can detect and respond to attacks more quickly than is possible with traditional log-based tools.
Read more
What’s new in Kubernetes 1.15?
Another outstanding Kubernetes release, this time focused on making the CustomResource a first class citizen in your cluster, allowing for better extensibility and maintainability. But wait, there is much more! Here is the full list of what’s new in Kubernetes 1.15.
NodeLocal DNSCache improves cluster DNS performance by running a DNS caching agent on cluster nodes as a DaemonSet, thereby avoiding iptables DNAT rules and connection tracking. The local caching agent will query the kube-dns service for cache misses of cluster hostnames (cluster.local suffix by default). The Event API redesign has two main goals: reducing the performance impact that Events have on the rest of the cluster, and adding more structure to the Event object, which is the first and necessary step toward automating event analysis.
Read more