Tools that enable fast and flexible experimentation democratize and accelerate machine learning research. Take for example the development of libraries for automatic differentiation, such as Theano, Caffe, TensorFlow, and PyTorch: these libraries have been instrumental in catalyzing machine learning research, enabling gradient descent training without the tedious work of hand-computing derivatives. In these frameworks, it’s simple to experiment by adjusting the size and depth of a neural network, by changing the error function to be optimized, and even by inventing new architectural elements such as layers and activation functions, all without having to worry about how to derive the resulting gradient.
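To make the idea concrete, here is a minimal sketch of automatic differentiation driving gradient descent. It uses forward-mode dual numbers in plain Python, which is an illustrative simplification and not how the libraries named above are implemented. The names `Dual`, `grad`, `loss`, and `train`, and the toy error function, are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative with respect to one scalar input."""
    value: float
    deriv: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __sub__(self, other):
        other = _lift(other)
        return Dual(self.value - other.value, self.deriv - other.deriv)

    def __rsub__(self, other):
        return _lift(other) - self

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (u * v)' = u' * v + u * v'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants (derivative zero)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def grad(f):
    """Return a function that evaluates df/dw at a scalar w."""
    def df(w):
        return f(Dual(w, 1.0)).deriv
    return df


def loss(w):
    # Hypothetical error function: squared error of the prediction w * 3 against target 6.
    err = w * 3.0 - 6.0
    return err * err


def train(w=0.0, lr=0.05, steps=100):
    """Plain gradient descent; the derivative is never written out by hand."""
    dloss = grad(loss)
    for _ in range(steps):
        w = w - lr * dloss(w)
    return w
```

Changing `loss` to a different expression built from `+`, `-`, and `*` requires no other edits: `grad` picks up the new derivative automatically, which is the experimentation benefit the paragraph describes.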
Read more
Welcome to part 3 in our series about secure control of egress traffic in Istio. In the first part of the series, I presented the attacks involving egress traffic and the requirements we collected for a secure control system for egress traffic. In the second part, I presented the Istio way of securing egress traffic and showed how you can prevent the attacks using Istio.
In this installment, I compare secure control of egress traffic in Istio with alternative solutions such as using Kubernetes network policies and legacy egress proxies and firewalls.
Read more
For online serving systems it’s fairly well known that you should look for request rate, errors and duration. What about offline processing pipelines though? For a typical web application, high latency or error rates are the sort of thing you want to wake someone up about as they usually negatively affect the end-user’s experience.
Request rate isn’t something to alert on in and of itself; however, it’s important to know, as it’s often related to errors and latency, and you’ll want it for capacity planning.
Read more
This panel is a very diverse group, and I’m actually going to let them introduce themselves rather than me trying to butcher any names. This is all about answering my need, literally, my first steps. What should I be focused on as a software engineer wanting to get into ML, start using ML more, and convince leadership of the things that I want to do?
For example, I work for an edge company deploying use cases at the edge, so I want to be able to use machine learning to detect anomalies at the edge.
Read more
Tucked within Norway’s fjord-riddled coast, nearly 3,500 fish pens corral upwards of 400 million salmon and trout. Not only does the country raise and ship more salmonoid overseas than any other in the world (1.1 million tons in 2018), farmed salmon is Norway’s third largest export behind crude petroleum and natural gas. In a global industry expected to quintuple by 2050, farmed salmon is a fine kettle of fish.
Read more
As the Kubernetes API evolves, APIs are periodically reorganized or upgraded. When APIs evolve, the old API is deprecated and eventually removed. The 1.16 release will deprecate APIs for four services. None of these resources will be removed from Kubernetes or deprecated in any way.
However, to continue using these resources, you must use a current version of the Kubernetes API. NetworkPolicy: will no longer be served from extensions/v1beta1 in v1.
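As a rough illustration of moving a resource to a current API version, the sketch below rewrites the `apiVersion` field of a manifest that has already been loaded as a Python dict. It covers only the NetworkPolicy case mentioned above and assumes `networking.k8s.io/v1` as the target group. The helper name `migrate_api_version` and the sample manifest are hypothetical, and a real migration may also require changes to the spec itself.

```python
import copy

# Assumption: only NetworkPolicy is mapped here. networking.k8s.io/v1 is the
# stable API group and version for NetworkPolicy.
REPLACEMENTS = {
    ("NetworkPolicy", "extensions/v1beta1"): "networking.k8s.io/v1",
}


def migrate_api_version(manifest):
    """Return a copy of the manifest with a deprecated apiVersion replaced."""
    updated = copy.deepcopy(manifest)
    key = (updated.get("kind"), updated.get("apiVersion"))
    if key in REPLACEMENTS:
        updated["apiVersion"] = REPLACEMENTS[key]
    return updated


# Hypothetical manifest, shown as the dict a YAML parser would produce.
old_policy = {
    "apiVersion": "extensions/v1beta1",
    "kind": "NetworkPolicy",
    "metadata": {"name": "example-policy"},
    "spec": {"podSelector": {}},
}
new_policy = migrate_api_version(old_policy)
```

The function copies its input, so the original manifest stays available for comparison, and any kind or version not listed in `REPLACEMENTS` passes through unchanged.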
Read more
Apache Spark is a foundational piece of Uber’s Big Data infrastructure that powers many critical aspects of our business. We currently run more than one hundred thousand Spark applications per day, across multiple different compute environments. Spark’s versatility, which allows us to build applications and run them everywhere that we need, makes this scale possible.
However, our ever-growing infrastructure means that these environments are constantly changing, making it increasingly difficult for both new and existing users to give their applications reliable access to data sources, compute resources, and supporting tools.
Read more
It is important to fine-tune the set of services that a workload has access to. Following the principle of least privilege, we should grant each workload permission to communicate with exactly the services it needs to access.
This also helps reduce the attack surface if a workload in our mesh is compromised.
Unwanted requests between services
For example, a developer could contact the ratings service directly instead of using the review service.
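The least-privilege idea can be sketched as an explicit allow-list that denies everything not listed. This is a conceptual model in plain Python, not the mesh's actual policy mechanism. The workload names, the `ALLOWED_CALLS` table, and the `is_allowed` helper are assumptions made for illustration.

```python
# Hypothetical allow-list: each source workload maps to the only
# destinations it is permitted to call. Anything absent is denied.
ALLOWED_CALLS = {
    "reviews": {"ratings"},
}


def is_allowed(source, destination, allowed=ALLOWED_CALLS):
    """Deny by default; permit only explicitly listed source/destination pairs."""
    return destination in allowed.get(source, frozenset())


# The reviews workload may reach ratings, but a caller with no entry
# (for example a developer's ad hoc client) may not reach it directly.
reviews_to_ratings = is_allowed("reviews", "ratings")
developer_to_ratings = is_allowed("developer-client", "ratings")
```

Because the default is to deny, a compromised workload can reach only the destinations already listed for it, which is how an allow-list limits the attack surface.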
Read more
In 2018, we published a blog post titled Categorizing Listing Photos at Airbnb. In that post, we introduced an image classification model which categorized listing photos into different room types and helped organize hundreds of millions of listing photos on the Airbnb platform. Since then, the technology has been powering a wide range of internal content moderation tools, as well as some consumer-facing features on the Airbnb website.
We hope this image classification technology makes our business more efficient and our products more pleasant to use.
Read more
GitHub stars are an essential growth factor for many open source projects, but they can easily come from bot accounts. How can we trust GitHub stars again? For open source projects on GitHub, stars are a key metric.
Of course, there are ways to abuse this system, as you might have heard recently. As an open source company, we want our community’s legitimacy to be transparent, and we want to help the open source community do the same for other projects.
Read more