News
Anomalies, often referred to as outliers, are data points or patterns in data that do not conform to a notion of normal behavior. Anomaly detection, then, is the task of finding those patterns in data that do not adhere to expected norms. The capability to recognize or detect anomalous behavior can provide highly useful insights across industries.
Flagging or enacting a planned response when these unusual cases occur can save businesses time, money, and customers. Automatically detecting and correctly classifying something unseen as anomalous is a challenging problem that has been approached in many different ways over the years. Traditional machine learning approaches are sub-optimal when it comes to high-dimensional data, because they fail to capture the complex structure in the data.
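For a feel for the idea, here is a minimal, hypothetical sketch (not code from the article) of reconstruction-based anomaly detection: fit a small autoencoder to "normal" data only, then flag points whose reconstruction error is unusually high.

```python
# Minimal sketch of reconstruction-based anomaly detection (illustrative only):
# train an autoencoder on "normal" data, then flag points whose reconstruction
# error exceeds a threshold derived from the normal data.
import torch
import torch.nn as nn

torch.manual_seed(0)
normal = torch.randn(1000, 20)            # stand-in for normal, high-dimensional data
anomalies = torch.randn(10, 20) * 4 + 6   # stand-in for out-of-distribution points

model = nn.Sequential(nn.Linear(20, 4), nn.ReLU(), nn.Linear(4, 20))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for _ in range(200):                      # fit the model to normal behavior only
    opt.zero_grad()
    loss = ((model(normal) - normal) ** 2).mean()
    loss.backward()
    opt.step()

def score(x):                             # reconstruction error as anomaly score
    with torch.no_grad():
        return ((model(x) - x) ** 2).mean(dim=1)

threshold = score(normal).quantile(0.99)  # e.g. flag the top 1% of normal scores
print((score(anomalies) > threshold).float().mean())  # fraction of anomalies flagged
```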
Read more
Designing a Production-Ready Kappa Architecture for Timely Data Stream Processing
At Uber, we use robust data processing systems such as Apache Flink and Apache Spark to power the streaming applications that help us calculate up-to-date pricing, enhance driver dispatching, and fight fraud on our platform. Such solutions can process data at a massive scale in real time with exactly-once semantics, and the emergence of these systems over the past several years has unlocked an industry-wide ability to write streaming data processing applications at low latencies, a functionality previously impossible to achieve at scale. However, since streaming systems are inherently unable to guarantee event order, they must make trade-offs in how they handle late data.
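As a hedged illustration of that late-data trade-off (not Uber's actual pipeline), a Spark Structured Streaming job can bound how long it waits for out-of-order events with a watermark; events arriving later than that bound are simply dropped from the aggregation.

```python
# Sketch of the late-data trade-off in Spark Structured Streaming (illustrative only):
# the watermark bounds how long the engine waits for out-of-order events.
from pyspark.sql import SparkSession
from pyspark.sql.functions import window, col

spark = SparkSession.builder.appName("late-data-demo").getOrCreate()

events = (
    spark.readStream.format("rate").option("rowsPerSecond", 10).load()
    .withColumnRenamed("timestamp", "event_time")
)

# Events arriving more than 10 minutes behind the watermark are dropped from the
# aggregation -- the price paid for bounded state and low latency.
counts = (
    events
    .withWatermark("event_time", "10 minutes")
    .groupBy(window(col("event_time"), "5 minutes"))
    .count()
)

query = counts.writeStream.outputMode("update").format("console").start()
query.awaitTermination()
```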
Read more
How Amazon is solving big-data challenges with data lakes
Back when Jeff Bezos filled orders in his garage and drove packages to the post office himself, crunching the numbers on costs, tracking inventory, and forecasting future demand was relatively simple. Fast-forward 25 years, and Amazon’s retail business has more than 175 fulfillment centers (FCs) worldwide with over 250,000 full-time associates shipping millions of items per day. Amazon’s worldwide financial operations team has the incredible task of tracking all of that data (think petabytes).
Read more
2019 in Review: 10 AI Papers That Made an Impact
Synced spotlights 10 artificial intelligence papers that garnered extraordinary attention and accolades in 2019. The volume of peer-reviewed AI research papers has grown by more than 300 percent over the past three decades (Stanford AI Index 2019), and the top AI conferences in 2019 saw a deluge of papers. CVPR submissions spiked to 5,165, a 56 percent increase over 2018; ICLR received 1,591 main conference paper submissions, up 60 percent over last year; ACL reported a record-breaking 2,906 submissions, almost doubling last year’s 1,544; and ICCV 2019 received 4,303 submissions, more than twice the 2017 total.
Read more
How we 30x’d our Node parallelism
What’s the best way to safely increase parallelism in a production Node service? That’s a question my team needed to answer a couple of months ago. We were running 4,000 Node containers (or ‘workers’) for our bank integration service.
The service was originally designed such that each worker would process only a single request at a time. This design lessened the impact of integrations that accidentally blocked the event loop, and allowed us to ignore the variability in resource usage across different integrations. But since our total capacity was capped at 4,000 concurrent requests, the system did not scale gracefully.
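The general pattern the article explores, raising per-worker concurrency while keeping a hard cap so a single worker cannot be overwhelmed, looks roughly like the following sketch (written in Python for brevity rather than Node; the names and limit are illustrative, not from the article).

```python
# Generic sketch of capped per-worker concurrency (illustrative, not the article's code).
import asyncio

MAX_CONCURRENT_REQUESTS = 30          # hypothetical per-worker limit
semaphore = asyncio.Semaphore(MAX_CONCURRENT_REQUESTS)

async def handle_request(request_id: int) -> str:
    async with semaphore:             # excess requests queue here instead of piling up
        await asyncio.sleep(0.1)      # stand-in for I/O-bound integration work
        return f"done {request_id}"

async def main():
    results = await asyncio.gather(*(handle_request(i) for i in range(100)))
    print(len(results), "requests served")

asyncio.run(main())
```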
Read more
Serving 100µs reads with 100% availability
This is the story of how we built ctlstore, a distributed multi-tenant data store that features effectively infinite read scalability, serves queries in 100µs, and can withstand the failure of any component. Highly-reliable systems need highly-reliable data sources. Segment’s stream processing pipeline is no different.
Pipeline components need not only the data that they process, but also additional control data that specifies how that data is to be processed. End users configure settings in a UI or via our API, which in turn manipulates the behavior of the pipeline. In the initial design of Segment, the stream processing pipeline was tightly coupled to the control plane.
Read more
From 15,000 database connections to under 100: DigitalOcean’s tale of tech debt
A new hire recently asked me over lunch, “What does DigitalOcean’s tech debt look like?” I could not help but smile when I heard the question. Software engineers asking about a company’s tech debt is the equivalent of asking about a credit score.
As a cloud provider that manages our own servers and hardware, we have faced complications that many other startups have not encountered in this new era of cloud computing. These tough situations ultimately led to tradeoffs we had to make early in our existence. And as any quickly growing company knows, the technical decisions you make early on tend to catch up with you later.
Read more
Boeing’s Starliner won’t make it to the ISS now because its internal clock went wrong
Boeing’s CST-100 Starliner launched into space for the first time today, but the spacecraft failed to make it into a stable orbit that would allow it to rendezvous with the International Space Station. What happened: An Atlas V rocket safely carried Starliner into space from Cape Canaveral Air Force Station on Friday, but the capsule had an anomaly with its internal system timer. A faulty internal clock means Starliner won’t rendezvous with the ISS, a massive setback for NASA and Boeing.
Read more
Making the LinkedIn experimentation engine 20x faster
At LinkedIn, we like to say that experimentation is in our blood because no production release at the company happens without experimentation; by “experimentation,” we typically mean “A/B testing.” The company relies on employees to make decisions by analyzing data. Experimentation is a data-driven foundation of the decision-making process, which helps with measuring the precise impact of every change and release, and evaluating whether expectations meet reality.
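For context on what "measuring the precise impact" typically involves, here is a minimal, hypothetical sketch of a two-proportion z-test for an A/B experiment (not LinkedIn's engine; the numbers are made up).

```python
# Sketch of A/B impact measurement via a two-proportion z-test (illustrative data).
from math import sqrt
from scipy.stats import norm

control_conversions, control_n = 1_150, 50_000
treatment_conversions, treatment_n = 1_260, 50_000

p_c = control_conversions / control_n
p_t = treatment_conversions / treatment_n
p_pool = (control_conversions + treatment_conversions) / (control_n + treatment_n)

se = sqrt(p_pool * (1 - p_pool) * (1 / control_n + 1 / treatment_n))
z = (p_t - p_c) / se
p_value = 2 * norm.sf(abs(z))         # two-sided test

print(f"lift: {(p_t - p_c) / p_c:.2%}, z = {z:.2f}, p = {p_value:.4f}")
```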
Read more
2020 Cloud Report
Read the 2020 Cloud Report from Cockroach Labs, and learn which cloud platform performs best for transactional workloads across TPC-C, Network Throughput, CPU, and Storage benchmarks. If there’s one thing we’ve learned in our three years of benchmarking cloud providers on transactional workloads, it’s this: the results change often. Last year’s report showed AWS dramatically outperforming GCP across TPC-C performance, CPU, Network, and even cost.
The 2020 Cloud Report shows that GCP has caught up to AWS’s performance and offers the best price per performance for transactional workloads, and that new-to-the-report Azure is broadly competitive with both GCP and AWS. In the 2020 Cloud Report, we’ve expanded our research. TL;DR? Conventional wisdom is that hardware performance is plateauing.
Read more