news
Anomalies, often referred to as outliers, are data points or patterns in data that do not conform to a notion of normal behavior. Anomaly detection, then, is the task of finding those patterns in data that do not adhere to expected norms. The capability to recognize or detect anomalous behavior can provide highly useful insights across industries.
Flagging or enacting a planned response when these unusual cases occur can save businesses time, money, and customers.
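To make the idea of "not conforming to normal behavior" concrete, here is a minimal sketch of one simple approach: flagging points whose modified z-score, based on the median absolute deviation, is unusually large. The teaser does not name a method, so the function name `mad_outliers`, the sample readings, and the threshold of 3.5 are all assumptions chosen for illustration.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds `threshold`.

    Illustrative only: `threshold=3.5` is a commonly used default,
    not something taken from the article.
    """
    med = median(values)
    abs_dev = [abs(v - med) for v in values]
    mad = median(abs_dev)
    if mad == 0:
        # At least half of the values equal the median, so the score is
        # undefined; fall back to flagging anything that differs from it.
        return [i for i, d in enumerate(abs_dev) if d > 0]
    # 0.6745 rescales the MAD so the score is comparable to a standard
    # z-score when the data are roughly normal.
    return [i for i, d in enumerate(abs_dev) if 0.6745 * d / mad > threshold]


# Hypothetical sensor readings with one obvious anomaly at index 5.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 42.0, 10.2]
print(mad_outliers(readings))  # prints [5]
```

A median-based score is used here rather than a mean-based one because a single extreme value cannot shift the median much, so the anomaly does not mask itself.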
Read more
At Uber, we use robust data processing systems such as Apache Flink and Apache Spark to power the streaming applications that help us calculate up-to-date pricing, enhance driver dispatching, and fight fraud on our platform. Such solutions can process data at a massive scale in real time with exactly-once semantics, and the emergence of these systems over the past several years has unlocked an industry-wide ability to write streaming data processing applications at low latencies, a functionality previously impossible to achieve at scale.
Read more
Back when Jeff Bezos filled orders in his garage and drove packages to the post office himself, crunching the numbers on costs, tracking inventory, and forecasting future demand was relatively simple. Fast-forward 25 years, and Amazon’s retail business has more than 175 fulfillment centers (FCs) worldwide with over 250,000 full-time associates shipping millions of items per day. Amazon’s worldwide financial operations team has the incredible task of tracking all of that data (think petabytes).
Read more
Synced spotlights 10 artificial intelligence papers that garnered extraordinary attention and accolades in 2019. The volume of peer-reviewed AI research papers has grown by more than 300 percent over the past three decades (Stanford AI Index 2019), and the top AI conferences in 2019 saw a deluge of papers. CVPR submissions spiked to 5,165, a 56 percent increase over 2018; ICLR received 1,591 main conference paper submissions, up 60 percent over last year; ACL reported a record-breaking 2,906 submissions, almost doubling last year’s 1,544; and ICCV 2019 received 4,303 submissions, more than twice the 2017 total.
Read more
What’s the best way to safely increase parallelism in a production Node service? That’s a question my team needed to answer a couple of months ago. We were running 4,000 Node containers (or ‘workers’) for our bank integration service.
The service was originally designed such that each worker would process only a single request at a time. This design lessened the impact of integrations that accidentally blocked the event loop, and allowed us to ignore the variability in resource usage across different integrations.
Read more
This is the story of how we built ctlstore, a distributed multi-tenant data store that features effectively infinite read scalability, serves queries in 100µs, and can withstand the failure of any component. Highly reliable systems need highly reliable data sources. Segment’s stream processing pipeline is no different.
Pipeline components need not only the data that they process, but also additional control data that specifies how the data is to be processed. End users configure some settings in a UI or via our API, which in turn manipulates the behavior of the pipeline.
Read more
A new hire recently asked me over lunch, “What does DigitalOcean’s tech debt look like?” I could not help but smile when I heard the question. Software engineers asking about a company’s tech debt is the equivalent of asking about a credit score.
It’s their way of ... As a cloud provider that manages our own servers and hardware, we have faced complications that many other startups have not encountered in this new era of cloud computing.
Read more
Boeing’s CST-100 Starliner launched into space for the first time today, but the spacecraft failed to make it into a stable orbit that would allow it to rendezvous with the International Space Station. What happened: An Atlas V rocket safely carried Starliner into space from Cape Canaveral Air Force Station on Friday, but the capsule had an anomaly with its internal system timer. A faulty internal clock means Starliner won’t rendezvous with the ISS, a massive setback for NASA and Boeing.
Read more
At LinkedIn, we like to say that experimentation is in our blood because no production release at the company happens without experimentation; by “experimentation,” we typically mean “A/B testing.” The company relies on employees to make decisions by analyzing data. Experimentation is a data-driven foundation of the decision-making process, which helps with measuring the precise impact of every change and release, and evaluating whether expectations meet reality.
LinkedIn’s experimentation platform operates at an extremely large scale: it serves up to 800,000 QPS of network calls; it serves about 35,000 concurrently running A/B experiments; it handles up to 23 trillion experiment evaluations per day; the average latency of experiment evaluation is 700 ns, and the 99th percentile is 3 μs; and it is used in about 500 production services.
Read more
Read the 2020 Cloud Report from Cockroach Labs, and learn which cloud platform performs best for transactional workloads across TPC-C, Network Throughput, CPU, and Storage benchmarks. If there’s one thing we’ve learned in our three years of benchmarking cloud providers on transactional workloads, it’s this: the results change often. Last year’s report showed AWS dramatically outperforming GCP across TPC-C performance, CPU, Network, and even cost.
The 2020 Cloud Report shows that GCP has caught up to AWS’s performance and offers the best price per performance for transactional workloads, and that new-to-the-report Azure is broadly competitive with both GCP and AWS.
Read more