Operating Apache Kafka Clusters 24/7 Without A Global Ops Team

Earlier this year, the Streaming PubSub team at Lyft got multiple Apache Kafka clusters ready to take on load that required 24/7 support. The team’s operational burden for Kafka quickly headed toward burnout territory. On-call rotations became miserable because failing hosts would wake us up at night. Business requirements kept coming […]

Making long-term forecasts at Lyft

At Lyft, like many other companies, we need to make accurate short- and long-term forecasts. Some of the metrics that we need to predict accurately are the number of driver hours provided in different regions (i.e., the supply side of our business) and the number of rides taken by riders in different […]