
Operating Apache Kafka Clusters 24/7 Without A Global Ops Team
Earlier this year, the Streaming PubSub team at Lyft got multiple Apache Kafka clusters ready to take on load that required 24/7 support. The team’s operational burden for Kafka quickly started heading toward burnout territory. On-call rotations became miserable because failing hosts kept waking us up at night. Business requirements kept coming […]