In particular, etcd experienced performance issues with a large number of concurrent read transactions even when there was no write (e.g. “read-only range request … took too long to execute”). Previously, the storage backend commit operation on pending writes blocked incoming read transactions, even when there was no pending write. Now, the commit does not block reads, which improves long-running read transaction performance.
We further made backend read transactions fully concurrent. Previously, ongoing long-running read transactions blocked writes and upcoming reads. With this change, write throughput is increased by 70% and P99 write latency is reduced by 90% in the presence of long-running reads.
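To make the difference concrete, here is a minimal, language-neutral sketch in Python. It is not etcd's Go implementation: the `Backend` class and its methods (`put`, `try_put`, `commit`, `blocking_range`, `concurrent_range`) are hypothetical names invented for illustration, and the toy copies the whole view where a real backend would be far more careful. It only shows the general idea that a reader which holds the lock for the entire scan blocks writers, while a reader that holds it just long enough to take a snapshot does not.

```python
import threading


class Backend:
    """Toy storage backend (hypothetical): committed data plus pending writes."""

    def __init__(self):
        self._lock = threading.Lock()  # guards both dicts below
        self._committed = {}
        self._pending = {}

    def put(self, key, value):
        """Buffer a write, waiting for the lock if necessary."""
        with self._lock:
            self._pending[key] = value

    def try_put(self, key, value):
        """Buffer a write without waiting; return False if the lock is held."""
        if not self._lock.acquire(blocking=False):
            return False
        try:
            self._pending[key] = value
            return True
        finally:
            self._lock.release()

    def commit(self):
        """Fold pending writes into the committed state."""
        with self._lock:
            # Build a new dict instead of mutating the old one in place.
            self._committed = {**self._committed, **self._pending}
            self._pending = {}

    def blocking_range(self, visit):
        """Blocking style: the lock is held for the whole scan."""
        with self._lock:
            view = {**self._committed, **self._pending}
            for key in sorted(view):
                visit(key, view[key])

    def concurrent_range(self, visit):
        """Concurrent style: the lock is held only while taking a snapshot."""
        with self._lock:
            view = {**self._committed, **self._pending}
        # The scan runs on the private snapshot, so writers are not blocked.
        for key in sorted(view):
            visit(key, view[key])


if __name__ == "__main__":
    backend = Backend()
    backend.put("a", 1)
    backend.commit()

    outcomes = []
    # A write attempted in the middle of each kind of read.
    backend.blocking_range(lambda k, v: outcomes.append(backend.try_put("b", 2)))
    backend.concurrent_range(lambda k, v: outcomes.append(backend.try_put("b", 2)))
    print(outcomes)  # prints [False, True]
```

In the demo, the write attempted during the blocking scan is refused because the reader still holds the lock, while the same write succeeds during the concurrent scan. The trade-off in this sketch is that a concurrent reader works on a snapshot, so writes made after the snapshot was taken are not visible to it until the next read.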
We also ran the Kubernetes 5000-node scalability test on GCE with this change and observed similar improvements. For example, at the very beginning of the test, when there are many long-running “LIST pods” requests, the P99 latency of “POST clusterrolebindings” is reduced by 97.4%. This non-blocking read transaction is now also used for compaction, which, combined with the reduced compaction batch size, reduces the P99 server request latency during compaction.
Source: kubernetes.io