This post provides the background on how Trip.com uses Cilium and what led the team to standardize on Cilium as their networking and network security platform for the years to come. It is a summary, with some commentary, of the original Trip.com blog post, which provides extensive detail on the decision-making process and the team's experience running Cilium in production.
We are excited to announce the Cilium 1.6 release. A total of 1408 commits have been contributed by the community, with many developers contributing for the first time. Cilium 1.6 introduces several exciting new features:
KVStore free operation: The addition of a new CRD-based backend for security identities now allows Cilium to operate entirely without a KVStore in the context of Kubernetes. (More details)
Socket-based load-balancing: Socket-based load-balancing combines the advantage of client-side and network-based load-balancing by providing fully transparent load-balancing using Kubernetes services with the translation from service IP to endpoint IP done once during connection establishment instead of translating each network packet for the lifetime of a connection.
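The difference between translating once per connection and translating every packet can be illustrated with a small simulation. This is a conceptual sketch only, not how Cilium implements the feature: the service table, the class names `PerPacketLB` and `SocketLB`, and the addresses are all hypothetical, and the real translation happens in the kernel rather than in Python.

```python
import random

# Hypothetical service table: service (IP, port) -> list of backend endpoints.
SERVICES = {
    ("10.96.0.10", 80): [("10.0.1.5", 8080), ("10.0.2.7", 8080)],
}


def select_backend(dst, rng):
    """Return a backend for a service address, or dst unchanged if it is not a service."""
    backends = SERVICES.get(dst)
    return rng.choice(backends) if backends else dst


class PerPacketLB:
    """Network-based model: the destination of every packet is rewritten."""

    def __init__(self, rng):
        self.rng = rng
        self.translations = 0
        self.conntrack = {}  # connection id -> chosen backend

    def send(self, conn_id, dst):
        # The backend is chosen once per connection, but the address
        # rewrite itself still happens for each packet.
        if conn_id not in self.conntrack:
            self.conntrack[conn_id] = select_backend(dst, self.rng)
        self.translations += 1
        return self.conntrack[conn_id]


class SocketLB:
    """Socket-based model: the destination is rewritten once, at connect time."""

    def __init__(self, rng):
        self.rng = rng
        self.translations = 0

    def connect(self, dst):
        self.translations += 1
        return select_backend(dst, self.rng)


def simulate(packets=1000, seed=0):
    """Count address translations for one connection under both models."""
    rng = random.Random(seed)
    service = ("10.96.0.10", 80)

    per_packet = PerPacketLB(rng)
    for _ in range(packets):
        per_packet.send("conn-1", service)

    socket_lb = SocketLB(rng)
    backend = socket_lb.connect(service)
    # After connect(), packets are addressed to `backend` directly,
    # so no further translation is counted.
    return per_packet.translations, socket_lb.translations, backend


if __name__ == "__main__":
    print(simulate())
```

In this model the per-packet counter grows with the number of packets sent, while the socket-based counter stays at one per connection, which is the property the release notes describe.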
The survey was announced on our Slack channel and on Twitter. Participation was anonymous and did not require respondents to leave contact information. Most questions had a set of predefined answers plus a field for additional answers.
All questions were optional, so some users did not answer every question.
We are excited to announce the Cilium 1.5 release. Cilium 1.5 is the first release where we primarily focused on scalability with respect to the number of nodes, pods and services.
Our goal was to scale to 5k nodes, 20k pods and 10k services. We went well past that goal with the 1.5 release and now officially support 5k nodes, 100k pods and 20k services. Along the way we learned a lot, some of it expected and some of it not. This blog post dives into what we learned and how we improved.
Let’s review some of the use cases of connecting multiple Kubernetes clusters before we dive into the implementation details. High availability is the most obvious use case for most. This use case includes operating Kubernetes clusters in multiple regions or availability zones and running replicas of the same services in each cluster.
Upon failure, requests can fail over to other clusters. The failure scenario covered in this use case is not primarily the complete unavailability of the entire region or failure domain.
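The failover idea can be sketched as a backend selection function. This is a generic illustration under stated assumptions, not Cilium's actual behavior: the function name `pick_backends`, the cluster names, the addresses, and the "prefer local, then fail over" policy are all hypothetical.

```python
def pick_backends(clusters, local):
    """Return healthy backends, preferring the local cluster.

    `clusters` maps a cluster name to a dict of backend address -> healthy flag.
    If the local cluster has no healthy backend, healthy backends from the
    other clusters are returned instead (hypothetical failover policy).
    """
    local_healthy = [addr for addr, ok in clusters.get(local, {}).items() if ok]
    if local_healthy:
        return local_healthy

    remote_healthy = []
    for name in sorted(clusters):
        if name != local:
            remote_healthy.extend(addr for addr, ok in clusters[name].items() if ok)
    return remote_healthy


if __name__ == "__main__":
    clusters = {
        "us-east": {"10.1.0.5": False, "10.1.0.6": False},  # service replicas down
        "us-west": {"10.2.0.5": True},
    }
    print(pick_backends(clusters, local="us-east"))
```

Note that the example models the failure of a service's replicas inside one cluster rather than the loss of a whole region, matching the scenario the paragraph emphasizes.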
We are excited to announce the Cilium 1.4 release. The release introduces several new features as well as optimization and scalability work. The highlights include the addition of global services to provide Kubernetes service routing across multiple clusters, DNS request/response aware authorization and visibility, transparent encryption (beta), IPVLAN support for better performance and latency (beta), integration with Flannel, GKE on COS support, AWS metadata based policy enforcement (alpha), and significant work on optimizing memory and CPU usage.