Shanghai, China
June 24–26, 2019

Simultaneous translation will be provided for all keynote and breakout sessions.

KC+CNC - Performance
Tuesday, June 25


Go FaaSter: Cold Start Optimization in a Serverless Platform - Scott Zhou & Yanbo Li, Tencent
There is an open secret in the serverless industry: most functions start off cold, taking many seconds the very first time they run, which hurts the latency of many business-critical applications.

Come to this talk to hear how we are dramatically improving the cold start performance of the Tencent Serverless Cloud Functions (SCF) platform.

We'll talk about our SCF architecture (including networking, infrastructure, and function deployments) and the factors that cause cold start latency.

We'll then cover the architectural changes we're making to improve cold starts, including improvements to elastic network interfaces, migration from containers to micro VMs, function code deployment, and resource reuse.

Finally, we'll cover how combining ML with autoscaling can avoid cold starts altogether.

Scott Zhou

Expert Engineer, Tencent
Scott Zhou is an Expert Engineer leading serverless computing at Tencent Cloud. He is one of the pioneers of Tencent Cloud; in the past he has worked on VM migration for resource utilization, VM scheduling, DnsPod, and the OpenAPI platform. In the distant past, he worked on message...

Yanbo Li

Senior Engineer, Tencent
Yanbo Li is a Senior Software Engineer on the Tencent Cloud Middleware team. He has a strong interest in containers and networking. Previously, he worked at Huawei on LTE network protocol stack development.

Tuesday June 25, 2019 11:00 - 11:35


How Should You Effectively Use etcd Metrics - Wenjia Zhang & Jingyi Hu, Google
All production systems need monitoring to detect problems in advance and to troubleshoot with the right information. etcd is no exception. How can you effectively use the ~100 etcd metrics, and how should you interpret their values under different usage patterns?

First of all, one must monitor whether a leader exists; without one, the system becomes unavailable. Furthermore, frequent leadership changes can impair the performance of consensus systems. Leader-related metrics are therefore critical. Some other etcd metrics also need special attention. Disk I/O and network I/O metrics hint at physical constraints. Latency and throughput metrics are meaningful only when cross-referenced with hardware configurations. We will walk you through the etcd benchmarking tool, explain the important etcd metrics, and finally help you understand how to apply etcd metrics through some case studies.

Wenjia Zhang

Software Engineer, Google
Wenjia Zhang is a Software Engineer on the GKE team at Google. She is an active contributor to both the Kubernetes and etcd open source projects.

Jingyi Hu

Software Engineer, Google
Jingyi Hu is a Software Engineer for Google Cloud. He is a maintainer of etcd and an active contributor to Kubernetes.

Tuesday June 25, 2019 11:45 - 12:20


Benchmark Your Cloud Native Database - Josh Berkus, Red Hat
You can run your stateful apps on Kubernetes. You can even run your databases on Kubernetes. But what are you giving up in performance? Is it worth it, or should you stick to the hosting you know?

For the past several months, we've been benchmarking various forms of Kubernetes storage, including host storage, network storage, cloud storage, and cloud-native storage systems like Rook. Let us share with you the results of running PostgreSQL, CockroachDB, and filesystem benchmarks so that you can make the best possible tradeoffs. We'll even show you how to run your own benchmarks to test your own platform.

You will leave this talk with a much better idea of the quantitative tradeoffs between performance, reliability, data retention, and manageability.

Josh Berkus

Community Lead, Red Hat
Josh Berkus is Red Hat's Kubernetes Community Manager, which is the reason he spends so much time working in SIG-Release and SIG-Contributor Experience. He's also a long-time database geek, and has done benchmarks for the TPC and SPEC. His real passion in the cloud native world is...

Tuesday June 25, 2019 13:35 - 14:10


Understanding Scalability and Performance in the Kubernetes Master - Xingyu Chen & Fansong Zeng, Alibaba
Currently, the scale limit of Kubernetes is 5k nodes, so if you want to use it to manage a web-scale cluster of, say, 10k nodes, it probably won't work.

Have you wondered what the performance bottleneck is when Kubernetes manages more than 5k nodes? When you want to take its scalability to a new level, who's to "blame" first: etcd, the apiserver, or the scheduler?

Answering these questions is the key to operating a large Kubernetes cluster. At Alibaba, we encountered many issues, such as pod creation becoming extremely slow as the cluster grows larger and larger. In this talk, we would like to share how we ran various benchmark tests and profiling, and how we tuned the master to achieve a more than 100x performance improvement. Currently, operating a 10k-node Kubernetes cluster is just as smooth as operating a 2k-node one.

Fansong Zeng

Staff Engineer, Alibaba
Zeng is a tech lead on the scheduling team in Alibaba's scheduling systems department. He has extensive experience with cluster resource management systems, especially running mixed workloads in a cluster.

Xingyu Chen

Software Engineer, Alibaba
Xingyu Chen works at Alibaba Cloud on the infrastructure team, which is responsible for managing Alibaba's very large computing resources. He has contributed to Kubernetes since its beginning. His main interest is in the performance and scalability of Kubern...

Tuesday June 25, 2019 14:20 - 14:55


Istio Performance and Best Practices in Large Scale Kubernetes Cluster - Guang Ya Liu & Chun Lin Yang, IBM
As many industry cloud solutions and frameworks have adopted Istio since its GA in 2018, it is important to understand its performance in a large-scale Kubernetes cluster (2000+ nodes). In this session, we will share our test results and observations for Istio 1.1 on a 2000-node Kubernetes cluster, based on the requirements of a large bank in China. We will also discuss best practices and tuning guidelines for using the Istio service mesh effectively to obtain the best performance and scalability.

Guang Ya Liu

Senior Technical Staff Member, IBM
Guang Ya Liu is a Senior Technical Staff Member (STSM) for IBM Cloud Private and is now focusing on cloud computing, container technology, and distributed computing. He is also a member of the IBM Academy of Technology. He was an OpenStack Magnum Core member from 2015 to 2017...

Chun Lin Yang

Senior Software Architect, IBM
Chunlin Yang is a Senior Software Architect at IBM. He joined the Istio project after 10 years of experience in the HPC, UX, and frontend areas. He is the Istio squad leader in IBM Private Cloud and a member of the Istio open source community.

Tuesday June 25, 2019 15:05 - 15:40