CPU Utilization is Wrong


Brendan Gregg, a renowned performance expert currently with Netflix, presented at UpSCALE what he thinks is wrong with the way we monitor CPU utilization, and why the problem is getting worse as CPUs get faster.

Brendan explains that CPU utilization is generally understood as the share of time the CPU spends working rather than being idle, as measured by the kernel. In fact, the picture is dramatically different: the kernel can’t distinguish between the CPU executing instructions and the CPU waiting (“being stalled”) for external resources. Both count as busy time.
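To make the kernel’s view concrete, here is a minimal sketch of how a monitoring tool might derive utilization from two samples of the aggregate `cpu` line in Linux’s `/proc/stat`. The sample lines and the function names are made up for illustration, and counting `iowait` as idle time is an assumption (tools differ on this). Note that none of the counters says whether a busy tick was spent executing or stalled.

```python
# Field order of the aggregate "cpu" line in /proc/stat:
# user nice system idle iowait irq softirq steal guest guest_nice
IDLE, IOWAIT = 3, 4


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line into a list of tick counters."""
    fields = line.split()
    if fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in fields[1:]]


def utilization(prev, curr):
    """Fraction of non-idle ticks between two samples (0.0 to 1.0)."""
    deltas = [c - p for p, c in zip(prev, curr)]
    # Only the first eight fields are summed: guest time is already
    # accounted for in user and nice.
    total = sum(deltas[:8])
    if total == 0:
        return 0.0
    # Assumption: iowait is treated as idle time here.
    idle = deltas[IDLE] + deltas[IOWAIT]
    return (total - idle) / total


# Hypothetical samples taken some interval apart.
before = parse_cpu_line("cpu 1000 0 500 8000 500 0 0 0 0 0")
after = parse_cpu_line("cpu 1800 0 600 8100 500 0 0 0 0 0")
print(f"{utilization(before, after):.0%}")  # prints 90%
```

The result is simply “not idle” divided by “total”, which is why a stalled CPU still shows up as utilized.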

At Netflix he observed that, in general, a CPU reported as 90% utilized may actually spend 70% of its time stalled waiting for external resources and only 20% executing instructions. He illustrated his findings with a MySQL example using various low-level CPU tools, and arrived at a devastating realization about why recent Linux kernel changes had such a negative impact on overall performance.
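The arithmetic behind those figures can be sketched in a few lines. This is only an illustration of the numbers quoted above; the function name and the returned keys are hypothetical, and the stalled percentage has to come from a separate low-level measurement, because the kernel’s utilization figure does not contain it.

```python
def busy_breakdown(utilized_pct, stalled_pct):
    """Split a reported utilization figure into executing vs. stalled time.

    Both arguments are percentages of total wall-clock CPU time.
    """
    if not 0 < utilized_pct <= 100:
        raise ValueError("utilized_pct must be in (0, 100]")
    if not 0 <= stalled_pct <= utilized_pct:
        raise ValueError("stalled_pct must be between 0 and utilized_pct")
    executing_pct = utilized_pct - stalled_pct
    return {
        "executing_pct": executing_pct,
        # Shares of the *busy* time only, not of total time.
        "executing_share_of_busy": executing_pct / utilized_pct,
        "stalled_share_of_busy": stalled_pct / utilized_pct,
    }


result = busy_breakdown(utilized_pct=90, stalled_pct=70)
print(result["executing_pct"])                     # prints 20
print(f"{result['stalled_share_of_busy']:.1%}")    # prints 77.8%
print(f"{result['executing_share_of_busy']:.1%}")  # prints 22.2%
```

In other words, under these figures fewer than a quarter of the “busy” cycles do actual work.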

Brendan’s talk is just 5 minutes long but full of insight. His blog post “KPTI/KAISER Meltdown Initial Performance Regressions” goes into more detail.