Overview: How to Use Off-CPU Flame Graphs to Investigate Linux Performance Problems

Linux阅码场 | Source: unknown | 2018-12-23 13:47

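This article tells the story of using an off-cpu flame graph to analyze a program's latency (spent mostly on acquiring a lock), finding the bottleneck, and eliminating it. It is well worth reading. Linux阅码场 did not have enough time to translate it into Chinese, so readers are encouraged to read the English original directly.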

The Setup

As a performance engineer at MemSQL, one of my primary responsibilities is to ensure that customer Proof of Concepts (POCs) run smoothly. I was recently asked to assist with a big POC, where I was surprised to encounter an uncommon Linux performance issue. I was running a synthetic workload of 16 threads (one for each CPU core). Each one simultaneously executed a very simple query (select count(*) from t where i > 5) against a columnstore table.

In theory, this ought to be a CPU bound operation since it would be reading from a file that was already in disk buffer cache. In practice, our cores were spending about 50% of their time idle.

In this post, I’ll walk through some of the debugging techniques and reveal exactly how we reached resolution.

What were our threads doing?

After confirming that our workload was indeed using 16 threads, I looked at the state of our various threads. In every refresh of my htop window, I saw that a handful of threads were in the D state corresponding to “Uninterruptible sleep”.
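The same snapshot can be taken without htop by reading the per-thread state from /proc. The following is a minimal sketch, not what the author ran: the function names parse_state and thread_states are made up for illustration, and it assumes the standard Linux /proc/<pid>/task/<tid>/stat layout, where the one-letter state follows the parenthesized command name.

```python
import os
from collections import Counter


def parse_state(stat_line: str) -> str:
    """Return the one-letter state from a /proc/<pid>/task/<tid>/stat line."""
    # The comm field is wrapped in parentheses and may itself contain spaces
    # or ')', so split after the LAST ')' instead of splitting on whitespace.
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def thread_states(pid: int) -> Counter:
    """Count the threads of `pid` by state letter (R, S, D, ...)."""
    counts = Counter()
    task_dir = f"/proc/{pid}/task"
    for tid in os.listdir(task_dir):
        try:
            with open(f"{task_dir}/{tid}/stat", errors="replace") as f:
                counts[parse_state(f.read())] += 1
        except (FileNotFoundError, ProcessLookupError):
            # The thread exited between listdir() and open(); skip it.
            continue
    return counts
```

Calling thread_states(pid) a few times in a row on the server process gives a rough picture of how many threads sit in D at any instant. A count that stays well above zero on a supposedly CPU bound workload is the same signal seen in htop.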

Why were we going off CPU?

At this point, I generated an off-cpu flamegraph using Linux perf_events to see why we entered this state. Off-CPU means that instead of looking at what is keeping the CPU busy, you look at what is preventing it from being busy by things happening elsewhere (e.g. waiting for IO or a lock). The normal way to generate these visualizations is to use perf inject -s, but the machine I tested on did not have a new enough version of perf. Instead I had to use an awk script I had previously written:

$ sudo perf record --call-graph=fp -e 'sched:sched_switch' -e 'sched:sched_stat_sleep' -e 'sched:sched_stat_blocked' --pid $(pgrep memsqld | head -n 1) -- sleep 1

[ perf record: Woken up 1 times to write data ]

[ perf record: Captured and wrote 1.343 MB perf.data (~58684 samples) ]

$ sudo perf script -f time,comm,pid,tid,event,ip,sym,dso,trace -i perf.data | ~/FlameGraph/stackcollapse-perf-sched.awk | ~/FlameGraph/flamegraph.pl --color=io --countname=us > off-cpu.svg

Note: recording scheduler events via perf record can have a very large overhead and should be used cautiously in production environments. This is why I wrap the perf record around a sleep 1 to limit the duration.

In an off-cpu flamegraph, the width of a bar is proportional to the total time spent off cpu. Here we see a lot of time is spent in rwsem_down_write_failed.
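To make the width rule concrete, here is a small sketch of the aggregation step that sits between the raw scheduler events and flamegraph.pl. It is an illustration in Python, not the author's awk script. The function name fold and the sample stacks are hypothetical; only the two rwsem function names come from the flame graph described here. Each unique stack is summed into one line of the "frame;frame;frame value" folded format that flamegraph.pl reads, and the value decides how wide the bar is drawn.

```python
from collections import defaultdict


def fold(samples):
    """Sum off-CPU time per unique stack.

    `samples` is an iterable of (frames, microseconds) pairs, where `frames`
    lists function names from the root of the stack down to the blocking leaf.
    Returns one "stack total" line per unique stack, sorted by stack.
    """
    totals = defaultdict(int)
    for frames, usec in samples:
        totals[";".join(frames)] += usec
    return [f"{stack} {total}" for stack, total in sorted(totals.items())]


# Hypothetical samples: two blocked mmap calls and one blocked page fault.
samples = [
    (["memsqld", "sys_mmap", "rwsem_down_write_failed"], 12000),
    (["memsqld", "page_fault", "rwsem_down_read_failed"], 3000),
    (["memsqld", "sys_mmap", "rwsem_down_write_failed"], 8000),
]
folded = fold(samples)
```

With these made-up numbers the write-lock stack accumulates 20000 us against 3000 us for the read-lock stack, so its bar would be drawn roughly seven times wider.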

From the repeated calls to rwsem_down_read_failed and rwsem_down_write_failed, we see that the culprit was mmap contending in the kernel on the mm->mmap_sem lock:

down_write(&mm->mmap_sem);

ret = do_mmap_pgoff(file, addr, len, prot, flag, pgoff, &populate);

up_write(&mm->mmap_sem);

This was causing every mmap syscall to take 10-20ms (almost half the latency of the query itself). MemSQL was so fast that we had inadvertently written a benchmark for Linux mmap!

The fix was simple: we switched from using mmap to using the traditional file read interface. After this change, we nearly doubled our throughput and became CPU bound as we expected.
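The shape of that change can be sketched in a few lines. This is an illustration only, not MemSQL's code: the function names, the byte-per-value file layout, and the chunk size are assumptions. Both functions compute the same count over a file. The first maps the file for every call, and each mmap takes mm->mmap_sem for writing as in the kernel snippet quoted earlier. The second uses plain read calls into a buffer and creates no new mapping.

```python
import mmap


def count_gt_mmap(path: str, threshold: int) -> int:
    """Count bytes greater than `threshold` by mapping the file."""
    with open(path, "rb") as f:
        # A new mapping per call: this is the pattern that contended on the
        # address-space lock when many threads did it at once.
        # Note: mmap of an empty file raises ValueError.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            data = m[:]  # copy out as bytes so iteration yields ints
    return sum(b > threshold for b in data)


def count_gt_read(path: str, threshold: int, chunk_size: int = 1 << 20) -> int:
    """Count bytes greater than `threshold` using ordinary read() calls."""
    total = 0
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object.
        while chunk := f.read(chunk_size):
            total += sum(b > threshold for b in chunk)
    return total
```

The two functions return the same result for any non-empty file, so the swap is invisible to the query logic. The difference is only in which kernel path serves the data.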

For more information and discussion around Linux performance, check out the original post on my personal blog.

Download MemSQL Community Edition to run your own performance tests for free today: memsql.com/download

Alex Reece is a systems and performance engineer. He believes in active benchmarking, root cause analysis, and fast code.


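Original title: Investigating Linux Performance Problems with Off-CPU Flame Graphs

Source: WeChat official account Linux阅码场 (WeChat ID: LinuxDev). Please credit the source when reposting.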
