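Windows Server 2012 R2: a higher RTC interrupt rate drives KVM host CPU usage to 100%
1. No preamble: the problem is exactly what the title says.
2. Collect the Linux kernel KVM trace event log
Use the script from: https://www.cnblogs.com/maojun1998/p/14982065.html
3. Analyze and tally the log:
Two entries stand out: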
kvm_pio: pio_write at 0x70 size 1 count 1 val 0xc
kvm_pio: pio_read at 0x71 size 1 count 1 val 0xc0
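These two entries are by far the most frequent. Command used to count them: cat trace.txt_607 | grep 0x71 | nl | tail -1
In 20 s there were 128428 matches (about 128k), roughly 6,400 per second, which is an alarmingly high rate. The pattern is always the same:
write 0x0c to port 0x70, then read the value back from port 0x71.
According to https://wiki.osdev.org/RTC, 0x70 and 0x71 are the motherboard RTC (CMOS) ports, and register 0xc is the one that is read to acknowledge an RTC interrupt so that the next one can fire.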
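The counting step can also be sketched in Python. This is only an illustration: the regular expression is modeled on the two kvm_pio lines quoted above, and the helper names count_pio and rate_per_second are made up for this example, not part of any tool used in the post.

```python
import re
from collections import Counter

# Pattern modeled on the kvm_pio lines quoted above, e.g.
# "kvm_pio: pio_read at 0x71 size 1 count 1 val 0xc0"
PIO_RE = re.compile(
    r"kvm_pio: (pio_read|pio_write) at (0x[0-9a-fA-F]+) "
    r"size \d+ count \d+ val (0x[0-9a-fA-F]+)"
)


def count_pio(lines):
    """Count kvm_pio events per (direction, port) pair."""
    counts = Counter()
    for line in lines:
        m = PIO_RE.search(line)
        if m:
            counts[(m.group(1), int(m.group(2), 16))] += 1
    return counts


def rate_per_second(count, seconds):
    """Average number of events per second over the capture window."""
    return count / seconds


# Hypothetical two-line sample; a real run would iterate over the trace file.
sample = [
    "kvm_pio: pio_write at 0x70 size 1 count 1 val 0xc",
    "kvm_pio: pio_read at 0x71 size 1 count 1 val 0xc0",
]
counts = count_pio(sample)
print(counts[("pio_write", 0x70)], counts[("pio_read", 0x71)])

# Figures from the post: 128428 matches in a 20 s capture.
print(rate_per_second(128428, 20))
```

For the real log, passing open("trace.txt_607") to count_pio gives the per-port totals in one pass, instead of one grep per port.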
0xfffff800d31ad347
0xfffff800d31ad34d
The two addresses are very close together (only 6 bytes apart).
dumpbin /DISASM hal.dll > hal.asm
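Searching hal.asm for the offsets 347 and 34d leads to the function CMOS_READ.
So CMOS_READ is the function being called this often. Attach a kernel debugger and set a breakpoint on it: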
 # Child-SP          RetAddr           Call Site
00 fffff803`4a91ce88 fffff803`4999d037 hal!CMOS_READ
01 fffff803`4a91ce90 fffff803`4999cfb2 hal!HalpGetSetCmosData+0x5b
02 fffff803`4a91ced0 fffff803`499c4203 hal!HalpGetCmosData+0xe
03 fffff803`4a91cf10 fffff803`49996b33 hal!HalpRtcAcknowledgeInterrupt+0x1b
04 fffff803`4a91cf40 fffff803`492c6943 hal!HalpTimerClockInterrupt+0x33
05 fffff803`4a91cf70 fffff803`493539ca nt!KiCallInterruptServiceRoutine+0xa3
06 fffff803`4a91cfb0 fffff803`49353e77 nt!KiInterruptSubDispatchNoLockNoEtw+0xea
07 fffff803`4a90e840 fffff803`499a981f nt!KiInterruptDispatchNoLockNoEtw+0x37
08 fffff803`4a90e9d8 fffff803`494090ad hal!HalProcessorIdle+0xf
09 fffff803`4a90e9e0 fffff803`49307a50 nt!PpmIdleDefaultExecute+0x1d
0a fffff803`4a90ea10 fffff803`4927c186 nt!PpmIdleExecuteTransition+0x400
0b fffff803`4a90eb10 fffff803`493561ac nt!PoIdle+0x2f6
0c fffff803`4a90ec60 00000000`00000000 nt!KiIdleLoop+0x2c
Reference: https://bugzilla.redhat.com/show_bug.cgi?id=1610461
hal!HalpRtcSetDivisor is used to set the interrupt frequency, so set a breakpoint on it:
kd> k
 # Child-SP          RetAddr           Call Site
00 ffffd000`205415d8 fffff801`22d9806b hal!CMOS_WRITE
01 ffffd000`205415e0 fffff801`22d97fce hal!HalpGetSetCmosData+0x8f
02 ffffd000`20541620 fffff801`22dbf3d8 hal!HalpSetCmosData+0xe
03 ffffd000`20541660 fffff801`22da5f4c hal!HalpRtcSetDivisor+0x48
04 ffffd000`20541690 fffff801`22d8f28d hal!HalpSetTimer+0x16c68
05 ffffd000`205416f0 fffff801`2270effa hal!HalpTimerClockArm+0x5d
06 ffffd000`20541730 fffff801`2270efba nt!KiSetClockTickRate+0x3a
07 ffffd000`20541770 fffff801`2270eeb2 nt!KiSetClockIntervalToMinimumRequested+0x2e
08 ffffd000`205417a0 fffff801`226b46dd nt!ExpUpdateTimerConfigurationWorker+0x3a
09 ffffd000`205417d0 fffff801`22ab28cf nt!KeGenericProcessorCallback+0xf1
0a ffffd000`20541940 fffff801`22ab2733 nt!ExpUpdateTimerConfiguration+0xa7
0b ffffd000`20541a70 fffff801`22ab2554 nt!ExpUpdateTimerResolution+0x57
0c ffffd000`20541aa0 fffff801`2275d3e3 nt!NtSetTimerResolution+0xf8
0d ffffd000`20541b00 00007ff9`01281eba nt!KiSystemServiceCopyEnd+0x13
0e 000000e3`21e8f478 00007ff9`00a8e910 ntdll!NtSetTimerResolution+0xa
0f 000000e3`21e8f480 00007ff8`ead28659 KERNEL32!timeBeginPeriod+0xc0
10 000000e3`21e8f4b0 00007ff8`ead2855f wpfgfx_v0400!CRenderTargetManager::EnableVBlankSync+0x50
11 000000e3`21e8f520 00007ff8`ead28531 wpfgfx_v0400!CComposition::Partition_SetVBlankSyncMode+0x17
12 000000e3`21e8f550 00007ff8`eacf2776 wpfgfx_v0400!CComposition::ProcessCommandBatch+0x1aa
13 000000e3`21e8f600 00007ff8`eacf2707 wpfgfx_v0400!CComposition::ProcessPartitionCommand+0x6a
14 000000e3`21e8f630 00007ff8`eacf2689 wpfgfx_v0400!CCrossThreadComposition::ProcessBatches+0x77
15 000000e3`21e8f660 00007ff8`eacf30a6 wpfgfx_v0400!CCrossThreadComposition::OnBeginComposition+0x3f
16 000000e3`21e8f690 00007ff8`eacf2ff8 wpfgfx_v0400!CComposition::ProcessComposition+0xb1
17 000000e3`21e8f720 00007ff8`eacf2f54 wpfgfx_v0400!CComposition::Compose+0x4e
18 000000e3`21e8f760 00007ff8`eacf2f0d wpfgfx_v0400!CPartitionThread::RenderPartition+0x34
19 000000e3`21e8f790 00007ff8`eacf57ab wpfgfx_v0400!CPartitionThread::Run+0x64
1a 000000e3`21e8f7c0 00007ff9`00a713f2 wpfgfx_v0400!CPartitionThread::ThreadMain+0x2b
1b 000000e3`21e8f7f0 00007ff9`012054f4 KERNEL32!BaseThreadInitThunk+0x22
1c 000000e3`21e8f820 00000000`00000000 ntdll!RtlUserThreadStart+0x34
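As background on the RTC divisor: the OSDev RTC page describes the periodic interrupt frequency as 32768 >> (rate - 1), where rate is the low 4 bits of RTC register A. The sketch below only evaluates that formula. It is an assumption that HalpRtcSetDivisor programs this same field, and the function name rtc_periodic_hz is made up for the example.

```python
RTC_BASE_HZ = 32768  # frequency of the 32.768 kHz RTC crystal


def rtc_periodic_hz(rate):
    """Periodic interrupt frequency for a given 4-bit rate value.

    Follows the formula on the OSDev RTC page, which restricts
    rate to the range 3..15.
    """
    if not 3 <= rate <= 15:
        raise ValueError("rate must be in 3..15")
    return RTC_BASE_HZ >> (rate - 1)


# A smaller rate value means a higher interrupt frequency.
for rate in (15, 10, 6, 3):
    print(rate, rtc_periodic_hz(rate))
```

Each step down in the rate value doubles the interrupt frequency, so a small change to the divisor can multiply the number of 0x70/0x71 accesses that the guest performs.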
Damn, timeBeginPeriod looks like the culprit.
https://docs.microsoft.com/en-us/windows/win32/api/timeapi/nf-timeapi-timebeginperiod
And indeed it is. The documentation says:
Setting a higher resolution can improve the accuracy of time-out intervals in wait functions.
However, it can also reduce overall system performance, because the thread scheduler switches tasks more often.
High resolutions can also prevent the CPU power management system from entering power-saving modes.
Setting a higher resolution does not improve the accuracy of the high-resolution performance counter.
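In short: raising the timer resolution costs system performance. Here the request comes from a WPF process (wpfgfx_v0400 in the stack above), and each additional RTC interrupt in the guest means more 0x70/0x71 port accesses for the KVM host to emulate.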
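To put rough numbers on that trade-off, the sketch below converts a timer interval into clock ticks per second. The 15.625 ms figure is the commonly cited default Windows tick interval and is an assumption here, not something measured in this post. The 1 ms figure is only an example of what a caller might pass to timeBeginPeriod.

```python
def ticks_per_second(interval_ms):
    """Number of clock interrupts per second for a given tick interval."""
    return 1000.0 / interval_ms


default_rate = ticks_per_second(15.625)  # assumed default interval
raised_rate = ticks_per_second(1.0)      # example: 1 ms requested

print(default_rate, raised_rate, raised_rate / default_rate)
```

Under these assumptions a single 1 ms request multiplies the guest's clock interrupt rate by more than 15, and every one of those interrupts has to be acknowledged through the emulated RTC ports.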