[dpdk] TSC, HPET, Timer, Event Timer, RDTSCP

On the accuracy of dpdk timers when a thread is scheduled across CPU cores


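First, the dpdk timer interface compares times in CPU cycles. As covered in an earlier post:

[dpdk] dpdk --lcores parameter

when one EAL thread is mapped onto multiple processors, successive CPU cycle readings may be taken on different CPU cores.

Because the CPU cycle count is read with the rdtsc instruction, and each core has its own TSC, the cycle counts obtained this way may be inaccurate relative to one another.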
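To make the concern concrete, here is a minimal sketch of a cycle-based expiry check. It is not dpdk's actual code: `ms_to_cycles` and `timer_expired` are hypothetical helpers, and the 2 GHz TSC frequency mentioned below is only an assumed value for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: convert a timeout in milliseconds into TSC cycles.
 * Dividing first avoids 64-bit overflow at the cost of sub-kHz precision. */
static uint64_t ms_to_cycles(uint64_t tsc_hz, uint64_t ms)
{
    return tsc_hz / 1000 * ms;
}

/* Hypothetical helper: a timer whose deadline is `expire` (an absolute
 * cycle count) has fired once the current reading reaches that deadline. */
static bool timer_expired(uint64_t expire, uint64_t now)
{
    return now >= expire;
}
```

With an assumed 2 GHz TSC, a 10 ms timer armed at cycle 1000 gets the deadline 1000 + 20000000. If the deadline is later checked on another core whose TSC happens to run 1500000 cycles ahead, `timer_expired` returns true about 0.75 ms too early; a core that is behind fires late by the same mechanism.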

 

First, look into the rdtsc instruction:

https://stackoverflow.com/questions/3388134/rdtsc-accuracy-across-cpu-cores?utm_medium=organic&utm_source=google_rich_qa&utm_campaign=google_rich_qa

 

Invariant TSC

X86_FEATURE_CONSTANT_TSC + X86_FEATURE_NONSTOP_TSC

"16.11.1 Invariant TSC

The time stamp counter in newer processors may support an enhancement, referred to as invariant TSC. Processor's support for
invariant TSC is indicated by CPUID.80000007H:EDX[8].
The invariant TSC will run at a constant rate in all ACPI P-, C-. and T-states. This is the architectural behavior moving
forward. On processors with invariant TSC support, the OS may use the TSC for wall clock timer services (instead of ACPI or
HPET timers). TSC reads are much more efficient and do not incur the overhead associated with a ring transition or
access to a platform resource."
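The CPUID bit named in the quote can be checked directly. The following is a sketch that assumes GCC or Clang on x86 (it relies on the compiler's `<cpuid.h>` and its `__get_cpuid` helper); the function names are my own, and on other architectures the query simply reports "unknown".

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang helper header, assumed to be available */
#endif

/* Bit 8 of EDX from CPUID leaf 0x80000007 signals invariant TSC. */
static bool edx_has_invariant_tsc(uint32_t edx)
{
    return ((edx >> 8) & 1u) != 0;
}

/* Returns 1 if the CPU reports invariant TSC, 0 if it does not,
 * and -1 if the leaf cannot be queried (old CPU or non-x86 build). */
static int cpu_has_invariant_tsc(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    /* __get_cpuid returns 0 when the requested leaf is not supported. */
    if (!__get_cpuid(0x80000007, &eax, &ebx, &ecx, &edx))
        return -1;
    return edx_has_invariant_tsc(edx) ? 1 : 0;
#else
    return -1;
#endif
}
```

Calling `cpu_has_invariant_tsc()` from a small `main` and printing the result should agree with what the kernel shows as `constant_tsc nonstop_tsc` in /proc/cpuinfo, although a hypervisor may hide or alter the bit.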

 

[root@D128 ~]# cat /proc/cpuinfo |grep tsc
constant_tsc nonstop_tsc

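These flags only guarantee TSC accuracy on a single core when that core changes frequency or is suspended; they do not guarantee that the TSCs of different CPU cores are synchronized.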

https://software.intel.com/en-us/forums/software-tuning-performance-optimization-platform-monitoring/topic/388964

Hello Samuel,

The 'Invariant TSC' means that the TSC runs at a fixed frequency and doesn't stop when the cpu halts.
The TSCs are not guaranteed to be synchronized although the OS usually does try to synchronize the TSC at boot time. This is one reason 
for the rdtscp instruction. On Nehalem and later cpus, the rdtscp instruction returns the TSC and an identifier indicating on which cpu
you read the TSC. RDTSCP is a serializing instruction... unlike the regular rdtsc instruction. Pat
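As a sketch of what the reply describes, the code below reads the TSC together with the CPU identifier via rdtscp. Assumptions: GCC or Clang on x86, and the Linux convention of storing (node << 12) | cpu in IA32_TSC_AUX; other operating systems may put something else there. The function names are hypothetical. Note also that Intel's manual describes RDTSCP as waiting for all previous instructions to execute rather than as a fully serializing instruction.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>      /* __get_cpuid */
#include <x86intrin.h>  /* __rdtscp */
#endif

/* Split the IA32_TSC_AUX value using the Linux layout:
 * low 12 bits = CPU number, remaining bits = NUMA node. */
static void decode_tsc_aux(uint32_t aux, uint32_t *cpu, uint32_t *node)
{
    *cpu = aux & 0xfffu;
    *node = aux >> 12;
}

/* Reads the TSC and the CPU it was read on. Returns 0 on success,
 * -1 if rdtscp is unavailable (checked via CPUID.80000001H:EDX[27]). */
static int read_tsc_with_cpu(uint64_t *tsc, uint32_t *cpu, uint32_t *node)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx, aux;

    if (!__get_cpuid(0x80000001, &eax, &ebx, &ecx, &edx) ||
        !(edx & (1u << 27)))
        return -1;

    *tsc = __rdtscp(&aux);   /* TSC value and TSC_AUX read together */
    decode_tsc_aux(aux, cpu, node);
    return 0;
#else
    (void)tsc; (void)cpu; (void)node;
    return -1;
#endif
}
```

Taking two samples and comparing the `cpu` fields tells you whether both readings came from the same core; only then is the difference of the two TSC values safe to interpret without relying on cross-core synchronization.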

 

HPET

https://en.wikipedia.org/wiki/High_Precision_Event_Timer

 

An HPET chip consists of a 64-bit up-counter (main counter) counting at a frequency of at least 10 MHz, 
and a set of (at least three, up to 256) comparators. These comparators are 32- or 64-bit-wide. The HPET
is programmed via a memory mapped I/O window that is discoverable via Advanced Configuration and Power
Interface (ACPI). The HPET circuit in modern PCs is integrated into the southbridge chip.[a]

 

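HPET is a single, chip-wide counter running at 10 MHz or more, which means one tick lasts at most 100 ns (1 / 10 MHz); it is usually integrated in the southbridge.

HPET provides at least 3 and at most 256 independent comparators.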

 

 

The Linux kernel can also use HPET as its clock source. The documentation of Red Hat MRG version 2 states that TSC is the preferred 
clock source due to its much lower overhead, but it uses HPET as a fallback. A benchmark in that environment for 10 million event
counts found that TSC took about 0.6 seconds, HPET took slightly over 12 seconds, and ACPI Power Management Timer took around 24 seconds.[5]

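Although HPET is precise, it does carry a performance cost, so the Linux kernel still prefers TSC as the first-choice clock source and keeps HPET as the fallback.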

 

Check whether HPET is enabled:

[root@D129 cli]# grep hpet /proc/timer_list 
Clock Event Device: hpet
 set_next_event: hpet_legacy_next_event
 set_mode:       hpet_legacy_set_mode
[root@D129 haha-walawala]# cat /sys/devices/system/clocksource/clocksource0/available_clocksource 
kvm-clock hpet acpi_pm 
[root@D129 haha-walawala]# cat /sys/devices/system/clocksource/clocksource0/current_clocksource 
kvm-clock
[root@D129 haha-walawala]# ll /dev/hpet 
crw-------. 1 root root 10, 228 May  3 16:23 /dev/hpet
[root@D129 haha-walawala]# 
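The same check can be done from a program. This is a small sketch that reads the sysfs files used in the session above; `read_first_line` and `list_contains` are hypothetical helper names, and the paths are assumed to exist only on Linux.

```c
#include <stdio.h>
#include <string.h>

/* Read the first line of a file into buf and strip the trailing newline.
 * Returns 0 on success, -1 if the file is missing or empty. */
static int read_first_line(const char *path, char *buf, size_t len)
{
    FILE *f = fopen(path, "r");

    if (f == NULL)
        return -1;
    if (fgets(buf, (int)len, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\r\n")] = '\0';
    return 0;
}

/* Return 1 if `name` appears as a whole word in a space-separated list
 * such as "kvm-clock hpet acpi_pm ", else 0. */
static int list_contains(const char *list, const char *name)
{
    size_t want = strlen(name);

    while (*list != '\0') {
        list += strspn(list, " \t");           /* skip separators */
        size_t n = strcspn(list, " \t");       /* length of this token */
        if (n > 0 && n == want && strncmp(list, name, n) == 0)
            return 1;
        list += n;
    }
    return 0;
}
```

For example, reading /sys/devices/system/clocksource/clocksource0/available_clocksource with `read_first_line` and passing the result to `list_contains(buf, "hpet")` reports whether hpet is offered, and reading current_clocksource shows which source is in use (kvm-clock on the VM above).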

How to enable it in dpdk:

https://dpdk.org/doc/guides/linux_gsg/enable_func.html#high-precision-event-timer-hpet-functionality

 

rdtscp 

 

 

ACPI

Omitted.

 

Event Timer Adapter Library

https://dpdk.org/doc/guides/prog_guide/event_timer_adapter.html#id1

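After reading the documentation above, read the code to settle two questions:

1. When is RDTSC called?

2. What hardware backs the Event Timer?

There is no official Event Timer example, so look at how the Event Device library is used: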

 http://dpdk.org/doc/guides/prog_guide/eventdev.html

 

 

 

Further reading:

https://www.ibm.com/developerworks/cn/linux/l-cn-timerm/

 

posted on 2018-05-03 20:15  toong