gettimeofday, clock_gettime, and the impact of different clock sources
$rpm -qa|grep glibc-comm
$uname -a
$./a.out -help
[gettimeofday/clock_gettime] thread_number loop_count
$./a.out gettimeofday 1 100000000
gettimeofday(50035480681901) , times : 100000000
thread 1105828160 consume 4000225 us
Figure 1: gettimeofday goes through vsyscall
Figure 2: gettimeofday can drive user-mode CPU usage to 100%
$./a.out gettimeofday 12 100000000
gettimeofday(51127820371298) , times : 100000000
thread 1201568064 consume 4111854 us
$./a.out clock_gettime 1 100000000
clock_gettime(50265567600696623) , times : 100000000
thread 1107867968 consume 10242448 us
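Figure 3: clock_gettime makes a real system call
Figure 4: with clock_gettime, 70% of the CPU time is spent in sys mode, confirming that it does enter the system call path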
$./a.out clock_gettime 12 100000000
clock_gettime(50369061997211567) , times : 100000000
thread 1122031936 consume 10226828 us
clock_gettime -> sys_call -> sys_clock_gettime -> getnstimeofday -> read_tsc -> native_read_tsc
void getnstimeofday(struct timespec *ts)
{
    unsigned long seq;
    s64 nsecs;

    WARN_ON(timekeeping_suspended);

    do {
        // xtime may be modified while the code below runs. Holding a sequence
        // number avoids explicit locking: if seq is unchanged once the body
        // has finished, xtime was not modified and this read succeeds;
        // otherwise retry. Whether or not it retries, the CPU stays busy.
        seq = read_seqbegin(&xtime_lock);

        *ts = xtime;
        nsecs = timekeeping_get_ns();   // finer-grained offset from the current clock source

        /* If arch requires, add in gettimeoffset() */
        nsecs += arch_gettimeoffset();  // an interrupt takes time between being raised and being handled, so compensation may be needed

    } while (read_seqretry(&xtime_lock, seq));

    timespec_add_ns(ts, nsecs);         // fold the nanosecond offset into ts
}

/* Timekeeper helper functions. */
static inline s64 timekeeping_get_ns(void)
{
    cycle_t cycle_now, cycle_delta;
    struct clocksource *clock;

    /* read clocksource: */
    clock = timekeeper.clock;
    // Read through the clock source registered with the system [2];
    // on this setup tsc is normally the default clock source:
    // /sys/devices/system/clocksource/clocksource0/current_clocksource
    // The call below ends up in native_read_tsc, which is a single
    // rdtsc assembly instruction.
    cycle_now = clock->read(clock);

    /* calculate the delta since the last update_wall_time: */
    cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;

    /* return delta convert to nanoseconds using ntp adjusted mult. */
    return clocksource_cyc2ns(cycle_delta, timekeeper.mult,
                              timekeeper.shift);
}
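The lock-free read-and-retry pattern that getnstimeofday relies on can be sketched in user space. The following is only an illustration of the idea, not the kernel's implementation: SeqTime, seq_write and seq_read are hypothetical names, a single writer is assumed (the kernel serializes writers with a spinlock), and default sequentially consistent atomics are used for simplicity rather than the kernel's memory barriers.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for the kernel's xtime + xtime_lock pair.
struct SeqTime {
    std::atomic<uint32_t> seq{0};   // even: stable, odd: write in progress
    std::atomic<int64_t> sec{0};
    std::atomic<int64_t> nsec{0};
};

struct TimeSnapshot {
    int64_t sec;
    int64_t nsec;
};

// Writer: make seq odd, update the fields, then make seq even again.
// Assumes only one thread ever calls this.
inline void seq_write(SeqTime& t, int64_t sec, int64_t nsec) {
    uint32_t s = t.seq.load();
    t.seq.store(s + 1);
    t.sec.store(sec);
    t.nsec.store(nsec);
    t.seq.store(s + 2);
}

// Reader: take no lock; copy the fields and retry if a write was in
// progress (odd seq) or completed in between (seq changed).
inline TimeSnapshot seq_read(const SeqTime& t) {
    TimeSnapshot out;
    uint32_t begin, end;
    do {
        begin = t.seq.load();
        out.sec = t.sec.load();
        out.nsec = t.nsec.load();
        end = t.seq.load();
    } while ((begin & 1u) != 0 || begin != end);
    return out;
}
```

Readers never block the writer, which is why the pattern suits a value that is read very often and written rarely; the price is that a reader spins on the CPU while it retries.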
$cat /sys/devices/system/clocksource/clocksource0/available_clocksource
tsc hpet acpi_pm
$sudo bash -c "echo hpet > /sys/devices/system/clocksource/clocksource0/current_clocksource"
$cat /sys/devices/system/clocksource/clocksource0/current_clocksource
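A program that depends on cheap timestamps may want to check the active clock source itself, for example to log a warning at startup when it is not tsc. A minimal sketch, assuming the sysfs path shown above is readable; read_first_line and current_clocksource are hypothetical helper names:

```cpp
#include <fstream>
#include <string>

// Returns the first line of a file, or an empty string if the file
// cannot be opened (for example on non-Linux systems or in some containers).
inline std::string read_first_line(const std::string& path) {
    std::ifstream in(path);
    std::string line;
    if (in) {
        std::getline(in, line);
    }
    return line;
}

// Name of the clock source currently used by the kernel, e.g. "tsc" or
// "hpet"; empty if the sysfs entry is unavailable.
inline std::string current_clocksource() {
    return read_first_line(
        "/sys/devices/system/clocksource/clocksource0/current_clocksource");
}
```

Reading the entry needs no special privileges; only writing to it, as in the sudo command above, requires root.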
$./a.out gettimeofday 1 100000000
gettimeofday(50067118117357) , times : 100000000
thread 1091926336 consume 71748597 us
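The latency is roughly 18 times what it was before (71,748,597 us versus 4,000,225 us for the same 100,000,000 calls), about 700 ns per call. clock_gettime behaves much the same, because the bottleneck is no longer the system call but the slow read_hpet.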
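The per-call figures follow directly from the totals printed by the test runs. As a small worked example (ns_per_call and slowdown are hypothetical helper names), the arithmetic is:

```cpp
#include <cstdint>

// Average cost of one call in nanoseconds, given the total elapsed time
// in microseconds and the number of calls.
inline double ns_per_call(uint64_t total_us, uint64_t calls) {
    return static_cast<double>(total_us) * 1000.0 / static_cast<double>(calls);
}

// How many times slower one run is than another with the same call count.
inline double slowdown(uint64_t slow_us, uint64_t fast_us) {
    return static_cast<double>(slow_us) / static_cast<double>(fast_us);
}
```

With the numbers above, ns_per_call(71748597, 100000000) is about 717 ns for gettimeofday on hpet, ns_per_call(4000225, 100000000) is about 40 ns on tsc, and slowdown(71748597, 4000225) is about 17.9.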
clock_gettime -> sys_call -> sys_clock_gettime -> getnstimeofday -> read_hpet -> hpet_readl -> readl
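In summary: with the kernel and glibc versions given above and tsc as the clock source, gettimeofday is more than twice as fast as clock_gettime (about 40 ns versus about 102 ns per call in the single-thread runs), which makes it a good fit for timing. clock_gettime with CLOCK_REALTIME_COARSE is also fast. If tsc turns out to be unstable (either the hardware or a kernel bug can cause this, and I have run into it), hpet usually does not fail at the same time, so hpet becomes the new clock source. Overall performance then drops by more than an order of magnitude (about 18 times for gettimeofday in the test above), and there is no real difference left between the two calls.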
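To try the CLOCK_REALTIME_COARSE variant mentioned above, something along the following lines could be used. It is a sketch only: avg_ns_per_call and coarse_clock_id are hypothetical helpers, CLOCK_REALTIME_COARSE is Linux-specific (hence the fallback), and on glibc older than 2.17 clock_gettime requires linking with -lrt.

```cpp
#include <time.h>
#include <cstdint>

// Clock id to use for the "coarse" run. Falls back to CLOCK_REALTIME on
// systems whose headers do not define CLOCK_REALTIME_COARSE.
inline clockid_t coarse_clock_id() {
#ifdef CLOCK_REALTIME_COARSE
    return CLOCK_REALTIME_COARSE;
#else
    return CLOCK_REALTIME;
#endif
}

// Average cost in nanoseconds of one clock_gettime(id, ...) call over
// `loops` iterations, or -1.0 if the clock is unsupported or loops <= 0.
inline double avg_ns_per_call(clockid_t id, int loops) {
    struct timespec tp, begin, end;
    if (loops <= 0 || clock_gettime(id, &tp) != 0) {
        return -1.0;
    }

    volatile uint64_t sink = 0;  // keeps the results observably used
    clock_gettime(CLOCK_MONOTONIC, &begin);
    for (int i = 0; i < loops; ++i) {
        clock_gettime(id, &tp);
        sink = sink + static_cast<uint64_t>(tp.tv_nsec);
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    double elapsed_ns = (end.tv_sec - begin.tv_sec) * 1e9 +
                        (end.tv_nsec - begin.tv_nsec);
    return elapsed_ns / loops;
}
```

Comparing avg_ns_per_call(CLOCK_REALTIME, n) with avg_ns_per_call(coarse_clock_id(), n) on the target machine shows whether the coarse clock is worth it there. The trade-off is resolution: the coarse clock only advances once per timer tick, so it is unsuitable for measuring short intervals.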
[1] On vsyscalls and the vDSO
[2] Linux内核的时钟中断机制 (The Linux kernel's clock interrupt mechanism)
// Build, for example: g++ bench.cpp -lpthread -lrt
// (bench.cpp is a placeholder file name; -lrt is only needed for
// clock_gettime on glibc older than 2.17)
#include <sys/time.h>
#include <time.h>
#include <pthread.h>
#include <stdint.h>
#include <cstdlib>
#include <iostream>
#include <string>
using namespace std;

// Current wall-clock time in microseconds.
uint64_t now()
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (uint64_t)tv.tv_sec * 1000000 + tv.tv_usec;
}

// Thread body: call gettimeofday c times and report the elapsed time.
void* func_gettimeofday(void* p)
{
    int32_t c = *(int32_t*)p;
    uint64_t start = now();

    uint64_t us = 0;
    int i = 0;
    while (i++ < c) {
        struct timeval tv;
        gettimeofday(&tv, NULL);
        us += tv.tv_usec;  // keep the result live so the call is not optimized away
    }
    cout << "gettimeofday(" << us << ") , times : " << c << endl;
    cout << "thread " << pthread_self() << " consume " << now() - start << " us" << endl;
    return 0;
}

// Thread body: call clock_gettime(CLOCK_REALTIME) c times and report the elapsed time.
void* func_clockgettime(void* p)
{
    int32_t c = *(int32_t*)p;
    uint64_t start = now();

    uint64_t us = 0;
    int i = 0;
    while (i++ < c) {
        struct timespec tp;
        clock_gettime(CLOCK_REALTIME, &tp);
        us += tp.tv_nsec;  // keep the result live so the call is not optimized away
    }
    cout << "clock_gettime(" << us << ") , times : " << c << endl;
    cout << "thread " << pthread_self() << " consume " << now() - start << " us" << endl;
    return 0;
}

int main(int argc, char** argv)
{
    if (argc != 4) {
        cout << " [gettimeofday/clock_gettime] thread_number loop_count" << endl;
        exit(-1);
    }
    string mode = string(argv[1]);
    int n = atoi(argv[2]);
    int loop = atoi(argv[3]);

    // Start n threads that all run the selected call loop times.
    pthread_t* ts = new pthread_t[n];
    for (int i = 0; i < n; i++) {
        if (mode == "gettimeofday") {
            pthread_create(ts + i, NULL, func_gettimeofday, &loop);
        } else {
            pthread_create(ts + i, NULL, func_clockgettime, &loop);
        }
    }

    for (int i = 0; i < n; i++) {
        pthread_join(ts[i], NULL);
    }

    delete[] ts;
    return 0;
}
Posted on 2014-07-03 19:48 by RaymondSQ