ARTS Week 23
Algorithm
本周的 LeetCode 题目为 50. Pow(x, n)
实现 pow(x, n) ,即计算 x 的 n 次幂函数(即,xn)。
输入:x = 2.00000, n = 10
输出:1024.00000
输入:x = 2.00000, n = -2
输出:0.25000
解释:2^-2 = (1/2)^2 = 1/4 = 0.25
为了使用二分法,需要首先保证 n 永远为正值,故需要对x的n次方进行转换。首先对 n 进行判断,若 n < 0,xn等于 pow(1/x, -n),当 n = 0或1 时,pow(x, n) 永远为 x,只有当 n>1 时,x的n次方就是计算 pow(x,n)。在真正计算xn函数calPow(double x, long n)
中,若 n==0
则返回1,先通过递归得到xn/2的结果tmp,若n为偶数,那么最后结果为 tmp*tmp
;若n为奇数,那么最后结果为 tmp*tmp*x
,代码如下:
class Solution {
public double myPow(double x, int n) {
if (x == 0 || x == 1) {
return x;
}
if (n > 0) {
return calPow(x, n);
} else if (n < 0) {
return calPow(1/x, -n);
} else {
return 1;
}
}
// 使用 long 类型:因为 绝对值(Integer.MIN_VALUE) > 绝对值(Integer.MAX_VALUE)
public double calPow(double x, long n) {
if (n == 0) {
return 1;
}
double tmp = calPow(x, n/2);
if (n % 2 == 0) {
return tmp * tmp;
} else {
return tmp * tmp * x;
}
}
}
Review
本周 Review 的英文文章为:C语言运行的开销
作者介绍了当C语言库(libc)成为程序运行的瓶颈时该如何应对的方法。
作者通过对一个空程序使用 strace -tt
进行追踪,发现在其机器上执行之间长达 9.4ms,这完全是链接器和C语言运行时的开销
// strace output for `int main() { return 0; }`
00:32:27.539873 execve("./overhead", ["./overhead"], [/* 15 vars */]) = 0
00:32:27.541060 brk(0) = 0xa7b000
00:32:27.541466 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
00:32:27.541834 open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
00:32:27.542199 fstat(3, {st_mode=S_IFREG|0644, st_size=36586, ...}) = 0
00:32:27.542438 mmap(NULL, 36586, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f4f640be000
00:32:27.542865 close(3) = 0
00:32:27.543109 open("/usr/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
00:32:27.543455 read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0`\1\2\0\0\0\0\0"..., 832) = 832
00:32:27.543782 fstat(3, {st_mode=S_IFREG|0755, st_size=1984416, ...}) = 0
00:32:27.544162 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4f640bd000
00:32:27.544466 mmap(NULL, 3813200, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f4f63b03000
00:32:27.544796 mprotect(0x7f4f63c9c000, 2097152, PROT_NONE) = 0
00:32:27.545104 mmap(0x7f4f63e9c000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x199000) = 0x7f4f63e9c000
00:32:27.545503 mmap(0x7f4f63ea2000, 16208, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f4f63ea2000
00:32:27.545775 close(3) = 0
00:32:27.546087 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4f640bc000
00:32:27.546386 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4f640bb000
00:32:27.546664 arch_prctl(ARCH_SET_FS, 0x7f4f640bc700) = 0
00:32:27.547373 mprotect(0x7f4f63e9c000, 16384, PROT_READ) = 0
00:32:27.548209 mprotect(0x7f4f640c7000, 4096, PROT_READ) = 0
00:32:27.548533 munmap(0x7f4f640be000, 36586) = 0
00:32:27.548949 exit_group(0) = ?
00:32:27.549272 +++ exited with 0 +++
接着作者尝试使用内联汇编和 -ffreestanding -nostdlib
来进行测试检查:
// gcc -m32 -ffreestanding -nostdlib
void _start() {
/* exit system call */
asm("movl $1,%eax;"
"xorl %ebx,%ebx;"
"int $0x80"
);
}
// strace -tt output
// 00:41:38.884122 execve("./blank", ["./blank"], [/* 15 vars */]) = 0
// 00:41:38.884552 exit(0) = ?
// 00:41:38.884629 +++ exited with 0 +++
可见这个没有使用任何系统调用的程序只运行了 0.5ms。接着作者尝试了静态编译程序,运行时间大约降到了8ms,追踪得到的输出如下:
00:35:19.095150 execve("./main-static", ["./main-static"], [/* 15 vars */]) = 0
00:35:19.095973 uname({sys="Linux", node="vagrant-arch", ...}) = 0
00:35:19.096373 brk(0) = 0x16cf000
00:35:19.096724 brk(0x16d01c0) = 0x16d01c0
00:35:19.097065 arch_prctl(ARCH_SET_FS, 0x16cf880) = 0
00:35:19.097387 readlink("/proc/self/exe", "/vagrant/level0/main-static", 4096) = 27
00:35:19.097779 brk(0x16f11c0) = 0x16f11c0
00:35:19.098140 brk(0x16f2000) = 0x16f2000
00:35:19.098539 fstat(0, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
00:35:19.098837 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f8d5a7cb000
00:35:19.099192 read(0, "Collaboratively administrate emp"..., 4096) = 1398
00:35:19.099544 fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
00:35:19.099893 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f8d5a7ca000
最后,作者编写了一个最小libc库来使用 -ffreestanding -nostdlib
标志进行编译,运行时间降低到了1ms,接近glibc的运行时间的1/10,其只有五个系统调用:
00:53:41.882305 execve("./main-43.1", ["./main-43.1"], [/* 15 vars */]) = 0
00:53:41.882920 read(0, "Collaboratively administrate emp"..., 4096) = 1398
00:53:41.882989 read(0, "", 4096) = 0
00:53:41.883006 write(1, "Collaboratively administrate emp"..., 1449) = 1449
00:53:41.883023 _exit(0) = ?
00:53:41.883356 +++ exited with 0 +++
总结:如果你觉得程序or进程启动时间比较慢并且确实需要从库层面进行优化,那么你值得花时间尝试另一种libc的实现(如 musl libc 或 Diet libc)
Tip
在 C 语言回调函数的示例。所谓回调函数就是将一个函数当作是另一个函数的参数,最经典的应用的便是在排序算法中自定义实现比较大小的函数。下面是一个简单的示例,operate
函数分别调用 add
和 multi
函数来分别实现加法、乘法的效果。
#include <stdio.h>
int operate(int a, int b, int (* func) (int a, int b));
int add(int a, int b);
int multi(int a, int b);
int main()
{
int a = 3, b = 4;
printf("%d + %d = %d\n", a, b, operate(a, b, add));
printf("%d * %d = %d\n", a, b, operate(a, b, multi));
return 0;
}
int operate(int a, int b, int (* func) (int a, int b))
{
return func(a, b);
}
int add(int a, int b)
{
return a+b;
}
int multi(int a, int b)
{
return a*b;
}
程序运行结果如下:
$ gcc test.c
$ ./a.out
3 + 4 = 7
3 * 4 = 12
Share
目前已经想好了一个非 ARTS 的技术类博文的选题,大致分为三部曲,目前正在努力写作中,加油!