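[Reading Notes] Code Reading: Methods and Practice (Part 1)

Almost everything here is excerpted from the book.

After the first two or three chapters I wanted to go straight to K&R's The C Programming Language; call it a "debt" I have owed for ten years.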

-------------------------------------------------------------
Translator's Preface
-------------------------------------------------------------

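There is a window that was never opened, yet it must stay closed forever;

There are people who truly exist, yet we are never to meet them;

There is a life that has not yet arrived, yet we have already left it forever.

A good program is like good music: crafted so deftly and so perfectly that its charm needs no ornate words.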

-------------------------------------------------------------

-------------------------------------------------------------

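The way to learn to write great code is to read code, lots of code.

With proper use of (code) inspections, more than 90% of the errors in a software product can be removed before testing begins. - Robert Glass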

-------------------------------------------------------------
1 Introduction
-------------------------------------------------------------
1.1 Why and How to Read Code


1.1.1 Code as Literature

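Signs of low-quality code:

  • inconsistent coding style;
  • structure that is needlessly complex or hard to understand;
  • obvious logic errors or oversights;
  • overuse of nonportable constructs;
  • lack of maintenance.

Read code selectively.

  • start with small programs;
  • compile the programs you study and run them;
  • actively modify the code to test whether you have understood it correctly;
  • improve it.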

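1.1.2 Code as Exemplar
1.1.3 Maintenance
1.1.4 Evolution

Deal with the components of a large system selectively:

  • locate the code you are interested in;
  • understand each specific part in isolation;
  • infer how the excerpted code relates to the rest of the code.

1.1.5 Reuse
1.1.6 Inspections
Walkthroughs, inspections, round-robin reviews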

Maxims

  1. Make it a habit to spend time reading high-quality code that others have written.
  2. Read code selectively and with a goal in your mind. Are you trying to learn new patterns, a coding style, a way to satisfy some requirements?
  3. Notice and appreciate the code's particular nonfunctional requirements that might give rise to a specific implementation style.
  4. When working on existing code, coordinate your efforts with the authors or maintainers to avoid duplication of work or bad feelings.
  5. Consider the benefits you receive from open-source software to be a loan; look for ways to repay it by contributing back to the open-source community.
  6. In many cases if you want to know "how did they do that?" there's no better way than reading the code.
  7. When looking for a bug, examine the code from the problem manifestation to the problem source. Avoid following unrelated paths.
  8. Use the debugger, the compiler's warnings or symbolic code output, a system call tracer, your database's SQL logging facility, packet dump tools, and Windows message spy programs to locate a bug's location.
  9. You can successfully modify large well-structured systems with only a minimal understanding of their complete functionality.
  10. When adding new functionality to a system, your first task is to find the implementation of a similar feature to use as a template for the one you will be implementing.
  11. To go from a feature's functional specification to the code implementation, follow the string messages or search the code using keywords.
  12. When porting code or modifying interfaces, you can save code-reading effort by directing your attention to the problem areas identified by the compiler.
  13. When refactoring, you start with a working system and want to ensure that you will end up with a working one. A suite of pertinent test cases will help you satisfy this obligation.
  14. When reading code to search for refactoring opportunities, you can maximize your return on investment by starting from the system's architecture and moving downward, looking at increasing levels of detail.
  15. Code reusability is a tempting but elusive concept; limit your expectations and you will not be disappointed.
  16. If the code you want to reuse is intractable and difficult to understand and isolate, look at larger granularity packages or different code.
  17. While reviewing a software system, keep in mind that it consists of more elements than executable statements. Examine the file and directory structure, the build and configuration process, the user interface, and the system's documentation.
  18. Use software reviews as a chance to learn, teach, lend a hand, and receive assistance.

-------------------------------------------------------------
2 Basic Programming Elements
------------------------------------------------------------- 

What we observe is not nature itself, but nature exposed to our method of questioning. - Werner Heisenberg

2.1 A Complete Program

First check that the argument is valid, then use it; the Boolean AND operator (&&) combines the two steps.

if( *++argv && !strcmp(*argv, "-n")) { ... }

Define a STREQ macro that returns true when two strings are equal

#define STREQ(a, b) (*(a) == *(b) && strcmp((a), (b)) == 0)
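
A minimal sketch of how the two idioms above fit together. The function name has_n_flag is hypothetical (it is not from the book); it assumes argv is NULL-terminated, as the argv passed to main is. The first-character comparison inside STREQ is a cheap pre-check that avoids the strcmp call for most strings that differ.

```c
#include <string.h>

/* True when the two strings are equal; the first-character test avoids
 * calling strcmp for most strings that differ. */
#define STREQ(a, b) (*(a) == *(b) && strcmp((a), (b)) == 0)

/* Hypothetical helper: returns 1 if the first argument after the
 * program name is "-n", else 0. argv must be NULL-terminated. */
int has_n_flag(char **argv)
{
    /* *++argv is NULL when no argument is left, so the left side of &&
     * guards the string comparison on the right side. */
    if (*++argv && STREQ(*argv, "-n"))
        return 1;
    return 0;
}
```

Calling has_n_flag(argv) at the top of main would report whether the program was started with -n as its first argument.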

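The return value of printf is cast to void. printf returns the number of characters actually printed. putchar returns EOF if it fails to write the character.

All output functions can fail for a variety of reasons, especially when the program's standard output is redirected to a file.

  • the device storing the output may have no free space left;
  • the user's quota on the device may be exhausted;
  • the write may exceed the process's or the system's maximum file size limit;
  • a hardware error may occur on the output device;
  • the file descriptor or stream associated with standard output may not be writable.

Failing to check the result of output operations can make a program fail silently, losing output without any warning.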

if(ferror(stdout))
     err(1, "stdout");
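
The same check can be wrapped in a small function. This is only a sketch: checked_puts is a hypothetical name, and it reports failure through its return value instead of exiting the way err does in the snippet above.

```c
#include <stdio.h>

/* Hypothetical helper: write s to fp and report failure instead of
 * ignoring it. Returns 0 on success, -1 if any step failed. */
int checked_puts(FILE *fp, const char *s)
{
    if (fputs(s, fp) == EOF)
        return -1;
    /* Buffered data can still fail to reach the device, so flush too. */
    if (fflush(fp) == EOF)
        return -1;
    /* ferror reports an error recorded earlier on the stream. */
    return ferror(fp) ? -1 : 0;
}
```

A caller would test the result, for example if (checked_puts(stdout, line) != 0), and then decide whether to retry, report, or exit.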

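2.2 Functions and Global Variables

A forward declaration lets the compiler check the arguments passed to a function and its return value, and generate correct code accordingly.

When inspecting code, make sure that every variable used only within a single file is declared static.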
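
A small sketch of both points, with made-up names (next_id, last_id, first_two_sum): the forward declaration lets the compiler check a call that appears before the definition, and static keeps the counter invisible outside this file.

```c
/* Forward declaration: lets the compiler check calls to next_id()
 * that appear before its definition. */
int next_id(void);

/* static gives the counter internal linkage: only this file can see it. */
static int last_id = 0;

/* Hypothetical caller placed before the definition of next_id(). */
int first_two_sum(void)
{
    /* Both calls are checked against the declaration above. */
    return next_id() + next_id();
}

/* Hands out 1, 2, 3, ... on successive calls. */
int next_id(void)
{
    return ++last_id;
}
```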

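Strategies for working out what a function (or method) does:

  • guess, based on the function name;
  • read the comment at the beginning of the function;
  • examine how the function is used;
  • read the code in the function body;
  • consult external program documentation.

When reading code, understanding usually comes gradually (gradual understanding).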

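2.3 while Loops, Conditions, and Blocks

Comparison operators, which often appear together with an assignment, bind more tightly than assignment. That is why the assignment in (c = getchar()) != EOF needs its own parentheses.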
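
A short sketch of the precedence rule, using two hypothetical helpers. The first uses the parenthesized form; the second shows what the unparenthesized form actually stores.

```c
#include <stddef.h>

/* Hypothetical helper: count the spaces in s. */
size_t count_spaces(const char *s)
{
    int c;
    size_t n = 0;

    /* The inner parentheses make the assignment happen first; the
     * assigned character is then compared with '\0'. */
    while ((c = *s++) != '\0')
        if (c == ' ')
            n++;
    return n;
}

/* Hypothetical helper showing the form without parentheses. */
int value_without_parens(const char *s)
{
    int c;

    c = *s != '\0';     /* parsed as c = (*s != '\0'): c is 0 or 1 */
    return c;
}
```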

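2.4 switch Statements

A switch statement handles a number of discrete integer or character values.

A switch statement without a default label silently ignores unexpected values. Even when you know the switch only has to handle a fixed set of values, include a default label whenever possible.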
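
As an illustration only (direction_name and its codes are invented for this sketch), a default label turns an unexpected value into something visible:

```c
/* Hypothetical example: map a one-letter direction code to a name. */
const char *direction_name(int code)
{
    switch (code) {
    case 'N': return "north";
    case 'S': return "south";
    case 'E': return "east";
    case 'W': return "west";
    default:
        /* Without this label an unexpected code would fall out of the
         * switch without any trace. */
        return "unknown";
    }
}
```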

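2.5 for Loops

Code reading involves many alternative strategies: bottom-up and top-down examination, the use of heuristics, and review of comments and external documentation. All of them should be tried as the problem dictates.

The three parts that describe a for statement are expressions, not statements.

A statement of the form for( ; ; ) is used to run an "infinite" loop.

2.6 break and continue Statements

The break statement transfers execution to the statement just after the innermost loop or switch statement; break is used to exit a loop early.

The continue statement skips the statements between itself and the end of the loop and continues with the next iteration of the innermost loop. In a while loop, continue re-evaluates the condition expression and then runs the loop again. In a for loop, it first evaluates the third expression and then the condition expression. continue is used where a loop body handles several cases separately; each case usually ends with continue to move on to the next iteration.

To work out the effect of a break statement, read upward from the break until you reach the first while, for, do, or switch block that encloses it. Find the first statement after that block; this is where control transfers once break executes.

When analyzing code that contains a continue statement, read upward from the continue until you reach the first while, for, or do loop that encloses it. Find the last statement of that loop; control transfers to the point immediately after it (but still inside the loop) once continue executes.

Sometimes a loop is executed only for the side effects of its controlling expressions. In that case continue serves as a placeholder in place of the empty statement (a lone semicolon).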

for( ; *string && isdigit(*string); string++)
     continue;

continue ignores switch statements, and neither break nor continue affects the operation of if statements.
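
A sketch of how break and continue behave when a switch sits inside a loop. The function count_until_dot is a made-up example: it counts the characters before the first '.', skipping spaces.

```c
#include <stddef.h>

/* Hypothetical helper: count the non-space characters of s that come
 * before the first '.' (or before the end of the string). */
size_t count_until_dot(const char *s)
{
    size_t n = 0;
    int done = 0;

    for (; *s && !done; s++) {
        switch (*s) {
        case ' ':
            /* continue ignores the switch and applies to the for loop:
             * s++ runs next, then the loop condition. */
            continue;
        case '.':
            done = 1;
            break;      /* leaves only the switch, not the for loop */
        default:
            n++;
            break;
        }
    }
    return n;
}
```

Because break inside the switch cannot leave the for loop, the sketch uses the done flag in the loop condition to stop at the dot.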

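2.7 Character and Boolean Expressions

Subtracting the ordinal value of '0' from *cp gives the integer value of the digit character that cp points to.

// Convert a string of digits to a number
while (*cp >= '0' && *cp <= '9' )
     i= i*10 + *cp++ - '0';

Placing the value being compared in the middle of the expression, with the other two values in ascending order, makes the test easier to read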

while( '0' <= *cp && *cp <= '9' )
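
Put together as a complete function, the conversion loop might look like the sketch below. The name leading_number is hypothetical, and overflow is deliberately not handled.

```c
/* Hypothetical helper: convert the leading decimal digits of cp to an
 * integer. Returns 0 when cp does not start with a digit.
 * Overflow is not handled in this sketch. */
int leading_number(const char *cp)
{
    int i = 0;

    /* Range test written with the tested value in the middle. */
    while ('0' <= *cp && *cp <= '9')
        i = i * 10 + (*cp++ - '0');     /* e.g. '7' - '0' is 7 */
    return i;
}
```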

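Every lowercase letter found (as determined by the if test) has the distance from 'a' to 'A' in the character set subtracted from it, which converts it to uppercase. The code fails when the character set has lowercase letters outside the range a...z, when the range a...z contains characters that are not lowercase letters, or when the distance between each lowercase letter and its uppercase counterpart is not constant.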

if ( 'a' <= *s && *s <= 'z' )
  *s -= ('a' - 'A');
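
One way to avoid those character set assumptions is to let the C library do the conversion. The sketch below swaps in toupper from <ctype.h> for the arithmetic above; upcase is a hypothetical name.

```c
#include <ctype.h>

/* Hypothetical helper: convert s to uppercase in place. */
void upcase(char *s)
{
    for (; *s; s++)
        /* toupper expects an unsigned char value (or EOF); the cast
         * avoids undefined behavior when char is signed and negative. */
        *s = (char)toupper((unsigned char)*s);
}
```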

De Morgan's rules

!(a||b) <=> !a && !b

!(a&&b) <=> !a || !b
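
Applied to a negated range test, the second rule gives two equivalent predicates. Both function names are invented for this sketch.

```c
/* Original form: the negation of a range test. */
int not_digit(int c)
{
    return !('0' <= c && c <= '9');
}

/* After De Morgan: !(a && b) becomes !a || !b, and each negated
 * comparison is then written as the opposite comparison. */
int not_digit_simple(int c)
{
    return c < '0' || '9' < c;
}
```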

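Short-circuit evaluation:

  • in a sequence of expressions joined with the && operator (logical AND), the first expression that evaluates to false ends the evaluation of the whole expression, which yields false;
  • in a sequence of expressions joined with the || operator (logical OR), the first expression that evaluates to true ends the evaluation of the whole expression, which yields true.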
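
Short-circuit evaluation is what makes guard expressions like the following safe. The two helpers are hypothetical examples.

```c
#include <stddef.h>

/* Hypothetical helper: does s start with the character c? */
int starts_with(const char *s, int c)
{
    /* If s is NULL, && stops here and *s is never evaluated. */
    return s != NULL && *s == c;
}

/* Hypothetical helper: is s NULL or the empty string? */
int is_blank(const char *s)
{
    /* If s is NULL, || stops here with a true result. */
    return s == NULL || *s == '\0';
}
```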

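2.8 goto Statements

goto is easy to misuse, producing "spaghetti" code whose control flow is hard to follow and reason about. Typical uses of goto:

  • to exit a program or function;
  • to re-execute a portion of code... here a goto can sometimes convey the programmer's intent better;
  • in nested loops and switch statements, in place of break and continue, to change the program's control flow.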
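
A sketch of the last use: leaving two nested loops at once. The function find_in_grid and its flat row-major layout are assumptions made for this example.

```c
/* Hypothetical helper: search a rows x cols grid stored row by row in
 * a flat array. Returns 1 and stores the position in *row and *col
 * when value is found, else returns 0 and leaves them untouched. */
int find_in_grid(const int *grid, int rows, int cols, int value,
                 int *row, int *col)
{
    int r, c;

    for (r = 0; r < rows; r++)
        for (c = 0; c < cols; c++)
            if (grid[r * cols + c] == value)
                goto found;     /* break would leave only the inner loop */
    return 0;

found:
    *row = r;
    *col = c;
    return 1;
}
```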

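2.9 Refactoring in the Small

Improving the design of code after it has been written is called refactoring.

Changing code to improve the program's internal structure without altering its external behavior. - Martin Fowler, Refactoring

"Readability and efficiency are to some degree incompatible" is a fallacy.

Making code more compact and less readable does not make it more efficient.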

op = &( !x ? ( !y ? upleft : ( y == bottom ? lowleft : left )) :
( x == last ? ( !y ? upright : ( y == bottom ? lowright : right )) :
( !y ? upper : ( y == bottom ? lower : normal ))))[w->orientation];

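[Figure: the expression formatted like an if statement (left) and like a cascading if-else sequence (right)]

Creative code layout can be used to improve code readability.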

op =
     & (          !y ? ( !x ? upleft  : x != last ? upper   :  upright ) :
          y!= bottom ? ( !x ? left    : x != last ? normal  :  right ) :
                       ( !x ? lowleft : x != last ? lower   :  lowright )
)[w->orientation];

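Making expressions more readable

  • add whitespace
  • use temporary variables to break the expression into smaller parts
  • use parentheses to make the precedence of particular operators explicit
  • use good indentation
  • choose variable names wisely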
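
As one possible illustration of the temporary-variable advice, the nine-way choice from the expression above can be split into a row and a column. This is a different technique from the book's layout (an index computed from two temporaries), and the enum, the function name classify, and its parameters are all assumptions of this sketch.

```c
/* Hypothetical names for the nine cases, laid out row by row. */
enum cell {
    UPLEFT,  UPPER,  UPRIGHT,
    LEFT,    NORMAL, RIGHT,
    LOWLEFT, LOWER,  LOWRIGHT
};

/* Classify position (x, y) in a grid whose last column is `last` and
 * whose bottom row is `bottom`. */
enum cell classify(int x, int y, int last, int bottom)
{
    /* Each temporary answers one small question. */
    int col = (x == 0) ? 0 : (x == last)   ? 2 : 1;
    int row = (y == 0) ? 0 : (y == bottom) ? 2 : 1;

    return (enum cell)(row * 3 + col);
}
```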

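It is best not to combine reformatting with any actual change to the program's logic. Reformat the code first and check it in, then make the other changes.

The -w option of the diff program ignores whitespace differences.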

International Obfuscated C Code Contest

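2.10 do Loops and Integer Expressions

An assignment whose result is then tested against zero (test against zero); the extra parentheses show that the assignment is intended: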

if ( ( p = q ) )
     q[-1] = '\n' ;

An explicit test against NULL (test against NULL):

if ( (p = strchr(name, '=')) != NULL ) { p++; }
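
The explicit form can be wrapped in a small function, sketched here under the assumption that the input looks like NAME=value. value_part is a hypothetical name.

```c
#include <string.h>

/* Hypothetical helper: for "NAME=value" return a pointer to "value",
 * or NULL when the string contains no '='. */
const char *value_part(const char *name)
{
    const char *p;

    /* Assign first, then compare the assigned pointer with NULL. */
    if ((p = strchr(name, '=')) != NULL)
        p++;            /* step past the '=' */
    return p;
}
```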

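In every comparison that involves a constant, the constant is written on the left-hand side, so that mistyping == as = becomes a compile error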

if ( 0 == serconsole )
     serconsinit = 0;

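The & operator performs a bitwise AND (bitwise-and) of its two operands

column & 7
// Masks off all but the three least significant bits of column, yielding the remainder of column divided by 8

When performing arithmetic with b = 2 exp n - 1 (that is, b+1 is a power of two), read a & b as a % (b+1) for non-negative a. Replacing the division with a bitwise AND instruction is sometimes faster to compute. Modern optimizing compilers can recognize this case and make the substitution on their own.

Shift instructions can substitute for arithmetic instructions.

  • Read a << n as a * k, where k = 2 exp n.
  • Read a >> n as a / k, where k = 2 exp n.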
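
The three readings can be checked with a few one-line functions. This sketch uses unsigned operands so that the equivalences are exact; the function names are made up, and n is assumed to be smaller than the bit width of unsigned.

```c
/* a & 7 keeps the three lowest bits, which is a % 8 for unsigned a. */
unsigned mod8(unsigned a)
{
    return a & 7u;
}

/* a << n is a * (2 exp n), as long as the product does not overflow. */
unsigned times_pow2(unsigned a, unsigned n)
{
    return a << n;
}

/* a >> n is a / (2 exp n) for unsigned a. */
unsigned div_pow2(unsigned a, unsigned n)
{
    return a >> n;
}
```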

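2.11 Control Structures Revisited

  • examine one control structure at a time
  • treat the controlling expression of each control structure as an assertion about the code it encloses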


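A loop invariant is an assertion about the program state that holds both at the beginning and at the end of the loop. If we can show that a particular loop maintains the invariant, and that when the loop ends the chosen invariant demonstrates that the desired result has been obtained, we can be sure that the loop always works within the envelope of the correct algorithm's results. We also need to make sure that the loop terminates. For that we use a variant: a measure of the distance from the final state, which every loop iteration must decrease. If we can demonstrate that the loop's operations decrease the variant while maintaining the invariant, we can conclude that the loop terminates correctly.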
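
Binary search is a common loop to reason about this way. The sketch below is my own illustration, not the book's figure: the invariant is stated in a comment, and the variant (the size of the remaining range) is checked with an assert on every iteration. find_sorted is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Return the index of key in the sorted array a[0..n-1], or -1 if the
 * key is absent. */
long find_sorted(const int *a, size_t n, int key)
{
    size_t lo = 0, hi = n;          /* the search range is a[lo..hi-1] */

    while (lo < hi) {
        /* Invariant: if key is in a, its index lies in [lo, hi). */
        size_t before = hi - lo;    /* variant: size of the range */
        size_t mid = lo + (hi - lo) / 2;

        if (a[mid] < key)
            lo = mid + 1;           /* key can only be right of mid */
        else if (a[mid] > key)
            hi = mid;               /* key can only be left of mid */
        else
            return (long)mid;

        assert(hi - lo < before);   /* the variant strictly decreases */
    }
    /* The range is empty, so by the invariant key is not in a. */
    return -1;
}
```

Since the range shrinks on every iteration and cannot go below zero, the loop must terminate; the invariant then tells us that an empty range means the key is absent.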


 Maxims

  1. When examining a program for the first time, main can be a good starting point.
  2. Read a cascading if-else if...else sequence as a selection of mutually exclusive choices.
  3. Sometimes executing a program can be a more expedient way to understand an aspect of its functionality than reading its source code.
  4. When examining a nontrivial program, it is useful to first identify its major constituent parts.
  5. Learn local naming conventions and use them to guess what variables and functions do.
  6. When modifying code based on guesswork, plan the process that will verify your initial hypotheses. This process can involve checks by the compiler, the introduction of assertions, or the execution of appropriate test cases.
  7. Understanding one part of the code can help you understand the rest.
  8. Disentangle difficult code by starting with the easy parts.
  9. Make it a habit to read the documentation of library elements you encounter; it will enhance both your code-reading and code-writing skills.
  10. Code reading involves many alternative strategies: bottom-up and top-down examination, the use of heuristics, and review of comments and external documentation should all be tried as the problem dictates.
  11. Loops of the form for(i=0;i<n;i++) execute n times; treat all other forms with caution.
  12. Read comparison expressions involving the conjunction of two inequalities with one identical term as a range membership test.
  13. You can often understand the meaning of an expression by applying it on sample data.
  14. Simplify complicated logical expressions by using De Morgan's rules.
  15. When reading a conjunction, you can always assume that the expressions on the left of the expression you are examining are true; when reading a disjunction, you can similarly assume that the expressions on the left of the expression you are examining are false.
  16. Reorganize code you control to make it readable.
  17. Read expressions using the conditional operator ?: like if code.
  18. There is no need to sacrifice code readability for efficiency.
  19. While it is true that efficient algorithms and certain optimizations can make the code more complicated and therefore more difficult to follow, this does not mean that making the code compact and unreadable will make it more efficient.
  20. Creative code layout can be used to improve code readability.
  21. You can improve the readability of expressions using whitespace, temporary variables, and parentheses.
  22. When reading code under your control, make it a habit to add comments as needed.
  23. You can improve the readability of poorly written code with better indentation and appropriate variable names.
  24. When you are examining a program revision history that spans a global reindentation exercise using the diff program, you can often avoid the noise introduced by the changed indentation levels by specifying the -w option to have diff ignore whitespace differences.
  25. The body of a do loop is executed at least once.
  26. When performing arithmetic, read a&b as a%(b+1) when b+1 = 2 exp n.
  27. Read a<<n as a*k, where k = 2 exp n.
  28. Read a>>n as a/k, where k = 2 exp n.
  29. Examine one control structure at a time, treating its contents as a black box.
  30. Treat the controlling expression of each control structure as an assertion for the code it encloses.

posted on 2012-12-05 22:10  zhaorui
