Why vimdiff's "get" command is do rather than dg: a look at how Vim parses commands
intro
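When you use vimdiff to pull a diff hunk from the other file, the Ex command is diffget, yet the Normal-mode command is not the dg you might expect but do (diff obtain), which is somewhat surprising.
As for the narrow question "why does Vim use do rather than dg to obtain a diff?", Vim's help entry for "do" already answers it explicitly: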
[count]do Same as ":diffget" without range. The "o" stands for "obtain"
("dg" can't be used, it could be the start of "dgg"!). Note:
this doesn't work in Visual mode.
If you give a [count], it is used as the [bufspec] argument
for ":diffget".
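In short, dg could be the prefix of the dgg command, and equally of others such as dge. d is an operator, and g introduces many motion commands; "operator + motion" is Vim's classic editing pattern.
How does Vim parse an operator followed by a multi-character command (for example gg or zO)? Many commands start with g, but not every character after g forms a valid Vim command. If Vim meets a two-character sequence it does not recognize (for example gA), does it treat the A as a standalone Append? (That one question is easy to settle by trying it in Vim.)
And if the input is "operator + operator" rather than "operator + motion", what does Vim do?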
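The prefix problem can be illustrated with a toy classifier. This is only a sketch of the ambiguity, not how Vim parses keys: the command list, the enum and the function names (toy_cmds, toy_classify, and so on) are all made up for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical result codes for this sketch; they are not Vim's. */
enum toy_match { TOY_NO_MATCH, TOY_PREFIX_ONLY, TOY_COMPLETE };

/* A made-up key-sequence list: "d" plus a few motions, and the diff commands. */
static const char *const toy_cmds[] = { "dw", "dgg", "dge", "do", "dp" };

/* Classify what has been typed so far against the toy list. */
static enum toy_match toy_classify(const char *typed)
{
    size_t n = strlen(typed);
    enum toy_match best = TOY_NO_MATCH;
    size_t i;

    for (i = 0; i < sizeof toy_cmds / sizeof toy_cmds[0]; i++) {
        if (strcmp(toy_cmds[i], typed) == 0)
            return TOY_COMPLETE;        /* exact command */
        if (strncmp(toy_cmds[i], typed, n) == 0)
            best = TOY_PREFIX_ONLY;     /* typed starts a longer command */
    }
    return best;
}

/* Usage: "dg" cannot finish on its own because "dgg" and "dge" start with it. */
int toy_classify_demo(void)
{
    assert(toy_classify("dg") == TOY_PREFIX_ONLY);
    assert(toy_classify("dgg") == TOY_COMPLETE);
    assert(toy_classify("do") == TOY_COMPLETE);
    assert(toy_classify("dx") == TOY_NO_MATCH);
    return 0;
}
```

If "dg" were added to the list as a command of its own, the classifier would report it as complete at once and "dgg" could never be typed, which is the conflict the help text describes.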
Operator-pending
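Before executing a command, Vim stores the information it has collected in a cmdarg_T structure. Besides the expected cmdchar field, this structure points to a separate oparg_T structure, whose op_type field holds the operator type.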
//@file: vim\src\structs.h
/*
* Arguments for operators.
*/
typedef struct oparg_S
{
int op_type; // current pending operator type
int regname; // register to use for the operator
int motion_type; // type of the current cursor motion
int motion_force; // force motion type: 'v', 'V' or CTRL-V
int use_reg_one; // TRUE if delete uses reg 1 even when not
// linewise
int inclusive; // TRUE if char motion is inclusive (only
// valid when motion_type is MCHAR)
int end_adjusted; // backuped b_op_end one char (only used by
// do_format())
pos_T start; // start of the operator
pos_T end; // end of the operator
pos_T cursor_start; // cursor position before motion for "gw"
long line_count; // number of lines from op_start to op_end
// (inclusive)
int empty; // op_start and op_end the same (only used by
// do_change())
int is_VIsual; // operator on Visual area
int block_mode; // current operator is Visual block mode
colnr_T start_vcol; // start col for block mode operator
colnr_T end_vcol; // end col for block mode operator
long prev_opcount; // ca.opcount saved for K_CURSORHOLD
long prev_count0; // ca.count0 saved for K_CURSORHOLD
int excl_tr_ws; // exclude trailing whitespace for yank of a
// block
} oparg_T;
/*
* Arguments for Normal mode commands.
*/
typedef struct cmdarg_S
{
oparg_T *oap; // Operator arguments
int prechar; // prefix character (optional, always 'g')
int cmdchar; // command character
int nchar; // next command character (optional)
int ncharC1; // first composing character (optional)
int ncharC2; // second composing character (optional)
int extra_char; // yet another character (optional)
long opcount; // count before an operator
long count0; // count before command, default 0
long count1; // count before command, default 1
int arg; // extra argument from nv_cmds[]
int retval; // return: CA_* values
char_u *searchbuf; // return: pointer to search pattern or NULL
} cmdarg_T;
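Operators and non-operators are parsed through the same path. The difference is that once an operator has been typed, it is remembered not in the cmdchar field of cmdarg_T but in the op_type field of oparg_T; in other words, operator state lives mainly in the separate oparg_T structure.
This means Vim has to recognize operators and handle them differently.
That operators and non-operators share one path is also stated in Vim's introductory documentation, intro.txt: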
Operator-pending mode This is like Normal mode, but after an operator
command has started, and Vim is waiting for a {motion}
to specify the text that the operator will work on.
In the implementation, "pending" means exactly this: the operator is parked in the separate oparg_T structure.
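A minimal sketch of that idea follows. It assumes, as a simplification, that the per-command structure is rebuilt for every key while the operator structure outlives it. The pend_* types and function are hypothetical stand-ins that keep only the fields discussed above, and the doubled-operator case is not modelled.

```c
#include <assert.h>

/* Simplified stand-ins for oparg_T and cmdarg_T (hypothetical names). */
enum { PEND_OP_NOP = 0, PEND_OP_DELETE, PEND_OP_YANK };

typedef struct { int op_type; } pend_oparg;
typedef struct { pend_oparg *oap; int cmdchar; } pend_cmdarg;

/*
 * Handle one key. The command struct is rebuilt on every call, so the only
 * thing that survives from "d" to "w" is the operator struct it points to.
 * Returns 1 when a motion completes a pending operator, 0 otherwise.
 * Assumption of this sketch: 'd' and 'y' are operators, any other key is
 * a motion.
 */
static int pend_normal_cmd(pend_oparg *oap, int key)
{
    pend_cmdarg ca;

    ca.oap = oap;
    ca.cmdchar = key;

    if (ca.cmdchar == 'd' || ca.cmdchar == 'y') {
        ca.oap->op_type = (ca.cmdchar == 'd') ? PEND_OP_DELETE : PEND_OP_YANK;
        return 0;                       /* now operator-pending */
    }
    if (ca.oap->op_type != PEND_OP_NOP) {
        ca.oap->op_type = PEND_OP_NOP;  /* operator "executed" over the motion */
        return 1;
    }
    return 0;                           /* plain cursor motion */
}

/* Usage: "dw" completes an operator, a lone "w" does not. */
int pend_demo(void)
{
    pend_oparg oa = { PEND_OP_NOP };

    assert(pend_normal_cmd(&oa, 'd') == 0);
    assert(oa.op_type == PEND_OP_DELETE);
    assert(pend_normal_cmd(&oa, 'w') == 1);
    assert(oa.op_type == PEND_OP_NOP);
    assert(pend_normal_cmd(&oa, 'w') == 0);
    return 0;
}
```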
operator
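An operator is recognized through the same path as any ordinary command: when a character is read, Vim looks up the matching entry in the nv_cmds table. Each entry carries some flags and a handler function. The familiar operators such as change and delete all map to the nv_operator function.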
// Values for cmd_flags.
#define NV_NCH 0x01 // may need to get a second char
#define NV_NCH_NOP (0x02|NV_NCH) // get second char when no operator pending
#define NV_NCH_ALW (0x04|NV_NCH) // always get a second char
#define NV_LANG 0x08 // second char needs language adjustment
#define NV_SS 0x10 // may start selection
#define NV_SSS 0x20 // may start selection with shift modifier
#define NV_STS 0x40 // may stop selection without shift modif.
#define NV_RL 0x80 // 'rightleft' modifies command
#define NV_KEEPREG 0x100 // don't clear regname
#define NV_NCW 0x200 // not allowed in command-line window
/*
* Generally speaking, every Normal mode command should either clear any
* pending operator (with *clearop*()), or set the motion type variable
* oap->motion_type.
*
* When a cursor motion command is made, it is marked as being a character or
* line oriented motion. Then, if an operator is in effect, the operation
* becomes character or line oriented accordingly.
*/
/*
* This table contains one entry for every Normal or Visual mode command.
* The order doesn't matter, this will be sorted by the create_nvcmdidx.vim
* script to generate the nv_cmd_idx[] lookup table.
* It is faster when all keys from zero to '~' are present.
*/
static const struct nv_cmd
{
int cmd_char; // (first) command character
nv_func_T cmd_func; // function for this command
short_u cmd_flags; // NV_ flags
short cmd_arg; // value for ca.arg
} nv_cmds[] =
#else // DO_DECLARE_NVCMD
/*
* Used when creating nv_cmdidxs.h.
*/
# define NVCMD(a, b, c, d) a
static const int nv_cmds[] =
#endif // DO_DECLARE_NVCMD
{
///...
NVCMD(' ', nv_right, 0, 0),
NVCMD('!', nv_operator, 0, 0),
NVCMD('"', nv_regname, NV_NCH_NOP|NV_KEEPREG, 0),
NVCMD('#', nv_ident, 0, 0),
///...
NVCMD('c', nv_operator, 0, 0),
NVCMD('d', nv_operator, 0, 0),
///...
NVCMD('f', nv_csearch, NV_NCH_ALW|NV_LANG, FORWARD),
NVCMD('g', nv_g_cmd, NV_NCH_ALW, FALSE),
///...
NVCMD('y', nv_operator, 0, 0),
NVCMD('z', nv_zet, NV_NCH_ALW, 0),
NVCMD('{', nv_findpar, 0, BACKWARD),
///...
}
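nv_operator looks up the operator type (via get_op_type) and records it in the cap->oap structure.
The implementation has one more interesting detail: if an operator is already pending and a different operator is typed, the two cancel out. checkclearop clears the pending operator with a beep, and the newly typed operator does not take effect either. Typing the same operator twice (for example dd) is the exception and works on whole lines.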
/*
* Check for operator active and clear it.
*
* Beep and return TRUE if an operator was active.
*/
static int
checkclearop(oparg_T *oap)
{
if (oap->op_type == OP_NOP)
return FALSE;
clearopbeep(oap);
return TRUE;
}
/*
* Handle an operator command.
* The actual work is done by do_pending_operator().
*/
static void
nv_operator(cmdarg_T *cap)
{
int op_type;
op_type = get_op_type(cap->cmdchar, cap->nchar);
#ifdef FEAT_JOB_CHANNEL
if (bt_prompt(curbuf) && op_is_change(op_type) && !prompt_curpos_editable())
{
clearopbeep(cap->oap);
return;
}
#endif
if (op_type == cap->oap->op_type) // double operator works on lines
nv_lineop(cap);
else if (!checkclearop(cap->oap))
{
cap->oap->start = curwin->w_cursor;
cap->oap->op_type = op_type;
#ifdef FEAT_EVAL
set_op_var(op_type);
#endif
}
}
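The three outcomes of nv_operator can be condensed into a small state function. This is a sketch under stated assumptions: the dbl_* names are hypothetical, only three operator keys are covered, and the linewise case is treated as executed immediately rather than going through do_pending_operator().

```c
#include <assert.h>

enum { DBL_OP_NOP = 0, DBL_OP_DELETE, DBL_OP_YANK, DBL_OP_CHANGE };
enum dbl_result { DBL_PENDING, DBL_LINEWISE, DBL_CANCELLED };

/* Hypothetical stand-in for get_op_type(); covers three keys only. */
static int dbl_op_type(int key)
{
    switch (key) {
    case 'd': return DBL_OP_DELETE;
    case 'y': return DBL_OP_YANK;
    case 'c': return DBL_OP_CHANGE;
    default:  return DBL_OP_NOP;
    }
}

/*
 * Mirrors the three branches of nv_operator(). Call it with operator keys
 * only, as the nv_cmds[] dispatch would.
 */
static enum dbl_result dbl_operator(int *pending, int key)
{
    int op_type = dbl_op_type(key);

    if (op_type == *pending) {      /* "dd", "yy": works on lines */
        *pending = DBL_OP_NOP;      /* simplification: executed at once */
        return DBL_LINEWISE;
    }
    if (*pending != DBL_OP_NOP) {   /* "dy": the checkclearop() path */
        *pending = DBL_OP_NOP;
        return DBL_CANCELLED;
    }
    *pending = op_type;             /* first operator: remember it */
    return DBL_PENDING;
}

/* Usage: "dy" cancels, "dd" is linewise. */
int dbl_demo(void)
{
    int pending = DBL_OP_NOP;

    assert(dbl_operator(&pending, 'd') == DBL_PENDING);
    assert(dbl_operator(&pending, 'y') == DBL_CANCELLED);
    assert(pending == DBL_OP_NOP);
    assert(dbl_operator(&pending, 'd') == DBL_PENDING);
    assert(dbl_operator(&pending, 'd') == DBL_LINEWISE);
    return 0;
}
```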
text object
Some ranges in Vim are not determined by a motion but selected with a text object. Vim's text-objects documentation describes them:
This is a series of commands that can only be used while in Visual mode or
after an operator. The commands that start with "a" select "a"n object
including white space, the commands starting with "i" select an "inner" object
without white space, or just the white space. Thus the "inner" commands
always select less text than the "a" commands.
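Note that a and i are already Vim's append and insert commands. So what does Vim do when it reads an a? Knowing that the pending operator is stored separately, the implementation is easy to guess: if an operator is pending (or Visual mode is active), the character is treated as a text object; otherwise it is an ordinary command.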
/*
* Handle "A", "a", "I", "i" and <Insert> commands.
* Also handle K_PS, start bracketed paste.
*/
static void
nv_edit(cmdarg_T *cap)
{
// <Insert> is equal to "i"
if (cap->cmdchar == K_INS || cap->cmdchar == K_KINS)
cap->cmdchar = 'i';
// in Visual mode "A" and "I" are an operator
if (VIsual_active && (cap->cmdchar == 'A' || cap->cmdchar == 'I'))
{
#ifdef FEAT_TERMINAL
if (term_in_normal_mode())
{
end_visual_mode();
clearop(cap->oap);
term_enter_job_mode();
return;
}
#endif
v_visop(cap);
}
// in Visual mode and after an operator "a" and "i" are for text objects
else if ((cap->cmdchar == 'a' || cap->cmdchar == 'i')
&& (cap->oap->op_type != OP_NOP || VIsual_active))
{
nv_object(cap);
}
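The branch conditions in the nv_edit excerpt reduce to a small decision function. The sketch below uses hypothetical names (obj_role, obj_edit_role) and covers only what the excerpt shows; the terminal-window special case is left out, and anything the excerpt does not cover is reported as OBJ_NOT_COVERED rather than guessed.

```c
#include <assert.h>

/* Hypothetical roles a key handled by nv_edit() can play in this sketch. */
enum obj_role {
    OBJ_INSERT_CMD,     /* ordinary "a", "i", "A", "I" */
    OBJ_TEXT_OBJECT,    /* "a"/"i" after an operator or in Visual mode */
    OBJ_VISUAL_OP,      /* "A"/"I" in Visual mode */
    OBJ_NOT_COVERED     /* a case the excerpt does not show */
};

static enum obj_role obj_edit_role(int cmdchar, int op_pending, int visual_active)
{
    if (visual_active && (cmdchar == 'A' || cmdchar == 'I'))
        return OBJ_VISUAL_OP;
    if ((cmdchar == 'a' || cmdchar == 'i') && (op_pending || visual_active))
        return OBJ_TEXT_OBJECT;
    if (!op_pending)
        return OBJ_INSERT_CMD;
    return OBJ_NOT_COVERED;
}

/* Usage: the same "i" is a command on its own and an object after "d". */
int obj_demo(void)
{
    assert(obj_edit_role('i', 0, 0) == OBJ_INSERT_CMD);
    assert(obj_edit_role('i', 1, 0) == OBJ_TEXT_OBJECT);
    assert(obj_edit_role('a', 0, 1) == OBJ_TEXT_OBJECT);
    assert(obj_edit_role('A', 0, 1) == OBJ_VISUAL_OP);
    return 0;
}
```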
NV_NCH_ALW
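Some entries in Vim's command table carry the NV_NCH_ALW flag, which means "always get a second char". The common prefix families introduced by g, z, ] and [ all require a second character. If the second character does not form a recognized combination, the pending operator is cleared (clearopbeep in the default branch), and the second character is not pushed back to be processed again.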
/*
* Commands starting with "g".
*/
static void
nv_g_cmd(cmdarg_T *cap)
{
oparg_T *oap = cap->oap;
int i;
switch (cap->nchar)
{
case '+':
case '-': // "g+" and "g-": undo or redo along the timeline
if (!checkclearopq(oap))
undo_time(cap->nchar == '-' ? -cap->count1 : cap->count1,
FALSE, FALSE, FALSE);
break;
default:
clearopbeep(oap);
break;
}
}
///...
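The effect on a sequence such as dgA can be traced with a toy key loop. This is a sketch only: the nch_* names are hypothetical, the set of accepted characters after g is a small arbitrary subset, and "A" is modelled simply as "start insert" when it arrives as a first character.

```c
#include <assert.h>
#include <string.h>

typedef struct {
    int op_pending;      /* 1 while an operator such as "d" is waiting */
    int beeped;          /* 1 once an invalid sequence was rejected */
    int insert_started;  /* 1 if "A" ran as a command of its own */
} nch_state;

/* Second characters this sketch accepts after "g" (arbitrary subset). */
static int nch_g_known(int nchar)
{
    return nchar == 'g' || nchar == 'e' || nchar == 'j' || nchar == 'k';
}

static void nch_run(nch_state *st, const char *keys)
{
    size_t i = 0, n = strlen(keys);

    while (i < n) {
        int c = keys[i++];

        if (c == 'd') {
            st->op_pending = 1;
        } else if (c == 'g') {
            int nchar;

            if (i >= n)
                break;              /* still waiting for the second char */
            nchar = keys[i++];      /* always consumed, valid or not */
            if (!nch_g_known(nchar))
                st->beeped = 1;     /* like clearopbeep() in the default case */
            st->op_pending = 0;     /* operator either ran or was cleared */
        } else if (c == 'A') {
            st->insert_started = 1; /* reached only as a first character */
        }
    }
}

/* Usage: in "dgA" the A is swallowed by g and never starts an append. */
int nch_demo(void)
{
    nch_state st = { 0, 0, 0 };

    nch_run(&st, "dgA");
    assert(st.op_pending == 0);
    assert(st.beeped == 1);
    assert(st.insert_started == 0);
    return 0;
}
```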
do/dp
In vimdiff mode, do and dp are built-in Vim commands, not mappings defined for that mode. Because d, o and p are all existing Vim commands, even these two ordinary-looking commands need special care inside Vim.
/*
* "o" and "O" commands.
*/
static void
nv_open(cmdarg_T *cap)
{
#ifdef FEAT_DIFF
// "do" is ":diffget"
if (cap->oap->op_type == OP_DELETE && cap->cmdchar == 'o')
{
clearop(cap->oap);
nv_diffgetput(FALSE, cap->opcount);
}
else
#endif
if (VIsual_active) // switch start and end of visual
v_swap_corners(cap->cmdchar);
#ifdef FEAT_JOB_CHANNEL
else if (bt_prompt(curbuf))
clearopbeep(cap->oap);
#endif
else
n_opencmd(cap);
}
/*
* "P", "gP", "p" and "gp" commands.
* "fix_indent" is TRUE for "[p", "[P", "]p" and "]P".
*/
static void
nv_put_opt(cmdarg_T *cap, int fix_indent)
{
int regname = 0;
void *reg1 = NULL, *reg2 = NULL;
int empty = FALSE;
int was_visual = FALSE;
int dir;
int flags = 0;
int keep_registers = FALSE;
#ifdef FEAT_FOLDING
int save_fen = curwin->w_p_fen;
#endif
if (cap->oap->op_type != OP_NOP)
{
#ifdef FEAT_DIFF
// "dp" is ":diffput"
if (cap->oap->op_type == OP_DELETE && cap->cmdchar == 'p')
{
clearop(cap->oap);
nv_diffgetput(TRUE, cap->opcount);
}
else
#endif
clearopbeep(cap->oap);
return;
}
}
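The two excerpts amount to a dispatch on the pair (command character, pending operator). The sketch below restates that with hypothetical names (diff_action, diff_dispatch); it leaves out the Visual-mode and prompt-buffer branches of nv_open and does not model the count that is passed on as the [bufspec] argument.

```c
#include <assert.h>

enum { DIFF_OP_NOP = 0, DIFF_OP_DELETE, DIFF_OP_YANK };

/* Hypothetical outcome labels for this sketch. */
enum diff_action {
    ACT_DIFFGET,    /* "do" */
    ACT_DIFFPUT,    /* "dp" */
    ACT_OPENCMD,    /* handed on to n_opencmd() */
    ACT_PUT,        /* ordinary put */
    ACT_BEEP,       /* pending operator cleared with a beep */
    ACT_NONE        /* key not handled by this sketch */
};

static enum diff_action diff_dispatch(int cmdchar, int op_type)
{
    if (cmdchar == 'o') {
        if (op_type == DIFF_OP_DELETE)
            return ACT_DIFFGET;
        return ACT_OPENCMD;     /* Visual and prompt-buffer cases omitted */
    }
    if (cmdchar == 'p') {
        if (op_type == DIFF_OP_NOP)
            return ACT_PUT;
        if (op_type == DIFF_OP_DELETE)
            return ACT_DIFFPUT;
        return ACT_BEEP;        /* e.g. "yp" */
    }
    return ACT_NONE;
}

/* Usage: "o" and "p" change meaning only when a delete is pending. */
int diff_demo(void)
{
    assert(diff_dispatch('o', DIFF_OP_DELETE) == ACT_DIFFGET);
    assert(diff_dispatch('p', DIFF_OP_DELETE) == ACT_DIFFPUT);
    assert(diff_dispatch('o', DIFF_OP_NOP) == ACT_OPENCMD);
    assert(diff_dispatch('p', DIFF_OP_NOP) == ACT_PUT);
    assert(diff_dispatch('p', DIFF_OP_YANK) == ACT_BEEP);
    return 0;
}
```

The point of the design is that do and dp need no table entries of their own: the handlers for o and p check for a pending delete before doing anything else.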
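Back to the topic
Because d is an operator, it does not depend on the g-prefixed command that follows it. When Vim reads g, the NV_NCH_ALW flag on that entry means one more character must be typed after g, even if the result is not a valid g command.
In addition, ] and [ (as in the ]c and [c jumps used in diff mode) are also configured with NV_NCH_ALW.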
outro
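In short: an operator is a special kind of command. What makes it special is that its information is recorded in its own storage, separate from that of ordinary commands.
Features that look basic, even self-evident, may well have gone through trade-offs and compromises on the way to being implemented.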