How vim parses commands, seen through why the vimdiff get command is not dg

intro

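When using vimdiff to pull a diff hunk from the other file, the Ex-mode command is diffget, yet the corresponding Normal-mode command is not the expected dg but do (diff obtain), which is somewhat surprising.

As for the narrow question "why does vim use do rather than dg to obtain a diff?", the vim help entry for "do" already answers it explicitly: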

[count]do Same as ":diffget" without range. The "o" stands for "obtain"
("dg" can't be used, it could be the start of "dgg"!). Note:
this doesn't work in Visual mode.
If you give a [count], it is used as the [bufspec] argument
for ":diffget".

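In short, dg could be the prefix of the dgg command, and of other commands too (dge, for example). d is an operator, g is the lead character of many motions, and "operator + motion" is vim's classic editing pattern.

How does vim parse commands like these, made of an operator followed by a multi-character command (gg, zO and so on)? Many commands start with g, but not every character after g forms a valid vim command. If vim meets a two-character sequence it does not recognize (gA, for example), does it treat A as a standalone Append? (This particular question is easy to settle by trying it in vim.)

And what does vim do when the input is "operator + operator" rather than "operator + motion"?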

Operator-pending

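Before executing a command, vim stores the information it has collected in a cmdarg_T structure. Besides the expected cmdchar field, this structure also holds a pointer (oap) to a separate oparg_T structure, whose op_type field evidently records the operator type.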

//@file: vim\src\structs.h
/*
 * Arguments for operators.
 */
typedef struct oparg_S
{
    int		op_type;	// current pending operator type
    int		regname;	// register to use for the operator
    int		motion_type;	// type of the current cursor motion
    int		motion_force;	// force motion type: 'v', 'V' or CTRL-V
    int		use_reg_one;	// TRUE if delete uses reg 1 even when not
				// linewise
    int		inclusive;	// TRUE if char motion is inclusive (only
				// valid when motion_type is MCHAR)
    int		end_adjusted;	// backuped b_op_end one char (only used by
				// do_format())
    pos_T	start;		// start of the operator
    pos_T	end;		// end of the operator
    pos_T	cursor_start;	// cursor position before motion for "gw"

    long	line_count;	// number of lines from op_start to op_end
				// (inclusive)
    int		empty;		// op_start and op_end the same (only used by
				// do_change())
    int		is_VIsual;	// operator on Visual area
    int		block_mode;	// current operator is Visual block mode
    colnr_T	start_vcol;	// start col for block mode operator
    colnr_T	end_vcol;	// end col for block mode operator
    long	prev_opcount;	// ca.opcount saved for K_CURSORHOLD
    long	prev_count0;	// ca.count0 saved for K_CURSORHOLD
    int		excl_tr_ws;	// exclude trailing whitespace for yank of a
				// block
} oparg_T;

/*
 * Arguments for Normal mode commands.
 */
typedef struct cmdarg_S
{
    oparg_T	*oap;		// Operator arguments
    int		prechar;	// prefix character (optional, always 'g')
    int		cmdchar;	// command character
    int		nchar;		// next command character (optional)
    int		ncharC1;	// first composing character (optional)
    int		ncharC2;	// second composing character (optional)
    int		extra_char;	// yet another character (optional)
    long	opcount;	// count before an operator
    long	count0;		// count before command, default 0
    long	count1;		// count before command, default 1
    int		arg;		// extra argument from nv_cmds[]
    int		retval;		// return: CA_* values
    char_u	*searchbuf;	// return: pointer to search pattern or NULL
} cmdarg_T;

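Operators and non-operators are parsed through the same flow. The difference is that an operator does not stay in the cmdchar field of cmdarg_T; it is recorded in the op_type field of oparg_T. In other words, operator-related state lives mainly in the separate oparg_T structure.

This means vim has to recognize operators and handle them differently from other commands.

That operators and non-operators go through the same flow is also stated in vim's introductory help, intro.txt: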

Operator-pending mode This is like Normal mode, but after an operator
command has started, and Vim is waiting for a {motion}
to specify the text that the operator will work on.

In vim's implementation, "pending" means exactly this: the operator is parked in the separate oparg_T structure.

operator

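Operators are recognized through the same flow as regular commands: when a character is read, vim looks up the matching entry in the nv_cmds table. Each entry carries some flags and the function that executes the command. The familiar operators such as change and delete all map to the nv_operator function.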

// Values for cmd_flags.
#define NV_NCH	    0x01	  // may need to get a second char
#define NV_NCH_NOP  (0x02|NV_NCH) // get second char when no operator pending
#define NV_NCH_ALW  (0x04|NV_NCH) // always get a second char
#define NV_LANG	    0x08	// second char needs language adjustment

#define NV_SS	    0x10	// may start selection
#define NV_SSS	    0x20	// may start selection with shift modifier
#define NV_STS	    0x40	// may stop selection without shift modif.
#define NV_RL	    0x80	// 'rightleft' modifies command
#define NV_KEEPREG  0x100	// don't clear regname
#define NV_NCW	    0x200	// not allowed in command-line window

/*
 * Generally speaking, every Normal mode command should either clear any
 * pending operator (with *clearop*()), or set the motion type variable
 * oap->motion_type.
 *
 * When a cursor motion command is made, it is marked as being a character or
 * line oriented motion.  Then, if an operator is in effect, the operation
 * becomes character or line oriented accordingly.
 */

/*
 * This table contains one entry for every Normal or Visual mode command.
 * The order doesn't matter, this will be sorted by the create_nvcmdidx.vim
 * script to generate the nv_cmd_idx[] lookup table.
 * It is faster when all keys from zero to '~' are present.
 */
static const struct nv_cmd
{
    int		cmd_char;	// (first) command character
    nv_func_T   cmd_func;	// function for this command
    short_u	cmd_flags;	// NV_ flags
    short	cmd_arg;	// value for ca.arg
} nv_cmds[] =

#else  // DO_DECLARE_NVCMD

/*
 * Used when creating nv_cmdidxs.h.
 */
# define NVCMD(a, b, c, d)  a
static const int nv_cmds[] =

#endif // DO_DECLARE_NVCMD
{
///...
    NVCMD(' ',		nv_right,	0,			0),
    NVCMD('!',		nv_operator,	0,			0),
    NVCMD('"',		nv_regname,	NV_NCH_NOP|NV_KEEPREG,	0),
    NVCMD('#',		nv_ident,	0,			0),
///...
    NVCMD('c',		nv_operator,	0,			0),
    NVCMD('d',		nv_operator,	0,			0),
///...
    NVCMD('f',		nv_csearch,	NV_NCH_ALW|NV_LANG,	FORWARD),
    NVCMD('g',		nv_g_cmd,	NV_NCH_ALW,		FALSE),
///...
    NVCMD('y',		nv_operator,	0,			0),
    NVCMD('z',		nv_zet,		NV_NCH_ALW,		0),
    NVCMD('{',		nv_findpar,	0,			BACKWARD),
///...
}
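
To make the role of the flags concrete, here is a minimal sketch of a table lookup that decides whether one more key has to be read before a command can be dispatched. The TOY_ names, the four-entry table and the decision rule are assumptions for illustration: the flag values are copied from the excerpt above, but how vim itself consults them is not shown there, so the rule is inferred from the flag comments only.

```c
#include <assert.h>
#include <stddef.h>

/* Same bit layout as the NV_NCH* flags quoted above (toy copies). */
#define TOY_NCH      0x01u
#define TOY_NCH_NOP  (0x02u | TOY_NCH)  /* second char only when no operator pending */
#define TOY_NCH_ALW  (0x04u | TOY_NCH)  /* always a second char */

struct toy_cmd {
    int      cmd_char;   /* first command character */
    unsigned cmd_flags;  /* TOY_ flags */
};

/* Hypothetical four-entry table; the flags match the quoted nv_cmds rows. */
static const struct toy_cmd toy_cmds[] = {
    { '"', TOY_NCH_NOP },
    { 'd', 0u },
    { 'g', TOY_NCH_ALW },
    { 'z', TOY_NCH_ALW },
};

/* Linear search; returns NULL when the key is not in the toy table. */
static const struct toy_cmd *toy_find(int c)
{
    size_t i;

    for (i = 0; i < sizeof(toy_cmds) / sizeof(toy_cmds[0]); ++i)
        if (toy_cmds[i].cmd_char == c)
            return &toy_cmds[i];
    return NULL;
}

/* Return 1 if one more key must be read before dispatching `c`. */
static int toy_needs_second_char(int c, int op_pending)
{
    const struct toy_cmd *cmd = toy_find(c);

    assert(op_pending == 0 || op_pending == 1);
    if (cmd == NULL || (cmd->cmd_flags & TOY_NCH) == 0)
        return 0;
    if ((cmd->cmd_flags & TOY_NCH_ALW) == TOY_NCH_ALW)
        return 1;
    if ((cmd->cmd_flags & TOY_NCH_NOP) == TOY_NCH_NOP)
        return !op_pending;
    return 0;
}
```

In this sketch toy_needs_second_char('g', 1) is 1 whether or not an operator is pending, while toy_needs_second_char('d', 0) is 0: the operator itself never waits for a second key, the g after it does.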

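nv_operator looks up the operator type (via get_op_type) and records it in the cap->oap structure.

The implementation of nv_operator has one more interesting detail: if an operator is already pending and a different operator is typed, the two cancel out. checkclearop clears the pending operator with a beep, and the newly typed operator does not take effect either. Typing the same operator twice (dd, for example) is the exception: it is handled as an operation on lines.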


/*
 * Check for operator active and clear it.
 *
 * Beep and return TRUE if an operator was active.
 */
    static int
checkclearop(oparg_T *oap)
{
    if (oap->op_type == OP_NOP)
	return FALSE;
    clearopbeep(oap);
    return TRUE;
}

/*
 * Handle an operator command.
 * The actual work is done by do_pending_operator().
 */
    static void
nv_operator(cmdarg_T *cap)
{
    int	    op_type;

    op_type = get_op_type(cap->cmdchar, cap->nchar);
#ifdef FEAT_JOB_CHANNEL
    if (bt_prompt(curbuf) && op_is_change(op_type) && !prompt_curpos_editable())
    {
	clearopbeep(cap->oap);
	return;
    }
#endif

    if (op_type == cap->oap->op_type)	    // double operator works on lines
	nv_lineop(cap);
    else if (!checkclearop(cap->oap))
    {
	cap->oap->start = curwin->w_cursor;
	cap->oap->op_type = op_type;
#ifdef FEAT_EVAL
	set_op_var(op_type);
#endif
    }
}
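
The three branches of nv_operator() can be replayed on a toy state machine. This is only a sketch under simplifying assumptions: the toy_ types are invented here, the line operation is treated as finished immediately (in vim the real work happens later in do_pending_operator()), and the prompt-buffer check is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the operator type and oparg_T; not vim's definitions. */
enum toy_op { TOY_OP_NOP = 0, TOY_OP_DELETE, TOY_OP_YANK };
enum toy_result { TOY_PENDING, TOY_LINEWISE, TOY_CANCELLED };

struct toy_oparg {
    enum toy_op op_type;  /* currently pending operator, or TOY_OP_NOP */
};

/* Feed one operator key and report which branch was taken. */
static enum toy_result toy_operator(struct toy_oparg *oap, enum toy_op op)
{
    assert(oap != NULL && op != TOY_OP_NOP);

    if (op == oap->op_type) {          /* "dd", "yy": doubled operator */
        oap->op_type = TOY_OP_NOP;     /* toy: the line operation is done */
        return TOY_LINEWISE;
    }
    if (oap->op_type != TOY_OP_NOP) {  /* "dy": a different operator is pending */
        oap->op_type = TOY_OP_NOP;     /* pending one is cleared, new one ignored */
        return TOY_CANCELLED;
    }
    oap->op_type = op;                 /* first operator: park it */
    return TOY_PENDING;
}
```

Feeding d then d yields TOY_PENDING followed by TOY_LINEWISE; feeding d then y yields TOY_PENDING followed by TOY_CANCELLED, and nothing is left pending afterwards.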

text object

Some ranges in vim are not determined by a motion but selected with a text object. vim's text-objects documentation explains them:

This is a series of commands that can only be used while in Visual mode or
after an operator. The commands that start with "a" select "a"n object
including white space, the commands starting with "i" select an "inner" object
without white space, or just the white space. Thus the "inner" commands
always select less text than the "a" commands.

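Note that a and i already exist in vim as the append and insert commands. So what does vim do when it reads an a? Knowing that the pending operator is stored separately, the implementation is easy to guess: if an operator is pending (or Visual mode is active), the character introduces a text object; otherwise it is an ordinary command.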


/*
 * Handle "A", "a", "I", "i" and <Insert> commands.
 * Also handle K_PS, start bracketed paste.
 */
    static void
nv_edit(cmdarg_T *cap)
{
    // <Insert> is equal to "i"
    if (cap->cmdchar == K_INS || cap->cmdchar == K_KINS)
	cap->cmdchar = 'i';

    // in Visual mode "A" and "I" are an operator
    if (VIsual_active && (cap->cmdchar == 'A' || cap->cmdchar == 'I'))
    {
#ifdef FEAT_TERMINAL
	if (term_in_normal_mode())
	{
	    end_visual_mode();
	    clearop(cap->oap);
	    term_enter_job_mode();
	    return;
	}
#endif
	v_visop(cap);
    }

    // in Visual mode and after an operator "a" and "i" are for text objects
    else if ((cap->cmdchar == 'a' || cap->cmdchar == 'i')
	    && (cap->oap->op_type != OP_NOP || VIsual_active))
    {
	nv_object(cap);
    }
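
The branching in this excerpt can be condensed into a small classifier. It is a sketch, not vim code: the toy_ names are made up, the terminal special case is dropped, and anything that reaches the part of nv_edit() not quoted above is reported as TOY_REST_OF_NV_EDIT rather than guessed at.

```c
#include <assert.h>
#include <stddef.h>

enum toy_action {
    TOY_VISUAL_OP,        /* "A"/"I" in Visual mode: handled like an operator */
    TOY_TEXT_OBJECT,      /* "a"/"i" after an operator or in Visual mode */
    TOY_REST_OF_NV_EDIT   /* falls through to code not shown in the excerpt */
};

/* Classify 'a', 'i', 'A', 'I' the way the quoted branches do. */
static enum toy_action toy_edit(int cmdchar, int op_pending, int visual_active)
{
    assert(cmdchar == 'a' || cmdchar == 'i'
           || cmdchar == 'A' || cmdchar == 'I');

    if (visual_active && (cmdchar == 'A' || cmdchar == 'I'))
        return TOY_VISUAL_OP;
    if ((cmdchar == 'a' || cmdchar == 'i') && (op_pending || visual_active))
        return TOY_TEXT_OBJECT;
    return TOY_REST_OF_NV_EDIT;
}
```

The same key therefore means two things: toy_edit('a', 1, 0) is TOY_TEXT_OBJECT (as in daw), while toy_edit('a', 0, 0) falls through to the regular editing path.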

NV_NCH_ALW

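Some entries in vim's command table carry the NV_NCH_ALW flag, which means "always get a second char". The common extension families, the commands introduced by g, z, ] and [, always require a second character. If that second character does not form a recognized combination, vim clears the pending operator (with a beep) and does not push the second character back to be processed again. So gA does not fall back to treating A as Append.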


/*
 * Commands starting with "g".
 */
    static void
nv_g_cmd(cmdarg_T *cap)
{
    oparg_T	*oap = cap->oap;
    int		i;

    switch (cap->nchar)
    {
    case '+':
    case '-': // "g+" and "g-": undo or redo along the timeline
	if (!checkclearopq(oap))
	    undo_time(cap->nchar == '-' ? -cap->count1 : cap->count1,
							 FALSE, FALSE, FALSE);
	break;

    default:
	clearopbeep(oap);
	break;
    }
}    
    ///...
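
The default branch is what settles the gA question, and a toy version makes the effect visible. The sketch below is an assumption-laden miniature: the toy_ names are invented, only two recognized second characters are modelled, and the beep is just a flag.

```c
#include <assert.h>
#include <stddef.h>

enum toy_g_result { TOY_G_ACCEPTED, TOY_G_REJECTED };

struct toy_state {
    int op_pending;  /* 1 while an operator such as "d" is waiting */
    int beeped;      /* set when the toy equivalent of clearopbeep() runs */
};

/* Handle the character typed after "g". */
static enum toy_g_result toy_g_cmd(struct toy_state *st, int nchar)
{
    assert(st != NULL);

    switch (nchar) {
    case 'g':  /* "gg" */
    case 'e':  /* "ge" */
        /* A recognized motion: a pending operator would be applied to it. */
        return TOY_G_ACCEPTED;
    default:
        /* Unknown combination: drop the pending operator and beep.
         * nchar is consumed here; it is never re-read as its own command. */
        st->op_pending = 0;
        st->beeped = 1;
        return TOY_G_REJECTED;
    }
}
```

With an operator pending, toy_g_cmd(&st, 'A') returns TOY_G_REJECTED and leaves st.op_pending at 0, which mirrors how dgA ends up doing nothing in this model.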

do/dp

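In vimdiff mode, do and dp are both built-in vim commands (not features implemented through mappings in that mode). But since d, o and p are all existing vim commands, even these two ordinary-looking commands need special care inside vim.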


/*
 * "o" and "O" commands.
 */
    static void
nv_open(cmdarg_T *cap)
{
#ifdef FEAT_DIFF
    // "do" is ":diffget"
    if (cap->oap->op_type == OP_DELETE && cap->cmdchar == 'o')
    {
	clearop(cap->oap);
	nv_diffgetput(FALSE, cap->opcount);
    }
    else
#endif
    if (VIsual_active)  // switch start and end of visual
	v_swap_corners(cap->cmdchar);
#ifdef FEAT_JOB_CHANNEL
    else if (bt_prompt(curbuf))
	clearopbeep(cap->oap);
#endif
    else
	n_opencmd(cap);
}

/*
 * "P", "gP", "p" and "gp" commands.
 * "fix_indent" is TRUE for "[p", "[P", "]p" and "]P".
 */
    static void
nv_put_opt(cmdarg_T *cap, int fix_indent)
{
    int		regname = 0;
    void	*reg1 = NULL, *reg2 = NULL;
    int		empty = FALSE;
    int		was_visual = FALSE;
    int		dir;
    int		flags = 0;
    int		keep_registers = FALSE;
#ifdef FEAT_FOLDING
    int		save_fen = curwin->w_p_fen;
#endif

    if (cap->oap->op_type != OP_NOP)
    {
#ifdef FEAT_DIFF
	// "dp" is ":diffput"
	if (cap->oap->op_type == OP_DELETE && cap->cmdchar == 'p')
	{
	    clearop(cap->oap);
	    nv_diffgetput(TRUE, cap->opcount);
	}
	else
#endif
	    clearopbeep(cap->oap);
	return;
    }
}
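
Both excerpts follow the same pattern: look at the pending operator first, and only then fall back to the key's usual meaning. A compact sketch of that dispatch, with invented toy_ names and with Visual mode, prompt buffers and the FEAT_DIFF guard left out as simplifying assumptions:

```c
#include <assert.h>
#include <stddef.h>

enum toy_op { TOY_OP_NOP = 0, TOY_OP_DELETE, TOY_OP_YANK };

enum toy_action {
    TOY_DIFFGET,  /* "do" */
    TOY_DIFFPUT,  /* "dp" */
    TOY_BEEP,     /* "p" after some other operator: operator cleared with a beep */
    TOY_OPENCMD,  /* "o" handed on to the regular open-line code */
    TOY_PUT       /* plain "p": the part of nv_put_opt() not quoted above */
};

/* Decide what 'o' or 'p' means given the pending operator. */
static enum toy_action toy_o_or_p(int cmdchar, enum toy_op pending)
{
    assert(cmdchar == 'o' || cmdchar == 'p');

    if (pending == TOY_OP_DELETE)
        return cmdchar == 'o' ? TOY_DIFFGET : TOY_DIFFPUT;
    if (cmdchar == 'o')
        return TOY_OPENCMD;
    return pending == TOY_OP_NOP ? TOY_PUT : TOY_BEEP;
}
```

So toy_o_or_p('o', TOY_OP_DELETE) is TOY_DIFFGET, while the same key without a pending d is just the open-line command.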

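Back to the topic

Because d is an operator, it has no dependency on the g-prefixed command that follows it. When vim reaches g, that character carries the NV_NCH_ALW flag, so one more character must be typed after g (even if the result is not a valid g command). dg on its own can therefore never be a complete command: vim is still waiting for the character after g.

In addition, the ] and [ prefixes used in diff mode (as in ]c and [c) are also configured with NV_NCH_ALW.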

outro

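In short: an operator is a special kind of cmd. What makes it special is that its information is recorded in a separate storage location, independent of regular cmds.

Features that look basic or even self-evident may well have gone through trade-offs and compromises in their implementation.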

posted on 2024-08-09 19:26 by tsecer
