YouCompleteMe's completer and parser

intro

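When YCM completes C++ input, one detail stands out: the candidates usually carry type information only right after you type ".", "->", or "::". For functions, the candidate shows the parameter types and related information; for data members, it shows the member's type as well.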

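In every other situation the candidates may still include variable names, but the preview window shows no type information for them. Worse, those candidates usually come only from the current file, so if you only half remember a function (say, a third-party interface), the candidate list drops a lot of valuable information.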

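Intuitively, YCM's C++ backend parser sees every identifier once the included headers are expanded, and it already offers go-to-declaration (YcmCompleter GoToDeclaration), so looking up type information for any identifier in the translation unit should not be hard.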

So how can this be done?

completer

Client

When the Vim client triggers a completion, it calls SendCompletionRequest to send the request, and the request data carries a force_semantic parameter.

#@file: YouCompleteMe/python/ycm/youcompleteme.py
def SendCompletionRequest( self, force_semantic = False ):
	request_data = BuildRequestData()
	request_data[ 'force_semantic' ] = force_semantic
	#....
	self._latest_completion_request = CompletionRequest( request_data )
	#....

Next, CompletionRequest posts the data to the server's "completions" handler; ycmd recognizes the request as a completion request from that URL path.

#@file: YouCompleteMe/python/ycm/client/completion_request.py
class CompletionRequest( BaseRequest ):
  def Start( self ):
    self._response_future = self.PostDataToHandlerAsync( self.request_data,
                                                         'completions' )

ycmd

When the server handles a completion request, it calls ShouldUseFiletypeCompleter to decide whether to use a filetype-specific completer. For C++, that completer is clangd.

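If the filetype-specific completer (GetFiletypeCompleter) is not used or returns no candidates, and semantic completion was not forced, the server falls back to the general mechanism, the identifier-based completer (GetGeneralCompleter).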

#@file: YouCompleteMe/third_party/ycmd/ycmd/handlers.py
@app.post( '/completions' )
def GetCompletions():
  request_data = RequestWrap( request.json )
  do_filetype_completion = _server_state.ShouldUseFiletypeCompleter(
    request_data )
  LOGGER.debug( 'Using filetype completion: %s', do_filetype_completion )

  errors = None
  completions = None

  if do_filetype_completion:
    try:
      filetype_completer = _server_state.GetFiletypeCompleter(
        request_data[ 'filetypes' ] ) 
      completions = filetype_completer.ComputeCandidates( request_data )
    except Exception as exception:
      if request_data[ 'force_semantic' ]:
        # user explicitly asked for semantic completion, so just pass the error
        # back
        raise

      # store the error to be returned with results from the identifier
      # completer
      LOGGER.exception( 'Exception from semantic completer (using general)' )
      stack = traceback.format_exc()
      errors = [ BuildExceptionResponse( exception, stack ) ]

  if not completions and not request_data[ 'force_semantic' ]:
    completions = _server_state.GetGeneralCompleter().ComputeCandidates(
      request_data )

  return _JsonResponse(
      BuildCompletionResponse( completions if completions else [],
                               request_data[ 'start_column' ],
                               errors = errors ) )
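
The fallback logic of the handler above can be condensed into a small stand-alone model. This is only a sketch under simplifying assumptions: FakeCompleter and get_completions are hypothetical names that do not exist in ycmd, the request is a plain dict, and the exception handling of the real handler is left out.

```python
# Simplified model of the /completions handler shown above.
# FakeCompleter and get_completions are hypothetical, not part of ycmd.

class FakeCompleter:
    def __init__(self, candidates, usable_now=True):
        self.candidates = candidates
        self.usable_now = usable_now

    def should_use_now(self, request):
        # Stands in for the trigger check of a real completer.
        return self.usable_now

    def compute_candidates(self, request):
        return list(self.candidates)


def get_completions(request, semantic, general):
    """Pick candidates the way the handler does, minus error handling."""
    forced = request.get("force_semantic", False)

    # Mirrors ShouldUseFiletypeCompleter: forcing wins, otherwise the
    # semantic completer is used only if its triggers match.
    use_semantic = semantic is not None and (
        forced or semantic.should_use_now(request))

    completions = semantic.compute_candidates(request) if use_semantic else None

    # Fall back to the identifier completer only when nothing came back
    # and the user did not explicitly ask for semantic completion.
    if not completions and not forced:
        completions = general.compute_candidates(request)

    return completions if completions else []
```

With a semantic completer whose triggers do not match, an unforced request ends up with identifier candidates, while the same request with force_semantic set gets the semantic candidates and never falls back.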

ShouldUseFiletypeCompleter checks whether the request set the force_semantic option:

#@file: YouCompleteMe/third_party/ycmd/ycmd/server_state.py
  def ShouldUseFiletypeCompleter( self, request_data ):
    """Determines whether or not the semantic completion should be called for
    completion request."""
    filetypes = request_data[ 'filetypes' ]
    if not self.FiletypeCompletionUsable( filetypes ):
      # don't use semantic, ignore whether or not the user requested forced
      # completion as that's not relevant to signatures.
      return False 

    if request_data[ 'force_semantic' ]:
      # use semantic, and it was forced
      return True

    filetype_completer = self.GetFiletypeCompleter( filetypes )
    # was not forced. check the conditions for triggering
    return filetype_completer.ShouldUseNow( request_data )

#@file: YouCompleteMe/third_party/ycmd/ycmd/completers/completer.py
  def ShouldUseNowInner( self, request_data ):
    if not self.completion_triggers:
      return False

    current_line = request_data[ 'line_value' ]
    start_codepoint = request_data[ 'start_codepoint' ] - 1 
    column_codepoint = request_data[ 'column_codepoint' ] - 1 
    filetype = self._CurrentFiletype( request_data[ 'filetypes' ] ) 

    return self.completion_triggers.MatchesForFiletype( current_line,
                                                        start_codepoint,
                                                        column_codepoint,
                                                        filetype )

If force_semantic is not set, the server checks whether the typed text matches one of the language-specific triggers.

In the default filetype configuration, C++ uses the familiar '->', '.', and '::'.

#@file: YouCompleteMe/third_party/ycmd/ycmd/completers/completer_utils.py
DEFAULT_FILETYPE_TRIGGERS = {
  'c' : [ '->', '.' ],
  'objc,objcpp' : [
    '->',
    '.',
    r're!\[[_a-zA-Z]+\w*\s',    # bracketed calls
    r're!^\s*[^\W\d]\w*\s',     # bracketless calls
    r're!\[.*\]\s',             # method composition
  ],
  'ocaml' : [ '.', '#' ],
  'cpp,cuda,objcpp,cs' : [ '->', '.', '::' ],
  'perl' : [ '->' ],
  'php' : [ '->', '::' ],
  ( 'd,'
    'elixir,'
    'go,'
    'gdscript,'
    'groovy,'
    'java,'
    'javascript,'
    'javascriptreact,'
    'julia,'
    'perl6,'
    'python,'
    'scala,'
    'typescript,'
    'typescriptreact,'
    'vb' ) : [ '.' ],
  'ruby,rust' : [ '.', '::' ],
  'lua' : [ '.', ':' ],
  'erlang' : [ ':' ],
}   
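
To make the trigger check concrete, here is a minimal sketch of how literal triggers could be matched against the text in front of the cursor. It is an illustration, not ycmd's implementation: identifier_start and triggers_semantic are hypothetical helpers, columns are 0-based, and the regex triggers (the entries beginning with re!) are ignored.

```python
CPP_TRIGGERS = ["->", ".", "::"]


def identifier_start(line, column):
    """Return the 0-based index where the identifier ending at column begins."""
    start = column
    while start > 0 and (line[start - 1].isalnum() or line[start - 1] == "_"):
        start -= 1
    return start


def triggers_semantic(line, column, triggers=CPP_TRIGGERS):
    """True if a trigger sits directly before the identifier being typed."""
    prefix = line[:identifier_start(line, column)]
    return any(prefix.endswith(trigger) for trigger in triggers)
```

For example, triggers_semantic("obj.me", 6) is true because the identifier "me" is preceded by ".", while triggers_semantic("  foo", 5) is false, which is the case where only the identifier completer answers.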

parser

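YCM also reports syntax errors in the file. This feature clearly works differently from completion: a whole-file diagnostic requires compiling the entire file and then parsing the errors the compiler reports.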

As with completion, YCM goes through a call chain: OnFileReadyToParse >> self.CurrentBuffer().SendParseRequest( extra_data ) >> EventNotification( 'FileReadyToParse', extra_data = extra_data ). At that point ycmd has the backend compile the whole buffer.

Naturally, this full compile fires at different times than completion. Vim currently issues a reparse in the following situations:

leaving insert mode; opening a file once its filetype has been detected; and changing the buffer in normal mode.

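One detail: continuous editing in insert mode does not trigger a reparse. That is reasonable, because an unfinished edit would almost certainly produce (meaningless) syntax errors.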

" YouCompleteMe/autoload/youcompleteme.vim
function! s:OnInsertLeave()
  " ...
  call s:OnFileReadyToParse()
endfunction
function! s:OnTextChangedNormalMode()
  " ...
  call s:OnFileReadyToParse()
endfunction
function! s:OnFileTypeSet()
  " ...
endfunction

Manual semantic completion

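Back to the original question: a completion request can carry the "force_semantic" parameter to make ycmd perform semantic completion, and YCM provides a simple option to trigger it on demand: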

function! s:SetUpKeyMappings()

  if !empty( g:ycm_key_invoke_completion )
    let invoke_key = g:ycm_key_invoke_completion

    " Inside the console, <C-Space> is passed as <Nul> to Vim
    if invoke_key ==# '<C-Space>'
      imap <Nul> <C-Space>
    endif

    silent! exe 'inoremap <unique> <silent> ' . invoke_key .
          \ ' <C-R>=<SID>RequestSemanticCompletion()<CR>'
  endif

Note the key line: let s:force_semantic = 1

function! s:RequestSemanticCompletion() abort
  if !s:AllowedToCompleteInCurrentBuffer()
    return ''
  endif 

  if get( b:, 'ycm_completing' )
    let s:force_semantic = 1
    let s:current_cursor_position = getpos( '.' )
    call s:StopPoller( s:pollers.completion )
    py3 ycm_state.SendCompletionRequest( True )

Which key should be used? Going through Vim's insert-mode Ctrl shortcuts one by one shows that CTRL-B in insert mode is effectively unused:

CTRL-B in Insert mode gone i_CTRL-B-gone

When Vim was compiled with the +rightleft feature, you could use CTRL-B to
toggle the 'revins' option. Unfortunately, some people hit the 'B' key
accidentally when trying to type CTRL-V or CTRL-N and then didn't know how to
undo this. Since toggling the 'revins' option can easily be done with the
mapping below, this use of the CTRL-B key is disabled. You can still use the
CTRL-_ key for this i_CTRL-_.
:imap <C-B> <C-O>:set revins!<CR>

So adding the following line to the Vim configuration lets you trigger semantic completion manually in insert mode with Ctrl-B:

let g:ycm_key_invoke_completion = '<c-b>'

Minimum number of trigger characters

Even the general identifier_completer has an option controlling whether it triggers: ycm_min_num_of_chars_for_completion. The check, however, is done by ycmd rather than by the Vim client.

# YouCompleteMe/third_party/ycmd/ycmd/completers/all/identifier_completer.py
  def ShouldUseNow( self, request_data ):
    return self.QueryLengthAboveMinThreshold( request_data )
# YouCompleteMe/third_party/ycmd/ycmd/completers/completer.py 
  def QueryLengthAboveMinThreshold( self, request_data ):
    # Note: calculation in 'characters' not bytes.
    query_length = ( request_data[ 'column_codepoint' ] -
                     request_data[ 'start_codepoint' ] )

    return query_length >= self.min_num_chars
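
The threshold check is easy to reproduce outside ycmd. The sketch below is a simplified model: query_length_above_min_threshold and identifier_start are hypothetical helpers, columns are 0-based (ycmd's codepoint columns are 1-based, but the difference between the two columns is the same), and the default of 2 is the documented default of g:ycm_min_num_of_chars_for_completion.

```python
DEFAULT_MIN_NUM_CHARS = 2  # documented default of the YCM option


def identifier_start(line, column):
    """Return the 0-based index where the identifier ending at column begins."""
    start = column
    while start > 0 and (line[start - 1].isalnum() or line[start - 1] == "_"):
        start -= 1
    return start


def query_length_above_min_threshold(line, column,
                                     min_num_chars=DEFAULT_MIN_NUM_CHARS):
    """True once the identifier typed so far is long enough to complete."""
    # Counted in characters, not bytes, like the original.
    query_length = column - identifier_start(line, column)
    return query_length >= min_num_chars
```

With the default setting, typing "f" yields no identifier candidates, and typing "fo" is the first point at which the identifier completer answers.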

outro

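For a killer-level application, most people's needs and pain points already have a solution; the only question is whether you know about it and understand how it works ("life never lacks beauty, only eyes to discover it"). What is actually hard is coming up with a better requirement, let alone a good solution.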

posted on 2024-10-18 20:34  tsecer
