MySQL InnoDB Engine--Guessing at the Adaptive Hash Index Code 03

KEY and VALUE of the adaptive hash index

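In MySQL, the hash key is computed differently for different kinds of objects, but the basic idea is always the same: fold the search key into an integer, then call hash_calc_hash to work out which cell (slot) of the hash table that fold value belongs to.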

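For the adaptive hash index (AHI), the KEY is the fold value of the logical record (the search tuple), and the VALUE is a pointer to the physical record inside the buffer pool page, i.e. the address of the page frame plus the record's offset within the page.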
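The fold-then-pick-a-cell idea can be sketched as follows. This is a minimal illustration, not InnoDB's code: fold_pair copies the arithmetic of ut_fold_ulint_pair quoted later in this post, the two mask constants are assumed values for UT_HASH_RANDOM_MASK and UT_HASH_RANDOM_MASK2, and calc_cell is a hypothetical stand-in for hash_calc_hash that simply takes the fold modulo the number of cells.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed values for UT_HASH_RANDOM_MASK and UT_HASH_RANDOM_MASK2.
constexpr std::uint64_t kMask1 = 1463735687;
constexpr std::uint64_t kMask2 = 1653893711;

// Same arithmetic as ut_fold_ulint_pair: mix two integers into one.
inline std::uint64_t fold_pair(std::uint64_t n1, std::uint64_t n2) {
  return ((((n1 ^ n2 ^ kMask2) << 8) + n1) ^ kMask1) + n2;
}

// Fold a byte string one byte at a time (no loop unrolling).
inline std::uint64_t fold_bytes(const unsigned char *p, std::size_t len) {
  std::uint64_t fold = 0;
  for (std::size_t i = 0; i < len; ++i) {
    fold = fold_pair(fold, p[i]);
  }
  return fold;
}

// Hypothetical stand-in for hash_calc_hash: map a fold value to a cell
// index. n_cells must be greater than zero.
inline std::size_t calc_cell(std::uint64_t fold, std::size_t n_cells) {
  return static_cast<std::size_t>(fold % n_cells);
}
```

A caller would compute fold_bytes over the key bytes once and then use calc_cell(fold, n_cells) to find the chain to search.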

 

Updating the adaptive hash index

When a new record is inserted, the AHI is updated by the function btr_search_update_hash_node_on_insert:

/** Updates the page hash index when a single record is inserted on a page.
@param[in]    cursor    cursor which was positioned to the place to insert
                        using btr_cur_search_, and the new record has been
                        inserted next to the cursor. */
void btr_search_update_hash_node_on_insert(btr_cur_t *cursor);

/** Updates the page hash index when a single record is inserted on a page.
@param[in]    cursor    cursor which was positioned to the place to insert
                        using btr_cur_search_, and the new record has been
                        inserted next to the cursor. */
void btr_search_update_hash_node_on_insert(btr_cur_t *cursor) {
  hash_table_t *table;
  buf_block_t *block;
  dict_index_t *index;
  rec_t *rec;

  if (cursor->index->disable_ahi || !btr_search_enabled) {
    return;
  }

  rec = btr_cur_get_rec(cursor);

  block = btr_cur_get_block(cursor);

  ut_ad(rw_lock_own(&(block->lock), RW_LOCK_X));

  index = block->index;

  if (!index) {
    return;
  }

  ut_a(cursor->index == index);
  ut_a(!dict_index_is_ibuf(index));

  btr_search_x_lock(index);

  if (!block->index) {
    goto func_exit;
  }

  ut_a(block->index == index);

  if ((cursor->flag == BTR_CUR_HASH) &&
      (cursor->n_fields == block->curr_n_fields) &&
      (cursor->n_bytes == block->curr_n_bytes) && !block->curr_left_side) {
    table = btr_get_search_table(index);

    if (ha_search_and_update_if_found(table, cursor->fold, rec, block,
                                      page_rec_get_next(rec))) {
      MONITOR_INC(MONITOR_ADAPTIVE_HASH_ROW_UPDATED);
    }

  func_exit:
    assert_block_ahi_valid(block);
    btr_search_x_unlock(index);
  } else {
    btr_search_x_unlock(index);

    btr_search_update_hash_on_insert(cursor);
  }
}

Here ha_search_and_update_if_found looks up the AHI entry and updates it. It is a macro that expands to ha_search_and_update_if_found_func:

#if defined UNIV_AHI_DEBUG || defined UNIV_DEBUG
/** Looks for an element when we know the pointer to the data and
updates the pointer to data if found.
@param table in/out: hash table
@param fold in: folded value of the searched data
@param data in: pointer to the data
@param new_block in: block containing new_data
@param new_data in: new pointer to the data */
#define ha_search_and_update_if_found(table, fold, data, new_block, new_data) \
  ha_search_and_update_if_found_func(table, fold, data, new_block, new_data)
#else /* UNIV_AHI_DEBUG || UNIV_DEBUG */
/** Looks for an element when we know the pointer to the data and
updates the pointer to data if found.
@param table in/out: hash table
@param fold in: folded value of the searched data
@param data in: pointer to the data
@param new_block ignored: block containing new_data
@param new_data in: new pointer to the data */
#define ha_search_and_update_if_found(table, fold, data, new_block, new_data) \
  ha_search_and_update_if_found_func(table, fold, data, new_data)
#endif /* UNIV_AHI_DEBUG || UNIV_DEBUG */


/** Looks for an element when we know the pointer to the data, and updates
 the pointer to data, if found.
 @return true if found */
ibool ha_search_and_update_if_found_func(
    hash_table_t *table, /*!< in/out: hash table */
    ulint fold,          /*!< in: folded value of the searched data */
    const rec_t *data,   /*!< in: pointer to the data */
#if defined UNIV_AHI_DEBUG || defined UNIV_DEBUG
    buf_block_t *new_block, /*!< in: block containing new_data */
#endif                      /* UNIV_AHI_DEBUG || UNIV_DEBUG */
    const rec_t *new_data)  /*!< in: new pointer to the data */
{
  ha_node_t *node;

  ut_ad(table);
  ut_ad(table->magic_n == HASH_TABLE_MAGIC_N);
  hash_assert_can_modify(table, fold);
#if defined UNIV_AHI_DEBUG || defined UNIV_DEBUG
  ut_a(new_block->frame == page_align(new_data));
#endif /* UNIV_AHI_DEBUG || UNIV_DEBUG */

  ut_d(ha_btr_search_latch_x_locked(table));

  if (!btr_search_enabled) {
    return (FALSE);
  }

  node = ha_search_with_data(table, fold, data);

  if (node) {
#if defined UNIV_AHI_DEBUG || defined UNIV_DEBUG
    if (table->adaptive) {
      ut_a(node->block->n_pointers.fetch_sub(1) - 1 < MAX_N_POINTERS);
      ut_a(new_block->n_pointers.fetch_add(1) + 1 < MAX_N_POINTERS);
    }

    node->block = new_block;
#endif /* UNIV_AHI_DEBUG || UNIV_DEBUG */
    node->data = new_data;

    return (TRUE);
  }

  return (FALSE);
}
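What the function above does can be sketched on a toy chained hash table: walk the chain of the cell selected by fold, find the node whose data pointer equals the old pointer, and repoint it. ToyHashTable, Node and all member names below are hypothetical, and latching, debug counters and the btr_search_enabled check are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <forward_list>
#include <vector>

// One chain node: the folded key and a pointer to the record.
struct Node {
  std::uint64_t fold;
  const void *data;
};

// Toy chained hash table; n_cells must be greater than zero.
class ToyHashTable {
 public:
  explicit ToyHashTable(std::size_t n_cells) : cells_(n_cells) {}

  void insert(std::uint64_t fold, const void *data) {
    cells_[fold % cells_.size()].push_front(Node{fold, data});
  }

  // Find the node in fold's chain whose data pointer equals `data` and
  // repoint it to `new_data`. Returns true if such a node was found.
  bool search_and_update_if_found(std::uint64_t fold, const void *data,
                                  const void *new_data) {
    for (Node &n : cells_[fold % cells_.size()]) {
      if (n.data == data) {
        n.data = new_data;
        return true;
      }
    }
    return false;
  }

  // Return the data pointer of the first node with this fold, or nullptr.
  const void *lookup(std::uint64_t fold) const {
    for (const Node &n : cells_[fold % cells_.size()]) {
      if (n.fold == fold) return n.data;
    }
    return nullptr;
  }

 private:
  std::vector<std::forward_list<Node>> cells_;
};
```

Note that the fold value is not recomputed during the update: the key stays the same and only the pointer stored under it moves to the new record.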

 

Building the adaptive hash index key

The function btr_search_guess_on_hash contains the following code:

/** Tries to guess the right search position based on the hash search info
of the index. Note that if mode is PAGE_CUR_LE, which is used in inserts,
and the function returns TRUE, then cursor->up_match and cursor->low_match
both have sensible values.
@param[in,out]    index        index
@param[in,out]    info        index search info
@param[in]    tuple        logical record
@param[in]    mode        PAGE_CUR_L, ....
@param[in]    latch_mode    BTR_SEARCH_LEAF, ...;
                NOTE that only if has_search_latch is 0, we will
                have a latch set on the cursor page, otherwise
                we assume the caller uses his search latch
                to protect the record!
@param[out]    cursor        tree cursor
@param[in]    has_search_latch
                latch mode the caller currently has on
                search system: RW_S/X_LATCH or 0
@param[in]    mtr        mini transaction
@return TRUE if succeeded */
ibool
btr_search_guess_on_hash(
    dict_index_t*    index,
    btr_search_t*    info,
    const dtuple_t*    tuple,
    ulint        mode,
    ulint        latch_mode,
    btr_cur_t*    cursor,
    ulint        has_search_latch,
    mtr_t*        mtr)
{
    const rec_t*    rec;
    ulint        fold;
    index_id_t    index_id;
    ......
    cursor->n_fields = info->n_fields;
    cursor->n_bytes = info->n_bytes;
    fold = dtuple_fold(tuple, cursor->n_fields, cursor->n_bytes, index_id);
    cursor->fold = fold;
    cursor->flag = BTR_CUR_HASH;
    rec = (rec_t*) ha_search_and_get_data(
            btr_get_search_table(index), fold);

}

The function calls dtuple_fold to fold the logical record tuple into the key:

/** Fold a prefix given as the number of fields of a tuple.
@param[in]    tuple        index record
@param[in]    n_fields    number of complete fields to fold
@param[in]    n_bytes        number of bytes to fold in the last field
@param[in]    index_id    index tree ID
@return the folded value */
UNIV_INLINE
ulint
dtuple_fold(
    const dtuple_t*    tuple,
    ulint        n_fields,
    ulint        n_bytes,
    index_id_t    tree_id)
{
    const dfield_t*    field;
    ulint        i;
    const byte*    data;
    ulint        len;
    ulint        fold;

    ut_ad(tuple);
    ut_ad(tuple->magic_n == DATA_TUPLE_MAGIC_N);
    ut_ad(dtuple_check_typed(tuple));

    fold = ut_fold_ull(tree_id);

    for (i = 0; i < n_fields; i++) {
        field = dtuple_get_nth_field(tuple, i);

        data = (const byte*) dfield_get_data(field);
        len = dfield_get_len(field);

        if (len != UNIV_SQL_NULL) {
            fold = ut_fold_ulint_pair(fold,
                          ut_fold_binary(data, len));
        }
    }

    if (n_bytes > 0) {
        field = dtuple_get_nth_field(tuple, i);

        data = (const byte*) dfield_get_data(field);
        len = dfield_get_len(field);

        if (len != UNIV_SQL_NULL) {
            if (len > n_bytes) {
                len = n_bytes;
            }

            fold = ut_fold_ulint_pair(fold,
                          ut_fold_binary(data, len));
        }
    }

    return(fold);
}

/*************************************************************//**
Folds a 64-bit integer.
@return folded value */
UNIV_INLINE
ulint
ut_fold_ull(
/*========*/
    ib_uint64_t    d)    /*!< in: 64-bit integer */
{
    return(ut_fold_ulint_pair((ulint) d & ULINT32_MASK,
                  (ulint) (d >> 32)));
}


/*************************************************************//**
Folds a pair of ulints.
@return folded value */
UNIV_INLINE
ulint
ut_fold_ulint_pair(
/*===============*/
    ulint    n1,    /*!< in: ulint */
    ulint    n2)    /*!< in: ulint */
{
    return(((((n1 ^ n2 ^ UT_HASH_RANDOM_MASK2) << 8) + n1)
        ^ UT_HASH_RANDOM_MASK) + n2);
}


/*************************************************************//**
Folds a binary string.
@return folded value */
UNIV_INLINE
ulint
ut_fold_binary(
/*===========*/
    const byte*    str,    /*!< in: string of bytes */
    ulint        len)    /*!< in: length */
{
    ulint        fold = 0;
    const byte*    str_end    = str + (len & 0xFFFFFFF8);

    ut_ad(str || !len);

    while (str < str_end) {
        fold = ut_fold_ulint_pair(fold, (ulint)(*str++));
        fold = ut_fold_ulint_pair(fold, (ulint)(*str++));
        fold = ut_fold_ulint_pair(fold, (ulint)(*str++));
        fold = ut_fold_ulint_pair(fold, (ulint)(*str++));
        fold = ut_fold_ulint_pair(fold, (ulint)(*str++));
        fold = ut_fold_ulint_pair(fold, (ulint)(*str++));
        fold = ut_fold_ulint_pair(fold, (ulint)(*str++));
        fold = ut_fold_ulint_pair(fold, (ulint)(*str++));
    }

    switch (len & 0x7) {
    case 7:
        fold = ut_fold_ulint_pair(fold, (ulint)(*str++));
        // Fall through.
    case 6:
        fold = ut_fold_ulint_pair(fold, (ulint)(*str++));
        // Fall through.
    case 5:
        fold = ut_fold_ulint_pair(fold, (ulint)(*str++));
        // Fall through.
    case 4:
        fold = ut_fold_ulint_pair(fold, (ulint)(*str++));
        // Fall through.
    case 3:
        fold = ut_fold_ulint_pair(fold, (ulint)(*str++));
        // Fall through.
    case 2:
        fold = ut_fold_ulint_pair(fold, (ulint)(*str++));
        // Fall through.
    case 1:
        fold = ut_fold_ulint_pair(fold, (ulint)(*str++));
    }

    return(fold);
}

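As the code shows, building the KEY uses:

  • index_id, which identifies the index
  • n_fields and n_bytes, which identify the search pattern (how long a prefix of the tuple is folded)
  • tuple, which identifies the index record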
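The three ingredients listed above can be seen at work in a simplified analogue of dtuple_fold. This is only a sketch under assumptions: fields are plain std::string values, NULL handling is omitted, the mask constants are assumed values, and fold_tuple is a hypothetical name. The caller must make sure the tuple has at least n_fields fields, plus one more if n_bytes is nonzero.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Assumed values for UT_HASH_RANDOM_MASK and UT_HASH_RANDOM_MASK2.
constexpr std::uint64_t kMask1 = 1463735687;
constexpr std::uint64_t kMask2 = 1653893711;

inline std::uint64_t fold_pair(std::uint64_t n1, std::uint64_t n2) {
  return ((((n1 ^ n2 ^ kMask2) << 8) + n1) ^ kMask1) + n2;
}

// Fold the first `len` bytes of a string.
inline std::uint64_t fold_bytes(const std::string &s, std::size_t len) {
  std::uint64_t fold = 0;
  for (std::size_t i = 0; i < len; ++i) {
    fold = fold_pair(fold, static_cast<unsigned char>(s[i]));
  }
  return fold;
}

// Toy analogue of dtuple_fold: start from the index id, fold n_fields
// complete fields, then at most n_bytes bytes of the following field.
inline std::uint64_t fold_tuple(const std::vector<std::string> &fields,
                                std::size_t n_fields, std::size_t n_bytes,
                                std::uint64_t index_id) {
  std::uint64_t fold = fold_pair(index_id & 0xFFFFFFFFULL, index_id >> 32);
  std::size_t i = 0;
  for (; i < n_fields; ++i) {
    fold = fold_pair(fold, fold_bytes(fields[i], fields[i].size()));
  }
  if (n_bytes > 0) {
    const std::size_t len = std::min(fields[i].size(), n_bytes);
    fold = fold_pair(fold, fold_bytes(fields[i], len));
  }
  return fold;
}
```

Two tuples that agree on the folded prefix produce the same key even if their remaining fields differ, while a different index_id, n_fields or n_bytes changes what goes into the fold and so will generally produce a different key.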

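Building the adaptive hash index value

In the call to ha_search_and_update_if_found, page_rec_get_next(rec) returns a pointer to the record that follows the cursor record on the same page (the newly inserted record), and that pointer is passed in as the new_data argument.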

/** Gets the pointer to the next record on the page.
 @return pointer to next record */
UNIV_INLINE
const rec_t *page_rec_get_next_low(
    const rec_t *rec, /*!< in: pointer to record */
    ulint comp)       /*!< in: nonzero=compact page layout */
{
  ulint offs;
  const page_t *page;

  ut_ad(page_rec_check(rec));

  page = page_align(rec);

  offs = rec_get_next_offs(rec, comp);

  if (offs >= UNIV_PAGE_SIZE) {
    fprintf(stderr,
            "InnoDB: Next record offset is nonsensical %lu"
            " in record at offset %lu\n"
            "InnoDB: rec address %p, space id %lu, page %lu\n",
            (ulong)offs, (ulong)page_offset(rec), (void *)rec,
            (ulong)page_get_space_id(page), (ulong)page_get_page_no(page));
    ut_error;
  } else if (offs == 0) {
    return (nullptr);
  }

  return (page + offs);
}

/** Gets the pointer to the next record on the page.
 @return pointer to next record */
UNIV_INLINE
rec_t *page_rec_get_next(rec_t *rec) /*!< in: pointer to record */
{
  return ((rec_t *)page_rec_get_next_low(rec, page_rec_is_comp(rec)));
}

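page_rec_get_next calls page_rec_get_next_low, which returns (page + offs). Here page is the start address of the page frame (obtained with page_align) and offs is the offset of the next record within that page, so the result is a pointer to the next record in memory. The page number itself is not part of the value.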
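The (page + offs) arithmetic can be illustrated with a small sketch. It assumes page frames are aligned to the page size, so the frame start can be recovered from any record pointer by masking off the low bits; kPageSize and the toy_ function names are hypothetical and only mirror the spirit of page_align, page_offset and page_rec_get_next_low.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed page size; frames are assumed to be aligned to this size.
constexpr std::size_t kPageSize = 16384;

// Round a record pointer down to the start of its page frame.
inline const unsigned char *toy_page_align(const unsigned char *rec) {
  const auto addr = reinterpret_cast<std::uintptr_t>(rec);
  const auto mask = ~(static_cast<std::uintptr_t>(kPageSize) - 1);
  return reinterpret_cast<const unsigned char *>(addr & mask);
}

// Offset of a record within its page frame.
inline std::size_t toy_page_offset(const unsigned char *rec) {
  return static_cast<std::size_t>(reinterpret_cast<std::uintptr_t>(rec) &
                                  (kPageSize - 1));
}

// Follow an in-page "next record" offset: frame start plus offset.
// An offset of zero means there is no next record.
inline const unsigned char *toy_next(const unsigned char *rec,
                                     std::size_t next_offs) {
  if (next_offs == 0) return nullptr;
  return toy_page_align(rec) + next_offs;
}
```

Because the stored value is a raw pointer into a buffer pool frame, it is only meaningful while that page stays in the frame, which is presumably why the hash entries of a page have to be dropped when the page is evicted.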

 

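Searching left and right among index records

Both the btr_search_t object, which holds the search info of an index, and the buf_block_t object, which holds the info of a data page, carry a left_side flag (curr_left_side in buf_block_t, as seen in the code above) that marks the direction in which index records are searched.

When the adaptive hash index is built, not every record on a data page is indexed. Instead, the access pattern decides which of the records sharing the same key value gets indexed.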

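Suppose there is an index idx_c1 whose records are [1,2,2,3,3,3,3,4], so that, for example, there are four index records with c1=3:

  • For search modes PAGE_CUR_LE or PAGE_CUR_G, the leftmost of the equal records is indexed
  • For search modes PAGE_CUR_GE or PAGE_CUR_L, the rightmost of the equal records is indexed

Once the leftmost or rightmost record has been found, the search continues to the left or to the right according to the left_side value.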
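The leftmost-versus-rightmost choice can be sketched on the record set from the example. The code only shows the selection among equal keys; which search modes map to which side is taken from the description above, and build_toy_index is a hypothetical helper that works on plain integers instead of folded records.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// For each distinct key in a sorted run of records, remember one position:
// the leftmost occurrence if left_side is true, otherwise the rightmost.
inline std::map<int, std::size_t> build_toy_index(const std::vector<int> &recs,
                                                  bool left_side) {
  std::map<int, std::size_t> index;
  for (std::size_t i = 0; i < recs.size(); ++i) {
    if (left_side) {
      index.emplace(recs[i], i);  // emplace keeps the first position seen
    } else {
      index[recs[i]] = i;  // overwrite so the last position wins
    }
  }
  return index;
}
```

For [1,2,2,3,3,3,3,4] with zero-based positions, key 3 maps to position 3 when left_side is true and to position 6 when it is false, so only one of the four equal records gets a hash entry.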

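PS: The record header of each physical record has 16 bits that store the position of the next record, so the records on a page form a singly linked list. How, then, is a record searched for to the left?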

posted @ 2021-07-09 22:36  TeyGao