OceanBase - clog / logs - queue backlog - dump tenant info
dump tenant info
Searching the log for the keyword "dump tenant info" shows the tenant's unit specification, threads, queues, request statistics, and other information. This line is printed once every 10 seconds for each tenant.
Query method:
grep 'dump tenant info.*tenant={id:1002' log/observer.log.*
Example observer.log output:
[2021-05-10 16:56:22.564978] INFO [SERVER.OMT] ob_multi_tenant.cpp:803 [48820][2116][Y0-0000000000000000] [lt=5] dump tenant info(tenant={id:1002, compat_mode:1, unit_min_cpu:"1.000000000000000000e+01", unit_max_cpu:"1.500000000000000000e+01", slice:"0.000000000000000000e+00", slice_remain:"0.000000000000000000e+00", token_cnt:30, ass_token_cnt:30, lq_tokens:3, used_lq_tokens:3, stopped:false, idle_us:4945506, recv_hp_rpc_cnt:2420622, recv_np_rpc_cnt:7523808, recv_lp_rpc_cnt:0, recv_mysql_cnt:4561007, recv_task_cnt:337865, recv_large_req_cnt:1272, tt_large_quries:3648648, actives:35, workers:35, nesting workers:7, lq waiting workers:5, req_queue:total_size=48183 queue[0]=47888 queue[1]=0 queue[2]=242 queue[3]=5 queue[4]=48 queue[5]=0 , large queued:12, multi_level_queue:total_size=0 queue[0]=0 queue[1]=0 queue[2]=0 queue[3]=0 queue[4]=0 queue[5]=0 queue[6]=0 queue[7]=0 , recv_level_rpc_cnt:cnt[0]=0 cnt[1]=0 cnt[2]=0 cnt[3]=0 cnt[4]=0 cnt[5]=165652 cnt[6]=10 cnt[7]=0 })
The req_queue field is used to judge tenant queue backlog. Meaning of each item: total_size = total number of queued requests across the priority queues; queue[n] = number of queued requests in each priority sub-queue.
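A minimal parsing sketch of that idea, assuming Python 3 and the log format shown above (the script name check_backlog.py and the threshold of 100 are illustrative assumptions, not official values): it pulls the tenant id, total_size, and the per-queue counts out of each dump tenant info line so a backlog can be spotted programmatically.

import re
import sys

# Matches the req_queue section of a "dump tenant info" line, e.g.
# "req_queue:total_size=48183 queue[0]=47888 queue[1]=0 ..."
REQ_QUEUE_RE = re.compile(r'req_queue:total_size=(\d+)((?: queue\[\d+\]=\d+)+)')
TENANT_RE = re.compile(r'tenant=\{id:(\d+)')

def parse_req_queue(line):
    """Return (tenant_id, total_size, {queue_index: count}) or None."""
    m_tenant = TENANT_RE.search(line)
    m_queue = REQ_QUEUE_RE.search(line)
    if not (m_tenant and m_queue):
        return None
    total = int(m_queue.group(1))
    queues = {int(i): int(c)
              for i, c in re.findall(r'queue\[(\d+)\]=(\d+)', m_queue.group(2))}
    return int(m_tenant.group(1)), total, queues

if __name__ == '__main__':
    # Hypothetical usage: python3 check_backlog.py log/observer.log
    backlog_threshold = 100  # arbitrary illustration; tune for your workload
    with open(sys.argv[1], errors='replace') as f:
        for line in f:
            if 'dump tenant info' not in line:
                continue
            parsed = parse_req_queue(line)
            if parsed and parsed[1] > backlog_threshold:
                tenant_id, total, queues = parsed
                print(f'tenant {tenant_id}: req_queue total_size={total}, queues={queues}')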
Meanings of the other fields:
id: tenant ID
unit_min_cpu: minimum number of CPU cores, guaranteed to the tenant
unit_max_cpu: maximum number of CPU cores, the upper limit
slice: not meaningful
slice_remain: not meaningful
token_cnt: number of tokens allocated by the scheduler; each token is turned into one worker thread
ass_token_cnt: number of tokens the tenant has currently acknowledged (confirmed against token_cnt; the two are usually equal)
lq_tokens: number of large-query (LQ) tokens, set to token_cnt multiplied by the large-query ratio
used_lq_tokens: number of workers currently holding an LQ token
stopped: whether the tenant unit is being deleted
idle_us: total idle time of the worker threads within one round (10 seconds); "idle" here only counts time spent waiting on the queues.
recv_hp/np/lp_rpc_cnt: cumulative number of RPC requests of each priority received by the tenant: hp (High), np (Normal), lp (Low); since these counters only grow, what matters is their growth between consecutive dumps (see the sketch after this list)
recv_mysql_cnt: cumulative number of MySQL requests received by the tenant
recv_task_cnt: cumulative number of internal tasks received by the tenant
recv_large_req_cnt: cumulative number of requests predicted to be large; it only increases and is never reset. It is actually incremented when a request is retried.
tt_large_quries: cumulative number of large queries processed by the tenant; it only increases and is never reset. It is actually incremented at the instrumentation check points.
actives: number of active worker threads, usually equal to workers; the difference between the two consists of cached tenant worker threads plus cached large requests that still hold a worker thread
workers: number of worker threads held by the tenant; in effect the size of the workers_ list.
nesting workers: number of threads the tenant dedicates to nested requests; 7 threads in total, one per nesting level.
lq waiting workers: worker threads waiting to be scheduled
req_queue: work queues of different priorities; the smaller the number, the higher the priority
large queued: number of requests currently predicted to be large
multi_level_queue: work queues for nested requests; queues 1~7 correspond to the 7 nesting levels (queue[0] is currently unused).
recv_level_rpc_cnt: cumulative number of RPC requests received by the tenant at each nesting level.
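Because the recv_*_cnt and tt_large_quries counters are cumulative and never reset, their absolute values say little; the growth between two consecutive 10-second dumps of the same tenant is what reflects load. A small sketch of that calculation (the helper names and the example "current" values are assumptions for illustration; the field names follow the log line above):

import re

# Cumulative counters from "dump tenant info" that only grow and are never reset.
CUMULATIVE_FIELDS = ('recv_hp_rpc_cnt', 'recv_np_rpc_cnt', 'recv_lp_rpc_cnt',
                     'recv_mysql_cnt', 'recv_task_cnt', 'tt_large_quries')

def parse_counters(line):
    """Extract the cumulative counters from one dump tenant info line."""
    return {name: int(m.group(1))
            for name in CUMULATIVE_FIELDS
            if (m := re.search(rf'{name}:(\d+)', line))}

def rates(prev, curr, interval_s=10):
    """Approximate per-second rates from two consecutive dumps of the same tenant."""
    return {name: (curr[name] - prev[name]) / interval_s
            for name in CUMULATIVE_FIELDS
            if name in prev and name in curr}

# "prev" uses values from the example log line above; "curr" is made up
# to show the arithmetic for the next 10-second dump.
prev = {'recv_mysql_cnt': 4561007, 'recv_hp_rpc_cnt': 2420622}
curr = {'recv_mysql_cnt': 4592007, 'recv_hp_rpc_cnt': 2421622}
print(rates(prev, curr))  # {'recv_mysql_cnt': 3100.0, 'recv_hp_rpc_cnt': 100.0}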
1. Meaning of each priority level in req_queue; see ob_tenant.cpp:recv_request(). In the code, is_normal_prio and is_low_prio are handled identically, so both are grouped here under normal priority.
(0, 5): high priority
[5, 10): normal priority
10: low priority for DDL, which should never reach recv_request
11: ultra-low priority for warmup
1. queue0: high-priority RPC
2. queue1: lock retry
3. queue2: normal-priority RPC
4. queue3: lock-retry SQL requests; OB_TASK; OB_GTS_TASK
5. queue4: normal SQL requests; OB_SQL_TASK
6. queue5: warmup requests
OB_TASK: currently only used to close connections
./src/observer/ob_srv_task.h-class ObDisconnectTask
./src/observer/ob_srv_task.h-    : public ObSrvTask
OB_GTS_TASK: currently only used to handle GTS requests
./src/storage/transaction/ob_gts_response_handler.cpp-int ObGtsResponseTask::init(const uint64_t tenant_id,
./src/storage/transaction/ob_gts_response_handler.cpp:  set_type(ObRequest::OB_GTS_TASK);
OB_SQL_TASK: currently only used for tasks split off by the batch processor; after splitting, the tasks are pushed back into the tenant queue
./src/share/rpc/ob_batch_processor.cpp:  sql::ObSqlTask *task = ObSqlTaskFactory::get_instance().alloc(cur_tenant->id());
Low-priority RPC: currently only used for warmup requests
./src/storage/ob_partition_service_rpc.h:  RPC_AP(PR11 post_warm_up_request, OB_WARM_UP_REQUEST, (ObWarmUpRequestArg));
High-priority RPC: used for transactions, elections, and GTS; none of these requests issue further RPCs
./deps/oblib/src/rpc/obrpc/ob_rpc_proxy_macros.h:#define RPC_AP(args...) _CONCAT(OB_DEFINE_RPC, _AP IGNORE_(args))
./src/storage/transaction/ob_trans_rpc.h:  RPC_AP(PR3 post_trans_msg, OB_TRANS, (transaction::ObTransMsg), ObTransRpcResult);
./src/storage/transaction/ob_trans_rpc.h:  RPC_AP(PR3 post_trans_resp_msg, OB_TRANS_RESP, (transaction::ObTransMsg));
./src/storage/transaction/ob_gts_rpc.h:  RPC_AP(PR1 post, OB_GET_GTS_REQUEST, (transaction::ObGtsRequest), ObGtsRpcResult);
./src/storage/transaction/ob_gts_rpc.h:  RPC_AP(PR1 post, OB_GET_GTS_ERR_RESPONSE, (transaction::ObGtsErrResponse), ObGtsRpcResult);
./src/storage/transaction/ob_dup_table_rpc.h:  RPC_AP(PRZ post_dup_table_lease_request, OB_DUP_TABLE_LEASE_REQUEST,
./src/storage/transaction/ob_dup_table_rpc.h:  RPC_AP(PRZ post_dup_table_lease_response, OB_DUP_TABLE_LEASE_RESPONSE,
./src/storage/transaction/ob_dup_table_rpc.h:  RPC_AP(PR3 post_redo_log_sync_request, OB_REDO_LOG_SYNC_REQUEST,
./src/storage/transaction/ob_dup_table_rpc.h:  RPC_AP(PR3 post_redo_log_sync_response, OB_REDO_LOG_SYNC_RESPONSE,
./src/clog/ob_log_rpc_proxy.h:  RPC_AP(PR3 log_rpc, OB_CLOG, (ObLogRpcProxyBuffer));
./src/election/ob_election_rpc.h:  RPC_AP(PR1 post_election_msg, OB_ELECTION, (election::ObElectionMsgBuffer), ObElectionRpcResult);
./src/share/ob_srv_rpc_proxy.h:  RPC_AP(PR3 batch_get_role, OB_BATCH_GET_ROLE,
./src/share/ob_srv_rpc_proxy.h:  RPC_AP(PR1 ha_gts_ping_request, OB_HA_GTS_PING_REQUEST, (ObHaGtsPingRequest), ObHaGtsPingResponse);
./src/share/ob_srv_rpc_proxy.h:  RPC_AP(PR1 ha_gts_get_request, OB_HA_GTS_GET_REQUEST, (ObHaGtsGetRequest));
./src/share/ob_srv_rpc_proxy.h:  RPC_AP(PR1 ha_gts_get_response, OB_HA_GTS_GET_RESPONSE, (ObHaGtsGetResponse));
./src/share/ob_srv_rpc_proxy.h:  RPC_AP(PR1 ha_gts_heartbeat, OB_HA_GTS_HEARTBEAT, (ObHaGtsHeartbeat));
./src/share/interrupt/ob_interrupt_rpc_proxy.h:  RPC_AP(PR1 remote_interrupt_call, OB_REMOTE_INTERRUPT_CALL, (ObInterruptMessage));
./src/share/rpc/ob_blacklist_proxy.h:  RPC_AP(PR1 post_request, OB_SERVER_BLACKLIST_REQ, (ObBlacklistReq));
./src/share/rpc/ob_blacklist_proxy.h:  RPC_AP(PR1 post_response, OB_SERVER_BLACKLIST_RESP, (ObBlacklistResp));
./src/share/rpc/ob_batch_proxy.h:  RPC_AP(PR1 post_packet, OB_BATCH, (ObBatchPacket));
./deps/oblib/src/rpc/obrpc/ob_rpc_proxy_macros.h:#define RPC_S(args...) _CONCAT(OB_DEFINE_RPC, _S IGNORE_(args))
./src/storage/transaction/ob_weak_read_service_rpc_define.h:  RPC_S(PR1 get_weak_read_cluster_version, OB_WRS_GET_CLUSTER_VERSION,
./src/share/ob_srv_rpc_proxy.h:  RPC_S(PR1 get_diagnose_args, OB_GET_DIAGNOSE_ARGS, common::ObString);
./src/share/ob_srv_rpc_proxy.h:  RPC_S(PR3 get_member_list_and_leader, OB_GET_MEMBER_LIST_AND_LEADER,
./src/share/ob_srv_rpc_proxy.h:  RPC_S(PR1 ha_gts_update_meta, OB_HA_GTS_UPDATE_META, (ObHaGtsUpdateMetaRequest),
./src/share/ob_srv_rpc_proxy.h:  RPC_S(PR1 ha_gts_change_member, OB_HA_GTS_CHANGE_MEMBER, (ObHaGtsChangeMemberRequest),
./src/share/ob_common_rpc_proxy.h:  RPC_S(PRZ renew_lease, obrpc::OB_RENEW_LEASE, (oceanbase::share::ObLeaseRequest), share::ObLeaseResponse);
./src/share/ob_common_rpc_proxy.h:  RPC_S(PRZ cluster_heartbeat, obrpc::OB_CLUSTER_HB, (oceanbase::share::ObClusterAddr), obrpc::ObStandbyHeartBeatRes);
./src/share/ob_common_rpc_proxy.h:  RPC_S(PRZ cluster_regist, obrpc::OB_CLUSTER_REGIST, (oceanbase::obrpc::ObRegistClusterArg), obrpc::ObRegistClusterRes);
./src/share/ob_common_rpc_proxy.h:  RPC_S(PRZ get_schema_snapshot, obrpc::OB_GET_SCHEMA_SNAPSHOT, (oceanbase::obrpc::ObSchemaSnapshotArg), obrpc::ObSchemaSnapshotRes);
./src/share/ob_common_rpc_proxy.h:  RPC_S(PR3 fetch_alive_server, obrpc::OB_FETCH_ALIVE_SERVER, (ObFetchAliveServerArg), ObFetchAliveServerResult);
./src/sql/executor/ob_executor_rpc_proxy.h:  RPC_S(@PR4 fetch_interm_result_item, obrpc::OB_FETCH_INTERM_RESULT_ITEM, (sql::ObFetchIntermResultItemArg), sql::ObFetchIntermResultItemRes);
2. Meaning of multi_level_queue: total_size = total number of queued requests across the nesting-level queues; queue[n] = number of queued requests in each level's sub-queue.
1. queue0: requests with no nesting; since non-nested requests are stored in the priority queues, this is usually empty
2. queue1: requests with 1 level of nesting (e.g. an RPC triggered by SQL)
3. queue2: requests with 2 levels of nesting (e.g. an RPC triggered by an RPC that was itself triggered by SQL)
4. queue3: requests with 3 levels of nesting
5. queue4: requests with 4 levels of nesting
6. queue5: requests with 5 levels of nesting, plus inner SQL requests (to avoid deadlocks caused by inner SQL)
7. queue6: requests with 6 levels of nesting
8. queue7: requests with 7 or more levels of nesting
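A simplified illustration of the priority bucketing described in item 1. This is only a paraphrase of the rules listed above, not the actual recv_request() code; the function name queue_group and the returned labels are made up for the sketch.

def queue_group(prio):
    """Map an RPC priority value (e.g. the N in PRn) to the coarse group described above.

    Rules as described for ob_tenant.cpp:recv_request():
      (0, 5)  -> high priority
      [5, 10) -> normal priority (is_normal_prio and is_low_prio are treated alike)
      10      -> low priority for DDL, should never reach recv_request
      11      -> ultra-low priority for warmup
    """
    if 0 < prio < 5:
        return 'high'
    if 5 <= prio < 10:
        return 'normal'
    if prio == 10:
        return 'ddl-low (unexpected here)'
    if prio == 11:
        return 'warmup'
    return 'unknown'

# e.g. PR1 (elections, GTS) and PR3 (transactions, clog) map to high priority,
# PR11 (post_warm_up_request) maps to warmup.
print([queue_group(p) for p in (1, 3, 5, 11)])  # ['high', 'high', 'normal', 'warmup']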