Performance Schema: Measure Multi-Threaded Slave Activity
2022-08-03 14:53 abce 阅读(33) 评论(0) 编辑 收藏 举报Performance Schema
In many types of database workloads, using a multi-threaded slave from 5.6+ helps improve replication performance. I’ve had a number of users enable this feature, but have not seen anyone ask how each thread is performing. Here’s a quick way with Performance_Schema to measure the amount of multi-threaded slave activity on each thread—after you have already configured MTS on your slave of course.
First, we need to enable the statements instruments:
1 2 3 | slave1> UPDATE setup_consumers SET ENABLED = 'YES' WHERE NAME LIKE 'events_statements_%' ; Query OK, 2 rows affected (0.00 sec) Rows matched: 3 Changed: 2 Warnings: 0 |
Next, let’s find the THREAD_ID for our slave workers:
1 2 3 4 5 6 7 8 9 10 | slave1> SELECT THREAD_ID, NAME FROM threads WHERE NAME LIKE '%worker' ; + -----------+-------------------------+ | THREAD_ID | NAME | + -----------+-------------------------+ | 28 | thread/sql/slave_worker | | 29 | thread/sql/slave_worker | | 30 | thread/sql/slave_worker | | 31 | thread/sql/slave_worker | + -----------+-------------------------+ 4 rows in set (0.00 sec) |
And then we capture measurements:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 | slave1> SELECT THREAD_ID, EVENT_NAME, COUNT_STAR FROM events_statements_summary_by_thread_by_event_name WHERE THREAD_ID IN (28,29,30,31) AND COUNT_STAR > 0; + -----------+----------------------+------------+ | THREAD_ID | EVENT_NAME | COUNT_STAR | + -----------+----------------------+------------+ | 28 | statement/sql/ update | 48 | | 28 | statement/sql/ insert | 24 | | 28 | statement/sql/ delete | 24 | | 28 | statement/sql/ begin | 24 | | 29 | statement/sql/ update | 68 | | 29 | statement/sql/ insert | 34 | | 29 | statement/sql/ delete | 34 | | 29 | statement/sql/ begin | 34 | | 30 | statement/sql/ update | 1864 | | 30 | statement/sql/ insert | 932 | | 30 | statement/sql/ delete | 932 | | 30 | statement/sql/ begin | 932 | | 31 | statement/sql/ update | 40320 | | 31 | statement/sql/ insert | 20160 | | 31 | statement/sql/ delete | 20160 | | 31 | statement/sql/ begin | 20160 | + -----------+----------------------+------------+ 16 rows in set (0.01 sec) |
As you can see above, one of the multi-threadedthreads is doing more work compared to the other three. This is a sysbench on the master in the database doing [--trx-rate=10], compared to the other three which are only doing [--trx-rate=2|5].
Below is another example, this time including the event times. With a slightly modified sysbench test, I’ve designed the tests to send large updates but one with a low transaction rate, another with a high transaction rate, and the rest in between. You can see specifically in threads 32 and 34, that the former is performing a lower amount of [UPDATES], but spending more time per event.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 | slave1> SELECT THREAD_ID, EVENT_NAME, COUNT_STAR, -> SUM_TIMER_WAIT/1000000 AS SUM_TIMER_WAIT_MS, -> (SUM_TIMER_WAIT/COUNT_STAR)/1000000 AS AVG_TIME_MS -> FROM events_statements_summary_by_thread_by_event_name -> WHERE THREAD_ID IN (31,32,33,34) AND COUNT_STAR > 0; + -----------+----------------------+------------+-------------------+--------------+ | THREAD_ID | EVENT_NAME | COUNT_STAR | SUM_TIMER_WAIT_MS | AVG_TIME_MS | + -----------+----------------------+------------+-------------------+--------------+ | 31 | statement/sql/ update | 3 | 403.9690 | 134.65633333 | | 31 | statement/sql/ insert | 1 | 91.5340 | 91.53400000 | | 31 | statement/sql/ delete | 1 | 75.9540 | 75.95400000 | | 31 | statement/sql/ begin | 1 | 11.6800 | 11.68000000 | | 32 | statement/sql/ update | 75 | 25298.5090 | 337.31345333 | | 32 | statement/sql/ insert | 25 | 2221.1410 | 88.84564000 | | 32 | statement/sql/ delete | 25 | 2187.8970 | 87.51588000 | | 32 | statement/sql/ begin | 25 | 321.0220 | 12.84088000 | | 33 | statement/sql/ update | 4728 | 1008736.6000 | 213.35376481 | | 33 | statement/sql/ insert | 1576 | 138815.0570 | 88.08061992 | | 33 | statement/sql/ delete | 1576 | 136161.9060 | 86.39714848 | | 33 | statement/sql/ begin | 1576 | 20498.2300 | 13.00649112 | | 34 | statement/sql/ update | 70668 | 12304784.2380 | 174.12101995 | | 34 | statement/sql/ insert | 23556 | 2083454.5350 | 88.44687277 | | 34 | statement/sql/ delete | 23556 | 2041202.7690 | 86.65319957 | | 34 | statement/sql/ begin | 23556 | 303710.4860 | 12.89312642 | + -----------+----------------------+------------+-------------------+--------------+ 16 rows in set (0.00 sec) |
Exploring a bit further, one thing I noticed is that when using [binlog_format=ROW] for replication, you will only see the [BEGIN] events instrumented from [performance_schema]. I think this is a bug so I reported it here.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 | slave1> SELECT THREAD_ID, EVENT_NAME, COUNT_STAR, -> SUM_TIMER_WAIT/1000000 AS SUM_TIMER_WAIT_MS, -> (SUM_TIMER_WAIT/COUNT_STAR)/1000000 AS AVG_TIME_MS -> FROM events_statements_summary_by_thread_by_event_name -> WHERE THREAD_ID IN (40,41,42,43) AND COUNT_STAR > 0; + -----------+---------------------+------------+-------------------+-------------+ | THREAD_ID | EVENT_NAME | COUNT_STAR | SUM_TIMER_WAIT_MS | AVG_TIME_MS | + -----------+---------------------+------------+-------------------+-------------+ | 40 | statement/sql/ begin | 16 | 258.6500 | 16.16562500 | | 41 | statement/sql/ begin | 91 | 1526.4320 | 16.77397802 | | 42 | statement/sql/ begin | 589 | 10131.4500 | 17.20110357 | | 43 | statement/sql/ begin | 5022 | 85584.0250 | 17.04182099 | + -----------+---------------------+------------+-------------------+-------------+ 4 rows in set (0.01 sec) |
With MySQL 5.7, MTS has an additional improvement with how parallel replication is applied. By default, transactions are applied in parallel per database. With the new [slave_parallel_type] feature, another option called [LOGICAL_CLOCK] is introduced that allows parallelization intra-database as well. I did some small tests below, with same workload as above. First with the default [DATABASE] type:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 | + -----------+----------------------+------------+-------------------+--------------+ | THREAD_ID | EVENT_NAME | COUNT_STAR | SUM_TIMER_WAIT_MS | AVG_TIME_MS | + -----------+----------------------+------------+-------------------+--------------+ | 65 | statement/sql/ update | 51 | 9974.1310 | 195.57119608 | | 65 | statement/sql/ insert | 17 | 2037.7080 | 119.86517647 | | 65 | statement/sql/ delete | 17 | 2144.4800 | 126.14588235 | | 65 | statement/sql/ begin | 17 | 293.0650 | 17.23911765 | | 66 | statement/sql/ update | 255 | 67131.4980 | 263.26077647 | | 66 | statement/sql/ insert | 85 | 9629.5520 | 113.28884706 | | 66 | statement/sql/ delete | 85 | 9854.2920 | 115.93284706 | | 66 | statement/sql/ begin | 85 | 1405.0200 | 16.52964706 | | 67 | statement/sql/ update | 1215 | 344712.6330 | 283.71410123 | | 67 | statement/sql/ insert | 405 | 48000.0110 | 118.51854568 | | 67 | statement/sql/ delete | 405 | 53222.3010 | 131.41308889 | | 67 | statement/sql/ begin | 405 | 6836.9070 | 16.88125185 | | 68 | statement/sql/ update | 5943 | 1820669.3290 | 306.35526317 | | 68 | statement/sql/ insert | 1981 | 241513.1400 | 121.91476022 | | 68 | statement/sql/ delete | 1981 | 245022.2450 | 123.68614084 | | 68 | statement/sql/ begin | 1981 | 32978.3390 | 16.64731903 | + -----------+----------------------+------------+-------------------+--------------+ |
And then with [LOGICAL_CLOCK]:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 | + -----------+----------------------+------------+-------------------+--------------+ | THREAD_ID | EVENT_NAME | COUNT_STAR | SUM_TIMER_WAIT_MS | AVG_TIME_MS | + -----------+----------------------+------------+-------------------+--------------+ | 59 | statement/sql/ update | 9486 | 2667681.7170 | 281.22303574 | | 59 | statement/sql/ insert | 3162 | 376350.1650 | 119.02282258 | | 59 | statement/sql/ delete | 3162 | 388606.5460 | 122.89897090 | | 59 | statement/sql/ begin | 3162 | 53988.2600 | 17.07408602 | | 60 | statement/sql/ update | 903 | 362853.2080 | 401.83079513 | | 60 | statement/sql/ insert | 301 | 36507.3090 | 121.28674086 | | 60 | statement/sql/ delete | 301 | 37437.2550 | 124.37626246 | | 60 | statement/sql/ begin | 301 | 5210.4110 | 17.31033555 | | 61 | statement/sql/ update | 42 | 23394.0330 | 557.00078571 | | 61 | statement/sql/ insert | 14 | 1671.5900 | 119.39928571 | | 61 | statement/sql/ delete | 14 | 1720.1230 | 122.86592857 | | 61 | statement/sql/ begin | 14 | 246.1490 | 17.58207143 | + -----------+----------------------+------------+-------------------+--------------+ |
With [LOGICAL_LOCK] and [slave_preserve_commit_order] enabled:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 | + -----------+----------------------+------------+-------------------+--------------+ | THREAD_ID | EVENT_NAME | COUNT_STAR | SUM_TIMER_WAIT_MS | AVG_TIME_MS | + -----------+----------------------+------------+-------------------+--------------+ | 71 | statement/sql/ update | 8097 | 2216461.7170 | 273.73863369 | | 71 | statement/sql/ insert | 2699 | 322933.1620 | 119.64918933 | | 71 | statement/sql/ delete | 2699 | 326944.2380 | 121.13532345 | | 71 | statement/sql/ begin | 2699 | 45860.5720 | 16.99169026 | | 72 | statement/sql/ update | 807 | 256668.2730 | 318.05238290 | | 72 | statement/sql/ insert | 269 | 32952.5570 | 122.50021190 | | 72 | statement/sql/ delete | 269 | 33346.3060 | 123.96396283 | | 72 | statement/sql/ begin | 269 | 4650.1010 | 17.28662082 | | 73 | statement/sql/ update | 33 | 6823.1170 | 206.76112121 | | 73 | statement/sql/ insert | 11 | 1512.5810 | 137.50736364 | | 73 | statement/sql/ delete | 11 | 1593.5790 | 144.87081818 | | 73 | statement/sql/ begin | 11 | 188.3910 | 17.12645455 | + -----------+----------------------+------------+-------------------+--------------+ |
Combining the [INSERT], [UPDATE], and [DELETE] counts from these 3 tests, [LOGICAL_CLOCK] tops with around 17k events within 120 seconds of the same workload. The [DATABASE] type was the slowest, recording only about 12k events committed on the slave.
Lastly, with MySQL 5.7 a new instrument has been added and enabled for metrics to be collected. You can also measure the per-transaction rate instead of per event class.
1 2 3 4 5 6 7 | slave1> UPDATE setup_consumers SET ENABLED = 'YES' WHERE NAME LIKE 'vents_transactions%' ; Query OK, 0 rows affected (0.00 sec) Rows matched: 0 Changed: 0 Warnings: 0 slave1> UPDATE setup_instruments SET ENABLED = 'YES' , TIMED = 'YES' WHERE NAME = 'transaction' ; Query OK, 1 row affected (0.00 sec) Rows matched: 1 Changed: 1 Warnings: 0 |
And a quick measurement:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | slave1> SELECT THREAD_ID, EVENT_NAME, COUNT_STAR, -> SUM_TIMER_WAIT/1000000 AS SUM_TIMER_WAIT_MS, -> (SUM_TIMER_WAIT/COUNT_STAR)/1000000 AS AVG_TIME_MS -> FROM events_transactions_summary_by_thread_by_event_name -> WHERE THREAD_ID IN ( -> SELECT THREAD_ID FROM replication_applier_status_by_worker -> ) AND COUNT_STAR > 0; + -----------+-------------+------------+-------------------+----------------+ | THREAD_ID | EVENT_NAME | COUNT_STAR | SUM_TIMER_WAIT_MS | AVG_TIME_MS | + -----------+-------------+------------+-------------------+----------------+ | 71 | transaction | 4849 | 50323942.6710 | 10378.21049103 | | 72 | transaction | 487 | 6421399.4580 | 13185.62517043 | | 73 | transaction | 17 | 319024.9280 | 18766.17223529 | + -----------+-------------+------------+-------------------+----------------+ 3 rows in set (0.00 sec) |
Of course, not only statement, events, and transactions can be measured; you can also measure wait times and stages as well. Happy instrumenting!
This article was written by Jervin Real.
https://dzone.com/articles/performance-schema-measure-multi-threaded-slave-ac
【推荐】国内首个AI IDE,深度理解中文开发场景,立即下载体验Trae
【推荐】编程新体验,更懂你的AI,立即体验豆包MarsCode编程助手
【推荐】抖音旗下AI助手豆包,你的智能百科全书,全免费不限次数
【推荐】轻量又高性能的 SSH 工具 IShell:AI 加持,快人一步
· DeepSeek 开源周回顾「GitHub 热点速览」
· 物流快递公司核心技术能力-地址解析分单基础技术分享
· .NET 10首个预览版发布:重大改进与新特性概览!
· AI与.NET技术实操系列(二):开始使用ML.NET
· .NET10 - 预览版1新功能体验(一)
2021-08-03 sql中的exists
2018-08-03 Oracle 18C DBCA建库报ora-01012错误