肥宅快落水

记测试环境一次容量报错

案发现场:收到报警登上去看发现master容量快满了,但slave还有60-70g的剩余空间,master上也没有堆积的binlog,感觉很奇怪。过了一会master容量满了,发生切换,然后老master的容量自动恢复了,感觉又很奇怪

第一时间processlist抓到一条语句如下:

*************************** 8. row ***************************

     ID: 3384308

   USER: us_test_zsj

   HOST: 10.32.132.7:61523

     DB: fltreimbursementdb

COMMAND: Query

   TIME: 552

  STATE: Sending data

   INFO: select * from exp_expressdetail ed

inner join exp_expressinfo ei on ed.expressinfoid=ed.expressinfoid and ei.getticketway='pjs'

where isvalidinpackage=1 and invtype=2 group by ed.expressinfoid having(count(distinct ed.orderid)=2) order by ed.datachange_lasttime desc limit 10

 

然后去翻errorlog

[op1@FATM00001160 mysql]$ cat mysql_error.log|grep 2020-07-30

2020-07-30T16:40:00.634672+08:00 4580228 [ERROR] InnoDB: posix_fallocate(): Failed to preallocate data for file ./ibtmp1, desired size 67108864 bytes. Operating system error number 28. Check that the disk is not full or a disk quota exceeded. Make sure the file system supports this function. Some operating system error numbers are described at http://dev.mysql.com/doc/refman/5.7/en/operating-system-error-codes.html

2020-07-30T16:40:00.660929+08:00 4580228 [Warning] InnoDB: Retry attempts for writing partial data failed.

2020-07-30T16:40:00.660956+08:00 4580228 [ERROR] InnoDB: Write to file ./ibtmp1failed at offset 66315091968, 1048576 bytes should have been written, only 0 were written. Operating system error number 28. Check that your OS and file system support files of this size. Check also that the disk is not full or a disk quota exceeded.

2020-07-30T16:40:00.660985+08:00 4580228 [ERROR] InnoDB: Error number 28 means 'No space left on device'

2020-07-30T16:40:00.661002+08:00 4580228 [Warning] InnoDB: Error while writing 67108864 zeroes to ./ibtmp1 starting at offset 66249031680

2020-07-30T16:40:00.661034+08:00 4580228 [ERROR] /usr/sbin/mysqld: The table '/data/tmp/#sql_2d_7' is full

2020-07-30T16:40:00.661082+08:00 4579873 [ERROR] InnoDB: posix_fallocate(): Failed to preallocate data for file ./ibtmp1, desired size 1048576 bytes. Operating system error number 28. Check that the disk is not full or a disk quota exceeded. Make sure the file system supports this function. Some operating system error numbers are described at http://dev.mysql.com/doc/refman/5.7/en/operating-system-error-codes.html

 

第一句:给ibtmp1预分配空间失败,结合后面的语句,猜测就是这个语句引起的。Mysql中join会创建一个临时表用于执行语句,猜测可能是临时表太大把它撑满了。

 

问题复现:

#数据+索引

mysql>select concat(round(sum((data_length+index_length)/1024/1024/1024),2),'GB') as data from tables where table_schema='fltreimbursementdb' and table_name='exp_expressdetail';

+--------+

| data   |

+--------+

| 0.29GB |

+--------+

1 row in set (0.00 sec)

 

mysql> select concat(round(sum((data_length+index_length)/1024/1024/1024),2),'GB') as data from tables where table_schema='fltreimbursementdb' and table_name='exp_expressinfo';

+--------+

| data   |

+--------+

| 0.23GB |

+--------+

 

发现这2个表好像也不是很大

 

#行数

mysql> select count(*) from fltreimbursementdb.exp_expressdetail;

select count(*) from fltreimbursementdb.exp_expressinfo;

+----------+

| count(*) |

+----------+

|  1809134 |

+----------+

1 row in set (7.21 sec)

 

mysql> +----------+

| count(*) |

+----------+

|   668840 |

+----------+

1 row in set (2.49 sec)

 

但是有很多行,2个表join,笛卡尔积操作后会生产1809134*668840=1210021184560条数据(一万亿+)

 

2个表dump到自己的测试机上试了一下,跑了这个语句

mysql> select * from exp_expressdetail ed

    -> inner join exp_expressinfo ei on ed.expressinfoid=ed.expressinfoid and ei.getticketway='pjs'

->  where isvalidinpackage=1 and invtype=2 group by ed.expressinfoid having(count(distinct ed.orderid)=2) order by ed.datachange_lasttime desc limit 10;

 

 

 

发现的确临时表空间在增大,到37g时语句报错:ERROR 1114 (HY000): The table '/data/tmp/#sql_12c_0' is full,(我的机器就40g容量)

问题复现成功。容量满了以后触发切换,重启以后容量恢复正常就说得通了。Master爆了slave没爆也解释的通了,因为这个语句在master爆了的时候还没执行完,也就不会进binlog被传到slave那边执行了。

学到了不止是insert,Select也可能导致容量不足。

 

posted on 2020-07-30 22:31  肥宅快落水  阅读(869)  评论(0编辑  收藏  举报

导航