LOAD DATA INFILE – performance case study

转:

http://venublog.com/2007/11/07/load-data-infile-performance/

 

I often noticed that people complain about the LOAD DATA performance when loading the table with large number of rows of data. Even today I saw a case where the LOAD DATA on a simple 3 column table with about 5 million rows taking ~15 minutes of time. This is because the server did not had any tuning in regards to bulk insertion.

Consider the following simple MyISAM table on Redhat Linux 32-bit.

 

The table has a string key column. Here is the data file(download here) that I used it for testing:

 

Here is the default mysql system variables related to LOAD DATA:

 

and here is the actual LOAD DATA query to load all ~5m rows (~256M of data) to the table and its timing.

Now, lets experiment by disabling the keys in the table before running the LOAD DATA:

 

No use, just 1% increase or same…., now lets set the real MyISAM values… and try again…

 

Wow…thats almost 90% increase in the performance. So, disabling the keys in MyISAM is not just the key, but tuning the buffer size does play role based on the input data.

For the same case with Innodb, here is the status by adjusting the Innodb_buffer_pool_size=1G andInnodb_log_file_size=256M along with innodb_flush_logs_at_trx_commit=1.

With innodb_flush_logs_at_trx_commit=2, innodb_flush_method=O_DIRECT and innodb_doublewrite=0; it will be another 40% difference (use all these variables with caution, unless you know what you are doing)

posted @ 2016-05-30 14:08  yuyue2014  阅读(252)  评论(0编辑  收藏  举报