
mysql - Slow INSERT into InnoDB table with random PRIMARY KEY column's value

For my website I use the PHP API for Flickr ( http://www.flickr.com/services/api/ ). This API provides several useful methods to get photos around particular GPS positions.

A call to an API method looks like a URL with specific parameters such as latitude, longitude, API key, radius, sorting, etc. Say, it looks like http://api.flickr.com/method?lat=0.0&lon=0.0&radius=10

My website makes over 200,000 calls to the API to generate several pages with pictures from Flickr. That is a pretty hard push on the API, so I created a results cache in a MySQL database.

A simplified schema of the InnoDB cache table is:

char(32) request
datetime expires // 2-3 days
text     response // serialized data from API response

where request is the PRIMARY KEY and holds the MD5 hash of the request URI. The other fields are quite simple :)
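In SQL terms, the table above would look roughly like this (a sketch; the table name, NOT NULL constraints, and charset are my assumptions, not from the original):

```sql
-- Sketch of the cache table described above (name and constraints assumed).
CREATE TABLE api_cache (
    request  CHAR(32)  NOT NULL,  -- MD5 hash of the request URI
    expires  DATETIME  NOT NULL,  -- typically now + 2-3 days
    response TEXT      NOT NULL,  -- serialized data from the API response
    PRIMARY KEY (request)
) ENGINE=InnoDB;
```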

The problem arises when the table becomes large enough, say over 100,000 rows. New INSERTs take up to 2 seconds (and up to 6 (!) seconds with 1,000,000 rows).

As far as I understand, the problem lies with the PRIMARY index and the engine being InnoDB. Every time a new request is inserted, InnoDB has to reorganize the clustered index and move data around, because MD5(request) is an effectively random value.

So... The question is whether there is a better way to cache such requests. Or maybe I should switch to the MyISAM engine? Or maybe I should try pseudo-partitioning and create several tables to solve the problem? Or maybe just use a HASH index instead of a BTREE?

Any ideas are welcome!

Edit:

Ok, I tried to alter the table as Furicane and Johan suggested, but still no luck - INSERTs still take up to 3 seconds. Currently the request field is a normal non-unique index, and a new id column has been added as an auto-increment PRIMARY KEY. I also tried adding 4 partitions to this table, with the same results.

I think the index on the request field is still the bottleneck. The only other way I currently see is to determine all possible parameters, add them as columns to the table, and then create an index on them.
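That parameter-column idea might look something like this (a sketch only; every name, type, and precision here is an assumption, since the original does not list the actual parameters):

```sql
-- Hypothetical parameter-column layout (all names and types assumed).
CREATE TABLE api_cache_by_params (
    id       INT UNSIGNED      NOT NULL AUTO_INCREMENT,
    lat      DECIMAL(9,6)      NOT NULL,   -- latitude from the request
    lon      DECIMAL(9,6)      NOT NULL,   -- longitude from the request
    radius   SMALLINT UNSIGNED NOT NULL,   -- search radius
    expires  DATETIME          NOT NULL,
    response TEXT              NOT NULL,
    PRIMARY KEY (id),
    KEY idx_params (lat, lon, radius)      -- composite lookup index
) ENGINE=InnoDB;
```

Inserts would then append to the clustered index in id order, while lookups go through idx_params.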

Any other ideas? :)

Edit 2:

Salman A said in the comments below that a similar table of his performs much better (~0.03 s per insert). So the problem may be IO load on the system, though I cannot see any high load on it.

iostat results:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          22.94    0.71    8.42    8.50    0.00   59.43

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              38.01       151.04       114.32 1383655437 1047309046

iotop results:

Total DISK READ: 152.91 K/s | Total DISK WRITE: 197.67 K/s

MySQL is at the top of the list for both writing and reading. Maybe my disks are almost dead? How can I check disk performance?
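A quick first check is a sequential-write test with dd (a sketch; assumes GNU dd, and the file names here are arbitrary):

```shell
# Rough sequential-write benchmark (assumes GNU dd is available).
# conv=fdatasync makes dd flush data to disk before reporting throughput,
# so the number reflects the disk rather than the page cache.
dd if=/dev/zero of=ddtest.bin bs=1M count=16 conv=fdatasync 2> dd_result.txt
cat dd_result.txt    # throughput appears on the last line
rm -f ddtest.bin
```

For a healthy disk, expect tens of MB/s or more; tools like fio or ioping give more detailed latency numbers.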


1 Reply


InnoDB does not support hash indexes, only BTREE.

MyISAM is infamous for being unreliable.

I think your problem is that you use the MD5 value as the primary key.

The primary key is included in every secondary key, and the PK is forced to be unique.

Set an integer auto-increment primary key and make your MD5 value a normal index. It does not even need to be unique; enforcing uniqueness is a huge part of what is slowing you down.

After this your inserts should run much faster.
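That change could be sketched as follows (assuming the cache table and column names from the question; the index name is mine):

```sql
-- Sketch of the suggested restructuring (table/column names assumed).
ALTER TABLE api_cache
    DROP PRIMARY KEY,                                      -- drop the random MD5 PK
    ADD COLUMN id INT UNSIGNED NOT NULL AUTO_INCREMENT FIRST,
    ADD PRIMARY KEY (id),                                  -- monotonic clustered key
    ADD KEY idx_request (request);                         -- non-unique lookup index
```

With a monotonically increasing primary key, new rows are appended to the end of the clustered index instead of being inserted at random positions.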

