Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
140 views
in Technique[技术] by (71.8m points)

indexing - Mysql subquery always doing filesort

I have a table gamesplatform_pricehistory with an index : (id_app,country,dateup)

doing this

explain select dateup from gamesplatform_pricehistory
    where id_app=1 and country=1
    order by dateup desc limit 1

shows "Using where; Using index"

but with a subquery :

explain select app.id, (select dateup from gamesplatform_pricehistory
                           where id_app=app.id and country=1
                           order by dateup desc limit 1)
      from app where id > 0;

shows Using where; Using index; Using filesort

Here is a sqlfiddle showing directly the problem : http://sqlfiddle.com/#!2/034bc/1

benchmark with millions rows : (table games_platform is the same as app) :

SELECT sql_no_cache thepricehistory.dateup
    FROM games_platform
    LEFT JOIN (SELECT max(dateup) as dateup, id_app
                   FROM gamesplatform_pricehistory
                   WHERE country='229' GROUP BY id_app
              ) thepricehistory
                         ON thepricehistory.id_app =games_platform.id
    WHERE games_platform.id=2 

eval : 0.8s

SELECT sql_no_cache ( SELECT dateup FROM gamesplatform_pricehistory
                        WHERE id_app= games_platform.id AND country='229'
                        ORDER BY dateup DESC LIMIT 1
                    ) AS dateup
    FROM games_platform
    WHERE games_platform.id=2 

eval : 0.0003s

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Using filesort isn't necessarily a bad thing. The name is a bit misleading. Though it contains "file", it does not mean, that the data is written anywhere on hard disk. It still is just processed in memory.

From the manual:

MySQL must do an extra pass to find out how to retrieve the rows in sorted order. The sort is done by going through all rows according to the join type and storing the sort key and pointer to the row for all rows that match the WHERE clause. The keys then are sorted and the rows are retrieved in sorted order. See Section 8.2.1.11, “ORDER BY Optimization”.

You understand why this happens in your query, right? Using this kind of subqueries is bad style since it's a dependent subquery. For every row in your app table the subquery is executed. Very bad. Rewrite the query with a join.

select app.id,
gp.dateup
from app 
join gamesplatform_pricehistory gp on gp.id_app = app.id
where app.id > 0
and gp.country = 1
and gp.dateup = (SELECT MAX(dateup) FROM gamesplatform_pricehistory smgp WHERE smgp.id_app = gp.id_app AND smgp.country = 1)
;

This still uses a dependent subquery, but the explain looks much better:

| id |        select_type | table |  type | possible_keys |     key | key_len |                        ref | rows |                    Extra |
|----|--------------------|-------|-------|---------------|---------|---------|----------------------------|------|--------------------------|
|  1 |            PRIMARY |   app | index |       PRIMARY | PRIMARY |       4 |                     (null) |    2 | Using where; Using index |
|  1 |            PRIMARY |    gp |   ref |        id_app |  id_app |       5 |    db_2_034bc.app.id,const |    1 | Using where; Using index |
|  2 | DEPENDENT SUBQUERY |  smgp |   ref |        id_app |  id_app |       5 | db_2_034bc.gp.id_app,const |    1 |              Using index |

Another way to rewrite it, would be this:

select app.id,
gp.dateup
from app 
LEFT join 
(SELECT id_app, MAX(dateup) AS dateup 
 FROM gamesplatform_pricehistory
 WHERE country = 1
 GROUP BY id_app
)gp on gp.id_app = app.id
where app.id > 0
;

The explain looks even better:

| id | select_type |                      table |  type | possible_keys |     key | key_len |    ref | rows |                    Extra |
|----|-------------|----------------------------|-------|---------------|---------|---------|--------|------|--------------------------|
|  1 |     PRIMARY |                        app | index |       PRIMARY | PRIMARY |       4 | (null) |    2 | Using where; Using index |
|  1 |     PRIMARY |                 <derived2> |   ALL |        (null) |  (null) |  (null) | (null) |    2 |                          |
|  2 |     DERIVED | gamesplatform_pricehistory | index |        (null) |  id_app |      13 | (null) |    2 | Using where; Using index |

And here is a version where you have no dependent subquery at all:

select app.id,
gp.dateup
from app 
left join gamesplatform_pricehistory gp on gp.id_app = app.id and country = 1
left join gamesplatform_pricehistory gp2 on gp.id_app = app.id and country = 1 and gp.dateup < gp2.dateup
where app.id > 0
and gp2.dateup is null
;

It works like this: When gp.dateup is at its maximum, there is no gp2.dateup.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...