Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
184 views
in Technique[技术] by (71.8m points)

MYSQL Query Performance and Optimization

I have the following database structure:

CREATE TABLE `posts` (
 `id` int(11) NOT NULL AUTO_INCREMENT,
 `title` varchar(255) NOT NULL,
 PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

CREATE TABLE `tags` (
 `id` int(11) NOT NULL AUTO_INCREMENT,
 `name` varchar(255) NOT NULL,
 `seo` varchar(255) NOT NULL,
 PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

CREATE TABLE `tags_table_one` (
 `id` int(11) NOT NULL AUTO_INCREMENT,
 `post_id` int(11) NOT NULL,
 `tag_id` int(11) NOT NULL,
 PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

CREATE TABLE `tags_table_three` (
 `id` int(11) NOT NULL AUTO_INCREMENT,
 `post_id` int(11) NOT NULL,
 `tag_id` int(11) NOT NULL,
 PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

CREATE TABLE `tags_table_two` (
 `id` int(11) NOT NULL AUTO_INCREMENT,
 `post_id` int(11) NOT NULL,
 `tag_id` int(11) NOT NULL,
 PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

After adding data to this tables:

  • posts (~ 500k rows)
  • tags (~ 30k rows)
  • tags_table_one (~ 5 milions rows)
  • tags_table_two (~ 700k rows)
  • tags_table_three (~ 800k rows)

This is the Query i tried which works but it is slow:

SELECT p.title
FROM   `tags_table_one` x
       JOIN `posts` p
         ON x.post_id = p.id
WHERE  `tag_id` IN ( 244, 229, 193, 93 )
GROUP  BY `post_id`
HAVING Count(*) = 4
       AND NOT EXISTS (SELECT 1
                       FROM   `tags_table_one`
                       WHERE  `post_id` = x.post_id
                              AND `tag_id` IN ( 92, 10, 234 )) 

My goal is to make a search filter by tags (good and bad tags). For example let's say I have as good tags "244","229","193","93" and as bad tags "92","10","234". I need a MYSQL Query to filter those posts by specified tags. The result should return all posts which has all good tags mentioned and NOT all bad tags mentioned (so it should contains ALL not just some). The thing is, good and bad tags can be choosen from any tags_table so i think it needs a JOIN or something like that which i didn't added to my query because i don't know exactly how to. I tried before and i got a bad result maybe because a lot of records into tables (In some cases, queries took over 30-40 seconds which is too much time to execute). With a few records works fast but i don't need that. I need an optimised Database / Query to make it as fast as possible. If you have any examples of databases / queries would be great to try. Even if there is any changes to MYSQL config or something I'm glad to hear.

EDIT:

The EXPLAIN of Query: enter image description here

EDIT 2:

I moved all entire data to one table with a type column and now i have the table structure like this:

CREATE TABLE `tags_table_one` (
 `post_id` mediumint(8) unsigned NOT NULL,
 `tag_id` mediumint(8) unsigned NOT NULL,
 `type` tinyint(1) NOT NULL,
 PRIMARY KEY (`post_id`,`tag_id`,`type`),
 KEY `tag_id` (`tag_id`,`post_id`,`type`),
 KEY `type` (`type`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

according @RickJames solution but now i have the following Query:

SELECT posts.id AS id,
       posts.title
FROM   `tags_table_one`
       INNER JOIN posts
               ON tags_table_one.post_id = posts.id
WHERE  ( `tag_id` IN ( 15, 25, 16, 17,
                       234, 14, 9 )
         AND `type` = 1 )
       AND ( `tag_id` IN ( 81, 48, 56 )
             AND `type` = 2 )
       AND posts.active = '1'
GROUP  BY `post_id`
HAVING Count(*) = "7" 

So let's say i have tags 15, 25, 16, 17, 234, 14, 9 with type 1 and tags 81, 48, 56 with type 2. I want to return all posts which contains all tags mentioned for type 1 and type 2. If anyone could show me an example of Query with NOT IN would be great because i need that too. So that would be for bad tags.

At this moment, that query doesn't return any result for me. It does only if i remove the AND ( tag_idIN ( 81, 48, 56 ) ANDtype = 2 ) part from the Query but is not what i wanted because isn't filtering properly.

EDIT 3

I tried to manage it but i got only this unworking Query:

SELECT p.id,
       p.title
FROM `posts` p
INNER JOIN tags_table_one t1 ON p.id=t1.post_id
INNER JOIN tags_table_one t2 ON p.id=t2.post_id
INNER JOIN tags_table_one t3 ON p.id=t3.post_id
WHERE p.active='1'
  AND t1.tag_id IN (15, 25, 16, 17, 234, 14, 9) AND t1.type = '1'
  AND t2.tag_id IN (81, 48, 56) AND t2.type = '2'
  AND t3.tag_id IN (47, 51, 355, 71) AND t3.type = '3'
GROUP BY p.id 
HAVING COUNT(t1.tag_id) = 7
AND
HAVING COUNT(t2.tag_id) = 3
AND
HAVING COUNT(t3.tag_id) = 4
ORDER BY p.id DESC

The problem is with "HAVING COUNT", if i remove that it works but it doesn't filter.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)
CREATE TABLE `tags_table_one` (
 `post_id` int(11) NOT NULL,
 `tag_id` int(11) NOT NULL,
 PRIMARY KEY (post_id, tag_id),
 INDEX(tag_id, post_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

Notes:

  • Get rid of id.
  • Make composite indexes going both ways; make one of them the PK.

More discussion: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#many_to_many_mapping_table

INDEX(post_id), INDEX(tag_id) would be faster, but still (as you found out) quite slow.

tags probably needs INDEX(name)?

After edit

SELECT  posts.id AS id, posts.title
    FROM  `tags_table_one`
    INNER JOIN  posts  ON tags_table_one.post_id = posts.id
    WHERE  `tag_id` IN ( 15, 25, 16, 17, 234, 14, 9 )
      AND  `type` = 1 
      AND  posts.active = '1'
    GROUP BY  `post_id`
    HAVING  Count(*) = "7" 

SELECT  posts.id AS id, posts.title
    FROM  `tags_table_one`
    INNER JOIN  posts  ON tags_table_one.post_id = posts.id
    WHERE  `tag_id` IN ( 81, 48, 56 )
      AND  `type` = 2 
      AND  posts.active = '1'
    GROUP BY  `post_id`
    HAVING  Count(*) = "3" 

The first of those Selects gives posts with all 7 of those tag_ids. The second of those Selects gives posts with all 3 of the second set of tag_ids.

Which do you want?

A. The posts with all 10 tags
B. The posts with all 7 of the first set, but none of the second
C. The posts with all 7 of the first set plus some of the second set

We can do a JOIN or LEFT JOIN...IS NOT NULL to get (A) or (B). (C) will take some more fiddling.

Where did type come from? since you are reading from tags_table_one in both cases, I guess it is not that "one/two/three"??

More

When you say AND t2.tag_id IN (19, 684) AND t2.type = 2 you are allowing it to have either 19 or 684. But it sounds like you want it to have both 19 and 684. That will require different SQL.

Plan A: turn that one JOIN + IN into two JOINs.

Plan B: continue with the JOIN + IN, but use HAVING count(*) = 2. But this gets messy because you have several such clauses in the same query.

Plan C: use GROUP_CONCAT(tag_id ORDER BY tag_id) = "19,684" (with the numbers in numerical order). This has other complexities.

Plan D: If you always need all of the given tags, then

WHERE t1.tag_id IN (15, 223) AND t1.type = 1
  AND t2.tag_id IN (19, 684) AND t2.type = 2
  AND t3.tag_id IN (5) AND t3.type = 4

plus

HAVING COUNT(*) = 5

might be made to work.

I think that Plan D has the most promise; give it a try.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...