Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
1.1k views
in Technique[技术] by (71.8m points)

join - MySql: joining same table with millions rows

I have a table with millions of rows (SF_COLLECTIONS)

ID MEMBERID COLLECTIONID CARDID STATE (D / M)
1  1        1            1      D
2  1        1            2      D
3  2        1            1      M
4  2        1            2      M
5  2        1            3      D
6  1        1            3      M

and I have to cross those that have MEMBERID = 1 and STATE = D with those that have MEMBERID = 2 and STATE = M and vice versa

This is my query

SELECT 1
    FROM sf_collections AS rac
    INNER JOIN sf_collections AS myrac 
        ON 
        (myrac.cardid = rac.cardid AND 
            (
                (myrac.state = "M" AND rac.state = "D") OR 
                (myrac.state = "D" AND rac.state = "M")
            )
        ) 
    WHERE
    rac.memberid = 1 AND myrac.memberid = 2
    GROUP BY rac.memberid

(response time about 4 seconds)

Is this a valid approach or there is a better way to improve performance?

Sample data set:

CREATE TABLE `sf_collections` (
 `id` int(11) NOT NULL auto_increment,
 `memberid` int(11) NOT NULL,
 `collectionid` int(11) NOT NULL,
 `cardid` int(11) NOT NULL,
 `state` varchar(1) NOT NULL,
 PRIMARY KEY  (`id`),
 UNIQUE KEY `sf_collections_pkey` (`memberid`,`collectionid`,`cardid`,`state`),
 KEY `collectionid` (`collectionid`),
 KEY `memberid` (`memberid`),
 KEY `cardid` (`cardid`),
 KEY `state` (`state`)
) ENGINE=MyISAM AUTO_INCREMENT=22627806 DEFAULT CHARSET=latin1

INSERT INTO sf_collections (memberid,collectionid,cardid,state) VALUES
(1,1,1,'D'),
(1,1,2,'D'),
(1,1,3,'M'),
(2,1,1,'M'),
(2,1,2,'M'),
(2,1,3,'D');

SELECT 1
    FROM sf_collections AS rac
    INNER JOIN sf_collections AS myrac 
        ON 
        (myrac.cardid = rac.cardid AND 
            (
                (myrac.state = "M" AND rac.state = "D") OR 
                (myrac.state = "D" AND rac.state = "M")
            )
        ) 
    WHERE
    rac.memberid = 1 AND myrac.memberid = 2
    GROUP BY rac.memberid

and db-fiddle

Thanks

EDIT: MySql is 5.0 (very old and cannot upgrade this)

question from:https://stackoverflow.com/questions/65897454/mysql-joining-same-table-with-millions-rows

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Replace the INDEX(cardid) with INDEX(cardid, state)

If practical, don't check for both D->M and M->D; just do one direction. This will cut in half the effort.

Avoid the SELECT ... GROUP BY ... by switching to EXISTS ( SELECT 1 ... ). If there are multiple matching rows, this will speed things up. If you will eventually be "listing the matches", we may as well see what you want rather than nitpicking over the bad use of GROUP BY. Will you be using a GROUP_CONCAT?

Move from MyISAM to InnoDB. Even in 5.0 it may have been intrinsically faster for this query.

DROP INDEX(memberid) since the unique index handles such.

Do you need id for anything? If not, get rid of it and promote the 4-column UNIQUE index to be the PRIMARY KEY.

If state is just a flag (0/1), the INDEX(state) will probably never be used; drop it.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...