SQL - many-to-many table primary key

posted Oct 17, 2021 in Technique by 深蓝 (71.8m points)

This question came up after I read a comment on another question:

When you create a many-to-many table, should you create a composite primary key on the two foreign key columns, or create an auto-increment surrogate "ID" primary key and just put indexes on your two FK columns (and maybe a unique constraint)? What are the performance implications for inserting new records and re-indexing in each case?

Basically, this:

PartDevice
----------
PartID (PK/FK)
DeviceID (PK/FK)

vs. this:

PartDevice
----------
ID (PK/auto-increment)
PartID (FK)
DeviceID (FK)
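
For concreteness, the two layouts could be declared roughly as follows. This is a minimal sketch using Python's built-in sqlite3 module with an in-memory database; the table names PartDeviceComposite and PartDeviceSurrogate, the parent tables Part and Device, and the helper rejects_duplicate are illustrative assumptions, not part of the original post. Other database engines use different syntax for auto-increment columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Hypothetical parent tables, reduced to their keys.
    CREATE TABLE Part   (PartID   INTEGER PRIMARY KEY);
    CREATE TABLE Device (DeviceID INTEGER PRIMARY KEY);

    -- Design 1: composite primary key on the two foreign key columns.
    CREATE TABLE PartDeviceComposite (
        PartID   INTEGER NOT NULL REFERENCES Part(PartID),
        DeviceID INTEGER NOT NULL REFERENCES Device(DeviceID),
        PRIMARY KEY (PartID, DeviceID)
    );

    -- Design 2: surrogate key; the pair needs its own UNIQUE constraint.
    CREATE TABLE PartDeviceSurrogate (
        ID       INTEGER PRIMARY KEY AUTOINCREMENT,
        PartID   INTEGER NOT NULL REFERENCES Part(PartID),
        DeviceID INTEGER NOT NULL REFERENCES Device(DeviceID),
        UNIQUE (PartID, DeviceID)
    );

    INSERT INTO Part   VALUES (1);
    INSERT INTO Device VALUES (1);
""")

def rejects_duplicate(table):
    """Insert the same (PartID, DeviceID) pair twice; True if the second insert fails."""
    conn.execute(f"INSERT INTO {table} (PartID, DeviceID) VALUES (1, 1)")
    try:
        conn.execute(f"INSERT INTO {table} (PartID, DeviceID) VALUES (1, 1)")
    except sqlite3.IntegrityError:
        return True
    return False

composite_ok = rejects_duplicate("PartDeviceComposite")
surrogate_ok = rejects_duplicate("PartDeviceSurrogate")
```

In this sketch both designs reject a repeated pair, but the surrogate design only does so because of the extra UNIQUE constraint; the composite key enforces it by definition.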

The commenter says:

making the two IDs the PK means the table is physically sorted on the disk in that order. So if we insert (Part1/Device1), (Part1/Device2), (Part2/Device3), then (Part 1/Device3) the database will have to break the table apart and insert the last one between entries 2 and 3. For many records, this becomes very problematic as it involves shuffling hundreds, thousands, or millions of records every time one is added. By contrast, an autoincrementing PK allows the new records to be tacked on to the end.

I'm asking because I've always been inclined to use the composite primary key with no surrogate auto-increment column, but I'm not sure whether the surrogate key is actually more performant.

1 Reply

深蓝 · Answer 1 · 2021-10-16T22:33:18+0000

With a simple two-column many-to-many mapping, I see no real advantage to having a surrogate key. A primary key on (col1, col2) guarantees that each pair is unique (assuming the col1 and col2 values are unique in the referenced tables), and a separate index on (col2, col1) will catch those cases where the opposite order would execute faster. The surrogate is a waste of space.
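
The primary key plus reverse-order index arrangement can be sketched as below, again with Python's sqlite3 module and an in-memory database. The index name idx_device_part and the sample rows are assumptions chosen for illustration, and the query plan text is SQLite-specific; other engines report plans differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PartDevice (
        PartID   INTEGER NOT NULL,
        DeviceID INTEGER NOT NULL,
        PRIMARY KEY (PartID, DeviceID)
    );
    -- Reverse-order index for lookups that start from the device side.
    CREATE INDEX idx_device_part ON PartDevice (DeviceID, PartID);
    INSERT INTO PartDevice VALUES (1, 1), (1, 2), (2, 3), (1, 3);
""")

# Parts used by device 3: can use the (DeviceID, PartID) index.
parts_for_device = [row[0] for row in conn.execute(
    "SELECT PartID FROM PartDevice WHERE DeviceID = ? ORDER BY PartID", (3,))]

# Devices containing part 1: can use the primary key index.
devices_for_part = [row[0] for row in conn.execute(
    "SELECT DeviceID FROM PartDevice WHERE PartID = ? ORDER BY DeviceID", (1,))]

# The last column of each EXPLAIN QUERY PLAN row describes the access path.
plan = " ".join(row[-1] for row in conn.execute(
    "EXPLAIN QUERY PLAN SELECT PartID FROM PartDevice WHERE DeviceID = 3"))
```

Printing plan should show the device-side lookup going through idx_device_part rather than scanning the whole table.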

You won't need indexes on the individual columns, since the table should only ever be used to join the two referenced tables together, and each composite index already covers lookups on its leading column.

That comment you refer to in the question is not worth the electrons it uses, in my opinion. It sounds like the author thinks the table is stored in an array rather than an extremely high performance balanced multi-way tree structure.

For a start, the table itself never needs to be stored or read in sorted order; only the index does. And the index is not a flat sequential list either: it is a tree structure, so a new key is slotted into the right node without shifting every entry that sorts after it.
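
A small experiment along these lines, using the insert order from the quoted comment, is sketched below with Python's sqlite3 module. It relies on SQLite-specific behaviour: a table declared this way is a rowid table, so rows keep their insertion-order rowid and the composite key lives in a separate B-tree index. Engines that cluster the table on its primary key (SQL Server with a clustered index, or MySQL's InnoDB) keep the rows inside the key's B-tree instead, so this is an illustration of the idea rather than a statement about every database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE PartDevice (
        PartID   INTEGER NOT NULL,
        DeviceID INTEGER NOT NULL,
        PRIMARY KEY (PartID, DeviceID)
    )
""")

# Same insert order as the quoted comment; (1, 3) arrives last.
for pair in [(1, 1), (1, 2), (2, 3), (1, 3)]:
    conn.execute("INSERT INTO PartDevice (PartID, DeviceID) VALUES (?, ?)", pair)

# Key order is what you get when sorting by the primary key columns.
by_key = conn.execute(
    "SELECT PartID, DeviceID FROM PartDevice ORDER BY PartID, DeviceID").fetchall()

# In a rowid table such as this one, rowid reflects insertion order.
by_insertion = conn.execute(
    "SELECT PartID, DeviceID FROM PartDevice ORDER BY rowid").fetchall()
```

Here by_insertion keeps (1, 3) in last place while by_key returns it between (1, 2) and (2, 3), so the late row is found in key order without the earlier rows having been renumbered.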

In addition, the vast majority of database tables are read far more often than written. That makes anything you do on the select side far more relevant than anything on the insert side.
