Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
188 views
in Technique[技术] by (71.8m points)

database design - in MySQL, is it good to have 500 Columns in one table?

in MySQL, is it good to have 500 Columns in one table? the rows will be increasing daily, the maximum row count would be less than or equal to 1million.

just to give a brief, these are my column headers

TableName: process_detail id, process_id, item_id, item_category, attribute1,attribute2,attribute3,...,attribute500,user_id1_update_time,user_id2_update_time,user_id1_comments,user_id2_comments

all attributes are varchar with length maximum 30. but less than 30.

and i have 25 item_categories.

should i create one single table like 'process_detail', or should i create multiple tables categorywise like category1_process_detail,category2_process_detail,..,category25_process_detail

each item_category has different number of columns, some will have 200 columns and some will have only 50 columns. lets say category1 will have 200 columns, category2 will have 50 columns.

some columns in category1 will not be in category2.

item_ids in process_id1 may come in process_id2. what is the best approach in terms of good performance ? please adivce.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

No it is not a good idea. Instead use a many to one relational mapping.

For example, create the process_detail table as you propose but without the attribute columns. Then create another table process_detail_attributes

CREATE TABLE `process_detail_attributes`  
(`pda_id` INT NOT NULL AUTO_INCREMENT,  
 `id` INT NOT NULL,   
 `attribute_key` INT NOT NULL,  
 `attribute_value` VARCHAR(30) NOT NULL,  
  PRIMARY KEY(`pda_id`),  
  FOREIGN KEY (id) REFERENCES process_detail (id)
) ENGINE...

Then for each attribute (attribute1...attribute500) needed just enter a row into the attribute table with the appropriate id inserted into the foreign key column.

The benefits of doing this are numerous. The link Doomenik mentions is probably a good starting point to understand why, but to put it tersely...

-If all attributes aren't used there will be no wasted storage space.
-Even if the attributes are used, the data will be stored within the actual index B-Tree node, exorbitantly inflating the amount of data per page and decreasing the amount of pages able to fit in the buffer pool (i.e RAM) and decreasing the locality of the keys. This will subsequently slow the index traversal.
-If these attributes are going to require indices (which attributes often do) then the unruliness of this table will be unconscionable.

There are of course times when you can consider de-normalization for the sake of performance but this does not seem like one of them.

You can then select the data from process_detail with all of its attributes like this:

SELECT a.process_id,  
a.user_id1_u??pdate_time,  
a.user_id2_u??pdate_time,  
a.user_id1_comments,  
a.user_id2_comments,  
b.*  
FROM process_detail a INNER JOIN process_detail_attributes b  
WHERE a.id = b.id AND whatever_condition_you_want_to_filter_by_here;

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...