Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
396 views
in Technique[技术] by (71.8m points)

sql server - Most Efficient (Fast) T-SQL DELETE For Many Rows?

Our server application receives information about rows to add to the database at a rate of 1000-2000 rows per second, all day long. There are two mutually-exclusive columns in the table that uniquely identify a row: one is a numeric identifier called 'tag' and the other is a 50character string called 'longTag'. A row can have either a tag or a longTag; not both.

Each row that comes in off the socket may or may not already exist in the table. If it exists, that row must be updated with the new information. If it doesn't exist, it must be added. We are using SQL 2005 and in a few cases even SQL 2000, so we cannot use the new MERGE keyword.

The way I am doing this now is to build a giant DELETE statement that looks like this:

DELETE from MyRecords
WHERE tag = 1
OR tag = 2
OR longTag = 'LongTag1'
OR tag = 555

...where each incoming row has its own 'OR tag = n' or 'OR longTag = 'x'' clause.

Then I perform an XML Bulk Load using ISQLXMLBulkLoad to load all the new records at once.

The giant DELETE statement sometimes times out, taking 30 seconds or longer. I'm not sure why.

As records come in off the socket they must either be inserted or they must replace existing rows. Is the way I'm doing it the best way to do it?

EDIT: The ratio of new rows to replacement rows is going to be very heavily slanted toward new rows. In data I have seen coming from production, there will typically be 100-1000 new rows for each correction.

EDIT 2: Both the inserts and the deletes must be handled as a single transaction. If either the insert or the delete fails, they must both be rolled back, leaving the table in the same state it was in before the inserts & deletes began.

EDIT 3: Regarding NULL tags. I need to first briefly describe the system a little more. This is a database for a trading system. MyTable is a trades table containing two kind of trades: so-called "day trades" and so-called "opening positions." Day trades are simply trades -- if you were an options trader and you did a trade, that trade would be a day trade in this system. Opening positions are basically a summary of your portfolio up until today. Both opening positions and day trades are stored in the same table. Day trades have tags (either longTags or numeric tags), and opening positions do not. There can be duplicate rows for opening positions -- that is fine & normal. But there cannot be duplicate rows for day trades. If a day trade comes in with the same tag as some record already in the database, then the data in the table is replaced with the new data.

So there are 4 possibilities for the values in tag & longTag:

1) tag is non-zero & longTag is empty: this is a day trade with a numeric identifier. 2) tag is zero and longTag has a non-empty character value. This is a day trade with an alphanumeric identifier. 3) tag is zero and longTag is empty: this is an opening position. 4) tag is non-zero and longTag has a non-empty character value. This is prevented from every happening by our server software, but if it were to happen the longTag would be ignored and it would be treated the same as case #1. Again, this does not happen.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

I think splitting the giant DELETE statement into 2 DELETE may help.

1 DELETE to deal with tag and a separate DELETE to deal with longTag. This would help SQL server to choose to use indexes efficiently.

Of course you can still fire the 2 DELETE statements in 1 DB round-trip.

Hope this helps


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...