Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
148 views
in Technique[技术] by (71.8m points)

sql server - Efficient ways to code or replace double for-loop (table traversal for another table traversal) in SQL

So now I have two tables, namely dbo.DEMAND and dbo.SUPPLY, which contain columns including bidderID/vendorID, itemID, quantity, bidPrice/sellPrice, etc. Now I want to regularly (accomplished by using SQL server's proxy) and automatically match the rows meeting the criteria (here is my problem) specified below, ordered by publishTime, and transfer some information into another table dbo.DEAL.

The way I am currently doing is to first declare @temp table and insert everything in dbo.DEMAND into @temp. Then I use WHILE to traverse the dbo.DEMAND table, for each row use SELECT TOP 1 XXX FROM the dbo.SUPPLY table when specific conditions are met. Finally I DELETE the row in the @temp table when a traversal in dbo.SUPPLY is finished (either @temp.Quantity becomes 0, or when no match can be found in the dbo.SUPPLY table) and restart with the next traversed row. To be more specific, my goal is to find all matches satisfying:

{1. @temp.ItemID = SUPPLY.ItemID;
 2. @temp.Bid > SUPPLY.Price;
 3. @temp.LowestDegre > SUPPLY.DegreeN.}

one-by-one, first traversing @temp ordered by DEMAND.PublishTime, and when one match is found DO:

{1. For the 2 rows (in the two tables), DELETE both rows and the 1 corresponding row in dbo.DEMAND if Quantities are equal, or, 
DELETE the row with smaller Quantity, and SUBTRACT the Quantity in the row with larger Quantity with the smaller one;
2. Add a row in the DEAL table, containing BuyerID, SellerID, ItemID, BidPrice, etc;
3. Redo the MATCH procedure.}

Basically in my opinion, this way is same as double-for-loops in C++. However, I've heard that SQL is good at dealing with sets (presumably like we always want matrix operations in MATLAB), and row-wise traversal does take a long time. How can I optimize the algorithm I am currently using?

Sample data: Sample data from the three tables

Sample code:

DECLARE @temp TABLE
(
buyer char(20),
bid float,
quantity smallint,
item char(20),
lowestnew decimal(18,2),
lowestrep decimal(18,2),
pubtime date
);

INSERT INTO @temp(buyer,bid,quantity,item,lowestnew,lowestrep,pubtime)
SELECT BuyerID,Bid,Quantity,ItemID,LowestDegreeNew,LowestReputation,PublishTime FROM DEMAND
ORDER BY PublishTime;
DECLARE @flag int
SET @flag = 0

DECLARE @seller char(20), @buyer CHAR(20), @item char(20), @lowestdegree decimal(18,2), @lowestrep decimal(18,2)

WHILE EXISTS (SELECT buyer, item, lowestnew, lowestrep FROM @temp)

BEGIN

SET @flag = 1

SELECT @buyer = buyer, @item = item, @lowestdegree = lowestnew, @lowestrep = lowestrep FROM @temp

DECLARE @price FLOAT, @quantity SMALLINT

IF EXISTS (select Price, SellerID, Quantity from SUPPLY, SELLER where SellerID = ID and @lowestdegree >= DegreeNew and @lowestrep>=Reputation order by Price)

BEGIN

SELECT TOP 1 @price=Price, @seller = SellerID, @quantity = Quantity FROM SUPPLY, SELLER WHERE SellerID = ID AND @lowestdegree >= DegreeNew AND @lowestrep>=Reputation
IF ((SELECT Quantity FROM DEMAND WHERE SellerID = @seller AND ItemID = @item AND (DegreeNew >= @LowestDegreeNew) AND (Price <= @bid)) IS NOT NULL)
        BEGIN /* TRANSFER INTO NEW TABLE */

Footnote: I understand that this could be very easily realized by writing easy C# or Python code in fore-end, but I would just like to see if it's possible to solely do this in the database system.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

TLDR: No, you must keep using a loop structure.

I've been there. The joining on "closest day" makes it hard to use a nice set operation. Here's why. Consider those records that have the same equijoin properties (vendor=item etc):

Supply: Day 1
Supply: Day 2
Demand: Day 2

In this case, you will probably want to join Supply Day 2 with Demand Day 2 due to time difference. But if it was instead:

Supply: Day 1 
Supply: Day 2 
Demand: Day 2 
Demand: Day 3

You'd probably want Supply Day 1 - Demand Day 2 and Supply Day 2 - Demand Day 3.

This means that each line depends on other lines in a non-easy way (ie: lag/lead window functions won't save you).

This is one of the rare cases you need a cursor/while.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...