Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


kdb - most performant way to get asof price given a list of timestamps

I have a list of timestamps spanning multiple dates (no sym, just timestamps). There can be 1,000 to 2,000 of them at a time.

What's the most performant way to hit an hdb and get the closest price available for each timestamp?

select from hdbtable where date = x -> can be over 60 million rows.

Doing this for each date and then running an aj on top performs very poorly.

Any suggestions are welcome

question from:https://stackoverflow.com/questions/65848896/most-performant-way-to-get-asof-price-given-a-list-of-timestamps


1 Reply


The most performant way to aj, assuming the HDB follows the standard convention of being date-partitioned with the `p# attribute on sym, is

aj[`sym`time;select sym,time,other from myTable where …;select sym,time,price from prices where date=x]

There should be no additional filters/where-clause on the prices table other than date; extra filters would stop aj from working directly on the memory-mapped, `p#-attributed data.

You say you have no syms, just timestamps, but what does that mean? Do you want the price of all syms at each timestamp, or the last price of any sym at each timestamp? The former is easy: join your timestamps to your distinct sym list and use that as the "left" table in the aj. The latter is harder because the HDB data likely isn't fully sorted on time; it's likely sorted by sym and then time. In that case you might again have to join your timestamps to your distinct sym list, aj to get the price for all syms, and from that result take the row with the max time.
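
As a rough illustration of the first case (price of every sym as of each timestamp), here is a minimal sketch on a toy in-memory table. The table contents and the names targets and left are made up for the example; against a real HDB the right-hand table would be the date-filtered select shown earlier.

```q
/ toy in-memory stand-in for one date of an HDB trade table (hypothetical data)
/ rows are sorted by sym and then time, as aj expects
trade:([]sym:`A`A`B`B;time:0D09:30 0D10:00 0D09:45 0D10:15;price:10 11 20 21f)
/ sym-less target times, as in the question
targets:([]time:0D09:50 0D10:30)
/ pair every target time with every sym to build the "left" table
left:(select distinct sym from trade)cross targets
/ as-of join: last price per sym at or before each target time
aj[`sym`time;left;trade]
```

The cross makes the left table grow as (number of syms) x (number of timestamps), so this is only cheap when the sym list is reasonably small.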

So I guess it depends on a few factors. More info might help.

EDIT: suggestion based on further discussion:

targetTimes:update targetTime:time from ([]time:"n"$09:43:19 10:27:58 13:12:11 15:34:03);
res:aj0[`sym`time;(select distinct sym from trade where date=2021.01.22)cross targetTimes;select sym,time,price from trade where date=2021.01.22];

select from res where not null price,time=(max;time)fby targetTime
sym  time                 targetTime           price
----------------------------------------------------
AQMS 0D09:43:18.999937967 0D09:43:19.000000000 4.5
ARNA 0D10:27:57.999842638 0D10:27:58.000000000 76.49
GE   0D15:34:02.999979520 0D15:34:03.000000000 11.17
HAL  0D13:12:10.997972224 0D13:12:11.000000000 18.81

This gives the price of whichever sym traded most recently at or before each targetTime. You would then peach this over multiple dates:

{targetTimes: ...;res:aj0[...];select from res ...}peach mydates;
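
One possible way to flesh out that per-date function is sketched below. It runs against a toy in-memory table with a date column rather than a real partitioned HDB, and closestAny, d and tt are hypothetical names chosen for the example, not anything from the original post.

```q
/ toy table with a date column standing in for a date-partitioned HDB table (hypothetical data)
trade:([]date:2021.01.21 2021.01.21 2021.01.22 2021.01.22;sym:`A`B`A`B;time:0D09:30 0D09:45 0D09:40 0D09:35;price:10 20 11 21f)
/ keep the requested time in targetTime, since aj0 overwrites time with the matched trade time
targetTimes:update targetTime:time from ([]time:0D09:50 0D10:30)
/ closestAny (hypothetical helper): for date d, the latest trade in any sym at or before each target time
closestAny:{[d;tt]
  t:select sym,time,price from trade where date=d;
  res:aj0[`sym`time;(select distinct sym from t)cross tt;t];
  update date:d from select from res where not null price,time=(max;time)fby targetTime}
/ one call per date; peach behaves like each unless q was started with secondary threads (-s)
raze closestAny[;targetTimes]peach 2021.01.21 2021.01.22
```

Tagging each result with its date before the raze keeps rows from different dates distinguishable in the combined table.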

Note that what's making this complicated is your requirement that it be the price of any sym that's closest to your sym-less targetTimes. This seems strange - usually you would want the price of sym(s) as of a particular time, not the price of anything closest to a particular time.

