Thanks for reporting this. This has been fixed now in data.table 1.8.9. Here's the timing test with the latest commit (913):
library(data.table)
system.time(expand.grid(1:1000,1:10000))
# user system elapsed
# 1.420 0.552 1.987
system.time(CJ(1:1000,1:10000))
# user system elapsed
# 0.080 0.092 0.171
From NEWS:
CJ() is 90% faster on 1e6 rows (for example), #4849. The inputs are now sorted first before combining rather than after combining and uses rep.int instead of rep (thanks to Sean Garborg for the ideas, code and benchmark) and only sorted if is.unsorted(), #2321.
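To make the NEWS item concrete, here is a minimal base R sketch of the idea it describes: sort each input first (and only if is.unsorted() says it needs sorting), then build the combinations with rep.int. The function name cross_join_sketch is hypothetical and this is not data.table's actual implementation. It also assumes the inputs contain no NA values, since is.unsorted() can return NA in that case.

```r
# Hypothetical helper illustrating the approach from the NEWS entry;
# not the real CJ() source. Assumes two atomic vectors without NAs.
cross_join_sketch <- function(x, y) {
  # Sort the small inputs up front, and only when they are not already sorted,
  # instead of sorting the much larger combined result afterwards.
  if (is.unsorted(x)) x <- sort(x)
  if (is.unsorted(y)) y <- sort(y)
  nx <- length(x)
  ny <- length(y)
  data.frame(
    V1 = rep.int(x, times = rep.int(ny, nx)),  # each x value repeated ny times
    V2 = rep.int(y, times = nx),               # the whole y vector repeated nx times
    stringsAsFactors = FALSE
  )
}

res <- cross_join_sketch(c(3L, 1L, 2L), c("b", "a"))
print(res)
```

Sorting nx + ny values before combining is much cheaper than sorting nx * ny rows afterwards, which is where most of the speedup would be expected to come from.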
Also check out NEWS for other notable features and bug fixes that have made it in; e.g., CJ() gains a new sorted argument too.
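As a quick illustration of the sorted argument, the sketch below compares the default behaviour with sorted = FALSE. It assumes a data.table version that includes this argument; the column names a and b and the example vectors are made up for the demonstration.

```r
library(data.table)

x <- c(3L, 1L, 2L)
y <- c("b", "a")

# Default (sorted = TRUE): inputs are sorted and the result is keyed.
sorted_dt <- CJ(a = x, b = y)

# sorted = FALSE: the input order is kept and no key is set.
unsorted_dt <- CJ(a = x, b = y, sorted = FALSE)

print(sorted_dt)
print(unsorted_dt)
```

Skipping the sort can be useful when the order of the inputs is meaningful or when a key is not needed.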