In data.table it is possible to have columns of type list, and I'm trying to benefit from this feature for the first time. I need to store, for each row of my table dt, several comments taken from an rApache web service. Each comment will have a username, datetime, and body item.
Instead of using long strings with some unusual character to separate each message from the others (like |), and a ; to separate each item in a comment, I thought to use lists like this:
library(data.table)
dt <- data.table(id=1:2,
comment=list(list(
list(username="michele", date=Sys.time(), message="hello"),
list(username="michele", date=Sys.time(), message="world")),
list(
list(username="michele", date=Sys.time(), message="hello"),
list(username="michele", date=Sys.time(), message="world"))))
> dt
id comment
1: 1 <list>
2: 2 <list>
The idea is to store all the comments added for one particular row in a single cell (also because it will be easier to convert to JSON later on, when I need to send it back to the UI).
However, when I try to simulate how I will actually be filling my table in production (adding a single comment to a particular row), R either crashes or doesn't assign what I would like and then crashes:
> library(data.table)
> dt <- data.table(id=1:2, comment=vector(mode="list", length=2))
> dt$comment
[[1]]
NULL
[[2]]
NULL
> dt[1L, comment := 1] # this works
> dt$comment
[[1]]
[1] 1
[[2]]
NULL
> set(dt, 1L, "comment", list(1, "a")) # assigns only `1`, and R crashes when I try to print `dt`
Warning message:
In set(dt, 1L, "comment", list(1, "a")) :
Supplied 2 items to be assigned to 1 items of column 'comment' (1 unused)
> dt[1L, comment := list(1, "a")] # R crashes as soon as I run
> dt[1L, comment := list(list(1, "a"))] # either of these two
I know I'm trying to misuse data.table, e.g. the way the j argument has been designed allows this:
dt[1L, c("id", "comment") := list(1, "a")] # lists in RHS are seen as different columns! not parts of one
Question: So, is there a way to do the assignment I want? Or do I just have to take dt$comment out into a variable, modify it, and then re-assign the whole column every time I need to do an update?
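For what it's worth, here is a minimal sketch of one approach that may work, assuming a reasonably recent data.table version. The idea is that the right-hand side needs one extra level of list() wrapping: the outermost level is read as "one element per column", the next as "one element per row", and only what is inside that becomes the cell value. The names new_comment, another and existing are made up for illustration.

```r
library(data.table)

dt <- data.table(id = 1:2, comment = vector(mode = "list", length = 2))

# Hypothetical comments; each comment is itself a list.
new_comment <- list(username = "michele", date = Sys.time(), message = "hello")
another     <- list(username = "michele", date = Sys.time(), message = "world")

# Outer list():  one element per column being assigned (here only `comment`).
# Middle list(): one element per row being assigned (here only row 1).
# Inner list():  the cell value itself, a list holding one comment.
dt[1L, comment := list(list(list(new_comment)))]

# Append a second comment to row 1: read the current cell, extend it,
# and write it back with the same two levels of wrapping.
existing <- dt$comment[[1L]]
dt[1L, comment := list(list(c(existing, list(another))))]

# The same wrapping should apply to set(); row 2 gets a single comment.
set(dt, 2L, "comment", list(list(list(new_comment))))

length(dt$comment[[1L]])  # number of comments stored for row 1
```

This updates only the targeted cell by reference, so the whole column does not have to be copied out and re-assigned. Whether the crashes described above still occur depends on the data.table version in use.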