Since <-.data.table doesn't make a copy, you can use <-:
Create a data.table object:
library(data.table)
di <- data.table(iris)
Create a new column:
di <- di[, z:=1:nrow(di)]
di
# Sepal.Length Sepal.Width Petal.Length Petal.Width Species z
# [1,] 5.1 3.5 1.4 0.2 setosa 1
# [2,] 4.9 3.0 1.4 0.2 setosa 2
# [3,] 4.7 3.2 1.3 0.2 setosa 3
# [4,] 4.6 3.1 1.5 0.2 setosa 4
# [5,] 5.0 3.6 1.4 0.2 setosa 5
# [6,] 5.4 3.9 1.7 0.4 setosa 6
# [7,] 4.6 3.4 1.4 0.3 setosa 7
# [8,] 5.0 3.4 1.5 0.2 setosa 8
# [9,] 4.4 2.9 1.4 0.2 setosa 9
# [10,] 4.9 3.1 1.5 0.1 setosa 10
# First 10 rows of 150 printed.
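As a minimal sketch of why the reassignment is optional: := modifies the table in place, so the new column appears on di even when the result is not assigned back. Wrapping the call in invisible() is one way to suppress the console output in versions that print it. The variable names has_z and n_rows are only illustrative.

```r
library(data.table)

di <- data.table(iris)

# := adds the column by reference, so no reassignment is needed.
# invisible() around the whole call suppresses auto-printing at the console.
invisible(di[, z := seq_len(nrow(di))])

# The column is present on the original object.
has_z  <- "z" %in% names(di)
n_rows <- nrow(di)
```

Here has_z is TRUE and n_rows is 150, since iris has 150 rows.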
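It is also worth remembering that R auto-prints only the visible value of a top-level expression; code run through source() (with the default echo = FALSE) or inside a function does not print its value.
So, in a sourced script, you can simply use:
di[, z:=1:nrow(di)]
This will not produce any output when the script is run through source().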
Further info from Matthew Dowle:
Also see FAQ 2.21 and 2.22:
2.21 Why does DT[i,col:=value] return the whole of DT? I expected either no visible value (consistent with <-), or a message or return value containing how many rows were updated. It isn't obvious that the data has indeed been updated by reference.
So that compound syntax can work; e.g., DT[i,done:=TRUE][,sum(done)]. The number of rows updated is returned when verbosity is on, either on a per query basis or globally using options(datatable.verbose=TRUE).
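The compound syntax mentioned in FAQ 2.21 can be sketched as follows. The table DT and its columns id, x and done are made up for illustration; the point is only that the result of the := call can be queried straight away because the whole table is returned.

```r
library(data.table)

# Hypothetical example data.
DT <- data.table(id = 1:6, x = c(5, 12, 7, 20, 3, 15))

# Start with every row flagged as not done.
DT[, done := FALSE]

# Update the rows where x > 10, then immediately query the updated table.
n_done <- DT[x > 10, done := TRUE][, sum(done)]
```

Three rows have x > 10, so n_done is 3, and DT itself carries the updated done column.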
2.22 Ok, but can't the return value of DT[i,col:=value] be returned invisibly, then?
- We tried to but R internally forces visibility on for [. The value of FunTab's eval column (see src/main/names.c) for [ is 0 meaning force R_Visible on (see R-Internals section 1.6). Therefore, when we tried invisible() or setting R_Visible to 0 directly ourselves, eval in src/main/eval.c would force it on again.
- After getting used to this behaviour, you might grow to prefer it (we have). After all, how many times do we subassign using <- and then immediately look at the data to check it's ok?
- We can mix := into a j which also returns data; a mixed update and select in one query. To detect whether j solely updates (and then behave differently) could be confusing.
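The visibility behaviour described in FAQ 2.22 can be observed from base R with withVisible(). This is a small sketch, not data.table's own code: the class demo_tbl and the function quiet are hypothetical names introduced only for the demonstration.

```r
# A function that returns its result invisibly.
quiet <- function() invisible(42)

# withVisible() reports the visibility flag alongside the value.
vis_plain <- withVisible(quiet())$visible   # FALSE: invisible() is honoured

# Hypothetical S3 class whose `[` method tries to return invisibly.
obj <- structure(list(a = 1), class = "demo_tbl")
`[.demo_tbl` <- function(x, ...) invisible(x)

# The flag is switched back on after `[` returns, as the FAQ describes.
vis_subset <- withVisible(obj[1])$visible   # TRUE
```

An ordinary function can hide its result, but a `[` method cannot do so by calling invisible() itself, which is the obstacle the FAQ refers to.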
Second update from Matthew Dowle:
We have now found a solution and v1.8.3 no longer prints the result when := is used. We will update FAQ 2.21 and 2.22.
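With v1.8.3 or later, a := call at the console is silent. If you do want to see the table straight after the update, appending an empty [] or calling print() explicitly does it. A short sketch, with made-up column names:

```r
library(data.table)

# Hypothetical example data.
DT <- data.table(a = 1:3)

# In v1.8.3 and later this updates DT without printing it.
DT[, b := a * 2L]

# Appending an empty [] (or calling print(DT)) displays the updated table.
DT[, total := a + b][]
```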