Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


r - Passing a data.frame by reference and updating it with Rcpp

Looking at the Rcpp documentation and the Rcpp::DataFrame examples in the gallery, I realized that I don't know how to modify a DataFrame by reference. Googling a bit, I found this post on SO and this post on the archive. Neither gives an obvious answer, so I suspect I am missing something big, such as "it is already the case because..." or "it does not make sense because...".

I tried the following, which compiled, but the data.frame object passed to updateDFByRef in R stayed untouched:

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
void updateDFByRef(DataFrame& df) {
    int N = df.nrows();
    NumericVector newCol(N,1.);
    df["newCol"] = newCol;
    return;
}


1 Reply


The way DataFrame::operator[] is implemented does indeed lead to a copy when you do this:

df["newCol"] = newCol;
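
Why the copy leaves the R object untouched can be modelled in plain C++, without Rcpp. This is only a sketch under stated assumptions: `Frame`, `Handle`, `addColumn` and `columnsSeenByCaller` are hypothetical stand-ins, and the model assumes that the exported wrapper builds its own C++ handle from the R object and that adding a column rebinds that handle to a newly allocated, longer list.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins, not Rcpp types: a "frame" is a list of named
// columns, and a Handle plays the role of a variable bound to one such list.
using Column = std::vector<double>;
using Frame  = std::vector<std::pair<std::string, Column>>;
using Handle = std::shared_ptr<const Frame>;

// Assumption of this model: the list cannot grow in place, so a longer list
// is built and the handle passed in is rebound to it.
void addColumn(Handle& df, const std::string& name, std::size_t nrow) {
    Frame bigger(*df);                             // copy the existing list
    bigger.emplace_back(name, Column(nrow, 1.0));  // append the new column
    df = std::make_shared<Frame>(std::move(bigger));
}

// Returns how many columns the caller's variable sees after the call.
std::size_t columnsSeenByCaller() {
    Frame f;
    f.emplace_back("x", Column{1.0, 2.0, 3.0});
    Handle callerVar = std::make_shared<Frame>(std::move(f));

    Handle wrapper = callerVar;      // handle built for the call (assumed)
    addColumn(wrapper, "newCol", 3); // rebinds wrapper, not callerVar
    return callerVar->size();
}
```

In this model the caller's variable still refers to the old one-column list, even though the function took its argument by reference: the reference was to the wrapper handle, not to the caller's binding.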

To do what you want, consider what a data frame is: a list of vectors with certain attributes (names, class, row.names). You can then build a new list that takes its columns from the original by copying the vectors (the pointers, not their contents).

Something like this does it. It is a little more work, but not that hard.

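#include <Rcpp.h>
using namespace Rcpp;

// Returns a new data frame: the columns of df plus one extra column `name`.
// [[Rcpp::export]]
List updateDFByRef(DataFrame& df, std::string name) {
    int nr = df.nrows(), nc = df.size() ;
    NumericVector newCol(nr, 1.) ;               // new column filled with 1
    List out(nc + 1) ;
    CharacterVector onames = df.attr("names") ;
    CharacterVector names(nc + 1) ;
    for( int i = 0; i < nc; i++) {
        out[i] = df[i] ;                         // shares the column, no deep copy
        names[i] = onames[i] ;
    }
    out[nc] = newCol ;
    names[nc] = name ;
    // Carry over the attributes that make the list a data frame.
    out.attr("class") = df.attr("class") ;
    out.attr("row.names") = df.attr("row.names") ;
    out.attr("names") = names ;
    return out ;
}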

There are issues with this approach. The original data frame and the new one share the same column vectors, so modifying a column in place through one of them also changes the other. Only use this if you know what you are doing.
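
The sharing hazard can be illustrated with a small standard C++ model in which `std::shared_ptr` plays the role of the column pointers. This is a sketch rather than Rcpp code: `Frame`, `ColumnPtr`, `withExtraColumn` and `firstValueOfOriginalAfterEdit` are hypothetical names introduced for the illustration.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical model: a frame is a list of pointers to its columns.
using Column    = std::vector<double>;
using ColumnPtr = std::shared_ptr<Column>;
using Frame     = std::vector<ColumnPtr>;

// Shallow copy: the result reuses the original column pointers and adds one
// new column filled with 1.0. No column data is duplicated.
Frame withExtraColumn(const Frame& original, std::size_t nrow) {
    Frame out(original);                                 // copies pointers only
    out.push_back(std::make_shared<Column>(nrow, 1.0));  // the new column
    return out;
}

// Writes through the extended frame, then reads the same cell through the
// original frame to show that both see the same storage.
double firstValueOfOriginalAfterEdit() {
    Frame original{std::make_shared<Column>(Column{1.0, 2.0, 3.0})};
    Frame extended = withExtraColumn(original, 3);
    (*extended[0])[0] = 99.0;   // in-place edit via the new frame
    return (*original[0])[0];   // the original sees the edit too
}
```

Here the value read back through the original frame is 99.0 rather than 1.0, which is the kind of "bad thing" the warning above refers to: an in-place edit of a shared column is visible through both frames.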

