Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

r - Comparing cell values within rows of a data frame - puzzling output

I have a data frame that I created with full_join() from dplyr. It looks like this:

View(DF1)

Gene Pval   Pval2
ZIC3 0.4123 0.4124
GLA  *NA*   0.135
AFF2 0.003  *NA*
...  ...    ...

I want to pull all the genes where Pval != Pval2, so I used:

DF2 <- DF1[DF1$Pval != DF1$Pval2, ]

This pulls out the mismatching records (294), but DF2 also contains 38 additional rows that are entirely NA, even though DF1 does not contain any all-NA rows (332 rows in total).

Similarly, if I do:

DF3 <- DF1[DF1$Pval == DF1$Pval2, ]

DF3 has 37 all-NA rows (13,711 non-empty rows, for 13,748 in total).

DF1, the original, has 14,042 rows.

I do not understand where these empty rows come from, or why the row counts of DF2 and DF3 do not add up to the row count of DF1.

question from:https://stackoverflow.com/questions/65890799/comparing-cell-values-within-rows-of-a-data-frame-puzzeling-output


1 Reply

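Comparing a value with NA returns NA rather than TRUE or FALSE, and an NA in a logical row index returns a row filled with NA. That is where the empty rows come from. To keep the rows with a missing value as real rows, we can include a condition with is.na: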

DF1[(DF1$Pval != DF1$Pval2) | (is.na(DF1$Pval) | is.na(DF1$Pval2)), ]
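
To see where the all-NA rows come from, here is a minimal sketch on a toy data frame. The name `toy` and its values are made up for illustration and are not the question's data. It shows the NA result of the comparison, the NA rows it produces when used as an index, and `which()` as one way to drop the NA positions instead of keeping them.

```r
# Toy data modelled on the question's layout; values are illustrative only.
toy <- data.frame(
  Gene  = c("ZIC3", "GLA", "AFF2", "ABC1"),
  Pval  = c(0.4123, NA, 0.003, 0.05),
  Pval2 = c(0.4124, 0.135, NA, 0.05)
)

# Comparing with NA gives NA rather than TRUE or FALSE.
neq <- toy$Pval != toy$Pval2
print(neq)

# A logical NA in the row index produces a row made entirely of NA.
with_na_rows <- toy[neq, ]
print(with_na_rows)

# which() keeps only the positions that are TRUE, so NA positions are dropped.
mismatch_only <- toy[which(neq), ]
match_only    <- toy[which(toy$Pval == toy$Pval2), ]

# Rows where at least one value is missing, kept as a separate group.
has_na <- toy[is.na(toy$Pval) | is.na(toy$Pval2), ]

# The three groups together account for every row of the original.
stopifnot(nrow(mismatch_only) + nrow(match_only) + nrow(has_na) == nrow(toy))
```

Splitting the data into "mismatch", "match" and "has a missing value" is one way to make the counts add up to the original; whether rows with a missing value should count as mismatches depends on the analysis.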

Another possible issue is comparing floating-point values directly: two numbers that print the same can differ in their last bits, so != may give unexpected results. It may be better to round to a suitable number of digits and then compare:

DF1[(round(DF1$Pval, 2) != round(DF1$Pval2, 2)) | 
         (is.na(DF1$Pval) | is.na(DF1$Pval2)), ]
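
As an alternative to rounding, a tolerance-based comparison is a common way to handle floating-point noise. The sketch below is an assumption added for illustration, not part of the original answer: the helper `differs`, the toy data, and the tolerance `tol` with its value `1e-8` are all hypothetical and should be adapted to the precision of the real data.

```r
# Two values that look equal but differ in binary floating point.
a <- 0.1 + 0.2
b <- 0.3
print(a == b)  # exact comparison

# Hypothetical tolerance; pick a value that suits the precision of the data.
tol <- 1e-8

# Hypothetical helper: values differ only if the gap exceeds the tolerance.
differs <- function(x, y, tol) abs(x - y) > tol

# Toy data; gene names and values are illustrative only.
toy <- data.frame(
  Gene  = c("G1", "G2", "G3", "G4"),
  Pval  = c(0.1 + 0.2, 0.4123, NA, 0.003),
  Pval2 = c(0.3, 0.4124, 0.135, 0.003)
)

d <- differs(toy$Pval, toy$Pval2, tol)

# which() drops the NA comparisons; is.na() selects rows with a missing value.
mismatch <- toy[which(d), ]
missing  <- toy[is.na(d), ]
print(mismatch)
print(missing)
```

Unlike rounding to two digits, a small tolerance still treats 0.4123 and 0.4124 as different, which matches the example in the question.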
