R - Comparing cell values within rows of a data.frame - Puzzling output

Question

posted Oct 7, 2021 in Technique by 深蓝 (71.8m points)

I have a data frame, DF1, that I built with full_join() from dplyr. It looks like this:

View(DF1)

Gene Pval   Pval2
ZIC3 0.4123 0.4124
GLA  NA     0.135
AFF2 0.003  NA
...  ...    ...

I want to pull all the genes where Pval != Pval2, so I used:

DF2 <- DF1[DF1$Pval != DF1$Pval2, ]

This pulled out the 294 mismatching records, but DF2 also contains 38 additional rows in which every cell is NA (332 rows in total), even though DF1 does not contain any all-NA rows.

Similarly, if I do

DF3 <- DF1[DF1$Pval == DF1$Pval2, ]

DF3 contains 37 all-NA rows (13,711 non-empty rows, 13,748 in total).

DF1, the original, has 14,042 rows.

I do not understand where these empty rows are coming from, or why the row counts of DF2 and DF3 do not add up to the row count of DF1.

question from:https://stackoverflow.com/questions/65890799/comparing-cell-values-within-rows-of-a-data-frame-puzzeling-output


1 Reply

深蓝 · Answer 1 · 2021-10-06T19:18:49+0000

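The extra rows come from the NA values: a comparison that involves NA returns NA rather than TRUE or FALSE, and indexing a data frame with an NA logical returns a row in which every cell is NA. To keep the rows where either value is missing, we can also include a condition with is.na: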

DF1[(DF1$Pval != DF1$Pval2) | (is.na(DF1$Pval) | is.na(DF1$Pval2)), ]
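
To see where the all-NA rows come from, here is a minimal sketch on a toy data frame. The values are made up to resemble the rows shown in the question, and "GENE4" is a hypothetical gene name added so that one row has matching values. The which() variant at the end is an alternative that is not part of the original answer: it keeps only rows where both values are present and differ.

```r
# Toy stand-in for DF1 (hypothetical values, modelled on the question's sample)
DF1 <- data.frame(
  Gene  = c("ZIC3", "GLA", "AFF2", "GENE4"),
  Pval  = c(0.4123, NA, 0.003, 0.05),
  Pval2 = c(0.4124, 0.135, NA, 0.05),
  stringsAsFactors = FALSE
)

# A comparison involving NA yields NA, not TRUE or FALSE
neq <- DF1$Pval != DF1$Pval2

# Indexing with an NA logical returns a row in which every cell is NA,
# so this subset contains one real mismatch plus two all-NA rows
with_na_rows <- DF1[neq, ]

# Count a missing value on either side as a mismatch (the answer's approach)
mismatch_or_missing <- DF1[neq | is.na(DF1$Pval) | is.na(DF1$Pval2), ]

# Alternative: which() drops NA positions, so only rows where both
# values exist and differ are kept
strict_mismatch <- DF1[which(neq), ]

print(with_na_rows)
print(mismatch_or_missing)
print(strict_mismatch)
```

The same NA positions also produce all-NA rows in the == subset, so the rows with a missing value show up in both DF2 and DF3 as empty rows instead of appearing once with their real contents. That is why the two row counts do not sum to the row count of DF1.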

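Another possible issue is the comparison of floating-point numbers: two values that print the same can differ in their stored precision, which gives unexpected results with == and !=. It may be better to round both columns and then compare. Note that rounding to 2 digits treats 0.4123 and 0.4124 as equal, so choose the number of digits to match the precision you care about.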

DF1[(round(DF1$Pval, 2) != round(DF1$Pval2, 2)) | 
         (is.na(DF1$Pval) | is.na(DF1$Pval2)), ]
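
As an alternative to rounding (not part of the original answer), the values can be compared within a tolerance. The sketch below assumes a hypothetical tolerance of 1e-8 and made-up values; adjust both to your data.

```r
# Two values that print alike but are stored differently
x <- 0.1 + 0.2
y <- 0.3
exact_equal <- x == y                     # exact comparison fails here
near_equal  <- isTRUE(all.equal(x, y))    # all.equal() compares within a tolerance

# Toy data frame (hypothetical values); the first row differs only by
# floating-point error
DF1 <- data.frame(
  Gene  = c("ZIC3", "GLA", "AFF2"),
  Pval  = c(0.1 + 0.2, NA, 0.003),
  Pval2 = c(0.3, 0.135, NA),
  stringsAsFactors = FALSE
)

tol <- 1e-8  # hypothetical tolerance, chosen for illustration only

# TRUE where the values differ by more than tol, NA where either is missing
differs <- abs(DF1$Pval - DF1$Pval2) > tol

# Keep real differences plus rows with a missing value on either side
DF2 <- DF1[is.na(differs) | differs, ]
print(DF2)
```

Unlike round(), this does not depend on where the values fall relative to a rounding boundary: two numbers count as different only when their gap exceeds tol.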
