Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
236 views
in Technique[技术] by (71.8m points)

r - Remove duplicated rows

I have read a CSV file into an R data.frame. Some of the rows have the same element in one of the columns. I would like to remove rows that are duplicates in that column. For example:

platform_external_dbus          202           16                     google        1
platform_external_dbus          202           16         space-ghost.verbum        1
platform_external_dbus          202           16                  localhost        1
platform_external_dbus          202           16          users.sourceforge        8
platform_external_dbus          202           16                    hughsie        1

I would like only one of these rows since the others have the same data in the first column.

Question&Answers:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

For people who have come here to look for a general answer for duplicate row removal, use !duplicated():

a <- c(rep("A", 3), rep("B", 3), rep("C",2))
b <- c(1,1,2,4,1,1,2,2)
df <-data.frame(a,b)

duplicated(df)
[1] FALSE  TRUE FALSE FALSE FALSE  TRUE FALSE  TRUE

> df[duplicated(df), ]
  a b
2 A 1
6 B 1
8 C 2

> df[!duplicated(df), ]
  a b
1 A 1
3 A 2
4 B 4
5 B 1
7 C 2

Answer from: Removing duplicated rows from R data frame


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...