Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


Delete outliers from all columns of a data frame in R

I'm trying to delete outliers from my dataset using the IQR. I have computed the IQR for each column of my data frame, and now I want to exclude every value that is an outlier. My code is:

> library(plyr)  # colwise() comes from the plyr package
> q1 <- colwise(quantile)(completeData, probs = c(.25))
> q2 <- colwise(quantile)(completeData, probs = c(.75))
> IQR <- q2 - q1
> IQR
  MinTemp MaxTemp Rainfall Evaporation Sunshine WindGustSpeed WindSpeed9am WindSpeed3pm Humidity9am Humidity3pm Pressure9am Pressure3pm Cloud9am Cloud3pm Temp9am Temp3pm RainToday Date Location
1     9.2    10.3      2.2         4.4      7.1            19            8           11          26          31         9.6         9.7        5        4     9.3     9.9         1 1537       25
  WindGustDir WindDir3pm RainTomorrow
1           9          8            1

Now that I have the IQR for each variable in the data frame, I want to exclude outliers like this:

completeData <- subset(completeData, completeData > (q1 - 1.5*IQR) & completeData < (q2 + 1.5*IQR))

This last line is only meant to convey the idea. It does not work, and I'm looking for something that deletes all outliers from each column of the data frame.

Thanks in advance to anyone who can help.



1 Reply


Instead of removing outliers from the dataset, I'd suggest turning them into NA. Each column can have a different number of outliers, so removing them would leave the columns with different numbers of values.

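# Assigning into completeData[] keeps the data frame structure;
# a plain lapply() assignment would turn completeData into a list.
completeData[] <- lapply(completeData, function(x) {
  q1 <- quantile(x, .25)  # first quartile
  q2 <- quantile(x, .75)  # third quartile
  IQR <- q2 - q1
  # Replace values outside the 1.5 * IQR fences with NA
  replace(x, x < (q1 - 1.5*IQR) | x > (q2 + 1.5*IQR), NA)
})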
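
To try the idea on something small, here is a minimal sketch on a made-up data frame. The helper name `iqr_to_na`, the `toy` data and its column names are assumptions for illustration only and are not part of the original dataset. The sketch also skips non-numeric columns, and shows one possible way to drop whole rows afterwards if that is what you really need.

```r
# Hypothetical helper: replace values outside the k * IQR fences with NA.
iqr_to_na <- function(x, k = 1.5) {
  if (!is.numeric(x)) return(x)  # leave non-numeric columns untouched
  q <- quantile(x, c(.25, .75), na.rm = TRUE, names = FALSE)
  fence <- k * (q[2] - q[1])
  replace(x, x < q[1] - fence | x > q[2] + fence, NA)
}

# Made-up data: one obvious outlier in each numeric column.
toy <- data.frame(
  temp = c(10, 11, 12, 13, 14, 100),
  rain = c(-50, 1, 2, 3, 4, 5),
  site = c("a", "b", "c", "d", "e", "f")
)

cleaned <- toy
cleaned[] <- lapply(toy, iqr_to_na)  # [] keeps the data frame structure

# Number of values flagged as outliers in each column
colSums(is.na(cleaned))

# Optional: drop every row that contains at least one flagged value.
trimmed <- cleaned[complete.cases(cleaned), ]
```

Note that dropping rows this way discards a whole observation even when only one of its columns is an outlier, and it would also drop rows that already had missing values, which is one more reason to prefer keeping the NA values.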
