I find a function to detect outliers from columns but I do not know how to remove the outliers
is there a function for excluding or removing outliers from the columns
Here is the function to detect the outlier but I need help in a function to remove the outliers
import numpy as np
import pandas as pd
outliers=[]
def detect_outlier(data_1):
threshold=3
mean_1 = np.mean(data_1)
std_1 =np.std(data_1)
for y in data_1:
z_score= (y - mean_1)/std_1
if np.abs(z_score) > threshold:
outliers.append(y)
return outliers
Here the printing outliers
#printing the outlier
outlier_datapoints = detect_outlier(df['Pre_TOTAL_PURCHASE_ADJ'])
print(outlier_datapoints)
See Question&Answers more detail:
os 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…