Copying example from this question:
As a conceptual example, if I have two dataframes:
words = [the, quick, fox, a, brown, fox]
stopWords = [the, a]
then I want the output to be, in any order:
words - stopWords = [quick, brown, fox, fox]
ExceptAll
can do this in 2.4 but I cannot upgrade. The answer in the linked question is specific to a dataframe:
words.join(stopwords, words("id") === stopwords("id"), "left_outer")
.where(stopwords("id").isNull)
.select(words("id")).show()
as in you need to know the pkey and the other columns.
Can anyone come up with an answer that will work on any dataframe?
See Question&Answers more detail:
os 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…