What is the best way to convert something like "haaaaapppppyyy" to "haappyy"?
When parsing slang, I find that people sometimes repeat characters for added emphasis, and I'd like to collapse those repeated runs.
Using set() doesn't work because it discards both the order of the letters and the legitimate duplicates.
Any ideas? I'm using Python + nltk.
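
One possible approach, as a minimal sketch rather than a definitive answer: collapse every run of the same character that is longer than two down to exactly two. The function names `squeeze_repeats` and `squeeze_repeats_groupby` and the `max_run` parameter are made up for illustration. Both versions use only the standard library (`re` and `itertools`), not nltk.

```python
import re
from itertools import groupby


def squeeze_repeats(text, max_run=2):
    """Shorten any run of one repeated character to at most max_run copies."""
    # (.) captures a single character; \1{max_run,} then requires at least
    # max_run further copies of it, so only runs longer than max_run match.
    pattern = re.compile(r"(.)\1{%d,}" % max_run)
    return pattern.sub(lambda m: m.group(1) * max_run, text)


def squeeze_repeats_groupby(text, max_run=2):
    """Same idea without regular expressions, using itertools.groupby."""
    # groupby yields one group per run of consecutive equal characters;
    # keep only the first max_run characters of each run.
    return "".join("".join(list(run)[:max_run]) for _, run in groupby(text))


print(squeeze_repeats("haaaaapppppyyy"))          # haappyy
print(squeeze_repeats_groupby("haaaaapppppyyy"))  # haappyy
```

Runs of one or two characters are left alone, so a normally spelled word such as "happy" passes through unchanged. Getting from "haappyy" to "happy" would need a further step, such as checking candidates against a word list, which this sketch does not attempt.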