What is the most efficient way to remove German (or French) accents from a character vector of 16 million strings?
e.g., 'Sjögren's syndrome' into 'Sjogren's syndrome'
Converting each accented character into a single unaccented character is preferred over transliteration such as
ä => ae, ö => oe, ü => ue.
Using regular expressions would be one option, but is there something better (an R package for this)?
gsub('ü','u',gsub('ö','o',"Sjögren's syndrome ( über) "))
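For comparison, here is a minimal sketch of a single-pass alternative to the nested gsub() calls, using base R's chartr(), which maps each character in one string to the character at the same position in another. The character lists and the helper name strip_accents are illustrative assumptions, not an exhaustive mapping, and the sketch assumes the input is UTF-8 encoded.

```r
# Map each accented character to exactly one unaccented character
# (no "ae"/"oe" expansion). Both strings must have the same number of
# characters; the set below is an illustrative assumption, extend as needed.
accented   <- "äöüÄÖÜàâçéèêëîïôùûÿ"
unaccented <- "aouAOUaaceeeeiiouuy"

# Hypothetical helper: chartr() is vectorised over x, so the whole
# character vector is processed in one call.
strip_accents <- function(x) chartr(accented, unaccented, x)

strip_accents(c("Sjögren's syndrome ( über) ", "café"))
```

Whether this is fast enough for 16 million strings would need to be timed on the real data. The stringi package's stri_trans_general(x, "Latin-ASCII") may also be worth trying, though its output for characters outside this list has not been checked here.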
There are Stack Overflow solutions for non-R platforms, but I have not found a good one for R.