You can take advantage of the fact that your "??" character decomposes to a regular "a":
Decomposition LATIN SMALL LETTER A (U+0061)
Java's java.text.Normalizer class supports several normalization forms. The NFKD and NFKC forms apply the above decomposition rule:
String normal = Normalizer.normalize(fancy, Normalizer.Form.NFKC);
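As a minimal, self-contained sketch of that call: the class name `CompatDemo`, the helper `fold`, and the three sample characters are assumptions chosen for illustration, since they are not part of the original question. Each sample has a compatibility decomposition to U+0061.

```java
import java.text.Normalizer;

public class CompatDemo {
    // Hypothetical helper: compatibility decomposition followed by canonical composition.
    static String fold(String input) {
        return Normalizer.normalize(input, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        // Sample inputs (assumed for illustration):
        String fullwidth = "\uFF41";                                // FULLWIDTH LATIN SMALL LETTER A
        String ordinal = "\u00AA";                                  // FEMININE ORDINAL INDICATOR
        String script = new String(Character.toChars(0x1D4B6));     // MATHEMATICAL SCRIPT SMALL A

        // Each comparison checks whether the folded result is a plain "a".
        System.out.println(fold(fullwidth).equals("a"));
        System.out.println(fold(ordinal).equals("a"));
        System.out.println(fold(script).equals("a"));
    }
}
```

Building the supplementary character with `Character.toChars` avoids having to write the surrogate pair by hand.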
Using compatibility equivalence is what you need here:
Compatibility equivalence is a weaker type of equivalence between characters or sequences of characters which represent the same abstract character (or sequence of abstract characters), but which may have distinct visual appearances or behaviors.
(You do not lose diacritics because this process simply separates the diacritic marks from their base letters, and then recombines them if you use a composing form such as NFKC.)
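The point about diacritics can be checked with a short sketch. The class name `DiacriticDemo`, the helper names, and the sample letter "é" are assumptions for illustration only.

```java
import java.text.Normalizer;

public class DiacriticDemo {
    // Hypothetical helper: splits letters from their combining marks (no recomposition).
    static String decompose(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFKD);
    }

    // Hypothetical helper: decomposes, then recombines letters with their marks.
    static String recompose(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        String precomposed = "\u00E9"; // LATIN SMALL LETTER E WITH ACUTE

        String nfkd = decompose(precomposed); // "e" followed by U+0301 COMBINING ACUTE ACCENT
        String nfkc = recompose(nfkd);        // back to the single code point U+00E9

        System.out.println(nfkd.length()); // 2
        System.out.println(nfkc.length()); // 1
        System.out.println(nfkc.equals(precomposed)); // true
    }
}
```

With NFKD the accent survives as a separate combining character; with NFKC it is folded back into the precomposed letter, so either way the diacritic is still present in the string.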