Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


string - What is the equivalent of stringByFoldingWithOptions:locale: in Java?

I am looking for a way to normalize a list of titles. Each normalized title is stored in a database as a sort and lookup key. Here "normalize" covers several things, such as converting to lowercase, removing accents from Latin characters, and removing a leading "the", "a", or "an".

On iOS and macOS, the NSString class has the stringByFoldingWithOptions:locale: method, which returns a folded version of a string.

NSString Class Reference - stringByFoldingWithOptions:locale:

In Java, the java.text.Collator class seems useful for comparing strings, but I cannot find a way to use it to convert a string for this purpose.



1 Reply


You can use java.text.Normalizer, which comes close to what you need for normalizing strings in Java. Regular expressions are also a powerful way to handle the remaining string manipulations.

Example of accent removal:

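import java.text.Normalizer;

String accented = "árvíztűrő tükörfúrógép";
// NFD splits each accented letter into a base letter plus combining marks
String normalized = Normalizer.normalize(accented, Normalizer.Form.NFD);
// Drop everything outside ASCII, which removes the combining marks
normalized = normalized.replaceAll("[^\\p{ASCII}]", "");

System.out.println(normalized);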

Output:

arvizturo tukorfurogep
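
Note that the example above removes every non-ASCII character, so titles in Greek, Cyrillic, or CJK scripts would lose their letters entirely. A possible variant, sketched here as an assumption about what a title key needs, removes only the combining marks (Unicode category M) after NFD decomposition. The class name StripMarks and the method name stripMarks are made up for illustration. The sample strings use Unicode escapes so the file compiles regardless of source encoding.

```java
import java.text.Normalizer;

public class StripMarks {
    // Decompose to NFD, then remove only combining marks (Unicode category M).
    // Base letters from any script are kept.
    static String stripMarks(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // The Hungarian sample from the example above, written with Unicode escapes.
        String hungarian = "\u00e1rv\u00edzt\u0171r\u0151 t\u00fck\u00f6rf\u00far\u00f3g\u00e9p";
        System.out.println(stripMarks(hungarian));

        // A Greek word with an accented final alpha: the letters survive, the accent does not.
        String greek = "\u0395\u03bb\u03bb\u03b7\u03bd\u03b9\u03ba\u03ac";
        System.out.println(stripMarks(greek));
    }
}
```

Letters that have no canonical decomposition, such as "ø" or "ß", pass through unchanged with either approach's NFD step, so they would need separate handling if the key must be plain ASCII.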

More explanation here: http://docs.oracle.com/javase/tutorial/i18n/text/normalizerapi.html
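
For the other steps mentioned in the question (lowercasing and dropping a leading "the", "a", or "an"), a regex on top of Normalizer may be enough. The following is a minimal sketch, not an equivalent of stringByFoldingWithOptions:locale:. The class TitleKey, the method normalizeTitle, and the English-only article list are assumptions for illustration, and toLowerCase(Locale.ROOT) is a simple stand-in for full Unicode case folding.

```java
import java.text.Normalizer;
import java.util.Locale;
import java.util.regex.Pattern;

public class TitleKey {
    // Leading English articles followed by whitespace (assumed list, from the question).
    private static final Pattern LEADING_ARTICLE = Pattern.compile("^(the|an|a)\\s+");
    // Combining marks left behind by NFD decomposition.
    private static final Pattern MARKS = Pattern.compile("\\p{M}+");

    static String normalizeTitle(String title) {
        // 1. Decompose accented letters into base letter + combining marks.
        String s = Normalizer.normalize(title.trim(), Normalizer.Form.NFD);
        // 2. Remove the combining marks.
        s = MARKS.matcher(s).replaceAll("");
        // 3. Lowercase in a locale-independent way.
        s = s.toLowerCase(Locale.ROOT);
        // 4. Drop one leading article, if present.
        return LEADING_ARTICLE.matcher(s).replaceFirst("");
    }

    public static void main(String[] args) {
        System.out.println(normalizeTitle("The \u00c9migr\u00e9"));
        System.out.println(normalizeTitle("A Clockwork Orange"));
        System.out.println(normalizeTitle("Anthem"));
    }
}
```

Because the article pattern requires trailing whitespace, a title such as "Anthem" keeps its leading "an". The resulting string can be stored as the sort and lookup key alongside the original title.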

