Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share

ascii - How do I translate 8-bit characters into 7-bit characters? (e.g. Ü to U)

I'm looking for pseudocode or sample code to convert 8-bit "extended ASCII" characters (such as Ü, which is extended ASCII 154 in code page 437) into their closest 7-bit equivalents (here U, which is ASCII 85).

My initial guess is that, since only about 25 extended ASCII characters resemble 7-bit ASCII characters, a translation array would have to be used.

Let me know if you can think of anything else.
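
For reference, the translation-array idea could be sketched in C# roughly as follows. This is only an illustration under assumptions: the class name TranslationTableSketch, the method name ToSevenBit, the handful of table entries, and the '?' placeholder for unmapped characters are all hypothetical, and the input is assumed to have already been decoded into a .NET string.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class TranslationTableSketch
{
    // Hypothetical, deliberately incomplete table; extend it as needed.
    private static readonly Dictionary<char, string> Map = new Dictionary<char, string>
    {
        { '\u00DC', "U" },  // Ü
        { '\u00FC', "u" },  // ü
        { '\u00C9', "E" },  // É
        { '\u00E9', "e" },  // é
        { '\u00D1', "N" },  // Ñ
        { '\u00F1', "n" },  // ñ
        { '\u00DF', "ss" }  // ß
    };

    public static string ToSevenBit(string input)
    {
        var result = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            if (c < 128)
                result.Append(c);            // already 7-bit ASCII
            else if (Map.TryGetValue(c, out string replacement))
                result.Append(replacement);  // character listed in the table
            else
                result.Append('?');          // assumption: mark unmapped characters
        }
        return result.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(ToSevenBit("\u00DCbung"));  // Ubung
        Console.WriteLine(ToSevenBit("Stra\u00DFe")); // Strasse
    }
}
```

A table like this gives full control over each replacement (for example mapping ß to "ss"), at the cost of having to list every character by hand.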

1 Reply

For .NET users the article in CodeProject (thanks to GvS's tip) does indeed answer the question more correctly than any other I've seen so far.

However the code in that article (in solution #1) is cumbersome. Here's a compact version:

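// Based on http://www.codeproject.com/Articles/13503/Stripping-Accents-from-Latin-Characters-A-Foray-in
// Requires: using System.Linq; using System.Text;
private static string LatinToAscii(string inString)
{
    // FormKD splits an accented letter into a base letter plus combining marks
    // (e.g. "Ü" becomes "U" followed by U+0308). Keeping only characters below 128
    // then drops the marks. Characters with no ASCII decomposition are removed.
    var asciiChars = inString.Normalize(NormalizationForm.FormKD)
                             .Where(x => x < 128)
                             .ToArray();
    return new string(asciiChars);
}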
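
A small self-contained way to try this out might look like the sketch below. The wrapper class AccentStripper and the sample strings are assumptions made for illustration; the method body is the same normalize-and-filter approach as above, made public so it can be called.

```csharp
using System;
using System.Linq;
using System.Text;

public static class AccentStripper
{
    // Same normalize-and-filter approach as in the answer above.
    public static string LatinToAscii(string inString)
    {
        return new string(inString.Normalize(NormalizationForm.FormKD)
                                  .Where(x => x < 128)
                                  .ToArray());
    }

    public static void Main()
    {
        // "Ünïcödé" written with escapes so the source file encoding does not matter.
        Console.WriteLine(LatinToAscii("\u00DCn\u00EFc\u00F6d\u00E9")); // Unicode

        // ß has no decomposition, so it is dropped rather than replaced.
        Console.WriteLine(LatinToAscii("Stra\u00DFe"));                 // Strae
    }
}
```

Note the second case: characters that do not decompose into an ASCII base letter disappear silently, so this approach may need a small translation table on top for letters such as ß or ø.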

To expand a bit on the answer, this method uses String.Normalize which:

Returns a new string whose textual value is the same as this string, but whose binary representation is in the specified Unicode normalization form.

Specifically in this case we use the NormalizationForm FormKD, described in those same MSDN docs as such:

FormKD - Indicates that a Unicode string is normalized using full compatibility decomposition.

For more information about Unicode normalization forms, see Unicode Standard Annex #15.
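
To make the effect of the decomposition more concrete, the sketch below prints the UTF-16 code units before filtering. The class name NormalizationDemo and the helper CodePoints are hypothetical names introduced only for this illustration. It also contrasts FormKD with the canonical-only FormD on a ligature, which is where the "compatibility" part makes a difference.

```csharp
using System;
using System.Linq;
using System.Text;

public static class NormalizationDemo
{
    // Formats each UTF-16 code unit of a string as U+XXXX.
    public static string CodePoints(string s)
    {
        return string.Join(" ", s.Select(c => "U+" + ((int)c).ToString("X4")));
    }

    public static void Main()
    {
        // Ü (U+00DC) decomposes into U (U+0055) plus combining diaeresis (U+0308).
        Console.WriteLine(CodePoints("\u00DC".Normalize(NormalizationForm.FormKD)));
        // U+0055 U+0308

        // The "fi" ligature (U+FB01) only has a compatibility decomposition:
        Console.WriteLine(CodePoints("\uFB01".Normalize(NormalizationForm.FormD)));
        // U+FB01
        Console.WriteLine(CodePoints("\uFB01".Normalize(NormalizationForm.FormKD)));
        // U+0066 U+0069
    }
}
```

After FormKD, the base letters are plain ASCII and the combining marks are not, which is why the `x < 128` filter in LatinToAscii is enough to strip the accents.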

