Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

C# - How to convert emoticons to their UTF-32/escaped Unicode values?

I am working on a chat application in WPF and I want to use emoticons in it. I want to read emoticons that come from Android/iOS devices and show the corresponding images.

On WPF, each emoticon is rendered as a plain black glyph. I have a library of emoji icons that are saved under their hex/escaped Unicode values. So I want to convert the emoticon symbols into their UTF-32/escaped Unicode values, so that I can directly replace them with the matching emoji icons.

I tried to convert an emoticon to its Unicode value, but I ended up with a different string made of a couple of symbols that have different Unicode values.

string unicodeString = "\u1F642";  // intended to represent 🙂 (U+1F642)

Encoding unicode = Encoding.Unicode;
byte[] unicodeBytes = unicode.GetBytes(unicodeString);

char[] unicodeChars = new char[unicode.GetCharCount(unicodeBytes, 0, unicodeBytes.Length)];
unicode.GetChars(unicodeBytes, 0, unicodeBytes.Length, unicodeChars, 0);
string asciiString = new string(unicodeChars);

Any help is appreciated!!


1 Reply


Your escaped Unicode string does not mean what you expect in C#.

string unicodeString = "\u1F642";  // intended to represent 🙂 (U+1F642)

This code does not represent the "slightly smiling face", because a C# \u escape consumes exactly four hex digits, i.e. a single UTF-16 code unit (2 bytes).

So what you actually get is the character U+1F64 followed by a plain 2. http://www.fileformat.info/info/unicode/char/1f64/index.htm

So this: ὤ2
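To see this for yourself, you can list the UTF-16 code units of the string. The following is only a small sketch; the class name EscapeDemo and the method DescribeCodeUnits are made up for this illustration and are not part of any library.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, for illustration only.
public static class EscapeDemo
{
    // Returns every UTF-16 code unit of the input as a "U+XXXX" label.
    public static List<string> DescribeCodeUnits(string s)
    {
        var labels = new List<string>();
        foreach (char c in s)
        {
            // Cast the char to int and format it as four uppercase hex digits.
            labels.Add("U+" + ((int)c).ToString("X4"));
        }
        return labels;
    }
}
```

Calling EscapeDemo.DescribeCodeUnits("\u1F642") gives two entries, U+1F64 and U+0032 (the digit 2), which is the two-character string described above.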

If you want to start from the full hex code point (which needs up to 4 bytes) and get the corresponding string, you have to use:

var unicodeString = char.ConvertFromUtf32(0x1F642);

https://msdn.microsoft.com/en-us/library/system.char.convertfromutf32(v=vs.110).aspx

or you could write it as a UTF-16 surrogate pair, using two \u escapes:

\uD83D\uDE42
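As a quick check that these spellings really produce the same string, here is a minimal sketch. It also uses the 8-digit \U escape (capital U), which C# supports for code points above U+FFFF. The class and member names are invented for this example.

```csharp
using System;

// Hypothetical container, for illustration only.
public static class SmileySpellings
{
    // Built at run time from the code point.
    public static readonly string FromCodePoint = char.ConvertFromUtf32(0x1F642);

    // Two 4-digit escapes: the UTF-16 high and low surrogates.
    public const string SurrogatePair = "\uD83D\uDE42";

    // One 8-digit escape (capital U) for the whole code point.
    public const string EightDigitEscape = "\U0001F642";

    // True when all three spellings are the same sequence of code units.
    public static bool AllEqual()
    {
        return string.Equals(FromCodePoint, SurrogatePair, StringComparison.Ordinal)
            && string.Equals(SurrogatePair, EightDigitEscape, StringComparison.Ordinal);
    }
}
```

SmileySpellings.AllEqual() returns true, and each of the three strings has a Length of 2, because the single code point U+1F642 occupies two UTF-16 code units.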

Any of these strings can then be converted back into the hex value we started with, which is the result you are after, by encoding it as UTF-32:

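using System.Text;

// Build the string for U+1F642 (a surrogate pair in UTF-16).
var x = char.ConvertFromUtf32(0x1F642);

// Big-endian UTF-32 without a byte order mark.
var enc = new UTF32Encoding(true, false);
var bytes = enc.GetBytes(x);
var hex = new StringBuilder();
for (int i = 0; i < bytes.Length; i++)
{
    hex.AppendFormat("{0:X2}", bytes[i]); // two uppercase hex digits per byte
}
var o = hex.ToString();
// result is 0001F642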

(The result has leading zeros, since a UTF-32 code unit is always 4 bytes.)

Instead of the for loop you can also use BitConverter.ToString(byte[]) https://msdn.microsoft.com/en-us/library/3a733s97(v=vs.110).aspx, in which case the result looks like this:

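using System;
using System.Text;

var x = char.ConvertFromUtf32(0x1F642);

// Big-endian UTF-32 without a byte order mark.
var enc = new UTF32Encoding(true, false);
var bytes = enc.GetBytes(x);
var o = BitConverter.ToString(bytes); // bytes as hex, separated by hyphens
// result is 00-01-F6-42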
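For incoming chat text that mixes ordinary characters and emoji, one possible approach is to walk the string code point by code point and collect the hex values. This is only a sketch under assumptions: the class EmojiCodePoints and the method ToHexCodePoints are hypothetical names, and it assumes your icon library is keyed by the uppercase hex code point without leading zeros (for example 1F642), which you would need to adapt to your actual file names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, for illustration only.
public static class EmojiCodePoints
{
    // Walks the string by code point (not by UTF-16 code unit) and returns
    // each code point as uppercase hex without leading zeros, e.g. "1F642".
    public static List<string> ToHexCodePoints(string text)
    {
        var result = new List<string>();
        for (int i = 0; i < text.Length; i++)
        {
            int codePoint;
            if (char.IsSurrogatePair(text, i))
            {
                // A high and a low surrogate together form one code point.
                codePoint = char.ConvertToUtf32(text, i);
                i++; // skip the low surrogate that was just consumed
            }
            else
            {
                // A BMP character, or a lone surrogate, is reported as its
                // own code unit value.
                codePoint = text[i];
            }
            result.Add(codePoint.ToString("X"));
        }
        return result;
    }
}
```

For example, EmojiCodePoints.ToHexCodePoints("Hi \uD83D\uDE42") returns the four entries 48, 69, 20 and 1F642. Checking char.IsSurrogatePair first avoids the exception that char.ConvertToUtf32(string, int) throws on an unpaired surrogate.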
