Emoji characters are in Unicode plane 1 and thus require more than 16 bits to represent a code point: either two UTF-16 code units (a surrogate pair) or one UTF-32 code unit. Unicode is actually a 21-bit system; for plane 0 characters (basically everything except emoji) 16 bits are sufficient, so those get by with a single 16-bit code unit. Emoji need more than 16 bits.
"Youtubeud83dude27ud83dude2eud83dude2fud83d"
. is invalid, it is part of a utf16 unicode escaped string, the last ud83d
is 1/2 of an emoji character.
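To see why an unpaired \ud83d is only half a character, here is a minimal sketch of the UTF-16 surrogate-pair arithmetic, using U+1F627 (the first emoji in the string) as the example; the variable names are just illustrative and the snippet assumes a normal Foundation context:

// Splitting U+1F627 into its UTF-16 surrogate pair.
uint32_t codePoint = 0x1F627;                    // 😧, a plane 1 code point
uint32_t offset = codePoint - 0x10000;           // offset into the supplementary planes
unichar high = 0xD800 + (offset >> 10);          // high (lead) surrogate: 0xD83D
unichar low  = 0xDC00 + (offset & 0x3FF);        // low (trail) surrogate: 0xDE27
NSLog(@"high: %04X low: %04X", (unsigned)high, (unsigned)low);  // high: D83D low: DE27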
Also, in order to create a literal string containing the escape character "\", the escape character must itself be escaped: "\\".
// The backslashes are doubled so the string contains the literal characters "\ud83d" etc.
NSString *emojiEscaped = @"Youtube\\ud83d\\ude27\\ud83d\\ude2e\\ud83d\\ude2f";
NSData *emojiData = [emojiEscaped dataUsingEncoding:NSUTF8StringEncoding];
// NSNonLossyASCIIStringEncoding interprets the "\udddd" escapes and reassembles the surrogate pairs.
NSString *emojiString = [[NSString alloc] initWithData:emojiData encoding:NSNonLossyASCIIStringEncoding];
NSLog(@"emojiString: %@", emojiString);
NSLog output:
emojiString: Youtube😧😮😯
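The same encoding also works in the other direction, producing the escaped form from an NSString that already contains the emoji. A sketch, assuming the equivalent \U-style literals for the same three emoji:

NSString *plain = @"Youtube\U0001f627\U0001f62e\U0001f62f";
NSData *escapedData = [plain dataUsingEncoding:NSNonLossyASCIIStringEncoding];
NSString *escaped = [[NSString alloc] initWithData:escapedData
                                          encoding:NSUTF8StringEncoding];
NSLog(@"escaped: %@", escaped);
// escaped: Youtube\ud83d\ude27\ud83d\ude2e\ud83d\ude2f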
The emoji string can also be expressed in UTF-32:
NSString *string = @"\U0001f627\U0001f62e\U0001f62f";
NSLog(@"string: %@", string);
NSLog output:
string: 😧😮😯
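As a quick sanity check of the plane 1 point (assuming the same UTF-32 literal): NSString's length counts UTF-16 code units, so the three emoji report a length of 6, not 3:

NSString *emoji = @"\U0001f627\U0001f62e\U0001f62f";
// -length counts UTF-16 code units; each plane 1 emoji contributes two.
NSLog(@"length: %lu", (unsigned long)[emoji length]);  // length: 6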