Right now I'm using this piece of code:
    public static bool ContainsEmoji(this string text)
    {
        // \p{Cs} matches UTF-16 surrogate code units
        Regex rgx = new Regex(@"\p{Cs}");
        return rgx.IsMatch(text);
    }
It is somewhat helpful: most emojis appear to be detected, but some aren't.
Here's a reference list to help: http://unicode.org/emoji/charts/full-emoji-list.html
All the smiley faces appear to be fine, but these specific emojis do not get caught by the regex:
1920 U+2614 ☔ umbrella with rain drops
1921 U+26F1 ⛱ umbrella on ground
1922 U+26A1 ⚡ high voltage
1923 U+2744 ❄ snowflake
On the keyboard these are not close to each other, but in the list they follow one another, so I assumed there was a point in the emoji list past which detection stops working. That does not hold up: from No. 1905 (the weather-like emojis) downward, some are caught by the regex and some aren't. There does not seem to be any rule.
I can't afford to just go full ASCII because I need people to be able to enter characters such as Cyrillic, but I can't accept emojis specifically. I have no clue how to go forward from here.
I read the MSDN docs about high/low surrogate pairs, but at this stage they are very confusing to me, and I think a push in the right direction would go a long way.
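For reference, here is a minimal sketch of what seems to be happening, not a definitive fix. .NET strings are UTF-16, and \p{Cs} only matches surrogate code units, so it only fires on characters above U+FFFF (such as U+1F600), which are stored as a high/low surrogate pair. U+2614 and U+2744 fit in a single code unit and belong to the "Symbol, other" category (So), so they slip through. The class and method names below are hypothetical, and widening the pattern with \p{So} is an assumption: it also rejects non-emoji symbols such as © and ®, and it is not a complete emoji test.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class EmojiProbe
{
    // Same pattern as in the question: matches UTF-16 surrogate code units only.
    private static readonly Regex SurrogateOnly = new Regex(@"\p{Cs}");

    // Assumed wider pattern: surrogates plus the "Symbol, other" category.
    // This also matches non-emoji symbols such as (c) and (R) signs.
    private static readonly Regex SurrogateOrSymbol = new Regex(@"\p{Cs}|\p{So}");

    public static bool ContainsSurrogate(string text)
    {
        return SurrogateOnly.IsMatch(text);
    }

    public static bool ContainsSurrogateOrSymbol(string text)
    {
        return SurrogateOrSymbol.IsMatch(text);
    }

    public static void Main()
    {
        string umbrella = "\u2614";      // U+2614, one UTF-16 code unit
        string grinning = "\U0001F600";  // U+1F600, stored as a surrogate pair
        string cyrillic = "Привет";      // plain Cyrillic letters

        Console.WriteLine(umbrella.Length);                    // 1
        Console.WriteLine(grinning.Length);                    // 2
        Console.WriteLine(ContainsSurrogate(grinning));        // True
        Console.WriteLine(ContainsSurrogate(umbrella));        // False
        Console.WriteLine(ContainsSurrogateOrSymbol(umbrella)); // True
        Console.WriteLine(ContainsSurrogateOrSymbol(cyrillic)); // False
    }
}
```

If this reading is right, it would explain why there is no visible rule in the list order: whether an emoji is caught depends on whether its code point lies above U+FFFF, not on where it sits in the chart.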
Thank you very much for your time :)