Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
372 views
in Technique[技术] by (71.8m points)

c# - Convert Latin 1 encoded UTF8 to Unicode

I came upon trying to convert a database that is encoded in UTF8 from what it looks like, into a windows 1251 encoding (dont ask, but I need to do this). All of the Russian, encoded characters in the db show up as D°D±D2D3D′D. When I pull them out of the db into my C# app, into strings, I still see D°D±D2D3D′D. No matter what I try to do to interpret this string as UTF8 encoded string, it seems to be interpreted as latin1 single byte string, and I do not see my text show up as russian. What I basically need to do is convert this latin1 looking-utf8 encoded string into Unicode, so that I can convert it later to 1251, but I have not been able to do this successfully. Anyone got any ideas?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)
Encoding.UTF8.GetString(Encoding.GetEncoding("iso-8859-1").GetBytes(s))

Now you have a normal Unicode string containing Cyrillic.

Note that it is possible that your ‘Latin-1’ misencoded string might actually be a ‘Windows codepage 1252’ misencoded string; I can't tell from the given example as it doesn't use any of the characters that are different between the two encodings. If this is the case use GetEncoding(1252) instead.

Also this is assuming that it's the contents of the database at fault. If the database is supposed to be storing UTF-8 strings but you're pulling them out as if they were Latin-1 (or codepage 1252 due to that being the system codepage) then really you need to reconfigure your data access layer to set the right encoding. If you're using SQL Server, better to start using NVARCHAR.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...