What is the proper procedure to read and output UTF-8 encoded data in Windows 10?
My attempt to read a UTF-8 encoded file in Windows 10 and output its lines to the terminal does not reproduce the symbols of some languages.
- OS: Windows 10
- Native codepage: 437
- Switched codepage: 65001
In a cmd window I issued the command chcp 65001. The following Ruby code reads the UTF-8 encoded file and outputs its lines with puts.
fname = 'hello_world.dat'
File.open(fname,'r:UTF-8') do |f|
  puts f.read
end
Content of hello_world.dat:
Afrikaans: Hello Wêreld!
Albanian: Përshendetje Botë!
Amharic: ??? ???!
Arabic: ????? ???????!
Armenian: ????? ??????!
Basque: Kaixo Mundua!
Belarussian: Прывітанне Сусвет!
Bengali: ??? ?????!
Bulgarian: Здравей свят!
Catalan: Hola món!
Chichewa: Moni Dziko Lapansi!
Chinese: 你好世界!
Croatian: Pozdrav svijete!
Czech: Ahoj světe!
Danish: Hej Verden!
Dutch: Hallo Wereld!
English: Hello World!
Estonian: Tere maailm!
Finnish: Hei maailma!
French: Bonjour monde!
Frisian: Hallo wrald!
Georgian: ????????? ???????!
German: Hallo Welt!
Greek: Γειά σου Κόσμε!
Hausa: Sannu Duniya!
Hebrew: ???? ????!
Hindi: ?????? ??????!
Hungarian: Helló Világ!
Icelandic: Halló heimur!
Igbo: Ndewo Ụwa!
Indonesian: Halo Dunia!
Italian: Ciao mondo!
Japanese: こんにちは世界!
Kazakh: Сәлем Әлем!
Khmer: ??????????????!
Kyrgyz: Салам дүйнө!
Lao: ?????????????????!
Latvian: Sveika pasaule!
Lithuanian: Labas pasauli!
Luxemburgish: Moien Welt!
Macedonian: Здраво свету!
Malay: Hai dunia!
Malayalam: ???? ?????!
Mongolian: Сайн уу дэлхий!
Myanmar: ??????????????????!
Nepali: ??????? ?????!
Norwegian: Hei Verden!
Pashto: ???? ???!
Persian: ???? ????!
Polish: Witaj świecie!
Portuguese: Olá Mundo!
Punjabi: ??? ???? ???? ?????!
Romanian: Salut Lume!
Russian: Привет мир!
Scots Gaelic: Hàlo a Shaoghail!
Serbian: Здраво Свете!
Sesotho: Lefatše Lumela!
Sinhala: ???? ???????!
Slovenian: Pozdravljen svet!
Spanish: ¡Hola Mundo!
Sundanese: Halo Dunya!
Swahili: Salamu Dunia!
Swedish: Hej världen!
Tajik: Салом Ҷаҳон!
Thai: ????????????!
Turkish: Selam Dünya!
Ukrainian: Привіт Світ!
Uzbek: Salom Dunyo!
Vietnamese: Chào thế giới!
Welsh: Helo Byd!
Xhosa: Molo Lizwe!
Yiddish: ???? ?????!
Yoruba: Mo ki O Ile Aiye!
Zulu: Sawubona Mhlaba!
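One way to narrow the problem down is to check whether the file is read correctly before anything is printed. The sketch below is not part of the original code: utf8_report is a hypothetical helper, and the temporary file only stands in for hello_world.dat. It reads the file the same way as above ('r:UTF-8') and reports the encoding Ruby attached to the string, whether the bytes are well-formed UTF-8, and how many non-ASCII characters were decoded. If the report looks sane for the real file, the read step is probably fine and the missing symbols are more likely a console or font limitation.

```ruby
require 'tmpdir'

# Hypothetical helper (an assumption, not from the original post):
# reads a file as UTF-8 and summarizes how the bytes were decoded.
def utf8_report(path)
  text = File.open(path, 'r:UTF-8') { |f| f.read }
  {
    encoding: text.encoding.name,   # encoding tag Ruby attached to the string
    valid: text.valid_encoding?,    # false if the bytes are not well-formed UTF-8
    non_ascii: text.each_char.count { |c| !c.ascii_only? }
  }
end

# A temporary file stands in for hello_world.dat in this demonstration.
dir = Dir.mktmpdir
path = File.join(dir, 'hello_world.dat')
File.binwrite(path, "Russian: Привет мир!\n")

report = utf8_report(path)
p report
```

To try it on the real data, call utf8_report('hello_world.dat') instead of using the temporary file.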
Steven Penny suggested using PowerShell without changing the code page. The following picture demonstrates that the issue persists.
Installing Windows Terminal (which is not part of the Windows distribution) solves the UTF-8 output issue; please see the included screen capture.