encoding - R: UTF-8 character bytes as Latin-1 characters bytes

posted Oct 24, 2021 in Technique by 深蓝 (71.8m points)

R shows my UTF-8 encoded text as if its bytes were Latin-1 characters. Examples:

Latin-1 character bytes        ----- UTF-8 bytes
Ã¤Ã¤nnÃ¶k                      ----- äännök
Ã<U+0084>Ã<U+0084>NÃ<U+0096>S  ----- ÄÄnÖs
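
The effect described here can be reproduced with a minimal sketch: take a UTF-8 string, then decode the very same bytes as Latin-1. The sample word and the variable names are assumptions chosen for illustration, not data from the original session.

```r
# A sample word with umlauts, written with \u escapes so the source file
# encoding does not matter (assumed example, not the original data).
x <- enc2utf8("\u00e4\u00e4nn\u00f6k")

# Each umlaut occupies two bytes in UTF-8 (c3 a4 for a-umlaut, c3 b6 for o-umlaut).
bytes <- charToRaw(x)
print(bytes)

# Misread those bytes as Latin-1: every byte becomes one character,
# so each umlaut turns into a two-character sequence starting with A-tilde.
garbled <- iconv(rawToChar(bytes), from = "latin1", to = "UTF-8")
print(garbled)

nchar(x)        # 6 characters in the correct string
nchar(garbled)  # 9 characters after the misreading
```

If the output of a session matches the garbled form, the bytes themselves are probably intact and only the declared encoding is wrong.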

and my session info

> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: macOS Sierra 10.12.1

locale:
[1] C/UTF-8/C/C/C/C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

So what kind of settings do I need in R to handle umlauts correctly (not to return UTF-8 bytes as Latin-1 character bytes)?
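
Before changing any settings, it may help to check what the session currently assumes. The following is a rough diagnostic sketch using base R only; the test string is an assumption for illustration.

```r
# What character-type locale is the session running in?
ctype <- Sys.getlocale("LC_CTYPE")
print(ctype)

# l10n_info() reports whether R considers the locale UTF-8 or Latin-1.
info <- l10n_info()
print(info[["UTF-8"]])
print(info[["Latin-1"]])

# Check how a single string is marked and which bytes it holds.
s <- "\u00e4"            # a-umlaut (assumed test value)
print(Encoding(s))       # declared encoding of the string
print(charToRaw(s))      # the underlying bytes
```

A string whose bytes are c3 a4 but which is not marked as UTF-8 in a non-UTF-8 locale is a likely candidate for the misreading described above.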

Related?

Turn Unicode into Umlaut in R on Mac (Facebook Data)

https://stackoverflow.com/a/22945233/164148

Apparently, according to this, the following applies:

If you call Sys.setlocale with "LC_CTYPE" or "LC_ALL" to change the system locale while RStudio is running, you may run into some minor issues as RStudio assumes the system encoding doesn't change. If you are on Windows, we recommend you only call Sys.setlocale in .Rprofile. If you are on Mac or Linux and want to change the system locale, please visit the support forum and let us know your scenario.
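
As a sketch of what the quoted advice could look like in practice, a locale call can be placed in .Rprofile so it runs at startup rather than mid-session. The locale name "en_US.UTF-8" is an assumption; available names differ between systems, and the quoted recommendation itself is addressed to Windows users.

```r
# Sketch of an .Rprofile entry (locale name is an assumed example).
local({
  wanted <- "en_US.UTF-8"
  # Sys.setlocale() returns "" and warns when the request cannot be honoured.
  result <- suppressWarnings(Sys.setlocale("LC_CTYPE", wanted))
  if (!nzchar(result)) {
    message("Locale ", wanted, " is not available on this system")
  }
})
```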

Is there a simple tool to convert text that was misread as Latin-1 back to proper UTF-8?

P.S. I have now tested this in R on both Linux and OS X and get the same problem: the UTF-8 bytes are interpreted as Latin-1 characters.
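
One possible way to undo this kind of damage in base R, sketched under the assumption that the text was misread as ISO-8859-1 (and not, say, Windows-1252): convert the garbled string back to single Latin-1 bytes, then relabel those bytes as UTF-8. The helper name fix_mojibake is hypothetical.

```r
# Hypothetical helper: reverse a UTF-8-read-as-Latin-1 mix-up.
fix_mojibake <- function(x) {
  # Map every character back to the one Latin-1 byte it stands for.
  # iconv() returns NA for strings with characters outside Latin-1.
  latin1_bytes <- iconv(x, from = "UTF-8", to = "latin1")
  # Those bytes were UTF-8 all along, so only the label needs changing.
  Encoding(latin1_bytes) <- "UTF-8"
  latin1_bytes
}

# Assumed garbled input, written with \u escapes for portability.
garbled <- "\u00c3\u00a4\u00c3\u00a4nn\u00c3\u00b6k"
repaired <- fix_mojibake(garbled)
print(repaired)
```

This only works when the original bytes survived intact. If some characters were already replaced by "?" or dropped, the information is gone and the string cannot be restored this way.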
