Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

UTF-8 - Manipulating files with non-English names in R

When R functions such as dir() are used to manipulate files on Windows, file names containing non-English characters, such as Cyrillic, are shown as a sequence of "?".

Similarly, when using file.rename(), if the new name contains non-English characters, the file is renamed with unreadable characters, apparently mapping to a different encoding.

There are a number of functions dealing with encoding for the file contents, but how can we deal with file names?

To reproduce the problem:
Outside R, create the file "привет.txt" in the working directory; then, in R:

dir() 
# [1] "??????.txt"      
# ...

Note that setting:

Sys.setlocale(category = "LC_ALL", locale="Russian")

doesn't help.

Note: I am using R 3.1.2 for Windows on an English-language Windows 8.1. In Windows consoles (cmd.exe), the Cyrillic names are displayed properly.
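
To narrow the problem down, it can help to look at which locale the R session is actually using and how R has marked the encoding of the names returned by dir(). The following is only a diagnostic sketch, not a fix; the variable names (locale_ctype, info, report) are chosen here for illustration and do not come from the question.

```r
# Diagnostic sketch: inspect the session locale and the declared
# encoding of the file names in the working directory.
# All variable names below are illustrative assumptions.

# Character-type locale currently in effect (for example after Sys.setlocale()).
locale_ctype <- Sys.getlocale("LC_CTYPE")

# Localization details; the "UTF-8" element says whether the session is UTF-8.
info <- l10n_info()

# File names as R sees them, with the encoding mark R attached to each one.
# Encoding() reports "unknown" for ASCII names and for names in the native encoding.
files <- dir()
report <- data.frame(name = files,
                     encoding = Encoding(files),
                     stringsAsFactors = FALSE)

print(locale_ctype)
print(info[["UTF-8"]])
print(report)
```

If the names already arrive as "?" in this report, the characters were presumably lost before R marked any encoding, so converting the strings afterwards is unlikely to recover them.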


1 Reply


Try this: iconv("привет.txt", "UTF-8", "CP1251")

This converts the name from UTF-8 to CP1251, the Windows code page for Cyrillic.

Convert Character Vector between Encodings:
https://stat.ethz.ch/R-manual/R-devel/library/base/html/iconv.html

The iconv library:
http://www.delorie.com/gnu/docs/recode/recode_30.html
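
As a rough illustration of the iconv() suggestion, the sketch below re-encodes the file name and prints the resulting bytes, so the conversion can be checked before it is passed to file.rename(). The helper to_cp1251() and the placeholder "old.txt" are hypothetical names introduced here, and whether CP1251 is the right target depends on the code page of the Windows session, which is an assumption.

```r
# Hypothetical helper (not part of base R): re-encode a file name
# from UTF-8 to CP1251, the Windows Cyrillic code page.
to_cp1251 <- function(name) {
  iconv(enc2utf8(name), from = "UTF-8", to = "CP1251")
}

# The name from the question, written with \u escapes so that the
# script itself contains only ASCII characters.
new_name <- "\u043f\u0440\u0438\u0432\u0435\u0442.txt"

# toRaw = TRUE returns the converted bytes instead of a string,
# which makes the result easy to inspect: one byte per Cyrillic letter.
cp1251_bytes <- iconv(new_name, from = "UTF-8", to = "CP1251",
                      toRaw = TRUE)[[1]]
print(cp1251_bytes)

# Assumed usage on a Windows session whose code page is 1251
# ("old.txt" is a placeholder for an existing file):
# file.rename("old.txt", to_cp1251(new_name))
```

If the session uses a different code page, the target encoding would need to change accordingly; iconvlist() lists the encoding names available on the current platform.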

