Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others



r - How to check whether a .csv file uses a comma or a semicolon as separator?

As the title already says:

I have to read in a lot of .csv files automatically. Some use a comma as the delimiter, and for those I use read.csv().

Others use a semicolon as the delimiter, and for those I use read.csv2().

I want to write a piece of code that recognizes whether a .csv file uses a comma or a semicolon as the delimiter (before I read it), so that I don't have to change the code every time. My approach would be something like this:

try to read.csv("xyz")
if error 
read.csv2("xyz")

Is something like that possible? Has somebody done this before? How can I check whether there was an error without actually seeing it?

I hope the question is clear. Sorry for my English.

Thanks in advance



1 Reply


Here are a few approaches. They assume that the files differ only in one respect: either the separator is a semicolon and the decimal mark is a comma, or the separator is a comma and the decimal mark is a point.

1) fread As mentioned in the comments, fread in the data.table package automatically detects the separator (for common separators) and then reads the file using the separator it detected. It can also handle certain other differences in format, such as automatically detecting whether the file has a header.
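As a rough illustration of (1), the sketch below writes two small temporary files and lets fread pick the separator for each. The file contents and variable names are made up for the example, and the values are integers, so the decimal-mark difference is not exercised here. The call is guarded so the snippet does nothing if data.table is not installed.

```r
# Two small sample files (made-up data) that differ only in the separator.
comma_file <- tempfile(fileext = ".csv")
semi_file  <- tempfile(fileext = ".csv")
writeLines(c("id,value", "1,10", "2,20"), comma_file)
writeLines(c("id;value", "1;10", "2;20"), semi_file)

if (requireNamespace("data.table", quietly = TRUE)) {
  # sep is left at its default ("auto"), so fread guesses it for each file.
  dt_comma <- data.table::fread(comma_file)
  dt_semi  <- data.table::fread(semi_file)
  print(dt_comma)
  print(dt_semi)
}
```

Both calls should yield a two-column table. Note that fread returns a data.table rather than a plain data.frame.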

2) grepl Read only the first line, check whether it contains a semicolon, and then read the whole file with the matching function:

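# Read only the first line of the file
L <- readLines("myfile", n = 1)
# A semicolon in that line means the file uses the read.csv2 conventions
if (grepl(";", L)) read.csv2("myfile") else read.csv("myfile")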

3) count.fields If we can assume that each file has more than one field, then finding only one field with sep = ";" tells us that the semicolon is not the separator.

L <- readLines("myfile", n = 1)
# Number of fields in the first line when it is split on ";"
numfields <- count.fields(textConnection(L), sep = ";")
if (numfields == 1) read.csv("myfile") else read.csv2("myfile")
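Approaches (2) and (3) can be wrapped into a small reusable function. The following is only a sketch under the same assumptions as above, plus the assumption that the first line is non-empty and has balanced quotes. The helper names count_on_line, detect_sep and read_auto are hypothetical and not part of base R. It counts the fields for both candidate separators and picks the one that splits the first line into more fields.

```r
# Hypothetical helper: number of fields in one line for a given separator.
count_on_line <- function(line, sep) {
  con <- textConnection(line)
  on.exit(close(con))  # close the text connection when done
  # quote = "\"" matches the default quoting of read.csv and read.csv2
  count.fields(con, sep = sep, quote = "\"")
}

# Hypothetical helper: guess "," or ";" from the first line of the file.
detect_sep <- function(path) {
  first_line <- readLines(path, n = 1)
  n_comma <- count_on_line(first_line, ",")
  n_semi  <- count_on_line(first_line, ";")
  if (n_semi > n_comma) ";" else ","
}

# Hypothetical wrapper: dispatch to read.csv2 or read.csv.
read_auto <- function(path, ...) {
  if (detect_sep(path) == ";") read.csv2(path, ...) else read.csv(path, ...)
}

# Usage with two small sample files (made-up data).
comma_file <- tempfile(fileext = ".csv")
semi_file  <- tempfile(fileext = ".csv")
writeLines(c("id,value", "1,2.5", "2,3.5"), comma_file)
writeLines(c("id;value", "1;2,5", "2;3,5"), semi_file)

df_comma <- read_auto(comma_file)
df_semi  <- read_auto(semi_file)
```

Comparing the two counts, rather than only searching for a semicolon, means a semicolon inside a quoted field of a comma-separated file does not trigger read.csv2.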

Update: Added (3) and made improvements to all three.

