It seems to me that read.table/read.csv cannot handle backslash-escaped quotes (\").
But I think I have an (ugly) workaround, inspired by @nullglob:
- First, read the file WITHOUT a quote character. (This won't handle embedded commas, as @Ben Bolker noted.)
- Then go through the string columns and remove the quotes.
The test file looks like this (I added a non-string column for good measure):
13,"foo","Fab D\"atri","bar"
21,"foo2","Fab D\"atri2","bar2"
And here is the code:
# Generate test file (each field quote is written as \", each escaped quote as \\\")
writeLines(c("13,\"foo\",\"Fab D\\\"atri\",\"bar\"",
             "21,\"foo2\",\"Fab D\\\"atri2\",\"bar2\""), "foo.txt")
# Read ignoring quotes
tbl <- read.table("foo.txt", as.is=TRUE, quote='', sep=',', header=FALSE, row.names=NULL)
# Go through and clean up
for (i in seq_len(NCOL(tbl))) {
  if (is.character(tbl[[i]])) {
    x <- tbl[[i]]
    x <- substr(x, 2, nchar(x)-1) # Remove surrounding quotes
    tbl[[i]] <- gsub('\\\\"', '"', x) # Unescape quotes: \" becomes "
  }
}
The output is then correct:
> tbl
  V1   V2          V3   V4
1 13  foo  Fab D"atri  bar
2 21 foo2 Fab D"atri2 bar2
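If this is needed more than once, the same two steps can be wrapped in a small helper. What follows is only a sketch: the name read_escaped_csv is made up for illustration (it is not part of base R), and it assumes the file has no header row, no separators inside fields, and uses \" for quotes inside fields. Unlike the loop above, it only strips the outer quotes from values that actually start and end with one.

```r
# Hypothetical helper, not part of base R: read a delimited file whose
# string fields are wrapped in double quotes and use \" for embedded quotes.
# Assumes no header row and no separators inside fields.
read_escaped_csv <- function(file, sep = ",") {
  # Read with quoting disabled so the escaped quotes cannot confuse the parser
  tbl <- read.table(file, as.is = TRUE, quote = "", sep = sep,
                    header = FALSE, row.names = NULL)
  for (i in seq_len(ncol(tbl))) {
    x <- tbl[[i]]
    if (is.character(x)) {
      # Strip the surrounding quotes only where both are present
      quoted <- grepl('^".*"$', x)
      x[quoted] <- substr(x[quoted], 2, nchar(x[quoted]) - 1)
      # Turn the two-character sequence \" back into a plain "
      tbl[[i]] <- gsub('\\"', '"', x, fixed = TRUE)
    }
  }
  tbl
}

# Usage on a temporary copy of the test file
tmp <- tempfile(fileext = ".txt")
writeLines(c('13,"foo","Fab D\\"atri","bar"',
             '21,"foo2","Fab D\\"atri2","bar2"'), tmp)
tbl2 <- read_escaped_csv(tmp)
```

Using fixed = TRUE in gsub avoids having to double the backslashes again for the regular expression engine.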