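Actually, strsplit uses regular-expression patterns as well, the same kind grep does. (Inside an R string a regex escape needs a doubled backslash, which is why the commas are written as "\\," in the pattern argument. Strictly speaking, a comma is only special inside a {n,m} quantifier, so escaping it is a precaution rather than a requirement. A space needs no escaping either, so using "\\s" would be more to improve readability than of necessity):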
> strsplit(test_1, "\\, |\\,| ")
[[1]]
[1] "abc" "def" "ghi" "klm"
> strsplit(test_2, "\\, |\\,| ")
[[1]]
[1] "abc" "def" "ghi" "klm"
Without using both "\\, " (comma followed by a space) and "\\," as alternatives, you would have gotten some empty strings ("") in the result. It might have been clearer if I had written:
> strsplit(test_2, "\\,\\s|\\,|\\s")
[[1]]
[1] "abc" "def" "ghi" "klm"
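To see why both comma alternatives are needed, here is a small self-contained sketch. The input string x is made up for illustration (it is not the original test_1 or test_2), so treat the values as an assumption:

```r
# Hypothetical input mixing ", ", "," and " " as separators
x <- "abc, def,ghi klm"

# Only "," and " " as alternatives: the space after the first comma
# is matched as a separate delimiter, leaving an empty string behind
naive <- strsplit(x, ",| ")[[1]]
print(naive)

# Listing ", " as its own alternative consumes comma and space together
fixed <- strsplit(x, ", |,| ")[[1]]
print(fixed)
```

The first call should leave an empty string between "abc" and "def"; the second should not.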
@Fojtasek is so right: using a character class often simplifies the task because it creates an implicit logical OR over its members, and the trailing + treats a whole run of separators as a single split point:
> strsplit(test_2, "[, ]+")
[[1]]
[1] "abc" "def" "ghi" "klm"
> strsplit(test_1, "[, ]+")
[[1]]
[1] "abc" "def" "ghi" "klm"
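One practical difference between the two approaches shows up when separators repeat. A minimal sketch, assuming a made-up input y with a comma followed by two spaces (not one of the original test strings):

```r
# Hypothetical input: a comma followed by two spaces, then a single space
y <- "abc,  def ghi"

# The alternation matches ", " and then treats the second space
# as a new delimiter, which produces an empty string
alt <- strsplit(y, ", |,| ")[[1]]

# The character class with "+" swallows the whole run in one match
cls <- strsplit(y, "[, ]+")[[1]]

print(alt)
print(cls)
```

The class-based pattern is shorter and also tolerates runs of mixed commas and spaces, which is probably why it is the easier one to maintain here.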