Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


0 votes
305 views
in Technique[技术] by (71.8m points)

How to properly escape a backslash to match a literal backslash in single-quoted and double-quoted PHP regex patterns

To match a literal backslash, many people and the PHP manual say: Always triple escape it, like this \\\\

Note:

Single and double quoted PHP strings have special meaning of backslash. Thus if \ has to be matched with a regular expression \\, then "\\\\" or '\\\\' must be used in PHP code.

Here is an example string: \test

$test = "\\test"; // contains \test

// WON'T WORK: pattern in double quotes with double-escaped backslash
echo preg_replace("~\\\t~", '', $test), "\n"; // output -> \test

// WORKS: pattern in double quotes with triple-escaped backslash
echo preg_replace("~\\\\t~", '', $test), "\n"; // output -> est

// WORKS: pattern in single quotes with double-escaped backslash
echo preg_replace('~\\\t~', '', $test), "\n"; // output -> est

// WORKS: pattern in double quotes with double-escaped backslash inside a character class
echo preg_replace("~[\\\]t~", '', $test), "\n"; // output -> est

// WORKS: pattern in single quotes with double-escaped backslash inside a character class
echo preg_replace('~[\\\]t~', '', $test), "\n"; // output -> est

Conclusion:

  • If the pattern is single-quoted, a backslash has to be double-escaped (\\\) to match a literal \
  • If the pattern is double-quoted, it depends on whether the backslash is inside a character class: inside one it must be at least double-escaped (\\\), outside one it has to be triple-escaped (\\\\)
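
The five cases above can be checked in one pass. The following is a minimal sketch, not a definitive test: the array labels are hypothetical names chosen for illustration, and a real tab character in a pattern is printed as <TAB> so that it is visible.

```php
<?php
// Subject: a literal backslash followed by "test".
$test = "\\test";

// Hypothetical labels mapped to the patterns from the examples above.
$patterns = [
    'double-quoted, 3 backslashes' => "~\\\t~",
    'double-quoted, 4 backslashes' => "~\\\\t~",
    'single-quoted, 3 backslashes' => '~\\\t~',
    'double-quoted, class'         => "~[\\\]t~",
    'single-quoted, class'         => '~[\\\]t~',
];

$results = [];
foreach ($patterns as $label => $pattern) {
    $results[$label] = preg_replace($pattern, '', $test);
    // Show the string PCRE actually receives, with tabs made visible.
    $visible = str_replace("\t", '<TAB>', $pattern);
    printf("%-30s pattern=%s result=%s\n", $label, $visible, $results[$label]);
}
```

Only the first pattern leaves the subject unchanged: in double quotes, \t becomes a tab before PCRE ever sees the pattern, so the regex looks for a tab rather than a backslash followed by t.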

Who can show me a case where a double-escaped backslash in a single-quoted pattern, e.g. '~\\\~', would match anything different from a triple-escaped backslash in a double-quoted pattern, e.g. "~\\\\~", or would fail?

When/why/in what scenario would it be wrong to use a double-escaped \\\ in a single-quoted pattern, e.g. '~\\\~', for matching a literal backslash?

If there's no answer to this question, I would continue to always use a double-escaped backslash \\\ in a single-quoted PHP regex pattern to match a literal \, because there's possibly nothing wrong with it.



1 Reply

0 votes
by (71.8m points)

A backslash character (\) is treated as an escape character by both PHP's parser and the regular expression engine (PCRE). If you write a single backslash, the PHP parser treats it as an escape character. If you write two backslashes, the PHP parser interprets them as one literal backslash. But when that string is used as a regular expression, the regex engine in turn reads the remaining backslash as an escape character. To avoid this, write four backslashes in the PHP source. Depending on how you quote the pattern and which character follows, three can also work.
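
The two layers can be observed separately. This is a small sketch under assumptions: the subject string C:\temp and the variable names are made up for illustration, and the failure of the two-backslash pattern (a warning plus a false return value) is the behaviour I would expect from current PHP versions.

```php
<?php
// Layer 1: the PHP parser. Four backslashes in the source become two characters.
$pattern = '~\\\\~';
echo strlen($pattern), "\n"; // 4 characters: ~ \ \ ~

// Layer 2: PCRE. It reads the remaining \\ as "one literal backslash".
$subject = 'C:\\temp'; // hypothetical subject, the string C:\temp
$matched = preg_match($pattern, $subject);
echo $matched, "\n"; // 1

// With only two backslashes in the source, the pattern string is ~\~ and the
// lone backslash escapes the closing delimiter, so the pattern is invalid.
$broken = '~\\~';
$result = @preg_match($broken, $subject); // @ hides the expected warning
var_dump($result); // expected: bool(false)
```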

To understand the difference between the two types of quoting patterns, consider the following two var_dump() statements:

var_dump('~\\\~');
var_dump("~\\\\~");

Output:

string(4) "~\\~"
string(4) "~\\~"

The escape sequence \~ has no special meaning in PHP when it's used in a single-quoted string (nor in a double-quoted one). Three backslashes also work because the PHP parser doesn't know the escape sequence \~. So \\ becomes \, but \~ remains \~.
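
Whether three backslashes behave like four therefore depends on the character that follows them. A short sketch comparing the resulting strings directly, with variable names chosen only for illustration:

```php
<?php
// Before the delimiter ~, three and four backslashes produce the same string
// in either quoting style, because \~ is not a recognized escape sequence.
$same = ('~\\\~' === "~\\\\~") && ("~\\\~" === '~\\\\~');
var_dump($same); // bool(true)

// Before a letter such as t, double quotes turn \t into a tab character,
// so three backslashes no longer equal four.
$threeDouble = "\\\t";  // backslash + TAB
$fourDouble  = "\\\\t"; // backslash + backslash + t
$threeSingle = '\\\t';  // backslash + backslash + t (single quotes keep \t)

var_dump($threeDouble === $fourDouble); // bool(false)
var_dump($threeSingle === $fourDouble); // bool(true)
```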

Which one should you use:

For clarity, I'd always use ~\\\\~ when I want to match a literal backslash. The other one works too, but I think ~\\\\~ is clearer.



...