Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


php - Remove comments from JSON data

I need to remove all /*...*/ style comments from JSON data. How do I do it with regular expressions so that string values like this

{
    "propName": "Hello \" /* hi */ there."
}

remain unchanged?


1 Reply


First skip everything that is inside double quotes, using the backtracking control verbs (*SKIP) and (*FAIL) (or a capture group):

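$string = <<<'LOD'
{
    "propName": "Hello \" /* don't remove **/ there." /*this must be removed*/
}
LOD;

// Quoted strings are matched and then discarded with (*SKIP)(*FAIL); only /*...*/ comments are replaced.
$result = preg_replace('~"(?:[^\\\"]+|\\\.)*+"(*SKIP)(*FAIL)|/\*(?:[^*]+|\*+(?!/))*+\*/~s', '', $string);

// The same with a capture: quoted strings are put back via $1, comments are dropped.
$result = preg_replace('~("(?:[^\\\"]+|\\\.)*+")|/\*(?:[^*]+|\*+(?!/))*+\*/~s', '$1', $string);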

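Pattern details:

"(?:[^\\\"]+|\\\.)*+"

This part describes the possible content inside quotes. The pattern is shown as written inside a PHP single-quoted string: \\\" reaches the regex engine as \\" (a literal backslash, then a quote) and \\\. as \\. (a literal backslash followed by any character).

"              # literal quote
(?:            # open a non-capturing group
    [^\\\"]+   # all characters that are not \ or "
  |            # OR
    \\\.       # escaped char (that can be a quote)
)*+            # repeat the group zero or more times (possessive)
"              # literal quote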

Then you can make this subpattern fail with (*SKIP)(*FAIL) or (*SKIP)(?!). (*SKIP) forbids backtracking past this point if the pattern fails later: the next match attempt starts at the position where (*SKIP) was reached, not at the next character. (*FAIL) forces the pattern to fail. Quoted parts are therefore skipped (and cannot be part of the result, since you make the subpattern fail right after matching them).

Alternatively, use a capturing group around the quoted part and put its reference ($1) in the replacement pattern.
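
As a rough check that the two approaches agree, here is a minimal sketch that wraps each pattern in a helper and compares the results on the sample input. The function names stripCommentsSkipFail and stripCommentsCapture are hypothetical, introduced only for illustration; the sketch assumes a PCRE build that supports backtracking control verbs.

```php
<?php
// Hypothetical helper names, not part of the original answer.

// Variant 1: quoted strings are matched, then discarded with (*SKIP)(*FAIL).
function stripCommentsSkipFail(string $json): string
{
    $pattern = '~"(?:[^\\\"]+|\\\.)*+"(*SKIP)(*FAIL)|/\*(?:[^*]+|\*+(?!/))*+\*/~s';
    $result = preg_replace($pattern, '', $json);
    if ($result === null) {
        throw new RuntimeException('preg_replace failed, error code ' . preg_last_error());
    }
    return $result;
}

// Variant 2: quoted strings are captured and restored through $1.
// When a comment matches instead, group 1 is unset and $1 expands to an empty string.
function stripCommentsCapture(string $json): string
{
    $pattern = '~("(?:[^\\\"]+|\\\.)*+")|/\*(?:[^*]+|\*+(?!/))*+\*/~s';
    $result = preg_replace($pattern, '$1', $json);
    if ($result === null) {
        throw new RuntimeException('preg_replace failed, error code ' . preg_last_error());
    }
    return $result;
}

$input = <<<'LOD'
{
    "propName": "Hello \" /* don't remove **/ there." /*this must be removed*/
}
LOD;

$a = stripCommentsSkipFail($input);
$b = stripCommentsCapture($input);

var_dump($a === $b);              // bool(true): both variants produce the same string
var_dump(json_decode($a, true));  // the cleaned text now parses as JSON
```

Checking preg_replace for null is a design choice of this sketch: on long inputs a PCRE limit error would otherwise silently turn the whole document into null.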

/\*(?:[^*]+|\*+(?!/))*+\*/

This part describes the content inside comments:

/\*           # open the comment
(?:           # open a non-capturing group
    [^*]+     # all characters except *
  |           # OR
    \*+(?!/)  # * not followed by / (note that you can't use 
              # a possessive quantifier here)
)*+           # repeat the group zero or more times
\*/           # close the comment
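
To illustrate the note about the possessive quantifier, the following small sketch compares the greedy \*+ with a possessive \*++ on a comment that ends in two stars. The variable names are arbitrary and the test string is made up for this example.

```php
<?php
// A comment whose closing delimiter is preceded by an extra star.
$comment = '/* a **/';

// Greedy \*+ can give back one star, so the final \*/ still finds its star.
$greedy = '~/\*(?:[^*]+|\*+(?!/))*+\*/~';

// Possessive \*++ keeps both stars; the lookahead then fails and nothing can be given back.
$possessive = '~/\*(?:[^*]+|\*++(?!/))*+\*/~';

var_dump(preg_match($greedy, $comment));     // int(1)
var_dump(preg_match($possessive, $comment)); // int(0)
```

With the possessive form, comments such as /* ... **/ would never be matched and so would stay in the data.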

The s modifier only matters here when a backslash comes directly before a newline inside quotes: it lets the dot in \\\. match that newline.
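
The effect of the s modifier can be seen with a small sketch like the one below. The input is a made-up edge case (a backslash directly followed by a newline is not valid inside a JSON string), built with an explicit "\n" so the line ending does not depend on how the file is saved.

```php
<?php
// Quoted text containing: abc, a backslash, a newline, then something that looks like a comment.
$input = "\"abc \\\n /* x */\"";

$withS    = '~"(?:[^\\\"]+|\\\.)*+"(*SKIP)(*FAIL)|/\*(?:[^*]+|\*+(?!/))*+\*/~s';
$withoutS = '~"(?:[^\\\"]+|\\\.)*+"(*SKIP)(*FAIL)|/\*(?:[^*]+|\*+(?!/))*+\*/~';

// With s, the dot matches the newline, the whole quoted part is skipped, nothing is removed.
var_dump(preg_replace($withS, '', $input) === $input);    // bool(true)

// Without s, the quoted part cannot be matched as a unit, so /* x */ inside it is stripped.
var_dump(preg_replace($withoutS, '', $input) === $input); // bool(false)
```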

