Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

php - Parsing a string with recursive parentheses

I'm trying to parse a string with the following structure in PHP:

a,b,c(d,e,f(g),h,i(j,k)),l,m,n(o),p

For example, a "real" string will be:

id,topic,member(name,email,group(id,name)),message(id,title,body)

My end result should be an array:

[
   id => null,
   topic => null,
   member => [
      name => null,
      email => null,
      group => [
         id => null,
         name => null
      ]
   ],
   message => [
      id => null,
      title => null,
      body => null
   ]
]

I've tried a recursive regex, but got totally lost. I've had some success iterating over the string's characters, but that seems a bit "over complicated", and I'm sure this is something a regex can handle; I just don't know how.

The purpose is to parse a fields query parameter for a REST API, so that the client can select the fields they want from a complex object collection, and I don't want to limit the "depth" of the field selection.

1 Reply

As Wiktor pointed out, this can be achieved with the help of a lexer. The following answer uses a class originally from Nikita Popov, which can be found here.

What it does

The lexer scans the string and matches it against the patterns defined in $tokenMap, which map to the token types T_FIELD, T_SEPARATOR, T_OPEN and T_CLOSE. The tokens found, each a pair of matched text and token type, are put in an array called $structure.
Afterwards we need to loop over this array and build the nested structure out of it. As fields can be nested to any depth, I chose a recursive approach (generate()).
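
To make the first step more concrete, here is a rough sketch of how such a lexer could produce the token pairs. It is only an illustrative stand-in, not the actual Lexer class: the function name tokenize() and the TOK_* constants are made up for this example, and it assumes every token type can be described by one regex alternative.

```php
<?php
// Hypothetical token type constants; the values are arbitrary.
const TOK_FIELD = 1;
const TOK_SEPARATOR = 2;
const TOK_OPEN = 3;
const TOK_CLOSE = 4;

/**
 * Split $input into [text, type] pairs.
 * Illustrative stand-in for a lexer class, not its real implementation.
 */
function tokenize(string $input): array
{
    $map = [
        '[^,()]+' => TOK_FIELD,
        ','       => TOK_SEPARATOR,
        '\('      => TOK_OPEN,
        '\)'      => TOK_CLOSE,
    ];
    $types = array_values($map);
    // One alternation with a capture group per token type;
    // the "A" modifier anchors each match at the current offset.
    $regex = '~(' . implode(')|(', array_keys($map)) . ')~A';

    $tokens = [];
    $offset = 0;
    $length = strlen($input);
    while ($offset < $length) {
        if (!preg_match($regex, $input, $m, 0, $offset)) {
            throw new InvalidArgumentException("Unexpected character at offset $offset");
        }
        // preg_match() drops trailing unmatched groups, so the last
        // entry of $m belongs to the alternative that matched.
        $tokens[] = [$m[0], $types[count($m) - 2]];
        $offset += strlen($m[0]);
    }
    return $tokens;
}

print_r(tokenize('a(b)'));
```

For the input 'a(b)' this gives four pairs: 'a' as a field, '(' as an opening parenthesis, 'b' as a field and ')' as a closing parenthesis. The separators are kept in the token list and simply ignored later when the structure is built.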

Demo

A demo can be found on ideone.com.

Code

The actual code with explanations:

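// token types used in $tokenMap (the values are arbitrary; skip these
// lines if the constants are already defined elsewhere in your code)
define('T_FIELD', 1);
define('T_SEPARATOR', 2);
define('T_OPEN', 3);
define('T_CLOSE', 4);

// this is our $tokenMap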
$tokenMap = array(
    '[^,()]+'       => T_FIELD,     # not comma or parentheses
    ','             => T_SEPARATOR, # a comma
    '\('            => T_OPEN,      # an opening parenthesis (escaped for the regex)
    '\)'            => T_CLOSE      # a closing parenthesis (escaped for the regex)
);

// this is your string
$string = "id,topic,member(name,email,group(id,name)),message(id,title,body)";

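// a recursive function to actually build the structure
// $idx is passed by reference so the caller resumes after the tokens
// that a nested call has already consumed
function generate($arr=array(), &$idx=0) {
    $output = array();
    $current = null;
    for(;$idx<count($arr);$idx++) {
        list($element, $type) = $arr[$idx];
        if ($type == T_OPEN) {
            $idx++; // step past the opening parenthesis
            $output[$current] = generate($arr, $idx);
            // $idx now sits on the matching T_CLOSE; the loop's $idx++ skips it
        }
        elseif ($type == T_CLOSE)
            return $output;
        elseif ($type == T_FIELD) {
            $output[$element] = null;
            $current = $element;
        }
    }
    return $output;
}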

$lex = new Lexer($tokenMap);
$structure = $lex->lex($string);

print_r(generate($structure));
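
If you want to try the idea without the external Lexer class, the following is a dependency-free sketch of the same two steps. It swaps the lexer for a single preg_match_all() call and adds checks for unbalanced parentheses; the names buildFields() and parseFields() are made up for this example, and it assumes field names never contain commas or parentheses.

```php
<?php
/**
 * Build the nested field array from a flat list of string tokens.
 * $pos is passed by reference so the caller resumes after the group
 * that a nested call has consumed.
 */
function buildFields(array $tokens, int &$pos = 0, int $depth = 0): array
{
    $fields = [];
    $last = null; // most recent field name, the key a "(" attaches to
    while ($pos < count($tokens)) {
        $token = $tokens[$pos++];
        if ($token === '(') {
            if ($last === null) {
                throw new InvalidArgumentException('"(" without a field name');
            }
            $fields[$last] = buildFields($tokens, $pos, $depth + 1);
        } elseif ($token === ')') {
            if ($depth === 0) {
                throw new InvalidArgumentException('Unbalanced ")"');
            }
            return $fields;
        } elseif ($token !== ',') {
            $fields[$token] = null;
            $last = $token;
        }
    }
    if ($depth > 0) {
        throw new InvalidArgumentException('Missing ")"');
    }
    return $fields;
}

function parseFields(string $query): array
{
    // A token is either a run of field-name characters
    // or a single comma / parenthesis.
    preg_match_all('/[^,()]+|[,()]/', $query, $m);
    $pos = 0;
    return buildFields($m[0], $pos);
}

$result = parseFields('id,topic,member(name,email,group(id,name)),message(id,title,body)');
var_export($result);
```

For the example string this produces the nested array asked for in the question, with member and message holding sub-arrays and every leaf field set to null. Since the input comes from a query parameter, rejecting malformed strings early seemed worthwhile here, but how strict to be is a design choice.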
