Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


php - Capturing all method arguments default values

I'm reverse engineering PHP methods because the built-in ReflectionClass mechanism is insufficient for my current project.

Currently I want to extract method prototypes using regular expressions, and I'm stuck on retrieving default argument values. I pass the contents of the prototype's parentheses to the static method MethodArgs::createFromString(). Its goal is to extract every argument from the string, including its type, name, and default value, and to create an instance of itself. So far I can retrieve default values for both single-quoted and double-quoted strings, including edge cases with escaped quotes such as ' \' ' or " \" ". But the range of values PHP accepts as a default argument is larger than that, and I'm having trouble extending my regexp to also match booleans, integers, floats, and arrays.

<?php
class MethodArgs
{
    static public function createFromString($str) {
        $str = "   Peer \$M = null, Template \$T='variable \'value', BlaBlaBla \$Bla = \" blablabla \\\" bleble \"   ";

        //$pat = '#(?:(?:\'|")(?<val>(?:[^\'"]|(?<=\\\\)(?:\'|"))*)(?:\'|"))+#i';
        //$pat = '#(?:(?<type>[^$\s,()]+)\s)?\$(?<name>[^,.\s)=]+)(?:\s*=\s*)?(?:\'(?<val>(?:[^\']|(?<=\\\\)\')*)\')?#i';
        $pat = '#(?:(?<type>[^$\s,()]+)\s)?\$(?<name>[^,.\s)=]+)(?:\s*=\s*)?(?:(?:\'|")(?<val>(?:[^\'"]|(?<=\\\\)(?:\'|"))*)(?:\'|"))?#i';

        $a = preg_match_all($pat, $str, $match);
        var_dump(array('$a' => $a, '$pat' => $pat, '$str' => $str, '$match' => $match));
        die();

        /*$Args = new static();
        for($i=0; $i<count($match[0]); $i++) {
            $Arg = new MethodArg();
            $Arg->setType($match['type'][$i]);
            $Arg->setName($match['name'][$i]);
            $Arg->setDefaultValue($match['val'][$i]);
            $Args[] = $Arg;
        }

        return $Args;*/
    }
}

Output (screenshot):

Array
(
    [$a] => 3
    [$pat] => #(?:(?<type>[^$\s,()]+)\s)?\$(?<name>[^,.\s)=]+)(?:\s*=\s*)?(?:(?:'|")(?<val>(?:[^'"]|(?<=\\)(?:'|"))*)(?:'|"))?#i
    [$str] =>    Peer $M = null, Template $T='variable \'value', BlaBlaBla $Bla = " blablabla \" bleble "   
    [$match] => Array
        (
            [0] => Array
                (
                    [0] => Peer $M = 
                    [1] => Template $T='variable \'value'
                    [2] => BlaBlaBla $Bla = " blablabla \" bleble "
                )

            [type] => Array
                (
                    [0] => Peer
                    [1] => Template
                    [2] => BlaBlaBla
                )

            [1] => Array
                (
                    [0] => Peer
                    [1] => Template
                    [2] => BlaBlaBla
                )

            [name] => Array
                (
                    [0] => M
                    [1] => T
                    [2] => Bla
                )

            [2] => Array
                (
                    [0] => M
                    [1] => T
                    [2] => Bla
                )

            [val] => Array
                (
                    [0] => 
                    [1] => variable \'value
                    [2] =>  blablabla \" bleble 
                )

            [3] => Array
                (
                    [0] => 
                    [1] => variable \'value
                    [2] =>  blablabla \" bleble 
                )

        )

)

~ Thanks in advance for any advice



1 Reply


If you are trying to parse single- or double-quoted strings, it should be done
in two steps: validation, then parsing for the values.

You could probably do both in a single regex by using the \G anchor,
validating at \A\G and parsing with just \G.
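
One possible way to fold validation and parsing into a single expression is sketched here. It is an assumption about how the two patterns given in this answer could be combined, not the answer author's own regex: the first match is only allowed at the start of the string, and only if a lookahead confirms that every quote is balanced; later matches must continue exactly where the previous one ended. The variable names are hypothetical, and the pattern is written as a nowdoc so that no backslash has to be doubled for PHP.

```php
<?php
// Sketch only: combined validate-and-parse pattern (hypothetical name $combined).
$combined = <<<'RE'
~
 (?:
      \A
      (?=                          # validate the whole string once, at the start
           [^"']*
           (?: " [^"\\]* (?: \\ . [^"\\]* )* "
             | ' [^'\\]* (?: \\ . [^'\\]* )* '
             | [^"']
           )*
           \z
      )
   |
      \G (?! \A )                  # later matches continue where the last one ended
 )
 [^"']*                            # skip text outside of quotes
 (?|
      " ( [^"\\]* (?: \\ . [^"\\]* )* ) "
   |
      ' ( [^'\\]* (?: \\ . [^'\\]* )* ) '
 )
~sx
RE;

// Sample input taken from the question (nowdoc keeps the backslashes as typed).
$input = <<<'TXT'
Peer $M = null, Template $T='variable \'value', BlaBlaBla $Bla = " blablabla \" bleble "
TXT;

preg_match_all($combined, $input, $m);
print_r($m[1]); // one entry per quoted string, escape sequences left as written
```

If the input contains an unterminated quote, the lookahead fails at the start and preg_match_all() returns no matches at all, which is the point of validating first.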

If you are sure the input is valid, you can skip the validation.
Below are the two parts (they can be combined if needed).
Note that both match the single- or double-quoted strings using the unrolled-loop method,
which is pretty quick.

Validation:

 # Validation:  '~^(?s)[^"\']*(?:"[^"\\\\]*(?:\\\\.[^"\\\\]*)*"|\'[^\'\\\\]*(?:\\\\.[^\'\\\\]*)*\'|[^"\'])*$~'

 ^
 (?s)
 [^"']*
 (?:
      "
      [^"\\]*
      (?: \\ . [^"\\]* )*
      "
   |
      '
      [^'\\]*
      (?: \\ . [^'\\]* )*
      '
   |
      [^"']
 )*
 $
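
A minimal sketch of how the validation pattern might be applied in PHP. The helper name hasBalancedQuotes() is hypothetical and not part of the original answer. The pattern is kept in its expanded form inside a nowdoc with the x flag, which avoids doubling backslashes for PHP.

```php
<?php
// Validation pattern in expanded form; s = dot-all, x = ignore layout whitespace.
$validate = <<<'RE'
~
 ^
 [^"']*
 (?:
      " [^"\\]* (?: \\ . [^"\\]* )* "
   |
      ' [^'\\]* (?: \\ . [^'\\]* )* '
   |
      [^"']
 )*
 $
~sx
RE;

// Hypothetical helper: true when every quote in $s belongs to a closed string.
function hasBalancedQuotes(string $s, string $pattern): bool
{
    return preg_match($pattern, $s) === 1;
}

var_dump(hasBalancedQuotes("Template \$T='variable \\'value'", $validate));
var_dump(hasBalancedQuotes("BlaBlaBla \$Bla = \" unterminated", $validate));
```

The first call checks a string whose escaped quote sits inside a closed literal; the second checks a string with a double quote that is never closed.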

Parsing:

 # Parsing:  '~(?s)(?|"([^"\\\\]*(?:\\\\.[^"\\\\]*)*)"|\'([^\'\\\\]*(?:\\\\.[^\'\\\\]*)*)\')~'

 (?s)                          # Dot all modifier
 (?|                           # Branch Reset
      "
      (                             # (1), double quoted string data
           [^"\\]*
           (?: \\ . [^"\\]* )*
      )
      "
   |                              # OR
      '
      (                             # (1), single quoted string data
           [^'\\]*
           (?: \\ . [^'\\]* )*
      )
      '
 )
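
As a rough illustration, the parsing pattern can be run with preg_match_all() against the sample string from the question. The variable names are assumptions made for this sketch. Because of the branch reset group, both alternatives write into capture group 1, so the string bodies all end up in $m[1] regardless of the quote style.

```php
<?php
// Parsing pattern in expanded form (nowdoc, so backslashes are written once).
$parse = <<<'RE'
~
 (?|
      " ( [^"\\]* (?: \\ . [^"\\]* )* ) "
   |
      ' ( [^'\\]* (?: \\ . [^'\\]* )* ) '
 )
~sx
RE;

// Sample prototype contents from the question.
$input = <<<'TXT'
Peer $M = null, Template $T='variable \'value', BlaBlaBla $Bla = " blablabla \" bleble "
TXT;

preg_match_all($parse, $input, $m);

// $m[1] holds the raw bodies; escape sequences are still in their written form.
print_r($m[1]);
```

Turning the captured bodies into actual PHP values (resolving \' and \" and so on) would be a separate step and is not covered by this sketch.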
