Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
133 views
in Technique[技术] by (71.8m points)

php - String concatenation while incrementing

This is my code:

$a = 5;
$b = &$a;
echo ++$a.$b++;

Shouldn't it print 66?

Why does it print 76?

question from:https://stackoverflow.com/questions/17188279/string-concatenation-while-incrementing

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Alright. This is actually pretty straight forward behavior, and it has to do with how references work in PHP. It is not a bug, but unexpected behavior.

PHP internally uses copy-on-write. Which means that the internal variables are copied when you write to them (so $a = $b; doesn't copy memory until you actually change one of them). With references, it never actually copies. That's important for later.

Let's look at those opcodes:

line     # *  op                           fetch          ext  return  operands
---------------------------------------------------------------------------------
   2     0  >   ASSIGN                                                   !0, 5
   3     1      ASSIGN_REF                                               !1, !0
   4     2      PRE_INC                                          $2      !0
         3      POST_INC                                         ~3      !1
         4      CONCAT                                           ~4      $2, ~3
         5      ECHO                                                     ~4
         6    > RETURN                                                   1

The first two should be pretty easy to understand.

  • ASSIGN - Basically, we're assinging the value of 5 into the compiled variable named !0.
  • ASSIGN_REF - We're creating a reference from !0 to !1 (the direction doesn't matter)

So far, that's straight forward. Now comes the interesting bit:

  • PRE_INC - This is the opcode that actually increments the variable. Of note is that it returns its result into a temporary variable named $2.

So let's look at the source code behind PRE_INC when called with a variable:

static int ZEND_FASTCALL  ZEND_PRE_INC_SPEC_VAR_HANDLER(ZEND_OPCODE_HANDLER_ARGS)
{
    USE_OPLINE
    zend_free_op free_op1;
    zval **var_ptr;

    SAVE_OPLINE();
    var_ptr = _get_zval_ptr_ptr_var(opline->op1.var, execute_data, &free_op1 TSRMLS_CC);

    if (IS_VAR == IS_VAR && UNEXPECTED(var_ptr == NULL)) {
        zend_error_noreturn(E_ERROR, "Cannot increment/decrement overloaded objects nor string offsets");
    }
    if (IS_VAR == IS_VAR && UNEXPECTED(*var_ptr == &EG(error_zval))) {
        if (RETURN_VALUE_USED(opline)) {
            PZVAL_LOCK(&EG(uninitialized_zval));
            AI_SET_PTR(&EX_T(opline->result.var), &EG(uninitialized_zval));
        }
        if (free_op1.var) {zval_ptr_dtor(&free_op1.var);};
        CHECK_EXCEPTION();
        ZEND_VM_NEXT_OPCODE();
    }

    SEPARATE_ZVAL_IF_NOT_REF(var_ptr);

    if (UNEXPECTED(Z_TYPE_PP(var_ptr) == IS_OBJECT)
       && Z_OBJ_HANDLER_PP(var_ptr, get)
       && Z_OBJ_HANDLER_PP(var_ptr, set)) {
        /* proxy object */
        zval *val = Z_OBJ_HANDLER_PP(var_ptr, get)(*var_ptr TSRMLS_CC);
        Z_ADDREF_P(val);
        fast_increment_function(val);
        Z_OBJ_HANDLER_PP(var_ptr, set)(var_ptr, val TSRMLS_CC);
        zval_ptr_dtor(&val);
    } else {
        fast_increment_function(*var_ptr);
    }

    if (RETURN_VALUE_USED(opline)) {
        PZVAL_LOCK(*var_ptr);
        AI_SET_PTR(&EX_T(opline->result.var), *var_ptr);
    }

    if (free_op1.var) {zval_ptr_dtor(&free_op1.var);};
    CHECK_EXCEPTION();
    ZEND_VM_NEXT_OPCODE();
}

Now I don't expect you to understand what that's doing right away (this is deep engine voodoo), but let's walk through it.

The first two if statements check to see if the variable is "safe" to increment (the first checks to see if it's an overloaded object, the second checks to see if the variable is the special error variable $php_error).

Next is the really interesting bit for us. Since we're modifying the value, it needs to preform copy-on-write. So it calls:

SEPARATE_ZVAL_IF_NOT_REF(var_ptr);

Now, remember, we already set the variable to be a reference above. So the variable is not separated... Which means everything we do to it here will happen to $b as well...

Next, the variable is incremented (fast_increment_function()).

Finally, it sets the result as itself. This is copy-on-write again. It's not returning the value of the operation, but the actual variable. So what PRE_INC returns is still a reference to $a and $b.

  • POST_INC - This behaves similarly to PRE_INC, except for one VERY important fact.

Let's check out the source code again:

static int ZEND_FASTCALL  ZEND_POST_INC_SPEC_VAR_HANDLER(ZEND_OPCODE_HANDLER_ARGS)
{
    retval = &EX_T(opline->result.var).tmp_var;
    ZVAL_COPY_VALUE(retval, *var_ptr);
    zendi_zval_copy_ctor(*retval);

    SEPARATE_ZVAL_IF_NOT_REF(var_ptr);
    fast_increment_function(*var_ptr);
}

This time I cut away all of the non-interesting stuff. So let's look at what it's doing.

First, it gets the return temporary variable (~3 in our code above).

Then it copies the value from its argument (!1 or $b) into the result (and hence the reference is broken).

Then it increments the argument.

Now remember, the argument !1 is the variable $b, which has a reference to !0 ($a) and $2, which if you remember was the result from PRE_INC.

So there you have it. It returns 76 because the reference is maintained in PRE_INC's result.

We can prove this by forcing a copy, by assigning the pre-inc to a temporary variable first (through normal assignment, which will break the reference):

$a = 5;
$b = &$a;
$c = ++$a;
$d = $b++;
echo $c.$d;

Which works as you expected. Proof

And we can reproduce the other behavior (your bug) by introducing a function to maintain the reference:

function &pre_inc(&$a) {
    return ++$a;
}

$a = 5;
$b = &$a;
$c = &pre_inc($a);
$d = $b++;
echo $c.$d;

Which works as you're seeing it (76): Proof

Note: the only reason for the separate function here is that PHP's parser doesn't like $c = &++$a;. So we need to add a level of indirection through the function call to do it...

The reason I don't consider this a bug is that it's how references are supposed to work. Pre-incrementing a referenced variable will return that variable. Even a non-referenced variable should return that variable. It may not be what you expect here, but it works quite well in almost every other case...

The Underlying Point

If you're using references, you're doing it wrong about 99% of the time. So don't use references unless you absolutely need them. PHP is a lot smarter than you may think at memory optimizations. And your use of references really hinders how it can work. So while you think you may be writing smart code, you're really going to be writing less efficient and less friendly code the vast majority of the time...

And if you want to know more about References and how variables work in PHP, checkout One Of My YouTube Videos on the subject...


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...