blocks and the stack

According to bbum:

2) Blocks are created on the stack. Careful.

Consider:

typedef int(^Blocky)(void);
Blocky b[3];

for (int i=0; i<3; i++)
    b[i] = ^{ return i;};
for (int i=0; i<3; i++)
    printf("b %d\n", b[i]());

You might reasonably expect the above to output:

0
1
2

But, instead, you get:

2
2
2

Since the block is allocated on the stack, the code is nonsense. It only outputs what it does because the Block created within the lexical scope of the for() loop’s body hasn’t happened to have been reused for something else by the compiler.

I don't understand that explanation. If the blocks are created on the stack, then after the for loop completes wouldn't the stack look something like this:

 stack:
---------
^{ return i;} #3rd block
^{ return i;} #2nd block
^{ return i;} #1st block

But bbum seems to be saying that when each iteration of the for loop completes, the block is popped off the stack; then after the last pop, the 3rd block just happens to be sitting there in unclaimed memory. Then somehow, when you call the blocks, the pointers all refer to the 3rd block??

There are 3 answers

1
7stud On

Mike Ash provides the answer:

Block objects [which are allocated on the stack] are only valid through the lifetime of their enclosing scope

In bbum's example, the scope of the block is the for-loop's enclosing braces (which bbum omitted):

for (int i=0; i<3; i++) { // <------
    b[i] = ^{ return i;};
} // <-----

So, each time through the loop, the newly created block is pushed onto the stack; then when each loop ends, the block is popped off the stack.

If you print those 3 addresses as they are created, I bet they are all the same.

Yes, I think that is how it must have worked in the past. Now, however, a loop does not appear to cause the block to be popped off the stack, so I guessed that it must be the method's braces that determine the block's enclosing scope. Edit: Nope. I constructed an experiment in which each block is created inside its own method call, and I still get a different address for each block:

AppDelegate.h:

typedef int(^Blocky)(void);  // ****** TYPEDEF HERE ******

@interface AppDelegate : NSObject <NSApplicationDelegate> 

@end

AppDelegate.m:

#import "AppDelegate.h"

@implementation AppDelegate

    -(Blocky)blockTest:(int)i {
        // If the block is allocated on the stack, it should be popped off the stack at the end of this method.
        Blocky myBlock = ^{return i;};
        NSLog(@"%p", myBlock);
        return myBlock;
    }


    - (void)applicationDidFinishLaunching:(NSNotification *)aNotification {
        // Insert code here to initialize your application

        Blocky b[3];

        for (int i=0; i < 3; ++i) {
             b[i] = [self blockTest:i];
        }

        for (int j=0; j < 3; ++j) {
            NSLog(@"%d", b[j]() );
        }
    }

@end

--output:--
0x608000051820
0x608000051850
0x6080000517c0
0
1
2

That looks to me like blocks are allocated on the heap.

Okay, my results above are due to ARC. If I turn off ARC, then I get different results:

0x7fff5fbfe658
0x7fff5fbfe658
0x7fff5fbfe658
2
1606411952
1606411952

That looks like stack allocation. Each pointer points to the same area of memory because after a block is popped off the stack, that area of memory is reused for the next block.

So it looks like the first call just happened to find the last block's data still intact (hence 2 rather than 0), but by the time the 2nd block was called, the reclaimed memory had been overwritten, resulting in a junk value? I'm still not clear on how calling a non-existent block produces a value at all??
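
One way to picture why the call still produces a value: a block is, roughly, a small struct holding a pointer to compiled code plus copies of the captured variables. The code itself never lives on the stack, so a dangling block pointer can still reach a valid function, which then reads whatever bytes now occupy the captured slot. The sketch below is a simplified model in plain C, not the real block layout; `FakeBlock`, `return_captured`, `fill_by_value`, and `call_block` are hypothetical names. It also shows a well-defined variant that stores each struct by value instead of keeping a pointer to it.

```c
#include <stddef.h>

/* Simplified, hypothetical model of a block that captures one int.
   The real layout has more fields; only the two relevant ones are kept. */
typedef struct FakeBlock {
    int (*invoke)(const struct FakeBlock *self); /* compiled code, never on the stack */
    int captured_i;                              /* copy of i taken at creation time */
} FakeBlock;

/* The "body" of the block: it only reads the captured copy. */
int return_captured(const FakeBlock *self) {
    return self->captured_i;
}

/* Well-defined variant: each array slot receives its own copy of the struct,
   so nothing refers to storage that has gone out of scope. */
void fill_by_value(FakeBlock out[], int n) {
    for (int i = 0; i < n; i++) {
        FakeBlock literal = { return_captured, i }; /* lives only for this iteration */
        out[i] = literal;                           /* copied out before it dies */
    }
}

/* Calling a block means calling its invoke pointer with the struct itself. */
int call_block(const FakeBlock *b) {
    return b->invoke(b);
}
```

With `FakeBlock b[3]; fill_by_value(b, 3);`, `call_block(&b[1])` returns 1, because each slot owns its own copy of the captured value. Storing `&literal` instead would reproduce the dangling-pointer problem from the question.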

4
donjuedo On

Yeah, that does make sense, but you really have to think about it. When b[0] is given its value, the "^{ return 0;}" is never used again. b[0] is just the address of it. The compiler kept overwriting those temp functions on the stack as it went along, so the "2" is just the last function written in that space. If you print those 3 addresses as they are created, I bet they are all the same.

On the other hand, if you unroll your assignment loop and add other references to "^{ return 0;}", such as assigning it to c[0], you'll likely see that b[0], b[1], and b[2] all differ:

b[0] = ^{ return 0;};
b[1] = ^{ return 1;};
b[2] = ^{ return 2;};
c[0] = ^{ return 0;};
c[1] = ^{ return 1;};
c[2] = ^{ return 2;};

Optimization settings could affect the outcome. By the way, I don't think bbum is saying the pop happens after the for loop completion -- it's happening after each iteration hits that closing brace (end of scope).
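
To check the address claim without touching a dead object, one option is to record each address as an integer while the variable is still alive. This is a minimal sketch in plain C that uses an ordinary loop-local `int` as a stand-in for the block; `record_addresses` and `print_addresses` are hypothetical helpers, and whether the recorded values match depends on the compiler and optimization level.

```c
#include <stdint.h>
#include <stdio.h>

/* Record the address of a loop-local variable on every iteration.
   The conversion to an integer happens while j is still alive, so no
   pointer to a dead object is ever stored or dereferenced. */
void record_addresses(uintptr_t out[], int n) {
    for (int i = 0; i < n; i++) {
        int j = i;                        /* automatic storage, scoped to this iteration */
        out[i] = (uintptr_t)(void *)&j;   /* keep the numeric address only */
    }
}

/* Print the recorded addresses so they can be compared by eye. */
void print_addresses(const uintptr_t addrs[], int n) {
    for (int i = 0; i < n; i++)
        printf("iteration %d: 0x%jx\n", i, (uintmax_t)addrs[i]);
}
```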

2
newacct On

You are completely misunderstanding what "on the stack" means.

There is no such thing as a "stack of variables". The "stack" refers to the "call stack", i.e. the stack of call frames. Each call frame stores the current state of the local variables of that function call. All the code in your example is inside a single function, hence there is only one call frame that is relevant here. The "stack" of call frames is not relevant.

The mention of "stack" means only that the block is allocated inside the call frame, like local variables. "On the stack" means it has a lifetime akin to that of local variables, i.e. "automatic storage duration": its lifetime is limited to the scope in which it was declared.

This means that the block is not valid after the end of the iteration of the for-loop in which it was created. And the pointer you have to the block now points to an invalid thing, and it is undefined behavior to dereference the pointer. Since the block's lifetime is over and the space it was using is unused, the compiler is free to use that place in the call frame for something else later.

You are lucky that the compiler decided to place a later block in the same place, so that when you try to access the location as a block, it produces a meaningful result. But this is really just undefined behavior. The compiler could, if it wanted, place an integer in part of that space and another variable in another part, and maybe a block in another part of that space, so that when you try to access that location as a block, it will do all sorts of bad things and maybe crash.

The lifetime of the block is exactly analogous to that of a local variable declared in the same scope. You can reproduce what's going on with a simpler example that uses a plain local variable:

int *b[3];
for (int i=0; i<3; i++) {
  int j = i;
  b[i] = &j;
}
for (int i=0; i<3; i++)
  printf("b %d\n", *b[i]);

prints (probably):

b 2
b 2
b 2

Here, as in the case with the block, you are also storing a pointer to something that is scoped inside the iteration of the loop, and using it after the loop. And again, just because you're lucky, the space for that variable happens to be allocated to the same variable from a later iteration of the loop, so it seems to give a meaningful result, even though it's just undefined behavior.

Now, if you're using ARC, you likely will not see the behavior your quoted text describes, because ARC requires that when something is stored in a variable of block-pointer type (and b[i] has block-pointer type), a copy is made instead of a retain, and the copy is what gets stored. When a stack block is copied, it is moved to the heap (i.e. it is dynamically allocated, has dynamic lifetime, and is memory managed like other objects), and the copy returns a pointer to the heap block, which you can safely use after the scope ends.
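
The copy-to-heap idea can be sketched with the plain-`int` analog from the earlier example. This is an illustration in C, not how the blocks runtime is implemented; `copy_to_heap`, `fill_with_heap_copies`, and `release_all` are hypothetical helpers. Without ARC, the real counterpart is to copy the block yourself with `Block_copy` (or `-copy`) and later balance it with `Block_release` (or `-release`).

```c
#include <stdlib.h>

/* Hypothetical helper: duplicate an int from automatic storage onto the heap,
   loosely analogous to copying a stack block. Returns NULL if allocation fails. */
int *copy_to_heap(const int *src) {
    int *dst = malloc(sizeof *dst);
    if (dst != NULL)
        *dst = *src;
    return dst;
}

/* Free the first n heap copies; the caller owns them after a successful fill. */
void release_all(int *ptrs[], int n) {
    for (int i = 0; i < n; i++) {
        free(ptrs[i]);
        ptrs[i] = NULL;
    }
}

/* Same loop shape as the broken example, but each slot stores a heap copy
   rather than the address of the loop-local variable. Returns 0 on success. */
int fill_with_heap_copies(int *out[], int n) {
    for (int i = 0; i < n; i++) {
        int j = i;                   /* dies at the end of this iteration */
        out[i] = copy_to_heap(&j);   /* the copy outlives the scope */
        if (out[i] == NULL) {
            release_all(out, i);     /* undo the copies made so far */
            return -1;
        }
    }
    return 0;
}
```

After `int *b[3]; fill_with_heap_copies(b, 3);` succeeds, `*b[0]`, `*b[1]`, and `*b[2]` are 0, 1, and 2, and `release_all(b, 3)` frees them.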