Why can't stackalloc be used with reference types?

If stackalloc is used with reference types as below

var arr = stackalloc string[100];

the compiler reports this error:

Cannot take the address of, get the size of, or declare a pointer to a managed type ('string')

Why is that? Why can't the CLR declare a pointer to a managed type?

There are 4 answers

5
xanatos On BEST ANSWER

The "problem" is bigger: in C# you can't have a pointer to a managed type. If you try writing (in C#):

string *pstr;

you'll get:

Cannot take the address of, get the size of, or declare a pointer to a managed type ('string')

Now, stackalloc T[num] returns a T*, so clearly stackalloc can't be used with reference types.

The reason why you can't have a pointer to a reference type is probably connected to the fact that the GC can freely move objects of reference types around in memory (to compact the heap), so a pointer to one could become invalid at any time.

Note that in C++/CLI it is possible to pin a reference type and take its address (see pin_ptr).
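
To make the contrast concrete, here is a minimal sketch, assuming the project is compiled with unsafe code enabled (AllowUnsafeBlocks). The class and method names (PointerDemo, SumOnStack, FirstChar) are made up for illustration. It shows that stackalloc works for an unmanaged element type such as int, and that C# has its own pinning construct, the fixed statement, which yields a char* into a string only for the duration of the block.

```csharp
using System;

static class PointerDemo
{
    // int is an unmanaged type, so int* and stackalloc int[...] are legal.
    public static unsafe int SumOnStack(int count)
    {
        int* buffer = stackalloc int[count];
        for (int i = 0; i < count; i++)
            buffer[i] = i;

        int sum = 0;
        for (int i = 0; i < count; i++)
            sum += buffer[i];
        return sum;
    }

    // The fixed statement pins the string so the GC cannot move it
    // while the pointer is in use; the pointer is invalid after the block.
    public static unsafe char FirstChar(string s)
    {
        fixed (char* p = s)
        {
            return *p;
        }
    }

    static void Main()
    {
        Console.WriteLine(SumOnStack(5));   // 0+1+2+3+4 = 10
        Console.WriteLine(FirstChar("hi")); // h
    }
}
```

Note that even here you never get a string*: fixed gives a pointer to the characters inside the string, not to the reference itself.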

2
Hans Passant On

The Just-In-Time compiler in .NET performs two important duties when converting MSIL as generated by the C# compiler to executable machine code. The obvious and visible one is generating the machine code. The un-obvious and completely invisible job is generating a table that tells the garbage collector where to look for object references when a GC occurs while the method is executing.

This is necessary because object roots are not stored only in the GC heap, as a field of a class, but also in local variables or CPU registers. To do this job properly, the jitter needs to know the exact structure of the stack frame and the types of the variables stored there so it can create that table properly. Later, the garbage collector can then figure out how to read the proper stack frame offset or CPU register to obtain the object root value, which is a pointer into the GC heap.

That is a problem when you use stackalloc. That syntax takes advantage of a CLR feature that allows a program to declare a custom value type. A back-door around normal managed type declarations, with the restriction that this value type cannot contain any fields. Just a blob of memory, it is up to the program to generate the proper offsets into that blob. The C# compiler helps you generate those offsets, based on the type declaration and the index expression.

Also very common in a C++/CLI program, that same custom value type feature can provide the storage for a native C++ object. Only space for the storage of that object is required, how to properly initialize it and access the members of that C++ object is a job that the C++ compiler figures out. Nothing that the GC needs to know about.

So the core restriction is that there is no way to provide type info for this blob of memory. As far as the CLR is concerned these are just plain bytes with no structure; the table that the GC uses has no way to describe its internal structure.

Inevitably, the only kind of type you can use is the kind that does not require an object reference that the GC needs to know about. Blittable value types or pointers. So System.String is a no-go, it is a reference type. The closest you could possibly get that is "stringy" is:

  char** mem = stackalloc char*[100];

With the further restriction that it is entirely up to you to ensure that the char* elements point to either a pinned or unmanaged string. And that you don't index the "array" out of bounds. This is not very practical.
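
A rough sketch of what that looks like in practice, assuming unsafe code is enabled and that neither string is null. The names PinnedStringsDemo, Length and TotalLength are hypothetical. The strings are pinned with fixed, their character pointers are stored in a stack-allocated char*[2], and the pointers are only used while the pins are held.

```csharp
using System;

static class PinnedStringsDemo
{
    // Counts chars up to the '\0' terminator that follows a string's
    // character data when it is pinned with fixed.
    static unsafe int Length(char* p)
    {
        int n = 0;
        while (p[n] != '\0') n++;
        return n;
    }

    public static unsafe int TotalLength(string a, string b)
    {
        // Both strings stay pinned until the end of the fixed block.
        fixed (char* pa = a)
        fixed (char* pb = b)
        {
            // char* is a pointer type, so the GC does not need to track it.
            char** mem = stackalloc char*[2];
            mem[0] = pa;
            mem[1] = pb;
            return Length(mem[0]) + Length(mem[1]);
        }
    }

    static void Main()
    {
        Console.WriteLine(TotalLength("ab", "cde")); // 5
    }
}
```

Nothing stops you from reading mem[2] or from using mem[0] after the fixed block ends, which is exactly why this is not very practical.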

2
David Haim On

Because C# relies on garbage collection for memory safety, as opposed to C++, where you are expected to know the nuances of memory management.

For example, take a look at the following code:

public static void doAsync(){
    // Hypothetical: this line does not compile, since string is a managed type.
    var arr = stackalloc string[100];
    arr[0] = "hi";
    System.Threading.ThreadPool.QueueUserWorkItem(_ => {
        System.Threading.Thread.Sleep(10000);
        Console.Write(arr[0]);
    });
}

The program will easily crash. Because arr is stack allocated, the array and its memory disappear as soon as doAsync returns. The lambda still points to this no-longer-valid memory address, which is an invalid state.

If you pass local primitives by reference, the same problem will occur.

The scheme is:
static objects -> live for the lifetime of the application
local objects -> live as long as the scope that created them is valid
heap-allocated objects (created with new) -> exist as long as someone holds a reference to them.

Another problem is that garbage collection runs periodically. When an object is local, it should be finalized as soon as the function is over, because after that point the memory will be overwritten by other variables. The GC can't be forced to finalize the object, or shouldn't be, anyway.

The good thing, though, is that the C# JIT can sometimes (not always) determine that an object can safely be allocated on the stack, and will resort to stack allocation if possible (again, sometimes).

In C++, on the other hand, you can declare anything anywhere. This comes with less safety than C# or Java, but you can fine-tune your application and achieve high performance with low resource usage.

10
Matthew Watson On

I think Xanatos posted the correct answer.

Anyway, this isn't an answer, but instead a counterexample to another answer.

Consider the following code:

using System;
using System.Threading;

namespace Demo
{
    class Program
    {
        static void Main(string[] args)
        {
            doAsync();
            Thread.Sleep(2000);
            Console.WriteLine("Did we finish?"); // Likely this is never displayed.
        }

        public static unsafe void doAsync()
        {
            int n = 10000;
            int* arr = stackalloc int[n];
            ThreadPool.QueueUserWorkItem(x => {
                Thread.Sleep(1000);

                for (int i = 0; i < n; ++i)
                    arr[i] = 0;
            });
        }
    }
}

If you run that code, it will crash because the stack array is being written to after the stack memory for it has been freed.

This shows that the reason that stackalloc cannot be used with reference types isn't simply to prevent this kind of error.