How does a uint32_t pointer work in this code?


I'm really confused by how uint32_t pointers work in C++.

I was just fiddling around trying to learn TEA, and I didn't understand why they pass a uint32_t* parameter to the encrypt function and then, inside the function, declare uint32_t variables and initialize them from the parameter as if the parameter were an array.

Like this:

void encrypt (uint32_t* v, uint32_t* k) {
    uint32_t v0=v[0], v1=v[1], sum=0, i;

So I decided to play around with uint32_t pointers, and wrote this short code:

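#include <cstdint>
#include <iostream>
using namespace std;

int main ()
{
    uint32_t *plain_text;
    uint32_t key;
    unsigned int temp = 123232;
    plain_text = &temp;   // compiles where uint32_t is a typedef for unsigned int
    key = 7744;

    cout << plain_text[1] << endl;   // index 1 of a pointer to a single variable

    return 0;
}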

And it blew my mind when the output was the value of "key". I have no idea how it works... and then when I tried with plain_text[0], it came back with the value of "temp".

So I'm stuck as hell trying to understand what's happening.

Looking back at the TEA code, is the uint32_t* v pointing to an array rather than a single unsigned int? And was what I did just a fluke?


There are 4 answers

5
elnineo On
unsigned int temp = 123232; // plain_text points here; temp occupies four bytes

uint32_t key; // in this build, key happened to be placed in the four bytes right after temp

Thus &plain_text[0] is plain_text (that is, &temp), and &plain_text[1] refers to the next four bytes, which here are at &key.

This scenario may explain that behaviour.

8
user657267 On

Formally your program has undefined behavior.

The expression plain_text[1] is equivalent to *(plain_text + 1) ([expr.sub] / 1). Although you can point to one past the end of an array (objects that aren't arrays are still considered single-element arrays for the purposes of pointer arithmetic ([expr.unary.op] / 3)), you cannot dereference this address ([expr.unary.op] / 1).

At this point the compiler can do whatever it wants. In this case it has simply treated the expression as if the pointer pointed into an array, so plain_text + 1, i.e. &temp + 1, points to the next uint32_t-sized slot on the stack, which by coincidence is key.

You can see what's going on if you look at the assembly:

mov DWORD PTR -16[rbp], 123232 ; unsigned int temp=123232;
lea rax, -16[rbp]
mov QWORD PTR -8[rbp], rax     ; plain_text=&temp;
mov DWORD PTR -12[rbp], 7744   ; key=7744;
mov rax, QWORD PTR -8[rbp]
add rax, 4                     ; plain_text[1], i.e. -16[rbp] + 4 == -12[rbp] == key
mov eax, DWORD PTR [rax]
mov edx, eax
mov rcx, QWORD PTR .refptr._ZSt4cout[rip]
call    _ZNSolsEj              ; std::ostream::operator<<(unsigned int)
mov rdx, QWORD PTR .refptr._ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_[rip]
mov rcx, rax
call    _ZNSolsEPFRSoS_E       ; std::ostream::operator<<(std::ostream& (*)(std::ostream&))
mov eax, 0
add rsp, 48
pop rbp
ret
8
M.M On

uint32_t is a type. It means unsigned 32-bit integer. On your system it is probably a typedef name for unsigned int.

There's nothing special about a pointer to this particular type; you can have pointers to any type.

The [] in C and C++ is actually pointer indexing notation. p[0] retrieves the value at the location the pointer points to. p[1] gets the value at the next location after that, one whole element further on rather than one byte. Then p[2] is the next location after that, and so on.

You can use this notation with arrays too because the name of an array is converted to a pointer to its first element when used like this.

So, your code plain_text[1] tries to read the next element after temp. Since temp is not actually an array, this causes undefined behaviour. In your particular case, the manifestation of this undefined behaviour is that it managed to read the memory just after temp without crashing, and that address happened to be where key is stored.

0
kfsone On

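In C and C++ arrays decay to pointers in most expressions, which is why array indexing and pointer indexing share the same syntax. (Arrays and pointers are still different types.)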

a[1]

when a is a pointer to a simple type (or an array that has decayed to one) is equivalent to

*(a + 1)

If a is an array of simple types, a will decay to the address of element 0 as soon as it is used in such an expression.

#include <iostream>

int main() {
    int arr[5] = { 0, 1, 2, 3, 4 };
    int i = 10;

    int* ptr;

    ptr = arr; // arr decays to &arr[0]
    std::cout << *ptr << "\n"; // outputs 0
    ptr = &arr[0]; // same address
    std::cout << *ptr << "\n"; // outputs 0
    std::cout << ptr[4] << "\n"; // outputs 4
    std::cout << *(ptr + 4) << "\n"; // outputs 4
    ptr = &i;
    std::cout << *ptr << "\n"; // outputs 10
    std::cout << ptr[0] << "\n"; // outputs 10, same as *ptr
    std::cout << ptr[1] << "\n"; // UNDEFINED BEHAVIOR.
    std::cout << *(ptr + 1) << "\n"; // UNDEFINED BEHAVIOR.
}

To understand ptr[0] and ptr[1] you simply have to understand pointer arithmetic.