I'm not too deeply rooted in the very formal side of static code analysis, hence this question.
A couple of years ago I read that distinguishing code from data using static code analysis is equivalent to the Halting Problem. (Citation needed, but I don't have it anymore. Stackoverflow has threads on this here or here.) This seemed to make sense, at least for common computer architectures based on the von Neumann architecture, where code and data share the same memory.
Now I'm looking at the static analysis of C/C++ code and pointer analysis, where the program is not executed. I have a feeling that statically tracking all creations and uses of pointer values is similar to the Halting Problem, because I cannot determine whether a given value in memory is a pointer value, i.e. I cannot track the value-flow of pointer values through memory. Alias analysis may narrow down the problem, but it seems to become less useful in the face of multi-threaded code.
(One might even consider tracking arbitrary values, not just pointers: constructing a complete value-flow for any given "interesting" value seems equivalent to the Halting Problem.)
As this is just a hunch, my question is: are there more formal findings on this that I can refer to? Am I mistaken?
It's almost certainly equivalent, modulo the fact that C is not a Turing-equivalent language (a given C implementation is a gigantic finite state machine rather than a Turing machine, due to the Representation of Types). Pointers need not be kept in their original representations in objects whose effective type is a pointer type; you can examine the representation and perform arbitrary operations on it, for example encrypting pointers and decrypting them later. Determining whether an arbitrary computation is reversible, or whether two computations are inverses of one another, is (offhand) probably equivalent to determining halting.
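To make the "encrypting pointers" point concrete, here is a minimal sketch in C. The names `KEY`, `hide_ptr`, `reveal_ptr` and `demo` are made up for illustration, and the XOR mask merely stands in for an arbitrary reversible transformation. It assumes the implementation provides `uintptr_t` (an optional type) and that the integer-to-pointer conversion round-trips, which the standard guarantees only for a value obtained from a valid `void *`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical key; any value works as long as both functions agree on it. */
static const uintptr_t KEY = (uintptr_t)0x5A5A5A5Au;

/* Keep only a masked integer: it is not the address of any object, so an
   analysis that inspects stored values has nothing to recognise as a pointer. */
static uintptr_t hide_ptr(int *p)
{
    return (uintptr_t)(void *)p ^ KEY;
}

/* Undo the mask and convert back to a usable pointer. */
static int *reveal_ptr(uintptr_t masked)
{
    return (int *)(void *)(masked ^ KEY);
}

/* Writes to x through a pointer that existed only in masked form in between. */
int demo(void)
{
    int x = 1;
    uintptr_t masked = hide_ptr(&x); /* no int * is stored from here ... */
    int *q = reveal_ptr(masked);     /* ... until here */
    assert(q == &x);
    *q = 42;
    return x;
}
```

With a mask this simple, an analyzer could of course pattern-match the XOR pair. The argument above is about the general case, where `hide_ptr` and `reveal_ptr` are arbitrary computations and the analyzer would have to prove that one inverts the other.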