I’d like to use TensorFlow 2.13 from scientific simulation code written in C++, via the C API. The code runs the simulation on the GPU, so all of the input data the model needs already resides on the GPU.
I need to prepare the model input, which consists of multiple TF_Tensors.
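For context, this is roughly how I build an input tensor today (a minimal sketch; `MakeHostInputTensor` and the `host_input` buffer are just placeholders). As far as I understand, TF copies such host buffers to the GPU internally when the session runs, and that is the transfer I would like to avoid:

```cpp
#include <cstdint>
#include "tensorflow/c/c_api.h"

// No-op deallocator: the simulation owns the buffer's lifetime.
static void NoOpDeallocator(void* data, size_t len, void* arg) {}

TF_Tensor* MakeHostInputTensor(float* host_input, int64_t rows, int64_t cols) {
  const int64_t dims[2] = {rows, cols};
  const size_t num_bytes = rows * cols * sizeof(float);
  // Wraps a *host* buffer; TF_SessionRun later copies it to the device.
  return TF_NewTensor(TF_FLOAT, dims, 2, host_input, num_bytes,
                      &NoOpDeallocator, nullptr);
}
```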
My question is: is it possible to control where a TF_Tensor is placed? Can I make it point to an existing on-GPU array and thereby avoid a CPU-to-GPU memory transfer?
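Conceptually, what I would like to write is something like the sketch below, where `d_input` is a device pointer returned by `cudaMalloc` and already filled by the simulation. I suspect this is not valid, because nothing tells TF that the pointer lives on the GPU:

```cpp
#include <cstdint>
#include <cuda_runtime.h>
#include "tensorflow/c/c_api.h"

static void NoOpDeallocator(void* data, size_t len, void* arg) {}

// d_input was allocated with cudaMalloc and filled by the simulation kernels.
TF_Tensor* WrapDevicePointer(float* d_input, int64_t n) {
  const int64_t dims[1] = {n};
  // Probably NOT legal: TF_NewTensor has no notion of device placement,
  // so TF will presumably treat this address as host memory.
  return TF_NewTensor(TF_FLOAT, dims, 1, d_input, n * sizeof(float),
                      &NoOpDeallocator, nullptr);
}
```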
If wrapping a TF_Tensor around existing device data is not possible, would it at least be possible to copy the memory within the GPU (device to device)?
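The fallback I have in mind looks like this, but it only works if I can somehow obtain a TF_Tensor whose backing buffer actually lives on the GPU, which is exactly the part I don't know how to do through the C API (`gpu_tensor` below is hypothetical):

```cpp
#include <cuda_runtime.h>
#include "tensorflow/c/c_api.h"

// d_input: device pointer filled by the simulation.
// gpu_tensor: hypothetical TF_Tensor whose data resides on the same GPU.
cudaError_t CopyIntoGpuTensor(TF_Tensor* gpu_tensor, const float* d_input,
                              size_t num_bytes) {
  // Device-to-device copy, no round trip through host memory.
  return cudaMemcpy(TF_TensorData(gpu_tensor), d_input, num_bytes,
                    cudaMemcpyDeviceToDevice);
}
```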
I found the TF_AllocatorAttributes struct, which contains a placement flag and is used, for example, by TF_AllocateTemp,
but that function also requires a TF_OpKernelContext* ctx. Unfortunately, it is not clear to me where to obtain one, or whether it is safe to use it at all from the C API.
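For reference, these are the declarations I am looking at (paraphrased from `tensorflow/c/tf_tensor.h` and `tensorflow/c/kernels.h` as I read them; the comments reflect my understanding and may be wrong):

```cpp
// Paraphrased from tensorflow/c/tf_tensor.h (TF 2.13):
typedef struct TF_AllocatorAttributes {
  size_t struct_size;
  // Placement flag: as I understand it, 1 requests host (CPU) memory,
  // 0 requests device memory -- I may be misreading this.
  unsigned char on_host;
} TF_AllocatorAttributes;

// Paraphrased from tensorflow/c/kernels.h (TF 2.13):
TF_Tensor* TF_AllocateTemp(TF_OpKernelContext* context, TF_DataType dtype,
                           const int64_t* dims, int num_dims,
                           TF_AllocatorAttributes* alloc_attrs,
                           TF_Status* status);
```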