TensorFlow 2.x: call model inference via the C/C++ API with inputs already allocated in GPU memory


I'd like to use TensorFlow 2.13 from scientific simulation code written in C++, via the C API. The code runs the simulation on the GPU, so all necessary input data for the model already reside on the GPU as well.

I need to prepare the input for the model, which consists of multiple TF_Tensors.
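For context, here is a sketch of the kind of host-side staging I am trying to avoid (not my real code; `d_field` and `num_elems` are placeholders for my simulation data): the field is copied from the GPU into a staging buffer and then wrapped with TF_NewTensor, which costs an extra GPU-to-CPU transfer before TensorFlow presumably copies the data back to the GPU.

```cpp
// Sketch of the host-side staging path I want to avoid.
#include <cstdint>
#include <cstdlib>
#include <cuda_runtime.h>
#include <tensorflow/c/c_api.h>

static void free_host_buffer(void* data, size_t /*len*/, void* /*arg*/) {
  std::free(data);
}

TF_Tensor* make_input_via_host_copy(const float* d_field, int64_t num_elems) {
  const size_t nbytes = num_elems * sizeof(float);
  float* h_staging = static_cast<float*>(std::malloc(nbytes));

  // Extra GPU -> CPU transfer just to build the TF_Tensor on the host.
  cudaMemcpy(h_staging, d_field, nbytes, cudaMemcpyDeviceToHost);

  const int64_t dims[1] = {num_elems};
  return TF_NewTensor(TF_FLOAT, dims, 1, h_staging, nbytes,
                      free_host_buffer, /*deallocator_arg=*/nullptr);
}
```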

My question is: is it possible to control where a TF_Tensor is placed? Can I make it point to an existing on-GPU array to avoid a CPU-to-GPU memory transfer?
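In other words, I would like to do something like the sketch below, where the existing device pointer is handed to TF_NewTensor directly with a no-op deallocator. It is not clear to me whether the C API treats this data pointer as host memory, which is exactly my question.

```cpp
// What I would *like* to do (sketch only): wrap the existing device buffer.
#include <cstdint>
#include <tensorflow/c/c_api.h>

static void no_op_deallocator(void* /*data*/, size_t /*len*/, void* /*arg*/) {
  // The simulation owns the buffer; TensorFlow must not free it.
}

TF_Tensor* wrap_device_buffer(float* d_field /* already on the GPU */,
                              int64_t num_elems) {
  const int64_t dims[1] = {num_elems};
  return TF_NewTensor(TF_FLOAT, dims, 1, d_field,
                      num_elems * sizeof(float),
                      no_op_deallocator, /*deallocator_arg=*/nullptr);
}
```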

If wrapping a TF tensor around existing data is not possible, would it at least be possible to copy the memory within the GPU?
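What I have in mind is a plain device-to-device copy, roughly like this sketch; the `gpu_tensor` argument is assumed to already have its buffer resident on the GPU, which is the part I do not know how to arrange from the C API.

```cpp
// Fallback idea (sketch): device-to-device copy into a GPU-resident tensor.
#include <cuda_runtime.h>
#include <tensorflow/c/c_api.h>

void fill_from_simulation(TF_Tensor* gpu_tensor, const float* d_field) {
  void* dst = TF_TensorData(gpu_tensor);          // assumed to be a device pointer
  const size_t nbytes = TF_TensorByteSize(gpu_tensor);
  cudaMemcpy(dst, d_field, nbytes, cudaMemcpyDeviceToDevice);
}
```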

I found the TF_AllocatorAttributes struct, which contains a placement flag and is used, for example, by TF_AllocateTemp. However, that function also requires a TF_OpKernelContext* ctx, and it is not clear to me where to obtain one, or whether it is safe to use this at all from the C API.
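From reading tensorflow/c/kernels.h, the only place I see a TF_OpKernelContext appear is inside the compute callback of a custom kernel registered through the kernels C API, roughly like the sketch below (the op name "MySimOp" and the empty create/delete callbacks are placeholders). If that is really the only way, it would mean wrapping my inference call into a custom op, which seems heavy for this use case.

```cpp
// Sketch: TF_AllocateTemp is only reachable from a kernel compute callback,
// where TensorFlow hands me the TF_OpKernelContext.
#include <cstdint>
#include <tensorflow/c/kernels.h>
#include <tensorflow/c/tf_status.h>
#include <tensorflow/c/tf_tensor.h>

static void* MySimOp_Create(TF_OpKernelConstruction* /*ctx*/) { return nullptr; }
static void MySimOp_Delete(void* /*kernel*/) {}

static void MySimOp_Compute(void* /*kernel*/, TF_OpKernelContext* ctx) {
  TF_Status* status = TF_NewStatus();

  TF_AllocatorAttributes attrs;
  attrs.struct_size = TF_ALLOCATOR_ATTRIBUTES_STRUCT_SIZE;
  attrs.on_host = 0;  // request device (GPU) memory rather than host memory

  const int64_t dims[1] = {1024};
  TF_Tensor* tmp = TF_AllocateTemp(ctx, TF_FLOAT, dims, 1, &attrs, status);

  // ... fill and use tmp here ...
  TF_DeleteTensor(tmp);
  TF_DeleteStatus(status);
}

void RegisterMySimOpKernel() {
  TF_Status* status = TF_NewStatus();
  TF_KernelBuilder* builder = TF_NewKernelBuilder(
      "MySimOp", "GPU", &MySimOp_Create, &MySimOp_Compute, &MySimOp_Delete);
  TF_RegisterKernelBuilder("MySimOp", builder, status);
  TF_DeleteStatus(status);
}
```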

