I am using:
- LibTorch 2.0.1
- OpenMP 201511 (4.5)
Cloning a torch::Tensor with many rows changes the maximum number of OpenMP threads reported by omp_get_max_threads():
torch::Tensor t1 = torch::empty({32950,300});
cout << "A. Max threads in omp: " << omp_get_max_threads() << endl;
torch::Tensor t2 = t1.clone();
cout << "B. Max threads in omp: " << omp_get_max_threads() << endl;
Result:
A. Max threads in omp: 8
B. Max threads in omp: 4
However, if the tensor is "small", the thread count does not change:
torch::Tensor t1 = torch::empty({20,20});
cout << "A. Max threads in omp: " << omp_get_max_threads() << endl;
torch::Tensor t2 = t1.clone();
cout << "B. Max threads in omp: " << omp_get_max_threads() << endl;
Result:
A. Max threads in omp: 8
B. Max threads in omp: 8
The same problem occurs when building a data loader with torch::data::make_data_loader(ds, options).
A temporary workaround that worked for me is to save the thread count beforehand and restore it afterwards:
int th = omp_get_max_threads();
torch::Tensor t1 = torch::empty({32950,300});
cout << "A. Max threads in omp: " << omp_get_max_threads() << endl;
torch::Tensor t2 = t1.clone();
cout << "B. Max threads in omp: " << omp_get_max_threads() << endl;
omp_set_num_threads(th);
cout << "C. Max threads in omp: " << omp_get_max_threads() << endl;
Result:
A. Max threads in omp: 8
B. Max threads in omp: 4
C. Max threads in omp: 8
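The same save-and-restore idea could be wrapped in a small scope guard so the restore step cannot be forgotten. This is only a sketch: ThreadCountGuard is a hypothetical helper I made up, not something provided by LibTorch or OpenMP, and it takes the getter and setter as parameters so it does not depend on either library.

```cpp
#include <functional>
#include <utility>

// Hypothetical helper (not part of LibTorch or OpenMP): stores the value
// returned by `get` on construction and passes it back to `set` on destruction.
class ThreadCountGuard {
public:
    ThreadCountGuard(std::function<int()> get, std::function<void(int)> set)
        : saved_(get()), restore_(std::move(set)) {}

    // Restore the saved value when the guard goes out of scope.
    ~ThreadCountGuard() { restore_(saved_); }

    // Non-copyable, so the value is restored exactly once.
    ThreadCountGuard(const ThreadCountGuard&) = delete;
    ThreadCountGuard& operator=(const ThreadCountGuard&) = delete;

    // The value captured at construction time.
    int saved() const { return saved_; }

private:
    int saved_;
    std::function<void(int)> restore_;
};

// Intended use with OpenMP (needs <omp.h> and -fopenmp):
//
//   {
//       ThreadCountGuard guard(omp_get_max_threads, omp_set_num_threads);
//       torch::Tensor t2 = t1.clone();  // may lower the OpenMP thread count
//   }  // guard restores the saved thread count here
```

This only tidies up the workaround. It does not explain why the thread count changes in the first place.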
However, restoring the thread count by hand after every such call is clearly not a good approach. I looked for documentation on this behavior but found nothing. Any suggestions?