I have a trained tensorflow.keras model. I'm loading the model and doing inference from my C code on the CPU, on Ubuntu 18.04. For performance reasons, I'm comparing different builds of TensorFlow.
The first build is the precompiled version that I downloaded from https://www.tensorflow.org/install/lang_c:
https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-cpu-linux-x86_64-2.4.0.tar.gz
Then I built TensorFlow 2.4 from source, following the installation procedure here. I used the defaults in ./configure, then ran:
bazel build --config=opt //tensorflow/tools/lib_package:libtensorflow
Then I untarred the generated file below and did the necessary exports (I didn't untar into /usr/local):
~/tensorflow/bazel-bin/tensorflow/tools/lib_package/libtensorflow.tar.gz
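Concretely, the untar-and-export step I do looks roughly like this (LIBDIR is just where I chose to extract the tarball; adjust to your layout):

```shell
# Where I extracted the tarball (an arbitrary choice, not /usr/local):
LIBDIR=$HOME/libtensorflow-src
mkdir -p "$LIBDIR"
tar -xzf ~/tensorflow/bazel-bin/tensorflow/tools/lib_package/libtensorflow.tar.gz -C "$LIBDIR"

# Make the headers and shared objects visible to the compiler and the loader:
export C_INCLUDE_PATH="$LIBDIR/include:$C_INCLUDE_PATH"
export LIBRARY_PATH="$LIBDIR/lib:$LIBRARY_PATH"
export LD_LIBRARY_PATH="$LIBDIR/lib:$LD_LIBRARY_PATH"
```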
Lastly, I rebuilt TensorFlow 2.4 from source, again using the defaults in ./configure, and ran:
bazel build --config=mkl --config=noaws --config=nogcp --config=nohdfs -c opt --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-mfpmath=both --copt=-msse4.1 --copt=-msse4.2 //tensorflow/tools/lib_package:libtensorflow
From what I read, this command should build TensorFlow with Intel MKL support and with the AVX, AVX2, FMA, SSE4.1 and SSE4.2 instruction sets. I checked that my CPU supports those instructions. Then I untarred the generated file and did the exports as before.
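For reference, this is roughly how I verified the CPU support (flag names follow the /proc/cpuinfo convention, so sse4.1 appears as sse4_1):

```shell
# Check that the CPU advertises each instruction set named in the build flags.
for flag in avx avx2 fma sse4_1 sse4_2; do
  if grep -qw "$flag" /proc/cpuinfo; then
    echo "$flag: supported"
  else
    echo "$flag: missing"
  fi
done
```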
The performance results show that the precompiled library is nearly twice as fast as the ones I built from source. What am I doing wrong here? I couldn't find anything else to try beyond
--config=mkl -c opt --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-mfpmath=both --copt=-msse4.1 --copt=-msse4.2
Building with these options made no measurable difference in performance. Is there a way to learn which flags the precompiled versions were built with, and how can I reproduce them?
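The only probe I could come up with is to disassemble a library and count instructions belonging to each extension. This only tells me whether an extension is actually used, not the exact build flags, and the path and instruction choices below are just examples (vfmadd231ps is an FMA instruction, vpermd is AVX2):

```shell
# Hypothetical location of the extracted prebuilt library; adjust as needed.
SO=$HOME/libtensorflow-prebuilt/lib/libtensorflow.so.2

# A nonzero count suggests the library contains code compiled for that extension.
for insn in vfmadd231ps vpermd; do
  n=$(objdump -d "$SO" 2>/dev/null | grep -cw "$insn" || true)
  echo "$insn: $n"
done
```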
Thanks,