I'm trying to follow this tutorial to build an XLA AOT example (with some pieces taken from this). I've been able to build TensorFlow from source and get XLA JIT working on the small mnist_softmax_xla.py example.
The steps I've done so far are:
1)
#from tensorflow/tensorflow/compiler/aot/tests
python3 ./make_test_graphs.py --out_dir=./
2) I also had to change line 21 of /home/m2angus/tensorflow/third_party/llvm/llvm.BUILD to:
package(default_visibility = ["//visibility:public"])
This was needed to work around visibility errors from Bazel.
3)
bazel build --config=opt --config=cuda --verbose_failures --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" //tensorflow/compiler/aot/tests:my_binary
With the following files:
tensorflow/tensorflow/compiler/aot/tests/my_code.cc
#define EIGEN_USE_THREADS
#define EIGEN_USE_CUSTOM_THREAD_POOL

#include <iostream>

#include "third_party/eigen3/unsupported/Eigen/CXX11/Tensor"
#include "tensorflow/compiler/aot/tests/test_graph_tfmatmul.h"  // generated

int main(int argc, char** argv) {
  Eigen::ThreadPool tp(2);  // Size the thread pool as appropriate.
  Eigen::ThreadPoolDevice device(&tp, tp.NumThreads());

  foo::bar::MatMulComp matmul;
  matmul.set_thread_pool(&device);

  // Set up args and run the computation.
  const float args[12] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12};
  std::copy(args + 0, args + 6, matmul.arg0_data());
  std::copy(args + 6, args + 12, matmul.arg1_data());
  matmul.Run();

  // Check result
  if (matmul.result0(0, 0) == 58) {
    std::cout << "Success" << std::endl;
  } else {
    std::cout << "Failed. Expected value 58 at 0,0. Got:"
              << matmul.result0(0, 0) << std::endl;
  }

  return 0;
}
tensorflow/tensorflow/compiler/aot/tests/BUILD
# Example of linking your binary
# Also see //third_party/tensorflow/compiler/aot/tests/BUILD
load("//tensorflow/compiler/aot:tfcompile.bzl", "tf_library")
# The same tf_library call from step 2 above.
tf_library(
    name = "test_graph_tfmatmul",
    cpp_class = "foo::bar::MatMulComp",
    graph = "test_graph_tfmatmul.pb",
    config = "test_graph_tfmatmul.config.pbtxt",
)

# The executable code generated by tf_library can then be linked into your code.
cc_binary(
    name = "my_binary",
    srcs = [
        "my_code.cc",  # include test_graph_tfmatmul.h to access the generated header
    ],
    deps = [
        ":test_graph_tfmatmul",  # link in the generated object file
        "//tensorflow/compiler/tf2xla",
        "//tensorflow/compiler/tf2xla:common",
        "//tensorflow/compiler/tf2xla:tf2xla_proto",
        "//tensorflow/compiler/tf2xla:tf2xla_util",
        "//tensorflow/compiler/tf2xla:xla_compiled_cpu_function",
        "//tensorflow/compiler/tf2xla:xla_compiler",
        "//tensorflow/compiler/tf2xla/kernels:xla_cpu_only_ops",
        "//tensorflow/compiler/tf2xla/kernels:xla_ops",
        "//tensorflow/compiler/xla:shape_util",
        "//tensorflow/compiler/xla:statusor",
        "//tensorflow/compiler/xla:util",
        "//tensorflow/compiler/xla:xla_data_proto",
        "//tensorflow/compiler/xla/client:client_library",
        "//tensorflow/compiler/xla/client:compile_only_client",
        "//tensorflow/compiler/xla/service:compiler",
        "//tensorflow/compiler/xla/service/cpu:cpu_compiler",
        "//tensorflow/core:core_cpu",
        "//tensorflow/core:core_cpu_internal",
        "//tensorflow/core:framework",
        "//tensorflow/core:framework_internal",
        "//tensorflow/core:lib",
        "//tensorflow/core:protos_all_cc",
        "//third_party/eigen3",
    ],
    linkopts = [
        "-lpthread",
    ],
)
The error output is huge, so I'll just put a snippet of it:
Loading:
Loading: 0 packages loaded
INFO: Analysed target //tensorflow/compiler/aot/tests:my_binary (0 packages loaded).
INFO: Found 1 target...
[0 / 2] BazelWorkspaceStatusAction stable-status.txt
[1 / 2] Linking tensorflow/compiler/aot/tests/my_binary; 1s local
ERROR: /home/m2angus/tensorflow/tensorflow/compiler/aot/tests/BUILD:14:1: Linking of rule '//tensorflow/compiler/aot/tests:my_binary' failed (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command
(cd /home/m2angus/.cache/bazel/_bazel_m2angus/5e7d70ea4881ca91d8032ed9fd943ff8/execroot/org_tensorflow && \
exec env - \
CUDA_TOOLKIT_PATH=/usr/local/cuda \
CUDNN_INSTALL_PATH=/usr/local/cuda-8.0 \
GCC_HOST_COMPILER_PATH=/usr/bin/gcc \
PWD=/proc/self/cwd \
PYTHON_BIN_PATH=/usr/bin/python3 \
PYTHON_LIB_PATH=/usr/local/lib/python3.5/dist-packages \
TF_CUDA_CLANG=0 \
TF_CUDA_COMPUTE_CAPABILITIES=6.1 \
TF_CUDA_VERSION=8.0 \
TF_CUDNN_VERSION=6 \
TF_NEED_CUDA=1 \
TF_NEED_OPENCL=0 \
external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -o bazel-out/k8-py3-opt/bin/tensorflow/compiler/aot/tests/my_binary '-Wl,-rpath,$ORIGIN/../../../../_solib_local/_U@local_Uconfig_Ucuda_S_Scuda_Ccudart___Uexternal_Slocal_Uconfig_Ucuda_Scuda_Scuda_Slib' -Lbazel-out/k8-py3-opt/bin/_solib_local/_U@local_Uconfig_Ucuda_S_Scuda_Ccudart___Uexternal_Slocal_Uconfig_Ucuda_Scuda_Scuda_Slib -pthread -Wl,-rpath,../local_config_cuda/cuda/lib64 -Wl,-rpath,../local_config_cuda/cuda/extras/CUPTI/lib64 -Wl,-no-as-needed -B/usr/bin/ -pie -Wl,-z,relro,-z,now -no-canonical-prefixes -pass-exit-codes '-Wl,--build-id=md5' '-Wl,--hash-style=gnu' -Wl,--gc-sections -Wl,@bazel-out/k8-py3-opt/bin/tensorflow/compiler/aot/tests/my_binary-2.params)
bazel-out/k8-py3-opt/bin/tensorflow/compiler/tf2xla/kernels/libgather_op_kernel_float_int32.lo(gather_op_kernel_float_int32.o): In function `gather_float_int32_xla_impl':
gather_op_kernel_float_int32.cc:(.text.gather_float_int32_xla_impl+0x0): multiple definition of `gather_float_int32_xla_impl'
bazel-out/k8-py3-opt/bin/external/org_tensorflow/tensorflow/compiler/tf2xla/kernels/libgather_op_kernel_float_int32.lo(gather_op_kernel_float_int32.o):gather_op_kernel_float_int32.cc:(.text.gather_float_int32_xla_impl+0x0): first defined here
bazel-out/k8-py3-opt/bin/tensorflow/compiler/tf2xla/kernels/libgather_op_kernel_float_int64.lo(gather_op_kernel_float_int64.o): In function `gather_float_int64_xla_impl':
gather_op_kernel_float_int64.cc:(.text.gather_float_int64_xla_impl+0x0): multiple definition of `gather_float_int64_xla_impl'
bazel-out/k8-py3-opt/bin/external/org_tensorflow/tensorflow/compiler/tf2xla/kernels/libgather_op_kernel_float_int64.lo(gather_op_kernel_float_int64.o):gather_op_kernel_float_int64.cc:(.text.gather_float_int64_xla_impl+0x0): first defined here
bazel-out/k8-py3-opt/bin/tensorflow/compiler/tf2xla/kernels/libindex_ops_kernel_argmax_float_1d.lo(index_ops_kernel_argmax_float_1d.o): In function `argmax_float_1d_xla_impl':
index_ops_kernel_argmax_float_1d.cc:(.text.argmax_float_1d_xla_impl+0x0): multiple definition of `argmax_float_1d_xla_impl'
bazel-out/k8-py3-opt/bin/external/org_tensorflow/tensorflow/compiler/tf2xla/kernels/libindex_ops_kernel_argmax_float_1d.lo(index_ops_kernel_argmax_float_1d.o):index_ops_kernel_argmax_float_1d.cc:(.text.argmax_float_1d_xla_impl+0x0): first defined here
bazel-out/k8-py3-opt/bin/tensorflow/compiler/tf2xla/kernels/libindex_ops_kernel_argmax_float_2d.lo(index_ops_kernel_argmax_float_2d.o): In function `argmax_float_2d_xla_impl':
index_ops_kernel_argmax_float_2d.cc:(.text.argmax_float_2d_xla_impl+0x0): multiple definition of `argmax_float_2d_xla_impl'
bazel-out/k8-py3-opt/bin/external/org_tensorflow/tensorflow/compiler/tf2xla/kernels/libindex_ops_kernel_argmax_float_2d.lo(index_ops_kernel_argmax_float_2d.o):index_ops_kernel_argmax_float_2d.cc:(.text.argmax_float_2d_xla_impl+0x0): first defined here
bazel-out/k8-py3-opt/bin/tensorflow/compiler/tf2xla/kernels/libxla_cpu_only_ops.lo(index_ops_cpu.o): In function `tensorflow::(anonymous namespace)::ArgMaxCustomCallOp::~ArgMaxCustomCallOp()':
index_ops_cpu.cc:(.text._ZN10tensorflow12_GLOBAL__N_118ArgMaxCustomCallOpD2Ev+0x10): undefined reference to `tensorflow::OpKernel::~OpKernel()'
bazel-out/k8-py3-opt/bin/tensorflow/compiler/tf2xla/kernels/libxla_cpu_only_ops.lo(index_ops_cpu.o): In function `tensorflow::(anonymous namespace)::ArgMaxCustomCallOp::~ArgMaxCustomCallOp()':
index_ops_cpu.cc:(.text._ZN10tensorflow12_GLOBAL__N_118ArgMaxCustomCallOpD0Ev+0x17): undefined reference to `tensorflow::OpKernel::~OpKernel()'
bazel-out/k8-py3-opt/bin/tensorflow/compiler/tf2xla/kernels/libxla_cpu_only_ops.lo(index_ops_cpu.o): In function `tensorflow::Status tensorflow::errors::InvalidArgument<char const*>(char const*)':
index_ops_cpu.cc:(.text._ZN10tensorflow6errors15InvalidArgumentIJPKcEEENS_6StatusEDpT_[_ZN10tensorflow6errors15InvalidArgumentIJPKcEEENS_6StatusEDpT_]+0x34): undefined reference to `tensorflow::strings::StrCat(tensorflow::strings::AlphaNum const&)'
index_ops_cpu.cc:(.text._ZN10tensorflow6errors15InvalidArgumentIJPKcEEENS_6StatusEDpT_[_ZN10tensorflow6errors15InvalidArgumentIJPKcEEENS_6StatusEDpT_]+0x49): undefined reference to `tensorflow::Status::Status(tensorflow::error::Code, tensorflow::StringPiece)'
....
sendrecv_ops.cc:(.text.startup._Z41__static_initialization_and_destruction_0ii.constprop.9+0x3a9): undefined reference to `tensorflow::register_op::OpDefBuilderReceiver::OpDefBuilderReceiver(tensorflow::register_op::OpDefBuilderWrapper<true> const&)'
collect2: error: ld returned 1 exit status
Target //tensorflow/compiler/aot/tests:my_binary failed to build
INFO: Elapsed time: 10.513s, Critical Path: 10.10s
FAILED: Build did NOT complete successfully
The .... is pretty much just a bunch more of the same "undefined reference to" errors. Any ideas how to fix this?
To fix the link errors I had to use tf_cc_binary instead of cc_binary in the BUILD file (according to this), and I also had to add the line that loads tf_cc_binary.
This is a solution in the context of my post. There are still other errors, but they are outside the scope of this question.
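For concreteness, here is a minimal sketch of the adjusted rule. It assumes tf_cc_binary is loaded from //tensorflow:tensorflow.bzl (the path may differ between TensorFlow versions) and keeps the same srcs, deps, and linkopts as the cc_binary shown above:

# Load path assumed; adjust if tf_cc_binary is defined elsewhere in your checkout.
load("//tensorflow:tensorflow.bzl", "tf_cc_binary")

tf_cc_binary(
    name = "my_binary",
    srcs = [
        "my_code.cc",
    ],
    deps = [
        ":test_graph_tfmatmul",
        # ... same remaining deps as in the cc_binary rule above ...
    ],
    linkopts = [
        "-lpthread",
    ],
)

tf_cc_binary is TensorFlow's wrapper around cc_binary; it sets up linkage against the TensorFlow framework the way the rest of the tree expects, which is what avoids the duplicate-symbol and undefined-reference errors shown above.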