Link error when trying to compile XLA AOT for TensorFlow

I'm trying to follow this tutorial to build an XLA AOT example (with some pieces taken from this). I've been able to build TensorFlow from source and get XLA JIT working on the small mnist_softmax_xla.py.

The steps I've done so far are:

1)

#from tensorflow/tensorflow/compiler/aot/tests
python3 ./make_test_graphs.py --out_dir=./

2) I also had to change line 21 of /home/m2angus/tensorflow/third_party/llvm/llvm.BUILD to:

package(default_visibility = ["//visibility:public"])

This was needed to work around visibility errors from Bazel.

3)

bazel build --config=opt --config=cuda --verbose_failures --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" //tensorflow/compiler/aot/tests:my_binary

With the following files:

tensorflow/tensorflow/compiler/aot/tests/my_code.cc

#define EIGEN_USE_THREADS
#define EIGEN_USE_CUSTOM_THREAD_POOL

#include <iostream>
#include "third_party/eigen3/unsupported/Eigen/CXX11/Tensor"
#include "tensorflow/compiler/aot/tests/test_graph_tfmatmul.h" // generated

int main(int argc, char** argv) {
    Eigen::ThreadPool tp(2);  // Size the thread pool as appropriate.
    Eigen::ThreadPoolDevice device(&tp, tp.NumThreads());

    foo::bar::MatMulComp matmul;
    matmul.set_thread_pool(&device);

    // Set up args and run the computation.
    const float args[12] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12};
    std::copy(args + 0, args + 6, matmul.arg0_data());
    std::copy(args + 6, args + 12, matmul.arg1_data());
    matmul.Run();

    // Check result
    if (matmul.result0(0, 0) == 58) {
        std::cout << "Success" << std::endl;
    } else {
        std::cout << "Failed. Expected value 58 at 0,0. Got:"
                    << matmul.result0(0, 0) << std::endl;
    }

    return 0;
}

tensorflow/tensorflow/compiler/aot/tests/BUILD

# Example of linking your binary
# Also see //third_party/tensorflow/compiler/aot/tests/BUILD
load("//tensorflow/compiler/aot:tfcompile.bzl", "tf_library")

# The same tf_library call from step 2 above.
tf_library(
    name = "test_graph_tfmatmul",
    cpp_class = "foo::bar::MatMulComp",
    graph = "test_graph_tfmatmul.pb",
    config = "test_graph_tfmatmul.config.pbtxt",
)

# The executable code generated by tf_library can then be linked into your code.
cc_binary(
    name = "my_binary",
    srcs = [
        "my_code.cc",  # include test_graph_tfmatmul.h to access the generated header
    ],
    deps = [
        ":test_graph_tfmatmul",  # link in the generated object file
        "//tensorflow/compiler/tf2xla",
        "//tensorflow/compiler/tf2xla:common",
        "//tensorflow/compiler/tf2xla:tf2xla_proto",
        "//tensorflow/compiler/tf2xla:tf2xla_util",
        "//tensorflow/compiler/tf2xla:xla_compiler",
        "//tensorflow/compiler/tf2xla/kernels:xla_cpu_only_ops",
        "//tensorflow/compiler/tf2xla/kernels:xla_ops",
        "//tensorflow/compiler/xla:shape_util",
        "//tensorflow/compiler/xla:statusor",
        "//tensorflow/compiler/xla:util",
        "//tensorflow/compiler/xla:xla_data_proto",
        "//tensorflow/compiler/xla/client:client_library",
        "//tensorflow/compiler/xla/client:compile_only_client",
        "//tensorflow/compiler/xla/service:compiler",
        "//tensorflow/compiler/xla/service/cpu:cpu_compiler",
        "//tensorflow/core:core_cpu",
        "//tensorflow/core:core_cpu_internal",
        "//tensorflow/core:framework",
        "//tensorflow/core:framework_internal",
        "//tensorflow/core:lib",
        "//tensorflow/core:protos_all_cc",
        "//tensorflow/compiler/tf2xla:xla_compiled_cpu_function",
        "//third_party/eigen3",
    ],
    linkopts = [
        "-lpthread",
    ]
)

The error output is huge, so I'll just include a snippet of it:

Loading: 
Loading: 0 packages loaded
INFO: Analysed target //tensorflow/compiler/aot/tests:my_binary (0 packages loaded).
INFO: Found 1 target...
[0 / 2] BazelWorkspaceStatusAction stable-status.txt
[1 / 2] Linking tensorflow/compiler/aot/tests/my_binary; 1s local
ERROR: /home/m2angus/tensorflow/tensorflow/compiler/aot/tests/BUILD:14:1: Linking of rule '//tensorflow/compiler/aot/tests:my_binary' failed (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command 
  (cd /home/m2angus/.cache/bazel/_bazel_m2angus/5e7d70ea4881ca91d8032ed9fd943ff8/execroot/org_tensorflow && \
  exec env - \
    CUDA_TOOLKIT_PATH=/usr/local/cuda \
    CUDNN_INSTALL_PATH=/usr/local/cuda-8.0 \
    GCC_HOST_COMPILER_PATH=/usr/bin/gcc \
    PWD=/proc/self/cwd \
    PYTHON_BIN_PATH=/usr/bin/python3 \
    PYTHON_LIB_PATH=/usr/local/lib/python3.5/dist-packages \
    TF_CUDA_CLANG=0 \
    TF_CUDA_COMPUTE_CAPABILITIES=6.1 \
    TF_CUDA_VERSION=8.0 \
    TF_CUDNN_VERSION=6 \
    TF_NEED_CUDA=1 \
    TF_NEED_OPENCL=0 \
  external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -o bazel-out/k8-py3-opt/bin/tensorflow/compiler/aot/tests/my_binary '-Wl,-rpath,$ORIGIN/../../../../_solib_local/_U@local_Uconfig_Ucuda_S_Scuda_Ccudart___Uexternal_Slocal_Uconfig_Ucuda_Scuda_Scuda_Slib' -Lbazel-out/k8-py3-opt/bin/_solib_local/_U@local_Uconfig_Ucuda_S_Scuda_Ccudart___Uexternal_Slocal_Uconfig_Ucuda_Scuda_Scuda_Slib -pthread -Wl,-rpath,../local_config_cuda/cuda/lib64 -Wl,-rpath,../local_config_cuda/cuda/extras/CUPTI/lib64 -Wl,-no-as-needed -B/usr/bin/ -pie -Wl,-z,relro,-z,now -no-canonical-prefixes -pass-exit-codes '-Wl,--build-id=md5' '-Wl,--hash-style=gnu' -Wl,--gc-sections -Wl,@bazel-out/k8-py3-opt/bin/tensorflow/compiler/aot/tests/my_binary-2.params)
bazel-out/k8-py3-opt/bin/tensorflow/compiler/tf2xla/kernels/libgather_op_kernel_float_int32.lo(gather_op_kernel_float_int32.o): In function `gather_float_int32_xla_impl':
gather_op_kernel_float_int32.cc:(.text.gather_float_int32_xla_impl+0x0): multiple definition of `gather_float_int32_xla_impl'
bazel-out/k8-py3-opt/bin/external/org_tensorflow/tensorflow/compiler/tf2xla/kernels/libgather_op_kernel_float_int32.lo(gather_op_kernel_float_int32.o):gather_op_kernel_float_int32.cc:(.text.gather_float_int32_xla_impl+0x0): first defined here
bazel-out/k8-py3-opt/bin/tensorflow/compiler/tf2xla/kernels/libgather_op_kernel_float_int64.lo(gather_op_kernel_float_int64.o): In function `gather_float_int64_xla_impl':
gather_op_kernel_float_int64.cc:(.text.gather_float_int64_xla_impl+0x0): multiple definition of `gather_float_int64_xla_impl'
bazel-out/k8-py3-opt/bin/external/org_tensorflow/tensorflow/compiler/tf2xla/kernels/libgather_op_kernel_float_int64.lo(gather_op_kernel_float_int64.o):gather_op_kernel_float_int64.cc:(.text.gather_float_int64_xla_impl+0x0): first defined here
bazel-out/k8-py3-opt/bin/tensorflow/compiler/tf2xla/kernels/libindex_ops_kernel_argmax_float_1d.lo(index_ops_kernel_argmax_float_1d.o): In function `argmax_float_1d_xla_impl':
index_ops_kernel_argmax_float_1d.cc:(.text.argmax_float_1d_xla_impl+0x0): multiple definition of `argmax_float_1d_xla_impl'
bazel-out/k8-py3-opt/bin/external/org_tensorflow/tensorflow/compiler/tf2xla/kernels/libindex_ops_kernel_argmax_float_1d.lo(index_ops_kernel_argmax_float_1d.o):index_ops_kernel_argmax_float_1d.cc:(.text.argmax_float_1d_xla_impl+0x0): first defined here
bazel-out/k8-py3-opt/bin/tensorflow/compiler/tf2xla/kernels/libindex_ops_kernel_argmax_float_2d.lo(index_ops_kernel_argmax_float_2d.o): In function `argmax_float_2d_xla_impl':
index_ops_kernel_argmax_float_2d.cc:(.text.argmax_float_2d_xla_impl+0x0): multiple definition of `argmax_float_2d_xla_impl'
bazel-out/k8-py3-opt/bin/external/org_tensorflow/tensorflow/compiler/tf2xla/kernels/libindex_ops_kernel_argmax_float_2d.lo(index_ops_kernel_argmax_float_2d.o):index_ops_kernel_argmax_float_2d.cc:(.text.argmax_float_2d_xla_impl+0x0): first defined here
bazel-out/k8-py3-opt/bin/tensorflow/compiler/tf2xla/kernels/libxla_cpu_only_ops.lo(index_ops_cpu.o): In function `tensorflow::(anonymous namespace)::ArgMaxCustomCallOp::~ArgMaxCustomCallOp()':
index_ops_cpu.cc:(.text._ZN10tensorflow12_GLOBAL__N_118ArgMaxCustomCallOpD2Ev+0x10): undefined reference to `tensorflow::OpKernel::~OpKernel()'
bazel-out/k8-py3-opt/bin/tensorflow/compiler/tf2xla/kernels/libxla_cpu_only_ops.lo(index_ops_cpu.o): In function `tensorflow::(anonymous namespace)::ArgMaxCustomCallOp::~ArgMaxCustomCallOp()':
index_ops_cpu.cc:(.text._ZN10tensorflow12_GLOBAL__N_118ArgMaxCustomCallOpD0Ev+0x17): undefined reference to `tensorflow::OpKernel::~OpKernel()'
bazel-out/k8-py3-opt/bin/tensorflow/compiler/tf2xla/kernels/libxla_cpu_only_ops.lo(index_ops_cpu.o): In function `tensorflow::Status tensorflow::errors::InvalidArgument<char const*>(char const*)':
index_ops_cpu.cc:(.text._ZN10tensorflow6errors15InvalidArgumentIJPKcEEENS_6StatusEDpT_[_ZN10tensorflow6errors15InvalidArgumentIJPKcEEENS_6StatusEDpT_]+0x34): undefined reference to `tensorflow::strings::StrCat(tensorflow::strings::AlphaNum const&)'
index_ops_cpu.cc:(.text._ZN10tensorflow6errors15InvalidArgumentIJPKcEEENS_6StatusEDpT_[_ZN10tensorflow6errors15InvalidArgumentIJPKcEEENS_6StatusEDpT_]+0x49): undefined reference to `tensorflow::Status::Status(tensorflow::error::Code, tensorflow::StringPiece)'
....
sendrecv_ops.cc:(.text.startup._Z41__static_initialization_and_destruction_0ii.constprop.9+0x3a9): undefined reference to `tensorflow::register_op::OpDefBuilderReceiver::OpDefBuilderReceiver(tensorflow::register_op::OpDefBuilderWrapper<true> const&)'
collect2: error: ld returned 1 exit status
Target //tensorflow/compiler/aot/tests:my_binary failed to build
INFO: Elapsed time: 10.513s, Critical Path: 10.10s
FAILED: Build did NOT complete successfully

The .... is pretty much just more "undefined reference to" errors. Any ideas on how to fix this?

1 Answer

McAngus (accepted answer):

To fix the link errors, I had to use tf_cc_binary instead of cc_binary in the BUILD file (according to this). I also had to add the line

load("//tensorflow:tensorflow.bzl", "tf_cc_binary")

This is the solution in the context of my post. There are still other errors, but they are outside the scope of this question.