How to properly include stdio.h in an NVRTC-compiled program?

I have written an amazing kernel which will bring me fame and fortune - if I can only get it to compile with NVRTC:

#include <stdio.h>

__global__ void do_stuff() { }

I would have hoped that system headers would be recognized by the (runtime) compiler, just as they are by a regular compiler, and that this would "just work" (modulo any printf-specific machinery). Alternatively, if it didn't work, I would have expected an error message saying that the source of stdio.h was not made available to the "program creation" API call (nvrtcCreateProgram()), since I'm passing NULL and NULL as its last two arguments.

However, what I get is the following:

/usr/include/stdio.h(33): catastrophic error: cannot open source file "stddef.h"

This seems strange to me. It means that the runtime compiler is able to look inside system headers, but is not able to find stddef.h the way nvcc or the host-side compiler can.

Why is this happening, and what is the idiomatic/recommended workaround?

Note: I'd like a workaround which would be cross-platform, not just work on my individual machine.


There are 2 answers

einpoklum On BEST ANSWER

Another approach is taken by the Jitify library, which Robert Crovella has graciously reminded me of. Although this does not seem to be well documented, Jitify pre-includes processed snippets of whichever headers it sees fit. In particular, for <climits>/<limits.h>:

static const char* jitsafe_header_limits_h = R"(
#pragma once
#if defined _WIN32 || defined _WIN64
 #define __WORDSIZE 32
#else
 #if defined __x86_64__ && !defined __ILP32__
  #define __WORDSIZE 64
 #else
  #define __WORDSIZE 32
 #endif
#endif
#define MB_LEN_MAX  16
#define CHAR_BIT    8
#define SCHAR_MIN   (-128)
#define SCHAR_MAX   127
#define UCHAR_MAX   255
enum {
  _JITIFY_CHAR_IS_UNSIGNED = (char)-1 >= 0,
  CHAR_MIN = _JITIFY_CHAR_IS_UNSIGNED ? 0 : SCHAR_MIN,
  CHAR_MAX = _JITIFY_CHAR_IS_UNSIGNED ? UCHAR_MAX : SCHAR_MAX,
};
#define SHRT_MIN    (-32768)
#define SHRT_MAX    32767
#define USHRT_MAX   65535
#define INT_MIN     (-INT_MAX - 1)
#define INT_MAX     2147483647
#define UINT_MAX    4294967295U
#if __WORDSIZE == 64
 # define LONG_MAX  9223372036854775807L
#else
 # define LONG_MAX  2147483647L
#endif
#define LONG_MIN    (-LONG_MAX - 1L)
#if __WORDSIZE == 64
 #define ULONG_MAX  18446744073709551615UL
#else
 #define ULONG_MAX  4294967295UL
#endif
#define LLONG_MAX  9223372036854775807LL
#define LLONG_MIN  (-LLONG_MAX - 1LL)
#define ULLONG_MAX 18446744073709551615ULL
)";

for stddef.h:

static const char* jitsafe_header_stddef_h =
    "#pragma once\n"
    "#include <climits>\n"
    "namespace __jitify_stddef_ns {\n"
    "#if __cplusplus >= 201103L\n"
    "typedef decltype(nullptr) nullptr_t;\n"
    "#if defined(_MSC_VER)\n"
    "  typedef double max_align_t;\n"
    "#elif defined(__APPLE__)\n"
    "  typedef long double max_align_t;\n"
    "#else\n"
    "  // Define max_align_t to match the GCC definition.\n"
    "  typedef struct {\n"
    "    long long __jitify_max_align_nonce1\n"
    "        __attribute__((__aligned__(__alignof__(long long))));\n"
    "    long double __jitify_max_align_nonce2\n"
    "        __attribute__((__aligned__(__alignof__(long double))));\n"
    "  } max_align_t;\n"
    "#endif\n"
    "#endif  // __cplusplus >= 201103L\n"
    "#if __cplusplus >= 201703L\n"
    "enum class byte : unsigned char {};\n"
    "#endif  // __cplusplus >= 201703L\n"
    "} // namespace __jitify_stddef_ns\n"
    "namespace std {\n"
    "  // NVRTC provides built-in definitions of ::size_t and ::ptrdiff_t.\n"
    "  using ::size_t;\n"
    "  using ::ptrdiff_t;\n"
    "  using namespace __jitify_stddef_ns;\n"
    "} // namespace std\n"
    "using namespace __jitify_stddef_ns;\n";

and for stdio.h:

static const char* jitsafe_header_stdio_h =
    "#pragma once\n"
    "#include <stddef.h>\n"
    "#define FILE int\n"
    "int fflush ( FILE * stream );\n"
    "int fprintf ( FILE * stream, const char * format, ... );\n";

If you pass those strings to nvrtcCreateProgram() as headers, with the matching header names (stdio.h, stddef.h and so on) as the include names, your kernel will likely compile.

In fact, one could form header files out of these and other mini-headers in jitify.hpp, to use in non-NVRTC kernel compilation. That might be useful too.

One last point: the declarations in the strings above do not specify a __device__ execution space. So either add __device__ to them yourself, or tell the compiler to assume that functions are intended for execution on the device only unless otherwise specified, using the --device-as-default-execution-space NVRTC compiler option.

einpoklum On

Here are two solutions which might work, but which I would rather avoid. If they're the only reasonable course of action after all - please comment and say so:

  1. Add the specific path to stddef.h as a compiler parameter (-I or --include-path=).
  2. Pass the source of stddef.h to the nvrtcCreateProgram() call.
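For reference, a rough sketch of what each of the two options involves on the host side. Both function names are hypothetical, and the directory or file path is left as an input because it differs between machines and toolchains, which is exactly why neither option is cross-platform on its own.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Option 1: build an include-path compiler option for a directory that
// contains stddef.h. No particular location is assumed here.
std::string include_path_option(const std::string& dir) {
    return "--include-path=" + dir;
}

// Option 2: read a header's text so that it can be passed as one of the
// `headers` strings of nvrtcCreateProgram(), under the name "stddef.h".
// Returns an empty string if the file cannot be opened.
std::string read_header_source(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        return std::string();
    }
    std::ostringstream contents;
    contents << in.rdbuf();
    return contents.str();
}
```

Note that a host stddef.h read this way may itself pull in further host-only headers or compiler builtins, so there is no guarantee it compiles under NVRTC as-is.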