What are the "host and target platforms" of a dependency?

https://nixos.org/manual/nixpkgs/stable/#ssec-stdenv-dependencies-reference says:

Dependencies can be broken down along three axes: their host and target platforms relative to the new derivation’s, and whether they are propagated. The platform distinctions are motivated by cross compilation; see Cross-compilation for exactly what each platform means. [1] But even if one is not cross compiling, the platforms imply whether or not the dependency is needed at run-time or build-time, a concept that makes perfect sense outside of cross compilation. By default, the run-time/build-time distinction is just a hint for mental clarity, but with strictDeps set it is mostly enforced even in the native case.

The extension of PATH with dependencies, alluded to above, proceeds according to the relative platforms alone. The process is carried out only for dependencies whose host platform matches the new derivation’s build platform i.e. dependencies which run on the platform where the new derivation will be built. [2] For each dependency of those dependencies, dep/bin, if present, is added to the PATH environment variable.

A dependency is said to be propagated when some of its other-transitive (non-immediate) downstream dependencies also need it as an immediate dependency. [3]

It is important to note that dependencies are not necessarily propagated as the same sort of dependency that they were before, but rather as the corresponding sort so that the platform rules still line up. To determine the exact rules for dependency propagation, we start by assigning to each dependency a couple of ternary numbers (-1 for build, 0 for host, and 1 for target) representing its dependency type, which captures how its host and target platforms are each “offset” from the depending derivation’s host and target platforms. The following table summarize the different combinations that can be obtained: ...

What are the three axes? I only saw two:

  • their host and target platforms relative to the new derivation’s,
  • whether they are propagated.

What do the following parts from the quote mean:

  • the "host and target platforms" of a dependency, as in the first paragraph;

  • "build", as in "assigning to each dependency a couple of ternary numbers (-1 for build, 0 for host, and 1 for target) representing its dependency type" in the last paragraph; (Is "build" another platform of a dependency, besides the dependency's "host and target platforms"?)

  • "their host and target platforms relative to the new derivation’s";

  • the two sentences in bold and in the last paragraph.

?

Can someone rephrase it in a way that is easier to understand?

There is 1 answer

tobiasBora On

There are several questions here (and I saw you also asked a related question here, where I just provided a shorter explanation).

First, long story short: if you only want to compile ordinary software (i.e. not a compiler), you most likely want to use:

  • nativeBuildInputs for tools that must be available during compilation (e.g. pkg-config, cmake, unzip),
  • and buildInputs for everything else (e.g. all the libraries your program uses, like openssl, ffmpeg…).

If you are unsure where a package should go, you can run rg yourPackage (rg, i.e. ripgrep, is a faster and fancier grep -r) in the nixpkgs repository and see what others did.

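Now, regarding your questions. To answer the first one, I think they simply see the tuple (host, target, propagated) ∈ ({-1,0,1} x {-1,0,1} x {0,1}) as three axes, hence up to 3x3x2 options for each dependency (in practice, Nixpkgs only defines attributes for the six pairs where the host offset does not exceed the target offset, each in a plain and a propagated variant).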

Then, the first thing to know is that several machines, possibly with different architectures, are often involved when dealing with programs, notably when cross-compiling (e.g. you compile on x86_64 a program for Arm v8, maybe because your Intel laptop is faster than your Raspberry Pi, or because you don't have a Raspberry Pi with you):

  • build is the machine that will compile your software (think "your laptop", or Hydra's servers if your program is integrated in nixpkgs).
  • host is the machine on which the program will run (think "a Raspberry Pi", or the computer of the people who will use the program, maybe it's an Apple silicon M2…).
  • target is a bit more special, and only applies when the software you compile is itself a compiler (e.g. g++). Indeed, weirdly enough, when you compile a compiler, you need to specify the architecture on which the programs produced by that compiler will run.

So for instance, if I compile g++ on my laptop, in such a way that g++ will run on a macOS laptop and will be used to compile binaries that run on a Raspberry Pi, then:

  • build will be my laptop (say it is x86_64-linux)
  • host will be the Mac (say it is aarch64-darwin)
  • target will be the Raspberry Pi (say it is aarch64-linux)

This should answer your questions 2 & 3.

To answer question 4, I will re-use the example given here, which you can read for further details and examples. First, a dependency is any program or library that is needed somehow to compile or run the program or, when you are compiling a compiler, possibly to run the programs that this compiler will later produce. For instance, to compile g++, you will need gcc, so gcc is a dependency of g++.

But you need to know which "flavor" of gcc you need to compile g++: since you will run gcc on your laptop and use the compiled g++ on the Mac, you expect the gcc flavor that runs on x86_64-linux and compiles to aarch64-darwin. So, in Nix terminology, the (host, target) of the gcc dependency for this build is (x86_64-linux, aarch64-darwin). (We do not keep the build platform of the dependency: who cares on which platform gcc was built?)

This tuple is of course relative to the current derivation (answering question 4): if you compile another program, gcc might be assigned another tuple, for instance if you build a code editor that calls gcc at run-time.

Note that if your dependency is not a compiler, then the target part of the (host, target) tuple is irrelevant. So for instance, if g++ needs some libraries at run-time, like glibc or perl, then the (host, target) tuple can be simplified to just the host, like (host, *), here (aarch64-darwin, *). But it also does not hurt to give target a value anyway, so why not give it the value (aarch64-darwin, aarch64-linux)? This is what buildInputs configures; if you want (aarch64-darwin, aarch64-darwin), use depsHostHost as explained here.

Then, to answer your last question, you can easily see that you need a way to specify the (host, target) somehow. The choice that was made is that:

  • -1 is just a shortcut for build of current derivation (so here, -1 = x86_64-linux)
  • 0 is just a shortcut for host of current derivation (so here, 0 = aarch64-darwin)
  • 1 is just a shortcut for target of current derivation (so here, 1 = aarch64-linux)

Therefore, we can easily convert the tuples above: the tuple for glibc is (aarch64-darwin, aarch64-linux), i.e. (0,1), while the tuple for gcc is (x86_64-linux, aarch64-darwin), i.e. (-1,0).
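To make the shortcut concrete, here is a minimal sketch in Python (not Nix code; the names PLATFORMS and resolve are made up for illustration) that turns a dependency's (host, target) offsets back into concrete platforms, using the three platforms assumed in the g++ example:

```python
# Platforms of the derivation being built (g++ in the example above).
# The keys are the offsets; the values are the platforms assumed in the example.
PLATFORMS = {
    -1: "x86_64-linux",    # build: where the compilation happens
    0: "aarch64-darwin",   # host: where the resulting g++ runs
    1: "aarch64-linux",    # target: what the resulting g++ compiles for
}


def resolve(offsets):
    """Turn a dependency's (host, target) offsets into concrete platforms."""
    host_offset, target_offset = offsets
    return PLATFORMS[host_offset], PLATFORMS[target_offset]


print(resolve((-1, 0)))  # gcc:   ('x86_64-linux', 'aarch64-darwin')
print(resolve((0, 1)))   # glibc: ('aarch64-darwin', 'aarch64-linux')
```

The point of the encoding is that the offsets stay the same whatever the actual platforms are: only the PLATFORMS mapping changes from one build to another.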

The reason for this choice is that if you want to know the host (i.e. where it should run) of the dependency of a dependency of a dependency…, you just add the host value of the dependency to the host value of the dependency of the dependency, to the host value of the dependency of the dependency of the dependency, and so on. Note that this process also stops if the number falls outside the range -1 to 1, in order to:

prune transitive dependencies whose combined offsets go out-of-bounds, which can be viewed as a filter over that transitive closure removing dependencies that are blatantly absurd.
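"Just sum" is a simplification. As far as I recall, the Nixpkgs manual states the exact propagation rule with a small helper it calls mapOffset, together with two bounds checks. The Python below is only a sketch of that arithmetic under this assumption (it is not Nix code, and the names VALID and propagate are mine), so check it against the manual before relying on it:

```python
VALID = {-1, 0, 1}


def map_offset(h, t, i):
    """Shift offset i (taken from B's dependency on C) by the offsets (h, t)
    that A assigns to B. Non-positive offsets shift by h, positive ones by t - 1."""
    return i + (h if i <= 0 else t - 1)


def propagate(first, second):
    """Combine A -> B (offsets `first`) with a propagated B -> C (offsets `second`).

    Returns the (host, target) offsets of C relative to A,
    or None when the combined offsets go out of bounds and C is pruned.
    """
    h0, t0 = first
    h1, t1 = second
    if h0 + h1 not in VALID or h0 + t1 not in VALID:
        return None
    return map_offset(h0, t0, h1), map_offset(h0, t0, t1)


print(propagate((0, 1), (0, 1)))     # (0, 1): stays a buildInputs-style dependency
print(propagate((-1, 0), (0, 1)))    # (-1, 0): becomes a nativeBuildInputs-style dependency
print(propagate((-1, 0), (-1, 0)))   # None: would need a platform "before" build, so pruned
```

In words: a library propagated by one of your buildInputs stays a buildInputs-style dependency for you, while a library propagated by one of your nativeBuildInputs is shifted to the build platform along with the tool that needs it.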

Finally, you may wonder how to specify that g++ has a dependency whose tuple is (0,1), i.e. one that will run on the host and compile for the target architecture. The whole conversion table is given here:

[Image: conversion table between dependency attributes and their (host, target) offsets]

where the offset is the corresponding tuple. So here, you see that the tuple (-1,0) is provided by nativeBuildInputs = [ gcc ]; while the tuple (0,1) is provided by buildInputs = [ glibc perl ];.
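For reference, the mapping from offsets to attribute names, as listed in the Nixpkgs manual, can be written as a small lookup table. This is a Python sketch for illustration only (DEPENDENCY_ATTRS and attr_for are hypothetical names, not part of Nixpkgs); the attribute names themselves are the real ones:

```python
# (host offset, target offset) -> (plain attribute, propagated attribute)
DEPENDENCY_ATTRS = {
    (-1, -1): ("depsBuildBuild", "depsBuildBuildPropagated"),
    (-1, 0): ("nativeBuildInputs", "propagatedNativeBuildInputs"),
    (-1, 1): ("depsBuildTarget", "depsBuildTargetPropagated"),
    (0, 0): ("depsHostHost", "depsHostHostPropagated"),
    (0, 1): ("buildInputs", "propagatedBuildInputs"),
    (1, 1): ("depsTargetTarget", "depsTargetTargetPropagated"),
}


def attr_for(offsets, propagated=False):
    """Return the mkDerivation attribute that declares a dependency with these offsets."""
    plain, prop = DEPENDENCY_ATTRS[offsets]
    return prop if propagated else plain


print(attr_for((-1, 0)))                  # nativeBuildInputs (gcc in the example)
print(attr_for((0, 1)))                   # buildInputs (glibc, perl in the example)
print(attr_for((0, 1), propagated=True))  # propagatedBuildInputs
```

Six offset pairs times two propagation states gives the twelve dependency attributes, which is one way to read the "three axes" from the question.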