<chapter xmlns="http://docbook.org/ns/docbook"
         xmlns:xlink="http://www.w3.org/1999/xlink"
         xml:id="chap-cross">
 <title>Cross-compilation</title>
 <section xml:id="sec-cross-intro">
  <title>Introduction</title>

  <para>
   "Cross-compilation" means compiling a program on one machine for another type
   of machine. For example, a typical use of cross-compilation is to compile
   programs for embedded devices. These devices often don't have the computing
   power and memory to compile their own programs. One might think that
   cross-compilation is a fairly niche concern. However, there are significant
   advantages to rigorously distinguishing between build-time and run-time
   environments! This applies even when one is developing and deploying on the
   same machine. Nixpkgs is increasingly adopting the opinion that packages
   should be written with cross-compilation in mind, and that Nixpkgs should
   evaluate in a similar way (by minimizing cross-compilation-specific special
   cases) whether or not one is cross-compiling.
  </para>

  <para>
   This chapter is organized in three parts. First, it describes the basics of
   how to package software in a way that supports cross-compilation. Second, it
   describes how to use Nixpkgs when cross-compiling. Third, it describes the
   internal infrastructure supporting cross-compilation.
  </para>
 </section>
<!--============================================================-->
 <section xml:id="sec-cross-packaging">
  <title>Packaging in a cross-friendly manner</title>

  <section xml:id="sec-cross-platform-parameters">
   <title>Platform parameters</title>

   <para>
    Nixpkgs follows the <link
    xlink:href="https://gcc.gnu.org/onlinedocs/gccint/Configure-Terms.html">conventions
    of GNU autoconf</link>. We distinguish between 3 types of platforms when
    building a derivation: <wordasword>build</wordasword>,
    <wordasword>host</wordasword>, and <wordasword>target</wordasword>. In
    summary, <wordasword>build</wordasword> is the platform on which a package
    is being built, and <wordasword>host</wordasword> is the platform on which
    it will run. The third attribute, <wordasword>target</wordasword>, is
    relevant only for certain specific compilers and build tools.
   </para>

   <para>
    In Nixpkgs, these three platforms are defined as attribute sets under the
    names <literal>buildPlatform</literal>, <literal>hostPlatform</literal>,
    and <literal>targetPlatform</literal>. They are always defined as
    attributes in the standard environment. That means one can access them
    like this:
<programlisting>{ stdenv, fooDep, barDep, ... }: ...stdenv.buildPlatform...</programlisting>
   </para>

   <variablelist>
    <varlistentry>
     <term>
      <varname>buildPlatform</varname>
     </term>
     <listitem>
      <para>
       The "build platform" is the platform on which a package is built. Once
       someone has a built package, or pre-built binary package, the build
       platform should not matter and can be ignored.
      </para>
     </listitem>
    </varlistentry>
    <varlistentry>
     <term>
      <varname>hostPlatform</varname>
     </term>
     <listitem>
      <para>
       The "host platform" is the platform on which a package will be run. This
       is the simplest platform to understand, but also the one with the worst
       name.
      </para>
     </listitem>
    </varlistentry>
    <varlistentry>
     <term>
      <varname>targetPlatform</varname>
     </term>
     <listitem>
      <para>
       The "target platform" attribute is, unlike the other two attributes, not
       actually fundamental to the process of building software. Instead, it is
       only relevant for compatibility with building certain specific compilers
       and build tools. It can be safely ignored for all other packages.
      </para>
      <para>
       The build process of certain compilers is written in such a way that the
       compiler resulting from a single build can itself only produce binaries
       for a single platform. The task of specifying this single "target
       platform" is thus pushed to build time of the compiler. The root cause of
       this is that the compiler (which will be run on the host) and the
       standard library/runtime (which will be run on the target) are built by
       a single build process.
      </para>
      <para>
       There is no fundamental need to think about a single target ahead of
       time like this. If the tool supports modular or pluggable backends, both
       the need to specify the target at build time and the constraint of
       having only a single target disappear. An example of such a tool is
       LLVM.
      </para>
      <para>
       Although the existence of a "target platform" is arguably a historical
       mistake, it is a common one: examples of tools that suffer from it are
       GCC, Binutils, GHC and Autoconf. Nixpkgs tries to avoid sharing in the
       mistake where possible. Still, because the concept of a target platform
       is so ingrained, it is best to support it as is.
      </para>
     </listitem>
    </varlistentry>
   </variablelist>

   <para>
    The exact schema these fields follow is a bit ill-defined due to a long and
    convoluted evolution, but this is slowly being cleaned up. You can see
    examples of ones used in practice in
    <literal>lib.systems.examples</literal>; note how they are not all very
    consistent. For now, here are a few fields one can count on them
    containing:
   </para>

   <variablelist>
    <varlistentry>
     <term>
      <varname>system</varname>
     </term>
     <listitem>
      <para>
       This is a two-component shorthand for the platform. Examples of this
       would be "x86_64-darwin" and "i686-linux"; see
       <literal>lib.systems.doubles</literal> for more. The first component
       corresponds to the CPU architecture of the platform and the second to the
       operating system of the platform (<literal>[cpu]-[os]</literal>). This
       format has built-in support in Nix, such as the
       <varname>builtins.currentSystem</varname> impure string.
      </para>
     </listitem>
    </varlistentry>
    <varlistentry>
     <term>
      <varname>config</varname>
     </term>
     <listitem>
      <para>
       This is a 3- or 4-component shorthand for the platform. Examples of this
       would be <literal>x86_64-unknown-linux-gnu</literal> and
       <literal>aarch64-apple-darwin14</literal>. This is a standard format
       called the "LLVM target triple", as it was pioneered by LLVM. In the
       4-part form, this corresponds to
       <literal>[cpu]-[vendor]-[os]-[abi]</literal>. This format is strictly
       more informative than the "Nix host double", as the previous format could
       analogously be termed. This needs a better name than
       <varname>config</varname>!
      </para>
     </listitem>
    </varlistentry>
    <varlistentry>
     <term>
      <varname>parsed</varname>
     </term>
     <listitem>
      <para>
       This is a Nix representation of a parsed LLVM target triple with
       white-listed components. This can be specified directly, or actually
       parsed from the <varname>config</varname>. See
       <literal>lib.systems.parse</literal> for the exact representation.
      </para>
     </listitem>
    </varlistentry>
    <varlistentry>
     <term>
      <varname>libc</varname>
     </term>
     <listitem>
      <para>
       This is a string identifying the standard C library used. Valid
       identifiers include "glibc" for GNU libc, "libSystem" for Darwin's
       Libsystem, and "uclibc" for µClibc. It should probably be refactored to
       use the module system, like <varname>parsed</varname>.
      </para>
     </listitem>
    </varlistentry>
    <varlistentry>
     <term>
      <varname>is*</varname>
     </term>
     <listitem>
      <para>
       These predicates are defined in <literal>lib.systems.inspect</literal>,
       and slapped onto every platform. They are superior to the ones in
       <varname>stdenv</varname> as they force the user to be explicit about
       which platform they are inspecting. Please use these instead of those.
      </para>
     </listitem>
    </varlistentry>
    <varlistentry>
     <term>
      <varname>platform</varname>
     </term>
     <listitem>
      <para>
       This is, quite frankly, a dumping ground of ad-hoc settings (it's an
       attribute set). See <literal>lib.systems.platforms</literal> for
       examples; there's hopefully one in there that will work verbatim for
       each platform that is working. Please help us triage these flags and
       give them better homes!
      </para>
     </listitem>
    </varlistentry>
   </variablelist>
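   <para>
    As a rough illustration of the fields above, the following sketch
    elaborates a platform from its <varname>config</varname> alone and reads
    back a few of the resulting attributes. It assumes that
    <literal>lib.systems.elaborate</literal> is available in your Nixpkgs
    checkout and that the parsed CPU exposes a <literal>name</literal>
    attribute; the chosen triple is only an example.
<programlisting>
let
  lib = import &lt;nixpkgs/lib&gt;;
  # Fill in the remaining fields from `config` alone (illustrative triple).
  platform = lib.systems.elaborate { config = "aarch64-unknown-linux-gnu"; };
in {
  inherit (platform) system config libc;  # the string fields described above
  cpu = platform.parsed.cpu.name;         # taken from the parsed representation
  isLinux = platform.isLinux;             # one of the is* predicates
}</programlisting>
    Evaluating it with <command>nix-instantiate --eval --strict</command>
    should print the attribute set, which can help when checking what a given
    platform description actually contains.
   </para>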
  </section>

  <section xml:id="sec-cross-specifying-dependencies">
   <title>Specifying Dependencies</title>

   <para>
    In this section we explore the relationship between both runtime and
    build-time dependencies and the 3 Autoconf platforms.
   </para>

   <para>
    A runtime dependency between 2 packages implies that between them both the
    host and target platforms match. This is directly implied by the meaning of
    "host platform" and "runtime dependency": The package dependency exists
    while both packages are running on a single host platform.
   </para>

   <para>
    A build-time dependency, however, implies a shift in platforms between the
    depending package and the depended-on package. The meaning of a build-time
    dependency is that to build the depending package we need to be able to run
    the depended-on package. The depending package's build platform is
    therefore equal to the depended-on package's host platform. Analogously,
    the depending package's host platform is equal to the depended-on package's
    target platform.
   </para>

   <para>
    In this manner, given the 3 platforms for one package, we can determine the
    three platforms for all its transitive dependencies. This is the most
    important guiding principle behind cross-compilation with Nixpkgs, and will
    be called the <wordasword>sliding window principle</wordasword>.
   </para>

   <para>
    Some examples will make this clearer. If a package is being built with a
    <literal>(build, host, target)</literal> platform triple of <literal>(foo,
    bar, bar)</literal>, then its build-time dependencies would have a triple of
    <literal>(foo, foo, bar)</literal>, and <emphasis>those packages'</emphasis>
    build-time dependencies would have a triple of <literal>(foo, foo,
    foo)</literal>. In other words, it should take two "rounds" of following
    build-time dependency edges before one reaches a fixed point where, by the
    sliding window principle, the platform triple no longer changes. Indeed,
    this is what happens when cross-compiling: only from the second round of
    build-time dependencies onward do the packages necessarily coincide with
    native packages.
   </para>
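   <para>
    The sliding window principle can be sketched as a small Nix expression.
    The helper <literal>buildTimeDeps</literal> below is purely illustrative
    and is not part of Nixpkgs; it only models how the platform triple shifts
    along one build-time dependency edge.
<programlisting>
let
  # Hypothetical helper: the triple of a package's build-time dependencies.
  # Their host is our build platform and their target is our host platform.
  buildTimeDeps = { build, host, ... }: {
    inherit build;
    host = build;
    target = host;
  };

  pkg = { build = "foo"; host = "bar"; target = "bar"; };
  round1 = buildTimeDeps pkg;     # (foo, foo, bar)
  round2 = buildTimeDeps round1;  # (foo, foo, foo)
in {
  inherit round1 round2;
  # A further round changes nothing: the fixed point has been reached.
  fixedPoint = buildTimeDeps round2 == round2;
}</programlisting>
   </para>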

   <note>
    <para>
     The depending package's target platform is unconstrained by the sliding
     window principle, which makes sense in that one can in principle build
     cross compilers targeting arbitrary platforms.
    </para>
   </note>

   <para>
    How does this work in practice? Nixpkgs is now structured so that build-time
    dependencies are taken from <varname>buildPackages</varname>, whereas
    run-time dependencies are taken from the top level attribute set. For
    example, <varname>buildPackages.gcc</varname> should be used at build-time,
    while <varname>gcc</varname> should be used at run-time. Now, for most of
    Nixpkgs's history, there was no <varname>buildPackages</varname>, and most
    packages have not been refactored to use it explicitly. Instead, one can use
    the six (<emphasis>gasp</emphasis>) attributes used for specifying
    dependencies as documented in <xref linkend="ssec-stdenv-dependencies"/>. We
    "splice" together the run-time and build-time package sets with
    <varname>callPackage</varname>, and then <varname>mkDerivation</varname> for
    each of four attributes pulls the right derivation out. This splicing can be
    skipped when not cross-compiling as the package sets are the same, but is a
    bit slow for cross-compiling. Because of this, a best-of-both-worlds
    solution is in the works with no splicing or explicit access of
    <varname>buildPackages</varname> needed. For now, feel free to use either
    method.
   </para>
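   <para>
    As a minimal sketch of the two styles, consider a hypothetical package
    (the name, source, and choice of dependencies below are made up for
    illustration). Tools that run during the build are listed in
    <varname>nativeBuildInputs</varname>, libraries that the result links
    against are listed in <varname>buildInputs</varname>, and
    <varname>buildPackages</varname> is referenced explicitly only where that
    is needed.
<programlisting>
{ stdenv, buildPackages, cmake, zlib }:

stdenv.mkDerivation {
  name = "example-1.0";    # hypothetical package
  src = ./.;               # placeholder source

  # Runs on the build platform; splicing selects the build-time variant.
  nativeBuildInputs = [ cmake ];

  # Linked into the result, so it must match the host platform.
  buildInputs = [ zlib ];

  # Explicit style: a C compiler for helpers that run during the build.
  depsBuildBuild = [ buildPackages.stdenv.cc ];
}</programlisting>
   </para>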

   <note>
    <para>
     There is also a "backlink" <varname>targetPackages</varname>, yielding a
     package set whose <varname>buildPackages</varname> is the current package
     set. This is a hack, though, to accommodate compilers with lousy build
     systems. Please do not use this unless you are absolutely sure you are
     packaging such a compiler and there is no other way.
    </para>
   </note>
  </section>

  <section xml:id="sec-cross-cookbook">
   <title>Cross packaging cookbook</title>

   <para>
    Some frequently encountered problems when packaging for cross-compilation
    should be answered here. Ideally, the information above is exhaustive, so
    this section cannot provide any new information, but it is ludicrous and
    cruel to expect everyone to spend effort working through the interaction of
    many features just to figure out the same answer to the same common problem.
    Feel free to add to this list!
   </para>

   <qandaset>
    <qandaentry xml:id="cross-qa-build-c-program-in-build-environment">
     <question>
      <para>
       What if my package's build system needs to build a C program to be run
       under the build environment?
      </para>
     </question>
     <answer>
      <para>
<programlisting>depsBuildBuild = [ buildPackages.stdenv.cc ];</programlisting>
       Add it to your <function>mkDerivation</function> invocation.
      </para>
     </answer>
    </qandaentry>
    <qandaentry xml:id="cross-qa-fails-to-find-ar">
     <question>
      <para>
       My package fails to find <command>ar</command>.
      </para>
     </question>
     <answer>
      <para>
       Many packages assume that an unprefixed <command>ar</command> is
       available, but Nix doesn't provide one. It only provides a prefixed one,
       just as it does for all the other binutils programs. It may be necessary
       to patch the package to fix the build system to use a prefixed
       <command>ar</command>.
      </para>
     </answer>
    </qandaentry>
    <qandaentry xml:id="cross-testsuite-runs-host-code">
     <question>
      <para>
       My package's testsuite needs to run host platform code.
      </para>
     </question>
     <answer>
      <para>
<programlisting>doCheck = stdenv.hostPlatform == stdenv.buildPlatform;</programlisting>
       Add it to your <function>mkDerivation</function> invocation. This runs
       the test suite only when the host platform matches the build platform,
       that is, when the freshly built code can actually be executed.
      </para>
     </answer>
    </qandaentry>
   </qandaset>
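   <para>
    For the missing <command>ar</command> case, patching is not always
    required. If the build system honours an <literal>AR</literal> make
    variable (an assumption that has to be checked for each package), a sketch
    like the following may suffice. It relies on
    <literal>stdenv.cc.targetPrefix</literal>, which is expected to be the
    empty string for native builds.
<programlisting>
# Inside the mkDerivation call of a hypothetical Makefile-based package:
makeFlags = [ "AR=${stdenv.cc.targetPrefix}ar" ];</programlisting>
   </para>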
  </section>
 </section>
<!--============================================================-->
 <section xml:id="sec-cross-usage">
  <title>Cross-building packages</title>

  <para>
   Nixpkgs can be instantiated with <varname>localSystem</varname> alone, in
   which case there is no cross-compiling and everything is built by and for
   that system, or also with <varname>crossSystem</varname>, in which case
   packages run on the latter, but all building happens on the former. Both
   parameters take the same schema as the 3 (build, host, and target) platforms
   defined in the previous section. As mentioned above,
   <literal>lib.systems.examples</literal> has some platforms which are used as
   arguments for these parameters in practice. You can use them
   programmatically, or on the command line:
<programlisting>
nix-build &lt;nixpkgs&gt; --arg crossSystem '(import &lt;nixpkgs/lib&gt;).systems.examples.fooBarBaz' -A whatever</programlisting>
  </para>
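  <para>
   For the programmatic route, a minimal sketch might look as follows. It
   assumes that <literal>lib.systems.examples</literal> in your checkout
   contains an entry named <literal>raspberryPi</literal>; substitute any
   other example platform as needed. The <literal>hello</literal> package and
   the <literal>crossPkgs</literal> name are placeholders chosen for
   illustration.
<programlisting>
let
  lib = import &lt;nixpkgs/lib&gt;;
  # Package set whose packages run on the cross system but are built locally.
  crossPkgs = import &lt;nixpkgs&gt; {
    crossSystem = lib.systems.examples.raspberryPi;
  };
in crossPkgs.hello</programlisting>
  </para>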

  <note>
   <para>
    Eventually we would like to make these platform examples an unnecessary
    convenience so that
<programlisting>
nix-build &lt;nixpkgs&gt; --arg crossSystem '{ config = "&lt;arch&gt;-&lt;vendor&gt;-&lt;os&gt;-&lt;abi&gt;"; }' -A whatever</programlisting>
    works in the vast majority of cases. The problem today is dependencies on
    other sorts of configuration which aren't given proper defaults. We rely on
    the examples to crudely set those configuration parameters in some vaguely
    sane manner on the user's behalf. Issue
    <link xlink:href="https://github.com/NixOS/nixpkgs/issues/34274">#34274</link>
    tracks this inconvenience along with its root cause in crufty configuration
    options.
   </para>
  </note>

  <para>
   While one is free to pass both parameters in full, there's a lot of logic to
   fill in missing fields. As discussed in the previous section, only one of
   <varname>system</varname>, <varname>config</varname>, and
   <varname>parsed</varname> is needed to infer the other two. Additionally,
   <varname>libc</varname> will be inferred from <varname>parsed</varname>.
   Finally, <literal>localSystem.system</literal> is also
   <emphasis>impurely</emphasis> inferred based on the platform on which
   evaluation occurs. This means it is often not necessary to pass
   <varname>localSystem</varname> at all, as in the command-line example
   above.
  </para>

  <note>
   <para>
    Many sources (manual, wiki, etc.) probably mention passing
    <varname>system</varname>, <varname>platform</varname>, along with the
    optional <varname>crossSystem</varname> to Nixpkgs: <literal>import
    &lt;nixpkgs&gt; { system = ..; platform = ..; crossSystem = ..;
    }</literal>. Passing those two instead of <varname>localSystem</varname> is
    still supported for compatibility, but is discouraged. Indeed, much of the
    inference we do for these parameters is motivated by compatibility as much
    as convenience.
   </para>
  </note>

  <para>
   One would think that <varname>localSystem</varname> and
   <varname>crossSystem</varname> overlap horribly with the three
   <varname>*Platform</varname>s (<varname>buildPlatform</varname>,
   <varname>hostPlatform</varname>, and <varname>targetPlatform</varname>; see
   <varname>stage.nix</varname> or
   <xref linkend="sec-cross-platform-parameters"/>). Actually, those
   identifiers are purposefully not used here to draw a subtle but important
   distinction: While the granularity of having 3 platforms is necessary to
   properly <emphasis>build</emphasis> packages, it is overkill for specifying
   the user's <emphasis>intent</emphasis> when making a build plan or package
   set. A simple "build vs deploy" dichotomy is adequate: the sliding window
   principle described in the previous section shows how to interpolate between
   these two "end points" to get the 3 platform triple for each bootstrapping
   stage. That means for any package in a given package set, even those not
   bound on the top level but only reachable via dependencies or
   <varname>buildPackages</varname>, the three platforms will be defined as one
   of <varname>localSystem</varname> or <varname>crossSystem</varname>, with the
   former replacing the latter as one traverses build-time dependencies. A last
   simple difference is that <varname>crossSystem</varname> should be null when
   one doesn't want to cross-compile, while the <varname>*Platform</varname>s
   are always non-null. <varname>localSystem</varname> is always non-null.
  </para>
 </section>
<!--============================================================-->
 <section xml:id="sec-cross-infra">
  <title>Cross-compilation infrastructure</title>

  <para>
   To be written.
  </para>

  <note>
   <para>
    If one explores Nixpkgs, they will see derivations with names like
    <literal>gccCross</literal>. Such <literal>*Cross</literal> derivations are
    a holdover from before we properly distinguished between the host and
    target platforms: the derivation with "Cross" in the name covered the
    <literal>build = host != target</literal> case, while the other covered the
    <literal>host = target</literal> case, with the build platform the same or
    not based on whether one was using its <literal>.nativeDrv</literal> or
    <literal>.crossDrv</literal>. This ugliness will disappear soon.
   </para>
  </note>
 </section>
</chapter>