<!-- TODO: Render this document in front of function documentation in case https://github.com/nix-community/nixdoc/issues/19 is ever supported -->

# File sets {#sec-fileset}

The [`lib.fileset`](#sec-functions-library-fileset) library allows you to work with _file sets_.
A file set is a mathematical set of local files that can be added to the Nix store for use in Nix derivations.
File sets are easy and safe to use, providing obvious and composable semantics with good error messages to prevent mistakes.

These sections apply to the entire library.
See the [function reference](#sec-functions-library-fileset) for function-specific documentation.

The file set library is currently very limited but is being expanded to include more functions over time.

## Implicit coercion from paths to file sets {#sec-fileset-path-coercion}

All functions accepting file sets as arguments can also accept [paths](https://nixos.org/manual/nix/stable/language/values.html#type-path) as arguments.
Such path arguments are implicitly coerced to file sets containing all files under that path:
- A path to a file turns into a file set containing that single file.
- A path to a directory turns into a file set containing all files _recursively_ in that directory.

If the path points to a non-existent location, an error is thrown.

::: {.note}
Just like in Git, file sets cannot represent empty directories.
Because of this, a path to a directory that contains no files (recursively) will turn into a file set containing no files.
:::

:::{.note}
File set coercion does _not_ add any of the files under the coerced paths to the store.
Only the [`toSource`](#function-library-lib.fileset.toSource) function adds files to the Nix store, and only those files contained in the `fileset` argument.
This is in contrast to using [paths in string interpolation](https://nixos.org/manual/nix/stable/language/values.html#type-path), which does add the entire referenced path to the store.
:::
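
The contrast between string interpolation and `toSource` can be sketched as follows. This is only an illustration: it assumes that `<nixpkgs>` resolves to a Nixpkgs checkout providing `lib.fileset`, and that a subdirectory `./src` (a hypothetical name) exists next to the expression.

```nix
let
  lib = import <nixpkgs/lib>;
in
{
  # String interpolation copies everything under ./. into the store
  everything = "${./.}";

  # Coercing ./src to a file set adds nothing to the store by itself;
  # toSource then adds only the files contained in that file set
  onlySrc = lib.fileset.toSource {
    root = ./.;
    fileset = ./src;
  };
}
```

Because attribute values are evaluated lazily, only the attribute that is actually accessed triggers a store import.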

### Example {#sec-fileset-path-coercion-example}

Assume we are in a local directory with a file hierarchy like this:
```
├─ a/
│  ├─ x (file)
│  └─ b/
│     └─ y (file)
└─ c/
   └─ d/
```

Here's a listing of which files get included when different path expressions get coerced to file sets:
- `./.` as a file set contains both `a/x` and `a/b/y` (`c/` does not contain any files and is therefore omitted).
- `./a` as a file set contains both `a/x` and `a/b/y`.
- `./a/x` as a file set contains only `a/x`.
- `./a/b` as a file set contains only `a/b/y`.
- `./c` as a file set is empty, since neither `c` nor `c/d` contain any files.
+3
lib/README.md
···

# Run the lib.path property tests
path/tests/prop.sh
+
+# Run the lib.fileset tests
+fileset/tests.sh
```
# File set library

The main goal of the file set library is to be able to select local files that should be added to the Nix store.
It should have the following properties:
- Easy:
  The functions should have obvious semantics, be low in number and be composable.
- Safe:
  Throw early and helpful errors when mistakes are detected.
- Lazy:
  Only compute values when necessary.

Non-goals are:
- Efficient:
  If the abstraction proves itself worthwhile but too slow, it can still be optimized further.

## Tests

Tests are declared in [`tests.sh`](./tests.sh) and can be run using
```
./tests.sh
```

## Benchmark

A simple benchmark against the HEAD commit can be run using
```
./benchmark.sh HEAD
```

This is intended to be run manually and is not checked by CI.

## Internal representation

The internal representation is versioned in order to allow file sets from different Nixpkgs versions to be composed with each other; see [`internal.nix`](./internal.nix) for the versions and conversions between them.
This section describes only the current representation, but past versions will have to be supported by the code.

### `fileset`

An attribute set with these values:

- `_type` (constant string `"fileset"`):
  Tag to indicate this value is a file set.

- `_internalVersion` (constant `0`, the current version):
  Version of the representation.

- `_internalBase` (path):
  Any files outside of this path cannot influence the set of files.
  This is always a directory.

- `_internalTree` ([filesetTree](#filesettree)):
  A tree representation of all included files under `_internalBase`.

- `__noEval` (error):
  An error indicating that directly evaluating file sets is not supported.
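
As an illustration, coercing a file path such as `./a/x` would, going by `_singleton` and `_create` in `internal.nix`, produce a value of roughly this shape. The directory contents (`x` and `b`) are assumed for the example, and the error message is abbreviated.

```nix
{
  _type = "fileset";
  # The value of `_currentVersion` in internal.nix
  _internalVersion = 0;
  # For a file, the base is the directory containing it
  _internalBase = ./a;
  # The included file carries its type, all other entries of ./a are null
  _internalTree = {
    x = "regular";
    b = null;
  };
  # Abbreviated; the attribute throws as soon as it is evaluated
  __noEval = throw "lib.fileset: Directly evaluating a file set is not supported.";
}
```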

### `filesetTree`

One of the following:

- `{ <name> = filesetTree; }`:
  A directory with a nested `filesetTree` value for every directory entry.
  Even entries that aren't included are present as `null` because it improves laziness and allows using this as a sort of `builtins.readDir` cache.

- `"directory"`:
  A directory with all its files included recursively, allowing early cutoff for some operations.
  This specific string is chosen to be compatible with `builtins.readDir` for a simpler implementation.

- `"regular"`, `"symlink"`, `"unknown"` or any other non-`"directory"` string:
  A nested file with its file type.
  These specific strings are chosen to be compatible with `builtins.readDir` for a simpler implementation.
  Distinguishing between different file types is not strictly necessary for the functionality of this library,
  but it does allow nicer printing of file sets.

- `null`:
  A file or directory that is excluded from the tree.
  It may still exist on the file system.
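
A lookup over this structure can be sketched with plain builtins. `isIncluded` is a hypothetical helper written for this illustration, not a function of the library; the filter in `internal.nix` walks the tree by index instead, for speed.

```nix
let
  # Hypothetical tree: `x` is included, `b` is excluded, `d` is included recursively
  tree = {
    x = "regular";
    b = null;
    d = "directory";
  };

  # Type: [ String ] -> filesetTree -> Bool
  isIncluded = components: subtree:
    if builtins.isAttrs subtree && components != [ ] then
      # Descend into the next component, treating missing entries as excluded
      isIncluded
        (builtins.tail components)
        (subtree.${builtins.head components} or null)
    else
      # Either the path is exhausted or we hit a leaf:
      # null means excluded, anything else means included
      subtree != null;
in
{
  x = isIncluded [ "x" ] tree;                  # true
  y = isIncluded [ "b" "y" ] tree;              # false
  nested = isIncluded [ "d" "any" "file" ] tree; # true, "directory" includes everything below
}
```

The expression uses no library functions, so it can be evaluated with `nix-instantiate --eval --strict` on the file containing it.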

## API design decisions

This section justifies API design decisions.

### Internal structure

The representation of the file set data type is internal and can be changed over time.

Arguments:
- (+) The point of this library is to provide high-level functions; users don't need to be concerned with how it's implemented.
- (+) It allows adjustments to the representation, which is especially useful in the early days of the library.
- (+) It still allows the representation to be stabilized later if necessary and if it has proven itself.

### Influence tracking

File set operations internally track the top-most directory that could influence the exact contents of a file set.
Specifically, `toSource` requires that the given `fileset` is completely determined by files within the directory specified by the `root` argument.
For example, even with `dir/file.txt` being the only file in `./.`, `toSource { root = ./dir; fileset = ./.; }` gives an error.
This is because `fileset` may as well be the result of filtering `./.` in a way that excludes `dir`.

Arguments:
- (+) This gives us the guarantee that adding new files to a project never breaks a file set expression.
  This is also true in a lesser form for removed files:
  only removing files explicitly referenced by paths can break a file set expression.
- (+) This can be removed later, if we discover it's too restrictive.
- (-) It leads to errors when a sensible result could sometimes be returned, such as in the above example.
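
The check behind that error can be approximated with `lib.path.hasPrefix`, which `toSource` applies to `root` and the internal base of the file set. The following is a sketch, assuming that `<nixpkgs>` resolves to a Nixpkgs checkout providing `lib.path`; `./dir` is the hypothetical directory from the example above.

```nix
let
  lib = import <nixpkgs/lib>;
  inherit (lib.path) hasPrefix;
in
{
  # root = ./dir, base of the file set = ./.
  # The base is not under the root, so toSource would throw
  rejected = hasPrefix ./dir ./.;

  # root = ./., base of the file set = ./dir
  # The base is under the root, so toSource would proceed
  accepted = hasPrefix ./. ./dir;
}
```

`rejected` should evaluate to `false` and `accepted` to `true`, since `hasPrefix` checks whether its first argument is a component-wise prefix of its second.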

### Empty directories

File sets can only represent a _set_ of local files; directories on their own are not representable.

Arguments:
- (+) There does not seem to be a sensible set of combinators when directories can be represented on their own.
  Here are some possibilities:
  - `./.` represents the files in `./.` _and_ the directory itself including its subdirectories, meaning that even if there's no files, the entire structure of `./.` is preserved

    In that case, what should `fileFilter (file: false) ./.` return?
    It could return the entire directory structure unchanged, but with all files removed, which would not be what one would expect.

    Trying to have a filter function that also supports directories will lead to the question of:
    What should the behavior be if `./foo` itself is excluded but all of its contents are included?
    It leads to having to define when directories are recursed into, but then we're effectively back at how the `builtins.path`-based filters work.

  - `./.` represents all files in `./.` _and_ the directory itself, but not its subdirectories, meaning that at least `./.` will be preserved even if it's empty.

    In that case, `intersect ./. ./foo` should only include files and no directories themselves, since `./.` includes only `./.` as a directory, and same for `./foo`, so there's no overlap in directories.
    But intuitively this operation should result in the same as `./foo` – everything else is just confusing.
- (+) This matches how Git only supports files, so developers should already be used to it.
- (-) Empty directories (even if they contain nested directories) are neither representable nor preserved when coercing from paths.
  - (+) It is very rare that empty directories are necessary.
  - (+) We can implement a workaround, allowing `toSource` to take an extra argument for ensuring certain extra directories exist in the result.
- (-) It slows down store imports, since the evaluator needs to traverse the entire tree to remove any empty directories
  - (+) This can still be optimized by introducing more Nix builtins if necessary

### String paths

File sets do not support Nix store paths in strings such as `"/nix/store/...-source"`.

Arguments:
- (+) Such paths are usually produced by derivations, which means `toSource` would either:
  - Require IFD if `builtins.path` is used as the underlying primitive
  - Require importing the entire `root` into the store such that derivations can be used to do the filtering
- (+) The convenient path coercion like `union ./foo ./bar` wouldn't work for absolute paths, requiring more verbose alternate interfaces:
  - `let root = "/nix/store/...-source"; in union "${root}/foo" "${root}/bar"`

    Verbose and dangerous because if `root` was a path, the entire path would get imported into the store.

  - `toSource { root = "/nix/store/...-source"; fileset = union "./foo" "./bar"; }`

    Does not allow debug printing intermediate file set contents, since we don't know the paths' contents before having a `root`.

  - `let fs = lib.fileset.withRoot "/nix/store/...-source"; in fs.union "./foo" "./bar"`

    Makes library functions impure since they depend on the contextual root path, questionable composability.

- (+) The point of the file set abstraction is to specify which files should get imported into the store.

  This use case makes little sense for files that are already in the store.
  This should be a separate abstraction as e.g. `pkgs.drvLayout` instead, which could have a similar interface but be specific to derivations.
  Additional capabilities could be supported that can't be done at evaluation time, such as renaming files, creating new directories, setting executable bits, etc.

### Single files

File sets cannot add single files to the store; they can only import files under directories.

Arguments:
- (+) There's no point in using this library for a single file, since you can't do anything other than add it to the store or not.
  And it would be unclear how the library should behave if the one file wouldn't be added to the store:
  `toSource { root = ./file.nix; fileset = <empty>; }` has no reasonable result because returning an empty store path wouldn't match the file type, and there's no way to have an empty file store path, whatever that would mean.
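
The two alternatives that the `toSource` error message suggests for a file `root` can be sketched like this. It assumes that `<nixpkgs>` resolves to a Nixpkgs checkout providing `lib.fileset` and that `./file.nix` exists; both attribute names are made up for the illustration.

```nix
let
  lib = import <nixpkgs/lib>;
in
{
  # Import the file _with_ a containing directory:
  # the root is the directory, the file set is the single file
  withDirectory = lib.fileset.toSource {
    root = ./.;
    fileset = ./file.nix;
  };

  # Import the file _without_ a containing directory, bypassing lib.fileset
  withoutDirectory = builtins.path { path = ./file.nix; };
}
```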

## To update in the future

Here's a list of places in the library that need to be updated in the future:
- > The file set library is currently very limited but is being expanded to include more functions over time.

  in [the manual](../../doc/functions/fileset.section.md)
- > Currently the only way to construct file sets is using implicit coercion from paths.

  in [the `toSource` reference](./default.nix)
- > For now filesets are always paths

  in [the `toSource` implementation](./default.nix), also update the variable name there
- Once a tracing function exists, `__noEval` in [internal.nix](./internal.nix) should mention it
- If/Once a function to convert `lib.sources` values into file sets exists, the `_coerce` and `toSource` functions should be updated to mention that function in the error when such a value is passed
- If/Once a function exists that can optionally include a path depending on whether it exists, the error message for the path not existing in `_coerce` should mention the new function
+94
lib/fileset/benchmark.sh
#!/usr/bin/env bash

# Benchmarks lib.fileset
# Run:
# [nixpkgs]$ lib/fileset/benchmark.sh HEAD

set -euo pipefail
shopt -s inherit_errexit dotglob

if (( $# == 0 )); then
    echo "Usage: $0 HEAD"
    echo "Benchmarks the current tree against the HEAD commit. Any git ref will work."
    exit 1
fi
compareTo=$1

SCRIPT_FILE=$(readlink -f "${BASH_SOURCE[0]}")
SCRIPT_DIR=$(dirname "$SCRIPT_FILE")

nixpkgs=$(cd "$SCRIPT_DIR/../.."; pwd)

tmp="$(mktemp -d)"
clean_up() {
    rm -rf "$tmp"
}
trap clean_up EXIT SIGINT SIGTERM
work="$tmp/work"
mkdir "$work"
cd "$work"

# Create a fairly populated tree
touch f{0..5}
mkdir d{0..5}
mkdir e{0..5}
touch d{0..5}/f{0..5}
mkdir -p d{0..5}/d{0..5}
mkdir -p e{0..5}/e{0..5}
touch d{0..5}/d{0..5}/f{0..5}
mkdir -p d{0..5}/d{0..5}/d{0..5}
mkdir -p e{0..5}/e{0..5}/e{0..5}
touch d{0..5}/d{0..5}/d{0..5}/f{0..5}
mkdir -p d{0..5}/d{0..5}/d{0..5}/d{0..5}
mkdir -p e{0..5}/e{0..5}/e{0..5}/e{0..5}
touch d{0..5}/d{0..5}/d{0..5}/d{0..5}/f{0..5}

bench() {
    NIX_PATH=nixpkgs=$1 NIX_SHOW_STATS=1 NIX_SHOW_STATS_PATH=$tmp/stats.json \
        nix-instantiate --eval --strict --show-trace >/dev/null \
        --expr '(import <nixpkgs/lib>).fileset.toSource { root = ./.; fileset = ./.; }'
    cat "$tmp/stats.json"
}

echo "Running benchmark on index" >&2
bench "$nixpkgs" > "$tmp/new.json"
(
    echo "Checking out $compareTo" >&2
    git -C "$nixpkgs" worktree add --quiet "$tmp/worktree" "$compareTo"
    trap 'git -C "$nixpkgs" worktree remove "$tmp/worktree"' EXIT
    echo "Running benchmark on $compareTo" >&2
    bench "$tmp/worktree" > "$tmp/old.json"
)

declare -a stats=(
    ".envs.elements"
    ".envs.number"
    ".gc.totalBytes"
    ".list.concats"
    ".list.elements"
    ".nrFunctionCalls"
    ".nrLookups"
    ".nrOpUpdates"
    ".nrPrimOpCalls"
    ".nrThunks"
    ".sets.elements"
    ".sets.number"
    ".symbols.number"
    ".values.number"
)

different=0
for stat in "${stats[@]}"; do
    oldValue=$(jq "$stat" "$tmp/old.json")
    newValue=$(jq "$stat" "$tmp/new.json")
    if (( oldValue != newValue )); then
        percent=$(bc <<< "scale=100; result = 100/$oldValue*$newValue; scale=4; result / 1")
        if (( oldValue < newValue )); then
            echo -e "Statistic $stat ($newValue) is \e[0;31m$percent% (+$(( newValue - oldValue )))\e[0m of the old value $oldValue" >&2
        else
            echo -e "Statistic $stat ($newValue) is \e[0;32m$percent% (-$(( oldValue - newValue )))\e[0m of the old value $oldValue" >&2
        fi
        (( different++ )) || true
    fi
done
echo "$different stats differ between the current tree and $compareTo"
+131
lib/fileset/default.nix
{ lib }:
let

  inherit (import ./internal.nix { inherit lib; })
    _coerce
    _toSourceFilter
    ;

  inherit (builtins)
    isPath
    pathExists
    typeOf
    ;

  inherit (lib.path)
    hasPrefix
    splitRoot
    ;

  inherit (lib.strings)
    isStringLike
    ;

  inherit (lib.filesystem)
    pathType
    ;

  inherit (lib.sources)
    cleanSourceWith
    ;

in {

  /*
    Add the local files contained in `fileset` to the store as a single [store path](https://nixos.org/manual/nix/stable/glossary#gloss-store-path) rooted at `root`.

    The result is the store path as a string-like value, making it usable e.g. as the `src` of a derivation, or in string interpolation:
    ```nix
    stdenv.mkDerivation {
      src = lib.fileset.toSource { ... };
      # ...
    }
    ```

    The name of the store path is always `source`.

    Type:
      toSource :: {
        root :: Path,
        fileset :: FileSet,
      } -> SourceLike

    Example:
      # Import the current directory into the store but only include files under ./src
      toSource { root = ./.; fileset = ./src; }
      => "/nix/store/...-source"

      # The file set coerced from path ./bar could contain files outside the root ./foo, which is not allowed
      toSource { root = ./foo; fileset = ./bar; }
      => <error>

      # The root has to be a local filesystem path
      toSource { root = "/nix/store/...-source"; fileset = ./.; }
      => <error>
  */
  toSource = {
    /*
      (required) The local directory [path](https://nixos.org/manual/nix/stable/language/values.html#type-path) that will correspond to the root of the resulting store path.
      Paths in [strings](https://nixos.org/manual/nix/stable/language/values.html#type-string), including Nix store paths, cannot be passed as `root`.
      `root` has to be a directory.

<!-- Ignore the indentation here, this is a nixdoc rendering bug that needs to be fixed -->
:::{.note}
Changing `root` only affects the directory structure of the resulting store path, it does not change which files are added to the store.
The only way to change which files get added to the store is by changing the `fileset` attribute.
:::
    */
    root,
    /*
      (required) The file set whose files to import into the store.
      Currently the only way to construct file sets is using [implicit coercion from paths](#sec-fileset-path-coercion).
      If a directory does not recursively contain any file, it is omitted from the store path contents.
    */
    fileset,
  }:
    let
      # We cannot rename matched attribute arguments, so let's work around it with an extra `let in` statement
      # For now filesets are always paths
      filesetPath = fileset;
    in
    let
      fileset = _coerce "lib.fileset.toSource: `fileset`" filesetPath;
      rootFilesystemRoot = (splitRoot root).root;
      filesetFilesystemRoot = (splitRoot fileset._internalBase).root;
    in
    if ! isPath root then
      if isStringLike root then
        throw ''
          lib.fileset.toSource: `root` "${toString root}" is a string-like value, but it should be a path instead.
              Paths in strings are not supported by `lib.fileset`, use `lib.sources` or derivations instead.''
      else
        throw ''
          lib.fileset.toSource: `root` is of type ${typeOf root}, but it should be a path instead.''
    # Currently all Nix paths have the same filesystem root, but this could change in the future.
    # See also ../path/README.md
    else if rootFilesystemRoot != filesetFilesystemRoot then
      throw ''
        lib.fileset.toSource: Filesystem roots are not the same for `fileset` and `root` "${toString root}":
            `root`: root "${toString rootFilesystemRoot}"
            `fileset`: root "${toString filesetFilesystemRoot}"
            Different roots are not supported.''
    else if ! pathExists root then
      throw ''
        lib.fileset.toSource: `root` ${toString root} does not exist.''
    else if pathType root != "directory" then
      throw ''
        lib.fileset.toSource: `root` ${toString root} is a file, but it should be a directory instead. Potential solutions:
            - If you want to import the file into the store _without_ a containing directory, use string interpolation or `builtins.path` instead of this function.
            - If you want to import the file into the store _with_ a containing directory, set `root` to the containing directory, such as ${toString (dirOf root)}, and set `fileset` to the file path.''
    else if ! hasPrefix root fileset._internalBase then
      throw ''
        lib.fileset.toSource: `fileset` could contain files in ${toString fileset._internalBase}, which is not under the `root` ${toString root}. Potential solutions:
            - Set `root` to ${toString fileset._internalBase} or any directory higher up. This changes the layout of the resulting store path.
            - Set `fileset` to a file set that cannot contain files outside the `root` ${toString root}. This could change the files included in the result.''
    else
      cleanSourceWith {
        name = "source";
        src = root;
        filter = _toSourceFilter fileset;
      };
}
+274
lib/fileset/internal.nix
{ lib ? import ../. }:
let

  inherit (builtins)
    isAttrs
    isPath
    isString
    pathExists
    readDir
    typeOf
    split
    ;

  inherit (lib.attrsets)
    attrValues
    mapAttrs
    ;

  inherit (lib.filesystem)
    pathType
    ;

  inherit (lib.lists)
    all
    elemAt
    length
    ;

  inherit (lib.path)
    append
    splitRoot
    ;

  inherit (lib.path.subpath)
    components
    ;

  inherit (lib.strings)
    isStringLike
    concatStringsSep
    substring
    stringLength
    ;

in
# Rare case of justified usage of rec:
# - This file is internal, so the return value doesn't matter, no need to make things overridable
# - The functions depend on each other
# - We want to expose all of these functions for easy testing
rec {

  # If you change the internal representation, make sure to:
  # - Update this version
  # - Adjust _coerce to also accept and coerce older versions
  # - Update the description of the internal representation in ./README.md
  _currentVersion = 0;

  # Create a fileset, see ./README.md#fileset
  # Type: path -> filesetTree -> fileset
  _create = base: tree: {
    _type = "fileset";

    _internalVersion = _currentVersion;
    _internalBase = base;
    _internalTree = tree;

    # Double __ to make it be evaluated and ordered first
    __noEval = throw ''
      lib.fileset: Directly evaluating a file set is not supported. Use `lib.fileset.toSource` to turn it into a usable source instead.'';
  };

  # Coerce a value to a fileset, erroring when the value cannot be coerced.
  # The string gives the context for error messages.
  # Type: String -> (fileset | Path) -> fileset
  _coerce = context: value:
    if value._type or "" == "fileset" then
      if value._internalVersion > _currentVersion then
        throw ''
          ${context} is a file set created from a future version of the file set library with a different internal representation:
              - Internal version of the file set: ${toString value._internalVersion}
              - Internal version of the library: ${toString _currentVersion}
              Make sure to update your Nixpkgs to have a newer version of `lib.fileset`.''
      else
        value
    else if ! isPath value then
      if isStringLike value then
        throw ''
          ${context} "${toString value}" is a string-like value, but it should be a path instead.
              Paths represented as strings are not supported by `lib.fileset`, use `lib.sources` or derivations instead.''
      else
        throw ''
          ${context} is of type ${typeOf value}, but it should be a path instead.''
    else if ! pathExists value then
      throw ''
        ${context} ${toString value} does not exist.''
    else
      _singleton value;

  # Create a file set from a path.
  # Type: Path -> fileset
  _singleton = path:
    let
      type = pathType path;
    in
    if type == "directory" then
      _create path type
    else
      # This turns a file path ./default.nix into a fileset with
      # - _internalBase: ./.
      # - _internalTree: {
      #     "default.nix" = <type>;
      #     # Other directory entries
      #     <name> = null;
      #   }
      # See ./README.md#single-files
      _create (dirOf path)
        (_nestTree
          (dirOf path)
          [ (baseNameOf path) ]
          type
        );

  /*
    Nest a filesetTree under some extra components, while filling out all the other directory entries that aren't included with null

    _nestTree ./. [ "foo" "bar" ] tree == {
      foo = {
        bar = tree;
        <other-entries> = null;
      }
      <other-entries> = null;
    }

    Type: Path -> [ String ] -> filesetTree -> filesetTree
  */
  _nestTree = targetBase: extraComponents: tree:
    let
      recurse = index: focusPath:
        if index == length extraComponents then
          tree
        else
          mapAttrs (_: _: null) (readDir focusPath)
          // {
            ${elemAt extraComponents index} = recurse (index + 1) (append focusPath (elemAt extraComponents index));
          };
    in
    recurse 0 targetBase;

  # Expand "directory" filesetTree representation to the equivalent { <name> = filesetTree; }
  # Type: Path -> filesetTree -> { <name> = filesetTree; }
  _directoryEntries = path: value:
    if isAttrs value then
      value
    else
      readDir path;

  /*
    Simplify a filesetTree recursively:
    - Replace all directories that have no files with `null`
      This removes directories that would be empty
    - Replace all directories with all files with `"directory"`
      This speeds up the source filter function

    Note that this function is strict, it evaluates the entire tree

    Type: Path -> filesetTree -> filesetTree
  */
  _simplifyTree = path: tree:
    if tree == "directory" || isAttrs tree then
      let
        entries = _directoryEntries path tree;
        simpleSubtrees = mapAttrs (name: _simplifyTree (path + "/${name}")) entries;
        subtreeValues = attrValues simpleSubtrees;
      in
      # This triggers either when all files in a directory are filtered out
      # Or when the directory doesn't contain any files at all
      if all isNull subtreeValues then
        null
      # Triggers when we have the same as a `readDir path`, so we can turn it back into an equivalent "directory".
      else if all isString subtreeValues then
        "directory"
      else
        simpleSubtrees
    else
      tree;

  # Turn a fileset into a source filter function suitable for `builtins.path`
  # Only directories recursively containing at least one file are recursed into
  # Type: fileset -> (String -> String -> Bool)
  _toSourceFilter = fileset:
    let
      # Simplify the tree, necessary to make sure all empty directories are null
      # which has the effect that they aren't included in the result
      tree = _simplifyTree fileset._internalBase fileset._internalTree;

      # Decompose the base into its components
      # See ../path/README.md for why we're not just using `toString`
      baseComponents = components (splitRoot fileset._internalBase).subpath;

      # The base path as a string with a single trailing slash
      baseString =
        if baseComponents == [] then
          # Need to handle the filesystem root specially
          "/"
        else
          "/" + concatStringsSep "/" baseComponents + "/";

      baseLength = stringLength baseString;

      # Check whether a list of path components under the base path exists in the tree.
      # This function is called often, so it should be fast.
      # Type: [ String ] -> Bool
      inTree = components:
        let
          recurse = index: localTree:
            if isAttrs localTree then
              # We have an attribute set, meaning this is a directory with at least one file
              if index >= length components then
                # The path may have no more components though, meaning the filter is running on the directory itself,
                # so we always include it, again because there's at least one file in it.
                true
              else
                # If we do have more components, the filter runs on some entry inside this directory, so we need to recurse
                # We do +2 because builtins.split is an interleaved list of the inbetweens and the matches
                recurse (index + 2) localTree.${elemAt components index}
            else
              # If it's not an attribute set it can only be either null (in which case it's not included)
              # or a string ("directory" or "regular", etc.) in which case it's included
              localTree != null;
        in recurse 0 tree;

      # Filter suited when there's no files
      empty = _: _: false;

      # Filter suited when there's some files
      # This can't be used for when there's no files, because the base directory is always included
      nonEmpty =
        path: _:
        let
          # Add a slash to the path string, turning "/foo" to "/foo/",
          # making sure to not have any false prefix matches below.
          # Note that this would produce "//" for "/",
          # but builtins.path doesn't call the filter function on the `path` argument itself,
          # meaning this function can never receive "/" as an argument
          pathSlash = path + "/";
        in
        # Same as `hasPrefix pathSlash baseString`, but more efficient.
        # With base /foo/bar we need to include /foo:
        #   hasPrefix "/foo/" "/foo/bar/"
        if substring 0 (stringLength pathSlash) baseString == pathSlash then
          true
        # Same as `! hasPrefix baseString pathSlash`, but more efficient.
        # With base /foo/bar we need to exclude /baz
        #   ! hasPrefix "/foo/bar/" "/baz/"
        else if substring 0 baseLength pathSlash != baseString then
          false
        else
          # Same as `removePrefix baseString path`, but more efficient.
          # From the above code we know that hasPrefix baseString pathSlash holds, so this is safe.
          # We don't use pathSlash here because we only needed the trailing slash for the prefix matching.
          # With base /foo and path /foo/bar/baz this gives
          #   inTree (split "/" (removePrefix "/foo/" "/foo/bar/baz"))
          #   == inTree (split "/" "bar/baz")
          #   == inTree [ "bar" [ ] "baz" ]
          # The interleaved empty match lists are why inTree advances its index by 2.
          inTree (split "/" (substring baseLength (-1) path));
    in
    # Special case because the code below assumes that the _internalBase is always included in the result
    # which shouldn't be done when we have no files at all in the base
    if tree == null then
      empty
    else
      nonEmpty;

}
+26
lib/fileset/mock-splitRoot.nix
# This overlay implements mocking of the lib.path.splitRoot function
# It pretends that the last component named "mock-root" is the root:
#
# splitRoot /foo/mock-root/bar/mock-root/baz
# => {
#      root = /foo/mock-root/bar/mock-root;
#      subpath = "./baz";
#    }
self: super: {
  path = super.path // {
    splitRoot = path:
      let
        parts = super.path.splitRoot path;
        components = self.path.subpath.components parts.subpath;
        count = self.length components;
        rootIndex = count - self.lists.findFirstIndex
          (component: component == "mock-root")
          (self.length components)
          (self.reverseList components);
        root = self.path.append parts.root (self.path.subpath.join (self.take rootIndex components));
        subpath = self.path.subpath.join (self.drop rootIndex components);
      in {
        inherit root subpath;
      };
  };
}
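The index arithmetic in the overlay (searching the reversed component list, then subtracting from `count`) amounts to finding the position just past the last `mock-root` component. A rough Bash sketch of the same idea follows; `mock_split_root` is a hypothetical name, it works on plain strings rather than Nix paths, and its output format is an assumption made for illustration.

```shell
# Hypothetical illustration of the index arithmetic; not part of the overlay.
# mock_split_root PATH: prints "ROOT SUBPATH", treating the last component
# named "mock-root" as the filesystem root.
mock_split_root() {
    local path=${1#/}
    local -a components
    IFS=/ read -r -a components <<< "$path"
    local count=${#components[@]}

    # Index just past the last "mock-root" component, or 0 if there is none
    local rootIndex=0 i
    for (( i = count - 1; i >= 0; i-- )); do
        if [[ "${components[i]}" == "mock-root" ]]; then
            rootIndex=$(( i + 1 ))
            break
        fi
    done

    # The first rootIndex components form the root, the rest form the subpath
    local root="" sub="."
    for (( i = 0; i < rootIndex; i++ )); do
        root+="/${components[i]}"
    done
    for (( i = rootIndex; i < count; i++ )); do
        sub+="/${components[i]}"
    done
    echo "${root:-/} $sub"
}

mock_split_root /foo/mock-root/bar/mock-root/baz
```

When no component is named `mock-root`, `rootIndex` stays 0 and the real filesystem root is kept, matching the default passed to `findFirstIndex` in the overlay.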
+350
lib/fileset/tests.sh
#!/usr/bin/env bash

# Tests lib.fileset
# Run:
# [nixpkgs]$ lib/fileset/tests.sh
# or:
# [nixpkgs]$ nix-build lib/tests/release.nix

set -euo pipefail
shopt -s inherit_errexit dotglob

die() {
    # The second to last entry contains the line number of the top-level caller
    lineIndex=$(( ${#BASH_LINENO[@]} - 2 ))
    echo >&2 -e "test case at ${BASH_SOURCE[0]}:${BASH_LINENO[$lineIndex]} failed:" "$@"
    exit 1
}

if test -n "${TEST_LIB:-}"; then
    NIX_PATH=nixpkgs="$(dirname "$TEST_LIB")"
else
    NIX_PATH=nixpkgs="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.."; pwd)"
fi
export NIX_PATH

tmp="$(mktemp -d)"
clean_up() {
    rm -rf "$tmp"
}
trap clean_up EXIT SIGINT SIGTERM
work="$tmp/work"
mkdir "$work"
cd "$work"

# Crudely unquotes a JSON string by just taking everything between the first and the second quote.
# We're only using this for resulting /nix/store paths, which can't contain " anyway,
# nor can they contain any other characters that would need to be escaped specially in JSON.
# This way we don't need to add a dependency on e.g. jq
crudeUnquoteJSON() {
    cut -d \" -f2
}

prefixExpression='let
  lib = import <nixpkgs/lib>;
  internal = import <nixpkgs/lib/fileset/internal.nix> {
    inherit lib;
  };
in
with lib;
with internal;
with lib.fileset;'

# Check that a nix expression evaluates successfully (strictly, coercing to json, read-write-mode).
# The expression has `lib.fileset` in scope.
# If a second argument is provided, the result is checked against it as a regex.
# Otherwise, the result is output.
# Usage: expectSuccess NIX [REGEX]
expectSuccess() {
    local expr=$1
    if [[ "$#" -gt 1 ]]; then
        local expectedResultRegex=$2
    fi
    if ! result=$(nix-instantiate --eval --strict --json --read-write-mode --show-trace \
        --expr "$prefixExpression $expr"); then
        die "$expr failed to evaluate, but it was expected to succeed"
    fi
    if [[ -v expectedResultRegex ]]; then
        if [[ ! "$result" =~ $expectedResultRegex ]]; then
            die "$expr should have evaluated to this regex pattern:\n\n$expectedResultRegex\n\nbut this was the actual result:\n\n$result"
        fi
    else
        echo "$result"
    fi
}

# Check that a nix expression fails to evaluate (strictly, coercing to json, read-write-mode).
# And check the received stderr against a regex
# The expression has `lib.fileset` in scope.
# Usage: expectFailure NIX REGEX
expectFailure() {
    local expr=$1
    local expectedErrorRegex=$2
    if result=$(nix-instantiate --eval --strict --json --read-write-mode --show-trace 2>"$tmp/stderr" \
        --expr "$prefixExpression $expr"); then
        die "$expr evaluated successfully to $result, but it was expected to fail"
    fi
    stderr=$(<"$tmp/stderr")
    if [[ ! "$stderr" =~ $expectedErrorRegex ]]; then
        die "$expr should have errored with this regex pattern:\n\n$expectedErrorRegex\n\nbut this was the actual error:\n\n$stderr"
    fi
}

# We conditionally use inotifywait in checkFileset.
# Check early whether it's available
# TODO: Darwin support, though not crucial since we have Linux CI
if type inotifywait 2>/dev/null >/dev/null; then
    canMonitorFiles=1
else
    echo "Warning: Not checking that excluded files don't get accessed since inotifywait is not available" >&2
    canMonitorFiles=
fi

# Check whether a file set includes/excludes declared paths as expected, usage:
#
# tree=(
#   [a/b]=1   # Declare that file a/b should exist and expect it to be included in the store path
#   [c/a]=0   # Declare that file c/a should exist and expect it to be excluded from the store path
#   [c/d/]=0  # Declare that directory c/d/ should exist and expect it to be excluded from the store path
# )
# checkFileset './a' # Pass the fileset as the argument
declare -A tree
checkFileset() (
    # New subshell so that we can have a separate trap handler, see `trap` below
    local fileset=$1

    # Process the tree into separate arrays for included paths, excluded paths and excluded files.
    # Also create all the paths in the local directory
    local -a included=()
    local -a excluded=()
    local -a excludedFiles=()
    for p in "${!tree[@]}"; do
        # If keys end with a `/` we treat them as directories, otherwise files
        if [[ "$p" =~ /$ ]]; then
            mkdir -p "$p"
            isFile=
        else
            mkdir -p "$(dirname "$p")"
            touch "$p"
            isFile=1
        fi
        case "${tree[$p]}" in
            1)
                included+=("$p")
                ;;
            0)
                excluded+=("$p")
                if [[ -n "$isFile" ]]; then
                    excludedFiles+=("$p")
                fi
                ;;
            *)
                die "Unsupported tree value: ${tree[$p]}"
        esac
    done

    # Start inotifywait in the background to monitor all excluded files (if any)
    if [[ -n "$canMonitorFiles" ]] && (( "${#excludedFiles[@]}" != 0 )); then
        coproc watcher {
            # inotifywait outputs a string on stderr when ready
            # Redirect it to stdout so we can access it from the coproc's stdout fd
            # exec so that the coprocess is inotifywait itself, making the kill below work correctly
            # See below why we listen to both open and delete_self events
            exec inotifywait --format='%e %w' --event open,delete_self --monitor "${excludedFiles[@]}" 2>&1
        }
        # This will trigger when this subshell exits, no matter if successful or not
        # After exiting the subshell, the parent shell will continue executing
        trap 'kill "${watcher_PID}"' exit

        # Synchronously wait until inotifywait is ready
        while read -r -u "${watcher[0]}" line && [[ "$line" != "Watches established." ]]; do
            :
        done
    fi

    # Call toSource with the fileset, triggering open events for all files that are added to the store
    expression="toSource { root = ./.; fileset = $fileset; }"
    # crudeUnquoteJSON is safe because we get back a store path in a string
    storePath=$(expectSuccess "$expression" | crudeUnquoteJSON)

    # Remove all files immediately after, triggering delete_self events for all of them
    rm -rf -- *

    # Only check for the inotify events if we actually started inotifywait earlier
    if [[ -v watcher ]]; then
        # Get the first event
        read -r -u "${watcher[0]}" event file

        # There are only these two possible event timelines:
        # - open, ..., open, delete_self, ..., delete_self: If some excluded files were read
        # - delete_self, ..., delete_self: If no excluded files were read
        # So by looking at the first event we can figure out which one it is!
        case "$event" in
            OPEN)
                die "$expression opened excluded file $file when it shouldn't have"
                ;;
            DELETE_SELF)
                # Expected events
                ;;
            *)
                die "Unexpected event type '$event' on file $file that should be excluded"
                ;;
        esac
    fi

    # For each path that should be included, make sure it does occur in the resulting store path
    for p in "${included[@]}"; do
        if [[ ! -e "$storePath/$p" ]]; then
            die "$expression doesn't include path $p when it should have"
        fi
    done

    # For each path that should be excluded, make sure it doesn't occur in the resulting store path
    for p in "${excluded[@]}"; do
        if [[ -e "$storePath/$p" ]]; then
            die "$expression included path $p when it shouldn't have"
        fi
    done
)


#### Error messages #####

# Absolute paths in strings cannot be passed as `root`
expectFailure 'toSource { root = "/nix/store/foobar"; fileset = ./.; }' 'lib.fileset.toSource: `root` "/nix/store/foobar" is a string-like value, but it should be a path instead.
\s*Paths in strings are not supported by `lib.fileset`, use `lib.sources` or derivations instead.'

# Only paths are accepted as `root`
expectFailure 'toSource { root = 10; fileset = ./.; }' 'lib.fileset.toSource: `root` is of type int, but it should be a path instead.'

# Different filesystem roots in root and fileset are not supported
mkdir -p {foo,bar}/mock-root
expectFailure 'with ((import <nixpkgs/lib>).extend (import <nixpkgs/lib/fileset/mock-splitRoot.nix>)).fileset;
  toSource { root = ./foo/mock-root; fileset = ./bar/mock-root; }
' 'lib.fileset.toSource: Filesystem roots are not the same for `fileset` and `root` "'"$work"'/foo/mock-root":
\s*`root`: root "'"$work"'/foo/mock-root"
\s*`fileset`: root "'"$work"'/bar/mock-root"
\s*Different roots are not supported.'
rm -rf *

# `root` needs to exist
expectFailure 'toSource { root = ./a; fileset = ./.; }' 'lib.fileset.toSource: `root` '"$work"'/a does not exist.'

# `root` needs to be a directory, not a file
touch a
expectFailure 'toSource { root = ./a; fileset = ./a; }' 'lib.fileset.toSource: `root` '"$work"'/a is a file, but it should be a directory instead. Potential solutions:
\s*- If you want to import the file into the store _without_ a containing directory, use string interpolation or `builtins.path` instead of this function.
\s*- If you want to import the file into the store _with_ a containing directory, set `root` to the containing directory, such as '"$work"', and set `fileset` to the file path.'
rm -rf *

# Only paths under `root` should be able to influence the result
mkdir a
expectFailure 'toSource { root = ./a; fileset = ./.; }' 'lib.fileset.toSource: `fileset` could contain files in '"$work"', which is not under the `root` '"$work"'/a. Potential solutions:
\s*- Set `root` to '"$work"' or any directory higher up. This changes the layout of the resulting store path.
\s*- Set `fileset` to a file set that cannot contain files outside the `root` '"$work"'/a. This could change the files included in the result.'
rm -rf *

# Path coercion only works for paths
expectFailure 'toSource { root = ./.; fileset = 10; }' 'lib.fileset.toSource: `fileset` is of type int, but it should be a path instead.'
expectFailure 'toSource { root = ./.; fileset = "/some/path"; }' 'lib.fileset.toSource: `fileset` "/some/path" is a string-like value, but it should be a path instead.
\s*Paths represented as strings are not supported by `lib.fileset`, use `lib.sources` or derivations instead.'

# Path coercion errors for non-existent paths
expectFailure 'toSource { root = ./.; fileset = ./a; }' 'lib.fileset.toSource: `fileset` '"$work"'/a does not exist.'

# File sets cannot be evaluated directly
expectFailure '_create ./. null' 'lib.fileset: Directly evaluating a file set is not supported. Use `lib.fileset.toSource` to turn it into a usable source instead.'

# Future versions of the internal representation are unsupported
expectFailure '_coerce "<tests>: value" { _type = "fileset"; _internalVersion = 1; }' '<tests>: value is a file set created from a future version of the file set library with a different internal representation:
\s*- Internal version of the file set: 1
\s*- Internal version of the library: 0
\s*Make sure to update your Nixpkgs to have a newer version of `lib.fileset`.'

# _create followed by _coerce should give the inputs back without any validation
expectSuccess '{
  inherit (_coerce "<test>" (_create "base" "tree"))
    _internalVersion _internalBase _internalTree;
}' '\{"_internalBase":"base","_internalTree":"tree","_internalVersion":0\}'

#### Resulting store path ####

# The store path name should be "source"
expectSuccess 'toSource { root = ./.; fileset = ./.; }' '"'"${NIX_STORE_DIR:-/nix/store}"'/.*-source"'

# We should be able to import an empty directory and end up with an empty result
tree=(
)
checkFileset './.'

# Directories recursively containing no files are not included
tree=(
    [e/]=0
    [d/e/]=0
    [d/d/e/]=0
    [d/d/f]=1
    [d/f]=1
    [f]=1
)
checkFileset './.'

# Check trees that could cause a naïve string prefix checking implementation to fail
tree=(
    [a]=0
    [ab/x]=0
    [ab/xy]=1
    [ab/xyz]=0
    [abc]=0
)
checkFileset './ab/xy'

# Check path coercion examples in ../../doc/functions/fileset.section.md
tree=(
    [a/x]=1
    [a/b/y]=1
    [c/]=0
    [c/d/]=0
)
checkFileset './.'

tree=(
    [a/x]=1
    [a/b/y]=1
    [c/]=0
    [c/d/]=0
)
checkFileset './a'

tree=(
    [a/x]=1
    [a/b/y]=0
    [c/]=0
    [c/d/]=0
)
checkFileset './a/x'

tree=(
    [a/x]=0
    [a/b/y]=1
    [c/]=0
    [c/d/]=0
)
checkFileset './a/b'

tree=(
    [a/x]=0
    [a/b/y]=0
    [c/]=0
    [c/d/]=0
)
checkFileset './c'

# Test the source filter for the somewhat special case of files in the filesystem root.
# We can't easily test this with the above functions because we can't write to the filesystem root,
# and we don't want to make any assumptions about which files exist there in the sandbox.
expectSuccess '_toSourceFilter (_create /. null) "/foo" ""' 'false'
expectSuccess '_toSourceFilter (_create /. { foo = "regular"; }) "/foo" ""' 'true'
expectSuccess '_toSourceFilter (_create /. { foo = null; }) "/foo" ""' 'false'

# TODO: Once we have combinators and a property testing library, derive property tests from https://en.wikipedia.org/wiki/Algebra_of_sets

echo >&2 tests ok
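
The comment on `crudeUnquoteJSON` explains why a plain `cut` is enough for store paths. The following standalone sketch repeats that one-line function and shows both the intended use and the kind of input it would mishandle. The store path used here is a made-up placeholder, not a real one.

```shell
# Same definition as in tests.sh, repeated so the sketch is self-contained
crudeUnquoteJSON() {
    cut -d \" -f2
}

# Intended use: a JSON string holding a store path (hypothetical example path)
crudeUnquoteJSON <<< '"/nix/store/example-source"'

# Limitation: a JSON string with an escaped quote is cut off at the escape,
# which is why the helper is only used on store paths
crudeUnquoteJSON <<< '"a\"b"'
```

The first call prints the path without its surrounding quotes. The second prints only `a\`, since `cut` treats the escaped quote as a field delimiter.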