fetchzip: force UTF-8 compatible locale to unpack non-ASCII symbols

musl and darwin support UTF-8 locales without any extras, so unzip
can unpack UTF-8 filenames there as is. On glibc without a locale
archive present, however, file names get mangled as:

deps/αβ -> deps/#U03b1#U03b2

This makes `fetchzip` fixed-output derivations unstable.
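
To illustrate why mangled names destabilize the hash, here is a minimal sketch. It assumes a simplified model in which the output hash depends on the file names in the unpacked tree; `listing_hash` is a hypothetical helper that hashes a plain list of paths with `sha256sum`, not the NAR hashing Nix actually performs.

```shell
#!/usr/bin/env bash
# Hypothetical helper: hash a newline-separated list of paths.
# This is a stand-in for the real output hash, which also covers
# file contents and metadata.
listing_hash() {
    printf '%s\n' "$@" | sha256sum | cut -d' ' -f1
}

# The same archive entry as seen with and without a UTF-8 locale.
utf8_hash=$(listing_hash 'deps/αβ')
mangled_hash=$(listing_hash 'deps/#U03b1#U03b2')

if [ "$utf8_hash" != "$mangled_hash" ]; then
    echo "file names differ, so the hashes differ"
fi
```

Any hash that covers file names behaves the same way, which is why the result depends on the locale of the machine that unpacked the archive.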

Tested that with this change `coq.src` fails to rebuild, because its
hash was generated on a system that mangles UTF-8 symbols:

$ nix build -f. coq.src --rebuild -L
source> trying https://github.com/coq/coq/archive/V8.15.2.zip
source> % Total % Received % Xferd Average Speed Time Time Time Current
source> Dload Upload Total Spent Left Speed
source> 0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0
source> 100 8945k 100 8945k 0 0 1513k 0 0:00:05 0:00:05 --:--:-- 1989k
source> unpacking source archive /build/V8.15.2.zip
error: hash mismatch in fixed-output derivation '/nix/store/hrnyykm7wgw8vxisgq7hc2bg5gr0y6s8-source.drv':
specified: sha256-h81nFqkuvZkMR7YLHy7laTq5yOhjMW+w6rYzncxvyD4=
got: sha256-DTspmwyD3Evl1CUmvUy2MonbLGUezvsHN3prmP9eK2I=

Note: this means that some existing cached fixed-output derivations
become incorrect. It should not break tarballs already cached on
cache.nixos.org, so the impact should not be widespread.

+12 -3
+5 -2
pkgs/build-support/fetchzip/default.nix
@@ -5,7 +5,7 @@
 # (e.g. due to minor changes in the compression algorithm, or changes
 # in timestamps).
 
-{ lib, fetchurl, unzip }:
+{ lib, fetchurl, unzip, glibcLocalesUtf8 }:
 
 { # Optionally move the contents of the unpacked tree up one level.
   stripRoot ? true
@@ -35,7 +35,10 @@
 
   downloadToTemp = true;
 
-  nativeBuildInputs = [ unzip ] ++ nativeBuildInputs;
+  # Have to pull in glibcLocalesUtf8 for unzip in setup-hook.sh to handle
+  # UTF-8 aware locale:
+  # https://github.com/NixOS/nixpkgs/issues/176225#issuecomment-1146617263
+  nativeBuildInputs = [ unzip glibcLocalesUtf8 ] ++ nativeBuildInputs;
 
   postFetch =
     ''
+7 -1
pkgs/tools/archivers/unzip/setup-hook.sh
@@ -1,5 +1,11 @@
 unpackCmdHooks+=(_tryUnzip)
 _tryUnzip() {
     if ! [[ "$curSrc" =~ \.zip$ ]]; then return 1; fi
-    unzip -qq "$curSrc"
+
+    # UTF-8 locale is needed for unzip on glibc to handle UTF-8 symbols:
+    # https://github.com/NixOS/nixpkgs/issues/176225#issuecomment-1146617263
+    # Otherwise unzip unpacks escaped file names as if the '-U' option was in effect.
+    #
+    # Pick en_US.UTF-8 as the most likely to be present on glibc, musl and darwin.
+    LANG=en_US.UTF-8 unzip -qq "$curSrc"
 }
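
One way to check whether an already unpacked tree was affected is to look for the `#Uxxxx` escapes shown above. The following is a minimal sketch, not part of this change: `has_mangled_names` is a hypothetical helper, and it assumes the escapes always take the form `#U` followed by four hex digits, as in the `deps/#U03b1#U03b2` example.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this change): succeed if any path
# under the given directory contains an unzip-style "#Uxxxx" escape.
has_mangled_names() {
    local dir=$1
    local pattern='*#U[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]*'
    # Stop at the first match; non-empty output means a mangled name exists.
    [ -n "$(find "$dir" -name "$pattern" -print -quit)" ]
}

# Demo trees that mimic a correct and a mangled unpack result.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir -p "$demo/good/deps" "$demo/bad/deps"
touch "$demo/good/deps/αβ" "$demo/bad/deps/#U03b1#U03b2"

if has_mangled_names "$demo/bad"; then
    echo "mangled names found in $demo/bad"
fi
```

A legitimate file name could in principle contain the same pattern, so a match is only a hint that the source was unpacked without a UTF-8 locale and that its hash may need to be regenerated.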