# Gradle {#gradle}

Gradle is a popular build tool for Java/Kotlin. Gradle itself doesn't
currently provide tools to make dependency resolution reproducible, so
nixpkgs has a proxy designed for intercepting Gradle web requests to
record dependencies so they can be restored in a reproducible fashion.

## Building a Gradle package {#building-a-gradle-package}

Here's what a typical derivation looks like:

```nix
stdenv.mkDerivation (finalAttrs: {
  pname = "pdftk";
  version = "3.3.3";

  src = fetchFromGitLab {
    owner = "pdftk-java";
    repo = "pdftk";
    rev = "v${finalAttrs.version}";
    hash = "sha256-ciKotTHSEcITfQYKFZ6sY2LZnXGChBJy0+eno8B3YHY=";
  };

  # makeWrapper is needed for the wrapper created in installPhase
  nativeBuildInputs = [ gradle makeWrapper ];

  # if the package has dependencies, mitmCache must be set
  mitmCache = gradle.fetchDeps {
    inherit (finalAttrs) pname;
    data = ./deps.json;
  };

  # this is required for using mitm-cache on Darwin
  __darwinAllowLocalNetworking = true;

  gradleFlags = [ "-Dfile.encoding=utf-8" ];

  # defaults to "assemble"
  gradleBuildTask = "shadowJar";

  # will run the gradleCheckTask (defaults to "test")
  doCheck = true;

  installPhase = ''
    mkdir -p $out/{bin,share/pdftk,share/man/man1}
    cp build/libs/pdftk-all.jar $out/share/pdftk

    makeWrapper ${jre}/bin/java $out/bin/pdftk \
      --add-flags "-jar $out/share/pdftk/pdftk-all.jar"

    cp ${finalAttrs.src}/pdftk.1 $out/share/man/man1
  '';

  meta.sourceProvenance = with lib.sourceTypes; [
    fromSource
    binaryBytecode # mitm cache
  ];
})
```

To update (or initialize) dependencies, run the update script via
something like `$(nix-build -A <pname>.mitmCache.updateScript)`
(`nix-build` builds the `updateScript`, `$(...)` runs the script at the
path printed by `nix-build`).

If your package can't be evaluated using a simple `pkgs.<pname>`
expression (for example, if your package isn't located in nixpkgs, or if
you want to override some of its attributes), you will usually have to
pass `pkg` instead of `pname` to `gradle.fetchDeps`. There are two ways
of doing it.

The first is to add the derivation arguments required for getting the
package. Using the pdftk example above:

```nix
{ lib
, stdenv
# ...
, pdftk
}:

stdenv.mkDerivation (finalAttrs: {
  # ...
  mitmCache = gradle.fetchDeps {
    pkg = pdftk;
    data = ./deps.json;
  };
})
```

This allows you to `override` any arguments of the `pkg` used for
the update script (for example, `pkg = pdftk.override { enableSomeFlag =
true };`), so this is the preferred way.

The second is to create a `let` binding for the package, like this:

```nix
let self = stdenv.mkDerivation {
  # ...
  mitmCache = gradle.fetchDeps {
    pkg = self;
    data = ./deps.json;
  };
}; in self
```

This is useful if you can't easily pass the derivation as its own
argument, or if your `mkDerivation` call is responsible for building
multiple packages.

In the former case, the update script will stay the same even if the
derivation is called with different arguments. In the latter case, the
update script will change depending on the derivation arguments. It's up
to you to decide which one would work best for your derivation.

## Update Script {#gradle-update-script}

The update script does the following:

- Build the derivation's source via `pkgs.srcOnly`
- Enter a `nix-shell` for the derivation in a `bwrap` sandbox (the
  sandbox is only used on Linux)
- Set the `IN_GRADLE_UPDATE_DEPS` environment variable to `1`
- Run the derivation's `unpackPhase`, `patchPhase`, `configurePhase`
- Run the derivation's `gradleUpdateScript` (the Gradle setup hook sets
  a default value for it, which runs `preBuild`, `preGradleUpdate`
  hooks, fetches the dependencies using `gradleUpdateTask`, and finally
  runs the `postGradleUpdate` hook)
- Finally, store all of the fetched files' hashes in the lockfile. They
  may be `.jar`/`.pom` files from Maven repositories, or they may be
  files otherwise used for building the package.

`fetchDeps` takes the following arguments:

- `attrPath` - the path to the package in nixpkgs (for example,
  `"javaPackages.openjfx22"`). Used for update script metadata.
- `pname` - an alias for `attrPath` for convenience. This is what you
  will generally use instead of `pkg` or `attrPath`.
- `pkg` - the package to be used for fetching the dependencies. Defaults
  to `getAttrFromPath (splitString "." attrPath) pkgs`.
- `bwrapFlags` - allows you to override bwrap flags (only relevant for
  downstream, non-nixpkgs projects)
- `data` - path to the dependencies lockfile (can be relative to the
  package, can be absolute). In nixpkgs, naming the lockfile anything
  other than `deps.json` is discouraged; consider creating
  subdirectories if your package requires multiple `deps.json` files.

## Environment {#gradle-environment}

The Gradle setup hook accepts the following environment variables:

- `mitmCache` - the MITM proxy cache imported using `gradle.fetchDeps`
- `gradleFlags` - command-line flags to be used for every Gradle
  invocation (this simply registers a function that uses the necessary
  flags).
  - You can't use `gradleFlags` for flags that contain spaces. In that
    case you must add `gradleFlagsArray+=("-flag with spaces")` to the
    derivation's bash code instead.
  - If you want to build the package using a specific Java version, you
    can pass `"-Dorg.gradle.java.home=${jdk}"` as one of the flags.
- `gradleBuildTask` - the Gradle task (or tasks) to be used for building
  the package. Defaults to `assemble`.
- `gradleCheckTask` - the Gradle task (or tasks) to be used for checking
  the package if `doCheck` is set to `true`. Defaults to `test`.
- `gradleUpdateTask` - the Gradle task (or tasks) to be used for
  fetching all of the package's dependencies in
  `mitmCache.updateScript`. Defaults to `nixDownloadDeps`.
- `gradleUpdateScript` - the code to run for fetching all of the
  package's dependencies in `mitmCache.updateScript`. Defaults to
  running the `preBuild` and `preGradleUpdate` hooks, running the
  `gradleUpdateTask`, and finally running the `postGradleUpdate` hook.
- `gradleInitScript` - path to the `--init-script` to pass to Gradle. By
  default, a simple init script that enables reproducible archive
  creation is used.
  - Note that reproducible archives might break some builds. One example
    of an error caused by it is `Could not create task ':jar'. Replacing
    an existing task that may have already been used by other plugins is
    not supported`. If you get such an error, the easiest "fix" is
    disabling reproducible archives altogether by setting
    `gradleInitScript` to something like `writeText
    "empty-init-script.gradle" ""`
- `enableParallelBuilding` / `enableParallelChecking` /
  `enableParallelUpdating` - pass `--parallel` to Gradle in the
  build/check phase or in the update script. Defaults to true. If the
  build fails for mysterious reasons, consider setting this to false.
- `dontUseGradleConfigure` / `dontUseGradleBuild` / `dontUseGradleCheck`
  \- force disable the Gradle setup hook for certain phases.
  - Note that if you disable the configure hook, you may face issues
    such as `Failed to load native library 'libnative-platform.so'`,
    because the configure hook is responsible for initializing Gradle.

# Gradle Setup Hook

## Introduction

Gradle build scripts are written in a DSL, so computing the list of
Gradle dependencies is a Turing-complete task, not just in theory but in
practice. Fetching all of the dependencies often requires building some
native code, running some commands to check the host platform, or just
fetching some files using either JVM code or commands like `curl` or
`wget`.

This is widespread and isn't considered bad practice in the Java world,
so all we can do is run Gradle to check what dependencies end up being
fetched, and allow derivation authors to apply workarounds so they can
run the code necessary for fetching the dependencies our script doesn't
fetch.

"Run Gradle to check what dependencies end up being fetched" isn't a
straightforward task. For example, Gradle usually uses Maven
repositories, which have features such as "snapshots", a way to always
use the latest version of a dependency as opposed to a fixed version.
Obviously, this is horrible for reproducibility. Additionally, Gradle
doesn't offer a way to export the list of dependency URLs and hashes (it
does in a way, but it's far from being complete, and as such is useless
for nixpkgs). Even if it did, it would be annoying to use, considering
that fetching non-Gradle dependencies in Gradle scripts is commonplace.

That's why the setup hook uses mitm-cache, a program designed for
intercepting all HTTP requests, recording all the files that were
accessed, creating a Nix derivation with all of them, and then allowing
the Gradle derivation to access these files.

## Maven Repositories

(Reference: [Repository
Layout](https://cwiki.apache.org/confluence/display/MAVENOLD/Repository+Layout+-+Final))

Most Gradle dependencies are fetched from Maven repositories. For
each dependency, Gradle finds the first repo where it can successfully
fetch that dependency, and uses that repo for it. Different repos might
actually return different files for the same artifact because of e.g.
pom normalization. Different repos may be used for the same artifact
even across a single package (for example, if two build scripts define
repositories in a different order).

The artifact metadata is specified in a .pom file, and the artifacts
themselves are typically .jar files. The URL format is as follows:

`<repo>/<group-id>/<artifact-id>/<base-version>/<artifact-id>-<version>[-<classifier>].<ext>`

For example:

- `https://repo.maven.apache.org/maven2/org/slf4j/slf4j-api/2.0.9/slf4j-api-2.0.9.pom`
- `https://oss.sonatype.org/content/groups/public/com/tobiasdiez/easybind/2.2.1-SNAPSHOT/easybind-2.2.1-20230117.075740-16.pom`

Where:

- `<repo>` is the repo base (`https://repo.maven.apache.org/maven2`)
- `<group-id>` is the group ID with dots replaced with slashes
  (`org.slf4j` -> `org/slf4j`)
- `<artifact-id>` is the artifact ID (`slf4j-api`)
- `<base-version>` is the artifact version (`2.0.9` for normal
  artifacts, `2.2.1-SNAPSHOT` for snapshots)
- `<version>` is the artifact version - can be either `<base-version>`
  or `<version-base>-<timestamp>-<build-num>` (`2.0.9` for normal
  artifacts, and either `2.2.1-SNAPSHOT` or `2.2.1-20230117.075740-16`
  for snapshots)
  - `<version-base>` - `<base-version>` without the `-SNAPSHOT` suffix
  - `<timestamp>` - artifact build timestamp in the `YYYYMMDD.HHMMSS`
    format (UTC)
  - `<build-num>` - a counter that's incremented by 1 for each new
    snapshot build
- `<classifier>` is an optional classifier for allowing a single .pom to
  refer to multiple .jar files. .pom files don't have classifiers, as
  they describe metadata.
- `<ext>` is the file extension (`pom` in the examples above)

Note that the artifact ID can contain `-`, so you can't extract the
artifact ID and version from just the file name.
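
The layout above can be sketched in a few lines of Python. This is an illustrative helper, not code from nixpkgs or mitm-cache; the function names `maven_url` and `artifact_and_base_version` are made up for this example. It also shows why the directory components, not the file name, are the reliable place to read the artifact ID and base version from.

```python
from typing import Optional, Tuple


def maven_url(repo: str, group_id: str, artifact_id: str, base_version: str,
              ext: str, version: Optional[str] = None,
              classifier: Optional[str] = None) -> str:
    """Assemble an artifact URL following the repository layout above."""
    # For normal artifacts <version> equals <base-version>.
    version = version or base_version
    filename = f"{artifact_id}-{version}"
    if classifier:
        filename += f"-{classifier}"
    group_path = group_id.replace(".", "/")
    return (f"{repo.rstrip('/')}/{group_path}/{artifact_id}/"
            f"{base_version}/{filename}.{ext}")


def artifact_and_base_version(url: str) -> Tuple[str, str]:
    """Read artifact ID and base version from the two parent directories.

    The file name alone is ambiguous because artifact IDs may contain '-'.
    """
    comps = url.split("/")
    return comps[-3], comps[-2]


release = maven_url("https://repo.maven.apache.org/maven2",
                    "org.slf4j", "slf4j-api", "2.0.9", "pom")
snapshot = maven_url("https://oss.sonatype.org/content/groups/public",
                     "com.tobiasdiez", "easybind", "2.2.1-SNAPSHOT", "pom",
                     version="2.2.1-20230117.075740-16")
```

Both calls reproduce the two example URLs listed earlier in this section.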

Additionally, the files in the repository may have associated signature
files, formed by appending `.asc` to the filename, and hashsum files,
formed by appending `.md5` or `.sha1` to the filename. The signatures
are harmless, but the `.md5`/`.sha1` files are rejected by the proxy.

The reasoning is as follows. Consider two files, `a.jar` and `b.jar`,
that have the same hash. Gradle will fetch `a.jar.sha1`, find out that
it hasn't yet downloaded a file with this hash, and then fetch `a.jar`.
It will then download `b.jar.sha1`, locate the hash in its cache, and
*not* download `b.jar`. This means `b.jar` won't be stored in the MITM
cache. Now suppose that on a later invocation the fetching order
changes, whether because of running on a different system, changed
behavior after a Gradle update, or any other source of nondeterminism,
so that `b.jar` is fetched before `a.jar`. Gradle will first fetch
`b.jar.sha1`, not find the hash in its cache, attempt to fetch `b.jar`,
and fail, as the MITM cache doesn't have that file.
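
The failure mode described above can be reproduced with a toy model. The sketch below is a deliberately simplified simulation, not Gradle's or mitm-cache's actual logic: `fetch_all` is a hypothetical function that mimics a client which asks for the `.sha1` first and skips the download when it already knows that digest.

```python
def fetch_all(order, sha1_of, known_digests, available):
    """Simulate fetching files when `.sha1` files are served.

    Returns the set of file names that were actually downloaded.
    Raises FileNotFoundError if a needed file is not in `available`.
    """
    downloaded = set()
    for name in order:
        digest = sha1_of[name]        # the client fetches <name>.sha1 first
        if digest in known_digests:   # content already known: skip download
            continue
        if name not in available:
            raise FileNotFoundError(name)
        known_digests.add(digest)
        downloaded.add(name)
    return downloaded


# Two different file names with identical contents (hence identical hash).
sha1_of = {"a.jar": "same-digest", "b.jar": "same-digest"}

# Recording run: everything is available, a.jar happens to come first.
recorded = fetch_all(["a.jar", "b.jar"], sha1_of, set(), {"a.jar", "b.jar"})

# Replay run: fresh client cache, only recorded files exist, order flipped.
try:
    fetch_all(["b.jar", "a.jar"], sha1_of, set(), recorded)
    replay_ok = True
except FileNotFoundError:
    replay_ok = False
```

In this model only `a.jar` gets recorded, and the replay with the flipped order fails. Rejecting the `.sha1` requests removes the shortcut, so both files are always downloaded and recorded.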

For the same reason, the proxy strips all checksum/etag headers. An
alternative would be to make the proxy remember previous checksums and
etags, but that would complicate the implementation. However, such a
feature can be implemented if necessary. Note that checksum/etag header
stripping is hardcoded, but `.md5`/`.sha1` file rejection is configured
via CLI arguments.

**Caveat**: Gradle .module files also contain file hashes, in md5, sha1,
sha256, sha512 formats. This has posed no problems so far, but it might
in the future. If it does pose problems, the deps derivation code can be
extended to find all checksums in .module files and copy existing files
there if their hash matches.

## Snapshots

Snapshots are a way to publish the very latest, unstable version of a
dependency that constantly changes. Any project that depends on a
snapshot will depend on this rolling version, rather than a fixed
version. It's easy to understand why this is a bad idea for reproducible
builds. Still, they can be dealt with by the logic in `gradle.fetchDeps`
and `gradle.updateDeps`.

First, as you can see above, while normal artifacts have the same
`base-version` and `version`, for snapshots they usually (but not
necessarily) differ.

Second, for figuring out where to download the snapshot, Gradle consults
`maven-metadata.xml`. With that in mind...

## Maven Metadata

(Reference: [Maven
Metadata](https://maven.apache.org/repositories/metadata.html),
[Metadata](https://maven.apache.org/ref/3.9.8/maven-repository-metadata/repository-metadata.html))

Maven metadata files are called `maven-metadata.xml`.

There are three levels of metadata: "G level", "A level", "V level",
representing group, artifact, or version metadata.

G level metadata is currently unsupported. It's only used for Maven
plugins, which Gradle presumably doesn't use.

A level metadata is used for getting the version list for an artifact.
It's an XML file with the following items:

- `<groupId>` - group ID
- `<artifactId>` - artifact ID
- `<versioning>`
  - `<latest>` - the very latest base version (e.g. `2.2.1-SNAPSHOT`)
  - `<release>` - the latest non-snapshot version
  - `<versions>` - the version list, each in a `<version>` tag
  - `<lastUpdated>` - the metadata update timestamp (UTC,
    `YYYYMMDDHHMMSS`)

V level metadata is used for listing the snapshot versions. It has the
following items:

- `<groupId>` - group ID
- `<artifactId>` - artifact ID
- `<versioning>`
  - `<lastUpdated>` - the metadata update timestamp (UTC,
    `YYYYMMDDHHMMSS`)
  - `<snapshot>` - info about the latest snapshot version
    - `<timestamp>` - build timestamp (UTC, `YYYYMMDD.HHMMSS`)
    - `<buildNumber>` - build number
  - `<snapshotVersions>` - the list of all available snapshot file info,
    each info is enclosed in a `<snapshotVersion>`
    - `<classifier>` - classifier (optional)
    - `<extension>` - file extension
    - `<value>` - snapshot version (as opposed to base version)
    - `<updated>` - snapshot build timestamp (UTC, `YYYYMMDDHHMMSS`)
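
To make the V level structure concrete, here is a small Python sketch that reads the `<snapshotVersions>` list with the standard library's `xml.etree.ElementTree`. The sample document is hand-written for illustration (the `lastUpdated` and `updated` values are made up), and `snapshot_versions` is a hypothetical helper, not part of the setup hook. It assumes the metadata has no XML namespace; a real file that declares one would need namespace-aware lookups.

```python
import xml.etree.ElementTree as ET

# Hand-written sample of V level metadata (values are illustrative only).
V_LEVEL_EXAMPLE = """\
<metadata>
  <groupId>com.tobiasdiez</groupId>
  <artifactId>easybind</artifactId>
  <versioning>
    <lastUpdated>20230117075740</lastUpdated>
    <snapshot>
      <timestamp>20230117.075740</timestamp>
      <buildNumber>16</buildNumber>
    </snapshot>
    <snapshotVersions>
      <snapshotVersion>
        <extension>pom</extension>
        <value>2.2.1-20230117.075740-16</value>
        <updated>20230117075740</updated>
      </snapshotVersion>
    </snapshotVersions>
  </versioning>
</metadata>
"""


def snapshot_versions(xml_text):
    """Map (extension, classifier) to the concrete snapshot version."""
    root = ET.fromstring(xml_text)
    result = {}
    for sv in root.findall("./versioning/snapshotVersions/snapshotVersion"):
        # findtext returns None when the optional <classifier> is absent.
        key = (sv.findtext("extension"), sv.findtext("classifier"))
        result[key] = sv.findtext("value")
    return result


versions = snapshot_versions(V_LEVEL_EXAMPLE)
```

The resulting `<value>` is what replaces `<version>` in the artifact file name, while the directory keeps the `-SNAPSHOT` base version.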

## Lockfile Format

The mitm-cache lockfile format is described in the [mitm-cache
README](https://github.com/chayleaf/mitm-cache#readme).

The nixpkgs Gradle lockfile format is more complicated:

```json
{
  "!comment": "This is a nixpkgs Gradle dependency lockfile. For more details, refer to the Gradle section in the nixpkgs manual.",
  "!version": 1,
  "https://oss.sonatype.org/content/repositories/snapshots/com/badlogicgames/gdx-controllers": {
    "gdx-controllers#gdx-controllers-core/2.2.4-20231021.200112-6/SNAPSHOT": {
      "jar": "sha256-Gdz2J1IvDJFktUD2XeGNS0SIrOyym19X/+dCbbbe3/U=",
      "pom": "sha256-90QW/Mtz1jbDUhKjdJ88ekhulZR2a7eCaEJoswmeny4="
    },
    "gdx-controllers-core/2.2.4-SNAPSHOT/maven-metadata": {
      "xml": {
        "groupId": "com.badlogicgames.gdx-controllers"
      }
    }
  },
  "https://repo.maven.apache.org/maven2": {
    "com/badlogicgames/gdx#gdx-backend-lwjgl3/1.12.1": {
      "jar": "sha256-B3OwjHfBoHcJPFlyy4u2WJuRe4ZF/+tKh7gKsDg41o0=",
      "module": "sha256-9O7d2ip5+E6OiwN47WWxC8XqSX/mT+b0iDioCRTTyqc=",
      "pom": "sha256-IRSihaCUPC2d0QzB0MVDoOWM1DXjcisTYtnaaxR9SRo="
    }
  }
}
```

`!comment` is a human-readable description explaining what the file is,
`!version` is the lockfile version (note that while it shares the name
with mitm-cache's `!version`, they don't actually have to be in sync and
can be bumped separately).

The other keys are parts of a URL. Each URL is split into three parts.
They are joined like this: `<part1>/<part2>.<part3>`.

Some URLs may have a `#` in them. In that case, the part after `#` is
parsed as `#<artifact-id>/<version>[/SNAPSHOT][/<classifier>].<ext>` and
expanded into
`<artifact-id>/<base-version>/<artifact-id>-<version>[-<classifier>].<ext>`.
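
The expansion rule can be sketched in Python as follows. This is a minimal illustration written for this document, not the Nix code that actually performs the expansion; `expand_key` is a hypothetical name. It assumes that for a timestamped snapshot the base version is obtained by dropping the trailing `<timestamp>-<build-num>` and appending `-SNAPSHOT`, which follows from the version format described in the Maven Repositories section.

```python
def expand_key(part1: str, part2: str, ext: str) -> str:
    """Rebuild the full URL from the three lockfile parts."""
    if "#" not in part2:
        return f"{part1}/{part2}.{ext}"

    prefix, rest = part2.split("#", 1)
    fields = rest.split("/")
    artifact_id, version, extra = fields[0], fields[1], fields[2:]

    base_version = version
    if extra and extra[0] == "SNAPSHOT":
        extra = extra[1:]
        if not version.endswith("-SNAPSHOT"):
            # Assumption: version is <version-base>-<timestamp>-<build-num>.
            version_base = version.rsplit("-", 2)[0]
            base_version = f"{version_base}-SNAPSHOT"

    filename = f"{artifact_id}-{version}"
    if extra:  # whatever remains is the optional classifier
        filename += f"-{extra[0]}"

    head = f"{part1}/{prefix}" if prefix else part1
    return f"{head}/{artifact_id}/{base_version}/{filename}.{ext}"


central = expand_key("https://repo.maven.apache.org/maven2",
                     "com/badlogicgames/gdx#gdx-backend-lwjgl3/1.12.1", "jar")
# Hypothetical snapshot entry with a classifier, for illustration only.
snap = expand_key("https://example.invalid/repo",
                  "org/example#lib/1.0-20240101.000000-1/SNAPSHOT/sources",
                  "jar")
```

The first call uses the Maven Central entry from the lockfile example above; the second uses invented coordinates to exercise the snapshot and classifier branches.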

Each URL has a value associated with it. The value may be:

- an SRI hash (string)
- for `maven-metadata.xml` - an attrset containing the parts of the
  metadata that can't be generated in Nix code (e.g. `groupId`, which is
  challenging to parse from a URL because it's not always possible to
  discern where the repo base ends and the group ID begins).

`compress-deps-json.py` converts the JSON from mitm-cache format into
nixpkgs Gradle lockfile format. `fetch.nix` does the opposite.

## Security Considerations

Lockfiles won't be human-reviewed, so they must be tamper-resistant.
That's why it's imperative that nobody can inject their own contents
into the lockfiles.

This is achieved in a very simple way - the `deps.json` only contains
the following:

- `maven-metadata.xml` URLs and small pieces of the contained metadata
  (most of it will be generated in Nix, i.e. the area of injection is
  minimal, and the parts that aren't generated in Nix are validated).
- artifact/other file URLs and associated hashes (Nix will complain if
  the hash doesn't match, and Gradle won't even access the URL if it
  doesn't match)
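
For readers unfamiliar with the SRI strings stored in the lockfile, the following Python sketch shows how such a value relates to file contents. It is only an illustration of the `sha256-<base64 digest>` format using the standard library; the real verification is done by Nix, and the helper names here are hypothetical.

```python
import base64
import hashlib


def sri_sha256(content: bytes) -> str:
    """Return the SRI string (sha256-<base64 digest>) for some bytes."""
    digest = hashlib.sha256(content).digest()
    return "sha256-" + base64.b64encode(digest).decode("ascii")


def matches_lockfile(content: bytes, expected_sri: str) -> bool:
    """Check downloaded bytes against the hash recorded in the lockfile."""
    return sri_sha256(content) == expected_sri


empty_sri = sri_sha256(b"")
```

Because the hash covers the whole file, changing a single byte of a fetched artifact yields a different SRI string and the mismatch is detected.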

Please be mindful of the above when working on Gradle support for
nixpkgs.

import json
import sys

from typing import Dict, Set

# this compresses MITM URL lists with Gradle-specific optimizations
# specifically, it splits each url into up to 3 parts - they will be
# concatenated like part1/part2.part3 or part1.part2
# part3 is simply always the file extension, but part1 and part2 are
# optimized using special heuristics
# additionally, if part2 ends with /a/b/{a}-{b}, all occurrences of
# /{a}/{b}/ are replaced with #
# finally, anything that ends with = is considered SHA256, anything that
# starts with http is considered a redirect URL, anything else is
# considered text

with open(sys.argv[1], "rt") as f:
    data: dict = json.load(f)

new_data: Dict[str, Dict[str, Dict[str, dict]]] = {}

for url, info in data.items():
    if url == "!version":
        continue
    ext, base = map(lambda x: x[::-1], url[::-1].split(".", 1))
    if base.endswith(".tar"):
        base = base[:-4]
        ext = "tar." + ext
    # special logic for Maven repos
    if ext in ["jar", "pom", "module"]:
        comps = base.split("/")
        if "-" in comps[-1]:
            # convert base/name/ver/name-ver into base#name/ver

            filename = comps[-1]
            name = comps[-3]
            basever = comps[-2]
            ver = basever
            is_snapshot = ver.endswith("-SNAPSHOT")
            if is_snapshot:
                ver = ver.removesuffix("-SNAPSHOT")
            if filename.startswith(f"{name}-{ver}"):
                if is_snapshot:
                    if filename.startswith(f"{name}-{ver}-SNAPSHOT"):
                        ver += "-SNAPSHOT"
                    else:
                        ver += "-".join(
                            filename.removeprefix(f"{name}-{ver}").split("-")[:3]
                        )
                comp_end = comps[-1].removeprefix(f"{name}-{ver}")
            else:
                ver, name, comp_end = None, None, None
            if name and ver and (not comp_end or comp_end.startswith("-")):
                base = "/".join(comps[:-1]) + "/"
                base = base.replace(f"/{name}/{basever}/", "#")
                base += f"{name}/{ver}"
                if is_snapshot:
                    base += "/SNAPSHOT"
                if comp_end:
                    base += "/" + comp_end[1:]
    scheme, rest = base.split("/", 1)
    if scheme not in new_data.keys():
        new_data[scheme] = {}
    if rest not in new_data[scheme].keys():
        new_data[scheme][rest] = {}
    if "hash" in info.keys():
        new_data[scheme][rest][ext] = info["hash"]
    elif "text" in info.keys() and ext == "xml":
        # nix code in fetch-deps.nix will autogenerate metadata xml files.
        # groupId is part of the URL, but it can be tricky to parse as we
        # don't know the exact repo base, so take it from the xml and pass
        # it to nix
        xml = "".join(info["text"].split())
        new_data[scheme][rest][ext] = {
            "groupId": xml.split("<groupId>")[1].split("</groupId>")[0],
        }
        if "<release>" in xml:
            new_data[scheme][rest][ext]["release"] = xml.split("<release>")[1].split(
                "</release>"
            )[0]
        if "<latest>" in xml:
            latest = xml.split("<latest>")[1].split("</latest>")[0]
            if latest != new_data[scheme][rest][ext].get("release"):
                new_data[scheme][rest][ext]["latest"] = latest
        if "<lastUpdated>" in xml:
            new_data[scheme][rest][ext]["lastUpdated"] = xml.split("<lastUpdated>")[
                1
            ].split("</lastUpdated>")[0]
    else:
        raise Exception("Unsupported key: " + repr(info))

# At this point, we have a map by part1 (initially the scheme), part2 (initially a
# slash-separated string without the scheme and with potential # substitution as
# seen above), extension.
# Now, push some segments from "part2" into "part1" like this:
# https         # part1
#   domain1/b   # part2
#   domain1/c
#   domain2/a
#   domain2/c
# ->
# https/domain1 # part1
#   b           # part2
#   c
# https/domain2 # part1
#   a           # part2
#   c
# This helps reduce the lockfile size because a Gradle project will usually use lots
# of files from a single Maven repo

data = new_data
changed = True
while changed:
    changed = False
    new_data = {}
    for part1, info1 in data.items():
        starts: Set[str] = set()
        # by how many bytes the file size will be increased (roughly)
        lose = 0
        # by how many bytes the file size will be reduced (roughly)
        win = 0
        # how many different initial part2 segments there are
        count = 0
        for part2, info2 in info1.items():
            if "/" not in part2:
                # can't push a segment from part2 into part1
                count = 0
                break
            st = part2.split("/", 1)[0]
            if st not in starts:
                lose += len(st) + 1
                count += 1
                starts.add(st)
            win += len(st) + 1
        if count == 0:
            new_data[part1] = info1
            continue
        # only allow pushing part2 segments into part1 if *either*:
        # - the domain isn't yet part of part1
        # - the initial part2 segment is always the same
        if count != 1 and "." in part1:
            new_data[part1] = info1
            continue
        # some heuristics that may or may not work well (originally this was
        # used when the above if wasn't here, but perhaps it's useless now)
        lose += (count - 1) * max(0, len(part1) - 4)
        if win > lose or ("." not in part1 and win >= lose):
            changed = True
            for part2, info2 in info1.items():
                st, part3 = part2.split("/", 1)
                new_part1 = part1 + "/" + st
                if new_part1 not in new_data.keys():
                    new_data[new_part1] = {}
                new_data[new_part1][part3] = info2
        else:
            new_data[part1] = info1
    data = new_data

new_data["!comment"] = "This is a nixpkgs Gradle dependency lockfile. For more details, refer to the Gradle section in the nixpkgs manual."  # type: ignore
new_data["!version"] = 1  # type: ignore

with open(sys.argv[2], "wt") as f:
    json.dump(new_data, f, sort_keys=True, indent=1)
    f.write("\n")